TOP ACTIVATIONS
MAX = 0.111

lately . With work and err ands , they didn 't
at his salon and running err ands . But every day
give them towels and p aj amas . They give them
on dry and cozy p aj amas and sn ugg led
. They were wearing p aj amas . And they were
put on some warm p aj amas . She sn ugg
and dressed in fresh p aj amas . That
song about rain bows and umb rell as . They were
, it kept on flying err atically with inexplicable speed .
door . Inside were two umb rell as - one big
, fifteen , sixteen , seventeen , eighteen , nineteen ,
, fifteen , sixteen , seventeen , eighteen , nineteen ,
B oom !" and " P ow !" He thinks his
played together all day and Mitt ens pur red happily in
and a brush . " P aint the wall ! Go
chicken likes corn . Ha ha , do you get it
and started to climb . Wh ith her new brush ,
around and around , feeling ha py and free .
water and soap and put Mitt ens in . Mitt ens
!" Ben says . " P ill ows are fun !"

MIN ACTIVATIONS
MIN = -0.097

should not tease the ost ric hes . They are not
had not seen the jelly fish . <|endoftext|> L ily and
up . Nobody knew what she was like or what she
have finally heard someone whist ling again . Danny smiled and
not know what a tr uffle was . They did not
him to refuse the jelly fish . So they spent the
she was sorry for what she did . He wished she
her owners never knew what she was going to do next
forget about his pe b ble necklace . He made sure
Lily knew Mom meant what she said . They did not
town , with Mr sm ith blinking the car lights to
the fastest ," replied Fl utter . Tweet
W iggle !" cried Fl utter . But it
none of them knew what she could do with the coin
dad and told them what she did . Lisa
mom and to the ost ric hes . They learned their
She told her mom what she was doing with the book
she was sorry for what she did . He told her
wouldn 't give her what she wanted . No matter how
his neighbour , Mrs Clar ise , for help .

INTERVAL 0.059 - 0.111
CONTAINS 0.064%

It 's very popular . Many people want to see it
the rail , Lily went home and asked her mom my
. " Look !" yelled Jill .
his friend Billy . " Look what I made with my
tables and chairs . All his friends and family said how

INTERVAL 0.007 - 0.059
CONTAINS 35.030%

noise . The box closed . They were trapped .
carefully rubbed the cloth onto the ash tr ay . He
time , there was a little boy named Tim . He
all the games , but it is also important to invite
seats and a heavy rope . They run to the swing

INTERVAL -0.045 - 0.007
CONTAINS 63.774%

she realized it was impossible for her m ittens to fit
hug and said , " Thank you for helping me fix
I will leave you alone . I will go to another
around the world and it was the coolest thing ever .
He picked up the yogurt and studied it carefully . He

INTERVAL -0.097 - -0.045
CONTAINS 1.132%

. She asked him why he was doing that . Her
, “ Where can I find some
" Hello , Mr . Cro cod ile ," said
lesson . <|endoftext|> L ily and Tom are twins . They
thankful for Mrs . Johnson 's kind words . He soon
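
The four INTERVAL ranges above appear to be equal-width bins spanning MIN to MAX. The sketch below assumes that binning rule, which the dashboard itself does not state, and uses hypothetical names throughout. It recomputes the bin edges from the reported MIN and MAX and sums the CONTAINS percentages as a sanity check.

```python
# Assumption: the dashboard splits [MIN, MAX] into four equal-width bins.
# All names here are illustrative; none come from the dashboard itself.
MIN_ACT = -0.097  # value from the "MIN = -0.097" line
MAX_ACT = 0.111   # value from the "MAX = 0.111" line
N_BINS = 4


def bin_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced edges from lo to hi, rounded to 3 decimals."""
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]


edges = bin_edges(MIN_ACT, MAX_ACT, N_BINS)

# Percentages copied from the CONTAINS lines, listed from the top interval down.
contains_pct = [0.064, 35.030, 63.774, 1.132]
total_pct = sum(contains_pct)

print(edges)
print(round(total_pct, 3))
```

Under that assumption the computed edges coincide with the INTERVAL boundaries listed above, and the four percentages account for all sampled tokens.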
Token lately
Token ID16537
Feature activation+9.19e-03

Pos logit contributions
 especially+0.029
 like+0.028
 rather+0.028
 really+0.027
 three+0.027
Neg logit contributions
keeping-0.049
cakes-0.047
mons-0.041
ons-0.039
 pies-0.038
Token.
Token ID13
Feature activation-3.64e-03

Pos logit contributions
keeping+0.025
cakes+0.024
mons+0.021
ons+0.021
 pies+0.020
Neg logit contributions
 especially-0.008
 rather-0.008
 like-0.008
 really-0.007
 three-0.007
Token With
Token ID2080
Feature activation-9.33e-03

Pos logit contributions
 especially+0.006
 like+0.006
 rather+0.006
 really+0.006
 who+0.006
Neg logit contributions
keeping-0.007
cakes-0.007
mons-0.006
ons-0.006
 pies-0.005
Token work
Token ID670
Feature activation+1.02e-02

Pos logit contributions
 especially+0.010
 like+0.010
 rather+0.010
 really+0.009
 who+0.009
Neg logit contributions
keeping-0.024
cakes-0.023
mons-0.021
ons-0.020
 pies-0.019
Token and
Token ID290
Feature activation+3.33e-03

Pos logit contributions
keeping+0.025
cakes+0.025
mons+0.022
ons+0.021
 pies+0.021
Neg logit contributions
 especially-0.011
 like-0.011
 rather-0.011
 who-0.010
 really-0.010
Token err
Token ID11454
Feature activation+1.08e-01

Pos logit contributions
cakes+0.008
keeping+0.008
mons+0.007
glers+0.006
rob+0.006
Neg logit contributions
 like-0.005
 three-0.005
 really-0.005
 very-0.005
 still-0.005
Tokenands
Token ID1746
Feature activation+2.27e-02

Pos logit contributions
keeping+0.160
cakes+0.150
mons+0.119
ons+0.110
 pies+0.105
Neg logit contributions
 especially-0.236
 like-0.234
 rather-0.230
 really-0.226
 who-0.225
Token,
Token ID11
Feature activation+5.36e-03

Pos logit contributions
keeping+0.056
cakes+0.054
mons+0.048
ons+0.046
 pies+0.045
Neg logit contributions
 especially-0.027
 like-0.026
 rather-0.026
 really-0.025
 who-0.025
Token they
Token ID484
Feature activation-2.27e-03

Pos logit contributions
keeping+0.010
cakes+0.009
mons+0.008
 juices+0.007
 bisc+0.007
Neg logit contributions
 especially-0.011
 like-0.011
 three-0.010
 really-0.010
 who-0.010
Token didn
Token ID1422
Feature activation+3.16e-03

Pos logit contributions
 rather+0.003
 like+0.003
 especially+0.003
 three+0.003
 quite+0.003
Neg logit contributions
keeping-0.005
cakes-0.004
mons-0.004
ons-0.004
 pies-0.004
Token't
Token ID470
Feature activation+4.63e-03

Pos logit contributions
cakes+0.006
keeping+0.006
ons+0.005
mons+0.005
 pies+0.005
Neg logit contributions
 like-0.005
 who-0.005
 especially-0.005
 rather-0.005
 three-0.005
Token at
Token ID379
Feature activation-1.88e-02

Pos logit contributions
keeping+0.031
cakes+0.030
ons+0.027
mons+0.026
 pies+0.026
Neg logit contributions
 especially-0.013
 rather-0.012
 who-0.012
 three-0.012
 like-0.011
Token his
Token ID465
Feature activation+6.24e-03

Pos logit contributions
 especially+0.024
 rather+0.024
 like+0.024
 who+0.023
 still+0.022
Neg logit contributions
keeping-0.044
cakes-0.039
ons-0.035
mons-0.035
 pies-0.034
Token salon
Token ID46608
Feature activation+5.30e-03

Pos logit contributions
keeping+0.011
cakes+0.010
mons+0.009
 pies+0.009
 cookies+0.009
Neg logit contributions
 like-0.011
 especially-0.010
 rather-0.010
 who-0.010
 quite-0.010
Token and
Token ID290
Feature activation-1.31e-02

Pos logit contributions
keeping+0.014
cakes+0.013
mons+0.012
ons+0.011
 pies+0.011
Neg logit contributions
 especially-0.005
 rather-0.005
 like-0.005
 three-0.005
 really-0.005
Token running
Token ID2491
Feature activation+1.95e-02

Pos logit contributions
 especially+0.016
 like+0.016
 rather+0.015
 really+0.015
 who+0.015
Neg logit contributions
keeping-0.032
cakes-0.031
mons-0.027
ons-0.026
 pies-0.026
Token err
Token ID11454
Feature activation+9.72e-02

Pos logit contributions
keeping+0.054
cakes+0.049
 pies+0.042
mons+0.042
ons+0.042
Neg logit contributions
 especially-0.016
 rather-0.016
 who-0.014
 three-0.014
 still-0.013
Tokenands
Token ID1746
Feature activation+2.40e-02

Pos logit contributions
keeping+0.145
cakes+0.138
mons+0.110
ons+0.101
 pies+0.097
Neg logit contributions
 especially-0.211
 like-0.209
 rather-0.206
 really-0.202
 who-0.201
Token.
Token ID13
Feature activation+1.52e-02

Pos logit contributions
keeping+0.061
cakes+0.056
ons+0.049
mons+0.048
 pies+0.046
Neg logit contributions
 especially-0.025
 three-0.023
 rather-0.023
 who-0.022
 like-0.021
Token But
Token ID887
Feature activation+1.02e-02

Pos logit contributions
keeping+0.037
cakes+0.036
mons+0.031
 juices+0.030
ons+0.030
Neg logit contributions
 especially-0.018
 like-0.018
 rather-0.018
 quite-0.016
 who-0.016
Token every
Token ID790
Feature activation-4.83e-02

Pos logit contributions
keeping+0.024
cakes+0.023
mons+0.020
 pies+0.020
ons+0.020
Neg logit contributions
 rather-0.011
 especially-0.011
 like-0.011
 really-0.011
 quite-0.010
Token day
Token ID1110
Feature activation+2.14e-02

Pos logit contributions
 rather+0.078
 especially+0.077
 like+0.076
 still+0.073
 quite+0.073
Neg logit contributions
keeping-0.082
cakes-0.078
ment-0.065
ons-0.063
 cookies-0.059
Token give
Token ID1577
Feature activation+4.06e-02

Pos logit contributions
keeping+0.031
cakes+0.031
mons+0.026
 pies+0.025
ons+0.025
Neg logit contributions
 especially-0.022
 rather-0.022
 like-0.021
 three-0.021
 quite-0.021
Token them
Token ID606
Feature activation-1.29e-02

Pos logit contributions
keeping+0.116
cakes+0.111
mons+0.100
ment+0.097
ons+0.096
Neg logit contributions
 like-0.038
 three-0.036
 really-0.036
 very-0.036
 especially-0.035
Token towels
Token ID38494
Feature activation+1.66e-02

Pos logit contributions
 especially+0.019
 like+0.019
 rather+0.018
 really+0.018
 three+0.017
Neg logit contributions
keeping-0.029
cakes-0.027
mons-0.024
ons-0.023
 pies-0.022
Token and
Token ID290
Feature activation+4.23e-03

Pos logit contributions
keeping+0.054
cakes+0.052
mons+0.047
ons+0.046
 pies+0.045
Neg logit contributions
 especially-0.007
 rather-0.007
 like-0.006
 three-0.006
 really-0.006
Token p
Token ID279
Feature activation+4.41e-02

Pos logit contributions
keeping+0.008
cakes+0.008
mons+0.006
live+0.006
ally+0.006
Neg logit contributions
 like-0.008
 especially-0.008
 three-0.008
 who-0.008
 very-0.007
Tokenaj
Token ID1228
Feature activation+9.45e-02

Pos logit contributions
keeping+0.082
cakes+0.069
mons+0.054
Jud+0.054
ons+0.053
Neg logit contributions
 like-0.085
 especially-0.082
 at-0.081
 still-0.080
 really-0.080
Tokenamas
Token ID17485
Feature activation+4.45e-02

Pos logit contributions
keeping+0.138
cakes+0.121
mons+0.100
ons+0.093
 pies+0.086
Neg logit contributions
 like-0.206
 especially-0.205
 really-0.202
 still-0.200
 rather-0.200
Token.
Token ID13
Feature activation+4.61e-03

Pos logit contributions
keeping+0.120
cakes+0.117
mons+0.102
 pies+0.098
ons+0.098
Neg logit contributions
 especially-0.044
 rather-0.040
 three-0.038
 quite-0.038
 like-0.038
Token They
Token ID1119
Feature activation+1.57e-02

Pos logit contributions
keeping+0.010
cakes+0.010
mons+0.009
ons+0.008
 pies+0.008
Neg logit contributions
 like-0.007
 especially-0.007
 rather-0.007
 really-0.006
 who-0.006
Token give
Token ID1577
Feature activation+4.14e-02

Pos logit contributions
keeping+0.032
cakes+0.032
mons+0.027
ons+0.026
 pies+0.026
Neg logit contributions
 especially-0.024
 rather-0.023
 like-0.023
 three-0.023
 really-0.022
Token them
Token ID606
Feature activation-1.29e-02

Pos logit contributions
keeping+0.119
cakes+0.113
mons+0.103
ment+0.099
ons+0.099
Neg logit contributions
 like-0.038
 three-0.036
 really-0.036
 especially-0.036
 very-0.036
Token on
Token ID319
Feature activation+9.89e-03

Pos logit contributions
keeping+0.041
cakes+0.040
mons+0.035
 juices+0.033
ons+0.033
Neg logit contributions
 like-0.015
 especially-0.014
 really-0.014
 very-0.013
 three-0.013
Token dry
Token ID5894
Feature activation+1.78e-02

Pos logit contributions
keeping+0.025
cakes+0.025
mons+0.022
ons+0.021
 pies+0.020
Neg logit contributions
 especially-0.011
 like-0.011
 rather-0.010
 really-0.010
 who-0.010
Token and
Token ID290
Feature activation+1.12e-02

Pos logit contributions
cakes+0.036
keeping+0.035
mons+0.030
Jud+0.028
ons+0.028
Neg logit contributions
 like-0.030
 especially-0.028
 three-0.028
 who-0.027
 really-0.027
Token cozy
Token ID37438
Feature activation+1.99e-02

Pos logit contributions
keeping+0.028
cakes+0.028
mons+0.025
ons+0.024
ally+0.023
Neg logit contributions
 like-0.014
 especially-0.014
 very-0.014
 still-0.014
 really-0.014
Token p
Token ID279
Feature activation+3.90e-02

Pos logit contributions
cakes+0.044
keeping+0.042
mons+0.038
ons+0.036
Jud+0.035
Neg logit contributions
 like-0.030
 especially-0.028
 who-0.027
 three-0.027
 very-0.027
Tokenaj
Token ID1228
Feature activation+9.32e-02

Pos logit contributions
keeping+0.063
cakes+0.054
ons+0.044
mons+0.043
 juices+0.041
Neg logit contributions
 especially-0.081
 like-0.081
 still-0.079
 really-0.078
 rather-0.077
Tokenamas
Token ID17485
Feature activation+3.83e-02

Pos logit contributions
keeping+0.135
cakes+0.128
mons+0.109
ons+0.099
 pies+0.096
Neg logit contributions
 especially-0.195
 like-0.193
 really-0.192
 still-0.191
 three-0.189
Token and
Token ID290
Feature activation-6.26e-03

Pos logit contributions
keeping+0.101
cakes+0.098
mons+0.083
ons+0.081
 pies+0.079
Neg logit contributions
 especially-0.038
 rather-0.036
 like-0.034
 three-0.034
 quite-0.033
Token sn
Token ID3013
Feature activation+1.25e-02

Pos logit contributions
 like+0.009
 especially+0.009
 even+0.008
 very+0.008
 still+0.008
Neg logit contributions
cakes-0.015
keeping-0.015
mons-0.014
ons-0.012
ally-0.012
Tokenugg
Token ID6837
Feature activation+3.27e-02

Pos logit contributions
keeping+0.020
cakes+0.018
ons+0.015
mons+0.015
 pies+0.014
Neg logit contributions
 still-0.026
 like-0.026
 especially-0.025
 really-0.025
 very-0.025
Tokenled
Token ID992
Feature activation-1.65e-02

Pos logit contributions
keeping+0.042
cakes+0.041
mons+0.033
ons+0.029
 pies+0.028
Neg logit contributions
 like-0.076
 especially-0.076
 really-0.074
 still-0.074
 who-0.074
Token.
Token ID13
Feature activation+2.42e-04

Pos logit contributions
keeping+0.008
cakes+0.008
mons+0.007
ons+0.007
 pies+0.007
Neg logit contributions
 especially-0.003
 rather-0.003
 three-0.003
 like-0.003
 quite-0.003
Token They
Token ID1119
Feature activation+1.79e-02

Pos logit contributions
keeping+0.001
cakes+0.001
mons+0.001
ons+0.000
 pies+0.000
Neg logit contributions
 especially-0.000
 like-0.000
 rather-0.000
 really-0.000
 who-0.000
Token were
Token ID547
Feature activation+2.07e-02

Pos logit contributions
keeping+0.036
cakes+0.035
mons+0.030
ons+0.028
 pies+0.027
Neg logit contributions
 especially-0.029
 like-0.029
 rather-0.028
 really-0.028
 three-0.027
Token wearing
Token ID5762
Feature activation-2.01e-02

Pos logit contributions
keeping+0.051
cakes+0.048
mons+0.043
 pies+0.042
ons+0.041
Neg logit contributions
 especially-0.021
 rather-0.021
 like-0.020
 who-0.019
 three-0.019
Token p
Token ID279
Feature activation+4.69e-02

Pos logit contributions
 like+0.025
 especially+0.025
 rather+0.023
 really+0.023
 three+0.023
Neg logit contributions
keeping-0.050
cakes-0.048
mons-0.043
ons-0.040
 pies-0.039
Tokenaj
Token ID1228
Feature activation+9.28e-02

Pos logit contributions
keeping+0.075
cakes+0.065
mons+0.051
 juices+0.048
 pies+0.047
Neg logit contributions
 especially-0.099
 like-0.098
 really-0.096
 rather-0.094
 still-0.094
Tokenamas
Token ID17485
Feature activation+3.67e-02

Pos logit contributions
keeping+0.125
cakes+0.113
mons+0.091
ons+0.080
 pies+0.077
Neg logit contributions
 especially-0.216
 like-0.215
 rather-0.210
 really-0.209
 three-0.206
Token.
Token ID13
Feature activation+7.27e-03

Pos logit contributions
keeping+0.098
cakes+0.094
mons+0.081
ons+0.079
 pies+0.077
Neg logit contributions
 especially-0.036
 rather-0.033
 three-0.032
 like-0.031
 particularly-0.031
Token And
Token ID843
Feature activation+2.63e-02

Pos logit contributions
keeping+0.018
cakes+0.018
mons+0.016
ons+0.015
 pies+0.015
Neg logit contributions
 especially-0.008
 like-0.008
 rather-0.008
 really-0.008
 who-0.008
Token they
Token ID484
Feature activation-6.96e-03

Pos logit contributions
keeping+0.064
cakes+0.062
mons+0.055
ons+0.053
 pies+0.051
Neg logit contributions
 like-0.034
 especially-0.034
 who-0.032
 three-0.032
 rather-0.032
Token were
Token ID547
Feature activation+4.10e-03

Pos logit contributions
 especially+0.011
 like+0.011
 rather+0.011
 three+0.011
 really+0.010
Neg logit contributions
keeping-0.014
cakes-0.013
mons-0.011
ons-0.011
 pies-0.011
Token put
Token ID1234
Feature activation+1.99e-02

Pos logit contributions
keeping+0.039
cakes+0.038
mons+0.036
oys+0.033
 wines+0.033
Neg logit contributions
 especially-0.021
 even-0.021
 very-0.019
 still-0.019
 like-0.019
Token on
Token ID319
Feature activation+2.33e-02

Pos logit contributions
keeping+0.053
cakes+0.051
mons+0.045
ons+0.043
 juices+0.042
Neg logit contributions
 like-0.021
 especially-0.021
 really-0.020
 rather-0.019
 three-0.019
Token some
Token ID617
Feature activation+4.01e-02

Pos logit contributions
keeping+0.059
cakes+0.057
mons+0.051
ons+0.047
 juices+0.046
Neg logit contributions
 especially-0.028
 like-0.028
 really-0.027
 very-0.026
 still-0.026
Token warm
Token ID5814
Feature activation+2.83e-02

Pos logit contributions
keeping+0.082
cakes+0.075
mons+0.064
ons+0.059
 juices+0.058
Neg logit contributions
 like-0.070
 especially-0.069
 really-0.066
 rather-0.065
 who-0.065
Token p
Token ID279
Feature activation+3.80e-02

Pos logit contributions
keeping+0.062
cakes+0.057
mons+0.050
Jud+0.048
live+0.046
Neg logit contributions
 like-0.046
 and-0.043
 especially-0.042
 in-0.041
 three-0.041
Tokenaj
Token ID1228
Feature activation+9.09e-02

Pos logit contributions
keeping+0.071
cakes+0.056
ons+0.046
 juices+0.045
Jud+0.044
Neg logit contributions
 like-0.072
 especially-0.071
 still-0.070
 at-0.070
 really-0.069
Tokenamas
Token ID17485
Feature activation+4.21e-02

Pos logit contributions
keeping+0.138
cakes+0.121
mons+0.104
ons+0.095
 juices+0.091
Neg logit contributions
 especially-0.189
 like-0.188
 really-0.186
 three-0.182
 still-0.182
Token.
Token ID13
Feature activation+1.64e-02

Pos logit contributions
keeping+0.105
cakes+0.099
mons+0.082
ons+0.079
 pies+0.077
Neg logit contributions
 especially-0.050
 rather-0.046
 three-0.044
 particularly-0.044
 quite-0.044
Token She
Token ID1375
Feature activation+1.53e-02

Pos logit contributions
keeping+0.044
cakes+0.042
mons+0.038
ons+0.037
glers+0.036
Neg logit contributions
 like-0.018
 especially-0.017
 really-0.017
 rather-0.016
 three-0.016
Token sn
Token ID3013
Feature activation+1.75e-02

Pos logit contributions
keeping+0.034
cakes+0.032
mons+0.028
ons+0.026
 pies+0.025
Neg logit contributions
 especially-0.024
 like-0.023
 rather-0.022
 really-0.022
 who-0.022
Tokenugg
Token ID6837
Feature activation+2.59e-02

Pos logit contributions
keeping+0.030
cakes+0.026
glers+0.021
 pies+0.020
mons+0.020
Neg logit contributions
 like-0.035
 still-0.034
 especially-0.034
 who-0.033
 really-0.033
Token and
Token ID290
Feature activation+6.22e-03

Pos logit contributions
 like+0.004
 especially+0.004
 really+0.004
 three+0.004
 rather+0.004
Neg logit contributions
keeping-0.011
cakes-0.010
mons-0.009
ons-0.009
 juices-0.009
Token dressed
Token ID12049
Feature activation+7.90e-03

Pos logit contributions
keeping+0.016
cakes+0.015
mons+0.014
ons+0.012
 juices+0.012
Neg logit contributions
 still-0.008
 especially-0.008
 like-0.008
 really-0.008
 even-0.007
Token in
Token ID287
Feature activation-1.79e-03

Pos logit contributions
keeping+0.022
cakes+0.021
mons+0.019
ons+0.018
 pies+0.018
Neg logit contributions
 especially-0.007
 rather-0.007
 like-0.007
 who-0.006
 really-0.006
Token fresh
Token ID4713
Feature activation+3.35e-02

Pos logit contributions
 especially+0.002
 rather+0.002
 like+0.002
 who+0.002
 quite+0.002
Neg logit contributions
keeping-0.004
cakes-0.004
ons-0.003
mons-0.003
 juices-0.003
Token p
Token ID279
Feature activation+3.80e-02

Pos logit contributions
keeping+0.074
cakes+0.070
mons+0.061
ons+0.056
 pies+0.055
Neg logit contributions
 like-0.051
 especially-0.051
 really-0.048
 three-0.048
 rather-0.048
Tokenaj
Token ID1228
Feature activation+9.01e-02

Pos logit contributions
keeping+0.068
cakes+0.058
mons+0.047
ons+0.045
 juices+0.044
Neg logit contributions
 especially-0.073
 like-0.072
 really-0.071
 rather-0.070
 still-0.070
Tokenamas
Token ID17485
Feature activation+3.08e-02

Pos logit contributions
keeping+0.127
cakes+0.116
mons+0.095
ons+0.084
 pies+0.081
Neg logit contributions
 especially-0.200
 like-0.199
 really-0.196
 rather-0.195
 three-0.194
Token.
Token ID13
Feature activation+1.37e-02

Pos logit contributions
keeping+0.074
cakes+0.070
mons+0.057
ons+0.055
 pies+0.053
Neg logit contributions
 especially-0.038
 rather-0.035
 particularly-0.034
 three-0.032
 quite-0.032
Token
Token ID198
Feature activation+1.10e-02

Pos logit contributions
keeping+0.034
cakes+0.033
mons+0.029
ons+0.028
 pies+0.028
Neg logit contributions
 especially-0.016
 like-0.016
 rather-0.015
 three-0.015
 really-0.015
Token
Token ID198
Feature activation+9.44e-04

Pos logit contributions
keeping+0.029
cakes+0.028
mons+0.025
ons+0.024
 pies+0.023
Neg logit contributions
 especially-0.011
 like-0.011
 rather-0.011
 three-0.010
 quite-0.010
TokenThat
Token ID2504
Feature activation+5.12e-03

Pos logit contributions
keeping+0.002
cakes+0.002
mons+0.002
ons+0.002
 pies+0.001
Neg logit contributions
 like-0.002
 rather-0.001
 especially-0.001
 three-0.001
 quite-0.001
Token song
Token ID3496
Feature activation-1.94e-06

Pos logit contributions
cakes+0.035
keeping+0.033
 pies+0.028
 juices+0.027
mons+0.027
Neg logit contributions
 especially-0.028
 rather-0.027
 like-0.025
 still-0.025
 who-0.025
Token about
Token ID546
Feature activation-1.63e-02

Feature not active on prev token;
no predictions were affected.
Token rain
Token ID6290
Feature activation+3.12e-02

Pos logit contributions
 especially+0.020
 like+0.019
 rather+0.019
 really+0.018
 quite+0.018
Neg logit contributions
keeping-0.039
cakes-0.038
mons-0.033
ons-0.032
 pies-0.032
Tokenbows
Token ID25435
Feature activation-1.01e-02

Pos logit contributions
keeping+0.036
cakes+0.034
mons+0.025
ons+0.022
 pies+0.021
Neg logit contributions
 especially-0.078
 like-0.077
 rather-0.076
 really-0.075
 who-0.075
Token and
Token ID290
Feature activation+1.51e-02

Pos logit contributions
 especially+0.010
 rather+0.009
 particularly+0.009
 three+0.009
 quite+0.009
Neg logit contributions
cakes-0.026
keeping-0.026
ons-0.022
mons-0.022
 pies-0.022
Token umb
Token ID20810
Feature activation+8.96e-02

Pos logit contributions
keeping+0.036
cakes+0.035
mons+0.031
ons+0.029
live+0.029
Neg logit contributions
 like-0.022
 especially-0.020
 three-0.020
 who-0.020
 even-0.019
Tokenrell
Token ID11252
Feature activation+2.84e-02

Pos logit contributions
keeping+0.154
cakes+0.137
mons+0.121
ment+0.108
 juices+0.105
Neg logit contributions
 like-0.180
 especially-0.173
 really-0.169
 rather-0.167
 still-0.164
Tokenas
Token ID292
Feature activation-2.07e-03

Pos logit contributions
keeping+0.064
cakes+0.061
mons+0.052
ons+0.051
 juices+0.050
Neg logit contributions
 like-0.041
 especially-0.040
 rather-0.038
 really-0.038
 who-0.037
Token.
Token ID13
Feature activation+9.68e-03

Pos logit contributions
 especially+0.002
 rather+0.002
 quite+0.002
 really+0.002
 three+0.002
Neg logit contributions
keeping-0.005
cakes-0.005
mons-0.004
ons-0.004
 pies-0.004
Token They
Token ID1119
Feature activation+2.19e-02

Pos logit contributions
keeping+0.022
cakes+0.021
mons+0.019
ons+0.018
 pies+0.017
Neg logit contributions
 especially-0.014
 like-0.014
 rather-0.013
 really-0.013
 three-0.013
Token were
Token ID547
Feature activation+2.04e-02

Pos logit contributions
keeping+0.045
cakes+0.044
mons+0.037
ons+0.035
 pies+0.034
Neg logit contributions
 especially-0.035
 like-0.034
 rather-0.034
 really-0.033
 who-0.032
Token,
Token ID11
Feature activation+2.37e-03

Pos logit contributions
keeping+0.010
cakes+0.009
mons+0.008
 pies+0.008
rob+0.008
Neg logit contributions
 like-0.002
 who-0.002
 three-0.002
 at-0.002
 especially-0.002
Token it
Token ID340
Feature activation+1.71e-02

Pos logit contributions
keeping+0.007
cakes+0.006
mons+0.006
live+0.006
ages+0.006
Neg logit contributions
 like-0.003
 even-0.002
 so-0.002
 very-0.002
 especially-0.002
Token kept
Token ID4030
Feature activation-9.12e-03

Pos logit contributions
keeping+0.033
cakes+0.032
mons+0.027
ons+0.026
 pies+0.025
Neg logit contributions
 especially-0.029
 like-0.029
 rather-0.028
 really-0.027
 three-0.027
Token on
Token ID319
Feature activation+1.87e-03

Pos logit contributions
 especially+0.012
 like+0.011
 rather+0.011
 who+0.011
 really+0.011
Neg logit contributions
keeping-0.022
cakes-0.020
mons-0.018
ons-0.017
 pies-0.017
Token flying
Token ID7348
Feature activation-2.32e-02

Pos logit contributions
keeping+0.004
cakes+0.004
mons+0.003
 pies+0.003
ons+0.003
Neg logit contributions
 especially-0.003
 like-0.002
 rather-0.002
 who-0.002
 really-0.002
Token err
Token ID11454
Feature activation+8.94e-02

Pos logit contributions
 especially+0.019
 rather+0.018
 three+0.017
 who+0.016
 like+0.016
Neg logit contributions
keeping-0.065
cakes-0.061
mons-0.054
 pies-0.053
ons-0.053
Tokenatically
Token ID4142
Feature activation+3.00e-03

Pos logit contributions
keeping+0.134
cakes+0.128
mons+0.103
ons+0.094
 pies+0.090
Neg logit contributions
 especially-0.193
 like-0.191
 rather-0.188
 really-0.184
 who-0.183
Token with
Token ID351
Feature activation-5.32e-03

Pos logit contributions
keeping+0.008
cakes+0.007
mons+0.007
 pies+0.006
ons+0.006
Neg logit contributions
 especially-0.003
 rather-0.003
 really-0.002
 three-0.002
 particularly-0.002
Token inexplicable
Token ID49602
Feature activation+6.43e-03

Pos logit contributions
 like+0.006
 three+0.006
 especially+0.005
 really+0.005
 very+0.005
Neg logit contributions
cakes-0.014
keeping-0.014
mons-0.013
ons-0.011
 pies-0.011
Token speed
Token ID2866
Feature activation-8.24e-03

Pos logit contributions
keeping+0.014
cakes+0.012
mons+0.011
ment+0.011
 pies+0.011
Neg logit contributions
 especially-0.010
 rather-0.009
 who-0.009
 like-0.009
 three-0.008
Token.
Token ID13
Feature activation+3.02e-03

Pos logit contributions
 especially+0.009
 rather+0.008
 like+0.008
 three+0.008
 who+0.007
Neg logit contributions
keeping-0.022
cakes-0.021
mons-0.019
ons-0.018
 pies-0.017
Token door
Token ID3420
Feature activation+4.89e-02

Pos logit contributions
keeping+0.036
cakes+0.035
mons+0.031
ons+0.030
 pies+0.029
Neg logit contributions
 especially-0.013
 like-0.013
 rather-0.013
 really-0.012
 quite-0.012
Token.
Token ID13
Feature activation+8.06e-03

Pos logit contributions
keeping+0.124
cakes+0.123
mons+0.104
ons+0.101
 pies+0.100
Neg logit contributions
 especially-0.055
 rather-0.053
 like-0.050
 really-0.048
 quite-0.048
Token Inside
Token ID14384
Feature activation+2.55e-03

Pos logit contributions
keeping+0.021
cakes+0.020
mons+0.018
ons+0.017
 pies+0.017
Neg logit contributions
 especially-0.009
 like-0.009
 rather-0.009
 really-0.009
 three-0.009
Token were
Token ID547
Feature activation+1.11e-02

Pos logit contributions
keeping+0.007
cakes+0.006
mons+0.006
ons+0.005
 pies+0.005
Neg logit contributions
 especially-0.003
 like-0.003
 rather-0.003
 really-0.003
 who-0.003
Token two
Token ID734
Feature activation+5.26e-02

Pos logit contributions
keeping+0.028
cakes+0.026
mons+0.024
ons+0.022
ment+0.022
Neg logit contributions
 three-0.015
 especially-0.015
 like-0.014
 very-0.014
 really-0.014
Token umb
Token ID20810
Feature activation+8.89e-02

Pos logit contributions
keeping+0.118
cakes+0.107
mons+0.095
ons+0.086
ment+0.084
Neg logit contributions
 especially-0.084
 like-0.083
 really-0.079
 three-0.079
 who-0.078
Tokenrell
Token ID11252
Feature activation+2.49e-02

Pos logit contributions
keeping+0.167
cakes+0.144
mons+0.137
ment+0.130
kins+0.118
Neg logit contributions
 like-0.158
 especially-0.151
 really-0.147
 and-0.145
 still-0.141
Tokenas
Token ID292
Feature activation+2.76e-03

Pos logit contributions
keeping+0.059
cakes+0.054
ons+0.047
 juices+0.047
mons+0.047
Neg logit contributions
 like-0.033
 especially-0.032
 who-0.030
 really-0.030
 three-0.030
Token -
Token ID532
Feature activation+1.89e-02

Pos logit contributions
keeping+0.008
cakes+0.007
mons+0.007
ons+0.006
 pies+0.006
Neg logit contributions
 especially-0.002
 like-0.002
 rather-0.002
 really-0.002
 who-0.002
Token one
Token ID530
Feature activation+1.71e-02

Pos logit contributions
keeping+0.046
cakes+0.043
mons+0.038
ons+0.036
 juices+0.035
Neg logit contributions
 especially-0.025
 like-0.025
 three-0.024
 rather-0.024
 who-0.024
Token big
Token ID1263
Feature activation+2.24e-02

Pos logit contributions
keeping+0.041
cakes+0.041
mons+0.036
ons+0.034
 pies+0.034
Neg logit contributions
 especially-0.020
 like-0.020
 rather-0.020
 really-0.019
 quite-0.019
Token,
Token ID11
Feature activation+1.60e-02

Pos logit contributions
keeping+0.141
cakes+0.133
mons+0.119
ons+0.110
ment+0.108
Neg logit contributions
 like-0.041
 especially-0.039
 three-0.036
 really-0.035
 who-0.034
Token fifteen
Token ID17280
Feature activation+5.38e-02

Pos logit contributions
cakes+0.031
keeping+0.029
 pies+0.027
mons+0.026
ons+0.025
Neg logit contributions
 especially-0.026
 rather-0.025
 quite-0.024
 particularly-0.024
 like-0.023
Token,
Token ID11
Feature activation+1.88e-02

Pos logit contributions
keeping+0.144
cakes+0.138
mons+0.122
ons+0.116
 juices+0.112
Neg logit contributions
 like-0.057
 especially-0.055
 rather-0.052
 three-0.052
 really-0.052
Token sixteen
Token ID27913
Feature activation+5.31e-02

Pos logit contributions
cakes+0.035
keeping+0.033
 pies+0.030
mons+0.029
 juices+0.028
Neg logit contributions
 especially-0.032
 rather-0.031
 quite-0.029
 particularly-0.029
 who-0.029
Token,
Token ID11
Feature activation+1.85e-02

Pos logit contributions
keeping+0.162
cakes+0.151
mons+0.136
ons+0.127
live+0.124
Neg logit contributions
 like-0.044
 especially-0.041
 three-0.038
 really-0.036
 very-0.035
Token seventeen
Token ID38741
Feature activation+8.88e-02

Pos logit contributions
cakes+0.031
keeping+0.029
 pies+0.026
mons+0.025
 juices+0.025
Neg logit contributions
 especially-0.035
 rather-0.034
 quite-0.032
 who-0.032
 particularly-0.032
Token,
Token ID11
Feature activation+2.21e-02

Pos logit contributions
keeping+0.263
cakes+0.247
mons+0.220
ons+0.208
 juices+0.203
Neg logit contributions
 like-0.077
 especially-0.072
 really-0.066
 three-0.065
 rather-0.064
Token eighteen
Token ID29095
Feature activation+6.90e-02

Pos logit contributions
cakes+0.036
keeping+0.032
 pies+0.031
mons+0.029
 juices+0.028
Neg logit contributions
 especially-0.042
 rather-0.041
 quite-0.039
 particularly-0.039
 who-0.038
Token,
Token ID11
Feature activation+2.13e-02

Pos logit contributions
keeping+0.201
cakes+0.190
mons+0.170
ons+0.160
 juices+0.157
Neg logit contributions
 like-0.061
 especially-0.059
 really-0.054
 three-0.054
 rather-0.054
Token nineteen
Token ID43063
Feature activation+3.05e-02

Pos logit contributions
cakes+0.037
keeping+0.033
 pies+0.032
mons+0.030
 juices+0.029
Neg logit contributions
 especially-0.038
 rather-0.037
 particularly-0.036
 quite-0.035
 really-0.034
Token,
Token ID11
Feature activation+2.21e-02

Pos logit contributions
keeping+0.074
cakes+0.074
mons+0.066
 pies+0.064
ons+0.063
Neg logit contributions
 especially-0.033
 rather-0.032
 like-0.031
 quite-0.030
 who-0.030
Token,
Token ID11
Feature activation+1.57e-02

Pos logit contributions
keeping+0.133
cakes+0.127
mons+0.114
ons+0.107
 juices+0.105
Neg logit contributions
 like-0.047
 especially-0.047
 rather-0.044
 really-0.043
 three-0.043
Token fifteen
Token ID17280
Feature activation+5.40e-02

Pos logit contributions
cakes+0.029
keeping+0.025
 pies+0.025
mons+0.024
ons+0.023
Neg logit contributions
 especially-0.027
 rather-0.026
 quite-0.025
 particularly-0.025
 who-0.024
Token,
Token ID11
Feature activation+1.93e-02

Pos logit contributions
keeping+0.130
cakes+0.129
mons+0.113
ons+0.110
 pies+0.109
Neg logit contributions
 especially-0.064
 rather-0.063
 like-0.061
 who-0.058
 really-0.058
Token sixteen
Token ID27913
Feature activation+5.43e-02

Pos logit contributions
cakes+0.035
keeping+0.030
 pies+0.030
mons+0.028
ons+0.027
Neg logit contributions
 especially-0.034
 rather-0.033
 quite-0.031
 particularly-0.031
 who-0.031
Token,
Token ID11
Feature activation+1.90e-02

Pos logit contributions
keeping+0.142
cakes+0.138
mons+0.122
ons+0.118
 pies+0.116
Neg logit contributions
 especially-0.056
 like-0.055
 rather-0.054
 really-0.051
 who-0.051
Token seventeen
Token ID38741
Feature activation+8.88e-02

Pos logit contributions
cakes+0.032
keeping+0.027
 pies+0.026
mons+0.024
 juices+0.024
Neg logit contributions
 especially-0.036
 rather-0.036
 quite-0.034
 particularly-0.034
 who-0.034
Token,
Token ID11
Feature activation+2.21e-02

Pos logit contributions
keeping+0.244
cakes+0.234
mons+0.209
ons+0.199
 juices+0.194
Neg logit contributions
 like-0.084
 especially-0.084
 rather-0.079
 really-0.077
 three-0.076
Token eighteen
Token ID29095
Feature activation+6.75e-02

Pos logit contributions
cakes+0.038
 pies+0.032
keeping+0.031
mons+0.029
ons+0.028
Neg logit contributions
 especially-0.041
 rather-0.041
 particularly-0.039
 quite-0.039
 who-0.038
Token,
Token ID11
Feature activation+2.10e-02

Pos logit contributions
keeping+0.177
cakes+0.172
mons+0.153
ons+0.148
 pies+0.145
Neg logit contributions
 especially-0.069
 like-0.067
 rather-0.066
 really-0.062
 who-0.061
Token nineteen
Token ID43063
Feature activation+3.00e-02

Pos logit contributions
cakes+0.037
 pies+0.032
keeping+0.031
mons+0.029
ons+0.028
Neg logit contributions
 especially-0.038
 rather-0.037
 particularly-0.036
 quite-0.035
 who-0.034
Token,
Token ID11
Feature activation+2.17e-02

Pos logit contributions
cakes+0.071
keeping+0.069
mons+0.063
 pies+0.062
ons+0.061
Neg logit contributions
 especially-0.034
 rather-0.033
 like-0.031
 quite-0.030
 who-0.030
TokenB
Token ID33
Feature activation+6.12e-02

Pos logit contributions
 especially+0.007
 like+0.007
 rather+0.007
 who+0.007
 really+0.007
Neg logit contributions
keeping-0.009
cakes-0.009
mons-0.008
ons-0.007
 pies-0.007
Tokenoom
Token ID4207
Feature activation+1.80e-02

Pos logit contributions
keeping+0.116
cakes+0.111
 pies+0.089
mons+0.088
ons+0.086
Neg logit contributions
 like-0.107
 especially-0.104
 rather-0.102
 three-0.101
 who-0.100
Token!"
Token ID2474
Feature activation+6.89e-03

Pos logit contributions
keeping+0.040
cakes+0.039
mons+0.034
ons+0.033
 juices+0.032
Neg logit contributions
 especially-0.025
 rather-0.024
 like-0.024
 quite-0.023
 really-0.023
Token and
Token ID290
Feature activation-7.34e-03

Pos logit contributions
keeping+0.019
cakes+0.018
mons+0.016
ons+0.016
 pies+0.015
Neg logit contributions
 especially-0.007
 like-0.006
 rather-0.006
 really-0.006
 who-0.006
Token "
Token ID366
Feature activation-1.67e-02

Pos logit contributions
 especially+0.010
 like+0.010
 rather+0.010
 really+0.009
 three+0.009
Neg logit contributions
keeping-0.017
cakes-0.016
mons-0.014
ons-0.014
 pies-0.013
TokenP
Token ID47
Feature activation+8.85e-02

Pos logit contributions
 still+0.029
 rather+0.029
 especially+0.028
 at+0.028
 who+0.027
Neg logit contributions
keeping-0.028
cakes-0.028
mons-0.023
 pies-0.022
ons-0.020
Tokenow
Token ID322
Feature activation+6.49e-03

Pos logit contributions
keeping+0.153
cakes+0.140
mons+0.113
 pies+0.104
ons+0.104
Neg logit contributions
 like-0.176
 especially-0.174
 rather-0.173
 really-0.170
 who-0.168
Token!"
Token ID2474
Feature activation+7.23e-03

Pos logit contributions
keeping+0.014
cakes+0.013
mons+0.011
ons+0.011
 pies+0.011
Neg logit contributions
 especially-0.010
 like-0.010
 rather-0.010
 really-0.009
 who-0.009
Token He
Token ID679
Feature activation+1.52e-02

Pos logit contributions
keeping+0.020
cakes+0.019
mons+0.017
ons+0.016
 pies+0.016
Neg logit contributions
 like-0.007
 especially-0.007
 really-0.007
 rather-0.007
 who-0.007
Token thinks
Token ID6834
Feature activation+2.44e-03

Pos logit contributions
cakes+0.023
keeping+0.021
 juices+0.019
ons+0.018
 pies+0.018
Neg logit contributions
 especially-0.028
 rather-0.028
 quite-0.027
 particularly-0.026
 who-0.026
Token his
Token ID465
Feature activation+1.40e-02

Pos logit contributions
cakes+0.005
keeping+0.005
ons+0.004
mons+0.004
 pies+0.004
Neg logit contributions
 especially-0.003
 rather-0.003
 like-0.003
 still-0.003
 quite-0.003
Token played
Token ID2826
Feature activation+2.16e-02

Pos logit contributions
keeping+0.035
cakes+0.033
mons+0.029
ons+0.027
 juices+0.026
Neg logit contributions
 especially-0.023
 like-0.022
 really-0.022
 still-0.021
 who-0.021
Token together
Token ID1978
Feature activation+8.95e-03

Pos logit contributions
keeping+0.057
cakes+0.055
mons+0.049
ons+0.047
 juices+0.046
Neg logit contributions
 like-0.023
 especially-0.023
 rather-0.021
 really-0.021
 who-0.021
Token all
Token ID477
Feature activation-2.08e-02

Pos logit contributions
keeping+0.020
cakes+0.019
 sandwiches+0.018
 pies+0.018
ons+0.017
Neg logit contributions
 rather-0.010
 especially-0.009
 particularly-0.009
 quite-0.008
 who-0.008
Token day
Token ID1110
Feature activation+2.74e-02

Pos logit contributions
 rather+0.035
 especially+0.034
 quite+0.031
 like+0.031
 three+0.031
Neg logit contributions
keeping-0.041
cakes-0.039
ons-0.032
mons-0.032
 pies-0.031
Token and
Token ID290
Feature activation+1.94e-02

Pos logit contributions
keeping+0.068
cakes+0.064
mons+0.058
ons+0.056
 pies+0.054
Neg logit contributions
 especially-0.031
 rather-0.030
 like-0.030
 three-0.028
 quite-0.027
Token Mitt
Token ID16627
Feature activation+8.84e-02

Pos logit contributions
cakes+0.050
keeping+0.049
mons+0.043
ons+0.040
irens+0.040
Neg logit contributions
 even-0.028
 like-0.027
 especially-0.027
 very-0.026
 still-0.026
Tokenens
Token ID641
Feature activation+1.17e-02

Pos logit contributions
keeping+0.155
cakes+0.133
ons+0.102
ment+0.098
glers+0.096
Neg logit contributions
 like-0.196
 really-0.179
 because-0.177
 and-0.172
 especially-0.172
Token pur
Token ID1308
Feature activation+3.29e-03

Pos logit contributions
keeping+0.025
cakes+0.024
mons+0.021
ons+0.020
 pies+0.019
Neg logit contributions
 especially-0.018
 like-0.018
 rather-0.017
 really-0.017
 who-0.017
Tokenred
Token ID445
Feature activation-1.57e-02

Pos logit contributions
cakes+0.007
keeping+0.007
mons+0.006
ons+0.006
 pies+0.005
Neg logit contributions
 like-0.005
 especially-0.005
 rather-0.005
 three-0.005
 really-0.005
Token happily
Token ID18177
Feature activation-4.18e-02

Pos logit contributions
 especially+0.016
 rather+0.015
 like+0.015
 who+0.014
 quite+0.014
Neg logit contributions
keeping-0.041
cakes-0.039
mons-0.035
 pies-0.033
ons-0.033
Token in
Token ID287
Feature activation-2.11e-02

Pos logit contributions
 especially+0.049
 rather+0.045
 like+0.043
 really+0.041
 particularly+0.041
Neg logit contributions
keeping-0.107
cakes-0.100
mons-0.089
ons-0.083
 pies-0.083
Token and
Token ID290
Feature activation+8.12e-03

Pos logit contributions
cakes+0.003
keeping+0.003
mons+0.003
ons+0.003
 pies+0.003
Neg logit contributions
 especially-0.001
 rather-0.001
 like-0.001
 who-0.001
 quite-0.001
Token a
Token ID257
Feature activation+1.83e-02

Pos logit contributions
keeping+0.023
cakes+0.019
mons+0.018
live+0.018
glers+0.017
Neg logit contributions
 three-0.011
 like-0.010
 even-0.010
 really-0.010
 especially-0.010
Token brush
Token ID14093
Feature activation-1.52e-02

Pos logit contributions
cakes+0.043
keeping+0.043
mons+0.037
ons+0.036
 pies+0.036
Neg logit contributions
 especially-0.022
 like-0.021
 rather-0.021
 who-0.020
 quite-0.019
Token.
Token ID13
Feature activation+3.33e-03

Pos logit contributions
 especially+0.017
 rather+0.016
 like+0.015
 who+0.015
 quite+0.015
Neg logit contributions
cakes-0.038
keeping-0.038
mons-0.033
ons-0.032
 pies-0.032
Token "
Token ID366
Feature activation-1.11e-02

Pos logit contributions
keeping+0.009
cakes+0.009
mons+0.008
ons+0.008
 pies+0.007
Neg logit contributions
 especially-0.003
 like-0.003
 rather-0.003
 three-0.003
 really-0.003
TokenP
Token ID47
Feature activation+8.83e-02

Pos logit contributions
 especially+0.019
 like+0.019
 rather+0.019
 really+0.018
 still+0.018
Neg logit contributions
keeping-0.021
cakes-0.021
mons-0.017
ons-0.016
 juices-0.016
Tokenaint
Token ID2913
Feature activation+1.33e-02

Pos logit contributions
keeping+0.179
cakes+0.157
mons+0.132
 pies+0.127
glers+0.124
Neg logit contributions
 like-0.151
 especially-0.151
 rather-0.150
 really-0.148
 three-0.147
Token the
Token ID262
Feature activation+3.99e-03

Pos logit contributions
keeping+0.032
cakes+0.031
mons+0.028
ons+0.026
 pies+0.026
Neg logit contributions
 especially-0.016
 like-0.016
 rather-0.016
 really-0.015
 who-0.015
Token wall
Token ID3355
Feature activation-3.91e-03

Pos logit contributions
keeping+0.008
cakes+0.008
mons+0.007
ons+0.006
 pies+0.006
Neg logit contributions
 especially-0.006
 like-0.006
 rather-0.006
 really-0.006
 who-0.006
Token!
Token ID0
Feature activation-1.04e-02

Pos logit contributions
 especially+0.006
 rather+0.005
 like+0.005
 quite+0.005
 who+0.005
Neg logit contributions
cakes-0.009
keeping-0.008
mons-0.007
ons-0.007
 pies-0.007
Token Go
Token ID1514
Feature activation+2.79e-02

Pos logit contributions
 especially+0.012
 like+0.012
 rather+0.012
 really+0.012
 three+0.011
Neg logit contributions
keeping-0.026
cakes-0.025
mons-0.022
ons-0.021
 pies-0.021
Token chicken
Token ID9015
Feature activation-7.23e-03

Pos logit contributions
keeping+0.026
cakes+0.025
mons+0.022
ons+0.021
 juices+0.020
Neg logit contributions
 especially-0.015
 like-0.015
 three-0.014
 rather-0.014
 really-0.014
Token likes
Token ID7832
Feature activation-1.82e-02

Pos logit contributions
 especially+0.010
 like+0.010
 rather+0.010
 really+0.009
 who+0.009
Neg logit contributions
keeping-0.016
cakes-0.016
mons-0.014
ons-0.013
 pies-0.013
Token corn
Token ID11676
Feature activation-9.97e-03

Pos logit contributions
 especially+0.019
 like+0.019
 rather+0.019
 still+0.018
 quite+0.018
Neg logit contributions
keeping-0.046
cakes-0.045
mons-0.039
ons-0.038
 pies-0.038
Token.
Token ID13
Feature activation-4.56e-04

Pos logit contributions
 especially+0.010
 three+0.010
 rather+0.010
 quite+0.009
 who+0.009
Neg logit contributions
keeping-0.025
cakes-0.024
mons-0.021
 pies-0.021
 juices-0.020
Token Ha
Token ID9398
Feature activation+7.50e-02

Pos logit contributions
 especially+0.001
 like+0.001
 rather+0.001
 really+0.001
 quite+0.001
Neg logit contributions
keeping-0.001
cakes-0.001
mons-0.001
ons-0.001
 juices-0.001
Token ha
Token ID387
Feature activation+8.75e-02

Pos logit contributions
keeping+0.184
cakes+0.174
mons+0.145
ons+0.134
 pies+0.134
Neg logit contributions
 like-0.107
 three-0.102
 especially-0.102
 who-0.100
 rather-0.098
Token,
Token ID11
Feature activation-3.93e-03

Pos logit contributions
keeping+0.198
cakes+0.190
mons+0.164
ons+0.155
 pies+0.152
Neg logit contributions
 especially-0.125
 like-0.124
 rather-0.120
 really-0.117
 who-0.117
Token do
Token ID466
Feature activation+1.03e-02

Pos logit contributions
 especially+0.006
 like+0.006
 rather+0.006
 really+0.006
 quite+0.005
Neg logit contributions
keeping-0.008
cakes-0.008
mons-0.007
ons-0.007
 pies-0.006
Token you
Token ID345
Feature activation+1.11e-02

Pos logit contributions
keeping+0.024
cakes+0.023
mons+0.020
ons+0.019
 pies+0.019
Neg logit contributions
 especially-0.014
 like-0.013
 rather-0.013
 really-0.013
 three-0.013
Token get
Token ID651
Feature activation+2.75e-02

Pos logit contributions
keeping+0.028
cakes+0.027
mons+0.024
ons+0.023
 juices+0.023
Neg logit contributions
 especially-0.012
 rather-0.011
 who-0.011
 three-0.011
 like-0.010
Token it
Token ID340
Feature activation-8.60e-03

Pos logit contributions
keeping+0.069
cakes+0.067
mons+0.059
ons+0.057
 pies+0.055
Neg logit contributions
 especially-0.032
 like-0.031
 rather-0.031
 really-0.029
 who-0.029
Token and
Token ID290
Feature activation+7.94e-03

Pos logit contributions
 especially+0.011
 rather+0.010
 like+0.010
 quite+0.009
 who+0.009
Neg logit contributions
cakes-0.039
keeping-0.039
mons-0.034
ons-0.033
 pies-0.033
Token started
Token ID2067
Feature activation-6.98e-04

Pos logit contributions
keeping+0.019
cakes+0.019
mons+0.016
glers+0.015
mm+0.015
Neg logit contributions
 still-0.011
 even-0.011
 like-0.010
 who-0.010
 really-0.010
Token to
Token ID284
Feature activation+2.84e-02

Pos logit contributions
 especially+0.001
 like+0.001
 rather+0.001
 really+0.001
 who+0.001
Neg logit contributions
keeping-0.002
cakes-0.002
mons-0.002
ons-0.001
 pies-0.001
Token climb
Token ID12080
Feature activation+1.92e-02

Pos logit contributions
keeping+0.071
cakes+0.070
mons+0.062
ons+0.057
 juices+0.056
Neg logit contributions
 like-0.035
 especially-0.035
 really-0.033
 rather-0.033
 still-0.033
Token.
Token ID13
Feature activation+1.43e-02

Pos logit contributions
keeping+0.051
cakes+0.049
mons+0.043
ons+0.042
 pies+0.041
Neg logit contributions
 especially-0.020
 like-0.019
 rather-0.019
 who-0.018
 three-0.018
Token Wh
Token ID854
Feature activation+8.64e-02

Pos logit contributions
keeping+0.036
cakes+0.034
mons+0.030
ons+0.030
 pies+0.029
Neg logit contributions
 like-0.017
 especially-0.017
 rather-0.016
 three-0.016
 really-0.016
Tokenith
Token ID342
Feature activation-2.67e-02

Pos logit contributions
keeping+0.140
cakes+0.133
mons+0.102
ment+0.098
 pies+0.095
Neg logit contributions
 like-0.176
 especially-0.175
 really-0.173
 still-0.172
 rather-0.171
Token her
Token ID607
Feature activation+2.15e-02

Pos logit contributions
 especially+0.042
 like+0.041
 rather+0.040
 really+0.039
 who+0.039
Neg logit contributions
keeping-0.057
cakes-0.054
mons-0.047
ons-0.044
 pies-0.043
Token new
Token ID649
Feature activation-1.47e-02

Pos logit contributions
keeping+0.048
cakes+0.046
mons+0.040
ons+0.038
 pies+0.037
Neg logit contributions
 especially-0.031
 like-0.031
 rather-0.030
 really-0.029
 who-0.029
Token brush
Token ID14093
Feature activation-1.69e-02

Pos logit contributions
 like+0.020
 especially+0.020
 three+0.020
 rather+0.020
 really+0.019
Neg logit contributions
keeping-0.034
cakes-0.033
mons-0.028
ons-0.027
glers-0.026
Token,
Token ID11
Feature activation+9.37e-04

Pos logit contributions
 especially+0.018
 rather+0.017
 like+0.017
 who+0.016
 three+0.016
Neg logit contributions
keeping-0.043
cakes-0.043
mons-0.037
ons-0.036
 pies-0.036
Token around
Token ID1088
Feature activation-2.11e-02

Pos logit contributions
keeping+0.104
cakes+0.101
mons+0.087
 pies+0.086
ons+0.084
Neg logit contributions
 especially-0.029
 rather-0.028
 who-0.025
 three-0.023
 quite-0.023
Token and
Token ID290
Feature activation+1.77e-02

Pos logit contributions
 rather+0.021
 especially+0.020
 like+0.019
 quite+0.018
 three+0.018
Neg logit contributions
keeping-0.054
cakes-0.050
 pies-0.045
mons-0.045
ons-0.045
Token around
Token ID1088
Feature activation-9.06e-03

Pos logit contributions
keeping+0.039
cakes+0.037
mons+0.032
ment+0.031
ons+0.030
Neg logit contributions
 especially-0.029
 like-0.028
 really-0.028
 very-0.027
 still-0.027
Token,
Token ID11
Feature activation+2.13e-02

Pos logit contributions
 rather+0.010
 especially+0.010
 quite+0.009
 really+0.009
 three+0.009
Neg logit contributions
keeping-0.021
cakes-0.019
 pies-0.017
mons-0.017
ons-0.017
Token feeling
Token ID4203
Feature activation-6.54e-03

Pos logit contributions
keeping+0.048
cakes+0.047
mons+0.041
 pies+0.039
ons+0.039
Neg logit contributions
 rather-0.027
 especially-0.027
 like-0.026
 three-0.026
 really-0.025
Token ha
Token ID387
Feature activation+8.64e-02

Pos logit contributions
 especially+0.006
 rather+0.006
 who+0.006
 three+0.006
 like+0.006
Neg logit contributions
keeping-0.016
cakes-0.016
mons-0.014
ons-0.014
 juices-0.014
Tokenpy
Token ID9078
Feature activation+8.96e-04

Pos logit contributions
keeping+0.180
cakes+0.172
mons+0.144
ons+0.135
 juices+0.134
Neg logit contributions
 especially-0.142
 like-0.138
 who-0.134
 rather-0.133
 really-0.132
Token and
Token ID290
Feature activation-8.08e-03

Pos logit contributions
keeping+0.002
cakes+0.002
mons+0.002
ons+0.002
 pies+0.002
Neg logit contributions
 especially-0.001
 rather-0.001
 like-0.001
 three-0.001
 really-0.001
Token free
Token ID1479
Feature activation-1.34e-03

Pos logit contributions
 especially+0.011
 like+0.011
 even+0.010
 very+0.010
 still+0.010
Neg logit contributions
cakes-0.020
keeping-0.019
mons-0.018
ons-0.016
irens-0.016
Token.
Token ID13
Feature activation+6.33e-03

Pos logit contributions
 like+0.001
 especially+0.001
 rather+0.001
 really+0.001
 three+0.001
Neg logit contributions
keeping-0.004
cakes-0.004
mons-0.003
ons-0.003
 pies-0.003
Token
Token ID220
Feature activation-4.96e-03

Pos logit contributions
keeping+0.018
cakes+0.017
mons+0.016
 pies+0.015
 sandwiches+0.015
Neg logit contributions
 really-0.006
 like-0.006
 especially-0.006
 rather-0.005
 still-0.005
Token water
Token ID1660
Feature activation+1.23e-02

Pos logit contributions
keeping+0.068
cakes+0.067
mons+0.061
Jud+0.056
live+0.056
Neg logit contributions
 like-0.048
 and-0.046
 three-0.045
 really-0.044
 especially-0.044
Token and
Token ID290
Feature activation+2.04e-02

Pos logit contributions
keeping+0.037
cakes+0.036
mons+0.033
ons+0.031
 pies+0.031
Neg logit contributions
 especially-0.008
 rather-0.007
 like-0.007
 three-0.006
 quite-0.006
Token soap
Token ID19533
Feature activation+2.06e-02

Pos logit contributions
keeping+0.053
cakes+0.051
mons+0.046
ue+0.044
irens+0.043
Neg logit contributions
 especially-0.027
 like-0.026
 even-0.026
 still-0.026
 very-0.025
Token and
Token ID290
Feature activation+1.49e-02

Pos logit contributions
keeping+0.058
cakes+0.056
mons+0.050
ons+0.048
 pies+0.048
Neg logit contributions
 especially-0.017
 like-0.016
 rather-0.016
 three-0.015
 really-0.015
Token put
Token ID1234
Feature activation+2.98e-02

Pos logit contributions
keeping+0.034
cakes+0.034
mons+0.030
ons+0.027
ue+0.027
Neg logit contributions
 especially-0.023
 like-0.023
 even-0.022
 still-0.021
 really-0.021
Token Mitt
Token ID16627
Feature activation+8.64e-02

Pos logit contributions
keeping+0.079
cakes+0.077
mons+0.069
ons+0.064
 pies+0.063
Neg logit contributions
 like-0.033
 especially-0.031
 really-0.031
 very-0.029
 still-0.029
Tokenens
Token ID641
Feature activation+3.11e-03

Pos logit contributions
keeping+0.130
cakes+0.114
ons+0.088
mons+0.081
Jud+0.081
Neg logit contributions
 like-0.205
 especially-0.191
 really-0.189
 who-0.184
 rather-0.183
Token in
Token ID287
Feature activation-1.10e-02

Pos logit contributions
keeping+0.009
cakes+0.008
mons+0.008
ons+0.007
 pies+0.007
Neg logit contributions
 rather-0.002
 especially-0.002
 like-0.002
 quite-0.002
 three-0.002
Token.
Token ID13
Feature activation+6.91e-03

Pos logit contributions
 especially+0.013
 rather+0.012
 like+0.012
 who+0.012
 three+0.011
Neg logit contributions
keeping-0.027
cakes-0.026
mons-0.022
ons-0.022
 juices-0.022
Token Mitt
Token ID16627
Feature activation+7.55e-02

Pos logit contributions
keeping+0.018
cakes+0.017
mons+0.015
ons+0.015
 pies+0.014
Neg logit contributions
 especially-0.008
 like-0.008
 really-0.008
 still-0.007
 rather-0.007
Tokenens
Token ID641
Feature activation+1.36e-02

Pos logit contributions
keeping+0.137
cakes+0.119
ons+0.095
mons+0.087
 Sale+0.087
Neg logit contributions
 like-0.167
 especially-0.154
 and-0.151
 because-0.150
 really-0.149
Token!"
Token ID2474
Feature activation-3.78e-03

Pos logit contributions
 especially+0.002
 like+0.002
 rather+0.002
 really+0.002
 who+0.002
Neg logit contributions
keeping-0.003
cakes-0.003
mons-0.002
ons-0.002
 pies-0.002
Token Ben
Token ID3932
Feature activation+1.78e-03

Pos logit contributions
 especially+0.006
 like+0.006
 rather+0.006
 really+0.005
 three+0.005
Neg logit contributions
keeping-0.008
cakes-0.008
mons-0.007
ons-0.006
 pies-0.006
Token says
Token ID1139
Feature activation-3.78e-03

Pos logit contributions
cakes+0.002
keeping+0.002
 pies+0.001
mons+0.001
 juices+0.001
Neg logit contributions
 especially-0.004
 rather-0.004
 quite-0.004
 particularly-0.004
 like-0.004
Token.
Token ID13
Feature activation-2.66e-02

Pos logit contributions
 especially+0.005
 rather+0.004
 like+0.004
 particularly+0.004
 quite+0.004
Neg logit contributions
keeping-0.009
cakes-0.009
mons-0.008
 pies-0.007
ons-0.007
Token "
Token ID366
Feature activation-2.01e-02

Pos logit contributions
 especially+0.036
 rather+0.035
 quite+0.033
 like+0.032
 particularly+0.031
Neg logit contributions
cakes-0.060
keeping-0.057
mons-0.052
 pies-0.048
ment-0.048
TokenP
Token ID47
Feature activation+8.62e-02

Pos logit contributions
 rather+0.035
 especially+0.034
 three+0.034
 particularly+0.034
 like+0.033
Neg logit contributions
keeping-0.034
cakes-0.032
mons-0.030
 pies-0.025
ons-0.024
Tokenill
Token ID359
Feature activation-3.04e-02

Pos logit contributions
keeping+0.174
cakes+0.158
mons+0.128
 pies+0.122
ons+0.122
Neg logit contributions
 like-0.151
 especially-0.146
 rather-0.146
 really-0.144
 three-0.143
Tokenows
Token ID1666
Feature activation-2.88e-02

Pos logit contributions
 especially+0.046
 rather+0.044
 three+0.043
 quite+0.043
 particularly+0.042
Neg logit contributions
keeping-0.062
cakes-0.061
ons-0.054
mons-0.053
 pies-0.051
Token are
Token ID389
Feature activation+1.30e-02

Pos logit contributions
 especially+0.045
 rather+0.043
 three+0.042
 quite+0.041
 particularly+0.041
Neg logit contributions
cakes-0.058
keeping-0.057
 pies-0.047
ons-0.047
mons-0.047
Token fun
Token ID1257
Feature activation+8.47e-03

Pos logit contributions
keeping+0.030
cakes+0.029
mons+0.026
 pies+0.025
ons+0.025
Neg logit contributions
 especially-0.016
 rather-0.016
 like-0.015
 three-0.015
 who-0.015
Token!"
Token ID2474
Feature activation-4.27e-03

Pos logit contributions
keeping+0.020
cakes+0.019
mons+0.016
ons+0.016
 pies+0.015
Neg logit contributions
 especially-0.012
 rather-0.011
 like-0.011
 quite-0.011
 really-0.011
Token should
Token ID815
Feature activation+6.11e-03

Pos logit contributions
 like+0.003
 especially+0.003
 rather+0.003
 really+0.003
 who+0.003
Neg logit contributions
keeping-0.005
cakes-0.005
mons-0.005
ons-0.004
 pies-0.004
Token not
Token ID407
Feature activation-3.93e-03

Pos logit contributions
keeping+0.017
cakes+0.016
mons+0.014
ons+0.014
ment+0.013
Neg logit contributions
 like-0.007
 especially-0.006
 still-0.006
 rather-0.006
 really-0.006
Token tease
Token ID34691
Feature activation+2.49e-03

Pos logit contributions
 like+0.005
 especially+0.005
 rather+0.005
 really+0.005
 who+0.005
Neg logit contributions
keeping-0.010
cakes-0.009
mons-0.008
ons-0.008
 pies-0.007
Token the
Token ID262
Feature activation-2.13e-02

Pos logit contributions
keeping+0.006
cakes+0.006
mons+0.005
ons+0.005
 pies+0.005
Neg logit contributions
 especially-0.003
 rather-0.003
 like-0.003
 three-0.003
 quite-0.003
Token ost
Token ID23619
Feature activation-2.29e-02

Pos logit contributions
 especially+0.036
 rather+0.035
 like+0.035
 really+0.033
 who+0.033
Neg logit contributions
keeping-0.042
cakes-0.040
mons-0.035
 pies-0.033
ons-0.032
Tokenric
Token ID1173
Feature activation-9.69e-02

Pos logit contributions
 especially+0.033
 like+0.032
 really+0.031
 rather+0.031
 quite+0.031
Neg logit contributions
keeping-0.051
cakes-0.049
mons-0.045
ons-0.042
 pies-0.041
Tokenhes
Token ID956
Feature activation-3.30e-02

Pos logit contributions
 especially+0.219
 like+0.218
 rather+0.214
 really+0.210
 who+0.209
Neg logit contributions
keeping-0.136
cakes-0.128
mons-0.100
ons-0.090
 pies-0.087
Token.
Token ID13
Feature activation-1.21e-02

Pos logit contributions
 especially+0.035
 rather+0.033
 like+0.032
 three+0.031
 really+0.031
Neg logit contributions
keeping-0.085
cakes-0.083
mons-0.073
ons-0.071
 pies-0.070
Token They
Token ID1119
Feature activation+7.19e-03

Pos logit contributions
 especially+0.018
 like+0.018
 rather+0.017
 really+0.017
 quite+0.017
Neg logit contributions
keeping-0.026
cakes-0.025
mons-0.022
ons-0.020
 juices-0.020
Token are
Token ID389
Feature activation+1.65e-02

Pos logit contributions
keeping+0.016
cakes+0.015
mons+0.013
ons+0.013
 pies+0.012
Neg logit contributions
 especially-0.010
 like-0.010
 rather-0.010
 three-0.010
 really-0.010
Token not
Token ID407
Feature activation+3.11e-02

Pos logit contributions
keeping+0.039
cakes+0.037
mons+0.034
 pies+0.032
ons+0.032
Neg logit contributions
 especially-0.019
 three-0.018
 rather-0.018
 like-0.018
 really-0.017
Token had
Token ID550
Feature activation-2.06e-02

Pos logit contributions
 especially+0.068
 like+0.067
 rather+0.066
 three+0.064
 really+0.063
Neg logit contributions
keeping-0.094
cakes-0.089
mons-0.077
ons-0.073
 pies-0.070
Token not
Token ID407
Feature activation+3.97e-03

Pos logit contributions
 like+0.027
 especially+0.027
 still+0.026
 who+0.026
 really+0.026
Neg logit contributions
cakes-0.050
keeping-0.050
mons-0.043
ons-0.041
 juices-0.040
Token seen
Token ID1775
Feature activation-3.17e-03

Pos logit contributions
cakes+0.009
keeping+0.009
mons+0.008
ons+0.008
 juices+0.007
Neg logit contributions
 like-0.006
 especially-0.006
 really-0.005
 who-0.005
 still-0.005
Token the
Token ID262
Feature activation-3.44e-02

Pos logit contributions
 especially+0.003
 like+0.003
 rather+0.003
 really+0.003
 who+0.003
Neg logit contributions
keeping-0.008
cakes-0.008
mons-0.007
ons-0.007
 pies-0.007
Token jelly
Token ID27644
Feature activation-5.22e-02

Pos logit contributions
 especially+0.051
 rather+0.050
 like+0.049
 who+0.048
 really+0.048
Neg logit contributions
keeping-0.073
cakes-0.071
mons-0.060
 pies-0.058
 juices-0.056
Tokenfish
Token ID11084
Feature activation-9.57e-02

Pos logit contributions
 rather+0.064
 especially+0.063
 three+0.061
 like+0.059
 quite+0.058
Neg logit contributions
cakes-0.125
keeping-0.122
mons-0.105
 pies-0.101
ons-0.100
Token.
Token ID13
Feature activation+2.06e-04

Pos logit contributions
 especially+0.115
 rather+0.115
 three+0.114
 particularly+0.106
 quite+0.105
Neg logit contributions
keeping-0.216
cakes-0.211
 pies-0.177
mons-0.176
ons-0.175
Token<|endoftext|>
Token ID50256
Feature activation+5.46e-03

Pos logit contributions
keeping+0.000
cakes+0.000
mons+0.000
 juices+0.000
ons+0.000
Neg logit contributions
 especially-0.000
 like-0.000
 rather-0.000
 really-0.000
 quite-0.000
TokenL
Token ID43
Feature activation-1.21e-02

Pos logit contributions
keeping+0.011
cakes+0.010
mons+0.009
ons+0.008
 pies+0.008
Neg logit contributions
 like-0.009
 especially-0.009
 rather-0.009
 really-0.009
 who-0.009
Tokenily
Token ID813
Feature activation-4.20e-03

Pos logit contributions
 especially+0.023
 like+0.022
 rather+0.022
 really+0.022
 who+0.021
Neg logit contributions
keeping-0.022
cakes-0.021
mons-0.018
ons-0.016
 pies-0.016
Token and
Token ID290
Feature activation-5.66e-02

Pos logit contributions
 like+0.004
 especially+0.004
 rather+0.004
 really+0.004
 who+0.004
Neg logit contributions
keeping-0.011
cakes-0.011
mons-0.010
ons-0.009
 pies-0.009
Token up
Token ID510
Feature activation-3.46e-02

Pos logit contributions
 especially+0.010
 rather+0.009
 three+0.009
 particularly+0.009
 who+0.008
Neg logit contributions
keeping-0.022
cakes-0.022
ment-0.019
 pies-0.018
ons-0.018
Token.
Token ID13
Feature activation+6.09e-03

Pos logit contributions
 who+0.033
 especially+0.032
 dot+0.030
 three+0.030
 rather+0.030
Neg logit contributions
keeping-0.075
cakes-0.072
ment-0.064
mons-0.063
ons-0.060
Token Nobody
Token ID15658
Feature activation+1.88e-02

Pos logit contributions
keeping+0.017
cakes+0.016
mons+0.014
ons+0.014
 pies+0.014
Neg logit contributions
 especially-0.006
 like-0.006
 rather-0.006
 really-0.006
 three-0.005
Token knew
Token ID2993
Feature activation-1.41e-02

Pos logit contributions
keeping+0.035
cakes+0.032
ons+0.029
 happened+0.027
mons+0.027
Neg logit contributions
 three-0.028
 especially-0.027
 rather-0.025
 particularly-0.024
 3-0.024
Token what
Token ID644
Feature activation+7.76e-03

Pos logit contributions
 especially+0.016
 like+0.016
 rather+0.015
 really+0.015
 who+0.014
Neg logit contributions
keeping-0.036
cakes-0.034
mons-0.030
ons-0.029
 pies-0.029
Token she
Token ID673
Feature activation-9.50e-02

Pos logit contributions
keeping+0.019
cakes+0.019
mons+0.017
ons+0.016
 pies+0.016
Neg logit contributions
 especially-0.009
 like-0.008
 rather-0.008
 still-0.008
 three-0.008
Token was
Token ID373
Feature activation-6.66e-02

Pos logit contributions
 especially+0.139
 rather+0.137
 like+0.135
 three+0.132
 who+0.130
Neg logit contributions
keeping-0.201
cakes-0.194
mons-0.167
ons-0.161
 juices-0.153
Token like
Token ID588
Feature activation-5.64e-02

Pos logit contributions
 especially+0.091
 rather+0.090
 three+0.085
 particularly+0.084
 still+0.082
Neg logit contributions
keeping-0.139
cakes-0.125
ons-0.110
mons-0.107
 pies-0.101
Token or
Token ID393
Feature activation+2.64e-02

Pos logit contributions
 especially+0.055
 rather+0.051
 still+0.050
 quite+0.049
 three+0.049
Neg logit contributions
keeping-0.142
cakes-0.136
ons-0.119
mons-0.116
 pies-0.111
Token what
Token ID644
Feature activation+1.76e-02

Pos logit contributions
keeping+0.066
cakes+0.064
mons+0.057
ons+0.054
 pies+0.053
Neg logit contributions
 especially-0.030
 like-0.029
 rather-0.028
 really-0.027
 three-0.027
Token she
Token ID673
Feature activation-7.82e-02

Pos logit contributions
keeping+0.042
cakes+0.038
mons+0.034
 juices+0.032
 pies+0.032
Neg logit contributions
 especially-0.025
 like-0.025
 really-0.025
 who-0.024
 rather-0.024
Token have
Token ID423
Feature activation-1.23e-02

Pos logit contributions
 like+0.009
 especially+0.009
 rather+0.009
 really+0.008
 three+0.008
Neg logit contributions
keeping-0.019
cakes-0.019
mons-0.017
ons-0.016
 pies-0.016
Token finally
Token ID3443
Feature activation-4.44e-02

Pos logit contributions
 like+0.013
 especially+0.012
 really+0.012
 who+0.012
 three+0.011
Neg logit contributions
cakes-0.034
keeping-0.034
mons-0.030
 pies-0.028
 juices-0.027
Token heard
Token ID2982
Feature activation-1.01e-02

Pos logit contributions
 especially+0.055
 like+0.055
 rather+0.053
 really+0.051
 who+0.051
Neg logit contributions
keeping-0.109
cakes-0.105
mons-0.093
ons-0.088
 pies-0.086
Token someone
Token ID2130
Feature activation+1.85e-02

Pos logit contributions
 especially+0.011
 rather+0.011
 like+0.010
 three+0.010
 really+0.010
Neg logit contributions
keeping-0.025
cakes-0.024
ons-0.021
mons-0.021
 pies-0.020
Token whist
Token ID32262
Feature activation-3.81e-02

Pos logit contributions
keeping+0.046
cakes+0.044
ons+0.039
 pies+0.038
 treats+0.037
Neg logit contributions
 three-0.018
 especially-0.018
 rather-0.017
 like-0.016
 particularly-0.015
Tokenling
Token ID1359
Feature activation-9.44e-02

Pos logit contributions
 rather+0.065
 especially+0.064
 like+0.062
 three+0.061
 quite+0.059
Neg logit contributions
keeping-0.072
cakes-0.069
ons-0.059
mons-0.057
ment-0.055
Token again
Token ID757
Feature activation-1.16e-02

Pos logit contributions
 especially+0.100
 rather+0.091
 three+0.091
 particularly+0.089
 who+0.086
Neg logit contributions
keeping-0.234
cakes-0.211
ment-0.184
ons-0.182
mons-0.180
Token.
Token ID13
Feature activation+4.30e-03

Pos logit contributions
 especially+0.014
 rather+0.013
 three+0.013
 like+0.012
 particularly+0.012
Neg logit contributions
keeping-0.028
cakes-0.024
mons-0.022
ment-0.021
ons-0.021
Token Danny
Token ID15105
Feature activation-1.04e-02

Pos logit contributions
cakes+0.009
keeping+0.009
 juices+0.007
mons+0.007
ment+0.007
Neg logit contributions
 especially-0.007
 rather-0.006
 like-0.006
 quite-0.006
 who-0.006
Token smiled
Token ID13541
Feature activation-4.05e-02

Pos logit contributions
 rather+0.019
 like+0.019
 especially+0.018
 quite+0.018
 particularly+0.017
Neg logit contributions
keeping-0.017
cakes-0.016
ons-0.013
 juices-0.013
mons-0.013
Token and
Token ID290
Feature activation-1.66e-02

Pos logit contributions
 usually+0.051
 especially+0.051
 particularly+0.047
 rather+0.045
 quite+0.042
Neg logit contributions
ment-0.071
cakes-0.067
keeping-0.063
ed-0.062
ushed-0.059
Token not
Token ID407
Feature activation+3.10e-03

Pos logit contributions
keeping+0.013
cakes+0.012
mons+0.011
 pies+0.011
ons+0.010
Neg logit contributions
 especially-0.009
 particularly-0.008
 rather-0.008
 three-0.008
 like-0.008
Token know
Token ID760
Feature activation-1.55e-02

Pos logit contributions
keeping+0.007
cakes+0.007
mons+0.006
ons+0.006
 pies+0.006
Neg logit contributions
 especially-0.004
 rather-0.004
 like-0.004
 really-0.004
 three-0.004
Token what
Token ID644
Feature activation+1.12e-02

Pos logit contributions
 especially+0.020
 rather+0.020
 like+0.020
 really+0.019
 quite+0.019
Neg logit contributions
keeping-0.036
cakes-0.035
mons-0.030
ons-0.030
 pies-0.029
Token a
Token ID257
Feature activation-1.94e-02

Pos logit contributions
keeping+0.026
cakes+0.025
mons+0.022
ons+0.022
 pies+0.021
Neg logit contributions
 especially-0.015
 like-0.014
 rather-0.014
 quite-0.013
 really-0.013
Token tr
Token ID491
Feature activation-1.62e-02

Pos logit contributions
 especially+0.030
 rather+0.030
 like+0.030
 who+0.028
 still+0.028
Neg logit contributions
keeping-0.038
cakes-0.037
mons-0.031
 pies-0.030
ons-0.030
Tokenuffle
Token ID18137
Feature activation-9.42e-02

Pos logit contributions
 like+0.030
 especially+0.029
 three+0.029
 who+0.029
 rather+0.028
Neg logit contributions
keeping-0.030
cakes-0.028
ons-0.025
mons-0.025
ade-0.021
Token was
Token ID373
Feature activation-6.27e-02

Pos logit contributions
 especially+0.140
 like+0.138
 rather+0.135
 really+0.131
 who+0.130
Neg logit contributions
keeping-0.206
cakes-0.198
mons-0.171
ons-0.162
 pies-0.158
Token.
Token ID13
Feature activation-6.16e-03

Pos logit contributions
 especially+0.064
 rather+0.060
 like+0.060
 really+0.057
 three+0.057
Neg logit contributions
keeping-0.167
cakes-0.160
mons-0.145
ons-0.138
 pies-0.136
Token They
Token ID1119
Feature activation+1.44e-02

Pos logit contributions
 especially+0.010
 like+0.010
 rather+0.010
 really+0.009
 who+0.009
Neg logit contributions
keeping-0.013
cakes-0.012
mons-0.011
ons-0.010
 pies-0.010
Token did
Token ID750
Feature activation+2.31e-02

Pos logit contributions
keeping+0.029
cakes+0.028
mons+0.025
ons+0.023
 juices+0.023
Neg logit contributions
 especially-0.022
 rather-0.022
 like-0.021
 three-0.020
 who-0.020
Token not
Token ID407
Feature activation+1.79e-02

Pos logit contributions
keeping+0.053
cakes+0.051
mons+0.044
 pies+0.043
ons+0.043
Neg logit contributions
 especially-0.031
 rather-0.029
 like-0.029
 three-0.029
 really-0.028
Token him
Token ID683
Feature activation-3.44e-02

Pos logit contributions
 especially+0.009
 like+0.008
 rather+0.008
 really+0.008
 who+0.008
Neg logit contributions
keeping-0.023
cakes-0.023
mons-0.020
ons-0.019
 pies-0.019
Token to
Token ID284
Feature activation-1.97e-02

Pos logit contributions
 like+0.030
 especially+0.029
 rather+0.028
 really+0.027
 who+0.027
Neg logit contributions
keeping-0.098
cakes-0.095
mons-0.086
ons-0.081
 pies-0.080
Token refuse
Token ID11148
Feature activation-6.64e-03

Pos logit contributions
 especially+0.022
 like+0.022
 rather+0.021
 really+0.020
 who+0.020
Neg logit contributions
keeping-0.051
cakes-0.049
mons-0.044
ons-0.042
 pies-0.041
Token the
Token ID262
Feature activation-3.00e-02

Pos logit contributions
 especially+0.006
 like+0.006
 rather+0.006
 really+0.006
 who+0.006
Neg logit contributions
keeping-0.018
cakes-0.018
mons-0.016
ons-0.015
 pies-0.015
Token jelly
Token ID27644
Feature activation-4.79e-02

Pos logit contributions
 especially+0.046
 like+0.046
 rather+0.045
 really+0.044
 who+0.043
Neg logit contributions
keeping-0.064
cakes-0.061
mons-0.053
ons-0.050
 pies-0.048
Tokenfish
Token ID11084
Feature activation-9.35e-02

Pos logit contributions
 especially+0.052
 like+0.050
 rather+0.050
 three+0.048
 really+0.047
Neg logit contributions
keeping-0.123
cakes-0.120
mons-0.105
ons-0.101
 pies-0.099
Token.
Token ID13
Feature activation-3.75e-03

Pos logit contributions
 especially+0.094
 rather+0.088
 like+0.088
 three+0.086
 really+0.082
Neg logit contributions
keeping-0.244
cakes-0.235
mons-0.208
ons-0.202
 pies-0.198
Token So
Token ID1406
Feature activation+2.58e-02

Pos logit contributions
 especially+0.005
 like+0.005
 rather+0.004
 really+0.004
 three+0.004
Neg logit contributions
keeping-0.009
cakes-0.009
mons-0.008
ons-0.007
 pies-0.007
Token they
Token ID484
Feature activation-3.10e-03

Pos logit contributions
keeping+0.069
cakes+0.067
mons+0.060
ons+0.057
 pies+0.056
Neg logit contributions
 like-0.026
 especially-0.026
 rather-0.025
 really-0.025
 who-0.024
Token spent
Token ID3377
Feature activation-1.87e-02

Pos logit contributions
 especially+0.004
 like+0.004
 really+0.004
 rather+0.004
 who+0.004
Neg logit contributions
keeping-0.007
cakes-0.007
mons-0.006
ons-0.006
 juices-0.006
Token the
Token ID262
Feature activation+2.14e-02

Pos logit contributions
 especially+0.017
 like+0.017
 really+0.016
 who+0.016
 rather+0.015
Neg logit contributions
keeping-0.052
cakes-0.051
mons-0.046
ons-0.043
 juices-0.042
Token she
Token ID673
Feature activation-4.50e-02

Pos logit contributions
keeping+0.023
cakes+0.022
mons+0.020
ons+0.018
 juices+0.018
Neg logit contributions
 especially-0.010
 like-0.010
 who-0.009
 rather-0.009
 really-0.009
Token was
Token ID373
Feature activation-2.64e-02

Pos logit contributions
 like+0.062
 really+0.062
 especially+0.062
 still+0.061
 who+0.059
Neg logit contributions
keeping-0.109
cakes-0.104
mons-0.093
 pies-0.086
 juices-0.085
Token sorry
Token ID7926
Feature activation+2.00e-02

Pos logit contributions
 especially+0.029
 rather+0.028
 like+0.026
 three+0.026
 particularly+0.025
Neg logit contributions
keeping-0.066
cakes-0.063
mons-0.056
ons-0.054
 pies-0.054
Token for
Token ID329
Feature activation-1.82e-02

Pos logit contributions
cakes+0.043
keeping+0.040
 pies+0.038
ons+0.036
ment+0.036
Neg logit contributions
 three-0.023
 particularly-0.023
 especially-0.023
 rather-0.022
 quite-0.021
Token what
Token ID644
Feature activation+7.78e-03

Pos logit contributions
 especially+0.023
 like+0.023
 rather+0.022
 really+0.022
 who+0.021
Neg logit contributions
keeping-0.044
cakes-0.042
mons-0.037
ons-0.035
 pies-0.035
Token she
Token ID673
Feature activation-9.31e-02

Pos logit contributions
keeping+0.019
cakes+0.018
mons+0.016
 juices+0.015
ons+0.015
Neg logit contributions
 like-0.011
 especially-0.011
 really-0.010
 who-0.010
 very-0.010
Token did
Token ID750
Feature activation-4.10e-02

Pos logit contributions
 like+0.133
 especially+0.126
 still+0.126
 really+0.125
 who+0.120
Neg logit contributions
keeping-0.227
cakes-0.218
mons-0.190
 pies-0.180
ons-0.179
Token.
Token ID13
Feature activation-8.19e-03

Pos logit contributions
 particularly+0.046
 three+0.046
 especially+0.046
 rather+0.046
 quite+0.042
Neg logit contributions
keeping-0.099
cakes-0.092
mons-0.081
ons-0.080
 pies-0.078
Token He
Token ID679
Feature activation+3.13e-03

Pos logit contributions
 especially+0.007
 like+0.007
 rather+0.007
 really+0.007
 three+0.006
Neg logit contributions
keeping-0.023
cakes-0.022
mons-0.020
ons-0.019
 pies-0.019
Token wished
Token ID16555
Feature activation+2.11e-02

Pos logit contributions
keeping+0.007
cakes+0.006
mons+0.005
ons+0.005
 pies+0.005
Neg logit contributions
 especially-0.005
 like-0.005
 really-0.005
 who-0.005
 still-0.005
Token she
Token ID673
Feature activation-6.16e-02

Pos logit contributions
keeping+0.049
cakes+0.047
mons+0.041
ons+0.039
 pies+0.038
Neg logit contributions
 especially-0.029
 like-0.029
 rather-0.028
 really-0.027
 who-0.027
Token her
Token ID607
Feature activation+2.54e-02

Pos logit contributions
cakes+0.006
keeping+0.006
mons+0.005
ons+0.005
 juices+0.005
Neg logit contributions
 like-0.003
 especially-0.003
 very-0.003
 really-0.003
 still-0.003
Token owners
Token ID4393
Feature activation+1.84e-02

Pos logit contributions
keeping+0.055
cakes+0.052
mons+0.046
ons+0.043
 pies+0.042
Neg logit contributions
 especially-0.038
 like-0.038
 rather-0.037
 who-0.035
 really-0.035
Token never
Token ID1239
Feature activation+9.99e-03

Pos logit contributions
keeping+0.037
cakes+0.035
mons+0.031
ons+0.030
 pies+0.028
Neg logit contributions
 especially-0.029
 rather-0.028
 like-0.027
 three-0.027
 quite-0.027
Token knew
Token ID2993
Feature activation-4.61e-03

Pos logit contributions
cakes+0.022
keeping+0.022
mons+0.019
ons+0.018
 juices+0.018
Neg logit contributions
 like-0.016
 especially-0.015
 really-0.015
 who-0.015
 very-0.014
Token what
Token ID644
Feature activation+9.45e-03

Pos logit contributions
 like+0.005
 especially+0.005
 who+0.005
 rather+0.005
 really+0.005
Neg logit contributions
keeping-0.012
cakes-0.012
mons-0.010
ons-0.010
 juices-0.010
Token she
Token ID673
Feature activation-9.27e-02

Pos logit contributions
keeping+0.028
cakes+0.027
mons+0.024
 juices+0.023
ons+0.023
Neg logit contributions
 especially-0.007
 like-0.007
 really-0.007
 who-0.007
 rather-0.006
Token was
Token ID373
Feature activation-6.66e-02

Pos logit contributions
 like+0.128
 especially+0.127
 really+0.122
 still+0.119
 who+0.118
Neg logit contributions
keeping-0.220
cakes-0.212
mons-0.184
ons-0.174
 pies-0.174
Token going
Token ID1016
Feature activation-3.94e-02

Pos logit contributions
 especially+0.097
 rather+0.097
 three+0.093
 particularly+0.090
 still+0.088
Neg logit contributions
keeping-0.144
cakes-0.129
mons-0.114
ons-0.110
 pies-0.110
Token to
Token ID284
Feature activation-2.61e-02

Pos logit contributions
 especially+0.040
 rather+0.038
 like+0.037
 three+0.036
 quite+0.036
Neg logit contributions
keeping-0.104
cakes-0.100
mons-0.090
ons-0.085
 pies-0.084
Token do
Token ID466
Feature activation-5.80e-02

Pos logit contributions
 especially+0.036
 like+0.036
 rather+0.034
 really+0.034
 who+0.033
Neg logit contributions
keeping-0.060
cakes-0.058
mons-0.051
ons-0.048
 juices-0.047
Token next
Token ID1306
Feature activation-1.94e-02

Pos logit contributions
 especially+0.057
 rather+0.056
 like+0.055
 three+0.053
 really+0.052
Neg logit contributions
keeping-0.153
cakes-0.148
mons-0.131
ons-0.127
 pies-0.123
Token forget
Token ID6044
Feature activation-3.89e-02

Pos logit contributions
 especially+0.009
 rather+0.009
 three+0.008
 particularly+0.008
 who+0.008
Neg logit contributions
keeping-0.015
cakes-0.014
mons-0.012
 pies-0.012
 juices-0.011
Token about
Token ID546
Feature activation-2.83e-02

Pos logit contributions
 especially+0.039
 like+0.038
 rather+0.038
 three+0.037
 quite+0.036
Neg logit contributions
keeping-0.101
cakes-0.099
mons-0.086
ons-0.085
 pies-0.084
Token his
Token ID465
Feature activation-1.15e-02

Pos logit contributions
 especially+0.029
 like+0.029
 who+0.028
 rather+0.027
 really+0.027
Neg logit contributions
cakes-0.075
keeping-0.075
mons-0.066
ons-0.062
 juices-0.061
Token pe
Token ID613
Feature activation-6.66e-02

Pos logit contributions
 especially+0.020
 like+0.020
 rather+0.019
 who+0.018
 quite+0.018
Neg logit contributions
keeping-0.021
cakes-0.019
mons-0.018
ons-0.017
 pies-0.017
Tokenb
Token ID65
Feature activation-6.25e-02

Pos logit contributions
 especially+0.118
 like+0.115
 rather+0.114
 three+0.109
 really+0.107
Neg logit contributions
keeping-0.122
cakes-0.115
ons-0.107
mons-0.104
ment-0.097
Tokenble
Token ID903
Feature activation-9.27e-02

Pos logit contributions
 especially+0.128
 rather+0.127
 like+0.121
 particularly+0.118
 quite+0.114
Neg logit contributions
keeping-0.101
cakes-0.097
ons-0.089
mons-0.086
ment-0.078
Token necklace
Token ID38987
Feature activation-3.24e-02

Pos logit contributions
 especially+0.093
 rather+0.089
 three+0.087
 like+0.084
 quite+0.081
Neg logit contributions
keeping-0.235
cakes-0.226
mons-0.201
ons-0.200
 pies-0.194
Token.
Token ID13
Feature activation-3.67e-04

Pos logit contributions
 especially+0.028
 rather+0.027
 like+0.026
 three+0.026
 quite+0.024
Neg logit contributions
keeping-0.090
cakes-0.087
mons-0.077
ons-0.075
 pies-0.073
Token He
Token ID679
Feature activation+7.85e-03

Pos logit contributions
 especially+0.000
 really+0.000
 like+0.000
 rather+0.000
 still+0.000
Neg logit contributions
keeping-0.001
cakes-0.001
mons-0.001
ons-0.001
 pies-0.001
Token made
Token ID925
Feature activation-1.05e-04

Pos logit contributions
keeping+0.016
cakes+0.015
mons+0.013
ons+0.012
 juices+0.012
Neg logit contributions
 like-0.012
 especially-0.012
 rather-0.012
 three-0.012
 who-0.011
Token sure
Token ID1654
Feature activation-1.27e-02

Pos logit contributions
 like+0.000
 especially+0.000
 really+0.000
 very+0.000
 rather+0.000
Neg logit contributions
keeping-0.000
cakes-0.000
mons-0.000
ons-0.000
 juices-0.000
Token Lily
Token ID20037
Feature activation-2.42e-02

Pos logit contributions
 three+0.095
 like+0.095
 who+0.091
 especially+0.090
 still+0.088
Neg logit contributions
keeping-0.135
cakes-0.131
ons-0.112
mons-0.111
glers-0.100
Token knew
Token ID2993
Feature activation-1.30e-02

Pos logit contributions
 like+0.039
 especially+0.038
 who+0.037
 really+0.037
 still+0.037
Neg logit contributions
keeping-0.053
cakes-0.050
mons-0.043
ons-0.042
 pies-0.039
Token Mom
Token ID11254
Feature activation-2.94e-02

Pos logit contributions
 especially+0.020
 like+0.020
 rather+0.020
 who+0.020
 really+0.019
Neg logit contributions
keeping-0.028
cakes-0.026
mons-0.023
ons-0.022
 juices-0.021
Token meant
Token ID4001
Feature activation-5.38e-02

Pos logit contributions
 like+0.039
 especially+0.039
 who+0.037
 really+0.037
 rather+0.037
Neg logit contributions
keeping-0.070
cakes-0.066
mons-0.059
ons-0.055
 juices-0.054
Token what
Token ID644
Feature activation+2.75e-03

Pos logit contributions
 especially+0.059
 like+0.058
 rather+0.056
 really+0.054
 who+0.053
Neg logit contributions
keeping-0.139
cakes-0.134
mons-0.119
ons-0.114
 pies-0.112
Token she
Token ID673
Feature activation-9.24e-02

Pos logit contributions
keeping+0.006
cakes+0.005
mons+0.005
ons+0.004
 pies+0.004
Neg logit contributions
 especially-0.005
 like-0.005
 rather-0.004
 really-0.004
 who-0.004
Token said
Token ID531
Feature activation-4.17e-02

Pos logit contributions
 especially+0.137
 like+0.135
 rather+0.132
 really+0.128
 three+0.127
Neg logit contributions
keeping-0.202
cakes-0.194
mons-0.168
ons-0.159
 pies-0.155
Token.
Token ID13
Feature activation-1.28e-02

Pos logit contributions
 especially+0.049
 rather+0.048
 three+0.048
 particularly+0.048
Tw+0.047
Neg logit contributions
keeping-0.098
cakes-0.093
ons-0.080
mons-0.079
 pies-0.076
Token They
Token ID1119
Feature activation+3.01e-03

Pos logit contributions
 especially+0.021
 like+0.020
 rather+0.020
 really+0.019
 three+0.019
Neg logit contributions
keeping-0.027
cakes-0.025
mons-0.022
ons-0.020
 pies-0.020
Token did
Token ID750
Feature activation+2.08e-02

Pos logit contributions
keeping+0.007
cakes+0.006
mons+0.006
ons+0.005
 pies+0.005
Neg logit contributions
 especially-0.005
 like-0.005
 rather-0.004
 really-0.004
 still-0.004
Token not
Token ID407
Feature activation+1.08e-02

Pos logit contributions
keeping+0.049
cakes+0.048
mons+0.042
ons+0.040
 pies+0.040
Neg logit contributions
 especially-0.026
 like-0.025
 rather-0.025
 three-0.025
 who-0.024
Token town
Token ID3240
Feature activation+4.40e-02

Pos logit contributions
 especially+0.035
 rather+0.034
 particularly+0.031
 like+0.031
 still+0.031
Neg logit contributions
keeping-0.057
cakes-0.051
 pies-0.045
ons-0.045
mons-0.043
Token,
Token ID11
Feature activation+1.77e-02

Pos logit contributions
keeping+0.114
cakes+0.105
mons+0.091
 pies+0.090
ons+0.089
Neg logit contributions
 especially-0.044
 rather-0.043
 like-0.041
 particularly-0.039
 quite-0.039
Token with
Token ID351
Feature activation+4.54e-03

Pos logit contributions
keeping+0.039
cakes+0.037
mons+0.032
 pies+0.032
ons+0.031
Neg logit contributions
 rather-0.022
 especially-0.021
 like-0.020
 really-0.020
 quite-0.020
Token Mr
Token ID1770
Feature activation+5.32e-03

Pos logit contributions
keeping+0.011
cakes+0.011
mons+0.010
ons+0.009
 pies+0.009
Neg logit contributions
 especially-0.005
 like-0.005
 rather-0.005
 really-0.005
 who-0.005
Token sm
Token ID895
Feature activation-4.09e-02

Pos logit contributions
keeping+0.015
cakes+0.015
mons+0.013
ons+0.013
 pies+0.012
Neg logit contributions
 like-0.005
 especially-0.005
 rather-0.004
 who-0.004
 really-0.004
Tokenith
Token ID342
Feature activation-9.20e-02

Pos logit contributions
 like+0.071
 especially+0.070
 rather+0.069
 three+0.067
 really+0.067
Neg logit contributions
keeping-0.079
cakes-0.076
mons-0.064
ons-0.061
 pies-0.059
Token blinking
Token ID43196
Feature activation-6.82e-02

Pos logit contributions
 especially+0.118
 rather+0.113
 like+0.110
 really+0.106
 quite+0.104
Neg logit contributions
keeping-0.216
cakes-0.210
mons-0.189
ons-0.180
 pies-0.174
Token the
Token ID262
Feature activation-3.31e-03

Pos logit contributions
 especially+0.097
 like+0.095
 rather+0.093
 really+0.090
 who+0.089
Neg logit contributions
keeping-0.153
cakes-0.147
mons-0.128
ons-0.122
 pies-0.119
Token car
Token ID1097
Feature activation+7.59e-03

Pos logit contributions
 especially+0.005
 like+0.004
 rather+0.004
 who+0.004
 really+0.004
Neg logit contributions
keeping-0.008
cakes-0.007
mons-0.006
ons-0.006
 pies-0.006
Token lights
Token ID7588
Feature activation+4.48e-03

Pos logit contributions
keeping+0.019
cakes+0.019
mons+0.016
ons+0.016
 pies+0.015
Neg logit contributions
 especially-0.008
 rather-0.008
 like-0.007
 three-0.007
 quite-0.007
Token to
Token ID284
Feature activation+7.47e-03

Pos logit contributions
keeping+0.011
cakes+0.011
ons+0.010
mons+0.010
 pies+0.009
Neg logit contributions
 especially-0.004
 rather-0.004
 three-0.004
 quite-0.004
 really-0.004
Token the
Token ID262
Feature activation+2.41e-03

Pos logit contributions
 especially+0.028
 three+0.026
 rather+0.025
 particularly+0.025
 quite+0.024
Neg logit contributions
keeping-0.060
cakes-0.056
ons-0.050
mons-0.049
 pies-0.049
Token fastest
Token ID14162
Feature activation-2.40e-02

Pos logit contributions
keeping+0.005
cakes+0.005
mons+0.004
 pies+0.004
 juices+0.004
Neg logit contributions
 especially-0.003
 rather-0.003
 who-0.003
 like-0.003
 really-0.003
Token,"
Token ID553
Feature activation+3.30e-03

Pos logit contributions
 especially+0.036
 rather+0.035
 like+0.034
 still+0.034
 really+0.033
Neg logit contributions
keeping-0.051
cakes-0.047
mons-0.042
ons-0.041
 pies-0.039
Token replied
Token ID8712
Feature activation+6.51e-03

Pos logit contributions
keeping+0.007
cakes+0.006
mons+0.005
 juices+0.005
 pies+0.005
Neg logit contributions
 who-0.006
 really-0.006
 like-0.006
 very-0.005
 especially-0.005
Token Fl
Token ID1610
Feature activation+2.55e-02

Pos logit contributions
keeping+0.014
cakes+0.013
mons+0.012
ons+0.011
 pies+0.011
Neg logit contributions
 like-0.010
 especially-0.010
 rather-0.010
 really-0.009
 three-0.009
Tokenutter
Token ID10381
Feature activation-9.10e-02

Pos logit contributions
keeping+0.046
cakes+0.044
mons+0.035
ons+0.034
 pies+0.033
Neg logit contributions
 like-0.048
 especially-0.048
 rather-0.047
 really-0.046
 who-0.046
Token.
Token ID13
Feature activation-1.07e-02

Pos logit contributions
 especially+0.131
 rather+0.125
 particularly+0.113
 quite+0.113
 still+0.113
Neg logit contributions
keeping-0.188
cakes-0.183
ment-0.162
ons-0.162
mons-0.155
Token
Token ID220
Feature activation-1.17e-02

Pos logit contributions
 like+0.013
 especially+0.013
 who+0.012
 rather+0.012
 really+0.012
Neg logit contributions
keeping-0.026
cakes-0.025
mons-0.022
ons-0.021
 pies-0.021
Token
Token ID198
Feature activation-4.08e-03

Pos logit contributions
 especially+0.011
 like+0.011
 rather+0.011
 really+0.010
 three+0.010
Neg logit contributions
keeping-0.032
cakes-0.031
mons-0.027
ons-0.026
 pies-0.026
Token
Token ID198
Feature activation-1.14e-02

Pos logit contributions
 especially+0.004
 like+0.004
 rather+0.003
 really+0.003
 who+0.003
Neg logit contributions
keeping-0.011
cakes-0.011
mons-0.010
ons-0.009
 pies-0.009
TokenTweet
Token ID47845
Feature activation+1.73e-02

Pos logit contributions
 especially+0.018
 like+0.018
 rather+0.018
 really+0.017
 three+0.017
Neg logit contributions
keeping-0.024
cakes-0.023
mons-0.019
ons-0.018
 pies-0.018
Token W
Token ID370
Feature activation+1.44e-02

Pos logit contributions
 especially+0.011
 rather+0.010
 like+0.010
 really+0.010
 quite+0.010
Neg logit contributions
keeping-0.017
cakes-0.016
mons-0.014
ons-0.013
 pies-0.013
Tokeniggle
Token ID24082
Feature activation-4.77e-02

Pos logit contributions
keeping+0.028
cakes+0.027
mons+0.023
ons+0.021
 juices+0.021
Neg logit contributions
 especially-0.025
 like-0.024
 rather-0.024
 three-0.023
 who-0.023
Token!"
Token ID2474
Feature activation-3.11e-03

Pos logit contributions
 especially+0.073
 like+0.072
 rather+0.071
 really+0.069
 who+0.068
Neg logit contributions
keeping-0.102
cakes-0.098
mons-0.084
ons-0.080
 pies-0.078
Token cried
Token ID16896
Feature activation+2.19e-02

Pos logit contributions
 especially+0.006
 like+0.006
 rather+0.006
 really+0.005
 who+0.005
Neg logit contributions
keeping-0.006
cakes-0.005
mons-0.004
ons-0.004
 pies-0.004
Token Fl
Token ID1610
Feature activation+2.50e-02

Pos logit contributions
keeping+0.041
cakes+0.039
mons+0.036
ment+0.033
 juices+0.032
Neg logit contributions
 especially-0.035
 rather-0.034
 who-0.031
 like-0.031
 really-0.030
Tokenutter
Token ID10381
Feature activation-9.09e-02

Pos logit contributions
keeping+0.047
cakes+0.045
mons+0.038
ons+0.035
 pies+0.034
Neg logit contributions
 especially-0.045
 like-0.044
 rather-0.044
 really-0.043
 who-0.042
Token.
Token ID13
Feature activation-1.29e-02

Pos logit contributions
 especially+0.112
 rather+0.107
 particularly+0.093
 like+0.092
 still+0.092
Neg logit contributions
keeping-0.205
cakes-0.194
ment-0.175
ons-0.167
mons-0.165
Token
Token ID198
Feature activation-5.33e-03

Pos logit contributions
 especially+0.018
 rather+0.018
 like+0.017
 quite+0.016
 really+0.015
Neg logit contributions
keeping-0.028
cakes-0.027
mons-0.024
 juices-0.022
ment-0.021
Token
Token ID198
Feature activation-1.29e-02

Pos logit contributions
 especially+0.006
 like+0.006
 rather+0.006
 quite+0.005
 three+0.005
Neg logit contributions
keeping-0.013
cakes-0.013
mons-0.012
ons-0.011
 pies-0.011
TokenBut
Token ID1537
Feature activation-3.82e-03

Pos logit contributions
 like+0.023
 rather+0.022
 especially+0.022
 quite+0.021
 three+0.021
Neg logit contributions
keeping-0.023
cakes-0.023
mons-0.020
 pies-0.018
 juices-0.018
Token it
Token ID340
Feature activation+4.72e-03

Pos logit contributions
 rather+0.006
 especially+0.006
 like+0.005
 quite+0.005
 who+0.005
Neg logit contributions
keeping-0.007
cakes-0.007
mons-0.006
ment-0.006
 juices-0.005
Token none
Token ID4844
Feature activation+2.39e-02

Pos logit contributions
 especially+0.029
 like+0.028
 rather+0.028
 really+0.027
 who+0.027
Neg logit contributions
keeping-0.032
cakes-0.031
mons-0.027
ons-0.025
 pies-0.024
Token of
Token ID286
Feature activation-3.08e-02

Pos logit contributions
keeping+0.063
cakes+0.061
mons+0.054
ons+0.051
 pies+0.051
Neg logit contributions
 especially-0.026
 like-0.025
 rather-0.024
 really-0.024
 who-0.023
Token them
Token ID606
Feature activation-2.36e-02

Pos logit contributions
 especially+0.041
 rather+0.040
 like+0.040
 really+0.038
 still+0.037
Neg logit contributions
keeping-0.071
cakes-0.067
mons-0.059
ons-0.059
 pies-0.056
Token knew
Token ID2993
Feature activation-1.35e-02

Pos logit contributions
 especially+0.036
 like+0.036
 rather+0.035
 really+0.034
 who+0.034
Neg logit contributions
keeping-0.051
cakes-0.049
mons-0.042
ons-0.040
 pies-0.039
Token what
Token ID644
Feature activation+1.21e-02

Pos logit contributions
 like+0.014
 especially+0.014
 rather+0.013
 who+0.013
 really+0.013
Neg logit contributions
keeping-0.036
cakes-0.035
mons-0.031
ons-0.029
 pies-0.029
Token she
Token ID673
Feature activation-9.09e-02

Pos logit contributions
keeping+0.035
cakes+0.034
mons+0.030
ons+0.029
 pies+0.029
Neg logit contributions
 especially-0.010
 like-0.010
 rather-0.009
 really-0.009
 who-0.009
Token could
Token ID714
Feature activation-4.19e-02

Pos logit contributions
 especially+0.131
 like+0.130
 rather+0.125
 really+0.124
 who+0.121
Neg logit contributions
keeping-0.206
cakes-0.198
mons-0.172
ons-0.162
 pies-0.160
Token do
Token ID466
Feature activation-6.34e-02

Pos logit contributions
 three+0.053
 especially+0.053
 particularly+0.053
 forward+0.053
 rather+0.052
Neg logit contributions
keeping-0.093
cakes-0.084
mons-0.074
ons-0.072
 pies-0.070
Token with
Token ID351
Feature activation-2.26e-02

Pos logit contributions
 three+0.071
 rather+0.071
 especially+0.070
 particularly+0.068
 quite+0.064
Neg logit contributions
keeping-0.152
cakes-0.142
ons-0.125
mons-0.123
 pies-0.116
Token the
Token ID262
Feature activation-1.22e-02

Pos logit contributions
 especially+0.028
 like+0.028
 rather+0.027
 really+0.026
 three+0.026
Neg logit contributions
keeping-0.054
cakes-0.052
mons-0.046
ons-0.045
 pies-0.043
Token coin
Token ID10752
Feature activation-2.72e-02

Pos logit contributions
 especially+0.023
 rather+0.022
 like+0.021
 who+0.020
 particularly+0.020
Neg logit contributions
keeping-0.022
cakes-0.020
mons-0.017
 juices-0.017
ons-0.016
Token dad
Token ID9955
Feature activation-1.56e-03

Pos logit contributions
 who+0.086
 like+0.085
 especially+0.085
 three+0.083
 really+0.082
Neg logit contributions
keeping-0.107
cakes-0.102
mons-0.088
 juices-0.082
ons-0.081
Token and
Token ID290
Feature activation-5.89e-02

Pos logit contributions
 especially+0.001
 rather+0.001
 quite+0.001
 particularly+0.001
 like+0.001
Neg logit contributions
keeping-0.004
cakes-0.004
ons-0.004
mons-0.003
 pies-0.003
Token told
Token ID1297
Feature activation+1.41e-02

Pos logit contributions
 who+0.095
 like+0.093
 especially+0.092
 three+0.090
 really+0.089
Neg logit contributions
keeping-0.127
cakes-0.124
mons-0.108
ons-0.100
 juices-0.100
Token them
Token ID606
Feature activation-3.35e-02

Pos logit contributions
keeping+0.036
cakes+0.034
mons+0.031
ons+0.028
 juices+0.028
Neg logit contributions
 especially-0.017
 like-0.017
 rather-0.016
 who-0.016
 really-0.016
Token what
Token ID644
Feature activation+9.40e-03

Pos logit contributions
 like+0.031
 especially+0.030
 who+0.029
 really+0.029
 rather+0.028
Neg logit contributions
keeping-0.095
cakes-0.089
mons-0.083
ons-0.078
 juices-0.076
Token she
Token ID673
Feature activation-9.04e-02

Pos logit contributions
keeping+0.021
cakes+0.020
mons+0.017
 juices+0.017
ment+0.016
Neg logit contributions
 especially-0.015
 like-0.015
 who-0.015
 really-0.014
 very-0.014
Token did
Token ID750
Feature activation-3.35e-02

Pos logit contributions
 like+0.125
 especially+0.119
 really+0.119
 still+0.117
 who+0.114
Neg logit contributions
keeping-0.222
cakes-0.212
mons-0.186
 pies-0.177
ment-0.176
Token.
Token ID13
Feature activation-1.12e-02

Pos logit contributions
 especially+0.035
 rather+0.034
 three+0.033
 particularly+0.032
 like+0.031
Neg logit contributions
keeping-0.087
cakes-0.082
mons-0.073
ons-0.071
 pies-0.069
Token
Token ID198
Feature activation-7.02e-03

Pos logit contributions
 especially+0.014
 like+0.014
 rather+0.013
 really+0.013
 who+0.013
Neg logit contributions
keeping-0.027
cakes-0.026
mons-0.023
ons-0.022
 pies-0.022
Token
Token ID198
Feature activation-1.28e-02

Pos logit contributions
 especially+0.006
 like+0.006
 rather+0.006
 really+0.006
 who+0.005
Neg logit contributions
keeping-0.020
cakes-0.019
mons-0.017
ons-0.016
 juices-0.016
TokenLisa
Token ID44203
Feature activation+3.03e-03

Pos logit contributions
 especially+0.026
 like+0.026
 rather+0.026
 really+0.025
 three+0.025
Neg logit contributions
keeping-0.021
cakes-0.020
mons-0.016
ons-0.015
 pies-0.015
Token mom
Token ID1995
Feature activation+3.15e-03

Pos logit contributions
 especially+0.000
 rather+0.000
 like+0.000
 quite+0.000
 really+0.000
Neg logit contributions
keeping-0.000
cakes-0.000
mons-0.000
 pies-0.000
ons-0.000
Token and
Token ID290
Feature activation-6.43e-02

Pos logit contributions
keeping+0.010
cakes+0.009
mons+0.008
ons+0.008
 pies+0.008
Neg logit contributions
 especially-0.002
 like-0.002
 rather-0.002
 really-0.002
 three-0.002
Token to
Token ID284
Feature activation+4.22e-03

Pos logit contributions
 like+0.059
 especially+0.059
 rather+0.056
 who+0.055
 really+0.054
Neg logit contributions
keeping-0.178
cakes-0.173
mons-0.154
ons-0.149
 juices-0.145
Token the
Token ID262
Feature activation-2.22e-02

Pos logit contributions
keeping+0.010
cakes+0.010
mons+0.009
 pies+0.008
ons+0.008
Neg logit contributions
 especially-0.005
 rather-0.005
 like-0.005
 really-0.005
 quite-0.005
Token ost
Token ID23619
Feature activation-1.49e-02

Pos logit contributions
 especially+0.035
 like+0.034
 rather+0.034
 really+0.033
 who+0.033
Neg logit contributions
keeping-0.047
cakes-0.045
mons-0.038
ons-0.036
 pies-0.035
Tokenric
Token ID1173
Feature activation-9.03e-02

Pos logit contributions
 like+0.021
 especially+0.021
 rather+0.021
 really+0.020
 quite+0.020
Neg logit contributions
keeping-0.034
cakes-0.032
mons-0.029
ons-0.027
 pies-0.027
Tokenhes
Token ID956
Feature activation-2.51e-02

Pos logit contributions
 especially+0.203
 like+0.202
 rather+0.198
 who+0.196
 really+0.196
Neg logit contributions
keeping-0.128
cakes-0.119
mons-0.094
ons-0.083
 juices-0.082
Token.
Token ID13
Feature activation-8.11e-03

Pos logit contributions
 like+0.023
 especially+0.023
 really+0.021
 rather+0.021
 who+0.021
Neg logit contributions
keeping-0.070
cakes-0.067
mons-0.060
ons-0.058
 juices-0.057
Token They
Token ID1119
Feature activation+8.57e-03

Pos logit contributions
 like+0.012
 especially+0.012
 rather+0.012
 who+0.011
 really+0.011
Neg logit contributions
keeping-0.018
cakes-0.017
mons-0.015
ons-0.015
 pies-0.014
Token learned
Token ID4499
Feature activation+1.78e-02

Pos logit contributions
keeping+0.018
cakes+0.016
mons+0.014
ons+0.013
 pies+0.013
Neg logit contributions
 especially-0.015
 like-0.015
 really-0.014
 still-0.014
 who-0.014
Token their
Token ID511
Feature activation+1.53e-02

Pos logit contributions
keeping+0.047
cakes+0.045
mons+0.040
ons+0.039
 juices+0.038
Neg logit contributions
 like-0.019
 especially-0.019
 rather-0.019
 who-0.018
 really-0.018
Token She
Token ID1375
Feature activation+4.27e-03

Pos logit contributions
keeping+0.001
cakes+0.001
mons+0.001
ons+0.001
 pies+0.001
Neg logit contributions
 especially-0.001
 like-0.001
 rather-0.001
 really-0.001
 who-0.001
Token told
Token ID1297
Feature activation+2.44e-02

Pos logit contributions
keeping+0.009
cakes+0.009
mons+0.008
ons+0.007
 pies+0.007
Neg logit contributions
 especially-0.007
 like-0.007
 rather-0.006
 really-0.006
 who-0.006
Token her
Token ID607
Feature activation-1.78e-03

Pos logit contributions
keeping+0.065
cakes+0.062
mons+0.056
ons+0.051
 juices+0.051
Neg logit contributions
 like-0.028
 especially-0.027
 really-0.026
 three-0.026
 still-0.026
Token mom
Token ID1995
Feature activation-3.60e-03

Pos logit contributions
 like+0.002
 especially+0.002
 who+0.002
 still+0.002
 really+0.002
Neg logit contributions
keeping-0.005
cakes-0.004
mons-0.004
ons-0.004
 juices-0.004
Token what
Token ID644
Feature activation+5.50e-03

Pos logit contributions
 like+0.004
 who+0.003
 especially+0.003
 very+0.003
 really+0.003
Neg logit contributions
keeping-0.010
cakes-0.010
mons-0.009
 juices-0.008
ons-0.008
Token she
Token ID673
Feature activation-8.99e-02

Pos logit contributions
keeping+0.008
cakes+0.007
mons+0.005
 juices+0.005
ment+0.005
Neg logit contributions
 especially-0.014
 like-0.013
 really-0.013
 who-0.013
 still-0.013
Token was
Token ID373
Feature activation-6.71e-02

Pos logit contributions
 like+0.128
 especially+0.125
 really+0.123
 still+0.122
 who+0.117
Neg logit contributions
keeping-0.216
cakes-0.205
mons-0.180
 pies-0.169
 juices-0.168
Token doing
Token ID1804
Feature activation-6.41e-02

Pos logit contributions
 especially+0.097
 rather+0.095
 like+0.091
 three+0.090
 really+0.089
Neg logit contributions
keeping-0.148
cakes-0.138
mons-0.123
ons-0.116
 pies-0.115
Token with
Token ID351
Feature activation-2.54e-02

Pos logit contributions
 especially+0.070
 rather+0.068
 three+0.065
 particularly+0.065
 quite+0.064
Neg logit contributions
keeping-0.163
cakes-0.153
mons-0.134
ons-0.132
 pies-0.129
Token the
Token ID262
Feature activation-1.75e-02

Pos logit contributions
 like+0.027
 especially+0.027
 rather+0.025
 really+0.025
 still+0.025
Neg logit contributions
keeping-0.068
cakes-0.065
mons-0.057
 juices-0.054
ons-0.053
Token book
Token ID1492
Feature activation-8.81e-03

Pos logit contributions
 especially+0.026
 rather+0.025
 like+0.024
 who+0.023
 really+0.023
Neg logit contributions
cakes-0.037
keeping-0.037
mons-0.033
 pies-0.032
ons-0.031
Token she
Token ID673
Feature activation-3.92e-02

Pos logit contributions
 like+0.011
 especially+0.011
 rather+0.010
 who+0.010
 really+0.010
Neg logit contributions
keeping-0.037
cakes-0.035
mons-0.032
ons-0.030
 juices-0.030
Token was
Token ID373
Feature activation-3.11e-02

Pos logit contributions
 especially+0.053
 like+0.052
 rather+0.050
 really+0.050
 who+0.049
Neg logit contributions
keeping-0.095
cakes-0.090
mons-0.079
ons-0.075
 pies-0.074
Token sorry
Token ID7926
Feature activation+2.01e-02

Pos logit contributions
 especially+0.036
 rather+0.035
 like+0.033
 three+0.032
 who+0.032
Neg logit contributions
keeping-0.073
cakes-0.071
mons-0.062
ons-0.061
 pies-0.061
Token for
Token ID329
Feature activation-2.10e-02

Pos logit contributions
cakes+0.046
keeping+0.042
 pies+0.040
 cookies+0.038
ment+0.038
Neg logit contributions
 particularly-0.021
 three-0.021
 especially-0.020
 quite-0.020
 rather-0.020
Token what
Token ID644
Feature activation+1.03e-02

Pos logit contributions
 especially+0.026
 like+0.026
 rather+0.026
 really+0.024
 three+0.024
Neg logit contributions
keeping-0.050
cakes-0.047
mons-0.041
 pies-0.040
ons-0.040
Token she
Token ID673
Feature activation-8.98e-02

Pos logit contributions
keeping+0.025
cakes+0.023
mons+0.020
 juices+0.019
 pies+0.019
Neg logit contributions
 especially-0.015
 like-0.014
 who-0.014
 really-0.014
 still-0.013
Token did
Token ID750
Feature activation-3.89e-02

Pos logit contributions
 like+0.129
 especially+0.126
 still+0.121
 really+0.121
 rather+0.118
Neg logit contributions
keeping-0.211
cakes-0.203
mons-0.176
ons-0.168
 pies-0.167
Token.
Token ID13
Feature activation-2.30e-03

Pos logit contributions
 three+0.043
 especially+0.043
 rather+0.043
 particularly+0.043
 quite+0.040
Neg logit contributions
keeping-0.095
cakes-0.088
mons-0.077
ons-0.076
 pies-0.074
Token He
Token ID679
Feature activation+9.62e-03

Pos logit contributions
 especially+0.003
 like+0.003
 rather+0.003
 quite+0.003
 really+0.003
Neg logit contributions
keeping-0.005
cakes-0.005
mons-0.004
ons-0.004
ment-0.004
Token told
Token ID1297
Feature activation+1.36e-02

Pos logit contributions
keeping+0.020
cakes+0.018
mons+0.016
ons+0.015
 pies+0.014
Neg logit contributions
 especially-0.017
 like-0.016
 rather-0.016
 really-0.016
 who-0.016
Token her
Token ID607
Feature activation-1.03e-02

Pos logit contributions
keeping+0.033
cakes+0.032
mons+0.028
ons+0.026
 pies+0.026
Neg logit contributions
 especially-0.017
 like-0.017
 rather-0.017
 really-0.016
 who-0.016
Token wouldn
Token ID3636
Feature activation-2.11e-02

Pos logit contributions
 especially+0.022
 like+0.022
 rather+0.021
 really+0.021
 who+0.020
Neg logit contributions
keeping-0.032
cakes-0.030
mons-0.026
ons-0.025
 pies-0.024
Token't
Token ID470
Feature activation-2.54e-02

Pos logit contributions
 especially+0.031
 like+0.031
 rather+0.031
 who+0.031
 three+0.030
Neg logit contributions
keeping-0.043
cakes-0.042
ons-0.035
mons-0.034
 pies-0.032
Token give
Token ID1577
Feature activation-8.07e-03

Pos logit contributions
 especially+0.034
 rather+0.034
 three+0.033
 particularly+0.032
 like+0.032
Neg logit contributions
keeping-0.056
cakes-0.053
mons-0.044
 pies-0.044
ons-0.043
Token her
Token ID607
Feature activation-2.66e-02

Pos logit contributions
 like+0.009
 especially+0.009
 really+0.009
 rather+0.008
 quite+0.008
Neg logit contributions
keeping-0.021
cakes-0.020
mons-0.018
ons-0.017
 juices-0.017
Token what
Token ID644
Feature activation-4.39e-03

Pos logit contributions
 especially+0.039
 like+0.039
 rather+0.038
 really+0.037
 who+0.037
Neg logit contributions
keeping-0.058
cakes-0.056
mons-0.049
ons-0.046
 pies-0.045
Token she
Token ID673
Feature activation-8.94e-02

Pos logit contributions
 especially+0.007
 like+0.006
 really+0.006
 rather+0.006
 who+0.006
Neg logit contributions
keeping-0.010
cakes-0.009
mons-0.008
 juices-0.008
 pies-0.008
Token wanted
Token ID2227
Feature activation-5.24e-02

Pos logit contributions
 especially+0.127
 like+0.126
 really+0.122
 rather+0.120
 still+0.119
Neg logit contributions
keeping-0.211
cakes-0.201
mons-0.175
ons-0.166
 pies-0.165
Token.
Token ID13
Feature activation+4.99e-03

Pos logit contributions
 especially+0.054
 rather+0.052
 like+0.051
 three+0.050
 who+0.048
Neg logit contributions
keeping-0.137
cakes-0.131
mons-0.116
ons-0.113
 pies-0.109
Token No
Token ID1400
Feature activation+3.79e-02

Pos logit contributions
keeping+0.012
cakes+0.012
mons+0.010
ons+0.010
 pies+0.010
Neg logit contributions
 especially-0.006
 like-0.006
 rather-0.006
 really-0.006
 three-0.006
Token matter
Token ID2300
Feature activation+2.11e-02

Pos logit contributions
keeping+0.090
cakes+0.086
mons+0.075
ons+0.071
 pies+0.070
Neg logit contributions
 especially-0.051
 like-0.051
 rather-0.049
 really-0.048
 three-0.048
Token how
Token ID703
Feature activation-6.32e-03

Pos logit contributions
keeping+0.053
cakes+0.052
mons+0.045
ons+0.044
 pies+0.043
Neg logit contributions
 especially-0.024
 like-0.024
 rather-0.023
 really-0.022
 three-0.022
Token his
Token ID465
Feature activation+6.71e-03

Pos logit contributions
keeping+0.009
cakes+0.008
ons+0.007
 pies+0.007
mons+0.007
Neg logit contributions
 especially-0.004
 rather-0.004
 like-0.004
 really-0.003
 still-0.003
Token neighbour
Token ID12250
Feature activation-1.67e-02

Pos logit contributions
keeping+0.014
cakes+0.013
mons+0.011
 pies+0.011
ons+0.011
Neg logit contributions
 especially-0.011
 like-0.011
 rather-0.010
 quite-0.010
 really-0.010
Token,
Token ID11
Feature activation-3.48e-02

Pos logit contributions
 especially+0.017
 rather+0.016
 quite+0.016
 like+0.016
 really+0.015
Neg logit contributions
keeping-0.042
cakes-0.042
mons-0.037
ons-0.036
 pies-0.036
Token Mrs
Token ID9074
Feature activation-2.24e-02

Pos logit contributions
 rather+0.047
 especially+0.047
 like+0.044
 particularly+0.043
 quite+0.042
Neg logit contributions
keeping-0.071
cakes-0.068
mons-0.058
 Sue-0.058
 pies-0.057
Token Clar
Token ID15420
Feature activation-5.04e-02

Pos logit contributions
 especially+0.022
 like+0.022
 rather+0.021
 who+0.020
 really+0.020
Neg logit contributions
keeping-0.060
cakes-0.058
mons-0.052
ons-0.050
 pies-0.049
Tokenise
Token ID786
Feature activation-8.93e-02

Pos logit contributions
 like+0.095
 rather+0.095
 especially+0.094
 really+0.093
 quite+0.090
Neg logit contributions
keeping-0.085
cakes-0.081
mons-0.068
ons-0.067
 pies-0.063
Token,
Token ID11
Feature activation-5.93e-02

Pos logit contributions
 especially+0.102
 rather+0.097
 like+0.094
 really+0.094
 quite+0.093
Neg logit contributions
keeping-0.219
cakes-0.212
mons-0.189
ons-0.182
 pies-0.177
Token for
Token ID329
Feature activation-2.41e-02

Pos logit contributions
 rather+0.070
 especially+0.069
 astronaut+0.064
 particularly+0.063
 quite+0.063
Neg logit contributions
keeping-0.109
 Sally-0.096
cakes-0.090
 Sue-0.090
 Betty-0.090
Token help
Token ID1037
Feature activation+3.67e-03

Pos logit contributions
 especially+0.029
 like+0.029
 rather+0.028
 really+0.027
 three+0.027
Neg logit contributions
keeping-0.060
cakes-0.058
mons-0.051
ons-0.049
 juices-0.048
Token.
Token ID13
Feature activation-8.92e-03

Pos logit contributions
keeping+0.009
cakes+0.008
ons+0.008
mons+0.007
 pies+0.007
Neg logit contributions
 especially-0.004
 rather-0.004
 three-0.004
 particularly-0.004
 like-0.004
Token
Token ID198
Feature activation+3.56e-03

Pos logit contributions
 especially+0.012
 like+0.012
 rather+0.012
 quite+0.011
 who+0.011
Neg logit contributions
keeping-0.020
cakes-0.020
mons-0.017
 pies-0.016
ons-0.016
TokenIt
Token ID1026
Feature activation+2.04e-02

Pos logit contributions
 at+0.022
 still+0.022
 very+0.021
 rather+0.021
 like+0.021
Neg logit contributions
keeping-0.024
cakes-0.023
mons-0.020
Kids-0.017
Looks-0.017
Token's
Token ID338
Feature activation-1.83e-03

Pos logit contributions
keeping+0.053
cakes+0.052
mons+0.046
ons+0.045
 pies+0.043
Neg logit contributions
 especially-0.021
 like-0.020
 rather-0.020
 really-0.019
 three-0.019
Token very
Token ID845
Feature activation+2.58e-02

Pos logit contributions
 especially+0.002
 rather+0.002
 like+0.001
 who+0.001
 three+0.001
Neg logit contributions
keeping-0.005
cakes-0.005
mons-0.004
ons-0.004
 juices-0.004
Token popular
Token ID2968
Feature activation-2.43e-02

Pos logit contributions
keeping+0.064
cakes+0.062
mons+0.055
ons+0.053
 pies+0.051
Neg logit contributions
 especially-0.030
 like-0.030
 rather-0.029
 really-0.028
 who-0.028
Token.
Token ID13
Feature activation-2.45e-04

Pos logit contributions
 especially+0.024
 rather+0.023
 like+0.022
 really+0.022
 quite+0.021
Neg logit contributions
keeping-0.065
cakes-0.063
mons-0.055
ons-0.053
 pies-0.052
Token Many
Token ID4650
Feature activation+5.88e-02

Pos logit contributions
 especially+0.000
 quite+0.000
 really+0.000
 rather+0.000
 like+0.000
Neg logit contributions
keeping-0.001
cakes-0.000
mons-0.000
 juices-0.000
ment-0.000
Token people
Token ID661
Feature activation+1.94e-02

Pos logit contributions
keeping+0.114
cakes+0.109
mons+0.092
ons+0.089
 pies+0.085
Neg logit contributions
 especially-0.100
 like-0.099
 rather-0.097
 really-0.095
 three-0.094
Token want
Token ID765
Feature activation-1.04e-02

Pos logit contributions
keeping+0.044
cakes+0.042
ons+0.037
mons+0.036
 pies+0.034
Neg logit contributions
 especially-0.025
 rather-0.024
 three-0.023
 quite-0.023
 really-0.022
Token to
Token ID284
Feature activation-7.45e-04

Pos logit contributions
 especially+0.011
 rather+0.010
 like+0.010
 still+0.010
 quite+0.010
Neg logit contributions
keeping-0.027
cakes-0.026
mons-0.023
ons-0.022
 juices-0.022
Token see
Token ID766
Feature activation+8.65e-03

Pos logit contributions
 especially+0.001
 like+0.001
 rather+0.001
 really+0.001
 who+0.001
Neg logit contributions
keeping-0.002
cakes-0.002
mons-0.001
ons-0.001
 pies-0.001
Token it
Token ID340
Feature activation-1.73e-02

Pos logit contributions
keeping+0.020
cakes+0.019
ons+0.017
mons+0.016
 pies+0.016
Neg logit contributions
 especially-0.011
 rather-0.010
 still-0.010
 quite-0.010
 like-0.010
Token the
Token ID262
Feature activation+7.41e-03

Pos logit contributions
keeping+0.015
cakes+0.014
mons+0.013
ons+0.012
 juices+0.012
Neg logit contributions
 especially-0.004
 like-0.004
 three-0.003
 really-0.003
 very-0.003
Token rail
Token ID6787
Feature activation+1.14e-02

Pos logit contributions
cakes+0.015
keeping+0.015
mons+0.013
ons+0.012
 pies+0.012
Neg logit contributions
 rather-0.011
 especially-0.011
 like-0.011
 who-0.011
 quite-0.011
Token,
Token ID11
Feature activation+2.00e-02

Pos logit contributions
keeping+0.031
cakes+0.030
mons+0.027
ons+0.026
 pies+0.025
Neg logit contributions
 especially-0.011
 like-0.011
 rather-0.011
 really-0.010
 who-0.010
Token Lily
Token ID20037
Feature activation+2.50e-02

Pos logit contributions
keeping+0.048
cakes+0.041
mons+0.035
ons+0.035
ment+0.034
Neg logit contributions
 especially-0.032
 like-0.032
 three-0.032
 very-0.031
 really-0.031
Token went
Token ID1816
Feature activation+2.46e-02

Pos logit contributions
keeping+0.058
cakes+0.056
mons+0.049
ons+0.048
 pies+0.046
Neg logit contributions
 rather-0.032
 like-0.031
 especially-0.031
 quite-0.029
 three-0.029
Token home
Token ID1363
Feature activation+6.14e-02

Pos logit contributions
keeping+0.068
cakes+0.066
mons+0.058
ons+0.056
 pies+0.056
Neg logit contributions
 especially-0.022
 rather-0.021
 like-0.020
 who-0.019
 three-0.019
Token and
Token ID290
Feature activation+5.99e-03

Pos logit contributions
keeping+0.170
cakes+0.164
mons+0.143
 pies+0.138
ons+0.138
Neg logit contributions
 especially-0.053
 rather-0.051
 like-0.049
 three-0.045
 particularly-0.045
Token asked
Token ID1965
Feature activation+3.40e-02

Pos logit contributions
keeping+0.015
cakes+0.014
mons+0.013
ment+0.012
 juices+0.012
Neg logit contributions
 especially-0.009
 really-0.008
 still-0.008
 very-0.008
 who-0.008
Token her
Token ID607
Feature activation+7.97e-03

Pos logit contributions
keeping+0.079
cakes+0.076
mons+0.066
ons+0.064
 pies+0.062
Neg logit contributions
 especially-0.046
 like-0.045
 rather-0.044
 really-0.043
 who-0.042
Token mom
Token ID1995
Feature activation+1.45e-02

Pos logit contributions
cakes+0.016
keeping+0.015
mons+0.013
 pies+0.012
ons+0.012
Neg logit contributions
 rather-0.013
 especially-0.013
 like-0.013
 quite-0.012
 really-0.012
Tokenmy
Token ID1820
Feature activation-2.52e-02

Pos logit contributions
cakes+0.023
keeping+0.021
ons+0.018
 pies+0.018
mons+0.017
Neg logit contributions
 rather-0.027
 quite-0.027
 especially-0.026
 particularly-0.025
 three-0.025
Token.
Token ID13
Feature activation+1.53e-02

Pos logit contributions
 especially+0.016
 like+0.016
 rather+0.016
 really+0.015
 three+0.015
Neg logit contributions
keeping-0.045
cakes-0.044
mons-0.039
ons-0.038
 pies-0.037
Token
Token ID220
Feature activation+2.06e-03

Pos logit contributions
keeping+0.045
cakes+0.042
mons+0.038
ons+0.037
 pies+0.037
Neg logit contributions
 like-0.012
 especially-0.012
 three-0.012
 really-0.012
 who-0.011
Token
Token ID198
Feature activation+1.46e-02

Pos logit contributions
keeping+0.006
cakes+0.006
mons+0.005
 juices+0.005
 pies+0.005
Neg logit contributions
 like-0.002
 especially-0.002
 really-0.001
 who-0.001
 rather-0.001
Token
Token ID198
Feature activation+3.01e-03

Pos logit contributions
keeping+0.042
cakes+0.041
mons+0.037
 juices+0.035
 pies+0.035
Neg logit contributions
 especially-0.012
 like-0.011
 really-0.011
 who-0.011
 rather-0.011
Token"
Token ID1
Feature activation-4.58e-03

Pos logit contributions
keeping+0.008
cakes+0.007
mons+0.007
ons+0.006
 pies+0.006
Neg logit contributions
 especially-0.003
 like-0.003
 really-0.003
 rather-0.003
 who-0.003
TokenLook
Token ID8567
Feature activation+6.12e-02

Pos logit contributions
 especially+0.007
 like+0.007
 rather+0.007
 really+0.007
 who+0.007
Neg logit contributions
keeping-0.010
cakes-0.009
mons-0.008
ons-0.008
 pies-0.007
Token!"
Token ID2474
Feature activation+7.82e-03

Pos logit contributions
keeping+0.158
cakes+0.156
mons+0.139
ons+0.135
 pies+0.131
Neg logit contributions
 especially-0.063
 rather-0.061
 like-0.058
 quite-0.055
 really-0.054
Token yelled
Token ID22760
Feature activation+4.74e-02

Pos logit contributions
keeping+0.015
cakes+0.014
mons+0.012
 pies+0.011
ons+0.011
Neg logit contributions
 like-0.014
 especially-0.014
 really-0.014
 who-0.014
 three-0.014
Token Jill
Token ID22772
Feature activation-2.16e-02

Pos logit contributions
keeping+0.089
cakes+0.086
mons+0.072
ons+0.068
 pies+0.066
Neg logit contributions
 like-0.086
 especially-0.084
 three-0.083
 really-0.082
 rather-0.082
Token.
Token ID13
Feature activation-9.27e-03

Pos logit contributions
 especially+0.027
 rather+0.026
 like+0.025
 particularly+0.024
 quite+0.023
Neg logit contributions
keeping-0.053
cakes-0.048
ons-0.043
mons-0.042
ment-0.041
Token
Token ID220
Feature activation-7.76e-03

Pos logit contributions
 especially+0.010
 like+0.010
 rather+0.010
 really+0.009
 quite+0.009
Neg logit contributions
keeping-0.024
cakes-0.023
mons-0.020
ons-0.019
 pies-0.019
Token his
Token ID465
Feature activation+1.50e-02

Pos logit contributions
 especially+0.012
 like+0.011
 rather+0.011
 really+0.010
 who+0.010
Neg logit contributions
keeping-0.039
cakes-0.038
mons-0.034
ons-0.033
 pies-0.032
Token friend
Token ID1545
Feature activation+7.32e-03

Pos logit contributions
keeping+0.030
cakes+0.028
mons+0.026
ons+0.025
 pies+0.024
Neg logit contributions
 like-0.024
 especially-0.024
 rather-0.024
 quite-0.022
 who-0.021
Token Billy
Token ID15890
Feature activation-1.00e-02

Pos logit contributions
keeping+0.017
cakes+0.016
mons+0.014
ons+0.013
 pies+0.013
Neg logit contributions
 especially-0.010
 like-0.010
 rather-0.010
 who-0.010
 really-0.010
Token.
Token ID13
Feature activation-4.66e-03

Pos logit contributions
 rather+0.013
 quite+0.012
 especially+0.012
 particularly+0.011
 three+0.011
Neg logit contributions
keeping-0.022
cakes-0.020
ons-0.019
 pies-0.018
mons-0.018
Token "
Token ID366
Feature activation-8.16e-03

Pos logit contributions
 especially+0.008
 like+0.008
 rather+0.008
 really+0.008
 three+0.007
Neg logit contributions
keeping-0.009
cakes-0.009
mons-0.008
ons-0.007
 pies-0.007
TokenLook
Token ID8567
Feature activation+6.00e-02

Pos logit contributions
 like+0.014
 especially+0.014
 rather+0.014
 three+0.013
 really+0.013
Neg logit contributions
keeping-0.015
cakes-0.014
mons-0.012
ons-0.011
 juices-0.011
Token what
Token ID644
Feature activation+2.51e-02

Pos logit contributions
keeping+0.154
cakes+0.154
mons+0.136
ons+0.135
 pies+0.130
Neg logit contributions
 especially-0.060
 rather-0.059
 like-0.057
 quite-0.053
 three-0.052
Token I
Token ID314
Feature activation-3.33e-02

Pos logit contributions
keeping+0.065
cakes+0.063
mons+0.055
ons+0.051
 pies+0.051
Neg logit contributions
 especially-0.029
 like-0.029
 rather-0.027
 really-0.027
 three-0.027
Token made
Token ID925
Feature activation-3.14e-02

Pos logit contributions
 especially+0.051
 like+0.050
 rather+0.049
 really+0.048
 who+0.047
Neg logit contributions
keeping-0.071
cakes-0.069
mons-0.059
ons-0.056
 pies-0.055
Token with
Token ID351
Feature activation-9.61e-03

Pos logit contributions
 especially+0.040
 like+0.039
 rather+0.039
 really+0.037
 who+0.037
Neg logit contributions
keeping-0.075
cakes-0.072
mons-0.063
ons-0.061
 pies-0.059
Token my
Token ID616
Feature activation+1.09e-02

Pos logit contributions
 especially+0.013
 rather+0.013
 like+0.013
 quite+0.012
 really+0.012
Neg logit contributions
keeping-0.022
cakes-0.021
ons-0.018
mons-0.018
 pies-0.017
Token tables
Token ID8893
Feature activation+1.58e-02

Pos logit contributions
keeping+0.070
cakes+0.067
mons+0.058
ons+0.054
 pies+0.051
Neg logit contributions
 like-0.063
 especially-0.063
 rather-0.061
 three-0.060
 really-0.059
Token and
Token ID290
Feature activation+1.92e-02

Pos logit contributions
cakes+0.040
keeping+0.039
 pies+0.034
ons+0.033
glers+0.033
Neg logit contributions
 especially-0.016
 rather-0.015
 who-0.015
 particularly-0.014
 three-0.013
Token chairs
Token ID18791
Feature activation+2.22e-02

Pos logit contributions
keeping+0.045
cakes+0.042
mons+0.039
live+0.035
ons+0.035
Neg logit contributions
 especially-0.028
 like-0.027
 three-0.026
 really-0.026
 even-0.026
Token.
Token ID13
Feature activation+9.15e-03

Pos logit contributions
cakes+0.057
keeping+0.057
 pies+0.048
ons+0.048
mons+0.047
Neg logit contributions
 especially-0.023
 rather-0.021
 who-0.020
 three-0.020
 quite-0.019
Token All
Token ID1439
Feature activation+1.26e-02

Pos logit contributions
keeping+0.023
cakes+0.022
mons+0.020
ons+0.019
 pies+0.018
Neg logit contributions
 especially-0.011
 like-0.010
 rather-0.010
 really-0.010
 who-0.010
Token his
Token ID465
Feature activation+6.67e-02

Pos logit contributions
cakes+0.029
keeping+0.029
ons+0.026
mons+0.024
 pies+0.023
Neg logit contributions
 rather-0.016
 especially-0.015
 like-0.015
 three-0.014
 quite-0.014
Token friends
Token ID2460
Feature activation+1.34e-02

Pos logit contributions
keeping+0.130
cakes+0.125
mons+0.107
ons+0.102
 pies+0.098
Neg logit contributions
 especially-0.112
 like-0.111
 rather-0.109
 who-0.105
 really-0.105
Token and
Token ID290
Feature activation-1.67e-02

Pos logit contributions
keeping+0.025
cakes+0.024
ons+0.021
mons+0.019
 pies+0.019
Neg logit contributions
 especially-0.022
 rather-0.022
 like-0.021
 three-0.021
 quite-0.021
Token family
Token ID1641
Feature activation+2.22e-02

Pos logit contributions
 especially+0.022
 like+0.022
 three+0.021
 who+0.021
 really+0.021
Neg logit contributions
keeping-0.040
cakes-0.040
mons-0.035
ons-0.032
 pies-0.031
Token said
Token ID531
Feature activation+9.78e-03

Pos logit contributions
keeping+0.042
cakes+0.042
ons+0.035
mons+0.033
 pies+0.032
Neg logit contributions
 especially-0.037
 rather-0.037
 like-0.036
 three-0.035
 quite-0.034
Token how
Token ID703
Feature activation-6.47e-03

Pos logit contributions
keeping+0.023
cakes+0.022
mons+0.019
ons+0.019
 pies+0.018
Neg logit contributions
 especially-0.013
 like-0.012
 rather-0.012
 really-0.012
 who-0.012
Token noise
Token ID7838
Feature activation+2.51e-02

Pos logit contributions
keeping+0.014
cakes+0.013
 pies+0.010
ons+0.010
mons+0.010
Neg logit contributions
 especially-0.019
 rather-0.019
 like-0.018
 particularly-0.018
 still-0.018
Token.
Token ID13
Feature activation+2.03e-02

Pos logit contributions
cakes+0.050
keeping+0.049
 pies+0.042
ons+0.041
ment+0.039
Neg logit contributions
 rather-0.035
 especially-0.034
 particularly-0.033
 three-0.032
 still-0.031
Token The
Token ID383
Feature activation+1.88e-02

Pos logit contributions
keeping+0.061
cakes+0.056
ons+0.051
mons+0.050
 pies+0.049
Neg logit contributions
 like-0.016
 three-0.016
 really-0.016
 who-0.015
 especially-0.015
Token box
Token ID3091
Feature activation+1.06e-02

Pos logit contributions
keeping+0.046
cakes+0.041
mons+0.039
ment+0.036
live+0.035
Neg logit contributions
 very-0.027
 three-0.026
 really-0.025
 like-0.025
 still-0.025
Token closed
Token ID4838
Feature activation+8.22e-03

Pos logit contributions
keeping+0.022
cakes+0.021
mons+0.018
ons+0.017
 pies+0.016
Neg logit contributions
 especially-0.017
 like-0.017
 rather-0.017
 really-0.016
 who-0.016
Token.
Token ID13
Feature activation+1.81e-02

Pos logit contributions
keeping+0.023
cakes+0.022
mons+0.020
ons+0.019
 pies+0.019
Neg logit contributions
 especially-0.008
 like-0.007
 rather-0.007
 really-0.007
 who-0.007
Token They
Token ID1119
Feature activation+1.83e-02

Pos logit contributions
keeping+0.055
cakes+0.051
ons+0.046
glers+0.045
mons+0.045
Neg logit contributions
 three-0.014
 like-0.014
 really-0.013
 who-0.013
 still-0.012
Token were
Token ID547
Feature activation+1.93e-02

Pos logit contributions
keeping+0.036
cakes+0.035
mons+0.030
ons+0.028
 pies+0.027
Neg logit contributions
 especially-0.030
 rather-0.030
 like-0.029
 three-0.028
 really-0.028
Token trapped
Token ID13640
Feature activation+1.93e-03

Pos logit contributions
keeping+0.045
cakes+0.041
 pies+0.036
mons+0.036
 sandwiches+0.035
Neg logit contributions
 especially-0.024
 rather-0.023
 particularly-0.022
 like-0.021
 three-0.021
Token.
Token ID13
Feature activation+1.98e-02

Pos logit contributions
keeping+0.005
cakes+0.005
mons+0.004
ons+0.004
 pies+0.004
Neg logit contributions
 especially-0.002
 rather-0.002
 three-0.002
 like-0.002
 particularly-0.002
Token
Token ID198
Feature activation+1.19e-02

Pos logit contributions
keeping+0.060
cakes+0.057
ons+0.051
mons+0.050
 pies+0.049
Neg logit contributions
 like-0.014
 who-0.014
 especially-0.014
 really-0.014
 three-0.014
Token carefully
Token ID7773
Feature activation-1.02e-02

Pos logit contributions
cakes+0.025
keeping+0.024
ons+0.021
mons+0.021
 pies+0.020
Neg logit contributions
 rather-0.020
 like-0.020
 especially-0.019
 quite-0.018
 who-0.018
Token rubbed
Token ID31862
Feature activation+1.14e-02

Pos logit contributions
 rather+0.016
 especially+0.016
 like+0.015
 quite+0.015
 who+0.015
Neg logit contributions
cakes-0.020
keeping-0.020
ons-0.017
mons-0.017
 pies-0.016
Token the
Token ID262
Feature activation+1.88e-02

Pos logit contributions
keeping+0.025
cakes+0.025
ons+0.022
mons+0.022
ment+0.021
Neg logit contributions
 rather-0.015
 especially-0.015
 who-0.013
 like-0.013
 quite-0.013
Token cloth
Token ID16270
Feature activation+5.49e-03

Pos logit contributions
keeping+0.038
cakes+0.036
mons+0.031
ons+0.029
 pies+0.028
Neg logit contributions
 especially-0.031
 like-0.031
 rather-0.030
 really-0.030
 three-0.029
Token onto
Token ID4291
Feature activation-1.28e-02

Pos logit contributions
cakes+0.013
keeping+0.012
ons+0.011
ade+0.011
mons+0.011
Neg logit contributions
 especially-0.006
 rather-0.006
 who-0.006
 three-0.005
 astronaut-0.005
Token the
Token ID262
Feature activation+1.08e-02

Pos logit contributions
 especially+0.016
 rather+0.016
 like+0.015
 who+0.015
 particularly+0.014
Neg logit contributions
keeping-0.030
cakes-0.028
ons-0.025
mons-0.025
 pies-0.024
Token ash
Token ID12530
Feature activation+1.03e-02

Pos logit contributions
keeping+0.020
cakes+0.019
mons+0.016
ons+0.015
 pies+0.015
Neg logit contributions
 especially-0.020
 rather-0.019
 like-0.019
 really-0.018
 quite-0.018
Tokentr
Token ID2213
Feature activation+6.38e-02

Pos logit contributions
cakes+0.015
keeping+0.015
mons+0.011
 pies+0.010
ons+0.010
Neg logit contributions
 especially-0.024
 rather-0.023
 like-0.022
 who-0.022
 quite-0.021
Tokenay
Token ID323
Feature activation+2.92e-03

Pos logit contributions
cakes+0.114
keeping+0.111
mons+0.091
ons+0.089
 pies+0.082
Neg logit contributions
 especially-0.122
 like-0.118
 rather-0.117
 who-0.115
 three-0.113
Token.
Token ID13
Feature activation+1.53e-02

Pos logit contributions
keeping+0.007
cakes+0.007
mons+0.006
ons+0.006
 pies+0.006
Neg logit contributions
 especially-0.003
 like-0.003
 rather-0.003
 really-0.003
 three-0.003
Token He
Token ID679
Feature activation+1.46e-02

Pos logit contributions
keeping+0.039
cakes+0.038
mons+0.034
ons+0.032
 pies+0.031
Neg logit contributions
 especially-0.017
 like-0.017
 rather-0.016
 really-0.016
 who-0.015
Token time
Token ID640
Feature activation-9.40e-03

Pos logit contributions
 especially+0.026
 rather+0.025
 like+0.025
 three+0.025
 quite+0.024
Neg logit contributions
keeping-0.037
cakes-0.037
mons-0.032
ons-0.032
 juices-0.030
Token,
Token ID11
Feature activation-2.70e-02

Pos logit contributions
 especially+0.012
 rather+0.012
 like+0.011
 quite+0.011
 really+0.010
Neg logit contributions
keeping-0.021
cakes-0.021
mons-0.019
ons-0.019
 pies-0.017
Token there
Token ID612
Feature activation-2.28e-02

Pos logit contributions
 rather+0.041
 especially+0.038
 quite+0.035
 particularly+0.035
 like+0.034
Neg logit contributions
cakes-0.052
keeping-0.051
mons-0.047
ons-0.046
 Sue-0.043
Token was
Token ID373
Feature activation-1.78e-02

Pos logit contributions
 rather+0.039
 especially+0.038
 like+0.038
 three+0.036
 quite+0.036
Neg logit contributions
keeping-0.043
cakes-0.041
mons-0.035
ons-0.034
 pies-0.032
Token a
Token ID257
Feature activation+9.52e-03

Pos logit contributions
 especially+0.018
 rather+0.018
 like+0.018
 who+0.017
 really+0.016
Neg logit contributions
keeping-0.046
cakes-0.045
mons-0.040
ons-0.038
 pies-0.038
Token little
Token ID1310
Feature activation+1.49e-02

Pos logit contributions
cakes+0.020
keeping+0.019
mons+0.018
 pies+0.017
ons+0.016
Neg logit contributions
 rather-0.013
 especially-0.012
 like-0.012
 quite-0.011
 particularly-0.011
Token boy
Token ID2933
Feature activation+8.57e-03

Pos logit contributions
keeping+0.029
cakes+0.028
mons+0.024
ons+0.023
 pies+0.022
Neg logit contributions
 especially-0.025
 like-0.024
 rather-0.024
 really-0.023
 quite-0.023
Token named
Token ID3706
Feature activation+1.74e-02

Pos logit contributions
keeping+0.015
cakes+0.013
ons+0.012
mons+0.012
 pies+0.011
Neg logit contributions
 rather-0.015
 especially-0.015
 quite-0.015
 like-0.015
 really-0.015
Token Tim
Token ID5045
Feature activation-1.24e-02

Pos logit contributions
keeping+0.033
cakes+0.031
mons+0.026
ons+0.025
 juices+0.024
Neg logit contributions
 like-0.032
 especially-0.032
 really-0.031
 three-0.031
 rather-0.030
Token.
Token ID13
Feature activation-2.17e-02

Pos logit contributions
 rather+0.023
 particularly+0.022
 especially+0.021
 three+0.021
 quite+0.021
Neg logit contributions
keeping-0.021
ons-0.017
cakes-0.016
 pies-0.015
ade-0.014
Token He
Token ID679
Feature activation+3.23e-03

Pos logit contributions
 especially+0.039
 like+0.039
 rather+0.037
 really+0.037
 three+0.036
Neg logit contributions
keeping-0.042
cakes-0.039
mons-0.033
ons-0.031
 pies-0.030
Token all
Token ID477
Feature activation-2.05e-02

Pos logit contributions
 especially+0.031
 like+0.030
 rather+0.030
 who+0.029
 really+0.028
Neg logit contributions
keeping-0.067
cakes-0.063
mons-0.056
ons-0.054
 pies-0.053
Token the
Token ID262
Feature activation+2.53e-02

Pos logit contributions
 like+0.023
 especially+0.022
 really+0.021
 very+0.020
 rather+0.020
Neg logit contributions
keeping-0.054
cakes-0.052
mons-0.048
 juices-0.043
ons-0.043
Token games
Token ID1830
Feature activation+1.32e-02

Pos logit contributions
keeping+0.056
cakes+0.051
mons+0.047
ment+0.042
Jud+0.041
Neg logit contributions
 especially-0.041
 like-0.041
 really-0.039
 very-0.039
 three-0.038
Token,
Token ID11
Feature activation+7.92e-03

Pos logit contributions
keeping+0.034
cakes+0.033
mons+0.028
ons+0.028
 pies+0.027
Neg logit contributions
 especially-0.013
 rather-0.012
 three-0.012
 quite-0.012
 like-0.012
Token but
Token ID475
Feature activation-1.08e-02

Pos logit contributions
keeping+0.022
cakes+0.021
mons+0.019
 pies+0.018
ons+0.018
Neg logit contributions
 especially-0.006
 rather-0.006
 like-0.006
 three-0.005
 really-0.005
Token it
Token ID340
Feature activation+1.64e-02

Pos logit contributions
 especially+0.013
 like+0.013
 rather+0.013
 really+0.012
 who+0.012
Neg logit contributions
keeping-0.026
cakes-0.026
mons-0.022
ons-0.021
 pies-0.021
Token is
Token ID318
Feature activation-4.63e-03

Pos logit contributions
keeping+0.044
cakes+0.042
mons+0.038
ons+0.036
 pies+0.036
Neg logit contributions
 especially-0.016
 like-0.016
 rather-0.015
 really-0.015
 who-0.014
Token also
Token ID635
Feature activation-1.94e-02

Pos logit contributions
 especially+0.004
 who+0.004
 three+0.004
 rather+0.004
 like+0.004
Neg logit contributions
keeping-0.012
cakes-0.011
 juices-0.010
mons-0.010
ons-0.010
Token important
Token ID1593
Feature activation+5.98e-04

Pos logit contributions
 especially+0.021
 like+0.020
 rather+0.020
 three+0.019
 who+0.019
Neg logit contributions
keeping-0.049
cakes-0.048
mons-0.042
ons-0.040
 pies-0.040
Token to
Token ID284
Feature activation-8.72e-03

Pos logit contributions
keeping+0.002
cakes+0.002
mons+0.001
ons+0.001
 pies+0.001
Neg logit contributions
 especially-0.000
 like-0.000
 rather-0.000
 really-0.000
 three-0.000
Token invite
Token ID14037
Feature activation+1.06e-02

Pos logit contributions
 especially+0.010
 like+0.010
 rather+0.009
 really+0.009
 three+0.009
Neg logit contributions
keeping-0.022
cakes-0.022
mons-0.019
ons-0.018
 pies-0.018
Token seats
Token ID8632
Feature activation+1.97e-02

Pos logit contributions
keeping+0.094
cakes+0.090
mons+0.078
ons+0.074
 pies+0.071
Neg logit contributions
 especially-0.068
 like-0.068
 rather-0.066
 really-0.064
 three-0.064
Token and
Token ID290
Feature activation+1.98e-02

Pos logit contributions
keeping+0.055
cakes+0.054
mons+0.048
ons+0.046
 pies+0.046
Neg logit contributions
 especially-0.017
 rather-0.016
 quite-0.014
 really-0.014
 particularly-0.014
Token a
Token ID257
Feature activation+1.80e-02

Pos logit contributions
keeping+0.053
cakes+0.051
mons+0.046
ons+0.044
ally+0.044
Neg logit contributions
 like-0.024
 three-0.023
 especially-0.021
 very-0.020
 who-0.020
Token heavy
Token ID4334
Feature activation+3.01e-03

Pos logit contributions
keeping+0.040
cakes+0.038
mons+0.033
ons+0.032
 pies+0.031
Neg logit contributions
 especially-0.026
 like-0.026
 rather-0.025
 really-0.024
 three-0.024
Token rope
Token ID17182
Feature activation-1.54e-02

Pos logit contributions
keeping+0.007
cakes+0.007
mons+0.006
ons+0.005
 juices+0.005
Neg logit contributions
 like-0.004
 especially-0.004
 three-0.004
 rather-0.004
 really-0.004
Token.
Token ID13
Feature activation+9.85e-03

Pos logit contributions
 especially+0.018
 rather+0.017
 quite+0.016
 three+0.016
 really+0.016
Neg logit contributions
keeping-0.038
cakes-0.037
mons-0.032
ons-0.031
 pies-0.031
Token They
Token ID1119
Feature activation+2.14e-02

Pos logit contributions
keeping+0.025
cakes+0.024
mons+0.021
ons+0.020
 pies+0.020
Neg logit contributions
 like-0.012
 especially-0.012
 rather-0.011
 three-0.011
 who-0.011
Token run
Token ID1057
Feature activation+4.02e-02

Pos logit contributions
keeping+0.038
cakes+0.038
 pies+0.033
 juices+0.032
mons+0.031
Neg logit contributions
 especially-0.030
 rather-0.030
 particularly-0.030
 who-0.028
 three-0.028
Token to
Token ID284
Feature activation+3.62e-02

Pos logit contributions
keeping+0.100
cakes+0.096
mons+0.084
 pies+0.081
 treats+0.078
Neg logit contributions
 especially-0.046
 rather-0.043
 particularly-0.039
 who-0.039
 usually-0.038
Token the
Token ID262
Feature activation+2.80e-03

Pos logit contributions
keeping+0.078
cakes+0.073
 pies+0.065
mons+0.062
 juices+0.061
Neg logit contributions
 rather-0.049
 especially-0.049
 particularly-0.047
 like-0.045
 quite-0.044
Token swing
Token ID9628
Feature activation-1.73e-02

Pos logit contributions
keeping+0.006
cakes+0.006
mons+0.005
ons+0.005
 juices+0.004
Neg logit contributions
 especially-0.005
 like-0.004
 rather-0.004
 really-0.004
 three-0.004
Token she
Token ID673
Feature activation-2.30e-02

Pos logit contributions
keeping+0.009
cakes+0.008
mons+0.008
 pies+0.007
 juices+0.007
Neg logit contributions
 like-0.004
 really-0.003
 even-0.003
 still-0.003
 especially-0.003
Token realized
Token ID6939
Feature activation+5.37e-03

Pos logit contributions
 especially+0.035
 like+0.035
 really+0.035
 still+0.034
 rather+0.033
Neg logit contributions
keeping-0.052
cakes-0.048
mons-0.043
 pies-0.039
ons-0.039
Token it
Token ID340
Feature activation-9.21e-03

Pos logit contributions
keeping+0.014
cakes+0.014
mons+0.012
ons+0.011
 pies+0.011
Neg logit contributions
 especially-0.006
 like-0.006
 rather-0.005
 really-0.005
 who-0.005
Token was
Token ID373
Feature activation-2.44e-02

Pos logit contributions
 like+0.013
 especially+0.013
 really+0.013
 still+0.012
 rather+0.012
Neg logit contributions
keeping-0.022
cakes-0.020
mons-0.018
 pies-0.017
ons-0.017
Token impossible
Token ID5340
Feature activation-7.37e-03

Pos logit contributions
 especially+0.024
 like+0.023
 rather+0.023
 who+0.021
 really+0.021
Neg logit contributions
keeping-0.066
cakes-0.064
mons-0.057
ons-0.054
 pies-0.053
Token for
Token ID329
Feature activation-1.52e-02

Pos logit contributions
 especially+0.007
 rather+0.007
 like+0.007
 three+0.007
 who+0.006
Neg logit contributions
keeping-0.020
cakes-0.019
mons-0.017
ons-0.016
 pies-0.016
Token her
Token ID607
Feature activation-1.81e-02

Pos logit contributions
 especially+0.018
 like+0.018
 rather+0.017
 really+0.017
 three+0.017
Neg logit contributions
keeping-0.038
cakes-0.037
mons-0.032
ons-0.031
 juices-0.030
Token m
Token ID285
Feature activation-1.18e-02

Pos logit contributions
 especially+0.022
 like+0.021
 rather+0.020
 who+0.020
 quite+0.020
Neg logit contributions
keeping-0.043
cakes-0.041
mons-0.036
ons-0.034
 pies-0.034
Tokenittens
Token ID34978
Feature activation-6.53e-03

Pos logit contributions
 especially+0.021
 like+0.021
 rather+0.021
 really+0.020
 who+0.020
Neg logit contributions
keeping-0.022
cakes-0.021
mons-0.018
ons-0.017
 pies-0.016
Token to
Token ID284
Feature activation+3.54e-04

Pos logit contributions
 especially+0.006
 like+0.006
 rather+0.006
 really+0.006
 who+0.006
Neg logit contributions
keeping-0.018
cakes-0.017
mons-0.015
ons-0.015
 pies-0.014
Token fit
Token ID4197
Feature activation-2.59e-02

Pos logit contributions
keeping+0.001
cakes+0.001
mons+0.001
ons+0.001
 pies+0.001
Neg logit contributions
 especially-0.000
 like-0.000
 rather-0.000
 really-0.000
 who-0.000
Token hug
Token ID16225
Feature activation-2.04e-03

Pos logit contributions
 especially+0.039
 rather+0.037
 like+0.037
 who+0.036
 quite+0.036
Neg logit contributions
cakes-0.031
keeping-0.031
mons-0.024
ons-0.024
 juices-0.023
Token and
Token ID290
Feature activation-1.30e-02

Pos logit contributions
 especially+0.002
 rather+0.002
 quite+0.001
 three+0.001
 like+0.001
Neg logit contributions
keeping-0.006
cakes-0.006
mons-0.005
ons-0.005
 pies-0.005
Token said
Token ID531
Feature activation+1.20e-02

Pos logit contributions
 even+0.018
 very+0.018
 so+0.017
 really+0.017
 still+0.016
Neg logit contributions
keeping-0.035
cakes-0.031
mons-0.030
irens-0.028
uns-0.028
Token,
Token ID11
Feature activation-1.39e-02

Pos logit contributions
keeping+0.035
cakes+0.033
mons+0.030
ons+0.028
 juices+0.028
Neg logit contributions
 like-0.010
 especially-0.010
 rather-0.009
 really-0.009
 three-0.009
Token "
Token ID366
Feature activation-1.91e-02

Pos logit contributions
 especially+0.020
 rather+0.020
 quite+0.019
 like+0.019
 particularly+0.018
Neg logit contributions
cakes-0.029
keeping-0.027
mons-0.023
 pies-0.022
ons-0.021
TokenThank
Token ID10449
Feature activation-6.39e-04

Pos logit contributions
 especially+0.035
 like+0.035
 rather+0.034
 really+0.033
 three+0.033
Neg logit contributions
keeping-0.036
cakes-0.034
mons-0.029
ons-0.027
 pies-0.026
Token you
Token ID345
Feature activation+1.24e-03

Pos logit contributions
 especially+0.001
 like+0.001
 who+0.001
 really+0.001
 three+0.001
Neg logit contributions
keeping-0.002
cakes-0.002
mons-0.001
 juices-0.001
 pies-0.001
Token for
Token ID329
Feature activation-1.87e-02

Pos logit contributions
keeping+0.004
cakes+0.003
mons+0.003
ons+0.003
 pies+0.003
Neg logit contributions
 especially-0.001
 like-0.001
 rather-0.001
 really-0.001
 three-0.001
Token helping
Token ID5742
Feature activation+5.78e-03

Pos logit contributions
 especially+0.024
 like+0.024
 really+0.023
 who+0.023
 rather+0.023
Neg logit contributions
keeping-0.045
cakes-0.044
mons-0.039
 juices-0.036
ons-0.036
Token me
Token ID502
Feature activation-1.11e-02

Pos logit contributions
keeping+0.017
cakes+0.017
mons+0.015
 juices+0.014
ment+0.014
Neg logit contributions
 especially-0.005
 like-0.005
 really-0.004
 who-0.004
 rather-0.004
Token fix
Token ID4259
Feature activation+1.32e-02

Pos logit contributions
 especially+0.015
 rather+0.015
 particularly+0.014
 like+0.014
 quite+0.014
Neg logit contributions
keeping-0.024
cakes-0.023
ons-0.020
 pies-0.020
mons-0.020
Token I
Token ID314
Feature activation+3.26e-03

Pos logit contributions
 especially+0.004
 like+0.004
 rather+0.004
 really+0.004
 quite+0.004
Neg logit contributions
keeping-0.006
cakes-0.006
mons-0.005
ons-0.005
 juices-0.005
Token will
Token ID481
Feature activation+1.04e-02

Pos logit contributions
keeping+0.007
cakes+0.007
mons+0.006
ons+0.005
 juices+0.005
Neg logit contributions
 especially-0.005
 three-0.004
 rather-0.004
 who-0.004
 like-0.004
Token leave
Token ID2666
Feature activation+1.77e-02

Pos logit contributions
keeping+0.022
cakes+0.020
 pies+0.017
mons+0.017
ons+0.017
Neg logit contributions
 especially-0.014
 three-0.013
 particularly-0.013
 rather-0.013
 who-0.013
Token you
Token ID345
Feature activation+1.43e-02

Pos logit contributions
keeping+0.041
cakes+0.040
mons+0.035
ons+0.033
 pies+0.032
Neg logit contributions
 especially-0.023
 like-0.023
 rather-0.022
 really-0.022
 three-0.021
Token alone
Token ID3436
Feature activation-3.42e-02

Pos logit contributions
keeping+0.038
cakes+0.033
ons+0.030
 pies+0.030
mons+0.029
Neg logit contributions
 especially-0.014
 particularly-0.013
 three-0.013
 rather-0.013
 quite-0.012
Token.
Token ID13
Feature activation-7.75e-03

Pos logit contributions
 especially+0.041
 particularly+0.039
 rather+0.038
 quite+0.037
 really+0.036
Neg logit contributions
keeping-0.083
cakes-0.072
mons-0.064
ons-0.062
 pies-0.062
Token I
Token ID314
Feature activation+9.84e-04

Pos logit contributions
 especially+0.011
 like+0.010
 rather+0.010
 really+0.010
 quite+0.010
Neg logit contributions
keeping-0.018
cakes-0.017
mons-0.015
ons-0.014
 juices-0.014
Token will
Token ID481
Feature activation+7.40e-03

Pos logit contributions
keeping+0.002
cakes+0.002
mons+0.002
 juices+0.001
ons+0.001
Neg logit contributions
 especially-0.002
 three-0.001
 rather-0.001
 who-0.001
 particularly-0.001
Token go
Token ID467
Feature activation+1.34e-02

Pos logit contributions
keeping+0.016
cakes+0.014
 pies+0.012
mons+0.012
ons+0.012
Neg logit contributions
 especially-0.010
 three-0.010
 particularly-0.010
 rather-0.009
 who-0.009
Token to
Token ID284
Feature activation+1.39e-02

Pos logit contributions
keeping+0.035
cakes+0.033
 pies+0.028
mons+0.028
ons+0.028
Neg logit contributions
 rather-0.012
 especially-0.012
 three-0.011
 quite-0.011
 particularly-0.010
Token another
Token ID1194
Feature activation+6.28e-03

Pos logit contributions
keeping+0.034
cakes+0.032
mons+0.028
ons+0.027
 pies+0.027
Neg logit contributions
 especially-0.017
 like-0.016
 rather-0.016
 really-0.015
 quite-0.015
Token around
Token ID1088
Feature activation-3.00e-02

Pos logit contributions
 especially+0.009
 rather+0.009
 three+0.008
 quite+0.008
 particularly+0.008
Neg logit contributions
keeping-0.022
cakes-0.021
ons-0.020
 pies-0.018
ade-0.018
Token the
Token ID262
Feature activation+9.10e-03

Pos logit contributions
ust+0.050
 rather+0.050
mo+0.045
chie+0.045
be+0.045
Neg logit contributions
keeping-0.044
ons-0.039
 foods-0.032
 pies-0.031
 cookies-0.029
Token world
Token ID995
Feature activation+2.57e-03

Pos logit contributions
keeping+0.017
cakes+0.015
ons+0.013
 pies+0.012
mons+0.012
Neg logit contributions
 rather-0.015
 especially-0.015
 like-0.015
 quite-0.015
 who-0.014
Token and
Token ID290
Feature activation-2.98e-03

Pos logit contributions
keeping+0.007
cakes+0.006
ons+0.006
mons+0.005
 pies+0.005
Neg logit contributions
 especially-0.002
 three-0.002
 rather-0.002
 particularly-0.002
 still-0.002
Token it
Token ID340
Feature activation+2.02e-02

Pos logit contributions
 especially+0.004
 rather+0.004
 three+0.004
 like+0.004
 really+0.004
Neg logit contributions
keeping-0.007
cakes-0.006
mons-0.005
 pies-0.005
ons-0.005
Token was
Token ID373
Feature activation-1.78e-03

Pos logit contributions
cakes+0.032
keeping+0.031
ons+0.027
ade+0.025
mons+0.025
Neg logit contributions
 rather-0.035
 three-0.034
 like-0.033
 who-0.032
 especially-0.032
Token the
Token ID262
Feature activation+1.03e-02

Pos logit contributions
 rather+0.001
 who+0.001
 especially+0.001
 three+0.001
 dot+0.001
Neg logit contributions
keeping-0.004
cakes-0.004
ons-0.004
 Sue-0.004
 pies-0.004
Token coolest
Token ID38889
Feature activation-6.40e-03

Pos logit contributions
keeping+0.020
cakes+0.018
mons+0.016
 pies+0.016
 treats+0.016
Neg logit contributions
 who-0.013
 especially-0.013
 rather-0.013
 closely-0.012
 at-0.012
Token thing
Token ID1517
Feature activation-6.11e-04

Pos logit contributions
 especially+0.009
 rather+0.009
 like+0.009
 really+0.008
 quite+0.008
Neg logit contributions
keeping-0.014
cakes-0.013
mons-0.012
ons-0.011
 pies-0.011
Token ever
Token ID1683
Feature activation-1.55e-02

Pos logit contributions
 especially+0.001
 rather+0.001
 three+0.001
 like+0.001
 really+0.001
Neg logit contributions
keeping-0.002
cakes-0.001
mons-0.001
ons-0.001
 pies-0.001
Token.
Token ID13
Feature activation+1.09e-02

Pos logit contributions
 especially+0.015
 like+0.015
 rather+0.014
 really+0.013
 who+0.013
Neg logit contributions
keeping-0.042
cakes-0.041
mons-0.037
ons-0.035
 pies-0.035
TokenHe
Token ID1544
Feature activation-8.73e-03

Pos logit contributions
 like+0.030
 especially+0.030
 rather+0.029
 three+0.028
 really+0.028
Neg logit contributions
keeping-0.054
cakes-0.053
mons-0.046
ons-0.044
 pies-0.043
Token picked
Token ID6497
Feature activation+1.66e-02

Pos logit contributions
 especially+0.014
 like+0.014
 rather+0.014
 three+0.013
 really+0.013
Neg logit contributions
keeping-0.017
cakes-0.017
mons-0.014
ons-0.014
 juices-0.013
Token up
Token ID510
Feature activation-2.13e-02

Pos logit contributions
keeping+0.044
cakes+0.043
mons+0.038
ons+0.036
 pies+0.036
Neg logit contributions
 like-0.017
 especially-0.017
 really-0.016
 three-0.016
 rather-0.016
Token the
Token ID262
Feature activation-6.92e-03

Pos logit contributions
 like+0.019
 especially+0.018
 three+0.017
 really+0.017
 who+0.017
Neg logit contributions
keeping-0.061
cakes-0.059
mons-0.053
ons-0.050
 pies-0.049
Token yogurt
Token ID32132
Feature activation-5.18e-03

Pos logit contributions
 especially+0.009
 rather+0.009
 like+0.009
 who+0.008
 really+0.008
Neg logit contributions
keeping-0.016
cakes-0.015
mons-0.013
ons-0.013
 pies-0.013
Token and
Token ID290
Feature activation+2.62e-03

Pos logit contributions
 rather+0.004
 especially+0.004
 like+0.003
 who+0.003
 quite+0.003
Neg logit contributions
cakes-0.015
keeping-0.015
mons-0.013
 pies-0.013
ons-0.013
Token studied
Token ID9713
Feature activation+1.40e-02

Pos logit contributions
keeping+0.007
cakes+0.006
mons+0.006
 inconvenience+0.005
oys+0.005
Neg logit contributions
 even-0.003
 still-0.003
 like-0.003
 really-0.003
 especially-0.003
Token it
Token ID340
Feature activation+1.57e-04

Pos logit contributions
keeping+0.032
cakes+0.031
ons+0.027
mons+0.026
 pies+0.025
Neg logit contributions
 especially-0.018
 rather-0.017
 like-0.016
 who-0.016
 three-0.016
Token carefully
Token ID7773
Feature activation-2.99e-02

Pos logit contributions
keeping+0.000
cakes+0.000
ons+0.000
 pies+0.000
mons+0.000
Neg logit contributions
 especially-0.000
 rather-0.000
 who-0.000
 three-0.000
 like-0.000
Token.
Token ID13
Feature activation+1.59e-03

Pos logit contributions
 especially+0.040
 rather+0.036
 who+0.034
 particularly+0.033
 three+0.033
Neg logit contributions
keeping-0.068
cakes-0.063
ment-0.054
ons-0.053
mons-0.053
Token He
Token ID679
Feature activation+8.76e-03

Pos logit contributions
keeping+0.004
cakes+0.004
mons+0.003
ons+0.003
 juices+0.003
Neg logit contributions
 especially-0.002
 like-0.002
 rather-0.002
 really-0.002
 quite-0.002
Token.
Token ID13
Feature activation+8.51e-03

Pos logit contributions
 especially+0.015
 astronaut+0.015
 particularly+0.014
 rather+0.013
 three+0.013
Neg logit contributions
keeping-0.019
cakes-0.018
mons-0.017
 pies-0.016
ons-0.016
Token She
Token ID1375
Feature activation+4.53e-03

Pos logit contributions
keeping+0.022
cakes+0.021
mons+0.019
ons+0.018
 pies+0.018
Neg logit contributions
 especially-0.009
 like-0.009
 rather-0.009
 really-0.008
 three-0.008
Token asked
Token ID1965
Feature activation+2.60e-02

Pos logit contributions
keeping+0.010
cakes+0.010
mons+0.008
ons+0.008
 pies+0.008
Neg logit contributions
 especially-0.007
 like-0.007
 really-0.007
 rather-0.006
 who-0.006
Token him
Token ID683
Feature activation-2.87e-03

Pos logit contributions
keeping+0.060
cakes+0.059
mons+0.051
ons+0.050
 pies+0.049
Neg logit contributions
 especially-0.033
 rather-0.033
 like-0.032
 really-0.031
 quite-0.031
Token why
Token ID1521
Feature activation+1.58e-02

Pos logit contributions
 especially+0.003
 rather+0.003
 quite+0.003
 like+0.003
 particularly+0.003
Neg logit contributions
cakes-0.007
keeping-0.007
mons-0.006
 pies-0.006
ons-0.006
Token he
Token ID339
Feature activation-4.65e-02

Pos logit contributions
keeping+0.036
cakes+0.035
mons+0.031
ons+0.029
 pies+0.028
Neg logit contributions
 especially-0.022
 like-0.021
 rather-0.021
 really-0.020
 three-0.020
Token was
Token ID373
Feature activation-3.88e-02

Pos logit contributions
 rather+0.075
 especially+0.075
 three+0.073
 like+0.073
 who+0.071
Neg logit contributions
keeping-0.085
cakes-0.084
mons-0.072
ons-0.071
 juices-0.066
Token doing
Token ID1804
Feature activation-3.01e-02

Pos logit contributions
 rather+0.046
 especially+0.044
 dot+0.042
 forward+0.042
 particularly+0.041
Neg logit contributions
keeping-0.076
cakes-0.074
 pies-0.068
mons-0.067
 cream-0.064
Token that
Token ID326
Feature activation-2.15e-02

Pos logit contributions
 especially+0.038
 rather+0.038
 like+0.036
 three+0.036
 who+0.035
Neg logit contributions
keeping-0.067
cakes-0.066
ons-0.057
mons-0.057
 pies-0.056
Token.
Token ID13
Feature activation+1.09e-02

Pos logit contributions
 especially+0.022
 rather+0.022
 like+0.021
 who+0.021
 three+0.020
Neg logit contributions
keeping-0.053
cakes-0.052
mons-0.045
ons-0.044
 pies-0.044
Token Her
Token ID2332
Feature activation+1.34e-02

Pos logit contributions
keeping+0.028
cakes+0.026
mons+0.023
ons+0.022
 pies+0.022
Neg logit contributions
 especially-0.013
 like-0.013
 really-0.012
 rather-0.012
 who-0.012
Token,
Token ID11
Feature activation-3.41e-02

Pos logit contributions
cakes+0.031
keeping+0.031
mons+0.026
ons+0.026
 pies+0.026
Neg logit contributions
 especially-0.013
 rather-0.013
 quite-0.012
 like-0.012
 really-0.012
Token �
Token ID6184
Feature activation+8.87e-03

Pos logit contributions
 especially+0.048
 rather+0.047
 like+0.044
 dot+0.043
 astronaut+0.043
Neg logit contributions
cakes-0.063
keeping-0.061
 Sue-0.052
mons-0.052
 Sally-0.052
Token�
Token ID95
Feature activation-1.18e-02

Pos logit contributions
keeping+0.016
cakes+0.016
mons+0.014
ons+0.012
 pies+0.012
Neg logit contributions
 especially-0.016
 like-0.015
 rather-0.015
 really-0.015
 quite-0.015
Token€
Token ID26391
Feature activation+2.98e-02

Pos logit contributions
 especially+0.021
 like+0.020
 rather+0.020
 three+0.019
 really+0.019
Neg logit contributions
keeping-0.023
cakes-0.022
mons-0.018
ons-0.017
 juices-0.016
Token�
Token ID129
Feature activation-2.44e-03

Pos logit contributions
keeping+0.070
cakes+0.068
mons+0.059
 pies+0.057
 juices+0.056
Neg logit contributions
 especially-0.039
 like-0.039
 who-0.038
 rather-0.038
 really-0.037
Token�
Token ID241
Feature activation-5.46e-02

Pos logit contributions
 especially+0.004
 like+0.004
 who+0.004
 three+0.004
 rather+0.004
Neg logit contributions
keeping-0.005
cakes-0.004
mons-0.004
ons-0.003
live-0.003
TokenWhere
Token ID8496
Feature activation+3.22e-03

Pos logit contributions
 especially+0.072
 like+0.071
 rather+0.070
 really+0.067
 three+0.066
Neg logit contributions
keeping-0.128
cakes-0.122
mons-0.107
ons-0.101
 juices-0.098
Token can
Token ID460
Feature activation+7.00e-03

Pos logit contributions
keeping+0.009
cakes+0.008
mons+0.007
 pies+0.007
 juices+0.007
Neg logit contributions
 like-0.004
 especially-0.004
 really-0.004
 who-0.004
 rather-0.004
Token I
Token ID314
Feature activation-2.02e-02

Pos logit contributions
keeping+0.015
cakes+0.015
mons+0.013
ons+0.013
 juices+0.012
Neg logit contributions
 especially-0.010
 like-0.010
 rather-0.010
 really-0.009
 three-0.009
Token find
Token ID1064
Feature activation-1.14e-02

Pos logit contributions
 especially+0.026
 rather+0.026
 three+0.026
 like+0.025
 who+0.024
Neg logit contributions
keeping-0.042
cakes-0.040
ons-0.035
mons-0.035
 juices-0.033
Token some
Token ID617
Feature activation+2.11e-02

Pos logit contributions
 like+0.012
 especially+0.012
 rather+0.012
 really+0.012
 who+0.012
Neg logit contributions
keeping-0.030
cakes-0.029
mons-0.026
 pies-0.024
 juices-0.024
Token\n
Token ID198
Feature activation+6.43e-03

Pos logit contributions
keeping+0.042
cakes+0.040
mons+0.035
ons+0.035
 pies+0.033
Neg logit contributions
 especially-0.017
 like-0.017
 rather-0.017
 three-0.016
 quite-0.016
Token"
Token ID1
Feature activation-3.01e-03

Pos logit contributions
keeping+0.013
cakes+0.012
mons+0.010
ons+0.010
ment+0.009
Neg logit contributions
 especially-0.012
 like-0.012
 really-0.011
 who-0.011
 still-0.011
TokenHello
Token ID15496
Feature activation+4.96e-02

Pos logit contributions
 especially+0.004
 like+0.004
 who+0.004
 really+0.004
 three+0.004
Neg logit contributions
keeping-0.007
cakes-0.007
mons-0.006
ons-0.006
 juices-0.005
Token,
Token ID11
Feature activation-1.29e-02

Pos logit contributions
keeping+0.146
cakes+0.139
mons+0.125
ons+0.119
 juices+0.117
Neg logit contributions
 like-0.042
 especially-0.039
 three-0.038
 who-0.038
 really-0.037
Token Mr
Token ID1770
Feature activation+1.55e-02

Pos logit contributions
 especially+0.018
 rather+0.018
 like+0.017
 quite+0.017
 really+0.016
Neg logit contributions
cakes-0.028
keeping-0.028
mons-0.024
 pies-0.023
ons-0.023
Token.
Token ID13
Feature activation-6.18e-02

Pos logit contributions
cakes+0.047
keeping+0.047
ons+0.041
mons+0.040
 juices+0.040
Neg logit contributions
 like-0.012
 who-0.010
 three-0.009
 really-0.009
 still-0.009
Token Cro
Token ID9325
Feature activation+1.07e-02

Pos logit contributions
 especially+0.092
 like+0.092
 rather+0.089
 three+0.088
 who+0.087
Neg logit contributions
keeping-0.134
cakes-0.129
mons-0.110
ons-0.106
 pies-0.102
Tokencod
Token ID19815
Feature activation-1.14e-02

Pos logit contributions
keeping+0.020
cakes+0.020
mons+0.017
ons+0.016
 pies+0.015
Neg logit contributions
 especially-0.019
 like-0.018
 rather-0.018
 three-0.018
 really-0.018
Tokenile
Token ID576
Feature activation-4.22e-03

Pos logit contributions
 especially+0.016
 like+0.016
 rather+0.016
 really+0.015
 three+0.015
Neg logit contributions
keeping-0.025
cakes-0.025
mons-0.021
ons-0.020
 pies-0.020
Token,"
Token ID553
Feature activation-4.60e-03

Pos logit contributions
 like+0.006
 who+0.006
 especially+0.006
 really+0.006
 rather+0.005
Neg logit contributions
keeping-0.010
cakes-0.010
mons-0.008
 juices-0.008
ons-0.008
Token said
Token ID531
Feature activation+1.27e-02

Pos logit contributions
 like+0.006
 who+0.006
 very+0.006
 three+0.006
 especially+0.006
Neg logit contributions
keeping-0.013
cakes-0.011
mons-0.010
ugs-0.009
ats-0.009
Token lesson
Token ID11483
Feature activation+6.30e-03

Pos logit contributions
keeping+0.021
cakes+0.020
mons+0.015
ons+0.015
 pies+0.013
Neg logit contributions
 especially-0.032
 like-0.032
 rather-0.032
 really-0.031
 three-0.031
Token.
Token ID13
Feature activation-7.70e-03

Pos logit contributions
keeping+0.018
cakes+0.017
mons+0.015
ons+0.015
 pies+0.015
Neg logit contributions
 like-0.006
 especially-0.005
 rather-0.005
 who-0.005
 really-0.005
Token<|endoftext|>
Token ID50256
Feature activation+7.30e-03

Pos logit contributions
 especially+0.010
 like+0.010
 rather+0.010
 who+0.010
 really+0.010
Neg logit contributions
keeping-0.018
cakes-0.018
mons-0.015
ons-0.015
 pies-0.014
TokenL
Token ID43
Feature activation-1.00e-02

Pos logit contributions
keeping+0.014
cakes+0.013
mons+0.011
ons+0.010
 pies+0.010
Neg logit contributions
 especially-0.013
 like-0.013
 rather-0.013
 really-0.012
 who-0.012
Tokenily
Token ID813
Feature activation-4.73e-03

Pos logit contributions
 especially+0.019
 like+0.019
 rather+0.019
 really+0.018
 who+0.018
Neg logit contributions
keeping-0.018
cakes-0.017
mons-0.014
ons-0.013
 pies-0.012
Token and
Token ID290
Feature activation-6.05e-02

Pos logit contributions
 especially+0.006
 like+0.005
 rather+0.005
 really+0.005
 three+0.005
Neg logit contributions
keeping-0.012
cakes-0.011
mons-0.010
ons-0.010
 pies-0.009
Token Tom
Token ID4186
Feature activation-2.80e-03

Pos logit contributions
 like+0.084
 who+0.079
 especially+0.079
 really+0.078
 still+0.076
Neg logit contributions
cakes-0.147
keeping-0.146
mons-0.125
ons-0.118
 pies-0.117
Token are
Token ID389
Feature activation-1.17e-02

Pos logit contributions
 especially+0.004
 like+0.004
 rather+0.004
 really+0.004
 three+0.004
Neg logit contributions
keeping-0.006
cakes-0.006
mons-0.005
ons-0.005
 pies-0.005
Token twins
Token ID20345
Feature activation-1.21e-02

Pos logit contributions
 rather+0.018
 like+0.018
 quite+0.017
 especially+0.017
 really+0.016
Neg logit contributions
cakes-0.023
keeping-0.022
mons-0.019
ons-0.019
 sandwiches-0.019
Token.
Token ID13
Feature activation-2.61e-02

Pos logit contributions
 quite+0.018
 especially+0.017
 particularly+0.017
 rather+0.017
 three+0.016
Neg logit contributions
keeping-0.022
cakes-0.022
ons-0.019
 pies-0.018
mons-0.017
Token They
Token ID1119
Feature activation-2.38e-03

Pos logit contributions
 like+0.036
 especially+0.036
 rather+0.034
 three+0.033
 who+0.033
Neg logit contributions
keeping-0.062
cakes-0.059
mons-0.052
ons-0.051
 pies-0.049
Token thankful
Token ID25535
Feature activation-3.48e-02

Pos logit contributions
 dot+0.016
 rather+0.014
 like+0.014
 especially+0.014
 who+0.013
Neg logit contributions
keeping-0.036
cakes-0.032
 Sue-0.030
ages-0.029
ons-0.029
Token for
Token ID329
Feature activation-1.06e-02

Pos logit contributions
 especially+0.035
 rather+0.032
 three+0.031
 quite+0.030
 like+0.030
Neg logit contributions
keeping-0.088
cakes-0.084
ment-0.076
mons-0.073
ons-0.072
Token Mrs
Token ID9074
Feature activation-2.26e-02

Pos logit contributions
 especially+0.011
 like+0.011
 rather+0.011
 really+0.010
 three+0.010
Neg logit contributions
keeping-0.027
cakes-0.026
mons-0.022
ons-0.022
 pies-0.022
Token.
Token ID13
Feature activation-6.90e-02

Pos logit contributions
 like+0.022
 especially+0.021
 who+0.020
 rather+0.020
 really+0.020
Neg logit contributions
keeping-0.061
cakes-0.060
mons-0.054
ons-0.053
 pies-0.050
Token Johnson
Token ID5030
Feature activation+1.33e-02

Pos logit contributions
 especially+0.123
 like+0.122
 rather+0.120
 really+0.117
 three+0.117
Neg logit contributions
keeping-0.129
cakes-0.124
mons-0.105
ons-0.099
 pies-0.096
Token's
Token ID338
Feature activation-4.66e-02

Pos logit contributions
keeping+0.036
cakes+0.035
mons+0.031
ons+0.030
 pies+0.029
Neg logit contributions
 especially-0.013
 like-0.012
 rather-0.012
 really-0.011
 three-0.011
Token kind
Token ID1611
Feature activation-2.23e-02

Pos logit contributions
 especially+0.072
 like+0.070
 rather+0.068
 who+0.067
 quite+0.066
Neg logit contributions
keeping-0.095
cakes-0.092
mons-0.078
 pies-0.074
ons-0.074
Token words
Token ID2456
Feature activation-8.04e-03

Pos logit contributions
 like+0.025
 especially+0.024
 rather+0.023
 three+0.023
 really+0.023
Neg logit contributions
keeping-0.057
cakes-0.056
mons-0.051
ons-0.048
 pies-0.047
Token.
Token ID13
Feature activation-1.71e-02

Pos logit contributions
 especially+0.007
 three+0.007
 rather+0.006
 quite+0.006
 particularly+0.006
Neg logit contributions
cakes-0.021
keeping-0.020
ons-0.018
mons-0.017
ment-0.017
Token He
Token ID679
Feature activation-1.07e-03

Pos logit contributions
 especially+0.019
 like+0.019
 rather+0.018
 really+0.017
 who+0.017
Neg logit contributions
keeping-0.044
cakes-0.042
mons-0.038
ons-0.036
 pies-0.035
Token soon
Token ID2582
Feature activation-4.74e-03

Pos logit contributions
 like+0.002
 especially+0.002
 rather+0.002
 three+0.002
 quite+0.002
Neg logit contributions
keeping-0.002
cakes-0.002
mons-0.002
ons-0.001
 juices-0.001