TOP ACTIVATIONS
MAX = 6.505

daughter went to the bar bers hop . The dad wanted
walked out of the bar bers hop , everyone admired the
She put on her best orange dress and got ready for
. They used buckets and m ops to get rid of
came by and ate a french fry from his hands .
put on their gray and black clothes and tie scar ves
and Tom put on their apr ons and washed their hands
He carefully m ended the bike and made
put on their coats and m ittens . They wanted to
excited he put on his m ittens and went outside to
Mom . They put on apr ons to keep their clothes
The little girl spent an hour gathering the little clothes .
the bird in a bird c age and gave it food
put on her hat and m ittens and went outside .
!" To her su prise , the thunder went
, hats , boots and m ittens . Lily had thick
cups and sp oons and m olds to make shapes in
" Of course !" He m ended the m itten with
boy took his shovel and m op outside to the sandbox
Finally M ummy used her fing ern ails to gently pull

MIN ACTIVATIONS
MIN = -6.032

girl . Her name was Mill ie . She was 3
girl . Her name was Daisy . She was only 3
" Hi Blue y , you 're so
ily said , " Hi Betsy , I care about you
was a little girl called Hope . She lived with her
, " My name is Frank . Would you like to
girl . Her name was Claire . Claire lived in a
was a little girl called Mill ie . She wanted to
girl . Her name was Claire and she was three years
was a little girl called Mill ie . She was three
Isn 't it incredible how nature can be so beautiful ?"
, " My name is Hope ." Jack thought
was a little girl called Pink y . She was three
" Hi , I 'm Blue . Do you want to
was a little rabbit named Frankie . He was a very
girl and her name was Gret a . Gret a was
, " My name is Frankie . What 's your name
a little fish y named Blue . Blue loved to swim
was a little girl called Mill ie . One day ,
was a little girl called Abby . She was very happy

INTERVAL 3.371 - 6.505
CONTAINS 0.178%

After a few minutes , a big man dressed
said , handing him a cup and a sa ucer .
t worry , Susan . We will protect you ! "
happy and ran to the cup board . He saw the
's mom offered him a quiz . She said , "

INTERVAL 0.237 - 3.371
CONTAINS 41.659%

house was ready . The two of them
" Thank you p uddle , for glowing so brightly !"
After the lion left , everyone in Jimmy 's group
a tall dad who wore big boots . One day ,
have to wash your hands ." Tim was so happy ,

INTERVAL -2.898 - 0.237
CONTAINS 56.847%

made the fire shrink to a perfect size and it shone
irled all around the garden in gratitude . The rabbit thanked
was no longer embarrassed . The friends had all worked together
wanted to play outside again . But he didn 't want
soaked the ship and made it all wet . But then

INTERVAL -6.032 - -2.898
CONTAINS 1.317%

Once upon a time , there was a little girl named
there was a little girl who loved to go on exciting
the yard . They think it is a fun toy for
!" Tim says . " It looks fun !"
But the dog was still there . It was angry
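
NOTE ON THE INTERVALS
The four INTERVAL boundaries above are consistent with splitting the range from MIN to MAX into four equal-width bins, and the CONTAINS percentages sum to roughly 100%. The sketch below checks both. The equal-width rule is an assumption inferred from the listed numbers, not a documented property of the tool that produced this dump, and the helper name equal_width_edges is hypothetical.

```python
def equal_width_edges(lo: float, hi: float, bins: int = 4) -> list[float]:
    """Return bins + 1 edges splitting [lo, hi] into equal-width intervals."""
    width = (hi - lo) / bins
    return [lo + i * width for i in range(bins + 1)]


# Values copied from the MIN / MAX lines and the INTERVAL headers of this dump.
MIN_ACT, MAX_ACT = -6.032, 6.505
listed_edges = [-6.032, -2.898, 0.237, 3.371, 6.505]
# CONTAINS percentages, ordered from the lowest interval to the highest.
listed_shares = [1.317, 56.847, 41.659, 0.178]

# Assumption: the dashboard used equal-width bins between MIN and MAX.
edges = equal_width_edges(MIN_ACT, MAX_ACT)
max_gap = max(abs(a - b) for a, b in zip(edges, listed_edges))
total_share = sum(listed_shares)

print([f"{e:.3f}" for e in edges])
print(f"largest gap to listed edges: {max_gap:.4f}")
print(f"shares sum to {total_share:.3f}%")
```

The comparison uses a small tolerance rather than exact equality because the listed boundaries are rounded to three decimals.
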
Token daughter
Token ID4957
Feature activation+3.83e+00

Pos logit contributions
 vain+0.016
 fortunate+0.016
 remembers+0.016
arter+0.016
die+0.016
Neg logit contributions
aw-0.018
boards-0.017
cakes-0.017
talk-0.016
P-0.016
Token went
Token ID1816
Feature activation+1.12e-02

Pos logit contributions
aw+0.094
P+0.089
cakes+0.088
boards+0.087
First+0.085
Neg logit contributions
 remembers-0.067
 vain-0.067
 fortunate-0.066
 mistake-0.066
die-0.066
Token to
Token ID284
Feature activation+9.38e-01

Pos logit contributions
aw+0.000
boards+0.000
cakes+0.000
P+0.000
First+0.000
Neg logit contributions
 way-0.000
 chance-0.000
 mistake-0.000
arter-0.000
 thanks-0.000
Token the
Token ID262
Feature activation+2.97e+00

Pos logit contributions
aw+0.026
cakes+0.024
boards+0.024
P+0.024
First+0.022
Neg logit contributions
arter-0.015
 vain-0.015
 mistake-0.015
 fortunate-0.014
die-0.014
Token bar
Token ID2318
Feature activation+3.94e+00

Pos logit contributions
aw+0.068
boards+0.061
cakes+0.060
talk+0.060
First+0.059
Neg logit contributions
 way-0.064
 mistake-0.063
die-0.059
 solution-0.057
 vain-0.057
Tokenbers
Token ID1213
Feature activation+6.35e+00

Pos logit contributions
 umb+0.113
aw+0.113
cakes+0.110
P+0.110
 instruments+0.110
Neg logit contributions
 fortunate-0.049
 mistake-0.047
 vain-0.047
arter-0.045
ubby-0.045
Tokenhop
Token ID8548
Feature activation+2.70e+00

Pos logit contributions
cakes+0.095
 umb+0.093
 instruments+0.089
aw+0.088
P+0.082
Neg logit contributions
 way-0.162
 mistake-0.161
 vain-0.156
 time-0.153
bie-0.152
Token.
Token ID13
Feature activation+6.91e-01

Pos logit contributions
aw+0.074
P+0.073
ican+0.072
Ta+0.071
cakes+0.071
Neg logit contributions
 mistake-0.035
 way-0.035
 fortunate-0.035
 thanks-0.033
 solution-0.032
Token The
Token ID383
Feature activation+1.90e-01

Pos logit contributions
aw+0.013
cakes+0.012
boards+0.011
P+0.011
First+0.011
Neg logit contributions
arter-0.017
 vain-0.017
 fortunate-0.016
 mistake-0.016
die-0.016
Token dad
Token ID9955
Feature activation+8.01e-01

Pos logit contributions
aw+0.004
cakes+0.004
boards+0.004
P+0.004
 instruments+0.004
Neg logit contributions
arter-0.004
 vain-0.004
 fortunate-0.004
die-0.004
 haven-0.004
Token wanted
Token ID2227
Feature activation-1.00e-04

Pos logit contributions
aw+0.019
cakes+0.017
boards+0.017
P+0.017
First+0.016
Neg logit contributions
arter-0.015
 vain-0.015
 fortunate-0.015
die-0.015
 mistake-0.015
Token walked
Token ID6807
Feature activation-6.47e-01

Pos logit contributions
arter+0.016
 vain+0.015
 fortunate+0.015
die+0.015
 mistake+0.014
Neg logit contributions
aw-0.014
cakes-0.013
boards-0.012
P-0.012
co-0.012
Token out
Token ID503
Feature activation+2.50e-01

Pos logit contributions
 way+0.009
die+0.009
 fortunate+0.009
 vain+0.009
 moment+0.009
Neg logit contributions
aw-0.018
P-0.017
cakes-0.017
boards-0.017
 instruments-0.017
Token of
Token ID286
Feature activation+5.82e-01

Pos logit contributions
aw+0.005
cakes+0.005
boards+0.005
P+0.005
 instruments+0.004
Neg logit contributions
arter-0.006
 vain-0.005
 fortunate-0.005
die-0.005
 mistake-0.005
Token the
Token ID262
Feature activation+1.37e+00

Pos logit contributions
aw+0.011
cakes+0.011
P+0.011
 instruments+0.011
First+0.010
Neg logit contributions
arter-0.013
 fortunate-0.013
dy-0.013
 remembers-0.012
die-0.012
Token bar
Token ID2318
Feature activation+3.83e+00

Pos logit contributions
aw+0.031
cakes+0.028
boards+0.028
P+0.028
 instruments+0.027
Neg logit contributions
arter-0.029
 vain-0.028
 fortunate-0.028
die-0.027
 mistake-0.027
Tokenbers
Token ID1213
Feature activation+5.80e+00

Pos logit contributions
aw+0.104
cakes+0.103
P+0.103
First+0.102
 instruments+0.099
Neg logit contributions
 fortunate-0.052
arter-0.052
 vain-0.052
 mistake-0.050
 remembers-0.050
Tokenhop
Token ID8548
Feature activation+1.57e+00

Pos logit contributions
cakes+0.079
P+0.069
First+0.068
 umb+0.068
 instruments+0.067
Neg logit contributions
 way-0.162
 mistake-0.155
 vain-0.153
 thanks-0.151
 time-0.150
Token,
Token ID11
Feature activation+1.15e+00

Pos logit contributions
aw+0.046
cakes+0.043
boards+0.042
P+0.042
 instruments+0.041
Neg logit contributions
arter-0.022
 vain-0.021
 fortunate-0.021
 mistake-0.021
die-0.020
Token everyone
Token ID2506
Feature activation-5.80e-01

Pos logit contributions
aw+0.021
P+0.019
cakes+0.018
boards+0.018
talk+0.017
Neg logit contributions
 vain-0.029
 fortunate-0.028
 haven-0.028
 way-0.028
 remembers-0.028
Token admired
Token ID29382
Feature activation-7.60e-01

Pos logit contributions
arter+0.011
 vain+0.011
 fortunate+0.011
die+0.011
 mistake+0.011
Neg logit contributions
aw-0.014
cakes-0.013
boards-0.013
P-0.012
First-0.012
Token the
Token ID262
Feature activation-2.29e-01

Pos logit contributions
arter+0.018
die+0.018
dy+0.017
 fortunate+0.017
 mistake+0.017
Neg logit contributions
aw-0.015
 instruments-0.013
cakes-0.013
boards-0.012
Her-0.012
Token She
Token ID1375
Feature activation-5.29e-01

Pos logit contributions
aw+0.017
P+0.016
cakes+0.016
boards+0.015
 instruments+0.015
Neg logit contributions
 fortunate-0.013
 haven-0.013
 vain-0.013
die-0.013
arter-0.012
Token put
Token ID1234
Feature activation+4.58e-01

Pos logit contributions
 vain+0.009
 fortunate+0.009
 remembers+0.008
die+0.008
 haven+0.008
Neg logit contributions
aw-0.014
cakes-0.013
P-0.013
boards-0.013
First-0.012
Token on
Token ID319
Feature activation-4.91e-01

Pos logit contributions
aw+0.012
boards+0.011
cakes+0.011
ican+0.011
First+0.011
Neg logit contributions
 way-0.007
 vain-0.007
die-0.007
 has-0.007
 fortunate-0.007
Token her
Token ID607
Feature activation+2.45e+00

Pos logit contributions
 way+0.008
 time+0.008
 thanks+0.007
 normal+0.007
 new+0.007
Neg logit contributions
aw-0.013
ican-0.013
talk-0.012
boards-0.012
Run-0.012
Token best
Token ID1266
Feature activation+3.50e+00

Pos logit contributions
aw+0.065
boards+0.059
cakes+0.058
P+0.058
First+0.058
Neg logit contributions
arter-0.040
 fortunate-0.040
 way-0.039
 mistake-0.039
 remembers-0.039
Token orange
Token ID10912
Feature activation+5.34e+00

Pos logit contributions
aw+0.075
cakes+0.067
boards+0.066
P+0.065
First+0.064
Neg logit contributions
arter-0.075
 way-0.075
 fortunate-0.075
die-0.074
 mistake-0.074
Token dress
Token ID6576
Feature activation+5.72e-01

Pos logit contributions
aw+0.126
First+0.120
Run+0.112
ban+0.108
G+0.107
Neg logit contributions
 way-0.114
 sister-0.105
 has-0.102
 brother-0.102
 size-0.102
Token and
Token ID290
Feature activation+2.10e-01

Pos logit contributions
aw+0.015
P+0.015
First+0.014
Run+0.014
ican+0.014
Neg logit contributions
 way-0.009
 has-0.008
 haven-0.008
 size-0.008
 beloved-0.008
Token got
Token ID1392
Feature activation+1.14e+00

Pos logit contributions
aw+0.005
ican+0.005
talk+0.004
P+0.004
boards+0.004
Neg logit contributions
 way-0.004
 remembers-0.004
 has-0.004
die-0.004
 fortunate-0.004
Token ready
Token ID3492
Feature activation+5.14e-01

Pos logit contributions
ican+0.029
aw+0.029
P+0.027
co+0.026
boards+0.026
Neg logit contributions
 lucky-0.018
 way-0.018
 good-0.018
 fortunate-0.018
 glad-0.017
Token for
Token ID329
Feature activation+2.31e+00

Pos logit contributions
aw+0.013
P+0.013
cakes+0.012
boards+0.012
ican+0.012
Neg logit contributions
 mistake-0.009
 way-0.009
arter-0.008
 fortunate-0.008
 vain-0.008
Token.
Token ID13
Feature activation+5.83e-01

Pos logit contributions
arter+0.017
die+0.017
 vain+0.017
 fortunate+0.016
 remembers+0.016
Neg logit contributions
aw-0.022
cakes-0.021
boards-0.021
 instruments-0.020
 bars-0.019
Token They
Token ID1119
Feature activation+1.64e+00

Pos logit contributions
aw+0.012
cakes+0.011
boards+0.011
First+0.011
P+0.011
Neg logit contributions
arter-0.013
 vain-0.013
 fortunate-0.013
 mistake-0.013
die-0.013
Token used
Token ID973
Feature activation+1.17e+00

Pos logit contributions
aw+0.033
cakes+0.030
boards+0.030
P+0.029
 instruments+0.029
Neg logit contributions
arter-0.038
 vain-0.037
 fortunate-0.037
die-0.036
 mistake-0.036
Token buckets
Token ID38674
Feature activation+1.53e+00

Pos logit contributions
aw+0.026
cakes+0.024
boards+0.023
P+0.023
 instruments+0.023
Neg logit contributions
arter-0.025
 fortunate-0.024
 vain-0.024
die-0.024
 mistake-0.023
Token and
Token ID290
Feature activation+7.39e-01

Pos logit contributions
aw+0.036
P+0.033
cakes+0.033
boards+0.032
 instruments+0.032
Neg logit contributions
arter-0.029
 vain-0.029
 fortunate-0.029
die-0.028
 mistake-0.028
Token m
Token ID285
Feature activation+5.31e+00

Pos logit contributions
aw+0.017
ican+0.015
cakes+0.015
boards+0.015
Her+0.015
Neg logit contributions
 way-0.016
 remembers-0.015
 vain-0.015
die-0.015
 fortunate-0.015
Tokenops
Token ID2840
Feature activation+3.22e+00

Pos logit contributions
P+0.116
cakes+0.106
Her+0.105
aw+0.104
Boo+0.102
Neg logit contributions
ubby-0.108
arter-0.108
 haven-0.101
 fault-0.099
 fortunate-0.099
Token to
Token ID284
Feature activation+4.30e-01

Pos logit contributions
aw+0.083
P+0.078
cakes+0.076
boards+0.074
First+0.074
Neg logit contributions
 vain-0.052
arter-0.052
 way-0.051
 fortunate-0.051
 haven-0.050
Token get
Token ID651
Feature activation+4.09e-01

Pos logit contributions
aw+0.010
cakes+0.009
boards+0.009
P+0.009
First+0.009
Neg logit contributions
arter-0.009
 vain-0.009
 mistake-0.008
 fortunate-0.008
die-0.008
Token rid
Token ID5755
Feature activation+1.80e+00

Pos logit contributions
aw+0.009
cakes+0.008
boards+0.008
P+0.008
 instruments+0.008
Neg logit contributions
arter-0.009
 vain-0.008
die-0.008
 fortunate-0.008
 mistake-0.008
Token of
Token ID286
Feature activation-8.74e-01

Pos logit contributions
aw+0.036
cakes+0.033
P+0.033
boards+0.033
First+0.032
Neg logit contributions
arter-0.041
 vain-0.040
 fortunate-0.040
die-0.039
ubby-0.039
Token came
Token ID1625
Feature activation-1.09e+00

Pos logit contributions
cakes+0.008
aw+0.008
boards+0.008
First+0.007
P+0.007
Neg logit contributions
arter-0.006
 remembers-0.006
 haven-0.006
 mistake-0.006
 vain-0.006
Token by
Token ID416
Feature activation-8.29e-01

Pos logit contributions
arter+0.023
 vain+0.022
 fortunate+0.022
die+0.022
 mistake+0.022
Neg logit contributions
aw-0.024
cakes-0.022
boards-0.022
P-0.022
 instruments-0.021
Token and
Token ID290
Feature activation-3.77e-01

Pos logit contributions
arter+0.014
 vain+0.014
 fortunate+0.013
die+0.013
 mistake+0.013
Neg logit contributions
aw-0.022
cakes-0.020
boards-0.020
P-0.020
 instruments-0.019
Token ate
Token ID15063
Feature activation+1.62e-01

Pos logit contributions
 vain+0.008
arter+0.007
 fortunate+0.007
die+0.007
 mistake+0.007
Neg logit contributions
aw-0.009
cakes-0.008
boards-0.008
P-0.008
First-0.008
Token a
Token ID257
Feature activation-8.45e-02

Pos logit contributions
aw+0.004
cakes+0.004
boards+0.004
P+0.004
First+0.003
Neg logit contributions
 vain-0.003
arter-0.003
 mistake-0.003
 fortunate-0.003
die-0.003
Token french
Token ID48718
Feature activation+5.25e+00

Pos logit contributions
 mistake+0.001
arter+0.001
 fortunate+0.001
die+0.001
 vain+0.001
Neg logit contributions
aw-0.002
boards-0.002
cakes-0.002
First-0.002
talk-0.002
Token fry
Token ID32959
Feature activation+2.01e+00

Pos logit contributions
aw+0.101
First+0.098
P+0.095
cakes+0.094
 instruments+0.092
Neg logit contributions
arter-0.122
 vain-0.114
 fortunate-0.114
 mistake-0.114
 remembers-0.112
Token from
Token ID422
Feature activation-1.08e+00

Pos logit contributions
aw+0.057
First+0.056
P+0.055
 instruments+0.055
cakes+0.055
Neg logit contributions
arter-0.025
 vain-0.025
 mistake-0.024
 fortunate-0.023
 way-0.023
Token his
Token ID465
Feature activation+4.03e-01

Pos logit contributions
arter+0.021
 fortunate+0.021
 vain+0.021
die+0.021
 haven+0.020
Neg logit contributions
aw-0.025
cakes-0.023
P-0.022
boards-0.022
 instruments-0.022
Token hands
Token ID2832
Feature activation+1.26e+00

Pos logit contributions
aw+0.010
cakes+0.009
boards+0.009
P+0.009
First+0.009
Neg logit contributions
arter-0.008
 vain-0.007
 fortunate-0.007
 mistake-0.007
die-0.007
Token.
Token ID13
Feature activation+4.51e-01

Pos logit contributions
aw+0.036
P+0.035
cakes+0.034
boards+0.033
First+0.033
Neg logit contributions
 vain-0.017
arter-0.017
 fortunate-0.017
 mistake-0.016
 way-0.016
Token put
Token ID1234
Feature activation+2.49e-01

Pos logit contributions
aw+0.086
boards+0.083
cakes+0.081
P+0.079
First+0.076
Neg logit contributions
 remembers-0.055
 thanks-0.054
 way-0.051
 has-0.051
 mistake-0.050
Token on
Token ID319
Feature activation-3.01e-01

Pos logit contributions
aw+0.006
cakes+0.005
P+0.005
boards+0.005
 instruments+0.005
Neg logit contributions
arter-0.005
 vain-0.005
 fortunate-0.005
 haven-0.005
die-0.005
Token their
Token ID511
Feature activation+2.98e+00

Pos logit contributions
dy+0.005
 way+0.005
die+0.005
 thanks+0.005
 mistake+0.005
Neg logit contributions
cakes-0.008
boards-0.008
aw-0.008
First-0.007
P-0.007
Token gray
Token ID12768
Feature activation+4.15e+00

Pos logit contributions
aw+0.057
cakes+0.049
boards+0.048
First+0.048
talk+0.046
Neg logit contributions
 way-0.073
 mistake-0.073
arter-0.073
 remembers-0.072
die-0.072
Token and
Token ID290
Feature activation-2.22e-01

Pos logit contributions
aw+0.099
First+0.093
cakes+0.091
talk+0.084
boards+0.083
Neg logit contributions
 way-0.086
 mistake-0.082
 sister-0.082
arter-0.081
 remembers-0.079
Token black
Token ID2042
Feature activation+5.22e+00

Pos logit contributions
 remembers+0.004
die+0.004
 has+0.004
 vain+0.004
 fortunate+0.004
Neg logit contributions
aw-0.006
boards-0.006
cakes-0.005
ican-0.005
P-0.005
Token clothes
Token ID8242
Feature activation+1.11e+00

Pos logit contributions
First+0.118
cakes+0.114
aw+0.111
qu+0.105
ican+0.105
Neg logit contributions
 sister-0.103
 way-0.102
 time-0.101
 brother-0.101
 has-0.099
Token and
Token ID290
Feature activation+4.26e-01

Pos logit contributions
aw+0.031
P+0.031
cakes+0.030
First+0.029
ican+0.029
Neg logit contributions
 way-0.014
 vain-0.014
 thanks-0.014
 mistake-0.014
 remembers-0.014
Token tie
Token ID9839
Feature activation+4.17e-01

Pos logit contributions
aw+0.010
boards+0.010
cakes+0.010
ican+0.010
P+0.009
Neg logit contributions
 remembers-0.009
 has-0.009
die-0.008
 way-0.008
 vain-0.008
Token scar
Token ID10153
Feature activation+1.01e+00

Pos logit contributions
aw+0.009
cakes+0.008
boards+0.008
P+0.008
First+0.008
Neg logit contributions
 vain-0.009
arter-0.009
die-0.009
 fortunate-0.009
dy-0.009
Tokenves
Token ID1158
Feature activation+1.96e+00

Pos logit contributions
aw+0.015
P+0.014
cakes+0.014
boards+0.013
First+0.013
Neg logit contributions
 vain-0.026
arter-0.026
 fortunate-0.026
 mistake-0.026
die-0.026
Token and
Token ID290
Feature activation+1.77e+00

Pos logit contributions
aw+0.040
P+0.037
cakes+0.037
ican+0.036
 instruments+0.036
Neg logit contributions
die-0.021
 remembers-0.021
dy-0.021
 mistake-0.020
 has-0.020
Token Tom
Token ID4186
Feature activation+2.52e+00

Pos logit contributions
P+0.054
cakes+0.054
aw+0.054
boards+0.054
talk+0.052
Neg logit contributions
 mistake-0.019
 way-0.019
 vain-0.019
 thanks-0.018
 remembers-0.018
Token put
Token ID1234
Feature activation+1.81e-01

Pos logit contributions
ican+0.066
P+0.063
aw+0.063
cakes+0.063
boards+0.061
Neg logit contributions
 remembers-0.045
 has-0.040
dy-0.038
 thanks-0.038
 loves-0.036
Token on
Token ID319
Feature activation-1.09e+00

Pos logit contributions
aw+0.004
cakes+0.004
P+0.004
boards+0.003
First+0.003
Neg logit contributions
arter-0.004
 haven-0.004
 fortunate-0.004
 mistake-0.004
die-0.004
Token their
Token ID511
Feature activation+3.01e+00

Pos logit contributions
 way+0.016
 vain+0.016
 mistake+0.016
die+0.015
dy+0.015
Neg logit contributions
aw-0.029
boards-0.029
cakes-0.028
P-0.027
First-0.027
Token apr
Token ID46593
Feature activation+5.21e+00

Pos logit contributions
aw+0.045
cakes+0.037
boards+0.037
P+0.035
First+0.035
Neg logit contributions
arter-0.085
 mistake-0.084
 vain-0.084
 fortunate-0.084
 way-0.083
Tokenons
Token ID684
Feature activation+1.85e+00

Pos logit contributions
aw+0.109
Run+0.106
ican+0.103
Boo+0.101
als+0.100
Neg logit contributions
 time-0.099
 mistake-0.098
 way-0.097
 solution-0.093
inkle-0.093
Token and
Token ID290
Feature activation+1.47e+00

Pos logit contributions
aw+0.058
Boo+0.056
ican+0.054
cakes+0.054
Ta+0.054
Neg logit contributions
 way-0.019
dy-0.018
 vain-0.017
 solution-0.017
 chance-0.017
Token washed
Token ID18989
Feature activation+1.62e-01

Pos logit contributions
aw+0.036
ican+0.035
boards+0.033
cakes+0.032
amb+0.032
Neg logit contributions
 remembers-0.028
 has-0.028
 way-0.027
die-0.026
dy-0.025
Token their
Token ID511
Feature activation+7.85e-01

Pos logit contributions
aw+0.004
cakes+0.003
boards+0.003
P+0.003
 instruments+0.003
Neg logit contributions
arter-0.003
 vain-0.003
 fortunate-0.003
 haven-0.003
die-0.003
Token hands
Token ID2832
Feature activation+2.38e-01

Pos logit contributions
aw+0.018
cakes+0.017
boards+0.016
P+0.016
First+0.016
Neg logit contributions
arter-0.016
 vain-0.016
 fortunate-0.016
die-0.015
 mistake-0.015
Token (space)
Token ID220
Feature activation-2.45e+00

Pos logit contributions
die+0.010
 way+0.010
 fortunate+0.010
 vain+0.010
 remembers+0.010
Neg logit contributions
aw-0.010
P-0.009
cakes-0.009
co-0.008
 instruments-0.008
Token (newline)
Token ID198
Feature activation-5.85e-01

Pos logit contributions
arter+0.060
 fortunate+0.059
 vain+0.058
 remembers+0.057
 mistake+0.056
Neg logit contributions
aw-0.047
First-0.046
cakes-0.045
Her-0.045
She-0.044
Token (newline)
Token ID198
Feature activation-5.54e-01

Pos logit contributions
arter+0.009
 vain+0.009
die+0.009
 fortunate+0.009
 mistake+0.009
Neg logit contributions
aw-0.016
cakes-0.015
boards-0.015
P-0.014
 instruments-0.014
TokenHe
Token ID1544
Feature activation-8.38e-02

Pos logit contributions
die+0.012
 fortunate+0.011
dy+0.011
 moment+0.011
 vain+0.010
Neg logit contributions
aw-0.012
 umb-0.012
boards-0.011
cakes-0.011
ican-0.011
Token carefully
Token ID7773
Feature activation+1.88e+00

Pos logit contributions
 remembers+0.001
die+0.001
 vain+0.001
 mistake+0.001
dy+0.001
Neg logit contributions
aw-0.002
cakes-0.002
boards-0.002
ican-0.002
P-0.002
Token m
Token ID285
Feature activation+5.21e+00

Pos logit contributions
aw+0.049
cakes+0.046
P+0.045
boards+0.044
 instruments+0.044
Neg logit contributions
 vain-0.030
 mistake-0.030
 remembers-0.029
 fortunate-0.028
die-0.028
Tokenended
Token ID1631
Feature activation-9.68e-01

Pos logit contributions
 instruments+0.126
P+0.125
aw+0.119
Run+0.119
als+0.118
Neg logit contributions
arter-0.085
 mistake-0.084
ubby-0.084
 vain-0.083
 haven-0.080
Token the
Token ID262
Feature activation-2.84e-02

Pos logit contributions
arter+0.024
 fortunate+0.023
 haven+0.023
 vain+0.022
 remembers+0.022
Neg logit contributions
aw-0.019
cakes-0.017
 instruments-0.017
Her-0.017
P-0.017
Token bike
Token ID7161
Feature activation+1.35e+00

Pos logit contributions
arter+0.001
die+0.001
 fortunate+0.001
 vain+0.001
dy+0.001
Neg logit contributions
aw-0.001
boards-0.001
cakes-0.001
 instruments-0.001
 bars-0.001
Token and
Token ID290
Feature activation+1.10e-01

Pos logit contributions
aw+0.033
P+0.031
cakes+0.030
boards+0.029
 instruments+0.029
Neg logit contributions
 vain-0.025
arter-0.024
 mistake-0.024
 haven-0.024
 fortunate-0.024
Token made
Token ID925
Feature activation+2.85e+00

Pos logit contributions
aw+0.003
cakes+0.002
boards+0.002
P+0.002
 instruments+0.002
Neg logit contributions
 vain-0.002
arter-0.002
 fortunate-0.002
 mistake-0.002
die-0.002
Token put
Token ID1234
Feature activation+8.16e-02

Pos logit contributions
aw+0.009
cakes+0.009
boards+0.008
P+0.008
 instruments+0.008
Neg logit contributions
 remembers-0.009
 haven-0.009
 fortunate-0.009
 vain-0.009
arter-0.008
Token on
Token ID319
Feature activation-1.33e+00

Pos logit contributions
aw+0.002
boards+0.002
cakes+0.002
First+0.002
P+0.002
Neg logit contributions
arter-0.001
 vain-0.001
die-0.001
 fortunate-0.001
 remembers-0.001
Token their
Token ID511
Feature activation+2.25e+00

Pos logit contributions
die+0.021
dy+0.021
 way+0.021
 thanks+0.020
 mistake+0.020
Neg logit contributions
cakes-0.035
aw-0.035
boards-0.034
talk-0.033
ican-0.033
Token coats
Token ID30720
Feature activation+1.48e+00

Pos logit contributions
aw+0.045
cakes+0.039
boards+0.038
First+0.037
talk+0.037
Neg logit contributions
 way-0.056
arter-0.053
 mistake-0.053
 beloved-0.053
 remembers-0.052
Token and
Token ID290
Feature activation-6.10e-02

Pos logit contributions
aw+0.052
P+0.051
Ta+0.051
Boo+0.050
Run+0.050
Neg logit contributions
 way-0.012
 chance-0.009
 time-0.009
dy-0.008
 haven-0.008
Token m
Token ID285
Feature activation+5.20e+00

Pos logit contributions
 remembers+0.001
 way+0.001
 has+0.001
die+0.001
dy+0.001
Neg logit contributions
aw-0.001
ican-0.001
boards-0.001
cakes-0.001
qu-0.001
Tokenittens
Token ID34978
Feature activation+2.00e+00

Pos logit contributions
P+0.123
aw+0.116
boards+0.115
cakes+0.113
Boo+0.112
Neg logit contributions
bie-0.098
arter-0.096
ubby-0.090
 fault-0.088
 mistake-0.083
Token.
Token ID13
Feature activation-1.08e+00

Pos logit contributions
aw+0.066
Ta+0.066
P+0.066
Boo+0.064
cakes+0.064
Neg logit contributions
 way-0.019
 has-0.014
dy-0.014
 vain-0.014
 haven-0.014
Token They
Token ID1119
Feature activation+1.29e+00

Pos logit contributions
arter+0.024
 vain+0.023
 mistake+0.023
 fortunate+0.022
ubby+0.022
Neg logit contributions
aw-0.024
cakes-0.022
boards-0.022
First-0.022
 instruments-0.021
Token wanted
Token ID2227
Feature activation+7.45e-01

Pos logit contributions
cakes+0.036
aw+0.035
ican+0.034
boards+0.034
P+0.033
Neg logit contributions
 haven-0.020
 remembers-0.020
 has-0.019
 vain-0.019
 way-0.018
Token to
Token ID284
Feature activation-6.07e-01

Pos logit contributions
aw+0.021
P+0.020
ican+0.019
cakes+0.019
amb+0.018
Neg logit contributions
 way-0.012
 time-0.011
 mistake-0.011
 chance-0.011
die-0.011
Token excited
Token ID6568
Feature activation+1.73e+00

Pos logit contributions
arter+0.002
 vain+0.002
die+0.002
 mistake+0.002
dy+0.002
Neg logit contributions
aw-0.003
cakes-0.003
boards-0.003
P-0.003
First-0.002
Token he
Token ID339
Feature activation-6.39e-01

Pos logit contributions
aw+0.042
P+0.040
cakes+0.039
First+0.038
Her+0.038
Neg logit contributions
arter-0.031
 way-0.030
 mistake-0.030
 fortunate-0.030
die-0.030
Token put
Token ID1234
Feature activation+4.03e-02

Pos logit contributions
arter+0.011
 vain+0.011
 fortunate+0.011
die+0.011
 mistake+0.011
Neg logit contributions
aw-0.016
cakes-0.015
boards-0.015
P-0.015
First-0.014
Token on
Token ID319
Feature activation-1.20e+00

Pos logit contributions
aw+0.001
cakes+0.001
boards+0.001
P+0.001
 instruments+0.001
Neg logit contributions
arter-0.001
 vain-0.001
 fortunate-0.001
die-0.001
 mistake-0.001
Token his
Token ID465
Feature activation+2.18e+00

Pos logit contributions
 way+0.016
die+0.016
 thanks+0.016
 mistake+0.016
 vain+0.015
Neg logit contributions
aw-0.034
cakes-0.033
talk-0.032
boards-0.032
First-0.032
Token m
Token ID285
Feature activation+5.18e+00

Pos logit contributions
aw+0.057
cakes+0.052
boards+0.052
P+0.051
First+0.050
Neg logit contributions
arter-0.037
 vain-0.036
 fortunate-0.036
die-0.035
 mistake-0.035
Tokenittens
Token ID34978
Feature activation+2.58e+00

Pos logit contributions
aw+0.117
P+0.106
ican+0.105
Run+0.103
First+0.103
Neg logit contributions
bie-0.103
arter-0.102
 mistake-0.100
ubby-0.097
 way-0.097
Token and
Token ID290
Feature activation-9.09e-03

Pos logit contributions
aw+0.073
P+0.069
Her+0.067
Ta+0.066
cakes+0.066
Neg logit contributions
 way-0.038
 vain-0.033
 mistake-0.033
arter-0.033
 fortunate-0.032
Token went
Token ID1816
Feature activation-4.45e-01

Pos logit contributions
 way+0.000
 remembers+0.000
die+0.000
 has+0.000
dy+0.000
Neg logit contributions
aw-0.000
ican-0.000
P-0.000
boards-0.000
First-0.000
Token outside
Token ID2354
Feature activation-6.36e-01

Pos logit contributions
arter+0.009
 way+0.009
die+0.009
 vain+0.009
 mistake+0.009
Neg logit contributions
aw-0.011
P-0.009
cakes-0.009
boards-0.009
First-0.009
Token to
Token ID284
Feature activation+1.15e+00

Pos logit contributions
arter+0.011
 vain+0.011
 mistake+0.011
 fortunate+0.011
 way+0.011
Neg logit contributions
aw-0.017
cakes-0.015
P-0.015
boards-0.015
First-0.014
Token Mom
Token ID11254
Feature activation-2.19e-01

Pos logit contributions
aw+0.035
cakes+0.032
boards+0.031
P+0.030
talk+0.030
Neg logit contributions
arter-0.041
 mistake-0.040
 vain-0.040
 fortunate-0.039
die-0.039
Token.
Token ID13
Feature activation-6.97e-01

Pos logit contributions
arter+0.005
 vain+0.005
 fortunate+0.005
 mistake+0.005
ubby+0.005
Neg logit contributions
aw-0.005
cakes-0.005
boards-0.004
First-0.004
co-0.004
Token They
Token ID1119
Feature activation+1.17e+00

Pos logit contributions
arter+0.013
 vain+0.013
 fortunate+0.013
die+0.013
 mistake+0.013
Neg logit contributions
aw-0.017
cakes-0.016
boards-0.015
P-0.015
 instruments-0.015
Token put
Token ID1234
Feature activation+2.41e-01

Pos logit contributions
aw+0.032
cakes+0.029
ican+0.028
P+0.028
 instruments+0.028
Neg logit contributions
 haven-0.020
 vain-0.019
 remembers-0.019
die-0.018
dy-0.017
Token on
Token ID319
Feature activation-7.91e-01

Pos logit contributions
aw+0.005
cakes+0.004
boards+0.004
P+0.004
 instruments+0.004
Neg logit contributions
arter-0.006
 vain-0.006
 fortunate-0.006
die-0.006
 mistake-0.006
Token apr
Token ID46593
Feature activation+5.17e+00

Pos logit contributions
 way+0.013
 mistake+0.012
dy+0.012
 time+0.012
 size+0.012
Neg logit contributions
aw-0.021
amb-0.020
ican-0.020
talk-0.020
boards-0.019
Tokenons
Token ID684
Feature activation+1.40e+00

Pos logit contributions
aw+0.107
ican+0.107
Run+0.106
als+0.102
Boo+0.100
Neg logit contributions
 time-0.098
 way-0.095
 supposed-0.090
 pays-0.090
inkle-0.090
Token to
Token ID284
Feature activation+1.68e+00

Pos logit contributions
aw+0.042
ican+0.041
cakes+0.040
Boo+0.040
Ta+0.040
Neg logit contributions
dy-0.015
 way-0.014
 has-0.014
die-0.013
 time-0.013
Token keep
Token ID1394
Feature activation-3.54e-01

Pos logit contributions
aw+0.049
boards+0.045
cakes+0.044
ican+0.044
amb+0.043
Neg logit contributions
 way-0.025
 mistake-0.024
arter-0.023
 normal-0.023
 mind-0.023
Token their
Token ID511
Feature activation-5.28e-02

Pos logit contributions
arter+0.008
 fortunate+0.008
 vain+0.008
die+0.008
 haven+0.008
Neg logit contributions
aw-0.007
cakes-0.006
P-0.006
boards-0.006
 instruments-0.006
Token clothes
Token ID8242
Feature activation-3.83e-01

Pos logit contributions
arter+0.001
 vain+0.001
 fortunate+0.001
die+0.001
 haven+0.001
Neg logit contributions
aw-0.001
cakes-0.001
boards-0.001
P-0.001
 instruments-0.001
TokenThe
Token ID464
Feature activation-1.37e+00

Pos logit contributions
die+0.024
dy+0.023
 mistake+0.022
 fortunate+0.022
 vain+0.021
Neg logit contributions
cakes-0.020
boards-0.019
aw-0.017
ican-0.016
 instruments-0.016
Token little
Token ID1310
Feature activation-2.23e+00

Pos logit contributions
arter+0.019
 vain+0.018
 fortunate+0.017
die+0.017
 haven+0.016
Neg logit contributions
aw-0.040
cakes-0.039
P-0.039
boards-0.038
 instruments-0.038
Token girl
Token ID2576
Feature activation+5.71e-01

Pos logit contributions
arter+0.036
 vain+0.035
 fortunate+0.035
die+0.034
 mistake+0.034
Neg logit contributions
aw-0.061
cakes-0.056
P-0.056
boards-0.056
 instruments-0.054
Token spent
Token ID3377
Feature activation+1.54e+00

Pos logit contributions
aw+0.016
cakes+0.016
boards+0.015
P+0.015
First+0.015
Neg logit contributions
 remembers-0.008
dy-0.008
 mistake-0.008
die-0.008
 way-0.008
Token an
Token ID281
Feature activation+1.85e+00

Pos logit contributions
aw+0.033
P+0.031
 instruments+0.028
cakes+0.028
First+0.028
Neg logit contributions
arter-0.037
 haven-0.034
 vain-0.034
 fortunate-0.034
die-0.034
Token hour
Token ID1711
Feature activation+5.16e+00

Pos logit contributions
aw+0.056
boards+0.054
cakes+0.054
als+0.052
 instruments+0.051
Neg logit contributions
 fortunate-0.022
ubby-0.022
 chance-0.021
 mistake-0.021
 way-0.021
Token gathering
Token ID11228
Feature activation+9.46e-01

Pos logit contributions
aw+0.154
cakes+0.148
ican+0.143
boards+0.140
Run+0.140
Neg logit contributions
 way-0.062
 size-0.062
 solution-0.057
 time-0.057
 breathing-0.057
Token the
Token ID262
Feature activation-3.93e-03

Pos logit contributions
aw+0.018
cakes+0.016
boards+0.016
P+0.016
First+0.016
Neg logit contributions
 fortunate-0.022
 vain-0.021
 way-0.021
 mistake-0.021
arter-0.021
Token little
Token ID1310
Feature activation-8.14e-01

Pos logit contributions
 fortunate+0.000
 way+0.000
 vain+0.000
 mistake+0.000
arter+0.000
Neg logit contributions
aw-0.000
cakes-0.000
P-0.000
boards-0.000
First-0.000
Token clothes
Token ID8242
Feature activation+1.46e+00

Pos logit contributions
 way+0.022
 fortunate+0.022
 remembers+0.021
 mistake+0.021
 vain+0.021
Neg logit contributions
aw-0.013
cakes-0.011
P-0.011
boards-0.011
ican-0.011
Token.
Token ID13
Feature activation+3.17e-01

Pos logit contributions
aw+0.040
P+0.037
First+0.035
cakes+0.035
Run+0.035
Neg logit contributions
 way-0.022
 fortunate-0.022
arter-0.022
 haven-0.022
 vain-0.022
Token the
Token ID262
Feature activation-9.70e-01

Pos logit contributions
arter+0.000
 vain+0.000
 fortunate+0.000
 mistake+0.000
die+0.000
Neg logit contributions
aw-0.001
cakes-0.000
P-0.000
boards-0.000
 instruments-0.000
Token bird
Token ID6512
Feature activation-1.13e+00

Pos logit contributions
arter+0.024
 vain+0.023
die+0.023
 fortunate+0.023
dy+0.022
Neg logit contributions
aw-0.019
cakes-0.017
P-0.016
boards-0.016
 instruments-0.016
Token in
Token ID287
Feature activation-7.94e-01

Pos logit contributions
die+0.020
 fortunate+0.019
arter+0.019
 vain+0.019
dy+0.019
Neg logit contributions
aw-0.028
cakes-0.027
boards-0.027
P-0.025
First-0.025
Token a
Token ID257
Feature activation+2.08e+00

Pos logit contributions
arter+0.019
 fortunate+0.018
 vain+0.018
die+0.018
 remembers+0.018
Neg logit contributions
aw-0.016
cakes-0.014
P-0.014
boards-0.014
First-0.013
Token bird
Token ID6512
Feature activation+1.01e+00

Pos logit contributions
aw+0.054
cakes+0.052
boards+0.051
 instruments+0.048
First+0.048
Neg logit contributions
 mistake-0.035
 vain-0.035
arter-0.035
 fortunate-0.034
die-0.034
Tokenc
Token ID66
Feature activation+5.15e+00

Pos logit contributions
cakes+0.030
boards+0.030
aw+0.028
First+0.027
ican+0.027
Neg logit contributions
 way-0.014
dy-0.012
die-0.012
 breathing-0.012
 mistake-0.012
Tokenage
Token ID496
Feature activation+3.04e-01

Pos logit contributions
Run+0.110
 instruments+0.108
ican+0.106
cakes+0.106
boards+0.099
Neg logit contributions
 way-0.102
 time-0.099
 mistake-0.096
 fault-0.095
 thanks-0.094
Token and
Token ID290
Feature activation+6.97e-01

Pos logit contributions
aw+0.007
cakes+0.007
boards+0.007
P+0.007
 instruments+0.007
Neg logit contributions
 vain-0.005
arter-0.005
 fortunate-0.005
 mistake-0.005
die-0.005
Token gave
Token ID2921
Feature activation+1.13e+00

Pos logit contributions
aw+0.017
boards+0.016
cakes+0.016
P+0.015
First+0.015
Neg logit contributions
 remembers-0.013
 mistake-0.013
 vain-0.013
 way-0.013
die-0.013
Token it
Token ID340
Feature activation-2.60e-01

Pos logit contributions
aw+0.025
cakes+0.022
P+0.022
boards+0.022
 instruments+0.021
Neg logit contributions
arter-0.025
 vain-0.024
 fortunate-0.024
die-0.023
 mistake-0.023
Token food
Token ID2057
Feature activation-3.81e-01

Pos logit contributions
 chance+0.005
 way+0.005
 thanks+0.005
die+0.005
 beloved+0.005
Neg logit contributions
aw-0.006
boards-0.006
ican-0.006
cakes-0.006
co-0.005
Token put
Token ID1234
Feature activation-1.26e-01

Pos logit contributions
die+0.016
 remembers+0.016
dy+0.015
 vain+0.015
 way+0.015
Neg logit contributions
aw-0.020
cakes-0.020
boards-0.019
P-0.018
ican-0.018
Token on
Token ID319
Feature activation-8.70e-01

Pos logit contributions
 vain+0.002
die+0.002
 way+0.002
 remembers+0.002
arter+0.002
Neg logit contributions
aw-0.003
cakes-0.003
boards-0.003
First-0.003
P-0.003
Token her
Token ID607
Feature activation+1.86e+00

Pos logit contributions
 way+0.015
 mistake+0.013
 size+0.013
 thanks+0.013
dy+0.013
Neg logit contributions
aw-0.023
cakes-0.022
P-0.021
boards-0.021
talk-0.021
Token hat
Token ID6877
Feature activation+8.82e-01

Pos logit contributions
aw+0.037
cakes+0.033
boards+0.033
P+0.032
 instruments+0.031
Neg logit contributions
arter-0.044
 vain-0.043
 fortunate-0.043
die-0.042
 mistake-0.042
Token and
Token ID290
Feature activation+1.29e+00

Pos logit contributions
aw+0.024
P+0.024
cakes+0.022
First+0.022
 instruments+0.022
Neg logit contributions
 way-0.013
 haven-0.012
 vain-0.011
 remembers-0.011
die-0.011
Token m
Token ID285
Feature activation+5.11e+00

Pos logit contributions
ican+0.033
aw+0.033
cakes+0.031
boards+0.029
ban+0.029
Neg logit contributions
 remembers-0.024
 way-0.024
die-0.022
 brother-0.022
bie-0.021
Tokenittens
Token ID34978
Feature activation+1.96e+00

Pos logit contributions
P+0.113
aw+0.105
cakes+0.104
 instruments+0.103
First+0.101
Neg logit contributions
bie-0.100
arter-0.098
 haven-0.097
ubby-0.094
lad-0.090
Token and
Token ID290
Feature activation+1.58e+00

Pos logit contributions
aw+0.054
P+0.053
cakes+0.051
First+0.049
boards+0.049
Neg logit contributions
 vain-0.028
 way-0.027
 haven-0.027
 remembers-0.026
 mistake-0.026
Token went
Token ID1816
Feature activation-5.60e-01

Pos logit contributions
ican+0.039
aw+0.038
cakes+0.036
P+0.036
boards+0.035
Neg logit contributions
 way-0.029
 remembers-0.028
bie-0.026
die-0.026
 brother-0.026
Token outside
Token ID2354
Feature activation-1.13e+00

Pos logit contributions
 way+0.010
arter+0.009
 chance+0.009
dy+0.009
die+0.009
Neg logit contributions
aw-0.015
P-0.013
cakes-0.013
boards-0.013
First-0.013
Token.
Token ID13
Feature activation+5.00e-01

Pos logit contributions
 way+0.015
 mistake+0.015
 vain+0.015
 fortunate+0.015
arter+0.015
Neg logit contributions
aw-0.034
P-0.031
cakes-0.031
boards-0.031
First-0.029
Token!"
Token ID2474
Feature activation+1.73e+00

Pos logit contributions
arter+0.016
 vain+0.016
 fortunate+0.015
die+0.015
 mistake+0.015
Neg logit contributions
aw-0.027
cakes-0.026
boards-0.025
P-0.025
First-0.024
Token\n
Token ID198
Feature activation+2.48e-01

Pos logit contributions
P+0.050
aw+0.050
cakes+0.049
boards+0.049
ican+0.049
Neg logit contributions
 thanks-0.020
 way-0.019
 haven-0.018
 hasn-0.018
 has-0.018
Token\n
Token ID198
Feature activation+4.03e-01

Pos logit contributions
aw+0.007
boards+0.007
cakes+0.007
 instruments+0.007
P+0.007
Neg logit contributions
die-0.003
 haven-0.003
arter-0.003
 mistake-0.003
 fortunate-0.003
TokenTo
Token ID2514
Feature activation-3.94e-01

Pos logit contributions
boards+0.010
 umb+0.010
aw+0.010
 shorts+0.009
als+0.009
Neg logit contributions
 mistake-0.007
 true-0.007
die-0.006
 haven-0.006
lad-0.006
Token her
Token ID607
Feature activation+8.30e-01

Pos logit contributions
arter+0.008
 mistake+0.007
 vain+0.007
 fortunate+0.007
 haven+0.007
Neg logit contributions
aw-0.010
boards-0.009
cakes-0.009
P-0.009
First-0.009
Token su
Token ID424
Feature activation+5.09e+00

Pos logit contributions
aw+0.020
P+0.018
cakes+0.018
 instruments+0.018
First+0.017
Neg logit contributions
arter-0.016
 haven-0.016
 fortunate-0.015
 vain-0.015
die-0.015
Tokenprise
Token ID7919
Feature activation+1.75e+00

Pos logit contributions
 shorts+0.098
 instruments+0.097
edly+0.095
 sneakers+0.092
 shoes+0.091
Neg logit contributions
lad-0.100
reetings-0.094
 solution-0.093
bie-0.093
ubby-0.093
Token,
Token ID11
Feature activation+1.97e+00

Pos logit contributions
aw+0.045
cakes+0.042
P+0.042
boards+0.041
First+0.041
Neg logit contributions
arter-0.030
 vain-0.029
 fortunate-0.029
 mistake-0.029
 remembers-0.028
Token the
Token ID262
Feature activation-1.13e+00

Pos logit contributions
aw+0.056
boards+0.052
cakes+0.052
P+0.051
First+0.050
Neg logit contributions
 vain-0.029
arter-0.029
 fortunate-0.029
 haven-0.028
 mistake-0.028
Token thunder
Token ID18355
Feature activation-5.90e-01

Pos logit contributions
arter+0.022
 mistake+0.022
 vain+0.022
 fortunate+0.021
 way+0.021
Neg logit contributions
aw-0.027
P-0.024
cakes-0.024
boards-0.024
talk-0.024
Token went
Token ID1816
Feature activation-5.51e-01

Pos logit contributions
arter+0.012
 vain+0.011
 fortunate+0.011
die+0.011
 mistake+0.011
Neg logit contributions
aw-0.014
cakes-0.013
boards-0.013
P-0.012
 instruments-0.012
Token,
Token ID11
Feature activation+5.23e-01

Pos logit contributions
P+0.057
aw+0.056
Boo+0.054
Ta+0.054
cakes+0.053
Neg logit contributions
 way-0.019
 size-0.015
 chance-0.014
 has-0.014
dy-0.013
Token hats
Token ID23910
Feature activation+6.48e-01

Pos logit contributions
aw+0.011
cakes+0.011
ican+0.011
quet+0.010
amb+0.010
Neg logit contributions
 way-0.012
 has-0.012
 remembers-0.011
dy-0.011
die-0.011
Token,
Token ID11
Feature activation-4.12e-01

Pos logit contributions
aw+0.018
cakes+0.017
P+0.017
boards+0.017
First+0.016
Neg logit contributions
arter-0.010
 vain-0.009
 fortunate-0.009
 haven-0.009
 mistake-0.009
Token boots
Token ID14412
Feature activation+1.00e+00

Pos logit contributions
 has+0.008
 remembers+0.007
 way+0.007
 thanks+0.007
dy+0.006
Neg logit contributions
cakes-0.010
aw-0.010
ican-0.010
quet-0.009
ban-0.009
Token and
Token ID290
Feature activation+6.38e-01

Pos logit contributions
aw+0.030
P+0.029
cakes+0.029
First+0.028
boards+0.028
Neg logit contributions
 way-0.012
arter-0.011
 vain-0.011
 haven-0.011
 mistake-0.011
Token m
Token ID285
Feature activation+5.04e+00

Pos logit contributions
ican+0.016
cakes+0.015
quet+0.014
aw+0.014
amb+0.014
Neg logit contributions
 has-0.014
 remembers-0.013
 way-0.013
 good-0.012
 real-0.012
Tokenittens
Token ID34978
Feature activation+1.17e+00

Pos logit contributions
P+0.115
cakes+0.106
aw+0.105
 instruments+0.103
Boo+0.101
Neg logit contributions
arter-0.098
 fault-0.092
 haven-0.091
bie-0.090
ubby-0.089
Token.
Token ID13
Feature activation-1.23e+00

Pos logit contributions
aw+0.036
P+0.036
cakes+0.035
First+0.034
boards+0.034
Neg logit contributions
 way-0.013
arter-0.012
 vain-0.012
 haven-0.012
 has-0.011
Token Lily
Token ID20037
Feature activation+2.99e-01

Pos logit contributions
 vain+0.029
arter+0.029
 fortunate+0.029
 mistake+0.029
die+0.028
Neg logit contributions
aw-0.024
cakes-0.022
boards-0.022
First-0.022
P-0.021
Token had
Token ID550
Feature activation+1.74e+00

Pos logit contributions
aw+0.008
cakes+0.007
boards+0.007
P+0.007
First+0.007
Neg logit contributions
 vain-0.005
arter-0.005
die-0.005
 fortunate-0.005
 remembers-0.005
Token thick
Token ID6546
Feature activation+1.64e+00

Pos logit contributions
cakes+0.049
ican+0.048
amb+0.047
boards+0.046
talk+0.045
Neg logit contributions
 way-0.030
 time-0.027
die-0.026
 has-0.025
 normal-0.025
Token cups
Token ID14180
Feature activation+2.85e+00

Pos logit contributions
aw+0.026
cakes+0.024
boards+0.023
P+0.023
First+0.022
Neg logit contributions
arter-0.026
 vain-0.025
 fortunate-0.025
die-0.025
 mistake-0.025
Token and
Token ID290
Feature activation+1.75e-01

Pos logit contributions
aw+0.076
P+0.071
ican+0.069
 instruments+0.068
Her+0.067
Neg logit contributions
arter-0.043
 way-0.043
 has-0.042
 vain-0.041
dy-0.041
Token sp
Token ID599
Feature activation+3.30e+00

Pos logit contributions
aw+0.004
ican+0.004
cakes+0.004
Her+0.004
talk+0.004
Neg logit contributions
 size-0.003
 way-0.003
dy-0.003
 remembers-0.003
 has-0.003
Tokenoons
Token ID13022
Feature activation+1.29e-01

Pos logit contributions
aw+0.069
P+0.065
ican+0.063
 instruments+0.062
Her+0.061
Neg logit contributions
arter-0.069
ubby-0.067
 fortunate-0.065
dy-0.064
 vain-0.063
Token and
Token ID290
Feature activation+9.61e-01

Pos logit contributions
aw+0.003
cakes+0.002
boards+0.002
P+0.002
 instruments+0.002
Neg logit contributions
arter-0.003
 vain-0.003
 fortunate-0.003
die-0.003
 mistake-0.003
Token m
Token ID285
Feature activation+5.04e+00

Pos logit contributions
aw+0.023
ican+0.023
cakes+0.021
Her+0.021
P+0.021
Neg logit contributions
 size-0.019
 way-0.019
 remembers-0.018
 has-0.018
dy-0.018
Tokenolds
Token ID10119
Feature activation+9.14e-01

Pos logit contributions
P+0.108
aw+0.107
Run+0.102
 instruments+0.101
Her+0.100
Neg logit contributions
arter-0.103
 haven-0.095
ubby-0.094
bie-0.094
 fault-0.091
Token to
Token ID284
Feature activation-1.27e+00

Pos logit contributions
aw+0.022
cakes+0.021
boards+0.020
P+0.020
 instruments+0.020
Neg logit contributions
arter-0.017
 vain-0.017
 fortunate-0.017
die-0.017
 mistake-0.016
Token make
Token ID787
Feature activation+2.08e+00

Pos logit contributions
arter+0.028
 fortunate+0.028
 vain+0.028
die+0.027
 mistake+0.027
Neg logit contributions
aw-0.027
cakes-0.024
boards-0.024
P-0.024
 instruments-0.023
Token shapes
Token ID15268
Feature activation-8.60e-01

Pos logit contributions
aw+0.044
cakes+0.043
boards+0.042
P+0.041
First+0.040
Neg logit contributions
arter-0.043
 mistake-0.043
 vain-0.042
 fortunate-0.042
die-0.042
Token in
Token ID287
Feature activation-2.04e+00

Pos logit contributions
arter+0.013
 fortunate+0.013
 mistake+0.013
dy+0.012
 vain+0.012
Neg logit contributions
aw-0.024
P-0.022
First-0.022
boards-0.022
cakes-0.022
Token "
Token ID366
Feature activation+1.09e+00

Pos logit contributions
arter+0.021
 vain+0.020
die+0.019
 fortunate+0.019
ubby+0.019
Neg logit contributions
aw-0.034
cakes-0.033
boards-0.032
Her-0.031
P-0.031
TokenOf
Token ID5189
Feature activation+7.25e-02

Pos logit contributions
aw+0.022
 umb+0.020
 bars+0.020
boards+0.019
ican+0.018
Neg logit contributions
 moment-0.025
die-0.025
 fortunate-0.024
 haven-0.024
lad-0.023
Token course
Token ID1781
Feature activation-9.11e-01

Pos logit contributions
aw+0.001
P+0.001
cakes+0.000
boards+0.000
talk+0.000
Neg logit contributions
arter-0.003
 fortunate-0.003
die-0.002
 vain-0.002
 haven-0.002
Token!"
Token ID2474
Feature activation+1.55e+00

Pos logit contributions
arter+0.023
 vain+0.021
die+0.021
 fortunate+0.021
 fossil+0.020
Neg logit contributions
aw-0.018
cakes-0.016
boards-0.015
P-0.015
als-0.014
Token He
Token ID679
Feature activation+1.99e-01

Pos logit contributions
aw+0.039
cakes+0.035
boards+0.035
P+0.035
 instruments+0.035
Neg logit contributions
 fortunate-0.028
die-0.027
arter-0.027
 vain-0.027
 mistake-0.026
Token m
Token ID285
Feature activation+5.01e+00

Pos logit contributions
aw+0.005
boards+0.005
cakes+0.005
P+0.005
First+0.005
Neg logit contributions
die-0.003
 remembers-0.003
dy-0.003
 vain-0.003
 mistake-0.003
Tokenended
Token ID1631
Feature activation-1.33e+00

Pos logit contributions
aw+0.120
P+0.117
 instruments+0.117
boards+0.113
cakes+0.110
Neg logit contributions
arter-0.088
 vain-0.088
 haven-0.088
 mistake-0.086
ubby-0.085
Token the
Token ID262
Feature activation+4.10e-01

Pos logit contributions
arter+0.034
 fortunate+0.033
 haven+0.033
 vain+0.030
 remembers+0.030
Neg logit contributions
aw-0.026
co-0.023
cakes-0.022
 instruments-0.022
Her-0.022
Token m
Token ID285
Feature activation+3.10e+00

Pos logit contributions
aw+0.008
P+0.007
cakes+0.007
boards+0.007
 instruments+0.007
Neg logit contributions
arter-0.010
 fortunate-0.009
 vain-0.009
die-0.009
 haven-0.009
Tokenitten
Token ID2621
Feature activation+2.36e+00

Pos logit contributions
aw+0.065
 instruments+0.061
P+0.060
boards+0.060
cakes+0.059
Neg logit contributions
arter-0.066
ubby-0.064
 vain-0.063
 mistake-0.062
 haven-0.062
Token with
Token ID351
Feature activation+1.43e-01

Pos logit contributions
aw+0.059
P+0.055
cakes+0.054
boards+0.053
 instruments+0.053
Neg logit contributions
 vain-0.041
arter-0.041
 mistake-0.040
 way-0.040
 haven-0.039
Token boy
Token ID2933
Feature activation+8.15e-01

Pos logit contributions
arter+0.034
 vain+0.032
die+0.032
 fortunate+0.032
 haven+0.031
Neg logit contributions
aw-0.035
cakes-0.033
boards-0.032
P-0.032
First-0.032
Token took
Token ID1718
Feature activation+1.49e+00

Pos logit contributions
aw+0.021
cakes+0.021
P+0.021
boards+0.020
First+0.020
Neg logit contributions
 remembers-0.013
dy-0.012
die-0.012
 mistake-0.012
 has-0.012
Token his
Token ID465
Feature activation+2.07e-01

Pos logit contributions
aw+0.034
cakes+0.031
P+0.031
boards+0.031
 instruments+0.030
Neg logit contributions
arter-0.032
 vain-0.030
 fortunate-0.030
 haven-0.030
 mistake-0.030
Token shovel
Token ID33677
Feature activation-3.84e-01

Pos logit contributions
P+0.003
boards+0.003
 instruments+0.003
 shorts+0.003
 se+0.003
Neg logit contributions
arter-0.006
 fortunate-0.005
 vain-0.005
die-0.005
 haven-0.005
Token and
Token ID290
Feature activation+1.05e+00

Pos logit contributions
 vain+0.007
die+0.007
dy+0.007
arter+0.007
 fortunate+0.007
Neg logit contributions
aw-0.010
cakes-0.009
P-0.009
 instruments-0.009
boards-0.009
Token m
Token ID285
Feature activation+4.99e+00

Pos logit contributions
aw+0.028
ican+0.026
co+0.025
cakes+0.025
P+0.024
Neg logit contributions
 way-0.019
dy-0.018
 remembers-0.018
die-0.018
 size-0.017
Tokenop
Token ID404
Feature activation+2.11e+00

Pos logit contributions
aw+0.103
P+0.100
 instruments+0.099
cakes+0.096
boards+0.095
Neg logit contributions
arter-0.104
ubby-0.102
 haven-0.102
bie-0.101
 vain-0.100
Token outside
Token ID2354
Feature activation-2.02e+00

Pos logit contributions
aw+0.053
P+0.049
cakes+0.048
First+0.047
Her+0.046
Neg logit contributions
dy-0.036
 vain-0.036
 way-0.035
 mistake-0.035
 remembers-0.034
Token to
Token ID284
Feature activation-1.88e-01

Pos logit contributions
arter+0.028
 vain+0.028
 mistake+0.028
 way+0.028
 fortunate+0.028
Neg logit contributions
aw-0.058
P-0.055
cakes-0.054
boards-0.053
First-0.052
Token the
Token ID262
Feature activation+1.31e+00

Pos logit contributions
 vain+0.004
arter+0.004
 fortunate+0.004
die+0.004
 mistake+0.004
Neg logit contributions
aw-0.004
cakes-0.004
P-0.004
boards-0.004
First-0.004
Token sandbox
Token ID35204
Feature activation-1.02e+00

Pos logit contributions
aw+0.028
cakes+0.027
P+0.026
boards+0.025
talk+0.025
Neg logit contributions
 mistake-0.027
 way-0.027
arter-0.027
dy-0.026
die-0.026
TokenFinally
Token ID11158
Feature activation+7.85e-01

Pos logit contributions
ican+0.007
 instruments+0.007
talk+0.007
 bars+0.007
aw+0.007
Neg logit contributions
die-0.007
dy-0.007
 tempting-0.006
 moment-0.006
 fortunate-0.006
Token M
Token ID337
Feature activation+1.14e+00

Pos logit contributions
aw+0.022
cakes+0.021
boards+0.020
P+0.020
 instruments+0.020
Neg logit contributions
arter-0.012
 vain-0.012
 fortunate-0.012
die-0.011
 mistake-0.011
Tokenummy
Token ID13513
Feature activation+2.58e+00

Pos logit contributions
aw+0.029
cakes+0.028
boards+0.027
P+0.027
 instruments+0.026
Neg logit contributions
 vain-0.019
arter-0.019
 fortunate-0.019
die-0.019
 remembers-0.018
Token used
Token ID973
Feature activation+1.44e+00

Pos logit contributions
aw+0.061
cakes+0.057
P+0.057
First+0.056
boards+0.056
Neg logit contributions
 remembers-0.048
die-0.048
dy-0.047
 mistake-0.046
 vain-0.046
Token her
Token ID607
Feature activation+7.12e-01

Pos logit contributions
aw+0.032
cakes+0.029
boards+0.029
 instruments+0.028
P+0.028
Neg logit contributions
arter-0.031
 fortunate-0.030
die-0.029
 vain-0.029
 remembers-0.028
Token fing
Token ID8038
Feature activation+4.99e+00

Pos logit contributions
aw+0.016
 instruments+0.015
P+0.015
boards+0.015
cakes+0.015
Neg logit contributions
arter-0.015
die-0.014
 fortunate-0.014
 vain-0.014
dy-0.013
Tokenern
Token ID1142
Feature activation+1.94e+00

Pos logit contributions
aw+0.088
cakes+0.087
boards+0.082
 instruments+0.077
P+0.074
Neg logit contributions
 vain-0.134
 mistake-0.132
arter-0.129
ubby-0.124
die-0.123
Tokenails
Token ID1768
Feature activation+2.13e+00

Pos logit contributions
aw+0.049
cakes+0.045
boards+0.044
P+0.044
 instruments+0.043
Neg logit contributions
arter-0.035
 vain-0.034
 fortunate-0.034
 mistake-0.033
die-0.033
Token to
Token ID284
Feature activation+6.39e-01

Pos logit contributions
aw+0.051
cakes+0.047
boards+0.047
P+0.046
 instruments+0.045
Neg logit contributions
arter-0.041
 vain-0.040
 fortunate-0.040
die-0.039
 mistake-0.039
Token gently
Token ID15165
Feature activation+7.70e-01

Pos logit contributions
aw+0.015
cakes+0.014
boards+0.014
P+0.013
First+0.013
Neg logit contributions
 vain-0.013
arter-0.013
 mistake-0.013
 fortunate-0.012
 haven-0.012
Token pull
Token ID2834
Feature activation+1.17e-01

Pos logit contributions
aw+0.018
cakes+0.016
boards+0.016
P+0.016
 instruments+0.015
Neg logit contributions
arter-0.016
 vain-0.015
 fortunate-0.015
die-0.015
 mistake-0.015
Token girl
Token ID2576
Feature activation+1.67e-01

Pos logit contributions
arter+0.006
 vain+0.006
 fortunate+0.006
die+0.005
 mistake+0.005
Neg logit contributions
aw-0.010
cakes-0.009
boards-0.009
P-0.009
 instruments-0.009
Token.
Token ID13
Feature activation-1.09e+00

Pos logit contributions
aw+0.003
cakes+0.003
P+0.003
boards+0.003
 instruments+0.003
Neg logit contributions
arter-0.004
 vain-0.004
 fortunate-0.004
die-0.004
 mistake-0.004
Token Her
Token ID2332
Feature activation-3.21e+00

Pos logit contributions
arter+0.033
 vain+0.032
 mistake+0.032
die+0.032
 fortunate+0.032
Neg logit contributions
aw-0.015
cakes-0.013
boards-0.013
First-0.012
Her-0.012
Token name
Token ID1438
Feature activation-2.55e+00

Pos logit contributions
 vain+0.056
 fortunate+0.053
die+0.053
arter+0.053
ening+0.051
Neg logit contributions
aw-0.083
cakes-0.080
P-0.077
 instruments-0.077
boards-0.074
Token was
Token ID373
Feature activation-6.99e-01

Pos logit contributions
arter+0.055
 vain+0.052
 fortunate+0.052
die+0.051
ubby+0.050
Neg logit contributions
aw-0.062
cakes-0.056
boards-0.054
P-0.053
 instruments-0.051
Token Mill
Token ID9212
Feature activation-6.03e+00

Pos logit contributions
arter+0.011
 vain+0.011
 fortunate+0.011
 mistake+0.011
die+0.011
Neg logit contributions
aw-0.019
cakes-0.018
P-0.017
boards-0.017
 instruments-0.017
Tokenie
Token ID494
Feature activation-3.25e+00

Pos logit contributions
arter+0.161
 vain+0.145
 fossil+0.143
 fortunate+0.139
 haven+0.139
Neg logit contributions
aw-0.113
co-0.104
She-0.097
Her-0.096
P-0.094
Token.
Token ID13
Feature activation-1.38e+00

Pos logit contributions
arter+0.081
 vain+0.075
 fortunate+0.075
 haven+0.073
die+0.072
Neg logit contributions
aw-0.069
co-0.057
cakes-0.056
Her-0.056
 bars-0.055
Token She
Token ID1375
Feature activation-2.13e+00

Pos logit contributions
arter+0.038
 vain+0.037
 mistake+0.037
die+0.037
 fortunate+0.037
Neg logit contributions
aw-0.022
cakes-0.020
boards-0.020
First-0.019
P-0.019
Token was
Token ID373
Feature activation-1.49e-01

Pos logit contributions
arter+0.039
 vain+0.038
 fortunate+0.037
 mistake+0.036
die+0.036
Neg logit contributions
aw-0.055
cakes-0.051
P-0.050
boards-0.050
 instruments-0.048
Token 3
Token ID513
Feature activation+1.79e-01

Pos logit contributions
arter+0.003
die+0.003
 mistake+0.003
 vain+0.003
 fortunate+0.003
Neg logit contributions
aw-0.003
P-0.003
Her-0.003
 instruments-0.003
cakes-0.003
Token girl
Token ID2576
Feature activation-1.87e-01

Pos logit contributions
arter+0.014
 vain+0.013
 fortunate+0.013
die+0.013
 mistake+0.013
Neg logit contributions
aw-0.023
cakes-0.021
P-0.021
boards-0.021
 instruments-0.020
Token.
Token ID13
Feature activation-1.25e+00

Pos logit contributions
arter+0.005
 vain+0.005
 fortunate+0.005
die+0.005
 mistake+0.004
Neg logit contributions
aw-0.003
P-0.003
cakes-0.003
boards-0.003
 instruments-0.003
Token Her
Token ID2332
Feature activation-3.00e+00

Pos logit contributions
arter+0.035
 vain+0.035
 fortunate+0.035
 mistake+0.035
die+0.035
Neg logit contributions
aw-0.019
cakes-0.016
boards-0.016
P-0.016
First-0.015
Token name
Token ID1438
Feature activation-3.18e+00

Pos logit contributions
 vain+0.045
arter+0.044
 fortunate+0.043
die+0.043
 mistake+0.040
Neg logit contributions
aw-0.085
cakes-0.080
P-0.079
 instruments-0.078
boards-0.077
Token was
Token ID373
Feature activation-9.54e-01

Pos logit contributions
arter+0.054
 vain+0.053
 fortunate+0.052
die+0.051
 mistake+0.050
Neg logit contributions
aw-0.086
cakes-0.079
boards-0.078
P-0.077
 instruments-0.075
Token Daisy
Token ID40355
Feature activation-5.83e+00

Pos logit contributions
arter+0.015
 vain+0.014
 fortunate+0.014
die+0.014
 haven+0.014
Neg logit contributions
aw-0.026
boards-0.025
cakes-0.025
P-0.024
talk-0.024
Token.
Token ID13
Feature activation-1.21e+00

Pos logit contributions
arter+0.163
 vain+0.150
 fortunate+0.143
 fossil+0.143
ubby+0.142
Neg logit contributions
aw-0.113
She-0.093
Her-0.092
P-0.090
co-0.089
Token She
Token ID1375
Feature activation-2.41e+00

Pos logit contributions
arter+0.030
 vain+0.030
 fortunate+0.030
 mistake+0.029
die+0.029
Neg logit contributions
aw-0.022
cakes-0.020
boards-0.019
P-0.019
First-0.019
Token was
Token ID373
Feature activation-2.02e-01

Pos logit contributions
arter+0.046
 vain+0.045
 fortunate+0.044
 mistake+0.043
die+0.043
Neg logit contributions
aw-0.061
P-0.056
cakes-0.055
boards-0.054
 instruments-0.053
Token only
Token ID691
Feature activation-3.09e+00

Pos logit contributions
arter+0.005
 mistake+0.004
 vain+0.004
die+0.004
 fortunate+0.004
Neg logit contributions
aw-0.004
P-0.004
Her-0.004
cakes-0.004
 instruments-0.003
Token 3
Token ID513
Feature activation+3.22e-01

Pos logit contributions
arter+0.094
 mistake+0.090
 vain+0.088
 hasn+0.088
die+0.087
Neg logit contributions
aw-0.041
 instruments-0.035
P-0.032
 bars-0.032
First-0.032
Token
Token ID220
Feature activation-2.72e+00

Pos logit contributions
arter+0.044
 vain+0.041
ubby+0.041
 mistake+0.041
 fault+0.039
Neg logit contributions
Her-0.028
cakes-0.027
She-0.026
aw-0.025
First-0.025
Token\n
Token ID198
Feature activation-1.03e+00

Pos logit contributions
 vain+0.067
arter+0.067
 fortunate+0.066
 mistake+0.065
 remembers+0.065
Neg logit contributions
aw-0.053
Her-0.049
First-0.048
She-0.048
cakes-0.047
Token\n
Token ID198
Feature activation-1.29e+00

Pos logit contributions
arter+0.023
 vain+0.022
 fortunate+0.022
 mistake+0.021
 remembers+0.021
Neg logit contributions
aw-0.022
P-0.021
First-0.021
cakes-0.021
Her-0.020
Token"
Token ID1
Feature activation+2.00e+00

Pos logit contributions
die+0.025
 fortunate+0.025
 vain+0.025
dy+0.024
 haven+0.024
Neg logit contributions
aw-0.030
boards-0.028
cakes-0.028
 instruments-0.027
talk-0.025
TokenHi
Token ID17250
Feature activation-2.05e+00

Pos logit contributions
aw+0.038
boards+0.034
cakes+0.033
 instruments+0.031
 bars+0.031
Neg logit contributions
die-0.050
 haven-0.047
 fortunate-0.047
dy-0.047
 vain-0.046
Token Blue
Token ID4518
Feature activation-5.81e+00

Pos logit contributions
 vain+0.052
arter+0.051
 mistake+0.047
 fault+0.047
 size+0.047
Neg logit contributions
aw-0.043
P-0.036
als-0.035
cakes-0.033
ican-0.033
Tokeny
Token ID88
Feature activation-1.04e+00

Pos logit contributions
 useless+0.140
arter+0.139
 vain+0.138
 way+0.137
 hasn+0.137
Neg logit contributions
aw-0.105
P-0.092
She-0.087
amb-0.086
als-0.084
Token,
Token ID11
Feature activation-9.37e-01

Pos logit contributions
arter+0.018
 vain+0.018
 fortunate+0.018
die+0.017
 haven+0.017
Neg logit contributions
aw-0.028
cakes-0.025
P-0.025
boards-0.024
co-0.023
Token you
Token ID345
Feature activation-4.28e-01

Pos logit contributions
 vain+0.018
arter+0.017
die+0.017
 fortunate+0.017
ubby+0.016
Neg logit contributions
aw-0.022
cakes-0.021
boards-0.020
P-0.020
Her-0.020
Token're
Token ID821
Feature activation+2.66e-01

Pos logit contributions
 vain+0.011
die+0.010
 fortunate+0.010
ening+0.010
 mistake+0.010
Neg logit contributions
aw-0.008
P-0.007
boards-0.006
als-0.006
Boo-0.006
Token so
Token ID523
Feature activation-1.50e+00

Pos logit contributions
P+0.004
aw+0.004
Her+0.004
She+0.004
cakes+0.003
Neg logit contributions
 vain-0.007
 remembers-0.007
ening-0.007
ccoli-0.007
 mistake-0.007
Tokenily
Token ID813
Feature activation-1.29e-01

Pos logit contributions
aw+0.024
cakes+0.021
boards+0.020
P+0.020
 instruments+0.019
Neg logit contributions
arter-0.042
 vain-0.042
 fortunate-0.041
die-0.041
 mistake-0.041
Token said
Token ID531
Feature activation-4.59e-01

Pos logit contributions
die+0.002
 vain+0.002
 fortunate+0.002
 mistake+0.002
 remembers+0.002
Neg logit contributions
aw-0.003
cakes-0.003
boards-0.003
P-0.003
 instruments-0.003
Token,
Token ID11
Feature activation-1.85e+00

Pos logit contributions
 fortunate+0.005
 vain+0.005
arter+0.005
 remembers+0.005
 mistake+0.005
Neg logit contributions
aw-0.015
boards-0.014
P-0.013
First-0.013
cakes-0.013
Token "
Token ID366
Feature activation+4.06e-01

Pos logit contributions
arter+0.030
 vain+0.029
 fortunate+0.028
 mistake+0.028
die+0.027
Neg logit contributions
aw-0.050
cakes-0.048
boards-0.047
P-0.046
Her-0.045
TokenHi
Token ID17250
Feature activation-2.29e+00

Pos logit contributions
aw+0.009
boards+0.008
cakes+0.008
 instruments+0.008
 bars+0.008
Neg logit contributions
die-0.009
 moment-0.009
 fortunate-0.008
dy-0.008
 haven-0.008
Token Betsy
Token ID45775
Feature activation-5.80e+00

Pos logit contributions
arter+0.056
 vain+0.055
 mistake+0.052
 fossil+0.052
 fortunate+0.051
Neg logit contributions
aw-0.047
P-0.044
cakes-0.041
 umb-0.038
First-0.038
Token,
Token ID11
Feature activation-9.27e-01

Pos logit contributions
arter+0.138
 fortunate+0.137
 vain+0.134
 haven+0.131
 useless+0.129
Neg logit contributions
aw-0.111
co-0.096
P-0.095
cakes-0.091
als-0.090
Token I
Token ID314
Feature activation-5.35e-01

Pos logit contributions
 vain+0.019
arter+0.019
die+0.018
 fortunate+0.018
 mistake+0.018
Neg logit contributions
aw-0.021
cakes-0.020
boards-0.019
P-0.019
First-0.018
Token care
Token ID1337
Feature activation-1.95e+00

Pos logit contributions
arter+0.013
ening+0.013
die+0.012
 vain+0.012
 mistake+0.012
Neg logit contributions
P-0.010
aw-0.010
Boo-0.009
First-0.009
 se-0.008
Token about
Token ID546
Feature activation-3.07e+00

Pos logit contributions
die+0.041
 vain+0.040
arter+0.039
 mistake+0.039
 fortunate+0.039
Neg logit contributions
aw-0.044
cakes-0.040
boards-0.040
 instruments-0.038
 bars-0.037
Token you
Token ID345
Feature activation-5.57e-01

Pos logit contributions
arter+0.047
 vain+0.047
 fortunate+0.045
die+0.045
 mistake+0.044
Neg logit contributions
aw-0.086
cakes-0.079
P-0.079
boards-0.077
 instruments-0.077
Token was
Token ID373
Feature activation+1.41e+00

Pos logit contributions
 fortunate+0.100
arter+0.098
 vain+0.096
ubby+0.095
die+0.094
Neg logit contributions
aw-0.058
boards-0.048
cakes-0.045
Her-0.045
P-0.044
Token a
Token ID257
Feature activation-5.22e-01

Pos logit contributions
boards+0.036
cakes+0.036
aw+0.035
talk+0.035
co+0.033
Neg logit contributions
 time-0.024
die-0.024
dy-0.023
 vain-0.023
arter-0.023
Token little
Token ID1310
Feature activation-4.37e-01

Pos logit contributions
 fault+0.009
 vain+0.009
arter+0.009
ening+0.009
 fortunate+0.008
Neg logit contributions
aw-0.013
P-0.013
 se-0.013
cakes-0.012
Her-0.012
Token girl
Token ID2576
Feature activation+1.41e-01

Pos logit contributions
arter+0.007
 vain+0.007
 fortunate+0.007
die+0.006
 mistake+0.006
Neg logit contributions
aw-0.012
cakes-0.011
boards-0.011
P-0.011
First-0.011
Token called
Token ID1444
Feature activation-3.66e-01

Pos logit contributions
aw+0.003
P+0.002
cakes+0.002
boards+0.002
 instruments+0.002
Neg logit contributions
arter-0.003
 vain-0.003
 fortunate-0.003
die-0.003
 mistake-0.003
Token Hope
Token ID13408
Feature activation-5.76e+00

Pos logit contributions
arter+0.007
 fortunate+0.007
 vain+0.006
die+0.006
 mistake+0.006
Neg logit contributions
aw-0.009
cakes-0.009
boards-0.008
P-0.008
First-0.008
Token.
Token ID13
Feature activation-1.63e+00

Pos logit contributions
arter+0.164
 vain+0.156
 tempting+0.152
 fault+0.151
 hasn+0.151
Neg logit contributions
aw-0.093
Her-0.084
She-0.080
P-0.077
They-0.075
Token She
Token ID1375
Feature activation-2.33e+00

Pos logit contributions
arter+0.043
 vain+0.042
 mistake+0.042
 fortunate+0.041
ubby+0.040
Neg logit contributions
aw-0.028
cakes-0.025
boards-0.025
First-0.024
P-0.024
Token lived
Token ID5615
Feature activation-3.62e-01

Pos logit contributions
arter+0.042
 vain+0.041
 fortunate+0.039
 mistake+0.039
die+0.039
Neg logit contributions
aw-0.061
P-0.057
cakes-0.056
boards-0.055
 instruments-0.053
Token with
Token ID351
Feature activation-3.74e-01

Pos logit contributions
arter+0.008
 vain+0.008
 fortunate+0.008
die+0.008
 mistake+0.008
Neg logit contributions
aw-0.008
P-0.007
cakes-0.007
boards-0.007
First-0.007
Token her
Token ID607
Feature activation-1.76e+00

Pos logit contributions
arter+0.008
 vain+0.008
 fortunate+0.008
die+0.008
 way+0.007
Neg logit contributions
aw-0.008
boards-0.008
cakes-0.007
talk-0.007
P-0.007
Token,
Token ID11
Feature activation-1.26e+00

Pos logit contributions
arter+0.021
 vain+0.021
 fortunate+0.021
die+0.020
 mistake+0.020
Neg logit contributions
aw-0.047
cakes-0.044
boards-0.044
P-0.043
First-0.042
Token "
Token ID366
Feature activation+5.73e-01

Pos logit contributions
arter+0.017
 vain+0.017
 fortunate+0.016
die+0.016
 mistake+0.016
Neg logit contributions
aw-0.037
cakes-0.035
boards-0.034
P-0.034
 instruments-0.033
TokenMy
Token ID3666
Feature activation-6.44e-01

Pos logit contributions
cakes+0.011
boards+0.011
aw+0.010
 umb+0.010
 instruments+0.010
Neg logit contributions
die-0.013
 haven-0.013
dy-0.013
 fortunate-0.013
 hasn-0.013
Token name
Token ID1438
Feature activation-2.76e+00

Pos logit contributions
 vain+0.011
arter+0.010
 fortunate+0.010
die+0.009
ubby+0.009
Neg logit contributions
aw-0.017
 instruments-0.016
 shorts-0.015
 bars-0.015
 sneakers-0.015
Token is
Token ID318
Feature activation-6.74e-01

Pos logit contributions
arter+0.041
 vain+0.039
die+0.038
 fortunate+0.037
 mistake+0.037
Neg logit contributions
aw-0.084
cakes-0.076
boards-0.074
P-0.074
 instruments-0.072
Token Frank
Token ID5278
Feature activation-5.75e+00

Pos logit contributions
arter+0.013
 vain+0.012
 fortunate+0.012
 mistake+0.012
die+0.012
Neg logit contributions
aw-0.017
cakes-0.015
boards-0.015
P-0.015
 instruments-0.014
Token.
Token ID13
Feature activation-1.37e+00

Pos logit contributions
arter+0.137
 vain+0.129
 mistake+0.124
 haven+0.119
 ending+0.119
Neg logit contributions
aw-0.125
P-0.100
 instruments-0.098
als-0.097
cakes-0.096
Token Would
Token ID10928
Feature activation+1.67e+00

Pos logit contributions
arter+0.025
 vain+0.025
 fortunate+0.025
die+0.024
 mistake+0.024
Neg logit contributions
aw-0.033
cakes-0.031
boards-0.031
P-0.030
 instruments-0.029
Token you
Token ID345
Feature activation+6.95e-01

Pos logit contributions
aw+0.040
boards+0.037
talk+0.037
First+0.036
 instruments+0.035
Neg logit contributions
 lucky-0.030
 tempting-0.030
 true-0.030
 haven-0.030
 fortunate-0.030
Token like
Token ID588
Feature activation-8.35e-01

Pos logit contributions
aw+0.017
cakes+0.015
boards+0.015
P+0.015
 instruments+0.014
Neg logit contributions
arter-0.013
 vain-0.013
 fortunate-0.013
die-0.013
 mistake-0.013
Token to
Token ID284
Feature activation-1.74e+00

Pos logit contributions
die+0.015
 way+0.015
 fortunate+0.015
 mistake+0.015
arter+0.015
Neg logit contributions
aw-0.020
cakes-0.019
boards-0.019
P-0.018
First-0.018
Token girl
Token ID2576
Feature activation+1.66e-01

Pos logit contributions
 fault+0.010
 vain+0.010
arter+0.010
ening+0.010
 pays+0.009
Neg logit contributions
aw-0.014
P-0.014
 se-0.013
She-0.012
Her-0.012
Token.
Token ID13
Feature activation-1.54e+00

Pos logit contributions
aw+0.003
cakes+0.003
boards+0.003
P+0.003
First+0.003
Neg logit contributions
arter-0.004
 vain-0.004
 fortunate-0.004
 mistake-0.003
die-0.003
Token Her
Token ID2332
Feature activation-3.32e+00

Pos logit contributions
 mistake+0.045
 vain+0.045
arter+0.045
 fortunate+0.044
die+0.044
Neg logit contributions
aw-0.021
cakes-0.019
boards-0.018
P-0.017
Her-0.017
Token name
Token ID1438
Feature activation-2.78e+00

Pos logit contributions
 vain+0.051
arter+0.050
 fortunate+0.048
die+0.048
ening+0.045
Neg logit contributions
aw-0.093
cakes-0.088
P-0.087
 instruments-0.084
boards-0.084
Token was
Token ID373
Feature activation-1.07e+00

Pos logit contributions
arter+0.060
 vain+0.058
 fortunate+0.057
die+0.056
ubby+0.055
Neg logit contributions
aw-0.068
cakes-0.061
boards-0.059
P-0.058
 instruments-0.055
Token Claire
Token ID20758
Feature activation-5.75e+00

Pos logit contributions
arter+0.017
 vain+0.017
 fortunate+0.017
die+0.016
 mistake+0.016
Neg logit contributions
aw-0.029
cakes-0.027
boards-0.027
P-0.026
 instruments-0.026
Token.
Token ID13
Feature activation-1.18e+00

Pos logit contributions
arter+0.176
 vain+0.167
 fault+0.156
 mistake+0.154
 tempting+0.154
Neg logit contributions
aw-0.095
Her-0.080
She-0.076
P-0.076
They-0.071
Token Claire
Token ID20758
Feature activation-3.95e+00

Pos logit contributions
arter+0.034
 vain+0.034
 mistake+0.034
die+0.033
 fortunate+0.032
Neg logit contributions
aw-0.017
cakes-0.015
boards-0.015
P-0.014
Her-0.014
Token lived
Token ID5615
Feature activation-9.01e-01

Pos logit contributions
arter+0.110
 vain+0.104
 mistake+0.099
 fault+0.098
die+0.095
Neg logit contributions
aw-0.077
P-0.068
Her-0.062
She-0.061
 instruments-0.059
Token in
Token ID287
Feature activation-2.13e+00

Pos logit contributions
arter+0.022
 vain+0.022
 fortunate+0.021
 mistake+0.021
die+0.021
Neg logit contributions
aw-0.017
P-0.017
cakes-0.015
Her-0.015
boards-0.014
Token a
Token ID257
Feature activation+7.20e-01

Pos logit contributions
arter+0.051
 vain+0.050
die+0.050
 fortunate+0.050
 mistake+0.049
Neg logit contributions
aw-0.041
cakes-0.037
P-0.036
boards-0.035
 instruments-0.034
Token was
Token ID373
Feature activation+8.01e-01

Pos logit contributions
arter+0.109
 vain+0.102
ubby+0.102
 fortunate+0.102
die+0.099
Neg logit contributions
aw-0.038
Her-0.031
mine-0.031
boards-0.031
 were-0.030
Token a
Token ID257
Feature activation-3.34e-01

Pos logit contributions
cakes+0.021
boards+0.021
talk+0.020
aw+0.019
co+0.019
Neg logit contributions
 time-0.015
die-0.014
ubby-0.014
dy-0.013
 glad-0.013
Token little
Token ID1310
Feature activation-2.90e-01

Pos logit contributions
 vain+0.006
arter+0.006
 fault+0.006
ening+0.006
 pays+0.006
Neg logit contributions
P-0.009
aw-0.009
 se-0.008
She-0.007
Her-0.007
Token girl
Token ID2576
Feature activation+5.95e-02

Pos logit contributions
arter+0.005
 vain+0.005
 fortunate+0.005
 mistake+0.005
die+0.005
Neg logit contributions
aw-0.008
P-0.007
cakes-0.007
boards-0.007
 instruments-0.007
Token called
Token ID1444
Feature activation-4.06e-01

Pos logit contributions
aw+0.001
P+0.001
cakes+0.001
boards+0.001
 instruments+0.001
Neg logit contributions
arter-0.001
 vain-0.001
 fortunate-0.001
die-0.001
 mistake-0.001
Token Mill
Token ID9212
Feature activation-5.73e+00

Pos logit contributions
arter+0.007
 vain+0.007
 fortunate+0.007
die+0.007
 mistake+0.007
Neg logit contributions
aw-0.010
cakes-0.010
boards-0.010
P-0.009
First-0.009
Tokenie
Token ID494
Feature activation-3.04e+00

Pos logit contributions
arter+0.148
 vain+0.141
 fortunate+0.133
 fossil+0.131
 haven+0.129
Neg logit contributions
aw-0.109
P-0.096
She-0.095
co-0.095
Her-0.094
Token.
Token ID13
Feature activation-1.82e+00

Pos logit contributions
arter+0.069
 vain+0.067
 fortunate+0.066
die+0.063
 haven+0.063
Neg logit contributions
aw-0.068
Her-0.057
P-0.056
co-0.055
 bars-0.055
Token She
Token ID1375
Feature activation-2.37e+00

Pos logit contributions
arter+0.051
 vain+0.051
 mistake+0.050
 fortunate+0.050
die+0.050
Neg logit contributions
aw-0.027
cakes-0.025
boards-0.024
First-0.023
P-0.023
Token wanted
Token ID2227
Feature activation+2.71e-01

Pos logit contributions
arter+0.047
 vain+0.046
 fortunate+0.044
 mistake+0.044
die+0.043
Neg logit contributions
aw-0.059
P-0.054
cakes-0.052
boards-0.052
 instruments-0.051
Token to
Token ID284
Feature activation-1.52e+00

Pos logit contributions
aw+0.008
P+0.007
amb+0.007
boards+0.007
ican+0.007
Neg logit contributions
 way-0.004
 chance-0.004
 mistake-0.004
 solution-0.004
 new-0.004
Token girl
Token ID2576
Feature activation-1.12e-02

Pos logit contributions
arter+0.005
 vain+0.005
 fortunate+0.004
die+0.004
 mistake+0.004
Neg logit contributions
aw-0.007
cakes-0.007
P-0.007
boards-0.007
 instruments-0.006
Token.
Token ID13
Feature activation-1.15e+00

Pos logit contributions
arter+0.000
 vain+0.000
 fortunate+0.000
die+0.000
 mistake+0.000
Neg logit contributions
aw-0.000
P-0.000
cakes-0.000
boards-0.000
 instruments-0.000
Token Her
Token ID2332
Feature activation-3.28e+00

Pos logit contributions
 mistake+0.036
 vain+0.036
die+0.035
arter+0.035
 fortunate+0.035
Neg logit contributions
aw-0.014
cakes-0.013
Her-0.012
boards-0.012
First-0.011
Token name
Token ID1438
Feature activation-2.66e+00

Pos logit contributions
 vain+0.053
arter+0.052
die+0.051
 fortunate+0.050
ening+0.048
Neg logit contributions
aw-0.089
cakes-0.084
P-0.082
 instruments-0.081
boards-0.079
Token was
Token ID373
Feature activation-9.19e-01

Pos logit contributions
arter+0.062
 vain+0.058
 fortunate+0.058
die+0.057
ubby+0.057
Neg logit contributions
aw-0.063
cakes-0.055
boards-0.053
P-0.052
 bars-0.049
Token Claire
Token ID20758
Feature activation-5.73e+00

Pos logit contributions
arter+0.015
 vain+0.015
 fortunate+0.015
die+0.014
 mistake+0.014
Neg logit contributions
aw-0.025
cakes-0.023
P-0.022
boards-0.022
 instruments-0.022
Token and
Token ID290
Feature activation-1.97e+00

Pos logit contributions
arter+0.170
 vain+0.160
reetings+0.151
die+0.150
 fossil+0.150
Neg logit contributions
aw-0.096
Her-0.083
She-0.079
P-0.074
They-0.073
Token she
Token ID673
Feature activation-2.69e+00

Pos logit contributions
arter+0.053
 vain+0.052
 mistake+0.051
die+0.051
 fortunate+0.050
Neg logit contributions
aw-0.033
P-0.029
cakes-0.028
Her-0.027
boards-0.027
Token was
Token ID373
Feature activation-3.77e-01

Pos logit contributions
arter+0.067
 vain+0.063
 mistake+0.061
ubby+0.061
die+0.060
Neg logit contributions
aw-0.056
P-0.051
 instruments-0.048
Her-0.047
cakes-0.046
Token three
Token ID1115
Feature activation-4.82e-01

Pos logit contributions
arter+0.009
 mistake+0.009
die+0.009
 vain+0.009
lad+0.009
Neg logit contributions
aw-0.007
P-0.006
Her-0.006
 se-0.006
She-0.006
Token years
Token ID812
Feature activation-3.22e+00

Pos logit contributions
arter+0.010
 haven+0.010
die+0.010
 vain+0.010
 hasn+0.009
Neg logit contributions
aw-0.012
cakes-0.010
 instruments-0.010
boards-0.010
P-0.010
Token was
Token ID373
Feature activation+7.10e-01

Pos logit contributions
arter+0.112
 fortunate+0.106
ubby+0.104
 vain+0.104
die+0.102
Neg logit contributions
aw-0.047
Her-0.042
cakes-0.039
boards-0.038
mine-0.038
Token a
Token ID257
Feature activation-6.28e-01

Pos logit contributions
boards+0.019
cakes+0.019
talk+0.018
aw+0.018
co+0.017
Neg logit contributions
 time-0.012
die-0.011
 glad-0.011
 lucky-0.011
ubby-0.011
Token little
Token ID1310
Feature activation-6.31e-01

Pos logit contributions
 fault+0.011
arter+0.011
 vain+0.011
ening+0.011
 pays+0.010
Neg logit contributions
P-0.016
aw-0.016
 se-0.015
She-0.014
co-0.014
Token girl
Token ID2576
Feature activation-3.13e-01

Pos logit contributions
arter+0.011
 vain+0.010
 fortunate+0.010
die+0.010
 mistake+0.010
Neg logit contributions
aw-0.017
cakes-0.016
P-0.015
boards-0.015
 instruments-0.015
Token called
Token ID1444
Feature activation-5.18e-01

Pos logit contributions
arter+0.008
 vain+0.008
 fortunate+0.008
die+0.007
 mistake+0.007
Neg logit contributions
aw-0.006
P-0.005
cakes-0.005
boards-0.005
 instruments-0.005
Token Mill
Token ID9212
Feature activation-5.73e+00

Pos logit contributions
arter+0.009
 vain+0.009
 fortunate+0.009
die+0.009
 mistake+0.009
Neg logit contributions
aw-0.013
cakes-0.012
boards-0.012
P-0.012
First-0.012
Tokenie
Token ID494
Feature activation-3.17e+00

Pos logit contributions
arter+0.146
 vain+0.138
 fortunate+0.133
 fossil+0.130
 haven+0.129
Neg logit contributions
aw-0.111
co-0.099
P-0.097
She-0.095
Her-0.095
Token.
Token ID13
Feature activation-1.64e+00

Pos logit contributions
arter+0.073
 vain+0.070
 fortunate+0.069
die+0.067
 haven+0.066
Neg logit contributions
aw-0.070
Her-0.060
co-0.059
P-0.058
 bars-0.057
Token She
Token ID1375
Feature activation-2.48e+00

Pos logit contributions
arter+0.045
 vain+0.044
 mistake+0.044
die+0.044
 fortunate+0.044
Neg logit contributions
aw-0.025
cakes-0.023
boards-0.022
First-0.022
P-0.021
Token was
Token ID373
Feature activation-2.65e-01

Pos logit contributions
arter+0.049
 vain+0.047
 fortunate+0.046
 mistake+0.045
die+0.045
Neg logit contributions
aw-0.062
P-0.057
cakes-0.056
boards-0.055
 instruments-0.054
Token three
Token ID1115
Feature activation-4.11e-01

Pos logit contributions
arter+0.006
 vain+0.006
die+0.006
 mistake+0.006
 fortunate+0.006
Neg logit contributions
aw-0.006
P-0.005
cakes-0.005
 instruments-0.005
First-0.005
TokenIsn
Token ID41451
Feature activation+1.38e+00

Pos logit contributions
aw+0.027
boards+0.026
 umb+0.025
 bars+0.025
cakes+0.025
Neg logit contributions
 moment-0.026
die-0.025
 haven-0.023
 hasn-0.022
 matter-0.021
Token't
Token ID470
Feature activation-1.88e+00

Pos logit contributions
aw+0.034
talk+0.031
boards+0.031
 instruments+0.030
P+0.030
Neg logit contributions
 haven-0.026
arter-0.024
 hasn-0.024
 has-0.024
 vain-0.023
Token it
Token ID340
Feature activation-3.80e+00

Pos logit contributions
arter+0.033
 haven+0.033
 mistake+0.032
 fortunate+0.032
 vain+0.031
Neg logit contributions
boards-0.045
cakes-0.045
aw-0.044
talk-0.044
 sneakers-0.043
Token incredible
Token ID8082
Feature activation-2.09e-01

Pos logit contributions
 vain+0.071
die+0.065
 fortunate+0.065
ubby+0.065
arter+0.064
Neg logit contributions
cakes-0.094
aw-0.093
boards-0.085
P-0.084
 instruments-0.084
Token how
Token ID703
Feature activation-9.08e-01

Pos logit contributions
 vain+0.004
arter+0.004
 fortunate+0.004
die+0.004
ubby+0.004
Neg logit contributions
aw-0.005
cakes-0.005
boards-0.004
 instruments-0.004
P-0.004
Token nature
Token ID3450
Feature activation-5.72e+00

Pos logit contributions
arter+0.018
 vain+0.018
 fortunate+0.017
die+0.017
 mistake+0.017
Neg logit contributions
aw-0.022
cakes-0.020
boards-0.019
P-0.019
 instruments-0.019
Token can
Token ID460
Feature activation-4.95e-01

Pos logit contributions
arter+0.112
 vain+0.111
die+0.110
 mistake+0.105
dy+0.102
Neg logit contributions
aw-0.142
cakes-0.130
 instruments-0.128
boards-0.124
P-0.121
Token be
Token ID307
Feature activation-6.91e-01

Pos logit contributions
arter+0.009
 haven+0.008
 mistake+0.008
 way+0.008
 fortunate+0.008
Neg logit contributions
aw-0.014
boards-0.013
cakes-0.012
P-0.012
First-0.012
Token so
Token ID523
Feature activation-1.47e+00

Pos logit contributions
arter+0.015
 vain+0.014
die+0.013
 mistake+0.013
 hasn+0.013
Neg logit contributions
aw-0.015
P-0.014
 instruments-0.013
 se-0.013
First-0.013
Token beautiful
Token ID4950
Feature activation-1.20e+00

Pos logit contributions
arter+0.024
 vain+0.024
die+0.023
 mistake+0.023
dy+0.022
Neg logit contributions
aw-0.040
cakes-0.036
P-0.036
 instruments-0.034
First-0.034
Token?"
Token ID1701
Feature activation+6.27e-01

Pos logit contributions
arter+0.019
 fortunate+0.019
 vain+0.019
die+0.018
 mistake+0.018
Neg logit contributions
aw-0.032
cakes-0.030
boards-0.030
P-0.030
First-0.029
Token,
Token ID11
Feature activation-1.28e+00

Pos logit contributions
arter+0.019
 vain+0.018
 fortunate+0.018
die+0.018
 mistake+0.018
Neg logit contributions
aw-0.039
cakes-0.037
boards-0.036
P-0.036
First-0.036
Token "
Token ID366
Feature activation+9.48e-01

Pos logit contributions
arter+0.019
 vain+0.018
 fortunate+0.018
die+0.017
 mistake+0.017
Neg logit contributions
aw-0.037
cakes-0.035
boards-0.034
P-0.034
 instruments-0.033
TokenMy
Token ID3666
Feature activation-9.97e-01

Pos logit contributions
boards+0.019
cakes+0.018
aw+0.018
 umb+0.017
 instruments+0.017
Neg logit contributions
die-0.021
 haven-0.021
 fortunate-0.020
 has-0.020
dy-0.020
Token name
Token ID1438
Feature activation-2.92e+00

Pos logit contributions
 vain+0.017
arter+0.016
 fortunate+0.015
ubby+0.015
die+0.015
Neg logit contributions
aw-0.026
 instruments-0.024
 shorts-0.023
 bars-0.023
cakes-0.023
Token is
Token ID318
Feature activation-9.86e-01

Pos logit contributions
arter+0.040
 vain+0.039
die+0.037
 fortunate+0.037
 mistake+0.036
Neg logit contributions
aw-0.091
cakes-0.083
boards-0.081
P-0.080
 instruments-0.079
Token Hope
Token ID13408
Feature activation-5.71e+00

Pos logit contributions
arter+0.017
 vain+0.017
 fortunate+0.017
 mistake+0.017
die+0.017
Neg logit contributions
aw-0.025
cakes-0.023
P-0.023
boards-0.023
 instruments-0.022
Token."
Token ID526
Feature activation+4.63e-01

Pos logit contributions
arter+0.129
 vain+0.122
 mistake+0.118
 haven+0.113
 fortunate+0.112
Neg logit contributions
aw-0.122
cakes-0.104
P-0.103
 instruments-0.103
She-0.101
Token\n
Token ID198
Feature activation-4.80e-01

Pos logit contributions
aw+0.012
cakes+0.011
boards+0.011
P+0.011
 instruments+0.011
Neg logit contributions
arter-0.008
 vain-0.008
 fortunate-0.008
die-0.008
 mistake-0.008
Token\n
Token ID198
Feature activation-3.57e-01

Pos logit contributions
die+0.008
 vain+0.007
arter+0.007
 fortunate+0.007
 mistake+0.007
Neg logit contributions
aw-0.013
cakes-0.012
boards-0.012
 instruments-0.012
P-0.012
TokenJack
Token ID14295
Feature activation-1.02e-01

Pos logit contributions
die+0.009
dy+0.008
 vain+0.007
 fortunate+0.007
 fossil+0.007
Neg logit contributions
cakes-0.007
 umb-0.007
boards-0.007
aw-0.007
als-0.007
Token thought
Token ID1807
Feature activation-7.23e-01

Pos logit contributions
arter+0.002
 vain+0.002
 fortunate+0.002
die+0.002
 mistake+0.002
Neg logit contributions
aw-0.002
cakes-0.002
boards-0.002
P-0.002
 instruments-0.002
Token was
Token ID373
Feature activation+9.93e-01

Pos logit contributions
arter+0.098
ubby+0.093
 fortunate+0.092
die+0.090
 vain+0.090
Neg logit contributions
aw-0.045
Her-0.039
boards-0.037
mine-0.037
cakes-0.036
Token a
Token ID257
Feature activation-1.59e-01

Pos logit contributions
boards+0.027
cakes+0.027
talk+0.026
aw+0.025
co+0.025
Neg logit contributions
 time-0.016
die-0.015
dy-0.015
 lucky-0.015
 glad-0.015
Token little
Token ID1310
Feature activation-1.15e-01

Pos logit contributions
 vain+0.003
 fault+0.003
arter+0.003
ening+0.003
 pays+0.003
Neg logit contributions
P-0.004
aw-0.004
 se-0.004
She-0.004
Her-0.004
Token girl
Token ID2576
Feature activation+1.61e-01

Pos logit contributions
arter+0.002
 vain+0.002
 fortunate+0.002
die+0.002
 mistake+0.002
Neg logit contributions
aw-0.003
cakes-0.003
boards-0.003
P-0.003
 instruments-0.003
Token called
Token ID1444
Feature activation-2.05e-01

Pos logit contributions
aw+0.003
P+0.003
cakes+0.003
boards+0.003
 instruments+0.002
Neg logit contributions
arter-0.004
 vain-0.004
 fortunate-0.004
die-0.004
 mistake-0.004
Token Pink
Token ID14657
Feature activation-5.69e+00

Pos logit contributions
arter+0.004
 fortunate+0.004
 vain+0.004
die+0.004
 mistake+0.003
Neg logit contributions
aw-0.005
cakes-0.005
boards-0.005
P-0.005
First-0.005
Tokeny
Token ID88
Feature activation-1.64e+00

Pos logit contributions
 vain+0.133
arter+0.129
 fortunate+0.122
 tempting+0.121
 mistake+0.120
Neg logit contributions
aw-0.120
P-0.111
She-0.109
They-0.107
Her-0.106
Token.
Token ID13
Feature activation-1.24e+00

Pos logit contributions
arter+0.028
 vain+0.027
 fortunate+0.027
die+0.026
 mistake+0.026
Neg logit contributions
aw-0.043
cakes-0.040
boards-0.040
P-0.039
 instruments-0.038
Token She
Token ID1375
Feature activation-2.50e+00

Pos logit contributions
arter+0.030
 vain+0.030
 mistake+0.029
 fortunate+0.029
die+0.029
Neg logit contributions
aw-0.023
cakes-0.021
P-0.021
boards-0.021
First-0.020
Token was
Token ID373
Feature activation-3.07e-01

Pos logit contributions
arter+0.043
 vain+0.042
 fortunate+0.042
die+0.041
 mistake+0.041
Neg logit contributions
aw-0.066
cakes-0.061
P-0.060
boards-0.060
 instruments-0.058
Token three
Token ID1115
Feature activation-1.68e-01

Pos logit contributions
arter+0.007
die+0.006
 mistake+0.006
 vain+0.006
 fortunate+0.006
Neg logit contributions
aw-0.007
P-0.006
cakes-0.006
Her-0.006
 instruments-0.006
Token "
Token ID366
Feature activation+1.72e-01

Pos logit contributions
arter+0.020
 vain+0.019
 fortunate+0.019
die+0.019
 mistake+0.018
Neg logit contributions
aw-0.042
cakes-0.039
boards-0.039
P-0.039
 instruments-0.038
TokenHi
Token ID17250
Feature activation-2.49e+00

Pos logit contributions
boards+0.004
aw+0.004
adders+0.004
 umb+0.004
cakes+0.004
Neg logit contributions
die-0.004
 moment-0.003
dy-0.003
 has-0.003
 haven-0.003
Token,
Token ID11
Feature activation-7.90e-01

Pos logit contributions
 vain+0.070
arter+0.069
 mistake+0.065
 fault+0.064
ening+0.062
Neg logit contributions
aw-0.048
P-0.036
 bars-0.034
cakes-0.034
als-0.033
Token I
Token ID314
Feature activation-4.48e-01

Pos logit contributions
 vain+0.016
arter+0.015
ening+0.015
 fortunate+0.015
die+0.015
Neg logit contributions
aw-0.018
cakes-0.017
She-0.016
 bars-0.016
ican-0.015
Token'm
Token ID1101
Feature activation+4.21e-01

Pos logit contributions
 vain+0.010
arter+0.010
die+0.010
ening+0.010
 fortunate+0.010
Neg logit contributions
P-0.009
aw-0.009
Boo-0.008
 se-0.008
 instruments-0.008
Token Blue
Token ID4518
Feature activation-5.66e+00

Pos logit contributions
aw+0.009
P+0.009
cakes+0.008
Her+0.008
ican+0.008
Neg logit contributions
 mistake-0.009
 vain-0.009
 fortunate-0.009
die-0.009
 remembers-0.009
Token.
Token ID13
Feature activation-1.59e+00

Pos logit contributions
 vain+0.143
arter+0.141
 mistake+0.139
 tempting+0.139
 solution+0.138
Neg logit contributions
aw-0.101
P-0.087
als-0.079
She-0.078
They-0.077
Token Do
Token ID2141
Feature activation+1.13e+00

Pos logit contributions
 mistake+0.032
 vain+0.032
arter+0.031
 fortunate+0.031
die+0.030
Neg logit contributions
aw-0.036
cakes-0.034
boards-0.033
P-0.032
First-0.032
Token you
Token ID345
Feature activation+1.42e+00

Pos logit contributions
aw+0.022
cakes+0.022
boards+0.022
talk+0.020
First+0.020
Neg logit contributions
arter-0.025
 haven-0.024
 remembers-0.024
 has-0.024
 vain-0.024
Token want
Token ID765
Feature activation+1.17e+00

Pos logit contributions
aw+0.035
cakes+0.033
boards+0.032
P+0.032
 instruments+0.031
Neg logit contributions
arter-0.026
 vain-0.026
 fortunate-0.025
die-0.025
 mistake-0.025
Token to
Token ID284
Feature activation-1.95e+00

Pos logit contributions
aw+0.034
boards+0.031
cakes+0.030
First+0.029
P+0.029
Neg logit contributions
 way-0.018
 chance-0.017
 thanks-0.016
 real-0.016
 mistake-0.016
Token was
Token ID373
Feature activation+9.59e-01

Pos logit contributions
arter+0.109
 fortunate+0.108
ubby+0.105
 vain+0.104
die+0.101
Neg logit contributions
aw-0.057
boards-0.047
cakes-0.044
Her-0.042
She-0.041
Token a
Token ID257
Feature activation-6.17e-01

Pos logit contributions
cakes+0.026
boards+0.025
talk+0.025
aw+0.025
co+0.023
Neg logit contributions
 time-0.016
die-0.016
dy-0.015
 normal-0.015
 lucky-0.014
Token little
Token ID1310
Feature activation-4.58e-01

Pos logit contributions
 fault+0.012
 vain+0.011
ening+0.011
arter+0.011
 pays+0.010
Neg logit contributions
P-0.016
aw-0.015
 se-0.015
She-0.013
cakes-0.013
Token rabbit
Token ID22746
Feature activation+1.32e-01

Pos logit contributions
arter+0.007
 vain+0.007
 fortunate+0.007
die+0.007
 mistake+0.007
Neg logit contributions
aw-0.013
cakes-0.012
boards-0.012
P-0.011
First-0.011
Token named
Token ID3706
Feature activation+2.44e-01

Pos logit contributions
aw+0.003
cakes+0.003
boards+0.003
First+0.003
P+0.003
Neg logit contributions
arter-0.003
 vain-0.003
 fortunate-0.003
die-0.003
 mistake-0.003
Token Frankie
Token ID48535
Feature activation-5.65e+00

Pos logit contributions
aw+0.005
P+0.005
cakes+0.005
boards+0.005
First+0.005
Neg logit contributions
 fortunate-0.005
 mistake-0.005
 vain-0.005
 remembers-0.005
arter-0.005
Token.
Token ID13
Feature activation-1.63e+00

Pos logit contributions
arter+0.162
 vain+0.152
 mistake+0.150
 fault+0.149
 solution+0.147
Neg logit contributions
aw-0.090
P-0.086
She-0.079
They-0.077
Her-0.077
Token He
Token ID679
Feature activation-2.14e+00

Pos logit contributions
arter+0.032
 mistake+0.031
 vain+0.031
 fortunate+0.029
ubby+0.029
Neg logit contributions
aw-0.038
First-0.036
boards-0.035
P-0.035
cakes-0.035
Token was
Token ID373
Feature activation-1.22e-01

Pos logit contributions
arter+0.045
 vain+0.043
 fortunate+0.042
 mistake+0.042
ubby+0.040
Neg logit contributions
aw-0.051
P-0.048
boards-0.044
Her-0.043
 instruments-0.043
Token a
Token ID257
Feature activation-2.45e+00

Pos logit contributions
 mistake+0.003
arter+0.003
 vain+0.003
 fortunate+0.003
 hasn+0.003
Neg logit contributions
aw-0.002
P-0.002
Her-0.002
 instruments-0.002
First-0.002
Token very
Token ID845
Feature activation-2.14e+00

Pos logit contributions
 fault+0.048
 vain+0.047
arter+0.047
ening+0.047
 mistake+0.046
Neg logit contributions
P-0.058
aw-0.053
 se-0.053
Her-0.049
She-0.048
Token girl
Token ID2576
Feature activation-5.74e-02

Pos logit contributions
arter+0.006
 vain+0.006
 fortunate+0.006
die+0.006
 mistake+0.006
Neg logit contributions
aw-0.009
cakes-0.009
P-0.009
boards-0.008
 instruments-0.008
Token and
Token ID290
Feature activation-9.36e-01

Pos logit contributions
arter+0.001
 fortunate+0.001
 vain+0.001
die+0.001
 mistake+0.001
Neg logit contributions
aw-0.001
P-0.001
cakes-0.001
Her-0.001
 instruments-0.001
Token her
Token ID607
Feature activation-2.08e+00

Pos logit contributions
arter+0.025
 vain+0.025
 fortunate+0.024
die+0.024
 mistake+0.024
Neg logit contributions
aw-0.016
cakes-0.014
boards-0.013
P-0.013
 instruments-0.013
Token name
Token ID1438
Feature activation-2.36e+00

Pos logit contributions
arter+0.038
 vain+0.037
die+0.037
 fortunate+0.036
 mistake+0.036
Neg logit contributions
aw-0.051
P-0.047
cakes-0.047
 instruments-0.047
Her-0.046
Token was
Token ID373
Feature activation-4.05e-01

Pos logit contributions
arter+0.044
 fortunate+0.042
die+0.042
 vain+0.042
ubby+0.041
Neg logit contributions
aw-0.063
cakes-0.056
boards-0.055
P-0.054
 instruments-0.052
Token Gret
Token ID35877
Feature activation-5.64e+00

Pos logit contributions
arter+0.007
 vain+0.007
 fortunate+0.007
die+0.007
 mistake+0.006
Neg logit contributions
aw-0.011
cakes-0.010
boards-0.010
P-0.010
 instruments-0.009
Tokena
Token ID64
Feature activation-2.07e+00

Pos logit contributions
arter+0.133
 vain+0.133
 fortunate+0.130
 tempting+0.124
 ending+0.121
Neg logit contributions
aw-0.116
P-0.110
Her-0.097
She-0.095
talk-0.090
Token.
Token ID13
Feature activation-4.90e-01

Pos logit contributions
arter+0.046
 vain+0.044
 fortunate+0.043
die+0.043
 haven+0.042
Neg logit contributions
aw-0.048
cakes-0.040
P-0.040
Her-0.040
co-0.039
Token Gret
Token ID35877
Feature activation-4.49e+00

Pos logit contributions
arter+0.011
 vain+0.011
 fortunate+0.011
die+0.011
 mistake+0.011
Neg logit contributions
aw-0.010
cakes-0.009
boards-0.009
P-0.009
First-0.008
Tokena
Token ID64
Feature activation-4.03e-01

Pos logit contributions
 vain+0.107
arter+0.106
 fortunate+0.102
 tempting+0.100
 ending+0.098
Neg logit contributions
aw-0.098
P-0.090
She-0.078
Her-0.077
talk-0.075
Token was
Token ID373
Feature activation-4.81e-01

Pos logit contributions
 fortunate+0.006
 vain+0.006
arter+0.006
die+0.006
 remembers+0.006
Neg logit contributions
aw-0.011
cakes-0.011
boards-0.011
P-0.010
First-0.010
Token,
Token ID11
Feature activation-1.34e+00

Pos logit contributions
aw+0.005
cakes+0.005
boards+0.005
P+0.005
 instruments+0.005
Neg logit contributions
arter-0.004
 vain-0.004
die-0.004
 fortunate-0.004
 mistake-0.003
Token "
Token ID366
Feature activation+5.67e-01

Pos logit contributions
 fortunate+0.025
 vain+0.024
 haven+0.024
arter+0.024
 mistake+0.024
Neg logit contributions
aw-0.033
P-0.031
cakes-0.031
boards-0.030
First-0.029
TokenMy
Token ID3666
Feature activation-7.07e-01

Pos logit contributions
aw+0.010
boards+0.009
cakes+0.009
 instruments+0.008
 umb+0.008
Neg logit contributions
 haven-0.014
 fortunate-0.014
die-0.014
 moment-0.014
 hasn-0.013
Token name
Token ID1438
Feature activation-2.29e+00

Pos logit contributions
arter+0.013
 vain+0.013
 fortunate+0.013
die+0.013
 remembers+0.012
Neg logit contributions
aw-0.017
cakes-0.016
 instruments-0.016
boards-0.015
 shorts-0.015
Token is
Token ID318
Feature activation-4.96e-01

Pos logit contributions
arter+0.028
 vain+0.027
 fortunate+0.026
die+0.026
 mistake+0.025
Neg logit contributions
aw-0.072
cakes-0.068
boards-0.067
P-0.066
 instruments-0.065
Token Frankie
Token ID48535
Feature activation-5.63e+00

Pos logit contributions
arter+0.010
 vain+0.010
 fortunate+0.009
 mistake+0.009
die+0.009
Neg logit contributions
aw-0.012
cakes-0.011
P-0.010
boards-0.010
 instruments-0.010
Token.
Token ID13
Feature activation-5.75e-01

Pos logit contributions
arter+0.120
 vain+0.111
 mistake+0.106
 fortunate+0.105
 haven+0.105
Neg logit contributions
aw-0.129
cakes-0.114
P-0.114
 instruments-0.110
boards-0.109
Token What
Token ID1867
Feature activation-6.62e-01

Pos logit contributions
arter+0.010
 vain+0.010
 fortunate+0.010
die+0.010
 haven+0.010
Neg logit contributions
aw-0.015
cakes-0.014
boards-0.013
P-0.013
 instruments-0.013
Token's
Token ID338
Feature activation-4.50e-01

Pos logit contributions
arter+0.012
 vain+0.012
 fortunate+0.012
die+0.012
 mistake+0.012
Neg logit contributions
aw-0.016
cakes-0.015
boards-0.015
P-0.015
First-0.014
Token your
Token ID534
Feature activation-1.37e-01

Pos logit contributions
arter+0.009
 vain+0.009
 fortunate+0.008
die+0.008
 mistake+0.008
Neg logit contributions
aw-0.011
P-0.010
cakes-0.010
 instruments-0.010
boards-0.009
Token name
Token ID1438
Feature activation-2.51e+00

Pos logit contributions
 vain+0.002
arter+0.002
 fortunate+0.002
die+0.002
ubby+0.002
Neg logit contributions
aw-0.004
cakes-0.004
 instruments-0.004
boards-0.004
P-0.003
Token a
Token ID257
Feature activation-9.37e-01

Pos logit contributions
cakes+0.026
boards+0.025
aw+0.025
talk+0.025
co+0.024
Neg logit contributions
die-0.018
 time-0.018
 fortunate-0.017
arter-0.017
dy-0.017
Token little
Token ID1310
Feature activation-9.18e-01

Pos logit contributions
 fault+0.017
 vain+0.017
arter+0.016
ening+0.016
 pays+0.016
Neg logit contributions
aw-0.024
P-0.024
 se-0.023
She-0.020
Her-0.020
Token fish
Token ID5916
Feature activation+5.83e-02

Pos logit contributions
arter+0.015
 vain+0.014
 fortunate+0.014
die+0.014
 mistake+0.014
Neg logit contributions
aw-0.025
cakes-0.023
boards-0.023
P-0.023
 instruments-0.022
Tokeny
Token ID88
Feature activation-1.70e-01

Pos logit contributions
aw+0.001
cakes+0.001
boards+0.001
P+0.001
 instruments+0.001
Neg logit contributions
arter-0.001
 vain-0.001
 fortunate-0.001
die-0.001
 mistake-0.001
Token named
Token ID3706
Feature activation+1.60e-01

Pos logit contributions
 vain+0.004
arter+0.004
 fortunate+0.004
die+0.003
 mistake+0.003
Neg logit contributions
aw-0.004
P-0.003
cakes-0.003
boards-0.003
 instruments-0.003
Token Blue
Token ID4518
Feature activation-5.62e+00

Pos logit contributions
aw+0.003
P+0.002
cakes+0.002
boards+0.002
 se+0.002
Neg logit contributions
 vain-0.004
 mistake-0.004
 fortunate-0.004
 remembers-0.004
arter-0.004
Token.
Token ID13
Feature activation-1.87e+00

Pos logit contributions
 hasn+0.142
 vain+0.140
arter+0.139
 tempting+0.138
 accomplishment+0.137
Neg logit contributions
aw-0.107
P-0.095
amb-0.093
They-0.091
She-0.087
Token Blue
Token ID4518
Feature activation-4.41e+00

Pos logit contributions
arter+0.044
 mistake+0.044
 vain+0.043
ubby+0.040
 fortunate+0.040
Neg logit contributions
aw-0.037
cakes-0.034
First-0.034
boards-0.034
P-0.033
Token loved
Token ID6151
Feature activation-1.03e+00

Pos logit contributions
arter+0.118
 vain+0.117
 accomplishment+0.111
 pleasure+0.110
 solution+0.109
Neg logit contributions
aw-0.084
P-0.077
They-0.076
amb-0.067
She-0.066
Token to
Token ID284
Feature activation-2.02e+00

Pos logit contributions
arter+0.022
 fortunate+0.022
 vain+0.022
die+0.022
 mistake+0.021
Neg logit contributions
aw-0.022
cakes-0.020
boards-0.020
P-0.019
First-0.019
Token swim
Token ID9422
Feature activation-1.12e+00

Pos logit contributions
arter+0.039
 way+0.037
dy+0.036
 mistake+0.035
 remembers+0.035
Neg logit contributions
aw-0.052
cakes-0.049
boards-0.048
First-0.045
P-0.045
Token was
Token ID373
Feature activation+6.79e-01

Pos logit contributions
arter+0.120
 fortunate+0.120
ubby+0.117
 vain+0.116
die+0.114
Neg logit contributions
aw-0.053
boards-0.042
Her-0.040
She-0.039
cakes-0.039
Token a
Token ID257
Feature activation-9.45e-01

Pos logit contributions
cakes+0.017
boards+0.017
aw+0.016
talk+0.016
als+0.015
Neg logit contributions
die-0.012
 time-0.012
dy-0.012
arter-0.012
ubby-0.012
Token little
Token ID1310
Feature activation-8.43e-01

Pos logit contributions
 fault+0.017
 vain+0.017
arter+0.016
ening+0.016
 pays+0.016
Neg logit contributions
aw-0.024
P-0.024
 se-0.023
Her-0.021
She-0.021
Token girl
Token ID2576
Feature activation-1.82e-01

Pos logit contributions
arter+0.014
 vain+0.014
 fortunate+0.013
die+0.013
 mistake+0.013
Neg logit contributions
aw-0.023
cakes-0.021
P-0.021
boards-0.021
 instruments-0.020
Token called
Token ID1444
Feature activation-4.20e-01

Pos logit contributions
arter+0.004
 vain+0.004
 fortunate+0.004
die+0.004
 mistake+0.004
Neg logit contributions
aw-0.003
P-0.003
cakes-0.003
boards-0.003
 instruments-0.003
Token Mill
Token ID9212
Feature activation-5.62e+00

Pos logit contributions
arter+0.007
 vain+0.007
 fortunate+0.007
die+0.007
 mistake+0.007
Neg logit contributions
aw-0.011
cakes-0.010
boards-0.010
P-0.010
First-0.010
Tokenie
Token ID494
Feature activation-2.98e+00

Pos logit contributions
arter+0.143
 vain+0.136
 fortunate+0.132
 haven+0.129
 hasn+0.128
Neg logit contributions
aw-0.107
co-0.097
P-0.094
Her-0.091
She-0.091
Token.
Token ID13
Feature activation-1.70e+00

Pos logit contributions
arter+0.065
 vain+0.064
 fortunate+0.064
die+0.062
 remembers+0.061
Neg logit contributions
aw-0.066
Her-0.056
P-0.056
co-0.055
cakes-0.055
Token One
Token ID1881
Feature activation-1.57e+00

Pos logit contributions
arter+0.045
 vain+0.045
 mistake+0.044
 fortunate+0.044
die+0.044
Neg logit contributions
aw-0.028
cakes-0.025
boards-0.025
First-0.024
P-0.024
Token day
Token ID1110
Feature activation+2.93e-01

Pos logit contributions
 vain+0.022
arter+0.021
 fortunate+0.021
die+0.020
 mistake+0.020
Neg logit contributions
aw-0.047
cakes-0.044
P-0.043
boards-0.043
First-0.041
Token,
Token ID11
Feature activation+2.81e-01

Pos logit contributions
aw+0.008
cakes+0.007
P+0.007
boards+0.007
First+0.007
Neg logit contributions
arter-0.005
die-0.005
 vain-0.005
 mistake-0.005
 fortunate-0.005
Token was
Token ID373
Feature activation+9.27e-01

Pos logit contributions
 fortunate+0.113
arter+0.113
 vain+0.110
ubby+0.109
die+0.107
Neg logit contributions
aw-0.068
boards-0.057
cakes-0.055
Her-0.053
 bars-0.051
Token a
Token ID257
Feature activation-1.05e+00

Pos logit contributions
boards+0.024
cakes+0.024
talk+0.024
aw+0.024
co+0.022
Neg logit contributions
 time-0.016
die-0.015
dy-0.014
 normal-0.014
 lucky-0.014
Token little
Token ID1310
Feature activation-8.37e-01

Pos logit contributions
 fault+0.019
 vain+0.019
ening+0.018
arter+0.018
 pays+0.017
Neg logit contributions
aw-0.027
P-0.026
 se-0.025
Her-0.023
She-0.023
Token girl
Token ID2576
Feature activation-3.31e-01

Pos logit contributions
arter+0.013
 vain+0.013
 fortunate+0.013
die+0.012
 mistake+0.012
Neg logit contributions
aw-0.023
cakes-0.022
boards-0.021
P-0.021
 instruments-0.020
Token called
Token ID1444
Feature activation-6.68e-01

Pos logit contributions
arter+0.008
 vain+0.008
 fortunate+0.008
die+0.008
 mistake+0.008
Neg logit contributions
aw-0.007
cakes-0.006
boards-0.006
P-0.006
 instruments-0.005
Token Abby
Token ID49220
Feature activation-5.61e+00

Pos logit contributions
arter+0.012
 fortunate+0.012
 vain+0.012
die+0.012
 mistake+0.012
Neg logit contributions
aw-0.016
cakes-0.016
boards-0.015
P-0.015
First-0.015
Token.
Token ID13
Feature activation-1.11e+00

Pos logit contributions
arter+0.154
 vain+0.150
 mistake+0.144
 tempting+0.141
 fossil+0.139
Neg logit contributions
aw-0.098
Her-0.088
P-0.087
She-0.083
 bars-0.078
Token She
Token ID1375
Feature activation-2.81e+00

Pos logit contributions
arter+0.027
 fortunate+0.027
 vain+0.026
die+0.026
 haven+0.026
Neg logit contributions
aw-0.021
cakes-0.019
boards-0.019
P-0.018
 instruments-0.018
Token was
Token ID373
Feature activation-3.94e-01

Pos logit contributions
arter+0.053
 vain+0.052
 fortunate+0.050
 mistake+0.050
die+0.049
Neg logit contributions
aw-0.071
P-0.067
cakes-0.065
boards-0.064
 instruments-0.062
Token very
Token ID845
Feature activation-1.84e+00

Pos logit contributions
arter+0.008
die+0.008
 vain+0.008
 mistake+0.008
 fortunate+0.008
Neg logit contributions
aw-0.009
P-0.008
cakes-0.008
 instruments-0.008
boards-0.008
Token happy
Token ID3772
Feature activation+1.50e+00

Pos logit contributions
die+0.033
 mistake+0.032
ening+0.032
 vain+0.032
 fossil+0.031
Neg logit contributions
P-0.044
aw-0.043
Her-0.043
cakes-0.041
 se-0.041
Token\n
Token ID198
Feature activation-5.92e-02

Pos logit contributions
aw+0.017
P+0.016
cakes+0.016
boards+0.015
 instruments+0.015
Neg logit contributions
arter-0.015
die-0.014
 vain-0.014
 fortunate-0.014
dy-0.014
Token\n
Token ID198
Feature activation-2.89e-01

Pos logit contributions
arter+0.001
die+0.001
 vain+0.001
 fortunate+0.001
 haven+0.001
Neg logit contributions
aw-0.002
cakes-0.001
boards-0.001
P-0.001
 instruments-0.001
TokenAfter
Token ID3260
Feature activation+2.36e+00

Pos logit contributions
die+0.006
dy+0.006
 vain+0.005
ubby+0.005
 moment+0.005
Neg logit contributions
aw-0.007
ican-0.006
cakes-0.006
boards-0.006
talk-0.006
Token a
Token ID257
Feature activation+1.31e+00

Pos logit contributions
aw+0.056
P+0.050
talk+0.050
boards+0.049
cakes+0.048
Neg logit contributions
 vain-0.046
arter-0.044
 fortunate-0.043
 mistake-0.043
 way-0.043
Token few
Token ID1178
Feature activation+2.79e+00

Pos logit contributions
cakes+0.023
P+0.022
boards+0.021
First+0.021
aw+0.021
Neg logit contributions
arter-0.034
 vain-0.032
 fortunate-0.032
ubby-0.031
 remembers-0.031
Token minutes
Token ID2431
Feature activation+3.44e+00

Pos logit contributions
aw+0.067
cakes+0.062
boards+0.062
P+0.060
 instruments+0.059
Neg logit contributions
arter-0.055
die-0.052
 haven-0.052
 fortunate-0.052
 vain-0.052
Token,
Token ID11
Feature activation+1.33e+00

Pos logit contributions
aw+0.099
P+0.092
cakes+0.091
boards+0.091
 instruments+0.089
Neg logit contributions
arter-0.049
 vain-0.048
 fortunate-0.046
 haven-0.046
 mistake-0.046
Token a
Token ID257
Feature activation+9.96e-01

Pos logit contributions
aw+0.029
boards+0.026
cakes+0.026
P+0.025
talk+0.024
Neg logit contributions
arter-0.029
 vain-0.029
die-0.029
 fortunate-0.028
dy-0.028
Token big
Token ID1263
Feature activation+1.16e+00

Pos logit contributions
aw+0.028
cakes+0.026
boards+0.026
P+0.026
 instruments+0.025
Neg logit contributions
arter-0.015
 vain-0.014
 fortunate-0.014
die-0.014
 mistake-0.014
Token man
Token ID582
Feature activation+1.57e+00

Pos logit contributions
aw+0.029
cakes+0.027
boards+0.026
P+0.026
 instruments+0.026
Neg logit contributions
arter-0.021
 vain-0.020
 fortunate-0.020
die-0.020
 mistake-0.020
Token dressed
Token ID12049
Feature activation-1.16e+00

Pos logit contributions
aw+0.036
cakes+0.034
P+0.034
First+0.033
boards+0.033
Neg logit contributions
 vain-0.029
arter-0.029
 fortunate-0.029
 mistake-0.029
 haven-0.029
Token said
Token ID531
Feature activation+8.36e-01

Pos logit contributions
ican+0.029
aw+0.028
P+0.028
 umb+0.027
cakes+0.027
Neg logit contributions
 mistake-0.015
 brother-0.015
 thanks-0.015
die-0.015
 way-0.015
Token,
Token ID11
Feature activation-1.14e+00

Pos logit contributions
aw+0.022
cakes+0.021
boards+0.020
P+0.020
 instruments+0.020
Neg logit contributions
arter-0.014
 vain-0.014
 fortunate-0.013
die-0.013
 mistake-0.013
Token handing
Token ID22786
Feature activation+1.10e+00

Pos logit contributions
arter+0.019
 vain+0.018
 fortunate+0.018
die+0.018
 haven+0.017
Neg logit contributions
aw-0.030
cakes-0.028
boards-0.028
P-0.028
 instruments-0.027
Token him
Token ID683
Feature activation+1.35e-01

Pos logit contributions
aw+0.021
cakes+0.019
boards+0.018
P+0.018
 instruments+0.017
Neg logit contributions
arter-0.027
 vain-0.027
 fortunate-0.026
die-0.026
 mistake-0.026
Token a
Token ID257
Feature activation+1.97e-01

Pos logit contributions
aw+0.003
boards+0.003
cakes+0.003
P+0.003
First+0.003
Neg logit contributions
die-0.002
 vain-0.002
dy-0.002
arter-0.002
 fortunate-0.002
Token cup
Token ID6508
Feature activation+3.44e+00

Pos logit contributions
aw+0.004
cakes+0.004
P+0.004
boards+0.004
First+0.004
Neg logit contributions
arter-0.004
 fortunate-0.004
 vain-0.004
die-0.004
 remembers-0.004
Token and
Token ID290
Feature activation-1.76e-01

Pos logit contributions
First+0.093
 instruments+0.093
aw+0.093
P+0.091
ican+0.086
Neg logit contributions
 thanks-0.047
 has-0.046
arter-0.045
 needs-0.045
 mistake-0.044
Token a
Token ID257
Feature activation+1.10e+00

Pos logit contributions
 remembers+0.004
 vain+0.003
die+0.003
 mistake+0.003
 fortunate+0.003
Neg logit contributions
aw-0.004
boards-0.004
cakes-0.004
P-0.004
ican-0.004
Token sa
Token ID473
Feature activation+8.38e-01

Pos logit contributions
aw+0.026
cakes+0.024
boards+0.024
P+0.024
 instruments+0.023
Neg logit contributions
arter-0.021
 vain-0.021
 fortunate-0.021
die-0.020
 mistake-0.020
Tokenucer
Token ID48915
Feature activation+3.30e-01

Pos logit contributions
aw+0.018
cakes+0.017
P+0.017
boards+0.017
talk+0.016
Neg logit contributions
arter-0.018
 fortunate-0.018
 vain-0.018
 mistake-0.017
 way-0.017
Token.
Token ID13
Feature activation-4.60e-01

Pos logit contributions
aw+0.010
P+0.009
First+0.009
cakes+0.009
boards+0.009
Neg logit contributions
arter-0.004
 vain-0.004
 mistake-0.004
 fortunate-0.004
dy-0.004
Tokent
Token ID83
Feature activation+2.20e-01

Pos logit contributions
aw+0.016
cakes+0.015
boards+0.014
P+0.014
Her+0.013
Neg logit contributions
 fortunate-0.026
arter-0.026
die-0.026
 remembers-0.025
 vain-0.025
Token worry
Token ID5490
Feature activation-1.03e+00

Pos logit contributions
aw+0.006
cakes+0.006
boards+0.006
P+0.006
 instruments+0.006
Neg logit contributions
arter-0.003
 vain-0.003
 fortunate-0.003
 mistake-0.003
die-0.003
Token,
Token ID11
Feature activation+2.26e-01

Pos logit contributions
arter+0.016
 vain+0.015
 fortunate+0.015
die+0.015
 mistake+0.015
Neg logit contributions
aw-0.029
cakes-0.027
boards-0.027
P-0.026
 instruments-0.026
Token Susan
Token ID14919
Feature activation-3.34e+00

Pos logit contributions
aw+0.006
P+0.006
boards+0.006
cakes+0.005
talk+0.005
Neg logit contributions
 vain-0.004
 fortunate-0.004
 haven-0.004
 mistake-0.004
 way-0.004
Token.
Token ID13
Feature activation+9.67e-02

Pos logit contributions
arter+0.081
 vain+0.071
die+0.071
 fortunate+0.070
 fossil+0.069
Neg logit contributions
aw-0.073
P-0.059
 bars-0.055
She-0.055
als-0.054
Token We
Token ID775
Feature activation+4.09e+00

Pos logit contributions
aw+0.002
cakes+0.002
boards+0.002
P+0.002
First+0.002
Neg logit contributions
arter-0.002
 vain-0.002
 fortunate-0.002
 mistake-0.002
die-0.002
Token will
Token ID481
Feature activation+1.66e+00

Pos logit contributions
aw+0.118
cakes+0.112
boards+0.110
P+0.108
 instruments+0.106
Neg logit contributions
 vain-0.058
arter-0.057
 haven-0.057
 remembers-0.056
 fortunate-0.056
Token protect
Token ID1805
Feature activation-1.38e-01

Pos logit contributions
aw+0.039
cakes+0.036
boards+0.035
P+0.035
 instruments+0.033
Neg logit contributions
arter-0.033
 vain-0.033
 fortunate-0.032
 mistake-0.032
die-0.032
Token you
Token ID345
Feature activation+1.73e-01

Pos logit contributions
arter+0.003
 vain+0.003
 mistake+0.003
 fortunate+0.003
die+0.002
Neg logit contributions
aw-0.003
boards-0.003
cakes-0.003
P-0.003
First-0.003
Token!
Token ID0
Feature activation+5.87e-01

Pos logit contributions
aw+0.004
cakes+0.004
boards+0.004
P+0.004
First+0.004
Neg logit contributions
arter-0.003
 vain-0.003
 fortunate-0.003
 mistake-0.003
die-0.003
Tokenâ
Token ID22940
Feature activation+1.82e+00

Pos logit contributions
aw+0.015
cakes+0.014
boards+0.014
P+0.013
 instruments+0.013
Neg logit contributions
arter-0.010
 vain-0.010
 fortunate-0.010
die-0.010
 haven-0.010
Token happy
Token ID3772
Feature activation+1.97e+00

Pos logit contributions
aw+0.012
cakes+0.011
P+0.010
 instruments+0.010
First+0.010
Neg logit contributions
arter-0.011
die-0.010
 vain-0.010
 fortunate-0.010
 fossil-0.010
Token and
Token ID290
Feature activation-1.61e+00

Pos logit contributions
aw+0.043
cakes+0.039
boards+0.038
P+0.038
 instruments+0.037
Neg logit contributions
arter-0.043
 vain-0.042
 fortunate-0.042
die-0.041
 mistake-0.041
Token ran
Token ID4966
Feature activation+8.43e-01

Pos logit contributions
 fortunate+0.035
 vain+0.035
arter+0.035
die+0.034
 remembers+0.034
Neg logit contributions
aw-0.034
boards-0.031
cakes-0.031
P-0.030
First-0.029
Token to
Token ID284
Feature activation+7.68e-01

Pos logit contributions
aw+0.020
P+0.018
cakes+0.018
boards+0.017
 instruments+0.017
Neg logit contributions
 vain-0.016
 way-0.016
 mistake-0.016
arter-0.016
die-0.016
Token the
Token ID262
Feature activation+2.32e+00

Pos logit contributions
aw+0.015
cakes+0.015
First+0.014
 instruments+0.014
Her+0.014
Neg logit contributions
arter-0.017
 fortunate-0.017
die-0.017
 vain-0.017
ubby-0.017
Token cup
Token ID6508
Feature activation+3.88e+00

Pos logit contributions
aw+0.053
P+0.046
talk+0.045
First+0.045
 instruments+0.045
Neg logit contributions
 mistake-0.048
arter-0.047
 way-0.047
 vain-0.047
 fortunate-0.045
Tokenboard
Token ID3526
Feature activation+2.15e+00

Pos logit contributions
Run+0.058
 instruments+0.058
aw+0.058
P+0.056
Ag+0.052
Neg logit contributions
 way-0.105
 mistake-0.100
 solution-0.097
arter-0.097
 chance-0.096
Token.
Token ID13
Feature activation+1.85e-01

Pos logit contributions
P+0.071
aw+0.070
ican+0.067
Ta+0.067
Boo+0.066
Neg logit contributions
 way-0.025
 mistake-0.021
 chance-0.019
 size-0.019
 solution-0.019
Token He
Token ID679
Feature activation+9.24e-02

Pos logit contributions
aw+0.004
cakes+0.004
boards+0.004
First+0.004
P+0.004
Neg logit contributions
arter-0.004
 vain-0.004
 fortunate-0.004
die-0.004
 mistake-0.004
Token saw
Token ID2497
Feature activation+1.10e+00

Pos logit contributions
aw+0.002
cakes+0.002
boards+0.002
P+0.002
First+0.002
Neg logit contributions
 vain-0.002
die-0.002
 remembers-0.002
arter-0.002
 haven-0.002
Token the
Token ID262
Feature activation+6.79e-01

Pos logit contributions
aw+0.024
P+0.023
cakes+0.022
boards+0.022
First+0.022
Neg logit contributions
 vain-0.022
arter-0.022
 mistake-0.022
 fortunate-0.022
die-0.021
Token's
Token ID338
Feature activation+2.38e-01

Pos logit contributions
aw+0.014
cakes+0.014
boards+0.013
P+0.012
 instruments+0.012
Neg logit contributions
die-0.013
arter-0.013
 vain-0.013
 mistake-0.013
 fortunate-0.013
Token mom
Token ID1995
Feature activation+7.38e-01

Pos logit contributions
aw+0.007
cakes+0.006
boards+0.006
P+0.006
 instruments+0.006
Neg logit contributions
arter-0.003
 vain-0.003
 fortunate-0.003
die-0.003
 mistake-0.003
Token offered
Token ID4438
Feature activation+1.38e+00

Pos logit contributions
aw+0.016
cakes+0.015
P+0.015
boards+0.014
First+0.014
Neg logit contributions
arter-0.017
 vain-0.016
 fortunate-0.016
die-0.015
 fossil-0.015
Token him
Token ID683
Feature activation-1.30e-01

Pos logit contributions
aw+0.030
P+0.029
cakes+0.028
 instruments+0.028
First+0.027
Neg logit contributions
 way-0.027
 mistake-0.027
 vain-0.027
die-0.027
dy-0.026
Token a
Token ID257
Feature activation-6.91e-01

Pos logit contributions
dy+0.003
die+0.003
 way+0.003
 remembers+0.002
 beloved+0.002
Neg logit contributions
aw-0.003
ican-0.003
boards-0.003
co-0.003
talk-0.003
Token quiz
Token ID38964
Feature activation+3.42e+00

Pos logit contributions
arter+0.013
 vain+0.012
 fortunate+0.012
die+0.012
 remembers+0.012
Neg logit contributions
aw-0.017
cakes-0.016
P-0.016
boards-0.016
 instruments-0.015
Token.
Token ID13
Feature activation+1.02e-01

Pos logit contributions
aw+0.095
ican+0.088
P+0.087
First+0.087
Ta+0.086
Neg logit contributions
 way-0.048
 mistake-0.047
 chance-0.046
 fossil-0.045
 solution-0.044
Token She
Token ID1375
Feature activation-6.11e-01

Pos logit contributions
aw+0.003
P+0.002
 instruments+0.002
talk+0.002
boards+0.002
Neg logit contributions
arter-0.002
 vain-0.002
 haven-0.002
 fortunate-0.002
 mistake-0.002
Token said
Token ID531
Feature activation-3.74e-01

Pos logit contributions
die+0.012
 remembers+0.012
 mistake+0.012
 vain+0.011
dy+0.011
Neg logit contributions
aw-0.015
boards-0.013
P-0.013
ican-0.013
 instruments-0.013
Token,
Token ID11
Feature activation-1.19e+00

Pos logit contributions
 vain+0.004
 fortunate+0.004
arter+0.004
 remembers+0.004
 mistake+0.004
Neg logit contributions
aw-0.012
P-0.011
First-0.011
talk-0.011
 instruments-0.011
Token "
Token ID366
Feature activation+1.64e+00

Pos logit contributions
arter+0.015
 vain+0.015
 fortunate+0.015
die+0.014
 mistake+0.014
Neg logit contributions
aw-0.036
cakes-0.034
boards-0.033
P-0.033
 instruments-0.033
Tokenhouse
Token ID4803
Feature activation+7.15e-01

Pos logit contributions
aw+0.017
cakes+0.016
boards+0.016
P+0.016
 instruments+0.015
Neg logit contributions
arter-0.015
 vain-0.015
 fortunate-0.015
die-0.015
 mistake-0.015
Token was
Token ID373
Feature activation+9.60e-01

Pos logit contributions
aw+0.018
cakes+0.017
P+0.017
boards+0.017
First+0.016
Neg logit contributions
arter-0.012
 vain-0.012
 fortunate-0.012
 mistake-0.012
die-0.012
Token ready
Token ID3492
Feature activation+1.02e-01

Pos logit contributions
aw+0.021
cakes+0.019
P+0.018
boards+0.018
First+0.018
Neg logit contributions
arter-0.021
die-0.020
 fortunate-0.020
 vain-0.020
dy-0.019
Token.
Token ID13
Feature activation+7.70e-01

Pos logit contributions
aw+0.002
cakes+0.002
boards+0.002
P+0.002
 instruments+0.002
Neg logit contributions
arter-0.002
 vain-0.002
 fortunate-0.002
die-0.002
 mistake-0.002
Token
Token ID220
Feature activation-1.28e+00

Pos logit contributions
aw+0.018
cakes+0.017
boards+0.017
P+0.017
First+0.016
Neg logit contributions
arter-0.015
 vain-0.015
 mistake-0.014
 fortunate-0.014
die-0.014
Token\n
Token ID198
Feature activation+6.20e-01

Pos logit contributions
arter+0.027
 vain+0.027
 fortunate+0.027
 mistake+0.027
 remembers+0.026
Neg logit contributions
aw-0.029
cakes-0.027
First-0.026
boards-0.026
P-0.025
Token\n
Token ID198
Feature activation+4.38e-01

Pos logit contributions
aw+0.015
cakes+0.014
P+0.014
boards+0.014
First+0.014
Neg logit contributions
arter-0.012
 vain-0.011
 fortunate-0.011
 mistake-0.011
die-0.011
TokenThe
Token ID464
Feature activation-8.92e-01

Pos logit contributions
aw+0.009
 instruments+0.008
 bars+0.008
boards+0.008
talk+0.008
Neg logit contributions
die-0.010
 fortunate-0.010
dy-0.009
 tempting-0.009
ubby-0.009
Token two
Token ID734
Feature activation+1.92e+00

Pos logit contributions
arter+0.016
 vain+0.016
 fortunate+0.015
die+0.015
 haven+0.015
Neg logit contributions
aw-0.023
cakes-0.021
P-0.021
boards-0.021
 instruments-0.020
Token of
Token ID286
Feature activation+2.82e-01

Pos logit contributions
aw+0.043
cakes+0.040
boards+0.040
P+0.039
 instruments+0.038
Neg logit contributions
arter-0.038
 vain-0.038
 fortunate-0.038
die-0.037
 mistake-0.037
Token them
Token ID606
Feature activation+1.84e+00

Pos logit contributions
aw+0.005
cakes+0.004
boards+0.004
P+0.004
 instruments+0.004
Neg logit contributions
arter-0.007
 vain-0.007
 fortunate-0.007
die-0.007
 mistake-0.007
Token "
Token ID366
Feature activation+1.64e+00

Pos logit contributions
 fortunate+0.008
die+0.008
 haven+0.008
 vain+0.008
arter+0.008
Neg logit contributions
aw-0.023
cakes-0.022
P-0.022
boards-0.022
 instruments-0.021
TokenThank
Token ID10449
Feature activation+1.61e+00

Pos logit contributions
aw+0.031
boards+0.030
 instruments+0.029
cakes+0.029
 bars+0.028
Neg logit contributions
die-0.038
 moment-0.037
 haven-0.036
 fortunate-0.035
dy-0.034
Token you
Token ID345
Feature activation-6.01e-01

Pos logit contributions
boards+0.042
aw+0.040
talk+0.040
First+0.038
 bars+0.038
Neg logit contributions
 thanks-0.023
ubby-0.023
lad-0.022
 true-0.022
 welcome-0.022
Token p
Token ID279
Feature activation+1.68e-01

Pos logit contributions
 mistake+0.008
 vain+0.008
die+0.008
arter+0.008
dy+0.007
Neg logit contributions
aw-0.018
boards-0.017
 instruments-0.017
cakes-0.017
P-0.017
Tokenuddle
Token ID24500
Feature activation-5.07e-01

Pos logit contributions
aw+0.005
P+0.005
boards+0.005
First+0.005
 instruments+0.005
Neg logit contributions
arter-0.002
ubby-0.002
 vain-0.002
 fortunate-0.002
 haven-0.002
Token,
Token ID11
Feature activation+5.20e-01

Pos logit contributions
arter+0.007
 mistake+0.007
dy+0.007
 way+0.006
die+0.006
Neg logit contributions
aw-0.014
First-0.014
talk-0.014
P-0.014
boards-0.013
Token for
Token ID329
Feature activation+9.73e-01

Pos logit contributions
aw+0.014
boards+0.013
P+0.013
cakes+0.013
 instruments+0.012
Neg logit contributions
arter-0.009
 vain-0.009
 haven-0.009
die-0.009
 fortunate-0.009
Token glowing
Token ID21377
Feature activation-1.05e+00

Pos logit contributions
aw+0.026
boards+0.024
cakes+0.024
First+0.023
P+0.023
Neg logit contributions
arter-0.016
 mistake-0.016
 vain-0.016
 fortunate-0.015
die-0.015
Token so
Token ID523
Feature activation-3.61e-01

Pos logit contributions
arter+0.017
 way+0.017
 fortunate+0.017
die+0.017
 mistake+0.017
Neg logit contributions
aw-0.027
P-0.025
First-0.025
boards-0.025
talk-0.025
Token brightly
Token ID35254
Feature activation-2.31e+00

Pos logit contributions
arter+0.008
 vain+0.008
 fortunate+0.008
 haven+0.007
 mistake+0.007
Neg logit contributions
aw-0.008
cakes-0.007
boards-0.007
P-0.007
talk-0.007
Token!"
Token ID2474
Feature activation+2.21e+00

Pos logit contributions
arter+0.039
 vain+0.039
 fortunate+0.038
die+0.037
 mistake+0.037
Neg logit contributions
aw-0.061
cakes-0.057
boards-0.056
P-0.055
 instruments-0.054
Token\n
Token ID198
Feature activation+5.79e-01

Pos logit contributions
aw+0.015
cakes+0.014
boards+0.014
 instruments+0.014
P+0.013
Neg logit contributions
die-0.008
 vain-0.007
 haven-0.007
 fortunate-0.007
dy-0.007
TokenAfter
Token ID3260
Feature activation+1.98e+00

Pos logit contributions
boards+0.011
cakes+0.011
als+0.011
aw+0.011
 umb+0.011
Neg logit contributions
die-0.014
dy-0.013
 haven-0.012
 vain-0.012
 moment-0.012
Token the
Token ID262
Feature activation+2.88e-01

Pos logit contributions
aw+0.050
cakes+0.049
boards+0.048
talk+0.048
First+0.047
Neg logit contributions
 vain-0.033
 way-0.032
 ending-0.031
 time-0.031
arter-0.031
Token lion
Token ID18744
Feature activation+1.64e+00

Pos logit contributions
aw+0.007
cakes+0.007
boards+0.007
P+0.007
First+0.006
Neg logit contributions
arter-0.005
 vain-0.005
 mistake-0.005
 fortunate-0.005
die-0.005
Token left
Token ID1364
Feature activation-1.20e-01

Pos logit contributions
aw+0.041
cakes+0.037
boards+0.037
P+0.036
 instruments+0.035
Neg logit contributions
arter-0.031
 vain-0.030
 fortunate-0.029
die-0.029
 mistake-0.029
Token,
Token ID11
Feature activation+1.16e+00

Pos logit contributions
arter+0.002
die+0.002
 vain+0.002
 fortunate+0.002
dy+0.002
Neg logit contributions
aw-0.003
cakes-0.003
boards-0.003
 instruments-0.003
P-0.003
Token everyone
Token ID2506
Feature activation-6.29e-01

Pos logit contributions
aw+0.029
cakes+0.027
boards+0.027
P+0.026
First+0.026
Neg logit contributions
arter-0.021
 vain-0.020
 fortunate-0.020
die-0.020
 mistake-0.020
Token in
Token ID287
Feature activation-1.33e+00

Pos logit contributions
arter+0.012
 vain+0.012
 fortunate+0.012
die+0.012
 mistake+0.012
Neg logit contributions
aw-0.015
cakes-0.014
boards-0.013
P-0.013
First-0.013
Token Jimmy
Token ID12963
Feature activation-6.48e-01

Pos logit contributions
arter+0.033
lad+0.033
ccoli+0.032
dy+0.032
die+0.032
Neg logit contributions
aw-0.024
 instruments-0.021
 se-0.019
P-0.019
Her-0.019
Token's
Token ID338
Feature activation+1.31e+00

Pos logit contributions
arter+0.016
 vain+0.015
 fortunate+0.015
die+0.014
 mistake+0.014
Neg logit contributions
aw-0.013
boards-0.012
cakes-0.011
 instruments-0.011
P-0.011
Token group
Token ID1448
Feature activation+2.98e+00

Pos logit contributions
aw+0.030
 instruments+0.027
boards+0.026
 bars+0.026
cakes+0.026
Neg logit contributions
arter-0.027
 vain-0.026
 haven-0.026
 fortunate-0.026
die-0.025
Token a
Token ID257
Feature activation-7.69e-01

Pos logit contributions
cakes+0.030
aw+0.029
boards+0.029
ican+0.028
talk+0.027
Neg logit contributions
 way-0.021
 time-0.020
 vain-0.020
 lucky-0.020
die-0.020
Token tall
Token ID7331
Feature activation+1.44e+00

Pos logit contributions
arter+0.013
 fortunate+0.013
 vain+0.013
ening+0.012
die+0.012
Neg logit contributions
aw-0.019
P-0.019
 instruments-0.018
cakes-0.018
 se-0.017
Token dad
Token ID9955
Feature activation-6.86e-02

Pos logit contributions
aw+0.035
cakes+0.033
boards+0.032
P+0.032
First+0.032
Neg logit contributions
arter-0.028
 mistake-0.027
 vain-0.027
 fortunate-0.027
 remembers-0.026
Token who
Token ID508
Feature activation-2.72e+00

Pos logit contributions
arter+0.002
 vain+0.002
die+0.002
 fortunate+0.002
 mistake+0.002
Neg logit contributions
aw-0.001
P-0.001
cakes-0.001
boards-0.001
 instruments-0.001
Token wore
Token ID12408
Feature activation-8.18e-01

Pos logit contributions
arter+0.059
 vain+0.056
 mistake+0.056
die+0.056
 fortunate+0.055
Neg logit contributions
aw-0.062
cakes-0.056
boards-0.055
P-0.055
 instruments-0.054
Token big
Token ID1263
Feature activation+1.98e+00

Pos logit contributions
 vain+0.017
arter+0.017
 fortunate+0.017
die+0.017
 mistake+0.017
Neg logit contributions
aw-0.017
cakes-0.016
boards-0.016
P-0.015
First-0.015
Token boots
Token ID14412
Feature activation+1.66e+00

Pos logit contributions
cakes+0.046
aw+0.045
First+0.043
talk+0.042
ican+0.041
Neg logit contributions
 way-0.040
 remembers-0.038
 problem-0.038
 mistake-0.037
 fortunate-0.037
Token.
Token ID13
Feature activation-4.67e-01

Pos logit contributions
P+0.052
aw+0.052
Run+0.052
Boo+0.052
Ta+0.052
Neg logit contributions
 way-0.016
 chance-0.014
 thanks-0.013
 size-0.013
 time-0.013
Token One
Token ID1881
Feature activation-8.32e-01

Pos logit contributions
 vain+0.012
 fortunate+0.012
 remembers+0.011
arter+0.011
 mistake+0.011
Neg logit contributions
aw-0.009
cakes-0.008
boards-0.008
First-0.008
 bars-0.007
Token day
Token ID1110
Feature activation+1.14e+00

Pos logit contributions
 vain+0.012
arter+0.011
 fortunate+0.011
die+0.011
 mistake+0.011
Neg logit contributions
aw-0.025
P-0.023
cakes-0.023
boards-0.022
First-0.022
Token,
Token ID11
Feature activation+2.01e+00

Pos logit contributions
aw+0.027
boards+0.025
cakes+0.025
P+0.025
First+0.023
Neg logit contributions
die-0.024
 fortunate-0.023
arter-0.023
 mistake-0.022
 vain-0.022
Token have
Token ID423
Feature activation+1.87e+00

Pos logit contributions
aw+0.061
cakes+0.057
boards+0.056
First+0.055
 bars+0.055
Neg logit contributions
arter-0.036
die-0.035
 fortunate-0.034
 vain-0.033
 mistake-0.032
Token to
Token ID284
Feature activation-3.69e-01

Pos logit contributions
aw+0.034
cakes+0.030
P+0.029
boards+0.029
 instruments+0.028
Neg logit contributions
arter-0.049
 fortunate-0.046
 vain-0.046
die-0.046
ubby-0.045
Token wash
Token ID13502
Feature activation-5.42e-01

Pos logit contributions
 mistake+0.007
arter+0.007
 vain+0.007
 way+0.007
 fortunate+0.007
Neg logit contributions
aw-0.009
boards-0.009
cakes-0.009
P-0.008
 instruments-0.008
Token your
Token ID534
Feature activation+2.00e-01

Pos logit contributions
arter+0.012
 fortunate+0.012
 vain+0.011
 haven+0.011
die+0.011
Neg logit contributions
aw-0.012
P-0.011
cakes-0.010
 instruments-0.010
boards-0.010
Token hands
Token ID2832
Feature activation+1.09e+00

Pos logit contributions
aw+0.004
 instruments+0.003
 bars+0.003
P+0.003
cakes+0.003
Neg logit contributions
arter-0.005
 haven-0.005
 vain-0.005
 fortunate-0.005
die-0.005
Token."
Token ID526
Feature activation+1.44e+00

Pos logit contributions
aw+0.029
cakes+0.027
boards+0.027
P+0.027
 instruments+0.026
Neg logit contributions
arter-0.017
 vain-0.017
 fortunate-0.017
die-0.017
 mistake-0.017
Token Tim
Token ID5045
Feature activation+1.03e+00

Pos logit contributions
aw+0.037
P+0.034
boards+0.034
 instruments+0.034
cakes+0.034
Neg logit contributions
arter-0.024
 haven-0.024
 vain-0.023
die-0.023
 fortunate-0.023
Token was
Token ID373
Feature activation+5.18e-01

Pos logit contributions
aw+0.027
cakes+0.025
boards+0.025
P+0.024
 instruments+0.024
Neg logit contributions
die-0.018
 vain-0.017
 mistake-0.017
dy-0.017
 remembers-0.017
Token so
Token ID523
Feature activation-1.58e-01

Pos logit contributions
aw+0.011
Her+0.010
cakes+0.010
She+0.010
First+0.010
Neg logit contributions
arter-0.012
die-0.011
 fossil-0.010
 vain-0.010
ubby-0.010
Token happy
Token ID3772
Feature activation+1.97e+00

Pos logit contributions
arter+0.003
 fossil+0.003
die+0.003
 vain+0.003
 mistake+0.003
Neg logit contributions
aw-0.004
Her-0.003
P-0.003
cakes-0.003
She-0.003
Token,
Token ID11
Feature activation+6.28e-01

Pos logit contributions
aw+0.048
cakes+0.044
P+0.044
boards+0.043
First+0.043
Neg logit contributions
arter-0.037
 mistake-0.036
 vain-0.036
 fortunate-0.036
die-0.035
Token made
Token ID925
Feature activation+2.46e+00

Pos logit contributions
aw+0.059
cakes+0.057
boards+0.055
P+0.053
ican+0.052
Neg logit contributions
 vain-0.045
die-0.043
 way-0.043
 fortunate-0.043
 mistake-0.043
Token the
Token ID262
Feature activation-1.33e+00

Pos logit contributions
aw+0.055
cakes+0.053
boards+0.052
P+0.051
First+0.050
Neg logit contributions
 mistake-0.049
arter-0.049
 vain-0.048
 fortunate-0.048
die-0.048
Token fire
Token ID2046
Feature activation-3.76e-01

Pos logit contributions
 vain+0.029
die+0.029
 haven+0.029
arter+0.028
dy+0.026
Neg logit contributions
aw-0.028
 bars-0.025
cakes-0.025
 instruments-0.024
boards-0.024
Token shrink
Token ID22085
Feature activation+1.18e-01

Pos logit contributions
arter+0.007
 vain+0.007
die+0.007
 fortunate+0.007
 haven+0.007
Neg logit contributions
aw-0.009
cakes-0.008
boards-0.008
 instruments-0.008
P-0.008
Token to
Token ID284
Feature activation-4.22e-01

Pos logit contributions
aw+0.003
boards+0.002
cakes+0.002
 instruments+0.002
P+0.002
Neg logit contributions
arter-0.002
 vain-0.002
die-0.002
 haven-0.002
 fortunate-0.002
Token a
Token ID257
Feature activation+2.17e-01

Pos logit contributions
 vain+0.009
 fortunate+0.009
die+0.009
arter+0.009
 remembers+0.008
Neg logit contributions
aw-0.009
cakes-0.008
P-0.008
boards-0.008
 instruments-0.008
Token perfect
Token ID2818
Feature activation+8.43e-01

Pos logit contributions
aw+0.005
 instruments+0.005
cakes+0.005
boards+0.005
P+0.005
Neg logit contributions
arter-0.004
 vain-0.004
ubby-0.004
 remembers-0.004
 fortunate-0.004
Token size
Token ID2546
Feature activation+9.84e-01

Pos logit contributions
aw+0.023
cakes+0.021
 instruments+0.021
boards+0.021
P+0.020
Neg logit contributions
 vain-0.013
arter-0.013
 fortunate-0.013
die-0.012
 remembers-0.012
Token and
Token ID290
Feature activation-8.46e-01

Pos logit contributions
aw+0.024
boards+0.022
cakes+0.022
 instruments+0.021
P+0.020
Neg logit contributions
arter-0.019
die-0.019
 vain-0.018
 fortunate-0.018
 haven-0.018
Token it
Token ID340
Feature activation-1.82e+00

Pos logit contributions
arter+0.017
 vain+0.017
 fortunate+0.017
die+0.016
 mistake+0.016
Neg logit contributions
aw-0.019
cakes-0.018
boards-0.017
P-0.017
 instruments-0.017
Token shone
Token ID44193
Feature activation-2.17e+00

Pos logit contributions
arter+0.037
 vain+0.036
 fortunate+0.035
die+0.035
ubby+0.035
Neg logit contributions
aw-0.042
cakes-0.039
boards-0.038
P-0.037
 instruments-0.037
Tokenirled
Token ID49376
Feature activation+9.49e-01

Pos logit contributions
boards+0.007
cakes+0.007
aw+0.007
First+0.006
talk+0.006
Neg logit contributions
dy-0.004
ubby-0.004
die-0.004
arter-0.004
inkle-0.004
Token all
Token ID477
Feature activation-2.67e+00

Pos logit contributions
cakes+0.025
aw+0.024
boards+0.024
P+0.023
First+0.023
Neg logit contributions
die-0.015
dy-0.015
 way-0.014
ening-0.014
 thanks-0.014
Token around
Token ID1088
Feature activation-1.78e+00

Pos logit contributions
arter+0.059
 vain+0.056
 mistake+0.056
 fortunate+0.054
die+0.054
Neg logit contributions
aw-0.061
P-0.054
cakes-0.054
First-0.052
 instruments-0.051
Token the
Token ID262
Feature activation+7.54e-02

Pos logit contributions
arter+0.033
 vain+0.032
 fortunate+0.032
die+0.032
 mistake+0.031
Neg logit contributions
aw-0.044
cakes-0.041
boards-0.040
P-0.040
 instruments-0.039
Token garden
Token ID11376
Feature activation-1.56e+00

Pos logit contributions
aw+0.002
P+0.001
boards+0.001
cakes+0.001
 instruments+0.001
Neg logit contributions
arter-0.002
 vain-0.002
 fortunate-0.002
 haven-0.002
die-0.002
Token in
Token ID287
Feature activation-1.16e+00

Pos logit contributions
 vain+0.022
 way+0.022
arter+0.022
 fortunate+0.021
 mistake+0.021
Neg logit contributions
aw-0.044
P-0.042
cakes-0.041
First-0.041
boards-0.040
Token gratitude
Token ID24083
Feature activation-7.32e-02

Pos logit contributions
arter+0.027
ccoli+0.026
 fortunate+0.026
 remembers+0.026
die+0.025
Neg logit contributions
aw-0.024
P-0.021
 instruments-0.020
First-0.020
Her-0.020
Token.
Token ID13
Feature activation+1.27e+00

Pos logit contributions
arter+0.001
 vain+0.001
 fortunate+0.001
die+0.001
 mistake+0.001
Neg logit contributions
aw-0.002
cakes-0.002
boards-0.002
P-0.002
 instruments-0.002
Token The
Token ID383
Feature activation-8.85e-01

Pos logit contributions
aw+0.029
cakes+0.027
boards+0.027
P+0.026
First+0.025
Neg logit contributions
arter-0.026
 vain-0.025
 fortunate-0.025
 mistake-0.025
die-0.025
Token rabbit
Token ID22746
Feature activation+7.09e-01

Pos logit contributions
arter+0.016
 vain+0.016
 fortunate+0.015
die+0.015
 haven+0.015
Neg logit contributions
aw-0.022
cakes-0.021
boards-0.020
P-0.020
 instruments-0.020
Token thanked
Token ID26280
Feature activation-1.37e+00

Pos logit contributions
aw+0.018
cakes+0.017
boards+0.017
P+0.016
First+0.016
Neg logit contributions
die-0.012
 remembers-0.012
 way-0.012
 fortunate-0.012
dy-0.012
Token was
Token ID373
Feature activation+1.01e+00

Pos logit contributions
aw+0.007
cakes+0.006
boards+0.006
P+0.006
First+0.006
Neg logit contributions
arter-0.005
 vain-0.005
 fortunate-0.005
die-0.005
 mistake-0.005
Token no
Token ID645
Feature activation+1.80e+00

Pos logit contributions
aw+0.024
P+0.022
cakes+0.022
Her+0.021
She+0.020
Neg logit contributions
arter-0.020
 vain-0.019
die-0.018
 fossil-0.018
 fortunate-0.018
Token longer
Token ID2392
Feature activation-1.78e+00

Pos logit contributions
aw+0.023
P+0.019
 instruments+0.018
First+0.018
 shorts+0.016
Neg logit contributions
arter-0.056
 fossil-0.054
die-0.054
 haven-0.053
 hasn-0.053
Token embarrassed
Token ID21100
Feature activation+8.79e-01

Pos logit contributions
arter+0.029
 fossil+0.026
die+0.026
 haven+0.025
ccoli+0.025
Neg logit contributions
aw-0.048
 instruments-0.043
P-0.043
 se-0.042
 shorts-0.041
Token.
Token ID13
Feature activation+1.75e+00

Pos logit contributions
aw+0.021
cakes+0.019
boards+0.019
 instruments+0.018
P+0.018
Neg logit contributions
arter-0.017
die-0.017
 vain-0.017
 fortunate-0.016
 haven-0.016
Token The
Token ID383
Feature activation-9.59e-01

Pos logit contributions
aw+0.036
cakes+0.033
Her+0.031
First+0.031
boards+0.031
Neg logit contributions
arter-0.042
 mistake-0.041
 vain-0.040
 fortunate-0.039
ubby-0.038
Token friends
Token ID2460
Feature activation+3.22e+00

Pos logit contributions
arter+0.019
 haven+0.018
 vain+0.017
 fortunate+0.017
die+0.017
Neg logit contributions
aw-0.024
P-0.022
cakes-0.022
 instruments-0.021
First-0.021
Token had
Token ID550
Feature activation+2.50e+00

Pos logit contributions
aw+0.072
boards+0.070
cakes+0.070
P+0.068
First+0.067
Neg logit contributions
 remembers-0.064
 haven-0.063
 fortunate-0.062
 vain-0.062
 thanks-0.061
Token all
Token ID477
Feature activation-2.38e-01

Pos logit contributions
aw+0.054
cakes+0.050
boards+0.049
P+0.048
 instruments+0.046
Neg logit contributions
 vain-0.054
 fortunate-0.053
die-0.052
arter-0.052
 mistake-0.052
Token worked
Token ID3111
Feature activation-1.25e-01

Pos logit contributions
arter+0.004
 vain+0.004
 fortunate+0.004
die+0.004
 mistake+0.004
Neg logit contributions
aw-0.006
cakes-0.005
boards-0.005
P-0.005
 instruments-0.005
Token together
Token ID1978
Feature activation+7.23e-01

Pos logit contributions
arter+0.004
 haven+0.003
 fortunate+0.003
 vain+0.003
die+0.003
Neg logit contributions
aw-0.002
P-0.002
cakes-0.002
boards-0.002
 instruments-0.002
Token wanted
Token ID2227
Feature activation+4.50e-01

Pos logit contributions
aw+0.008
cakes+0.007
boards+0.007
P+0.007
 instruments+0.007
Neg logit contributions
arter-0.008
 vain-0.008
 fortunate-0.008
die-0.008
 mistake-0.008
Token to
Token ID284
Feature activation-1.02e+00

Pos logit contributions
aw+0.010
cakes+0.009
boards+0.009
P+0.009
First+0.008
Neg logit contributions
arter-0.010
 vain-0.010
 mistake-0.010
die-0.009
 fortunate-0.009
Token play
Token ID711
Feature activation-1.47e+00

Pos logit contributions
 way+0.018
arter+0.017
 mistake+0.017
 haven+0.016
 thanks+0.016
Neg logit contributions
aw-0.029
boards-0.028
P-0.025
cakes-0.025
First-0.024
Token outside
Token ID2354
Feature activation-2.00e+00

Pos logit contributions
 mistake+0.020
 time+0.020
arter+0.020
 normal+0.020
 way+0.019
Neg logit contributions
talk-0.040
aw-0.039
boards-0.039
First-0.039
cakes-0.039
Token again
Token ID757
Feature activation-2.13e+00

Pos logit contributions
 mistake+0.025
 way+0.025
 vain+0.025
arter+0.024
 thanks+0.023
Neg logit contributions
aw-0.060
P-0.057
cakes-0.057
First-0.056
Ta-0.056
Token.
Token ID13
Feature activation-2.71e-01

Pos logit contributions
arter+0.040
 vain+0.039
 fortunate+0.039
die+0.038
 haven+0.037
Neg logit contributions
aw-0.052
P-0.048
cakes-0.047
boards-0.046
First-0.046
Token But
Token ID887
Feature activation-8.58e-02

Pos logit contributions
arter+0.004
 vain+0.004
 fortunate+0.004
die+0.004
 mistake+0.004
Neg logit contributions
aw-0.007
cakes-0.007
P-0.007
boards-0.007
 instruments-0.006
Token he
Token ID339
Feature activation-4.76e-01

Pos logit contributions
arter+0.002
 fortunate+0.002
 vain+0.002
die+0.002
ubby+0.002
Neg logit contributions
aw-0.002
cakes-0.002
P-0.002
Her-0.002
boards-0.002
Token didn
Token ID1422
Feature activation+2.24e+00

Pos logit contributions
arter+0.011
 fortunate+0.010
 vain+0.009
ubby+0.009
ening+0.009
Neg logit contributions
aw-0.011
P-0.010
co-0.009
boards-0.009
cakes-0.009
Token't
Token ID470
Feature activation+2.93e-01

Pos logit contributions
aw+0.059
boards+0.051
 instruments+0.051
talk+0.051
cakes+0.051
Neg logit contributions
arter-0.040
 vain-0.039
 mistake-0.039
 haven-0.038
ubby-0.036
Token want
Token ID765
Feature activation+3.80e-01

Pos logit contributions
aw+0.006
P+0.006
First+0.006
talk+0.005
Her+0.005
Neg logit contributions
arter-0.006
die-0.006
 vain-0.006
 fortunate-0.006
 hasn-0.006
Token soaked
Token ID33657
Feature activation+4.86e-01

Pos logit contributions
aw+0.014
P+0.013
cakes+0.013
boards+0.012
talk+0.012
Neg logit contributions
 haven-0.011
 way-0.010
 vain-0.010
arter-0.010
 mistake-0.010
Token the
Token ID262
Feature activation-2.76e-01

Pos logit contributions
aw+0.012
cakes+0.011
boards+0.011
P+0.011
First+0.011
Neg logit contributions
arter-0.009
 vain-0.009
 fortunate-0.009
die-0.009
 mistake-0.009
Token ship
Token ID4074
Feature activation-6.92e-01

Pos logit contributions
arter+0.007
 vain+0.007
 fortunate+0.007
die+0.007
 mistake+0.007
Neg logit contributions
aw-0.005
cakes-0.004
boards-0.004
P-0.004
 instruments-0.004
Token and
Token ID290
Feature activation-7.54e-01

Pos logit contributions
arter+0.014
 vain+0.014
 fortunate+0.014
die+0.014
 mistake+0.013
Neg logit contributions
aw-0.016
cakes-0.015
boards-0.015
 instruments-0.014
P-0.014
Token made
Token ID925
Feature activation+2.58e+00

Pos logit contributions
 vain+0.012
arter+0.012
 fortunate+0.012
 mistake+0.012
 way+0.012
Neg logit contributions
aw-0.020
cakes-0.019
P-0.019
boards-0.018
First-0.018
Token it
Token ID340
Feature activation-6.39e-01

Pos logit contributions
aw+0.060
cakes+0.058
P+0.058
boards+0.057
talk+0.055
Neg logit contributions
 mistake-0.047
arter-0.047
 vain-0.047
 fortunate-0.046
 way-0.046
Token all
Token ID477
Feature activation-1.97e+00

Pos logit contributions
arter+0.010
 vain+0.010
 mistake+0.010
 fortunate+0.010
 haven+0.009
Neg logit contributions
aw-0.018
P-0.016
cakes-0.016
boards-0.016
First-0.016
Token wet
Token ID9583
Feature activation+3.81e-01

Pos logit contributions
arter+0.038
 vain+0.037
die+0.037
 fortunate+0.036
 remembers+0.035
Neg logit contributions
aw-0.047
cakes-0.044
 instruments-0.042
boards-0.042
P-0.042
Token.
Token ID13
Feature activation-3.09e-01

Pos logit contributions
aw+0.011
cakes+0.011
P+0.010
First+0.010
boards+0.010
Neg logit contributions
arter-0.005
 vain-0.005
 fortunate-0.005
 mistake-0.005
 way-0.005
Token But
Token ID887
Feature activation+1.38e-01

Pos logit contributions
 haven+0.004
die+0.004
 fortunate+0.004
 hasn+0.004
arter+0.004
Neg logit contributions
aw-0.009
cakes-0.008
boards-0.008
P-0.008
 instruments-0.008
Token then
Token ID788
Feature activation-4.33e-02

Pos logit contributions
aw+0.002
P+0.002
cakes+0.002
boards+0.002
 instruments+0.002
Neg logit contributions
arter-0.004
 vain-0.004
die-0.003
 fortunate-0.003
 mistake-0.003
TokenOnce
Token ID7454
Feature activation-1.18e+00

Pos logit contributions
arter+0.015
 vain+0.015
 fortunate+0.015
 mistake+0.015
 remembers+0.014
Neg logit contributions
aw-0.013
P-0.012
boards-0.011
cakes-0.011
First-0.010
Token upon
Token ID2402
Feature activation+7.10e-01

Pos logit contributions
 fossil+0.026
 fortunate+0.026
 vain+0.025
 size+0.025
ening+0.025
Neg logit contributions
Her-0.022
aw-0.021
She-0.020
P-0.020
cakes-0.019
Token a
Token ID257
Feature activation+1.19e-01

Pos logit contributions
cakes+0.016
aw+0.016
boards+0.016
talk+0.015
ican+0.015
Neg logit contributions
die-0.014
 vain-0.013
 haven-0.013
 fortunate-0.013
arter-0.013
Token time
Token ID640
Feature activation-5.93e-01

Pos logit contributions
aw+0.004
P+0.004
First+0.003
co+0.003
So+0.003
Neg logit contributions
arter-0.001
 fortunate-0.001
 vain-0.001
 remembers-0.001
 haven-0.001
Token,
Token ID11
Feature activation-2.76e-01

Pos logit contributions
die+0.014
 fortunate+0.014
 vain+0.014
arter+0.014
 mistake+0.013
Neg logit contributions
aw-0.011
P-0.011
Her-0.010
boards-0.010
cakes-0.010
Token there
Token ID612
Feature activation-3.90e+00

Pos logit contributions
 fortunate+0.008
 mistake+0.008
arter+0.008
 vain+0.008
 fault+0.008
Neg logit contributions
P-0.004
aw-0.004
Her-0.003
She-0.003
They-0.003
Token was
Token ID373
Feature activation+1.02e+00

Pos logit contributions
 fortunate+0.123
arter+0.123
 vain+0.120
ubby+0.118
die+0.117
Neg logit contributions
aw-0.054
Her-0.040
She-0.038
boards-0.038
They-0.038
Token a
Token ID257
Feature activation-9.74e-01

Pos logit contributions
boards+0.025
cakes+0.025
aw+0.024
talk+0.024
co+0.023
Neg logit contributions
die-0.019
 time-0.018
dy-0.018
 vain-0.017
arter-0.017
Token little
Token ID1310
Feature activation-8.06e-01

Pos logit contributions
 vain+0.018
 fault+0.018
ening+0.018
arter+0.017
 pays+0.017
Neg logit contributions
aw-0.025
P-0.024
 se-0.023
She-0.021
Her-0.021
Token girl
Token ID2576
Feature activation-9.80e-02

Pos logit contributions
arter+0.014
 vain+0.013
 fortunate+0.013
die+0.013
 mistake+0.013
Neg logit contributions
aw-0.021
cakes-0.020
P-0.019
boards-0.019
 instruments-0.019
Token named
Token ID3706
Feature activation+1.36e-01

Pos logit contributions
arter+0.002
 vain+0.002
 fortunate+0.002
die+0.002
 mistake+0.002
Neg logit contributions
aw-0.002
cakes-0.002
P-0.002
boards-0.002
 instruments-0.002
Token there
Token ID612
Feature activation-3.99e+00

Pos logit contributions
 mistake+0.011
arter+0.011
 fortunate+0.010
 vain+0.010
 pays+0.010
Neg logit contributions
aw-0.007
P-0.006
Her-0.006
She-0.006
First-0.006
Token was
Token ID373
Feature activation+8.70e-01

Pos logit contributions
arter+0.110
 fortunate+0.109
 vain+0.105
ubby+0.104
die+0.102
Neg logit contributions
aw-0.071
boards-0.061
cakes-0.058
Her-0.055
 bars-0.054
Token a
Token ID257
Feature activation-9.35e-01

Pos logit contributions
cakes+0.023
boards+0.023
talk+0.023
aw+0.022
co+0.021
Neg logit contributions
 time-0.015
die-0.014
 normal-0.013
dy-0.013
 lucky-0.013
Token little
Token ID1310
Feature activation-8.75e-01

Pos logit contributions
 vain+0.016
 fault+0.016
arter+0.016
ening+0.015
 pays+0.015
Neg logit contributions
aw-0.024
P-0.024
 se-0.023
cakes-0.021
Her-0.021
Token girl
Token ID2576
Feature activation-2.60e-01

Pos logit contributions
arter+0.014
 vain+0.013
 fortunate+0.013
die+0.013
 mistake+0.013
Neg logit contributions
aw-0.024
cakes-0.023
boards-0.022
P-0.022
First-0.021
Token who
Token ID508
Feature activation-3.93e+00

Pos logit contributions
arter+0.006
 vain+0.006
 fortunate+0.006
die+0.006
 mistake+0.006
Neg logit contributions
aw-0.005
cakes-0.005
P-0.004
boards-0.004
 instruments-0.004
Token loved
Token ID6151
Feature activation-1.02e+00

Pos logit contributions
arter+0.107
 fault+0.101
 mistake+0.100
ubby+0.097
 vain+0.096
Neg logit contributions
P-0.069
aw-0.067
Her-0.064
She-0.063
First-0.057
Token to
Token ID284
Feature activation-1.92e+00

Pos logit contributions
 way+0.016
 fortunate+0.015
 mistake+0.015
 breathing+0.014
 size+0.014
Neg logit contributions
aw-0.029
talk-0.026
boards-0.026
P-0.026
First-0.026
Token go
Token ID467
Feature activation-1.39e+00

Pos logit contributions
arter+0.032
 way+0.032
 mistake+0.029
 haven+0.029
 mind+0.029
Neg logit contributions
aw-0.057
boards-0.050
P-0.048
talk-0.048
cakes-0.048
Token on
Token ID319
Feature activation-8.11e-01

Pos logit contributions
arter+0.024
 way+0.023
 fortunate+0.023
 chance+0.023
 remembers+0.022
Neg logit contributions
aw-0.037
cakes-0.033
boards-0.033
P-0.031
First-0.031
Token exciting
Token ID7895
Feature activation+6.25e-01

Pos logit contributions
 way+0.015
 mistake+0.015
 vain+0.015
 time+0.014
 tempting+0.014
Neg logit contributions
aw-0.020
cakes-0.019
boards-0.018
talk-0.018
als-0.018
Token the
Token ID262
Feature activation+2.04e+00

Pos logit contributions
arter+0.043
 fortunate+0.042
 remembers+0.040
 haven+0.040
ubby+0.039
Neg logit contributions
aw-0.040
First-0.037
P-0.037
Her-0.036
 instruments-0.036
Token yard
Token ID12699
Feature activation+9.67e-02

Pos logit contributions
cakes+0.049
aw+0.045
boards+0.045
talk+0.042
ban+0.042
Neg logit contributions
 way-0.044
 mistake-0.039
die-0.039
dy-0.038
 has-0.038
Token.
Token ID13
Feature activation-7.04e-01

Pos logit contributions
aw+0.003
P+0.003
cakes+0.003
boards+0.003
First+0.003
Neg logit contributions
 mistake-0.001
 way-0.001
 vain-0.001
arter-0.001
 haven-0.001
Token They
Token ID1119
Feature activation+1.17e+00

Pos logit contributions
arter+0.015
 vain+0.015
 fortunate+0.014
die+0.014
 mistake+0.014
Neg logit contributions
aw-0.016
cakes-0.014
P-0.014
boards-0.014
 instruments-0.014
Token think
Token ID892
Feature activation+2.66e-01

Pos logit contributions
aw+0.030
cakes+0.030
boards+0.028
P+0.027
co+0.025
Neg logit contributions
 haven-0.022
 remembers-0.022
 vain-0.022
 has-0.021
die-0.021
Token it
Token ID340
Feature activation-3.56e+00

Pos logit contributions
aw+0.006
cakes+0.005
boards+0.005
 instruments+0.005
First+0.005
Neg logit contributions
arter-0.006
die-0.006
 fortunate-0.006
 vain-0.006
 mistake-0.006
Token is
Token ID318
Feature activation-1.05e+00

Pos logit contributions
arter+0.076
 fortunate+0.070
 mistake+0.064
ubby+0.064
 vain+0.063
Neg logit contributions
aw-0.087
boards-0.086
 bars-0.086
cakes-0.084
 instruments-0.082
Token a
Token ID257
Feature activation-2.34e+00

Pos logit contributions
arter+0.027
ubby+0.026
 fortunate+0.026
ening+0.024
 mistake+0.024
Neg logit contributions
aw-0.017
 shorts-0.017
 jackets-0.017
 instruments-0.016
First-0.016
Token fun
Token ID1257
Feature activation-1.25e+00

Pos logit contributions
arter+0.044
 fortunate+0.043
 vain+0.042
ubby+0.041
ening+0.040
Neg logit contributions
aw-0.056
P-0.055
 instruments-0.052
First-0.051
cakes-0.050
Token toy
Token ID13373
Feature activation+8.00e-01

Pos logit contributions
 mistake+0.021
 way+0.021
arter+0.021
 vain+0.021
die+0.021
Neg logit contributions
aw-0.033
cakes-0.031
boards-0.030
P-0.029
First-0.028
Token for
Token ID329
Feature activation+1.38e+00

Pos logit contributions
aw+0.023
First+0.023
P+0.023
cakes+0.022
talk+0.022
Neg logit contributions
arter-0.010
 way-0.010
 vain-0.009
 mistake-0.009
 has-0.009
Token!"
Token ID2474
Feature activation+8.84e-01

Pos logit contributions
aw+0.007
P+0.007
boards+0.007
cakes+0.007
 instruments+0.007
Neg logit contributions
 way-0.004
 mistake-0.003
 haven-0.003
dy-0.003
die-0.003
Token Tim
Token ID5045
Feature activation+6.64e-01

Pos logit contributions
aw+0.029
boards+0.029
cakes+0.028
co+0.028
P+0.027
Neg logit contributions
 way-0.008
 haven-0.007
 vain-0.007
 time-0.007
dy-0.007
Token says
Token ID1139
Feature activation+6.86e-01

Pos logit contributions
aw+0.017
cakes+0.016
boards+0.016
P+0.016
 instruments+0.015
Neg logit contributions
die-0.011
 mistake-0.011
 remembers-0.011
dy-0.011
 way-0.011
Token.
Token ID13
Feature activation-1.79e+00

Pos logit contributions
aw+0.021
P+0.019
cakes+0.019
boards+0.019
First+0.018
Neg logit contributions
 vain-0.009
arter-0.009
 mistake-0.009
 fortunate-0.009
die-0.008
Token "
Token ID366
Feature activation+2.82e-01

Pos logit contributions
arter+0.029
 vain+0.029
die+0.028
 fortunate+0.028
dy+0.028
Neg logit contributions
aw-0.048
cakes-0.044
boards-0.043
P-0.043
 instruments-0.042
TokenIt
Token ID1026
Feature activation-3.58e+00

Pos logit contributions
cakes+0.007
boards+0.006
 umb+0.006
ican+0.006
 sneakers+0.006
Neg logit contributions
 moment-0.006
 has-0.005
 good-0.005
 haven-0.005
 true-0.005
Token looks
Token ID3073
Feature activation-3.11e+00

Pos logit contributions
arter+0.058
 vain+0.055
 fortunate+0.054
die+0.052
 mistake+0.051
Neg logit contributions
aw-0.101
cakes-0.094
boards-0.092
 bars-0.090
First-0.090
Token fun
Token ID1257
Feature activation-1.09e+00

Pos logit contributions
arter+0.075
 vain+0.070
 mistake+0.067
ubby+0.067
 fortunate+0.067
Neg logit contributions
aw-0.066
cakes-0.054
First-0.054
boards-0.054
P-0.053
Token!"
Token ID2474
Feature activation-3.57e-01

Pos logit contributions
 vain+0.013
arter+0.013
die+0.013
 fortunate+0.013
 way+0.013
Neg logit contributions
aw-0.032
P-0.031
cakes-0.030
boards-0.030
Her-0.029
Token\n
Token ID198
Feature activation-9.90e-01

Pos logit contributions
die+0.004
 way+0.004
dy+0.004
 vain+0.004
 mistake+0.004
Neg logit contributions
aw-0.011
cakes-0.011
boards-0.010
P-0.010
 instruments-0.010
Token\n
Token ID198
Feature activation-1.04e+00

Pos logit contributions
arter+0.021
 fortunate+0.021
 vain+0.021
 remembers+0.020
die+0.020
Neg logit contributions
aw-0.022
P-0.021
cakes-0.020
First-0.020
boards-0.020
Token\n
Token ID198
Feature activation-7.03e-01

Pos logit contributions
arter+0.016
 vain+0.016
 fortunate+0.016
 remembers+0.015
die+0.015
Neg logit contributions
aw-0.017
P-0.016
cakes-0.015
Her-0.015
First-0.015
TokenBut
Token ID1537
Feature activation+1.06e+00

Pos logit contributions
die+0.017
 mistake+0.017
dy+0.017
 vain+0.017
 fortunate+0.016
Neg logit contributions
boards-0.014
cakes-0.013
aw-0.013
 instruments-0.012
als-0.012
Token the
Token ID262
Feature activation+1.11e-02

Pos logit contributions
aw+0.024
cakes+0.022
P+0.021
boards+0.021
Her+0.020
Neg logit contributions
arter-0.023
 fortunate-0.022
 vain-0.022
die-0.022
ubby-0.021
Token dog
Token ID3290
Feature activation+5.98e-01

Pos logit contributions
aw+0.000
boards+0.000
cakes+0.000
First+0.000
talk+0.000
Neg logit contributions
 mistake-0.000
 way-0.000
 vain-0.000
 remembers-0.000
 chance-0.000
Token was
Token ID373
Feature activation-5.26e-01

Pos logit contributions
aw+0.014
boards+0.012
cakes+0.012
P+0.012
co+0.012
Neg logit contributions
arter-0.013
 fortunate-0.013
 vain-0.012
die-0.012
ubby-0.012
Token still
Token ID991
Feature activation-3.05e+00

Pos logit contributions
arter+0.013
 haven+0.012
 fortunate+0.012
 hasn+0.012
lad+0.012
Neg logit contributions
aw-0.010
co-0.009
P-0.009
Her-0.009
She-0.009
Token there
Token ID612
Feature activation-1.65e+00

Pos logit contributions
lad+0.065
arter+0.061
 haven+0.061
reetings+0.060
 fortunate+0.060
Neg logit contributions
aw-0.065
 instruments-0.056
P-0.056
 se-0.055
co-0.055
Token.
Token ID13
Feature activation-7.17e-01

Pos logit contributions
arter+0.035
die+0.034
 fortunate+0.033
 vain+0.033
ubby+0.032
Neg logit contributions
aw-0.039
boards-0.034
cakes-0.034
 instruments-0.033
Her-0.032
Token It
Token ID632
Feature activation-1.62e+00

Pos logit contributions
die+0.017
 fortunate+0.017
arter+0.017
dy+0.016
 vain+0.016
Neg logit contributions
aw-0.015
cakes-0.012
boards-0.012
Her-0.012
P-0.012
Token was
Token ID373
Feature activation-6.89e-02

Pos logit contributions
arter+0.039
 fortunate+0.039
ubby+0.036
 vain+0.036
 fault+0.036
Neg logit contributions
aw-0.034
P-0.030
co-0.029
 bars-0.028
cakes-0.028
Token angry
Token ID7954
Feature activation+1.59e+00

Pos logit contributions
 fortunate+0.001
lad+0.001
 haven+0.001
arter+0.001
 hasn+0.001
Neg logit contributions
aw-0.002
P-0.001
 shorts-0.001
co-0.001
Her-0.001