TOP ACTIVATIONS
MAX = 0.070

He couldn 't believe how kind the lady was and how
was so impressed with how kind they were .
And he never forgot how kind the old lion was .
youth was amazed by how kind the man was even though
promised to always remember how kind and brave the bird was
It would never forget how kind the boy had been .
Everyone was amazed at how kind the guard was and they
himself for proving just how wise and organized he could be
she was proud of how kind and friendly she was to
knew he couldn 't un - m elt it .
, because you never know - one of your wishes might
thread . They have no neck l aces and brace lets
, she always remembered how kind the bear was to share
bear and thought about how wise and kind animals can be
ie was surprised at how kind he was and smiled .
were so impressed with how wise Tom was . <|endoftext|> Once
it . He told Wh isk ers to look at him
was very pleased with how wise he had been . With
the frog always remembered how kind the man had been and
did not know what ost ric hes were . They looked

MIN ACTIVATIONS
MIN = -0.079

Both of them put their anx ieties aside . From then
a nearby tree watched him int ently . The
The little lamb watched him int ently , and when the
friends and shared their l oll ip ops every time they
L ily enjoyed her l oll ipop and forgot all about
on his face â he went to play
â â
the ingredients to make her delic iously sweet past ries .
bear sniff ed around their camps ite , but then it
saw a big red l oll ipop and asked her mom
enjoyed his y ummy l oll ipop and played with his
at circles , squares , rect angles and triangles .
she always kept her books he lf tidy and never did
play with his owner 's sp oons and always knocked them
playing with his lovely l oll ipop . <|endoftext|> Once upon
perfect bag now I 'm very
a big , sweet l oll ipop and wanted it .
to bounce on the big tr amp oline in their backyard
bear who had the co z iest bedroom . Every night
in his garden with his sp ade . He dug holes

INTERVAL 0.033 - 0.070
CONTAINS 0.457%

. The attic had many old things and dust and cob
proud . They show their sticker and their scarf to each
and brown , with long ears and a fuzzy tail .
strange and did not belong here . Ben was
Volcano worked together with the old man and soon the fire

INTERVAL -0.004 - 0.033
CONTAINS 66.126%

a terrible day ." Jim wanted to stay outside and play
The girl didn 't understand and asked , " What
named Tim . Tim liked to help his friends . One
and the kitten followed her . Lucy was so happy !
to take a closer look . Did you manage ok ?"

INTERVAL -0.042 - -0.004
CONTAINS 33.316%

, " Hello , Elsa . I am happy to meet
The fish was very thankful and drank all the
They played for a while , having fun with their
and her daughter were in their backyard . The mom wanted
and cool !" Sara says . She touches the dinosaur bone

INTERVAL -0.079 - -0.042
CONTAINS 0.102%

mom my asked her to help with the laundry . They
" Hey , hey , hey !" the park
was a girl named M abel . She was only three
it ’ s hard and I don ’
who said to him , Don
Token He
Token ID679
Feature activation+4.60e-03

Pos logit contributions
 seen+0.003
 worn+0.003
 guessed+0.003
 gol+0.003
 figured+0.003
Neg logit contributions
 ambul-0.004
 911-0.004
oral-0.004
 marsh-0.004
 beds-0.004
Token couldn
Token ID3521
Feature activation-1.24e-02

Pos logit contributions
 aback+0.004
 gol+0.004
 crossed+0.003
 nervous+0.003
 worn+0.003
Neg logit contributions
 pancakes-0.009
oons-0.009
 degrees-0.009
oral-0.009
 dining-0.009
Token't
Token ID470
Feature activation+3.89e-03

Pos logit contributions
 degrees+0.023
 pancakes+0.023
 dining+0.023
oons+0.023
 beds+0.022
Neg logit contributions
 crossed-0.013
 aback-0.012
 reached-0.012
 below-0.012
 nervous-0.012
Token believe
Token ID1975
Feature activation-1.02e-02

Pos logit contributions
 aback+0.005
 crossed+0.005
 gol+0.005
 reached+0.005
 guessed+0.005
Neg logit contributions
oons-0.006
 degrees-0.006
 pancakes-0.006
oral-0.005
 ambul-0.005
Token how
Token ID703
Feature activation-1.52e-02

Pos logit contributions
oons+0.020
oral+0.019
 pancakes+0.019
 degrees+0.019
 dining+0.019
Neg logit contributions
 below-0.010
 aback-0.009
 gol-0.009
 nervous-0.009
 around-0.009
Token kind
Token ID1611
Feature activation+7.03e-02

Pos logit contributions
 pancakes+0.029
oons+0.029
oral+0.029
 degrees+0.028
 ambul+0.028
Neg logit contributions
 aback-0.014
 nervous-0.014
 surprised-0.014
 worn-0.014
 amazed-0.014
Token the
Token ID262
Feature activation+3.85e-03

Pos logit contributions
 aback+0.092
 gol+0.089
arted+0.087
 seen+0.087
 perplex+0.087
Neg logit contributions
 pancakes-0.102
 ambul-0.101
 chores-0.099
oral-0.098
oons-0.098
Token lady
Token ID10846
Feature activation+1.87e-02

Pos logit contributions
 aback+0.004
 gol+0.004
 nervous+0.004
 worn+0.004
 crossed+0.004
Neg logit contributions
oons-0.007
 pancakes-0.007
 degrees-0.007
oral-0.006
 ambul-0.006
Token was
Token ID373
Feature activation+7.92e-03

Pos logit contributions
 aback+0.016
 crossed+0.015
 gol+0.015
 nervous+0.015
 worn+0.015
Neg logit contributions
 pancakes-0.036
oons-0.036
 degrees-0.036
oral-0.036
 chores-0.035
Token and
Token ID290
Feature activation+1.27e-05

Pos logit contributions
 aback+0.009
 crossed+0.008
 gol+0.008
 nervous+0.008
 surprised+0.008
Neg logit contributions
 pancakes-0.013
oons-0.013
oral-0.013
 degrees-0.013
 ambul-0.013
Token how
Token ID703
Feature activation-1.23e-02

Pos logit contributions
 aback+0.000
 gol+0.000
 crossed+0.000
arted+0.000
 nervous+0.000
Neg logit contributions
 pancakes-0.000
 degrees-0.000
oons-0.000
 dining-0.000
 ambul-0.000
Token was
Token ID373
Feature activation+9.18e-03

Pos logit contributions
 aback+0.012
 gol+0.011
arted+0.010
 nervous+0.010
 crossed+0.010
Neg logit contributions
 pancakes-0.021
oons-0.021
 degrees-0.020
oral-0.020
 ambul-0.020
Token so
Token ID523
Feature activation-1.60e-04

Pos logit contributions
 gol+0.008
 aback+0.007
suit+0.007
arted+0.006
 crossed+0.006
Neg logit contributions
 pancakes-0.019
 breakfast-0.018
 dining-0.018
 ambul-0.017
 degrees-0.017
Token impressed
Token ID12617
Feature activation+1.91e-02

Pos logit contributions
 pancakes+0.000
 dining+0.000
 degrees+0.000
 chores+0.000
 breakfast+0.000
Neg logit contributions
 aback-0.000
 gol-0.000
arted-0.000
 target-0.000
suit-0.000
Token with
Token ID351
Feature activation+1.10e-02

Pos logit contributions
 aback+0.022
 gol+0.022
 worn+0.021
 crossed+0.021
 nervous+0.021
Neg logit contributions
 pancakes-0.031
oons-0.030
 degrees-0.029
oral-0.029
 ambul-0.029
Token how
Token ID703
Feature activation-2.78e-03

Pos logit contributions
 gol+0.017
 aback+0.016
suit+0.016
 crossed+0.016
 worn+0.016
Neg logit contributions
 pancakes-0.015
 chores-0.014
 ambul-0.013
 degrees-0.013
 breakfast-0.013
Token kind
Token ID1611
Feature activation+6.97e-02

Pos logit contributions
 pancakes+0.005
oons+0.005
oral+0.005
 degrees+0.005
 ambul+0.005
Neg logit contributions
 aback-0.003
 crossed-0.003
 gol-0.003
 nervous-0.003
 worn-0.003
Token they
Token ID484
Feature activation+1.09e-05

Pos logit contributions
 aback+0.090
 gol+0.085
 crossed+0.085
 nervous+0.083
 worn+0.082
Neg logit contributions
 pancakes-0.107
oons-0.106
 degrees-0.106
oral-0.105
 dining-0.104
Token were
Token ID547
Feature activation-1.93e-04

Pos logit contributions
 aback+0.000
 gol+0.000
arted+0.000
 hesitant+0.000
 Luckily+0.000
Neg logit contributions
 pancakes-0.000
 ambul-0.000
oons-0.000
 chores-0.000
 degrees-0.000
Token.
Token ID13
Feature activation+9.56e-03

Pos logit contributions
 pancakes+0.000
oons+0.000
 degrees+0.000
oral+0.000
 ambul+0.000
Neg logit contributions
 aback-0.000
 gol-0.000
 crossed-0.000
 nervous-0.000
 worn-0.000
Token (space)
Token ID220
Feature activation-7.45e-03

Pos logit contributions
 seen+0.010
 gol+0.010
 worn+0.010
suit+0.009
 nervous+0.009
Neg logit contributions
oral-0.017
 911-0.016
oons-0.016
 dining-0.016
 pancakes-0.016
Token (newline)
Token ID198
Feature activation-5.06e-06

Pos logit contributions
oons+0.015
 pancakes+0.014
 911+0.014
oral+0.014
 degrees+0.014
Neg logit contributions
 gol-0.006
 aback-0.006
 worn-0.006
 seen-0.005
 nervous-0.005
Token And
Token ID843
Feature activation-1.26e-02

Pos logit contributions
 gol+0.002
 teased+0.002
 seen+0.002
 worn+0.002
suit+0.002
Neg logit contributions
oral-0.002
 911-0.002
 dining-0.002
 breakfast-0.002
oup-0.002
Token he
Token ID339
Feature activation-6.80e-03

Pos logit contributions
 pancakes+0.016
 breakfast+0.016
 chores+0.016
 movie+0.015
oral+0.015
Neg logit contributions
 aback-0.019
arted-0.018
 perplex-0.018
suit-0.017
 worn-0.017
Token never
Token ID1239
Feature activation+6.69e-03

Pos logit contributions
 pancakes+0.011
oral+0.011
oons+0.011
 dining+0.011
 degrees+0.011
Neg logit contributions
 aback-0.008
 gol-0.007
arted-0.007
 nervous-0.007
 worn-0.007
Token forgot
Token ID16453
Feature activation+2.04e-03

Pos logit contributions
amed+0.008
tie+0.007
agged+0.007
enge+0.007
 perplex+0.007
Neg logit contributions
 ambul-0.010
 spills-0.010
 tea-0.009
 movie-0.009
 dining-0.009
Token how
Token ID703
Feature activation-1.05e-02

Pos logit contributions
 aback+0.002
 crossed+0.002
 below+0.002
 nervous+0.002
 surprised+0.002
Neg logit contributions
oral-0.004
oons-0.004
 degrees-0.004
 pancakes-0.004
 dining-0.004
Token kind
Token ID1611
Feature activation+6.93e-02

Pos logit contributions
 pancakes+0.018
oons+0.018
 degrees+0.018
oral+0.018
 chores+0.018
Neg logit contributions
 aback-0.011
 gol-0.011
 crossed-0.010
arted-0.010
 guessed-0.010
Token the
Token ID262
Feature activation-4.50e-04

Pos logit contributions
 aback+0.102
 perplex+0.097
arted+0.096
suit+0.096
 worn+0.095
Neg logit contributions
 pancakes-0.096
 soup-0.092
 ambul-0.091
 movie-0.091
oons-0.091
Token old
Token ID1468
Feature activation+3.90e-02

Pos logit contributions
 pancakes+0.001
oons+0.001
 degrees+0.001
oral+0.001
 dining+0.001
Neg logit contributions
 aback-0.001
 gol-0.001
 nervous-0.001
 crossed-0.001
 surprised-0.001
Token lion
Token ID18744
Feature activation+1.28e-02

Pos logit contributions
 aback+0.067
arted+0.067
 gol+0.062
 worn+0.062
suit+0.062
Neg logit contributions
 ambul-0.046
 movie-0.046
oral-0.045
 soup-0.044
 911-0.042
Token was
Token ID373
Feature activation+2.55e-03

Pos logit contributions
 aback+0.012
 crossed+0.011
 gol+0.011
 nervous+0.011
 worn+0.011
Neg logit contributions
 pancakes-0.025
oons-0.024
oral-0.024
 degrees-0.024
 dining-0.024
Token.
Token ID13
Feature activation+3.38e-03

Pos logit contributions
 aback+0.003
 gol+0.003
 crossed+0.003
 nervous+0.003
 worn+0.003
Neg logit contributions
 pancakes-0.004
oons-0.004
oral-0.004
 degrees-0.004
 dining-0.004
Token youth
Token ID6205
Feature activation+9.76e-03

Pos logit contributions
 pancakes+0.002
oral+0.002
oons+0.002
 chores+0.002
 degrees+0.002
Neg logit contributions
 aback-0.001
 crossed-0.001
 gol-0.001
arted-0.001
 seen-0.001
Token was
Token ID373
Feature activation+7.06e-03

Pos logit contributions
 aback+0.008
 crossed+0.008
 gol+0.008
 nervous+0.007
 reached+0.007
Neg logit contributions
 pancakes-0.019
oons-0.019
 degrees-0.019
oral-0.019
 chores-0.019
Token amazed
Token ID24562
Feature activation-1.24e-02

Pos logit contributions
 gol+0.008
suit+0.007
 aback+0.007
arted+0.007
arrass+0.007
Neg logit contributions
 pancakes-0.012
 breakfast-0.012
 ambul-0.012
 dining-0.012
 soup-0.012
Token by
Token ID416
Feature activation+3.08e-02

Pos logit contributions
 pancakes+0.015
 soup+0.014
 degrees+0.014
oral+0.014
oons+0.013
Neg logit contributions
suit-0.018
 gol-0.018
amed-0.018
 crossed-0.018
 aback-0.018
Token how
Token ID703
Feature activation-8.28e-03

Pos logit contributions
 aback+0.030
 gol+0.028
 crossed+0.028
 nervous+0.027
 worn+0.027
Neg logit contributions
 pancakes-0.057
oons-0.057
 degrees-0.056
oral-0.056
 dining-0.055
Token kind
Token ID1611
Feature activation+6.81e-02

Pos logit contributions
 pancakes+0.014
 degrees+0.014
oons+0.014
 chores+0.014
oral+0.014
Neg logit contributions
 aback-0.009
 gol-0.008
 crossed-0.008
arted-0.008
 worn-0.008
Token the
Token ID262
Feature activation+1.86e-03

Pos logit contributions
 aback+0.083
 gol+0.079
 worn+0.079
arted+0.078
 seen+0.078
Neg logit contributions
 pancakes-0.106
oons-0.102
 ambul-0.102
oral-0.102
 chores-0.101
Token man
Token ID582
Feature activation+2.19e-02

Pos logit contributions
 aback+0.002
 crossed+0.002
 gol+0.002
arted+0.002
 worn+0.002
Neg logit contributions
 pancakes-0.003
oons-0.003
oral-0.003
 degrees-0.003
 chores-0.003
Token was
Token ID373
Feature activation+7.96e-03

Pos logit contributions
 aback+0.021
 gol+0.020
arted+0.019
 crossed+0.019
 nervous+0.019
Neg logit contributions
 pancakes-0.040
oons-0.039
 degrees-0.039
oral-0.039
 ambul-0.039
Token even
Token ID772
Feature activation-2.59e-02

Pos logit contributions
 aback+0.010
 gol+0.010
arted+0.009
suit+0.009
 crossed+0.009
Neg logit contributions
 pancakes-0.013
 ambul-0.012
 chores-0.012
 degrees-0.012
oons-0.012
Token though
Token ID996
Feature activation+3.67e-03

Pos logit contributions
 pancakes+0.021
 soup+0.021
 chores+0.021
 ambul+0.020
 swe+0.020
Neg logit contributions
 gol-0.048
 Philip-0.048
suit-0.048
 Phil-0.048
arted-0.047
Token promised
Token ID8072
Feature activation+9.36e-03

Pos logit contributions
 pancakes+0.014
oral+0.013
 breakfast+0.013
 chores+0.013
 ambul+0.013
Neg logit contributions
 aback-0.009
 gol-0.008
suit-0.007
 hesitant-0.007
 nervous-0.007
Token to
Token ID284
Feature activation+5.24e-03

Pos logit contributions
 aback+0.010
 gol+0.009
 crossed+0.009
 nervous+0.009
 worn+0.009
Neg logit contributions
 pancakes-0.017
oons-0.017
 degrees-0.017
oral-0.017
 dining-0.016
Token always
Token ID1464
Feature activation+2.02e-02

Pos logit contributions
 aback+0.005
 gol+0.005
 crossed+0.005
 worn+0.005
 nervous+0.005
Neg logit contributions
 pancakes-0.009
oons-0.009
 degrees-0.009
oral-0.009
 dining-0.009
Token remember
Token ID3505
Feature activation-1.79e-03

Pos logit contributions
 aback+0.025
 crossed+0.024
 worn+0.024
 gol+0.024
 reached+0.023
Neg logit contributions
oons-0.034
 degrees-0.033
 pancakes-0.033
oral-0.032
 dining-0.031
Token how
Token ID703
Feature activation-1.32e-02

Pos logit contributions
oons+0.004
 degrees+0.003
 pancakes+0.003
oral+0.003
 dining+0.003
Neg logit contributions
 aback-0.002
 crossed-0.002
 seen-0.001
 below-0.001
 worn-0.001
Token kind
Token ID1611
Feature activation+6.75e-02

Pos logit contributions
 pancakes+0.025
oons+0.025
 degrees+0.024
oral+0.024
 dining+0.024
Neg logit contributions
 aback-0.013
 nervous-0.012
 crossed-0.012
 surprised-0.012
 gol-0.012
Token and
Token ID290
Feature activation-1.63e-02

Pos logit contributions
 aback+0.082
 gol+0.079
 crossed+0.078
 nervous+0.076
 worn+0.076
Neg logit contributions
 pancakes-0.109
oons-0.107
 degrees-0.107
oral-0.105
 dining-0.105
Token brave
Token ID14802
Feature activation+2.04e-02

Pos logit contributions
oons+0.028
 pancakes+0.028
oral+0.027
 degrees+0.027
 dining+0.026
Neg logit contributions
 aback-0.019
 nervous-0.018
 worn-0.018
 gol-0.018
 crossed-0.018
Token the
Token ID262
Feature activation-7.53e-03

Pos logit contributions
 aback+0.024
 gol+0.023
 crossed+0.023
 nervous+0.022
 below+0.022
Neg logit contributions
 pancakes-0.034
oons-0.034
oral-0.033
 degrees-0.033
 dining-0.033
Token bird
Token ID6512
Feature activation+5.96e-04

Pos logit contributions
oons+0.014
 degrees+0.014
 pancakes+0.013
 dining+0.013
oral+0.013
Neg logit contributions
 nervous-0.008
 scared-0.008
 gol-0.008
 surprised-0.008
 aback-0.007
Token was
Token ID373
Feature activation+2.58e-03

Pos logit contributions
 aback+0.001
 crossed+0.001
 nervous+0.001
 gol+0.001
 reached+0.001
Neg logit contributions
 pancakes-0.001
oons-0.001
 degrees-0.001
oral-0.001
 dining-0.001
Token It
Token ID632
Feature activation+6.06e-03

Pos logit contributions
 pancakes+0.001
oons+0.001
oral+0.001
 degrees+0.001
 dining+0.001
Neg logit contributions
 aback-0.000
 gol-0.000
 crossed-0.000
 nervous-0.000
 worn-0.000
Token would
Token ID561
Feature activation-1.00e-03

Pos logit contributions
 aback+0.005
 crossed+0.005
 reached+0.005
 surprised+0.005
 nervous+0.005
Neg logit contributions
 pancakes-0.012
oons-0.012
 degrees-0.012
oral-0.011
 911-0.011
Token never
Token ID1239
Feature activation+4.09e-03

Pos logit contributions
oons+0.002
 degrees+0.002
 pancakes+0.002
oral+0.002
 911+0.002
Neg logit contributions
 aback-0.001
 worn-0.001
 crossed-0.001
 reached-0.001
 nervous-0.001
Token forget
Token ID6044
Feature activation+3.59e-03

Pos logit contributions
 aback+0.005
 gol+0.004
 crossed+0.004
 nervous+0.004
 worn+0.004
Neg logit contributions
 pancakes-0.007
oons-0.007
 degrees-0.007
oral-0.007
 dining-0.007
Token how
Token ID703
Feature activation-1.31e-02

Pos logit contributions
 aback+0.003
 so+0.003
 around+0.003
 below+0.003
 Bill+0.003
Neg logit contributions
oral-0.007
oons-0.007
 dining-0.007
 degrees-0.007
 pancakes-0.007
Token kind
Token ID1611
Feature activation+6.71e-02

Pos logit contributions
 pancakes+0.024
oons+0.024
oral+0.023
 dining+0.023
 degrees+0.023
Neg logit contributions
 aback-0.013
 nervous-0.013
 worn-0.013
 surprised-0.013
 below-0.013
Token the
Token ID262
Feature activation+2.42e-03

Pos logit contributions
 aback+0.073
 gol+0.068
 crossed+0.067
 nervous+0.066
 worn+0.065
Neg logit contributions
 pancakes-0.118
oons-0.117
 degrees-0.116
oral-0.115
 dining-0.114
Token boy
Token ID2933
Feature activation+1.18e-02

Pos logit contributions
 gol+0.003
 nervous+0.003
 worn+0.003
 en+0.003
 scared+0.003
Neg logit contributions
oons-0.004
 911-0.004
 degrees-0.004
 ambul-0.004
 pancakes-0.004
Token had
Token ID550
Feature activation+1.09e-02

Pos logit contributions
 aback+0.011
 crossed+0.011
 gol+0.010
 nervous+0.010
 worn+0.010
Neg logit contributions
 pancakes-0.022
oons-0.022
oral-0.022
 degrees-0.022
 dining-0.021
Token been
Token ID587
Feature activation+8.64e-03

Pos logit contributions
 gol+0.011
 aback+0.011
arted+0.010
 nervous+0.010
 perplex+0.010
Neg logit contributions
 pancakes-0.019
 degrees-0.019
oons-0.019
 chores-0.018
 breakfast-0.018
Token.
Token ID13
Feature activation+2.53e-03

Pos logit contributions
 aback+0.009
 gol+0.009
 crossed+0.009
 worn+0.008
 nervous+0.008
Neg logit contributions
 pancakes-0.015
oons-0.015
 degrees-0.015
oral-0.015
 chores-0.014
Token Everyone
Token ID11075
Feature activation+2.68e-02

Pos logit contributions
 aback+0.002
 seen+0.002
 gol+0.002
 overwhelmed+0.002
 crossed+0.002
Neg logit contributions
 pancakes-0.004
oons-0.004
oral-0.004
 911-0.004
 dining-0.004
Token was
Token ID373
Feature activation+4.46e-03

Pos logit contributions
 aback+0.027
 gol+0.026
arted+0.024
 nervous+0.024
 worn+0.024
Neg logit contributions
 pancakes-0.048
oons-0.047
oral-0.046
 degrees-0.046
 dining-0.046
Token amazed
Token ID24562
Feature activation-1.30e-02

Pos logit contributions
 gol+0.005
 aback+0.005
suit+0.004
arted+0.004
 crossed+0.004
Neg logit contributions
 pancakes-0.008
 breakfast-0.008
 dining-0.008
 ambul-0.008
 soup-0.007
Token at
Token ID379
Feature activation+8.58e-03

Pos logit contributions
 pancakes+0.018
oons+0.018
 degrees+0.017
oral+0.017
 soup+0.017
Neg logit contributions
 aback-0.017
 gol-0.017
 crossed-0.017
 perplex-0.016
 worn-0.016
Token how
Token ID703
Feature activation-9.62e-03

Pos logit contributions
 aback+0.009
 gol+0.009
 crossed+0.009
 worn+0.009
 nervous+0.009
Neg logit contributions
 pancakes-0.015
oons-0.015
 degrees-0.015
oral-0.015
 dining-0.014
Token kind
Token ID1611
Feature activation+6.67e-02

Pos logit contributions
 pancakes+0.017
oons+0.017
 degrees+0.017
oral+0.016
 chores+0.016
Neg logit contributions
 aback-0.010
 gol-0.010
 crossed-0.010
 nervous-0.009
 worn-0.009
Token the
Token ID262
Feature activation+7.37e-03

Pos logit contributions
 aback+0.082
 gol+0.078
 crossed+0.078
 nervous+0.075
 below+0.075
Neg logit contributions
 pancakes-0.108
 degrees-0.107
oons-0.106
oral-0.105
 dining-0.105
Token guard
Token ID4860
Feature activation+1.34e-03

Pos logit contributions
 aback+0.008
 gol+0.008
 nervous+0.007
 worn+0.007
 crossed+0.007
Neg logit contributions
oons-0.013
 pancakes-0.013
 degrees-0.013
oral-0.013
 dining-0.013
Token was
Token ID373
Feature activation+7.05e-03

Pos logit contributions
 aback+0.001
 crossed+0.001
 nervous+0.001
 gol+0.001
 surprised+0.001
Neg logit contributions
 pancakes-0.003
oons-0.003
 degrees-0.003
oral-0.003
 dining-0.003
Token and
Token ID290
Feature activation+2.83e-03

Pos logit contributions
 aback+0.009
 gol+0.009
arted+0.009
 crossed+0.008
suit+0.008
Neg logit contributions
 pancakes-0.011
 dining-0.010
 ambul-0.010
 degrees-0.010
 chores-0.010
Token they
Token ID484
Feature activation-1.02e-02

Pos logit contributions
 aback+0.003
 gol+0.003
 crossed+0.003
arted+0.003
 worn+0.003
Neg logit contributions
 pancakes-0.005
oons-0.005
 degrees-0.005
 ambul-0.005
 dining-0.005
Token himself
Token ID2241
Feature activation+1.01e-03

Pos logit contributions
 pancakes+0.014
oons+0.014
 degrees+0.014
oral+0.013
 dining+0.013
Neg logit contributions
 aback-0.008
 gol-0.008
 crossed-0.008
 nervous-0.008
 worn-0.008
Token for
Token ID329
Feature activation-1.07e-02

Pos logit contributions
 so+0.001
 around+0.001
 nervous+0.001
 aback+0.001
 surprised+0.001
Neg logit contributions
oons-0.002
 degrees-0.002
 pancakes-0.002
 dining-0.002
 ambul-0.002
Token proving
Token ID17742
Feature activation+1.56e-02

Pos logit contributions
oons+0.017
oral+0.017
 degrees+0.017
 pancakes+0.017
 911+0.016
Neg logit contributions
 aback-0.013
 worn-0.013
 nervous-0.013
 crossed-0.013
 amazed-0.013
Token just
Token ID655
Feature activation+3.61e-02

Pos logit contributions
 aback+0.018
 crossed+0.018
 gol+0.018
 reached+0.018
suit+0.018
Neg logit contributions
 chores-0.025
 pancakes-0.025
 degrees-0.024
oral-0.024
oons-0.024
Token how
Token ID703
Feature activation-8.10e-03

Pos logit contributions
 aback+0.039
 gol+0.037
 crossed+0.036
 nervous+0.035
 worn+0.035
Neg logit contributions
 pancakes-0.064
oons-0.063
 degrees-0.062
oral-0.062
 dining-0.061
Token wise
Token ID10787
Feature activation+6.63e-02

Pos logit contributions
 pancakes+0.015
oral+0.015
oons+0.015
 dining+0.015
 ambul+0.015
Neg logit contributions
 nervous-0.008
 aback-0.008
 worn-0.008
 surprised-0.008
 embarrassed-0.007
Token and
Token ID290
Feature activation+3.27e-04

Pos logit contributions
 aback+0.088
 gol+0.083
arted+0.083
 crossed+0.082
 perplex+0.081
Neg logit contributions
 pancakes-0.098
oons-0.096
 degrees-0.095
 ambul-0.094
oral-0.094
Token organized
Token ID8389
Feature activation+1.14e-02

Pos logit contributions
 aback+0.000
arted+0.000
suit+0.000
 gol+0.000
 crossed+0.000
Neg logit contributions
 pancakes-0.000
 chores-0.000
 degrees-0.000
 ambul-0.000
oral-0.000
Token he
Token ID339
Feature activation-2.50e-04

Pos logit contributions
 aback+0.013
 gol+0.012
 crossed+0.012
 worn+0.012
 nervous+0.012
Neg logit contributions
 pancakes-0.019
oral-0.019
oons-0.019
 degrees-0.019
 dining-0.018
Token could
Token ID714
Feature activation+8.85e-04

Pos logit contributions
oral+0.000
 degrees+0.000
 chores+0.000
oons+0.000
 pancakes+0.000
Neg logit contributions
 gol-0.000
 aback-0.000
arted-0.000
 Philip-0.000
 Paige-0.000
Token be
Token ID307
Feature activation+1.84e-02

Pos logit contributions
 crossed+0.001
 aback+0.001
 guessed+0.001
 reached+0.001
 nervous+0.001
Neg logit contributions
 degrees-0.002
oons-0.002
 pancakes-0.002
 911-0.002
 ambul-0.002
Token she
Token ID673
Feature activation-1.01e-02

Pos logit contributions
 so+0.002
 surprised+0.002
 nervous+0.002
 amazed+0.002
 around+0.002
Neg logit contributions
oral-0.005
 ambul-0.005
 dining-0.005
 camps-0.005
oons-0.005
Token was
Token ID373
Feature activation+6.02e-03

Pos logit contributions
 pancakes+0.016
oons+0.016
 degrees+0.015
oral+0.015
 chores+0.015
Neg logit contributions
 gol-0.013
 aback-0.013
arted-0.012
 below-0.012
 Philip-0.012
Token proud
Token ID6613
Feature activation-1.08e-02

Pos logit contributions
 aback+0.007
 gol+0.007
suit+0.007
arted+0.006
 crossed+0.006
Neg logit contributions
 pancakes-0.011
 dining-0.010
 breakfast-0.010
 soup-0.010
 chores-0.010
Token of
Token ID286
Feature activation-1.02e-02

Pos logit contributions
 pancakes+0.017
oons+0.017
 degrees+0.017
oral+0.017
 dining+0.017
Neg logit contributions
 aback-0.013
 gol-0.013
 crossed-0.013
 nervous-0.012
 worn-0.012
Token how
Token ID703
Feature activation-1.21e-02

Pos logit contributions
oons+0.016
oral+0.015
 degrees+0.015
 pancakes+0.015
 dining+0.015
Neg logit contributions
 Paige-0.013
 Sally-0.013
 worn-0.013
 aback-0.012
 crossed-0.012
Token kind
Token ID1611
Feature activation+6.59e-02

Pos logit contributions
oons+0.021
oral+0.020
 pancakes+0.020
 degrees+0.020
 ambul+0.020
Neg logit contributions
 aback-0.013
 nervous-0.013
 worn-0.013
 surprised-0.013
 crossed-0.013
Token and
Token ID290
Feature activation-1.46e-02

Pos logit contributions
 aback+0.094
 gol+0.091
 crossed+0.090
 nervous+0.089
 worn+0.088
Neg logit contributions
 pancakes-0.092
oons-0.091
 degrees-0.090
oral-0.089
 dining-0.088
Token friendly
Token ID8030
Feature activation+4.88e-02

Pos logit contributions
 pancakes+0.023
 chores+0.022
 ambul+0.022
 degrees+0.022
 dining+0.021
Neg logit contributions
 aback-0.019
 gol-0.018
arted-0.018
 crossed-0.017
suit-0.017
Token she
Token ID673
Feature activation+1.51e-03

Pos logit contributions
 aback+0.054
 crossed+0.052
 gol+0.051
 nervous+0.051
 below+0.050
Neg logit contributions
oons-0.083
oral-0.083
 pancakes-0.082
 degrees-0.081
 dining-0.081
Token was
Token ID373
Feature activation+7.32e-03

Pos logit contributions
 gol+0.002
 aback+0.002
arted+0.002
 Philip+0.002
chard+0.002
Neg logit contributions
 movie-0.002
 pancakes-0.002
abel-0.002
oons-0.002
 soup-0.002
Token to
Token ID284
Feature activation-7.00e-04

Pos logit contributions
 aback+0.009
 gol+0.008
suit+0.008
arted+0.008
 crossed+0.007
Neg logit contributions
 pancakes-0.013
 breakfast-0.012
 soup-0.012
 chores-0.012
 ambul-0.012
Token knew
Token ID2993
Feature activation-1.17e-02

Pos logit contributions
 pancakes+0.010
oons+0.010
 degrees+0.009
oral+0.009
 dining+0.009
Neg logit contributions
 aback-0.004
 gol-0.003
 crossed-0.003
 nervous-0.003
 worn-0.003
Token he
Token ID339
Feature activation+1.52e-03

Pos logit contributions
oons+0.022
 pancakes+0.022
oral+0.021
 degrees+0.021
 dining+0.021
Neg logit contributions
 aback-0.011
 crossed-0.011
 worn-0.011
 gol-0.010
 nervous-0.010
Token couldn
Token ID3521
Feature activation-1.39e-02

Pos logit contributions
 aback+0.001
 gol+0.001
arted+0.001
 nervous+0.001
 crossed+0.001
Neg logit contributions
 pancakes-0.003
oons-0.003
 degrees-0.003
oral-0.003
 chores-0.003
Token't
Token ID470
Feature activation+1.12e-02

Pos logit contributions
 dining+0.029
 degrees+0.029
 pancakes+0.029
 beds+0.028
 bedrooms+0.027
Neg logit contributions
 crossed-0.012
 reached-0.011
 aback-0.011
 It-0.011
 so-0.011
Token un
Token ID555
Feature activation+2.34e-02

Pos logit contributions
 crossed+0.011
 reached+0.011
 below+0.011
 guessed+0.011
 touched+0.011
Neg logit contributions
 degrees-0.021
oons-0.021
 dining-0.021
 pancakes-0.021
oral-0.020
Token-
Token ID12
Feature activation+6.59e-02

Pos logit contributions
 aback+0.032
 crossed+0.032
 gol+0.031
 nervous+0.031
 Paige+0.031
Neg logit contributions
 pancakes-0.032
oons-0.032
 chores-0.032
 degrees-0.031
 ambul-0.031
Tokenm
Token ID76
Feature activation+3.04e-02

Pos logit contributions
 aback+0.075
 gol+0.073
 nervous+0.070
 crossed+0.069
 perplex+0.069
Neg logit contributions
 pancakes-0.109
oons-0.107
 degrees-0.106
oral-0.105
 chores-0.105
Tokenelt
Token ID2120
Feature activation+3.15e-02

Pos logit contributions
 nervous+0.031
 aback+0.031
 surprised+0.031
 embarrassed+0.030
 amazed+0.030
Neg logit contributions
oral-0.055
 dining-0.053
oons-0.053
 occasions-0.052
 pancakes-0.052
Token it
Token ID340
Feature activation-1.84e-02

Pos logit contributions
 aback+0.036
 nervous+0.034
 worn+0.033
 crossed+0.033
 surprised+0.033
Neg logit contributions
oons-0.054
oral-0.054
 pancakes-0.053
 degrees-0.052
 dining-0.051
Token.
Token ID13
Feature activation+7.83e-03

Pos logit contributions
oral+0.036
 pancakes+0.035
oons+0.035
 dining+0.034
 degrees+0.034
Neg logit contributions
 so-0.019
 around-0.018
 aback-0.018
 nervous-0.018
 scared-0.017
Token (space)
Token ID220
Feature activation-3.65e-03

Pos logit contributions
 gol+0.009
 seen+0.009
amed+0.008
tie+0.008
suit+0.008
Neg logit contributions
oral-0.013
 911-0.012
 dining-0.012
 beds-0.012
 cereal-0.012
Token,
Token ID11
Feature activation+1.31e-03

Pos logit contributions
 below+0.005
 around+0.005
 so+0.005
 crossed+0.005
 reached+0.005
Neg logit contributions
oral-0.010
 ambul-0.010
 911-0.010
 degrees-0.010
 pancakes-0.010
Token because
Token ID780
Feature activation+1.33e-02

Pos logit contributions
 aback+0.001
 gol+0.001
arted+0.001
 crossed+0.001
 worn+0.001
Neg logit contributions
 pancakes-0.002
 degrees-0.002
 ambul-0.002
oons-0.002
 chores-0.002
Token you
Token ID345
Feature activation+1.56e-02

Pos logit contributions
 worn+0.012
 nervous+0.012
 aback+0.012
 gol+0.012
 crossed+0.012
Neg logit contributions
 degrees-0.025
oons-0.025
oral-0.025
 pancakes-0.024
 dining-0.024
Token never
Token ID1239
Feature activation+7.14e-03

Pos logit contributions
 gol+0.017
 aback+0.017
arted+0.016
 hesitant+0.016
 perplex+0.015
Neg logit contributions
 pancakes-0.027
oons-0.026
 degrees-0.026
 dining-0.026
 breakfast-0.026
Token know
Token ID760
Feature activation-1.06e-02

Pos logit contributions
 gol+0.007
 aback+0.006
 Philip+0.006
 perplex+0.006
arted+0.006
Neg logit contributions
 pancakes-0.013
 degrees-0.013
 soup-0.012
 breakfast-0.012
 dining-0.012
Token-
Token ID12
Feature activation+6.58e-02

Pos logit contributions
oral+0.020
oons+0.020
 dining+0.020
 degrees+0.019
 pancakes+0.019
Neg logit contributions
 below-0.009
 aback-0.009
 so-0.009
 surprised-0.009
 worn-0.009
Token one
Token ID530
Feature activation+8.06e-03

Pos logit contributions
 aback+0.087
 gol+0.087
 crossed+0.084
 nervous+0.084
 hesitant+0.083
Neg logit contributions
 pancakes-0.097
oons-0.096
 degrees-0.096
oral-0.093
 ambul-0.091
Token of
Token ID286
Feature activation-5.36e-03

Pos logit contributions
 aback+0.010
 gol+0.010
 crossed+0.010
 nervous+0.009
 worn+0.009
Neg logit contributions
 pancakes-0.013
oons-0.012
 degrees-0.012
oral-0.012
 ambul-0.012
Token your
Token ID534
Feature activation-7.20e-03

Pos logit contributions
oons+0.010
oral+0.010
 degrees+0.010
 pancakes+0.010
 dining+0.010
Neg logit contributions
 aback-0.005
 gol-0.005
 worn-0.005
 crossed-0.005
 below-0.005
Token wishes
Token ID12802
Feature activation-2.03e-02

Pos logit contributions
oons+0.012
 911+0.012
 ambul+0.012
oral+0.012
 pancakes+0.012
Neg logit contributions
 nervous-0.008
 aback-0.008
 worn-0.008
 surprised-0.007
 below-0.007
Token might
Token ID1244
Feature activation+2.47e-02

Pos logit contributions
oral+0.036
 pancakes+0.035
oons+0.035
 ambul+0.035
 degrees+0.035
Neg logit contributions
 crossed-0.020
 below-0.020
 aback-0.020
 seen-0.020
 reached-0.020
Token thread
Token ID4704
Feature activation+3.05e-02

Pos logit contributions
oons+0.019
 pancakes+0.019
 degrees+0.018
oral+0.018
 dining+0.018
Neg logit contributions
 aback-0.018
 gol-0.017
 nervous-0.017
 worn-0.017
 crossed-0.017
Token.
Token ID13
Feature activation+4.96e-03

Pos logit contributions
 aback+0.031
 crossed+0.030
 seen+0.029
 gol+0.029
 around+0.029
Neg logit contributions
oral-0.057
 pancakes-0.057
 dining-0.056
 degrees-0.054
 ambul-0.054
Token They
Token ID1119
Feature activation+5.97e-03

Pos logit contributions
 aback+0.005
 gol+0.005
arted+0.005
 crossed+0.005
 seen+0.005
Neg logit contributions
oons-0.009
 pancakes-0.009
oral-0.009
 degrees-0.009
 dining-0.008
Token have
Token ID423
Feature activation+5.39e-03

Pos logit contributions
 aback+0.008
 gol+0.008
arted+0.007
 hesitant+0.007
 crossed+0.007
Neg logit contributions
 pancakes-0.009
 chores-0.009
oons-0.008
 degrees-0.008
oral-0.008
Token no
Token ID645
Feature activation-1.27e-02

Pos logit contributions
 aback+0.007
 gol+0.007
 crossed+0.007
 nervous+0.006
 worn+0.006
Neg logit contributions
 pancakes-0.008
oons-0.008
 degrees-0.008
oral-0.008
 chores-0.008
Token neck
Token ID7393
Feature activation+6.58e-02

Pos logit contributions
oons+0.019
 pancakes+0.019
 degrees+0.018
oral+0.018
 dining+0.018
Neg logit contributions
 aback-0.017
 gol-0.017
 worn-0.016
 nervous-0.016
 crossed-0.016
Tokenl
Token ID75
Feature activation+1.99e-02

Pos logit contributions
 aback+0.084
 surprised+0.081
 crossed+0.081
 below+0.081
 nervous+0.081
Neg logit contributions
 dining-0.104
 pancakes-0.101
oral-0.101
 ambul-0.096
oons-0.096
Tokenaces
Token ID2114
Feature activation+4.63e-02

Pos logit contributions
 aback+0.017
 gol+0.015
 crossed+0.015
 nervous+0.015
arted+0.014
Neg logit contributions
 pancakes-0.040
oons-0.039
oral-0.039
 degrees-0.039
 dining-0.039
Token and
Token ID290
Feature activation+7.02e-04

Pos logit contributions
 aback+0.057
 gol+0.055
 crossed+0.054
 nervous+0.053
 worn+0.053
Neg logit contributions
 pancakes-0.073
oons-0.072
 degrees-0.071
oral-0.071
 dining-0.070
Token brace
Token ID22581
Feature activation+4.69e-02

Pos logit contributions
 worn+0.001
 crossed+0.001
 aback+0.001
 nervous+0.001
 seen+0.001
Neg logit contributions
oons-0.001
 pancakes-0.001
oral-0.001
 dining-0.001
 degrees-0.001
Tokenlets
Token ID5289
Feature activation+4.39e-02

Pos logit contributions
 aback+0.063
 gol+0.062
 perplex+0.062
 overwhelmed+0.061
 nervous+0.060
Neg logit contributions
 degrees-0.066
oons-0.066
 911-0.062
 pancakes-0.062
 chores-0.061
Token,
Token ID11
Feature activation+2.29e-02

Pos logit contributions
oral+0.001
 degrees+0.001
 ambul+0.001
 dining+0.001
 sy+0.001
Neg logit contributions
 Sally-0.000
 so-0.000
 crossed-0.000
 Bill-0.000
 around-0.000
Token she
Token ID673
Feature activation-6.73e-03

Pos logit contributions
 aback+0.025
 nervous+0.024
 crossed+0.024
 below+0.024
 around+0.023
Neg logit contributions
oral-0.041
oons-0.040
 degrees-0.040
 pancakes-0.039
 dining-0.039
Token always
Token ID1464
Feature activation+1.47e-02

Pos logit contributions
 pancakes+0.012
oons+0.012
oral+0.012
 ambul+0.012
 degrees+0.012
Neg logit contributions
 aback-0.007
 gol-0.006
arted-0.006
 nervous-0.006
 worn-0.006
Token remembered
Token ID12086
Feature activation-8.85e-03

Pos logit contributions
 aback+0.014
 gol+0.014
arted+0.013
 below+0.013
 perplex+0.012
Neg logit contributions
 pancakes-0.027
 dining-0.026
 soup-0.026
 degrees-0.026
 breakfast-0.026
Token how
Token ID703
Feature activation-1.64e-02

Pos logit contributions
oons+0.018
 degrees+0.017
oral+0.017
 pancakes+0.017
 dining+0.017
Neg logit contributions
 aback-0.008
 seen-0.007
 worn-0.007
 crossed-0.007
 below-0.007
Token kind
Token ID1611
Feature activation+6.57e-02

Pos logit contributions
oons+0.030
oral+0.029
 pancakes+0.029
 degrees+0.029
 movie+0.029
Neg logit contributions
 nervous-0.017
 aback-0.017
 worn-0.017
 surprised-0.016
 scared-0.016
Token the
Token ID262
Feature activation-3.04e-03

Pos logit contributions
 aback+0.089
 gol+0.085
 crossed+0.085
 nervous+0.083
 worn+0.083
Neg logit contributions
 pancakes-0.097
oons-0.095
 degrees-0.095
oral-0.094
 dining-0.093
Token bear
Token ID6842
Feature activation+9.58e-03

Pos logit contributions
oons+0.006
 pancakes+0.006
 degrees+0.005
oup+0.005
 panc+0.005
Neg logit contributions
 scared-0.003
 nervous-0.003
 en-0.003
 frightened-0.003
 worn-0.003
Token was
Token ID373
Feature activation-1.09e-03

Pos logit contributions
 aback+0.008
 crossed+0.007
 nervous+0.007
 reached+0.007
 gol+0.007
Neg logit contributions
 pancakes-0.019
oons-0.019
oral-0.019
 degrees-0.019
 dining-0.018
Token to
Token ID284
Feature activation+1.26e-05

Pos logit contributions
oons+0.002
 pancakes+0.002
oral+0.002
 degrees+0.002
 dining+0.002
Neg logit contributions
 aback-0.001
 crossed-0.001
 nervous-0.001
 gol-0.001
 seen-0.001
Token share
Token ID2648
Feature activation+1.52e-02

Pos logit contributions
 aback+0.000
 crossed+0.000
 guessed+0.000
 gol+0.000
 reached+0.000
Neg logit contributions
oons-0.000
 pancakes-0.000
 degrees-0.000
oral-0.000
 ambul-0.000
Token bear
Token ID6842
Feature activation+4.32e-03

Pos logit contributions
 pancakes+0.002
oons+0.002
 degrees+0.002
oral+0.002
 dining+0.002
Neg logit contributions
 aback-0.002
 gol-0.002
 crossed-0.002
 nervous-0.002
 worn-0.002
Token and
Token ID290
Feature activation+1.12e-02

Pos logit contributions
 aback+0.005
 below+0.004
 around+0.004
 crossed+0.004
 surprised+0.004
Neg logit contributions
oons-0.008
 degrees-0.008
 pancakes-0.008
 ambul-0.007
 occasions-0.007
Token thought
Token ID1807
Feature activation-5.30e-03

Pos logit contributions
 aback+0.011
 gol+0.010
 nervous+0.010
suit+0.010
 worn+0.010
Neg logit contributions
 pancakes-0.020
 ambul-0.020
 911-0.020
oral-0.019
 degrees-0.019
Token about
Token ID546
Feature activation+3.35e-02

Pos logit contributions
 dining+0.011
 pancakes+0.011
oral+0.011
 Year+0.010
oons+0.010
Neg logit contributions
 crossed-0.005
 around-0.005
 aback-0.005
 below-0.004
 so-0.004
Token how
Token ID703
Feature activation-8.28e-03

Pos logit contributions
 aback+0.040
 crossed+0.037
 gol+0.037
 below+0.036
 nervous+0.036
Neg logit contributions
oons-0.057
 pancakes-0.056
 degrees-0.055
oral-0.054
 dining-0.053
Token wise
Token ID10787
Feature activation+6.55e-02

Pos logit contributions
 pancakes+0.015
oons+0.014
 degrees+0.014
oral+0.014
 dining+0.014
Neg logit contributions
 aback-0.009
 gol-0.008
 crossed-0.008
 nervous-0.008
 worn-0.008
Token and
Token ID290
Feature activation-2.66e-03

Pos logit contributions
 aback+0.077
 gol+0.072
 worn+0.072
arted+0.071
 seen+0.071
Neg logit contributions
 pancakes-0.107
 degrees-0.104
oons-0.104
 ambul-0.103
 chores-0.102
Token kind
Token ID1611
Feature activation+6.14e-02

Pos logit contributions
 ambul+0.004
 degrees+0.004
 chores+0.004
 pancakes+0.004
 911+0.004
Neg logit contributions
 aback-0.003
 gol-0.003
suit-0.003
arted-0.003
 Paige-0.003
Token animals
Token ID4695
Feature activation+3.04e-02

Pos logit contributions
 aback+0.073
 gol+0.071
 worn+0.071
 amazed+0.070
 nervous+0.070
Neg logit contributions
 pancakes-0.097
 degrees-0.094
oons-0.094
 ambul-0.093
oral-0.093
Token can
Token ID460
Feature activation-1.19e-02

Pos logit contributions
 aback+0.031
 crossed+0.030
 below+0.029
 nervous+0.028
 surprised+0.028
Neg logit contributions
 pancakes-0.056
oons-0.056
oral-0.055
 degrees-0.055
 dining-0.054
Token be
Token ID307
Feature activation+1.90e-02

Pos logit contributions
 pancakes+0.021
oons+0.021
 degrees+0.021
 dining+0.020
 beds+0.020
Neg logit contributions
 crossed-0.014
 aback-0.014
 reached-0.013
 below-0.013
 scared-0.013
Tokenie
Token ID494
Feature activation-5.37e-03

Pos logit contributions
 chores+0.013
 911+0.013
 ambul+0.012
 dining+0.012
 pancakes+0.012
Neg logit contributions
 reached-0.015
 guessed-0.015
 crossed-0.015
 surprised-0.015
 amazed-0.015
Token was
Token ID373
Feature activation+3.69e-03

Pos logit contributions
 pancakes+0.011
oons+0.011
 degrees+0.011
 chores+0.011
oral+0.011
Neg logit contributions
 crossed-0.004
 reached-0.004
 surprised-0.004
 aback-0.004
 guessed-0.004
Token surprised
Token ID6655
Feature activation+1.10e-03

Pos logit contributions
 gol+0.003
 aback+0.002
suit+0.002
 Philip+0.002
arrass+0.002
Neg logit contributions
 pancakes-0.008
 breakfast-0.008
 dining-0.008
 chores-0.007
 soup-0.007
Token at
Token ID379
Feature activation+2.53e-03

Pos logit contributions
 aback+0.001
 nervous+0.001
 surprised+0.001
 gol+0.001
 crossed+0.001
Neg logit contributions
oons-0.002
 pancakes-0.002
 degrees-0.002
oral-0.002
 ambul-0.002
Token how
Token ID703
Feature activation-1.65e-02

Pos logit contributions
 aback+0.003
 gol+0.003
 crossed+0.003
 worn+0.003
 nervous+0.003
Neg logit contributions
 pancakes-0.004
oons-0.004
 degrees-0.004
oral-0.004
 dining-0.004
Token kind
Token ID1611
Feature activation+6.55e-02

Pos logit contributions
oral+0.032
oons+0.032
 pancakes+0.031
 degrees+0.031
 ambul+0.031
Neg logit contributions
 nervous-0.016
 aback-0.015
 surprised-0.015
 embarrassed-0.015
 worn-0.015
Token he
Token ID339
Feature activation-2.40e-03

Pos logit contributions
 aback+0.094
 crossed+0.089
 gol+0.088
 nervous+0.087
 reached+0.087
Neg logit contributions
 pancakes-0.091
oons-0.091
 degrees-0.091
oral-0.090
 dining-0.088
Token was
Token ID373
Feature activation+8.95e-03

Pos logit contributions
 degrees+0.004
 pancakes+0.003
abel+0.003
oons+0.003
 movie+0.003
Neg logit contributions
 gol-0.003
 aback-0.003
 Philip-0.003
arted-0.003
suit-0.003
Token and
Token ID290
Feature activation-5.01e-03

Pos logit contributions
 aback+0.011
 gol+0.011
 crossed+0.010
arted+0.010
 worn+0.010
Neg logit contributions
 pancakes-0.014
 degrees-0.014
oons-0.014
 chores-0.013
 ambul-0.013
Token smiled
Token ID13541
Feature activation+1.85e-02

Pos logit contributions
 pancakes+0.010
 degrees+0.009
oral+0.009
oons+0.009
 dining+0.009
Neg logit contributions
 aback-0.005
 gol-0.004
 crossed-0.004
arted-0.004
suit-0.004
Token.
Token ID13
Feature activation+5.48e-04

Pos logit contributions
 aback+0.021
 gol+0.020
 nervous+0.020
 crossed+0.020
 surprised+0.019
Neg logit contributions
 pancakes-0.031
oons-0.031
 degrees-0.031
oral-0.031
 ambul-0.030
Token were
Token ID547
Feature activation-5.50e-03

Pos logit contributions
 aback+0.008
suit+0.007
arted+0.007
 gol+0.007
 perplex+0.007
Neg logit contributions
 pancakes-0.006
 ambul-0.006
 breakfast-0.006
 chores-0.006
 movie-0.006
Token so
Token ID523
Feature activation-4.44e-03

Pos logit contributions
 pancakes+0.011
 breakfast+0.010
 chores+0.010
oons+0.010
 dining+0.010
Neg logit contributions
 aback-0.005
 gol-0.005
arted-0.005
suit-0.005
 crossed-0.005
Token impressed
Token ID12617
Feature activation+1.65e-02

Pos logit contributions
 pancakes+0.009
oons+0.009
oral+0.009
 dining+0.009
 chores+0.009
Neg logit contributions
 aback-0.004
 crossed-0.003
arted-0.003
 gol-0.003
suit-0.003
Token with
Token ID351
Feature activation+7.39e-03

Pos logit contributions
 aback+0.023
 worn+0.023
 overwhelmed+0.022
 perplex+0.022
arted+0.022
Neg logit contributions
 pancakes-0.022
oons-0.022
oral-0.021
 movie-0.021
 soup-0.021
Token how
Token ID703
Feature activation-6.57e-03

Pos logit contributions
suit+0.012
 aback+0.012
 gol+0.012
 worn+0.012
 hesitant+0.012
Neg logit contributions
 chores-0.009
 pancakes-0.009
 911-0.008
 breakfast-0.008
 ambul-0.008
Token wise
Token ID10787
Feature activation+6.53e-02

Pos logit contributions
 pancakes+0.011
oons+0.011
 degrees+0.010
 chores+0.010
oral+0.010
Neg logit contributions
 aback-0.008
 gol-0.007
 crossed-0.007
arted-0.007
 worn-0.007
Token Tom
Token ID4186
Feature activation+2.03e-02

Pos logit contributions
 aback+0.091
arted+0.086
 perplex+0.085
 worn+0.084
 seen+0.083
Neg logit contributions
 pancakes-0.092
oons-0.090
 ambul-0.090
 chores-0.090
oral-0.089
Token was
Token ID373
Feature activation+7.63e-03

Pos logit contributions
 aback+0.018
 crossed+0.017
 gol+0.016
 nervous+0.016
 worn+0.016
Neg logit contributions
 pancakes-0.040
oons-0.039
 degrees-0.039
oral-0.039
 dining-0.038
Token.
Token ID13
Feature activation+2.05e-03

Pos logit contributions
 aback+0.009
 gol+0.008
 crossed+0.008
arted+0.008
 worn+0.008
Neg logit contributions
 pancakes-0.013
oons-0.013
 degrees-0.012
oral-0.012
 chores-0.012
Token<|endoftext|>
Token ID50256
Feature activation-9.96e-04

Pos logit contributions
 aback+0.003
 seen+0.003
 worn+0.003
 source+0.003
 overwhelmed+0.003
Neg logit contributions
oral-0.003
oons-0.003
 beds-0.003
 911-0.003
 movie-0.003
TokenOnce
Token ID7454
Feature activation-1.17e-02

Pos logit contributions
oons+0.002
 movie+0.002
 breakfast+0.001
oral+0.001
 beds+0.001
Neg logit contributions
 gol-0.001
arrass-0.001
arted-0.001
 overwhelmed-0.001
 worn-0.001
Token it
Token ID340
Feature activation-1.79e-02

Pos logit contributions
 pancakes+0.006
 chores+0.006
 911+0.005
 degrees+0.005
 ambul+0.005
Neg logit contributions
 aback-0.004
 gol-0.004
 worn-0.004
 perplex-0.004
 overwhelmed-0.004
Token.
Token ID13
Feature activation+7.10e-03

Pos logit contributions
oons+0.024
 chores+0.024
 pancakes+0.024
 degrees+0.024
 911+0.023
Neg logit contributions
 gol-0.026
 aback-0.025
 crossed-0.024
 worn-0.024
 perplex-0.024
Token He
Token ID679
Feature activation+6.53e-03

Pos logit contributions
 seen+0.009
 below+0.009
 aback+0.008
amed+0.008
 gol+0.008
Neg logit contributions
 ambul-0.012
 beds-0.011
 911-0.011
 sy-0.011
 chores-0.011
Token told
Token ID1297
Feature activation-8.30e-04

Pos logit contributions
 aback+0.006
 gol+0.006
 below+0.006
suit+0.005
 overwhelmed+0.005
Neg logit contributions
 pancakes-0.012
 degrees-0.012
oons-0.012
oral-0.012
 ambul-0.011
Token Wh
Token ID854
Feature activation+1.63e-02

Pos logit contributions
 pancakes+0.001
 ambul+0.001
 chores+0.001
 911+0.001
 breakfast+0.001
Neg logit contributions
 aback-0.001
 below-0.001
 gol-0.001
 amazed-0.001
 overwhelmed-0.001
Tokenisk
Token ID1984
Feature activation+6.53e-02

Pos logit contributions
 aback+0.018
arted+0.017
 gol+0.017
 crossed+0.016
 seen+0.016
Neg logit contributions
 pancakes-0.029
 dining-0.028
 degrees-0.028
 beds-0.027
 breakfast-0.027
Tokeners
Token ID364
Feature activation+4.69e-02

Pos logit contributions
 aback+0.095
 gol+0.090
 crossed+0.090
 worn+0.087
 nervous+0.087
Neg logit contributions
 pancakes-0.090
oons-0.088
 degrees-0.087
oral-0.087
 dining-0.085
Token to
Token ID284
Feature activation+2.21e-03

Pos logit contributions
 aback+0.047
 surprised+0.045
 so+0.045
 Bill+0.044
 around+0.044
Neg logit contributions
 pancakes-0.088
 degrees-0.088
 dining-0.087
oral-0.086
 occasions-0.086
Token look
Token ID804
Feature activation-1.89e-02

Pos logit contributions
 aback+0.003
 gol+0.003
 amazed+0.003
 nervous+0.003
 surprised+0.003
Neg logit contributions
 pancakes-0.003
 chores-0.003
 breakfast-0.003
 ambul-0.003
 911-0.003
Token at
Token ID379
Feature activation+1.20e-02

Pos logit contributions
 pancakes+0.032
oons+0.031
 degrees+0.031
oral+0.030
 chores+0.030
Neg logit contributions
 aback-0.022
 gol-0.021
 crossed-0.020
 worn-0.020
 nervous-0.020
Token him
Token ID683
Feature activation-7.58e-03

Pos logit contributions
 gol+0.018
 overwhelmed+0.018
amed+0.018
 aback+0.017
 amazed+0.017
Neg logit contributions
 ambul-0.017
 911-0.016
 chores-0.016
 beds-0.015
 fires-0.015
Token was
Token ID373
Feature activation+7.62e-04

Pos logit contributions
 aback+0.007
 gol+0.006
 crossed+0.006
 nervous+0.006
 worn+0.006
Neg logit contributions
 pancakes-0.016
oons-0.016
 degrees-0.016
oral-0.016
 dining-0.016
Token very
Token ID845
Feature activation-2.08e-02

Pos logit contributions
 gol+0.001
 aback+0.001
suit+0.001
arted+0.001
arrass+0.000
Neg logit contributions
 pancakes-0.002
 dining-0.002
 breakfast-0.002
 ambul-0.002
 degrees-0.002
Token pleased
Token ID10607
Feature activation+9.01e-03

Pos logit contributions
 pancakes+0.041
 degrees+0.039
 dining+0.039
oral+0.039
 chores+0.039
Neg logit contributions
 aback-0.017
arted-0.016
 gol-0.016
tie-0.016
 crossed-0.016
Token with
Token ID351
Feature activation+3.90e-03

Pos logit contributions
 aback+0.010
 gol+0.009
 crossed+0.009
 nervous+0.009
 worn+0.009
Neg logit contributions
 pancakes-0.016
oons-0.016
 degrees-0.016
oral-0.015
 dining-0.015
Token how
Token ID703
Feature activation-1.01e-02

Pos logit contributions
 gol+0.006
suit+0.006
 aback+0.006
 crossed+0.005
 nervous+0.005
Neg logit contributions
 pancakes-0.005
 chores-0.005
 breakfast-0.005
 911-0.005
 ambul-0.005
Token wise
Token ID10787
Feature activation+6.52e-02

Pos logit contributions
 pancakes+0.019
oons+0.018
oral+0.018
 degrees+0.018
 ambul+0.018
Neg logit contributions
 aback-0.010
 crossed-0.010
 nervous-0.010
 gol-0.010
 surprised-0.009
Token he
Token ID339
Feature activation+1.62e-03

Pos logit contributions
 aback+0.081
 gol+0.077
 crossed+0.076
 worn+0.075
 nervous+0.075
Neg logit contributions
 pancakes-0.103
oons-0.101
 degrees-0.100
oral-0.099
 ambul-0.098
Token had
Token ID550
Feature activation+1.70e-02

Pos logit contributions
 gol+0.002
 aback+0.002
arted+0.002
 Paige+0.002
 Philip+0.002
Neg logit contributions
oons-0.002
 degrees-0.002
 pancakes-0.002
 soup-0.002
oral-0.002
Token been
Token ID587
Feature activation+1.06e-02

Pos logit contributions
suit+0.018
arted+0.017
 gol+0.017
 nervous+0.016
 hesitant+0.016
Neg logit contributions
 pancakes-0.029
 breakfast-0.029
 chores-0.028
 soup-0.028
 occasions-0.028
Token.
Token ID13
Feature activation+5.38e-03

Pos logit contributions
 gol+0.012
 aback+0.012
arted+0.012
suit+0.011
 crossed+0.011
Neg logit contributions
 pancakes-0.018
 degrees-0.017
 chores-0.017
oons-0.017
 ambul-0.017
Token With
Token ID2080
Feature activation-9.73e-03

Pos logit contributions
 gol+0.004
 aback+0.004
 worn+0.004
suit+0.004
 nervous+0.004
Neg logit contributions
oral-0.010
oons-0.010
 pancakes-0.010
 dining-0.010
 ambul-0.010
Token the
Token ID262
Feature activation-2.18e-03

Pos logit contributions
 aback+0.018
 gol+0.017
 crossed+0.017
 nervous+0.016
 worn+0.016
Neg logit contributions
 pancakes-0.034
oons-0.033
 degrees-0.033
oral-0.032
 ambul-0.032
Token frog
Token ID21264
Feature activation+4.82e-03

Pos logit contributions
 pancakes+0.003
oral+0.003
oons+0.003
 degrees+0.003
 soup+0.003
Neg logit contributions
 aback-0.003
 crossed-0.003
arted-0.003
 gol-0.003
 seen-0.003
Token always
Token ID1464
Feature activation+1.55e-02

Pos logit contributions
 aback+0.004
 crossed+0.004
 reached+0.004
 gol+0.004
 surprised+0.004
Neg logit contributions
 degrees-0.009
 pancakes-0.009
oons-0.009
 911-0.009
 dining-0.009
Token remembered
Token ID12086
Feature activation-6.73e-03

Pos logit contributions
 aback+0.014
 gol+0.014
 below+0.013
arted+0.013
 perplex+0.012
Neg logit contributions
 pancakes-0.029
 soup-0.028
 breakfast-0.028
 degrees-0.027
oons-0.027
Token how
Token ID703
Feature activation-1.76e-02

Pos logit contributions
oons+0.013
 degrees+0.013
oral+0.013
 pancakes+0.013
 dining+0.012
Neg logit contributions
 aback-0.006
 crossed-0.006
 worn-0.006
 gol-0.006
 seen-0.006
Token kind
Token ID1611
Feature activation+6.50e-02

Pos logit contributions
 pancakes+0.033
oons+0.033
oral+0.033
 degrees+0.033
 dining+0.032
Neg logit contributions
 nervous-0.016
 aback-0.016
 worn-0.016
 surprised-0.016
 embarrassed-0.015
Token the
Token ID262
Feature activation-4.21e-03

Pos logit contributions
 aback+0.080
 gol+0.076
 crossed+0.075
 worn+0.074
 nervous+0.074
Neg logit contributions
 pancakes-0.103
oons-0.101
 degrees-0.099
oral-0.099
 ambul-0.098
Token man
Token ID582
Feature activation+2.15e-02

Pos logit contributions
oons+0.007
 pancakes+0.007
 degrees+0.007
 911+0.007
 ambul+0.007
Neg logit contributions
 aback-0.005
 gol-0.005
 nervous-0.005
 crossed-0.004
 surprised-0.004
Token had
Token ID550
Feature activation+7.44e-03

Pos logit contributions
 aback+0.020
 gol+0.019
 crossed+0.018
 nervous+0.017
 worn+0.017
Neg logit contributions
 pancakes-0.041
oons-0.040
 degrees-0.040
oral-0.040
 chores-0.039
Token been
Token ID587
Feature activation+4.90e-03

Pos logit contributions
 gol+0.007
 aback+0.007
arted+0.007
suit+0.006
 nervous+0.006
Neg logit contributions
 pancakes-0.014
 chores-0.013
 breakfast-0.013
 soup-0.013
oons-0.013
Token and
Token ID290
Feature activation+4.76e-03

Pos logit contributions
 gol+0.005
 aback+0.005
arted+0.005
 crossed+0.005
suit+0.005
Neg logit contributions
 pancakes-0.008
 chores-0.008
 degrees-0.008
oons-0.008
oral-0.008
Token did
Token ID750
Feature activation+5.81e-03

Pos logit contributions
 gol+0.004
 aback+0.004
 Philip+0.004
 Paige+0.004
arted+0.004
Neg logit contributions
 ambul-0.006
 pancakes-0.006
 chores-0.006
 degrees-0.006
oral-0.006
Token not
Token ID407
Feature activation-2.47e-03

Pos logit contributions
 aback+0.006
 crossed+0.005
 gol+0.005
 nervous+0.005
 surprised+0.005
Neg logit contributions
 pancakes-0.011
oons-0.011
 degrees-0.010
oral-0.010
 dining-0.010
Token know
Token ID760
Feature activation-1.21e-02

Pos logit contributions
 chores+0.004
 911+0.004
 ambul+0.004
 pancakes+0.004
 excuse+0.004
Neg logit contributions
arted-0.003
 aback-0.003
 Luckily-0.003
amed-0.003
 gol-0.003
Token what
Token ID644
Feature activation-1.65e-02

Pos logit contributions
oral+0.023
 dining+0.022
oons+0.022
 pancakes+0.021
 occasions+0.021
Neg logit contributions
 below-0.012
 surprised-0.012
 crossed-0.012
 so-0.012
 amazed-0.011
Token ost
Token ID23619
Feature activation+8.33e-03

Pos logit contributions
 pancakes+0.032
oons+0.031
 degrees+0.031
oral+0.031
 dining+0.031
Neg logit contributions
 aback-0.015
 crossed-0.014
 gol-0.014
 nervous-0.014
 worn-0.013
Tokenric
Token ID1173
Feature activation+6.49e-02

Pos logit contributions
 aback+0.014
 gol+0.013
 nervous+0.013
 worn+0.013
 perplex+0.013
Neg logit contributions
oons-0.010
oral-0.010
 degrees-0.010
 ambul-0.010
 pancakes-0.010
Tokenhes
Token ID956
Feature activation+1.86e-02

Pos logit contributions
 aback+0.068
 gol+0.064
 crossed+0.063
 nervous+0.062
 worn+0.062
Neg logit contributions
 pancakes-0.115
oons-0.113
 degrees-0.112
oral-0.111
 dining-0.110
Token were
Token ID547
Feature activation-3.47e-03

Pos logit contributions
 crossed+0.019
 aback+0.018
 nervous+0.018
 surprised+0.018
 below+0.018
Neg logit contributions
 dining-0.032
 pancakes-0.032
oral-0.032
oons-0.032
 degrees-0.032
Token.
Token ID13
Feature activation-3.14e-03

Pos logit contributions
 pancakes+0.006
oons+0.006
oral+0.006
 degrees+0.006
 dining+0.006
Neg logit contributions
 aback-0.004
 crossed-0.004
 nervous-0.004
 guessed-0.004
 gol-0.004
Token They
Token ID1119
Feature activation-3.92e-03

Pos logit contributions
 pancakes+0.005
 beds+0.005
oons+0.005
 degrees+0.005
 TV+0.005
Neg logit contributions
 aback-0.004
 gol-0.004
 seen-0.004
 worn-0.004
amed-0.004
Token looked
Token ID3114
Feature activation-7.32e-03

Pos logit contributions
 ambul+0.007
 chores+0.007
 pancakes+0.007
 degrees+0.007
 beds+0.007
Neg logit contributions
 aback-0.004
 gol-0.004
 Philip-0.003
lot-0.003
arted-0.003
Token Both
Token ID5747
Feature activation+1.43e-02

Pos logit contributions
 seen+0.003
 gol+0.003
 crossed+0.003
suit+0.003
 caught+0.003
Neg logit contributions
oral-0.005
 dining-0.005
 911-0.005
 TV-0.005
oons-0.005
Token of
Token ID286
Feature activation-3.00e-02

Pos logit contributions
 aback+0.019
 gol+0.018
 crossed+0.018
 nervous+0.018
 worn+0.018
Neg logit contributions
 pancakes-0.021
oons-0.021
 degrees-0.021
oral-0.020
 chores-0.020
Token them
Token ID606
Feature activation-1.05e-02

Pos logit contributions
 pancakes+0.036
oons+0.035
 degrees+0.035
oral+0.035
 dining+0.034
Neg logit contributions
 aback-0.049
 gol-0.047
 crossed-0.047
 nervous-0.046
 worn-0.046
Token put
Token ID1234
Feature activation+9.48e-03

Pos logit contributions
oral+0.023
 degrees+0.022
oons+0.022
 pancakes+0.022
 911+0.022
Neg logit contributions
 crossed-0.007
 reached-0.007
 guessed-0.007
 aback-0.007
 surprised-0.007
Token their
Token ID511
Feature activation-1.24e-02

Pos logit contributions
 aback+0.012
 gol+0.012
 crossed+0.012
 nervous+0.012
 worn+0.012
Neg logit contributions
 pancakes-0.014
oons-0.014
 degrees-0.014
 chores-0.014
 breakfast-0.013
Token anx
Token ID7296
Feature activation-7.91e-02

Pos logit contributions
oral+0.015
oons+0.015
 911+0.015
 ambul+0.014
 dining+0.014
Neg logit contributions
 aback-0.019
 nervous-0.019
 worn-0.019
 surprised-0.019
 touched-0.018
Tokenieties
Token ID9545
Feature activation-1.99e-03

Pos logit contributions
oons+0.123
oral+0.120
 pancakes+0.120
 dining+0.120
 911+0.118
Neg logit contributions
 aback-0.098
arted-0.094
 gol-0.093
 guessed-0.092
arrass-0.092
Token aside
Token ID7263
Feature activation-5.13e-03

Pos logit contributions
oral+0.004
 dining+0.004
 ambul+0.004
 degrees+0.004
 occasions+0.004
Neg logit contributions
 around-0.002
 aback-0.001
 below-0.001
 crossed-0.001
 worn-0.001
Token.
Token ID13
Feature activation+8.09e-03

Pos logit contributions
oral+0.009
oons+0.009
 pancakes+0.009
 degrees+0.009
 ambul+0.009
Neg logit contributions
 aback-0.005
 so-0.005
 below-0.005
 around-0.005
 nervous-0.005
Token From
Token ID3574
Feature activation-1.37e-02

Pos logit contributions
 gol+0.008
 seen+0.008
 aback+0.007
 crossed+0.007
 worn+0.007
Neg logit contributions
oral-0.015
 911-0.015
 dining-0.015
oons-0.014
 TV-0.014
Token then
Token ID788
Feature activation+1.36e-02

Pos logit contributions
 pancakes+0.022
 degrees+0.021
oons+0.020
oral+0.020
 chores+0.020
Neg logit contributions
 aback-0.018
 gol-0.017
 crossed-0.016
 nervous-0.016
 hesitant-0.016
Token a
Token ID257
Feature activation-7.03e-04

Pos logit contributions
 aback+0.011
 gol+0.010
 crossed+0.010
 nervous+0.010
 worn+0.010
Neg logit contributions
 pancakes-0.021
oons-0.020
 degrees-0.020
oral-0.020
 ambul-0.020
Token nearby
Token ID6716
Feature activation+4.88e-03

Pos logit contributions
oons+0.001
 pancakes+0.001
 degrees+0.001
oral+0.001
 ambul+0.001
Neg logit contributions
 aback-0.001
 worn-0.001
 nervous-0.001
 crossed-0.001
 gol-0.001
Token tree
Token ID5509
Feature activation-7.56e-03

Pos logit contributions
 aback+0.005
 crossed+0.005
 gol+0.005
 nervous+0.005
 below+0.005
Neg logit contributions
oons-0.009
 pancakes-0.008
oral-0.008
 degrees-0.008
 chores-0.008
Token watched
Token ID7342
Feature activation-3.19e-02

Pos logit contributions
 pancakes+0.013
oons+0.012
 degrees+0.012
oral+0.012
 dining+0.012
Neg logit contributions
 aback-0.009
 gol-0.008
 crossed-0.008
 nervous-0.008
 worn-0.008
Token him
Token ID683
Feature activation-1.32e-02

Pos logit contributions
oral+0.072
 dining+0.070
oons+0.069
 pancakes+0.067
 chores+0.066
Neg logit contributions
 below-0.026
 around-0.025
 so-0.024
 nervous-0.023
 aback-0.022
Token int
Token ID493
Feature activation-7.53e-02

Pos logit contributions
oons+0.028
oral+0.027
 pancakes+0.026
 dining+0.026
 degrees+0.026
Neg logit contributions
 below-0.012
 nervous-0.012
 around-0.012
 so-0.012
 scared-0.011
Tokenently
Token ID1473
Feature activation-1.57e-02

Pos logit contributions
 dining+0.098
 ambul+0.090
 pancakes+0.088
 911+0.087
 sy+0.087
Neg logit contributions
ocked-0.117
 below-0.114
 crossed-0.114
apped-0.113
amed-0.112
Token.
Token ID13
Feature activation+5.88e-03

Pos logit contributions
oons+0.033
oral+0.032
 pancakes+0.032
 ambul+0.031
 dining+0.030
Neg logit contributions
 below-0.014
 nervous-0.014
 around-0.014
 aback-0.013
 surprised-0.013
Token\n
Token ID198
Feature activation-1.79e-03

Pos logit contributions
 seen+0.006
 gol+0.006
 worn+0.005
 aback+0.005
amed+0.005
Neg logit contributions
 cereal-0.010
oons-0.010
 dining-0.010
 pancakes-0.010
 911-0.010
Token\n
Token ID198
Feature activation-3.08e-03

Pos logit contributions
oons+0.003
oral+0.003
 pancakes+0.003
 breakfast+0.003
 degrees+0.003
Neg logit contributions
 gol-0.002
 aback-0.002
arted-0.002
 worn-0.002
 overwhelmed-0.002
TokenThe
Token ID464
Feature activation-1.02e-02

Pos logit contributions
 pancakes+0.005
oons+0.004
 degrees+0.004
 chores+0.004
 dining+0.004
Neg logit contributions
 aback-0.004
 crossed-0.004
 gol-0.004
 reached-0.004
 nervous-0.004
Token The
Token ID383
Feature activation-4.89e-03

Pos logit contributions
 gol+0.013
 known+0.011
amed+0.011
k+0.011
tie+0.011
Neg logit contributions
 cereal-0.013
 beds-0.013
 pancakes-0.013
 dining-0.012
oons-0.012
Token little
Token ID1310
Feature activation+2.90e-03

Pos logit contributions
oons+0.009
 degrees+0.008
 pancakes+0.008
 dining+0.008
oral+0.008
Neg logit contributions
 nervous-0.005
 worn-0.005
 aback-0.005
 scared-0.005
 surprised-0.005
Token lamb
Token ID19343
Feature activation+3.17e-03

Pos logit contributions
 aback+0.005
 gol+0.004
 crossed+0.004
 nervous+0.004
 worn+0.004
Neg logit contributions
oons-0.004
 pancakes-0.004
 degrees-0.003
oral-0.003
 dining-0.003
Token watched
Token ID7342
Feature activation-3.13e-02

Pos logit contributions
 aback+0.003
 gol+0.003
arted+0.002
 worn+0.002
 Paige+0.002
Neg logit contributions
 pancakes-0.006
oons-0.006
 degrees-0.006
 dining-0.006
 ambul-0.006
Token him
Token ID683
Feature activation-1.40e-02

Pos logit contributions
oons+0.075
oral+0.075
 dining+0.074
 degrees+0.072
 occasions+0.070
Neg logit contributions
 below-0.021
 around-0.020
 so-0.019
 nervous-0.019
 Sally-0.017
Token int
Token ID493
Feature activation-7.41e-02

Pos logit contributions
oons+0.030
oral+0.030
 degrees+0.029
 pancakes+0.028
 dining+0.028
Neg logit contributions
 around-0.013
 so-0.012
 nervous-0.012
 below-0.012
 scared-0.012
Tokenently
Token ID1473
Feature activation-1.60e-02

Pos logit contributions
 dining+0.085
 pancakes+0.079
 ambul+0.078
 degrees+0.078
 sy+0.077
Neg logit contributions
apped-0.125
 crossed-0.124
ocked-0.122
amed-0.122
arted-0.119
Token,
Token ID11
Feature activation-3.59e-03

Pos logit contributions
oral+0.036
oons+0.036
 occasions+0.034
 ambul+0.034
 pancakes+0.033
Neg logit contributions
 around-0.013
 below-0.013
 nervous-0.013
 so-0.012
 confused-0.012
Token and
Token ID290
Feature activation+3.52e-03

Pos logit contributions
oons+0.007
oral+0.007
 pancakes+0.006
 degrees+0.006
 chores+0.006
Neg logit contributions
 aback-0.004
 crossed-0.004
 nervous-0.004
 reached-0.004
 surprised-0.004
Token when
Token ID618
Feature activation+5.07e-03

Pos logit contributions
 aback+0.004
 gol+0.004
 crossed+0.004
 worn+0.004
 nervous+0.004
Neg logit contributions
 pancakes-0.006
oons-0.005
 degrees-0.005
 dining-0.005
 ambul-0.005
Token the
Token ID262
Feature activation-8.39e-03

Pos logit contributions
 crossed+0.005
 reached+0.005
 scared+0.005
 nervous+0.004
 Bill+0.004
Neg logit contributions
oons-0.011
 degrees-0.011
oral-0.010
 occasions-0.010
 pancakes-0.010
Token friends
Token ID2460
Feature activation+1.83e-02

Pos logit contributions
 pancakes+0.023
oons+0.023
 degrees+0.023
oral+0.023
 dining+0.022
Neg logit contributions
 aback-0.011
 gol-0.011
 crossed-0.011
 nervous-0.010
 worn-0.010
Token and
Token ID290
Feature activation+3.62e-03

Pos logit contributions
 aback+0.023
 gol+0.022
 crossed+0.021
 worn+0.021
 nervous+0.021
Neg logit contributions
 pancakes-0.029
oons-0.028
 degrees-0.028
oral-0.028
 ambul-0.028
Token shared
Token ID4888
Feature activation+1.42e-02

Pos logit contributions
 aback+0.003
 gol+0.003
 crossed+0.003
 worn+0.002
 nervous+0.002
Neg logit contributions
 pancakes-0.007
oons-0.007
 degrees-0.007
oral-0.007
 dining-0.007
Token their
Token ID511
Feature activation-9.44e-03

Pos logit contributions
 aback+0.026
 gol+0.025
arted+0.024
 hesitant+0.024
 overwhelmed+0.024
Neg logit contributions
 pancakes-0.018
 breakfast-0.017
 tea-0.016
 soup-0.016
 chores-0.015
Token l
Token ID300
Feature activation-1.03e-03

Pos logit contributions
 911+0.014
oons+0.014
 ambul+0.014
oral+0.013
 degrees+0.013
Neg logit contributions
 aback-0.013
 worn-0.012
 nervous-0.012
 gol-0.012
 seen-0.012
Tokenoll
Token ID692
Feature activation-7.21e-02

Pos logit contributions
oons+0.001
 pancakes+0.001
 degrees+0.001
oral+0.001
 breakfast+0.001
Neg logit contributions
 aback-0.002
 gol-0.002
 crossed-0.002
 nervous-0.002
 worn-0.002
Tokenip
Token ID541
Feature activation-3.94e-02

Pos logit contributions
 ambul+0.115
 dining+0.114
 911+0.112
oral+0.111
 degrees+0.108
Neg logit contributions
 worn-0.082
 aback-0.082
suit-0.080
tie-0.080
 surprised-0.080
Tokenops
Token ID2840
Feature activation-3.21e-02

Pos logit contributions
 911+0.043
oral+0.042
 ambul+0.042
 degrees+0.041
 dining+0.040
Neg logit contributions
 aback-0.066
 crossed-0.065
 around-0.064
 amazed-0.064
 gained-0.063
Token every
Token ID790
Feature activation-2.23e-02

Pos logit contributions
 degrees+0.068
 ambul+0.068
oral+0.068
 911+0.067
 sy+0.063
Neg logit contributions
 around-0.025
 below-0.023
 amazed-0.022
 so-0.022
 surprised-0.021
Token time
Token ID640
Feature activation+6.18e-03

Pos logit contributions
 pancakes+0.033
 breakfast+0.033
oral+0.032
 movie+0.031
 soup+0.031
Neg logit contributions
 aback-0.031
 crossed-0.029
 gol-0.029
 amazed-0.028
arted-0.028
Token they
Token ID484
Feature activation-1.96e-02

Pos logit contributions
 below+0.006
 around+0.006
 surprised+0.006
 crossed+0.006
 nervous+0.006
Neg logit contributions
oral-0.012
 dining-0.011
 ambul-0.011
 911-0.011
 pancakes-0.010
TokenL
Token ID43
Feature activation+1.90e-02

Pos logit contributions
 ambul+0.008
 degrees+0.008
 pancakes+0.008
 911+0.008
 chores+0.008
Neg logit contributions
 worn-0.007
 gol-0.007
 reached-0.006
 surprised-0.006
 nervous-0.006
Tokenily
Token ID813
Feature activation+2.95e-03

Pos logit contributions
 aback+0.024
 gol+0.023
 crossed+0.022
 guessed+0.022
arted+0.022
Neg logit contributions
 ambul-0.030
 pancakes-0.030
 degrees-0.030
 dining-0.029
 chores-0.029
Token enjoyed
Token ID8359
Feature activation-2.58e-02

Pos logit contributions
 aback+0.003
 gol+0.002
 crossed+0.002
 nervous+0.002
 worn+0.002
Neg logit contributions
 pancakes-0.006
oons-0.006
 degrees-0.006
oral-0.006
 dining-0.005
Token her
Token ID607
Feature activation-1.12e-03

Pos logit contributions
 pancakes+0.035
oons+0.034
 degrees+0.034
oral+0.034
 dining+0.033
Neg logit contributions
 aback-0.038
 gol-0.037
 crossed-0.037
 nervous-0.036
 worn-0.036
Token l
Token ID300
Feature activation+1.66e-03

Pos logit contributions
 ambul+0.002
 degrees+0.002
 911+0.002
oons+0.002
oral+0.002
Neg logit contributions
 aback-0.002
 nervous-0.002
 gol-0.002
 worn-0.002
 surprised-0.001
Tokenoll
Token ID692
Feature activation-7.15e-02

Pos logit contributions
 aback+0.003
 gol+0.003
 crossed+0.003
 nervous+0.003
 worn+0.003
Neg logit contributions
oons-0.002
 pancakes-0.002
 degrees-0.001
oral-0.001
 breakfast-0.001
Tokenipop
Token ID42800
Feature activation-2.11e-02

Pos logit contributions
 ambul+0.108
 dining+0.107
oral+0.106
 pancakes+0.105
 911+0.104
Neg logit contributions
 aback-0.092
 worn-0.087
 crossed-0.086
 surprised-0.086
 gol-0.086
Token and
Token ID290
Feature activation+8.01e-03

Pos logit contributions
oral+0.040
oons+0.040
 degrees+0.039
 pancakes+0.039
 ambul+0.039
Neg logit contributions
 aback-0.019
 crossed-0.018
 nervous-0.018
 gol-0.018
 worn-0.018
Token forgot
Token ID16453
Feature activation-2.99e-03

Pos logit contributions
 aback+0.009
 gol+0.008
arted+0.008
suit+0.008
 below+0.008
Neg logit contributions
 pancakes-0.015
 breakfast-0.014
 soup-0.014
oons-0.014
 dining-0.014
Token all
Token ID477
Feature activation+1.05e-02

Pos logit contributions
oral+0.007
 degrees+0.007
 ambul+0.007
 sy+0.007
 Year+0.007
Neg logit contributions
 so-0.002
 surprised-0.002
 around-0.002
 amazed-0.002
 nervous-0.001
Token about
Token ID546
Feature activation+1.91e-03

Pos logit contributions
 aback+0.009
 gol+0.009
 crossed+0.009
 nervous+0.008
 worn+0.008
Neg logit contributions
 pancakes-0.020
 degrees-0.020
oons-0.020
oral-0.020
 ambul-0.020
Token on
Token ID319
Feature activation-3.07e-03

Pos logit contributions
 aback+0.011
 nervous+0.011
 gol+0.011
 crossed+0.011
 surprised+0.011
Neg logit contributions
oons-0.019
 pancakes-0.019
oral-0.019
 degrees-0.019
 chores-0.018
Token his
Token ID465
Feature activation-1.10e-02

Pos logit contributions
 pancakes+0.005
oons+0.005
 degrees+0.005
oral+0.005
 dining+0.005
Neg logit contributions
 aback-0.004
 gol-0.004
 crossed-0.004
 nervous-0.003
 worn-0.003
Token face
Token ID1986
Feature activation+1.26e-02

Pos logit contributions
oons+0.021
 pancakes+0.020
 degrees+0.020
oral+0.020
 ambul+0.020
Neg logit contributions
 aback-0.010
 nervous-0.010
 gol-0.010
 crossed-0.010
 surprised-0.009
Tokenâ
Token ID22940
Feature activation-2.40e-02

Pos logit contributions
 aback+0.013
 nervous+0.013
 surprised+0.013
 crossed+0.013
 below+0.013
Neg logit contributions
oons-0.023
 pancakes-0.022
oral-0.022
 degrees-0.022
 dining-0.022
Token
Token ID26391
Feature activation-6.23e-04

Pos logit contributions
oral+0.038
 occasions+0.037
 degrees+0.037
oons+0.037
 pancakes+0.037
Neg logit contributions
 amazed-0.027
 surprised-0.027
 nervous-0.027
 gol-0.026
 overwhelmed-0.026
Token
Token ID447
Feature activation-7.06e-02

Pos logit contributions
oons+0.001
 degrees+0.001
 beds+0.001
 dining+0.001
 pancakes+0.001
Neg logit contributions
 aback-0.001
arted-0.001
 gol-0.001
 nervous-0.001
 hesitant-0.001
Token
Token ID251
Feature activation-8.42e-03

Pos logit contributions
 pancakes+0.107
oons+0.105
oral+0.104
 degrees+0.103
 dining+0.101
Neg logit contributions
 aback-0.092
 gol-0.088
 crossed-0.087
 nervous-0.086
 worn-0.085
Tokenhe
Token ID258
Feature activation-4.22e-02

Pos logit contributions
 pancakes+0.017
oons+0.016
oral+0.016
 degrees+0.016
 ambul+0.016
Neg logit contributions
 aback-0.007
 gol-0.007
 seen-0.006
 crossed-0.006
arted-0.006
Token went
Token ID1816
Feature activation+3.04e-03

Pos logit contributions
 pancakes+0.078
oons+0.077
 degrees+0.076
oral+0.076
 dining+0.075
Neg logit contributions
 aback-0.041
 gol-0.039
 crossed-0.039
 nervous-0.038
 worn-0.037
Token to
Token ID284
Feature activation-1.34e-02

Pos logit contributions
 aback+0.003
 gol+0.003
 crossed+0.003
 guessed+0.003
 worn+0.003
Neg logit contributions
 pancakes-0.005
 911-0.005
 chores-0.005
oral-0.005
oons-0.005
Token play
Token ID711
Feature activation-6.79e-03

Pos logit contributions
 pancakes+0.022
 chores+0.021
 911+0.021
 breakfast+0.021
 ambul+0.021
Neg logit contributions
 aback-0.016
 gol-0.015
 crossed-0.015
 hesitant-0.015
 amazed-0.015
Token
Token ID26391
Feature activation+6.18e-03

Pos logit contributions
 pancakes+0.042
oons+0.041
 degrees+0.040
 ambul+0.039
 beds+0.039
Neg logit contributions
 aback-0.035
 gol-0.033
 crossed-0.033
 worn-0.031
 guessed-0.031
Token
Token ID129
Feature activation-5.26e-03

Pos logit contributions
 aback+0.009
arted+0.008
 gol+0.008
 below+0.008
 perplex+0.008
Neg logit contributions
 pancakes-0.009
oons-0.009
 degrees-0.009
 dining-0.009
 movie-0.009
Token
Token ID254
Feature activation-1.33e-02

Pos logit contributions
 pancakes+0.008
oons+0.008
 degrees+0.008
oral+0.008
 chores+0.008
Neg logit contributions
 aback-0.007
 gol-0.006
 crossed-0.006
 worn-0.006
 nervous-0.006
Tokenâ
Token ID22940
Feature activation-2.66e-02

Pos logit contributions
oral+0.024
oons+0.023
 degrees+0.022
 pancakes+0.022
 ambul+0.022
Neg logit contributions
 aback-0.014
suit-0.013
 gol-0.013
 nervous-0.013
 around-0.013
Token
Token ID26391
Feature activation+4.12e-03

Pos logit contributions
 pancakes+0.045
oons+0.045
 degrees+0.044
oral+0.044
 dining+0.043
Neg logit contributions
 aback-0.030
 gol-0.028
 crossed-0.028
 nervous-0.027
 worn-0.027
Token
Token ID447
Feature activation-6.97e-02

Pos logit contributions
 aback+0.006
arted+0.006
 gol+0.005
 below+0.005
 perplex+0.005
Neg logit contributions
 pancakes-0.006
 degrees-0.006
oons-0.006
 movie-0.006
 breakfast-0.006
Token
Token ID251
Feature activation-9.53e-03

Pos logit contributions
 pancakes+0.113
oons+0.111
 degrees+0.110
oral+0.109
 ambul+0.107
Neg logit contributions
 aback-0.084
 gol-0.080
 crossed-0.079
 nervous-0.078
 worn-0.078
Tokenâ
Token ID22940
Feature activation-1.96e-02

Pos logit contributions
 pancakes+0.018
oons+0.018
 degrees+0.018
oral+0.018
 dining+0.017
Neg logit contributions
 aback-0.009
 gol-0.008
 crossed-0.008
 nervous-0.008
 worn-0.008
Token
Token ID26391
Feature activation+2.37e-03

Pos logit contributions
oral+0.028
oons+0.028
 occasions+0.028
 degrees+0.027
 dining+0.026
Neg logit contributions
 surprised-0.025
 nervous-0.025
 amazed-0.025
 worn-0.024
arted-0.024
Token
Token ID129
Feature activation-5.22e-03

Pos logit contributions
arted+0.004
 gol+0.003
amed+0.003
 aback+0.003
aled+0.003
Neg logit contributions
 movie-0.003
 degrees-0.003
 pancakes-0.003
 beds-0.003
 dining-0.003
Token
Token ID254
Feature activation+5.92e-04

Pos logit contributions
 degrees+0.007
 chores+0.007
 pancakes+0.007
 ambul+0.007
oons+0.007
Neg logit contributions
 aback-0.007
 worn-0.007
 seen-0.007
 gol-0.007
 crossed-0.007
Token the
Token ID262
Feature activation+3.38e-03

Pos logit contributions
 aback+0.009
 gol+0.009
 crossed+0.009
 nervous+0.008
 guessed+0.008
Neg logit contributions
 pancakes-0.016
oons-0.016
 chores-0.015
oral-0.015
 dining-0.015
Token ingredients
Token ID9391
Feature activation-1.14e-02

Pos logit contributions
 aback+0.005
 gol+0.005
 crossed+0.005
 nervous+0.005
 worn+0.005
Neg logit contributions
 pancakes-0.005
oons-0.005
 degrees-0.005
oral-0.004
 dining-0.004
Token to
Token ID284
Feature activation+4.53e-03

Pos logit contributions
oral+0.021
 degrees+0.020
 ambul+0.020
 pancakes+0.020
 911+0.020
Neg logit contributions
 aback-0.012
 below-0.011
 surprised-0.011
 amazed-0.011
 around-0.011
Token make
Token ID787
Feature activation-3.30e-02

Pos logit contributions
 aback+0.008
 gol+0.007
 crossed+0.007
 nervous+0.007
 worn+0.007
Neg logit contributions
 pancakes-0.005
oons-0.005
 degrees-0.005
oral-0.005
 dining-0.005
Token her
Token ID607
Feature activation-5.08e-03

Pos logit contributions
oral+0.048
 degrees+0.045
 dining+0.045
 ambul+0.045
oons+0.044
Neg logit contributions
 aback-0.044
 surprised-0.043
 nervous-0.043
 amazed-0.043
 below-0.042
Token delic
Token ID8675
Feature activation-6.96e-02

Pos logit contributions
 pancakes+0.007
oons+0.007
 degrees+0.007
oral+0.007
 chores+0.007
Neg logit contributions
 aback-0.007
 gol-0.007
 crossed-0.007
 nervous-0.006
 worn-0.006
Tokeniously
Token ID6819
Feature activation-4.65e-02

Pos logit contributions
 pancakes+0.099
 dining+0.097
oons+0.096
oral+0.095
 chores+0.094
Neg logit contributions
 aback-0.093
arted-0.092
 gol-0.092
 nervous-0.090
 below-0.090
Token sweet
Token ID6029
Feature activation-1.43e-03

Pos logit contributions
oral+0.078
oons+0.076
 911+0.075
 ambul+0.075
 degrees+0.073
Neg logit contributions
 nervous-0.056
 worn-0.054
 surprised-0.054
 en-0.054
 amazed-0.054
Token past
Token ID1613
Feature activation-5.82e-03

Pos logit contributions
oral+0.003
 ambul+0.003
 911+0.003
 degrees+0.003
 dining+0.003
Neg logit contributions
 so-0.001
 nervous-0.001
 surprised-0.001
 aback-0.001
 around-0.001
Tokenries
Token ID1678
Feature activation-2.20e-03

Pos logit contributions
 pancakes+0.005
oral+0.005
 degrees+0.005
 dining+0.005
oons+0.005
Neg logit contributions
 aback-0.011
 crossed-0.011
 below-0.011
 nervous-0.011
 worn-0.011
Token.
Token ID13
Feature activation+6.46e-03

Pos logit contributions
oral+0.004
 degrees+0.004
oons+0.004
 ambul+0.004
 pancakes+0.004
Neg logit contributions
 below-0.002
 nervous-0.002
 surprised-0.002
 crossed-0.002
 aback-0.002
Token bear
Token ID6842
Feature activation+1.98e-03

Pos logit contributions
oons+0.014
 degrees+0.014
 pancakes+0.013
 occasions+0.013
oup+0.013
Neg logit contributions
 nervous-0.008
 scared-0.008
 surprised-0.007
 confused-0.007
 frightened-0.007
Token sniff
Token ID26300
Feature activation+2.15e-02

Pos logit contributions
 aback+0.002
 crossed+0.002
 reached+0.002
 gol+0.002
 surprised+0.002
Neg logit contributions
 pancakes-0.004
oons-0.004
 degrees-0.004
oral-0.004
 chores-0.004
Tokened
Token ID276
Feature activation+2.17e-02

Pos logit contributions
 aback+0.010
 crossed+0.009
 reached+0.008
 nervous+0.008
 surprised+0.008
Neg logit contributions
 pancakes-0.051
oral-0.051
 dining-0.051
oons-0.050
 degrees-0.050
Token around
Token ID1088
Feature activation-2.46e-02

Pos logit contributions
 aback+0.020
 nervous+0.018
 below+0.018
 around+0.018
 crossed+0.018
Neg logit contributions
oral-0.044
oons-0.044
 pancakes-0.043
 degrees-0.042
 dining-0.042
Token their
Token ID511
Feature activation-3.05e-02

Pos logit contributions
oons+0.056
oral+0.054
 degrees+0.053
 pancakes+0.053
 chores+0.052
Neg logit contributions
 aback-0.019
 around-0.018
 so-0.017
 below-0.017
 nervous-0.017
Token camps
Token ID11665
Feature activation-6.95e-02

Pos logit contributions
oral+0.051
oons+0.050
 911+0.045
 sy+0.045
 degrees+0.045
Neg logit contributions
 around-0.039
 scared-0.037
 nervous-0.037
 surprised-0.036
 below-0.036
Tokenite
Token ID578
Feature activation-2.65e-02

Pos logit contributions
 pancakes+0.129
oons+0.128
oral+0.126
 degrees+0.125
 dining+0.124
Neg logit contributions
 aback-0.068
 guessed-0.063
 amazed-0.062
 nervous-0.061
 perplex-0.061
Token,
Token ID11
Feature activation-1.35e-02

Pos logit contributions
oral+0.053
oons+0.051
 pancakes+0.050
 degrees+0.050
 911+0.049
Neg logit contributions
 around-0.025
 crossed-0.024
 below-0.024
 so-0.024
 nervous-0.023
Token but
Token ID475
Feature activation+1.08e-02

Pos logit contributions
 pancakes+0.023
 dining+0.021
 degrees+0.021
 ambul+0.021
oons+0.021
Neg logit contributions
 aback-0.016
 gol-0.015
arted-0.014
 crossed-0.014
 worn-0.014
Token then
Token ID788
Feature activation+1.42e-02

Pos logit contributions
 aback+0.014
 gol+0.013
 crossed+0.012
 worn+0.012
arted+0.012
Neg logit contributions
 pancakes-0.017
oons-0.016
 degrees-0.016
oral-0.016
 ambul-0.016
Token it
Token ID340
Feature activation+1.19e-03

Pos logit contributions
 crossed+0.014
 surprised+0.013
 reached+0.013
 nervous+0.013
 scared+0.013
Neg logit contributions
oons-0.028
oral-0.027
 degrees-0.027
 occasions-0.026
 pancakes-0.026
Token saw
Token ID2497
Feature activation-1.38e-02

Pos logit contributions
 aback+0.010
 gol+0.010
arted+0.009
 below+0.009
 Philip+0.008
Neg logit contributions
 pancakes-0.022
oons-0.021
 dining-0.021
 degrees-0.021
 chores-0.021
Token a
Token ID257
Feature activation+4.86e-03

Pos logit contributions
 pancakes+0.025
oons+0.024
 chores+0.023
 ambul+0.023
 breakfast+0.023
Neg logit contributions
 aback-0.015
 gol-0.014
 crossed-0.014
arted-0.014
 nervous-0.013
Token big
Token ID1263
Feature activation-1.17e-03

Pos logit contributions
 gol+0.005
 nervous+0.005
 worn+0.005
 surprised+0.005
 aback+0.005
Neg logit contributions
oons-0.008
 degrees-0.008
 pancakes-0.008
oral-0.008
 911-0.008
Token red
Token ID2266
Feature activation-2.13e-03

Pos logit contributions
 degrees+0.002
oons+0.002
oral+0.002
 911+0.002
 pancakes+0.002
Neg logit contributions
 gol-0.001
 nervous-0.001
 surprised-0.001
 worn-0.001
 aback-0.001
Token l
Token ID300
Feature activation+3.96e-03

Pos logit contributions
 degrees+0.004
oons+0.004
 ambul+0.004
 911+0.004
 pancakes+0.003
Neg logit contributions
 gol-0.002
 worn-0.002
tie-0.002
suit-0.002
 caught-0.002
Tokenoll
Token ID692
Feature activation-6.93e-02

Pos logit contributions
 aback+0.008
arted+0.008
 gol+0.008
 hesitant+0.008
 worn+0.008
Neg logit contributions
oons-0.003
 pancakes-0.003
 breakfast-0.003
oral-0.003
 degrees-0.003
Tokenipop
Token ID42800
Feature activation-1.27e-02

Pos logit contributions
 ambul+0.111
 dining+0.109
oral+0.109
 pancakes+0.108
 degrees+0.106
Neg logit contributions
 aback-0.082
 worn-0.079
 gol-0.079
 reached-0.078
suit-0.078
Token and
Token ID290
Feature activation-3.22e-03

Pos logit contributions
oral+0.023
 pancakes+0.023
oons+0.022
 degrees+0.022
 ambul+0.022
Neg logit contributions
 aback-0.013
 crossed-0.012
 gol-0.012
 nervous-0.012
 surprised-0.012
Token asked
Token ID1965
Feature activation+1.35e-02

Pos logit contributions
 pancakes+0.007
 breakfast+0.006
oons+0.006
 degrees+0.006
 cereal+0.006
Neg logit contributions
 aback-0.003
 gol-0.002
 below-0.002
arted-0.002
suit-0.002
Token her
Token ID607
Feature activation+3.39e-03

Pos logit contributions
 aback+0.019
 surprised+0.018
 nervous+0.018
 guessed+0.018
 Sally+0.017
Neg logit contributions
 degrees-0.020
oons-0.020
 pancakes-0.019
oral-0.019
 dining-0.019
Token mom
Token ID1995
Feature activation+1.82e-03

Pos logit contributions
 surprised+0.004
 aback+0.004
 nervous+0.004
 gol+0.004
 worn+0.004
Neg logit contributions
 degrees-0.006
oons-0.005
 pancakes-0.005
oral-0.005
 ambul-0.005
Token enjoyed
Token ID8359
Feature activation-2.46e-02

Pos logit contributions
 pancakes+0.019
oons+0.018
oral+0.018
 ambul+0.018
 soup+0.018
Neg logit contributions
 aback-0.011
arted-0.010
 gol-0.010
 Paige-0.009
 perplex-0.009
Token his
Token ID465
Feature activation-9.41e-03

Pos logit contributions
 pancakes+0.038
 degrees+0.036
 chores+0.036
oons+0.036
 ambul+0.036
Neg logit contributions
 aback-0.033
 gol-0.031
 crossed-0.030
arted-0.029
 nervous-0.029
Token y
Token ID331
Feature activation-5.08e-02

Pos logit contributions
oons+0.011
 pancakes+0.011
 degrees+0.011
oral+0.011
 ambul+0.011
Neg logit contributions
 aback-0.015
 gol-0.015
 nervous-0.015
 crossed-0.015
 worn-0.015
Tokenummy
Token ID13513
Feature activation-2.38e-02

Pos logit contributions
 ambul+0.055
 911+0.055
 dining+0.054
oral+0.054
 degrees+0.054
Neg logit contributions
 aback-0.087
 gol-0.084
 worn-0.083
arted-0.083
 crossed-0.082
Token l
Token ID300
Feature activation+6.28e-03

Pos logit contributions
oral+0.032
 911+0.032
 ambul+0.031
 dining+0.030
 degrees+0.030
Neg logit contributions
 aback-0.036
 caught-0.036
 scared-0.036
 nervous-0.035
 touched-0.035
Tokenoll
Token ID692
Feature activation-6.90e-02

Pos logit contributions
 aback+0.012
 gol+0.012
 crossed+0.011
 nervous+0.011
 amazed+0.011
Neg logit contributions
oons-0.006
 pancakes-0.006
 degrees-0.006
oral-0.005
 chores-0.005
Tokenipop
Token ID42800
Feature activation-1.48e-02

Pos logit contributions
 dining+0.105
oral+0.104
 ambul+0.104
 pancakes+0.103
 degrees+0.101
Neg logit contributions
 aback-0.088
 worn-0.084
 gol-0.083
 crossed-0.083
 reached-0.082
Token and
Token ID290
Feature activation+4.47e-03

Pos logit contributions
 pancakes+0.027
oons+0.026
oral+0.026
 degrees+0.026
 ambul+0.026
Neg logit contributions
 aback-0.015
 gol-0.014
 crossed-0.014
 nervous-0.014
 worn-0.014
Token played
Token ID2826
Feature activation-9.93e-03

Pos logit contributions
 aback+0.005
arted+0.004
 gol+0.004
 below+0.004
 perplex+0.004
Neg logit contributions
 pancakes-0.008
 breakfast-0.008
 soup-0.008
 chores-0.008
 ambul-0.008
Token with
Token ID351
Feature activation+7.73e-03

Pos logit contributions
 pancakes+0.018
 breakfast+0.017
oons+0.017
 chores+0.017
 degrees+0.017
Neg logit contributions
 aback-0.011
 gol-0.010
 worn-0.010
arted-0.010
 crossed-0.010
Token his
Token ID465
Feature activation+2.33e-03

Pos logit contributions
 aback+0.011
 gol+0.010
arted+0.010
suit+0.010
 overwhelmed+0.010
Neg logit contributions
 pancakes-0.012
 breakfast-0.012
 soup-0.012
oons-0.011
 chores-0.011
Token at
Token ID379
Feature activation+1.39e-02

Pos logit contributions
oons+0.005
oral+0.005
 pancakes+0.005
 degrees+0.005
 ambul+0.005
Neg logit contributions
 aback-0.002
 nervous-0.002
 surprised-0.002
 below-0.002
 amazed-0.002
Token circles
Token ID13332
Feature activation-1.51e-02

Pos logit contributions
 aback+0.017
 gol+0.016
 crossed+0.016
suit+0.016
 worn+0.016
Neg logit contributions
 pancakes-0.022
 chores-0.022
 911-0.022
 ambul-0.022
oons-0.021
Token,
Token ID11
Feature activation-1.36e-02

Pos logit contributions
oral+0.031
 degrees+0.029
 pancakes+0.029
 dining+0.029
 ambul+0.029
Neg logit contributions
 around-0.015
 below-0.014
 so-0.014
 surprised-0.013
 aback-0.013
Token squares
Token ID24438
Feature activation+7.63e-04

Pos logit contributions
oons+0.023
 pancakes+0.023
oral+0.023
 degrees+0.023
 chores+0.022
Neg logit contributions
 aback-0.015
 crossed-0.015
 gol-0.015
 nervous-0.015
 amazed-0.014
Token,
Token ID11
Feature activation-2.25e-02

Pos logit contributions
 below+0.001
 around+0.001
 crossed+0.001
 amazed+0.001
 reached+0.001
Neg logit contributions
oral-0.001
 pancakes-0.001
 911-0.001
 degrees-0.001
 ambul-0.001
Token rect
Token ID13621
Feature activation-6.89e-02

Pos logit contributions
oral+0.041
oons+0.040
 degrees+0.040
 pancakes+0.040
 chores+0.039
Neg logit contributions
 amazed-0.023
 crossed-0.023
 reached-0.023
 below-0.023
 surprised-0.022
Tokenangles
Token ID27787
Feature activation-8.95e-03

Pos logit contributions
 pancakes+0.101
 degrees+0.098
oons+0.098
oral+0.097
 ambul+0.096
Neg logit contributions
 aback-0.093
 gol-0.089
 crossed-0.089
arted-0.087
 worn-0.086
Token and
Token ID290
Feature activation+1.73e-03

Pos logit contributions
oral+0.015
 pancakes+0.015
 dining+0.015
 degrees+0.015
oons+0.015
Neg logit contributions
 aback-0.010
 below-0.010
 crossed-0.010
 surprised-0.010
 amazed-0.010
Token triangles
Token ID44360
Feature activation+4.56e-03

Pos logit contributions
 crossed+0.002
 nervous+0.002
 amazed+0.002
 aback+0.002
 guessed+0.002
Neg logit contributions
oral-0.003
oons-0.003
 pancakes-0.003
 degrees-0.003
 911-0.003
Token.
Token ID13
Feature activation+5.56e-03

Pos logit contributions
 below+0.004
 around+0.004
 amazed+0.004
 crossed+0.004
 surprised+0.004
Neg logit contributions
oral-0.009
 911-0.009
 pancakes-0.008
 ambul-0.008
 degrees-0.008
Token
Token ID198
Feature activation-1.61e-03

Pos logit contributions
 aback+0.005
 worn+0.004
 gol+0.004
 seen+0.004
suit+0.004
Neg logit contributions
oral-0.011
 pancakes-0.011
 ambul-0.011
 911-0.011
oons-0.011
Token she
Token ID673
Feature activation-5.70e-03

Pos logit contributions
 nervous+0.024
 crossed+0.023
 aback+0.023
 en+0.023
 worn+0.023
Neg logit contributions
 degrees-0.044
oons-0.043
 pancakes-0.041
oral-0.041
 dining-0.041
Token always
Token ID1464
Feature activation+1.49e-02

Pos logit contributions
 ambul+0.009
oral+0.009
oons+0.009
 chores+0.009
 pancakes+0.009
Neg logit contributions
 aback-0.007
suit-0.006
arted-0.006
 below-0.006
 perplex-0.006
Token kept
Token ID4030
Feature activation-1.85e-02

Pos logit contributions
 aback+0.015
 below+0.015
arted+0.015
 perplex+0.014
 gol+0.014
Neg logit contributions
 ambul-0.025
 chores-0.025
 pancakes-0.024
 dining-0.024
 breakfast-0.024
Token her
Token ID607
Feature activation-1.12e-02

Pos logit contributions
 pancakes+0.023
 chores+0.023
 ambul+0.022
 911+0.022
 breakfast+0.021
Neg logit contributions
 aback-0.030
 gol-0.029
suit-0.029
 crossed-0.028
arted-0.028
Token books
Token ID3835
Feature activation-1.57e-02

Pos logit contributions
 pancakes+0.012
oons+0.012
 degrees+0.011
oral+0.011
 ambul+0.011
Neg logit contributions
 aback-0.020
 crossed-0.019
 nervous-0.019
 gol-0.019
 worn-0.019
Tokenhe
Token ID258
Feature activation-6.87e-02

Pos logit contributions
 pancakes+0.030
 degrees+0.030
oons+0.030
oral+0.030
 ambul+0.029
Neg logit contributions
 aback-0.014
 crossed-0.013
 nervous-0.013
 worn-0.013
 gol-0.013
Tokenlf
Token ID1652
Feature activation-4.08e-02

Pos logit contributions
oral+0.085
 degrees+0.080
 ambul+0.080
 marsh+0.079
oup+0.078
Neg logit contributions
 around-0.113
 so-0.107
 nervous-0.106
 amazed-0.103
 surprised-0.102
Token tidy
Token ID43044
Feature activation-1.75e-02

Pos logit contributions
oral+0.084
 degrees+0.083
 pancakes+0.082
oons+0.080
 ambul+0.080
Neg logit contributions
 around-0.034
 aback-0.033
 nervous-0.033
 so-0.032
 worn-0.031
Token and
Token ID290
Feature activation-2.28e-02

Pos logit contributions
 pancakes+0.030
oons+0.030
oral+0.030
 degrees+0.030
 dining+0.029
Neg logit contributions
 aback-0.019
 gol-0.018
 crossed-0.018
 nervous-0.018
 worn-0.017
Token never
Token ID1239
Feature activation+4.01e-03

Pos logit contributions
 pancakes+0.036
 chores+0.036
 degrees+0.036
 ambul+0.035
 dining+0.035
Neg logit contributions
 aback-0.028
 gol-0.027
arted-0.026
 Philip-0.025
 perplex-0.025
Token did
Token ID750
Feature activation+2.48e-02

Pos logit contributions
 aback+0.004
 Philip+0.004
 perplex+0.004
 gol+0.004
suit+0.004
Neg logit contributions
 ambul-0.007
 chores-0.007
 beds-0.007
 spills-0.006
 degrees-0.006
Token play
Token ID711
Feature activation-1.57e-02

Pos logit contributions
oral+0.004
 degrees+0.004
oons+0.003
 dining+0.003
 911+0.003
Neg logit contributions
 touched-0.001
 crossed-0.001
 reached-0.001
 asked-0.001
 raised-0.001
Token with
Token ID351
Feature activation+3.06e-03

Pos logit contributions
 degrees+0.036
oral+0.036
 dining+0.034
oons+0.034
 pancakes+0.034
Neg logit contributions
 around-0.011
 crossed-0.010
 so-0.010
 nervous-0.009
 scared-0.009
Token his
Token ID465
Feature activation-2.11e-03

Pos logit contributions
 around+0.003
 scared+0.003
 en+0.003
 nervous+0.003
 touched+0.003
Neg logit contributions
oral-0.007
 dining-0.006
 degrees-0.006
 Year-0.005
 occasions-0.005
Token owner
Token ID4870
Feature activation+2.53e-02

Pos logit contributions
oons+0.004
oral+0.004
 degrees+0.004
 pancakes+0.004
 dining+0.004
Neg logit contributions
 nervous-0.002
 worn-0.002
 scared-0.002
 gol-0.002
 surprised-0.002
Token's
Token ID338
Feature activation-8.71e-04

Pos logit contributions
 around+0.025
 so+0.024
 found+0.022
 crossed+0.022
 teased+0.022
Neg logit contributions
oral-0.051
 degrees-0.049
 pancakes-0.049
 dining-0.048
oons-0.047
Token sp
Token ID599
Feature activation-6.82e-02

Pos logit contributions
oons+0.002
oral+0.002
 degrees+0.002
abel+0.001
 pancakes+0.001
Neg logit contributions
 nervous-0.001
 scared-0.001
 gol-0.001
 worn-0.001
 en-0.001
Tokenoons
Token ID13022
Feature activation-4.29e-02

Pos logit contributions
 ambul+0.095
 pancakes+0.095
 911+0.094
 degrees+0.093
 dining+0.092
Neg logit contributions
 aback-0.091
 gol-0.091
 crossed-0.090
 worn-0.089
arted-0.087
Token and
Token ID290
Feature activation+4.16e-03

Pos logit contributions
 degrees+0.080
oral+0.080
 pancakes+0.079
oons+0.077
 ambul+0.076
Neg logit contributions
 aback-0.041
 around-0.039
 nervous-0.038
 so-0.038
 crossed-0.038
Token always
Token ID1464
Feature activation+2.84e-02

Pos logit contributions
 asked+0.006
 found+0.005
 reached+0.005
 wanted+0.005
 so+0.005
Neg logit contributions
oral-0.007
 degrees-0.007
oons-0.006
abel-0.006
Sweet-0.006
Token knocked
Token ID13642
Feature activation-3.21e-03

Pos logit contributions
 aback+0.023
 gol+0.022
 crossed+0.021
 nervous+0.021
 worn+0.020
Neg logit contributions
 pancakes-0.057
oons-0.056
 degrees-0.056
oral-0.055
 dining-0.055
Token them
Token ID606
Feature activation+7.66e-03

Pos logit contributions
oral+0.005
 degrees+0.005
 pancakes+0.005
oons+0.005
 dining+0.005
Neg logit contributions
 aback-0.004
 crossed-0.004
 worn-0.003
 gol-0.003
 below-0.003
Token playing
Token ID2712
Feature activation-1.27e-02

Pos logit contributions
oons+0.019
 degrees+0.019
oral+0.019
 pancakes+0.018
 dining+0.017
Neg logit contributions
 aback-0.009
 gol-0.008
 crossed-0.008
 nervous-0.008
 Bill-0.008
Token with
Token ID351
Feature activation+2.57e-03

Pos logit contributions
 degrees+0.025
oral+0.025
oons+0.025
 pancakes+0.025
 ambul+0.025
Neg logit contributions
 aback-0.011
 gol-0.010
 crossed-0.010
 below-0.010
 around-0.010
Token his
Token ID465
Feature activation-3.22e-03

Pos logit contributions
 aback+0.003
 gol+0.003
 crossed+0.003
 nervous+0.003
 worn+0.003
Neg logit contributions
 pancakes-0.004
oons-0.004
 degrees-0.004
oral-0.004
 chores-0.004
Token lovely
Token ID14081
Feature activation-1.19e-03

Pos logit contributions
oons+0.006
oral+0.005
 degrees+0.005
 911+0.005
 pancakes+0.005
Neg logit contributions
 gol-0.004
 nervous-0.004
 worn-0.004
 surprised-0.004
 scared-0.003
Token l
Token ID300
Feature activation+1.89e-03

Pos logit contributions
 degrees+0.002
oral+0.002
oons+0.002
 911+0.002
 ambul+0.002
Neg logit contributions
 gol-0.001
 worn-0.001
 nervous-0.001
 en-0.001
 embarrassed-0.001
Tokenoll
Token ID692
Feature activation-6.80e-02

Pos logit contributions
 aback+0.003
 gol+0.003
 crossed+0.003
 nervous+0.003
 worn+0.003
Neg logit contributions
 pancakes-0.002
oons-0.002
 degrees-0.002
oral-0.002
 dining-0.002
Tokenipop
Token ID42800
Feature activation-1.23e-02

Pos logit contributions
 ambul+0.114
 dining+0.110
oral+0.110
 911+0.109
 movie+0.107
Neg logit contributions
 worn-0.076
 aback-0.074
tie-0.074
 seen-0.074
 crossed-0.073
Token.
Token ID13
Feature activation-1.44e-03

Pos logit contributions
oral+0.024
 ambul+0.024
 pancakes+0.023
 degrees+0.023
 911+0.023
Neg logit contributions
 around-0.011
 crossed-0.011
 worn-0.011
 below-0.011
 so-0.010
Token<|endoftext|>
Token ID50256
Feature activation+6.29e-03

Pos logit contributions
 pancakes+0.003
 911+0.003
oral+0.003
 TV+0.003
 dining+0.003
Neg logit contributions
 seen-0.001
amed-0.001
 aback-0.001
 amazed-0.001
 crossed-0.001
TokenOnce
Token ID7454
Feature activation-1.28e-02

Pos logit contributions
 gol+0.008
arrass+0.008
arted+0.008
 overwhelmed+0.008
 perplex+0.008
Neg logit contributions
oons-0.010
oral-0.009
 breakfast-0.009
 cereal-0.009
 movie-0.009
Token upon
Token ID2402
Feature activation+1.46e-02

Pos logit contributions
oons+0.015
oral+0.014
 pancakes+0.014
 chores+0.014
 911+0.013
Neg logit contributions
 aback-0.022
 gol-0.021
 crossed-0.020
arted-0.020
 guessed-0.020
Token perfect
Token ID2818
Feature activation-2.12e-02

Pos logit contributions
oons+0.059
oral+0.057
 degrees+0.056
 ambul+0.055
 911+0.054
Neg logit contributions
 nervous-0.042
 gol-0.042
 worn-0.042
 seen-0.042
 aback-0.041
Token bag
Token ID6131
Feature activation+4.73e-03

Pos logit contributions
oons+0.036
oral+0.036
 ambul+0.035
 degrees+0.035
 911+0.035
Neg logit contributions
 aback-0.023
 worn-0.023
 gol-0.023
 seen-0.023
 nervous-0.022
Token
Token ID6184
Feature activation-2.56e-02

Pos logit contributions
 aback+0.005
 crossed+0.005
 worn+0.005
 seen+0.005
 below+0.005
Neg logit contributions
oral-0.008
 pancakes-0.008
 degrees-0.008
 ambul-0.008
 911-0.008
Token
Token ID95
Feature activation-4.37e-03

Pos logit contributions
 pancakes+0.040
oons+0.040
 ambul+0.040
oral+0.039
 soup+0.038
Neg logit contributions
 gol-0.033
 aback-0.032
 crossed-0.031
suit-0.030
 below-0.030
Token
Token ID26391
Feature activation-8.65e-04

Pos logit contributions
 degrees+0.006
oral+0.006
 pancakes+0.006
oons+0.006
 dining+0.006
Neg logit contributions
 aback-0.006
 gol-0.005
arted-0.005
 surprised-0.005
 worn-0.005
Token
Token ID447
Feature activation-6.79e-02

Pos logit contributions
 pancakes+0.001
 degrees+0.001
oons+0.001
 dining+0.001
 breakfast+0.001
Neg logit contributions
 aback-0.001
arted-0.001
 gol-0.001
 worn-0.001
 below-0.001
Token
Token ID250
Feature activation-6.58e-03

Pos logit contributions
oral+0.096
 pancakes+0.095
 dining+0.092
oons+0.091
 degrees+0.091
Neg logit contributions
 aback-0.093
 seen-0.092
arted-0.090
 guessed-0.087
 crossed-0.087
Token now
Token ID783
Feature activation-7.84e-03

Pos logit contributions
 pancakes+0.011
oral+0.010
 degrees+0.010
oons+0.010
 ambul+0.010
Neg logit contributions
 aback-0.008
 crossed-0.007
 surprised-0.007
 gol-0.007
 nervous-0.007
Token I
Token ID314
Feature activation-6.28e-03

Pos logit contributions
oons+0.014
oral+0.013
 pancakes+0.013
 degrees+0.013
 ambul+0.013
Neg logit contributions
 aback-0.009
 crossed-0.009
 gol-0.008
 nervous-0.008
 worn-0.008
Token'm
Token ID1101
Feature activation-1.09e-02

Pos logit contributions
 pancakes+0.010
oons+0.010
 degrees+0.010
oral+0.010
 chores+0.010
Neg logit contributions
 aback-0.007
 gol-0.007
 crossed-0.007
 nervous-0.007
arted-0.007
Token very
Token ID845
Feature activation-1.45e-02

Pos logit contributions
 pancakes+0.020
 degrees+0.020
oons+0.020
oral+0.019
 chores+0.019
Neg logit contributions
 aback-0.011
 gol-0.010
 crossed-0.010
 worn-0.009
arted-0.009
Token a
Token ID257
Feature activation+5.15e-03

Pos logit contributions
 pancakes+0.029
oons+0.028
 degrees+0.028
oral+0.028
 chores+0.027
Neg logit contributions
 aback-0.017
 gol-0.016
 crossed-0.016
arted-0.015
 nervous-0.015
Token big
Token ID1263
Feature activation-1.56e-04

Pos logit contributions
 gol+0.004
 nervous+0.004
 worn+0.004
 surprised+0.004
 scared+0.004
Neg logit contributions
 degrees-0.010
oons-0.010
 pancakes-0.010
oral-0.010
 911-0.010
Token,
Token ID11
Feature activation-2.45e-02

Pos logit contributions
 degrees+0.000
oral+0.000
oons+0.000
 911+0.000
 occasions+0.000
Neg logit contributions
 gol-0.000
 nervous-0.000
 surprised-0.000
 worn-0.000
 scared-0.000
Token sweet
Token ID6029
Feature activation-9.88e-03

Pos logit contributions
 pancakes+0.044
oons+0.044
 degrees+0.044
oral+0.044
 dining+0.042
Neg logit contributions
 aback-0.025
 gol-0.023
 crossed-0.023
 nervous-0.023
 worn-0.023
Token l
Token ID300
Feature activation+4.33e-03

Pos logit contributions
 degrees+0.016
oral+0.016
oons+0.015
 911+0.015
 ambul+0.015
Neg logit contributions
 gol-0.013
 nervous-0.012
 surprised-0.012
 worn-0.012
 embarrassed-0.012
Tokenoll
Token ID692
Feature activation-6.79e-02

Pos logit contributions
 aback+0.009
 gol+0.008
 worn+0.008
 Philip+0.008
 hesitant+0.008
Neg logit contributions
oons-0.004
 pancakes-0.004
oral-0.003
 degrees-0.003
 breakfast-0.003
Tokenipop
Token ID42800
Feature activation-7.99e-03

Pos logit contributions
 dining+0.112
 ambul+0.110
oral+0.110
 pancakes+0.107
 degrees+0.105
Neg logit contributions
 crossed-0.074
 reached-0.074
tie-0.074
 worn-0.074
 gol-0.074
Token and
Token ID290
Feature activation+4.83e-04

Pos logit contributions
oral+0.015
 pancakes+0.014
oons+0.014
 911+0.014
 degrees+0.014
Neg logit contributions
 crossed-0.008
 nervous-0.008
 surprised-0.007
 worn-0.007
 below-0.007
Token wanted
Token ID2227
Feature activation-5.70e-03

Pos logit contributions
 aback+0.000
 gol+0.000
arted+0.000
 below+0.000
 crossed+0.000
Neg logit contributions
 pancakes-0.001
oons-0.001
 degrees-0.001
 breakfast-0.001
 dining-0.001
Token it
Token ID340
Feature activation-4.52e-03

Pos logit contributions
oral+0.011
oons+0.011
 degrees+0.011
 pancakes+0.011
 dining+0.011
Neg logit contributions
 aback-0.005
 crossed-0.005
 worn-0.005
 seen-0.005
 nervous-0.005
Token.
Token ID13
Feature activation+1.23e-02

Pos logit contributions
 breakfast+0.007
 pancakes+0.007
 chores+0.007
 tea+0.006
 soup+0.006
Neg logit contributions
 gol-0.007
 aback-0.006
arted-0.006
 perplex-0.006
 overwhelmed-0.006
Token to
Token ID284
Feature activation+3.46e-03

Pos logit contributions
 degrees+0.031
oral+0.029
oons+0.029
 911+0.028
 ambul+0.028
Neg logit contributions
 worn-0.015
 around-0.014
 below-0.014
 gol-0.014
 crossed-0.014
Token bounce
Token ID20110
Feature activation-1.11e-02

Pos logit contributions
 gol+0.004
 crossed+0.004
 aback+0.004
 nervous+0.003
 reached+0.003
Neg logit contributions
 degrees-0.006
oral-0.006
oons-0.006
 ambul-0.006
 dining-0.006
Token on
Token ID319
Feature activation-7.38e-03

Pos logit contributions
 pancakes+0.020
 degrees+0.019
oons+0.019
oral+0.019
 dining+0.019
Neg logit contributions
 aback-0.012
 gol-0.011
 crossed-0.011
 nervous-0.011
 worn-0.011
Token the
Token ID262
Feature activation+2.32e-03

Pos logit contributions
oral+0.018
 degrees+0.018
oons+0.018
 911+0.018
 dining+0.017
Neg logit contributions
 so-0.004
 worn-0.003
 gol-0.003
 around-0.003
 below-0.003
Token big
Token ID1263
Feature activation+5.06e-03

Pos logit contributions
 gol+0.002
 so+0.002
 around+0.002
 below+0.002
 worn+0.002
Neg logit contributions
 degrees-0.004
oral-0.004
 911-0.004
oons-0.004
 ambul-0.004
Token tr
Token ID491
Feature activation-6.79e-02

Pos logit contributions
 gol+0.005
 around+0.005
suit+0.005
 so+0.005
 below+0.005
Neg logit contributions
 degrees-0.010
oral-0.009
oons-0.009
 911-0.009
Sweet-0.009
Tokenamp
Token ID696
Feature activation-4.95e-02

Pos logit contributions
 pancakes+0.099
 degrees+0.098
oons+0.095
oral+0.095
 ambul+0.094
Neg logit contributions
 aback-0.091
 gol-0.089
 crossed-0.088
 worn-0.086
arted-0.085
Tokenoline
Token ID14453
Feature activation-3.46e-02

Pos logit contributions
 degrees+0.092
 pancakes+0.091
 ambul+0.089
 dining+0.089
 chores+0.089
Neg logit contributions
 below-0.047
 crossed-0.047
 reached-0.046
arted-0.045
 gol-0.045
Token in
Token ID287
Feature activation-2.14e-03

Pos logit contributions
 degrees+0.069
 pancakes+0.067
oral+0.067
 911+0.066
 ambul+0.066
Neg logit contributions
 crossed-0.031
 around-0.031
 below-0.030
 nervous-0.030
 gol-0.029
Token their
Token ID511
Feature activation-2.44e-02

Pos logit contributions
 pancakes+0.004
oons+0.004
 degrees+0.004
oral+0.004
 chores+0.004
Neg logit contributions
 aback-0.002
 gol-0.002
 crossed-0.002
 nervous-0.002
 worn-0.002
Token backyard
Token ID24296
Feature activation-1.28e-02

Pos logit contributions
 pancakes+0.035
 chores+0.034
 dining+0.034
oons+0.033
 beds+0.033
Neg logit contributions
 aback-0.035
 gol-0.033
 crossed-0.032
arted-0.032
 guessed-0.032
Token bear
Token ID6842
Feature activation-1.45e-02

Pos logit contributions
 aback+0.010
 gol+0.009
 crossed+0.009
 nervous+0.009
 worn+0.009
Neg logit contributions
 pancakes-0.013
oons-0.013
 degrees-0.012
oral-0.012
 dining-0.012
Token who
Token ID508
Feature activation+3.87e-04

Pos logit contributions
 pancakes+0.030
oons+0.029
 degrees+0.029
oral+0.029
 dining+0.028
Neg logit contributions
 aback-0.011
 gol-0.011
 crossed-0.010
 nervous-0.010
 worn-0.010
Token had
Token ID550
Feature activation+1.22e-02

Pos logit contributions
 aback+0.000
 gol+0.000
 crossed+0.000
 worn+0.000
 nervous+0.000
Neg logit contributions
 pancakes-0.001
oons-0.001
 degrees-0.001
oral-0.001
 dining-0.001
Token the
Token ID262
Feature activation+6.45e-03

Pos logit contributions
 aback+0.014
 gol+0.013
arted+0.012
 below+0.012
 hesitant+0.012
Neg logit contributions
 pancakes-0.022
 breakfast-0.021
 soup-0.021
 chores-0.021
 dining-0.021
Token co
Token ID763
Feature activation+1.01e-02

Pos logit contributions
 aback+0.007
 gol+0.007
 nervous+0.007
 crossed+0.007
 worn+0.007
Neg logit contributions
oons-0.011
 pancakes-0.011
 degrees-0.011
oral-0.011
 dining-0.010
Tokenz
Token ID89
Feature activation-6.77e-02

Pos logit contributions
 aback+0.010
 gol+0.010
 crossed+0.009
 nervous+0.009
 worn+0.009
Neg logit contributions
 pancakes-0.019
oral-0.019
oons-0.018
 chores-0.018
 degrees-0.018
Tokeniest
Token ID6386
Feature activation-2.10e-02

Pos logit contributions
 pancakes+0.107
 degrees+0.106
 ambul+0.106
 dining+0.106
oons+0.105
Neg logit contributions
 aback-0.082
arted-0.080
 gol-0.080
 crossed-0.079
 below-0.076
Token bedroom
Token ID14043
Feature activation-2.76e-02

Pos logit contributions
 911+0.034
oons+0.034
 dining+0.033
oral+0.033
 chores+0.032
Neg logit contributions
 nervous-0.025
 gol-0.024
 worn-0.024
 scared-0.023
 crossed-0.023
Token.
Token ID13
Feature activation+3.31e-03

Pos logit contributions
oral+0.053
 pancakes+0.053
 degrees+0.052
 ambul+0.052
 911+0.051
Neg logit contributions
 below-0.025
 around-0.024
 crossed-0.024
suit-0.023
 reached-0.023
Token Every
Token ID3887
Feature activation-1.31e-02

Pos logit contributions
 aback+0.003
 gol+0.003
amed+0.003
 seen+0.003
arted+0.003
Neg logit contributions
 pancakes-0.006
oons-0.006
 beds-0.006
 dining-0.006
 cereal-0.006
Token night
Token ID1755
Feature activation+1.13e-02

Pos logit contributions
 degrees+0.025
oons+0.024
 pancakes+0.024
 911+0.024
 ambul+0.024
Neg logit contributions
 aback-0.012
 crossed-0.012
 nervous-0.012
 gol-0.012
 reached-0.012
Token in
Token ID287
Feature activation-6.17e-04

Pos logit contributions
 degrees+0.047
oral+0.047
oons+0.045
 pancakes+0.045
 ambul+0.044
Neg logit contributions
 crossed-0.018
 gol-0.018
 nervous-0.017
 aback-0.017
 around-0.017
Token his
Token ID465
Feature activation-1.70e-02

Pos logit contributions
oral+0.002
 degrees+0.002
oons+0.001
 pancakes+0.001
 911+0.001
Neg logit contributions
 so-0.000
 worn-0.000
 crossed-0.000
 nervous-0.000
 en-0.000
Token garden
Token ID11376
Feature activation-1.89e-02

Pos logit contributions
 degrees+0.030
oons+0.029
oral+0.029
 pancakes+0.028
 ambul+0.027
Neg logit contributions
 worn-0.019
 nervous-0.019
 crossed-0.019
 gol-0.018
 surprised-0.018
Token with
Token ID351
Feature activation+2.89e-03

Pos logit contributions
 degrees+0.042
oral+0.042
 911+0.040
 pancakes+0.040
 ambul+0.040
Neg logit contributions
 around-0.013
 below-0.013
 so-0.012
 nervous-0.012
 crossed-0.012
Token his
Token ID465
Feature activation-6.09e-03

Pos logit contributions
 worn+0.002
 touched+0.002
 nervous+0.002
 crossed+0.002
 reached+0.002
Neg logit contributions
 degrees-0.006
oral-0.006
 dining-0.006
oons-0.006
 ambul-0.006
Token sp
Token ID599
Feature activation-6.76e-02

Pos logit contributions
 degrees+0.012
oons+0.012
oral+0.011
 pancakes+0.011
 ambul+0.011
Neg logit contributions
 gol-0.005
 nervous-0.005
 worn-0.005
 surprised-0.005
 scared-0.005
Tokenade
Token ID671
Feature activation-2.71e-02

Pos logit contributions
 pancakes+0.091
 ambul+0.089
 degrees+0.089
 911+0.088
oral+0.088
Neg logit contributions
 aback-0.095
 gol-0.093
 crossed-0.093
 worn-0.092
 nervous-0.091
Token.
Token ID13
Feature activation+1.97e-03

Pos logit contributions
 degrees+0.050
oral+0.049
 pancakes+0.049
 ambul+0.048
 911+0.048
Neg logit contributions
 crossed-0.026
 nervous-0.025
 aback-0.025
 below-0.024
 around-0.024
Token He
Token ID679
Feature activation+1.09e-03

Pos logit contributions
 aback+0.002
amed+0.002
 gol+0.002
arted+0.002
 overwhelmed+0.002
Neg logit contributions
oons-0.003
 TV-0.003
 cereal-0.003
 breakfast-0.003
 chores-0.003
Token dug
Token ID18735
Feature activation-7.97e-03

Pos logit contributions
 crossed+0.001
 reached+0.001
 worn+0.001
 guessed+0.001
 met+0.001
Neg logit contributions
 degrees-0.002
 911-0.002
oral-0.002
 pancakes-0.002
 ambul-0.002
Token holes
Token ID10421
Feature activation+2.32e-03

Pos logit contributions
oons+0.013
 pancakes+0.013
 soup+0.013
 degrees+0.012
 beds+0.012
Neg logit contributions
 gol-0.010
 aback-0.009
arted-0.009
 worn-0.009
 perplex-0.009
Token.
Token ID13
Feature activation-2.43e-04

Pos logit contributions
 pancakes+0.036
 degrees+0.036
oral+0.036
oons+0.035
 ambul+0.035
Neg logit contributions
 surprised-0.015
 below-0.015
 nervous-0.014
 aback-0.014
 crossed-0.014
Token The
Token ID383
Feature activation-3.46e-03

Pos logit contributions
 911+0.000
 TV+0.000
 beds+0.000
oral+0.000
 cereal+0.000
Neg logit contributions
 gol-0.000
arted-0.000
boat-0.000
 aback-0.000
 seen-0.000
Token attic
Token ID46113
Feature activation-2.32e-03

Pos logit contributions
 degrees+0.006
oons+0.006
 pancakes+0.005
oral+0.005
 ambul+0.005
Neg logit contributions
 nervous-0.004
 scared-0.004
 surprised-0.004
 worn-0.004
 below-0.004
Token had
Token ID550
Feature activation+1.30e-02

Pos logit contributions
 pancakes+0.004
oons+0.004
 degrees+0.004
oral+0.004
 dining+0.004
Neg logit contributions
 aback-0.002
 gol-0.002
 crossed-0.002
 nervous-0.002
 worn-0.002
Token many
Token ID867
Feature activation-2.00e-02

Pos logit contributions
 aback+0.015
 crossed+0.014
 worn+0.014
 nervous+0.014
 gol+0.014
Neg logit contributions
 pancakes-0.022
oons-0.021
oral-0.021
 degrees-0.021
 dining-0.021
Token old
Token ID1468
Feature activation+3.39e-02

Pos logit contributions
oral+0.030
 pancakes+0.030
 degrees+0.029
 sy+0.028
oup+0.028
Neg logit contributions
 surprised-0.027
 nervous-0.026
 amazed-0.026
 crossed-0.026
 scared-0.026
Token things
Token ID1243
Feature activation+3.47e-03

Pos logit contributions
 seen+0.041
 lined+0.039
 scared+0.039
 nervous+0.039
 crossed+0.039
Neg logit contributions
oral-0.060
 pancakes-0.054
 degrees-0.053
oons-0.053
imi-0.052
Token and
Token ID290
Feature activation+5.18e-03

Pos logit contributions
 aback+0.004
 gol+0.003
 crossed+0.003
 nervous+0.003
 worn+0.003
Neg logit contributions
 pancakes-0.006
oons-0.006
oral-0.006
 degrees-0.006
 dining-0.006
Token dust
Token ID8977
Feature activation+9.84e-03

Pos logit contributions
 crossed+0.007
 aback+0.007
 worn+0.007
 nervous+0.007
 surprised+0.007
Neg logit contributions
oral-0.008
 pancakes-0.007
oons-0.007
 degrees-0.007
 dining-0.007
Token and
Token ID290
Feature activation+7.63e-03

Pos logit contributions
 aback+0.010
 below+0.009
 crossed+0.009
 around+0.009
 surprised+0.009
Neg logit contributions
oral-0.018
 pancakes-0.018
 dining-0.018
 degrees-0.017
oons-0.017
Token cob
Token ID22843
Feature activation-1.80e-02

Pos logit contributions
 scared+0.010
 worn+0.010
 crossed+0.009
 so+0.009
 nervous+0.009
Neg logit contributions
oral-0.013
oons-0.012
 degrees-0.012
 pancakes-0.011
imi-0.011
Token proud
Token ID6613
Feature activation-1.40e-02

Pos logit contributions
 pancakes+0.027
 breakfast+0.026
 soup+0.026
 chores+0.026
 dining+0.026
Neg logit contributions
 aback-0.018
 gol-0.018
arted-0.017
suit-0.016
 Philip-0.016
Token.
Token ID13
Feature activation-6.85e-03

Pos logit contributions
 degrees+0.027
oons+0.027
 dining+0.027
 pancakes+0.026
 ambul+0.026
Neg logit contributions
 surprised-0.013
 nervous-0.012
 scared-0.012
 below-0.012
 crossed-0.012
Token They
Token ID1119
Feature activation-1.85e-03

Pos logit contributions
oons+0.011
 pancakes+0.011
oral+0.010
 beds+0.010
 TV+0.010
Neg logit contributions
 aback-0.009
 seen-0.008
arted-0.008
 worn-0.008
 gol-0.008
Token show
Token ID905
Feature activation-1.07e-02

Pos logit contributions
 pancakes+0.003
 chores+0.002
oral+0.002
oons+0.002
 breakfast+0.002
Neg logit contributions
 aback-0.003
arted-0.003
 gol-0.003
 hesitant-0.003
 worn-0.002
Token their
Token ID511
Feature activation+1.14e-02

Pos logit contributions
 degrees+0.015
oons+0.015
oral+0.015
 pancakes+0.015
 dining+0.014
Neg logit contributions
 aback-0.016
 crossed-0.015
 nervous-0.015
 surprised-0.015
 gol-0.015
Token sticker
Token ID27735
Feature activation+3.43e-02

Pos logit contributions
 worn+0.011
 scared+0.011
 gol+0.011
 touched+0.011
 nervous+0.011
Neg logit contributions
oons-0.021
 degrees-0.021
 dining-0.020
 911-0.020
ocol-0.019
Token and
Token ID290
Feature activation+1.45e-02

Pos logit contributions
 aback+0.036
 crossed+0.034
 below+0.034
 surprised+0.034
 gol+0.033
Neg logit contributions
 degrees-0.060
 pancakes-0.060
 dining-0.059
 ambul-0.059
oral-0.059
Token their
Token ID511
Feature activation+1.84e-03

Pos logit contributions
 aback+0.019
 crossed+0.019
 gol+0.019
 worn+0.018
 nervous+0.018
Neg logit contributions
oons-0.021
 degrees-0.021
oral-0.021
 pancakes-0.021
 dining-0.020
Token scarf
Token ID36953
Feature activation+1.91e-02

Pos logit contributions
 worn+0.002
 gol+0.002
 nervous+0.002
 scared+0.002
 touched+0.002
Neg logit contributions
oons-0.003
 degrees-0.003
oral-0.003
 dining-0.003
 911-0.003
Token to
Token ID284
Feature activation+6.83e-03

Pos logit contributions
 aback+0.021
 crossed+0.020
 nervous+0.020
 surprised+0.019
 gol+0.019
Neg logit contributions
 pancakes-0.033
 degrees-0.033
oral-0.033
oons-0.033
 ambul-0.032
Token each
Token ID1123
Feature activation-2.85e-03

Pos logit contributions
 aback+0.011
 gol+0.010
arted+0.010
 overwhelmed+0.010
 hesitant+0.010
Neg logit contributions
 pancakes-0.009
 breakfast-0.009
 chores-0.009
 soup-0.009
 911-0.009
Token and
Token ID290
Feature activation-2.98e-02

Pos logit contributions
 aback+0.004
 nervous+0.004
 crossed+0.004
 surprised+0.004
 gol+0.004
Neg logit contributions
oons-0.007
oral-0.007
 pancakes-0.007
 degrees-0.007
 dining-0.007
Token brown
Token ID7586
Feature activation+1.28e-02

Pos logit contributions
 pancakes+0.054
oons+0.053
 degrees+0.052
oral+0.052
 ambul+0.051
Neg logit contributions
 aback-0.031
 gol-0.029
 crossed-0.028
 nervous-0.028
 worn-0.027
Token,
Token ID11
Feature activation-2.25e-02

Pos logit contributions
 so+0.011
 below+0.011
 around+0.011
 nervous+0.011
 aback+0.011
Neg logit contributions
oral-0.027
oons-0.025
 chores-0.024
 pancakes-0.024
 dining-0.024
Token with
Token ID351
Feature activation-4.59e-03

Pos logit contributions
oral+0.046
oons+0.044
 degrees+0.043
 dining+0.043
 pancakes+0.042
Neg logit contributions
 nervous-0.020
 surprised-0.020
 crossed-0.020
 scared-0.019
 worn-0.019
Token long
Token ID890
Feature activation+5.93e-03

Pos logit contributions
oral+0.008
oons+0.008
 pancakes+0.008
 degrees+0.008
 dining+0.008
Neg logit contributions
 aback-0.005
 crossed-0.004
 nervous-0.004
 worn-0.004
 surprised-0.004
Token ears
Token ID11368
Feature activation+3.33e-02

Pos logit contributions
 crossed+0.005
 reached+0.005
 surprised+0.005
 nervous+0.005
 guessed+0.005
Neg logit contributions
oral-0.012
 dining-0.011
 911-0.011
 pancakes-0.011
oons-0.011
Token and
Token ID290
Feature activation-5.82e-04

Pos logit contributions
 surprised+0.033
 crossed+0.033
 aback+0.033
 below+0.033
 reached+0.033
Neg logit contributions
oral-0.062
 dining-0.060
 pancakes-0.059
 chores-0.058
oons-0.058
Token a
Token ID257
Feature activation+1.45e-02

Pos logit contributions
oral+0.001
oons+0.001
 pancakes+0.001
 degrees+0.001
 dining+0.001
Neg logit contributions
 crossed-0.001
 nervous-0.001
 aback-0.001
 surprised-0.001
 worn-0.001
Token fuzzy
Token ID34669
Feature activation+2.68e-02

Pos logit contributions
 nervous+0.015
 aback+0.015
 surprised+0.015
 crossed+0.014
 worn+0.014
Neg logit contributions
oons-0.026
 pancakes-0.025
oral-0.025
 degrees-0.025
 ambul-0.024
Token tail
Token ID7894
Feature activation+4.51e-02

Pos logit contributions
 aback+0.030
 nervous+0.029
 crossed+0.029
 worn+0.028
 gol+0.028
Neg logit contributions
oons-0.045
oral-0.045
 pancakes-0.045
 degrees-0.044
 ambul-0.043
Token.
Token ID13
Feature activation+3.48e-03

Pos logit contributions
 surprised+0.039
 below+0.039
 crossed+0.039
 reached+0.038
 around+0.038
Neg logit contributions
oral-0.094
 pancakes-0.090
oons-0.089
 ambul-0.088
 dining-0.088
Token strange
Token ID6283
Feature activation+1.88e-03

Pos logit contributions
 aback+0.012
 gol+0.012
arted+0.011
 crossed+0.011
apped+0.010
Neg logit contributions
 pancakes-0.015
 soup-0.015
 degrees-0.015
 chores-0.014
 breakfast-0.014
Token and
Token ID290
Feature activation-1.49e-02

Pos logit contributions
 nervous+0.002
 around+0.002
 so+0.002
 scared+0.002
 surprised+0.002
Neg logit contributions
oral-0.004
oons-0.004
 dining-0.004
 pancakes-0.004
 degrees-0.004
Token did
Token ID750
Feature activation+1.90e-02

Pos logit contributions
 pancakes+0.027
 degrees+0.027
 chores+0.027
oons+0.027
 ambul+0.026
Neg logit contributions
 aback-0.015
 gol-0.014
arted-0.013
 crossed-0.013
 Philip-0.012
Token not
Token ID407
Feature activation+8.09e-03

Pos logit contributions
 nervous+0.015
 crossed+0.015
 surprised+0.015
 aback+0.015
 worn+0.014
Neg logit contributions
oons-0.038
 pancakes-0.038
 dining-0.038
oral-0.038
 ambul-0.037
Token belong
Token ID5594
Feature activation+4.92e-02

Pos logit contributions
 aback+0.008
 gol+0.008
arted+0.008
amed+0.007
apped+0.007
Neg logit contributions
 pancakes-0.014
 chores-0.014
 soup-0.014
 breakfast-0.014
oons-0.014
Token here
Token ID994
Feature activation+3.72e-02

Pos logit contributions
 aback+0.056
 gol+0.055
 crossed+0.053
arted+0.051
 perplex+0.051
Neg logit contributions
 pancakes-0.082
oons-0.080
 degrees-0.079
 chores-0.078
 dining-0.077
Token.
Token ID13
Feature activation+5.15e-03

Pos logit contributions
 below+0.033
 around+0.032
 aback+0.032
 nervous+0.032
 so+0.031
Neg logit contributions
oral-0.073
oons-0.072
 pancakes-0.072
 ambul-0.071
 degrees-0.071
Token
Token ID198
Feature activation+1.05e-04

Pos logit contributions
 gol+0.008
 aback+0.008
amed+0.008
arted+0.008
 awarded+0.008
Neg logit contributions
 911-0.006
oral-0.006
 beds-0.006
 chores-0.006
 ambul-0.006
Token
Token ID198
Feature activation-1.32e-03

Pos logit contributions
 aback+0.000
 gol+0.000
suit+0.000
arted+0.000
 worn+0.000
Neg logit contributions
oral-0.000
oons-0.000
 chores-0.000
 pancakes-0.000
 911-0.000
TokenBen
Token ID11696
Feature activation+1.05e-02

Pos logit contributions
 pancakes+0.002
oons+0.002
 degrees+0.002
 soup+0.002
 breakfast+0.002
Neg logit contributions
 crossed-0.002
 gol-0.002
 aback-0.002
 reached-0.002
 worn-0.002
Token was
Token ID373
Feature activation-7.18e-03

Pos logit contributions
 aback+0.010
 gol+0.009
arted+0.008
 worn+0.008
 Philip+0.008
Neg logit contributions
 pancakes-0.020
oons-0.019
 degrees-0.019
oral-0.019
 beds-0.019
Token Volcano
Token ID48380
Feature activation-1.42e-02

Pos logit contributions
 aback+0.006
 gol+0.005
 worn+0.005
 crossed+0.005
arted+0.005
Neg logit contributions
 pancakes-0.008
 ambul-0.008
 degrees-0.008
oons-0.008
oral-0.008
Token worked
Token ID3111
Feature activation-9.78e-03

Pos logit contributions
 pancakes+0.029
 dining+0.028
oral+0.028
oons+0.028
 chores+0.027
Neg logit contributions
 crossed-0.013
 aback-0.012
 reached-0.012
 surprised-0.012
 amazed-0.012
Token together
Token ID1978
Feature activation+5.17e-03

Pos logit contributions
 pancakes+0.014
oons+0.014
 degrees+0.014
 chores+0.014
oral+0.014
Neg logit contributions
 aback-0.013
 gol-0.013
 crossed-0.012
 worn-0.012
 nervous-0.012
Token with
Token ID351
Feature activation+1.24e-02

Pos logit contributions
 aback+0.005
 crossed+0.005
 below+0.005
 seen+0.005
 reached+0.005
Neg logit contributions
 pancakes-0.010
oral-0.010
oons-0.009
 degrees-0.009
 dining-0.009
Token the
Token ID262
Feature activation+5.15e-03

Pos logit contributions
 gol+0.014
 aback+0.014
 crossed+0.013
 worn+0.013
 nervous+0.013
Neg logit contributions
 pancakes-0.021
 chores-0.020
 degrees-0.020
 ambul-0.020
oons-0.020
Token old
Token ID1468
Feature activation+3.31e-02

Pos logit contributions
 aback+0.007
 scared+0.007
 nervous+0.007
 surprised+0.007
 source+0.006
Neg logit contributions
oons-0.008
 pancakes-0.008
 degrees-0.007
oral-0.007
 occasions-0.007
Token man
Token ID582
Feature activation+1.87e-02

Pos logit contributions
 aback+0.047
 crossed+0.045
 surprised+0.045
 nervous+0.045
 below+0.045
Neg logit contributions
oons-0.048
 pancakes-0.048
 degrees-0.046
oral-0.046
 chores-0.044
Token and
Token ID290
Feature activation+6.18e-03

Pos logit contributions
 aback+0.024
 crossed+0.023
 seen+0.023
 below+0.022
 around+0.022
Neg logit contributions
 pancakes-0.031
oons-0.031
oral-0.030
 degrees-0.029
 dining-0.029
Token soon
Token ID2582
Feature activation-6.04e-03

Pos logit contributions
 aback+0.006
 gol+0.006
 crossed+0.006
 nervous+0.006
 worn+0.005
Neg logit contributions
 pancakes-0.011
oons-0.011
 degrees-0.011
 ambul-0.011
oral-0.011
Token the
Token ID262
Feature activation-3.55e-03

Pos logit contributions
 pancakes+0.011
oons+0.010
 degrees+0.010
oral+0.010
 chores+0.010
Neg logit contributions
 aback-0.006
 gol-0.006
 crossed-0.006
 nervous-0.006
 worn-0.006
Token fire
Token ID2046
Feature activation-6.43e-03

Pos logit contributions
oons+0.005
 pancakes+0.005
 degrees+0.005
oral+0.005
 occasions+0.005
Neg logit contributions
 aback-0.005
 nervous-0.005
 scared-0.005
 below-0.005
 surprised-0.005
Token a
Token ID257
Feature activation-2.56e-03

Pos logit contributions
 worn+0.000
 nervous+0.000
 scared+0.000
 aback+0.000
 crossed+0.000
Neg logit contributions
oons-0.001
oral-0.001
 pancakes-0.001
 degrees-0.001
 dining-0.001
Token terrible
Token ID7818
Feature activation-3.38e-03

Pos logit contributions
oons+0.005
 occasions+0.004
oral+0.004
 degrees+0.004
 pancakes+0.004
Neg logit contributions
 scared-0.003
 nervous-0.003
 worn-0.003
 embarrassed-0.003
 frightened-0.003
Token day
Token ID1110
Feature activation-8.03e-03

Pos logit contributions
oons+0.005
oral+0.005
 degrees+0.005
 pancakes+0.005
 dining+0.005
Neg logit contributions
 scared-0.005
 nervous-0.005
 embarrassed-0.004
 startled-0.004
 surprised-0.004
Token."
Token ID526
Feature activation+1.19e-02

Pos logit contributions
oral+0.016
oons+0.015
 pancakes+0.015
 chores+0.015
 degrees+0.014
Neg logit contributions
 seen-0.008
 so-0.007
 below-0.007
 crossed-0.007
 around-0.007
Token Jim
Token ID5395
Feature activation-1.00e-02

Pos logit contributions
 gol+0.011
 aback+0.011
 seen+0.010
 worn+0.010
tie+0.010
Neg logit contributions
 pancakes-0.022
 breakfast-0.022
 beds-0.021
 911-0.021
oons-0.021
Token wanted
Token ID2227
Feature activation-9.12e-04

Pos logit contributions
 pancakes+0.021
oons+0.020
 degrees+0.020
oral+0.020
 ambul+0.019
Neg logit contributions
 aback-0.008
 gol-0.008
 worn-0.007
arted-0.007
 below-0.007
Token to
Token ID284
Feature activation+1.58e-02

Pos logit contributions
oral+0.002
oons+0.002
 degrees+0.002
 dining+0.002
 pancakes+0.002
Neg logit contributions
 aback-0.001
 seen-0.001
 crossed-0.001
 reached-0.001
 worn-0.001
Token stay
Token ID2652
Feature activation+4.20e-03

Pos logit contributions
 crossed+0.016
 touched+0.016
 scared+0.016
 reached+0.016
 nervous+0.016
Neg logit contributions
oons-0.030
 degrees-0.030
oral-0.030
 dining-0.029
 pancakes-0.028
Token outside
Token ID2354
Feature activation-2.13e-02

Pos logit contributions
 aback+0.004
 nervous+0.004
 worn+0.004
 surprised+0.004
 crossed+0.004
Neg logit contributions
oons-0.008
oral-0.008
 pancakes-0.008
 degrees-0.008
 dining-0.008
Token and
Token ID290
Feature activation+5.52e-03

Pos logit contributions
oral+0.047
oons+0.046
 degrees+0.044
 pancakes+0.043
affles+0.042
Neg logit contributions
 so-0.021
 nervous-0.019
 around-0.019
 scared-0.018
 below-0.017
Token play
Token ID711
Feature activation-5.72e-03

Pos logit contributions
 crossed+0.006
 aback+0.006
 nervous+0.006
 worn+0.005
 scared+0.005
Neg logit contributions
oons-0.010
oral-0.010
 degrees-0.010
 pancakes-0.010
 ambul-0.009
Token
Token ID198
Feature activation-2.11e-03

Pos logit contributions
 pancakes+0.008
oons+0.008
oral+0.008
 degrees+0.008
 dining+0.008
Neg logit contributions
 aback-0.002
 gol-0.002
 crossed-0.002
 nervous-0.002
 worn-0.002
TokenThe
Token ID464
Feature activation-7.15e-03

Pos logit contributions
 pancakes+0.004
 degrees+0.004
oons+0.004
 chores+0.004
 soup+0.004
Neg logit contributions
 aback-0.002
 gol-0.002
 worn-0.002
 Paige-0.002
 guessed-0.002
Token girl
Token ID2576
Feature activation+6.53e-03

Pos logit contributions
oons+0.014
 degrees+0.014
 pancakes+0.014
 ambul+0.014
oral+0.013
Neg logit contributions
 gol-0.006
 nervous-0.006
 worn-0.006
 aback-0.006
 surprised-0.006
Token didn
Token ID1422
Feature activation+1.51e-02

Pos logit contributions
 aback+0.005
 gol+0.004
 crossed+0.004
 nervous+0.004
 worn+0.004
Neg logit contributions
 pancakes-0.014
oons-0.013
 degrees-0.013
oral-0.013
 dining-0.013
Token't
Token ID470
Feature activation-7.10e-04

Pos logit contributions
 aback+0.013
 nervous+0.013
 surprised+0.013
 startled+0.012
 reached+0.012
Neg logit contributions
 degrees-0.030
 dining-0.030
 beds-0.029
 pancakes-0.029
 occasions-0.029
Token understand
Token ID1833
Feature activation-2.21e-03

Pos logit contributions
 pancakes+0.001
oral+0.001
 chores+0.001
 911+0.001
 breakfast+0.001
Neg logit contributions
arted-0.001
 gol-0.001
 aback-0.001
 worn-0.001
 crossed-0.001
Token and
Token ID290
Feature activation+7.08e-03

Pos logit contributions
oral+0.004
oons+0.004
 dining+0.004
 pancakes+0.004
 degrees+0.004
Neg logit contributions
 aback-0.003
 so-0.002
 nervous-0.002
 surprised-0.002
 confused-0.002
Token asked
Token ID1965
Feature activation+1.88e-02

Pos logit contributions
 aback+0.005
 gol+0.005
suit+0.004
arted+0.004
 Philip+0.004
Neg logit contributions
 pancakes-0.015
 degrees-0.014
oons-0.014
 chores-0.014
 ambul-0.014
Token,
Token ID11
Feature activation+4.28e-03

Pos logit contributions
 aback+0.020
 nervous+0.019
 around+0.019
 so+0.019
 confused+0.018
Neg logit contributions
oons-0.036
oral-0.035
 pancakes-0.034
 dining-0.034
 degrees-0.033
Token "
Token ID366
Feature activation-1.24e-03

Pos logit contributions
 aback+0.004
 nervous+0.003
 surprised+0.003
 crossed+0.003
 hesitant+0.003
Neg logit contributions
oons-0.009
 pancakes-0.008
 degrees-0.008
oral-0.008
 chores-0.008
TokenWhat
Token ID2061
Feature activation-9.56e-03

Pos logit contributions
oons+0.002
 pancakes+0.002
 chores+0.002
 degrees+0.002
 dining+0.002
Neg logit contributions
 aback-0.002
 worn-0.001
 gol-0.001
 guessed-0.001
 reached-0.001
Token named
Token ID3706
Feature activation+8.41e-03

Pos logit contributions
 pancakes+0.025
oons+0.025
oral+0.025
 degrees+0.025
 ambul+0.024
Neg logit contributions
 aback-0.013
 gol-0.012
 crossed-0.011
arted-0.011
 nervous-0.011
Token Tim
Token ID5045
Feature activation-1.06e-02

Pos logit contributions
 aback+0.011
 gol+0.010
 perplex+0.010
amed+0.010
 overwhelmed+0.010
Neg logit contributions
 pancakes-0.014
 breakfast-0.013
 soup-0.013
 tea-0.013
 911-0.013
Token.
Token ID13
Feature activation-7.21e-03

Pos logit contributions
oons+0.014
 pancakes+0.014
 soup+0.013
oral+0.013
 ambul+0.013
Neg logit contributions
 aback-0.016
 gol-0.016
 overwhelmed-0.015
 perplex-0.015
arted-0.015
Token Tim
Token ID5045
Feature activation-6.68e-04

Pos logit contributions
 ambul+0.012
 cereal+0.012
oons+0.012
 pancakes+0.012
 soup+0.012
Neg logit contributions
 aback-0.008
 gol-0.008
 perplex-0.008
amed-0.008
 figured-0.007
Token liked
Token ID8288
Feature activation-1.26e-02

Pos logit contributions
oons+0.001
 soup+0.001
 pancakes+0.001
 ambul+0.001
 beds+0.001
Neg logit contributions
 aback-0.001
 gol-0.001
arted-0.001
 Paige-0.001
 perplex-0.001
Token to
Token ID284
Feature activation-3.67e-03

Pos logit contributions
oons+0.026
 degrees+0.026
oral+0.026
 dining+0.025
 pancakes+0.025
Neg logit contributions
 nervous-0.009
 crossed-0.009
 gol-0.009
 aback-0.009
 reached-0.009
Token help
Token ID1037
Feature activation-3.64e-02

Pos logit contributions
 degrees+0.006
oons+0.006
oral+0.006
 dining+0.006
 pancakes+0.006
Neg logit contributions
 crossed-0.004
 nervous-0.004
 reached-0.004
 guessed-0.004
 gol-0.004
Token his
Token ID465
Feature activation-1.13e-02

Pos logit contributions
 pancakes+0.050
 soup+0.050
 chores+0.049
oons+0.048
 breakfast+0.048
Neg logit contributions
 aback-0.058
 gol-0.055
suit-0.055
arted-0.053
 perplex-0.053
Token friends
Token ID2460
Feature activation+8.32e-03

Pos logit contributions
 pancakes+0.020
oons+0.020
 degrees+0.019
oral+0.019
 chores+0.019
Neg logit contributions
 aback-0.012
 gol-0.011
 crossed-0.011
 worn-0.011
arted-0.011
Token.
Token ID13
Feature activation+1.05e-03

Pos logit contributions
 aback+0.007
 below+0.007
 nervous+0.007
 around+0.007
 crossed+0.007
Neg logit contributions
oral-0.016
 degrees-0.016
 pancakes-0.016
oons-0.016
 dining-0.016
Token One
Token ID1881
Feature activation-4.96e-03

Pos logit contributions
 aback+0.001
amed+0.001
 gol+0.001
 seen+0.001
 worn+0.001
Neg logit contributions
oons-0.002
 TV-0.002
oral-0.002
 cereal-0.002
 chores-0.002
Token and
Token ID290
Feature activation-4.57e-03

Pos logit contributions
 aback+0.005
 gol+0.005
 crossed+0.005
 nervous+0.005
 worn+0.005
Neg logit contributions
 pancakes-0.011
oons-0.010
 degrees-0.010
oral-0.010
 dining-0.010
Token the
Token ID262
Feature activation+5.66e-03

Pos logit contributions
 pancakes+0.007
 ambul+0.007
 dining+0.007
 911+0.007
 beds+0.007
Neg logit contributions
 gol-0.005
 aback-0.005
 worn-0.005
suit-0.005
 perplex-0.005
Token kitten
Token ID42143
Feature activation-3.62e-03

Pos logit contributions
 aback+0.009
 gol+0.009
 nervous+0.009
 worn+0.009
 crossed+0.009
Neg logit contributions
oons-0.007
 pancakes-0.007
 degrees-0.007
oral-0.007
 dining-0.007
Token followed
Token ID3940
Feature activation-5.67e-03

Pos logit contributions
 pancakes+0.007
oons+0.007
 degrees+0.007
oral+0.007
 ambul+0.007
Neg logit contributions
 aback-0.003
 gol-0.003
 crossed-0.003
 worn-0.003
 nervous-0.003
Token her
Token ID607
Feature activation+5.02e-03

Pos logit contributions
oons+0.010
oral+0.010
 degrees+0.010
 pancakes+0.009
 dining+0.009
Neg logit contributions
 aback-0.007
 around-0.006
 below-0.006
 nervous-0.006
 so-0.006
Token.
Token ID13
Feature activation-1.99e-03

Pos logit contributions
 aback+0.006
 gol+0.006
 crossed+0.006
 worn+0.006
 nervous+0.006
Neg logit contributions
 pancakes-0.008
 degrees-0.008
oons-0.008
oral-0.008
 chores-0.008
Token Lucy
Token ID22162
Feature activation+5.29e-03

Pos logit contributions
 pancakes+0.004
oons+0.003
 degrees+0.003
oral+0.003
 dining+0.003
Neg logit contributions
 aback-0.002
 gol-0.002
 crossed-0.002
 worn-0.002
 nervous-0.002
Token was
Token ID373
Feature activation+6.58e-03

Pos logit contributions
 aback+0.005
 gol+0.005
 worn+0.005
arted+0.004
 Paige+0.004
Neg logit contributions
 pancakes-0.010
 degrees-0.010
oons-0.009
 ambul-0.009
 dining-0.009
Token so
Token ID523
Feature activation-4.17e-03

Pos logit contributions
 gol+0.007
suit+0.006
ully+0.006
 Philip+0.005
enge+0.005
Neg logit contributions
 pancakes-0.012
 breakfast-0.012
 ambul-0.012
 dining-0.012
 degrees-0.012
Token happy
Token ID3772
Feature activation+3.55e-03

Pos logit contributions
 degrees+0.005
 pancakes+0.005
 dining+0.005
 chores+0.005
 angels+0.005
Neg logit contributions
 gol-0.006
ully-0.005
suit-0.005
 Philip-0.005
tie-0.005
Token!
Token ID0
Feature activation-7.10e-03

Pos logit contributions
 aback+0.004
 crossed+0.003
 below+0.003
 gol+0.003
 nervous+0.003
Neg logit contributions
oons-0.006
 pancakes-0.006
oral-0.006
 degrees-0.006
 chores-0.006
Token to
Token ID284
Feature activation+1.44e-02

Pos logit contributions
oons+0.000
oral+0.000
 dining+0.000
 degrees+0.000
 pancakes+0.000
Neg logit contributions
 seen-0.000
 aback-0.000
 below-0.000
 so-0.000
 around-0.000
Token take
Token ID1011
Feature activation-5.94e-03

Pos logit contributions
 aback+0.016
 crossed+0.015
 gol+0.015
 reached+0.015
 worn+0.015
Neg logit contributions
oons-0.025
 pancakes-0.025
 degrees-0.024
oral-0.024
 dining-0.024
Token a
Token ID257
Feature activation+2.64e-02

Pos logit contributions
 pancakes+0.009
oons+0.009
 degrees+0.009
 chores+0.009
 breakfast+0.009
Neg logit contributions
 aback-0.007
 gol-0.007
 crossed-0.007
 nervous-0.007
 hesitant-0.007
Token closer
Token ID5699
Feature activation+7.54e-03

Pos logit contributions
 aback+0.035
 gol+0.034
arted+0.034
 crossed+0.033
 guessed+0.032
Neg logit contributions
 pancakes-0.039
oons-0.038
 degrees-0.037
 chores-0.037
oral-0.037
Token look
Token ID804
Feature activation-1.15e-02

Pos logit contributions
 aback+0.008
 gol+0.007
 crossed+0.007
arted+0.007
 worn+0.007
Neg logit contributions
 pancakes-0.014
oons-0.013
 degrees-0.013
oral-0.013
 breakfast-0.013
Token.
Token ID13
Feature activation+1.03e-02

Pos logit contributions
oral+0.021
 pancakes+0.020
 ambul+0.020
 dining+0.020
 chores+0.020
Neg logit contributions
 around-0.012
 aback-0.012
 below-0.012
 surprised-0.012
 nervous-0.012
Token Did
Token ID7731
Feature activation-2.73e-02

Pos logit contributions
 aback+0.012
 gol+0.012
arted+0.012
 worn+0.012
 crossed+0.012
Neg logit contributions
 911-0.016
 pancakes-0.016
 ambul-0.016
 degrees-0.016
oral-0.016
Token you
Token ID345
Feature activation+7.40e-03

Pos logit contributions
oons+0.039
oral+0.039
 dining+0.038
 degrees+0.037
 pancakes+0.037
Neg logit contributions
 crossed-0.038
 reached-0.037
 aback-0.037
 Bill-0.037
 guessed-0.037
Token manage
Token ID6687
Feature activation-8.06e-03

Pos logit contributions
 aback+0.008
 gol+0.008
arted+0.008
 crossed+0.007
 perplex+0.007
Neg logit contributions
 pancakes-0.013
oons-0.012
 degrees-0.012
 chores-0.012
oral-0.012
Token ok
Token ID12876
Feature activation+1.03e-02

Pos logit contributions
oons+0.015
oral+0.015
 degrees+0.015
 pancakes+0.015
 dining+0.015
Neg logit contributions
 aback-0.008
 worn-0.007
 nervous-0.007
 crossed-0.007
 seen-0.007
Token?"
Token ID1701
Feature activation+1.21e-02

Pos logit contributions
 aback+0.014
 crossed+0.013
 gol+0.013
 nervous+0.013
 worn+0.013
Neg logit contributions
 pancakes-0.015
oons-0.015
 degrees-0.015
oral-0.015
 dining-0.015
Token,
Token ID11
Feature activation-1.21e-02

Pos logit contributions
 surprised+0.001
 aback+0.001
 nervous+0.001
apped+0.001
 around+0.001
Neg logit contributions
 ambul-0.002
 dining-0.002
 degrees-0.002
 pancakes-0.002
oons-0.002
Token "
Token ID366
Feature activation-9.37e-03

Pos logit contributions
 pancakes+0.021
oons+0.020
 degrees+0.020
oral+0.020
 dining+0.020
Neg logit contributions
 aback-0.013
 gol-0.013
 crossed-0.012
 worn-0.012
 seen-0.012
TokenHello
Token ID15496
Feature activation+2.64e-02

Pos logit contributions
oral+0.017
 pancakes+0.017
 occasions+0.017
 dining+0.017
oons+0.017
Neg logit contributions
 aback-0.008
arted-0.008
 crossed-0.008
 seen-0.008
 overwhelmed-0.008
Token,
Token ID11
Feature activation-1.37e-02

Pos logit contributions
 surprised+0.024
 nervous+0.023
 so+0.022
 scared+0.021
 aback+0.021
Neg logit contributions
 ambul-0.053
oral-0.050
oons-0.050
 dining-0.050
 stret-0.049
Token Elsa
Token ID19226
Feature activation+1.90e-04

Pos logit contributions
oons+0.025
 degrees+0.025
 ambul+0.024
oral+0.024
 pancakes+0.023
Neg logit contributions
 surprised-0.015
 nervous-0.014
 so-0.014
 worn-0.013
 Paige-0.013
Token.
Token ID13
Feature activation-2.85e-02

Pos logit contributions
 aback+0.000
 nervous+0.000
 gol+0.000
 surprised+0.000
 crossed+0.000
Neg logit contributions
 pancakes-0.000
oons-0.000
oral-0.000
 degrees-0.000
 chores-0.000
Token I
Token ID314
Feature activation-8.81e-03

Pos logit contributions
 pancakes+0.055
oons+0.055
 degrees+0.054
oral+0.053
 chores+0.053
Neg logit contributions
 aback-0.028
 gol-0.025
 nervous-0.025
 crossed-0.024
 worn-0.024
Token am
Token ID716
Feature activation-6.97e-03

Pos logit contributions
oral+0.010
 pancakes+0.010
 tea+0.010
 chores+0.010
 breakfast+0.009
Neg logit contributions
arted-0.015
 aback-0.014
 gol-0.014
 perplex-0.014
 below-0.013
Token happy
Token ID3772
Feature activation-1.59e-02

Pos logit contributions
 pancakes+0.010
 tea+0.009
 breakfast+0.009
 soup+0.009
 dining+0.009
Neg logit contributions
 aback-0.010
 gol-0.010
 crossed-0.010
arted-0.010
suit-0.009
Token to
Token ID284
Feature activation+4.81e-03

Pos logit contributions
 degrees+0.036
oons+0.035
oral+0.035
 chores+0.034
 ambul+0.034
Neg logit contributions
 so-0.011
 around-0.011
 nervous-0.011
 scared-0.010
 surprised-0.010
Token meet
Token ID1826
Feature activation+1.23e-02

Pos logit contributions
 aback+0.006
 gol+0.005
 crossed+0.005
 worn+0.005
 nervous+0.005
Neg logit contributions
 pancakes-0.008
oons-0.008
oral-0.008
 degrees-0.008
 chores-0.008
Token\n
Token ID198
Feature activation-7.85e-03

Pos logit contributions
 worn+0.009
 seen+0.009
 overwhelmed+0.008
 gol+0.008
suit+0.008
Neg logit contributions
oral-0.014
 pancakes-0.013
 breakfast-0.013
oons-0.013
 soup-0.013
Token\n
Token ID198
Feature activation-5.84e-03

Pos logit contributions
oral+0.016
oons+0.015
 pancakes+0.015
 breakfast+0.015
 chores+0.014
Neg logit contributions
 aback-0.007
 worn-0.006
 gol-0.006
arted-0.006
 overwhelmed-0.006
TokenThe
Token ID464
Feature activation-9.36e-03

Pos logit contributions
 pancakes+0.011
 degrees+0.011
oons+0.010
 911+0.010
 chores+0.010
Neg logit contributions
 gol-0.006
 aback-0.006
 crossed-0.006
 reached-0.006
 guessed-0.006
Token fish
Token ID5916
Feature activation+6.02e-03

Pos logit contributions
oons+0.016
 degrees+0.015
 pancakes+0.015
 ambul+0.015
 occasions+0.015
Neg logit contributions
 nervous-0.011
 gol-0.011
 scared-0.011
 surprised-0.011
 confused-0.011
Token was
Token ID373
Feature activation+2.76e-03

Pos logit contributions
 aback+0.005
 gol+0.004
 crossed+0.004
 nervous+0.004
 worn+0.004
Neg logit contributions
 pancakes-0.012
oons-0.012
 degrees-0.012
oral-0.012
 chores-0.012
Token very
Token ID845
Feature activation-2.25e-02

Pos logit contributions
 gol+0.003
 aback+0.002
suit+0.002
arrass+0.002
 Paige+0.002
Neg logit contributions
 pancakes-0.005
 breakfast-0.005
 soup-0.005
 dining-0.005
 degrees-0.005
Token thankful
Token ID25535
Feature activation-3.72e-03

Pos logit contributions
 pancakes+0.038
oons+0.037
 dining+0.037
 degrees+0.037
oral+0.037
Neg logit contributions
 aback-0.026
 gol-0.024
arted-0.023
 crossed-0.023
 Paige-0.023
Token and
Token ID290
Feature activation+3.54e-03

Pos logit contributions
oons+0.006
 pancakes+0.006
 degrees+0.006
oral+0.006
 ambul+0.006
Neg logit contributions
 aback-0.004
 crossed-0.004
 gol-0.004
 nervous-0.004
 surprised-0.004
Token drank
Token ID24070
Feature activation+8.09e-03

Pos logit contributions
 aback+0.003
 gol+0.003
 source+0.003
suit+0.003
 seen+0.003
Neg logit contributions
 pancakes-0.007
 soup-0.007
 dining-0.006
 breakfast-0.006
oral-0.006
Token all
Token ID477
Feature activation+1.82e-02

Pos logit contributions
 aback+0.009
 gol+0.009
 crossed+0.009
 worn+0.008
 nervous+0.008
Neg logit contributions
 pancakes-0.014
oons-0.014
 degrees-0.013
oral-0.013
 chores-0.013
Token the
Token ID262
Feature activation+1.32e-02

Pos logit contributions
 gol+0.020
 aback+0.019
 nervous+0.019
 crossed+0.018
 guessed+0.018
Neg logit contributions
 pancakes-0.032
oons-0.031
 soup-0.030
oral-0.030
 breakfast-0.030
Token\n
Token ID198
Feature activation-4.34e-03

Pos logit contributions
oons+0.009
oral+0.009
 pancakes+0.009
 911+0.009
 chores+0.009
Neg logit contributions
 aback-0.004
 gol-0.004
 worn-0.004
 overwhelmed-0.004
arted-0.004
TokenThey
Token ID2990
Feature activation-8.31e-03

Pos logit contributions
 pancakes+0.008
 degrees+0.008
 soup+0.007
oons+0.007
 breakfast+0.007
Neg logit contributions
 gol-0.005
 crossed-0.005
 aback-0.005
 Paige-0.005
 reached-0.005
Token played
Token ID2826
Feature activation-8.00e-03

Pos logit contributions
 pancakes+0.017
oons+0.017
 degrees+0.017
oral+0.017
 ambul+0.017
Neg logit contributions
 aback-0.006
 crossed-0.006
 reached-0.006
 gol-0.006
 guessed-0.006
Token for
Token ID329
Feature activation-4.92e-04

Pos logit contributions
 degrees+0.016
 ambul+0.016
oral+0.016
oons+0.015
 pancakes+0.015
Neg logit contributions
 around-0.007
 surprised-0.007
 nervous-0.007
 amazed-0.007
 crossed-0.007
Token a
Token ID257
Feature activation+1.71e-02

Pos logit contributions
 pancakes+0.001
oons+0.001
oral+0.001
 degrees+0.001
 dining+0.001
Neg logit contributions
 aback-0.000
 gol-0.000
 crossed-0.000
 nervous-0.000
 worn-0.000
Token while
Token ID981
Feature activation-2.55e-02

Pos logit contributions
suit+0.013
 nervous+0.013
 gol+0.012
 scared+0.011
 so+0.011
Neg logit contributions
 911-0.036
 ambul-0.036
oral-0.036
 degrees-0.035
oons-0.035
Token,
Token ID11
Feature activation+5.01e-03

Pos logit contributions
oons+0.052
 pancakes+0.050
oral+0.049
 dining+0.049
 degrees+0.049
Neg logit contributions
 so-0.024
 nervous-0.024
 below-0.024
 scared-0.023
 crossed-0.022
Token having
Token ID1719
Feature activation-1.05e-02

Pos logit contributions
 crossed+0.005
 nervous+0.005
 amazed+0.005
 reached+0.005
 aback+0.005
Neg logit contributions
oral-0.009
oons-0.009
 degrees-0.009
 pancakes-0.009
 chores-0.009
Token fun
Token ID1257
Feature activation+1.16e-02

Pos logit contributions
oral+0.022
oons+0.021
 911+0.021
 degrees+0.021
 sy+0.020
Neg logit contributions
 so-0.008
 seen-0.008
 nervous-0.008
 crossed-0.008
 aback-0.008
Token with
Token ID351
Feature activation+8.70e-03

Pos logit contributions
 around+0.008
 so+0.007
 below+0.006
 scared+0.006
 crossed+0.006
Neg logit contributions
oral-0.028
 911-0.027
 ambul-0.026
 degrees-0.026
 sy-0.026
Token their
Token ID511
Feature activation-3.36e-03

Pos logit contributions
 surprised+0.008
 nervous+0.008
 aback+0.008
 crossed+0.008
 worn+0.008
Neg logit contributions
oral-0.016
oons-0.016
 degrees-0.016
 dining-0.015
 pancakes-0.015
Token and
Token ID290
Feature activation+1.63e-03

Pos logit contributions
 degrees+0.003
 pancakes+0.003
oons+0.003
oral+0.003
 chores+0.003
Neg logit contributions
 aback-0.001
 crossed-0.001
 worn-0.001
 reached-0.001
 gol-0.001
Token her
Token ID607
Feature activation-5.08e-03

Pos logit contributions
 aback+0.002
 gol+0.002
 crossed+0.002
 nervous+0.002
 worn+0.002
Neg logit contributions
 pancakes-0.002
oons-0.002
 degrees-0.002
oral-0.002
 dining-0.002
Token daughter
Token ID4957
Feature activation-7.12e-03

Pos logit contributions
 pancakes+0.008
oons+0.008
 degrees+0.008
oral+0.008
 dining+0.008
Neg logit contributions
 aback-0.006
 gol-0.006
 crossed-0.006
 nervous-0.005
 worn-0.005
Token were
Token ID547
Feature activation-1.39e-02

Pos logit contributions
 degrees+0.016
oons+0.016
oral+0.016
 pancakes+0.016
 911+0.015
Neg logit contributions
 crossed-0.005
 met-0.004
 found-0.004
 reached-0.004
 guessed-0.004
Token in
Token ID287
Feature activation-2.88e-03

Pos logit contributions
oons+0.024
 degrees+0.023
oral+0.023
 pancakes+0.022
 occasions+0.022
Neg logit contributions
 nervous-0.017
 surprised-0.016
 amazed-0.016
 crossed-0.016
 scared-0.016
Token their
Token ID511
Feature activation-3.14e-02

Pos logit contributions
oons+0.006
 degrees+0.006
oral+0.006
 pancakes+0.006
 911+0.006
Neg logit contributions
 aback-0.002
 nervous-0.002
 seen-0.002
 worn-0.002
 gol-0.002
Token backyard
Token ID24296
Feature activation-7.78e-03

Pos logit contributions
oons+0.051
 degrees+0.051
oral+0.050
 pancakes+0.049
 ambul+0.046
Neg logit contributions
 worn-0.038
 aback-0.038
 nervous-0.038
 surprised-0.037
 crossed-0.037
Token.
Token ID13
Feature activation+1.27e-03

Pos logit contributions
oral+0.017
 degrees+0.017
oons+0.017
 pancakes+0.016
 911+0.016
Neg logit contributions
 seen-0.006
 surprised-0.005
 touched-0.005
 scared-0.005
 crossed-0.005
Token The
Token ID383
Feature activation-9.00e-03

Pos logit contributions
 aback+0.001
 gol+0.001
arted+0.001
 perplex+0.001
 hesitant+0.001
Neg logit contributions
 pancakes-0.002
 beds-0.002
 TV-0.002
 dining-0.002
 breakfast-0.002
Token mom
Token ID1995
Feature activation+5.64e-03

Pos logit contributions
 degrees+0.019
oons+0.018
ocol+0.018
 occasions+0.018
 Street+0.017
Neg logit contributions
 nervous-0.008
 worn-0.007
 surprised-0.007
 scared-0.007
 embarrassed-0.007
Token wanted
Token ID2227
Feature activation-3.67e-03

Pos logit contributions
 aback+0.004
 gol+0.004
 crossed+0.003
 nervous+0.003
 worn+0.003
Neg logit contributions
 pancakes-0.012
oons-0.012
 degrees-0.012
oral-0.012
 chores-0.012
Token and
Token ID290
Feature activation-1.17e-02

Pos logit contributions
 aback+0.022
 gol+0.021
 below+0.021
 crossed+0.021
 surprised+0.020
Neg logit contributions
oral-0.031
 pancakes-0.031
oons-0.030
 chores-0.030
 dining-0.030
Token cool
Token ID3608
Feature activation+8.91e-03

Pos logit contributions
 degrees+0.017
 chores+0.017
 soup+0.017
 pancakes+0.017
 movie+0.016
Neg logit contributions
 aback-0.016
 gol-0.016
arted-0.016
amed-0.015
apped-0.014
Token!"
Token ID2474
Feature activation+1.27e-03

Pos logit contributions
 aback+0.011
 below+0.010
 crossed+0.010
 gol+0.010
 nervous+0.010
Neg logit contributions
oral-0.015
 pancakes-0.014
oons-0.014
 ambul-0.014
 dining-0.014
Token Sara
Token ID24799
Feature activation+2.43e-03

Pos logit contributions
 aback+0.002
 gol+0.002
arted+0.002
 worn+0.002
 perplex+0.002
Neg logit contributions
 pancakes-0.002
oons-0.002
oral-0.002
 chores-0.002
 degrees-0.002
Token says
Token ID1139
Feature activation-2.07e-02

Pos logit contributions
 aback+0.003
 gol+0.003
 worn+0.003
arted+0.003
 seen+0.003
Neg logit contributions
 pancakes-0.004
oons-0.004
 degrees-0.004
oral-0.004
 soup-0.004
Token.
Token ID13
Feature activation-1.83e-02

Pos logit contributions
 pancakes+0.036
oons+0.035
oral+0.035
 degrees+0.035
 ambul+0.035
Neg logit contributions
 aback-0.023
 nervous-0.022
 crossed-0.022
 surprised-0.021
 gol-0.021
Token She
Token ID1375
Feature activation-1.70e-05

Pos logit contributions
oons+0.027
 pancakes+0.027
oral+0.027
 911+0.027
 TV+0.026
Neg logit contributions
 aback-0.023
 gol-0.023
 seen-0.023
 worn-0.023
 overwhelmed-0.022
Token touches
Token ID18105
Feature activation-1.27e-04

Pos logit contributions
 pancakes+0.000
oons+0.000
 degrees+0.000
oral+0.000
 dining+0.000
Neg logit contributions
 aback-0.000
 crossed-0.000
 gol-0.000
 nervous-0.000
 surprised-0.000
Token the
Token ID262
Feature activation+3.70e-03

Pos logit contributions
 pancakes+0.000
oons+0.000
 chores+0.000
 degrees+0.000
 911+0.000
Neg logit contributions
 aback-0.000
 gol-0.000
 crossed-0.000
 worn-0.000
arted-0.000
Token dinosaur
Token ID28257
Feature activation-5.74e-03

Pos logit contributions
 aback+0.005
 crossed+0.005
 gol+0.005
arted+0.005
 hesitant+0.005
Neg logit contributions
 pancakes-0.006
oons-0.005
 degrees-0.005
oral-0.005
 dining-0.005
Token bone
Token ID9970
Feature activation+2.92e-03

Pos logit contributions
 ambul+0.013
oral+0.013
 dining+0.013
 occasions+0.013
 Year+0.013
Neg logit contributions
 around-0.005
 so-0.004
 below-0.004
 surprised-0.004
 scared-0.004
Token mom
Token ID1995
Feature activation+6.16e-03

Pos logit contributions
 degrees+0.016
oons+0.016
oral+0.016
 pancakes+0.015
 ambul+0.015
Neg logit contributions
 nervous-0.014
 worn-0.014
 surprised-0.014
 aback-0.014
 gol-0.013
Tokenmy
Token ID1820
Feature activation-4.85e-03

Pos logit contributions
 aback+0.006
 gol+0.006
 crossed+0.006
 nervous+0.006
 worn+0.005
Neg logit contributions
 pancakes-0.011
oons-0.011
 degrees-0.011
oral-0.011
 dining-0.011
Token asked
Token ID1965
Feature activation+8.24e-03

Pos logit contributions
 degrees+0.011
 pancakes+0.011
oral+0.011
oons+0.011
 dining+0.011
Neg logit contributions
 aback-0.003
 crossed-0.002
 reached-0.002
 surprised-0.002
 guessed-0.002
Token her
Token ID607
Feature activation+2.71e-04

Pos logit contributions
 found+0.008
 Sally+0.008
 so+0.008
 around+0.007
 asked+0.007
Neg logit contributions
 degrees-0.017
 marsh-0.016
 ambul-0.016
oral-0.016
 dining-0.016
Token to
Token ID284
Feature activation-1.56e-02

Pos logit contributions
 surprised+0.000
 aback+0.000
 so+0.000
 nervous+0.000
 around+0.000
Neg logit contributions
 degrees-0.001
ocol-0.001
 ambul-0.001
oons-0.001
 beds-0.001
Token help
Token ID1037
Feature activation-4.52e-02

Pos logit contributions
 degrees+0.027
oons+0.027
oral+0.026
 ambul+0.025
 pancakes+0.025
Neg logit contributions
 aback-0.018
 worn-0.018
 crossed-0.017
 reached-0.017
 touched-0.017
Token with
Token ID351
Feature activation-1.41e-02

Pos logit contributions
oral+0.084
 degrees+0.078
 dining+0.076
 ambul+0.075
 camps+0.074
Neg logit contributions
 asked-0.051
 touched-0.049
 so-0.049
 nervous-0.048
 aback-0.047
Token the
Token ID262
Feature activation-3.00e-03

Pos logit contributions
oral+0.029
oons+0.027
 degrees+0.026
bees+0.025
 dining+0.025
Neg logit contributions
 worn-0.014
 so-0.014
 nervous-0.013
 aback-0.012
 seen-0.012
Token laundry
Token ID25724
Feature activation-3.95e-03

Pos logit contributions
oons+0.005
oral+0.005
 degrees+0.004
oup+0.004
ves+0.004
Neg logit contributions
 nervous-0.004
 so-0.004
 source-0.004
 scared-0.004
 worn-0.004
Token.
Token ID13
Feature activation+8.38e-03

Pos logit contributions
oral+0.008
 degrees+0.007
 ambul+0.007
Sweet+0.007
oons+0.007
Neg logit contributions
 so-0.004
 aback-0.004
 nervous-0.004
 around-0.004
 worn-0.003
Token They
Token ID1119
Feature activation-5.76e-03

Pos logit contributions
 gol+0.011
 aback+0.010
 Philip+0.010
 seen+0.010
lot+0.010
Neg logit contributions
oons-0.013
 pancakes-0.013
 tea-0.013
 beds-0.012
 dining-0.012
Token\n
Token ID198
Feature activation-9.81e-04

Pos logit contributions
 gol+0.000
 aback+0.000
tie+0.000
amed+0.000
 seen+0.000
Neg logit contributions
 911-0.001
oral-0.001
 TV-0.001
 ambul-0.001
 beds-0.001
Token\n
Token ID198
Feature activation-1.69e-03

Pos logit contributions
oral+0.002
oons+0.002
 pancakes+0.002
 911+0.002
 chores+0.002
Neg logit contributions
 aback-0.001
 gol-0.001
arted-0.001
suit-0.001
 crossed-0.001
Token"
Token ID1
Feature activation+9.34e-03

Pos logit contributions
 pancakes+0.003
 degrees+0.003
 breakfast+0.002
oons+0.002
 chores+0.002
Neg logit contributions
 crossed-0.002
 gol-0.002
 worn-0.002
 surprised-0.002
 aback-0.002
TokenHey
Token ID10814
Feature activation-4.46e-03

Pos logit contributions
 aback+0.011
 gol+0.011
 crossed+0.011
 reached+0.010
 worn+0.010
Neg logit contributions
 pancakes-0.016
oons-0.015
 degrees-0.015
 chores-0.015
 dining-0.015
Token,
Token ID11
Feature activation-2.39e-02

Pos logit contributions
 pancakes+0.006
oons+0.006
 degrees+0.006
oral+0.006
 chores+0.005
Neg logit contributions
 aback-0.007
 gol-0.007
 crossed-0.006
 worn-0.006
 nervous-0.006
Token hey
Token ID17207
Feature activation-4.32e-02

Pos logit contributions
oons+0.042
 degrees+0.042
oral+0.041
 pancakes+0.040
 dining+0.039
Neg logit contributions
 aback-0.027
 gol-0.026
 crossed-0.026
 surprised-0.026
 scared-0.026
Token,
Token ID11
Feature activation-1.72e-02

Pos logit contributions
oral+0.071
oons+0.070
 chores+0.069
 dining+0.069
 ambul+0.069
Neg logit contributions
 aback-0.049
 surprised-0.049
 gol-0.049
 seen-0.049
 crossed-0.048
Token hey
Token ID17207
Feature activation-3.98e-02

Pos logit contributions
oons+0.026
 degrees+0.025
oral+0.025
 pancakes+0.025
 dining+0.024
Neg logit contributions
 aback-0.024
 gol-0.023
 crossed-0.023
 seen-0.022
 reached-0.022
Token!"
Token ID2474
Feature activation-2.26e-03

Pos logit contributions
oral+0.061
oons+0.061
 pancakes+0.061
 dining+0.060
 ambul+0.060
Neg logit contributions
 aback-0.050
 crossed-0.048
 gol-0.048
 surprised-0.047
 seen-0.047
Token the
Token ID262
Feature activation-1.89e-02

Pos logit contributions
 pancakes+0.004
oral+0.004
oons+0.004
 ambul+0.004
 soup+0.004
Neg logit contributions
 aback-0.003
 gol-0.002
arted-0.002
 crossed-0.002
 worn-0.002
Token park
Token ID3952
Feature activation-1.13e-02

Pos logit contributions
 pancakes+0.032
oral+0.032
oons+0.032
 degrees+0.032
 dining+0.031
Neg logit contributions
 aback-0.021
 gol-0.020
 crossed-0.019
 worn-0.019
 nervous-0.019
Token was
Token ID373
Feature activation+1.77e-02

Pos logit contributions
 aback+0.020
 gol+0.020
arted+0.019
 crossed+0.019
 nervous+0.019
Neg logit contributions
 pancakes-0.026
oons-0.026
oral-0.026
 degrees-0.025
 dining-0.025
Token a
Token ID257
Feature activation-6.60e-03

Pos logit contributions
 so+0.008
 seen+0.007
 nervous+0.007
 amazed+0.007
 scared+0.007
Neg logit contributions
 degrees-0.044
oral-0.044
oons-0.044
 911-0.042
 occasions-0.042
Token girl
Token ID2576
Feature activation-6.74e-03

Pos logit contributions
 pancakes+0.012
oons+0.012
 degrees+0.012
oral+0.012
 ambul+0.011
Neg logit contributions
 gol-0.007
 aback-0.006
 nervous-0.006
 crossed-0.006
 worn-0.006
Token named
Token ID3706
Feature activation+1.01e-02

Pos logit contributions
 pancakes+0.014
oons+0.014
 degrees+0.014
oral+0.014
 chores+0.014
Neg logit contributions
 aback-0.005
 gol-0.004
 crossed-0.004
 worn-0.004
 nervous-0.004
Token M
Token ID337
Feature activation-3.69e-02

Pos logit contributions
 aback+0.013
 gol+0.012
 crossed+0.012
 Philip+0.011
arted+0.011
Neg logit contributions
 pancakes-0.016
 911-0.016
oons-0.016
 breakfast-0.016
 dining-0.016
Tokenabel
Token ID9608
Feature activation-5.86e-02

Pos logit contributions
 ambul+0.056
 fires+0.053
 911+0.052
 chores+0.051
 sy+0.051
Neg logit contributions
 guessed-0.046
ody-0.045
arted-0.044
 touched-0.044
 surprised-0.044
Token.
Token ID13
Feature activation-9.64e-03

Pos logit contributions
 pancakes+0.106
 911+0.105
 dining+0.104
 chores+0.104
oral+0.104
Neg logit contributions
 seen-0.056
 below-0.056
 crossed-0.056
 reached-0.055
 guessed-0.055
Token She
Token ID1375
Feature activation-2.56e-03

Pos logit contributions
 pancakes+0.017
oons+0.017
 degrees+0.017
oral+0.017
 ambul+0.016
Neg logit contributions
 aback-0.010
 crossed-0.010
 nervous-0.010
 gol-0.009
 worn-0.009
Token was
Token ID373
Feature activation-2.94e-03

Pos logit contributions
 pancakes+0.005
oons+0.005
 degrees+0.005
oral+0.005
 dining+0.005
Neg logit contributions
 aback-0.002
 gol-0.002
arted-0.002
 crossed-0.002
 hesitant-0.002
Token only
Token ID691
Feature activation+7.41e-03

Pos logit contributions
 pancakes+0.005
 breakfast+0.005
 chores+0.005
 ambul+0.005
 tea+0.005
Neg logit contributions
 aback-0.003
 gol-0.003
arted-0.003
suit-0.003
 crossed-0.003
Token three
Token ID1115
Feature activation-2.47e-02

Pos logit contributions
 aback+0.009
 crossed+0.008
 gol+0.008
arted+0.008
 below+0.008
Neg logit contributions
 pancakes-0.012
 degrees-0.012
 breakfast-0.012
 ambul-0.012
 chores-0.011
Token it
Token ID340
Feature activation-1.30e-02

Pos logit contributions
 nervous+0.009
 crossed+0.009
 aback+0.009
 surprised+0.009
 guessed+0.009
Neg logit contributions
oons-0.019
oral-0.018
 dining-0.018
 degrees-0.018
 pancakes-0.017
Tokenâ
Token ID22940
Feature activation-1.35e-02

Pos logit contributions
oral+0.023
 pancakes+0.023
oons+0.022
 dining+0.022
 degrees+0.022
Neg logit contributions
 crossed-0.015
 scared-0.014
 aback-0.014
 nervous-0.014
 surprised-0.014
Token
Token ID26391
Feature activation+7.37e-03

Pos logit contributions
oral+0.020
 occasions+0.020
 chores+0.019
 pancakes+0.019
 degrees+0.019
Neg logit contributions
 amazed-0.018
 surprised-0.018
arted-0.017
 aback-0.017
 nervous-0.016
Token
Token ID8151
Feature activation-1.38e-02

Pos logit contributions
 aback+0.010
arted+0.009
 gol+0.009
 below+0.009
 perplex+0.009
Neg logit contributions
 pancakes-0.011
 degrees-0.011
 breakfast-0.011
oons-0.011
 chores-0.011
Tokens
Token ID82
Feature activation-1.71e-04

Pos logit contributions
 pancakes+0.021
oons+0.021
 degrees+0.020
oral+0.020
 ambul+0.020
Neg logit contributions
 aback-0.018
 gol-0.017
 crossed-0.017
 nervous-0.017
 worn-0.017
Token hard
Token ID1327
Feature activation-4.21e-02

Pos logit contributions
oral+0.000
oons+0.000
 dining+0.000
 pancakes+0.000
 degrees+0.000
Neg logit contributions
 nervous-0.000
 worn-0.000
 so-0.000
 crossed-0.000
 scared-0.000
Token and
Token ID290
Feature activation+4.29e-03

Pos logit contributions
oral+0.079
 pancakes+0.077
 ambul+0.076
 degrees+0.076
oons+0.075
Neg logit contributions
 aback-0.041
 nervous-0.040
 worn-0.039
 so-0.039
 around-0.038
Token I
Token ID314
Feature activation-4.06e-03

Pos logit contributions
 aback+0.005
 gol+0.005
 crossed+0.005
 nervous+0.005
 worn+0.005
Neg logit contributions
 pancakes-0.007
oons-0.007
 degrees-0.007
oral-0.007
 dining-0.007
Token don
Token ID836
Feature activation-4.35e-03

Pos logit contributions
 pancakes+0.007
oons+0.007
 degrees+0.007
oral+0.007
 chores+0.007
Neg logit contributions
 aback-0.004
 gol-0.004
 crossed-0.004
arted-0.004
 nervous-0.004
Tokenâ
Token ID22940
Feature activation-9.12e-03

Pos logit contributions
 dining+0.008
 degrees+0.008
 pancakes+0.008
 beds+0.007
oup+0.007
Neg logit contributions
 nervous-0.004
 aback-0.004
 crossed-0.004
 gol-0.004
 surprised-0.004
Token
Token ID26391
Feature activation+1.49e-02

Pos logit contributions
 chores+0.013
oral+0.013
 pancakes+0.013
 occasions+0.013
 breakfast+0.013
Neg logit contributions
 amazed-0.012
 surprised-0.012
arted-0.012
 below-0.011
 confused-0.011
Token who
Token ID508
Feature activation+8.69e-03

Pos logit contributions
 pancakes+0.020
 degrees+0.019
oons+0.019
 ambul+0.019
oral+0.019
Neg logit contributions
 aback-0.013
 gol-0.012
 crossed-0.012
 nervous-0.012
 worn-0.012
Token said
Token ID531
Feature activation+8.53e-03

Pos logit contributions
 aback+0.010
 gol+0.009
arted+0.009
suit+0.009
arrass+0.008
Neg logit contributions
oons-0.013
 pancakes-0.013
 ambul-0.013
 degrees-0.013
abel-0.013
Token to
Token ID284
Feature activation+7.84e-03

Pos logit contributions
 aback+0.009
 surprised+0.009
 nervous+0.009
 so+0.008
 around+0.008
Neg logit contributions
oons-0.016
 pancakes-0.016
 beds-0.016
oral-0.016
 dining-0.015
Token him
Token ID683
Feature activation+1.16e-02

Pos logit contributions
 aback+0.012
 gol+0.011
 crossed+0.011
 worn+0.011
 nervous+0.011
Neg logit contributions
 pancakes-0.011
oons-0.010
oral-0.010
 degrees-0.010
 chores-0.010
Token,
Token ID11
Feature activation-4.67e-03

Pos logit contributions
 aback+0.014
 surprised+0.013
 below+0.013
 nervous+0.013
 hesitant+0.013
Neg logit contributions
 pancakes-0.019
oons-0.019
oral-0.019
 dining-0.018
 degrees-0.018
Token
Token ID6184
Feature activation-4.29e-02

Pos logit contributions
oons+0.009
 pancakes+0.009
oral+0.009
 degrees+0.009
 chores+0.009
Neg logit contributions
 aback-0.004
 nervous-0.004
 surprised-0.004
 crossed-0.004
 amazed-0.004
Token
Token ID95
Feature activation-1.00e-02

Pos logit contributions
 pancakes+0.073
oons+0.072
 degrees+0.071
oral+0.070
 ambul+0.069
Neg logit contributions
 aback-0.049
 gol-0.047
 crossed-0.046
 nervous-0.045
 worn-0.045
Token
Token ID26391
Feature activation+8.23e-03

Pos logit contributions
 degrees+0.015
 dining+0.014
 occasions+0.014
oral+0.014
 pancakes+0.014
Neg logit contributions
 nervous-0.013
 surprised-0.013
 worn-0.013
 amazed-0.012
 seen-0.012
Token
Token ID129
Feature activation-5.81e-03

Pos logit contributions
arted+0.012
 aback+0.012
 gol+0.012
 worn+0.011
 hesitant+0.011
Neg logit contributions
 degrees-0.010
 pancakes-0.010
oons-0.010
 dining-0.010
 ambul-0.010
Token
Token ID241
Feature activation+1.25e-02

Pos logit contributions
 degrees+0.008
 chores+0.008
 pancakes+0.008
 ambul+0.008
oons+0.008
Neg logit contributions
 aback-0.008
 worn-0.008
 gol-0.008
 seen-0.008
 crossed-0.007
TokenDon
Token ID3987
Feature activation-8.52e-03

Pos logit contributions
 aback+0.013
 crossed+0.012
 gol+0.012
arted+0.012
 overwhelmed+0.012
Neg logit contributions
 pancakes-0.021
 degrees-0.021
oons-0.021
oral-0.021
 ambul-0.020