TOP ACTIVATIONS
MAX = 5.810

someone calling out from up in the tree . Tony stopped
, okay ?" mom says , hugging Anna .
didn 't understand what " for give " meant , but
Smoke came out of the chim ney and made the sky
o uch !" Coco said , copying her .
crunch y ," Mom said , as she held Tom 's
hurt !" Mom my said , rubbing her eye . "
comes from the other side of the fence .
. A big rainbow appeared in the sky ! Lily shouted
naughty ," Mom my said , carrying Sara to the bathroom
and discovered a long - for gotten t eddy bear inside
, Tim ," Sam said , walking away .
The fire man truck stopped in front of a big house
body come out of the chim ney . They pulled one
. Her mom came back in the room and saw what
scared you ." Mom said , hugging them .
M um said , " Yes Daisy , we
the sun was high up in the sky and the fort
and was soon high up in the tree . As he
them gently ." Mom said , giving them a soft towel

MIN ACTIVATIONS
MIN = -7.895

They put on hats and brace lets and were ready to
, for she had a brand new bicycle . She proudly
the store to buy a brand new computer that was even
You can make knots and brace lets and neck l aces
. They make rings and brace lets and neck l aces
a little girl who loved brace lets . One day ,
beautiful neck l aces and brace lets . You sorted the
She could make knots and brace lets and neck l aces
kinds of ear rings and brace lets , and he really
make neck l aces and brace lets with them .
. He was wearing his brand new icy blue jacket as
, because she had a brand new bike that made her
make neck l aces and brace lets with them ," Tom
of fun making knots and brace lets and neck l aces
park . Jack had a brand new toy with him .
also wore gown s and brace lets . Some of them
A lyss a had a brand new toy car . She
a needle and thread and se wed Teddy 's eye back
Billy . Billy had a brand new car that could speed
your neck l aces and brace lets ?" Mom asks .

INTERVAL 2.384 - 5.810
CONTAINS 5.947%

saw Sam sleeping on the sofa . She said , "
. They stayed in a hotel with lots of rooms .
mom . Mom said , " Stay with me and do
Tim liked . He jumped up and tried to sniff the
He was scared of them at first , but then he

INTERVAL -1.043 - 2.384
CONTAINS 72.232%

to explore and look around . She went to the tall
. They are sad . They are sorry . They wish
to always listen to adults . <|endoftext|> Once upon a time
His mom said , " Oh , Sammy . That 's
When they reached the other side , they found the nest

INTERVAL -4.469 - -1.043
CONTAINS 21.699%

for recommendations . They had lots of fun together , and
, " I don 't know . Maybe he will go
all ride together and have so much fun . Tim was
Tim my just wanted to play with his toy car .
I know you didn 't mean to break the belt .

INTERVAL -7.895 - -4.469
CONTAINS 0.122%

her a hug . <|endoftext|> Once upon a time , there
my . She wanted to dress up in her favorite dress
the right thing . <|endoftext|> Once upon a time , there
They did not want to dress . " Ben
to the shop and buy things they needed or wanted .
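
The four INTERVAL ranges above appear to be equal-width bins spanning MIN to MAX. The sketch below recomputes the bin edges from the two reported extremes. The helper name `interval_edges` and the bin count of 4 are assumptions read off this listing, not part of the dashboard itself, and because MIN and MAX are shown rounded to three decimals, the recomputed edges can only be expected to agree with the listed ones to within rounding.

```python
def interval_edges(lo: float, hi: float, n_bins: int = 4) -> list[float]:
    """Return the n_bins + 1 edges of equal-width bins covering [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# Extremes as reported under MIN ACTIVATIONS and TOP ACTIVATIONS.
edges = interval_edges(-7.895, 5.810)

# Edges as listed in the INTERVAL headers, lowest to highest.
listed = [-7.895, -4.469, -1.043, 2.384, 5.810]

# Share of activations per interval, lowest to highest, as listed under CONTAINS.
shares = [0.122, 21.699, 72.232, 5.947]

for computed, shown in zip(edges, listed):
    print(f"computed {computed:+.4f}   listed {shown:+.3f}")
print(f"total share: {sum(shares):.3f}%")
```

Under these assumptions each bin is about 3.426 wide, and the listed shares add up to 100%, so the four intervals cover the whole activation range without overlapping.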
Token someone
Token ID2130
Feature activation-6.15e-01

Pos logit contributions
ory+0.009
ly+0.009
ipop+0.009
ercise+0.009
fy+0.009
Neg logit contributions
 never-0.005
Or-0.004
 knocking-0.004
 Ign-0.004
 horror-0.004
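
Every per-token record in this listing follows the same layout as the one above: the token text, its ID, the feature activation, then the positive and negative logit contributions, with each label fused directly onto its value. The sketch below splits one such record into fields. `TokenRecord` and `parse_record` are hypothetical names introduced for illustration. It assumes the label is followed immediately by the raw token string (so a leading space belongs to the token) and that every contribution line ends in a signed decimal.

```python
import re
from dataclasses import dataclass, field

# A signed decimal at the end of a line, e.g. "+0.009" or "-6.15e-01".
_SIGNED = r"[+-]\d+\.\d+(?:e[+-]\d+)?"
_CONTRIB = re.compile(rf"^(.*?)({_SIGNED})$")


@dataclass
class TokenRecord:
    token: str
    token_id: int
    activation: float
    pos: list = field(default_factory=list)  # (output token, contribution) pairs
    neg: list = field(default_factory=list)


def parse_record(lines):
    """Parse one record given as raw lines; blank lines are ignored."""
    rows = [ln.rstrip("\n") for ln in lines if ln.strip()]
    rec = TokenRecord(
        token=rows[0][len("Token"):],            # keeps any leading space
        token_id=int(rows[1][len("Token ID"):]),
        activation=float(rows[2][len("Feature activation"):]),
    )
    target = None
    for row in rows[3:]:
        if row == "Pos logit contributions":
            target = rec.pos
        elif row == "Neg logit contributions":
            target = rec.neg
        else:
            match = _CONTRIB.match(row)
            if match and target is not None:
                target.append((match.group(1), float(match.group(2))))
    return rec


record = parse_record([
    "Token someone",
    "Token ID2130",
    "Feature activation-6.15e-01",
    "",
    "Pos logit contributions",
    "ory+0.009",
    "Neg logit contributions",
    " never-0.005",
])
print(record)
```

Matching the number at the end of the line, rather than splitting on the first sign character, keeps output tokens that themselves contain `+` or `-` intact.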
Token calling
Token ID4585
Feature activation+2.65e-01

Pos logit contributions
ly+0.018
ory+0.018
les+0.017
ipop+0.017
fy+0.017
Neg logit contributions
Or-0.008
 never-0.008
 fact-0.008
stack-0.007
pe-0.007
Token out
Token ID503
Feature activation+2.61e+00

Pos logit contributions
 never+0.005
 horror+0.005
 knocking+0.005
 Ign+0.005
 sneaking+0.005
Neg logit contributions
ory-0.006
fy-0.006
ipop-0.006
ercise-0.006
ip-0.006
Token from
Token ID422
Feature activation+4.44e+00

Pos logit contributions
 never+0.052
Or+0.051
 knocking+0.050
 fact+0.049
 horror+0.049
Neg logit contributions
ory-0.059
ly-0.058
ipop-0.057
ercise-0.055
les-0.055
Token up
Token ID510
Feature activation+3.60e+00

Pos logit contributions
Or+0.084
 never+0.082
ric+0.081
 ambul+0.079
 Ign+0.079
Neg logit contributions
ly-0.103
ory-0.103
les-0.095
ercise-0.094
ipop-0.093
Token in
Token ID287
Feature activation+5.81e+00

Pos logit contributions
Or+0.085
 never+0.084
ric+0.083
 knocking+0.083
 ambul+0.081
Neg logit contributions
ory-0.068
ly-0.068
ipop-0.062
les-0.061
fy-0.060
Token the
Token ID262
Feature activation+1.58e+00

Pos logit contributions
Or+0.123
ric+0.120
 ambul+0.113
stack+0.112
pe+0.112
Neg logit contributions
ly-0.121
ory-0.119
 sunglasses-0.110
les-0.108
 sandwich-0.108
Token tree
Token ID5509
Feature activation+3.54e+00

Pos logit contributions
Or+0.028
 never+0.027
ric+0.027
 saving+0.027
pe+0.027
Neg logit contributions
ly-0.039
ory-0.037
les-0.036
ercise-0.036
ipop-0.036
Token.
Token ID13
Feature activation+8.82e-01

Pos logit contributions
Or+0.075
 never+0.072
 fact+0.072
 known+0.071
 knocking+0.071
Neg logit contributions
ly-0.076
ory-0.074
ipop-0.069
les-0.069
fy-0.068
Token Tony
Token ID8832
Feature activation+2.28e-01

Pos logit contributions
 never+0.018
Or+0.018
 knocking+0.018
 ambul+0.018
 sneaking+0.018
Neg logit contributions
ory-0.019
ly-0.019
ipop-0.018
ercise-0.018
les-0.017
Token stopped
Token ID5025
Feature activation+3.27e+00

Pos logit contributions
ric+0.004
 basement+0.004
Or+0.004
 ambul+0.004
 collapse+0.004
Neg logit contributions
ly-0.006
ory-0.005
les-0.005
ipop-0.005
fy-0.005
Token,
Token ID11
Feature activation+2.88e+00

Pos logit contributions
 never+0.012
Or+0.012
 knocking+0.011
 horror+0.011
 hallway+0.011
Neg logit contributions
ory-0.016
ly-0.016
ipop-0.016
les-0.015
ercise-0.015
Token okay
Token ID8788
Feature activation-2.58e-01

Pos logit contributions
 hallway+0.049
 collapse+0.048
Or+0.048
ric+0.048
 known+0.048
Neg logit contributions
ory-0.070
ly-0.070
ipop-0.068
ercise-0.065
les-0.065
Token?"
Token ID1701
Feature activation+8.03e-01

Pos logit contributions
ory+0.005
ly+0.005
ipop+0.005
ercise+0.005
 sandwich+0.005
Neg logit contributions
Or-0.005
 hallway-0.005
 horror-0.005
 never-0.005
 fact-0.005
Token mom
Token ID1995
Feature activation+1.03e+00

Pos logit contributions
ric+0.021
stack+0.021
 hallway+0.020
 sneaking+0.020
 tresp+0.020
Neg logit contributions
ory-0.013
ipop-0.012
ly-0.012
 sandwich-0.011
 handsome-0.011
Token says
Token ID1139
Feature activation+2.34e-01

Pos logit contributions
stack+0.029
Or+0.029
ric+0.029
 never+0.029
 known+0.028
Neg logit contributions
ly-0.015
les-0.014
ory-0.014
ipop-0.013
ercise-0.012
Token,
Token ID11
Feature activation+5.66e+00

Pos logit contributions
 fact+0.004
Or+0.004
 never+0.004
ric+0.004
 known+0.004
Neg logit contributions
ly-0.006
ory-0.006
ipop-0.006
les-0.005
fy-0.005
Token hugging
Token ID46292
Feature activation-2.67e-01

Pos logit contributions
ric+0.116
 fact+0.104
elope+0.103
 known+0.102
 basement+0.102
Neg logit contributions
ly-0.132
 sunglasses-0.116
 candy-0.115
ipop-0.115
 salad-0.112
Token Anna
Token ID11735
Feature activation+1.37e+00

Pos logit contributions
ory+0.007
ipop+0.007
ly+0.006
ercise+0.006
fy+0.006
Neg logit contributions
 never-0.005
Or-0.004
 knocking-0.004
 horror-0.004
 zo-0.004
Token.
Token ID13
Feature activation+1.21e+00

Pos logit contributions
Or+0.025
 never+0.025
 fact+0.025
 hallway+0.025
 knocking+0.025
Neg logit contributions
ory-0.032
ly-0.032
ipop-0.031
les-0.030
fy-0.030
Token
Token ID198
Feature activation+2.69e-01

Pos logit contributions
ric+0.025
stack+0.024
 tresp+0.023
 collapse+0.023
pe+0.023
Neg logit contributions
ly-0.026
ipop-0.026
ory-0.026
les-0.024
bs-0.024
Token
Token ID198
Feature activation+6.34e-01

Pos logit contributions
Or+0.004
stack+0.004
 sneaking+0.004
 ambul+0.004
ric+0.004
Neg logit contributions
ory-0.007
ly-0.007
les-0.007
ipop-0.007
fy-0.006
Token didn
Token ID1422
Feature activation-3.51e-01

Pos logit contributions
 hallway+0.003
 basement+0.003
 knocking+0.003
ric+0.003
Or+0.003
Neg logit contributions
ly-0.005
ory-0.004
les-0.004
ipop-0.004
ercise-0.004
Token't
Token ID470
Feature activation-1.70e+00

Pos logit contributions
ly+0.010
ory+0.009
ipop+0.009
ade+0.009
ip+0.009
Neg logit contributions
Or-0.005
elope-0.005
 hallway-0.005
ric-0.005
 horror-0.005
Token understand
Token ID1833
Feature activation-1.04e+00

Pos logit contributions
ory+0.038
ly+0.037
ipop+0.036
ercise+0.035
les+0.034
Neg logit contributions
Or-0.035
 never-0.035
 knocking-0.034
 hallway-0.033
 horror-0.033
Token what
Token ID644
Feature activation-1.36e+00

Pos logit contributions
ly+0.023
ory+0.022
ipop+0.021
les+0.021
 candy+0.020
Neg logit contributions
 hallway-0.023
Or-0.022
stack-0.022
Uh-0.022
elope-0.021
Token "
Token ID366
Feature activation+7.61e-01

Pos logit contributions
ory+0.030
ly+0.030
ipop+0.029
les+0.028
ercise+0.028
Neg logit contributions
 never-0.028
Or-0.027
 knocking-0.027
 Ign-0.026
 fact-0.026
Tokenfor
Token ID1640
Feature activation+5.64e+00

Pos logit contributions
 never+0.014
Or+0.013
 knocking+0.013
 horror+0.013
 fact+0.013
Neg logit contributions
ory-0.019
ly-0.018
ipop-0.018
ercise-0.017
les-0.017
Tokengive
Token ID26535
Feature activation-5.78e-01

Pos logit contributions
 hallway+0.100
Or+0.097
 horror+0.095
 collapse+0.093
 slam+0.093
Neg logit contributions
ory-0.143
ly-0.134
ercise-0.133
ipop-0.130
 sunglasses-0.123
Token"
Token ID1
Feature activation-5.90e-01

Pos logit contributions
ipop+0.014
ory+0.014
ly+0.013
fy+0.013
ercise+0.013
Neg logit contributions
 never-0.011
 Ign-0.010
 fact-0.010
 knocking-0.010
pe-0.010
Token meant
Token ID4001
Feature activation+5.88e-01

Pos logit contributions
ory+0.015
ipop+0.015
ly+0.015
ercise+0.015
fy+0.014
Neg logit contributions
 never-0.010
Or-0.009
 known-0.009
 knocking-0.009
 Ign-0.009
Token,
Token ID11
Feature activation+1.22e+00

Pos logit contributions
Or+0.012
ric+0.012
 hallway+0.012
Uh+0.012
elope+0.012
Neg logit contributions
ly-0.013
ory-0.013
ipop-0.012
les-0.011
 sandwich-0.011
Token but
Token ID475
Feature activation-6.25e-02

Pos logit contributions
 basement+0.026
 hallway+0.026
ric+0.026
elope+0.025
 collapse+0.025
Neg logit contributions
ory-0.028
ly-0.027
ipop-0.025
 fries-0.024
ip-0.024
Token Smoke
Token ID25416
Feature activation+5.67e-01

Pos logit contributions
 never+0.009
 sneaking+0.009
ric+0.009
Or+0.009
 known+0.009
Neg logit contributions
ory-0.008
ipop-0.008
ly-0.008
 sandwich-0.007
 sandwiches-0.007
Token came
Token ID1625
Feature activation+1.84e+00

Pos logit contributions
Or+0.009
 Ign+0.009
 known+0.009
 never+0.009
 horror+0.009
Neg logit contributions
ly-0.015
ory-0.015
les-0.014
ipop-0.014
ade-0.013
Token out
Token ID503
Feature activation+3.19e+00

Pos logit contributions
 fact+0.042
Or+0.042
stack+0.040
elope+0.040
 horror+0.040
Neg logit contributions
ly-0.038
ory-0.035
les-0.034
ipop-0.032
fy-0.032
Token of
Token ID286
Feature activation+3.51e+00

Pos logit contributions
Or+0.100
stack+0.098
 known+0.098
ric+0.098
elope+0.097
Neg logit contributions
ly-0.039
les-0.033
ory-0.031
ls-0.030
fy-0.030
Token the
Token ID262
Feature activation+1.90e+00

Pos logit contributions
ric+0.087
elope+0.087
Or+0.087
Con+0.085
 known+0.084
Neg logit contributions
 sandwiches-0.058
 groceries-0.058
 salad-0.055
 candy-0.055
 snacks-0.055
Token chim
Token ID18205
Feature activation+5.63e+00

Pos logit contributions
 known+0.043
Or+0.040
Con+0.039
 accompl+0.039
elope+0.039
Neg logit contributions
 groceries-0.039
ly-0.038
 candy-0.036
les-0.036
 sandwiches-0.035
Tokenney
Token ID1681
Feature activation+4.47e+00

Pos logit contributions
 sneaking+0.110
 accompl+0.109
 terror+0.109
 known+0.107
 horror+0.103
Neg logit contributions
ade-0.132
s-0.132
les-0.128
ly-0.127
uddle-0.126
Token and
Token ID290
Feature activation-9.47e-01

Pos logit contributions
 known+0.114
ric+0.106
Or+0.104
Uh+0.104
Con+0.103
Neg logit contributions
ly-0.090
les-0.079
fy-0.078
 sandwich-0.074
ade-0.074
Token made
Token ID925
Feature activation-8.18e-02

Pos logit contributions
ly+0.023
ipop+0.022
les+0.021
ory+0.021
 sandwiches+0.021
Neg logit contributions
ric-0.017
 Ign-0.017
 known-0.016
elope-0.016
pe-0.016
Token the
Token ID262
Feature activation+1.38e+00

Pos logit contributions
ory+0.002
ipop+0.002
ercise+0.002
ly+0.002
fy+0.002
Neg logit contributions
 never-0.001
 knocking-0.001
 horror-0.001
 collapse-0.001
 zo-0.001
Token sky
Token ID6766
Feature activation+1.22e-01

Pos logit contributions
Con+0.028
 known+0.028
Or+0.026
 accompl+0.026
 never+0.026
Neg logit contributions
ly-0.029
 sandwich-0.029
 sandwiches-0.028
 groceries-0.028
 candy-0.028
Token o
Token ID267
Feature activation+1.54e+00

Pos logit contributions
Or+0.061
 never+0.060
 hallway+0.059
 knocking+0.058
 known+0.058
Neg logit contributions
ory-0.082
ly-0.080
ipop-0.079
les-0.076
ercise-0.075
Tokenuch
Token ID794
Feature activation+4.89e-01

Pos logit contributions
 horror+0.034
Or+0.034
 knocking+0.033
 ambul+0.033
 hallway+0.033
Neg logit contributions
ipop-0.034
ory-0.032
ip-0.031
ercise-0.030
 sandwich-0.030
Token!"
Token ID2474
Feature activation+2.47e-01

Pos logit contributions
Or+0.010
 never+0.010
 horror+0.010
 knocking+0.010
 hallway+0.010
Neg logit contributions
ory-0.011
ly-0.011
ipop-0.010
les-0.010
fy-0.010
Token Coco
Token ID48222
Feature activation-7.25e-01

Pos logit contributions
Or+0.005
ric+0.005
 never+0.005
 knocking+0.005
pe+0.005
Neg logit contributions
ly-0.006
ory-0.006
ipop-0.006
les-0.005
fy-0.005
Token said
Token ID531
Feature activation-5.39e-01

Pos logit contributions
ly+0.014
ory+0.014
ipop+0.013
les+0.013
 sandwich+0.012
Neg logit contributions
ric-0.018
 fact-0.017
 hallway-0.017
 basement-0.017
Or-0.017
Token,
Token ID11
Feature activation+5.60e+00

Pos logit contributions
ly+0.013
ory+0.012
ipop+0.012
les+0.011
fy+0.011
Neg logit contributions
 fact-0.011
ric-0.011
Or-0.011
 hallway-0.010
 never-0.010
Token copying
Token ID23345
Feature activation-1.01e+00

Pos logit contributions
 known+0.106
 basement+0.103
 apartment+0.101
ric+0.100
 Zo+0.098
Neg logit contributions
ly-0.138
 sunglasses-0.121
 sandwich-0.117
ipop-0.116
les-0.116
Token her
Token ID607
Feature activation-1.50e-01

Pos logit contributions
ory+0.020
ly+0.020
ipop+0.019
les+0.019
ercise+0.018
Neg logit contributions
 never-0.022
Or-0.022
 hallway-0.022
 knocking-0.022
 fact-0.021
Token.
Token ID13
Feature activation+1.21e+00

Pos logit contributions
ory+0.004
ly+0.004
ipop+0.004
ercise+0.004
fy+0.004
Neg logit contributions
 never-0.003
Or-0.003
 knocking-0.003
 horror-0.003
 zo-0.003
Token
Token ID198
Feature activation+3.89e-01

Pos logit contributions
ric+0.025
 ambul+0.024
 fact+0.024
 collapse+0.024
elope+0.023
Neg logit contributions
ipop-0.027
ory-0.026
ly-0.026
ip-0.025
 sandwich-0.024
Token
Token ID198
Feature activation+9.65e-01

Pos logit contributions
 fact+0.006
Or+0.006
stack+0.006
 hallway+0.006
 ambul+0.006
Neg logit contributions
ory-0.010
ly-0.010
les-0.009
ipop-0.009
ip-0.009
Token crunch
Token ID23527
Feature activation-5.14e-01

Pos logit contributions
 never+0.036
Or+0.035
 hallway+0.034
 Ign+0.034
 horror+0.034
Neg logit contributions
ory-0.041
ly-0.040
ipop-0.039
les-0.038
ercise-0.038
Tokeny
Token ID88
Feature activation+6.90e-01

Pos logit contributions
ory+0.012
ercise+0.012
ipop+0.012
ly+0.012
shirt+0.012
Neg logit contributions
 never-0.009
Or-0.009
pe-0.009
 knocking-0.008
 sneaking-0.008
Token,"
Token ID553
Feature activation-3.15e-03

Pos logit contributions
 never+0.013
Or+0.013
 knocking+0.013
 hallway+0.013
 horror+0.013
Neg logit contributions
ory-0.016
ly-0.016
ipop-0.015
les-0.015
fy-0.015
Token Mom
Token ID11254
Feature activation+1.29e+00

Pos logit contributions
ly+0.000
ory+0.000
les+0.000
ercise+0.000
fy+0.000
Neg logit contributions
 never-0.000
Or-0.000
 zo-0.000
 horror-0.000
 only-0.000
Token said
Token ID531
Feature activation-4.04e-01

Pos logit contributions
Or+0.032
 never+0.032
ric+0.031
 hallway+0.031
 fact+0.031
Neg logit contributions
ly-0.023
ory-0.022
ipop-0.022
les-0.021
fy-0.020
Token,
Token ID11
Feature activation+5.59e+00

Pos logit contributions
ly+0.010
ory+0.010
ipop+0.010
fy+0.009
les+0.009
Neg logit contributions
Or-0.007
 fact-0.007
 never-0.007
ric-0.007
 hallway-0.007
Token as
Token ID355
Feature activation+7.35e-01

Pos logit contributions
ric+0.105
 basement+0.104
 fact+0.101
 known+0.100
 tresp+0.100
Neg logit contributions
ly-0.129
ipop-0.119
 sunglasses-0.119
 salad-0.118
 sandwich-0.118
Token she
Token ID673
Feature activation+7.99e-01

Pos logit contributions
 never+0.016
Or+0.016
 knocking+0.015
ric+0.015
 hallway+0.015
Neg logit contributions
ory-0.015
ly-0.015
ipop-0.015
fy-0.014
les-0.014
Token held
Token ID2714
Feature activation+2.20e-01

Pos logit contributions
ric+0.016
 basement+0.015
 collapse+0.015
 fact+0.015
 Ign+0.015
Neg logit contributions
ory-0.019
ly-0.018
ipop-0.018
les-0.018
fy-0.017
Token Tom
Token ID4186
Feature activation+1.57e-02

Pos logit contributions
 only+0.003
 never+0.003
 horror+0.003
 nine+0.003
 no+0.003
Neg logit contributions
ory-0.006
fy-0.006
ly-0.006
ercise-0.006
ipop-0.006
Token's
Token ID338
Feature activation-7.53e-02

Pos logit contributions
 never+0.000
 knocking+0.000
Or+0.000
 sneaking+0.000
elope+0.000
Neg logit contributions
ory-0.000
ly-0.000
ercise-0.000
ipop-0.000
les-0.000
Token hurt
Token ID5938
Feature activation+1.29e+00

Pos logit contributions
Or+0.014
 never+0.014
 hallway+0.013
 knocking+0.013
 horror+0.013
Neg logit contributions
ory-0.020
ly-0.020
ipop-0.019
les-0.019
ercise-0.019
Token!"
Token ID2474
Feature activation-5.87e-01

Pos logit contributions
Or+0.030
 hallway+0.029
 known+0.029
 knocking+0.029
 horror+0.028
Neg logit contributions
ly-0.026
ory-0.025
ipop-0.024
les-0.024
fy-0.023
Token Mom
Token ID11254
Feature activation+1.33e+00

Pos logit contributions
ory+0.011
ipop+0.011
ly+0.011
 sandwich+0.010
fy+0.010
Neg logit contributions
Or-0.013
 never-0.013
pe-0.013
 knocking-0.013
 hallway-0.013
Tokenmy
Token ID1820
Feature activation+1.51e+00

Pos logit contributions
Or+0.031
stack+0.031
 fact+0.030
 basement+0.030
 hallway+0.030
Neg logit contributions
ly-0.026
ory-0.025
ipop-0.024
les-0.024
fy-0.023
Token said
Token ID531
Feature activation-3.44e-02

Pos logit contributions
ric+0.040
stack+0.040
 basement+0.040
 hallway+0.039
Or+0.038
Neg logit contributions
ly-0.029
les-0.025
ipop-0.024
ory-0.024
fy-0.023
Token,
Token ID11
Feature activation+5.58e+00

Pos logit contributions
ly+0.001
ory+0.001
ipop+0.001
les+0.001
fy+0.001
Neg logit contributions
 fact-0.001
Or-0.001
ric-0.001
 known-0.001
 hallway-0.001
Token rubbing
Token ID27714
Feature activation+7.70e-01

Pos logit contributions
 known+0.101
elope+0.093
ric+0.093
 apartment+0.092
 basement+0.092
Neg logit contributions
ly-0.136
 sunglasses-0.125
 sandwich-0.124
 salad-0.120
ipop-0.118
Token her
Token ID607
Feature activation+9.34e-01

Pos logit contributions
Or+0.019
 hallway+0.019
 ambul+0.019
 known+0.019
 never+0.019
Neg logit contributions
ly-0.013
ory-0.013
ipop-0.013
les-0.013
shirt-0.012
Token eye
Token ID4151
Feature activation-6.96e-01

Pos logit contributions
 never+0.017
Or+0.017
 known+0.016
 knocking+0.016
 ambul+0.016
Neg logit contributions
ory-0.023
ly-0.023
ipop-0.022
les-0.021
fy-0.021
Token.
Token ID13
Feature activation+1.49e+00

Pos logit contributions
ory+0.018
ipop+0.017
ly+0.017
ercise+0.017
fy+0.017
Neg logit contributions
 never-0.012
Or-0.011
 knocking-0.011
pe-0.011
 horror-0.011
Token "
Token ID366
Feature activation+3.64e+00

Pos logit contributions
pe+0.033
 known+0.033
 tresp+0.032
elope+0.032
 ambul+0.032
Neg logit contributions
ory-0.028
ipop-0.028
ip-0.026
ly-0.026
 sandwich-0.026
Token comes
Token ID2058
Feature activation+1.76e+00

Pos logit contributions
Or+0.007
 Ign+0.007
stack+0.007
 hallway+0.007
 ambul+0.007
Neg logit contributions
les-0.007
ly-0.007
ory-0.007
ercise-0.006
ipop-0.006
Token from
Token ID422
Feature activation+4.30e+00

Pos logit contributions
 fact+0.039
Or+0.039
stack+0.038
ric+0.037
pe+0.037
Neg logit contributions
ly-0.036
ory-0.036
les-0.032
fy-0.032
ipop-0.031
Token the
Token ID262
Feature activation+1.83e+00

Pos logit contributions
ric+0.090
 Ign+0.089
 never+0.089
 ambul+0.086
pe+0.086
Neg logit contributions
ory-0.085
ly-0.084
ercise-0.079
les-0.078
 sunglasses-0.076
Token other
Token ID584
Feature activation+3.73e+00

Pos logit contributions
 known+0.039
ric+0.037
pe+0.036
 glaring+0.036
 accompl+0.036
Neg logit contributions
 sandwich-0.035
 sandwiches-0.034
 groceries-0.034
 sunglasses-0.034
 candy-0.034
Token side
Token ID1735
Feature activation+3.55e+00

Pos logit contributions
pe+0.082
stack+0.079
 known+0.078
Or+0.078
 glaring+0.076
Neg logit contributions
ory-0.076
ly-0.074
 sandwiches-0.069
 sunglasses-0.069
les-0.069
Token of
Token ID286
Feature activation+5.57e+00

Pos logit contributions
Or+0.100
 known+0.095
 ambul+0.095
 fact+0.094
Uh+0.094
Neg logit contributions
ly-0.051
ory-0.049
ipop-0.047
les-0.046
fy-0.045
Token the
Token ID262
Feature activation+2.18e+00

Pos logit contributions
Or+0.145
elope+0.139
ric+0.139
pe+0.136
 Gard+0.134
Neg logit contributions
 sandwiches-0.083
 groceries-0.080
 sunglasses-0.079
 candy-0.079
 salad-0.077
Token fence
Token ID13990
Feature activation+3.87e+00

Pos logit contributions
 known+0.047
ric+0.044
pe+0.044
Or+0.044
 glaring+0.044
Neg logit contributions
 sandwich-0.040
 sandwiches-0.040
ory-0.040
 groceries-0.040
ly-0.040
Token.
Token ID13
Feature activation+1.13e+00

Pos logit contributions
 known+0.093
Or+0.090
 ambul+0.090
Con+0.089
ric+0.088
Neg logit contributions
ly-0.073
les-0.069
ory-0.064
fy-0.063
 album-0.063
Token
Token ID198
Feature activation+1.02e+00

Pos logit contributions
 never+0.025
Or+0.025
ric+0.024
 sneaking+0.024
 fact+0.024
Neg logit contributions
ory-0.022
ly-0.022
ipop-0.022
les-0.021
ercise-0.021
Token
Token ID198
Feature activation+1.45e+00

Pos logit contributions
Or+0.015
 fact+0.015
 sneaking+0.015
 hallway+0.015
 never+0.015
Neg logit contributions
ory-0.027
ly-0.026
les-0.026
ipop-0.026
ade-0.025
Token.
Token ID13
Feature activation-5.11e-01

Pos logit contributions
 never+0.008
 knocking+0.008
Or+0.008
 sneaking+0.008
 zo+0.008
Neg logit contributions
ory-0.012
ly-0.012
ipop-0.012
ercise-0.012
fy-0.012
Token A
Token ID317
Feature activation-3.24e-01

Pos logit contributions
ory+0.014
ly+0.013
ipop+0.013
ercise+0.013
les+0.013
Neg logit contributions
 never-0.008
Or-0.008
 knocking-0.008
 Ign-0.008
 hallway-0.008
Token big
Token ID1263
Feature activation+1.61e+00

Pos logit contributions
ory+0.006
ly+0.006
ipop+0.006
ercise+0.006
les+0.006
Neg logit contributions
 never-0.007
Or-0.007
 knocking-0.007
 zo-0.007
 hallway-0.007
Token rainbow
Token ID27223
Feature activation+6.97e-01

Pos logit contributions
 never+0.036
Or+0.035
 knocking+0.034
 Ign+0.034
 known+0.034
Neg logit contributions
ory-0.033
ly-0.032
ipop-0.031
les-0.030
ercise-0.030
Token appeared
Token ID4120
Feature activation+1.74e+00

Pos logit contributions
 never+0.012
Or+0.012
 knocking+0.012
 horror+0.012
 Ign+0.012
Neg logit contributions
ory-0.017
ly-0.017
ipop-0.017
les-0.016
ercise-0.016
Token in
Token ID287
Feature activation+5.55e+00

Pos logit contributions
Or+0.033
 never+0.032
 horror+0.032
 ambul+0.032
 fact+0.032
Neg logit contributions
ly-0.041
ory-0.040
ipop-0.038
les-0.038
fy-0.037
Token the
Token ID262
Feature activation+1.92e+00

Pos logit contributions
Or+0.136
ric+0.131
pe+0.128
 knocking+0.127
 sneaking+0.125
Neg logit contributions
ory-0.096
 sandwiches-0.095
ly-0.094
 candy-0.093
 album-0.092
Token sky
Token ID6766
Feature activation+2.47e+00

Pos logit contributions
Or+0.032
 never+0.031
 knocking+0.031
 sneaking+0.030
pe+0.030
Neg logit contributions
ory-0.049
ly-0.049
les-0.047
ipop-0.047
fy-0.046
Token!
Token ID0
Feature activation-6.25e-01

Pos logit contributions
Or+0.057
 ambul+0.056
 Ign+0.055
 horror+0.055
 tresp+0.055
Neg logit contributions
ly-0.051
les-0.048
ory-0.048
ipop-0.044
 sandwich-0.043
Token Lily
Token ID20037
Feature activation+1.21e-01

Pos logit contributions
ory+0.018
ipop+0.017
ercise+0.017
les+0.016
ly+0.016
Neg logit contributions
 Ign-0.009
 knocking-0.009
 hallway-0.009
 never-0.009
Or-0.009
Token shouted
Token ID17293
Feature activation-4.18e-01

Pos logit contributions
 collapse+0.002
 horror+0.002
 ambul+0.002
stack+0.002
 basement+0.002
Neg logit contributions
ly-0.003
ory-0.003
les-0.003
ipop-0.003
ercise-0.003
Token naughty
Token ID45128
Feature activation+1.78e+00

Pos logit contributions
 never+0.036
 knocking+0.034
 zo+0.034
 sneaking+0.033
 collapse+0.033
Neg logit contributions
ory-0.052
ipop-0.050
ly-0.050
ercise-0.048
les-0.048
Token,"
Token ID553
Feature activation+1.23e-01

Pos logit contributions
Or+0.039
 never+0.038
 known+0.038
ric+0.037
 hallway+0.037
Neg logit contributions
ly-0.038
ory-0.037
les-0.035
ipop-0.035
fy-0.034
Token Mom
Token ID11254
Feature activation+1.21e+00

Pos logit contributions
Or+0.003
 hallway+0.003
stack+0.003
ric+0.003
pe+0.003
Neg logit contributions
ory-0.002
ipop-0.002
ly-0.002
 sandwich-0.002
 sunglasses-0.002
Tokenmy
Token ID1820
Feature activation+1.35e+00

Pos logit contributions
Or+0.029
stack+0.028
 fact+0.028
 basement+0.028
 hallway+0.028
Neg logit contributions
ly-0.024
ory-0.022
ipop-0.022
les-0.021
shirt-0.020
Token said
Token ID531
Feature activation+4.53e-01

Pos logit contributions
ric+0.040
stack+0.040
 basement+0.039
 hallway+0.039
Or+0.038
Neg logit contributions
ly-0.023
les-0.019
ipop-0.018
ory-0.018
ip-0.016
Token,
Token ID11
Feature activation+5.53e+00

Pos logit contributions
 fact+0.010
Or+0.010
ric+0.009
 known+0.009
 hallway+0.009
Neg logit contributions
ly-0.010
ory-0.009
ipop-0.009
les-0.009
fy-0.009
Token carrying
Token ID6872
Feature activation+1.38e+00

Pos logit contributions
ric+0.101
 known+0.100
elope+0.100
ney+0.097
 Zo+0.093
Neg logit contributions
ly-0.130
 sunglasses-0.123
 candy-0.117
 skirt-0.113
 sandwich-0.112
Token Sara
Token ID24799
Feature activation+1.35e+00

Pos logit contributions
Or+0.033
 never+0.033
 hallway+0.032
ric+0.032
elope+0.032
Neg logit contributions
ly-0.026
ory-0.025
ipop-0.024
les-0.024
 sandwiches-0.023
Token to
Token ID284
Feature activation+1.91e+00

Pos logit contributions
Or+0.028
 fact+0.028
 hallway+0.027
 known+0.027
 never+0.027
Neg logit contributions
ly-0.029
ory-0.028
ipop-0.028
les-0.027
fy-0.027
Token the
Token ID262
Feature activation-2.27e-01

Pos logit contributions
pe+0.051
Or+0.051
Uh+0.050
ric+0.050
 hallway+0.049
Neg logit contributions
ory-0.028
ly-0.028
 groceries-0.027
 candy-0.026
 sunglasses-0.026
Token bathroom
Token ID12436
Feature activation+2.65e+00

Pos logit contributions
ory+0.005
ly+0.005
ipop+0.005
ercise+0.005
les+0.005
Neg logit contributions
 never-0.004
Or-0.004
 knocking-0.004
 Ign-0.004
 horror-0.004
Token and
Token ID290
Feature activation-2.79e+00

Pos logit contributions
 never+0.004
Or+0.004
 knocking+0.004
 hallway+0.004
 Ign+0.004
Neg logit contributions
ory-0.007
ipop-0.006
ercise-0.006
ly-0.006
fy-0.006
Token discovered
Token ID5071
Feature activation-1.87e+00

Pos logit contributions
ory+0.072
ipop+0.070
ly+0.069
fy+0.068
ercise+0.068
Neg logit contributions
 never-0.049
Or-0.045
 knocking-0.045
 sneaking-0.043
 Ign-0.043
Token a
Token ID257
Feature activation-2.04e+00

Pos logit contributions
ory+0.043
ly+0.043
ipop+0.042
les+0.041
ercise+0.040
Neg logit contributions
Or-0.035
 never-0.035
 knocking-0.034
 hallway-0.034
 horror-0.033
Token long
Token ID890
Feature activation+1.00e+00

Pos logit contributions
ory+0.037
ly+0.036
ipop+0.035
ercise+0.034
fy+0.033
Neg logit contributions
 never-0.049
Or-0.048
 knocking-0.047
 horror-0.047
 fact-0.047
Token-
Token ID12
Feature activation-9.33e-01

Pos logit contributions
Or+0.022
 known+0.021
 ambul+0.021
 never+0.021
ric+0.021
Neg logit contributions
ly-0.021
ory-0.020
ipop-0.020
 sandwich-0.019
les-0.019
Tokenfor
Token ID1640
Feature activation+5.52e+00

Pos logit contributions
ory+0.020
ipop+0.019
ercise+0.018
ly+0.018
les+0.017
Neg logit contributions
 never-0.021
pe-0.020
Or-0.020
 zo-0.020
 knocking-0.019
Tokengotten
Token ID21646
Feature activation+5.65e-01

Pos logit contributions
 horror+0.095
 ambul+0.093
Or+0.092
 zo+0.090
Uh+0.089
Neg logit contributions
ly-0.130
ipop-0.130
shirt-0.128
ory-0.126
ercise-0.123
Token t
Token ID256
Feature activation-1.21e+00

Pos logit contributions
 known+0.014
 ambul+0.013
ric+0.013
 collapse+0.013
 resisting+0.013
Neg logit contributions
ly-0.010
 sandwich-0.010
ipop-0.010
ory-0.010
 notebook-0.010
Tokeneddy
Token ID21874
Feature activation+4.61e-01

Pos logit contributions
ercise+0.018
ory+0.017
ipop+0.017
fy+0.016
ly+0.016
Neg logit contributions
 never-0.033
 Ign-0.032
ric-0.032
 hallway-0.031
 zo-0.031
Token bear
Token ID6842
Feature activation-9.67e-02

Pos logit contributions
 never+0.008
 knocking+0.008
Or+0.008
 sneaking+0.008
 fact+0.008
Neg logit contributions
ory-0.011
ly-0.011
ipop-0.011
fy-0.011
ercise-0.011
Token inside
Token ID2641
Feature activation+2.02e+00

Pos logit contributions
ory+0.002
ly+0.002
ipop+0.002
les+0.002
ercise+0.002
Neg logit contributions
 never-0.002
Or-0.002
 knocking-0.002
 horror-0.002
 fact-0.002
Token,
Token ID11
Feature activation+1.40e+00

Pos logit contributions
ory+0.006
ly+0.006
ipop+0.006
ercise+0.006
les+0.006
Neg logit contributions
 never-0.004
Or-0.004
 knocking-0.003
 horror-0.003
 Ign-0.003
Token Tim
Token ID5045
Feature activation+2.01e-01

Pos logit contributions
 never+0.028
Or+0.027
 knocking+0.027
 horror+0.026
 fact+0.026
Neg logit contributions
ory-0.032
ly-0.031
ipop-0.031
les-0.030
ercise-0.030
Token,"
Token ID553
Feature activation+3.00e-01

Pos logit contributions
 never+0.004
Or+0.004
 knocking+0.004
 horror+0.004
 hallway+0.004
Neg logit contributions
ory-0.005
ly-0.005
ipop-0.004
les-0.004
ercise-0.004
Token Sam
Token ID3409
Feature activation+9.60e-01

Pos logit contributions
 never+0.007
Or+0.006
 knocking+0.006
 hallway+0.006
 sneaking+0.006
Neg logit contributions
ory-0.006
ly-0.006
ipop-0.006
ercise-0.006
les-0.006
Token said
Token ID531
Feature activation-1.01e+00

Pos logit contributions
ric+0.027
 basement+0.026
 hallway+0.026
 fact+0.026
stack+0.025
Neg logit contributions
ly-0.016
ipop-0.014
ory-0.014
les-0.014
 sandwich-0.013
Token,
Token ID11
Feature activation+5.52e+00

Pos logit contributions
ly+0.026
ory+0.025
ipop+0.025
les+0.024
fy+0.024
Neg logit contributions
Or-0.018
 fact-0.017
 never-0.017
 hallway-0.017
ric-0.017
Token walking
Token ID6155
Feature activation+2.09e+00

Pos logit contributions
ric+0.124
elope+0.114
 apartment+0.114
 basement+0.113
 fact+0.112
Neg logit contributions
ly-0.124
 sunglasses-0.106
 goodbye-0.105
ipop-0.104
 candy-0.103
Token away
Token ID1497
Feature activation+7.62e-01

Pos logit contributions
 never+0.047
Or+0.046
 zo+0.046
 knocking+0.045
 Ign+0.045
Neg logit contributions
ory-0.042
ipop-0.040
ly-0.040
ercise-0.039
fy-0.039
Token.
Token ID13
Feature activation+3.61e-01

Pos logit contributions
ric+0.023
 fact+0.021
Uh+0.020
 known+0.020
 ambul+0.020
Neg logit contributions
ly-0.014
ory-0.011
ls-0.011
ipop-0.011
les-0.011
Token
Token ID198
Feature activation-2.00e-01

Pos logit contributions
 fact+0.007
ric+0.007
 sneaking+0.007
 tresp+0.007
 hallway+0.007
Neg logit contributions
ory-0.007
ly-0.007
ipop-0.007
 candy-0.007
 peanuts-0.007
Token
Token ID198
Feature activation+3.83e-01

Pos logit contributions
ory+0.006
ly+0.005
ipop+0.005
les+0.005
ercise+0.005
Neg logit contributions
Or-0.003
 never-0.003
 fact-0.003
 sneaking-0.003
 knocking-0.003
Token The
Token ID383
Feature activation+1.48e+00

Pos logit contributions
 never+0.006
pe+0.006
ric+0.006
 tresp+0.006
 fact+0.006
Neg logit contributions
ory-0.007
ly-0.007
ipop-0.007
shirt-0.006
fy-0.006
Token fire
Token ID2046
Feature activation+2.83e+00

Pos logit contributions
 Ign+0.029
 never+0.029
elope+0.029
Con+0.029
 known+0.028
Neg logit contributions
ly-0.032
les-0.030
 sandwich-0.030
 salad-0.030
 sandwiches-0.030
Tokenman
Token ID805
Feature activation+1.18e+00

Pos logit contributions
 Ign+0.055
 accompl+0.054
 sneaking+0.053
 known+0.053
 fact+0.052
Neg logit contributions
ly-0.066
les-0.064
ipop-0.063
ory-0.059
 sandwich-0.058
Token truck
Token ID7779
Feature activation+1.69e+00

Pos logit contributions
 basement+0.020
 Ign+0.020
ric+0.019
 hallway+0.019
 fact+0.019
Neg logit contributions
ly-0.029
ory-0.029
ipop-0.029
les-0.028
fy-0.028
Token stopped
Token ID5025
Feature activation+4.21e+00

Pos logit contributions
ric+0.029
 fact+0.029
elope+0.029
 basement+0.028
 Ign+0.028
Neg logit contributions
ly-0.042
ipop-0.041
les-0.040
ory-0.039
 sandwich-0.039
Token in
Token ID287
Feature activation+5.51e+00

Pos logit contributions
Or+0.081
elope+0.081
stack+0.080
 hallway+0.080
 fact+0.079
Neg logit contributions
ly-0.093
ory-0.092
les-0.087
 sunglasses-0.084
 peanuts-0.081
Token front
Token ID2166
Feature activation+1.13e+00

Pos logit contributions
ric+0.131
Or+0.130
elope+0.125
stack+0.119
 sneaking+0.118
Neg logit contributions
 candy-0.098
 sandwiches-0.091
 snacks-0.090
 salad-0.090
 sandwich-0.090
Token of
Token ID286
Feature activation+3.76e+00

Pos logit contributions
 known+0.032
Or+0.032
elope+0.032
ric+0.032
 never+0.032
Neg logit contributions
ly-0.017
les-0.015
ory-0.014
fy-0.013
ipop-0.013
Token a
Token ID257
Feature activation-9.89e-01

Pos logit contributions
Or+0.089
 hallway+0.086
 never+0.085
ric+0.085
pe+0.084
Neg logit contributions
ory-0.069
ly-0.067
ipop-0.063
 sandwiches-0.061
les-0.060
Token big
Token ID1263
Feature activation+6.39e-01

Pos logit contributions
ory+0.021
ly+0.021
ipop+0.021
fy+0.020
 sandwich+0.020
Neg logit contributions
 never-0.020
Or-0.019
 known-0.019
 Ign-0.019
ric-0.019
Token house
Token ID2156
Feature activation+3.09e+00

Pos logit contributions
 known+0.014
 never+0.014
Or+0.014
ric+0.013
 Ign+0.013
Neg logit contributions
ly-0.013
ory-0.013
 sandwich-0.013
ipop-0.012
les-0.012
Token body
Token ID1767
Feature activation+2.33e+00

Pos logit contributions
 never+0.034
Or+0.034
 known+0.033
 sneaking+0.033
 Ign+0.033
Neg logit contributions
ory-0.034
ly-0.034
ipop-0.032
les-0.031
ercise-0.031
Token come
Token ID1282
Feature activation+1.15e+00

Pos logit contributions
 known+0.047
Or+0.046
elope+0.045
Uh+0.045
 horror+0.044
Neg logit contributions
ly-0.054
les-0.052
ade-0.049
ipop-0.048
 sandwich-0.048
Token out
Token ID503
Feature activation+4.25e+00

Pos logit contributions
 never+0.024
Or+0.024
 knocking+0.023
 sneaking+0.023
 Ign+0.023
Neg logit contributions
ory-0.025
ly-0.024
ipop-0.024
ercise-0.023
les-0.023
Token of
Token ID286
Feature activation+3.10e+00

Pos logit contributions
stack+0.112
Or+0.111
 known+0.111
ric+0.108
elope+0.107
Neg logit contributions
ly-0.074
les-0.067
ory-0.065
fy-0.060
ls-0.059
Token the
Token ID262
Feature activation+1.49e+00

Pos logit contributions
Or+0.077
elope+0.075
ric+0.073
 never+0.072
pe+0.072
Neg logit contributions
 groceries-0.055
 sandwiches-0.050
 candy-0.050
ory-0.049
 salad-0.049
Token chim
Token ID18205
Feature activation+5.51e+00

Pos logit contributions
 known+0.025
Or+0.024
elope+0.023
 glaring+0.023
ena+0.023
Neg logit contributions
 groceries-0.037
 candy-0.035
ly-0.034
 salad-0.034
 peanuts-0.034
Tokenney
Token ID1681
Feature activation+4.71e+00

Pos logit contributions
 sneaking+0.109
 terror+0.108
AAAA+0.106
Are+0.105
 accompl+0.104
Neg logit contributions
s-0.128
ade-0.127
uddle-0.122
les-0.121
ip-0.121
Token.
Token ID13
Feature activation+5.91e-01

Pos logit contributions
 known+0.129
Uh+0.121
AAAA+0.118
ric+0.118
Or+0.117
Neg logit contributions
ly-0.084
les-0.075
fy-0.072
ade-0.069
 sandwich-0.068
Token They
Token ID1119
Feature activation+5.06e-01

Pos logit contributions
 never+0.011
 fact+0.011
 sneaking+0.011
ric+0.011
 collapse+0.011
Neg logit contributions
ory-0.014
ipop-0.013
ly-0.013
 sandwich-0.012
shirt-0.012
Token pulled
Token ID5954
Feature activation+6.33e-01

Pos logit contributions
elope+0.011
stack+0.011
 basement+0.011
ric+0.011
ASH+0.011
Neg logit contributions
ory-0.010
les-0.010
ipop-0.010
 sunglasses-0.009
ly-0.009
Token one
Token ID530
Feature activation-1.38e+00

Pos logit contributions
Or+0.013
 never+0.013
 hallway+0.013
 known+0.013
ric+0.013
Neg logit contributions
ly-0.013
ory-0.013
ipop-0.013
les-0.013
fy-0.012
Token.
Token ID13
Feature activation-7.08e-01

Pos logit contributions
 never+0.066
Or+0.065
 fact+0.063
 known+0.063
 knocking+0.063
Neg logit contributions
ly-0.081
ory-0.080
ipop-0.078
fy-0.076
les-0.076
Token Her
Token ID2332
Feature activation-7.85e-01

Pos logit contributions
ory+0.017
ly+0.017
ipop+0.017
fy+0.016
shirt+0.016
Neg logit contributions
Or-0.012
 never-0.012
pe-0.012
 sneaking-0.012
 fact-0.012
Token mom
Token ID1995
Feature activation+1.18e+00

Pos logit contributions
ory+0.018
ly+0.017
les+0.017
ercise+0.016
fy+0.016
Neg logit contributions
 never-0.017
 horror-0.016
 hallway-0.016
 fact-0.016
 knocking-0.016
Token came
Token ID1625
Feature activation+2.98e+00

Pos logit contributions
 basement+0.025
Or+0.024
ric+0.024
 Ign+0.023
 collapse+0.023
Neg logit contributions
ly-0.027
les-0.026
ipop-0.025
ory-0.024
ip-0.023
Token back
Token ID736
Feature activation+2.40e+00

Pos logit contributions
Or+0.072
 fact+0.069
 collapse+0.068
stack+0.067
ric+0.066
Neg logit contributions
ly-0.057
ory-0.056
les-0.051
shirt-0.049
ipop-0.049
Token in
Token ID287
Feature activation+5.51e+00

Pos logit contributions
Or+0.055
ric+0.055
 fact+0.054
pe+0.053
 known+0.052
Neg logit contributions
ory-0.048
ly-0.046
ipop-0.044
 candy-0.041
 sandwich-0.040
Token the
Token ID262
Feature activation+8.91e-01

Pos logit contributions
ric+0.153
Or+0.142
elope+0.138
stack+0.137
 ambul+0.136
Neg logit contributions
ory-0.087
ly-0.084
 candy-0.079
 sandwich-0.077
 sunglasses-0.076
Token room
Token ID2119
Feature activation+3.22e+00

Pos logit contributions
Or+0.016
ric+0.015
pe+0.015
 never+0.015
 known+0.015
Neg logit contributions
ly-0.022
shirt-0.021
ipop-0.021
ory-0.021
les-0.021
Token and
Token ID290
Feature activation-1.07e+00

Pos logit contributions
ric+0.088
AAAA+0.087
Uh+0.086
 known+0.083
 Zo+0.082
Neg logit contributions
ly-0.058
ory-0.047
les-0.046
 sandwich-0.046
 album-0.045
Token saw
Token ID2497
Feature activation+7.79e-01

Pos logit contributions
ory+0.021
ly+0.021
ipop+0.021
les+0.020
ercise+0.019
Neg logit contributions
ric-0.024
 fact-0.023
pe-0.023
 horror-0.023
 hallway-0.023
Token what
Token ID644
Feature activation-1.13e+00

Pos logit contributions
Or+0.017
 hallway+0.017
 never+0.017
elope+0.017
pe+0.016
Neg logit contributions
ory-0.016
ly-0.016
ipop-0.016
les-0.015
ip-0.014
Token scared
Token ID12008
Feature activation+7.66e-01

Pos logit contributions
 never+0.008
 knocking+0.007
Or+0.007
 zo+0.007
 collapse+0.007
Neg logit contributions
ory-0.014
ly-0.014
ipop-0.014
ercise-0.013
shirt-0.013
Token you
Token ID345
Feature activation+8.79e-01

Pos logit contributions
Or+0.016
 never+0.016
 knocking+0.015
 hallway+0.015
 horror+0.015
Neg logit contributions
ory-0.017
ly-0.017
ipop-0.016
les-0.016
ercise-0.015
Token."
Token ID526
Feature activation-2.64e-02

Pos logit contributions
 never+0.016
Or+0.015
 knocking+0.015
 sneaking+0.015
 collapse+0.015
Neg logit contributions
ory-0.022
ly-0.021
ipop-0.021
ercise-0.020
les-0.020
Token Mom
Token ID11254
Feature activation+1.57e+00

Pos logit contributions
ory+0.001
ly+0.001
ipop+0.001
ercise+0.001
les+0.001
Neg logit contributions
 never-0.001
Or-0.001
 knocking-0.001
 sneaking-0.001
 hallway-0.001
Token said
Token ID531
Feature activation-9.92e-02

Pos logit contributions
Or+0.041
ric+0.040
 collapse+0.039
 basement+0.039
 never+0.039
Neg logit contributions
ly-0.027
ory-0.027
ipop-0.026
les-0.024
fy-0.023
Token,
Token ID11
Feature activation+5.50e+00

Pos logit contributions
ly+0.003
ory+0.002
ipop+0.002
les+0.002
fy+0.002
Neg logit contributions
Or-0.002
 fact-0.002
 hallway-0.002
ric-0.002
 never-0.002
Token hugging
Token ID46292
Feature activation-2.70e-01

Pos logit contributions
ric+0.106
 basement+0.103
 known+0.102
 fact+0.101
elope+0.099
Neg logit contributions
ly-0.121
ipop-0.113
 sunglasses-0.113
 salad-0.110
 sandwich-0.110
Token them
Token ID606
Feature activation+4.80e-01

Pos logit contributions
ory+0.006
ly+0.006
ipop+0.005
ercise+0.005
fy+0.005
Neg logit contributions
 never-0.006
Or-0.006
 knocking-0.006
 horror-0.005
 Ign-0.005
Token.
Token ID13
Feature activation+1.49e+00

Pos logit contributions
 never+0.008
Or+0.008
 knocking+0.008
 horror+0.007
 zo+0.007
Neg logit contributions
ory-0.012
ly-0.012
ipop-0.012
ercise-0.012
les-0.012
Token
Token ID198
Feature activation+8.33e-01

Pos logit contributions
ric+0.033
 tresp+0.031
 collapse+0.031
 sneaking+0.030
pe+0.030
Neg logit contributions
ory-0.029
ipop-0.028
ly-0.028
bs-0.027
 candy-0.027
Token
Token ID198
Feature activation+1.15e+00

Pos logit contributions
 fact+0.015
 known+0.015
 sneaking+0.015
Or+0.015
 hallway+0.015
Neg logit contributions
ory-0.018
ly-0.017
ade-0.017
 sandwiches-0.017
ipop-0.017
Token
Token ID198
Feature activation+9.77e-02

Pos logit contributions
 never+0.024
Or+0.024
 knocking+0.023
 Ign+0.023
 horror+0.023
Neg logit contributions
ory-0.026
ly-0.025
ipop-0.025
ercise-0.024
fy-0.024
Token
Token ID198
Feature activation+1.66e-01

Pos logit contributions
 never+0.001
Or+0.001
 knocking+0.001
 horror+0.001
 Ign+0.001
Neg logit contributions
ory-0.003
ly-0.003
ipop-0.003
ercise-0.003
les-0.003
TokenM
Token ID44
Feature activation+2.26e-01

Pos logit contributions
 never+0.002
Or+0.002
 knocking+0.002
 horror+0.002
 Ign+0.002
Neg logit contributions
ory-0.005
ly-0.005
ipop-0.005
ercise-0.005
les-0.005
Tokenum
Token ID388
Feature activation+8.26e-01

Pos logit contributions
 never+0.006
 horror+0.006
Or+0.006
elope+0.006
 fact+0.006
Neg logit contributions
ory-0.003
ly-0.003
ercise-0.003
shirt-0.003
ipop-0.003
Token said
Token ID531
Feature activation+2.77e-01

Pos logit contributions
Or+0.018
 never+0.017
 knocking+0.017
 collapse+0.017
 basement+0.017
Neg logit contributions
ory-0.018
ly-0.017
ipop-0.017
les-0.016
fy-0.016
Token,
Token ID11
Feature activation+5.50e+00

Pos logit contributions
 never+0.005
Or+0.005
 knocking+0.004
 hallway+0.004
 fact+0.004
Neg logit contributions
ory-0.007
ly-0.007
ipop-0.007
les-0.007
fy-0.007
Token "
Token ID366
Feature activation+4.12e+00

Pos logit contributions
 basement+0.125
 ambul+0.123
 collapse+0.120
 tresp+0.119
 fact+0.117
Neg logit contributions
ory-0.112
ly-0.112
ipop-0.107
 handsome-0.096
 sandwich-0.095
TokenYes
Token ID5297
Feature activation-1.26e-01

Pos logit contributions
 knocking+0.089
 never+0.088
 collapse+0.087
 horror+0.087
 hallway+0.087
Neg logit contributions
ory-0.085
ly-0.082
ipop-0.080
ercise-0.078
fy-0.077
Token Daisy
Token ID40355
Feature activation-3.21e-01

Pos logit contributions
ercise+0.004
ory+0.004
ipop+0.004
shirt+0.004
ouses+0.003
Neg logit contributions
 never-0.002
 horror-0.002
 fact-0.002
 no-0.002
 Ign-0.001
Token,
Token ID11
Feature activation+1.82e+00

Pos logit contributions
ory+0.009
ercise+0.009
ly+0.009
shirt+0.009
ipop+0.009
Neg logit contributions
 never-0.005
elope-0.004
 fact-0.004
Or-0.004
pe-0.004
Token we
Token ID356
Feature activation-6.31e-01

Pos logit contributions
 never+0.037
 no+0.032
 only+0.031
Or+0.031
 Ign+0.031
Neg logit contributions
ory-0.045
les-0.044
ipop-0.043
ercise-0.043
ouses-0.042
Token the
Token ID262
Feature activation+2.15e+00

Pos logit contributions
Or+0.004
ric+0.004
pe+0.004
 zo+0.004
 glaring+0.003
Neg logit contributions
ly-0.004
ory-0.004
ipop-0.004
ercise-0.004
fy-0.004
Token sun
Token ID4252
Feature activation+1.12e+00

Pos logit contributions
 never+0.044
Or+0.043
 knocking+0.043
 hallway+0.042
 fact+0.042
Neg logit contributions
ory-0.048
ly-0.047
ipop-0.046
les-0.044
ercise-0.044
Token was
Token ID373
Feature activation+3.67e+00

Pos logit contributions
 never+0.014
Or+0.014
 knocking+0.014
 Ign+0.014
 horror+0.014
Neg logit contributions
ory-0.032
ly-0.032
ipop-0.032
les-0.031
ercise-0.031
Token high
Token ID1029
Feature activation+2.06e+00

Pos logit contributions
elope+0.070
Or+0.069
 zo+0.069
 Zo+0.068
 Ign+0.068
Neg logit contributions
ly-0.081
ory-0.079
fy-0.078
ercise-0.077
 sunglasses-0.076
Token up
Token ID510
Feature activation+2.79e+00

Pos logit contributions
 never+0.040
Or+0.040
 horror+0.039
ric+0.039
 knocking+0.039
Neg logit contributions
ly-0.047
ory-0.046
les-0.044
ipop-0.044
fy-0.043
Token in
Token ID287
Feature activation+5.49e+00

Pos logit contributions
 never+0.053
Or+0.053
 knocking+0.051
 horror+0.051
 Ign+0.051
Neg logit contributions
ory-0.066
ly-0.064
ipop-0.062
les-0.060
ercise-0.060
Token the
Token ID262
Feature activation+1.80e+00

Pos logit contributions
 sneaking+0.134
ric+0.134
Or+0.133
 tresp+0.130
pe+0.130
Neg logit contributions
ory-0.095
 salad-0.088
 sunglasses-0.087
 album-0.086
 candy-0.086
Token sky
Token ID6766
Feature activation+1.90e+00

Pos logit contributions
 never+0.026
Or+0.025
 hallway+0.025
 knocking+0.024
 fact+0.024
Neg logit contributions
ory-0.051
ly-0.049
ipop-0.049
ercise-0.048
les-0.047
Token and
Token ID290
Feature activation-9.84e-01

Pos logit contributions
Or+0.040
 Ign+0.039
 horror+0.039
 zo+0.038
 ambul+0.038
Neg logit contributions
ly-0.042
ory-0.040
les-0.038
fy-0.037
ipop-0.037
Token the
Token ID262
Feature activation+2.01e+00

Pos logit contributions
ly+0.021
ory+0.021
ipop+0.020
ercise+0.020
les+0.020
Neg logit contributions
Or-0.020
 never-0.020
pe-0.019
 Ign-0.019
ric-0.019
Token fort
Token ID6285
Feature activation+2.45e+00

Pos logit contributions
 never+0.040
 hallway+0.039
 knocking+0.038
Or+0.038
 fact+0.038
Neg logit contributions
ory-0.047
ipop-0.045
ly-0.045
ercise-0.043
fy-0.043
Token and
Token ID290
Feature activation-1.72e+00

Pos logit contributions
Or+0.060
 known+0.059
ric+0.058
 horror+0.058
 fact+0.058
Neg logit contributions
ly-0.068
ipop-0.063
ory-0.063
fy-0.062
les-0.061
Token was
Token ID373
Feature activation+2.86e+00

Pos logit contributions
ory+0.040
ly+0.040
ipop+0.039
ercise+0.038
fy+0.037
Neg logit contributions
Or-0.032
 ambul-0.031
ric-0.031
 fact-0.031
 horror-0.031
Token soon
Token ID2582
Feature activation+4.58e-01

Pos logit contributions
Or+0.052
 basement+0.050
pe+0.050
 ambul+0.050
Uh+0.049
Neg logit contributions
ipop-0.065
fy-0.065
ory-0.064
ly-0.063
ercise-0.063
Token high
Token ID1029
Feature activation+2.11e+00

Pos logit contributions
Or+0.009
 never+0.009
 fact+0.009
 hallway+0.009
 ambul+0.009
Neg logit contributions
ory-0.010
ly-0.010
ipop-0.010
les-0.010
ercise-0.010
Token up
Token ID510
Feature activation+3.21e+00

Pos logit contributions
Or+0.048
 never+0.046
 horror+0.046
 known+0.046
ric+0.045
Neg logit contributions
ly-0.046
ory-0.043
les-0.041
ipop-0.039
fy-0.038
Token in
Token ID287
Feature activation+5.48e+00

Pos logit contributions
Or+0.069
 ambul+0.067
 horror+0.066
 never+0.066
ric+0.066
Neg logit contributions
ory-0.069
ly-0.068
les-0.063
ipop-0.063
fy-0.061
Token the
Token ID262
Feature activation+1.57e+00

Pos logit contributions
Or+0.149
ric+0.140
 ambul+0.139
 sneaking+0.137
 knocking+0.135
Neg logit contributions
ory-0.078
 candy-0.073
 salad-0.072
les-0.071
 sandwich-0.069
Token tree
Token ID5509
Feature activation+3.45e+00

Pos logit contributions
Or+0.030
 never+0.029
 ambul+0.029
 saving+0.029
 sneaking+0.029
Neg logit contributions
ly-0.036
ory-0.035
les-0.034
fy-0.034
ercise-0.034
Token.
Token ID13
Feature activation-2.31e-01

Pos logit contributions
Or+0.075
Uh+0.073
 ambul+0.073
 horror+0.073
 fact+0.072
Neg logit contributions
ly-0.071
ory-0.067
les-0.066
fy-0.065
ipop-0.064
Token As
Token ID1081
Feature activation+6.77e-01

Pos logit contributions
ory+0.006
ly+0.005
ipop+0.005
fy+0.005
shirt+0.005
Neg logit contributions
Or-0.004
 never-0.004
 sneaking-0.004
 knocking-0.004
 fact-0.004
Token he
Token ID339
Feature activation+5.47e-01

Pos logit contributions
ric+0.020
Or+0.019
pe+0.019
 basement+0.019
 horror+0.019
Neg logit contributions
ory-0.010
ly-0.009
ipop-0.008
fy-0.008
ip-0.008
Token them
Token ID606
Feature activation-2.96e+00

Pos logit contributions
ory+0.027
ipop+0.026
ly+0.026
ercise+0.025
fy+0.025
Neg logit contributions
 never-0.029
Or-0.028
 knocking-0.027
 fact-0.027
 Ign-0.027
Token gently
Token ID15165
Feature activation-2.79e-01

Pos logit contributions
ory+0.062
ly+0.059
ipop+0.059
ercise+0.057
les+0.056
Neg logit contributions
 never-0.064
Or-0.063
 knocking-0.062
 Ign-0.061
 collapse-0.061
Token."
Token ID526
Feature activation-2.86e-01

Pos logit contributions
ly+0.006
ory+0.006
ipop+0.006
les+0.006
fy+0.006
Neg logit contributions
 never-0.006
Or-0.006
 horror-0.006
 known-0.006
elope-0.006
Token Mom
Token ID11254
Feature activation+1.36e+00

Pos logit contributions
ly+0.008
fy+0.007
ory+0.007
ercise+0.007
shirt+0.007
Neg logit contributions
 horror-0.005
 never-0.005
 Ign-0.005
 zo-0.004
Or-0.004
Token said
Token ID531
Feature activation+1.75e-01

Pos logit contributions
 never+0.032
Or+0.032
 knocking+0.031
 fact+0.031
 collapse+0.031
Neg logit contributions
ory-0.026
ly-0.025
ipop-0.025
les-0.024
ercise-0.023
Token,
Token ID11
Feature activation+5.47e+00

Pos logit contributions
 never+0.003
Or+0.003
 fact+0.003
 hallway+0.003
ric+0.003
Neg logit contributions
ly-0.004
ory-0.004
ipop-0.004
les-0.004
fy-0.004
Token giving
Token ID3501
Feature activation-8.83e-01

Pos logit contributions
ric+0.104
 basement+0.101
elope+0.101
 fact+0.101
 known+0.100
Neg logit contributions
ly-0.128
ipop-0.119
 sunglasses-0.117
ory-0.117
 salad-0.115
Token them
Token ID606
Feature activation-1.06e+00

Pos logit contributions
ipop+0.025
fy+0.025
ouse+0.024
ory+0.024
ouses+0.024
Neg logit contributions
 never-0.014
 only-0.013
 no-0.012
 nine-0.012
 horror-0.012
Token a
Token ID257
Feature activation-2.87e+00

Pos logit contributions
ory+0.031
ipop+0.031
actus+0.030
ly+0.030
fy+0.030
Neg logit contributions
 never-0.015
 no-0.013
 only-0.013
 ambul-0.013
 nine-0.012
Token soft
Token ID2705
Feature activation-1.67e+00

Pos logit contributions
ory+0.061
ly+0.060
ercise+0.056
les+0.055
ipop+0.053
Neg logit contributions
 zo-0.061
 hallway-0.061
Or-0.060
 never-0.060
 knocking-0.060
Token towel
Token ID24808
Feature activation+7.10e-02

Pos logit contributions
ory+0.030
ly+0.029
ipop+0.028
ercise+0.028
fy+0.027
Neg logit contributions
 never-0.041
Or-0.040
 knocking-0.040
 fact-0.039
 hallway-0.039
Token They
Token ID1119
Feature activation-6.12e-01

Pos logit contributions
 never+0.009
Or+0.008
 knocking+0.008
 horror+0.008
 sneaking+0.008
Neg logit contributions
ory-0.015
ly-0.015
ipop-0.015
ercise-0.014
les-0.014
Token put
Token ID1234
Feature activation-9.52e-01

Pos logit contributions
ory+0.016
ly+0.016
ipop+0.015
les+0.015
fy+0.015
Neg logit contributions
Or-0.010
 horror-0.010
 collapse-0.010
 knocking-0.010
 never-0.010
Token on
Token ID319
Feature activation-3.36e-01

Pos logit contributions
ory+0.026
ly+0.024
ipop+0.024
fy+0.024
ercise+0.023
Neg logit contributions
 never-0.015
 fact-0.014
 saving-0.014
Or-0.014
 only-0.014
Token hats
Token ID23910
Feature activation-3.26e+00

Pos logit contributions
ory+0.009
ipop+0.008
ly+0.008
ercise+0.008
fy+0.008
Neg logit contributions
 never-0.006
 hallway-0.006
 ambul-0.006
 knocking-0.006
 fact-0.006
Token and
Token ID290
Feature activation-4.38e+00

Pos logit contributions
ory+0.092
ly+0.088
ipop+0.088
ercise+0.087
fy+0.086
Neg logit contributions
 never-0.048
Or-0.044
 knocking-0.043
 hallway-0.043
pe-0.042
Token brace
Token ID22581
Feature activation-7.90e+00

Pos logit contributions
ory+0.080
actus+0.076
les+0.074
ipop+0.074
ly+0.073
Neg logit contributions
 never-0.104
 hallway-0.103
 zo-0.100
 ambul-0.098
Or-0.098
Tokenlets
Token ID5289
Feature activation-4.39e+00

Pos logit contributions
affles+0.147
 salad+0.146
ipop+0.141
ory+0.140
actus+0.138
Neg logit contributions
 never-0.173
gy-0.172
Or-0.160
elope-0.157
 extingu-0.152
Token and
Token ID290
Feature activation-3.93e+00

Pos logit contributions
ory+0.127
ercise+0.122
ly+0.122
ipop+0.121
fy+0.120
Neg logit contributions
 never-0.064
Or-0.058
pe-0.058
 turning-0.055
stack-0.055
Token were
Token ID547
Feature activation+1.48e+00

Pos logit contributions
ory+0.087
ipop+0.082
les+0.081
actus+0.080
ly+0.080
Neg logit contributions
 never-0.087
Or-0.080
 hallway-0.079
 only-0.077
 zo-0.077
Token ready
Token ID3492
Feature activation+7.50e-01

Pos logit contributions
 never+0.025
 knocking+0.023
Or+0.023
 known+0.022
 sneaking+0.022
Neg logit contributions
ory-0.040
ly-0.040
ipop-0.038
les-0.038
ercise-0.037
Token to
Token ID284
Feature activation+6.76e-01

Pos logit contributions
 never+0.017
 knocking+0.016
 turning+0.016
Or+0.016
 sneaking+0.016
Neg logit contributions
ory-0.016
ipop-0.015
ercise-0.014
les-0.014
actus-0.014
Token,
Token ID11
Feature activation-1.60e+00

Pos logit contributions
ory+0.031
ipop+0.030
ly+0.030
ercise+0.029
fy+0.029
Neg logit contributions
 never-0.025
 knocking-0.023
 sneaking-0.023
Or-0.023
 saving-0.023
Token for
Token ID329
Feature activation+2.30e+00

Pos logit contributions
ly+0.037
ory+0.037
ercise+0.036
les+0.035
ipop+0.035
Neg logit contributions
 never-0.032
 knocking-0.029
 hallway-0.029
 sneaking-0.029
 Ign-0.029
Token she
Token ID673
Feature activation+1.20e-02

Pos logit contributions
 never+0.041
Or+0.040
 knocking+0.039
 sneaking+0.039
 fact+0.038
Neg logit contributions
ory-0.057
ly-0.056
ipop-0.055
les-0.053
fy-0.053
Token had
Token ID550
Feature activation-7.65e-01

Pos logit contributions
Or+0.000
 never+0.000
 knocking+0.000
 horror+0.000
 Ign+0.000
Neg logit contributions
ory-0.000
ly-0.000
ipop-0.000
les-0.000
ercise-0.000
Token a
Token ID257
Feature activation-3.38e+00

Pos logit contributions
ory+0.020
ly+0.020
ipop+0.020
ercise+0.019
les+0.019
Neg logit contributions
 never-0.012
Or-0.012
 knocking-0.012
 fact-0.011
 sneaking-0.011
Token brand
Token ID4508
Feature activation-7.58e+00

Pos logit contributions
ly+0.067
les+0.064
ory+0.062
ercise+0.060
ipop+0.059
Neg logit contributions
 never-0.081
 zo-0.077
 hallway-0.077
 fact-0.077
Or-0.076
Token new
Token ID649
Feature activation-2.36e+00

Pos logit contributions
ercise+0.140
ory+0.136
les+0.130
bs+0.127
fy+0.126
Neg logit contributions
 zo-0.166
 never-0.165
 Ign-0.159
 fore-0.156
 basement-0.155
Token bicycle
Token ID17026
Feature activation-1.08e+00

Pos logit contributions
ory+0.041
ly+0.040
ipop+0.039
les+0.038
ercise+0.038
Neg logit contributions
 never-0.059
Or-0.058
 hallway-0.057
 fact-0.057
 knocking-0.057
Token.
Token ID13
Feature activation-9.91e-01

Pos logit contributions
ory+0.025
ly+0.025
ipop+0.024
ercise+0.024
les+0.023
Neg logit contributions
 never-0.021
Or-0.020
 knocking-0.020
 Ign-0.019
 horror-0.019
Token She
Token ID1375
Feature activation+1.48e-01

Pos logit contributions
ory+0.026
ly+0.026
ipop+0.025
ercise+0.025
les+0.025
Neg logit contributions
 never-0.016
Or-0.015
 knocking-0.015
 Ign-0.015
 hallway-0.015
Token proudly
Token ID23017
Feature activation-5.24e-01

Pos logit contributions
 never+0.002
Or+0.002
 knocking+0.002
 fact+0.002
 hallway+0.002
Neg logit contributions
ory-0.004
ly-0.004
ipop-0.004
ercise-0.004
fy-0.004
Token the
Token ID262
Feature activation-1.41e+00

Pos logit contributions
 never+0.006
 knocking+0.006
 hallway+0.005
 Ign+0.005
 fact+0.005
Neg logit contributions
ory-0.010
ipop-0.010
ly-0.010
fy-0.010
les-0.010
Token store
Token ID3650
Feature activation+8.65e-01

Pos logit contributions
ory+0.038
ly+0.038
ipop+0.037
les+0.036
fy+0.036
Neg logit contributions
 never-0.022
Or-0.021
 knocking-0.021
 hallway-0.020
 fact-0.020
Token to
Token ID284
Feature activation+6.60e-01

Pos logit contributions
Or+0.017
 horror+0.017
 never+0.017
ric+0.017
 collapse+0.017
Neg logit contributions
ory-0.020
ly-0.019
ipop-0.018
les-0.018
fy-0.018
Token buy
Token ID2822
Feature activation-2.65e+00

Pos logit contributions
 never+0.015
Or+0.014
 knocking+0.014
 Ign+0.014
 fact+0.014
Neg logit contributions
ory-0.013
ly-0.013
ipop-0.013
les-0.012
ercise-0.012
Token a
Token ID257
Feature activation-4.01e+00

Pos logit contributions
ory+0.082
fy+0.082
ipop+0.081
ercise+0.076
night+0.076
Neg logit contributions
 only-0.032
 ambul-0.031
 no-0.030
 never-0.030
 nine-0.029
Token brand
Token ID4508
Feature activation-7.46e+00

Pos logit contributions
ly+0.081
ory+0.078
les+0.077
ipop+0.073
ercise+0.070
Neg logit contributions
 knocking-0.093
 hallway-0.093
 basement-0.093
 zo-0.092
 turning-0.087
Token new
Token ID649
Feature activation-1.71e+00

Pos logit contributions
ory+0.154
fy+0.154
iper+0.151
ercise+0.150
 fries+0.143
Neg logit contributions
 zo-0.143
 factory-0.134
 site-0.132
 office-0.129
 building-0.128
Token computer
Token ID3644
Feature activation+4.39e-01

Pos logit contributions
ory+0.038
ipop+0.036
ly+0.036
fy+0.036
les+0.034
Neg logit contributions
 hallway-0.036
 knocking-0.035
 fact-0.035
 basement-0.035
Or-0.034
Token that
Token ID326
Feature activation-1.48e+00

Pos logit contributions
 never+0.008
Or+0.007
 knocking+0.007
 hallway+0.007
 Ign+0.007
Neg logit contributions
ory-0.011
ly-0.011
ipop-0.011
ercise-0.010
les-0.010
Token was
Token ID373
Feature activation+2.89e+00

Pos logit contributions
ory+0.031
ly+0.031
ipop+0.031
ercise+0.030
shirt+0.029
Neg logit contributions
ric-0.030
 horror-0.030
 glaring-0.029
Or-0.029
 sneaking-0.029
Token even
Token ID772
Feature activation-1.26e+00

Pos logit contributions
 never+0.049
Or+0.048
 knocking+0.047
 horror+0.046
 Ign+0.046
Neg logit contributions
ory-0.073
ly-0.072
ipop-0.070
ercise-0.068
fy-0.068
Token You
Token ID921
Feature activation+2.47e-01

Pos logit contributions
ory+0.010
ly+0.009
ipop+0.009
ercise+0.009
fy+0.009
Neg logit contributions
 never-0.008
Or-0.007
 knocking-0.007
 collapse-0.007
 hallway-0.007
Token can
Token ID460
Feature activation-4.14e-02

Pos logit contributions
Or+0.005
 horror+0.005
 knocking+0.005
 hallway+0.005
 sneaking+0.005
Neg logit contributions
ory-0.005
ly-0.005
ipop-0.005
ercise-0.005
fy-0.005
Token make
Token ID787
Feature activation-2.35e+00

Pos logit contributions
ory+0.001
ly+0.001
ipop+0.001
les+0.001
ercise+0.001
Neg logit contributions
 never-0.001
Or-0.001
 knocking-0.001
 fact-0.001
 Ign-0.001
Token knots
Token ID33620
Feature activation-1.41e+00

Pos logit contributions
ory+0.059
ipop+0.059
ly+0.057
fy+0.055
ercise+0.054
Neg logit contributions
 never-0.043
 knocking-0.042
 zo-0.042
 Ign-0.041
 sneaking-0.040
Token and
Token ID290
Feature activation-2.91e+00

Pos logit contributions
ory+0.038
ly+0.037
ipop+0.036
ercise+0.035
les+0.035
Neg logit contributions
 never-0.022
Or-0.022
 knocking-0.021
 zo-0.021
 sneaking-0.021
Token brace
Token ID22581
Feature activation-7.45e+00

Pos logit contributions
ory+0.067
ly+0.064
ipop+0.064
les+0.061
ercise+0.061
Neg logit contributions
 never-0.058
Or-0.057
 knocking-0.056
 zo-0.055
 sneaking-0.054
Tokenlets
Token ID5289
Feature activation-4.44e+00

Pos logit contributions
ipop+0.130
 groceries+0.126
ory+0.124
affles+0.122
 salad+0.121
Neg logit contributions
 never-0.188
Or-0.179
 zo-0.171
 nine-0.170
gy-0.167
Token and
Token ID290
Feature activation-3.33e+00

Pos logit contributions
ory+0.121
ipop+0.117
ly+0.116
ercise+0.114
fy+0.113
Neg logit contributions
 never-0.072
Or-0.068
pe-0.065
 zo-0.064
 knocking-0.063
Token neck
Token ID7393
Feature activation-3.08e+00

Pos logit contributions
ory+0.076
ly+0.072
ipop+0.072
les+0.069
fy+0.068
Neg logit contributions
 never-0.069
Or-0.067
 knocking-0.066
 zo-0.065
 fact-0.064
Tokenl
Token ID75
Feature activation-6.84e-01

Pos logit contributions
ory+0.073
ipop+0.069
night+0.066
fy+0.065
ercise+0.065
Neg logit contributions
 never-0.063
pe-0.061
 zo-0.060
Or-0.059
ric-0.057
Tokenaces
Token ID2114
Feature activation-4.17e+00

Pos logit contributions
ly+0.011
ory+0.011
les+0.011
ipop+0.011
ercise+0.010
Neg logit contributions
 never-0.018
Or-0.018
 horror-0.017
 hallway-0.017
 knocking-0.017
Token.
Token ID13
Feature activation-4.48e-01

Pos logit contributions
 never+0.012
 horror+0.012
Or+0.012
 fact+0.011
 knocking+0.011
Neg logit contributions
ory-0.015
ly-0.015
ipop-0.014
fy-0.014
ercise-0.014
Token They
Token ID1119
Feature activation-6.59e-01

Pos logit contributions
ory+0.011
ipop+0.010
ly+0.010
shirt+0.010
ercise+0.010
Neg logit contributions
 never-0.008
Or-0.008
 knocking-0.008
 collapse-0.008
 sneaking-0.008
Token make
Token ID787
Feature activation-1.62e+00

Pos logit contributions
ory+0.016
ly+0.016
ipop+0.015
ercise+0.015
les+0.015
Neg logit contributions
 never-0.012
Or-0.012
 knocking-0.011
 horror-0.011
 Ign-0.011
Token rings
Token ID13917
Feature activation-4.06e+00

Pos logit contributions
ory+0.038
ipop+0.037
ly+0.036
fy+0.035
ercise+0.035
Neg logit contributions
 never-0.032
 knocking-0.031
Or-0.031
 zo-0.030
 Ign-0.030
Token and
Token ID290
Feature activation-2.24e+00

Pos logit contributions
ory+0.123
ipop+0.123
ercise+0.121
shirt+0.118
fy+0.117
Neg logit contributions
 never-0.055
Or-0.049
 zo-0.048
 knocking-0.048
 saving-0.047
Token brace
Token ID22581
Feature activation-7.40e+00

Pos logit contributions
ory+0.047
ly+0.047
ipop+0.045
ercise+0.044
les+0.044
Neg logit contributions
 never-0.046
Or-0.045
 Ign-0.045
 ambul-0.044
 horror-0.044
Tokenlets
Token ID5289
Feature activation-4.19e+00

Pos logit contributions
ipop+0.120
 groceries+0.114
ory+0.114
affles+0.110
shirt+0.107
Neg logit contributions
 never-0.201
Or-0.190
ric-0.182
 zo-0.181
elope-0.181
Token and
Token ID290
Feature activation-3.52e+00

Pos logit contributions
ory+0.112
ipop+0.107
ly+0.107
ercise+0.106
fy+0.105
Neg logit contributions
 never-0.070
Or-0.066
pe-0.064
 knocking-0.064
 zo-0.063
Token neck
Token ID7393
Feature activation-3.30e+00

Pos logit contributions
ory+0.064
ly+0.061
ipop+0.060
les+0.058
ercise+0.057
Neg logit contributions
 never-0.086
Or-0.085
 knocking-0.083
 zo-0.083
 fact-0.082
Tokenl
Token ID75
Feature activation-1.03e+00

Pos logit contributions
ory+0.070
ipop+0.066
ercise+0.065
fy+0.064
night+0.062
Neg logit contributions
 never-0.074
pe-0.072
 zo-0.071
Or-0.070
 knocking-0.069
Tokenaces
Token ID2114
Feature activation-3.75e+00

Pos logit contributions
ly+0.017
ory+0.017
ipop+0.017
les+0.016
ercise+0.016
Neg logit contributions
 never-0.026
Or-0.026
 knocking-0.025
 hallway-0.025
 horror-0.025
Token a
Token ID257
Feature activation-1.53e+00

Pos logit contributions
ory+0.017
ly+0.016
ipop+0.016
ercise+0.016
les+0.016
Neg logit contributions
 never-0.009
Or-0.009
 knocking-0.008
 horror-0.008
 hallway-0.008
Token little
Token ID1310
Feature activation+4.98e-01

Pos logit contributions
ory+0.031
ly+0.030
ipop+0.030
ercise+0.029
les+0.029
Neg logit contributions
 never-0.033
Or-0.033
 knocking-0.032
 zo-0.032
 hallway-0.032
Token girl
Token ID2576
Feature activation+1.13e+00

Pos logit contributions
 zo+0.010
 never+0.010
Or+0.010
 knocking+0.010
 fact+0.009
Neg logit contributions
ory-0.011
ly-0.010
ercise-0.010
ipop-0.010
shirt-0.010
Token who
Token ID508
Feature activation-6.50e-01

Pos logit contributions
 collapse+0.019
Uh+0.018
Or+0.018
 horror+0.017
ric+0.017
Neg logit contributions
ly-0.030
ory-0.029
ip-0.028
shirt-0.027
fy-0.027
Token loved
Token ID6151
Feature activation-2.92e-01

Pos logit contributions
ly+0.015
ory+0.015
fy+0.014
ipop+0.014
shirt+0.014
Neg logit contributions
 horror-0.012
stack-0.012
 ambul-0.012
Or-0.012
ric-0.012
Token brace
Token ID22581
Feature activation-7.35e+00

Pos logit contributions
ory+0.007
fy+0.007
ly+0.007
les+0.006
ipop+0.006
Neg logit contributions
 ambul-0.006
 knocking-0.006
 never-0.006
 turning-0.006
 sneaking-0.006
Tokenlets
Token ID5289
Feature activation-3.84e+00

Pos logit contributions
 groceries+0.117
ercise+0.114
 fries+0.114
ipop+0.111
shirt+0.108
Neg logit contributions
 never-0.187
ric-0.178
gy-0.174
pe-0.171
 only-0.171
Token.
Token ID13
Feature activation-1.02e+00

Pos logit contributions
ory+0.087
ly+0.087
ipop+0.084
les+0.082
fy+0.080
Neg logit contributions
Or-0.074
 horror-0.073
 knocking-0.073
 collapse-0.073
 never-0.072
Token One
Token ID1881
Feature activation-2.36e-01

Pos logit contributions
ory+0.028
ly+0.028
ipop+0.027
ercise+0.026
les+0.026
Neg logit contributions
 never-0.015
Or-0.015
 knocking-0.015
 sneaking-0.014
 horror-0.014
Token day
Token ID1110
Feature activation+1.53e+00

Pos logit contributions
ory+0.006
ercise+0.006
les+0.006
ly+0.006
fy+0.006
Neg logit contributions
 never-0.004
 only-0.004
 hallway-0.004
elope-0.004
Or-0.003
Token,
Token ID11
Feature activation-4.93e-01

Pos logit contributions
 never+0.018
 knocking+0.017
Or+0.016
 sneaking+0.016
 turning+0.016
Neg logit contributions
ory-0.047
ly-0.046
ercise-0.045
les-0.045
fy-0.045
Token beautiful
Token ID4950
Feature activation-7.67e-01

Pos logit contributions
 never+0.032
Or+0.031
 knocking+0.031
 fact+0.030
 Ign+0.030
Neg logit contributions
ory-0.036
ly-0.035
ipop-0.035
ercise-0.033
les-0.033
Token neck
Token ID7393
Feature activation-3.01e+00

Pos logit contributions
ory+0.019
ipop+0.018
ly+0.018
ercise+0.018
fy+0.018
Neg logit contributions
 never-0.013
Or-0.013
 fact-0.013
 zo-0.013
 knocking-0.013
Tokenl
Token ID75
Feature activation-2.40e-01

Pos logit contributions
ory+0.077
ercise+0.076
ipop+0.073
shirt+0.071
night+0.070
Neg logit contributions
 never-0.052
pe-0.049
ric-0.045
 fact-0.045
Or-0.045
Tokenaces
Token ID2114
Feature activation-3.39e+00

Pos logit contributions
ory+0.005
ly+0.005
ipop+0.005
les+0.005
ercise+0.005
Neg logit contributions
 never-0.005
Or-0.005
 knocking-0.005
 horror-0.005
 hallway-0.005
Token and
Token ID290
Feature activation-3.43e+00

Pos logit contributions
ory+0.087
ly+0.084
ipop+0.083
ercise+0.082
les+0.080
Neg logit contributions
 never-0.058
Or-0.055
 knocking-0.053
 fact-0.053
pe-0.052
Token brace
Token ID22581
Feature activation-7.33e+00

Pos logit contributions
ory+0.071
ly+0.067
ipop+0.067
fy+0.064
les+0.063
Neg logit contributions
 never-0.077
Or-0.075
 zo-0.073
 knocking-0.073
 fact-0.073
Tokenlets
Token ID5289
Feature activation-4.29e+00

Pos logit contributions
 groceries+0.127
affles+0.116
 salad+0.115
ory+0.115
 sandwich+0.115
Neg logit contributions
 never-0.191
Or-0.185
pe-0.180
 nine-0.172
ric-0.172
Token.
Token ID13
Feature activation-6.79e-01

Pos logit contributions
ory+0.110
ercise+0.105
ipop+0.104
ly+0.102
fy+0.102
Neg logit contributions
 never-0.076
Or-0.069
pe-0.069
 fact-0.065
stack-0.065
Token You
Token ID921
Feature activation+3.22e-01

Pos logit contributions
ory+0.017
ly+0.016
ipop+0.016
ercise+0.016
fy+0.015
Neg logit contributions
 never-0.012
 knocking-0.012
 collapse-0.012
Or-0.012
 hallway-0.011
Token sorted
Token ID23243
Feature activation-3.77e+00

Pos logit contributions
 knocking+0.006
 horror+0.006
Or+0.006
 hallway+0.006
 sneaking+0.006
Neg logit contributions
ory-0.008
ly-0.007
ipop-0.007
les-0.007
ercise-0.007
Token the
Token ID262
Feature activation+3.09e-01

Pos logit contributions
ipop+0.088
ory+0.087
ercise+0.085
fy+0.085
night+0.084
Neg logit contributions
 never-0.072
 nine-0.068
 only-0.065
 fact-0.065
ric-0.063
Token She
Token ID1375
Feature activation+4.22e-01

Pos logit contributions
ly+0.019
ercise+0.019
ory+0.019
ipop+0.019
fy+0.019
Neg logit contributions
 Ign-0.010
 No-0.009
 horror-0.009
Or-0.009
 never-0.009
Token could
Token ID714
Feature activation+1.70e-01

Pos logit contributions
 never+0.007
Or+0.007
 knocking+0.007
 zo+0.007
 hallway+0.007
Neg logit contributions
ory-0.011
ly-0.011
ipop-0.010
ercise-0.010
les-0.010
Token make
Token ID787
Feature activation-2.17e+00

Pos logit contributions
 never+0.003
Or+0.003
 knocking+0.003
 zo+0.003
 fact+0.003
Neg logit contributions
ory-0.005
ly-0.004
ipop-0.004
les-0.004
fy-0.004
Token knots
Token ID33620
Feature activation-1.28e+00

Pos logit contributions
ory+0.059
ipop+0.058
fy+0.055
ly+0.054
ercise+0.054
Neg logit contributions
 never-0.037
 knocking-0.036
 zo-0.036
 sneaking-0.034
 Mir-0.034
Token and
Token ID290
Feature activation-2.56e+00

Pos logit contributions
ory+0.033
ly+0.032
ipop+0.032
les+0.031
ercise+0.031
Neg logit contributions
 never-0.021
Or-0.020
 knocking-0.020
 horror-0.020
 Ign-0.020
Token brace
Token ID22581
Feature activation-7.33e+00

Pos logit contributions
ory+0.053
ly+0.051
ipop+0.050
les+0.048
ercise+0.048
Neg logit contributions
 never-0.057
Or-0.056
 knocking-0.055
 zo-0.054
 hallway-0.054
Tokenlets
Token ID5289
Feature activation-4.23e+00

Pos logit contributions
ipop+0.126
ory+0.122
affles+0.119
 groceries+0.119
 salad+0.115
Neg logit contributions
 never-0.187
Or-0.178
 zo-0.172
 nine-0.170
gy-0.167
Token and
Token ID290
Feature activation-3.26e+00

Pos logit contributions
ory+0.112
ly+0.108
ipop+0.107
ercise+0.105
fy+0.104
Neg logit contributions
 never-0.071
Or-0.068
 knocking-0.066
 zo-0.065
pe-0.064
Token neck
Token ID7393
Feature activation-3.34e+00

Pos logit contributions
ory+0.072
ly+0.067
ipop+0.067
les+0.064
fy+0.064
Neg logit contributions
 never-0.072
Or-0.070
 knocking-0.068
 zo-0.067
 hallway-0.067
Tokenl
Token ID75
Feature activation-7.96e-01

Pos logit contributions
ory+0.074
ipop+0.070
ercise+0.067
fy+0.066
night+0.066
Neg logit contributions
 never-0.072
pe-0.069
 knocking-0.069
 zo-0.069
Or-0.069
Tokenaces
Token ID2114
Feature activation-4.33e+00

Pos logit contributions
ory+0.014
ly+0.014
ipop+0.013
les+0.013
ercise+0.013
Neg logit contributions
 never-0.019
Or-0.019
 knocking-0.019
 horror-0.019
 fact-0.019
Token kinds
Token ID6982
Feature activation-9.83e-01

Pos logit contributions
ory+0.039
ly+0.038
ipop+0.037
fy+0.036
ercise+0.036
Neg logit contributions
 never-0.032
 known-0.030
 fact-0.030
 knocking-0.029
ric-0.029
Token of
Token ID286
Feature activation+8.29e-01

Pos logit contributions
ory+0.020
ercise+0.020
ipop+0.019
ly+0.019
les+0.018
Neg logit contributions
 never-0.022
Or-0.021
 saving-0.021
 Ign-0.020
 fact-0.020
Token ear
Token ID1027
Feature activation-2.51e+00

Pos logit contributions
 never+0.021
 fact+0.020
 knocking+0.020
 known+0.020
 only+0.020
Neg logit contributions
ipop-0.015
ory-0.014
ercise-0.014
fy-0.014
night-0.013
Tokenrings
Token ID33173
Feature activation-3.61e+00

Pos logit contributions
ercise+0.049
ory+0.045
fy+0.044
affles+0.043
night+0.043
Neg logit contributions
 never-0.063
 no-0.057
ric-0.057
 only-0.056
 turning-0.054
Token and
Token ID290
Feature activation-3.28e+00

Pos logit contributions
ory+0.092
ercise+0.088
ly+0.088
ipop+0.087
les+0.085
Neg logit contributions
 never-0.065
Or-0.059
 saving-0.058
 knocking-0.057
 known-0.057
Token brace
Token ID22581
Feature activation-7.32e+00

Pos logit contributions
ory+0.070
ipop+0.068
ercise+0.067
fy+0.065
les+0.065
Neg logit contributions
 never-0.077
Or-0.072
 only-0.069
 no-0.068
 zo-0.067
Tokenlets
Token ID5289
Feature activation-4.21e+00

Pos logit contributions
 groceries+0.127
ipop+0.125
 sandwich+0.124
affles+0.122
 salad+0.120
Neg logit contributions
 never-0.179
gy-0.165
Or-0.165
ric-0.164
pe-0.161
Token,
Token ID11
Feature activation-3.58e-01

Pos logit contributions
ory+0.111
ercise+0.109
fy+0.107
ipop+0.107
ly+0.105
Neg logit contributions
 never-0.072
pe-0.065
Or-0.065
 saving-0.063
 known-0.062
Token and
Token ID290
Feature activation-2.65e+00

Pos logit contributions
ory+0.009
ly+0.009
ipop+0.009
shirt+0.008
les+0.008
Neg logit contributions
 collapse-0.006
 horror-0.006
elope-0.006
Or-0.006
 never-0.006
Token he
Token ID339
Feature activation-1.43e-01

Pos logit contributions
ory+0.062
ly+0.059
ipop+0.059
les+0.057
ercise+0.057
Neg logit contributions
 never-0.054
Or-0.052
 knocking-0.049
 known-0.049
 zo-0.049
Token really
Token ID1107
Feature activation-8.96e-01

Pos logit contributions
ory+0.004
ly+0.004
ipop+0.004
les+0.003
ercise+0.003
Neg logit contributions
 horror-0.002
Or-0.002
 knocking-0.002
 never-0.002
 collapse-0.002
Token make
Token ID787
Feature activation-2.59e+00

Pos logit contributions
 never+0.003
Or+0.002
 knocking+0.002
 collapse+0.002
 hallway+0.002
Neg logit contributions
ory-0.003
ly-0.002
ipop-0.002
les-0.002
ercise-0.002
Token neck
Token ID7393
Feature activation-4.44e+00

Pos logit contributions
ory+0.075
ipop+0.075
fy+0.071
ly+0.070
ercise+0.069
Neg logit contributions
 knocking-0.041
 never-0.040
 ambul-0.038
 Mir-0.038
 horror-0.037
Tokenl
Token ID75
Feature activation-1.68e+00

Pos logit contributions
ory+0.104
ercise+0.098
ipop+0.097
night+0.096
shirt+0.092
Neg logit contributions
 never-0.095
ric-0.090
 zo-0.088
pe-0.087
 no-0.084
Tokenaces
Token ID2114
Feature activation-4.16e+00

Pos logit contributions
ory+0.031
ly+0.030
ipop+0.030
ercise+0.028
les+0.028
Neg logit contributions
 never-0.040
Or-0.040
 knocking-0.039
 Ign-0.038
 sneaking-0.038
Token and
Token ID290
Feature activation-3.59e+00

Pos logit contributions
ory+0.118
ercise+0.117
ipop+0.114
shirt+0.114
ly+0.114
Neg logit contributions
 never-0.060
Or-0.053
pe-0.051
 only-0.051
 no-0.051
Token brace
Token ID22581
Feature activation-7.32e+00

Pos logit contributions
ory+0.078
ly+0.071
ipop+0.071
les+0.069
fy+0.068
Neg logit contributions
 never-0.081
Or-0.080
 zo-0.078
 knocking-0.076
pe-0.075
Tokenlets
Token ID5289
Feature activation-4.49e+00

Pos logit contributions
 groceries+0.114
ory+0.112
ipop+0.110
affles+0.108
shirt+0.107
Neg logit contributions
 never-0.197
Or-0.192
pe-0.183
 zo-0.180
 nine-0.179
Token with
Token ID351
Feature activation+2.04e-01

Pos logit contributions
ory+0.117
ercise+0.113
ipop+0.111
ly+0.111
fy+0.111
Neg logit contributions
 never-0.078
Or-0.073
pe-0.071
 only-0.067
stack-0.067
Token them
Token ID606
Feature activation-1.46e+00

Pos logit contributions
 never+0.004
Or+0.004
 zo+0.004
 only+0.004
 fact+0.004
Neg logit contributions
ory-0.004
ipop-0.004
ly-0.004
ercise-0.004
fy-0.004
Token.
Token ID13
Feature activation-6.55e-01

Pos logit contributions
ory+0.041
ly+0.039
ipop+0.039
ercise+0.039
shirt+0.038
Neg logit contributions
 never-0.022
Or-0.021
pe-0.020
 zo-0.020
 only-0.019
Token\n
Token ID198
Feature activation-1.86e-01

Pos logit contributions
ory+0.017
ly+0.017
ipop+0.016
ercise+0.016
shirt+0.016
Neg logit contributions
 never-0.010
 knocking-0.010
 collapse-0.010
 sneaking-0.010
 fact-0.010
Token.
Token ID13
Feature activation-3.21e-01

Pos logit contributions
Or+0.009
 horror+0.009
 never+0.009
ric+0.009
 collapse+0.009
Neg logit contributions
ly-0.011
ory-0.011
ipop-0.010
fy-0.010
shirt-0.010
Token He
Token ID679
Feature activation+2.31e-01

Pos logit contributions
ly+0.007
ory+0.007
shirt+0.006
ercise+0.006
fy+0.006
Neg logit contributions
 sneaking-0.007
Or-0.007
 tresp-0.007
ric-0.006
 never-0.006
Token was
Token ID373
Feature activation+1.93e+00

Pos logit contributions
 never+0.004
Or+0.004
 knocking+0.004
 horror+0.004
 Ign+0.004
Neg logit contributions
ory-0.006
ly-0.006
ipop-0.006
ercise-0.006
les-0.006
Token wearing
Token ID5762
Feature activation-1.82e+00

Pos logit contributions
Or+0.038
elope+0.037
pe+0.036
stack+0.036
 collapse+0.036
Neg logit contributions
ory-0.043
 handsome-0.042
ipop-0.042
ly-0.041
shirt-0.041
Token his
Token ID465
Feature activation-2.35e+00

Pos logit contributions
ory+0.058
ipop+0.056
ly+0.054
fy+0.053
actus+0.052
Neg logit contributions
 never-0.024
 fact-0.022
 no-0.022
 only-0.021
 turning-0.021
Token brand
Token ID4508
Feature activation-7.31e+00

Pos logit contributions
ory+0.045
les+0.041
ly+0.040
ercise+0.040
fy+0.038
Neg logit contributions
 fact-0.059
 hallway-0.058
 only-0.056
 never-0.056
 no-0.055
Token new
Token ID649
Feature activation-3.09e+00

Pos logit contributions
ory+0.154
bs+0.145
iper+0.144
ercise+0.143
fy+0.140
Neg logit contributions
 site-0.135
 fact-0.126
 turning-0.126
 never-0.125
 hallway-0.125
Token icy
Token ID30284
Feature activation-7.76e-01

Pos logit contributions
ory+0.062
ip+0.057
ipop+0.056
fy+0.056
ercise+0.056
Neg logit contributions
 hallway-0.074
 fact-0.071
 never-0.068
 knocking-0.067
 turning-0.067
Token blue
Token ID4171
Feature activation-1.74e+00

Pos logit contributions
ory+0.014
ercise+0.013
ipop+0.012
fy+0.012
ly+0.012
Neg logit contributions
 hallway-0.020
 fact-0.019
 knocking-0.019
 never-0.019
 turning-0.018
Token jacket
Token ID15224
Feature activation-1.65e+00

Pos logit contributions
ory+0.034
ercise+0.032
fy+0.031
ly+0.030
ipop+0.030
Neg logit contributions
 hallway-0.042
 turning-0.038
 fact-0.038
 never-0.037
 knocking-0.037
Token as
Token ID355
Feature activation-8.00e-01

Pos logit contributions
ory+0.043
ly+0.041
ipop+0.041
ercise+0.040
les+0.040
Neg logit contributions
 never-0.028
Or-0.027
 knocking-0.026
 hallway-0.026
 fact-0.025
Token,
Token ID11
Feature activation+7.18e-01

Pos logit contributions
ory+0.027
ly+0.026
ipop+0.026
ercise+0.025
les+0.025
Neg logit contributions
 never-0.020
Or-0.019
 knocking-0.019
 sneaking-0.018
 horror-0.018
Token because
Token ID780
Feature activation+5.20e-02

Pos logit contributions
ric+0.016
Or+0.015
 hallway+0.015
 zo+0.015
 collapse+0.015
Neg logit contributions
ly-0.014
ory-0.013
ipop-0.013
 sunglasses-0.013
les-0.013
Token she
Token ID673
Feature activation+5.92e-01

Pos logit contributions
 never+0.001
Or+0.001
 knocking+0.001
 horror+0.001
 Ign+0.001
Neg logit contributions
ory-0.001
ly-0.001
ipop-0.001
ercise-0.001
les-0.001
Token had
Token ID550
Feature activation-6.11e-01

Pos logit contributions
Or+0.009
 knocking+0.009
ric+0.009
 collapse+0.009
 horror+0.009
Neg logit contributions
ory-0.016
ly-0.016
ipop-0.015
les-0.015
fy-0.015
Token a
Token ID257
Feature activation-2.93e+00

Pos logit contributions
ory+0.015
ly+0.015
ipop+0.015
ercise+0.015
les+0.014
Neg logit contributions
 never-0.010
Or-0.010
 knocking-0.010
 horror-0.010
 hallway-0.010
Token brand
Token ID4508
Feature activation-7.11e+00

Pos logit contributions
ory+0.058
ly+0.057
ipop+0.056
ercise+0.054
fy+0.053
Neg logit contributions
 never-0.065
Or-0.064
 knocking-0.063
 horror-0.062
 Ign-0.062
Token new
Token ID649
Feature activation-1.51e+00

Pos logit contributions
ory+0.141
ercise+0.139
fy+0.135
ipop+0.128
bs+0.128
Neg logit contributions
 zo-0.142
 never-0.140
 only-0.137
 Ign-0.134
 gen-0.132
Token bike
Token ID7161
Feature activation-9.80e-01

Pos logit contributions
ory+0.027
ly+0.026
ipop+0.025
ercise+0.025
les+0.025
Neg logit contributions
 never-0.037
Or-0.037
 knocking-0.036
 hallway-0.036
 fact-0.036
Token that
Token ID326
Feature activation-1.46e+00

Pos logit contributions
ory+0.023
ly+0.023
ipop+0.022
les+0.021
fy+0.021
Neg logit contributions
 never-0.018
Or-0.018
 horror-0.018
 knocking-0.018
 Ign-0.017
Token made
Token ID925
Feature activation-8.05e-01

Pos logit contributions
ory+0.034
ly+0.034
ipop+0.033
ercise+0.032
fy+0.032
Neg logit contributions
ric-0.026
Or-0.026
 horror-0.026
 knocking-0.026
 zo-0.025
Token her
Token ID607
Feature activation-2.34e-01

Pos logit contributions
ory+0.018
ipop+0.017
ly+0.017
ercise+0.016
fy+0.016
Neg logit contributions
 never-0.017
Or-0.016
 knocking-0.016
 collapse-0.016
 Ign-0.016
Token make
Token ID787
Feature activation-2.42e+00

Pos logit contributions
ory+0.007
ipop+0.007
ly+0.007
les+0.006
fy+0.006
Neg logit contributions
 never-0.005
Or-0.005
 only-0.004
 zo-0.004
 knocking-0.004
Token neck
Token ID7393
Feature activation-3.76e+00

Pos logit contributions
ipop+0.063
ory+0.062
ly+0.060
fy+0.060
ercise+0.059
Neg logit contributions
 never-0.042
 zo-0.040
Or-0.039
 knocking-0.039
 Ign-0.039
Tokenl
Token ID75
Feature activation-1.34e+00

Pos logit contributions
ory+0.081
ipop+0.080
night+0.076
ercise+0.076
shirt+0.073
Neg logit contributions
 never-0.085
 zo-0.082
ric-0.080
pe-0.080
Or-0.079
Tokenaces
Token ID2114
Feature activation-4.18e+00

Pos logit contributions
ly+0.024
ory+0.023
les+0.022
ipop+0.022
fy+0.022
Neg logit contributions
 never-0.033
Or-0.033
 horror-0.032
 fact-0.032
 knocking-0.032
Token and
Token ID290
Feature activation-3.07e+00

Pos logit contributions
ory+0.114
ercise+0.112
ipop+0.112
shirt+0.109
ly+0.108
Neg logit contributions
 never-0.066
Or-0.060
pe-0.059
 knocking-0.059
 zo-0.059
Token brace
Token ID22581
Feature activation-7.10e+00

Pos logit contributions
ory+0.049
ly+0.048
ipop+0.046
ercise+0.044
les+0.044
Neg logit contributions
 never-0.081
Or-0.080
 knocking-0.078
 horror-0.078
 Ign-0.077
Tokenlets
Token ID5289
Feature activation-4.19e+00

Pos logit contributions
 groceries+0.123
ipop+0.120
affles+0.116
ory+0.112
 salad+0.110
Neg logit contributions
 never-0.182
Or-0.175
 zo-0.167
 nine-0.165
ric-0.165
Token with
Token ID351
Feature activation+1.44e-01

Pos logit contributions
ory+0.110
ipop+0.106
ly+0.104
ercise+0.103
fy+0.102
Neg logit contributions
 never-0.071
Or-0.068
pe-0.066
 zo-0.065
 knocking-0.064
Token them
Token ID606
Feature activation-1.69e+00

Pos logit contributions
 never+0.003
 fact+0.003
 zo+0.003
 nine+0.003
Or+0.003
Neg logit contributions
ory-0.003
ipop-0.003
ly-0.003
fy-0.003
ercise-0.003
Token,"
Token ID553
Feature activation+4.89e-01

Pos logit contributions
ory+0.048
ipop+0.046
ercise+0.045
shirt+0.045
ly+0.044
Neg logit contributions
 never-0.025
 zo-0.023
pe-0.023
Or-0.023
 nine-0.023
Token Tom
Token ID4186
Feature activation+1.12e+00

Pos logit contributions
 zo+0.007
 never+0.006
 horror+0.006
Or+0.006
 knocking+0.006
Neg logit contributions
ory-0.014
ly-0.014
ipop-0.014
fy-0.014
les-0.014
Token of
Token ID286
Feature activation-2.83e-01

Pos logit contributions
ory+0.023
ipop+0.023
ly+0.022
ercise+0.022
fy+0.021
Neg logit contributions
 never-0.028
Or-0.027
 knocking-0.026
 zo-0.026
 fact-0.026
Token fun
Token ID1257
Feature activation-1.70e+00

Pos logit contributions
ory+0.006
ipop+0.006
ly+0.006
ip+0.006
ercise+0.006
Neg logit contributions
 never-0.006
 fact-0.006
 knocking-0.006
 ambul-0.006
 zo-0.006
Token making
Token ID1642
Feature activation-1.96e+00

Pos logit contributions
ly+0.039
ory+0.039
les+0.038
ipop+0.038
ercise+0.037
Neg logit contributions
Or-0.032
 never-0.032
 horror-0.031
 collapse-0.031
 known-0.031
Token knots
Token ID33620
Feature activation-1.22e+00

Pos logit contributions
ory+0.046
ipop+0.045
ly+0.044
fy+0.043
ercise+0.043
Neg logit contributions
 never-0.038
 knocking-0.037
 horror-0.037
 zo-0.037
 ambul-0.037
Token and
Token ID290
Feature activation-2.19e+00

Pos logit contributions
ory+0.030
ly+0.030
ipop+0.029
les+0.028
ercise+0.028
Neg logit contributions
 never-0.021
Or-0.021
 knocking-0.020
 horror-0.020
 Ign-0.020
Token brace
Token ID22581
Feature activation-7.08e+00

Pos logit contributions
ory+0.050
ly+0.050
ipop+0.048
ercise+0.048
les+0.047
Neg logit contributions
 never-0.041
Or-0.040
 horror-0.039
 Ign-0.039
elope-0.039
Tokenlets
Token ID5289
Feature activation-4.16e+00

Pos logit contributions
ipop+0.120
 groceries+0.119
affles+0.115
ory+0.115
 salad+0.113
Neg logit contributions
 never-0.179
Or-0.171
gy-0.166
elope-0.165
pe-0.160
Token and
Token ID290
Feature activation-3.11e+00

Pos logit contributions
ory+0.106
ly+0.103
ipop+0.102
ercise+0.100
fy+0.098
Neg logit contributions
 never-0.071
Or-0.069
 knocking-0.067
pe-0.066
 zo-0.066
Token neck
Token ID7393
Feature activation-2.78e+00

Pos logit contributions
ory+0.056
ly+0.054
ipop+0.053
ercise+0.050
les+0.050
Neg logit contributions
 never-0.076
Or-0.075
 knocking-0.074
 horror-0.073
 Ign-0.073
Tokenl
Token ID75
Feature activation-4.40e-01

Pos logit contributions
ory+0.055
ipop+0.052
ercise+0.050
fy+0.049
ly+0.048
Neg logit contributions
 never-0.065
pe-0.063
Or-0.063
 zo-0.062
 knocking-0.061
Tokenaces
Token ID2114
Feature activation-3.58e+00

Pos logit contributions
ory+0.008
ly+0.008
ipop+0.008
les+0.007
fy+0.007
Neg logit contributions
 never-0.011
Or-0.011
 knocking-0.010
 horror-0.010
 fact-0.010
Token park
Token ID3952
Feature activation+1.47e+00

Pos logit contributions
ory+0.012
ly+0.012
ercise+0.012
les+0.012
ipop+0.011
Neg logit contributions
Or-0.008
 never-0.008
pe-0.008
 knocking-0.008
 horror-0.008
Token.
Token ID13
Feature activation+3.85e-01

Pos logit contributions
Uh+0.035
 horror+0.034
Or+0.034
 tresp+0.034
 fact+0.034
Neg logit contributions
ly-0.029
ory-0.028
 sandwich-0.024
les-0.024
shirt-0.024
Token Jack
Token ID3619
Feature activation-4.16e-01

Pos logit contributions
 never+0.007
Or+0.007
pe+0.007
 sneaking+0.007
 ambul+0.007
Neg logit contributions
ory-0.010
ly-0.009
ipop-0.009
ercise-0.009
les-0.009
Token had
Token ID550
Feature activation-2.50e-01

Pos logit contributions
ory+0.010
ly+0.010
les+0.010
ipop+0.009
fy+0.009
Neg logit contributions
 collapse-0.008
 horror-0.008
 basement-0.007
Or-0.007
 ambul-0.007
Token a
Token ID257
Feature activation-2.17e+00

Pos logit contributions
ly+0.006
ory+0.006
ipop+0.006
shirt+0.006
ercise+0.006
Neg logit contributions
 horror-0.004
Or-0.004
stack-0.004
ric-0.004
 collapse-0.004
Token brand
Token ID4508
Feature activation-7.08e+00

Pos logit contributions
ly+0.042
ory+0.041
les+0.039
ipop+0.039
ercise+0.038
Neg logit contributions
 never-0.051
Or-0.050
 hallway-0.049
 knocking-0.049
 zo-0.049
Token new
Token ID649
Feature activation-1.43e+00

Pos logit contributions
ory+0.145
bs+0.141
fy+0.141
ercise+0.141
ems+0.137
Neg logit contributions
 zo-0.135
 site-0.125
 gen-0.123
 only-0.122
 basement-0.121
Token toy
Token ID13373
Feature activation-1.28e+00

Pos logit contributions
ory+0.025
ly+0.024
ipop+0.023
fy+0.023
les+0.023
Neg logit contributions
 never-0.036
Or-0.035
 hallway-0.035
 knocking-0.035
 fact-0.035
Token with
Token ID351
Feature activation+3.04e-01

Pos logit contributions
ory+0.029
ly+0.028
ipop+0.027
ercise+0.026
les+0.026
Neg logit contributions
 never-0.026
Or-0.025
 knocking-0.025
 Ign-0.024
 fact-0.024
Token him
Token ID683
Feature activation-2.94e-01

Pos logit contributions
 never+0.005
 knocking+0.005
 fact+0.005
 zo+0.005
Or+0.005
Neg logit contributions
ory-0.008
ipop-0.008
ercise-0.008
ly-0.007
fy-0.007
Token.
Token ID13
Feature activation-1.21e-01

Pos logit contributions
ory+0.008
ly+0.007
ipop+0.007
ercise+0.007
les+0.007
Neg logit contributions
 never-0.005
Or-0.005
 knocking-0.005
 fact-0.005
 Ign-0.004
Token also
Token ID635
Feature activation-6.64e-01

Pos logit contributions
ory+0.006
ly+0.006
ipop+0.006
les+0.006
fy+0.006
Neg logit contributions
Or-0.004
 horror-0.004
ric-0.004
 knocking-0.004
 collapse-0.004
Token wore
Token ID12408
Feature activation-3.21e+00

Pos logit contributions
ly+0.015
ory+0.015
les+0.015
ipop+0.015
fy+0.014
Neg logit contributions
stack-0.013
 horror-0.013
ric-0.012
 collapse-0.012
Or-0.012
Token gown
Token ID33323
Feature activation-3.95e+00

Pos logit contributions
ory+0.099
ipop+0.094
ly+0.090
fy+0.089
ercise+0.087
Neg logit contributions
 never-0.045
 ambul-0.040
 saving-0.040
 only-0.039
 knocking-0.039
Tokens
Token ID82
Feature activation-8.30e-01

Pos logit contributions
ory+0.050
ercise+0.046
ipop+0.044
shirt+0.041
 salad+0.040
Neg logit contributions
 never-0.118
 ambul-0.112
pe-0.112
Or-0.112
ric-0.111
Token and
Token ID290
Feature activation-3.94e+00

Pos logit contributions
ory+0.021
ly+0.021
ipop+0.021
les+0.020
ercise+0.020
Neg logit contributions
 never-0.014
Or-0.014
 knocking-0.013
 horror-0.013
 Ign-0.013
Token brace
Token ID22581
Feature activation-7.07e+00

Pos logit contributions
ory+0.079
ly+0.073
ipop+0.072
les+0.070
fy+0.069
Neg logit contributions
 never-0.094
Or-0.091
 knocking-0.089
 ambul-0.088
 zo-0.088
Tokenlets
Token ID5289
Feature activation-3.92e+00

Pos logit contributions
 salad+0.123
 groceries+0.118
affles+0.116
ipop+0.116
ory+0.115
Neg logit contributions
 never-0.174
gy-0.168
Or-0.160
ric-0.157
pe-0.156
Token.
Token ID13
Feature activation-3.96e-01

Pos logit contributions
ory+0.099
ly+0.096
ipop+0.094
ercise+0.092
fy+0.091
Neg logit contributions
 never-0.068
Or-0.066
 knocking-0.065
 Ign-0.063
 fact-0.063
Token Some
Token ID2773
Feature activation+1.62e-01

Pos logit contributions
ory+0.009
ly+0.009
ipop+0.009
ercise+0.009
fy+0.008
Neg logit contributions
 never-0.008
Or-0.007
 sneaking-0.007
elope-0.007
ric-0.007
Token of
Token ID286
Feature activation+3.28e+00

Pos logit contributions
 never+0.003
Or+0.003
 knocking+0.003
 ambul+0.003
 fact+0.003
Neg logit contributions
ory-0.004
ipop-0.003
ly-0.003
ercise-0.003
fy-0.003
Token them
Token ID606
Feature activation+6.46e-02

Pos logit contributions
 never+0.070
Or+0.069
 knocking+0.067
 horror+0.066
 Ign+0.066
Neg logit contributions
ory-0.069
ly-0.068
ipop-0.066
ercise-0.064
les-0.064
TokenA
Token ID32
Feature activation-8.59e-01

Pos logit contributions
Or+0.045
 horror+0.042
 ambul+0.042
 never+0.041
 knocking+0.041
Neg logit contributions
ory-0.062
les-0.061
ipop-0.060
ly-0.059
ip-0.057
Tokenlyss
Token ID35670
Feature activation-1.79e+00

Pos logit contributions
ory+0.018
ly+0.018
ipop+0.017
fy+0.017
ercise+0.017
Neg logit contributions
Or-0.018
 horror-0.018
 never-0.018
 collapse-0.018
 knocking-0.018
Tokena
Token ID64
Feature activation+1.11e+00

Pos logit contributions
ory+0.045
ly+0.045
ipop+0.045
ercise+0.044
les+0.043
Neg logit contributions
 never-0.030
Or-0.029
 knocking-0.029
 hallway-0.028
 Ign-0.028
Token had
Token ID550
Feature activation-6.70e-02

Pos logit contributions
 collapse+0.019
ric+0.019
 horror+0.018
Or+0.018
 knocking+0.018
Neg logit contributions
ory-0.028
ly-0.028
les-0.027
ipop-0.027
fy-0.026
Token a
Token ID257
Feature activation-2.08e+00

Pos logit contributions
ory+0.002
ly+0.002
ipop+0.002
les+0.002
shirt+0.002
Neg logit contributions
Or-0.001
 horror-0.001
 knocking-0.001
 collapse-0.001
 zo-0.001
Token brand
Token ID4508
Feature activation-7.07e+00

Pos logit contributions
ly+0.043
ory+0.042
les+0.041
ipop+0.040
ercise+0.040
Neg logit contributions
 never-0.046
Or-0.044
 fact-0.044
 knocking-0.044
 hallway-0.044
Token new
Token ID649
Feature activation-1.62e+00

Pos logit contributions
ercise+0.132
ems+0.125
les+0.122
bs+0.121
iper+0.119
Neg logit contributions
 zo-0.154
 never-0.150
 Ign-0.149
 only-0.149
 apartment-0.143
Token toy
Token ID13373
Feature activation-1.62e+00

Pos logit contributions
ory+0.028
ly+0.027
ipop+0.026
ercise+0.025
les+0.025
Neg logit contributions
 never-0.041
Or-0.041
 knocking-0.040
 horror-0.039
 sneaking-0.039
Token car
Token ID1097
Feature activation-9.45e-01

Pos logit contributions
ory+0.040
ly+0.038
ercise+0.038
ipop+0.038
fy+0.038
Neg logit contributions
 never-0.029
Or-0.028
 zo-0.028
 knocking-0.027
 Ign-0.027
Token.
Token ID13
Feature activation-1.61e-01

Pos logit contributions
ory+0.022
ly+0.022
ipop+0.021
les+0.021
ercise+0.020
Neg logit contributions
Or-0.018
 never-0.018
 horror-0.017
 knocking-0.017
 collapse-0.017
Token She
Token ID1375
Feature activation+2.58e-01

Pos logit contributions
ory+0.004
ly+0.004
ipop+0.004
ercise+0.004
fy+0.004
Neg logit contributions
 never-0.002
Or-0.002
 Ign-0.002
 knocking-0.002
 zo-0.002
Token a
Token ID257
Feature activation-3.86e+00

Pos logit contributions
ipop+0.007
ory+0.007
ercise+0.007
actus+0.007
ouse+0.007
Neg logit contributions
 never-0.004
 knocking-0.004
 Ign-0.004
 horror-0.004
 only-0.004
Token needle
Token ID17598
Feature activation-6.43e-01

Pos logit contributions
ory+0.091
ercise+0.090
les+0.086
ly+0.085
ouses+0.082
Neg logit contributions
 hallway-0.077
 knocking-0.075
 basement-0.072
 zo-0.072
 never-0.072
Token and
Token ID290
Feature activation-3.42e+00

Pos logit contributions
ly+0.015
ory+0.014
ipop+0.014
les+0.014
fy+0.014
Neg logit contributions
Or-0.013
 known-0.012
ric-0.012
 fact-0.012
 horror-0.012
Token thread
Token ID4704
Feature activation-1.99e+00

Pos logit contributions
ory+0.097
ipop+0.089
ly+0.087
fy+0.085
les+0.085
Neg logit contributions
 never-0.059
 knocking-0.052
 saving-0.051
 zo-0.051
Or-0.051
Token and
Token ID290
Feature activation-3.22e+00

Pos logit contributions
ly+0.040
ory+0.039
ipop+0.038
les+0.038
fy+0.036
Neg logit contributions
Or-0.044
 never-0.043
 fact-0.043
 horror-0.043
 hallway-0.043
Token se
Token ID384
Feature activation-7.02e+00

Pos logit contributions
ory+0.089
ipop+0.082
ly+0.079
ercise+0.078
les+0.078
Neg logit contributions
 never-0.059
 knocking-0.052
Or-0.051
 only-0.050
 no-0.050
Tokenwed
Token ID19103
Feature activation-2.42e+00

Pos logit contributions
ory+0.170
actus+0.164
 salad+0.163
 sandwich+0.161
ipop+0.160
Neg logit contributions
 never-0.120
pe-0.112
Or-0.110
e-0.108
 known-0.106
Token Teddy
Token ID29345
Feature activation-1.96e+00

Pos logit contributions
ory+0.056
ly+0.054
ipop+0.054
ercise+0.053
fy+0.052
Neg logit contributions
 never-0.047
Or-0.045
 knocking-0.045
 Ign-0.044
 zo-0.044
Token's
Token ID338
Feature activation-6.06e-01

Pos logit contributions
ory+0.051
ipop+0.047
ercise+0.046
ly+0.046
fy+0.044
Neg logit contributions
 never-0.036
 knocking-0.034
 sneaking-0.034
Or-0.034
pe-0.033
Token eye
Token ID4151
Feature activation-2.62e+00

Pos logit contributions
ory+0.014
ipop+0.013
ercise+0.013
ly+0.013
les+0.013
Neg logit contributions
 never-0.012
Or-0.012
 knocking-0.011
 saving-0.011
 zo-0.011
Token back
Token ID736
Feature activation-1.16e+00

Pos logit contributions
ory+0.080
ercise+0.075
ipop+0.074
fy+0.071
 notebooks+0.069
Neg logit contributions
 never-0.034
 only-0.031
pe-0.031
 turning-0.031
 knocking-0.031
Token Billy
Token ID15890
Feature activation-8.52e-01

Pos logit contributions
ory+0.004
ly+0.003
ipop+0.003
ercise+0.003
fy+0.003
Neg logit contributions
 never-0.003
Or-0.003
 knocking-0.003
 sneaking-0.003
 horror-0.003
Token.
Token ID13
Feature activation-8.32e-01

Pos logit contributions
ory+0.023
ly+0.023
ipop+0.022
les+0.021
ercise+0.021
Neg logit contributions
Or-0.013
 never-0.013
 horror-0.012
 fact-0.012
 hallway-0.012
Token Billy
Token ID15890
Feature activation-1.35e+00

Pos logit contributions
ory+0.019
ly+0.018
ipop+0.018
ercise+0.018
les+0.018
Neg logit contributions
 never-0.016
Or-0.016
 knocking-0.016
 horror-0.016
 Ign-0.016
Token had
Token ID550
Feature activation-1.62e-01

Pos logit contributions
ory+0.033
ly+0.033
les+0.031
ipop+0.030
ercise+0.030
Neg logit contributions
Or-0.025
 horror-0.024
 basement-0.024
 collapse-0.024
stack-0.023
Token a
Token ID257
Feature activation-2.64e+00

Pos logit contributions
ly+0.004
 sunglasses+0.004
ory+0.004
ercise+0.004
 sandwich+0.004
Neg logit contributions
stack-0.003
elope-0.003
Or-0.003
 horror-0.003
ric-0.003
Token brand
Token ID4508
Feature activation-7.02e+00

Pos logit contributions
ory+0.052
ly+0.050
ipop+0.049
les+0.047
ercise+0.047
Neg logit contributions
 never-0.061
Or-0.060
 knocking-0.059
 fact-0.058
 zo-0.058
Token new
Token ID649
Feature activation-1.63e+00

Pos logit contributions
ory+0.151
fy+0.143
iper+0.140
ercise+0.139
bs+0.137
Neg logit contributions
 zo-0.138
 site-0.125
 never-0.120
 only-0.119
 knocking-0.117
Token car
Token ID1097
Feature activation-9.03e-01

Pos logit contributions
ory+0.028
ly+0.027
ipop+0.027
ercise+0.025
les+0.025
Neg logit contributions
 never-0.041
Or-0.040
 sneaking-0.040
 knocking-0.039
 horror-0.039
Token that
Token ID326
Feature activation-2.13e+00

Pos logit contributions
ly+0.019
ory+0.019
ipop+0.018
les+0.018
fy+0.018
Neg logit contributions
Or-0.019
 horror-0.019
elope-0.018
 never-0.018
ric-0.018
Token could
Token ID714
Feature activation+2.57e-01

Pos logit contributions
ory+0.046
ly+0.044
ipop+0.044
les+0.042
fy+0.042
Neg logit contributions
 never-0.047
Or-0.045
 knocking-0.044
 Ign-0.043
 hallway-0.043
Token speed
Token ID2866
Feature activation-1.27e+00

Pos logit contributions
 never+0.005
Or+0.005
 knocking+0.005
 horror+0.005
 Ign+0.005
Neg logit contributions
ory-0.006
ly-0.006
ipop-0.006
ercise-0.005
les-0.005
Token your
Token ID534
Feature activation-1.43e+00

Pos logit contributions
ory+0.032
ly+0.032
ipop+0.031
fy+0.030
ercise+0.030
Neg logit contributions
 never-0.029
Or-0.028
 knocking-0.028
 fact-0.027
 Ign-0.027
Token neck
Token ID7393
Feature activation-2.62e+00

Pos logit contributions
ory+0.029
ly+0.028
ipop+0.027
les+0.026
ercise+0.026
Neg logit contributions
 never-0.033
Or-0.032
 knocking-0.032
 fact-0.032
 hallway-0.031
Tokenl
Token ID75
Feature activation-5.42e-01

Pos logit contributions
ory+0.069
ly+0.066
ipop+0.066
ercise+0.064
fy+0.064
Neg logit contributions
 never-0.044
Or-0.041
 knocking-0.040
 fact-0.040
 collapse-0.039
Tokenaces
Token ID2114
Feature activation-3.71e+00

Pos logit contributions
ly+0.009
ory+0.009
ipop+0.009
les+0.009
ercise+0.009
Neg logit contributions
Or-0.013
 horror-0.013
 never-0.013
 knocking-0.013
 hallway-0.013
Token and
Token ID290
Feature activation-3.62e+00

Pos logit contributions
ory+0.094
ly+0.092
ipop+0.090
les+0.087
ercise+0.087
Neg logit contributions
 never-0.062
Or-0.062
 knocking-0.061
 horror-0.060
 hallway-0.060
Token brace
Token ID22581
Feature activation-7.01e+00

Pos logit contributions
ory+0.081
ly+0.077
ipop+0.076
fy+0.073
les+0.073
Neg logit contributions
 never-0.077
Or-0.074
 knocking-0.074
 fact-0.072
ric-0.071
Tokenlets
Token ID5289
Feature activation-3.85e+00

Pos logit contributions
 groceries+0.117
ipop+0.114
 fries+0.114
 salad+0.113
affles+0.112
Neg logit contributions
 never-0.180
ric-0.167
Or-0.165
gy-0.159
 only-0.159
Token?"
Token ID1701
Feature activation+9.76e-01

Pos logit contributions
ory+0.098
ly+0.096
ipop+0.094
ercise+0.091
les+0.091
Neg logit contributions
 never-0.065
Or-0.064
 knocking-0.062
 horror-0.061
 Ign-0.061
Token Mom
Token ID11254
Feature activation+2.02e+00

Pos logit contributions
ric+0.022
 never+0.022
Or+0.021
stack+0.021
 knocking+0.021
Neg logit contributions
ory-0.019
ly-0.019
ipop-0.018
les-0.017
ercise-0.017
Token asks
Token ID7893
Feature activation+1.74e+00

Pos logit contributions
Or+0.052
stack+0.051
 never+0.050
 known+0.050
 hallway+0.049
Neg logit contributions
ly-0.036
les-0.035
ory-0.034
ipop-0.031
ercise-0.031
Token.
Token ID13
Feature activation+2.00e+00

Pos logit contributions
 never+0.038
Or+0.038
ric+0.038
 fact+0.038
stack+0.037
Neg logit contributions
ly-0.038
ory-0.036
ipop-0.034
les-0.034
ls-0.031
Token saw
Token ID2497
Feature activation-9.53e-01

Pos logit contributions
ric+0.004
 collapse+0.004
Or+0.004
 horror+0.004
 knocking+0.004
Neg logit contributions
ory-0.006
ly-0.005
ipop-0.005
les-0.005
ercise-0.005
Token Sam
Token ID3409
Feature activation-8.46e-04

Pos logit contributions
ory+0.027
fy+0.026
ly+0.026
ipop+0.025
ercise+0.025
Neg logit contributions
 never-0.014
 zo-0.014
 knocking-0.014
 horror-0.013
Or-0.013
Token sleeping
Token ID11029
Feature activation+1.57e+00

Pos logit contributions
ory+0.000
ly+0.000
ipop+0.000
ercise+0.000
les+0.000
Neg logit contributions
 never-0.000
Or-0.000
 knocking-0.000
 sneaking-0.000
 horror-0.000
Token on
Token ID319
Feature activation+3.05e+00

Pos logit contributions
 never+0.029
Or+0.028
 knocking+0.028
 Ign+0.027
ric+0.027
Neg logit contributions
ory-0.037
ly-0.037
ipop-0.036
les-0.035
ercise-0.035
Token the
Token ID262
Feature activation-1.10e+00

Pos logit contributions
Or+0.058
 never+0.055
 knocking+0.054
ric+0.054
 known+0.054
Neg logit contributions
ly-0.070
ory-0.070
ipop-0.067
les-0.066
 sandwich-0.066
Token sofa
Token ID34902
Feature activation+3.16e+00

Pos logit contributions
ory+0.032
ipop+0.030
ercise+0.030
ly+0.029
les+0.028
Neg logit contributions
 hallway-0.017
 never-0.016
 knocking-0.016
 zo-0.015
 fact-0.015
Token.
Token ID13
Feature activation+2.74e-01

Pos logit contributions
Uh+0.069
 known+0.069
ric+0.068
AAAA+0.067
Or+0.067
Neg logit contributions
ly-0.067
 sandwich-0.064
ipop-0.062
ory-0.061
les-0.061
Token She
Token ID1375
Feature activation+3.02e-01

Pos logit contributions
 never+0.004
Or+0.004
 knocking+0.004
 Ign+0.004
 horror+0.004
Neg logit contributions
ory-0.008
ly-0.008
ipop-0.007
fy-0.007
ercise-0.007
Token said
Token ID531
Feature activation-8.63e-01

Pos logit contributions
 collapse+0.006
 basement+0.006
stack+0.006
ric+0.006
 horror+0.005
Neg logit contributions
ory-0.007
ly-0.007
ipop-0.007
les-0.007
 sandwich-0.007
Token,
Token ID11
Feature activation+3.42e+00

Pos logit contributions
ory+0.026
fy+0.025
les+0.025
shirt+0.024
ly+0.024
Neg logit contributions
 never-0.012
 no-0.011
Or-0.011
 sneaking-0.011
 zo-0.011
Token "
Token ID366
Feature activation+3.26e+00

Pos logit contributions
 collapse+0.063
 fact+0.063
 never+0.062
 hallway+0.062
 known+0.061
Neg logit contributions
ipop-0.079
ly-0.079
ory-0.076
ercise-0.073
ip-0.072
Token.
Token ID13
Feature activation-8.97e-01

Pos logit contributions
Or+0.051
 horror+0.051
 fact+0.050
ric+0.050
 known+0.050
Neg logit contributions
ly-0.048
ory-0.045
les-0.043
fy-0.043
ipop-0.042
Token They
Token ID1119
Feature activation-6.56e-01

Pos logit contributions
ory+0.024
ly+0.023
ipop+0.023
ercise+0.022
les+0.022
Neg logit contributions
 never-0.015
 knocking-0.014
 hallway-0.014
Or-0.014
 Ign-0.014
Token stayed
Token ID9658
Feature activation+4.24e-01

Pos logit contributions
ory+0.016
ly+0.015
ipop+0.015
ercise+0.015
les+0.015
Neg logit contributions
Or-0.012
 horror-0.012
 collapse-0.012
 glaring-0.011
 Ign-0.011
Token in
Token ID287
Feature activation+2.72e+00

Pos logit contributions
Or+0.008
 never+0.008
 horror+0.008
 knocking+0.008
 collapse+0.008
Neg logit contributions
ory-0.009
ly-0.009
ipop-0.009
ercise-0.009
les-0.009
Token a
Token ID257
Feature activation-1.94e+00

Pos logit contributions
ric+0.068
pe+0.067
 glaring+0.067
Or+0.065
Uh+0.065
Neg logit contributions
 sandwiches-0.045
 snacks-0.044
 candy-0.043
 groceries-0.042
 album-0.041
Token hotel
Token ID7541
Feature activation+2.68e+00

Pos logit contributions
ory+0.041
ly+0.040
ipop+0.039
ercise+0.038
les+0.037
Neg logit contributions
 never-0.042
Or-0.041
 knocking-0.040
 horror-0.040
 Ign-0.039
Token with
Token ID351
Feature activation+7.69e-01

Pos logit contributions
Uh+0.060
ric+0.059
AAAA+0.058
 horror+0.058
 glaring+0.058
Neg logit contributions
ly-0.058
ory-0.053
fy-0.053
les-0.050
 sandwich-0.050
Token lots
Token ID6041
Feature activation-1.78e+00

Pos logit contributions
 never+0.014
Or+0.013
 knocking+0.013
 horror+0.013
 zo+0.013
Neg logit contributions
ory-0.019
ly-0.019
ipop-0.018
ercise-0.018
les-0.018
Token of
Token ID286
Feature activation+9.57e-01

Pos logit contributions
ory+0.037
ercise+0.036
ipop+0.036
ly+0.033
shirt+0.033
Neg logit contributions
 never-0.039
 zo-0.037
 ambul-0.037
 knocking-0.037
 hallway-0.037
Token rooms
Token ID9519
Feature activation-1.09e-01

Pos logit contributions
 never+0.020
 ambul+0.020
 zo+0.020
 knocking+0.020
 hallway+0.020
Neg logit contributions
ory-0.021
ipop-0.021
ly-0.020
ercise-0.020
fy-0.019
Token.
Token ID13
Feature activation-1.31e+00

Pos logit contributions
ly+0.002
ory+0.002
les+0.002
ipop+0.002
fy+0.002
Neg logit contributions
Uh-0.002
 collapse-0.002
 glaring-0.002
 fact-0.002
Or-0.002
Token mom
Token ID1995
Feature activation+3.84e-02

Pos logit contributions
les+0.014
ly+0.014
ory+0.014
ercise+0.014
fy+0.013
Neg logit contributions
 zo-0.011
 only-0.011
 never-0.011
 fact-0.011
 hallway-0.011
Token.
Token ID13
Feature activation+4.64e-01

Pos logit contributions
 never+0.001
 knocking+0.001
Or+0.001
 sneaking+0.001
 hallway+0.001
Neg logit contributions
ory-0.001
ly-0.001
ipop-0.001
ercise-0.001
les-0.001
Token Mom
Token ID11254
Feature activation+2.83e-01

Pos logit contributions
 never+0.007
Or+0.007
 knocking+0.007
 sneaking+0.007
 fact+0.007
Neg logit contributions
ory-0.012
ly-0.012
ipop-0.012
ercise-0.012
les-0.011
Token said
Token ID531
Feature activation+1.50e-01

Pos logit contributions
 never+0.006
Or+0.005
 knocking+0.005
 horror+0.005
 Ign+0.005
Neg logit contributions
ory-0.006
ly-0.006
ipop-0.006
ercise-0.006
les-0.006
Token,
Token ID11
Feature activation+3.49e+00

Pos logit contributions
 never+0.003
Or+0.003
 knocking+0.003
 horror+0.002
 Ign+0.002
Neg logit contributions
ory-0.004
ly-0.004
ipop-0.004
ercise-0.004
les-0.003
Token "
Token ID366
Feature activation+3.97e+00

Pos logit contributions
 collapse+0.058
 never+0.057
Or+0.056
 fact+0.056
 horror+0.056
Neg logit contributions
ly-0.088
ory-0.087
ipop-0.087
ercise-0.082
shirt-0.081
TokenStay
Token ID25681
Feature activation+1.05e+00

Pos logit contributions
 never+0.054
Or+0.053
 knocking+0.051
 hallway+0.050
 fact+0.050
Neg logit contributions
ory-0.115
ly-0.113
ipop-0.111
ercise-0.108
les-0.108
Token with
Token ID351
Feature activation+2.00e+00

Pos logit contributions
Or+0.020
 never+0.019
 horror+0.019
 collapse+0.019
ric+0.019
Neg logit contributions
ory-0.024
ly-0.024
ipop-0.023
fy-0.022
ercise-0.022
Token me
Token ID502
Feature activation+1.03e+00

Pos logit contributions
 never+0.045
 knocking+0.043
 fact+0.042
 sneaking+0.042
 ambul+0.042
Neg logit contributions
ory-0.043
ipop-0.041
ly-0.040
les-0.039
fy-0.039
Token and
Token ID290
Feature activation-1.79e+00

Pos logit contributions
Or+0.020
Uh+0.019
 collapse+0.019
 known+0.018
 horror+0.018
Neg logit contributions
ly-0.025
ory-0.024
ipop-0.024
fy-0.023
ip-0.022
Token do
Token ID466
Feature activation-1.72e-01

Pos logit contributions
ory+0.042
ly+0.041
ipop+0.040
les+0.040
ercise+0.039
Neg logit contributions
 never-0.035
Or-0.034
 knocking-0.033
 Ign-0.032
 fact-0.032
Token Tim
Token ID5045
Feature activation-5.48e-02

Pos logit contributions
ory+0.032
ly+0.031
ipop+0.031
ercise+0.030
les+0.030
Neg logit contributions
 never-0.024
Or-0.023
 knocking-0.023
 fact-0.022
 zo-0.022
Token liked
Token ID8288
Feature activation-2.83e-01

Pos logit contributions
ory+0.001
ly+0.001
ipop+0.001
ercise+0.001
fy+0.001
Neg logit contributions
 never-0.001
 knocking-0.001
Or-0.001
 only-0.001
elope-0.001
Token.
Token ID13
Feature activation-9.16e-01

Pos logit contributions
ory+0.007
ly+0.007
ipop+0.007
ercise+0.007
fy+0.007
Neg logit contributions
 never-0.005
 knocking-0.005
Or-0.005
 sneaking-0.004
 Ign-0.004
Token He
Token ID679
Feature activation+6.44e-01

Pos logit contributions
ory+0.022
ly+0.021
ipop+0.021
ercise+0.020
fy+0.020
Neg logit contributions
 never-0.017
Or-0.016
 knocking-0.016
 hallway-0.016
 fact-0.016
Token jumped
Token ID11687
Feature activation+1.02e-01

Pos logit contributions
 collapse+0.012
 basement+0.012
 horror+0.012
ric+0.012
 ambul+0.012
Neg logit contributions
ory-0.016
ly-0.015
ipop-0.015
les-0.014
ercise-0.014
Token up
Token ID510
Feature activation+2.67e+00

Pos logit contributions
 never+0.002
 knocking+0.002
 zo+0.002
Or+0.002
 sneaking+0.002
Neg logit contributions
ory-0.003
ipop-0.003
ly-0.003
ercise-0.003
fy-0.002
Token and
Token ID290
Feature activation-1.30e+00

Pos logit contributions
ric+0.060
elope+0.059
Or+0.057
 fact+0.057
 known+0.056
Neg logit contributions
ly-0.055
ory-0.055
ipop-0.053
 sandwich-0.050
 candy-0.050
Token tried
Token ID3088
Feature activation-1.50e+00

Pos logit contributions
ory+0.031
ly+0.030
ipop+0.028
ade+0.028
shirt+0.028
Neg logit contributions
 knocking-0.025
 never-0.025
 zo-0.024
Or-0.024
 turning-0.024
Token to
Token ID284
Feature activation+1.64e+00

Pos logit contributions
ory+0.023
ly+0.022
ipop+0.022
les+0.021
ercise+0.021
Neg logit contributions
 never-0.040
Or-0.040
 fact-0.039
 knocking-0.039
 hallway-0.039
Token sniff
Token ID26300
Feature activation-3.39e+00

Pos logit contributions
 never+0.032
Or+0.031
 knocking+0.031
 collapse+0.030
 zo+0.030
Neg logit contributions
ory-0.038
ly-0.037
ipop-0.037
les-0.036
fy-0.036
Token the
Token ID262
Feature activation+6.45e-01

Pos logit contributions
ory+0.085
ipop+0.084
 sandwich+0.080
ercise+0.080
ouse+0.079
Neg logit contributions
 never-0.059
 zo-0.058
 only-0.057
Or-0.056
 Ign-0.054
Token He
Token ID679
Feature activation+2.75e-01

Pos logit contributions
ory+0.034
ly+0.034
ipop+0.034
shirt+0.033
ercise+0.032
Neg logit contributions
Or-0.028
 never-0.028
 collapse-0.027
 fact-0.026
 horror-0.026
Token was
Token ID373
Feature activation+2.16e+00

Pos logit contributions
Or+0.004
 horror+0.004
 never+0.004
 collapse+0.004
 knocking+0.004
Neg logit contributions
ory-0.007
ly-0.007
ipop-0.007
les-0.007
ercise-0.007
Token scared
Token ID12008
Feature activation+2.00e-01

Pos logit contributions
Or+0.048
elope+0.044
 collapse+0.044
pe+0.044
ASH+0.043
Neg logit contributions
 handsome-0.042
 sunglasses-0.041
 favorite-0.041
ory-0.040
 sandwiches-0.039
Token of
Token ID286
Feature activation+2.05e+00

Pos logit contributions
Or+0.004
 never+0.004
stack+0.004
 hallway+0.004
pe+0.004
Neg logit contributions
ly-0.005
ory-0.005
les-0.004
ipop-0.004
fy-0.004
Token them
Token ID606
Feature activation-2.38e-01

Pos logit contributions
Or+0.047
elope+0.045
Uh+0.044
stack+0.044
pe+0.042
Neg logit contributions
ly-0.039
 sunglasses-0.039
les-0.038
ory-0.038
 candy-0.038
Token at
Token ID379
Feature activation+2.75e+00

Pos logit contributions
ory+0.006
ly+0.006
ipop+0.006
les+0.006
fy+0.006
Neg logit contributions
Or-0.004
 never-0.004
 hallway-0.004
 fact-0.004
 horror-0.004
Token first
Token ID717
Feature activation+8.64e-01

Pos logit contributions
Or+0.064
elope+0.061
Uh+0.059
pe+0.058
 collapse+0.058
Neg logit contributions
ory-0.055
 sandwiches-0.051
 sunglasses-0.050
ipop-0.050
 sandwich-0.049
Token,
Token ID11
Feature activation+1.25e+00

Pos logit contributions
Or+0.016
Uh+0.016
 known+0.016
 never+0.016
 knocking+0.016
Neg logit contributions
ly-0.020
ory-0.019
ipop-0.019
les-0.018
 sandwich-0.018
Token but
Token ID475
Feature activation+1.62e-01

Pos logit contributions
 collapse+0.024
elope+0.023
 basement+0.023
 Gard+0.023
 known+0.023
Neg logit contributions
ly-0.028
ory-0.026
 sunglasses-0.025
ipop-0.024
 peanuts-0.024
Token then
Token ID788
Feature activation+4.61e-01

Pos logit contributions
ric+0.004
Or+0.004
 hallway+0.003
 collapse+0.003
 horror+0.003
Neg logit contributions
ory-0.003
ly-0.003
ipop-0.003
ercise-0.003
fy-0.003
Token he
Token ID339
Feature activation-5.73e-02

Pos logit contributions
Or+0.012
 never+0.012
ric+0.012
 knocking+0.011
 fact+0.011
Neg logit contributions
ory-0.008
ly-0.007
ipop-0.007
ercise-0.007
les-0.007
Token to
Token ID284
Feature activation+1.65e+00

Pos logit contributions
ory+0.022
ipop+0.021
les+0.020
ly+0.020
fy+0.020
Neg logit contributions
 never-0.032
 knocking-0.030
Or-0.030
 sneaking-0.029
 saving-0.029
Token explore
Token ID7301
Feature activation-1.32e+00

Pos logit contributions
Or+0.038
ric+0.036
 knocking+0.036
 ambul+0.036
 hallway+0.036
Neg logit contributions
ory-0.031
ly-0.031
ercise-0.028
 sandwich-0.028
ipop-0.028
Token and
Token ID290
Feature activation-3.84e+00

Pos logit contributions
ory+0.032
ly+0.032
ipop+0.031
les+0.030
ercise+0.030
Neg logit contributions
 never-0.023
Or-0.023
 knocking-0.022
ric-0.022
 zo-0.022
Token look
Token ID804
Feature activation-3.44e+00

Pos logit contributions
ory+0.090
ly+0.089
ipop+0.086
ercise+0.084
les+0.083
Neg logit contributions
 never-0.072
Or-0.071
 knocking-0.070
 fact-0.069
 horror-0.069
Token around
Token ID1088
Feature activation+9.92e-01

Pos logit contributions
ipop+0.086
ory+0.084
ercise+0.082
shirt+0.079
 sandwich+0.078
Neg logit contributions
 never-0.064
 knocking-0.063
 horror-0.061
 sneaking-0.061
 standing-0.060
Token.
Token ID13
Feature activation-6.29e-01

Pos logit contributions
Or+0.019
 never+0.018
ric+0.018
 zo+0.018
 knocking+0.018
Neg logit contributions
ly-0.023
ory-0.023
ipop-0.022
les-0.022
ercise-0.021
Token She
Token ID1375
Feature activation+2.22e-01

Pos logit contributions
ly+0.016
ory+0.015
ipop+0.015
ercise+0.015
shirt+0.014
Neg logit contributions
 never-0.011
ric-0.010
pe-0.010
 fact-0.010
Or-0.010
Token went
Token ID1816
Feature activation+1.22e+00

Pos logit contributions
ric+0.004
 collapse+0.004
Or+0.004
 ambul+0.004
 horror+0.004
Neg logit contributions
ory-0.005
ly-0.005
les-0.005
ipop-0.005
ercise-0.005
Token to
Token ID284
Feature activation+4.26e-01

Pos logit contributions
 never+0.027
Or+0.026
 knocking+0.026
 hallway+0.025
 sneaking+0.025
Neg logit contributions
ory-0.025
ly-0.025
ipop-0.024
ercise-0.023
les-0.023
Token the
Token ID262
Feature activation-5.77e-01

Pos logit contributions
Or+0.009
ric+0.009
pe+0.009
 glaring+0.009
Uh+0.008
Neg logit contributions
ory-0.009
ly-0.009
ercise-0.008
 sandwich-0.008
ipop-0.008
Token tall
Token ID7331
Feature activation+1.82e+00

Pos logit contributions
ory+0.016
ly+0.016
ercise+0.015
les+0.015
 sandwich+0.015
Neg logit contributions
Or-0.008
 horror-0.008
ric-0.008
 collapse-0.008
 never-0.008
Token.
Token ID13
Feature activation+1.44e+00

Pos logit contributions
Or+0.035
Uh+0.035
 fact+0.034
elope+0.034
 known+0.034
Neg logit contributions
ly-0.033
ory-0.031
ipop-0.030
les-0.030
fy-0.028
Token They
Token ID1119
Feature activation+5.26e-01

Pos logit contributions
 fact+0.034
ric+0.034
 sneaking+0.033
 glaring+0.033
pe+0.033
Neg logit contributions
bs-0.024
ly-0.023
ory-0.021
s-0.021
ouses-0.021
Token are
Token ID389
Feature activation+2.11e+00

Pos logit contributions
Or+0.011
elope+0.011
ric+0.011
 hallway+0.011
 glaring+0.011
Neg logit contributions
ory-0.010
ly-0.010
ercise-0.009
ipop-0.009
fy-0.009
Token sad
Token ID6507
Feature activation+6.24e-01

Pos logit contributions
Or+0.035
ric+0.035
elope+0.034
stack+0.034
 hallway+0.034
Neg logit contributions
ly-0.050
ory-0.050
ipop-0.049
 sandwiches-0.049
 sunglasses-0.049
Token.
Token ID13
Feature activation+8.71e-01

Pos logit contributions
stack+0.016
 known+0.016
elope+0.016
pe+0.015
ric+0.015
Neg logit contributions
ly-0.012
ory-0.011
les-0.010
 candy-0.010
ipop-0.010
Token They
Token ID1119
Feature activation+8.81e-01

Pos logit contributions
 sneaking+0.020
stack+0.020
 glaring+0.020
pe+0.019
ric+0.019
Neg logit contributions
ory-0.015
bs-0.015
ly-0.014
ouses-0.014
ipop-0.014
Token are
Token ID389
Feature activation+2.08e+00

Pos logit contributions
Or+0.018
elope+0.018
 hallway+0.018
ric+0.017
 knocking+0.017
Neg logit contributions
ory-0.018
ly-0.018
ipop-0.018
fy-0.017
ercise-0.017
Token sorry
Token ID7926
Feature activation-1.92e+00

Pos logit contributions
Or+0.031
 hallway+0.030
elope+0.030
ric+0.030
 Ign+0.030
Neg logit contributions
ory-0.055
ly-0.055
ipop-0.053
 sandwich-0.052
 sunglasses-0.052
Token.
Token ID13
Feature activation+7.49e-01

Pos logit contributions
ory+0.050
ipop+0.048
ly+0.048
ercise+0.047
fy+0.046
Neg logit contributions
 never-0.033
Or-0.032
 knocking-0.032
 Ign-0.031
 horror-0.031
Token They
Token ID1119
Feature activation+7.48e-01

Pos logit contributions
 sneaking+0.017
stack+0.016
 glaring+0.016
ric+0.016
 knocking+0.016
Neg logit contributions
ory-0.014
bs-0.013
ly-0.013
ouses-0.013
ipop-0.012
Token wish
Token ID4601
Feature activation-1.86e+00

Pos logit contributions
Or+0.016
 never+0.016
 knocking+0.016
 hallway+0.016
 sneaking+0.015
Neg logit contributions
ory-0.015
ly-0.015
ipop-0.015
ercise-0.014
fy-0.014
Token to
Token ID284
Feature activation+4.50e-01

Pos logit contributions
Or+0.013
 never+0.013
 knocking+0.013
 collapse+0.013
ric+0.013
Neg logit contributions
ly-0.007
ory-0.007
ipop-0.006
ercise-0.006
fy-0.006
Token always
Token ID1464
Feature activation-2.33e+00

Pos logit contributions
Or+0.009
 hallway+0.009
 horror+0.009
Uh+0.009
 glaring+0.008
Neg logit contributions
ory-0.010
ly-0.010
ercise-0.010
ipop-0.009
 sandwich-0.009
Token listen
Token ID6004
Feature activation-1.08e+00

Pos logit contributions
ly+0.047
ory+0.046
ercise+0.045
night+0.042
ipop+0.042
Neg logit contributions
 hallway-0.050
stack-0.050
Or-0.050
ric-0.050
 horror-0.050
Token to
Token ID284
Feature activation-4.56e-01

Pos logit contributions
ory+0.022
ipop+0.021
ercise+0.019
ly+0.019
 sandwich+0.019
Neg logit contributions
 never-0.027
Or-0.024
 knocking-0.024
 Ign-0.024
 saving-0.024
Token adults
Token ID6490
Feature activation-1.40e+00

Pos logit contributions
ory+0.009
ly+0.009
ipop+0.008
ercise+0.008
les+0.008
Neg logit contributions
Or-0.010
 never-0.010
 hallway-0.010
 knocking-0.010
 zo-0.010
Token.
Token ID13
Feature activation-7.04e-01

Pos logit contributions
ory+0.035
ly+0.035
ipop+0.034
les+0.033
ercise+0.033
Neg logit contributions
 never-0.023
Or-0.023
 knocking-0.023
 collapse-0.023
 horror-0.023
Token<|endoftext|>
Token ID50256
Feature activation+2.57e+00

Pos logit contributions
ory+0.016
 album+0.014
ercise+0.014
shirt+0.014
ipop+0.013
Neg logit contributions
Or-0.014
 sneaking-0.014
 knocking-0.014
 dodging-0.014
 hallway-0.014
TokenOnce
Token ID7454
Feature activation-4.10e+00

Pos logit contributions
Or+0.038
 never+0.037
 knocking+0.036
 horror+0.036
 ambul+0.035
Neg logit contributions
ory-0.072
ipop-0.069
ly-0.069
les-0.069
shirt-0.066
Token upon
Token ID2402
Feature activation+2.28e-01

Pos logit contributions
ercise+0.101
shirt+0.097
ly+0.095
fy+0.095
 fries+0.095
Neg logit contributions
 never-0.080
 only-0.073
 no-0.072
 Ign-0.071
ric-0.070
Token a
Token ID257
Feature activation-2.35e+00

Pos logit contributions
 never+0.003
Or+0.002
 knocking+0.002
 Ign+0.002
 sneaking+0.002
Neg logit contributions
ory-0.007
ipop-0.007
ly-0.007
ercise-0.007
les-0.007
Token time
Token ID640
Feature activation-2.00e-02

Pos logit contributions
ory+0.055
ly+0.053
fy+0.052
ipop+0.052
 sandwich+0.050
Neg logit contributions
stack-0.045
 knocking-0.044
Or-0.044
ric-0.044
 ambul-0.044
Token His
Token ID2399
Feature activation-6.03e-01

Pos logit contributions
ory+0.002
ly+0.002
ipop+0.002
ercise+0.002
les+0.002
Neg logit contributions
 never-0.001
Or-0.001
 knocking-0.001
 fact-0.001
 hallway-0.001
Token mom
Token ID1995
Feature activation+1.05e+00

Pos logit contributions
ory+0.014
ly+0.014
les+0.014
ercise+0.014
fy+0.013
Neg logit contributions
 never-0.013
 horror-0.012
 knocking-0.012
 fact-0.012
 hallway-0.012
Token said
Token ID531
Feature activation-7.58e-02

Pos logit contributions
 basement+0.023
ric+0.022
Or+0.022
 hallway+0.022
stack+0.021
Neg logit contributions
ly-0.024
ipop-0.022
les-0.021
ory-0.021
fy-0.021
Token,
Token ID11
Feature activation+3.82e+00

Pos logit contributions
ory+0.002
ly+0.002
ipop+0.002
les+0.002
ercise+0.002
Neg logit contributions
Or-0.001
 never-0.001
 hallway-0.001
ric-0.001
 knocking-0.001
Token "
Token ID366
Feature activation+3.79e+00

Pos logit contributions
 collapse+0.084
 basement+0.082
 hallway+0.081
ric+0.080
 apartment+0.079
Neg logit contributions
ly-0.082
ipop-0.077
ory-0.072
ip-0.072
ercise-0.070
TokenOh
Token ID5812
Feature activation+2.08e+00

Pos logit contributions
 horror+0.054
 knocking+0.053
 hallway+0.053
 never+0.052
 basement+0.051
Neg logit contributions
ory-0.106
ly-0.103
ercise-0.101
ipop-0.100
les-0.099
Token,
Token ID11
Feature activation+3.13e+00

Pos logit contributions
Or+0.036
 hallway+0.036
 knocking+0.034
 zo+0.033
ric+0.033
Neg logit contributions
ly-0.053
ory-0.052
ipop-0.052
fy-0.049
ercise-0.049
Token Sammy
Token ID44067
Feature activation-8.02e-01

Pos logit contributions
 never+0.076
Or+0.072
 knocking+0.072
 Ign+0.072
 fact+0.072
Neg logit contributions
ory-0.059
ly-0.056
ipop-0.056
les-0.054
fy-0.054
Token.
Token ID13
Feature activation+1.53e+00

Pos logit contributions
ly+0.019
ory+0.018
ipop+0.018
les+0.017
fy+0.017
Neg logit contributions
Or-0.016
 hallway-0.015
Uh-0.015
ric-0.015
 horror-0.015
Token That
Token ID1320
Feature activation+7.91e-01

Pos logit contributions
Or+0.028
 never+0.027
 hallway+0.027
 knocking+0.027
 sneaking+0.026
Neg logit contributions
ory-0.036
ipop-0.036
ly-0.035
ercise-0.034
fy-0.034
Token's
Token ID338
Feature activation+1.96e+00

Pos logit contributions
 never+0.015
Or+0.014
 knocking+0.014
 fact+0.014
 Ign+0.014
Neg logit contributions
ory-0.019
ly-0.018
ipop-0.018
ercise-0.017
les-0.017
TokenWhen
Token ID2215
Feature activation+9.25e-01

Pos logit contributions
ercise+0.006
ory+0.006
ade+0.006
les+0.006
fy+0.006
Neg logit contributions
 knocking-0.006
 horror-0.006
 fact-0.006
 sneaking-0.005
 saving-0.005
Token they
Token ID484
Feature activation-4.36e-01

Pos logit contributions
Or+0.021
 never+0.020
 knocking+0.020
pe+0.020
 glaring+0.020
Neg logit contributions
ory-0.018
ipop-0.018
ly-0.018
les-0.017
ercise-0.017
Token reached
Token ID4251
Feature activation+4.71e-01

Pos logit contributions
ory+0.011
ipop+0.011
ly+0.011
les+0.010
fy+0.010
Neg logit contributions
Or-0.008
 sneaking-0.008
 knocking-0.007
 ambul-0.007
 hallway-0.007
Token the
Token ID262
Feature activation+2.11e-01

Pos logit contributions
Or+0.010
pe+0.010
Uh+0.010
 glaring+0.010
 Ign+0.010
Neg logit contributions
ory-0.009
ipop-0.009
 sandwiches-0.009
 sandwich-0.009
ly-0.009
Token other
Token ID584
Feature activation+2.47e+00

Pos logit contributions
Or+0.004
 sneaking+0.004
 never+0.004
 knocking+0.004
 glaring+0.004
Neg logit contributions
ory-0.005
ly-0.005
les-0.005
 sandwich-0.005
ercise-0.004
Token side
Token ID1735
Feature activation+2.18e+00

Pos logit contributions
Con+0.048
 known+0.048
Or+0.047
 never+0.047
 glaring+0.047
Neg logit contributions
ly-0.056
ory-0.055
les-0.052
 sandwiches-0.050
 groceries-0.050
Token,
Token ID11
Feature activation+4.48e-01

Pos logit contributions
Or+0.046
ric+0.042
AAAA+0.042
Uh+0.042
 known+0.042
Neg logit contributions
ly-0.047
ory-0.047
ipop-0.046
les-0.044
fy-0.043
Token they
Token ID484
Feature activation-3.07e-01

Pos logit contributions
 fact+0.010
Or+0.010
 glaring+0.010
pe+0.010
 never+0.010
Neg logit contributions
ory-0.008
ly-0.008
ipop-0.008
les-0.008
 sandwiches-0.008
Token found
Token ID1043
Feature activation-4.65e-01

Pos logit contributions
ory+0.007
ly+0.007
ipop+0.007
les+0.007
fy+0.007
Neg logit contributions
Or-0.006
 sneaking-0.006
 knocking-0.006
 ambul-0.006
 horror-0.006
Token the
Token ID262
Feature activation+8.82e-01

Pos logit contributions
ory+0.012
ly+0.012
ipop+0.012
ercise+0.011
les+0.011
Neg logit contributions
 never-0.008
Or-0.008
 knocking-0.007
 horror-0.007
 Ign-0.007
Token nest
Token ID16343
Feature activation+1.81e+00

Pos logit contributions
Or+0.019
 never+0.019
 knocking+0.018
 Ign+0.018
 sneaking+0.018
Neg logit contributions
ory-0.018
ly-0.017
ipop-0.017
 sandwich-0.017
les-0.016
Token for
Token ID329
Feature activation+2.19e+00

Pos logit contributions
ory+0.019
ly+0.018
ipop+0.018
fy+0.018
ercise+0.017
Neg logit contributions
 never-0.017
Or-0.015
 knocking-0.015
 saving-0.015
 sneaking-0.015
Token recommendations
Token ID10763
Feature activation-1.56e+00

Pos logit contributions
 never+0.042
Or+0.040
 knocking+0.040
 fact+0.039
 sneaking+0.039
Neg logit contributions
ory-0.052
ly-0.051
ipop-0.050
les-0.048
fy-0.048
Token.
Token ID13
Feature activation-1.11e+00

Pos logit contributions
ory+0.039
ly+0.038
ipop+0.037
ercise+0.036
les+0.036
Neg logit contributions
 never-0.027
Or-0.027
 knocking-0.026
 horror-0.026
 Ign-0.026
Token They
Token ID1119
Feature activation-9.13e-01

Pos logit contributions
ory+0.029
ly+0.029
ipop+0.028
ercise+0.028
fy+0.027
Neg logit contributions
 never-0.017
Or-0.017
 knocking-0.017
 sneaking-0.017
 fact-0.017
Token had
Token ID550
Feature activation+2.00e-01

Pos logit contributions
ory+0.022
ipop+0.021
ly+0.021
les+0.021
ercise+0.021
Neg logit contributions
 knocking-0.016
Or-0.016
 sneaking-0.016
 glaring-0.016
 hallway-0.016
Token lots
Token ID6041
Feature activation-2.23e+00

Pos logit contributions
 never+0.003
Or+0.003
 knocking+0.003
 fact+0.003
 horror+0.003
Neg logit contributions
ory-0.005
ly-0.005
ipop-0.005
ercise-0.005
les-0.005
Token of
Token ID286
Feature activation-1.94e-01

Pos logit contributions
ory+0.046
ipop+0.044
ercise+0.043
ly+0.042
shirt+0.041
Neg logit contributions
 never-0.049
Or-0.047
 fact-0.047
 horror-0.047
 knocking-0.046
Token fun
Token ID1257
Feature activation-1.38e+00

Pos logit contributions
ory+0.004
ipop+0.004
ly+0.004
les+0.003
ercise+0.003
Neg logit contributions
 never-0.005
 fact-0.005
 knocking-0.004
Or-0.004
 ambul-0.004
Token together
Token ID1978
Feature activation-6.77e-01

Pos logit contributions
ly+0.031
les+0.030
ory+0.030
 sandwiches+0.029
 sandwich+0.029
Neg logit contributions
stack-0.027
Or-0.027
 collapse-0.027
 zo-0.026
 glaring-0.026
Token,
Token ID11
Feature activation-1.48e-01

Pos logit contributions
ory+0.014
ly+0.014
ipop+0.013
 sandwich+0.013
les+0.013
Neg logit contributions
Uh-0.014
 collapse-0.014
ric-0.014
 zo-0.014
 hallway-0.014
Token and
Token ID290
Feature activation-2.37e+00

Pos logit contributions
ory+0.003
ly+0.003
ipop+0.003
 sandwich+0.003
les+0.003
Neg logit contributions
ric-0.003
 collapse-0.003
 Ign-0.003
pe-0.003
 glaring-0.003
Token,
Token ID11
Feature activation+3.62e+00

Pos logit contributions
ory+0.038
ly+0.036
ipop+0.036
ercise+0.036
fy+0.036
Neg logit contributions
 never-0.017
Or-0.016
 knocking-0.016
 zo-0.015
 Ign-0.015
Token "
Token ID366
Feature activation+2.90e+00

Pos logit contributions
elope+0.075
 collapse+0.074
 fact+0.073
ric+0.072
 basement+0.072
Neg logit contributions
ipop-0.077
ly-0.075
ory-0.070
ercise-0.070
ip-0.069
TokenI
Token ID40
Feature activation+1.32e+00

Pos logit contributions
 never+0.044
 knocking+0.043
 sneaking+0.043
ric+0.042
 horror+0.042
Neg logit contributions
ory-0.079
ly-0.078
ipop-0.076
ercise-0.075
fy-0.074
Token don
Token ID836
Feature activation+7.41e-01

Pos logit contributions
 never+0.020
Or+0.019
 knocking+0.019
 hallway+0.019
 fact+0.019
Neg logit contributions
ory-0.037
ly-0.036
ipop-0.035
ercise-0.034
fy-0.034
Token't
Token ID470
Feature activation-1.02e+00

Pos logit contributions
Or+0.008
 never+0.007
elope+0.007
 horror+0.007
 hallway+0.007
Neg logit contributions
ory-0.024
ly-0.023
ipop-0.023
les-0.023
ercise-0.023
Token know
Token ID760
Feature activation-1.91e+00

Pos logit contributions
ory+0.023
ly+0.023
ipop+0.023
les+0.022
ercise+0.022
Neg logit contributions
 never-0.020
Or-0.019
 knocking-0.019
 Ign-0.019
 fact-0.019
Token.
Token ID13
Feature activation+6.27e-01

Pos logit contributions
ory+0.050
ly+0.048
ipop+0.048
ercise+0.047
fy+0.047
Neg logit contributions
 never-0.032
Or-0.030
 knocking-0.030
 sneaking-0.029
 horror-0.029
Token Maybe
Token ID6674
Feature activation-2.07e-01

Pos logit contributions
 never+0.010
Or+0.010
 knocking+0.010
 sneaking+0.010
 fact+0.010
Neg logit contributions
ory-0.016
ly-0.016
ipop-0.016
ercise-0.015
fy-0.015
Token he
Token ID339
Feature activation+1.59e+00

Pos logit contributions
ory+0.005
les+0.005
fy+0.005
ipop+0.005
ly+0.005
Neg logit contributions
 never-0.004
Or-0.004
 knocking-0.004
 zo-0.004
 no-0.004
Token will
Token ID481
Feature activation+1.15e+00

Pos logit contributions
 never+0.026
Or+0.025
 knocking+0.025
 hallway+0.024
 horror+0.024
Neg logit contributions
ory-0.041
ly-0.041
ipop-0.040
les-0.039
ercise-0.039
Token go
Token ID467
Feature activation+1.47e+00

Pos logit contributions
elope+0.022
 ambul+0.022
Far+0.022
 hallway+0.022
Uh+0.022
Neg logit contributions
ory-0.026
 sandwich-0.024
ly-0.024
ipop-0.024
 candy-0.023
Token all
Token ID477
Feature activation-2.59e+00

Pos logit contributions
 horror+0.015
 sneaking+0.015
Or+0.015
elope+0.014
Uh+0.014
Neg logit contributions
ory-0.011
ly-0.011
ercise-0.011
fy-0.010
ipop-0.010
Token ride
Token ID6594
Feature activation-1.76e+00

Pos logit contributions
ly+0.053
ory+0.052
ercise+0.051
les+0.051
ipop+0.051
Neg logit contributions
Or-0.055
 horror-0.054
 sneaking-0.054
 glaring-0.053
 basement-0.052
Token together
Token ID1978
Feature activation-1.11e+00

Pos logit contributions
ory+0.044
ipop+0.043
ly+0.042
fy+0.040
ercise+0.040
Neg logit contributions
 never-0.033
Or-0.031
 knocking-0.030
 zo-0.030
pe-0.030
Token and
Token ID290
Feature activation-3.18e+00

Pos logit contributions
ly+0.024
ory+0.024
ipop+0.023
les+0.023
fy+0.022
Neg logit contributions
Or-0.022
 collapse-0.022
 horror-0.022
Uh-0.022
ric-0.022
Token have
Token ID423
Feature activation-4.21e-01

Pos logit contributions
ly+0.065
ory+0.064
ercise+0.063
ipop+0.062
fy+0.062
Neg logit contributions
Or-0.065
 glaring-0.065
 fact-0.064
ric-0.063
 sneaking-0.063
Token so
Token ID523
Feature activation-1.32e+00

Pos logit contributions
ory+0.008
ipop+0.008
ly+0.008
les+0.007
fy+0.007
Neg logit contributions
 never-0.010
Or-0.010
 knocking-0.010
 known-0.010
 zo-0.009
Token much
Token ID881
Feature activation-1.77e+00

Pos logit contributions
ory+0.031
ly+0.031
ipop+0.030
fy+0.030
ercise+0.029
Neg logit contributions
Or-0.024
 never-0.024
stack-0.024
 knocking-0.024
 fact-0.023
Token fun
Token ID1257
Feature activation-2.04e+00

Pos logit contributions
ory+0.032
ipop+0.030
ly+0.030
ercise+0.028
les+0.028
Neg logit contributions
 never-0.045
Or-0.043
 knocking-0.043
 zo-0.043
 Ign-0.043
Token.
Token ID13
Feature activation-1.39e+00

Pos logit contributions
ly+0.045
ory+0.044
les+0.043
ercise+0.042
ipop+0.041
Neg logit contributions
Or-0.041
 horror-0.040
 collapse-0.040
 never-0.040
ric-0.039
Token Tim
Token ID5045
Feature activation-8.68e-01

Pos logit contributions
ory+0.034
ipop+0.033
ercise+0.032
ly+0.032
ip+0.032
Neg logit contributions
Or-0.023
 sneaking-0.023
 collapse-0.023
 never-0.023
 fact-0.022
Token was
Token ID373
Feature activation+1.73e+00

Pos logit contributions
ory+0.021
les+0.021
ly+0.021
ercise+0.020
ipop+0.020
Neg logit contributions
Or-0.015
 collapse-0.015
stack-0.015
 horror-0.015
 knocking-0.015
Token Tim
Token ID5045
Feature activation-1.27e+00

Pos logit contributions
 never+0.005
Or+0.005
 knocking+0.005
 horror+0.005
 hallway+0.005
Neg logit contributions
ory-0.004
ly-0.004
ipop-0.004
ercise-0.004
les-0.004
Tokenmy
Token ID1820
Feature activation-1.17e+00

Pos logit contributions
ory+0.032
ly+0.031
ipop+0.030
ercise+0.030
fy+0.030
Neg logit contributions
 never-0.023
 knocking-0.021
Or-0.021
 fact-0.021
elope-0.021
Token just
Token ID655
Feature activation+3.03e-01

Pos logit contributions
ory+0.030
ly+0.029
ipop+0.029
ercise+0.028
fy+0.028
Neg logit contributions
 never-0.021
Or-0.019
 knocking-0.018
 fact-0.018
 Ign-0.018
Token wanted
Token ID2227
Feature activation-1.50e+00

Pos logit contributions
 never+0.005
 only+0.004
 knocking+0.004
 no+0.004
 glaring+0.004
Neg logit contributions
ory-0.008
ercise-0.008
les-0.008
ouses-0.008
ade-0.008
Token to
Token ID284
Feature activation+4.79e-01

Pos logit contributions
ory+0.033
les+0.030
ipop+0.030
ly+0.030
fy+0.030
Neg logit contributions
 never-0.036
 knocking-0.034
 saving-0.033
 ambul-0.033
 no-0.033
Token play
Token ID711
Feature activation-2.00e+00

Pos logit contributions
 never+0.010
 knocking+0.009
 collapse+0.009
Or+0.009
 zo+0.009
Neg logit contributions
ory-0.011
ly-0.010
ipop-0.010
les-0.010
ercise-0.010
Token with
Token ID351
Feature activation-8.04e-01

Pos logit contributions
ory+0.059
ipop+0.057
ly+0.056
ercise+0.054
les+0.054
Neg logit contributions
 never-0.028
 knocking-0.026
Or-0.026
 sneaking-0.025
 saving-0.025
Token his
Token ID465
Feature activation-1.20e+00

Pos logit contributions
ory+0.020
ipop+0.019
fy+0.018
ly+0.018
ercise+0.018
Neg logit contributions
 never-0.016
 ambul-0.015
 knocking-0.015
 no-0.015
 saving-0.015
Token toy
Token ID13373
Feature activation-1.63e+00

Pos logit contributions
ory+0.027
ly+0.025
ercise+0.025
les+0.024
fy+0.024
Neg logit contributions
 zo-0.026
 never-0.025
 knocking-0.025
 only-0.025
 ambul-0.025
Token car
Token ID1097
Feature activation-1.49e+00

Pos logit contributions
ory+0.039
ercise+0.036
ipop+0.036
ly+0.036
fy+0.036
Neg logit contributions
 never-0.031
 zo-0.031
Or-0.030
 knocking-0.030
 ambul-0.030
Token.
Token ID13
Feature activation-1.98e+00

Pos logit contributions
ory+0.043
ipop+0.040
ercise+0.040
les+0.040
ly+0.039
Neg logit contributions
 never-0.022
 knocking-0.020
Or-0.020
 zo-0.020
 only-0.019
Token I
Token ID314
Feature activation+1.38e+00

Pos logit contributions
 knocking+0.032
 sneaking+0.032
 hallway+0.032
Or+0.032
 collapse+0.032
Neg logit contributions
ory-0.039
ly-0.039
ipop-0.038
ercise-0.037
fy-0.036
Token know
Token ID760
Feature activation-1.21e+00

Pos logit contributions
 hallway+0.028
 sneaking+0.028
stack+0.027
 knocking+0.027
 collapse+0.027
Neg logit contributions
ory-0.032
ipop-0.029
fy-0.028
ade-0.028
ercise-0.028
Token you
Token ID345
Feature activation+4.97e-01

Pos logit contributions
ory+0.024
ly+0.023
ipop+0.023
ercise+0.022
fy+0.022
Neg logit contributions
 never-0.028
Or-0.027
 knocking-0.026
 sneaking-0.026
 Ign-0.026
Token didn
Token ID1422
Feature activation-7.79e-01

Pos logit contributions
 never+0.007
 only+0.006
 no+0.006
 known+0.006
Or+0.005
Neg logit contributions
ipop-0.015
ercise-0.015
ly-0.015
shirt-0.014
ory-0.014
Token't
Token ID470
Feature activation-7.88e-01

Pos logit contributions
ory+0.025
ly+0.025
ipop+0.025
ercise+0.024
les+0.024
Neg logit contributions
Or-0.007
 never-0.007
 horror-0.007
elope-0.007
 knocking-0.007
Token mean
Token ID1612
Feature activation-2.12e+00

Pos logit contributions
ory+0.020
ly+0.019
ipop+0.019
ercise+0.019
les+0.019
Neg logit contributions
 never-0.014
Or-0.013
 knocking-0.013
 horror-0.013
 hallway-0.013
Token to
Token ID284
Feature activation+4.78e-01

Pos logit contributions
ory+0.044
ipop+0.044
fy+0.042
shirt+0.041
ercise+0.040
Neg logit contributions
 never-0.049
 knocking-0.046
 Ign-0.045
Or-0.045
 only-0.044
Token break
Token ID2270
Feature activation-1.30e+00

Pos logit contributions
Or+0.009
 never+0.009
 horror+0.009
 hallway+0.009
stack+0.009
Neg logit contributions
ory-0.011
ly-0.011
ipop-0.010
ercise-0.010
les-0.010
Token the
Token ID262
Feature activation+3.23e-01

Pos logit contributions
ory+0.031
ipop+0.030
ly+0.029
fy+0.029
ercise+0.028
Neg logit contributions
 never-0.025
 knocking-0.024
Or-0.024
 ambul-0.024
 zo-0.024
Token belt
Token ID10999
Feature activation+4.20e-02

Pos logit contributions
 never+0.008
 knocking+0.008
Or+0.008
 fact+0.008
 hallway+0.008
Neg logit contributions
ory-0.006
ly-0.006
ipop-0.006
ercise-0.006
fy-0.005
Token.
Token ID13
Feature activation+4.49e-01

Pos logit contributions
Or+0.001
ric+0.001
 never+0.001
elope+0.001
 fact+0.001
Neg logit contributions
ly-0.001
ory-0.001
ipop-0.001
les-0.001
fy-0.001
Token her
Token ID607
Feature activation-1.31e+00

Pos logit contributions
ipop+0.072
ory+0.071
fy+0.070
ercise+0.070
uddle+0.069
Neg logit contributions
 never-0.040
 only-0.036
 no-0.034
 known-0.032
 knocking-0.031
Token a
Token ID257
Feature activation-3.40e+00

Pos logit contributions
ory+0.051
ly+0.050
les+0.048
actus+0.047
ipop+0.047
Neg logit contributions
 never-0.011
 no-0.010
 only-0.008
 known-0.007
 fore-0.006
Token hug
Token ID16225
Feature activation-2.16e+00

Pos logit contributions
ory+0.082
ly+0.079
les+0.076
ercise+0.073
 veggies+0.070
Neg logit contributions
 never-0.068
 zo-0.066
 known-0.064
 Ign-0.063
 knocking-0.062
Token.
Token ID13
Feature activation-1.24e+00

Pos logit contributions
ory+0.061
ercise+0.057
ipop+0.057
ly+0.057
fy+0.057
Neg logit contributions
 never-0.033
Or-0.030
 knocking-0.028
 only-0.028
 no-0.028
Token<|endoftext|>
Token ID50256
Feature activation+2.06e+00

Pos logit contributions
ory+0.038
ly+0.037
ipop+0.036
ercise+0.036
les+0.035
Neg logit contributions
 never-0.015
Or-0.015
 knocking-0.014
 horror-0.014
 Ign-0.014
TokenOnce
Token ID7454
Feature activation-4.48e+00

Pos logit contributions
Or+0.039
 ambul+0.037
 horror+0.036
 collapse+0.036
 knocking+0.035
Neg logit contributions
ory-0.050
les-0.048
ipop-0.047
ly-0.047
 sandwich-0.046
Token upon
Token ID2402
Feature activation-3.17e-01

Pos logit contributions
ercise+0.108
shirt+0.103
ly+0.102
fy+0.099
 fries+0.099
Neg logit contributions
 never-0.089
 no-0.082
 only-0.081
 horror-0.080
 knocking-0.080
Token a
Token ID257
Feature activation-2.08e+00

Pos logit contributions
ory+0.010
ly+0.009
ipop+0.009
ercise+0.009
les+0.009
Neg logit contributions
 never-0.004
Or-0.004
 knocking-0.004
 Ign-0.004
 sneaking-0.004
Token time
Token ID640
Feature activation+2.08e-01

Pos logit contributions
ory+0.046
ly+0.044
fy+0.044
ipop+0.043
 sandwich+0.043
Neg logit contributions
stack-0.043
Or-0.042
 ambul-0.042
ric-0.042
 knocking-0.042
Token,
Token ID11
Feature activation-1.01e+00

Pos logit contributions
Or+0.005
 known+0.005
ric+0.005
stack+0.005
 hallway+0.005
Neg logit contributions
ipop-0.004
ory-0.004
ly-0.004
ls-0.004
ip-0.004
Token there
Token ID612
Feature activation-8.08e-01

Pos logit contributions
ory+0.020
ly+0.020
ipop+0.020
les+0.018
 sandwich+0.018
Neg logit contributions
Or-0.022
 ambul-0.022
 horror-0.022
ric-0.022
 zo-0.022
Tokenmy
Token ID1820
Feature activation-1.57e+00

Pos logit contributions
ory+0.036
ly+0.035
ipop+0.035
ercise+0.034
les+0.034
Neg logit contributions
 never-0.024
 knocking-0.022
Or-0.021
 sneaking-0.021
 Ign-0.021
Token.
Token ID13
Feature activation-1.48e+00

Pos logit contributions
ory+0.042
ly+0.041
ipop+0.041
ercise+0.040
les+0.040
Neg logit contributions
 never-0.025
 knocking-0.023
Or-0.023
 Ign-0.023
 sneaking-0.023
Token She
Token ID1375
Feature activation-3.89e-01

Pos logit contributions
ory+0.046
ly+0.045
ipop+0.044
les+0.043
ercise+0.043
Neg logit contributions
 never-0.018
Or-0.017
 knocking-0.016
 hallway-0.016
 Ign-0.016
Token wanted
Token ID2227
Feature activation-1.12e+00

Pos logit contributions
ory+0.010
ly+0.009
ipop+0.009
ercise+0.009
fy+0.009
Neg logit contributions
 never-0.007
Or-0.007
 knocking-0.007
 horror-0.007
 collapse-0.007
Token to
Token ID284
Feature activation+9.67e-01

Pos logit contributions
ory+0.022
ly+0.021
ipop+0.021
les+0.020
fy+0.020
Neg logit contributions
 never-0.029
 knocking-0.026
 ambul-0.026
 only-0.026
 saving-0.026
Token dress
Token ID6576
Feature activation-5.44e+00

Pos logit contributions
 never+0.021
Or+0.021
 knocking+0.020
 horror+0.020
 Ign+0.020
Neg logit contributions
ory-0.020
ly-0.019
ipop-0.019
ercise-0.018
les-0.018
Token up
Token ID510
Feature activation+5.16e-01

Pos logit contributions
ory+0.157
fy+0.144
ipop+0.143
ercise+0.142
 notebooks+0.141
Neg logit contributions
 knocking-0.075
 never-0.075
 only-0.075
 getting-0.071
 Clar-0.070
Token in
Token ID287
Feature activation+3.75e+00

Pos logit contributions
ric+0.009
 horror+0.009
Or+0.009
 knocking+0.009
 zo+0.009
Neg logit contributions
ory-0.012
ly-0.012
ipop-0.012
fy-0.012
les-0.011
Token her
Token ID607
Feature activation-1.24e+00

Pos logit contributions
Or+0.071
 never+0.070
ric+0.069
 knocking+0.069
pe+0.069
Neg logit contributions
ory-0.086
ly-0.084
ipop-0.081
fy-0.079
les-0.079
Token favorite
Token ID4004
Feature activation-1.44e+00

Pos logit contributions
ercise+0.030
ory+0.030
ly+0.030
les+0.029
ipop+0.029
Neg logit contributions
 hallway-0.023
 basement-0.022
 never-0.022
 apartment-0.022
 fact-0.022
Token dress
Token ID6576
Feature activation-1.92e+00

Pos logit contributions
ory+0.034
ercise+0.033
ipop+0.033
ip+0.032
bs+0.032
Neg logit contributions
 hallway-0.029
 fact-0.026
 apartment-0.026
 basement-0.026
 knocking-0.025
Token the
Token ID262
Feature activation-1.36e-01

Pos logit contributions
ory+0.028
ly+0.028
ipop+0.027
fy+0.026
les+0.026
Neg logit contributions
 never-0.024
Or-0.024
 knocking-0.023
 Ign-0.023
 fact-0.023
Token right
Token ID826
Feature activation+1.79e+00

Pos logit contributions
ly+0.003
ory+0.003
ercise+0.003
 sandwich+0.003
shirt+0.003
Neg logit contributions
Or-0.002
ric-0.002
elope-0.002
stack-0.002
 never-0.002
Token thing
Token ID1517
Feature activation-9.69e-01

Pos logit contributions
Or+0.045
stack+0.044
ric+0.044
 never+0.044
elope+0.043
Neg logit contributions
ly-0.033
ory-0.030
ipop-0.029
les-0.029
night-0.027
Token.
Token ID13
Feature activation-9.86e-01

Pos logit contributions
ory+0.023
ly+0.022
ipop+0.022
ercise+0.022
les+0.021
Neg logit contributions
 never-0.018
Or-0.017
 knocking-0.017
 sneaking-0.017
 Ign-0.017
Token<|endoftext|>
Token ID50256
Feature activation+2.26e+00

Pos logit contributions
ory+0.027
ly+0.026
ipop+0.026
ercise+0.025
fy+0.025
Neg logit contributions
Or-0.014
 never-0.014
 knocking-0.014
 sneaking-0.014
 hallway-0.014
TokenOnce
Token ID7454
Feature activation-4.50e+00

Pos logit contributions
Or+0.042
 ambul+0.039
 horror+0.038
 collapse+0.038
 knocking+0.038
Neg logit contributions
ory-0.056
les-0.054
ly-0.053
ipop-0.053
 sandwich-0.051
Token upon
Token ID2402
Feature activation-4.64e-01

Pos logit contributions
ercise+0.107
shirt+0.102
ly+0.101
fy+0.099
ory+0.099
Neg logit contributions
 never-0.090
 no-0.082
 only-0.082
 knocking-0.081
 horror-0.081
Token a
Token ID257
Feature activation-2.19e+00

Pos logit contributions
ory+0.014
ly+0.013
ipop+0.013
ercise+0.013
les+0.013
Neg logit contributions
 never-0.006
Or-0.006
 knocking-0.006
 Ign-0.006
 sneaking-0.006
Token time
Token ID640
Feature activation+1.28e-01

Pos logit contributions
ory+0.047
ly+0.045
fy+0.045
 sandwich+0.044
ipop+0.043
Neg logit contributions
stack-0.048
Or-0.046
 ambul-0.046
 zo-0.046
ric-0.045
Token,
Token ID11
Feature activation-9.68e-01

Pos logit contributions
Or+0.003
 known+0.003
 zo+0.003
ric+0.003
 hallway+0.003
Neg logit contributions
ipop-0.002
ory-0.002
ly-0.002
ls-0.002
ip-0.002
Token there
Token ID612
Feature activation-7.70e-01

Pos logit contributions
ory+0.018
ly+0.018
ipop+0.018
 sandwich+0.016
les+0.016
Neg logit contributions
Or-0.022
 ambul-0.022
ric-0.022
 horror-0.022
 zo-0.021
Token They
Token ID1119
Feature activation+1.21e-02

Pos logit contributions
ory+0.002
ly+0.002
ipop+0.002
shirt+0.002
ip+0.002
Neg logit contributions
 never-0.002
 sneaking-0.002
pe-0.002
 collapse-0.002
stack-0.002
Token did
Token ID750
Feature activation-9.70e-01

Pos logit contributions
 never+0.000
Or+0.000
 knocking+0.000
 horror+0.000
 hallway+0.000
Neg logit contributions
ory-0.000
ly-0.000
ipop-0.000
ercise-0.000
les-0.000
Token not
Token ID407
Feature activation-1.56e+00

Pos logit contributions
ory+0.034
ly+0.033
ipop+0.033
ercise+0.032
les+0.032
Neg logit contributions
 never-0.007
Or-0.007
 knocking-0.007
 hallway-0.007
 horror-0.007
Token want
Token ID765
Feature activation-1.72e+00

Pos logit contributions
ory+0.038
ipop+0.038
ly+0.037
ercise+0.037
les+0.036
Neg logit contributions
 never-0.029
 knocking-0.028
Or-0.027
 known-0.026
 zo-0.026
Token to
Token ID284
Feature activation+1.04e+00

Pos logit contributions
ory+0.035
ipop+0.034
ly+0.034
ercise+0.034
fy+0.033
Neg logit contributions
 never-0.040
 knocking-0.038
 only-0.038
 zo-0.037
 Ign-0.037
Token dress
Token ID6576
Feature activation-5.16e+00

Pos logit contributions
 never+0.026
Or+0.026
 knocking+0.025
 horror+0.025
 Ign+0.025
Neg logit contributions
ory-0.018
ly-0.018
ipop-0.017
ercise-0.017
les-0.016
Token.
Token ID13
Feature activation-3.15e-01

Pos logit contributions
ory+0.119
ipop+0.116
ercise+0.114
 salad+0.114
fy+0.111
Neg logit contributions
 never-0.097
 knocking-0.096
 only-0.094
 ambul-0.094
 turning-0.091
Token\n
Token ID198
Feature activation-3.45e-01

Pos logit contributions
ory+0.008
ly+0.007
ipop+0.007
shirt+0.007
ip+0.007
Neg logit contributions
 never-0.006
 sneaking-0.005
 collapse-0.005
pe-0.005
 knocking-0.005
Token\n
Token ID198
Feature activation+3.53e-01

Pos logit contributions
ory+0.008
ly+0.008
ipop+0.008
les+0.008
ip+0.008
Neg logit contributions
Or-0.006
 sneaking-0.006
 fact-0.006
stack-0.006
 ambul-0.006
Token"
Token ID1
Feature activation+1.31e+00

Pos logit contributions
stack+0.006
 horror+0.006
 knocking+0.006
 sneaking+0.006
ric+0.006
Neg logit contributions
ory-0.009
ly-0.008
fy-0.008
ercise-0.008
ip-0.008
TokenBen
Token ID11696
Feature activation+8.13e-01

Pos logit contributions
 never+0.019
 knocking+0.019
Or+0.018
ric+0.018
 horror+0.018
Neg logit contributions
ory-0.037
ly-0.036
ipop-0.035
ercise-0.034
les-0.034
Token to
Token ID284
Feature activation+6.63e-01

Pos logit contributions
ipop+0.002
ly+0.002
les+0.002
ercise+0.002
ory+0.002
Neg logit contributions
 never-0.002
 knocking-0.002
 saving-0.002
 turning-0.002
 Ign-0.002
Token the
Token ID262
Feature activation-4.05e-03

Pos logit contributions
 never+0.010
 known+0.009
 knocking+0.009
Or+0.009
 zo+0.009
Neg logit contributions
ipop-0.020
ly-0.019
les-0.019
ory-0.019
fy-0.019
Token shop
Token ID6128
Feature activation+1.76e+00

Pos logit contributions
ory+0.000
ly+0.000
ipop+0.000
les+0.000
ercise+0.000
Neg logit contributions
 never-0.000
Or-0.000
 knocking-0.000
 hallway-0.000
 zo-0.000
Token and
Token ID290
Feature activation-2.00e+00

Pos logit contributions
Or+0.035
 horror+0.035
 collapse+0.035
 never+0.035
 fact+0.034
Neg logit contributions
ory-0.040
ly-0.038
ipop-0.037
fy-0.036
les-0.035
Token buy
Token ID2822
Feature activation-2.29e+00

Pos logit contributions
ory+0.043
ly+0.042
ipop+0.041
les+0.040
ercise+0.040
Neg logit contributions
 never-0.042
Or-0.041
 knocking-0.041
 hallway-0.040
 Ign-0.040
Token things
Token ID1243
Feature activation-4.49e+00

Pos logit contributions
ipop+0.064
ory+0.063
fy+0.062
ercise+0.061
ly+0.060
Neg logit contributions
 zo-0.034
 no-0.034
 never-0.034
 ambul-0.034
 only-0.033
Token they
Token ID484
Feature activation-6.70e-01

Pos logit contributions
ory+0.131
ipop+0.128
ercise+0.126
ly+0.126
les+0.125
Neg logit contributions
 never-0.062
 only-0.056
 saving-0.056
Or-0.054
 turning-0.054
Token needed
Token ID2622
Feature activation-6.35e-01

Pos logit contributions
ipop+0.019
ory+0.019
ly+0.019
les+0.018
ercise+0.018
Neg logit contributions
 never-0.011
Or-0.009
 known-0.009
 knocking-0.009
 only-0.009
Token or
Token ID393
Feature activation-2.86e+00

Pos logit contributions
ory+0.014
ly+0.014
ipop+0.014
ercise+0.013
les+0.013
Neg logit contributions
 never-0.013
Or-0.012
 knocking-0.012
 Ign-0.012
 fact-0.012
Token wanted
Token ID2227
Feature activation-1.33e+00

Pos logit contributions
ory+0.059
ipop+0.057
ly+0.056
les+0.056
ercise+0.054
Neg logit contributions
 never-0.066
 knocking-0.063
Or-0.063
 saving-0.061
 ambul-0.061
Token.
Token ID13
Feature activation-1.28e+00

Pos logit contributions
ory+0.027
ipop+0.026
ly+0.026
ercise+0.025
les+0.025
Neg logit contributions
 never-0.031
 knocking-0.029
Or-0.029
 saving-0.028
 sneaking-0.028