TOP ACTIVATIONS
MAX = 7.333

coin had made it all possible . <|endoftext|> Once upon a
license that made it all possible . <|endoftext|> Once upon a
ion that made it all possible . <|endoftext|> Once upon a
goal that made it all possible . <|endoftext|> Once upon a
and determination , anything is possible . The two friends smiled
her friends . Anything was possible when she was sailing out
for all that anything is possible if you just keep believing
in miracles ? Anything is possible if you believe hard enough
and practice , anything is possible . <|endoftext|> James lived in
l icked the cube and oh ! What a taste !
and slides down . " Whe e !" she says .
, which was nice and oh so cool .
of effort , anything is possible . <|endoftext|> Once upon a
... you ... a ... sur prise . It ... is
took a big bite and oh , it tasted so good
worked hard , anything was possible . <|endoftext|> " Mom my
He said that anything was possible , he just had to
and dedication , anything is possible ! <|endoftext|> Once upon a
's what made it all possible . <|endoftext|> Once upon a
and dedication , anything is possible . <|endoftext|> Once there was

MIN ACTIVATIONS
MIN = -7.462

mom and to the ost ric hes . They learned their
should not tease the ost ric hes . They are not
and looked at the ost ric hes . They saw one
and run to the ost ric hes . "
cried . " The ost ric hes are mean ! They
but be careful . Ost ric hes can be gr umpy
Look , mom , ost ric hes !" "
feels even happier and p rou der . He displays his
hurts ," he said , scr unch ing up his face
made his mother even p rou der . <|endoftext|> John and
wanted to touch the ost ric hes . They reached out
shouted . " The ost ric hes are so funny !"
ostr ich pen . Ost ric hes are big birds with
for the first time in years . The old
a flashlight and a magn ifying glass to examine their finds
But the ost ric hes did not like that
ostr ich pen . Ost ric hes are big birds with
pillow many times over the years , but the first time
said , holding the magn ifying glass over a black ant
too close . The ost ric hes might pe ck you

INTERVAL 3.634 - 7.333
CONTAINS 0.113%

, and decided to make one last try . To her
Jack made his way up a hill and stopped
up . She tried something else . She took two pans
voice . " Sur prise !" Grand ma says
Finally he found one that said it was depend

INTERVAL -0.065 - 3.634
CONTAINS 54.485%

, when Jimmy was practicing his songs , he heard something
Emily smiled and then reached out and grabbed the puppy 's
excited again . " Ok , dad ,
and play , but to stay where she could see her
he shouldn 't talk to strangers . Later

INTERVAL -3.764 - -0.065
CONTAINS 45.328%

my cry . He wished he could be big again ,
Tim my . Tim my loved to play with his ball
better and the children felt very proud of themselves . They
tell her it 's okay . They say they will buy
, but she was determined to reach the apples . She

INTERVAL -7.462 - -3.764
CONTAINS 0.074%

walked home with a big smile , feeling happy , brave
They said sorry to their mom and dad . They played
y . Jack closed his eyes and stayed still for a
Katie . She was 3 years old . One day ,
a cheerful little 3 - year - old boy . He

Token coin
Token ID10752
Feature activation+1.17e+00

Pos logit contributions
ising+0.007
ized+0.006
ters+0.006
mun+0.006
lest+0.006
Neg logit contributions
 microwave-0.006
 North-0.006
 alarm-0.005
 stories-0.005
ric-0.005
Token had
Token ID550
Feature activation+2.37e-01

Pos logit contributions
ising+0.032
ized+0.028
ters+0.027
mun+0.027
ging+0.026
Neg logit contributions
 microwave-0.027
 North-0.026
 alarm-0.026
 stories-0.026
ric-0.025
Token made
Token ID925
Feature activation-5.35e-01

Pos logit contributions
ising+0.006
ized+0.005
 members+0.005
ters+0.005
lest+0.005
Neg logit contributions
 microwave-0.006
 North-0.006
 alarm-0.006
ric-0.006
 Friday-0.005
Token it
Token ID340
Feature activation+1.29e+00

Pos logit contributions
 microwave+0.012
 North+0.012
 alarm+0.012
 stories+0.012
 Friday+0.011
Neg logit contributions
ising-0.015
ized-0.013
ters-0.013
mun-0.012
lest-0.012
Token all
Token ID477
Feature activation+1.54e+00

Pos logit contributions
ising+0.035
ized+0.031
ters+0.030
mun+0.029
 members+0.029
Neg logit contributions
 microwave-0.030
 North-0.030
 alarm-0.028
ric-0.028
 Friday-0.028
Token possible
Token ID1744
Feature activation+7.33e+00

Pos logit contributions
ising+0.042
ized+0.037
ters+0.036
mun+0.035
lest+0.035
Neg logit contributions
 microwave-0.036
 North-0.035
 alarm-0.035
 stories-0.034
ric-0.033
Token.
Token ID13
Feature activation+8.00e-01

Pos logit contributions
ising+0.228
ized+0.221
mun+0.210
lest+0.203
ging+0.203
Neg logit contributions
 stories-0.131
 North-0.130
 chocolate-0.121
 microwave-0.121
 alarm-0.120
Token<|endoftext|>
Token ID50256
Feature activation-1.19e+00

Pos logit contributions
ising+0.026
ized+0.024
ters+0.023
mun+0.023
lest+0.022
Neg logit contributions
 microwave-0.014
 North-0.014
 alarm-0.013
 stories-0.013
ric-0.013
TokenOnce
Token ID7454
Feature activation+9.15e-01

Pos logit contributions
 microwave+0.027
 North+0.027
 alarm+0.026
 stories+0.025
ric+0.025
Neg logit contributions
ising-0.033
ized-0.029
ters-0.029
mun-0.028
 members-0.027
Token upon
Token ID2402
Feature activation-1.95e+00

Pos logit contributions
ising+0.024
ized+0.021
ters+0.021
mun+0.020
lest+0.020
Neg logit contributions
 microwave-0.022
 North-0.022
 alarm-0.021
 stories-0.021
ric-0.021
Token a
Token ID257
Feature activation-1.61e+00

Pos logit contributions
 microwave+0.041
 North+0.041
ric+0.039
 alarm+0.039
 Sale+0.038
Neg logit contributions
ising-0.055
ized-0.050
ters-0.049
mun-0.048
 members-0.047
Token license
Token ID5964
Feature activation+7.84e-01

Pos logit contributions
ising+0.015
ized+0.014
ters+0.013
mun+0.013
aping+0.013
Neg logit contributions
 alarm-0.008
 stories-0.008
 microwave-0.008
 North-0.007
 chocolate-0.007
Token that
Token ID326
Feature activation-1.59e+00

Pos logit contributions
ising+0.021
ized+0.019
mun+0.018
ters+0.018
lest+0.017
Neg logit contributions
 microwave-0.018
 North-0.018
 alarm-0.018
 stories-0.018
 Friday-0.017
Token made
Token ID925
Feature activation-1.20e+00

Pos logit contributions
 microwave+0.040
 North+0.039
 alarm+0.038
ric+0.037
 stories+0.037
Neg logit contributions
ising-0.040
ized-0.036
ters-0.035
mun-0.034
 members-0.033
Token it
Token ID340
Feature activation+7.18e-01

Pos logit contributions
 microwave+0.030
 North+0.030
 alarm+0.029
 stories+0.029
ric+0.028
Neg logit contributions
ising-0.030
ized-0.027
ters-0.026
mun-0.025
lest-0.025
Token all
Token ID477
Feature activation+1.00e+00

Pos logit contributions
ising+0.019
ized+0.017
ters+0.017
lest+0.016
 members+0.016
Neg logit contributions
 microwave-0.017
 North-0.016
 alarm-0.016
ric-0.016
 Friday-0.015
Token possible
Token ID1744
Feature activation+6.71e+00

Pos logit contributions
ising+0.028
ized+0.025
ters+0.024
mun+0.024
ging+0.023
Neg logit contributions
 microwave-0.022
 North-0.022
 alarm-0.022
 stories-0.022
ric-0.021
Token.
Token ID13
Feature activation+7.32e-01

Pos logit contributions
ising+0.202
ized+0.195
mun+0.184
ging+0.180
ters+0.179
Neg logit contributions
 stories-0.129
 North-0.124
 alarm-0.119
 microwave-0.115
 chocolate-0.115
Token<|endoftext|>
Token ID50256
Feature activation-1.44e+00

Pos logit contributions
ising+0.016
 members+0.015
mun+0.015
ized+0.015
Gr+0.014
Neg logit contributions
 microwave-0.019
 alarm-0.019
 North-0.018
 stories-0.018
tr-0.018
TokenOnce
Token ID7454
Feature activation+7.63e-01

Pos logit contributions
 microwave+0.033
 North+0.032
 alarm+0.031
 stories+0.031
ric+0.031
Neg logit contributions
ising-0.039
ized-0.035
ters-0.034
mun-0.033
 members-0.033
Token upon
Token ID2402
Feature activation-2.08e+00

Pos logit contributions
ising+0.020
ized+0.018
ters+0.018
mun+0.017
lest+0.017
Neg logit contributions
 microwave-0.018
 North-0.018
 alarm-0.017
 stories-0.017
ric-0.017
Token a
Token ID257
Feature activation-1.85e+00

Pos logit contributions
 microwave+0.044
 North+0.044
ric+0.041
 alarm+0.041
 Sale+0.040
Neg logit contributions
ising-0.060
ized-0.054
ters-0.052
mun-0.052
 members-0.051
Tokenion
Token ID295
Feature activation-4.80e-01

Pos logit contributions
 microwave+0.098
 North+0.097
 alarm+0.094
 stories+0.093
 Friday+0.091
Neg logit contributions
ising-0.100
ized-0.089
ters-0.085
mun-0.084
lest-0.082
Token that
Token ID326
Feature activation-1.24e+00

Pos logit contributions
 microwave+0.010
 alarm+0.010
 North+0.010
 stories+0.010
 fridge+0.009
Neg logit contributions
ising-0.014
ized-0.012
mun-0.012
ters-0.012
Gr-0.012
Token made
Token ID925
Feature activation-8.06e-01

Pos logit contributions
 microwave+0.032
 North+0.032
 alarm+0.031
 stories+0.030
ric+0.030
Neg logit contributions
ising-0.030
ized-0.027
ters-0.026
mun-0.025
lest-0.025
Token it
Token ID340
Feature activation+8.71e-01

Pos logit contributions
 stories+0.015
 microwave+0.014
 alarm+0.014
 North+0.014
 Friday+0.013
Neg logit contributions
ising-0.026
ized-0.023
lest-0.023
mun-0.023
ters-0.023
Token all
Token ID477
Feature activation+1.27e+00

Pos logit contributions
ising+0.023
ized+0.021
ters+0.020
 members+0.019
lest+0.019
Neg logit contributions
 microwave-0.021
 North-0.020
 alarm-0.020
ric-0.019
 Sale-0.019
Token possible
Token ID1744
Feature activation+6.69e+00

Pos logit contributions
ising+0.036
ized+0.032
ters+0.031
mun+0.030
ging+0.030
Neg logit contributions
 microwave-0.028
 North-0.028
 alarm-0.027
 stories-0.027
ric-0.026
Token.
Token ID13
Feature activation+6.27e-01

Pos logit contributions
ising+0.208
ized+0.205
mun+0.193
ging+0.193
Gr+0.190
Neg logit contributions
 stories-0.121
 North-0.116
 alarm-0.107
 microwave-0.104
 chocolate-0.103
Token<|endoftext|>
Token ID50256
Feature activation-1.43e+00

Pos logit contributions
ising+0.016
ized+0.014
 members+0.014
ters+0.014
mun+0.014
Neg logit contributions
 microwave-0.015
 alarm-0.015
 North-0.015
 stories-0.014
tr-0.014
TokenOnce
Token ID7454
Feature activation+7.28e-01

Pos logit contributions
 microwave+0.032
 North+0.031
 alarm+0.030
 stories+0.029
ric+0.029
Neg logit contributions
ising-0.040
ized-0.036
ters-0.035
mun-0.034
lest-0.034
Token upon
Token ID2402
Feature activation-2.02e+00

Pos logit contributions
ising+0.019
ized+0.017
ters+0.016
mun+0.016
lest+0.016
Neg logit contributions
 microwave-0.018
 North-0.017
 alarm-0.017
 stories-0.017
ric-0.016
Token a
Token ID257
Feature activation-1.84e+00

Pos logit contributions
 microwave+0.043
 North+0.042
ric+0.040
 alarm+0.040
 Sale+0.039
Neg logit contributions
ising-0.058
ized-0.052
ters-0.051
mun-0.050
 members-0.049
Token goal
Token ID3061
Feature activation+1.38e+00

Pos logit contributions
ising+0.041
ized+0.035
 members+0.033
ters+0.033
lest+0.032
Neg logit contributions
 microwave-0.046
 North-0.044
ric-0.043
 alarm-0.043
 Sale-0.042
Token that
Token ID326
Feature activation-1.07e+00

Pos logit contributions
ising+0.037
ized+0.033
ters+0.032
mun+0.031
 members+0.031
Neg logit contributions
 microwave-0.032
 North-0.032
 alarm-0.030
 stories-0.030
ric-0.030
Token made
Token ID925
Feature activation-8.63e-01

Pos logit contributions
 North+0.025
 stories+0.024
 alarm+0.024
 microwave+0.024
 Friday+0.023
Neg logit contributions
ising-0.029
ized-0.026
ters-0.025
ging-0.025
mun-0.025
Token it
Token ID340
Feature activation+9.71e-01

Pos logit contributions
 stories+0.019
 alarm+0.018
 North+0.018
 microwave+0.017
 thunder+0.017
Neg logit contributions
ising-0.024
ized-0.022
lest-0.022
mun-0.022
ging-0.021
Token all
Token ID477
Feature activation+1.26e+00

Pos logit contributions
ising+0.026
ized+0.023
ters+0.023
mun+0.022
lest+0.022
Neg logit contributions
 microwave-0.023
 North-0.022
 alarm-0.022
ric-0.021
 stories-0.021
Token possible
Token ID1744
Feature activation+6.66e+00

Pos logit contributions
ising+0.033
ized+0.030
ters+0.029
mun+0.028
ging+0.028
Neg logit contributions
 North-0.029
 stories-0.028
 alarm-0.028
 microwave-0.028
ric-0.028
Token.
Token ID13
Feature activation+5.37e-01

Pos logit contributions
ising+0.197
ized+0.191
mun+0.180
ging+0.180
ters+0.175
Neg logit contributions
 stories-0.130
 North-0.123
 alarm-0.118
 Friday-0.116
 chocolate-0.114
Token<|endoftext|>
Token ID50256
Feature activation-1.42e+00

Pos logit contributions
ising+0.016
ized+0.014
ters+0.014
mun+0.014
 members+0.014
Neg logit contributions
 microwave-0.011
 North-0.011
 alarm-0.011
 stories-0.010
ric-0.010
TokenOnce
Token ID7454
Feature activation+8.02e-01

Pos logit contributions
 microwave+0.032
 North+0.032
 alarm+0.031
 stories+0.030
ric+0.030
Neg logit contributions
ising-0.039
ized-0.035
ters-0.034
mun-0.033
 members-0.033
Token upon
Token ID2402
Feature activation-2.07e+00

Pos logit contributions
ising+0.021
ized+0.018
ters+0.018
mun+0.017
lest+0.017
Neg logit contributions
 microwave-0.020
 North-0.019
 alarm-0.019
 stories-0.018
ric-0.018
Token a
Token ID257
Feature activation-2.06e+00

Pos logit contributions
 microwave+0.044
 North+0.043
 alarm+0.041
ric+0.041
 Sale+0.040
Neg logit contributions
ising-0.059
ized-0.054
ters-0.052
mun-0.051
 members-0.051
Token and
Token ID290
Feature activation-3.96e-01

Pos logit contributions
 microwave+0.054
 North+0.053
 alarm+0.051
ric+0.050
 stories+0.050
Neg logit contributions
ising-0.063
ized-0.056
ters-0.054
mun-0.052
lest-0.052
Token determination
Token ID12123
Feature activation+7.30e-01

Pos logit contributions
 stories+0.008
 North+0.007
 alarm+0.007
 Friday+0.007
 Sale+0.007
Neg logit contributions
ising-0.012
ized-0.011
ters-0.011
mun-0.011
ging-0.011
Token,
Token ID11
Feature activation-5.58e-01

Pos logit contributions
ising+0.021
ized+0.019
ters+0.018
mun+0.018
lest+0.018
Neg logit contributions
 microwave-0.015
 North-0.015
 alarm-0.015
 stories-0.015
ric-0.014
Token anything
Token ID1997
Feature activation+1.49e+00

Pos logit contributions
 microwave+0.013
 North+0.012
 alarm+0.012
 stories+0.012
ric+0.012
Neg logit contributions
ising-0.015
ized-0.014
ters-0.013
mun-0.013
lest-0.013
Token is
Token ID318
Feature activation-5.07e-02

Pos logit contributions
ising+0.049
ized+0.045
ters+0.044
ging+0.044
mun+0.043
Neg logit contributions
 stories-0.024
 North-0.023
 alarm-0.022
 microwave-0.022
ric-0.022
Token possible
Token ID1744
Feature activation+6.65e+00

Pos logit contributions
 microwave+0.001
 North+0.001
 alarm+0.001
ric+0.001
tr+0.001
Neg logit contributions
ising-0.001
ized-0.001
ters-0.001
 members-0.001
lest-0.001
Token.
Token ID13
Feature activation+7.73e-01

Pos logit contributions
ising+0.201
ized+0.190
mun+0.184
ters+0.177
Gr+0.174
Neg logit contributions
 North-0.124
 microwave-0.119
 stories-0.118
 alarm-0.118
 Friday-0.115
Token The
Token ID383
Feature activation-9.55e-01

Pos logit contributions
ising+0.023
ized+0.020
ters+0.020
mun+0.020
 members+0.019
Neg logit contributions
 microwave-0.016
 North-0.016
 alarm-0.015
 stories-0.015
ric-0.014
Token two
Token ID734
Feature activation+1.75e+00

Pos logit contributions
 microwave+0.024
 North+0.024
 alarm+0.023
 stories+0.023
 Friday+0.022
Neg logit contributions
ising-0.024
ized-0.022
ters-0.021
mun-0.020
lest-0.020
Token friends
Token ID2460
Feature activation-2.43e-02

Pos logit contributions
ising+0.035
ging+0.035
Gr+0.034
otte+0.034
ized+0.033
Neg logit contributions
 stories-0.049
 belly-0.046
 eyes-0.043
 s-0.043
 snow-0.043
Token smiled
Token ID13541
Feature activation+6.12e-01

Pos logit contributions
 microwave+0.001
 North+0.001
 alarm+0.001
 stories+0.001
ric+0.000
Neg logit contributions
ising-0.001
ized-0.001
ters-0.001
mun-0.001
ging-0.001
Token her
Token ID607
Feature activation+7.70e-02

Pos logit contributions
ising+0.015
ized+0.013
ters+0.013
mun+0.012
lest+0.012
Neg logit contributions
 microwave-0.014
 North-0.014
 alarm-0.014
 stories-0.014
ric-0.014
Token friends
Token ID2460
Feature activation-9.43e-01

Pos logit contributions
ising+0.002
ized+0.001
ters+0.001
mun+0.001
lest+0.001
Neg logit contributions
 North-0.002
 microwave-0.002
 alarm-0.002
 stories-0.002
ric-0.002
Token.
Token ID13
Feature activation+1.13e+00

Pos logit contributions
 microwave+0.023
 North+0.023
 alarm+0.022
 stories+0.022
ric+0.022
Neg logit contributions
ising-0.024
ized-0.022
ters-0.021
mun-0.020
 members-0.020
Token Anything
Token ID21035
Feature activation+2.73e+00

Pos logit contributions
ising+0.037
ized+0.034
ters+0.033
mun+0.032
 members+0.032
Neg logit contributions
 microwave-0.020
 North-0.020
 alarm-0.019
ric-0.018
 stories-0.018
Token was
Token ID373
Feature activation+1.18e-02

Pos logit contributions
ging+0.093
ising+0.091
Gr+0.088
aping+0.087
ters+0.086
Neg logit contributions
 stories-0.046
 thunder-0.036
 about-0.035
 does-0.034
 Friday-0.034
Token possible
Token ID1744
Feature activation+6.62e+00

Pos logit contributions
ising+0.000
mun+0.000
ized+0.000
ters+0.000
aping+0.000
Neg logit contributions
 North-0.000
 stories-0.000
 alarm-0.000
 steep-0.000
 Friday-0.000
Token when
Token ID618
Feature activation+1.74e+00

Pos logit contributions
ising+0.213
ized+0.207
mun+0.201
Gr+0.193
lest+0.192
Neg logit contributions
 North-0.105
 stories-0.103
 Friday-0.098
 Sunday-0.095
 chocolate-0.094
Token she
Token ID673
Feature activation-6.10e-01

Pos logit contributions
ising+0.045
ized+0.040
ters+0.040
lest+0.039
ging+0.039
Neg logit contributions
 North-0.042
 stories-0.042
 thunder-0.040
 Friday-0.039
 alarm-0.039
Token was
Token ID373
Feature activation+7.54e-01

Pos logit contributions
 microwave+0.015
 North+0.014
 alarm+0.014
 stories+0.014
ric+0.014
Neg logit contributions
ising-0.016
ized-0.014
ters-0.014
mun-0.014
 members-0.013
Token sailing
Token ID28499
Feature activation+1.00e+00

Pos logit contributions
ising+0.020
ized+0.018
ters+0.017
mun+0.017
lest+0.017
Neg logit contributions
 microwave-0.018
 North-0.018
 alarm-0.017
 stories-0.017
ric-0.017
Token out
Token ID503
Feature activation+8.08e-01

Pos logit contributions
ising+0.022
ized+0.020
ters+0.019
ging+0.018
 members+0.018
Neg logit contributions
 microwave-0.029
 North-0.028
 alarm-0.027
ric-0.027
 Sale-0.027
Token for
Token ID329
Feature activation-6.48e-01

Pos logit contributions
 stories+0.001
 North+0.000
 alarm+0.000
 belly+0.000
 Friday+0.000
Neg logit contributions
ising-0.001
ters-0.001
mun-0.001
ized-0.001
ging-0.001
Token all
Token ID477
Feature activation-5.51e-01

Pos logit contributions
 North+0.014
 microwave+0.014
 stories+0.014
 alarm+0.014
 Friday+0.013
Neg logit contributions
ising-0.018
ized-0.017
ters-0.016
mun-0.016
lest-0.016
Token that
Token ID326
Feature activation-3.50e-01

Pos logit contributions
 microwave+0.013
 North+0.013
 alarm+0.013
 stories+0.013
ric+0.012
Neg logit contributions
ising-0.015
ized-0.013
ters-0.013
mun-0.012
lest-0.012
Token anything
Token ID1997
Feature activation+2.78e+00

Pos logit contributions
 stories+0.009
 North+0.009
 alarm+0.009
 microwave+0.009
 Friday+0.009
Neg logit contributions
ising-0.008
ized-0.007
ters-0.007
mun-0.007
aping-0.007
Token is
Token ID318
Feature activation+3.06e-01

Pos logit contributions
ising+0.079
ters+0.071
ging+0.070
ized+0.070
aping+0.069
Neg logit contributions
 stories-0.058
 alarm-0.054
 North-0.054
 microwave-0.054
 Friday-0.053
Token possible
Token ID1744
Feature activation+6.54e+00

Pos logit contributions
ising+0.007
ized+0.006
ters+0.006
 members+0.006
lest+0.006
Neg logit contributions
 microwave-0.009
 North-0.008
 alarm-0.008
ric-0.008
tr-0.008
Token if
Token ID611
Feature activation+2.93e+00

Pos logit contributions
ising+0.198
ized+0.187
mun+0.180
ters+0.174
lest+0.172
Neg logit contributions
 North-0.121
 stories-0.118
 microwave-0.116
 alarm-0.116
 Friday-0.114
Token you
Token ID345
Feature activation+2.52e-01

Pos logit contributions
ising+0.065
ized+0.058
mun+0.056
ters+0.055
ging+0.054
Neg logit contributions
 stories-0.081
 North-0.080
 alarm-0.079
 microwave-0.078
 Friday-0.076
Token just
Token ID655
Feature activation+4.62e-02

Pos logit contributions
ising+0.007
ized+0.006
ters+0.006
 members+0.006
lest+0.006
Neg logit contributions
 microwave-0.006
 North-0.006
 alarm-0.005
tr-0.005
ric-0.005
Token keep
Token ID1394
Feature activation+2.08e+00

Pos logit contributions
ising+0.001
ized+0.001
ters+0.001
lest+0.001
mun+0.001
Neg logit contributions
 microwave-0.001
 North-0.001
 alarm-0.001
 stories-0.001
ric-0.001
Token believing
Token ID14773
Feature activation+2.39e+00

Pos logit contributions
ising+0.055
mun+0.055
lest+0.053
ters+0.053
ized+0.052
Neg logit contributions
 stories-0.045
 North-0.044
 alarm-0.041
 microwave-0.041
 Friday-0.040
Token in
Token ID287
Feature activation+3.56e-01

Pos logit contributions
ising+0.075
ized+0.066
ters+0.065
ging+0.063
 members+0.060
Neg logit contributions
 microwave-0.083
 North-0.080
tr-0.076
ric-0.076
 alarm-0.076
Token miracles
Token ID32904
Feature activation+5.14e-01

Pos logit contributions
ising+0.009
ized+0.008
ters+0.008
ging+0.008
 members+0.007
Neg logit contributions
 microwave-0.009
 North-0.009
tr-0.008
 alarm-0.008
ric-0.008
Token?
Token ID30
Feature activation+2.46e+00

Pos logit contributions
ising+0.014
ters+0.012
ized+0.012
ging+0.011
 members+0.011
Neg logit contributions
 microwave-0.013
 North-0.012
tr-0.012
 fever-0.011
ric-0.011
Token Anything
Token ID21035
Feature activation+2.30e+00

Pos logit contributions
ising+0.077
ized+0.071
mun+0.065
lest+0.065
ging+0.065
Neg logit contributions
 microwave-0.049
 North-0.048
ric-0.044
 Friday-0.044
 alarm-0.044
Token is
Token ID318
Feature activation-1.68e-01

Pos logit contributions
ising+0.072
ized+0.067
ters+0.064
ging+0.063
mun+0.063
Neg logit contributions
 North-0.042
 stories-0.041
 microwave-0.041
 alarm-0.041
ric-0.039
Token possible
Token ID1744
Feature activation+6.39e+00

Pos logit contributions
 microwave+0.004
 North+0.004
 alarm+0.004
 stories+0.004
ric+0.004
Neg logit contributions
ising-0.004
ized-0.004
ters-0.004
mun-0.004
lest-0.004
Token if
Token ID611
Feature activation+3.03e+00

Pos logit contributions
ising+0.185
ized+0.171
mun+0.163
ters+0.161
lest+0.157
Neg logit contributions
 North-0.132
 microwave-0.131
 alarm-0.127
 stories-0.125
 Friday-0.123
Token you
Token ID345
Feature activation+4.24e-01

Pos logit contributions
ising+0.069
ized+0.062
mun+0.059
ters+0.058
ging+0.057
Neg logit contributions
 North-0.082
 stories-0.081
 alarm-0.079
 microwave-0.079
 Friday-0.078
Token believe
Token ID1975
Feature activation+3.24e+00

Pos logit contributions
ising+0.012
ized+0.010
ters+0.010
 members+0.010
lest+0.010
Neg logit contributions
 microwave-0.010
 North-0.009
 alarm-0.009
 chocolate-0.009
 Friday-0.009
Token hard
Token ID1327
Feature activation-2.55e+00

Pos logit contributions
ising+0.081
ized+0.072
ters+0.070
mun+0.068
lest+0.067
Neg logit contributions
 microwave-0.081
 North-0.080
 alarm-0.078
 stories-0.077
ric-0.076
Token enough
Token ID1576
Feature activation-2.83e-01

Pos logit contributions
 microwave+0.065
tr+0.062
 North+0.061
ric+0.061
 alarm+0.060
Neg logit contributions
ising-0.069
ized-0.057
ters-0.057
ging-0.055
 members-0.054
Token and
Token ID290
Feature activation-4.51e-01

Pos logit contributions
 microwave+0.039
 North+0.038
 alarm+0.037
 stories+0.036
ric+0.036
Neg logit contributions
ising-0.050
ized-0.046
ters-0.044
mun-0.043
lest-0.043
Token practice
Token ID3357
Feature activation+5.47e-01

Pos logit contributions
 stories+0.009
 alarm+0.009
 North+0.009
 Friday+0.009
ric+0.009
Neg logit contributions
ising-0.012
ized-0.012
ters-0.012
mun-0.011
Gr-0.011
Token,
Token ID11
Feature activation-1.13e-01

Pos logit contributions
ising+0.015
ized+0.014
ters+0.014
mun+0.014
lest+0.014
Neg logit contributions
 stories-0.011
 North-0.011
 microwave-0.011
 alarm-0.011
ric-0.011
Token anything
Token ID1997
Feature activation+1.86e+00

Pos logit contributions
 stories+0.003
 North+0.003
ric+0.003
 alarm+0.003
 thunder+0.003
Neg logit contributions
ising-0.002
ized-0.002
mun-0.002
Gr-0.002
ging-0.002
Token is
Token ID318
Feature activation-1.24e-01

Pos logit contributions
ising+0.071
ized+0.068
ging+0.067
ters+0.067
aping+0.066
Neg logit contributions
 stories-0.019
 North-0.016
 alarm-0.016
 Friday-0.015
ric-0.015
Token possible
Token ID1744
Feature activation+6.38e+00

Pos logit contributions
 microwave+0.003
 North+0.003
 alarm+0.003
 stories+0.003
ric+0.003
Neg logit contributions
ising-0.003
ized-0.003
ters-0.003
mun-0.003
lest-0.003
Token.
Token ID13
Feature activation+8.88e-01

Pos logit contributions
ising+0.184
ized+0.179
mun+0.172
Gr+0.165
ters+0.163
Neg logit contributions
 North-0.125
 microwave-0.118
 stories-0.118
 alarm-0.117
 Friday-0.117
Token<|endoftext|>
Token ID50256
Feature activation-1.62e+00

Pos logit contributions
ising+0.028
ized+0.026
ging+0.024
ters+0.024
lest+0.024
Neg logit contributions
 North-0.017
 microwave-0.017
ric-0.016
 alarm-0.016
 Friday-0.016
TokenJames
Token ID14731
Feature activation+7.34e-01

Pos logit contributions
 microwave+0.036
 North+0.035
 alarm+0.034
 stories+0.034
ric+0.033
Neg logit contributions
ising-0.045
ized-0.041
ters-0.039
mun-0.038
lest-0.038
Token lived
Token ID5615
Feature activation+5.36e-01

Pos logit contributions
ising+0.017
ized+0.015
 members+0.015
mun+0.015
ey+0.015
Neg logit contributions
 microwave-0.019
ric-0.019
 North-0.018
 Sale-0.018
 seventeen-0.018
Token in
Token ID287
Feature activation+8.13e-01

Pos logit contributions
ising+0.013
lest+0.012
mun+0.011
ters+0.010
ized+0.010
Neg logit contributions
 microwave-0.014
 chocolate-0.014
 North-0.014
 Friday-0.013
 Sunday-0.013
Token l
Token ID300
Feature activation+5.17e-01

Pos logit contributions
 North+0.011
ric+0.011
 microwave+0.010
 alarm+0.010
opl+0.010
Neg logit contributions
ising-0.009
ized-0.008
mun-0.008
 members-0.007
lest-0.007
Tokenicked
Token ID9484
Feature activation-9.04e-01

Pos logit contributions
ising+0.018
ized+0.016
lest+0.016
mun+0.015
ters+0.015
Neg logit contributions
 microwave-0.009
 belly-0.008
 North-0.008
 alarm-0.008
 Sale-0.008
Token the
Token ID262
Feature activation-4.75e-01

Pos logit contributions
 microwave+0.015
 alarm+0.015
 stories+0.015
 Friday+0.014
 chocolate+0.014
Neg logit contributions
ising-0.030
ized-0.027
mun-0.027
ters-0.026
lest-0.026
Token cube
Token ID23441
Feature activation-2.95e-01

Pos logit contributions
 microwave+0.010
 North+0.010
 alarm+0.009
 stories+0.009
ric+0.009
Neg logit contributions
ising-0.014
ized-0.013
ters-0.012
mun-0.012
lest-0.012
Token and
Token ID290
Feature activation-4.07e-01

Pos logit contributions
 microwave+0.007
 North+0.007
 alarm+0.007
ric+0.007
 stories+0.007
Neg logit contributions
ising-0.008
ized-0.007
ters-0.007
lest-0.006
mun-0.006
Token oh
Token ID11752
Feature activation+6.33e+00

Pos logit contributions
 microwave+0.009
 North+0.009
 alarm+0.009
 stories+0.008
ric+0.008
Neg logit contributions
ising-0.011
ized-0.010
ters-0.010
mun-0.010
lest-0.010
Token!
Token ID0
Feature activation+7.47e-02

Pos logit contributions
ising+0.201
ized+0.181
ters+0.180
lest+0.174
 members+0.173
Neg logit contributions
 microwave-0.119
 North-0.117
 alarm-0.109
ric-0.107
 Friday-0.107
Token What
Token ID1867
Feature activation+2.77e+00

Pos logit contributions
ising+0.002
ters+0.002
ized+0.002
mun+0.002
 members+0.002
Neg logit contributions
 North-0.002
 alarm-0.002
 microwave-0.002
 stories-0.002
ric-0.001
Token a
Token ID257
Feature activation+1.44e+00

Pos logit contributions
ising+0.085
ized+0.078
ters+0.075
ging+0.073
mun+0.073
Neg logit contributions
 microwave-0.055
 North-0.054
 alarm-0.052
 stories-0.052
 Friday-0.050
Token taste
Token ID6938
Feature activation-1.81e-01

Pos logit contributions
ising+0.043
ized+0.038
ters+0.036
mun+0.036
 members+0.036
Neg logit contributions
 microwave-0.029
 North-0.029
ric-0.029
 alarm-0.027
 stories-0.027
Token!
Token ID0
Feature activation+1.09e-01

Pos logit contributions
 microwave+0.004
 alarm+0.004
 North+0.004
 stories+0.004
 chocolate+0.004
Neg logit contributions
ising-0.005
ized-0.005
ters-0.005
mun-0.005
Gr-0.005
Token and
Token ID290
Feature activation+8.24e-01

Pos logit contributions
ising+0.058
ized+0.054
ters+0.052
 members+0.049
mun+0.048
Neg logit contributions
 North-0.065
 microwave-0.064
 stories-0.063
 alarm-0.061
ric-0.061
Token slides
Token ID19392
Feature activation+9.33e-01

Pos logit contributions
ising+0.021
ized+0.019
ters+0.018
mun+0.018
 members+0.018
Neg logit contributions
 North-0.020
 microwave-0.020
 alarm-0.019
ric-0.019
 stories-0.018
Token down
Token ID866
Feature activation+1.05e+00

Pos logit contributions
ising+0.017
ized+0.016
ters+0.015
ging+0.015
 further+0.014
Neg logit contributions
 microwave-0.030
 North-0.030
 Sale-0.029
 stories-0.028
 alarm-0.028
Token.
Token ID13
Feature activation-1.81e-01

Pos logit contributions
ising+0.025
ized+0.022
ters+0.022
 further+0.021
 members+0.020
Neg logit contributions
 microwave-0.028
ric-0.027
 North-0.027
tr-0.027
 alarm-0.026
Token "
Token ID366
Feature activation+1.99e+00

Pos logit contributions
 North+0.004
 alarm+0.004
 microwave+0.004
 stories+0.004
ric+0.004
Neg logit contributions
ising-0.005
ters-0.004
ized-0.004
mun-0.004
 members-0.004
TokenWhe
Token ID10842
Feature activation+6.32e+00

Pos logit contributions
ising+0.046
ters+0.041
ized+0.040
mun+0.039
 members+0.038
Neg logit contributions
 microwave-0.053
 North-0.052
 stories-0.051
 alarm-0.051
ric-0.050
Tokene
Token ID68
Feature activation-9.21e-01

Pos logit contributions
ising+0.119
ized+0.106
mun+0.101
 members+0.098
ters+0.095
Neg logit contributions
 North-0.202
 microwave-0.199
 fridge-0.194
 stories-0.191
tr-0.189
Token!"
Token ID2474
Feature activation+4.00e-01

Pos logit contributions
 North+0.021
 microwave+0.021
ric+0.019
 Friday+0.019
ores+0.019
Neg logit contributions
ising-0.027
ters-0.024
ized-0.024
ve-0.022
 members-0.022
Token she
Token ID673
Feature activation-9.42e-01

Pos logit contributions
ising+0.011
ters+0.010
ized+0.010
mun+0.009
 members+0.009
Neg logit contributions
 North-0.009
 microwave-0.009
ric-0.009
 alarm-0.009
tr-0.008
Token says
Token ID1139
Feature activation+1.83e+00

Pos logit contributions
 North+0.025
 microwave+0.025
 alarm+0.024
ric+0.024
 Friday+0.024
Neg logit contributions
ising-0.022
ized-0.019
ters-0.018
mun-0.018
 members-0.018
Token.
Token ID13
Feature activation-1.78e+00

Pos logit contributions
ising+0.049
ized+0.044
ters+0.042
mun+0.042
lest+0.041
Neg logit contributions
 microwave-0.043
 North-0.042
 alarm-0.041
 stories-0.040
ric-0.039
Token,
Token ID11
Feature activation-3.39e-01

Pos logit contributions
 microwave+0.015
 North+0.015
 alarm+0.014
 stories+0.014
ric+0.014
Neg logit contributions
ising-0.017
ized-0.015
ters-0.015
mun-0.015
lest-0.014
Token which
Token ID543
Feature activation+8.94e-01

Pos logit contributions
 microwave+0.009
 North+0.009
 alarm+0.008
 Sale+0.008
 stories+0.008
Neg logit contributions
ising-0.008
ized-0.007
ters-0.007
ging-0.007
mun-0.007
Token was
Token ID373
Feature activation+4.50e-01

Pos logit contributions
ising+0.025
ized+0.023
ters+0.022
mun+0.022
 members+0.022
Neg logit contributions
 microwave-0.019
 North-0.019
 alarm-0.018
ric-0.018
 stories-0.018
Token nice
Token ID3621
Feature activation+3.71e-01

Pos logit contributions
ising+0.011
ters+0.010
ized+0.010
ging+0.010
 further+0.010
Neg logit contributions
 microwave-0.011
 North-0.011
 Sale-0.010
tr-0.010
 Friday-0.010
Token and
Token ID290
Feature activation-4.72e-01

Pos logit contributions
ising+0.011
ized+0.010
ters+0.009
mun+0.009
lest+0.009
Neg logit contributions
 microwave-0.008
 North-0.008
 alarm-0.008
 stories-0.007
ric-0.007
Token oh
Token ID11752
Feature activation+6.27e+00

Pos logit contributions
 North+0.009
 stories+0.009
 microwave+0.009
 alarm+0.009
 steep+0.009
Neg logit contributions
ising-0.014
ized-0.013
mun-0.013
ters-0.013
Gr-0.012
Token so
Token ID523
Feature activation+3.38e-01

Pos logit contributions
ising+0.199
ized+0.180
ters+0.178
lest+0.172
mun+0.171
Neg logit contributions
 microwave-0.118
 North-0.115
 alarm-0.109
 stories-0.107
 Friday-0.106
Token cool
Token ID3608
Feature activation+4.95e-01

Pos logit contributions
ising+0.010
ized+0.010
mun+0.010
ters+0.009
 members+0.009
Neg logit contributions
 microwave-0.006
 North-0.006
 alarm-0.006
 Friday-0.006
 stories-0.006
Token.
Token ID13
Feature activation+4.12e-01

Pos logit contributions
ising+0.012
ters+0.011
ized+0.011
ging+0.010
lest+0.010
Neg logit contributions
 microwave-0.013
 North-0.013
ric-0.012
tr-0.012
 alarm-0.012
Token
Token ID198
Feature activation+7.81e-01

Pos logit contributions
ising+0.012
ized+0.011
ters+0.011
mun+0.011
 members+0.011
Neg logit contributions
 microwave-0.008
 North-0.008
 alarm-0.008
 stories-0.008
ric-0.008
Token
Token ID198
Feature activation+3.76e-01

Pos logit contributions
ising+0.022
ized+0.019
ters+0.019
mun+0.018
 members+0.018
Neg logit contributions
 microwave-0.018
 North-0.017
 alarm-0.017
 stories-0.017
ric-0.016
Token of
Token ID286
Feature activation-3.74e-01

Pos logit contributions
 microwave+0.015
 North+0.014
 alarm+0.014
 stories+0.014
 belly+0.014
Neg logit contributions
ising-0.011
ized-0.010
mun-0.010
ters-0.010
ging-0.009
Token effort
Token ID3626
Feature activation+7.15e-01

Pos logit contributions
 microwave+0.009
 North+0.009
 alarm+0.008
tr+0.008
ric+0.008
Neg logit contributions
ising-0.010
ized-0.009
ters-0.009
mun-0.009
 members-0.008
Token,
Token ID11
Feature activation-2.45e-01

Pos logit contributions
ising+0.019
ized+0.017
ters+0.017
mun+0.016
 members+0.016
Neg logit contributions
 microwave-0.017
 North-0.017
 alarm-0.016
ric-0.016
 stories-0.016
Token anything
Token ID1997
Feature activation+1.57e+00

Pos logit contributions
 North+0.006
 microwave+0.006
 stories+0.005
 alarm+0.005
ric+0.005
Neg logit contributions
ising-0.007
ized-0.006
ters-0.006
mun-0.006
lest-0.005
Token is
Token ID318
Feature activation-1.67e-02

Pos logit contributions
ising+0.046
ized+0.041
ters+0.041
mun+0.040
ging+0.040
Neg logit contributions
 stories-0.031
 microwave-0.031
 North-0.031
 alarm-0.030
 marsh-0.029
Token possible
Token ID1744
Feature activation+6.27e+00

Pos logit contributions
 microwave+0.000
 North+0.000
 alarm+0.000
ric+0.000
tr+0.000
Neg logit contributions
ising-0.000
ized-0.000
ters-0.000
 members-0.000
lest-0.000
Token.
Token ID13
Feature activation+7.58e-01

Pos logit contributions
ising+0.183
ized+0.171
mun+0.165
ters+0.161
Gr+0.158
Neg logit contributions
 North-0.126
 microwave-0.123
 alarm-0.119
 stories-0.119
ric-0.116
Token<|endoftext|>
Token ID50256
Feature activation-1.39e+00

Pos logit contributions
ising+0.020
ized+0.018
ters+0.018
mun+0.017
 members+0.017
Neg logit contributions
 microwave-0.017
 North-0.017
 alarm-0.017
 stories-0.016
tr-0.016
TokenOnce
Token ID7454
Feature activation+7.43e-01

Pos logit contributions
 microwave+0.032
 North+0.032
 alarm+0.031
 stories+0.030
ric+0.030
Neg logit contributions
ising-0.037
ized-0.033
ters-0.033
mun-0.032
 members-0.031
Token upon
Token ID2402
Feature activation-1.93e+00

Pos logit contributions
ising+0.019
ized+0.017
ters+0.016
mun+0.016
 members+0.015
Neg logit contributions
 microwave-0.019
 North-0.018
 alarm-0.018
 stories-0.017
ric-0.017
Token a
Token ID257
Feature activation-1.56e+00

Pos logit contributions
 North+0.042
 microwave+0.042
ric+0.040
 alarm+0.039
 Sale+0.039
Neg logit contributions
ising-0.053
ized-0.048
ters-0.047
mun-0.046
 members-0.046
Token...
Token ID986
Feature activation+2.30e+00

Pos logit contributions
ising+0.082
ters+0.073
ized+0.072
ging+0.070
ey+0.067
Neg logit contributions
 microwave-0.088
 North-0.085
 alarm-0.080
ric-0.080
 Friday-0.079
Tokenyou
Token ID5832
Feature activation+1.14e+00

Pos logit contributions
ising+0.059
ized+0.053
ters+0.052
mun+0.049
 members+0.049
Neg logit contributions
 microwave-0.056
 North-0.055
 alarm-0.054
ric-0.053
 stories-0.053
Token...
Token ID986
Feature activation+2.83e+00

Pos logit contributions
ising+0.032
ters+0.029
 members+0.029
ized+0.028
lest+0.028
Neg logit contributions
 microwave-0.025
 North-0.024
ric-0.023
 alarm-0.023
 snowball-0.022
Tokena
Token ID64
Feature activation+2.02e+00

Pos logit contributions
ising+0.069
ters+0.062
ized+0.062
mun+0.058
 members+0.058
Neg logit contributions
 microwave-0.072
 North-0.071
 alarm-0.069
ric-0.068
 stories-0.068
Token...
Token ID986
Feature activation+2.74e+00

Pos logit contributions
ising+0.053
ters+0.047
ized+0.045
 members+0.045
lest+0.044
Neg logit contributions
 microwave-0.051
 North-0.049
 alarm-0.046
 Friday-0.045
tr-0.045
Tokensur
Token ID11793
Feature activation+6.26e+00

Pos logit contributions
ising+0.068
ters+0.061
ized+0.060
 members+0.057
mun+0.057
Neg logit contributions
 microwave-0.069
 North-0.068
 alarm-0.066
ric-0.065
 stories-0.065
Tokenprise
Token ID7919
Feature activation+1.96e+00

Pos logit contributions
ising+0.114
ized+0.099
ters+0.093
mun+0.090
lest+0.087
Neg logit contributions
 microwave-0.198
 North-0.195
 alarm-0.192
 stories-0.190
 Friday-0.188
Token.
Token ID13
Feature activation+1.58e-01

Pos logit contributions
ising+0.061
ters+0.055
ized+0.054
 members+0.052
ging+0.051
Neg logit contributions
 North-0.037
 microwave-0.037
ric-0.037
 alarm-0.034
 fever-0.033
Token It
Token ID632
Feature activation+5.01e-01

Pos logit contributions
ising+0.005
ized+0.004
ters+0.004
mun+0.004
 members+0.004
Neg logit contributions
 microwave-0.003
 North-0.003
 alarm-0.003
 stories-0.003
ric-0.003
Token...
Token ID986
Feature activation+2.32e+00

Pos logit contributions
ising+0.018
ized+0.016
ters+0.016
mun+0.016
 members+0.016
Neg logit contributions
 microwave-0.007
 North-0.007
 alarm-0.007
ric-0.006
 stories-0.006
Tokenis
Token ID271
Feature activation+1.58e+00

Pos logit contributions
ising+0.059
ters+0.054
ized+0.053
 members+0.050
mun+0.050
Neg logit contributions
 North-0.056
 microwave-0.056
ric-0.054
 alarm-0.053
 stories-0.052
Token took
Token ID1718
Feature activation+1.16e+00

Pos logit contributions
 microwave+0.022
 North+0.021
 alarm+0.021
 stories+0.020
 Friday+0.020
Neg logit contributions
ising-0.023
ized-0.020
ters-0.020
mun-0.019
ging-0.019
Token a
Token ID257
Feature activation+6.46e-01

Pos logit contributions
ising+0.037
ized+0.033
mun+0.032
ters+0.032
lest+0.031
Neg logit contributions
 microwave-0.022
 stories-0.021
 North-0.021
 chocolate-0.021
 alarm-0.021
Token big
Token ID1263
Feature activation-9.37e-01

Pos logit contributions
ising+0.023
ized+0.021
ters+0.021
mun+0.020
 members+0.020
Neg logit contributions
 microwave-0.010
 North-0.009
 alarm-0.009
 stories-0.009
 Friday-0.008
Token bite
Token ID13197
Feature activation-5.66e-01

Pos logit contributions
 North+0.017
ric+0.016
 microwave+0.016
 alarm+0.015
tr+0.015
Neg logit contributions
ising-0.031
ized-0.028
lest-0.027
ters-0.027
ging-0.027
Token and
Token ID290
Feature activation-1.09e+00

Pos logit contributions
 microwave+0.013
 North+0.013
 alarm+0.012
 stories+0.012
ric+0.012
Neg logit contributions
ising-0.016
ized-0.014
ters-0.014
mun-0.013
lest-0.013
Token oh
Token ID11752
Feature activation+6.24e+00

Pos logit contributions
 microwave+0.024
 stories+0.023
 North+0.023
 chocolate+0.023
 alarm+0.022
Neg logit contributions
ising-0.031
ters-0.027
ized-0.027
ging-0.026
mun-0.026
Token,
Token ID11
Feature activation-6.74e-01

Pos logit contributions
ising+0.197
ters+0.177
ized+0.177
lest+0.171
 members+0.169
Neg logit contributions
 microwave-0.118
 North-0.117
 Friday-0.107
 alarm-0.107
ric-0.106
Token it
Token ID340
Feature activation+2.92e-01

Pos logit contributions
 microwave+0.018
 North+0.018
ric+0.018
tr+0.017
 alarm+0.017
Neg logit contributions
ising-0.015
ters-0.013
ized-0.013
lest-0.012
 members-0.012
Token tasted
Token ID29187
Feature activation-9.17e-01

Pos logit contributions
ising+0.008
ized+0.007
ters+0.007
mun+0.007
lest+0.007
Neg logit contributions
 microwave-0.007
 North-0.007
 alarm-0.006
 stories-0.006
ric-0.006
Token so
Token ID523
Feature activation-3.12e-01

Pos logit contributions
 North+0.024
 microwave+0.024
ric+0.023
 trailer+0.023
tr+0.023
Neg logit contributions
ising-0.024
ized-0.021
ters-0.020
 else-0.019
 further-0.018
Token good
Token ID922
Feature activation-2.32e-02

Pos logit contributions
 microwave+0.006
 North+0.006
 alarm+0.006
ric+0.006
tr+0.005
Neg logit contributions
ising-0.010
ized-0.009
ters-0.009
mun-0.008
lest-0.008
Token worked
Token ID3111
Feature activation+1.13e-01

Pos logit contributions
 alarm+0.028
 North+0.027
 microwave+0.027
 stories+0.027
 Friday+0.026
Neg logit contributions
ising-0.043
ized-0.039
ters-0.039
ging-0.038
mun-0.037
Token hard
Token ID1327
Feature activation-2.11e+00

Pos logit contributions
ising+0.004
ized+0.003
ters+0.003
mun+0.003
lest+0.003
Neg logit contributions
 microwave-0.002
 stories-0.002
 North-0.002
 alarm-0.002
 Friday-0.002
Token,
Token ID11
Feature activation-1.11e+00

Pos logit contributions
 microwave+0.046
 North+0.045
 alarm+0.044
ric+0.043
 stories+0.043
Neg logit contributions
ising-0.060
ized-0.054
ters-0.053
mun-0.051
lest-0.050
Token anything
Token ID1997
Feature activation+1.37e+00

Pos logit contributions
 North+0.030
 stories+0.030
 microwave+0.029
 alarm+0.029
 Friday+0.028
Neg logit contributions
ising-0.025
ized-0.022
ters-0.021
ging-0.021
mun-0.021
Token was
Token ID373
Feature activation-1.68e-01

Pos logit contributions
ising+0.042
ized+0.038
ging+0.038
ters+0.038
aping+0.037
Neg logit contributions
 stories-0.026
 alarm-0.024
 North-0.023
 microwave-0.023
 Friday-0.023
Token possible
Token ID1744
Feature activation+6.23e+00

Pos logit contributions
 microwave+0.004
 North+0.004
 alarm+0.004
 stories+0.004
 Friday+0.004
Neg logit contributions
ising-0.005
ized-0.004
ters-0.004
mun-0.004
lest-0.004
Token.
Token ID13
Feature activation+5.16e-01

Pos logit contributions
ising+0.182
ized+0.172
mun+0.167
ters+0.160
lest+0.159
Neg logit contributions
 North-0.121
 microwave-0.115
 stories-0.114
 alarm-0.114
 Friday-0.113
Token<|endoftext|>
Token ID50256
Feature activation-1.39e+00

Pos logit contributions
ising+0.014
ized+0.012
ters+0.012
mun+0.012
 members+0.012
Neg logit contributions
 microwave-0.012
 North-0.012
 alarm-0.012
 stories-0.011
ric-0.011
Token"
Token ID1
Feature activation+1.51e+00

Pos logit contributions
 microwave+0.032
 North+0.031
 alarm+0.030
 stories+0.030
ric+0.029
Neg logit contributions
ising-0.038
ized-0.034
ters-0.033
mun-0.032
 members-0.032
TokenMom
Token ID29252
Feature activation-1.54e+00

Pos logit contributions
ising+0.041
ized+0.036
ters+0.035
mun+0.034
lest+0.034
Neg logit contributions
 microwave-0.035
 North-0.034
 alarm-0.033
 stories-0.033
ric-0.032
Tokenmy
Token ID1820
Feature activation-1.58e+00

Pos logit contributions
 North+0.050
 steep+0.049
 gag+0.049
ores+0.048
tr+0.048
Neg logit contributions
ising-0.025
ters-0.024
 members-0.022
aping-0.020
 by-0.019
Token He
Token ID679
Feature activation-1.20e-01

Pos logit contributions
ising+0.036
ized+0.034
ters+0.033
mun+0.032
ging+0.032
Neg logit contributions
 microwave-0.015
 North-0.014
 alarm-0.013
 stories-0.013
 Friday-0.013
Token said
Token ID531
Feature activation+1.23e+00

Pos logit contributions
 microwave+0.003
 North+0.003
ric+0.003
 alarm+0.003
ores+0.003
Neg logit contributions
ising-0.003
ized-0.003
 members-0.003
ters-0.003
lest-0.003
Token that
Token ID326
Feature activation-2.17e+00

Pos logit contributions
ising+0.032
ized+0.028
ters+0.027
 members+0.026
lest+0.026
Neg logit contributions
 microwave-0.031
 North-0.030
ric-0.030
tr-0.029
 alarm-0.028
Token anything
Token ID1997
Feature activation+1.75e+00

Pos logit contributions
 stories+0.051
 North+0.050
 microwave+0.049
 alarm+0.049
 Friday+0.048
Neg logit contributions
ising-0.061
ized-0.054
ters-0.052
ging-0.051
lest-0.050
Token was
Token ID373
Feature activation+1.26e-01

Pos logit contributions
ising+0.051
ters+0.047
ging+0.047
ized+0.045
aping+0.045
Neg logit contributions
 stories-0.036
 alarm-0.034
 microwave-0.033
 Friday-0.032
 North-0.032
Token possible
Token ID1744
Feature activation+6.23e+00

Pos logit contributions
ising+0.004
ized+0.003
ters+0.003
mun+0.003
 members+0.003
Neg logit contributions
 microwave-0.003
 North-0.003
 alarm-0.003
ric-0.003
 stories-0.003
Token,
Token ID11
Feature activation-1.36e-01

Pos logit contributions
ising+0.193
ized+0.189
mun+0.178
Gr+0.177
ters+0.172
Neg logit contributions
 North-0.110
 stories-0.108
 alarm-0.105
 microwave-0.104
 Friday-0.102
Token he
Token ID339
Feature activation+8.76e-01

Pos logit contributions
 microwave+0.003
 North+0.003
 alarm+0.003
 stories+0.003
ric+0.003
Neg logit contributions
ising-0.003
ized-0.003
ters-0.003
mun-0.003
lest-0.003
Token just
Token ID655
Feature activation+1.28e+00

Pos logit contributions
ising+0.025
ized+0.022
 members+0.022
ters+0.021
lest+0.021
Neg logit contributions
 microwave-0.019
 North-0.019
ric-0.018
 alarm-0.018
 marsh-0.018
Token had
Token ID550
Feature activation+2.85e-01

Pos logit contributions
ising+0.035
ized+0.031
ters+0.030
lest+0.030
 members+0.029
Neg logit contributions
 microwave-0.030
 North-0.029
 alarm-0.028
ric-0.027
 Friday-0.027
Token to
Token ID284
Feature activation+7.70e-01

Pos logit contributions
ising+0.008
ized+0.007
 members+0.007
ters+0.007
 else+0.006
Neg logit contributions
 microwave-0.007
 North-0.007
 alarm-0.006
tr-0.006
ric-0.006
Token and
Token ID290
Feature activation-7.29e-01

Pos logit contributions
 microwave+0.039
 North+0.038
 stories+0.037
 alarm+0.037
 Friday+0.036
Neg logit contributions
ising-0.049
ized-0.045
ters-0.044
mun-0.043
lest-0.042
Token dedication
Token ID22445
Feature activation+4.54e-01

Pos logit contributions
 stories+0.016
 North+0.015
 alarm+0.015
 microwave+0.015
 Friday+0.014
Neg logit contributions
ising-0.020
ized-0.019
ters-0.019
mun-0.019
ging-0.018
Token,
Token ID11
Feature activation-7.10e-01

Pos logit contributions
ising+0.013
ized+0.011
ters+0.011
mun+0.011
lest+0.011
Neg logit contributions
 microwave-0.010
 North-0.010
 alarm-0.010
 stories-0.010
ric-0.009
Token anything
Token ID1997
Feature activation+1.52e+00

Pos logit contributions
 North+0.017
 stories+0.017
 alarm+0.016
 microwave+0.016
 Friday+0.016
Neg logit contributions
ising-0.018
ized-0.017
mun-0.016
Gr-0.016
ging-0.015
Token is
Token ID318
Feature activation-2.91e-01

Pos logit contributions
ising+0.058
ized+0.055
ging+0.054
ters+0.054
Gr+0.053
Neg logit contributions
 stories-0.016
 North-0.014
 alarm-0.014
 Friday-0.013
 microwave-0.013
Token possible
Token ID1744
Feature activation+6.21e+00

Pos logit contributions
 microwave+0.007
 North+0.007
 alarm+0.007
ric+0.007
 stories+0.007
Neg logit contributions
ising-0.007
ized-0.007
ters-0.006
 members-0.006
lest-0.006
Token!
Token ID0
Feature activation+7.86e-01

Pos logit contributions
ising+0.189
ized+0.182
mun+0.178
ters+0.170
Gr+0.169
Neg logit contributions
 North-0.112
 stories-0.107
 alarm-0.106
 Friday-0.104
 microwave-0.104
Token<|endoftext|>
Token ID50256
Feature activation-1.35e+00

Pos logit contributions
ising+0.021
ized+0.019
ters+0.019
 members+0.018
mun+0.018
Neg logit contributions
 microwave-0.018
 North-0.017
 alarm-0.017
 stories-0.017
 marsh-0.017
TokenOnce
Token ID7454
Feature activation+8.71e-01

Pos logit contributions
 microwave+0.030
 North+0.029
 alarm+0.028
 stories+0.028
ric+0.027
Neg logit contributions
ising-0.039
ized-0.035
ters-0.033
mun-0.033
lest-0.032
Token upon
Token ID2402
Feature activation-1.98e+00

Pos logit contributions
ising+0.021
ized+0.019
ters+0.018
mun+0.018
 members+0.018
Neg logit contributions
 microwave-0.023
 North-0.022
 alarm-0.022
 stories-0.021
ric-0.021
Token a
Token ID257
Feature activation-1.78e+00

Pos logit contributions
 microwave+0.042
 North+0.041
ric+0.040
 alarm+0.039
 Sale+0.038
Neg logit contributions
ising-0.056
ized-0.051
ters-0.050
mun-0.049
 members-0.048
Token's
Token ID338
Feature activation-5.30e-01

Pos logit contributions
 microwave+0.040
 North+0.040
 alarm+0.039
 stories+0.038
ric+0.037
Neg logit contributions
ising-0.031
ized-0.027
ters-0.026
mun-0.025
lest-0.025
Token what
Token ID644
Feature activation+1.33e+00

Pos logit contributions
 microwave+0.014
 North+0.014
ric+0.013
 seventeen+0.013
 Sale+0.013
Neg logit contributions
ising-0.013
ized-0.011
 members-0.011
ters-0.010
 convince-0.010
Token made
Token ID925
Feature activation-8.95e-01

Pos logit contributions
ising+0.034
ized+0.031
ging+0.030
ters+0.028
bing+0.027
Neg logit contributions
 stories-0.035
 alarm-0.033
 Friday-0.033
 chocolate-0.033
 North-0.032
Token it
Token ID340
Feature activation+1.06e+00

Pos logit contributions
 stories+0.019
 alarm+0.018
 thunder+0.018
 chocolate+0.018
 North+0.017
Neg logit contributions
ising-0.026
ging-0.024
lest-0.023
ized-0.023
Gr-0.023
Token all
Token ID477
Feature activation+7.08e-01

Pos logit contributions
ising+0.029
ized+0.026
ters+0.025
mun+0.025
lest+0.024
Neg logit contributions
 microwave-0.024
 North-0.024
 alarm-0.023
ric-0.023
 stories-0.023
Token possible
Token ID1744
Feature activation+6.19e+00

Pos logit contributions
ising+0.020
ized+0.018
ters+0.017
mun+0.017
ging+0.017
Neg logit contributions
 microwave-0.016
 North-0.015
 alarm-0.015
 stories-0.015
ric-0.014
Token.
Token ID13
Feature activation+3.03e-01

Pos logit contributions
ising+0.185
ized+0.180
mun+0.172
ging+0.169
lest+0.167
Neg logit contributions
 stories-0.119
 North-0.113
 Friday-0.107
 alarm-0.107
 chocolate-0.105
Token<|endoftext|>
Token ID50256
Feature activation-1.55e+00

Pos logit contributions
ising+0.008
ized+0.007
ters+0.007
mun+0.007
 members+0.007
Neg logit contributions
 microwave-0.007
 North-0.007
 alarm-0.007
 stories-0.007
ric-0.007
TokenOnce
Token ID7454
Feature activation+6.39e-01

Pos logit contributions
 microwave+0.037
 North+0.036
 alarm+0.036
 stories+0.035
ric+0.035
Neg logit contributions
ising-0.040
ized-0.035
ters-0.035
mun-0.034
 members-0.034
Token upon
Token ID2402
Feature activation-2.09e+00

Pos logit contributions
ising+0.016
ized+0.014
ters+0.014
mun+0.013
 members+0.013
Neg logit contributions
 microwave-0.016
 North-0.016
 alarm-0.015
 stories-0.015
ric-0.015
Token a
Token ID257
Feature activation-1.73e+00

Pos logit contributions
 North+0.046
 microwave+0.046
ric+0.044
 alarm+0.043
 Sale+0.043
Neg logit contributions
ising-0.057
ized-0.051
ters-0.050
mun-0.049
 members-0.048
Token and
Token ID290
Feature activation-8.29e-01

Pos logit contributions
 microwave+0.034
 stories+0.034
 North+0.033
 alarm+0.033
 Friday+0.033
Neg logit contributions
ising-0.051
ized-0.047
mun-0.047
ters-0.046
Gr-0.044
Token dedication
Token ID22445
Feature activation+5.78e-01

Pos logit contributions
 stories+0.018
 Friday+0.017
 North+0.017
 alarm+0.017
 microwave+0.017
Neg logit contributions
ising-0.022
ters-0.021
mun-0.021
ized-0.021
Gr-0.020
Token,
Token ID11
Feature activation-6.97e-01

Pos logit contributions
ising+0.016
ized+0.015
ters+0.014
mun+0.014
lest+0.014
Neg logit contributions
 microwave-0.012
 North-0.012
 stories-0.012
 alarm-0.012
 Friday-0.011
Token anything
Token ID1997
Feature activation+1.50e+00

Pos logit contributions
 stories+0.016
 Friday+0.015
 North+0.015
 thunder+0.015
 marsh+0.014
Neg logit contributions
ising-0.017
mun-0.017
ging-0.017
ized-0.016
Gr-0.016
Token is
Token ID318
Feature activation-2.48e-01

Pos logit contributions
ising+0.057
ging+0.055
ized+0.054
Gr+0.054
ters+0.053
Neg logit contributions
 stories-0.015
 North-0.012
 steep-0.012
 alarm-0.012
 thunder-0.012
Token possible
Token ID1744
Feature activation+6.18e+00

Pos logit contributions
 microwave+0.007
 North+0.007
 alarm+0.007
ric+0.007
 stories+0.007
Neg logit contributions
ising-0.005
ized-0.005
ters-0.004
lest-0.004
 members-0.004
Token.
Token ID13
Feature activation+8.26e-01

Pos logit contributions
ising+0.187
ized+0.181
mun+0.176
Gr+0.167
ters+0.165
Neg logit contributions
 North-0.110
 stories-0.107
 alarm-0.104
 Friday-0.104
 microwave-0.104
Token<|endoftext|>
Token ID50256
Feature activation-1.50e+00

Pos logit contributions
ising+0.028
ized+0.026
ging+0.024
lest+0.024
ters+0.024
Neg logit contributions
 North-0.014
 microwave-0.014
ric-0.014
 Friday-0.014
ores-0.013
TokenOnce
Token ID7454
Feature activation+7.85e-01

Pos logit contributions
 microwave+0.036
 North+0.035
 alarm+0.035
 stories+0.034
ric+0.034
Neg logit contributions
ising-0.039
ized-0.034
ters-0.034
mun-0.033
 members-0.033
Token there
Token ID612
Feature activation+1.50e+00

Pos logit contributions
ising+0.020
ized+0.018
ters+0.017
mun+0.017
 members+0.017
Neg logit contributions
 microwave-0.019
 North-0.019
 alarm-0.018
 stories-0.018
ric-0.018
Token was
Token ID373
Feature activation-2.22e+00

Pos logit contributions
ising+0.039
ized+0.034
ters+0.033
mun+0.032
lest+0.032
Neg logit contributions
 microwave-0.037
 North-0.037
 alarm-0.036
 stories-0.035
ric-0.035
Token mom
Token ID1995
Feature activation-4.16e+00

Pos logit contributions
 microwave+0.049
 North+0.046
 stories+0.045
tr+0.045
 seventeen+0.044
Neg logit contributions
ising-0.051
ized-0.046
ters-0.046
mun-0.043
 members-0.043
Token and
Token ID290
Feature activation-1.10e+00

Pos logit contributions
 North+0.110
 microwave+0.107
ric+0.104
tr+0.103
 Friday+0.102
Neg logit contributions
ising-0.098
ters-0.090
ized-0.085
 members-0.084
mun-0.084
Token to
Token ID284
Feature activation-8.69e-01

Pos logit contributions
 microwave+0.023
 North+0.022
 alarm+0.021
ric+0.021
 stories+0.021
Neg logit contributions
ising-0.032
ized-0.029
ters-0.028
mun-0.028
 members-0.028
Token the
Token ID262
Feature activation-1.02e+00

Pos logit contributions
 alarm+0.017
 North+0.017
 microwave+0.017
ric+0.016
 stories+0.016
Neg logit contributions
ising-0.027
ized-0.025
ters-0.024
lest-0.024
mun-0.023
Token ost
Token ID23619
Feature activation+1.85e+00

Pos logit contributions
 microwave+0.023
 North+0.022
 Friday+0.021
 stories+0.021
 alarm+0.021
Neg logit contributions
ising-0.029
ized-0.025
ters-0.025
mun-0.024
 members-0.024
Tokenric
Token ID1173
Feature activation-7.46e+00

Pos logit contributions
ising+0.077
ters+0.074
ized+0.073
aping+0.071
mun+0.070
Neg logit contributions
 microwave-0.018
 Friday-0.016
 stories-0.015
 North-0.014
 marsh-0.013
Tokenhes
Token ID956
Feature activation-2.63e+00

Pos logit contributions
 North+0.119
 microwave+0.108
 fever+0.103
 alarm+0.102
 headache+0.101
Neg logit contributions
ising-0.252
ters-0.247
ized-0.241
 members-0.233
mun-0.231
Token.
Token ID13
Feature activation-7.42e-01

Pos logit contributions
 North+0.081
 habitats+0.077
 seventeen+0.077
ric+0.077
 microwave+0.076
Neg logit contributions
ters-0.048
ising-0.046
 members-0.041
ized-0.039
 further-0.037
Token They
Token ID1119
Feature activation-1.75e+00

Pos logit contributions
ah+0.017
ric+0.017
 North+0.016
 microwave+0.016
at+0.016
Neg logit contributions
ising-0.018
ters-0.017
ized-0.017
Gr-0.016
 members-0.016
Token learned
Token ID4499
Feature activation-2.04e+00

Pos logit contributions
 microwave+0.042
 North+0.041
 alarm+0.040
 stories+0.040
ric+0.039
Neg logit contributions
ising-0.046
ized-0.041
ters-0.040
mun-0.038
ging-0.038
Token their
Token ID511
Feature activation-9.22e-01

Pos logit contributions
 microwave+0.044
 North+0.043
ric+0.041
 alarm+0.040
 Friday+0.040
Neg logit contributions
ising-0.058
ized-0.052
ters-0.051
mun-0.049
 members-0.049
Token should
Token ID815
Feature activation-2.01e+00

Pos logit contributions
 microwave+0.005
 North+0.005
 alarm+0.005
 Friday+0.005
 stories+0.005
Neg logit contributions
ising-0.007
ters-0.006
ized-0.006
 members-0.006
lest-0.006
Token not
Token ID407
Feature activation-1.26e+00

Pos logit contributions
 microwave+0.043
 North+0.041
 alarm+0.038
ric+0.038
tr+0.037
Neg logit contributions
ising-0.060
ized-0.052
ters-0.052
 members-0.050
ey-0.050
Token tease
Token ID34691
Feature activation+1.12e+00

Pos logit contributions
 microwave+0.030
 North+0.029
 stories+0.027
 alarm+0.027
ric+0.027
Neg logit contributions
ising-0.034
ized-0.030
ters-0.030
 members-0.029
mun-0.029
Token the
Token ID262
Feature activation-1.72e+00

Pos logit contributions
ising+0.031
ized+0.028
lest+0.026
mun+0.026
 members+0.026
Neg logit contributions
 alarm-0.025
 North-0.025
 stories-0.024
ric-0.024
 microwave-0.024
Token ost
Token ID23619
Feature activation+1.35e+00

Pos logit contributions
 microwave+0.043
 Friday+0.042
ores+0.041
 North+0.040
 stories+0.040
Neg logit contributions
ising-0.042
ters-0.039
mun-0.036
 members-0.035
ized-0.035
Tokenric
Token ID1173
Feature activation-7.32e+00

Pos logit contributions
ising+0.058
ters+0.056
ized+0.054
mun+0.053
 members+0.052
Neg logit contributions
 microwave-0.012
 Friday-0.010
 North-0.009
 stories-0.009
 fever-0.007
Tokenhes
Token ID956
Feature activation-2.44e+00

Pos logit contributions
 North+0.133
 microwave+0.118
 fever+0.118
 headache+0.114
 alarm+0.112
Neg logit contributions
ising-0.235
ters-0.234
ized-0.221
 members-0.215
mun-0.215
Token.
Token ID13
Feature activation-5.42e-01

Pos logit contributions
 North+0.072
ric+0.070
 microwave+0.070
tr+0.067
 seventeen+0.067
Neg logit contributions
ters-0.050
ising-0.050
 members-0.042
ized-0.041
lest-0.039
Token They
Token ID1119
Feature activation-1.87e+00

Pos logit contributions
 microwave+0.010
 stories+0.010
 North+0.009
 alarm+0.009
ah+0.009
Neg logit contributions
ising-0.017
ters-0.016
ized-0.015
aping-0.014
 members-0.014
Token are
Token ID389
Feature activation-4.87e-01

Pos logit contributions
 microwave+0.028
 North+0.027
 alarm+0.027
 stories+0.026
ric+0.026
Neg logit contributions
ising-0.065
ized-0.060
ters-0.058
ging-0.058
mun-0.057
Token not
Token ID407
Feature activation+5.33e-02

Pos logit contributions
 microwave+0.008
 North+0.008
 alarm+0.008
 stories+0.008
ric+0.008
Neg logit contributions
ising-0.016
ized-0.015
ters-0.014
mun-0.014
lest-0.014
Token and
Token ID290
Feature activation-3.26e-01

Pos logit contributions
ising+0.036
lest+0.034
ized+0.033
ters+0.033
Gr+0.033
Neg logit contributions
 alarm-0.025
ric-0.024
 thunder-0.023
 North-0.022
 snow-0.021
Token looked
Token ID3114
Feature activation-2.70e-01

Pos logit contributions
 microwave+0.006
 North+0.006
 alarm+0.006
 stories+0.005
ric+0.005
Neg logit contributions
ising-0.011
ized-0.010
ters-0.009
lest-0.009
ging-0.009
Token at
Token ID379
Feature activation+2.37e+00

Pos logit contributions
 microwave+0.007
 North+0.007
 alarm+0.007
 stories+0.007
 marsh+0.007
Neg logit contributions
ising-0.006
ized-0.005
ters-0.005
mun-0.005
 members-0.005
Token the
Token ID262
Feature activation-1.58e-01

Pos logit contributions
ising+0.067
ized+0.061
ters+0.060
mun+0.057
 members+0.056
Neg logit contributions
 microwave-0.053
 North-0.051
tr-0.048
 stories-0.048
 alarm-0.048
Token ost
Token ID23619
Feature activation+2.43e+00

Pos logit contributions
 microwave+0.004
 Friday+0.004
 North+0.003
tr+0.003
ores+0.003
Neg logit contributions
ising-0.004
ters-0.004
ized-0.004
 members-0.004
mun-0.004
Tokenric
Token ID1173
Feature activation-6.76e+00

Pos logit contributions
ising+0.103
ters+0.097
ized+0.096
aping+0.093
ey+0.093
Neg logit contributions
 microwave-0.025
 Friday-0.020
 stories-0.019
 North-0.018
 fever-0.016
Tokenhes
Token ID956
Feature activation-1.96e+00

Pos logit contributions
 North+0.136
 fever+0.128
 headache+0.127
 fridge+0.120
 microwave+0.118
Neg logit contributions
ters-0.198
ising-0.192
we-0.185
ized-0.185
 members-0.183
Token.
Token ID13
Feature activation+5.15e-02

Pos logit contributions
 North+0.056
 microwave+0.055
ric+0.055
tr+0.051
 Sale+0.051
Neg logit contributions
ising-0.043
ters-0.039
ized-0.037
 members-0.035
ging-0.035
Token They
Token ID1119
Feature activation-9.91e-01

Pos logit contributions
ters+0.001
ising+0.001
ized+0.001
 crit+0.001
 tad+0.001
Neg logit contributions
ah-0.001
opl-0.001
 North-0.001
haw-0.001
ric-0.001
Token saw
Token ID2497
Feature activation-3.78e-01

Pos logit contributions
 microwave+0.024
 North+0.024
 alarm+0.023
 stories+0.022
ric+0.022
Neg logit contributions
ising-0.026
ized-0.023
ters-0.022
mun-0.022
 members-0.022
Token one
Token ID530
Feature activation+4.15e+00

Pos logit contributions
 microwave+0.008
 North+0.008
 Friday+0.008
ric+0.008
 alarm+0.008
Neg logit contributions
ising-0.010
ters-0.010
ized-0.010
mun-0.009
 members-0.009
Token and
Token ID290
Feature activation-3.24e-01

Pos logit contributions
 North+0.026
 microwave+0.026
ric+0.025
 fever+0.025
 alarm+0.024
Neg logit contributions
ising-0.024
ters-0.022
ized-0.021
 members-0.020
 further-0.020
Token run
Token ID1057
Feature activation-5.11e-01

Pos logit contributions
 microwave+0.007
 North+0.006
 alarm+0.006
 stories+0.006
ric+0.006
Neg logit contributions
ising-0.010
ized-0.009
ters-0.009
mun-0.008
 members-0.008
Token to
Token ID284
Feature activation-1.08e+00

Pos logit contributions
 microwave+0.012
 North+0.012
 alarm+0.012
 stories+0.011
ric+0.011
Neg logit contributions
ising-0.014
ized-0.012
ters-0.012
mun-0.011
lest-0.011
Token the
Token ID262
Feature activation-3.06e-01

Pos logit contributions
 microwave+0.024
 North+0.023
 alarm+0.022
 stories+0.022
ric+0.022
Neg logit contributions
ising-0.031
ized-0.028
ters-0.027
mun-0.026
 members-0.026
Token ost
Token ID23619
Feature activation+2.31e+00

Pos logit contributions
ores+0.005
 microwave+0.005
 Friday+0.005
 North+0.005
ric+0.005
Neg logit contributions
ising-0.010
ters-0.009
ized-0.009
 members-0.008
mun-0.008
Tokenric
Token ID1173
Feature activation-6.69e+00

Pos logit contributions
ising+0.093
ters+0.089
ized+0.088
ey+0.087
aping+0.086
Neg logit contributions
 microwave-0.026
 Friday-0.022
 stories-0.020
 fever-0.019
 headache-0.019
Tokenhes
Token ID956
Feature activation-2.09e+00

Pos logit contributions
 North+0.134
 fever+0.132
 headache+0.130
 fridge+0.122
 alarm+0.120
Neg logit contributions
ters-0.193
ising-0.186
 members-0.179
we-0.179
ized-0.178
Token.
Token ID13
Feature activation-3.37e-01

Pos logit contributions
ric+0.068
 North+0.068
 microwave+0.065
 fever+0.064
 habitats+0.064
Neg logit contributions
ters-0.035
ising-0.033
 further-0.028
 members-0.027
ized-0.027
Token
Token ID198
Feature activation+2.69e-01

Pos logit contributions
 North+0.007
 microwave+0.007
 stories+0.007
 alarm+0.007
ric+0.007
Neg logit contributions
ising-0.008
ters-0.008
ized-0.008
mun-0.008
 members-0.007
Token
Token ID198
Feature activation-2.06e-01

Pos logit contributions
ising+0.006
ized+0.005
 members+0.005
ters+0.005
mun+0.005
Neg logit contributions
 microwave-0.007
 alarm-0.007
 North-0.007
 stories-0.007
tr-0.007
Token"
Token ID1
Feature activation+8.25e-01

Pos logit contributions
 microwave+0.005
 North+0.005
 alarm+0.005
 stories+0.005
ric+0.005
Neg logit contributions
ising-0.005
ized-0.005
ters-0.004
mun-0.004
lest-0.004
Token cried
Token ID16896
Feature activation-1.35e+00

Pos logit contributions
ising+0.010
ized+0.009
ters+0.009
mun+0.008
lest+0.008
Neg logit contributions
 microwave-0.012
 alarm-0.011
 North-0.011
 stories-0.011
 belly-0.011
Token.
Token ID13
Feature activation-1.92e+00

Pos logit contributions
 alarm+0.025
 microwave+0.024
 thunder+0.023
 stories+0.023
 chocolate+0.022
Neg logit contributions
ising-0.043
ized-0.041
mun-0.039
lest-0.038
 members-0.038
Token "
Token ID366
Feature activation+7.29e-02

Pos logit contributions
 North+0.054
ric+0.051
opl+0.050
 microwave+0.048
at+0.048
Neg logit contributions
ters-0.042
ising-0.040
ized-0.036
 together-0.036
mun-0.036
TokenThe
Token ID464
Feature activation-1.34e+00

Pos logit contributions
ising+0.002
ized+0.001
ters+0.001
mun+0.001
 members+0.001
Neg logit contributions
 microwave-0.002
 North-0.002
 alarm-0.002
ric-0.002
 stories-0.002
Token ost
Token ID23619
Feature activation+1.66e+00

Pos logit contributions
ores+0.028
 North+0.028
ric+0.028
 Friday+0.028
 microwave+0.027
Neg logit contributions
ising-0.038
ters-0.033
 members-0.031
ized-0.031
mun-0.031
Tokenric
Token ID1173
Feature activation-6.68e+00

Pos logit contributions
ising+0.066
ters+0.065
ized+0.062
ging+0.061
ey+0.061
Neg logit contributions
 microwave-0.018
 Friday-0.018
 North-0.017
 stories-0.017
 Sunday-0.015
Tokenhes
Token ID956
Feature activation-2.58e+00

Pos logit contributions
 North+0.143
 fridge+0.123
 fever+0.122
 microwave+0.119
 Cafe+0.119
Neg logit contributions
ters-0.198
ising-0.195
we-0.184
ized-0.183
mun-0.181
Token are
Token ID389
Feature activation-1.32e+00

Pos logit contributions
 North+0.066
 habitats+0.063
 microwave+0.062
 Cafe+0.060
 dentist+0.059
Neg logit contributions
ising-0.065
ters-0.062
 members-0.057
ized-0.056
lest-0.055
Token mean
Token ID1612
Feature activation-7.00e-01

Pos logit contributions
 microwave+0.033
 North+0.033
 Friday+0.030
 Cafe+0.029
 dentist+0.029
Neg logit contributions
ising-0.034
ters-0.031
 else-0.029
ging-0.029
ized-0.029
Token!
Token ID0
Feature activation-1.05e+00

Pos logit contributions
 microwave+0.018
 North+0.018
ric+0.018
tr+0.017
 Friday+0.017
Neg logit contributions
ising-0.019
ters-0.017
ging-0.015
 members-0.015
lest-0.015
Token They
Token ID1119
Feature activation-1.93e+00

Pos logit contributions
 stories+0.022
 North+0.022
 steep+0.021
 microwave+0.021
 habitats+0.019
Neg logit contributions
ters-0.030
ising-0.028
we-0.026
lest-0.026
aping-0.026
Token but
Token ID475
Feature activation+4.98e-02

Pos logit contributions
 microwave+0.101
 North+0.100
 alarm+0.097
ric+0.096
 stories+0.095
Neg logit contributions
ising-0.061
ized-0.051
ters-0.049
mun-0.047
lest-0.046
Token be
Token ID307
Feature activation-1.84e-01

Pos logit contributions
ising+0.001
ized+0.001
mun+0.001
ging+0.001
ters+0.001
Neg logit contributions
 North-0.001
 alarm-0.001
 microwave-0.001
 stories-0.001
ric-0.001
Token careful
Token ID8161
Feature activation+1.55e+00

Pos logit contributions
 North+0.007
 microwave+0.007
 alarm+0.006
tr+0.006
 Friday+0.006
Neg logit contributions
ising-0.003
ters-0.002
ized-0.002
ging-0.002
 members-0.002
Token.
Token ID13
Feature activation-1.11e+00

Pos logit contributions
ising+0.037
ters+0.034
ized+0.031
ging+0.030
 members+0.030
Neg logit contributions
 microwave-0.043
 North-0.042
tr-0.039
 Friday-0.039
 Sale-0.038
Token Ost
Token ID38919
Feature activation+6.31e-01

Pos logit contributions
 stories+0.022
 microwave+0.022
 alarm+0.021
 North+0.021
 Friday+0.020
Neg logit contributions
ising-0.032
ters-0.030
ized-0.028
ey-0.027
aping-0.027
Tokenric
Token ID1173
Feature activation-6.64e+00

Pos logit contributions
ising+0.025
ters+0.024
ey+0.024
ized+0.023
aping+0.023
Neg logit contributions
 microwave-0.007
 steep-0.006
 headache-0.006
 stories-0.006
 fever-0.006
Tokenhes
Token ID956
Feature activation-2.51e+00

Pos logit contributions
 North+0.138
 fever+0.129
 headache+0.128
 alarm+0.121
 fridge+0.120
Neg logit contributions
ters-0.197
ising-0.189
 members-0.176
mun-0.176
we-0.175
Token can
Token ID460
Feature activation-7.07e-01

Pos logit contributions
 North+0.056
 microwave+0.055
tr+0.053
 dentist+0.052
 habitats+0.051
Neg logit contributions
ising-0.071
ters-0.069
lest-0.062
ey-0.061
 members-0.061
Token be
Token ID307
Feature activation+5.71e-01

Pos logit contributions
 microwave+0.018
 stories+0.017
 North+0.017
 fridge+0.016
 alarm+0.016
Neg logit contributions
ising-0.018
ized-0.016
ters-0.016
lest-0.015
ey-0.015
Token gr
Token ID1036
Feature activation-1.87e-01

Pos logit contributions
ising+0.014
ized+0.013
ters+0.013
lest+0.012
ging+0.012
Neg logit contributions
 microwave-0.014
 North-0.014
 alarm-0.014
 Friday-0.013
 stories-0.013
Tokenumpy
Token ID32152
Feature activation-1.55e+00

Pos logit contributions
 microwave+0.005
 North+0.005
 alarm+0.005
 stories+0.005
 Friday+0.004
Neg logit contributions
ising-0.005
ized-0.004
ters-0.004
 members-0.004
lest-0.004
TokenLook
Token ID8567
Feature activation+1.08e+00

Pos logit contributions
ising+0.012
ters+0.011
ized+0.011
Gr+0.010
mun+0.010
Neg logit contributions
 microwave-0.013
 fever-0.012
 headache-0.012
ores-0.012
 alarm-0.012
Token,
Token ID11
Feature activation-2.32e+00

Pos logit contributions
ising+0.027
ized+0.024
ters+0.023
mun+0.022
ging+0.022
Neg logit contributions
 microwave-0.028
 North-0.027
 alarm-0.026
ric-0.026
 Friday-0.025
Token mom
Token ID1995
Feature activation-2.75e+00

Pos logit contributions
opl+0.067
 fever+0.062
 microwave+0.062
ores+0.060
ws+0.060
Neg logit contributions
 girls-0.046
 else-0.044
ters-0.043
 you-0.041
ter-0.039
Token,
Token ID11
Feature activation-1.48e+00

Pos logit contributions
 microwave+0.093
 fever+0.090
 headache+0.089
 gag+0.088
ores+0.087
Neg logit contributions
ising-0.048
ters-0.045
 members-0.040
ized-0.039
aping-0.036
Token ost
Token ID23619
Feature activation+1.56e+00

Pos logit contributions
 microwave+0.042
opl+0.042
ores+0.041
 fever+0.040
ric+0.039
Neg logit contributions
ising-0.027
ters-0.027
 else-0.026
 girls-0.026
 members-0.024
Tokenric
Token ID1173
Feature activation-6.56e+00

Pos logit contributions
ising+0.063
ters+0.061
ized+0.058
ey+0.058
ging+0.057
Neg logit contributions
 microwave-0.020
 Friday-0.017
 fever-0.014
 North-0.014
 stories-0.014
Tokenhes
Token ID956
Feature activation-2.23e+00

Pos logit contributions
 fever+0.139
 North+0.136
 headache+0.135
 microwave+0.130
omo+0.129
Neg logit contributions
ters-0.192
ising-0.175
 members-0.173
 else-0.173
ve-0.168
Token!"
Token ID2474
Feature activation-1.36e-01

Pos logit contributions
 North+0.061
 microwave+0.061
 dentist+0.058
 habitats+0.057
 fever+0.056
Neg logit contributions
ising-0.055
ters-0.053
ized-0.046
 members-0.045
 else-0.043
Token
Token ID198
Feature activation-1.18e-01

Pos logit contributions
 North+0.004
opl+0.004
 steep+0.004
ah+0.004
 microwave+0.004
Neg logit contributions
ising-0.003
ters-0.002
mun-0.002
ized-0.002
 members-0.002
Token
Token ID198
Feature activation-3.28e-01

Pos logit contributions
 microwave+0.003
 North+0.003
 alarm+0.003
 stories+0.003
ric+0.003
Neg logit contributions
ising-0.003
ized-0.003
ters-0.003
mun-0.003
 members-0.003
Token"
Token ID1
Feature activation+4.99e-01

Pos logit contributions
 microwave+0.011
 North+0.010
ric+0.010
 fault+0.010
 chocolate+0.010
Neg logit contributions
ising-0.006
ized-0.005
ters-0.005
Gr-0.005
 members-0.005
Token feels
Token ID5300
Feature activation-2.09e+00

Pos logit contributions
ising+0.008
mun+0.007
ters+0.007
ized+0.007
 members+0.007
Neg logit contributions
 North-0.008
 microwave-0.008
tr-0.008
ric-0.008
 Friday-0.008
Token even
Token ID772
Feature activation+1.07e+00

Pos logit contributions
 microwave+0.050
 North+0.049
 alarm+0.047
ric+0.046
 stories+0.046
Neg logit contributions
ising-0.056
ized-0.050
ters-0.048
mun-0.047
lest-0.046
Token happier
Token ID23030
Feature activation+4.16e-01

Pos logit contributions
ising+0.031
otte+0.031
ized+0.030
ve+0.029
ters+0.029
Neg logit contributions
 microwave-0.022
 stories-0.021
 snow-0.020
 thunder-0.019
 s-0.019
Token and
Token ID290
Feature activation-1.05e+00

Pos logit contributions
ising+0.011
ized+0.010
ters+0.010
mun+0.010
lest+0.009
Neg logit contributions
 microwave-0.010
 North-0.009
 alarm-0.009
 stories-0.009
ric-0.009
Token p
Token ID279
Feature activation+4.62e-01

Pos logit contributions
 stories+0.022
 listens+0.018
 prayer+0.018
 microwave+0.018
 does+0.018
Neg logit contributions
ising-0.031
Gr-0.031
ters-0.031
ging-0.030
ve-0.029
Tokenrou
Token ID472
Feature activation-6.55e+00

Pos logit contributions
ising+0.011
ized+0.010
ters+0.009
mun+0.009
ging+0.009
Neg logit contributions
 North-0.013
 microwave-0.013
 alarm-0.012
 Friday-0.012
 stories-0.012
Tokender
Token ID1082
Feature activation-8.74e-02

Pos logit contributions
 microwave+0.139
 North+0.136
 alarm+0.132
 stories+0.129
ric+0.127
Neg logit contributions
ising-0.190
ized-0.171
ters-0.166
mun-0.163
 members-0.161
Token.
Token ID13
Feature activation-3.24e-01

Pos logit contributions
 North+0.002
 microwave+0.002
 stories+0.002
 alarm+0.002
 Friday+0.002
Neg logit contributions
ising-0.002
ters-0.002
mun-0.002
Gr-0.002
ized-0.002
Token He
Token ID679
Feature activation-5.35e-01

Pos logit contributions
 alarm+0.007
 North+0.007
 microwave+0.007
 stories+0.006
ric+0.006
Neg logit contributions
ising-0.009
ters-0.008
ized-0.008
mun-0.008
 members-0.008
Token displays
Token ID11298
Feature activation+1.66e+00

Pos logit contributions
 microwave+0.011
 alarm+0.010
 North+0.010
 Friday+0.010
 chocolate+0.010
Neg logit contributions
ising-0.016
ized-0.015
mun-0.014
 members-0.014
ters-0.014
Token his
Token ID465
Feature activation+1.19e+00

Pos logit contributions
ising+0.038
ters+0.035
ized+0.035
 members+0.034
 further+0.032
Neg logit contributions
 microwave-0.044
 North-0.044
tr-0.042
 fever-0.041
ric-0.040
Token hurts
Token ID20406
Feature activation-1.94e+00

Pos logit contributions
 North+0.025
 microwave+0.025
 Cafe+0.024
ores+0.024
ric+0.024
Neg logit contributions
ising-0.033
ized-0.030
ters-0.029
lest-0.029
 members-0.028
Token,"
Token ID553
Feature activation-2.51e-02

Pos logit contributions
 microwave+0.054
ric+0.054
 North+0.052
 habitats+0.051
 Cafe+0.049
Neg logit contributions
ising-0.045
ters-0.042
ized-0.039
lest-0.036
ging-0.035
Token he
Token ID339
Feature activation-1.60e+00

Pos logit contributions
 microwave+0.001
 alarm+0.001
 chocolate+0.001
 North+0.001
 Friday+0.001
Neg logit contributions
ising-0.001
ized-0.001
ters-0.001
mun-0.001
ging-0.001
Token said
Token ID531
Feature activation+8.46e-01

Pos logit contributions
 microwave+0.043
 alarm+0.041
 North+0.041
 stories+0.041
tr+0.039
Neg logit contributions
ising-0.038
ized-0.033
ters-0.032
mun-0.031
ging-0.031
Token,
Token ID11
Feature activation-1.66e+00

Pos logit contributions
ising+0.026
ized+0.024
mun+0.023
ters+0.022
 members+0.022
Neg logit contributions
 alarm-0.016
 microwave-0.016
 stories-0.015
 belly-0.015
 chocolate-0.015
Token scr
Token ID6040
Feature activation-6.46e+00

Pos logit contributions
 North+0.030
 microwave+0.029
 stories+0.027
 Sale+0.027
ric+0.027
Neg logit contributions
ising-0.054
ized-0.050
ters-0.049
ging-0.048
mun-0.047
Tokenunch
Token ID3316
Feature activation-1.90e+00

Pos logit contributions
 microwave+0.177
 North+0.177
 stories+0.171
 alarm+0.171
tr+0.169
Neg logit contributions
ising-0.156
ized-0.138
ters-0.122
aping-0.121
ging-0.116
Tokening
Token ID278
Feature activation-7.63e-01

Pos logit contributions
 North+0.076
 microwave+0.076
 stories+0.074
 Friday+0.073
 alarm+0.073
Neg logit contributions
ising-0.020
ters-0.016
ized-0.014
ging-0.013
mun-0.011
Token up
Token ID510
Feature activation+8.58e-01

Pos logit contributions
 microwave+0.020
 North+0.019
 alarm+0.019
 stories+0.018
ric+0.018
Neg logit contributions
ising-0.019
ized-0.017
ters-0.016
mun-0.016
lest-0.015
Token his
Token ID465
Feature activation-2.06e-01

Pos logit contributions
ising+0.023
ized+0.020
ters+0.020
mun+0.020
 members+0.019
Neg logit contributions
 microwave-0.020
 North-0.020
 alarm-0.019
 stories-0.019
 Friday-0.019
Token face
Token ID1986
Feature activation-3.24e+00

Pos logit contributions
 microwave+0.003
 North+0.003
ric+0.003
 alarm+0.003
 stories+0.003
Neg logit contributions
ising-0.007
ized-0.006
ters-0.006
 members-0.006
lest-0.006
Token made
Token ID925
Feature activation-1.02e+00

Pos logit contributions
 microwave+0.023
 North+0.022
 alarm+0.021
ric+0.021
 stories+0.021
Neg logit contributions
ising-0.024
ized-0.021
ters-0.020
mun-0.020
 members-0.020
Token his
Token ID465
Feature activation+1.96e+00

Pos logit contributions
 microwave+0.021
 stories+0.021
 North+0.021
 alarm+0.020
 chocolate+0.019
Neg logit contributions
ising-0.030
ized-0.027
ters-0.027
lest-0.027
mun-0.026
Token mother
Token ID2802
Feature activation+4.05e-01

Pos logit contributions
ising+0.051
 members+0.043
ized+0.043
 convince+0.041
mun+0.041
Neg logit contributions
 microwave-0.049
 Sale-0.046
ores-0.045
 seventeen-0.045
tr-0.044
Token even
Token ID772
Feature activation+6.36e-01

Pos logit contributions
ising+0.012
ized+0.010
ters+0.010
mun+0.010
ging+0.010
Neg logit contributions
 microwave-0.009
 North-0.009
 alarm-0.008
 stories-0.008
ric-0.008
Token p
Token ID279
Feature activation+7.33e-01

Pos logit contributions
ising+0.018
ized+0.017
ve+0.017
ging+0.016
ters+0.016
Neg logit contributions
 stories-0.014
 chocolate-0.013
 microwave-0.012
 belly-0.012
 snow-0.012
Tokenrou
Token ID472
Feature activation-6.44e+00

Pos logit contributions
ising+0.019
ized+0.017
ters+0.016
mun+0.016
 members+0.016
Neg logit contributions
 microwave-0.018
 North-0.018
 alarm-0.017
 stories-0.017
 Friday-0.017
Tokender
Token ID1082
Feature activation+7.99e-01

Pos logit contributions
 microwave+0.159
 North+0.159
ric+0.150
 alarm+0.150
 stories+0.148
Neg logit contributions
ising-0.173
ized-0.155
ters-0.146
ging-0.141
mun-0.138
Token.
Token ID13
Feature activation+6.73e-01

Pos logit contributions
ising+0.021
ters+0.020
ized+0.020
mun+0.019
Gr+0.019
Neg logit contributions
 North-0.017
 stories-0.016
 microwave-0.016
 alarm-0.016
 Friday-0.016
Token<|endoftext|>
Token ID50256
Feature activation-1.33e+00

Pos logit contributions
ising+0.013
 members+0.012
mun+0.012
ters+0.012
ized+0.012
Neg logit contributions
 alarm-0.018
 fever-0.017
 microwave-0.017
 North-0.017
 stories-0.017
TokenJohn
Token ID7554
Feature activation+1.10e+00

Pos logit contributions
 microwave+0.031
 North+0.030
 alarm+0.029
 stories+0.029
ric+0.028
Neg logit contributions
ising-0.036
ized-0.032
ters-0.032
mun-0.031
 members-0.030
Token and
Token ID290
Feature activation+1.02e+00

Pos logit contributions
ising+0.022
ized+0.020
 members+0.020
ey+0.020
mun+0.019
Neg logit contributions
 microwave-0.031
 Sale-0.030
 seventeen-0.030
 stories-0.030
tr-0.030
Token wanted
Token ID2227
Feature activation-3.02e-01

Pos logit contributions
 microwave+0.036
 North+0.036
ric+0.034
 alarm+0.034
 Friday+0.033
Neg logit contributions
ising-0.047
ized-0.043
ters-0.041
mun-0.041
 members-0.040
Token to
Token ID284
Feature activation-4.39e-01

Pos logit contributions
 microwave+0.006
 North+0.006
ric+0.006
 alarm+0.006
tr+0.006
Neg logit contributions
ising-0.009
ized-0.008
ters-0.008
mun-0.008
 members-0.008
Token touch
Token ID3638
Feature activation+6.32e-01

Pos logit contributions
 North+0.009
 alarm+0.009
 microwave+0.008
 stories+0.008
ric+0.008
Neg logit contributions
ising-0.013
ized-0.012
ters-0.012
ging-0.012
mun-0.012
Token the
Token ID262
Feature activation-6.63e-01

Pos logit contributions
ising+0.016
ters+0.015
ized+0.015
mun+0.014
ging+0.014
Neg logit contributions
 microwave-0.015
 North-0.015
tr-0.014
 Friday-0.014
ric-0.014
Token ost
Token ID23619
Feature activation+2.48e+00

Pos logit contributions
 Friday+0.018
 microwave+0.018
ores+0.017
 North+0.017
 Today+0.016
Neg logit contributions
ising-0.016
ters-0.015
ized-0.014
mun-0.013
 members-0.013
Tokenric
Token ID1173
Feature activation-6.40e+00

Pos logit contributions
ising+0.104
ters+0.100
ized+0.098
ey+0.095
aping+0.095
Neg logit contributions
 microwave-0.024
 Friday-0.019
 North-0.018
 stories-0.018
 fever-0.016
Tokenhes
Token ID956
Feature activation-2.13e+00

Pos logit contributions
 North+0.118
 fever+0.106
 headache+0.104
 microwave+0.102
 alarm+0.100
Neg logit contributions
ising-0.198
ters-0.198
ized-0.191
 members-0.187
mun-0.184
Token.
Token ID13
Feature activation-1.06e-02

Pos logit contributions
 North+0.064
ric+0.064
 microwave+0.062
 fever+0.058
 Sale+0.058
Neg logit contributions
ising-0.043
ters-0.042
ized-0.036
 members-0.035
ging-0.034
Token They
Token ID1119
Feature activation-1.37e+00

Pos logit contributions
 microwave+0.000
 North+0.000
 alarm+0.000
ric+0.000
 stories+0.000
Neg logit contributions
ising-0.000
ized-0.000
ters-0.000
mun-0.000
 members-0.000
Token reached
Token ID4251
Feature activation+2.54e-01

Pos logit contributions
 microwave+0.030
 North+0.029
 alarm+0.028
 stories+0.028
ric+0.028
Neg logit contributions
ising-0.039
ized-0.035
ters-0.034
mun-0.034
lest-0.033
Token out
Token ID503
Feature activation+1.35e+00

Pos logit contributions
ising+0.006
ized+0.005
ters+0.005
mun+0.005
lest+0.005
Neg logit contributions
 microwave-0.007
 North-0.007
 alarm-0.006
 stories-0.006
ric-0.006
Token shouted
Token ID17293
Feature activation+1.03e+00

Pos logit contributions
ising+0.015
ized+0.014
ging+0.013
mun+0.013
ters+0.013
Neg logit contributions
 microwave-0.011
 alarm-0.010
 stories-0.010
 North-0.010
 chocolate-0.010
Token.
Token ID13
Feature activation-1.99e+00

Pos logit contributions
ising+0.038
ized+0.035
mun+0.035
lest+0.033
aping+0.033
Neg logit contributions
 alarm-0.015
 thunder-0.014
 s-0.014
 stories-0.014
ores-0.013
Token "
Token ID366
Feature activation-2.09e-01

Pos logit contributions
 North+0.042
ric+0.041
 microwave+0.041
 stories+0.039
 alarm+0.038
Neg logit contributions
ising-0.055
ters-0.052
ized-0.050
mun-0.049
 members-0.049
TokenThe
Token ID464
Feature activation-1.32e+00

Pos logit contributions
 microwave+0.006
 North+0.006
 alarm+0.006
 stories+0.006
ric+0.006
Neg logit contributions
ising-0.004
ized-0.004
ters-0.004
mun-0.003
ging-0.003
Token ost
Token ID23619
Feature activation+1.92e+00

Pos logit contributions
ores+0.037
 Friday+0.032
 microwave+0.032
icle+0.031
ric+0.031
Neg logit contributions
ising-0.032
ters-0.027
 members-0.026
 crit-0.025
 tad-0.025
Tokenric
Token ID1173
Feature activation-6.31e+00

Pos logit contributions
ising+0.074
ters+0.072
ey+0.070
ized+0.069
aping+0.069
Neg logit contributions
 microwave-0.025
 Friday-0.022
 stories-0.021
 North-0.020
 fever-0.019
Tokenhes
Token ID956
Feature activation-2.35e+00

Pos logit contributions
 North+0.133
 fever+0.125
 headache+0.122
 microwave+0.115
 fridge+0.113
Neg logit contributions
ters-0.184
ising-0.179
 members-0.171
ized-0.169
ve-0.169
Token are
Token ID389
Feature activation-1.32e+00

Pos logit contributions
 North+0.056
 microwave+0.055
 habitats+0.051
 fever+0.051
 dentist+0.050
Neg logit contributions
ising-0.064
ters-0.060
ized-0.057
 members-0.056
lest-0.054
Token so
Token ID523
Feature activation+3.42e-01

Pos logit contributions
 microwave+0.034
 North+0.032
 Friday+0.031
tr+0.030
 Sale+0.030
Neg logit contributions
ising-0.034
ters-0.031
 else-0.028
ized-0.028
ging-0.028
Token funny
Token ID8258
Feature activation+3.70e-01

Pos logit contributions
ising+0.010
ized+0.010
ters+0.009
 else+0.009
lest+0.009
Neg logit contributions
 microwave-0.008
 North-0.007
tr-0.007
ric-0.007
 alarm-0.007
Token!"
Token ID2474
Feature activation+1.66e-01

Pos logit contributions
ising+0.010
ters+0.010
we+0.009
ized+0.009
ging+0.008
Neg logit contributions
 microwave-0.009
 fever-0.009
 North-0.008
tr-0.008
 headache-0.008
Token ostr
Token ID42387
Feature activation+2.12e+00

Pos logit contributions
ising+0.007
mun+0.006
ized+0.006
 members+0.006
ters+0.006
Neg logit contributions
icle-0.004
ores-0.004
 microwave-0.004
ric-0.004
ah-0.004
Tokenich
Token ID488
Feature activation-1.82e-01

Pos logit contributions
ising+0.055
ized+0.049
ters+0.048
ging+0.047
lest+0.047
Neg logit contributions
ric-0.051
 North-0.050
 alarm-0.049
 microwave-0.048
tr-0.047
Token pen
Token ID3112
Feature activation-1.28e+00

Pos logit contributions
 microwave+0.005
 North+0.005
 alarm+0.005
 seventeen+0.005
 Sale+0.005
Neg logit contributions
ising-0.004
ized-0.004
ters-0.004
mun-0.004
 members-0.004
Token.
Token ID13
Feature activation-5.05e-01

Pos logit contributions
 microwave+0.031
 North+0.031
ric+0.030
 alarm+0.030
 Friday+0.029
Neg logit contributions
ising-0.033
ized-0.030
ters-0.029
 members-0.028
mun-0.028
Token Ost
Token ID38919
Feature activation+7.77e-01

Pos logit contributions
 microwave+0.010
ric+0.010
 fever+0.010
 alarm+0.010
 North+0.010
Neg logit contributions
ters-0.013
ising-0.013
ized-0.012
 crit-0.012
Gr-0.011
Tokenric
Token ID1173
Feature activation-6.22e+00

Pos logit contributions
ising+0.031
ey+0.029
ters+0.029
ized+0.029
aping+0.028
Neg logit contributions
 microwave-0.009
 headache-0.009
 fever-0.008
 marsh-0.008
 steep-0.008
Tokenhes
Token ID956
Feature activation-2.43e+00

Pos logit contributions
 North+0.130
 fever+0.121
 headache+0.120
 microwave+0.117
 fridge+0.116
Neg logit contributions
ters-0.180
ising-0.173
mun-0.167
 members-0.166
ized-0.163
Token are
Token ID389
Feature activation-1.17e+00

Pos logit contributions
 microwave+0.056
 North+0.056
tr+0.054
 pend+0.051
 dentist+0.051
Neg logit contributions
ising-0.065
ters-0.063
 members-0.056
ey-0.055
ized-0.055
Token big
Token ID1263
Feature activation+5.35e-02

Pos logit contributions
 microwave+0.029
 Friday+0.028
 North+0.028
 Sale+0.027
icle+0.027
Neg logit contributions
ising-0.028
ters-0.027
 else-0.025
 members-0.024
ized-0.023
Token birds
Token ID10087
Feature activation+1.35e+00

Pos logit contributions
ising+0.001
ters+0.001
ized+0.001
mun+0.001
ging+0.001
Neg logit contributions
 Friday-0.001
 microwave-0.001
ores-0.001
tr-0.001
 Sale-0.001
Token with
Token ID351
Feature activation+1.13e+00

Pos logit contributions
ising+0.034
ized+0.030
ters+0.030
mun+0.028
lest+0.028
Neg logit contributions
 microwave-0.034
 North-0.033
 alarm-0.032
 stories-0.031
 Friday-0.031
Token for
Token ID329
Feature activation-1.60e-01

Pos logit contributions
ising+0.014
ized+0.013
ters+0.012
ging+0.011
 members+0.010
Neg logit contributions
 microwave-0.019
ric-0.019
 Sale-0.019
 North-0.019
 marsh-0.018
Token the
Token ID262
Feature activation-3.06e-01

Pos logit contributions
 stories+0.003
 North+0.003
 alarm+0.003
 microwave+0.003
 chocolate+0.003
Neg logit contributions
ising-0.005
ized-0.005
lest-0.004
ters-0.004
mun-0.004
Token first
Token ID717
Feature activation-5.38e-01

Pos logit contributions
 microwave+0.007
 North+0.007
 stories+0.007
 alarm+0.007
 chocolate+0.007
Neg logit contributions
ising-0.008
ized-0.008
ters-0.007
ging-0.007
mun-0.007
Token time
Token ID640
Feature activation+7.20e-01

Pos logit contributions
 microwave+0.011
 North+0.011
 alarm+0.011
 stories+0.010
 Friday+0.010
Neg logit contributions
ising-0.016
ized-0.015
ters-0.014
mun-0.014
ging-0.013
Token in
Token ID287
Feature activation-3.53e-01

Pos logit contributions
ising+0.019
ized+0.017
ters+0.016
mun+0.016
lest+0.016
Neg logit contributions
 microwave-0.017
 North-0.017
 alarm-0.016
 stories-0.016
 chocolate-0.016
Token years
Token ID812
Feature activation-6.17e+00

Pos logit contributions
 microwave+0.010
tr+0.009
ric+0.009
 seventeen+0.009
 alarm+0.009
Neg logit contributions
ising-0.008
 members-0.007
ized-0.007
ging-0.007
ters-0.006
Token.
Token ID13
Feature activation+1.88e-02

Pos logit contributions
 microwave+0.158
 North+0.152
ric+0.151
 alarm+0.148
tr+0.147
Neg logit contributions
ising-0.154
ized-0.138
ters-0.132
 members-0.127
lest-0.127
Token
Token ID198
Feature activation+4.87e-01

Pos logit contributions
ising+0.000
ized+0.000
ters+0.000
 members+0.000
mun+0.000
Neg logit contributions
 microwave-0.000
 alarm-0.000
 North-0.000
 stories-0.000
ric-0.000
Token
Token ID198
Feature activation+1.37e-01

Pos logit contributions
ising+0.012
ized+0.011
mun+0.011
 members+0.010
ters+0.010
Neg logit contributions
 microwave-0.012
 North-0.012
 alarm-0.012
ric-0.011
 stories-0.011
TokenThe
Token ID464
Feature activation-1.19e+00

Pos logit contributions
ising+0.003
ized+0.003
ters+0.003
mun+0.002
 members+0.002
Neg logit contributions
 microwave-0.004
 North-0.004
 alarm-0.004
ric-0.004
 stories-0.004
Token old
Token ID1468
Feature activation-1.21e+00

Pos logit contributions
 microwave+0.026
 North+0.026
 alarm+0.025
 stories+0.025
 Friday+0.024
Neg logit contributions
ising-0.035
ized-0.032
ters-0.031
mun-0.030
ging-0.030
Token a
Token ID257
Feature activation+1.47e+00

Pos logit contributions
 microwave+0.010
 alarm+0.010
icle+0.010
opl+0.010
ric+0.010
Neg logit contributions
 members-0.008
ising-0.008
 else-0.008
 further-0.008
ters-0.008
Token flashlight
Token ID38371
Feature activation-2.58e-01

Pos logit contributions
ising+0.044
mun+0.039
ized+0.039
 members+0.038
lest+0.037
Neg logit contributions
ric-0.027
 North-0.027
 microwave-0.027
 alarm-0.026
ores-0.026
Token and
Token ID290
Feature activation+8.39e-01

Pos logit contributions
 microwave+0.007
 North+0.007
ric+0.007
 Sale+0.007
 Friday+0.006
Neg logit contributions
ising-0.006
ized-0.005
ters-0.005
 members-0.005
lest-0.005
Token a
Token ID257
Feature activation+4.87e-01

Pos logit contributions
ising+0.024
ized+0.022
ters+0.021
 members+0.021
mun+0.020
Neg logit contributions
 microwave-0.018
 North-0.018
ric-0.017
 alarm-0.017
 Friday-0.016
Token magn
Token ID7842
Feature activation+1.34e+00

Pos logit contributions
ising+0.014
ized+0.013
mun+0.013
ters+0.013
lest+0.013
Neg logit contributions
 North-0.009
ric-0.009
 microwave-0.009
ores-0.009
 Friday-0.009
Tokenifying
Token ID4035
Feature activation-6.13e+00

Pos logit contributions
ising+0.042
ters+0.040
ized+0.039
ging+0.037
 members+0.036
Neg logit contributions
 microwave-0.025
 stories-0.025
 North-0.024
 alarm-0.024
 Friday-0.024
Token glass
Token ID5405
Feature activation-2.28e+00

Pos logit contributions
haw+0.153
urger+0.150
ahs+0.147
 meanings+0.143
allows+0.142
Neg logit contributions
 tad-0.117
 further-0.117
ising-0.116
ized-0.116
 with-0.109
Token to
Token ID284
Feature activation-5.61e-01

Pos logit contributions
 North+0.061
ric+0.060
 microwave+0.059
 alarm+0.057
 Friday+0.056
Neg logit contributions
ising-0.053
ters-0.049
ized-0.046
 members-0.046
lest-0.043
Token examine
Token ID10716
Feature activation+9.96e-01

Pos logit contributions
 microwave+0.013
 North+0.013
 alarm+0.013
ric+0.013
 stories+0.013
Neg logit contributions
ising-0.015
ized-0.013
ters-0.013
mun-0.012
lest-0.012
Token their
Token ID511
Feature activation-5.37e-01

Pos logit contributions
ising+0.029
ized+0.026
mun+0.025
ters+0.025
lest+0.025
Neg logit contributions
 North-0.021
 microwave-0.021
 stories-0.021
 alarm-0.020
 chocolate-0.020
Token finds
Token ID7228
Feature activation+5.23e-01

Pos logit contributions
 microwave+0.012
 North+0.012
 alarm+0.011
ric+0.011
 Friday+0.011
Neg logit contributions
ising-0.015
ized-0.014
ters-0.013
mun-0.013
 members-0.013
Token
Token ID198
Feature activation+1.22e-01

Pos logit contributions
 North+0.010
 microwave+0.010
ric+0.009
 stories+0.009
 alarm+0.009
Neg logit contributions
ising-0.012
ters-0.011
ized-0.011
mun-0.010
 members-0.010
Token
Token ID198
Feature activation-2.90e-01

Pos logit contributions
ising+0.003
ized+0.003
ters+0.003
 members+0.003
mun+0.003
Neg logit contributions
 microwave-0.003
 North-0.003
 alarm-0.003
 stories-0.003
ric-0.003
TokenBut
Token ID1537
Feature activation+1.24e+00

Pos logit contributions
 microwave+0.005
 Friday+0.005
 Sale+0.005
 North+0.005
 alarm+0.005
Neg logit contributions
ising-0.009
ized-0.008
mun-0.008
ters-0.008
lest-0.008
Token the
Token ID262
Feature activation-3.43e-01

Pos logit contributions
ising+0.035
ized+0.031
ters+0.030
mun+0.030
lest+0.029
Neg logit contributions
 microwave-0.028
 North-0.027
 alarm-0.027
 stories-0.026
 Friday-0.026
Token ost
Token ID23619
Feature activation+2.44e+00

Pos logit contributions
 microwave+0.007
 Friday+0.007
ores+0.007
 North+0.007
 stories+0.007
Neg logit contributions
ising-0.009
ters-0.008
 members-0.008
ized-0.008
mun-0.008
Tokenric
Token ID1173
Feature activation-6.11e+00

Pos logit contributions
ising+0.098
ters+0.093
ized+0.092
aping+0.091
 members+0.090
Neg logit contributions
 microwave-0.028
 Friday-0.024
 North-0.023
 stories-0.023
 marsh-0.021
Tokenhes
Token ID956
Feature activation-1.93e+00

Pos logit contributions
 North+0.129
 fever+0.112
 headache+0.112
 microwave+0.111
 fridge+0.108
Neg logit contributions
ters-0.180
ising-0.175
ized-0.173
 members-0.171
 else-0.169
Token did
Token ID750
Feature activation-1.23e+00

Pos logit contributions
 North+0.056
ric+0.053
 microwave+0.053
 stories+0.052
opl+0.050
Neg logit contributions
ising-0.041
ters-0.039
ized-0.037
 members-0.035
lest-0.034
Token not
Token ID407
Feature activation-8.60e-01

Pos logit contributions
 North+0.027
 microwave+0.026
 fridge+0.024
 Sale+0.022
tr+0.022
Neg logit contributions
ising-0.038
ized-0.033
ters-0.033
 else-0.033
ging-0.031
Token like
Token ID588
Feature activation-2.59e+00

Pos logit contributions
 microwave+0.017
 North+0.017
 alarm+0.016
 stories+0.016
ric+0.016
Neg logit contributions
ising-0.026
ized-0.024
ters-0.023
mun-0.023
lest-0.022
Token that
Token ID326
Feature activation-2.83e+00

Pos logit contributions
tr+0.080
 microwave+0.079
 Today+0.079
 spoiler+0.077
opl+0.076
Neg logit contributions
 else-0.044
ters-0.043
ising-0.041
ized-0.041
 further-0.038
Token ostr
Token ID42387
Feature activation+1.95e+00

Pos logit contributions
ising+0.013
ters+0.012
 members+0.012
ized+0.012
mun+0.012
Neg logit contributions
ores-0.009
icle-0.008
ric-0.008
 microwave-0.008
ah-0.008
Tokenich
Token ID488
Feature activation-6.07e-02

Pos logit contributions
ising+0.051
ized+0.045
ters+0.044
ging+0.043
lest+0.043
Neg logit contributions
ric-0.047
 North-0.046
 alarm-0.045
 microwave-0.045
tr-0.044
Token pen
Token ID3112
Feature activation-1.25e+00

Pos logit contributions
 microwave+0.002
 North+0.002
 seventeen+0.002
 alarm+0.002
 marsh+0.002
Neg logit contributions
ising-0.001
ized-0.001
 members-0.001
mun-0.001
ters-0.001
Token.
Token ID13
Feature activation-2.72e-01

Pos logit contributions
 North+0.031
 microwave+0.030
ric+0.030
 alarm+0.029
 stories+0.029
Neg logit contributions
ising-0.031
ized-0.028
ters-0.028
 members-0.027
mun-0.026
Token Ost
Token ID38919
Feature activation+8.45e-01

Pos logit contributions
 North+0.005
ric+0.005
 microwave+0.005
 alarm+0.005
 fever+0.005
Neg logit contributions
ising-0.008
ters-0.008
ized-0.007
 crit-0.007
mun-0.007
Tokenric
Token ID1173
Feature activation-6.11e+00

Pos logit contributions
ising+0.033
ters+0.032
ey+0.032
ized+0.032
aping+0.030
Neg logit contributions
 microwave-0.009
 headache-0.009
 fever-0.009
 marsh-0.009
 stories-0.008
Tokenhes
Token ID956
Feature activation-2.29e+00

Pos logit contributions
 North+0.123
 fever+0.113
 headache+0.112
 microwave+0.111
 fridge+0.108
Neg logit contributions
ters-0.181
ising-0.175
 members-0.170
ized-0.169
mun-0.169
Token are
Token ID389
Feature activation-9.73e-01

Pos logit contributions
 North+0.053
 microwave+0.053
tr+0.049
 Friday+0.048
 chocolate+0.047
Neg logit contributions
ising-0.062
ters-0.058
ized-0.055
 members-0.055
mun-0.054
Token big
Token ID1263
Feature activation+1.09e-01

Pos logit contributions
 microwave+0.023
 Friday+0.022
 North+0.022
 Sale+0.021
 alarm+0.021
Neg logit contributions
ising-0.024
ters-0.023
 else-0.021
 members-0.021
ized-0.021
Token birds
Token ID10087
Feature activation+1.41e+00

Pos logit contributions
ising+0.003
ters+0.003
ized+0.003
mun+0.002
ging+0.002
Neg logit contributions
 microwave-0.003
 Friday-0.003
tr-0.003
 North-0.003
ores-0.002
Token with
Token ID351
Feature activation+1.24e+00

Pos logit contributions
ising+0.036
ized+0.032
ters+0.031
mun+0.030
lest+0.030
Neg logit contributions
 microwave-0.035
 North-0.034
 alarm-0.033
 stories-0.032
 Friday-0.032
Token pillow
Token ID28774
Feature activation-1.29e+00

Pos logit contributions
 North+0.004
 microwave+0.004
ric+0.004
 Sale+0.004
ores+0.004
Neg logit contributions
ising-0.009
ized-0.008
ters-0.008
lest-0.008
 members-0.008
Token many
Token ID867
Feature activation+3.57e-01

Pos logit contributions
 microwave+0.031
 North+0.030
 alarm+0.030
 stories+0.030
tr+0.029
Neg logit contributions
ising-0.033
ized-0.030
mun-0.029
ters-0.029
lest-0.028
Token times
Token ID1661
Feature activation+5.46e-01

Pos logit contributions
ters+0.010
Gr+0.010
mun+0.010
ising+0.010
we+0.010
Neg logit contributions
 stories-0.008
 belly-0.007
 alarm-0.006
 chocolate-0.006
 microwave-0.006
Token over
Token ID625
Feature activation+5.87e-01

Pos logit contributions
ising+0.017
mun+0.016
Gr+0.015
ters+0.015
ized+0.015
Neg logit contributions
 stories-0.009
 microwave-0.009
 alarm-0.009
 North-0.009
 Friday-0.009
Token the
Token ID262
Feature activation-9.03e-01

Pos logit contributions
ising+0.017
ized+0.015
ters+0.015
mun+0.015
lest+0.014
Neg logit contributions
 microwave-0.013
 North-0.013
 alarm-0.012
 stories-0.012
 Friday-0.012
Token years
Token ID812
Feature activation-6.09e+00

Pos logit contributions
ric+0.014
 microwave+0.013
 Sale+0.013
ores+0.013
 North+0.013
Neg logit contributions
ising-0.031
ized-0.028
ters-0.027
 members-0.027
lest-0.026
Token,
Token ID11
Feature activation-4.15e-01

Pos logit contributions
 microwave+0.136
 North+0.133
 alarm+0.129
 stories+0.126
ric+0.126
Neg logit contributions
ising-0.171
ized-0.153
ters-0.149
mun-0.144
lest-0.143
Token but
Token ID475
Feature activation+2.50e+00

Pos logit contributions
 microwave+0.010
 North+0.010
 alarm+0.009
 stories+0.009
ric+0.009
Neg logit contributions
ising-0.011
ized-0.010
ters-0.010
mun-0.009
lest-0.009
Token the
Token ID262
Feature activation+1.79e-01

Pos logit contributions
ising+0.069
mun+0.062
ters+0.062
lest+0.060
ized+0.060
Neg logit contributions
 stories-0.055
 alarm-0.054
 Friday-0.053
 North-0.053
 microwave-0.052
Token first
Token ID717
Feature activation+1.31e+00

Pos logit contributions
ising+0.006
 members+0.005
ized+0.005
ters+0.005
Gr+0.005
Neg logit contributions
ric-0.003
 Sale-0.003
 North-0.003
ores-0.003
 seventeen-0.003
Token time
Token ID640
Feature activation+2.49e+00

Pos logit contributions
ising+0.038
ized+0.033
 members+0.033
lest+0.032
ters+0.032
Neg logit contributions
ric-0.028
 North-0.027
 microwave-0.027
 Sale-0.027
 seventeen-0.026
Token said
Token ID531
Feature activation+1.55e+00

Pos logit contributions
ising+0.005
ging+0.005
ized+0.005
ters+0.005
mun+0.004
Neg logit contributions
 microwave-0.004
 alarm-0.004
 stories-0.004
 chocolate-0.004
tr-0.004
Token,
Token ID11
Feature activation-1.54e+00

Pos logit contributions
ising+0.060
ized+0.056
mun+0.054
Gr+0.051
lves+0.051
Neg logit contributions
 alarm-0.020
 belly-0.017
 stories-0.017
 chocolate-0.017
 cough-0.017
Token holding
Token ID4769
Feature activation+1.37e-01

Pos logit contributions
 alarm+0.039
 microwave+0.039
 North+0.038
 stories+0.037
 chocolate+0.037
Neg logit contributions
ising-0.038
ized-0.033
ters-0.032
mun-0.032
 members-0.031
Token the
Token ID262
Feature activation-3.49e-02

Pos logit contributions
ising+0.004
ized+0.004
lest+0.004
mun+0.004
ters+0.004
Neg logit contributions
 microwave-0.003
 alarm-0.003
 chocolate-0.003
 stories-0.003
 North-0.003
Token magn
Token ID7842
Feature activation+1.22e+00

Pos logit contributions
 North+0.001
 microwave+0.001
ores+0.001
ric+0.001
 Friday+0.001
Neg logit contributions
ising-0.001
ters-0.001
ized-0.001
 members-0.001
lest-0.001
Tokenifying
Token ID4035
Feature activation-6.07e+00

Pos logit contributions
ising+0.043
ized+0.040
ters+0.039
ging+0.037
lest+0.037
Neg logit contributions
 microwave-0.019
 North-0.018
 stories-0.018
 alarm-0.017
 Friday-0.017
Token glass
Token ID5405
Feature activation-2.34e+00

Pos logit contributions
ric+0.128
ahs+0.120
 habitats+0.119
 stories+0.118
haw+0.118
Neg logit contributions
ising-0.151
ters-0.143
ized-0.141
 tad-0.141
 crit-0.135
Token over
Token ID625
Feature activation+5.19e-02

Pos logit contributions
 North+0.062
 microwave+0.061
ric+0.060
 alarm+0.058
 Friday+0.058
Neg logit contributions
ising-0.056
ized-0.050
ters-0.050
 members-0.046
lest-0.046
Token a
Token ID257
Feature activation-5.19e-01

Pos logit contributions
ising+0.002
lest+0.001
ized+0.001
mun+0.001
ters+0.001
Neg logit contributions
 microwave-0.001
 North-0.001
 chocolate-0.001
 alarm-0.001
 stories-0.001
Token black
Token ID2042
Feature activation-2.14e+00

Pos logit contributions
 North+0.011
 microwave+0.011
ric+0.011
ores+0.010
 alarm+0.010
Neg logit contributions
ising-0.015
ized-0.013
ters-0.013
lest-0.013
ging-0.013
Token ant
Token ID1885
Feature activation+7.81e-01

Pos logit contributions
 North+0.049
ric+0.049
ores+0.047
 microwave+0.046
 stories+0.044
Neg logit contributions
ising-0.057
ters-0.053
ging-0.048
ized-0.048
 members-0.048
Token too
Token ID1165
Feature activation-2.16e+00

Pos logit contributions
ising+0.011
ized+0.010
mun+0.010
lest+0.010
ters+0.010
Neg logit contributions
 North-0.009
 alarm-0.009
 microwave-0.009
ric-0.009
 chocolate-0.009
Token close
Token ID1969
Feature activation+1.29e+00

Pos logit contributions
 microwave+0.059
ric+0.057
tr+0.056
 North+0.055
 stories+0.054
Neg logit contributions
ising-0.052
ized-0.046
ters-0.042
 members-0.042
lest-0.042
Token.
Token ID13
Feature activation-6.42e-01

Pos logit contributions
ising+0.032
ized+0.029
ters+0.028
ging+0.027
 further+0.027
Neg logit contributions
 microwave-0.034
tr-0.033
 North-0.031
 seventeen-0.031
 alarm-0.030
Token The
Token ID383
Feature activation-1.15e+00

Pos logit contributions
 microwave+0.013
 North+0.013
 alarm+0.013
ric+0.012
 stories+0.012
Neg logit contributions
ising-0.019
ized-0.017
ters-0.017
mun-0.016
lest-0.016
Token ost
Token ID23619
Feature activation+2.02e+00

Pos logit contributions
 microwave+0.025
ores+0.023
 Friday+0.023
icle+0.023
ah+0.022
Neg logit contributions
ising-0.033
ters-0.030
 members-0.028
ized-0.027
 crit-0.027
Tokenric
Token ID1173
Feature activation-6.07e+00

Pos logit contributions
ising+0.082
ters+0.080
ized+0.076
ey+0.076
 members+0.074
Neg logit contributions
 microwave-0.023
 Friday-0.019
 North-0.018
 stories-0.017
 dentist-0.017
Tokenhes
Token ID956
Feature activation-2.13e+00

Pos logit contributions
 North+0.130
 fever+0.124
 headache+0.120
 fridge+0.114
 microwave+0.112
Neg logit contributions
ters-0.175
ising-0.167
 members-0.162
 else-0.160
ve-0.159
Token might
Token ID1244
Feature activation-1.27e+00

Pos logit contributions
 North+0.052
 microwave+0.052
ric+0.048
 chocolate+0.048
tr+0.048
Neg logit contributions
ising-0.056
ters-0.052
ized-0.049
 members-0.049
lest-0.047
Token pe
Token ID613
Feature activation+2.15e+00

Pos logit contributions
 microwave+0.038
 stories+0.033
 North+0.033
 fridge+0.033
 chocolate+0.033
Neg logit contributions
ising-0.029
 convince-0.024
ters-0.024
 further-0.023
lest-0.023
Tokenck
Token ID694
Feature activation+1.77e+00

Pos logit contributions
ising+0.048
ized+0.042
ters+0.040
mun+0.039
 members+0.038
Neg logit contributions
 microwave-0.060
 North-0.059
 alarm-0.058
ric-0.057
 stories-0.057
Token you
Token ID345
Feature activation-4.17e-01

Pos logit contributions
ising+0.041
ters+0.038
ized+0.035
lest+0.033
ging+0.033
Neg logit contributions
 microwave-0.050
 North-0.048
ric-0.047
 alarm-0.046
tr-0.046
Token,
Token ID11
Feature activation+6.11e-02

Pos logit contributions
 microwave+0.005
 North+0.005
ric+0.004
 alarm+0.004
 stories+0.004
Neg logit contributions
ising-0.005
ized-0.004
ters-0.004
 members-0.004
mun-0.004
Token and
Token ID290
Feature activation+6.59e-01

Pos logit contributions
ising+0.002
ized+0.002
ters+0.001
mun+0.001
lest+0.001
Neg logit contributions
 microwave-0.001
 North-0.001
 alarm-0.001
 stories-0.001
 chocolate-0.001
Token decided
Token ID3066
Feature activation-1.56e-01

Pos logit contributions
ising+0.019
mun+0.017
ging+0.017
ters+0.016
ized+0.016
Neg logit contributions
 stories-0.014
 microwave-0.014
 alarm-0.013
 Friday-0.013
 chocolate-0.013
Token to
Token ID284
Feature activation-1.94e-01

Pos logit contributions
 microwave+0.003
 North+0.003
ric+0.003
 alarm+0.003
 marsh+0.003
Neg logit contributions
ising-0.005
ized-0.004
ters-0.004
 members-0.004
mun-0.004
Token make
Token ID787
Feature activation+5.64e-02

Pos logit contributions
 microwave+0.004
 North+0.004
 alarm+0.004
 stories+0.004
ric+0.004
Neg logit contributions
ising-0.005
ized-0.005
ters-0.005
mun-0.005
ging-0.004
Token one
Token ID530
Feature activation+4.01e+00

Pos logit contributions
ising+0.002
lest+0.002
ized+0.002
ging+0.002
mun+0.002
Neg logit contributions
 stories-0.001
 microwave-0.001
 alarm-0.001
 chocolate-0.001
 North-0.001
Token last
Token ID938
Feature activation+1.39e+00

Pos logit contributions
ising+0.108
ized+0.101
mun+0.099
ging+0.099
ters+0.098
Neg logit contributions
 microwave-0.092
 alarm-0.091
 stories-0.090
 North-0.089
 chocolate-0.087
Token try
Token ID1949
Feature activation-4.27e-01

Pos logit contributions
ising+0.040
ized+0.034
 members+0.034
lest+0.033
ters+0.033
Neg logit contributions
 microwave-0.030
 North-0.030
ric-0.029
 Sale-0.028
ores-0.027
Token.
Token ID13
Feature activation+7.46e-01

Pos logit contributions
 microwave+0.013
 North+0.012
ric+0.012
 alarm+0.012
 marsh+0.011
Neg logit contributions
ising-0.009
ters-0.008
ized-0.008
 members-0.008
 further-0.007
Token To
Token ID1675
Feature activation+1.62e+00

Pos logit contributions
ising+0.020
ters+0.019
ized+0.018
 members+0.018
Gr+0.017
Neg logit contributions
 North-0.017
ric-0.016
 microwave-0.016
 alarm-0.015
 marsh-0.015
Token her
Token ID607
Feature activation+1.73e-01

Pos logit contributions
ising+0.042
ized+0.038
ters+0.037
mun+0.036
ging+0.036
Neg logit contributions
 microwave-0.039
 North-0.038
 alarm-0.038
 stories-0.037
 Friday-0.037
Token
Token ID198
Feature activation+6.73e-01

Pos logit contributions
 microwave+0.030
 North+0.029
 alarm+0.028
 stories+0.028
ric+0.028
Neg logit contributions
ising-0.038
ized-0.034
ters-0.033
mun-0.032
lest-0.031
Token
Token ID198
Feature activation+2.73e-01

Pos logit contributions
ising+0.020
ized+0.018
ters+0.018
mun+0.017
lest+0.017
Neg logit contributions
 microwave-0.014
 North-0.014
 alarm-0.013
 Friday-0.013
ric-0.013
TokenJack
Token ID14295
Feature activation+1.88e+00

Pos logit contributions
ising+0.007
ized+0.006
ters+0.006
mun+0.006
lest+0.006
Neg logit contributions
 microwave-0.007
 North-0.007
 alarm-0.006
 Friday-0.006
 stories-0.006
Token made
Token ID925
Feature activation-3.33e-01

Pos logit contributions
ising+0.046
ized+0.041
ters+0.039
 members+0.038
mun+0.038
Neg logit contributions
 microwave-0.049
 North-0.048
 alarm-0.047
ric-0.046
 stories-0.046
Token his
Token ID465
Feature activation+2.28e+00

Pos logit contributions
 microwave+0.008
 North+0.008
 alarm+0.008
ric+0.008
tr+0.008
Neg logit contributions
ising-0.009
ized-0.008
ters-0.007
 members-0.007
mun-0.007
Token way
Token ID835
Feature activation+4.15e+00

Pos logit contributions
ising+0.063
ized+0.055
 members+0.054
ters+0.051
lest+0.051
Neg logit contributions
 microwave-0.052
 North-0.050
ric-0.049
 alarm-0.049
tr-0.048
Token up
Token ID510
Feature activation+2.27e+00

Pos logit contributions
ising+0.141
mun+0.139
ters+0.130
Gr+0.126
lest+0.125
Neg logit contributions
 steep-0.064
 North-0.062
 alarm-0.061
 belly-0.061
ric-0.059
Token a
Token ID257
Feature activation+2.53e-01

Pos logit contributions
ising+0.064
ized+0.057
ters+0.055
mun+0.054
lest+0.053
Neg logit contributions
 microwave-0.051
 North-0.050
 alarm-0.048
 stories-0.048
ric-0.047
Token hill
Token ID12788
Feature activation+1.45e-01

Pos logit contributions
ising+0.007
ized+0.007
ters+0.007
mun+0.006
lest+0.006
Neg logit contributions
 microwave-0.005
 North-0.005
 alarm-0.005
 stories-0.005
ric-0.005
Token and
Token ID290
Feature activation+4.94e-01

Pos logit contributions
ising+0.004
ized+0.003
ters+0.003
mun+0.003
lest+0.003
Neg logit contributions
 microwave-0.003
 North-0.003
 alarm-0.003
 stories-0.003
ric-0.003
Token stopped
Token ID5025
Feature activation+1.15e+00

Pos logit contributions
ising+0.014
ters+0.012
ized+0.012
mun+0.012
Gr+0.012
Neg logit contributions
 North-0.011
 microwave-0.011
 stories-0.011
 alarm-0.010
 Friday-0.010
Token up
Token ID510
Feature activation+1.86e+00

Pos logit contributions
ising+0.041
ized+0.036
ters+0.034
mun+0.034
lest+0.034
Neg logit contributions
 microwave-0.054
 North-0.054
 alarm-0.054
 stories-0.053
 chocolate-0.052
Token.
Token ID13
Feature activation+4.55e-01

Pos logit contributions
ising+0.049
ized+0.043
ters+0.042
mun+0.040
ging+0.040
Neg logit contributions
 microwave-0.046
 North-0.045
 alarm-0.043
ric-0.043
 stories-0.042
Token She
Token ID1375
Feature activation-1.13e+00

Pos logit contributions
ising+0.010
ters+0.009
ized+0.009
 members+0.008
aping+0.008
Neg logit contributions
 microwave-0.012
 North-0.012
ric-0.012
 marsh-0.012
ah-0.012
Token tried
Token ID3088
Feature activation+1.50e+00

Pos logit contributions
 microwave+0.028
 North+0.027
 alarm+0.026
 stories+0.026
ric+0.026
Neg logit contributions
ising-0.029
ized-0.026
ters-0.025
mun-0.025
 members-0.024
Token something
Token ID1223
Feature activation+8.70e-01

Pos logit contributions
ising+0.044
ized+0.039
ters+0.038
mun+0.038
 members+0.037
Neg logit contributions
 microwave-0.032
 North-0.031
 alarm-0.030
ric-0.029
 stories-0.029
Token else
Token ID2073
Feature activation+4.59e+00

Pos logit contributions
mun+0.028
ising+0.027
fer+0.026
otte+0.026
Gr+0.026
Neg logit contributions
 steep-0.015
 stories-0.015
 snow-0.014
 thunder-0.014
 alarm-0.014
Token.
Token ID13
Feature activation+5.48e-01

Pos logit contributions
ising+0.150
Gr+0.148
mun+0.146
ized+0.142
ters+0.136
Neg logit contributions
 stories-0.078
 alarm-0.076
 chocolate-0.071
 steep-0.070
 snow-0.070
Token She
Token ID1375
Feature activation-9.53e-01

Pos logit contributions
ising+0.015
ters+0.013
ized+0.013
 members+0.012
aping+0.012
Neg logit contributions
 North-0.013
 microwave-0.013
ric-0.012
ah-0.012
 alarm-0.012
Token took
Token ID1718
Feature activation+1.14e+00

Pos logit contributions
 North+0.020
 microwave+0.020
 alarm+0.020
 stories+0.019
tr+0.019
Neg logit contributions
ising-0.027
ters-0.024
ized-0.024
ging-0.023
mun-0.023
Token two
Token ID734
Feature activation+1.66e+00

Pos logit contributions
ising+0.039
ized+0.036
mun+0.035
lest+0.034
ters+0.034
Neg logit contributions
 stories-0.019
 alarm-0.018
 chocolate-0.018
 microwave-0.018
 thunder-0.018
Token pans
Token ID36209
Feature activation+5.59e-01

Pos logit contributions
otte+0.050
Gr+0.049
ising+0.048
mun+0.046
we+0.046
Neg logit contributions
 stories-0.032
 snow-0.030
 thunder-0.029
 horns-0.029
 belly-0.029
Token voice
Token ID3809
Feature activation-3.04e-01

Pos logit contributions
 microwave+0.006
 North+0.006
ric+0.006
 Friday+0.006
 alarm+0.005
Neg logit contributions
ising-0.014
ized-0.013
ters-0.013
mun-0.013
lest-0.013
Token.
Token ID13
Feature activation-2.76e-01

Pos logit contributions
 alarm+0.007
 North+0.006
 microwave+0.006
 stories+0.006
 thunder+0.006
Neg logit contributions
ising-0.009
ized-0.008
mun-0.008
 members-0.008
ters-0.008
Token
Token ID198
Feature activation+7.32e-02

Pos logit contributions
 microwave+0.004
 North+0.004
ric+0.003
 stories+0.003
 alarm+0.003
Neg logit contributions
ising-0.010
ters-0.009
ized-0.009
Gr-0.009
mun-0.009
Token
Token ID198
Feature activation-2.48e-01

Pos logit contributions
ising+0.002
ized+0.002
ters+0.002
mun+0.002
lest+0.002
Neg logit contributions
 North-0.001
 microwave-0.001
 alarm-0.001
 Friday-0.001
 Sale-0.001
Token"
Token ID1
Feature activation+1.24e+00

Pos logit contributions
 Friday+0.003
 North+0.003
 microwave+0.003
ores+0.003
 Sale+0.003
Neg logit contributions
ising-0.009
ized-0.008
ters-0.008
mun-0.008
lest-0.008
TokenSur
Token ID14214
Feature activation+4.19e+00

Pos logit contributions
ising+0.043
ized+0.037
aping+0.033
ipping+0.033
lest+0.033
Neg logit contributions
 alarm-0.023
 microwave-0.023
 thunder-0.022
 chocolate-0.022
 Sale-0.022
Tokenprise
Token ID7919
Feature activation+4.82e-01

Pos logit contributions
ising+0.108
ized+0.105
lest+0.102
ters+0.099
bing+0.092
Neg logit contributions
 microwave-0.098
 alarm-0.092
 chocolate-0.090
ores-0.090
 Sale-0.090
Token!"
Token ID2474
Feature activation+4.35e-01

Pos logit contributions
ising+0.016
ters+0.014
ized+0.014
mun+0.013
ging+0.013
Neg logit contributions
 microwave-0.009
 North-0.009
ric-0.008
 alarm-0.008
 Friday-0.008
Token Grand
Token ID5675
Feature activation-1.14e+00

Pos logit contributions
ising+0.014
ized+0.013
ters+0.012
mun+0.012
ging+0.012
Neg logit contributions
 microwave-0.008
 North-0.008
 alarm-0.008
 stories-0.008
 Friday-0.007
Tokenma
Token ID2611
Feature activation-9.69e-02

Pos logit contributions
 microwave+0.030
 North+0.029
 alarm+0.028
 stories+0.028
ric+0.028
Neg logit contributions
ising-0.028
ized-0.025
ters-0.024
mun-0.023
 members-0.023
Token says
Token ID1139
Feature activation+1.85e+00

Pos logit contributions
 microwave+0.002
 North+0.002
 alarm+0.002
 stories+0.002
ric+0.002
Neg logit contributions
ising-0.003
ized-0.003
ters-0.003
lest-0.003
ging-0.003
Token
Token ID198
Feature activation+1.04e+00

Pos logit contributions
ising+0.024
ized+0.022
ters+0.022
mun+0.021
 members+0.021
Neg logit contributions
 microwave-0.015
 North-0.015
 alarm-0.014
ric-0.014
tr-0.014
Token
Token ID198
Feature activation+5.46e-01

Pos logit contributions
ising+0.030
ized+0.027
ters+0.026
mun+0.025
lest+0.025
Neg logit contributions
 microwave-0.022
 North-0.022
 alarm-0.021
 stories-0.021
ric-0.021
TokenFinally
Token ID11158
Feature activation-5.61e-01

Pos logit contributions
ising+0.013
ized+0.011
ters+0.011
mun+0.011
lest+0.011
Neg logit contributions
 microwave-0.015
 North-0.014
 alarm-0.014
ric-0.014
 stories-0.014
Token he
Token ID339
Feature activation-2.62e-01

Pos logit contributions
 microwave+0.011
 North+0.011
 stories+0.011
 alarm+0.011
 Friday+0.011
Neg logit contributions
ising-0.017
ized-0.016
ters-0.015
mun-0.015
 members-0.015
Token found
Token ID1043
Feature activation+4.92e-01

Pos logit contributions
 microwave+0.007
 North+0.007
ric+0.006
 alarm+0.006
 stories+0.006
Neg logit contributions
ising-0.006
ized-0.006
ters-0.006
 members-0.006
mun-0.005
Token one
Token ID530
Feature activation+3.79e+00

Pos logit contributions
ising+0.013
ters+0.012
ized+0.012
 members+0.011
mun+0.011
Neg logit contributions
 microwave-0.011
 North-0.011
 alarm-0.011
tr-0.010
ric-0.010
Token that
Token ID326
Feature activation-9.97e-01

Pos logit contributions
ising+0.107
ized+0.098
mun+0.094
ters+0.094
ging+0.093
Neg logit contributions
 microwave-0.084
 North-0.081
 alarm-0.080
 chocolate-0.078
 stories-0.077
Token said
Token ID531
Feature activation+7.35e-01

Pos logit contributions
 microwave+0.026
 North+0.025
 alarm+0.024
ric+0.024
 stories+0.024
Neg logit contributions
ising-0.025
ized-0.022
ters-0.021
mun-0.021
 members-0.020
Token it
Token ID340
Feature activation-7.09e-01

Pos logit contributions
ising+0.015
ters+0.015
 else+0.014
ized+0.013
 members+0.013
Neg logit contributions
ric-0.021
 North-0.020
 microwave-0.020
opl-0.019
 habitats-0.019
Token was
Token ID373
Feature activation+2.89e-01

Pos logit contributions
 microwave+0.020
 North+0.020
ric+0.019
 alarm+0.019
 stories+0.018
Neg logit contributions
ising-0.016
ized-0.014
ters-0.014
mun-0.013
 members-0.013
Token depend
Token ID4745
Feature activation+2.58e+00

Pos logit contributions
ising+0.005
ters+0.005
ized+0.005
 convince+0.005
 hopefully+0.005
Neg logit contributions
ric-0.008
haw-0.008
tr-0.008
opl-0.008
 North-0.008
Token,
Token ID11
Feature activation+5.53e-01

Pos logit contributions
ising+0.022
ized+0.020
ters+0.020
aping+0.019
lest+0.019
Neg logit contributions
 microwave-0.014
 Friday-0.013
 marsh-0.013
 chocolate-0.013
 stories-0.013
Token when
Token ID618
Feature activation+2.05e+00

Pos logit contributions
ising+0.015
ized+0.013
ters+0.013
mun+0.013
lest+0.013
Neg logit contributions
 microwave-0.013
 Friday-0.013
 North-0.013
 alarm-0.012
 stories-0.012
Token Jimmy
Token ID12963
Feature activation+4.77e-01

Pos logit contributions
ising+0.055
ized+0.049
ters+0.048
lest+0.046
mun+0.046
Neg logit contributions
 microwave-0.050
 North-0.049
 Friday-0.047
 alarm-0.047
 stories-0.046
Token was
Token ID373
Feature activation+7.71e-01

Pos logit contributions
ising+0.013
ized+0.012
ters+0.011
mun+0.011
lest+0.011
Neg logit contributions
 microwave-0.011
 North-0.011
 alarm-0.011
 stories-0.010
ric-0.010
Token practicing
Token ID18207
Feature activation+1.19e+00

Pos logit contributions
ising+0.020
ized+0.018
ters+0.017
mun+0.017
lest+0.017
Neg logit contributions
 microwave-0.019
 North-0.018
 alarm-0.018
 stories-0.018
ric-0.017
Token his
Token ID465
Feature activation+2.69e-01

Pos logit contributions
ising+0.031
ized+0.027
ters+0.026
ging+0.026
mun+0.025
Neg logit contributions
 microwave-0.029
 North-0.029
ric-0.028
tr-0.027
 alarm-0.027
Token songs
Token ID7259
Feature activation-3.74e-01

Pos logit contributions
ising+0.008
ized+0.007
ters+0.007
mun+0.007
lest+0.006
Neg logit contributions
 microwave-0.006
 North-0.006
 alarm-0.006
ric-0.005
 stories-0.005
Token,
Token ID11
Feature activation+1.01e+00

Pos logit contributions
 microwave+0.010
 North+0.010
ric+0.009
 stories+0.009
 alarm+0.009
Neg logit contributions
ising-0.009
ized-0.008
ters-0.008
 members-0.008
ging-0.008
Token he
Token ID339
Feature activation+6.69e-01

Pos logit contributions
ising+0.023
ized+0.020
ters+0.019
mun+0.019
 members+0.019
Neg logit contributions
 microwave-0.028
 North-0.027
 alarm-0.026
ric-0.026
 stories-0.026
Token heard
Token ID2982
Feature activation-1.54e+00

Pos logit contributions
ising+0.017
ized+0.015
 members+0.014
ters+0.014
mun+0.014
Neg logit contributions
 microwave-0.017
 North-0.017
ric-0.016
 alarm-0.016
 stories-0.016
Token something
Token ID1223
Feature activation+8.95e-01

Pos logit contributions
ric+0.037
 microwave+0.036
 North+0.035
tr+0.034
 seventeen+0.033
Neg logit contributions
ising-0.040
ized-0.036
ters-0.035
 members-0.035
 else-0.033
TokenEmily
Token ID48640
Feature activation+9.65e-01

Pos logit contributions
ising+0.005
ized+0.005
ters+0.004
mun+0.004
ging+0.004
Neg logit contributions
 microwave-0.005
 Friday-0.005
 North-0.005
 stories-0.005
 alarm-0.004
Token smiled
Token ID13541
Feature activation+3.49e-01

Pos logit contributions
ising+0.024
ized+0.021
ters+0.020
mun+0.020
lest+0.019
Neg logit contributions
 microwave-0.025
 North-0.024
 alarm-0.024
 stories-0.023
ric-0.023
Token and
Token ID290
Feature activation-9.44e-01

Pos logit contributions
ising+0.010
ized+0.009
ters+0.009
mun+0.008
ging+0.008
Neg logit contributions
 microwave-0.008
 North-0.008
 alarm-0.007
ric-0.007
 stories-0.007
Token then
Token ID788
Feature activation-2.65e-01

Pos logit contributions
 stories+0.021
 North+0.020
 belly+0.020
 microwave+0.020
 Friday+0.020
Neg logit contributions
ising-0.027
ters-0.024
ized-0.023
mun-0.022
ging-0.022
Token reached
Token ID4251
Feature activation-1.20e-01

Pos logit contributions
 stories+0.006
 microwave+0.005
 North+0.005
 chocolate+0.005
 alarm+0.005
Neg logit contributions
ising-0.008
ters-0.007
ized-0.007
mun-0.006
lest-0.006
Token out
Token ID503
Feature activation+9.40e-01

Pos logit contributions
 microwave+0.003
 North+0.003
 alarm+0.003
 stories+0.003
 Friday+0.003
Neg logit contributions
ising-0.003
ized-0.003
ters-0.003
mun-0.003
lest-0.003
Token and
Token ID290
Feature activation-1.59e-02

Pos logit contributions
ising+0.027
ized+0.024
ters+0.024
mun+0.024
lest+0.023
Neg logit contributions
 alarm-0.020
 stories-0.020
 microwave-0.020
 North-0.019
 Friday-0.019
Token grabbed
Token ID13176
Feature activation+1.03e+00

Pos logit contributions
 microwave+0.000
 chocolate+0.000
 marsh+0.000
 alarm+0.000
tr+0.000
Neg logit contributions
ising-0.001
ters-0.000
ized-0.000
ging-0.000
mun-0.000
Token the
Token ID262
Feature activation+2.64e-01

Pos logit contributions
ising+0.037
ized+0.032
mun+0.032
lest+0.031
aping+0.031
Neg logit contributions
 chocolate-0.018
 microwave-0.017
 stories-0.017
 marsh-0.017
 alarm-0.016
Token puppy
Token ID26188
Feature activation+1.33e-01

Pos logit contributions
ising+0.008
ized+0.007
mun+0.007
aping+0.006
ters+0.006
Neg logit contributions
 chocolate-0.006
 alarm-0.006
 microwave-0.006
 belly-0.006
 North-0.006
Token's
Token ID338
Feature activation+1.16e+00

Pos logit contributions
ising+0.004
ized+0.003
mun+0.003
lest+0.003
Gr+0.003
Neg logit contributions
 stories-0.003
 alarm-0.003
 chocolate-0.003
tr-0.003
 microwave-0.003
Token excited
Token ID6568
Feature activation+1.69e+00

Pos logit contributions
 microwave+0.034
 North+0.033
 alarm+0.032
 stories+0.032
 Friday+0.031
Neg logit contributions
ising-0.048
ized-0.043
ters-0.042
mun-0.041
lest-0.040
Token again
Token ID757
Feature activation+4.54e-01

Pos logit contributions
ising+0.050
mun+0.046
ized+0.046
ters+0.046
lest+0.045
Neg logit contributions
 stories-0.034
 alarm-0.032
 microwave-0.032
 North-0.031
 chocolate-0.031
Token.
Token ID13
Feature activation+6.36e-01

Pos logit contributions
ising+0.014
mun+0.013
ized+0.013
ters+0.013
lest+0.013
Neg logit contributions
 North-0.008
 stories-0.008
 alarm-0.008
 microwave-0.007
 Friday-0.007
Token
Token ID220
Feature activation-1.19e+00

Pos logit contributions
ising+0.016
 members+0.014
mun+0.014
ized+0.014
Gr+0.014
Neg logit contributions
 microwave-0.016
 North-0.015
 alarm-0.015
ric-0.014
 marsh-0.014
Token
Token ID198
Feature activation+7.79e-01

Pos logit contributions
 microwave+0.026
 North+0.026
 alarm+0.025
 stories+0.025
ric+0.024
Neg logit contributions
ising-0.034
ized-0.030
ters-0.029
mun-0.029
lest-0.028
Token
Token ID198
Feature activation+3.47e-01

Pos logit contributions
ising+0.021
ized+0.019
ters+0.018
mun+0.018
 members+0.018
Neg logit contributions
 microwave-0.018
 North-0.018
 alarm-0.017
ric-0.017
 stories-0.017
Token"
Token ID1
Feature activation+1.82e+00

Pos logit contributions
ising+0.008
ized+0.007
ters+0.006
mun+0.006
 members+0.006
Neg logit contributions
 microwave-0.010
 North-0.010
 alarm-0.009
ric-0.009
 chocolate-0.009
TokenOk
Token ID18690
Feature activation-7.03e-01

Pos logit contributions
ising+0.051
ized+0.046
lest+0.044
mun+0.044
ters+0.044
Neg logit contributions
 North-0.040
 Friday-0.039
 microwave-0.038
 chocolate-0.037
 Sunday-0.036
Token,
Token ID11
Feature activation-2.14e+00

Pos logit contributions
 microwave+0.018
 North+0.017
ric+0.017
 alarm+0.016
tr+0.016
Neg logit contributions
ising-0.018
ized-0.016
ters-0.016
lest-0.015
 members-0.015
Token dad
Token ID9955
Feature activation-1.13e+00

Pos logit contributions
 microwave+0.042
 North+0.041
ric+0.040
 alarm+0.039
 stories+0.038
Neg logit contributions
ising-0.066
ized-0.059
ters-0.058
 members-0.056
mun-0.056
Token,
Token ID11
Feature activation-1.23e+00

Pos logit contributions
 microwave+0.026
 North+0.026
 alarm+0.025
ric+0.024
 stories+0.024
Neg logit contributions
ising-0.031
ized-0.028
ters-0.027
mun-0.026
lest-0.026
Token and
Token ID290
Feature activation+4.57e-01

Pos logit contributions
ising+0.029
ized+0.027
ters+0.026
lest+0.026
mun+0.026
Neg logit contributions
 microwave-0.017
 North-0.017
 alarm-0.017
 chocolate-0.016
 bus-0.016
Token play
Token ID711
Feature activation-1.61e+00

Pos logit contributions
ising+0.016
ized+0.015
ters+0.015
ging+0.014
lest+0.013
Neg logit contributions
 alarm-0.008
 microwave-0.008
 chocolate-0.007
 fridge-0.007
 North-0.007
Token,
Token ID11
Feature activation+6.58e-02

Pos logit contributions
 stories+0.035
 microwave+0.034
 Friday+0.034
 Sunday+0.033
 alarm+0.033
Neg logit contributions
lest-0.045
ising-0.043
aping-0.039
mun-0.039
ized-0.039
Token but
Token ID475
Feature activation+2.43e+00

Pos logit contributions
ising+0.001
ized+0.001
ters+0.001
mun+0.001
 members+0.001
Neg logit contributions
 microwave-0.002
 North-0.002
 alarm-0.002
ric-0.002
 stories-0.002
Token to
Token ID284
Feature activation-3.48e-01

Pos logit contributions
ising+0.058
ized+0.050
ters+0.049
mun+0.048
lest+0.046
Neg logit contributions
 microwave-0.065
 North-0.064
 alarm-0.064
 stories-0.062
 Friday-0.062
Token stay
Token ID2652
Feature activation+1.22e+00

Pos logit contributions
 North+0.007
 microwave+0.007
 alarm+0.007
 Friday+0.007
 stories+0.007
Neg logit contributions
ising-0.010
ized-0.010
ters-0.009
mun-0.009
lest-0.009
Token where
Token ID810
Feature activation+1.71e+00

Pos logit contributions
ising+0.029
ters+0.026
lest+0.026
mun+0.026
ized+0.026
Neg logit contributions
 microwave-0.031
 North-0.031
 alarm-0.031
 stories-0.030
 Friday-0.029
Token she
Token ID673
Feature activation+1.14e-01

Pos logit contributions
ising+0.036
ized+0.032
ters+0.030
mun+0.029
 members+0.029
Neg logit contributions
 microwave-0.050
 North-0.049
 alarm-0.047
ric-0.047
 stories-0.047
Token could
Token ID714
Feature activation-8.98e-02

Pos logit contributions
ising+0.003
ized+0.003
ters+0.003
ging+0.003
lest+0.003
Neg logit contributions
 microwave-0.002
 North-0.002
 alarm-0.002
 stories-0.002
ric-0.002
Token see
Token ID766
Feature activation-1.06e+00

Pos logit contributions
 microwave+0.002
 North+0.002
 alarm+0.002
ric+0.002
 stories+0.002
Neg logit contributions
ising-0.003
ized-0.002
ters-0.002
mun-0.002
 members-0.002
Token her
Token ID607
Feature activation+1.11e+00

Pos logit contributions
 microwave+0.024
 North+0.024
 alarm+0.023
 stories+0.022
ric+0.022
Neg logit contributions
ising-0.029
ized-0.026
ters-0.025
mun-0.025
lest-0.025
Token he
Token ID339
Feature activation+3.44e-01

Pos logit contributions
ising+0.015
ized+0.013
mun+0.013
lest+0.013
ging+0.012
Neg logit contributions
 alarm-0.011
 North-0.010
 stories-0.010
 microwave-0.010
 chocolate-0.010
Token shouldn
Token ID6584
Feature activation-1.71e+00

Pos logit contributions
ising+0.010
ized+0.009
ters+0.009
mun+0.009
lest+0.009
Neg logit contributions
 microwave-0.007
 North-0.007
 alarm-0.007
 stories-0.007
ric-0.007
Token't
Token ID470
Feature activation-1.26e+00

Pos logit contributions
 alarm+0.018
 North+0.018
 microwave+0.017
 Sale+0.017
 Today+0.016
Neg logit contributions
ising-0.069
ized-0.065
lest-0.062
ging-0.062
ters-0.061
Token talk
Token ID1561
Feature activation-1.10e+00

Pos logit contributions
 alarm+0.019
 Today+0.018
 One+0.017
 It+0.015
 Sale+0.015
Neg logit contributions
ising-0.046
mun-0.045
ized-0.043
ging-0.043
Gr-0.042
Token to
Token ID284
Feature activation-9.82e-01

Pos logit contributions
 microwave+0.033
 seventeen+0.031
 North+0.030
ric+0.030
 marsh+0.029
Neg logit contributions
ising-0.023
 else-0.021
ters-0.021
ized-0.020
 members-0.019
Token strangers
Token ID18981
Feature activation+3.58e-01

Pos logit contributions
 North+0.021
 microwave+0.020
 alarm+0.020
 stories+0.020
ric+0.019
Neg logit contributions
ising-0.029
ized-0.026
ters-0.025
mun-0.025
lest-0.024
Token.
Token ID13
Feature activation+8.40e-01

Pos logit contributions
ising+0.010
ized+0.009
ters+0.009
mun+0.008
lest+0.008
Neg logit contributions
 microwave-0.008
 North-0.008
 alarm-0.008
 stories-0.007
ric-0.007
Token
Token ID220
Feature activation-8.52e-01

Pos logit contributions
ising+0.024
ized+0.022
ters+0.021
mun+0.021
 members+0.021
Neg logit contributions
 North-0.018
 microwave-0.018
 alarm-0.017
 stories-0.017
ric-0.017
Token
Token ID198
Feature activation+1.32e+00

Pos logit contributions
 microwave+0.019
 North+0.019
 alarm+0.018
 stories+0.018
ric+0.018
Neg logit contributions
ising-0.024
ized-0.021
ters-0.021
mun-0.020
lest-0.020
Token
Token ID198
Feature activation+8.71e-01

Pos logit contributions
ising+0.041
ized+0.037
ters+0.036
lest+0.035
ging+0.034
Neg logit contributions
 microwave-0.026
 North-0.025
 alarm-0.024
ric-0.024
 Friday-0.024
TokenLater
Token ID18602
Feature activation-1.27e+00

Pos logit contributions
ising+0.020
ized+0.018
ters+0.017
mun+0.017
lest+0.017
Neg logit contributions
 microwave-0.023
 North-0.023
 alarm-0.023
 stories-0.022
 Friday-0.022
Tokenmy
Token ID1820
Feature activation-5.79e-01

Pos logit contributions
 alarm+0.020
 microwave+0.019
 stories+0.019
 chocolate+0.019
ric+0.018
Neg logit contributions
ising-0.037
ized-0.034
ters-0.033
mun-0.033
Gr-0.032
Token cry
Token ID3960
Feature activation-1.69e+00

Pos logit contributions
 microwave+0.013
 alarm+0.013
 stories+0.013
 North+0.013
ric+0.012
Neg logit contributions
ising-0.016
ized-0.014
ters-0.014
mun-0.013
lest-0.013
Token.
Token ID13
Feature activation+7.00e-02

Pos logit contributions
 microwave+0.041
 North+0.041
ric+0.039
 alarm+0.038
 Sale+0.038
Neg logit contributions
ising-0.045
ters-0.040
ized-0.039
ging-0.037
 members-0.037
Token He
Token ID679
Feature activation-5.24e-01

Pos logit contributions
ising+0.002
ized+0.002
ters+0.002
 members+0.002
mun+0.002
Neg logit contributions
 North-0.002
 microwave-0.002
 stories-0.001
ric-0.001
 marsh-0.001
Token wished
Token ID16555
Feature activation+1.69e+00

Pos logit contributions
 microwave+0.013
 North+0.013
ric+0.012
 alarm+0.012
 stories+0.012
Neg logit contributions
ising-0.013
ized-0.012
ters-0.011
 members-0.011
mun-0.011
Token he
Token ID339
Feature activation-2.76e-01

Pos logit contributions
ising+0.045
ized+0.038
ters+0.038
mun+0.038
lest+0.037
Neg logit contributions
 microwave-0.041
 alarm-0.041
 stories-0.041
 North-0.040
 chocolate-0.039
Token could
Token ID714
Feature activation-1.42e+00

Pos logit contributions
 microwave+0.005
 North+0.005
 alarm+0.005
 stories+0.005
tr+0.005
Neg logit contributions
ising-0.009
ized-0.008
ters-0.008
mun-0.008
lest-0.007
Token be
Token ID307
Feature activation+7.26e-01

Pos logit contributions
 North+0.029
 alarm+0.029
 microwave+0.028
tr+0.028
ric+0.027
Neg logit contributions
ising-0.042
ized-0.039
ging-0.038
ters-0.038
mun-0.037
Token big
Token ID1263
Feature activation+7.06e-01

Pos logit contributions
ising+0.020
ized+0.018
mun+0.018
ters+0.018
lest+0.018
Neg logit contributions
 microwave-0.016
 stories-0.016
 North-0.016
 alarm-0.016
 chocolate-0.015
Token again
Token ID757
Feature activation+4.40e-01

Pos logit contributions
ising+0.019
ized+0.017
ters+0.016
mun+0.016
lest+0.015
Neg logit contributions
 microwave-0.017
 North-0.016
 alarm-0.016
ric-0.016
 stories-0.016
Token,
Token ID11
Feature activation-1.06e+00

Pos logit contributions
ising+0.012
ized+0.011
mun+0.011
ters+0.011
lest+0.011
Neg logit contributions
 North-0.009
 Friday-0.009
 alarm-0.009
 microwave-0.009
 stories-0.009
Token Tim
Token ID5045
Feature activation+6.89e-01

Pos logit contributions
 North+0.066
 microwave+0.064
ric+0.063
 alarm+0.062
 stories+0.061
Neg logit contributions
ising-0.064
ters-0.059
ized-0.059
mun-0.056
 members-0.056
Tokenmy
Token ID1820
Feature activation-1.07e+00

Pos logit contributions
ising+0.014
 members+0.013
ey+0.012
ized+0.012
ters+0.012
Neg logit contributions
 microwave-0.020
 Sale-0.020
 seventeen-0.020
 marsh-0.020
 North-0.019
Token.
Token ID13
Feature activation-6.16e-01

Pos logit contributions
 microwave+0.030
 North+0.029
 Sale+0.028
 seventeen+0.028
 stories+0.028
Neg logit contributions
ising-0.024
ized-0.021
 members-0.020
ters-0.020
mun-0.020
Token Tim
Token ID5045
Feature activation+1.81e+00

Pos logit contributions
 North+0.020
 Bible+0.019
 fever+0.019
 microwave+0.019
 stories+0.019
Neg logit contributions
ising-0.008
aping-0.008
ters-0.008
B-0.008
ized-0.008
Tokenmy
Token ID1820
Feature activation+6.06e-02

Pos logit contributions
ising+0.029
 members+0.028
ey+0.027
ized+0.026
aping+0.025
Neg logit contributions
 Sale-0.059
 seventeen-0.058
 marsh-0.056
 microwave-0.056
ores-0.055
Token loved
Token ID6151
Feature activation-1.98e+00

Pos logit contributions
ising+0.001
 members+0.001
ized+0.001
mun+0.001
ey+0.001
Neg logit contributions
 microwave-0.002
 Sale-0.002
 North-0.002
ric-0.002
 seventeen-0.002
Token to
Token ID284
Feature activation-4.73e-02

Pos logit contributions
 seventeen+0.059
ah+0.056
tr+0.055
ric+0.054
 sixteen+0.054
Neg logit contributions
ising-0.034
 members-0.032
ized-0.032
mun-0.031
we-0.031
Token play
Token ID711
Feature activation-1.85e+00

Pos logit contributions
ric+0.002
 seventeen+0.002
 microwave+0.001
 fault+0.001
 steep+0.001
Neg logit contributions
ising-0.001
 convince-0.001
 solve-0.001
 resolve-0.001
lest-0.001
Token with
Token ID351
Feature activation+2.76e-01

Pos logit contributions
 microwave+0.054
 alarm+0.052
 stories+0.051
 Friday+0.051
 chocolate+0.050
Neg logit contributions
ising-0.040
lest-0.036
ized-0.035
ters-0.034
mun-0.034
Token his
Token ID465
Feature activation+5.85e-01

Pos logit contributions
ising+0.007
ized+0.006
ters+0.006
lest+0.006
mun+0.006
Neg logit contributions
 North-0.007
ric-0.007
 microwave-0.007
tr-0.007
 seventeen-0.006
Token ball
Token ID2613
Feature activation+2.02e-02

Pos logit contributions
ising+0.017
ized+0.015
ters+0.015
mun+0.014
ging+0.014
Neg logit contributions
 microwave-0.013
 alarm-0.013
 North-0.013
 stories-0.013
 chocolate-0.012
Token better
Token ID1365
Feature activation+3.43e-01

Pos logit contributions
ising+0.011
ized+0.010
ters+0.010
lest+0.010
mun+0.009
Neg logit contributions
 microwave-0.007
 stories-0.007
 North-0.007
 chocolate-0.007
 alarm-0.007
Token and
Token ID290
Feature activation+6.31e-01

Pos logit contributions
ising+0.010
ized+0.009
ters+0.009
mun+0.009
lest+0.009
Neg logit contributions
 microwave-0.007
 North-0.007
 alarm-0.006
 stories-0.006
ric-0.006
Token the
Token ID262
Feature activation+2.56e-01

Pos logit contributions
ising+0.020
ging+0.018
ized+0.017
Gr+0.017
ve+0.017
Neg logit contributions
 stories-0.013
 microwave-0.012
 snow-0.012
 chocolate-0.012
 marsh-0.012
Token children
Token ID1751
Feature activation+6.49e-01

Pos logit contributions
ising+0.006
ized+0.006
ters+0.006
mun+0.005
 members+0.005
Neg logit contributions
 microwave-0.007
 North-0.006
 alarm-0.006
ric-0.006
 stories-0.006
Token felt
Token ID2936
Feature activation-9.05e-01

Pos logit contributions
ising+0.016
ized+0.014
ters+0.014
 members+0.014
mun+0.014
Neg logit contributions
 microwave-0.017
 North-0.016
 alarm-0.016
ric-0.015
 stories-0.015
Token very
Token ID845
Feature activation-1.01e+00

Pos logit contributions
 microwave+0.020
 North+0.019
 alarm+0.019
ric+0.019
tr+0.018
Neg logit contributions
ising-0.026
ized-0.023
ters-0.023
mun-0.022
 members-0.022
Token proud
Token ID6613
Feature activation-2.28e-01

Pos logit contributions
 belly+0.018
 stories+0.017
ores+0.017
 fever+0.016
 marsh+0.016
Neg logit contributions
ising-0.031
ters-0.031
mun-0.029
aping-0.028
we-0.028
Token of
Token ID286
Feature activation-5.16e-02

Pos logit contributions
 stories+0.005
 microwave+0.005
 North+0.005
 alarm+0.005
 chocolate+0.005
Neg logit contributions
ising-0.006
ized-0.006
mun-0.006
lest-0.006
ging-0.006
Token themselves
Token ID2405
Feature activation-1.54e+00

Pos logit contributions
 stories+0.001
 alarm+0.001
 Friday+0.001
 North+0.001
ric+0.001
Neg logit contributions
ising-0.001
lest-0.001
ized-0.001
mun-0.001
ters-0.001
Token.
Token ID13
Feature activation+1.04e+00

Pos logit contributions
 microwave+0.034
 North+0.033
 alarm+0.033
 stories+0.033
 Friday+0.031
Neg logit contributions
ising-0.043
ized-0.039
ters-0.037
mun-0.037
lest-0.036
Token They
Token ID1119
Feature activation-4.01e-01

Pos logit contributions
ising+0.035
ized+0.032
ters+0.032
 members+0.031
mun+0.031
Neg logit contributions
 microwave-0.017
 North-0.017
 alarm-0.017
 stories-0.016
ric-0.015
Token tell
Token ID1560
Feature activation-2.63e-01

Pos logit contributions
 microwave+0.012
 stories+0.012
 North+0.012
 alarm+0.011
 chocolate+0.011
Neg logit contributions
ising-0.021
ized-0.019
ters-0.018
ging-0.018
lest-0.017
Token her
Token ID607
Feature activation-8.95e-01

Pos logit contributions
 microwave+0.007
 North+0.007
ric+0.007
tr+0.007
 alarm+0.007
Neg logit contributions
ising-0.006
ized-0.005
ters-0.005
mun-0.005
 members-0.005
Token it
Token ID340
Feature activation-1.61e+00

Pos logit contributions
 microwave+0.025
 North+0.025
 seventeen+0.024
ah+0.023
opl+0.023
Neg logit contributions
 else-0.019
ising-0.018
 members-0.018
ized-0.016
mun-0.016
Token's
Token ID338
Feature activation-6.55e-01

Pos logit contributions
 microwave+0.050
 North+0.049
ric+0.049
 trailer+0.046
 seventeen+0.046
Neg logit contributions
ising-0.032
ized-0.027
 else-0.026
mun-0.026
 members-0.026
Token okay
Token ID8788
Feature activation+2.55e-01

Pos logit contributions
 seventeen+0.016
 microwave+0.016
ric+0.016
 sixteen+0.016
ah+0.016
Neg logit contributions
ising-0.016
 convince-0.014
 members-0.013
ters-0.013
 resolve-0.013
Token.
Token ID13
Feature activation-7.45e-01

Pos logit contributions
ising+0.004
ters+0.004
we+0.004
 else+0.003
 members+0.003
Neg logit contributions
 microwave-0.009
 seventeen-0.009
 North-0.008
 trailer-0.008
 Sale-0.008
Token They
Token ID1119
Feature activation-1.49e+00

Pos logit contributions
 microwave+0.016
ah+0.015
 alarm+0.015
 North+0.015
 stories+0.015
Neg logit contributions
ising-0.019
ters-0.018
Gr-0.018
ized-0.018
mun-0.018
Token say
Token ID910
Feature activation-5.22e-03

Pos logit contributions
 microwave+0.037
 North+0.036
ric+0.036
icle+0.035
 alarm+0.035
Neg logit contributions
ising-0.037
ized-0.032
mun-0.032
 members-0.032
ters-0.031
Token they
Token ID484
Feature activation-1.72e+00

Pos logit contributions
ric+0.000
 microwave+0.000
 North+0.000
tr+0.000
 trailer+0.000
Neg logit contributions
ising-0.000
ters-0.000
ized-0.000
mun-0.000
 else-0.000
Token will
Token ID481
Feature activation-2.60e+00

Pos logit contributions
 North+0.039
 microwave+0.039
icle+0.038
 seventeen+0.037
 Friday+0.036
Neg logit contributions
ising-0.047
 members-0.044
mun-0.043
ters-0.041
 else-0.040
Token buy
Token ID2822
Feature activation+3.64e-01

Pos logit contributions
 microwave+0.071
 North+0.063
 seventeen+0.062
ric+0.061
 steep+0.061
Neg logit contributions
ising-0.066
 convince-0.056
 resolve-0.055
ters-0.054
 members-0.054
Token,
Token ID11
Feature activation-3.86e-02

Pos logit contributions
 microwave+0.026
 North+0.026
 alarm+0.025
 stories+0.025
ric+0.025
Neg logit contributions
ising-0.028
ized-0.025
ters-0.024
mun-0.023
lest-0.023
Token but
Token ID475
Feature activation+2.34e+00

Pos logit contributions
 microwave+0.001
 North+0.001
ric+0.001
 alarm+0.001
 Sale+0.001
Neg logit contributions
ising-0.001
ized-0.001
ters-0.001
mun-0.001
ging-0.001
Token she
Token ID673
Feature activation+4.78e-01

Pos logit contributions
ising+0.064
ters+0.055
lest+0.055
mun+0.054
Gr+0.054
Neg logit contributions
 chocolate-0.056
 alarm-0.056
 stories-0.055
 Friday-0.055
 microwave-0.053
Token was
Token ID373
Feature activation+7.29e-01

Pos logit contributions
ising+0.014
ized+0.013
ters+0.013
ging+0.012
mun+0.012
Neg logit contributions
 microwave-0.010
 North-0.010
 alarm-0.010
 stories-0.009
 Friday-0.009
Token determined
Token ID5295
Feature activation+1.93e+00

Pos logit contributions
ising+0.018
ized+0.016
ters+0.016
mun+0.015
lest+0.015
Neg logit contributions
 microwave-0.018
 North-0.018
 alarm-0.018
 stories-0.017
ric-0.017
Token to
Token ID284
Feature activation-8.81e-01

Pos logit contributions
ising+0.056
ized+0.050
ters+0.048
 members+0.047
ging+0.046
Neg logit contributions
 microwave-0.044
 North-0.042
 alarm-0.040
ric-0.040
 marsh-0.039
Token reach
Token ID3151
Feature activation+1.29e+00

Pos logit contributions
 North+0.020
 alarm+0.019
 microwave+0.019
 stories+0.018
 Friday+0.018
Neg logit contributions
ising-0.024
ized-0.022
ters-0.021
mun-0.021
lest-0.021
Token the
Token ID262
Feature activation+2.08e-01

Pos logit contributions
ising+0.035
ized+0.032
ters+0.031
mun+0.029
ging+0.029
Neg logit contributions
 microwave-0.030
 North-0.029
 alarm-0.028
ric-0.028
 stories-0.028
Token apples
Token ID22514
Feature activation+2.54e-01

Pos logit contributions
ising+0.006
ters+0.006
ized+0.006
 members+0.005
 further+0.005
Neg logit contributions
 North-0.004
ric-0.004
 microwave-0.004
ores-0.004
 stories-0.004
Token.
Token ID13
Feature activation+7.34e-01

Pos logit contributions
ising+0.007
ized+0.007
mun+0.006
lest+0.006
Gr+0.006
Neg logit contributions
 microwave-0.005
 North-0.005
 alarm-0.005
 Friday-0.005
 stories-0.005
Token She
Token ID1375
Feature activation-8.71e-01

Pos logit contributions
ising+0.019
ized+0.017
ters+0.017
 members+0.017
mun+0.017
Neg logit contributions
 North-0.017
 microwave-0.017
ric-0.016
 alarm-0.016
 stories-0.016
Token walked
Token ID6807
Feature activation+5.06e-01

Pos logit contributions
 microwave+0.000
 North+0.000
 alarm+0.000
 stories+0.000
ric+0.000
Neg logit contributions
ising-0.000
ized-0.000
ters-0.000
mun-0.000
lest-0.000
Token home
Token ID1363
Feature activation-2.53e-01

Pos logit contributions
ising+0.012
ized+0.010
ters+0.010
mun+0.010
lest+0.010
Neg logit contributions
 microwave-0.014
 North-0.013
 stories-0.013
 alarm-0.013
 Friday-0.013
Token with
Token ID351
Feature activation+1.19e+00

Pos logit contributions
 stories+0.006
 microwave+0.006
 alarm+0.006
 North+0.006
 belly+0.005
Neg logit contributions
ising-0.007
ters-0.006
ized-0.006
mun-0.006
lest-0.006
Token a
Token ID257
Feature activation+1.08e-01

Pos logit contributions
ising+0.033
ized+0.031
 members+0.029
ters+0.029
mun+0.028
Neg logit contributions
 microwave-0.025
ric-0.025
 North-0.025
tr-0.025
 alarm-0.024
Token big
Token ID1263
Feature activation-1.81e+00

Pos logit contributions
ising+0.003
ized+0.003
 members+0.003
lest+0.003
mun+0.003
Neg logit contributions
 North-0.002
 microwave-0.002
ric-0.002
 alarm-0.002
 Friday-0.002
Token smile
Token ID8212
Feature activation-3.85e+00

Pos logit contributions
ric+0.042
 North+0.041
ores+0.041
tr+0.039
 habitats+0.038
Neg logit contributions
ising-0.048
ized-0.042
ging-0.040
 members-0.040
lest-0.040
Token,
Token ID11
Feature activation-1.33e+00

Pos logit contributions
 microwave+0.109
 North+0.107
ric+0.105
 Sale+0.103
 stories+0.101
Neg logit contributions
ising-0.084
ized-0.073
ging-0.073
ters-0.073
lest-0.070
Token feeling
Token ID4203
Feature activation-1.95e+00

Pos logit contributions
 microwave+0.032
 North+0.031
 alarm+0.030
ric+0.030
 stories+0.029
Neg logit contributions
ising-0.035
ized-0.032
ters-0.030
mun-0.030
 members-0.029
Token happy
Token ID3772
Feature activation-7.37e-01

Pos logit contributions
 North+0.042
 microwave+0.041
 stories+0.040
 alarm+0.039
ores+0.038
Neg logit contributions
ising-0.057
ized-0.050
ters-0.050
mun-0.049
lest-0.047
Token,
Token ID11
Feature activation-1.63e+00

Pos logit contributions
 microwave+0.016
 North+0.016
 alarm+0.015
 stories+0.015
ric+0.015
Neg logit contributions
ising-0.021
ized-0.019
ters-0.018
mun-0.018
lest-0.018
Token brave
Token ID14802
Feature activation-6.71e-01

Pos logit contributions
 microwave+0.041
 North+0.039
 alarm+0.039
 Sale+0.038
ric+0.038
Neg logit contributions
ising-0.042
ized-0.038
ters-0.035
 members-0.034
ging-0.034
Token They
Token ID1119
Feature activation-1.24e+00

Pos logit contributions
ah+0.008
 microwave+0.008
 North+0.008
 alarm+0.008
at+0.008
Neg logit contributions
ising-0.009
ters-0.008
ized-0.008
aping-0.007
 resolve-0.007
Token said
Token ID531
Feature activation+1.43e+00

Pos logit contributions
 microwave+0.029
 North+0.028
 alarm+0.027
ric+0.027
 stories+0.027
Neg logit contributions
ising-0.034
ized-0.031
ters-0.030
mun-0.029
 members-0.029
Token sorry
Token ID7926
Feature activation-2.30e+00

Pos logit contributions
ising+0.029
ized+0.026
ters+0.025
ging+0.023
 members+0.023
Neg logit contributions
 North-0.043
ric-0.042
 microwave-0.042
tr-0.041
ah-0.040
Token to
Token ID284
Feature activation-2.21e+00

Pos logit contributions
 North+0.068
 microwave+0.068
ric+0.067
opl+0.067
 seventeen+0.066
Neg logit contributions
ising-0.049
ters-0.043
 together-0.040
 else-0.040
 further-0.038
Token their
Token ID511
Feature activation-1.69e+00

Pos logit contributions
 microwave+0.047
 North+0.046
 alarm+0.044
 stories+0.043
ric+0.043
Neg logit contributions
ising-0.065
ized-0.059
ters-0.057
mun-0.055
 members-0.055
Token mom
Token ID1995
Feature activation-4.04e+00

Pos logit contributions
 microwave+0.028
 North+0.028
 alarm+0.028
 stories+0.026
 chocolate+0.025
Neg logit contributions
ising-0.057
ized-0.053
ters-0.051
mun-0.051
lest-0.050
Token and
Token ID290
Feature activation-1.11e+00

Pos logit contributions
 North+0.103
 microwave+0.101
ric+0.097
 Friday+0.095
 marsh+0.095
Neg logit contributions
ising-0.100
ters-0.091
ized-0.087
 members-0.085
mun-0.084
Token dad
Token ID9955
Feature activation-1.45e+00

Pos logit contributions
 microwave+0.023
 North+0.023
 alarm+0.022
 stories+0.021
ric+0.021
Neg logit contributions
ising-0.033
ized-0.030
ters-0.029
mun-0.028
lest-0.028
Token.
Token ID13
Feature activation-8.66e-01

Pos logit contributions
 microwave+0.034
 North+0.034
 alarm+0.032
ric+0.032
tr+0.032
Neg logit contributions
ising-0.039
ized-0.035
ters-0.034
mun-0.033
 members-0.033
Token They
Token ID1119
Feature activation-1.59e+00

Pos logit contributions
ah+0.023
ric+0.022
at+0.022
 microwave+0.022
icle+0.021
Neg logit contributions
ising-0.019
ters-0.018
 members-0.016
ized-0.016
 together-0.016
Token played
Token ID2826
Feature activation-1.51e+00

Pos logit contributions
 microwave+0.040
 North+0.039
 alarm+0.038
 stories+0.037
ric+0.037
Neg logit contributions
ising-0.041
ized-0.036
ters-0.035
mun-0.034
lest-0.033
Tokeny
Token ID88
Feature activation-5.97e-01

Pos logit contributions
 microwave+0.001
 North+0.001
 alarm+0.001
ric+0.001
ores+0.001
Neg logit contributions
ising-0.001
ized-0.001
mun-0.001
 members-0.001
ters-0.001
Token.
Token ID13
Feature activation+3.98e-01

Pos logit contributions
 North+0.014
 microwave+0.014
 alarm+0.013
 stories+0.013
ric+0.013
Neg logit contributions
ising-0.016
ized-0.014
mun-0.014
ters-0.014
lest-0.013
Token Jack
Token ID3619
Feature activation+2.23e+00

Pos logit contributions
ising+0.013
ized+0.012
ters+0.011
mun+0.011
 members+0.011
Neg logit contributions
 microwave-0.007
 North-0.007
 alarm-0.007
 stories-0.007
ric-0.007
Token closed
Token ID4838
Feature activation-1.00e+00

Pos logit contributions
ising+0.054
ized+0.047
ters+0.045
 members+0.044
mun+0.044
Neg logit contributions
 microwave-0.060
 North-0.058
 alarm-0.056
 stories-0.056
ric-0.056
Token his
Token ID465
Feature activation-5.33e-01

Pos logit contributions
 microwave+0.022
 North+0.022
ric+0.021
 alarm+0.020
tr+0.020
Neg logit contributions
ising-0.029
ized-0.026
ters-0.025
ging-0.024
mun-0.024
Token eyes
Token ID2951
Feature activation-3.88e+00

Pos logit contributions
 North+0.008
 microwave+0.008
 Friday+0.008
tr+0.008
ores+0.008
Neg logit contributions
ising-0.019
ized-0.016
 members-0.016
ging-0.016
ters-0.016
Token and
Token ID290
Feature activation-9.49e-02

Pos logit contributions
 microwave+0.083
 North+0.081
 alarm+0.078
 stories+0.077
ric+0.077
Neg logit contributions
ising-0.112
ized-0.101
ters-0.098
mun-0.096
lest-0.094
Token stayed
Token ID9658
Feature activation+1.35e+00

Pos logit contributions
 microwave+0.002
 North+0.002
 stories+0.002
 alarm+0.002
 Friday+0.002
Neg logit contributions
ising-0.003
ters-0.002
ized-0.002
mun-0.002
ging-0.002
Token still
Token ID991
Feature activation-5.37e-01

Pos logit contributions
ising+0.032
ters+0.029
mun+0.029
lest+0.029
ized+0.029
Neg logit contributions
 microwave-0.034
 North-0.034
 alarm-0.033
 stories-0.033
ores-0.031
Token for
Token ID329
Feature activation-6.23e-01

Pos logit contributions
 North+0.013
 microwave+0.013
 alarm+0.013
 stories+0.013
 Friday+0.012
Neg logit contributions
ising-0.013
ized-0.012
ters-0.012
mun-0.011
lest-0.011
Token a
Token ID257
Feature activation-1.04e+00

Pos logit contributions
 microwave+0.014
 North+0.013
ric+0.013
 alarm+0.013
tr+0.013
Neg logit contributions
ising-0.018
ized-0.016
ters-0.015
 members-0.015
mun-0.015
Token Katie
Token ID24721
Feature activation-5.61e-01

Pos logit contributions
 microwave+0.039
 North+0.039
 alarm+0.038
 stories+0.038
 Friday+0.037
Neg logit contributions
ising-0.057
ized-0.051
ters-0.048
mun-0.048
lest-0.047
Token.
Token ID13
Feature activation-6.61e-01

Pos logit contributions
 microwave+0.015
 North+0.015
 alarm+0.014
 stories+0.014
 Sale+0.014
Neg logit contributions
ising-0.013
ized-0.012
ters-0.011
 members-0.011
lest-0.011
Token She
Token ID1375
Feature activation-1.40e+00

Pos logit contributions
 microwave+0.017
ric+0.017
 alarm+0.017
tr+0.017
 fever+0.016
Neg logit contributions
ising-0.015
ized-0.014
 members-0.013
mun-0.013
Gr-0.013
Token was
Token ID373
Feature activation+3.21e-01

Pos logit contributions
 microwave+0.036
 North+0.035
ric+0.035
 Sale+0.035
 seventeen+0.033
Neg logit contributions
ising-0.032
ized-0.030
mun-0.030
 members-0.030
Gr-0.028
Token 3
Token ID513
Feature activation+1.13e+00

Pos logit contributions
ising+0.011
ized+0.010
lest+0.010
ters+0.010
mun+0.010
Neg logit contributions
 microwave-0.005
 North-0.005
 alarm-0.005
 chocolate-0.005
 stories-0.005
Token years
Token ID812
Feature activation-3.96e+00

Pos logit contributions
ising+0.027
ized+0.024
ters+0.023
mun+0.022
lest+0.022
Neg logit contributions
 microwave-0.030
 North-0.029
 alarm-0.029
 stories-0.028
ric-0.028
Token old
Token ID1468
Feature activation+6.11e-01

Pos logit contributions
 microwave+0.082
 North+0.081
 alarm+0.079
 stories+0.076
 Friday+0.075
Neg logit contributions
ising-0.119
ized-0.107
ters-0.105
mun-0.103
ging-0.100
Token.
Token ID13
Feature activation+4.57e-02

Pos logit contributions
ising+0.019
ized+0.017
ters+0.016
lest+0.016
mun+0.016
Neg logit contributions
 microwave-0.012
 North-0.012
 alarm-0.012
 stories-0.011
 chocolate-0.011
Token One
Token ID1881
Feature activation-8.18e-01

Pos logit contributions
ising+0.001
ized+0.001
 members+0.001
mun+0.001
Gr+0.001
Neg logit contributions
 microwave-0.001
tr-0.001
 alarm-0.001
ric-0.001
 North-0.001
Token day
Token ID1110
Feature activation+9.39e-01

Pos logit contributions
 microwave+0.017
ric+0.017
 North+0.017
 Sale+0.017
 seventeen+0.016
Neg logit contributions
ising-0.023
ized-0.021
 members-0.021
lest-0.020
mun-0.020
Token,
Token ID11
Feature activation+6.69e-01

Pos logit contributions
ising+0.028
ized+0.025
ters+0.025
mun+0.024
lest+0.023
Neg logit contributions
 microwave-0.019
 North-0.019
 Friday-0.019
 stories-0.018
 alarm-0.018
Token a
Token ID257
Feature activation+1.41e+00

Pos logit contributions
 microwave+0.007
 North+0.007
 alarm+0.006
ric+0.006
 stories+0.006
Neg logit contributions
ising-0.010
ized-0.009
ters-0.009
mun-0.009
 members-0.009
Token cheerful
Token ID37999
Feature activation-3.71e-01

Pos logit contributions
ising+0.038
ized+0.033
mun+0.032
ters+0.032
 members+0.032
Neg logit contributions
 microwave-0.031
 North-0.031
ric-0.031
ores-0.031
 Friday-0.030
Token little
Token ID1310
Feature activation+1.64e+00

Pos logit contributions
 microwave+0.009
 North+0.009
ric+0.009
ores+0.009
 Sale+0.009
Neg logit contributions
ising-0.009
ized-0.008
ters-0.008
 members-0.007
mun-0.007
Token 3
Token ID513
Feature activation+5.11e-01

Pos logit contributions
ising+0.040
ized+0.037
ters+0.033
lest+0.033
mun+0.033
Neg logit contributions
 alarm-0.043
 microwave-0.043
 chocolate-0.042
 North-0.042
 stories-0.041
Token-
Token ID12
Feature activation+2.42e-01

Pos logit contributions
ising+0.015
ized+0.014
ters+0.014
mun+0.013
lest+0.013
Neg logit contributions
 microwave-0.010
 alarm-0.010
 North-0.010
 stories-0.010
 chocolate-0.009
Tokenyear
Token ID1941
Feature activation-4.04e+00

Pos logit contributions
ising+0.007
ized+0.006
ters+0.006
bing+0.005
 resolve+0.005
Neg logit contributions
 microwave-0.006
 alarm-0.006
 North-0.005
ah-0.005
ores-0.005
Token-
Token ID12
Feature activation-6.28e-02

Pos logit contributions
 microwave+0.120
 North+0.119
 alarm+0.117
 stories+0.116
 Friday+0.113
Neg logit contributions
ising-0.082
ized-0.071
ters-0.069
mun-0.066
lest-0.064
Tokenold
Token ID727
Feature activation+2.78e+00

Pos logit contributions
 stories+0.002
 microwave+0.002
 North+0.002
 alarm+0.002
 chocolate+0.002
Neg logit contributions
ising-0.001
ters-0.001
ized-0.001
ging-0.001
 resolve-0.001
Token boy
Token ID2933
Feature activation+2.09e+00

Pos logit contributions
ising+0.073
ized+0.068
lest+0.061
ters+0.059
mun+0.058
Neg logit contributions
 microwave-0.068
 alarm-0.067
 stories-0.066
 chocolate-0.066
 North-0.064
Token.
Token ID13
Feature activation+1.11e+00

Pos logit contributions
ising+0.073
ized+0.068
lest+0.066
ters+0.064
Gr+0.062
Neg logit contributions
 alarm-0.033
 chocolate-0.032
 microwave-0.030
 North-0.030
ric-0.030
Token He
Token ID679
Feature activation-1.05e-02

Pos logit contributions
ising+0.032
ized+0.030
ters+0.030
mun+0.030
 members+0.029
Neg logit contributions
 North-0.021
 microwave-0.020
ric-0.020
 alarm-0.020
tr-0.019