TOP ACTIVATIONS
MAX = 4.202

went back to the gloomy alley again . <|endoftext|> Once upon
There was a gr umpy ranc her named Tom . Tom
Once Sus ie was home , she decided to play
The other animals were un imp ressed and scoff ed at
she spotted it in an alley . But when
mom never returned to the lab . They were both very
. " Help ! Help !" Lucy cried .
scratched . The ranc her heard the noise and
They never went near the hedge again . <|endoftext|> L ily
couldn 't . " Help ! Help !" he cried .
paper flew away . Pat ches was disappointed - she wanted
Her mom saw the Hedge , and immediately smiled at
the story is : Mat ches are not toys and can
What happened , Tim my ? Why are you crying ?"
But when he arrived home , he found out that
. " Uh - oh , where did it
. " Uh - oh ," Jake said .
Billy cried , " Help ! Help ! I am stuck
while walking in the me adow , he saw an available
came time to say good - bye , Lucy gave her

MIN ACTIVATIONS
MIN = -4.562

could soar in the sky higher than any other bird .
. It will make them louder and faster ." Mia says
will make them louder and faster ." Mia says , "
sun and stretch her stems higher and higher . She had
the elevator as it rose higher and higher . She looked
. It is faster and stronger . I have a tool
's measure how tall our tower is ," Lily says .
places and ride the swings again . After a
him to flap his wings harder . The little
, but the ground shook harder and harder . Tim 's
his face as he soared higher and higher .
wind lifting her k ite higher and higher .
trees to make them shine again . As the
. Sara can count to ten . Ben can count to
of cheese and sauce and topp ings . The people who
other and go higher and higher . " Look
" Let 's push it harder !" " OK
swing and began to swing higher and higher .
new tutor as soon as possible . We love you ,
led as the ball bounced higher and higher .

INTERVAL 2.011 - 4.202
CONTAINS 0.601%

. After playing with his new toy for
replied , " Oh no ! You need to wear your
. He was three years old . One day , Mama
, she saw a funny - looking bug . She wanted
running around and making a lot of noise . Old Sam

INTERVAL -0.180 - 2.011
CONTAINS 59.591%

't !" Mom my and Daddy went out the
park when she saw a sweater that she liked . She
The mother smiled and said , " Of course we can
She showed her the children 's section and asked if she
As Blue played , everyone danced and had a great time

INTERVAL -2.371 - -0.180
CONTAINS 39.415%

getting a drink . ” His mum was
kept examining the area where it had been . But the
pretended to be mighty hunters who chased animals with their spears
sad and angry . He wanted the dish very much .
you act and treat others that really matters ! Being kind

INTERVAL -4.562 - -2.371
CONTAINS 0.393%

before . <|endoftext|> Once upon a time , there was a
and laughed as he wob bled around the ice .
zoo . <|endoftext|> Once upon a time , there was a
happy . <|endoftext|> Once upon a time , there was a
. She opens it and lets the bird out .
Token went
Token ID1816
Feature activation+7.50e-01

Pos logit contributions
taker+0.012
 devastated+0.012
chair+0.012
ase+0.012
he+0.012
Neg logit contributions
 brace-0.013
yt-0.012
 swe-0.012
 narrative-0.012
 runs-0.012
Token back
Token ID736
Feature activation+1.17e+00

Pos logit contributions
taker+0.016
 devastated+0.015
ase+0.015
he+0.015
 Shards+0.015
Neg logit contributions
 brace-0.017
 swe-0.017
 runs-0.016
yt-0.016
cast-0.016
Token to
Token ID284
Feature activation+1.73e+00

Pos logit contributions
taker+0.038
 devastated+0.034
ase+0.034
 Shards+0.034
ions+0.033
Neg logit contributions
 runs-0.018
cast-0.016
 brace-0.016
 swe-0.016
 goes-0.014
Token the
Token ID262
Feature activation+3.34e-01

Pos logit contributions
taker+0.040
ase+0.038
 devastated+0.036
chair+0.036
he+0.036
Neg logit contributions
 brace-0.039
 swe-0.037
 runs-0.037
yt-0.036
cast-0.034
Token gloomy
Token ID46400
Feature activation+3.60e-01

Pos logit contributions
taker+0.007
 devastated+0.007
ase+0.006
he+0.006
chair+0.006
Neg logit contributions
 brace-0.008
yt-0.007
 swe-0.007
 runs-0.007
 narrative-0.007
Token alley
Token ID23978
Feature activation+4.20e+00

Pos logit contributions
taker+0.008
 devastated+0.008
har+0.007
he+0.007
ase+0.007
Neg logit contributions
 brace-0.008
yt-0.008
 swe-0.008
 runs-0.008
 narrative-0.007
Token again
Token ID757
Feature activation-1.28e+00

Pos logit contributions
ions+0.103
taker+0.103
 Shards+0.099
uster+0.096
ioned+0.096
Neg logit contributions
yt-0.074
cast-0.073
 brace-0.072
 runs-0.069
 swe-0.068
Token.
Token ID13
Feature activation+7.74e-01

Pos logit contributions
 brace+0.027
 narrative+0.025
 swe+0.025
yt+0.025
 runs+0.025
Neg logit contributions
 devastated-0.028
taker-0.028
he-0.027
har-0.027
chair-0.027
Token<|endoftext|>
Token ID50256
Feature activation+1.49e+00

Pos logit contributions
ner+0.020
taker+0.020
 devastated+0.019
ions+0.019
har+0.019
Neg logit contributions
 runs-0.012
 brace-0.011
 swe-0.011
 narrative-0.011
cast-0.011
TokenOnce
Token ID7454
Feature activation-8.28e-02

Pos logit contributions
taker+0.037
 devastated+0.035
har+0.035
 rubble+0.034
chair+0.032
Neg logit contributions
 brace-0.027
 runs-0.027
cast-0.026
 narrative-0.026
yt-0.026
Token upon
Token ID2402
Feature activation+8.66e-01

Pos logit contributions
 brace+0.002
yt+0.002
 swe+0.002
 runs+0.002
 narrative+0.002
Neg logit contributions
taker-0.002
 devastated-0.002
ase-0.001
he-0.001
chair-0.001
TokenThere
Token ID1858
Feature activation-1.14e+00

Pos logit contributions
taker+0.024
 devastated+0.023
har+0.022
 rubble+0.022
chair+0.021
Neg logit contributions
 runs-0.020
 brace-0.020
yt-0.020
cast-0.020
 narrative-0.020
Token was
Token ID373
Feature activation-4.66e-01

Pos logit contributions
 brace+0.027
yt+0.026
 runs+0.026
 swe+0.026
cast+0.025
Neg logit contributions
taker-0.023
har-0.022
 devastated-0.022
 rubble-0.021
chair-0.021
Token a
Token ID257
Feature activation-1.06e+00

Pos logit contributions
 runs+0.010
 brace+0.009
Say+0.009
 swe+0.009
yt+0.009
Neg logit contributions
har-0.011
taker-0.011
ase-0.010
 devastated-0.010
 rubble-0.009
Token gr
Token ID1036
Feature activation+2.07e+00

Pos logit contributions
 brace+0.022
yt+0.021
 swe+0.021
 runs+0.021
 narrative+0.020
Neg logit contributions
taker-0.024
 devastated-0.023
ase-0.023
he-0.022
ions-0.022
Tokenumpy
Token ID32152
Feature activation+1.14e+00

Pos logit contributions
 aback+0.045
 devastated+0.044
 rescued+0.043
wards+0.043
ions+0.043
Neg logit contributions
yt-0.045
 narrative-0.038
ints-0.038
cast-0.038
rick-0.038
Token ranc
Token ID32205
Feature activation+4.03e+00

Pos logit contributions
 devastated+0.030
 aback+0.030
ions+0.029
har+0.029
taker+0.028
Neg logit contributions
yt-0.020
 brace-0.019
 swe-0.019
 runs-0.019
cast-0.018
Tokenher
Token ID372
Feature activation+8.81e-01

Pos logit contributions
taker+0.083
 devastated+0.080
ase+0.077
chair+0.076
 Shards+0.076
Neg logit contributions
 brace-0.092
yt-0.088
 swe-0.086
 runs-0.085
 narrative-0.085
Token named
Token ID3706
Feature activation+1.12e+00

Pos logit contributions
taker+0.022
 devastated+0.022
har+0.021
 Shards+0.021
 rubble+0.021
Neg logit contributions
 brace-0.015
yt-0.015
 swe-0.015
 runs-0.015
cast-0.014
Token Tom
Token ID4186
Feature activation-1.50e+00

Pos logit contributions
taker+0.020
 devastated+0.020
ase+0.019
har+0.018
he+0.018
Neg logit contributions
 brace-0.028
yt-0.028
 swe-0.028
 runs-0.027
cast-0.027
Token.
Token ID13
Feature activation-3.00e-01

Pos logit contributions
 brace+0.026
yt+0.025
 swe+0.024
 runs+0.024
 narrative+0.024
Neg logit contributions
taker-0.039
 devastated-0.038
ase-0.037
he-0.037
chair-0.037
Token Tom
Token ID4186
Feature activation-2.63e-01

Pos logit contributions
 brace+0.008
yt+0.008
 swe+0.007
 narrative+0.007
cast+0.007
Neg logit contributions
chair-0.006
he-0.005
 Shards-0.005
 jail-0.005
ase-0.005
Token
Token ID198
Feature activation+5.12e-01

Pos logit contributions
taker+0.013
 devastated+0.012
 rubble+0.012
chair+0.012
ase+0.012
Neg logit contributions
 brace-0.011
 runs-0.010
yt-0.010
 swe-0.010
Nik-0.010
TokenOnce
Token ID7454
Feature activation+8.27e-01

Pos logit contributions
taker+0.013
 devastated+0.012
 rubble+0.012
chair+0.012
 Shards+0.012
Neg logit contributions
 brace-0.010
Say-0.009
 swe-0.009
 runs-0.009
 narrative-0.009
Token Sus
Token ID8932
Feature activation+1.60e+00

Pos logit contributions
 devastated+0.022
har+0.022
 rubble+0.021
taker+0.021
 Shards+0.020
Neg logit contributions
yt-0.014
 runs-0.014
 brace-0.014
cast-0.013
 swe-0.013
Tokenie
Token ID494
Feature activation+1.33e+00

Pos logit contributions
taker+0.045
 devastated+0.043
har+0.042
 Shards+0.042
 rubble+0.042
Neg logit contributions
 brace-0.026
yt-0.025
 runs-0.024
 swe-0.024
cast-0.022
Token was
Token ID373
Feature activation+2.17e-01

Pos logit contributions
taker+0.031
 Shards+0.028
 devastated+0.028
ions+0.028
ase+0.028
Neg logit contributions
 runs-0.029
 brace-0.028
 swe-0.027
yt-0.027
 narrative-0.026
Token home
Token ID1363
Feature activation+4.01e+00

Pos logit contributions
ase+0.005
taker+0.005
ions+0.005
rolley+0.005
mith+0.005
Neg logit contributions
 runs-0.004
 swe-0.004
 brace-0.004
 nineteen-0.003
ints-0.003
Token,
Token ID11
Feature activation+1.33e+00

Pos logit contributions
ase+0.120
 Shards+0.116
idd+0.114
taker+0.114
 devastated+0.110
Neg logit contributions
 swe-0.058
yt-0.057
 brace-0.056
 runs-0.050
cast-0.049
Token she
Token ID673
Feature activation-2.46e-01

Pos logit contributions
ase+0.049
idd+0.048
taker+0.047
har+0.046
ore+0.046
Neg logit contributions
 brace-0.014
 swe-0.011
yt-0.009
 runs-0.008
 hats-0.007
Token decided
Token ID3066
Feature activation-5.97e-02

Pos logit contributions
 brace+0.005
 runs+0.005
yt+0.005
 swe+0.005
cast+0.004
Neg logit contributions
taker-0.006
ase-0.006
he-0.006
 devastated-0.006
ions-0.006
Token to
Token ID284
Feature activation+9.33e-01

Pos logit contributions
 brace+0.001
yt+0.001
 narrative+0.001
 swe+0.001
 runs+0.001
Neg logit contributions
taker-0.001
 devastated-0.001
chair-0.001
 aback-0.001
har-0.001
Token play
Token ID711
Feature activation+1.77e+00

Pos logit contributions
taker+0.026
 devastated+0.025
ase+0.024
 Shards+0.024
chair+0.024
Neg logit contributions
 brace-0.015
 runs-0.014
 swe-0.014
yt-0.014
 narrative-0.014
TokenThe
Token ID464
Feature activation-4.50e-01

Pos logit contributions
 brace+0.008
yt+0.007
 narrative+0.007
 runs+0.007
 swe+0.007
Neg logit contributions
taker-0.013
 devastated-0.012
 rubble-0.012
chair-0.012
ase-0.012
Token other
Token ID584
Feature activation-6.61e-02

Pos logit contributions
 brace+0.009
 swe+0.009
yt+0.009
 narrative+0.009
 runs+0.009
Neg logit contributions
taker-0.010
 devastated-0.010
ase-0.010
he-0.010
 Shards-0.010
Token animals
Token ID4695
Feature activation+7.62e-01

Pos logit contributions
 brace+0.001
 runs+0.001
yt+0.001
 swe+0.001
 narrative+0.001
Neg logit contributions
taker-0.002
 devastated-0.001
he-0.001
har-0.001
 aback-0.001
Token were
Token ID547
Feature activation-7.35e-01

Pos logit contributions
 Shards+0.018
ase+0.018
har+0.018
taker+0.017
chair+0.017
Neg logit contributions
 runs-0.015
 narrative-0.014
 brace-0.014
cast-0.013
 swe-0.013
Token un
Token ID555
Feature activation+1.95e+00

Pos logit contributions
 brace+0.017
 swe+0.016
 runs+0.016
yt+0.015
 hats+0.015
Neg logit contributions
ase-0.015
taker-0.015
chair-0.015
 Shards-0.015
he-0.014
Tokenimp
Token ID11011
Feature activation+3.94e+00

Pos logit contributions
taker+0.038
chair+0.037
 Shards+0.037
 rubble+0.036
ase+0.035
Neg logit contributions
yt-0.044
 brace-0.044
cast-0.043
 swe-0.042
ints-0.041
Tokenressed
Token ID2790
Feature activation-4.55e-01

Pos logit contributions
 Shards+0.089
 rubble+0.081
 devastated+0.079
taker+0.077
 doorway+0.077
Neg logit contributions
 brace-0.080
yt-0.076
 runs-0.075
cast-0.074
 swe-0.073
Token and
Token ID290
Feature activation+5.63e-02

Pos logit contributions
 brace+0.008
 swe+0.008
 runs+0.008
yt+0.008
cast+0.007
Neg logit contributions
taker-0.012
 devastated-0.011
 Shards-0.011
ase-0.011
har-0.011
Token scoff
Token ID39247
Feature activation+5.67e-02

Pos logit contributions
ase+0.001
he+0.001
taker+0.001
 devastated+0.001
 Shards+0.001
Neg logit contributions
 brace-0.001
 runs-0.001
 swe-0.001
yt-0.001
cast-0.001
Tokened
Token ID276
Feature activation-5.96e-01

Pos logit contributions
taker+0.001
 devastated+0.001
ase+0.001
he+0.001
har+0.001
Neg logit contributions
 brace-0.002
yt-0.002
 swe-0.002
 runs-0.002
 narrative-0.002
Token at
Token ID379
Feature activation-5.12e-01

Pos logit contributions
 brace+0.011
yt+0.011
 swe+0.011
 narrative+0.011
 runs+0.010
Neg logit contributions
taker-0.015
 devastated-0.014
he-0.014
ase-0.014
har-0.014
Token she
Token ID673
Feature activation-6.03e-01

Pos logit contributions
taker+0.009
ase+0.009
 devastated+0.009
har+0.009
chair+0.008
Neg logit contributions
 brace-0.006
yt-0.006
 runs-0.006
 swe-0.006
cast-0.006
Token spotted
Token ID13489
Feature activation+4.28e-01

Pos logit contributions
 brace+0.012
yt+0.012
 runs+0.012
cast+0.012
 swe+0.012
Neg logit contributions
taker-0.015
 devastated-0.013
chair-0.013
he-0.013
ions-0.013
Token it
Token ID340
Feature activation+1.57e+00

Pos logit contributions
har+0.010
taker+0.010
 devastated+0.010
ase+0.009
he+0.009
Neg logit contributions
 brace-0.009
 swe-0.009
 hats-0.008
 runs-0.008
cast-0.008
Token in
Token ID287
Feature activation+1.09e+00

Pos logit contributions
taker+0.032
 devastated+0.031
he+0.030
ase+0.030
chair+0.030
Neg logit contributions
 brace-0.036
yt-0.035
 swe-0.034
 narrative-0.034
 runs-0.034
Token an
Token ID281
Feature activation+5.65e-01

Pos logit contributions
taker+0.025
ase+0.023
 devastated+0.023
chair+0.023
har+0.022
Neg logit contributions
 brace-0.025
yt-0.023
 swe-0.023
 runs-0.022
Say-0.022
Token alley
Token ID23978
Feature activation+3.89e+00

Pos logit contributions
taker+0.013
 devastated+0.013
ase+0.013
har+0.013
he+0.012
Neg logit contributions
 brace-0.011
 swe-0.011
 runs-0.011
yt-0.011
 narrative-0.010
Token.
Token ID13
Feature activation+5.55e-01

Pos logit contributions
taker+0.096
ions+0.092
 Shards+0.092
 devastated+0.091
har+0.090
Neg logit contributions
yt-0.070
 brace-0.066
 runs-0.062
cast-0.062
 narrative-0.059
Token
Token ID198
Feature activation-2.51e-01

Pos logit contributions
taker+0.013
 devastated+0.012
har+0.012
 rubble+0.012
ase+0.012
Neg logit contributions
 brace-0.011
yt-0.011
 runs-0.011
 swe-0.011
cast-0.010
Token
Token ID198
Feature activation+2.68e-01

Pos logit contributions
 brace+0.005
 runs+0.005
yt+0.005
 swe+0.005
cast+0.005
Neg logit contributions
taker-0.005
 devastated-0.005
har-0.005
 rubble-0.005
ase-0.005
TokenBut
Token ID1537
Feature activation+1.25e+00

Pos logit contributions
taker+0.007
 devastated+0.006
he+0.006
ase+0.006
chair+0.006
Neg logit contributions
 brace-0.005
 swe-0.005
yt-0.005
 runs-0.005
 narrative-0.005
Token when
Token ID618
Feature activation-2.89e-01

Pos logit contributions
ase+0.033
taker+0.032
 rubble+0.032
chair+0.030
har+0.030
Neg logit contributions
 brace-0.022
yt-0.021
 runs-0.021
 swe-0.021
 hats-0.019
Token mom
Token ID1995
Feature activation+1.77e+00

Pos logit contributions
 aback+0.010
ions+0.010
taker+0.009
ase+0.009
he+0.009
Neg logit contributions
 swe-0.003
 brace-0.003
 hats-0.003
 narrative-0.003
 runs-0.003
Token never
Token ID1239
Feature activation+1.30e-01

Pos logit contributions
taker+0.047
ase+0.046
ions+0.044
 Shards+0.044
har+0.044
Neg logit contributions
 runs-0.030
 brace-0.029
 swe-0.029
yt-0.028
 narrative-0.027
Token returned
Token ID4504
Feature activation-2.21e-01

Pos logit contributions
taker+0.003
 devastated+0.003
chair+0.002
har+0.002
he+0.002
Neg logit contributions
 brace-0.003
 narrative-0.003
yt-0.003
 swe-0.003
cast-0.003
Token to
Token ID284
Feature activation+1.66e+00

Pos logit contributions
 brace+0.004
 swe+0.004
yt+0.004
 runs+0.004
cast+0.004
Neg logit contributions
taker-0.006
ase-0.006
 devastated-0.006
chair-0.006
he-0.006
Token the
Token ID262
Feature activation+1.49e-01

Pos logit contributions
ase+0.046
taker+0.045
chair+0.041
idd+0.041
he+0.040
Neg logit contributions
 swe-0.031
 runs-0.030
 brace-0.030
yt-0.028
 hats-0.028
Token lab
Token ID2248
Feature activation+3.85e+00

Pos logit contributions
taker+0.004
ase+0.003
ions+0.003
 devastated+0.003
he+0.003
Neg logit contributions
 swe-0.003
 runs-0.003
 narrative-0.003
 brace-0.003
yt-0.003
Token.
Token ID13
Feature activation+1.09e+00

Pos logit contributions
 Shards+0.096
taker+0.096
idd+0.095
ions+0.094
har+0.091
Neg logit contributions
yt-0.064
 swe-0.062
 brace-0.061
 runs-0.060
cast-0.059
Token They
Token ID1119
Feature activation+1.46e+00

Pos logit contributions
taker+0.032
ions+0.031
ase+0.031
idd+0.030
ner+0.030
Neg logit contributions
 narrative-0.015
 runs-0.015
 swe-0.014
ints-0.014
yt-0.014
Token were
Token ID547
Feature activation+5.90e-01

Pos logit contributions
ase+0.046
ions+0.044
ocolate+0.044
ore+0.042
idd+0.042
Neg logit contributions
 runs-0.021
 would-0.019
 swe-0.017
 goes-0.017
 narrative-0.016
Token both
Token ID1111
Feature activation+2.67e+00

Pos logit contributions
ase+0.017
ner+0.016
ival+0.016
ore+0.016
he+0.016
Neg logit contributions
 swe-0.010
 runs-0.010
 often-0.008
 long-0.008
 best-0.008
Token very
Token ID845
Feature activation+6.62e-01

Pos logit contributions
ase+0.089
ocolate+0.084
ions+0.084
oon+0.083
ival+0.078
Neg logit contributions
 long-0.034
 runs-0.034
 best-0.033
 swe-0.032
 brown-0.029
Token.
Token ID13
Feature activation+1.18e+00

Pos logit contributions
taker+0.001
 devastated+0.001
he+0.001
chair+0.001
ase+0.001
Neg logit contributions
 brace-0.001
yt-0.001
 swe-0.001
 runs-0.001
 narrative-0.001
Token
Token ID198
Feature activation+6.56e-01

Pos logit contributions
taker+0.033
ions+0.031
idd+0.030
 devastated+0.030
har+0.030
Neg logit contributions
 brace-0.019
 swe-0.019
 runs-0.019
One-0.017
yt-0.016
Token
Token ID198
Feature activation+1.37e+00

Pos logit contributions
taker+0.015
 devastated+0.014
har+0.014
 rubble+0.014
ase+0.014
Neg logit contributions
 brace-0.013
yt-0.013
 runs-0.013
 swe-0.013
 narrative-0.012
Token"
Token ID1
Feature activation+1.82e-01

Pos logit contributions
taker+0.033
 devastated+0.030
ase+0.030
ions+0.030
 rubble+0.029
Neg logit contributions
 brace-0.027
 swe-0.026
 runs-0.026
yt-0.025
Say-0.024
TokenHelp
Token ID22087
Feature activation+2.81e-01

Pos logit contributions
taker+0.005
 devastated+0.004
ore+0.004
 rubble+0.004
ions+0.004
Neg logit contributions
 brace-0.003
 swe-0.003
Say-0.003
 narrative-0.003
Nik-0.003
Token!
Token ID0
Feature activation+3.85e+00

Pos logit contributions
taker+0.007
 devastated+0.007
ase+0.007
he+0.007
chair+0.007
Neg logit contributions
 brace-0.005
yt-0.005
 swe-0.005
 runs-0.005
 narrative-0.005
Token Help
Token ID10478
Feature activation-2.57e-01

Pos logit contributions
he+0.093
taker+0.091
 rubble+0.089
 aback+0.086
 devastated+0.086
Neg logit contributions
 runs-0.071
 swe-0.064
Nik-0.060
 lets-0.059
 brace-0.059
Token!"
Token ID2474
Feature activation+5.94e-01

Pos logit contributions
 runs+0.004
 brace+0.004
 swe+0.004
yt+0.004
Nik+0.003
Neg logit contributions
 devastated-0.007
taker-0.007
har-0.007
 arrested-0.007
ase-0.007
Token Lucy
Token ID22162
Feature activation+1.63e+00

Pos logit contributions
ase+0.015
idd+0.015
har+0.015
ions+0.014
uster+0.014
Neg logit contributions
 runs-0.012
 brace-0.011
 swe-0.010
 says-0.010
yt-0.010
Token cried
Token ID16896
Feature activation+1.68e+00

Pos logit contributions
taker+0.040
ase+0.039
he+0.039
ions+0.038
 devastated+0.038
Neg logit contributions
 brace-0.031
 runs-0.031
yt-0.029
 swe-0.029
ints-0.026
Token.
Token ID13
Feature activation+1.64e+00

Pos logit contributions
ase+0.048
taker+0.048
 devastated+0.047
har+0.046
idd+0.046
Neg logit contributions
 brace-0.025
 swe-0.024
yt-0.023
 runs-0.022
 narrative-0.020
Token scratched
Token ID37468
Feature activation-9.22e-01

Pos logit contributions
ase+0.044
taker+0.043
he+0.042
 devastated+0.041
ival+0.040
Neg logit contributions
 brace-0.043
 runs-0.042
 swe-0.041
yt-0.038
 hats-0.037
Token.
Token ID13
Feature activation-1.39e-01

Pos logit contributions
 brace+0.020
yt+0.020
 swe+0.019
 narrative+0.019
 runs+0.018
Neg logit contributions
taker-0.020
 devastated-0.019
he-0.019
chair-0.019
 Shards-0.018
Token
Token ID198
Feature activation+5.02e-01

Pos logit contributions
 runs+0.002
 brace+0.002
 narrative+0.002
Say+0.002
Nik+0.002
Neg logit contributions
taker-0.004
ase-0.004
 devastated-0.004
he-0.004
har-0.004
Token
Token ID198
Feature activation+5.02e-01

Pos logit contributions
taker+0.013
ase+0.012
 devastated+0.011
har+0.011
 aback+0.011
Neg logit contributions
 runs-0.009
 brace-0.008
One-0.008
Say-0.008
Nik-0.008
TokenThe
Token ID464
Feature activation-9.68e-01

Pos logit contributions
taker+0.015
ase+0.014
 devastated+0.014
 aback+0.014
chair+0.014
Neg logit contributions
 runs-0.006
 brace-0.006
Say-0.006
 narrative-0.006
Nik-0.006
Token ranc
Token ID32205
Feature activation+3.82e+00

Pos logit contributions
 brace+0.020
yt+0.019
cast+0.018
 swe+0.018
 runs+0.018
Neg logit contributions
taker-0.023
 devastated-0.022
chair-0.021
he-0.021
 rubble-0.021
Tokenher
Token ID372
Feature activation+1.02e+00

Pos logit contributions
taker+0.080
 Shards+0.076
 devastated+0.076
ions+0.076
 aback+0.075
Neg logit contributions
 brace-0.083
 narrative-0.079
 runs-0.077
yt-0.077
cast-0.076
Token heard
Token ID2982
Feature activation-8.27e-01

Pos logit contributions
taker+0.024
 devastated+0.023
he+0.023
chair+0.022
ase+0.022
Neg logit contributions
 brace-0.020
 runs-0.020
yt-0.019
 swe-0.019
 hats-0.018
Token the
Token ID262
Feature activation-7.31e-01

Pos logit contributions
 runs+0.010
 hats+0.009
 swe+0.009
 brace+0.009
Say+0.008
Neg logit contributions
har-0.027
taker-0.026
ase-0.026
idd-0.025
days-0.025
Token noise
Token ID7838
Feature activation+6.63e-01

Pos logit contributions
 brace+0.017
yt+0.016
 swe+0.015
 narrative+0.015
cast+0.015
Neg logit contributions
taker-0.016
 devastated-0.015
 rubble-0.015
chair-0.014
 jail-0.014
Token and
Token ID290
Feature activation+1.62e+00

Pos logit contributions
ase+0.017
taker+0.017
har+0.017
 devastated+0.017
 Shards+0.017
Neg logit contributions
 runs-0.011
 brace-0.011
 swe-0.011
yt-0.011
cast-0.010
Token They
Token ID1119
Feature activation+7.04e-01

Pos logit contributions
 brace+0.004
yt+0.004
 swe+0.004
 runs+0.004
 narrative+0.004
Neg logit contributions
taker-0.008
 devastated-0.007
ase-0.007
he-0.007
chair-0.007
Token never
Token ID1239
Feature activation+1.02e-01

Pos logit contributions
taker+0.013
 devastated+0.013
he+0.012
har+0.012
chair+0.012
Neg logit contributions
 brace-0.018
yt-0.017
 swe-0.017
cast-0.016
 runs-0.016
Token went
Token ID1816
Feature activation+6.36e-01

Pos logit contributions
 devastated+0.002
taker+0.002
chair+0.002
 jail+0.002
he+0.002
Neg logit contributions
 brace-0.002
yt-0.002
cast-0.002
 narrative-0.002
 swe-0.002
Token near
Token ID1474
Feature activation+8.07e-01

Pos logit contributions
taker+0.013
 devastated+0.012
ase+0.012
he+0.012
chair+0.012
Neg logit contributions
 brace-0.015
yt-0.015
 swe-0.014
 runs-0.014
 narrative-0.014
Token the
Token ID262
Feature activation+1.89e-01

Pos logit contributions
taker+0.024
chair+0.021
ase+0.021
har+0.021
he+0.021
Neg logit contributions
 swe-0.014
 brace-0.014
 hats-0.014
 runs-0.013
 signs-0.012
Token hedge
Token ID19859
Feature activation+3.81e+00

Pos logit contributions
taker+0.005
 devastated+0.004
 aback+0.004
ase+0.004
 Shards+0.004
Neg logit contributions
 brace-0.004
 swe-0.004
 runs-0.004
yt-0.004
 narrative-0.003
Token again
Token ID757
Feature activation-1.37e+00

Pos logit contributions
taker+0.092
 Shards+0.089
 devastated+0.087
har+0.085
uster+0.084
Neg logit contributions
 brace-0.071
 swe-0.069
yt-0.069
cast-0.065
 runs-0.064
Token.
Token ID13
Feature activation-5.10e-03

Pos logit contributions
 brace+0.027
yt+0.025
 swe+0.025
 narrative+0.025
 runs+0.024
Neg logit contributions
 devastated-0.032
taker-0.032
he-0.031
chair-0.031
har-0.030
Token<|endoftext|>
Token ID50256
Feature activation+2.01e-01

Pos logit contributions
 brace+0.000
 narrative+0.000
yt+0.000
 swe+0.000
 runs+0.000
Neg logit contributions
taker-0.000
 devastated-0.000
ase-0.000
he-0.000
har-0.000
TokenL
Token ID43
Feature activation+1.54e+00

Pos logit contributions
taker+0.004
 devastated+0.004
har+0.004
 rubble+0.004
chair+0.004
Neg logit contributions
 brace-0.004
 runs-0.004
 narrative-0.004
yt-0.004
 swe-0.004
Tokenily
Token ID813
Feature activation-1.95e-01

Pos logit contributions
taker+0.037
 devastated+0.036
he+0.035
ase+0.035
chair+0.035
Neg logit contributions
 brace-0.030
yt-0.029
 swe-0.029
 runs-0.028
 narrative-0.028
Token couldn
Token ID3521
Feature activation+1.99e+00

Pos logit contributions
taker+0.020
 aback+0.018
ions+0.018
 rubble+0.018
ocolate+0.018
Neg logit contributions
 runs-0.014
 brace-0.014
i-0.013
cast-0.013
yt-0.013
Token't
Token ID470
Feature activation+1.26e+00

Pos logit contributions
taker+0.027
 devastated+0.025
har+0.024
ase+0.023
 Shards+0.022
Neg logit contributions
 brace-0.060
 runs-0.058
yt-0.057
 swe-0.057
 narrative-0.057
Token.
Token ID13
Feature activation+1.93e+00

Pos logit contributions
taker+0.032
 devastated+0.031
ase+0.031
chair+0.030
 Shards+0.030
Neg logit contributions
 brace-0.023
yt-0.022
 runs-0.022
 swe-0.021
cast-0.020
Token "
Token ID366
Feature activation-3.83e-01

Pos logit contributions
ions+0.056
he+0.054
taker+0.054
itty+0.053
ioned+0.052
Neg logit contributions
 runs-0.027
cast-0.026
 brace-0.026
 swe-0.025
ints-0.024
TokenHelp
Token ID22087
Feature activation+2.62e-01

Pos logit contributions
Say+0.007
 brace+0.006
 narrative+0.006
 runs+0.006
 swe+0.006
Neg logit contributions
taker-0.010
ions-0.009
 devastated-0.009
ase-0.009
chair-0.009
Token!
Token ID0
Feature activation+3.78e+00

Pos logit contributions
taker+0.007
 devastated+0.007
ase+0.006
he+0.006
chair+0.006
Neg logit contributions
 brace-0.005
 runs-0.005
yt-0.005
 swe-0.005
 nineteen-0.004
Token Help
Token ID10478
Feature activation-1.49e-01

Pos logit contributions
taker+0.089
 rubble+0.089
he+0.085
 doorway+0.083
 aback+0.083
Neg logit contributions
 runs-0.067
 lets-0.060
rick-0.058
 goes-0.056
 takes-0.056
Token!"
Token ID2474
Feature activation+3.29e-01

Pos logit contributions
 runs+0.002
 swe+0.002
yt+0.002
 brace+0.002
cast+0.002
Neg logit contributions
 devastated-0.005
taker-0.004
har-0.004
 Nobody-0.004
wards-0.004
Token he
Token ID339
Feature activation+3.49e-01

Pos logit contributions
ase+0.010
ions+0.009
har+0.009
idd+0.009
bird+0.009
Neg logit contributions
 runs-0.005
 brace-0.005
 says-0.005
 hats-0.005
ints-0.004
Token cried
Token ID16896
Feature activation+1.96e+00

Pos logit contributions
taker+0.008
ase+0.008
 devastated+0.008
he+0.008
chair+0.007
Neg logit contributions
 brace-0.007
 runs-0.007
yt-0.007
 swe-0.007
cast-0.006
Token.
Token ID13
Feature activation+1.46e+00

Pos logit contributions
ase+0.056
 devastated+0.054
taker+0.054
ions+0.054
idd+0.053
Neg logit contributions
 brace-0.029
 swe-0.028
yt-0.027
 runs-0.026
cast-0.024
Token paper
Token ID3348
Feature activation+2.03e-01

Pos logit contributions
 brace+0.014
 swe+0.013
 runs+0.012
 hats+0.012
yt+0.012
Neg logit contributions
taker-0.019
 devastated-0.019
ase-0.018
 aback-0.018
he-0.018
Token flew
Token ID13112
Feature activation+6.47e-01

Pos logit contributions
taker+0.005
 devastated+0.005
har+0.004
he+0.004
 Shards+0.004
Neg logit contributions
 brace-0.004
yt-0.004
 runs-0.004
 swe-0.004
cast-0.004
Token away
Token ID1497
Feature activation+1.09e+00

Pos logit contributions
taker+0.014
ase+0.014
 devastated+0.014
chair+0.014
har+0.013
Neg logit contributions
 brace-0.014
 swe-0.014
yt-0.014
 runs-0.014
cast-0.013
Token.
Token ID13
Feature activation-3.66e-01

Pos logit contributions
taker+0.026
ase+0.026
 devastated+0.025
chair+0.025
 Shards+0.025
Neg logit contributions
yt-0.022
 brace-0.021
 swe-0.020
 runs-0.020
cast-0.020
Token Pat
Token ID3208
Feature activation+9.04e-01

Pos logit contributions
 runs+0.008
 brace+0.007
yt+0.007
ints+0.007
rick+0.007
Neg logit contributions
taker-0.008
 devastated-0.008
 rubble-0.008
he-0.008
ase-0.008
Tokenches
Token ID2052
Feature activation+3.78e+00

Pos logit contributions
taker+0.026
 devastated+0.025
 rubble+0.024
 jail+0.024
 Shards+0.024
Neg logit contributions
 brace-0.015
yt-0.014
 runs-0.014
 swe-0.013
cast-0.012
Token was
Token ID373
Feature activation+5.55e-01

Pos logit contributions
 Shards+0.103
ions+0.101
har+0.101
taker+0.101
uster+0.101
Neg logit contributions
 runs-0.063
 brace-0.054
yt-0.047
 goes-0.047
cast-0.046
Token disappointed
Token ID11679
Feature activation+4.50e-01

Pos logit contributions
ase+0.017
ions+0.016
rolley+0.016
loo+0.015
mith+0.015
Neg logit contributions
 runs-0.009
 brace-0.009
 swe-0.009
 long-0.008
 points-0.007
Token -
Token ID532
Feature activation+8.53e-01

Pos logit contributions
taker+0.009
 devastated+0.009
he+0.009
ase+0.009
chair+0.009
Neg logit contributions
 brace-0.010
yt-0.010
 swe-0.010
 runs-0.010
 narrative-0.010
Token she
Token ID673
Feature activation+5.79e-01

Pos logit contributions
ase+0.024
taker+0.024
he+0.023
 devastated+0.023
har+0.023
Neg logit contributions
 brace-0.014
 runs-0.013
 swe-0.013
yt-0.012
 narrative-0.011
Token wanted
Token ID2227
Feature activation+1.33e-01

Pos logit contributions
taker+0.015
ase+0.015
ions+0.015
he+0.014
 Shards+0.014
Neg logit contributions
 runs-0.011
 brace-0.011
 swe-0.010
yt-0.010
 narrative-0.009
Token
Token ID198
Feature activation-1.23e-01

Pos logit contributions
One+0.003
 runs+0.003
Nik+0.003
 brace+0.003
yt+0.003
Neg logit contributions
taker-0.004
 devastated-0.004
 rubble-0.004
har-0.004
ase-0.004
TokenHer
Token ID9360
Feature activation+2.35e-01

Pos logit contributions
 brace+0.002
Say+0.002
Nik+0.002
One+0.002
 runs+0.002
Neg logit contributions
taker-0.003
 devastated-0.003
 rubble-0.003
 aback-0.003
chair-0.003
Token mom
Token ID1995
Feature activation+4.48e-01

Pos logit contributions
taker+0.007
 devastated+0.006
ase+0.006
he+0.006
 Shards+0.006
Neg logit contributions
 brace-0.004
 swe-0.004
yt-0.004
 runs-0.004
 hats-0.003
Token saw
Token ID2497
Feature activation-1.62e+00

Pos logit contributions
taker+0.012
 devastated+0.011
 rubble+0.011
 jail+0.011
har+0.011
Neg logit contributions
 brace-0.008
 runs-0.008
yt-0.008
 swe-0.008
cast-0.007
Token the
Token ID262
Feature activation+2.26e-01

Pos logit contributions
 brace+0.030
 swe+0.028
 runs+0.027
yt+0.026
 hats+0.025
Neg logit contributions
har-0.044
taker-0.043
 devastated-0.043
ase-0.041
chair-0.041
Token Hedge
Token ID41078
Feature activation+3.77e+00

Pos logit contributions
taker+0.005
ase+0.005
 devastated+0.005
he+0.005
chair+0.005
Neg logit contributions
 brace-0.005
 swe-0.005
yt-0.005
 runs-0.005
 narrative-0.005
Token,
Token ID11
Feature activation+1.26e+00

Pos logit contributions
 devastated+0.062
 Shards+0.060
har+0.058
taker+0.056
 rubble+0.055
Neg logit contributions
 brace-0.102
yt-0.101
 swe-0.100
 runs-0.097
cast-0.091
Token and
Token ID290
Feature activation+7.24e-01

Pos logit contributions
ase+0.034
taker+0.032
ore+0.032
he+0.032
idd+0.030
Neg logit contributions
 brace-0.024
 swe-0.024
 runs-0.023
 hats-0.022
yt-0.022
Token immediately
Token ID3393
Feature activation+7.51e-01

Pos logit contributions
ase+0.022
he+0.022
ore+0.021
taker+0.021
har+0.021
Neg logit contributions
 brace-0.011
 swe-0.010
 runs-0.010
yt-0.010
 hats-0.008
Token smiled
Token ID13541
Feature activation+2.98e-02

Pos logit contributions
taker+0.019
 devastated+0.018
ase+0.017
he+0.017
chair+0.017
Neg logit contributions
 brace-0.014
yt-0.014
 swe-0.014
 runs-0.013
 narrative-0.013
Token at
Token ID379
Feature activation-5.19e-01

Pos logit contributions
taker+0.001
 devastated+0.001
he+0.001
har+0.001
chair+0.001
Neg logit contributions
 brace-0.001
 narrative-0.001
 runs-0.001
yt-0.001
 vitamins-0.001
Token the
Token ID262
Feature activation+8.81e-01

Pos logit contributions
taker+0.041
har+0.040
 devastated+0.039
 rubble+0.039
ival+0.038
Neg logit contributions
Nik-0.027
 brace-0.026
 narrative-0.025
 runs-0.025
 swe-0.025
Token story
Token ID1621
Feature activation-5.90e-01

Pos logit contributions
taker+0.023
 devastated+0.023
chair+0.022
he+0.022
ase+0.021
Neg logit contributions
 brace-0.016
yt-0.015
 swe-0.015
cast-0.014
 runs-0.014
Token is
Token ID318
Feature activation+7.92e-01

Pos logit contributions
 runs+0.007
 brace+0.007
 narrative+0.007
yt+0.007
 swe+0.007
Neg logit contributions
taker-0.018
 devastated-0.018
har-0.017
 rubble-0.017
 Shards-0.017
Token:
Token ID25
Feature activation-2.53e-01

Pos logit contributions
taker+0.020
 devastated+0.020
he+0.020
chair+0.019
ase+0.019
Neg logit contributions
 brace-0.014
 runs-0.013
 narrative-0.013
 swe-0.013
yt-0.013
Token Mat
Token ID6550
Feature activation+7.06e-01

Pos logit contributions
 brace+0.006
 runs+0.005
 swe+0.005
 narrative+0.005
yt+0.005
Neg logit contributions
taker-0.006
he-0.005
 devastated-0.005
ase-0.005
 rubble-0.005
Tokenches
Token ID2052
Feature activation+3.76e+00

Pos logit contributions
taker+0.015
 devastated+0.015
ase+0.014
he+0.014
chair+0.014
Neg logit contributions
 brace-0.016
yt-0.015
 swe-0.015
 runs-0.014
 narrative-0.014
Token are
Token ID389
Feature activation-4.01e-01

Pos logit contributions
har+0.110
 Shards+0.106
idd+0.106
 devastated+0.105
taker+0.105
Neg logit contributions
 runs-0.054
 narrative-0.050
 brace-0.050
 swe-0.045
rick-0.044
Token not
Token ID407
Feature activation-1.11e+00

Pos logit contributions
 brace+0.007
 swe+0.007
 runs+0.007
 narrative+0.007
 vitamins+0.007
Neg logit contributions
he-0.010
taker-0.010
ase-0.010
chair-0.010
 devastated-0.010
Token toys
Token ID14958
Feature activation+2.17e+00

Pos logit contributions
 brace+0.024
 swe+0.023
yt+0.023
 runs+0.022
 narrative+0.021
Neg logit contributions
taker-0.025
 devastated-0.025
ase-0.024
he-0.024
chair-0.024
Token and
Token ID290
Feature activation+2.22e+00

Pos logit contributions
taker+0.056
 devastated+0.056
 Shards+0.055
har+0.055
ase+0.053
Neg logit contributions
 brace-0.040
 runs-0.037
 swe-0.037
yt-0.037
 narrative-0.034
Token can
Token ID460
Feature activation+1.17e-01

Pos logit contributions
taker+0.052
 devastated+0.050
ase+0.048
he+0.048
chair+0.048
Neg logit contributions
 brace-0.045
yt-0.044
 swe-0.043
 runs-0.042
 narrative-0.042
TokenWhat
Token ID2061
Feature activation-5.23e-01

Pos logit contributions
taker+0.015
 devastated+0.014
chair+0.014
ase+0.014
he+0.014
Neg logit contributions
 brace-0.011
 swe-0.011
Say-0.010
 runs-0.010
yt-0.010
Token happened
Token ID3022
Feature activation+1.24e-01

Pos logit contributions
 runs+0.008
 brace+0.008
 narrative+0.007
 swe+0.007
yt+0.007
Neg logit contributions
har-0.015
taker-0.015
 devastated-0.015
ase-0.014
 rubble-0.014
Token,
Token ID11
Feature activation+1.17e+00

Pos logit contributions
taker+0.003
 devastated+0.003
ase+0.003
 rubble+0.003
har+0.003
Neg logit contributions
 runs-0.002
 brace-0.002
yt-0.002
 swe-0.002
cast-0.002
Token Tim
Token ID5045
Feature activation-5.25e-01

Pos logit contributions
ase+0.023
 devastated+0.022
har+0.021
chair+0.021
ore+0.021
Neg logit contributions
 swe-0.030
 brace-0.029
 runs-0.029
yt-0.029
cast-0.027
Tokenmy
Token ID1820
Feature activation-4.50e-01

Pos logit contributions
 brace+0.013
yt+0.012
 swe+0.012
 runs+0.012
 narrative+0.012
Neg logit contributions
taker-0.010
ase-0.010
 devastated-0.010
he-0.010
chair-0.009
Token?
Token ID30
Feature activation+3.75e+00

Pos logit contributions
 brace+0.011
 narrative+0.010
 swe+0.010
yt+0.010
 runs+0.010
Neg logit contributions
taker-0.009
 devastated-0.008
ase-0.008
he-0.008
chair-0.008
Token Why
Token ID4162
Feature activation+2.08e-01

Pos logit contributions
 reunited+0.098
 rubble+0.097
he+0.096
 doorway+0.094
har+0.091
Neg logit contributions
 runs-0.058
 has-0.055
 wants-0.055
 asks-0.053
 lets-0.052
Token are
Token ID389
Feature activation+9.48e-01

Pos logit contributions
har+0.007
 devastated+0.007
 doorway+0.006
wards+0.006
ase+0.006
Neg logit contributions
 runs-0.003
 brace-0.002
 narrative-0.002
cast-0.002
 goes-0.002
Token you
Token ID345
Feature activation+2.14e+00

Pos logit contributions
taker+0.024
ase+0.023
 devastated+0.023
he+0.023
har+0.022
Neg logit contributions
 brace-0.018
 swe-0.017
 runs-0.017
yt-0.016
 narrative-0.016
Token crying
Token ID13774
Feature activation+1.18e+00

Pos logit contributions
chair+0.060
ase+0.060
ival+0.060
ions+0.056
rolley+0.056
Neg logit contributions
yt-0.033
 brace-0.032
 runs-0.031
 swe-0.030
 mean-0.029
Token?"
Token ID1701
Feature activation+7.56e-03

Pos logit contributions
 devastated+0.032
har+0.031
taker+0.031
 Shards+0.031
ase+0.031
Neg logit contributions
 swe-0.018
cast-0.018
yt-0.017
 runs-0.017
 brace-0.017
Token\n
Token ID198
Feature activation+7.85e-01

Pos logit contributions
 rubble+0.019
taker+0.019
idd+0.019
 devastated+0.018
ions+0.018
Neg logit contributions
 brace-0.013
Nik-0.013
Say-0.013
 runs-0.013
 swe-0.012
TokenBut
Token ID1537
Feature activation+1.18e+00

Pos logit contributions
taker+0.019
 rubble+0.018
idd+0.018
 Shards+0.018
 aback+0.018
Neg logit contributions
 brace-0.014
Say-0.014
Nik-0.014
 swe-0.014
 narrative-0.013
Token when
Token ID618
Feature activation-7.51e-02

Pos logit contributions
ase+0.030
taker+0.030
idd+0.030
 Shards+0.029
 rubble+0.029
Neg logit contributions
 swe-0.021
 brace-0.021
 runs-0.019
yt-0.019
 narrative-0.019
Token he
Token ID339
Feature activation+4.33e-01

Pos logit contributions
 runs+0.001
 swe+0.001
 brace+0.001
 narrative+0.001
yt+0.001
Neg logit contributions
ase-0.002
taker-0.002
idd-0.002
har-0.002
ions-0.002
Token arrived
Token ID5284
Feature activation+4.17e-01

Pos logit contributions
taker+0.011
ions+0.010
ase+0.010
 rubble+0.010
 devastated+0.009
Neg logit contributions
yt-0.009
 runs-0.008
 brace-0.008
 swe-0.008
Say-0.008
Token home
Token ID1363
Feature activation+3.74e+00

Pos logit contributions
taker+0.010
 devastated+0.010
ase+0.010
 Shards+0.009
har+0.009
Neg logit contributions
 brace-0.008
yt-0.008
 swe-0.008
 runs-0.008
 narrative-0.007
Token,
Token ID11
Feature activation+6.99e-01

Pos logit contributions
ase+0.102
 Shards+0.100
taker+0.099
 devastated+0.097
ions+0.097
Neg logit contributions
yt-0.060
 swe-0.057
 runs-0.057
 brace-0.057
cast-0.053
Token he
Token ID339
Feature activation-1.38e-01

Pos logit contributions
ase+0.023
idd+0.022
ore+0.021
har+0.021
aun+0.021
Neg logit contributions
 swe-0.010
 brace-0.009
 runs-0.008
 hats-0.008
 long-0.008
Token found
Token ID1043
Feature activation+8.42e-01

Pos logit contributions
 brace+0.003
 runs+0.003
yt+0.002
 swe+0.002
cast+0.002
Neg logit contributions
taker-0.004
ase-0.003
ions-0.003
he-0.003
 rubble-0.003
Token out
Token ID503
Feature activation+2.55e+00

Pos logit contributions
har+0.023
days+0.022
idd+0.021
ase+0.021
 devastated+0.021
Neg logit contributions
 brace-0.015
 swe-0.014
 runs-0.012
 hats-0.012
 vitamins-0.011
Token that
Token ID326
Feature activation+1.15e-01

Pos logit contributions
ase+0.057
taker+0.055
idd+0.054
 Shards+0.054
har+0.052
Neg logit contributions
 runs-0.052
 brace-0.050
 swe-0.049
yt-0.047
Say-0.046
Token.
Token ID13
Feature activation-4.32e-01

Pos logit contributions
taker+0.011
 devastated+0.011
har+0.011
ase+0.011
 Shards+0.011
Neg logit contributions
 brace-0.006
yt-0.005
 runs-0.005
 swe-0.005
cast-0.005
Token\n
Token ID198
Feature activation+6.38e-02

Pos logit contributions
 runs+0.008
 brace+0.007
 swe+0.007
ints+0.007
 hats+0.007
Neg logit contributions
taker-0.011
har-0.011
 devastated-0.011
chair-0.010
 rubble-0.010
Token\n
Token ID198
Feature activation+1.88e-01

Pos logit contributions
taker+0.002
 devastated+0.001
ase+0.001
chair+0.001
har+0.001
Neg logit contributions
 runs-0.001
 brace-0.001
 swe-0.001
yt-0.001
Say-0.001
Token"
Token ID1
Feature activation+2.36e-02

Pos logit contributions
taker+0.006
 devastated+0.005
chair+0.005
ase+0.005
 rubble+0.005
Neg logit contributions
 brace-0.003
 runs-0.002
yt-0.002
 swe-0.002
 narrative-0.002
TokenUh
Token ID34653
Feature activation-3.28e-01

Pos logit contributions
taker+0.001
 devastated+0.001
chair+0.001
ase+0.001
har+0.000
Neg logit contributions
 brace-0.001
 runs-0.000
 swe-0.000
 narrative-0.000
yt-0.000
Token-
Token ID12
Feature activation+3.73e+00

Pos logit contributions
 brace+0.008
yt+0.008
 swe+0.008
 runs+0.008
 narrative+0.008
Neg logit contributions
taker-0.006
 devastated-0.006
ase-0.005
he-0.005
chair-0.005
Tokenoh
Token ID1219
Feature activation-1.92e+00

Pos logit contributions
taker+0.076
 devastated+0.073
 rubble+0.072
chair+0.069
 jail+0.067
Neg logit contributions
 brace-0.083
 runs-0.082
rick-0.079
Say-0.078
 swe-0.078
Token,
Token ID11
Feature activation+1.06e+00

Pos logit contributions
 brace+0.049
 narrative+0.048
 swe+0.047
 vitamins+0.046
Say+0.045
Neg logit contributions
taker-0.035
 devastated-0.032
har-0.032
he-0.032
 jail-0.032
Token where
Token ID810
Feature activation-1.91e-01

Pos logit contributions
ase+0.026
har+0.025
 devastated+0.025
he+0.025
taker+0.025
Neg logit contributions
 runs-0.022
 brace-0.021
 swe-0.020
 hats-0.020
yt-0.020
Token did
Token ID750
Feature activation-3.76e-01

Pos logit contributions
 runs+0.004
 brace+0.003
 narrative+0.003
 swe+0.003
yt+0.003
Neg logit contributions
taker-0.005
har-0.005
 devastated-0.005
ase-0.005
he-0.005
Token it
Token ID340
Feature activation+5.64e-01

Pos logit contributions
 brace+0.007
 runs+0.007
 swe+0.007
yt+0.006
 hats+0.006
Neg logit contributions
taker-0.010
 devastated-0.009
ase-0.009
he-0.009
har-0.009
Token.
Token ID13
Feature activation+7.95e-02

Pos logit contributions
 devastated+0.020
taker+0.020
har+0.019
 Shards+0.019
ase+0.019
Neg logit contributions
 brace-0.010
yt-0.009
 runs-0.009
 swe-0.009
cast-0.008
Token\n
Token ID198
Feature activation+2.40e-01

Pos logit contributions
har+0.002
 devastated+0.002
taker+0.002
ase+0.002
aroo+0.002
Neg logit contributions
 runs-0.001
Say-0.001
yt-0.001
 swe-0.001
 vitamins-0.001
Token\n
Token ID198
Feature activation+2.61e-01

Pos logit contributions
taker+0.005
 devastated+0.005
ase+0.005
har+0.005
chair+0.004
Neg logit contributions
 brace-0.005
 runs-0.005
yt-0.005
 swe-0.005
cast-0.005
Token"
Token ID1
Feature activation+8.95e-02

Pos logit contributions
taker+0.006
 devastated+0.006
he+0.006
chair+0.006
ase+0.006
Neg logit contributions
 brace-0.005
 swe-0.005
yt-0.005
 narrative-0.005
cast-0.005
TokenUh
Token ID34653
Feature activation-3.51e-01

Pos logit contributions
taker+0.002
 devastated+0.002
chair+0.002
ore+0.002
ase+0.002
Neg logit contributions
 narrative-0.002
Say-0.002
 brace-0.002
 swe-0.002
yt-0.002
Token-
Token ID12
Feature activation+3.70e+00

Pos logit contributions
 brace+0.009
yt+0.009
 swe+0.008
 runs+0.008
 narrative+0.008
Neg logit contributions
taker-0.007
 devastated-0.006
he-0.006
ase-0.006
chair-0.006
Tokenoh
Token ID1219
Feature activation-2.00e+00

Pos logit contributions
taker+0.080
 rubble+0.077
 devastated+0.077
 jail+0.073
chair+0.072
Neg logit contributions
 runs-0.073
rick-0.073
Say-0.072
Nik-0.072
 brace-0.071
Token,"
Token ID553
Feature activation-1.07e+00

Pos logit contributions
 brace+0.050
 narrative+0.048
 swe+0.048
 vitamins+0.046
yt+0.045
Neg logit contributions
taker-0.039
 devastated-0.036
he-0.035
chair-0.035
har-0.035
Token Jake
Token ID14757
Feature activation+6.85e-01

Pos logit contributions
 brace+0.019
 runs+0.019
 swe+0.019
yt+0.018
 hats+0.017
Neg logit contributions
ase-0.028
 devastated-0.028
har-0.027
taker-0.027
he-0.026
Token said
Token ID531
Feature activation+1.97e-01

Pos logit contributions
taker+0.021
ase+0.021
 devastated+0.021
 rubble+0.021
ions+0.021
Neg logit contributions
 brace-0.009
 runs-0.008
yt-0.008
 swe-0.008
 hats-0.007
Token.
Token ID13
Feature activation+6.23e-01

Pos logit contributions
taker+0.005
 devastated+0.005
ase+0.005
he+0.005
chair+0.005
Neg logit contributions
 brace-0.003
yt-0.003
 swe-0.003
 runs-0.003
 narrative-0.003
TokenBilly
Token ID49640
Feature activation+9.21e-01

Pos logit contributions
taker+0.008
 devastated+0.008
ase+0.008
he+0.008
ions+0.008
Neg logit contributions
 brace-0.008
yt-0.008
 swe-0.008
 runs-0.008
 narrative-0.007
Token cried
Token ID16896
Feature activation+1.91e+00

Pos logit contributions
taker+0.024
ions+0.023
ase+0.023
he+0.022
har+0.022
Neg logit contributions
 runs-0.017
 brace-0.017
 swe-0.015
yt-0.015
cast-0.014
Token,
Token ID11
Feature activation+4.82e-01

Pos logit contributions
har+0.047
ase+0.047
idd+0.046
 devastated+0.045
ions+0.045
Neg logit contributions
 runs-0.036
 swe-0.036
 brace-0.035
 hats-0.033
yt-0.032
Token "
Token ID366
Feature activation+2.74e-01

Pos logit contributions
taker+0.011
 devastated+0.010
ase+0.010
he+0.010
har+0.010
Neg logit contributions
 brace-0.010
yt-0.010
 swe-0.010
 runs-0.010
 narrative-0.009
TokenHelp
Token ID22087
Feature activation+2.40e-01

Pos logit contributions
taker+0.007
ions+0.007
ase+0.006
 devastated+0.006
 rubble+0.006
Neg logit contributions
 brace-0.005
 runs-0.005
Say-0.005
 swe-0.005
 narrative-0.005
Token!
Token ID0
Feature activation+3.70e+00

Pos logit contributions
taker+0.006
 devastated+0.006
ase+0.006
he+0.006
chair+0.006
Neg logit contributions
 brace-0.004
 runs-0.004
yt-0.004
 swe-0.004
 narrative-0.004
Token Help
Token ID10478
Feature activation+2.92e-03

Pos logit contributions
taker+0.087
he+0.086
 rubble+0.080
ions+0.080
 aback+0.079
Neg logit contributions
 runs-0.076
 goes-0.062
rick-0.060
 lets-0.059
 swe-0.058
Token!
Token ID0
Feature activation+3.09e+00

Pos logit contributions
 devastated+0.000
taker+0.000
har+0.000
ase+0.000
 arrested+0.000
Neg logit contributions
 runs-0.000
 swe-0.000
yt-0.000
 brace-0.000
 goes-0.000
Token I
Token ID314
Feature activation+1.93e+00

Pos logit contributions
he+0.085
taker+0.081
 rubble+0.077
 reunited+0.076
 devastated+0.076
Neg logit contributions
 runs-0.053
 goes-0.042
 lets-0.042
rick-0.041
 takes-0.040
Token am
Token ID716
Feature activation+5.64e-01

Pos logit contributions
wards+0.054
taker+0.053
ival+0.051
 devastated+0.051
ase+0.051
Neg logit contributions
 runs-0.037
 goes-0.032
 wants-0.029
yt-0.028
 takes-0.028
Token stuck
Token ID7819
Feature activation+1.78e-01

Pos logit contributions
ase+0.018
rolley+0.017
ival+0.017
he+0.017
chair+0.016
Neg logit contributions
 long-0.008
 runs-0.008
 going-0.008
 swe-0.008
 brace-0.008
Token while
Token ID981
Feature activation-1.14e+00

Pos logit contributions
 brace+0.004
yt+0.004
 runs+0.004
 swe+0.004
cast+0.003
Neg logit contributions
ase-0.004
taker-0.004
 devastated-0.004
har-0.004
 rubble-0.004
Token walking
Token ID6155
Feature activation+1.53e+00

Pos logit contributions
 brace+0.027
yt+0.026
 swe+0.026
 runs+0.025
 narrative+0.025
Neg logit contributions
taker-0.023
 devastated-0.022
ase-0.021
he-0.021
chair-0.021
Token in
Token ID287
Feature activation+3.57e-01

Pos logit contributions
 devastated+0.039
har+0.037
taker+0.036
 Shards+0.036
idd+0.035
Neg logit contributions
 runs-0.029
 brace-0.028
 swe-0.028
yt-0.027
cast-0.026
Token the
Token ID262
Feature activation-8.52e-01

Pos logit contributions
taker+0.009
har+0.009
 devastated+0.009
ase+0.009
idd+0.008
Neg logit contributions
 runs-0.007
 brace-0.007
 swe-0.006
yt-0.006
 hats-0.006
Token me
Token ID502
Feature activation+2.85e-01

Pos logit contributions
 narrative+0.023
 brace+0.023
yt+0.022
cast+0.022
 runs+0.021
Neg logit contributions
ase-0.014
he-0.014
 Shards-0.013
chair-0.013
 doorway-0.013
Tokenadow
Token ID4584
Feature activation+3.69e+00

Pos logit contributions
taker+0.005
 devastated+0.005
har+0.005
 rubble+0.005
ner+0.005
Neg logit contributions
yt-0.007
 runs-0.007
 brace-0.007
cast-0.007
ints-0.007
Token,
Token ID11
Feature activation+5.57e-01

Pos logit contributions
har+0.128
uster+0.127
taker+0.122
plug+0.121
ions+0.120
Neg logit contributions
 runs-0.037
 goes-0.030
cast-0.029
ia-0.028
yt-0.027
Token he
Token ID339
Feature activation-1.06e+00

Pos logit contributions
ase+0.015
har+0.015
 devastated+0.014
taker+0.014
he+0.014
Neg logit contributions
 brace-0.009
 runs-0.009
yt-0.009
cast-0.009
 swe-0.009
Token saw
Token ID2497
Feature activation-2.92e-01

Pos logit contributions
 brace+0.022
yt+0.021
 runs+0.021
 swe+0.021
cast+0.020
Neg logit contributions
taker-0.025
 devastated-0.023
ase-0.023
har-0.023
he-0.023
Token an
Token ID281
Feature activation+9.85e-01

Pos logit contributions
 brace+0.005
 hats+0.005
 swe+0.005
 runs+0.005
Say+0.005
Neg logit contributions
har-0.008
he-0.007
ase-0.007
days-0.007
taker-0.007
Token available
Token ID1695
Feature activation+2.26e-01

Pos logit contributions
taker+0.024
 devastated+0.023
ase+0.022
he+0.022
har+0.022
Neg logit contributions
 brace-0.019
yt-0.018
 swe-0.018
 runs-0.018
 narrative-0.018
Token came
Token ID1625
Feature activation+1.11e+00

Pos logit contributions
taker+0.026
 rubble+0.024
 devastated+0.024
har+0.024
uster+0.024
Neg logit contributions
 runs-0.020
 swe-0.020
yt-0.020
 brace-0.019
cast-0.018
Token time
Token ID640
Feature activation-3.70e-01

Pos logit contributions
taker+0.023
 devastated+0.022
ase+0.022
he+0.021
chair+0.021
Neg logit contributions
 brace-0.026
yt-0.025
 swe-0.025
 runs-0.024
cast-0.023
Token to
Token ID284
Feature activation+8.92e-01

Pos logit contributions
 brace+0.008
yt+0.008
 swe+0.008
 narrative+0.008
 runs+0.008
Neg logit contributions
taker-0.008
 devastated-0.008
he-0.007
ase-0.007
chair-0.007
Token say
Token ID910
Feature activation+1.98e+00

Pos logit contributions
taker+0.024
 devastated+0.023
he+0.023
har+0.022
ase+0.022
Neg logit contributions
 brace-0.015
yt-0.014
 swe-0.014
 runs-0.014
 narrative-0.014
Token good
Token ID922
Feature activation-4.44e-01

Pos logit contributions
idd+0.052
har+0.052
ase+0.051
itty+0.049
 rubble+0.049
Neg logit contributions
 brace-0.036
 swe-0.036
 runs-0.033
 hats-0.031
Nik-0.030
Token-
Token ID12
Feature activation+3.69e+00

Pos logit contributions
 brace+0.009
yt+0.009
 swe+0.009
 runs+0.009
 narrative+0.009
Neg logit contributions
taker-0.010
 devastated-0.010
he-0.009
ase-0.009
har-0.009
Tokenbye
Token ID16390
Feature activation-6.51e-01

Pos logit contributions
 devastated+0.070
 rubble+0.067
 Shards+0.065
taker+0.065
har+0.063
Neg logit contributions
 brace-0.089
 swe-0.084
 runs-0.083
yt-0.082
Nik-0.080
Token,
Token ID11
Feature activation+5.60e-01

Pos logit contributions
 brace+0.014
 swe+0.014
yt+0.014
 runs+0.013
cast+0.012
Neg logit contributions
taker-0.015
 devastated-0.014
he-0.014
ase-0.014
har-0.013
Token Lucy
Token ID22162
Feature activation+5.69e-01

Pos logit contributions
ase+0.012
idd+0.011
ore+0.011
aun+0.010
taker+0.010
Neg logit contributions
 swe-0.015
 brace-0.014
yt-0.013
 runs-0.013
 narrative-0.013
Token gave
Token ID2921
Feature activation-7.28e-01

Pos logit contributions
taker+0.015
ions+0.014
 rubble+0.014
 Shards+0.014
ase+0.014
Neg logit contributions
 runs-0.010
 brace-0.010
yt-0.010
 swe-0.010
Nik-0.009
Token her
Token ID607
Feature activation+1.21e+00

Pos logit contributions
 brace+0.010
 swe+0.010
 runs+0.008
 hats+0.008
 names+0.007
Neg logit contributions
taker-0.023
ase-0.022
 devastated-0.022
har-0.022
 rubble-0.022
Token could
Token ID714
Feature activation-1.30e+00

Pos logit contributions
 runs+0.022
 brace+0.021
 vitamins+0.019
 wants+0.018
 goes+0.018
Neg logit contributions
 devastated-0.039
 rubble-0.038
har-0.038
taker-0.038
he-0.038
Token soar
Token ID48701
Feature activation-2.48e+00

Pos logit contributions
 brace+0.023
 runs+0.021
 swe+0.020
Say+0.020
 hats+0.019
Neg logit contributions
taker-0.035
 devastated-0.034
ase-0.034
har-0.033
 aback-0.032
Token in
Token ID287
Feature activation+1.45e-01

Pos logit contributions
 brace+0.053
yt+0.051
 swe+0.050
 runs+0.050
 narrative+0.050
Neg logit contributions
taker-0.055
 devastated-0.053
he-0.052
ase-0.051
chair-0.051
Token the
Token ID262
Feature activation+1.89e-01

Pos logit contributions
taker+0.004
 devastated+0.004
ase+0.004
har+0.004
aun+0.003
Neg logit contributions
 brace-0.003
 swe-0.003
 runs-0.003
yt-0.003
 hats-0.002
Token sky
Token ID6766
Feature activation-1.82e+00

Pos logit contributions
taker+0.004
 devastated+0.004
har+0.004
ions+0.004
aun+0.004
Neg logit contributions
 brace-0.004
 swe-0.004
yt-0.004
 narrative-0.004
 runs-0.004
Token higher
Token ID2440
Feature activation-4.56e+00

Pos logit contributions
 brace+0.039
yt+0.038
 swe+0.037
 runs+0.037
 narrative+0.036
Neg logit contributions
taker-0.040
 devastated-0.040
ase-0.038
har-0.037
he-0.037
Token than
Token ID621
Feature activation+7.41e-01

Pos logit contributions
 brace+0.120
Say+0.118
 narrative+0.118
 swe+0.117
 vitamins+0.113
Neg logit contributions
he-0.073
taker-0.067
ions-0.064
ore-0.064
chair-0.063
Token any
Token ID597
Feature activation-1.10e+00

Pos logit contributions
 devastated+0.020
taker+0.019
ase+0.019
har+0.019
 Shards+0.018
Neg logit contributions
 brace-0.013
cast-0.012
 swe-0.012
 runs-0.011
 vitamins-0.011
Token other
Token ID584
Feature activation-6.81e-01

Pos logit contributions
 brace+0.023
 swe+0.022
yt+0.021
 runs+0.021
 narrative+0.020
Neg logit contributions
taker-0.026
 devastated-0.025
ase-0.024
he-0.024
har-0.024
Token bird
Token ID6512
Feature activation-2.25e+00

Pos logit contributions
 brace+0.014
yt+0.013
 swe+0.013
 runs+0.013
 narrative+0.012
Neg logit contributions
taker-0.016
 devastated-0.016
ase-0.015
he-0.015
har-0.015
Token.
Token ID13
Feature activation-5.51e-01

Pos logit contributions
 brace+0.048
yt+0.046
 swe+0.045
 runs+0.045
 narrative+0.044
Neg logit contributions
taker-0.051
 devastated-0.049
ase-0.047
he-0.047
har-0.047
Token.
Token ID13
Feature activation+1.92e-01

Pos logit contributions
yt+0.014
 brace+0.013
 runs+0.012
 swe+0.012
 hats+0.012
Neg logit contributions
 devastated-0.033
 Shards-0.031
taker-0.031
har-0.031
 rubble-0.030
Token It
Token ID632
Feature activation-1.11e+00

Pos logit contributions
 devastated+0.004
taker+0.004
he+0.004
 rubble+0.004
ase+0.004
Neg logit contributions
 brace-0.004
 swe-0.004
 runs-0.004
 hats-0.004
 nineteen-0.004
Token will
Token ID481
Feature activation-9.16e-01

Pos logit contributions
 brace+0.016
 runs+0.015
 swe+0.015
yt+0.014
 narrative+0.013
Neg logit contributions
taker-0.033
 devastated-0.033
he-0.032
ase-0.032
har-0.031
Token make
Token ID787
Feature activation-8.88e-01

Pos logit contributions
 brace+0.016
 swe+0.015
yt+0.014
 runs+0.014
 narrative+0.014
Neg logit contributions
taker-0.025
 devastated-0.024
ase-0.024
chair-0.024
he-0.023
Token them
Token ID606
Feature activation+3.26e-01

Pos logit contributions
 brace+0.018
 swe+0.017
 runs+0.016
 hats+0.015
yt+0.015
Neg logit contributions
ase-0.023
taker-0.022
 devastated-0.022
har-0.022
he-0.021
Token louder
Token ID27089
Feature activation-4.46e+00

Pos logit contributions
ase+0.009
taker+0.008
 devastated+0.008
 Shards+0.008
ions+0.008
Neg logit contributions
 brace-0.006
 swe-0.006
 runs-0.005
yt-0.005
cast-0.005
Token and
Token ID290
Feature activation-1.28e+00

Pos logit contributions
 brace+0.113
 narrative+0.111
 swe+0.107
Say+0.106
yt+0.105
Neg logit contributions
taker-0.078
chair-0.073
 jail-0.069
har-0.069
 devastated-0.069
Token faster
Token ID5443
Feature activation-4.33e+00

Pos logit contributions
yt+0.030
 narrative+0.029
 brace+0.028
asha+0.028
Nik+0.027
Neg logit contributions
taker-0.025
 devastated-0.025
har-0.024
 jail-0.023
chair-0.023
Token."
Token ID526
Feature activation-4.80e-01

Pos logit contributions
 brace+0.100
 narrative+0.097
yt+0.096
 swe+0.095
Say+0.094
Neg logit contributions
taker-0.087
chair-0.079
he-0.079
 jail-0.078
 devastated-0.078
Token Mia
Token ID32189
Feature activation-1.06e+00

Pos logit contributions
 runs+0.009
 brace+0.008
yt+0.008
 swe+0.008
cast+0.008
Neg logit contributions
 devastated-0.012
taker-0.012
 rubble-0.011
har-0.011
ase-0.011
Token says
Token ID1139
Feature activation+9.93e-03

Pos logit contributions
 brace+0.014
yt+0.013
 swe+0.012
 narrative+0.012
cast+0.012
Neg logit contributions
taker-0.032
 devastated-0.031
he-0.031
chair-0.031
ase-0.030
Token will
Token ID481
Feature activation-9.16e-01

Pos logit contributions
 brace+0.016
 runs+0.015
 swe+0.015
yt+0.014
 narrative+0.013
Neg logit contributions
taker-0.033
 devastated-0.033
he-0.032
ase-0.032
har-0.031
Token make
Token ID787
Feature activation-8.88e-01

Pos logit contributions
 brace+0.016
 swe+0.015
yt+0.014
 runs+0.014
 narrative+0.014
Neg logit contributions
taker-0.025
 devastated-0.024
ase-0.024
chair-0.024
he-0.023
Token them
Token ID606
Feature activation+3.26e-01

Pos logit contributions
 brace+0.018
 swe+0.017
 runs+0.016
 hats+0.015
yt+0.015
Neg logit contributions
ase-0.023
taker-0.022
 devastated-0.022
har-0.022
he-0.021
Token louder
Token ID27089
Feature activation-4.46e+00

Pos logit contributions
ase+0.009
taker+0.008
 devastated+0.008
 Shards+0.008
ions+0.008
Neg logit contributions
 brace-0.006
 swe-0.006
 runs-0.005
yt-0.005
cast-0.005
Token and
Token ID290
Feature activation-1.28e+00

Pos logit contributions
 brace+0.113
 narrative+0.111
 swe+0.107
Say+0.106
yt+0.105
Neg logit contributions
taker-0.078
chair-0.073
 jail-0.069
har-0.069
 devastated-0.069
Token faster
Token ID5443
Feature activation-4.33e+00

Pos logit contributions
yt+0.030
 narrative+0.029
 brace+0.028
asha+0.028
Nik+0.027
Neg logit contributions
taker-0.025
 devastated-0.025
har-0.024
 jail-0.023
chair-0.023
Token."
Token ID526
Feature activation-4.80e-01

Pos logit contributions
 brace+0.100
 narrative+0.097
yt+0.096
 swe+0.095
Say+0.094
Neg logit contributions
taker-0.087
chair-0.079
he-0.079
 jail-0.078
 devastated-0.078
Token Mia
Token ID32189
Feature activation-1.06e+00

Pos logit contributions
 runs+0.009
 brace+0.008
yt+0.008
 swe+0.008
cast+0.008
Neg logit contributions
 devastated-0.012
taker-0.012
 rubble-0.011
har-0.011
ase-0.011
Token says
Token ID1139
Feature activation+9.93e-03

Pos logit contributions
 brace+0.014
yt+0.013
 swe+0.012
 narrative+0.012
cast+0.012
Neg logit contributions
taker-0.032
 devastated-0.031
he-0.031
chair-0.031
ase-0.030
Token,
Token ID11
Feature activation+1.11e-01

Pos logit contributions
taker+0.000
 devastated+0.000
chair+0.000
he+0.000
ase+0.000
Neg logit contributions
 brace-0.000
yt-0.000
 narrative-0.000
cast-0.000
 swe-0.000
Token "
Token ID366
Feature activation-3.90e-01

Pos logit contributions
taker+0.003
 devastated+0.003
ase+0.002
he+0.002
har+0.002
Neg logit contributions
 brace-0.002
 swe-0.002
 runs-0.002
yt-0.002
 narrative-0.002
Token sun
Token ID4252
Feature activation-1.35e+00

Pos logit contributions
yt+0.010
 brace+0.010
 narrative+0.010
 runs+0.010
cast+0.010
Neg logit contributions
taker-0.007
bird-0.007
he-0.007
ase-0.007
 Shards-0.007
Token and
Token ID290
Feature activation-7.81e-01

Pos logit contributions
 brace+0.035
 narrative+0.035
yt+0.034
 vitamins+0.034
 runs+0.034
Neg logit contributions
ase-0.023
bird-0.022
taker-0.022
he-0.021
chair-0.021
Token stretch
Token ID7539
Feature activation-1.22e+00

Pos logit contributions
 brace+0.015
yt+0.015
 narrative+0.015
 swe+0.015
 runs+0.014
Neg logit contributions
taker-0.018
 devastated-0.018
chair-0.017
he-0.017
har-0.017
Token her
Token ID607
Feature activation-8.21e-01

Pos logit contributions
 brace+0.022
yt+0.021
 narrative+0.020
 swe+0.020
 runs+0.020
Neg logit contributions
taker-0.032
 devastated-0.031
he-0.030
ase-0.029
chair-0.029
Token stems
Token ID21552
Feature activation-8.12e-01

Pos logit contributions
 brace+0.016
yt+0.016
 swe+0.015
 runs+0.015
 narrative+0.015
Neg logit contributions
taker-0.020
 devastated-0.019
ase-0.018
he-0.018
chair-0.018
Token higher
Token ID2440
Feature activation-4.33e+00

Pos logit contributions
 brace+0.019
yt+0.018
 narrative+0.018
 runs+0.018
 swe+0.018
Neg logit contributions
taker-0.017
 devastated-0.016
he-0.016
ase-0.015
chair-0.015
Token and
Token ID290
Feature activation-1.03e+00

Pos logit contributions
 narrative+0.138
Say+0.137
 vitamins+0.135
 runs+0.131
Jes+0.125
Neg logit contributions
he-0.046
 of-0.043
 so-0.042
har-0.040
 as-0.040
Token higher
Token ID2440
Feature activation-3.97e+00

Pos logit contributions
 narrative+0.028
yt+0.027
 brace+0.027
 vitamins+0.026
rick+0.026
Neg logit contributions
taker-0.016
 devastated-0.016
har-0.016
he-0.015
 aback-0.015
Token.
Token ID13
Feature activation-1.20e-01

Pos logit contributions
 narrative+0.120
Say+0.116
 vitamins+0.113
rick+0.107
Nik+0.107
Neg logit contributions
 of-0.047
 as-0.047
he-0.046
 so-0.046
 such-0.044
Token She
Token ID1375
Feature activation-4.37e-01

Pos logit contributions
 brace+0.003
yt+0.003
 swe+0.003
 runs+0.003
 narrative+0.003
Neg logit contributions
taker-0.002
 devastated-0.002
har-0.002
he-0.002
ase-0.002
Token had
Token ID550
Feature activation-2.37e-01

Pos logit contributions
 brace+0.008
yt+0.007
 swe+0.007
 runs+0.007
 narrative+0.007
Neg logit contributions
taker-0.011
 devastated-0.011
ase-0.011
he-0.011
har-0.010
Token the
Token ID262
Feature activation+3.90e-01

Pos logit contributions
 brace+0.007
yt+0.007
 runs+0.007
 swe+0.007
cast+0.007
Neg logit contributions
 devastated-0.008
taker-0.008
ase-0.008
 Shards-0.008
he-0.008
Token elevator
Token ID20932
Feature activation-1.69e+00

Pos logit contributions
taker+0.009
 devastated+0.009
ase+0.009
 aback+0.009
ions+0.009
Neg logit contributions
 brace-0.008
yt-0.008
 runs-0.007
 swe-0.007
cast-0.007
Token as
Token ID355
Feature activation-9.76e-01

Pos logit contributions
 brace+0.034
yt+0.033
 runs+0.032
 swe+0.032
cast+0.031
Neg logit contributions
taker-0.040
 devastated-0.040
ase-0.038
 Shards-0.038
he-0.037
Token it
Token ID340
Feature activation-1.60e+00

Pos logit contributions
 brace+0.019
 runs+0.018
yt+0.018
 swe+0.018
cast+0.017
Neg logit contributions
 devastated-0.024
taker-0.024
ase-0.023
 aback-0.022
har-0.022
Token rose
Token ID8278
Feature activation-4.05e-01

Pos logit contributions
 runs+0.029
yt+0.028
 brace+0.028
cast+0.027
 swe+0.027
Neg logit contributions
taker-0.042
 devastated-0.042
har-0.039
 rubble-0.039
 Shards-0.038
Token higher
Token ID2440
Feature activation-4.31e+00

Pos logit contributions
 brace+0.009
 swe+0.008
yt+0.008
 runs+0.008
cast+0.008
Neg logit contributions
taker-0.009
 devastated-0.009
ase-0.009
 Shards-0.009
he-0.009
Token and
Token ID290
Feature activation-1.19e-01

Pos logit contributions
 narrative+0.111
 brace+0.110
Say+0.107
 vitamins+0.105
 swe+0.105
Neg logit contributions
he-0.073
taker-0.070
ions-0.067
har-0.066
ore-0.065
Token higher
Token ID2440
Feature activation-3.92e+00

Pos logit contributions
 brace+0.003
yt+0.003
 swe+0.002
 runs+0.002
 narrative+0.002
Neg logit contributions
taker-0.003
 devastated-0.003
he-0.002
chair-0.002
ase-0.002
Token.
Token ID13
Feature activation-9.15e-02

Pos logit contributions
 narrative+0.096
Say+0.092
 brace+0.091
 swe+0.088
 vitamins+0.087
Neg logit contributions
taker-0.070
he-0.068
ions-0.067
har-0.064
chair-0.064
Token She
Token ID1375
Feature activation-1.38e-01

Pos logit contributions
 runs+0.002
 swe+0.001
 brace+0.001
yt+0.001
Nik+0.001
Neg logit contributions
 devastated-0.003
 rubble-0.002
taker-0.002
har-0.002
ase-0.002
Token looked
Token ID3114
Feature activation-4.37e-01

Pos logit contributions
 runs+0.003
 brace+0.003
cast+0.002
yt+0.002
 swe+0.002
Neg logit contributions
taker-0.004
ase-0.003
 jail-0.003
 devastated-0.003
aun-0.003
Token.
Token ID13
Feature activation+1.51e-01

Pos logit contributions
 brace+0.027
yt+0.026
 swe+0.026
 narrative+0.025
 runs+0.025
Neg logit contributions
taker-0.037
 devastated-0.036
he-0.035
ase-0.035
chair-0.035
Token It
Token ID632
Feature activation-9.66e-01

Pos logit contributions
 devastated+0.003
taker+0.003
 rubble+0.003
he+0.003
har+0.003
Neg logit contributions
 brace-0.004
 runs-0.003
 swe-0.003
 narrative-0.003
Say-0.003
Token is
Token ID318
Feature activation-1.48e+00

Pos logit contributions
 brace+0.014
 swe+0.013
 runs+0.013
yt+0.013
 narrative+0.013
Neg logit contributions
taker-0.028
 devastated-0.027
ase-0.027
har-0.027
he-0.026
Token faster
Token ID5443
Feature activation-3.44e+00

Pos logit contributions
 brace+0.032
yt+0.032
 narrative+0.031
 runs+0.030
 swe+0.030
Neg logit contributions
taker-0.032
 devastated-0.031
chair-0.030
har-0.029
he-0.029
Token and
Token ID290
Feature activation-9.43e-01

Pos logit contributions
 brace+0.058
 swe+0.055
 narrative+0.054
yt+0.054
Say+0.052
Neg logit contributions
taker-0.091
chair-0.088
 devastated-0.086
ase-0.086
he-0.085
Token stronger
Token ID7387
Feature activation-4.28e+00

Pos logit contributions
yt+0.025
 narrative+0.024
asha+0.023
Nik+0.023
iri+0.023
Neg logit contributions
taker-0.016
chair-0.016
 devastated-0.015
har-0.015
 jail-0.015
Token.
Token ID13
Feature activation-6.76e-01

Pos logit contributions
 narrative+0.095
 brace+0.094
yt+0.092
 swe+0.092
Say+0.090
Neg logit contributions
taker-0.087
chair-0.084
 jail-0.083
he-0.081
ore-0.080
Token I
Token ID314
Feature activation+5.08e-02

Pos logit contributions
 brace+0.015
 runs+0.015
 swe+0.014
yt+0.014
 narrative+0.014
Neg logit contributions
taker-0.014
 devastated-0.014
he-0.014
 rubble-0.013
ase-0.013
Token have
Token ID423
Feature activation-4.61e-01

Pos logit contributions
taker+0.001
 devastated+0.001
 Shards+0.001
ase+0.001
 rubble+0.001
Neg logit contributions
 brace-0.001
 runs-0.001
 swe-0.001
 narrative-0.001
yt-0.001
Token a
Token ID257
Feature activation-9.57e-01

Pos logit contributions
 brace+0.010
 swe+0.010
 runs+0.009
 vitamins+0.009
 hats+0.009
Neg logit contributions
taker-0.011
he-0.011
ase-0.011
 devastated-0.011
har-0.010
Token tool
Token ID2891
Feature activation-4.29e-01

Pos logit contributions
 brace+0.019
yt+0.018
 swe+0.018
 runs+0.018
 narrative+0.017
Neg logit contributions
taker-0.023
 devastated-0.022
ase-0.022
he-0.021
chair-0.021
Token's
Token ID338
Feature activation-8.54e-01

Pos logit contributions
 devastated+0.003
taker+0.003
he+0.003
chair+0.003
 Shards+0.003
Neg logit contributions
yt-0.004
 brace-0.004
cast-0.004
 narrative-0.004
 vitamins-0.003
Token measure
Token ID3953
Feature activation-1.75e+00

Pos logit contributions
 brace+0.017
yt+0.016
 swe+0.016
 runs+0.016
 narrative+0.016
Neg logit contributions
taker-0.020
 devastated-0.020
ase-0.019
he-0.019
chair-0.019
Token how
Token ID703
Feature activation+4.65e-01

Pos logit contributions
 brace+0.042
 swe+0.040
yt+0.040
 runs+0.040
cast+0.038
Neg logit contributions
taker-0.035
 devastated-0.034
ase-0.032
he-0.032
chair-0.032
Token tall
Token ID7331
Feature activation-1.24e+00

Pos logit contributions
taker+0.012
 devastated+0.012
ase+0.012
he+0.012
har+0.012
Neg logit contributions
 brace-0.008
 swe-0.007
 runs-0.007
yt-0.007
cast-0.007
Token our
Token ID674
Feature activation-1.92e-01

Pos logit contributions
 brace+0.029
yt+0.028
 swe+0.027
 narrative+0.027
 runs+0.027
Neg logit contributions
taker-0.025
 devastated-0.024
chair-0.023
ase-0.023
he-0.023
Token tower
Token ID10580
Feature activation-4.22e+00

Pos logit contributions
 brace+0.004
 swe+0.004
yt+0.004
 runs+0.004
cast+0.004
Neg logit contributions
taker-0.004
 devastated-0.004
he-0.004
ase-0.004
har-0.004
Token is
Token ID318
Feature activation-7.19e-01

Pos logit contributions
 brace+0.068
yt+0.064
 swe+0.063
 runs+0.062
 narrative+0.061
Neg logit contributions
taker-0.116
 devastated-0.113
ase-0.110
he-0.110
chair-0.109
Token,"
Token ID553
Feature activation-1.84e+00

Pos logit contributions
 brace+0.015
 swe+0.014
yt+0.014
 runs+0.013
cast+0.013
Neg logit contributions
taker-0.017
 devastated-0.017
ase-0.016
he-0.016
chair-0.016
Token Lily
Token ID20037
Feature activation-1.62e+00

Pos logit contributions
cast+0.054
yt+0.054
 brace+0.053
 narrative+0.053
Nik+0.051
Neg logit contributions
taker-0.024
 Bobby-0.024
 Shards-0.022
 Beth-0.022
 Sus-0.021
Token says
Token ID1139
Feature activation-7.05e-01

Pos logit contributions
yt+0.032
iri+0.029
 narrative+0.028
cast+0.028
 brace+0.027
Neg logit contributions
 aback-0.041
taker-0.039
chair-0.038
he-0.037
bird-0.037
Token.
Token ID13
Feature activation-6.27e-01

Pos logit contributions
 narrative+0.017
yt+0.017
cast+0.016
 vitamins+0.016
 brace+0.016
Neg logit contributions
taker-0.015
on-0.013
chair-0.013
aun-0.013
 devastated-0.013
Token places
Token ID4113
Feature activation+1.80e-01

Pos logit contributions
 brace+0.017
 runs+0.016
 swe+0.016
yt+0.015
cast+0.015
Neg logit contributions
taker-0.021
 devastated-0.020
ase-0.020
har-0.020
 Shards-0.019
Token and
Token ID290
Feature activation-1.82e-01

Pos logit contributions
taker+0.005
 devastated+0.005
ase+0.005
 Shards+0.005
har+0.005
Neg logit contributions
 brace-0.003
 runs-0.003
 swe-0.003
 hats-0.003
yt-0.003
Token ride
Token ID6594
Feature activation-1.19e+00

Pos logit contributions
 brace+0.003
yt+0.003
 swe+0.003
 runs+0.003
 narrative+0.003
Neg logit contributions
taker-0.005
 devastated-0.004
he-0.004
chair-0.004
ase-0.004
Token the
Token ID262
Feature activation-7.21e-01

Pos logit contributions
 brace+0.024
yt+0.024
 swe+0.023
 narrative+0.023
 runs+0.023
Neg logit contributions
taker-0.027
 devastated-0.026
he-0.026
chair-0.026
ase-0.025
Token swings
Token ID26728
Feature activation-2.03e+00

Pos logit contributions
yt+0.014
 brace+0.013
ints+0.013
 narrative+0.013
 vitamins+0.013
Neg logit contributions
chair-0.017
taker-0.017
he-0.016
 devastated-0.016
 rubble-0.016
Token again
Token ID757
Feature activation-4.22e+00

Pos logit contributions
 brace+0.031
 runs+0.029
 swe+0.029
yt+0.029
 narrative+0.027
Neg logit contributions
taker-0.058
 devastated-0.057
har-0.055
 Shards-0.054
ase-0.054
Token.
Token ID13
Feature activation-1.33e+00

Pos logit contributions
 brace+0.090
 narrative+0.088
 vitamins+0.087
 swe+0.086
Say+0.086
Neg logit contributions
he-0.091
chair-0.088
taker-0.085
har-0.083
ase-0.083
Token
Token ID198
Feature activation-5.14e-01

Pos logit contributions
 brace+0.031
yt+0.030
 swe+0.030
 narrative+0.029
 runs+0.029
Neg logit contributions
taker-0.027
 devastated-0.026
he-0.025
 Shards-0.025
chair-0.025
Token
Token ID198
Feature activation-5.01e-01

Pos logit contributions
 brace+0.012
 runs+0.011
 swe+0.011
yt+0.011
 narrative+0.011
Neg logit contributions
taker-0.011
 devastated-0.010
har-0.010
ase-0.010
 rubble-0.009
TokenAfter
Token ID3260
Feature activation+1.02e+00

Pos logit contributions
 brace+0.010
yt+0.010
 swe+0.009
cast+0.009
 runs+0.009
Neg logit contributions
he-0.012
taker-0.012
 devastated-0.012
ase-0.012
 Shards-0.011
Token a
Token ID257
Feature activation-1.46e+00

Pos logit contributions
taker+0.027
ase+0.026
 aback+0.025
har+0.024
 devastated+0.024
Neg logit contributions
 brace-0.020
 runs-0.020
 hats-0.019
Say-0.018
 swe-0.018
Token him
Token ID683
Feature activation+1.06e+00

Pos logit contributions
 brace+0.005
 runs+0.005
 swe+0.005
 hats+0.004
cast+0.004
Neg logit contributions
ase-0.008
har-0.008
idd-0.008
 devastated-0.008
taker-0.008
Token to
Token ID284
Feature activation+8.20e-01

Pos logit contributions
taker+0.021
 devastated+0.021
he+0.020
ase+0.020
har+0.020
Neg logit contributions
 brace-0.025
 swe-0.024
yt-0.024
 runs-0.024
 narrative-0.023
Token flap
Token ID37699
Feature activation-1.24e+00

Pos logit contributions
taker+0.023
 devastated+0.023
ase+0.023
idd+0.023
chair+0.022
Neg logit contributions
 brace-0.013
 runs-0.012
 swe-0.011
yt-0.011
 narrative-0.011
Token his
Token ID465
Feature activation-2.11e+00

Pos logit contributions
 brace+0.025
yt+0.024
 swe+0.024
 runs+0.023
 narrative+0.023
Neg logit contributions
taker-0.029
 devastated-0.028
ase-0.027
he-0.027
chair-0.027
Token wings
Token ID12098
Feature activation-1.31e+00

Pos logit contributions
yt+0.048
cast+0.047
 brace+0.047
 narrative+0.046
Say+0.046
Neg logit contributions
taker-0.042
 devastated-0.041
 Shards-0.039
chair-0.039
 rubble-0.038
Token harder
Token ID7069
Feature activation-4.19e+00

Pos logit contributions
 brace+0.025
 swe+0.023
yt+0.023
 runs+0.022
cast+0.022
Neg logit contributions
taker-0.032
 devastated-0.032
ase-0.032
har-0.031
idd-0.031
Token.
Token ID13
Feature activation+3.13e-01

Pos logit contributions
 narrative+0.114
Say+0.112
 runs+0.110
 brace+0.110
 swe+0.110
Neg logit contributions
taker-0.067
 of-0.062
he-0.061
ore-0.061
 aback-0.059
Token
Token ID198
Feature activation-8.12e-02

Pos logit contributions
taker+0.007
 devastated+0.007
 rubble+0.007
ions+0.007
har+0.006
Neg logit contributions
 brace-0.007
 runs-0.007
 swe-0.006
cast-0.006
yt-0.006
Token
Token ID198
Feature activation+3.23e-01

Pos logit contributions
 brace+0.002
yt+0.002
 runs+0.002
 swe+0.002
cast+0.002
Neg logit contributions
taker-0.002
 devastated-0.002
 rubble-0.002
har-0.002
ase-0.002
TokenThe
Token ID464
Feature activation-8.52e-01

Pos logit contributions
taker+0.007
 devastated+0.007
ase+0.007
 rubble+0.006
ions+0.006
Neg logit contributions
 brace-0.007
 swe-0.007
yt-0.007
 runs-0.007
 narrative-0.007
Token little
Token ID1310
Feature activation-8.00e-01

Pos logit contributions
yt+0.024
cast+0.024
 brace+0.023
 swe+0.023
 runs+0.023
Neg logit contributions
 devastated-0.013
taker-0.012
he-0.012
 nowhere-0.011
 jail-0.011
Token,
Token ID11
Feature activation+8.35e-01

Pos logit contributions
ase+0.035
 devastated+0.033
taker+0.032
 Shards+0.032
ended+0.032
Neg logit contributions
 runs-0.021
 swe-0.019
 brace-0.018
 hats-0.018
cast-0.017
Token but
Token ID475
Feature activation+1.26e+00

Pos logit contributions
ase+0.023
taker+0.023
he+0.022
 devastated+0.022
har+0.022
Neg logit contributions
 brace-0.014
 runs-0.013
 swe-0.013
yt-0.013
 hats-0.012
Token the
Token ID262
Feature activation-1.56e-01

Pos logit contributions
ase+0.043
handed+0.041
he+0.040
taker+0.040
har+0.039
Neg logit contributions
 runs-0.016
 swe-0.014
 hats-0.013
 brace-0.013
 goes-0.012
Token ground
Token ID2323
Feature activation+3.67e-01

Pos logit contributions
 narrative+0.003
 runs+0.003
 brace+0.003
 swe+0.003
yt+0.003
Neg logit contributions
ase-0.003
taker-0.003
ions-0.003
he-0.003
 devastated-0.003
Token shook
Token ID14682
Feature activation-9.54e-01

Pos logit contributions
taker+0.008
 devastated+0.008
he+0.007
ase+0.007
har+0.007
Neg logit contributions
 brace-0.008
 swe-0.008
 runs-0.008
yt-0.008
cast-0.007
Token harder
Token ID7069
Feature activation-4.14e+00

Pos logit contributions
 brace+0.022
yt+0.022
 narrative+0.021
 swe+0.021
 runs+0.021
Neg logit contributions
taker-0.019
 devastated-0.019
he-0.018
chair-0.017
har-0.017
Token and
Token ID290
Feature activation+7.49e-01

Pos logit contributions
Say+0.112
 brace+0.111
 narrative+0.110
 runs+0.107
Nik+0.103
Neg logit contributions
taker-0.070
 of-0.066
chair-0.065
he-0.063
 aback-0.060
Token harder
Token ID7069
Feature activation-3.95e+00

Pos logit contributions
taker+0.016
 devastated+0.015
ase+0.015
he+0.015
chair+0.015
Neg logit contributions
 brace-0.017
yt-0.016
 swe-0.016
 runs-0.016
 narrative-0.016
Token.
Token ID13
Feature activation+1.41e+00

Pos logit contributions
 narrative+0.097
Say+0.093
 brace+0.091
Nik+0.089
yt+0.087
Neg logit contributions
taker-0.079
chair-0.070
 Bobby-0.069
ions-0.069
 jail-0.069
Token Tim
Token ID5045
Feature activation+1.81e-02

Pos logit contributions
taker+0.033
he+0.033
 devastated+0.032
ase+0.031
har+0.030
Neg logit contributions
 runs-0.029
 swe-0.028
 brace-0.027
ints-0.027
 nineteen-0.027
Token's
Token ID338
Feature activation-1.70e-01

Pos logit contributions
taker+0.000
ions+0.000
 rubble+0.000
ase+0.000
har+0.000
Neg logit contributions
 runs-0.000
 brace-0.000
 swe-0.000
yt-0.000
rick-0.000
Token his
Token ID465
Feature activation-1.14e+00

Pos logit contributions
 narrative+0.006
yt+0.006
 brace+0.006
 vitamins+0.006
 swe+0.005
Neg logit contributions
he-0.003
taker-0.003
on-0.003
 jail-0.003
 control-0.003
Token face
Token ID1986
Feature activation-3.72e-01

Pos logit contributions
 narrative+0.024
yt+0.024
 brace+0.024
 swe+0.023
 runs+0.023
Neg logit contributions
taker-0.024
 devastated-0.024
chair-0.024
he-0.023
har-0.023
Token as
Token ID355
Feature activation-9.54e-01

Pos logit contributions
 brace+0.008
yt+0.007
 swe+0.007
 runs+0.007
 narrative+0.007
Neg logit contributions
taker-0.009
 devastated-0.008
ase-0.008
har-0.008
he-0.008
Token he
Token ID339
Feature activation-1.61e+00

Pos logit contributions
 brace+0.021
yt+0.021
 swe+0.020
 narrative+0.020
 runs+0.020
Neg logit contributions
taker-0.021
 devastated-0.019
he-0.019
chair-0.019
ase-0.018
Token soared
Token ID35292
Feature activation-1.15e+00

Pos logit contributions
 brace+0.030
yt+0.029
 swe+0.029
 runs+0.028
 narrative+0.028
Neg logit contributions
taker-0.040
 devastated-0.039
ase-0.038
he-0.037
chair-0.037
Token higher
Token ID2440
Feature activation-4.12e+00

Pos logit contributions
 narrative+0.030
 brace+0.030
 vitamins+0.030
 runs+0.029
yt+0.028
Neg logit contributions
taker-0.020
he-0.019
har-0.018
 devastated-0.018
 aback-0.017
Token and
Token ID290
Feature activation+1.32e-01

Pos logit contributions
 vitamins+0.133
 narrative+0.129
Say+0.125
 swe+0.121
 runs+0.121
Neg logit contributions
he-0.053
 of-0.047
 control-0.040
chair-0.038
har-0.038
Token higher
Token ID2440
Feature activation-3.71e+00

Pos logit contributions
taker+0.002
 devastated+0.002
he+0.002
chair+0.002
har+0.002
Neg logit contributions
 brace-0.003
yt-0.003
 narrative-0.003
 vitamins-0.003
 swe-0.003
Token.
Token ID13
Feature activation-1.16e-01

Pos logit contributions
 narrative+0.104
 vitamins+0.104
Say+0.099
 swe+0.095
Nik+0.095
Neg logit contributions
he-0.057
 of-0.053
 control-0.051
taker-0.049
har-0.048
Token
Token ID198
Feature activation-2.48e-01

Pos logit contributions
 brace+0.002
 runs+0.002
 swe+0.002
yt+0.002
cast+0.002
Neg logit contributions
taker-0.003
 devastated-0.002
har-0.002
ase-0.002
 rubble-0.002
Token
Token ID198
Feature activation+1.73e-01

Pos logit contributions
 brace+0.006
yt+0.005
 runs+0.005
 swe+0.005
 narrative+0.005
Neg logit contributions
taker-0.005
 devastated-0.005
 rubble-0.005
har-0.005
ase-0.005
Token wind
Token ID2344
Feature activation-4.55e-01

Pos logit contributions
ase+0.019
taker+0.019
ions+0.018
ore+0.018
aun+0.018
Neg logit contributions
 brace-0.012
 swe-0.011
 runs-0.010
yt-0.009
 narrative-0.009
Token lifting
Token ID16842
Feature activation-1.06e+00

Pos logit contributions
 brace+0.009
 runs+0.008
 swe+0.008
yt+0.008
cast+0.008
Neg logit contributions
taker-0.011
 devastated-0.011
 Shards-0.011
har-0.011
ase-0.010
Token her
Token ID607
Feature activation-1.04e-01

Pos logit contributions
 brace+0.024
 swe+0.022
 runs+0.022
yt+0.022
 narrative+0.021
Neg logit contributions
taker-0.024
 devastated-0.023
ase-0.022
he-0.022
chair-0.022
Token k
Token ID479
Feature activation+1.53e-01

Pos logit contributions
 brace+0.002
 swe+0.002
 runs+0.002
yt+0.002
 hats+0.002
Neg logit contributions
taker-0.003
ase-0.003
 devastated-0.003
ions-0.002
aun-0.002
Tokenite
Token ID578
Feature activation-2.10e+00

Pos logit contributions
taker+0.003
 devastated+0.003
 rubble+0.003
wards+0.003
 doorway+0.003
Neg logit contributions
 brace-0.004
 runs-0.003
rick-0.003
yt-0.003
 pads-0.003
Token higher
Token ID2440
Feature activation-4.12e+00

Pos logit contributions
 brace+0.045
yt+0.044
 swe+0.043
 narrative+0.043
 runs+0.042
Neg logit contributions
taker-0.046
 devastated-0.044
he-0.043
chair-0.043
ase-0.043
Token and
Token ID290
Feature activation-7.13e-01

Pos logit contributions
 vitamins+0.120
Say+0.119
 narrative+0.118
 runs+0.117
 brace+0.114
Neg logit contributions
he-0.060
taker-0.051
 of-0.050
har-0.047
chair-0.047
Token higher
Token ID2440
Feature activation-3.77e+00

Pos logit contributions
yt+0.018
 narrative+0.017
 brace+0.017
 vitamins+0.017
 runs+0.017
Neg logit contributions
taker-0.013
 devastated-0.012
har-0.012
he-0.012
chair-0.012
Token.
Token ID13
Feature activation-4.32e-02

Pos logit contributions
 narrative+0.103
 vitamins+0.101
Say+0.101
Nik+0.094
rick+0.093
Neg logit contributions
he-0.058
taker-0.055
 control-0.052
 of-0.052
har-0.051
Token
Token ID220
Feature activation-1.64e-01

Pos logit contributions
 brace+0.001
yt+0.001
 runs+0.001
 swe+0.001
cast+0.001
Neg logit contributions
taker-0.001
 devastated-0.001
har-0.001
ions-0.001
idd-0.001
Token
Token ID198
Feature activation+4.60e-02

Pos logit contributions
 brace+0.003
 runs+0.003
cast+0.003
yt+0.003
 swe+0.003
Neg logit contributions
taker-0.004
 devastated-0.004
 rubble-0.004
har-0.004
ase-0.004
Token trees
Token ID7150
Feature activation-1.30e-02

Pos logit contributions
 brace+0.006
 swe+0.005
yt+0.005
 runs+0.005
 narrative+0.005
Neg logit contributions
taker-0.008
 devastated-0.008
he-0.007
ase-0.007
ions-0.007
Token to
Token ID284
Feature activation+4.96e-02

Pos logit contributions
 brace+0.000
 swe+0.000
yt+0.000
cast+0.000
 runs+0.000
Neg logit contributions
 devastated-0.000
taker-0.000
 Shards-0.000
har-0.000
ase-0.000
Token make
Token ID787
Feature activation-4.12e-01

Pos logit contributions
taker+0.001
 devastated+0.001
ase+0.001
chair+0.001
idd+0.001
Neg logit contributions
 brace-0.001
 swe-0.001
 runs-0.001
yt-0.001
cast-0.001
Token them
Token ID606
Feature activation+8.38e-01

Pos logit contributions
 brace+0.009
 swe+0.009
 hats+0.008
 runs+0.008
cast+0.008
Neg logit contributions
ase-0.010
 devastated-0.010
har-0.010
itty-0.010
chair-0.010
Token shine
Token ID18340
Feature activation-6.62e-01

Pos logit contributions
ase+0.022
idd+0.021
 devastated+0.021
itty+0.021
taker+0.020
Neg logit contributions
 brace-0.017
 swe-0.016
yt-0.015
 runs-0.014
cast-0.014
Token again
Token ID757
Feature activation-4.12e+00

Pos logit contributions
 brace+0.014
yt+0.014
 runs+0.014
 narrative+0.014
 swe+0.013
Neg logit contributions
taker-0.015
 devastated-0.014
he-0.014
chair-0.013
ase-0.013
Token.
Token ID13
Feature activation-2.81e-01

Pos logit contributions
 narrative+0.097
 runs+0.096
 brace+0.094
Say+0.090
 vitamins+0.089
Neg logit contributions
he-0.077
taker-0.075
har-0.075
chair-0.073
 jail-0.072
Token
Token ID198
Feature activation-4.67e-01

Pos logit contributions
 brace+0.006
 runs+0.006
 swe+0.006
yt+0.005
cast+0.005
Neg logit contributions
taker-0.006
 devastated-0.006
har-0.006
 rubble-0.006
ions-0.006
Token
Token ID198
Feature activation+4.81e-02

Pos logit contributions
 brace+0.011
yt+0.010
 runs+0.010
 swe+0.010
cast+0.010
Neg logit contributions
taker-0.010
 devastated-0.009
har-0.009
 rubble-0.009
ase-0.009
TokenAs
Token ID1722
Feature activation-1.46e+00

Pos logit contributions
taker+0.001
 devastated+0.001
ase+0.001
chair+0.001
har+0.001
Neg logit contributions
 brace-0.001
yt-0.001
 swe-0.001
 runs-0.001
 narrative-0.001
Token the
Token ID262
Feature activation+1.19e-03

Pos logit contributions
 brace+0.029
yt+0.028
 runs+0.028
 swe+0.028
cast+0.026
Neg logit contributions
taker-0.036
ase-0.034
 devastated-0.034
har-0.033
chair-0.033
Token.
Token ID13
Feature activation-1.19e+00

Pos logit contributions
yt+0.051
 brace+0.048
 swe+0.048
 runs+0.047
 narrative+0.047
Neg logit contributions
he-0.047
chair-0.046
taker-0.045
ase-0.044
ions-0.043
Token Sara
Token ID24799
Feature activation-1.53e+00

Pos logit contributions
yt+0.027
 brace+0.027
 narrative+0.025
cast+0.025
 runs+0.023
Neg logit contributions
chair-0.024
 Everything-0.024
 jail-0.024
 Shards-0.024
 Nobody-0.023
Token can
Token ID460
Feature activation-6.23e-01

Pos logit contributions
yt+0.031
 swe+0.028
 brace+0.027
iri+0.026
 avail+0.026
Neg logit contributions
ase-0.037
he-0.036
chair-0.036
on-0.035
taker-0.034
Token count
Token ID954
Feature activation-2.04e+00

Pos logit contributions
yt+0.016
 swe+0.015
 brace+0.014
 narrative+0.014
 vitamins+0.014
Neg logit contributions
he-0.013
chair-0.012
taker-0.012
on-0.012
bird-0.012
Token to
Token ID284
Feature activation-1.24e+00

Pos logit contributions
yt+0.047
 narrative+0.047
 swe+0.045
 brace+0.044
Say+0.044
Neg logit contributions
ions-0.039
he-0.039
taker-0.039
chair-0.038
on-0.037
Token ten
Token ID3478
Feature activation-4.12e+00

Pos logit contributions
yt+0.031
 narrative+0.029
 swe+0.029
Say+0.027
cast+0.027
Neg logit contributions
he-0.023
bird-0.022
 jail-0.022
taker-0.021
chair-0.021
Token.
Token ID13
Feature activation-1.05e+00

Pos logit contributions
 narrative+0.108
Say+0.104
 swe+0.100
yt+0.100
iri+0.099
Neg logit contributions
chair-0.071
 of-0.071
he-0.067
on-0.062
ions-0.061
Token Ben
Token ID3932
Feature activation-2.45e+00

Pos logit contributions
yt+0.030
 brace+0.029
 narrative+0.029
cast+0.026
 swe+0.026
Neg logit contributions
 Everything-0.018
 As-0.017
 While-0.017
 It-0.016
 Shards-0.016
Token can
Token ID460
Feature activation-8.55e-01

Pos logit contributions
yt+0.053
iri+0.047
 narrative+0.045
asha+0.045
Say+0.045
Neg logit contributions
chair-0.055
on-0.053
ase-0.052
taker-0.051
bird-0.050
Token count
Token ID954
Feature activation-2.01e+00

Pos logit contributions
yt+0.019
 swe+0.018
 narrative+0.017
 brace+0.017
 vitamins+0.016
Neg logit contributions
he-0.019
chair-0.019
taker-0.019
on-0.019
bird-0.018
Token to
Token ID284
Feature activation-9.71e-01

Pos logit contributions
 narrative+0.049
yt+0.048
Say+0.046
 swe+0.045
 brace+0.045
Neg logit contributions
ions-0.037
he-0.036
taker-0.036
chair-0.036
on-0.035
Token of
Token ID286
Feature activation+2.48e-01

Pos logit contributions
 devastated+0.025
har+0.022
taker+0.020
 Shards+0.020
ase+0.019
Neg logit contributions
 vitamins-0.048
 runs-0.048
 swe-0.048
 hats-0.047
 brace-0.047
Token cheese
Token ID9891
Feature activation-1.73e+00

Pos logit contributions
 devastated+0.007
ase+0.007
taker+0.007
he+0.007
har+0.007
Neg logit contributions
 brace-0.004
 runs-0.004
 swe-0.004
 vitamins-0.004
 hats-0.004
Token and
Token ID290
Feature activation+1.06e+00

Pos logit contributions
 brace+0.033
yt+0.032
 swe+0.031
 runs+0.031
 narrative+0.030
Neg logit contributions
taker-0.042
 devastated-0.042
har-0.040
 Shards-0.039
he-0.039
Token sauce
Token ID10746
Feature activation-1.42e+00

Pos logit contributions
taker+0.025
 devastated+0.024
chair+0.024
ase+0.024
he+0.023
Neg logit contributions
 brace-0.021
yt-0.020
 swe-0.020
 narrative-0.019
 runs-0.019
Token and
Token ID290
Feature activation+7.34e-01

Pos logit contributions
 brace+0.027
yt+0.026
 swe+0.026
 runs+0.025
 narrative+0.025
Neg logit contributions
taker-0.035
 devastated-0.033
ase-0.033
he-0.033
chair-0.032
Token topp
Token ID23126
Feature activation-4.12e+00

Pos logit contributions
taker+0.017
 devastated+0.017
chair+0.016
ase+0.016
 rubble+0.016
Neg logit contributions
 brace-0.014
yt-0.014
 swe-0.013
 narrative-0.013
 runs-0.013
Tokenings
Token ID654
Feature activation-4.97e-01

Pos logit contributions
 narrative+0.122
Say+0.119
 swe+0.114
 brace+0.113
yt+0.113
Neg logit contributions
ase-0.058
on-0.057
he-0.055
taker-0.054
ore-0.054
Token.
Token ID13
Feature activation+9.52e-03

Pos logit contributions
 brace+0.009
 swe+0.008
 runs+0.008
yt+0.008
 narrative+0.008
Neg logit contributions
taker-0.013
 devastated-0.013
har-0.012
ase-0.012
chair-0.012
Token The
Token ID383
Feature activation+3.56e-02

Pos logit contributions
taker+0.000
 devastated+0.000
he+0.000
ase+0.000
har+0.000
Neg logit contributions
 brace-0.000
 swe-0.000
 runs-0.000
yt-0.000
 narrative-0.000
Token people
Token ID661
Feature activation+2.53e-01

Pos logit contributions
taker+0.001
 devastated+0.001
chair+0.001
he+0.001
har+0.001
Neg logit contributions
 brace-0.001
 swe-0.001
 narrative-0.001
 runs-0.001
yt-0.001
Token who
Token ID508
Feature activation-1.42e+00

Pos logit contributions
 Shards+0.007
ase+0.007
taker+0.006
 devastated+0.006
har+0.006
Neg logit contributions
 runs-0.005
 brace-0.004
 goes-0.004
 points-0.004
 swe-0.004
Token other
Token ID584
Feature activation-1.18e+00

Pos logit contributions
 brace+0.016
yt+0.016
 swe+0.015
 runs+0.015
 narrative+0.015
Neg logit contributions
taker-0.026
 devastated-0.025
he-0.024
ase-0.024
chair-0.024
Token and
Token ID290
Feature activation-3.81e-01

Pos logit contributions
 brace+0.026
yt+0.026
 narrative+0.025
 swe+0.025
 runs+0.025
Neg logit contributions
taker-0.025
 devastated-0.024
he-0.023
chair-0.023
ase-0.023
Token go
Token ID467
Feature activation-6.13e-01

Pos logit contributions
 brace+0.008
yt+0.007
 swe+0.007
 runs+0.007
 narrative+0.007
Neg logit contributions
taker-0.009
 devastated-0.009
he-0.009
ase-0.008
chair-0.008
Token higher
Token ID2440
Feature activation-3.93e+00

Pos logit contributions
 narrative+0.015
 brace+0.015
yt+0.015
 swe+0.015
Say+0.014
Neg logit contributions
taker-0.012
he-0.011
 devastated-0.010
har-0.010
chair-0.010
Token and
Token ID290
Feature activation-6.78e-02

Pos logit contributions
Say+0.114
 vitamins+0.112
 narrative+0.111
 swe+0.106
 brace+0.104
Neg logit contributions
he-0.056
 of-0.052
 control-0.048
 jail-0.047
ions-0.047
Token higher
Token ID2440
Feature activation-4.11e+00

Pos logit contributions
yt+0.002
 brace+0.002
 narrative+0.002
 vitamins+0.002
 swe+0.002
Neg logit contributions
taker-0.001
he-0.001
 devastated-0.001
har-0.001
 aback-0.001
Token.
Token ID13
Feature activation-1.10e+00

Pos logit contributions
 narrative+0.121
Say+0.120
 vitamins+0.120
 swe+0.111
Nik+0.111
Neg logit contributions
he-0.053
 of-0.052
 control-0.051
 as-0.048
ions-0.047
Token
Token ID198
Feature activation-4.92e-01

Pos logit contributions
 brace+0.025
 swe+0.024
yt+0.024
 runs+0.024
 narrative+0.023
Neg logit contributions
taker-0.023
 devastated-0.022
har-0.021
ase-0.021
he-0.021
Token
Token ID198
Feature activation-2.95e-01

Pos logit contributions
 brace+0.011
yt+0.011
 runs+0.011
 swe+0.011
 narrative+0.011
Neg logit contributions
taker-0.010
 devastated-0.010
 rubble-0.009
har-0.009
ase-0.009
Token"
Token ID1
Feature activation-8.83e-01

Pos logit contributions
 brace+0.005
yt+0.005
 swe+0.005
 narrative+0.004
 runs+0.004
Neg logit contributions
taker-0.008
 devastated-0.008
he-0.008
ase-0.008
 Shards-0.007
TokenLook
Token ID8567
Feature activation-1.48e+00

Pos logit contributions
yt+0.020
 brace+0.020
 swe+0.019
cast+0.019
 runs+0.018
Neg logit contributions
taker-0.018
ase-0.018
he-0.018
 devastated-0.018
 Shards-0.017
Token "
Token ID366
Feature activation-1.46e-01

Pos logit contributions
 brace+0.003
yt+0.003
 swe+0.003
 runs+0.003
 narrative+0.003
Neg logit contributions
taker-0.003
 devastated-0.003
ase-0.002
he-0.002
har-0.002
TokenLet
Token ID5756
Feature activation-4.85e-01

Pos logit contributions
 brace+0.003
yt+0.003
 runs+0.003
 swe+0.003
 narrative+0.003
Neg logit contributions
taker-0.003
 devastated-0.003
ase-0.003
chair-0.003
har-0.003
Token's
Token ID338
Feature activation-1.07e+00

Pos logit contributions
yt+0.013
 narrative+0.013
 brace+0.013
cast+0.013
 vitamins+0.013
Neg logit contributions
he-0.007
 devastated-0.007
chair-0.007
on-0.007
 aback-0.007
Token push
Token ID4574
Feature activation-9.16e-01

Pos logit contributions
 brace+0.018
 runs+0.017
 swe+0.016
yt+0.016
cast+0.015
Neg logit contributions
taker-0.031
 devastated-0.030
ase-0.029
chair-0.028
 Shards-0.028
Token it
Token ID340
Feature activation-1.25e+00

Pos logit contributions
 brace+0.022
 runs+0.021
 swe+0.020
yt+0.020
 narrative+0.020
Neg logit contributions
taker-0.019
 devastated-0.019
ase-0.018
chair-0.017
har-0.017
Token harder
Token ID7069
Feature activation-4.10e+00

Pos logit contributions
 brace+0.028
yt+0.027
 swe+0.026
 narrative+0.026
 runs+0.026
Neg logit contributions
taker-0.027
 devastated-0.026
he-0.025
chair-0.025
ase-0.025
Token!"
Token ID2474
Feature activation-8.26e-01

Pos logit contributions
 brace+0.110
 narrative+0.108
Say+0.108
 swe+0.108
yt+0.100
Neg logit contributions
taker-0.064
chair-0.063
ions-0.063
he-0.063
 of-0.063
Token
Token ID198
Feature activation+1.07e-01

Pos logit contributions
 runs+0.015
 brace+0.014
yt+0.014
cast+0.014
 hats+0.014
Neg logit contributions
 devastated-0.021
taker-0.020
ase-0.020
har-0.020
 rubble-0.020
Token
Token ID198
Feature activation+3.49e-01

Pos logit contributions
taker+0.002
 devastated+0.002
ase+0.002
he+0.002
chair+0.002
Neg logit contributions
 brace-0.003
yt-0.003
 swe-0.003
 runs-0.003
 narrative-0.003
Token"
Token ID1
Feature activation-1.77e-01

Pos logit contributions
taker+0.009
har+0.009
ase+0.008
he+0.008
 Shards+0.008
Neg logit contributions
 brace-0.007
 swe-0.006
yt-0.006
cast-0.006
 narrative-0.006
TokenOK
Token ID11380
Feature activation-4.94e-01

Pos logit contributions
yt+0.004
 brace+0.004
 swe+0.003
cast+0.003
 narrative+0.003
Neg logit contributions
taker-0.004
 devastated-0.004
he-0.004
har-0.004
 Shards-0.004
Token swing
Token ID9628
Feature activation-8.65e-01

Pos logit contributions
taker+0.015
 devastated+0.015
ase+0.015
he+0.015
har+0.014
Neg logit contributions
 brace-0.009
yt-0.008
 swe-0.008
 runs-0.008
 narrative-0.008
Token and
Token ID290
Feature activation+2.79e-01

Pos logit contributions
 brace+0.017
yt+0.016
 runs+0.016
 swe+0.016
cast+0.015
Neg logit contributions
taker-0.021
 devastated-0.021
har-0.020
 rubble-0.019
 Shards-0.019
Token began
Token ID2540
Feature activation-1.17e+00

Pos logit contributions
taker+0.007
 devastated+0.007
ase+0.007
har+0.007
he+0.007
Neg logit contributions
 brace-0.005
 runs-0.005
yt-0.005
 swe-0.005
cast-0.005
Token to
Token ID284
Feature activation-3.39e-01

Pos logit contributions
 brace+0.028
 narrative+0.027
yt+0.027
 swe+0.026
 runs+0.026
Neg logit contributions
taker-0.023
 devastated-0.022
chair-0.021
har-0.021
 Shards-0.021
Token swing
Token ID9628
Feature activation-1.40e+00

Pos logit contributions
 narrative+0.007
yt+0.007
 vitamins+0.007
 brace+0.007
 swe+0.006
Neg logit contributions
taker-0.007
he-0.007
chair-0.007
 devastated-0.007
 Shards-0.007
Token higher
Token ID2440
Feature activation-4.10e+00

Pos logit contributions
yt+0.032
 narrative+0.032
 vitamins+0.032
 brace+0.030
Say+0.030
Neg logit contributions
he-0.028
taker-0.028
chair-0.028
ase-0.025
ions-0.025
Token and
Token ID290
Feature activation+9.71e-02

Pos logit contributions
 vitamins+0.125
 narrative+0.121
Say+0.119
 swe+0.112
 runs+0.109
Neg logit contributions
he-0.062
 of-0.051
har-0.049
 control-0.048
 so-0.046
Token higher
Token ID2440
Feature activation-3.82e+00

Pos logit contributions
taker+0.002
he+0.002
 devastated+0.002
chair+0.002
har+0.001
Neg logit contributions
yt-0.003
 narrative-0.003
 vitamins-0.002
 brace-0.002
 swe-0.002
Token.
Token ID13
Feature activation+1.56e-01

Pos logit contributions
 narrative+0.114
 vitamins+0.114
Say+0.111
rick+0.103
Nik+0.102
Neg logit contributions
he-0.053
 of-0.046
 as-0.046
 control-0.045
taker-0.045
Token
Token ID198
Feature activation-1.55e-01

Pos logit contributions
taker+0.003
 devastated+0.003
har+0.003
ase+0.003
he+0.003
Neg logit contributions
 brace-0.004
yt-0.004
 runs-0.004
 swe-0.004
 narrative-0.004
Token
Token ID198
Feature activation+5.07e-01

Pos logit contributions
 brace+0.004
yt+0.003
 runs+0.003
 swe+0.003
cast+0.003
Neg logit contributions
taker-0.003
 devastated-0.003
har-0.003
 rubble-0.003
ase-0.003
Token new
Token ID649
Feature activation+7.75e-01

Pos logit contributions
taker+0.013
ions+0.013
 devastated+0.013
ase+0.013
 Shards+0.012
Neg logit contributions
 brace-0.008
 swe-0.008
yt-0.008
 runs-0.007
 narrative-0.007
Token tutor
Token ID48680
Feature activation-2.81e-01

Pos logit contributions
taker+0.017
 devastated+0.017
har+0.016
 Shards+0.016
he+0.016
Neg logit contributions
 brace-0.017
yt-0.016
 swe-0.016
 runs-0.016
 narrative-0.016
Token as
Token ID355
Feature activation-1.02e+00

Pos logit contributions
 brace+0.005
 runs+0.005
 swe+0.005
yt+0.005
 narrative+0.004
Neg logit contributions
 Shards-0.007
 devastated-0.007
taker-0.007
har-0.007
 rubble-0.007
Token soon
Token ID2582
Feature activation-2.37e+00

Pos logit contributions
 brace+0.023
 swe+0.022
yt+0.022
 runs+0.022
 narrative+0.021
Neg logit contributions
taker-0.021
 devastated-0.021
ase-0.020
he-0.020
chair-0.020
Token as
Token ID355
Feature activation-1.32e+00

Pos logit contributions
 brace+0.053
 narrative+0.053
yt+0.051
 swe+0.051
cast+0.049
Neg logit contributions
taker-0.048
chair-0.047
 jail-0.046
 devastated-0.045
he-0.045
Token possible
Token ID1744
Feature activation-4.09e+00

Pos logit contributions
 brace+0.027
yt+0.025
 swe+0.025
 runs+0.025
 narrative+0.024
Neg logit contributions
taker-0.031
 devastated-0.030
ase-0.029
he-0.029
chair-0.029
Token.
Token ID13
Feature activation-1.82e-01

Pos logit contributions
 brace+0.086
yt+0.084
 swe+0.082
 narrative+0.080
cast+0.079
Neg logit contributions
taker-0.091
chair-0.088
he-0.088
ase-0.087
ore-0.086
Token We
Token ID775
Feature activation-2.92e-01

Pos logit contributions
 brace+0.004
 swe+0.004
yt+0.004
 runs+0.004
 narrative+0.004
Neg logit contributions
taker-0.004
 devastated-0.004
he-0.003
ase-0.003
har-0.003
Token love
Token ID1842
Feature activation-9.01e-01

Pos logit contributions
 runs+0.004
 narrative+0.004
 brace+0.003
 names+0.003
 swe+0.003
Neg logit contributions
 Shards-0.009
chair-0.009
ase-0.009
taker-0.009
ions-0.009
Token you
Token ID345
Feature activation+1.77e+00

Pos logit contributions
yt+0.020
 brace+0.020
 narrative+0.019
 swe+0.019
 runs+0.019
Neg logit contributions
taker-0.018
 devastated-0.017
chair-0.017
ions-0.017
he-0.017
Token,
Token ID11
Feature activation-2.04e-01

Pos logit contributions
idd+0.048
ase+0.047
 Shards+0.047
taker+0.047
 devastated+0.047
Neg logit contributions
 swe-0.029
 brace-0.029
 runs-0.026
 hats-0.026
 narrative-0.025
Tokenled
Token ID992
Feature activation-4.58e-01

Pos logit contributions
taker+0.011
 devastated+0.011
he+0.011
ase+0.010
chair+0.010
Neg logit contributions
 brace-0.011
yt-0.011
 swe-0.011
 narrative-0.010
 runs-0.010
Token as
Token ID355
Feature activation-1.55e+00

Pos logit contributions
 brace+0.011
yt+0.010
 narrative+0.010
 swe+0.010
 runs+0.010
Neg logit contributions
taker-0.009
 devastated-0.009
he-0.008
har-0.008
chair-0.008
Token the
Token ID262
Feature activation-3.75e-01

Pos logit contributions
 brace+0.027
yt+0.027
 narrative+0.026
 swe+0.025
 runs+0.024
Neg logit contributions
taker-0.040
 devastated-0.038
chair-0.037
he-0.037
ions-0.037
Token ball
Token ID2613
Feature activation-8.19e-01

Pos logit contributions
 brace+0.009
yt+0.009
cast+0.009
 narrative+0.009
 swe+0.009
Neg logit contributions
taker-0.007
 devastated-0.007
he-0.007
chair-0.007
ase-0.006
Token bounced
Token ID31318
Feature activation-3.99e-01

Pos logit contributions
 brace+0.015
 runs+0.014
 swe+0.014
yt+0.014
cast+0.013
Neg logit contributions
taker-0.021
 devastated-0.021
har-0.020
ase-0.020
he-0.020
Token higher
Token ID2440
Feature activation-4.08e+00

Pos logit contributions
 brace+0.009
yt+0.009
 swe+0.009
 narrative+0.009
 runs+0.008
Neg logit contributions
taker-0.008
 devastated-0.008
he-0.008
ase-0.008
chair-0.008
Token and
Token ID290
Feature activation-1.31e-01

Pos logit contributions
Say+0.110
 narrative+0.110
 vitamins+0.109
 swe+0.105
Nik+0.105
Neg logit contributions
he-0.067
 of-0.059
taker-0.057
chair-0.056
 control-0.055
Token higher
Token ID2440
Feature activation-3.81e+00

Pos logit contributions
 brace+0.003
yt+0.003
 narrative+0.003
 vitamins+0.003
 swe+0.003
Neg logit contributions
taker-0.003
 devastated-0.002
he-0.002
har-0.002
chair-0.002
Token.
Token ID13
Feature activation-1.28e-01

Pos logit contributions
 narrative+0.101
Say+0.098
 vitamins+0.097
Nik+0.095
 swe+0.094
Neg logit contributions
he-0.062
taker-0.058
 control-0.057
 of-0.055
har-0.055
Token
Token ID220
Feature activation-1.29e-01

Pos logit contributions
 runs+0.002
 nineteen+0.002
 brace+0.002
 swe+0.002
Say+0.002
Neg logit contributions
taker-0.003
 rubble-0.003
har-0.003
 devastated-0.003
ions-0.003
Token
Token ID198
Feature activation-1.21e-01

Pos logit contributions
 brace+0.002
 runs+0.002
cast+0.002
yt+0.002
 swe+0.002
Neg logit contributions
taker-0.003
 devastated-0.003
 rubble-0.003
har-0.003
chair-0.003
Token.
Token ID13
Feature activation-1.54e-02

Pos logit contributions
 brace+0.008
yt+0.008
 swe+0.008
 runs+0.007
 narrative+0.007
Neg logit contributions
taker-0.015
 devastated-0.015
ase-0.015
har-0.014
 Shards-0.014
Token
Token ID220
Feature activation+1.82e-01

Pos logit contributions
 runs+0.000
cast+0.000
 swe+0.000
 brace+0.000
yt+0.000
Neg logit contributions
 devastated-0.000
taker-0.000
 rubble-0.000
har-0.000
he-0.000
Token
Token ID198
Feature activation+3.21e-01

Pos logit contributions
 devastated+0.004
taker+0.004
 rubble+0.004
har+0.004
chair+0.004
Neg logit contributions
 brace-0.003
cast-0.003
 runs-0.003
 swe-0.003
yt-0.003
Token
Token ID198
Feature activation+4.35e-01

Pos logit contributions
taker+0.007
 devastated+0.007
 rubble+0.006
ase+0.006
har+0.006
Neg logit contributions
 brace-0.007
yt-0.007
 runs-0.007
cast-0.007
 swe-0.007
TokenAfter
Token ID3260
Feature activation+1.47e+00

Pos logit contributions
taker+0.011
 devastated+0.010
 rubble+0.010
ase+0.010
chair+0.010
Neg logit contributions
 brace-0.008
 swe-0.008
yt-0.008
 runs-0.008
 narrative-0.008
Token playing
Token ID2712
Feature activation+2.57e+00

Pos logit contributions
taker+0.038
ase+0.037
har+0.037
 aback+0.036
 devastated+0.035
Neg logit contributions
 brace-0.028
 runs-0.026
 hats-0.025
yt-0.025
 swe-0.025
Token with
Token ID351
Feature activation+4.73e-01

Pos logit contributions
 Shards+0.081
 devastated+0.080
idd+0.080
har+0.080
plets+0.078
Neg logit contributions
 runs-0.029
 brace-0.029
 hats-0.028
cast-0.027
 swe-0.027
Token his
Token ID465
Feature activation+2.18e-01

Pos logit contributions
ase+0.015
har+0.014
taker+0.014
 devastated+0.014
ore+0.014
Neg logit contributions
 brace-0.007
 swe-0.007
 hats-0.006
 runs-0.006
 vitamins-0.005
Token new
Token ID649
Feature activation+7.06e-01

Pos logit contributions
taker+0.008
 aback+0.007
he+0.007
ase+0.007
 devastated+0.007
Neg logit contributions
 brace-0.003
 swe-0.003
 hats-0.003
 runs-0.002
yt-0.002
Token toy
Token ID13373
Feature activation+1.33e+00

Pos logit contributions
taker+0.019
 devastated+0.019
he+0.019
 Shards+0.018
har+0.018
Neg logit contributions
 brace-0.013
 swe-0.013
yt-0.012
 hats-0.012
 runs-0.012
Token for
Token ID329
Feature activation+1.13e+00

Pos logit contributions
 devastated+0.042
taker+0.042
har+0.041
 aback+0.041
 Shards+0.041
Neg logit contributions
 brace-0.018
 swe-0.016
 runs-0.016
 hats-0.016
yt-0.015
Token replied
Token ID8712
Feature activation+5.37e-02

Pos logit contributions
taker+0.003
 rubble+0.003
 devastated+0.003
 jail+0.003
ase+0.003
Neg logit contributions
 brace-0.003
 runs-0.002
 swe-0.002
yt-0.002
cast-0.002
Token,
Token ID11
Feature activation+1.60e-01

Pos logit contributions
taker+0.001
 devastated+0.001
ase+0.001
har+0.001
he+0.001
Neg logit contributions
 brace-0.001
yt-0.001
 swe-0.001
 runs-0.001
 narrative-0.001
Token "
Token ID366
Feature activation-9.32e-02

Pos logit contributions
ase+0.004
he+0.004
taker+0.004
 devastated+0.004
 rubble+0.004
Neg logit contributions
 brace-0.003
 swe-0.003
yt-0.003
 runs-0.003
 hats-0.002
TokenOh
Token ID5812
Feature activation+1.42e+00

Pos logit contributions
 brace+0.002
 swe+0.002
yt+0.002
 runs+0.002
Say+0.002
Neg logit contributions
taker-0.002
 devastated-0.002
ase-0.002
 rubble-0.002
he-0.002
Token no
Token ID645
Feature activation-6.81e-01

Pos logit contributions
taker+0.034
 devastated+0.034
ase+0.034
he+0.034
har+0.033
Neg logit contributions
 brace-0.028
yt-0.028
 swe-0.027
 runs-0.027
 nineteen-0.025
Token!
Token ID0
Feature activation+2.36e+00

Pos logit contributions
 brace+0.018
 narrative+0.017
cast+0.016
yt+0.016
 swe+0.016
Neg logit contributions
taker-0.012
chair-0.011
 jail-0.011
 devastated-0.011
ions-0.011
Token You
Token ID921
Feature activation+2.38e+00

Pos logit contributions
he+0.056
 rubble+0.055
 devastated+0.055
taker+0.053
 reunited+0.051
Neg logit contributions
 runs-0.048
rick-0.044
 swe-0.043
Nik-0.043
 vitamins-0.042
Token need
Token ID761
Feature activation+1.23e+00

Pos logit contributions
ase+0.077
wards+0.077
 doorway+0.073
aun+0.073
har+0.073
Neg logit contributions
 runs-0.034
 takes-0.029
 goes-0.029
 wants-0.029
 let-0.028
Token to
Token ID284
Feature activation+4.39e-01

Pos logit contributions
he+0.038
taker+0.037
ase+0.037
 devastated+0.036
har+0.035
Neg logit contributions
 swe-0.021
 vitamins-0.018
 runs-0.018
 hats-0.018
 brace-0.017
Token wear
Token ID5806
Feature activation+1.18e+00

Pos logit contributions
taker+0.011
 devastated+0.010
he+0.010
ase+0.010
har+0.010
Neg logit contributions
 brace-0.009
yt-0.008
 swe-0.008
 runs-0.008
 narrative-0.008
Token your
Token ID534
Feature activation-6.97e-01

Pos logit contributions
 devastated+0.032
taker+0.031
ase+0.031
he+0.030
har+0.030
Neg logit contributions
 swe-0.024
 brace-0.023
 hats-0.022
 runs-0.020
 vitamins-0.020
Token.
Token ID13
Feature activation-5.49e-01

Pos logit contributions
 brace+0.014
yt+0.014
 swe+0.013
 narrative+0.013
 runs+0.013
Neg logit contributions
taker-0.015
he-0.015
 devastated-0.015
ase-0.014
chair-0.014
Token He
Token ID679
Feature activation-1.06e+00

Pos logit contributions
 brace+0.017
 swe+0.016
 narrative+0.015
yt+0.015
Say+0.015
Neg logit contributions
taker-0.008
he-0.008
 Shards-0.008
chair-0.008
on-0.007
Token was
Token ID373
Feature activation+1.28e-01

Pos logit contributions
 brace+0.024
yt+0.023
 swe+0.023
 runs+0.023
 narrative+0.023
Neg logit contributions
taker-0.022
 devastated-0.022
ase-0.021
he-0.021
har-0.020
Token three
Token ID1115
Feature activation+5.77e-01

Pos logit contributions
taker+0.003
ase+0.003
har+0.003
chair+0.003
 devastated+0.003
Neg logit contributions
 brace-0.003
 runs-0.003
 swe-0.003
yt-0.003
 hats-0.003
Token years
Token ID812
Feature activation+1.26e+00

Pos logit contributions
taker+0.009
 devastated+0.008
chair+0.008
ase+0.008
he+0.008
Neg logit contributions
 brace-0.016
yt-0.016
 swe-0.015
 narrative-0.015
 runs-0.015
Token old
Token ID1468
Feature activation+2.24e+00

Pos logit contributions
 devastated+0.034
taker+0.033
har+0.032
ase+0.032
he+0.032
Neg logit contributions
 brace-0.021
yt-0.020
 swe-0.020
 runs-0.019
cast-0.018
Token.
Token ID13
Feature activation-3.32e-01

Pos logit contributions
taker+0.062
 devastated+0.061
har+0.058
ase+0.058
 Shards+0.057
Neg logit contributions
 brace-0.039
 swe-0.035
yt-0.035
 runs-0.035
 narrative-0.032
Token One
Token ID1881
Feature activation+2.64e-01

Pos logit contributions
 brace+0.009
 swe+0.009
yt+0.008
 narrative+0.008
 runs+0.008
Neg logit contributions
taker-0.006
 Shards-0.006
chair-0.006
he-0.005
ase-0.005
Token day
Token ID1110
Feature activation-3.63e-01

Pos logit contributions
taker+0.007
 devastated+0.007
har+0.007
 aback+0.007
ions+0.007
Neg logit contributions
 runs-0.004
 swe-0.004
yt-0.004
 brace-0.004
 narrative-0.004
Token,
Token ID11
Feature activation-6.12e-01

Pos logit contributions
 brace+0.008
yt+0.008
 runs+0.007
 swe+0.007
 narrative+0.007
Neg logit contributions
taker-0.008
 devastated-0.008
ase-0.008
har-0.007
he-0.007
Token Mama
Token ID40084
Feature activation-2.15e-01

Pos logit contributions
 brace+0.013
 swe+0.013
yt+0.013
 runs+0.012
 narrative+0.012
Neg logit contributions
taker-0.013
 devastated-0.013
he-0.012
chair-0.012
ase-0.012
Token,
Token ID11
Feature activation-4.19e-01

Pos logit contributions
taker+0.030
 devastated+0.029
har+0.029
 rubble+0.028
ase+0.027
Neg logit contributions
 runs-0.018
yt-0.018
 brace-0.017
 swe-0.016
cast-0.016
Token she
Token ID673
Feature activation-3.65e-01

Pos logit contributions
 brace+0.006
 swe+0.006
yt+0.006
 runs+0.006
 narrative+0.005
Neg logit contributions
taker-0.012
 devastated-0.012
ase-0.012
har-0.012
he-0.011
Token saw
Token ID2497
Feature activation+1.25e-01

Pos logit contributions
 runs+0.007
 brace+0.007
yt+0.007
 swe+0.006
cast+0.006
Neg logit contributions
taker-0.010
ions-0.009
har-0.009
ase-0.009
 devastated-0.009
Token a
Token ID257
Feature activation+7.55e-01

Pos logit contributions
har+0.003
taker+0.003
ase+0.003
 devastated+0.003
ore+0.003
Neg logit contributions
 brace-0.003
 swe-0.003
 runs-0.003
 hats-0.003
Say-0.002
Token funny
Token ID8258
Feature activation+3.47e-01

Pos logit contributions
taker+0.021
ions+0.020
ase+0.020
 devastated+0.020
 aback+0.020
Neg logit contributions
 brace-0.012
yt-0.012
 swe-0.012
 runs-0.011
 narrative-0.010
Token-
Token ID12
Feature activation+2.70e+00

Pos logit contributions
taker+0.008
 devastated+0.008
ions+0.007
har+0.007
 aback+0.007
Neg logit contributions
 brace-0.007
yt-0.007
 runs-0.007
 swe-0.007
cast-0.007
Tokenlooking
Token ID11534
Feature activation+7.33e-01

Pos logit contributions
 devastated+0.050
har+0.046
taker+0.046
 rubble+0.046
ions+0.044
Neg logit contributions
 brace-0.068
yt-0.063
 runs-0.063
Say-0.062
 swe-0.060
Token bug
Token ID5434
Feature activation-1.97e-02

Pos logit contributions
taker+0.017
 devastated+0.017
har+0.016
he+0.016
 Shards+0.016
Neg logit contributions
 brace-0.016
yt-0.015
 swe-0.015
 runs-0.015
cast-0.014
Token.
Token ID13
Feature activation+4.19e-01

Pos logit contributions
 brace+0.000
 swe+0.000
yt+0.000
 narrative+0.000
 runs+0.000
Neg logit contributions
taker-0.000
 devastated-0.000
chair-0.000
ase-0.000
he-0.000
Token She
Token ID1375
Feature activation+6.35e-01

Pos logit contributions
taker+0.010
 devastated+0.010
har+0.010
 rubble+0.010
idd+0.010
Neg logit contributions
 brace-0.008
 runs-0.008
 swe-0.007
 offers-0.007
Nik-0.007
Token wanted
Token ID2227
Feature activation-1.15e-01

Pos logit contributions
taker+0.015
 devastated+0.015
ase+0.014
he+0.014
chair+0.014
Neg logit contributions
 brace-0.013
yt-0.012
 swe-0.012
 runs-0.012
 narrative-0.011
Token running
Token ID2491
Feature activation+6.74e-01

Pos logit contributions
taker+0.040
ase+0.038
 devastated+0.038
 Shards+0.038
har+0.037
Neg logit contributions
 brace-0.039
yt-0.038
 runs-0.037
 swe-0.037
cast-0.036
Token around
Token ID1088
Feature activation+3.41e-01

Pos logit contributions
ase+0.013
taker+0.013
 devastated+0.013
 Shards+0.013
chair+0.012
Neg logit contributions
 brace-0.016
 runs-0.016
 swe-0.016
yt-0.015
cast-0.015
Token and
Token ID290
Feature activation+8.22e-03

Pos logit contributions
taker+0.009
 devastated+0.009
ase+0.009
har+0.009
 Shards+0.009
Neg logit contributions
 brace-0.006
 runs-0.006
 swe-0.006
yt-0.005
cast-0.005
Token making
Token ID1642
Feature activation-2.67e-01

Pos logit contributions
taker+0.000
 devastated+0.000
ase+0.000
he+0.000
har+0.000
Neg logit contributions
 brace-0.000
yt-0.000
 swe-0.000
 runs-0.000
cast-0.000
Token a
Token ID257
Feature activation-5.91e-01

Pos logit contributions
 brace+0.005
 swe+0.005
 runs+0.005
 hats+0.004
cast+0.004
Neg logit contributions
ase-0.007
taker-0.007
 devastated-0.006
har-0.006
chair-0.006
Token lot
Token ID1256
Feature activation+2.26e+00

Pos logit contributions
 brace+0.010
 swe+0.010
yt+0.010
 runs+0.009
 narrative+0.009
Neg logit contributions
taker-0.016
ase-0.016
ions-0.016
 devastated-0.015
he-0.015
Token of
Token ID286
Feature activation+4.27e-01

Pos logit contributions
ase+0.034
 Shards+0.033
har+0.033
 devastated+0.032
�+0.031
Neg logit contributions
 swe-0.066
 brace-0.064
 runs-0.062
 sounds-0.060
Say-0.059
Token noise
Token ID7838
Feature activation-4.33e-02

Pos logit contributions
ase+0.012
har+0.011
 Shards+0.011
ions+0.011
can+0.010
Neg logit contributions
 brace-0.008
 swe-0.008
 runs-0.008
 hats-0.007
 narrative-0.007
Token.
Token ID13
Feature activation+1.78e-01

Pos logit contributions
 brace+0.001
yt+0.001
 swe+0.001
 runs+0.001
cast+0.001
Neg logit contributions
taker-0.001
 devastated-0.001
ase-0.001
har-0.001
 Shards-0.001
Token Old
Token ID5706
Feature activation-8.69e-02

Pos logit contributions
har+0.005
he+0.005
taker+0.005
 devastated+0.005
ase+0.004
Neg logit contributions
 runs-0.003
 brace-0.003
 swe-0.003
Say-0.003
ints-0.003
Token Sam
Token ID3409
Feature activation+5.07e-01

Pos logit contributions
 brace+0.002
 swe+0.002
yt+0.002
 runs+0.002
cast+0.002
Neg logit contributions
taker-0.002
 devastated-0.002
ase-0.002
he-0.002
chair-0.002
Token't
Token ID470
Feature activation+3.18e-02

Pos logit contributions
taker+0.011
ase+0.011
 rubble+0.011
 devastated+0.010
chair+0.010
Neg logit contributions
 brace-0.015
 runs-0.015
 swe-0.014
yt-0.014
cast-0.014
Token!"
Token ID2474
Feature activation-3.95e-02

Pos logit contributions
taker+0.001
he+0.001
 devastated+0.001
har+0.001
chair+0.001
Neg logit contributions
 brace-0.001
 swe-0.001
 narrative-0.001
yt-0.001
 runs-0.001
Token
Token ID198
Feature activation+5.56e-01

Pos logit contributions
 brace+0.001
 swe+0.001
yt+0.001
 runs+0.001
 hats+0.001
Neg logit contributions
 devastated-0.001
 rubble-0.001
ase-0.001
ions-0.001
idd-0.001
Token
Token ID198
Feature activation+7.39e-01

Pos logit contributions
taker+0.012
 rubble+0.012
 devastated+0.012
ase+0.011
ions+0.011
Neg logit contributions
 brace-0.011
 runs-0.011
 swe-0.011
Nik-0.011
yt-0.011
TokenMom
Token ID29252
Feature activation+2.39e-01

Pos logit contributions
taker+0.020
 devastated+0.019
 rubble+0.019
ase+0.019
 Shards+0.018
Neg logit contributions
 brace-0.012
yt-0.012
 narrative-0.011
 swe-0.011
 runs-0.011
Tokenmy
Token ID1820
Feature activation+3.10e-01

Pos logit contributions
taker+0.006
 devastated+0.006
ase+0.005
har+0.005
 rubble+0.005
Neg logit contributions
 brace-0.005
 runs-0.005
yt-0.005
 swe-0.005
cast-0.004
Token and
Token ID290
Feature activation+6.62e-01

Pos logit contributions
taker+0.008
ase+0.008
 rubble+0.008
 devastated+0.007
har+0.007
Neg logit contributions
 brace-0.006
 runs-0.006
yt-0.005
 swe-0.005
cast-0.005
Token Daddy
Token ID26926
Feature activation+3.05e-01

Pos logit contributions
ase+0.017
 devastated+0.016
ore+0.016
har+0.016
taker+0.016
Neg logit contributions
 brace-0.012
 runs-0.012
yt-0.012
 swe-0.012
 hats-0.011
Token went
Token ID1816
Feature activation+3.04e-01

Pos logit contributions
taker+0.008
ase+0.007
har+0.007
chair+0.007
 devastated+0.007
Neg logit contributions
 runs-0.006
 brace-0.006
yt-0.005
 swe-0.005
cast-0.005
Token out
Token ID503
Feature activation+2.52e-01

Pos logit contributions
taker+0.006
 devastated+0.006
ase+0.006
he+0.006
chair+0.006
Neg logit contributions
 brace-0.007
yt-0.007
 swe-0.007
 runs-0.007
 narrative-0.006
Token the
Token ID262
Feature activation+1.43e-01

Pos logit contributions
ase+0.006
 devastated+0.006
taker+0.006
 Shards+0.006
har+0.006
Neg logit contributions
yt-0.005
 runs-0.005
 swe-0.005
 brace-0.005
cast-0.004
Token park
Token ID3952
Feature activation+8.41e-01

Pos logit contributions
taker+0.011
 devastated+0.011
ase+0.010
chair+0.010
he+0.010
Neg logit contributions
 brace-0.010
 swe-0.009
yt-0.009
 runs-0.009
 narrative-0.009
Token when
Token ID618
Feature activation-7.15e-02

Pos logit contributions
taker+0.017
 devastated+0.016
ase+0.015
he+0.015
har+0.015
Neg logit contributions
 brace-0.020
yt-0.019
 swe-0.019
 runs-0.019
 narrative-0.019
Token she
Token ID673
Feature activation-2.35e-01

Pos logit contributions
 brace+0.001
 runs+0.001
yt+0.001
 swe+0.001
 narrative+0.001
Neg logit contributions
taker-0.002
ase-0.002
 devastated-0.002
har-0.002
 aback-0.002
Token saw
Token ID2497
Feature activation-2.08e-01

Pos logit contributions
 runs+0.004
yt+0.004
 brace+0.004
 swe+0.004
cast+0.004
Neg logit contributions
taker-0.006
 devastated-0.006
 rubble-0.006
har-0.005
ions-0.005
Token a
Token ID257
Feature activation+4.24e-01

Pos logit contributions
 brace+0.005
 swe+0.005
 runs+0.004
 hats+0.004
yt+0.004
Neg logit contributions
har-0.005
taker-0.004
 devastated-0.004
ase-0.004
he-0.004
Token sweater
Token ID35685
Feature activation+1.23e+00

Pos logit contributions
taker+0.011
 devastated+0.011
ase+0.011
ions+0.011
 aback+0.010
Neg logit contributions
 brace-0.008
yt-0.007
 swe-0.007
 runs-0.007
 narrative-0.006
Token that
Token ID326
Feature activation-3.30e-01

Pos logit contributions
he+0.029
taker+0.029
chair+0.028
ase+0.028
 devastated+0.028
Neg logit contributions
 brace-0.025
 swe-0.024
yt-0.024
 narrative-0.024
 vitamins-0.023
Token she
Token ID673
Feature activation-4.45e-01

Pos logit contributions
 brace+0.007
yt+0.007
 swe+0.007
 runs+0.007
 narrative+0.007
Neg logit contributions
taker-0.007
 devastated-0.007
ase-0.007
he-0.007
har-0.007
Token liked
Token ID8288
Feature activation-3.02e-02

Pos logit contributions
 brace+0.009
yt+0.009
 swe+0.009
 runs+0.009
 narrative+0.009
Neg logit contributions
taker-0.010
 devastated-0.010
ase-0.009
he-0.009
chair-0.009
Token.
Token ID13
Feature activation-6.12e-02

Pos logit contributions
 brace+0.001
 swe+0.001
yt+0.001
 runs+0.001
 narrative+0.001
Neg logit contributions
taker-0.001
 devastated-0.001
he-0.001
ase-0.001
har-0.001
Token She
Token ID1375
Feature activation-1.83e-01

Pos logit contributions
 brace+0.001
 runs+0.001
 swe+0.001
yt+0.001
Nik+0.001
Neg logit contributions
taker-0.002
 devastated-0.002
har-0.002
 rubble-0.002
 arrested-0.001
TokenThe
Token ID464
Feature activation-3.29e-01

Pos logit contributions
taker+0.023
 rubble+0.022
 aback+0.022
ions+0.021
chair+0.021
Neg logit contributions
 brace-0.014
 swe-0.013
Say-0.013
Nik-0.012
 narrative-0.012
Token mother
Token ID2802
Feature activation+8.73e-01

Pos logit contributions
 swe+0.006
 brace+0.006
 narrative+0.006
yt+0.006
 runs+0.006
Neg logit contributions
taker-0.008
ase-0.008
 devastated-0.008
har-0.008
 aback-0.008
Token smiled
Token ID13541
Feature activation+1.63e-01

Pos logit contributions
 rubble+0.025
 jail+0.025
 Shards+0.024
 devastated+0.023
taker+0.023
Neg logit contributions
 runs-0.014
 swe-0.014
 brace-0.014
 says-0.012
yt-0.012
Token and
Token ID290
Feature activation-6.87e-03

Pos logit contributions
taker+0.003
 devastated+0.003
he+0.003
 aback+0.003
chair+0.003
Neg logit contributions
 brace-0.004
 narrative-0.004
yt-0.004
 runs-0.004
 vitamins-0.004
Token said
Token ID531
Feature activation+2.36e-01

Pos logit contributions
 brace+0.000
yt+0.000
 swe+0.000
 runs+0.000
 narrative+0.000
Neg logit contributions
taker-0.000
 devastated-0.000
ase-0.000
he-0.000
har-0.000
Token,
Token ID11
Feature activation+3.17e-01

Pos logit contributions
taker+0.005
 devastated+0.004
chair+0.004
ase+0.004
he+0.004
Neg logit contributions
 brace-0.006
yt-0.006
 narrative-0.006
 swe-0.005
cast-0.005
Token "
Token ID366
Feature activation+2.08e-01

Pos logit contributions
taker+0.007
 devastated+0.006
ase+0.006
he+0.006
chair+0.006
Neg logit contributions
 brace-0.007
yt-0.007
 swe-0.007
 runs-0.007
 narrative-0.007
TokenOf
Token ID5189
Feature activation+2.08e+00

Pos logit contributions
taker+0.004
 devastated+0.004
ase+0.004
he+0.004
chair+0.004
Neg logit contributions
 brace-0.005
yt-0.005
 swe-0.005
 runs-0.005
 narrative-0.005
Token course
Token ID1781
Feature activation-3.97e-02

Pos logit contributions
taker+0.039
 devastated+0.037
ase+0.036
chair+0.035
 Shards+0.035
Neg logit contributions
 brace-0.052
 narrative-0.051
 swe-0.051
 runs-0.050
yt-0.050
Token we
Token ID356
Feature activation-7.01e-01

Pos logit contributions
 brace+0.001
yt+0.001
 swe+0.001
 runs+0.001
 narrative+0.001
Neg logit contributions
taker-0.001
 devastated-0.001
ase-0.001
he-0.001
 Shards-0.001
Token can
Token ID460
Feature activation-3.40e-01

Pos logit contributions
 brace+0.017
yt+0.016
 swe+0.016
 runs+0.015
cast+0.015
Neg logit contributions
taker-0.014
 devastated-0.013
he-0.013
ase-0.013
chair-0.012
Token She
Token ID1375
Feature activation+5.56e-02

Pos logit contributions
 devastated+0.019
har+0.019
 rubble+0.018
taker+0.018
he+0.018
Neg logit contributions
 brace-0.012
 runs-0.011
 swe-0.011
yt-0.011
Nik-0.010
Token showed
Token ID3751
Feature activation-3.85e-01

Pos logit contributions
taker+0.001
ase+0.001
 rubble+0.001
he+0.001
 devastated+0.001
Neg logit contributions
 brace-0.001
 runs-0.001
 swe-0.001
yt-0.001
cast-0.001
Token her
Token ID607
Feature activation+1.35e+00

Pos logit contributions
 brace+0.006
 swe+0.006
 runs+0.006
 hats+0.005
 names+0.005
Neg logit contributions
har-0.011
 devastated-0.011
ase-0.011
idd-0.011
taker-0.011
Token the
Token ID262
Feature activation+1.09e-01

Pos logit contributions
idd+0.033
har+0.032
ase+0.032
taker+0.031
he+0.031
Neg logit contributions
 brace-0.029
 swe-0.027
 runs-0.027
 names-0.025
 hats-0.025
Token children
Token ID1751
Feature activation+1.77e+00

Pos logit contributions
taker+0.003
 devastated+0.003
ase+0.003
he+0.003
ions+0.003
Neg logit contributions
 brace-0.002
 swe-0.002
 runs-0.002
yt-0.002
 narrative-0.002
Token's
Token ID338
Feature activation+1.82e-01

Pos logit contributions
taker+0.041
ase+0.041
har+0.040
 Shards+0.040
 devastated+0.040
Neg logit contributions
 brace-0.037
 runs-0.035
 swe-0.035
yt-0.033
 hats-0.032
Token section
Token ID2665
Feature activation+1.70e+00

Pos logit contributions
taker+0.005
 devastated+0.005
he+0.005
ase+0.005
har+0.005
Neg logit contributions
 brace-0.003
 swe-0.003
 runs-0.003
yt-0.002
 narrative-0.002
Token and
Token ID290
Feature activation+1.48e-01

Pos logit contributions
taker+0.040
 devastated+0.039
 Shards+0.038
har+0.038
he+0.037
Neg logit contributions
 brace-0.034
yt-0.032
 swe-0.032
 runs-0.031
 narrative-0.030
Token asked
Token ID1965
Feature activation-9.60e-01

Pos logit contributions
ase+0.004
he+0.004
taker+0.004
 devastated+0.004
har+0.004
Neg logit contributions
 brace-0.003
 swe-0.003
 runs-0.003
yt-0.003
 hats-0.002
Token if
Token ID611
Feature activation+1.75e-01

Pos logit contributions
 brace+0.019
yt+0.019
 narrative+0.018
 swe+0.018
 runs+0.018
Neg logit contributions
taker-0.023
 devastated-0.022
he-0.021
chair-0.021
ase-0.021
Token she
Token ID673
Feature activation+1.11e-02

Pos logit contributions
taker+0.004
 devastated+0.004
chair+0.004
he+0.004
ase+0.004
Neg logit contributions
 brace-0.003
yt-0.003
 swe-0.003
 runs-0.003
 narrative-0.003
Token As
Token ID1081
Feature activation-9.64e-01

Pos logit contributions
har+0.004
ions+0.004
days+0.004
idd+0.004
ase+0.004
Neg logit contributions
 runs-0.003
 brace-0.003
 swe-0.003
 vitamins-0.002
 offers-0.002
Token Blue
Token ID4518
Feature activation-4.18e-01

Pos logit contributions
 brace+0.018
 swe+0.018
yt+0.017
 runs+0.017
 narrative+0.016
Neg logit contributions
ase-0.024
taker-0.024
 devastated-0.023
har-0.023
 Shards-0.023
Token played
Token ID2826
Feature activation+1.37e+00

Pos logit contributions
 runs+0.007
 brace+0.007
yt+0.007
 swe+0.006
cast+0.006
Neg logit contributions
taker-0.012
 rubble-0.011
 Shards-0.011
ions-0.011
 devastated-0.011
Token,
Token ID11
Feature activation+4.75e-01

Pos logit contributions
 Shards+0.039
har+0.039
 devastated+0.039
itty+0.038
idd+0.038
Neg logit contributions
 brace-0.022
 runs-0.020
 hats-0.020
cast-0.020
 swe-0.019
Token everyone
Token ID2506
Feature activation+9.54e-01

Pos logit contributions
ase+0.015
har+0.014
idd+0.014
anny+0.014
taker+0.014
Neg logit contributions
 brace-0.007
 runs-0.006
 swe-0.006
 narrative-0.006
Say-0.005
Token danced
Token ID39480
Feature activation+5.33e-01

Pos logit contributions
 Shards+0.025
ocolate+0.024
uster+0.024
har+0.024
�+0.024
Neg logit contributions
 runs-0.018
yt-0.016
cast-0.016
 brace-0.016
 goes-0.015
Token and
Token ID290
Feature activation-1.46e-02

Pos logit contributions
 devastated+0.013
 Shards+0.013
taker+0.012
har+0.012
ase+0.012
Neg logit contributions
 brace-0.011
 swe-0.010
yt-0.010
cast-0.010
 runs-0.010
Token had
Token ID550
Feature activation+4.00e-01

Pos logit contributions
 brace+0.000
 runs+0.000
 swe+0.000
yt+0.000
cast+0.000
Neg logit contributions
ase-0.000
taker-0.000
 devastated-0.000
he-0.000
har-0.000
Token a
Token ID257
Feature activation-3.31e-02

Pos logit contributions
taker+0.009
ase+0.009
 devastated+0.008
he+0.008
har+0.008
Neg logit contributions
 brace-0.009
 swe-0.009
 runs-0.009
 vitamins-0.008
yt-0.008
Token great
Token ID1049
Feature activation-2.72e-01

Pos logit contributions
yt+0.000
 swe+0.000
 brace+0.000
 runs+0.000
 narrative+0.000
Neg logit contributions
taker-0.001
ions-0.001
ase-0.001
chair-0.001
aun-0.001
Token time
Token ID640
Feature activation-5.49e-01

Pos logit contributions
yt+0.004
 brace+0.004
 runs+0.004
 swe+0.004
 narrative+0.003
Neg logit contributions
taker-0.008
 devastated-0.008
ions-0.008
har-0.008
 rubble-0.008
Token getting
Token ID1972
Feature activation-1.94e-01

Pos logit contributions
ase+0.021
taker+0.019
he+0.019
 Shards+0.019
har+0.019
Neg logit contributions
 runs-0.016
 brace-0.015
 swe-0.014
 points-0.014
yt-0.014
Token a
Token ID257
Feature activation-1.47e-01

Pos logit contributions
 swe+0.004
 brace+0.004
 runs+0.004
Say+0.004
yt+0.004
Neg logit contributions
ase-0.004
taker-0.004
he-0.004
chair-0.004
 devastated-0.004
Token drink
Token ID4144
Feature activation+1.36e+00

Pos logit contributions
 swe+0.003
yt+0.003
 brace+0.003
 runs+0.003
 hats+0.002
Neg logit contributions
taker-0.004
ions-0.004
ase-0.004
he-0.004
 devastated-0.004
Token.
Token ID13
Feature activation+9.69e-01

Pos logit contributions
 devastated+0.030
taker+0.029
har+0.028
 Shards+0.028
ase+0.028
Neg logit contributions
 brace-0.029
 swe-0.029
yt-0.028
 runs-0.027
 narrative-0.027
Tokenâ
Token ID22940
Feature activation+7.76e-01

Pos logit contributions
 devastated+0.022
 rubble+0.022
taker+0.021
ase+0.021
he+0.021
Neg logit contributions
 swe-0.019
 runs-0.019
 brace-0.019
yt-0.018
Say-0.018
Token€
Token ID26391
Feature activation-3.99e-01

Pos logit contributions
taker+0.019
 devastated+0.018
ase+0.018
he+0.018
chair+0.018
Neg logit contributions
 brace-0.015
yt-0.014
 swe-0.014
 runs-0.014
 narrative-0.014
Token
Token ID198
Feature activation+6.83e-01

Pos logit contributions
 runs+0.008
yt+0.008
 brace+0.008
cast+0.008
Nik+0.007
Neg logit contributions
taker-0.009
 devastated-0.009
 rubble-0.009
ions-0.008
ase-0.008
Token
Token ID198
Feature activation+6.58e-01

Pos logit contributions
taker+0.015
 rubble+0.014
 devastated+0.014
ase+0.014
chair+0.013
Neg logit contributions
 brace-0.014
 swe-0.014
 runs-0.014
yt-0.013
Say-0.013
TokenHis
Token ID6653
Feature activation-1.21e+00

Pos logit contributions
taker+0.017
 devastated+0.016
 rubble+0.016
ase+0.016
chair+0.016
Neg logit contributions
 brace-0.011
 swe-0.011
yt-0.011
 runs-0.011
Say-0.011
Token mum
Token ID25682
Feature activation+2.27e-01

Pos logit contributions
 brace+0.023
 swe+0.022
yt+0.022
 runs+0.022
 narrative+0.021
Neg logit contributions
taker-0.030
 devastated-0.029
ase-0.028
he-0.028
 Shards-0.028
Token was
Token ID373
Feature activation+1.79e-01

Pos logit contributions
taker+0.006
 devastated+0.006
 rubble+0.006
 Shards+0.006
ase+0.006
Neg logit contributions
 brace-0.003
 runs-0.003
 swe-0.003
yt-0.003
cast-0.003
Token kept
Token ID4030
Feature activation-1.04e+00

Pos logit contributions
 brace+0.007
 swe+0.006
yt+0.006
 runs+0.006
 narrative+0.006
Neg logit contributions
taker-0.007
 devastated-0.007
ase-0.007
he-0.007
 Shards-0.007
Token examining
Token ID17247
Feature activation-6.27e-01

Pos logit contributions
 brace+0.022
yt+0.021
 swe+0.021
 runs+0.021
 narrative+0.020
Neg logit contributions
taker-0.024
 devastated-0.023
ase-0.022
he-0.022
chair-0.022
Token the
Token ID262
Feature activation+1.50e-01

Pos logit contributions
 brace+0.015
 swe+0.014
 runs+0.013
yt+0.013
cast+0.013
Neg logit contributions
taker-0.013
 devastated-0.013
ase-0.013
har-0.013
chair-0.013
Token area
Token ID1989
Feature activation+3.25e-01

Pos logit contributions
taker+0.003
 devastated+0.003
he+0.003
chair+0.003
ase+0.003
Neg logit contributions
 brace-0.003
yt-0.003
 runs-0.003
 swe-0.003
cast-0.003
Token where
Token ID810
Feature activation-4.22e-01

Pos logit contributions
taker+0.008
 devastated+0.007
ase+0.007
 Shards+0.007
har+0.007
Neg logit contributions
 brace-0.007
yt-0.006
 swe-0.006
 runs-0.006
cast-0.006
Token it
Token ID340
Feature activation-1.29e+00

Pos logit contributions
 brace+0.007
 swe+0.007
 runs+0.007
yt+0.007
 narrative+0.006
Neg logit contributions
taker-0.011
har-0.011
 devastated-0.011
ase-0.011
he-0.011
Token had
Token ID550
Feature activation+5.43e-01

Pos logit contributions
 brace+0.031
yt+0.030
 swe+0.030
 runs+0.029
 narrative+0.029
Neg logit contributions
taker-0.025
 devastated-0.025
ase-0.024
he-0.023
chair-0.023
Token been
Token ID587
Feature activation+1.64e-01

Pos logit contributions
taker+0.012
 devastated+0.011
he+0.011
ase+0.011
chair+0.011
Neg logit contributions
 brace-0.013
 swe-0.012
yt-0.012
 runs-0.012
 vitamins-0.011
Token.
Token ID13
Feature activation+7.94e-01

Pos logit contributions
taker+0.004
 devastated+0.004
ase+0.004
he+0.004
chair+0.004
Neg logit contributions
 brace-0.003
yt-0.003
 swe-0.003
 runs-0.002
 narrative-0.002
Token But
Token ID887
Feature activation+8.30e-01

Pos logit contributions
taker+0.019
 devastated+0.019
 rubble+0.018
he+0.018
idd+0.018
Neg logit contributions
 brace-0.015
 runs-0.015
 swe-0.014
ints-0.013
Nik-0.013
Token the
Token ID262
Feature activation-3.25e-01

Pos logit contributions
ase+0.023
taker+0.022
idd+0.022
har+0.021
he+0.020
Neg logit contributions
 brace-0.015
 swe-0.014
 runs-0.013
yt-0.012
 narrative-0.012
Token pretended
Token ID36766
Feature activation+2.74e-01

Pos logit contributions
 brace+0.011
yt+0.010
 swe+0.010
 narrative+0.010
cast+0.010
Neg logit contributions
taker-0.010
 devastated-0.010
he-0.010
ase-0.009
chair-0.009
Token to
Token ID284
Feature activation+3.41e-01

Pos logit contributions
taker+0.007
har+0.007
he+0.007
 devastated+0.007
ase+0.007
Neg logit contributions
 brace-0.005
 runs-0.005
 swe-0.004
 hats-0.004
yt-0.004
Token be
Token ID307
Feature activation+1.94e-01

Pos logit contributions
 jail+0.007
 devastated+0.007
taker+0.007
chair+0.007
he+0.007
Neg logit contributions
 narrative-0.007
yt-0.007
 brace-0.007
 swe-0.007
Say-0.007
Token mighty
Token ID18680
Feature activation+3.54e-01

Pos logit contributions
 devastated+0.005
chair+0.005
taker+0.005
ase+0.005
he+0.005
Neg logit contributions
 brace-0.004
 runs-0.004
 swe-0.004
yt-0.004
 hats-0.004
Token hunters
Token ID21490
Feature activation-9.14e-01

Pos logit contributions
taker+0.008
 devastated+0.008
he+0.008
ase+0.008
har+0.008
Neg logit contributions
 brace-0.007
 swe-0.007
yt-0.007
 runs-0.007
cast-0.007
Token who
Token ID508
Feature activation-1.03e+00

Pos logit contributions
 brace+0.016
 runs+0.015
 swe+0.014
yt+0.014
cast+0.013
Neg logit contributions
har-0.025
taker-0.024
 devastated-0.024
 Shards-0.024
chair-0.023
Token chased
Token ID26172
Feature activation-1.17e+00

Pos logit contributions
 runs+0.019
 brace+0.019
yt+0.017
 swe+0.017
cast+0.017
Neg logit contributions
taker-0.026
 devastated-0.026
har-0.025
 Shards-0.025
ase-0.025
Token animals
Token ID4695
Feature activation-2.13e+00

Pos logit contributions
yt+0.023
 brace+0.023
 narrative+0.022
 swe+0.022
cast+0.021
Neg logit contributions
taker-0.027
 devastated-0.026
he-0.025
chair-0.025
 Shards-0.025
Token with
Token ID351
Feature activation-1.08e+00

Pos logit contributions
 brace+0.041
yt+0.039
 runs+0.038
 swe+0.038
 narrative+0.036
Neg logit contributions
taker-0.052
 devastated-0.052
har-0.050
ase-0.049
 Shards-0.049
Token their
Token ID511
Feature activation-9.66e-01

Pos logit contributions
 brace+0.019
yt+0.019
 swe+0.018
 runs+0.018
 narrative+0.018
Neg logit contributions
taker-0.027
 devastated-0.027
ase-0.026
he-0.026
chair-0.026
Token spears
Token ID46842
Feature activation-2.92e+00

Pos logit contributions
 brace+0.018
yt+0.017
 swe+0.017
 narrative+0.016
 runs+0.016
Neg logit contributions
taker-0.024
 devastated-0.023
 rubble-0.022
chair-0.022
ase-0.022
Token sad
Token ID6507
Feature activation+3.03e-01

Pos logit contributions
ase+0.005
har+0.005
ions+0.004
chair+0.004
taker+0.004
Neg logit contributions
 runs-0.004
 brace-0.004
 swe-0.004
Say-0.004
ints-0.004
Token and
Token ID290
Feature activation+2.96e-01

Pos logit contributions
taker+0.006
he+0.006
 devastated+0.006
chair+0.006
ore+0.006
Neg logit contributions
 brace-0.007
yt-0.007
 swe-0.006
 narrative-0.006
Nik-0.006
Token angry
Token ID7954
Feature activation-5.80e-01

Pos logit contributions
taker+0.005
 devastated+0.005
 rubble+0.004
chair+0.004
 arrested+0.004
Neg logit contributions
yt-0.009
 brace-0.008
asha-0.008
Nik-0.008
 swe-0.007
Token.
Token ID13
Feature activation+1.08e-01

Pos logit contributions
 brace+0.012
yt+0.012
 swe+0.011
 narrative+0.011
Nik+0.011
Neg logit contributions
taker-0.014
 devastated-0.013
chair-0.013
he-0.013
ase-0.012
Token He
Token ID679
Feature activation-2.98e-02

Pos logit contributions
taker+0.002
 devastated+0.002
ase+0.002
he+0.002
har+0.002
Neg logit contributions
 brace-0.002
yt-0.002
 swe-0.002
 runs-0.002
 narrative-0.002
Token wanted
Token ID2227
Feature activation-2.98e-01

Pos logit contributions
yt+0.001
 brace+0.001
 swe+0.001
 narrative+0.001
cast+0.001
Neg logit contributions
taker-0.001
 devastated-0.001
chair-0.001
 Shards-0.001
 wrecked-0.001
Token the
Token ID262
Feature activation+7.78e-01

Pos logit contributions
 brace+0.007
 swe+0.006
yt+0.006
 runs+0.006
 narrative+0.006
Neg logit contributions
taker-0.007
 devastated-0.006
he-0.006
ase-0.006
har-0.006
Token dish
Token ID9433
Feature activation+1.74e+00

Pos logit contributions
taker+0.014
 devastated+0.014
har+0.014
chair+0.013
ase+0.013
Neg logit contributions
 brace-0.020
 swe-0.019
yt-0.019
 runs-0.018
 narrative-0.018
Token very
Token ID845
Feature activation+6.08e-01

Pos logit contributions
 devastated+0.036
har+0.034
 Shards+0.034
ner+0.033
 rubble+0.033
Neg logit contributions
 swe-0.039
 brace-0.039
cast-0.038
 narrative-0.037
 runs-0.037
Token much
Token ID881
Feature activation-3.04e-02

Pos logit contributions
taker+0.011
he+0.011
chair+0.010
 devastated+0.010
 Shards+0.010
Neg logit contributions
 brace-0.016
yt-0.016
 swe-0.015
 runs-0.015
Say-0.015
Token.
Token ID13
Feature activation-2.72e-01

Pos logit contributions
 brace+0.001
yt+0.001
cast+0.001
 narrative+0.001
 runs+0.001
Neg logit contributions
taker-0.001
he-0.001
chair-0.001
 devastated-0.001
 jail-0.001
Token you
Token ID345
Feature activation+3.63e-01

Pos logit contributions
ase+0.023
 devastated+0.023
 Shards+0.023
taker+0.022
idd+0.022
Neg logit contributions
 runs-0.015
 swe-0.015
yt-0.014
 brace-0.014
cast-0.013
Token act
Token ID719
Feature activation+3.12e-01

Pos logit contributions
ase+0.009
taker+0.009
 devastated+0.009
 Shards+0.009
har+0.009
Neg logit contributions
 runs-0.007
 brace-0.006
cast-0.006
 swe-0.006
yt-0.006
Token and
Token ID290
Feature activation-4.62e-01

Pos logit contributions
 Shards+0.007
 devastated+0.007
ase+0.007
 rubble+0.007
har+0.007
Neg logit contributions
 brace-0.006
 runs-0.006
 swe-0.006
yt-0.006
cast-0.006
Token treat
Token ID2190
Feature activation-9.62e-01

Pos logit contributions
 brace+0.009
 runs+0.009
 swe+0.008
yt+0.008
 narrative+0.008
Neg logit contributions
ase-0.011
taker-0.011
 devastated-0.011
he-0.011
 Shards-0.011
Token others
Token ID1854
Feature activation+5.45e-02

Pos logit contributions
 brace+0.018
 swe+0.018
 runs+0.017
yt+0.017
cast+0.016
Neg logit contributions
 devastated-0.024
taker-0.024
ase-0.023
 Shards-0.022
he-0.022
Token that
Token ID326
Feature activation-1.65e+00

Pos logit contributions
 Shards+0.001
 devastated+0.001
ase+0.001
 rubble+0.001
taker+0.001
Neg logit contributions
 brace-0.001
 runs-0.001
 swe-0.001
yt-0.001
cast-0.001
Token really
Token ID1107
Feature activation-3.99e-01

Pos logit contributions
 brace+0.030
 runs+0.029
 swe+0.029
yt+0.028
 narrative+0.027
Neg logit contributions
taker-0.041
 devastated-0.041
ase-0.040
har-0.040
 Shards-0.039
Token matters
Token ID6067
Feature activation-1.82e-01

Pos logit contributions
 runs+0.007
 brace+0.007
 swe+0.007
yt+0.006
cast+0.006
Neg logit contributions
ase-0.010
 devastated-0.010
 Shards-0.010
taker-0.010
har-0.010
Token!
Token ID0
Feature activation-3.54e-01

Pos logit contributions
 brace+0.003
 runs+0.003
yt+0.003
 swe+0.003
cast+0.003
Neg logit contributions
 devastated-0.005
taker-0.004
 Shards-0.004
har-0.004
ase-0.004
Token Being
Token ID11204
Feature activation+4.85e-01

Pos logit contributions
 runs+0.007
 brace+0.007
 narrative+0.007
 swe+0.007
Say+0.007
Neg logit contributions
 rubble-0.008
taker-0.008
 devastated-0.008
he-0.008
ase-0.007
Token kind
Token ID1611
Feature activation-9.03e-04

Pos logit contributions
he+0.013
ase+0.013
taker+0.012
chair+0.012
 devastated+0.012
Neg logit contributions
 brace-0.009
 swe-0.009
 runs-0.009
 vitamins-0.008
yt-0.008
Token before
Token ID878
Feature activation-1.04e+00

Pos logit contributions
taker+0.001
 devastated+0.001
har+0.001
ase+0.001
he+0.001
Neg logit contributions
 brace-0.001
 swe-0.001
 runs-0.001
yt-0.001
 hats-0.001
Token.
Token ID13
Feature activation+1.33e-02

Pos logit contributions
 brace+0.022
 narrative+0.020
yt+0.020
 swe+0.020
 runs+0.020
Neg logit contributions
 devastated-0.023
taker-0.023
chair-0.022
he-0.022
 jail-0.022
Token<|endoftext|>
Token ID50256
Feature activation+8.23e-01

Pos logit contributions
taker+0.000
 devastated+0.000
ase+0.000
har+0.000
 rubble+0.000
Neg logit contributions
 brace-0.000
 runs-0.000
yt-0.000
 swe-0.000
 narrative-0.000
TokenOnce
Token ID7454
Feature activation-5.12e-01

Pos logit contributions
taker+0.020
 devastated+0.020
har+0.019
 rubble+0.019
 Shards+0.017
Neg logit contributions
 runs-0.016
 brace-0.015
cast-0.015
 narrative-0.015
yt-0.015
Token upon
Token ID2402
Feature activation+7.92e-01

Pos logit contributions
 brace+0.012
yt+0.012
 swe+0.012
 runs+0.012
 narrative+0.012
Neg logit contributions
taker-0.010
 devastated-0.010
ase-0.009
har-0.009
he-0.009
Token a
Token ID257
Feature activation-2.58e+00

Pos logit contributions
taker+0.017
har+0.017
 devastated+0.016
ase+0.016
idd+0.016
Neg logit contributions
 brace-0.018
 runs-0.017
Say-0.017
 swe-0.017
 hats-0.016
Token time
Token ID640
Feature activation+7.21e-01

Pos logit contributions
 brace+0.068
cast+0.062
yt+0.059
 swe+0.059
 narrative+0.059
Neg logit contributions
he-0.049
chair-0.045
 devastated-0.044
 Shards-0.043
 nowhere-0.043
Token,
Token ID11
Feature activation+2.15e-01

Pos logit contributions
taker+0.017
 devastated+0.017
har+0.017
ase+0.016
 rubble+0.016
Neg logit contributions
 brace-0.014
yt-0.014
 runs-0.014
 swe-0.013
cast-0.013
Token there
Token ID612
Feature activation+8.72e-01

Pos logit contributions
taker+0.004
 devastated+0.004
har+0.004
ase+0.004
 rubble+0.004
Neg logit contributions
 brace-0.005
 runs-0.005
yt-0.005
 swe-0.005
 narrative-0.005
Token was
Token ID373
Feature activation-1.35e-01

Pos logit contributions
taker+0.024
har+0.024
 devastated+0.022
 rubble+0.022
uster+0.022
Neg logit contributions
 runs-0.016
 brace-0.015
yt-0.014
 swe-0.014
cast-0.013
Token a
Token ID257
Feature activation-8.77e-01

Pos logit contributions
 brace+0.003
 swe+0.003
Say+0.003
 runs+0.003
 hats+0.003
Neg logit contributions
taker-0.003
har-0.003
 devastated-0.003
ase-0.003
chair-0.003
Token and
Token ID290
Feature activation-4.41e-01

Pos logit contributions
 brace+0.011
yt+0.011
cast+0.010
 swe+0.010
 runs+0.010
Neg logit contributions
 devastated-0.013
taker-0.013
 Shards-0.013
har-0.013
 rubble-0.013
Token laughed
Token ID13818
Feature activation+3.77e-01

Pos logit contributions
 brace+0.009
yt+0.009
 narrative+0.009
 swe+0.009
 runs+0.008
Neg logit contributions
taker-0.010
 devastated-0.010
chair-0.009
he-0.009
har-0.009
Token as
Token ID355
Feature activation-8.54e-01

Pos logit contributions
 devastated+0.008
taker+0.008
ase+0.007
har+0.007
chair+0.007
Neg logit contributions
 brace-0.009
 swe-0.008
yt-0.008
 runs-0.008
cast-0.008
Token he
Token ID339
Feature activation-1.28e+00

Pos logit contributions
 brace+0.018
yt+0.017
 swe+0.017
 runs+0.017
 narrative+0.016
Neg logit contributions
taker-0.020
 devastated-0.019
he-0.018
ase-0.018
chair-0.018
Token wob
Token ID46867
Feature activation-1.43e-01

Pos logit contributions
 brace+0.025
yt+0.024
 runs+0.024
 swe+0.023
cast+0.023
Neg logit contributions
taker-0.032
 devastated-0.031
ase-0.029
chair-0.029
he-0.029
Tokenbled
Token ID9342
Feature activation-2.65e+00

Pos logit contributions
 brace+0.004
 runs+0.003
 narrative+0.003
 swe+0.003
yt+0.003
Neg logit contributions
taker-0.003
 devastated-0.003
he-0.003
har-0.003
ase-0.003
Token around
Token ID1088
Feature activation-4.68e-01

Pos logit contributions
 brace+0.059
yt+0.057
 swe+0.056
 runs+0.056
 narrative+0.055
Neg logit contributions
taker-0.056
 devastated-0.054
he-0.052
ase-0.052
chair-0.052
Token the
Token ID262
Feature activation-5.06e-01

Pos logit contributions
 brace+0.008
 swe+0.008
yt+0.008
 runs+0.008
cast+0.008
Neg logit contributions
taker-0.012
 devastated-0.012
ase-0.011
har-0.011
chair-0.011
Token ice
Token ID4771
Feature activation-1.90e+00

Pos logit contributions
 brace+0.011
 swe+0.010
yt+0.010
 runs+0.010
 narrative+0.010
Neg logit contributions
taker-0.012
 devastated-0.011
he-0.011
ase-0.011
chair-0.011
Token.
Token ID13
Feature activation+5.41e-01

Pos logit contributions
 brace+0.037
 narrative+0.035
yt+0.035
 swe+0.035
 runs+0.035
Neg logit contributions
taker-0.046
he-0.044
ase-0.044
chair-0.044
 devastated-0.043
Token
Token ID220
Feature activation+5.27e-01

Pos logit contributions
 devastated+0.013
taker+0.012
har+0.012
 rubble+0.012
ase+0.012
Neg logit contributions
 runs-0.010
cast-0.010
 swe-0.010
 brace-0.009
 nineteen-0.009
Token zoo
Token ID26626
Feature activation-1.02e-01

Pos logit contributions
taker+0.004
 devastated+0.004
chair+0.004
he+0.004
ase+0.004
Neg logit contributions
 brace-0.003
 swe-0.003
yt-0.003
 runs-0.003
 narrative-0.003
Token.
Token ID13
Feature activation-5.69e-02

Pos logit contributions
 brace+0.001
 swe+0.001
yt+0.001
 runs+0.001
cast+0.001
Neg logit contributions
taker-0.003
 devastated-0.003
 Shards-0.003
har-0.003
 aback-0.003
Token<|endoftext|>
Token ID50256
Feature activation+1.07e+00

Pos logit contributions
 runs+0.001
 brace+0.001
cast+0.001
 swe+0.001
 vitamins+0.001
Neg logit contributions
 devastated-0.001
har-0.001
taker-0.001
 rubble-0.001
 aback-0.001
TokenOnce
Token ID7454
Feature activation-3.50e-01

Pos logit contributions
taker+0.028
 devastated+0.027
har+0.026
 rubble+0.026
 arrested+0.024
Neg logit contributions
 runs-0.018
 brace-0.018
cast-0.018
 narrative-0.017
Say-0.017
Token upon
Token ID2402
Feature activation+7.97e-01

Pos logit contributions
 brace+0.008
yt+0.008
 swe+0.008
 runs+0.008
 narrative+0.008
Neg logit contributions
taker-0.007
 devastated-0.007
ase-0.006
har-0.006
chair-0.006
Token a
Token ID257
Feature activation-2.65e+00

Pos logit contributions
taker+0.017
 devastated+0.016
har+0.016
ase+0.016
idd+0.015
Neg logit contributions
 brace-0.019
 runs-0.018
Say-0.017
 swe-0.017
yt-0.017
Token time
Token ID640
Feature activation+7.36e-01

Pos logit contributions
 brace+0.069
cast+0.063
 swe+0.062
yt+0.061
 vitamins+0.060
Neg logit contributions
he-0.051
ase-0.045
chair-0.045
 devastated-0.045
 Shards-0.044
Token,
Token ID11
Feature activation-4.97e-02

Pos logit contributions
taker+0.017
 devastated+0.017
har+0.016
ase+0.016
chair+0.016
Neg logit contributions
 brace-0.015
yt-0.015
 runs-0.014
 swe-0.014
 narrative-0.014
Token there
Token ID612
Feature activation+3.10e-01

Pos logit contributions
 brace+0.001
yt+0.001
 swe+0.001
 runs+0.001
 narrative+0.001
Neg logit contributions
taker-0.001
 devastated-0.001
he-0.001
ase-0.001
chair-0.001
Token was
Token ID373
Feature activation-4.37e-01

Pos logit contributions
taker+0.008
har+0.008
 devastated+0.008
 rubble+0.007
uster+0.007
Neg logit contributions
 runs-0.006
 brace-0.005
yt-0.005
cast-0.005
 swe-0.005
Token a
Token ID257
Feature activation-6.67e-01

Pos logit contributions
 brace+0.010
 runs+0.009
 swe+0.009
Say+0.009
 narrative+0.009
Neg logit contributions
taker-0.009
har-0.009
 devastated-0.009
ase-0.009
chair-0.009
Token happy
Token ID3772
Feature activation+2.59e-01

Pos logit contributions
 runs+0.004
 brace+0.004
yt+0.004
 swe+0.004
cast+0.004
Neg logit contributions
ase-0.007
ions-0.007
chair-0.007
mith-0.007
 rubble-0.007
Token.
Token ID13
Feature activation+5.35e-01

Pos logit contributions
taker+0.006
 devastated+0.006
har+0.006
ase+0.006
 Shards+0.006
Neg logit contributions
 brace-0.005
yt-0.005
 runs-0.005
cast-0.005
 swe-0.005
Token<|endoftext|>
Token ID50256
Feature activation+9.72e-01

Pos logit contributions
taker+0.013
 devastated+0.013
har+0.013
 arrested+0.013
 rubble+0.013
Neg logit contributions
 runs-0.009
 brace-0.009
yt-0.009
 swe-0.008
cast-0.008
TokenOnce
Token ID7454
Feature activation-5.15e-01

Pos logit contributions
taker+0.023
 devastated+0.023
har+0.022
 rubble+0.022
chair+0.020
Neg logit contributions
 runs-0.019
 brace-0.019
yt-0.018
cast-0.018
 narrative-0.017
Token upon
Token ID2402
Feature activation+1.03e+00

Pos logit contributions
 brace+0.013
yt+0.012
 swe+0.012
 runs+0.012
 narrative+0.012
Neg logit contributions
taker-0.010
 devastated-0.009
he-0.009
ase-0.009
chair-0.009
Token a
Token ID257
Feature activation-2.60e+00

Pos logit contributions
taker+0.023
har+0.022
 devastated+0.022
ase+0.021
idd+0.021
Neg logit contributions
 brace-0.023
 runs-0.022
Say-0.022
 swe-0.021
yt-0.021
Token time
Token ID640
Feature activation+7.48e-01

Pos logit contributions
 brace+0.068
cast+0.062
 swe+0.060
 narrative+0.060
 vitamins+0.059
Neg logit contributions
he-0.051
chair-0.046
 Shards-0.045
ase-0.044
 devastated-0.044
Token,
Token ID11
Feature activation+1.75e-01

Pos logit contributions
taker+0.019
 devastated+0.018
har+0.018
 rubble+0.018
ase+0.017
Neg logit contributions
yt-0.014
 brace-0.014
 runs-0.014
 swe-0.013
 narrative-0.013
Token there
Token ID612
Feature activation+6.27e-01

Pos logit contributions
taker+0.003
 devastated+0.003
har+0.003
ase+0.003
chair+0.003
Neg logit contributions
 brace-0.004
yt-0.004
 runs-0.004
 swe-0.004
 narrative-0.004
Token was
Token ID373
Feature activation-4.54e-01

Pos logit contributions
taker+0.018
har+0.017
 devastated+0.016
 rubble+0.016
uster+0.016
Neg logit contributions
 runs-0.011
 brace-0.010
yt-0.010
 swe-0.010
cast-0.009
Token a
Token ID257
Feature activation-9.89e-01

Pos logit contributions
 brace+0.011
 swe+0.010
 runs+0.010
Say+0.010
yt+0.010
Neg logit contributions
taker-0.010
har-0.009
 devastated-0.009
ase-0.009
chair-0.008
Token.
Token ID13
Feature activation-1.13e+00

Pos logit contributions
 brace+0.013
yt+0.012
 swe+0.012
 narrative+0.012
 runs+0.012
Neg logit contributions
taker-0.017
he-0.017
 devastated-0.017
chair-0.017
ase-0.016
Token She
Token ID1375
Feature activation-1.30e+00

Pos logit contributions
 brace+0.024
yt+0.023
 swe+0.023
 runs+0.022
 narrative+0.021
Neg logit contributions
taker-0.026
 devastated-0.025
har-0.024
ase-0.024
he-0.024
Token opens
Token ID9808
Feature activation-5.98e-01

Pos logit contributions
yt+0.016
 brace+0.016
 narrative+0.016
 swe+0.015
cast+0.014
Neg logit contributions
taker-0.038
 devastated-0.037
chair-0.037
ase-0.036
he-0.036
Token it
Token ID340
Feature activation-1.21e-01

Pos logit contributions
 brace+0.015
 swe+0.014
yt+0.014
 runs+0.014
cast+0.014
Neg logit contributions
taker-0.012
 devastated-0.011
ase-0.011
har-0.011
chair-0.011
Token and
Token ID290
Feature activation-6.12e-01

Pos logit contributions
 narrative+0.003
yt+0.003
 brace+0.003
 swe+0.003
 vitamins+0.003
Neg logit contributions
he-0.002
taker-0.002
chair-0.002
ase-0.002
 devastated-0.002
Token lets
Token ID8781
Feature activation-2.65e+00

Pos logit contributions
yt+0.009
 narrative+0.009
 brace+0.009
 swe+0.008
 vitamins+0.008
Neg logit contributions
taker-0.017
chair-0.016
 devastated-0.016
he-0.016
 Shards-0.016
Token the
Token ID262
Feature activation-1.49e-01

Pos logit contributions
 brace+0.056
yt+0.054
 swe+0.053
 narrative+0.052
 runs+0.052
Neg logit contributions
taker-0.060
 devastated-0.058
he-0.056
ase-0.056
chair-0.055
Token bird
Token ID6512
Feature activation-1.33e+00

Pos logit contributions
 brace+0.004
 runs+0.003
yt+0.003
 swe+0.003
 narrative+0.003
Neg logit contributions
taker-0.003
 devastated-0.003
he-0.003
ions-0.003
ase-0.003
Token out
Token ID503
Feature activation-6.80e-01

Pos logit contributions
 brace+0.023
 swe+0.021
yt+0.021
 runs+0.021
cast+0.019
Neg logit contributions
taker-0.036
 devastated-0.035
ase-0.034
chair-0.034
he-0.034
Token.
Token ID13
Feature activation-1.27e+00

Pos logit contributions
 brace+0.015
yt+0.015
 swe+0.014
 runs+0.014
 narrative+0.014
Neg logit contributions
taker-0.015
 devastated-0.014
ase-0.014
he-0.014
har-0.013
Token
Token ID198
Feature activation-7.32e-01

Pos logit contributions
 brace+0.030
 swe+0.028
yt+0.028
 runs+0.028
 narrative+0.027
Neg logit contributions
taker-0.026
 devastated-0.026
har-0.024
ase-0.024
he-0.024