TOP ACTIVATIONS
MAX = 0.296

Ab bie went to the jun ky ard . She saw
't get hurt in the jun ky ard . She liked
Everyday he would take the av oc ados and scatter them
but there were no other av oc ados .
. Mom says they are av oc ados . Sara wants
still couldn 't find any av oc ados . Finally ,
and her mom sorted the av oc ados into piles according
something new ! She thought av oc ados were the most
and shared everything , including av oc ados . They were
away . She learned that av oc ados are good for
. Lily loved to eat av oc ados , they were
It turned out that the av oc ados were bad and
weather before he picked his av oc ados . <|endoftext|> Once
Maybe they could grow more av oc ados and share them
, and all the other av oc ados were very happy
day on , Lily loved av oc ados and would ask
. Did you know that av oc ados are very healthy
before . The av oc ados were shared with
happy that it produced more av oc ados than ever before
, there was a loud av ac ado . It was

MIN ACTIVATIONS
MIN = -0.212

and said , " Whe e !" Anna smiled back and
and had a long man e . Tom wanted to ride
, he had to sne e ze . " A ch
Fl uffy started to sne e ze and cough . Suddenly
was her royal s ce pter . As they
mouse was happy to oblig e and quickly ran off to
a big , scary sne e ze . Every time he
my said . " By e Tim my , you 're
Lily said , " By e , Max ! Come back
her walk , R yle e stumbled across a mysterious cave
Whenever R yle e went for a walk in
crown and a s ce pter . They make up stories
Tom said , " By e , Sally . See you
“ By e ! ” Then she
. " By e , Ben and Lily .
the slide . Whe ee e ! After playing
when Sam started to sne e ze . L
“ By e now ! ” <|endoftext|>
. " By e , staff . Thank you
She said , " By e , Tom . See you

INTERVAL 0.169 - 0.296
CONTAINS 0.125%

I can cut it just right ." Tommy was nervous but
his shoes and realized his mistake . He said , "
mouth to point out the problem and come up with a
was worth a lot of money . She explained to him
on . The moral of the story is that

INTERVAL 0.042 - 0.169
CONTAINS 12.511%

adventure for one day . <|endoftext|> L ila and Tom are
my 's mom called him back to the kitchen . She
Everyone laughed . <|endoftext|> Once upon a time there was a
in different colors . " O oh , pretty !" she
ie and said , " Oh no ! What happened ?"

INTERVAL -0.085 - 0.042
CONTAINS 84.251%

adventure that day . He discovered the bitter smell of fuel
embarrassed ! The girl didn ’ t
spun it around . He laughed and laughed as it tw
better . <|endoftext|> There was a little boy . He liked
help the cat , but he didn 't know how .

INTERVAL -0.212 - -0.085
CONTAINS 3.113%

tall dad who wore big boots . One day , Tim
blocks , but their favorite ones were the red ones .
asked her mom , " Can we go on the roller
of water . H oot ie picked it up and sprayed
. Sam and Mia nodded . They said they
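
The four INTERVAL ranges listed above are consistent with splitting the span from MIN = -0.212 to MAX = 0.296 into four equal-width bins (width 0.127), and the CONTAINS figures sum to 100%. The sketch below reproduces those bin edges and shows one way such shares could be computed. The function names, the choice to treat the last bin as closed on the right, and the sample values are assumptions made for illustration; the dashboard's actual binning code is not shown in this export.

```python
def interval_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced edges from lo to hi, rounded to 3 decimals."""
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]


def interval_shares(values, edges):
    """Percentage of values falling in each [edge_i, edge_{i+1}) bin.

    Assumption: the final bin also includes its right edge, so the
    maximum activation is counted.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    total = len(values)
    return [100.0 * c / total for c in counts]


# MIN and MAX are taken from the listing above; four bins is inferred from it.
edges = interval_edges(-0.212, 0.296, 4)

# Hypothetical activations, not taken from the dashboard.
sample = [-0.1, 0.0, 0.01, 0.05, 0.296]
shares = interval_shares(sample, edges)
```

With the real per-token activations in place of `sample`, `shares` would correspond to the CONTAINS percentages, listed from the lowest interval to the highest (the reverse of the order used above).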
Token Ab
Token ID2275
Feature activation+9.26e-02

Pos logit contributions
 narrative+0.011
 ang+0.010
 mad+0.010
 trouble+0.010
Less+0.010
Neg logit contributions
cious-0.015
ze-0.014
oster-0.013
mare-0.013
bes-0.012
Tokenbie
Token ID12590
Feature activation-5.41e-02

Pos logit contributions
 narrative+0.134
 mistake+0.124
 obstacle+0.122
 ang+0.120
 Moral+0.117
Neg logit contributions
cious-0.156
ze-0.144
 songs-0.134
mare-0.134
suit-0.131
Token went
Token ID1816
Feature activation+6.24e-02

Pos logit contributions
cious+0.062
ze+0.050
mare+0.047
we+0.044
coat+0.044
Neg logit contributions
 narrative-0.112
 obstacle-0.107
 ang-0.106
 mistake-0.102
 trouble-0.102
Token to
Token ID284
Feature activation-1.18e-02

Pos logit contributions
 narrative+0.126
 obstacle+0.116
 ang+0.113
 mistake+0.112
 Moral+0.112
Neg logit contributions
cious-0.076
ze-0.063
mare-0.062
coat-0.060
suit-0.059
Token the
Token ID262
Feature activation+1.91e-02

Pos logit contributions
cious+0.016
mare+0.015
ze+0.014
 pet+0.014
 songs+0.014
Neg logit contributions
 narrative-0.022
 obstacle-0.019
 ang-0.018
Less-0.018
 Moral-0.018
Token jun
Token ID10891
Feature activation+2.88e-01

Pos logit contributions
 narrative+0.028
 obstacle+0.026
 ang+0.025
 mistake+0.025
 trouble+0.024
Neg logit contributions
cious-0.033
ze-0.029
mare-0.028
coat-0.027
Ch-0.027
Tokenky
Token ID2584
Feature activation-6.15e-02

Pos logit contributions
 narrative+0.548
 obstacle+0.517
 ang+0.507
 mistake+0.500
 trouble+0.495
Neg logit contributions
cious-0.372
ze-0.313
mare-0.300
we-0.283
coat-0.281
Tokenard
Token ID446
Feature activation-4.10e-02

Pos logit contributions
cious+0.078
ze+0.065
suit+0.064
coat+0.063
we+0.060
Neg logit contributions
 narrative-0.124
 ang-0.110
Less-0.110
 Moral-0.107
 mad-0.107
Token.
Token ID13
Feature activation+8.67e-03

Pos logit contributions
cious+0.061
ze+0.052
mare+0.051
coat+0.048
Ch+0.048
Neg logit contributions
 narrative-0.070
 obstacle-0.065
 ang-0.065
 mistake-0.064
 trouble-0.063
Token She
Token ID1375
Feature activation+1.71e-02

Pos logit contributions
 narrative+0.013
 obstacle+0.011
 ang+0.011
 mistake+0.011
 trouble+0.011
Neg logit contributions
cious-0.015
ze-0.013
mare-0.013
coat-0.013
suit-0.013
Token saw
Token ID2497
Feature activation-5.96e-02

Pos logit contributions
 narrative+0.035
 obstacle+0.033
 ang+0.033
 mistake+0.033
 trouble+0.033
Neg logit contributions
cious-0.019
ze-0.016
mare-0.015
we-0.014
coat-0.014
Token't
Token ID470
Feature activation+1.30e-02

Pos logit contributions
 narrative+0.070
 ang+0.067
 obstacle+0.065
 mistake+0.063
 Moral+0.061
Neg logit contributions
ze-0.037
cious-0.037
ves-0.030
ggy-0.029
age-0.029
Token get
Token ID651
Feature activation+4.59e-02

Pos logit contributions
 narrative+0.022
 ang+0.020
 obstacle+0.020
 Moral+0.020
Less+0.019
Neg logit contributions
cious-0.019
ze-0.017
mare-0.016
coat-0.016
 songs-0.015
Token hurt
Token ID5938
Feature activation+2.49e-02

Pos logit contributions
 narrative+0.077
 obstacle+0.070
 Moral+0.067
 mistake+0.067
 ang+0.067
Neg logit contributions
cious-0.071
ze-0.062
mare-0.059
coat-0.058
we-0.057
Token in
Token ID287
Feature activation+1.90e-02

Pos logit contributions
 narrative+0.043
 obstacle+0.039
 ang+0.039
 Moral+0.039
Less+0.038
Neg logit contributions
cious-0.037
ze-0.033
we-0.031
coat-0.030
 songs-0.029
Token the
Token ID262
Feature activation+1.61e-02

Pos logit contributions
 narrative+0.027
 obstacle+0.025
 ang+0.025
 Moral+0.024
 mistake+0.023
Neg logit contributions
cious-0.033
ze-0.029
mare-0.028
coat-0.027
we-0.027
Token jun
Token ID10891
Feature activation+2.87e-01

Pos logit contributions
 narrative+0.026
 ang+0.024
 obstacle+0.024
 mistake+0.023
 Moral+0.023
Neg logit contributions
cious-0.025
ze-0.022
mare-0.021
suit-0.020
coat-0.020
Tokenky
Token ID2584
Feature activation-6.88e-02

Pos logit contributions
 narrative+0.627
Less+0.574
 Moral+0.572
 ang+0.571
 obstacle+0.570
Neg logit contributions
cious-0.309
suit-0.242
coat-0.240
ze-0.235
mare-0.219
Tokenard
Token ID446
Feature activation-4.75e-02

Pos logit contributions
cious+0.093
ze+0.078
mare+0.073
coat+0.073
suit+0.072
Neg logit contributions
 narrative-0.129
 obstacle-0.118
 ang-0.118
 mistake-0.115
 Moral-0.115
Token.
Token ID13
Feature activation+7.29e-04

Pos logit contributions
cious+0.069
ze+0.059
mare+0.056
coat+0.055
suit+0.054
Neg logit contributions
 narrative-0.083
 obstacle-0.077
 ang-0.076
 Moral-0.074
 mistake-0.074
Token She
Token ID1375
Feature activation+8.79e-03

Pos logit contributions
 narrative+0.001
 obstacle+0.001
 ang+0.001
 Moral+0.001
 mistake+0.001
Neg logit contributions
cious-0.001
ze-0.001
mare-0.001
coat-0.001
suit-0.001
Token liked
Token ID8288
Feature activation-1.01e-01

Pos logit contributions
 narrative+0.017
 obstacle+0.016
 ang+0.016
 mistake+0.016
 trouble+0.015
Neg logit contributions
cious-0.011
ze-0.009
mare-0.009
coat-0.008
suit-0.008
Token Everyday
Token ID48154
Feature activation+6.33e-02

Pos logit contributions
cious+0.027
ze+0.023
mare+0.022
we+0.022
pter+0.022
Neg logit contributions
 narrative-0.026
 obstacle-0.023
 ang-0.023
 mad-0.023
 mistake-0.022
Token he
Token ID339
Feature activation+5.68e-02

Pos logit contributions
 narrative+0.099
 obstacle+0.091
 ang+0.090
 mistake+0.087
 Moral+0.086
Neg logit contributions
cious-0.104
ze-0.090
mare-0.088
coat-0.084
we-0.084
Token would
Token ID561
Feature activation-4.74e-02

Pos logit contributions
 narrative+0.113
 obstacle+0.108
 ang+0.106
 mistake+0.104
 Moral+0.104
Neg logit contributions
cious-0.066
ze-0.053
mare-0.053
we-0.050
coat-0.049
Token take
Token ID1011
Feature activation+9.90e-02

Pos logit contributions
cious+0.074
ze+0.062
mare+0.061
we+0.060
suit+0.060
Neg logit contributions
 narrative-0.077
 obstacle-0.073
 ang-0.070
 mistake-0.070
 trouble-0.069
Token the
Token ID262
Feature activation+3.37e-02

Pos logit contributions
 narrative+0.124
 obstacle+0.116
 trouble+0.115
 ang+0.112
 mad+0.107
Neg logit contributions
cious-0.189
mare-0.167
ze-0.164
coat-0.162
suit-0.160
Token av
Token ID1196
Feature activation+2.66e-01

Pos logit contributions
 narrative+0.058
 ang+0.052
Less+0.052
 obstacle+0.052
 Moral+0.051
Neg logit contributions
cious-0.049
ze-0.043
coat-0.041
 songs-0.040
suit-0.040
Tokenoc
Token ID420
Feature activation+3.41e-02

Pos logit contributions
 narrative+0.459
 mistake+0.423
 ang+0.418
 obstacle+0.414
 trouble+0.413
Neg logit contributions
cious-0.401
ze-0.336
mare-0.324
suit-0.317
pter-0.315
Tokenados
Token ID22484
Feature activation+7.01e-02

Pos logit contributions
 narrative+0.051
 trouble+0.045
 ang+0.045
 obstacle+0.045
 mad+0.045
Neg logit contributions
cious-0.056
we-0.051
ze-0.050
pter-0.049
ages-0.048
Token and
Token ID290
Feature activation+1.96e-02

Pos logit contributions
 narrative+0.129
 obstacle+0.121
 ang+0.119
 mistake+0.117
 trouble+0.116
Neg logit contributions
cious-0.095
ze-0.079
mare-0.077
suit-0.074
coat-0.074
Token scatter
Token ID41058
Feature activation-3.62e-03

Pos logit contributions
 narrative+0.033
 obstacle+0.030
 ang+0.030
 mistake+0.029
 trouble+0.029
Neg logit contributions
cious-0.030
ze-0.026
mare-0.025
we-0.024
suit-0.024
Token them
Token ID606
Feature activation-1.09e-01

Pos logit contributions
cious+0.005
ze+0.004
suit+0.004
 songs+0.004
mare+0.004
Neg logit contributions
 narrative-0.006
 mistake-0.006
 obstacle-0.006
 ang-0.006
 Moral-0.006
Token but
Token ID475
Feature activation-3.30e-02

Pos logit contributions
 narrative+0.023
 obstacle+0.021
 ang+0.021
 mistake+0.021
 Moral+0.020
Neg logit contributions
cious-0.018
ze-0.016
mare-0.015
coat-0.014
we-0.014
Token there
Token ID612
Feature activation-5.08e-02

Pos logit contributions
cious+0.046
ze+0.039
mare+0.036
 songs+0.036
suit+0.035
Neg logit contributions
 narrative-0.059
 obstacle-0.055
 ang-0.053
 mistake-0.053
Less-0.052
Token were
Token ID547
Feature activation+2.21e-02

Pos logit contributions
cious+0.059
ze+0.050
we+0.045
mare+0.044
ves+0.044
Neg logit contributions
 narrative-0.101
 obstacle-0.096
 ang-0.094
Less-0.092
 Moral-0.092
Token no
Token ID645
Feature activation-2.73e-02

Pos logit contributions
 narrative+0.032
 obstacle+0.030
 ang+0.029
 trouble+0.029
 mistake+0.029
Neg logit contributions
cious-0.039
ze-0.034
mare-0.034
suit-0.032
coat-0.032
Token other
Token ID584
Feature activation+3.97e-02

Pos logit contributions
cious+0.032
 songs+0.030
ze+0.028
 pet+0.026
 song+0.026
Neg logit contributions
 narrative-0.052
Less-0.049
 ang-0.047
 Moral-0.046
 obstacle-0.045
Token av
Token ID1196
Feature activation+2.63e-01

Pos logit contributions
 narrative+0.065
 obstacle+0.060
 ang+0.060
 Moral+0.058
Less+0.058
Neg logit contributions
cious-0.061
ze-0.053
mare-0.049
we-0.049
 songs-0.049
Tokenoc
Token ID420
Feature activation+4.82e-02

Pos logit contributions
 narrative+0.436
 ang+0.396
 obstacle+0.395
 mistake+0.388
 Moral+0.386
Neg logit contributions
cious-0.423
ze-0.357
mare-0.340
suit-0.332
coat-0.332
Tokenados
Token ID22484
Feature activation+6.95e-02

Pos logit contributions
 narrative+0.068
 obstacle+0.063
 ang+0.061
 mistake+0.060
Less+0.059
Neg logit contributions
cious-0.086
ze-0.077
we-0.075
balls-0.071
pter-0.069
Token.
Token ID13
Feature activation+1.33e-02

Pos logit contributions
 narrative+0.119
 obstacle+0.111
 ang+0.109
 trouble+0.107
 mistake+0.107
Neg logit contributions
cious-0.103
ze-0.088
mare-0.086
coat-0.082
suit-0.082
Token
Token ID198
Feature activation+3.10e-03

Pos logit contributions
 narrative+0.024
 obstacle+0.023
 mistake+0.022
 ang+0.022
 mad+0.021
Neg logit contributions
cious-0.019
ze-0.015
mare-0.015
Ch-0.015
suit-0.015
Token
Token ID198
Feature activation+3.98e-03

Pos logit contributions
 narrative+0.005
 obstacle+0.005
 Moral+0.005
 ang+0.004
 trouble+0.004
Neg logit contributions
cious-0.005
ze-0.004
mare-0.004
coat-0.004
suit-0.004
Token.
Token ID13
Feature activation+1.36e-02

Pos logit contributions
 narrative+0.104
 obstacle+0.099
 ang+0.097
 mistake+0.094
 Moral+0.093
Neg logit contributions
cious-0.087
ze-0.071
coat-0.070
suit-0.069
mare-0.069
Token Mom
Token ID11254
Feature activation+2.13e-02

Pos logit contributions
 narrative+0.020
 obstacle+0.018
 ang+0.018
 mistake+0.017
 trouble+0.017
Neg logit contributions
cious-0.024
ze-0.021
mare-0.020
coat-0.019
suit-0.019
Token says
Token ID1139
Feature activation+4.87e-02

Pos logit contributions
 narrative+0.033
 obstacle+0.031
 tragedy+0.031
Less+0.030
 ang+0.029
Neg logit contributions
cious-0.034
 sang-0.030
 liked-0.029
 danced-0.029
coat-0.029
Token they
Token ID484
Feature activation-2.25e-02

Pos logit contributions
 narrative+0.071
 obstacle+0.064
 ang+0.063
 trouble+0.062
 mistake+0.062
Neg logit contributions
cious-0.085
ze-0.075
mare-0.073
coat-0.071
suit-0.071
Token are
Token ID389
Feature activation+3.60e-02

Pos logit contributions
cious+0.023
 liked+0.019
ze+0.018
 danced+0.018
 pretended+0.017
Neg logit contributions
 ang-0.047
 narrative-0.046
 mistake-0.045
Less-0.045
 tragedy-0.044
Token av
Token ID1196
Feature activation+2.63e-01

Pos logit contributions
 narrative+0.053
 obstacle+0.049
 ang+0.047
 mistake+0.046
 Moral+0.046
Neg logit contributions
cious-0.063
ze-0.055
mare-0.054
coat-0.052
suit-0.052
Tokenoc
Token ID420
Feature activation+4.48e-02

Pos logit contributions
 narrative+0.388
 obstacle+0.357
 ang+0.349
 mistake+0.340
 trouble+0.340
Neg logit contributions
cious-0.451
ze-0.398
mare-0.387
coat-0.373
we-0.372
Tokenados
Token ID22484
Feature activation+6.89e-02

Pos logit contributions
 narrative+0.052
 obstacle+0.046
 ang+0.045
 mistake+0.044
 trouble+0.043
Neg logit contributions
cious-0.093
ze-0.083
mare-0.080
we-0.080
coat-0.079
Token.
Token ID13
Feature activation+1.04e-02

Pos logit contributions
 narrative+0.106
 obstacle+0.098
 trouble+0.095
 ang+0.095
 mistake+0.095
Neg logit contributions
cious-0.111
ze-0.100
mare-0.096
Ch-0.092
we-0.091
Token Sara
Token ID24799
Feature activation-1.56e-02

Pos logit contributions
 narrative+0.016
 obstacle+0.014
 ang+0.014
 trouble+0.014
 Moral+0.014
Neg logit contributions
cious-0.017
ze-0.015
mare-0.015
we-0.014
coat-0.014
Token wants
Token ID3382
Feature activation+2.89e-03

Pos logit contributions
cious+0.018
 liked+0.015
ze+0.015
 pretended+0.014
 danced+0.014
Neg logit contributions
 narrative-0.032
 tragedy-0.030
 ang-0.030
 obstacle-0.030
Less-0.030
Token still
Token ID991
Feature activation+2.33e-02

Pos logit contributions
cious+0.051
ze+0.043
mare+0.041
coat+0.039
we+0.039
Neg logit contributions
 narrative-0.071
 obstacle-0.067
 ang-0.064
 mistake-0.064
 Moral-0.063
Token couldn
Token ID3521
Feature activation+3.95e-02

Pos logit contributions
 narrative+0.038
 obstacle+0.035
 ang+0.035
 mistake+0.035
 trouble+0.034
Neg logit contributions
cious-0.036
ze-0.031
mare-0.030
coat-0.029
suit-0.029
Token't
Token ID470
Feature activation+1.88e-02

Pos logit contributions
 narrative+0.086
 obstacle+0.081
 ang+0.080
 mistake+0.079
 mad+0.074
Neg logit contributions
cious-0.040
ze-0.037
ves-0.031
we-0.030
ggy-0.030
Token find
Token ID1064
Feature activation+3.01e-02

Pos logit contributions
 narrative+0.030
 obstacle+0.028
 ang+0.027
 mistake+0.027
 trouble+0.027
Neg logit contributions
cious-0.030
ze-0.026
mare-0.025
coat-0.025
suit-0.024
Token any
Token ID597
Feature activation+1.32e-02

Pos logit contributions
 trouble+0.033
 obstacle+0.032
 narrative+0.032
 ang+0.031
 mistake+0.030
Neg logit contributions
cious-0.061
ze-0.054
mare-0.054
suit-0.053
coat-0.052
Token av
Token ID1196
Feature activation+2.63e-01

Pos logit contributions
 narrative+0.021
 ang+0.019
 Moral+0.019
Less+0.019
 obstacle+0.018
Neg logit contributions
cious-0.021
ze-0.018
coat-0.017
mare-0.017
we-0.017
Tokenoc
Token ID420
Feature activation+3.36e-02

Pos logit contributions
 narrative+0.459
 ang+0.417
 mistake+0.415
 obstacle+0.412
 Moral+0.406
Neg logit contributions
cious-0.397
ze-0.327
pter-0.309
mare-0.308
suit-0.308
Tokenados
Token ID22484
Feature activation+6.63e-02

Pos logit contributions
 narrative+0.051
 obstacle+0.045
 ang+0.045
 mistake+0.044
Less+0.044
Neg logit contributions
cious-0.056
we-0.049
ze-0.049
pter-0.047
ages-0.047
Token.
Token ID13
Feature activation+1.71e-03

Pos logit contributions
 narrative+0.110
 obstacle+0.102
 ang+0.101
 mistake+0.099
 trouble+0.099
Neg logit contributions
cious-0.101
ze-0.087
mare-0.085
coat-0.081
suit-0.080
Token Finally
Token ID9461
Feature activation+6.41e-02

Pos logit contributions
 narrative+0.003
 obstacle+0.003
 ang+0.003
 mistake+0.003
 trouble+0.003
Neg logit contributions
cious-0.003
ze-0.002
mare-0.002
coat-0.002
suit-0.002
Token,
Token ID11
Feature activation+4.42e-03

Pos logit contributions
 narrative+0.097
 obstacle+0.090
 ang+0.088
 mistake+0.086
 Moral+0.085
Neg logit contributions
cious-0.109
ze-0.095
mare-0.092
coat-0.089
suit-0.089
Token and
Token ID290
Feature activation+2.78e-02

Pos logit contributions
cious+0.045
mare+0.039
ze+0.038
suit+0.037
we+0.036
Neg logit contributions
 narrative-0.039
 obstacle-0.037
 mistake-0.037
 ang-0.037
 Moral-0.036
Token her
Token ID607
Feature activation-5.49e-02

Pos logit contributions
 narrative+0.057
 obstacle+0.056
 ang+0.052
 tragedy+0.050
 trouble+0.050
Neg logit contributions
cious-0.030
 songs-0.027
ze-0.026
mare-0.026
 sang-0.024
Token mom
Token ID1995
Feature activation+1.85e-02

Pos logit contributions
cious+0.070
ze+0.058
mare+0.055
coat+0.054
 songs+0.054
Neg logit contributions
 narrative-0.105
 obstacle-0.101
 ang-0.098
 mistake-0.093
Less-0.092
Token sorted
Token ID23243
Feature activation+4.61e-02

Pos logit contributions
 narrative+0.034
 obstacle+0.032
 ang+0.032
 mistake+0.031
 trouble+0.031
Neg logit contributions
cious-0.025
ze-0.021
mare-0.020
coat-0.019
suit-0.019
Token the
Token ID262
Feature activation+2.37e-02

Pos logit contributions
 narrative+0.086
 obstacle+0.077
 cheating+0.074
 mistake+0.074
 Moral+0.072
Neg logit contributions
cious-0.056
suit-0.050
ot-0.049
uds-0.049
 stuffed-0.048
Token av
Token ID1196
Feature activation+2.62e-01

Pos logit contributions
 narrative+0.034
 ang+0.031
 obstacle+0.031
ulk+0.029
Less+0.029
Neg logit contributions
cious-0.040
 liked-0.033
 favorite-0.032
let-0.032
ze-0.032
Tokenoc
Token ID420
Feature activation+4.73e-02

Pos logit contributions
 narrative+0.451
 ang+0.413
 obstacle+0.407
 mistake+0.401
 trouble+0.397
Neg logit contributions
cious-0.405
ze-0.332
mare-0.327
suit-0.317
we-0.312
Tokenados
Token ID22484
Feature activation+6.79e-02

Pos logit contributions
 narrative+0.075
 obstacle+0.073
 ang+0.070
Less+0.069
 tragedy+0.067
Neg logit contributions
cious-0.072
we-0.063
ze-0.062
als-0.059
balls-0.058
Token into
Token ID656
Feature activation+4.68e-02

Pos logit contributions
 narrative+0.123
 obstacle+0.113
 ang+0.111
 mistake+0.108
 Moral+0.108
Neg logit contributions
cious-0.097
ze-0.080
we-0.079
suit-0.078
coat-0.077
Token piles
Token ID30935
Feature activation+2.52e-02

Pos logit contributions
 narrative+0.085
 obstacle+0.076
 ang+0.073
 Moral+0.071
 cheating+0.071
Neg logit contributions
cious-0.066
ze-0.053
 songs-0.052
 stuffed-0.051
mare-0.051
Token according
Token ID1864
Feature activation+4.82e-02

Pos logit contributions
 narrative+0.049
 obstacle+0.046
 ang+0.045
 Moral+0.044
 mistake+0.044
Neg logit contributions
cious-0.031
ze-0.026
we-0.025
coat-0.025
suit-0.024
Token something
Token ID1223
Feature activation-1.82e-02

Pos logit contributions
 narrative+0.052
 obstacle+0.049
 ang+0.047
 Moral+0.045
 mistake+0.045
Neg logit contributions
cious-0.039
ze-0.035
 songs-0.032
mare-0.032
we-0.030
Token new
Token ID649
Feature activation+3.01e-02

Pos logit contributions
cious+0.028
ze+0.024
mare+0.023
we+0.023
suit+0.022
Neg logit contributions
 narrative-0.030
 obstacle-0.028
 ang-0.027
 Moral-0.027
 mistake-0.026
Token!
Token ID0
Feature activation+7.47e-03

Pos logit contributions
 narrative+0.044
 mistake+0.041
 trouble+0.040
 obstacle+0.040
 Moral+0.040
Neg logit contributions
cious-0.052
ze-0.046
mare-0.045
Ch-0.045
coat-0.044
Token She
Token ID1375
Feature activation+1.72e-02

Pos logit contributions
 narrative+0.011
 obstacle+0.011
 ang+0.010
 mistake+0.010
 trouble+0.010
Neg logit contributions
cious-0.013
ze-0.011
mare-0.011
suit-0.010
coat-0.010
Token thought
Token ID1807
Feature activation+3.35e-03

Pos logit contributions
 narrative+0.032
 obstacle+0.030
 ang+0.029
 mistake+0.029
 Moral+0.029
Neg logit contributions
cious-0.023
ze-0.020
mare-0.019
coat-0.018
suit-0.018
Token av
Token ID1196
Feature activation+2.62e-01

Pos logit contributions
 narrative+0.005
 ang+0.005
 obstacle+0.005
 Moral+0.005
 mistake+0.005
Neg logit contributions
cious-0.005
ze-0.005
mare-0.004
we-0.004
coat-0.004
Tokenoc
Token ID420
Feature activation+4.59e-02

Pos logit contributions
 narrative+0.364
 obstacle+0.336
 ang+0.324
 Moral+0.315
 trouble+0.315
Neg logit contributions
cious-0.469
ze-0.418
mare-0.410
coat-0.395
we-0.393
Tokenados
Token ID22484
Feature activation+6.83e-02

Pos logit contributions
 narrative+0.059
 obstacle+0.054
 ang+0.052
 trouble+0.050
 mistake+0.049
Neg logit contributions
cious-0.089
ze-0.081
we-0.078
mare-0.074
abies-0.073
Token were
Token ID547
Feature activation+2.01e-02

Pos logit contributions
 narrative+0.105
 obstacle+0.097
 ang+0.095
 trouble+0.092
 mistake+0.092
Neg logit contributions
cious-0.113
ze-0.098
mare-0.097
suit-0.091
coat-0.091
Token the
Token ID262
Feature activation+2.55e-02

Pos logit contributions
 narrative+0.028
 obstacle+0.026
 ang+0.026
 trouble+0.025
 mistake+0.025
Neg logit contributions
cious-0.036
ze-0.032
mare-0.032
coat-0.031
suit-0.030
Token most
Token ID749
Feature activation-3.10e-02

Pos logit contributions
 narrative+0.042
 obstacle+0.040
 ang+0.039
 mistake+0.038
 Moral+0.037
Neg logit contributions
cious-0.039
ze-0.034
mare-0.032
coat-0.032
suit-0.031
Token and
Token ID290
Feature activation+2.26e-02

Pos logit contributions
 narrative+0.060
 obstacle+0.055
 ang+0.054
 trouble+0.053
 mistake+0.052
Neg logit contributions
cious-0.058
ze-0.051
mare-0.048
coat-0.047
suit-0.046
Token shared
Token ID4888
Feature activation-3.75e-02

Pos logit contributions
 narrative+0.044
 obstacle+0.041
 ang+0.041
 mistake+0.040
 Moral+0.040
Neg logit contributions
cious-0.028
ze-0.024
mare-0.023
suit-0.022
coat-0.022
Token everything
Token ID2279
Feature activation+7.91e-05

Pos logit contributions
cious+0.069
ze+0.061
mare+0.061
coat+0.059
suit+0.059
Neg logit contributions
 narrative-0.051
 obstacle-0.046
 ang-0.046
 mistake-0.045
 trouble-0.045
Token,
Token ID11
Feature activation+9.21e-03

Pos logit contributions
 narrative+0.000
 obstacle+0.000
 ang+0.000
 mistake+0.000
 trouble+0.000
Neg logit contributions
cious-0.000
ze-0.000
mare-0.000
coat-0.000
Ch-0.000
Token including
Token ID1390
Feature activation-9.00e-02

Pos logit contributions
 narrative+0.014
 mistake+0.013
 ang+0.013
 obstacle+0.013
 trouble+0.013
Neg logit contributions
cious-0.016
ze-0.014
mare-0.013
suit-0.013
Ch-0.013
Token av
Token ID1196
Feature activation+2.62e-01

Pos logit contributions
cious+0.145
ze+0.130
mare+0.123
we+0.123
 songs+0.121
Neg logit contributions
 narrative-0.139
 obstacle-0.128
 ang-0.123
 mistake-0.119
 Moral-0.119
Tokenoc
Token ID420
Feature activation+4.55e-02

Pos logit contributions
 narrative+0.379
 obstacle+0.349
 ang+0.341
 mistake+0.333
 trouble+0.332
Neg logit contributions
cious-0.454
ze-0.402
mare-0.393
coat-0.379
suit-0.377
Tokenados
Token ID22484
Feature activation+6.72e-02

Pos logit contributions
 narrative+0.063
 obstacle+0.061
 ang+0.057
 trouble+0.055
 mistake+0.054
Neg logit contributions
cious-0.082
we-0.074
ze-0.073
abies-0.067
ages-0.067
Token.
Token ID13
Feature activation+9.80e-03

Pos logit contributions
 narrative+0.098
 obstacle+0.088
 ang+0.088
 mistake+0.088
 trouble+0.087
Neg logit contributions
cious-0.114
ze-0.103
mare-0.103
Ch-0.098
suit-0.094
Token They
Token ID1119
Feature activation-1.04e-02

Pos logit contributions
 narrative+0.014
 Moral+0.013
 obstacle+0.013
Less+0.012
 mistake+0.012
Neg logit contributions
cious-0.018
ze-0.016
mare-0.015
coat-0.015
suit-0.015
Token were
Token ID547
Feature activation+1.76e-02

Pos logit contributions
cious+0.012
ze+0.010
mare+0.009
 sang+0.009
 danced+0.009
Neg logit contributions
 narrative-0.020
 obstacle-0.020
 trouble-0.020
 ang-0.020
 mistake-0.019
Token away
Token ID1497
Feature activation+4.56e-03

Pos logit contributions
 narrative+0.096
 obstacle+0.088
 ang+0.084
 Moral+0.083
 mistake+0.083
Neg logit contributions
cious-0.103
ze-0.092
mare-0.088
coat-0.087
suit-0.085
Token.
Token ID13
Feature activation+7.37e-03

Pos logit contributions
 narrative+0.009
 obstacle+0.009
 Moral+0.008
 ang+0.008
Less+0.008
Neg logit contributions
cious-0.006
ze-0.005
coat-0.004
mare-0.004
we-0.004
Token She
Token ID1375
Feature activation+1.83e-02

Pos logit contributions
 narrative+0.010
 Moral+0.009
 obstacle+0.009
 ang+0.009
 trouble+0.009
Neg logit contributions
cious-0.013
ze-0.012
mare-0.011
suit-0.011
coat-0.011
Token learned
Token ID4499
Feature activation+2.65e-02

Pos logit contributions
 narrative+0.033
 mistake+0.030
 obstacle+0.030
 ang+0.030
 Moral+0.030
Neg logit contributions
cious-0.026
mare-0.021
ze-0.021
suit-0.021
coat-0.020
Token that
Token ID326
Feature activation-5.12e-02

Pos logit contributions
 narrative+0.057
 obstacle+0.054
 ang+0.050
 Moral+0.049
 mistake+0.049
Neg logit contributions
cious-0.026
ze-0.025
 songs-0.023
mare-0.020
we-0.020
Token av
Token ID1196
Feature activation+2.61e-01

Pos logit contributions
ze+0.062
 songs+0.060
cious+0.060
we+0.058
 song+0.056
Neg logit contributions
 narrative-0.088
 ang-0.087
 obstacle-0.085
Less-0.078
 wreck-0.076
Tokenoc
Token ID420
Feature activation+4.58e-02

Pos logit contributions
 narrative+0.396
 obstacle+0.364
 ang+0.358
 mistake+0.349
 trouble+0.347
Neg logit contributions
cious-0.441
ze-0.385
mare-0.374
coat-0.362
suit-0.361
Tokenados
Token ID22484
Feature activation+6.69e-02

Pos logit contributions
 narrative+0.064
 obstacle+0.062
 ang+0.057
Less+0.055
 mistake+0.055
Neg logit contributions
cious-0.081
ze-0.074
we-0.072
balls-0.067
bes-0.066
Token are
Token ID389
Feature activation+4.09e-02

Pos logit contributions
 narrative+0.105
 obstacle+0.097
 ang+0.096
 Moral+0.093
 mistake+0.093
Neg logit contributions
cious-0.107
ze-0.094
mare-0.090
coat-0.088
we-0.088
Token good
Token ID922
Feature activation+2.04e-02

Pos logit contributions
 narrative+0.065
 obstacle+0.061
 ang+0.058
 Moral+0.057
 mistake+0.057
Neg logit contributions
cious-0.061
ze-0.055
 songs-0.052
mare-0.051
we-0.050
Token for
Token ID329
Feature activation+9.41e-02

Pos logit contributions
 narrative+0.030
 trouble+0.029
 obstacle+0.028
 mistake+0.027
 problem+0.027
Neg logit contributions
cious-0.036
ze-0.031
mare-0.031
Ch-0.030
suit-0.029
Token.
Token ID13
Feature activation+6.89e-03

Pos logit contributions
cious+0.002
suit+0.002
ze+0.002
mare+0.002
balls+0.002
Neg logit contributions
 narrative-0.001
 mistake-0.001
 mad-0.001
 Moral-0.001
 trouble-0.001
Token Lily
Token ID20037
Feature activation-1.67e-03

Pos logit contributions
 narrative+0.008
 Moral+0.006
 trouble+0.006
 obstacle+0.006
Less+0.006
Neg logit contributions
cious-0.015
ze-0.014
we-0.013
mare-0.013
oster-0.013
Token loved
Token ID6151
Feature activation-5.20e-02

Pos logit contributions
cious+0.002
mare+0.002
ze+0.002
lic+0.002
suit+0.002
Neg logit contributions
 narrative-0.003
 mistake-0.003
 trouble-0.003
 mad-0.003
 Moral-0.003
Token to
Token ID284
Feature activation-9.74e-03

Pos logit contributions
cious+0.067
ze+0.056
mare+0.053
suit+0.052
coat+0.051
Neg logit contributions
 narrative-0.099
 obstacle-0.093
 ang-0.092
 trouble-0.091
 mistake-0.090
Token eat
Token ID4483
Feature activation+6.89e-03

Pos logit contributions
cious+0.018
we+0.017
suit+0.016
ave+0.016
oster+0.016
Neg logit contributions
 work-0.011
 wreck-0.011
 trouble-0.011
 narrative-0.011
 mad-0.010
Token av
Token ID1196
Feature activation+2.61e-01

Pos logit contributions
 narrative+0.012
 obstacle+0.011
 ang+0.011
Less+0.011
 Moral+0.011
Neg logit contributions
cious-0.010
ze-0.009
 songs-0.008
mare-0.008
we-0.008
Tokenoc
Token ID420
Feature activation+4.35e-02

Pos logit contributions
 narrative+0.397
 obstacle+0.366
 ang+0.359
 mistake+0.351
 trouble+0.348
Neg logit contributions
cious-0.437
ze-0.381
mare-0.370
coat-0.358
suit-0.356
Tokenados
Token ID22484
Feature activation+6.71e-02

Pos logit contributions
 narrative+0.065
 obstacle+0.060
 ang+0.057
 mistake+0.056
Less+0.054
Neg logit contributions
cious-0.075
ze-0.069
we-0.064
bes-0.061
 songs-0.061
Token,
Token ID11
Feature activation+1.04e-02

Pos logit contributions
 narrative+0.112
 obstacle+0.104
 ang+0.103
 mistake+0.100
 trouble+0.100
Neg logit contributions
cious-0.102
ze-0.088
mare-0.086
coat-0.082
suit-0.082
Token they
Token ID484
Feature activation-1.82e-02

Pos logit contributions
 narrative+0.017
 ang+0.015
 trouble+0.015
 mad+0.015
 obstacle+0.015
Neg logit contributions
cious-0.017
ze-0.014
mare-0.014
suit-0.014
Ch-0.013
Token were
Token ID547
Feature activation+1.88e-02

Pos logit contributions
cious+0.027
ze+0.024
mare+0.023
coat+0.022
suit+0.022
Neg logit contributions
 narrative-0.031
 Moral-0.027
 obstacle-0.027
 ang-0.027
Less-0.027
Token It
Token ID632
Feature activation-2.75e-02

Pos logit contributions
 narrative+0.010
 obstacle+0.009
 ang+0.009
 mistake+0.009
 trouble+0.009
Neg logit contributions
cious-0.010
ze-0.008
mare-0.008
coat-0.008
suit-0.008
Token turned
Token ID2900
Feature activation+1.70e-02

Pos logit contributions
cious+0.042
coat+0.034
mare+0.034
suit+0.033
ze+0.032
Neg logit contributions
 narrative-0.047
 ang-0.045
 obstacle-0.044
 mistake-0.044
 trouble-0.043
Token out
Token ID503
Feature activation+2.61e-02

Pos logit contributions
 narrative+0.029
 obstacle+0.025
 mistake+0.025
Less+0.025
 Moral+0.025
Neg logit contributions
cious-0.025
ze-0.025
suit-0.022
coat-0.022
mare-0.021
Token that
Token ID326
Feature activation-4.78e-02

Pos logit contributions
 narrative+0.043
 obstacle+0.037
 Moral+0.036
Less+0.036
 ang+0.036
Neg logit contributions
cious-0.040
ze-0.036
mare-0.035
 songs-0.034
suit-0.033
Token the
Token ID262
Feature activation+2.29e-02

Pos logit contributions
cious+0.074
ze+0.067
mare+0.064
 songs+0.062
we+0.062
Neg logit contributions
 narrative-0.076
 obstacle-0.070
 ang-0.069
Less-0.066
 Moral-0.066
Token av
Token ID1196
Feature activation+2.60e-01

Pos logit contributions
 narrative+0.045
 ang+0.042
 obstacle+0.042
Less+0.041
 Moral+0.040
Neg logit contributions
cious-0.027
ze-0.024
 songs-0.021
coat-0.021
mare-0.020
Tokenoc
Token ID420
Feature activation+3.63e-02

Pos logit contributions
 narrative+0.453
 obstacle+0.407
 ang+0.406
 mistake+0.404
 trouble+0.395
Neg logit contributions
cious-0.394
ze-0.336
mare-0.320
coat-0.306
suit-0.306
Tokenados
Token ID22484
Feature activation+6.34e-02

Pos logit contributions
 narrative+0.056
Less+0.051
 obstacle+0.051
 ang+0.049
 cheating+0.049
Neg logit contributions
cious-0.058
we-0.052
ze-0.052
pter-0.050
ages-0.049
Token were
Token ID547
Feature activation+1.81e-02

Pos logit contributions
 narrative+0.113
 obstacle+0.105
 ang+0.104
 mistake+0.102
 trouble+0.101
Neg logit contributions
cious-0.090
ze-0.076
mare-0.074
coat-0.070
suit-0.070
Token bad
Token ID2089
Feature activation+8.92e-02

Pos logit contributions
 narrative+0.025
 obstacle+0.022
 ang+0.022
 mistake+0.021
 trouble+0.021
Neg logit contributions
cious-0.034
ze-0.030
mare-0.029
coat-0.028
suit-0.028
Token and
Token ID290
Feature activation+8.79e-03

Pos logit contributions
 narrative+0.128
 obstacle+0.119
 ang+0.116
 trouble+0.115
 mistake+0.115
Neg logit contributions
cious-0.157
ze-0.137
mare-0.135
coat-0.130
suit-0.130
Token weather
Token ID6193
Feature activation-1.09e-02

Pos logit contributions
 narrative+0.031
 ang+0.030
 obstacle+0.030
Less+0.028
ulk+0.027
Neg logit contributions
cious-0.031
ze-0.028
 songs-0.026
 song-0.025
we-0.025
Token before
Token ID878
Feature activation-9.45e-03

Pos logit contributions
cious+0.018
mare+0.016
ze+0.016
Ch+0.015
bes+0.014
Neg logit contributions
 narrative-0.016
 obstacle-0.015
 ang-0.015
 mistake-0.015
 trouble-0.014
Token he
Token ID339
Feature activation+2.56e-02

Pos logit contributions
cious+0.016
ze+0.014
we+0.013
mare+0.013
coat+0.013
Neg logit contributions
 narrative-0.015
 obstacle-0.013
 ang-0.013
 Moral-0.012
 mistake-0.012
Token picked
Token ID6497
Feature activation+1.24e-03

Pos logit contributions
 narrative+0.040
 obstacle+0.039
 trouble+0.039
 ang+0.039
 Moral+0.038
Neg logit contributions
cious-0.039
mare-0.034
suit-0.032
coat-0.032
ze-0.032
Token his
Token ID465
Feature activation-1.47e-02

Pos logit contributions
 narrative+0.002
 obstacle+0.002
Less+0.002
 mistake+0.002
 ang+0.002
Neg logit contributions
cious-0.002
ze-0.002
mare-0.002
 songs-0.002
we-0.002
Token av
Token ID1196
Feature activation+2.59e-01

Pos logit contributions
cious+0.026
ze+0.024
 songs+0.022
mare+0.022
we+0.022
Neg logit contributions
 narrative-0.020
 obstacle-0.018
 ang-0.017
 tragedy-0.016
Less-0.016
Tokenoc
Token ID420
Feature activation+4.68e-02

Pos logit contributions
 narrative+0.371
 obstacle+0.345
 ang+0.338
 mistake+0.329
 Moral+0.327
Neg logit contributions
cious-0.452
ze-0.399
mare-0.390
we-0.377
coat-0.376
Tokenados
Token ID22484
Feature activation+6.60e-02

Pos logit contributions
 narrative+0.064
 obstacle+0.060
 ang+0.057
Less+0.056
 tragedy+0.055
Neg logit contributions
cious-0.083
ze-0.075
we-0.072
pter-0.068
bes-0.067
Token.
Token ID13
Feature activation+7.45e-03

Pos logit contributions
 narrative+0.140
 obstacle+0.131
 ang+0.124
 Moral+0.124
 mistake+0.122
Neg logit contributions
cious-0.074
ze-0.060
coat-0.056
suit-0.055
we-0.054
Token<|endoftext|>
Token ID50256
Feature activation+1.24e-01

Pos logit contributions
 narrative+0.012
 obstacle+0.011
 ang+0.011
 mistake+0.011
 trouble+0.011
Neg logit contributions
cious-0.012
ze-0.010
mare-0.010
Ch-0.010
we-0.010
TokenOnce
Token ID7454
Feature activation+1.09e-03

Pos logit contributions
 narrative+0.180
 trouble+0.167
 Moral+0.164
Less+0.158
 obstacle+0.158
Neg logit contributions
cious-0.216
ze-0.188
mare-0.181
coat-0.177
suit-0.174
Token Maybe
Token ID6674
Feature activation+4.62e-02

Pos logit contributions
cious+0.002
ze+0.002
we+0.002
coat+0.002
mare+0.002
Neg logit contributions
 narrative-0.001
 trouble-0.001
 Moral-0.001
 obstacle-0.001
 ang-0.001
Token they
Token ID484
Feature activation-2.95e-02

Pos logit contributions
 narrative+0.069
 obstacle+0.064
 trouble+0.063
 ang+0.063
 mistake+0.062
Neg logit contributions
cious-0.079
ze-0.070
coat-0.067
mare-0.067
suit-0.067
Token could
Token ID714
Feature activation-1.03e-01

Pos logit contributions
cious+0.030
ze+0.024
mare+0.023
suit+0.021
coat+0.021
Neg logit contributions
 narrative-0.064
 obstacle-0.061
 ang-0.060
 mistake-0.059
 trouble-0.059
Token grow
Token ID1663
Feature activation-7.36e-02

Pos logit contributions
cious+0.156
ze+0.131
mare+0.130
suit+0.126
coat+0.125
Neg logit contributions
 narrative-0.177
 obstacle-0.164
 ang-0.161
 mistake-0.161
 trouble-0.158
Token more
Token ID517
Feature activation+1.48e-02

Pos logit contributions
cious+0.141
ze+0.126
mare+0.123
coat+0.119
suit+0.119
Neg logit contributions
 narrative-0.094
 obstacle-0.086
 ang-0.084
 mistake-0.081
 trouble-0.081
Token av
Token ID1196
Feature activation+2.59e-01

Pos logit contributions
 narrative+0.026
 obstacle+0.024
 trouble+0.024
 ang+0.023
 mistake+0.023
Neg logit contributions
cious-0.022
ze-0.019
mare-0.018
coat-0.018
suit-0.018
Tokenoc
Token ID420
Feature activation+4.22e-02

Pos logit contributions
 narrative+0.402
 obstacle+0.370
 ang+0.364
 mistake+0.356
 trouble+0.352
Neg logit contributions
cious-0.431
ze-0.374
mare-0.362
coat-0.351
suit-0.350
Tokenados
Token ID22484
Feature activation+6.40e-02

Pos logit contributions
 narrative+0.057
 obstacle+0.052
 ang+0.050
 mistake+0.048
Less+0.048
Neg logit contributions
cious-0.080
ze-0.072
we-0.070
balls-0.065
pter-0.065
Token and
Token ID290
Feature activation+1.56e-02

Pos logit contributions
 narrative+0.091
 obstacle+0.083
 ang+0.081
 mistake+0.080
 trouble+0.079
Neg logit contributions
cious-0.114
ze-0.100
mare-0.098
coat-0.094
Ch-0.094
Token share
Token ID2648
Feature activation-4.11e-02

Pos logit contributions
 narrative+0.027
 obstacle+0.025
 ang+0.025
 mistake+0.024
 trouble+0.024
Neg logit contributions
cious-0.023
ze-0.020
mare-0.019
coat-0.019
suit-0.018
Token them
Token ID606
Feature activation-1.12e-01

Pos logit contributions
cious+0.067
ze+0.058
mare+0.057
coat+0.056
suit+0.055
Neg logit contributions
 narrative-0.064
 obstacle-0.059
 ang-0.058
 mistake-0.058
 trouble-0.057
Token,
Token ID11
Feature activation+1.41e-02

Pos logit contributions
 narrative+0.075
 obstacle+0.070
 ang+0.069
 mistake+0.068
 trouble+0.067
Neg logit contributions
cious-0.069
ze-0.059
mare-0.058
suit-0.055
coat-0.055
Token and
Token ID290
Feature activation+1.76e-02

Pos logit contributions
 narrative+0.023
 obstacle+0.021
 ang+0.021
 mistake+0.020
 trouble+0.020
Neg logit contributions
cious-0.022
ze-0.019
mare-0.019
coat-0.018
suit-0.018
Token all
Token ID477
Feature activation+9.83e-02

Pos logit contributions
 narrative+0.030
 obstacle+0.027
 ang+0.027
 mistake+0.026
 trouble+0.026
Neg logit contributions
cious-0.027
ze-0.023
mare-0.022
coat-0.022
suit-0.021
Token the
Token ID262
Feature activation+2.53e-02

Pos logit contributions
 narrative+0.130
 obstacle+0.121
 ang+0.118
 trouble+0.115
 mistake+0.115
Neg logit contributions
cious-0.186
ze-0.163
mare-0.158
coat-0.157
suit-0.155
Token other
Token ID584
Feature activation+4.16e-02

Pos logit contributions
 narrative+0.044
 ang+0.042
 obstacle+0.041
Less+0.040
 Moral+0.039
Neg logit contributions
cious-0.034
 songs-0.030
ze-0.029
we-0.027
 song-0.027
Token av
Token ID1196
Feature activation+2.59e-01

Pos logit contributions
 narrative+0.076
 ang+0.073
 obstacle+0.069
Less+0.067
 mistake+0.067
Neg logit contributions
cious-0.055
 songs-0.046
we-0.045
ze-0.044
 liked-0.043
Tokenoc
Token ID420
Feature activation+3.43e-02

Pos logit contributions
 narrative+0.403
 obstacle+0.371
 ang+0.366
 mistake+0.358
 trouble+0.354
Neg logit contributions
cious-0.431
ze-0.372
mare-0.361
coat-0.349
suit-0.348
Tokenados
Token ID22484
Feature activation+6.12e-02

Pos logit contributions
 narrative+0.051
 obstacle+0.048
 ang+0.047
Less+0.047
 tragedy+0.045
Neg logit contributions
cious-0.056
we-0.052
ze-0.049
pter-0.048
ages-0.047
Token were
Token ID547
Feature activation+1.84e-02

Pos logit contributions
 narrative+0.095
 ang+0.087
 obstacle+0.086
 trouble+0.084
 mistake+0.084
Neg logit contributions
cious-0.099
ze-0.087
mare-0.086
Ch-0.082
suit-0.081
Token very
Token ID845
Feature activation+2.92e-02

Pos logit contributions
 narrative+0.025
 ang+0.024
 obstacle+0.023
 trouble+0.023
 mistake+0.023
Neg logit contributions
cious-0.033
mare-0.029
ze-0.029
suit-0.028
Ch-0.028
Token happy
Token ID3772
Feature activation-3.32e-02

Pos logit contributions
 narrative+0.049
 obstacle+0.046
 ang+0.045
 mistake+0.044
 trouble+0.044
Neg logit contributions
cious-0.045
ze-0.039
mare-0.038
coat-0.036
suit-0.036
Token day
Token ID1110
Feature activation-1.82e-02

Pos logit contributions
cious+0.067
ze+0.059
mare+0.058
 songs+0.053
Ch+0.053
Neg logit contributions
 narrative-0.095
 obstacle-0.090
 ang-0.090
 trouble-0.086
 cheating-0.083
Token on
Token ID319
Feature activation+3.43e-02

Pos logit contributions
cious+0.031
ze+0.027
mare+0.027
coat+0.025
suit+0.025
Neg logit contributions
 narrative-0.027
 obstacle-0.025
 ang-0.025
 mistake-0.024
 Moral-0.024
Token,
Token ID11
Feature activation+9.20e-03

Pos logit contributions
 narrative+0.057
 obstacle+0.054
 ang+0.053
 Moral+0.051
Less+0.050
Neg logit contributions
cious-0.052
ze-0.046
mare-0.045
coat-0.043
suit-0.042
Token Lily
Token ID20037
Feature activation-1.37e-03

Pos logit contributions
 narrative+0.014
 obstacle+0.013
 ang+0.013
 mistake+0.013
 Moral+0.013
Neg logit contributions
cious-0.015
ze-0.013
mare-0.013
suit-0.013
coat-0.012
Token loved
Token ID6151
Feature activation-5.29e-02

Pos logit contributions
cious+0.002
suit+0.002
lic+0.002
pter+0.002
bel+0.002
Neg logit contributions
 narrative-0.002
 mistake-0.002
 Moral-0.002
 mad-0.002
 trouble-0.002
Token av
Token ID1196
Feature activation+2.59e-01

Pos logit contributions
cious+0.089
suit+0.078
mare+0.075
ze+0.074
coat+0.072
Neg logit contributions
 narrative-0.078
 trouble-0.077
 ang-0.077
 mistake-0.076
 mad-0.075
Tokenoc
Token ID420
Feature activation+4.29e-02

Pos logit contributions
 narrative+0.315
 obstacle+0.292
 ang+0.281
 mistake+0.277
 trouble+0.276
Neg logit contributions
cious-0.487
mare-0.440
ze-0.439
we-0.426
suit-0.425
Tokenados
Token ID22484
Feature activation+6.55e-02

Pos logit contributions
 narrative+0.057
 obstacle+0.053
 ang+0.050
 mistake+0.048
Less+0.047
Neg logit contributions
cious-0.082
ze-0.074
we-0.070
mare-0.067
bes-0.066
Token and
Token ID290
Feature activation+2.18e-02

Pos logit contributions
 narrative+0.076
 trouble+0.068
 mistake+0.067
 obstacle+0.067
 ang+0.065
Neg logit contributions
cious-0.128
mare-0.116
suit-0.113
Ch-0.113
ze-0.113
Token would
Token ID561
Feature activation-6.83e-02

Pos logit contributions
 narrative+0.037
 mistake+0.034
 trouble+0.034
 Moral+0.034
 mad+0.033
Neg logit contributions
cious-0.033
suit-0.030
mare-0.028
coat-0.027
oster-0.027
Token ask
Token ID1265
Feature activation+3.14e-02

Pos logit contributions
cious+0.105
suit+0.087
ze+0.086
mare+0.086
we+0.083
Neg logit contributions
 narrative-0.116
 mistake-0.106
 obstacle-0.106
 ang-0.105
 Moral-0.105
Token.
Token ID13
Feature activation+8.92e-03

Pos logit contributions
 narrative+0.000
 mistake+0.000
 trouble+0.000
 obstacle+0.000
 ang+0.000
Neg logit contributions
cious-0.000
ze-0.000
mare-0.000
Ch-0.000
suit-0.000
Token Did
Token ID7731
Feature activation-3.91e-02

Pos logit contributions
 narrative+0.012
 Moral+0.011
 obstacle+0.010
Less+0.010
 Enc+0.010
Neg logit contributions
cious-0.016
ze-0.015
we-0.014
coat-0.014
suit-0.014
Token you
Token ID345
Feature activation+2.85e-02

Pos logit contributions
cious+0.063
ze+0.055
mare+0.052
we+0.052
Ch+0.050
Neg logit contributions
 narrative-0.062
 obstacle-0.057
 ang-0.056
 Moral-0.055
 mistake-0.054
Token know
Token ID760
Feature activation+2.79e-02

Pos logit contributions
 narrative+0.043
 obstacle+0.040
 ang+0.039
 mistake+0.039
 mad+0.039
Neg logit contributions
cious-0.048
mare-0.041
we-0.040
ze-0.040
suit-0.040
Token that
Token ID326
Feature activation-4.68e-02

Pos logit contributions
 narrative+0.038
 obstacle+0.035
 ang+0.034
 mistake+0.033
 trouble+0.033
Neg logit contributions
cious-0.051
ze-0.045
mare-0.044
we-0.043
coat-0.043
Token av
Token ID1196
Feature activation+2.59e-01

Pos logit contributions
cious+0.073
ze+0.065
mare+0.062
we+0.060
coat+0.060
Neg logit contributions
 narrative-0.075
 obstacle-0.069
 ang-0.068
 Moral-0.066
 mistake-0.066
Tokenoc
Token ID420
Feature activation+3.50e-02

Pos logit contributions
 narrative+0.383
 obstacle+0.352
 ang+0.345
 mistake+0.336
 trouble+0.335
Neg logit contributions
cious-0.445
ze-0.390
mare-0.381
coat-0.368
suit-0.366
Tokenados
Token ID22484
Feature activation+6.04e-02

Pos logit contributions
 narrative+0.047
 obstacle+0.044
 ang+0.042
Less+0.041
 tragedy+0.040
Neg logit contributions
cious-0.062
ze-0.057
we-0.057
pter-0.052
ages-0.052
Token are
Token ID389
Feature activation+3.92e-02

Pos logit contributions
 narrative+0.094
 obstacle+0.086
 ang+0.085
 mistake+0.084
 trouble+0.084
Neg logit contributions
cious-0.099
ze-0.085
mare-0.083
Ch-0.081
coat-0.080
Token very
Token ID845
Feature activation+2.83e-02

Pos logit contributions
 narrative+0.067
 obstacle+0.062
 ang+0.060
 Moral+0.059
 mistake+0.059
Neg logit contributions
cious-0.058
ze-0.050
mare-0.048
we-0.045
coat-0.045
Token healthy
Token ID5448
Feature activation-2.86e-02

Pos logit contributions
 narrative+0.047
 obstacle+0.044
 ang+0.043
 mistake+0.042
 Moral+0.042
Neg logit contributions
cious-0.044
ze-0.037
mare-0.036
suit-0.035
we-0.034
Token before
Token ID878
Feature activation-9.48e-03

Pos logit contributions
 narrative+0.002
 obstacle+0.002
 ang+0.002
 mistake+0.002
 trouble+0.002
Neg logit contributions
cious-0.002
ze-0.001
mare-0.001
coat-0.001
suit-0.001
Token.
Token ID13
Feature activation+8.82e-03

Pos logit contributions
cious+0.013
ze+0.011
mare+0.011
we+0.010
coat+0.010
Neg logit contributions
 narrative-0.017
 obstacle-0.016
 ang-0.016
 mistake-0.015
 Moral-0.015
Token\n
Token ID198
Feature activation-3.80e-03

Pos logit contributions
 narrative+0.017
 obstacle+0.016
 ang+0.016
 mistake+0.016
 mad+0.015
Neg logit contributions
cious-0.011
mare-0.010
ze-0.009
Ch-0.009
we-0.009
Token\n
Token ID198
Feature activation-2.58e-03

Pos logit contributions
cious+0.007
ze+0.006
mare+0.006
Ch+0.006
we+0.005
Neg logit contributions
 narrative-0.006
 obstacle-0.005
 ang-0.005
 mistake-0.005
 trouble-0.005
TokenThe
Token ID464
Feature activation-3.31e-02

Pos logit contributions
cious+0.003
ze+0.003
Ch+0.002
we+0.002
mare+0.002
Neg logit contributions
 narrative-0.005
 ang-0.005
 obstacle-0.005
 trouble-0.005
 mistake-0.005
Token av
Token ID1196
Feature activation+2.58e-01

Pos logit contributions
cious+0.050
 songs+0.045
ze+0.044
 liked+0.043
 pet+0.043
Neg logit contributions
ulk-0.047
 ang-0.046
reating-0.045
ge-0.044
 cheating-0.043
Tokenoc
Token ID420
Feature activation+4.23e-02

Pos logit contributions
 narrative+0.434
 ang+0.402
 obstacle+0.398
 trouble+0.396
 mistake+0.395
Neg logit contributions
cious-0.397
ze-0.336
mare-0.318
Ch-0.309
we-0.308
Tokenados
Token ID22484
Feature activation+6.46e-02

Pos logit contributions
 narrative+0.064
 obstacle+0.062
 trouble+0.060
 ang+0.059
Less+0.058
Neg logit contributions
cious-0.067
we-0.061
ze-0.060
balls-0.055
abies-0.054
Token were
Token ID547
Feature activation+1.84e-02

Pos logit contributions
 narrative+0.121
 obstacle+0.116
 ang+0.115
 trouble+0.113
 mistake+0.112
Neg logit contributions
cious-0.083
ze-0.071
mare-0.065
we-0.064
coat-0.062
Token shared
Token ID4888
Feature activation-3.82e-02

Pos logit contributions
 narrative+0.024
 obstacle+0.022
 ang+0.022
 trouble+0.021
 Moral+0.021
Neg logit contributions
cious-0.035
ze-0.031
mare-0.030
coat-0.030
suit-0.030
Token with
Token ID351
Feature activation-1.48e-02

Pos logit contributions
cious+0.067
ze+0.059
mare+0.058
coat+0.057
Ch+0.056
Neg logit contributions
 narrative-0.056
 obstacle-0.050
 ang-0.050
 mistake-0.049
 trouble-0.049
Token happy
Token ID3772
Feature activation-3.94e-02

Pos logit contributions
cious+0.022
ze+0.019
mare+0.019
coat+0.018
suit+0.018
Neg logit contributions
 narrative-0.025
 obstacle-0.023
 ang-0.022
 mistake-0.022
 trouble-0.022
Token that
Token ID326
Feature activation-5.23e-02

Pos logit contributions
cious+0.070
coat+0.061
Ch+0.061
ze+0.061
suit+0.060
Neg logit contributions
 trouble-0.052
 narrative-0.052
 mistake-0.051
 obstacle-0.051
 Moral-0.049
Token it
Token ID340
Feature activation-7.49e-03

Pos logit contributions
cious+0.076
ze+0.065
mare+0.063
we+0.061
 songs+0.060
Neg logit contributions
 narrative-0.087
 obstacle-0.083
 ang-0.081
 mistake-0.079
 mad-0.076
Token produced
Token ID4635
Feature activation-3.32e-02

Pos logit contributions
cious+0.010
coat+0.009
ze+0.009
mare+0.009
Ch+0.008
Neg logit contributions
 narrative-0.013
 Moral-0.012
 obstacle-0.012
 ang-0.012
Less-0.012
Token more
Token ID517
Feature activation+1.67e-02

Pos logit contributions
cious+0.056
ze+0.049
mare+0.047
we+0.046
suit+0.045
Neg logit contributions
 narrative-0.050
 obstacle-0.046
 ang-0.044
 mistake-0.043
 Moral-0.043
Token av
Token ID1196
Feature activation+2.56e-01

Pos logit contributions
 narrative+0.024
 ang+0.022
 obstacle+0.021
Less+0.021
 mistake+0.021
Neg logit contributions
cious-0.029
ze-0.024
 songs-0.024
mare-0.023
we-0.023
Tokenoc
Token ID420
Feature activation+4.11e-02

Pos logit contributions
 narrative+0.418
 obstacle+0.381
 ang+0.380
 mistake+0.374
 trouble+0.367
Neg logit contributions
cious-0.413
ze-0.351
mare-0.336
suit-0.327
coat-0.324
Tokenados
Token ID22484
Feature activation+6.35e-02

Pos logit contributions
 narrative+0.060
 obstacle+0.055
 ang+0.053
 mistake+0.052
 trouble+0.052
Neg logit contributions
cious-0.071
we-0.063
ze-0.062
ages-0.060
balls-0.057
Token than
Token ID621
Feature activation-4.94e-02

Pos logit contributions
 narrative+0.096
 obstacle+0.089
 ang+0.088
 trouble+0.086
 mistake+0.086
Neg logit contributions
cious-0.104
ze-0.092
mare-0.090
Ch-0.088
coat-0.087
Token ever
Token ID1683
Feature activation+1.05e-03

Pos logit contributions
cious+0.077
ze+0.066
mare+0.063
we+0.063
 songs+0.062
Neg logit contributions
 narrative-0.080
 obstacle-0.073
 ang-0.070
 mistake-0.068
 Moral-0.068
Token before
Token ID878
Feature activation-9.48e-03

Pos logit contributions
 narrative+0.002
 obstacle+0.002
 ang+0.002
 mistake+0.002
 trouble+0.002
Neg logit contributions
cious-0.002
ze-0.001
mare-0.001
coat-0.001
suit-0.001
Token,
Token ID11
Feature activation+1.51e-02

Pos logit contributions
cious+0.038
ze+0.034
mare+0.033
coat+0.032
suit+0.032
Neg logit contributions
 narrative-0.032
 obstacle-0.030
 ang-0.029
 mistake-0.028
 trouble-0.028
Token there
Token ID612
Feature activation-5.14e-02

Pos logit contributions
 narrative+0.027
 obstacle+0.025
 ang+0.025
 mistake+0.024
 Moral+0.024
Neg logit contributions
cious-0.022
ze-0.018
mare-0.018
suit-0.017
coat-0.017
Token was
Token ID373
Feature activation-2.10e-03

Pos logit contributions
cious+0.082
ze+0.070
mare+0.069
coat+0.067
suit+0.066
Neg logit contributions
 narrative-0.081
 trouble-0.077
 mistake-0.077
 Moral-0.075
 obstacle-0.074
Token a
Token ID257
Feature activation-8.59e-03

Pos logit contributions
cious+0.003
ze+0.003
mare+0.003
coat+0.003
suit+0.003
Neg logit contributions
 narrative-0.003
 obstacle-0.003
 ang-0.003
 mistake-0.003
 Moral-0.003
Token loud
Token ID7812
Feature activation-4.81e-03

Pos logit contributions
cious+0.012
ze+0.010
mare+0.010
we+0.010
coat+0.010
Neg logit contributions
 narrative-0.015
 obstacle-0.014
 ang-0.014
 mistake-0.014
 trouble-0.014
Token av
Token ID1196
Feature activation+2.56e-01

Pos logit contributions
cious+0.007
ze+0.006
mare+0.006
coat+0.005
suit+0.005
Neg logit contributions
 narrative-0.009
 obstacle-0.008
 ang-0.008
 mistake-0.008
 trouble-0.008
Tokenac
Token ID330
Feature activation-7.11e-02

Pos logit contributions
 narrative+0.399
 obstacle+0.367
 ang+0.364
 mistake+0.354
 trouble+0.352
Neg logit contributions
cious-0.425
ze-0.369
mare-0.357
coat-0.344
suit-0.343
Tokenado
Token ID4533
Feature activation-7.25e-02

Pos logit contributions
cious+0.100
ze+0.086
mare+0.082
coat+0.079
we+0.079
Neg logit contributions
 narrative-0.127
 obstacle-0.119
 ang-0.117
 mistake-0.114
 trouble-0.114
Token.
Token ID13
Feature activation+1.09e-02

Pos logit contributions
cious+0.113
mare+0.093
ze+0.093
coat+0.092
pter+0.090
Neg logit contributions
 narrative-0.115
 obstacle-0.113
 mistake-0.111
 trouble-0.105
 mad-0.105
Token It
Token ID632
Feature activation-3.42e-02

Pos logit contributions
 narrative+0.016
 obstacle+0.015
 ang+0.014
 trouble+0.014
 Moral+0.014
Neg logit contributions
cious-0.019
ze-0.017
mare-0.016
coat-0.016
we-0.016
Token was
Token ID373
Feature activation-1.95e-03

Pos logit contributions
cious+0.056
mare+0.046
suit+0.045
coat+0.045
pter+0.044
Neg logit contributions
 narrative-0.055
 Moral-0.051
 ang-0.051
 obstacle-0.051
 mad-0.050
Token and
Token ID290
Feature activation-3.58e-04

Pos logit contributions
cious+0.016
mare+0.014
bes+0.013
pter+0.013
ze+0.013
Neg logit contributions
 narrative-0.008
 ang-0.008
 trouble-0.008
 mad-0.008
 obstacle-0.008
Token said
Token ID531
Feature activation-4.67e-03

Pos logit contributions
pter+0.001
cious+0.001
oster+0.001
ze+0.001
rito+0.001
Neg logit contributions
 ang-0.000
 mad-0.000
 apolog-0.000
 Mr-0.000
 trouble-0.000
Token,
Token ID11
Feature activation-1.08e-02

Pos logit contributions
cious+0.010
mare+0.009
ze+0.009
suit+0.008
bes+0.008
Neg logit contributions
 narrative-0.005
 trouble-0.004
 ang-0.004
 mad-0.004
 Moral-0.004
Token "
Token ID366
Feature activation-3.68e-02

Pos logit contributions
cious+0.019
kel+0.017
mare+0.017
suit+0.017
oster+0.017
Neg logit contributions
 Moral-0.013
 ang-0.013
 Enc-0.012
 mad-0.012
Less-0.012
TokenWhe
Token ID10842
Feature activation-4.22e-02

Pos logit contributions
cious+0.073
ves+0.072
mare+0.069
ze+0.068
ggy+0.068
Neg logit contributions
 narrative-0.041
Less-0.036
 Moral-0.034
 Enc-0.028
�-0.027
Tokene
Token ID68
Feature activation-2.12e-01

Pos logit contributions
cious+0.049
ze+0.036
suit+0.034
mare+0.034
coat+0.033
Neg logit contributions
 narrative-0.091
 mad-0.085
 ang-0.083
 mistake-0.082
 trouble-0.082
Token!"
Token ID2474
Feature activation-3.49e-02

Pos logit contributions
rito+0.405
cious+0.403
mare+0.383
pter+0.377
Ch+0.372
Neg logit contributions
 mad-0.187
 mistake-0.180
 trouble-0.155
 narrative-0.148
 problem-0.145
Token Anna
Token ID11735
Feature activation-5.54e-02

Pos logit contributions
cious+0.071
ze+0.062
oster+0.062
rito+0.060
kel+0.060
Neg logit contributions
 narrative-0.040
 Moral-0.039
 trouble-0.038
 ang-0.036
 mad-0.036
Token smiled
Token ID13541
Feature activation-6.24e-03

Pos logit contributions
cious+0.100
mare+0.094
rito+0.092
pter+0.090
we+0.090
Neg logit contributions
 mad-0.062
 ang-0.061
 Moral-0.059
 apolog-0.059
 mistake-0.058
Token back
Token ID736
Feature activation+6.04e-02

Pos logit contributions
cious+0.012
mare+0.011
ze+0.011
coat+0.010
we+0.010
Neg logit contributions
 narrative-0.008
 ang-0.007
 obstacle-0.007
 mad-0.007
 trouble-0.007
Token and
Token ID290
Feature activation+6.03e-03

Pos logit contributions
 narrative+0.095
 obstacle+0.088
 ang+0.086
 mistake+0.084
 trouble+0.083
Neg logit contributions
cious-0.099
ze-0.087
mare-0.084
coat-0.081
we-0.081
Token and
Token ID290
Feature activation+4.52e-03

Pos logit contributions
 narrative+0.073
 obstacle+0.067
 ang+0.067
Less+0.066
 Moral+0.066
Neg logit contributions
cious-0.056
ze-0.048
suit-0.045
mare-0.045
coat-0.045
Token had
Token ID550
Feature activation+1.36e-02

Pos logit contributions
 narrative+0.007
 obstacle+0.007
 ang+0.007
 trouble+0.007
 mad+0.007
Neg logit contributions
cious-0.007
ze-0.006
mare-0.006
kel-0.006
we-0.005
Token a
Token ID257
Feature activation-1.80e-02

Pos logit contributions
 trouble+0.017
 narrative+0.016
 obstacle+0.015
 ang+0.015
 mad+0.014
Neg logit contributions
cious-0.025
mare-0.024
suit-0.023
we-0.022
coat-0.022
Token long
Token ID890
Feature activation-3.44e-05

Pos logit contributions
cious+0.027
we+0.024
mare+0.023
Ch+0.022
ze+0.022
Neg logit contributions
 narrative-0.028
 mistake-0.028
 trouble-0.027
 obstacle-0.027
 mad-0.027
Token man
Token ID582
Feature activation+5.33e-02

Pos logit contributions
we+0.000
cious+0.000
Ch+0.000
ave+0.000
bel+0.000
Neg logit contributions
 mistake-0.000
 problem-0.000
 trouble-0.000
 narrative-0.000
 obstacle-0.000
Tokene
Token ID68
Feature activation-2.11e-01

Pos logit contributions
 narrative+0.105
 mad+0.104
 mistake+0.102
 ang+0.102
 trouble+0.102
Neg logit contributions
cious-0.054
mare-0.049
coat-0.044
Ch-0.043
suit-0.042
Token.
Token ID13
Feature activation-2.56e-03

Pos logit contributions
mare+0.372
cious+0.341
suit+0.328
coat+0.321
rito+0.319
Neg logit contributions
 mistake-0.225
 mad-0.221
 wreck-0.213
 trouble-0.207
 altogether-0.203
Token Tom
Token ID4186
Feature activation+4.08e-02

Pos logit contributions
cious+0.004
mare+0.003
ze+0.003
suit+0.003
coat+0.003
Neg logit contributions
 narrative-0.005
 obstacle-0.004
 mistake-0.004
 ang-0.004
Less-0.004
Token wanted
Token ID2227
Feature activation-2.74e-02

Pos logit contributions
 narrative+0.081
 obstacle+0.076
 ang+0.076
 mistake+0.075
 Moral+0.074
Neg logit contributions
cious-0.048
ze-0.040
mare-0.039
coat-0.037
we-0.036
Token to
Token ID284
Feature activation-1.98e-02

Pos logit contributions
cious+0.039
ze+0.033
mare+0.032
suit+0.031
coat+0.031
Neg logit contributions
 narrative-0.048
 trouble-0.046
 ang-0.046
 obstacle-0.045
 Moral-0.045
Token ride
Token ID6594
Feature activation+1.63e-02

Pos logit contributions
cious+0.034
we+0.031
suit+0.029
bes+0.029
pter+0.028
Neg logit contributions
 mad-0.028
 ang-0.027
 wreck-0.027
 trouble-0.027
 Moral-0.026
Token,
Token ID11
Feature activation+8.47e-04

Pos logit contributions
cious+0.019
ze+0.016
mare+0.016
coat+0.015
suit+0.015
Neg logit contributions
 narrative-0.021
 obstacle-0.020
 ang-0.019
 mistake-0.019
 trouble-0.019
Token he
Token ID339
Feature activation+1.86e-02

Pos logit contributions
 narrative+0.001
 obstacle+0.001
 mistake+0.001
 Moral+0.001
 ang+0.001
Neg logit contributions
cious-0.001
ze-0.001
 songs-0.001
coat-0.001
mare-0.001
Token had
Token ID550
Feature activation+1.72e-02

Pos logit contributions
 narrative+0.035
 ang+0.033
 obstacle+0.033
 mistake+0.032
 trouble+0.032
Neg logit contributions
cious-0.024
ze-0.020
mare-0.020
coat-0.019
we-0.019
Token to
Token ID284
Feature activation-1.84e-02

Pos logit contributions
 narrative+0.026
 trouble+0.025
 ang+0.025
 mistake+0.025
 obstacle+0.024
Neg logit contributions
cious-0.028
mare-0.024
ze-0.024
suit-0.024
we-0.023
Token sne
Token ID10505
Feature activation+2.05e-02

Pos logit contributions
cious+0.024
ze+0.022
mare+0.020
coat+0.020
 songs+0.019
Neg logit contributions
 narrative-0.033
 obstacle-0.032
 ang-0.030
 Moral-0.030
Less-0.030
Tokene
Token ID68
Feature activation-2.11e-01

Pos logit contributions
 narrative+0.036
 trouble+0.036
 mad+0.034
 ang+0.034
 obstacle+0.033
Neg logit contributions
cious-0.029
we-0.023
suit-0.023
mare-0.022
 groom-0.022
Tokenze
Token ID2736
Feature activation-1.90e-01

Pos logit contributions
cious+0.143
mare+0.125
pter+0.111
Ch+0.100
suit+0.095
Neg logit contributions
 mad-0.485
 ang-0.476
 mistake-0.472
 apolog-0.446
 trouble-0.443
Token.
Token ID13
Feature activation+1.72e-03

Pos logit contributions
cious+0.358
mare+0.300
pter+0.294
Ch+0.290
suit+0.287
Neg logit contributions
 narrative-0.239
 ang-0.235
 trouble-0.229
 mistake-0.226
 obstacle-0.226
Token "
Token ID366
Feature activation-2.75e-02

Pos logit contributions
 narrative+0.002
 obstacle+0.002
 ang+0.002
 mistake+0.002
 trouble+0.002
Neg logit contributions
cious-0.003
ze-0.003
mare-0.003
coat-0.003
we-0.003
TokenA
Token ID32
Feature activation-6.68e-02

Pos logit contributions
cious+0.046
mare+0.040
suit+0.039
 majesty+0.038
ze+0.037
Neg logit contributions
 narrative-0.044
Less-0.039
 Moral-0.038
 mad-0.036
 obstacle-0.036
Tokench
Token ID354
Feature activation-1.36e-01

Pos logit contributions
cious+0.098
pter+0.085
bes+0.081
abies+0.079
we+0.079
Neg logit contributions
 problem-0.119
 obstacle-0.118
 mistake-0.118
 trouble-0.116
 mad-0.115
Token Fl
Token ID1610
Feature activation-3.61e-02

Pos logit contributions
cious+0.004
ze+0.003
mare+0.003
coat+0.003
suit+0.003
Neg logit contributions
 narrative-0.005
 obstacle-0.005
 ang-0.005
 mistake-0.005
 trouble-0.005
Tokenuffy
Token ID15352
Feature activation-6.82e-02

Pos logit contributions
cious+0.061
mare+0.053
pter+0.051
 songs+0.051
 groom+0.051
Neg logit contributions
 narrative-0.049
 mistake-0.048
 obstacle-0.045
 problem-0.045
 trouble-0.045
Token started
Token ID2067
Feature activation-2.25e-02

Pos logit contributions
cious+0.091
mare+0.076
ze+0.074
coat+0.072
suit+0.072
Neg logit contributions
 narrative-0.124
 obstacle-0.117
 ang-0.116
 mistake-0.115
 trouble-0.113
Token to
Token ID284
Feature activation-1.76e-02

Pos logit contributions
cious+0.028
ze+0.023
mare+0.023
suit+0.021
coat+0.021
Neg logit contributions
 narrative-0.043
 ang-0.041
 obstacle-0.041
 mistake-0.040
 trouble-0.040
Token sne
Token ID10505
Feature activation+1.94e-02

Pos logit contributions
cious+0.024
ze+0.023
coat+0.020
mare+0.019
 nood+0.019
Neg logit contributions
 narrative-0.034
 obstacle-0.029
 Moral-0.029
Less-0.028
oral-0.028
Tokene
Token ID68
Feature activation-2.10e-01

Pos logit contributions
 narrative+0.038
 trouble+0.037
 ang+0.036
 obstacle+0.036
 mad+0.036
Neg logit contributions
cious-0.023
mare-0.019
suit-0.018
we-0.018
coat-0.017
Tokenze
Token ID2736
Feature activation-1.88e-01

Pos logit contributions
cious+0.129
mare+0.125
pter+0.095
suit+0.095
coat+0.085
Neg logit contributions
 mad-0.504
 ang-0.504
 mistake-0.493
 trouble-0.469
 obstacle-0.465
Token and
Token ID290
Feature activation+1.14e-02

Pos logit contributions
cious+0.332
mare+0.285
ze+0.281
suit+0.273
coat+0.271
Neg logit contributions
 narrative-0.266
 obstacle-0.251
 ang-0.249
 trouble-0.241
 mistake-0.240
Token cough
Token ID22094
Feature activation-3.19e-02

Pos logit contributions
 ang+0.017
 mad+0.017
 trouble+0.016
 mistake+0.016
 obstacle+0.015
Neg logit contributions
cious-0.019
suit-0.016
mare-0.016
coat-0.016
we-0.016
Token.
Token ID13
Feature activation+2.92e-04

Pos logit contributions
cious+0.058
pter+0.053
mare+0.052
 majesty+0.050
bes+0.049
Neg logit contributions
 ang-0.042
 mad-0.040
 trouble-0.038
 obstacle-0.037
 narrative-0.037
Token Suddenly
Token ID24975
Feature activation-2.37e-02

Pos logit contributions
 narrative+0.000
 Moral+0.000
 ang+0.000
 obstacle+0.000
 trouble+0.000
Neg logit contributions
cious-0.000
mare-0.000
ze-0.000
we-0.000
coat-0.000
Token was
Token ID373
Feature activation-1.69e-02

Pos logit contributions
cious+0.019
mare+0.017
ze+0.016
pter+0.015
 majesty+0.015
Neg logit contributions
 narrative-0.018
 obstacle-0.017
 ang-0.017
 Moral-0.017
 mistake-0.017
Token her
Token ID607
Feature activation-6.63e-02

Pos logit contributions
cious+0.033
mare+0.028
ze+0.028
we+0.027
suit+0.027
Neg logit contributions
 narrative-0.021
 trouble-0.020
 ang-0.020
 obstacle-0.019
 mistake-0.019
Token royal
Token ID15100
Feature activation-1.39e-01

Pos logit contributions
cious+0.109
ze+0.093
mare+0.093
Ch+0.089
we+0.089
Neg logit contributions
 narrative-0.103
 obstacle-0.097
 mistake-0.095
 trouble-0.094
 ang-0.092
Token s
Token ID264
Feature activation+2.48e-02

Pos logit contributions
cious+0.234
ze+0.202
mare+0.201
Ch+0.195
we+0.194
Neg logit contributions
 narrative-0.208
 obstacle-0.198
 trouble-0.192
 mistake-0.190
 problem-0.185
Tokence
Token ID344
Feature activation-5.20e-02

Pos logit contributions
 narrative+0.037
 obstacle+0.034
 ang+0.033
 mistake+0.033
 trouble+0.032
Neg logit contributions
cious-0.043
ze-0.038
mare-0.036
coat-0.036
we-0.036
Tokenpter
Token ID42104
Feature activation-2.10e-01

Pos logit contributions
mare+0.071
cious+0.069
lic+0.065
Ch+0.063
Rain+0.061
Neg logit contributions
 narrative-0.079
 mistake-0.075
 obstacle-0.073
 problem-0.073
 trouble-0.070
Token.
Token ID13
Feature activation-1.35e-02

Pos logit contributions
cious+0.381
Ch+0.348
mare+0.342
ze+0.335
oster+0.325
Neg logit contributions
 narrative-0.263
 mad-0.245
 mistake-0.241
 trouble-0.239
 ang-0.234
Token\n
Token ID198
Feature activation-2.27e-02

Pos logit contributions
cious+0.024
ze+0.021
mare+0.020
we+0.019
suit+0.019
Neg logit contributions
 narrative-0.020
 Moral-0.018
 obstacle-0.018
 ang-0.018
 trouble-0.017
Token\n
Token ID198
Feature activation-2.07e-02

Pos logit contributions
cious+0.034
ze+0.029
mare+0.028
suit+0.027
coat+0.027
Neg logit contributions
 narrative-0.039
 obstacle-0.036
 ang-0.036
 mistake-0.035
 Moral-0.035
TokenAs
Token ID1722
Feature activation-6.80e-02

Pos logit contributions
cious+0.025
ze+0.021
mare+0.020
coat+0.019
Ch+0.019
Neg logit contributions
 narrative-0.041
 obstacle-0.039
 ang-0.039
 trouble-0.038
 mistake-0.038
Token they
Token ID484
Feature activation-3.10e-02

Pos logit contributions
cious+0.082
ze+0.068
mare+0.067
coat+0.066
Ch+0.066
Neg logit contributions
 narrative-0.134
 obstacle-0.127
 ang-0.124
 mistake-0.119
 Moral-0.118
Token mouse
Token ID10211
Feature activation+7.85e-03

Pos logit contributions
cious+0.039
ze+0.034
mare+0.031
Ch+0.030
 songs+0.029
Neg logit contributions
 narrative-0.068
 obstacle-0.063
 ang-0.062
Less-0.061
 mistake-0.061
Token was
Token ID373
Feature activation-1.61e-02

Pos logit contributions
 narrative+0.013
 ang+0.012
 mistake+0.012
 obstacle+0.012
 Moral+0.012
Neg logit contributions
cious-0.012
mare-0.010
coat-0.010
we-0.010
suit-0.009
Token happy
Token ID3772
Feature activation-4.57e-02

Pos logit contributions
cious+0.029
ze+0.025
mare+0.025
we+0.024
suit+0.024
Neg logit contributions
 narrative-0.022
 ang-0.021
 obstacle-0.020
 trouble-0.020
 mistake-0.020
Token to
Token ID284
Feature activation-1.93e-02

Pos logit contributions
cious+0.080
ze+0.069
mare+0.068
Ch+0.067
suit+0.067
Neg logit contributions
 narrative-0.064
 trouble-0.062
 mistake-0.062
 ang-0.062
 obstacle-0.061
Token oblig
Token ID6263
Feature activation-3.39e-02

Pos logit contributions
cious+0.035
we+0.031
suit+0.031
Ch+0.030
coat+0.029
Neg logit contributions
 narrative-0.026
 mistake-0.025
 Moral-0.025
 trouble-0.025
 mad-0.024
Tokene
Token ID68
Feature activation-2.10e-01

Pos logit contributions
cious+0.026
ze+0.018
mare+0.017
coat+0.016
suit+0.016
Neg logit contributions
 narrative-0.083
 obstacle-0.078
 ang-0.078
 Moral-0.077
 mistake-0.077
Token and
Token ID290
Feature activation+7.41e-03

Pos logit contributions
cious+0.357
mare+0.322
pter+0.308
Ch+0.307
suit+0.305
Neg logit contributions
 ang-0.260
 mistake-0.259
 mad-0.254
 narrative-0.249
 Moral-0.243
Token quickly
Token ID2952
Feature activation+3.84e-02

Pos logit contributions
 narrative+0.012
 ang+0.012
 Moral+0.011
 obstacle+0.011
 mistake+0.011
Neg logit contributions
cious-0.011
ze-0.010
coat-0.009
mare-0.009
we-0.009
Token ran
Token ID4966
Feature activation+3.35e-03

Pos logit contributions
 narrative+0.066
 ang+0.065
 obstacle+0.064
 mistake+0.063
 trouble+0.062
Neg logit contributions
cious-0.052
we-0.045
mare-0.045
ze-0.045
Ch-0.043
Token off
Token ID572
Feature activation+4.21e-02

Pos logit contributions
 narrative+0.006
 obstacle+0.005
 mistake+0.005
 Moral+0.005
 ang+0.005
Neg logit contributions
cious-0.005
ze-0.005
suit-0.005
coat-0.004
mare-0.004
Token to
Token ID284
Feature activation-1.99e-02

Pos logit contributions
 narrative+0.088
 obstacle+0.081
 ang+0.077
 mistake+0.077
 Moral+0.076
Neg logit contributions
cious-0.050
ze-0.041
mare-0.039
coat-0.037
suit-0.037
Token a
Token ID257
Feature activation-1.23e-02

Pos logit contributions
 trouble+0.024
 narrative+0.020
 mad+0.020
 ang+0.017
Less+0.016
Neg logit contributions
cious-0.053
mare-0.051
suit-0.051
rito-0.049
oster-0.049
Token big
Token ID1263
Feature activation-2.14e-03

Pos logit contributions
cious+0.017
ze+0.015
mare+0.014
we+0.014
coat+0.014
Neg logit contributions
 narrative-0.022
 obstacle-0.021
 ang-0.020
 mistake-0.020
 trouble-0.020
Token,
Token ID11
Feature activation+4.94e-03

Pos logit contributions
 hat+0.002
cious+0.002
 nose+0.002
coat+0.002
 tail+0.002
Neg logit contributions
Less-0.004
 narrative-0.004
 ang-0.004
 obstacle-0.004
uct-0.004
Token scary
Token ID14343
Feature activation+1.08e-02

Pos logit contributions
 narrative+0.009
 obstacle+0.008
 ang+0.008
 trouble+0.008
 mad+0.008
Neg logit contributions
cious-0.007
ze-0.006
mare-0.005
we-0.005
suit-0.005
Token sne
Token ID10505
Feature activation+1.57e-02

Pos logit contributions
 narrative+0.019
 obstacle+0.018
 ang+0.018
 Moral+0.017
Less+0.017
Neg logit contributions
cious-0.015
ze-0.013
mare-0.013
coat-0.012
suit-0.012
Tokene
Token ID68
Feature activation-2.10e-01

Pos logit contributions
 narrative+0.029
 mad+0.027
 trouble+0.026
 mistake+0.026
 obstacle+0.025
Neg logit contributions
cious-0.022
 groom-0.017
suit-0.017
mare-0.016
pron-0.016
Tokenze
Token ID2736
Feature activation-1.89e-01

Pos logit contributions
cious+0.212
rito+0.176
mare+0.174
suit+0.160
pter+0.158
Neg logit contributions
 mistake-0.407
 wreck-0.394
 mad-0.382
 badly-0.377
 altogether-0.362
Token.
Token ID13
Feature activation+7.91e-04

Pos logit contributions
cious+0.346
mare+0.284
pter+0.276
suit+0.272
 majesty+0.269
Neg logit contributions
 narrative-0.259
 trouble-0.246
 obstacle-0.244
 ang-0.241
 mistake-0.241
Token Every
Token ID3887
Feature activation-1.43e-03

Pos logit contributions
 narrative+0.001
 obstacle+0.001
 ang+0.001
 mistake+0.001
 trouble+0.001
Neg logit contributions
cious-0.001
ze-0.001
mare-0.001
coat-0.001
Ch-0.001
Token time
Token ID640
Feature activation-3.05e-02

Pos logit contributions
cious+0.003
bes+0.003
we+0.003
pter+0.003
ouses+0.003
Neg logit contributions
 narrative-0.001
 problem-0.001
 trouble-0.001
 obstacle-0.001
 mistake-0.001
Token he
Token ID339
Feature activation+2.34e-02

Pos logit contributions
cious+0.046
ze+0.040
Ch+0.039
coat+0.039
we+0.039
Neg logit contributions
 narrative-0.049
 obstacle-0.046
 mistake-0.045
 ang-0.044
Less-0.043
Tokenmy
Token ID1820
Feature activation+3.65e-02

Pos logit contributions
 narrative+0.065
 obstacle+0.059
 trouble+0.054
 ang+0.053
 mistake+0.052
Neg logit contributions
cious-0.105
ze-0.095
mare-0.093
coat-0.091
suit-0.089
Token said
Token ID531
Feature activation+3.89e-03

Pos logit contributions
 narrative+0.061
 obstacle+0.057
 ang+0.056
 mistake+0.055
 trouble+0.054
Neg logit contributions
cious-0.056
ze-0.048
mare-0.046
coat-0.045
suit-0.044
Token.
Token ID13
Feature activation-2.95e-03

Pos logit contributions
 ang+0.005
 mad+0.005
 mistake+0.005
 narrative+0.005
 trouble+0.005
Neg logit contributions
cious-0.007
mare-0.006
we-0.006
Ch-0.006
suit-0.006
Token "
Token ID366
Feature activation-2.96e-02

Pos logit contributions
cious+0.004
ze+0.004
mare+0.004
coat+0.003
suit+0.003
Neg logit contributions
 narrative-0.005
 obstacle-0.005
 ang-0.005
 Moral-0.005
 mistake-0.005
TokenBy
Token ID3886
Feature activation+2.19e-02

Pos logit contributions
cious+0.050
ze+0.044
mare+0.044
ves+0.043
suit+0.042
Neg logit contributions
 narrative-0.047
Less-0.041
 Moral-0.039
 List-0.037
 moral-0.036
Tokene
Token ID68
Feature activation-2.10e-01

Pos logit contributions
 narrative+0.048
 obstacle+0.044
 mistake+0.044
 ang+0.044
 trouble+0.044
Neg logit contributions
cious-0.023
suit-0.017
coat-0.017
ze-0.016
mare-0.016
Token Tim
Token ID5045
Feature activation+5.25e-02

Pos logit contributions
cious+0.368
we+0.311
pter+0.311
suit+0.308
Ch+0.307
Neg logit contributions
 mistake-0.262
 mad-0.262
 ang-0.258
 obstacle-0.246
 trouble-0.245
Tokenmy
Token ID1820
Feature activation+3.61e-02

Pos logit contributions
 narrative+0.065
 obstacle+0.058
 ang+0.055
 Moral+0.053
Less+0.053
Neg logit contributions
cious-0.103
ze-0.096
mare-0.092
coat-0.091
we-0.089
Token,
Token ID11
Feature activation+2.78e-03

Pos logit contributions
 narrative+0.059
 obstacle+0.055
 ang+0.054
 mistake+0.053
 trouble+0.053
Neg logit contributions
cious-0.056
ze-0.048
mare-0.047
suit-0.045
coat-0.045
Token you
Token ID345
Feature activation+2.52e-02

Pos logit contributions
 narrative+0.005
 ang+0.004
 obstacle+0.004
 mistake+0.004
Less+0.004
Neg logit contributions
cious-0.004
mare-0.004
ze-0.004
 songs-0.004
coat-0.004
Token're
Token ID821
Feature activation+1.33e-02

Pos logit contributions
 narrative+0.038
 obstacle+0.035
 ang+0.035
 trouble+0.034
 mistake+0.034
Neg logit contributions
cious-0.043
ze-0.036
we-0.035
mare-0.035
coat-0.034
Token Lily
Token ID20037
Feature activation-9.50e-03

Pos logit contributions
 narrative+0.007
 obstacle+0.006
 ang+0.006
 mistake+0.006
 trouble+0.006
Neg logit contributions
cious-0.008
ze-0.007
mare-0.007
coat-0.007
suit-0.007
Token said
Token ID531
Feature activation+2.06e-03

Pos logit contributions
cious+0.017
we+0.015
pter+0.015
ze+0.014
abies+0.014
Neg logit contributions
 mistake-0.012
 ang-0.012
 mad-0.011
 trouble-0.011
 apolog-0.011
Token,
Token ID11
Feature activation+2.72e-03

Pos logit contributions
 narrative+0.003
 trouble+0.002
 ang+0.002
 mad+0.002
 Moral+0.002
Neg logit contributions
cious-0.004
we-0.003
bes-0.003
mare-0.003
Ch-0.003
Token "
Token ID366
Feature activation-3.02e-02

Pos logit contributions
 Moral+0.004
 narrative+0.004
 ang+0.004
Less+0.004
 Enc+0.004
Neg logit contributions
cious-0.004
ze-0.004
kel-0.004
pter-0.004
imm-0.004
TokenBy
Token ID3886
Feature activation+1.99e-02

Pos logit contributions
cious+0.053
 majesty+0.045
ze+0.045
mare+0.044
rito+0.044
Neg logit contributions
 narrative-0.045
 Moral-0.042
Less-0.041
 trouble-0.037
 Enc-0.036
Tokene
Token ID68
Feature activation-2.10e-01

Pos logit contributions
 narrative+0.044
 mistake+0.040
 trouble+0.040
 obstacle+0.040
 Moral+0.040
Neg logit contributions
cious-0.021
suit-0.016
coat-0.016
lic-0.015
mare-0.014
Token,
Token ID11
Feature activation+4.07e-03

Pos logit contributions
cious+0.423
we+0.377
pter+0.373
rito+0.368
Ch+0.363
Neg logit contributions
 mistake-0.216
 mad-0.211
 ang-0.202
 trouble-0.196
 obstacle-0.185
Token Max
Token ID5436
Feature activation+1.08e-02

Pos logit contributions
 narrative+0.006
 obstacle+0.006
 ang+0.005
 mistake+0.005
 Moral+0.005
Neg logit contributions
cious-0.007
ze-0.006
mare-0.006
coat-0.006
suit-0.006
Token!
Token ID0
Feature activation-4.42e-03

Pos logit contributions
 mistake+0.013
 trouble+0.013
 narrative+0.013
 obstacle+0.013
 mad+0.013
Neg logit contributions
cious-0.020
ze-0.017
mare-0.017
pter-0.017
suit-0.017
Token Come
Token ID7911
Feature activation+2.11e-03

Pos logit contributions
cious+0.008
ze+0.007
we+0.007
bes+0.007
coat+0.007
Neg logit contributions
 narrative-0.006
 Moral-0.005
 trouble-0.005
 Enc-0.005
Less-0.005
Token back
Token ID736
Feature activation+6.74e-02

Pos logit contributions
 narrative+0.003
 ang+0.002
 trouble+0.002
 obstacle+0.002
 mad+0.002
Neg logit contributions
cious-0.004
ze-0.004
mare-0.003
we-0.003
bes-0.003
Token her
Token ID607
Feature activation-6.35e-02

Pos logit contributions
 narrative+0.040
 obstacle+0.036
 ang+0.036
 mistake+0.035
 trouble+0.034
Neg logit contributions
cious-0.052
ze-0.046
mare-0.044
coat-0.043
suit-0.043
Token walk
Token ID2513
Feature activation+1.86e-02

Pos logit contributions
cious+0.109
mare+0.099
we+0.097
coat+0.097
lic+0.096
Neg logit contributions
 narrative-0.089
 obstacle-0.087
 trouble-0.087
 mistake-0.085
 problem-0.083
Token,
Token ID11
Feature activation+9.42e-04

Pos logit contributions
 narrative+0.026
 obstacle+0.024
 ang+0.024
 trouble+0.023
 Moral+0.023
Neg logit contributions
cious-0.033
ze-0.029
mare-0.027
we-0.027
Ch-0.027
Token R
Token ID371
Feature activation-6.38e-02

Pos logit contributions
 trouble+0.001
 ang+0.001
 narrative+0.001
 tragedy+0.001
 Enc+0.001
Neg logit contributions
cious-0.002
coat-0.001
ze-0.001
bes-0.001
suit-0.001
Tokenyle
Token ID2349
Feature activation+6.88e-02

Pos logit contributions
cious+0.091
mare+0.078
ze+0.078
coat+0.075
suit+0.073
Neg logit contributions
 narrative-0.112
 obstacle-0.102
 mistake-0.101
 trouble-0.099
 ang-0.099
Tokene
Token ID68
Feature activation-2.10e-01

Pos logit contributions
 narrative+0.110
 mistake+0.102
 obstacle+0.102
 Moral+0.099
 ang+0.098
Neg logit contributions
cious-0.100
coat-0.087
 songs-0.086
balls-0.085
we-0.085
Token stumbled
Token ID24241
Feature activation+2.97e-02

Pos logit contributions
mare+0.279
cious+0.278
rito+0.278
pter+0.273
ectar+0.260
Neg logit contributions
 mistake-0.273
 mad-0.257
 trouble-0.249
 ang-0.247
 problem-0.246
Token across
Token ID1973
Feature activation+1.32e-02

Pos logit contributions
 narrative+0.043
 obstacle+0.039
 ang+0.038
 mistake+0.037
 trouble+0.037
Neg logit contributions
cious-0.052
ze-0.046
mare-0.045
coat-0.044
suit-0.044
Token a
Token ID257
Feature activation-1.54e-02

Pos logit contributions
 narrative+0.022
 obstacle+0.020
 ang+0.020
 mistake+0.019
 Moral+0.019
Neg logit contributions
cious-0.020
ze-0.018
mare-0.017
we-0.016
coat-0.016
Token mysterious
Token ID11428
Feature activation+1.70e-02

Pos logit contributions
cious+0.022
ze+0.019
mare+0.018
suit+0.017
coat+0.017
Neg logit contributions
 narrative-0.029
Less-0.025
 Moral-0.025
 ang-0.025
 obstacle-0.024
Token cave
Token ID11527
Feature activation+2.95e-02

Pos logit contributions
 narrative+0.031
Less+0.028
 ang+0.028
 Moral+0.027
 obstacle+0.026
Neg logit contributions
cious-0.024
ze-0.021
mare-0.021
suit-0.020
coat-0.020
Token
Token ID198
Feature activation-4.81e-03

Pos logit contributions
 narrative+0.066
 Moral+0.061
 trouble+0.060
Less+0.060
 ang+0.060
Neg logit contributions
cious-0.073
ze-0.064
mare-0.062
suit-0.060
coat-0.059
Token
Token ID198
Feature activation-4.58e-03

Pos logit contributions
cious+0.008
ze+0.007
coat+0.007
suit+0.007
mare+0.007
Neg logit contributions
 narrative-0.007
 Moral-0.006
Less-0.006
 obstacle-0.006
 ang-0.006
TokenWhenever
Token ID28877
Feature activation-1.12e-01

Pos logit contributions
cious+0.009
ze+0.008
mare+0.008
suit+0.008
coat+0.008
Neg logit contributions
 narrative-0.006
 obstacle-0.005
 Moral-0.005
Less-0.005
 ang-0.005
Token R
Token ID371
Feature activation-6.44e-02

Pos logit contributions
cious+0.167
ze+0.147
coat+0.146
bes+0.140
suit+0.140
Neg logit contributions
 narrative-0.186
 trouble-0.175
 obstacle-0.174
 Moral-0.169
 ang-0.169
Tokenyle
Token ID2349
Feature activation+6.50e-02

Pos logit contributions
cious+0.098
ze+0.083
mare+0.080
we+0.078
suit+0.078
Neg logit contributions
 narrative-0.109
 obstacle-0.103
 ang-0.103
 trouble-0.098
 mistake-0.097
Tokene
Token ID68
Feature activation-2.10e-01

Pos logit contributions
 narrative+0.126
 obstacle+0.116
 ang+0.114
 mistake+0.114
 Moral+0.112
Neg logit contributions
cious-0.082
ze-0.066
mare-0.065
coat-0.064
suit-0.063
Token went
Token ID1816
Feature activation+5.73e-02

Pos logit contributions
cious+0.311
mare+0.283
suit+0.264
Ch+0.264
rito+0.262
Neg logit contributions
 narrative-0.310
 mistake-0.308
 mad-0.296
 problem-0.292
 obstacle-0.289
Token for
Token ID329
Feature activation+8.64e-02

Pos logit contributions
 narrative+0.116
 obstacle+0.104
Less+0.104
 mistake+0.102
 Moral+0.100
Neg logit contributions
cious-0.067
ze-0.059
mare-0.058
coat-0.055
suit-0.054
Token a
Token ID257
Feature activation-1.13e-02

Pos logit contributions
 narrative+0.148
 obstacle+0.137
 ang+0.134
 mistake+0.130
 Moral+0.130
Neg logit contributions
cious-0.129
ze-0.112
mare-0.108
coat-0.103
 songs-0.102
Token walk
Token ID2513
Feature activation+1.66e-02

Pos logit contributions
cious+0.019
we+0.017
ze+0.017
bes+0.017
pter+0.016
Neg logit contributions
 obstacle-0.015
 ang-0.015
 mistake-0.015
 problem-0.015
 mad-0.015
Token in
Token ID287
Feature activation+1.90e-02

Pos logit contributions
 narrative+0.023
 obstacle+0.021
 ang+0.021
 trouble+0.020
 mistake+0.020
Neg logit contributions
cious-0.030
ze-0.027
mare-0.026
we-0.025
Ch-0.025
Token crown
Token ID12389
Feature activation-8.06e-02

Pos logit contributions
cious+0.022
ze+0.018
mare+0.018
we+0.017
coat+0.017
Neg logit contributions
 narrative-0.027
 obstacle-0.025
 ang-0.025
 mistake-0.025
 trouble-0.025
Token and
Token ID290
Feature activation+5.96e-03

Pos logit contributions
cious+0.164
ze+0.145
Ch+0.143
mare+0.141
we+0.140
Neg logit contributions
 narrative-0.085
 ang-0.076
 obstacle-0.076
 trouble-0.075
 mistake-0.074
Token a
Token ID257
Feature activation-1.50e-02

Pos logit contributions
 narrative+0.009
 obstacle+0.008
 trouble+0.008
 ang+0.008
 mistake+0.008
Neg logit contributions
cious-0.010
ze-0.008
mare-0.008
suit-0.008
coat-0.008
Token s
Token ID264
Feature activation+2.18e-02

Pos logit contributions
cious+0.020
ze+0.016
mare+0.016
we+0.015
Ch+0.015
Neg logit contributions
 narrative-0.028
 obstacle-0.026
 mistake-0.026
 ang-0.026
 trouble-0.026
Tokence
Token ID344
Feature activation-5.29e-02

Pos logit contributions
 narrative+0.031
 obstacle+0.029
 ang+0.028
 mistake+0.028
 trouble+0.028
Neg logit contributions
cious-0.038
ze-0.034
mare-0.033
coat-0.033
we-0.032
Tokenpter
Token ID42104
Feature activation-2.10e-01

Pos logit contributions
cious+0.051
mare+0.051
lic+0.050
Ch+0.049
Rain+0.047
Neg logit contributions
 narrative-0.100
 mistake-0.092
 problem-0.089
 enough-0.088
 mad-0.087
Token.
Token ID13
Feature activation-3.40e-03

Pos logit contributions
cious+0.412
Ch+0.381
mare+0.357
ze+0.357
Rain+0.346
Neg logit contributions
 narrative-0.233
 mad-0.227
 mistake-0.215
 trouble-0.212
 enough-0.210
Token They
Token ID1119
Feature activation-3.21e-02

Pos logit contributions
cious+0.007
ze+0.007
coat+0.007
mare+0.007
we+0.006
Neg logit contributions
 narrative-0.003
 Moral-0.003
 trouble-0.003
 Enc-0.003
Less-0.003
Token make
Token ID787
Feature activation-9.34e-02

Pos logit contributions
cious+0.040
ze+0.035
pter+0.033
 danced+0.033
 sang+0.032
Neg logit contributions
 narrative-0.059
 obstacle-0.057
 ang-0.057
 tragedy-0.056
 trouble-0.056
Token up
Token ID510
Feature activation+6.24e-02

Pos logit contributions
cious+0.169
mare+0.149
coat+0.145
suit+0.145
Ch+0.138
Neg logit contributions
 ang-0.125
 narrative-0.125
 obstacle-0.121
 trouble-0.121
 mistake-0.120
Token stories
Token ID3923
Feature activation-2.80e-02

Pos logit contributions
 narrative+0.112
 obstacle+0.102
Less+0.100
 Moral+0.098
 ang+0.097
Neg logit contributions
cious-0.088
 songs-0.079
ze-0.079
pter-0.072
mare-0.070
Token Tom
Token ID4186
Feature activation+4.05e-02

Pos logit contributions
cious+0.047
ze+0.039
mare+0.039
coat+0.038
we+0.037
Neg logit contributions
 narrative-0.059
 obstacle-0.058
 ang-0.054
 mistake-0.053
 trouble-0.052
Token said
Token ID531
Feature activation+1.99e-04

Pos logit contributions
 narrative+0.075
 obstacle+0.071
 ang+0.068
 trouble+0.067
 mistake+0.067
Neg logit contributions
cious-0.055
ze-0.047
mare-0.045
coat-0.043
suit-0.043
Token,
Token ID11
Feature activation+2.54e-03

Pos logit contributions
 ang+0.000
 narrative+0.000
 mad+0.000
 mistake+0.000
 trouble+0.000
Neg logit contributions
cious-0.000
mare-0.000
pter-0.000
Ch-0.000
ox-0.000
Token "
Token ID366
Feature activation-3.36e-02

Pos logit contributions
 ang+0.004
 Moral+0.004
 narrative+0.004
 mad+0.004
 mistake+0.004
Neg logit contributions
cious-0.004
pter-0.003
suit-0.003
ze-0.003
ox-0.003
TokenBy
Token ID3886
Feature activation+2.15e-02

Pos logit contributions
cious+0.067
ox+0.061
mare+0.060
aby+0.060
ze+0.059
Neg logit contributions
 narrative-0.040
Less-0.037
 Moral-0.035
 Enc-0.029
 List-0.028
Tokene
Token ID68
Feature activation-2.10e-01

Pos logit contributions
 narrative+0.049
 mistake+0.045
 obstacle+0.045
 trouble+0.045
 Moral+0.045
Neg logit contributions
cious-0.021
suit-0.015
coat-0.015
ze-0.014
mare-0.014
Token,
Token ID11
Feature activation+3.34e-03

Pos logit contributions
cious+0.418
rito+0.383
pter+0.378
we+0.375
Ch+0.368
Neg logit contributions
 mistake-0.205
 mad-0.203
 ang-0.182
 trouble-0.177
 obstacle-0.170
Token Sally
Token ID25737
Feature activation-2.39e-02

Pos logit contributions
 narrative+0.006
 ang+0.005
 obstacle+0.005
 mistake+0.005
 trouble+0.005
Neg logit contributions
cious-0.005
ze-0.004
mare-0.004
coat-0.004
suit-0.004
Token.
Token ID13
Feature activation-1.77e-03

Pos logit contributions
cious+0.042
ze+0.036
mare+0.035
suit+0.035
Ch+0.034
Neg logit contributions
 narrative-0.032
 mistake-0.030
 trouble-0.030
 obstacle-0.029
 Moral-0.029
Token See
Token ID4091
Feature activation-3.10e-02

Pos logit contributions
cious+0.003
ze+0.003
we+0.003
ves+0.003
bes+0.003
Neg logit contributions
 narrative-0.002
 Moral-0.002
Less-0.002
 Enc-0.002
 trouble-0.002
Token you
Token ID345
Feature activation+1.88e-02

Pos logit contributions
cious+0.060
ze+0.052
mare+0.051
suit+0.051
coat+0.050
Neg logit contributions
 narrative-0.040
 obstacle-0.037
 ang-0.036
 mistake-0.035
 trouble-0.035
Token�
Token ID95
Feature activation+7.03e-02

Pos logit contributions
 narrative+0.094
 obstacle+0.087
 ang+0.083
 mistake+0.082
 Moral+0.082
Neg logit contributions
cious-0.109
ze-0.096
mare-0.095
we-0.092
coat-0.092
Token€
Token ID26391
Feature activation+2.14e-02

Pos logit contributions
 narrative+0.112
 Moral+0.104
 obstacle+0.102
Less+0.102
 mistake+0.098
Neg logit contributions
cious-0.107
bel-0.094
ze-0.093
coat-0.092
we-0.092
Token�
Token ID129
Feature activation+2.06e-02

Pos logit contributions
 trouble+0.028
 narrative+0.028
 Moral+0.026
�+0.026
Less+0.025
Neg logit contributions
rito-0.033
 songs-0.032
 groom-0.031
oster-0.030
ze-0.029
Token�
Token ID241
Feature activation-7.90e-02

Pos logit contributions
 narrative+0.037
 obstacle+0.035
 ang+0.034
 mistake+0.034
 Moral+0.033
Neg logit contributions
cious-0.028
mare-0.024
ze-0.023
we-0.023
suit-0.023
TokenBy
Token ID3886
Feature activation+2.44e-02

Pos logit contributions
cious+0.120
ze+0.111
mare+0.110
suit+0.104
oster+0.101
Neg logit contributions
 narrative-0.121
 trouble-0.114
Less-0.111
 Moral-0.110
 mad-0.107
Tokene
Token ID68
Feature activation-2.09e-01

Pos logit contributions
 narrative+0.056
 obstacle+0.052
 Moral+0.051
 mistake+0.051
 ang+0.051
Neg logit contributions
cious-0.023
suit-0.016
coat-0.016
ze-0.016
mare-0.016
Token!
Token ID0
Feature activation-6.79e-03

Pos logit contributions
cious+0.307
Ch+0.267
pter+0.263
we+0.261
mare+0.261
Neg logit contributions
 mistake-0.307
 mad-0.293
 ang-0.285
 narrative-0.284
 obstacle-0.282
Tokenâ
Token ID22940
Feature activation+1.19e-01

Pos logit contributions
cious+0.011
ze+0.010
mare+0.009
coat+0.009
suit+0.009
Neg logit contributions
 narrative-0.010
 Moral-0.009
 obstacle-0.009
 ang-0.009
 trouble-0.009
Token€
Token ID26391
Feature activation+2.15e-02

Pos logit contributions
 narrative+0.197
 obstacle+0.183
 ang+0.180
 mistake+0.176
 trouble+0.176
Neg logit contributions
cious-0.185
ze-0.160
mare-0.155
coat-0.149
suit-0.149
Token Then
Token ID3244
Feature activation-1.70e-03

Pos logit contributions
 narrative+0.029
 trouble+0.028
 Moral+0.027
Less+0.026
 mistake+0.026
Neg logit contributions
cious-0.033
rito-0.032
 songs-0.032
ze-0.032
 groom-0.031
Token she
Token ID673
Feature activation-2.24e-02

Pos logit contributions
cious+0.003
ze+0.003
coat+0.003
mare+0.003
Ch+0.003
Neg logit contributions
 narrative-0.002
 trouble-0.002
 obstacle-0.002
 mistake-0.002
 Moral-0.002
Token.
Token ID13
Feature activation-1.34e-02

Pos logit contributions
cious+0.005
mare+0.004
ze+0.004
coat+0.004
suit+0.004
Neg logit contributions
 narrative-0.005
 ang-0.004
 mistake-0.004
 mad-0.004
 Moral-0.004
Token
Token ID198
Feature activation-2.24e-02

Pos logit contributions
cious+0.026
ze+0.024
coat+0.023
mare+0.022
kel+0.022
Neg logit contributions
 narrative-0.017
 Moral-0.017
 ang-0.015
 trouble-0.015
Less-0.015
Token
Token ID198
Feature activation-2.03e-02

Pos logit contributions
cious+0.038
ze+0.033
mare+0.032
coat+0.032
suit+0.032
Neg logit contributions
 narrative-0.034
 Moral-0.032
 obstacle-0.031
 mistake-0.031
 ang-0.031
Token"
Token ID1
Feature activation-6.57e-02

Pos logit contributions
cious+0.030
mare+0.027
coat+0.027
ze+0.027
suit+0.026
Neg logit contributions
 narrative-0.035
 obstacle-0.031
 Moral-0.031
Less-0.031
 ang-0.031
TokenBy
Token ID3886
Feature activation+2.44e-02

Pos logit contributions
cious+0.124
mare+0.117
ze+0.116
ox+0.115
 majesty+0.112
Neg logit contributions
 narrative-0.097
Less-0.081
 Moral-0.078
 moral-0.072
oral-0.068
Tokene
Token ID68
Feature activation-2.09e-01

Pos logit contributions
 narrative+0.056
 obstacle+0.052
 mistake+0.051
 Moral+0.051
 trouble+0.051
Neg logit contributions
cious-0.024
coat-0.017
suit-0.017
ze-0.016
mare-0.016
Token,
Token ID11
Feature activation-1.48e-03

Pos logit contributions
cious+0.410
we+0.356
Ch+0.355
pter+0.355
mare+0.353
Neg logit contributions
 mistake-0.220
 mad-0.215
 ang-0.211
 narrative-0.210
 obstacle-0.202
Token Ben
Token ID3932
Feature activation-2.03e-02

Pos logit contributions
cious+0.002
ze+0.002
mare+0.002
suit+0.002
coat+0.002
Neg logit contributions
 narrative-0.002
 obstacle-0.002
 ang-0.002
 mistake-0.002
 Moral-0.002
Token and
Token ID290
Feature activation+5.65e-03

Pos logit contributions
cious+0.035
ze+0.030
mare+0.029
coat+0.029
we+0.028
Neg logit contributions
 narrative-0.030
 obstacle-0.028
 ang-0.027
 mistake-0.027
 trouble-0.027
Token Lily
Token ID20037
Feature activation-1.15e-02

Pos logit contributions
 narrative+0.009
 obstacle+0.009
 ang+0.009
 mistake+0.008
 trouble+0.008
Neg logit contributions
cious-0.009
ze-0.008
mare-0.007
suit-0.007
coat-0.007
Token.
Token ID13
Feature activation-1.15e-02

Pos logit contributions
cious+0.017
ze+0.015
mare+0.015
coat+0.014
suit+0.014
Neg logit contributions
 narrative-0.019
 obstacle-0.018
 ang-0.018
 mistake-0.017
 trouble-0.017
Token the
Token ID262
Feature activation+1.02e-02

Pos logit contributions
 narrative+0.044
 obstacle+0.039
oral+0.038
 mistake+0.038
 Moral+0.037
Neg logit contributions
coat-0.014
we-0.014
cious-0.014
 pet-0.013
Rain-0.013
Token slide
Token ID10649
Feature activation+2.39e-02

Pos logit contributions
 narrative+0.018
 ang+0.016
 obstacle+0.016
 Moral+0.016
 mistake+0.015
Neg logit contributions
cious-0.015
coat-0.014
ze-0.013
suit-0.013
we-0.013
Token.
Token ID13
Feature activation-2.89e-03

Pos logit contributions
 narrative+0.048
 obstacle+0.044
 ang+0.043
 Moral+0.043
 trouble+0.043
Neg logit contributions
cious-0.030
ze-0.025
coat-0.024
suit-0.024
we-0.023
Token Whe
Token ID5792
Feature activation-4.55e-02

Pos logit contributions
cious+0.006
ze+0.005
pter+0.005
we+0.005
mare+0.005
Neg logit contributions
 Moral-0.003
 narrative-0.003
 trouble-0.003
 ang-0.003
Less-0.003
Tokenee
Token ID1453
Feature activation-5.06e-02

Pos logit contributions
cious+0.060
ze+0.049
suit+0.046
mare+0.045
balls+0.043
Neg logit contributions
 narrative-0.084
 mad-0.081
 ang-0.080
 obstacle-0.079
 mistake-0.078
Tokene
Token ID68
Feature activation-2.09e-01

Pos logit contributions
cious+0.085
mare+0.083
pter+0.077
suit+0.076
 groom+0.075
Neg logit contributions
 mad-0.061
 mistake-0.058
 wreck-0.055
 obstacle-0.055
 narrative-0.054
Token!
Token ID0
Feature activation-6.24e-03

Pos logit contributions
mare+0.373
cious+0.352
rito+0.347
pter+0.346
oster+0.322
Neg logit contributions
 mad-0.239
 mistake-0.229
 badly-0.213
 trouble-0.205
 ang-0.205
Token
Token ID198
Feature activation-1.37e-02

Pos logit contributions
cious+0.010
ze+0.009
mare+0.008
we+0.008
coat+0.008
Neg logit contributions
 narrative-0.010
 obstacle-0.009
 ang-0.009
 Moral-0.009
 trouble-0.009
Token
Token ID198
Feature activation-1.24e-02

Pos logit contributions
cious+0.024
ze+0.021
mare+0.020
suit+0.020
coat+0.019
Neg logit contributions
 narrative-0.020
 Moral-0.018
 trouble-0.018
 mistake-0.018
Less-0.018
TokenAfter
Token ID3260
Feature activation+2.82e-02

Pos logit contributions
cious+0.019
pter+0.016
mare+0.016
ze+0.016
suit+0.016
Neg logit contributions
 narrative-0.020
Less-0.018
 obstacle-0.018
 Moral-0.018
 mistake-0.017
Token playing
Token ID2712
Feature activation-5.82e-02

Pos logit contributions
 narrative+0.050
 obstacle+0.047
 mistake+0.045
 ang+0.045
 Moral+0.044
Neg logit contributions
cious-0.041
ze-0.035
mare-0.033
we-0.033
coat-0.033
Token when
Token ID618
Feature activation-8.96e-02

Pos logit contributions
cious+0.026
ze+0.022
mare+0.022
suit+0.021
we+0.021
Neg logit contributions
 narrative-0.023
 obstacle-0.021
 ang-0.021
 mistake-0.021
 trouble-0.020
Token Sam
Token ID3409
Feature activation-2.63e-02

Pos logit contributions
cious+0.125
ze+0.108
mare+0.102
we+0.101
coat+0.099
Neg logit contributions
 narrative-0.159
 obstacle-0.149
 ang-0.147
 mistake-0.143
 Moral-0.142
Token started
Token ID2067
Feature activation-2.30e-02

Pos logit contributions
cious+0.035
mare+0.031
ze+0.030
we+0.028
pter+0.028
Neg logit contributions
 narrative-0.047
 ang-0.044
 mistake-0.044
 trouble-0.044
 obstacle-0.044
Token to
Token ID284
Feature activation-1.53e-02

Pos logit contributions
cious+0.020
ze+0.016
coat+0.014
we+0.014
 songs+0.014
Neg logit contributions
 narrative-0.055
 obstacle-0.049
 Moral-0.048
Less-0.048
 ang-0.047
Token sne
Token ID10505
Feature activation+2.07e-02

Pos logit contributions
ze+0.018
cious+0.018
coat+0.017
 pretended+0.015
 sang+0.015
Neg logit contributions
 narrative-0.031
oral-0.027
 obstacle-0.027
 Moral-0.027
Less-0.026
Tokene
Token ID68
Feature activation-2.09e-01

Pos logit contributions
 narrative+0.040
 trouble+0.039
 mad+0.038
 obstacle+0.038
 ang+0.038
Neg logit contributions
cious-0.026
mare-0.020
suit-0.020
we-0.019
pron-0.019
Tokenze
Token ID2736
Feature activation-1.86e-01

Pos logit contributions
cious+0.168
mare+0.168
pter+0.118
rito+0.118
suit+0.111
Neg logit contributions
 ang-0.460
 mad-0.454
 mistake-0.451
 obstacle-0.429
 altogether-0.424
Token.
Token ID13
Feature activation+1.22e-03

Pos logit contributions
cious+0.292
mare+0.245
ze+0.238
suit+0.231
coat+0.230
Neg logit contributions
 narrative-0.303
 obstacle-0.287
 ang-0.284
 trouble-0.278
 mistake-0.275
Token
Token ID198
Feature activation-5.46e-03

Pos logit contributions
 Moral+0.001
 narrative+0.001
 ang+0.001
 trouble+0.001
Less+0.001
Neg logit contributions
cious-0.002
ze-0.002
we-0.002
mare-0.002
pter-0.002
Token
Token ID198
Feature activation-4.60e-03

Pos logit contributions
cious+0.011
ze+0.010
mare+0.009
coat+0.009
suit+0.009
Neg logit contributions
 narrative-0.006
 Moral-0.006
 trouble-0.006
Less-0.006
 mistake-0.006
TokenL
Token ID43
Feature activation+1.85e-02

Pos logit contributions
cious+0.007
ze+0.006
mare+0.006
pter+0.006
we+0.006
Neg logit contributions
 narrative-0.007
 Moral-0.007
 obstacle-0.007
Less-0.007
 ang-0.006
Token�
Token ID95
Feature activation+7.11e-02

Pos logit contributions
 narrative+0.117
 obstacle+0.110
 ang+0.108
 mistake+0.106
 Moral+0.105
Neg logit contributions
cious-0.091
ze-0.077
mare-0.074
coat-0.071
we-0.071
Token€
Token ID26391
Feature activation+2.10e-02

Pos logit contributions
 narrative+0.103
 Moral+0.095
 obstacle+0.094
Less+0.092
 ang+0.091
Neg logit contributions
cious-0.119
ze-0.102
mare-0.100
we-0.100
coat-0.099
Token�
Token ID129
Feature activation+2.21e-02

Pos logit contributions
 narrative+0.030
 trouble+0.030
 mad+0.028
 Moral+0.028
 obstacle+0.027
Neg logit contributions
cious-0.029
rito-0.029
 songs-0.028
ze-0.027
 groom-0.027
Token�
Token ID241
Feature activation-7.53e-02

Pos logit contributions
 narrative+0.038
 obstacle+0.036
 ang+0.036
 mistake+0.035
 trouble+0.035
Neg logit contributions
cious-0.032
mare-0.027
ze-0.026
we-0.026
suit-0.026
TokenBy
Token ID3886
Feature activation+2.74e-02

Pos logit contributions
cious+0.115
ze+0.105
mare+0.103
bes+0.096
suit+0.095
Neg logit contributions
 narrative-0.118
 trouble-0.110
Less-0.108
 Moral-0.107
 obstacle-0.106
Tokene
Token ID68
Feature activation-2.09e-01

Pos logit contributions
 narrative+0.058
 obstacle+0.053
 mistake+0.052
 Moral+0.052
 trouble+0.052
Neg logit contributions
cious-0.032
suit-0.024
coat-0.024
ze-0.023
mare-0.023
Token now
Token ID783
Feature activation-3.65e-02

Pos logit contributions
cious+0.329
pter+0.283
we+0.279
Ch+0.274
mare+0.274
Neg logit contributions
 mistake-0.285
 mad-0.278
 ang-0.272
 narrative-0.271
 obstacle-0.267
Token!
Token ID0
Feature activation-1.56e-03

Pos logit contributions
cious+0.066
ze+0.057
mare+0.056
coat+0.054
suit+0.054
Neg logit contributions
 narrative-0.052
 obstacle-0.048
 ang-0.046
 mistake-0.045
 trouble-0.045
Tokenâ
Token ID22940
Feature activation+1.20e-01

Pos logit contributions
cious+0.003
ze+0.002
mare+0.002
coat+0.002
suit+0.002
Neg logit contributions
 narrative-0.002
 obstacle-0.002
 Moral-0.002
 ang-0.002
 trouble-0.002
Token€
Token ID26391
Feature activation+2.10e-02

Pos logit contributions
 narrative+0.193
 obstacle+0.178
 ang+0.175
 mistake+0.170
 Moral+0.170
Neg logit contributions
cious-0.192
ze-0.167
mare-0.161
coat-0.156
we-0.155
Token<|endoftext|>
Token ID50256
Feature activation+1.28e-01

Pos logit contributions
 narrative+0.031
 trouble+0.030
 Moral+0.028
 obstacle+0.028
 mad+0.028
Neg logit contributions
cious-0.032
ze-0.029
 songs-0.028
rito-0.028
we-0.028
Token.
Token ID13
Feature activation-1.32e-02

Pos logit contributions
cious+0.005
mare+0.004
ze+0.004
coat+0.004
suit+0.004
Neg logit contributions
 narrative-0.005
 ang-0.005
 mistake-0.005
 obstacle-0.004
 Moral-0.004
Token
Token ID198
Feature activation-2.26e-02

Pos logit contributions
cious+0.026
ze+0.023
mare+0.022
coat+0.022
kel+0.022
Neg logit contributions
 narrative-0.017
 Moral-0.016
 ang-0.015
 trouble-0.015
Less-0.015
Token
Token ID198
Feature activation-2.05e-02

Pos logit contributions
cious+0.035
ze+0.030
mare+0.029
coat+0.029
suit+0.028
Neg logit contributions
 narrative-0.037
 obstacle-0.035
 ang-0.034
 Moral-0.034
 mistake-0.034
Token"
Token ID1
Feature activation-6.54e-02

Pos logit contributions
cious+0.032
mare+0.028
coat+0.028
ze+0.028
suit+0.027
Neg logit contributions
 narrative-0.034
 Moral-0.031
Less-0.031
 obstacle-0.030
 ang-0.030
TokenBy
Token ID3886
Feature activation+2.45e-02

Pos logit contributions
cious+0.126
mare+0.118
ze+0.117
ox+0.117
 majesty+0.113
Neg logit contributions
 narrative-0.095
Less-0.078
 Moral-0.076
 moral-0.071
oral-0.067
Tokene
Token ID68
Feature activation-2.09e-01

Pos logit contributions
 narrative+0.056
 mistake+0.052
 obstacle+0.052
 Moral+0.051
 trouble+0.051
Neg logit contributions
cious-0.024
coat-0.018
suit-0.018
ze-0.017
mare-0.016
Token,
Token ID11
Feature activation-1.41e-03

Pos logit contributions
cious+0.430
pter+0.379
rito+0.378
Ch+0.374
we+0.373
Neg logit contributions
 mistake-0.199
 mad-0.197
 ang-0.189
 narrative-0.182
 trouble-0.178
Token staff
Token ID3085
Feature activation-6.10e-02

Pos logit contributions
cious+0.002
ze+0.002
mare+0.002
coat+0.002
suit+0.002
Neg logit contributions
 narrative-0.002
 obstacle-0.002
 ang-0.002
 mistake-0.002
 trouble-0.002
Token.
Token ID13
Feature activation-9.76e-03

Pos logit contributions
cious+0.103
ze+0.089
mare+0.087
suit+0.084
coat+0.084
Neg logit contributions
 narrative-0.092
 obstacle-0.085
 ang-0.083
 mistake-0.081
 trouble-0.081
Token Thank
Token ID6952
Feature activation+4.22e-02

Pos logit contributions
cious+0.020
ze+0.019
coat+0.018
ves+0.018
bes+0.018
Neg logit contributions
 narrative-0.011
 Moral-0.010
 Enc-0.009
 List-0.009
Less-0.009
Token you
Token ID345
Feature activation+2.09e-02

Pos logit contributions
 narrative+0.037
 Moral+0.027
 trouble+0.027
Less+0.026
 mad+0.024
Neg logit contributions
coat-0.101
cious-0.098
suit-0.096
rito-0.093
ze-0.091
Token She
Token ID1375
Feature activation+5.07e-03

Pos logit contributions
 narrative+0.001
 obstacle+0.001
 ang+0.001
 Moral+0.001
 trouble+0.001
Neg logit contributions
cious-0.001
ze-0.001
mare-0.001
we-0.001
suit-0.001
Token said
Token ID531
Feature activation+2.00e-03

Pos logit contributions
 narrative+0.010
 obstacle+0.009
 ang+0.009
 mistake+0.009
 trouble+0.009
Neg logit contributions
cious-0.007
ze-0.005
mare-0.005
coat-0.005
suit-0.005
Token,
Token ID11
Feature activation+1.57e-03

Pos logit contributions
 narrative+0.003
 ang+0.003
 obstacle+0.003
 trouble+0.003
 mistake+0.003
Neg logit contributions
cious-0.003
ze-0.003
mare-0.003
we-0.003
suit-0.003
Token "
Token ID366
Feature activation-3.16e-02

Pos logit contributions
 Moral+0.002
 narrative+0.002
 ang+0.002
Less+0.002
 Enc+0.002
Neg logit contributions
cious-0.002
ze-0.002
coat-0.002
mare-0.002
kel-0.002
TokenBy
Token ID3886
Feature activation+2.09e-02

Pos logit contributions
cious+0.054
ze+0.048
mare+0.047
suit+0.046
 majesty+0.046
Neg logit contributions
 narrative-0.048
Less-0.043
 Moral-0.042
 trouble-0.038
 obstacle-0.038
Tokene
Token ID68
Feature activation-2.09e-01

Pos logit contributions
 narrative+0.047
 mistake+0.044
 obstacle+0.043
 trouble+0.043
 Moral+0.043
Neg logit contributions
cious-0.021
suit-0.015
coat-0.015
mare-0.014
lic-0.014
Token,
Token ID11
Feature activation+2.62e-03

Pos logit contributions
cious+0.412
we+0.376
rito+0.376
pter+0.373
Ch+0.369
Neg logit contributions
 mistake-0.211
 mad-0.206
 ang-0.190
 trouble-0.184
 obstacle-0.180
Token Tom
Token ID4186
Feature activation+4.31e-02

Pos logit contributions
 narrative+0.004
 ang+0.004
 obstacle+0.004
Less+0.004
 Moral+0.004
Neg logit contributions
cious-0.004
ze-0.003
mare-0.003
coat-0.003
 songs-0.003
Token.
Token ID13
Feature activation-2.66e-03

Pos logit contributions
 narrative+0.056
 mistake+0.054
 trouble+0.053
 obstacle+0.052
 ang+0.051
Neg logit contributions
cious-0.080
ze-0.068
mare-0.067
suit-0.067
Ch-0.066
Token See
Token ID4091
Feature activation-3.15e-02

Pos logit contributions
cious+0.005
ze+0.004
we+0.004
coat+0.004
mare+0.004
Neg logit contributions
 narrative-0.004
 Moral-0.003
 obstacle-0.003
Less-0.003
 trouble-0.003
Token you
Token ID345
Feature activation+1.89e-02

Pos logit contributions
cious+0.058
ze+0.051
mare+0.050
coat+0.049
suit+0.049
Neg logit contributions
 narrative-0.043
 obstacle-0.040
 ang-0.039
 mistake-0.038
 trouble-0.038
Token I
Token ID314
Feature activation-9.59e-03

Pos logit contributions
 narrative+0.026
 obstacle+0.024
 ang+0.023
 mistake+0.023
 Moral+0.023
Neg logit contributions
cious-0.022
ze-0.019
mare-0.018
coat-0.018
suit-0.018
Token can
Token ID460
Feature activation-1.41e-03

Pos logit contributions
cious+0.013
ze+0.011
mare+0.011
coat+0.011
suit+0.010
Neg logit contributions
 narrative-0.017
 obstacle-0.016
 ang-0.016
 Moral-0.015
 mistake-0.015
Token cut
Token ID2005
Feature activation+1.65e-02

Pos logit contributions
cious+0.002
ze+0.002
coat+0.002
mare+0.002
we+0.002
Neg logit contributions
 narrative-0.003
 obstacle-0.002
 ang-0.002
 Moral-0.002
 trouble-0.002
Token it
Token ID340
Feature activation-2.00e-03

Pos logit contributions
 narrative+0.029
 obstacle+0.025
 mistake+0.024
 Moral+0.024
 ang+0.024
Neg logit contributions
cious-0.025
ze-0.023
we-0.021
 songs-0.020
Ch-0.020
Token just
Token ID655
Feature activation+3.41e-02

Pos logit contributions
cious+0.003
ze+0.003
coat+0.003
we+0.003
 songs+0.003
Neg logit contributions
 narrative-0.003
 obstacle-0.003
 Moral-0.003
 ang-0.003
 trouble-0.003
Token right
Token ID826
Feature activation+1.71e-01

Pos logit contributions
 narrative+0.057
 obstacle+0.053
 ang+0.051
 Moral+0.050
 mistake+0.050
Neg logit contributions
cious-0.052
ze-0.045
mare-0.043
coat-0.042
we-0.041
Token."
Token ID526
Feature activation-4.28e-02

Pos logit contributions
 narrative+0.259
 obstacle+0.237
 ang+0.235
 mistake+0.235
 trouble+0.231
Neg logit contributions
cious-0.293
ze-0.252
mare-0.246
coat-0.240
Ch-0.239
Token Tommy
Token ID19919
Feature activation+9.68e-02

Pos logit contributions
cious+0.067
coat+0.063
suit+0.061
 liked+0.058
anas+0.058
Neg logit contributions
 narrative-0.072
 obstacle-0.069
 mistake-0.057
 problem-0.055
 ang-0.054
Token was
Token ID373
Feature activation+1.80e-03

Pos logit contributions
 narrative+0.181
 ang+0.170
 obstacle+0.170
 mistake+0.167
 Moral+0.166
Neg logit contributions
cious-0.125
ze-0.103
mare-0.102
coat-0.096
we-0.096
Token nervous
Token ID10927
Feature activation-1.40e-05

Pos logit contributions
 narrative+0.003
 obstacle+0.002
 ang+0.002
 mistake+0.002
 trouble+0.002
Neg logit contributions
cious-0.003
ze-0.003
mare-0.003
coat-0.003
suit-0.003
Token but
Token ID475
Feature activation-3.47e-02

Pos logit contributions
cious+0.000
ze+0.000
we+0.000
coat+0.000
mare+0.000
Neg logit contributions
 narrative-0.000
 obstacle-0.000
 Moral-0.000
Less-0.000
 ang-0.000
Token his
Token ID465
Feature activation-2.10e-02

Pos logit contributions
 narrative+0.020
 obstacle+0.018
 ang+0.017
 Moral+0.017
Less+0.017
Neg logit contributions
cious-0.020
ze-0.018
mare-0.018
Ch-0.017
 songs-0.017
Token shoes
Token ID10012
Feature activation-3.32e-02

Pos logit contributions
cious+0.022
ze+0.019
suit+0.017
coat+0.017
mare+0.017
Neg logit contributions
 narrative-0.045
 obstacle-0.042
 ang-0.041
 Moral-0.041
Less-0.041
Token and
Token ID290
Feature activation+6.83e-03

Pos logit contributions
cious+0.037
coat+0.032
ze+0.032
suit+0.029
we+0.028
Neg logit contributions
 narrative-0.069
 obstacle-0.065
 Moral-0.062
Less-0.061
 trouble-0.061
Token realized
Token ID6939
Feature activation+8.46e-03

Pos logit contributions
 ang+0.011
 narrative+0.010
 mad+0.010
 mistake+0.010
 trouble+0.010
Neg logit contributions
cious-0.011
ze-0.010
mare-0.010
coat-0.010
oster-0.010
Token his
Token ID465
Feature activation-2.13e-02

Pos logit contributions
 narrative+0.014
 obstacle+0.013
 ang+0.013
 Moral+0.013
Less+0.012
Neg logit contributions
cious-0.012
ze-0.011
we-0.010
mare-0.010
ves-0.010
Token mistake
Token ID7457
Feature activation+2.09e-01

Pos logit contributions
cious+0.024
ze+0.019
mare+0.018
coat+0.018
suit+0.018
Neg logit contributions
 narrative-0.044
 obstacle-0.042
 ang-0.041
 Moral-0.040
Less-0.040
Token.
Token ID13
Feature activation-5.28e-03

Pos logit contributions
 narrative+0.404
 obstacle+0.376
 ang+0.370
 Moral+0.359
Less+0.359
Neg logit contributions
cious-0.270
ze-0.230
we-0.212
mare-0.211
suit-0.209
Token He
Token ID679
Feature activation+4.47e-02

Pos logit contributions
cious+0.009
mare+0.008
suit+0.008
ze+0.008
Ch+0.007
Neg logit contributions
 narrative-0.008
 obstacle-0.007
 ang-0.007
 mistake-0.007
 mad-0.007
Token said
Token ID531
Feature activation+1.19e-03

Pos logit contributions
 narrative+0.093
 obstacle+0.088
 ang+0.085
 trouble+0.083
Less+0.083
Neg logit contributions
cious-0.050
ze-0.042
mare-0.038
suit-0.036
Ch-0.036
Token,
Token ID11
Feature activation-9.39e-04

Pos logit contributions
 narrative+0.002
 ang+0.002
 obstacle+0.002
 mistake+0.001
 trouble+0.001
Neg logit contributions
cious-0.002
ze-0.002
mare-0.002
coat-0.002
suit-0.002
Token "
Token ID366
Feature activation-3.22e-02

Pos logit contributions
cious+0.001
ze+0.001
mare+0.001
coat+0.001
suit+0.001
Neg logit contributions
 narrative-0.002
 ang-0.002
 Moral-0.002
 mistake-0.002
 mad-0.001
Token mouth
Token ID5422
Feature activation-5.45e-02

Pos logit contributions
cious+0.045
mare+0.038
ze+0.038
we+0.037
Ch+0.037
Neg logit contributions
 narrative-0.045
 obstacle-0.043
 trouble-0.042
 mistake-0.042
 ang-0.041
Token to
Token ID284
Feature activation-7.58e-03

Pos logit contributions
cious+0.086
ze+0.073
mare+0.072
Ch+0.069
we+0.069
Neg logit contributions
 narrative-0.085
 obstacle-0.081
 ang-0.080
 mistake-0.080
 trouble-0.079
Token point
Token ID966
Feature activation+3.54e-02

Pos logit contributions
cious+0.011
ze+0.009
mare+0.009
suit+0.009
coat+0.009
Neg logit contributions
 narrative-0.013
 obstacle-0.012
 ang-0.012
 mistake-0.012
 trouble-0.012
Token out
Token ID503
Feature activation+2.46e-02

Pos logit contributions
 narrative+0.053
 obstacle+0.050
 ang+0.049
 mistake+0.048
 trouble+0.047
Neg logit contributions
cious-0.059
ze-0.052
mare-0.051
coat-0.049
we-0.048
Token the
Token ID262
Feature activation+2.08e-02

Pos logit contributions
 narrative+0.039
 ang+0.035
 obstacle+0.035
 Moral+0.034
 mistake+0.034
Neg logit contributions
cious-0.039
ze-0.035
mare-0.033
we-0.032
suit-0.032
Token problem
Token ID1917
Feature activation+1.70e-01

Pos logit contributions
 narrative+0.033
 ang+0.031
 obstacle+0.031
Less+0.030
 Moral+0.030
Neg logit contributions
cious-0.031
ze-0.028
suit-0.027
mare-0.026
coat-0.026
Token and
Token ID290
Feature activation+8.07e-03

Pos logit contributions
 narrative+0.282
 obstacle+0.264
 ang+0.255
 mistake+0.254
 trouble+0.253
Neg logit contributions
cious-0.261
ze-0.225
mare-0.220
coat-0.210
suit-0.208
Token come
Token ID1282
Feature activation-4.97e-02

Pos logit contributions
 narrative+0.013
 obstacle+0.012
 ang+0.012
 mistake+0.011
 Moral+0.011
Neg logit contributions
cious-0.013
ze-0.011
mare-0.011
we-0.010
suit-0.010
Token up
Token ID510
Feature activation+7.16e-02

Pos logit contributions
cious+0.091
mare+0.079
ze+0.079
coat+0.076
Ch+0.075
Neg logit contributions
 narrative-0.066
 obstacle-0.062
 ang-0.061
 mistake-0.060
 trouble-0.059
Token with
Token ID351
Feature activation-1.64e-02

Pos logit contributions
 narrative+0.144
 obstacle+0.126
Less+0.125
 Moral+0.124
 ang+0.124
Neg logit contributions
cious-0.085
ze-0.078
 songs-0.065
we-0.065
suit-0.063
Token a
Token ID257
Feature activation-4.50e-03

Pos logit contributions
cious+0.028
ze+0.025
mare+0.024
coat+0.024
suit+0.024
Neg logit contributions
 narrative-0.024
 obstacle-0.022
 ang-0.022
 mistake-0.021
 trouble-0.021
Token was
Token ID373
Feature activation-6.76e-03

Pos logit contributions
cious+0.079
ze+0.066
mare+0.062
suit+0.060
we+0.060
Neg logit contributions
 narrative-0.118
 obstacle-0.111
 ang-0.110
 mistake-0.107
 trouble-0.106
Token worth
Token ID2861
Feature activation+1.45e-01

Pos logit contributions
cious+0.010
ze+0.009
mare+0.008
coat+0.008
suit+0.008
Neg logit contributions
 narrative-0.011
 obstacle-0.011
Less-0.010
 Moral-0.010
 ang-0.010
Token a
Token ID257
Feature activation-1.26e-02

Pos logit contributions
 narrative+0.226
 obstacle+0.207
 ang+0.199
 mistake+0.195
 Moral+0.195
Neg logit contributions
cious-0.243
ze-0.211
mare-0.198
we-0.197
coat-0.192
Token lot
Token ID1256
Feature activation+3.51e-02

Pos logit contributions
cious+0.020
ze+0.017
mare+0.016
suit+0.016
coat+0.015
Neg logit contributions
 narrative-0.021
 ang-0.019
 obstacle-0.019
Less-0.018
 Moral-0.018
Token of
Token ID286
Feature activation-2.67e-02

Pos logit contributions
 narrative+0.060
 obstacle+0.055
 ang+0.055
 Moral+0.053
Less+0.052
Neg logit contributions
cious-0.052
ze-0.044
coat-0.042
suit-0.042
mare-0.041
Token money
Token ID1637
Feature activation+1.81e-01

Pos logit contributions
cious+0.048
ze+0.043
 songs+0.040
mare+0.039
coat+0.039
Neg logit contributions
 narrative-0.038
 obstacle-0.033
 ang-0.032
Less-0.032
 Moral-0.032
Token.
Token ID13
Feature activation+3.61e-03

Pos logit contributions
 narrative+0.308
 obstacle+0.287
 ang+0.282
 mistake+0.275
 trouble+0.274
Neg logit contributions
cious-0.271
ze-0.232
mare-0.225
coat-0.216
suit-0.216
Token She
Token ID1375
Feature activation+1.09e-02

Pos logit contributions
 narrative+0.005
 obstacle+0.005
 ang+0.005
 mistake+0.005
 trouble+0.005
Neg logit contributions
cious-0.006
ze-0.005
mare-0.005
coat-0.005
suit-0.005
Token explained
Token ID4893
Feature activation+7.62e-02

Pos logit contributions
 narrative+0.021
 obstacle+0.021
 ang+0.019
 trouble+0.019
 decisions+0.019
Neg logit contributions
cious-0.013
ze-0.011
 sang-0.011
 liked-0.011
 danced-0.010
Token to
Token ID284
Feature activation-1.36e-02

Pos logit contributions
 narrative+0.161
 obstacle+0.144
Less+0.143
 Moral+0.140
 ang+0.137
Neg logit contributions
cious-0.084
ze-0.076
 songs-0.070
we-0.062
mare-0.061
Token him
Token ID683
Feature activation-2.90e-03

Pos logit contributions
cious+0.023
ze+0.022
uds+0.020
mare+0.019
 songs+0.019
Neg logit contributions
 narrative-0.020
 obstacle-0.018
 ang-0.017
 Moral-0.017
Less-0.017
Token on
Token ID319
Feature activation+2.86e-02

Pos logit contributions
cious+0.063
ze+0.058
mare+0.055
 songs+0.052
we+0.052
Neg logit contributions
 narrative-0.066
 ang-0.063
 obstacle-0.062
 mistake-0.060
 Moral-0.058
Token.
Token ID13
Feature activation-7.06e-03

Pos logit contributions
 narrative+0.055
 ang+0.052
 obstacle+0.052
 Moral+0.049
Less+0.049
Neg logit contributions
cious-0.036
ze-0.031
mare-0.028
coat-0.028
we-0.027
Token
Token ID198
Feature activation-1.57e-02

Pos logit contributions
cious+0.010
ze+0.009
mare+0.008
we+0.008
coat+0.008
Neg logit contributions
 narrative-0.013
 obstacle-0.012
 ang-0.012
 mistake-0.011
 trouble-0.011
Token
Token ID198
Feature activation-1.37e-02

Pos logit contributions
cious+0.030
ze+0.027
mare+0.026
coat+0.025
we+0.025
Neg logit contributions
 narrative-0.020
 obstacle-0.018
 ang-0.018
 mistake-0.017
 trouble-0.017
TokenThe
Token ID464
Feature activation-4.03e-02

Pos logit contributions
cious+0.020
ze+0.017
mare+0.016
coat+0.016
we+0.016
Neg logit contributions
 narrative-0.024
 obstacle-0.023
 ang-0.022
 mistake-0.022
 trouble-0.022
Token moral
Token ID6573
Feature activation+2.37e-01

Pos logit contributions
 songs+0.060
cious+0.060
 song+0.059
ze+0.058
 liked+0.058
Neg logit contributions
 obstacle-0.053
ulk-0.051
 ang-0.050
ge-0.048
 altogether-0.047
Token of
Token ID286
Feature activation-3.22e-02

Pos logit contributions
 ang+0.445
 narrative+0.427
 obstacle+0.418
 mistake+0.403
Less+0.395
Neg logit contributions
cious-0.291
mare-0.254
 songs-0.246
Ch-0.245
age-0.241
Token the
Token ID262
Feature activation+1.07e-02

Pos logit contributions
cious+0.054
mare+0.049
ze+0.049
we+0.047
Ch+0.047
Neg logit contributions
 narrative-0.047
 obstacle-0.044
 ang-0.042
 trouble-0.039
 mistake-0.038
Token story
Token ID1621
Feature activation+6.87e-02

Pos logit contributions
 ang+0.021
 obstacle+0.019
ulk+0.018
ge+0.018
bars+0.018
Neg logit contributions
cious-0.012
 songs-0.011
 song-0.011
ze-0.011
ggy-0.010
Token is
Token ID318
Feature activation+1.74e-03

Pos logit contributions
 ang+0.155
 obstacle+0.147
Less+0.137
 narrative+0.133
 mad+0.132
Neg logit contributions
 liked-0.051
ves-0.047
cious-0.046
mare-0.046
ot-0.046
Token that
Token ID326
Feature activation-5.91e-02

Pos logit contributions
 narrative+0.004
 ang+0.004
 obstacle+0.004
Less+0.003
 mad+0.003
Neg logit contributions
cious-0.002
mare-0.001
ze-0.001
ot-0.001
we-0.001
Token adventure
Token ID8855
Feature activation+3.32e-02

Pos logit contributions
 narrative+0.202
 obstacle+0.187
 ang+0.184
 mistake+0.180
 trouble+0.180
Neg logit contributions
cious-0.195
ze-0.168
mare-0.163
coat-0.157
suit-0.156
Token for
Token ID329
Feature activation+9.67e-02

Pos logit contributions
 narrative+0.050
 obstacle+0.046
 ang+0.045
 trouble+0.045
 mistake+0.045
Neg logit contributions
cious-0.056
ze-0.049
mare-0.047
coat-0.045
suit-0.045
Token one
Token ID530
Feature activation+1.85e-02

Pos logit contributions
 narrative+0.157
 obstacle+0.144
 ang+0.143
 Moral+0.136
 mistake+0.134
Neg logit contributions
cious-0.153
ze-0.136
 songs-0.128
mare-0.126
we-0.123
Token day
Token ID1110
Feature activation-1.22e-02

Pos logit contributions
 narrative+0.024
 obstacle+0.023
 mistake+0.022
 ang+0.022
 trouble+0.022
Neg logit contributions
cious-0.036
ze-0.031
mare-0.031
we-0.030
coat-0.030
Token.
Token ID13
Feature activation+2.74e-04

Pos logit contributions
cious+0.020
ze+0.018
mare+0.018
Ch+0.017
coat+0.017
Neg logit contributions
 narrative-0.018
 trouble-0.017
 obstacle-0.016
 mistake-0.016
 mad-0.016
Token<|endoftext|>
Token ID50256
Feature activation+1.24e-01

Pos logit contributions
 narrative+0.000
 obstacle+0.000
 ang+0.000
 mistake+0.000
 trouble+0.000
Neg logit contributions
cious-0.000
ze-0.000
mare-0.000
suit-0.000
coat-0.000
TokenL
Token ID43
Feature activation+1.70e-02

Pos logit contributions
 narrative+0.167
 obstacle+0.150
 ang+0.146
 trouble+0.145
 Moral+0.144
Neg logit contributions
cious-0.232
ze-0.205
mare-0.199
coat-0.194
suit-0.192
Tokenila
Token ID10102
Feature activation-5.17e-02

Pos logit contributions
 narrative+0.035
 obstacle+0.034
 ang+0.033
 Moral+0.031
 trouble+0.031
Neg logit contributions
cious-0.019
mare-0.016
ze-0.015
we-0.015
ox-0.015
Token and
Token ID290
Feature activation+1.09e-02

Pos logit contributions
cious+0.085
ze+0.074
mare+0.072
coat+0.070
suit+0.069
Neg logit contributions
 narrative-0.080
 obstacle-0.073
 ang-0.071
 mistake-0.071
 trouble-0.071
Token Tom
Token ID4186
Feature activation+5.12e-02

Pos logit contributions
 narrative+0.016
 obstacle+0.015
 ang+0.014
 mistake+0.014
 Moral+0.014
Neg logit contributions
cious-0.020
ze-0.017
mare-0.017
coat-0.016
suit-0.016
Token are
Token ID389
Feature activation+3.23e-02

Pos logit contributions
 narrative+0.092
 obstacle+0.088
 ang+0.085
 mistake+0.083
 Moral+0.082
Neg logit contributions
cious-0.072
ze-0.061
mare-0.058
coat-0.057
suit-0.057
Tokenmy
Token ID1820
Feature activation+5.05e-02

Pos logit contributions
 narrative+0.053
 ang+0.048
 mistake+0.048
 obstacle+0.047
 trouble+0.046
Neg logit contributions
cious-0.134
ze-0.117
mare-0.117
suit-0.116
we-0.114
Token's
Token ID338
Feature activation+2.01e-02

Pos logit contributions
 mistake+0.076
 narrative+0.075
 ang+0.075
 mad+0.074
 trouble+0.074
Neg logit contributions
cious-0.079
suit-0.067
mare-0.065
coat-0.065
we-0.063
Token mom
Token ID1995
Feature activation+2.62e-02

Pos logit contributions
 narrative+0.035
 obstacle+0.033
 ang+0.032
 mistake+0.031
 trouble+0.031
Neg logit contributions
cious-0.029
ze-0.025
mare-0.024
coat-0.023
suit-0.023
Token called
Token ID1444
Feature activation-3.81e-02

Pos logit contributions
 narrative+0.031
 ang+0.029
 mistake+0.029
 mad+0.028
 Moral+0.028
Neg logit contributions
cious-0.053
ze-0.045
suit-0.045
mare-0.045
coat-0.043
Token him
Token ID683
Feature activation+1.11e-02

Pos logit contributions
cious+0.066
ze+0.062
mare+0.061
coat+0.057
 songs+0.056
Neg logit contributions
 narrative-0.056
 obstacle-0.050
Less-0.047
 Moral-0.047
 ang-0.046
Token back
Token ID736
Feature activation+8.07e-02

Pos logit contributions
 narrative+0.015
 ang+0.015
 trouble+0.014
 Moral+0.014
 obstacle+0.014
Neg logit contributions
cious-0.020
ze-0.017
mare-0.016
we-0.016
bes-0.016
Token to
Token ID284
Feature activation-3.38e-03

Pos logit contributions
 narrative+0.143
 obstacle+0.133
 ang+0.131
 mistake+0.128
 trouble+0.127
Neg logit contributions
cious-0.116
ze-0.099
mare-0.096
coat-0.092
suit-0.091
Token the
Token ID262
Feature activation+2.76e-02

Pos logit contributions
cious+0.006
suit+0.005
ze+0.005
mare+0.005
coat+0.004
Neg logit contributions
 narrative-0.005
 ang-0.005
 mistake-0.005
 trouble-0.005
 obstacle-0.005
Token kitchen
Token ID9592
Feature activation+2.25e-02

Pos logit contributions
 narrative+0.046
 obstacle+0.042
 ang+0.042
Less+0.041
 Moral+0.040
Neg logit contributions
cious-0.039
ze-0.037
coat-0.034
mare-0.034
 songs-0.033
Token.
Token ID13
Feature activation+8.05e-03

Pos logit contributions
 narrative+0.031
 mistake+0.031
 trouble+0.030
 ang+0.029
 Moral+0.028
Neg logit contributions
cious-0.039
ze-0.034
mare-0.033
Ch-0.033
oster-0.032
Token She
Token ID1375
Feature activation+2.27e-02

Pos logit contributions
 narrative+0.010
 ang+0.009
 obstacle+0.009
 Moral+0.009
 trouble+0.009
Neg logit contributions
cious-0.016
ze-0.014
mare-0.013
we-0.013
suit-0.013
Token Everyone
Token ID11075
Feature activation-4.19e-03

Pos logit contributions
 narrative+0.079
 obstacle+0.074
 ang+0.072
 mistake+0.071
 trouble+0.070
Neg logit contributions
cious-0.062
ze-0.053
mare-0.051
coat-0.049
we-0.049
Token laughed
Token ID13818
Feature activation-5.45e-02

Pos logit contributions
cious+0.006
ze+0.005
mare+0.005
coat+0.005
suit+0.004
Neg logit contributions
 narrative-0.008
 obstacle-0.007
 ang-0.007
 mistake-0.007
 Moral-0.007
Token.
Token ID13
Feature activation+7.62e-03

Pos logit contributions
cious+0.098
mare+0.085
ze+0.085
coat+0.082
suit+0.081
Neg logit contributions
 narrative-0.075
 obstacle-0.070
 ang-0.069
 trouble-0.067
 mistake-0.067
Token<|endoftext|>
Token ID50256
Feature activation+1.28e-01

Pos logit contributions
 narrative+0.010
 Moral+0.010
 ang+0.009
Less+0.009
 obstacle+0.009
Neg logit contributions
cious-0.014
ze-0.013
mare-0.012
coat-0.012
suit-0.012
TokenOnce
Token ID7454
Feature activation+3.00e-03

Pos logit contributions
 narrative+0.180
 obstacle+0.160
 trouble+0.158
 Moral+0.158
 ang+0.157
Neg logit contributions
cious-0.232
ze-0.206
mare-0.198
coat-0.193
suit-0.189
Token upon
Token ID2402
Feature activation+4.69e-02

Pos logit contributions
 narrative+0.004
 obstacle+0.004
 ang+0.004
 mistake+0.003
 trouble+0.003
Neg logit contributions
cious-0.005
ze-0.005
mare-0.005
suit-0.005
coat-0.005
Token a
Token ID257
Feature activation-8.87e-03

Pos logit contributions
 trouble+0.054
 narrative+0.053
 mad+0.050
 ang+0.049
 mistake+0.048
Neg logit contributions
cious-0.093
suit-0.082
mare-0.082
coat-0.080
ze-0.079
Token time
Token ID640
Feature activation-2.29e-02

Pos logit contributions
cious+0.018
we+0.016
mare+0.016
ze+0.015
als+0.015
Neg logit contributions
 narrative-0.011
 mistake-0.010
 trouble-0.010
 mad-0.010
 obstacle-0.009
Token there
Token ID612
Feature activation-5.13e-02

Pos logit contributions
cious+0.042
ze+0.037
mare+0.036
coat+0.035
suit+0.035
Neg logit contributions
 narrative-0.031
 obstacle-0.028
 ang-0.028
 trouble-0.027
 mistake-0.027
Token was
Token ID373
Feature activation-5.49e-03

Pos logit contributions
cious+0.084
mare+0.071
ze+0.071
coat+0.068
suit+0.067
Neg logit contributions
 narrative-0.080
 trouble-0.075
 mistake-0.074
 Moral-0.073
 ang-0.072
Token a
Token ID257
Feature activation-9.65e-03

Pos logit contributions
cious+0.009
ze+0.008
mare+0.008
suit+0.007
coat+0.007
Neg logit contributions
 narrative-0.009
 obstacle-0.008
 ang-0.008
 trouble-0.008
 mistake-0.007
Token in
Token ID287
Feature activation+1.72e-02

Pos logit contributions
 narrative+0.135
 Moral+0.126
 obstacle+0.124
Less+0.123
 ang+0.121
Neg logit contributions
cious-0.063
ze-0.055
suit-0.051
 songs-0.051
we-0.050
Token different
Token ID1180
Feature activation-1.84e-03

Pos logit contributions
 narrative+0.025
 obstacle+0.023
 ang+0.023
 Moral+0.022
Less+0.021
Neg logit contributions
cious-0.029
ze-0.026
mare-0.025
Ch-0.025
 songs-0.024
Token colors
Token ID7577
Feature activation-1.01e-01

Pos logit contributions
cious+0.002
ze+0.002
mare+0.002
suit+0.002
we+0.002
Neg logit contributions
 narrative-0.004
 obstacle-0.003
 ang-0.003
 Moral-0.003
 mistake-0.003
Token.
Token ID13
Feature activation-2.43e-03

Pos logit contributions
cious+0.162
mare+0.141
ze+0.135
coat+0.130
suit+0.129
Neg logit contributions
 narrative-0.158
 obstacle-0.150
 ang-0.144
 mistake-0.144
 trouble-0.143
Token "
Token ID366
Feature activation-3.36e-02

Pos logit contributions
cious+0.004
ze+0.003
mare+0.003
coat+0.003
Ch+0.003
Neg logit contributions
 narrative-0.004
 obstacle-0.004
 ang-0.004
 mistake-0.004
 trouble-0.003
TokenO
Token ID46
Feature activation+6.66e-02

Pos logit contributions
cious+0.053
mare+0.045
ze+0.045
coat+0.043
suit+0.043
Neg logit contributions
 narrative-0.056
 obstacle-0.050
Less-0.049
 Moral-0.049
 ang-0.049
Tokenoh
Token ID1219
Feature activation-7.52e-02

Pos logit contributions
 narrative+0.133
 ang+0.125
 obstacle+0.125
 Moral+0.123
 trouble+0.121
Neg logit contributions
cious-0.083
ze-0.068
we-0.065
mare-0.063
coat-0.063
Token,
Token ID11
Feature activation-3.21e-04

Pos logit contributions
cious+0.142
mare+0.125
Ch+0.124
bes+0.123
suit+0.121
Neg logit contributions
 mistake-0.089
 mad-0.088
 ang-0.086
 problem-0.085
 narrative-0.084
Token pretty
Token ID2495
Feature activation-7.39e-02

Pos logit contributions
cious+0.001
ze+0.000
mare+0.000
Ch+0.000
 songs+0.000
Neg logit contributions
 narrative-0.001
 obstacle-0.000
 ang-0.000
 Moral-0.000
Less-0.000
Token!"
Token ID2474
Feature activation-2.94e-02

Pos logit contributions
cious+0.106
ze+0.092
mare+0.089
suit+0.085
we+0.084
Neg logit contributions
 narrative-0.132
 obstacle-0.120
 ang-0.120
 Moral-0.116
 trouble-0.116
Token she
Token ID673
Feature activation-2.13e-02

Pos logit contributions
cious+0.047
ze+0.040
mare+0.039
suit+0.038
coat+0.038
Neg logit contributions
 narrative-0.047
 ang-0.044
 trouble-0.044
 obstacle-0.043
 Moral-0.043
Tokenie
Token ID494
Feature activation-8.74e-02

Pos logit contributions
 narrative+0.015
 obstacle+0.014
 Moral+0.014
Less+0.013
 trouble+0.013
Neg logit contributions
cious-0.011
ze-0.010
we-0.009
mare-0.009
coat-0.009
Token and
Token ID290
Feature activation+2.15e-02

Pos logit contributions
cious+0.129
ze+0.129
we+0.106
mare+0.106
coat+0.106
Neg logit contributions
 narrative-0.154
 obstacle-0.134
Less-0.131
 Moral-0.130
 ang-0.126
Token said
Token ID531
Feature activation+1.24e-02

Pos logit contributions
 narrative+0.036
 ang+0.035
 obstacle+0.035
 mistake+0.034
 trouble+0.034
Neg logit contributions
cious-0.032
ze-0.027
suit-0.026
mare-0.026
coat-0.026
Token,
Token ID11
Feature activation+1.08e-02

Pos logit contributions
 narrative+0.018
 obstacle+0.016
 ang+0.016
 mistake+0.016
 trouble+0.016
Neg logit contributions
cious-0.022
ze-0.019
mare-0.019
coat-0.018
suit-0.018
Token "
Token ID366
Feature activation-2.01e-02

Pos logit contributions
 narrative+0.021
 ang+0.020
 obstacle+0.020
 Moral+0.019
 mistake+0.019
Neg logit contributions
cious-0.013
ze-0.011
mare-0.010
suit-0.010
coat-0.010
TokenOh
Token ID5812
Feature activation+6.84e-02

Pos logit contributions
cious+0.032
suit+0.027
mare+0.027
ze+0.027
coat+0.026
Neg logit contributions
 narrative-0.032
 ang-0.031
 obstacle-0.030
 Moral-0.030
 trouble-0.030
Token no
Token ID645
Feature activation-3.12e-02

Pos logit contributions
 narrative+0.092
 trouble+0.092
 ang+0.089
 problem+0.088
 mad+0.087
Neg logit contributions
cious-0.124
suit-0.111
bes-0.108
coat-0.107
we-0.106
Token!
Token ID0
Feature activation+8.09e-03

Pos logit contributions
cious+0.049
ze+0.044
mare+0.042
coat+0.041
we+0.040
Neg logit contributions
 narrative-0.052
 obstacle-0.046
 ang-0.046
 Moral-0.045
Less-0.044
Token What
Token ID1867
Feature activation-2.60e-02

Pos logit contributions
 narrative+0.012
 ang+0.011
 obstacle+0.011
 Moral+0.011
 trouble+0.011
Neg logit contributions
cious-0.013
ze-0.011
coat-0.011
suit-0.011
mare-0.011
Token happened
Token ID3022
Feature activation+6.12e-02

Pos logit contributions
cious+0.045
ze+0.039
mare+0.038
suit+0.038
coat+0.038
Neg logit contributions
 narrative-0.039
 trouble-0.035
 obstacle-0.035
 mistake-0.034
 ang-0.034
Token?"
Token ID1701
Feature activation+4.90e-03

Pos logit contributions
 trouble+0.077
 mistake+0.074
 mad+0.073
 narrative+0.072
 ang+0.071
Neg logit contributions
cious-0.120
pter-0.103
Ch-0.101
suit-0.099
bes-0.099
Token adventure
Token ID8855
Feature activation+2.95e-02

Pos logit contributions
 narrative+0.037
 ang+0.035
 obstacle+0.034
Less+0.034
 Moral+0.033
Neg logit contributions
cious-0.025
ze-0.021
coat-0.021
 songs-0.020
mare-0.020
Token that
Token ID326
Feature activation-5.09e-02

Pos logit contributions
 narrative+0.050
 obstacle+0.046
 ang+0.046
 Moral+0.044
 mistake+0.044
Neg logit contributions
cious-0.045
ze-0.039
mare-0.037
coat-0.036
we-0.036
Token day
Token ID1110
Feature activation-1.45e-02

Pos logit contributions
cious+0.083
ze+0.072
mare+0.070
coat+0.067
suit+0.067
Neg logit contributions
 narrative-0.081
 obstacle-0.074
 ang-0.073
 mistake-0.071
 Moral-0.070
Token.
Token ID13
Feature activation+1.12e-02

Pos logit contributions
cious+0.021
ze+0.018
mare+0.018
coat+0.017
suit+0.017
Neg logit contributions
 narrative-0.025
 obstacle-0.023
 ang-0.023
 mistake-0.023
 Moral-0.022
Token He
Token ID679
Feature activation+5.82e-02

Pos logit contributions
 narrative+0.018
 ang+0.017
 obstacle+0.016
 mistake+0.016
 mad+0.016
Neg logit contributions
cious-0.017
mare-0.016
Ch-0.015
ze-0.015
 songs-0.015
Token discovered
Token ID5071
Feature activation-3.42e-02

Pos logit contributions
 narrative+0.112
 obstacle+0.105
 ang+0.104
 mistake+0.102
 trouble+0.102
Neg logit contributions
cious-0.074
ze-0.061
mare-0.059
coat-0.056
suit-0.056
Token the
Token ID262
Feature activation+1.95e-02

Pos logit contributions
cious+0.062
ze+0.054
mare+0.054
suit+0.052
coat+0.052
Neg logit contributions
 narrative-0.047
 obstacle-0.044
 ang-0.043
 trouble-0.042
 mistake-0.042
Token bitter
Token ID12922
Feature activation+2.87e-02

Pos logit contributions
 narrative+0.032
Less+0.029
 obstacle+0.028
 ang+0.028
 Moral+0.027
Neg logit contributions
cious-0.027
ze-0.026
 songs-0.025
coat-0.024
suit-0.023
Token smell
Token ID8508
Feature activation-5.82e-02

Pos logit contributions
 narrative+0.046
Less+0.041
 ang+0.040
 obstacle+0.040
 Moral+0.039
Neg logit contributions
cious-0.045
ze-0.042
coat-0.040
 songs-0.039
suit-0.039
Token of
Token ID286
Feature activation-2.25e-02

Pos logit contributions
cious+0.110
mare+0.097
ze+0.094
Ch+0.091
suit+0.091
Neg logit contributions
 narrative-0.075
 obstacle-0.070
 trouble-0.068
 mistake-0.066
 ang-0.066
Token fuel
Token ID5252
Feature activation+1.08e-01

Pos logit contributions
cious+0.036
ze+0.033
 songs+0.032
mare+0.030
we+0.030
Neg logit contributions
 narrative-0.035
 obstacle-0.031
 ang-0.029
Less-0.029
 Moral-0.029
Token embarrassed
Token ID21100
Feature activation+3.41e-02

Pos logit contributions
 narrative+0.028
reating+0.026
 obstacle+0.025
 Moral+0.025
Less+0.024
Neg logit contributions
cious-0.027
ze-0.027
ot-0.024
 sprayed-0.023
ggy-0.023
Token!
Token ID0
Feature activation+2.63e-03

Pos logit contributions
 narrative+0.053
 obstacle+0.049
 ang+0.048
 mistake+0.047
 trouble+0.047
Neg logit contributions
cious-0.057
ze-0.049
mare-0.048
suit-0.046
coat-0.046
Token
Token ID198
Feature activation-1.51e-03

Pos logit contributions
 narrative+0.004
 obstacle+0.004
 ang+0.004
 Moral+0.004
 trouble+0.004
Neg logit contributions
cious-0.004
ze-0.004
mare-0.003
suit-0.003
coat-0.003
Token
Token ID198
Feature activation-2.43e-04

Pos logit contributions
cious+0.002
ze+0.002
mare+0.002
suit+0.002
coat+0.002
Neg logit contributions
 narrative-0.002
 obstacle-0.002
 ang-0.002
 Moral-0.002
 trouble-0.002
TokenThe
Token ID464
Feature activation-3.15e-02

Pos logit contributions
cious+0.000
mare+0.000
ze+0.000
suit+0.000
coat+0.000
Neg logit contributions
 narrative-0.000
 obstacle-0.000
 ang-0.000
 Moral-0.000
Less-0.000
Token girl
Token ID2576
Feature activation+1.93e-02

Pos logit contributions
cious+0.041
ze+0.036
mare+0.033
coat+0.033
we+0.033
Neg logit contributions
 narrative-0.056
 obstacle-0.054
 ang-0.054
Less-0.051
 trouble-0.051
Token didn
Token ID1422
Feature activation+3.76e-02

Pos logit contributions
 narrative+0.040
 obstacle+0.038
 ang+0.036
Less+0.036
 Moral+0.036
Neg logit contributions
cious-0.022
ze-0.020
mare-0.017
we-0.017
Ch-0.016
Tokenâ
Token ID22940
Feature activation+1.28e-01

Pos logit contributions
 narrative+0.090
 obstacle+0.088
 ang+0.086
 mistake+0.081
 cheating+0.080
Neg logit contributions
ze-0.030
 hardly-0.024
cious-0.023
ggy-0.022
ves-0.022
Token€
Token ID26391
Feature activation+3.46e-02

Pos logit contributions
 narrative+0.221
 obstacle+0.209
 ang+0.205
 mistake+0.201
 trouble+0.200
Neg logit contributions
cious-0.190
ze-0.162
mare-0.158
coat-0.152
suit-0.150
Token™
Token ID8151
Feature activation-8.10e-02

Pos logit contributions
 trouble+0.059
 narrative+0.055
 mistake+0.054
 mad+0.054
 tragedy+0.053
Neg logit contributions
rito-0.042
oster-0.040
cious-0.039
 groom-0.039
mare-0.038
Tokent
Token ID83
Feature activation-3.96e-02

Pos logit contributions
cious+0.075
ze+0.056
mare+0.053
coat+0.052
suit+0.050
Neg logit contributions
 narrative-0.186
 obstacle-0.180
 ang-0.174
 trouble-0.171
 mistake-0.171
Token spun
Token ID26843
Feature activation-4.18e-02

Pos logit contributions
 narrative+0.048
 obstacle+0.045
 ang+0.045
 mistake+0.044
 trouble+0.044
Neg logit contributions
cious-0.033
ze-0.027
mare-0.027
coat-0.025
we-0.025
Token it
Token ID340
Feature activation-3.91e-03

Pos logit contributions
cious+0.063
ze+0.055
mare+0.052
suit+0.051
coat+0.051
Neg logit contributions
 narrative-0.071
 obstacle-0.065
 ang-0.064
 mistake-0.063
 Moral-0.063
Token around
Token ID1088
Feature activation-3.04e-02

Pos logit contributions
cious+0.007
ze+0.006
mare+0.006
we+0.006
coat+0.006
Neg logit contributions
 narrative-0.006
 obstacle-0.005
 ang-0.005
 mistake-0.005
 trouble-0.005
Token.
Token ID13
Feature activation+1.16e-03

Pos logit contributions
cious+0.041
ze+0.037
mare+0.033
we+0.033
 songs+0.033
Neg logit contributions
 narrative-0.057
 obstacle-0.052
 ang-0.051
 Moral-0.050
Less-0.049
Token He
Token ID679
Feature activation+5.21e-02

Pos logit contributions
 narrative+0.002
 obstacle+0.002
 ang+0.002
 mistake+0.002
 mad+0.002
Neg logit contributions
cious-0.002
ze-0.001
mare-0.001
suit-0.001
Ch-0.001
Token laughed
Token ID13818
Feature activation-5.47e-02

Pos logit contributions
 narrative+0.090
 obstacle+0.086
 ang+0.085
 mistake+0.085
 mad+0.084
Neg logit contributions
cious-0.074
mare-0.063
ze-0.058
pter-0.057
suit-0.057
Token and
Token ID290
Feature activation+1.06e-02

Pos logit contributions
cious+0.108
mare+0.095
pter+0.090
coat+0.090
ze+0.089
Neg logit contributions
 narrative-0.063
 ang-0.063
 obstacle-0.061
 mad-0.060
 trouble-0.059
Token laughed
Token ID13818
Feature activation-5.57e-02

Pos logit contributions
 ang+0.017
 narrative+0.017
 mad+0.016
 obstacle+0.016
 trouble+0.016
Neg logit contributions
cious-0.016
mare-0.014
ze-0.014
pter-0.014
oster-0.013
Token as
Token ID355
Feature activation-5.56e-02

Pos logit contributions
cious+0.104
mare+0.093
pter+0.086
coat+0.086
ox+0.085
Neg logit contributions
 ang-0.070
 narrative-0.068
 obstacle-0.067
 mad-0.067
 trouble-0.066
Token it
Token ID340
Feature activation-7.49e-03

Pos logit contributions
cious+0.099
ze+0.087
mare+0.084
we+0.082
suit+0.082
Neg logit contributions
 narrative-0.079
 obstacle-0.072
 ang-0.071
 mistake-0.069
 Moral-0.069
Token tw
Token ID665
Feature activation+2.98e-02

Pos logit contributions
cious+0.013
mare+0.010
pter+0.010
 majesty+0.010
ox+0.009
Neg logit contributions
 ang-0.011
 narrative-0.011
 mad-0.010
 obstacle-0.010
 mistake-0.010
Token better
Token ID1365
Feature activation+1.46e-01

Pos logit contributions
cious+0.081
ze+0.070
mare+0.066
we+0.066
coat+0.064
Neg logit contributions
 narrative-0.082
 obstacle-0.076
 ang-0.074
 mistake-0.072
 Moral-0.072
Token.
Token ID13
Feature activation+1.10e-02

Pos logit contributions
 narrative+0.236
 obstacle+0.221
 trouble+0.221
 mistake+0.219
 ang+0.212
Neg logit contributions
cious-0.230
Ch-0.202
coat-0.201
ze-0.200
mare-0.199
Token<|endoftext|>
Token ID50256
Feature activation+1.32e-01

Pos logit contributions
 narrative+0.022
 ang+0.021
 obstacle+0.021
 mad+0.021
 mistake+0.021
Neg logit contributions
cious-0.013
mare-0.010
Ch-0.010
ze-0.010
we-0.010
TokenThere
Token ID1858
Feature activation-4.30e-02

Pos logit contributions
 narrative+0.214
 obstacle+0.202
 ang+0.198
 mistake+0.194
 trouble+0.187
Neg logit contributions
cious-0.207
ze-0.178
mare-0.175
suit-0.170
we-0.169
Token was
Token ID373
Feature activation-2.26e-03

Pos logit contributions
cious+0.068
ze+0.057
mare+0.057
coat+0.054
suit+0.054
Neg logit contributions
 narrative-0.071
 obstacle-0.064
 ang-0.064
 trouble-0.063
 mistake-0.063
Token a
Token ID257
Feature activation-9.88e-03

Pos logit contributions
cious+0.003
ze+0.003
mare+0.003
coat+0.003
we+0.003
Neg logit contributions
 narrative-0.004
 obstacle-0.004
 ang-0.004
 mistake-0.004
 Moral-0.004
Token little
Token ID1310
Feature activation+1.04e-03

Pos logit contributions
cious+0.011
ze+0.009
mare+0.009
suit+0.009
pron+0.008
Neg logit contributions
 narrative-0.021
 ang-0.019
Less-0.019
 obstacle-0.019
 trouble-0.018
Token boy
Token ID2933
Feature activation+9.61e-02

Pos logit contributions
 narrative+0.002
Less+0.002
 ang+0.002
 Moral+0.002
oral+0.002
Neg logit contributions
cious-0.001
 sheep-0.001
suit-0.001
 striped-0.001
coat-0.001
Token.
Token ID13
Feature activation+1.45e-02

Pos logit contributions
 narrative+0.196
 obstacle+0.186
 ang+0.183
 mistake+0.183
 trouble+0.181
Neg logit contributions
cious-0.108
ze-0.088
mare-0.085
suit-0.080
coat-0.079
Token He
Token ID679
Feature activation+6.02e-02

Pos logit contributions
 narrative+0.023
 ang+0.021
 obstacle+0.021
 mistake+0.021
 mad+0.020
Neg logit contributions
cious-0.023
mare-0.020
suit-0.020
ze-0.019
Ch-0.019
Token liked
Token ID8288
Feature activation-9.08e-02

Pos logit contributions
 narrative+0.121
 obstacle+0.113
 mistake+0.112
 ang+0.112
 trouble+0.112
Neg logit contributions
cious-0.069
mare-0.056
ze-0.055
we-0.051
suit-0.051
Token help
Token ID1037
Feature activation-1.40e-02

Pos logit contributions
cious+0.026
suit+0.023
pter+0.022
we+0.022
mare+0.020
Neg logit contributions
 trouble-0.022
 mistake-0.021
 ang-0.021
 mad-0.021
 obstacle-0.020
Token the
Token ID262
Feature activation+1.58e-02

Pos logit contributions
cious+0.026
ze+0.023
mare+0.023
coat+0.022
we+0.022
Neg logit contributions
 narrative-0.020
 obstacle-0.017
 ang-0.016
 Moral-0.016
 mistake-0.016
Token cat
Token ID3797
Feature activation-4.21e-02

Pos logit contributions
 trouble+0.027
 mistake+0.026
 problem+0.025
 mad+0.025
 Moral+0.025
Neg logit contributions
cious-0.024
pter-0.022
suit-0.021
kel-0.020
we-0.020
Token,
Token ID11
Feature activation+4.64e-03

Pos logit contributions
cious+0.059
ze+0.053
mare+0.049
coat+0.048
suit+0.046
Neg logit contributions
 narrative-0.078
 obstacle-0.069
 ang-0.067
Less-0.067
 Moral-0.067
Token but
Token ID475
Feature activation-4.47e-02

Pos logit contributions
 narrative+0.007
 obstacle+0.007
 ang+0.007
 mistake+0.007
 trouble+0.006
Neg logit contributions
cious-0.008
ze-0.007
mare-0.006
coat-0.006
suit-0.006
Token he
Token ID339
Feature activation+2.35e-02

Pos logit contributions
cious+0.090
ze+0.080
mare+0.078
we+0.076
coat+0.076
Neg logit contributions
 narrative-0.054
 ang-0.049
 obstacle-0.049
 trouble-0.047
 Moral-0.047
Token didn
Token ID1422
Feature activation+3.04e-02

Pos logit contributions
 narrative+0.039
 ang+0.038
 mistake+0.037
 Moral+0.037
 mad+0.037
Neg logit contributions
cious-0.034
ze-0.029
mare-0.029
coat-0.029
suit-0.028
Token't
Token ID470
Feature activation+1.29e-02

Pos logit contributions
 narrative+0.075
 obstacle+0.072
 ang+0.071
 mistake+0.068
 cheating+0.066
Neg logit contributions
ze-0.022
cious-0.018
ggy-0.018
ves-0.017
 hardly-0.017
Token know
Token ID760
Feature activation+2.51e-02

Pos logit contributions
 narrative+0.020
 obstacle+0.018
 ang+0.018
 Moral+0.018
Less+0.017
Neg logit contributions
cious-0.021
ze-0.019
mare-0.018
coat-0.017
suit-0.017
Token how
Token ID703
Feature activation-9.89e-02

Pos logit contributions
 narrative+0.033
 obstacle+0.031
 ang+0.030
 mistake+0.029
 trouble+0.029
Neg logit contributions
cious-0.047
ze-0.042
mare-0.041
suit-0.040
coat-0.040
Token.
Token ID13
Feature activation+6.32e-03

Pos logit contributions
cious+0.155
mare+0.136
suit+0.129
ze+0.128
coat+0.126
Neg logit contributions
 trouble-0.157
 narrative-0.154
 obstacle-0.153
 ang-0.151
 mad-0.151
Token tall
Token ID7331
Feature activation+4.67e-03

Pos logit contributions
cious+0.008
ze+0.006
mare+0.006
coat+0.006
suit+0.006
Neg logit contributions
 narrative-0.010
 obstacle-0.010
 ang-0.010
 mistake-0.009
 trouble-0.009
Token dad
Token ID9955
Feature activation+4.11e-02

Pos logit contributions
Less+0.009
 narrative+0.009
 ang+0.008
 Moral+0.008
 obstacle+0.008
Neg logit contributions
cious-0.006
 hat-0.005
ze-0.005
coat-0.005
 tail-0.005
Token who
Token ID508
Feature activation-8.53e-02

Pos logit contributions
 narrative+0.068
 mad+0.065
 ang+0.064
 mistake+0.064
 trouble+0.063
Neg logit contributions
cious-0.058
mare-0.051
ze-0.051
suit-0.047
we-0.046
Token wore
Token ID12408
Feature activation-4.17e-02

Pos logit contributions
cious+0.085
ze+0.066
mare+0.065
coat+0.059
suit+0.058
Neg logit contributions
 narrative-0.186
 obstacle-0.172
 ang-0.172
 Moral-0.172
 mistake-0.170
Token big
Token ID1263
Feature activation+7.17e-03

Pos logit contributions
cious+0.088
mare+0.080
we+0.076
ze+0.074
oster+0.073
Neg logit contributions
 trouble-0.042
 ang-0.042
 mad-0.039
 narrative-0.039
 tragedy-0.036
Token boots
Token ID14412
Feature activation-1.10e-01

Pos logit contributions
 narrative+0.014
 obstacle+0.013
 ang+0.013
 Moral+0.013
 mistake+0.013
Neg logit contributions
cious-0.009
ze-0.007
mare-0.007
coat-0.007
suit-0.007
Token.
Token ID13
Feature activation+1.34e-02

Pos logit contributions
cious+0.197
mare+0.167
ze+0.165
we+0.156
Ch+0.153
Neg logit contributions
 narrative-0.155
 ang-0.141
 obstacle-0.138
 trouble-0.135
 Moral-0.134
Token One
Token ID1881
Feature activation+9.49e-02

Pos logit contributions
 narrative+0.019
 obstacle+0.018
 ang+0.017
 mistake+0.017
 trouble+0.016
Neg logit contributions
cious-0.023
coat-0.020
mare-0.020
ze-0.020
suit-0.020
Token day
Token ID1110
Feature activation-1.34e-02

Pos logit contributions
 narrative+0.099
 mistake+0.092
 obstacle+0.091
 mad+0.091
 Moral+0.090
Neg logit contributions
cious-0.199
ze-0.179
bes-0.179
we-0.176
pter-0.170
Token,
Token ID11
Feature activation+1.48e-02

Pos logit contributions
cious+0.024
ze+0.021
mare+0.021
we+0.020
coat+0.020
Neg logit contributions
 narrative-0.019
 obstacle-0.017
 ang-0.017
 trouble-0.016
 mistake-0.016
Token Tim
Token ID5045
Feature activation+5.92e-02

Pos logit contributions
 narrative+0.017
 obstacle+0.016
 ang+0.015
 mistake+0.015
 Moral+0.014
Neg logit contributions
cious-0.030
ze-0.027
mare-0.026
coat-0.026
suit-0.026
Token blocks
Token ID7021
Feature activation+3.88e-02

Pos logit contributions
 narrative+0.056
 obstacle+0.051
 ang+0.050
 Moral+0.050
Less+0.049
Neg logit contributions
cious-0.059
ze-0.052
 songs-0.049
we-0.049
pter-0.049
Token,
Token ID11
Feature activation+6.25e-03

Pos logit contributions
 narrative+0.058
 obstacle+0.053
 ang+0.052
 mistake+0.051
 trouble+0.051
Neg logit contributions
cious-0.067
ze-0.059
mare-0.057
coat-0.055
suit-0.055
Token but
Token ID475
Feature activation-3.84e-02

Pos logit contributions
 narrative+0.011
 obstacle+0.010
 ang+0.010
 trouble+0.010
 mistake+0.010
Neg logit contributions
cious-0.009
ze-0.008
mare-0.008
suit-0.007
coat-0.007
Token their
Token ID511
Feature activation-6.51e-02

Pos logit contributions
cious+0.059
ze+0.049
mare+0.048
suit+0.047
we+0.046
Neg logit contributions
 narrative-0.065
 obstacle-0.061
 ang-0.059
 mistake-0.058
 Moral-0.057
Token favorite
Token ID4004
Feature activation-9.90e-02

Pos logit contributions
cious+0.098
ze+0.083
mare+0.083
we+0.078
coat+0.078
Neg logit contributions
 narrative-0.111
 obstacle-0.104
 ang-0.102
 mistake-0.102
 trouble-0.101
Token ones
Token ID3392
Feature activation-9.08e-02

Pos logit contributions
cious+0.166
mare+0.137
ze+0.135
Ch+0.131
coat+0.128
Neg logit contributions
 mistake-0.155
 narrative-0.149
 obstacle-0.145
 problem-0.145
 wreck-0.143
Token were
Token ID547
Feature activation+1.14e-02

Pos logit contributions
cious+0.149
mare+0.133
ze+0.132
Ch+0.124
suit+0.122
Neg logit contributions
 narrative-0.138
 mistake-0.128
 trouble-0.126
 ang-0.125
 obstacle-0.125
Token the
Token ID262
Feature activation+1.43e-02

Pos logit contributions
 narrative+0.017
 obstacle+0.016
 ang+0.016
 mistake+0.015
 trouble+0.015
Neg logit contributions
cious-0.019
ze-0.017
mare-0.017
suit-0.016
coat-0.016
Token red
Token ID2266
Feature activation-5.40e-02

Pos logit contributions
 narrative+0.025
 obstacle+0.024
 ang+0.023
 Moral+0.023
Less+0.023
Neg logit contributions
cious-0.019
ze-0.017
coat-0.016
suit-0.016
 songs-0.016
Token ones
Token ID3392
Feature activation-9.05e-02

Pos logit contributions
cious+0.071
ze+0.068
pter+0.064
coat+0.064
 songs+0.063
Neg logit contributions
 narrative-0.096
 ang-0.088
Less-0.087
 obstacle-0.085
 trouble-0.085
Token.
Token ID13
Feature activation+5.92e-03

Pos logit contributions
cious+0.141
ze+0.124
mare+0.123
Ch+0.116
coat+0.115
Neg logit contributions
 narrative-0.146
 obstacle-0.134
 ang-0.132
 trouble-0.131
 mistake-0.130
Token asked
Token ID1965
Feature activation+2.79e-04

Pos logit contributions
cious+0.011
mare+0.010
ze+0.009
suit+0.009
we+0.009
Neg logit contributions
 narrative-0.015
 ang-0.014
 mistake-0.014
 obstacle-0.014
 mad-0.014
Token her
Token ID607
Feature activation-6.54e-02

Pos logit contributions
 narrative+0.000
 obstacle+0.000
 ang+0.000
 mistake+0.000
 trouble+0.000
Neg logit contributions
cious-0.000
ze-0.000
mare-0.000
 songs-0.000
we-0.000
Token mom
Token ID1995
Feature activation+6.39e-03

Pos logit contributions
cious+0.095
ze+0.081
mare+0.079
coat+0.074
suit+0.074
Neg logit contributions
 narrative-0.117
 obstacle-0.108
 ang-0.106
 trouble-0.103
 mistake-0.102
Token,
Token ID11
Feature activation-2.00e-03

Pos logit contributions
 narrative+0.010
 obstacle+0.009
 trouble+0.008
 ang+0.008
Less+0.008
Neg logit contributions
cious-0.011
ze-0.009
mare-0.009
coat-0.009
we-0.009
Token "
Token ID366
Feature activation-3.03e-02

Pos logit contributions
cious+0.003
ze+0.002
mare+0.002
suit+0.002
coat+0.002
Neg logit contributions
 narrative-0.004
 ang-0.004
 obstacle-0.004
 Moral-0.004
 mistake-0.004
TokenCan
Token ID6090
Feature activation-8.96e-02

Pos logit contributions
cious+0.037
ze+0.031
mare+0.029
suit+0.028
coat+0.028
Neg logit contributions
 narrative-0.061
 obstacle-0.056
 ang-0.056
 Moral-0.055
 mistake-0.054
Token we
Token ID356
Feature activation-4.06e-03

Pos logit contributions
cious+0.137
ze+0.117
mare+0.114
suit+0.111
coat+0.110
Neg logit contributions
 narrative-0.151
 obstacle-0.138
 ang-0.136
 Moral-0.135
 mistake-0.134
Token go
Token ID467
Feature activation+5.29e-02

Pos logit contributions
cious+0.006
ze+0.005
mare+0.005
coat+0.005
suit+0.005
Neg logit contributions
 narrative-0.007
 obstacle-0.006
 ang-0.006
 mistake-0.006
 Moral-0.006
Token on
Token ID319
Feature activation+2.57e-02

Pos logit contributions
 narrative+0.085
 obstacle+0.079
 ang+0.077
 mistake+0.075
 trouble+0.075
Neg logit contributions
cious-0.085
ze-0.073
mare-0.071
coat-0.069
suit-0.068
Token the
Token ID262
Feature activation+1.38e-02

Pos logit contributions
 narrative+0.047
 Moral+0.040
 obstacle+0.040
Less+0.040
 ang+0.039
Neg logit contributions
cious-0.035
ze-0.032
mare-0.031
coat-0.030
 songs-0.030
Token roller
Token ID24471
Feature activation+4.96e-02

Pos logit contributions
 narrative+0.019
 obstacle+0.018
 ang+0.018
 trouble+0.017
 mistake+0.017
Neg logit contributions
cious-0.025
ze-0.022
mare-0.021
suit-0.021
we-0.021
Token of
Token ID286
Feature activation-3.03e-02

Pos logit contributions
cious+0.054
ze+0.048
mare+0.046
coat+0.045
suit+0.045
Neg logit contributions
 narrative-0.052
 obstacle-0.048
 ang-0.047
 mistake-0.046
 Moral-0.046
Token water
Token ID1660
Feature activation-7.20e-02

Pos logit contributions
cious+0.035
ze+0.031
 songs+0.028
coat+0.028
mare+0.028
Neg logit contributions
 narrative-0.060
 obstacle-0.056
 ang-0.054
Less-0.054
 Moral-0.054
Token.
Token ID13
Feature activation+1.94e-03

Pos logit contributions
cious+0.117
ze+0.100
mare+0.098
Ch+0.093
coat+0.093
Neg logit contributions
 narrative-0.114
 obstacle-0.106
 ang-0.104
 mistake-0.103
 trouble-0.102
Token H
Token ID367
Feature activation+1.72e-01

Pos logit contributions
 narrative+0.003
 obstacle+0.003
 ang+0.003
 mistake+0.003
 trouble+0.003
Neg logit contributions
cious-0.003
ze-0.003
mare-0.003
suit-0.003
coat-0.003
Tokenoot
Token ID1025
Feature activation+6.39e-02

Pos logit contributions
 narrative+0.266
 ang+0.244
 obstacle+0.241
 Moral+0.233
 trouble+0.233
Neg logit contributions
cious-0.294
ze-0.255
mare-0.247
coat-0.247
we-0.244
Tokenie
Token ID494
Feature activation-9.44e-02

Pos logit contributions
 ang+0.065
 narrative+0.063
 mad+0.062
 mistake+0.061
 obstacle+0.061
Neg logit contributions
cious-0.127
ze-0.113
mare-0.112
pter-0.109
bes-0.109
Token picked
Token ID6497
Feature activation-7.34e-03

Pos logit contributions
cious+0.142
mare+0.137
coat+0.133
pter+0.130
suit+0.126
Neg logit contributions
 mad-0.137
 mistake-0.134
 ang-0.131
 Moral-0.128
 wreck-0.126
Token it
Token ID340
Feature activation-1.13e-02

Pos logit contributions
cious+0.014
ze+0.012
mare+0.012
coat+0.012
suit+0.012
Neg logit contributions
 narrative-0.010
 obstacle-0.009
 ang-0.009
 trouble-0.008
 mistake-0.008
Token up
Token ID510
Feature activation+6.23e-02

Pos logit contributions
cious+0.024
ze+0.022
mare+0.021
suit+0.021
coat+0.021
Neg logit contributions
 narrative-0.013
 obstacle-0.011
 ang-0.010
 trouble-0.010
 Moral-0.010
Token and
Token ID290
Feature activation+1.27e-02

Pos logit contributions
 narrative+0.099
 obstacle+0.091
 ang+0.088
 mistake+0.086
 Moral+0.086
Neg logit contributions
cious-0.102
ze-0.089
mare-0.086
coat-0.084
suit-0.084
Token sprayed
Token ID35729
Feature activation-1.25e-01

Pos logit contributions
 ang+0.020
 mad+0.019
 mistake+0.018
 trouble+0.017
 obstacle+0.017
Neg logit contributions
cious-0.021
pter-0.019
ze-0.018
oster-0.018
mare-0.017
Token.
Token ID13
Feature activation-1.08e-02

Pos logit contributions
cious+0.002
ze+0.002
mare+0.002
coat+0.002
we+0.002
Neg logit contributions
 narrative-0.003
 obstacle-0.003
 ang-0.003
 mistake-0.003
 trouble-0.003
Token
Token ID198
Feature activation-1.94e-02

Pos logit contributions
cious+0.018
ze+0.016
mare+0.016
suit+0.015
coat+0.015
Neg logit contributions
 narrative-0.017
 Moral-0.015
 trouble-0.015
 ang-0.015
 obstacle-0.015
Token
Token ID198
Feature activation-1.76e-02

Pos logit contributions
cious+0.022
ze+0.018
Ch+0.017
we+0.017
mare+0.016
Neg logit contributions
 narrative-0.040
 obstacle-0.039
 ang-0.037
 cheating-0.036
 trouble-0.036
TokenSam
Token ID16305
Feature activation-4.32e-02

Pos logit contributions
cious+0.018
Ch+0.014
we+0.014
ze+0.014
Rain+0.013
Neg logit contributions
 narrative-0.038
 obstacle-0.037
 ang-0.036
 trouble-0.035
 tragedy-0.035
Token and
Token ID290
Feature activation+1.42e-02

Pos logit contributions
cious+0.068
ze+0.059
mare+0.058
we+0.055
suit+0.055
Neg logit contributions
 narrative-0.069
 ang-0.064
 obstacle-0.063
 mistake-0.063
 Moral-0.062
Token Mia
Token ID32189
Feature activation-9.83e-02

Pos logit contributions
 narrative+0.030
 obstacle+0.028
 ang+0.028
 mistake+0.027
 trouble+0.027
Neg logit contributions
cious-0.016
ze-0.012
mare-0.012
Ch-0.011
coat-0.011
Token nodded
Token ID14464
Feature activation-7.47e-02

Pos logit contributions
cious+0.111
ze+0.090
coat+0.085
mare+0.083
suit+0.083
Neg logit contributions
 narrative-0.203
 obstacle-0.197
 trouble-0.189
 mistake-0.188
 ang-0.187
Token.
Token ID13
Feature activation-5.79e-03

Pos logit contributions
cious+0.124
mare+0.106
ze+0.105
coat+0.101
suit+0.100
Neg logit contributions
 narrative-0.117
 ang-0.109
 obstacle-0.107
 mistake-0.106
 mad-0.106
Token They
Token ID1119
Feature activation-2.31e-02

Pos logit contributions
cious+0.009
ze+0.008
mare+0.008
suit+0.008
coat+0.008
Neg logit contributions
 narrative-0.009
 obstacle-0.009
 ang-0.008
 mistake-0.008
 trouble-0.008
Token said
Token ID531
Feature activation+1.90e-03

Pos logit contributions
cious+0.026
ze+0.021
mare+0.020
coat+0.020
suit+0.020
Neg logit contributions
 narrative-0.047
 obstacle-0.045
 ang-0.044
 mistake-0.044
 trouble-0.044
Token they
Token ID484
Feature activation-2.95e-02

Pos logit contributions
 narrative+0.003
 obstacle+0.003
 ang+0.003
 mistake+0.003
Less+0.003
Neg logit contributions
cious-0.003
ze-0.002
 songs-0.002
ves-0.002
balls-0.002