TOP ACTIVATIONS
MAX = 0.083

't know the rh in oc eros was wild . Can
give to the rh in oc eros . Lily walked to
that the bad rh in oc eros was now nice .
am taking the rh in oc eros to a new home
there was a rh in oc eros who was always naughty
and all the other av oc ados were very happy too
," said the rh in oc eros . The
They find the rh in oc eros . They are gentle
was a big rh in oc eros . He was troubled
was a big rh in oc eros . He was very
fence where the rh in oc eros was . She saw
closer , the rh in oc eros started to tear up
as he opened the ch ocol ates and started eating them
giving him too many ch ocol ates ." The king learned
" The rh in oc eros is in there ,
learn more about rh in oc er oses and respect them
care of the rh in oc eros and it needed to
he would take the av oc ados and scatter them around
playing ! The rh in oc eros was very worried ,
it to the rh in oc eros . When he tasted

MIN ACTIVATIONS
MIN = -0.029

Sam gasped in amaz ement . He had never
, her eyes wide with amaz ement . Her
Tim my just stared in amaz ement at the t eddy
ly . Jill nodded in amaz ement .
. She looked up in amaz ement at all of the
L ucky watched in amaz ement as the frog zoom
again . The End . <|endoftext|> Tim and Kim
little girl looked up in amaz ement and said , "
Little Jack gasped in amaz ement . Dad pointed to
. To her amaz ement , the clothes came
go ! Everyone watched in amaz ement as it took off
friends stopped and watched in amaz ement . The dog too
the zipper and gasped with amaz ement at all the treasures
air . He watched with amaz ement as the z ig
, his eyes widening with amaz ement . He wanted to
She opened her mouth in amaz ement . She had never
T ina waited with amaz ement as the old man
up at the tree in amaz ement , wondering how it
looked at each other in amaz ement . They had learned
. Amy watched them with amaz ement . She smiled ,

INTERVAL 0.055 - 0.083
CONTAINS 0.004%

they saw a rh in oc eros . They tell their
wait for the rh in oc eros to come .
happy for the rh in oc eros . She said ,
The rh in oc eros lifted his head and
eros . The rh in oc eros looked very scary and

INTERVAL 0.027 - 0.055
CONTAINS 0.028%

started eating the delicious ch ocol ates . She was so
eat . C ind y was hesitant at first
but it wouldn 't bud ge . She was very upset
. He pulled and tug ged on the string and was
. They pulled and tug ged , and the book ripped

INTERVAL -0.001 - 0.027
CONTAINS 60.694%

saw Benny and called the police . B enny
and visit far away places . As they flew around ,
help him get back home safely . Tim my
!" " Yes , let 's go !" Ben
their homes . <|endoftext|> Once upon a time , there was

INTERVAL -0.029 - -0.001
CONTAINS 39.274%

who had a lovely cat named Mitt ens . Lily loved
The bird ie felt very scared and didn 't know what
funny and it made her feel a little better .
there lived a little bear named Y awn . Y awn
Lily was being stubborn . " I don 't want to
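
The four INTERVAL bands above look like equal-width bins between MIN = -0.029 and MAX = 0.083 (each band is 0.028 wide, and the CONTAINS percentages sum to 100%). The dump does not say how the bands were computed, so the following is only a minimal sketch under that equal-width assumption; `interval_edges` and `interval_index` are hypothetical helper names, not part of any dashboard tool.

```python
from bisect import bisect_right

def interval_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced edges from lo to hi (assumed binning)."""
    width = (hi - lo) / n_bins
    # Round to 3 decimals to match the precision shown in the dump.
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]

def interval_index(value, edges):
    """Return the 0-based index of the band containing value.

    Values at or beyond the outer edges are clamped into the first/last band.
    """
    i = bisect_right(edges, value) - 1
    return min(max(i, 0), len(edges) - 2)

# MIN and MAX are taken from the dump; 4 bands is read off the INTERVAL headers.
edges = interval_edges(-0.029, 0.083, 4)
print(edges)

# An activation of 6.77e-02 (seen on the "oc" token) falls in the top band.
print(interval_index(6.77e-02, edges))
```

Band 0 here is the lowest band (-0.029 to -0.001) and band 3 the highest (0.055 to 0.083), the reverse of the order in which the dump lists them.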
Token't
Token ID470
Feature activation+1.06e-02

Pos logit contributions
 blamed+0.003
 nodded+0.003
Kids+0.003
 swapped+0.002
Uh+0.002
Neg logit contributions
gy-0.002
 myself-0.002
 Owl-0.002
 possible-0.002
 rescuing-0.002
Token know
Token ID760
Feature activation+1.00e-02

Pos logit contributions
gy+0.020
 Owl+0.017
 myself+0.017
ulent+0.016
 possible+0.016
Neg logit contributions
 blamed-0.017
 nodded-0.017
Kids-0.016
 swapped-0.015
 did-0.015
Token the
Token ID262
Feature activation+3.65e-03

Pos logit contributions
gy+0.019
 Owl+0.016
 myself+0.016
 possible+0.015
ulent+0.015
Neg logit contributions
 blamed-0.017
 nodded-0.016
Kids-0.015
 did-0.014
 swapped-0.014
Token rh
Token ID9529
Feature activation+2.98e-02

Pos logit contributions
gy+0.007
 myself+0.006
 Owl+0.006
 possible+0.006
 rescuing+0.005
Neg logit contributions
 blamed-0.006
 nodded-0.006
Kids-0.006
 swapped-0.005
Uh-0.005
Tokenin
Token ID259
Feature activation+9.30e-03

Pos logit contributions
gy+0.049
 myself+0.043
 Owl+0.040
ulent+0.040
 possible+0.039
Neg logit contributions
 nodded-0.054
 blamed-0.054
Kids-0.049
 swapped-0.049
Uh-0.048
Tokenoc
Token ID420
Feature activation+6.77e-02

Pos logit contributions
gy+0.012
 myself+0.011
 possible+0.010
 Owl+0.010
ulent+0.009
Neg logit contributions
 nodded-0.020
 blamed-0.019
Kids-0.018
 did-0.018
 swapped-0.017
Tokeneros
Token ID27498
Feature activation+9.74e-03

Pos logit contributions
gy+0.088
 possible+0.087
 myself+0.074
ulent+0.074
 outer+0.072
Neg logit contributions
 nodded-0.129
 blamed-0.126
 swapped-0.115
 moms-0.115
outed-0.114
Token was
Token ID373
Feature activation+1.32e-03

Pos logit contributions
gy+0.023
 myself+0.021
 Owl+0.020
 possible+0.020
ulent+0.019
Neg logit contributions
 blamed-0.011
 nodded-0.011
Kids-0.010
 swapped-0.009
 did-0.009
Token wild
Token ID4295
Feature activation+5.47e-03

Pos logit contributions
gy+0.002
 myself+0.002
 Owl+0.002
 possible+0.002
ulent+0.002
Neg logit contributions
 blamed-0.002
 nodded-0.002
Kids-0.002
 swapped-0.002
Uh-0.002
Token.
Token ID13
Feature activation-2.40e-03

Pos logit contributions
gy+0.010
 myself+0.009
 Owl+0.009
 possible+0.009
ble+0.008
Neg logit contributions
 blamed-0.009
 nodded-0.009
Kids-0.008
 swapped-0.008
 did-0.007
Token Can
Token ID1680
Feature activation-8.74e-03

Pos logit contributions
 nodded+0.004
 blamed+0.004
Kids+0.004
 swapped+0.003
Uh+0.003
Neg logit contributions
gy-0.005
 Owl-0.004
 myself-0.004
 possible-0.004
 rescuing-0.004
Token give
Token ID1577
Feature activation+7.28e-03

Pos logit contributions
gy+0.002
 Owl+0.002
 myself+0.001
 possible+0.001
ulent+0.001
Neg logit contributions
 blamed-0.002
 nodded-0.002
Kids-0.002
 swapped-0.001
 did-0.001
Token to
Token ID284
Feature activation+2.06e-03

Pos logit contributions
gy+0.014
 Owl+0.013
 myself+0.012
 possible+0.012
ulent+0.012
Neg logit contributions
 nodded-0.011
 blamed-0.010
Kids-0.010
 swapped-0.010
 brushes-0.009
Token the
Token ID262
Feature activation+3.00e-03

Pos logit contributions
gy+0.004
 Owl+0.003
 possible+0.003
 myself+0.003
ulent+0.003
Neg logit contributions
 blamed-0.003
 nodded-0.003
Kids-0.003
 swapped-0.003
 brushes-0.003
Token rh
Token ID9529
Feature activation+2.48e-02

Pos logit contributions
gy+0.006
 myself+0.005
 Owl+0.005
 possible+0.005
 rescuing+0.004
Neg logit contributions
 blamed-0.005
 nodded-0.005
Kids-0.005
 swapped-0.004
Uh-0.004
Tokenin
Token ID259
Feature activation+5.67e-03

Pos logit contributions
gy+0.042
 myself+0.036
 possible+0.036
ulent+0.034
 Owl+0.034
Neg logit contributions
 nodded-0.043
 blamed-0.043
Kids-0.040
 swapped-0.039
Uh-0.038
Tokenoc
Token ID420
Feature activation+6.48e-02

Pos logit contributions
gy+0.008
 myself+0.008
 possible+0.007
bing+0.007
ble+0.007
Neg logit contributions
 nodded-0.011
 blamed-0.010
 thanked-0.009
 said-0.009
 did-0.009
Tokeneros
Token ID27498
Feature activation+7.00e-03

Pos logit contributions
 possible+0.087
gy+0.085
 myself+0.070
ulent+0.069
 outer+0.069
Neg logit contributions
 nodded-0.123
 blamed-0.119
 moms-0.113
 swapped-0.108
outed-0.108
Token.
Token ID13
Feature activation-9.94e-04

Pos logit contributions
gy+0.012
 myself+0.011
 possible+0.010
 Owl+0.010
ulent+0.010
Neg logit contributions
 nodded-0.012
 blamed-0.012
Kids-0.010
 swapped-0.010
 did-0.010
Token Lily
Token ID20037
Feature activation+1.12e-03

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.002
 Owl-0.002
 myself-0.002
 possible-0.002
 rescuing-0.002
Token walked
Token ID6807
Feature activation+2.05e-04

Pos logit contributions
gy+0.003
 Owl+0.002
 myself+0.002
 possible+0.002
 rescuing+0.002
Neg logit contributions
 blamed-0.001
Kids-0.001
 nodded-0.001
Uh-0.001
 swapped-0.001
Token to
Token ID284
Feature activation-1.65e-03

Pos logit contributions
gy+0.000
 myself+0.000
 Owl+0.000
 possible+0.000
 rescuing+0.000
Neg logit contributions
 blamed-0.000
 nodded-0.000
Kids-0.000
 swapped-0.000
Uh-0.000
Token that
Token ID326
Feature activation-1.54e-03

Pos logit contributions
gy+0.010
 myself+0.009
 Owl+0.009
 possible+0.009
 rescuing+0.008
Neg logit contributions
 blamed-0.012
 nodded-0.011
Kids-0.011
 swapped-0.011
Uh-0.011
Token the
Token ID262
Feature activation+2.48e-03

Pos logit contributions
 blamed+0.002
 nodded+0.002
Kids+0.002
 swapped+0.002
Uh+0.002
Neg logit contributions
gy-0.003
 myself-0.003
 Owl-0.003
 possible-0.002
ulent-0.002
Token bad
Token ID2089
Feature activation+5.91e-04

Pos logit contributions
gy+0.005
 myself+0.004
 Owl+0.004
 possible+0.004
 rescuing+0.004
Neg logit contributions
 blamed-0.004
 nodded-0.004
Kids-0.004
 swapped-0.003
Uh-0.003
Token rh
Token ID9529
Feature activation+2.62e-02

Pos logit contributions
gy+0.001
 myself+0.001
 Owl+0.001
 possible+0.001
 rescuing+0.001
Neg logit contributions
 blamed-0.001
 nodded-0.001
Kids-0.001
 swapped-0.001
Uh-0.001
Tokenin
Token ID259
Feature activation+5.00e-03

Pos logit contributions
gy+0.041
 myself+0.036
ulent+0.034
 Owl+0.034
 possible+0.033
Neg logit contributions
 nodded-0.050
 blamed-0.050
Kids-0.046
 did-0.045
 swapped-0.044
Tokenoc
Token ID420
Feature activation+6.45e-02

Pos logit contributions
gy+0.006
 myself+0.005
 Owl+0.005
 possible+0.005
ulent+0.005
Neg logit contributions
 blamed-0.011
 nodded-0.011
Kids-0.011
 swapped-0.010
Uh-0.010
Tokeneros
Token ID27498
Feature activation+7.00e-03

Pos logit contributions
gy+0.084
 possible+0.080
 outer+0.071
 myself+0.070
ulent+0.070
Neg logit contributions
 nodded-0.129
 blamed-0.123
 swapped-0.113
 thanked-0.113
 moms-0.112
Token was
Token ID373
Feature activation+5.50e-05

Pos logit contributions
gy+0.016
 myself+0.015
 Owl+0.015
 possible+0.014
 rescuing+0.014
Neg logit contributions
 blamed-0.008
 nodded-0.008
Kids-0.008
 swapped-0.007
Uh-0.007
Token now
Token ID783
Feature activation-4.30e-03

Pos logit contributions
gy+0.000
 myself+0.000
 Owl+0.000
 possible+0.000
ulent+0.000
Neg logit contributions
 nodded-0.000
 blamed-0.000
Kids-0.000
 swapped-0.000
 did-0.000
Token nice
Token ID3621
Feature activation+5.77e-03

Pos logit contributions
 nodded+0.008
 blamed+0.008
Kids+0.007
 swapped+0.007
 did+0.007
Neg logit contributions
gy-0.007
 myself-0.006
 Owl-0.006
 possible-0.006
ulent-0.006
Token.
Token ID13
Feature activation-2.90e-03

Pos logit contributions
gy+0.010
 Owl+0.009
 myself+0.009
 possible+0.008
 rescuing+0.008
Neg logit contributions
 blamed-0.011
Kids-0.010
 nodded-0.010
Uh-0.009
 swapped-0.009
Token am
Token ID716
Feature activation+1.22e-02

Pos logit contributions
gy+0.008
 myself+0.007
 possible+0.007
 Owl+0.007
 rescuing+0.006
Neg logit contributions
Kids-0.007
 blamed-0.007
 nodded-0.006
Uh-0.006
Wow-0.006
Token taking
Token ID2263
Feature activation+1.01e-03

Pos logit contributions
gy+0.022
 Owl+0.018
 myself+0.018
ulent+0.017
 possible+0.017
Neg logit contributions
 blamed-0.021
 nodded-0.021
Kids-0.020
 swapped-0.019
 did-0.018
Token the
Token ID262
Feature activation+9.74e-04

Pos logit contributions
gy+0.002
 Owl+0.001
 possible+0.001
ble+0.001
 myself+0.001
Neg logit contributions
 nodded-0.002
 blamed-0.002
Kids-0.002
 swapped-0.002
 turns-0.002
Token rh
Token ID9529
Feature activation+2.27e-02

Pos logit contributions
gy+0.002
 myself+0.002
 Owl+0.002
 possible+0.002
ulent+0.002
Neg logit contributions
 nodded-0.001
 blamed-0.001
Kids-0.001
 swapped-0.001
Uh-0.001
Tokenin
Token ID259
Feature activation+7.47e-04

Pos logit contributions
gy+0.040
 myself+0.035
ulent+0.033
 possible+0.033
 Owl+0.033
Neg logit contributions
 nodded-0.036
 blamed-0.036
Uh-0.033
Kids-0.033
 swapped-0.032
Tokenoc
Token ID420
Feature activation+6.40e-02

Pos logit contributions
gy+0.001
 possible+0.001
 myself+0.001
ulent+0.001
ble+0.001
Neg logit contributions
 nodded-0.001
 blamed-0.001
 thanked-0.001
 said-0.001
 did-0.001
Tokeneros
Token ID27498
Feature activation+5.00e-03

Pos logit contributions
 possible+0.085
gy+0.085
 myself+0.072
 outer+0.071
ulent+0.070
Neg logit contributions
 nodded-0.118
 blamed-0.112
 moms-0.107
omo-0.105
outed-0.104
Token to
Token ID284
Feature activation-1.27e-03

Pos logit contributions
gy+0.009
 Owl+0.008
 myself+0.008
ulent+0.007
ble+0.007
Neg logit contributions
 blamed-0.008
 nodded-0.008
 did-0.007
Kids-0.007
 swapped-0.007
Token a
Token ID257
Feature activation-1.90e-03

Pos logit contributions
 blamed+0.002
 thanked+0.002
 nodded+0.002
 did+0.002
 said+0.002
Neg logit contributions
gy-0.002
owing-0.002
bing-0.002
rying-0.002
arling-0.002
Token new
Token ID649
Feature activation+5.53e-03

Pos logit contributions
 blamed+0.003
 nodded+0.003
Kids+0.003
 swapped+0.003
Uh+0.002
Neg logit contributions
gy-0.004
 myself-0.003
 Owl-0.003
ble-0.003
 possible-0.003
Token home
Token ID1363
Feature activation-2.25e-03

Pos logit contributions
gy+0.010
 myself+0.009
 Owl+0.008
 possible+0.008
ble+0.008
Neg logit contributions
 nodded-0.009
 blamed-0.009
Kids-0.008
 swapped-0.008
omo-0.008
Token there
Token ID612
Feature activation+1.26e-03

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.001
 myself-0.001
 Owl-0.001
 possible-0.001
ulent-0.001
Token was
Token ID373
Feature activation+4.54e-03

Pos logit contributions
gy+0.003
 myself+0.002
 Owl+0.002
 possible+0.002
 rescuing+0.002
Neg logit contributions
 blamed-0.002
 nodded-0.002
Kids-0.002
 swapped-0.001
Uh-0.001
Token a
Token ID257
Feature activation-7.09e-04

Pos logit contributions
gy+0.006
 myself+0.005
 Owl+0.005
 possible+0.005
ble+0.004
Neg logit contributions
 blamed-0.010
 nodded-0.010
Kids-0.009
 swapped-0.009
Uh-0.009
Token rh
Token ID9529
Feature activation+2.63e-02

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.001
 Owl-0.001
 myself-0.001
 possible-0.001
 rescuing-0.001
Tokenin
Token ID259
Feature activation+6.30e-03

Pos logit contributions
gy+0.044
 myself+0.039
 possible+0.038
ulent+0.036
 Owl+0.035
Neg logit contributions
 nodded-0.047
 blamed-0.046
Kids-0.042
Uh-0.042
 did-0.041
Tokenoc
Token ID420
Feature activation+6.39e-02

Pos logit contributions
gy+0.009
 myself+0.008
 possible+0.008
ulent+0.007
 rescuing+0.007
Neg logit contributions
 nodded-0.013
 blamed-0.012
Kids-0.011
 did-0.011
omo-0.011
Tokeneros
Token ID27498
Feature activation+8.42e-03

Pos logit contributions
gy+0.084
 possible+0.083
 myself+0.071
 outer+0.069
 afloat+0.065
Neg logit contributions
 nodded-0.125
 blamed-0.120
 moms-0.109
 thanked-0.108
 swapped-0.107
Token who
Token ID508
Feature activation+4.94e-03

Pos logit contributions
gy+0.016
 myself+0.015
 possible+0.015
 Owl+0.014
ulent+0.014
Neg logit contributions
 blamed-0.013
 nodded-0.012
Kids-0.011
 did-0.011
 swapped-0.011
Token was
Token ID373
Feature activation+2.55e-03

Pos logit contributions
gy+0.012
 myself+0.011
 possible+0.011
 Owl+0.011
ulent+0.011
Neg logit contributions
 blamed-0.005
 nodded-0.005
Kids-0.004
 swapped-0.004
 did-0.004
Token always
Token ID1464
Feature activation+6.50e-03

Pos logit contributions
gy+0.005
 myself+0.004
 possible+0.004
 ourselves+0.004
 Owl+0.004
Neg logit contributions
 blamed-0.004
 nodded-0.004
 swapped-0.003
 did-0.003
 thanked-0.003
Token naughty
Token ID45128
Feature activation-3.45e-03

Pos logit contributions
gy+0.010
 myself+0.009
 Owl+0.009
ulent+0.008
ble+0.008
Neg logit contributions
 nodded-0.012
 blamed-0.012
 did-0.011
 swapped-0.011
 thanked-0.010
Token and
Token ID290
Feature activation-2.10e-03

Pos logit contributions
gy+0.005
 myself+0.004
 Owl+0.004
 possible+0.004
 rescuing+0.004
Neg logit contributions
 blamed-0.005
Kids-0.005
 nodded-0.005
Uh-0.004
 swapped-0.004
Token all
Token ID477
Feature activation+5.27e-03

Pos logit contributions
 blamed+0.003
Kids+0.003
 nodded+0.003
Uh+0.003
 swapped+0.003
Neg logit contributions
gy-0.004
 myself-0.003
 Owl-0.003
 possible-0.003
 rescuing-0.003
Token the
Token ID262
Feature activation+1.76e-03

Pos logit contributions
gy+0.009
 myself+0.008
 possible+0.008
 Owl+0.007
 rescuing+0.007
Neg logit contributions
 blamed-0.009
Kids-0.009
 nodded-0.009
Uh-0.009
 swapped-0.008
Token other
Token ID584
Feature activation-2.86e-03

Pos logit contributions
gy+0.003
 myself+0.003
 Owl+0.003
 possible+0.003
 rescuing+0.003
Neg logit contributions
 blamed-0.003
 nodded-0.003
Kids-0.003
 swapped-0.002
Uh-0.002
Token av
Token ID1196
Feature activation+3.41e-02

Pos logit contributions
 blamed+0.004
 nodded+0.004
Kids+0.004
 swapped+0.004
Uh+0.004
Neg logit contributions
gy-0.006
 myself-0.005
 Owl-0.005
 possible-0.005
 rescuing-0.005
Tokenoc
Token ID420
Feature activation+6.38e-02

Pos logit contributions
gy+0.044
 myself+0.040
 possible+0.039
 Owl+0.037
ble+0.037
Neg logit contributions
 nodded-0.071
 blamed-0.069
 swapped-0.066
Uh-0.064
Kids-0.064
Tokenados
Token ID22484
Feature activation+5.19e-03

Pos logit contributions
gy+0.106
 possible+0.098
 outer+0.091
 afloat+0.088
 myself+0.085
Neg logit contributions
 nodded-0.110
 blamed-0.107
 swapped-0.096
 moms-0.096
outed-0.094
Token were
Token ID547
Feature activation+4.52e-03

Pos logit contributions
gy+0.012
 myself+0.010
 Owl+0.010
 possible+0.010
 rescuing+0.010
Neg logit contributions
 blamed-0.006
Kids-0.006
 nodded-0.006
Uh-0.006
Wow-0.005
Token very
Token ID845
Feature activation+7.68e-03

Pos logit contributions
gy+0.008
 myself+0.007
 Owl+0.007
 possible+0.006
 rescuing+0.006
Neg logit contributions
 blamed-0.008
 nodded-0.008
Kids-0.008
 swapped-0.007
Uh-0.007
Token happy
Token ID3772
Feature activation+4.58e-03

Pos logit contributions
gy+0.015
 Owl+0.013
 myself+0.013
 possible+0.012
 rescuing+0.012
Neg logit contributions
 blamed-0.013
 nodded-0.013
Kids-0.012
 swapped-0.011
Uh-0.011
Token too
Token ID1165
Feature activation-4.09e-03

Pos logit contributions
gy+0.006
 myself+0.006
 Owl+0.006
 possible+0.005
 rescuing+0.005
Neg logit contributions
 blamed-0.010
Kids-0.009
 nodded-0.009
Uh-0.009
 swapped-0.009
Token,"
Token ID553
Feature activation-6.62e-03

Pos logit contributions
gy+0.011
 Owl+0.009
 myself+0.009
 possible+0.009
ulent+0.009
Neg logit contributions
 blamed-0.010
 nodded-0.010
Kids-0.009
 did-0.009
 swapped-0.009
Token said
Token ID531
Feature activation+2.01e-03

Pos logit contributions
 blamed+0.009
Kids+0.009
 nodded+0.009
 swapped+0.008
Uh+0.008
Neg logit contributions
gy-0.014
 myself-0.013
 Owl-0.013
 possible-0.012
 rescuing-0.011
Token the
Token ID262
Feature activation-8.88e-04

Pos logit contributions
gy+0.004
 myself+0.004
 possible+0.003
 Owl+0.003
ulent+0.003
Neg logit contributions
 nodded-0.003
 blamed-0.003
Kids-0.003
 swapped-0.003
 did-0.003
Token rh
Token ID9529
Feature activation+2.02e-02

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.002
 myself-0.002
 Owl-0.002
 possible-0.002
ulent-0.002
Tokenin
Token ID259
Feature activation-1.14e-03

Pos logit contributions
gy+0.034
 myself+0.029
 possible+0.028
ulent+0.027
 Owl+0.027
Neg logit contributions
 nodded-0.038
 blamed-0.037
Kids-0.034
 swapped-0.033
 did-0.033
Tokenoc
Token ID420
Feature activation+6.36e-02

Pos logit contributions
 nodded+0.002
 blamed+0.002
Kids+0.002
 did+0.002
 swapped+0.002
Neg logit contributions
gy-0.002
 myself-0.002
 possible-0.002
 Owl-0.002
 rescuing-0.001
Tokeneros
Token ID27498
Feature activation+2.40e-03

Pos logit contributions
 possible+0.087
gy+0.086
 outer+0.074
 myself+0.071
 afloat+0.070
Neg logit contributions
 nodded-0.121
 blamed-0.114
 moms-0.110
omo-0.105
 swapped-0.105
Token.
Token ID13
Feature activation-3.99e-03

Pos logit contributions
gy+0.004
 myself+0.004
 possible+0.004
 Owl+0.004
ulent+0.003
Neg logit contributions
 nodded-0.004
 blamed-0.004
Kids-0.004
 swapped-0.004
 did-0.003
Token\n
Token ID198
Feature activation-1.88e-03

Pos logit contributions
 blamed+0.006
Kids+0.006
 nodded+0.006
 swapped+0.006
Uh+0.005
Neg logit contributions
gy-0.008
 Owl-0.007
 myself-0.007
 possible-0.006
 rescuing-0.006
Token\n
Token ID198
Feature activation-2.69e-03

Pos logit contributions
 blamed+0.003
 nodded+0.003
Kids+0.003
 swapped+0.002
Uh+0.002
Neg logit contributions
gy-0.004
 myself-0.003
 Owl-0.003
 possible-0.003
 rescuing-0.003
TokenThe
Token ID464
Feature activation-1.11e-03

Pos logit contributions
 blamed+0.004
 nodded+0.004
Kids+0.004
 swapped+0.004
Uh+0.004
Neg logit contributions
gy-0.005
 myself-0.005
 Owl-0.005
 possible-0.004
 rescuing-0.004
Token They
Token ID1119
Feature activation-3.48e-04

Pos logit contributions
gy+0.000
 myself+0.000
 Owl+0.000
 possible+0.000
 rescuing+0.000
Neg logit contributions
 blamed-0.000
 nodded-0.000
Kids-0.000
 swapped-0.000
Uh-0.000
Token find
Token ID1064
Feature activation+4.17e-03

Pos logit contributions
Kids+0.001
 blamed+0.001
Uh+0.001
 nodded+0.001
omo+0.001
Neg logit contributions
gy-0.001
 Owl-0.001
 myself-0.001
 possible-0.001
 rescuing-0.000
Token the
Token ID262
Feature activation+1.79e-04

Pos logit contributions
gy+0.006
 myself+0.005
 Owl+0.005
 possible+0.005
ulent+0.005
Neg logit contributions
 blamed-0.008
 nodded-0.008
Kids-0.008
 swapped-0.008
 did-0.007
Token rh
Token ID9529
Feature activation+2.05e-02

Pos logit contributions
gy+0.000
 myself+0.000
 Owl+0.000
 possible+0.000
 rescuing+0.000
Neg logit contributions
 blamed-0.000
 nodded-0.000
Kids-0.000
 swapped-0.000
Uh-0.000
Tokenin
Token ID259
Feature activation+2.14e-03

Pos logit contributions
gy+0.036
 myself+0.031
 possible+0.030
ulent+0.029
 Owl+0.029
Neg logit contributions
 nodded-0.034
 blamed-0.034
Kids-0.032
Uh-0.032
 did-0.031
Tokenoc
Token ID420
Feature activation+6.34e-02

Pos logit contributions
 myself+0.003
gy+0.003
 possible+0.003
ulent+0.003
unct+0.003
Neg logit contributions
 nodded-0.004
 did-0.004
 blamed-0.004
 thanked-0.003
 said-0.003
Tokeneros
Token ID27498
Feature activation+2.80e-03

Pos logit contributions
 possible+0.090
gy+0.087
 outer+0.076
 myself+0.075
ulent+0.070
Neg logit contributions
 nodded-0.114
 moms-0.107
 blamed-0.107
 swapped-0.102
omo-0.100
Token.
Token ID13
Feature activation-1.97e-03

Pos logit contributions
gy+0.005
 myself+0.005
 Owl+0.005
 possible+0.005
ulent+0.005
Neg logit contributions
 nodded-0.004
 blamed-0.004
 did-0.004
 swapped-0.004
Kids-0.003
Token They
Token ID1119
Feature activation-8.70e-04

Pos logit contributions
 nodded+0.003
 blamed+0.003
 swapped+0.003
Kids+0.003
Uh+0.003
Neg logit contributions
gy-0.004
 Owl-0.003
 myself-0.003
 possible-0.003
 rescuing-0.003
Token are
Token ID389
Feature activation+4.77e-03

Pos logit contributions
Kids+0.002
 blamed+0.002
 nodded+0.001
Uh+0.001
omo+0.001
Neg logit contributions
gy-0.001
 Owl-0.001
 myself-0.001
 possible-0.001
 rescuing-0.001
Token gentle
Token ID10296
Feature activation+2.63e-03

Pos logit contributions
gy+0.007
 myself+0.006
 Owl+0.006
 possible+0.006
 rescuing+0.006
Neg logit contributions
Kids-0.009
 nodded-0.009
 blamed-0.009
Uh-0.008
 swapped-0.008
Token was
Token ID373
Feature activation+3.12e-03

Pos logit contributions
gy+0.003
 myself+0.002
 Owl+0.002
 possible+0.002
ulent+0.002
Neg logit contributions
 nodded-0.001
 blamed-0.001
Kids-0.001
 swapped-0.001
 did-0.001
Token a
Token ID257
Feature activation-1.68e-03

Pos logit contributions
gy+0.004
 myself+0.003
 Owl+0.003
 possible+0.003
ble+0.003
Neg logit contributions
 blamed-0.007
 nodded-0.007
Kids-0.006
 swapped-0.006
Uh-0.006
Token big
Token ID1263
Feature activation+4.83e-03

Pos logit contributions
 blamed+0.002
 nodded+0.002
Kids+0.002
 swapped+0.002
Uh+0.002
Neg logit contributions
gy-0.003
 Owl-0.003
 myself-0.003
 possible-0.003
 rescuing-0.003
Token rh
Token ID9529
Feature activation+2.45e-02

Pos logit contributions
gy+0.009
 Owl+0.008
 myself+0.007
 possible+0.007
 rescuing+0.007
Neg logit contributions
 blamed-0.008
Kids-0.008
 nodded-0.008
 swapped-0.008
Uh-0.007
Tokenin
Token ID259
Feature activation+4.80e-03

Pos logit contributions
gy+0.042
 myself+0.036
 possible+0.035
ulent+0.034
 Owl+0.033
Neg logit contributions
 nodded-0.043
 blamed-0.043
Kids-0.039
Uh-0.039
 did-0.038
Tokenoc
Token ID420
Feature activation+6.33e-02

Pos logit contributions
gy+0.006
 myself+0.006
 possible+0.005
 Owl+0.005
ulent+0.005
Neg logit contributions
 blamed-0.010
 nodded-0.010
Kids-0.009
 did-0.009
 swapped-0.009
Tokeneros
Token ID27498
Feature activation+6.47e-03

Pos logit contributions
gy+0.085
 possible+0.084
 myself+0.071
 outer+0.069
 afloat+0.066
Neg logit contributions
 nodded-0.123
 blamed-0.118
 moms-0.107
 thanked-0.105
 swapped-0.105
Token.
Token ID13
Feature activation-3.26e-04

Pos logit contributions
gy+0.012
 myself+0.011
 possible+0.011
 Owl+0.010
ulent+0.010
Neg logit contributions
 blamed-0.010
 nodded-0.010
Kids-0.009
 swapped-0.009
 did-0.009
Token He
Token ID679
Feature activation+4.69e-04

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.001
 Owl-0.000
 myself-0.000
 possible-0.000
 rescuing-0.000
Token was
Token ID373
Feature activation+1.44e-03

Pos logit contributions
gy+0.001
 Owl+0.001
 myself+0.001
 possible+0.001
 rescuing+0.001
Neg logit contributions
Kids-0.001
 blamed-0.001
 nodded-0.001
Uh-0.000
 swapped-0.000
Token troubled
Token ID17840
Feature activation-5.62e-03

Pos logit contributions
gy+0.003
 myself+0.002
 possible+0.002
 Owl+0.002
ble+0.002
Neg logit contributions
 blamed-0.002
 nodded-0.002
 swapped-0.002
Kids-0.002
 did-0.002
Token was
Token ID373
Feature activation+3.09e-03

Pos logit contributions
gy+0.005
 myself+0.004
 Owl+0.004
 possible+0.004
ulent+0.004
Neg logit contributions
 blamed-0.002
 nodded-0.002
Kids-0.002
 swapped-0.002
Uh-0.002
Token a
Token ID257
Feature activation-2.20e-03

Pos logit contributions
gy+0.004
 Owl+0.004
 myself+0.004
 possible+0.004
 rescuing+0.003
Neg logit contributions
 blamed-0.006
 nodded-0.006
Kids-0.006
 swapped-0.006
Uh-0.006
Token big
Token ID1263
Feature activation+5.20e-03

Pos logit contributions
 nodded+0.003
 blamed+0.003
Kids+0.002
 swapped+0.002
Uh+0.002
Neg logit contributions
gy-0.005
 myself-0.005
 possible-0.005
 Owl-0.005
 rescuing-0.004
Token rh
Token ID9529
Feature activation+2.20e-02

Pos logit contributions
gy+0.010
 myself+0.009
 Owl+0.008
 possible+0.008
 rescuing+0.008
Neg logit contributions
 blamed-0.008
 nodded-0.008
Kids-0.008
 swapped-0.007
Uh-0.007
Tokenin
Token ID259
Feature activation+2.16e-03

Pos logit contributions
gy+0.037
 myself+0.032
 possible+0.032
ulent+0.031
 Owl+0.030
Neg logit contributions
 nodded-0.039
 blamed-0.039
Kids-0.036
 did-0.035
Uh-0.034
Tokenoc
Token ID420
Feature activation+6.32e-02

Pos logit contributions
gy+0.003
 myself+0.003
 possible+0.003
ulent+0.002
 Owl+0.002
Neg logit contributions
 nodded-0.004
 blamed-0.004
Kids-0.004
 did-0.004
omo-0.004
Tokeneros
Token ID27498
Feature activation+4.60e-03

Pos logit contributions
 possible+0.091
gy+0.087
 afloat+0.073
 myself+0.073
 outer+0.070
Neg logit contributions
 nodded-0.118
 blamed-0.112
 moms-0.108
omo-0.103
 swapped-0.101
Token.
Token ID13
Feature activation+7.39e-04

Pos logit contributions
gy+0.010
 myself+0.010
 possible+0.010
 Owl+0.009
ulent+0.009
Neg logit contributions
 blamed-0.006
 nodded-0.006
Kids-0.005
 did-0.005
 swapped-0.004
Token He
Token ID679
Feature activation+1.20e-03

Pos logit contributions
gy+0.001
 possible+0.001
 myself+0.001
ble+0.001
rying+0.001
Neg logit contributions
 blamed-0.001
 nodded-0.001
Wow-0.001
Kids-0.001
 swapped-0.001
Token was
Token ID373
Feature activation+1.86e-03

Pos logit contributions
gy+0.003
 myself+0.003
 possible+0.003
 Owl+0.003
ulent+0.003
Neg logit contributions
 blamed-0.001
 nodded-0.001
 did-0.001
Kids-0.001
 swapped-0.001
Token very
Token ID845
Feature activation+6.50e-03

Pos logit contributions
gy+0.004
 myself+0.003
 possible+0.003
bing+0.003
 ourselves+0.003
Neg logit contributions
 blamed-0.003
 nodded-0.003
 did-0.002
 swapped-0.002
 thanked-0.002
Token fence
Token ID13990
Feature activation+2.34e-04

Pos logit contributions
gy+0.003
 possible+0.003
 myself+0.003
owing+0.002
 Owl+0.002
Neg logit contributions
 blamed-0.002
 nodded-0.002
 swapped-0.002
Kids-0.002
Uh-0.002
Token where
Token ID810
Feature activation+1.14e-03

Pos logit contributions
gy+0.000
 myself+0.000
 Owl+0.000
 possible+0.000
ulent+0.000
Neg logit contributions
 nodded-0.000
 blamed-0.000
Kids-0.000
 swapped-0.000
Uh-0.000
Token the
Token ID262
Feature activation+2.00e-03

Pos logit contributions
gy+0.002
 myself+0.002
 Owl+0.002
 possible+0.002
 rescuing+0.002
Neg logit contributions
 blamed-0.002
 nodded-0.002
Kids-0.002
 swapped-0.002
Uh-0.002
Token rh
Token ID9529
Feature activation+2.17e-02

Pos logit contributions
gy+0.003
 myself+0.003
 possible+0.003
 Owl+0.003
ulent+0.003
Neg logit contributions
 blamed-0.004
 nodded-0.004
Kids-0.003
Uh-0.003
 swapped-0.003
Tokenin
Token ID259
Feature activation+5.69e-03

Pos logit contributions
gy+0.035
 myself+0.031
 possible+0.030
ulent+0.029
 Owl+0.028
Neg logit contributions
 blamed-0.040
 nodded-0.040
Kids-0.037
Uh-0.036
 swapped-0.036
Tokenoc
Token ID420
Feature activation+6.31e-02

Pos logit contributions
gy+0.006
 myself+0.006
 possible+0.005
 Owl+0.005
ulent+0.005
Neg logit contributions
 nodded-0.013
 blamed-0.013
Kids-0.012
 swapped-0.012
Uh-0.012
Tokeneros
Token ID27498
Feature activation+5.19e-03

Pos logit contributions
 possible+0.091
gy+0.083
ulent+0.074
 outer+0.073
 myself+0.073
Neg logit contributions
 nodded-0.116
 blamed-0.112
 moms-0.108
omo-0.102
outed-0.102
Token was
Token ID373
Feature activation+1.37e-03

Pos logit contributions
gy+0.012
 myself+0.011
 Owl+0.011
 possible+0.011
ulent+0.011
Neg logit contributions
 blamed-0.006
 nodded-0.006
Kids-0.005
 swapped-0.005
 did-0.004
Token.
Token ID13
Feature activation-1.72e-03

Pos logit contributions
gy+0.002
 myself+0.002
 Owl+0.002
 possible+0.002
ulent+0.002
Neg logit contributions
 blamed-0.002
 nodded-0.002
 did-0.002
 swapped-0.002
 thanked-0.002
Token She
Token ID1375
Feature activation-1.59e-03

Pos logit contributions
 nodded+0.003
 blamed+0.003
Kids+0.003
 swapped+0.003
Uh+0.003
Neg logit contributions
gy-0.003
 Owl-0.003
 myself-0.003
 possible-0.002
 Must-0.002
Token saw
Token ID2497
Feature activation-1.56e-03

Pos logit contributions
Kids+0.002
 blamed+0.002
Uh+0.002
 nodded+0.002
Wow+0.002
Neg logit contributions
gy-0.004
 Owl-0.003
 myself-0.003
 rescuing-0.003
 possible-0.003
Token closer
Token ID5699
Feature activation-1.91e-03

Pos logit contributions
gy+0.003
 myself+0.003
 Owl+0.003
 possible+0.002
 rescuing+0.002
Neg logit contributions
 blamed-0.004
 nodded-0.004
Kids-0.004
 swapped-0.004
Uh-0.004
Token,
Token ID11
Feature activation-1.37e-03

Pos logit contributions
 nodded+0.003
 blamed+0.003
 did+0.003
Kids+0.002
 swapped+0.002
Neg logit contributions
gy-0.004
 myself-0.003
 Owl-0.003
ulent-0.003
 possible-0.003
Token the
Token ID262
Feature activation+9.38e-04

Pos logit contributions
 blamed+0.002
 nodded+0.002
Kids+0.002
 swapped+0.002
 did+0.002
Neg logit contributions
gy-0.003
 myself-0.002
 Owl-0.002
 possible-0.002
ulent-0.002
Token rh
Token ID9529
Feature activation+2.17e-02

Pos logit contributions
gy+0.002
 myself+0.002
 Owl+0.002
 possible+0.002
 rescuing+0.002
Neg logit contributions
 blamed-0.001
 nodded-0.001
Kids-0.001
 swapped-0.001
Uh-0.001
Tokenin
Token ID259
Feature activation+3.36e-04

Pos logit contributions
gy+0.036
 myself+0.033
 possible+0.030
ulent+0.030
 Owl+0.029
Neg logit contributions
 nodded-0.040
 blamed-0.039
Kids-0.035
Uh-0.035
 did-0.035
Tokenoc
Token ID420
Feature activation+6.31e-02

Pos logit contributions
gy+0.000
 myself+0.000
 possible+0.000
 Owl+0.000
 rescuing+0.000
Neg logit contributions
 nodded-0.001
 blamed-0.001
Kids-0.001
 swapped-0.001
Uh-0.001
Tokeneros
Token ID27498
Feature activation+4.92e-03

Pos logit contributions
 possible+0.090
gy+0.084
 myself+0.074
 outer+0.074
ulent+0.071
Neg logit contributions
 nodded-0.118
 blamed-0.111
 moms-0.107
 swapped-0.102
 thanked-0.102
Token started
Token ID2067
Feature activation-1.37e-03

Pos logit contributions
gy+0.012
 Owl+0.011
 myself+0.011
 possible+0.010
 rescuing+0.010
Neg logit contributions
 blamed-0.005
 nodded-0.005
Kids-0.005
 swapped-0.005
Uh-0.004
Token to
Token ID284
Feature activation-7.98e-04

Pos logit contributions
 nodded+0.002
 blamed+0.002
 swapped+0.002
Kids+0.002
 did+0.002
Neg logit contributions
gy-0.003
 myself-0.002
 Owl-0.002
 possible-0.002
ble-0.002
Token tear
Token ID11626
Feature activation+1.53e-03

Pos logit contributions
 nodded+0.001
 blamed+0.001
 did+0.001
 thanked+0.001
 swapped+0.001
Neg logit contributions
gy-0.001
 myself-0.001
ich-0.001
 possible-0.001
ulent-0.001
Token up
Token ID510
Feature activation+7.30e-03

Pos logit contributions
gy+0.002
 myself+0.002
 Owl+0.002
 possible+0.002
ulent+0.002
Neg logit contributions
 nodded-0.003
 blamed-0.002
 swapped-0.002
Kids-0.002
Wow-0.002
Token as
Token ID355
Feature activation-4.29e-03

Pos logit contributions
gy+0.020
 myself+0.017
 Owl+0.017
 possible+0.016
ulent+0.015
Neg logit contributions
 blamed-0.020
 nodded-0.020
Kids-0.019
 swapped-0.018
 did-0.017
Token he
Token ID339
Feature activation-4.31e-03

Pos logit contributions
 nodded+0.006
 blamed+0.006
Kids+0.006
 did+0.005
 swapped+0.005
Neg logit contributions
gy-0.009
 myself-0.008
 Owl-0.007
ulent-0.007
icken-0.007
Token opened
Token ID4721
Feature activation-1.35e-03

Pos logit contributions
 blamed+0.005
Kids+0.005
 nodded+0.004
Uh+0.004
Wow+0.004
Neg logit contributions
gy-0.010
 Owl-0.009
 myself-0.009
 possible-0.009
 rescuing-0.008
Token the
Token ID262
Feature activation+7.50e-04

Pos logit contributions
 blamed+0.002
 nodded+0.002
Kids+0.002
 swapped+0.002
Uh+0.002
Neg logit contributions
gy-0.002
 myself-0.002
 Owl-0.002
 possible-0.002
ulent-0.002
Token ch
Token ID442
Feature activation+5.10e-03

Pos logit contributions
gy+0.002
 Owl+0.001
 myself+0.001
 possible+0.001
 rescuing+0.001
Neg logit contributions
 blamed-0.001
 nodded-0.001
Kids-0.001
 swapped-0.001
Uh-0.001
Tokenocol
Token ID4668
Feature activation+6.30e-02

Pos logit contributions
gy+0.008
 myself+0.007
 Owl+0.006
 possible+0.006
 rescuing+0.006
Neg logit contributions
 blamed-0.010
 nodded-0.010
Kids-0.010
 swapped-0.009
Uh-0.009
Tokenates
Token ID689
Feature activation+2.08e-02

Pos logit contributions
gy+0.104
 Owl+0.096
 myself+0.091
 possible+0.089
 rescuing+0.084
Neg logit contributions
 nodded-0.104
 blamed-0.095
outed-0.094
Kids-0.091
 swapped-0.090
Token and
Token ID290
Feature activation-3.85e-03

Pos logit contributions
gy+0.040
 Owl+0.035
 myself+0.034
ulent+0.032
 possible+0.032
Neg logit contributions
 nodded-0.033
 blamed-0.030
 swapped-0.029
Kids-0.028
 did-0.028
Token started
Token ID2067
Feature activation+7.13e-04

Pos logit contributions
 blamed+0.006
 nodded+0.005
Kids+0.005
 swapped+0.005
Uh+0.005
Neg logit contributions
gy-0.008
 myself-0.007
 Owl-0.007
 possible-0.007
 rescuing-0.006
Token eating
Token ID6600
Feature activation-2.61e-05

Pos logit contributions
gy+0.001
 myself+0.001
 Owl+0.001
 possible+0.001
ble+0.001
Neg logit contributions
 blamed-0.001
 nodded-0.001
Kids-0.001
 swapped-0.001
 did-0.001
Token them
Token ID606
Feature activation+2.54e-03

Pos logit contributions
 nodded+0.000
 blamed+0.000
Kids+0.000
 swapped+0.000
 did+0.000
Neg logit contributions
gy-0.000
 Owl-0.000
 myself-0.000
 possible-0.000
ulent-0.000
Token giving
Token ID3501
Feature activation+6.35e-03

Pos logit contributions
 blamed+0.012
 nodded+0.012
Kids+0.011
omo+0.011
 did+0.011
Neg logit contributions
gy-0.010
 possible-0.008
 Owl-0.008
ulent-0.008
owing-0.007
Token him
Token ID683
Feature activation-6.48e-03

Pos logit contributions
gy+0.014
 Owl+0.012
 possible+0.012
ulent+0.012
ble+0.011
Neg logit contributions
 nodded-0.008
 blamed-0.008
Kids-0.008
 swapped-0.007
 thanked-0.007
Token too
Token ID1165
Feature activation-3.08e-03

Pos logit contributions
 nodded+0.011
 blamed+0.010
 swapped+0.010
 did+0.010
 thanked+0.010
Neg logit contributions
gy-0.011
 Owl-0.010
ulent-0.009
 possible-0.009
bing-0.009
Token many
Token ID867
Feature activation+1.09e-03

Pos logit contributions
 nodded+0.005
 blamed+0.005
Kids+0.005
 swapped+0.004
 did+0.004
Neg logit contributions
gy-0.006
 Owl-0.005
 myself-0.005
 possible-0.004
ble-0.004
Token ch
Token ID442
Feature activation+6.35e-03

Pos logit contributions
gy+0.002
 myself+0.002
 Owl+0.002
 possible+0.002
 rescuing+0.001
Neg logit contributions
 blamed-0.002
 nodded-0.002
Kids-0.002
 swapped-0.002
Uh-0.002
Tokenocol
Token ID4668
Feature activation+6.30e-02

Pos logit contributions
gy+0.012
 Owl+0.010
 myself+0.010
 possible+0.010
 rescuing+0.009
Neg logit contributions
 blamed-0.011
 nodded-0.011
Kids-0.010
outed-0.010
 swapped-0.009
Tokenates
Token ID689
Feature activation+1.97e-02

Pos logit contributions
gy+0.108
 Owl+0.100
 possible+0.094
 myself+0.094
 Grizz+0.091
Neg logit contributions
 nodded-0.096
outed-0.087
 blamed-0.084
omo-0.082
 swapped-0.082
Token."
Token ID526
Feature activation-2.75e-03

Pos logit contributions
gy+0.037
 Owl+0.033
ulent+0.030
 rescuing+0.029
 myself+0.029
Neg logit contributions
 nodded-0.031
 blamed-0.028
 did-0.027
Kids-0.027
 swapped-0.027
Token The
Token ID383
Feature activation-1.21e-03

Pos logit contributions
 blamed+0.004
 nodded+0.004
Kids+0.004
 swapped+0.004
Uh+0.004
Neg logit contributions
gy-0.005
 Owl-0.005
 myself-0.005
 possible-0.004
 rescuing-0.004
Token king
Token ID5822
Feature activation+3.21e-03

Pos logit contributions
 nodded+0.002
 blamed+0.002
Kids+0.002
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.003
 myself-0.002
 possible-0.002
 Owl-0.002
 rescuing-0.002
Token learned
Token ID4499
Feature activation+1.33e-03

Pos logit contributions
gy+0.008
 Owl+0.007
 myself+0.007
ble+0.006
fly+0.006
Neg logit contributions
Kids-0.004
 blamed-0.003
Uh-0.003
Wow-0.003
 swapped-0.003
Token
Token ID198
Feature activation-2.46e-03

Pos logit contributions
 nodded+0.001
 blamed+0.001
 swapped+0.000
Kids+0.000
Uh+0.000
Neg logit contributions
gy-0.001
 myself-0.001
 Owl-0.001
 possible-0.000
 rescuing-0.000
Token"
Token ID1
Feature activation-5.90e-03

Pos logit contributions
 blamed+0.004
 nodded+0.004
 swapped+0.003
Kids+0.003
outed+0.003
Neg logit contributions
gy-0.005
 myself-0.004
 Owl-0.004
 possible-0.004
 rescuing-0.004
TokenThe
Token ID464
Feature activation-2.14e-03

Pos logit contributions
 blamed+0.009
 nodded+0.008
outed+0.008
 brushes+0.008
omo+0.007
Neg logit contributions
 myself-0.011
gy-0.010
 Owl-0.010
 possible-0.010
 Must-0.009
Token rh
Token ID9529
Feature activation+2.26e-02

Pos logit contributions
 blamed+0.003
 nodded+0.003
Kids+0.003
 swapped+0.003
Uh+0.003
Neg logit contributions
gy-0.004
 myself-0.004
 Owl-0.004
 possible-0.004
 rescuing-0.003
Tokenin
Token ID259
Feature activation+2.76e-03

Pos logit contributions
gy+0.038
 myself+0.032
 possible+0.031
ulent+0.030
 Owl+0.030
Neg logit contributions
 nodded-0.042
 blamed-0.041
Kids-0.038
 did-0.038
Uh-0.037
Tokenoc
Token ID420
Feature activation+6.28e-02

Pos logit contributions
gy+0.003
 myself+0.003
 possible+0.003
 Owl+0.003
ulent+0.003
Neg logit contributions
 nodded-0.006
 blamed-0.006
 did-0.005
Kids-0.005
Uh-0.005
Tokeneros
Token ID27498
Feature activation+5.24e-03

Pos logit contributions
 possible+0.090
gy+0.087
 outer+0.077
 afloat+0.075
ulent+0.075
Neg logit contributions
 nodded-0.113
 blamed-0.107
 moms-0.105
outed-0.096
omo-0.096
Token is
Token ID318
Feature activation-1.48e-03

Pos logit contributions
gy+0.011
 myself+0.010
 Owl+0.010
 possible+0.010
ulent+0.009
Neg logit contributions
 nodded-0.007
 blamed-0.007
Kids-0.006
 swapped-0.006
 did-0.006
Token in
Token ID287
Feature activation-4.42e-03

Pos logit contributions
 blamed+0.003
 nodded+0.003
Kids+0.002
 swapped+0.002
 did+0.002
Neg logit contributions
gy-0.003
 myself-0.002
 Owl-0.002
ulent-0.002
 possible-0.002
Token there
Token ID612
Feature activation-1.30e-03

Pos logit contributions
 blamed+0.008
 nodded+0.008
Kids+0.008
 swapped+0.007
Uh+0.007
Neg logit contributions
gy-0.007
 Owl-0.006
 myself-0.006
 possible-0.006
ulent-0.006
Token,
Token ID11
Feature activation+1.27e-03

Pos logit contributions
 nodded+0.002
 blamed+0.002
 did+0.002
Kids+0.002
 thanked+0.002
Neg logit contributions
gy-0.002
 Owl-0.002
 myself-0.002
ulent-0.002
 possible-0.002
Token learn
Token ID2193
Feature activation+1.96e-03

Pos logit contributions
gy+0.001
 myself+0.001
 Owl+0.001
 possible+0.001
ulent+0.001
Neg logit contributions
 blamed-0.001
 nodded-0.001
Kids-0.001
 swapped-0.001
Uh-0.001
Token more
Token ID517
Feature activation-1.11e-03

Pos logit contributions
gy+0.003
 myself+0.003
 Owl+0.003
 possible+0.003
ulent+0.003
Neg logit contributions
 blamed-0.003
 nodded-0.003
Kids-0.003
 swapped-0.003
 did-0.003
Token about
Token ID546
Feature activation-6.93e-03

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.003
 myself-0.002
 Owl-0.002
 possible-0.002
 rescuing-0.002
Token rh
Token ID9529
Feature activation+1.77e-02

Pos logit contributions
 blamed+0.010
 nodded+0.010
Kids+0.009
 did+0.009
 brushes+0.009
Neg logit contributions
gy-0.014
ulent-0.012
 myself-0.011
ble-0.011
uch-0.011
Tokenin
Token ID259
Feature activation-2.08e-03

Pos logit contributions
gy+0.037
 myself+0.031
 possible+0.031
 Owl+0.031
ulent+0.030
Neg logit contributions
 blamed-0.026
 nodded-0.025
Kids-0.024
 did-0.022
 swapped-0.022
Tokenoc
Token ID420
Feature activation+6.27e-02

Pos logit contributions
 nodded+0.004
 blamed+0.003
 did+0.003
 thanked+0.003
Kids+0.003
Neg logit contributions
gy-0.004
 myself-0.003
 possible-0.003
 Owl-0.003
ulent-0.003
Tokener
Token ID263
Feature activation-3.31e-03

Pos logit contributions
gy+0.097
 possible+0.094
 outer+0.087
 afloat+0.084
 myself+0.078
Neg logit contributions
 blamed-0.107
 nodded-0.105
 moms-0.100
 swapped-0.094
 turns-0.093
Tokenoses
Token ID4629
Feature activation+1.33e-02

Pos logit contributions
 nodded+0.005
 blamed+0.005
 did+0.004
Kids+0.004
 swapped+0.004
Neg logit contributions
gy-0.006
ulent-0.006
 myself-0.006
 possible-0.006
 Owl-0.006
Token and
Token ID290
Feature activation-3.04e-03

Pos logit contributions
gy+0.026
 myself+0.023
 Owl+0.023
 possible+0.023
ulent+0.022
Neg logit contributions
 nodded-0.020
 blamed-0.020
Kids-0.018
 did-0.017
 swapped-0.017
Token respect
Token ID2461
Feature activation+3.05e-05

Pos logit contributions
 blamed+0.005
 nodded+0.005
Kids+0.005
 swapped+0.004
Uh+0.004
Neg logit contributions
gy-0.006
 myself-0.005
 Owl-0.005
 possible-0.005
ble-0.004
Token them
Token ID606
Feature activation+5.32e-04

Pos logit contributions
gy+0.000
 myself+0.000
 Owl+0.000
 possible+0.000
ulent+0.000
Neg logit contributions
 blamed-0.000
 nodded-0.000
Kids-0.000
 swapped-0.000
 did-0.000
Token care
Token ID1337
Feature activation+1.04e-03

Pos logit contributions
gy+0.009
 Owl+0.008
 myself+0.007
 possible+0.007
ble+0.007
Neg logit contributions
 blamed-0.010
 nodded-0.010
Kids-0.010
 swapped-0.009
 turns-0.009
Token of
Token ID286
Feature activation-1.61e-03

Pos logit contributions
gy+0.003
 myself+0.003
 Owl+0.003
 possible+0.003
ulent+0.002
Neg logit contributions
 blamed-0.001
 nodded-0.001
Kids-0.001
 swapped-0.001
 did-0.001
Token the
Token ID262
Feature activation+5.67e-04

Pos logit contributions
 blamed+0.002
 nodded+0.002
 brushes+0.002
 did+0.002
Kids+0.002
Neg logit contributions
gy-0.003
ulent-0.003
 possible-0.003
owing-0.003
icken-0.003
Token rh
Token ID9529
Feature activation+1.98e-02

Pos logit contributions
gy+0.001
 myself+0.001
 possible+0.001
 Owl+0.001
ble+0.001
Neg logit contributions
 blamed-0.001
 nodded-0.001
Kids-0.001
 swapped-0.001
Uh-0.001
Tokenin
Token ID259
Feature activation+8.65e-04

Pos logit contributions
gy+0.037
 myself+0.032
 possible+0.032
ulent+0.030
 Owl+0.030
Neg logit contributions
 blamed-0.031
 nodded-0.031
Uh-0.028
Kids-0.028
 swapped-0.028
Tokenoc
Token ID420
Feature activation+6.26e-02

Pos logit contributions
 possible+0.001
 myself+0.001
gy+0.001
ble+0.001
ulent+0.001
Neg logit contributions
 blamed-0.001
 nodded-0.001
 thanked-0.001
 said-0.001
 did-0.001
Tokeneros
Token ID27498
Feature activation+1.43e-03

Pos logit contributions
 possible+0.092
gy+0.082
 outer+0.079
 myself+0.075
 afloat+0.074
Neg logit contributions
 nodded-0.109
 blamed-0.108
 moms-0.105
omo-0.099
 swapped-0.099
Token and
Token ID290
Feature activation-2.65e-03

Pos logit contributions
gy+0.003
 myself+0.002
ble+0.002
 possible+0.002
 Owl+0.002
Neg logit contributions
 blamed-0.002
 nodded-0.002
 did-0.002
 swapped-0.002
Kids-0.002
Token it
Token ID340
Feature activation-1.15e-03

Pos logit contributions
 blamed+0.004
 nodded+0.004
Kids+0.004
Uh+0.004
 swapped+0.004
Neg logit contributions
gy-0.005
 myself-0.004
 Owl-0.004
 possible-0.004
 rescuing-0.004
Token needed
Token ID2622
Feature activation+5.75e-03

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.003
 myself-0.002
 Owl-0.002
 possible-0.002
 rescuing-0.002
Token to
Token ID284
Feature activation+8.37e-04

Pos logit contributions
gy+0.010
 myself+0.009
 Owl+0.009
 possible+0.008
ble+0.008
Neg logit contributions
 blamed-0.010
 nodded-0.010
Kids-0.009
 swapped-0.009
Uh-0.008
Token he
Token ID339
Feature activation+1.50e-04

Pos logit contributions
 blamed+0.009
 nodded+0.009
Kids+0.009
 swapped+0.008
Uh+0.008
Neg logit contributions
gy-0.008
 myself-0.007
 Owl-0.007
 possible-0.007
 rescuing-0.006
Token would
Token ID561
Feature activation-9.12e-04

Pos logit contributions
gy+0.000
 myself+0.000
 Owl+0.000
 possible+0.000
 rescuing+0.000
Neg logit contributions
 blamed-0.000
Kids-0.000
 nodded-0.000
Uh-0.000
 swapped-0.000
Token take
Token ID1011
Feature activation+6.12e-03

Pos logit contributions
 blamed+0.002
 nodded+0.002
Kids+0.001
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.002
 myself-0.001
 Owl-0.001
 possible-0.001
ulent-0.001
Token the
Token ID262
Feature activation+4.35e-03

Pos logit contributions
gy+0.011
 Owl+0.009
ble+0.009
 myself+0.009
ich+0.009
Neg logit contributions
 nodded-0.010
 blamed-0.010
Kids-0.010
 turns-0.010
 swapped-0.010
Token av
Token ID1196
Feature activation+3.36e-02

Pos logit contributions
gy+0.008
 myself+0.007
 Owl+0.007
 possible+0.007
ble+0.007
Neg logit contributions
 nodded-0.007
 blamed-0.007
Kids-0.006
 swapped-0.006
Uh-0.006
Tokenoc
Token ID420
Feature activation+6.25e-02

Pos logit contributions
gy+0.047
 possible+0.042
 myself+0.042
ble+0.041
 rescuing+0.040
Neg logit contributions
 nodded-0.062
 swapped-0.058
Uh-0.056
 blamed-0.056
Wow-0.056
Tokenados
Token ID22484
Feature activation+5.95e-03

Pos logit contributions
gy+0.102
 possible+0.096
 outer+0.086
 afloat+0.086
 myself+0.085
Neg logit contributions
 nodded-0.100
 blamed-0.095
outed-0.087
 moms-0.086
 swapped-0.085
Token and
Token ID290
Feature activation-3.94e-03

Pos logit contributions
gy+0.012
 myself+0.010
 Owl+0.010
ulent+0.010
 rescuing+0.009
Neg logit contributions
 nodded-0.009
 blamed-0.008
 swapped-0.008
 did-0.008
Kids-0.007
Token scatter
Token ID41058
Feature activation+1.78e-03

Pos logit contributions
 nodded+0.006
 blamed+0.006
 swapped+0.006
Kids+0.006
 thanked+0.005
Neg logit contributions
gy-0.007
 myself-0.006
 Owl-0.006
 possible-0.006
ble-0.006
Token them
Token ID606
Feature activation+1.74e-03

Pos logit contributions
gy+0.003
 myself+0.003
 Owl+0.003
 possible+0.003
 rescuing+0.003
Neg logit contributions
 blamed-0.003
 nodded-0.003
Kids-0.003
 swapped-0.003
Uh-0.002
Token around
Token ID1088
Feature activation-2.85e-03

Pos logit contributions
gy+0.003
 Owl+0.003
 myself+0.003
 possible+0.003
ulent+0.003
Neg logit contributions
 nodded-0.003
 blamed-0.003
 swapped-0.003
 did-0.002
Kids-0.002
Token playing
Token ID2712
Feature activation-2.82e-03

Pos logit contributions
 blamed+0.000
 nodded+0.000
Kids+0.000
 swapped+0.000
Uh+0.000
Neg logit contributions
gy-0.000
 myself-0.000
 Owl-0.000
 possible-0.000
 rescuing-0.000
Token!
Token ID0
Feature activation-5.52e-03

Pos logit contributions
 blamed+0.005
 nodded+0.005
Kids+0.005
 swapped+0.004
Uh+0.004
Neg logit contributions
gy-0.005
 myself-0.004
 Owl-0.004
 possible-0.004
ulent-0.004
Token The
Token ID383
Feature activation-9.94e-04

Pos logit contributions
 nodded+0.009
 blamed+0.008
Kids+0.008
 swapped+0.008
Uh+0.008
Neg logit contributions
gy-0.011
 Owl-0.010
 myself-0.009
 possible-0.009
 rescuing-0.008
Token rh
Token ID9529
Feature activation+2.12e-02

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.002
 myself-0.002
 Owl-0.002
 possible-0.002
ble-0.002
Tokenin
Token ID259
Feature activation+1.81e-03

Pos logit contributions
gy+0.034
 myself+0.028
 possible+0.027
 Owl+0.027
ulent+0.027
Neg logit contributions
 nodded-0.041
 blamed-0.041
Kids-0.039
Uh-0.037
 swapped-0.037
Tokenoc
Token ID420
Feature activation+6.25e-02

Pos logit contributions
gy+0.002
 myself+0.002
 Owl+0.002
 possible+0.001
 rescuing+0.001
Neg logit contributions
 blamed-0.004
 nodded-0.004
Kids-0.004
 swapped-0.004
Uh-0.004
Tokeneros
Token ID27498
Feature activation+3.79e-03

Pos logit contributions
 possible+0.090
gy+0.087
 outer+0.077
ulent+0.074
 afloat+0.072
Neg logit contributions
 nodded-0.116
 blamed-0.109
 moms-0.105
 swapped-0.100
Kids-0.099
Token was
Token ID373
Feature activation+1.60e-03

Pos logit contributions
gy+0.009
 Owl+0.008
 myself+0.007
 possible+0.007
 rescuing+0.007
Neg logit contributions
 blamed-0.005
Kids-0.005
 nodded-0.004
Uh-0.004
 swapped-0.004
Token very
Token ID845
Feature activation+7.28e-03

Pos logit contributions
gy+0.003
 myself+0.002
 Owl+0.002
 possible+0.002
 rescuing+0.002
Neg logit contributions
 blamed-0.003
 nodded-0.003
Kids-0.003
 swapped-0.003
Uh-0.002
Token worried
Token ID7960
Feature activation-6.07e-03

Pos logit contributions
gy+0.014
 myself+0.012
 Owl+0.012
 possible+0.012
 rescuing+0.011
Neg logit contributions
 blamed-0.012
 nodded-0.012
Kids-0.011
 swapped-0.010
Uh-0.010
Token,
Token ID11
Feature activation+1.38e-03

Pos logit contributions
 blamed+0.010
 nodded+0.010
Kids+0.010
 swapped+0.009
Uh+0.009
Neg logit contributions
gy-0.011
 myself-0.010
 Owl-0.009
 possible-0.009
ulent-0.009
Token it
Token ID340
Feature activation-1.33e-03

Pos logit contributions
 blamed+0.003
 nodded+0.003
Kids+0.003
 swapped+0.003
 did+0.003
Neg logit contributions
gy-0.003
 Owl-0.003
 myself-0.003
 possible-0.003
ulent-0.003
Token to
Token ID284
Feature activation-4.89e-04

Pos logit contributions
 nodded+0.002
 blamed+0.002
 swapped+0.002
Kids+0.002
 thanked+0.002
Neg logit contributions
gy-0.002
 Owl-0.002
 myself-0.002
ulent-0.002
ble-0.002
Token the
Token ID262
Feature activation+6.03e-04

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 swapped+0.001
 thanked+0.001
Neg logit contributions
gy-0.001
 Owl-0.001
 myself-0.001
 possible-0.001
ulent-0.001
Token rh
Token ID9529
Feature activation+2.28e-02

Pos logit contributions
gy+0.001
 myself+0.001
 Owl+0.001
 possible+0.001
ulent+0.001
Neg logit contributions
 blamed-0.001
 nodded-0.001
Kids-0.001
 swapped-0.001
Uh-0.001
Tokenin
Token ID259
Feature activation+3.35e-03

Pos logit contributions
gy+0.038
 myself+0.033
 possible+0.032
ulent+0.031
 Owl+0.031
Neg logit contributions
 nodded-0.041
 blamed-0.040
Kids-0.037
Uh-0.036
 swapped-0.036
Tokenoc
Token ID420
Feature activation+6.23e-02

Pos logit contributions
gy+0.005
 possible+0.004
 myself+0.004
ble+0.004
ulent+0.004
Neg logit contributions
 nodded-0.007
 blamed-0.006
 thanked-0.006
 swapped-0.006
Kids-0.006
Tokeneros
Token ID27498
Feature activation+5.55e-03

Pos logit contributions
 possible+0.085
gy+0.082
ulent+0.070
 myself+0.070
 outer+0.069
Neg logit contributions
 nodded-0.117
 blamed-0.112
 moms-0.106
omo-0.104
outed-0.103
Token.
Token ID13
Feature activation-2.39e-03

Pos logit contributions
gy+0.010
 myself+0.009
 possible+0.009
 Owl+0.009
ulent+0.008
Neg logit contributions
 nodded-0.009
 blamed-0.009
Kids-0.008
 swapped-0.008
 thanked-0.008
Token When
Token ID1649
Feature activation-1.05e-02

Pos logit contributions
 blamed+0.004
 nodded+0.004
Kids+0.004
 swapped+0.003
Uh+0.003
Neg logit contributions
gy-0.004
 Owl-0.004
 myself-0.004
 possible-0.004
 rescuing-0.003
Token he
Token ID339
Feature activation-5.11e-03

Pos logit contributions
 blamed+0.018
 nodded+0.018
Kids+0.017
 swapped+0.016
Uh+0.015
Neg logit contributions
gy-0.019
 myself-0.017
 Owl-0.016
 possible-0.016
ulent-0.015
Token tasted
Token ID29187
Feature activation+4.00e-03

Pos logit contributions
 blamed+0.006
Kids+0.006
 nodded+0.006
Uh+0.005
Wow+0.005
Neg logit contributions
gy-0.012
 Owl-0.010
 myself-0.010
 possible-0.010
 rescuing-0.010
Token
Token ID198
Feature activation-1.69e-03

Pos logit contributions
gy+0.014
 myself+0.012
 possible+0.012
 Owl+0.012
 rescuing+0.012
Neg logit contributions
 nodded-0.010
 blamed-0.010
Kids-0.010
 did-0.009
 swapped-0.009
Token
Token ID198
Feature activation-2.57e-03

Pos logit contributions
 blamed+0.003
 nodded+0.003
 swapped+0.002
Kids+0.002
Uh+0.002
Neg logit contributions
gy-0.003
 myself-0.003
 Owl-0.003
 possible-0.003
ble-0.003
TokenSam
Token ID16305
Feature activation-6.74e-03

Pos logit contributions
 blamed+0.004
 nodded+0.004
 swapped+0.004
outed+0.003
 brushes+0.003
Neg logit contributions
gy-0.005
 myself-0.005
 Owl-0.004
 possible-0.004
ulent-0.004
Token gasped
Token ID45236
Feature activation+6.96e-03

Pos logit contributions
 blamed+0.009
Kids+0.009
Uh+0.008
 nodded+0.008
 swapped+0.008
Neg logit contributions
gy-0.015
 Owl-0.013
 myself-0.013
 possible-0.012
 Must-0.011
Token in
Token ID287
Feature activation-4.44e-03

Pos logit contributions
gy+0.014
 possible+0.013
 Owl+0.012
ulent+0.012
 myself+0.012
Neg logit contributions
 nodded-0.010
 blamed-0.009
Kids-0.009
 did-0.008
 swapped-0.008
Token amaz
Token ID40642
Feature activation-2.86e-02

Pos logit contributions
outed+0.008
Kids+0.008
 blamed+0.008
 nodded+0.008
 swapped+0.007
Neg logit contributions
 myself-0.007
gy-0.007
 rescuing-0.006
 Owl-0.006
 himself-0.006
Tokenement
Token ID972
Feature activation-6.83e-03

Pos logit contributions
 blamed+0.044
Kids+0.043
Uh+0.040
 swapped+0.039
 nodded+0.039
Neg logit contributions
 myself-0.053
gy-0.051
ble-0.047
 possible-0.047
 Owl-0.044
Token.
Token ID13
Feature activation-1.11e-03

Pos logit contributions
 nodded+0.012
 blamed+0.011
Kids+0.010
 swapped+0.010
 did+0.010
Neg logit contributions
gy-0.012
 Owl-0.011
 myself-0.011
 possible-0.010
ulent-0.010
Token He
Token ID679
Feature activation+1.50e-03

Pos logit contributions
 blamed+0.002
 nodded+0.002
Kids+0.002
 swapped+0.002
Uh+0.002
Neg logit contributions
gy-0.002
 Owl-0.002
 myself-0.002
 possible-0.002
 rescuing-0.002
Token had
Token ID550
Feature activation+6.05e-04

Pos logit contributions
gy+0.003
 myself+0.003
 Owl+0.003
 possible+0.002
ble+0.002
Neg logit contributions
Kids-0.002
Uh-0.002
 blamed-0.002
Wow-0.002
omo-0.002
Token never
Token ID1239
Feature activation-8.44e-04

Pos logit contributions
gy+0.001
 myself+0.001
 possible+0.001
 Owl+0.001
 rescuing+0.001
Neg logit contributions
Kids-0.001
 blamed-0.001
 nodded-0.001
Uh-0.001
Wow-0.001
Token,
Token ID11
Feature activation-5.39e-04

Pos logit contributions
 nodded+0.001
 blamed+0.001
Kids+0.001
 swapped+0.001
 did+0.001
Neg logit contributions
gy-0.001
 myself-0.001
 Owl-0.001
 possible-0.001
 rescuing-0.001
Token her
Token ID607
Feature activation-2.29e-04

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.001
 myself-0.001
 Owl-0.001
 possible-0.001
 rescuing-0.001
Token eyes
Token ID2951
Feature activation-8.52e-03

Pos logit contributions
 blamed+0.000
Kids+0.000
 nodded+0.000
Uh+0.000
 swapped+0.000
Neg logit contributions
gy-0.000
 Owl-0.000
 myself-0.000
 possible-0.000
ble-0.000
Token wide
Token ID3094
Feature activation+4.28e-03

Pos logit contributions
 nodded+0.011
 blamed+0.011
Kids+0.009
 swapped+0.009
Uh+0.008
Neg logit contributions
gy-0.019
 myself-0.018
 Owl-0.018
 possible-0.017
 rescuing-0.016
Token with
Token ID351
Feature activation-4.09e-03

Pos logit contributions
gy+0.008
 myself+0.007
 Owl+0.007
 possible+0.007
 rescuing+0.006
Neg logit contributions
 nodded-0.007
 blamed-0.007
 swapped-0.006
Kids-0.006
 did-0.006
Token amaz
Token ID40642
Feature activation-2.78e-02

Pos logit contributions
 blamed+0.006
Kids+0.006
 nodded+0.006
 swapped+0.006
Uh+0.005
Neg logit contributions
gy-0.008
 myself-0.007
 Owl-0.007
 possible-0.007
 rescuing-0.006
Tokenement
Token ID972
Feature activation-8.62e-03

Pos logit contributions
 blamed+0.039
Kids+0.037
 nodded+0.035
 swapped+0.033
Uh+0.033
Neg logit contributions
gy-0.058
 myself-0.054
 possible-0.052
ble-0.051
 Owl-0.051
Token.
Token ID13
Feature activation-1.10e-03

Pos logit contributions
 nodded+0.015
 blamed+0.014
Kids+0.013
 swapped+0.013
 did+0.012
Neg logit contributions
gy-0.015
 Owl-0.014
 myself-0.014
 possible-0.013
ulent-0.013
Token
Token ID198
Feature activation-1.68e-03

Pos logit contributions
 blamed+0.002
 nodded+0.002
Kids+0.002
 swapped+0.002
Uh+0.002
Neg logit contributions
gy-0.002
 Owl-0.002
 myself-0.002
 possible-0.002
 rescuing-0.002
Token
Token ID198
Feature activation-2.13e-03

Pos logit contributions
 blamed+0.003
 nodded+0.003
Kids+0.002
 swapped+0.002
Uh+0.002
Neg logit contributions
gy-0.003
 myself-0.003
 Owl-0.003
 possible-0.003
 rescuing-0.003
TokenHer
Token ID9360
Feature activation-5.04e-03

Pos logit contributions
 blamed+0.004
 nodded+0.003
Kids+0.003
 swapped+0.003
outed+0.003
Neg logit contributions
gy-0.004
 myself-0.004
 Owl-0.003
 possible-0.003
 rescuing-0.003
TokenTim
Token ID14967
Feature activation-8.23e-04

Pos logit contributions
 blamed+0.003
 nodded+0.003
outed+0.003
 brushes+0.003
 swapped+0.003
Neg logit contributions
 myself-0.003
gy-0.003
 possible-0.003
 Owl-0.003
 rescuing-0.003
Tokenmy
Token ID1820
Feature activation+5.97e-03

Pos logit contributions
 blamed+0.002
Kids+0.002
 nodded+0.002
Uh+0.001
 swapped+0.001
Neg logit contributions
gy-0.001
 myself-0.001
 Owl-0.001
 possible-0.001
 rescuing-0.001
Token just
Token ID655
Feature activation+3.07e-05

Pos logit contributions
gy+0.013
 myself+0.011
 Owl+0.011
 possible+0.010
ble+0.010
Neg logit contributions
Kids-0.008
 blamed-0.008
Uh-0.008
Wow-0.007
 nodded-0.007
Token stared
Token ID18484
Feature activation+1.65e-03

Pos logit contributions
gy+0.000
 myself+0.000
 Owl+0.000
 possible+0.000
 rescuing+0.000
Neg logit contributions
Kids-0.000
 blamed-0.000
 nodded-0.000
Uh-0.000
 swapped-0.000
Token in
Token ID287
Feature activation-4.13e-03

Pos logit contributions
gy+0.004
 Owl+0.003
 myself+0.003
 possible+0.003
ulent+0.003
Neg logit contributions
 nodded-0.002
 blamed-0.002
Kids-0.002
 swapped-0.002
outed-0.002
Token amaz
Token ID40642
Feature activation-2.77e-02

Pos logit contributions
 blamed+0.007
Kids+0.007
 nodded+0.007
Uh+0.007
outed+0.007
Neg logit contributions
 myself-0.007
gy-0.007
 possible-0.006
 Owl-0.006
 rescuing-0.006
Tokenement
Token ID972
Feature activation-9.25e-03

Pos logit contributions
 blamed+0.043
Kids+0.042
Uh+0.039
 nodded+0.038
 swapped+0.038
Neg logit contributions
 myself-0.050
gy-0.048
ble-0.044
 possible-0.044
bing-0.043
Token at
Token ID379
Feature activation-2.75e-03

Pos logit contributions
 nodded+0.016
 blamed+0.015
Kids+0.014
 swapped+0.013
 did+0.013
Neg logit contributions
gy-0.017
 Owl-0.015
 myself-0.015
 possible-0.014
 rescuing-0.014
Token the
Token ID262
Feature activation+1.66e-03

Pos logit contributions
 blamed+0.004
 nodded+0.004
Kids+0.004
 swapped+0.004
 did+0.004
Neg logit contributions
gy-0.005
ulent-0.005
 Owl-0.004
 myself-0.004
 possible-0.004
Token t
Token ID256
Feature activation+2.09e-03

Pos logit contributions
gy+0.003
 myself+0.003
 Owl+0.003
 possible+0.003
 rescuing+0.002
Neg logit contributions
 blamed-0.003
 nodded-0.003
Kids-0.003
 swapped-0.002
outed-0.002
Tokeneddy
Token ID21874
Feature activation+2.97e-03

Pos logit contributions
gy+0.004
 possible+0.003
 Owl+0.003
 myself+0.003
 rescuing+0.003
Neg logit contributions
Kids-0.003
 nodded-0.003
 blamed-0.003
omo-0.003
outed-0.003
Tokenly
Token ID306
Feature activation+3.93e-03

Pos logit contributions
 nodded+0.010
 blamed+0.010
Kids+0.009
 swapped+0.008
 did+0.008
Neg logit contributions
gy-0.014
 Owl-0.012
 myself-0.012
 possible-0.012
 rescuing-0.011
Token.
Token ID13
Feature activation-2.82e-03

Pos logit contributions
gy+0.006
 myself+0.006
 Owl+0.005
 possible+0.005
 rescuing+0.005
Neg logit contributions
 blamed-0.007
Kids-0.007
 nodded-0.007
Uh-0.007
 swapped-0.007
Token Jill
Token ID22772
Feature activation-8.69e-04

Pos logit contributions
 blamed+0.005
 nodded+0.004
Kids+0.004
Uh+0.004
 swapped+0.004
Neg logit contributions
gy-0.005
 myself-0.005
 Owl-0.004
 possible-0.004
 himself-0.004
Token nodded
Token ID14464
Feature activation-1.43e-02

Pos logit contributions
Kids+0.001
Uh+0.001
 blamed+0.001
Wow+0.001
omo+0.001
Neg logit contributions
gy-0.002
 myself-0.002
 Owl-0.002
ble-0.001
fly-0.001
Token in
Token ID287
Feature activation-5.00e-03

Pos logit contributions
 blamed+0.023
 nodded+0.023
Kids+0.022
 swapped+0.020
Uh+0.020
Neg logit contributions
gy-0.027
 myself-0.024
 Owl-0.024
 possible-0.023
 rescuing-0.022
Token amaz
Token ID40642
Feature activation-2.76e-02

Pos logit contributions
 blamed+0.007
Kids+0.007
 nodded+0.006
outed+0.006
 swapped+0.006
Neg logit contributions
 myself-0.010
gy-0.009
 possible-0.009
 rescuing-0.009
 Owl-0.009
Tokenement
Token ID972
Feature activation-7.09e-03

Pos logit contributions
 blamed+0.042
Kids+0.041
Uh+0.038
 nodded+0.037
 swapped+0.037
Neg logit contributions
 myself-0.051
gy-0.050
 possible-0.046
ble-0.046
 Owl-0.044
Token.
Token ID13
Feature activation-6.34e-04

Pos logit contributions
 nodded+0.012
 blamed+0.011
Kids+0.010
 swapped+0.010
 did+0.010
Neg logit contributions
gy-0.013
 Owl-0.012
 myself-0.012
 possible-0.011
ulent-0.011
Token
Token ID220
Feature activation-8.52e-03

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
Uh+0.001
 swapped+0.001
Neg logit contributions
gy-0.001
 myself-0.001
 Owl-0.001
 possible-0.001
 Must-0.001
Token
Token ID198
Feature activation-2.47e-03

Pos logit contributions
 blamed+0.010
 nodded+0.009
Kids+0.009
 swapped+0.008
outed+0.008
Neg logit contributions
gy-0.020
 myself-0.018
 Owl-0.018
 possible-0.018
ulent-0.017
Token
Token ID198
Feature activation-3.01e-03

Pos logit contributions
 blamed+0.004
 nodded+0.004
Kids+0.003
 swapped+0.003
Uh+0.003
Neg logit contributions
gy-0.005
 myself-0.005
 Owl-0.004
 possible-0.004
 rescuing-0.004
Token.
Token ID13
Feature activation-1.42e-03

Pos logit contributions
 nodded+0.010
 blamed+0.010
Kids+0.009
 swapped+0.009
 did+0.009
Neg logit contributions
gy-0.011
 myself-0.010
 Owl-0.010
 possible-0.009
ulent-0.009
Token She
Token ID1375
Feature activation-2.30e-03

Pos logit contributions
 blamed+0.002
 nodded+0.002
Kids+0.002
Uh+0.002
 swapped+0.002
Neg logit contributions
gy-0.003
 Owl-0.002
 myself-0.002
 possible-0.002
 rescuing-0.002
Token looked
Token ID3114
Feature activation-1.94e-03

Pos logit contributions
Kids+0.003
 blamed+0.003
Wow+0.003
Uh+0.003
 nodded+0.003
Neg logit contributions
gy-0.005
 Owl-0.004
 myself-0.004
 rescuing-0.004
 possible-0.004
Token up
Token ID510
Feature activation+5.40e-03

Pos logit contributions
Kids+0.004
 blamed+0.004
 nodded+0.004
Uh+0.004
omo+0.004
Neg logit contributions
gy-0.003
 myself-0.002
 Owl-0.002
 possible-0.002
 rescuing-0.002
Token in
Token ID287
Feature activation-4.72e-03

Pos logit contributions
gy+0.011
 myself+0.010
 Owl+0.010
 possible+0.010
 rescuing+0.009
Neg logit contributions
 nodded-0.008
 blamed-0.007
Kids-0.007
 swapped-0.007
Uh-0.006
Token amaz
Token ID40642
Feature activation-2.74e-02

Pos logit contributions
Kids+0.012
 blamed+0.012
outed+0.011
 nodded+0.011
Uh+0.011
Neg logit contributions
 myself-0.004
gy-0.004
 possible-0.004
 outer-0.003
 rescuing-0.003
Tokenement
Token ID972
Feature activation-7.50e-03

Pos logit contributions
 blamed+0.043
Kids+0.042
Uh+0.038
 nodded+0.038
 swapped+0.038
Neg logit contributions
 myself-0.049
gy-0.048
 possible-0.044
ble-0.044
 Owl-0.042
Token at
Token ID379
Feature activation-3.52e-03

Pos logit contributions
 nodded+0.011
 blamed+0.010
 swapped+0.009
 did+0.009
Kids+0.009
Neg logit contributions
gy-0.015
 Owl-0.014
 myself-0.013
 possible-0.013
ulent-0.013
Token all
Token ID477
Feature activation+1.90e-03

Pos logit contributions
 blamed+0.005
 nodded+0.005
Kids+0.005
 brushes+0.005
 did+0.004
Neg logit contributions
gy-0.007
ulent-0.006
 myself-0.006
 Owl-0.006
 possible-0.006
Token of
Token ID286
Feature activation-2.37e-03

Pos logit contributions
gy+0.002
 myself+0.002
 possible+0.002
 Owl+0.002
 rescuing+0.002
Neg logit contributions
 blamed-0.004
Kids-0.004
 nodded-0.004
 swapped-0.004
Uh-0.004
Token the
Token ID262
Feature activation+6.67e-04

Pos logit contributions
 blamed+0.004
 nodded+0.004
Kids+0.004
 swapped+0.004
 did+0.003
Neg logit contributions
gy-0.004
 myself-0.004
 Owl-0.004
 possible-0.003
ulent-0.003
Token
Token ID198
Feature activation-1.26e-03

Pos logit contributions
gy+0.000
 myself+0.000
 Owl+0.000
 possible+0.000
 rescuing+0.000
Neg logit contributions
 blamed-0.000
 nodded-0.000
Kids-0.000
 swapped-0.000
Uh-0.000
TokenL
Token ID43
Feature activation+4.63e-03

Pos logit contributions
 blamed+0.002
 nodded+0.002
Kids+0.002
 swapped+0.002
outed+0.002
Neg logit contributions
gy-0.002
 myself-0.002
 Owl-0.002
 possible-0.002
 rescuing-0.002
Tokenucky
Token ID5309
Feature activation-6.05e-03

Pos logit contributions
gy+0.011
 Owl+0.010
 myself+0.010
 possible+0.010
 rescuing+0.009
Neg logit contributions
 blamed-0.005
 nodded-0.005
Kids-0.005
 swapped-0.004
Uh-0.004
Token watched
Token ID7342
Feature activation-4.84e-03

Pos logit contributions
 blamed+0.007
Kids+0.007
 nodded+0.007
Uh+0.007
 swapped+0.006
Neg logit contributions
gy-0.014
 Owl-0.012
 myself-0.012
 possible-0.011
 rescuing-0.011
Token in
Token ID287
Feature activation-5.51e-03

Pos logit contributions
 nodded+0.007
 blamed+0.007
 swapped+0.006
Kids+0.006
 thanked+0.006
Neg logit contributions
gy-0.010
 Owl-0.009
 possible-0.008
 myself-0.008
ich-0.008
Token amaz
Token ID40642
Feature activation-2.73e-02

Pos logit contributions
 blamed+0.009
Kids+0.009
 nodded+0.009
 swapped+0.008
Uh+0.008
Neg logit contributions
gy-0.010
 myself-0.009
 Owl-0.009
 possible-0.009
 rescuing-0.009
Tokenement
Token ID972
Feature activation-6.02e-03

Pos logit contributions
 blamed+0.035
Kids+0.034
 nodded+0.033
 swapped+0.030
Uh+0.030
Neg logit contributions
gy-0.058
 myself-0.056
 possible-0.053
 Owl-0.052
ble-0.051
Token as
Token ID355
Feature activation-6.31e-03

Pos logit contributions
 nodded+0.009
 blamed+0.008
 thanked+0.007
 swapped+0.007
 did+0.006
Neg logit contributions
gy-0.012
 Owl-0.011
 myself-0.011
 possible-0.011
ulent-0.011
Token the
Token ID262
Feature activation+3.54e-04

Pos logit contributions
 nodded+0.010
 blamed+0.010
Kids+0.010
 did+0.009
 swapped+0.009
Neg logit contributions
gy-0.012
 myself-0.010
 Owl-0.010
ulent-0.010
 possible-0.009
Token frog
Token ID21264
Feature activation-1.10e-03

Pos logit contributions
gy+0.001
 myself+0.001
 Owl+0.000
 possible+0.000
ble+0.000
Neg logit contributions
 blamed-0.001
Kids-0.001
 nodded-0.001
 swapped-0.001
Uh-0.001
Token zoom
Token ID19792
Feature activation+2.68e-03

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.002
 myself-0.002
 Owl-0.002
 possible-0.002
 rescuing-0.002
Token again
Token ID757
Feature activation-1.70e-03

Pos logit contributions
gy+0.003
 myself+0.003
 Owl+0.003
 possible+0.002
 rescuing+0.002
Neg logit contributions
 blamed-0.003
 nodded-0.003
Kids-0.002
 swapped-0.002
Uh-0.002
Token.
Token ID13
Feature activation-4.47e-04

Pos logit contributions
 nodded+0.002
 blamed+0.002
 did+0.002
 said+0.002
 thanked+0.002
Neg logit contributions
gy-0.004
 myself-0.003
 Owl-0.003
 possible-0.003
ulent-0.003
Token
Token ID198
Feature activation+1.84e-04

Pos logit contributions
 blamed+0.001
 nodded+0.001
 swapped+0.001
Uh+0.001
Kids+0.001
Neg logit contributions
gy-0.001
 Owl-0.001
 myself-0.001
 possible-0.001
 Must-0.001
Token
Token ID198
Feature activation-2.39e-04

Pos logit contributions
gy+0.000
 myself+0.000
 Owl+0.000
 possible+0.000
ulent+0.000
Neg logit contributions
 blamed-0.000
 nodded-0.000
Kids-0.000
 swapped-0.000
Uh-0.000
TokenThe
Token ID464
Feature activation+4.31e-04

Pos logit contributions
 blamed+0.000
 nodded+0.000
Kids+0.000
 swapped+0.000
Uh+0.000
Neg logit contributions
gy-0.001
 myself-0.000
 Owl-0.000
 possible-0.000
ble-0.000
Token End
Token ID5268
Feature activation-2.73e-02

Pos logit contributions
gy+0.001
 myself+0.001
 Owl+0.001
 possible+0.001
 rescuing+0.001
Neg logit contributions
 blamed-0.001
 nodded-0.001
Kids-0.001
 swapped-0.001
Uh-0.001
Token.
Token ID13
Feature activation-1.05e-03

Pos logit contributions
 blamed+0.060
 swapped+0.059
 nodded+0.057
Uh+0.055
Kids+0.054
Neg logit contributions
gy-0.032
 possible-0.028
 myself-0.027
 Owl-0.026
ble-0.024
Token<|endoftext|>
Token ID50256
Feature activation-1.41e-03

Pos logit contributions
 blamed+0.002
 nodded+0.002
Kids+0.002
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.002
 Owl-0.002
 myself-0.002
 possible-0.002
 rescuing-0.002
TokenTim
Token ID14967
Feature activation-1.55e-03

Pos logit contributions
 blamed+0.002
 nodded+0.002
Kids+0.002
 swapped+0.002
Uh+0.002
Neg logit contributions
gy-0.003
 myself-0.002
 Owl-0.002
 possible-0.002
 rescuing-0.002
Token and
Token ID290
Feature activation-5.54e-04

Pos logit contributions
 blamed+0.002
 nodded+0.002
Kids+0.002
 swapped+0.002
Uh+0.002
Neg logit contributions
gy-0.003
 myself-0.003
 Owl-0.003
 possible-0.002
 rescuing-0.002
Token Kim
Token ID6502
Feature activation+4.38e-04

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.001
 myself-0.001
 Owl-0.001
 possible-0.001
 rescuing-0.001
Token little
Token ID1310
Feature activation-3.04e-03

Pos logit contributions
 blamed+0.003
 nodded+0.003
Kids+0.003
 swapped+0.003
Uh+0.003
Neg logit contributions
gy-0.004
 myself-0.004
 Owl-0.003
 possible-0.003
 rescuing-0.003
Token girl
Token ID2576
Feature activation-7.86e-03

Pos logit contributions
 blamed+0.003
 nodded+0.003
Kids+0.003
 swapped+0.003
Uh+0.003
Neg logit contributions
gy-0.007
 Owl-0.007
 myself-0.007
 possible-0.007
 rescuing-0.006
Token looked
Token ID3114
Feature activation-2.03e-03

Pos logit contributions
 blamed+0.010
Kids+0.010
Uh+0.009
Wow+0.009
 nodded+0.009
Neg logit contributions
gy-0.017
 Owl-0.015
 myself-0.015
 possible-0.014
 rescuing-0.013
Token up
Token ID510
Feature activation+4.92e-03

Pos logit contributions
 blamed+0.004
Kids+0.004
 nodded+0.004
 swapped+0.004
Uh+0.004
Neg logit contributions
gy-0.003
 myself-0.002
 Owl-0.002
 possible-0.002
 rescuing-0.002
Token in
Token ID287
Feature activation-5.85e-03

Pos logit contributions
gy+0.010
 myself+0.009
 Owl+0.009
 possible+0.009
 rescuing+0.008
Neg logit contributions
 blamed-0.007
 nodded-0.007
Kids-0.007
 swapped-0.006
Uh-0.006
Token amaz
Token ID40642
Feature activation-2.73e-02

Pos logit contributions
 blamed+0.013
Kids+0.013
outed+0.012
 nodded+0.012
omo+0.012
Neg logit contributions
gy-0.006
 myself-0.006
 possible-0.006
 Owl-0.006
 outer-0.006
Tokenement
Token ID972
Feature activation-5.50e-03

Pos logit contributions
 blamed+0.040
Kids+0.039
 nodded+0.035
 swapped+0.035
Uh+0.034
Neg logit contributions
gy-0.052
 myself-0.052
 possible-0.049
ble-0.047
 Owl-0.047
Token and
Token ID290
Feature activation-7.09e-03

Pos logit contributions
 nodded+0.009
 blamed+0.008
outed+0.007
 did+0.007
Kids+0.007
Neg logit contributions
gy-0.011
 Owl-0.010
 myself-0.010
 possible-0.009
ulent-0.009
Token said
Token ID531
Feature activation-5.70e-03

Pos logit contributions
 nodded+0.009
 blamed+0.009
Kids+0.009
 swapped+0.008
Uh+0.008
Neg logit contributions
gy-0.016
 myself-0.014
 Owl-0.014
 possible-0.013
ulent-0.013
Token,
Token ID11
Feature activation-2.30e-03

Pos logit contributions
 nodded+0.010
 blamed+0.009
 did+0.008
Kids+0.008
 swapped+0.008
Neg logit contributions
gy-0.011
 myself-0.010
 Owl-0.009
 possible-0.009
ulent-0.009
Token "
Token ID366
Feature activation-2.73e-03

Pos logit contributions
 blamed+0.003
 nodded+0.003
Kids+0.003
 swapped+0.002
Uh+0.002
Neg logit contributions
gy-0.005
 myself-0.005
 Owl-0.005
 possible-0.004
 rescuing-0.004
Token
Token ID198
Feature activation-3.31e-03

Pos logit contributions
 blamed+0.004
 nodded+0.004
Kids+0.004
 swapped+0.004
Uh+0.004
Neg logit contributions
gy-0.005
 myself-0.005
 Owl-0.005
 possible-0.004
 rescuing-0.004
TokenLittle
Token ID22253
Feature activation+6.66e-04

Pos logit contributions
 blamed+0.005
 nodded+0.005
 swapped+0.005
Kids+0.005
outed+0.005
Neg logit contributions
gy-0.006
 myself-0.006
 Owl-0.005
 possible-0.005
ulent-0.005
Token Jack
Token ID3619
Feature activation-2.19e-03

Pos logit contributions
gy+0.001
 Owl+0.001
 myself+0.001
 possible+0.001
 rescuing+0.001
Neg logit contributions
 blamed-0.001
 nodded-0.001
Kids-0.001
 swapped-0.001
Uh-0.001
Token gasped
Token ID45236
Feature activation+6.46e-03

Pos logit contributions
 blamed+0.003
Kids+0.003
 nodded+0.002
Uh+0.002
 swapped+0.002
Neg logit contributions
gy-0.005
 Owl-0.004
 myself-0.004
 possible-0.004
ulent-0.004
Token in
Token ID287
Feature activation-5.14e-03

Pos logit contributions
gy+0.013
 possible+0.012
 myself+0.012
 Owl+0.012
ulent+0.011
Neg logit contributions
 nodded-0.009
 blamed-0.009
 did-0.008
 swapped-0.007
Kids-0.007
Token amaz
Token ID40642
Feature activation-2.72e-02

Pos logit contributions
Kids+0.008
 blamed+0.007
 nodded+0.007
outed+0.007
 swapped+0.006
Neg logit contributions
gy-0.010
 myself-0.010
 Owl-0.009
 possible-0.009
 rescuing-0.009
Tokenement
Token ID972
Feature activation-6.65e-03

Pos logit contributions
 blamed+0.043
Kids+0.043
Uh+0.039
 swapped+0.038
 nodded+0.037
Neg logit contributions
gy-0.048
 myself-0.048
 possible-0.044
ble-0.044
 Owl-0.041
Token.
Token ID13
Feature activation-1.55e-03

Pos logit contributions
 nodded+0.011
 blamed+0.011
Kids+0.010
 swapped+0.010
 did+0.009
Neg logit contributions
gy-0.012
 myself-0.011
 Owl-0.011
 possible-0.010
ulent-0.010
Token Dad
Token ID17415
Feature activation-4.12e-03

Pos logit contributions
 blamed+0.002
 nodded+0.002
Kids+0.002
 swapped+0.002
Uh+0.002
Neg logit contributions
gy-0.003
 Owl-0.003
 myself-0.003
 possible-0.003
 rescuing-0.002
Token pointed
Token ID6235
Feature activation+4.28e-03

Pos logit contributions
 blamed+0.004
Kids+0.004
Uh+0.004
Wow+0.004
 nodded+0.004
Neg logit contributions
gy-0.010
 Owl-0.009
 myself-0.008
 possible-0.008
 rescuing-0.008
Token to
Token ID284
Feature activation-6.56e-04

Pos logit contributions
gy+0.006
 myself+0.005
 Owl+0.005
 possible+0.005
 rescuing+0.005
Neg logit contributions
 blamed-0.009
Kids-0.009
Uh-0.008
 nodded-0.008
omo-0.008
Token.
Token ID13
Feature activation+1.96e-03

Pos logit contributions
gy+0.010
 Owl+0.009
 myself+0.009
 possible+0.009
ulent+0.008
Neg logit contributions
 nodded-0.010
 blamed-0.010
Kids-0.009
 swapped-0.009
 did-0.008
Token
Token ID198
Feature activation-1.38e-04

Pos logit contributions
gy+0.004
 myself+0.003
 Owl+0.003
 possible+0.003
 rescuing+0.003
Neg logit contributions
 nodded-0.003
 blamed-0.003
Kids-0.003
 swapped-0.003
Uh-0.003
Token
Token ID198
Feature activation-1.66e-03

Pos logit contributions
 nodded+0.000
 blamed+0.000
 swapped+0.000
Kids+0.000
Uh+0.000
Neg logit contributions
gy-0.000
 myself-0.000
 Owl-0.000
 possible-0.000
 rescuing-0.000
TokenTo
Token ID2514
Feature activation-9.58e-03

Pos logit contributions
 nodded+0.003
 blamed+0.003
 swapped+0.002
Kids+0.002
outed+0.002
Neg logit contributions
gy-0.003
 myself-0.003
 Owl-0.003
 possible-0.003
 rescuing-0.002
Token her
Token ID607
Feature activation-2.80e-04

Pos logit contributions
 blamed+0.017
 nodded+0.017
Kids+0.016
 swapped+0.015
Uh+0.014
Neg logit contributions
gy-0.017
 myself-0.015
 Owl-0.015
 possible-0.014
ulent-0.013
Token amaz
Token ID40642
Feature activation-2.71e-02

Pos logit contributions
 blamed+0.000
Kids+0.000
 nodded+0.000
Wow+0.000
Uh+0.000
Neg logit contributions
gy-0.001
 Owl-0.001
 possible-0.001
 myself-0.000
 rescuing-0.000
Tokenement
Token ID972
Feature activation-6.26e-03

Pos logit contributions
 blamed+0.039
Kids+0.038
 nodded+0.034
 swapped+0.034
Uh+0.033
Neg logit contributions
gy-0.055
 myself-0.052
 possible-0.050
 Owl-0.049
ble-0.048
Token,
Token ID11
Feature activation-2.36e-03

Pos logit contributions
 nodded+0.011
 blamed+0.011
Kids+0.010
 swapped+0.010
 did+0.010
Neg logit contributions
gy-0.010
 myself-0.009
 Owl-0.009
 possible-0.009
ulent-0.008
Token the
Token ID262
Feature activation-3.59e-04

Pos logit contributions
 blamed+0.004
 nodded+0.004
Kids+0.004
 swapped+0.004
Uh+0.004
Neg logit contributions
gy-0.004
 myself-0.004
 Owl-0.003
 possible-0.003
 rescuing-0.003
Token clothes
Token ID8242
Feature activation-6.52e-03

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 swapped+0.001
Uh+0.000
Neg logit contributions
gy-0.001
 Owl-0.001
 myself-0.001
 possible-0.001
ble-0.001
Token came
Token ID1625
Feature activation-2.17e-03

Pos logit contributions
 blamed+0.008
Kids+0.008
 nodded+0.007
Uh+0.007
Wow+0.007
Neg logit contributions
gy-0.015
 myself-0.013
 possible-0.013
 Owl-0.013
ble-0.013
Token go
Token ID467
Feature activation+2.05e-03

Pos logit contributions
 blamed+0.007
 nodded+0.007
Kids+0.007
 swapped+0.006
Uh+0.006
Neg logit contributions
gy-0.007
 myself-0.006
 Owl-0.006
 possible-0.006
ulent-0.005
Token!
Token ID0
Feature activation-2.19e-03

Pos logit contributions
gy+0.004
ulent+0.004
 Owl+0.003
icken+0.003
 myself+0.003
Neg logit contributions
 nodded-0.003
 blamed-0.003
Kids-0.003
 said-0.003
 swapped-0.003
Token Everyone
Token ID11075
Feature activation+3.00e-03

Pos logit contributions
 blamed+0.003
 nodded+0.003
Kids+0.003
 swapped+0.003
Uh+0.003
Neg logit contributions
gy-0.004
 Owl-0.004
 myself-0.004
 possible-0.003
 rescuing-0.003
Token watched
Token ID7342
Feature activation-4.79e-04

Pos logit contributions
gy+0.006
 myself+0.006
 Owl+0.005
 possible+0.005
 rescuing+0.005
Neg logit contributions
 blamed-0.004
Kids-0.004
Uh-0.004
omo-0.004
 nodded-0.004
Token in
Token ID287
Feature activation-4.30e-03

Pos logit contributions
 nodded+0.001
 blamed+0.001
Kids+0.001
 swapped+0.001
 did+0.001
Neg logit contributions
gy-0.001
 Owl-0.001
 myself-0.001
 possible-0.001
ulent-0.001
Token amaz
Token ID40642
Feature activation-2.70e-02

Pos logit contributions
 blamed+0.008
Kids+0.008
 nodded+0.008
 swapped+0.007
Uh+0.007
Neg logit contributions
gy-0.007
 myself-0.006
 Owl-0.006
 possible-0.006
 rescuing-0.006
Tokenement
Token ID972
Feature activation-3.10e-03

Pos logit contributions
 blamed+0.037
Kids+0.036
 nodded+0.034
Uh+0.032
 swapped+0.032
Neg logit contributions
gy-0.055
 myself-0.054
 possible-0.051
 Owl-0.048
ble-0.048
Token as
Token ID355
Feature activation-4.33e-03

Pos logit contributions
 nodded+0.005
 blamed+0.005
Kids+0.004
 did+0.004
 swapped+0.004
Neg logit contributions
gy-0.006
 Owl-0.005
 myself-0.005
ulent-0.005
 possible-0.005
Token it
Token ID340
Feature activation+1.78e-03

Pos logit contributions
 blamed+0.007
 nodded+0.007
Kids+0.007
 swapped+0.007
Uh+0.006
Neg logit contributions
gy-0.008
 myself-0.007
 Owl-0.007
 possible-0.007
 rescuing-0.006
Token took
Token ID1718
Feature activation+5.01e-04

Pos logit contributions
gy+0.004
 myself+0.004
 Owl+0.004
 possible+0.004
ble+0.003
Neg logit contributions
 blamed-0.002
Kids-0.002
 nodded-0.002
 swapped-0.002
Uh-0.002
Token off
Token ID572
Feature activation-1.48e-02

Pos logit contributions
gy+0.001
 Owl+0.001
 myself+0.001
 possible+0.001
ble+0.001
Neg logit contributions
 nodded-0.001
 blamed-0.001
Kids-0.001
 swapped-0.001
 did-0.001
Token friends
Token ID2460
Feature activation-6.45e-03

Pos logit contributions
Kids+0.007
 blamed+0.007
 nodded+0.006
Uh+0.006
Wow+0.006
Neg logit contributions
gy-0.009
 Owl-0.008
 myself-0.008
 possible-0.007
ble-0.007
Token stopped
Token ID5025
Feature activation+4.12e-05

Pos logit contributions
Kids+0.007
 blamed+0.007
Uh+0.007
 nodded+0.006
Wow+0.006
Neg logit contributions
gy-0.015
 Owl-0.013
 myself-0.013
 possible-0.012
 rescuing-0.012
Token and
Token ID290
Feature activation-4.20e-03

Pos logit contributions
gy+0.000
 myself+0.000
 possible+0.000
ulent+0.000
 Owl+0.000
Neg logit contributions
 nodded-0.000
 blamed-0.000
Kids-0.000
 did-0.000
 swapped-0.000
Token watched
Token ID7342
Feature activation-3.59e-03

Pos logit contributions
Kids+0.004
 blamed+0.004
Uh+0.004
 nodded+0.004
omo+0.004
Neg logit contributions
gy-0.010
 Owl-0.009
 possible-0.009
 myself-0.009
 rescuing-0.008
Token in
Token ID287
Feature activation-5.19e-03

Pos logit contributions
 blamed+0.006
 nodded+0.006
Kids+0.006
 swapped+0.005
 did+0.005
Neg logit contributions
gy-0.007
 myself-0.006
 Owl-0.006
 possible-0.006
ulent-0.005
Token amaz
Token ID40642
Feature activation-2.70e-02

Pos logit contributions
Kids+0.009
 blamed+0.009
 nodded+0.009
outed+0.008
Uh+0.008
Neg logit contributions
gy-0.008
 myself-0.008
 possible-0.008
 rescuing-0.008
 Owl-0.007
Tokenement
Token ID972
Feature activation-4.53e-03

Pos logit contributions
 blamed+0.038
Kids+0.037
 nodded+0.034
Uh+0.033
 swapped+0.032
Neg logit contributions
gy-0.053
 myself-0.052
 possible-0.050
 Owl-0.047
ble-0.047
Token.
Token ID13
Feature activation-1.76e-03

Pos logit contributions
 nodded+0.007
 blamed+0.007
Kids+0.006
 swapped+0.006
 did+0.006
Neg logit contributions
gy-0.009
 Owl-0.008
 myself-0.008
 possible-0.007
ulent-0.007
Token The
Token ID383
Feature activation-7.85e-04

Pos logit contributions
 nodded+0.002
 blamed+0.002
Kids+0.002
 swapped+0.002
Uh+0.002
Neg logit contributions
gy-0.004
 Owl-0.003
 myself-0.003
 possible-0.003
 rescuing-0.003
Token dog
Token ID3290
Feature activation-1.80e-03

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.001
 myself-0.001
 Owl-0.001
 possible-0.001
ble-0.001
Token too
Token ID1165
Feature activation-5.95e-03

Pos logit contributions
Kids+0.002
 blamed+0.002
Uh+0.002
 nodded+0.002
Wow+0.002
Neg logit contributions
gy-0.004
 Owl-0.004
 myself-0.003
 possible-0.003
 rescuing-0.003
Token the
Token ID262
Feature activation+1.13e-03

Pos logit contributions
 nodded+0.006
 blamed+0.005
Kids+0.005
 swapped+0.005
Wow+0.005
Neg logit contributions
gy-0.006
 myself-0.005
 Owl-0.005
 possible-0.005
ulent-0.005
Token zipper
Token ID48992
Feature activation-1.71e-03

Pos logit contributions
gy+0.002
 myself+0.002
 Owl+0.002
 possible+0.002
ble+0.002
Neg logit contributions
 blamed-0.002
 nodded-0.002
Kids-0.002
 swapped-0.001
Uh-0.001
Token and
Token ID290
Feature activation-3.24e-03

Pos logit contributions
 nodded+0.002
 blamed+0.002
 did+0.002
 swapped+0.002
 thanked+0.002
Neg logit contributions
gy-0.004
 myself-0.004
 Owl-0.004
 possible-0.004
ulent-0.003
Token gasped
Token ID45236
Feature activation+5.76e-03

Pos logit contributions
 nodded+0.004
 blamed+0.004
Kids+0.004
 swapped+0.004
 did+0.004
Neg logit contributions
gy-0.007
 myself-0.006
 Owl-0.006
 possible-0.006
ulent-0.006
Token with
Token ID351
Feature activation-3.66e-03

Pos logit contributions
gy+0.011
 possible+0.011
 Owl+0.011
 myself+0.011
uch+0.010
Neg logit contributions
 nodded-0.008
 blamed-0.007
 swapped-0.007
 did-0.007
 thanked-0.006
Token amaz
Token ID40642
Feature activation-2.69e-02

Pos logit contributions
Kids+0.004
 blamed+0.004
 nodded+0.004
Uh+0.004
outed+0.004
Neg logit contributions
gy-0.008
 myself-0.008
 possible-0.007
 Owl-0.007
 rescuing-0.007
Tokenement
Token ID972
Feature activation-8.10e-03

Pos logit contributions
 blamed+0.037
Kids+0.036
 nodded+0.034
Uh+0.032
 swapped+0.032
Neg logit contributions
gy-0.055
 myself-0.052
 possible-0.050
 Owl-0.049
ble-0.048
Token at
Token ID379
Feature activation-2.37e-03

Pos logit contributions
 nodded+0.013
 blamed+0.013
 swapped+0.011
Kids+0.011
outed+0.011
Neg logit contributions
gy-0.015
 myself-0.014
 Owl-0.014
 possible-0.013
ulent-0.012
Token all
Token ID477
Feature activation+2.11e-03

Pos logit contributions
 blamed+0.004
 nodded+0.004
Kids+0.004
 swapped+0.003
 did+0.003
Neg logit contributions
gy-0.004
 myself-0.004
 Owl-0.004
ulent-0.004
 possible-0.004
Token the
Token ID262
Feature activation+1.62e-03

Pos logit contributions
gy+0.003
 possible+0.002
 myself+0.002
 Owl+0.002
 rescuing+0.002
Neg logit contributions
 blamed-0.005
 nodded-0.005
Kids-0.005
Uh-0.004
 swapped-0.004
Token treasures
Token ID29561
Feature activation+1.58e-03

Pos logit contributions
gy+0.002
 myself+0.002
 possible+0.002
 Owl+0.002
 rescuing+0.002
Neg logit contributions
 blamed-0.003
 nodded-0.003
Kids-0.003
Uh-0.003
 swapped-0.003
Token air
Token ID1633
Feature activation-8.19e-03

Pos logit contributions
 blamed+0.011
 nodded+0.011
Kids+0.010
Wow+0.010
 did+0.010
Neg logit contributions
gy-0.011
 myself-0.009
 possible-0.008
ulent-0.008
 Owl-0.008
Token.
Token ID13
Feature activation-2.16e-04

Pos logit contributions
 blamed+0.014
 nodded+0.014
Kids+0.013
 swapped+0.013
Uh+0.012
Neg logit contributions
gy-0.014
 myself-0.013
 Owl-0.012
 possible-0.012
ulent-0.011
Token He
Token ID679
Feature activation-2.55e-03

Pos logit contributions
 blamed+0.000
 nodded+0.000
Kids+0.000
 swapped+0.000
Uh+0.000
Neg logit contributions
gy-0.000
 myself-0.000
 Owl-0.000
 possible-0.000
 rescuing-0.000
Token watched
Token ID7342
Feature activation-4.03e-03

Pos logit contributions
Kids+0.003
Uh+0.003
Wow+0.003
 blamed+0.003
 nodded+0.003
Neg logit contributions
gy-0.006
 myself-0.005
 Owl-0.005
 possible-0.005
ble-0.004
Token with
Token ID351
Feature activation-5.66e-03

Pos logit contributions
 nodded+0.007
 blamed+0.007
Kids+0.006
 swapped+0.006
 did+0.006
Neg logit contributions
gy-0.008
 Owl-0.006
 possible-0.006
 myself-0.006
ulent-0.006
Token amaz
Token ID40642
Feature activation-2.69e-02

Pos logit contributions
 blamed+0.011
 nodded+0.011
Kids+0.011
 swapped+0.010
Uh+0.010
Neg logit contributions
gy-0.009
 myself-0.008
 Owl-0.007
 possible-0.007
 rescuing-0.007
Tokenement
Token ID972
Feature activation-4.38e-03

Pos logit contributions
 blamed+0.031
 nodded+0.030
Kids+0.029
 swapped+0.026
Uh+0.025
Neg logit contributions
gy-0.063
 myself-0.058
 Owl-0.057
 possible-0.056
 rescuing-0.053
Token as
Token ID355
Feature activation-7.02e-03

Pos logit contributions
 nodded+0.007
 blamed+0.006
 did+0.006
 brushes+0.006
Kids+0.005
Neg logit contributions
gy-0.008
 Owl-0.008
 myself-0.007
ulent-0.007
 possible-0.007
Token the
Token ID262
Feature activation+1.02e-03

Pos logit contributions
 blamed+0.012
 nodded+0.012
Kids+0.012
 swapped+0.011
 did+0.011
Neg logit contributions
gy-0.012
 myself-0.010
 Owl-0.010
ulent-0.010
 possible-0.010
Token z
Token ID1976
Feature activation+2.74e-03

Pos logit contributions
gy+0.002
 Owl+0.002
 myself+0.002
 possible+0.002
 rescuing+0.002
Neg logit contributions
 blamed-0.002
 nodded-0.002
Kids-0.002
 swapped-0.001
Uh-0.001
Tokenig
Token ID328
Feature activation+7.57e-03

Pos logit contributions
gy+0.005
 Owl+0.004
 myself+0.004
 possible+0.004
 rescuing+0.004
Neg logit contributions
 nodded-0.005
 blamed-0.004
Kids-0.004
 swapped-0.004
outed-0.004
Token,
Token ID11
Feature activation-6.33e-04

Pos logit contributions
gy+0.000
 Owl+0.000
 myself+0.000
ulent+0.000
 rescuing+0.000
Neg logit contributions
 nodded-0.000
 blamed-0.000
 swapped-0.000
 brushes-0.000
 did-0.000
Token his
Token ID465
Feature activation-2.70e-03

Pos logit contributions
 nodded+0.001
 blamed+0.001
Kids+0.001
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.001
 myself-0.001
 Owl-0.001
 possible-0.001
ulent-0.001
Token eyes
Token ID2951
Feature activation-8.17e-03

Pos logit contributions
 blamed+0.004
Kids+0.004
 nodded+0.004
 swapped+0.003
Uh+0.003
Neg logit contributions
gy-0.006
 myself-0.005
 Owl-0.005
 possible-0.005
ble-0.005
Token widening
Token ID38118
Feature activation+2.36e-04

Pos logit contributions
 nodded+0.011
 blamed+0.010
 swapped+0.009
Kids+0.009
 did+0.008
Neg logit contributions
gy-0.019
 myself-0.017
 Owl-0.017
 possible-0.016
 rescuing-0.015
Token with
Token ID351
Feature activation-3.35e-03

Pos logit contributions
 Owl+0.000
gy+0.000
 myself+0.000
 possible+0.000
ulent+0.000
Neg logit contributions
 nodded-0.000
 blamed-0.000
 swapped-0.000
 did-0.000
omo-0.000
Token amaz
Token ID40642
Feature activation-2.68e-02

Pos logit contributions
Kids+0.004
 blamed+0.004
 nodded+0.004
Uh+0.003
 swapped+0.003
Neg logit contributions
gy-0.008
 myself-0.007
 Owl-0.007
 possible-0.007
 rescuing-0.007
Tokenement
Token ID972
Feature activation-9.30e-03

Pos logit contributions
 blamed+0.037
Kids+0.036
 nodded+0.033
 swapped+0.031
Uh+0.031
Neg logit contributions
gy-0.056
 myself-0.052
 possible-0.050
ble-0.049
 Owl-0.049
Token.
Token ID13
Feature activation+1.42e-03

Pos logit contributions
 nodded+0.016
 blamed+0.015
 swapped+0.013
Kids+0.013
 did+0.013
Neg logit contributions
gy-0.017
 Owl-0.016
 myself-0.015
 possible-0.014
ulent-0.014
Token He
Token ID679
Feature activation+3.49e-03

Pos logit contributions
gy+0.003
 myself+0.003
 Owl+0.003
 possible+0.002
 rescuing+0.002
Neg logit contributions
 blamed-0.002
 nodded-0.002
Kids-0.002
 swapped-0.002
Uh-0.002
Token wanted
Token ID2227
Feature activation-2.18e-03

Pos logit contributions
gy+0.008
 Owl+0.007
 myself+0.007
 possible+0.006
ble+0.006
Neg logit contributions
Kids-0.004
 blamed-0.004
Uh-0.004
Wow-0.004
 swapped-0.004
Token to
Token ID284
Feature activation+2.19e-03

Pos logit contributions
 nodded+0.004
 blamed+0.004
Kids+0.003
 swapped+0.003
 did+0.003
Neg logit contributions
gy-0.004
 myself-0.003
 Owl-0.003
 possible-0.003
ble-0.003
Token She
Token ID1375
Feature activation-2.02e-03

Pos logit contributions
gy+0.003
 myself+0.002
 Owl+0.002
 possible+0.002
 rescuing+0.002
Neg logit contributions
 blamed-0.002
 nodded-0.002
Kids-0.002
 swapped-0.002
Uh-0.002
Token opened
Token ID4721
Feature activation-3.87e-03

Pos logit contributions
Kids+0.003
 blamed+0.003
Uh+0.003
Wow+0.003
 swapped+0.002
Neg logit contributions
gy-0.004
 myself-0.004
 Owl-0.003
 possible-0.003
ble-0.003
Token her
Token ID607
Feature activation+3.48e-04

Pos logit contributions
 blamed+0.007
 nodded+0.007
Kids+0.007
 swapped+0.007
Uh+0.006
Neg logit contributions
gy-0.006
 myself-0.005
 Owl-0.005
 possible-0.005
 rescuing-0.005
Token mouth
Token ID5422
Feature activation-1.84e-03

Pos logit contributions
gy+0.001
 myself+0.001
 Owl+0.001
 possible+0.001
ble+0.001
Neg logit contributions
 blamed-0.000
Kids-0.000
 nodded-0.000
Uh-0.000
 swapped-0.000
Token in
Token ID287
Feature activation-4.58e-03

Pos logit contributions
 nodded+0.003
 blamed+0.003
Kids+0.002
 swapped+0.002
 did+0.002
Neg logit contributions
gy-0.004
 myself-0.003
 Owl-0.003
 possible-0.003
 rescuing-0.003
Token amaz
Token ID40642
Feature activation-2.68e-02

Pos logit contributions
 blamed+0.009
Kids+0.009
 nodded+0.009
 swapped+0.008
Uh+0.008
Neg logit contributions
gy-0.007
 myself-0.006
 possible-0.006
 Owl-0.006
 rescuing-0.006
Tokenement
Token ID972
Feature activation-1.10e-02

Pos logit contributions
 blamed+0.044
Kids+0.044
Uh+0.040
 swapped+0.039
 nodded+0.038
Neg logit contributions
gy-0.046
 myself-0.045
ble-0.042
 possible-0.040
bing-0.039
Token.
Token ID13
Feature activation+1.64e-03

Pos logit contributions
 nodded+0.017
 blamed+0.016
 swapped+0.014
outed+0.014
 did+0.014
Neg logit contributions
gy-0.021
 Owl-0.020
 myself-0.019
 possible-0.019
ulent-0.018
Token She
Token ID1375
Feature activation-1.60e-03

Pos logit contributions
gy+0.003
 myself+0.003
 Owl+0.003
 possible+0.003
 rescuing+0.003
Neg logit contributions
 blamed-0.002
 nodded-0.002
Kids-0.002
 swapped-0.002
Uh-0.002
Token had
Token ID550
Feature activation+1.81e-04

Pos logit contributions
Kids+0.002
 blamed+0.002
Uh+0.002
 nodded+0.002
Wow+0.002
Neg logit contributions
gy-0.004
 myself-0.003
 Owl-0.003
 possible-0.003
ble-0.003
Token never
Token ID1239
Feature activation-2.69e-03

Pos logit contributions
gy+0.000
 myself+0.000
 possible+0.000
 Owl+0.000
 rescuing+0.000
Neg logit contributions
 blamed-0.000
Kids-0.000
 nodded-0.000
Uh-0.000
 swapped-0.000
Token
Token ID198
Feature activation-2.37e-03

Pos logit contributions
 blamed+0.003
 nodded+0.003
Kids+0.003
 swapped+0.002
Uh+0.002
Neg logit contributions
gy-0.004
 myself-0.004
 Owl-0.004
 possible-0.004
 rescuing-0.003
TokenT
Token ID51
Feature activation+4.53e-04

Pos logit contributions
 blamed+0.005
 nodded+0.005
Kids+0.004
 swapped+0.004
Uh+0.004
Neg logit contributions
gy-0.004
 myself-0.003
 Owl-0.003
 possible-0.003
ble-0.003
Tokenina
Token ID1437
Feature activation-1.02e-02

Pos logit contributions
gy+0.001
 myself+0.001
 Owl+0.001
 possible+0.001
 rescuing+0.001
Neg logit contributions
Kids-0.001
 nodded-0.001
 blamed-0.001
outed-0.001
omo-0.001
Token waited
Token ID13488
Feature activation-2.43e-03

Pos logit contributions
Kids+0.013
 blamed+0.013
Uh+0.012
Wow+0.011
 swapped+0.011
Neg logit contributions
gy-0.023
 myself-0.020
 Owl-0.020
ble-0.018
 rescuing-0.018
Token with
Token ID351
Feature activation-4.04e-03

Pos logit contributions
 nodded+0.004
 blamed+0.003
 did+0.003
 minutes+0.003
 swapped+0.003
Neg logit contributions
gy-0.005
 Owl-0.005
 myself-0.005
 possible-0.004
uch-0.004
Token amaz
Token ID40642
Feature activation-2.68e-02

Pos logit contributions
 blamed+0.007
 nodded+0.006
Kids+0.006
 swapped+0.006
Uh+0.006
Neg logit contributions
gy-0.008
 myself-0.007
 Owl-0.007
 possible-0.006
 rescuing-0.006
Tokenement
Token ID972
Feature activation-9.09e-03

Pos logit contributions
 blamed+0.031
 nodded+0.030
Kids+0.029
 swapped+0.026
Uh+0.026
Neg logit contributions
gy-0.062
 myself-0.057
 Owl-0.056
 possible-0.055
ble-0.052
Token as
Token ID355
Feature activation-4.22e-03

Pos logit contributions
 nodded+0.014
 blamed+0.012
outed+0.011
 did+0.011
 thanked+0.011
Neg logit contributions
gy-0.018
 myself-0.017
 Owl-0.017
 possible-0.016
ulent-0.015
Token the
Token ID262
Feature activation+5.76e-04

Pos logit contributions
 nodded+0.006
 blamed+0.005
Kids+0.005
 brushes+0.005
 did+0.005
Neg logit contributions
gy-0.009
 myself-0.008
ulent-0.008
 Owl-0.008
 rescuing-0.007
Token old
Token ID1468
Feature activation-7.71e-04

Pos logit contributions
gy+0.001
 Owl+0.001
 possible+0.001
 myself+0.001
 rescuing+0.001
Neg logit contributions
 blamed-0.001
Kids-0.001
 nodded-0.001
 swapped-0.001
Uh-0.001
Token man
Token ID582
Feature activation-2.25e-03

Pos logit contributions
 blamed+0.001
Kids+0.001
 nodded+0.001
Uh+0.001
 swapped+0.001
Neg logit contributions
gy-0.002
 Owl-0.002
 possible-0.002
 myself-0.001
ble-0.001
Token up
Token ID510
Feature activation+5.79e-03

Pos logit contributions
 blamed+0.009
Kids+0.009
 nodded+0.008
Uh+0.008
 swapped+0.008
Neg logit contributions
gy-0.008
 myself-0.007
 Owl-0.007
 possible-0.007
 rescuing-0.006
Token at
Token ID379
Feature activation-1.65e-03

Pos logit contributions
gy+0.012
 myself+0.011
 Owl+0.011
 possible+0.010
 rescuing+0.010
Neg logit contributions
 blamed-0.008
 nodded-0.008
Kids-0.008
 swapped-0.007
Uh-0.007
Token the
Token ID262
Feature activation+1.71e-03

Pos logit contributions
 blamed+0.003
 nodded+0.003
Kids+0.003
 swapped+0.002
Wow+0.002
Neg logit contributions
gy-0.003
 myself-0.003
 Owl-0.003
 possible-0.003
ulent-0.003
Token tree
Token ID5509
Feature activation-4.88e-03

Pos logit contributions
gy+0.003
 myself+0.003
 Owl+0.003
 possible+0.003
 rescuing+0.003
Neg logit contributions
 nodded-0.003
 blamed-0.003
Kids-0.002
 swapped-0.002
Uh-0.002
Token in
Token ID287
Feature activation-4.23e-03

Pos logit contributions
 nodded+0.007
 blamed+0.006
 swapped+0.005
 thanked+0.005
 did+0.005
Neg logit contributions
gy-0.010
 myself-0.010
 possible-0.009
 Owl-0.009
ble-0.009
Token amaz
Token ID40642
Feature activation-2.68e-02

Pos logit contributions
 blamed+0.007
Kids+0.007
 nodded+0.006
outed+0.006
omo+0.006
Neg logit contributions
gy-0.008
 possible-0.007
 myself-0.007
 Owl-0.007
 rescuing-0.007
Tokenement
Token ID972
Feature activation-9.56e-03

Pos logit contributions
 blamed+0.047
Kids+0.046
Uh+0.042
 swapped+0.041
 nodded+0.040
Neg logit contributions
 myself-0.043
gy-0.042
ble-0.040
 possible-0.038
bing-0.037
Token,
Token ID11
Feature activation-1.01e-03

Pos logit contributions
 nodded+0.016
 blamed+0.015
Kids+0.014
 swapped+0.013
 did+0.013
Neg logit contributions
gy-0.018
 myself-0.016
 Owl-0.016
 possible-0.015
ulent-0.015
Token wondering
Token ID11263
Feature activation-3.56e-03

Pos logit contributions
 blamed+0.002
 nodded+0.002
Kids+0.002
 swapped+0.002
Uh+0.001
Neg logit contributions
gy-0.002
 myself-0.002
 Owl-0.002
 possible-0.002
 rescuing-0.001
Token how
Token ID703
Feature activation-3.48e-03

Pos logit contributions
 nodded+0.006
 blamed+0.005
 did+0.005
Kids+0.005
 swapped+0.005
Neg logit contributions
gy-0.007
 myself-0.006
 Owl-0.006
ulent-0.006
 possible-0.006
Token it
Token ID340
Feature activation-5.93e-04

Pos logit contributions
 nodded+0.006
 blamed+0.006
 did+0.005
Kids+0.005
 thanked+0.005
Neg logit contributions
gy-0.007
 myself-0.006
 Owl-0.006
ulent-0.005
pper-0.005
Token looked
Token ID3114
Feature activation-4.62e-03

Pos logit contributions
Kids+0.003
 blamed+0.003
Uh+0.002
 nodded+0.002
omo+0.002
Neg logit contributions
gy-0.006
 myself-0.005
 Owl-0.005
 possible-0.005
ble-0.005
Token at
Token ID379
Feature activation-1.85e-03

Pos logit contributions
Kids+0.010
 blamed+0.009
 nodded+0.009
omo+0.009
Uh+0.009
Neg logit contributions
gy-0.007
 myself-0.006
 Owl-0.005
 possible-0.005
 rescuing-0.005
Token each
Token ID1123
Feature activation-2.61e-03

Pos logit contributions
 blamed+0.003
 nodded+0.003
Kids+0.003
 swapped+0.003
Uh+0.003
Neg logit contributions
gy-0.003
 myself-0.003
 Owl-0.003
 possible-0.003
 rescuing-0.003
Token other
Token ID584
Feature activation-5.83e-03

Pos logit contributions
 nodded+0.004
 blamed+0.004
Kids+0.004
 swapped+0.003
Uh+0.003
Neg logit contributions
gy-0.005
 myself-0.005
 Owl-0.005
 possible-0.005
 rescuing-0.004
Token in
Token ID287
Feature activation-4.72e-03

Pos logit contributions
 nodded+0.009
 blamed+0.008
 swapped+0.007
Kids+0.007
 did+0.007
Neg logit contributions
gy-0.012
 myself-0.011
 Owl-0.010
 possible-0.010
ulent-0.010
Token amaz
Token ID40642
Feature activation-2.67e-02

Pos logit contributions
Kids+0.007
 blamed+0.007
 nodded+0.007
outed+0.006
omo+0.006
Neg logit contributions
gy-0.009
 possible-0.008
 Owl-0.008
 myself-0.008
 rescuing-0.008
Tokenement
Token ID972
Feature activation-9.76e-03

Pos logit contributions
 blamed+0.041
Kids+0.040
Uh+0.037
 nodded+0.037
 swapped+0.036
Neg logit contributions
gy-0.049
 myself-0.047
ble-0.045
 possible-0.044
bing-0.042
Token.
Token ID13
Feature activation-1.99e-04

Pos logit contributions
 nodded+0.017
 blamed+0.016
Kids+0.015
 swapped+0.014
 did+0.014
Neg logit contributions
gy-0.018
 Owl-0.016
 myself-0.016
 possible-0.015
ulent-0.014
Token They
Token ID1119
Feature activation+4.43e-03

Pos logit contributions
 blamed+0.000
 nodded+0.000
Kids+0.000
 swapped+0.000
Uh+0.000
Neg logit contributions
gy-0.000
 Owl-0.000
 myself-0.000
 possible-0.000
 rescuing-0.000
Token had
Token ID550
Feature activation-2.83e-04

Pos logit contributions
gy+0.010
 Owl+0.009
 myself+0.009
 possible+0.008
ble+0.008
Neg logit contributions
Kids-0.006
 blamed-0.005
omo-0.005
Uh-0.005
 nodded-0.004
Token learned
Token ID4499
Feature activation+1.55e-04

Pos logit contributions
Kids+0.001
 blamed+0.001
 nodded+0.000
Uh+0.000
omo+0.000
Neg logit contributions
gy-0.000
 myself-0.000
 possible-0.000
 Owl-0.000
 rescuing-0.000
Token.
Token ID13
Feature activation+2.77e-04

Pos logit contributions
 blamed+0.003
Kids+0.003
 nodded+0.003
 swapped+0.003
Uh+0.003
Neg logit contributions
gy-0.003
 myself-0.003
 Owl-0.003
 possible-0.003
 rescuing-0.003
Token Amy
Token ID14235
Feature activation+1.86e-03

Pos logit contributions
gy+0.000
 myself+0.000
 Owl+0.000
 rescuing+0.000
 possible+0.000
Neg logit contributions
 blamed-0.000
 nodded-0.000
Kids-0.000
Uh-0.000
 swapped-0.000
Token watched
Token ID7342
Feature activation-4.28e-03

Pos logit contributions
gy+0.004
 myself+0.003
 Owl+0.003
 possible+0.003
ulent+0.003
Neg logit contributions
 blamed-0.002
Kids-0.002
Uh-0.002
Wow-0.002
 nodded-0.002
Token them
Token ID606
Feature activation+3.23e-04

Pos logit contributions
 nodded+0.006
 blamed+0.005
Kids+0.005
 swapped+0.005
 thanked+0.005
Neg logit contributions
gy-0.009
 myself-0.008
 Owl-0.008
 possible-0.008
ulent-0.008
Token with
Token ID351
Feature activation-2.95e-03

Pos logit contributions
gy+0.001
 possible+0.001
 Owl+0.001
 myself+0.001
ich+0.001
Neg logit contributions
 nodded-0.000
 blamed-0.000
 thanked-0.000
 swapped-0.000
 did-0.000
Token amaz
Token ID40642
Feature activation-2.67e-02

Pos logit contributions
 blamed+0.005
 nodded+0.005
Kids+0.005
 swapped+0.005
Uh+0.005
Neg logit contributions
gy-0.005
 myself-0.004
 Owl-0.004
 possible-0.004
 rescuing-0.004
Tokenement
Token ID972
Feature activation-8.87e-03

Pos logit contributions
 blamed+0.033
Kids+0.031
 nodded+0.031
 swapped+0.027
Uh+0.027
Neg logit contributions
gy-0.060
 myself-0.055
 Owl-0.054
 possible-0.053
ble-0.051
Token.
Token ID13
Feature activation+2.47e-04

Pos logit contributions
 nodded+0.015
 blamed+0.013
 swapped+0.012
 did+0.012
Kids+0.012
Neg logit contributions
gy-0.017
 Owl-0.016
 myself-0.015
 possible-0.015
ulent-0.014
Token She
Token ID1375
Feature activation-1.91e-03

Pos logit contributions
gy+0.000
 Owl+0.000
 myself+0.000
 possible+0.000
 rescuing+0.000
Neg logit contributions
 blamed-0.000
 nodded-0.000
Kids-0.000
Uh-0.000
 swapped-0.000
Token smiled
Token ID13541
Feature activation+6.21e-03

Pos logit contributions
Kids+0.002
 blamed+0.002
Uh+0.002
Wow+0.002
 swapped+0.002
Neg logit contributions
gy-0.004
 myself-0.003
 Owl-0.003
 possible-0.003
ble-0.003
Token,
Token ID11
Feature activation+2.32e-04

Pos logit contributions
gy+0.013
 myself+0.012
 Owl+0.012
 possible+0.012
 rescuing+0.011
Neg logit contributions
 nodded-0.009
 blamed-0.008
Kids-0.007
 swapped-0.007
 thanked-0.007
Token they
Token ID484
Feature activation-3.39e-05

Pos logit contributions
 nodded+0.000
 blamed+0.000
Kids+0.000
 did+0.000
 swapped+0.000
Neg logit contributions
gy-0.000
 myself-0.000
 Owl-0.000
ulent-0.000
 possible-0.000
Token saw
Token ID2497
Feature activation-3.04e-03

Pos logit contributions
 blamed+0.000
Kids+0.000
 nodded+0.000
Uh+0.000
omo+0.000
Neg logit contributions
gy-0.000
 myself-0.000
 Owl-0.000
 possible-0.000
 rescuing-0.000
Token a
Token ID257
Feature activation-3.30e-03

Pos logit contributions
 blamed+0.005
 nodded+0.005
Kids+0.005
 swapped+0.005
 did+0.005
Neg logit contributions
gy-0.005
 myself-0.005
 Owl-0.005
 possible-0.005
ulent-0.004
Token rh
Token ID9529
Feature activation+1.61e-02

Pos logit contributions
 blamed+0.005
 nodded+0.005
Kids+0.005
 swapped+0.005
Uh+0.005
Neg logit contributions
gy-0.006
 Owl-0.005
 myself-0.005
 possible-0.005
 rescuing-0.005
Tokenin
Token ID259
Feature activation+2.04e-03

Pos logit contributions
gy+0.023
 myself+0.019
 possible+0.018
 Owl+0.018
ulent+0.017
Neg logit contributions
 blamed-0.034
 nodded-0.033
Kids-0.032
Uh-0.030
 swapped-0.030
Tokenoc
Token ID420
Feature activation+5.93e-02

Pos logit contributions
gy+0.003
 myself+0.002
 possible+0.002
 Owl+0.002
ulent+0.002
Neg logit contributions
 blamed-0.004
 nodded-0.004
Kids-0.004
 did-0.004
Uh-0.004
Tokeneros
Token ID27498
Feature activation+1.90e-03

Pos logit contributions
gy+0.075
 possible+0.074
 myself+0.066
 outer+0.060
 Owl+0.057
Neg logit contributions
 nodded-0.123
 blamed-0.119
Kids-0.110
 moms-0.110
 swapped-0.107
Token.
Token ID13
Feature activation+3.77e-04

Pos logit contributions
gy+0.003
 myself+0.003
 Owl+0.003
 possible+0.003
ulent+0.003
Neg logit contributions
 blamed-0.003
 nodded-0.003
Kids-0.003
 swapped-0.003
 did-0.003
Token They
Token ID1119
Feature activation+3.82e-03

Pos logit contributions
gy+0.001
 Owl+0.001
 myself+0.001
 Must+0.001
 Grizz+0.001
Neg logit contributions
 nodded-0.001
 blamed-0.001
 swapped-0.001
Kids-0.001
Uh-0.001
Token tell
Token ID1560
Feature activation+8.07e-03

Pos logit contributions
gy+0.007
 Owl+0.006
 myself+0.006
 possible+0.006
 rescuing+0.005
Neg logit contributions
Kids-0.006
Uh-0.006
 blamed-0.006
outed-0.006
omo-0.006
Token their
Token ID511
Feature activation-2.25e-03

Pos logit contributions
gy+0.014
 myself+0.012
 possible+0.012
 Owl+0.011
ulent+0.011
Neg logit contributions
 blamed-0.015
 nodded-0.014
Kids-0.014
 did-0.013
 swapped-0.013
Token wait
Token ID4043
Feature activation-4.93e-03

Pos logit contributions
 blamed+0.005
 nodded+0.005
Kids+0.005
 swapped+0.005
Uh+0.005
Neg logit contributions
gy-0.006
 myself-0.005
 Owl-0.005
 possible-0.005
ulent-0.005
Token for
Token ID329
Feature activation-7.70e-04

Pos logit contributions
 nodded+0.008
 blamed+0.008
Kids+0.007
 did+0.007
 swapped+0.007
Neg logit contributions
gy-0.009
 Owl-0.008
 myself-0.007
ulent-0.007
ich-0.007
Token the
Token ID262
Feature activation+1.77e-03

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 minutes+0.001
 thanked+0.001
Neg logit contributions
gy-0.002
fly-0.001
ble-0.001
ich-0.001
 Owl-0.001
Token rh
Token ID9529
Feature activation+1.97e-02

Pos logit contributions
gy+0.003
 myself+0.003
 Owl+0.002
 possible+0.002
ble+0.002
Neg logit contributions
 blamed-0.003
 nodded-0.003
Kids-0.003
 swapped-0.003
Uh-0.003
Tokenin
Token ID259
Feature activation+4.08e-03

Pos logit contributions
gy+0.036
 myself+0.031
 possible+0.030
ulent+0.030
 Owl+0.029
Neg logit contributions
 nodded-0.031
 blamed-0.031
Kids-0.029
Uh-0.028
 swapped-0.028
Tokenoc
Token ID420
Feature activation+6.16e-02

Pos logit contributions
gy+0.005
 myself+0.005
 possible+0.004
ble+0.004
 Owl+0.004
Neg logit contributions
 nodded-0.009
 blamed-0.008
Kids-0.008
 did-0.008
Uh-0.008
Tokeneros
Token ID27498
Feature activation+3.54e-03

Pos logit contributions
 possible+0.092
gy+0.082
 outer+0.080
 myself+0.076
 afloat+0.076
Neg logit contributions
 nodded-0.104
 moms-0.102
 blamed-0.100
 swapped-0.093
omo-0.093
Token to
Token ID284
Feature activation+8.14e-04

Pos logit contributions
gy+0.007
 myself+0.006
 Owl+0.006
ble+0.006
bing+0.006
Neg logit contributions
 nodded-0.005
 blamed-0.005
 did-0.005
 swapped-0.004
Kids-0.004
Token come
Token ID1282
Feature activation-1.22e-03

Pos logit contributions
gy+0.002
 Owl+0.001
 possible+0.001
ulent+0.001
 myself+0.001
Neg logit contributions
 nodded-0.001
 blamed-0.001
 did-0.001
 swapped-0.001
 thanked-0.001
Token.
Token ID13
Feature activation-7.40e-04

Pos logit contributions
 nodded+0.001
 did+0.001
 asked+0.001
 swapped+0.001
 thanked+0.001
Neg logit contributions
iflower-0.002
icken-0.002
ulent-0.002
uch-0.002
 myself-0.002
Token\n
Token ID198
Feature activation-1.41e-03

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
Uh+0.001
 swapped+0.001
Neg logit contributions
gy-0.001
 Owl-0.001
 myself-0.001
 Must-0.001
 possible-0.001
Token happy
Token ID3772
Feature activation+6.41e-04

Pos logit contributions
 blamed+0.002
 nodded+0.002
Kids+0.002
Uh+0.002
 swapped+0.002
Neg logit contributions
gy-0.003
 myself-0.003
 Owl-0.003
 possible-0.003
 rescuing-0.003
Token for
Token ID329
Feature activation-1.63e-03

Pos logit contributions
gy+0.001
 myself+0.001
 Owl+0.001
 possible+0.001
 rescuing+0.001
Neg logit contributions
 blamed-0.001
Kids-0.001
 nodded-0.001
Uh-0.001
 swapped-0.001
Token the
Token ID262
Feature activation+1.91e-03

Pos logit contributions
 blamed+0.002
Kids+0.002
 nodded+0.002
 did+0.002
Wow+0.002
Neg logit contributions
gy-0.004
uch-0.003
working-0.003
ulent-0.003
 Owl-0.003
Token rh
Token ID9529
Feature activation+1.50e-02

Pos logit contributions
gy+0.004
 myself+0.003
 Owl+0.003
 possible+0.003
ble+0.003
Neg logit contributions
 blamed-0.003
 nodded-0.003
Kids-0.003
 swapped-0.003
Uh-0.003
Tokenin
Token ID259
Feature activation+1.10e-03

Pos logit contributions
gy+0.027
 myself+0.023
 possible+0.022
ulent+0.022
 Owl+0.022
Neg logit contributions
 nodded-0.025
 blamed-0.025
Kids-0.024
 swapped-0.022
 did-0.022
Tokenoc
Token ID420
Feature activation+5.82e-02

Pos logit contributions
gy+0.002
 myself+0.001
 possible+0.001
ble+0.001
ulent+0.001
Neg logit contributions
 blamed-0.002
 nodded-0.002
Kids-0.002
 did-0.002
 thanked-0.002
Tokeneros
Token ID27498
Feature activation+1.50e-03

Pos logit contributions
 possible+0.081
gy+0.076
 afloat+0.067
 myself+0.067
 outer+0.066
Neg logit contributions
 nodded-0.110
 blamed-0.105
Kids-0.104
 moms-0.102
omo-0.097
Token.
Token ID13
Feature activation+8.89e-04

Pos logit contributions
gy+0.003
 myself+0.003
 possible+0.002
ble+0.002
 Owl+0.002
Neg logit contributions
 nodded-0.002
 blamed-0.002
Kids-0.002
 swapped-0.002
 did-0.002
Token She
Token ID1375
Feature activation-9.94e-04

Pos logit contributions
gy+0.001
 Owl+0.001
 myself+0.001
 possible+0.001
ulent+0.001
Neg logit contributions
 nodded-0.002
 blamed-0.001
 swapped-0.001
Uh-0.001
Kids-0.001
Token said
Token ID531
Feature activation-6.85e-04

Pos logit contributions
Uh+0.001
Kids+0.001
 blamed+0.001
Wow+0.001
outed+0.001
Neg logit contributions
gy-0.002
 Owl-0.002
 myself-0.002
 possible-0.002
 rescuing-0.002
Token,
Token ID11
Feature activation+2.82e-03

Pos logit contributions
 nodded+0.001
 blamed+0.001
Kids+0.001
 swapped+0.001
 did+0.001
Neg logit contributions
gy-0.001
 myself-0.001
 Owl-0.001
 possible-0.001
ulent-0.001
Token\n
Token ID198
Feature activation-1.75e-04

Pos logit contributions
 blamed+0.004
 nodded+0.004
Kids+0.004
 swapped+0.004
Uh+0.004
Neg logit contributions
gy-0.006
 myself-0.005
 Owl-0.005
 possible-0.005
 rescuing-0.004
Token\n
Token ID198
Feature activation-9.46e-04

Pos logit contributions
 blamed+0.000
 nodded+0.000
Kids+0.000
 swapped+0.000
Uh+0.000
Neg logit contributions
gy-0.000
 myself-0.000
 Owl-0.000
 possible-0.000
 rescuing-0.000
TokenThe
Token ID464
Feature activation-1.92e-03

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 swapped+0.001
outed+0.001
Neg logit contributions
gy-0.002
 myself-0.002
 Owl-0.002
 possible-0.002
ble-0.002
Token rh
Token ID9529
Feature activation+1.59e-02

Pos logit contributions
 blamed+0.003
 nodded+0.003
Kids+0.003
 swapped+0.003
Uh+0.003
Neg logit contributions
gy-0.004
 Owl-0.003
 myself-0.003
 possible-0.003
 rescuing-0.003
Tokenin
Token ID259
Feature activation-1.51e-03

Pos logit contributions
gy+0.025
 myself+0.021
 possible+0.021
 Owl+0.020
ulent+0.020
Neg logit contributions
 nodded-0.031
 blamed-0.030
Kids-0.028
 did-0.027
Uh-0.027
Tokenoc
Token ID420
Feature activation+5.82e-02

Pos logit contributions
 blamed+0.004
 nodded+0.004
Kids+0.004
 swapped+0.003
Uh+0.003
Neg logit contributions
gy-0.002
 Owl-0.001
 myself-0.001
 possible-0.001
 rescuing-0.001
Tokeneros
Token ID27498
Feature activation-7.28e-04

Pos logit contributions
 possible+0.080
gy+0.078
 afloat+0.066
ulent+0.065
 outer+0.065
Neg logit contributions
 nodded-0.114
 blamed-0.105
 moms-0.104
Kids-0.100
 swapped-0.098
Token lifted
Token ID13663
Feature activation+1.38e-04

Pos logit contributions
Kids+0.001
 blamed+0.001
Uh+0.001
Wow+0.001
 swapped+0.001
Neg logit contributions
gy-0.002
 myself-0.002
 Owl-0.002
ble-0.001
 possible-0.001
Token his
Token ID465
Feature activation-4.69e-03

Pos logit contributions
gy+0.000
 possible+0.000
 Owl+0.000
 myself+0.000
ulent+0.000
Neg logit contributions
 nodded-0.000
 blamed-0.000
Kids-0.000
 swapped-0.000
Wow-0.000
Token head
Token ID1182
Feature activation-2.82e-03

Pos logit contributions
 nodded+0.007
 blamed+0.007
 swapped+0.006
Kids+0.006
 did+0.006
Neg logit contributions
gy-0.009
 myself-0.008
 possible-0.008
 Owl-0.008
 rescuing-0.008
Token and
Token ID290
Feature activation-5.05e-03

Pos logit contributions
 nodded+0.004
 blamed+0.003
 swapped+0.003
 did+0.003
 shook+0.003
Neg logit contributions
gy-0.006
 Owl-0.005
 myself-0.005
 possible-0.005
ulent-0.005
Tokeneros
Token ID27498
Feature activation+2.70e-03

Pos logit contributions
 possible+0.075
gy+0.072
 myself+0.063
 outer+0.059
 afloat+0.056
Neg logit contributions
 nodded-0.112
 blamed-0.108
 moms-0.100
 swapped-0.098
Kids-0.097
Token.
Token ID13
Feature activation+1.87e-03

Pos logit contributions
gy+0.005
 myself+0.005
 possible+0.004
 Owl+0.004
ulent+0.004
Neg logit contributions
 blamed-0.004
 nodded-0.004
 did-0.004
Kids-0.004
 swapped-0.004
Token The
Token ID383
Feature activation+1.25e-04

Pos logit contributions
gy+0.003
 Owl+0.003
 myself+0.003
 possible+0.003
 rescuing+0.003
Neg logit contributions
 blamed-0.003
 nodded-0.003
Kids-0.003
 swapped-0.003
Uh-0.003
Token rh
Token ID9529
Feature activation+1.47e-02

Pos logit contributions
gy+0.000
 myself+0.000
 possible+0.000
 Owl+0.000
ulent+0.000
Neg logit contributions
 blamed-0.000
 nodded-0.000
Wow-0.000
Kids-0.000
 did-0.000
Tokenin
Token ID259
Feature activation+9.72e-04

Pos logit contributions
gy+0.021
 myself+0.018
 possible+0.017
 Owl+0.017
ulent+0.015
Neg logit contributions
 blamed-0.031
 nodded-0.031
Kids-0.030
Uh-0.028
 swapped-0.028
Tokenoc
Token ID420
Feature activation+5.62e-02

Pos logit contributions
gy+0.001
 myself+0.001
 possible+0.001
 Owl+0.001
ulent+0.001
Neg logit contributions
 blamed-0.002
 nodded-0.002
Kids-0.002
 swapped-0.002
Uh-0.002
Tokeneros
Token ID27498
Feature activation+2.76e-03

Pos logit contributions
 possible+0.077
gy+0.071
 outer+0.062
ulent+0.062
 myself+0.061
Neg logit contributions
 nodded-0.111
 blamed-0.107
 moms-0.100
Kids-0.097
 swapped-0.097
Token looked
Token ID3114
Feature activation-5.53e-03

Pos logit contributions
gy+0.007
 Owl+0.006
 myself+0.006
 possible+0.005
 rescuing+0.005
Neg logit contributions
Kids-0.003
 blamed-0.003
 nodded-0.003
 swapped-0.003
Uh-0.003
Token very
Token ID845
Feature activation+6.00e-03

Pos logit contributions
 blamed+0.010
 nodded+0.009
Kids+0.009
 swapped+0.009
Uh+0.008
Neg logit contributions
gy-0.010
 myself-0.009
 Owl-0.009
 possible-0.008
 rescuing-0.008
Token scary
Token ID14343
Feature activation-5.16e-03

Pos logit contributions
gy+0.013
 myself+0.011
 Owl+0.011
 possible+0.011
 rescuing+0.010
Neg logit contributions
 blamed-0.009
 nodded-0.009
 swapped-0.007
Kids-0.007
 did-0.007
Token and
Token ID290
Feature activation-8.69e-04

Pos logit contributions
 nodded+0.007
 blamed+0.007
 did+0.006
Kids+0.006
 swapped+0.006
Neg logit contributions
gy-0.011
 myself-0.010
 Owl-0.010
 possible-0.010
ulent-0.009
Token started
Token ID2067
Feature activation+4.99e-04

Pos logit contributions
 blamed+0.001
Kids+0.001
 nodded+0.001
Uh+0.001
 swapped+0.001
Neg logit contributions
gy-0.001
 myself-0.001
 Owl-0.001
 possible-0.001
 rescuing-0.001
Token eating
Token ID6600
Feature activation-2.78e-03

Pos logit contributions
gy+0.001
 myself+0.001
 Owl+0.001
 possible+0.001
ble+0.001
Neg logit contributions
 nodded-0.001
 blamed-0.001
Kids-0.001
 swapped-0.001
 brushes-0.001
Token the
Token ID262
Feature activation+2.05e-03

Pos logit contributions
 nodded+0.005
 blamed+0.004
Kids+0.004
 swapped+0.004
 did+0.004
Neg logit contributions
gy-0.005
 myself-0.005
 Owl-0.005
 possible-0.004
ulent-0.004
Token delicious
Token ID12625
Feature activation-2.70e-03

Pos logit contributions
gy+0.004
 myself+0.004
 Owl+0.004
 possible+0.004
ble+0.004
Neg logit contributions
 nodded-0.003
 blamed-0.003
Kids-0.002
 swapped-0.002
 did-0.002
Token ch
Token ID442
Feature activation+7.42e-04

Pos logit contributions
 nodded+0.003
 blamed+0.003
Kids+0.003
 swapped+0.003
 brushes+0.003
Neg logit contributions
gy-0.006
 myself-0.005
 possible-0.005
 Owl-0.005
ble-0.005
Tokenocol
Token ID4668
Feature activation+5.30e-02

Pos logit contributions
gy+0.001
 possible+0.001
 Owl+0.001
 myself+0.001
 rescuing+0.001
Neg logit contributions
 blamed-0.001
 nodded-0.001
 brushes-0.001
Kids-0.001
 swapped-0.001
Tokenates
Token ID689
Feature activation+1.86e-02

Pos logit contributions
gy+0.084
 possible+0.076
 Owl+0.076
 myself+0.075
 Grizz+0.068
Neg logit contributions
 nodded-0.088
Kids-0.085
 blamed-0.083
outed-0.080
 brushes-0.080
Token.
Token ID13
Feature activation+1.37e-03

Pos logit contributions
gy+0.036
 Owl+0.033
 myself+0.031
ulent+0.029
 rescuing+0.029
Neg logit contributions
 nodded-0.028
 blamed-0.025
 swapped-0.024
 did-0.024
 agreed-0.023
Token She
Token ID1375
Feature activation-8.59e-04

Pos logit contributions
gy+0.002
 myself+0.002
 Owl+0.002
 possible+0.002
 rescuing+0.002
Neg logit contributions
 blamed-0.002
Kids-0.002
 nodded-0.002
omo-0.002
 swapped-0.002
Token was
Token ID373
Feature activation-9.19e-04

Pos logit contributions
Kids+0.001
 blamed+0.001
 nodded+0.001
Uh+0.001
Wow+0.001
Neg logit contributions
gy-0.002
 myself-0.002
 Owl-0.002
 possible-0.002
ulent-0.002
Token so
Token ID523
Feature activation+2.66e-03

Pos logit contributions
Kids+0.002
 blamed+0.002
 nodded+0.002
Uh+0.001
 swapped+0.001
Neg logit contributions
gy-0.002
 myself-0.001
 Owl-0.001
 possible-0.001
 rescuing-0.001
Token eat
Token ID4483
Feature activation+4.62e-03

Pos logit contributions
gy+0.010
ich+0.009
bing+0.009
icken+0.009
 possible+0.008
Neg logit contributions
 nodded-0.006
 blamed-0.006
 did-0.006
 thanked-0.006
 asked-0.005
Token.
Token ID13
Feature activation+5.78e-04

Pos logit contributions
gy+0.010
 Owl+0.008
ich+0.008
 possible+0.008
 myself+0.008
Neg logit contributions
 nodded-0.006
 blamed-0.006
Kids-0.006
 did-0.006
 swapped-0.006
Token\n
Token ID198
Feature activation-1.13e-03

Pos logit contributions
gy+0.001
 Owl+0.001
 myself+0.001
 possible+0.001
 rescuing+0.001
Neg logit contributions
 blamed-0.001
 nodded-0.001
Kids-0.001
Uh-0.001
 swapped-0.001
Token\n
Token ID198
Feature activation-1.30e-03

Pos logit contributions
 blamed+0.002
 nodded+0.002
Kids+0.002
 swapped+0.002
Uh+0.002
Neg logit contributions
gy-0.002
 Owl-0.002
 myself-0.002
 possible-0.002
ble-0.002
TokenC
Token ID34
Feature activation+5.85e-03

Pos logit contributions
 blamed+0.002
 nodded+0.002
Kids+0.002
 swapped+0.002
Uh+0.002
Neg logit contributions
gy-0.002
 myself-0.002
 Owl-0.002
 possible-0.002
ulent-0.002
Tokenind
Token ID521
Feature activation+2.89e-02

Pos logit contributions
gy+0.008
 myself+0.007
 Owl+0.007
 possible+0.007
 rescuing+0.006
Neg logit contributions
 blamed-0.012
 nodded-0.012
Kids-0.012
 swapped-0.011
Uh-0.011
Tokeny
Token ID88
Feature activation+1.11e-03

Pos logit contributions
gy+0.051
 myself+0.049
 possible+0.048
 Owl+0.046
 rescuing+0.045
Neg logit contributions
 nodded-0.048
 blamed-0.046
Kids-0.045
 did-0.041
 swapped-0.040
Token was
Token ID373
Feature activation-5.34e-05

Pos logit contributions
gy+0.003
 Owl+0.002
 myself+0.002
 possible+0.002
ble+0.002
Neg logit contributions
 blamed-0.001
Kids-0.001
 nodded-0.001
Uh-0.001
 swapped-0.001
Token hesitant
Token ID32848
Feature activation-3.55e-03

Pos logit contributions
 blamed+0.000
 nodded+0.000
Kids+0.000
 swapped+0.000
 did+0.000
Neg logit contributions
gy-0.000
 myself-0.000
 Owl-0.000
 possible-0.000
ulent-0.000
Token at
Token ID379
Feature activation+1.00e-04

Pos logit contributions
 nodded+0.005
 blamed+0.004
 did+0.004
Kids+0.004
outed+0.004
Neg logit contributions
gy-0.008
 myself-0.007
 Owl-0.007
ulent-0.007
 rescuing-0.007
Token first
Token ID717
Feature activation-7.06e-04

Pos logit contributions
gy+0.000
ulent+0.000
 possible+0.000
icken+0.000
owing+0.000
Neg logit contributions
Kids-0.000
 did-0.000
 nodded-0.000
 blamed-0.000
 shook-0.000
Token but
Token ID475
Feature activation+4.15e-04

Pos logit contributions
gy+0.004
 myself+0.003
 Owl+0.003
 possible+0.003
 rescuing+0.003
Neg logit contributions
 blamed-0.003
 nodded-0.003
Kids-0.003
 swapped-0.003
Uh-0.003
Token it
Token ID340
Feature activation-1.17e-04

Pos logit contributions
gy+0.001
 myself+0.001
 Owl+0.001
ulent+0.001
 possible+0.001
Neg logit contributions
 nodded-0.001
 blamed-0.001
Kids-0.001
 did-0.001
 swapped-0.001
Token wouldn
Token ID3636
Feature activation-1.05e-03

Pos logit contributions
 blamed+0.000
Kids+0.000
 nodded+0.000
 swapped+0.000
Uh+0.000
Neg logit contributions
gy-0.000
 Owl-0.000
 myself-0.000
ble-0.000
 possible-0.000
Token't
Token ID470
Feature activation+2.30e-03

Pos logit contributions
 blamed+0.001
Kids+0.001
 did+0.001
Uh+0.001
 nodded+0.001
Neg logit contributions
gy-0.002
ulent-0.002
 hunting-0.002
owing-0.002
 possible-0.002
Token bud
Token ID3600
Feature activation-1.27e-02

Pos logit contributions
gy+0.005
 Owl+0.004
 myself+0.004
 possible+0.004
ulent+0.004
Neg logit contributions
 nodded-0.003
 blamed-0.003
Kids-0.003
 swapped-0.003
 did-0.003
Tokenge
Token ID469
Feature activation+4.16e-02

Pos logit contributions
 blamed+0.022
 did+0.021
 swapped+0.021
 nodded+0.020
outed+0.020
Neg logit contributions
 Owl-0.017
 myself-0.016
ulent-0.015
 rescuing-0.015
iflower-0.014
Token.
Token ID13
Feature activation+1.83e-03

Pos logit contributions
gy+0.083
 Owl+0.078
 myself+0.075
ulent+0.075
ich+0.071
Neg logit contributions
 nodded-0.054
 blamed-0.053
 did-0.051
 swapped-0.048
Kids-0.047
Token She
Token ID1375
Feature activation-7.62e-04

Pos logit contributions
gy+0.004
 myself+0.003
 Owl+0.003
 possible+0.003
 rescuing+0.003
Neg logit contributions
 blamed-0.003
 nodded-0.003
Kids-0.003
 swapped-0.002
Uh-0.002
Token was
Token ID373
Feature activation-2.46e-04

Pos logit contributions
Kids+0.001
 blamed+0.001
 nodded+0.001
Uh+0.001
Wow+0.001
Neg logit contributions
gy-0.002
 Owl-0.001
 myself-0.001
ble-0.001
bing-0.001
Token very
Token ID845
Feature activation+6.10e-03

Pos logit contributions
 nodded+0.000
Kids+0.000
 blamed+0.000
 swapped+0.000
Uh+0.000
Neg logit contributions
gy-0.000
 myself-0.000
 Owl-0.000
 possible-0.000
 rescuing-0.000
Token upset
Token ID9247
Feature activation-6.41e-03

Pos logit contributions
gy+0.013
 myself+0.011
 Owl+0.011
 possible+0.011
ulent+0.010
Neg logit contributions
 blamed-0.009
 nodded-0.009
Kids-0.008
 swapped-0.008
 did-0.008
Token.
Token ID13
Feature activation-1.26e-04

Pos logit contributions
 blamed+0.011
Kids+0.011
 nodded+0.011
Uh+0.011
 swapped+0.010
Neg logit contributions
gy-0.009
 myself-0.008
 Owl-0.008
 possible-0.008
bing-0.007
Token He
Token ID679
Feature activation+3.31e-03

Pos logit contributions
 blamed+0.000
 nodded+0.000
Kids+0.000
Uh+0.000
 swapped+0.000
Neg logit contributions
gy-0.000
 Owl-0.000
 myself-0.000
 possible-0.000
 Must-0.000
Token pulled
Token ID5954
Feature activation-5.81e-04

Pos logit contributions
gy+0.006
 Owl+0.006
 myself+0.006
ble+0.005
 possible+0.005
Neg logit contributions
Kids-0.005
 blamed-0.004
Uh-0.004
Wow-0.004
omo-0.004
Token and
Token ID290
Feature activation-1.40e-03

Pos logit contributions
 nodded+0.001
 blamed+0.001
Kids+0.001
 swapped+0.001
 did+0.001
Neg logit contributions
gy-0.001
 Owl-0.001
 myself-0.001
ulent-0.001
 possible-0.001
Token tug
Token ID27762
Feature activation-6.74e-03

Pos logit contributions
 blamed+0.002
Kids+0.002
 nodded+0.001
omo+0.001
Uh+0.001
Neg logit contributions
gy-0.003
 myself-0.003
 Owl-0.003
 possible-0.003
 rescuing-0.003
Tokenged
Token ID2004
Feature activation+3.00e-02

Pos logit contributions
 blamed+0.013
 nodded+0.013
Kids+0.013
 swapped+0.012
Uh+0.012
Neg logit contributions
gy-0.010
 myself-0.009
 Owl-0.009
 possible-0.008
 rescuing-0.008
Token on
Token ID319
Feature activation-4.64e-03

Pos logit contributions
gy+0.065
 myself+0.056
 Owl+0.054
 possible+0.053
ich+0.051
Neg logit contributions
 nodded-0.043
 blamed-0.042
Kids-0.037
 swapped-0.037
 did-0.035
Token the
Token ID262
Feature activation+2.11e-03

Pos logit contributions
 nodded+0.007
 blamed+0.007
 swapped+0.006
 did+0.006
 turns+0.006
Neg logit contributions
gy-0.009
 Owl-0.008
 possible-0.008
 myself-0.008
ulent-0.008
Token string
Token ID4731
Feature activation+2.82e-03

Pos logit contributions
gy+0.005
 Owl+0.004
 myself+0.004
 possible+0.004
 rescuing+0.004
Neg logit contributions
 nodded-0.003
 blamed-0.003
Kids-0.002
 swapped-0.002
Uh-0.002
Token and
Token ID290
Feature activation-5.80e-04

Pos logit contributions
gy+0.005
 myself+0.005
 Owl+0.005
ulent+0.005
ich+0.005
Neg logit contributions
 nodded-0.004
 blamed-0.004
 swapped-0.004
 did-0.004
 shook-0.004
Token was
Token ID373
Feature activation+2.77e-04

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.001
 myself-0.001
 Owl-0.001
 possible-0.001
 rescuing-0.001
Token.
Token ID13
Feature activation+1.26e-03

Pos logit contributions
gy+0.006
 myself+0.005
 Owl+0.005
 possible+0.005
 rescuing+0.004
Neg logit contributions
 blamed-0.008
 nodded-0.008
Kids-0.008
Uh-0.007
 swapped-0.007
Token They
Token ID1119
Feature activation+6.03e-03

Pos logit contributions
gy+0.002
 Owl+0.002
 myself+0.002
 ourselves+0.002
 Must+0.002
Neg logit contributions
 nodded-0.002
 blamed-0.002
Kids-0.002
 swapped-0.002
 kissed-0.002
Token pulled
Token ID5954
Feature activation-1.72e-03

Pos logit contributions
gy+0.012
 Owl+0.009
 myself+0.009
 any+0.009
pper+0.009
Neg logit contributions
Kids-0.009
omo-0.008
lights-0.008
Uh-0.007
 nodded-0.007
Token and
Token ID290
Feature activation-9.91e-04

Pos logit contributions
 blamed+0.003
 nodded+0.003
Kids+0.003
 swapped+0.002
Uh+0.002
Neg logit contributions
gy-0.003
 myself-0.003
 Owl-0.003
 possible-0.003
 rescuing-0.003
Token tug
Token ID27762
Feature activation-8.18e-03

Pos logit contributions
Kids+0.001
omo+0.001
 nodded+0.001
outed+0.001
 blamed+0.001
Neg logit contributions
gy-0.002
 myself-0.002
 Owl-0.002
 any-0.002
 rescuing-0.002
Tokenged
Token ID2004
Feature activation+2.99e-02

Pos logit contributions
 blamed+0.015
 nodded+0.015
Kids+0.015
 swapped+0.014
Uh+0.013
Neg logit contributions
gy-0.013
 myself-0.012
 Owl-0.011
 possible-0.011
ble-0.010
Token,
Token ID11
Feature activation+3.65e-03

Pos logit contributions
gy+0.057
 myself+0.051
 Owl+0.050
 possible+0.049
 rescuing+0.046
Neg logit contributions
 blamed-0.048
 nodded-0.047
Kids-0.045
 swapped-0.042
Uh-0.041
Token and
Token ID290
Feature activation-5.00e-04

Pos logit contributions
gy+0.007
 myself+0.006
 Owl+0.006
 possible+0.006
 rescuing+0.006
Neg logit contributions
 blamed-0.006
 nodded-0.006
Kids-0.005
 swapped-0.005
Uh-0.005
Token the
Token ID262
Feature activation+3.03e-03

Pos logit contributions
Kids+0.001
 nodded+0.001
 blamed+0.001
omo+0.001
Uh+0.001
Neg logit contributions
gy-0.001
 Owl-0.001
 myself-0.001
 possible-0.001
 rescuing-0.001
Token book
Token ID1492
Feature activation+3.74e-03

Pos logit contributions
gy+0.006
 myself+0.005
 Owl+0.005
 possible+0.005
 rescuing+0.004
Neg logit contributions
 blamed-0.005
 nodded-0.005
Kids-0.005
 swapped-0.005
Uh-0.004
Token ripped
Token ID19551
Feature activation-2.41e-03

Pos logit contributions
gy+0.006
 possible+0.005
 myself+0.005
 any+0.005
ble+0.005
Neg logit contributions
Kids-0.007
omo-0.006
 nodded-0.006
 blamed-0.006
Wow-0.006
Token saw
Token ID2497
Feature activation-6.94e-04

Pos logit contributions
Kids+0.004
Wow+0.004
Uh+0.004
 blamed+0.004
 brushes+0.004
Neg logit contributions
gy-0.007
 Owl-0.006
 myself-0.006
 rescuing-0.005
ey-0.005
Token Benny
Token ID44275
Feature activation-4.32e-03

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 did+0.001
 swapped+0.001
Neg logit contributions
gy-0.001
 possible-0.001
 myself-0.001
ulent-0.001
ich-0.001
Token and
Token ID290
Feature activation-2.54e-03

Pos logit contributions
 nodded+0.006
 blamed+0.005
 did+0.005
 swapped+0.005
 thanked+0.005
Neg logit contributions
gy-0.009
 possible-0.009
 myself-0.008
 Owl-0.008
 outer-0.008
Token called
Token ID1444
Feature activation+7.37e-04

Pos logit contributions
Kids+0.003
 blamed+0.003
Uh+0.003
 nodded+0.003
Wow+0.002
Neg logit contributions
gy-0.006
 Owl-0.005
 myself-0.005
 possible-0.005
 rescuing-0.005
Token the
Token ID262
Feature activation+1.01e-03

Pos logit contributions
gy+0.001
 myself+0.001
 possible+0.001
ich+0.001
ulent+0.001
Neg logit contributions
 blamed-0.001
 nodded-0.001
Kids-0.001
 did-0.001
 swapped-0.001
Token police
Token ID1644
Feature activation+3.39e-03

Pos logit contributions
gy+0.002
 Owl+0.002
 possible+0.002
 myself+0.002
 rescuing+0.002
Neg logit contributions
 nodded-0.002
 blamed-0.002
Kids-0.002
Uh-0.001
outed-0.001
Token.
Token ID13
Feature activation+2.68e-04

Pos logit contributions
gy+0.006
 myself+0.006
 possible+0.005
 Owl+0.005
ble+0.005
Neg logit contributions
 blamed-0.006
 nodded-0.006
Kids-0.005
 swapped-0.005
 did-0.005
Token
Token ID198
Feature activation-7.21e-04

Pos logit contributions
gy+0.001
 Owl+0.000
 myself+0.000
 possible+0.000
 rescuing+0.000
Neg logit contributions
 blamed-0.000
 nodded-0.000
Kids-0.000
 swapped-0.000
Uh-0.000
Token
Token ID198
Feature activation-1.17e-03

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.001
 myself-0.001
 Owl-0.001
 possible-0.001
 rescuing-0.001
TokenB
Token ID33
Feature activation+7.78e-04

Pos logit contributions
 blamed+0.002
 nodded+0.002
Kids+0.002
 swapped+0.002
Uh+0.002
Neg logit contributions
gy-0.002
 myself-0.002
 Owl-0.002
 possible-0.002
 rescuing-0.002
Tokenenny
Token ID11870
Feature activation-2.27e-03

Pos logit contributions
 possible+0.001
 myself+0.001
gy+0.001
 Owl+0.001
 rescuing+0.001
Neg logit contributions
Kids-0.001
 nodded-0.001
 blamed-0.001
omo-0.001
 swapped-0.001
Token and
Token ID290
Feature activation-1.79e-03

Pos logit contributions
gy+0.003
 myself+0.003
 Owl+0.003
 possible+0.002
ulent+0.002
Neg logit contributions
 nodded-0.002
 blamed-0.002
Kids-0.002
 swapped-0.002
 did-0.002
Token visit
Token ID3187
Feature activation+4.92e-03

Pos logit contributions
 blamed+0.003
 nodded+0.003
Kids+0.003
 swapped+0.003
Uh+0.003
Neg logit contributions
gy-0.003
 myself-0.003
 Owl-0.003
 possible-0.003
ble-0.002
Token far
Token ID1290
Feature activation+5.48e-03

Pos logit contributions
gy+0.010
 myself+0.009
 Owl+0.009
ulent+0.008
ble+0.008
Neg logit contributions
 nodded-0.007
 blamed-0.007
Kids-0.007
 did-0.006
 thanked-0.006
Token away
Token ID1497
Feature activation-6.89e-03

Pos logit contributions
gy+0.009
 myself+0.008
 Owl+0.008
 possible+0.008
 rescuing+0.007
Neg logit contributions
 nodded-0.010
 blamed-0.010
Kids-0.009
 swapped-0.009
Uh-0.008
Token places
Token ID4113
Feature activation-4.33e-03

Pos logit contributions
 nodded+0.011
 blamed+0.011
Kids+0.010
 swapped+0.010
 did+0.010
Neg logit contributions
gy-0.013
 myself-0.011
 Owl-0.011
ulent-0.011
 rescuing-0.010
Token.
Token ID13
Feature activation+7.36e-04

Pos logit contributions
 nodded+0.006
 blamed+0.006
Kids+0.006
 thanked+0.006
 did+0.006
Neg logit contributions
gy-0.009
 myself-0.008
 Owl-0.007
ulent-0.007
ble-0.007
Token As
Token ID1081
Feature activation-2.54e-03

Pos logit contributions
gy+0.001
 Owl+0.001
 myself+0.001
 possible+0.001
 rescuing+0.001
Neg logit contributions
 blamed-0.001
 nodded-0.001
Kids-0.001
 swapped-0.001
Uh-0.001
Token they
Token ID484
Feature activation+1.79e-03

Pos logit contributions
 blamed+0.003
 nodded+0.003
Kids+0.003
 swapped+0.003
Uh+0.003
Neg logit contributions
gy-0.005
 myself-0.005
 Owl-0.005
 possible-0.005
ble-0.005
Token flew
Token ID13112
Feature activation-1.84e-04

Pos logit contributions
gy+0.004
 Owl+0.003
ey+0.003
 hunting+0.003
 outer+0.003
Neg logit contributions
 blamed-0.003
omo-0.002
Kids-0.002
Uh-0.002
 swapped-0.002
Token around
Token ID1088
Feature activation-3.42e-03

Pos logit contributions
 nodded+0.000
 blamed+0.000
Kids+0.000
 swapped+0.000
Uh+0.000
Neg logit contributions
gy-0.000
 myself-0.000
 Owl-0.000
 possible-0.000
ulent-0.000
Token,
Token ID11
Feature activation-1.02e-03

Pos logit contributions
 nodded+0.005
 blamed+0.005
Kids+0.005
 did+0.005
 swapped+0.005
Neg logit contributions
gy-0.006
 myself-0.006
 Owl-0.005
ulent-0.005
 possible-0.005
Token help
Token ID1037
Feature activation+1.24e-03

Pos logit contributions
 blamed+0.000
 nodded+0.000
 swapped+0.000
 thanked+0.000
 did+0.000
Neg logit contributions
gy-0.000
 Owl-0.000
 myself-0.000
 possible-0.000
ich-0.000
Token him
Token ID683
Feature activation-8.07e-03

Pos logit contributions
gy+0.003
 myself+0.002
ulent+0.002
uch+0.002
 Owl+0.002
Neg logit contributions
 blamed-0.002
 nodded-0.002
Kids-0.001
 thanked-0.001
 said-0.001
Token get
Token ID651
Feature activation+5.06e-03

Pos logit contributions
 nodded+0.012
 blamed+0.012
 thanked+0.011
 swapped+0.011
 did+0.011
Neg logit contributions
gy-0.014
 Owl-0.013
ich-0.012
ulent-0.012
 myself-0.012
Token back
Token ID736
Feature activation+2.47e-03

Pos logit contributions
gy+0.009
 myself+0.008
 Owl+0.008
ulent+0.007
ich+0.007
Neg logit contributions
 nodded-0.009
 blamed-0.009
 swapped-0.008
 thanked-0.008
Kids-0.008
Token home
Token ID1363
Feature activation-1.60e-03

Pos logit contributions
gy+0.004
ulent+0.004
 Owl+0.004
 myself+0.004
ble+0.003
Neg logit contributions
 nodded-0.004
 blamed-0.004
 swapped-0.004
 thanked-0.004
 brushes-0.003
Token safely
Token ID11512
Feature activation+2.91e-03

Pos logit contributions
 nodded+0.002
 thanked+0.002
 blamed+0.002
 swapped+0.002
 praised+0.002
Neg logit contributions
gy-0.003
 Owl-0.003
ulent-0.003
 myself-0.003
ich-0.003
Token.
Token ID13
Feature activation-3.53e-04

Pos logit contributions
gy+0.006
 Owl+0.005
 myself+0.005
ulent+0.005
 rescuing+0.005
Neg logit contributions
 nodded-0.004
 blamed-0.004
 thanked-0.003
 swapped-0.003
 did-0.003
Token
Token ID198
Feature activation-1.48e-03

Pos logit contributions
 nodded+0.001
 blamed+0.001
Kids+0.001
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.001
 Owl-0.001
 myself-0.001
 possible-0.001
 rescuing-0.001
Token
Token ID198
Feature activation-2.05e-03

Pos logit contributions
 blamed+0.002
 nodded+0.002
 swapped+0.002
Kids+0.002
Uh+0.002
Neg logit contributions
gy-0.003
 Owl-0.003
 myself-0.003
 possible-0.002
 rescuing-0.002
TokenTim
Token ID14967
Feature activation-9.87e-04

Pos logit contributions
 blamed+0.003
 nodded+0.003
outed+0.003
 swapped+0.003
 brushes+0.003
Neg logit contributions
gy-0.004
 myself-0.003
 Owl-0.003
 possible-0.003
 rescuing-0.003
Tokenmy
Token ID1820
Feature activation+5.07e-03

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.002
 myself-0.002
 Owl-0.002
 possible-0.002
 rescuing-0.002
Token!"
Token ID2474
Feature activation+7.13e-04

Pos logit contributions
 nodded+0.002
 blamed+0.001
 thanked+0.001
 did+0.001
 minutes+0.001
Neg logit contributions
gy-0.003
 Owl-0.003
ulent-0.003
 rescuing-0.003
 possible-0.003
Token
Token ID198
Feature activation+8.77e-04

Pos logit contributions
gy+0.001
 myself+0.001
 Owl+0.001
 possible+0.001
 rescuing+0.001
Neg logit contributions
 blamed-0.001
Kids-0.001
Uh-0.001
 nodded-0.001
 swapped-0.001
Token
Token ID198
Feature activation-4.83e-04

Pos logit contributions
gy+0.002
 myself+0.001
 Owl+0.001
 possible+0.001
ble+0.001
Neg logit contributions
 blamed-0.001
 nodded-0.001
 swapped-0.001
Kids-0.001
Uh-0.001
Token"
Token ID1
Feature activation-6.66e-03

Pos logit contributions
 blamed+0.001
 nodded+0.001
 swapped+0.001
outed+0.001
Kids+0.001
Neg logit contributions
gy-0.001
 myself-0.001
 Owl-0.001
 possible-0.001
ble-0.001
TokenYes
Token ID5297
Feature activation+3.69e-03

Pos logit contributions
 blamed+0.010
 nodded+0.008
outed+0.008
 brushes+0.008
omo+0.008
Neg logit contributions
 myself-0.012
 possible-0.012
gy-0.012
 Owl-0.011
ulent-0.011
Token,
Token ID11
Feature activation+8.70e-04

Pos logit contributions
gy+0.005
 myself+0.004
 Owl+0.004
 possible+0.004
ble+0.004
Neg logit contributions
 blamed-0.008
 nodded-0.008
Kids-0.008
 swapped-0.007
Uh-0.007
Token let
Token ID1309
Feature activation+8.08e-03

Pos logit contributions
 myself+0.001
gy+0.001
 Owl+0.001
 possible+0.001
 rescuing+0.001
Neg logit contributions
 blamed-0.002
 nodded-0.002
outed-0.002
Kids-0.002
 swapped-0.002
Token's
Token ID338
Feature activation-5.26e-03

Pos logit contributions
gy+0.013
 myself+0.011
 Owl+0.011
 possible+0.011
 rescuing+0.010
Neg logit contributions
 blamed-0.015
 nodded-0.015
Kids-0.015
 swapped-0.014
Uh-0.014
Token go
Token ID467
Feature activation+1.58e-03

Pos logit contributions
 nodded+0.008
 blamed+0.008
Kids+0.007
 swapped+0.007
 did+0.007
Neg logit contributions
gy-0.010
 myself-0.009
 Owl-0.009
 possible-0.009
ulent-0.008
Token!"
Token ID2474
Feature activation-4.35e-03

Pos logit contributions
gy+0.003
iflower+0.003
ulent+0.003
icken+0.003
 Owl+0.003
Neg logit contributions
 nodded-0.002
 said-0.002
Kids-0.002
 thanked-0.002
 did-0.002
Token Ben
Token ID3932
Feature activation-8.47e-04

Pos logit contributions
 blamed+0.007
Kids+0.006
Uh+0.006
 nodded+0.006
 swapped+0.006
Neg logit contributions
gy-0.008
 Owl-0.007
 myself-0.007
 possible-0.006
 Must-0.006
Token their
Token ID511
Feature activation-2.01e-03

Pos logit contributions
 blamed+0.011
Kids+0.011
 did+0.010
 minutes+0.010
 nodded+0.010
Neg logit contributions
gy-0.016
ulent-0.013
ble-0.013
 Owl-0.013
 myself-0.012
Token homes
Token ID5682
Feature activation-3.63e-03

Pos logit contributions
 blamed+0.004
 nodded+0.004
Kids+0.004
 swapped+0.003
Uh+0.003
Neg logit contributions
gy-0.003
 myself-0.003
 Owl-0.003
 possible-0.003
 rescuing-0.003
Token.
Token ID13
Feature activation+3.27e-03

Pos logit contributions
 nodded+0.005
 blamed+0.004
 did+0.004
 thanked+0.004
Kids+0.004
Neg logit contributions
gy-0.007
 Owl-0.006
 myself-0.006
ulent-0.006
ble-0.006
Token<|endoftext|>
Token ID50256
Feature activation+3.01e-03

Pos logit contributions
gy+0.007
 myself+0.006
 Owl+0.006
 possible+0.006
 rescuing+0.005
Neg logit contributions
 blamed-0.005
 nodded-0.005
Kids-0.005
 swapped-0.004
Uh-0.004
TokenOnce
Token ID7454
Feature activation+5.65e-04

Pos logit contributions
gy+0.006
 myself+0.006
 Owl+0.006
 possible+0.006
ulent+0.005
Neg logit contributions
 blamed-0.004
 nodded-0.004
Kids-0.004
 swapped-0.003
Wow-0.003
Token upon
Token ID2402
Feature activation+6.73e-03

Pos logit contributions
gy+0.001
 myself+0.001
 Owl+0.001
 possible+0.001
ble+0.001
Neg logit contributions
 blamed-0.001
 nodded-0.001
Kids-0.001
 swapped-0.001
Uh-0.001
Token a
Token ID257
Feature activation-9.62e-04

Pos logit contributions
gy+0.012
 Owl+0.011
bing+0.010
 possible+0.010
 myself+0.010
Neg logit contributions
 nodded-0.012
 blamed-0.011
 did-0.011
Kids-0.011
Wow-0.010
Token time
Token ID640
Feature activation+4.49e-03

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.003
 myself-0.002
 Owl-0.002
 possible-0.002
ulent-0.002
Token,
Token ID11
Feature activation+5.12e-03

Pos logit contributions
gy+0.009
 myself+0.008
 Owl+0.008
 possible+0.008
ulent+0.008
Neg logit contributions
 nodded-0.006
 blamed-0.006
 did-0.006
Kids-0.006
omo-0.005
Token there
Token ID612
Feature activation+3.08e-03

Pos logit contributions
gy+0.011
 myself+0.010
 Owl+0.010
 possible+0.009
ulent+0.009
Neg logit contributions
 blamed-0.007
 nodded-0.007
Kids-0.007
 swapped-0.006
 did-0.006
Token was
Token ID373
Feature activation+1.21e-03

Pos logit contributions
gy+0.007
 myself+0.006
 Owl+0.006
 possible+0.006
 rescuing+0.006
Neg logit contributions
 blamed-0.004
 nodded-0.004
Kids-0.003
 swapped-0.003
Uh-0.003
Token who
Token ID508
Feature activation+5.05e-03

Pos logit contributions
gy+0.004
 myself+0.004
 possible+0.004
 Owl+0.004
ulent+0.004
Neg logit contributions
 blamed-0.004
 nodded-0.004
Kids-0.003
 did-0.003
 swapped-0.003
Token had
Token ID550
Feature activation+9.55e-04

Pos logit contributions
gy+0.012
 myself+0.011
 Owl+0.011
 possible+0.011
ble+0.010
Neg logit contributions
 blamed-0.006
 nodded-0.006
Kids-0.005
 swapped-0.005
 did-0.004
Token a
Token ID257
Feature activation-1.96e-03

Pos logit contributions
gy+0.001
 Owl+0.001
 myself+0.001
 possible+0.001
ble+0.001
Neg logit contributions
 blamed-0.002
 nodded-0.002
Kids-0.002
 swapped-0.002
 did-0.002
Token lovely
Token ID14081
Feature activation+3.59e-03

Pos logit contributions
 blamed+0.003
 nodded+0.003
Kids+0.003
 swapped+0.003
Uh+0.003
Neg logit contributions
gy-0.004
 Owl-0.003
 myself-0.003
 possible-0.003
 rescuing-0.003
Token cat
Token ID3797
Feature activation+2.14e-03

Pos logit contributions
gy+0.007
 myself+0.007
 Owl+0.007
 possible+0.006
 rescuing+0.006
Neg logit contributions
 blamed-0.005
 nodded-0.005
Kids-0.005
 swapped-0.004
Uh-0.004
Token named
Token ID3706
Feature activation-4.65e-03

Pos logit contributions
gy+0.004
 Owl+0.004
 myself+0.004
 possible+0.004
 rescuing+0.003
Neg logit contributions
 blamed-0.003
 nodded-0.003
Kids-0.003
Uh-0.003
 swapped-0.003
Token Mitt
Token ID16627
Feature activation+5.45e-03

Pos logit contributions
Kids+0.005
 blamed+0.005
 nodded+0.005
 brushes+0.005
 did+0.004
Neg logit contributions
gy-0.010
ble-0.010
 possible-0.010
 afloat-0.009
 myself-0.009
Tokenens
Token ID641
Feature activation-2.05e-03

Pos logit contributions
gy+0.012
 myself+0.011
 possible+0.011
 Owl+0.011
ble+0.010
Neg logit contributions
Kids-0.007
 nodded-0.007
 blamed-0.007
 swapped-0.006
 did-0.006
Token.
Token ID13
Feature activation+1.54e-03

Pos logit contributions
 blamed+0.003
 nodded+0.003
Kids+0.003
 did+0.003
 swapped+0.003
Neg logit contributions
gy-0.004
 myself-0.004
 possible-0.003
 Owl-0.003
ble-0.003
Token Lily
Token ID20037
Feature activation+2.11e-03

Pos logit contributions
gy+0.003
 Owl+0.003
 myself+0.003
ulent+0.002
 possible+0.002
Neg logit contributions
 blamed-0.002
 nodded-0.002
Kids-0.002
 swapped-0.002
Uh-0.002
Token loved
Token ID6151
Feature activation+3.42e-03

Pos logit contributions
gy+0.005
 Owl+0.004
 myself+0.004
 possible+0.004
 rescuing+0.004
Neg logit contributions
 blamed-0.003
Kids-0.003
 nodded-0.003
Uh-0.003
 swapped-0.002
Token The
Token ID383
Feature activation-3.13e-04

Pos logit contributions
 blamed+0.009
 nodded+0.009
Kids+0.009
 swapped+0.008
Uh+0.008
Neg logit contributions
gy-0.011
 myself-0.010
 Owl-0.010
 possible-0.009
 rescuing-0.009
Token bird
Token ID6512
Feature activation-1.57e-03

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.000
 swapped+0.000
Uh+0.000
Neg logit contributions
gy-0.001
 Owl-0.001
 myself-0.001
 possible-0.000
 rescuing-0.000
Tokenie
Token ID494
Feature activation-8.75e-04

Pos logit contributions
 blamed+0.004
 nodded+0.004
Kids+0.004
 swapped+0.003
Uh+0.003
Neg logit contributions
gy-0.002
 Owl-0.002
 myself-0.001
 possible-0.001
ble-0.001
Token felt
Token ID2936
Feature activation+4.11e-04

Pos logit contributions
 blamed+0.001
Kids+0.001
 nodded+0.001
 swapped+0.001
Uh+0.001
Neg logit contributions
gy-0.002
 Owl-0.002
 myself-0.002
 possible-0.002
ble-0.002
Token very
Token ID845
Feature activation+6.52e-03

Pos logit contributions
gy+0.001
 myself+0.001
 Owl+0.001
 possible+0.001
ulent+0.001
Neg logit contributions
 blamed-0.001
 nodded-0.001
Kids-0.001
 swapped-0.001
 did-0.001
Token scared
Token ID12008
Feature activation-6.82e-03

Pos logit contributions
gy+0.015
 Owl+0.013
 myself+0.012
 possible+0.012
ulent+0.012
Neg logit contributions
 blamed-0.009
 nodded-0.009
 did-0.008
 thanked-0.007
 swapped-0.007
Token and
Token ID290
Feature activation+2.10e-04

Pos logit contributions
 blamed+0.008
 nodded+0.008
Kids+0.007
 did+0.007
 swapped+0.006
Neg logit contributions
gy-0.016
 myself-0.014
 Owl-0.014
 possible-0.014
ulent-0.013
Token didn
Token ID1422
Feature activation-2.46e-04

Pos logit contributions
gy+0.000
 Owl+0.000
 myself+0.000
 rescuing+0.000
 possible+0.000
Neg logit contributions
Kids-0.000
 blamed-0.000
omo-0.000
 nodded-0.000
Uh-0.000
Token't
Token ID470
Feature activation+1.21e-03

Pos logit contributions
 blamed+0.000
Kids+0.000
 nodded+0.000
 did+0.000
Uh+0.000
Neg logit contributions
gy-0.000
 Owl-0.000
ulent-0.000
rying-0.000
 myself-0.000
Token know
Token ID760
Feature activation+3.50e-03

Pos logit contributions
gy+0.002
 myself+0.002
 Owl+0.002
 possible+0.002
 rescuing+0.002
Neg logit contributions
 blamed-0.002
 nodded-0.002
Kids-0.002
 swapped-0.002
Uh-0.002
Token what
Token ID644
Feature activation+6.50e-03

Pos logit contributions
gy+0.009
 myself+0.007
working+0.007
ulent+0.007
uch+0.007
Neg logit contributions
 blamed-0.004
 nodded-0.004
 did-0.004
 minutes-0.003
 kissed-0.003
Token funny
Token ID8258
Feature activation+2.21e-03

Pos logit contributions
 blamed+0.002
 nodded+0.002
 did+0.002
Kids+0.001
 swapped+0.001
Neg logit contributions
gy-0.002
 myself-0.002
 Owl-0.002
ulent-0.002
 possible-0.002
Token and
Token ID290
Feature activation+4.73e-05

Pos logit contributions
gy+0.004
 myself+0.004
 Owl+0.004
 possible+0.004
 rescuing+0.003
Neg logit contributions
 blamed-0.004
 nodded-0.004
Kids-0.003
 swapped-0.003
Uh-0.003
Token it
Token ID340
Feature activation-6.84e-04

Pos logit contributions
gy+0.000
 myself+0.000
 Owl+0.000
 possible+0.000
 rescuing+0.000
Neg logit contributions
Kids-0.000
 blamed-0.000
 nodded-0.000
Uh-0.000
 swapped-0.000
Token made
Token ID925
Feature activation-5.82e-04

Pos logit contributions
Kids+0.001
 blamed+0.001
 nodded+0.001
Uh+0.001
omo+0.001
Neg logit contributions
gy-0.001
 Owl-0.001
 myself-0.001
 possible-0.001
 rescuing-0.001
Token her
Token ID607
Feature activation+2.44e-03

Pos logit contributions
 blamed+0.001
 nodded+0.001
Kids+0.001
 swapped+0.001
 did+0.001
Neg logit contributions
gy-0.001
 possible-0.001
 myself-0.001
 Owl-0.001
ulent-0.001
Token feel
Token ID1254
Feature activation-5.65e-03

Pos logit contributions
gy+0.004
 myself+0.004
 Owl+0.004
 possible+0.004
ulent+0.004
Neg logit contributions
 nodded-0.004
 blamed-0.004
 swapped-0.003
 did-0.003
Kids-0.003
Token a
Token ID257
Feature activation-2.14e-03

Pos logit contributions
 blamed+0.011
 nodded+0.011
Kids+0.010
 swapped+0.010
Uh+0.010
Neg logit contributions
gy-0.009
 myself-0.008
 Owl-0.008
 possible-0.008
 rescuing-0.007
Token little
Token ID1310
Feature activation+2.13e-03

Pos logit contributions
 nodded+0.003
 blamed+0.003
Kids+0.002
 swapped+0.002
Uh+0.002
Neg logit contributions
gy-0.005
 myself-0.004
 Owl-0.004
 possible-0.004
 rescuing-0.004
Token better
Token ID1365
Feature activation-4.98e-03

Pos logit contributions
gy+0.004
 myself+0.003
 Owl+0.003
 possible+0.003
 rescuing+0.003
Neg logit contributions
 blamed-0.004
 nodded-0.004
Kids-0.003
 swapped-0.003
Uh-0.003
Token.
Token ID13
Feature activation+1.78e-03

Pos logit contributions
 nodded+0.009
 blamed+0.009
Kids+0.008
 swapped+0.008
 did+0.007
Neg logit contributions
gy-0.009
 myself-0.008
 Owl-0.007
 possible-0.007
ulent-0.007
Token
Token ID220
Feature activation-3.16e-03

Pos logit contributions
gy+0.004
 Owl+0.003
 myself+0.003
 possible+0.003
 rescuing+0.003
Neg logit contributions
 blamed-0.003
 nodded-0.003
Kids-0.002
 swapped-0.002
Uh-0.002
Token there
Token ID612
Feature activation+2.74e-03

Pos logit contributions
gy+0.010
 myself+0.009
 Owl+0.009
 possible+0.009
ulent+0.009
Neg logit contributions
 blamed-0.006
 nodded-0.006
Kids-0.006
 did-0.005
Wow-0.005
Token lived
Token ID5615
Feature activation+6.40e-03

Pos logit contributions
gy+0.007
 myself+0.006
 Owl+0.006
 possible+0.006
ulent+0.006
Neg logit contributions
 nodded-0.003
 blamed-0.003
Kids-0.003
 did-0.002
 swapped-0.002
Token a
Token ID257
Feature activation-2.59e-03

Pos logit contributions
gy+0.011
 possible+0.009
 Owl+0.009
 myself+0.009
bing+0.009
Neg logit contributions
 blamed-0.011
 nodded-0.011
 did-0.011
Kids-0.010
Wow-0.010
Token little
Token ID1310
Feature activation+1.49e-03

Pos logit contributions
 nodded+0.003
 blamed+0.003
Kids+0.003
 did+0.003
Uh+0.003
Neg logit contributions
gy-0.006
 myself-0.005
 possible-0.005
 Owl-0.005
 rescuing-0.005
Token bear
Token ID6842
Feature activation-1.32e-03

Pos logit contributions
gy+0.003
 myself+0.003
 possible+0.003
 rescuing+0.003
ulent+0.003
Neg logit contributions
 nodded-0.002
 blamed-0.002
Kids-0.002
Wow-0.002
Uh-0.001
Token named
Token ID3706
Feature activation-5.86e-03

Pos logit contributions
 blamed+0.001
 nodded+0.001
 did+0.001
Kids+0.001
 swapped+0.001
Neg logit contributions
 possible-0.003
 myself-0.003
gy-0.003
 Owl-0.003
ulent-0.003
Token Y
Token ID575
Feature activation+1.08e-02

Pos logit contributions
 blamed+0.005
Kids+0.005
Wow+0.005
 did+0.005
 nodded+0.005
Neg logit contributions
 possible-0.013
ble-0.013
gy-0.013
 believing-0.013
owing-0.013
Tokenawn
Token ID3832
Feature activation-8.13e-03

Pos logit contributions
gy+0.019
 possible+0.018
 myself+0.017
 Owl+0.016
 rescuing+0.016
Neg logit contributions
Kids-0.018
 nodded-0.017
 blamed-0.016
omo-0.016
 swapped-0.015
Token.
Token ID13
Feature activation+1.33e-03

Pos logit contributions
 nodded+0.011
 blamed+0.011
Kids+0.010
 did+0.009
outed+0.009
Neg logit contributions
 myself-0.016
 possible-0.016
gy-0.015
ulent-0.015
 afloat-0.014
Token Y
Token ID575
Feature activation+1.07e-02

Pos logit contributions
gy+0.003
 myself+0.002
 possible+0.002
 Owl+0.002
ble+0.002
Neg logit contributions
 blamed-0.002
 nodded-0.002
Kids-0.002
 swapped-0.002
Wow-0.002
Tokenawn
Token ID3832
Feature activation-8.82e-03

Pos logit contributions
gy+0.019
 possible+0.018
 Owl+0.017
 myself+0.016
ulent+0.016
Neg logit contributions
 nodded-0.017
 blamed-0.016
Kids-0.016
omo-0.015
 swapped-0.015
Token Lily
Token ID20037
Feature activation+2.25e-03

Pos logit contributions
gy+0.001
 myself+0.001
 possible+0.001
 Owl+0.001
 rescuing+0.001
Neg logit contributions
 blamed-0.001
Kids-0.001
 nodded-0.001
 swapped-0.001
Uh-0.001
Token was
Token ID373
Feature activation-2.17e-04

Pos logit contributions
gy+0.005
 Owl+0.004
 myself+0.004
ble+0.004
pper+0.004
Neg logit contributions
Kids-0.003
 blamed-0.003
Uh-0.003
Wow-0.003
 swapped-0.002
Token being
Token ID852
Feature activation+8.65e-03

Pos logit contributions
 blamed+0.000
 nodded+0.000
Kids+0.000
 swapped+0.000
Uh+0.000
Neg logit contributions
gy-0.000
 myself-0.000
 Owl-0.000
 possible-0.000
 rescuing-0.000
Token stubborn
Token ID24444
Feature activation-4.63e-03

Pos logit contributions
gy+0.016
 myself+0.014
 Owl+0.014
 possible+0.013
 rescuing+0.013
Neg logit contributions
 blamed-0.014
 nodded-0.014
Kids-0.014
 swapped-0.013
Uh-0.012
Token.
Token ID13
Feature activation+4.05e-05

Pos logit contributions
 blamed+0.009
Kids+0.009
 nodded+0.009
Uh+0.009
 swapped+0.008
Neg logit contributions
gy-0.007
 myself-0.006
 Owl-0.006
 possible-0.006
 rescuing-0.005
Token "
Token ID366
Feature activation-9.80e-04

Pos logit contributions
gy+0.000
 myself+0.000
 Owl+0.000
 Must+0.000
 rescuing+0.000
Neg logit contributions
 blamed-0.000
Kids-0.000
Uh-0.000
 nodded-0.000
 praised-0.000
TokenI
Token ID40
Feature activation-1.77e-03

Pos logit contributions
 nodded+0.002
 blamed+0.002
lights+0.001
 swapped+0.001
 thanked+0.001
Neg logit contributions
gy-0.002
 myself-0.002
 Owl-0.001
 possible-0.001
ich-0.001
Token don
Token ID836
Feature activation+7.09e-03

Pos logit contributions
 blamed+0.003
 nodded+0.003
Kids+0.003
 swapped+0.003
Uh+0.003
Neg logit contributions
gy-0.003
 Owl-0.003
 myself-0.003
 possible-0.003
 rescuing-0.002
Token't
Token ID470
Feature activation+1.58e-03

Pos logit contributions
gy+0.011
 possible+0.009
 myself+0.009
 Owl+0.009
ulent+0.008
Neg logit contributions
 blamed-0.014
Kids-0.013
 nodded-0.013
 swapped-0.012
Uh-0.012
Token want
Token ID765
Feature activation-5.61e-03

Pos logit contributions
gy+0.003
 myself+0.003
 Owl+0.003
 possible+0.003
ble+0.003
Neg logit contributions
 blamed-0.002
 nodded-0.002
Kids-0.002
 swapped-0.002
 did-0.002
Token to
Token ID284
Feature activation+3.78e-03

Pos logit contributions
 blamed+0.009
 nodded+0.009
Kids+0.009
 swapped+0.008
 brushes+0.008
Neg logit contributions
gy-0.011
 Owl-0.009
 myself-0.009
 possible-0.009
ble-0.008