TOP ACTIVATIONS
MAX = 3.188

didn 't like wet av oc ados , so he decided
who didn 't like av oc ados . She always ignored
he saw a rh in oc eros ! It looked very
his favourite treat - ch oc - chip ice cream !
saw a big rh in oc eros . It was dirty
there were no other av oc ados . The
Sammy would often share av oc ados and other y ummy
into a big rh in oc eros . The rh in
suddenly saw a rh in oc eros in front of him
they saw a rh in oc eros in the field .
" Hello , rh in oc eros !" Anna and Ben
She loved to eat av oc ados because they were green
my loved to eat av oc ados , they were his
nice like other rh in oc eros . It was mean
and blue . They showed each other their shapes .
wondered what a rh in oc eros was and where it
It is a rh in oc eros . It is bright
had a big rh in oc eros that he liked to
picture of a rh in oc eros on it . They
s wrong , rh in oc eros ? ” said

MIN ACTIVATIONS
MIN = -3.435

. Can we own the rain , Mom ?" Mia asks
. Can you see the rain that way , Lily and
Was it to keep the rain away ? Or was it
the window to watch the rain ?" Her
can I be like the rain and make sure to use
" Do you like the rain ?" " Yes
I go play in the rain ?' His mum
want to play in the rain with me ?"
you protect me all the time ?" the puppy asked .
Do you like the rain ? ” Daisy smiled
outside and play with the rain , Mom ?" Ben asks
, " Will it really rain today ?" "
scared . What was the alarm for ? They
are you dancing in the rain , silly king ?"
that . Whenever the chip m unk needed help , Sam
Can I see your jewelry sometime ?" Her mom
Mom , why are the pand as not playing ?" Anna
, " Where are the pand as , mom ?"
about we clean your old basketball and make it look new
have to wait a long time ?" she asked sadly .

INTERVAL 1.532 - 3.188
CONTAINS 0.050%

the two tubes , helping each other in and out of
. They hug each other . They play together
smoke !". Su zy eagerly listened , and Jane said ,
dad . You are a loyal and kind dad ."
<|endoftext|> One day , a woman went to the park .

INTERVAL -0.123 - 1.532
CONTAINS 61.904%

wasn 't worth it . <|endoftext|> One day , a thirsty
with my toy plane again !" From that day on ,
L ily and Jack played together all day , taking turns
but they invited him to join in with them .
an entire world filled with amazing things to explore . She

INTERVAL -1.779 - -0.123
CONTAINS 38.015%

ily shakes her head . " No , Tim , it
too long and too dangerous . I don 't want you
That ’ s a good idea
" We 're sorry , Mom . We were hungry
a time , there was a j olly little boy named

INTERVAL -3.435 - -1.779
CONTAINS 0.031%

<|endoftext|> Jack and Nora were having breakfast one morning . Jack
for sharing . From then on , they played together happily
They waited for the rain to come , and when
play at my daughter 's birthday party ?" Lily was so
special treat for the chip m unk . She said ,
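
The four INTERVAL ranges above appear to be equal-width bins spanning MIN to MAX. That is an inference from the printed numbers, not something the dump states. A minimal sketch under that assumption, with `interval_edges` as a hypothetical helper name, recomputes the boundaries and checks that the CONTAINS shares add up to 100%:

```python
# Sketch only: assumes four equal-width bins between the reported MIN and MAX.
# The bin count and the helper name are assumptions, not taken from the source.

def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced edges from lo to hi."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]

# MIN and MAX as reported in the dump.
edges = interval_edges(-3.435, 3.188)

# CONTAINS percentages as reported, listed from the highest interval down.
shares = [0.050, 61.904, 38.015, 0.031]
total_share = sum(shares)

for low, high in zip(edges[:-1], edges[1:]):
    print(f"{low:+.5f} to {high:+.5f}")
print(f"total share: {total_share:.3f}%")
```

The computed edges agree with the printed boundaries to three decimal places, and the two outer bins together hold under 0.1% of activations, so nearly all tokens fall in the two middle intervals.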
Token didn
Token ID1422
Feature activation-2.16e-02

Pos logit contributions
rified+0.011
ow+0.010
licks+0.009
 bye+0.009
ocked+0.009
Neg logit contributions
-0.007
 printed-0.006
b-0.005
Ar-0.005
 wonderful-0.005
Token't
Token ID470
Feature activation-2.77e-01

Pos logit contributions
Ar+0.001
 lifelong+0.001
 bisc+0.001
 printed+0.001
 feather+0.001
Neg logit contributions
ow-0.001
rified-0.000
ocked-0.000
 loudly-0.000
licks-0.000
Token like
Token ID588
Feature activation+6.14e-01

Pos logit contributions
+0.009
Ar+0.008
 printed+0.007
 lifelong+0.007
 common+0.007
Neg logit contributions
rified-0.012
ow-0.011
ocked-0.011
licks-0.010
ly-0.010
Token wet
Token ID9583
Feature activation+3.55e-01

Pos logit contributions
rified+0.028
ow+0.026
owing+0.024
licks+0.023
ocked+0.023
Neg logit contributions
-0.021
 printed-0.016
 wonderful-0.016
 print-0.015
 lifelong-0.015
Token av
Token ID1196
Feature activation+9.23e-01

Pos logit contributions
ow+0.016
rified+0.016
ocked+0.014
owing+0.014
licks+0.014
Neg logit contributions
-0.010
fl-0.007
 two-0.007
 printed-0.007
 wonderful-0.007
Tokenoc
Token ID420
Feature activation+3.19e+00

Pos logit contributions
rified+0.047
ow+0.045
ocked+0.043
licks+0.040
 bye+0.040
Neg logit contributions
-0.024
Ar-0.018
 printed-0.017
 lifelong-0.016
 wonderful-0.015
Tokenados
Token ID22484
Feature activation+7.58e-01

Pos logit contributions
rified+0.099
 spine+0.082
ow+0.081
chair+0.081
 manner+0.077
Neg logit contributions
-0.132
Wal-0.110
fl-0.106
 lifelong-0.106
 bisc-0.105
Token,
Token ID11
Feature activation-6.29e-01

Pos logit contributions
rified+0.032
ow+0.029
ocked+0.028
arrass+0.027
 bye+0.027
Neg logit contributions
-0.028
 printed-0.021
 wonderful-0.020
 lifelong-0.019
Ar-0.019
Token so
Token ID523
Feature activation-9.28e-01

Pos logit contributions
+0.022
 printed+0.015
 wonderful+0.015
Ar+0.014
 sparkling+0.014
Neg logit contributions
rified-0.027
ow-0.027
owing-0.024
ocked-0.024
licks-0.024
Token he
Token ID339
Feature activation-2.92e-01

Pos logit contributions
+0.041
Ar+0.035
 printed+0.034
 lifelong+0.034
 wonderful+0.033
Neg logit contributions
rified-0.030
ow-0.028
ocked-0.026
licks-0.025
storms-0.024
Token decided
Token ID3066
Feature activation+3.39e-01

Pos logit contributions
+0.010
 printed+0.008
Ar+0.008
 wonderful+0.007
Wal+0.007
Neg logit contributions
rified-0.013
ow-0.012
owing-0.011
ocked-0.011
licks-0.011
Token who
Token ID508
Feature activation-7.64e-01

Pos logit contributions
rified+0.031
ow+0.029
ocked+0.028
licks+0.026
storms+0.026
Neg logit contributions
-0.025
Ar-0.020
 printed-0.019
 lifelong-0.018
 common-0.018
Token didn
Token ID1422
Feature activation-2.10e-01

Pos logit contributions
+0.031
Ar+0.025
 printed+0.025
 wonderful+0.024
 lifelong+0.023
Neg logit contributions
rified-0.030
ow-0.027
ocked-0.027
licks-0.025
arted-0.024
Token't
Token ID470
Feature activation-1.75e-01

Pos logit contributions
+0.009
Ar+0.008
 lifelong+0.008
 printed+0.008
 bisc+0.008
Neg logit contributions
ow-0.007
rified-0.007
ocked-0.006
 loudly-0.005
licks-0.005
Token like
Token ID588
Feature activation+5.69e-01

Pos logit contributions
+0.006
Ar+0.004
 printed+0.004
 wonderful+0.004
 lifelong+0.004
Neg logit contributions
rified-0.008
ow-0.007
ocked-0.007
licks-0.007
storms-0.007
Token av
Token ID1196
Feature activation+1.04e+00

Pos logit contributions
rified+0.028
ow+0.025
agged+0.025
ocked+0.024
arrass+0.024
Neg logit contributions
-0.018
 wonderful-0.014
 printed-0.013
 print-0.013
 se-0.012
Tokenoc
Token ID420
Feature activation+2.93e+00

Pos logit contributions
rified+0.053
ow+0.050
ocked+0.048
 bye+0.046
arrass+0.046
Neg logit contributions
-0.027
Ar-0.021
 lifelong-0.019
 printed-0.018
 wonderful-0.017
Tokenados
Token ID22484
Feature activation+5.84e-01

Pos logit contributions
rified+0.096
 spine+0.080
ow+0.078
chair+0.077
ocked+0.076
Neg logit contributions
-0.112
fl-0.095
Wal-0.092
 bisc-0.092
 lifelong-0.090
Token.
Token ID13
Feature activation+6.24e-01

Pos logit contributions
rified+0.029
ow+0.025
ocked+0.025
arrass+0.024
arted+0.024
Neg logit contributions
-0.018
 printed-0.013
 wonderful-0.012
 print-0.011
Ar-0.011
Token She
Token ID1375
Feature activation+5.35e-01

Pos logit contributions
rified+0.024
ow+0.021
ocked+0.021
owing+0.021
iness+0.020
Neg logit contributions
-0.028
 rag-0.019
 rainbow-0.019
 wonderful-0.019
 two-0.019
Token always
Token ID1464
Feature activation+7.82e-01

Pos logit contributions
rified+0.023
ow+0.021
ocked+0.020
owing+0.020
licks+0.020
Neg logit contributions
-0.020
 printed-0.016
Ar-0.015
 wonderful-0.014
vy-0.013
Token ignored
Token ID9514
Feature activation+8.58e-01

Pos logit contributions
rified+0.036
ocked+0.032
ow+0.032
licks+0.030
arted+0.029
Neg logit contributions
-0.027
 printed-0.023
Ar-0.020
 wonderful-0.019
 common-0.019
Token he
Token ID339
Feature activation+6.57e-01

Pos logit contributions
ocked+0.028
rified+0.026
arrass+0.026
ow+0.026
owing+0.025
Neg logit contributions
-0.024
Ar-0.018
 two-0.016
 printed-0.016
Wal-0.016
Token saw
Token ID2497
Feature activation+1.19e-01

Pos logit contributions
rified+0.025
ow+0.023
ocked+0.021
licks+0.020
ly+0.020
Neg logit contributions
-0.026
Ar-0.022
 lifelong-0.021
 printed-0.021
 common-0.020
Token a
Token ID257
Feature activation-3.03e-01

Pos logit contributions
rified+0.007
ocked+0.007
ow+0.007
owing+0.007
arrass+0.006
Neg logit contributions
-0.002
 wonderful-0.001
 two-0.001
Ar-0.001
 treasure-0.001
Token rh
Token ID9529
Feature activation+6.93e-01

Pos logit contributions
+0.010
Ar+0.008
 lifelong+0.007
 printed+0.007
 common+0.007
Neg logit contributions
rified-0.013
ow-0.012
 loudly-0.012
 bye-0.011
ocked-0.011
Tokenin
Token ID259
Feature activation-5.92e-01

Pos logit contributions
rified+0.032
ow+0.031
ocked+0.029
arrass+0.029
storms+0.029
Neg logit contributions
-0.020
Ar-0.017
 printed-0.015
 lifelong-0.014
Wal-0.014
Tokenoc
Token ID420
Feature activation+2.76e+00

Pos logit contributions
+0.019
 lifelong+0.016
Ar+0.016
 printed+0.015
 common+0.015
Neg logit contributions
rified-0.025
ow-0.024
ocked-0.022
 loudly-0.022
ly-0.022
Tokeneros
Token ID27498
Feature activation+4.17e-01

Pos logit contributions
rified+0.100
outed+0.095
ocked+0.095
 lately+0.092
licks+0.091
Neg logit contributions
-0.084
Ar-0.080
 bisc-0.078
 extingu-0.073
fl-0.072
Token!
Token ID0
Feature activation+4.69e-01

Pos logit contributions
rified+0.017
ow+0.016
ocked+0.015
licks+0.014
storms+0.014
Neg logit contributions
-0.015
Ar-0.013
 printed-0.012
 lifelong-0.012
 common-0.011
Token It
Token ID632
Feature activation+2.18e-01

Pos logit contributions
rified+0.021
owing+0.021
 unatt+0.021
iness+0.021
ow+0.021
Neg logit contributions
-0.016
Ar-0.012
 bisc-0.009
Wal-0.009
 printed-0.009
Token looked
Token ID3114
Feature activation+5.17e-02

Pos logit contributions
rified+0.010
ow+0.009
ocked+0.009
 loudly+0.009
storms+0.008
Neg logit contributions
-0.006
Ar-0.005
 lifelong-0.005
 common-0.005
 printed-0.005
Token very
Token ID845
Feature activation+2.00e-01

Pos logit contributions
rified+0.002
ow+0.002
ocked+0.002
 loudly+0.002
licks+0.002
Neg logit contributions
-0.002
Ar-0.002
 printed-0.002
 lifelong-0.002
 common-0.002
Token his
Token ID465
Feature activation-1.47e-01

Pos logit contributions
ow+0.001
rified+0.001
ocked+0.001
 loudly+0.001
ly+0.001
Neg logit contributions
-0.001
Ar-0.001
Wal-0.001
 lifelong-0.001
 common-0.001
Token favourite
Token ID12507
Feature activation+9.71e-02

Pos logit contributions
+0.006
Ar+0.005
Wal+0.005
 printed+0.005
 lifelong+0.005
Neg logit contributions
ow-0.005
rified-0.005
 mood-0.005
ly-0.005
 bye-0.004
Token treat
Token ID2190
Feature activation-2.40e-01

Pos logit contributions
ow+0.003
rified+0.003
ly+0.003
 mood+0.003
 bye+0.003
Neg logit contributions
-0.004
Ar-0.003
Wal-0.003
 lifelong-0.003
 printed-0.003
Token -
Token ID532
Feature activation-4.72e-01

Pos logit contributions
+0.009
Ar+0.008
 lifelong+0.007
 printed+0.007
 common+0.007
Neg logit contributions
rified-0.009
ow-0.009
ocked-0.008
ly-0.008
owing-0.008
Token ch
Token ID442
Feature activation-1.06e-01

Pos logit contributions
+0.019
Ar+0.017
 printed+0.016
 lifelong+0.016
 common+0.016
Neg logit contributions
rified-0.016
ow-0.015
ocked-0.014
 bye-0.013
 loudly-0.013
Tokenoc
Token ID420
Feature activation+2.63e+00

Pos logit contributions
+0.003
vy+0.002
ady+0.002
b+0.002
Wal+0.002
Neg logit contributions
rified-0.005
storms-0.005
arrass-0.004
chair-0.004
 unatt-0.004
Token-
Token ID12
Feature activation+1.15e+00

Pos logit contributions
rified+0.089
 alone+0.080
ocked+0.077
 unatt+0.076
chair+0.076
Neg logit contributions
-0.098
 bisc-0.085
fl-0.085
Ar-0.082
b-0.078
Tokenchip
Token ID35902
Feature activation+5.94e-01

Pos logit contributions
rified+0.042
ow+0.040
ocked+0.037
ly+0.034
licks+0.034
Neg logit contributions
-0.045
Ar-0.039
 printed-0.037
 lifelong-0.037
 common-0.035
Token ice
Token ID4771
Feature activation-6.26e-01

Pos logit contributions
rified+0.023
ow+0.022
ocked+0.021
ly+0.020
licks+0.019
Neg logit contributions
-0.022
Ar-0.019
 lifelong-0.018
 printed-0.017
 common-0.017
Token cream
Token ID8566
Feature activation+5.62e-01

Pos logit contributions
+0.025
Ar+0.022
 lifelong+0.021
 printed+0.021
Wal+0.020
Neg logit contributions
rified-0.022
ow-0.021
ocked-0.020
ly-0.019
owing-0.018
Token!
Token ID0
Feature activation+3.59e-01

Pos logit contributions
rified+0.023
ow+0.022
ocked+0.021
licks+0.019
ly+0.019
Neg logit contributions
-0.020
Ar-0.017
 printed-0.015
 lifelong-0.015
 common-0.015
Token saw
Token ID2497
Feature activation-3.78e-01

Pos logit contributions
+0.003
 printed+0.002
Ar+0.002
 print+0.002
 wonderful+0.002
Neg logit contributions
rified-0.006
ocked-0.006
iness-0.006
owing-0.006
attering-0.006
Token a
Token ID257
Feature activation-4.43e-01

Pos logit contributions
+0.010
 printed+0.008
 wonderful+0.007
Ar+0.007
 lifelong+0.007
Neg logit contributions
rified-0.019
ow-0.018
ocked-0.017
owing-0.017
storms-0.017
Token big
Token ID1263
Feature activation+1.83e-01

Pos logit contributions
+0.016
Ar+0.012
 lifelong+0.012
ady+0.012
 printed+0.011
Neg logit contributions
rified-0.016
 bye-0.016
 loudly-0.015
ow-0.014
 mood-0.014
Token rh
Token ID9529
Feature activation+5.70e-01

Pos logit contributions
rified+0.007
ow+0.007
ocked+0.006
licks+0.006
ly+0.006
Neg logit contributions
-0.007
Ar-0.006
 printed-0.005
 lifelong-0.005
 common-0.005
Tokenin
Token ID259
Feature activation-6.73e-01

Pos logit contributions
rified+0.025
ow+0.024
ocked+0.023
licks+0.021
storms+0.021
Neg logit contributions
-0.018
Ar-0.015
 printed-0.014
 lifelong-0.014
 common-0.013
Tokenoc
Token ID420
Feature activation+2.63e+00

Pos logit contributions
+0.023
 lifelong+0.019
Ar+0.019
 common+0.018
 printed+0.018
Neg logit contributions
rified-0.027
ow-0.026
 bye-0.024
ly-0.024
 loudly-0.024
Tokeneros
Token ID27498
Feature activation+2.46e-01

Pos logit contributions
rified+0.097
ocked+0.087
storms+0.086
licks+0.086
 lately+0.084
Neg logit contributions
Ar-0.084
-0.082
 printed-0.072
 bisc-0.072
 feather-0.071
Token.
Token ID13
Feature activation+4.79e-01

Pos logit contributions
rified+0.010
ow+0.009
ocked+0.008
licks+0.008
owing+0.008
Neg logit contributions
-0.009
Ar-0.008
 lifelong-0.007
 printed-0.007
 common-0.007
Token It
Token ID632
Feature activation+3.46e-02

Pos logit contributions
rified+0.021
ow+0.019
ocked+0.019
iness+0.019
owing+0.018
Neg logit contributions
-0.019
Ar-0.013
 printed-0.013
 print-0.012
 bisc-0.012
Token was
Token ID373
Feature activation+3.43e-01

Pos logit contributions
rified+0.002
ow+0.001
ocked+0.001
storms+0.001
 loudly+0.001
Neg logit contributions
-0.001
 lifelong-0.001
Ar-0.001
Wal-0.001
 common-0.001
Token dirty
Token ID11841
Feature activation-2.54e-01

Pos logit contributions
rified+0.015
ow+0.014
ocked+0.013
licks+0.013
 loudly+0.013
Neg logit contributions
-0.011
Ar-0.009
 printed-0.009
 lifelong-0.009
 common-0.008
Token there
Token ID612
Feature activation+4.50e-01

Pos logit contributions
+0.002
 printed+0.002
 wonderful+0.002
Ar+0.002
 lifelong+0.002
Neg logit contributions
rified-0.002
ow-0.002
ocked-0.002
owing-0.002
licks-0.002
Token were
Token ID547
Feature activation+5.76e-01

Pos logit contributions
rified+0.022
ow+0.021
ocked+0.021
arrass+0.020
ly+0.020
Neg logit contributions
-0.013
 printed-0.009
 wonderful-0.008
 lifelong-0.008
Ar-0.008
Token no
Token ID645
Feature activation+1.11e+00

Pos logit contributions
rified+0.022
ow+0.020
ocked+0.019
licks+0.018
ly+0.018
Neg logit contributions
-0.023
Ar-0.019
 printed-0.019
 lifelong-0.018
 wonderful-0.018
Token other
Token ID584
Feature activation-1.78e-01

Pos logit contributions
rified+0.048
ow+0.046
ocked+0.043
ly+0.041
licks+0.040
Neg logit contributions
-0.037
Ar-0.031
 printed-0.029
 lifelong-0.028
 common-0.027
Token av
Token ID1196
Feature activation+3.45e-01

Pos logit contributions
+0.006
Ar+0.005
 printed+0.005
 lifelong+0.005
 wonderful+0.005
Neg logit contributions
rified-0.007
ow-0.007
ocked-0.007
licks-0.006
arrass-0.006
Tokenoc
Token ID420
Feature activation+2.62e+00

Pos logit contributions
rified+0.016
ow+0.016
ocked+0.015
licks+0.014
owing+0.014
Neg logit contributions
-0.010
Ar-0.008
 printed-0.008
 lifelong-0.007
 common-0.007
Tokenados
Token ID22484
Feature activation+2.07e-01

Pos logit contributions
rified+0.080
chair+0.068
ow+0.067
ocked+0.067
 spine+0.063
Neg logit contributions
-0.109
Ar-0.092
 printed-0.091
fl-0.091
Wal-0.091
Token.
Token ID13
Feature activation+8.81e-02

Pos logit contributions
rified+0.008
ow+0.008
ocked+0.007
storms+0.007
licks+0.007
Neg logit contributions
-0.008
Ar-0.007
 lifelong-0.006
 printed-0.006
 common-0.006
Token
Token ID198
Feature activation-2.40e-01

Pos logit contributions
rified+0.003
ow+0.003
ocked+0.003
owing+0.003
ly+0.003
Neg logit contributions
-0.003
Ar-0.003
 printed-0.003
 lifelong-0.002
Wal-0.002
Token
Token ID198
Feature activation+2.87e-01

Pos logit contributions
+0.008
 printed+0.007
 lifelong+0.007
 bisc+0.007
 feather+0.006
Neg logit contributions
ow-0.009
rified-0.009
ocked-0.009
 loudly-0.008
licks-0.008
TokenThe
Token ID464
Feature activation+8.10e-02

Pos logit contributions
rified+0.012
ow+0.011
ocked+0.011
 loudly+0.011
Fine+0.010
Neg logit contributions
-0.008
 bisc-0.008
 lifelong-0.008
 printed-0.007
Ar-0.007
Token Sammy
Token ID44067
Feature activation-2.47e-01

Pos logit contributions
rified+0.009
ow+0.008
ocked+0.008
storms+0.007
licks+0.007
Neg logit contributions
-0.007
Ar-0.006
 printed-0.006
 lifelong-0.006
 common-0.005
Token would
Token ID561
Feature activation-1.87e-01

Pos logit contributions
+0.010
Ar+0.008
 printed+0.008
 lifelong+0.008
 common+0.008
Neg logit contributions
rified-0.009
ow-0.009
ocked-0.008
licks-0.007
storms-0.007
Token often
Token ID1690
Feature activation+8.65e-02

Pos logit contributions
+0.008
Ar+0.007
 printed+0.007
 lifelong+0.007
 wonderful+0.007
Neg logit contributions
rified-0.006
ow-0.006
ocked-0.005
licks-0.005
ly-0.005
Token share
Token ID2648
Feature activation+2.57e-01

Pos logit contributions
rified+0.004
ow+0.004
ocked+0.003
licks+0.003
storms+0.003
Neg logit contributions
-0.003
Ar-0.002
 printed-0.002
 lifelong-0.002
 common-0.002
Token av
Token ID1196
Feature activation+1.48e-01

Pos logit contributions
rified+0.008
ow+0.008
ocked+0.008
 loudly+0.007
 anymore+0.007
Neg logit contributions
-0.011
Ar-0.010
vy-0.008
 bisc-0.008
 lifelong-0.008
Tokenoc
Token ID420
Feature activation+2.61e+00

Pos logit contributions
rified+0.007
ow+0.007
ocked+0.006
licks+0.006
ly+0.006
Neg logit contributions
-0.004
Ar-0.004
 printed-0.003
 lifelong-0.003
 common-0.003
Tokenados
Token ID22484
Feature activation-2.90e-02

Pos logit contributions
rified+0.072
ow+0.061
chair+0.056
 spine+0.056
storms+0.053
Neg logit contributions
-0.122
Ar-0.104
Wal-0.103
 lifelong-0.103
 printed-0.102
Token and
Token ID290
Feature activation+7.40e-02

Pos logit contributions
+0.001
Ar+0.001
 printed+0.001
 lifelong+0.001
 common+0.001
Neg logit contributions
rified-0.001
ow-0.001
ocked-0.001
ly-0.001
storms-0.001
Token other
Token ID584
Feature activation-5.51e-01

Pos logit contributions
rified+0.003
ow+0.003
ocked+0.002
 bye+0.002
ly+0.002
Neg logit contributions
-0.003
Ar-0.002
 lifelong-0.002
vy-0.002
 printed-0.002
Token y
Token ID331
Feature activation-3.78e-01

Pos logit contributions
+0.019
Ar+0.017
 printed+0.015
 lifelong+0.015
 common+0.015
Neg logit contributions
rified-0.023
ow-0.021
ocked-0.020
ly-0.019
licks-0.019
Tokenummy
Token ID13513
Feature activation-2.37e-01

Pos logit contributions
+0.016
Ar+0.013
 lifelong+0.013
 printed+0.013
Wal+0.013
Neg logit contributions
rified-0.014
ow-0.012
licks-0.010
storms-0.010
chair-0.010
Token into
Token ID656
Feature activation-2.20e-01

Pos logit contributions
+0.020
Ar+0.017
 printed+0.016
 lifelong+0.016
 common+0.015
Neg logit contributions
rified-0.020
ow-0.019
ocked-0.017
licks-0.016
ly-0.016
Token a
Token ID257
Feature activation-3.08e-01

Pos logit contributions
+0.007
Ar+0.006
 lifelong+0.006
 printed+0.006
 common+0.006
Neg logit contributions
rified-0.009
ow-0.009
ocked-0.008
 loudly-0.008
ly-0.008
Token big
Token ID1263
Feature activation-3.19e-01

Pos logit contributions
+0.011
Ar+0.010
 printed+0.009
 lifelong+0.009
 common+0.009
Neg logit contributions
rified-0.012
ow-0.011
ocked-0.011
 loudly-0.011
 bye-0.010
Token rh
Token ID9529
Feature activation+4.01e-01

Pos logit contributions
+0.011
Ar+0.008
 printed+0.008
 wonderful+0.008
 lifelong+0.008
Neg logit contributions
rified-0.014
ow-0.013
ocked-0.013
licks-0.012
arrass-0.012
Tokenin
Token ID259
Feature activation-7.67e-01

Pos logit contributions
rified+0.019
ow+0.018
arrass+0.017
ocked+0.017
 bye+0.017
Neg logit contributions
-0.011
Ar-0.009
 printed-0.008
Wal-0.008
 lifelong-0.008
Tokenoc
Token ID420
Feature activation+2.59e+00

Pos logit contributions
+0.022
Ar+0.018
 lifelong+0.017
 printed+0.017
 common+0.016
Neg logit contributions
rified-0.036
ow-0.035
ocked-0.033
ly-0.031
 loudly-0.031
Tokeneros
Token ID27498
Feature activation+2.66e-01

Pos logit contributions
rified+0.097
outed+0.095
ocked+0.093
 lately+0.091
licks+0.090
Neg logit contributions
-0.070
 bisc-0.066
 extingu-0.065
Ar-0.064
 feather-0.063
Token.
Token ID13
Feature activation+5.28e-01

Pos logit contributions
rified+0.011
ow+0.011
ocked+0.010
licks+0.009
arrass+0.009
Neg logit contributions
-0.009
Ar-0.007
 printed-0.007
 lifelong-0.007
 bisc-0.007
Token The
Token ID383
Feature activation-2.42e-01

Pos logit contributions
rified+0.021
ow+0.020
ocked+0.018
owing+0.018
licks+0.017
Neg logit contributions
-0.020
Ar-0.017
 printed-0.016
 lifelong-0.015
Wal-0.015
Token rh
Token ID9529
Feature activation+4.81e-01

Pos logit contributions
+0.009
Ar+0.007
 lifelong+0.007
 printed+0.007
 common+0.007
Neg logit contributions
rified-0.010
ow-0.009
ocked-0.008
 bye-0.008
ly-0.008
Tokenin
Token ID259
Feature activation-1.05e+00

Pos logit contributions
rified+0.022
ow+0.021
ocked+0.019
arrass+0.019
storms+0.019
Neg logit contributions
-0.015
Ar-0.012
 printed-0.012
 lifelong-0.011
 wonderful-0.011
Token suddenly
Token ID6451
Feature activation-3.42e-02

Pos logit contributions
rified+0.009
ow+0.009
ocked+0.009
licks+0.008
arrass+0.008
Neg logit contributions
-0.007
Ar-0.006
 printed-0.006
 lifelong-0.005
 wonderful-0.005
Token saw
Token ID2497
Feature activation-5.58e-01

Pos logit contributions
+0.001
 printed+0.001
Ar+0.001
Wal+0.001
 wonderful+0.001
Neg logit contributions
rified-0.001
ow-0.001
licks-0.001
ocked-0.001
arrass-0.001
Token a
Token ID257
Feature activation-5.21e-01

Pos logit contributions
+0.016
Ar+0.014
 lifelong+0.012
 printed+0.012
 common+0.012
Neg logit contributions
rified-0.026
ow-0.025
ocked-0.023
ly-0.022
 loudly-0.022
Token rh
Token ID9529
Feature activation+4.82e-01

Pos logit contributions
+0.017
Ar+0.015
 lifelong+0.014
 printed+0.013
 common+0.013
Neg logit contributions
rified-0.021
 loudly-0.019
 bye-0.019
ow-0.019
ocked-0.018
Tokenin
Token ID259
Feature activation-6.99e-01

Pos logit contributions
rified+0.021
ow+0.020
ocked+0.019
licks+0.018
storms+0.018
Neg logit contributions
-0.016
Ar-0.013
 printed-0.012
 lifelong-0.012
 wonderful-0.011
Tokenoc
Token ID420
Feature activation+2.58e+00

Pos logit contributions
+0.025
 lifelong+0.023
 common+0.022
Ar+0.022
 printed+0.021
Neg logit contributions
rified-0.026
ow-0.025
ly-0.022
 bye-0.022
 loudly-0.022
Tokeneros
Token ID27498
Feature activation+2.52e-01

Pos logit contributions
rified+0.089
ocked+0.078
licks+0.077
ow+0.076
storms+0.075
Neg logit contributions
-0.098
Ar-0.086
 printed-0.080
Wal-0.079
 feather-0.078
Token in
Token ID287
Feature activation-5.10e-01

Pos logit contributions
rified+0.009
ow+0.009
ocked+0.008
owing+0.008
 loudly+0.008
Neg logit contributions
-0.010
Ar-0.009
 lifelong-0.008
 common-0.008
 printed-0.008
Token front
Token ID2166
Feature activation+8.33e-01

Pos logit contributions
+0.019
Ar+0.016
 lifelong+0.015
 printed+0.015
 wonderful+0.014
Neg logit contributions
rified-0.020
ow-0.019
ocked-0.017
 loudly-0.017
licks-0.016
Token of
Token ID286
Feature activation-4.06e-01

Pos logit contributions
rified+0.038
ow+0.034
ocked+0.032
arrass+0.032
storms+0.032
Neg logit contributions
-0.029
 printed-0.020
Ar-0.019
 print-0.019
 wonderful-0.018
Token him
Token ID683
Feature activation-2.14e-01

Pos logit contributions
+0.022
Ar+0.020
 lifelong+0.019
 printed+0.019
 common+0.019
Neg logit contributions
rified-0.009
ow-0.008
ocked-0.007
 loudly-0.006
ly-0.006
Token they
Token ID484
Feature activation+3.04e-01

Pos logit contributions
+0.029
Ar+0.025
 lifelong+0.023
 printed+0.023
 bisc+0.023
Neg logit contributions
rified-0.027
ow-0.025
ocked-0.023
 loudly-0.023
 bye-0.022
Token saw
Token ID2497
Feature activation-7.85e-01

Pos logit contributions
rified+0.014
ow+0.013
ocked+0.013
arrass+0.013
licks+0.013
Neg logit contributions
-0.009
 printed-0.007
Ar-0.007
Wal-0.007
 lifelong-0.006
Token a
Token ID257
Feature activation-4.97e-01

Pos logit contributions
+0.024
Ar+0.022
 lifelong+0.019
 common+0.018
 printed+0.018
Neg logit contributions
rified-0.035
ow-0.033
ocked-0.031
 loudly-0.030
 bye-0.029
Token rh
Token ID9529
Feature activation+5.99e-01

Pos logit contributions
+0.016
Ar+0.014
 lifelong+0.013
 printed+0.012
 common+0.012
Neg logit contributions
rified-0.019
 loudly-0.018
ow-0.018
 bye-0.018
ocked-0.017
Tokenin
Token ID259
Feature activation-7.24e-01

Pos logit contributions
rified+0.026
ow+0.024
ocked+0.023
licks+0.022
storms+0.022
Neg logit contributions
-0.020
Ar-0.017
 printed-0.016
 lifelong-0.015
 common-0.015
Tokenoc
Token ID420
Feature activation+2.57e+00

Pos logit contributions
+0.026
 lifelong+0.023
 common+0.022
Ar+0.022
 printed+0.021
Neg logit contributions
rified-0.028
ow-0.026
ly-0.024
 bye-0.024
 loudly-0.023
Tokeneros
Token ID27498
Feature activation+6.01e-02

Pos logit contributions
rified+0.091
ocked+0.083
licks+0.080
storms+0.079
ow+0.078
Neg logit contributions
-0.090
Ar-0.082
fl-0.073
 printed-0.073
Wal-0.073
Token in
Token ID287
Feature activation-6.03e-01

Pos logit contributions
rified+0.002
ow+0.002
ocked+0.002
owing+0.002
 loudly+0.002
Neg logit contributions
-0.002
Ar-0.002
 lifelong-0.002
 common-0.002
 printed-0.002
Token the
Token ID262
Feature activation+4.00e-01

Pos logit contributions
+0.023
 lifelong+0.020
Ar+0.019
 printed+0.019
b+0.019
Neg logit contributions
rified-0.021
ow-0.020
 loudly-0.018
ocked-0.018
 bye-0.017
Token field
Token ID2214
Feature activation+6.60e-01

Pos logit contributions
rified+0.014
ow+0.013
ocked+0.012
 bye+0.011
ly+0.011
Neg logit contributions
-0.016
Ar-0.014
 printed-0.013
 lifelong-0.013
 common-0.013
Token.
Token ID13
Feature activation+1.16e-01

Pos logit contributions
rified+0.029
ow+0.027
ocked+0.025
licks+0.025
storms+0.024
Neg logit contributions
-0.022
 printed-0.017
Ar-0.017
 lifelong-0.016
 common-0.015
Token"
Token ID1
Feature activation+6.67e-02

Pos logit contributions
rified+0.020
ow+0.019
ocked+0.018
 loudly+0.018
licks+0.018
Neg logit contributions
-0.013
 lifelong-0.012
 printed-0.012
Ar-0.011
 bisc-0.011
TokenHello
Token ID15496
Feature activation-1.89e-03

Pos logit contributions
rified+0.002
Fine+0.002
 bye+0.002
ow+0.002
 loudly+0.002
Neg logit contributions
-0.003
 lifelong-0.002
 masterpiece-0.002
inations-0.002
 printed-0.002
Token,
Token ID11
Feature activation+1.04e+00

Pos logit contributions
+0.000
Ar+0.000
 printed+0.000
 lifelong+0.000
Wal+0.000
Neg logit contributions
rified-0.000
ow-0.000
ocked-0.000
 bye-0.000
ly-0.000
Token rh
Token ID9529
Feature activation+1.31e+00

Pos logit contributions
rified+0.037
ow+0.034
 bye+0.032
licks+0.030
ly+0.030
Neg logit contributions
-0.041
Ar-0.037
 lifelong-0.036
 printed-0.036
 common-0.035
Tokenin
Token ID259
Feature activation+2.01e-01

Pos logit contributions
rified+0.060
ow+0.057
ocked+0.054
arrass+0.051
storms+0.051
Neg logit contributions
-0.040
Ar-0.033
 printed-0.030
 lifelong-0.029
 common-0.028
Tokenoc
Token ID420
Feature activation+2.56e+00

Pos logit contributions
rified+0.009
ow+0.008
ly+0.007
 loudly+0.007
ocked+0.007
Neg logit contributions
-0.006
 lifelong-0.005
Ar-0.005
 printed-0.005
 common-0.005
Tokeneros
Token ID27498
Feature activation+4.74e-01

Pos logit contributions
rified+0.088
ocked+0.081
ow+0.081
chair+0.078
 mood+0.077
Neg logit contributions
-0.092
Ar-0.088
fl-0.079
b-0.077
 printed-0.076
Token!"
Token ID2474
Feature activation+5.82e-01

Pos logit contributions
rified+0.023
ow+0.022
ocked+0.021
licks+0.020
storms+0.020
Neg logit contributions
-0.013
Ar-0.010
 printed-0.009
 lifelong-0.009
 common-0.008
Token Anna
Token ID11735
Feature activation-4.03e-02

Pos logit contributions
rified+0.016
ow+0.015
ocked+0.014
licks+0.013
storms+0.012
Neg logit contributions
-0.028
Ar-0.025
 printed-0.024
 lifelong-0.024
 common-0.023
Token and
Token ID290
Feature activation+6.97e-01

Pos logit contributions
+0.001
Ar+0.001
 lifelong+0.001
 printed+0.001
 common+0.001
Neg logit contributions
rified-0.002
ow-0.002
ocked-0.002
licks-0.001
ly-0.001
Token Ben
Token ID3932
Feature activation-3.41e-02

Pos logit contributions
rified+0.024
ow+0.023
ocked+0.023
owing+0.022
licks+0.021
Neg logit contributions
-0.030
 printed-0.024
Ar-0.023
 two-0.023
 bisc-0.023
Token She
Token ID1375
Feature activation+5.61e-01

Pos logit contributions
rified+0.010
ow+0.009
ocked+0.009
owing+0.009
arrass+0.008
Neg logit contributions
-0.013
 printed-0.010
 wonderful-0.009
Ar-0.009
 bisc-0.009
Token loved
Token ID6151
Feature activation+1.21e-02

Pos logit contributions
rified+0.017
ow+0.016
ocked+0.015
ly+0.014
 loudly+0.014
Neg logit contributions
-0.025
Ar-0.022
 lifelong-0.021
 printed-0.021
 common-0.020
Token to
Token ID284
Feature activation+1.87e-01

Pos logit contributions
rified+0.001
ow+0.000
ocked+0.000
licks+0.000
owing+0.000
Neg logit contributions
-0.000
 printed-0.000
Ar-0.000
 wonderful-0.000
 lifelong-0.000
Token eat
Token ID4483
Feature activation-3.46e-01

Pos logit contributions
ow+0.007
rified+0.006
 bye+0.006
ocked+0.006
ly+0.005
Neg logit contributions
-0.007
Ar-0.007
 lifelong-0.006
Wal-0.006
 printed-0.006
Token av
Token ID1196
Feature activation+2.03e-01

Pos logit contributions
+0.011
Ar+0.009
 printed+0.009
 lifelong+0.009
 common+0.008
Neg logit contributions
rified-0.015
ow-0.014
ocked-0.014
licks-0.013
ly-0.013
Tokenoc
Token ID420
Feature activation+2.55e+00

Pos logit contributions
rified+0.010
ow+0.010
ocked+0.009
licks+0.009
storms+0.009
Neg logit contributions
-0.006
Ar-0.004
 printed-0.004
 lifelong-0.004
 common-0.004
Tokenados
Token ID22484
Feature activation+6.42e-02

Pos logit contributions
rified+0.077
ow+0.064
 spine+0.062
ocked+0.059
licks+0.057
Neg logit contributions
-0.112
Wal-0.094
Ar-0.093
 printed-0.093
 lifelong-0.091
Token because
Token ID780
Feature activation-1.60e-01

Pos logit contributions
rified+0.002
ow+0.002
ocked+0.002
licks+0.002
arrass+0.002
Neg logit contributions
-0.003
Ar-0.002
 printed-0.002
 lifelong-0.002
 wonderful-0.002
Token they
Token ID484
Feature activation+4.57e-01

Pos logit contributions
+0.007
Ar+0.006
 bisc+0.006
 lifelong+0.006
 printed+0.006
Neg logit contributions
rified-0.005
ow-0.005
ocked-0.004
ly-0.004
 loudly-0.004
Token were
Token ID547
Feature activation+9.65e-02

Pos logit contributions
rified+0.018
ow+0.017
ocked+0.016
 bye+0.015
licks+0.015
Neg logit contributions
-0.017
Ar-0.014
 printed-0.014
 lifelong-0.013
 wonderful-0.013
Token green
Token ID4077
Feature activation+2.29e-01

Pos logit contributions
rified+0.003
ow+0.003
ocked+0.003
 loudly+0.003
storms+0.003
Neg logit contributions
-0.004
Ar-0.003
 bisc-0.003
 printed-0.003
 lifelong-0.003
Tokenmy
Token ID1820
Feature activation-3.02e-01

Pos logit contributions
+0.023
Ar+0.020
 lifelong+0.019
 printed+0.018
 common+0.018
Neg logit contributions
rified-0.021
ow-0.020
ocked-0.018
licks-0.017
ly-0.017
Token loved
Token ID6151
Feature activation-6.00e-02

Pos logit contributions
+0.013
Ar+0.012
 lifelong+0.012
Wal+0.011
 printed+0.011
Neg logit contributions
rified-0.009
ow-0.009
ocked-0.008
 loudly-0.007
ly-0.007
Token to
Token ID284
Feature activation+8.14e-02

Pos logit contributions
+0.002
Ar+0.002
 printed+0.002
 wonderful+0.002
 lifelong+0.002
Neg logit contributions
rified-0.003
ow-0.002
ocked-0.002
licks-0.002
owing-0.002
Token eat
Token ID4483
Feature activation-4.20e-01

Pos logit contributions
ow+0.003
rified+0.003
ocked+0.003
 bye+0.003
ly+0.002
Neg logit contributions
-0.003
Ar-0.003
 lifelong-0.003
 printed-0.002
Wal-0.002
Token av
Token ID1196
Feature activation+2.57e-01

Pos logit contributions
+0.014
Ar+0.011
 printed+0.011
 lifelong+0.010
 common+0.010
Neg logit contributions
rified-0.019
ow-0.018
ocked-0.017
licks-0.016
storms-0.016
Tokenoc
Token ID420
Feature activation+2.54e+00

Pos logit contributions
rified+0.013
ow+0.012
ocked+0.012
licks+0.011
storms+0.011
Neg logit contributions
-0.007
Ar-0.005
 printed-0.005
 lifelong-0.005
 wonderful-0.005
Tokenados
Token ID22484
Feature activation+2.18e-02

Pos logit contributions
rified+0.078
ow+0.066
 spine+0.065
chair+0.062
 alone+0.060
Neg logit contributions
-0.106
Wal-0.091
Ar-0.089
 printed-0.087
fl-0.087
Token,
Token ID11
Feature activation-9.03e-02

Pos logit contributions
rified+0.001
ow+0.001
ocked+0.001
licks+0.001
arrass+0.001
Neg logit contributions
-0.001
Ar-0.001
 printed-0.001
 lifelong-0.001
 wonderful-0.001
Token they
Token ID484
Feature activation+3.63e-01

Pos logit contributions
+0.004
Ar+0.003
 printed+0.003
 lifelong+0.003
 common+0.003
Neg logit contributions
rified-0.003
ow-0.003
ocked-0.003
licks-0.003
storms-0.003
Token were
Token ID547
Feature activation+1.19e-01

Pos logit contributions
rified+0.014
ow+0.013
ocked+0.013
licks+0.012
owing+0.012
Neg logit contributions
-0.014
Ar-0.011
 printed-0.011
 lifelong-0.011
 wonderful-0.010
Token his
Token ID465
Feature activation+1.57e-01

Pos logit contributions
ow+0.004
ocked+0.004
rified+0.004
 loudly+0.003
storms+0.003
Neg logit contributions
-0.005
Ar-0.004
 bisc-0.004
 common-0.004
 lifelong-0.004
Token nice
Token ID3621
Feature activation+4.34e-01

Pos logit contributions
rified+0.007
ow+0.006
ocked+0.006
licks+0.006
storms+0.006
Neg logit contributions
-0.005
Ar-0.004
 printed-0.004
 lifelong-0.003
 common-0.003
Token like
Token ID588
Feature activation+3.48e-01

Pos logit contributions
rified+0.019
ow+0.017
ocked+0.016
Fine+0.016
arrass+0.016
Neg logit contributions
-0.015
 printed-0.011
 wonderful-0.011
 lifelong-0.011
 print-0.010
Token other
Token ID584
Feature activation+2.34e-01

Pos logit contributions
rified+0.016
ow+0.015
owing+0.014
licks+0.014
ocked+0.014
Neg logit contributions
-0.011
 wonderful-0.009
 printed-0.009
Ar-0.008
 great-0.008
Token rh
Token ID9529
Feature activation+8.75e-01

Pos logit contributions
rified+0.011
ow+0.010
ocked+0.010
licks+0.009
storms+0.009
Neg logit contributions
-0.007
Ar-0.006
 lifelong-0.006
 wonderful-0.005
 printed-0.005
Tokenin
Token ID259
Feature activation-3.28e-01

Pos logit contributions
rified+0.042
ow+0.040
 bye+0.038
arrass+0.038
ocked+0.038
Neg logit contributions
-0.024
Ar-0.019
b-0.017
 printed-0.017
 lifelong-0.017
Tokenoc
Token ID420
Feature activation+2.53e+00

Pos logit contributions
+0.011
 printed+0.007
 wonderful+0.007
 lifelong+0.007
b+0.007
Neg logit contributions
rified-0.015
ocked-0.014
ow-0.013
arted-0.013
licks-0.013
Tokeneros
Token ID27498
Feature activation+5.23e-01

Pos logit contributions
rified+0.080
 alone+0.067
ocked+0.066
chair+0.066
ow+0.064
Neg logit contributions
-0.100
Ar-0.088
fl-0.087
b-0.082
Wal-0.082
Token.
Token ID13
Feature activation+4.48e-01

Pos logit contributions
rified+0.025
ocked+0.023
ow+0.022
arrass+0.021
arted+0.021
Neg logit contributions
-0.017
 printed-0.011
 wonderful-0.011
 lifelong-0.010
b-0.010
Token It
Token ID632
Feature activation+1.30e-01

Pos logit contributions
rified+0.019
ocked+0.018
owing+0.018
ly+0.017
ow+0.017
Neg logit contributions
-0.017
Ar-0.012
 printed-0.011
 common-0.011
 two-0.010
Token was
Token ID373
Feature activation+4.48e-01

Pos logit contributions
rified+0.006
ow+0.006
ocked+0.006
licks+0.006
owing+0.005
Neg logit contributions
-0.004
Ar-0.003
 printed-0.003
 lifelong-0.003
 wonderful-0.002
Token mean
Token ID1612
Feature activation+4.48e-01

Pos logit contributions
rified+0.018
ow+0.017
ocked+0.016
licks+0.015
storms+0.015
Neg logit contributions
-0.016
Ar-0.014
 printed-0.013
 lifelong-0.013
 common-0.012
Token and
Token ID290
Feature activation+3.56e-01

Pos logit contributions
rified+0.029
ow+0.028
ocked+0.026
 bye+0.025
arrass+0.025
Neg logit contributions
-0.018
 printed-0.013
Ar-0.012
 wonderful-0.012
 lifelong-0.011
Token blue
Token ID4171
Feature activation+3.34e-01

Pos logit contributions
rified+0.018
ow+0.017
ocked+0.016
owing+0.015
licks+0.015
Neg logit contributions
-0.010
 printed-0.008
Ar-0.008
 wonderful-0.007
 common-0.007
Token.
Token ID13
Feature activation+1.29e-01

Pos logit contributions
rified+0.016
ow+0.015
arrass+0.014
ocked+0.014
 bye+0.014
Neg logit contributions
-0.010
 printed-0.007
 print-0.006
Ar-0.006
 wonderful-0.006
Token They
Token ID1119
Feature activation+1.12e+00

Pos logit contributions
rified+0.005
ow+0.004
ocked+0.004
licks+0.004
storms+0.004
Neg logit contributions
-0.005
Ar-0.004
 printed-0.004
 lifelong-0.004
 common-0.004
Token showed
Token ID3751
Feature activation+1.08e-01

Pos logit contributions
rified+0.045
ow+0.043
ocked+0.040
licks+0.038
storms+0.037
Neg logit contributions
-0.040
Ar-0.034
 lifelong-0.032
 printed-0.032
 common-0.030
Token each
Token ID1123
Feature activation+2.53e+00

Pos logit contributions
rified+0.005
ow+0.005
owing+0.004
storms+0.004
ocked+0.004
Neg logit contributions
-0.004
 printed-0.003
 wonderful-0.003
 lifelong-0.003
Ar-0.003
Token other
Token ID584
Feature activation-2.80e-01

Pos logit contributions
rified+0.100
ow+0.096
ocked+0.093
ly+0.091
 loudly+0.091
Neg logit contributions
-0.080
Ar-0.071
Wal-0.065
 lifelong-0.065
 bisc-0.062
Token their
Token ID511
Feature activation+2.61e-01

Pos logit contributions
+0.009
 printed+0.007
Ar+0.007
 lifelong+0.007
 wonderful+0.007
Neg logit contributions
rified-0.013
ow-0.012
ocked-0.011
storms-0.010
licks-0.010
Token shapes
Token ID15268
Feature activation+5.76e-01

Pos logit contributions
ow+0.010
rified+0.009
ocked+0.009
 bye+0.009
 loudly+0.009
Neg logit contributions
-0.010
Ar-0.008
Wal-0.008
 lifelong-0.007
vy-0.007
Token.
Token ID13
Feature activation-2.32e-02

Pos logit contributions
rified+0.023
ow+0.022
ocked+0.020
licks+0.019
storms+0.019
Neg logit contributions
-0.021
Ar-0.018
 printed-0.017
 lifelong-0.017
 common-0.016
Token
Token ID198
Feature activation+1.07e-02

Pos logit contributions
+0.001
Ar+0.001
 lifelong+0.001
 printed+0.001
 wonderful+0.001
Neg logit contributions
rified-0.001
ow-0.001
ocked-0.001
licks-0.001
storms-0.001
Token wondered
Token ID14028
Feature activation-9.79e-02

Pos logit contributions
rified+0.022
ow+0.021
ocked+0.020
storms+0.019
licks+0.019
Neg logit contributions
-0.023
Ar-0.019
 printed-0.019
 wonderful-0.017
 lifelong-0.017
Token what
Token ID644
Feature activation+3.25e-01

Pos logit contributions
+0.004
 printed+0.003
Ar+0.003
 wonderful+0.003
 lifelong+0.003
Neg logit contributions
rified-0.004
ow-0.003
ocked-0.003
storms-0.003
licks-0.003
Token a
Token ID257
Feature activation-2.52e-02

Pos logit contributions
rified+0.013
ow+0.013
ocked+0.012
storms+0.011
licks+0.011
Neg logit contributions
-0.011
Ar-0.009
 printed-0.009
 lifelong-0.009
Wal-0.008
Token rh
Token ID9529
Feature activation+7.48e-01

Pos logit contributions
+0.001
 lifelong+0.001
Wal+0.001
Ar+0.001
vy+0.001
Neg logit contributions
rified-0.001
 bye-0.001
ow-0.001
 loudly-0.001
arrass-0.001
Tokenin
Token ID259
Feature activation-6.38e-01

Pos logit contributions
rified+0.033
ow+0.031
ocked+0.030
arrass+0.028
storms+0.028
Neg logit contributions
-0.024
Ar-0.020
 printed-0.019
 lifelong-0.018
 common-0.017
Tokenoc
Token ID420
Feature activation+2.52e+00

Pos logit contributions
+0.022
 lifelong+0.019
Ar+0.019
Wal+0.018
 common+0.017
Neg logit contributions
rified-0.025
ow-0.025
ly-0.022
arrass-0.021
 bye-0.021
Tokeneros
Token ID27498
Feature activation+2.89e-01

Pos logit contributions
rified+0.091
ocked+0.082
ow+0.076
 alone+0.076
storms+0.075
Neg logit contributions
-0.087
Ar-0.083
 bisc-0.076
 printed-0.076
fl-0.075
Token was
Token ID373
Feature activation-5.68e-01

Pos logit contributions
rified+0.011
ow+0.011
ocked+0.010
ly+0.010
storms+0.010
Neg logit contributions
-0.010
Ar-0.009
 lifelong-0.009
Wal-0.008
 common-0.008
Token and
Token ID290
Feature activation-1.28e-01

Pos logit contributions
+0.023
Ar+0.022
 lifelong+0.021
Wal+0.019
 common+0.018
Neg logit contributions
ow-0.019
rified-0.018
owing-0.017
ly-0.016
ocked-0.016
Token where
Token ID810
Feature activation-2.34e-01

Pos logit contributions
+0.005
Ar+0.005
 printed+0.004
 lifelong+0.004
 common+0.004
Neg logit contributions
rified-0.004
ow-0.004
ocked-0.004
licks-0.004
storms-0.003
Token it
Token ID340
Feature activation+1.38e-01

Pos logit contributions
+0.008
Ar+0.007
 printed+0.007
 lifelong+0.006
 wonderful+0.006
Neg logit contributions
rified-0.010
ow-0.009
ocked-0.009
licks-0.008
owing-0.008
Token It
Token ID632
Feature activation+1.88e-01

Pos logit contributions
+0.003
Ar+0.003
 lifelong+0.003
 printed+0.003
 wonderful+0.003
Neg logit contributions
rified-0.002
ow-0.002
ocked-0.002
licks-0.002
storms-0.002
Token is
Token ID318
Feature activation+4.92e-01

Pos logit contributions
rified+0.007
ow+0.007
licks+0.006
ocked+0.006
 loudly+0.006
Neg logit contributions
-0.007
 lifelong-0.006
Ar-0.006
Wal-0.006
 printed-0.005
Token a
Token ID257
Feature activation-4.04e-02

Pos logit contributions
ow+0.019
rified+0.019
ocked+0.018
 loudly+0.017
licks+0.017
Neg logit contributions
-0.017
Ar-0.015
 lifelong-0.014
 printed-0.013
Wal-0.013
Token rh
Token ID9529
Feature activation+6.63e-01

Pos logit contributions
+0.002
 lifelong+0.001
Ar+0.001
vy+0.001
Wal+0.001
Neg logit contributions
rified-0.001
 bye-0.001
 loudly-0.001
ow-0.001
 mood-0.001
Tokenin
Token ID259
Feature activation-8.97e-01

Pos logit contributions
rified+0.030
ow+0.029
ocked+0.027
arrass+0.026
storms+0.026
Neg logit contributions
-0.020
Ar-0.017
 printed-0.015
 lifelong-0.015
 wonderful-0.014
Tokenoc
Token ID420
Feature activation+2.51e+00

Pos logit contributions
+0.031
 lifelong+0.027
Ar+0.026
Wal+0.024
 printed+0.024
Neg logit contributions
rified-0.036
ow-0.035
ly-0.032
 loudly-0.032
 bye-0.031
Tokeneros
Token ID27498
Feature activation+1.57e-01

Pos logit contributions
rified+0.093
ocked+0.086
 lately+0.085
outed+0.084
licks+0.082
Neg logit contributions
-0.074
Ar-0.073
 bisc-0.070
 extingu-0.067
fl-0.067
Token.
Token ID13
Feature activation+5.00e-02

Pos logit contributions
rified+0.006
ow+0.006
ocked+0.006
licks+0.005
storms+0.005
Neg logit contributions
-0.006
Ar-0.005
 printed-0.004
 lifelong-0.004
 common-0.004
Token It
Token ID632
Feature activation+3.51e-01

Pos logit contributions
rified+0.001
ow+0.001
ocked+0.001
licks+0.001
 bye+0.001
Neg logit contributions
-0.002
Ar-0.002
 lifelong-0.002
 printed-0.002
 wonderful-0.002
Token is
Token ID318
Feature activation+5.60e-01

Pos logit contributions
rified+0.013
ow+0.013
licks+0.012
 loudly+0.012
ocked+0.011
Neg logit contributions
-0.012
 lifelong-0.011
Ar-0.011
Wal-0.010
 printed-0.010
Token bright
Token ID6016
Feature activation+3.59e-01

Pos logit contributions
ow+0.021
rified+0.020
ocked+0.019
 loudly+0.019
licks+0.018
Neg logit contributions
-0.021
Ar-0.018
 lifelong-0.017
 printed-0.016
Wal-0.016
Token had
Token ID550
Feature activation-1.03e+00

Pos logit contributions
rified+0.002
ow+0.002
ocked+0.002
storms+0.002
 loudly+0.002
Neg logit contributions
-0.002
 lifelong-0.002
Ar-0.002
 common-0.002
Wal-0.002
Token a
Token ID257
Feature activation-9.95e-01

Pos logit contributions
+0.031
Ar+0.025
 lifelong+0.025
 printed+0.023
Wal+0.023
Neg logit contributions
rified-0.044
ow-0.044
ocked-0.041
 loudly-0.039
storms-0.038
Token big
Token ID1263
Feature activation-4.42e-01

Pos logit contributions
+0.039
Ar+0.029
vy+0.029
Wal+0.028
b+0.027
Neg logit contributions
 bye-0.033
 mood-0.032
 loudly-0.031
ow-0.031
 stomach-0.030
Token rh
Token ID9529
Feature activation+6.26e-01

Pos logit contributions
+0.017
Ar+0.014
 lifelong+0.013
 printed+0.013
Wal+0.013
Neg logit contributions
rified-0.017
ow-0.017
 bye-0.015
ocked-0.014
 loudly-0.014
Tokenin
Token ID259
Feature activation-9.87e-01

Pos logit contributions
rified+0.029
ow+0.028
ocked+0.026
arrass+0.026
storms+0.025
Neg logit contributions
-0.018
Ar-0.015
 printed-0.014
 lifelong-0.013
 wonderful-0.013
Tokenoc
Token ID420
Feature activation+2.50e+00

Pos logit contributions
+0.036
 lifelong+0.030
Ar+0.029
 printed+0.028
 common+0.028
Neg logit contributions
ow-0.038
rified-0.038
 bye-0.034
ly-0.034
 loudly-0.033
Tokeneros
Token ID27498
Feature activation+4.28e-01

Pos logit contributions
rified+0.094
ocked+0.088
licks+0.082
storms+0.082
 lately+0.081
Neg logit contributions
Ar-0.076
-0.073
 bisc-0.069
 extingu-0.067
fl-0.066
Token that
Token ID326
Feature activation-4.72e-01

Pos logit contributions
rified+0.015
ow+0.014
ocked+0.013
licks+0.012
ly+0.012
Neg logit contributions
-0.018
Ar-0.015
 lifelong-0.015
 printed-0.014
 common-0.014
Token he
Token ID339
Feature activation-2.80e-01

Pos logit contributions
+0.018
Ar+0.015
 printed+0.015
 lifelong+0.014
 wonderful+0.014
Neg logit contributions
rified-0.018
ow-0.017
ocked-0.016
owing-0.015
licks-0.015
Token liked
Token ID8288
Feature activation+1.32e-01

Pos logit contributions
+0.011
Ar+0.009
 lifelong+0.009
 printed+0.009
Wal+0.008
Neg logit contributions
rified-0.010
ow-0.010
ocked-0.009
licks-0.008
storms-0.008
Token to
Token ID284
Feature activation+4.07e-01

Pos logit contributions
rified+0.006
ow+0.005
ocked+0.005
arrass+0.005
owing+0.005
Neg logit contributions
-0.005
Ar-0.004
 printed-0.004
 wonderful-0.003
 lifelong-0.003
Token picture
Token ID4286
Feature activation+4.32e-02

Pos logit contributions
+0.017
Ar+0.015
 lifelong+0.015
Wal+0.014
vy+0.014
Neg logit contributions
rified-0.017
 loudly-0.017
ow-0.016
 bye-0.016
ocked-0.015
Token of
Token ID286
Feature activation-1.88e-01

Pos logit contributions
rified+0.002
ow+0.002
ocked+0.002
arrass+0.002
licks+0.002
Neg logit contributions
-0.002
 printed-0.001
Ar-0.001
 lifelong-0.001
 bisc-0.001
Token a
Token ID257
Feature activation-2.78e-01

Pos logit contributions
+0.006
Ar+0.005
 lifelong+0.005
 printed+0.004
vy+0.004
Neg logit contributions
rified-0.008
ow-0.008
ocked-0.007
 loudly-0.007
 bye-0.007
Token rh
Token ID9529
Feature activation+3.85e-01

Pos logit contributions
+0.011
Ar+0.009
 lifelong+0.009
Wal+0.009
vy+0.009
Neg logit contributions
rified-0.009
 loudly-0.008
 bye-0.008
chair-0.008
ow-0.007
Tokenin
Token ID259
Feature activation-6.03e-01

Pos logit contributions
rified+0.017
ow+0.016
ocked+0.015
arrass+0.014
licks+0.014
Neg logit contributions
-0.013
Ar-0.010
 printed-0.010
 lifelong-0.010
 common-0.009
Tokenoc
Token ID420
Feature activation+2.49e+00

Pos logit contributions
+0.022
 lifelong+0.020
Ar+0.019
Wal+0.018
 wonderful+0.018
Neg logit contributions
rified-0.023
ow-0.022
ly-0.020
 bye-0.020
chair-0.020
Tokeneros
Token ID27498
Feature activation+1.28e-01

Pos logit contributions
rified+0.089
ocked+0.079
ow+0.075
 lately+0.073
licks+0.073
Neg logit contributions
-0.090
Ar-0.084
 bisc-0.079
 printed-0.078
 print-0.075
Token on
Token ID319
Feature activation-4.39e-01

Pos logit contributions
rified+0.005
ow+0.005
ocked+0.004
owing+0.004
 loudly+0.004
Neg logit contributions
-0.005
Ar-0.005
 lifelong-0.004
 common-0.004
Wal-0.004
Token it
Token ID340
Feature activation+2.37e-01

Pos logit contributions
+0.015
Ar+0.012
 printed+0.011
 common+0.011
 lifelong+0.010
Neg logit contributions
rified-0.019
ow-0.019
ocked-0.018
owing-0.017
licks-0.016
Token.
Token ID13
Feature activation+2.64e-01

Pos logit contributions
rified+0.010
ow+0.010
ocked+0.009
licks+0.008
arrass+0.008
Neg logit contributions
-0.008
Ar-0.006
 printed-0.006
 lifelong-0.006
 common-0.006
Token They
Token ID1119
Feature activation+5.83e-01

Pos logit contributions
rified+0.010
ow+0.010
owing+0.010
ocked+0.009
 unatt+0.009
Neg logit contributions
-0.011
Ar-0.008
 printed-0.008
 common-0.007
 lifelong-0.007
Tokens
Token ID82
Feature activation-3.32e-01

Pos logit contributions
rified+0.003
ow+0.002
ocked+0.002
arrass+0.002
ly+0.002
Neg logit contributions
-0.002
Ar-0.002
 printed-0.002
 lifelong-0.002
 common-0.002
Token wrong
Token ID2642
Feature activation-9.53e-01

Pos logit contributions
+0.011
Ar+0.009
 printed+0.009
 lifelong+0.008
 wonderful+0.008
Neg logit contributions
rified-0.014
ow-0.014
ocked-0.013
licks-0.012
storms-0.012
Token,
Token ID11
Feature activation-5.31e-01

Pos logit contributions
+0.032
Ar+0.029
 lifelong+0.027
 printed+0.027
Wal+0.025
Neg logit contributions
rified-0.038
ow-0.037
ocked-0.034
 loudly-0.033
 bye-0.032
Token rh
Token ID9529
Feature activation+5.21e-01

Pos logit contributions
+0.021
Ar+0.021
 printed+0.019
 lifelong+0.019
Wal+0.018
Neg logit contributions
rified-0.017
ow-0.016
 bye-0.015
ocked-0.014
 loudly-0.014
Tokenin
Token ID259
Feature activation-1.40e+00

Pos logit contributions
rified+0.026
ow+0.025
arrass+0.023
ocked+0.023
storms+0.023
Neg logit contributions
-0.013
Ar-0.010
b-0.009
 printed-0.009
 wonderful-0.009
Tokenoc
Token ID420
Feature activation+2.49e+00

Pos logit contributions
+0.044
Ar+0.039
 lifelong+0.037
 printed+0.035
Wal+0.034
Neg logit contributions
rified-0.060
ow-0.059
ocked-0.053
 loudly-0.051
 bye-0.051
Tokeneros
Token ID27498
Feature activation-8.48e-01

Pos logit contributions
rified+0.082
ocked+0.073
ow+0.071
 mood+0.070
 spine+0.070
Neg logit contributions
-0.094
fl-0.081
Ar-0.080
b-0.078
 printed-0.073
Token?
Token ID30
Feature activation-1.70e-01

Pos logit contributions
+0.025
Ar+0.021
 printed+0.019
 lifelong+0.019
 common+0.018
Neg logit contributions
rified-0.039
ow-0.038
ocked-0.035
licks-0.034
storms-0.033
Tokenâ
Token ID22940
Feature activation-3.33e-01

Pos logit contributions
+0.007
Ar+0.006
 printed+0.006
 lifelong+0.006
 wonderful+0.006
Neg logit contributions
rified-0.006
ow-0.005
licks-0.004
ocked-0.004
storms-0.004
Token
Token ID26391
Feature activation-5.72e-01

Pos logit contributions
+0.012
Ar+0.010
 printed+0.009
 lifelong+0.009
 common+0.009
Neg logit contributions
rified-0.014
ow-0.013
ocked-0.012
licks-0.011
storms-0.011
Token said
Token ID531
Feature activation+2.26e-01

Pos logit contributions
+0.017
Ar+0.013
 printed+0.012
 lifelong+0.012
 common+0.011
Neg logit contributions
rified-0.027
ow-0.026
ocked-0.024
licks-0.023
ly-0.023
Token.
Token ID13
Feature activation+4.47e-01

Pos logit contributions
rified+0.016
ow+0.015
ocked+0.014
licks+0.013
storms+0.013
Neg logit contributions
-0.013
Ar-0.010
 printed-0.010
 lifelong-0.010
 wonderful-0.009
Token Can
Token ID1680
Feature activation-6.56e-01

Pos logit contributions
ocked+0.018
rified+0.018
ow+0.017
ly+0.016
owing+0.016
Neg logit contributions
-0.017
Ar-0.013
 bisc-0.012
 printed-0.012
 lifelong-0.011
Token we
Token ID356
Feature activation-4.63e-01

Pos logit contributions
+0.019
Ar+0.013
 printed+0.012
 lifelong+0.011
 wonderful+0.011
Neg logit contributions
rified-0.033
ow-0.031
ocked-0.030
licks-0.029
owing-0.029
Token own
Token ID898
Feature activation-8.45e-01

Pos logit contributions
+0.017
Ar+0.016
 lifelong+0.015
 printed+0.014
Wal+0.014
Neg logit contributions
ow-0.018
rified-0.017
ocked-0.015
 bye-0.015
owing-0.014
Token the
Token ID262
Feature activation-1.37e-01

Pos logit contributions
+0.030
Ar+0.025
 printed+0.024
 lifelong+0.023
 common+0.022
Neg logit contributions
rified-0.035
ow-0.033
ocked-0.031
licks-0.029
storms-0.029
Token rain
Token ID6290
Feature activation-3.44e+00

Pos logit contributions
+0.006
Ar+0.005
 printed+0.005
 lifelong+0.005
 wonderful+0.005
Neg logit contributions
rified-0.005
ow-0.004
ocked-0.004
licks-0.004
ly-0.004
Token,
Token ID11
Feature activation-1.12e+00

Pos logit contributions
+0.126
Ar+0.107
 printed+0.100
 lifelong+0.099
 common+0.095
Neg logit contributions
rified-0.136
ow-0.130
ocked-0.121
licks-0.113
storms-0.112
Token Mom
Token ID11254
Feature activation-1.48e+00

Pos logit contributions
+0.052
Ar+0.046
 printed+0.044
 lifelong+0.043
 common+0.042
Neg logit contributions
rified-0.033
ow-0.031
ocked-0.028
 bye-0.026
licks-0.025
Token?"
Token ID1701
Feature activation+3.40e-03

Pos logit contributions
+0.050
Ar+0.047
 lifelong+0.043
Wal+0.041
 common+0.040
Neg logit contributions
ow-0.060
rified-0.056
ocked-0.053
owing-0.051
ly-0.051
Token Mia
Token ID32189
Feature activation-2.42e-01

Pos logit contributions
rified+0.000
ow+0.000
 bye+0.000
ocked+0.000
 loudly+0.000
Neg logit contributions
-0.000
Ar-0.000
 printed-0.000
 lifelong-0.000
 wonderful-0.000
Token asks
Token ID7893
Feature activation-7.05e-01

Pos logit contributions
Ar+0.009
+0.009
 lifelong+0.009
 printed+0.009
 common+0.009
Neg logit contributions
rified-0.008
ow-0.008
ocked-0.007
licks-0.007
 loudly-0.007
Token.
Token ID13
Feature activation-4.97e-02

Pos logit contributions
+0.011
Ar+0.009
 lifelong+0.009
 printed+0.008
Wal+0.008
Neg logit contributions
rified-0.010
ow-0.010
ocked-0.009
ly-0.009
storms-0.008
Token Can
Token ID1680
Feature activation-6.25e-01

Pos logit contributions
+0.002
b+0.002
Ar+0.002
vy+0.002
 lifelong+0.002
Neg logit contributions
rified-0.002
ow-0.002
 bye-0.002
ocked-0.002
storms-0.001
Token you
Token ID345
Feature activation-9.15e-01

Pos logit contributions
+0.021
Ar+0.017
 printed+0.016
 lifelong+0.015
 wonderful+0.015
Neg logit contributions
rified-0.027
ow-0.026
ocked-0.025
licks-0.023
owing-0.023
Token see
Token ID766
Feature activation-5.77e-01

Pos logit contributions
+0.034
Ar+0.032
 lifelong+0.030
Wal+0.028
 bisc+0.028
Neg logit contributions
ow-0.035
rified-0.030
 bye-0.028
ocked-0.027
 loudly-0.027
Token the
Token ID262
Feature activation-1.08e-02

Pos logit contributions
+0.021
Ar+0.018
 printed+0.017
 lifelong+0.016
 common+0.016
Neg logit contributions
rified-0.023
ow-0.022
ocked-0.021
licks-0.019
storms-0.019
Token rain
Token ID6290
Feature activation-3.31e+00

Pos logit contributions
+0.000
Ar+0.000
 lifelong+0.000
 printed+0.000
vy+0.000
Neg logit contributions
rified-0.000
ow-0.000
 bye-0.000
 mood-0.000
 loudly-0.000
Token that
Token ID326
Feature activation-6.11e-01

Pos logit contributions
+0.124
Ar+0.104
 printed+0.099
 lifelong+0.097
 common+0.093
Neg logit contributions
rified-0.130
ow-0.123
ocked-0.115
licks-0.107
storms-0.106
Token way
Token ID835
Feature activation+5.93e-01

Pos logit contributions
+0.020
 printed+0.016
Ar+0.015
 lifelong+0.015
 wonderful+0.015
Neg logit contributions
rified-0.028
ow-0.026
ocked-0.025
arrass-0.024
licks-0.023
Token,
Token ID11
Feature activation-9.13e-01

Pos logit contributions
rified+0.022
ow+0.022
ocked+0.020
ly+0.019
licks+0.018
Neg logit contributions
-0.022
Ar-0.020
 lifelong-0.018
 printed-0.018
Wal-0.018
Token Lily
Token ID20037
Feature activation-1.65e+00

Pos logit contributions
+0.039
Ar+0.035
 lifelong+0.032
 printed+0.032
Wal+0.031
Neg logit contributions
rified-0.029
ow-0.029
 bye-0.026
ocked-0.025
 loudly-0.024
Token and
Token ID290
Feature activation+1.46e-01

Pos logit contributions
+0.053
Ar+0.043
 printed+0.041
 lifelong+0.040
 common+0.038
Neg logit contributions
rified-0.073
ow-0.069
ocked-0.065
licks-0.062
storms-0.061
Token Was
Token ID8920
Feature activation-1.81e-01

Pos logit contributions
+0.007
Ar+0.006
 printed+0.006
 lifelong+0.006
 common+0.005
Neg logit contributions
rified-0.006
ow-0.006
ocked-0.006
ly-0.005
owing-0.005
Token it
Token ID340
Feature activation-1.25e+00

Pos logit contributions
+0.007
 lifelong+0.007
Ar+0.007
 bisc+0.007
 printed+0.007
Neg logit contributions
ow-0.006
rified-0.005
ocked-0.005
 anyone-0.005
licks-0.005
Token to
Token ID284
Feature activation+8.85e-02

Pos logit contributions
Ar+0.045
+0.043
 bisc+0.038
Wal+0.037
 lifelong+0.037
Neg logit contributions
ow-0.048
rified-0.042
 loudly-0.040
ly-0.040
 alone-0.039
Token keep
Token ID1394
Feature activation-7.17e-01

Pos logit contributions
ow+0.004
rified+0.003
ocked+0.003
 bye+0.003
 loudly+0.003
Neg logit contributions
-0.003
Ar-0.002
 lifelong-0.002
 printed-0.002
 bisc-0.002
Token the
Token ID262
Feature activation+8.24e-03

Pos logit contributions
+0.029
Ar+0.026
 lifelong+0.024
 bisc+0.024
 printed+0.023
Neg logit contributions
ow-0.024
rified-0.024
ocked-0.021
 loudly-0.021
ly-0.020
Token rain
Token ID6290
Feature activation-3.24e+00

Pos logit contributions
rified+0.000
ow+0.000
ocked+0.000
ly+0.000
 loudly+0.000
Neg logit contributions
-0.000
Ar-0.000
 printed-0.000
 lifelong-0.000
 common-0.000
Token away
Token ID1497
Feature activation-1.05e+00

Pos logit contributions
+0.117
Ar+0.107
 printed+0.096
 lifelong+0.096
Wal+0.095
Neg logit contributions
ow-0.122
rified-0.119
ocked-0.108
owing-0.105
ly-0.104
Token?
Token ID30
Feature activation-2.48e-01

Pos logit contributions
+0.040
Ar+0.039
 printed+0.035
 bisc+0.035
 lifelong+0.035
Neg logit contributions
rified-0.035
ow-0.035
 lately-0.031
 loudly-0.031
ly-0.031
Token Or
Token ID1471
Feature activation-1.02e-01

Pos logit contributions
+0.009
Ar+0.008
 printed+0.008
 lifelong+0.008
 common+0.008
Neg logit contributions
rified-0.009
ow-0.009
ocked-0.008
licks-0.008
storms-0.008
Token was
Token ID373
Feature activation+5.30e-01

Pos logit contributions
+0.004
Ar+0.004
 printed+0.004
 lifelong+0.004
 common+0.003
Neg logit contributions
rified-0.003
ow-0.003
ocked-0.003
licks-0.003
ly-0.003
Token it
Token ID340
Feature activation-3.21e-01

Pos logit contributions
rified+0.019
ow+0.019
ocked+0.017
licks+0.016
 loudly+0.016
Neg logit contributions
-0.020
Ar-0.018
 lifelong-0.017
 printed-0.017
 bisc-0.016
Token the
Token ID262
Feature activation+2.24e-01

Pos logit contributions
+0.019
Ar+0.017
 lifelong+0.016
 printed+0.015
 bisc+0.015
Neg logit contributions
ow-0.016
rified-0.014
ocked-0.013
 alone-0.013
 bye-0.012
Token window
Token ID4324
Feature activation-1.53e+00

Pos logit contributions
rified+0.009
ow+0.008
 mood+0.007
ocked+0.007
 bye+0.007
Neg logit contributions
-0.008
Ar-0.007
 lifelong-0.007
 common-0.006
 printed-0.006
Token to
Token ID284
Feature activation-4.54e-01

Pos logit contributions
+0.046
 printed+0.033
Ar+0.033
 lifelong+0.032
 common+0.030
Neg logit contributions
rified-0.074
ow-0.067
ocked-0.066
arrass-0.063
licks-0.063
Token watch
Token ID2342
Feature activation-6.12e-01

Pos logit contributions
+0.016
Ar+0.014
 lifelong+0.013
 printed+0.013
Wal+0.012
Neg logit contributions
ow-0.018
rified-0.018
ocked-0.016
licks-0.015
 bye-0.015
Token the
Token ID262
Feature activation-4.40e-02

Pos logit contributions
+0.020
Ar+0.016
 printed+0.015
 lifelong+0.015
 wonderful+0.014
Neg logit contributions
rified-0.027
ow-0.025
ocked-0.024
arrass-0.023
licks-0.023
Token rain
Token ID6290
Feature activation-3.19e+00

Pos logit contributions
+0.002
Ar+0.001
 lifelong+0.001
 printed+0.001
 common+0.001
Neg logit contributions
rified-0.002
ow-0.002
ocked-0.001
ly-0.001
storms-0.001
Token?"
Token ID1701
Feature activation+2.49e-01

Pos logit contributions
+0.106
Ar+0.088
 printed+0.082
 lifelong+0.081
 common+0.077
Neg logit contributions
rified-0.137
ow-0.132
ocked-0.123
licks-0.116
storms-0.116
Token
Token ID220
Feature activation+3.24e-01

Pos logit contributions
rified+0.010
ow+0.010
ocked+0.009
licks+0.009
ly+0.009
Neg logit contributions
-0.009
Ar-0.007
 printed-0.007
 lifelong-0.007
Wal-0.006
Token
Token ID198
Feature activation+4.76e-01

Pos logit contributions
rified+0.015
ow+0.015
ocked+0.014
licks+0.013
storms+0.013
Neg logit contributions
-0.009
Ar-0.008
 printed-0.007
 lifelong-0.007
 common-0.006
Token
Token ID198
Feature activation+7.26e-01

Pos logit contributions
ow+0.020
rified+0.019
ocked+0.018
 bye+0.017
 loudly+0.017
Neg logit contributions
-0.014
 lifelong-0.012
Ar-0.012
 printed-0.012
 wonderful-0.011
TokenHer
Token ID9360
Feature activation+2.13e-01

Pos logit contributions
rified+0.028
ow+0.028
ocked+0.026
 loudly+0.024
 bye+0.024
Neg logit contributions
-0.024
Ar-0.021
 lifelong-0.021
 printed-0.020
 bisc-0.020
Token can
Token ID460
Feature activation-3.94e-01

Pos logit contributions
rified+0.015
ow+0.014
ocked+0.013
licks+0.012
ly+0.012
Neg logit contributions
-0.015
Ar-0.013
 printed-0.012
 lifelong-0.012
 common-0.012
Token I
Token ID314
Feature activation-2.61e-01

Pos logit contributions
+0.017
Ar+0.014
 printed+0.014
 lifelong+0.013
 common+0.013
Neg logit contributions
rified-0.014
ow-0.013
ocked-0.012
licks-0.011
storms-0.011
Token be
Token ID307
Feature activation-9.10e-02

Pos logit contributions
Ar+0.010
 lifelong+0.010
Wal+0.010
b+0.009
 strips+0.009
Neg logit contributions
ow-0.010
 bye-0.007
 anymore-0.007
 mood-0.007
owing-0.007
Token like
Token ID588
Feature activation+1.17e-01

Pos logit contributions
+0.004
Ar+0.004
 bisc+0.003
 common+0.003
Wal+0.003
Neg logit contributions
ow-0.003
 alone-0.003
 loudly-0.003
ocked-0.003
rified-0.002
Token the
Token ID262
Feature activation+5.14e-02

Pos logit contributions
rified+0.004
ow+0.004
ocked+0.004
 bye+0.004
 loudly+0.004
Neg logit contributions
-0.004
Ar-0.004
 lifelong-0.004
vy-0.004
 printed-0.004
Token rain
Token ID6290
Feature activation-3.02e+00

Pos logit contributions
rified+0.002
ow+0.002
 mood+0.001
 alone+0.001
 bye+0.001
Neg logit contributions
-0.002
Ar-0.002
 lifelong-0.002
Wal-0.002
 common-0.002
Token and
Token ID290
Feature activation+2.05e-01

Pos logit contributions
+0.114
Ar+0.094
 printed+0.091
 lifelong+0.088
 wonderful+0.085
Neg logit contributions
rified-0.119
ow-0.111
ocked-0.104
licks-0.099
arrass-0.097
Token make
Token ID787
Feature activation-6.02e-01

Pos logit contributions
rified+0.008
ow+0.008
ocked+0.007
licks+0.007
storms+0.007
Neg logit contributions
-0.007
Ar-0.006
 printed-0.006
 lifelong-0.006
 common-0.005
Token sure
Token ID1654
Feature activation-3.83e-01

Pos logit contributions
+0.021
Ar+0.019
 lifelong+0.017
 printed+0.017
Wal+0.017
Neg logit contributions
ow-0.024
rified-0.023
ocked-0.022
 loudly-0.021
 bye-0.020
Token to
Token ID284
Feature activation+9.36e-02

Pos logit contributions
+0.015
 printed+0.011
 wonderful+0.011
Ar+0.011
 lifelong+0.011
Neg logit contributions
rified-0.016
ow-0.014
ocked-0.014
licks-0.013
arrass-0.013
Token use
Token ID779
Feature activation-3.40e-01

Pos logit contributions
ow+0.004
rified+0.003
ocked+0.003
 bye+0.003
 loudly+0.003
Neg logit contributions
-0.003
Ar-0.003
 lifelong-0.003
 printed-0.003
Wal-0.003
Token "
Token ID366
Feature activation+3.23e-01

Pos logit contributions
rified+0.003
ow+0.003
owing+0.003
ocked+0.003
 unatt+0.003
Neg logit contributions
-0.003
Ar-0.002
 printed-0.002
 common-0.002
 lifelong-0.002
TokenDo
Token ID5211
Feature activation-4.01e-01

Pos logit contributions
rified+0.014
ow+0.013
ocked+0.012
licks+0.012
storms+0.012
Neg logit contributions
-0.010
Ar-0.008
 printed-0.008
 lifelong-0.008
 wonderful-0.008
Token you
Token ID345
Feature activation-4.54e-02

Pos logit contributions
+0.016
Ar+0.013
 printed+0.012
 wonderful+0.012
 lifelong+0.012
Neg logit contributions
rified-0.016
ow-0.014
ocked-0.014
licks-0.013
owing-0.013
Token like
Token ID588
Feature activation+4.04e-01

Pos logit contributions
+0.002
Ar+0.002
 lifelong+0.002
 printed+0.002
Wal+0.002
Neg logit contributions
ow-0.001
rified-0.001
ocked-0.001
owing-0.001
 bye-0.001
Token the
Token ID262
Feature activation+5.47e-02

Pos logit contributions
ow+0.014
rified+0.014
ocked+0.013
licks+0.012
 bye+0.012
Neg logit contributions
-0.016
Ar-0.014
 lifelong-0.014
 printed-0.013
Wal-0.013
Token rain
Token ID6290
Feature activation-2.97e+00

Pos logit contributions
rified+0.002
ow+0.002
 bye+0.002
ocked+0.002
 mood+0.002
Neg logit contributions
-0.002
Ar-0.002
 lifelong-0.002
Wal-0.002
 printed-0.002
Token?"
Token ID1701
Feature activation+4.64e-02

Pos logit contributions
+0.098
Ar+0.082
 lifelong+0.075
 printed+0.075
Wal+0.072
Neg logit contributions
rified-0.128
ow-0.125
ocked-0.117
licks-0.109
storms-0.109
Token
Token ID198
Feature activation+1.51e-01

Pos logit contributions
rified+0.002
ow+0.002
ocked+0.002
 loudly+0.002
licks+0.002
Neg logit contributions
-0.001
Ar-0.001
 printed-0.001
 lifelong-0.001
 bisc-0.001
Token
Token ID198
Feature activation+5.46e-01

Pos logit contributions
rified+0.006
ow+0.006
ocked+0.006
 loudly+0.006
licks+0.006
Neg logit contributions
-0.004
 printed-0.004
 lifelong-0.004
 bisc-0.004
Ar-0.003
Token"
Token ID1
Feature activation+2.07e-01

Pos logit contributions
rified+0.025
ow+0.024
ocked+0.022
licks+0.021
ly+0.021
Neg logit contributions
-0.017
Ar-0.014
 printed-0.013
 lifelong-0.013
 common-0.012
TokenYes
Token ID5297
Feature activation+7.82e-01

Pos logit contributions
rified+0.006
ow+0.005
ocked+0.005
 bye+0.005
Fine+0.005
Neg logit contributions
-0.009
 lifelong-0.008
 printed-0.008
Wal-0.008
Ar-0.008
Token I
Token ID314
Feature activation+1.76e-01

Pos logit contributions
rified+0.013
ow+0.012
ocked+0.011
licks+0.010
storms+0.010
Neg logit contributions
-0.017
Ar-0.014
 printed-0.014
 lifelong-0.013
 wonderful-0.013
Token go
Token ID467
Feature activation+4.39e-01

Pos logit contributions
ow+0.006
rified+0.005
ly+0.005
 mood+0.005
owing+0.004
Neg logit contributions
Ar-0.007
 lifelong-0.007
Wal-0.007
b-0.006
fl-0.006
Token play
Token ID711
Feature activation-7.00e-01

Pos logit contributions
ow+0.013
ly+0.011
 alone+0.010
owing+0.010
ocked+0.010
Neg logit contributions
Ar-0.020
Wal-0.018
 lifelong-0.018
-0.018
 common-0.017
Token in
Token ID287
Feature activation-5.82e-01

Pos logit contributions
+0.027
Ar+0.026
 lifelong+0.024
Wal+0.023
 printed+0.023
Neg logit contributions
rified-0.024
ow-0.023
ocked-0.021
 loudly-0.020
ly-0.020
Token the
Token ID262
Feature activation+1.98e-01

Pos logit contributions
Ar+0.025
+0.023
 lifelong+0.022
 bisc+0.022
 printed+0.022
Neg logit contributions
ow-0.016
rified-0.016
 alone-0.014
 anymore-0.014
 loudly-0.014
Token rain
Token ID6290
Feature activation-2.95e+00

Pos logit contributions
rified+0.007
 mood+0.007
ow+0.007
 alone+0.007
 loudly+0.006
Neg logit contributions
-0.007
Ar-0.007
 common-0.006
 lifelong-0.006
Wal-0.006
Token?'
Token ID8348
Feature activation+1.28e-02

Pos logit contributions
+0.099
Ar+0.078
 printed+0.076
 lifelong+0.073
 common+0.070
Neg logit contributions
rified-0.128
ow-0.121
ocked-0.114
licks-0.108
storms-0.107
Token
Token ID198
Feature activation-3.26e-02

Pos logit contributions
rified+0.000
ow+0.000
ocked+0.000
ly+0.000
 loudly+0.000
Neg logit contributions
-0.001
Ar-0.000
 printed-0.000
 lifelong-0.000
 common-0.000
Token
Token ID198
Feature activation+1.94e-01

Pos logit contributions
+0.001
 printed+0.001
Ar+0.001
 lifelong+0.001
 bisc+0.001
Neg logit contributions
rified-0.001
ow-0.001
ocked-0.001
 loudly-0.001
 bye-0.001
TokenHis
Token ID6653
Feature activation+6.19e-01

Pos logit contributions
rified+0.007
ow+0.007
ocked+0.006
 loudly+0.006
arrass+0.006
Neg logit contributions
-0.007
Ar-0.006
 printed-0.006
 lifelong-0.006
 bisc-0.006
Token mum
Token ID25682
Feature activation+6.11e-01

Pos logit contributions
ow+0.020
rified+0.019
ocked+0.018
 bye+0.018
 mood+0.017
Neg logit contributions
-0.026
Ar-0.024
 printed-0.022
 lifelong-0.021
Wal-0.021
Token want
Token ID765
Feature activation-9.09e-02

Pos logit contributions
+0.014
Ar+0.013
 lifelong+0.013
 printed+0.012
 bisc+0.012
Neg logit contributions
ow-0.011
rified-0.011
owing-0.009
ocked-0.009
ly-0.009
Token to
Token ID284
Feature activation+3.74e-01

Pos logit contributions
+0.004
Ar+0.004
 lifelong+0.003
 printed+0.003
Wal+0.003
Neg logit contributions
rified-0.003
ow-0.003
ocked-0.002
ly-0.002
 bye-0.002
Token play
Token ID711
Feature activation-2.21e-01

Pos logit contributions
ow+0.014
 bye+0.012
rified+0.012
 anymore+0.011
 mood+0.011
Neg logit contributions
-0.014
Ar-0.013
 lifelong-0.012
 printed-0.012
Wal-0.012
Token in
Token ID287
Feature activation-3.39e-01

Pos logit contributions
+0.009
Ar+0.008
 printed+0.008
 lifelong+0.007
Wal+0.007
Neg logit contributions
rified-0.008
ow-0.007
ocked-0.007
licks-0.006
 loudly-0.006
Token the
Token ID262
Feature activation-1.84e-01

Pos logit contributions
+0.013
Ar+0.011
 printed+0.011
 lifelong+0.011
 bisc+0.010
Neg logit contributions
rified-0.013
ow-0.012
ocked-0.011
licks-0.010
 loudly-0.010
Token rain
Token ID6290
Feature activation-2.91e+00

Pos logit contributions
+0.006
Ar+0.006
 printed+0.005
 lifelong+0.005
 common+0.005
Neg logit contributions
rified-0.007
ow-0.007
ocked-0.007
 bye-0.006
 loudly-0.006
Token with
Token ID351
Feature activation-7.56e-01

Pos logit contributions
+0.085
 printed+0.062
Ar+0.060
 wonderful+0.056
 lifelong+0.055
Neg logit contributions
rified-0.143
ow-0.127
ocked-0.123
arrass-0.121
licks-0.120
Token me
Token ID502
Feature activation-1.11e+00

Pos logit contributions
+0.029
 printed+0.023
Ar+0.023
 wonderful+0.023
 common+0.022
Neg logit contributions
rified-0.029
ow-0.026
ocked-0.026
arrass-0.025
owing-0.025
Token?"
Token ID1701
Feature activation+4.12e-01

Pos logit contributions
+0.029
 printed+0.019
Ar+0.019
 lifelong+0.018
 wonderful+0.018
Neg logit contributions
rified-0.058
ow-0.052
ocked-0.051
arrass-0.050
storms-0.049
Token
Token ID220
Feature activation+1.68e-01

Pos logit contributions
rified+0.016
ow+0.016
ocked+0.015
owing+0.014
ly+0.014
Neg logit contributions
-0.014
Ar-0.013
 printed-0.012
 common-0.012
 lifelong-0.012
Token
Token ID198
Feature activation+5.21e-01

Pos logit contributions
rified+0.007
ow+0.007
ocked+0.007
licks+0.006
 loudly+0.006
Neg logit contributions
-0.005
Ar-0.005
 printed-0.005
 lifelong-0.004
 common-0.004
Token you
Token ID345
Feature activation-3.08e-01

Pos logit contributions
+0.000
Ar+0.000
 printed+0.000
 lifelong+0.000
 wonderful+0.000
Neg logit contributions
rified-0.000
ow-0.000
ocked-0.000
licks-0.000
owing-0.000
Token protect
Token ID1805
Feature activation+3.78e-01

Pos logit contributions
+0.011
Ar+0.009
 printed+0.009
 lifelong+0.009
 common+0.008
Neg logit contributions
rified-0.013
ow-0.012
ocked-0.011
licks-0.011
storms-0.010
Token me
Token ID502
Feature activation-1.70e+00

Pos logit contributions
rified+0.014
ow+0.013
ocked+0.012
storms+0.011
licks+0.011
Neg logit contributions
-0.015
Ar-0.013
 printed-0.012
 lifelong-0.012
 common-0.012
Token all
Token ID477
Feature activation-2.60e-02

Pos logit contributions
+0.054
Ar+0.044
 printed+0.041
 lifelong+0.040
 common+0.038
Neg logit contributions
rified-0.076
ow-0.073
ocked-0.068
licks-0.065
ly-0.064
Token the
Token ID262
Feature activation-4.03e-01

Pos logit contributions
+0.001
Ar+0.001
 printed+0.001
 lifelong+0.001
 common+0.001
Neg logit contributions
rified-0.001
ow-0.001
ocked-0.001
licks-0.001
storms-0.001
Token time
Token ID640
Feature activation-2.90e+00

Pos logit contributions
+0.015
vy+0.013
 printed+0.013
Ar+0.013
 lifelong+0.012
Neg logit contributions
rified-0.014
ly-0.013
 alone-0.012
 mood-0.012
ocked-0.012
Token?"
Token ID1701
Feature activation+2.23e-01

Pos logit contributions
+0.094
Ar+0.069
 printed+0.069
 lifelong+0.066
 wonderful+0.064
Neg logit contributions
rified-0.133
ow-0.124
ocked-0.117
arrass-0.112
licks-0.112
Token the
Token ID262
Feature activation-9.84e-02

Pos logit contributions
rified+0.008
ow+0.008
ocked+0.008
owing+0.007
licks+0.007
Neg logit contributions
-0.010
 printed-0.007
Ar-0.007
 wonderful-0.007
 common-0.007
Token puppy
Token ID26188
Feature activation-3.61e-01

Pos logit contributions
+0.003
Ar+0.003
 printed+0.003
 lifelong+0.003
vy+0.003
Neg logit contributions
rified-0.004
ow-0.003
ocked-0.003
ly-0.003
 bye-0.003
Token asked
Token ID1965
Feature activation-3.65e-01

Pos logit contributions
+0.011
 printed+0.008
Ar+0.007
 wonderful+0.007
 lifelong+0.007
Neg logit contributions
rified-0.017
ow-0.016
ocked-0.015
licks-0.015
storms-0.015
Token.
Token ID13
Feature activation+3.06e-01

Pos logit contributions
+0.011
 wonderful+0.006
 printed+0.006
 lifelong+0.006
 two+0.006
Neg logit contributions
rified-0.018
ocked-0.017
ow-0.017
storms-0.016
licks-0.016
Token
Token ID241
Feature activation-8.82e-01

Pos logit contributions
rified+0.021
ow+0.019
ocked+0.017
licks+0.016
ly+0.016
Neg logit contributions
-0.024
Ar-0.020
 printed-0.019
 lifelong-0.019
 common-0.019
TokenDo
Token ID5211
Feature activation-6.60e-01

Pos logit contributions
+0.033
 printed+0.031
 lifelong+0.030
Ar+0.029
Wal+0.028
Neg logit contributions
rified-0.030
ow-0.027
ocked-0.027
arrass-0.025
 bye-0.025
Token you
Token ID345
Feature activation-3.61e-01

Pos logit contributions
+0.026
Ar+0.023
 printed+0.022
 lifelong+0.021
 bisc+0.020
Neg logit contributions
rified-0.024
ow-0.023
ocked-0.021
ly-0.020
 bye-0.019
Token like
Token ID588
Feature activation-4.67e-03

Pos logit contributions
Ar+0.014
+0.014
 lifelong+0.014
 printed+0.013
 bisc+0.013
Neg logit contributions
ow-0.012
rified-0.011
 bye-0.009
owing-0.009
 mood-0.009
Token the
Token ID262
Feature activation+2.02e-01

Pos logit contributions
+0.000
Ar+0.000
 printed+0.000
 lifelong+0.000
Wal+0.000
Neg logit contributions
rified-0.000
ow-0.000
ocked-0.000
licks-0.000
storms-0.000
Token rain
Token ID6290
Feature activation-2.85e+00

Pos logit contributions
rified+0.008
ow+0.007
ocked+0.007
 bye+0.007
 alone+0.007
Neg logit contributions
-0.007
Ar-0.006
 printed-0.006
 lifelong-0.006
vy-0.006
Token?
Token ID30
Feature activation+7.04e-02

Pos logit contributions
+0.095
Ar+0.081
 printed+0.074
 lifelong+0.074
Wal+0.071
Neg logit contributions
rified-0.121
ow-0.117
ocked-0.109
storms-0.102
licks-0.102
Tokenâ
Token ID22940
Feature activation-3.27e-01

Pos logit contributions
rified+0.002
ow+0.002
ocked+0.002
licks+0.002
 bye+0.002
Neg logit contributions
Ar-0.003
-0.003
 printed-0.003
 lifelong-0.003
 wonderful-0.003
Token
Token ID26391
Feature activation-2.46e-01

Pos logit contributions
+0.012
Ar+0.010
 printed+0.009
 common+0.009
 lifelong+0.009
Neg logit contributions
rified-0.013
ow-0.012
ocked-0.012
licks-0.011
storms-0.011
Token Daisy
Token ID40355
Feature activation+3.58e-01

Pos logit contributions
+0.007
Ar+0.006
 printed+0.005
 lifelong+0.005
 common+0.005
Neg logit contributions
rified-0.012
ow-0.011
ocked-0.011
licks-0.010
ly-0.010
Token smiled
Token ID13541
Feature activation-2.75e-01

Pos logit contributions
rified+0.015
ow+0.014
ocked+0.013
storms+0.013
ly+0.013
Neg logit contributions
-0.012
Ar-0.010
 lifelong-0.010
 printed-0.009
 common-0.009
Token outside
Token ID2354
Feature activation-5.89e-01

Pos logit contributions
ow+0.010
 alone+0.008
 anymore+0.008
owing+0.008
ly+0.007
Neg logit contributions
-0.015
Ar-0.015
Wal-0.015
 lifelong-0.014
 printed-0.013
Token and
Token ID290
Feature activation+1.39e-01

Pos logit contributions
+0.024
Ar+0.023
Wal+0.021
 lifelong+0.021
 printed+0.020
Neg logit contributions
ow-0.019
rified-0.018
 anymore-0.016
 alone-0.016
 loudly-0.016
Token play
Token ID711
Feature activation-6.37e-01

Pos logit contributions
ow+0.005
rified+0.005
 bye+0.004
 alone+0.004
 loudly+0.004
Neg logit contributions
-0.005
Ar-0.005
 lifelong-0.005
Wal-0.004
 printed-0.004
Token with
Token ID351
Feature activation-5.46e-01

Pos logit contributions
+0.028
Ar+0.026
Wal+0.026
 lifelong+0.025
 Olympic+0.024
Neg logit contributions
ow-0.017
rified-0.017
 alone-0.017
 anymore-0.017
 loudly-0.016
Token the
Token ID262
Feature activation-1.54e-01

Pos logit contributions
+0.021
Ar+0.020
 lifelong+0.018
Wal+0.017
 printed+0.017
Neg logit contributions
rified-0.019
ow-0.019
ocked-0.017
ly-0.016
 loudly-0.016
Token rain
Token ID6290
Feature activation-2.81e+00

Pos logit contributions
+0.006
Ar+0.005
 lifelong+0.005
vy+0.005
Wal+0.004
Neg logit contributions
rified-0.005
 alone-0.005
ow-0.005
 mood-0.005
 bye-0.005
Token,
Token ID11
Feature activation-6.52e-01

Pos logit contributions
+0.100
Ar+0.089
Wal+0.084
 lifelong+0.083
 printed+0.079
Neg logit contributions
ow-0.109
rified-0.108
ocked-0.099
storms-0.095
ly-0.094
Token Mom
Token ID11254
Feature activation-8.28e-01

Pos logit contributions
+0.028
Ar+0.026
 lifelong+0.024
 printed+0.023
Wal+0.023
Neg logit contributions
ow-0.020
rified-0.020
 bye-0.018
 loudly-0.017
ocked-0.017
Token?"
Token ID1701
Feature activation-1.98e-01

Pos logit contributions
Ar+0.030
+0.030
 lifelong+0.027
Wal+0.027
 common+0.025
Neg logit contributions
ow-0.031
rified-0.027
ocked-0.027
owing-0.026
 alone-0.026
Token Ben
Token ID3932
Feature activation-3.49e-01

Pos logit contributions
+0.011
Ar+0.011
 lifelong+0.011
 printed+0.010
b+0.010
Neg logit contributions
rified-0.003
ow-0.003
 bye-0.002
 loudly-0.002
 alone-0.002
Token asks
Token ID7893
Feature activation-4.58e-01

Pos logit contributions
+0.014
Ar+0.014
 lifelong+0.014
 printed+0.013
 wonderful+0.013
Neg logit contributions
rified-0.011
ow-0.011
ocked-0.010
 loudly-0.009
licks-0.009
Token,
Token ID11
Feature activation-1.31e-01

Pos logit contributions
+0.034
 wonderful+0.025
 printed+0.024
 lifelong+0.023
Ar+0.023
Neg logit contributions
rified-0.039
ow-0.035
ocked-0.034
storms-0.033
licks-0.032
Token "
Token ID366
Feature activation+3.53e-01

Pos logit contributions
+0.006
Ar+0.005
 printed+0.005
 lifelong+0.005
 common+0.005
Neg logit contributions
rified-0.004
ow-0.004
ocked-0.003
licks-0.003
ly-0.003
TokenWill
Token ID8743
Feature activation-5.15e-01

Pos logit contributions
rified+0.014
ow+0.012
ocked+0.011
ly+0.011
Fine+0.011
Neg logit contributions
-0.013
 lifelong-0.012
Ar-0.011
 common-0.011
 printed-0.011
Token it
Token ID340
Feature activation-8.54e-01

Pos logit contributions
+0.019
 printed+0.013
 common+0.011
 print+0.011
b+0.011
Neg logit contributions
rified-0.023
licks-0.021
storms-0.021
ocked-0.021
arted-0.020
Token really
Token ID1107
Feature activation-7.04e-01

Pos logit contributions
+0.030
Ar+0.027
Wal+0.024
 lifelong+0.024
 printed+0.024
Neg logit contributions
ow-0.034
rified-0.033
ocked-0.030
 bye-0.029
 loudly-0.029
Token rain
Token ID6290
Feature activation-2.78e+00

Pos logit contributions
+0.022
Ar+0.018
 printed+0.017
 common+0.016
 wonderful+0.016
Neg logit contributions
rified-0.033
ow-0.030
ocked-0.029
licks-0.028
owing-0.027
Token today
Token ID1909
Feature activation-2.33e+00

Pos logit contributions
+0.092
Ar+0.072
 printed+0.072
 lifelong+0.067
 common+0.066
Neg logit contributions
rified-0.125
ow-0.112
ocked-0.109
licks-0.105
arrass-0.103
Token?"
Token ID1701
Feature activation+3.15e-02

Pos logit contributions
+0.068
Ar+0.050
 printed+0.049
 lifelong+0.047
 wonderful+0.045
Neg logit contributions
rified-0.114
ow-0.106
ocked-0.102
licks-0.098
arrass-0.097
Token
Token ID198
Feature activation+9.39e-03

Pos logit contributions
rified+0.002
ow+0.002
ocked+0.001
licks+0.001
storms+0.001
Neg logit contributions
-0.001
Ar-0.001
 printed-0.001
 lifelong-0.001
 common-0.001
Token
Token ID198
Feature activation+5.72e-01

Pos logit contributions
ow+0.000
rified+0.000
ocked+0.000
 bye+0.000
chair+0.000
Neg logit contributions
-0.000
Ar-0.000
 lifelong-0.000
 printed-0.000
 common-0.000
Token"
Token ID1
Feature activation-3.68e-01

Pos logit contributions
rified+0.025
ow+0.024
ocked+0.023
ly+0.021
licks+0.021
Neg logit contributions
-0.018
Ar-0.015
 printed-0.014
 lifelong-0.014
 common-0.013
Token scared
Token ID12008
Feature activation-2.82e-01

Pos logit contributions
rified+0.009
arrass+0.008
ly+0.008
Fine+0.008
ow+0.008
Neg logit contributions
-0.007
 printed-0.005
 wonderful-0.005
Ar-0.005
 two-0.005
Token.
Token ID13
Feature activation-6.33e-02

Pos logit contributions
+0.008
 wonderful+0.005
 printed+0.004
 two+0.004
 lifelong+0.004
Neg logit contributions
rified-0.014
ow-0.014
arrass-0.013
licks-0.013
ocked-0.012
Token What
Token ID1867
Feature activation+5.03e-01

Pos logit contributions
+0.002
Ar+0.002
 lifelong+0.002
 printed+0.002
 bisc+0.002
Neg logit contributions
rified-0.002
ow-0.002
ocked-0.002
licks-0.002
storms-0.002
Token was
Token ID373
Feature activation-4.69e-01

Pos logit contributions
rified+0.023
ow+0.022
ocked+0.021
licks+0.020
storms+0.020
Neg logit contributions
-0.015
Ar-0.012
 printed-0.011
 lifelong-0.011
 common-0.011
Token the
Token ID262
Feature activation+1.15e-01

Pos logit contributions
+0.021
Ar+0.018
 lifelong+0.018
 printed+0.017
 bisc+0.017
Neg logit contributions
rified-0.015
ow-0.014
ocked-0.013
licks-0.012
storms-0.012
Token alarm
Token ID10436
Feature activation-2.75e+00

Pos logit contributions
rified+0.003
ow+0.003
ocked+0.003
licks+0.003
owing+0.003
Neg logit contributions
-0.005
Ar-0.005
 printed-0.004
 lifelong-0.004
 wonderful-0.004
Token for
Token ID329
Feature activation-6.82e-01

Pos logit contributions
+0.099
 printed+0.078
Ar+0.074
 lifelong+0.071
 wonderful+0.070
Neg logit contributions
rified-0.117
ow-0.106
ocked-0.103
licks-0.098
arrass-0.098
Token?
Token ID30
Feature activation-1.18e-01

Pos logit contributions
+0.026
Ar+0.022
 lifelong+0.022
Wal+0.021
 printed+0.020
Neg logit contributions
ow-0.026
rified-0.025
ocked-0.023
ly-0.021
 loudly-0.021
Token
Token ID198
Feature activation+1.66e-02

Pos logit contributions
+0.004
Ar+0.003
 lifelong+0.003
b+0.003
 wonderful+0.003
Neg logit contributions
rified-0.005
ow-0.005
ocked-0.004
licks-0.004
storms-0.004
Token
Token ID198
Feature activation+4.58e-01

Pos logit contributions
rified+0.001
ocked+0.001
ow+0.001
 loudly+0.000
licks+0.000
Neg logit contributions
-0.001
 lifelong-0.001
 feather-0.001
 bisc-0.001
 printed-0.001
TokenThey
Token ID2990
Feature activation+2.26e-01

Pos logit contributions
rified+0.017
Fine+0.015
 loudly+0.015
ocked+0.015
ly+0.014
Neg logit contributions
 lifelong-0.014
 bisc-0.014
-0.014
fl-0.014
 feather-0.013
Token are
Token ID389
Feature activation+1.14e+00

Pos logit contributions
rified+0.040
ocked+0.038
ow+0.037
arted+0.036
ly+0.035
Neg logit contributions
-0.024
 printed-0.018
 wonderful-0.017
Ar-0.016
 common-0.016
Token you
Token ID345
Feature activation-7.54e-02

Pos logit contributions
rified+0.050
ow+0.046
owing+0.045
ocked+0.045
licks+0.043
Neg logit contributions
-0.040
 wonderful-0.030
 printed-0.030
Ar-0.029
 two-0.027
Token dancing
Token ID15360
Feature activation-6.17e-01

Pos logit contributions
+0.003
Ar+0.002
 printed+0.002
 lifelong+0.002
 wonderful+0.002
Neg logit contributions
rified-0.003
ow-0.003
ocked-0.003
arrass-0.003
licks-0.003
Token in
Token ID287
Feature activation-3.79e-01

Pos logit contributions
+0.021
Ar+0.016
 printed+0.016
 lifelong+0.016
 wonderful+0.015
Neg logit contributions
rified-0.027
ow-0.025
ocked-0.024
arrass-0.023
licks-0.023
Token the
Token ID262
Feature activation+2.79e-02

Pos logit contributions
+0.014
Ar+0.012
 printed+0.011
 lifelong+0.011
 bisc+0.011
Neg logit contributions
rified-0.014
ow-0.014
ocked-0.013
licks-0.012
 loudly-0.012
Token rain
Token ID6290
Feature activation-2.73e+00

Pos logit contributions
rified+0.001
ow+0.001
ocked+0.001
 bye+0.001
storms+0.001
Neg logit contributions
-0.001
Ar-0.001
 printed-0.001
 lifelong-0.001
 common-0.001
Token,
Token ID11
Feature activation-5.00e-01

Pos logit contributions
+0.074
 printed+0.054
Ar+0.048
 wonderful+0.048
 lifelong+0.047
Neg logit contributions
rified-0.140
ow-0.125
ocked-0.121
arrass-0.120
licks-0.119
Token silly
Token ID14397
Feature activation-2.77e-01

Pos logit contributions
+0.019
Ar+0.016
 printed+0.015
 wonderful+0.015
 lifelong+0.015
Neg logit contributions
rified-0.019
ow-0.018
ocked-0.017
owing-0.016
licks-0.016
Token king
Token ID5822
Feature activation-8.09e-01

Pos logit contributions
+0.009
Ar+0.008
 printed+0.008
 wonderful+0.007
 lifelong+0.007
Neg logit contributions
rified-0.012
ow-0.011
ocked-0.011
licks-0.010
arrass-0.010
Token?"
Token ID1701
Feature activation+3.78e-01

Pos logit contributions
+0.021
 printed+0.015
Ar+0.015
 lifelong+0.014
 wonderful+0.013
Neg logit contributions
rified-0.042
ow-0.038
ocked-0.037
arrass-0.036
licks-0.035
Token
Token ID220
Feature activation+5.34e-01

Pos logit contributions
rified+0.017
ow+0.016
ocked+0.016
licks+0.015
owing+0.015
Neg logit contributions
-0.013
Ar-0.010
 printed-0.009
 bisc-0.008
 lifelong-0.008
Token that
Token ID326
Feature activation+6.56e-01

Pos logit contributions
rified+0.001
ow+0.001
ocked+0.001
 loudly+0.001
licks+0.001
Neg logit contributions
-0.002
Ar-0.001
 lifelong-0.001
 printed-0.001
 bisc-0.001
Token.
Token ID13
Feature activation-5.37e-01

Pos logit contributions
rified+0.026
ow+0.025
ocked+0.023
licks+0.022
storms+0.022
Neg logit contributions
-0.024
Ar-0.020
 printed-0.019
 lifelong-0.019
 common-0.018
Token Whenever
Token ID21326
Feature activation-9.22e-02

Pos logit contributions
+0.023
Ar+0.020
 printed+0.020
 lifelong+0.019
 wonderful+0.019
Neg logit contributions
rified-0.018
ow-0.016
ocked-0.015
licks-0.014
storms-0.014
Token the
Token ID262
Feature activation-9.84e-01

Pos logit contributions
+0.004
Ar+0.003
 printed+0.003
 lifelong+0.003
 wonderful+0.003
Neg logit contributions
rified-0.004
ow-0.003
ocked-0.003
licks-0.003
owing-0.003
Token chip
Token ID11594
Feature activation-1.03e+00

Pos logit contributions
+0.034
Ar+0.031
 printed+0.029
 lifelong+0.028
 common+0.028
Neg logit contributions
rified-0.038
ow-0.037
 mood-0.033
 loudly-0.032
ocked-0.032
Tokenm
Token ID76
Feature activation-2.71e+00

Pos logit contributions
+0.033
Ar+0.025
 printed+0.024
 lifelong+0.024
 wonderful+0.022
Neg logit contributions
rified-0.047
ow-0.045
ocked-0.043
licks-0.040
arrass-0.040
Tokenunk
Token ID2954
Feature activation-8.53e-01

Pos logit contributions
+0.103
Ar+0.090
 lifelong+0.083
 printed+0.083
 wonderful+0.079
Neg logit contributions
rified-0.103
ow-0.102
ocked-0.093
owing-0.087
ly-0.086
Token needed
Token ID2622
Feature activation-7.62e-01

Pos logit contributions
+0.028
Ar+0.023
 printed+0.022
 lifelong+0.021
 common+0.020
Neg logit contributions
rified-0.037
ow-0.035
ocked-0.033
licks-0.031
storms-0.031
Token help
Token ID1037
Feature activation+3.92e-01

Pos logit contributions
+0.026
Ar+0.022
 printed+0.021
 lifelong+0.020
 common+0.019
Neg logit contributions
rified-0.032
ow-0.031
ocked-0.028
ly-0.027
licks-0.027
Token,
Token ID11
Feature activation-7.67e-01

Pos logit contributions
rified+0.015
ow+0.014
ocked+0.014
licks+0.013
arrass+0.013
Neg logit contributions
-0.015
Ar-0.012
 printed-0.012
 lifelong-0.012
 wonderful-0.012
Token Sam
Token ID3409
Feature activation-1.58e-01

Pos logit contributions
+0.030
Ar+0.027
 printed+0.025
 lifelong+0.024
 common+0.024
Neg logit contributions
rified-0.028
ow-0.026
ocked-0.024
 loudly-0.023
storms-0.023
Token Can
Token ID1680
Feature activation-8.77e-01

Pos logit contributions
+0.001
Ar+0.001
 lifelong+0.001
 printed+0.001
 bisc+0.001
Neg logit contributions
rified-0.001
ocked-0.001
ow-0.001
owing-0.001
ly-0.001
Token I
Token ID314
Feature activation-4.67e-01

Pos logit contributions
+0.031
Ar+0.022
 printed+0.021
 wonderful+0.020
 common+0.020
Neg logit contributions
rified-0.040
ow-0.036
ocked-0.036
licks-0.034
owing-0.034
Token see
Token ID766
Feature activation-7.38e-01

Pos logit contributions
Ar+0.019
 lifelong+0.017
Wal+0.017
b+0.016
+0.016
Neg logit contributions
ow-0.017
 bye-0.013
 anymore-0.012
owing-0.012
 mood-0.012
Token your
Token ID534
Feature activation-3.16e-02

Pos logit contributions
+0.029
Ar+0.026
 printed+0.024
 lifelong+0.024
 common+0.023
Neg logit contributions
rified-0.027
ow-0.026
ocked-0.024
licks-0.022
ly-0.022
Token jewelry
Token ID22634
Feature activation-2.17e+00

Pos logit contributions
+0.001
Ar+0.001
vy+0.001
 lifelong+0.001
Wal+0.001
Neg logit contributions
rified-0.001
ow-0.001
 bye-0.001
 alone-0.001
ly-0.001
Token sometime
Token ID17291
Feature activation-2.68e+00

Pos logit contributions
+0.070
Ar+0.055
 printed+0.054
 lifelong+0.052
 wonderful+0.050
Neg logit contributions
rified-0.097
ow-0.092
ocked-0.087
licks-0.082
arrass-0.082
Token?"
Token ID1701
Feature activation-3.36e-01

Pos logit contributions
+0.096
Ar+0.086
Wal+0.081
 printed+0.078
 lifelong+0.078
Neg logit contributions
ow-0.101
rified-0.096
ocked-0.091
 bye-0.089
ly-0.086
Token
Token ID198
Feature activation-2.04e-01

Pos logit contributions
+0.008
Ar+0.005
 printed+0.005
 lifelong+0.005
 wonderful+0.004
Neg logit contributions
rified-0.019
ow-0.018
ocked-0.017
licks-0.016
ly-0.016
Token
Token ID198
Feature activation+1.65e-01

Pos logit contributions
+0.006
Ar+0.005
 printed+0.005
 lifelong+0.004
 common+0.004
Neg logit contributions
ow-0.009
rified-0.009
ocked-0.009
 bye-0.008
 loudly-0.008
TokenHer
Token ID9360
Feature activation+3.03e-01

Pos logit contributions
rified+0.005
ow+0.005
ocked+0.005
 bye+0.004
 loudly+0.004
Neg logit contributions
-0.007
Ar-0.006
 printed-0.006
 lifelong-0.006
 wonderful-0.006
Token mom
Token ID1995
Feature activation+3.77e-01

Pos logit contributions
ow+0.009
rified+0.008
ocked+0.008
 bye+0.007
 loudly+0.007
Neg logit contributions
-0.014
Ar-0.013
 lifelong-0.012
 printed-0.012
Wal-0.012
TokenMom
Token ID29252
Feature activation+2.10e-01

Pos logit contributions
+0.002
 lifelong+0.002
 wonderful+0.002
 masterpiece+0.002
 printed+0.002
Neg logit contributions
rified-0.002
Fine-0.002
ocked-0.001
ow-0.001
storms-0.001
Token,
Token ID11
Feature activation+8.46e-02

Pos logit contributions
rified+0.007
ow+0.007
ocked+0.006
licks+0.006
ly+0.006
Neg logit contributions
-0.009
Ar-0.008
 lifelong-0.007
 printed-0.007
 common-0.007
Token why
Token ID1521
Feature activation+4.73e-01

Pos logit contributions
rified+0.003
ow+0.003
ocked+0.002
 bye+0.002
licks+0.002
Neg logit contributions
-0.004
Ar-0.003
 lifelong-0.003
 printed-0.003
 common-0.003
Token are
Token ID389
Feature activation+2.39e-01

Pos logit contributions
rified+0.023
ow+0.022
ocked+0.021
arted+0.020
storms+0.020
Neg logit contributions
-0.014
 printed-0.010
 common-0.010
 wonderful-0.010
Ar-0.010
Token the
Token ID262
Feature activation-3.43e-02

Pos logit contributions
rified+0.010
ow+0.009
ocked+0.009
owing+0.009
licks+0.008
Neg logit contributions
-0.008
Ar-0.006
 printed-0.006
 wonderful-0.006
 common-0.006
Token pand
Token ID19798
Feature activation-2.68e+00

Pos logit contributions
+0.002
 printed+0.001
Ar+0.001
 lifelong+0.001
 bisc+0.001
Neg logit contributions
rified-0.001
ow-0.001
 bye-0.001
 alone-0.001
 anymore-0.001
Tokenas
Token ID292
Feature activation-1.18e+00

Pos logit contributions
+0.103
Ar+0.086
 printed+0.082
 lifelong+0.081
 common+0.078
Neg logit contributions
rified-0.103
ow-0.096
ocked-0.090
licks-0.084
storms-0.084
Token not
Token ID407
Feature activation-4.80e-01

Pos logit contributions
+0.037
Ar+0.028
 printed+0.028
 lifelong+0.026
 wonderful+0.025
Neg logit contributions
rified-0.055
ow-0.051
ocked-0.049
arrass-0.047
licks-0.046
Token playing
Token ID2712
Feature activation+2.30e-01

Pos logit contributions
+0.018
Ar+0.015
 printed+0.015
 lifelong+0.014
 common+0.014
Neg logit contributions
rified-0.019
ow-0.017
ocked-0.016
licks-0.015
storms-0.015
Token?"
Token ID1701
Feature activation-7.26e-03

Pos logit contributions
rified+0.008
ow+0.008
ocked+0.007
licks+0.007
 loudly+0.007
Neg logit contributions
-0.009
Ar-0.008
 printed-0.008
 lifelong-0.008
Wal-0.007
Token Anna
Token ID11735
Feature activation-7.09e-01

Pos logit contributions
+0.000
Ar+0.000
 printed+0.000
 lifelong+0.000
 common+0.000
Neg logit contributions
rified-0.000
ow-0.000
ocked-0.000
licks-0.000
owing-0.000
Token,
Token ID11
Feature activation-6.96e-01

Pos logit contributions
rified+0.003
ow+0.003
ocked+0.003
licks+0.002
storms+0.002
Neg logit contributions
-0.004
Ar-0.003
 printed-0.003
 lifelong-0.003
 wonderful-0.003
Token "
Token ID366
Feature activation+3.37e-01

Pos logit contributions
+0.031
Ar+0.027
 printed+0.026
 lifelong+0.025
 wonderful+0.025
Neg logit contributions
rified-0.023
ow-0.021
ocked-0.019
licks-0.018
storms-0.018
TokenWhere
Token ID8496
Feature activation+2.50e-01

Pos logit contributions
rified+0.013
ow+0.012
ocked+0.011
ly+0.011
licks+0.011
Neg logit contributions
-0.012
 lifelong-0.011
 common-0.010
Ar-0.010
 printed-0.010
Token are
Token ID389
Feature activation+3.16e-02

Pos logit contributions
rified+0.012
ow+0.011
ocked+0.011
storms+0.010
ly+0.010
Neg logit contributions
-0.008
Ar-0.006
 printed-0.006
 lifelong-0.005
 wonderful-0.005
Token the
Token ID262
Feature activation-1.57e-02

Pos logit contributions
rified+0.001
ow+0.001
ocked+0.001
licks+0.001
owing+0.001
Neg logit contributions
-0.001
Ar-0.001
 printed-0.001
 lifelong-0.001
 wonderful-0.001
Token pand
Token ID19798
Feature activation-2.67e+00

Pos logit contributions
+0.001
 printed+0.001
Ar+0.001
 lifelong+0.001
 masterpiece+0.001
Neg logit contributions
ow-0.000
rified-0.000
 bye-0.000
 anymore-0.000
ho-0.000
Tokenas
Token ID292
Feature activation-1.18e+00

Pos logit contributions
+0.115
Ar+0.101
 printed+0.095
 lifelong+0.093
 common+0.090
Neg logit contributions
ow-0.092
rified-0.086
ocked-0.079
owing-0.078
licks-0.075
Token,
Token ID11
Feature activation-1.19e+00

Pos logit contributions
+0.039
Ar+0.030
 printed+0.030
 lifelong+0.028
 common+0.027
Neg logit contributions
rified-0.052
ow-0.048
ocked-0.047
arrass-0.044
licks-0.044
Token mom
Token ID1995
Feature activation-1.89e-01

Pos logit contributions
+0.052
Ar+0.047
 lifelong+0.043
 printed+0.043
 common+0.042
Neg logit contributions
rified-0.038
ow-0.037
ocked-0.032
 bye-0.031
licks-0.030
Token?"
Token ID1701
Feature activation+2.06e-01

Pos logit contributions
+0.006
Ar+0.005
 lifelong+0.005
 printed+0.005
 common+0.004
Neg logit contributions
rified-0.008
ow-0.008
ocked-0.008
licks-0.007
storms-0.007
Token
Token ID198
Feature activation+1.49e-01

Pos logit contributions
rified+0.007
ow+0.006
ocked+0.006
licks+0.006
owing+0.006
Neg logit contributions
-0.009
 printed-0.007
Ar-0.007
 lifelong-0.007
 common-0.007
Token about
Token ID546
Feature activation-3.80e-01

Pos logit contributions
rified+0.020
ow+0.018
ocked+0.017
arted+0.017
storms+0.017
Neg logit contributions
-0.018
 printed-0.015
 wonderful-0.014
 common-0.013
 lifelong-0.013
Token we
Token ID356
Feature activation-8.80e-01

Pos logit contributions
Ar+0.016
+0.014
vy+0.014
 bisc+0.014
 lifelong+0.014
Neg logit contributions
ow-0.011
ocked-0.011
rified-0.011
 bye-0.010
 anyone-0.010
Token clean
Token ID3424
Feature activation-1.19e+00

Pos logit contributions
+0.031
Ar+0.030
 lifelong+0.027
 bisc+0.026
vy+0.025
Neg logit contributions
rified-0.032
ow-0.032
ocked-0.029
 bye-0.027
arrass-0.027
Token your
Token ID534
Feature activation-9.98e-01

Pos logit contributions
+0.052
Ar+0.048
 lifelong+0.043
Wal+0.043
 printed+0.042
Neg logit contributions
rified-0.039
ow-0.035
ocked-0.034
ly-0.032
 loudly-0.031
Token old
Token ID1468
Feature activation-9.78e-01

Pos logit contributions
+0.035
Ar+0.030
 printed+0.028
 lifelong+0.027
Wal+0.025
Neg logit contributions
rified-0.041
ow-0.039
ocked-0.037
arrass-0.035
ly-0.035
Token basketball
Token ID9669
Feature activation-2.66e+00

Pos logit contributions
+0.040
Ar+0.037
vy+0.034
 printed+0.033
Wal+0.032
Neg logit contributions
rified-0.033
arrass-0.030
ocked-0.029
 alone-0.029
ow-0.029
Token and
Token ID290
Feature activation-5.21e-01

Pos logit contributions
+0.092
Ar+0.076
 printed+0.072
 lifelong+0.071
 common+0.067
Neg logit contributions
rified-0.112
ow-0.106
ocked-0.100
licks-0.094
storms-0.093
Token make
Token ID787
Feature activation-7.74e-01

Pos logit contributions
+0.018
Ar+0.016
 lifelong+0.014
Wal+0.014
vy+0.014
Neg logit contributions
rified-0.020
ow-0.020
ocked-0.019
arrass-0.018
 bye-0.018
Token it
Token ID340
Feature activation-1.24e+00

Pos logit contributions
+0.034
Ar+0.033
vy+0.030
 bisc+0.030
 common+0.029
Neg logit contributions
ocked-0.022
ow-0.021
rified-0.020
 loudly-0.019
 bye-0.018
Token look
Token ID804
Feature activation-7.23e-01

Pos logit contributions
Ar+0.045
+0.041
Wal+0.039
 bisc+0.037
ady+0.036
Neg logit contributions
ow-0.043
 anymore-0.043
 alone-0.043
ocked-0.041
 loudly-0.040
Token new
Token ID649
Feature activation-1.10e+00

Pos logit contributions
+0.025
Ar+0.023
Wal+0.020
 bisc+0.019
 lifelong+0.019
Neg logit contributions
rified-0.028
 loudly-0.026
ow-0.026
ocked-0.026
ly-0.026
Token have
Token ID423
Feature activation-6.06e-01

Pos logit contributions
Ar+0.009
 lifelong+0.008
+0.008
Wal+0.008
 printed+0.007
Neg logit contributions
ow-0.014
rified-0.012
owing-0.011
 loudly-0.011
 bye-0.011
Token to
Token ID284
Feature activation-2.76e-01

Pos logit contributions
+0.022
Ar+0.020
 lifelong+0.019
 printed+0.018
Wal+0.018
Neg logit contributions
ow-0.022
rified-0.022
ocked-0.020
 loudly-0.019
ly-0.019
Token wait
Token ID4043
Feature activation-1.17e+00

Pos logit contributions
+0.010
Ar+0.009
 lifelong+0.009
 printed+0.008
b+0.008
Neg logit contributions
ow-0.010
rified-0.010
 mood-0.009
 bye-0.009
 loudly-0.009
Token a
Token ID257
Feature activation+6.28e-02

Pos logit contributions
+0.049
Ar+0.046
 lifelong+0.042
Wal+0.041
 printed+0.041
Neg logit contributions
ow-0.036
rified-0.036
 loudly-0.032
ocked-0.032
ly-0.032
Token long
Token ID890
Feature activation-4.37e-01

Pos logit contributions
rified+0.002
ow+0.002
ocked+0.002
 bye+0.002
 loudly+0.002
Neg logit contributions
-0.002
Ar-0.002
 printed-0.002
 lifelong-0.002
Wal-0.002
Token time
Token ID640
Feature activation-2.65e+00

Pos logit contributions
+0.013
Ar+0.010
 printed+0.009
 lifelong+0.009
 common+0.008
Neg logit contributions
rified-0.021
ow-0.020
ocked-0.019
licks-0.018
storms-0.018
Token?"
Token ID1701
Feature activation-4.88e-02

Pos logit contributions
+0.100
Ar+0.095
 lifelong+0.089
Wal+0.084
 printed+0.084
Neg logit contributions
ow-0.094
rified-0.090
ocked-0.082
owing-0.079
ly-0.078
Token she
Token ID673
Feature activation-4.45e-01

Pos logit contributions
+0.002
Ar+0.002
 printed+0.002
 lifelong+0.002
 common+0.002
Neg logit contributions
rified-0.002
ow-0.002
ocked-0.002
licks-0.001
owing-0.001
Token asked
Token ID1965
Feature activation-3.69e-01

Pos logit contributions
+0.015
Ar+0.013
 lifelong+0.012
 printed+0.012
 common+0.011
Neg logit contributions
rified-0.019
ow-0.018
ocked-0.017
ly-0.016
 loudly-0.016
Token sadly
Token ID21098
Feature activation-7.38e-01

Pos logit contributions
+0.013
 printed+0.009
 lifelong+0.008
 wonderful+0.008
Ar+0.008
Neg logit contributions
rified-0.016
ocked-0.015
storms-0.014
licks-0.014
ow-0.014
Token.
Token ID13
Feature activation-9.14e-02

Pos logit contributions
+0.030
Ar+0.028
 lifelong+0.025
 printed+0.025
Wal+0.024
Neg logit contributions
ow-0.025
rified-0.025
ocked-0.022
 loudly-0.022
ly-0.021
Token the
Token ID262
Feature activation+7.47e-02

Pos logit contributions
rified+0.004
ow+0.003
ocked+0.003
 loudly+0.003
licks+0.003
Neg logit contributions
-0.004
Ar-0.004
 printed-0.003
 lifelong-0.003
Wal-0.003
Token two
Token ID734
Feature activation+1.48e-01

Pos logit contributions
rified+0.003
ly+0.003
ocked+0.003
iness+0.002
 alone+0.002
Neg logit contributions
-0.002
Ar-0.002
vy-0.002
Wal-0.002
 piping-0.002
Token tubes
Token ID21103
Feature activation-5.29e-01

Pos logit contributions
rified+0.006
ly+0.005
ocked+0.005
iness+0.005
arted+0.005
Neg logit contributions
Ar-0.005
-0.005
Wal-0.004
vy-0.004
 common-0.004
Token,
Token ID11
Feature activation-3.65e-01

Pos logit contributions
+0.020
Ar+0.016
 printed+0.016
 lifelong+0.015
 wonderful+0.015
Neg logit contributions
rified-0.021
ow-0.020
ocked-0.018
licks-0.017
arrass-0.017
Token helping
Token ID5742
Feature activation+3.76e-01

Pos logit contributions
+0.013
Ar+0.012
 lifelong+0.011
 printed+0.011
 common+0.010
Neg logit contributions
rified-0.014
ow-0.013
ocked-0.012
 bye-0.012
 loudly-0.012
Token each
Token ID1123
Feature activation+1.88e+00

Pos logit contributions
rified+0.012
ow+0.011
ocked+0.010
licks+0.010
storms+0.010
Neg logit contributions
-0.017
Ar-0.014
 printed-0.014
 lifelong-0.014
 wonderful-0.013
Token other
Token ID584
Feature activation-7.84e-01

Pos logit contributions
rified+0.076
ly+0.067
ow+0.066
ocked+0.065
 alone+0.064
Neg logit contributions
-0.061
Ar-0.057
 bisc-0.050
 printed-0.049
Wal-0.048
Token in
Token ID287
Feature activation-3.35e-02

Pos logit contributions
+0.030
Ar+0.025
 printed+0.024
 lifelong+0.023
 common+0.022
Neg logit contributions
rified-0.030
ow-0.029
ocked-0.027
licks-0.025
storms-0.025
Token and
Token ID290
Feature activation+7.44e-01

Pos logit contributions
+0.001
Ar+0.001
 printed+0.001
 bisc+0.001
 lifelong+0.001
Neg logit contributions
rified-0.001
ow-0.001
ocked-0.001
 loudly-0.001
licks-0.001
Token out
Token ID503
Feature activation-1.69e-01

Pos logit contributions
rified+0.030
ow+0.028
ocked+0.027
licks+0.025
storms+0.025
Neg logit contributions
-0.028
Ar-0.022
 printed-0.022
 lifelong-0.021
 wonderful-0.020
Token of
Token ID286
Feature activation-7.53e-01

Pos logit contributions
+0.006
Ar+0.005
 printed+0.005
 lifelong+0.005
 common+0.005
Neg logit contributions
rified-0.007
ow-0.006
ocked-0.006
licks-0.006
storms-0.005
Token.
Token ID13
Feature activation-3.34e-01

Pos logit contributions
+0.014
Ar+0.012
 printed+0.011
 lifelong+0.011
 wonderful+0.010
Neg logit contributions
rified-0.018
ow-0.017
ocked-0.016
licks-0.015
arrass-0.015
Token
Token ID198
Feature activation-4.00e-01

Pos logit contributions
+0.011
 printed+0.009
 lifelong+0.008
Ar+0.008
b+0.008
Neg logit contributions
rified-0.014
ow-0.014
ocked-0.012
 bye-0.012
storms-0.012
Token
Token ID198
Feature activation+1.32e-01

Pos logit contributions
+0.013
 lifelong+0.012
 printed+0.012
 bisc+0.011
 masterpiece+0.011
Neg logit contributions
ow-0.014
ocked-0.014
 loudly-0.013
rified-0.013
 bye-0.013
TokenThey
Token ID2990
Feature activation+4.51e-01

Pos logit contributions
rified+0.005
ocked+0.005
ow+0.005
 loudly+0.005
Fine+0.005
Neg logit contributions
-0.004
 lifelong-0.004
 bisc-0.004
 masterpiece-0.004
 printed-0.004
Token hug
Token ID16225
Feature activation-2.21e-01

Pos logit contributions
rified+0.018
ow+0.018
ocked+0.015
 bye+0.015
licks+0.015
Neg logit contributions
-0.016
Ar-0.013
 lifelong-0.013
 printed-0.013
 common-0.012
Token each
Token ID1123
Feature activation+1.78e+00

Pos logit contributions
+0.007
Ar+0.005
 common+0.005
 two+0.005
 print+0.005
Neg logit contributions
rified-0.010
ocked-0.009
ow-0.009
licks-0.009
owing-0.009
Token other
Token ID584
Feature activation-9.26e-01

Pos logit contributions
rified+0.072
 bye+0.064
ly+0.064
ow+0.064
 loudly+0.062
Neg logit contributions
-0.059
 printed-0.051
Ar-0.049
 common-0.048
 lifelong-0.048
Token.
Token ID13
Feature activation-4.12e-01

Pos logit contributions
+0.030
Ar+0.021
 printed+0.021
 lifelong+0.020
 wonderful+0.020
Neg logit contributions
rified-0.044
ow-0.040
ocked-0.039
licks-0.038
storms-0.037
Token They
Token ID1119
Feature activation+4.13e-01

Pos logit contributions
+0.016
Ar+0.013
 printed+0.012
 common+0.012
 lifelong+0.012
Neg logit contributions
rified-0.016
ow-0.016
ocked-0.015
owing-0.014
ly-0.014
Token play
Token ID711
Feature activation-7.51e-03

Pos logit contributions
ow+0.015
rified+0.015
 bye+0.013
arrass+0.012
 loudly+0.012
Neg logit contributions
-0.015
Ar-0.013
 lifelong-0.013
 printed-0.012
Wal-0.012
Token together
Token ID1978
Feature activation+3.13e-01

Pos logit contributions
+0.000
Ar+0.000
 printed+0.000
 lifelong+0.000
 common+0.000
Neg logit contributions
rified-0.000
ow-0.000
ocked-0.000
licks-0.000
 loudly-0.000
Token smoke
Token ID7523
Feature activation+2.79e-01

Pos logit contributions
ow+0.017
rified+0.016
 loudly+0.016
 bye+0.015
ly+0.015
Neg logit contributions
-0.016
Ar-0.015
vy-0.013
Wal-0.013
 printed-0.013
Token!".
Token ID48220
Feature activation+8.08e-02

Pos logit contributions
rified+0.012
ow+0.011
ocked+0.010
licks+0.010
storms+0.010
Neg logit contributions
-0.010
Ar-0.008
 printed-0.007
 lifelong-0.007
 common-0.007
Token Su
Token ID1778
Feature activation-8.92e-02

Pos logit contributions
rified+0.003
ow+0.003
ocked+0.003
licks+0.003
storms+0.003
Neg logit contributions
-0.003
Ar-0.002
 printed-0.002
 lifelong-0.002
 common-0.002
Tokenzy
Token ID7357
Feature activation+2.13e-01

Pos logit contributions
+0.004
 printed+0.003
Ar+0.003
 bisc+0.003
 lifelong+0.003
Neg logit contributions
rified-0.003
ow-0.002
ocked-0.002
licks-0.002
storms-0.002
Token eagerly
Token ID30130
Feature activation+2.80e-01

Pos logit contributions
rified+0.008
ow+0.008
ocked+0.007
ly+0.007
 loudly+0.007
Neg logit contributions
-0.008
Ar-0.007
 lifelong-0.006
 common-0.006
 printed-0.006
Token listened
Token ID16399
Feature activation+1.60e+00

Pos logit contributions
rified+0.010
ow+0.010
ocked+0.009
ly+0.009
arrass+0.008
Neg logit contributions
-0.011
Ar-0.009
 lifelong-0.009
 printed-0.009
 common-0.009
Token,
Token ID11
Feature activation+1.74e-01

Pos logit contributions
rified+0.063
ow+0.060
ocked+0.056
licks+0.052
storms+0.052
Neg logit contributions
-0.060
Ar-0.050
 printed-0.048
 lifelong-0.047
 common-0.045
Token and
Token ID290
Feature activation+4.65e-01

Pos logit contributions
rified+0.008
ow+0.008
licks+0.007
ocked+0.007
storms+0.007
Neg logit contributions
-0.006
 printed-0.004
Ar-0.004
 lifelong-0.004
 wonderful-0.004
Token Jane
Token ID12091
Feature activation-1.15e-01

Pos logit contributions
rified+0.020
ow+0.018
ocked+0.017
licks+0.017
storms+0.017
Neg logit contributions
-0.017
 printed-0.013
Ar-0.013
 lifelong-0.012
 wonderful-0.012
Token said
Token ID531
Feature activation+7.10e-01

Pos logit contributions
+0.004
Ar+0.004
 common+0.004
 lifelong+0.003
 bisc+0.003
Neg logit contributions
ow-0.004
rified-0.004
ocked-0.004
 loudly-0.004
ly-0.004
Token,
Token ID11
Feature activation+3.97e-01

Pos logit contributions
rified+0.032
ow+0.029
licks+0.027
storms+0.027
ocked+0.027
Neg logit contributions
-0.026
 printed-0.019
 wonderful-0.018
 lifelong-0.017
 great-0.016
Token dad
Token ID9955
Feature activation+8.25e-01

Pos logit contributions
+0.001
Ar+0.001
 printed+0.001
 lifelong+0.001
 common+0.001
Neg logit contributions
rified-0.001
ow-0.001
ocked-0.001
 bye-0.001
ly-0.001
Token.
Token ID13
Feature activation+2.39e-02

Pos logit contributions
rified+0.035
ow+0.033
ocked+0.031
licks+0.030
storms+0.029
Neg logit contributions
-0.029
Ar-0.023
 printed-0.022
 lifelong-0.021
 common-0.021
Token You
Token ID921
Feature activation+3.28e-01

Pos logit contributions
rified+0.001
ow+0.001
ocked+0.001
owing+0.001
ly+0.001
Neg logit contributions
-0.001
Ar-0.001
 printed-0.001
 lifelong-0.001
 wonderful-0.001
Token are
Token ID389
Feature activation+6.67e-01

Pos logit contributions
rified+0.013
ow+0.013
ocked+0.012
licks+0.011
ly+0.011
Neg logit contributions
-0.012
Ar-0.010
 printed-0.009
 lifelong-0.009
 common-0.009
Token a
Token ID257
Feature activation+7.89e-01

Pos logit contributions
rified+0.029
ow+0.028
ocked+0.026
licks+0.025
 bye+0.024
Neg logit contributions
-0.022
Ar-0.018
 printed-0.017
 lifelong-0.016
 bisc-0.016
Token loyal
Token ID9112
Feature activation+1.67e+00

Pos logit contributions
rified+0.031
 bye+0.029
ow+0.029
 loudly+0.028
 mood+0.027
Neg logit contributions
-0.026
 printed-0.021
Ar-0.021
 bisc-0.020
b-0.020
Token and
Token ID290
Feature activation-4.62e-02

Pos logit contributions
rified+0.073
ow+0.069
ocked+0.067
licks+0.063
storms+0.061
Neg logit contributions
-0.055
Ar-0.046
 printed-0.043
 lifelong-0.041
 common-0.040
Token kind
Token ID1611
Feature activation+1.52e+00

Pos logit contributions
+0.001
Ar+0.001
 wonderful+0.001
 printed+0.001
 common+0.001
Neg logit contributions
rified-0.002
ocked-0.002
ow-0.002
owing-0.002
arted-0.002
Token dad
Token ID9955
Feature activation+9.98e-01

Pos logit contributions
rified+0.071
ow+0.069
ocked+0.068
licks+0.065
arted+0.063
Neg logit contributions
-0.044
Ar-0.035
 printed-0.031
 print-0.031
 bisc-0.030
Token."
Token ID526
Feature activation-4.64e-01

Pos logit contributions
rified+0.044
ow+0.041
ocked+0.040
licks+0.037
arrass+0.037
Neg logit contributions
-0.033
Ar-0.026
 printed-0.025
 lifelong-0.024
 wonderful-0.023
Token
Token ID198
Feature activation-1.20e-01

Pos logit contributions
+0.012
Ar+0.009
 printed+0.008
 lifelong+0.008
 common+0.007
Neg logit contributions
rified-0.024
ow-0.023
ocked-0.022
licks-0.021
storms-0.021
Token<|endoftext|>
Token ID50256
Feature activation+1.20e-01

Pos logit contributions
+0.012
Ar+0.009
 printed+0.008
 lifelong+0.008
 wonderful+0.008
Neg logit contributions
rified-0.022
ow-0.021
ocked-0.020
ly-0.019
licks-0.019
TokenOne
Token ID3198
Feature activation-4.83e-01

Pos logit contributions
rified+0.005
ow+0.005
ocked+0.005
licks+0.004
storms+0.004
Neg logit contributions
-0.004
 printed-0.003
 lifelong-0.003
Ar-0.003
Wal-0.003
Token day
Token ID1110
Feature activation+3.92e-02

Pos logit contributions
+0.016
Ar+0.012
 printed+0.012
 wonderful+0.012
 lifelong+0.012
Neg logit contributions
rified-0.022
ow-0.020
ocked-0.019
licks-0.019
storms-0.018
Token,
Token ID11
Feature activation+1.05e+00

Pos logit contributions
rified+0.001
ow+0.001
ocked+0.001
licks+0.001
storms+0.001
Neg logit contributions
-0.002
Ar-0.001
 printed-0.001
 lifelong-0.001
 common-0.001
Token a
Token ID257
Feature activation-5.07e-01

Pos logit contributions
rified+0.054
ow+0.052
ocked+0.050
storms+0.048
licks+0.048
Neg logit contributions
-0.027
Ar-0.020
 printed-0.018
 lifelong-0.017
 wonderful-0.017
Token woman
Token ID2415
Feature activation+1.61e+00

Pos logit contributions
+0.016
Ar+0.013
 printed+0.012
 lifelong+0.012
 wonderful+0.011
Neg logit contributions
rified-0.023
ow-0.022
ocked-0.021
licks-0.020
storms-0.019
Token went
Token ID1816
Feature activation-4.03e-02

Pos logit contributions
rified+0.054
ow+0.053
ocked+0.047
ly+0.045
owing+0.044
Neg logit contributions
-0.065
Ar-0.059
 lifelong-0.057
 common-0.054
Wal-0.053
Token to
Token ID284
Feature activation-7.92e-01

Pos logit contributions
+0.002
Ar+0.001
 printed+0.001
 lifelong+0.001
 wonderful+0.001
Neg logit contributions
rified-0.002
ow-0.001
ocked-0.001
licks-0.001
arrass-0.001
Token the
Token ID262
Feature activation-7.51e-02

Pos logit contributions
+0.031
Ar+0.026
 lifelong+0.025
 printed+0.024
 bisc+0.024
Neg logit contributions
ow-0.028
rified-0.028
ocked-0.024
 bye-0.024
arrass-0.023
Token park
Token ID3952
Feature activation-2.97e-01

Pos logit contributions
+0.002
Ar+0.002
 printed+0.002
 lifelong+0.002
 wonderful+0.002
Neg logit contributions
rified-0.003
ow-0.003
ocked-0.003
licks-0.003
storms-0.003
Token.
Token ID13
Feature activation+2.99e-01

Pos logit contributions
+0.008
 printed+0.005
 print+0.005
 common+0.005
Ar+0.004
Neg logit contributions
rified-0.017
licks-0.014
ocked-0.014
storms-0.014
hetically-0.014
Token wasn
Token ID2492
Feature activation-2.95e-01

Pos logit contributions
+0.013
Ar+0.010
 printed+0.010
 lifelong+0.010
 common+0.009
Neg logit contributions
rified-0.015
ow-0.014
ocked-0.013
licks-0.013
storms-0.012
Token't
Token ID470
Feature activation-4.07e-01

Pos logit contributions
+0.012
Ar+0.012
 lifelong+0.012
 printed+0.012
 bisc+0.012
Neg logit contributions
ow-0.009
rified-0.008
ocked-0.007
licks-0.007
 loudly-0.007
Token worth
Token ID2861
Feature activation+8.72e-02

Pos logit contributions
+0.015
Ar+0.013
 lifelong+0.012
 printed+0.012
 bisc+0.011
Neg logit contributions
rified-0.016
ow-0.015
ocked-0.014
ly-0.013
 loudly-0.013
Token it
Token ID340
Feature activation-7.63e-02

Pos logit contributions
rified+0.003
ow+0.003
ocked+0.003
licks+0.003
storms+0.003
Neg logit contributions
-0.003
Ar-0.003
 printed-0.003
 lifelong-0.003
 common-0.003
Token.
Token ID13
Feature activation-3.92e-01

Pos logit contributions
+0.003
Ar+0.002
 printed+0.002
 lifelong+0.002
 wonderful+0.002
Neg logit contributions
rified-0.003
ow-0.003
ocked-0.003
licks-0.003
storms-0.003
Token<|endoftext|>
Token ID50256
Feature activation+2.79e-01

Pos logit contributions
+0.012
Ar+0.011
 printed+0.011
 common+0.010
 bisc+0.010
Neg logit contributions
rified-0.017
ow-0.015
ocked-0.014
storms-0.014
 bye-0.013
TokenOne
Token ID3198
Feature activation-4.21e-01

Pos logit contributions
rified+0.012
ow+0.011
ocked+0.010
licks+0.010
storms+0.010
Neg logit contributions
-0.008
 lifelong-0.008
 printed-0.007
Ar-0.007
 common-0.007
Token day
Token ID1110
Feature activation+2.04e-01

Pos logit contributions
+0.015
Ar+0.013
 printed+0.012
 lifelong+0.012
 common+0.012
Neg logit contributions
rified-0.017
ow-0.016
ocked-0.015
ly-0.014
licks-0.014
Token,
Token ID11
Feature activation+1.25e+00

Pos logit contributions
rified+0.006
ow+0.006
ocked+0.005
ly+0.005
 loudly+0.005
Neg logit contributions
-0.009
Ar-0.008
 lifelong-0.008
 printed-0.008
 bisc-0.008
Token a
Token ID257
Feature activation-2.25e-01

Pos logit contributions
rified+0.057
ow+0.053
ocked+0.049
 loudly+0.048
ly+0.047
Neg logit contributions
-0.038
Ar-0.032
 lifelong-0.031
 printed-0.029
 bisc-0.029
Token thirsty
Token ID47124
Feature activation-6.82e-01

Pos logit contributions
+0.008
Ar+0.006
 lifelong+0.006
 printed+0.006
 common+0.006
Neg logit contributions
rified-0.009
ow-0.008
 bye-0.008
ocked-0.008
 loudly-0.008
Token with
Token ID351
Feature activation-1.70e-01

Pos logit contributions
rified+0.010
ow+0.010
ocked+0.009
licks+0.008
storms+0.008
Neg logit contributions
-0.010
Ar-0.009
 printed-0.008
 lifelong-0.008
 common-0.008
Token my
Token ID616
Feature activation-1.55e-01

Pos logit contributions
+0.006
Ar+0.005
 printed+0.005
 lifelong+0.005
 wonderful+0.004
Neg logit contributions
rified-0.007
ow-0.007
ocked-0.006
licks-0.006
owing-0.006
Token toy
Token ID13373
Feature activation-1.44e-01

Pos logit contributions
+0.007
Ar+0.005
Wal+0.005
vy+0.005
 lifelong+0.005
Neg logit contributions
rified-0.005
ow-0.005
 bye-0.005
 alone-0.005
 mood-0.004
Token plane
Token ID6614
Feature activation-1.21e-01

Pos logit contributions
+0.005
Ar+0.004
 printed+0.004
 lifelong+0.004
 wonderful+0.003
Neg logit contributions
rified-0.006
ow-0.006
ocked-0.006
arrass-0.005
licks-0.005
Token again
Token ID757
Feature activation+6.54e-02

Pos logit contributions
+0.004
 printed+0.003
Ar+0.003
 lifelong+0.003
 wonderful+0.003
Neg logit contributions
rified-0.005
ow-0.005
ocked-0.005
arrass-0.004
licks-0.004
Token!"
Token ID2474
Feature activation+2.99e-01

Pos logit contributions
rified+0.003
ow+0.003
ocked+0.002
 loudly+0.002
ly+0.002
Neg logit contributions
-0.002
Ar-0.002
 printed-0.002
 lifelong-0.002
Wal-0.002
Token From
Token ID3574
Feature activation+1.81e-01

Pos logit contributions
rified+0.012
ow+0.012
ocked+0.011
 bye+0.011
storms+0.010
Neg logit contributions
-0.010
Ar-0.009
 lifelong-0.008
 printed-0.008
 common-0.008
Token that
Token ID326
Feature activation+2.52e-01

Pos logit contributions
rified+0.004
ow+0.004
ocked+0.003
ly+0.003
 bye+0.003
Neg logit contributions
-0.010
Ar-0.009
 lifelong-0.009
 printed-0.008
Wal-0.008
Token day
Token ID1110
Feature activation+1.69e-01

Pos logit contributions
rified+0.008
ow+0.007
ocked+0.007
 bye+0.006
 alone+0.006
Neg logit contributions
-0.011
Ar-0.010
 lifelong-0.009
 bisc-0.009
Wal-0.009
Token on
Token ID319
Feature activation-1.27e+00

Pos logit contributions
rified+0.009
ow+0.008
ocked+0.008
arrass+0.008
storms+0.008
Neg logit contributions
-0.004
Ar-0.003
 printed-0.003
 lifelong-0.003
 wonderful-0.003
Token,
Token ID11
Feature activation+7.25e-02

Pos logit contributions
+0.055
Ar+0.048
 printed+0.046
 lifelong+0.045
 common+0.043
Neg logit contributions
rified-0.042
ow-0.040
ocked-0.036
ly-0.034
licks-0.033
TokenL
Token ID43
Feature activation-1.98e-01

Pos logit contributions
rified+0.015
ocked+0.015
 loudly+0.014
ly+0.013
ow+0.013
Neg logit contributions
-0.014
 lifelong-0.012
Ar-0.012
 bisc-0.012
 printed-0.012
Tokenily
Token ID813
Feature activation-4.53e-01

Pos logit contributions
+0.009
Ar+0.008
 printed+0.008
 lifelong+0.008
 common+0.007
Neg logit contributions
rified-0.006
ow-0.006
ocked-0.005
ly-0.005
licks-0.005
Token and
Token ID290
Feature activation+1.13e-01

Pos logit contributions
+0.015
Ar+0.013
 lifelong+0.013
 common+0.013
 printed+0.012
Neg logit contributions
rified-0.018
ow-0.018
ocked-0.017
ly-0.015
 bye-0.015
Token Jack
Token ID3619
Feature activation-7.35e-01

Pos logit contributions
ow+0.004
rified+0.004
licks+0.004
ocked+0.004
owing+0.004
Neg logit contributions
-0.004
 printed-0.004
 wonderful-0.003
Ar-0.003
 print-0.003
Token played
Token ID2826
Feature activation-2.60e-01

Pos logit contributions
+0.031
Ar+0.028
 lifelong+0.026
 printed+0.026
 common+0.026
Neg logit contributions
rified-0.024
ow-0.023
ocked-0.021
ly-0.019
licks-0.019
Token together
Token ID1978
Feature activation+2.84e-01

Pos logit contributions
+0.009
Ar+0.007
 printed+0.007
 lifelong+0.007
 common+0.006
Neg logit contributions
rified-0.011
ow-0.011
ocked-0.010
licks-0.009
storms-0.009
Token all
Token ID477
Feature activation+6.06e-01

Pos logit contributions
rified+0.012
ow+0.011
ocked+0.010
storms+0.010
arrass+0.010
Neg logit contributions
-0.011
Ar-0.008
 printed-0.008
 lifelong-0.008
 wonderful-0.008
Token day
Token ID1110
Feature activation-3.63e-01

Pos logit contributions
rified+0.023
ow+0.022
ocked+0.021
ly+0.019
 bye+0.019
Neg logit contributions
-0.023
Ar-0.020
 printed-0.019
 lifelong-0.018
 bisc-0.018
Token,
Token ID11
Feature activation-3.70e-01

Pos logit contributions
+0.013
Ar+0.010
 printed+0.010
 lifelong+0.010
 common+0.009
Neg logit contributions
rified-0.015
ow-0.015
ocked-0.014
licks-0.013
storms-0.013
Token taking
Token ID2263
Feature activation-7.22e-02

Pos logit contributions
+0.012
 printed+0.009
Ar+0.009
 lifelong+0.009
 wonderful+0.009
Neg logit contributions
rified-0.017
ow-0.017
ocked-0.015
licks-0.015
ly-0.014
Token turns
Token ID4962
Feature activation-1.64e-01

Pos logit contributions
+0.002
Ar+0.002
 printed+0.002
 lifelong+0.002
 wonderful+0.002
Neg logit contributions
rified-0.003
ow-0.003
ocked-0.003
licks-0.003
owing-0.003
Token but
Token ID475
Feature activation-6.31e-03

Pos logit contributions
+0.013
Ar+0.011
 lifelong+0.011
 printed+0.010
 common+0.010
Neg logit contributions
rified-0.013
ow-0.012
ocked-0.012
 loudly-0.011
licks-0.011
Token they
Token ID484
Feature activation-2.26e-01

Pos logit contributions
+0.000
Ar+0.000
 printed+0.000
 wonderful+0.000
 lifelong+0.000
Neg logit contributions
rified-0.000
ow-0.000
ocked-0.000
owing-0.000
storms-0.000
Token invited
Token ID9392
Feature activation-7.91e-02

Pos logit contributions
+0.009
 printed+0.007
Ar+0.007
 lifelong+0.007
 wonderful+0.007
Neg logit contributions
rified-0.009
ow-0.008
ocked-0.008
licks-0.007
storms-0.007
Token him
Token ID683
Feature activation-7.02e-03

Pos logit contributions
+0.004
Ar+0.003
 printed+0.003
 wonderful+0.003
 lifelong+0.003
Neg logit contributions
rified-0.003
ocked-0.002
ow-0.002
storms-0.002
outing-0.002
Token to
Token ID284
Feature activation-7.55e-01

Pos logit contributions
+0.000
Ar+0.000
 printed+0.000
 lifelong+0.000
Wal+0.000
Neg logit contributions
ow-0.000
rified-0.000
ocked-0.000
 bye-0.000
owing-0.000
Token join
Token ID4654
Feature activation+1.66e-01

Pos logit contributions
+0.026
Ar+0.023
 lifelong+0.021
 printed+0.021
vy+0.020
Neg logit contributions
ow-0.030
rified-0.029
ocked-0.026
 bye-0.026
 loudly-0.025
Token in
Token ID287
Feature activation-4.56e-01

Pos logit contributions
rified+0.006
ow+0.006
ocked+0.005
licks+0.005
ly+0.005
Neg logit contributions
-0.007
Ar-0.006
 printed-0.006
 lifelong-0.006
 common-0.005
Token with
Token ID351
Feature activation-6.84e-01

Pos logit contributions
+0.018
Ar+0.017
 printed+0.016
Wal+0.015
 lifelong+0.015
Neg logit contributions
ow-0.015
rified-0.014
 loudly-0.013
ocked-0.013
 alone-0.012
Token them
Token ID606
Feature activation+3.77e-01

Pos logit contributions
+0.029
Ar+0.025
 printed+0.024
 lifelong+0.024
 wonderful+0.023
Neg logit contributions
rified-0.023
ow-0.022
ocked-0.021
licks-0.019
storms-0.019
Token.
Token ID13
Feature activation-2.42e-01

Pos logit contributions
rified+0.013
ow+0.013
ocked+0.012
 loudly+0.011
owing+0.011
Neg logit contributions
-0.015
Ar-0.013
 printed-0.012
 lifelong-0.012
Wal-0.012
Token
Token ID198
Feature activation-1.40e-01

Pos logit contributions
+0.008
Ar+0.007
 printed+0.007
 lifelong+0.006
 common+0.006
Neg logit contributions
rified-0.010
ow-0.010
ocked-0.009
licks-0.009
storms-0.009
Token an
Token ID281
Feature activation+1.60e-03

Pos logit contributions
+0.012
Ar+0.010
 printed+0.009
 lifelong+0.009
 common+0.008
Neg logit contributions
rified-0.018
ow-0.017
ocked-0.016
licks-0.016
ly-0.015
Token entire
Token ID2104
Feature activation-1.19e+00

Pos logit contributions
rified+0.000
ow+0.000
ly+0.000
ocked+0.000
 loudly+0.000
Neg logit contributions
-0.000
Ar-0.000
 printed-0.000
 common-0.000
 lifelong-0.000
Token world
Token ID995
Feature activation+5.27e-02

Pos logit contributions
+0.041
Ar+0.036
 printed+0.034
 lifelong+0.033
 common+0.032
Neg logit contributions
rified-0.047
ow-0.044
ocked-0.041
 loudly-0.040
ly-0.039
Token filled
Token ID5901
Feature activation+3.74e-01

Pos logit contributions
rified+0.002
ow+0.002
ocked+0.002
ly+0.002
licks+0.002
Neg logit contributions
-0.002
Ar-0.002
 printed-0.002
 lifelong-0.002
 common-0.002
Token with
Token ID351
Feature activation-4.34e-02

Pos logit contributions
rified+0.012
ow+0.011
ocked+0.010
 loudly+0.010
licks+0.010
Neg logit contributions
-0.016
Ar-0.014
 printed-0.013
 lifelong-0.013
 bisc-0.013
Token amazing
Token ID4998
Feature activation+3.28e-01

Pos logit contributions
+0.001
Ar+0.001
 printed+0.001
 lifelong+0.001
 bisc+0.001
Neg logit contributions
rified-0.002
ow-0.002
ocked-0.002
 loudly-0.002
ly-0.002
Token things
Token ID1243
Feature activation-8.48e-02

Pos logit contributions
rified+0.012
ly+0.011
ow+0.011
ocked+0.011
 loudly+0.010
Neg logit contributions
-0.013
Ar-0.011
 printed-0.010
 bisc-0.010
vy-0.010
Token to
Token ID284
Feature activation+5.84e-01

Pos logit contributions
+0.003
Ar+0.003
 lifelong+0.003
 printed+0.003
Wal+0.003
Neg logit contributions
rified-0.003
ow-0.003
ocked-0.003
ly-0.003
owing-0.002
Token explore
Token ID7301
Feature activation+7.34e-01

Pos logit contributions
rified+0.024
ow+0.024
ocked+0.021
licks+0.021
 loudly+0.021
Neg logit contributions
-0.020
Ar-0.017
 printed-0.015
 lifelong-0.015
 bisc-0.014
Token.
Token ID13
Feature activation-1.09e-01

Pos logit contributions
rified+0.024
ow+0.021
ocked+0.021
 loudly+0.020
ly+0.020
Neg logit contributions
-0.029
Ar-0.028
 printed-0.026
 bisc-0.026
Wal-0.025
Token She
Token ID1375
Feature activation+3.93e-01

Pos logit contributions
+0.005
Ar+0.004
 printed+0.004
 lifelong+0.004
 wonderful+0.004
Neg logit contributions
rified-0.003
ow-0.003
ocked-0.003
licks-0.003
storms-0.003
Tokenily
Token ID813
Feature activation-3.20e-01

Pos logit contributions
+0.011
Ar+0.010
 printed+0.009
 lifelong+0.009
 common+0.009
Neg logit contributions
rified-0.008
ow-0.007
ocked-0.007
licks-0.006
ly-0.006
Token shakes
Token ID29959
Feature activation+5.22e-01

Pos logit contributions
+0.013
Ar+0.011
 printed+0.010
 lifelong+0.010
 wonderful+0.010
Neg logit contributions
rified-0.012
ow-0.011
ocked-0.010
storms-0.010
owing-0.010
Token her
Token ID607
Feature activation-2.77e-01

Pos logit contributions
rified+0.023
storms+0.022
ow+0.021
ugg+0.021
licks+0.021
Neg logit contributions
-0.017
 wonderful-0.013
 common-0.012
 two-0.012
 great-0.012
Token head
Token ID1182
Feature activation+4.11e-01

Pos logit contributions
+0.011
Ar+0.009
 wonderful+0.009
 feather+0.009
 two+0.009
Neg logit contributions
rified-0.010
storms-0.009
ocked-0.009
ow-0.009
licks-0.008
Token.
Token ID13
Feature activation-1.21e-01

Pos logit contributions
rified+0.021
azes+0.020
ow+0.020
arrass+0.020
ocked+0.020
Neg logit contributions
-0.011
 two-0.006
 printed-0.005
vy-0.005
 wonderful-0.005
Token "
Token ID366
Feature activation-1.49e-01

Pos logit contributions
+0.005
 printed+0.004
 bisc+0.004
 rag+0.004
 print+0.004
Neg logit contributions
rified-0.004
ow-0.004
 unatt-0.004
ocked-0.004
owing-0.004
TokenNo
Token ID2949
Feature activation+7.24e-01

Pos logit contributions
+0.007
Ar+0.006
 printed+0.006
 lifelong+0.006
 common+0.005
Neg logit contributions
rified-0.005
ow-0.004
ocked-0.004
licks-0.004
arrass-0.004
Token,
Token ID11
Feature activation-3.53e-01

Pos logit contributions
rified+0.029
ow+0.027
ocked+0.026
licks+0.024
storms+0.024
Neg logit contributions
-0.028
Ar-0.022
 wonderful-0.021
 printed-0.021
 common-0.021
Token Tim
Token ID5045
Feature activation-7.40e-01

Pos logit contributions
+0.015
 printed+0.011
 wonderful+0.011
Ar+0.011
 bisc+0.011
Neg logit contributions
rified-0.013
ow-0.012
ocked-0.012
owing-0.011
licks-0.011
Token,
Token ID11
Feature activation+3.22e-02

Pos logit contributions
+0.024
 printed+0.016
 wonderful+0.016
Ar+0.015
 bisc+0.015
Neg logit contributions
rified-0.034
ocked-0.032
ow-0.031
licks-0.030
storms-0.030
Token it
Token ID340
Feature activation-3.76e-01

Pos logit contributions
rified+0.001
ow+0.001
ocked+0.001
licks+0.001
storms+0.001
Neg logit contributions
-0.001
 wonderful-0.001
 printed-0.001
Ar-0.001
 print-0.001
Token too
Token ID1165
Feature activation-6.72e-01

Pos logit contributions
+0.002
Ar+0.001
 lifelong+0.001
 printed+0.001
 bisc+0.001
Neg logit contributions
ow-0.001
rified-0.001
ocked-0.001
 loudly-0.001
 lately-0.001
Token long
Token ID890
Feature activation+3.56e-01

Pos logit contributions
+0.023
Ar+0.018
 common+0.017
 wonderful+0.017
 printed+0.017
Neg logit contributions
rified-0.030
ow-0.028
ocked-0.026
licks-0.026
storms-0.025
Token and
Token ID290
Feature activation-1.21e-01

Pos logit contributions
rified+0.016
ow+0.014
ocked+0.014
arrass+0.013
licks+0.013
Neg logit contributions
-0.012
Ar-0.009
 printed-0.009
 lifelong-0.009
 wonderful-0.008
Token too
Token ID1165
Feature activation-2.94e-01

Pos logit contributions
+0.005
Ar+0.003
 printed+0.003
 wonderful+0.003
 lifelong+0.003
Neg logit contributions
rified-0.005
ow-0.005
ocked-0.005
licks-0.004
storms-0.004
Token dangerous
Token ID4923
Feature activation+3.24e-01

Pos logit contributions
+0.011
Ar+0.008
 wonderful+0.008
 common+0.008
 printed+0.007
Neg logit contributions
rified-0.013
ow-0.012
ocked-0.011
licks-0.011
arrass-0.011
Token.
Token ID13
Feature activation-3.47e-01

Pos logit contributions
rified+0.015
ow+0.014
ocked+0.013
licks+0.013
arrass+0.013
Neg logit contributions
-0.010
Ar-0.007
 printed-0.007
 wonderful-0.007
 common-0.007
Token I
Token ID314
Feature activation-6.38e-01

Pos logit contributions
+0.015
Ar+0.011
 printed+0.011
 lifelong+0.011
Wal+0.010
Neg logit contributions
rified-0.012
ocked-0.011
ow-0.011
ly-0.010
owing-0.010
Token don
Token ID836
Feature activation-6.49e-01

Pos logit contributions
+0.020
Ar+0.019
 lifelong+0.018
 printed+0.017
 bisc+0.017
Neg logit contributions
ow-0.027
rified-0.025
owing-0.022
licks-0.022
ocked-0.022
Token't
Token ID470
Feature activation-4.22e-01

Pos logit contributions
+0.028
Ar+0.026
 lifelong+0.025
 printed+0.025
 bisc+0.025
Neg logit contributions
rified-0.019
ow-0.019
ocked-0.018
licks-0.016
 loudly-0.016
Token want
Token ID765
Feature activation+4.83e-01

Pos logit contributions
+0.015
Ar+0.012
 printed+0.011
 lifelong+0.011
 common+0.011
Neg logit contributions
rified-0.018
ow-0.017
ocked-0.016
licks-0.015
storms-0.015
Token you
Token ID345
Feature activation-1.06e-01

Pos logit contributions
rified+0.018
ow+0.016
ocked+0.015
licks+0.014
storms+0.014
Neg logit contributions
-0.020
Ar-0.017
 printed-0.016
 lifelong-0.016
 wonderful-0.015
Token
Token ID26391
Feature activation+4.75e-02

Pos logit contributions
+0.002
Ar+0.001
 printed+0.001
 common+0.001
 bisc+0.001
Neg logit contributions
rified-0.002
ow-0.002
ocked-0.002
licks-0.002
 loudly-0.002
Token
Token ID129
Feature activation+7.06e-01

Pos logit contributions
rified+0.002
ow+0.002
ocked+0.002
licks+0.002
ly+0.002
Neg logit contributions
-0.002
Ar-0.001
 printed-0.001
 bisc-0.001
 common-0.001
Token
Token ID241
Feature activation-4.25e-01

Pos logit contributions
rified+0.024
ow+0.023
ocked+0.021
licks+0.019
storms+0.019
Neg logit contributions
-0.030
Ar-0.026
 printed-0.024
 lifelong-0.024
 common-0.023
TokenThat
Token ID2504
Feature activation-7.00e-01

Pos logit contributions
+0.015
 lifelong+0.014
 printed+0.014
Wal+0.013
Ar+0.013
Neg logit contributions
rified-0.015
ow-0.013
ocked-0.013
arrass-0.013
Fine-0.012
Tokenâ
Token ID22940
Feature activation-7.67e-01

Pos logit contributions
+0.021
Ar+0.020
 lifelong+0.019
Wal+0.018
 printed+0.018
Neg logit contributions
rified-0.030
ow-0.028
ocked-0.026
storms-0.025
ly-0.025
Token
Token ID26391
Feature activation-6.97e-01

Pos logit contributions
+0.028
Ar+0.024
 printed+0.022
 lifelong+0.022
 common+0.021
Neg logit contributions
rified-0.030
ow-0.029
ocked-0.027
licks-0.025
storms-0.025
Token
Token ID8151
Feature activation+4.94e-01

Pos logit contributions
+0.028
Ar+0.024
 printed+0.023
 lifelong+0.022
 common+0.022
Neg logit contributions
rified-0.025
ow-0.024
ocked-0.022
licks-0.021
ly-0.020
Tokens
Token ID82
Feature activation+5.35e-02

Pos logit contributions
rified+0.022
ow+0.021
ocked+0.020
licks+0.019
storms+0.019
Neg logit contributions
-0.016
Ar-0.012
 printed-0.012
 lifelong-0.011
 common-0.011
Token a
Token ID257
Feature activation+1.84e-02

Pos logit contributions
rified+0.002
ow+0.002
ocked+0.002
arrass+0.002
 loudly+0.002
Neg logit contributions
-0.002
Ar-0.002
 lifelong-0.001
 printed-0.001
Wal-0.001
Token good
Token ID922
Feature activation+2.41e-01

Pos logit contributions
rified+0.001
ow+0.001
 loudly+0.001
 bye+0.001
 mood+0.001
Neg logit contributions
-0.001
Ar-0.000
 printed-0.000
 lifelong-0.000
Wal-0.000
Token idea
Token ID2126
Feature activation-3.29e-01

Pos logit contributions
rified+0.011
ow+0.010
ocked+0.010
licks+0.009
storms+0.009
Neg logit contributions
-0.008
Ar-0.006
 printed-0.006
 lifelong-0.006
 common-0.005
Token
Token ID198
Feature activation-7.70e-02

Pos logit contributions
 lifelong+0.015
 printed+0.014
 bisc+0.014
+0.014
 masterpiece+0.014
Neg logit contributions
ocked-0.013
ow-0.013
 loudly-0.012
 bye-0.011
rified-0.011
Token"
Token ID1
Feature activation-3.66e-01

Pos logit contributions
 lifelong+0.003
 printed+0.002
 bisc+0.002
 masterpiece+0.002
+0.002
Neg logit contributions
ow-0.003
ocked-0.003
rified-0.002
 loudly-0.002
Fine-0.002
TokenWe
Token ID1135
Feature activation-3.83e-01

Pos logit contributions
+0.015
 lifelong+0.014
 masterpiece+0.013
inations+0.013
 tal+0.013
Neg logit contributions
rified-0.011
Fine-0.011
outing-0.009
ocked-0.009
outed-0.009
Token're
Token ID821
Feature activation-4.66e-01

Pos logit contributions
Ar+0.015
 lifelong+0.014
+0.014
Wal+0.013
 common+0.013
Neg logit contributions
ow-0.013
rified-0.013
 anymore-0.010
owing-0.010
 loudly-0.010
Token sorry
Token ID7926
Feature activation-3.22e-01

Pos logit contributions
+0.027
Ar+0.026
 lifelong+0.024
 common+0.023
 printed+0.023
Neg logit contributions
ow-0.008
rified-0.008
ocked-0.006
 loudly-0.006
 alone-0.005
Token,
Token ID11
Feature activation-5.77e-01

Pos logit contributions
+0.013
Ar+0.013
 lifelong+0.012
 printed+0.012
Wal+0.011
Neg logit contributions
rified-0.010
ow-0.010
ocked-0.009
ly-0.009
 bye-0.009
Token Mom
Token ID11254
Feature activation+1.01e-01

Pos logit contributions
+0.027
Ar+0.026
 printed+0.024
 lifelong+0.024
 common+0.023
Neg logit contributions
rified-0.016
ow-0.014
 bye-0.013
ocked-0.013
 loudly-0.012
Token.
Token ID13
Feature activation+5.16e-02

Pos logit contributions
ow+0.004
rified+0.004
ocked+0.003
owing+0.003
ly+0.003
Neg logit contributions
-0.004
Ar-0.004
 lifelong-0.003
Wal-0.003
 common-0.003
Token We
Token ID775
Feature activation-6.23e-01

Pos logit contributions
rified+0.002
ow+0.002
ocked+0.002
 bye+0.002
storms+0.002
Neg logit contributions
-0.002
Ar-0.002
 lifelong-0.002
 printed-0.002
 wonderful-0.002
Token were
Token ID547
Feature activation-4.45e-01

Pos logit contributions
Ar+0.027
+0.026
 lifelong+0.026
Wal+0.024
 common+0.024
Neg logit contributions
rified-0.019
ow-0.018
ocked-0.015
 loudly-0.014
 anymore-0.014
Token hungry
Token ID14720
Feature activation-2.95e-01

Pos logit contributions
+0.021
Ar+0.021
 lifelong+0.019
 common+0.018
 bisc+0.018
Neg logit contributions
ow-0.012
rified-0.011
ocked-0.010
 loudly-0.010
 alone-0.010
Token a
Token ID257
Feature activation-4.73e-01

Pos logit contributions
rified+0.039
ow+0.039
ocked+0.035
 loudly+0.035
 bye+0.033
Neg logit contributions
-0.027
Ar-0.024
 lifelong-0.023
 printed-0.022
Wal-0.021
Token time
Token ID640
Feature activation-5.01e-01

Pos logit contributions
+0.012
Ar+0.009
 printed+0.009
 lifelong+0.008
 wonderful+0.008
Neg logit contributions
rified-0.024
ow-0.023
ocked-0.022
licks-0.021
storms-0.021
Token,
Token ID11
Feature activation+6.90e-01

Pos logit contributions
+0.020
Ar+0.016
 printed+0.016
 lifelong+0.016
 wonderful+0.015
Neg logit contributions
rified-0.019
ow-0.017
ocked-0.017
licks-0.015
arrass-0.015
Token there
Token ID612
Feature activation+4.62e-02

Pos logit contributions
rified+0.024
ow+0.022
 alone+0.020
 loudly+0.019
ocked+0.019
Neg logit contributions
-0.027
Ar-0.026
 bisc-0.024
 lifelong-0.024
 printed-0.024
Token was
Token ID373
Feature activation-1.13e-01

Pos logit contributions
rified+0.002
ow+0.002
ocked+0.002
licks+0.002
storms+0.002
Neg logit contributions
-0.001
Ar-0.001
 printed-0.001
 lifelong-0.001
 wonderful-0.001
Token a
Token ID257
Feature activation-4.00e-01

Pos logit contributions
b+0.004
+0.004
Ar+0.004
 lifelong+0.004
 common+0.004
Neg logit contributions
rified-0.004
ow-0.004
 loudly-0.003
ocked-0.003
 alone-0.003
Token j
Token ID474
Feature activation+2.98e-01

Pos logit contributions
+0.014
Ar+0.012
 lifelong+0.011
 printed+0.011
 common+0.011
Neg logit contributions
rified-0.016
ow-0.015
 bye-0.014
 loudly-0.014
ocked-0.014
Tokenolly
Token ID5098
Feature activation-2.57e-02

Pos logit contributions
rified+0.011
storms+0.010
ow+0.010
licks+0.009
ocked+0.009
Neg logit contributions
-0.012
 common-0.009
 wonderful-0.009
vy-0.009
Ar-0.009
Token little
Token ID1310
Feature activation+2.46e-01

Pos logit contributions
+0.001
Ar+0.001
 lifelong+0.001
 printed+0.001
 common+0.001
Neg logit contributions
rified-0.001
ow-0.001
ocked-0.001
ly-0.001
 bye-0.001
Token boy
Token ID2933
Feature activation+1.57e-01

Pos logit contributions
rified+0.011
ow+0.010
ocked+0.010
licks+0.009
storms+0.009
Neg logit contributions
-0.008
Ar-0.006
 printed-0.006
 lifelong-0.006
 wonderful-0.006
Token named
Token ID3706
Feature activation-4.23e-01

Pos logit contributions
rified+0.006
ow+0.006
ocked+0.006
licks+0.005
storms+0.005
Neg logit contributions
-0.006
Ar-0.005
 printed-0.005
 lifelong-0.004
 common-0.004
Token<|endoftext|>
Token ID50256
Feature activation+9.77e-02

Pos logit contributions
+0.020
Ar+0.015
 printed+0.014
 lifelong+0.014
 common+0.013
Neg logit contributions
rified-0.035
ow-0.033
ocked-0.031
ly-0.031
owing-0.030
TokenJack
Token ID14295
Feature activation-1.62e-01

Pos logit contributions
rified+0.004
ow+0.004
ocked+0.004
licks+0.004
storms+0.004
Neg logit contributions
-0.003
 lifelong-0.002
 printed-0.002
Ar-0.002
 common-0.002
Token and
Token ID290
Feature activation+5.48e-02

Pos logit contributions
+0.006
Ar+0.005
 lifelong+0.004
 printed+0.004
 common+0.004
Neg logit contributions
rified-0.007
ow-0.006
ocked-0.006
licks-0.006
ly-0.006
Token Nora
Token ID32561
Feature activation+1.73e-01

Pos logit contributions
rified+0.002
ow+0.002
ocked+0.002
 bye+0.001
ly+0.001
Neg logit contributions
-0.002
Ar-0.002
 printed-0.002
 lifelong-0.002
 common-0.002
Token were
Token ID547
Feature activation-3.16e-01

Pos logit contributions
rified+0.005
ow+0.005
ly+0.004
ocked+0.004
 loudly+0.004
Neg logit contributions
-0.007
Ar-0.007
 lifelong-0.007
Wal-0.006
 common-0.006
Token having
Token ID1719
Feature activation-1.83e+00

Pos logit contributions
+0.013
Ar+0.011
 printed+0.011
 lifelong+0.010
 common+0.010
Neg logit contributions
rified-0.011
ow-0.011
ocked-0.010
licks-0.009
owing-0.009
Token breakfast
Token ID12607
Feature activation-2.94e-01

Pos logit contributions
Ar+0.059
+0.057
b+0.054
Wal+0.054
 common+0.053
Neg logit contributions
ow-0.067
rified-0.067
ocked-0.061
 anymore-0.059
 loudly-0.058
Token one
Token ID530
Feature activation+3.70e-01

Pos logit contributions
+0.010
 printed+0.007
Ar+0.007
 wonderful+0.007
 lifelong+0.007
Neg logit contributions
rified-0.014
ow-0.012
ocked-0.012
licks-0.012
 bye-0.011
Token morning
Token ID3329
Feature activation-9.76e-01

Pos logit contributions
rified+0.015
ow+0.014
ocked+0.013
licks+0.013
storms+0.013
Neg logit contributions
-0.013
Ar-0.011
 printed-0.010
 lifelong-0.010
 common-0.010
Token.
Token ID13
Feature activation-3.25e-01

Pos logit contributions
+0.038
Ar+0.031
 printed+0.031
 lifelong+0.030
 wonderful+0.029
Neg logit contributions
rified-0.037
ow-0.035
ocked-0.032
licks-0.031
storms-0.030
Token Jack
Token ID3619
Feature activation-1.53e-01

Pos logit contributions
+0.014
Ar+0.013
 printed+0.012
 lifelong+0.012
 common+0.012
Neg logit contributions
rified-0.010
ow-0.009
ocked-0.009
ly-0.008
arrass-0.008
Token for
Token ID329
Feature activation-5.85e-01

Pos logit contributions
rified+0.011
ow+0.011
ocked+0.010
licks+0.009
storms+0.009
Neg logit contributions
-0.011
Ar-0.009
 printed-0.009
 lifelong-0.009
 wonderful-0.009
Token sharing
Token ID7373
Feature activation+8.92e-02

Pos logit contributions
+0.023
Ar+0.019
 printed+0.019
 lifelong+0.018
 wonderful+0.018
Neg logit contributions
rified-0.022
ow-0.021
ocked-0.019
licks-0.018
storms-0.018
Token.
Token ID13
Feature activation-3.23e-01

Pos logit contributions
rified+0.003
ow+0.003
ocked+0.003
ly+0.002
 loudly+0.002
Neg logit contributions
-0.004
Ar-0.003
 lifelong-0.003
 printed-0.003
Wal-0.003
Token From
Token ID3574
Feature activation-7.00e-02

Pos logit contributions
+0.013
Ar+0.012
 printed+0.011
 lifelong+0.011
 common+0.011
Neg logit contributions
rified-0.011
ow-0.010
ocked-0.010
storms-0.009
licks-0.009
Token then
Token ID788
Feature activation+1.23e-01

Pos logit contributions
+0.004
Ar+0.003
 lifelong+0.003
 printed+0.003
 common+0.003
Neg logit contributions
rified-0.002
ow-0.002
ocked-0.001
ly-0.001
 bye-0.001
Token on
Token ID319
Feature activation-1.79e+00

Pos logit contributions
rified+0.006
ow+0.006
ocked+0.005
licks+0.005
storms+0.005
Neg logit contributions
-0.004
Ar-0.003
 printed-0.003
 lifelong-0.003
 wonderful-0.002
Token,
Token ID11
Feature activation-3.66e-01

Pos logit contributions
+0.074
Ar+0.063
 printed+0.060
 lifelong+0.059
 common+0.057
Neg logit contributions
rified-0.064
ow-0.061
ocked-0.056
licks-0.052
storms-0.051
Token they
Token ID484
Feature activation-2.58e-01

Pos logit contributions
+0.017
Ar+0.016
 printed+0.015
 lifelong+0.015
 common+0.014
Neg logit contributions
rified-0.010
ow-0.010
ocked-0.009
storms-0.008
licks-0.008
Token played
Token ID2826
Feature activation-9.10e-01

Pos logit contributions
+0.011
Ar+0.009
 printed+0.009
 lifelong+0.008
 common+0.008
Neg logit contributions
rified-0.009
ow-0.009
ocked-0.008
licks-0.008
storms-0.007
Token together
Token ID1978
Feature activation-2.09e-01

Pos logit contributions
+0.030
Ar+0.024
 printed+0.023
 lifelong+0.023
 wonderful+0.022
Neg logit contributions
rified-0.040
ow-0.038
ocked-0.036
licks-0.034
storms-0.034
Token happily
Token ID18177
Feature activation-9.36e-01

Pos logit contributions
+0.007
 printed+0.005
 wonderful+0.005
 lifelong+0.005
Ar+0.005
Neg logit contributions
rified-0.010
ow-0.009
ocked-0.008
arrass-0.008
licks-0.008
Token
Token ID198
Feature activation+3.67e-01

Pos logit contributions
rified+0.005
ocked+0.005
ow+0.005
 loudly+0.005
 bye+0.004
Neg logit contributions
-0.004
 printed-0.004
 lifelong-0.004
Ar-0.004
 common-0.004
TokenThey
Token ID2990
Feature activation+2.74e-02

Pos logit contributions
rified+0.012
ocked+0.011
Fine+0.010
 loudly+0.010
arrass+0.010
Neg logit contributions
-0.013
 printed-0.012
 lifelong-0.012
 common-0.012
 bisc-0.012
Token waited
Token ID13488
Feature activation-3.15e-01

Pos logit contributions
rified+0.001
ow+0.001
ocked+0.001
licks+0.001
ly+0.001
Neg logit contributions
-0.001
Ar-0.001
 lifelong-0.001
 printed-0.001
 common-0.001
Token for
Token ID329
Feature activation+2.03e-01

Pos logit contributions
+0.012
Ar+0.010
 printed+0.010
 lifelong+0.009
 wonderful+0.009
Neg logit contributions
rified-0.012
ow-0.012
ocked-0.011
licks-0.010
storms-0.010
Token the
Token ID262
Feature activation-1.89e-01

Pos logit contributions
rified+0.008
ow+0.008
ocked+0.007
ly+0.007
 bye+0.007
Neg logit contributions
-0.007
Ar-0.006
 printed-0.006
 lifelong-0.006
Wal-0.005
Token rain
Token ID6290
Feature activation-1.81e+00

Pos logit contributions
+0.007
Ar+0.006
 printed+0.005
 lifelong+0.005
 common+0.005
Neg logit contributions
rified-0.007
ow-0.007
 bye-0.006
 mood-0.006
ly-0.006
Token to
Token ID284
Feature activation-4.85e-02

Pos logit contributions
+0.069
 printed+0.055
Ar+0.054
 lifelong+0.051
 bisc+0.051
Neg logit contributions
rified-0.073
ow-0.065
ocked-0.063
licks-0.062
arrass-0.059
Token come
Token ID1282
Feature activation-2.80e-01

Pos logit contributions
+0.002
Ar+0.002
 lifelong+0.001
 printed+0.001
b+0.001
Neg logit contributions
ow-0.002
rified-0.002
 bye-0.002
ocked-0.002
owing-0.002
Token,
Token ID11
Feature activation-8.73e-01

Pos logit contributions
+0.012
Ar+0.012
 bisc+0.011
 printed+0.011
 lifelong+0.011
Neg logit contributions
ow-0.008
rified-0.008
owing-0.007
 loudly-0.007
ly-0.007
Token and
Token ID290
Feature activation+1.79e-01

Pos logit contributions
+0.032
 printed+0.025
Ar+0.025
 lifelong+0.024
 wonderful+0.023
Neg logit contributions
rified-0.036
ow-0.034
ocked-0.032
licks-0.030
ly-0.030
Token when
Token ID618
Feature activation-2.51e-01

Pos logit contributions
rified+0.007
ow+0.007
ocked+0.007
licks+0.006
storms+0.006
Neg logit contributions
-0.006
 printed-0.005
Ar-0.005
 lifelong-0.005
 wonderful-0.005
Token play
Token ID711
Feature activation-9.22e-01

Pos logit contributions
+0.009
Ar+0.008
 bisc+0.007
Wal+0.007
 lifelong+0.007
Neg logit contributions
ow-0.006
ocked-0.005
 bye-0.005
rified-0.005
 loudly-0.005
Token at
Token ID379
Feature activation+3.97e-01

Pos logit contributions
+0.035
Ar+0.030
 printed+0.029
 lifelong+0.028
 common+0.027
Neg logit contributions
rified-0.035
ow-0.033
ocked-0.031
licks-0.029
storms-0.029
Token my
Token ID616
Feature activation+2.54e-01

Pos logit contributions
rified+0.016
ow+0.015
ocked+0.014
licks+0.014
arrass+0.013
Neg logit contributions
-0.014
Ar-0.012
 printed-0.011
 lifelong-0.011
 wonderful-0.011
Token daughter
Token ID4957
Feature activation-1.52e+00

Pos logit contributions
rified+0.009
ow+0.009
ocked+0.008
 bye+0.008
 loudly+0.008
Neg logit contributions
-0.009
Ar-0.008
Wal-0.008
 lifelong-0.007
vy-0.007
Token's
Token ID338
Feature activation-3.20e-01

Pos logit contributions
+0.051
Ar+0.048
 lifelong+0.042
Wal+0.041
 printed+0.041
Neg logit contributions
ow-0.062
rified-0.061
ocked-0.054
owing-0.052
 bye-0.052
Token birthday
Token ID10955
Feature activation-1.79e+00

Pos logit contributions
+0.012
Ar+0.011
Wal+0.010
vy+0.010
 lifelong+0.010
Neg logit contributions
rified-0.011
ow-0.011
ocked-0.011
 bye-0.010
 loudly-0.010
Token party
Token ID2151
Feature activation-2.27e+00

Pos logit contributions
+0.063
Ar+0.052
 printed+0.051
 lifelong+0.049
 wonderful+0.047
Neg logit contributions
rified-0.075
ow-0.070
ocked-0.065
licks-0.062
arrass-0.061
Token?"
Token ID1701
Feature activation+5.35e-01

Pos logit contributions
+0.073
Ar+0.060
 printed+0.056
 lifelong+0.055
 common+0.052
Neg logit contributions
rified-0.100
ow-0.096
ocked-0.090
licks-0.085
storms-0.084
Token Lily
Token ID20037
Feature activation-6.99e-01

Pos logit contributions
rified+0.022
ow+0.020
ocked+0.019
licks+0.018
ly+0.018
Neg logit contributions
-0.019
Ar-0.016
 printed-0.015
 lifelong-0.015
 common-0.015
Token was
Token ID373
Feature activation+8.45e-01

Pos logit contributions
+0.029
Ar+0.024
 printed+0.024
 lifelong+0.023
 wonderful+0.022
Neg logit contributions
rified-0.025
ow-0.024
ocked-0.022
licks-0.021
storms-0.020
Token so
Token ID523
Feature activation+3.37e-01

Pos logit contributions
rified+0.029
ow+0.028
ocked+0.026
 loudly+0.024
licks+0.023
Neg logit contributions
-0.034
Ar-0.031
 printed-0.029
 bisc-0.028
 common-0.028
Token special
Token ID2041
Feature activation-3.56e-01

Pos logit contributions
+0.004
Ar+0.003
 common+0.003
 printed+0.003
Wal+0.003
Neg logit contributions
rified-0.004
 loudly-0.004
ly-0.004
ow-0.004
 bye-0.004
Token treat
Token ID2190
Feature activation-6.61e-03

Pos logit contributions
+0.011
Ar+0.010
 lifelong+0.009
Wal+0.009
 printed+0.009
Neg logit contributions
ow-0.015
rified-0.015
ly-0.014
ocked-0.013
 bye-0.013
Token for
Token ID329
Feature activation+8.18e-01

Pos logit contributions
+0.000
Ar+0.000
 lifelong+0.000
 printed+0.000
Wal+0.000
Neg logit contributions
rified-0.000
ow-0.000
ocked-0.000
ly-0.000
 loudly-0.000
Token the
Token ID262
Feature activation+3.69e-01

Pos logit contributions
rified+0.034
ow+0.030
ocked+0.029
storms+0.029
licks+0.028
Neg logit contributions
-0.029
Ar-0.023
 wonderful-0.023
 printed-0.023
 lifelong-0.023
Token chip
Token ID11594
Feature activation+3.30e-01

Pos logit contributions
rified+0.016
ow+0.015
ocked+0.014
ly+0.013
 bye+0.013
Neg logit contributions
-0.012
Ar-0.010
 printed-0.010
 lifelong-0.009
 common-0.009
Tokenm
Token ID76
Feature activation-2.09e+00

Pos logit contributions
rified+0.015
ow+0.014
ocked+0.013
licks+0.013
arrass+0.013
Neg logit contributions
-0.011
Ar-0.008
 printed-0.008
 lifelong-0.008
 bisc-0.007
Tokenunk
Token ID2954
Feature activation+6.94e-02

Pos logit contributions
+0.079
Ar+0.067
 printed+0.063
 lifelong+0.062
 common+0.060
Neg logit contributions
rified-0.081
ow-0.077
ocked-0.071
licks-0.067
ly-0.066
Token.
Token ID13
Feature activation+1.34e-01

Pos logit contributions
rified+0.003
ow+0.003
ocked+0.002
ly+0.002
owing+0.002
Neg logit contributions
-0.003
Ar-0.002
 printed-0.002
 lifelong-0.002
 common-0.002
Token She
Token ID1375
Feature activation+6.01e-01

Pos logit contributions
rified+0.004
ow+0.004
ocked+0.003
owing+0.003
ly+0.003
Neg logit contributions
-0.007
Ar-0.005
 printed-0.005
 lifelong-0.005
Wal-0.005
Token said
Token ID531
Feature activation+5.59e-01

Pos logit contributions
rified+0.022
ow+0.022
ocked+0.021
 loudly+0.020
ly+0.019
Neg logit contributions
-0.021
Ar-0.019
 lifelong-0.018
 common-0.018
Wal-0.017
Token,
Token ID11
Feature activation+2.13e-01
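
The per-token records in this dump share a fixed layout: a "Pos logit contributions" block, a "Neg logit contributions" block, then the "Token", "Token ID" and "Feature activation" lines, with label and value fused on each line. Below is a minimal Python sketch for reading such records back into structured form, assuming that layout holds throughout. The names `Record`, `parse_records` and `SAMPLE` are hypothetical and are not part of whatever tool produced this dump. The dump does not say which token position each logit-contribution block belongs to, so the sketch groups lines purely in the order they appear.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# "<token><signed decimal>", e.g. " printed+0.011" or "-0.021" (empty token text).
CONTRIB = re.compile(r"^(.*?)([+-]\d+\.\d+)$")
# "Feature activation<signed scientific float>", e.g. "Feature activation+2.13e-01".
ACT = re.compile(r"^Feature activation([+-]\d+\.\d+e[+-]\d+)$")


@dataclass
class Record:
    pos: List[Tuple[str, float]] = field(default_factory=list)
    neg: List[Tuple[str, float]] = field(default_factory=list)
    token: str = ""
    token_id: Optional[int] = None
    activation: Optional[float] = None


def parse_records(text: str) -> List[Record]:
    """Group lines into records; a record is closed by its 'Feature activation' line."""
    records: List[Record] = []
    rec, bucket = Record(), None
    for line in text.splitlines():
        if line == "Pos logit contributions":
            bucket = rec.pos
        elif line == "Neg logit contributions":
            bucket = rec.neg
        elif bucket is not None and (m := CONTRIB.match(line)):
            # Leading spaces are part of the token text, so they are kept.
            bucket.append((m.group(1), float(m.group(2))))
        elif line.startswith("Token ID"):
            rec.token_id = int(line[len("Token ID"):])
        elif line.startswith("Token"):
            rec.token = line[len("Token"):]
            bucket = None
        elif (m := ACT.match(line)):
            rec.activation = float(m.group(1))
            records.append(rec)
            rec, bucket = Record(), None
    return records


# One record copied from the dump above.
SAMPLE = """Pos logit contributions
rified+0.022
ow+0.022
ocked+0.021
 loudly+0.020
ly+0.019
Neg logit contributions
-0.021
Ar-0.019
 lifelong-0.018
 common-0.018
Wal-0.017
Token,
Token ID11
Feature activation+2.13e-01
"""

records = parse_records(SAMPLE)
print(records[0])
```

Token text is kept verbatim, including empty strings and leading spaces, because several tokens in the dump render as blank and their identity cannot be recovered from the text alone. A token whose own text ends in a signed decimal would be misread by this sketch.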

Pos logit contributions
rified+0.021
ow+0.019
storms+0.018
licks+0.018
ocked+0.018
Neg logit contributions
-0.024
 wonderful-0.018
 printed-0.018
 lifelong-0.017
Ar-0.017