TOP ACTIVATIONS
MAX = 4.822

very special toy for L ott ie . It was a
One day , L ott ie 's mom went to
loyal little girl named L ott ie . Every day ,
a kind deer named D ott ie appeared . " Leave
a good dog named D ott ie . D ott ie
. When D ott ie arrived at the lab
said , " Hello D ott ie ! This is a
Suddenly , they sp ott led a bicycle . It
After a while , D ott ie was no longer scared
?" D ott ie per ked up and
the n ooks and cr ann ies . After
D ott ie . D ott ie wanted to visit a
there was a girl named Grac ie . She was big
was a little girl named Grac ie . She was very
she was inside , D ott ie saw lots of colorful
stove - two more sh rim ps ! His dad had
slipped and fell into the rav ine below .
Next , the b unn ies got to work making
in and said , " Ros ie , you need to
small n ooks and cr ann ies to grab onto ,

MIN ACTIVATIONS
MIN = -4.026

very happy . They love bikes . They run to the
wrist . Lily smelled her wrist and said , " It
She grabbed onto its handle bars and put her feet onto
" I like your name , Lily ," said Ellie
. You are my favorite niece ," uncle Tom said .
bird sat on the handle bars and held on tight .
shiny . She loved her bike and her friend . <|endoftext|>
. He grabbed the handle bars with his little hands .
, " Hi ! My name is Sue . This is
me . You are my son . And you are a
eyes widen . He loves bikes . He looks at the
" Hi , my name is Lily . What 's
. He grabbed the handle bars and he quickly pushed on
, " Hi ! My name is Tim . This is
little boy said " My name is Jack . Can I
too . You are my friends . Now we can play
, " Hi , my name is Lily . Do you
, " Hi ! My name is Bob . What 's
, " Hi , my name is Ch oo - Ch
" Hi Max , my name is Lily . Do you

INTERVAL 2.610 - 4.822
CONTAINS 0.035%

his hands and then went inside to play with his toys
there was a little girl named Lily . She loved to
the passport and the magical mer maid was grateful to get
thread and a needle and se wed up the tear in
box . " Gu ess what , kids ?"

INTERVAL 0.398 - 2.610
CONTAINS 27.175%

. Suddenly , she heard a strange noise and realised it
Once upon a time , there was a big mountain with
trying to fly . It had hurt its wing and couldn
. One day he went to a new place
<|endoftext|> Once upon a time , there was a little girl

INTERVAL -1.814 - 0.398
CONTAINS 72.158%

went to listen to them and danced along with the music
together . Maybe we can dress up and have a tea
flag . Sam thought this was a great idea and suggested
a dog . It said : " B eware of the
friends and had many more adventures together . <|endoftext|> Once upon

INTERVAL -4.026 - -1.814
CONTAINS 0.632%

in the park . He saw something shiny in the grass
there , Lucy noticed a big square chalk board on the
excited he couldn ’ t contain himself !
flowers . They are good friends . They enjoy the hot
high . She wanted to swing on it so much .
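
NOTE ON INTERVALS
The four INTERVAL ranges above are consistent with splitting the span from MIN to MAX into four equal-width bins, and the CONTAINS percentages sum to 100%. The sketch below checks that arithmetic. The equal-width binning is an assumption inferred from the numbers, not something the dump states, and the function name bin_edges is hypothetical.

```python
def bin_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced edges from lo to hi (rounded to 3 places)."""
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]


# MIN and MAX as reported in the dump; four bins is an assumption
# based on the four INTERVAL sections.
edges = bin_edges(-4.026, 4.822, 4)

# CONTAINS percentages in the order they appear (highest interval first).
shares = [0.035, 27.175, 72.158, 0.632]
total_share = round(sum(shares), 3)

print(edges)        # bin boundaries, lowest to highest
print(total_share)  # should account for all tokens
```

Read this way, nearly all tokens (about 99.3%) fall in the two middle intervals, and only 0.035% reach the top interval where the "ott" examples sit.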
Token very
Token ID845
Feature activation+1.01e+00

Pos logit contributions
inn+0.026
 attent+0.025
ork+0.025
or+0.024
oons+0.024
Neg logit contributions
 glad-0.013
 merry-0.012
 floppy-0.011
 eager-0.011
 rider-0.010
Token special
Token ID2041
Feature activation+1.43e+00

Pos logit contributions
inn+0.039
ork+0.037
 liquids+0.037
 attent+0.037
or+0.036
Neg logit contributions
 glad-0.016
 eager-0.014
 loyal-0.013
 floppy-0.013
 merry-0.013
Token toy
Token ID13373
Feature activation-7.44e-02

Pos logit contributions
inn+0.042
ork+0.041
 attent+0.041
or+0.039
oons+0.038
Neg logit contributions
 glad-0.033
 floppy-0.031
 merry-0.031
 eager-0.030
 rider-0.030
Token for
Token ID329
Feature activation+9.62e-01

Pos logit contributions
 glad+0.001
 floppy+0.001
 no+0.001
 rider+0.001
 eager+0.001
Neg logit contributions
inn-0.003
 attent-0.003
ork-0.003
oons-0.003
or-0.002
Token L
Token ID406
Feature activation+2.04e+00

Pos logit contributions
ork+0.031
inn+0.031
 attent+0.031
oons+0.030
sc+0.029
Neg logit contributions
 glad-0.019
 no-0.019
 riding-0.017
 merry-0.017
 swinging-0.017
Tokenott
Token ID1252
Feature activation+4.82e+00

Pos logit contributions
 attent+0.047
 liquids+0.042
works+0.041
sc+0.040
inct+0.038
Neg logit contributions
 glad-0.059
 eager-0.055
 greedy-0.054
 swinging-0.053
 no-0.053
Tokenie
Token ID494
Feature activation+4.70e-01

Pos logit contributions
 attent+0.118
ork+0.114
 liquids+0.112
inn+0.111
 Grac+0.106
Neg logit contributions
 glad-0.127
 eager-0.119
 rider-0.119
 eagle-0.118
 strongest-0.118
Token.
Token ID13
Feature activation+6.36e-01

Pos logit contributions
ork+0.015
 attent+0.015
inn+0.015
or+0.014
oons+0.014
Neg logit contributions
 glad-0.010
 floppy-0.009
 merry-0.009
 rider-0.008
 eager-0.008
Token It
Token ID632
Feature activation+9.72e-02

Pos logit contributions
oons+0.022
ork+0.022
inn+0.022
works+0.021
 attent+0.021
Neg logit contributions
 floppy-0.011
 eager-0.010
 glad-0.010
 no-0.009
 riding-0.009
Token was
Token ID373
Feature activation+2.83e-01

Pos logit contributions
inn+0.003
ork+0.003
 attent+0.003
or+0.003
oons+0.003
Neg logit contributions
 glad-0.002
 floppy-0.002
 merry-0.002
 eager-0.002
 no-0.002
Token a
Token ID257
Feature activation-2.99e-01

Pos logit contributions
ork+0.011
inn+0.011
 attent+0.011
 Grac+0.010
 liquids+0.010
Neg logit contributions
 glad-0.004
 rider-0.004
 merry-0.003
 floppy-0.003
 eagle-0.003
Token
Token ID198
Feature activation+8.83e-01

Pos logit contributions
 attent+0.004
ork+0.004
inn+0.004
 liquids+0.004
oons+0.004
Neg logit contributions
 glad-0.003
 merry-0.002
 eager-0.002
 floppy-0.002
 riding-0.002
TokenOne
Token ID3198
Feature activation-8.86e-01

Pos logit contributions
 liquids+0.033
ork+0.033
 attent+0.032
oons+0.032
works+0.032
Neg logit contributions
 also-0.011
 glad-0.011
 merry-0.011
 riding-0.011
 no-0.011
Token day
Token ID1110
Feature activation+5.45e-01

Pos logit contributions
 glad+0.023
 floppy+0.022
 merry+0.022
 rider+0.021
 greedy+0.021
Neg logit contributions
inn-0.023
 attent-0.023
ork-0.023
or-0.021
oons-0.020
Token,
Token ID11
Feature activation+7.99e-02

Pos logit contributions
 attent+0.027
ork+0.026
oons+0.026
inn+0.025
works+0.025
Neg logit contributions
 no-0.003
 glad-0.003
 swinging-0.002
 riding-0.002
 then-0.001
Token L
Token ID406
Feature activation+7.91e-01

Pos logit contributions
ork+0.002
oons+0.002
inn+0.002
 attent+0.002
or+0.002
Neg logit contributions
 glad-0.002
 no-0.002
 merry-0.002
 eager-0.002
 swinging-0.002
Tokenott
Token ID1252
Feature activation+4.68e+00

Pos logit contributions
 attent+0.014
ork+0.013
inn+0.013
 liquids+0.013
sc+0.012
Neg logit contributions
 glad-0.028
 eager-0.026
 greedy-0.025
 no-0.025
 floppy-0.024
Tokenie
Token ID494
Feature activation+5.86e-01

Pos logit contributions
 attent+0.114
ork+0.111
inn+0.110
 liquids+0.105
 Grac+0.101
Neg logit contributions
 glad-0.123
 eagle-0.117
 eager-0.116
 floppy-0.115
 rider-0.114
Token's
Token ID338
Feature activation-3.95e-01

Pos logit contributions
ork+0.017
 attent+0.016
or+0.015
inn+0.015
 liquids+0.015
Neg logit contributions
 glad-0.014
 merry-0.013
 floppy-0.013
 rider-0.012
 strongest-0.012
Token mom
Token ID1995
Feature activation+2.64e-01

Pos logit contributions
 glad+0.010
 merry+0.010
 floppy+0.010
 eagle+0.009
 strongest+0.009
Neg logit contributions
ork-0.010
 attent-0.009
inn-0.009
eth-0.009
 Grac-0.008
Token went
Token ID1816
Feature activation+7.85e-01

Pos logit contributions
ork+0.008
 attent+0.008
inn+0.008
or+0.007
oons+0.007
Neg logit contributions
 glad-0.006
 floppy-0.006
 merry-0.006
 rider-0.005
 strongest-0.005
Token to
Token ID284
Feature activation+1.61e+00

Pos logit contributions
ork+0.027
inn+0.026
oons+0.026
 attent+0.026
ub+0.025
Neg logit contributions
 glad-0.015
 eager-0.015
 riding-0.014
 swinging-0.013
 floppy-0.013
Token loyal
Token ID9112
Feature activation-3.18e-01

Pos logit contributions
 glad+0.010
 floppy+0.010
 merry+0.009
 rider+0.009
 strongest+0.009
Neg logit contributions
ork-0.019
 attent-0.019
inn-0.019
or-0.018
 liquids-0.018
Token little
Token ID1310
Feature activation-1.27e-01

Pos logit contributions
 glad+0.004
 floppy+0.004
 merry+0.004
 eager+0.004
 rider+0.004
Neg logit contributions
ork-0.012
inn-0.012
 attent-0.012
or-0.011
oons-0.011
Token girl
Token ID2576
Feature activation+9.17e-01

Pos logit contributions
 glad+0.002
 floppy+0.002
 merry+0.002
 happiest+0.002
 no+0.002
Neg logit contributions
inn-0.004
ork-0.004
 attent-0.004
or-0.004
 liquids-0.004
Token named
Token ID3706
Feature activation+2.64e+00

Pos logit contributions
ork+0.031
inn+0.030
 attent+0.030
or+0.028
 liquids+0.028
Neg logit contributions
 glad-0.017
 floppy-0.016
 merry-0.016
 strongest-0.015
 happiest-0.015
Token L
Token ID406
Feature activation+1.59e+00

Pos logit contributions
 attent+0.071
ork+0.069
inn+0.069
 liquids+0.067
oons+0.066
Neg logit contributions
 glad-0.069
 floppy-0.065
 eager-0.065
 no-0.064
 merry-0.062
Tokenott
Token ID1252
Feature activation+4.55e+00

Pos logit contributions
 attent+0.032
 liquids+0.030
inn+0.027
ork+0.027
inct+0.025
Neg logit contributions
 glad-0.051
 eager-0.047
 no-0.047
 merry-0.047
 swinging-0.047
Tokenie
Token ID494
Feature activation+1.00e+00

Pos logit contributions
 attent+0.110
ork+0.108
inn+0.107
 liquids+0.104
 Grac+0.098
Neg logit contributions
 glad-0.121
 rider-0.114
 eager-0.114
 eagle-0.114
 greedy-0.111
Token.
Token ID13
Feature activation+5.93e-01

Pos logit contributions
ork+0.028
 attent+0.028
inn+0.028
or+0.026
 liquids+0.025
Neg logit contributions
 glad-0.023
 floppy-0.023
 merry-0.023
 strongest-0.022
 happiest-0.021
Token Every
Token ID3887
Feature activation+1.87e-02

Pos logit contributions
ork+0.021
 attent+0.021
inn+0.021
oons+0.020
or+0.020
Neg logit contributions
 glad-0.009
 floppy-0.009
 merry-0.009
 eager-0.009
 no-0.008
Token day
Token ID1110
Feature activation-3.92e-01

Pos logit contributions
ork+0.000
inn+0.000
 attent+0.000
or+0.000
oons+0.000
Neg logit contributions
 glad-0.001
 floppy-0.000
 merry-0.000
 greedy-0.000
 strongest-0.000
Token,
Token ID11
Feature activation-3.63e-01

Pos logit contributions
 glad+0.005
 floppy+0.005
 no+0.004
 merry+0.004
 eager+0.004
Neg logit contributions
 attent-0.016
ork-0.016
inn-0.015
oons-0.015
 liquids-0.015
Token a
Token ID257
Feature activation+1.06e+00

Pos logit contributions
 merry+0.010
 glad+0.010
 rider+0.010
 floppy+0.009
 happiest+0.009
Neg logit contributions
 attent-0.013
inn-0.013
ork-0.013
 Grac-0.012
 liquids-0.012
Token kind
Token ID1611
Feature activation+6.75e-01

Pos logit contributions
ork+0.038
 attent+0.037
inn+0.037
 liquids+0.034
or+0.034
Neg logit contributions
 glad-0.017
 floppy-0.016
 merry-0.015
 rider-0.015
 eager-0.015
Token deer
Token ID20096
Feature activation+8.08e-01

Pos logit contributions
ork+0.023
inn+0.023
 attent+0.022
or+0.021
 Grac+0.021
Neg logit contributions
 glad-0.012
 merry-0.011
 floppy-0.011
 happiest-0.011
 strongest-0.011
Token named
Token ID3706
Feature activation+1.66e+00

Pos logit contributions
ork+0.028
inn+0.028
 attent+0.028
oons+0.026
or+0.026
Neg logit contributions
 glad-0.014
 floppy-0.014
 merry-0.013
 eager-0.013
 rider-0.013
Token D
Token ID360
Feature activation+1.73e+00

Pos logit contributions
ork+0.040
 attent+0.039
inn+0.039
or+0.036
 liquids+0.035
Neg logit contributions
 glad-0.047
 floppy-0.045
 merry-0.044
 rider-0.043
 eager-0.043
Tokenott
Token ID1252
Feature activation+4.47e+00

Pos logit contributions
 attent+0.039
ork+0.038
 liquids+0.035
oons+0.034
inn+0.033
Neg logit contributions
 glad-0.048
 floppy-0.048
 rider-0.048
 eager-0.047
 happiest-0.047
Tokenie
Token ID494
Feature activation+1.05e+00

Pos logit contributions
 attent+0.112
ork+0.108
inn+0.107
 liquids+0.103
 Grac+0.099
Neg logit contributions
 glad-0.117
 rider-0.111
 floppy-0.110
 eagle-0.110
 eager-0.108
Token appeared
Token ID4120
Feature activation+1.62e-01

Pos logit contributions
inn+0.036
ork+0.036
 attent+0.036
oons+0.034
 liquids+0.034
Neg logit contributions
 glad-0.018
 floppy-0.017
 eager-0.016
 merry-0.016
 no-0.016
Token.
Token ID13
Feature activation+1.04e+00

Pos logit contributions
ork+0.005
 attent+0.005
inn+0.005
or+0.004
ula+0.004
Neg logit contributions
 glad-0.004
 merry-0.004
 floppy-0.003
 rider-0.003
 strongest-0.003
Token "
Token ID366
Feature activation+2.77e-01

Pos logit contributions
oons+0.034
ork+0.034
works+0.033
 occasions+0.033
or+0.033
Neg logit contributions
 floppy-0.022
 glad-0.022
 eager-0.021
 curious-0.019
 greedy-0.018
TokenLeave
Token ID35087
Feature activation+5.71e-01

Pos logit contributions
ork+0.009
oons+0.009
 attent+0.009
inn+0.009
 liquids+0.008
Neg logit contributions
 glad-0.005
 floppy-0.005
Leave-0.005
 merry-0.005
 strongest-0.005
Token a
Token ID257
Feature activation-7.63e-01

Pos logit contributions
inn+0.018
ork+0.017
 attent+0.017
 Grac+0.016
ula+0.016
Neg logit contributions
 glad-0.008
 rider-0.008
 merry-0.008
 eagle-0.008
 strongest-0.008
Token good
Token ID922
Feature activation+4.18e-01

Pos logit contributions
 glad+0.014
 floppy+0.013
 merry+0.013
 strongest+0.012
 rider+0.012
Neg logit contributions
ork-0.025
 attent-0.025
inn-0.025
or-0.023
 liquids-0.023
Token dog
Token ID3290
Feature activation-4.23e-01

Pos logit contributions
ork+0.016
 attent+0.015
inn+0.015
oons+0.015
or+0.014
Neg logit contributions
 glad-0.006
 floppy-0.006
 merry-0.006
 eager-0.006
 rider-0.006
Token named
Token ID3706
Feature activation+2.26e+00

Pos logit contributions
 glad+0.006
 floppy+0.005
 eager+0.005
 merry+0.005
 rider+0.005
Neg logit contributions
ork-0.016
 attent-0.016
inn-0.016
oons-0.015
or-0.015
Token D
Token ID360
Feature activation+1.56e+00

Pos logit contributions
ork+0.068
inn+0.067
 attent+0.066
or+0.061
oons+0.061
Neg logit contributions
 glad-0.050
 floppy-0.046
 merry-0.046
 greedy-0.045
 eager-0.044
Tokenott
Token ID1252
Feature activation+4.46e+00

Pos logit contributions
 attent+0.037
ork+0.035
 liquids+0.034
inn+0.032
oons+0.031
Neg logit contributions
 glad-0.044
 floppy-0.042
 eager-0.042
 rider-0.040
 greedy-0.040
Tokenie
Token ID494
Feature activation+8.51e-01

Pos logit contributions
 attent+0.104
ork+0.101
inn+0.100
 liquids+0.097
 Grac+0.092
Neg logit contributions
 glad-0.124
 rider-0.118
 eager-0.116
 eagle-0.116
 floppy-0.116
Token.
Token ID13
Feature activation+8.34e-01

Pos logit contributions
ork+0.025
inn+0.025
 attent+0.025
or+0.023
oons+0.022
Neg logit contributions
 glad-0.019
 floppy-0.018
 merry-0.018
 strongest-0.017
 happiest-0.017
Token D
Token ID360
Feature activation+1.11e+00

Pos logit contributions
ork+0.018
 attent+0.018
inn+0.017
oons+0.016
or+0.016
Neg logit contributions
 glad-0.026
 floppy-0.025
 merry-0.024
 eager-0.024
 strongest-0.024
Tokenott
Token ID1252
Feature activation+3.91e+00

Pos logit contributions
 attent+0.022
ork+0.020
inn+0.019
 liquids+0.018
oons+0.018
Neg logit contributions
 glad-0.036
 eager-0.035
 greedy-0.034
 floppy-0.034
 strongest-0.034
Tokenie
Token ID494
Feature activation-2.35e-01

Pos logit contributions
 attent+0.093
ork+0.089
inn+0.088
 liquids+0.085
ula+0.080
Neg logit contributions
 glad-0.108
 eager-0.101
 floppy-0.101
 eagle-0.100
 greedy-0.099
Token.
Token ID13
Feature activation+4.60e-01

Pos logit contributions
 glad+0.031
 floppy+0.030
 eagle+0.029
 merry+0.028
 strongest+0.028
Neg logit contributions
ork-0.041
 attent-0.040
inn-0.039
or-0.037
 liquids-0.036
Token
Token ID198
Feature activation+2.16e-01

Pos logit contributions
ork+0.017
 attent+0.017
oons+0.016
works+0.016
 liquids+0.016
Neg logit contributions
 glad-0.007
 eager-0.007
 floppy-0.007
 swinging-0.006
 riding-0.006
Token
Token ID198
Feature activation+9.21e-01

Pos logit contributions
 attent+0.007
ork+0.007
inn+0.007
 liquids+0.007
oons+0.007
Neg logit contributions
 glad-0.004
 eager-0.004
 floppy-0.004
 merry-0.004
 no-0.003
TokenWhen
Token ID2215
Feature activation+3.99e-02

Pos logit contributions
 attent+0.024
ork+0.024
 liquids+0.022
oons+0.022
works+0.022
Neg logit contributions
 glad-0.023
 floppy-0.023
 eager-0.022
 merry-0.022
 no-0.022
Token D
Token ID360
Feature activation+1.71e+00

Pos logit contributions
ork+0.001
 attent+0.001
inn+0.001
oons+0.001
 liquids+0.001
Neg logit contributions
 glad-0.001
 floppy-0.001
 no-0.001
 eager-0.001
 merry-0.001
Tokenott
Token ID1252
Feature activation+4.39e+00

Pos logit contributions
 attent+0.035
ork+0.032
oons+0.031
works+0.031
 liquids+0.031
Neg logit contributions
 glad-0.053
 eager-0.051
 greedy-0.048
 strongest-0.048
 happiest-0.048
Tokenie
Token ID494
Feature activation+7.36e-01

Pos logit contributions
 attent+0.110
ork+0.107
inn+0.103
 liquids+0.101
ula+0.099
Neg logit contributions
 glad-0.110
 eagle-0.106
 eager-0.104
 rider-0.103
 floppy-0.102
Token arrived
Token ID5284
Feature activation+4.26e-01

Pos logit contributions
 attent+0.023
inn+0.023
ork+0.023
oons+0.021
 liquids+0.021
Neg logit contributions
 glad-0.015
 floppy-0.014
 merry-0.014
 eager-0.014
 rider-0.014
Token at
Token ID379
Feature activation+1.52e-01

Pos logit contributions
ork+0.014
inn+0.014
 attent+0.014
or+0.013
oons+0.013
Neg logit contributions
 glad-0.009
 floppy-0.008
 merry-0.008
 rider-0.008
 eager-0.008
Token the
Token ID262
Feature activation+2.91e-01

Pos logit contributions
ork+0.005
inn+0.005
 attent+0.005
or+0.005
 liquids+0.005
Neg logit contributions
 glad-0.003
 floppy-0.002
 merry-0.002
 rider-0.002
 eager-0.002
Token lab
Token ID2248
Feature activation-3.73e-01

Pos logit contributions
inn+0.007
ork+0.007
 attent+0.007
oons+0.006
or+0.006
Neg logit contributions
 glad-0.008
 eager-0.008
 floppy-0.008
 happiest-0.008
 strongest-0.008
Token said
Token ID531
Feature activation+6.91e-01

Pos logit contributions
 glad+0.005
 floppy+0.005
 merry+0.005
 strongest+0.005
 happiest+0.005
Neg logit contributions
ork-0.007
 attent-0.007
inn-0.007
or-0.006
oons-0.006
Token,
Token ID11
Feature activation+1.08e+00

Pos logit contributions
inn+0.027
 attent+0.026
ork+0.026
oons+0.025
 liquids+0.025
Neg logit contributions
 glad-0.010
 no-0.009
 floppy-0.009
 merry-0.008
 eager-0.008
Token "
Token ID366
Feature activation+5.45e-01

Pos logit contributions
ork+0.035
inn+0.035
or+0.034
oons+0.034
 attent+0.033
Neg logit contributions
 glad-0.023
 eager-0.021
 floppy-0.021
 no-0.020
 riding-0.019
TokenHello
Token ID15496
Feature activation+6.31e-01

Pos logit contributions
ork+0.019
oons+0.018
 attent+0.018
inn+0.018
 liquids+0.016
Neg logit contributions
Leave-0.009
 glad-0.009
 waved-0.009
 strongest-0.009
 merry-0.009
Token D
Token ID360
Feature activation+2.38e+00

Pos logit contributions
 attent+0.023
inn+0.023
ork+0.022
oons+0.021
or+0.021
Neg logit contributions
 glad-0.010
 floppy-0.010
 eager-0.009
 merry-0.009
 greedy-0.009
Tokenott
Token ID1252
Feature activation+4.32e+00

Pos logit contributions
 attent+0.051
ork+0.045
oons+0.044
inn+0.044
 liquids+0.042
Neg logit contributions
 glad-0.072
 eager-0.068
 greedy-0.068
 strongest-0.066
 floppy-0.066
Tokenie
Token ID494
Feature activation+6.01e-02

Pos logit contributions
 attent+0.108
inn+0.104
ork+0.103
 liquids+0.100
 Grac+0.096
Neg logit contributions
 glad-0.114
 eagle-0.109
 eager-0.106
 greedy-0.105
 rider-0.103
Token!
Token ID0
Feature activation+4.30e-01

Pos logit contributions
ork+0.002
inn+0.002
 attent+0.002
or+0.002
oons+0.002
Neg logit contributions
 glad-0.001
 floppy-0.001
 merry-0.001
 rider-0.001
 eager-0.001
Token This
Token ID770
Feature activation-3.60e-02

Pos logit contributions
 attent+0.015
ork+0.015
oons+0.014
inn+0.014
or+0.013
Neg logit contributions
 glad-0.009
 floppy-0.008
 riding-0.008
 no-0.007
 eager-0.007
Token is
Token ID318
Feature activation+3.48e-01

Pos logit contributions
 glad+0.001
 floppy+0.001
 merry+0.001
 strongest+0.001
 rider+0.001
Neg logit contributions
inn-0.001
 attent-0.001
ork-0.001
or-0.001
oons-0.001
Token a
Token ID257
Feature activation-5.01e-02

Pos logit contributions
ork+0.015
inn+0.014
 attent+0.014
oons+0.014
or+0.014
Neg logit contributions
 glad-0.004
 no-0.004
 floppy-0.004
 strongest-0.004
 greedy-0.003
Token
Token ID198
Feature activation+4.34e-01

Pos logit contributions
 attent+0.007
ork+0.007
inn+0.007
 liquids+0.006
oons+0.006
Neg logit contributions
 glad-0.004
 merry-0.003
 eager-0.003
 floppy-0.003
 riding-0.003
TokenSuddenly
Token ID38582
Feature activation-5.05e-01

Pos logit contributions
 attent+0.014
ork+0.014
inn+0.014
or+0.014
oons+0.014
Neg logit contributions
 glad-0.008
 also-0.007
 eager-0.007
 no-0.007
 merry-0.007
Token,
Token ID11
Feature activation-6.51e-02

Pos logit contributions
 glad+0.006
 floppy+0.005
 no+0.005
 merry+0.005
 eager+0.005
Neg logit contributions
inn-0.021
 attent-0.021
ork-0.021
oons-0.019
or-0.019
Token they
Token ID484
Feature activation+1.00e-02

Pos logit contributions
 no+0.001
 glad+0.001
 floppy+0.001
 then+0.001
 eager+0.001
Neg logit contributions
inn-0.003
ork-0.003
 attent-0.003
oons-0.003
or-0.003
Token sp
Token ID599
Feature activation+1.20e+00

Pos logit contributions
ork+0.000
 attent+0.000
inn+0.000
or+0.000
 liquids+0.000
Neg logit contributions
 glad-0.000
 rider-0.000
 floppy-0.000
 merry-0.000
 strongest-0.000
Tokenott
Token ID1252
Feature activation+4.31e+00

Pos logit contributions
 attent+0.022
 Grac+0.022
ork+0.022
 liquids+0.021
inn+0.019
Neg logit contributions
 glad-0.036
 floppy-0.035
 eager-0.035
 riding-0.034
 swinging-0.034
Tokenled
Token ID992
Feature activation+1.49e-02

Pos logit contributions
 attent+0.111
inn+0.108
ork+0.108
 liquids+0.101
 Grac+0.099
Neg logit contributions
 glad-0.109
 floppy-0.103
 rider-0.103
 eagle-0.102
 merry-0.102
Token a
Token ID257
Feature activation+7.35e-01

Pos logit contributions
ork+0.001
inn+0.000
 attent+0.000
oons+0.000
or+0.000
Neg logit contributions
 glad-0.000
 floppy-0.000
 eager-0.000
 merry-0.000
 no-0.000
Token bicycle
Token ID17026
Feature activation-8.81e-01

Pos logit contributions
inn+0.026
ork+0.026
 attent+0.026
or+0.025
oons+0.025
Neg logit contributions
 glad-0.012
 floppy-0.011
 merry-0.011
 eager-0.011
 rider-0.010
Token.
Token ID13
Feature activation-7.35e-02

Pos logit contributions
 glad+0.021
 floppy+0.019
 merry+0.018
 happiest+0.018
 greedy+0.018
Neg logit contributions
ork-0.026
 attent-0.026
inn-0.025
or-0.024
 liquids-0.023
Token It
Token ID632
Feature activation+1.42e-01

Pos logit contributions
 floppy+0.001
 eager+0.001
 no+0.001
 curious+0.001
 glad+0.001
Neg logit contributions
or-0.003
ork-0.003
inn-0.003
oons-0.003
 occasions-0.003
Token After
Token ID2293
Feature activation+4.65e-01

Pos logit contributions
ork+0.001
oons+0.001
 attent+0.001
inn+0.001
works+0.001
Neg logit contributions
 glad-0.000
 floppy-0.000
 eager-0.000
 riding-0.000
 no-0.000
Token a
Token ID257
Feature activation+1.14e+00

Pos logit contributions
ork+0.017
 attent+0.017
inn+0.017
 liquids+0.016
or+0.016
Neg logit contributions
 glad-0.007
 floppy-0.007
 rider-0.006
 merry-0.006
 strongest-0.006
Token while
Token ID981
Feature activation-3.97e-01

Pos logit contributions
auc+0.039
oons+0.037
angers+0.037
ount+0.036
or+0.035
Neg logit contributions
 glad-0.024
 merry-0.023
 happy-0.023
 brave-0.022
 friendly-0.022
Token,
Token ID11
Feature activation-2.43e-01

Pos logit contributions
 glad+0.005
 floppy+0.005
 no+0.004
 eager+0.004
 merry+0.004
Neg logit contributions
ork-0.016
 attent-0.016
inn-0.016
oons-0.015
 liquids-0.015
Token D
Token ID360
Feature activation+1.67e+00

Pos logit contributions
 glad+0.004
 no+0.004
 floppy+0.004
 eager+0.004
 happiest+0.004
Neg logit contributions
oons-0.009
ork-0.009
inn-0.009
 attent-0.009
works-0.008
Tokenott
Token ID1252
Feature activation+4.18e+00

Pos logit contributions
 attent+0.035
oons+0.032
ork+0.032
inn+0.032
 liquids+0.030
Neg logit contributions
 glad-0.051
 eager-0.048
 greedy-0.047
 happiest-0.046
 strongest-0.046
Tokenie
Token ID494
Feature activation+2.51e-02

Pos logit contributions
 attent+0.101
ork+0.098
inn+0.097
 liquids+0.095
 Grac+0.090
Neg logit contributions
 glad-0.110
 eagle-0.106
 eager-0.103
 floppy-0.102
 greedy-0.101
Token was
Token ID373
Feature activation-1.71e-01

Pos logit contributions
ork+0.001
 attent+0.001
inn+0.001
or+0.001
 liquids+0.001
Neg logit contributions
 glad-0.000
 floppy-0.000
 merry-0.000
 rider-0.000
 strongest-0.000
Token no
Token ID645
Feature activation-2.37e-01

Pos logit contributions
 rider+0.003
 floppy+0.003
 eagle+0.003
 merry+0.003
 greedy+0.003
Neg logit contributions
inn-0.005
 attent-0.005
ork-0.005
 Grac-0.005
or-0.005
Token longer
Token ID2392
Feature activation-2.05e-01

Pos logit contributions
 glad+0.006
 merry+0.005
 floppy+0.005
 rider+0.005
 greedy+0.005
Neg logit contributions
inn-0.007
 attent-0.007
ork-0.007
 liquids-0.006
or-0.006
Token scared
Token ID12008
Feature activation-9.02e-01

Pos logit contributions
 merry+0.003
 rider+0.003
 floppy+0.003
 eagle+0.002
 glad+0.002
Neg logit contributions
 attent-0.007
ork-0.007
inn-0.007
 liquids-0.007
 Grac-0.007
Token?"
Token ID1701
Feature activation+2.45e-01

Pos logit contributions
ork+0.004
 attent+0.004
inn+0.004
or+0.004
oons+0.004
Neg logit contributions
 glad-0.003
 floppy-0.003
 merry-0.003
 rider-0.003
 greedy-0.003
Token
Token ID220
Feature activation+7.65e-01

Pos logit contributions
oons+0.009
works+0.009
ork+0.009
 attent+0.009
inner+0.009
Neg logit contributions
 glad-0.004
 floppy-0.003
 eager-0.003
 then-0.003
 riding-0.003
Token
Token ID198
Feature activation+2.68e-01

Pos logit contributions
 attent+0.030
 occasions+0.030
 liquids+0.029
inn+0.029
works+0.029
Neg logit contributions
 glad-0.010
 eager-0.008
 riding-0.007
 no-0.007
 happy-0.007
Token
Token ID198
Feature activation+2.37e-01

Pos logit contributions
 attent+0.010
ork+0.009
oons+0.009
inn+0.009
works+0.009
Neg logit contributions
 glad-0.005
 floppy-0.004
 merry-0.004
 eager-0.004
 no-0.004
TokenD
Token ID35
Feature activation+2.34e+00

Pos logit contributions
ork+0.006
 attent+0.006
oons+0.006
inn+0.006
works+0.006
Neg logit contributions
 glad-0.006
 floppy-0.006
 merry-0.006
 eager-0.006
 greedy-0.006
Tokenott
Token ID1252
Feature activation+4.16e+00

Pos logit contributions
 attent+0.051
 occasions+0.047
sc+0.045
inn+0.045
 liquids+0.045
Neg logit contributions
 glad-0.065
 eager-0.062
 greedy-0.061
 floppy-0.058
 strongest-0.058
Tokenie
Token ID494
Feature activation-1.53e-02

Pos logit contributions
 attent+0.100
ork+0.099
inn+0.096
 liquids+0.093
ula+0.090
Neg logit contributions
 glad-0.111
 eagle-0.106
 eager-0.104
 floppy-0.102
 greedy-0.102
Token per
Token ID583
Feature activation+2.22e-02

Pos logit contributions
 glad+0.000
 floppy+0.000
 merry+0.000
 eager+0.000
 rider+0.000
Neg logit contributions
ork-0.001
inn-0.001
 attent-0.001
oons-0.000
or-0.000
Tokenked
Token ID9091
Feature activation+8.05e-02

Pos logit contributions
oons+0.001
ork+0.001
 attent+0.001
inn+0.001
 occasions+0.001
Neg logit contributions
 floppy-0.001
 glad-0.001
 rider-0.001
 eager-0.000
 greedy-0.000
Token up
Token ID510
Feature activation+9.51e-02

Pos logit contributions
ork+0.003
inn+0.003
 attent+0.003
oons+0.003
or+0.003
Neg logit contributions
 glad-0.001
 floppy-0.001
 merry-0.001
 eager-0.001
 rider-0.001
Token and
Token ID290
Feature activation-1.11e-01

Pos logit contributions
ork+0.003
 attent+0.003
inn+0.003
or+0.003
 liquids+0.003
Neg logit contributions
 glad-0.001
 floppy-0.001
 merry-0.001
 rider-0.001
 strongest-0.001
Token the
Token ID262
Feature activation+1.74e+00

Pos logit contributions
ork+0.043
 attent+0.042
inn+0.041
or+0.039
oons+0.039
Neg logit contributions
 glad-0.019
 floppy-0.018
 merry-0.017
 eager-0.017
 strongest-0.016
Token n
Token ID299
Feature activation+1.84e+00

Pos logit contributions
ork+0.056
inn+0.056
 attent+0.053
or+0.052
ub+0.051
Neg logit contributions
 glad-0.037
 floppy-0.036
 eager-0.035
 merry-0.034
 strongest-0.034
Tokenooks
Token ID31085
Feature activation+2.95e+00

Pos logit contributions
 attent+0.040
ork+0.037
inn+0.035
 liquids+0.034
 Grac+0.033
Neg logit contributions
 glad-0.056
 eager-0.050
 swinging-0.049
 floppy-0.048
 no-0.048
Token and
Token ID290
Feature activation+6.94e-01

Pos logit contributions
 attent+0.111
inn+0.111
ork+0.110
or+0.104
oons+0.104
Neg logit contributions
 glad-0.042
 merry-0.040
 floppy-0.040
 rider-0.039
 strongest-0.036
Token cr
Token ID1067
Feature activation+1.60e+00

Pos logit contributions
ork+0.021
inn+0.021
 attent+0.021
or+0.020
oons+0.019
Neg logit contributions
 glad-0.015
 floppy-0.015
 merry-0.014
 eager-0.014
 no-0.014
Tokenann
Token ID1236
Feature activation+4.07e+00

Pos logit contributions
 attent+0.042
inn+0.041
ork+0.041
or+0.038
 Grac+0.037
Neg logit contributions
 floppy-0.039
 glad-0.037
 swinging-0.035
 no-0.035
 riding-0.034
Tokenies
Token ID444
Feature activation+2.18e+00

Pos logit contributions
inn+0.095
ork+0.095
 attent+0.093
or+0.082
 liquids+0.080
Neg logit contributions
 glad-0.116
 floppy-0.110
 eager-0.109
 greedy-0.105
 no-0.105
Token.
Token ID13
Feature activation+7.09e-01

Pos logit contributions
ork+0.071
 attent+0.070
inn+0.070
or+0.065
oons+0.065
Neg logit contributions
 glad-0.043
 floppy-0.041
 merry-0.039
 eager-0.038
 rider-0.038
Token
Token ID198
Feature activation+4.88e-01

Pos logit contributions
ork+0.028
or+0.025
 attent+0.025
works+0.025
auc+0.025
Neg logit contributions
 glad-0.011
 floppy-0.009
 eager-0.009
 no-0.008
 curious-0.008
Token
Token ID198
Feature activation+3.20e-01

Pos logit contributions
 attent+0.018
ork+0.017
 liquids+0.017
inn+0.016
works+0.016
Neg logit contributions
 glad-0.008
 eager-0.006
 also-0.006
 floppy-0.006
 happy-0.006
TokenAfter
Token ID3260
Feature activation+9.23e-01

Pos logit contributions
ork+0.009
 attent+0.008
ount+0.008
 liquids+0.008
angers+0.008
Neg logit contributions
 glad-0.008
 also-0.008
 brave-0.007
 no-0.007
 eager-0.007
Token D
Token ID360
Feature activation+1.56e+00

Pos logit contributions
ork+0.068
inn+0.067
 attent+0.066
or+0.061
oons+0.061
Neg logit contributions
 glad-0.050
 floppy-0.046
 merry-0.046
 greedy-0.045
 eager-0.044
Tokenott
Token ID1252
Feature activation+4.46e+00

Pos logit contributions
 attent+0.037
ork+0.035
 liquids+0.034
inn+0.032
oons+0.031
Neg logit contributions
 glad-0.044
 floppy-0.042
 eager-0.042
 rider-0.040
 greedy-0.040
Tokenie
Token ID494
Feature activation+8.51e-01

Pos logit contributions
 attent+0.104
ork+0.101
inn+0.100
 liquids+0.097
 Grac+0.092
Neg logit contributions
 glad-0.124
 rider-0.118
 eager-0.116
 eagle-0.116
 floppy-0.116
Token.
Token ID13
Feature activation+8.34e-01

Pos logit contributions
ork+0.025
inn+0.025
 attent+0.025
or+0.023
oons+0.022
Neg logit contributions
 glad-0.019
 floppy-0.018
 merry-0.018
 strongest-0.017
 happiest-0.017
Token D
Token ID360
Feature activation+1.11e+00

Pos logit contributions
ork+0.018
 attent+0.018
inn+0.017
oons+0.016
or+0.016
Neg logit contributions
 glad-0.026
 floppy-0.025
 merry-0.024
 eager-0.024
 strongest-0.024
Tokenott
Token ID1252
Feature activation+3.91e+00

Pos logit contributions
 attent+0.022
ork+0.020
inn+0.019
 liquids+0.018
oons+0.018
Neg logit contributions
 glad-0.036
 eager-0.035
 greedy-0.034
 floppy-0.034
 strongest-0.034
Tokenie
Token ID494
Feature activation-2.35e-01

Pos logit contributions
 attent+0.093
ork+0.089
inn+0.088
 liquids+0.085
ula+0.080
Neg logit contributions
 glad-0.108
 eager-0.101
 floppy-0.101
 eagle-0.100
 greedy-0.099
Token wanted
Token ID2227
Feature activation-1.46e+00

Pos logit contributions
 glad+0.005
 merry+0.004
 happiest+0.004
 strongest+0.004
 rider+0.004
Neg logit contributions
ork-0.007
inn-0.007
 attent-0.007
or-0.006
eth-0.006
Token to
Token ID284
Feature activation+2.38e-01

Pos logit contributions
 glad+0.042
 floppy+0.040
 merry+0.039
 rider+0.039
 happiest+0.038
Neg logit contributions
ork-0.035
 attent-0.034
inn-0.033
 liquids-0.030
or-0.030
Token visit
Token ID3187
Feature activation-1.47e+00

Pos logit contributions
ork+0.008
 attent+0.008
inn+0.008
or+0.007
oons+0.007
Neg logit contributions
 glad-0.005
 floppy-0.004
 merry-0.004
 eager-0.004
 rider-0.004
Token a
Token ID257
Feature activation-4.45e-01

Pos logit contributions
 glad+0.032
 merry+0.029
 floppy+0.029
 rider+0.029
 greedy+0.029
Neg logit contributions
 attent-0.044
inn-0.044
ork-0.044
or-0.040
 Grac-0.040
Token there
Token ID612
Feature activation+6.30e-01

Pos logit contributions
inn+0.028
ork+0.028
 attent+0.027
oons+0.025
or+0.025
Neg logit contributions
 glad-0.016
 floppy-0.016
 merry-0.016
 rider-0.016
 eagle-0.015
Token was
Token ID373
Feature activation+7.41e-01

Pos logit contributions
ork+0.017
inn+0.017
 attent+0.016
oons+0.015
or+0.015
Neg logit contributions
 glad-0.016
 floppy-0.016
 rider-0.015
 merry-0.015
 greedy-0.015
Token a
Token ID257
Feature activation-5.77e-01

Pos logit contributions
inn+0.026
 attent+0.025
ork+0.025
 Grac+0.023
oons+0.023
Neg logit contributions
 glad-0.012
 rider-0.012
 merry-0.012
 eagle-0.011
 strongest-0.011
Token girl
Token ID2576
Feature activation+1.02e+00

Pos logit contributions
 glad+0.011
 floppy+0.010
 merry+0.010
 strongest+0.009
 rider+0.009
Neg logit contributions
ork-0.019
 attent-0.019
inn-0.018
 liquids-0.017
 Grac-0.017
Token named
Token ID3706
Feature activation+2.42e+00

Pos logit contributions
ork+0.031
inn+0.030
 attent+0.030
or+0.028
 Grac+0.027
Neg logit contributions
 glad-0.022
 merry-0.021
 floppy-0.021
 strongest-0.021
 happiest-0.021
Token Grac
Token ID38099
Feature activation+3.89e+00

Pos logit contributions
 attent+0.068
inn+0.066
ork+0.066
or+0.064
oons+0.063
Neg logit contributions
 glad-0.061
 floppy-0.059
 no-0.057
 eager-0.057
 swinging-0.054
Tokenie
Token ID494
Feature activation+1.08e+00

Pos logit contributions
ork+0.102
 attent+0.100
inn+0.097
 liquids+0.092
ula+0.085
Neg logit contributions
 glad-0.098
 eager-0.094
 greedy-0.091
 happiest-0.091
 strongest-0.090
Token.
Token ID13
Feature activation+1.54e+00

Pos logit contributions
 attent+0.031
ork+0.031
inn+0.030
or+0.028
 Grac+0.027
Neg logit contributions
 glad-0.025
 floppy-0.024
 merry-0.024
 strongest-0.023
 happiest-0.023
Token She
Token ID1375
Feature activation+2.40e-01

Pos logit contributions
ork+0.052
 attent+0.049
inn+0.048
oons+0.047
or+0.047
Neg logit contributions
 floppy-0.030
 glad-0.029
 eager-0.028
 no-0.027
 happiest-0.026
Token was
Token ID373
Feature activation-6.23e-01

Pos logit contributions
ork+0.009
inn+0.009
 attent+0.009
oons+0.008
or+0.008
Neg logit contributions
 glad-0.004
 floppy-0.003
 merry-0.003
 strongest-0.003
 happiest-0.003
Token big
Token ID1263
Feature activation-1.94e+00

Pos logit contributions
 eagle+0.010
 rider+0.010
 floppy+0.010
 merry+0.009
 glad+0.009
Neg logit contributions
inn-0.020
 attent-0.020
 Grac-0.020
ork-0.019
ula-0.019
Token was
Token ID373
Feature activation+6.68e-01

Pos logit contributions
ork+0.019
inn+0.019
 attent+0.018
oons+0.017
or+0.017
Neg logit contributions
 glad-0.020
 floppy-0.019
 rider-0.018
 merry-0.018
 greedy-0.018
Token a
Token ID257
Feature activation-4.49e-01

Pos logit contributions
inn+0.022
 attent+0.022
ork+0.021
 Grac+0.020
 las+0.020
Neg logit contributions
 glad-0.011
 rider-0.011
 merry-0.011
 eagle-0.011
 strongest-0.011
Token little
Token ID1310
Feature activation+3.29e-02

Pos logit contributions
 glad+0.008
 floppy+0.008
 merry+0.007
 strongest+0.007
 no+0.007
Neg logit contributions
ork-0.015
 attent-0.015
inn-0.015
oons-0.013
 liquids-0.013
Token girl
Token ID2576
Feature activation+1.32e+00

Pos logit contributions
inn+0.001
ork+0.001
 attent+0.001
 liquids+0.001
 Grac+0.001
Neg logit contributions
 glad-0.001
 floppy-0.001
 merry-0.001
 no-0.001
 happiest-0.001
Token named
Token ID3706
Feature activation+2.70e+00

Pos logit contributions
ork+0.042
inn+0.041
 attent+0.041
or+0.039
 Grac+0.037
Neg logit contributions
 glad-0.026
 merry-0.025
 floppy-0.025
 strongest-0.024
 happiest-0.024
Token Grac
Token ID38099
Feature activation+3.88e+00

Pos logit contributions
 attent+0.072
ork+0.071
inn+0.070
or+0.067
oons+0.066
Neg logit contributions
 glad-0.073
 eager-0.069
 floppy-0.068
 no-0.068
 greedy-0.064
Tokenie
Token ID494
Feature activation+1.15e+00

Pos logit contributions
ork+0.097
 attent+0.094
inn+0.093
 liquids+0.087
ula+0.081
Neg logit contributions
 glad-0.104
 eager-0.099
 greedy-0.097
 happiest-0.096
 strongest-0.096
Token.
Token ID13
Feature activation+1.79e+00

Pos logit contributions
 attent+0.032
ork+0.032
inn+0.031
or+0.029
 Grac+0.028
Neg logit contributions
 glad-0.027
 floppy-0.026
 merry-0.026
 strongest-0.025
 rider-0.025
Token She
Token ID1375
Feature activation+4.30e-01

Pos logit contributions
ork+0.063
 attent+0.059
inn+0.057
or+0.056
oons+0.056
Neg logit contributions
 floppy-0.031
 glad-0.031
 eager-0.031
 no-0.029
 happiest-0.028
Token was
Token ID373
Feature activation-5.55e-01

Pos logit contributions
ork+0.017
 attent+0.016
inn+0.016
oons+0.015
or+0.015
Neg logit contributions
 glad-0.006
 floppy-0.005
 merry-0.005
 rider-0.005
 strongest-0.005
Token very
Token ID845
Feature activation-3.44e-01

Pos logit contributions
 eagle+0.009
 rider+0.009
 floppy+0.008
 merry+0.008
 glad+0.008
Neg logit contributions
 attent-0.018
inn-0.018
 Grac-0.018
ork-0.017
ula-0.017
Token she
Token ID673
Feature activation+3.19e-01

Pos logit contributions
inn+0.019
 attent+0.019
ork+0.018
 liquids+0.018
 occasions+0.017
Neg logit contributions
 glad-0.011
 no-0.010
 eager-0.009
 floppy-0.009
 merry-0.009
Token was
Token ID373
Feature activation-3.16e-01

Pos logit contributions
ork+0.010
 attent+0.010
inn+0.010
oons+0.009
or+0.009
Neg logit contributions
 glad-0.007
 floppy-0.006
 merry-0.006
 eager-0.006
 rider-0.006
Token inside
Token ID2641
Feature activation+1.85e+00

Pos logit contributions
 merry+0.007
 floppy+0.007
 rider+0.007
 glad+0.006
 eagle+0.006
Neg logit contributions
inn-0.009
 attent-0.009
ork-0.009
or-0.008
 Grac-0.008
Token,
Token ID11
Feature activation-2.33e-01

Pos logit contributions
 attent+0.064
ork+0.064
inn+0.063
or+0.060
 liquids+0.059
Neg logit contributions
 glad-0.031
 floppy-0.031
 rider-0.030
 merry-0.029
 strongest-0.028
Token D
Token ID360
Feature activation+1.21e+00

Pos logit contributions
 glad+0.004
 floppy+0.004
 merry+0.004
 eager+0.004
 no+0.004
Neg logit contributions
ork-0.008
inn-0.008
 attent-0.008
oons-0.007
or-0.007
Tokenott
Token ID1252
Feature activation+3.87e+00

Pos logit contributions
 attent+0.029
ork+0.027
 liquids+0.027
inn+0.026
oons+0.026
Neg logit contributions
 glad-0.033
 floppy-0.032
 eager-0.032
 greedy-0.031
 happiest-0.031
Tokenie
Token ID494
Feature activation+3.71e-01

Pos logit contributions
ork+0.081
inn+0.080
 attent+0.080
 liquids+0.071
oons+0.070
Neg logit contributions
 glad-0.120
 floppy-0.115
 merry-0.113
 eager-0.112
 rider-0.111
Token saw
Token ID2497
Feature activation-3.13e-01

Pos logit contributions
ork+0.012
 attent+0.012
inn+0.011
or+0.011
 liquids+0.011
Neg logit contributions
 glad-0.008
 merry-0.007
 floppy-0.007
 rider-0.007
 strongest-0.007
Token lots
Token ID6041
Feature activation-4.31e-01

Pos logit contributions
 rider+0.007
 glad+0.007
 merry+0.007
 strongest+0.006
 floppy+0.006
Neg logit contributions
ork-0.009
 attent-0.009
 Grac-0.009
 liquids-0.009
inn-0.008
Token of
Token ID286
Feature activation+5.70e-01

Pos logit contributions
 glad+0.010
 merry+0.010
 rider+0.009
 floppy+0.009
 no+0.009
Neg logit contributions
 attent-0.012
inn-0.012
ork-0.012
 liquids-0.012
oons-0.011
Token colorful
Token ID20239
Feature activation+1.87e-01

Pos logit contributions
 attent+0.019
ork+0.019
inn+0.018
 liquids+0.018
oons+0.017
Neg logit contributions
 glad-0.011
 floppy-0.010
 merry-0.010
 rider-0.010
 happiest-0.009
Token stove
Token ID27635
Feature activation+7.70e-01

Pos logit contributions
 glad+0.002
 floppy+0.002
 merry+0.002
 eager+0.002
 happiest+0.002
Neg logit contributions
inn-0.002
ork-0.002
 attent-0.002
or-0.002
oons-0.002
Token -
Token ID532
Feature activation-8.20e-01

Pos logit contributions
ork+0.025
 attent+0.024
inn+0.024
oons+0.022
or+0.022
Neg logit contributions
 glad-0.016
 floppy-0.015
 merry-0.014
 rider-0.014
 greedy-0.014
Token two
Token ID734
Feature activation+1.14e-02

Pos logit contributions
 glad+0.015
 floppy+0.015
 merry+0.015
 rider+0.014
 strongest+0.014
Neg logit contributions
ork-0.027
 attent-0.027
inn-0.026
 liquids-0.025
oons-0.024
Token more
Token ID517
Feature activation+7.22e-01

Pos logit contributions
 attent+0.000
ork+0.000
inn+0.000
 liquids+0.000
oons+0.000
Neg logit contributions
 glad-0.000
 rider-0.000
 floppy-0.000
 merry-0.000
 eager-0.000
Token sh
Token ID427
Feature activation+6.24e-01

Pos logit contributions
ork+0.021
inn+0.021
 attent+0.021
or+0.020
oons+0.019
Neg logit contributions
 glad-0.017
 floppy-0.016
 eager-0.015
 merry-0.015
 greedy-0.015
Tokenrim
Token ID3036
Feature activation+3.82e+00

Pos logit contributions
ork+0.011
 attent+0.011
inn+0.011
or+0.009
oons+0.009
Neg logit contributions
 glad-0.022
 floppy-0.021
 merry-0.021
 eager-0.020
 rider-0.020
Tokenps
Token ID862
Feature activation+9.84e-01

Pos logit contributions
ork+0.144
 attent+0.136
inn+0.136
oons+0.125
 liquids+0.125
Neg logit contributions
 glad-0.061
 floppy-0.052
 eager-0.052
 riding-0.048
 strongest-0.048
Token!
Token ID0
Feature activation-1.57e-01

Pos logit contributions
ork+0.030
inn+0.030
 attent+0.030
or+0.028
oons+0.028
Neg logit contributions
 glad-0.021
 floppy-0.020
 merry-0.019
 eager-0.019
 rider-0.019
Token His
Token ID2399
Feature activation-1.02e+00

Pos logit contributions
 glad+0.003
 floppy+0.003
 eager+0.003
 greedy+0.003
 no+0.003
Neg logit contributions
ork-0.005
oons-0.005
 occasions-0.005
works-0.005
 attent-0.005
Token dad
Token ID9955
Feature activation-9.98e-01

Pos logit contributions
 glad+0.019
 floppy+0.019
 rider+0.018
 merry+0.018
 strongest+0.017
Neg logit contributions
ork-0.034
 attent-0.034
inn-0.033
or-0.031
oons-0.031
Token had
Token ID550
Feature activation+3.45e-01

Pos logit contributions
 glad+0.020
 floppy+0.020
 merry+0.019
 strongest+0.019
 rider+0.019
Neg logit contributions
 attent-0.031
ork-0.030
inn-0.030
 Grac-0.028
or-0.028
Token slipped
Token ID18859
Feature activation+2.23e-01

Pos logit contributions
ork+0.005
 attent+0.005
inn+0.005
or+0.004
 liquids+0.004
Neg logit contributions
 glad-0.003
 floppy-0.003
 merry-0.003
 strongest-0.003
 eager-0.003
Token and
Token ID290
Feature activation+2.82e-01

Pos logit contributions
ork+0.009
 attent+0.009
inn+0.009
oons+0.009
or+0.009
Neg logit contributions
 glad-0.003
 floppy-0.002
 no-0.002
 merry-0.002
 eager-0.002
Token fell
Token ID3214
Feature activation+2.33e-01

Pos logit contributions
ork+0.010
 attent+0.010
inn+0.010
oons+0.009
or+0.009
Neg logit contributions
 glad-0.005
 floppy-0.004
 merry-0.004
 no-0.004
 eager-0.004
Token into
Token ID656
Feature activation+1.29e+00

Pos logit contributions
ork+0.007
inn+0.007
 attent+0.007
or+0.007
oons+0.007
Neg logit contributions
 glad-0.005
 floppy-0.005
 merry-0.004
 rider-0.004
 eager-0.004
Token the
Token ID262
Feature activation+1.30e-02

Pos logit contributions
ork+0.051
 attent+0.050
inn+0.050
or+0.048
oons+0.048
Neg logit contributions
 glad-0.017
 floppy-0.015
 merry-0.015
 eager-0.014
 no-0.014
Token rav
Token ID24343
Feature activation+3.77e+00

Pos logit contributions
ork+0.000
 attent+0.000
inn+0.000
 liquids+0.000
or+0.000
Neg logit contributions
 glad-0.000
 rider-0.000
 floppy-0.000
 riding-0.000
 eager-0.000
Tokenine
Token ID500
Feature activation+3.36e-01

Pos logit contributions
ork+0.078
 attent+0.077
inn+0.076
or+0.068
oons+0.068
Neg logit contributions
 glad-0.119
 floppy-0.115
 merry-0.112
 eager-0.110
 rider-0.110
Token below
Token ID2174
Feature activation+7.14e-02

Pos logit contributions
ork+0.011
 attent+0.010
inn+0.010
or+0.010
oons+0.009
Neg logit contributions
 glad-0.007
 rider-0.007
 floppy-0.007
 merry-0.006
 strongest-0.006
Token.
Token ID13
Feature activation-1.34e-02

Pos logit contributions
ork+0.002
 attent+0.002
inn+0.002
oons+0.002
 liquids+0.002
Neg logit contributions
 glad-0.001
 floppy-0.001
 merry-0.001
 no-0.001
 eager-0.001
Token\n
Token ID198
Feature activation+1.61e-02

Pos logit contributions
 glad+0.000
 eager+0.000
 no+0.000
 floppy+0.000
 riding+0.000
Neg logit contributions
inn-0.000
 attent-0.000
ork-0.000
oons-0.000
or-0.000
Token\n
Token ID198
Feature activation+1.13e-01

Pos logit contributions
 attent+0.001
inn+0.000
ork+0.000
 liquids+0.000
oons+0.000
Neg logit contributions
 glad-0.000
 floppy-0.000
 eager-0.000
 merry-0.000
 no-0.000
Token\n
Token ID198
Feature activation+6.29e-01

Pos logit contributions
 attent+0.008
ork+0.007
 liquids+0.007
inn+0.007
or+0.007
Neg logit contributions
 glad-0.004
 eager-0.003
 merry-0.003
 riding-0.003
 floppy-0.003
TokenNext
Token ID10019
Feature activation-5.17e-01

Pos logit contributions
ork+0.021
auc+0.020
oons+0.020
or+0.020
ware+0.020
Neg logit contributions
 glad-0.011
 also-0.011
 eager-0.011
 then-0.010
 no-0.010
Token,
Token ID11
Feature activation-2.53e-01

Pos logit contributions
 glad+0.004
 no+0.004
 then+0.003
 eager+0.003
 now+0.003
Neg logit contributions
 attent-0.024
oons-0.023
ork-0.023
inn-0.023
auc-0.022
Token the
Token ID262
Feature activation-3.09e-02

Pos logit contributions
 no+0.005
 glad+0.004
 eager+0.004
 then+0.004
 happy+0.004
Neg logit contributions
ork-0.010
inn-0.009
oons-0.009
 attent-0.009
or-0.009
Token b
Token ID275
Feature activation+1.18e+00

Pos logit contributions
 glad+0.001
 floppy+0.001
 rider+0.001
 greedy+0.001
 merry+0.001
Neg logit contributions
 attent-0.001
ork-0.001
inn-0.001
 Grac-0.001
 liquids-0.001
Tokenunn
Token ID20935
Feature activation+3.71e+00

Pos logit contributions
 attent+0.031
ork+0.031
inn+0.029
 liquids+0.027
oons+0.026
Neg logit contributions
 glad-0.031
 floppy-0.030
 rider-0.029
 eager-0.028
 strongest-0.028
Tokenies
Token ID444
Feature activation+1.23e+00

Pos logit contributions
ork+0.097
 attent+0.094
inn+0.092
 liquids+0.079
ula+0.079
Neg logit contributions
 glad-0.102
 eager-0.096
 floppy-0.095
 greedy-0.093
 strongest-0.092
Token got
Token ID1392
Feature activation+8.95e-01

Pos logit contributions
ork+0.042
inn+0.042
 attent+0.041
 liquids+0.038
oons+0.038
Neg logit contributions
 glad-0.022
 eager-0.020
 floppy-0.020
 merry-0.019
 no-0.019
Token to
Token ID284
Feature activation+1.50e+00

Pos logit contributions
ork+0.035
inn+0.034
 attent+0.033
or+0.032
oons+0.032
Neg logit contributions
 glad-0.014
 eager-0.013
 no-0.012
 greedy-0.012
 floppy-0.012
Token work
Token ID670
Feature activation+6.21e-01

Pos logit contributions
inn+0.045
ork+0.045
 attent+0.044
or+0.041
oons+0.041
Neg logit contributions
 glad-0.034
 floppy-0.032
 merry-0.031
 eager-0.030
 no-0.030
Token making
Token ID1642
Feature activation+7.66e-01

Pos logit contributions
ork+0.022
inn+0.022
 attent+0.022
or+0.020
oons+0.020
Neg logit contributions
 glad-0.011
 floppy-0.010
 merry-0.010
 eager-0.009
 rider-0.009
Token in
Token ID287
Feature activation+6.46e-01

Pos logit contributions
ork+0.026
 attent+0.025
inn+0.025
oons+0.024
or+0.024
Neg logit contributions
 glad-0.018
 floppy-0.017
 eager-0.016
 merry-0.016
 riding-0.016
Token and
Token ID290
Feature activation+7.33e-01

Pos logit contributions
ork+0.029
inn+0.029
oons+0.028
 attent+0.028
or+0.028
Neg logit contributions
 glad-0.006
 eager-0.005
 no-0.004
 floppy-0.004
 merry-0.004
Token said
Token ID531
Feature activation+1.14e+00

Pos logit contributions
ork+0.031
inn+0.030
 occasions+0.029
oons+0.029
 attent+0.029
Neg logit contributions
 glad-0.010
 waved-0.009
 eager-0.009
 brave-0.008
 happy-0.008
Token,
Token ID11
Feature activation+9.04e-01

Pos logit contributions
inn+0.054
 attent+0.052
ork+0.051
 liquids+0.049
oons+0.049
Neg logit contributions
 glad-0.009
 no-0.009
 greedy-0.007
 floppy-0.007
 eager-0.006
Token "
Token ID366
Feature activation+5.34e-01

Pos logit contributions
ork+0.033
inn+0.033
or+0.032
oons+0.031
 occasions+0.031
Neg logit contributions
 glad-0.017
 eager-0.015
 no-0.014
 floppy-0.013
 riding-0.013
TokenRos
Token ID35740
Feature activation+3.67e+00

Pos logit contributions
ork+0.014
oons+0.013
inn+0.013
 attent+0.013
engu+0.012
Neg logit contributions
 glad-0.014
Leave-0.013
 eager-0.013
 floppy-0.013
 rider-0.013
Tokenie
Token ID494
Feature activation+8.10e-01

Pos logit contributions
 attent+0.094
ork+0.091
inn+0.088
 liquids+0.083
or+0.077
Neg logit contributions
 glad-0.095
 floppy-0.092
 eager-0.088
 greedy-0.087
 no-0.087
Token,
Token ID11
Feature activation+8.44e-01

Pos logit contributions
ork+0.029
inn+0.028
 attent+0.028
or+0.027
oons+0.026
Neg logit contributions
 glad-0.013
 floppy-0.012
 merry-0.012
 rider-0.012
 eager-0.011
Token you
Token ID345
Feature activation+7.81e-01

Pos logit contributions
inn+0.031
 attent+0.031
ork+0.030
oons+0.029
or+0.028
Neg logit contributions
 glad-0.015
 no-0.014
 floppy-0.013
 merry-0.013
 eager-0.012
Token need
Token ID761
Feature activation+1.44e+00

Pos logit contributions
ork+0.023
inn+0.021
or+0.020
 attent+0.020
ub+0.020
Neg logit contributions
 merry-0.018
 floppy-0.017
 glad-0.017
 rider-0.016
 swinging-0.016
Token to
Token ID284
Feature activation+8.01e-01

Pos logit contributions
ork+0.036
 attent+0.035
inn+0.035
 liquids+0.032
or+0.032
Neg logit contributions
 glad-0.039
 floppy-0.038
 merry-0.037
 rider-0.037
 strongest-0.036
Token small
Token ID1402
Feature activation+9.75e-01

Pos logit contributions
 attent+0.030
 liquids+0.029
ork+0.028
oons+0.028
inn+0.028
Neg logit contributions
 glad-0.020
 rider-0.020
 floppy-0.018
 happiest-0.018
 greedy-0.018
Token n
Token ID299
Feature activation+2.65e-01

Pos logit contributions
 liquids+0.028
 attent+0.028
ork+0.028
inn+0.027
oons+0.027
Neg logit contributions
 glad-0.023
 rider-0.021
 happiest-0.020
 merry-0.020
 floppy-0.019
Tokenooks
Token ID31085
Feature activation+2.45e+00

Pos logit contributions
ork+0.005
 attent+0.005
inn+0.005
oons+0.004
 liquids+0.004
Neg logit contributions
 glad-0.009
 floppy-0.008
 merry-0.008
 eager-0.008
 rider-0.008
Token and
Token ID290
Feature activation+2.04e-01

Pos logit contributions
 attent+0.086
inn+0.086
ork+0.082
or+0.079
 liquids+0.079
Neg logit contributions
 rider-0.042
 merry-0.042
 glad-0.041
 strongest-0.038
 happiest-0.038
Token cr
Token ID1067
Feature activation+7.99e-01

Pos logit contributions
 attent+0.006
ork+0.006
inn+0.006
 liquids+0.006
oons+0.005
Neg logit contributions
 glad-0.005
 rider-0.004
 merry-0.004
 floppy-0.004
 happiest-0.004
Tokenann
Token ID1236
Feature activation+3.65e+00

Pos logit contributions
 attent+0.019
ork+0.019
inn+0.018
or+0.017
 liquids+0.016
Neg logit contributions
 floppy-0.022
 glad-0.022
 merry-0.020
 eager-0.020
 no-0.020
Tokenies
Token ID444
Feature activation+1.73e+00

Pos logit contributions
ork+0.078
inn+0.077
 attent+0.076
or+0.067
 liquids+0.066
Neg logit contributions
 glad-0.112
 floppy-0.109
 merry-0.105
 eager-0.105
 greedy-0.103
Token to
Token ID284
Feature activation+6.46e-01

Pos logit contributions
 attent+0.051
ork+0.050
inn+0.049
 liquids+0.047
or+0.047
Neg logit contributions
 glad-0.038
 merry-0.037
 floppy-0.036
 rider-0.036
 strongest-0.035
Token grab
Token ID5552
Feature activation-7.68e-01

Pos logit contributions
ork+0.018
 attent+0.018
inn+0.017
 liquids+0.016
oons+0.016
Neg logit contributions
 glad-0.016
 rider-0.015
 floppy-0.014
 greedy-0.014
 happiest-0.014
Token onto
Token ID4291
Feature activation+5.79e-01

Pos logit contributions
 floppy+0.012
 glad+0.012
 no+0.012
 eager+0.011
 merry+0.011
Neg logit contributions
ork-0.028
inn-0.027
 attent-0.027
oons-0.026
or-0.026
Token,
Token ID11
Feature activation+2.07e-01

Pos logit contributions
ork+0.022
inn+0.022
 attent+0.021
oons+0.020
or+0.020
Neg logit contributions
 glad-0.008
 floppy-0.008
 no-0.008
 merry-0.007
 eager-0.007
Token very
Token ID845
Feature activation-7.01e-01

Pos logit contributions
 eagle+0.014
 rider+0.013
 floppy+0.013
 trusts+0.012
 reckless+0.012
Neg logit contributions
inn-0.026
 attent-0.024
 Grac-0.024
ula-0.023
ount-0.021
Token happy
Token ID3772
Feature activation-3.94e-01

Pos logit contributions
 floppy+0.008
 glad+0.008
 eagle+0.008
 happiest+0.007
 rider+0.007
Neg logit contributions
inn-0.027
 attent-0.027
ork-0.026
or-0.025
oons-0.025
Token.
Token ID13
Feature activation-6.35e-01

Pos logit contributions
 glad+0.008
 floppy+0.007
 merry+0.007
 rider+0.007
 eager+0.007
Neg logit contributions
ork-0.013
 attent-0.013
inn-0.013
or-0.012
oons-0.012
Token They
Token ID1119
Feature activation-1.08e+00

Pos logit contributions
 glad+0.015
 floppy+0.014
 eager+0.013
 merry+0.013
 no+0.013
Neg logit contributions
ork-0.019
 attent-0.019
inn-0.018
or-0.018
oons-0.018
Token love
Token ID1842
Feature activation-6.78e-01

Pos logit contributions
 glad+0.021
 floppy+0.019
 merry+0.019
 greedy+0.019
 strongest+0.019
Neg logit contributions
 attent-0.033
inn-0.032
ork-0.031
or-0.030
 Grac-0.030
Token bikes
Token ID16715
Feature activation-4.03e+00

Pos logit contributions
 glad+0.014
 floppy+0.013
 happiest+0.013
 strongest+0.013
 greedy+0.012
Neg logit contributions
 attent-0.021
inn-0.021
ork-0.020
 liquids-0.019
oons-0.019
Token.
Token ID13
Feature activation-4.19e-01

Pos logit contributions
 glad+0.117
Leave+0.115
 eagle+0.114
 floppy+0.113
 happiest+0.111
Neg logit contributions
 attent-0.082
ork-0.080
inn-0.079
or-0.074
oons-0.073
Token They
Token ID1119
Feature activation-1.10e+00

Pos logit contributions
 glad+0.010
 floppy+0.009
 merry+0.009
 eager+0.009
 no+0.008
Neg logit contributions
ork-0.013
 attent-0.012
inn-0.012
or-0.012
oons-0.012
Token run
Token ID1057
Feature activation-1.80e-01

Pos logit contributions
 glad+0.023
 floppy+0.022
 eagle+0.021
 greedy+0.021
 merry+0.020
Neg logit contributions
 attent-0.031
inn-0.030
or-0.029
ork-0.028
 Grac-0.027
Token to
Token ID284
Feature activation+2.53e-01

Pos logit contributions
 glad+0.004
 floppy+0.004
 merry+0.004
 rider+0.004
 eager+0.004
Neg logit contributions
ork-0.005
inn-0.005
 attent-0.005
oons-0.005
 liquids-0.005
Token the
Token ID262
Feature activation-1.14e+00

Pos logit contributions
 attent+0.007
inn+0.006
ork+0.006
or+0.006
ula+0.006
Neg logit contributions
 floppy-0.006
 strongest-0.006
 glad-0.006
 eagle-0.006
 smartest-0.006
Token wrist
Token ID15980
Feature activation-2.21e+00

Pos logit contributions
 glad+0.003
 floppy+0.002
 merry+0.002
 eager+0.002
 greedy+0.002
Neg logit contributions
inn-0.004
ork-0.004
 attent-0.004
or-0.004
oons-0.004
Token.
Token ID13
Feature activation+8.68e-02

Pos logit contributions
 glad+0.056
 merry+0.053
 floppy+0.051
 strongest+0.049
 rider+0.049
Neg logit contributions
ork-0.064
 attent-0.061
inn-0.061
or-0.058
oons-0.056
Token Lily
Token ID20037
Feature activation-9.16e-02

Pos logit contributions
inn+0.003
ork+0.003
or+0.003
 attent+0.003
oons+0.003
Neg logit contributions
 glad-0.002
 eager-0.002
 floppy-0.002
 no-0.001
 riding-0.001
Token smelled
Token ID37609
Feature activation-3.50e-01

Pos logit contributions
 glad+0.002
 floppy+0.002
 merry+0.002
 rider+0.002
 strongest+0.002
Neg logit contributions
 attent-0.003
ork-0.003
inn-0.003
oons-0.003
or-0.003
Token her
Token ID607
Feature activation-1.66e+00

Pos logit contributions
 rider+0.007
 glad+0.007
 floppy+0.006
 merry+0.006
 strongest+0.006
Neg logit contributions
 attent-0.011
ork-0.011
inn-0.011
 liquids-0.010
oons-0.010
Token wrist
Token ID15980
Feature activation-3.87e+00

Pos logit contributions
 glad+0.047
 floppy+0.045
 merry+0.044
 rider+0.043
 eager+0.043
Neg logit contributions
ork-0.040
 attent-0.040
inn-0.039
or-0.036
oons-0.035
Token and
Token ID290
Feature activation-1.19e-01

Pos logit contributions
 glad+0.105
 pilots+0.101
 eagle+0.096
 merry+0.096
 rider+0.095
Neg logit contributions
ork-0.108
 attent-0.104
inn-0.086
ula-0.085
or-0.083
Token said
Token ID531
Feature activation-3.50e-01

Pos logit contributions
 glad+0.003
 rider+0.002
 merry+0.002
 floppy+0.002
 strongest+0.002
Neg logit contributions
 attent-0.004
ork-0.003
 liquids-0.003
inn-0.003
 Grac-0.003
Token,
Token ID11
Feature activation-6.81e-03

Pos logit contributions
 glad+0.005
 floppy+0.005
 merry+0.005
 eager+0.004
 rider+0.004
Neg logit contributions
ork-0.013
inn-0.013
 attent-0.013
or-0.012
oons-0.012
Token "
Token ID366
Feature activation-2.89e-01

Pos logit contributions
 glad+0.000
 floppy+0.000
 eager+0.000
 merry+0.000
 greedy+0.000
Neg logit contributions
ork-0.000
inn-0.000
 attent-0.000
or-0.000
oons-0.000
TokenIt
Token ID1026
Feature activation+7.34e-01

Pos logit contributions
 glad+0.006
 floppy+0.006
 merry+0.006
 rider+0.006
 eager+0.005
Neg logit contributions
ork-0.009
inn-0.009
 attent-0.009
or-0.008
oons-0.008
Token She
Token ID1375
Feature activation+5.15e-01

Pos logit contributions
inn+0.015
ork+0.015
 attent+0.015
oons+0.014
or+0.014
Neg logit contributions
 glad-0.006
 floppy-0.006
 eager-0.006
 swinging-0.005
 riding-0.005
Token grabbed
Token ID13176
Feature activation-1.08e+00

Pos logit contributions
ork+0.015
inn+0.015
 attent+0.015
or+0.014
oons+0.014
Neg logit contributions
 glad-0.012
 floppy-0.011
 greedy-0.011
 happiest-0.011
 strongest-0.010
Token onto
Token ID4291
Feature activation-9.33e-02

Pos logit contributions
 glad+0.022
 floppy+0.020
 merry+0.020
 rider+0.020
 strongest+0.020
Neg logit contributions
inn-0.034
ork-0.034
 attent-0.034
or-0.031
 liquids-0.031
Token its
Token ID663
Feature activation-6.60e-01

Pos logit contributions
 glad+0.002
 floppy+0.002
 greedy+0.002
 rider+0.002
 strongest+0.002
Neg logit contributions
ork-0.003
 attent-0.003
inn-0.003
or-0.003
 liquids-0.003
Token handle
Token ID5412
Feature activation-1.46e+00

Pos logit contributions
 glad+0.014
 greedy+0.014
 reckless+0.012
 strongest+0.012
 smartest+0.012
Neg logit contributions
 attent-0.020
ork-0.019
inn-0.019
eth-0.018
ula-0.017
Tokenbars
Token ID34046
Feature activation-3.84e+00

Pos logit contributions
 glad+0.032
 greedy+0.030
 happiest+0.028
 eagle+0.028
 merry+0.028
Neg logit contributions
inn-0.046
 attent-0.046
ork-0.046
or-0.045
eth-0.044
Token and
Token ID290
Feature activation+5.81e-01

Pos logit contributions
 glad+0.096
 greedy+0.092
 happiest+0.092
 smartest+0.091
 strongest+0.091
Neg logit contributions
inn-0.106
or-0.105
ork-0.103
eth-0.102
 attent-0.093
Token put
Token ID1234
Feature activation-6.65e-01

Pos logit contributions
 attent+0.017
ork+0.016
inn+0.016
or+0.015
 liquids+0.015
Neg logit contributions
 glad-0.014
 greedy-0.013
 floppy-0.013
 merry-0.012
 happiest-0.012
Token her
Token ID607
Feature activation-1.25e+00

Pos logit contributions
 glad+0.014
 floppy+0.013
 merry+0.012
 greedy+0.012
 rider+0.012
Neg logit contributions
inn-0.021
ork-0.021
 attent-0.021
or-0.020
oons-0.019
Token feet
Token ID3625
Feature activation-2.05e-01

Pos logit contributions
 glad+0.028
 greedy+0.027
 eagle+0.025
 grateful+0.024
 reckless+0.024
Neg logit contributions
 attent-0.035
inn-0.034
ork-0.034
or-0.032
eth-0.032
Token onto
Token ID4291
Feature activation+4.96e-01

Pos logit contributions
 glad+0.005
 floppy+0.005
 greedy+0.005
 eagle+0.005
 merry+0.005
Neg logit contributions
inn-0.006
 attent-0.006
ork-0.006
or-0.005
eth-0.005
Token\n
Token ID198
Feature activation+4.08e-01

Pos logit contributions
 attent+0.006
inn+0.006
ork+0.006
oons+0.006
 liquids+0.006
Neg logit contributions
 glad-0.004
 floppy-0.004
 merry-0.004
 eager-0.004
 rider-0.004
Token"
Token ID1
Feature activation-4.05e-01

Pos logit contributions
ork+0.010
inn+0.010
 attent+0.010
or+0.009
oons+0.009
Neg logit contributions
 glad-0.011
 floppy-0.011
 merry-0.011
 eager-0.010
 rider-0.010
TokenI
Token ID40
Feature activation+2.37e-02

Pos logit contributions
 glad+0.008
 floppy+0.008
 merry+0.007
 eager+0.007
 rider+0.007
Neg logit contributions
ork-0.013
inn-0.013
 attent-0.013
 liquids-0.012
or-0.012
Token like
Token ID588
Feature activation-4.46e-01

Pos logit contributions
ork+0.001
inn+0.001
 attent+0.001
or+0.001
oons+0.001
Neg logit contributions
 glad-0.001
 floppy-0.001
 merry-0.001
 rider-0.000
 eager-0.000
Token your
Token ID534
Feature activation-6.38e-01

Pos logit contributions
 glad+0.010
 floppy+0.009
 merry+0.009
 eager+0.009
 rider+0.009
Neg logit contributions
ork-0.014
inn-0.014
 attent-0.014
or-0.013
oons-0.013
Token name
Token ID1438
Feature activation-3.78e+00

Pos logit contributions
 glad+0.016
 floppy+0.015
 rider+0.015
 merry+0.015
 greedy+0.015
Neg logit contributions
 attent-0.016
ork-0.016
inn-0.015
 liquids-0.014
 Grac-0.014
Token,
Token ID11
Feature activation+1.10e+00

Pos logit contributions
 glad+0.097
 happiest+0.090
 strongest+0.085
 merry+0.084
 smartest+0.083
Neg logit contributions
ork-0.095
oons-0.092
 attent-0.091
inn-0.088
sc-0.087
Token Lily
Token ID20037
Feature activation-9.89e-01

Pos logit contributions
 attent+0.025
ork+0.025
inn+0.024
 Grac+0.022
 liquids+0.022
Neg logit contributions
 glad-0.032
 floppy-0.030
 merry-0.030
 rider-0.029
 eager-0.029
Token,"
Token ID553
Feature activation+9.82e-01

Pos logit contributions
 glad+0.025
 merry+0.024
 happiest+0.024
 floppy+0.023
 strongest+0.023
Neg logit contributions
ork-0.025
inn-0.025
 attent-0.024
or-0.023
oons-0.023
Token said
Token ID531
Feature activation-3.22e-01

Pos logit contributions
ork+0.028
 attent+0.028
inn+0.027
or+0.025
oons+0.025
Neg logit contributions
 glad-0.023
 floppy-0.022
 merry-0.022
 rider-0.021
 eager-0.021
Token Ellie
Token ID44801
Feature activation+9.40e-01

Pos logit contributions
 glad+0.008
 floppy+0.008
 eager+0.007
 merry+0.007
 no+0.007
Neg logit contributions
ork-0.009
inn-0.009
 attent-0.009
oons-0.009
 liquids-0.008
Token.
Token ID13
Feature activation-2.05e-01

Pos logit contributions
 happiest+0.036
 glad+0.036
 merry+0.036
 floppy+0.035
 strongest+0.035
Neg logit contributions
ork-0.030
inn-0.029
 attent-0.029
oons-0.029
eth-0.027
Token You
Token ID921
Feature activation-6.67e-01

Pos logit contributions
 glad+0.005
 floppy+0.005
 merry+0.005
 eager+0.005
 greedy+0.005
Neg logit contributions
inn-0.005
ork-0.005
 attent-0.005
 liquids-0.005
or-0.005
Token are
Token ID389
Feature activation-1.71e+00

Pos logit contributions
 glad+0.013
 merry+0.013
 floppy+0.013
 swinging+0.013
 leaning+0.012
Neg logit contributions
inn-0.019
 attent-0.019
 Grac-0.019
 liquids-0.019
ork-0.019
Token my
Token ID616
Feature activation-2.05e+00

Pos logit contributions
 eagle+0.029
 floppy+0.029
 rider+0.029
 merry+0.028
 glad+0.028
Neg logit contributions
inn-0.052
 attent-0.051
 Grac-0.051
 las-0.049
ork-0.049
Token favorite
Token ID4004
Feature activation-6.23e-01

Pos logit contributions
 glad+0.036
 floppy+0.035
 riding+0.033
 eagle+0.032
 swinging+0.032
Neg logit contributions
 attent-0.067
ork-0.063
inn-0.062
 liquids-0.060
 Grac-0.060
Token niece
Token ID41803
Feature activation-3.76e+00

Pos logit contributions
 glad+0.013
 floppy+0.012
 merry+0.012
 eager+0.012
 rider+0.012
Neg logit contributions
ork-0.020
 attent-0.019
inn-0.019
or-0.018
oons-0.018
Token,"
Token ID553
Feature activation-1.80e-01

Pos logit contributions
 glad+0.104
 eagle+0.104
 happiest+0.104
Leave+0.103
 strongest+0.101
Neg logit contributions
ork-0.067
 Grac-0.066
inn-0.064
 occasions-0.064
 liquids-0.063
Token uncle
Token ID7711
Feature activation-1.01e+00

Pos logit contributions
 glad+0.004
 strongest+0.004
 floppy+0.004
 happiest+0.004
 smartest+0.004
Neg logit contributions
 attent-0.005
ork-0.005
inn-0.005
 Grac-0.004
oons-0.004
Token Tom
Token ID4186
Feature activation-2.25e-01

Pos logit contributions
 glad+0.018
 floppy+0.017
 merry+0.016
 rider+0.016
 eager+0.016
Neg logit contributions
ork-0.035
 attent-0.035
inn-0.034
or-0.032
oons-0.032
Token said
Token ID531
Feature activation-2.97e-01

Pos logit contributions
 glad+0.004
 floppy+0.004
 merry+0.004
 strongest+0.004
 rider+0.004
Neg logit contributions
 attent-0.007
ork-0.007
inn-0.007
or-0.007
oons-0.007
Token.
Token ID13
Feature activation+4.48e-02

Pos logit contributions
 merry+0.008
 strongest+0.008
 happiest+0.008
 floppy+0.007
 eagle+0.007
Neg logit contributions
 attent-0.008
inn-0.008
ork-0.007
or-0.007
oons-0.007
Token bird
Token ID6512
Feature activation+9.59e-03

Pos logit contributions
 glad+0.028
opl+0.023
 happiest+0.022
 relieved+0.021
 grateful+0.021
Neg logit contributions
 Grac-0.020
ork-0.019
ene-0.018
eth-0.018
 powd-0.017
Token sat
Token ID3332
Feature activation-1.02e+00

Pos logit contributions
 attent+0.000
inn+0.000
ork+0.000
or+0.000
sc+0.000
Neg logit contributions
 glad-0.000
 greedy-0.000
 floppy-0.000
 merry-0.000
 happiest-0.000
Token on
Token ID319
Feature activation-8.54e-01

Pos logit contributions
 floppy+0.029
hy+0.029
 happiest+0.028
 strongest+0.027
 Leave+0.027
Neg logit contributions
inn-0.023
 attent-0.021
or-0.019
ork-0.018
angers-0.017
Token the
Token ID262
Feature activation-9.05e-01

Pos logit contributions
 glad+0.015
 floppy+0.013
 happiest+0.013
 strongest+0.013
 merry+0.013
Neg logit contributions
ork-0.030
 attent-0.030
inn-0.030
 liquids-0.027
or-0.027
Token handle
Token ID5412
Feature activation-1.42e+00

Pos logit contributions
 glad+0.023
 greedy+0.022
 willing+0.019
 smartest+0.019
 grateful+0.019
Neg logit contributions
 attent-0.024
ork-0.024
ula-0.023
inn-0.022
or-0.021
Tokenbars
Token ID34046
Feature activation-3.64e+00

Pos logit contributions
 glad+0.035
 greedy+0.031
 floppy+0.030
 merry+0.028
 happiest+0.028
Neg logit contributions
inn-0.044
or-0.043
 attent-0.043
ork-0.043
oons-0.039
Token and
Token ID290
Feature activation-6.13e-02

Pos logit contributions
 glad+0.095
 happiest+0.084
 floppy+0.084
 greedy+0.083
 merry+0.082
Neg logit contributions
inn-0.103
or-0.098
ork-0.094
 attent-0.089
eth-0.087
Token held
Token ID2714
Feature activation-1.33e+00

Pos logit contributions
 glad+0.001
 floppy+0.001
 merry+0.001
 greedy+0.001
 strongest+0.001
Neg logit contributions
 attent-0.002
ork-0.002
inn-0.002
or-0.002
 liquids-0.002
Token on
Token ID319
Feature activation+3.41e-01

Pos logit contributions
 glad+0.028
 floppy+0.026
 merry+0.025
 eager+0.025
 rider+0.025
Neg logit contributions
ork-0.042
 attent-0.042
inn-0.042
or-0.039
oons-0.039
Token tight
Token ID5381
Feature activation-6.01e-01

Pos logit contributions
ork+0.013
inn+0.013
oons+0.012
 liquids+0.012
 attent+0.012
Neg logit contributions
 glad-0.005
 floppy-0.005
 eager-0.005
 swinging-0.004
 merry-0.004
Token.
Token ID13
Feature activation-3.30e-01

Pos logit contributions
 glad+0.014
 happiest+0.012
 merry+0.012
 floppy+0.012
 strongest+0.012
Neg logit contributions
ork-0.018
 attent-0.018
inn-0.018
or-0.017
 liquids-0.017
Token shiny
Token ID22441
Feature activation-1.27e+00

Pos logit contributions
 eagle+0.037
 rider+0.032
ese+0.031
 glad+0.031
vious+0.031
Neg logit contributions
 Grac-0.044
 attent-0.040
ork-0.040
inn-0.039
 Sp-0.038
Token.
Token ID13
Feature activation-3.30e-01

Pos logit contributions
 glad+0.048
 merry+0.041
 happiest+0.041
 broad+0.040
 eager+0.039
Neg logit contributions
ork-0.020
 attent-0.019
ware-0.018
 liquids-0.018
inn-0.017
Token She
Token ID1375
Feature activation-1.33e+00

Pos logit contributions
 glad+0.006
 floppy+0.006
 merry+0.006
 rider+0.006
 eager+0.006
Neg logit contributions
ork-0.011
 attent-0.011
inn-0.011
or-0.010
oons-0.010
Token loved
Token ID6151
Feature activation-4.36e-01

Pos logit contributions
 glad+0.035
 eagle+0.034
 hesitation+0.033
 floppy+0.032
 strongest+0.031
Neg logit contributions
ork-0.030
 attent-0.028
inn-0.026
or-0.025
sc-0.025
Token her
Token ID607
Feature activation-1.98e+00

Pos logit contributions
 glad+0.010
 floppy+0.010
 strongest+0.009
 happiest+0.009
 rider+0.009
Neg logit contributions
 attent-0.012
ork-0.012
inn-0.012
 liquids-0.011
oons-0.011
Token bike
Token ID7161
Feature activation-3.64e+00

Pos logit contributions
 Leave+0.046
 floppy+0.044
 glad+0.044
 greedy+0.043
Leave+0.043
Neg logit contributions
ork-0.047
 attent-0.044
 project-0.040
ware-0.036
 Wed-0.035
Token and
Token ID290
Feature activation-8.34e-02

Pos logit contributions
Leave+0.119
 glad+0.119
 eagle+0.117
 hesitation+0.112
 hawk+0.108
Neg logit contributions
ork-0.072
eth-0.054
 attent-0.054
sc-0.051
or-0.049
Token her
Token ID607
Feature activation-2.11e+00

Pos logit contributions
 eagle+0.002
 glad+0.002
 strongest+0.002
 greedy+0.002
 rider+0.002
Neg logit contributions
 attent-0.002
 Wed-0.002
 Grac-0.002
ork-0.002
 liquids-0.002
Token friend
Token ID1545
Feature activation-1.26e+00

Pos logit contributions
 glad+0.041
 floppy+0.040
 Leave+0.039
 greedy+0.039
vious+0.039
Neg logit contributions
ork-0.054
 attent-0.053
ware-0.045
 Wed-0.045
 project-0.044
Token.
Token ID13
Feature activation-2.67e-01

Pos logit contributions
 glad+0.032
 eagle+0.031
 merry+0.030
 floppy+0.030
 strongest+0.029
Neg logit contributions
 attent-0.030
ork-0.030
inn-0.029
or-0.027
 Grac-0.026
Token<|endoftext|>
Token ID50256
Feature activation-2.87e-01

Pos logit contributions
 glad+0.006
 floppy+0.006
 merry+0.005
 rider+0.005
 eager+0.005
Neg logit contributions
ork-0.008
inn-0.008
 attent-0.008
oons-0.007
or-0.007
Token.
Token ID13
Feature activation+1.40e-01

Pos logit contributions
 glad+0.019
 floppy+0.018
 merry+0.017
 greedy+0.017
 happiest+0.017
Neg logit contributions
ork-0.027
 attent-0.027
inn-0.027
or-0.026
oons-0.025
Token He
Token ID679
Feature activation+8.83e-02

Pos logit contributions
ork+0.005
inn+0.005
 attent+0.005
or+0.004
oons+0.004
Neg logit contributions
 glad-0.003
 floppy-0.003
 eager-0.002
 rider-0.002
 riding-0.002
Token grabbed
Token ID13176
Feature activation-9.23e-01

Pos logit contributions
ork+0.003
 attent+0.003
inn+0.002
or+0.002
oons+0.002
Neg logit contributions
 glad-0.002
 happiest-0.002
 floppy-0.002
 greedy-0.002
 smartest-0.002
Token the
Token ID262
Feature activation-6.46e-01

Pos logit contributions
 glad+0.020
 merry+0.018
 happiest+0.018
 floppy+0.018
 greedy+0.018
Neg logit contributions
inn-0.028
 attent-0.028
ork-0.028
or-0.026
oons-0.025
Token handle
Token ID5412
Feature activation-1.58e+00

Pos logit contributions
 glad+0.014
 greedy+0.013
 grateful+0.012
 floppy+0.012
pled+0.012
Neg logit contributions
ork-0.017
 attent-0.016
eth-0.015
inn-0.015
or-0.015
Tokenbars
Token ID34046
Feature activation-3.64e+00

Pos logit contributions
 glad+0.034
 greedy+0.031
 floppy+0.031
 happiest+0.030
 eagle+0.030
Neg logit contributions
inn-0.051
 attent-0.049
or-0.049
ork-0.048
eth-0.045
Token with
Token ID351
Feature activation+1.25e-02

Pos logit contributions
 glad+0.085
 happiest+0.077
 greedy+0.076
 floppy+0.076
 eagle+0.075
Neg logit contributions
inn-0.107
or-0.104
ork-0.100
eth-0.096
 attent-0.096
Token his
Token ID465
Feature activation-9.28e-01

Pos logit contributions
inn+0.000
 attent+0.000
ork+0.000
or+0.000
 Grac+0.000
Neg logit contributions
 glad-0.000
 greedy-0.000
 happiest-0.000
 eagle-0.000
 merry-0.000
Token little
Token ID1310
Feature activation+4.95e-01

Pos logit contributions
 glad+0.019
 greedy+0.017
 eagle+0.017
 happiest+0.016
 merry+0.016
Neg logit contributions
ork-0.028
 attent-0.027
eth-0.026
inn-0.025
or-0.024
Token hands
Token ID2832
Feature activation-7.76e-02

Pos logit contributions
ork+0.015
 attent+0.015
inn+0.014
eth+0.014
or+0.013
Neg logit contributions
 glad-0.011
 greedy-0.009
 happiest-0.009
 merry-0.009
 floppy-0.009
Token.
Token ID13
Feature activation+2.54e-02

Pos logit contributions
 glad+0.002
 merry+0.002
 floppy+0.002
 strongest+0.002
 eagle+0.002
Neg logit contributions
 attent-0.002
inn-0.002
ork-0.002
or-0.002
ula-0.002
Token,
Token ID11
Feature activation+5.19e-02

Pos logit contributions
 glad+0.009
 floppy+0.009
 rider+0.008
 strongest+0.008
 merry+0.008
Neg logit contributions
ork-0.016
 attent-0.016
inn-0.015
or-0.015
 Grac-0.015
Token "
Token ID366
Feature activation-6.73e-01

Pos logit contributions
ork+0.001
inn+0.001
 attent+0.001
or+0.001
oons+0.001
Neg logit contributions
 glad-0.001
 floppy-0.001
 merry-0.001
 eager-0.001
 rider-0.001
TokenHi
Token ID17250
Feature activation-2.96e-01

Pos logit contributions
 glad+0.014
 floppy+0.013
 merry+0.013
 eager+0.012
 rider+0.012
Neg logit contributions
ork-0.022
 attent-0.021
inn-0.021
or-0.020
oons-0.020
Token!
Token ID0
Feature activation-4.71e-01

Pos logit contributions
 glad+0.005
 floppy+0.005
 merry+0.005
 eager+0.005
 rider+0.005
Neg logit contributions
ork-0.010
inn-0.010
 attent-0.010
or-0.009
 liquids-0.009
Token My
Token ID2011
Feature activation-9.33e-02

Pos logit contributions
 glad+0.010
 floppy+0.010
 merry+0.009
 rider+0.009
 eager+0.009
Neg logit contributions
ork-0.015
 attent-0.015
inn-0.015
oons-0.014
or-0.014
Token name
Token ID1438
Feature activation-3.60e+00

Pos logit contributions
 glad+0.002
 floppy+0.002
 merry+0.002
 rider+0.002
 eager+0.002
Neg logit contributions
ork-0.002
 attent-0.002
inn-0.002
or-0.002
 liquids-0.002
Token is
Token ID318
Feature activation-3.01e-01

Pos logit contributions
 glad+0.104
 floppy+0.096
 eager+0.093
 happiest+0.092
 strongest+0.089
Neg logit contributions
ork-0.085
oons-0.076
 attent-0.074
inn-0.070
or-0.070
Token Sue
Token ID26089
Feature activation-1.98e+00

Pos logit contributions
 glad+0.007
 floppy+0.007
 merry+0.007
 eager+0.007
 greedy+0.007
Neg logit contributions
inn-0.009
 attent-0.009
ork-0.009
oons-0.008
or-0.008
Token.
Token ID13
Feature activation-1.46e+00

Pos logit contributions
 glad+0.052
 merry+0.051
 floppy+0.049
 eagle+0.048
Leave+0.048
Neg logit contributions
ork-0.045
inn-0.043
 attent-0.043
oons-0.041
or-0.040
Token This
Token ID770
Feature activation-6.03e-01

Pos logit contributions
 glad+0.029
 floppy+0.028
 merry+0.026
 rider+0.026
 eager+0.025
Neg logit contributions
ork-0.049
 attent-0.048
inn-0.048
oons-0.046
or-0.045
Token is
Token ID318
Feature activation-1.03e-01

Pos logit contributions
 glad+0.012
 floppy+0.011
 merry+0.011
 eager+0.011
 rider+0.010
Neg logit contributions
ork-0.020
 attent-0.019
inn-0.019
or-0.018
oons-0.018
Token me
Token ID502
Feature activation-2.03e+00

Pos logit contributions
 glad+0.002
 floppy+0.002
 merry+0.001
 rider+0.001
 eager+0.001
Neg logit contributions
ork-0.003
 attent-0.003
inn-0.003
 liquids-0.003
oons-0.003
Token.
Token ID13
Feature activation-1.19e-01

Pos logit contributions
 floppy+0.048
 eagle+0.046
 merry+0.046
 glad+0.046
Leave+0.046
Neg logit contributions
ork-0.053
inn-0.051
 attent-0.051
or-0.049
oons-0.049
Token You
Token ID921
Feature activation-2.56e-01

Pos logit contributions
 glad+0.003
 floppy+0.003
 rider+0.002
 merry+0.002
 eager+0.002
Neg logit contributions
ork-0.004
 attent-0.004
inn-0.004
oons-0.003
or-0.003
Token are
Token ID389
Feature activation-1.31e+00

Pos logit contributions
 glad+0.005
 floppy+0.004
 merry+0.004
 rider+0.004
 eager+0.004
Neg logit contributions
ork-0.008
 attent-0.008
inn-0.008
 liquids-0.008
 Grac-0.008
Token my
Token ID616
Feature activation-1.51e+00

Pos logit contributions
 glad+0.017
 floppy+0.017
 merry+0.016
 rider+0.016
 eagle+0.015
Neg logit contributions
inn-0.048
 attent-0.048
ork-0.047
 liquids-0.045
 Grac-0.045
Token son
Token ID3367
Feature activation-3.60e+00

Pos logit contributions
 floppy+0.026
 glad+0.026
 rider+0.024
 merry+0.024
 eagle+0.024
Neg logit contributions
 attent-0.049
ork-0.048
inn-0.046
 Grac-0.045
 liquids-0.045
Token.
Token ID13
Feature activation-3.48e-01

Pos logit contributions
Leave+0.099
 glad+0.095
 floppy+0.094
 happiest+0.094
 eagle+0.093
Neg logit contributions
ork-0.068
oons-0.067
inn-0.063
 attent-0.061
 Grac-0.061
Token And
Token ID843
Feature activation-4.30e-01

Pos logit contributions
 glad+0.007
 floppy+0.007
 rider+0.007
 eager+0.006
 strongest+0.006
Neg logit contributions
ork-0.011
 attent-0.011
inn-0.011
oons-0.010
or-0.010
Token you
Token ID345
Feature activation+6.43e-01

Pos logit contributions
 glad+0.009
 floppy+0.009
 merry+0.009
 rider+0.009
 eagle+0.009
Neg logit contributions
 attent-0.012
inn-0.012
ork-0.012
or-0.011
 liquids-0.011
Token are
Token ID389
Feature activation-7.67e-01

Pos logit contributions
ork+0.022
inn+0.021
 attent+0.021
 liquids+0.021
oons+0.020
Neg logit contributions
 glad-0.011
 floppy-0.011
 merry-0.010
 rider-0.010
 eager-0.010
Token a
Token ID257
Feature activation-4.24e-01

Pos logit contributions
 floppy+0.011
 glad+0.010
 merry+0.010
 rider+0.010
 eagle+0.010
Neg logit contributions
inn-0.027
 attent-0.027
ork-0.027
 Grac-0.026
 liquids-0.026
Token eyes
Token ID2951
Feature activation-4.99e-02

Pos logit contributions
 glad+0.024
 floppy+0.024
 strongest+0.024
 eagle+0.023
 rider+0.023
Neg logit contributions
ork-0.027
 attent-0.027
inn-0.025
eth-0.023
oons-0.022
Token widen
Token ID45676
Feature activation+2.10e-01

Pos logit contributions
 glad+0.001
 eagle+0.001
 rider+0.001
 greedy+0.001
 strongest+0.001
Neg logit contributions
 attent-0.001
ork-0.001
 liquids-0.001
inn-0.001
oons-0.001
Token.
Token ID13
Feature activation-4.01e-01

Pos logit contributions
ork+0.007
 attent+0.007
inn+0.007
or+0.007
oons+0.006
Neg logit contributions
 glad-0.004
 floppy-0.004
 merry-0.004
 strongest-0.003
 happiest-0.003
Token He
Token ID679
Feature activation-1.48e+00

Pos logit contributions
 glad+0.007
 floppy+0.007
 eager+0.006
 greedy+0.006
 rider+0.006
Neg logit contributions
ork-0.014
 attent-0.014
inn-0.014
or-0.014
oons-0.013
Token loves
Token ID10408
Feature activation+1.63e-01

Pos logit contributions
 glad+0.028
 floppy+0.025
 merry+0.025
 strongest+0.025
 rider+0.024
Neg logit contributions
 attent-0.050
ork-0.049
inn-0.048
oons-0.046
 liquids-0.045
Token bikes
Token ID16715
Feature activation-3.60e+00

Pos logit contributions
ork+0.005
inn+0.005
 attent+0.005
or+0.005
oons+0.005
Neg logit contributions
 glad-0.004
 floppy-0.003
 merry-0.003
 eager-0.003
 rider-0.003
Token.
Token ID13
Feature activation-1.31e-01

Pos logit contributions
 glad+0.107
Leave+0.102
 eagle+0.102
 floppy+0.101
 merry+0.100
Neg logit contributions
inn-0.071
ork-0.069
 attent-0.069
or-0.066
oons-0.059
Token He
Token ID679
Feature activation-1.15e+00

Pos logit contributions
 glad+0.002
 floppy+0.002
 eager+0.002
 greedy+0.002
 rider+0.002
Neg logit contributions
ork-0.005
 attent-0.005
inn-0.005
or-0.005
oons-0.004
Token looks
Token ID3073
Feature activation-2.85e-01

Pos logit contributions
 glad+0.022
 floppy+0.020
 merry+0.020
 strongest+0.019
 happiest+0.019
Neg logit contributions
 attent-0.038
ork-0.037
inn-0.036
oons-0.035
 liquids-0.034
Token at
Token ID379
Feature activation-4.11e-01

Pos logit contributions
 glad+0.007
 merry+0.006
 floppy+0.006
 rider+0.006
 strongest+0.006
Neg logit contributions
 attent-0.008
inn-0.008
ork-0.008
or-0.007
 liquids-0.007
Token the
Token ID262
Feature activation-1.36e+00

Pos logit contributions
 glad+0.011
 floppy+0.010
 merry+0.010
 rider+0.010
 strongest+0.010
Neg logit contributions
inn-0.011
ork-0.010
 attent-0.010
oons-0.010
or-0.010
Token\n
Token ID198
Feature activation+7.17e-01

Pos logit contributions
 attent+0.021
ork+0.020
inn+0.020
 liquids+0.020
oons+0.019
Neg logit contributions
 glad-0.012
 floppy-0.011
 merry-0.011
 eager-0.011
 riding-0.010
Token"
Token ID1
Feature activation-2.94e-01

Pos logit contributions
 attent+0.019
ork+0.018
inn+0.018
 liquids+0.017
oons+0.017
Neg logit contributions
 glad-0.018
 floppy-0.018
 eager-0.017
 merry-0.017
 rider-0.017
TokenHi
Token ID17250
Feature activation+1.05e-01

Pos logit contributions
 glad+0.007
 floppy+0.006
 merry+0.006
 rider+0.006
 eager+0.006
Neg logit contributions
ork-0.009
 attent-0.009
inn-0.009
or-0.008
oons-0.008
Token,
Token ID11
Feature activation+8.12e-01

Pos logit contributions
ork+0.003
inn+0.003
 attent+0.003
or+0.003
 Grac+0.003
Neg logit contributions
 glad-0.002
 floppy-0.002
 rider-0.002
 merry-0.002
 strongest-0.002
Token my
Token ID616
Feature activation+1.04e-01

Pos logit contributions
ork+0.022
inn+0.022
 attent+0.021
 Grac+0.020
or+0.020
Neg logit contributions
 glad-0.019
 floppy-0.019
 rider-0.018
 merry-0.018
 eager-0.018
Token name
Token ID1438
Feature activation-3.58e+00

Pos logit contributions
ork+0.003
 attent+0.003
inn+0.003
or+0.003
 liquids+0.003
Neg logit contributions
 glad-0.002
 floppy-0.002
 merry-0.002
 rider-0.002
 eager-0.002
Token is
Token ID318
Feature activation-4.63e-01

Pos logit contributions
 glad+0.101
 happiest+0.091
 floppy+0.091
 eager+0.091
 strongest+0.088
Neg logit contributions
ork-0.092
oons-0.079
 attent-0.076
inn-0.076
sc-0.075
Token Lily
Token ID20037
Feature activation-1.91e+00

Pos logit contributions
 glad+0.011
 floppy+0.011
 merry+0.011
 eager+0.010
 no+0.010
Neg logit contributions
 attent-0.013
ork-0.013
inn-0.013
oons-0.012
or-0.012
Token.
Token ID13
Feature activation-1.07e+00

Pos logit contributions
 glad+0.052
 happiest+0.052
 merry+0.051
 smartest+0.050
 strongest+0.050
Neg logit contributions
inn-0.042
ork-0.041
eth-0.039
 attent-0.038
 Grac-0.037
Token What
Token ID1867
Feature activation-9.34e-01

Pos logit contributions
 glad+0.024
 floppy+0.023
 merry+0.023
 eager+0.022
 rider+0.022
Neg logit contributions
ork-0.032
 attent-0.031
inn-0.031
or-0.029
oons-0.029
Token's
Token ID338
Feature activation+4.70e-01

Pos logit contributions
 glad+0.023
 floppy+0.022
 eagle+0.021
 merry+0.020
 rider+0.020
Neg logit contributions
ork-0.026
oons-0.023
to-0.023
or-0.022
 las-0.022
Token.
Token ID13
Feature activation+4.04e-02

Pos logit contributions
 glad+0.015
 floppy+0.015
 merry+0.014
 greedy+0.014
 strongest+0.013
Neg logit contributions
ork-0.021
 attent-0.021
inn-0.021
or-0.020
oons-0.019
Token He
Token ID679
Feature activation-2.11e-02

Pos logit contributions
ork+0.001
inn+0.001
 attent+0.001
works+0.001
or+0.001
Neg logit contributions
 eager-0.001
 floppy-0.001
 glad-0.001
 no-0.001
 curious-0.001
Token grabbed
Token ID13176
Feature activation-8.49e-01

Pos logit contributions
 glad+0.000
 floppy+0.000
 greedy+0.000
 strongest+0.000
 happiest+0.000
Neg logit contributions
ork-0.001
 attent-0.001
inn-0.001
or-0.001
oons-0.001
Token the
Token ID262
Feature activation-9.01e-01

Pos logit contributions
 glad+0.018
 floppy+0.017
 merry+0.016
 rider+0.016
 greedy+0.016
Neg logit contributions
inn-0.027
ork-0.027
 attent-0.027
or-0.025
oons-0.024
Token handle
Token ID5412
Feature activation-1.63e+00

Pos logit contributions
 glad+0.019
 greedy+0.017
 grateful+0.015
 floppy+0.015
 eager+0.015
Neg logit contributions
ork-0.027
 attent-0.025
inn-0.024
inner-0.024
ula-0.024
Tokenbars
Token ID34046
Feature activation-3.57e+00

Pos logit contributions
 glad+0.035
 greedy+0.032
 floppy+0.030
 happiest+0.030
 merry+0.029
Neg logit contributions
inn-0.053
 attent-0.052
or-0.051
ork-0.051
ula-0.048
Token and
Token ID290
Feature activation+4.96e-01

Pos logit contributions
 glad+0.081
 greedy+0.075
 happiest+0.072
 floppy+0.071
 eagle+0.070
Neg logit contributions
inn-0.108
or-0.103
ork-0.099
 attent-0.097
ula-0.095
Token he
Token ID339
Feature activation+4.83e-01

Pos logit contributions
 attent+0.015
inn+0.015
ork+0.015
or+0.014
 liquids+0.013
Neg logit contributions
 glad-0.011
 floppy-0.010
 greedy-0.010
 merry-0.010
 happiest-0.010
Token quickly
Token ID2952
Feature activation+1.20e+00

Pos logit contributions
ork+0.015
 attent+0.014
inn+0.014
or+0.014
oons+0.013
Neg logit contributions
 glad-0.011
 floppy-0.010
 greedy-0.010
 merry-0.009
 strongest-0.009
Token pushed
Token ID7121
Feature activation-8.43e-01

Pos logit contributions
ork+0.036
 attent+0.035
inn+0.035
or+0.033
oons+0.032
Neg logit contributions
 glad-0.028
 floppy-0.026
 merry-0.025
 strongest-0.025
 happiest-0.025
Token on
Token ID319
Feature activation+2.84e-01

Pos logit contributions
 glad+0.017
 floppy+0.016
 merry+0.016
 rider+0.015
 eager+0.015
Neg logit contributions
ork-0.027
inn-0.027
 attent-0.027
or-0.025
oons-0.025
Token,
Token ID11
Feature activation-5.83e-01

Pos logit contributions
 glad+0.015
 floppy+0.015
 rider+0.015
 merry+0.015
 strongest+0.014
Neg logit contributions
ork-0.030
 attent-0.029
inn-0.028
 Grac-0.027
or-0.027
Token "
Token ID366
Feature activation-4.95e-01

Pos logit contributions
 glad+0.014
 floppy+0.014
 merry+0.013
 rider+0.013
 eager+0.013
Neg logit contributions
ork-0.016
 attent-0.016
inn-0.016
or-0.014
oons-0.014
TokenHi
Token ID17250
Feature activation-5.09e-01

Pos logit contributions
 glad+0.008
 floppy+0.008
 merry+0.008
 rider+0.008
 eager+0.007
Neg logit contributions
ork-0.018
inn-0.017
 attent-0.017
oons-0.016
or-0.016
Token!
Token ID0
Feature activation-1.35e-01

Pos logit contributions
 glad+0.009
 floppy+0.008
 merry+0.008
 rider+0.008
 eager+0.008
Neg logit contributions
ork-0.018
inn-0.017
 attent-0.017
or-0.016
oons-0.016
Token My
Token ID2011
Feature activation+3.96e-02

Pos logit contributions
 glad+0.003
 floppy+0.003
 merry+0.002
 no+0.002
 rider+0.002
Neg logit contributions
 attent-0.004
ork-0.004
inn-0.004
oons-0.004
or-0.004
Token name
Token ID1438
Feature activation-3.56e+00

Pos logit contributions
ork+0.001
 attent+0.001
inn+0.001
or+0.001
 liquids+0.001
Neg logit contributions
 glad-0.001
 floppy-0.001
 merry-0.001
 rider-0.001
 eager-0.001
Token is
Token ID318
Feature activation-2.47e-01

Pos logit contributions
 glad+0.098
 floppy+0.091
 eager+0.087
 happiest+0.086
 strongest+0.084
Neg logit contributions
ork-0.090
oons-0.080
 attent-0.078
inn-0.076
or-0.075
Token Tim
Token ID5045
Feature activation-1.64e+00

Pos logit contributions
 glad+0.006
 floppy+0.006
 merry+0.005
 no+0.005
 eager+0.005
Neg logit contributions
inn-0.007
 attent-0.007
ork-0.007
oons-0.007
or-0.007
Token.
Token ID13
Feature activation-9.76e-01

Pos logit contributions
 glad+0.043
 merry+0.042
 floppy+0.041
 broad+0.039
 happiest+0.039
Neg logit contributions
inn-0.038
ork-0.038
 attent-0.035
or-0.035
oons-0.035
Token This
Token ID770
Feature activation-5.68e-01

Pos logit contributions
 glad+0.018
 floppy+0.017
 rider+0.016
 merry+0.015
 no+0.015
Neg logit contributions
 attent-0.034
ork-0.034
inn-0.033
oons-0.032
or-0.032
Token is
Token ID318
Feature activation+4.76e-02

Pos logit contributions
 glad+0.010
 floppy+0.010
 merry+0.010
 eager+0.009
 rider+0.009
Neg logit contributions
ork-0.019
 attent-0.019
inn-0.019
or-0.018
oons-0.018
Token little
Token ID1310
Feature activation-6.00e-01

Pos logit contributions
 glad+0.008
 floppy+0.008
 soaring+0.008
 swinging+0.008
opl+0.008
Neg logit contributions
ork-0.006
inn-0.006
 attent-0.006
inner-0.006
oons-0.006
Token boy
Token ID2933
Feature activation-8.96e-01

Pos logit contributions
 glad+0.019
opl+0.017
 soaring+0.017
 floppy+0.017
 eagle+0.017
Neg logit contributions
inn-0.011
ork-0.011
 Grac-0.011
or-0.010
 liquids-0.010
Token said
Token ID531
Feature activation-3.26e-01

Pos logit contributions
 glad+0.014
 floppy+0.013
 eager+0.013
 waved+0.012
 merry+0.012
Neg logit contributions
ork-0.033
inn-0.033
 attent-0.032
oons-0.031
or-0.031
Token "
Token ID366
Feature activation-2.24e-01

Pos logit contributions
 glad+0.007
 no+0.006
 eager+0.006
 floppy+0.006
 merry+0.006
Neg logit contributions
inn-0.011
ork-0.011
 attent-0.011
oons-0.010
 liquids-0.010
TokenMy
Token ID3666
Feature activation-4.38e-01

Pos logit contributions
 glad+0.004
 floppy+0.004
 merry+0.004
 rider+0.004
 eager+0.004
Neg logit contributions
ork-0.008
 attent-0.008
inn-0.008
oons-0.007
 liquids-0.007
Token name
Token ID1438
Feature activation-3.55e+00

Pos logit contributions
 glad+0.013
 floppy+0.012
 merry+0.012
 eagle+0.011
 greedy+0.011
Neg logit contributions
ork-0.010
 attent-0.009
inn-0.009
 Grac-0.008
ula-0.008
Token is
Token ID318
Feature activation-2.41e-01

Pos logit contributions
 glad+0.100
 floppy+0.092
 eager+0.088
 greedy+0.086
 happiest+0.085
Neg logit contributions
ork-0.087
 attent-0.077
oons-0.077
or-0.074
inn-0.072
Token Jack
Token ID3619
Feature activation-9.65e-01

Pos logit contributions
 glad+0.007
 floppy+0.006
 merry+0.006
 no+0.006
 eager+0.006
Neg logit contributions
ork-0.006
 attent-0.006
inn-0.006
oons-0.006
or-0.006
Token.
Token ID13
Feature activation-6.75e-01

Pos logit contributions
 glad+0.022
 merry+0.021
 floppy+0.021
 happiest+0.019
 rider+0.019
Neg logit contributions
inn-0.027
ork-0.027
 attent-0.026
or-0.025
oons-0.025
Token Can
Token ID1680
Feature activation-2.15e+00

Pos logit contributions
 glad+0.013
 floppy+0.012
 merry+0.011
 eager+0.011
 rider+0.011
Neg logit contributions
ork-0.023
 attent-0.023
inn-0.022
oons-0.021
or-0.021
Token I
Token ID314
Feature activation-1.48e+00

Pos logit contributions
 glad+0.041
 floppy+0.039
 merry+0.037
 eager+0.036
 rider+0.036
Neg logit contributions
ork-0.071
inn-0.070
 attent-0.070
or-0.066
 Grac-0.064
Token too
Token ID1165
Feature activation-1.32e+00

Pos logit contributions
 floppy+0.026
 glad+0.026
 merry+0.025
 strongest+0.025
 smartest+0.024
Neg logit contributions
inn-0.037
 attent-0.037
ork-0.036
ula-0.034
or-0.034
Token.
Token ID13
Feature activation-2.39e-01

Pos logit contributions
 happiest+0.034
Leave+0.033
 merry+0.033
 eagle+0.033
 smartest+0.032
Neg logit contributions
 attent-0.032
inn-0.031
ork-0.030
oons-0.028
ub-0.027
Token You
Token ID921
Feature activation-5.09e-01

Pos logit contributions
 glad+0.006
 floppy+0.005
 merry+0.005
 eager+0.005
 rider+0.005
Neg logit contributions
ork-0.007
inn-0.007
 attent-0.007
or-0.006
 liquids-0.006
Token are
Token ID389
Feature activation-1.42e+00

Pos logit contributions
 glad+0.010
 floppy+0.010
 swinging+0.010
 merry+0.010
 rider+0.010
Neg logit contributions
ork-0.015
 attent-0.014
 Grac-0.014
 liquids-0.014
inn-0.014
Token my
Token ID616
Feature activation-1.60e+00

Pos logit contributions
 floppy+0.024
 rider+0.024
 merry+0.023
 eagle+0.023
 glad+0.022
Neg logit contributions
inn-0.044
 attent-0.043
 Grac-0.043
ork-0.042
 las-0.041
Token friends
Token ID2460
Feature activation-3.54e+00

Pos logit contributions
 glad+0.026
 floppy+0.025
 riding+0.024
 merry+0.023
 swinging+0.023
Neg logit contributions
 attent-0.054
ork-0.053
inn-0.052
 Grac-0.050
 liquids-0.050
Token.
Token ID13
Feature activation-6.00e-01

Pos logit contributions
 floppy+0.104
 merry+0.099
 swinging+0.099
 happiest+0.099
 eagle+0.098
Neg logit contributions
ork-0.059
inn-0.058
or-0.055
 attent-0.052
 Sp-0.051
Token Now
Token ID2735
Feature activation-8.77e-01

Pos logit contributions
 glad+0.014
 floppy+0.013
 merry+0.013
 eager+0.012
 rider+0.012
Neg logit contributions
ork-0.017
inn-0.017
 attent-0.017
or-0.016
oons-0.016
Token we
Token ID356
Feature activation+1.72e-01

Pos logit contributions
 glad+0.018
 floppy+0.018
 merry+0.016
 eager+0.016
 greedy+0.016
Neg logit contributions
ork-0.027
 attent-0.026
inn-0.026
or-0.025
 Grac-0.024
Token can
Token ID460
Feature activation-1.45e+00

Pos logit contributions
ork+0.006
 attent+0.006
inn+0.006
or+0.005
oons+0.005
Neg logit contributions
 glad-0.003
 floppy-0.003
 merry-0.003
 eager-0.003
 rider-0.003
Token play
Token ID711
Feature activation-5.79e-01

Pos logit contributions
 glad+0.032
 floppy+0.031
 merry+0.029
 greedy+0.029
 rider+0.029
Neg logit contributions
ork-0.044
inn-0.043
 attent-0.043
or-0.040
oons-0.039
Token,
Token ID11
Feature activation+2.43e-01

Pos logit contributions
 glad+0.008
 floppy+0.007
 merry+0.007
 rider+0.007
 eager+0.006
Neg logit contributions
ork-0.018
 attent-0.017
inn-0.017
or-0.016
oons-0.016
Token "
Token ID366
Feature activation-5.48e-01

Pos logit contributions
ork+0.009
inn+0.009
or+0.009
oons+0.008
 attent+0.008
Neg logit contributions
 glad-0.004
 eager-0.004
 floppy-0.004
 no-0.004
 riding-0.004
TokenHi
Token ID17250
Feature activation-2.18e-01

Pos logit contributions
 glad+0.009
 floppy+0.009
 merry+0.009
 rider+0.009
 eager+0.009
Neg logit contributions
ork-0.019
inn-0.019
 attent-0.019
oons-0.018
 liquids-0.017
Token,
Token ID11
Feature activation+6.20e-01

Pos logit contributions
 glad+0.004
 floppy+0.004
 merry+0.003
 eager+0.003
 strongest+0.003
Neg logit contributions
ork-0.008
 attent-0.007
inn-0.007
or-0.007
oons-0.007
Token my
Token ID616
Feature activation-6.20e-01

Pos logit contributions
ork+0.018
 attent+0.018
inn+0.018
or+0.016
oons+0.016
Neg logit contributions
 glad-0.015
 floppy-0.014
 merry-0.014
 eager-0.013
 rider-0.013
Token name
Token ID1438
Feature activation-3.53e+00

Pos logit contributions
 eagle+0.016
 glad+0.015
 strongest+0.014
Leave+0.014
 merry+0.014
Neg logit contributions
ork-0.016
 attent-0.015
inn-0.014
ula-0.014
eth-0.014
Token is
Token ID318
Feature activation-5.83e-01

Pos logit contributions
 glad+0.094
 floppy+0.088
 happiest+0.083
 eager+0.083
 strongest+0.082
Neg logit contributions
ork-0.093
oons-0.082
inn-0.081
 attent-0.081
or-0.078
Token Lily
Token ID20037
Feature activation-1.68e+00

Pos logit contributions
 glad+0.014
 floppy+0.014
 merry+0.014
 no+0.013
 eager+0.013
Neg logit contributions
 attent-0.017
ork-0.017
inn-0.017
oons-0.016
or-0.015
Token.
Token ID13
Feature activation-1.24e+00

Pos logit contributions
 glad+0.040
 merry+0.040
 floppy+0.039
 happiest+0.037
 strongest+0.036
Neg logit contributions
inn-0.044
ork-0.044
 attent-0.042
or-0.040
oons-0.039
Token Do
Token ID2141
Feature activation-4.95e-01

Pos logit contributions
 glad+0.024
 floppy+0.023
 merry+0.021
 eager+0.021
 rider+0.021
Neg logit contributions
ork-0.043
 attent-0.042
inn-0.042
oons-0.040
or-0.039
Token you
Token ID345
Feature activation-1.77e-02

Pos logit contributions
 glad+0.013
 floppy+0.013
 rider+0.012
 eager+0.012
 merry+0.012
Neg logit contributions
ork-0.012
 attent-0.011
inn-0.011
or-0.011
oons-0.010
Token,
Token ID11
Feature activation+1.86e-01

Pos logit contributions
 glad+0.005
 floppy+0.005
 merry+0.005
 rider+0.004
 eager+0.004
Neg logit contributions
ork-0.011
 attent-0.011
inn-0.010
or-0.010
oons-0.010
Token "
Token ID366
Feature activation-2.22e-01

Pos logit contributions
ork+0.006
inn+0.006
 attent+0.006
oons+0.006
or+0.006
Neg logit contributions
 glad-0.004
 floppy-0.004
 eager-0.004
 no-0.003
 merry-0.003
TokenHi
Token ID17250
Feature activation-6.34e-01

Pos logit contributions
 glad+0.004
 floppy+0.004
 rider+0.004
 merry+0.004
 eager+0.003
Neg logit contributions
ork-0.008
inn-0.008
 attent-0.008
oons-0.007
 liquids-0.007
Token!
Token ID0
Feature activation-6.45e-01

Pos logit contributions
 glad+0.011
 floppy+0.010
 merry+0.010
 eager+0.010
 rider+0.009
Neg logit contributions
ork-0.022
inn-0.021
 attent-0.021
or-0.020
oons-0.020
Token My
Token ID2011
Feature activation-6.49e-01

Pos logit contributions
 glad+0.014
 floppy+0.014
 merry+0.013
 eager+0.013
 rider+0.013
Neg logit contributions
 attent-0.020
ork-0.020
inn-0.019
oons-0.019
or-0.018
Token name
Token ID1438
Feature activation-3.53e+00

Pos logit contributions
 glad+0.018
 floppy+0.017
 merry+0.017
 strongest+0.016
 eagle+0.016
Neg logit contributions
ork-0.015
 attent-0.015
inn-0.014
eth-0.013
 Grac-0.013
Token is
Token ID318
Feature activation-1.84e-01

Pos logit contributions
 glad+0.103
 floppy+0.096
 eager+0.092
 happiest+0.091
 strongest+0.089
Neg logit contributions
ork-0.080
oons-0.071
 attent-0.068
or-0.066
inn-0.063
Token Bob
Token ID5811
Feature activation-1.45e+00

Pos logit contributions
 glad+0.004
 floppy+0.004
 merry+0.004
 eager+0.004
 no+0.004
Neg logit contributions
ork-0.006
 attent-0.006
inn-0.006
oons-0.005
or-0.005
Token.
Token ID13
Feature activation-1.30e+00

Pos logit contributions
 glad+0.034
 merry+0.033
 floppy+0.033
 happiest+0.031
 strongest+0.030
Neg logit contributions
inn-0.038
ork-0.038
 attent-0.037
oons-0.036
or-0.035
Token What
Token ID1867
Feature activation-9.04e-01

Pos logit contributions
 glad+0.027
 floppy+0.026
 merry+0.024
 rider+0.024
 eager+0.024
Neg logit contributions
ork-0.042
 attent-0.041
inn-0.041
oons-0.038
or-0.038
Token's
Token ID338
Feature activation+5.74e-01

Pos logit contributions
 glad+0.022
 floppy+0.020
 merry+0.019
 eagle+0.019
 happiest+0.019
Neg logit contributions
ork-0.024
oons-0.023
to-0.022
 attent-0.021
 las-0.021
Token,
Token ID11
Feature activation-7.10e-02

Pos logit contributions
 glad+0.005
 floppy+0.004
 merry+0.004
 eager+0.004
 greedy+0.004
Neg logit contributions
ork-0.013
inn-0.013
 attent-0.013
oons-0.012
or-0.012
Token "
Token ID366
Feature activation-8.70e-02

Pos logit contributions
 glad+0.001
 floppy+0.001
 eager+0.001
 merry+0.001
 no+0.001
Neg logit contributions
ork-0.002
inn-0.002
 attent-0.002
oons-0.002
or-0.002
TokenHi
Token ID17250
Feature activation-2.95e-01

Pos logit contributions
 floppy+0.002
 glad+0.002
 merry+0.002
 rider+0.002
 eager+0.001
Neg logit contributions
ork-0.003
inn-0.003
 attent-0.003
oons-0.003
 liquids-0.003
Token,
Token ID11
Feature activation+9.52e-01

Pos logit contributions
 glad+0.005
 floppy+0.005
 merry+0.005
 eager+0.004
 rider+0.004
Neg logit contributions
ork-0.010
inn-0.010
 attent-0.010
or-0.010
 liquids-0.009
Token my
Token ID616
Feature activation-3.88e-01

Pos logit contributions
ork+0.028
 attent+0.028
inn+0.027
oons+0.025
or+0.025
Neg logit contributions
 glad-0.023
 floppy-0.021
 merry-0.021
 eager-0.020
 rider-0.020
Token name
Token ID1438
Feature activation-3.52e+00

Pos logit contributions
 glad+0.009
 floppy+0.009
 merry+0.008
 rider+0.008
 greedy+0.008
Neg logit contributions
ork-0.011
 attent-0.011
inn-0.011
or-0.010
eth-0.010
Token is
Token ID318
Feature activation-4.10e-01

Pos logit contributions
 glad+0.103
 floppy+0.094
 eager+0.091
 happiest+0.089
 strongest+0.088
Neg logit contributions
ork-0.084
oons-0.070
 attent-0.069
or-0.068
inn-0.067
Token Ch
Token ID609
Feature activation+3.10e-01

Pos logit contributions
 glad+0.011
 floppy+0.011
 merry+0.011
 no+0.010
 eager+0.010
Neg logit contributions
 attent-0.011
inn-0.011
ork-0.011
oons-0.010
or-0.010
Tokenoo
Token ID2238
Feature activation-3.78e-01

Pos logit contributions
 attent+0.008
ork+0.008
inn+0.008
 liquids+0.007
oons+0.007
Neg logit contributions
 glad-0.008
 floppy-0.008
 merry-0.008
 eager-0.007
 rider-0.007
Token-
Token ID12
Feature activation+1.15e-01

Pos logit contributions
 glad+0.008
 floppy+0.008
 merry+0.007
 rider+0.007
 eager+0.007
Neg logit contributions
ork-0.012
inn-0.012
 attent-0.012
oons-0.011
or-0.011
TokenCh
Token ID1925
Feature activation-6.67e-02

Pos logit contributions
 attent+0.002
ork+0.002
inn+0.002
oons+0.002
 liquids+0.002
Neg logit contributions
 glad-0.003
 floppy-0.003
 rider-0.003
 merry-0.003
 greedy-0.003
Token"
Token ID1
Feature activation-2.77e-01

Pos logit contributions
ork+0.021
 attent+0.021
inn+0.021
 liquids+0.020
oons+0.020
Neg logit contributions
 glad-0.021
 rider-0.021
 floppy-0.020
 eager-0.020
 merry-0.020
TokenHi
Token ID17250
Feature activation-2.11e-02

Pos logit contributions
 glad+0.007
 floppy+0.006
 greedy+0.006
 eager+0.006
 merry+0.006
Neg logit contributions
inn-0.008
 attent-0.008
ork-0.008
 liquids-0.007
 Grac-0.007
Token Max
Token ID5436
Feature activation-8.71e-01

Pos logit contributions
 glad+0.000
 floppy+0.000
 merry+0.000
 strongest+0.000
 happiest+0.000
Neg logit contributions
ork-0.001
inn-0.001
 attent-0.001
 Grac-0.001
or-0.001
Token,
Token ID11
Feature activation+5.14e-01

Pos logit contributions
 glad+0.023
 happiest+0.023
 merry+0.022
 strongest+0.022
 smartest+0.022
Neg logit contributions
ork-0.021
inn-0.020
eth-0.020
 Grac-0.019
 attent-0.019
Token my
Token ID616
Feature activation-3.09e-01

Pos logit contributions
ork+0.016
 attent+0.016
inn+0.016
 Grac+0.015
or+0.015
Neg logit contributions
 glad-0.009
 floppy-0.009
 merry-0.009
 rider-0.009
 eager-0.009
Token name
Token ID1438
Feature activation-3.51e+00

Pos logit contributions
 glad+0.007
 floppy+0.007
 eagle+0.007
 merry+0.007
 greedy+0.007
Neg logit contributions
ork-0.008
 attent-0.008
inn-0.007
eth-0.007
ula-0.007
Token is
Token ID318
Feature activation-6.09e-01

Pos logit contributions
 glad+0.099
 happiest+0.090
 eager+0.089
 floppy+0.088
 strongest+0.087
Neg logit contributions
ork-0.087
oons-0.076
 attent-0.074
inn-0.072
sc-0.072
Token Lily
Token ID20037
Feature activation-1.79e+00

Pos logit contributions
 glad+0.014
 floppy+0.014
 merry+0.013
 eager+0.013
 rider+0.013
Neg logit contributions
ork-0.018
 attent-0.018
inn-0.018
or-0.016
oons-0.016
Token.
Token ID13
Feature activation-7.58e-01

Pos logit contributions
 glad+0.049
 happiest+0.048
 merry+0.047
 smartest+0.046
Leave+0.046
Neg logit contributions
ork-0.037
inn-0.037
eth-0.036
 Grac-0.034
 attent-0.033
Token Do
Token ID2141
Feature activation-4.20e-01

Pos logit contributions
 glad+0.017
 floppy+0.016
 merry+0.016
 rider+0.015
 eager+0.015
Neg logit contributions
ork-0.022
inn-0.022
 attent-0.022
 liquids-0.020
or-0.020
Token you
Token ID345
Feature activation-3.16e-02

Pos logit contributions
 glad+0.012
 floppy+0.011
 rider+0.011
 eager+0.011
 merry+0.011
Neg logit contributions
ork-0.010
 attent-0.009
inn-0.009
or-0.009
oons-0.008
Token his
Token ID465
Feature activation-5.42e-01

Pos logit contributions
ork+0.039
 attent+0.037
inn+0.036
oons+0.035
ount+0.035
Neg logit contributions
 floppy-0.015
 glad-0.015
 no-0.015
 then-0.014
 eager-0.014
Token hands
Token ID2832
Feature activation-6.17e-02

Pos logit contributions
 glad+0.010
 floppy+0.010
 merry+0.009
 eager+0.009
 greedy+0.009
Neg logit contributions
ork-0.018
inn-0.018
 attent-0.018
or-0.017
oons-0.017
Token and
Token ID290
Feature activation+9.46e-01

Pos logit contributions
 glad+0.001
 floppy+0.001
 merry+0.001
 rider+0.001
 greedy+0.001
Neg logit contributions
inn-0.002
 attent-0.002
ork-0.002
or-0.002
oons-0.002
Token then
Token ID788
Feature activation+4.02e-01

Pos logit contributions
ork+0.025
 attent+0.025
inn+0.025
oons+0.023
 liquids+0.023
Neg logit contributions
 glad-0.024
 floppy-0.023
 merry-0.023
 rider-0.022
 greedy-0.022
Token went
Token ID1816
Feature activation+9.58e-01

Pos logit contributions
ork+0.012
inn+0.012
 attent+0.012
or+0.011
oons+0.011
Neg logit contributions
 glad-0.009
 floppy-0.009
 merry-0.008
 eager-0.008
 rider-0.008
Token inside
Token ID2641
Feature activation+2.71e+00

Pos logit contributions
 attent+0.033
ork+0.033
works+0.032
oons+0.032
inn+0.032
Neg logit contributions
 glad-0.017
 riding-0.016
 eager-0.015
 swinging-0.015
 floppy-0.015
Token to
Token ID284
Feature activation+1.16e+00

Pos logit contributions
ork+0.092
inn+0.090
 attent+0.086
or+0.085
oons+0.084
Neg logit contributions
 glad-0.055
 eager-0.050
 no-0.047
 riding-0.047
 happy-0.045
Token play
Token ID711
Feature activation+3.87e-01

Pos logit contributions
ork+0.033
inn+0.033
 attent+0.033
or+0.030
oons+0.030
Neg logit contributions
 glad-0.028
 floppy-0.026
 merry-0.025
 eager-0.025
 rider-0.025
Token with
Token ID351
Feature activation+1.02e+00

Pos logit contributions
ork+0.013
inn+0.013
 attent+0.013
or+0.012
ub+0.012
Neg logit contributions
 glad-0.007
 merry-0.006
 floppy-0.006
 no-0.006
 eager-0.006
Token his
Token ID465
Feature activation-3.67e-01

Pos logit contributions
ork+0.030
inn+0.030
 attent+0.030
oons+0.028
or+0.028
Neg logit contributions
 glad-0.023
 floppy-0.022
 merry-0.021
 rider-0.021
 eager-0.021
Token toys
Token ID14958
Feature activation+3.40e-01

Pos logit contributions
 glad+0.008
 floppy+0.007
 merry+0.007
 rider+0.007
 greedy+0.007
Neg logit contributions
ork-0.011
 attent-0.011
inn-0.011
oons-0.011
 liquids-0.011
Token there
Token ID612
Feature activation+5.74e-01

Pos logit contributions
inn+0.028
ork+0.027
 attent+0.026
oons+0.025
 Grac+0.025
Neg logit contributions
 glad-0.018
 floppy-0.018
 merry-0.017
 rider-0.017
 eagle-0.017
Token was
Token ID373
Feature activation+7.02e-01

Pos logit contributions
ork+0.015
inn+0.015
 attent+0.014
oons+0.013
or+0.013
Neg logit contributions
 glad-0.015
 floppy-0.015
 rider-0.014
 merry-0.014
 greedy-0.014
Token a
Token ID257
Feature activation-6.01e-01

Pos logit contributions
inn+0.025
ork+0.024
 attent+0.024
oons+0.023
 Grac+0.022
Neg logit contributions
 glad-0.011
 rider-0.011
 merry-0.011
 strongest-0.010
 floppy-0.010
Token little
Token ID1310
Feature activation-4.82e-02

Pos logit contributions
 glad+0.011
 floppy+0.011
 merry+0.010
 strongest+0.010
 no+0.010
Neg logit contributions
ork-0.020
 attent-0.019
inn-0.019
oons-0.018
 liquids-0.017
Token girl
Token ID2576
Feature activation+1.22e+00

Pos logit contributions
 glad+0.001
 floppy+0.001
 merry+0.001
 happiest+0.001
 no+0.001
Neg logit contributions
inn-0.001
ork-0.001
 attent-0.001
 liquids-0.001
oons-0.001
Token named
Token ID3706
Feature activation+2.61e+00

Pos logit contributions
ork+0.040
inn+0.039
 attent+0.039
or+0.036
oons+0.036
Neg logit contributions
 glad-0.023
 merry-0.022
 floppy-0.022
 happiest-0.022
 strongest-0.022
Token Lily
Token ID20037
Feature activation+6.60e-01

Pos logit contributions
 attent+0.075
ork+0.073
inn+0.072
or+0.071
 liquids+0.068
Neg logit contributions
 glad-0.066
 no-0.062
 eager-0.062
 floppy-0.061
 greedy-0.058
Token.
Token ID13
Feature activation+7.35e-01

Pos logit contributions
inn+0.017
ork+0.017
 attent+0.016
or+0.015
oons+0.015
Neg logit contributions
 merry-0.017
 glad-0.017
 happiest-0.017
 strongest-0.017
 floppy-0.017
Token She
Token ID1375
Feature activation+1.52e-01

Pos logit contributions
ork+0.025
 attent+0.024
inn+0.024
or+0.023
oons+0.023
Neg logit contributions
 glad-0.014
 floppy-0.013
 merry-0.013
 eager-0.012
 rider-0.012
Token loved
Token ID6151
Feature activation-1.46e-01

Pos logit contributions
ork+0.006
inn+0.006
 attent+0.006
oons+0.005
or+0.005
Neg logit contributions
 glad-0.002
 floppy-0.002
 merry-0.002
 happiest-0.002
 strongest-0.002
Token to
Token ID284
Feature activation+4.06e-01

Pos logit contributions
 glad+0.003
 floppy+0.003
 merry+0.003
 eager+0.003
 riding+0.003
Neg logit contributions
inn-0.004
ork-0.004
 attent-0.004
or-0.004
oons-0.004
Token the
Token ID262
Feature activation-7.95e-01

Pos logit contributions
ork+0.002
 attent+0.002
inn+0.002
or+0.002
 liquids+0.002
Neg logit contributions
 glad-0.001
 floppy-0.001
 merry-0.001
 rider-0.001
 greedy-0.001
Token passport
Token ID17981
Feature activation-1.24e+00

Pos logit contributions
 floppy+0.023
 swinging+0.022
 glad+0.022
 riding+0.020
 greedy+0.020
Neg logit contributions
ork-0.019
 attent-0.019
or-0.016
inn-0.015
 Grac-0.014
Token and
Token ID290
Feature activation+3.23e-01

Pos logit contributions
 floppy+0.031
 glad+0.029
 greedy+0.028
 merry+0.028
 rider+0.026
Neg logit contributions
ork-0.036
 attent-0.033
or-0.032
inn-0.031
 Grac-0.029
Token the
Token ID262
Feature activation-6.98e-01

Pos logit contributions
 attent+0.009
ork+0.009
 Grac+0.009
 liquids+0.008
or+0.008
Neg logit contributions
 floppy-0.007
 rider-0.007
 merry-0.007
 glad-0.007
 greedy-0.006
Token magical
Token ID10883
Feature activation+6.59e-01

Pos logit contributions
 floppy+0.013
 glad+0.013
 rider+0.012
 swinging+0.012
 riding+0.012
Neg logit contributions
 attent-0.023
ork-0.022
inn-0.021
or-0.021
 Grac-0.020
Token mer
Token ID4017
Feature activation+3.19e+00

Pos logit contributions
ork+0.020
 attent+0.019
inn+0.019
or+0.018
oons+0.018
Neg logit contributions
 glad-0.015
 floppy-0.015
 merry-0.014
 greedy-0.014
 eager-0.013
Tokenmaid
Token ID23151
Feature activation+5.00e-01

Pos logit contributions
inn+0.078
ork+0.076
 liquids+0.074
 attent+0.073
oons+0.071
Neg logit contributions
 glad-0.084
 merry-0.082
 rider-0.078
 floppy-0.078
 eager-0.077
Token was
Token ID373
Feature activation-3.01e-01

Pos logit contributions
ork+0.016
 attent+0.016
inn+0.016
or+0.015
 Grac+0.014
Neg logit contributions
 glad-0.010
 floppy-0.010
 merry-0.009
 rider-0.009
 greedy-0.009
Token grateful
Token ID14066
Feature activation+1.03e-01

Pos logit contributions
 floppy+0.006
 rider+0.005
 eagle+0.005
 merry+0.005
 pedals+0.005
Neg logit contributions
ork-0.010
 attent-0.010
inn-0.009
 Grac-0.009
or-0.009
Token to
Token ID284
Feature activation+5.59e-01

Pos logit contributions
ork+0.003
 attent+0.003
inn+0.003
or+0.003
oons+0.002
Neg logit contributions
 glad-0.003
 floppy-0.002
 merry-0.002
 rider-0.002
 strongest-0.002
Token get
Token ID651
Feature activation+1.63e-01

Pos logit contributions
ork+0.018
 attent+0.017
inn+0.017
or+0.016
 Grac+0.016
Neg logit contributions
 floppy-0.011
 glad-0.011
 rider-0.011
 merry-0.011
 greedy-0.010
Token thread
Token ID4704
Feature activation+1.16e+00

Pos logit contributions
ork+0.039
inn+0.038
 attent+0.038
or+0.035
oons+0.035
Neg logit contributions
 glad-0.027
 floppy-0.026
 merry-0.025
 eager-0.024
 rider-0.024
Token and
Token ID290
Feature activation+4.06e-01

Pos logit contributions
ork+0.044
inn+0.043
 attent+0.043
or+0.041
oons+0.041
Neg logit contributions
 glad-0.017
 floppy-0.016
 merry-0.015
 rider-0.014
 strongest-0.014
Token a
Token ID257
Feature activation+1.53e+00

Pos logit contributions
ork+0.012
 attent+0.012
inn+0.012
 liquids+0.011
oons+0.011
Neg logit contributions
 glad-0.009
 floppy-0.008
 merry-0.008
 rider-0.008
 strongest-0.008
Token needle
Token ID17598
Feature activation+2.21e+00

Pos logit contributions
inn+0.049
 attent+0.048
ork+0.048
or+0.046
oons+0.045
Neg logit contributions
 glad-0.032
 floppy-0.030
 merry-0.029
 eager-0.029
 no-0.028
Token and
Token ID290
Feature activation+3.88e-01

Pos logit contributions
ork+0.077
inn+0.076
 attent+0.076
or+0.071
oons+0.071
Neg logit contributions
 glad-0.039
 floppy-0.037
 merry-0.036
 eager-0.034
 rider-0.034
Token se
Token ID384
Feature activation+2.71e+00

Pos logit contributions
ork+0.012
 attent+0.012
inn+0.012
 liquids+0.011
oons+0.011
Neg logit contributions
 glad-0.009
 floppy-0.008
 merry-0.008
 eager-0.008
 rider-0.008
Tokenwed
Token ID19103
Feature activation+6.12e-02

Pos logit contributions
 Grac+0.073
 attent+0.073
ork+0.072
 liquids+0.071
ula+0.070
Neg logit contributions
 floppy-0.062
 rider-0.062
 glad-0.061
 greedy-0.057
 eager-0.056
Token up
Token ID510
Feature activation+2.40e-01

Pos logit contributions
 attent+0.002
ork+0.002
inn+0.002
oons+0.002
 occasions+0.002
Neg logit contributions
 glad-0.001
 floppy-0.001
 no-0.001
 eager-0.001
 riding-0.001
Token the
Token ID262
Feature activation+1.04e+00

Pos logit contributions
ork+0.009
inn+0.009
 attent+0.009
or+0.008
oons+0.008
Neg logit contributions
 glad-0.004
 floppy-0.004
 merry-0.003
 eager-0.003
 rider-0.003
Token tear
Token ID11626
Feature activation+1.24e+00

Pos logit contributions
ork+0.033
inn+0.032
 attent+0.032
 liquids+0.030
oons+0.030
Neg logit contributions
 glad-0.022
 merry-0.020
 floppy-0.019
 rider-0.019
 eager-0.019
Token in
Token ID287
Feature activation+1.89e-01

Pos logit contributions
 attent+0.040
ork+0.040
inn+0.040
or+0.037
oons+0.037
Neg logit contributions
 glad-0.023
 floppy-0.023
 merry-0.022
 rider-0.021
 no-0.021
Token box
Token ID3091
Feature activation+3.94e-01

Pos logit contributions
 glad+0.002
 strongest+0.002
 happiest+0.002
 smartest+0.002
 greedy+0.002
Neg logit contributions
ork-0.003
 attent-0.003
inn-0.003
 Grac-0.003
oons-0.003
Token.
Token ID13
Feature activation+1.34e-01

Pos logit contributions
inn+0.011
 attent+0.011
ork+0.011
or+0.010
oons+0.009
Neg logit contributions
 glad-0.010
 merry-0.009
 floppy-0.009
 happiest-0.009
 eagle-0.009
Token\n
Token ID198
Feature activation+3.42e-01

Pos logit contributions
 attent+0.005
ork+0.005
inn+0.005
or+0.005
oons+0.004
Neg logit contributions
 glad-0.002
 floppy-0.002
 eager-0.002
 no-0.002
 greedy-0.002
Token\n
Token ID198
Feature activation+4.02e-01

Pos logit contributions
 attent+0.011
ork+0.011
inn+0.011
 liquids+0.010
or+0.010
Neg logit contributions
 glad-0.007
 eager-0.006
 floppy-0.006
 merry-0.006
 riding-0.006
Token"
Token ID1
Feature activation+2.67e-01

Pos logit contributions
 attent+0.014
ork+0.013
inn+0.013
 liquids+0.013
oons+0.012
Neg logit contributions
 glad-0.008
 eager-0.007
 merry-0.007
 greedy-0.006
 rider-0.006
TokenGu
Token ID8205
Feature activation+2.81e+00

Pos logit contributions
ork+0.009
 attent+0.009
inn+0.009
or+0.008
oons+0.008
Neg logit contributions
 glad-0.005
 merry-0.005
 rider-0.005
 floppy-0.005
 greedy-0.005
Tokeness
Token ID408
Feature activation+1.25e+00

Pos logit contributions
inn+0.060
 liquids+0.059
 attent+0.058
 Grac+0.057
oons+0.052
Neg logit contributions
 glad-0.084
 no-0.081
 rider-0.079
 greedy-0.078
 merry-0.077
Token what
Token ID644
Feature activation-4.79e-04

Pos logit contributions
ork+0.040
 attent+0.039
inn+0.039
or+0.037
oons+0.037
Neg logit contributions
 glad-0.025
 floppy-0.025
 merry-0.023
 rider-0.023
 eager-0.022
Token,
Token ID11
Feature activation+1.56e+00

Pos logit contributions
 glad+0.000
 floppy+0.000
 merry+0.000
 rider+0.000
 eager+0.000
Neg logit contributions
ork-0.000
 attent-0.000
inn-0.000
oons-0.000
or-0.000
Token kids
Token ID3988
Feature activation-9.08e-02

Pos logit contributions
ork+0.050
 attent+0.049
inn+0.049
or+0.046
oons+0.045
Neg logit contributions
 glad-0.032
 floppy-0.031
 merry-0.030
 rider-0.029
 eager-0.029
Token?"
Token ID1701
Feature activation+2.59e-01

Pos logit contributions
 glad+0.002
 floppy+0.002
 merry+0.002
 rider+0.002
 eager+0.002
Neg logit contributions
ork-0.003
 attent-0.003
inn-0.003
or-0.003
oons-0.003
Token.
Token ID13
Feature activation+2.55e-01

Pos logit contributions
 glad+0.009
 floppy+0.008
 merry+0.008
 greedy+0.008
 rider+0.008
Neg logit contributions
ork-0.008
 attent-0.008
or-0.007
inn-0.007
oons-0.007
Token Suddenly
Token ID24975
Feature activation-1.42e-01

Pos logit contributions
ork+0.008
 attent+0.008
inn+0.008
or+0.007
oons+0.007
Neg logit contributions
 glad-0.005
 floppy-0.005
 merry-0.005
 rider-0.005
 eager-0.005
Token,
Token ID11
Feature activation-1.32e-01

Pos logit contributions
 glad+0.002
 floppy+0.002
 merry+0.002
 rider+0.002
 eager+0.002
Neg logit contributions
ork-0.005
 attent-0.005
inn-0.005
or-0.005
oons-0.005
Token she
Token ID673
Feature activation+6.49e-01

Pos logit contributions
 glad+0.003
 floppy+0.002
 merry+0.002
 rider+0.002
 greedy+0.002
Neg logit contributions
ork-0.004
 attent-0.004
inn-0.004
 liquids-0.004
or-0.004
Token heard
Token ID2982
Feature activation-8.39e-01

Pos logit contributions
ork+0.020
 attent+0.019
inn+0.018
or+0.018
eth+0.017
Neg logit contributions
 glad-0.015
 floppy-0.014
 rider-0.014
 merry-0.014
 strongest-0.013
Token a
Token ID257
Feature activation+1.31e+00

Pos logit contributions
 glad+0.015
 floppy+0.014
 rider+0.014
 merry+0.013
 strongest+0.013
Neg logit contributions
ork-0.030
 attent-0.029
 liquids-0.026
inn-0.026
or-0.026
Token strange
Token ID6283
Feature activation+1.21e+00

Pos logit contributions
inn+0.053
ork+0.051
 attent+0.050
or+0.049
oons+0.049
Neg logit contributions
 glad-0.018
 merry-0.017
 floppy-0.016
 eager-0.015
 eagle-0.015
Token noise
Token ID7838
Feature activation+1.89e-01

Pos logit contributions
inn+0.043
ork+0.042
 attent+0.041
or+0.040
oons+0.040
Neg logit contributions
 glad-0.021
 merry-0.020
 floppy-0.020
 eager-0.020
 eagle-0.020
Token and
Token ID290
Feature activation-3.18e-02

Pos logit contributions
ork+0.006
inn+0.006
 attent+0.006
or+0.006
oons+0.006
Neg logit contributions
 glad-0.004
 floppy-0.003
 merry-0.003
 eager-0.003
 rider-0.003
Token realised
Token ID19169
Feature activation+6.77e-01

Pos logit contributions
 glad+0.001
 floppy+0.001
 merry+0.001
 rider+0.001
 strongest+0.001
Neg logit contributions
 attent-0.001
ork-0.001
inn-0.001
 liquids-0.001
or-0.001
Token it
Token ID340
Feature activation+5.12e-01

Pos logit contributions
ork+0.023
 attent+0.023
inn+0.023
or+0.021
 liquids+0.021
Neg logit contributions
 glad-0.012
 floppy-0.012
 merry-0.011
 rider-0.011
 eager-0.011
TokenOnce
Token ID7454
Feature activation+8.02e-01

Pos logit contributions
 glad+0.010
 floppy+0.009
 merry+0.009
 eager+0.009
 rider+0.009
Neg logit contributions
 attent-0.012
ork-0.012
inn-0.012
or-0.011
oons-0.011
Token upon
Token ID2402
Feature activation-7.50e-01

Pos logit contributions
ork+0.015
 attent+0.015
inn+0.015
 liquids+0.014
oons+0.013
Neg logit contributions
 glad-0.027
 floppy-0.025
 eager-0.025
 merry-0.025
 no-0.025
Token a
Token ID257
Feature activation+6.87e-01

Pos logit contributions
 strongest+0.017
 eagle+0.017
 happiest+0.017
 floppy+0.017
 smartest+0.017
Neg logit contributions
ork-0.021
to-0.020
 attent-0.019
inn-0.019
oons-0.018
Token time
Token ID640
Feature activation+3.99e-01

Pos logit contributions
ork+0.014
inn+0.014
 attent+0.014
oons+0.012
or+0.012
Neg logit contributions
 glad-0.022
 floppy-0.021
 merry-0.021
 rider-0.020
 strongest-0.020
Token,
Token ID11
Feature activation+9.72e-01

Pos logit contributions
 attent+0.016
ork+0.016
inn+0.016
 liquids+0.015
oons+0.015
Neg logit contributions
 glad-0.005
 floppy-0.004
 merry-0.004
 eager-0.004
 no-0.004
Token there
Token ID612
Feature activation+6.22e-01

Pos logit contributions
inn+0.033
ork+0.033
 attent+0.032
or+0.030
oons+0.030
Neg logit contributions
 glad-0.018
 floppy-0.017
 merry-0.017
 rider-0.016
 eager-0.016
Token was
Token ID373
Feature activation+6.07e-01

Pos logit contributions
ork+0.016
inn+0.016
 attent+0.015
oons+0.015
or+0.014
Neg logit contributions
 glad-0.016
 floppy-0.016
 rider-0.015
 merry-0.015
 greedy-0.015
Token a
Token ID257
Feature activation-8.12e-01

Pos logit contributions
inn+0.022
ork+0.022
 attent+0.022
 Grac+0.020
oons+0.020
Neg logit contributions
 glad-0.009
 rider-0.009
 merry-0.009
 floppy-0.008
 eagle-0.008
Token big
Token ID1263
Feature activation-1.46e+00

Pos logit contributions
 glad+0.015
 floppy+0.014
 merry+0.013
 strongest+0.013
 no+0.013
Neg logit contributions
ork-0.027
 attent-0.026
inn-0.026
oons-0.024
or-0.024
Token mountain
Token ID8598
Feature activation-1.75e-01

Pos logit contributions
 glad+0.029
 floppy+0.027
 strongest+0.026
 no+0.025
 smartest+0.025
Neg logit contributions
ork-0.042
inn-0.042
 attent-0.042
 Grac-0.039
 las-0.039
Token with
Token ID351
Feature activation+1.10e-01

Pos logit contributions
 glad+0.004
 floppy+0.004
 merry+0.004
 happiest+0.004
 greedy+0.004
Neg logit contributions
ork-0.005
inn-0.005
 attent-0.005
or-0.005
oons-0.005
Token trying
Token ID2111
Feature activation+1.50e-01

Pos logit contributions
ork+0.020
 attent+0.020
inn+0.019
or+0.019
to+0.018
Neg logit contributions
 floppy-0.015
 glad-0.015
 merry-0.014
 strongest-0.014
 rider-0.014
Token to
Token ID284
Feature activation+1.36e-01

Pos logit contributions
inn+0.004
ork+0.004
 attent+0.004
or+0.004
oons+0.004
Neg logit contributions
 glad-0.004
 floppy-0.003
 eager-0.003
 merry-0.003
 no-0.003
Token fly
Token ID6129
Feature activation-4.46e-01

Pos logit contributions
ork+0.004
 attent+0.004
inn+0.004
or+0.004
oons+0.004
Neg logit contributions
 glad-0.003
 floppy-0.003
 merry-0.003
 rider-0.003
 eager-0.002
Token.
Token ID13
Feature activation-3.59e-02

Pos logit contributions
 glad+0.011
 floppy+0.010
 merry+0.010
 rider+0.010
 eager+0.010
Neg logit contributions
ork-0.012
 attent-0.012
inn-0.012
or-0.011
oons-0.011
Token It
Token ID632
Feature activation+5.83e-01

Pos logit contributions
 glad+0.000
 floppy+0.000
 eager+0.000
 no+0.000
 swinging+0.000
Neg logit contributions
inn-0.001
ork-0.001
 attent-0.001
oons-0.001
or-0.001
Token had
Token ID550
Feature activation+5.81e-01

Pos logit contributions
ork+0.021
inn+0.020
 attent+0.020
or+0.019
oons+0.019
Neg logit contributions
 glad-0.010
 floppy-0.009
 merry-0.009
 eager-0.009
 rider-0.009
Token hurt
Token ID5938
Feature activation-4.25e-01

Pos logit contributions
ork+0.020
 attent+0.020
inn+0.019
 liquids+0.018
 Grac+0.018
Neg logit contributions
 glad-0.010
 rider-0.009
 merry-0.009
 happiest-0.008
 floppy-0.008
Token its
Token ID663
Feature activation-1.14e-01

Pos logit contributions
 glad+0.008
 floppy+0.008
 merry+0.008
 rider+0.007
 eager+0.007
Neg logit contributions
ork-0.014
 attent-0.014
inn-0.014
 liquids-0.013
or-0.013
Token wing
Token ID8539
Feature activation-8.60e-01

Pos logit contributions
 glad+0.002
 merry+0.002
 rider+0.002
Leave+0.002
pled+0.002
Neg logit contributions
ork-0.003
 attent-0.003
eth-0.003
inn-0.003
 potion-0.003
Token and
Token ID290
Feature activation+8.94e-01

Pos logit contributions
 glad+0.020
 merry+0.020
 rider+0.018
 smartest+0.017
 happiest+0.017
Neg logit contributions
ork-0.026
inn-0.023
 attent-0.023
or-0.022
sc-0.022
Token couldn
Token ID3521
Feature activation-6.15e-01

Pos logit contributions
ork+0.027
 attent+0.026
inn+0.025
 liquids+0.024
 Grac+0.024
Neg logit contributions
 glad-0.019
 merry-0.019
 rider-0.018
 floppy-0.018
 happiest-0.017
Token.
Token ID13
Feature activation+8.59e-02

Pos logit contributions
 glad+0.014
 floppy+0.013
 merry+0.012
 eager+0.012
 rider+0.012
Neg logit contributions
ork-0.021
 attent-0.021
inn-0.021
or-0.020
oons-0.020
Token\n
Token ID198
Feature activation+7.29e-02

Pos logit contributions
ork+0.003
 attent+0.003
inn+0.003
or+0.003
oons+0.003
Neg logit contributions
 glad-0.002
 floppy-0.002
 eager-0.002
 riding-0.001
 rider-0.001
Token\n
Token ID198
Feature activation+6.41e-01

Pos logit contributions
 attent+0.002
ork+0.002
inn+0.002
 liquids+0.002
or+0.002
Neg logit contributions
 glad-0.002
 floppy-0.001
 merry-0.001
 eager-0.001
 rider-0.001
TokenOne
Token ID3198
Feature activation-1.18e+00

Pos logit contributions
ork+0.021
 attent+0.021
inn+0.020
 liquids+0.020
oons+0.020
Neg logit contributions
 glad-0.012
 rider-0.012
 eager-0.011
 merry-0.011
 floppy-0.011
Token day
Token ID1110
Feature activation-8.30e-02

Pos logit contributions
 glad+0.031
 floppy+0.030
 merry+0.029
 greedy+0.029
 rider+0.028
Neg logit contributions
inn-0.030
ork-0.030
 attent-0.029
or-0.027
oons-0.026
Token he
Token ID339
Feature activation+5.66e-01

Pos logit contributions
 glad+0.001
 floppy+0.001
 no+0.001
 riding+0.001
 rider+0.001
Neg logit contributions
 attent-0.004
ork-0.004
inn-0.004
oons-0.003
or-0.003
Token went
Token ID1816
Feature activation+1.56e-01

Pos logit contributions
ork+0.018
 attent+0.018
inn+0.018
or+0.017
oons+0.017
Neg logit contributions
 glad-0.012
 floppy-0.011
 merry-0.010
 rider-0.010
 strongest-0.010
Token to
Token ID284
Feature activation+9.88e-01

Pos logit contributions
ork+0.005
inn+0.005
 attent+0.005
oons+0.004
 liquids+0.004
Neg logit contributions
 glad-0.004
 floppy-0.003
 merry-0.003
 eager-0.003
 riding-0.003
Token a
Token ID257
Feature activation-1.79e-01

Pos logit contributions
ork+0.032
 attent+0.031
inn+0.031
 Grac+0.029
or+0.029
Neg logit contributions
 glad-0.019
 floppy-0.019
 greedy-0.017
 rider-0.017
 strongest-0.016
Token new
Token ID649
Feature activation+3.57e-01

Pos logit contributions
 glad+0.003
 floppy+0.003
 merry+0.003
 eager+0.002
 rider+0.002
Neg logit contributions
ork-0.006
inn-0.006
 attent-0.006
or-0.006
oons-0.006
Token place
Token ID1295
Feature activation+2.79e-01

Pos logit contributions
ork+0.011
 attent+0.011
inn+0.011
or+0.010
oons+0.010
Neg logit contributions
 glad-0.008
 floppy-0.007
 merry-0.007
 eager-0.007
 greedy-0.007
Token<|endoftext|>
Token ID50256
Feature activation-5.69e-01

Pos logit contributions
oons+0.017
inn+0.016
or+0.016
ork+0.016
 attent+0.016
Neg logit contributions
 glad-0.009
 eager-0.007
 floppy-0.007
 no-0.007
 greedy-0.006
TokenOnce
Token ID7454
Feature activation+7.16e-01

Pos logit contributions
 glad+0.013
 floppy+0.012
 merry+0.012
 eager+0.011
 rider+0.011
Neg logit contributions
 attent-0.017
ork-0.017
inn-0.017
or-0.016
 liquids-0.015
Token upon
Token ID2402
Feature activation-6.67e-01

Pos logit contributions
ork+0.014
 attent+0.014
inn+0.014
 liquids+0.013
or+0.012
Neg logit contributions
 glad-0.023
 floppy-0.022
 merry-0.022
 eager-0.022
 no-0.021
Token a
Token ID257
Feature activation+4.66e-01

Pos logit contributions
 happiest+0.015
 strongest+0.015
 eagle+0.015
 smartest+0.015
 floppy+0.015
Neg logit contributions
ork-0.020
to-0.018
 attent-0.018
inn-0.017
oons-0.017
Token time
Token ID640
Feature activation+9.53e-02

Pos logit contributions
ork+0.010
inn+0.010
 attent+0.010
oons+0.009
or+0.009
Neg logit contributions
 glad-0.014
 floppy-0.014
 merry-0.014
 rider-0.013
 eager-0.013
Token,
Token ID11
Feature activation+8.45e-01

Pos logit contributions
 attent+0.004
ork+0.004
inn+0.004
 liquids+0.004
oons+0.004
Neg logit contributions
 glad-0.001
 floppy-0.001
 merry-0.001
 eager-0.001
 no-0.001
Token there
Token ID612
Feature activation+7.18e-01

Pos logit contributions
ork+0.029
inn+0.029
 attent+0.029
or+0.027
oons+0.027
Neg logit contributions
 glad-0.015
 floppy-0.014
 merry-0.014
 rider-0.013
 eager-0.013
Token was
Token ID373
Feature activation+5.97e-01

Pos logit contributions
ork+0.020
inn+0.019
 attent+0.019
oons+0.018
or+0.018
Neg logit contributions
 glad-0.018
 floppy-0.017
 merry-0.017
 rider-0.017
 greedy-0.016
Token a
Token ID257
Feature activation-5.35e-01

Pos logit contributions
inn+0.022
ork+0.022
 attent+0.021
 Grac+0.020
oons+0.020
Neg logit contributions
 glad-0.009
 rider-0.008
 merry-0.008
 floppy-0.008
 strongest-0.008
Token little
Token ID1310
Feature activation-3.81e-03

Pos logit contributions
 glad+0.009
 floppy+0.009
 merry+0.008
 strongest+0.008
 rider+0.008
Neg logit contributions
ork-0.018
 attent-0.018
inn-0.018
oons-0.016
 liquids-0.016
Token girl
Token ID2576
Feature activation+1.18e+00

Pos logit contributions
 glad+0.000
 floppy+0.000
 merry+0.000
 happiest+0.000
 no+0.000
Neg logit contributions
inn-0.000
ork-0.000
 attent-0.000
 liquids-0.000
 Grac-0.000
Token went
Token ID1816
Feature activation+1.36e-01

Pos logit contributions
 glad+0.006
 floppy+0.006
 merry+0.005
 rider+0.005
 strongest+0.005
Neg logit contributions
ork-0.012
 attent-0.012
inn-0.012
or-0.011
oons-0.011
Token to
Token ID284
Feature activation+9.37e-01

Pos logit contributions
ork+0.004
 attent+0.004
inn+0.004
oons+0.004
 liquids+0.004
Neg logit contributions
 glad-0.003
 eager-0.003
 floppy-0.003
 merry-0.003
 riding-0.003
Token listen
Token ID6004
Feature activation-3.61e-01

Pos logit contributions
ork+0.034
inn+0.034
 attent+0.034
or+0.031
oons+0.031
Neg logit contributions
 glad-0.015
 floppy-0.014
 merry-0.014
 eager-0.013
 rider-0.013
Token to
Token ID284
Feature activation+9.95e-02

Pos logit contributions
 glad+0.007
 eager+0.007
 floppy+0.006
 riding+0.006
 no+0.006
Neg logit contributions
inn-0.012
ork-0.012
 attent-0.011
 liquids-0.011
oons-0.011
Token them
Token ID606
Feature activation+1.52e-01

Pos logit contributions
 attent+0.003
ork+0.002
 Grac+0.002
 Wed+0.002
ula+0.002
Neg logit contributions
 floppy-0.003
 strongest-0.002
 rider-0.002
 greedy-0.002
 glad-0.002
Token and
Token ID290
Feature activation+4.89e-02

Pos logit contributions
ork+0.006
inn+0.006
 attent+0.006
oons+0.006
or+0.006
Neg logit contributions
 glad-0.002
 floppy-0.002
 eager-0.002
 merry-0.002
 rider-0.001
Token danced
Token ID39480
Feature activation-3.23e-01

Pos logit contributions
 attent+0.002
ork+0.002
inn+0.002
or+0.001
 liquids+0.001
Neg logit contributions
 glad-0.001
 floppy-0.001
 merry-0.001
 rider-0.001
 strongest-0.001
Token along
Token ID1863
Feature activation-8.84e-02

Pos logit contributions
 glad+0.006
 floppy+0.005
 eager+0.005
 merry+0.005
 happy+0.005
Neg logit contributions
ork-0.011
oons-0.011
inn-0.011
 liquids-0.010
 attent-0.010
Token with
Token ID351
Feature activation-3.96e-01

Pos logit contributions
 glad+0.001
 eager+0.001
 merry+0.001
 no+0.001
 floppy+0.001
Neg logit contributions
inn-0.003
ork-0.003
oons-0.003
 attent-0.003
 liquids-0.003
Token the
Token ID262
Feature activation+2.31e-01

Pos logit contributions
 glad+0.010
 rider+0.010
 strongest+0.010
 greedy+0.010
 floppy+0.010
Neg logit contributions
 attent-0.010
ork-0.010
 Grac-0.010
 liquids-0.009
inn-0.009
Token music
Token ID2647
Feature activation-6.12e-01

Pos logit contributions
ork+0.004
ula+0.004
 attent+0.004
 liquids+0.004
 Grac+0.004
Neg logit contributions
 glad-0.007
 floppy-0.007
 greedy-0.007
 eager-0.006
 rider-0.006
Token together
Token ID1978
Feature activation+7.64e-02

Pos logit contributions
 attent+0.000
inn+0.000
ork+0.000
oons+0.000
or+0.000
Neg logit contributions
 glad-0.000
 merry-0.000
 floppy-0.000
 no-0.000
 eager-0.000
Token.
Token ID13
Feature activation+3.01e-01

Pos logit contributions
ork+0.002
inn+0.002
 attent+0.002
or+0.002
oons+0.002
Neg logit contributions
 glad-0.002
 floppy-0.002
 merry-0.002
 eagle-0.002
 strongest-0.002
Token Maybe
Token ID6674
Feature activation-8.38e-01

Pos logit contributions
ork+0.007
 attent+0.007
inn+0.007
 liquids+0.007
or+0.007
Neg logit contributions
 glad-0.008
 floppy-0.008
 merry-0.008
 eager-0.007
 rider-0.007
Token we
Token ID356
Feature activation+3.25e-01

Pos logit contributions
 glad+0.023
 floppy+0.022
 merry+0.021
 rider+0.021
 eager+0.021
Neg logit contributions
ork-0.021
 attent-0.020
inn-0.020
or-0.019
 liquids-0.019
Token can
Token ID460
Feature activation-1.27e+00

Pos logit contributions
ork+0.011
 attent+0.011
inn+0.011
oons+0.010
or+0.010
Neg logit contributions
 glad-0.006
 floppy-0.006
 merry-0.006
 eager-0.006
 rider-0.006
Token dress
Token ID6576
Feature activation-4.40e-01

Pos logit contributions
 glad+0.029
 floppy+0.028
 merry+0.026
 rider+0.026
 greedy+0.026
Neg logit contributions
ork-0.038
inn-0.037
 attent-0.037
or-0.035
 Grac-0.034
Token up
Token ID510
Feature activation-4.32e-03

Pos logit contributions
 glad+0.009
 floppy+0.009
 merry+0.008
 no+0.008
 rider+0.008
Neg logit contributions
inn-0.014
 attent-0.014
ork-0.013
oons-0.013
 liquids-0.012
Token and
Token ID290
Feature activation-1.34e-01

Pos logit contributions
 glad+0.000
 floppy+0.000
 merry+0.000
 rider+0.000
 strongest+0.000
Neg logit contributions
ork-0.000
 attent-0.000
inn-0.000
or-0.000
oons-0.000
Token have
Token ID423
Feature activation+1.94e-01

Pos logit contributions
 glad+0.003
 floppy+0.003
 rider+0.003
 merry+0.003
 eager+0.003
Neg logit contributions
ork-0.004
 attent-0.004
inn-0.004
 liquids-0.004
or-0.004
Token a
Token ID257
Feature activation+1.39e+00

Pos logit contributions
ork+0.006
 attent+0.005
 liquids+0.005
 Grac+0.005
 occasions+0.005
Neg logit contributions
 rider-0.004
 glad-0.004
 eagle-0.004
 eager-0.004
 strongest-0.004
Token tea
Token ID8887
Feature activation+1.87e+00

Pos logit contributions
inn+0.044
 attent+0.044
ork+0.044
or+0.042
oons+0.042
Neg logit contributions
 glad-0.029
 merry-0.028
 floppy-0.027
 eager-0.026
 no-0.025
Token flag
Token ID6056
Feature activation-5.07e-01

Pos logit contributions
 glad+0.011
 floppy+0.010
 eager+0.009
 rider+0.009
 no+0.009
Neg logit contributions
ork-0.019
 attent-0.019
inn-0.018
or-0.017
 liquids-0.017
Token.
Token ID13
Feature activation+2.51e-01

Pos logit contributions
 glad+0.013
 floppy+0.012
 greedy+0.012
 merry+0.011
 eager+0.011
Neg logit contributions
 attent-0.014
ork-0.014
inn-0.013
or-0.013
oons-0.012
Token Sam
Token ID3409
Feature activation-2.45e-01

Pos logit contributions
ork+0.009
inn+0.009
 attent+0.009
oons+0.009
or+0.008
Neg logit contributions
 glad-0.005
 floppy-0.004
 eager-0.004
 no-0.004
 riding-0.004
Token thought
Token ID1807
Feature activation-7.67e-01

Pos logit contributions
 glad+0.005
 floppy+0.005
 merry+0.005
 greedy+0.005
 strongest+0.005
Neg logit contributions
 attent-0.007
ork-0.007
inn-0.007
or-0.007
 Grac-0.006
Token this
Token ID428
Feature activation-5.95e-01

Pos logit contributions
 glad+0.017
 floppy+0.017
 rider+0.016
 merry+0.016
 greedy+0.016
Neg logit contributions
ork-0.023
 attent-0.022
inn-0.022
or-0.021
oons-0.021
Token was
Token ID373
Feature activation-1.86e-01

Pos logit contributions
 glad+0.015
 floppy+0.014
 merry+0.013
 greedy+0.013
 eager+0.013
Neg logit contributions
ork-0.017
 attent-0.016
inn-0.016
or-0.015
oons-0.015
Token a
Token ID257
Feature activation-6.46e-01

Pos logit contributions
 glad+0.003
 floppy+0.002
 merry+0.002
 rider+0.002
 eager+0.002
Neg logit contributions
ork-0.007
 attent-0.007
inn-0.007
 liquids-0.007
or-0.007
Token great
Token ID1049
Feature activation+4.39e-03

Pos logit contributions
 glad+0.013
 floppy+0.012
 merry+0.011
 rider+0.011
 eager+0.011
Neg logit contributions
ork-0.021
 attent-0.021
inn-0.020
or-0.019
 liquids-0.019
Token idea
Token ID2126
Feature activation-6.82e-01

Pos logit contributions
ork+0.000
inn+0.000
 attent+0.000
oons+0.000
or+0.000
Neg logit contributions
 glad-0.000
 floppy-0.000
 merry-0.000
 eager-0.000
 rider-0.000
Token and
Token ID290
Feature activation-3.93e-01

Pos logit contributions
 glad+0.011
 floppy+0.011
 merry+0.010
 eager+0.010
 rider+0.010
Neg logit contributions
ork-0.024
 attent-0.024
inn-0.024
or-0.023
oons-0.023
Token suggested
Token ID5220
Feature activation+3.75e-01

Pos logit contributions
 glad+0.008
 floppy+0.008
 merry+0.008
 rider+0.007
 greedy+0.007
Neg logit contributions
 attent-0.012
ork-0.012
inn-0.012
or-0.011
 liquids-0.011
Token a
Token ID257
Feature activation+9.09e-02

Pos logit contributions
ork+0.017
 attent+0.017
 liquids+0.016
inn+0.016
 Grac+0.016
Neg logit contributions
 glad-0.010
 floppy-0.009
 rider-0.009
 merry-0.009
 eager-0.008
Token dog
Token ID3290
Feature activation+2.84e-01

Pos logit contributions
ork+0.003
 attent+0.003
inn+0.003
or+0.003
oons+0.003
Neg logit contributions
 glad-0.002
 floppy-0.002
 merry-0.002
 eager-0.001
 rider-0.001
Token.
Token ID13
Feature activation-7.43e-01

Pos logit contributions
ork+0.009
inn+0.009
 attent+0.009
or+0.009
oons+0.009
Neg logit contributions
 glad-0.005
 floppy-0.005
 merry-0.005
 rider-0.005
 eager-0.005
Token It
Token ID632
Feature activation-6.56e-01

Pos logit contributions
 glad+0.016
 floppy+0.015
 merry+0.014
 eager+0.014
 greedy+0.014
Neg logit contributions
 attent-0.023
ork-0.023
inn-0.023
or-0.022
oons-0.021
Token said
Token ID531
Feature activation-6.06e-01

Pos logit contributions
 glad+0.012
 floppy+0.011
 merry+0.011
 eager+0.011
 rider+0.011
Neg logit contributions
ork-0.022
inn-0.022
 attent-0.022
or-0.021
oons-0.021
Token:
Token ID25
Feature activation-9.47e-01

Pos logit contributions
 glad+0.010
 floppy+0.009
 no+0.009
 merry+0.009
 greedy+0.009
Neg logit contributions
inn-0.022
 attent-0.022
ork-0.022
or-0.020
oons-0.020
Token "
Token ID366
Feature activation+1.81e-01

Pos logit contributions
 glad+0.022
 floppy+0.021
 merry+0.020
 eager+0.020
 greedy+0.020
Neg logit contributions
inn-0.028
ork-0.027
 attent-0.027
or-0.025
oons-0.025
TokenB
Token ID33
Feature activation+1.90e+00

Pos logit contributions
 attent+0.006
ork+0.006
inn+0.005
oons+0.005
 liquids+0.005
Neg logit contributions
 rider-0.004
 merry-0.004
 greedy-0.003
 glad-0.003
 riding-0.003
Tokeneware
Token ID29725
Feature activation-6.60e-01

Pos logit contributions
 attent+0.038
inn+0.034
ork+0.034
 liquids+0.031
 Grac+0.031
Neg logit contributions
 glad-0.060
 rider-0.059
 floppy-0.058
 merry-0.058
 riding-0.057
Token of
Token ID286
Feature activation+2.97e-01

Pos logit contributions
 glad+0.012
 floppy+0.011
 merry+0.011
 eager+0.010
 rider+0.010
Neg logit contributions
ork-0.023
inn-0.022
 attent-0.022
or-0.021
oons-0.021
Token the
Token ID262
Feature activation+9.15e-02

Pos logit contributions
ork+0.010
 attent+0.010
inn+0.009
 liquids+0.009
or+0.009
Neg logit contributions
 glad-0.005
 floppy-0.005
 merry-0.005
 rider-0.005
 eager-0.005
Token friends
Token ID2460
Feature activation-1.79e+00

Pos logit contributions
 glad+0.007
 floppy+0.006
 merry+0.006
 rider+0.006
 eager+0.006
Neg logit contributions
ork-0.019
 attent-0.018
inn-0.018
or-0.017
oons-0.017
Token and
Token ID290
Feature activation-1.13e-01

Pos logit contributions
 floppy+0.038
 merry+0.037
 glad+0.037
 strongest+0.037
 eagle+0.036
Neg logit contributions
ork-0.052
 attent-0.051
inn-0.050
or-0.048
 liquids-0.045
Token had
Token ID550
Feature activation+1.40e-01

Pos logit contributions
 glad+0.003
 floppy+0.002
 merry+0.002
 strongest+0.002
 rider+0.002
Neg logit contributions
 attent-0.003
ork-0.003
inn-0.003
or-0.003
 liquids-0.003
Token many
Token ID867
Feature activation-9.26e-02

Pos logit contributions
ork+0.004
inn+0.004
 attent+0.004
oons+0.004
to+0.004
Neg logit contributions
 glad-0.003
 rider-0.003
 eagle-0.003
 floppy-0.003
 greedy-0.003
Token more
Token ID517
Feature activation+9.51e-01

Pos logit contributions
 rider+0.002
 glad+0.002
hy+0.002
 eagle+0.002
 floppy+0.002
Neg logit contributions
oons-0.002
 attent-0.002
 liquids-0.002
 occasions-0.002
ork-0.002
Token adventures
Token ID17545
Feature activation-4.35e-03

Pos logit contributions
 attent+0.028
ork+0.028
inn+0.028
oons+0.026
 liquids+0.026
Neg logit contributions
 glad-0.020
 merry-0.020
 floppy-0.019
 rider-0.019
 no-0.018
Token together
Token ID1978
Feature activation+5.62e-01

Pos logit contributions
 glad+0.000
 floppy+0.000
 merry+0.000
 happiest+0.000
 strongest+0.000
Neg logit contributions
ork-0.000
 attent-0.000
inn-0.000
or-0.000
oons-0.000
Token.
Token ID13
Feature activation-9.24e-02

Pos logit contributions
ork+0.018
 attent+0.018
inn+0.018
or+0.016
oons+0.016
Neg logit contributions
 glad-0.012
 floppy-0.011
 merry-0.011
 rider-0.010
 eager-0.010
Token<|endoftext|>
Token ID50256
Feature activation-1.02e-01

Pos logit contributions
 glad+0.002
 floppy+0.002
 no+0.002
 eager+0.002
 riding+0.002
Neg logit contributions
 attent-0.003
ork-0.003
inn-0.003
 liquids-0.003
oons-0.003
TokenOnce
Token ID7454
Feature activation+8.89e-01

Pos logit contributions
 glad+0.003
 floppy+0.003
 merry+0.003
 rider+0.003
 strongest+0.003
Neg logit contributions
ork-0.003
inn-0.003
 attent-0.002
oons-0.002
or-0.002
Token upon
Token ID2402
Feature activation-4.30e-01

Pos logit contributions
ork+0.016
 attent+0.016
inn+0.016
or+0.014
oons+0.014
Neg logit contributions
 glad-0.030
 floppy-0.029
 merry-0.029
 eager-0.028
 rider-0.028
Token in
Token ID287
Feature activation-7.24e-01

Pos logit contributions
 glad+0.017
 eager+0.017
 floppy+0.016
 riding+0.016
 rider+0.016
Neg logit contributions
ork-0.030
 attent-0.029
oons-0.028
inn-0.027
 liquids-0.027
Token the
Token ID262
Feature activation-1.60e-01

Pos logit contributions
 glad+0.014
 floppy+0.013
 greedy+0.013
 happiest+0.013
 strongest+0.013
Neg logit contributions
ork-0.023
inn-0.023
 attent-0.023
 liquids-0.022
 Grac-0.021
Token park
Token ID3952
Feature activation-4.42e-01

Pos logit contributions
 glad+0.002
 floppy+0.002
 merry+0.002
 rider+0.002
 greedy+0.002
Neg logit contributions
ork-0.006
 attent-0.006
inn-0.006
or-0.006
 liquids-0.006
Token.
Token ID13
Feature activation+2.66e-01

Pos logit contributions
 glad+0.011
 floppy+0.011
 merry+0.011
 rider+0.010
 strongest+0.010
Neg logit contributions
ork-0.012
 attent-0.012
inn-0.012
or-0.011
oons-0.010
Token He
Token ID679
Feature activation+2.68e-01

Pos logit contributions
ork+0.009
 attent+0.008
inn+0.008
or+0.008
oons+0.008
Neg logit contributions
 glad-0.005
 floppy-0.005
 merry-0.005
 eager-0.005
 rider-0.005
Token saw
Token ID2497
Feature activation-1.82e+00

Pos logit contributions
ork+0.008
inn+0.008
 attent+0.008
or+0.008
to+0.007
Neg logit contributions
 glad-0.006
 floppy-0.005
 rider-0.005
 strongest-0.005
 eagle-0.005
Token something
Token ID1223
Feature activation-1.83e-01

Pos logit contributions
 rider+0.051
 glad+0.048
opl+0.045
 forget+0.044
 willing+0.044
Neg logit contributions
ork-0.037
auc-0.036
 liquids-0.035
 Sp-0.035
 M-0.034
Token shiny
Token ID22441
Feature activation-7.30e-01

Pos logit contributions
 glad+0.002
 floppy+0.002
 merry+0.002
 eager+0.002
 greedy+0.002
Neg logit contributions
ork-0.007
 attent-0.007
inn-0.007
or-0.007
oons-0.007
Token in
Token ID287
Feature activation-1.07e+00

Pos logit contributions
 glad+0.023
 happiest+0.020
 merry+0.020
 grateful+0.019
 smartest+0.019
Neg logit contributions
inn-0.017
ork-0.017
 attent-0.016
 liquids-0.015
oons-0.015
Token the
Token ID262
Feature activation-1.75e+00

Pos logit contributions
 glad+0.024
 happiest+0.022
 rider+0.022
 floppy+0.021
 strongest+0.021
Neg logit contributions
inn-0.032
ork-0.031
 attent-0.030
 liquids-0.029
 Grac-0.028
Token grass
Token ID8701
Feature activation-8.51e-01

Pos logit contributions
 glad+0.041
 rider+0.037
 willing+0.035
 eager+0.034
 greedy+0.034
Neg logit contributions
ork-0.050
 attent-0.045
 liquids-0.044
inn-0.044
auc-0.042
Token there
Token ID612
Feature activation+1.21e+00

Pos logit contributions
ork+0.002
inn+0.002
 attent+0.002
or+0.002
oons+0.002
Neg logit contributions
 floppy-0.001
 glad-0.001
 rider-0.001
 merry-0.001
 eagle-0.001
Token,
Token ID11
Feature activation+2.12e-01

Pos logit contributions
inn+0.048
ork+0.048
 attent+0.048
oons+0.045
 liquids+0.045
Neg logit contributions
 glad-0.016
 merry-0.013
 floppy-0.013
 eager-0.013
 no-0.013
Token Lucy
Token ID22162
Feature activation+4.31e-01

Pos logit contributions
inn+0.006
ork+0.006
 attent+0.006
or+0.006
oons+0.006
Neg logit contributions
 glad-0.005
 no-0.005
 floppy-0.005
 merry-0.005
 eager-0.005
Token noticed
Token ID6810
Feature activation-5.26e-01

Pos logit contributions
ork+0.013
 attent+0.012
or+0.012
inn+0.012
oons+0.011
Neg logit contributions
 glad-0.010
 floppy-0.009
 merry-0.009
 strongest-0.009
 rider-0.009
Token a
Token ID257
Feature activation-8.75e-01

Pos logit contributions
 glad+0.012
 rider+0.012
 floppy+0.011
 strongest+0.011
 merry+0.011
Neg logit contributions
ork-0.015
 attent-0.015
 liquids-0.014
oons-0.014
inn-0.013
Token big
Token ID1263
Feature activation-1.84e+00

Pos logit contributions
 glad+0.014
 floppy+0.013
 merry+0.012
 eager+0.012
 rider+0.011
Neg logit contributions
ork-0.032
 attent-0.032
inn-0.031
or-0.029
oons-0.029
Token square
Token ID6616
Feature activation+2.64e-01

Pos logit contributions
 glad+0.037
 floppy+0.033
 happiest+0.033
 strongest+0.032
 eager+0.032
Neg logit contributions
ork-0.058
inn-0.054
 attent-0.052
 liquids-0.051
oons-0.049
Token chalk
Token ID30860
Feature activation-5.36e-01

Pos logit contributions
ork+0.008
inn+0.008
 attent+0.008
oons+0.008
or+0.007
Neg logit contributions
 glad-0.006
 floppy-0.005
 merry-0.005
 eager-0.005
 rider-0.005
Tokenboard
Token ID3526
Feature activation-6.15e-01

Pos logit contributions
 glad+0.013
 floppy+0.011
 merry+0.011
 happiest+0.011
 eager+0.011
Neg logit contributions
ork-0.016
inn-0.016
 attent-0.015
oons-0.015
or-0.014
Token on
Token ID319
Feature activation-2.55e-01

Pos logit contributions
 glad+0.015
 floppy+0.013
 merry+0.013
 happiest+0.013
 strongest+0.013
Neg logit contributions
ork-0.019
inn-0.018
 attent-0.018
oons-0.017
or-0.017
Token the
Token ID262
Feature activation-7.90e-01

Pos logit contributions
 glad+0.003
 floppy+0.003
 merry+0.003
 no+0.003
 eager+0.003
Neg logit contributions
inn-0.010
 attent-0.010
ork-0.010
or-0.009
oons-0.009
Token excited
Token ID6568
Feature activation+2.78e-01

Pos logit contributions
 floppy+0.021
 merry+0.021
 eagle+0.021
 rider+0.020
 happiest+0.019
Neg logit contributions
 attent-0.034
inn-0.032
ork-0.031
 Grac-0.029
ub-0.029
Token he
Token ID339
Feature activation+3.83e-01

Pos logit contributions
ork+0.010
inn+0.010
 attent+0.010
oons+0.010
or+0.010
Neg logit contributions
 glad-0.004
 floppy-0.004
 eager-0.004
 merry-0.004
 rider-0.004
Token couldn
Token ID3521
Feature activation-1.69e+00

Pos logit contributions
ork+0.014
inn+0.014
 attent+0.014
oons+0.013
or+0.013
Neg logit contributions
 glad-0.006
 floppy-0.006
 eager-0.006
 merry-0.006
 rider-0.005
Tokenâ
Token ID22940
Feature activation-3.02e-01

Pos logit contributions
 glad+0.046
 floppy+0.044
 merry+0.043
 eager+0.042
 rider+0.042
Neg logit contributions
ork-0.043
inn-0.042
 attent-0.042
or-0.039
oons-0.038
Token
Token ID26391
Feature activation-1.85e-02

Pos logit contributions
 glad+0.007
 rider+0.007
 floppy+0.007
 merry+0.007
 eager+0.007
Neg logit contributions
ork-0.008
 attent-0.008
inn-0.008
 liquids-0.008
oons-0.008
Token
Token ID8151
Feature activation-1.83e+00

Pos logit contributions
 glad+0.001
 floppy+0.001
 merry+0.000
 greedy+0.000
 strongest+0.000
Neg logit contributions
ork-0.000
 attent-0.000
inn-0.000
or-0.000
oons-0.000
Tokent
Token ID83
Feature activation-6.33e-01

Pos logit contributions
 glad+0.068
 floppy+0.066
 merry+0.065
 rider+0.063
 eager+0.063
Neg logit contributions
ork-0.028
inn-0.027
 attent-0.027
or-0.024
oons-0.023
Token contain
Token ID3994
Feature activation-1.19e+00

Pos logit contributions
 glad+0.015
 floppy+0.014
 merry+0.014
 greedy+0.014
 eager+0.013
Neg logit contributions
 attent-0.018
ork-0.018
inn-0.018
or-0.017
 liquids-0.016
Token himself
Token ID2241
Feature activation-5.33e-01

Pos logit contributions
 floppy+0.029
 glad+0.028
 merry+0.028
 rider+0.028
 strongest+0.027
Neg logit contributions
 attent-0.033
ork-0.033
inn-0.032
or-0.030
oons-0.029
Token!
Token ID0
Feature activation-3.46e-01

Pos logit contributions
 floppy+0.010
 glad+0.010
 merry+0.010
 rider+0.009
 greedy+0.009
Neg logit contributions
 attent-0.017
ork-0.017
inn-0.016
or-0.016
oons-0.015
Token\n
Token ID198
Feature activation+1.24e-01

Pos logit contributions
 glad+0.007
 floppy+0.006
 eager+0.006
 merry+0.006
 rider+0.006
Neg logit contributions
ork-0.012
 attent-0.012
inn-0.011
oons-0.011
or-0.011
Token flowers
Token ID12734
Feature activation-6.16e-01

Pos logit contributions
 glad+0.005
 floppy+0.005
 rider+0.005
 greedy+0.005
 eager+0.004
Neg logit contributions
 attent-0.010
ork-0.010
 liquids-0.009
inn-0.009
oons-0.009
Token.
Token ID13
Feature activation+2.25e-02

Pos logit contributions
 floppy+0.017
 rider+0.016
 glad+0.016
 strongest+0.016
 greedy+0.016
Neg logit contributions
 attent-0.014
ork-0.014
inn-0.014
oons-0.013
or-0.012
Token They
Token ID1119
Feature activation-7.04e-01

Pos logit contributions
ork+0.001
 attent+0.001
inn+0.001
or+0.001
oons+0.000
Neg logit contributions
 glad-0.001
 floppy-0.001
 merry-0.001
 eager-0.001
 rider-0.001
Token are
Token ID389
Feature activation-4.55e-01

Pos logit contributions
 glad+0.014
 floppy+0.013
 merry+0.012
 strongest+0.012
 greedy+0.012
Neg logit contributions
 attent-0.022
ork-0.022
inn-0.021
or-0.021
oons-0.020
Token good
Token ID922
Feature activation-2.46e-01

Pos logit contributions
 rider+0.008
 eagle+0.008
 pedals+0.007
Leave+0.007
 floppy+0.007
Neg logit contributions
inn-0.014
 Grac-0.013
 attent-0.013
ula-0.013
uggets-0.012
Token friends
Token ID2460
Feature activation-2.98e+00

Pos logit contributions
 glad+0.004
 floppy+0.004
 merry+0.004
 strongest+0.004
 happiest+0.004
Neg logit contributions
 attent-0.008
inn-0.008
ork-0.008
oons-0.008
or-0.008
Token.
Token ID13
Feature activation-2.25e-01

Pos logit contributions
 strongest+0.089
 floppy+0.088
 eagle+0.088
 smartest+0.086
 happiest+0.086
Neg logit contributions
ork-0.058
inn-0.058
 attent-0.055
or-0.052
ula-0.049
Token They
Token ID1119
Feature activation-5.39e-01

Pos logit contributions
 glad+0.006
 floppy+0.006
 eager+0.005
 no+0.005
 merry+0.005
Neg logit contributions
ork-0.006
 attent-0.006
inn-0.006
or-0.006
oons-0.006
Token enjoy
Token ID2883
Feature activation-3.92e-01

Pos logit contributions
 glad+0.010
 floppy+0.009
 merry+0.009
 rider+0.009
 strongest+0.009
Neg logit contributions
 attent-0.018
ork-0.017
inn-0.017
or-0.016
 Grac-0.016
Token the
Token ID262
Feature activation+2.29e-01

Pos logit contributions
 floppy+0.011
 rider+0.011
 strongest+0.011
 smartest+0.010
hy+0.010
Neg logit contributions
 attent-0.009
inn-0.008
ork-0.008
 liquids-0.008
 Grac-0.008
Token hot
Token ID3024
Feature activation+1.70e-01

Pos logit contributions
ork+0.007
 attent+0.007
inn+0.007
oons+0.007
 liquids+0.007
Neg logit contributions
 glad-0.005
 floppy-0.004
 greedy-0.004
 rider-0.004
 no-0.004
Token high
Token ID1029
Feature activation-4.65e-01

Pos logit contributions
ork+0.008
 attent+0.008
inn+0.008
or+0.008
 liquids+0.008
Neg logit contributions
 glad-0.003
 floppy-0.002
 merry-0.002
 eager-0.002
 rider-0.002
Token.
Token ID13
Feature activation-9.61e-01

Pos logit contributions
 glad+0.012
 happiest+0.011
 merry+0.010
 floppy+0.010
 smartest+0.010
Neg logit contributions
inn-0.013
 attent-0.013
ork-0.012
or-0.012
oons-0.011
Token She
Token ID1375
Feature activation-2.90e-01

Pos logit contributions
 glad+0.021
 floppy+0.020
 merry+0.020
 rider+0.019
 happiest+0.019
Neg logit contributions
ork-0.029
 attent-0.029
inn-0.028
oons-0.026
 liquids-0.026
Token wanted
Token ID2227
Feature activation-1.63e+00

Pos logit contributions
 glad+0.006
 happiest+0.005
 strongest+0.005
 floppy+0.005
 rider+0.005
Neg logit contributions
 attent-0.009
ork-0.009
inn-0.009
oons-0.009
or-0.008
Token to
Token ID284
Feature activation-1.18e+00

Pos logit contributions
 glad+0.055
 floppy+0.053
 rider+0.053
 eagle+0.053
 happiest+0.053
Neg logit contributions
ork-0.029
 attent-0.026
inn-0.026
to-0.024
 liquids-0.023
Token swing
Token ID9628
Feature activation-1.85e+00

Pos logit contributions
 glad+0.031
 greedy+0.029
 floppy+0.028
 happiest+0.026
 reckless+0.026
Neg logit contributions
ork-0.030
inn-0.029
 attent-0.029
or-0.028
ula-0.026
Token on
Token ID319
Feature activation-1.52e+00

Pos logit contributions
 glad+0.047
 eagle+0.045
 floppy+0.044
 greedy+0.043
 happiest+0.042
Neg logit contributions
ork-0.049
inn-0.048
 attent-0.045
or-0.044
ula-0.042
Token it
Token ID340
Feature activation-5.04e-02

Pos logit contributions
 glad+0.049
 happiest+0.046
 strongest+0.045
 grateful+0.044
 floppy+0.044
Neg logit contributions
ork-0.035
ula-0.027
 attent-0.027
inn-0.027
or-0.026
Token so
Token ID523
Feature activation-2.05e+00

Pos logit contributions
 glad+0.001
 floppy+0.001
 merry+0.001
 eagle+0.001
 greedy+0.001
Neg logit contributions
inn-0.001
ork-0.001
 attent-0.001
or-0.001
ula-0.001
Token much
Token ID881
Feature activation+3.71e-01

Pos logit contributions
 glad+0.053
 floppy+0.051
 happiest+0.051
 eagle+0.051
 strongest+0.050
Neg logit contributions
ork-0.057
inn-0.052
 attent-0.050
or-0.048
ula-0.046
Token.
Token ID13
Feature activation-3.13e-01

Pos logit contributions
ork+0.012
inn+0.012
 attent+0.012
or+0.011
oons+0.011
Neg logit contributions
 glad-0.007
 floppy-0.007
 merry-0.007
 eager-0.006
 greedy-0.006