TOP ACTIVATIONS
MAX = 5.118

small cat named P uff . P uff was a very
visit . When D ott ie arrived at the
. But one day , when Linda was playing , she
p anda named P andy . P andy was very cute
p anda named P andy . P andy was very successful
a lot and asked Linda if she could borrow it .
in named P ipp in . P ipp in was not
p anda named P andy . P andy was very generous
But one day , when D ina was riding her tr
in named P udd les . P udd les loved to
p anda named P andy . P andy was a big
crab named P inch y . P inch y was very
named D ott ie . D ott ie wanted to visit
crab named P inch y . P inch y loved to
k itty named P uss . One day P uss was
friend , D alia . D alia had made a promise
gy named P ipp a . P ipp a loved to
pony named P ipp in . P ipp in loved to
was a guy called Jazz . He was very deep and
pengu in named P ingu . P ingu loved to swim

MIN ACTIVATIONS
MIN = -4.442

the wise pony and the 3 - year old girl played
mother became angry . She grabbed the microphone out of Frankie
to do ." So the three of them sang together in
So , the three raven s lined up ,
So the princess only added five apples to her basket and
! <|endoftext|> Sam was a three year old boy and he
something miraculous happened ! A 3 year old child saw the
every morning . Every morning the people in the house would
it spread out of the 3 year old 's house ,
. He watched them in amaz ement and even laughed a
The queen declared that the three year old was the special
brother . The three year old boy began to
on farms . They have four legs , a tail ,
shouted . The three little pigs started to measure
neatly . The three - year - old girl
<|endoftext|> John was a friendly three year old . He went
decided to take their polite three year old on a picnic
it fit three people ! On
. <|endoftext|> Billy was a three year old boy who loved
much . The two friends were happy to enjoy

INTERVAL 2.728 - 5.118
CONTAINS 0.029%

" This stuff smells bad ," Jenny said .
But he could not find the marble . Tim felt sad
side of the lake , L izzy saw a wise duck
on her face and demanded that Sammy show her what he
He asked his friends to help the bird too .

INTERVAL 0.338 - 2.728
CONTAINS 29.933%

to wait for your friend to come ." L
toy boat that might fit in your kay ak . We
Sophie and her mum went for a walk along the beach
said , " Mom my , I don 't like this
and go outside . One day , her friends invited her

INTERVAL -2.052 - 0.338
CONTAINS 69.740%

and dad . They wore nice clothes and shiny shoes .
their toys in the living room . They had many toys
hopped onto it . She was so excited at this special
and a dress . He wondered what it was like to
was excited when the big band started to play . She

INTERVAL -4.442 - -2.052
CONTAINS 0.298%

<|endoftext|> S ara loved to look at the sky and see
, stop !" Sara cries . " That 's not yours
." They each picked a few blo oms and smelled them
kernels in the popcorn maker . Bob was so excited .
sister . <|endoftext|> L ily and Ben were best friends .
Token small
Token ID1402
Feature activation+4.68e-02

Pos logit contributions
folk+0.001
ly+0.001
der+0.000
thia+0.000
affe+0.000
Neg logit contributions
 P-0.001
 chess-0.001
 Linda-0.001
 Jelly-0.001
 surfing-0.001
Token cat
Token ID3797
Feature activation-5.63e-01

Pos logit contributions
 P+0.002
 chess+0.001
 Linda+0.001
 Jelly+0.001
 ink+0.001
Neg logit contributions
folk-0.001
thia-0.001
ly-0.001
der-0.001
affe-0.001
Token named
Token ID3706
Feature activation+1.49e+00

Pos logit contributions
affe+0.012
ly+0.012
 ponies+0.012
thia+0.011
folk+0.011
Neg logit contributions
 P-0.018
 Linda-0.018
 chess-0.017
 Jelly-0.016
 noon-0.016
Token P
Token ID350
Feature activation+5.03e-01

Pos logit contributions
 P+0.039
 chess+0.038
 Linda+0.037
 surfing+0.033
 Jelly+0.033
Neg logit contributions
ly-0.041
thia-0.041
affe-0.040
 ponies-0.040
folk-0.040
Tokenuff
Token ID1648
Feature activation-2.12e+00

Pos logit contributions
 P+0.013
 chess+0.013
 Linda+0.013
 ink+0.011
 Jelly+0.011
Neg logit contributions
folk-0.015
 bells-0.014
 ponies-0.014
ly-0.014
keys-0.013
Token.
Token ID13
Feature activation+5.12e+00

Pos logit contributions
thia+0.057
folk+0.056
 ponies+0.055
Exc+0.054
 bells+0.053
Neg logit contributions
 P-0.059
 chess-0.055
 Linda-0.054
 Jelly-0.048
 ink-0.048
Token P
Token ID350
Feature activation+1.92e+00

Pos logit contributions
 possible+0.098
ucer+0.097
prise+0.089
 rid+0.084
etchup+0.083
Neg logit contributions
Exc-0.159
 Many-0.151
 So-0.148
 little-0.148
 eyes-0.147
Tokenuff
Token ID1648
Feature activation-2.28e+00

Pos logit contributions
 P+0.063
 chess+0.060
 Linda+0.059
 Jelly+0.054
 noon+0.053
Neg logit contributions
ly-0.042
thia-0.041
folk-0.041
affe-0.041
der-0.041
Token was
Token ID373
Feature activation+9.03e-01

Pos logit contributions
folk+0.048
thia+0.048
 ponies+0.047
der+0.046
affe+0.045
Neg logit contributions
 P-0.075
 chess-0.071
 Linda-0.071
 ink-0.065
 Jelly-0.063
Token a
Token ID257
Feature activation-7.79e-01

Pos logit contributions
ucer+0.028
 chess+0.027
 Linda+0.027
 P+0.026
ennis+0.025
Neg logit contributions
 ponies-0.019
 ow-0.019
 little-0.018
 gor-0.018
 strangers-0.018
Token very
Token ID845
Feature activation-2.75e-01

Pos logit contributions
ly+0.015
affe+0.015
thia+0.015
 ponies+0.015
folk+0.015
Neg logit contributions
 P-0.027
 Linda-0.026
 chess-0.025
 Jelly-0.023
 surfing-0.023
Token visit
Token ID3187
Feature activation+5.75e-01

Pos logit contributions
 P+0.054
 Linda+0.052
ucer+0.050
 chess+0.050
ores+0.050
Neg logit contributions
 ponies-0.028
 strangers-0.026
ly-0.025
 pony-0.024
 bells-0.024
Token.
Token ID13
Feature activation+9.21e-01

Pos logit contributions
 P+0.017
 Linda+0.016
 chess+0.016
 Jelly+0.015
 Ink+0.015
Neg logit contributions
ly-0.014
 ponies-0.014
folk-0.013
thia-0.013
anna-0.013
Token \n
Token ID198
Feature activation+5.13e-01

Pos logit contributions
 chess+0.029
 P+0.028
 Linda+0.027
 ink+0.025
 surfing+0.025
Neg logit contributions
ly-0.020
 ponies-0.020
thia-0.020
anna-0.020
folk-0.019
Token \n
Token ID198
Feature activation+1.16e+00

Pos logit contributions
 P+0.014
 Linda+0.014
 chess+0.014
 Jelly+0.012
 surfing+0.012
Neg logit contributions
folk-0.014
ly-0.014
thia-0.013
affe-0.013
 ponies-0.013
TokenWhen
Token ID2215
Feature activation+1.75e+00

Pos logit contributions
 chess+0.036
 Linda+0.034
 P+0.034
ores+0.031
 ink+0.030
Neg logit contributions
Exc-0.029
ly-0.027
 ponies-0.026
affe-0.025
thia-0.025
Token D
Token ID360
Feature activation+4.91e+00

Pos logit contributions
 chess+0.049
 disk+0.046
 Linda+0.042
ores+0.041
ucer+0.041
Neg logit contributions
 little-0.043
 ponies-0.043
Exc-0.042
 ow-0.041
ly-0.041
Tokenott
Token ID1252
Feature activation-1.37e+00

Pos logit contributions
 chess+0.148
 possible+0.142
 surfing+0.140
ores+0.136
etchup+0.135
Neg logit contributions
ly-0.109
ith-0.095
anna-0.089
Exc-0.088
der-0.086
Tokenie
Token ID494
Feature activation-8.55e-02

Pos logit contributions
folk+0.033
 ponies+0.032
affe+0.031
ly+0.031
thia+0.031
Neg logit contributions
 P-0.042
 Linda-0.040
 chess-0.040
 ink-0.035
 Jelly-0.035
Token arrived
Token ID5284
Feature activation-1.01e+00

Pos logit contributions
folk+0.002
der+0.002
affe+0.002
thia+0.002
 ponies+0.002
Neg logit contributions
 P-0.003
 Linda-0.003
 chess-0.003
 Jelly-0.003
 noon-0.002
Token at
Token ID379
Feature activation+3.56e-01

Pos logit contributions
folk+0.021
ly+0.021
thia+0.021
affe+0.021
 ponies+0.020
Neg logit contributions
 P-0.035
 chess-0.033
 Linda-0.033
 Jelly-0.030
 surfing-0.029
Token the
Token ID262
Feature activation+1.98e+00

Pos logit contributions
 P+0.011
 chess+0.010
 surfing+0.010
 Linda+0.010
ores+0.010
Neg logit contributions
 ponies-0.008
 strangers-0.007
 little-0.006
 pony-0.006
ly-0.006
Token.
Token ID13
Feature activation+3.16e+00

Pos logit contributions
ly+0.025
folk+0.024
 ponies+0.024
der+0.024
affe+0.024
Neg logit contributions
 P-0.030
 Linda-0.028
 chess-0.027
 Jelly-0.025
 surfing-0.025
Token But
Token ID887
Feature activation+2.01e+00

Pos logit contributions
etchup+0.129
ucer+0.127
 surfing+0.126
 possible+0.122
 chess+0.122
Neg logit contributions
Exc-0.033
ly-0.032
der-0.028
thia-0.025
 little-0.024
Token one
Token ID530
Feature activation-6.50e-01

Pos logit contributions
ucer+0.066
ennis+0.063
etchup+0.062
 chess+0.062
ores+0.061
Neg logit contributions
ly-0.037
der-0.036
 ponies-0.034
 bells-0.033
 strangers-0.033
Token day
Token ID1110
Feature activation+1.03e+00

Pos logit contributions
ly+0.022
affe+0.021
 pony+0.021
 ponies+0.021
 little+0.021
Neg logit contributions
 P-0.013
 chess-0.012
 Linda-0.012
 Jak-0.011
ucer-0.011
Token,
Token ID11
Feature activation+3.74e+00

Pos logit contributions
 chess+0.038
 P+0.038
ores+0.037
 surfing+0.037
 shallow+0.036
Neg logit contributions
ly-0.015
 ponies-0.012
 strangers-0.012
 ow-0.012
Exc-0.011
Token when
Token ID618
Feature activation+4.85e+00

Pos logit contributions
etchup+0.153
ucer+0.145
ennis+0.142
 possible+0.138
 hyd+0.138
Neg logit contributions
 little-0.061
ly-0.047
Exc-0.039
 nodded-0.035
 stopped-0.035
Token Linda
Token ID21776
Feature activation-1.29e-01

Pos logit contributions
ucer+0.186
etchup+0.185
 hyd+0.182
 bisc+0.179
ennis+0.179
Neg logit contributions
 little-0.071
 people-0.048
 eyes-0.047
,-0.044
 ran-0.041
Token was
Token ID373
Feature activation+2.98e-01

Pos logit contributions
folk+0.003
affe+0.003
 ponies+0.003
thia+0.003
der+0.003
Neg logit contributions
 P-0.004
 Linda-0.004
 chess-0.004
 ink-0.003
 Jelly-0.003
Token playing
Token ID2712
Feature activation-6.09e-01

Pos logit contributions
 P+0.008
 chess+0.007
 Linda+0.007
 surfing+0.006
 Jelly+0.006
Neg logit contributions
folk-0.009
ly-0.009
affe-0.009
thia-0.008
 ponies-0.008
Token,
Token ID11
Feature activation+3.10e+00

Pos logit contributions
ly+0.014
folk+0.013
 ponies+0.013
der+0.013
affe+0.013
Neg logit contributions
 P-0.019
 Linda-0.018
 chess-0.018
 Jelly-0.017
 surfing-0.016
Token she
Token ID673
Feature activation+1.90e-01

Pos logit contributions
etchup+0.110
ucer+0.107
ennis+0.105
 hyd+0.103
iffs+0.099
Neg logit contributions
 little-0.055
 ow-0.047
ly-0.046
Exc-0.045
 stopped-0.044
Token p
Token ID279
Feature activation+1.00e+00

Pos logit contributions
 P+0.004
 chess+0.004
 Linda+0.004
 Jelly+0.004
 surfing+0.004
Neg logit contributions
folk-0.001
ly-0.001
thia-0.001
der-0.001
 ponies-0.001
Tokenanda
Token ID5282
Feature activation-9.11e-01

Pos logit contributions
 P+0.045
 chess+0.043
 Linda+0.043
 Jelly+0.040
 noon+0.040
Neg logit contributions
ly-0.010
folk-0.010
thia-0.009
affe-0.009
 ponies-0.009
Token named
Token ID3706
Feature activation+1.38e+00

Pos logit contributions
ly+0.021
affe+0.021
 ponies+0.020
thia+0.020
folk+0.020
Neg logit contributions
 P-0.029
 Linda-0.028
 chess-0.028
 Jelly-0.025
 noon-0.025
Token P
Token ID350
Feature activation+8.82e-01

Pos logit contributions
 chess+0.032
 Linda+0.029
 P+0.029
 surfing+0.028
 ink+0.028
Neg logit contributions
ly-0.039
thia-0.039
anna-0.038
 ponies-0.038
affe-0.038
Tokenandy
Token ID10757
Feature activation-1.25e+00

Pos logit contributions
 P+0.022
 Linda+0.020
 chess+0.020
 Jelly+0.017
 ink+0.017
Neg logit contributions
folk-0.028
 ponies-0.027
keys-0.026
 bells-0.026
affe-0.026
Token.
Token ID13
Feature activation+4.78e+00

Pos logit contributions
folk+0.040
 ponies+0.038
ly+0.037
 bells+0.037
thia+0.037
Neg logit contributions
 P-0.031
 Linda-0.028
 chess-0.028
 Jelly-0.025
 ink-0.024
Token P
Token ID350
Feature activation+1.65e+00

Pos logit contributions
 possible+0.093
ucer+0.082
etchup+0.081
 rid+0.079
ores+0.078
Neg logit contributions
Exc-0.146
ly-0.142
 So-0.139
 Her-0.137
 little-0.136
Tokenandy
Token ID10757
Feature activation-1.39e+00

Pos logit contributions
 P+0.043
 chess+0.041
 Linda+0.040
 Jelly+0.036
 surfing+0.035
Neg logit contributions
ly-0.047
thia-0.046
folk-0.046
affe-0.046
 ponies-0.045
Token was
Token ID373
Feature activation+8.00e-01

Pos logit contributions
folk+0.034
der+0.033
 ponies+0.032
ly+0.032
uted+0.031
Neg logit contributions
 P-0.044
 chess-0.041
 Linda-0.041
 Jelly-0.037
 ink-0.037
Token very
Token ID845
Feature activation+4.36e-01

Pos logit contributions
 chess+0.026
ucer+0.026
 Linda+0.026
 P+0.025
ores+0.024
Neg logit contributions
 ow-0.015
 ponies-0.015
 little-0.013
 gor-0.013
 strangers-0.013
Token cute
Token ID13779
Feature activation+3.53e-01

Pos logit contributions
 P+0.014
 chess+0.013
 Linda+0.013
 Jelly+0.012
 surfing+0.011
Neg logit contributions
folk-0.010
ly-0.010
thia-0.010
affe-0.010
 ponies-0.010
Token p
Token ID279
Feature activation+1.11e+00

Pos logit contributions
folk+0.001
ly+0.001
thia+0.001
der+0.001
 ponies+0.001
Neg logit contributions
 P-0.003
 chess-0.003
 Linda-0.003
 Jelly-0.003
 surfing-0.003
Tokenanda
Token ID5282
Feature activation-1.03e+00

Pos logit contributions
 P+0.049
 chess+0.047
 Linda+0.046
 Jelly+0.043
 noon+0.043
Neg logit contributions
ly-0.013
folk-0.012
affe-0.012
thia-0.012
 ponies-0.012
Token named
Token ID3706
Feature activation+1.14e+00

Pos logit contributions
ly+0.025
folk+0.024
affe+0.024
thia+0.024
 ponies+0.024
Neg logit contributions
 P-0.032
 Linda-0.030
 chess-0.030
 Jelly-0.027
 noon-0.026
Token P
Token ID350
Feature activation+7.28e-01

Pos logit contributions
 chess+0.027
 P+0.026
 Linda+0.026
 surfing+0.024
 ink+0.024
Neg logit contributions
ly-0.032
thia-0.032
affe-0.032
 ponies-0.031
anna-0.031
Tokenandy
Token ID10757
Feature activation-1.28e+00

Pos logit contributions
 P+0.018
 Linda+0.017
 chess+0.016
 Jelly+0.014
 ink+0.014
Neg logit contributions
folk-0.024
 ponies-0.022
keys-0.022
 bells-0.022
ly-0.022
Token.
Token ID13
Feature activation+4.74e+00

Pos logit contributions
folk+0.042
 ponies+0.040
ly+0.039
 bells+0.039
thia+0.038
Neg logit contributions
 P-0.031
 Linda-0.028
 chess-0.028
 Jelly-0.025
 ink-0.024
Token P
Token ID350
Feature activation+1.42e+00

Pos logit contributions
 possible+0.080
ucer+0.074
ores+0.073
etchup+0.072
 rid+0.070
Neg logit contributions
Exc-0.152
ly-0.150
der-0.145
 So-0.142
 little-0.142
Tokenandy
Token ID10757
Feature activation-1.58e+00

Pos logit contributions
 P+0.037
 chess+0.034
 Linda+0.034
 Jelly+0.030
 ink+0.029
Neg logit contributions
folk-0.041
ly-0.041
thia-0.041
affe-0.040
 ponies-0.040
Token was
Token ID373
Feature activation+5.03e-01

Pos logit contributions
folk+0.038
der+0.037
 ponies+0.036
ly+0.036
uted+0.035
Neg logit contributions
 P-0.051
 chess-0.047
 Linda-0.047
 Jelly-0.043
 ink-0.042
Token very
Token ID845
Feature activation+2.43e-01

Pos logit contributions
 chess+0.017
ucer+0.016
 Linda+0.016
 P+0.016
ores+0.015
Neg logit contributions
 ponies-0.009
 ow-0.009
 gor-0.009
 little-0.008
 strangers-0.008
Token successful
Token ID4388
Feature activation+1.46e-01

Pos logit contributions
 P+0.007
 Linda+0.007
 chess+0.007
 Jelly+0.006
 surfing+0.006
Neg logit contributions
folk-0.006
ly-0.006
affe-0.006
thia-0.006
 ponies-0.006
Token a
Token ID257
Feature activation+4.91e-01

Pos logit contributions
 P+0.036
 Linda+0.035
ucer+0.035
etchup+0.035
ores+0.034
Neg logit contributions
ly-0.014
der-0.013
 little-0.011
 ponies-0.011
 shoulders-0.010
Token lot
Token ID1256
Feature activation+9.42e-01

Pos logit contributions
 P+0.018
 Linda+0.018
ores+0.017
 chess+0.017
ained+0.017
Neg logit contributions
affe-0.008
 bell-0.008
 pony-0.008
 little-0.008
thia-0.008
Token and
Token ID290
Feature activation+2.69e+00

Pos logit contributions
 P+0.035
 Linda+0.033
 chess+0.033
 Jelly+0.031
 Jak+0.031
Neg logit contributions
ly-0.019
der-0.017
affe-0.015
aron-0.015
 ponies-0.015
Token asked
Token ID1965
Feature activation+3.18e+00

Pos logit contributions
 chess+0.085
ucer+0.084
 P+0.082
 late+0.080
ennis+0.080
Neg logit contributions
ly-0.054
der-0.052
 nodded-0.051
 blamed-0.050
 hated-0.050
Token Linda
Token ID21776
Feature activation+1.09e+00

Pos logit contributions
etchup+0.084
 late+0.082
ucer+0.079
 chess+0.079
 noon+0.078
Neg logit contributions
ly-0.086
der-0.076
 strangers-0.071
Exc-0.070
 ponies-0.070
Token if
Token ID611
Feature activation+4.73e+00

Pos logit contributions
 P+0.032
 Linda+0.031
 chess+0.030
 noon+0.029
 ink+0.029
Neg logit contributions
ly-0.026
 ponies-0.025
der-0.025
folk-0.024
anna-0.024
Token she
Token ID673
Feature activation+7.04e-01

Pos logit contributions
etchup+0.156
ucer+0.154
 bisc+0.154
 hyd+0.149
ores+0.148
Neg logit contributions
ly-0.074
 strangers-0.073
 little-0.068
der-0.067
 hated-0.063
Token could
Token ID714
Feature activation-1.73e-01

Pos logit contributions
 P+0.024
 Linda+0.023
 chess+0.022
 Jelly+0.021
 Jill+0.020
Neg logit contributions
folk-0.016
 ponies-0.016
uted-0.015
opes-0.015
anna-0.014
Token borrow
Token ID8804
Feature activation+1.57e+00

Pos logit contributions
thia+0.004
affe+0.004
ly+0.004
 ponies+0.004
 bells+0.004
Neg logit contributions
 P-0.005
 chess-0.005
 Linda-0.005
 late-0.004
 Ink-0.004
Token it
Token ID340
Feature activation+8.48e-01

Pos logit contributions
 chess+0.050
 surfing+0.047
 P+0.047
 late+0.046
tub+0.046
Neg logit contributions
ly-0.032
 bells-0.031
 ponies-0.031
 strangers-0.029
 bell-0.029
Token.
Token ID13
Feature activation+3.76e+00

Pos logit contributions
 Linda+0.032
etchup+0.032
 P+0.032
ores+0.032
ucer+0.032
Neg logit contributions
ly-0.012
 ponies-0.011
der-0.010
affe-0.009
 pony-0.009
Tokenin
Token ID259
Feature activation-1.33e-01

Pos logit contributions
 P+0.005
 Linda+0.005
 chess+0.005
 Jelly+0.004
 noon+0.004
Neg logit contributions
ly-0.003
folk-0.003
affe-0.003
thia-0.003
 ponies-0.003
Token named
Token ID3706
Feature activation+1.89e+00

Pos logit contributions
affe+0.003
ly+0.003
der+0.003
 ponies+0.003
thia+0.003
Neg logit contributions
 P-0.005
 Linda-0.004
 chess-0.004
 noon-0.004
 ink-0.004
Token P
Token ID350
Feature activation+1.36e+00

Pos logit contributions
 Linda+0.046
 chess+0.044
 P+0.042
 late+0.040
ained+0.040
Neg logit contributions
affe-0.052
ly-0.051
thia-0.051
 gor-0.051
keys-0.050
Tokenipp
Token ID3974
Feature activation-5.83e-02

Pos logit contributions
 P+0.039
 chess+0.036
 Linda+0.036
 Jelly+0.032
 ink+0.031
Neg logit contributions
folk-0.036
ly-0.036
thia-0.035
affe-0.035
 ponies-0.035
Tokenin
Token ID259
Feature activation-2.26e-02

Pos logit contributions
ly+0.001
folk+0.001
affe+0.001
thia+0.001
 ponies+0.001
Neg logit contributions
 P-0.002
 Linda-0.002
 chess-0.002
 Jelly-0.002
 ink-0.002
Token.
Token ID13
Feature activation+4.72e+00

Pos logit contributions
affe+0.001
ly+0.000
thia+0.000
der+0.000
folk+0.000
Neg logit contributions
 P-0.001
 Linda-0.001
 chess-0.001
 surfing-0.001
 noon-0.001
Token P
Token ID350
Feature activation+6.70e-01

Pos logit contributions
 possible+0.075
ores+0.073
ucer+0.068
etchup+0.065
prise+0.064
Neg logit contributions
Exc-0.154
ly-0.151
 So-0.145
der-0.145
affe-0.143
Tokenipp
Token ID3974
Feature activation-7.16e-01

Pos logit contributions
 P+0.020
 chess+0.019
 Linda+0.019
 Jelly+0.017
 surfing+0.017
Neg logit contributions
folk-0.017
ly-0.017
 ponies-0.016
affe-0.016
thia-0.016
Tokenin
Token ID259
Feature activation-4.49e-01

Pos logit contributions
folk+0.018
 ponies+0.018
ly+0.017
thia+0.017
der+0.017
Neg logit contributions
 P-0.021
 chess-0.020
 Linda-0.020
 Jelly-0.018
 surfing-0.018
Token was
Token ID373
Feature activation+9.21e-01

Pos logit contributions
thia+0.007
affe+0.007
ly+0.007
folk+0.007
 ponies+0.007
Neg logit contributions
 P-0.017
 Linda-0.016
 chess-0.016
 Jelly-0.015
 surfing-0.015
Token not
Token ID407
Feature activation+2.63e-01

Pos logit contributions
ucer+0.035
 Linda+0.034
 chess+0.033
 P+0.031
ennis+0.031
Neg logit contributions
 little-0.014
 ow-0.013
 ponies-0.013
 gor-0.012
 nos-0.011
Token p
Token ID279
Feature activation+1.29e+00

Pos logit contributions
folk+0.003
thia+0.003
der+0.003
ly+0.003
keys+0.003
Neg logit contributions
 P-0.005
 chess-0.005
 Linda-0.005
 Jelly-0.005
 ink-0.005
Tokenanda
Token ID5282
Feature activation-9.18e-01

Pos logit contributions
 P+0.054
 chess+0.052
 Linda+0.052
 Jelly+0.048
 noon+0.048
Neg logit contributions
ly-0.017
folk-0.016
thia-0.016
affe-0.016
 ponies-0.016
Token named
Token ID3706
Feature activation+1.05e+00

Pos logit contributions
folk+0.024
ly+0.024
thia+0.023
affe+0.023
 ponies+0.023
Neg logit contributions
 P-0.027
 chess-0.025
 Linda-0.025
 Jelly-0.022
 ink-0.022
Token P
Token ID350
Feature activation+6.38e-01

Pos logit contributions
 P+0.023
 chess+0.023
 Linda+0.022
 surfing+0.020
 ink+0.020
Neg logit contributions
ly-0.032
thia-0.032
 ponies-0.032
affe-0.031
folk-0.031
Tokenandy
Token ID10757
Feature activation-1.22e+00

Pos logit contributions
 P+0.016
 Linda+0.015
 chess+0.014
 Jelly+0.012
 ink+0.012
Neg logit contributions
folk-0.021
keys-0.020
 bells-0.020
 ponies-0.020
affe-0.019
Token.
Token ID13
Feature activation+4.69e+00

Pos logit contributions
folk+0.037
 ponies+0.035
ly+0.035
thia+0.035
der+0.035
Neg logit contributions
 P-0.031
 chess-0.029
 Linda-0.029
 Jelly-0.025
 ink-0.024
Token P
Token ID350
Feature activation+1.27e+00

Pos logit contributions
ucer+0.088
 possible+0.084
 rid+0.080
etchup+0.079
prise+0.079
Neg logit contributions
Exc-0.144
der-0.141
ly-0.140
 So-0.134
 eyes-0.133
Tokenandy
Token ID10757
Feature activation-1.63e+00

Pos logit contributions
 P+0.032
 Linda+0.030
 chess+0.030
 Jelly+0.026
 ink+0.025
Neg logit contributions
folk-0.038
ly-0.037
 ponies-0.037
affe-0.037
thia-0.037
Token was
Token ID373
Feature activation+2.53e-01

Pos logit contributions
folk+0.035
der+0.034
ly+0.032
uted+0.032
 ponies+0.032
Neg logit contributions
 P-0.056
 Linda-0.053
 chess-0.053
 Jelly-0.048
 ink-0.048
Token very
Token ID845
Feature activation+8.45e-03

Pos logit contributions
 chess+0.008
 P+0.008
 Linda+0.008
ucer+0.008
 ink+0.007
Neg logit contributions
 ponies-0.005
 ow-0.005
 gor-0.005
thia-0.005
 strangers-0.004
Token generous
Token ID14431
Feature activation+4.10e-01

Pos logit contributions
 P+0.000
 Linda+0.000
 chess+0.000
 Jelly+0.000
 surfing+0.000
Neg logit contributions
folk-0.000
ly-0.000
affe-0.000
thia-0.000
 ponies-0.000
TokenBut
Token ID1537
Feature activation+1.10e+00

Pos logit contributions
 chess+0.032
 Linda+0.031
 P+0.031
 Jak+0.029
 Jill+0.028
Neg logit contributions
Exc-0.014
 ponies-0.012
ly-0.012
anna-0.011
co-0.011
Token one
Token ID530
Feature activation-3.64e-01

Pos logit contributions
 chess+0.033
 Linda+0.031
 P+0.030
ucer+0.029
ores+0.029
Neg logit contributions
ly-0.026
 ponies-0.025
anna-0.024
 pony-0.023
der-0.023
Token day
Token ID1110
Feature activation+1.17e+00

Pos logit contributions
ly+0.013
affe+0.013
 ponies+0.012
 pony+0.012
 little+0.012
Neg logit contributions
 P-0.007
 chess-0.007
 Linda-0.006
ucer-0.005
 Jak-0.005
Token,
Token ID11
Feature activation+1.78e+00

Pos logit contributions
 chess+0.041
 P+0.041
 Linda+0.040
 surfing+0.037
 Jelly+0.036
Neg logit contributions
ly-0.022
 ponies-0.020
folk-0.020
affe-0.020
der-0.020
Token when
Token ID618
Feature activation+2.53e+00

Pos logit contributions
 chess+0.045
ucer+0.043
 disk+0.042
etchup+0.041
 surfing+0.040
Neg logit contributions
 little-0.052
ly-0.050
Exc-0.047
der-0.044
 ow-0.044
Token D
Token ID360
Feature activation+4.69e+00

Pos logit contributions
 chess+0.068
 disk+0.064
ucer+0.064
ennis+0.057
 projector+0.057
Neg logit contributions
 ow-0.066
 little-0.066
 ponies-0.060
 strangers-0.059
Exc-0.059
Tokenina
Token ID1437
Feature activation-6.82e-01

Pos logit contributions
 chess+0.120
 surfing+0.107
 soda+0.106
 possible+0.106
 P+0.105
Neg logit contributions
ly-0.141
anna-0.137
ith-0.120
ul-0.114
Exc-0.109
Token was
Token ID373
Feature activation-1.02e+00

Pos logit contributions
folk+0.017
keys+0.017
affe+0.017
der+0.017
uted+0.017
Neg logit contributions
 P-0.020
 chess-0.019
 Linda-0.018
 Jelly-0.017
 ink-0.017
Token riding
Token ID10311
Feature activation-1.24e-01

Pos logit contributions
affe+0.032
folk+0.030
ly+0.030
keys+0.030
der+0.030
Neg logit contributions
 chess-0.026
 P-0.026
 surfing-0.025
 Linda-0.024
 late-0.023
Token her
Token ID607
Feature activation+6.68e-01

Pos logit contributions
ly+0.003
 ponies+0.003
folk+0.003
affe+0.003
 bells+0.003
Neg logit contributions
 P-0.004
 chess-0.003
 Linda-0.003
 Jelly-0.003
 surfing-0.003
Token tr
Token ID491
Feature activation-4.72e-01

Pos logit contributions
 P+0.024
 chess+0.023
 Linda+0.023
 noon+0.021
 ink+0.021
Neg logit contributions
folk-0.013
ly-0.012
thia-0.012
affe-0.012
keys-0.012
Tokenin
Token ID259
Feature activation-3.45e-01

Pos logit contributions
folk+0.009
ly+0.009
thia+0.009
 ponies+0.009
affe+0.009
Neg logit contributions
 P-0.013
 chess-0.012
 Linda-0.012
 Jelly-0.011
 surfing-0.011
Token named
Token ID3706
Feature activation+1.88e+00

Pos logit contributions
affe+0.007
ly+0.007
der+0.007
 ponies+0.007
thia+0.007
Neg logit contributions
 P-0.012
 Linda-0.011
 chess-0.011
 noon-0.010
 ink-0.010
Token P
Token ID350
Feature activation+6.34e-01

Pos logit contributions
 Linda+0.045
 chess+0.043
ained+0.041
 ink+0.041
ucer+0.040
Neg logit contributions
 gor-0.051
ly-0.051
affe-0.050
thia-0.050
 ponies-0.048
Tokenudd
Token ID4185
Feature activation+6.05e-01

Pos logit contributions
 P+0.017
 chess+0.016
 Linda+0.016
 Jelly+0.014
 surfing+0.014
Neg logit contributions
folk-0.018
 ponies-0.018
ly-0.017
affe-0.017
 bells-0.017
Tokenles
Token ID829
Feature activation-1.90e-01

Pos logit contributions
 P+0.023
 chess+0.022
 Linda+0.022
 Jelly+0.020
 surfing+0.020
Neg logit contributions
folk-0.010
thia-0.010
affe-0.009
 ponies-0.009
ly-0.009
Token.
Token ID13
Feature activation+4.66e+00

Pos logit contributions
folk+0.005
ly+0.005
thia+0.005
affe+0.005
 ponies+0.005
Neg logit contributions
 P-0.006
 chess-0.005
 Linda-0.005
 Jelly-0.005
 ink-0.005
Token P
Token ID350
Feature activation+1.97e-01

Pos logit contributions
ucer+0.077
ores+0.073
 possible+0.073
prise+0.072
etchup+0.071
Neg logit contributions
Exc-0.150
der-0.150
ly-0.148
 So-0.140
 eyes-0.139
Tokenudd
Token ID4185
Feature activation-5.82e-02

Pos logit contributions
 P+0.006
 chess+0.005
 Linda+0.005
 Jelly+0.005
 ink+0.005
Neg logit contributions
folk-0.005
 ponies-0.005
ly-0.005
affe-0.005
 bells-0.005
Tokenles
Token ID829
Feature activation+2.77e-01

Pos logit contributions
folk+0.001
thia+0.001
affe+0.001
 ponies+0.001
keys+0.001
Neg logit contributions
 P-0.002
 chess-0.002
 Linda-0.002
 Jelly-0.002
 surfing-0.002
Token loved
Token ID6151
Feature activation-7.78e-02

Pos logit contributions
 P+0.012
 Linda+0.012
 chess+0.012
 Jelly+0.011
 Ink+0.011
Neg logit contributions
ly-0.003
thia-0.003
folk-0.003
affe-0.003
der-0.003
Token to
Token ID284
Feature activation+4.35e-01

Pos logit contributions
 ponies+0.002
 bells+0.002
 strangers+0.002
ly+0.002
der+0.002
Neg logit contributions
 Linda-0.002
 P-0.002
 chess-0.002
 Jelly-0.002
ucer-0.002
Token p
Token ID279
Feature activation+1.04e+00

Pos logit contributions
folk+0.003
ly+0.003
der+0.003
thia+0.003
 ponies+0.003
Neg logit contributions
 P-0.008
 chess-0.008
 Linda-0.008
 Jelly-0.008
 surfing-0.007
Tokenanda
Token ID5282
Feature activation-1.35e+00

Pos logit contributions
 P+0.046
 chess+0.044
 Linda+0.044
 Jelly+0.041
 noon+0.041
Neg logit contributions
ly-0.011
folk-0.010
 ponies-0.010
affe-0.010
thia-0.010
Token named
Token ID3706
Feature activation+1.08e+00

Pos logit contributions
folk+0.035
ly+0.034
thia+0.034
affe+0.034
 ponies+0.034
Neg logit contributions
 P-0.039
 chess-0.037
 Linda-0.037
 Jelly-0.033
 surfing-0.032
Token P
Token ID350
Feature activation+5.34e-01

Pos logit contributions
 chess+0.026
 P+0.024
 Linda+0.024
 surfing+0.023
 ink+0.023
Neg logit contributions
ly-0.030
thia-0.030
affe-0.029
 ponies-0.029
folk-0.029
Tokenandy
Token ID10757
Feature activation-1.42e+00

Pos logit contributions
 P+0.014
 Linda+0.013
 chess+0.012
 Jelly+0.011
 ink+0.010
Neg logit contributions
folk-0.017
 ponies-0.016
keys-0.016
 bells-0.016
affe-0.015
Token.
Token ID13
Feature activation+4.64e+00

Pos logit contributions
folk+0.046
 ponies+0.043
ly+0.043
thia+0.043
 bells+0.043
Neg logit contributions
 P-0.035
 chess-0.032
 Linda-0.031
 Jelly-0.027
 ink-0.026
Token P
Token ID350
Feature activation+1.27e+00

Pos logit contributions
 possible+0.084
ucer+0.083
 rid+0.080
etchup+0.077
ores+0.076
Neg logit contributions
Exc-0.141
ly-0.139
der-0.134
 So-0.133
 little-0.130
Tokenandy
Token ID10757
Feature activation-1.70e+00

Pos logit contributions
 P+0.033
 Linda+0.031
 chess+0.030
 Jelly+0.027
 ink+0.026
Neg logit contributions
folk-0.038
ly-0.037
 ponies-0.037
affe-0.037
thia-0.037
Token was
Token ID373
Feature activation+3.34e-01

Pos logit contributions
folk+0.040
der+0.039
ly+0.037
 ponies+0.037
uted+0.036
Neg logit contributions
 P-0.056
 chess-0.052
 Linda-0.052
 Jelly-0.047
 ink-0.047
Token a
Token ID257
Feature activation-5.06e-01

Pos logit contributions
 chess+0.011
ucer+0.011
 Linda+0.011
 P+0.011
ores+0.010
Neg logit contributions
 ponies-0.006
 ow-0.006
 gor-0.006
 little-0.006
thia-0.005
Token big
Token ID1263
Feature activation+4.38e-01

Pos logit contributions
affe+0.010
ly+0.010
thia+0.010
 ponies+0.010
folk+0.010
Neg logit contributions
 P-0.017
 Linda-0.016
 chess-0.016
 ink-0.014
 noon-0.014
Token crab
Token ID32202
Feature activation-6.27e-01

Pos logit contributions
 P+0.011
 chess+0.010
 Linda+0.010
 Jelly+0.010
 ink+0.009
Neg logit contributions
folk-0.003
ly-0.003
thia-0.003
affe-0.003
 ponies-0.003
Token named
Token ID3706
Feature activation+1.74e+00

Pos logit contributions
ly+0.013
affe+0.013
 ponies+0.013
thia+0.013
anna+0.012
Neg logit contributions
 P-0.021
 Linda-0.020
 chess-0.020
 Jelly-0.018
 noon-0.018
Token P
Token ID350
Feature activation+1.36e+00

Pos logit contributions
ucer+0.044
 Linda+0.043
ained+0.043
 chess+0.042
ores+0.042
Neg logit contributions
affe-0.042
thia-0.041
ly-0.040
 ow-0.039
 ponies-0.038
Tokeninch
Token ID8589
Feature activation-5.09e-01

Pos logit contributions
 P+0.033
 Linda+0.031
 chess+0.031
 Jelly+0.027
 noon+0.026
Neg logit contributions
ly-0.041
thia-0.041
folk-0.041
affe-0.041
 ponies-0.040
Tokeny
Token ID88
Feature activation+4.63e-01

Pos logit contributions
ly+0.011
affe+0.010
folk+0.010
thia+0.010
der+0.010
Neg logit contributions
 P-0.017
 chess-0.016
 Linda-0.016
 Jelly-0.015
 noon-0.015
Token.
Token ID13
Feature activation+4.57e+00

Pos logit contributions
 P+0.014
 chess+0.013
 Linda+0.013
 Jelly+0.011
 surfing+0.011
Neg logit contributions
folk-0.012
ly-0.012
thia-0.012
affe-0.012
 ponies-0.011
Token P
Token ID350
Feature activation+2.04e+00

Pos logit contributions
ores+0.068
 possible+0.064
ucer+0.061
prise+0.058
etchup+0.057
Neg logit contributions
Exc-0.161
ly-0.157
der-0.153
affe-0.149
 little-0.147
Tokeninch
Token ID8589
Feature activation-8.60e-01

Pos logit contributions
 P+0.043
 Linda+0.041
 chess+0.040
 Jelly+0.035
 Ink+0.034
Neg logit contributions
ly-0.068
thia-0.065
der-0.065
affe-0.065
folk-0.064
Tokeny
Token ID88
Feature activation-2.89e-01

Pos logit contributions
ly+0.018
folk+0.018
affe+0.017
thia+0.017
 ponies+0.017
Neg logit contributions
 P-0.029
 chess-0.028
 Linda-0.028
 Jelly-0.025
 surfing-0.025
Token was
Token ID373
Feature activation+1.16e+00

Pos logit contributions
folk+0.006
 ponies+0.005
ly+0.005
der+0.005
affe+0.005
Neg logit contributions
 P-0.011
 chess-0.010
 Linda-0.010
 Jelly-0.009
 ink-0.009
Token very
Token ID845
Feature activation+7.16e-01

Pos logit contributions
ucer+0.040
 chess+0.040
 Linda+0.039
ores+0.038
 P+0.037
Neg logit contributions
 ow-0.019
 ponies-0.018
 little-0.018
 gor-0.016
 babies-0.016
Token named
Token ID3706
Feature activation+1.82e+00

Pos logit contributions
affe+0.003
ly+0.003
 ponies+0.002
 strangers+0.002
 who+0.002
Neg logit contributions
 Linda-0.006
 P-0.006
 chess-0.006
ennis-0.006
 possible-0.006
Token D
Token ID360
Feature activation+2.97e+00

Pos logit contributions
 Linda+0.056
 P+0.055
 chess+0.054
 surfing+0.051
ucer+0.051
Neg logit contributions
 ponies-0.038
ly-0.038
affe-0.037
thia-0.037
 strangers-0.035
Tokenott
Token ID1252
Feature activation-2.50e+00

Pos logit contributions
 chess+0.090
 P+0.089
 Linda+0.086
 surfing+0.080
 Ink+0.079
Neg logit contributions
ly-0.073
affe-0.070
anna-0.066
Exc-0.064
 ponies-0.062
Tokenie
Token ID494
Feature activation-5.65e-01

Pos logit contributions
folk+0.065
Exc+0.060
 ponies+0.060
 bells+0.059
thia+0.059
Neg logit contributions
 P-0.073
 Linda-0.070
 chess-0.069
 ink-0.061
 noon-0.060
Token.
Token ID13
Feature activation+2.88e+00

Pos logit contributions
folk+0.015
ly+0.015
thia+0.014
 ponies+0.014
affe+0.014
Neg logit contributions
 P-0.016
 chess-0.015
 Linda-0.015
 Jelly-0.014
 ink-0.013
Token D
Token ID360
Feature activation+4.51e+00

Pos logit contributions
ucer+0.072
 possible+0.068
 chess+0.068
 rid+0.065
 Linda+0.063
Neg logit contributions
Exc-0.078
der-0.073
ly-0.073
 ponies-0.070
 little-0.068
Tokenott
Token ID1252
Feature activation-2.03e+00

Pos logit contributions
 chess+0.138
 possible+0.132
 surfing+0.131
ores+0.129
 common+0.125
Neg logit contributions
ly-0.096
ith-0.084
affe-0.080
der-0.079
thia-0.079
Tokenie
Token ID494
Feature activation-7.33e-01

Pos logit contributions
folk+0.057
 ponies+0.053
keys+0.052
 bells+0.051
affe+0.050
Neg logit contributions
 P-0.057
 Linda-0.055
 chess-0.053
 ink-0.048
 Jelly-0.046
Token wanted
Token ID2227
Feature activation+5.48e-01

Pos logit contributions
folk+0.010
ly+0.010
thia+0.010
affe+0.010
 ponies+0.010
Neg logit contributions
 P-0.030
 chess-0.029
 Linda-0.029
 Jelly-0.026
 ink-0.026
Token to
Token ID284
Feature activation+2.63e-01

Pos logit contributions
 Linda+0.018
 P+0.018
 chess+0.017
ucer+0.016
ores+0.016
Neg logit contributions
 ponies-0.012
 strangers-0.011
ly-0.011
 bells-0.011
thia-0.010
Token visit
Token ID3187
Feature activation+6.59e-01

Pos logit contributions
 P+0.007
 Linda+0.007
 chess+0.007
 Jelly+0.006
 late+0.006
Neg logit contributions
thia-0.007
 ponies-0.007
folk-0.007
affe-0.007
ly-0.006
Token crab
Token ID32202
Feature activation-7.95e-01

Pos logit contributions
 P+0.019
 Linda+0.018
 chess+0.017
 surfing+0.016
ores+0.016
Neg logit contributions
ly-0.010
affe-0.009
 ponies-0.009
folk-0.009
 pony-0.008
Token named
Token ID3706
Feature activation+1.45e+00

Pos logit contributions
affe+0.016
ly+0.016
 ponies+0.015
thia+0.015
anna+0.015
Neg logit contributions
 P-0.027
 Linda-0.026
 chess-0.025
 Jelly-0.023
 noon-0.023
Token P
Token ID350
Feature activation+1.20e+00

Pos logit contributions
 Linda+0.038
 chess+0.036
 P+0.036
ucer+0.035
ained+0.034
Neg logit contributions
affe-0.037
ly-0.036
thia-0.036
 ponies-0.034
 gor-0.034
Tokeninch
Token ID8589
Feature activation-8.30e-01

Pos logit contributions
 P+0.031
 chess+0.029
 Linda+0.029
 Jelly+0.025
 ink+0.025
Neg logit contributions
folk-0.035
ly-0.035
thia-0.034
affe-0.034
 ponies-0.034
Tokeny
Token ID88
Feature activation+2.00e-01

Pos logit contributions
ly+0.018
folk+0.018
affe+0.017
thia+0.017
 ponies+0.017
Neg logit contributions
 P-0.028
 chess-0.026
 Linda-0.026
 Jelly-0.024
 ink-0.023
Token.
Token ID13
Feature activation+4.50e+00

Pos logit contributions
 P+0.006
 Linda+0.006
 chess+0.006
 Jelly+0.005
 surfing+0.005
Neg logit contributions
ly-0.005
folk-0.005
affe-0.005
thia-0.005
 ponies-0.005
Token P
Token ID350
Feature activation+1.81e+00

Pos logit contributions
ores+0.068
 possible+0.064
ucer+0.062
etchup+0.061
 kettle+0.059
Neg logit contributions
ly-0.152
Exc-0.152
der-0.148
affe-0.144
 So-0.143
Tokeninch
Token ID8589
Feature activation-1.40e+00

Pos logit contributions
 P+0.038
 Linda+0.035
 chess+0.035
 Jelly+0.030
 noon+0.029
Neg logit contributions
ly-0.061
thia-0.060
affe-0.059
der-0.059
folk-0.059
Tokeny
Token ID88
Feature activation-7.65e-01

Pos logit contributions
folk+0.030
ly+0.030
thia+0.029
affe+0.029
 ponies+0.029
Neg logit contributions
 P-0.047
 chess-0.045
 Linda-0.045
 Jelly-0.040
 ink-0.040
Token loved
Token ID6151
Feature activation-2.20e-01

Pos logit contributions
folk+0.014
 ponies+0.013
ly+0.013
der+0.013
affe+0.013
Neg logit contributions
 P-0.028
 chess-0.027
 Linda-0.026
 Jelly-0.025
 ink-0.024
Token to
Token ID284
Feature activation+2.26e-01

Pos logit contributions
 ponies+0.005
 strangers+0.005
 bells+0.005
ly+0.005
 gor+0.004
Neg logit contributions
 Linda-0.007
 P-0.007
 chess-0.006
 Jelly-0.006
ucer-0.006
Token k
Token ID479
Feature activation+2.42e-01

Pos logit contributions
 P+0.017
 Linda+0.016
 chess+0.016
 surfing+0.014
ores+0.014
Neg logit contributions
ly-0.008
 ponies-0.008
affe-0.008
folk-0.007
 pony-0.007
Tokenitty
Token ID9760
Feature activation-3.74e-01

Pos logit contributions
 P+0.007
 chess+0.007
 Linda+0.007
 Jelly+0.006
 noon+0.006
Neg logit contributions
folk-0.006
ly-0.006
affe-0.006
 ponies-0.006
thia-0.006
Token named
Token ID3706
Feature activation+7.20e-01

Pos logit contributions
ly+0.009
folk+0.009
affe+0.009
thia+0.009
 ponies+0.009
Neg logit contributions
 P-0.011
 chess-0.011
 Linda-0.011
 Jelly-0.010
 surfing-0.009
Token P
Token ID350
Feature activation+1.40e-01

Pos logit contributions
 P+0.020
 chess+0.019
 Linda+0.019
 surfing+0.017
 Jelly+0.017
Neg logit contributions
ly-0.019
folk-0.019
 ponies-0.019
thia-0.019
affe-0.019
Tokenuss
Token ID1046
Feature activation-3.34e+00

Pos logit contributions
 P+0.004
 Linda+0.003
 chess+0.003
 ink+0.003
 Jelly+0.003
Neg logit contributions
folk-0.004
 bells-0.004
keys-0.004
 ponies-0.004
 gor-0.004
Token.
Token ID13
Feature activation+4.47e+00

Pos logit contributions
folk+0.105
thia+0.102
Exc+0.100
 backs+0.100
 bells+0.099
Neg logit contributions
 P-0.072
 Linda-0.066
 night-0.062
 chess-0.060
 day-0.057
Token One
Token ID1881
Feature activation+9.48e-01

Pos logit contributions
ores+0.073
etchup+0.071
 possible+0.070
ucer+0.068
 kettle+0.064
Neg logit contributions
Exc-0.152
ly-0.147
der-0.145
 So-0.140
 eyes-0.138
Token day
Token ID1110
Feature activation+3.51e+00

Pos logit contributions
 P+0.019
 chess+0.019
 Linda+0.018
 surfing+0.015
 Jak+0.015
Neg logit contributions
ly-0.033
affe-0.033
thia-0.032
 ow-0.031
 hated-0.031
Token P
Token ID350
Feature activation+2.45e+00

Pos logit contributions
ores+0.142
 possible+0.139
 chess+0.135
 late+0.135
 Linda+0.134
Neg logit contributions
ly-0.038
 little-0.032
Exc-0.027
 ow-0.026
 stopped-0.024
Tokenuss
Token ID1046
Feature activation-3.21e+00

Pos logit contributions
 P+0.073
 chess+0.069
 Linda+0.068
 Jelly+0.062
 surfing+0.061
Neg logit contributions
ly-0.061
thia-0.060
affe-0.059
folk-0.058
der-0.058
Token was
Token ID373
Feature activation+8.56e-01

Pos logit contributions
folk+0.100
arians+0.091
 hind+0.091
 bells+0.090
 shoulders+0.089
Neg logit contributions
 P-0.074
 Linda-0.069
 night-0.069
 ink-0.066
 noon-0.065
Token friend
Token ID1545
Feature activation-2.28e-01

Pos logit contributions
 P+0.015
 Linda+0.014
 chess+0.014
 Jelly+0.013
ores+0.013
Neg logit contributions
 ponies-0.007
thia-0.007
folk-0.007
ly-0.007
affe-0.007
Token,
Token ID11
Feature activation+5.83e-01

Pos logit contributions
ly+0.005
 ponies+0.005
folk+0.005
affe+0.005
thia+0.005
Neg logit contributions
 chess-0.007
 P-0.006
ucer-0.006
ores-0.006
 Linda-0.006
Token D
Token ID360
Feature activation+2.67e+00

Pos logit contributions
 chess+0.014
 P+0.014
ained+0.014
ucer+0.014
 ink+0.013
Neg logit contributions
 ponies-0.014
folk-0.014
 strangers-0.014
thia-0.014
 pony-0.014
Tokenalia
Token ID9752
Feature activation-1.54e+00

Pos logit contributions
 chess+0.076
 P+0.073
 Linda+0.069
 Ink+0.066
 ink+0.064
Neg logit contributions
ly-0.074
anna-0.071
affe-0.067
 ponies-0.066
Exc-0.065
Token.
Token ID13
Feature activation+1.86e+00

Pos logit contributions
affe+0.045
folk+0.045
thia+0.044
der+0.044
uted+0.044
Neg logit contributions
 P-0.039
 Linda-0.037
 chess-0.036
 surfing-0.032
 Jelly-0.031
Token D
Token ID360
Feature activation+4.43e+00

Pos logit contributions
 chess+0.056
 Linda+0.050
 P+0.050
 ink+0.049
 possible+0.048
Neg logit contributions
Exc-0.042
ly-0.042
anna-0.041
folk-0.041
 ponies-0.040
Tokenalia
Token ID9752
Feature activation-1.21e+00

Pos logit contributions
 chess+0.142
 late+0.131
 hyd+0.129
 rid+0.129
 possible+0.128
Neg logit contributions
ly-0.109
anna-0.096
ith-0.086
der-0.083
affe-0.082
Token had
Token ID550
Feature activation+6.05e-01

Pos logit contributions
der+0.035
 ponies+0.034
 ow+0.034
affe+0.034
gging+0.033
Neg logit contributions
 P-0.031
 Linda-0.030
 chess-0.028
 Jill-0.026
 Jenny-0.025
Token made
Token ID925
Feature activation+1.54e+00

Pos logit contributions
 P+0.015
 chess+0.015
 Linda+0.014
 Jelly+0.013
 ink+0.012
Neg logit contributions
thia-0.017
 ponies-0.017
folk-0.017
ly-0.017
affe-0.017
Token a
Token ID257
Feature activation+3.86e-01

Pos logit contributions
 chess+0.045
ucer+0.042
 P+0.041
 Linda+0.039
etchup+0.039
Neg logit contributions
ly-0.038
 ponies-0.037
 bells-0.034
 little-0.034
der-0.033
Token promise
Token ID6991
Feature activation+2.17e-01

Pos logit contributions
 P+0.013
 Linda+0.013
ores+0.012
 chess+0.012
ained+0.012
Neg logit contributions
 bell-0.008
 pony-0.008
thia-0.007
affe-0.007
 ponies-0.007
Tokengy
Token ID1360
Feature activation-1.23e+00

Pos logit contributions
 P+0.014
 Linda+0.014
 chess+0.013
ores+0.012
 noon+0.012
Neg logit contributions
affe-0.009
thia-0.009
ly-0.009
 ponies-0.008
anna-0.008
Token named
Token ID3706
Feature activation+6.42e-01

Pos logit contributions
folk+0.029
ly+0.028
thia+0.028
affe+0.028
 ponies+0.028
Neg logit contributions
 P-0.038
 chess-0.036
 Linda-0.036
 Jelly-0.032
 ink-0.032
Token P
Token ID350
Feature activation-5.48e-01

Pos logit contributions
 chess+0.015
 P+0.015
 Linda+0.015
 surfing+0.014
 ink+0.013
Neg logit contributions
ly-0.019
thia-0.018
 ponies-0.018
affe-0.018
folk-0.018
Tokenipp
Token ID3974
Feature activation-1.44e+00

Pos logit contributions
folk+0.016
 bells+0.015
 blamed+0.014
 ponies+0.014
keys+0.014
Neg logit contributions
 P-0.015
 chess-0.014
 Linda-0.013
 ink-0.012
 Jelly-0.012
Tokena
Token ID64
Feature activation-7.68e-01

Pos logit contributions
folk+0.039
 ponies+0.038
der+0.037
thia+0.037
ly+0.037
Neg logit contributions
 P-0.040
 chess-0.037
 Linda-0.037
 Jelly-0.034
 surfing-0.032
Token.
Token ID13
Feature activation+4.39e+00

Pos logit contributions
folk+0.022
ly+0.021
 ponies+0.021
der+0.021
thia+0.021
Neg logit contributions
 P-0.021
 chess-0.019
 Linda-0.019
 Jelly-0.017
 ink-0.017
Token P
Token ID350
Feature activation+9.64e-02

Pos logit contributions
 possible+0.077
ucer+0.073
ores+0.073
 rid+0.072
etchup+0.070
Neg logit contributions
Exc-0.146
ly-0.138
 little-0.133
der-0.132
 Many-0.129
Tokenipp
Token ID3974
Feature activation-1.61e+00

Pos logit contributions
 P+0.003
 chess+0.003
 Linda+0.002
 Jelly+0.002
 ink+0.002
Neg logit contributions
folk-0.003
 ponies-0.002
ly-0.002
affe-0.002
thia-0.002
Tokena
Token ID64
Feature activation-1.19e+00

Pos logit contributions
folk+0.046
 ponies+0.045
der+0.045
 strangers+0.044
thia+0.044
Neg logit contributions
 P-0.042
 chess-0.038
 Linda-0.038
 Jelly-0.036
 Ink-0.033
Token loved
Token ID6151
Feature activation-8.03e-01

Pos logit contributions
folk+0.026
der+0.025
 ponies+0.025
ly+0.024
uted+0.024
Neg logit contributions
 P-0.041
 chess-0.038
 Linda-0.038
 ink-0.035
 Jelly-0.035
Token to
Token ID284
Feature activation-1.89e-01

Pos logit contributions
 ponies+0.020
ly+0.018
 bells+0.018
thia+0.018
 strangers+0.018
Neg logit contributions
 P-0.024
 Linda-0.024
 chess-0.023
 Jelly-0.021
ucer-0.020
Token pony
Token ID26902
Feature activation-4.74e-01

Pos logit contributions
 P+0.010
 Linda+0.010
 chess+0.009
ores+0.009
 surfing+0.009
Neg logit contributions
ly-0.005
 ponies-0.005
affe-0.005
folk-0.005
 pony-0.005
Token named
Token ID3706
Feature activation-1.81e-01

Pos logit contributions
affe+0.010
ly+0.010
 ponies+0.009
thia+0.009
folk+0.009
Neg logit contributions
 P-0.016
 Linda-0.016
 chess-0.015
 Jelly-0.014
 noon-0.014
Token P
Token ID350
Feature activation-8.92e-01

Pos logit contributions
folk+0.005
ly+0.005
thia+0.005
der+0.005
affe+0.005
Neg logit contributions
 P-0.005
 Linda-0.004
 chess-0.004
 Jelly-0.004
 noon-0.004
Tokenipp
Token ID3974
Feature activation-1.53e+00

Pos logit contributions
folk+0.030
 blamed+0.027
 bells+0.027
keys+0.026
 policemen+0.025
Neg logit contributions
 P-0.021
 chess-0.020
 Linda-0.019
ennis-0.017
 noon-0.017
Tokenin
Token ID259
Feature activation-1.07e+00

Pos logit contributions
folk+0.042
 ponies+0.041
der+0.041
thia+0.040
ly+0.040
Neg logit contributions
 P-0.042
 chess-0.039
 Linda-0.038
 Jelly-0.035
 surfing-0.034
Token.
Token ID13
Feature activation+4.36e+00

Pos logit contributions
affe+0.025
ly+0.024
thia+0.024
folk+0.024
 ponies+0.024
Neg logit contributions
 P-0.034
 Linda-0.032
 chess-0.032
 Jelly-0.029
 surfing-0.029
Token P
Token ID350
Feature activation-3.30e-01

Pos logit contributions
 possible+0.064
ores+0.063
etchup+0.061
ucer+0.060
 rid+0.058
Neg logit contributions
Exc-0.149
der-0.144
ly-0.142
 eyes-0.136
 So-0.136
Tokenipp
Token ID3974
Feature activation-2.21e+00

Pos logit contributions
folk+0.010
keys+0.009
 ponies+0.009
ly+0.009
thia+0.009
Neg logit contributions
 P-0.009
 chess-0.008
 Linda-0.007
 surfing-0.007
 Jelly-0.007
Tokenin
Token ID259
Feature activation-1.97e+00

Pos logit contributions
der+0.066
folk+0.065
 ponies+0.064
 strangers+0.063
 bells+0.061
Neg logit contributions
 P-0.054
 chess-0.050
 Linda-0.047
 Jelly-0.046
 marble-0.045
Token loved
Token ID6151
Feature activation-1.46e+00

Pos logit contributions
folk+0.042
der+0.040
ly+0.040
uted+0.037
 ponies+0.037
Neg logit contributions
 P-0.068
 chess-0.063
 Linda-0.062
 Jelly-0.058
 noon-0.058
Token to
Token ID284
Feature activation-6.15e-01

Pos logit contributions
folk+0.043
ly+0.042
thia+0.042
affe+0.041
der+0.041
Neg logit contributions
 P-0.039
 chess-0.036
 Linda-0.036
 Jelly-0.031
 surfing-0.031
Token was
Token ID373
Feature activation+1.01e+00

Pos logit contributions
 P+0.017
 chess+0.016
 Linda+0.016
 Jelly+0.015
 ink+0.015
Neg logit contributions
folk-0.010
ly-0.010
thia-0.010
affe-0.010
 ponies-0.010
Token a
Token ID257
Feature activation+3.78e-01

Pos logit contributions
 P+0.028
 Linda+0.028
 chess+0.026
ucer+0.025
 Jelly+0.024
Neg logit contributions
 ponies-0.026
ly-0.025
thia-0.024
affe-0.024
 bells-0.023
Token guy
Token ID3516
Feature activation-1.01e+00

Pos logit contributions
 P+0.017
 chess+0.016
 Linda+0.016
 Jelly+0.015
 surfing+0.015
Neg logit contributions
folk-0.004
ly-0.004
thia-0.004
affe-0.004
 ponies-0.004
Token called
Token ID1444
Feature activation+1.11e+00

Pos logit contributions
affe+0.016
ly+0.016
 ponies+0.016
der+0.015
 strangers+0.015
Neg logit contributions
 P-0.038
 Linda-0.037
 chess-0.037
 noon-0.035
 Jelly-0.034
Token Jazz
Token ID21406
Feature activation-1.62e+00

Pos logit contributions
 Linda+0.027
 P+0.027
 chess+0.026
ucer+0.025
ained+0.025
Neg logit contributions
ly-0.029
folk-0.029
affe-0.029
 strangers-0.028
keys-0.028
Token.
Token ID13
Feature activation+4.35e+00

Pos logit contributions
folk+0.043
ly+0.042
thia+0.041
 ponies+0.041
affe+0.041
Neg logit contributions
 P-0.047
 chess-0.044
 Linda-0.044
 Jelly-0.039
 ink-0.038
Token He
Token ID679
Feature activation-6.06e-01

Pos logit contributions
etchup+0.097
ores+0.084
 rid+0.081
ucer+0.076
 possible+0.075
Neg logit contributions
 bells-0.136
 pedals-0.136
ly-0.134
Exc-0.132
folk-0.131
Token was
Token ID373
Feature activation+3.01e-01

Pos logit contributions
thia+0.008
ly+0.008
affe+0.008
folk+0.008
 ponies+0.008
Neg logit contributions
 P-0.025
 Linda-0.024
 chess-0.024
 Jelly-0.022
 surfing-0.022
Token very
Token ID845
Feature activation+7.67e-02

Pos logit contributions
 Linda+0.010
 P+0.009
ucer+0.009
 chess+0.009
ores+0.008
Neg logit contributions
 ponies-0.006
 gor-0.006
 bells-0.006
 strangers-0.006
 little-0.006
Token deep
Token ID2769
Feature activation+5.03e-01

Pos logit contributions
 P+0.002
 Linda+0.002
 chess+0.002
 Jelly+0.002
 ink+0.002
Neg logit contributions
ly-0.002
folk-0.002
thia-0.002
 ponies-0.002
affe-0.002
Token and
Token ID290
Feature activation-4.02e-02

Pos logit contributions
 P+0.016
 chess+0.015
 Linda+0.015
 Jelly+0.014
 noon+0.013
Neg logit contributions
folk-0.012
thia-0.012
affe-0.011
ly-0.011
 ponies-0.011
Token pengu
Token ID38373
Feature activation-2.09e-01

Pos logit contributions
 P+0.006
 chess+0.006
 Linda+0.005
 Jelly+0.005
 ink+0.005
Neg logit contributions
folk-0.002
thia-0.002
der-0.002
ly-0.002
 ponies-0.002
Tokenin
Token ID259
Feature activation-5.02e-01

Pos logit contributions
folk+0.004
ly+0.004
affe+0.004
thia+0.004
 ponies+0.004
Neg logit contributions
 P-0.007
 Linda-0.007
 chess-0.007
 Jelly-0.006
 ink-0.006
Token named
Token ID3706
Feature activation+1.82e+00

Pos logit contributions
affe+0.010
ly+0.009
 ponies+0.009
der+0.009
keys+0.009
Neg logit contributions
 P-0.018
 Linda-0.018
 chess-0.017
 noon-0.016
 Jak-0.016
Token P
Token ID350
Feature activation+7.80e-01

Pos logit contributions
 Linda+0.045
 chess+0.043
 P+0.042
ained+0.040
 ink+0.040
Neg logit contributions
ly-0.049
affe-0.049
thia-0.048
 ponies-0.047
 gor-0.047
Tokeningu
Token ID6680
Feature activation-6.93e-01

Pos logit contributions
 P+0.022
 chess+0.021
 Linda+0.020
 Jelly+0.018
 ink+0.018
Neg logit contributions
folk-0.021
ly-0.021
 ponies-0.021
thia-0.020
affe-0.020
Token.
Token ID13
Feature activation+4.32e+00

Pos logit contributions
ly+0.015
affe+0.015
thia+0.015
 ponies+0.015
der+0.015
Neg logit contributions
 P-0.022
 Linda-0.022
 chess-0.021
 Jelly-0.019
 noon-0.019
Token P
Token ID350
Feature activation+8.49e-01

Pos logit contributions
ores+0.089
 possible+0.088
etchup+0.086
ucer+0.081
 rid+0.081
Neg logit contributions
ly-0.122
Exc-0.121
 So-0.111
der-0.111
 little-0.111
Tokeningu
Token ID6680
Feature activation-1.22e+00

Pos logit contributions
 P+0.025
 Linda+0.024
 chess+0.024
 Jelly+0.021
 ink+0.021
Neg logit contributions
ly-0.021
folk-0.021
thia-0.021
affe-0.021
 ponies-0.021
Token loved
Token ID6151
Feature activation-5.58e-01

Pos logit contributions
folk+0.019
ly+0.018
 ponies+0.017
affe+0.017
der+0.017
Neg logit contributions
 P-0.049
 chess-0.047
 Linda-0.047
 Jelly-0.043
 surfing-0.043
Token to
Token ID284
Feature activation+2.28e-01

Pos logit contributions
 ponies+0.014
ly+0.013
thia+0.013
folk+0.013
affe+0.013
Neg logit contributions
 P-0.016
 Linda-0.016
 chess-0.015
 Jelly-0.014
 Ink-0.013
Token swim
Token ID9422
Feature activation+9.17e-01

Pos logit contributions
 P+0.006
 Linda+0.006
 chess+0.005
 late+0.005
 Jelly+0.005
Neg logit contributions
ly-0.006
thia-0.006
 ponies-0.006
 bells-0.006
affe-0.006
Token the
Token ID262
Feature activation-1.74e+00

Pos logit contributions
 chess+0.007
 P+0.007
 Linda+0.007
 ink+0.006
 surfing+0.006
Neg logit contributions
folk-0.007
 ponies-0.007
ly-0.006
 gor-0.006
 strangers-0.006
Token wise
Token ID10787
Feature activation-1.96e+00

Pos logit contributions
thia+0.036
uted+0.036
keys+0.036
folk+0.035
opes+0.034
Neg logit contributions
 night-0.056
 chess-0.056
 P-0.055
 noon-0.055
 day-0.052
Token pony
Token ID26902
Feature activation+4.54e-02

Pos logit contributions
keys+0.072
opes+0.065
ts+0.063
uted+0.062
thia+0.062
Neg logit contributions
 night-0.041
 day-0.038
 morning-0.035
 noon-0.035
 P-0.034
Token and
Token ID290
Feature activation-5.55e-01

Pos logit contributions
 P+0.002
 Linda+0.001
 chess+0.001
 Jelly+0.001
 noon+0.001
Neg logit contributions
folk-0.001
ly-0.001
 ponies-0.001
thia-0.001
affe-0.001
Token the
Token ID262
Feature activation-1.67e+00

Pos logit contributions
thia+0.017
ly+0.017
folk+0.017
affe+0.017
der+0.017
Neg logit contributions
 P-0.014
 Linda-0.013
 chess-0.012
 Jelly-0.011
 noon-0.011
Token 3
Token ID513
Feature activation-4.44e+00

Pos logit contributions
uted+0.035
thia+0.035
anna+0.034
der+0.033
opes+0.033
Neg logit contributions
 P-0.054
 chess-0.054
 Linda-0.051
 noon-0.051
 night-0.050
Token-
Token ID12
Feature activation-1.53e+00

Pos logit contributions
der+0.151
arians+0.145
opes+0.142
thia+0.141
 hind+0.140
Neg logit contributions
 day-0.084
 night-0.071
 afternoon-0.067
 morning-0.065
 evening-0.061
Tokenyear
Token ID1941
Feature activation-3.24e+00

Pos logit contributions
thia+0.038
der+0.037
folk+0.037
 ponies+0.037
ly+0.037
Neg logit contributions
 P-0.045
 Linda-0.043
 chess-0.041
 Jelly-0.038
 ink-0.037
Token old
Token ID1468
Feature activation-1.94e+00

Pos logit contributions
Exc+0.088
thia+0.085
uted+0.084
 blamed+0.084
Flash+0.083
Neg logit contributions
 P-0.075
 night-0.075
 day-0.073
 surfing-0.070
 noon-0.069
Token girl
Token ID2576
Feature activation-2.64e-01

Pos logit contributions
thia+0.046
keys+0.046
uted+0.045
der+0.045
aned+0.043
Neg logit contributions
 P-0.054
 night-0.053
 Linda-0.051
 chess-0.051
 day-0.048
Token played
Token ID2826
Feature activation-7.99e-01

Pos logit contributions
ly+0.006
folk+0.005
affe+0.005
thia+0.005
 ponies+0.005
Neg logit contributions
 P-0.009
 Linda-0.009
 chess-0.008
 Jelly-0.008
 surfing-0.008
Token mother
Token ID2802
Feature activation-3.21e-01

Pos logit contributions
folk+0.007
uted+0.006
ly+0.006
der+0.006
keys+0.006
Neg logit contributions
 P-0.008
 Linda-0.007
 chess-0.007
 Jelly-0.007
 noon-0.007
Token became
Token ID2627
Feature activation-1.14e+00

Pos logit contributions
 ponies+0.009
der+0.009
folk+0.009
thia+0.009
opes+0.009
Neg logit contributions
 Linda-0.009
 P-0.008
 chess-0.007
 Sammy-0.007
 Jenny-0.007
Token angry
Token ID7954
Feature activation-2.36e-01

Pos logit contributions
folk+0.028
ly+0.028
affe+0.028
thia+0.027
 ponies+0.027
Neg logit contributions
 P-0.035
 Linda-0.033
 chess-0.033
 Jelly-0.030
 surfing-0.029
Token.
Token ID13
Feature activation-1.91e+00

Pos logit contributions
ly+0.005
folk+0.004
der+0.004
 ponies+0.004
affe+0.004
Neg logit contributions
 P-0.009
 chess-0.008
 Linda-0.008
 Jelly-0.007
 Ink-0.007
Token She
Token ID1375
Feature activation-4.10e-01

Pos logit contributions
affe+0.062
keys+0.061
ts+0.061
co+0.059
ly+0.059
Neg logit contributions
 P-0.046
 Linda-0.039
 late-0.039
 Sammy-0.037
 Jelly-0.034
Token grabbed
Token ID13176
Feature activation-4.44e+00

Pos logit contributions
folk+0.014
 ponies+0.013
affe+0.013
der+0.012
uted+0.012
Neg logit contributions
 P-0.010
 Linda-0.009
 chess-0.008
 ink-0.007
 Sammy-0.007
Token the
Token ID262
Feature activation-1.63e-01

Pos logit contributions
thia+0.182
der+0.174
anna+0.173
keys+0.170
affe+0.169
Neg logit contributions
 P-0.079
 Sammy-0.063
 Linda-0.060
 late-0.058
 noon-0.057
Token microphone
Token ID21822
Feature activation-3.36e-01

Pos logit contributions
folk+0.004
ly+0.004
 ponies+0.004
thia+0.004
affe+0.004
Neg logit contributions
 P-0.005
 Linda-0.005
 chess-0.004
 Jelly-0.004
 surfing-0.004
Token out
Token ID503
Feature activation-1.08e+00

Pos logit contributions
ly+0.006
der+0.005
folk+0.005
affe+0.005
 ponies+0.005
Neg logit contributions
 P-0.013
 Linda-0.012
 chess-0.012
 Jelly-0.012
 surfing-0.012
Token of
Token ID286
Feature activation-6.67e-01

Pos logit contributions
ly+0.021
folk+0.020
der+0.019
affe+0.019
 ponies+0.019
Neg logit contributions
 P-0.039
 chess-0.038
 Linda-0.037
 Jelly-0.035
 surfing-0.034
Token Frankie
Token ID48535
Feature activation+4.45e-01

Pos logit contributions
folk+0.018
 ponies+0.018
ly+0.017
 bells+0.017
 strangers+0.017
Neg logit contributions
 P-0.018
 Linda-0.017
 chess-0.017
 Jelly-0.015
 surfing-0.015
Token to
Token ID284
Feature activation-2.18e-02

Pos logit contributions
ly+0.028
affe+0.027
der+0.026
 ponies+0.026
folk+0.026
Neg logit contributions
 P-0.034
 Linda-0.033
 chess-0.032
 Jelly-0.029
 surfing-0.029
Token do
Token ID466
Feature activation-1.71e-01

Pos logit contributions
folk+0.001
ly+0.001
der+0.001
uted+0.001
thia+0.001
Neg logit contributions
 P-0.000
 Linda-0.000
 chess-0.000
 Jelly-0.000
 noon-0.000
Token."
Token ID526
Feature activation+3.19e-02

Pos logit contributions
ly+0.004
folk+0.004
thia+0.004
 ponies+0.004
affe+0.004
Neg logit contributions
 P-0.005
 Linda-0.005
 chess-0.005
 Jelly-0.004
 ink-0.004
Token So
Token ID1406
Feature activation-1.20e-01

Pos logit contributions
 P+0.001
 chess+0.001
 Linda+0.001
 surfing+0.001
 ink+0.001
Neg logit contributions
folk-0.001
affe-0.001
ly-0.001
thia-0.001
der-0.001
Token the
Token ID262
Feature activation-2.12e+00

Pos logit contributions
ly+0.003
folk+0.003
affe+0.003
 ponies+0.003
 bells+0.003
Neg logit contributions
 P-0.004
 chess-0.004
 Linda-0.003
 surfing-0.003
ores-0.003
Token three
Token ID1115
Feature activation-4.40e+00

Pos logit contributions
der+0.058
anna+0.057
ously+0.056
urbed+0.052
ly+0.052
Neg logit contributions
 noon-0.058
 P-0.056
 day-0.056
 night-0.055
 morning-0.050
Token of
Token ID286
Feature activation-1.47e+00

Pos logit contributions
arians+0.135
der+0.135
 backs+0.135
 shoulders+0.131
thia+0.129
Neg logit contributions
 day-0.093
 night-0.072
 morning-0.070
 Jill-0.068
 P-0.067
Token them
Token ID606
Feature activation+3.33e-02

Pos logit contributions
thia+0.039
ly+0.039
der+0.038
affe+0.038
anna+0.038
Neg logit contributions
 P-0.044
 Linda-0.041
 chess-0.039
 noon-0.037
 ink-0.036
Token sang
Token ID25889
Feature activation+4.16e-01

Pos logit contributions
 P+0.001
 Linda+0.001
 chess+0.001
 Jelly+0.001
 ink+0.001
Neg logit contributions
folk-0.001
thia-0.001
affe-0.001
 ponies-0.001
ly-0.001
Token together
Token ID1978
Feature activation+5.96e-01

Pos logit contributions
 P+0.013
 Linda+0.012
 chess+0.012
 Jelly+0.011
 ink+0.010
Neg logit contributions
ly-0.010
folk-0.010
 ponies-0.010
affe-0.010
thia-0.010
Token in
Token ID287
Feature activation+6.89e-03

Pos logit contributions
 P+0.022
 Linda+0.021
 chess+0.020
 Jelly+0.019
 Ink+0.019
Neg logit contributions
ly-0.012
der-0.011
folk-0.011
 ponies-0.011
affe-0.010
Token\n
Token ID198
Feature activation+8.51e-01

Pos logit contributions
 P+0.010
 chess+0.010
 Linda+0.010
 ink+0.009
 Jelly+0.009
Neg logit contributions
folk-0.008
thia-0.008
ly-0.008
affe-0.008
 ponies-0.008
Token\n
Token ID198
Feature activation+5.72e-01

Pos logit contributions
 P+0.026
 chess+0.026
 Linda+0.025
 ink+0.023
 noon+0.023
Neg logit contributions
thia-0.019
folk-0.019
Exc-0.019
affe-0.019
anna-0.018
TokenSo
Token ID2396
Feature activation+7.69e-01

Pos logit contributions
 P+0.020
 chess+0.020
 Linda+0.019
 ink+0.018
 noon+0.018
Neg logit contributions
Exc-0.012
folk-0.011
affe-0.010
thia-0.010
 ponies-0.010
Token,
Token ID11
Feature activation+5.42e-01

Pos logit contributions
 chess+0.025
 Linda+0.023
 ink+0.023
ucer+0.022
 Ink+0.022
Neg logit contributions
ly-0.017
folk-0.016
 nodded-0.015
 ponies-0.015
 little-0.015
Token the
Token ID262
Feature activation-2.32e+00

Pos logit contributions
 chess+0.015
 Linda+0.014
 ink+0.014
 P+0.014
ucer+0.014
Neg logit contributions
folk-0.013
 ponies-0.013
ly-0.013
 little-0.012
 gor-0.012
Token three
Token ID1115
Feature activation-4.16e+00

Pos logit contributions
opes+0.047
ously+0.045
thia+0.045
anna+0.044
der+0.043
Neg logit contributions
 night-0.077
 noon-0.077
 P-0.074
 chess-0.073
 Jelly-0.073
Token raven
Token ID37735
Feature activation-2.00e-01

Pos logit contributions
der+0.121
opes+0.120
thia+0.119
arians+0.115
 shoulders+0.115
Neg logit contributions
 day-0.090
 night-0.081
 morning-0.076
 surfing-0.072
 afternoon-0.072
Tokens
Token ID82
Feature activation+5.79e-02

Pos logit contributions
folk+0.004
ly+0.004
thia+0.004
 ponies+0.004
affe+0.004
Neg logit contributions
 P-0.007
 Linda-0.006
 chess-0.006
 Jelly-0.006
 surfing-0.005
Token lined
Token ID16566
Feature activation-7.06e-01

Pos logit contributions
 P+0.002
 Linda+0.002
 chess+0.002
 Jelly+0.002
 ink+0.002
Neg logit contributions
folk-0.001
 ponies-0.001
ly-0.001
affe-0.001
der-0.001
Token up
Token ID510
Feature activation+5.98e-02

Pos logit contributions
 bells+0.015
uted+0.015
ly+0.015
 ponies+0.015
folk+0.014
Neg logit contributions
 P-0.022
 Linda-0.022
 chess-0.021
 ink-0.020
 Ink-0.020
Token,
Token ID11
Feature activation+3.06e-01

Pos logit contributions
 P+0.002
 Linda+0.002
 chess+0.002
ucer+0.002
 Ink+0.002
Neg logit contributions
ly-0.001
der-0.001
 ponies-0.001
 bells-0.001
folk-0.001
TokenSo
Token ID2396
Feature activation+1.02e+00

Pos logit contributions
 P+0.027
 chess+0.027
 Linda+0.026
 ink+0.024
 Jelly+0.024
Neg logit contributions
Exc-0.017
folk-0.015
 ponies-0.015
affe-0.015
thia-0.014
Token the
Token ID262
Feature activation-6.04e-01

Pos logit contributions
 chess+0.034
 Linda+0.030
ores+0.030
ucer+0.030
 ink+0.030
Neg logit contributions
ly-0.022
 nodded-0.020
folk-0.019
 ponies-0.019
 little-0.019
Token princess
Token ID21752
Feature activation-1.23e+00

Pos logit contributions
folk+0.013
thia+0.013
ly+0.013
affe+0.013
der+0.012
Neg logit contributions
 P-0.020
 chess-0.019
 Linda-0.019
 Jelly-0.017
 noon-0.017
Token only
Token ID691
Feature activation-3.50e-01

Pos logit contributions
folk+0.028
 ponies+0.027
thia+0.027
der+0.027
ly+0.027
Neg logit contributions
 P-0.038
 Linda-0.037
 chess-0.036
 Jelly-0.033
 ink-0.032
Token added
Token ID2087
Feature activation-1.60e+00

Pos logit contributions
ly+0.007
folk+0.007
affe+0.007
thia+0.007
 ponies+0.007
Neg logit contributions
 P-0.012
 chess-0.011
 Linda-0.011
 Jelly-0.010
 surfing-0.010
Token five
Token ID1936
Feature activation-4.12e+00

Pos logit contributions
thia+0.050
folk+0.048
affe+0.048
anna+0.047
ly+0.047
Neg logit contributions
 P-0.040
 chess-0.037
 Linda-0.035
 ink-0.032
 Jelly-0.032
Token apples
Token ID22514
Feature activation+1.58e-01

Pos logit contributions
thia+0.133
Flash+0.124
uted+0.121
icity+0.120
lder+0.119
Neg logit contributions
 sauce-0.085
 P-0.083
 chess-0.081
 ink-0.080
 ol-0.080
Token to
Token ID284
Feature activation+2.35e-02

Pos logit contributions
 P+0.005
 Linda+0.005
 chess+0.005
 Jelly+0.004
 Ink+0.004
Neg logit contributions
ly-0.004
folk-0.004
der-0.003
affe-0.003
 ponies-0.003
Token her
Token ID607
Feature activation-1.13e+00

Pos logit contributions
 P+0.001
 chess+0.001
 Linda+0.001
 Jelly+0.000
 ink+0.000
Neg logit contributions
folk-0.001
ly-0.001
thia-0.001
affe-0.001
 ponies-0.001
Token basket
Token ID7988
Feature activation+8.30e-01

Pos logit contributions
folk+0.030
ly+0.029
uted+0.029
Exc+0.028
anna+0.028
Neg logit contributions
 P-0.032
 chess-0.032
 Linda-0.029
 noon-0.029
 Jelly-0.028
Token and
Token ID290
Feature activation+1.86e-01

Pos logit contributions
 P+0.028
 Linda+0.027
 chess+0.026
 Jelly+0.024
 Ink+0.024
Neg logit contributions
ly-0.018
folk-0.017
der-0.017
affe-0.017
 ponies-0.016
Token!
Token ID0
Feature activation+1.78e-01

Pos logit contributions
 P+0.003
 Linda+0.003
 chess+0.003
 Jelly+0.003
 Ink+0.003
Neg logit contributions
ly-0.003
folk-0.002
der-0.002
 ponies-0.002
thia-0.002
Token<|endoftext|>
Token ID50256
Feature activation+5.21e-01

Pos logit contributions
 P+0.005
 chess+0.005
 Linda+0.005
 ink+0.005
 surfing+0.005
Neg logit contributions
folk-0.004
affe-0.004
thia-0.004
der-0.004
ly-0.004
TokenSam
Token ID16305
Feature activation-7.13e-01

Pos logit contributions
 P+0.016
 Linda+0.015
 chess+0.015
 ink+0.014
 Jelly+0.014
Neg logit contributions
folk-0.012
thia-0.012
 ponies-0.012
ly-0.012
affe-0.012
Token was
Token ID373
Feature activation-8.73e-02

Pos logit contributions
folk+0.020
ly+0.020
thia+0.020
affe+0.020
 ponies+0.020
Neg logit contributions
 P-0.019
 chess-0.018
 Linda-0.018
 Jelly-0.016
 surfing-0.015
Token a
Token ID257
Feature activation-7.49e-01

Pos logit contributions
 ponies+0.002
 ent+0.002
thia+0.002
folk+0.002
uted+0.002
Neg logit contributions
 P-0.003
 Linda-0.003
 chess-0.003
ucer-0.003
 Jelly-0.002
Token three
Token ID1115
Feature activation-4.12e+00

Pos logit contributions
folk+0.014
ly+0.013
der+0.013
thia+0.013
affe+0.013
Neg logit contributions
 P-0.028
 chess-0.027
 Linda-0.026
 Jelly-0.024
 surfing-0.024
Token year
Token ID614
Feature activation+2.06e-02

Pos logit contributions
thia+0.114
der+0.111
uted+0.108
opes+0.108
arians+0.108
Neg logit contributions
 day-0.096
 chess-0.093
 night-0.090
 pasta-0.089
 Linda-0.088
Token old
Token ID1468
Feature activation-1.16e+00

Pos logit contributions
 P+0.001
 chess+0.001
 Linda+0.001
 Jelly+0.001
 surfing+0.001
Neg logit contributions
folk-0.000
thia-0.000
affe-0.000
 ponies-0.000
ly-0.000
Token boy
Token ID2933
Feature activation-9.90e-01

Pos logit contributions
thia+0.028
folk+0.026
affe+0.026
keys+0.026
anna+0.025
Neg logit contributions
 P-0.036
 chess-0.035
 Linda-0.034
 Jelly-0.031
 surfing-0.030
Token and
Token ID290
Feature activation-9.92e-01

Pos logit contributions
ly+0.013
 ponies+0.012
folk+0.012
anna+0.012
 strangers+0.011
Neg logit contributions
 P-0.039
 Linda-0.039
 chess-0.038
ucer-0.037
 noon-0.036
Token he
Token ID339
Feature activation-7.16e-01

Pos logit contributions
folk+0.026
ly+0.026
 ponies+0.026
thia+0.026
affe+0.025
Neg logit contributions
 P-0.027
 chess-0.026
 Linda-0.026
 Jelly-0.023
 ink-0.023
Token something
Token ID1223
Feature activation-1.03e+00

Pos logit contributions
ly+0.008
folk+0.008
 little+0.007
der+0.007
 strangers+0.007
Neg logit contributions
 P-0.010
 Linda-0.010
 chess-0.010
etchup-0.010
ained-0.010
Token miraculous
Token ID40336
Feature activation-7.81e-01

Pos logit contributions
folk+0.027
thia+0.026
 ponies+0.026
anna+0.025
affe+0.025
Neg logit contributions
 P-0.030
 Linda-0.029
 chess-0.029
 ink-0.026
 noon-0.025
Token happened
Token ID3022
Feature activation+9.41e-01

Pos logit contributions
folk+0.019
thia+0.019
 ponies+0.018
affe+0.018
ly+0.018
Neg logit contributions
 P-0.024
 Linda-0.023
 chess-0.023
 Jelly-0.020
 ink-0.020
Token!
Token ID0
Feature activation-6.12e-01

Pos logit contributions
 P+0.033
 Linda+0.032
 chess+0.031
 Ink+0.030
 Jelly+0.029
Neg logit contributions
ly-0.018
der-0.018
folk-0.016
thia-0.016
uted-0.016
Token A
Token ID317
Feature activation-2.96e-02

Pos logit contributions
ly+0.015
folk+0.015
der+0.015
affe+0.015
thia+0.015
Neg logit contributions
 P-0.018
 Linda-0.017
 chess-0.017
 surfing-0.015
 Jelly-0.015
Token 3
Token ID513
Feature activation-4.10e+00

Pos logit contributions
ly+0.001
anna+0.001
der+0.001
folk+0.001
uted+0.001
Neg logit contributions
 P-0.001
 chess-0.001
 Linda-0.001
 ink-0.001
 Jelly-0.001
Token year
Token ID614
Feature activation-1.49e-02

Pos logit contributions
der+0.142
arians+0.139
opes+0.132
 backs+0.132
 wrists+0.128
Neg logit contributions
 day-0.076
 night-0.070
 Linda-0.065
 Jill-0.065
 sorts-0.059
Token old
Token ID1468
Feature activation-8.08e-01

Pos logit contributions
thia+0.000
folk+0.000
uted+0.000
 ponies+0.000
der+0.000
Neg logit contributions
 P-0.000
 Linda-0.000
 chess-0.000
 ink-0.000
 late-0.000
Token child
Token ID1200
Feature activation-8.73e-01

Pos logit contributions
anna+0.019
thia+0.018
der+0.018
uted+0.018
opes+0.018
Neg logit contributions
 P-0.025
 Linda-0.025
 chess-0.025
 ink-0.022
 Jelly-0.021
Token saw
Token ID2497
Feature activation-7.28e-01

Pos logit contributions
der+0.023
 ponies+0.022
folk+0.022
thia+0.022
affe+0.021
Neg logit contributions
 P-0.025
 Linda-0.025
 chess-0.023
 ink-0.021
 Jelly-0.020
Token the
Token ID262
Feature activation-1.98e+00

Pos logit contributions
folk+0.021
ly+0.020
thia+0.020
affe+0.020
 ponies+0.020
Neg logit contributions
 P-0.019
 chess-0.018
 Linda-0.018
 Jelly-0.016
 surfing-0.016
Token every
Token ID790
Feature activation-2.70e+00

Pos logit contributions
folk+0.019
affe+0.019
thia+0.019
 ponies+0.018
Exc+0.018
Neg logit contributions
 P-0.019
 chess-0.018
 Linda-0.017
 Jelly-0.016
 ink-0.016
Token morning
Token ID3329
Feature activation-3.41e-01

Pos logit contributions
folk+0.117
thia+0.117
ly+0.116
affe+0.116
der+0.115
Neg logit contributions
 P-0.028
 chess-0.025
 Linda-0.024
 noon-0.017
 Jelly-0.016
Token.
Token ID13
Feature activation-1.43e-01

Pos logit contributions
ly+0.008
folk+0.008
 ponies+0.008
thia+0.008
affe+0.008
Neg logit contributions
 P-0.011
 Linda-0.010
 chess-0.010
 Jelly-0.009
 Ink-0.009
Token Every
Token ID3887
Feature activation-9.01e-01

Pos logit contributions
folk+0.003
thia+0.003
affe+0.003
 ponies+0.003
 bells+0.003
Neg logit contributions
 P-0.005
 Linda-0.005
 chess-0.004
ucer-0.004
 surfing-0.004
Token morning
Token ID3329
Feature activation+4.66e-01

Pos logit contributions
ly+0.033
affe+0.032
 ponies+0.032
anna+0.031
 bell+0.031
Neg logit contributions
 P-0.018
 Linda-0.017
 chess-0.016
 Jak-0.013
 Ink-0.013
Token the
Token ID262
Feature activation-4.10e+00

Pos logit contributions
 P+0.016
 Linda+0.016
 chess+0.016
 surfing+0.014
 Ink+0.014
Neg logit contributions
ly-0.009
folk-0.009
affe-0.008
 ponies-0.008
 bells-0.008
Token people
Token ID661
Feature activation-8.63e-01

Pos logit contributions
der+0.093
affe+0.090
arians+0.090
ously+0.086
opes+0.084
Neg logit contributions
 night-0.137
 day-0.131
 noon-0.126
 morning-0.124
 evening-0.121
Token in
Token ID287
Feature activation-4.36e-01

Pos logit contributions
thia+0.020
affe+0.020
folk+0.020
der+0.019
ly+0.019
Neg logit contributions
 P-0.026
 chess-0.025
 Linda-0.025
 surfing-0.023
 noon-0.023
Token the
Token ID262
Feature activation-1.91e+00

Pos logit contributions
folk+0.012
 ponies+0.012
 bells+0.011
 strangers+0.011
ly+0.011
Neg logit contributions
 P-0.012
 Linda-0.012
 chess-0.011
 Jelly-0.010
 surfing-0.010
Token house
Token ID2156
Feature activation+6.00e-01

Pos logit contributions
affe+0.056
der+0.056
ly+0.055
folk+0.054
keys+0.054
Neg logit contributions
 P-0.048
 chess-0.046
 noon-0.045
 night-0.043
 Linda-0.042
Token would
Token ID561
Feature activation-7.39e-01

Pos logit contributions
 P+0.021
 chess+0.020
 Linda+0.020
 Jelly+0.018
 ink+0.018
Neg logit contributions
folk-0.012
ly-0.012
thia-0.011
 ponies-0.011
affe-0.011
Token it
Token ID340
Feature activation-3.22e-02

Pos logit contributions
 P+0.008
 Linda+0.008
ained+0.007
 chess+0.007
ennis+0.007
Neg logit contributions
 bells-0.005
 ponies-0.005
 little-0.005
folk-0.005
 bell-0.005
Token spread
Token ID4104
Feature activation+1.36e+00

Pos logit contributions
folk+0.001
ly+0.001
thia+0.001
 ponies+0.001
affe+0.001
Neg logit contributions
 P-0.001
 Linda-0.001
 chess-0.001
 Jelly-0.001
 ink-0.001
Token out
Token ID503
Feature activation+4.86e-01

Pos logit contributions
 P+0.040
 chess+0.039
 Linda+0.038
 surfing+0.037
 Jelly+0.035
Neg logit contributions
ly-0.034
 bells-0.031
der-0.030
affe-0.029
folk-0.029
Token of
Token ID286
Feature activation+3.43e-02

Pos logit contributions
 P+0.016
 chess+0.015
 Linda+0.015
 Jelly+0.014
ennis+0.014
Neg logit contributions
ly-0.011
 bells-0.010
der-0.010
 ponies-0.010
folk-0.010
Token the
Token ID262
Feature activation-1.25e+00

Pos logit contributions
 P+0.001
 Linda+0.001
ucer+0.001
 Jelly+0.001
ennis+0.001
Neg logit contributions
 ponies-0.001
 bells-0.001
 bicycles-0.001
 strangers-0.001
 ow-0.001
Token 3
Token ID513
Feature activation-4.08e+00

Pos logit contributions
keys+0.033
thia+0.033
anna+0.032
uted+0.032
folk+0.031
Neg logit contributions
 chess-0.036
 P-0.035
 ink-0.033
 Linda-0.033
 noon-0.033
Token year
Token ID614
Feature activation+5.96e-01

Pos logit contributions
der+0.128
arians+0.113
thia+0.111
uted+0.109
 blamed+0.108
Neg logit contributions
 day-0.098
 afternoon-0.090
 night-0.087
 sauce-0.086
 pasta-0.085
Token old
Token ID1468
Feature activation-1.23e-01

Pos logit contributions
 P+0.021
 chess+0.020
 Linda+0.020
 ink+0.018
 late+0.018
Neg logit contributions
folk-0.012
thia-0.012
affe-0.011
 ponies-0.011
uted-0.011
Token's
Token ID338
Feature activation-2.23e-01

Pos logit contributions
folk+0.003
ly+0.003
 ponies+0.003
thia+0.003
affe+0.003
Neg logit contributions
 P-0.004
 Linda-0.004
 chess-0.004
 Jelly-0.004
 surfing-0.004
Token house
Token ID2156
Feature activation+8.05e-01

Pos logit contributions
uted+0.005
keys+0.005
folk+0.005
anna+0.005
Exc+0.005
Neg logit contributions
 chess-0.007
 P-0.007
 Linda-0.006
 ink-0.006
 noon-0.006
Token,
Token ID11
Feature activation+4.32e-01

Pos logit contributions
 P+0.027
 Linda+0.025
 chess+0.025
 Jelly+0.023
 surfing+0.022
Neg logit contributions
ly-0.018
folk-0.017
der-0.017
thia-0.017
 ponies-0.016
Token.
Token ID13
Feature activation+1.31e-01

Pos logit contributions
 P+0.002
 Linda+0.002
 chess+0.002
 Ink+0.002
 Jelly+0.002
Neg logit contributions
ly-0.001
der-0.001
 ponies-0.001
affe-0.001
folk-0.001
Token He
Token ID679
Feature activation-4.10e-01

Pos logit contributions
 P+0.004
 chess+0.004
 Linda+0.004
 ink+0.004
 surfing+0.004
Neg logit contributions
folk-0.003
affe-0.003
thia-0.003
 ponies-0.003
Exc-0.003
Token watched
Token ID7342
Feature activation-1.49e+00

Pos logit contributions
folk+0.010
 ponies+0.010
ly+0.010
affe+0.009
der+0.009
Neg logit contributions
 P-0.013
 Linda-0.012
 chess-0.012
 Jelly-0.011
 surfing-0.011
Token them
Token ID606
Feature activation+6.65e-01

Pos logit contributions
ly+0.033
folk+0.033
affe+0.032
 ponies+0.032
thia+0.032
Neg logit contributions
 P-0.048
 chess-0.046
 Linda-0.046
 Jelly-0.041
 ink-0.040
Token in
Token ID287
Feature activation-6.48e-02

Pos logit contributions
 P+0.020
 Linda+0.020
 chess+0.019
 ink+0.018
 Ink+0.018
Neg logit contributions
ly-0.016
 ponies-0.015
affe-0.015
folk-0.015
 bells-0.015
Token amaz
Token ID40642
Feature activation-3.98e+00

Pos logit contributions
 ponies+0.001
folk+0.001
 bells+0.001
ly+0.001
 strangers+0.001
Neg logit contributions
 P-0.002
 Linda-0.002
 chess-0.002
 Jelly-0.002
 surfing-0.002
Tokenement
Token ID972
Feature activation-1.01e+00

Pos logit contributions
Mic+0.095
affe+0.094
thia+0.094
keys+0.093
 chicks+0.092
Neg logit contributions
 Linda-0.107
 P-0.106
 Jill-0.105
 chess-0.105
 Jenny-0.101
Token and
Token ID290
Feature activation+6.36e-01

Pos logit contributions
folk+0.024
ly+0.024
thia+0.024
affe+0.023
 ponies+0.023
Neg logit contributions
 P-0.031
 chess-0.030
 Linda-0.030
 Jelly-0.027
 surfing-0.026
Token even
Token ID772
Feature activation-8.42e-01

Pos logit contributions
 P+0.020
 Linda+0.019
 chess+0.019
 Jelly+0.017
 surfing+0.017
Neg logit contributions
folk-0.015
affe-0.015
ly-0.015
thia-0.014
 ponies-0.014
Token laughed
Token ID13818
Feature activation-1.11e-01

Pos logit contributions
thia+0.020
affe+0.020
 ponies+0.020
folk+0.020
der+0.019
Neg logit contributions
 P-0.027
 Linda-0.025
 chess-0.025
 Jelly-0.023
 noon-0.022
Token a
Token ID257
Feature activation+2.60e-01

Pos logit contributions
ly+0.003
der+0.002
folk+0.002
affe+0.002
 ponies+0.002
Neg logit contributions
 P-0.004
 Linda-0.003
 chess-0.003
 Ink-0.003
 ink-0.003
TokenThe
Token ID464
Feature activation-1.62e+00

Pos logit contributions
Exc+0.000
folk+0.000
affe+0.000
thia+0.000
 ponies+0.000
Neg logit contributions
 P-0.000
 chess-0.000
 Linda-0.000
 noon-0.000
 ink-0.000
Token queen
Token ID16599
Feature activation-9.68e-01

Pos logit contributions
uted+0.041
anna+0.040
thia+0.040
der+0.040
gging+0.040
Neg logit contributions
 P-0.047
 chess-0.047
 Linda-0.045
 noon-0.043
 night-0.041
Token declared
Token ID6875
Feature activation-1.19e-01

Pos logit contributions
ly+0.022
 ponies+0.022
folk+0.021
thia+0.021
 strangers+0.021
Neg logit contributions
 P-0.030
 Linda-0.029
 chess-0.029
 Jelly-0.026
 ink-0.025
Token that
Token ID326
Feature activation+4.39e-01

Pos logit contributions
ly+0.003
folk+0.003
 ponies+0.003
affe+0.003
der+0.003
Neg logit contributions
 P-0.004
 Linda-0.003
 chess-0.003
 Jelly-0.003
 noon-0.003
Token the
Token ID262
Feature activation-1.50e+00

Pos logit contributions
 P+0.011
 Linda+0.011
 chess+0.011
ores+0.010
ucer+0.010
Neg logit contributions
 ponies-0.012
folk-0.012
der-0.011
ly-0.011
 strangers-0.011
Token three
Token ID1115
Feature activation-3.95e+00

Pos logit contributions
anna+0.035
thia+0.035
ly+0.035
uted+0.035
affe+0.035
Neg logit contributions
 P-0.046
 chess-0.045
 Linda-0.043
 noon-0.040
 Jelly-0.040
Token year
Token ID614
Feature activation-3.37e-01

Pos logit contributions
gging+0.104
uted+0.104
der+0.101
Mic+0.100
thia+0.100
Neg logit contributions
 Jill-0.092
 Linda-0.089
 day-0.088
 chess-0.082
 P-0.081
Token old
Token ID1468
Feature activation-6.58e-01

Pos logit contributions
folk+0.006
thia+0.006
 ponies+0.006
affe+0.006
ly+0.006
Neg logit contributions
 P-0.012
 chess-0.012
 Linda-0.012
 Jelly-0.011
 ink-0.011
Token was
Token ID373
Feature activation+3.27e-01

Pos logit contributions
thia+0.015
folk+0.015
ly+0.015
affe+0.015
 ponies+0.015
Neg logit contributions
 P-0.021
 chess-0.020
 Linda-0.020
 ink-0.018
 Jelly-0.018
Token the
Token ID262
Feature activation-6.25e-01

Pos logit contributions
 P+0.010
 chess+0.009
 Linda+0.009
ucer+0.009
 Jelly+0.008
Neg logit contributions
thia-0.007
folk-0.007
 ponies-0.007
 bells-0.007
 ent-0.007
Token special
Token ID2041
Feature activation-9.94e-01

Pos logit contributions
folk+0.015
 ponies+0.015
affe+0.014
ly+0.014
 bells+0.014
Neg logit contributions
 P-0.020
 Linda-0.019
 chess-0.018
ores-0.017
 surfing-0.016
Token brother
Token ID3956
Feature activation-2.36e-01

Pos logit contributions
folk+0.003
thia+0.003
ly+0.003
affe+0.003
anna+0.003
Neg logit contributions
 P-0.005
 chess-0.005
 Linda-0.005
 ink-0.004
 Jelly-0.004
Token.
Token ID13
Feature activation-1.52e-02

Pos logit contributions
ly+0.005
folk+0.005
 ponies+0.005
affe+0.005
thia+0.004
Neg logit contributions
 P-0.008
 Linda-0.008
 chess-0.007
 noon-0.007
ores-0.007
Token
Token ID198
Feature activation+3.34e-01

Pos logit contributions
folk+0.000
ly+0.000
thia+0.000
affe+0.000
 ponies+0.000
Neg logit contributions
 P-0.000
 chess-0.000
 Linda-0.000
 Jelly-0.000
 surfing-0.000
Token
Token ID198
Feature activation+7.17e-02

Pos logit contributions
 P+0.009
 Linda+0.008
 chess+0.008
 Jelly+0.007
 surfing+0.007
Neg logit contributions
ly-0.010
folk-0.010
der-0.010
 ponies-0.009
affe-0.009
TokenThe
Token ID464
Feature activation-1.60e+00

Pos logit contributions
 P+0.002
 chess+0.002
 Linda+0.002
 Jelly+0.002
 ink+0.002
Neg logit contributions
folk-0.002
ly-0.002
thia-0.001
affe-0.001
 ponies-0.001
Token three
Token ID1115
Feature activation-3.95e+00

Pos logit contributions
der+0.039
anna+0.039
uted+0.039
thia+0.039
aron+0.039
Neg logit contributions
 P-0.048
 chess-0.048
 Linda-0.046
 noon-0.043
 ink-0.041
Token year
Token ID614
Feature activation-3.52e-01

Pos logit contributions
der+0.117
 backs+0.115
 prope+0.115
 shoulders+0.112
gging+0.112
Neg logit contributions
 Jill-0.083
 day-0.078
 Linda-0.078
 night-0.068
 P-0.067
Token old
Token ID1468
Feature activation-6.57e-01

Pos logit contributions
thia+0.007
folk+0.007
affe+0.007
 ponies+0.007
der+0.007
Neg logit contributions
 P-0.012
 chess-0.012
 Linda-0.012
 ink-0.011
 Jelly-0.010
Token boy
Token ID2933
Feature activation-3.40e-02

Pos logit contributions
thia+0.014
der+0.014
folk+0.014
anna+0.014
 ponies+0.014
Neg logit contributions
 P-0.021
 Linda-0.020
 chess-0.020
 Jelly-0.018
 ink-0.018
Token began
Token ID2540
Feature activation+3.06e-01

Pos logit contributions
folk+0.001
ly+0.001
 ponies+0.001
affe+0.001
thia+0.001
Neg logit contributions
 P-0.001
 chess-0.001
 Linda-0.001
 Jelly-0.001
 ink-0.001
Token to
Token ID284
Feature activation+5.50e-01

Pos logit contributions
 P+0.009
 Linda+0.009
 chess+0.008
ores+0.008
ucer+0.008
Neg logit contributions
 ponies-0.007
folk-0.007
thia-0.007
ly-0.007
 bells-0.007
Token on
Token ID319
Feature activation-1.38e+00

Pos logit contributions
ly+0.010
folk+0.010
 ponies+0.010
affe+0.010
thia+0.009
Neg logit contributions
 P-0.015
 Linda-0.014
 chess-0.014
 Jelly-0.012
 Ink-0.012
Token farms
Token ID15893
Feature activation+4.54e-01

Pos logit contributions
folk+0.038
ly+0.038
thia+0.037
affe+0.037
 ponies+0.037
Neg logit contributions
 P-0.038
 chess-0.035
 Linda-0.035
 Jelly-0.031
 ink-0.030
Token.
Token ID13
Feature activation-8.29e-02

Pos logit contributions
 P+0.015
 Linda+0.014
 chess+0.014
 Jelly+0.013
 Ink+0.012
Neg logit contributions
folk-0.010
ly-0.010
 ponies-0.010
affe-0.010
thia-0.010
Token They
Token ID1119
Feature activation-6.71e-01

Pos logit contributions
folk+0.002
affe+0.002
 ponies+0.002
thia+0.002
ly+0.002
Neg logit contributions
 P-0.002
 chess-0.002
 Linda-0.002
 surfing-0.002
 ink-0.002
Token have
Token ID423
Feature activation-6.83e-01

Pos logit contributions
folk+0.018
ly+0.018
thia+0.018
affe+0.017
 ponies+0.017
Neg logit contributions
 P-0.019
 chess-0.018
 Linda-0.018
 Jelly-0.016
 ink-0.015
Token four
Token ID1440
Feature activation-3.93e+00

Pos logit contributions
folk+0.019
ly+0.019
affe+0.019
thia+0.019
der+0.018
Neg logit contributions
 P-0.018
 chess-0.017
 Linda-0.017
 Jelly-0.015
 ink-0.015
Token legs
Token ID7405
Feature activation-8.87e-01

Pos logit contributions
uted+0.102
thia+0.100
folk+0.099
Rober+0.097
ternally+0.096
Neg logit contributions
 sauce-0.091
etchup-0.090
 salt-0.087
 ink-0.085
 P-0.084
Token,
Token ID11
Feature activation-4.47e-01

Pos logit contributions
folk+0.019
thia+0.019
ly+0.019
 ponies+0.018
affe+0.018
Neg logit contributions
 P-0.030
 chess-0.028
 Linda-0.028
 Jelly-0.025
 ink-0.025
Token a
Token ID257
Feature activation-1.46e+00

Pos logit contributions
ly+0.011
folk+0.011
 ponies+0.010
thia+0.010
affe+0.010
Neg logit contributions
 P-0.014
 Linda-0.013
 chess-0.013
 Jelly-0.012
 surfing-0.011
Token tail
Token ID7894
Feature activation-5.35e-01

Pos logit contributions
folk+0.033
ly+0.033
thia+0.032
der+0.032
affe+0.031
Neg logit contributions
 P-0.047
 chess-0.045
 Linda-0.044
 Jelly-0.040
 ink-0.040
Token,
Token ID11
Feature activation-5.23e-01

Pos logit contributions
ly+0.011
folk+0.011
affe+0.011
 ponies+0.010
thia+0.010
Neg logit contributions
 P-0.019
 Linda-0.018
 chess-0.018
 Jelly-0.016
 surfing-0.016
Token shouted
Token ID17293
Feature activation-3.24e-01

Pos logit contributions
folk+0.005
ly+0.005
thia+0.005
affe+0.005
 ponies+0.005
Neg logit contributions
 P-0.009
 chess-0.008
 Linda-0.008
 Jelly-0.008
 surfing-0.007
Token.
Token ID13
Feature activation-5.80e-01

Pos logit contributions
ly+0.007
folk+0.007
der+0.007
affe+0.006
thia+0.006
Neg logit contributions
 P-0.011
 Linda-0.010
 chess-0.010
 surfing-0.009
 Ink-0.009
Token
Token ID220
Feature activation-1.07e+00

Pos logit contributions
folk+0.013
thia+0.012
affe+0.012
ly+0.012
der+0.012
Neg logit contributions
 chess-0.018
 P-0.018
 Linda-0.017
 ink-0.016
 surfing-0.016
Token
Token ID198
Feature activation-3.07e-01

Pos logit contributions
folk+0.027
ly+0.027
thia+0.027
affe+0.026
 ponies+0.026
Neg logit contributions
 P-0.031
 chess-0.030
 Linda-0.029
 Jelly-0.026
 ink-0.026
TokenThe
Token ID464
Feature activation-1.28e+00

Pos logit contributions
folk+0.007
thia+0.007
affe+0.007
ly+0.007
 ponies+0.007
Neg logit contributions
 P-0.009
 chess-0.009
 Linda-0.009
 Jelly-0.008
 ink-0.008
Token three
Token ID1115
Feature activation-3.92e+00

Pos logit contributions
uted+0.029
thia+0.028
anna+0.028
der+0.028
ly+0.026
Neg logit contributions
 P-0.042
 chess-0.040
 Linda-0.037
 ink-0.037
 noon-0.036
Token little
Token ID1310
Feature activation-7.66e-01

Pos logit contributions
der+0.106
thia+0.102
uted+0.098
 meanings+0.097
 shoulders+0.096
Neg logit contributions
 P-0.084
 day-0.082
etchup-0.081
 Linda-0.081
 cup-0.076
Token pigs
Token ID22333
Feature activation-4.04e-01

Pos logit contributions
thia+0.017
der+0.017
anna+0.017
uted+0.016
ly+0.016
Neg logit contributions
 P-0.026
 chess-0.024
 Linda-0.024
 Jelly-0.022
 ink-0.022
Token started
Token ID2067
Feature activation-8.80e-01

Pos logit contributions
 ponies+0.009
ly+0.009
thia+0.009
affe+0.009
der+0.009
Neg logit contributions
 P-0.013
 chess-0.012
 Linda-0.012
 surfing-0.011
 noon-0.011
Token to
Token ID284
Feature activation-3.01e-01

Pos logit contributions
folk+0.024
ly+0.023
thia+0.023
 ponies+0.023
affe+0.023
Neg logit contributions
 P-0.025
 Linda-0.023
 chess-0.023
 Jelly-0.020
 ink-0.020
Token measure
Token ID3953
Feature activation+2.19e-01

Pos logit contributions
ly+0.008
folk+0.008
affe+0.008
 ponies+0.008
thia+0.008
Neg logit contributions
 P-0.008
 chess-0.008
 Linda-0.008
 Jelly-0.007
 noon-0.007
Token neatly
Token ID29776
Feature activation+3.37e-01

Pos logit contributions
 P+0.002
 chess+0.002
 Linda+0.002
 Jelly+0.002
 ink+0.002
Neg logit contributions
folk-0.002
ly-0.002
thia-0.002
affe-0.002
 ponies-0.002
Token.
Token ID13
Feature activation-1.82e-01

Pos logit contributions
 P+0.010
 Linda+0.009
 chess+0.009
 Jelly+0.008
 surfing+0.008
Neg logit contributions
ly-0.009
folk-0.009
der-0.008
 ponies-0.008
thia-0.008
Token
Token ID198
Feature activation-2.82e-03

Pos logit contributions
folk+0.004
ly+0.004
thia+0.004
affe+0.004
 ponies+0.004
Neg logit contributions
 P-0.006
 Linda-0.005
 chess-0.005
 Jelly-0.005
 noon-0.005
Token
Token ID198
Feature activation-4.94e-01

Pos logit contributions
folk+0.000
ly+0.000
 ponies+0.000
der+0.000
affe+0.000
Neg logit contributions
 P-0.000
 Linda-0.000
 chess-0.000
 Jelly-0.000
 noon-0.000
TokenThe
Token ID464
Feature activation-2.05e+00

Pos logit contributions
folk+0.012
ly+0.012
thia+0.012
affe+0.012
 ponies+0.012
Neg logit contributions
 P-0.015
 chess-0.014
 Linda-0.014
 Jelly-0.013
 Ink-0.012
Token three
Token ID1115
Feature activation-3.91e+00

Pos logit contributions
folk+0.059
opes+0.056
anna+0.054
uted+0.054
der+0.053
Neg logit contributions
 noon-0.053
 chess-0.051
 P-0.051
 night-0.050
 ink-0.049
Token-
Token ID12
Feature activation-1.93e+00

Pos logit contributions
opes+0.126
Mic+0.124
arians+0.120
 backs+0.120
der+0.119
Neg logit contributions
 day-0.077
 Jill-0.071
 Linda-0.068
 cup-0.065
 night-0.065
Tokenyear
Token ID1941
Feature activation-3.05e+00

Pos logit contributions
folk+0.049
thia+0.048
der+0.048
 ponies+0.047
ly+0.046
Neg logit contributions
 P-0.056
 Linda-0.053
 chess-0.051
 Jelly-0.046
 late-0.046
Token-
Token ID12
Feature activation-1.56e+00

Pos logit contributions
folk+0.090
arians+0.088
opes+0.086
 blamed+0.086
Exc+0.085
Neg logit contributions
 P-0.065
 late-0.062
 night-0.062
 day-0.061
 ink-0.061
Tokenold
Token ID727
Feature activation-1.18e+00

Pos logit contributions
folk+0.027
der+0.027
thia+0.026
 ponies+0.026
ly+0.025
Neg logit contributions
 P-0.057
 Linda-0.055
 chess-0.054
 late-0.049
 Jelly-0.049
Token girl
Token ID2576
Feature activation-4.24e-01

Pos logit contributions
folk+0.026
thia+0.026
 ponies+0.026
der+0.026
affe+0.026
Neg logit contributions
 P-0.038
 Linda-0.036
 chess-0.036
 Jelly-0.032
 ink-0.032
Token<|endoftext|>
Token ID50256
Feature activation+7.25e-01

Pos logit contributions
 P+0.007
 chess+0.006
 Linda+0.006
 Jelly+0.006
 surfing+0.006
Neg logit contributions
folk-0.005
thia-0.005
affe-0.005
ly-0.005
 ponies-0.005
TokenJohn
Token ID7554
Feature activation-2.87e-01

Pos logit contributions
 P+0.022
 Linda+0.022
 chess+0.021
 surfing+0.020
 ink+0.019
Neg logit contributions
folk-0.016
thia-0.015
 ponies-0.015
ly-0.015
affe-0.015
Token was
Token ID373
Feature activation+9.46e-02

Pos logit contributions
ly+0.005
thia+0.005
anna+0.005
affe+0.005
 ponies+0.005
Neg logit contributions
 P-0.011
 chess-0.010
 Linda-0.010
 Jelly-0.010
 Ink-0.009
Token a
Token ID257
Feature activation-4.78e-01

Pos logit contributions
 P+0.003
 Linda+0.003
ucer+0.003
 chess+0.003
 Jelly+0.003
Neg logit contributions
 ponies-0.002
 ent-0.002
uted-0.002
thia-0.002
anna-0.002
Token friendly
Token ID8030
Feature activation+9.63e-02

Pos logit contributions
folk+0.009
ly+0.009
thia+0.009
affe+0.008
der+0.008
Neg logit contributions
 P-0.017
 chess-0.017
 Linda-0.017
 Jelly-0.015
 surfing-0.015
Token three
Token ID1115
Feature activation-3.90e+00

Pos logit contributions
 P+0.003
 chess+0.003
 Linda+0.003
 Jelly+0.003
 surfing+0.003
Neg logit contributions
ly-0.002
folk-0.002
affe-0.002
 ponies-0.002
thia-0.002
Token year
Token ID614
Feature activation-5.22e-01

Pos logit contributions
thia+0.108
uted+0.101
opes+0.099
der+0.099
Flash+0.098
Neg logit contributions
 P-0.092
 day-0.092
 chess-0.091
 Linda-0.091
 night-0.087
Token old
Token ID1468
Feature activation-9.15e-01

Pos logit contributions
folk+0.010
thia+0.010
affe+0.010
 ponies+0.010
uted+0.010
Neg logit contributions
 P-0.018
 chess-0.018
 Linda-0.017
 late-0.016
 surfing-0.016
Token.
Token ID13
Feature activation-1.61e+00

Pos logit contributions
thia+0.019
folk+0.018
affe+0.018
ly+0.018
keys+0.018
Neg logit contributions
 P-0.031
 chess-0.030
 Linda-0.029
 Jelly-0.027
 surfing-0.026
Token He
Token ID679
Feature activation-6.48e-01

Pos logit contributions
ly+0.043
keys+0.042
affe+0.042
 ponies+0.042
folk+0.041
Neg logit contributions
 P-0.046
 chess-0.043
 Linda-0.042
 noon-0.039
 late-0.038
Token went
Token ID1816
Feature activation-3.56e-01

Pos logit contributions
ly+0.008
thia+0.008
folk+0.008
 ponies+0.008
affe+0.008
Neg logit contributions
 P-0.027
 Linda-0.026
 chess-0.026
 Jelly-0.024
 Ink-0.024
Token decided
Token ID3066
Feature activation-3.62e-02

Pos logit contributions
folk+0.005
der+0.005
thia+0.005
affe+0.005
 ponies+0.005
Neg logit contributions
 P-0.007
 Linda-0.007
 chess-0.007
 Jelly-0.006
 ink-0.006
Token to
Token ID284
Feature activation-3.92e-02

Pos logit contributions
ly+0.001
folk+0.001
thia+0.001
 ponies+0.001
der+0.001
Neg logit contributions
 P-0.001
 Linda-0.001
 chess-0.001
 Jelly-0.001
 Ink-0.001
Token take
Token ID1011
Feature activation-5.66e-01

Pos logit contributions
folk+0.001
ly+0.001
thia+0.001
 ponies+0.001
affe+0.001
Neg logit contributions
 P-0.001
 chess-0.001
 Linda-0.001
 Jelly-0.001
 surfing-0.001
Token their
Token ID511
Feature activation-1.60e+00

Pos logit contributions
ly+0.017
folk+0.017
 ponies+0.017
thia+0.017
affe+0.017
Neg logit contributions
 P-0.014
 Linda-0.013
 chess-0.013
 Jelly-0.011
 surfing-0.011
Token polite
Token ID23507
Feature activation-8.55e-01

Pos logit contributions
affe+0.041
folk+0.041
der+0.041
thia+0.040
ly+0.040
Neg logit contributions
 P-0.045
 chess-0.044
 Linda-0.042
 noon-0.039
 Jelly-0.039
Token three
Token ID1115
Feature activation-3.90e+00

Pos logit contributions
folk+0.020
thia+0.020
affe+0.020
der+0.020
ly+0.020
Neg logit contributions
 P-0.026
 chess-0.025
 Linda-0.024
 Jelly-0.022
 ink-0.022
Token year
Token ID614
Feature activation-1.11e-01

Pos logit contributions
thia+0.096
der+0.095
uted+0.093
folk+0.092
 blamed+0.090
Neg logit contributions
 chess-0.109
 night-0.103
 ink-0.102
 P-0.100
 Linda-0.098
Token old
Token ID1468
Feature activation-6.68e-01

Pos logit contributions
folk+0.002
thia+0.002
affe+0.002
uted+0.002
 ponies+0.002
Neg logit contributions
 P-0.004
 chess-0.003
 Linda-0.003
 ink-0.003
 Jelly-0.003
Token on
Token ID319
Feature activation-4.95e-01

Pos logit contributions
folk+0.015
ly+0.014
thia+0.014
affe+0.014
 ponies+0.014
Neg logit contributions
 P-0.022
 chess-0.021
 Linda-0.021
 Jelly-0.019
 ink-0.018
Token a
Token ID257
Feature activation+3.87e-01

Pos logit contributions
 ponies+0.013
 strangers+0.013
 bells+0.013
thia+0.012
folk+0.012
Neg logit contributions
 P-0.014
 Linda-0.014
 chess-0.013
ucer-0.012
 Jelly-0.012
Token picnic
Token ID35715
Feature activation+1.42e-01

Pos logit contributions
 P+0.012
 Linda+0.011
 chess+0.011
 Jelly+0.010
 surfing+0.009
Neg logit contributions
folk-0.010
ly-0.010
thia-0.009
affe-0.009
 ponies-0.009
Token€
Token ID26391
Feature activation-5.56e-01

Pos logit contributions
folk+0.011
ly+0.011
thia+0.011
affe+0.011
 ponies+0.010
Neg logit contributions
 P-0.012
 chess-0.011
 Linda-0.011
 Jelly-0.010
 ink-0.010
Token�
Token ID447
Feature activation+3.66e-01

Pos logit contributions
ly+0.016
der+0.016
 ponies+0.015
 strangers+0.014
folk+0.014
Neg logit contributions
 P-0.015
 Linda-0.014
 chess-0.014
 Jazz-0.013
 surfing-0.013
Token�
Token ID250
Feature activation-5.97e-02

Pos logit contributions
 P+0.012
 Linda+0.012
 chess+0.011
 ink+0.010
 Jelly+0.010
Neg logit contributions
folk-0.008
thia-0.008
Exc-0.008
affe-0.008
 ponies-0.008
Token it
Token ID340
Feature activation-8.21e-01

Pos logit contributions
folk+0.001
Exc+0.001
 ponies+0.001
co+0.001
 bells+0.001
Neg logit contributions
 chess-0.002
 P-0.002
 Linda-0.002
 ink-0.002
 surfing-0.002
Token fit
Token ID4197
Feature activation-5.10e-01

Pos logit contributions
ly+0.018
folk+0.018
thia+0.018
affe+0.018
 ponies+0.017
Neg logit contributions
 P-0.027
 chess-0.026
 Linda-0.025
 Jelly-0.023
 surfing-0.023
Token three
Token ID1115
Feature activation-3.88e+00

Pos logit contributions
ly+0.013
folk+0.012
 ponies+0.012
anna+0.012
der+0.011
Neg logit contributions
 P-0.015
 Linda-0.015
 chess-0.014
 Jelly-0.013
 Ink-0.013
Token people
Token ID661
Feature activation-9.09e-01

Pos logit contributions
thia+0.091
der+0.089
uted+0.084
affe+0.083
gging+0.082
Neg logit contributions
 P-0.110
 chess-0.104
 Linda-0.102
 surfing-0.102
 ink-0.096
Token!
Token ID0
Feature activation+1.52e-02

Pos logit contributions
folk+0.023
ly+0.022
 ponies+0.022
anna+0.021
affe+0.021
Neg logit contributions
 P-0.028
 Linda-0.026
 chess-0.026
 Jelly-0.024
 Ink-0.023
Token
Token ID198
Feature activation-1.03e+00

Pos logit contributions
 P+0.000
 chess+0.000
 Linda+0.000
 ink+0.000
 surfing+0.000
Neg logit contributions
folk-0.000
affe-0.000
 ponies-0.000
thia-0.000
anna-0.000
Token
Token ID198
Feature activation-1.72e+00

Pos logit contributions
ly+0.027
folk+0.027
der+0.026
affe+0.026
thia+0.026
Neg logit contributions
 P-0.030
 Linda-0.028
 chess-0.028
 Jelly-0.024
 late-0.024
TokenOn
Token ID2202
Feature activation-1.76e-01

Pos logit contributions
ly+0.039
der+0.038
folk+0.037
thia+0.037
affe+0.037
Neg logit contributions
 P-0.055
 Linda-0.053
 chess-0.052
 late-0.048
 Jelly-0.046
Token.
Token ID13
Feature activation+5.82e-01

Pos logit contributions
 P+0.003
 Linda+0.003
 chess+0.003
 Jelly+0.003
 Ink+0.003
Neg logit contributions
ly-0.002
folk-0.002
 ponies-0.002
der-0.002
anna-0.002
Token<|endoftext|>
Token ID50256
Feature activation+9.28e-01

Pos logit contributions
 chess+0.019
 P+0.019
 Linda+0.018
 surfing+0.018
 ink+0.018
Neg logit contributions
der-0.011
thia-0.011
affe-0.011
folk-0.011
Exc-0.011
TokenBilly
Token ID49640
Feature activation-2.95e-03

Pos logit contributions
 P+0.026
 Linda+0.025
 chess+0.025
 surfing+0.023
 Jelly+0.023
Neg logit contributions
folk-0.023
 ponies-0.022
ly-0.022
Exc-0.022
thia-0.022
Token was
Token ID373
Feature activation+1.95e-01

Pos logit contributions
ly+0.000
thia+0.000
folk+0.000
affe+0.000
 ponies+0.000
Neg logit contributions
 P-0.000
 chess-0.000
 Linda-0.000
 Jelly-0.000
 surfing-0.000
Token a
Token ID257
Feature activation-3.08e-01

Pos logit contributions
 P+0.006
 Linda+0.006
ucer+0.006
 chess+0.006
 Jelly+0.005
Neg logit contributions
 ponies-0.004
thia-0.004
 ent-0.004
folk-0.004
uted-0.004
Token three
Token ID1115
Feature activation-3.87e+00

Pos logit contributions
folk+0.005
ly+0.005
thia+0.005
affe+0.005
 ponies+0.005
Neg logit contributions
 P-0.012
 chess-0.011
 Linda-0.011
 Jelly-0.010
 surfing-0.010
Token year
Token ID614
Feature activation-1.45e-02

Pos logit contributions
thia+0.103
der+0.098
arians+0.097
Flash+0.096
opes+0.096
Neg logit contributions
 chess-0.094
 P-0.093
 day-0.092
 Linda-0.092
 surfing-0.091
Token old
Token ID1468
Feature activation-8.67e-01

Pos logit contributions
folk+0.000
thia+0.000
affe+0.000
ly+0.000
 ponies+0.000
Neg logit contributions
 P-0.001
 chess-0.001
 Linda-0.001
 Jelly-0.000
 surfing-0.000
Token boy
Token ID2933
Feature activation-1.07e+00

Pos logit contributions
thia+0.019
affe+0.018
folk+0.018
keys+0.018
anna+0.018
Neg logit contributions
 P-0.029
 chess-0.028
 Linda-0.027
 Jelly-0.025
 surfing-0.025
Token who
Token ID508
Feature activation+7.36e-01

Pos logit contributions
ly+0.012
 ponies+0.012
folk+0.011
anna+0.011
der+0.010
Neg logit contributions
 P-0.044
 Linda-0.043
 chess-0.043
ucer-0.041
 noon-0.041
Token loved
Token ID6151
Feature activation-4.79e-01

Pos logit contributions
 P+0.033
 Linda+0.032
 chess+0.032
 Jelly+0.030
 ink+0.030
Neg logit contributions
ly-0.007
thia-0.007
folk-0.007
 ponies-0.006
affe-0.006
Token much
Token ID881
Feature activation+5.82e-02

Pos logit contributions
folk+0.007
ly+0.007
thia+0.007
affe+0.007
 ponies+0.007
Neg logit contributions
 P-0.008
 Linda-0.007
 chess-0.007
 Jelly-0.007
 surfing-0.006
Token.
Token ID13
Feature activation-1.17e+00

Pos logit contributions
 P+0.002
 Linda+0.002
 chess+0.002
 Jelly+0.002
 Ink+0.002
Neg logit contributions
ly-0.001
 ponies-0.001
der-0.001
folk-0.001
affe-0.001
Token
Token ID198
Feature activation-1.75e-01

Pos logit contributions
folk+0.038
 ponies+0.038
keys+0.037
 strangers+0.036
ly+0.036
Neg logit contributions
 P-0.028
 Linda-0.027
 chess-0.027
 Jill-0.024
 late-0.024
Token
Token ID198
Feature activation-1.31e+00

Pos logit contributions
folk+0.006
ly+0.006
 ponies+0.006
der+0.006
affe+0.005
Neg logit contributions
 P-0.004
 Linda-0.004
 chess-0.004
 late-0.003
 noon-0.003
TokenThe
Token ID464
Feature activation-3.23e+00

Pos logit contributions
folk+0.040
der+0.039
 ponies+0.038
opes+0.035
 bells+0.035
Neg logit contributions
 late-0.032
 Linda-0.031
 P-0.031
 chess-0.029
 noon-0.028
Token two
Token ID734
Feature activation-3.87e+00

Pos logit contributions
aces+0.087
der+0.086
folk+0.086
anna+0.085
aron+0.082
Neg logit contributions
 day-0.090
 noon-0.089
 night-0.088
 evening-0.083
 morning-0.082
Token friends
Token ID2460
Feature activation-7.49e-01

Pos logit contributions
anna+0.128
aron+0.121
der+0.119
thia+0.119
arians+0.116
Neg logit contributions
 night-0.082
 day-0.082
 noon-0.078
 evening-0.072
 morning-0.072
Token were
Token ID547
Feature activation-4.31e-01

Pos logit contributions
thia+0.017
der+0.017
folk+0.017
affe+0.017
ly+0.016
Neg logit contributions
 P-0.024
 Linda-0.023
 chess-0.023
 Jelly-0.020
 surfing-0.020
Token happy
Token ID3772
Feature activation-3.01e-04

Pos logit contributions
ly+0.013
der+0.012
affe+0.012
folk+0.012
Exc+0.012
Neg logit contributions
 P-0.012
 chess-0.011
 Linda-0.011
 surfing-0.010
 noon-0.010
Token to
Token ID284
Feature activation+2.93e-01

Pos logit contributions
ly+0.000
der+0.000
 ponies+0.000
folk+0.000
thia+0.000
Neg logit contributions
 P-0.000
 Linda-0.000
 chess-0.000
ennis-0.000
ucer-0.000
Token enjoy
Token ID2883
Feature activation-1.08e+00

Pos logit contributions
 P+0.008
 chess+0.008
 Linda+0.008
 Jelly+0.007
 surfing+0.007
Neg logit contributions
folk-0.008
ly-0.008
thia-0.008
affe-0.007
 ponies-0.007
Token"
Token ID1
Feature activation-4.40e-01

Pos logit contributions
 P+0.042
 chess+0.040
 Linda+0.039
 noon+0.036
 surfing+0.036
Neg logit contributions
Exc-0.045
ly-0.045
folk-0.044
thia-0.044
affe-0.042
TokenThis
Token ID1212
Feature activation-1.17e+00

Pos logit contributions
Exc+0.012
folk+0.011
thia+0.011
 ponies+0.011
der+0.011
Neg logit contributions
 P-0.013
 chess-0.013
 Linda-0.012
 surfing-0.011
 Jelly-0.011
Token stuff
Token ID3404
Feature activation-1.02e+00

Pos logit contributions
folk+0.028
ly+0.028
thia+0.028
affe+0.027
 ponies+0.027
Neg logit contributions
 P-0.036
 Linda-0.034
 chess-0.034
 Jelly-0.030
 ink-0.030
Token smells
Token ID25760
Feature activation-5.10e-01

Pos logit contributions
ly+0.023
folk+0.022
thia+0.022
affe+0.022
 ponies+0.021
Neg logit contributions
 P-0.033
 chess-0.032
 Linda-0.032
 Jelly-0.029
 surfing-0.028
Token bad
Token ID2089
Feature activation-1.29e+00

Pos logit contributions
ly+0.012
der+0.011
 ponies+0.011
 strangers+0.011
 ow+0.011
Neg logit contributions
 P-0.016
 Linda-0.016
 chess-0.016
 Jelly-0.014
 surfing-0.014
Token,"
Token ID553
Feature activation+2.84e+00

Pos logit contributions
folk+0.027
ly+0.027
thia+0.026
affe+0.026
 ponies+0.026
Neg logit contributions
 P-0.044
 chess-0.041
 Linda-0.041
 Jelly-0.037
 ink-0.037
Token Jenny
Token ID28437
Feature activation-6.47e-01

Pos logit contributions
ained+0.094
ores+0.093
 seeker+0.087
 surfing+0.087
ucer+0.087
Neg logit contributions
ly-0.059
 nodded-0.056
Exc-0.055
 blamed-0.054
folk-0.054
Token said
Token ID531
Feature activation-3.53e-01

Pos logit contributions
 ponies+0.019
folk+0.019
thia+0.018
affe+0.018
 gor+0.018
Neg logit contributions
 P-0.017
 Linda-0.016
 chess-0.015
 Jelly-0.014
 ink-0.014
Token.
Token ID13
Feature activation-7.01e-01

Pos logit contributions
folk+0.009
affe+0.009
thia+0.009
 ponies+0.009
ly+0.009
Neg logit contributions
 P-0.010
 Linda-0.009
 chess-0.009
 Jelly-0.008
 noon-0.008
Token
Token ID220
Feature activation-4.46e-01

Pos logit contributions
folk+0.017
ly+0.017
affe+0.017
thia+0.017
 ponies+0.017
Neg logit contributions
 P-0.021
 Linda-0.020
 chess-0.020
 Jelly-0.018
 ink-0.017
Token
Token ID198
Feature activation+3.41e-01

Pos logit contributions
folk+0.012
ly+0.012
thia+0.011
affe+0.011
 ponies+0.011
Neg logit contributions
 P-0.013
 chess-0.012
 Linda-0.012
 Jelly-0.011
 ink-0.010
Token But
Token ID887
Feature activation+3.14e-01

Pos logit contributions
folk+0.009
ly+0.009
thia+0.009
affe+0.009
 ponies+0.009
Neg logit contributions
 P-0.011
 chess-0.010
 Linda-0.010
 Jelly-0.009
 surfing-0.009
Token he
Token ID339
Feature activation+6.15e-01

Pos logit contributions
 P+0.009
 Linda+0.009
 chess+0.009
 Jelly+0.008
 ink+0.008
Neg logit contributions
ly-0.007
folk-0.007
der-0.007
affe-0.007
 ponies-0.007
Token could
Token ID714
Feature activation-1.89e-01

Pos logit contributions
 P+0.019
 Linda+0.018
 chess+0.018
 Jelly+0.017
 surfing+0.016
Neg logit contributions
folk-0.016
 ponies-0.015
ly-0.015
affe-0.014
der-0.014
Token not
Token ID407
Feature activation+5.36e-01

Pos logit contributions
thia+0.004
ly+0.004
folk+0.004
der+0.003
 strangers+0.003
Neg logit contributions
 P-0.007
 chess-0.006
 Linda-0.006
 ink-0.006
 Jelly-0.006
Token find
Token ID1064
Feature activation+3.84e-01

Pos logit contributions
 P+0.015
 chess+0.015
 Linda+0.015
 Jelly+0.013
 surfing+0.013
Neg logit contributions
folk-0.014
ly-0.014
thia-0.014
affe-0.014
 ponies-0.014
Token the
Token ID262
Feature activation+3.21e+00

Pos logit contributions
 P+0.010
 chess+0.009
 Linda+0.009
 Jelly+0.008
 surfing+0.008
Neg logit contributions
ly-0.011
folk-0.011
 ponies-0.011
affe-0.011
der-0.011
Token marble
Token ID30623
Feature activation+2.55e-01

Pos logit contributions
ores+0.101
 Linda+0.093
etchup+0.085
 P+0.081
 late+0.080
Neg logit contributions
 little-0.081
 bell-0.076
ly-0.075
 ow-0.074
 bicycles-0.071
Token.
Token ID13
Feature activation-1.09e-01

Pos logit contributions
 P+0.009
 Linda+0.008
 chess+0.008
 Jelly+0.008
 Ink+0.008
Neg logit contributions
ly-0.006
der-0.005
affe-0.005
 strangers-0.005
 ponies-0.005
Token Tim
Token ID5045
Feature activation-1.50e-01

Pos logit contributions
folk+0.003
ly+0.003
 ponies+0.003
thia+0.003
affe+0.003
Neg logit contributions
 P-0.003
 chess-0.003
 Linda-0.003
 Jelly-0.003
 Ink-0.003
Token felt
Token ID2936
Feature activation-5.08e-01

Pos logit contributions
folk+0.004
der+0.004
 ponies+0.004
keys+0.004
affe+0.004
Neg logit contributions
 P-0.004
 Linda-0.004
 chess-0.004
 ink-0.003
 Jelly-0.003
Token sad
Token ID6507
Feature activation+7.61e-02

Pos logit contributions
ly+0.011
folk+0.010
 ponies+0.010
der+0.010
affe+0.010
Neg logit contributions
 P-0.017
 Linda-0.016
 chess-0.016
 Jelly-0.015
 noon-0.014
Token side
Token ID1735
Feature activation-3.99e-01

Pos logit contributions
 P+0.010
 chess+0.009
 Linda+0.009
ores+0.009
 late+0.009
Neg logit contributions
ly-0.005
 ponies-0.005
folk-0.005
 strangers-0.005
 bells-0.005
Token of
Token ID286
Feature activation-3.00e-01

Pos logit contributions
folk+0.007
ly+0.007
affe+0.006
 ponies+0.006
anna+0.006
Neg logit contributions
 P-0.015
 chess-0.015
 Linda-0.014
 Jelly-0.013
 ink-0.013
Token the
Token ID262
Feature activation+2.73e-02

Pos logit contributions
 ponies+0.007
folk+0.007
 strangers+0.007
ly+0.007
Exc+0.007
Neg logit contributions
 P-0.008
 chess-0.008
 Linda-0.008
 Jelly-0.007
 surfing-0.007
Token lake
Token ID13546
Feature activation+1.07e+00

Pos logit contributions
 P+0.001
 Linda+0.001
 chess+0.001
ores+0.001
 Jelly+0.001
Neg logit contributions
 ponies-0.001
ly-0.001
 bell-0.001
 strangers-0.001
folk-0.001
Token,
Token ID11
Feature activation+1.63e+00

Pos logit contributions
 P+0.039
 chess+0.037
 Linda+0.036
 Jelly+0.034
 Ink+0.034
Neg logit contributions
ly-0.020
folk-0.020
thia-0.018
 ponies-0.018
affe-0.018
Token L
Token ID406
Feature activation+2.81e+00

Pos logit contributions
 chess+0.036
ained+0.034
ucer+0.034
 surfing+0.033
ennis+0.033
Neg logit contributions
 little-0.046
ly-0.043
 ponies-0.043
der-0.042
 bells-0.041
Tokenizzy
Token ID40593
Feature activation-7.54e-01

Pos logit contributions
 chess+0.060
 P+0.058
 late+0.050
 noon+0.048
 surfing+0.048
Neg logit contributions
anna-0.097
ly-0.091
der-0.089
thia-0.086
co-0.084
Token saw
Token ID2497
Feature activation-3.27e-01

Pos logit contributions
 ponies+0.016
folk+0.016
affe+0.016
thia+0.016
der+0.016
Neg logit contributions
 P-0.025
 Linda-0.024
 chess-0.023
 surfing-0.021
 Jelly-0.021
Token a
Token ID257
Feature activation+6.52e-01

Pos logit contributions
ly+0.010
folk+0.010
 ponies+0.010
der+0.010
thia+0.010
Neg logit contributions
 P-0.008
 chess-0.007
 Linda-0.007
 Jelly-0.006
 surfing-0.006
Token wise
Token ID10787
Feature activation+3.49e-01

Pos logit contributions
 P+0.023
 Linda+0.022
 chess+0.021
 Jelly+0.019
 noon+0.019
Neg logit contributions
affe-0.012
thia-0.012
ly-0.012
 ponies-0.012
folk-0.012
Token duck
Token ID22045
Feature activation+5.98e-01

Pos logit contributions
 P+0.012
 chess+0.011
 Linda+0.011
 Jelly+0.011
 ink+0.010
Neg logit contributions
thia-0.007
folk-0.007
uted-0.007
keys-0.007
 ponies-0.007
Token on
Token ID319
Feature activation+3.38e-01

Pos logit contributions
ly+0.013
 ponies+0.013
der+0.013
folk+0.013
affe+0.012
Neg logit contributions
 P-0.013
 Linda-0.012
 chess-0.011
 Jelly-0.010
 Ink-0.010
Token her
Token ID607
Feature activation+1.45e-01

Pos logit contributions
 P+0.010
 chess+0.010
 Linda+0.010
 surfing+0.009
 Jelly+0.009
Neg logit contributions
ly-0.008
thia-0.008
affe-0.008
 ponies-0.008
folk-0.008
Token face
Token ID1986
Feature activation-2.74e-01

Pos logit contributions
 P+0.006
 chess+0.005
 Linda+0.005
 ink+0.005
 Jelly+0.005
Neg logit contributions
folk-0.003
affe-0.002
 ponies-0.002
Exc-0.002
 gor-0.002
Token and
Token ID290
Feature activation+1.21e+00

Pos logit contributions
ly+0.006
der+0.006
folk+0.006
thia+0.005
 ponies+0.005
Neg logit contributions
 P-0.009
 Linda-0.009
 chess-0.009
 Jelly-0.008
 Ink-0.008
Token demanded
Token ID12284
Feature activation+1.71e+00

Pos logit contributions
 P+0.037
 chess+0.035
 Linda+0.034
 Jelly+0.031
 surfing+0.031
Neg logit contributions
ly-0.029
thia-0.028
folk-0.028
 ponies-0.028
der-0.028
Token that
Token ID326
Feature activation+2.77e+00

Pos logit contributions
 P+0.045
 chess+0.045
ucer+0.045
 Linda+0.045
 noon+0.044
Neg logit contributions
ly-0.046
der-0.042
 strangers-0.041
 ponies-0.041
anna-0.039
Token Sammy
Token ID44067
Feature activation+1.03e+00

Pos logit contributions
ucer+0.076
ained+0.072
ores+0.069
etchup+0.066
 hyd+0.066
Neg logit contributions
 strangers-0.071
 ponies-0.068
der-0.063
folk-0.062
ly-0.062
Token show
Token ID905
Feature activation+8.85e-01

Pos logit contributions
 P+0.032
 chess+0.031
 Linda+0.031
 Ink+0.028
 Jak+0.028
Neg logit contributions
ly-0.025
 ponies-0.023
anna-0.023
 strangers-0.022
affe-0.022
Token her
Token ID607
Feature activation+1.01e+00

Pos logit contributions
 P+0.027
 chess+0.027
 Linda+0.026
 noon+0.024
 surfing+0.024
Neg logit contributions
ly-0.021
folk-0.019
 ponies-0.019
 strangers-0.019
der-0.019
Token what
Token ID644
Feature activation+8.69e-02

Pos logit contributions
 P+0.032
 Linda+0.029
 Jak+0.028
 chess+0.028
 Ink+0.027
Neg logit contributions
 ponies-0.024
ly-0.022
 strangers-0.022
der-0.022
 bells-0.022
Token he
Token ID339
Feature activation-1.30e-02

Pos logit contributions
 P+0.002
 Linda+0.002
 chess+0.002
ores+0.002
 Jelly+0.002
Neg logit contributions
 ponies-0.002
thia-0.002
ly-0.002
folk-0.002
 strangers-0.002
Token He
Token ID679
Feature activation+2.42e-03

Pos logit contributions
 chess+0.061
 Linda+0.060
 possible+0.058
 surfing+0.058
ores+0.057
Neg logit contributions
ly-0.027
der-0.026
Exc-0.025
affe-0.025
thia-0.024
Token asked
Token ID1965
Feature activation+1.04e+00

Pos logit contributions
 P+0.000
 Linda+0.000
 chess+0.000
 Jelly+0.000
 ink+0.000
Neg logit contributions
folk-0.000
 ponies-0.000
der-0.000
thia-0.000
uted-0.000
Token his
Token ID465
Feature activation-4.45e-01

Pos logit contributions
 P+0.027
 Linda+0.027
 chess+0.027
 Jelly+0.023
 Ink+0.023
Neg logit contributions
ly-0.028
 ponies-0.026
folk-0.026
der-0.026
affe-0.026
Token friends
Token ID2460
Feature activation+1.21e+00

Pos logit contributions
folk+0.010
ly+0.010
anna+0.009
thia+0.009
der+0.009
Neg logit contributions
 P-0.015
 chess-0.015
 Linda-0.014
 Jelly-0.013
 surfing-0.013
Token to
Token ID284
Feature activation+7.83e-01

Pos logit contributions
 P+0.033
 Linda+0.032
 chess+0.031
 Jelly+0.028
 noon+0.028
Neg logit contributions
ly-0.032
folk-0.031
 ponies-0.031
der-0.031
affe-0.031
Token help
Token ID1037
Feature activation+2.74e+00

Pos logit contributions
 P+0.021
 chess+0.020
 Linda+0.020
 Jelly+0.017
 ink+0.017
Neg logit contributions
folk-0.022
ly-0.022
affe-0.022
thia-0.022
 ponies-0.022
Token the
Token ID262
Feature activation-1.54e+00

Pos logit contributions
 seeker+0.097
ained+0.097
 chess+0.096
 possible+0.095
 Linda+0.094
Neg logit contributions
ly-0.048
 strangers-0.041
gging-0.036
Exc-0.035
ts-0.035
Token bird
Token ID6512
Feature activation+1.03e+00

Pos logit contributions
aron+0.054
folk+0.052
anna+0.051
der+0.050
opes+0.049
Neg logit contributions
 noon-0.032
 night-0.031
 P-0.030
 bath-0.029
 day-0.029
Token too
Token ID1165
Feature activation+1.39e+00

Pos logit contributions
 P+0.032
 chess+0.031
 Linda+0.031
 Jelly+0.028
 surfing+0.027
Neg logit contributions
ly-0.024
 ponies-0.023
affe-0.023
folk-0.023
thia-0.023
Token.
Token ID13
Feature activation+2.76e+00

Pos logit contributions
 P+0.040
 chess+0.038
 Linda+0.038
 Jelly+0.033
 ink+0.033
Neg logit contributions
folk-0.036
ly-0.036
thia-0.036
 ponies-0.035
affe-0.035
Token
Token ID198
Feature activation+1.26e+00

Pos logit contributions
 possible+0.105
 seeker+0.098
 Linda+0.094
ucer+0.093
 chess+0.091
Neg logit contributions
der-0.042
Exc-0.038
 nodded-0.038
ly-0.038
 So-0.037
Token to
Token ID284
Feature activation+2.36e-01

Pos logit contributions
 P+0.003
 Linda+0.003
 chess+0.003
 Jelly+0.003
 Ink+0.003
Neg logit contributions
 ponies-0.003
folk-0.003
ly-0.003
thia-0.003
affe-0.003
Token wait
Token ID4043
Feature activation-1.28e-02

Pos logit contributions
 P+0.008
 Linda+0.007
 chess+0.007
 Jelly+0.006
 ink+0.006
Neg logit contributions
folk-0.005
ly-0.005
thia-0.005
 ponies-0.005
affe-0.005
Token for
Token ID329
Feature activation+3.77e-01

Pos logit contributions
ly+0.000
der+0.000
 ponies+0.000
anna+0.000
thia+0.000
Neg logit contributions
 P-0.000
 Linda-0.000
 chess-0.000
 Jelly-0.000
 Ink-0.000
Token your
Token ID534
Feature activation+1.91e-01

Pos logit contributions
 P+0.010
 Linda+0.009
 chess+0.009
 Jelly+0.008
 Ink+0.008
Neg logit contributions
 ponies-0.011
ly-0.011
folk-0.010
 strangers-0.010
thia-0.010
Token friend
Token ID1545
Feature activation-1.15e+00

Pos logit contributions
 P+0.006
 Linda+0.006
 chess+0.006
 Jelly+0.005
 Ink+0.005
Neg logit contributions
 ponies-0.004
ly-0.004
 bells-0.004
thia-0.004
folk-0.004
Token to
Token ID284
Feature activation+4.88e-01

Pos logit contributions
ly+0.027
 ponies+0.027
folk+0.026
affe+0.026
anna+0.026
Neg logit contributions
 P-0.035
 chess-0.034
 Linda-0.034
 Jelly-0.031
 Ink-0.030
Token come
Token ID1282
Feature activation-5.30e-01

Pos logit contributions
 P+0.015
 Linda+0.014
 chess+0.014
 Jelly+0.013
 ink+0.013
Neg logit contributions
folk-0.011
ly-0.011
thia-0.011
 ponies-0.011
affe-0.011
Token."
Token ID526
Feature activation+7.20e-01

Pos logit contributions
 ponies+0.010
ly+0.010
anna+0.010
thia+0.009
 pony+0.009
Neg logit contributions
 P-0.018
 Linda-0.017
 chess-0.017
 ink-0.016
 Ink-0.016
Token
Token ID198
Feature activation+1.41e-01

Pos logit contributions
 chess+0.022
 P+0.022
 surfing+0.021
 Linda+0.021
 possible+0.020
Neg logit contributions
ly-0.016
anna-0.016
thia-0.015
Exc-0.015
der-0.015
Token
Token ID198
Feature activation-1.01e-01

Pos logit contributions
 P+0.004
 Linda+0.004
 chess+0.004
 Jelly+0.003
 noon+0.003
Neg logit contributions
folk-0.004
ly-0.004
 ponies-0.004
affe-0.004
thia-0.004
TokenL
Token ID43
Feature activation+4.85e-01

Pos logit contributions
ly+0.003
thia+0.003
Exc+0.003
folk+0.003
affe+0.003
Neg logit contributions
 P-0.003
 chess-0.003
 Linda-0.003
 Jelly-0.002
 surfing-0.002
Token toy
Token ID13373
Feature activation-2.08e-01

Pos logit contributions
 Linda+0.019
 P+0.019
ores+0.017
 chess+0.017
ained+0.016
Neg logit contributions
 bell-0.014
thia-0.013
 bells-0.013
 pony-0.013
affe-0.013
Token boat
Token ID8848
Feature activation+1.78e-01

Pos logit contributions
ly+0.005
der+0.005
affe+0.005
 ponies+0.005
folk+0.004
Neg logit contributions
 Linda-0.007
 P-0.006
 chess-0.006
ores-0.006
 noon-0.006
Token that
Token ID326
Feature activation+6.03e-01

Pos logit contributions
 P+0.006
 Linda+0.006
 chess+0.005
 Jelly+0.005
 Ink+0.005
Neg logit contributions
ly-0.004
affe-0.004
folk-0.004
 ponies-0.004
der-0.003
Token might
Token ID1244
Feature activation-1.65e-01

Pos logit contributions
 P+0.019
 Linda+0.018
 chess+0.018
 surfing+0.016
 Jelly+0.016
Neg logit contributions
ly-0.014
folk-0.014
affe-0.013
 ponies-0.013
thia-0.013
Token fit
Token ID4197
Feature activation+2.78e-01

Pos logit contributions
 strangers+0.003
ly+0.003
 bells+0.003
affe+0.003
 bell+0.003
Neg logit contributions
 Linda-0.006
 P-0.006
 chess-0.006
 Ink-0.005
ucer-0.005
Token in
Token ID287
Feature activation+6.73e-01

Pos logit contributions
 P+0.008
 Linda+0.008
 chess+0.008
 Jelly+0.008
 Ink+0.007
Neg logit contributions
ly-0.007
folk-0.006
 ponies-0.006
 strangers-0.006
affe-0.006
Token your
Token ID534
Feature activation+5.19e-01

Pos logit contributions
 Linda+0.019
 P+0.019
 chess+0.017
ores+0.017
 Jelly+0.016
Neg logit contributions
 strangers-0.016
 ponies-0.016
folk-0.015
 bells-0.015
ly-0.015
Token kay
Token ID34681
Feature activation+1.60e+00

Pos logit contributions
 P+0.015
 Linda+0.014
 chess+0.014
 Jelly+0.012
 Ink+0.012
Neg logit contributions
folk-0.014
ly-0.014
thia-0.013
 ponies-0.013
affe-0.013
Tokenak
Token ID461
Feature activation+1.86e-01

Pos logit contributions
 P+0.056
 Linda+0.054
 chess+0.050
 Jelly+0.049
 noon+0.048
Neg logit contributions
affe-0.031
folk-0.031
co-0.030
ly-0.030
anna-0.030
Token.
Token ID13
Feature activation+2.20e-01

Pos logit contributions
 P+0.006
 Linda+0.006
 chess+0.006
 Jelly+0.005
 Ink+0.005
Neg logit contributions
ly-0.004
folk-0.004
affe-0.004
der-0.004
 ponies-0.004
Token We
Token ID775
Feature activation-3.77e-01

Pos logit contributions
 P+0.006
 chess+0.006
 Linda+0.006
 noon+0.005
 ink+0.005
Neg logit contributions
folk-0.006
affe-0.006
ly-0.006
thia-0.006
der-0.006
Token Sophie
Token ID35331
Feature activation+4.36e-01

Pos logit contributions
ly+0.001
 ponies+0.001
folk+0.001
der+0.001
thia+0.001
Neg logit contributions
 chess-0.001
 P-0.001
 Linda-0.001
 ink-0.001
 Ink-0.001
Token and
Token ID290
Feature activation-4.88e-03

Pos logit contributions
 P+0.015
 chess+0.014
 Linda+0.014
 Jelly+0.013
 ink+0.013
Neg logit contributions
folk-0.009
ly-0.009
thia-0.009
affe-0.009
 ponies-0.009
Token her
Token ID607
Feature activation+1.01e+00

Pos logit contributions
ly+0.000
folk+0.000
thia+0.000
 ponies+0.000
affe+0.000
Neg logit contributions
 P-0.000
 chess-0.000
 Linda-0.000
 Jelly-0.000
 surfing-0.000
Token mum
Token ID25682
Feature activation-3.39e-02

Pos logit contributions
 P+0.033
 Linda+0.031
 chess+0.031
 late+0.028
 Ink+0.028
Neg logit contributions
ly-0.022
thia-0.022
 ponies-0.022
folk-0.021
affe-0.021
Token went
Token ID1816
Feature activation-2.31e-01

Pos logit contributions
folk+0.001
der+0.001
ly+0.001
affe+0.001
thia+0.001
Neg logit contributions
 P-0.001
 Linda-0.001
 chess-0.001
 surfing-0.001
 Jelly-0.001
Token for
Token ID329
Feature activation+6.02e-01

Pos logit contributions
 ponies+0.004
ly+0.004
co+0.004
anna+0.004
 horses+0.004
Neg logit contributions
 P-0.008
 Linda-0.007
 Jelly-0.007
 Ink-0.007
 ink-0.007
Token a
Token ID257
Feature activation+5.13e-01

Pos logit contributions
 P+0.016
 Linda+0.015
 chess+0.015
ucer+0.014
ennis+0.014
Neg logit contributions
 ponies-0.016
 strangers-0.015
 bells-0.015
ly-0.015
 ow-0.015
Token walk
Token ID2513
Feature activation+1.61e-01

Pos logit contributions
 P+0.017
 Linda+0.017
 chess+0.016
ained+0.016
ores+0.015
Neg logit contributions
ly-0.011
thia-0.010
anna-0.010
keys-0.009
Exc-0.009
Token along
Token ID1863
Feature activation+2.13e-01

Pos logit contributions
 P+0.006
 chess+0.005
 Linda+0.005
 Ink+0.005
 Jelly+0.005
Neg logit contributions
ly-0.004
folk-0.003
 ponies-0.003
co-0.003
anna-0.003
Token the
Token ID262
Feature activation+2.15e-01

Pos logit contributions
 P+0.006
 Linda+0.006
 chess+0.006
ucer+0.006
ennis+0.006
Neg logit contributions
 ponies-0.005
ly-0.005
 strangers-0.005
folk-0.005
 bells-0.005
Token beach
Token ID10481
Feature activation+1.19e+00

Pos logit contributions
 P+0.006
 Linda+0.006
 chess+0.005
 Ink+0.005
ores+0.005
Neg logit contributions
ly-0.006
 ponies-0.006
thia-0.006
folk-0.006
 bells-0.006
Token said
Token ID531
Feature activation-3.56e-01

Pos logit contributions
 P+0.009
 Linda+0.009
 chess+0.008
 Jenny+0.007
 Jelly+0.007
Neg logit contributions
affe-0.011
folk-0.011
der-0.010
 ponies-0.010
opes-0.010
Token,
Token ID11
Feature activation+2.02e-01

Pos logit contributions
affe+0.009
folk+0.009
thia+0.009
 ponies+0.009
opes+0.009
Neg logit contributions
 P-0.011
 chess-0.010
 Linda-0.010
 Jelly-0.009
 surfing-0.008
Token "
Token ID366
Feature activation-8.27e-01

Pos logit contributions
 P+0.008
 chess+0.007
 Linda+0.007
 Jelly+0.007
 surfing+0.007
Neg logit contributions
folk-0.003
ly-0.003
affe-0.003
thia-0.003
 ponies-0.003
TokenMom
Token ID29252
Feature activation+9.87e-01

Pos logit contributions
folk+0.020
ly+0.019
thia+0.019
 ponies+0.019
affe+0.019
Neg logit contributions
 P-0.026
 chess-0.025
 Linda-0.025
 Jelly-0.022
 surfing-0.022
Tokenmy
Token ID1820
Feature activation+6.87e-01

Pos logit contributions
 P+0.031
 chess+0.030
 Linda+0.029
 Jelly+0.027
 surfing+0.027
Neg logit contributions
ly-0.022
thia-0.022
folk-0.021
affe-0.021
 ponies-0.021
Token,
Token ID11
Feature activation+4.88e-01

Pos logit contributions
 P+0.026
 chess+0.026
 Linda+0.025
 surfing+0.024
 Ink+0.024
Neg logit contributions
ly-0.011
thia-0.010
affe-0.010
anna-0.009
folk-0.009
Token I
Token ID314
Feature activation+7.03e-01

Pos logit contributions
 P+0.015
 Linda+0.014
 chess+0.014
ained+0.013
 late+0.013
Neg logit contributions
ly-0.010
 ponies-0.010
 bell-0.010
 strangers-0.010
 ow-0.010
Token don
Token ID836
Feature activation-7.27e-01

Pos logit contributions
 P+0.024
 chess+0.024
 Linda+0.022
 surfing+0.021
 Ink+0.021
Neg logit contributions
 hated-0.014
thia-0.014
ly-0.013
anna-0.013
 blamed-0.013
Token't
Token ID470
Feature activation-1.56e-01

Pos logit contributions
ly+0.023
affe+0.023
folk+0.022
thia+0.022
 ponies+0.022
Neg logit contributions
 P-0.017
 chess-0.016
 Linda-0.016
 surfing-0.014
 Ink-0.014
Token like
Token ID588
Feature activation-7.55e-01

Pos logit contributions
ly+0.003
 strangers+0.003
anna+0.003
affe+0.003
aron+0.003
Neg logit contributions
 chess-0.005
 P-0.005
 Linda-0.005
etchup-0.004
ucer-0.004
Token this
Token ID428
Feature activation-1.97e+00

Pos logit contributions
folk+0.024
affe+0.024
ly+0.024
thia+0.024
der+0.023
Neg logit contributions
 P-0.017
 chess-0.016
 Linda-0.016
 Jelly-0.014
 ink-0.014
Token and
Token ID290
Feature activation-7.20e-01

Pos logit contributions
ly+0.019
 ponies+0.019
folk+0.019
affe+0.018
der+0.018
Neg logit contributions
 P-0.031
 Linda-0.030
 chess-0.030
 Jelly-0.028
 noon-0.027
Token go
Token ID467
Feature activation-1.27e-01

Pos logit contributions
folk+0.023
ly+0.023
affe+0.023
der+0.023
uted+0.023
Neg logit contributions
 P-0.017
 Linda-0.016
 chess-0.016
 ink-0.014
 surfing-0.013
Token outside
Token ID2354
Feature activation+5.36e-02

Pos logit contributions
 ponies+0.003
ly+0.003
anna+0.003
co+0.003
 bicycles+0.003
Neg logit contributions
 P-0.004
 Jelly-0.004
 Linda-0.004
 Ink-0.003
 chess-0.003
Token.
Token ID13
Feature activation+3.51e-01

Pos logit contributions
 P+0.002
 Linda+0.002
 chess+0.002
 Jelly+0.002
 Ink+0.001
Neg logit contributions
 ponies-0.001
ly-0.001
folk-0.001
der-0.001
anna-0.001
Token One
Token ID1881
Feature activation+1.22e+00

Pos logit contributions
 P+0.011
 chess+0.011
 Linda+0.011
 Jelly+0.010
 surfing+0.009
Neg logit contributions
folk-0.008
ly-0.008
thia-0.008
affe-0.008
 ponies-0.008
Token day
Token ID1110
Feature activation+6.43e-01

Pos logit contributions
 P+0.027
 chess+0.027
 Linda+0.025
 rid+0.023
 Jak+0.022
Neg logit contributions
ly-0.040
 little-0.039
affe-0.038
thia-0.037
 ponies-0.037
Token,
Token ID11
Feature activation+9.43e-01

Pos logit contributions
 P+0.023
 chess+0.023
 Linda+0.023
 surfing+0.021
 Jak+0.021
Neg logit contributions
ly-0.011
der-0.009
 ponies-0.009
thia-0.009
anna-0.009
Token her
Token ID607
Feature activation-4.15e-02

Pos logit contributions
 chess+0.024
 P+0.024
 surfing+0.023
 Jak+0.023
 Linda+0.023
Neg logit contributions
ly-0.025
 little-0.023
der-0.022
 nodded-0.022
 ow-0.021
Token friends
Token ID2460
Feature activation+4.66e-01

Pos logit contributions
folk+0.001
ly+0.001
affe+0.001
thia+0.001
der+0.001
Neg logit contributions
 P-0.001
 chess-0.001
 Linda-0.001
 Jelly-0.001
 ink-0.001
Token invited
Token ID9392
Feature activation-4.21e-01

Pos logit contributions
 P+0.017
 chess+0.016
 Linda+0.016
 Jelly+0.015
 surfing+0.014
Neg logit contributions
ly-0.009
folk-0.009
thia-0.008
 ponies-0.008
affe-0.008
Token her
Token ID607
Feature activation+2.58e-01

Pos logit contributions
ly+0.009
 ponies+0.009
 strangers+0.009
uted+0.008
der+0.008
Neg logit contributions
 P-0.013
 chess-0.013
 Linda-0.013
ucer-0.012
 surfing-0.012
Token and
Token ID290
Feature activation-7.08e-01

Pos logit contributions
ly+0.018
 ponies+0.017
folk+0.017
anna+0.016
thia+0.016
Neg logit contributions
 P-0.029
 Linda-0.027
 chess-0.027
 Jelly-0.025
 Ink-0.025
Token dad
Token ID9955
Feature activation-7.77e-01

Pos logit contributions
affe+0.017
ly+0.017
folk+0.017
der+0.016
thia+0.016
Neg logit contributions
 P-0.023
 chess-0.022
 Linda-0.021
 Jelly-0.019
 surfing-0.019
Token.
Token ID13
Feature activation-9.15e-02

Pos logit contributions
ly+0.017
folk+0.017
 ponies+0.016
anna+0.016
thia+0.016
Neg logit contributions
 P-0.026
 Linda-0.025
 chess-0.024
 Jelly-0.023
 Ink-0.022
Token They
Token ID1119
Feature activation-5.19e-01

Pos logit contributions
folk+0.002
ly+0.002
 ponies+0.002
thia+0.002
Exc+0.002
Neg logit contributions
 P-0.003
 chess-0.002
 Linda-0.002
 surfing-0.002
 Jelly-0.002
Token wore
Token ID12408
Feature activation+3.55e-01

Pos logit contributions
der+0.011
affe+0.011
ly+0.011
folk+0.010
thia+0.010
Neg logit contributions
 P-0.018
 Linda-0.017
 chess-0.017
 Jelly-0.015
 surfing-0.015
Token nice
Token ID3621
Feature activation-8.42e-01

Pos logit contributions
 P+0.009
 Linda+0.009
 chess+0.008
 Jelly+0.008
 Ink+0.008
Neg logit contributions
 ponies-0.010
 bells-0.010
ly-0.010
thia-0.009
folk-0.009
Token clothes
Token ID8242
Feature activation-8.25e-01

Pos logit contributions
der+0.023
affe+0.023
folk+0.022
keys+0.022
thia+0.022
Neg logit contributions
 P-0.024
 chess-0.023
 Linda-0.023
 ink-0.021
 noon-0.020
Token and
Token ID290
Feature activation-3.79e-01

Pos logit contributions
ly+0.015
folk+0.015
 ponies+0.014
affe+0.014
der+0.014
Neg logit contributions
 P-0.031
 Linda-0.029
 chess-0.029
 Jelly-0.027
 Ink-0.026
Token shiny
Token ID22441
Feature activation-1.32e+00

Pos logit contributions
ly+0.009
thia+0.009
folk+0.009
 ponies+0.009
 bells+0.009
Neg logit contributions
 P-0.011
 Linda-0.011
 chess-0.011
 Jelly-0.010
 Ink-0.009
Token shoes
Token ID10012
Feature activation-9.23e-01

Pos logit contributions
folk+0.036
thia+0.036
affe+0.035
keys+0.035
uted+0.035
Neg logit contributions
 P-0.037
 chess-0.036
 Linda-0.034
 ink-0.034
 sauce-0.030
Token.
Token ID13
Feature activation-3.46e-01

Pos logit contributions
ly+0.021
folk+0.020
 ponies+0.020
der+0.020
affe+0.019
Neg logit contributions
 P-0.031
 Linda-0.029
 chess-0.029
 Jelly-0.027
 Ink-0.026
Token their
Token ID511
Feature activation+8.66e-01

Pos logit contributions
 P+0.009
 Linda+0.009
 chess+0.008
 Jelly+0.008
ores+0.008
Neg logit contributions
 ponies-0.009
 bells-0.008
 bicycles-0.008
thia-0.008
ly-0.008
Token toys
Token ID14958
Feature activation-7.28e-01

Pos logit contributions
 P+0.029
 Linda+0.027
 chess+0.026
 Jelly+0.025
ores+0.025
Neg logit contributions
 ponies-0.018
ly-0.018
folk-0.018
thia-0.018
affe-0.017
Token in
Token ID287
Feature activation+1.73e-01

Pos logit contributions
thia+0.019
folk+0.018
affe+0.018
ly+0.018
anna+0.018
Neg logit contributions
 P-0.022
 chess-0.021
 Linda-0.020
 ink-0.018
 surfing-0.018
Token the
Token ID262
Feature activation+9.08e-01

Pos logit contributions
 P+0.005
 Linda+0.004
 chess+0.004
ores+0.004
 Jelly+0.004
Neg logit contributions
folk-0.004
 ponies-0.004
 strangers-0.004
thia-0.004
 bells-0.004
Token living
Token ID2877
Feature activation+1.68e+00

Pos logit contributions
 Linda+0.032
 P+0.032
ores+0.032
ucer+0.030
 rid+0.030
Neg logit contributions
 ponies-0.015
 bicycles-0.014
 little-0.014
 bell-0.013
 strangers-0.013
Token room
Token ID2119
Feature activation-7.01e-01

Pos logit contributions
 P+0.055
 Linda+0.053
 chess+0.050
 surfing+0.047
 Jelly+0.046
Neg logit contributions
folk-0.035
ly-0.034
thia-0.032
 ponies-0.032
uted-0.031
Token.
Token ID13
Feature activation+6.47e-02

Pos logit contributions
folk+0.019
thia+0.018
affe+0.018
ly+0.018
 ponies+0.018
Neg logit contributions
 P-0.020
 chess-0.019
 Linda-0.018
 Jelly-0.016
 ink-0.016
Token They
Token ID1119
Feature activation-2.67e-02

Pos logit contributions
 P+0.002
 chess+0.002
 Linda+0.002
 surfing+0.002
 ink+0.002
Neg logit contributions
ly-0.002
folk-0.002
thia-0.002
der-0.001
 ponies-0.001
Token had
Token ID550
Feature activation+6.28e-01

Pos logit contributions
folk+0.001
ly+0.001
affe+0.001
thia+0.001
 ponies+0.000
Neg logit contributions
 P-0.001
 chess-0.001
 Linda-0.001
 Jelly-0.001
 surfing-0.001
Token many
Token ID867
Feature activation-8.89e-01

Pos logit contributions
 P+0.014
 chess+0.013
 Linda+0.013
 Jelly+0.011
 ink+0.011
Neg logit contributions
folk-0.021
affe-0.021
ly-0.021
thia-0.020
der-0.020
Token toys
Token ID14958
Feature activation-7.61e-01

Pos logit contributions
affe+0.029
thia+0.029
keys+0.029
anna+0.028
folk+0.028
Neg logit contributions
 chess-0.022
 P-0.020
 Linda-0.019
 ink-0.019
 marble-0.017
Token hopped
Token ID45230
Feature activation+6.76e-01

Pos logit contributions
folk+0.002
 ponies+0.002
thia+0.002
 strangers+0.002
der+0.002
Neg logit contributions
 P-0.002
 Linda-0.002
 chess-0.002
 Jelly-0.002
 ink-0.001
Token onto
Token ID4291
Feature activation+8.27e-01

Pos logit contributions
 chess+0.023
 surfing+0.022
 Linda+0.022
 P+0.021
etchup+0.021
Neg logit contributions
ly-0.015
Exc-0.013
 happily-0.011
ified-0.011
 curiously-0.011
Token it
Token ID340
Feature activation+3.94e-01

Pos logit contributions
 surfing+0.028
ained+0.028
ucer+0.027
 chess+0.026
ennis+0.026
Neg logit contributions
ly-0.014
 ponies-0.014
 little-0.014
 bells-0.014
 ow-0.014
Token.
Token ID13
Feature activation+7.79e-02

Pos logit contributions
 P+0.013
 Linda+0.012
 chess+0.012
 Jelly+0.011
 surfing+0.011
Neg logit contributions
ly-0.009
der-0.008
affe-0.008
folk-0.008
thia-0.008
Token She
Token ID1375
Feature activation-8.29e-01

Pos logit contributions
 P+0.002
 chess+0.002
 Linda+0.002
 surfing+0.002
 noon+0.002
Neg logit contributions
ly-0.002
Exc-0.002
thia-0.002
folk-0.002
der-0.002
Token was
Token ID373
Feature activation-3.69e-01

Pos logit contributions
folk+0.024
 ponies+0.022
ly+0.022
affe+0.021
 strangers+0.021
Neg logit contributions
 P-0.024
 Linda-0.022
 chess-0.021
 Jelly-0.021
 ink-0.018
Token so
Token ID523
Feature activation-7.85e-02

Pos logit contributions
folk+0.009
ly+0.009
thia+0.009
affe+0.009
 ponies+0.009
Neg logit contributions
 P-0.011
 chess-0.011
 Linda-0.011
 Jelly-0.010
 surfing-0.009
Token excited
Token ID6568
Feature activation-1.67e-01

Pos logit contributions
folk+0.002
ly+0.002
der+0.002
affe+0.002
 ponies+0.002
Neg logit contributions
 P-0.002
 Linda-0.002
 chess-0.002
 late-0.002
 Jelly-0.002
Token at
Token ID379
Feature activation-5.98e-01

Pos logit contributions
folk+0.004
thia+0.004
affe+0.004
 ponies+0.004
ly+0.004
Neg logit contributions
 P-0.005
 Linda-0.005
 chess-0.005
 Jelly-0.004
 ink-0.004
Token this
Token ID428
Feature activation-9.04e-01

Pos logit contributions
folk+0.018
ly+0.018
thia+0.018
affe+0.018
 ponies+0.018
Neg logit contributions
 P-0.015
 chess-0.014
 Linda-0.014
 Jelly-0.012
 ink-0.012
Token special
Token ID2041
Feature activation-4.11e-01

Pos logit contributions
folk+0.022
thia+0.021
 ponies+0.021
ly+0.021
affe+0.020
Neg logit contributions
 P-0.028
 Linda-0.026
 chess-0.026
 noon-0.024
 Jelly-0.024
Token and
Token ID290
Feature activation-5.49e-01

Pos logit contributions
ly+0.019
affe+0.018
der+0.018
 ponies+0.017
folk+0.017
Neg logit contributions
 P-0.023
 Linda-0.022
 chess-0.022
 Jelly-0.020
 Ink-0.019
Token a
Token ID257
Feature activation-3.37e-01

Pos logit contributions
ly+0.017
folk+0.017
 ponies+0.017
thia+0.017
affe+0.017
Neg logit contributions
 P-0.013
 Linda-0.012
 chess-0.012
 Jelly-0.010
 surfing-0.010
Token dress
Token ID6576
Feature activation-4.22e-01

Pos logit contributions
affe+0.007
thia+0.007
 ponies+0.007
 bell+0.007
ly+0.007
Neg logit contributions
 P-0.011
 Linda-0.011
 chess-0.010
ores-0.010
 noon-0.010
Token.
Token ID13
Feature activation-2.36e-01

Pos logit contributions
ly+0.010
der+0.009
affe+0.009
 ponies+0.009
folk+0.008
Neg logit contributions
 P-0.015
 Linda-0.014
 chess-0.014
 Ink-0.013
 Jelly-0.013
Token He
Token ID679
Feature activation-4.10e-01

Pos logit contributions
folk+0.005
affe+0.004
Exc+0.004
 ponies+0.004
der+0.004
Neg logit contributions
 P-0.008
 chess-0.008
 Linda-0.007
 surfing-0.007
 noon-0.007
Token wondered
Token ID14028
Feature activation-8.23e-01

Pos logit contributions
folk+0.008
ly+0.008
 ponies+0.008
affe+0.008
thia+0.008
Neg logit contributions
 P-0.014
 chess-0.013
 Linda-0.013
 Jelly-0.012
 ink-0.012
Token what
Token ID644
Feature activation-9.26e-01

Pos logit contributions
folk+0.024
thia+0.024
affe+0.024
anna+0.023
ly+0.023
Neg logit contributions
 P-0.022
 chess-0.021
 Linda-0.020
 surfing-0.018
 Jelly-0.017
Token it
Token ID340
Feature activation+3.70e-01

Pos logit contributions
folk+0.024
ly+0.024
thia+0.023
affe+0.023
 ponies+0.023
Neg logit contributions
 P-0.027
 chess-0.025
 Linda-0.025
 Jelly-0.022
 ink-0.022
Token was
Token ID373
Feature activation+6.50e-01

Pos logit contributions
 P+0.013
 chess+0.012
 Linda+0.012
 Jelly+0.011
 surfing+0.011
Neg logit contributions
folk-0.007
ly-0.007
thia-0.007
affe-0.007
 ponies-0.007
Token like
Token ID588
Feature activation+1.68e-01

Pos logit contributions
 P+0.022
 Linda+0.022
 chess+0.021
 Jelly+0.020
 Ink+0.020
Neg logit contributions
ly-0.013
 ponies-0.012
 strangers-0.012
anna-0.012
der-0.012
Token to
Token ID284
Feature activation-3.69e-01

Pos logit contributions
 P+0.006
 Linda+0.006
 chess+0.005
ucer+0.005
 Jelly+0.005
Neg logit contributions
ly-0.003
 ponies-0.003
der-0.003
 strangers-0.003
affe-0.003
Token was
Token ID373
Feature activation-3.50e-01

Pos logit contributions
folk+0.015
 ponies+0.015
 strangers+0.014
affe+0.014
der+0.014
Neg logit contributions
 P-0.018
 Linda-0.017
 chess-0.016
 Jelly-0.016
 ink-0.014
Token excited
Token ID6568
Feature activation+5.78e-02

Pos logit contributions
affe+0.009
ly+0.009
folk+0.009
der+0.008
thia+0.008
Neg logit contributions
 P-0.011
 Linda-0.010
 chess-0.010
 Jelly-0.009
 surfing-0.009
Token when
Token ID618
Feature activation+6.97e-01

Pos logit contributions
 P+0.002
 chess+0.002
 Linda+0.002
 Jelly+0.001
 surfing+0.001
Neg logit contributions
ly-0.001
folk-0.001
 ponies-0.001
thia-0.001
affe-0.001
Token the
Token ID262
Feature activation-7.43e-01

Pos logit contributions
 P+0.020
 chess+0.020
 Linda+0.019
ucer+0.017
 ink+0.017
Neg logit contributions
 ponies-0.018
folk-0.017
 strangers-0.017
thia-0.017
 bells-0.016
Token big
Token ID1263
Feature activation-6.67e-01

Pos logit contributions
folk+0.019
ly+0.019
 ponies+0.018
thia+0.018
affe+0.018
Neg logit contributions
 P-0.022
 Linda-0.021
 chess-0.021
 Jelly-0.019
 ink-0.018
Token band
Token ID4097
Feature activation+1.56e-01

Pos logit contributions
thia+0.016
ly+0.016
affe+0.016
folk+0.016
der+0.016
Neg logit contributions
 P-0.020
 chess-0.019
 Linda-0.019
 Jelly-0.017
 surfing-0.017
Token started
Token ID2067
Feature activation+8.09e-01

Pos logit contributions
 P+0.005
 chess+0.005
 Linda+0.005
 Jelly+0.004
 ink+0.004
Neg logit contributions
folk-0.003
ly-0.003
thia-0.003
affe-0.003
 ponies-0.003
Token to
Token ID284
Feature activation+6.22e-01

Pos logit contributions
 P+0.025
 chess+0.024
 Linda+0.023
 Ink+0.022
ucer+0.021
Neg logit contributions
 ponies-0.019
ly-0.019
folk-0.018
 bells-0.018
 strangers-0.018
Token play
Token ID711
Feature activation+3.54e-01

Pos logit contributions
 P+0.019
 chess+0.018
 Linda+0.017
 Ink+0.016
 Jelly+0.015
Neg logit contributions
folk-0.015
 bells-0.015
ly-0.015
thia-0.015
 ponies-0.015
Token.
Token ID13
Feature activation+7.96e-02

Pos logit contributions
 P+0.011
 Linda+0.011
 chess+0.010
 Jelly+0.010
 Ink+0.010
Neg logit contributions
ly-0.008
folk-0.008
 ponies-0.008
der-0.007
co-0.007
Token She
Token ID1375
Feature activation-7.84e-01

Pos logit contributions
 P+0.003
 chess+0.003
 Linda+0.003
 Jelly+0.002
 surfing+0.002
Neg logit contributions
folk-0.002
ly-0.002
thia-0.002
 ponies-0.002
affe-0.002
Token<|endoftext|>
Token ID50256
Feature activation+9.15e-01

Pos logit contributions
 chess+0.004
 surfing+0.004
 P+0.004
ucer+0.004
 Linda+0.004
Neg logit contributions
der-0.002
thia-0.002
Exc-0.002
folk-0.002
 blamed-0.002
TokenS
Token ID50
Feature activation+1.98e+00

Pos logit contributions
 Linda+0.025
 P+0.025
 chess+0.024
 Jelly+0.022
ores+0.022
Neg logit contributions
folk-0.023
 ponies-0.022
Exc-0.021
ly-0.021
affe-0.021
Tokenara
Token ID3301
Feature activation+3.51e-01

Pos logit contributions
 chess+0.053
 P+0.053
 Linda+0.052
 rid+0.047
 late+0.047
Neg logit contributions
thia-0.053
affe-0.050
 ponies-0.049
ly-0.049
Exc-0.048
Token loved
Token ID6151
Feature activation-6.72e-01

Pos logit contributions
 P+0.014
 Linda+0.013
 chess+0.013
 Jelly+0.012
 surfing+0.012
Neg logit contributions
ly-0.006
thia-0.005
folk-0.005
affe-0.005
 ponies-0.005
Token to
Token ID284
Feature activation+4.38e-01

Pos logit contributions
 ponies+0.017
ly+0.015
 bells+0.015
folk+0.015
 strangers+0.015
Neg logit contributions
 Linda-0.020
 P-0.020
 chess-0.018
 Jelly-0.018
ucer-0.017
Token look
Token ID804
Feature activation-2.51e+00

Pos logit contributions
 P+0.011
 Linda+0.010
 late+0.010
ucer+0.010
 chess+0.010
Neg logit contributions
 ponies-0.012
thia-0.011
 bells-0.011
folk-0.011
ly-0.011
Token at
Token ID379
Feature activation-1.78e-01

Pos logit contributions
folk+0.057
thia+0.055
affe+0.054
 ponies+0.052
keys+0.052
Neg logit contributions
 P-0.083
 chess-0.079
 Linda-0.079
 Jelly-0.071
 surfing-0.070
Token the
Token ID262
Feature activation+2.16e-01

Pos logit contributions
folk+0.005
ly+0.005
 ponies+0.005
thia+0.005
affe+0.005
Neg logit contributions
 P-0.005
 chess-0.004
 Linda-0.004
 Jelly-0.004
 surfing-0.004
Token sky
Token ID6766
Feature activation+6.48e-01

Pos logit contributions
 P+0.007
 Linda+0.007
 chess+0.006
ores+0.006
 Jelly+0.006
Neg logit contributions
 ponies-0.005
ly-0.005
folk-0.004
 bells-0.004
affe-0.004
Token and
Token ID290
Feature activation-3.20e-01

Pos logit contributions
 P+0.024
 Linda+0.023
 chess+0.022
 Jak+0.021
 rid+0.021
Neg logit contributions
ly-0.012
 ponies-0.011
der-0.011
 bells-0.011
folk-0.011
Token see
Token ID766
Feature activation-1.02e+00

Pos logit contributions
affe+0.010
ly+0.010
folk+0.010
uted+0.010
der+0.010
Neg logit contributions
 P-0.008
 chess-0.007
 Linda-0.007
 Jelly-0.007
 surfing-0.007
Token,
Token ID11
Feature activation-6.79e-01

Pos logit contributions
ly+0.008
folk+0.008
thia+0.008
affe+0.008
 ponies+0.008
Neg logit contributions
 P-0.013
 chess-0.013
 Linda-0.013
 Jelly-0.011
 surfing-0.011
Token stop
Token ID2245
Feature activation+2.48e-01

Pos logit contributions
folk+0.018
ly+0.018
thia+0.018
affe+0.018
 ponies+0.018
Neg logit contributions
 P-0.019
 Linda-0.018
 chess-0.018
 Jelly-0.016
 surfing-0.016
Token!"
Token ID2474
Feature activation-2.31e-01

Pos logit contributions
 P+0.008
 chess+0.008
 Linda+0.008
 Ink+0.007
 Jelly+0.007
Neg logit contributions
ly-0.006
 ow-0.005
 strangers-0.005
 ponies-0.005
thia-0.005
Token Sara
Token ID24799
Feature activation+8.57e-01

Pos logit contributions
folk+0.007
ly+0.007
thia+0.007
 ponies+0.007
affe+0.007
Neg logit contributions
 P-0.005
 chess-0.005
 Linda-0.005
 Jelly-0.004
 surfing-0.004
Token cries
Token ID24691
Feature activation+2.21e-01

Pos logit contributions
 P+0.026
 Linda+0.024
 chess+0.024
 Jelly+0.022
 surfing+0.021
Neg logit contributions
folk-0.022
thia-0.021
affe-0.021
anna-0.020
ly-0.020
Token.
Token ID13
Feature activation-2.37e+00

Pos logit contributions
 P+0.006
 Linda+0.005
 chess+0.005
 Jelly+0.004
 late+0.004
Neg logit contributions
folk-0.007
thia-0.007
affe-0.007
 ponies-0.007
keys-0.007
Token "
Token ID366
Feature activation-1.52e+00

Pos logit contributions
keys+0.069
folk+0.068
affe+0.066
ly+0.065
anna+0.064
Neg logit contributions
 P-0.070
 Linda-0.060
 chess-0.058
 Jelly-0.056
 Ink-0.055
TokenThat
Token ID2504
Feature activation-7.55e-03

Pos logit contributions
folk+0.045
ly+0.045
thia+0.044
affe+0.044
 ponies+0.044
Neg logit contributions
 P-0.038
 chess-0.036
 Linda-0.036
 Jelly-0.031
 surfing-0.030
Token's
Token ID338
Feature activation-1.43e-01

Pos logit contributions
ly+0.000
 bell+0.000
affe+0.000
thia+0.000
der+0.000
Neg logit contributions
 P-0.000
 Linda-0.000
ores-0.000
 chess-0.000
 surfing-0.000
Token not
Token ID407
Feature activation+3.76e-01

Pos logit contributions
ly+0.003
thia+0.003
 ponies+0.003
affe+0.003
 strangers+0.003
Neg logit contributions
 P-0.004
 chess-0.004
 Linda-0.004
 Jelly-0.004
 surfing-0.004
Token yours
Token ID12431
Feature activation-6.71e-01

Pos logit contributions
 P+0.013
 Linda+0.013
 chess+0.012
 Jelly+0.011
 surfing+0.011
Neg logit contributions
ly-0.008
 ponies-0.007
affe-0.007
 bells-0.007
folk-0.007
Token."
Token ID526
Feature activation+7.99e-01

Pos logit contributions
 P+0.005
 Linda+0.005
 chess+0.005
 Ink+0.004
 Jelly+0.004
Neg logit contributions
ly-0.003
 ponies-0.003
der-0.003
folk-0.003
keys-0.003
Token They
Token ID1119
Feature activation+1.80e-02

Pos logit contributions
 possible+0.027
 chess+0.027
 P+0.026
 surfing+0.026
 Linda+0.025
Neg logit contributions
ly-0.015
anna-0.014
folk-0.014
Exc-0.014
thia-0.014
Token each
Token ID1123
Feature activation-5.57e-01

Pos logit contributions
 P+0.001
 Linda+0.001
 chess+0.001
 Jelly+0.000
 noon+0.000
Neg logit contributions
folk-0.000
der-0.000
affe-0.000
thia-0.000
ly-0.000
Token picked
Token ID6497
Feature activation-5.19e-01

Pos logit contributions
folk+0.013
der+0.013
thia+0.013
 ponies+0.012
affe+0.012
Neg logit contributions
 P-0.018
 Linda-0.017
 chess-0.017
 Jelly-0.015
 ink-0.015
Token a
Token ID257
Feature activation-4.23e-01

Pos logit contributions
ly+0.014
 ponies+0.014
folk+0.012
affe+0.012
 bells+0.012
Neg logit contributions
 P-0.014
 Linda-0.014
 chess-0.014
 surfing-0.013
 Ink-0.012
Token few
Token ID1178
Feature activation-2.08e+00

Pos logit contributions
ly+0.010
 ponies+0.010
thia+0.010
affe+0.010
folk+0.010
Neg logit contributions
 P-0.013
 Linda-0.012
 chess-0.012
 Jelly-0.011
 surfing-0.011
Token blo
Token ID24924
Feature activation-3.56e-03

Pos logit contributions
thia+0.067
folk+0.065
affe+0.063
uted+0.060
lder+0.059
Neg logit contributions
 P-0.047
 chess-0.045
 Linda-0.044
 ink-0.044
 tea-0.041
Tokenoms
Token ID3150
Feature activation+4.55e-01

Pos logit contributions
ly+0.000
folk+0.000
 ponies+0.000
der+0.000
anna+0.000
Neg logit contributions
 P-0.000
 chess-0.000
 Linda-0.000
 Jelly-0.000
 Ink-0.000
Token and
Token ID290
Feature activation+2.81e-01

Pos logit contributions
 P+0.017
 chess+0.017
 Linda+0.016
 Jelly+0.016
 Ink+0.016
Neg logit contributions
ly-0.008
folk-0.007
 ponies-0.007
affe-0.007
co-0.006
Token smelled
Token ID37609
Feature activation+7.66e-01

Pos logit contributions
 P+0.008
 chess+0.008
 Linda+0.008
 Jelly+0.007
 surfing+0.007
Neg logit contributions
ly-0.007
folk-0.007
thia-0.007
 ponies-0.007
affe-0.007
Token them
Token ID606
Feature activation+9.01e-01

Pos logit contributions
 P+0.022
 chess+0.022
 Linda+0.021
 Ink+0.019
ucer+0.018
Neg logit contributions
ly-0.020
 ponies-0.019
der-0.019
folk-0.018
 strangers-0.018
Token kernels
Token ID50207
Feature activation-1.83e-01

Pos logit contributions
folk+0.037
ly+0.037
thia+0.037
affe+0.037
 ponies+0.036
Neg logit contributions
 P-0.038
 chess-0.037
 Linda-0.036
 Jelly-0.032
 surfing-0.032
Token in
Token ID287
Feature activation+2.12e-01

Pos logit contributions
ly+0.004
der+0.003
folk+0.003
keys+0.003
 ponies+0.003
Neg logit contributions
 P-0.007
 Linda-0.006
 chess-0.006
 Jelly-0.006
 Jazz-0.006
Token the
Token ID262
Feature activation-7.72e-01

Pos logit contributions
 P+0.007
 Linda+0.006
 chess+0.006
 surfing+0.006
ucer+0.006
Neg logit contributions
 bells-0.004
 ponies-0.004
folk-0.004
der-0.004
 strangers-0.004
Token popcorn
Token ID38914
Feature activation+5.21e-01

Pos logit contributions
ly+0.020
folk+0.020
 ponies+0.020
thia+0.019
 bells+0.019
Neg logit contributions
 P-0.023
 Linda-0.022
 chess-0.020
 Jelly-0.019
 surfing-0.018
Token maker
Token ID16009
Feature activation+3.80e-01

Pos logit contributions
 P+0.018
 Linda+0.017
 chess+0.016
 Jelly+0.015
 Ink+0.015
Neg logit contributions
ly-0.011
der-0.011
folk-0.011
affe-0.010
 ponies-0.010
Token.
Token ID13
Feature activation-2.16e+00

Pos logit contributions
 P+0.013
 Linda+0.013
 chess+0.012
 Jelly+0.012
 Ink+0.011
Neg logit contributions
ly-0.008
folk-0.007
der-0.007
affe-0.007
 ponies-0.007
Token Bob
Token ID5811
Feature activation+2.37e-01

Pos logit contributions
ly+0.060
folk+0.059
 ponies+0.057
affe+0.057
thia+0.057
Neg logit contributions
 P-0.061
 Linda-0.054
 chess-0.054
 Jelly-0.051
 noon-0.049
Token was
Token ID373
Feature activation+1.92e-01

Pos logit contributions
 P+0.007
 chess+0.007
 Linda+0.007
 Jelly+0.006
 Ink+0.006
Neg logit contributions
 ponies-0.006
folk-0.006
ly-0.005
 strangers-0.005
affe-0.005
Token so
Token ID523
Feature activation+1.47e-01

Pos logit contributions
 P+0.006
 chess+0.006
 Linda+0.006
 Jelly+0.005
 ink+0.005
Neg logit contributions
folk-0.004
thia-0.004
 ponies-0.004
ly-0.004
der-0.004
Token excited
Token ID6568
Feature activation+2.75e-01

Pos logit contributions
 P+0.005
 chess+0.005
 Linda+0.005
 Jelly+0.004
 ink+0.004
Neg logit contributions
folk-0.003
ly-0.003
thia-0.003
affe-0.003
 ponies-0.003
Token.
Token ID13
Feature activation-1.38e+00

Pos logit contributions
 P+0.009
 Linda+0.009
 chess+0.009
 Jelly+0.008
 Ink+0.008
Neg logit contributions
ly-0.006
der-0.006
 ponies-0.006
folk-0.006
affe-0.006
Token sister
Token ID6621
Feature activation-2.30e-01

Pos logit contributions
 P+0.032
 Linda+0.031
 chess+0.030
 Jelly+0.028
 surfing+0.028
Neg logit contributions
folk-0.015
ly-0.014
 ponies-0.014
affe-0.014
thia-0.014
Token.
Token ID13
Feature activation+2.21e-01

Pos logit contributions
 ponies+0.004
affe+0.004
ly+0.004
folk+0.004
der+0.004
Neg logit contributions
 P-0.008
 Linda-0.008
ucer-0.008
 chess-0.008
 noon-0.007
Token<|endoftext|>
Token ID50256
Feature activation+3.86e-01

Pos logit contributions
 chess+0.008
 Linda+0.007
 surfing+0.007
 P+0.007
ucer+0.007
Neg logit contributions
affe-0.004
der-0.004
thia-0.004
folk-0.003
Exc-0.003
TokenL
Token ID43
Feature activation+4.23e-01

Pos logit contributions
 P+0.011
 chess+0.010
 Linda+0.010
 Jelly+0.009
 noon+0.009
Neg logit contributions
folk-0.010
ly-0.010
affe-0.010
thia-0.010
 ponies-0.010
Tokenily
Token ID813
Feature activation-6.89e-01

Pos logit contributions
 P+0.014
 chess+0.013
 Linda+0.013
 Jelly+0.012
 surfing+0.011
Neg logit contributions
ly-0.010
folk-0.009
thia-0.009
affe-0.009
 ponies-0.009
Token and
Token ID290
Feature activation-2.09e+00

Pos logit contributions
ly+0.012
thia+0.012
folk+0.012
affe+0.012
 ponies+0.012
Neg logit contributions
 P-0.025
 Linda-0.024
 chess-0.024
 Jelly-0.022
 surfing-0.022
Token Ben
Token ID3932
Feature activation-8.29e-01

Pos logit contributions
der+0.069
folk+0.066
thia+0.065
ly+0.064
affe+0.064
Neg logit contributions
 P-0.052
 Linda-0.045
 chess-0.045
 noon-0.041
 Jelly-0.040
Token were
Token ID547
Feature activation-9.31e-01

Pos logit contributions
ly+0.015
folk+0.015
thia+0.014
 ponies+0.014
affe+0.014
Neg logit contributions
 P-0.030
 Linda-0.029
 chess-0.029
 Jelly-0.026
 surfing-0.026
Token best
Token ID1266
Feature activation-1.47e+00

Pos logit contributions
 ponies+0.022
folk+0.021
thia+0.021
ly+0.021
anna+0.020
Neg logit contributions
 P-0.029
 Linda-0.028
 chess-0.027
 Jelly-0.025
 ink-0.024
Token friends
Token ID2460
Feature activation-1.91e+00

Pos logit contributions
folk+0.038
der+0.035
thia+0.034
ly+0.034
anna+0.034
Neg logit contributions
 P-0.046
 chess-0.044
 Linda-0.043
 surfing-0.041
 Jelly-0.038
Token.
Token ID13
Feature activation+6.32e-01

Pos logit contributions
folk+0.049
thia+0.048
ly+0.048
affe+0.048
 ponies+0.047
Neg logit contributions
 P-0.056
 chess-0.053
 Linda-0.053
 Jelly-0.047
 surfing-0.046