TOP ACTIVATIONS
MAX = 7.884

has lettuce and tomatoes and cuc umbers . And it has
John chose some carrots and cuc umbers to take home .
like lettuce , tomatoes and cuc umbers . He smiled as
some y ummy carrots and cuc umbers from the farmer 's
tomato , and then a cuc umber . Lily and Tom
cut the lettuce and the cuc umber . She said he
and the peppers from the cuc umbers . It took a
cuc umber . Mama puts cuc umbers in our salad .
big salad with carrots , cuc umbers , and tomatoes .
chopped up some tomatoes , cuc umbers , and lettuce .
I think it is a cuc umber . Mama puts cuc
saw carrots , potatoes , cuc umbers and tomatoes . He
the bowl . Lucy put cuc umbers , tomatoes and some
. You found a nice cuc umber . It will be
thank you . We love cuc umbers . And we love
house . They wash the cuc umber and cut it into
monkeys and elephants and gir aff es . They were very
They let them run and gra ze on the grass .
. They were called ost ric hes . Lily and Ben
many fish and shells and cor als . They liked to

MIN ACTIVATIONS
MIN = -5.126

their favorite thing was the merry - go - round .
had colorful pol ka - d ots all over it .
bounce on the big tr amp oline in their backyard .
backyard on a big tr amp oline . T
big , boun cy tr amp oline . They all cheered
and found a big tr amp oline . He jumped on
sofa , on the tr amp oline . They bounce high
They forgot about the broken merry - go - round .
playing outside on the tr amp oline . He was excited
to jump on the tr amp oline . It is fun
pretending it was a tr amp oline . One
She cried and threw a tant rum . Her mom tried
swing and play on the merry - go - round .
. He was wearing his brand new icy blue jacket as
colorful , pol ka - d otted sheep all over it
It 's like a big merry - go - round !"
sad . She threw a tant rum and cried loudly .
the sees aw , the merry - go - round .
park . Sara is the seeker and Ben is the h
park . Lily was the seeker and Max was the h

INTERVAL 4.631 - 7.884
CONTAINS 0.002%

cuc umber . Mama puts cuc umbers in our salad .
gift from God and he pr ays every day to thank
. He felt bigger and bra ver than before . He
Tweet felt even stronger and bra ver than before . <|endoftext|>
very naughty and always mis beh aving . One day she

INTERVAL 1.379 - 4.631
CONTAINS 3.768%

ons fly around in her garden . All of a sudden
a bowl of hot cooked rice for breakfast .
Look , Sue , I have a rod !" Tom says
go to the park and sing in the sun . She
. " St ir , stir , stir !"

INTERVAL -1.874 - 1.379
CONTAINS 94.795%

for the puppy and decided to help him find his way
to keep the silver coin in her pocket and show it
Ben did not listen to Lily . He thought
to Ben , " I know where we can find more
Ab bi 's garden was starting to look very different .

INTERVAL -5.126 - -1.874
CONTAINS 1.435%

." The old lady smiled back and continued to walk .
It was her dad 's car . He was coming home
They look at the broken box and cry .
L ily smiled back . She said , "
car and a pretty purple dress . She was so happy
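
Note on the interval boundaries: the four INTERVAL ranges above appear consistent with splitting the span from MIN to MAX into four equal-width bins, and the four CONTAINS percentages add up to 100%. This is an inference from the printed numbers, not something the dump states. The sketch below recomputes the boundaries under that assumption. The names ACT_MIN, ACT_MAX, N_BINS and interval_edges are hypothetical and chosen only for illustration.

```python
# Values copied from the MAX / MIN headers and the CONTAINS lines above.
ACT_MIN = -5.126
ACT_MAX = 7.884
N_BINS = 4  # assumption: four equal-width intervals


def interval_edges(lo, hi, n_bins):
    """Return the n_bins + 1 boundaries of equal-width bins spanning [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


edges = interval_edges(ACT_MIN, ACT_MAX, N_BINS)

# Share of tokens per interval, listed from the highest interval to the lowest.
shares = [0.002, 3.768, 94.795, 1.435]
total_share = sum(shares)

if __name__ == "__main__":
    # Print each bin as "low .. high" so it can be compared with the INTERVAL lines.
    for low, high in zip(edges[:-1], edges[1:]):
        print(f"{low:.4f} .. {high:.4f}")
    print(f"total share: {total_share:.3f}%")
```

The recomputed inner boundaries can differ from the printed ones in the last decimal place, presumably because MIN and MAX are themselves rounded to three decimals in the dump.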
Token has
Token ID468
Feature activation-3.94e-01

Pos logit contributions
lf+0.022
 cereal+0.022
James+0.020
 spaghetti+0.020
bes+0.020
Neg logit contributions
 Pet-0.018
 rob-0.018
 tort-0.017
 tou-0.017
 encountered-0.016
Token lettuce
Token ID39406
Feature activation+3.00e+00

Pos logit contributions
lf+0.010
 cereal+0.009
James+0.009
ma+0.009
bes+0.009
Neg logit contributions
 Pet-0.010
 rob-0.010
 tou-0.009
 tort-0.009
 encountered-0.009
Token and
Token ID290
Feature activation-4.23e-01

Pos logit contributions
 Pet+0.071
 tort+0.068
 Kang+0.068
 sur+0.068
 rob+0.067
Neg logit contributions
 cereal-0.081
lf-0.075
 spaghetti-0.075
 Amy-0.069
bes-0.069
Token tomatoes
Token ID23972
Feature activation+3.46e+00

Pos logit contributions
lf+0.010
James+0.010
Kate+0.009
ma+0.009
bes+0.009
Neg logit contributions
 Pet-0.011
 rob-0.011
 tort-0.010
 tou-0.010
 mag-0.010
Token and
Token ID290
Feature activation+1.85e-01

Pos logit contributions
 Pet+0.081
 Kang+0.078
 tort+0.078
 sur+0.077
 rob+0.076
Neg logit contributions
 cereal-0.096
 spaghetti-0.088
lf-0.088
 Amy-0.081
bes-0.081
Token cuc
Token ID38421
Feature activation+7.85e+00

Pos logit contributions
 Pet+0.005
 rob+0.005
 tort+0.005
 tou+0.005
 sur+0.005
Neg logit contributions
lf-0.004
 cereal-0.004
James-0.004
ma-0.004
Kate-0.004
Tokenumbers
Token ID17024
Feature activation+2.59e+00

Pos logit contributions
 Pet+0.218
 tort+0.212
 Kang+0.211
 ko+0.211
 rob+0.209
Neg logit contributions
 cereal-0.186
 spaghetti-0.166
lf-0.156
ma-0.150
bes-0.148
Token.
Token ID13
Feature activation+8.57e-02

Pos logit contributions
 Pet+0.062
 tort+0.061
 Kang+0.061
 sur+0.060
 ko+0.059
Neg logit contributions
 cereal-0.070
 spaghetti-0.064
lf-0.063
ma-0.059
 Amy-0.059
Token And
Token ID843
Feature activation+5.69e-01

Pos logit contributions
 Pet+0.002
 Kang+0.002
 rob+0.002
 sur+0.002
 tort+0.002
Neg logit contributions
lf-0.002
 cereal-0.002
ma-0.002
 spaghetti-0.002
bes-0.002
Token it
Token ID340
Feature activation-2.14e-01

Pos logit contributions
 Pet+0.013
 rob+0.012
 sur+0.012
 tort+0.012
 Kang+0.012
Neg logit contributions
 cereal-0.016
lf-0.015
 spaghetti-0.014
James-0.014
bes-0.014
Token has
Token ID468
Feature activation-5.19e-02

Pos logit contributions
 cereal+0.005
lf+0.005
 spaghetti+0.005
James+0.005
 Amy+0.004
Neg logit contributions
 Pet-0.005
 tort-0.005
 rob-0.005
 sur-0.005
 Kang-0.005
Token John
Token ID1757
Feature activation+4.75e-01

Pos logit contributions
 Kang+0.019
 unnecessary+0.019
 tort+0.019
 gol+0.019
 tou+0.018
Neg logit contributions
 cereal-0.012
lf-0.012
 Amy-0.011
 Kate-0.011
ma-0.011
Token chose
Token ID7690
Feature activation+1.01e+00

Pos logit contributions
 Pet+0.011
 tort+0.011
 Kang+0.010
 rob+0.010
 sur+0.010
Neg logit contributions
 cereal-0.013
lf-0.012
 spaghetti-0.012
ma-0.011
 Amy-0.011
Token some
Token ID617
Feature activation+2.26e-02

Pos logit contributions
 Pet+0.024
 tort+0.023
 rob+0.023
 sur+0.022
 Kang+0.022
Neg logit contributions
 cereal-0.027
lf-0.025
 spaghetti-0.025
James-0.023
 Amy-0.023
Token carrots
Token ID34397
Feature activation+2.98e+00

Pos logit contributions
 Pet+0.001
 tort+0.001
 rob+0.001
 sur+0.001
 Kang+0.001
Neg logit contributions
 cereal-0.001
lf-0.001
 spaghetti-0.000
James-0.000
bes-0.000
Token and
Token ID290
Feature activation+3.76e-02

Pos logit contributions
 Pet+0.062
 tort+0.059
 sur+0.059
 Kang+0.059
 rob+0.059
Neg logit contributions
 cereal-0.088
 spaghetti-0.081
lf-0.081
 Amy-0.076
ma-0.076
Token cuc
Token ID38421
Feature activation+7.48e+00

Pos logit contributions
 Pet+0.001
 tort+0.001
 sur+0.001
 Kang+0.001
 rob+0.001
Neg logit contributions
 cereal-0.001
lf-0.001
 spaghetti-0.001
 Amy-0.001
James-0.001
Tokenumbers
Token ID17024
Feature activation+2.46e+00

Pos logit contributions
 Pet+0.201
 ko+0.199
 tou+0.199
 Kang+0.197
 rob+0.197
Neg logit contributions
 cereal-0.182
 spaghetti-0.159
lf-0.148
ma-0.147
bes-0.142
Token to
Token ID284
Feature activation-3.13e-01

Pos logit contributions
 Pet+0.054
 tort+0.052
 Kang+0.052
 sur+0.052
 rob+0.051
Neg logit contributions
 cereal-0.071
lf-0.065
 spaghetti-0.064
 Amy-0.060
ma-0.060
Token take
Token ID1011
Feature activation+2.31e-01

Pos logit contributions
 cereal+0.010
lf+0.009
 spaghetti+0.009
 Amy+0.008
James+0.008
Neg logit contributions
 Pet-0.006
 tort-0.006
 sur-0.006
 Kang-0.006
 rob-0.006
Token home
Token ID1363
Feature activation+6.34e-01

Pos logit contributions
 Pet+0.006
 tort+0.006
 rob+0.005
 sur+0.005
 Kang+0.005
Neg logit contributions
 cereal-0.006
lf-0.006
 spaghetti-0.005
James-0.005
 Amy-0.005
Token.
Token ID13
Feature activation-9.10e-01

Pos logit contributions
 Pet+0.014
 tort+0.013
 Kang+0.013
 sur+0.013
 rob+0.013
Neg logit contributions
 cereal-0.018
lf-0.017
 spaghetti-0.017
 Amy-0.016
James-0.016
Token like
Token ID588
Feature activation-3.68e-01

Pos logit contributions
 Pet+0.030
 tort+0.029
 Kang+0.029
 sur+0.029
 rob+0.029
Neg logit contributions
 cereal-0.042
lf-0.039
 spaghetti-0.038
 Amy-0.036
James-0.036
Token lettuce
Token ID39406
Feature activation+3.21e+00

Pos logit contributions
 cereal+0.010
lf+0.009
 spaghetti+0.009
 Amy+0.008
James+0.008
Neg logit contributions
 Pet-0.009
 tort-0.009
 sur-0.008
 Kang-0.008
 rob-0.008
Token,
Token ID11
Feature activation-1.01e+00

Pos logit contributions
 Pet+0.070
 sur+0.067
 Kang+0.067
 tort+0.066
 rob+0.065
Neg logit contributions
 cereal-0.094
 spaghetti-0.087
lf-0.086
 Amy-0.080
bes-0.079
Token tomatoes
Token ID23972
Feature activation+2.38e+00

Pos logit contributions
 cereal+0.024
lf+0.024
James+0.022
Kate+0.022
ma+0.021
Neg logit contributions
 Pet-0.026
 rob-0.025
 tort-0.025
 Kang-0.024
 tou-0.024
Token and
Token ID290
Feature activation-6.38e-01

Pos logit contributions
 Pet+0.050
 sur+0.048
 Kang+0.048
 tort+0.048
 rob+0.047
Neg logit contributions
 cereal-0.071
lf-0.065
 spaghetti-0.065
 Amy-0.061
ma-0.060
Token cuc
Token ID38421
Feature activation+7.48e+00

Pos logit contributions
 cereal+0.015
lf+0.013
 spaghetti+0.013
 Amy+0.012
James+0.012
Neg logit contributions
 Pet-0.018
 tort-0.017
 sur-0.017
 Kang-0.017
 rob-0.017
Tokenumbers
Token ID17024
Feature activation+1.90e+00

Pos logit contributions
 Pet+0.212
 ko+0.211
 Kang+0.210
 tou+0.206
 tort+0.205
Neg logit contributions
 cereal-0.171
 spaghetti-0.151
ma-0.138
lf-0.137
bes-0.133
Token.
Token ID13
Feature activation-5.53e-01

Pos logit contributions
 Pet+0.043
 tort+0.042
 Kang+0.042
 sur+0.042
 rob+0.041
Neg logit contributions
 cereal-0.053
lf-0.049
 spaghetti-0.048
ma-0.045
 Amy-0.045
Token He
Token ID679
Feature activation+7.08e-02

Pos logit contributions
lf+0.015
 cereal+0.015
 spaghetti+0.014
bes+0.014
ma+0.013
Neg logit contributions
 Pet-0.013
 Kang-0.012
 tort-0.012
 rob-0.012
 ko-0.011
Token smiled
Token ID13541
Feature activation+1.96e-01

Pos logit contributions
 Pet+0.002
 tort+0.002
 rob+0.002
 sur+0.002
 Kang+0.001
Neg logit contributions
 cereal-0.002
lf-0.002
 spaghetti-0.002
James-0.002
 Amy-0.002
Token as
Token ID355
Feature activation+2.55e-01

Pos logit contributions
 tort+0.005
 Pet+0.005
 Kang+0.005
 tou+0.005
 rob+0.005
Neg logit contributions
 cereal-0.005
 spaghetti-0.004
lf-0.004
 Amy-0.004
 Kate-0.004
Token some
Token ID617
Feature activation-6.97e-02

Pos logit contributions
 sur+0.046
 Kang+0.045
 tort+0.045
 rob+0.045
 Pet+0.044
Neg logit contributions
 cereal-0.041
 spaghetti-0.037
 Amy-0.032
lf-0.030
ma-0.029
Token y
Token ID331
Feature activation-1.00e+00

Pos logit contributions
 cereal+0.002
 spaghetti+0.001
lf+0.001
 Amy+0.001
ma+0.001
Neg logit contributions
 Pet-0.002
 sur-0.002
 Kang-0.002
 tort-0.002
 rob-0.002
Tokenummy
Token ID13513
Feature activation+8.91e-01

Pos logit contributions
 cereal+0.019
lf+0.017
 spaghetti+0.016
James+0.015
 Amy+0.015
Neg logit contributions
 Pet-0.032
 tort-0.031
 Kang-0.031
 rob-0.031
 sur-0.031
Token carrots
Token ID34397
Feature activation+2.59e+00

Pos logit contributions
 Pet+0.027
 sur+0.027
 Kang+0.027
 tort+0.027
 gol+0.026
Neg logit contributions
 cereal-0.019
 spaghetti-0.017
lf-0.015
 Amy-0.014
ma-0.014
Token and
Token ID290
Feature activation+3.38e-01

Pos logit contributions
 Pet+0.060
 sur+0.058
 Kang+0.058
 ko+0.058
 tort+0.058
Neg logit contributions
 cereal-0.073
 spaghetti-0.067
lf-0.064
 Amy-0.061
ma-0.061
Token cuc
Token ID38421
Feature activation+7.34e+00

Pos logit contributions
 Pet+0.009
 sur+0.009
 Kang+0.009
 tort+0.009
 rob+0.009
Neg logit contributions
 cereal-0.008
lf-0.007
 spaghetti-0.007
 Amy-0.006
bes-0.006
Tokenumbers
Token ID17024
Feature activation+2.66e+00

Pos logit contributions
 Pet+0.191
 tort+0.184
 rob+0.182
 sur+0.182
 Kang+0.181
Neg logit contributions
 cereal-0.182
lf-0.166
 spaghetti-0.164
James-0.152
ma-0.152
Token from
Token ID422
Feature activation+2.46e-01

Pos logit contributions
 Pet+0.061
 tort+0.060
 Kang+0.060
 sur+0.059
 rob+0.058
Neg logit contributions
 cereal-0.074
 spaghetti-0.068
lf-0.067
ma-0.063
 Amy-0.063
Token the
Token ID262
Feature activation+8.05e-01

Pos logit contributions
 Pet+0.005
 tort+0.005
 rob+0.005
 Kang+0.005
 sur+0.005
Neg logit contributions
 cereal-0.007
lf-0.007
 spaghetti-0.007
James-0.006
bes-0.006
Token farmer
Token ID18739
Feature activation+2.03e+00

Pos logit contributions
 Pet+0.019
 sur+0.019
 Kang+0.018
 tort+0.018
 rob+0.018
Neg logit contributions
 cereal-0.022
 spaghetti-0.020
lf-0.020
 Amy-0.018
ma-0.018
Token's
Token ID338
Feature activation-2.13e-01

Pos logit contributions
 sur+0.047
 Pet+0.046
 ko+0.045
 Kang+0.045
 tort+0.044
Neg logit contributions
 cereal-0.057
 spaghetti-0.053
ma-0.051
lf-0.050
 Amy-0.049
Token tomato
Token ID24240
Feature activation+2.06e+00

Pos logit contributions
 cereal+0.006
lf+0.005
 spaghetti+0.005
pron+0.005
 rug+0.005
Neg logit contributions
 Pet-0.006
 Kang-0.006
ey-0.006
 encountered-0.006
lves-0.006
Token,
Token ID11
Feature activation+6.27e-01

Pos logit contributions
 Pet+0.046
 tort+0.043
 Kang+0.043
 sur+0.043
 rob+0.043
Neg logit contributions
 cereal-0.060
lf-0.056
 spaghetti-0.055
 Amy-0.051
bes-0.051
Token and
Token ID290
Feature activation+3.56e-02

Pos logit contributions
 Pet+0.013
 rob+0.012
 tort+0.012
 sur+0.012
 Kang+0.012
Neg logit contributions
lf-0.019
 cereal-0.018
James-0.017
Kate-0.017
 spaghetti-0.017
Token then
Token ID788
Feature activation-4.69e-01

Pos logit contributions
 Pet+0.001
 rob+0.001
 tort+0.001
 sur+0.001
 tou+0.001
Neg logit contributions
 cereal-0.001
lf-0.001
bes-0.001
James-0.001
 spaghetti-0.001
Token a
Token ID257
Feature activation-2.81e-01

Pos logit contributions
 cereal+0.012
 spaghetti+0.011
 Amy+0.010
lf+0.010
 Kate+0.009
Neg logit contributions
 Kang-0.013
 perspective-0.012
 Pet-0.012
 tort-0.012
ey-0.012
Token cuc
Token ID38421
Feature activation+6.98e+00

Pos logit contributions
 cereal+0.007
 spaghetti+0.006
lf+0.006
 Amy+0.006
pron+0.005
Neg logit contributions
 Pet-0.008
 Kang-0.007
ey-0.007
 encountered-0.007
lves-0.007
Tokenumber
Token ID4494
Feature activation+5.40e-01

Pos logit contributions
 Pet+0.157
 tort+0.150
 rob+0.148
 Kang+0.148
 sur+0.146
Neg logit contributions
 cereal-0.199
lf-0.182
 spaghetti-0.182
ma-0.170
James-0.169
Token.
Token ID13
Feature activation+4.04e-01

Pos logit contributions
 Pet+0.012
 Kang+0.012
 tort+0.012
 ko+0.011
 sur+0.011
Neg logit contributions
 cereal-0.015
lf-0.014
 spaghetti-0.014
ma-0.014
 Amy-0.013
Token Lily
Token ID20037
Feature activation+1.63e-01

Pos logit contributions
 Pet+0.010
 tort+0.009
 Kang+0.009
 rob+0.009
 sur+0.009
Neg logit contributions
 cereal-0.011
lf-0.011
 spaghetti-0.010
ma-0.010
bes-0.009
Token and
Token ID290
Feature activation-1.04e+00

Pos logit contributions
 Pet+0.004
 tort+0.004
 Kang+0.004
 rob+0.004
 sur+0.004
Neg logit contributions
 cereal-0.004
lf-0.004
 spaghetti-0.004
 Amy-0.004
bes-0.004
Token Tom
Token ID4186
Feature activation+2.79e-01

Pos logit contributions
 cereal+0.020
lf+0.020
bes+0.018
Kate+0.018
 spaghetti+0.018
Neg logit contributions
 Pet-0.032
 tort-0.030
 rob-0.030
 tou-0.029
 Kang-0.029
Token cut
Token ID2005
Feature activation+1.43e+00

Pos logit contributions
 Pet+0.002
 rob+0.002
 tort+0.002
 sur+0.002
 Kang+0.001
Neg logit contributions
 cereal-0.003
lf-0.003
James-0.003
 spaghetti-0.003
 Amy-0.003
Token the
Token ID262
Feature activation+1.65e-01

Pos logit contributions
 Pet+0.027
 rob+0.024
 tort+0.024
 Kang+0.024
ive+0.023
Neg logit contributions
lf-0.043
 cereal-0.043
James-0.040
bes-0.040
 spaghetti-0.040
Token lettuce
Token ID39406
Feature activation+3.35e+00

Pos logit contributions
 Pet+0.005
 rob+0.005
 tort+0.005
ive+0.005
 sur+0.005
Neg logit contributions
lf-0.003
James-0.003
 cereal-0.003
Kate-0.002
bes-0.002
Token and
Token ID290
Feature activation+3.37e-01

Pos logit contributions
 Pet+0.079
 Kang+0.076
 sur+0.076
 ko+0.076
 tort+0.075
Neg logit contributions
 cereal-0.096
 spaghetti-0.088
lf-0.085
 Amy-0.079
ma-0.078
Token the
Token ID262
Feature activation+9.67e-01

Pos logit contributions
 Pet+0.009
 rob+0.008
 tort+0.008
 sur+0.008
 Kang+0.008
Neg logit contributions
 cereal-0.008
lf-0.008
 spaghetti-0.007
James-0.007
bes-0.007
Token cuc
Token ID38421
Feature activation+6.72e+00

Pos logit contributions
 Pet+0.029
 rob+0.029
 tort+0.029
 sur+0.028
 tou+0.027
Neg logit contributions
lf-0.018
 cereal-0.018
James-0.016
bes-0.016
Kate-0.016
Tokenumber
Token ID4494
Feature activation+5.99e-01

Pos logit contributions
 Pet+0.168
 tort+0.161
 Kang+0.159
 ko+0.159
 rob+0.158
Neg logit contributions
 cereal-0.179
 spaghetti-0.161
lf-0.160
ma-0.150
bes-0.149
Token.
Token ID13
Feature activation+5.11e-01

Pos logit contributions
 Pet+0.014
 Kang+0.014
 tort+0.014
 sur+0.014
 ko+0.014
Neg logit contributions
 cereal-0.017
lf-0.015
 spaghetti-0.015
ma-0.014
 Amy-0.014
Token She
Token ID1375
Feature activation+1.23e-01

Pos logit contributions
 Pet+0.015
 tort+0.014
 Kang+0.014
 sur+0.014
 rob+0.014
Neg logit contributions
 cereal-0.011
lf-0.010
 spaghetti-0.010
James-0.009
bes-0.009
Token said
Token ID531
Feature activation+6.60e-02

Pos logit contributions
 Pet+0.002
 rob+0.002
 sur+0.002
 tort+0.002
 encountered+0.002
Neg logit contributions
 cereal-0.004
lf-0.003
James-0.003
 spaghetti-0.003
bes-0.003
Token he
Token ID339
Feature activation+1.24e+00

Pos logit contributions
 Pet+0.002
 Kang+0.002
 tort+0.001
 sur+0.001
 tou+0.001
Neg logit contributions
 cereal-0.002
 spaghetti-0.002
lf-0.002
 Amy-0.002
 Kate-0.002
Token and
Token ID290
Feature activation+4.49e-01

Pos logit contributions
 Pet+0.003
 rob+0.003
 tort+0.003
 sur+0.003
 Kang+0.003
Neg logit contributions
 cereal-0.004
lf-0.004
James-0.003
 spaghetti-0.003
Kate-0.003
Token the
Token ID262
Feature activation+1.48e+00

Pos logit contributions
 Pet+0.010
 Kang+0.010
 sur+0.010
 tort+0.010
 rob+0.010
Neg logit contributions
 cereal-0.012
lf-0.011
 spaghetti-0.011
 Amy-0.011
ma-0.010
Token peppers
Token ID31804
Feature activation+3.21e+00

Pos logit contributions
 Pet+0.039
 Kang+0.038
 sur+0.037
 rob+0.037
 ko+0.037
Neg logit contributions
 cereal-0.038
 spaghetti-0.035
lf-0.032
 Amy-0.030
 microwave-0.030
Token from
Token ID422
Feature activation+1.48e-02

Pos logit contributions
 Pet+0.072
 tort+0.069
 Kang+0.069
 sur+0.068
 rob+0.068
Neg logit contributions
 cereal-0.091
lf-0.084
 spaghetti-0.083
 Amy-0.078
ma-0.077
Token the
Token ID262
Feature activation+9.06e-01

Pos logit contributions
 Pet+0.000
 rob+0.000
 tort+0.000
 Kang+0.000
 ko+0.000
Neg logit contributions
 cereal-0.000
lf-0.000
 spaghetti-0.000
James-0.000
bes-0.000
Token cuc
Token ID38421
Feature activation+6.64e+00

Pos logit contributions
 Pet+0.028
 Kang+0.027
 sur+0.027
 ko+0.026
 rob+0.026
Neg logit contributions
 cereal-0.020
 spaghetti-0.018
lf-0.015
 microwave-0.015
 couch-0.015
Tokenumbers
Token ID17024
Feature activation+2.10e+00

Pos logit contributions
 Pet+0.178
 ko+0.173
 rob+0.170
 tort+0.168
 Kang+0.168
Neg logit contributions
 cereal-0.163
 spaghetti-0.145
lf-0.142
ma-0.139
 microwave-0.137
Token.
Token ID13
Feature activation+1.10e-01

Pos logit contributions
 Pet+0.046
 tort+0.044
 Kang+0.043
 rob+0.043
 sur+0.043
Neg logit contributions
 cereal-0.061
lf-0.057
 spaghetti-0.056
 Amy-0.052
ma-0.052
Token It
Token ID632
Feature activation-5.33e-01

Pos logit contributions
 Pet+0.003
 tort+0.003
 Kang+0.003
 rob+0.002
 ko+0.002
Neg logit contributions
 cereal-0.003
lf-0.003
 spaghetti-0.003
bes-0.003
James-0.003
Token took
Token ID1718
Feature activation-2.19e-01

Pos logit contributions
 cereal+0.015
lf+0.014
 spaghetti+0.013
 Amy+0.012
James+0.012
Neg logit contributions
 Pet-0.013
 tort-0.012
 Kang-0.012
 sur-0.012
 rob-0.012
Token a
Token ID257
Feature activation-9.82e-02

Pos logit contributions
 cereal+0.006
lf+0.005
 spaghetti+0.005
 Amy+0.005
James+0.005
Neg logit contributions
 Pet-0.005
 tort-0.005
 sur-0.005
 Kang-0.005
 rob-0.005
Token cuc
Token ID38421
Feature activation+6.38e+00

Pos logit contributions
 Pet+0.003
 Kang+0.002
 sur+0.002
ive+0.002
 tort+0.002
Neg logit contributions
 cereal-0.003
 spaghetti-0.002
lf-0.002
 Amy-0.002
ma-0.002
Tokenumber
Token ID4494
Feature activation-6.50e-04

Pos logit contributions
 Pet+0.162
 ko+0.154
 tou+0.150
 Kang+0.150
 rob+0.149
Neg logit contributions
 cereal-0.167
 spaghetti-0.148
ma-0.141
lf-0.140
 microwave-0.135
Token.
Token ID13
Feature activation+4.94e-01

Pos logit contributions
 cereal+0.000
lf+0.000
 spaghetti+0.000
 Amy+0.000
ma+0.000
Neg logit contributions
 Pet-0.000
 sur-0.000
 Kang-0.000
 tort-0.000
 ko-0.000
Token Mama
Token ID40084
Feature activation+1.07e+00

Pos logit contributions
 Pet+0.012
 tort+0.012
 Kang+0.012
 rob+0.012
 sur+0.011
Neg logit contributions
 cereal-0.013
lf-0.012
 spaghetti-0.012
James-0.011
 Amy-0.011
Token puts
Token ID7584
Feature activation+6.81e-01

Pos logit contributions
 Pet+0.027
 sur+0.027
 Kang+0.027
 tort+0.027
 ko+0.026
Neg logit contributions
 cereal-0.027
 spaghetti-0.025
lf-0.024
ma-0.023
 Amy-0.023
Token cuc
Token ID38421
Feature activation+6.59e+00

Pos logit contributions
 Pet+0.015
 rob+0.014
 tort+0.014
 Kang+0.013
 tou+0.013
Neg logit contributions
 cereal-0.019
lf-0.019
James-0.018
 spaghetti-0.018
Kate-0.017
Tokenumbers
Token ID17024
Feature activation+2.36e+00

Pos logit contributions
 Pet+0.171
 ko+0.164
 tort+0.162
 Kang+0.161
 rob+0.161
Neg logit contributions
 cereal-0.168
 spaghetti-0.149
ma-0.143
lf-0.141
bes-0.134
Token in
Token ID287
Feature activation-8.15e-01

Pos logit contributions
 Pet+0.043
 rob+0.040
 tort+0.040
 Kang+0.039
 sur+0.039
Neg logit contributions
 cereal-0.074
lf-0.073
 spaghetti-0.069
James-0.068
Kate-0.066
Token our
Token ID674
Feature activation-3.27e-01

Pos logit contributions
lf+0.026
 cereal+0.026
James+0.025
bes+0.024
Kate+0.024
Neg logit contributions
 Pet-0.014
 rob-0.014
 tort-0.013
 Kang-0.012
 ko-0.012
Token salad
Token ID20698
Feature activation+1.69e+00

Pos logit contributions
 cereal+0.007
lf+0.006
James+0.006
 spaghetti+0.006
Kate+0.006
Neg logit contributions
 Pet-0.010
 tort-0.009
 rob-0.009
 Kang-0.009
 sur-0.009
Token.
Token ID13
Feature activation+5.76e-01

Pos logit contributions
 Pet+0.031
 tort+0.030
 rob+0.029
ive+0.028
 Kang+0.028
Neg logit contributions
 cereal-0.051
lf-0.050
James-0.049
Kate-0.047
 spaghetti-0.047
Token big
Token ID1263
Feature activation-5.35e-01

Pos logit contributions
 cereal+0.002
 spaghetti+0.002
lf+0.002
 Amy+0.002
ma+0.002
Neg logit contributions
 Pet-0.003
 Kang-0.003
 tort-0.003
 sur-0.003
 rob-0.003
Token salad
Token ID20698
Feature activation+1.65e+00

Pos logit contributions
 cereal+0.010
lf+0.009
 spaghetti+0.009
 Amy+0.008
James+0.008
Neg logit contributions
 Pet-0.017
 tort-0.016
 Kang-0.016
 sur-0.016
 rob-0.016
Token with
Token ID351
Feature activation-6.25e-01

Pos logit contributions
 Pet+0.033
 tort+0.032
 rob+0.031
 sur+0.031
 Kang+0.031
Neg logit contributions
 cereal-0.049
lf-0.047
 spaghetti-0.046
James-0.044
 Amy-0.043
Token carrots
Token ID34397
Feature activation+2.51e+00

Pos logit contributions
lf+0.015
 cereal+0.014
James+0.014
bes+0.013
 spaghetti+0.013
Neg logit contributions
 Pet-0.016
 rob-0.016
 tort-0.015
 tou-0.015
 ko-0.015
Token,
Token ID11
Feature activation-1.96e-02

Pos logit contributions
 Pet+0.065
 Kang+0.063
 sur+0.063
 ko+0.063
 rob+0.062
Neg logit contributions
 cereal-0.065
 spaghetti-0.060
lf-0.058
ma-0.054
 Amy-0.053
Token cuc
Token ID38421
Feature activation+6.58e+00

Pos logit contributions
lf+0.001
James+0.001
 cereal+0.001
bes+0.001
Kate+0.001
Neg logit contributions
 Pet-0.000
 rob-0.000
 tort-0.000
 sur-0.000
 judge-0.000
Tokenumbers
Token ID17024
Feature activation+2.08e+00

Pos logit contributions
 Pet+0.164
 tort+0.158
 rob+0.156
 Kang+0.156
 sur+0.156
Neg logit contributions
 cereal-0.171
lf-0.157
 spaghetti-0.155
James-0.144
ma-0.143
Token,
Token ID11
Feature activation-1.24e-03

Pos logit contributions
 Pet+0.047
 sur+0.046
 tort+0.045
 Kang+0.045
 ko+0.044
Neg logit contributions
 cereal-0.060
 spaghetti-0.055
lf-0.055
ma-0.051
 Amy-0.050
Token and
Token ID290
Feature activation+1.18e+00

Pos logit contributions
lf+0.000
 cereal+0.000
James+0.000
Kate+0.000
bes+0.000
Neg logit contributions
 Pet-0.000
 rob-0.000
 tort-0.000
 sur-0.000
 tou-0.000
Token tomatoes
Token ID23972
Feature activation+3.29e+00

Pos logit contributions
 Pet+0.031
 rob+0.030
 tort+0.030
 sur+0.029
 tou+0.028
Neg logit contributions
lf-0.027
 cereal-0.026
bes-0.025
James-0.025
Kate-0.024
Token.
Token ID13
Feature activation+7.68e-02

Pos logit contributions
 Pet+0.075
 Kang+0.072
 sur+0.072
 tort+0.071
 rob+0.070
Neg logit contributions
 cereal-0.093
lf-0.086
 spaghetti-0.085
bes-0.079
ma-0.079
Token chopped
Token ID20720
Feature activation+2.24e+00

Pos logit contributions
 cereal+0.001
lf+0.001
 spaghetti+0.001
 Amy+0.001
James+0.001
Neg logit contributions
 Pet-0.001
 tort-0.001
 Kang-0.001
 sur-0.001
 rob-0.001
Token up
Token ID510
Feature activation+2.00e+00

Pos logit contributions
 Pet+0.044
 rob+0.040
 tort+0.040
 Kang+0.039
 sur+0.039
Neg logit contributions
 cereal-0.066
lf-0.066
James-0.061
bes-0.061
 spaghetti-0.061
Token some
Token ID617
Feature activation-1.29e-01

Pos logit contributions
 Pet+0.044
 rob+0.042
 across+0.040
 tort+0.038
 tou+0.038
Neg logit contributions
lf-0.053
James-0.051
bes-0.048
Kate-0.048
 cereal-0.048
Token tomatoes
Token ID23972
Feature activation+3.32e+00

Pos logit contributions
lf+0.003
James+0.003
 cereal+0.003
bes+0.003
Kate+0.002
Neg logit contributions
 Pet-0.003
 rob-0.003
 tort-0.003
 sur-0.003
ive-0.003
Token,
Token ID11
Feature activation+2.16e-01

Pos logit contributions
 Pet+0.073
 tort+0.070
 Kang+0.070
 sur+0.069
 rob+0.069
Neg logit contributions
 cereal-0.096
lf-0.089
 spaghetti-0.089
 Amy-0.083
ma-0.082
Token cuc
Token ID38421
Feature activation+6.56e+00

Pos logit contributions
 Pet+0.005
 rob+0.005
 tort+0.004
 sur+0.004
 encountered+0.004
Neg logit contributions
lf-0.006
 cereal-0.006
James-0.005
bes-0.005
Kate-0.005
Tokenumbers
Token ID17024
Feature activation+2.17e+00

Pos logit contributions
 Pet+0.150
 tort+0.144
 sur+0.142
 rob+0.142
 Kang+0.142
Neg logit contributions
 cereal-0.180
lf-0.170
 spaghetti-0.165
 Amy-0.158
James-0.155
Token,
Token ID11
Feature activation-1.88e-02

Pos logit contributions
 Pet+0.049
 tort+0.048
 sur+0.048
 Kang+0.048
 ko+0.047
Neg logit contributions
 cereal-0.062
 spaghetti-0.057
lf-0.057
ma-0.053
 Amy-0.053
Token and
Token ID290
Feature activation+5.61e-01

Pos logit contributions
 cereal+0.001
lf+0.001
James+0.000
Kate+0.000
bes+0.000
Neg logit contributions
 Pet-0.000
 rob-0.000
 tort-0.000
 sur-0.000
 encountered-0.000
Token lettuce
Token ID39406
Feature activation+3.71e+00

Pos logit contributions
 Pet+0.014
 rob+0.014
 tort+0.014
 sur+0.013
 Kang+0.013
Neg logit contributions
 cereal-0.013
lf-0.013
bes-0.012
 spaghetti-0.012
James-0.012
Token.
Token ID13
Feature activation+2.12e-01

Pos logit contributions
 Pet+0.085
 Kang+0.082
 sur+0.082
 tort+0.082
 ko+0.080
Neg logit contributions
 cereal-0.106
 spaghetti-0.097
lf-0.096
 Amy-0.089
bes-0.089
TokenI
Token ID40
Feature activation-2.94e-01

Pos logit contributions
 cereal+0.001
lf+0.001
 spaghetti+0.001
 Amy+0.001
James+0.001
Neg logit contributions
 Pet-0.001
 tort-0.001
 rob-0.001
 Kang-0.001
 sur-0.001
Token think
Token ID892
Feature activation+2.87e-02

Pos logit contributions
 cereal+0.008
 spaghetti+0.007
ma+0.007
 Amy+0.007
lf+0.007
Neg logit contributions
 Kang-0.007
 tort-0.007
ting-0.007
 Pet-0.007
 tou-0.006
Token it
Token ID340
Feature activation+2.83e-01

Pos logit contributions
 Pet+0.001
 rob+0.001
 Kang+0.001
 tort+0.001
 sur+0.000
Neg logit contributions
 cereal-0.001
lf-0.001
James-0.001
bes-0.001
 spaghetti-0.001
Token is
Token ID318
Feature activation+5.15e-01

Pos logit contributions
 Pet+0.007
 rob+0.007
 tort+0.007
 Kang+0.007
 sur+0.006
Neg logit contributions
 cereal-0.007
lf-0.007
 spaghetti-0.007
James-0.006
 Amy-0.006
Token a
Token ID257
Feature activation+1.02e-01

Pos logit contributions
 Pet+0.015
 sur+0.014
 tort+0.014
 Kang+0.014
 gol+0.014
Neg logit contributions
 cereal-0.012
 spaghetti-0.011
lf-0.010
 Amy-0.010
ma-0.009
Token cuc
Token ID38421
Feature activation+6.38e+00

Pos logit contributions
 Pet+0.003
 Kang+0.002
 sur+0.002
ive+0.002
 tort+0.002
Neg logit contributions
 cereal-0.003
 spaghetti-0.002
lf-0.002
 Amy-0.002
ma-0.002
Tokenumber
Token ID4494
Feature activation-6.50e-04

Pos logit contributions
 Pet+0.162
 ko+0.154
 tou+0.150
 Kang+0.150
 rob+0.149
Neg logit contributions
 cereal-0.167
 spaghetti-0.148
ma-0.141
lf-0.140
 microwave-0.135
Token.
Token ID13
Feature activation+4.94e-01

Pos logit contributions
 cereal+0.000
lf+0.000
 spaghetti+0.000
 Amy+0.000
ma+0.000
Neg logit contributions
 Pet-0.000
 sur-0.000
 Kang-0.000
 tort-0.000
 ko-0.000
Token Mama
Token ID40084
Feature activation+1.07e+00

Pos logit contributions
 Pet+0.012
 tort+0.012
 Kang+0.012
 rob+0.012
 sur+0.011
Neg logit contributions
 cereal-0.013
lf-0.012
 spaghetti-0.012
James-0.011
 Amy-0.011
Token puts
Token ID7584
Feature activation+6.81e-01

Pos logit contributions
 Pet+0.027
 sur+0.027
 Kang+0.027
 tort+0.027
 ko+0.026
Neg logit contributions
 cereal-0.027
 spaghetti-0.025
lf-0.024
ma-0.023
 Amy-0.023
Token cuc
Token ID38421
Feature activation+6.59e+00

Pos logit contributions
 Pet+0.015
 rob+0.014
 tort+0.014
 Kang+0.013
 tou+0.013
Neg logit contributions
 cereal-0.019
lf-0.019
James-0.018
 spaghetti-0.018
Kate-0.017
Token saw
Token ID2497
Feature activation-8.20e-01

Pos logit contributions
 Pet+0.001
 tort+0.001
 rob+0.001
 sur+0.001
 Kang+0.001
Neg logit contributions
 cereal-0.002
lf-0.002
 spaghetti-0.002
James-0.002
 Amy-0.002
Token carrots
Token ID34397
Feature activation+2.62e+00

Pos logit contributions
lf+0.021
bes+0.021
 cereal+0.020
James+0.020
Kate+0.020
Neg logit contributions
 Pet-0.019
 rob-0.019
 tort-0.018
 Kang-0.017
 sur-0.017
Token,
Token ID11
Feature activation+1.61e-01

Pos logit contributions
 Pet+0.059
 sur+0.057
 tort+0.057
 Kang+0.057
 ko+0.056
Neg logit contributions
 cereal-0.075
 spaghetti-0.069
lf-0.068
 Amy-0.063
ma-0.063
Token potatoes
Token ID18821
Feature activation+2.99e+00

Pos logit contributions
 Pet+0.004
 tort+0.004
 rob+0.004
 Kang+0.004
 sur+0.004
Neg logit contributions
 cereal-0.004
lf-0.004
 spaghetti-0.003
James-0.003
bes-0.003
Token,
Token ID11
Feature activation-6.00e-02

Pos logit contributions
 Pet+0.065
 sur+0.064
 Kang+0.064
 tort+0.063
 rob+0.062
Neg logit contributions
 cereal-0.088
 spaghetti-0.081
lf-0.081
ma-0.074
bes-0.074
Token cuc
Token ID38421
Feature activation+6.25e+00

Pos logit contributions
 cereal+0.001
lf+0.001
James+0.001
 spaghetti+0.001
Kate+0.001
Neg logit contributions
 Pet-0.002
 rob-0.001
 tort-0.001
 sur-0.001
 Kang-0.001
Tokenumbers
Token ID17024
Feature activation+1.53e+00

Pos logit contributions
 Pet+0.170
 tort+0.165
 rob+0.164
 ko+0.162
 Kang+0.162
Neg logit contributions
 cereal-0.150
 spaghetti-0.135
lf-0.133
ma-0.124
James-0.123
Token and
Token ID290
Feature activation+5.32e-01

Pos logit contributions
 Pet+0.030
 tort+0.029
 sur+0.029
 Kang+0.029
 rob+0.029
Neg logit contributions
 cereal-0.047
lf-0.044
 spaghetti-0.044
 Amy-0.041
ma-0.041
Token tomatoes
Token ID23972
Feature activation+3.17e+00

Pos logit contributions
 Pet+0.015
 tort+0.014
 rob+0.014
 sur+0.014
 Kang+0.014
Neg logit contributions
 cereal-0.012
lf-0.011
 spaghetti-0.011
James-0.010
ma-0.010
Token.
Token ID13
Feature activation-4.01e-01

Pos logit contributions
 Pet+0.073
 tort+0.070
 Kang+0.070
 sur+0.070
 rob+0.069
Neg logit contributions
 cereal-0.088
lf-0.081
 spaghetti-0.081
 Amy-0.074
bes-0.074
Token He
Token ID679
Feature activation-3.34e-02

Pos logit contributions
lf+0.011
 cereal+0.011
bes+0.010
ma+0.010
James+0.010
Neg logit contributions
 Pet-0.009
 sur-0.009
 tort-0.008
 Kang-0.008
 gol-0.008
Token the
Token ID262
Feature activation+7.97e-01

Pos logit contributions
 Pet+0.002
 rob+0.001
 tort+0.001
 across+0.001
 ko+0.001
Neg logit contributions
lf-0.003
James-0.003
 cereal-0.003
Kate-0.003
bes-0.003
Token bowl
Token ID9396
Feature activation-2.87e-01

Pos logit contributions
 Pet+0.024
 tort+0.024
 rob+0.024
 encountered+0.022
 across+0.022
Neg logit contributions
lf-0.015
James-0.014
bes-0.013
 cereal-0.013
Kate-0.013
Token.
Token ID13
Feature activation+1.22e-01

Pos logit contributions
lf+0.009
 cereal+0.009
James+0.009
bes+0.009
 spaghetti+0.008
Neg logit contributions
 Pet-0.005
ive-0.004
 across-0.004
 tort-0.004
 rob-0.004
Token Lucy
Token ID22162
Feature activation-1.55e-02

Pos logit contributions
 Pet+0.003
 Kang+0.003
 tort+0.003
 rob+0.003
 ko+0.003
Neg logit contributions
lf-0.003
 cereal-0.003
 spaghetti-0.002
ma-0.002
bes-0.002
Token put
Token ID1234
Feature activation+2.50e-01

Pos logit contributions
 cereal+0.000
lf+0.000
 spaghetti+0.000
James+0.000
 Amy+0.000
Neg logit contributions
 Pet-0.000
 tort-0.000
 rob-0.000
 sur-0.000
 Kang-0.000
Token cuc
Token ID38421
Feature activation+6.23e+00

Pos logit contributions
 Pet+0.005
 rob+0.004
 across+0.004
 tort+0.004
ting+0.004
Neg logit contributions
lf-0.008
James-0.007
bes-0.007
Kate-0.007
 cereal-0.007
Tokenumbers
Token ID17024
Feature activation+2.23e+00

Pos logit contributions
 Pet+0.143
 tort+0.138
 rob+0.136
 sur+0.136
 Kang+0.135
Neg logit contributions
 cereal-0.169
lf-0.161
 spaghetti-0.155
 Amy-0.149
James-0.146
Token,
Token ID11
Feature activation+3.59e-01

Pos logit contributions
 Pet+0.040
 rob+0.037
 tort+0.037
 sur+0.036
ive+0.035
Neg logit contributions
 cereal-0.071
lf-0.069
 spaghetti-0.066
James-0.065
bes-0.064
Token tomatoes
Token ID23972
Feature activation+2.78e+00

Pos logit contributions
 Pet+0.008
 rob+0.008
 tort+0.007
 sur+0.007
 encountered+0.007
Neg logit contributions
lf-0.010
James-0.009
 cereal-0.009
bes-0.009
Kate-0.009
Token and
Token ID290
Feature activation+5.01e-02

Pos logit contributions
 Pet+0.059
 tort+0.056
 sur+0.056
 Kang+0.056
 rob+0.055
Neg logit contributions
 cereal-0.082
lf-0.076
 spaghetti-0.075
 Amy-0.071
bes-0.070
Token some
Token ID617
Feature activation+1.22e-01

Pos logit contributions
 Pet+0.001
 tort+0.001
 rob+0.001
 sur+0.001
 encountered+0.001
Neg logit contributions
lf-0.001
 cereal-0.001
James-0.001
bes-0.001
Kate-0.001
Token.
Token ID13
Feature activation-1.32e-01

Pos logit contributions
 Kang+0.028
 Pet+0.028
 tort+0.027
 rob+0.027
mur+0.027
Neg logit contributions
 cereal-0.029
 spaghetti-0.026
ma-0.025
lf-0.025
 Amy-0.025
Token You
Token ID921
Feature activation-2.47e-02

Pos logit contributions
 cereal+0.004
lf+0.004
 spaghetti+0.003
ma+0.003
bes+0.003
Neg logit contributions
 Pet-0.003
 Kang-0.003
 tort-0.003
 rob-0.003
 ko-0.003
Token found
Token ID1043
Feature activation-1.89e-01

Pos logit contributions
 cereal+0.001
 spaghetti+0.001
lf+0.001
James+0.001
 Amy+0.001
Neg logit contributions
 Pet-0.001
 Kang-0.001
 tort-0.001
 sur-0.001
 rob-0.001
Token a
Token ID257
Feature activation-4.35e-01

Pos logit contributions
 cereal+0.005
lf+0.005
bes+0.004
James+0.004
 spaghetti+0.004
Neg logit contributions
 Pet-0.005
 rob-0.004
 tort-0.004
 Kang-0.004
 tou-0.004
Token nice
Token ID3621
Feature activation+8.44e-01

Pos logit contributions
 cereal+0.010
 spaghetti+0.009
lf+0.009
 rug+0.008
 Amy+0.008
Neg logit contributions
 Pet-0.012
 Kang-0.012
ive-0.011
 sur-0.011
 tort-0.011
Token cuc
Token ID38421
Feature activation+6.01e+00

Pos logit contributions
 Pet+0.022
 Kang+0.021
 tort+0.021
 sur+0.020
 rob+0.020
Neg logit contributions
 cereal-0.021
lf-0.019
 spaghetti-0.019
 Amy-0.018
ma-0.018
Tokenumber
Token ID4494
Feature activation-5.85e-01

Pos logit contributions
 Pet+0.157
 ko+0.151
 Kang+0.148
 tou+0.147
 rob+0.147
Neg logit contributions
 cereal-0.153
 spaghetti-0.136
ma-0.131
lf-0.127
 microwave-0.124
Token.
Token ID13
Feature activation-3.35e-01

Pos logit contributions
 cereal+0.016
lf+0.015
 spaghetti+0.015
ma+0.014
 Amy+0.014
Neg logit contributions
 Pet-0.014
 Kang-0.013
 ko-0.013
 sur-0.013
 tort-0.013
Token It
Token ID632
Feature activation-7.72e-01

Pos logit contributions
lf+0.010
 cereal+0.009
 spaghetti+0.009
ma+0.009
bes+0.008
Neg logit contributions
 Pet-0.008
 Kang-0.007
 tort-0.007
 rob-0.007
 ko-0.007
Token will
Token ID481
Feature activation-2.41e+00

Pos logit contributions
 cereal+0.021
lf+0.020
 spaghetti+0.019
James+0.018
bes+0.018
Neg logit contributions
 Pet-0.018
 tort-0.017
 rob-0.017
 Kang-0.017
 sur-0.017
Token be
Token ID307
Feature activation-3.62e-01

Pos logit contributions
 cereal+0.077
lf+0.072
 spaghetti+0.071
James+0.068
 Amy+0.067
Neg logit contributions
 Pet-0.046
 tort-0.043
 rob-0.043
 sur-0.042
 Kang-0.042
Token thank
Token ID5875
Feature activation-4.12e-01

Pos logit contributions
 cereal+0.005
lf+0.005
 spaghetti+0.005
James+0.005
bes+0.005
Neg logit contributions
 Pet-0.004
 tort-0.004
 rob-0.004
 sur-0.004
 Kang-0.004
Token you
Token ID345
Feature activation-1.71e-01

Pos logit contributions
 cereal+0.008
 spaghetti+0.008
lf+0.008
ma+0.007
 Amy+0.007
Neg logit contributions
 Pet-0.012
 tort-0.012
 sur-0.012
 Kang-0.012
 rob-0.012
Token.
Token ID13
Feature activation+1.66e-01

Pos logit contributions
 cereal+0.005
lf+0.005
 spaghetti+0.005
bes+0.005
James+0.005
Neg logit contributions
 Pet-0.003
 tort-0.003
 sur-0.003
 Kang-0.003
 rob-0.003
Token We
Token ID775
Feature activation+9.83e-01

Pos logit contributions
 Pet+0.004
 Kang+0.004
 tort+0.004
 rob+0.004
 ko+0.004
Neg logit contributions
lf-0.005
 cereal-0.005
 spaghetti-0.004
James-0.004
ma-0.004
Token love
Token ID1842
Feature activation+3.84e-02

Pos logit contributions
 Pet+0.023
 tort+0.022
 Kang+0.022
 rob+0.022
 sur+0.022
Neg logit contributions
 cereal-0.027
lf-0.025
 spaghetti-0.025
 Amy-0.023
ma-0.023
Token cuc
Token ID38421
Feature activation+5.98e+00

Pos logit contributions
 Pet+0.001
 rob+0.001
 sur+0.001
 tort+0.001
 Kang+0.001
Neg logit contributions
 cereal-0.001
lf-0.001
James-0.001
bes-0.001
Kate-0.001
Tokenumbers
Token ID17024
Feature activation+1.79e+00

Pos logit contributions
 Pet+0.144
 tort+0.136
 Kang+0.135
 rob+0.135
 ko+0.135
Neg logit contributions
 cereal-0.164
 spaghetti-0.149
lf-0.147
ma-0.141
James-0.137
Token.
Token ID13
Feature activation+2.68e-01

Pos logit contributions
 Pet+0.027
 rob+0.026
 tort+0.025
ive+0.025
ey+0.024
Neg logit contributions
lf-0.058
 cereal-0.057
James-0.056
Kate-0.055
endo-0.054
Token And
Token ID843
Feature activation+6.89e-01

Pos logit contributions
 Pet+0.006
 Kang+0.006
 tort+0.006
 rob+0.006
 sur+0.006
Neg logit contributions
lf-0.007
 cereal-0.007
 spaghetti-0.007
ma-0.007
James-0.007
Token we
Token ID356
Feature activation+9.00e-01

Pos logit contributions
 Pet+0.017
 tort+0.017
 sur+0.017
 Kang+0.017
 rob+0.016
Neg logit contributions
 cereal-0.018
 spaghetti-0.016
lf-0.016
 Amy-0.015
ma-0.014
Token love
Token ID1842
Feature activation-2.97e-02

Pos logit contributions
 Pet+0.021
 tort+0.020
 Kang+0.020
 sur+0.020
 rob+0.020
Neg logit contributions
 cereal-0.025
lf-0.023
 spaghetti-0.023
 Amy-0.021
ma-0.021
Token house
Token ID2156
Feature activation-6.84e-01

Pos logit contributions
 cereal+0.002
lf+0.001
 spaghetti+0.001
 Amy+0.001
James+0.001
Neg logit contributions
 Pet-0.004
 tort-0.003
 Kang-0.003
 rob-0.003
 sur-0.003
Token.
Token ID13
Feature activation+5.55e-01

Pos logit contributions
 cereal+0.022
lf+0.022
James+0.021
 spaghetti+0.021
bes+0.020
Neg logit contributions
 Pet-0.011
 sur-0.010
 rob-0.010
ive-0.010
 tort-0.010
Token They
Token ID1119
Feature activation-2.61e-02

Pos logit contributions
 Pet+0.014
 Kang+0.013
 tort+0.013
 rob+0.013
 sur+0.013
Neg logit contributions
 cereal-0.014
lf-0.014
 spaghetti-0.013
ma-0.012
James-0.012
Token wash
Token ID13502
Feature activation+1.38e+00

Pos logit contributions
 cereal+0.001
lf+0.001
 spaghetti+0.001
James+0.001
 Amy+0.001
Neg logit contributions
 Pet-0.001
 tort-0.000
 rob-0.000
 sur-0.000
 Kang-0.000
Token the
Token ID262
Feature activation+3.32e-01

Pos logit contributions
 Pet+0.034
 tort+0.032
 rob+0.032
 Kang+0.032
 sur+0.032
Neg logit contributions
 cereal-0.036
lf-0.034
 spaghetti-0.033
James-0.031
 Amy-0.031
Token cuc
Token ID38421
Feature activation+5.81e+00

Pos logit contributions
 Pet+0.010
 tort+0.010
 rob+0.010
 sur+0.010
 tou+0.010
Neg logit contributions
 cereal-0.006
lf-0.006
James-0.005
 spaghetti-0.005
 Amy-0.005
Tokenumber
Token ID4494
Feature activation-1.11e-01

Pos logit contributions
 Pet+0.131
 tort+0.123
 rob+0.122
 Kang+0.122
 ko+0.122
Neg logit contributions
 cereal-0.168
 spaghetti-0.152
lf-0.152
ma-0.145
bes-0.142
Token and
Token ID290
Feature activation+6.30e-01

Pos logit contributions
 cereal+0.003
lf+0.003
 spaghetti+0.003
James+0.003
 Amy+0.003
Neg logit contributions
 Pet-0.002
 rob-0.002
 tort-0.002
 sur-0.002
 Kang-0.002
Token cut
Token ID2005
Feature activation+1.27e+00

Pos logit contributions
 Pet+0.011
 rob+0.011
 tort+0.010
 sur+0.010
 tou+0.010
Neg logit contributions
lf-0.019
 cereal-0.019
James-0.019
Kate-0.018
bes-0.018
Token it
Token ID340
Feature activation-4.29e-01

Pos logit contributions
 Pet+0.030
 tort+0.030
 sur+0.030
 Kang+0.029
 rob+0.029
Neg logit contributions
 cereal-0.035
 spaghetti-0.031
lf-0.030
 Amy-0.029
ma-0.029
Token into
Token ID656
Feature activation+8.63e-01

Pos logit contributions
 cereal+0.014
lf+0.014
James+0.014
 spaghetti+0.013
Kate+0.013
Neg logit contributions
 Pet-0.007
 rob-0.006
 tort-0.006
 Kang-0.006
 sur-0.006
Token monkeys
Token ID26697
Feature activation+1.71e+00

Pos logit contributions
 Pet+0.003
 tort+0.003
 rob+0.002
 sur+0.002
 Kang+0.002
Neg logit contributions
 cereal-0.003
lf-0.003
 spaghetti-0.003
James-0.003
 Amy-0.003
Token and
Token ID290
Feature activation+3.87e-01

Pos logit contributions
 Pet+0.038
 tort+0.036
 rob+0.036
 Kang+0.036
 sur+0.036
Neg logit contributions
 cereal-0.049
lf-0.046
 spaghetti-0.045
James-0.042
 Amy-0.042
Token elephants
Token ID25368
Feature activation+2.35e+00

Pos logit contributions
 Pet+0.008
 tort+0.008
 sur+0.008
 Kang+0.008
 rob+0.008
Neg logit contributions
 cereal-0.011
lf-0.011
 spaghetti-0.010
 Amy-0.010
James-0.010
Token and
Token ID290
Feature activation+5.69e-01

Pos logit contributions
 Pet+0.058
 tort+0.056
 sur+0.056
 Kang+0.055
 rob+0.054
Neg logit contributions
 cereal-0.061
lf-0.056
 spaghetti-0.056
 Amy-0.053
ma-0.052
Token gir
Token ID37370
Feature activation+2.67e+00

Pos logit contributions
 Pet+0.012
 tort+0.012
 sur+0.012
 Kang+0.012
 rob+0.011
Neg logit contributions
 cereal-0.017
lf-0.015
 spaghetti-0.015
 Amy-0.014
James-0.014
Tokenaff
Token ID2001
Feature activation+5.81e+00

Pos logit contributions
 Pet+0.023
 tort+0.021
 sur+0.021
 Kang+0.020
 rob+0.020
Neg logit contributions
 cereal-0.110
lf-0.104
 spaghetti-0.104
 Amy-0.101
ma-0.099
Tokenes
Token ID274
Feature activation+1.70e+00

Pos logit contributions
 tort+0.164
 Pet+0.163
 sur+0.163
 Kang+0.159
 gol+0.157
Neg logit contributions
 cereal-0.121
lf-0.117
 spaghetti-0.107
 Amy-0.105
ma-0.104
Token.
Token ID13
Feature activation+3.47e-01

Pos logit contributions
 Pet+0.041
 tort+0.040
 sur+0.040
 Kang+0.040
 rob+0.040
Neg logit contributions
 cereal-0.044
lf-0.041
 spaghetti-0.041
 Amy-0.038
ma-0.037
Token They
Token ID1119
Feature activation-2.21e-01

Pos logit contributions
 Pet+0.008
 Kang+0.008
 tort+0.008
 rob+0.008
 ko+0.008
Neg logit contributions
 cereal-0.009
lf-0.009
 spaghetti-0.009
bes-0.008
ma-0.008
Token were
Token ID547
Feature activation-1.46e+00

Pos logit contributions
 cereal+0.006
lf+0.005
 spaghetti+0.005
 Amy+0.005
James+0.005
Neg logit contributions
 Kang-0.005
 Pet-0.005
 tort-0.005
ive-0.005
 sur-0.005
Token very
Token ID845
Feature activation-2.83e-01

Pos logit contributions
 cereal+0.037
lf+0.034
 spaghetti+0.033
 Amy+0.031
James+0.031
Neg logit contributions
 Pet-0.037
 tort-0.036
 sur-0.035
 Kang-0.035
 rob-0.035
Token They
Token ID1119
Feature activation-1.08e-01

Pos logit contributions
 Pet+0.021
 tort+0.021
 rob+0.020
 sur+0.020
 Kang+0.020
Neg logit contributions
 cereal-0.020
lf-0.019
 spaghetti-0.019
 Amy-0.017
James-0.017
Token let
Token ID1309
Feature activation-5.05e-01

Pos logit contributions
 cereal+0.003
 spaghetti+0.003
lf+0.003
 Amy+0.003
ma+0.003
Neg logit contributions
 Kang-0.002
 Pet-0.002
 tort-0.002
 sur-0.002
 rob-0.002
Token them
Token ID606
Feature activation+7.52e-01

Pos logit contributions
 cereal+0.012
 spaghetti+0.012
 Amy+0.011
ma+0.010
 rug+0.010
Neg logit contributions
 sur-0.014
 tort-0.014
 gol-0.013
 perspective-0.013
 Kang-0.013
Token run
Token ID1057
Feature activation+7.28e-01

Pos logit contributions
 tort+0.015
 gol+0.015
 Pet+0.015
 sur+0.015
 Kang+0.015
Neg logit contributions
 cereal-0.023
 spaghetti-0.021
 Amy-0.020
lf-0.020
ma-0.019
Token and
Token ID290
Feature activation+1.39e+00

Pos logit contributions
 rob+0.016
 Pet+0.016
 tort+0.016
 Kang+0.016
 sur+0.015
Neg logit contributions
 cereal-0.021
 spaghetti-0.019
 Amy-0.019
lf-0.018
ma-0.018
Token gra
Token ID7933
Feature activation+5.79e+00

Pos logit contributions
 Pet+0.031
 Kang+0.030
 sur+0.029
 tort+0.029
 rob+0.029
Neg logit contributions
 cereal-0.039
 spaghetti-0.036
lf-0.035
 Amy-0.035
bes-0.033
Tokenze
Token ID2736
Feature activation+2.09e+00

Pos logit contributions
 perspective+0.162
 sur+0.157
 gol+0.157
 tort+0.154
 encountered+0.151
Neg logit contributions
bes-0.122
 cereal-0.119
lf-0.114
ma-0.108
 Amy-0.104
Token on
Token ID319
Feature activation-7.36e-01

Pos logit contributions
 Pet+0.046
 tort+0.045
 rob+0.044
 Kang+0.044
 sur+0.044
Neg logit contributions
 cereal-0.060
lf-0.056
 spaghetti-0.055
 Amy-0.052
James-0.051
Token the
Token ID262
Feature activation-3.55e-01

Pos logit contributions
 cereal+0.024
lf+0.023
 spaghetti+0.022
James+0.022
bes+0.021
Neg logit contributions
 Pet-0.013
 rob-0.012
 Kang-0.012
 tort-0.012
 sur-0.012
Token grass
Token ID8701
Feature activation+1.27e-01

Pos logit contributions
 cereal+0.009
lf+0.008
 spaghetti+0.008
James+0.007
 Amy+0.007
Neg logit contributions
 Pet-0.009
 tort-0.009
 rob-0.009
 Kang-0.009
 sur-0.009
Token.
Token ID13
Feature activation+6.51e-01

Pos logit contributions
 Pet+0.003
 tort+0.003
 Kang+0.003
 rob+0.003
 sur+0.002
Neg logit contributions
 cereal-0.004
lf-0.004
 spaghetti-0.003
James-0.003
 Amy-0.003
Token.
Token ID13
Feature activation-2.20e-01

Pos logit contributions
 Pet+0.009
 tort+0.009
 sur+0.009
 Kang+0.009
 rob+0.009
Neg logit contributions
 cereal-0.012
lf-0.011
 spaghetti-0.011
 Amy-0.010
James-0.010
Token They
Token ID1119
Feature activation-4.26e-01

Pos logit contributions
 cereal+0.006
lf+0.005
 spaghetti+0.005
ma+0.005
bes+0.005
Neg logit contributions
 Pet-0.006
 Kang-0.005
 tort-0.005
 rob-0.005
 sur-0.005
Token were
Token ID547
Feature activation-1.32e+00

Pos logit contributions
 cereal+0.010
lf+0.009
 Amy+0.009
 spaghetti+0.009
 Kate+0.008
Neg logit contributions
 Kang-0.011
ive-0.011
 tort-0.011
 gol-0.011
 sur-0.011
Token called
Token ID1444
Feature activation+1.02e-01

Pos logit contributions
 cereal+0.030
 Amy+0.028
 spaghetti+0.027
lf+0.027
ma+0.025
Neg logit contributions
ive-0.035
 tort-0.035
 sur-0.035
 gol-0.034
 Pet-0.034
Token ost
Token ID23619
Feature activation+2.79e+00

Pos logit contributions
 sur+0.002
 tort+0.002
 encountered+0.002
 gol+0.002
ive+0.002
Neg logit contributions
 cereal-0.003
 Amy-0.003
 spaghetti-0.002
lf-0.002
 Kate-0.002
Tokenric
Token ID1173
Feature activation+5.79e+00

Pos logit contributions
 tort+0.054
 sur+0.053
 Pet+0.053
 ko+0.051
 gol+0.051
Neg logit contributions
 cereal-0.083
lf-0.080
ma-0.077
 spaghetti-0.077
 Amy-0.075
Tokenhes
Token ID956
Feature activation+1.21e+00

Pos logit contributions
 Pet+0.203
 rob+0.195
 sur+0.191
 Kang+0.191
 tort+0.190
Neg logit contributions
 cereal-0.091
lf-0.079
 spaghetti-0.077
James-0.066
 Amy-0.065
Token.
Token ID13
Feature activation+6.92e-02

Pos logit contributions
 Pet+0.028
 tort+0.027
 sur+0.027
 Kang+0.026
 rob+0.026
Neg logit contributions
 cereal-0.033
lf-0.031
 spaghetti-0.030
 Amy-0.029
ma-0.028
Token Lily
Token ID20037
Feature activation+4.62e-01

Pos logit contributions
 Pet+0.002
 tort+0.002
 rob+0.002
 Kang+0.002
 sur+0.002
Neg logit contributions
 cereal-0.002
lf-0.002
 spaghetti-0.002
James-0.001
 Amy-0.001
Token and
Token ID290
Feature activation-1.41e+00

Pos logit contributions
 Pet+0.012
 tort+0.012
 Kang+0.012
 rob+0.011
 sur+0.011
Neg logit contributions
 cereal-0.011
lf-0.011
 spaghetti-0.010
 Amy-0.010
bes-0.010
Token Ben
Token ID3932
Feature activation+1.35e+00

Pos logit contributions
 cereal+0.027
 spaghetti+0.024
lf+0.024
 Amy+0.024
 Kate+0.021
Neg logit contributions
 sur-0.043
ive-0.042
 tort-0.042
 Pet-0.042
 Kang-0.042
Token many
Token ID867
Feature activation-1.55e+00

Pos logit contributions
 cereal+0.022
lf+0.021
 spaghetti+0.020
James+0.019
bes+0.019
Neg logit contributions
 Pet-0.023
 sur-0.022
 rob-0.022
 tort-0.022
 Kang-0.021
Token fish
Token ID5916
Feature activation+1.46e+00

Pos logit contributions
 cereal+0.039
lf+0.036
 spaghetti+0.036
ma+0.033
 beds+0.032
Neg logit contributions
 tort-0.041
 Pet-0.041
mur-0.040
 Kang-0.039
severe-0.039
Token and
Token ID290
Feature activation+4.50e-01

Pos logit contributions
 tort+0.039
 Pet+0.039
severe+0.036
 ko+0.036
 Kang+0.036
Neg logit contributions
 cereal-0.035
 spaghetti-0.033
lf-0.031
ma-0.030
 blocks-0.029
Token shells
Token ID19679
Feature activation+7.91e-01

Pos logit contributions
 Pet+0.009
 Kang+0.008
 tort+0.008
 sur+0.008
 rob+0.008
Neg logit contributions
 cereal-0.014
 spaghetti-0.013
lf-0.013
 Amy-0.013
James-0.012
Token and
Token ID290
Feature activation+7.02e-01

Pos logit contributions
 Pet+0.019
 tort+0.019
 Kang+0.019
 rob+0.018
 sur+0.018
Neg logit contributions
 cereal-0.021
 spaghetti-0.019
lf-0.019
ma-0.018
 Amy-0.018
Token cor
Token ID1162
Feature activation+5.72e+00

Pos logit contributions
 Pet+0.014
 Kang+0.014
 tort+0.013
 sur+0.013
mur+0.013
Neg logit contributions
 cereal-0.022
 spaghetti-0.020
lf-0.020
 Amy-0.019
 Kate-0.018
Tokenals
Token ID874
Feature activation+2.24e+00

Pos logit contributions
 rob+0.111
 tort+0.106
 Kang+0.106
 sur+0.105
 tou+0.103
Neg logit contributions
 cereal-0.179
 spaghetti-0.170
 forks-0.153
 bowls-0.150
 blankets-0.150
Token.
Token ID13
Feature activation+8.85e-01

Pos logit contributions
 tort+0.056
 Pet+0.056
 Kang+0.056
 sur+0.054
 ko+0.053
Neg logit contributions
 cereal-0.057
lf-0.052
 spaghetti-0.052
 Amy-0.047
ma-0.047
Token They
Token ID1119
Feature activation-1.68e-01

Pos logit contributions
 Pet+0.020
 Kang+0.018
 sur+0.018
 rob+0.017
 across+0.017
Neg logit contributions
lf-0.025
 cereal-0.024
bes-0.023
 spaghetti-0.022
ma-0.022
Token liked
Token ID8288
Feature activation-1.88e-01

Pos logit contributions
 cereal+0.004
lf+0.004
 spaghetti+0.004
 Amy+0.004
ma+0.004
Neg logit contributions
 Pet-0.004
 Kang-0.004
 tort-0.004
ive-0.004
 ko-0.004
Token to
Token ID284
Feature activation+3.12e-01

Pos logit contributions
 cereal+0.006
 spaghetti+0.006
 Amy+0.005
lf+0.005
 blocks+0.005
Neg logit contributions
ive-0.004
 tort-0.004
 Kang-0.003
mur-0.003
severe-0.003
Token their
Token ID511
Feature activation-7.22e-01

Pos logit contributions
 Pet+0.038
 Kang+0.034
 sur+0.034
 rob+0.034
 tort+0.034
Neg logit contributions
 cereal-0.049
lf-0.047
 spaghetti-0.045
James-0.044
bes-0.044
Token favorite
Token ID4004
Feature activation-2.43e+00

Pos logit contributions
 cereal+0.018
lf+0.017
 spaghetti+0.016
James+0.015
 Amy+0.015
Neg logit contributions
 Pet-0.019
 sur-0.018
 tort-0.018
 rob-0.018
 Kang-0.018
Token thing
Token ID1517
Feature activation-9.70e-01

Pos logit contributions
 cereal+0.069
lf+0.065
 spaghetti+0.064
James+0.061
 Amy+0.060
Neg logit contributions
 Pet-0.053
 tort-0.051
 rob-0.051
 sur-0.050
 Kang-0.050
Token was
Token ID373
Feature activation-1.17e+00

Pos logit contributions
 cereal+0.026
lf+0.024
 spaghetti+0.024
 Amy+0.023
 Kate+0.022
Neg logit contributions
 Pet-0.023
 tort-0.022
 Kang-0.022
 sur-0.022
 rob-0.022
Token the
Token ID262
Feature activation+1.03e-01

Pos logit contributions
 cereal+0.035
 spaghetti+0.033
lf+0.032
 Amy+0.031
 Kate+0.029
Neg logit contributions
 Pet-0.025
 tort-0.024
ive-0.024
 Kang-0.024
 sur-0.024
Token merry
Token ID44570
Feature activation-5.13e+00

Pos logit contributions
 Pet+0.003
 tort+0.002
 Kang+0.002
 sur+0.002
 rob+0.002
Neg logit contributions
 cereal-0.003
lf-0.003
 spaghetti-0.002
 Amy-0.002
James-0.002
Token-
Token ID12
Feature activation-3.20e+00

Pos logit contributions
 cereal+0.165
lf+0.155
 spaghetti+0.153
James+0.146
 Amy+0.145
Neg logit contributions
 Pet-0.094
 tort-0.089
 sur-0.088
 rob-0.088
 Kang-0.087
Tokengo
Token ID2188
Feature activation-2.09e+00

Pos logit contributions
 cereal+0.094
 spaghetti+0.088
lf+0.085
bes+0.079
 Amy+0.078
Neg logit contributions
 Pet-0.070
 rob-0.069
 sur-0.068
 tou-0.066
 Kang-0.065
Token-
Token ID12
Feature activation-3.05e+00

Pos logit contributions
 cereal+0.059
lf+0.058
ma+0.054
 spaghetti+0.053
 Amy+0.052
Neg logit contributions
 Pet-0.045
 tort-0.044
 rob-0.042
 Kang-0.042
 ko-0.041
Tokenround
Token ID744
Feature activation-1.43e+00

Pos logit contributions
 cereal+0.087
lf+0.083
 spaghetti+0.080
 Amy+0.076
James+0.076
Neg logit contributions
 Pet-0.066
 tort-0.063
 Kang-0.062
 rob-0.062
 sur-0.062
Token.
Token ID13
Feature activation+2.13e-01

Pos logit contributions
 cereal+0.035
lf+0.034
 spaghetti+0.032
 Amy+0.030
ma+0.030
Neg logit contributions
 Pet-0.036
 rob-0.035
 tort-0.035
 sur-0.035
 Kang-0.035
Token had
Token ID550
Feature activation-8.72e-01

Pos logit contributions
 cereal+0.034
lf+0.032
 spaghetti+0.032
James+0.030
bes+0.030
Neg logit contributions
 Pet-0.024
 tort-0.023
 rob-0.022
 Kang-0.022
 ko-0.022
Token colorful
Token ID20239
Feature activation-8.27e-01

Pos logit contributions
 cereal+0.020
lf+0.019
 spaghetti+0.018
bes+0.017
James+0.017
Neg logit contributions
 Pet-0.023
 rob-0.022
 tort-0.022
 tou-0.022
 sur-0.022
Token pol
Token ID755
Feature activation-1.39e+00

Pos logit contributions
 cereal+0.017
lf+0.016
 spaghetti+0.015
James+0.014
 Amy+0.014
Neg logit contributions
 Pet-0.024
 tort-0.023
 rob-0.023
 sur-0.023
 Kang-0.023
Tokenka
Token ID4914
Feature activation-1.45e+00

Pos logit contributions
 cereal+0.034
lf+0.031
 spaghetti+0.030
ma+0.028
James+0.028
Neg logit contributions
 Pet-0.037
 tort-0.036
 rob-0.036
 Kang-0.036
 sur-0.035
Token-
Token ID12
Feature activation-2.61e+00

Pos logit contributions
 cereal+0.036
lf+0.034
 spaghetti+0.032
bes+0.030
James+0.029
Neg logit contributions
 Pet-0.039
 Kang-0.037
 tort-0.037
 tou-0.037
 rob-0.036
Tokend
Token ID67
Feature activation-5.08e+00

Pos logit contributions
 cereal+0.066
lf+0.061
 spaghetti+0.060
 Amy+0.056
James+0.055
Neg logit contributions
 Pet-0.066
 tort-0.063
 rob-0.063
 sur-0.063
 Kang-0.062
Tokenots
Token ID1747
Feature activation-6.67e-01

Pos logit contributions
 cereal+0.134
lf+0.125
 spaghetti+0.122
 Amy+0.115
James+0.115
Neg logit contributions
 Pet-0.122
 tort-0.116
 sur-0.115
 rob-0.115
 Kang-0.115
Token all
Token ID477
Feature activation-2.02e+00

Pos logit contributions
 cereal+0.018
 spaghetti+0.016
lf+0.015
ma+0.015
 Kate+0.014
Neg logit contributions
 tou-0.016
 Pet-0.016
 rob-0.016
 Kang-0.016
 sur-0.016
Token over
Token ID625
Feature activation-6.89e-01

Pos logit contributions
 cereal+0.057
lf+0.054
 spaghetti+0.052
 Amy+0.049
James+0.049
Neg logit contributions
 Pet-0.045
 tort-0.043
 Kang-0.043
 sur-0.043
 rob-0.043
Token it
Token ID340
Feature activation-1.88e+00

Pos logit contributions
 cereal+0.020
lf+0.019
 spaghetti+0.018
bes+0.018
James+0.017
Neg logit contributions
 Pet-0.014
 sur-0.013
 tort-0.013
 rob-0.013
 Kang-0.013
Token.
Token ID13
Feature activation-1.68e-01

Pos logit contributions
 cereal+0.056
lf+0.052
 spaghetti+0.051
 Amy+0.048
James+0.048
Neg logit contributions
 Pet-0.039
 tort-0.037
 rob-0.037
 sur-0.037
 Kang-0.037
Token bounce
Token ID20110
Feature activation+2.67e-01

Pos logit contributions
 Pet+0.014
 tort+0.013
 rob+0.013
 sur+0.013
 Kang+0.012
Neg logit contributions
 cereal-0.021
lf-0.020
 spaghetti-0.020
James-0.019
bes-0.019
Token on
Token ID319
Feature activation-7.66e-01

Pos logit contributions
 Pet+0.005
 tort+0.005
 sur+0.005
 rob+0.005
 Kang+0.005
Neg logit contributions
 cereal-0.008
lf-0.008
 spaghetti-0.007
James-0.007
 Amy-0.007
Token the
Token ID262
Feature activation+1.30e-01

Pos logit contributions
lf+0.026
 cereal+0.026
James+0.025
Kate+0.025
 spaghetti+0.024
Neg logit contributions
 Pet-0.012
 tort-0.010
 rob-0.010
 sur-0.010
 ko-0.009
Token big
Token ID1263
Feature activation-1.64e+00

Pos logit contributions
 Pet+0.003
 tort+0.003
 rob+0.003
 sur+0.003
 Kang+0.003
Neg logit contributions
 cereal-0.003
lf-0.003
 spaghetti-0.003
James-0.003
 Amy-0.003
Token tr
Token ID491
Feature activation-3.05e+00

Pos logit contributions
 cereal+0.041
lf+0.039
 spaghetti+0.038
James+0.036
bes+0.036
Neg logit contributions
 Pet-0.041
 tort-0.040
 rob-0.039
 sur-0.039
 Kang-0.039
Tokenamp
Token ID696
Feature activation-4.95e+00

Pos logit contributions
 cereal+0.082
 spaghetti+0.077
 Amy+0.073
James+0.071
lf+0.070
Neg logit contributions
 Kang-0.065
 Pet-0.063
 sur-0.063
 gol-0.061
 perspective-0.060
Tokenoline
Token ID14453
Feature activation-1.38e+00

Pos logit contributions
 cereal+0.112
 Deal+0.104
 spaghetti+0.099
James+0.098
bes+0.098
Neg logit contributions
ive-0.119
crow-0.119
 sur-0.119
 Pet-0.117
 across-0.116
Token in
Token ID287
Feature activation-1.16e+00

Pos logit contributions
 cereal+0.044
lf+0.042
 spaghetti+0.041
James+0.040
bes+0.039
Neg logit contributions
 Pet-0.024
 tort-0.023
 sur-0.022
 Kang-0.022
 rob-0.022
Token their
Token ID511
Feature activation-4.21e-02

Pos logit contributions
 cereal+0.040
lf+0.039
 spaghetti+0.038
bes+0.038
 microwave+0.037
Neg logit contributions
 Pet-0.017
 rob-0.016
 Kang-0.015
 ko-0.015
 across-0.015
Token backyard
Token ID24296
Feature activation-7.92e-01

Pos logit contributions
 cereal+0.001
lf+0.001
 spaghetti+0.001
James+0.001
bes+0.001
Neg logit contributions
 Pet-0.001
 tort-0.001
 sur-0.001
 Kang-0.001
 rob-0.001
Token.
Token ID13
Feature activation+8.28e-01

Pos logit contributions
 cereal+0.025
lf+0.024
 spaghetti+0.023
James+0.022
 Amy+0.022
Neg logit contributions
 Pet-0.015
 tort-0.014
 rob-0.014
 sur-0.014
 Kang-0.014
Token backyard
Token ID24296
Feature activation-1.85e-01

Pos logit contributions
 cereal+0.010
lf+0.010
 spaghetti+0.009
bes+0.009
James+0.009
Neg logit contributions
 Pet-0.010
 tort-0.009
 sur-0.009
 rob-0.009
 Kang-0.009
Token on
Token ID319
Feature activation-8.77e-01

Pos logit contributions
 cereal+0.005
lf+0.005
 spaghetti+0.005
James+0.005
 Amy+0.005
Neg logit contributions
 Pet-0.004
 tort-0.004
 rob-0.004
 sur-0.004
 Kang-0.004
Token a
Token ID257
Feature activation-6.78e-01

Pos logit contributions
 cereal+0.026
lf+0.025
 spaghetti+0.024
James+0.023
bes+0.023
Neg logit contributions
 Pet-0.018
 rob-0.017
 tort-0.017
 Kang-0.016
 sur-0.016
Token big
Token ID1263
Feature activation-1.45e+00

Pos logit contributions
 cereal+0.018
lf+0.016
 spaghetti+0.016
James+0.015
bes+0.015
Neg logit contributions
 Pet-0.016
 tort-0.016
 rob-0.015
 sur-0.015
 Kang-0.015
Token tr
Token ID491
Feature activation-2.71e+00

Pos logit contributions
 cereal+0.039
lf+0.036
 spaghetti+0.035
bes+0.034
James+0.034
Neg logit contributions
 Pet-0.033
 tort-0.033
 rob-0.032
 sur-0.032
 Kang-0.032
Tokenamp
Token ID696
Feature activation-4.90e+00

Pos logit contributions
 cereal+0.075
 spaghetti+0.068
 Amy+0.066
James+0.063
 blankets+0.061
Neg logit contributions
 sur-0.058
 Kang-0.056
 gol-0.056
 Pet-0.056
 perspective-0.055
Tokenoline
Token ID14453
Feature activation-1.39e+00

Pos logit contributions
 cereal+0.115
 Deal+0.110
 spaghetti+0.100
bes+0.099
James+0.098
Neg logit contributions
ting-0.118
 sur-0.117
 Pet-0.116
 across-0.115
 gol-0.114
Token.
Token ID13
Feature activation+3.65e-01

Pos logit contributions
 cereal+0.040
lf+0.037
 spaghetti+0.037
 Amy+0.034
James+0.034
Neg logit contributions
 Pet-0.031
 tort-0.029
 rob-0.029
 Kang-0.029
 sur-0.029
Token\n
Token ID198
Feature activation+2.58e-01

Pos logit contributions
 Pet+0.010
 rob+0.010
 tort+0.010
 sur+0.009
 tou+0.009
Neg logit contributions
 cereal-0.009
 spaghetti-0.008
lf-0.008
 Amy-0.007
James-0.007
Token\n
Token ID198
Feature activation+3.91e-01

Pos logit contributions
 Pet+0.005
 Kang+0.005
 sur+0.005
 tort+0.005
 rob+0.005
Neg logit contributions
 cereal-0.008
lf-0.007
 spaghetti-0.007
bes-0.007
ma-0.007
TokenT
Token ID51
Feature activation+4.16e-01

Pos logit contributions
 Pet+0.009
 tort+0.008
 rob+0.008
 Kang+0.008
 sur+0.008
Neg logit contributions
 cereal-0.011
lf-0.010
 spaghetti-0.010
James-0.010
 Amy-0.010
Token big
Token ID1263
Feature activation-9.24e-01

Pos logit contributions
 Pet+0.017
ey+0.016
ive+0.015
 Kang+0.015
ting+0.015
Neg logit contributions
 cereal-0.015
 spaghetti-0.014
lf-0.013
 rug-0.012
 Amy-0.012
Token,
Token ID11
Feature activation-1.40e+00

Pos logit contributions
 cereal+0.024
 spaghetti+0.022
lf+0.021
 Amy+0.020
 rug+0.020
Neg logit contributions
 Pet-0.024
 rob-0.022
 Kang-0.022
 tou-0.022
 tort-0.022
Token boun
Token ID31283
Feature activation-2.42e+00

Pos logit contributions
 cereal+0.034
lf+0.033
James+0.031
ma+0.030
bes+0.030
Neg logit contributions
 Pet-0.035
 tort-0.034
 rob-0.034
 sur-0.034
 tou-0.033
Tokency
Token ID948
Feature activation-1.77e+00

Pos logit contributions
 cereal+0.076
lf+0.073
 spaghetti+0.070
bes+0.068
James+0.068
Neg logit contributions
 Pet-0.045
 sur-0.044
 tort-0.043
 Kang-0.043
 rob-0.042
Token tr
Token ID491
Feature activation-2.43e+00

Pos logit contributions
 cereal+0.042
lf+0.038
 spaghetti+0.037
James+0.035
 Amy+0.035
Neg logit contributions
 Pet-0.047
 tort-0.046
 rob-0.045
 sur-0.045
 Kang-0.045
Tokenamp
Token ID696
Feature activation-4.88e+00

Pos logit contributions
 cereal+0.066
 spaghetti+0.061
 Amy+0.060
James+0.055
lf+0.054
Neg logit contributions
 sur-0.056
 Kang-0.054
 Pet-0.053
 gol-0.053
mur-0.052
Tokenoline
Token ID14453
Feature activation-1.26e+00

Pos logit contributions
 cereal+0.105
 spaghetti+0.092
 Deal+0.089
James+0.088
bes+0.087
Neg logit contributions
 Pet-0.135
 sur-0.133
ting-0.131
ive-0.130
 tort-0.129
Token.
Token ID13
Feature activation+5.40e-03

Pos logit contributions
 cereal+0.034
lf+0.032
 spaghetti+0.032
 Amy+0.029
bes+0.029
Neg logit contributions
 Pet-0.029
 rob-0.028
 tort-0.028
 Kang-0.028
 sur-0.028
Token They
Token ID1119
Feature activation-2.62e-01

Pos logit contributions
 Pet+0.000
 tort+0.000
 rob+0.000
 sur+0.000
 Kang+0.000
Neg logit contributions
 cereal-0.000
lf-0.000
 spaghetti-0.000
 Amy-0.000
James-0.000
Token all
Token ID477
Feature activation-3.29e-01

Pos logit contributions
 cereal+0.007
lf+0.006
 spaghetti+0.006
 Amy+0.006
James+0.006
Neg logit contributions
 Pet-0.007
 Kang-0.006
 tort-0.006
 rob-0.006
 sur-0.006
Token cheered
Token ID34767
Feature activation-1.45e+00

Pos logit contributions
 cereal+0.008
lf+0.007
 spaghetti+0.007
 Amy+0.007
Kate+0.007
Neg logit contributions
 Pet-0.009
 Kang-0.008
 tort-0.008
 tou-0.008
 rob-0.008
Token and
Token ID290
Feature activation-9.61e-02

Pos logit contributions
 Pet+0.025
 tort+0.024
 sur+0.024
 rob+0.024
 Kang+0.023
Neg logit contributions
 cereal-0.033
lf-0.031
 spaghetti-0.030
James-0.029
 Amy-0.029
Token found
Token ID1043
Feature activation+1.99e-01

Pos logit contributions
 cereal+0.003
lf+0.003
 spaghetti+0.003
 Amy+0.002
James+0.002
Neg logit contributions
 Pet-0.002
 Kang-0.002
 tort-0.002
 rob-0.002
 tou-0.002
Token a
Token ID257
Feature activation+6.76e-02

Pos logit contributions
 Pet+0.005
 tort+0.005
 rob+0.005
 sur+0.005
 Kang+0.005
Neg logit contributions
lf-0.005
bes-0.005
 cereal-0.005
James-0.004
Kate-0.004
Token big
Token ID1263
Feature activation-1.16e+00

Pos logit contributions
 Pet+0.002
 Kang+0.002
 tort+0.002
 sur+0.002
ive+0.002
Neg logit contributions
 cereal-0.002
lf-0.001
 spaghetti-0.001
 Amy-0.001
 rug-0.001
Token tr
Token ID491
Feature activation-2.35e+00

Pos logit contributions
 cereal+0.029
lf+0.027
 spaghetti+0.026
 Amy+0.024
ma+0.024
Neg logit contributions
 Pet-0.030
 Kang-0.029
 tort-0.028
 sur-0.028
 rob-0.028
Tokenamp
Token ID696
Feature activation-4.83e+00

Pos logit contributions
 cereal+0.063
 spaghetti+0.059
 Amy+0.057
James+0.054
bes+0.052
Neg logit contributions
 sur-0.053
 Kang-0.052
 Pet-0.051
 gol-0.050
 tort-0.049
Tokenoline
Token ID14453
Feature activation-9.62e-01

Pos logit contributions
 cereal+0.108
 Deal+0.101
 spaghetti+0.097
James+0.094
bes+0.090
Neg logit contributions
 Pet-0.126
ting-0.123
 tort-0.122
 sur-0.121
mur-0.118
Token.
Token ID13
Feature activation-4.74e-01

Pos logit contributions
 cereal+0.027
lf+0.025
 spaghetti+0.025
 Amy+0.023
James+0.023
Neg logit contributions
 Pet-0.022
 rob-0.021
 tort-0.021
 Kang-0.021
 sur-0.021
Token He
Token ID679
Feature activation-2.26e-01

Pos logit contributions
lf+0.012
 cereal+0.011
ma+0.011
bes+0.011
kin+0.010
Neg logit contributions
 Pet-0.011
 sur-0.011
 across-0.011
 tort-0.011
 gol-0.010
Token jumped
Token ID11687
Feature activation+7.73e-01

Pos logit contributions
 cereal+0.006
lf+0.006
 spaghetti+0.006
 Amy+0.006
James+0.006
Neg logit contributions
 Pet-0.005
 tort-0.005
 rob-0.005
 sur-0.005
 Kang-0.005
Token on
Token ID319
Feature activation-9.82e-01

Pos logit contributions
 rob+0.019
 Kang+0.018
 Pet+0.018
 tou+0.018
 tort+0.018
Neg logit contributions
 cereal-0.021
 spaghetti-0.019
lf-0.018
 Amy-0.018
 Kate-0.017
Token sofa
Token ID34902
Feature activation-1.74e-01

Pos logit contributions
 cereal+0.008
lf+0.008
James+0.008
 Amy+0.007
Kate+0.007
Neg logit contributions
 Pet-0.011
 tort-0.011
 rob-0.011
 across-0.011
 tou-0.011
Token,
Token ID11
Feature activation-6.00e-02

Pos logit contributions
 cereal+0.006
lf+0.005
 spaghetti+0.005
James+0.005
 Amy+0.005
Neg logit contributions
 Pet-0.003
 tort-0.003
 sur-0.003
 rob-0.003
 Kang-0.003
Token on
Token ID319
Feature activation-1.11e+00

Pos logit contributions
 cereal+0.002
lf+0.002
 spaghetti+0.002
James+0.002
bes+0.002
Neg logit contributions
 Pet-0.001
 rob-0.001
 tort-0.001
 sur-0.001
 Kang-0.001
Token the
Token ID262
Feature activation-2.38e-01

Pos logit contributions
 cereal+0.038
lf+0.037
James+0.036
 spaghetti+0.035
bes+0.035
Neg logit contributions
 Pet-0.017
 rob-0.015
 Kang-0.014
 sur-0.014
 tort-0.014
Token tr
Token ID491
Feature activation-2.23e+00

Pos logit contributions
 cereal+0.004
lf+0.004
James+0.004
 Amy+0.003
Kate+0.003
Neg logit contributions
 Pet-0.007
 tort-0.007
 rob-0.007
 ko-0.007
 across-0.007
Tokenamp
Token ID696
Feature activation-4.82e+00

Pos logit contributions
 cereal+0.063
 spaghetti+0.058
 Amy+0.057
 forehead+0.056
 elbow+0.053
Neg logit contributions
 Kang-0.041
ub-0.040
ounce-0.039
 sur-0.039
mur-0.039
Tokenoline
Token ID14453
Feature activation-1.17e+00

Pos logit contributions
 cereal+0.111
 Deal+0.107
James+0.100
 spaghetti+0.098
bes+0.093
Neg logit contributions
ive-0.113
 Pet-0.113
crow-0.112
 sur-0.111
 across-0.109
Token.
Token ID13
Feature activation+6.63e-01

Pos logit contributions
 cereal+0.034
lf+0.032
 spaghetti+0.031
 Amy+0.029
bes+0.029
Neg logit contributions
 Pet-0.025
 tort-0.024
 Kang-0.024
 rob-0.024
 sur-0.024
Token They
Token ID1119
Feature activation-1.42e-01

Pos logit contributions
 Pet+0.017
 Kang+0.016
 tort+0.016
 sur+0.016
 rob+0.016
Neg logit contributions
 cereal-0.016
lf-0.016
 spaghetti-0.015
ma-0.014
James-0.014
Token bounce
Token ID20110
Feature activation+2.46e-01

Pos logit contributions
 cereal+0.004
lf+0.003
 spaghetti+0.003
 Amy+0.003
 Kate+0.003
Neg logit contributions
 Kang-0.003
 Pet-0.003
ive-0.003
 rob-0.003
 tou-0.003
Token high
Token ID1029
Feature activation-1.39e+00

Pos logit contributions
 Pet+0.005
 tort+0.005
 sur+0.005
 rob+0.005
 Kang+0.005
Neg logit contributions
 cereal-0.007
lf-0.007
 spaghetti-0.007
James-0.007
ma-0.006
Token They
Token ID1119
Feature activation-2.74e-01

Pos logit contributions
 Pet+0.005
 Kang+0.005
 sur+0.005
 rob+0.005
 ko+0.005
Neg logit contributions
lf-0.007
 cereal-0.007
ma-0.007
 spaghetti-0.007
bes-0.006
Token forgot
Token ID16453
Feature activation+1.39e-01

Pos logit contributions
 cereal+0.007
lf+0.007
 spaghetti+0.006
 Amy+0.006
James+0.006
Neg logit contributions
 Pet-0.007
 Kang-0.006
 tort-0.006
 rob-0.006
 sur-0.006
Token about
Token ID546
Feature activation-1.66e-02

Pos logit contributions
 Pet+0.003
 tort+0.003
 Kang+0.003
 rob+0.003
 sur+0.003
Neg logit contributions
 cereal-0.004
 spaghetti-0.003
lf-0.003
 Amy-0.003
James-0.003
Token the
Token ID262
Feature activation-5.13e-01

Pos logit contributions
 cereal+0.000
lf+0.000
 spaghetti+0.000
 Amy+0.000
James+0.000
Neg logit contributions
 Pet-0.000
 tort-0.000
 Kang-0.000
 sur-0.000
 rob-0.000
Token broken
Token ID5445
Feature activation-1.16e+00

Pos logit contributions
 cereal+0.014
lf+0.013
 spaghetti+0.013
 Amy+0.012
James+0.012
Neg logit contributions
 Pet-0.012
 tort-0.012
 sur-0.011
 rob-0.011
 Kang-0.011
Token merry
Token ID44570
Feature activation-4.80e+00

Pos logit contributions
 cereal+0.030
lf+0.028
 spaghetti+0.027
ma+0.026
bes+0.025
Neg logit contributions
 Pet-0.029
 rob-0.028
 tort-0.028
 sur-0.028
 Kang-0.027
Token-
Token ID12
Feature activation-3.08e+00

Pos logit contributions
 cereal+0.149
lf+0.140
 spaghetti+0.138
 Amy+0.130
James+0.130
Neg logit contributions
 Pet-0.094
 tort-0.089
 rob-0.088
 sur-0.088
 Kang-0.088
Tokengo
Token ID2188
Feature activation-1.59e+00

Pos logit contributions
 cereal+0.088
 spaghetti+0.082
lf+0.080
 Amy+0.075
bes+0.074
Neg logit contributions
 Pet-0.070
 sur-0.066
 rob-0.066
 tort-0.066
 Kang-0.065
Token-
Token ID12
Feature activation-2.88e+00

Pos logit contributions
 cereal+0.045
lf+0.045
ma+0.042
 spaghetti+0.041
bes+0.040
Neg logit contributions
 Pet-0.034
 tort-0.033
 rob-0.033
 Kang-0.032
 sur-0.031
Tokenround
Token ID744
Feature activation-1.27e+00

Pos logit contributions
 cereal+0.085
lf+0.080
 spaghetti+0.078
James+0.074
 Amy+0.074
Neg logit contributions
 Pet-0.060
 tort-0.058
 rob-0.057
 Kang-0.057
 sur-0.057
Token.
Token ID13
Feature activation+9.40e-02

Pos logit contributions
 cereal+0.033
lf+0.032
 spaghetti+0.030
ma+0.029
 Amy+0.028
Neg logit contributions
 Pet-0.031
 rob-0.030
 tort-0.030
 sur-0.030
 Kang-0.029
Token playing
Token ID2712
Feature activation-1.86e-01

Pos logit contributions
lf+0.005
 cereal+0.005
bes+0.004
lights+0.004
 spaghetti+0.004
Neg logit contributions
 Pet-0.004
 sur-0.004
 rob-0.004
 Kang-0.004
 tort-0.004
Token outside
Token ID2354
Feature activation-7.88e-01

Pos logit contributions
 cereal+0.006
lf+0.005
 spaghetti+0.005
ma+0.005
James+0.005
Neg logit contributions
 Pet-0.003
 tort-0.003
 sur-0.003
 rob-0.003
 Kang-0.003
Token on
Token ID319
Feature activation-1.04e+00

Pos logit contributions
 cereal+0.025
lf+0.024
 spaghetti+0.024
James+0.023
bes+0.023
Neg logit contributions
 Pet-0.014
 tort-0.013
 rob-0.013
 sur-0.013
 Kang-0.013
Token the
Token ID262
Feature activation-6.24e-01

Pos logit contributions
 cereal+0.035
lf+0.035
 Deal+0.034
bes+0.034
James+0.034
Neg logit contributions
 Pet-0.014
 rob-0.013
 across-0.012
 tort-0.012
 sur-0.012
Token tr
Token ID491
Feature activation-2.73e+00

Pos logit contributions
 cereal+0.016
lf+0.015
 spaghetti+0.014
James+0.014
bes+0.014
Neg logit contributions
 Pet-0.015
 tort-0.015
 rob-0.015
 Kang-0.015
 sur-0.015
Tokenamp
Token ID696
Feature activation-4.78e+00

Pos logit contributions
 cereal+0.077
 spaghetti+0.071
 Amy+0.067
James+0.065
 forehead+0.065
Neg logit contributions
 sur-0.055
 gol-0.054
 Kang-0.054
 Pet-0.054
 perspective-0.053
Tokenoline
Token ID14453
Feature activation-1.61e+00

Pos logit contributions
 cereal+0.108
 Deal+0.096
 spaghetti+0.094
James+0.091
bes+0.090
Neg logit contributions
 sur-0.123
 Pet-0.123
ive-0.118
 across-0.118
 encountered-0.117
Token.
Token ID13
Feature activation+1.36e-01

Pos logit contributions
 cereal+0.049
lf+0.046
 spaghetti+0.045
James+0.043
 Amy+0.043
Neg logit contributions
 Pet-0.032
 tort-0.030
 sur-0.030
 Kang-0.030
 rob-0.030
Token He
Token ID679
Feature activation+1.75e-02

Pos logit contributions
 Pet+0.004
 tort+0.004
 sur+0.004
 Kang+0.004
 rob+0.004
Neg logit contributions
 cereal-0.003
lf-0.003
 spaghetti-0.003
bes-0.003
ma-0.003
Token was
Token ID373
Feature activation-4.81e-01

Pos logit contributions
 Pet+0.000
 sur+0.000
 tort+0.000
 rob+0.000
 Kang+0.000
Neg logit contributions
 cereal-0.001
lf-0.001
 spaghetti-0.001
bes-0.000
James-0.000
Token excited
Token ID6568
Feature activation-1.09e+00

Pos logit contributions
 cereal+0.011
lf+0.010
 spaghetti+0.010
 Amy+0.010
James+0.009
Neg logit contributions
 Pet-0.013
 tort-0.013
 Kang-0.012
 rob-0.012
 tou-0.012
Token to
Token ID284
Feature activation+6.04e-01

Pos logit contributions
 cereal+0.009
 spaghetti+0.008
lf+0.008
 Amy+0.008
 rug+0.007
Neg logit contributions
 Pet-0.003
 tort-0.003
 Kang-0.003
ive-0.003
 ko-0.003
Token jump
Token ID4391
Feature activation+3.86e-01

Pos logit contributions
 Pet+0.010
 rob+0.010
 tort+0.010
 sur+0.009
 judge+0.009
Neg logit contributions
 cereal-0.020
lf-0.019
 spaghetti-0.019
James-0.018
ma-0.018
Token on
Token ID319
Feature activation-1.22e+00

Pos logit contributions
 Pet+0.008
 tort+0.007
 Kang+0.007
 rob+0.007
 sur+0.007
Neg logit contributions
 cereal-0.012
lf-0.011
 spaghetti-0.011
James-0.011
ma-0.010
Token the
Token ID262
Feature activation+3.74e-02

Pos logit contributions
 cereal+0.038
lf+0.038
James+0.036
 spaghetti+0.036
bes+0.035
Neg logit contributions
 Pet-0.022
 rob-0.020
 tort-0.020
 sur-0.020
 Kang-0.019
Token tr
Token ID491
Feature activation-2.70e+00

Pos logit contributions
 Pet+0.001
 tort+0.001
 rob+0.001
 Kang+0.001
 sur+0.001
Neg logit contributions
 cereal-0.001
lf-0.001
 spaghetti-0.001
James-0.001
 Amy-0.001
Tokenamp
Token ID696
Feature activation-4.69e+00

Pos logit contributions
 cereal+0.075
 spaghetti+0.069
 Amy+0.066
James+0.065
lf+0.064
Neg logit contributions
 Kang-0.055
 Pet-0.054
 sur-0.054
ub-0.052
 perspective-0.052
Tokenoline
Token ID14453
Feature activation-1.23e+00

Pos logit contributions
 cereal+0.104
 Deal+0.102
James+0.095
 spaghetti+0.093
bes+0.088
Neg logit contributions
 Pet-0.115
ive-0.114
crow-0.112
 sur-0.111
 across-0.110
Token.
Token ID13
Feature activation+1.03e+00

Pos logit contributions
 cereal+0.038
lf+0.036
 spaghetti+0.035
James+0.034
bes+0.033
Neg logit contributions
 Pet-0.023
 tort-0.022
 sur-0.022
 Kang-0.022
 rob-0.021
Token It
Token ID632
Feature activation-1.14e+00

Pos logit contributions
 Pet+0.028
 Kang+0.026
 tort+0.026
 rob+0.026
 sur+0.026
Neg logit contributions
lf-0.024
 cereal-0.023
ma-0.021
 spaghetti-0.021
bes-0.020
Token is
Token ID318
Feature activation-1.06e+00

Pos logit contributions
 cereal+0.031
lf+0.029
 spaghetti+0.028
James+0.027
bes+0.026
Neg logit contributions
 Pet-0.027
 tort-0.026
 Kang-0.025
 rob-0.025
 sur-0.025
Token fun
Token ID1257
Feature activation-1.74e+00

Pos logit contributions
 cereal+0.027
lf+0.025
 spaghetti+0.024
James+0.023
bes+0.023
Neg logit contributions
 Pet-0.027
 Kang-0.025
 tort-0.025
 rob-0.025
 sur-0.025
Token pretending
Token ID23662
Feature activation+3.05e-01

Pos logit contributions
 Pet+0.014
 tort+0.014
 sur+0.014
 rob+0.014
 Kang+0.013
Neg logit contributions
 cereal-0.016
lf-0.015
 spaghetti-0.014
James-0.013
ma-0.013
Token it
Token ID340
Feature activation-6.72e-01

Pos logit contributions
 Pet+0.005
 Kang+0.005
 tort+0.005
 sur+0.005
 rob+0.005
Neg logit contributions
 cereal-0.011
lf-0.010
 spaghetti-0.010
 Amy-0.010
James-0.009
Token was
Token ID373
Feature activation+2.64e-01

Pos logit contributions
 cereal+0.018
lf+0.016
 spaghetti+0.016
 Amy+0.015
bes+0.015
Neg logit contributions
 Pet-0.017
 sur-0.016
 tort-0.016
 rob-0.016
 Kang-0.016
Token a
Token ID257
Feature activation-1.03e-02

Pos logit contributions
 Pet+0.007
 tort+0.007
 sur+0.007
 Kang+0.007
 rob+0.007
Neg logit contributions
 cereal-0.006
 spaghetti-0.005
lf-0.005
 Amy-0.005
bes-0.005
Token tr
Token ID491
Feature activation-2.21e+00

Pos logit contributions
 cereal+0.000
 spaghetti+0.000
lf+0.000
 rug+0.000
 Amy+0.000
Neg logit contributions
 Pet-0.000
 Kang-0.000
 sur-0.000
ey-0.000
 rob-0.000
Tokenamp
Token ID696
Feature activation-4.68e+00

Pos logit contributions
 cereal+0.059
 spaghetti+0.054
 Amy+0.054
James+0.051
 Kate+0.050
Neg logit contributions
 Kang-0.050
 sur-0.048
 Pet-0.047
mur-0.047
 tort-0.047
Tokenoline
Token ID14453
Feature activation-1.26e+00

Pos logit contributions
 cereal+0.102
 Deal+0.095
 spaghetti+0.090
James+0.089
bes+0.089
Neg logit contributions
ting-0.121
 Pet-0.119
 tort-0.117
 sur-0.117
ive-0.116
Token.
Token ID13
Feature activation-1.63e-01

Pos logit contributions
 cereal+0.033
lf+0.030
 spaghetti+0.030
 Amy+0.028
ma+0.027
Neg logit contributions
 Pet-0.032
 rob-0.031
 tort-0.030
 Kang-0.030
 sur-0.030
Token\n
Token ID198
Feature activation+1.16e-01

Pos logit contributions
 cereal+0.004
lf+0.004
 spaghetti+0.004
bes+0.004
 Amy+0.004
Neg logit contributions
 Pet-0.004
 tort-0.004
 Kang-0.004
 sur-0.004
 rob-0.004
Token\n
Token ID198
Feature activation+1.77e-01

Pos logit contributions
 Pet+0.002
 Kang+0.002
 sur+0.002
 tort+0.002
 rob+0.002
Neg logit contributions
 cereal-0.004
 spaghetti-0.003
lf-0.003
bes-0.003
 Amy-0.003
TokenOne
Token ID3198
Feature activation+4.21e-01

Pos logit contributions
 Pet+0.006
 rob+0.006
ive+0.006
 sur+0.006
 tort+0.006
Neg logit contributions
lf-0.003
 cereal-0.003
James-0.003
 spaghetti-0.003
Kate-0.003
Token She
Token ID1375
Feature activation-2.74e-01

Pos logit contributions
 Pet+0.012
 tort+0.012
 Kang+0.012
 rob+0.011
 ko+0.011
Neg logit contributions
 cereal-0.013
lf-0.012
 spaghetti-0.012
James-0.011
bes-0.011
Token cried
Token ID16896
Feature activation-6.39e-01

Pos logit contributions
 cereal+0.008
lf+0.008
 spaghetti+0.008
James+0.007
 Amy+0.007
Neg logit contributions
 Pet-0.006
 tort-0.005
 sur-0.005
 rob-0.005
 Kang-0.005
Token and
Token ID290
Feature activation-6.10e-01

Pos logit contributions
 cereal+0.018
lf+0.018
 spaghetti+0.017
James+0.016
bes+0.016
Neg logit contributions
 Pet-0.014
 tort-0.013
 Kang-0.013
 sur-0.013
 rob-0.013
Token threw
Token ID9617
Feature activation-5.32e-01

Pos logit contributions
 cereal+0.017
lf+0.015
 spaghetti+0.015
James+0.014
ma+0.014
Neg logit contributions
 Pet-0.014
 tort-0.014
 sur-0.013
 Kang-0.013
 ko-0.013
Token a
Token ID257
Feature activation-1.45e+00

Pos logit contributions
lf+0.012
 cereal+0.012
 spaghetti+0.011
bes+0.011
James+0.011
Neg logit contributions
 Pet-0.014
 tort-0.014
 rob-0.014
 ko-0.013
 tou-0.013
Token tant
Token ID24246
Feature activation-4.63e+00

Pos logit contributions
 cereal+0.034
lf+0.031
 spaghetti+0.031
 Amy+0.028
James+0.028
Neg logit contributions
 Pet-0.040
 Kang-0.038
 tort-0.038
 sur-0.037
 rob-0.037
Tokenrum
Token ID6582
Feature activation-7.89e-01

Pos logit contributions
 cereal+0.096
 spaghetti+0.086
lf+0.079
James+0.075
 microwave+0.075
Neg logit contributions
 rob-0.137
 Pet-0.135
 tort-0.135
 sur-0.134
 Kang-0.133
Token.
Token ID13
Feature activation+3.27e-01

Pos logit contributions
 cereal+0.025
lf+0.024
 spaghetti+0.024
James+0.023
bes+0.022
Neg logit contributions
 Pet-0.014
 Kang-0.013
 tort-0.013
 ko-0.012
 rob-0.012
Token Her
Token ID2332
Feature activation+2.72e-01

Pos logit contributions
 Pet+0.008
 tort+0.007
 Kang+0.007
 rob+0.007
 ko+0.007
Neg logit contributions
 cereal-0.008
lf-0.008
 spaghetti-0.008
ma-0.007
James-0.007
Token mom
Token ID1995
Feature activation-1.50e+00

Pos logit contributions
 judge+0.006
 tort+0.006
 tou+0.006
 rob+0.006
 sur+0.006
Neg logit contributions
James-0.007
 cereal-0.007
endo-0.007
Kate-0.007
Amy-0.006
Token tried
Token ID3088
Feature activation-5.12e-01

Pos logit contributions
 cereal+0.042
lf+0.039
 spaghetti+0.038
 Amy+0.036
James+0.036
Neg logit contributions
 Pet-0.034
 tort-0.033
 rob-0.032
 sur-0.032
 Kang-0.032
Token swing
Token ID9628
Feature activation-2.42e-01

Pos logit contributions
 Pet+0.008
 tort+0.007
 rob+0.007
 sur+0.007
 Kang+0.007
Neg logit contributions
 cereal-0.016
lf-0.016
 spaghetti-0.015
James-0.015
bes-0.015
Token and
Token ID290
Feature activation+7.50e-01

Pos logit contributions
 cereal+0.008
lf+0.007
 spaghetti+0.007
James+0.007
bes+0.007
Neg logit contributions
 Pet-0.005
 tort-0.004
 tou-0.004
 Kang-0.004
 ko-0.004
Token play
Token ID711
Feature activation-6.80e-01

Pos logit contributions
 Pet+0.017
 tort+0.017
 Kang+0.016
 sur+0.016
 rob+0.016
Neg logit contributions
 cereal-0.021
lf-0.019
 spaghetti-0.019
 Amy-0.018
James-0.018
Token on
Token ID319
Feature activation-8.50e-01

Pos logit contributions
 cereal+0.022
 spaghetti+0.020
lf+0.020
ma+0.020
James+0.020
Neg logit contributions
 Pet-0.012
 tort-0.011
 sur-0.011
 ko-0.011
 Kang-0.010
Token the
Token ID262
Feature activation-3.73e-01

Pos logit contributions
 cereal+0.030
 spaghetti+0.029
lf+0.029
James+0.029
 Deal+0.029
Neg logit contributions
 Pet-0.012
 rob-0.010
 tou-0.010
 sur-0.009
 ko-0.009
Token merry
Token ID44570
Feature activation-4.61e+00

Pos logit contributions
 cereal+0.008
lf+0.008
 spaghetti+0.008
 Amy+0.007
James+0.007
Neg logit contributions
 Pet-0.011
 tort-0.010
 sur-0.010
 Kang-0.010
 rob-0.010
Token-
Token ID12
Feature activation-2.93e+00

Pos logit contributions
 cereal+0.156
 spaghetti+0.146
James+0.142
lf+0.138
Kate+0.137
Neg logit contributions
 Pet-0.075
 tort-0.071
 tou-0.071
 sur-0.070
 rob-0.069
Tokengo
Token ID2188
Feature activation-1.27e+00

Pos logit contributions
 cereal+0.087
 spaghetti+0.081
lf+0.076
James+0.072
 Deal+0.072
Neg logit contributions
 Pet-0.063
 rob-0.061
 sur-0.059
 tou-0.059
 tort-0.058
Token-
Token ID12
Feature activation-2.49e+00

Pos logit contributions
 cereal+0.040
lf+0.038
 spaghetti+0.037
 Amy+0.035
ma+0.035
Neg logit contributions
 Pet-0.025
 tort-0.024
 rob-0.023
 sur-0.023
 Kang-0.023
Tokenround
Token ID744
Feature activation-6.19e-01

Pos logit contributions
 cereal+0.079
 spaghetti+0.074
lf+0.073
 Amy+0.068
James+0.068
Neg logit contributions
 Pet-0.047
 rob-0.044
 sur-0.044
 tort-0.044
 Kang-0.044
Token.
Token ID13
Feature activation-1.47e-01

Pos logit contributions
 cereal+0.016
lf+0.016
 spaghetti+0.015
 Amy+0.014
ma+0.014
Neg logit contributions
 Pet-0.015
 tort-0.014
 rob-0.014
 sur-0.014
 Kang-0.014
Token.
Token ID13
Feature activation+3.27e-01

Pos logit contributions
 cereal+0.032
lf+0.030
 spaghetti+0.029
James+0.027
 Amy+0.027
Neg logit contributions
 Pet-0.021
 tort-0.020
 sur-0.020
 Kang-0.020
 rob-0.020
Token He
Token ID679
Feature activation+6.77e-02

Pos logit contributions
 Pet+0.008
 sur+0.007
 Kang+0.007
 tort+0.007
 across+0.007
Neg logit contributions
lf-0.008
 cereal-0.008
bes-0.008
 spaghetti-0.008
kin-0.008
Token was
Token ID373
Feature activation-4.91e-01

Pos logit contributions
 Pet+0.001
 sur+0.001
 tort+0.001
 rob+0.001
 Kang+0.001
Neg logit contributions
 cereal-0.002
lf-0.002
 spaghetti-0.002
bes-0.002
James-0.002
Token wearing
Token ID5762
Feature activation+3.64e-01

Pos logit contributions
 cereal+0.013
lf+0.012
 spaghetti+0.012
bes+0.011
James+0.011
Neg logit contributions
 Pet-0.012
 tort-0.011
 sur-0.011
 rob-0.011
 Kang-0.011
Token his
Token ID465
Feature activation-1.70e+00

Pos logit contributions
 Pet+0.008
 sur+0.008
 tort+0.007
 across+0.007
 ko+0.007
Neg logit contributions
lf-0.009
 cereal-0.009
bes-0.009
Kate-0.008
ma-0.008
Token brand
Token ID4508
Feature activation-4.60e+00

Pos logit contributions
lf+0.049
ocolate+0.047
endo+0.047
 cereal+0.047
Amy+0.047
Neg logit contributions
 across-0.029
 judge-0.028
 tort-0.026
mit-0.026
 Pet-0.026
Token new
Token ID649
Feature activation-2.53e+00

Pos logit contributions
 cereal+0.129
lf+0.124
James+0.124
 Amy+0.121
bes+0.121
Neg logit contributions
 sur-0.089
 Pet-0.088
 tort-0.084
 ko-0.081
 rob-0.081
Token icy
Token ID30284
Feature activation-3.09e+00

Pos logit contributions
James+0.066
 cereal+0.066
lf+0.063
mm+0.062
ocolate+0.062
Neg logit contributions
 judge-0.049
 across-0.049
 tort-0.049
 sur-0.047
 ko-0.046
Token blue
Token ID4171
Feature activation-2.17e+00

Pos logit contributions
James+0.077
Amy+0.072
 cereal+0.071
 Amy+0.071
Kate+0.069
Neg logit contributions
 tort-0.067
 sur-0.066
 judge-0.064
 across-0.064
ive-0.063
Token jacket
Token ID15224
Feature activation-2.83e+00

Pos logit contributions
James+0.051
 Amy+0.050
Amy+0.050
 cereal+0.050
lf+0.049
Neg logit contributions
 tort-0.050
ive-0.048
 judge-0.048
 sur-0.047
 across-0.047
Token as
Token ID355
Feature activation+3.36e-01

Pos logit contributions
 cereal+0.083
lf+0.078
 spaghetti+0.077
 Amy+0.072
James+0.072
Neg logit contributions
 Pet-0.060
 tort-0.057
 rob-0.056
 sur-0.056
 Kang-0.056
Token colorful
Token ID20239
Feature activation-1.45e+00

Pos logit contributions
 cereal+0.033
lf+0.032
James+0.029
 spaghetti+0.029
bes+0.029
Neg logit contributions
 Pet-0.037
 rob-0.035
 ko-0.035
 tort-0.035
 Kang-0.034
Token,
Token ID11
Feature activation-2.27e+00

Pos logit contributions
 cereal+0.034
lf+0.031
 spaghetti+0.030
 Amy+0.028
bes+0.028
Neg logit contributions
 Pet-0.040
 tort-0.038
 sur-0.038
 Kang-0.038
 rob-0.038
Token pol
Token ID755
Feature activation-8.64e-01

Pos logit contributions
 cereal+0.055
lf+0.055
ma+0.054
James+0.053
Kate+0.052
Neg logit contributions
 Pet-0.055
 ko-0.051
 tort-0.051
 rob-0.051
 tou-0.050
Tokenka
Token ID4914
Feature activation-1.37e+00

Pos logit contributions
 cereal+0.023
lf+0.022
 spaghetti+0.021
 Amy+0.020
James+0.020
Neg logit contributions
 Pet-0.020
 tort-0.019
 Kang-0.019
 sur-0.019
 rob-0.019
Token-
Token ID12
Feature activation-2.35e+00

Pos logit contributions
 cereal+0.035
lf+0.033
 spaghetti+0.032
bes+0.030
James+0.028
Neg logit contributions
 Pet-0.035
 tort-0.033
 sur-0.033
 Kang-0.033
 rob-0.032
Tokend
Token ID67
Feature activation-4.54e+00

Pos logit contributions
lf+0.059
 cereal+0.059
 spaghetti+0.055
 Amy+0.053
ma+0.053
Neg logit contributions
 Pet-0.059
 rob-0.056
mur-0.054
 tort-0.054
 ko-0.053
Tokenotted
Token ID8426
Feature activation-3.06e-01

Pos logit contributions
 cereal+0.114
lf+0.107
 spaghetti+0.103
James+0.098
 Amy+0.097
Neg logit contributions
 Pet-0.116
 rob-0.109
 tort-0.109
 sur-0.108
 Kang-0.107
Token sheep
Token ID15900
Feature activation+3.20e-01

Pos logit contributions
 cereal+0.007
lf+0.006
 spaghetti+0.006
 rug+0.006
bes+0.006
Neg logit contributions
 sur-0.009
 Kang-0.008
 Pet-0.008
 gol-0.008
 encountered-0.008
Token all
Token ID477
Feature activation-1.95e+00

Pos logit contributions
 sur+0.008
 Pet+0.008
 rob+0.007
 Kang+0.007
 tort+0.007
Neg logit contributions
 cereal-0.009
lf-0.008
 spaghetti-0.008
bes-0.007
 rug-0.007
Token over
Token ID625
Feature activation-6.79e-01

Pos logit contributions
 cereal+0.063
lf+0.059
 spaghetti+0.059
bes+0.056
James+0.056
Neg logit contributions
 Pet-0.036
 tort-0.033
 rob-0.033
 sur-0.032
 Kang-0.032
Token it
Token ID340
Feature activation-1.79e+00

Pos logit contributions
 cereal+0.021
lf+0.021
bes+0.020
 spaghetti+0.020
James+0.019
Neg logit contributions
 Pet-0.012
 sur-0.011
 rob-0.011
 tort-0.011
 Kang-0.011
Token It
Token ID632
Feature activation-7.45e-01

Pos logit contributions
 Pet+0.002
 Kang+0.002
 tort+0.002
 rob+0.002
 sur+0.002
Neg logit contributions
 cereal-0.002
lf-0.002
 spaghetti-0.002
ma-0.002
bes-0.002
Token's
Token ID338
Feature activation-5.35e-01

Pos logit contributions
 cereal+0.017
lf+0.015
 spaghetti+0.015
 Amy+0.015
 rug+0.015
Neg logit contributions
 gol-0.020
 Pet-0.020
 tort-0.019
 sur-0.019
 rob-0.018
Token like
Token ID588
Feature activation-5.11e-01

Pos logit contributions
 cereal+0.013
lf+0.012
 spaghetti+0.012
 Amy+0.011
ma+0.011
Neg logit contributions
 Pet-0.014
 tort-0.014
 sur-0.013
 rob-0.013
 Kang-0.013
Token a
Token ID257
Feature activation-2.71e-01

Pos logit contributions
 cereal+0.012
lf+0.011
 spaghetti+0.011
bes+0.010
James+0.010
Neg logit contributions
 Pet-0.014
 rob-0.013
 tort-0.013
 Kang-0.013
 sur-0.013
Token big
Token ID1263
Feature activation-1.50e+00

Pos logit contributions
 cereal+0.007
lf+0.006
 spaghetti+0.006
 Amy+0.006
 rug+0.006
Neg logit contributions
 Pet-0.007
 Kang-0.007
 tort-0.007
 sur-0.007
ey-0.006
Token merry
Token ID44570
Feature activation-4.54e+00

Pos logit contributions
 cereal+0.042
lf+0.039
 spaghetti+0.038
James+0.036
bes+0.036
Neg logit contributions
 Pet-0.034
 tort-0.033
 rob-0.032
 sur-0.032
 Kang-0.032
Token-
Token ID12
Feature activation-3.28e+00

Pos logit contributions
 cereal+0.147
 spaghetti+0.137
lf+0.133
James+0.132
Kate+0.129
Neg logit contributions
 Pet-0.080
 tort-0.077
 sur-0.076
 rob-0.076
 Kang-0.074
Tokengo
Token ID2188
Feature activation-1.88e+00

Pos logit contributions
 cereal+0.096
lf+0.089
 spaghetti+0.089
James+0.082
bes+0.082
Neg logit contributions
 Pet-0.070
 rob-0.070
 sur-0.066
 tou-0.066
 Kang-0.065
Token-
Token ID12
Feature activation-2.94e+00

Pos logit contributions
 cereal+0.058
lf+0.055
 spaghetti+0.053
ma+0.051
 Amy+0.051
Neg logit contributions
 Pet-0.038
 tort-0.036
 sur-0.035
 Kang-0.035
 rob-0.035
Tokenround
Token ID744
Feature activation-1.47e+00

Pos logit contributions
 cereal+0.092
lf+0.086
 spaghetti+0.086
James+0.080
 Amy+0.080
Neg logit contributions
 Pet-0.057
 rob-0.054
 tort-0.054
 sur-0.053
 Kang-0.053
Token!"
Token ID2474
Feature activation+4.69e-01

Pos logit contributions
 cereal+0.039
lf+0.036
 spaghetti+0.035
 Amy+0.033
ma+0.032
Neg logit contributions
 Pet-0.036
 tort-0.035
 rob-0.035
 sur-0.034
 Kang-0.034
Token sad
Token ID6507
Feature activation+1.70e-02

Pos logit contributions
 cereal+0.025
lf+0.023
 spaghetti+0.023
 Amy+0.023
ma+0.021
Neg logit contributions
 Pet-0.023
 Kang-0.023
 rob-0.022
ive-0.022
 sur-0.022
Token.
Token ID13
Feature activation-5.93e-01

Pos logit contributions
 Pet+0.000
 rob+0.000
 Kang+0.000
 tort+0.000
 sur+0.000
Neg logit contributions
 cereal-0.000
 spaghetti-0.000
lf-0.000
 Amy-0.000
 Kate-0.000
Token She
Token ID1375
Feature activation-5.37e-01

Pos logit contributions
 cereal+0.016
lf+0.015
 spaghetti+0.015
James+0.014
bes+0.014
Neg logit contributions
 Pet-0.014
 tort-0.013
 Kang-0.013
 sur-0.013
 rob-0.013
Token threw
Token ID9617
Feature activation-7.17e-01

Pos logit contributions
 cereal+0.016
lf+0.015
 spaghetti+0.015
 Amy+0.014
James+0.014
Neg logit contributions
 Pet-0.011
 tort-0.011
 Kang-0.010
 rob-0.010
 sur-0.010
Token a
Token ID257
Feature activation-1.40e+00

Pos logit contributions
 cereal+0.022
lf+0.022
 spaghetti+0.021
bes+0.020
James+0.020
Neg logit contributions
 Pet-0.014
 tort-0.013
 rob-0.012
 Kang-0.012
 sur-0.012
Token tant
Token ID24246
Feature activation-4.54e+00

Pos logit contributions
 cereal+0.035
lf+0.033
 spaghetti+0.032
bes+0.031
James+0.031
Neg logit contributions
 Pet-0.034
 tort-0.034
 rob-0.033
 sur-0.033
 ko-0.032
Tokenrum
Token ID6582
Feature activation-9.19e-01

Pos logit contributions
 cereal+0.101
 spaghetti+0.091
 microwave+0.081
lf+0.080
James+0.078
Neg logit contributions
 rob-0.129
 Pet-0.127
ey-0.127
 tort-0.126
 ko-0.125
Token and
Token ID290
Feature activation-3.16e-01

Pos logit contributions
 cereal+0.028
lf+0.026
 spaghetti+0.026
James+0.024
bes+0.024
Neg logit contributions
 Pet-0.019
 tort-0.017
 Kang-0.017
 sur-0.017
 rob-0.017
Token cried
Token ID16896
Feature activation-2.25e-01

Pos logit contributions
 cereal+0.009
lf+0.008
 spaghetti+0.008
 Amy+0.007
James+0.007
Neg logit contributions
 Pet-0.007
 tort-0.007
 Kang-0.007
 rob-0.007
 sur-0.007
Token loudly
Token ID23112
Feature activation+5.12e-01

Pos logit contributions
 cereal+0.007
lf+0.006
 spaghetti+0.006
 Amy+0.006
James+0.006
Neg logit contributions
 Pet-0.005
 tort-0.005
 sur-0.005
 rob-0.005
 Kang-0.005
Token.
Token ID13
Feature activation-1.65e-01

Pos logit contributions
 rob+0.012
 Pet+0.012
 Kang+0.012
 sur+0.012
 tou+0.012
Neg logit contributions
 cereal-0.014
 spaghetti-0.013
lf-0.013
 Amy-0.012
bes-0.012
Token the
Token ID262
Feature activation+5.40e-01

Pos logit contributions
 Pet+0.005
 rob+0.005
 sur+0.004
 tort+0.004
 tou+0.004
Neg logit contributions
 cereal-0.010
lf-0.009
James-0.009
 spaghetti-0.009
bes-0.009
Token sees
Token ID7224
Feature activation-1.42e+00

Pos logit contributions
 Pet+0.012
 tort+0.011
 rob+0.011
 sur+0.011
 Kang+0.011
Neg logit contributions
 cereal-0.015
lf-0.014
 spaghetti-0.014
James-0.013
 Amy-0.013
Tokenaw
Token ID707
Feature activation+1.04e+00

Pos logit contributions
 cereal+0.045
lf+0.042
 spaghetti+0.041
James+0.040
bes+0.039
Neg logit contributions
 Pet-0.027
 rob-0.025
 sur-0.025
 tort-0.024
 tou-0.024
Token,
Token ID11
Feature activation-4.41e-01

Pos logit contributions
 tort+0.027
 Pet+0.026
 sur+0.026
 Kang+0.025
 rob+0.025
Neg logit contributions
 cereal-0.026
lf-0.024
 spaghetti-0.024
 Amy-0.023
ma-0.023
Token the
Token ID262
Feature activation+1.90e-01

Pos logit contributions
 cereal+0.014
lf+0.013
James+0.013
 spaghetti+0.013
Kate+0.013
Neg logit contributions
 Pet-0.008
 rob-0.008
 sur-0.007
 tou-0.007
 tort-0.007
Token merry
Token ID44570
Feature activation-4.51e+00

Pos logit contributions
 Pet+0.004
 tort+0.004
 Kang+0.004
 sur+0.004
 rob+0.004
Neg logit contributions
 cereal-0.005
lf-0.005
 spaghetti-0.005
 Amy-0.004
James-0.004
Token-
Token ID12
Feature activation-2.69e+00

Pos logit contributions
 cereal+0.139
lf+0.129
 spaghetti+0.128
James+0.122
 Amy+0.121
Neg logit contributions
 Pet-0.090
 tort-0.085
 sur-0.085
 rob-0.084
 Kang-0.084
Tokengo
Token ID2188
Feature activation-1.35e+00

Pos logit contributions
 cereal+0.082
 spaghetti+0.078
lf+0.076
 Deal+0.072
James+0.072
Neg logit contributions
 rob-0.055
 Pet-0.053
 sur-0.052
 tou-0.052
ive-0.051
Token-
Token ID12
Feature activation-2.67e+00

Pos logit contributions
 cereal+0.041
lf+0.039
 spaghetti+0.038
ma+0.036
 Amy+0.036
Neg logit contributions
 Pet-0.027
 tort-0.026
 Kang-0.025
 sur-0.025
 rob-0.025
Tokenround
Token ID744
Feature activation-9.38e-01

Pos logit contributions
 cereal+0.085
lf+0.080
 spaghetti+0.079
James+0.075
 Amy+0.074
Neg logit contributions
 Pet-0.051
 rob-0.048
 tort-0.047
 sur-0.047
 Kang-0.047
Token.
Token ID13
Feature activation-1.45e-01

Pos logit contributions
 cereal+0.025
lf+0.023
 spaghetti+0.022
 Amy+0.021
ma+0.021
Neg logit contributions
 Pet-0.023
 tort-0.022
 sur-0.022
 Kang-0.022
 rob-0.022
Token park
Token ID3952
Feature activation-7.09e-01

Pos logit contributions
 Pet+0.010
 tort+0.009
 rob+0.009
 sur+0.009
 Kang+0.009
Neg logit contributions
 cereal-0.011
lf-0.011
bes-0.010
 spaghetti-0.010
James-0.010
Token.
Token ID13
Feature activation+6.34e-01

Pos logit contributions
 cereal+0.021
lf+0.020
 spaghetti+0.020
James+0.019
bes+0.018
Neg logit contributions
 Pet-0.015
 tort-0.014
 rob-0.014
 Kang-0.014
 sur-0.014
Token Sara
Token ID24799
Feature activation+7.62e-01

Pos logit contributions
 tort+0.018
 sur+0.018
 rob+0.018
 gol+0.018
 tou+0.018
Neg logit contributions
 cereal-0.014
 spaghetti-0.012
 Amy-0.012
lf-0.011
 Kate-0.011
Token is
Token ID318
Feature activation-2.51e-02

Pos logit contributions
 Pet+0.021
 tort+0.020
 rob+0.020
 Kang+0.020
 sur+0.020
Neg logit contributions
 cereal-0.018
lf-0.016
 spaghetti-0.016
 Amy-0.015
James-0.015
Token the
Token ID262
Feature activation+1.13e+00

Pos logit contributions
 Amy+0.000
 cereal+0.000
 Kate+0.000
lf+0.000
 spaghetti+0.000
Neg logit contributions
ive-0.001
 Kang-0.001
 tort-0.001
 Pet-0.001
mur-0.001
Token seeker
Token ID45993
Feature activation-4.51e+00

Pos logit contributions
 Pet+0.026
 tort+0.025
 sur+0.025
 rob+0.025
 Kang+0.025
Neg logit contributions
 cereal-0.031
lf-0.028
 spaghetti-0.028
 Amy-0.027
James-0.026
Token and
Token ID290
Feature activation-3.56e-01

Pos logit contributions
 cereal+0.138
 spaghetti+0.125
lf+0.123
James+0.123
bes+0.117
Neg logit contributions
 Pet-0.088
 tort-0.082
 sur-0.081
 rob-0.081
 Kang-0.080
Token Ben
Token ID3932
Feature activation+4.07e-01

Pos logit contributions
 cereal+0.010
lf+0.009
bes+0.009
 spaghetti+0.008
James+0.008
Neg logit contributions
 Pet-0.009
 sur-0.008
 Kang-0.008
 ko-0.008
 rob-0.008
Token is
Token ID318
Feature activation+1.62e-01

Pos logit contributions
 Pet+0.011
 tort+0.010
 Kang+0.010
 rob+0.010
 sur+0.010
Neg logit contributions
 cereal-0.010
lf-0.009
 spaghetti-0.009
 Amy-0.008
James-0.008
Token the
Token ID262
Feature activation+1.11e+00

Pos logit contributions
 tort+0.004
aff+0.004
 tou+0.004
 rob+0.004
 Kang+0.004
Neg logit contributions
 Amy-0.004
 cereal-0.004
 Kate-0.004
lf-0.004
Kate-0.003
Token h
Token ID289
Feature activation-9.72e-01

Pos logit contributions
 Pet+0.027
 tort+0.026
 sur+0.026
 Kang+0.026
 rob+0.026
Neg logit contributions
 cereal-0.029
lf-0.027
 spaghetti-0.026
James-0.025
 Amy-0.024
Token park
Token ID3952
Feature activation-8.25e-01

Pos logit contributions
 Pet+0.009
 tort+0.008
 rob+0.008
 tou+0.008
 ko+0.008
Neg logit contributions
 cereal-0.010
lf-0.010
bes-0.009
 spaghetti-0.009
James-0.009
Token.
Token ID13
Feature activation+5.15e-01

Pos logit contributions
 cereal+0.025
lf+0.024
 spaghetti+0.023
James+0.022
bes+0.022
Neg logit contributions
 Pet-0.017
 tort-0.016
 rob-0.015
 Kang-0.015
 sur-0.015
Token Lily
Token ID20037
Feature activation+2.88e-01

Pos logit contributions
 Pet+0.014
 tort+0.014
 Kang+0.014
 rob+0.013
 ko+0.013
Neg logit contributions
 cereal-0.012
lf-0.011
 spaghetti-0.011
ma-0.010
bes-0.010
Token was
Token ID373
Feature activation-5.00e-01

Pos logit contributions
 Pet+0.007
 Kang+0.007
 tort+0.007
 rob+0.007
 sur+0.007
Neg logit contributions
 cereal-0.007
lf-0.007
 spaghetti-0.007
 Amy-0.006
James-0.006
Token the
Token ID262
Feature activation+1.02e+00

Pos logit contributions
 cereal+0.011
 Amy+0.010
lf+0.010
 Kate+0.009
 spaghetti+0.009
Neg logit contributions
 tort-0.014
 Pet-0.014
 Kang-0.014
ive-0.014
 sur-0.014
Token seeker
Token ID45993
Feature activation-4.50e+00

Pos logit contributions
 Pet+0.024
 tort+0.023
 rob+0.023
 sur+0.023
 Kang+0.023
Neg logit contributions
 cereal-0.027
lf-0.025
 spaghetti-0.025
 Amy-0.024
James-0.023
Token and
Token ID290
Feature activation-4.55e-01

Pos logit contributions
 cereal+0.138
 spaghetti+0.125
James+0.124
lf+0.121
endo+0.120
Neg logit contributions
 Pet-0.084
 tort-0.080
 rob-0.078
ive-0.077
 sur-0.077
Token Max
Token ID5436
Feature activation-4.78e-02

Pos logit contributions
 cereal+0.010
lf+0.009
bes+0.009
lights+0.009
 spaghetti+0.009
Neg logit contributions
 Pet-0.013
 tort-0.012
 ko-0.012
 sur-0.012
 Kang-0.011
Token was
Token ID373
Feature activation-2.14e-01

Pos logit contributions
 cereal+0.001
lf+0.001
 spaghetti+0.001
 Amy+0.001
bes+0.001
Neg logit contributions
 Pet-0.001
 Kang-0.001
 rob-0.001
 tort-0.001
 sur-0.001
Token the
Token ID262
Feature activation+9.92e-01

Pos logit contributions
 Amy+0.005
 cereal+0.005
lf+0.005
 Kate+0.005
James+0.005
Neg logit contributions
 tort-0.006
ive-0.006
 Kang-0.005
 tou-0.005
 rob-0.005
Token h
Token ID289
Feature activation-9.96e-01

Pos logit contributions
 Pet+0.025
 tort+0.024
 rob+0.024
 sur+0.024
 Kang+0.024
Neg logit contributions
 cereal-0.025
lf-0.023
 spaghetti-0.023
 Amy-0.021
James-0.021
Token cuc
Token ID38421
Feature activation+6.38e+00

Pos logit contributions
 Pet+0.003
 Kang+0.002
 sur+0.002
ive+0.002
 tort+0.002
Neg logit contributions
 cereal-0.003
 spaghetti-0.002
lf-0.002
 Amy-0.002
ma-0.002
Tokenumber
Token ID4494
Feature activation-6.50e-04

Pos logit contributions
 Pet+0.162
 ko+0.154
 tou+0.150
 Kang+0.150
 rob+0.149
Neg logit contributions
 cereal-0.167
 spaghetti-0.148
ma-0.141
lf-0.140
 microwave-0.135
Token.
Token ID13
Feature activation+4.94e-01

Pos logit contributions
 cereal+0.000
lf+0.000
 spaghetti+0.000
 Amy+0.000
ma+0.000
Neg logit contributions
 Pet-0.000
 sur-0.000
 Kang-0.000
 tort-0.000
 ko-0.000
Token Mama
Token ID40084
Feature activation+1.07e+00

Pos logit contributions
 Pet+0.012
 tort+0.012
 Kang+0.012
 rob+0.012
 sur+0.011
Neg logit contributions
 cereal-0.013
lf-0.012
 spaghetti-0.012
James-0.011
 Amy-0.011
Token puts
Token ID7584
Feature activation+6.81e-01

Pos logit contributions
 Pet+0.027
 sur+0.027
 Kang+0.027
 tort+0.027
 ko+0.026
Neg logit contributions
 cereal-0.027
 spaghetti-0.025
lf-0.024
ma-0.023
 Amy-0.023
Token cuc
Token ID38421
Feature activation+6.59e+00

Pos logit contributions
 Pet+0.015
 rob+0.014
 tort+0.014
 Kang+0.013
 tou+0.013
Neg logit contributions
 cereal-0.019
lf-0.019
James-0.018
 spaghetti-0.018
Kate-0.017
Tokenumbers
Token ID17024
Feature activation+2.36e+00

Pos logit contributions
 Pet+0.171
 ko+0.164
 tort+0.162
 Kang+0.161
 rob+0.161
Neg logit contributions
 cereal-0.168
 spaghetti-0.149
ma-0.143
lf-0.141
bes-0.134
Token in
Token ID287
Feature activation-8.15e-01

Pos logit contributions
 Pet+0.043
 rob+0.040
 tort+0.040
 Kang+0.039
 sur+0.039
Neg logit contributions
 cereal-0.074
lf-0.073
 spaghetti-0.069
James-0.068
Kate-0.066
Token our
Token ID674
Feature activation-3.27e-01

Pos logit contributions
lf+0.026
 cereal+0.026
James+0.025
bes+0.024
Kate+0.024
Neg logit contributions
 Pet-0.014
 rob-0.014
 tort-0.013
 Kang-0.012
 ko-0.012
Token salad
Token ID20698
Feature activation+1.69e+00

Pos logit contributions
 cereal+0.007
lf+0.006
James+0.006
 spaghetti+0.006
Kate+0.006
Neg logit contributions
 Pet-0.010
 tort-0.009
 rob-0.009
 Kang-0.009
 sur-0.009
Token.
Token ID13
Feature activation+5.76e-01

Pos logit contributions
 Pet+0.031
 tort+0.030
 rob+0.029
ive+0.028
 Kang+0.028
Neg logit contributions
 cereal-0.051
lf-0.050
James-0.049
Kate-0.047
 spaghetti-0.047
Token gift
Token ID6979
Feature activation-6.52e-01

Pos logit contributions
 cereal+0.007
lf+0.006
 spaghetti+0.006
 Amy+0.006
James+0.006
Neg logit contributions
 Pet-0.007
 Kang-0.007
 tort-0.007
 sur-0.007
 rob-0.006
Token from
Token ID422
Feature activation-9.63e-01

Pos logit contributions
 cereal+0.022
lf+0.021
 spaghetti+0.021
James+0.020
bes+0.020
Neg logit contributions
 Pet-0.010
 tort-0.010
 rob-0.010
 Kang-0.009
 sur-0.009
Token God
Token ID1793
Feature activation+1.94e+00

Pos logit contributions
 cereal+0.024
lf+0.023
bes+0.022
 spaghetti+0.021
James+0.021
Neg logit contributions
 Pet-0.024
 rob-0.023
 tort-0.022
 Kang-0.022
 ko-0.022
Token and
Token ID290
Feature activation+3.88e-01

Pos logit contributions
 Pet+0.047
 tort+0.046
 sur+0.046
 Kang+0.045
 rob+0.045
Neg logit contributions
 cereal-0.051
 spaghetti-0.047
lf-0.047
ma-0.044
 Amy-0.044
Token he
Token ID339
Feature activation+6.33e-01

Pos logit contributions
 Pet+0.009
 tort+0.008
 rob+0.008
 sur+0.008
 Kang+0.008
Neg logit contributions
 cereal-0.011
lf-0.010
 spaghetti-0.010
James-0.009
bes-0.009
Token pr
Token ID778
Feature activation+4.86e+00

Pos logit contributions
 Pet+0.015
 tort+0.015
 Kang+0.015
 sur+0.015
 rob+0.014
Neg logit contributions
 cereal-0.017
lf-0.015
 spaghetti-0.015
 Amy-0.014
ma-0.014
Tokenays
Token ID592
Feature activation-4.98e-01

Pos logit contributions
 Pet+0.121
 tort+0.119
 sur+0.117
 rob+0.117
 Kang+0.115
Neg logit contributions
 cereal-0.126
 spaghetti-0.113
lf-0.112
ma-0.106
bes-0.104
Token every
Token ID790
Feature activation-6.73e-01

Pos logit contributions
 cereal+0.014
lf+0.014
James+0.013
bes+0.013
 spaghetti+0.013
Neg logit contributions
 Pet-0.010
 rob-0.010
 tort-0.010
 sur-0.009
 tou-0.009
Token day
Token ID1110
Feature activation-3.40e-01

Pos logit contributions
 cereal+0.018
lf+0.018
bes+0.017
James+0.017
 spaghetti+0.017
Neg logit contributions
 Pet-0.014
 rob-0.014
 tort-0.014
 sur-0.014
 tou-0.014
Token to
Token ID284
Feature activation-5.34e-01

Pos logit contributions
 cereal+0.010
lf+0.009
 spaghetti+0.009
 Amy+0.009
James+0.009
Neg logit contributions
 Pet-0.007
 tort-0.007
 rob-0.007
 sur-0.007
 Kang-0.007
Token thank
Token ID5875
Feature activation+8.88e-01

Pos logit contributions
 cereal+0.017
lf+0.016
 spaghetti+0.016
James+0.015
 Amy+0.015
Neg logit contributions
 Pet-0.010
 tort-0.009
 rob-0.009
 sur-0.009
 Kang-0.009
Token.
Token ID13
Feature activation-7.26e-01

Pos logit contributions
 Pet+0.041
 Kang+0.040
 sur+0.040
 tort+0.039
mur+0.038
Neg logit contributions
 cereal-0.051
 spaghetti-0.046
lf-0.046
 Amy-0.044
ma-0.044
Token He
Token ID679
Feature activation-2.74e-01

Pos logit contributions
 cereal+0.019
lf+0.018
 spaghetti+0.018
bes+0.017
ma+0.016
Neg logit contributions
 Pet-0.018
 tort-0.017
 Kang-0.017
 rob-0.016
 sur-0.016
Token felt
Token ID2936
Feature activation+4.88e-01

Pos logit contributions
 cereal+0.007
lf+0.007
 spaghetti+0.007
ma+0.006
 Amy+0.006
Neg logit contributions
 Pet-0.006
 Kang-0.006
 tort-0.006
 sur-0.006
 rob-0.006
Token bigger
Token ID5749
Feature activation+9.10e-01

Pos logit contributions
 Kang+0.013
mur+0.013
 tort+0.013
 Pet+0.013
ive+0.013
Neg logit contributions
 cereal-0.011
 spaghetti-0.010
 Amy-0.010
lf-0.009
James-0.009
Token and
Token ID290
Feature activation-1.09e+00

Pos logit contributions
 Pet+0.022
 tort+0.021
 rob+0.021
 Kang+0.021
 sur+0.021
Neg logit contributions
 cereal-0.024
lf-0.022
 spaghetti-0.022
 Amy-0.021
ma-0.021
Token bra
Token ID8290
Feature activation+4.99e+00

Pos logit contributions
 cereal+0.031
lf+0.029
 spaghetti+0.028
 Amy+0.027
James+0.026
Neg logit contributions
 Pet-0.024
 Kang-0.023
 tort-0.023
 rob-0.023
 sur-0.023
Tokenver
Token ID332
Feature activation-3.27e-01

Pos logit contributions
 tou+0.140
 rob+0.137
 Pet+0.134
 Kang+0.134
 tort+0.134
Neg logit contributions
lf-0.094
ma-0.092
James-0.091
bes-0.090
 cereal-0.089
Token than
Token ID621
Feature activation+4.57e-01

Pos logit contributions
lf+0.007
ma+0.007
 cereal+0.007
James+0.007
 Amy+0.007
Neg logit contributions
 tou-0.008
 rob-0.008
ive-0.008
 sur-0.008
mur-0.008
Token before
Token ID878
Feature activation+4.29e-01

Pos logit contributions
 Pet+0.010
 Kang+0.010
 tort+0.010
 rob+0.010
 sur+0.010
Neg logit contributions
 cereal-0.013
lf-0.012
 spaghetti-0.012
 Amy-0.011
James-0.011
Token.
Token ID13
Feature activation-7.96e-01

Pos logit contributions
 Pet+0.009
 tort+0.009
 rob+0.009
 sur+0.009
 Kang+0.009
Neg logit contributions
 cereal-0.012
lf-0.012
 spaghetti-0.011
 Amy-0.011
James-0.011
Token He
Token ID679
Feature activation-1.60e-01

Pos logit contributions
 cereal+0.021
lf+0.020
 spaghetti+0.019
bes+0.018
ma+0.018
Neg logit contributions
 Pet-0.020
 tort-0.019
 Kang-0.019
 rob-0.018
 sur-0.018
Token Tweet
Token ID18752
Feature activation+7.11e-01

Pos logit contributions
 sur+0.010
 Pet+0.009
mur+0.009
 Kang+0.009
 tort+0.009
Neg logit contributions
 cereal-0.010
 Amy-0.009
lf-0.009
 spaghetti-0.009
 Kate-0.008
Token felt
Token ID2936
Feature activation+7.82e-02

Pos logit contributions
 Pet+0.018
mur+0.018
 Kang+0.018
 sur+0.018
 tort+0.018
Neg logit contributions
lf-0.017
 cereal-0.017
ma-0.016
 spaghetti-0.015
 Amy-0.014
Token even
Token ID772
Feature activation+6.44e-01

Pos logit contributions
 Kang+0.002
mur+0.002
aff+0.002
 perspective+0.002
 tort+0.002
Neg logit contributions
 cereal-0.002
 spaghetti-0.001
 Amy-0.001
ma-0.001
James-0.001
Token stronger
Token ID7387
Feature activation+1.09e+00

Pos logit contributions
 Pet+0.016
 Kang+0.016
mur+0.016
 tort+0.015
 rob+0.015
Neg logit contributions
 cereal-0.016
 spaghetti-0.015
 Amy-0.014
lf-0.014
ma-0.014
Token and
Token ID290
Feature activation-7.08e-01

Pos logit contributions
 rob+0.028
 Pet+0.028
 Kang+0.028
 ko+0.027
 tou+0.027
Neg logit contributions
 cereal-0.025
 spaghetti-0.023
ma-0.023
lf-0.022
bes-0.022
Token bra
Token ID8290
Feature activation+4.93e+00

Pos logit contributions
 cereal+0.019
lf+0.017
 spaghetti+0.017
 Amy+0.017
ma+0.016
Neg logit contributions
 Pet-0.017
 Kang-0.016
mur-0.016
 tort-0.016
 ko-0.016
Tokenver
Token ID332
Feature activation-3.21e-01

Pos logit contributions
 tou+0.142
 rob+0.141
 tort+0.139
 Pet+0.138
 sur+0.137
Neg logit contributions
bes-0.082
lf-0.080
kin-0.079
 hips-0.079
 cereal-0.079
Token than
Token ID621
Feature activation+6.04e-01

Pos logit contributions
ma+0.007
 cereal+0.006
lf+0.006
 Amy+0.006
bes+0.006
Neg logit contributions
mur-0.008
ive-0.008
 sur-0.008
 tou-0.008
 Kang-0.008
Token before
Token ID878
Feature activation+5.10e-01

Pos logit contributions
 Pet+0.013
 tort+0.012
 Kang+0.012
 sur+0.012
 rob+0.012
Neg logit contributions
 cereal-0.018
lf-0.016
 spaghetti-0.016
 Amy-0.015
James-0.015
Token.
Token ID13
Feature activation-4.78e-01

Pos logit contributions
 Pet+0.011
 tort+0.011
 rob+0.010
 sur+0.010
 Kang+0.010
Neg logit contributions
 cereal-0.015
lf-0.014
 spaghetti-0.014
 Amy-0.013
James-0.013
Token<|endoftext|>
Token ID50256
Feature activation-6.01e-01

Pos logit contributions
 cereal+0.012
lf+0.011
 spaghetti+0.011
 Amy+0.010
James+0.010
Neg logit contributions
 Pet-0.012
 tort-0.012
 sur-0.012
 rob-0.011
 Kang-0.011
Token very
Token ID845
Feature activation-2.38e-01

Pos logit contributions
 cereal+0.012
lf+0.011
 spaghetti+0.011
 Amy+0.010
James+0.010
Neg logit contributions
 Pet-0.011
 tort-0.011
 sur-0.011
 Kang-0.011
 rob-0.011
Token naughty
Token ID45128
Feature activation+2.77e-02

Pos logit contributions
 cereal+0.007
lf+0.006
 spaghetti+0.006
James+0.006
bes+0.006
Neg logit contributions
 Pet-0.006
 tort-0.005
 rob-0.005
 sur-0.005
 Kang-0.005
Token and
Token ID290
Feature activation-6.25e-01

Pos logit contributions
 Pet+0.001
 tort+0.001
 rob+0.001
 sur+0.001
 Kang+0.001
Neg logit contributions
 cereal-0.001
lf-0.001
 spaghetti-0.001
 Amy-0.001
ma-0.001
Token always
Token ID1464
Feature activation-3.12e-02

Pos logit contributions
 cereal+0.018
lf+0.016
 spaghetti+0.016
bes+0.016
James+0.016
Neg logit contributions
 Pet-0.014
 tort-0.013
 rob-0.013
 sur-0.013
 tou-0.013
Token mis
Token ID2984
Feature activation+5.06e+00

Pos logit contributions
 cereal+0.001
lf+0.001
 spaghetti+0.001
 Amy+0.001
James+0.001
Neg logit contributions
 Pet-0.001
 Kang-0.001
 tort-0.001
 sur-0.001
 rob-0.001
Tokenbeh
Token ID20709
Feature activation+4.67e+00

Pos logit contributions
 Pet+0.123
 sur+0.112
 ko+0.110
 tort+0.101
 tou+0.100
Neg logit contributions
ma-0.119
James-0.116
Kate-0.115
Amy-0.113
 cereal-0.112
Tokenaving
Token ID2703
Feature activation-6.52e-01

Pos logit contributions
 Pet+0.096
 rob+0.089
 tort+0.089
 sur+0.087
 Kang+0.087
Neg logit contributions
 cereal-0.137
 spaghetti-0.128
lf-0.126
James-0.123
Kate-0.122
Token.
Token ID13
Feature activation+4.27e-01

Pos logit contributions
 cereal+0.016
lf+0.015
 spaghetti+0.015
 Amy+0.015
ma+0.014
Neg logit contributions
 sur-0.017
 Pet-0.017
 Kang-0.017
 rob-0.016
 tort-0.016
Token One
Token ID1881
Feature activation+3.44e-01

Pos logit contributions
 Pet+0.009
 Kang+0.008
 rob+0.008
 tort+0.008
 ko+0.008
Neg logit contributions
lf-0.013
 cereal-0.013
 spaghetti-0.012
bes-0.012
pron-0.012
Token day
Token ID1110
Feature activation+1.25e+00

Pos logit contributions
 Kang+0.006
 judge+0.006
 Pet+0.006
 tort+0.005
 rob+0.005
Neg logit contributions
bes-0.011
 cereal-0.011
lf-0.011
James-0.011
lights-0.010
Token she
Token ID673
Feature activation+5.30e-01

Pos logit contributions
 Kang+0.029
 sur+0.029
 tort+0.028
 Pet+0.028
 rob+0.028
Neg logit contributions
 cereal-0.036
lf-0.033
 Amy-0.033
 spaghetti-0.032
 Kate-0.030
Tokenons
Token ID684
Feature activation+1.04e+00

Pos logit contributions
 Pet+0.079
 tort+0.077
 rob+0.077
 sur+0.076
 Kang+0.076
Neg logit contributions
 cereal-0.024
lf-0.020
 spaghetti-0.020
 Amy-0.016
James-0.016
Token fly
Token ID6129
Feature activation+1.29e+00

Pos logit contributions
 Pet+0.023
 tort+0.022
 Kang+0.022
 sur+0.022
 ko+0.022
Neg logit contributions
 cereal-0.030
lf-0.029
 spaghetti-0.028
 Amy-0.026
ma-0.026
Token around
Token ID1088
Feature activation-7.09e-01

Pos logit contributions
 sur+0.032
 tort+0.032
 Pet+0.031
 Kang+0.031
 rob+0.030
Neg logit contributions
 cereal-0.033
 Amy-0.031
 spaghetti-0.031
lf-0.031
ma-0.028
Token in
Token ID287
Feature activation-9.53e-01

Pos logit contributions
 cereal+0.020
lf+0.018
 spaghetti+0.018
 Amy+0.017
James+0.017
Neg logit contributions
 Pet-0.016
 tort-0.016
 sur-0.015
 Kang-0.015
 rob-0.015
Token her
Token ID607
Feature activation-4.92e-02

Pos logit contributions
 cereal+0.031
lf+0.030
 spaghetti+0.029
bes+0.028
James+0.028
Neg logit contributions
 Pet-0.017
 rob-0.016
 tou-0.015
 ko-0.015
 Kang-0.015
Token garden
Token ID11376
Feature activation+1.64e+00

Pos logit contributions
 cereal+0.001
lf+0.001
 spaghetti+0.001
James+0.001
 Amy+0.001
Neg logit contributions
 Pet-0.001
 tort-0.001
 rob-0.001
 Kang-0.001
 sur-0.001
Token.
Token ID13
Feature activation+3.11e-01

Pos logit contributions
 Pet+0.040
 tort+0.039
 sur+0.038
 Kang+0.038
 rob+0.038
Neg logit contributions
 cereal-0.044
lf-0.041
 spaghetti-0.040
ma-0.037
 Amy-0.037
Token All
Token ID1439
Feature activation-1.29e-02

Pos logit contributions
 Pet+0.008
 tort+0.008
 sur+0.008
 rob+0.008
 Kang+0.008
Neg logit contributions
 cereal-0.008
 spaghetti-0.007
lf-0.007
 Amy-0.006
James-0.006
Token of
Token ID286
Feature activation+1.47e-01

Pos logit contributions
 cereal+0.000
 spaghetti+0.000
lf+0.000
 Amy+0.000
ma+0.000
Neg logit contributions
 Pet-0.000
 tort-0.000
 sur-0.000
 Kang-0.000
mur-0.000
Token a
Token ID257
Feature activation-1.31e-01

Pos logit contributions
 Pet+0.003
 rob+0.003
 ko+0.003
 tou+0.003
 mag+0.003
Neg logit contributions
lf-0.004
 cereal-0.004
bes-0.004
James-0.004
 Deal-0.004
Token sudden
Token ID4802
Feature activation+5.56e-02

Pos logit contributions
lf+0.003
 cereal+0.003
 spaghetti+0.003
 Amy+0.003
ma+0.003
Neg logit contributions
 Pet-0.004
severe-0.004
mur-0.003
ive-0.003
ey-0.003
Token a
Token ID257
Feature activation-7.13e-02

Pos logit contributions
 Pet+0.008
 rob+0.008
 tort+0.007
 sur+0.007
 Kang+0.007
Neg logit contributions
lf-0.009
 cereal-0.009
James-0.009
 spaghetti-0.008
Kate-0.008
Token bowl
Token ID9396
Feature activation-7.66e-01

Pos logit contributions
 cereal+0.002
lf+0.001
 spaghetti+0.001
James+0.001
bes+0.001
Neg logit contributions
 Pet-0.002
 tort-0.002
 rob-0.002
 sur-0.002
 Kang-0.002
Token of
Token ID286
Feature activation-4.18e-01

Pos logit contributions
 cereal+0.024
lf+0.024
James+0.023
bes+0.023
 spaghetti+0.022
Neg logit contributions
 Pet-0.013
 tort-0.012
 rob-0.012
 sur-0.012
ive-0.012
Token hot
Token ID3024
Feature activation-9.04e-01

Pos logit contributions
lf+0.009
bes+0.008
James+0.008
Kate+0.008
Amy+0.007
Neg logit contributions
 rob-0.012
 Pet-0.012
 tort-0.011
 sur-0.011
 mag-0.011
Token cooked
Token ID15847
Feature activation+1.74e+00

Pos logit contributions
lf+0.015
James+0.015
Kate+0.014
bes+0.014
 cereal+0.013
Neg logit contributions
 Pet-0.027
 sur-0.027
 tort-0.027
 rob-0.027
ive-0.027
Token rice
Token ID11464
Feature activation+2.56e+00

Pos logit contributions
 Pet+0.042
 tort+0.040
 rob+0.040
 sur+0.040
 Kang+0.040
Neg logit contributions
 cereal-0.042
lf-0.042
James-0.039
 spaghetti-0.038
bes-0.038
Token for
Token ID329
Feature activation+1.49e-01

Pos logit contributions
 Pet+0.059
 Kang+0.056
 rob+0.056
 tort+0.056
 sur+0.055
Neg logit contributions
 cereal-0.074
 spaghetti-0.069
lf-0.067
 Amy-0.061
ma-0.061
Token breakfast
Token ID12607
Feature activation+8.74e-01

Pos logit contributions
 Pet+0.005
 tort+0.004
 sur+0.004
 rob+0.004
 ko+0.004
Neg logit contributions
 cereal-0.002
lf-0.002
Kate-0.002
James-0.002
bes-0.002
Token.
Token ID13
Feature activation+2.41e-01

Pos logit contributions
 Pet+0.017
 tort+0.016
 sur+0.016
 Kang+0.015
 rob+0.015
Neg logit contributions
 cereal-0.026
lf-0.025
 spaghetti-0.024
James-0.024
Kate-0.023
Token
Token ID220
Feature activation+5.78e-01

Pos logit contributions
 Pet+0.006
 tort+0.006
 sur+0.006
 Kang+0.006
 rob+0.005
Neg logit contributions
 cereal-0.006
lf-0.006
 spaghetti-0.005
bes-0.005
ma-0.005
Token
Token ID198
Feature activation+1.33e-01

Pos logit contributions
 Pet+0.009
 Kang+0.008
 across+0.008
ey+0.007
 sur+0.007
Neg logit contributions
 cereal-0.020
lf-0.020
 spaghetti-0.019
bes-0.018
 candles-0.017
TokenLook
Token ID8567
Feature activation-2.61e+00

Pos logit contributions
 Pet+0.010
 tort+0.009
 sur+0.009
 Kang+0.009
 rob+0.009
Neg logit contributions
 cereal-0.010
 spaghetti-0.009
lf-0.009
James-0.008
 Amy-0.008
Token,
Token ID11
Feature activation+5.07e-01

Pos logit contributions
 cereal+0.078
 spaghetti+0.072
lf+0.071
 Amy+0.071
ma+0.069
Neg logit contributions
 tort-0.057
 Pet-0.055
 sur-0.054
 Kang-0.054
 rob-0.054
Token Sue
Token ID26089
Feature activation-2.42e-02

Pos logit contributions
 Pet+0.015
 tort+0.015
 Kang+0.014
 sur+0.014
 rob+0.014
Neg logit contributions
 cereal-0.011
 spaghetti-0.009
lf-0.009
 Amy-0.009
James-0.008
Token,
Token ID11
Feature activation+5.62e-01

Pos logit contributions
 cereal+0.001
lf+0.001
 spaghetti+0.001
ma+0.001
 Amy+0.001
Neg logit contributions
 Pet-0.001
 tort-0.001
 sur-0.000
 Kang-0.000
 rob-0.000
Token I
Token ID314
Feature activation+7.69e-01

Pos logit contributions
mur+0.017
 tort+0.017
 Pet+0.016
 encountered+0.016
 ko+0.016
Neg logit contributions
 cereal-0.012
 spaghetti-0.011
 Amy-0.010
 blocks-0.009
lf-0.009
Token have
Token ID423
Feature activation+1.84e+00

Pos logit contributions
 Pet+0.018
 tort+0.018
 Kang+0.018
 ko+0.018
mur+0.018
Neg logit contributions
 cereal-0.020
 spaghetti-0.018
 Amy-0.017
lf-0.017
ma-0.017
Token a
Token ID257
Feature activation+7.03e-01

Pos logit contributions
 tort+0.058
 Pet+0.058
 Kang+0.058
 sur+0.057
 ko+0.057
Neg logit contributions
 cereal-0.034
 spaghetti-0.031
lf-0.027
 Amy-0.027
James-0.024
Token rod
Token ID15299
Feature activation-1.37e-01

Pos logit contributions
 Pet+0.017
 Kang+0.016
ive+0.015
ting+0.015
 tort+0.015
Neg logit contributions
 cereal-0.019
 spaghetti-0.018
lf-0.017
 rug-0.016
 Amy-0.016
Token!"
Token ID2474
Feature activation+3.63e-01

Pos logit contributions
 cereal+0.004
lf+0.004
 spaghetti+0.004
 Amy+0.003
ma+0.003
Neg logit contributions
 Pet-0.003
 tort-0.003
 Kang-0.003
 sur-0.003
 rob-0.003
Token Tom
Token ID4186
Feature activation-7.61e-01

Pos logit contributions
 Pet+0.011
 rob+0.011
 sur+0.011
 Kang+0.010
 tort+0.010
Neg logit contributions
 cereal-0.007
lf-0.007
 spaghetti-0.006
bes-0.006
James-0.006
Token says
Token ID1139
Feature activation+4.24e-01

Pos logit contributions
 cereal+0.021
lf+0.019
 spaghetti+0.019
James+0.018
bes+0.018
Neg logit contributions
 Pet-0.018
 tort-0.017
 sur-0.017
 rob-0.017
 Kang-0.017
Token go
Token ID467
Feature activation+2.40e-01

Pos logit contributions
lves+0.013
 Pet+0.013
 Kang+0.012
 tort+0.012
 sur+0.012
Neg logit contributions
 cereal-0.015
lf-0.014
 Amy-0.014
 spaghetti-0.013
 rug-0.013
Token to
Token ID284
Feature activation+8.30e-01

Pos logit contributions
 Pet+0.005
 rob+0.005
 Kang+0.005
 tort+0.005
 gol+0.005
Neg logit contributions
 cereal-0.007
 spaghetti-0.007
ma-0.006
 Amy-0.006
lf-0.006
Token the
Token ID262
Feature activation+6.75e-01

Pos logit contributions
 Pet+0.021
ive+0.020
 tort+0.020
 Kang+0.020
 sur+0.020
Neg logit contributions
 cereal-0.022
 spaghetti-0.020
lf-0.019
 Amy-0.019
 couch-0.018
Token park
Token ID3952
Feature activation-4.74e-02

Pos logit contributions
 Pet+0.020
ub+0.020
ey+0.019
 sur+0.019
stru+0.019
Neg logit contributions
 cereal-0.015
 spaghetti-0.013
 couch-0.012
lf-0.012
 rug-0.012
Token and
Token ID290
Feature activation+6.21e-01

Pos logit contributions
 cereal+0.001
 spaghetti+0.001
lf+0.001
 Amy+0.001
ma+0.001
Neg logit contributions
 Pet-0.001
 tort-0.001
 rob-0.001
 Kang-0.001
 sur-0.001
Token sing
Token ID1702
Feature activation+1.44e+00

Pos logit contributions
 Kang+0.015
 Pet+0.014
 tort+0.014
king+0.014
ive+0.014
Neg logit contributions
 cereal-0.017
 Amy-0.016
 spaghetti-0.015
lf-0.015
 Kate-0.015
Token in
Token ID287
Feature activation-3.35e-01

Pos logit contributions
 Pet+0.040
 tort+0.039
 rob+0.039
 Kang+0.039
 sur+0.039
Neg logit contributions
 cereal-0.034
 spaghetti-0.031
lf-0.030
 Amy-0.029
James-0.028
Token the
Token ID262
Feature activation+6.83e-01

Pos logit contributions
 cereal+0.010
lf+0.010
 spaghetti+0.009
bes+0.009
James+0.009
Neg logit contributions
 Pet-0.007
 ko-0.006
 rob-0.006
 tort-0.006
 sur-0.006
Token sun
Token ID4252
Feature activation-9.14e-02

Pos logit contributions
 Pet+0.017
 tort+0.016
 sur+0.016
 Kang+0.016
 rob+0.016
Neg logit contributions
 cereal-0.018
 spaghetti-0.016
lf-0.016
 Amy-0.015
James-0.015
Token.
Token ID13
Feature activation+2.30e-01

Pos logit contributions
 cereal+0.002
lf+0.002
 spaghetti+0.002
 Amy+0.002
ma+0.002
Neg logit contributions
 Pet-0.002
 sur-0.002
 tort-0.002
 rob-0.002
 Kang-0.002
Token She
Token ID1375
Feature activation+2.32e-03

Pos logit contributions
 Pet+0.006
 tort+0.006
 sur+0.006
 rob+0.006
 Kang+0.006
Neg logit contributions
 cereal-0.006
lf-0.005
 spaghetti-0.005
 Amy-0.005
James-0.005
Token.
Token ID13
Feature activation+1.14e-01

Pos logit contributions
 cereal+0.025
lf+0.024
 spaghetti+0.023
James+0.023
bes+0.022
Neg logit contributions
 Pet-0.015
 tort-0.014
 rob-0.014
 sur-0.014
 Kang-0.014
Token "
Token ID366
Feature activation-8.71e-01

Pos logit contributions
 Pet+0.003
 tort+0.003
 rob+0.003
 tou+0.003
 Kang+0.003
Neg logit contributions
 cereal-0.003
 spaghetti-0.003
lf-0.002
 Amy-0.002
James-0.002
TokenSt
Token ID1273
Feature activation+2.34e+00

Pos logit contributions
 cereal+0.024
lf+0.023
 spaghetti+0.022
 Amy+0.021
ma+0.021
Neg logit contributions
 Pet-0.020
 tort-0.019
 rob-0.019
 sur-0.019
 Kang-0.018
Tokenir
Token ID343
Feature activation-6.96e-01

Pos logit contributions
 Pet+0.051
 tort+0.049
 sur+0.048
 rob+0.048
 Kang+0.048
Neg logit contributions
 cereal-0.067
lf-0.063
 spaghetti-0.062
 Amy-0.058
James-0.058
Token,
Token ID11
Feature activation-1.78e-01

Pos logit contributions
 cereal+0.016
lf+0.015
 spaghetti+0.015
 Amy+0.014
ma+0.013
Neg logit contributions
 Pet-0.019
 tort-0.018
 sur-0.018
 Kang-0.018
 rob-0.018
Token stir
Token ID11240
Feature activation+1.44e+00

Pos logit contributions
 cereal+0.005
lf+0.004
 spaghetti+0.004
James+0.004
bes+0.004
Neg logit contributions
 Pet-0.004
 rob-0.004
 tort-0.004
 sur-0.004
 Kang-0.004
Token,
Token ID11
Feature activation-2.59e-01

Pos logit contributions
 tort+0.034
 Pet+0.034
 sur+0.033
 tou+0.033
 gol+0.033
Neg logit contributions
 cereal-0.040
 spaghetti-0.037
lf-0.036
ma-0.034
 Amy-0.033
Token stir
Token ID11240
Feature activation+1.48e+00

Pos logit contributions
 cereal+0.007
lf+0.006
 spaghetti+0.006
James+0.006
bes+0.006
Neg logit contributions
 Pet-0.006
 rob-0.006
 tort-0.006
 Kang-0.006
 sur-0.006
Token!"
Token ID2474
Feature activation+9.82e-01

Pos logit contributions
 tort+0.036
 Pet+0.035
 sur+0.035
 tou+0.035
 gol+0.034
Neg logit contributions
 cereal-0.041
 spaghetti-0.037
lf-0.037
ma-0.034
 Amy-0.034
Token
Token ID198
Feature activation+3.00e-01

Pos logit contributions
 Pet+0.024
 tort+0.023
 sur+0.023
 Kang+0.023
 rob+0.023
Neg logit contributions
 cereal-0.026
lf-0.025
 spaghetti-0.023
James-0.022
bes-0.022
Token
Token ID198
Feature activation+5.46e-01

Pos logit contributions
 Pet+0.007
 Kang+0.006
 sur+0.006
 tort+0.006
 rob+0.006
Neg logit contributions
 cereal-0.009
lf-0.008
 spaghetti-0.008
bes-0.007
James-0.007
Token for
Token ID329
Feature activation-1.36e-02

Pos logit contributions
 Pet+0.015
 rob+0.015
 Kang+0.014
 tort+0.014
 sur+0.014
Neg logit contributions
 cereal-0.013
 spaghetti-0.013
 Amy-0.011
lf-0.011
 Kate-0.011
Token the
Token ID262
Feature activation+1.61e-01

Pos logit contributions
 cereal+0.000
lf+0.000
 spaghetti+0.000
bes+0.000
James+0.000
Neg logit contributions
 Pet-0.000
 tort-0.000
 rob-0.000
 Kang-0.000
 sur-0.000
Token puppy
Token ID26188
Feature activation-4.22e-01

Pos logit contributions
 tort+0.002
 Pet+0.002
 tou+0.002
 ko+0.002
 judge+0.002
Neg logit contributions
 cereal-0.006
lf-0.006
Kate-0.006
bes-0.006
 spaghetti-0.006
Token and
Token ID290
Feature activation-3.55e-01

Pos logit contributions
 cereal+0.010
 spaghetti+0.010
lf+0.009
 Amy+0.009
ma+0.008
Neg logit contributions
 Pet-0.011
 sur-0.011
 rob-0.011
 Kang-0.011
 tort-0.011
Token decided
Token ID3066
Feature activation-3.23e-01

Pos logit contributions
 cereal+0.010
lf+0.010
 spaghetti+0.010
 Amy+0.009
ma+0.009
Neg logit contributions
 Pet-0.008
 Kang-0.007
 sur-0.007
 tort-0.007
 rob-0.007
Token to
Token ID284
Feature activation+5.89e-01

Pos logit contributions
 cereal+0.009
 Amy+0.008
 spaghetti+0.008
lf+0.008
ma+0.007
Neg logit contributions
 Kang-0.008
ive-0.007
 ko-0.007
 rob-0.007
 Pet-0.007
Token help
Token ID1037
Feature activation-5.08e-01

Pos logit contributions
 Pet+0.012
 sur+0.012
 Kang+0.012
mur+0.012
 tort+0.012
Neg logit contributions
 cereal-0.017
lf-0.016
 Amy-0.015
 spaghetti-0.015
ma-0.015
Token him
Token ID683
Feature activation+7.73e-01

Pos logit contributions
 cereal+0.008
 Amy+0.008
lf+0.007
 spaghetti+0.007
ma+0.007
Neg logit contributions
 sur-0.017
ive-0.017
 Kang-0.017
 Pet-0.016
 gol-0.016
Token find
Token ID1064
Feature activation+3.74e-01

Pos logit contributions
 Pet+0.019
 sur+0.019
 gol+0.019
 tort+0.019
 Kang+0.019
Neg logit contributions
 cereal-0.020
 spaghetti-0.017
 Amy-0.017
lf-0.017
ma-0.016
Token his
Token ID465
Feature activation-1.77e-01

Pos logit contributions
 Pet+0.010
 tort+0.010
 sur+0.010
 Kang+0.010
 rob+0.010
Neg logit contributions
 cereal-0.009
lf-0.008
 spaghetti-0.008
 Amy-0.008
James-0.007
Token way
Token ID835
Feature activation-1.60e+00

Pos logit contributions
 cereal+0.006
lf+0.006
 spaghetti+0.005
James+0.005
Kate+0.005
Neg logit contributions
 Pet-0.003
 tort-0.003
 Kang-0.003
 rob-0.003
 tou-0.003
Token to
Token ID284
Feature activation+3.50e-01

Pos logit contributions
 cereal+0.020
 spaghetti+0.019
 Amy+0.019
lf+0.018
ma+0.017
Neg logit contributions
 Kang-0.020
ive-0.019
 ko-0.018
 rob-0.018
mur-0.018
Token keep
Token ID1394
Feature activation-2.26e-01

Pos logit contributions
 Pet+0.006
 Kang+0.006
 tort+0.006
 sur+0.006
 rob+0.006
Neg logit contributions
 cereal-0.012
lf-0.011
 spaghetti-0.011
 Amy-0.010
ma-0.010
Token the
Token ID262
Feature activation+6.79e-01

Pos logit contributions
 cereal+0.006
lf+0.006
 spaghetti+0.006
 Amy+0.005
ma+0.005
Neg logit contributions
 Pet-0.005
 tort-0.005
 Kang-0.005
 sur-0.005
 rob-0.005
Token silver
Token ID8465
Feature activation-6.81e-01

Pos logit contributions
 Pet+0.018
 tort+0.017
 rob+0.017
 sur+0.017
 Kang+0.017
Neg logit contributions
 cereal-0.017
lf-0.016
 spaghetti-0.015
 Amy-0.014
James-0.014
Token coin
Token ID10752
Feature activation-1.27e-01

Pos logit contributions
 cereal+0.016
lf+0.014
 spaghetti+0.014
ma+0.013
bes+0.013
Neg logit contributions
 Pet-0.019
 Kang-0.018
 tort-0.018
 sur-0.018
 rob-0.018
Token in
Token ID287
Feature activation-8.88e-01

Pos logit contributions
 cereal+0.004
lf+0.004
James+0.004
 spaghetti+0.004
Kate+0.004
Neg logit contributions
 Pet-0.002
 tort-0.002
 rob-0.002
ive-0.002
 tou-0.002
Token her
Token ID607
Feature activation-3.95e-01

Pos logit contributions
 cereal+0.023
lf+0.022
bes+0.020
 spaghetti+0.020
James+0.020
Neg logit contributions
 Pet-0.022
 rob-0.020
 tort-0.020
 ko-0.020
 tou-0.019
Token pocket
Token ID10000
Feature activation-1.04e+00

Pos logit contributions
 cereal+0.010
lf+0.010
James+0.010
 spaghetti+0.010
Kate+0.010
Neg logit contributions
 Pet-0.009
 tort-0.009
 rob-0.009
 Kang-0.009
 tou-0.008
Token and
Token ID290
Feature activation+5.00e-01

Pos logit contributions
 cereal+0.033
lf+0.031
 spaghetti+0.030
James+0.030
Kate+0.029
Neg logit contributions
 Pet-0.019
 tort-0.018
 sur-0.018
 Kang-0.017
 rob-0.017
Token show
Token ID905
Feature activation-1.09e-01

Pos logit contributions
 Pet+0.009
 tort+0.008
 Kang+0.008
 sur+0.008
 rob+0.008
Neg logit contributions
 cereal-0.016
lf-0.015
 spaghetti-0.015
 Amy-0.014
James-0.014
Token it
Token ID340
Feature activation+2.07e-02

Pos logit contributions
 cereal+0.003
 Amy+0.002
lf+0.002
 spaghetti+0.002
 Kate+0.002
Neg logit contributions
 Kang-0.003
 Pet-0.003
 tort-0.003
 sur-0.003
 rob-0.003
Token
Token ID198
Feature activation+1.83e-01

Pos logit contributions
 Pet+0.016
 sur+0.014
 tort+0.014
 rob+0.014
 Kang+0.014
Neg logit contributions
 cereal-0.020
lf-0.020
 spaghetti-0.018
James-0.018
ma-0.018
Token
Token ID198
Feature activation+3.97e-01

Pos logit contributions
 Pet+0.003
 rob+0.003
 sur+0.003
 Kang+0.003
ey+0.003
Neg logit contributions
 cereal-0.006
lf-0.006
 spaghetti-0.006
lights-0.005
bes-0.005
TokenBen
Token ID11696
Feature activation+9.25e-01

Pos logit contributions
 Pet+0.008
 rob+0.008
 sur+0.008
 tort+0.008
 Kang+0.008
Neg logit contributions
 cereal-0.012
lf-0.011
 spaghetti-0.011
 Amy-0.010
 Kate-0.010
Token did
Token ID750
Feature activation-1.84e-01

Pos logit contributions
 Pet+0.021
 sur+0.020
 tort+0.020
 rob+0.020
 Kang+0.019
Neg logit contributions
 cereal-0.026
lf-0.024
 spaghetti-0.023
James-0.022
bes-0.022
Token not
Token ID407
Feature activation+1.01e+00

Pos logit contributions
 cereal+0.004
lf+0.003
 spaghetti+0.003
James+0.003
ma+0.003
Neg logit contributions
 Pet-0.006
 sur-0.005
 rob-0.005
 tort-0.005
 Kang-0.005
Token listen
Token ID6004
Feature activation+6.16e-01

Pos logit contributions
 Kang+0.024
 Pet+0.023
 tort+0.022
 rob+0.021
 ko+0.021
Neg logit contributions
 cereal-0.027
 Amy-0.025
 spaghetti-0.025
lf-0.024
ma-0.024
Token to
Token ID284
Feature activation-8.45e-01

Pos logit contributions
 Pet+0.014
 Kang+0.013
 rob+0.013
 tort+0.013
 sur+0.013
Neg logit contributions
 cereal-0.017
lf-0.016
 spaghetti-0.016
 Amy-0.015
ma-0.015
Token Lily
Token ID20037
Feature activation+9.72e-01

Pos logit contributions
 cereal+0.019
lf+0.019
bes+0.019
James+0.018
 spaghetti+0.017
Neg logit contributions
 Pet-0.023
 rob-0.022
 tort-0.021
 ko-0.021
 Kang-0.021
Token.
Token ID13
Feature activation+5.49e-01

Pos logit contributions
 Pet+0.026
 sur+0.025
 tort+0.025
 Kang+0.025
 tou+0.025
Neg logit contributions
 cereal-0.024
 spaghetti-0.021
 Amy-0.021
ma-0.021
lf-0.021
Token He
Token ID679
Feature activation+2.72e-01

Pos logit contributions
 Pet+0.014
 Kang+0.013
 tort+0.013
 sur+0.013
 rob+0.013
Neg logit contributions
lf-0.014
 cereal-0.014
ma-0.013
 spaghetti-0.013
James-0.013
Token thought
Token ID1807
Feature activation-9.24e-01

Pos logit contributions
 Pet+0.006
 tort+0.006
 Kang+0.006
 rob+0.006
 sur+0.006
Neg logit contributions
 cereal-0.007
lf-0.007
 spaghetti-0.007
 Amy-0.006
James-0.006
Token to
Token ID284
Feature activation-7.73e-01

Pos logit contributions
lves+0.009
 Kang+0.009
 tort+0.009
 encountered+0.009
 Pet+0.009
Neg logit contributions
 cereal-0.009
 Amy-0.008
 spaghetti-0.008
lf-0.008
 Kate-0.008
Token Ben
Token ID3932
Feature activation+1.24e+00

Pos logit contributions
 cereal+0.019
lf+0.019
bes+0.018
lights+0.018
 spaghetti+0.018
Neg logit contributions
 Pet-0.020
 rob-0.018
 judge-0.017
 ko-0.017
 tort-0.017
Token,
Token ID11
Feature activation+1.97e-01

Pos logit contributions
 Kang+0.029
 sur+0.028
 tort+0.028
 Pet+0.028
 rob+0.028
Neg logit contributions
 cereal-0.034
ma-0.031
 Amy-0.031
lf-0.031
 spaghetti-0.031
Token "
Token ID366
Feature activation+4.36e-01

Pos logit contributions
 Pet+0.004
 tort+0.004
 sur+0.004
 Kang+0.004
 rob+0.004
Neg logit contributions
 cereal-0.006
lf-0.005
 spaghetti-0.005
 Amy-0.005
James-0.005
TokenI
Token ID40
Feature activation+1.34e-01

Pos logit contributions
 Pet+0.011
 tort+0.010
 Kang+0.010
 ko+0.010
 rob+0.010
Neg logit contributions
 cereal-0.011
James-0.010
 spaghetti-0.010
lf-0.010
Kate-0.010
Token know
Token ID760
Feature activation+4.28e-02

Pos logit contributions
 Pet+0.003
ting+0.003
 Kang+0.003
 tort+0.003
king+0.003
Neg logit contributions
 cereal-0.004
ma-0.003
 Amy-0.003
 spaghetti-0.003
 Kate-0.003
Token where
Token ID810
Feature activation+5.03e-01

Pos logit contributions
 Pet+0.001
 tort+0.001
 rob+0.001
 sur+0.001
 Kang+0.001
Neg logit contributions
 cereal-0.001
lf-0.001
 spaghetti-0.001
James-0.001
bes-0.001
Token we
Token ID356
Feature activation+3.67e-01

Pos logit contributions
 Pet+0.011
 tort+0.010
 sur+0.010
 Kang+0.010
 rob+0.010
Neg logit contributions
 cereal-0.015
lf-0.014
 spaghetti-0.014
 Amy-0.013
ma-0.013
Token can
Token ID460
Feature activation-4.14e-01

Pos logit contributions
 Pet+0.010
 tort+0.009
 sur+0.009
 gol+0.009
 ko+0.009
Neg logit contributions
 cereal-0.009
lf-0.008
 spaghetti-0.008
ma-0.007
 blocks-0.007
Token find
Token ID1064
Feature activation+1.91e-01

Pos logit contributions
 cereal+0.012
lf+0.011
 spaghetti+0.010
ma+0.010
 rug+0.010
Neg logit contributions
 Pet-0.009
 sur-0.009
 gol-0.009
 tort-0.009
 ko-0.008
Token more
Token ID517
Feature activation+4.70e-01

Pos logit contributions
 Pet+0.005
 tort+0.005
 sur+0.005
 rob+0.004
 Kang+0.004
Neg logit contributions
 cereal-0.005
 spaghetti-0.005
lf-0.005
 Amy-0.004
ma-0.004
Token Ab
Token ID2275
Feature activation-1.09e+00

Pos logit contributions
 Pet+0.012
 tort+0.012
 sur+0.012
 Kang+0.012
 rob+0.012
Neg logit contributions
 cereal-0.013
lf-0.012
 spaghetti-0.011
 Amy-0.011
James-0.011
Tokenbi
Token ID8482
Feature activation+1.28e+00

Pos logit contributions
 cereal+0.029
lf+0.026
 spaghetti+0.026
ma+0.025
 Amy+0.024
Neg logit contributions
 Pet-0.027
 tort-0.026
 Kang-0.025
 rob-0.025
 sur-0.025
Token's
Token ID338
Feature activation+1.89e-01

Pos logit contributions
 Pet+0.030
 Kang+0.030
 tort+0.030
 sur+0.029
 gol+0.029
Neg logit contributions
 cereal-0.034
lf-0.032
ma-0.031
 spaghetti-0.031
 Amy-0.029
Token garden
Token ID11376
Feature activation+1.39e+00

Pos logit contributions
 Pet+0.005
 tort+0.005
 sur+0.005
 Kang+0.005
 rob+0.005
Neg logit contributions
 cereal-0.004
lf-0.004
 spaghetti-0.004
ma-0.004
 Amy-0.004
Token was
Token ID373
Feature activation+6.42e-02

Pos logit contributions
 Pet+0.030
 tort+0.029
 rob+0.029
 Kang+0.028
 sur+0.028
Neg logit contributions
 cereal-0.040
lf-0.037
 spaghetti-0.036
James-0.035
 Amy-0.034
Token starting
Token ID3599
Feature activation+1.10e+00

Pos logit contributions
 Pet+0.002
 tort+0.001
 rob+0.001
 sur+0.001
 Kang+0.001
Neg logit contributions
 cereal-0.002
lf-0.002
 spaghetti-0.002
James-0.001
 Amy-0.001
Token to
Token ID284
Feature activation+6.84e-01

Pos logit contributions
 Pet+0.013
 tort+0.012
 rob+0.012
 sur+0.012
 Kang+0.012
Neg logit contributions
 cereal-0.043
lf-0.041
 spaghetti-0.040
James-0.038
 Amy-0.038
Token look
Token ID804
Feature activation-7.28e-01

Pos logit contributions
 Pet+0.013
 Kang+0.013
 tort+0.013
 sur+0.013
 rob+0.012
Neg logit contributions
 cereal-0.021
lf-0.020
 spaghetti-0.020
 Amy-0.019
bes-0.018
Token very
Token ID845
Feature activation-4.74e-01

Pos logit contributions
 cereal+0.021
 spaghetti+0.020
lf+0.019
 Amy+0.018
ma+0.018
Neg logit contributions
 Pet-0.016
 Kang-0.016
 tort-0.016
 rob-0.016
 sur-0.015
Token different
Token ID1180
Feature activation+7.00e-01

Pos logit contributions
 cereal+0.011
 spaghetti+0.010
lf+0.009
 Amy+0.009
ma+0.008
Neg logit contributions
 Pet-0.013
 tort-0.013
 Kang-0.013
 rob-0.013
 ko-0.013
Token.
Token ID13
Feature activation-1.69e-01

Pos logit contributions
 Pet+0.015
 tort+0.015
 Kang+0.015
mur+0.015
 sur+0.015
Neg logit contributions
 cereal-0.021
 spaghetti-0.019
lf-0.018
 Amy-0.017
 blankets-0.017
Token."
Token ID526
Feature activation+6.60e-01

Pos logit contributions
 Pet+0.036
 rob+0.034
 tort+0.034
 Kang+0.034
 sur+0.034
Neg logit contributions
 cereal-0.043
 spaghetti-0.039
lf-0.039
 Amy-0.037
ma-0.036
Token The
Token ID383
Feature activation+3.78e-01

Pos logit contributions
 Pet+0.017
 tort+0.016
 rob+0.016
 Kang+0.016
 sur+0.016
Neg logit contributions
 cereal-0.017
lf-0.016
 spaghetti-0.015
James-0.014
 Amy-0.014
Token old
Token ID1468
Feature activation+6.73e-01

Pos logit contributions
 Pet+0.010
 Kang+0.010
 tort+0.010
 sur+0.010
ive+0.010
Neg logit contributions
 cereal-0.009
lf-0.008
 spaghetti-0.008
 Amy-0.007
ma-0.007
Token lady
Token ID10846
Feature activation+5.34e-01

Pos logit contributions
 Pet+0.016
 Kang+0.016
 tort+0.016
 sur+0.016
ive+0.016
Neg logit contributions
 cereal-0.018
lf-0.017
 spaghetti-0.015
ma-0.015
 Amy-0.015
Token smiled
Token ID13541
Feature activation+1.38e-03

Pos logit contributions
 Pet+0.015
 tort+0.014
 rob+0.014
 sur+0.014
 Kang+0.014
Neg logit contributions
 cereal-0.012
lf-0.011
 spaghetti-0.011
James-0.010
 Amy-0.010
Token back
Token ID736
Feature activation-2.35e+00

Pos logit contributions
 Pet+0.000
 rob+0.000
 tort+0.000
 Kang+0.000
 sur+0.000
Neg logit contributions
 cereal-0.000
 spaghetti-0.000
lf-0.000
 Amy-0.000
ma-0.000
Token and
Token ID290
Feature activation-8.70e-01

Pos logit contributions
 cereal+0.059
lf+0.055
 spaghetti+0.053
 Amy+0.050
ma+0.050
Neg logit contributions
 Pet-0.060
 tort-0.058
 rob-0.058
 sur-0.057
 Kang-0.057
Token continued
Token ID3767
Feature activation-2.74e-02

Pos logit contributions
 cereal+0.017
 Amy+0.016
lf+0.015
 spaghetti+0.015
ma+0.014
Neg logit contributions
 perspective-0.027
mur-0.026
 Kang-0.026
 unnecessary-0.026
ive-0.026
Token to
Token ID284
Feature activation+1.10e+00

Pos logit contributions
 cereal+0.001
lf+0.001
 spaghetti+0.001
 Amy+0.001
ma+0.001
Neg logit contributions
 rob-0.001
ive-0.001
 sur-0.001
mur-0.001
 Kang-0.001
Token walk
Token ID2513
Feature activation+9.56e-01

Pos logit contributions
mur+0.025
 Kang+0.024
 sur+0.024
 rob+0.024
lves+0.024
Neg logit contributions
 cereal-0.030
lf-0.027
 Amy-0.026
 spaghetti-0.026
ma-0.026
Token.
Token ID13
Feature activation-5.53e-01

Pos logit contributions
 rob+0.023
 sur+0.023
 Kang+0.023
 tou+0.022
 gol+0.021
Neg logit contributions
 cereal-0.025
 Amy-0.024
 spaghetti-0.022
 Kate-0.022
ma-0.022
Token It
Token ID632
Feature activation-6.74e-01

Pos logit contributions
 Pet+0.012
 Kang+0.011
 rob+0.011
 tou+0.011
 ko+0.011
Neg logit contributions
lf-0.015
 cereal-0.015
ma-0.014
 spaghetti-0.014
James-0.014
Token was
Token ID373
Feature activation-1.15e+00

Pos logit contributions
 cereal+0.019
lf+0.017
 spaghetti+0.017
James+0.016
bes+0.016
Neg logit contributions
 Pet-0.016
 tort-0.015
 rob-0.015
 Kang-0.015
 sur-0.015
Token her
Token ID607
Feature activation-2.16e-01

Pos logit contributions
 cereal+0.028
lf+0.026
bes+0.025
James+0.024
 spaghetti+0.024
Neg logit contributions
 Pet-0.031
 rob-0.029
 tou-0.029
 Kang-0.028
 ko-0.028
Token dad
Token ID9955
Feature activation-5.08e-01

Pos logit contributions
 cereal+0.005
lf+0.005
James+0.005
 spaghetti+0.005
Kate+0.005
Neg logit contributions
 Pet-0.005
 tort-0.005
 rob-0.005
 sur-0.005
 Kang-0.005
Token's
Token ID338
Feature activation+9.72e-01

Pos logit contributions
 cereal+0.014
lf+0.013
 spaghetti+0.013
ma+0.012
 Amy+0.012
Neg logit contributions
 Pet-0.012
 tort-0.011
 sur-0.011
 Kang-0.011
 rob-0.011
Token car
Token ID1097
Feature activation-2.51e+00

Pos logit contributions
 Pet+0.022
 tort+0.021
 rob+0.021
 sur+0.021
 tou+0.020
Neg logit contributions
 cereal-0.027
lf-0.026
James-0.024
 spaghetti-0.024
Kate-0.024
Token.
Token ID13
Feature activation+7.33e-01

Pos logit contributions
 cereal+0.074
lf+0.069
 spaghetti+0.068
James+0.064
 Amy+0.064
Neg logit contributions
 Pet-0.053
 tort-0.051
 rob-0.050
 Kang-0.050
 sur-0.050
Token He
Token ID679
Feature activation+3.31e-01

Pos logit contributions
 Pet+0.019
 Kang+0.017
 tort+0.017
 rob+0.017
 ko+0.017
Neg logit contributions
lf-0.018
 cereal-0.018
ma-0.017
 spaghetti-0.017
bes-0.016
Token was
Token ID373
Feature activation-5.45e-01

Pos logit contributions
 Pet+0.007
 tort+0.007
 rob+0.007
 sur+0.007
 Kang+0.007
Neg logit contributions
 cereal-0.009
lf-0.009
 spaghetti-0.009
James-0.008
 Amy-0.008
Token coming
Token ID2406
Feature activation+4.02e-02

Pos logit contributions
 cereal+0.014
lf+0.013
 spaghetti+0.012
 Amy+0.012
James+0.012
Neg logit contributions
 Pet-0.014
 tort-0.014
 rob-0.013
 sur-0.013
 Kang-0.013
Token home
Token ID1363
Feature activation+1.07e-01

Pos logit contributions
 Pet+0.001
 tort+0.001
 sur+0.001
 rob+0.001
 Kang+0.001
Neg logit contributions
 cereal-0.001
lf-0.001
 spaghetti-0.001
 Amy-0.001
ma-0.001
Token They
Token ID1119
Feature activation-4.11e-01

Pos logit contributions
 cereal+0.000
lf+0.000
 spaghetti+0.000
James+0.000
 Amy+0.000
Neg logit contributions
 Pet-0.000
 tort-0.000
 rob-0.000
 Kang-0.000
 sur-0.000
Token look
Token ID804
Feature activation-7.31e-01

Pos logit contributions
 cereal+0.011
 spaghetti+0.010
lf+0.010
 Amy+0.010
James+0.009
Neg logit contributions
 Kang-0.010
 Pet-0.009
 tort-0.009
 sur-0.009
ive-0.009
Token at
Token ID379
Feature activation-7.34e-01

Pos logit contributions
 cereal+0.024
 spaghetti+0.023
lf+0.022
 Amy+0.021
James+0.021
Neg logit contributions
 Pet-0.013
 tort-0.013
 Kang-0.013
 rob-0.013
 sur-0.012
Token the
Token ID262
Feature activation+4.22e-01

Pos logit contributions
lf+0.027
 cereal+0.027
bes+0.026
James+0.025
pron+0.025
Neg logit contributions
 Pet-0.010
 rob-0.008
 Kang-0.008
 sur-0.008
 ko-0.008
Token broken
Token ID5445
Feature activation-8.16e-01

Pos logit contributions
 rob+0.011
 Pet+0.011
 tort+0.011
 tou+0.011
 sur+0.011
Neg logit contributions
 cereal-0.009
lf-0.009
bes-0.008
James-0.008
Kate-0.008
Token box
Token ID3091
Feature activation-2.46e+00

Pos logit contributions
 cereal+0.016
 spaghetti+0.014
James+0.014
lf+0.014
Kate+0.014
Neg logit contributions
 Pet-0.024
 rob-0.023
 tort-0.023
 sur-0.023
 tou-0.023
Token and
Token ID290
Feature activation+3.38e-02

Pos logit contributions
 cereal+0.080
lf+0.074
 spaghetti+0.074
James+0.072
bes+0.071
Neg logit contributions
 Pet-0.042
 sur-0.039
 tort-0.039
 rob-0.039
 across-0.038
Token cry
Token ID3960
Feature activation-2.87e-01

Pos logit contributions
 Pet+0.001
 tort+0.001
 Kang+0.001
 rob+0.001
 sur+0.001
Neg logit contributions
 cereal-0.001
lf-0.001
 spaghetti-0.001
 Amy-0.001
James-0.001
Token.
Token ID13
Feature activation+1.45e-01

Pos logit contributions
 cereal+0.006
 spaghetti+0.006
 Amy+0.005
 Kate+0.005
 rug+0.005
Neg logit contributions
 Pet-0.009
 rob-0.009
 tort-0.009
 sur-0.008
 gol-0.008
Token\n
Token ID198
Feature activation+1.83e-01

Pos logit contributions
 Pet+0.003
 tort+0.003
 Kang+0.003
 rob+0.003
 sur+0.003
Neg logit contributions
 cereal-0.004
lf-0.004
 spaghetti-0.004
James-0.003
bes-0.003
Token\n
Token ID198
Feature activation+3.94e-01

Pos logit contributions
 Pet+0.003
 Kang+0.003
 rob+0.003
 sur+0.003
 ko+0.003
Neg logit contributions
 cereal-0.006
lf-0.006
 spaghetti-0.006
bes-0.005
 microwave-0.005
Token\n
Token ID198
Feature activation+8.76e-02

Pos logit contributions
 Pet+0.016
 sur+0.016
 tort+0.015
 Kang+0.014
 gol+0.014
Neg logit contributions
lf-0.029
 cereal-0.028
 spaghetti-0.026
bes-0.026
ma-0.025
Token\n
Token ID198
Feature activation+3.50e-01

Pos logit contributions
 Pet+0.001
 sur+0.001
 rob+0.001
 Kang+0.001
 ko+0.001
Neg logit contributions
 cereal-0.003
lf-0.003
 spaghetti-0.003
bes-0.003
James-0.003
TokenL
Token ID43
Feature activation-4.60e-01

Pos logit contributions
 Pet+0.007
 sur+0.007
 tort+0.007
 rob+0.007
 Kang+0.007
Neg logit contributions
 cereal-0.010
 spaghetti-0.009
lf-0.009
bes-0.009
 Amy-0.009
Tokenily
Token ID813
Feature activation+5.77e-01

Pos logit contributions
 cereal+0.010
lf+0.009
 blankets+0.008
 spaghetti+0.008
bes+0.008
Neg logit contributions
ive-0.013
 sur-0.013
 gol-0.013
ey-0.013
 Pet-0.013
Token smiled
Token ID13541
Feature activation+6.06e-01

Pos logit contributions
 Pet+0.015
 tort+0.015
 Kang+0.015
 rob+0.015
 sur+0.014
Neg logit contributions
 cereal-0.014
lf-0.013
 spaghetti-0.013
 Amy-0.012
James-0.012
Token back
Token ID736
Feature activation-1.94e+00

Pos logit contributions
 Pet+0.016
 rob+0.016
 tort+0.016
 Kang+0.015
 sur+0.015
Neg logit contributions
 cereal-0.015
 spaghetti-0.014
lf-0.013
 Amy-0.013
ma-0.012
Token.
Token ID13
Feature activation+3.16e-03

Pos logit contributions
 cereal+0.052
lf+0.049
 spaghetti+0.048
 Amy+0.045
ma+0.045
Neg logit contributions
 Pet-0.046
 tort-0.045
 rob-0.045
 sur-0.044
 Kang-0.044
Token She
Token ID1375
Feature activation+1.26e-01

Pos logit contributions
 Pet+0.000
 tort+0.000
 Kang+0.000
 sur+0.000
 rob+0.000
Neg logit contributions
 cereal-0.000
lf-0.000
 spaghetti-0.000
James-0.000
bes-0.000
Token said
Token ID531
Feature activation+1.39e-01

Pos logit contributions
 Pet+0.003
 tort+0.003
 Kang+0.003
 rob+0.003
 sur+0.003
Neg logit contributions
 cereal-0.003
lf-0.003
 spaghetti-0.003
 Amy-0.003
James-0.003
Token,
Token ID11
Feature activation-8.66e-02

Pos logit contributions
 Kang+0.003
 Pet+0.003
 tort+0.003
 encountered+0.003
 rob+0.003
Neg logit contributions
 cereal-0.004
 spaghetti-0.004
 Amy-0.004
lf-0.003
 Kate-0.003
Token "
Token ID366
Feature activation+2.46e-01

Pos logit contributions
 cereal+0.002
lf+0.002
 spaghetti+0.002
 Amy+0.002
James+0.002
Neg logit contributions
 Pet-0.002
 tort-0.002
 rob-0.002
 Kang-0.002
 sur-0.002
Token car
Token ID1097
Feature activation-3.44e+00

Pos logit contributions
 cereal+0.021
lf+0.020
James+0.020
Kate+0.019
 spaghetti+0.019
Neg logit contributions
 tort-0.026
 Pet-0.026
 rob-0.025
 ko-0.025
 Kang-0.024
Token and
Token ID290
Feature activation-8.96e-01

Pos logit contributions
 cereal+0.095
lf+0.090
 spaghetti+0.087
 Amy+0.083
ma+0.082
Neg logit contributions
 Pet-0.080
 sur-0.076
 rob-0.075
 Kang-0.075
 tort-0.074
Token a
Token ID257
Feature activation-5.86e-01

Pos logit contributions
 cereal+0.021
lf+0.020
ma+0.020
James+0.019
Kate+0.019
Neg logit contributions
 Pet-0.023
 tort-0.023
 rob-0.022
 ko-0.022
 Kang-0.021
Token pretty
Token ID2495
Feature activation-2.03e-01

Pos logit contributions
 cereal+0.014
lf+0.013
 spaghetti+0.012
James+0.012
bes+0.012
Neg logit contributions
 Pet-0.016
 tort-0.015
 rob-0.015
 sur-0.015
 Kang-0.015
Token purple
Token ID14032
Feature activation-1.01e+00

Pos logit contributions
 cereal+0.004
lf+0.004
 spaghetti+0.004
James+0.004
bes+0.004
Neg logit contributions
 Pet-0.006
 tort-0.006
 rob-0.006
 sur-0.005
 ko-0.005
Token dress
Token ID6576
Feature activation-2.68e+00

Pos logit contributions
 cereal+0.022
lf+0.021
 spaghetti+0.020
James+0.020
Kate+0.020
Neg logit contributions
 tort-0.027
 Pet-0.027
 rob-0.026
 ko-0.026
 sur-0.026
Token.
Token ID13
Feature activation-4.30e-01

Pos logit contributions
 cereal+0.075
lf+0.070
 spaghetti+0.068
 Amy+0.064
ma+0.064
Neg logit contributions
 Pet-0.061
 tort-0.058
 sur-0.058
 rob-0.058
 Kang-0.058
Token She
Token ID1375
Feature activation-7.44e-02

Pos logit contributions
lf+0.012
 cereal+0.012
kin+0.011
bes+0.011
 elbow+0.011
Neg logit contributions
 Pet-0.010
 Kang-0.008
 rob-0.008
 across-0.008
 tort-0.008
Token was
Token ID373
Feature activation-4.85e-01

Pos logit contributions
 cereal+0.002
lf+0.002
 spaghetti+0.002
James+0.002
 Amy+0.002
Neg logit contributions
 Pet-0.002
 tort-0.002
 sur-0.002
 rob-0.002
 Kang-0.001
Token so
Token ID523
Feature activation+9.44e-02

Pos logit contributions
 cereal+0.012
lf+0.012
 spaghetti+0.011
 Amy+0.011
James+0.011
Neg logit contributions
 Pet-0.012
 tort-0.012
 Kang-0.011
 sur-0.011
 rob-0.011
Token happy
Token ID3772
Feature activation+5.31e-01

Pos logit contributions
 Pet+0.002
 rob+0.002
 sur+0.002
 Kang+0.002
 ko+0.002
Neg logit contributions
 cereal-0.002
lf-0.002
 spaghetti-0.002
bes-0.002
James-0.002