TOP ACTIVATIONS
MAX = 4.276

with fries and a mil ksh ake . But after eating
the tree to its once former beauty . He was happy
ksh akes . The mil ksh akes were blue and pink
, roses , and sun fl owers . They waved to
as they drank the mil ksh akes .
a garden full of wild fl owers . He decided to
me adow full of wild fl owers . He spun around
garden and admire his sun fl owers . They also see
da is ies , sun fl owers , and tul ips
stretch out and enormous sun fl owers would show their bright
would take out a j igsaw puzzle and try to solve
It was a big j igsaw puzzle and she wanted to
payoff : The story fore sh adows the rainbow by mentioning
day drinking y ummy mil ksh akes in the sun .
back with lots of sun fl owers to fill the prince
had her mom and Cro co , and that was enough
<|endoftext|> L ily loved sun fl owers . She liked to
, and all the other av oc ados were very happy
a delicious , mild - fl av oured cake . The
need . Even the t ini est creature needs our help

MIN ACTIVATIONS
MIN = -3.149

. " RE VERS E , huh ? That
sign that said " RE VERS E ". L ila and
When they got to the checkout , Mom my saw the
When they got to the checkout , Tim my saw the
likes honey and Ben likes salsa . They are both sweet
! But it wasn 't able to move much because the
toys she wouldn 't be able to play with her dress
have four fingers . Two plus two is four . See
When the basket was full , it was
very frustrated at not being able to reach it .
get the bread . It nib bled her fingers and made
sad that he wasn 't able to squeeze into the frame
the baskets . When the basket was full , she went
When they got to the checkout counter , the cash ier
her . She wasn 't able to move as quickly as
and the monster wasn 't able to break it .
that accidents happen . Ho pper learned his lesson and promised
away . Tom wasn 't able to keep up with the
When they reached the checkout , Alice 's mom picked
" Yes , honey and salsa . You know , Lily

INTERVAL 2.420 - 4.276
CONTAINS 0.001%

frogs , Big Frog and Little Frog . Big Frog was
there was an ordinary ash tr ay . The ash tr
stretch out and enormous sun fl owers would show their bright
around on a tight ro pe , he could even turn
need . Even the t ini est creature needs our help

INTERVAL 0.564 - 2.420
CONTAINS 10.064%

to float away faster . The girl tried and tried to
. They hugged mom and thanked her for the lunch .
nodded his head and scam pered down the street . Soon
there was a boy named Tim . Tim was feeling sad
door . He asked her if she could help him with

INTERVAL -1.293 - 0.564
CONTAINS 89.445%

forever . <|endoftext|> Once upon a time , there was a
stained Anna 's dress and made it look dirty .
upon a time there was a girl who wanted to have
them again . But then her brother had an idea -
is that it 's important to share and be kind to

INTERVAL -3.149 - -1.293
CONTAINS 0.489%

and says , " Thank you , Ben . You are
the porch . It was full of y ummy juice .
the feel of the wind on her face and the sun
, because you have been such a good girl . Now
It was a big basket with all sorts of colorful things
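
The four INTERVAL ranges above look like equal-width bins spanning the reported MIN and MAX activations. A minimal sketch under that assumption follows. The helper name `equal_width_edges` and the choice of four bins are illustrative, read off the listing rather than taken from the tool that produced it.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 evenly spaced bin edges from lo to hi."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# MIN and MAX as reported in the listing; 4 bins is an assumption.
edges = equal_width_edges(-3.149, 4.276, 4)

# Print each bin as "low - high", mirroring the INTERVAL lines.
for low, high in zip(edges[:-1], edges[1:]):
    print(f"{low:.3f} - {high:.3f}")
```

The computed edges agree with the listed boundaries (-1.293, 0.564, 2.420) to within about 0.001. The small differences are consistent with MIN and MAX themselves being shown rounded to three decimals.
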
Token with
Token ID351
Feature activation-9.23e-01

Pos logit contributions
iw+0.030
ho+0.027
chard+0.026
 mun+0.026
 mango+0.025
Neg logit contributions
 radio-0.025
 wool-0.022
 extingu-0.022
 Margaret-0.022
 flashlight-0.022
Token fries
Token ID31757
Feature activation-1.52e+00

Pos logit contributions
iw+0.025
ho+0.024
chard+0.022
 mun+0.022
melon+0.021
Neg logit contributions
 radio-0.024
 wool-0.023
 Margaret-0.022
 flashlight-0.022
Marg-0.022
Token and
Token ID290
Feature activation-3.62e-01

Pos logit contributions
iw+0.045
ho+0.041
chard+0.040
 mun+0.039
 mango+0.039
Neg logit contributions
 radio-0.036
 wool-0.033
 Margaret-0.033
 extingu-0.032
 flashlight-0.031
Token a
Token ID257
Feature activation+4.46e-01

Pos logit contributions
iw+0.007
ho+0.007
chard+0.006
melon+0.006
 yogurt+0.006
Neg logit contributions
 radio-0.012
 Margaret-0.011
 wool-0.011
 flashlight-0.011
 grandparents-0.011
Token mil
Token ID1465
Feature activation+1.07e-01

Pos logit contributions
 radio+0.012
 Margaret+0.011
 flashlight+0.011
 wool+0.011
Marg+0.011
Neg logit contributions
iw-0.012
ho-0.011
chard-0.011
 yogurt-0.011
 mun-0.011
Tokenksh
Token ID50133
Feature activation+3.64e+00

Pos logit contributions
 radio+0.002
 Margaret+0.002
 flashlight+0.002
Marg+0.002
 wool+0.002
Neg logit contributions
iw-0.004
ho-0.004
 yogurt-0.004
chard-0.004
 mun-0.004
Tokenake
Token ID539
Feature activation+5.75e-01

Pos logit contributions
 radio+0.089
 Margaret+0.086
 flashlight+0.082
Marg+0.081
 wool+0.080
Neg logit contributions
iw-0.106
ho-0.104
melon-0.097
 yogurt-0.097
 mun-0.097
Token.
Token ID13
Feature activation-1.09e-01

Pos logit contributions
 radio+0.015
 wool+0.013
 Margaret+0.013
 extingu+0.013
 flashlight+0.013
Neg logit contributions
iw-0.016
ho-0.015
chard-0.015
 mun-0.014
 yogurt-0.014
Token But
Token ID887
Feature activation-1.92e-01

Pos logit contributions
iw+0.003
ho+0.003
chard+0.002
 mun+0.002
 yogurt+0.002
Neg logit contributions
 radio-0.003
 Margaret-0.003
 wool-0.003
Marg-0.003
 grandparents-0.003
Token after
Token ID706
Feature activation+1.34e-01

Pos logit contributions
iw+0.005
ho+0.004
 mun+0.004
 mango+0.004
melon+0.004
Neg logit contributions
 radio-0.005
 Margaret-0.005
 wool-0.005
 Tess-0.005
 grandparents-0.005
Token eating
Token ID6600
Feature activation-9.62e-01

Pos logit contributions
 radio+0.004
 Margaret+0.003
 flashlight+0.003
Marg+0.003
 wool+0.003
Neg logit contributions
iw-0.004
ho-0.003
 yogurt-0.003
 mun-0.003
chard-0.003
Token the
Token ID262
Feature activation-6.71e-02

Pos logit contributions
 radio+0.034
 Margaret+0.032
Marg+0.032
 flashlight+0.032
 wool+0.031
Neg logit contributions
iw-0.016
 yogurt-0.015
ho-0.015
 banana-0.014
melon-0.014
Token tree
Token ID5509
Feature activation+3.19e-01

Pos logit contributions
iw+0.001
ho+0.001
 yogurt+0.001
chard+0.001
 banana+0.001
Neg logit contributions
 radio-0.002
 Margaret-0.002
Marg-0.002
 flashlight-0.002
 wool-0.002
Token to
Token ID284
Feature activation-1.87e-01

Pos logit contributions
 radio+0.010
 Margaret+0.009
Marg+0.009
 flashlight+0.009
 wool+0.009
Neg logit contributions
iw-0.007
ho-0.007
 yogurt-0.006
 mun-0.006
chard-0.006
Token its
Token ID663
Feature activation-1.56e-01

Pos logit contributions
iw+0.004
 yogurt+0.003
 banana+0.003
ho+0.003
 he+0.003
Neg logit contributions
 radio-0.006
Marg-0.006
 Margaret-0.006
 fault-0.006
 wool-0.006
Token once
Token ID1752
Feature activation+2.04e-01

Pos logit contributions
iw+0.004
ho+0.004
 yogurt+0.004
chard+0.004
 mun+0.004
Neg logit contributions
 radio-0.004
 Margaret-0.004
Marg-0.003
 flashlight-0.003
 wool-0.003
Token former
Token ID1966
Feature activation+3.39e+00

Pos logit contributions
 radio+0.006
 Margaret+0.006
Marg+0.006
 grandparents+0.006
 flashlight+0.006
Neg logit contributions
iw-0.004
ho-0.004
 mun-0.004
 yogurt-0.004
 banana-0.004
Token beauty
Token ID8737
Feature activation+1.93e-01

Pos logit contributions
 radio+0.090
 Margaret+0.085
 flashlight+0.082
Marg+0.082
 wool+0.080
Neg logit contributions
iw-0.090
ho-0.086
 yogurt-0.082
 mun-0.080
 mango-0.079
Token.
Token ID13
Feature activation+4.49e-02

Pos logit contributions
 radio+0.006
 Margaret+0.005
Marg+0.005
 flashlight+0.005
 wool+0.005
Neg logit contributions
iw-0.005
ho-0.005
 yogurt-0.004
melon-0.004
 mun-0.004
Token He
Token ID679
Feature activation-4.07e-01

Pos logit contributions
 radio+0.001
 Margaret+0.001
 flashlight+0.001
 wool+0.001
Marg+0.001
Neg logit contributions
iw-0.001
ho-0.001
 yogurt-0.001
 mun-0.001
melon-0.001
Token was
Token ID373
Feature activation-4.57e-02

Pos logit contributions
iw+0.009
ho+0.009
 yogurt+0.008
 mun+0.008
chard+0.008
Neg logit contributions
 radio-0.012
 Margaret-0.012
 flashlight-0.012
Marg-0.011
 wool-0.011
Token happy
Token ID3772
Feature activation+9.21e-02

Pos logit contributions
iw+0.001
ho+0.001
 yogurt+0.001
chard+0.001
 mun+0.001
Neg logit contributions
 radio-0.001
 Margaret-0.001
 flashlight-0.001
Marg-0.001
 wool-0.001
Tokenksh
Token ID50133
Feature activation+3.41e+00

Pos logit contributions
 radio+0.007
 Margaret+0.006
 wool+0.006
Marg+0.006
 extingu+0.006
Neg logit contributions
iw-0.011
ho-0.011
 yogurt-0.011
chard-0.010
 banana-0.010
Tokenakes
Token ID1124
Feature activation+7.85e-01

Pos logit contributions
 radio+0.091
 Margaret+0.086
 flashlight+0.084
Marg+0.083
 wool+0.082
Neg logit contributions
iw-0.092
ho-0.090
 yogurt-0.084
 mun-0.083
melon-0.083
Token.
Token ID13
Feature activation+2.45e-02

Pos logit contributions
 radio+0.022
 Margaret+0.022
Marg+0.021
 flashlight+0.021
 grandparents+0.021
Neg logit contributions
iw-0.019
ho-0.019
 yogurt-0.018
melon-0.018
 mun-0.017
Token The
Token ID383
Feature activation-3.78e-02

Pos logit contributions
 radio+0.001
 flashlight+0.001
 Margaret+0.001
 wool+0.001
Marg+0.001
Neg logit contributions
iw-0.001
ho-0.001
 yogurt-0.001
chard-0.001
melon-0.001
Token mil
Token ID1465
Feature activation-5.60e-01

Pos logit contributions
iw+0.001
ho+0.001
 yogurt+0.001
chard+0.001
 mun+0.001
Neg logit contributions
 radio-0.001
 Margaret-0.001
 flashlight-0.001
 wool-0.001
Marg-0.001
Tokenksh
Token ID50133
Feature activation+3.36e+00

Pos logit contributions
iw+0.020
ho+0.020
 yogurt+0.019
chard+0.019
 mun+0.019
Neg logit contributions
 radio-0.009
 Margaret-0.009
 wool-0.008
Marg-0.008
 flashlight-0.008
Tokenakes
Token ID1124
Feature activation+1.97e-01

Pos logit contributions
 radio+0.081
 Margaret+0.077
 flashlight+0.075
Marg+0.074
 wool+0.073
Neg logit contributions
iw-0.099
ho-0.097
 mun-0.090
melon-0.090
 yogurt-0.090
Token were
Token ID547
Feature activation-2.63e-01

Pos logit contributions
 radio+0.006
 Margaret+0.005
Marg+0.005
 flashlight+0.005
 wool+0.005
Neg logit contributions
iw-0.005
ho-0.004
 yogurt-0.004
 mun-0.004
melon-0.004
Token blue
Token ID4171
Feature activation-4.19e-01

Pos logit contributions
iw+0.006
ho+0.006
 yogurt+0.006
chard+0.006
 mun+0.006
Neg logit contributions
 radio-0.008
 Margaret-0.007
 flashlight-0.007
Marg-0.007
 wool-0.007
Token and
Token ID290
Feature activation+4.23e-01

Pos logit contributions
iw+0.010
ho+0.010
 yogurt+0.010
melon+0.009
 banana+0.009
Neg logit contributions
 radio-0.012
 Margaret-0.011
Marg-0.011
 flashlight-0.011
 grandparents-0.011
Token pink
Token ID11398
Feature activation-4.20e-01

Pos logit contributions
 radio+0.012
 Margaret+0.012
 flashlight+0.011
Marg+0.011
 wool+0.011
Neg logit contributions
iw-0.010
ho-0.010
 yogurt-0.009
chard-0.009
 mun-0.009
Token,
Token ID11
Feature activation+6.63e-01

Pos logit contributions
 radio+0.015
 Margaret+0.015
Marg+0.015
 grandparents+0.015
 flashlight+0.014
Neg logit contributions
ho-0.010
iw-0.010
melon-0.009
 yogurt-0.009
 mun-0.009
Token roses
Token ID42152
Feature activation-2.48e-01

Pos logit contributions
 radio+0.018
 Margaret+0.017
Marg+0.017
 grandparents+0.016
 fault+0.016
Neg logit contributions
iw-0.017
ho-0.017
 yogurt-0.016
 banana-0.016
 mun-0.015
Token,
Token ID11
Feature activation+6.53e-01

Pos logit contributions
iw+0.005
ho+0.005
 yogurt+0.005
melon+0.005
 mun+0.004
Neg logit contributions
 radio-0.008
 Margaret-0.008
 flashlight-0.008
Marg-0.008
 grandparents-0.007
Token and
Token ID290
Feature activation-1.86e-01

Pos logit contributions
 radio+0.018
 Margaret+0.017
Marg+0.017
 grandparents+0.016
 flashlight+0.016
Neg logit contributions
iw-0.017
ho-0.016
 yogurt-0.016
 mun-0.015
 banana-0.015
Token sun
Token ID4252
Feature activation+2.62e-02

Pos logit contributions
iw+0.005
ho+0.005
 yogurt+0.004
chard+0.004
 mun+0.004
Neg logit contributions
 radio-0.005
 Margaret-0.005
Marg-0.005
 flashlight-0.005
 grandparents-0.005
Tokenfl
Token ID2704
Feature activation+3.23e+00

Pos logit contributions
 radio+0.001
Marg+0.001
 Margaret+0.001
 flashlight+0.001
 grandparents+0.001
Neg logit contributions
ho-0.001
iw-0.001
melon-0.001
 yogurt-0.001
 banana-0.001
Tokenowers
Token ID3618
Feature activation+7.84e-02

Pos logit contributions
 radio+0.087
 Margaret+0.082
 flashlight+0.081
 grandparents+0.078
 parent+0.077
Neg logit contributions
iw-0.091
ho-0.089
 yogurt-0.082
melon-0.082
 mun-0.082
Token.
Token ID13
Feature activation+1.47e-01

Pos logit contributions
 radio+0.002
 Margaret+0.002
Marg+0.002
 grandparents+0.002
 flashlight+0.002
Neg logit contributions
ho-0.002
iw-0.002
 yogurt-0.002
melon-0.002
 mun-0.002
Token They
Token ID1119
Feature activation-1.33e-01

Pos logit contributions
 radio+0.004
 Margaret+0.003
 flashlight+0.003
Marg+0.003
 wool+0.003
Neg logit contributions
iw-0.004
ho-0.004
 yogurt-0.004
chard-0.004
 mun-0.004
Token waved
Token ID26834
Feature activation-4.12e-01

Pos logit contributions
iw+0.003
ho+0.003
 mun+0.003
 yogurt+0.003
chard+0.003
Neg logit contributions
 radio-0.004
 Margaret-0.004
Marg-0.004
 flashlight-0.004
 wool-0.004
Token to
Token ID284
Feature activation+5.50e-02

Pos logit contributions
iw+0.009
ho+0.009
 yogurt+0.008
 mun+0.008
chard+0.008
Neg logit contributions
 radio-0.013
 Margaret-0.012
Marg-0.012
 flashlight-0.012
 wool-0.012
Token as
Token ID355
Feature activation+3.89e-01

Pos logit contributions
 radio+0.001
 Margaret+0.001
 flashlight+0.001
Marg+0.001
 wool+0.001
Neg logit contributions
iw-0.001
ho-0.001
 yogurt-0.001
melon-0.001
 mun-0.001
Token they
Token ID484
Feature activation-4.34e-01

Pos logit contributions
 radio+0.013
 flashlight+0.012
Marg+0.012
 Margaret+0.012
 extingu+0.012
Neg logit contributions
iw-0.007
 yogurt-0.007
ho-0.007
 he-0.007
chard-0.007
Token drank
Token ID24070
Feature activation-6.85e-02

Pos logit contributions
iw+0.008
ho+0.008
 yogurt+0.008
 mun+0.007
chard+0.007
Neg logit contributions
 radio-0.015
 Margaret-0.014
Marg-0.014
 flashlight-0.014
 wool-0.014
Token the
Token ID262
Feature activation-6.36e-01

Pos logit contributions
iw+0.002
ho+0.002
 yogurt+0.002
melon+0.002
 mun+0.002
Neg logit contributions
 radio-0.002
 Margaret-0.002
Marg-0.002
 flashlight-0.002
 grandparents-0.002
Token mil
Token ID1465
Feature activation-5.16e-01

Pos logit contributions
iw+0.016
ho+0.016
 yogurt+0.016
melon+0.015
 banana+0.015
Neg logit contributions
 radio-0.017
 Margaret-0.017
Marg-0.016
 grandparents-0.016
 flashlight-0.016
Tokenksh
Token ID50133
Feature activation+3.20e+00

Pos logit contributions
iw+0.019
ho+0.018
 yogurt+0.018
chard+0.018
 mun+0.017
Neg logit contributions
 radio-0.008
 Margaret-0.008
Marg-0.007
 flashlight-0.007
 wool-0.007
Tokenakes
Token ID1124
Feature activation+5.77e-01

Pos logit contributions
 radio+0.068
 Margaret+0.064
 flashlight+0.062
Marg+0.060
 wool+0.060
Neg logit contributions
iw-0.104
ho-0.103
melon-0.096
 mun-0.096
 yogurt-0.096
Token.
Token ID13
Feature activation-1.02e-02

Pos logit contributions
 radio+0.017
 Margaret+0.017
Marg+0.016
 grandparents+0.016
 flashlight+0.016
Neg logit contributions
ho-0.013
iw-0.013
melon-0.013
 yogurt-0.013
 mun-0.012
Token (space)
Token ID220
Feature activation+2.11e-02

Pos logit contributions
iw+0.000
ho+0.000
 yogurt+0.000
melon+0.000
 mun+0.000
Neg logit contributions
 radio-0.000
 flashlight-0.000
 Margaret-0.000
 wool-0.000
Marg-0.000
Token (newline)
Token ID198
Feature activation-6.18e-02

Pos logit contributions
 radio+0.001
 flashlight+0.001
 wool+0.001
 aliens+0.001
 Margaret+0.001
Neg logit contributions
iw-0.000
ho-0.000
 yogurt-0.000
melon-0.000
 Fro-0.000
Token (newline)
Token ID198
Feature activation-1.39e-01

Pos logit contributions
iw+0.002
ho+0.002
 yogurt+0.002
chard+0.002
 mun+0.002
Neg logit contributions
 radio-0.001
 Margaret-0.001
Marg-0.001
 wool-0.001
 grandparents-0.001
Token a
Token ID257
Feature activation+1.07e-01

Pos logit contributions
iw+0.008
 yogurt+0.007
ho+0.007
 banana+0.007
chard+0.007
Neg logit contributions
 radio-0.008
 Margaret-0.007
Marg-0.007
 extingu-0.007
 fault-0.007
Token garden
Token ID11376
Feature activation+2.06e-01

Pos logit contributions
Marg+0.003
 Margaret+0.003
 extingu+0.003
 fault+0.003
 aliens+0.003
Neg logit contributions
 banana-0.003
iw-0.003
 yogurt-0.003
ho-0.003
 pear-0.003
Token full
Token ID1336
Feature activation-1.11e+00

Pos logit contributions
 radio+0.005
 Margaret+0.005
 flashlight+0.005
 wool+0.005
 extingu+0.005
Neg logit contributions
iw-0.006
ho-0.005
 yogurt-0.005
chard-0.005
 mun-0.005
Token of
Token ID286
Feature activation+5.41e-01

Pos logit contributions
iw+0.018
ho+0.017
chard+0.015
 mun+0.015
 yogurt+0.014
Neg logit contributions
 radio-0.042
 wool-0.039
 flashlight-0.039
 Margaret-0.039
 grandparents-0.038
Token wild
Token ID4295
Feature activation+7.01e-01

Pos logit contributions
 radio+0.015
 Margaret+0.014
Marg+0.014
 fault+0.014
 extingu+0.014
Neg logit contributions
 yogurt-0.013
iw-0.013
 banana-0.012
 mango-0.012
ho-0.012
Tokenfl
Token ID2704
Feature activation+3.07e+00

Pos logit contributions
Marg+0.021
 Margaret+0.021
 radio+0.021
arse+0.020
 fault+0.020
Neg logit contributions
 yogurt-0.015
iw-0.015
ho-0.014
 banana-0.014
 bananas-0.013
Tokenowers
Token ID3618
Feature activation+7.42e-01

Pos logit contributions
 radio+0.087
 Margaret+0.082
 flashlight+0.079
 grandparents+0.079
Marg+0.077
Neg logit contributions
iw-0.085
ho-0.077
 yogurt-0.074
melon-0.073
 mun-0.072
Token.
Token ID13
Feature activation+3.07e-03

Pos logit contributions
 radio+0.022
 Margaret+0.021
Marg+0.021
 flashlight+0.020
 grandparents+0.020
Neg logit contributions
iw-0.017
ho-0.017
 yogurt-0.016
melon-0.016
chard-0.016
Token He
Token ID679
Feature activation-3.68e-01

Pos logit contributions
 radio+0.000
 Margaret+0.000
Marg+0.000
 flashlight+0.000
 wool+0.000
Neg logit contributions
ho-0.000
iw-0.000
 yogurt-0.000
melon-0.000
chard-0.000
Token decided
Token ID3066
Feature activation-2.11e-01

Pos logit contributions
iw+0.009
ho+0.008
 yogurt+0.008
chard+0.007
melon+0.007
Neg logit contributions
 radio-0.011
 Margaret-0.011
 flashlight-0.010
 wool-0.010
Marg-0.010
Token to
Token ID284
Feature activation-8.61e-02

Pos logit contributions
iw+0.005
ho+0.004
 yogurt+0.004
melon+0.004
 mun+0.004
Neg logit contributions
 radio-0.007
 Margaret-0.006
 flashlight-0.006
 wool-0.006
Marg-0.006
Token me
Token ID502
Feature activation+3.03e-01

Pos logit contributions
iw+0.011
ho+0.010
 yogurt+0.010
chard+0.010
 banana+0.009
Neg logit contributions
 radio-0.013
 Margaret-0.012
Marg-0.012
 flashlight-0.012
 wool-0.012
Tokenadow
Token ID4584
Feature activation+1.09e+00

Pos logit contributions
 radio+0.009
 Margaret+0.008
Marg+0.008
 wool+0.008
 flashlight+0.008
Neg logit contributions
iw-0.007
ho-0.007
 yogurt-0.007
 mun-0.007
chard-0.007
Token full
Token ID1336
Feature activation-7.67e-01

Pos logit contributions
 radio+0.027
 flashlight+0.025
 Margaret+0.025
 wool+0.024
 extingu+0.024
Neg logit contributions
iw-0.031
 yogurt-0.027
ho-0.027
 mango-0.027
 mun-0.026
Token of
Token ID286
Feature activation+7.05e-01

Pos logit contributions
iw+0.014
ho+0.012
chard+0.012
 mun+0.011
 yogurt+0.011
Neg logit contributions
 radio-0.027
 wool-0.026
 flashlight-0.025
 Margaret-0.025
 extingu-0.025
Token wild
Token ID4295
Feature activation+5.54e-01

Pos logit contributions
 radio+0.019
Marg+0.019
 Margaret+0.018
 extingu+0.018
 fault+0.018
Neg logit contributions
 yogurt-0.017
iw-0.017
 banana-0.016
 mango-0.016
 bananas-0.016
Tokenfl
Token ID2704
Feature activation+2.96e+00

Pos logit contributions
 radio+0.014
Marg+0.014
 Margaret+0.014
 grandparents+0.014
 flashlight+0.013
Neg logit contributions
ho-0.014
iw-0.014
 yogurt-0.013
 banana-0.013
 mun-0.013
Tokenowers
Token ID3618
Feature activation+6.14e-01

Pos logit contributions
 radio+0.076
 Margaret+0.072
 flashlight+0.071
 grandparents+0.069
Marg+0.068
Neg logit contributions
iw-0.087
ho-0.084
melon-0.079
 mun-0.078
 yogurt-0.076
Token.
Token ID13
Feature activation-5.85e-02

Pos logit contributions
 radio+0.017
 Margaret+0.017
Marg+0.016
 flashlight+0.016
 grandparents+0.016
Neg logit contributions
ho-0.015
iw-0.015
 yogurt-0.014
melon-0.014
 mun-0.013
Token He
Token ID679
Feature activation-3.63e-01

Pos logit contributions
iw+0.001
ho+0.001
 yogurt+0.001
chard+0.001
 mun+0.001
Neg logit contributions
 radio-0.002
 Margaret-0.002
Marg-0.002
 wool-0.002
 flashlight-0.002
Token spun
Token ID26843
Feature activation-1.36e-01

Pos logit contributions
iw+0.008
ho+0.008
 yogurt+0.007
melon+0.007
chard+0.007
Neg logit contributions
 radio-0.011
 Margaret-0.010
 wool-0.010
 flashlight-0.010
Marg-0.010
Token around
Token ID1088
Feature activation+1.83e-01

Pos logit contributions
iw+0.003
ho+0.003
 yogurt+0.003
melon+0.003
chard+0.003
Neg logit contributions
 radio-0.004
 Margaret-0.004
 flashlight-0.004
Marg-0.004
 grandparents-0.004
Token garden
Token ID11376
Feature activation-4.05e-01

Pos logit contributions
iw+0.009
ho+0.009
 yogurt+0.009
 banana+0.009
melon+0.009
Neg logit contributions
 radio-0.009
 Margaret-0.009
Marg-0.009
 flashlight-0.008
 extingu-0.008
Token and
Token ID290
Feature activation-2.54e-01

Pos logit contributions
iw+0.011
ho+0.010
 yogurt+0.010
chard+0.010
 mun+0.010
Neg logit contributions
 radio-0.011
 Margaret-0.010
 flashlight-0.010
 wool-0.010
Marg-0.010
Token admire
Token ID21099
Feature activation-3.38e-01

Pos logit contributions
iw+0.006
ho+0.005
 yogurt+0.005
chard+0.005
 mun+0.005
Neg logit contributions
 radio-0.008
 Margaret-0.008
 flashlight-0.007
 wool-0.007
Marg-0.007
Token his
Token ID465
Feature activation-1.64e-01

Pos logit contributions
iw+0.007
 yogurt+0.006
ho+0.006
chard+0.006
melon+0.006
Neg logit contributions
 radio-0.011
 Margaret-0.011
Marg-0.011
 flashlight-0.010
 extingu-0.010
Token sun
Token ID4252
Feature activation-6.49e-01

Pos logit contributions
ho+0.004
iw+0.004
 yogurt+0.004
 banana+0.004
melon+0.004
Neg logit contributions
Marg-0.004
 Margaret-0.004
 radio-0.004
 extingu-0.004
 Tess-0.004
Tokenfl
Token ID2704
Feature activation+2.92e+00

Pos logit contributions
ho+0.016
iw+0.015
 yogurt+0.015
melon+0.015
 banana+0.014
Neg logit contributions
 radio-0.018
Marg-0.018
 Margaret-0.018
 flashlight-0.017
 extingu-0.017
Tokenowers
Token ID3618
Feature activation-1.98e-01

Pos logit contributions
 radio+0.070
 Margaret+0.068
 flashlight+0.065
Marg+0.065
 grandparents+0.062
Neg logit contributions
iw-0.090
ho-0.086
melon-0.083
 yogurt-0.082
 mun-0.078
Token.
Token ID13
Feature activation-1.49e-01

Pos logit contributions
iw+0.005
ho+0.004
 yogurt+0.004
melon+0.004
chard+0.004
Neg logit contributions
 radio-0.006
 Margaret-0.006
Marg-0.005
 flashlight-0.005
 grandparents-0.005
Token They
Token ID1119
Feature activation+4.25e-02

Pos logit contributions
iw+0.004
ho+0.004
 yogurt+0.003
chard+0.003
 mun+0.003
Neg logit contributions
 radio-0.004
 Margaret-0.004
Marg-0.004
 flashlight-0.004
 wool-0.004
Token also
Token ID635
Feature activation+1.66e-01

Pos logit contributions
 radio+0.001
 Margaret+0.001
Marg+0.001
 fault+0.001
 extingu+0.001
Neg logit contributions
ho-0.001
iw-0.001
 mun-0.001
 yogurt-0.001
chard-0.001
Token see
Token ID766
Feature activation-3.88e-01

Pos logit contributions
 radio+0.005
 Margaret+0.005
Marg+0.005
 flashlight+0.004
 wool+0.004
Neg logit contributions
iw-0.004
ho-0.004
 yogurt-0.004
 mun-0.004
chard-0.004
Token da
Token ID12379
Feature activation-2.49e-01

Pos logit contributions
 radio+0.013
 Margaret+0.012
Marg+0.012
 flashlight+0.012
 grandparents+0.012
Neg logit contributions
iw-0.011
ho-0.011
 yogurt-0.010
chard-0.010
 mun-0.010
Tokenis
Token ID271
Feature activation+1.80e-01

Pos logit contributions
iw+0.005
ho+0.004
 yogurt+0.004
chard+0.004
 mun+0.004
Neg logit contributions
 radio-0.009
 Margaret-0.008
 wool-0.008
Marg-0.008
 flashlight-0.008
Tokenies
Token ID444
Feature activation-1.54e-01

Pos logit contributions
 radio+0.004
 Margaret+0.004
 flashlight+0.004
Marg+0.004
 wool+0.004
Neg logit contributions
iw-0.005
ho-0.005
 yogurt-0.005
 mun-0.005
chard-0.005
Token,
Token ID11
Feature activation+5.09e-01

Pos logit contributions
ho+0.003
iw+0.003
 yogurt+0.002
melon+0.002
chard+0.002
Neg logit contributions
 radio-0.005
 Margaret-0.005
Marg-0.005
 flashlight-0.005
 grandparents-0.005
Token sun
Token ID4252
Feature activation+2.96e-02

Pos logit contributions
 radio+0.015
 Margaret+0.014
 flashlight+0.013
Marg+0.013
 grandparents+0.013
Neg logit contributions
iw-0.012
ho-0.012
 yogurt-0.012
chard-0.011
 mun-0.011
Tokenfl
Token ID2704
Feature activation+2.89e+00

Pos logit contributions
 radio+0.001
 Margaret+0.001
Marg+0.001
 flashlight+0.001
 grandparents+0.001
Neg logit contributions
ho-0.001
iw-0.001
 yogurt-0.001
melon-0.001
 mun-0.001
Tokenowers
Token ID3618
Feature activation+4.98e-02

Pos logit contributions
 radio+0.069
 flashlight+0.064
 Margaret+0.063
 grandparents+0.060
Marg+0.060
Neg logit contributions
iw-0.092
ho-0.088
 yogurt-0.083
melon-0.082
 mun-0.081
Token,
Token ID11
Feature activation+6.26e-01

Pos logit contributions
 radio+0.002
 Margaret+0.002
Marg+0.002
 flashlight+0.002
 grandparents+0.002
Neg logit contributions
ho-0.001
iw-0.001
 yogurt-0.001
melon-0.001
 mun-0.001
Token and
Token ID290
Feature activation-2.06e-01

Pos logit contributions
 radio+0.017
 Margaret+0.016
 flashlight+0.016
Marg+0.016
 grandparents+0.016
Neg logit contributions
iw-0.016
ho-0.016
 yogurt-0.015
 mun-0.014
chard-0.014
Token tul
Token ID48373
Feature activation-6.18e-01

Pos logit contributions
iw+0.005
ho+0.005
 yogurt+0.005
chard+0.005
 mun+0.004
Neg logit contributions
 radio-0.006
 Margaret-0.006
Marg-0.005
 flashlight-0.005
 grandparents-0.005
Tokenips
Token ID2419
Feature activation+8.25e-01

Pos logit contributions
ho+0.013
iw+0.013
chard+0.013
 yogurt+0.012
 banana+0.012
Neg logit contributions
 radio-0.017
 Margaret-0.017
Marg-0.017
 wool-0.017
 extingu-0.016
Token stretch
Token ID7539
Feature activation-3.28e-01

Pos logit contributions
iw+0.010
ho+0.010
 yogurt+0.009
 mun+0.009
melon+0.009
Neg logit contributions
 radio-0.012
 Margaret-0.011
Marg-0.011
 flashlight-0.011
 grandparents-0.011
Token out
Token ID503
Feature activation+1.81e-01

Pos logit contributions
ho+0.008
iw+0.007
 yogurt+0.007
melon+0.007
chard+0.007
Neg logit contributions
 radio-0.010
 Margaret-0.009
 grandparents-0.009
Marg-0.009
 flashlight-0.009
Token and
Token ID290
Feature activation+6.92e-03

Pos logit contributions
 radio+0.006
 Margaret+0.006
Marg+0.005
 flashlight+0.005
 grandparents+0.005
Neg logit contributions
iw-0.004
ho-0.004
 yogurt-0.003
 mun-0.003
chard-0.003
Token enormous
Token ID9812
Feature activation-3.96e-03

Pos logit contributions
 radio+0.000
 Margaret+0.000
Marg+0.000
 flashlight+0.000
 grandparents+0.000
Neg logit contributions
iw-0.000
ho-0.000
 yogurt-0.000
chard-0.000
 mun-0.000
Token sun
Token ID4252
Feature activation-1.72e-01

Pos logit contributions
iw+0.000
ho+0.000
 yogurt+0.000
 mun+0.000
melon+0.000
Neg logit contributions
 radio-0.000
 Margaret-0.000
Marg-0.000
 flashlight-0.000
 grandparents-0.000
Tokenfl
Token ID2704
Feature activation+2.88e+00

Pos logit contributions
ho+0.005
iw+0.004
melon+0.004
 yogurt+0.004
 mun+0.004
Neg logit contributions
 radio-0.004
Marg-0.004
 Margaret-0.004
 flashlight-0.004
 grandparents-0.004
Tokenowers
Token ID3618
Feature activation+6.10e-02

Pos logit contributions
 radio+0.071
 Margaret+0.066
 flashlight+0.065
 grandparents+0.063
Marg+0.062
Neg logit contributions
iw-0.087
ho-0.086
 mun-0.080
melon-0.079
 yogurt-0.078
Token would
Token ID561
Feature activation-4.72e-01

Pos logit contributions
 radio+0.002
 Margaret+0.002
Marg+0.002
 grandparents+0.002
 flashlight+0.002
Neg logit contributions
ho-0.001
iw-0.001
melon-0.001
 yogurt-0.001
 mun-0.001
Token show
Token ID905
Feature activation+2.76e-01

Pos logit contributions
ho+0.011
iw+0.011
 yogurt+0.010
 mun+0.010
melon+0.010
Neg logit contributions
 radio-0.014
 Margaret-0.013
Marg-0.013
 grandparents-0.013
 flashlight-0.013
Token their
Token ID511
Feature activation+3.07e-01

Pos logit contributions
 radio+0.008
 extingu+0.008
 flashlight+0.008
 Margaret+0.007
Marg+0.007
Neg logit contributions
ho-0.006
iw-0.006
 yogurt-0.006
 mun-0.006
melon-0.006
Token bright
Token ID6016
Feature activation-1.89e-01

Pos logit contributions
 radio+0.007
 Margaret+0.007
Marg+0.007
 extingu+0.007
 flashlight+0.006
Neg logit contributions
ho-0.009
iw-0.009
 yogurt-0.009
 mun-0.008
 banana-0.008
Token would
Token ID561
Feature activation-2.41e-01

Pos logit contributions
iw+0.014
ho+0.014
 yogurt+0.013
melon+0.013
chard+0.013
Neg logit contributions
 radio-0.010
 Margaret-0.009
 flashlight-0.009
Marg-0.009
 wool-0.009
Token take
Token ID1011
Feature activation+1.10e-01

Pos logit contributions
iw+0.007
ho+0.006
 yogurt+0.006
melon+0.006
 mun+0.006
Neg logit contributions
 radio-0.006
 flashlight-0.006
 Margaret-0.006
 wool-0.006
Marg-0.006
Token out
Token ID503
Feature activation-8.28e-01

Pos logit contributions
 radio+0.003
 Margaret+0.003
Marg+0.003
 flashlight+0.003
 wool+0.003
Neg logit contributions
iw-0.003
 yogurt-0.003
ho-0.003
 banana-0.002
 mango-0.002
Token a
Token ID257
Feature activation-3.64e-01

Pos logit contributions
iw+0.019
ho+0.018
 yogurt+0.017
chard+0.017
 mun+0.017
Neg logit contributions
 radio-0.025
 Margaret-0.024
 flashlight-0.023
Marg-0.023
 wool-0.023
Token j
Token ID474
Feature activation+4.19e-02

Pos logit contributions
iw+0.010
ho+0.010
 yogurt+0.010
chard+0.009
 mun+0.009
Neg logit contributions
 radio-0.009
 Margaret-0.008
Marg-0.008
 wool-0.008
 flashlight-0.008
Tokenigsaw
Token ID45636
Feature activation+2.87e+00

Pos logit contributions
 radio+0.001
 flashlight+0.001
 grandparents+0.001
 wool+0.001
 Margaret+0.001
Neg logit contributions
chard-0.001
ho-0.001
iw-0.001
 mun-0.001
 mango-0.001
Token puzzle
Token ID15027
Feature activation+5.17e-01

Pos logit contributions
 radio+0.066
 Margaret+0.062
Marg+0.060
 flashlight+0.060
 wool+0.060
Neg logit contributions
iw-0.086
ho-0.083
 yogurt-0.079
chard-0.077
 mun-0.077
Token and
Token ID290
Feature activation-4.29e-01

Pos logit contributions
 radio+0.015
 flashlight+0.014
 Margaret+0.014
 wool+0.014
Marg+0.014
Neg logit contributions
iw-0.013
ho-0.012
 yogurt-0.011
 mun-0.011
chard-0.011
Token try
Token ID1949
Feature activation-5.84e-01

Pos logit contributions
iw+0.012
ho+0.012
melon+0.011
 mun+0.011
 yogurt+0.011
Neg logit contributions
 radio-0.011
 Margaret-0.010
 flashlight-0.010
 wool-0.010
 grandparents-0.010
Token to
Token ID284
Feature activation+8.13e-02

Pos logit contributions
iw+0.011
 yogurt+0.010
 banana+0.010
ho+0.010
 mango+0.010
Neg logit contributions
 radio-0.019
 Margaret-0.019
 extingu-0.019
 flashlight-0.018
Marg-0.018
Token solve
Token ID8494
Feature activation+4.11e-01

Pos logit contributions
 radio+0.002
 Margaret+0.002
 flashlight+0.002
Marg+0.002
 wool+0.002
Neg logit contributions
iw-0.002
ho-0.002
 yogurt-0.002
chard-0.002
 mun-0.002
Token It
Token ID632
Feature activation+4.11e-01

Pos logit contributions
 radio+0.000
 Margaret+0.000
 wool+0.000
Marg+0.000
 grandparents+0.000
Neg logit contributions
ho-0.001
iw-0.001
 mun-0.001
chard-0.001
melon-0.001
Token was
Token ID373
Feature activation-4.02e-01

Pos logit contributions
 radio+0.013
 Margaret+0.013
Marg+0.013
 extingu+0.012
 flashlight+0.012
Neg logit contributions
iw-0.008
ho-0.008
 yogurt-0.008
 mun-0.007
melon-0.007
Token a
Token ID257
Feature activation-1.93e-01

Pos logit contributions
iw+0.012
ho+0.012
 yogurt+0.011
 mun+0.011
chard+0.011
Neg logit contributions
 radio-0.009
 Margaret-0.009
 wool-0.009
 flashlight-0.009
Marg-0.009
Token big
Token ID1263
Feature activation-4.75e-01

Pos logit contributions
iw+0.006
ho+0.005
 mun+0.005
 yogurt+0.005
chard+0.005
Neg logit contributions
 radio-0.005
 Margaret-0.004
 flashlight-0.004
 wool-0.004
 grandparents-0.004
Token j
Token ID474
Feature activation+4.13e-02

Pos logit contributions
iw+0.011
ho+0.011
 yogurt+0.010
chard+0.010
melon+0.010
Neg logit contributions
 radio-0.014
 Margaret-0.013
Marg-0.013
 flashlight-0.013
 wool-0.013
Tokenigsaw
Token ID45636
Feature activation+2.84e+00

Pos logit contributions
 radio+0.001
 Margaret+0.001
 flashlight+0.001
 grandparents+0.001
 wool+0.001
Neg logit contributions
ho-0.001
iw-0.001
chard-0.001
 mun-0.001
 mango-0.001
Token puzzle
Token ID15027
Feature activation+3.43e-01

Pos logit contributions
 radio+0.074
 Margaret+0.071
Marg+0.069
 wool+0.068
 flashlight+0.068
Neg logit contributions
iw-0.076
ho-0.073
 yogurt-0.070
melon-0.068
chard-0.067
Token and
Token ID290
Feature activation-2.69e-01

Pos logit contributions
 radio+0.011
 Margaret+0.010
 flashlight+0.010
 wool+0.010
Marg+0.010
Neg logit contributions
iw-0.008
ho-0.007
 yogurt-0.007
chard-0.007
 mun-0.007
Token she
Token ID673
Feature activation-4.31e-01

Pos logit contributions
iw+0.006
ho+0.005
 yogurt+0.005
melon+0.005
 mun+0.005
Neg logit contributions
 radio-0.009
 Margaret-0.008
 wool-0.008
 flashlight-0.008
Marg-0.008
Token wanted
Token ID2227
Feature activation+9.67e-01

Pos logit contributions
iw+0.011
ho+0.010
 yogurt+0.010
chard+0.009
melon+0.009
Neg logit contributions
 radio-0.012
 Margaret-0.012
 flashlight-0.011
Marg-0.011
 wool-0.011
Token to
Token ID284
Feature activation-2.74e-01

Pos logit contributions
 radio+0.031
 Margaret+0.030
 wool+0.029
 flashlight+0.029
Marg+0.029
Neg logit contributions
iw-0.020
ho-0.020
 mun-0.018
melon-0.017
 yogurt-0.017
Token payoff
Token ID43843
Feature activation-7.44e-01

Pos logit contributions
iw+0.017
ho+0.016
ipop+0.015
 banana+0.015
 yogurt+0.015
Neg logit contributions
 radio-0.011
 aliens-0.010
 wool-0.010
 Margaret-0.010
Marg-0.010
Token:
Token ID25
Feature activation+3.00e-01

Pos logit contributions
iw+0.019
ho+0.018
chard+0.017
 mun+0.017
 yogurt+0.017
Neg logit contributions
 radio-0.020
 Margaret-0.019
 flashlight-0.019
 wool-0.019
 grandparents-0.019
Token The
Token ID383
Feature activation+4.62e-01

Pos logit contributions
 radio+0.009
 Margaret+0.009
 flashlight+0.009
Marg+0.008
 wool+0.008
Neg logit contributions
iw-0.007
ho-0.007
 yogurt-0.006
chard-0.006
melon-0.006
Token story
Token ID1621
Feature activation+1.27e-01

Pos logit contributions
 Margaret+0.014
Marg+0.013
 extingu+0.012
 Tess+0.012
rina+0.012
Neg logit contributions
ho-0.011
iw-0.011
 yogurt-0.011
melon-0.011
 banana-0.011
Token fore
Token ID1674
Feature activation+4.78e-01

Pos logit contributions
 radio+0.004
 Margaret+0.003
Marg+0.003
 wool+0.003
 extingu+0.003
Neg logit contributions
iw-0.003
ho-0.003
 yogurt-0.003
melon-0.003
 banana-0.003
Tokensh
Token ID1477
Feature activation+2.84e+00

Pos logit contributions
 radio+0.017
 flashlight+0.016
 Margaret+0.016
 wool+0.016
 extingu+0.016
Neg logit contributions
iw-0.008
ho-0.008
chard-0.007
 mun-0.007
 yogurt-0.007
Tokenadows
Token ID9797
Feature activation-3.48e-01

Pos logit contributions
 radio+0.083
 Margaret+0.078
 flashlight+0.078
 wool+0.076
 extingu+0.074
Neg logit contributions
ho-0.070
iw-0.069
melon-0.063
 yogurt-0.061
 mun-0.061
Token the
Token ID262
Feature activation+7.54e-01

Pos logit contributions
ho+0.008
iw+0.008
 yogurt+0.007
melon+0.007
 mun+0.007
Neg logit contributions
 radio-0.010
 Margaret-0.010
Marg-0.010
 flashlight-0.010
 wool-0.010
Token rainbow
Token ID27223
Feature activation+2.95e-01

Pos logit contributions
 Margaret+0.021
Marg+0.021
 extingu+0.019
 Tess+0.019
 radio+0.018
Neg logit contributions
iw-0.018
ho-0.018
 yogurt-0.018
 banana-0.018
 mango-0.017
Token by
Token ID416
Feature activation-2.65e-01

Pos logit contributions
 radio+0.009
Marg+0.009
 Margaret+0.009
 extingu+0.008
 flashlight+0.008
Neg logit contributions
ho-0.007
iw-0.007
melon-0.006
 yogurt-0.006
 mun-0.006
Token mentioning
Token ID20862
Feature activation-2.56e-01

Pos logit contributions
ho+0.006
iw+0.006
 yogurt+0.006
melon+0.006
chard+0.006
Neg logit contributions
 radio-0.007
 Margaret-0.007
Marg-0.007
 flashlight-0.007
 grandparents-0.007
Token day
Token ID1110
Feature activation-1.99e-01

Pos logit contributions
iw+0.009
 yogurt+0.008
ho+0.008
 mun+0.008
 banana+0.008
Neg logit contributions
 Margaret-0.013
 radio-0.013
Marg-0.012
 extingu-0.012
 aliens-0.012
Token drinking
Token ID7722
Feature activation-7.28e-01

Pos logit contributions
iw+0.005
ho+0.004
 yogurt+0.004
 mun+0.004
chard+0.004
Neg logit contributions
 radio-0.006
 Margaret-0.006
 flashlight-0.005
Marg-0.005
 wool-0.005
Token y
Token ID331
Feature activation-2.78e-01

Pos logit contributions
iw+0.018
ho+0.017
chard+0.016
 mun+0.016
 yogurt+0.016
Neg logit contributions
 radio-0.021
 Margaret-0.019
 wool-0.019
 flashlight-0.019
Marg-0.019
Tokenummy
Token ID13513
Feature activation-3.23e-01

Pos logit contributions
iw+0.007
ho+0.006
chard+0.006
 mun+0.006
 banana+0.006
Neg logit contributions
 radio-0.008
 wool-0.007
 Margaret-0.007
 extingu-0.007
Marg-0.007
Token mil
Token ID1465
Feature activation-5.82e-01

Pos logit contributions
iw+0.007
ho+0.007
 yogurt+0.007
 mun+0.006
melon+0.006
Neg logit contributions
 radio-0.010
 Margaret-0.010
Marg-0.009
 flashlight-0.009
 grandparents-0.009
Tokenksh
Token ID50133
Feature activation+2.81e+00

Pos logit contributions
iw+0.020
ho+0.020
 mun+0.019
melon+0.019
 yogurt+0.019
Neg logit contributions
 radio-0.011
 Margaret-0.010
 flashlight-0.010
Marg-0.009
 grandparents-0.009
Tokenakes
Token ID1124
Feature activation+6.11e-01

Pos logit contributions
 radio+0.064
 Margaret+0.061
 flashlight+0.060
 wool+0.056
Marg+0.056
Neg logit contributions
iw-0.088
ho-0.087
melon-0.081
 mun-0.081
 yogurt-0.080
Token in
Token ID287
Feature activation+4.03e-01

Pos logit contributions
 radio+0.018
 Margaret+0.018
Marg+0.017
 flashlight+0.017
 grandparents+0.017
Neg logit contributions
ho-0.014
iw-0.014
melon-0.013
 yogurt-0.013
 mun-0.013
Token the
Token ID262
Feature activation+4.61e-01

Pos logit contributions
 radio+0.015
Marg+0.015
 Margaret+0.015
 flashlight+0.014
 fault+0.014
Neg logit contributions
 yogurt-0.006
iw-0.006
 banana-0.005
 he-0.005
melon-0.005
Token sun
Token ID4252
Feature activation-4.24e-01

Pos logit contributions
Marg+0.012
 radio+0.012
 fault+0.012
 Margaret+0.012
 extingu+0.012
Neg logit contributions
iw-0.011
 yogurt-0.011
ho-0.011
melon-0.011
 banana-0.010
Token.
Token ID13
Feature activation-2.56e-02

Pos logit contributions
ho+0.010
iw+0.010
melon+0.009
 yogurt+0.009
 mun+0.009
Neg logit contributions
 radio-0.012
 flashlight-0.012
Marg-0.012
 Margaret-0.012
 aliens-0.012
Token back
Token ID736
Feature activation+9.33e-01

Pos logit contributions
 radio+0.006
 Margaret+0.006
Marg+0.006
 flashlight+0.006
 grandparents+0.006
Neg logit contributions
ho-0.004
iw-0.004
 yogurt-0.003
melon-0.003
chard-0.003
Token with
Token ID351
Feature activation-4.15e-02

Pos logit contributions
 radio+0.030
 Margaret+0.028
Marg+0.028
 flashlight+0.027
 wool+0.027
Neg logit contributions
iw-0.020
ho-0.019
 yogurt-0.018
chard-0.018
melon-0.018
Token lots
Token ID6041
Feature activation-6.78e-01

Pos logit contributions
iw+0.001
 yogurt+0.001
ho+0.001
 banana+0.001
melon+0.001
Neg logit contributions
 radio-0.001
 Margaret-0.001
Marg-0.001
 flashlight-0.001
 grandparents-0.001
Token of
Token ID286
Feature activation+4.79e-01

Pos logit contributions
iw+0.011
ho+0.011
chard+0.011
melon+0.009
 yogurt+0.009
Neg logit contributions
 radio-0.024
 wool-0.023
 Margaret-0.023
 flashlight-0.023
Marg-0.022
Token sun
Token ID4252
Feature activation-5.49e-01

Pos logit contributions
 radio+0.013
 Margaret+0.012
Marg+0.012
 flashlight+0.012
 wool+0.012
Neg logit contributions
iw-0.012
ho-0.012
 yogurt-0.012
 mun-0.011
chard-0.011
Tokenfl
Token ID2704
Feature activation+2.79e+00

Pos logit contributions
ho+0.014
melon+0.013
 yogurt+0.013
iw+0.013
 mun+0.012
Neg logit contributions
 radio-0.016
Marg-0.015
 Margaret-0.014
 flashlight-0.014
 grandparents-0.014
Tokenowers
Token ID3618
Feature activation-5.11e-02

Pos logit contributions
 radio+0.070
 Margaret+0.065
 flashlight+0.064
Marg+0.062
 grandparents+0.061
Neg logit contributions
iw-0.087
ho-0.082
 yogurt-0.078
melon-0.078
 mun-0.077
Token to
Token ID284
Feature activation-1.72e-01

Pos logit contributions
ho+0.001
iw+0.001
melon+0.001
 yogurt+0.001
 mun+0.001
Neg logit contributions
 radio-0.002
 Margaret-0.002
Marg-0.002
 grandparents-0.002
 flashlight-0.002
Token fill
Token ID6070
Feature activation-1.08e-01

Pos logit contributions
iw+0.004
ho+0.004
 yogurt+0.004
 mun+0.004
chard+0.004
Neg logit contributions
 radio-0.005
 Margaret-0.005
 flashlight-0.005
Marg-0.005
 fault-0.005
Token the
Token ID262
Feature activation-9.18e-02

Pos logit contributions
iw+0.002
ho+0.002
 yogurt+0.002
melon+0.002
chard+0.002
Neg logit contributions
 radio-0.003
 Margaret-0.003
Marg-0.003
 flashlight-0.003
 grandparents-0.003
Token prince
Token ID19716
Feature activation-4.00e-01

Pos logit contributions
iw+0.002
ho+0.002
 yogurt+0.002
melon+0.002
chard+0.002
Neg logit contributions
 radio-0.002
 Margaret-0.002
Marg-0.002
 flashlight-0.002
 grandparents-0.002
Token had
Token ID550
Feature activation+2.23e-01

Pos logit contributions
 radio+0.003
 Margaret+0.003
 flashlight+0.003
Marg+0.003
 wool+0.002
Neg logit contributions
iw-0.002
ho-0.002
 yogurt-0.002
 mun-0.002
chard-0.002
Token her
Token ID607
Feature activation-2.03e-01

Pos logit contributions
 radio+0.006
 Margaret+0.005
 flashlight+0.005
Marg+0.005
 extingu+0.005
Neg logit contributions
iw-0.006
ho-0.006
 yogurt-0.006
 mun-0.006
chard-0.006
Token mom
Token ID1995
Feature activation-2.99e-01

Pos logit contributions
iw+0.005
 yogurt+0.005
ho+0.005
 banana+0.005
melon+0.005
Neg logit contributions
Marg-0.005
 Margaret-0.005
 radio-0.005
 extingu-0.005
 aliens-0.005
Token and
Token ID290
Feature activation-4.38e-01

Pos logit contributions
iw+0.008
ho+0.008
 yogurt+0.007
chard+0.007
 mun+0.007
Neg logit contributions
 radio-0.008
 Margaret-0.008
 flashlight-0.007
Marg-0.007
 wool-0.007
Token Cro
Token ID9325
Feature activation+9.68e-01

Pos logit contributions
iw+0.012
ho+0.012
 yogurt+0.011
 mun+0.011
chard+0.011
Neg logit contributions
 radio-0.011
 Margaret-0.011
 wool-0.010
Marg-0.010
 flashlight-0.010
Tokenco
Token ID1073
Feature activation+2.77e+00

Pos logit contributions
 radio+0.024
 Margaret+0.023
Marg+0.023
 wool+0.022
 flashlight+0.022
Neg logit contributions
iw-0.026
ho-0.026
 yogurt-0.025
chard-0.024
melon-0.024
Token,
Token ID11
Feature activation+4.01e-01

Pos logit contributions
 radio+0.080
 Margaret+0.077
Marg+0.076
 grandparents+0.074
 wool+0.073
Neg logit contributions
iw-0.064
ho-0.062
chard-0.058
 yogurt-0.058
melon-0.057
Token and
Token ID290
Feature activation-1.70e-01

Pos logit contributions
 radio+0.011
 Margaret+0.010
 flashlight+0.010
Marg+0.010
 wool+0.010
Neg logit contributions
iw-0.010
ho-0.010
 yogurt-0.010
chard-0.009
 mun-0.009
Token that
Token ID326
Feature activation+2.66e-01

Pos logit contributions
iw+0.004
 yogurt+0.004
chard+0.004
 mun+0.004
ho+0.004
Neg logit contributions
 radio-0.005
 Margaret-0.004
 wool-0.004
Marg-0.004
 parent-0.004
Token was
Token ID373
Feature activation+2.50e-01

Pos logit contributions
 radio+0.008
 Margaret+0.008
 wool+0.008
 flashlight+0.008
Marg+0.008
Neg logit contributions
iw-0.006
ho-0.005
chard-0.005
 yogurt-0.005
 mun-0.005
Token enough
Token ID1576
Feature activation-6.68e-01

Pos logit contributions
 radio+0.007
 Margaret+0.007
 flashlight+0.006
Marg+0.006
 extingu+0.006
Neg logit contributions
iw-0.006
ho-0.006
 yogurt-0.006
chard-0.006
melon-0.006
Token<|endoftext|>
Token ID50256
Feature activation+1.52e-01

Pos logit contributions
iw+0.000
ho+0.000
 yogurt+0.000
 mun+0.000
chard+0.000
Neg logit contributions
 radio-0.000
 Margaret-0.000
Marg-0.000
 flashlight-0.000
 wool-0.000
TokenL
Token ID43
Feature activation-1.18e-01

Pos logit contributions
 radio+0.004
 Margaret+0.004
 wool+0.004
 flashlight+0.004
Marg+0.004
Neg logit contributions
iw-0.004
ho-0.004
 yogurt-0.004
chard-0.004
 mun-0.004
Tokenily
Token ID813
Feature activation+4.75e-01

Pos logit contributions
iw+0.003
ho+0.003
chard+0.003
melon+0.003
 mun+0.003
Neg logit contributions
 radio-0.003
 Margaret-0.003
Marg-0.003
 wool-0.003
 flashlight-0.003
Token loved
Token ID6151
Feature activation+5.55e-01

Pos logit contributions
 radio+0.013
 Margaret+0.013
Marg+0.012
 flashlight+0.012
 extingu+0.012
Neg logit contributions
iw-0.012
ho-0.011
 yogurt-0.011
 mun-0.011
 banana-0.010
Token sun
Token ID4252
Feature activation-1.37e+00

Pos logit contributions
 radio+0.016
 wool+0.015
 Margaret+0.015
 grandparents+0.015
 flashlight+0.015
Neg logit contributions
iw-0.014
ho-0.013
 mun-0.012
chard-0.012
 mango-0.011
Tokenfl
Token ID2704
Feature activation+2.76e+00

Pos logit contributions
ho+0.033
iw+0.033
 yogurt+0.032
melon+0.031
 banana+0.030
Neg logit contributions
 radio-0.038
 Margaret-0.036
Marg-0.036
 flashlight-0.035
 grandparents-0.035
Tokenowers
Token ID3618
Feature activation-6.57e-01

Pos logit contributions
 radio+0.060
 Margaret+0.057
 flashlight+0.055
Marg+0.053
 grandparents+0.052
Neg logit contributions
iw-0.091
ho-0.087
 yogurt-0.083
melon-0.081
 mun-0.081
Token.
Token ID13
Feature activation+2.57e-01

Pos logit contributions
iw+0.015
ho+0.014
 yogurt+0.013
melon+0.013
chard+0.013
Neg logit contributions
 radio-0.020
 Margaret-0.019
 flashlight-0.019
Marg-0.019
 grandparents-0.018
Token She
Token ID1375
Feature activation-2.93e-01

Pos logit contributions
 radio+0.007
 flashlight+0.007
 Margaret+0.007
 wool+0.007
Marg+0.007
Neg logit contributions
iw-0.006
ho-0.006
 yogurt-0.006
melon-0.006
chard-0.006
Token liked
Token ID8288
Feature activation+1.40e-01

Pos logit contributions
iw+0.007
ho+0.007
 mun+0.006
 yogurt+0.006
melon+0.006
Neg logit contributions
 radio-0.009
 Margaret-0.008
 flashlight-0.008
 fault-0.008
Marg-0.008
Token to
Token ID284
Feature activation-4.45e-01

Pos logit contributions
 radio+0.005
 Margaret+0.004
 wool+0.004
Marg+0.004
 flashlight+0.004
Neg logit contributions
iw-0.003
ho-0.003
 yogurt-0.002
chard-0.002
 mun-0.002
Token,
Token ID11
Feature activation+2.32e-01

Pos logit contributions
 radio+0.009
 Margaret+0.009
 flashlight+0.009
 wool+0.009
Marg+0.009
Neg logit contributions
iw-0.010
ho-0.009
 yogurt-0.009
chard-0.009
 mun-0.009
Token and
Token ID290
Feature activation-1.34e-01

Pos logit contributions
 radio+0.007
 Margaret+0.006
 flashlight+0.006
 wool+0.006
Marg+0.006
Neg logit contributions
iw-0.006
ho-0.005
 yogurt-0.005
chard-0.005
 mun-0.005
Token all
Token ID477
Feature activation+3.45e-01

Pos logit contributions
iw+0.003
ho+0.003
 yogurt+0.003
melon+0.003
chard+0.003
Neg logit contributions
 radio-0.004
 Margaret-0.004
 wool-0.004
 flashlight-0.004
 grandparents-0.004
Token the
Token ID262
Feature activation+3.22e-01

Pos logit contributions
 radio+0.011
 Margaret+0.011
 flashlight+0.010
 wool+0.010
Marg+0.010
Neg logit contributions
iw-0.007
ho-0.007
chard-0.006
 yogurt-0.006
 mun-0.006
Token other
Token ID584
Feature activation+1.32e+00

Pos logit contributions
 radio+0.009
 Margaret+0.008
Marg+0.008
 flashlight+0.008
 extingu+0.008
Neg logit contributions
iw-0.008
ho-0.008
 yogurt-0.008
 banana-0.007
 mun-0.007
Token av
Token ID1196
Feature activation+2.74e+00

Pos logit contributions
 radio+0.033
 Margaret+0.030
 flashlight+0.030
 wool+0.030
 grandparents+0.030
Neg logit contributions
iw-0.037
ho-0.036
chard-0.034
melon-0.033
 mun-0.033
Tokenoc
Token ID420
Feature activation+5.36e-01

Pos logit contributions
Marg+0.069
 radio+0.069
 wool+0.068
 Margaret+0.067
 flashlight+0.064
Neg logit contributions
 yogurt-0.068
chard-0.066
ho-0.065
melon-0.065
iw-0.064
Tokenados
Token ID22484
Feature activation+4.90e-01

Pos logit contributions
 fault+0.014
 extingu+0.013
 radio+0.013
 grandparents+0.013
 aliens+0.013
Neg logit contributions
chard-0.013
 yogurt-0.012
 mun-0.012
 Fro-0.012
iw-0.011
Token were
Token ID547
Feature activation+1.71e-01

Pos logit contributions
 radio+0.015
 Margaret+0.014
Marg+0.014
 flashlight+0.014
 wool+0.014
Neg logit contributions
iw-0.011
ho-0.011
 yogurt-0.010
 mun-0.010
chard-0.010
Token very
Token ID845
Feature activation+1.51e-02

Pos logit contributions
 radio+0.005
 Margaret+0.005
 flashlight+0.005
Marg+0.005
 wool+0.004
Neg logit contributions
iw-0.004
ho-0.004
 yogurt-0.004
chard-0.004
 mun-0.004
Token happy
Token ID3772
Feature activation-1.88e-01

Pos logit contributions
 wonderful+0.000
 grandparents+0.000
 true+0.000
 radio+0.000
 wool+0.000
Neg logit contributions
iw-0.000
ho-0.000
ro-0.000
hop-0.000
 salon-0.000
Token a
Token ID257
Feature activation-1.49e-01

Pos logit contributions
iw+0.005
ho+0.004
 mun+0.004
chard+0.004
 yogurt+0.004
Neg logit contributions
 radio-0.004
 Margaret-0.004
 wool-0.004
 flashlight-0.004
Marg-0.004
Token delicious
Token ID12625
Feature activation+1.31e-01

Pos logit contributions
iw+0.004
ho+0.004
 yogurt+0.004
chard+0.004
 mun+0.004
Neg logit contributions
 radio-0.004
 Margaret-0.004
Marg-0.004
 flashlight-0.004
 wool-0.003
Token,
Token ID11
Feature activation+6.18e-01

Pos logit contributions
 radio+0.004
 Margaret+0.003
 flashlight+0.003
 wool+0.003
Marg+0.003
Neg logit contributions
iw-0.003
ho-0.003
chard-0.003
 yogurt-0.003
 mun-0.003
Token mild
Token ID11607
Feature activation-2.26e-01

Pos logit contributions
 radio+0.017
 Margaret+0.017
Marg+0.016
 flashlight+0.016
 wool+0.016
Neg logit contributions
iw-0.015
ho-0.014
 yogurt-0.014
chard-0.014
 mun-0.013
Token-
Token ID12
Feature activation+5.35e-01

Pos logit contributions
iw+0.005
ho+0.005
 yogurt+0.005
 mun+0.005
chard+0.005
Neg logit contributions
 radio-0.007
 Margaret-0.006
Marg-0.006
 flashlight-0.006
 wool-0.006
Tokenfl
Token ID2704
Feature activation+2.73e+00

Pos logit contributions
 radio+0.018
 Margaret+0.017
Marg+0.017
 extingu+0.016
 wool+0.016
Neg logit contributions
iw-0.010
ho-0.009
chard-0.009
melon-0.009
 mun-0.009
Tokenav
Token ID615
Feature activation-2.46e-01

Pos logit contributions
 radio+0.064
 Margaret+0.060
 flashlight+0.060
Marg+0.058
 aliens+0.057
Neg logit contributions
iw-0.085
ho-0.082
 yogurt-0.080
 mun-0.077
melon-0.076
Tokenoured
Token ID8167
Feature activation-9.84e-01

Pos logit contributions
ho+0.006
iw+0.006
 yogurt+0.005
melon+0.005
 mun+0.005
Neg logit contributions
 radio-0.007
Marg-0.007
 Margaret-0.007
 flashlight-0.007
 aliens-0.007
Token cake
Token ID12187
Feature activation+4.04e-01

Pos logit contributions
iw+0.023
ho+0.021
chard+0.020
 mun+0.019
melon+0.019
Neg logit contributions
 radio-0.029
 wool-0.028
 flashlight-0.027
 Margaret-0.027
 extingu-0.026
Token.
Token ID13
Feature activation+1.75e-02

Pos logit contributions
 radio+0.011
Marg+0.011
 Margaret+0.011
 flashlight+0.010
 aliens+0.010
Neg logit contributions
iw-0.010
ho-0.010
 yogurt-0.010
 mun-0.009
melon-0.009
Token The
Token ID383
Feature activation+1.74e-01

Pos logit contributions
 radio+0.001
 Margaret+0.000
Marg+0.000
 flashlight+0.000
 wool+0.000
Neg logit contributions
iw-0.000
ho-0.000
 yogurt-0.000
chard-0.000
 mun-0.000
Token need
Token ID761
Feature activation+7.21e-01

Pos logit contributions
 radio+0.013
Marg+0.013
 Margaret+0.013
 extingu+0.013
 grandparents+0.013
Neg logit contributions
 yogurt-0.007
iw-0.007
 banana-0.007
ho-0.007
 he-0.007
Token.
Token ID13
Feature activation+1.86e-01

Pos logit contributions
 radio+0.020
 Margaret+0.019
Marg+0.019
 flashlight+0.019
 extingu+0.018
Neg logit contributions
iw-0.018
 yogurt-0.017
ho-0.017
 banana-0.016
chard-0.016
Token Even
Token ID3412
Feature activation+4.66e-01

Pos logit contributions
 radio+0.005
 Margaret+0.005
Marg+0.005
 flashlight+0.005
 wool+0.005
Neg logit contributions
iw-0.005
ho-0.004
 yogurt-0.004
 mun-0.004
chard-0.004
Token the
Token ID262
Feature activation-3.08e-01

Pos logit contributions
 radio+0.015
 Margaret+0.014
 wool+0.014
 flashlight+0.014
Marg+0.014
Neg logit contributions
iw-0.010
ho-0.010
 mun-0.009
 yogurt-0.008
melon-0.008
Token t
Token ID256
Feature activation+7.71e-01

Pos logit contributions
iw+0.008
ho+0.008
 yogurt+0.007
 banana+0.007
chard+0.007
Neg logit contributions
 radio-0.008
 Margaret-0.008
Marg-0.008
 flashlight-0.007
 wool-0.007
Tokenini
Token ID5362
Feature activation+2.72e+00

Pos logit contributions
 radio+0.013
 Margaret+0.012
 flashlight+0.012
Marg+0.012
 wool+0.012
Neg logit contributions
iw-0.027
ho-0.027
 yogurt-0.026
chard-0.025
 mun-0.025
Tokenest
Token ID395
Feature activation+3.18e-01

Pos logit contributions
 radio+0.073
 Margaret+0.069
Marg+0.068
 flashlight+0.067
 wool+0.066
Neg logit contributions
iw-0.071
ho-0.069
 yogurt-0.065
melon-0.063
 mun-0.063
Token creature
Token ID7185
Feature activation+2.53e-01

Pos logit contributions
 radio+0.009
 Margaret+0.008
Marg+0.008
 flashlight+0.008
 wool+0.008
Neg logit contributions
iw-0.008
ho-0.008
 yogurt-0.008
 mun-0.007
melon-0.007
Token needs
Token ID2476
Feature activation-1.22e-01

Pos logit contributions
 radio+0.008
 Margaret+0.007
Marg+0.007
 flashlight+0.007
 wool+0.007
Neg logit contributions
iw-0.005
ho-0.005
 yogurt-0.005
 mun-0.005
melon-0.005
Token our
Token ID674
Feature activation+1.82e-01

Pos logit contributions
iw+0.003
ho+0.003
 yogurt+0.003
 mun+0.002
 banana+0.002
Neg logit contributions
 radio-0.004
 Margaret-0.003
Marg-0.003
 flashlight-0.003
 grandparents-0.003
Token help
Token ID1037
Feature activation-4.06e-01

Pos logit contributions
 radio+0.004
 Margaret+0.004
Marg+0.004
 flashlight+0.004
 extingu+0.004
Neg logit contributions
iw-0.005
ho-0.005
 yogurt-0.005
 mun-0.005
 banana-0.005
Token.
Token ID13
Feature activation+1.15e-01

Pos logit contributions
 radio+0.012
 Margaret+0.012
Marg+0.012
 flashlight+0.012
 grandparents+0.011
Neg logit contributions
ho-0.012
iw-0.012
 yogurt-0.011
chard-0.011
melon-0.011
Token\n
Token ID198
Feature activation+1.50e-01

Pos logit contributions
 radio+0.003
 Margaret+0.003
Marg+0.003
 flashlight+0.003
 grandparents+0.003
Neg logit contributions
iw-0.003
ho-0.003
 mun-0.003
chard-0.003
 yogurt-0.003
Token\n
Token ID198
Feature activation+1.91e-01

Pos logit contributions
 radio+0.003
 Margaret+0.003
Marg+0.003
 flashlight+0.003
 grandparents+0.003
Neg logit contributions
iw-0.005
ho-0.004
chard-0.004
 yogurt-0.004
 mun-0.004
Token"
Token ID1
Feature activation-2.22e-01

Pos logit contributions
Marg+0.004
 radio+0.004
 Margaret+0.004
 grandparents+0.004
 parent+0.003
Neg logit contributions
iw-0.006
 mun-0.006
chard-0.006
melon-0.006
 yogurt-0.006
TokenRE
Token ID2200
Feature activation-4.43e-01

Pos logit contributions
iw+0.005
ho+0.005
 yogurt+0.005
chard+0.004
melon+0.004
Neg logit contributions
 radio-0.007
 Margaret-0.007
 flashlight-0.006
 wool-0.006
Marg-0.006
TokenVERS
Token ID28884
Feature activation-3.15e+00

Pos logit contributions
iw+0.012
ho+0.012
 yogurt+0.011
 mun+0.011
 banana+0.011
Neg logit contributions
 radio-0.011
 Margaret-0.011
 flashlight-0.010
Marg-0.010
 wool-0.010
TokenE
Token ID36
Feature activation-1.46e-01

Pos logit contributions
 mun+0.089
 mango+0.089
 yogurt+0.088
ipop+0.087
chard+0.086
Neg logit contributions
 radio-0.070
Marg-0.068
 wool-0.061
 extingu-0.061
 aliens-0.060
Token,
Token ID11
Feature activation+1.40e-01

Pos logit contributions
iw+0.004
ho+0.003
chard+0.003
 yogurt+0.003
 mun+0.003
Neg logit contributions
 radio-0.004
 Margaret-0.004
 wool-0.004
Marg-0.004
 flashlight-0.004
Token huh
Token ID24926
Feature activation+3.41e-01

Pos logit contributions
 radio+0.004
 Margaret+0.004
 flashlight+0.004
 wool+0.004
 grandparents+0.004
Neg logit contributions
iw-0.003
ho-0.003
chard-0.003
 mun-0.003
melon-0.003
Token?
Token ID30
Feature activation+5.09e-01

Pos logit contributions
 radio+0.009
 Margaret+0.008
Marg+0.008
 wool+0.008
 flashlight+0.008
Neg logit contributions
iw-0.009
ho-0.009
 yogurt-0.009
melon-0.008
 banana-0.008
Token That
Token ID1320
Feature activation+1.05e+00

Pos logit contributions
 radio+0.015
 Margaret+0.014
 flashlight+0.013
Marg+0.013
 extingu+0.013
Neg logit contributions
iw-0.012
ho-0.012
 yogurt-0.011
 mun-0.011
chard-0.011
Token sign
Token ID1051
Feature activation-6.13e-01

Pos logit contributions
ho+0.004
 banana+0.004
iw+0.004
 yogurt+0.004
 mango+0.004
Neg logit contributions
Marg-0.005
 Margaret-0.005
 aliens-0.004
 radio-0.004
 grandparents-0.004
Token that
Token ID326
Feature activation-5.84e-01

Pos logit contributions
iw+0.014
ho+0.013
 mun+0.012
 yogurt+0.012
chard+0.012
Neg logit contributions
 radio-0.020
 Margaret-0.018
 flashlight-0.018
 wool-0.018
Marg-0.018
Token said
Token ID531
Feature activation-2.56e-01

Pos logit contributions
ho+0.010
iw+0.010
 yogurt+0.009
melon+0.009
chard+0.009
Neg logit contributions
 radio-0.020
 Margaret-0.019
Marg-0.019
 wool-0.019
 fault-0.019
Token "
Token ID366
Feature activation-8.44e-01

Pos logit contributions
ho+0.005
iw+0.005
 yogurt+0.005
melon+0.004
 banana+0.004
Neg logit contributions
 radio-0.008
Marg-0.008
 Margaret-0.008
 wool-0.008
 flashlight-0.008
TokenRE
Token ID2200
Feature activation-3.87e-01

Pos logit contributions
iw+0.020
ho+0.020
 yogurt+0.018
 mun+0.018
chard+0.018
Neg logit contributions
 radio-0.025
 Margaret-0.023
 flashlight-0.023
 wool-0.023
 grandparents-0.022
TokenVERS
Token ID28884
Feature activation-2.97e+00

Pos logit contributions
iw+0.010
ho+0.010
 mun+0.009
 yogurt+0.009
 banana+0.009
Neg logit contributions
 radio-0.010
 Margaret-0.010
 flashlight-0.009
Marg-0.009
 wool-0.009
TokenE
Token ID36
Feature activation-2.78e-01

Pos logit contributions
 yogurt+0.084
iw+0.083
 mun+0.082
ipop+0.080
 mango+0.080
Neg logit contributions
 radio-0.067
Marg-0.066
 Margaret-0.060
 wool-0.059
 extingu-0.058
Token".
Token ID1911
Feature activation-2.51e-01

Pos logit contributions
iw+0.008
ho+0.008
melon+0.007
 yogurt+0.007
 mun+0.007
Neg logit contributions
 radio-0.007
 Margaret-0.007
 flashlight-0.007
Marg-0.007
 grandparents-0.007
Token L
Token ID406
Feature activation-4.50e-01

Pos logit contributions
iw+0.006
ho+0.006
 yogurt+0.006
chard+0.006
 mun+0.006
Neg logit contributions
 radio-0.007
 Margaret-0.007
 flashlight-0.006
Marg-0.006
 wool-0.006
Tokenila
Token ID10102
Feature activation-2.48e-01

Pos logit contributions
iw+0.008
ho+0.008
chard+0.007
 yogurt+0.007
 mun+0.007
Neg logit contributions
 radio-0.016
 Margaret-0.015
Marg-0.015
 flashlight-0.015
 wool-0.015
Token and
Token ID290
Feature activation-5.75e-01

Pos logit contributions
iw+0.007
ho+0.007
 yogurt+0.006
chard+0.006
melon+0.006
Neg logit contributions
 radio-0.006
 Margaret-0.006
Marg-0.006
 flashlight-0.006
 wool-0.006
Token When
Token ID1649
Feature activation+5.77e-02

Pos logit contributions
 radio+0.013
 Margaret+0.012
 wool+0.011
 flashlight+0.011
 prayers+0.011
Neg logit contributions
ho-0.015
iw-0.015
 mun-0.014
melon-0.014
chard-0.014
Token they
Token ID484
Feature activation-5.27e-01

Pos logit contributions
 radio+0.002
 Margaret+0.002
 flashlight+0.002
 wool+0.002
 grandparents+0.002
Neg logit contributions
iw-0.001
ho-0.001
 mun-0.001
melon-0.001
chard-0.001
Token got
Token ID1392
Feature activation+4.14e-02

Pos logit contributions
iw+0.014
ho+0.013
 mango+0.012
chard+0.012
melon+0.012
Neg logit contributions
 flashlight-0.012
 Margaret-0.012
 grandparents-0.011
 extingu-0.011
 radio-0.011
Token to
Token ID284
Feature activation+4.71e-02

Pos logit contributions
 radio+0.001
 Margaret+0.001
 grandparents+0.001
 flashlight+0.001
 wool+0.001
Neg logit contributions
iw-0.001
ho-0.001
 mun-0.001
 mango-0.001
 banana-0.001
Token the
Token ID262
Feature activation-2.47e-01

Pos logit contributions
 radio+0.002
Marg+0.002
 flashlight+0.002
 Margaret+0.002
 aliens+0.002
Neg logit contributions
iw-0.001
 yogurt-0.001
ho-0.001
chard-0.001
 mango-0.001
Token checkout
Token ID28006
Feature activation-2.86e+00

Pos logit contributions
iw+0.006
ho+0.006
 mun+0.006
ipop+0.006
 Fro+0.006
Neg logit contributions
 radio-0.007
 flashlight-0.006
 grandparents-0.006
 wool-0.006
 luggage-0.006
Token,
Token ID11
Feature activation+5.04e-01

Pos logit contributions
iw+0.088
 mun+0.083
ho+0.083
chard+0.082
 Fro+0.080
Neg logit contributions
 radio-0.062
 grandparents-0.059
 grandmother-0.057
 Margaret-0.055
 fault-0.052
Token Mom
Token ID11254
Feature activation-3.62e-01

Pos logit contributions
 radio+0.017
 Margaret+0.016
Marg+0.016
 wool+0.016
 flashlight+0.016
Neg logit contributions
iw-0.009
ho-0.009
 yogurt-0.009
chard-0.008
 banana-0.008
Tokenmy
Token ID1820
Feature activation+2.28e-01

Pos logit contributions
iw+0.009
ho+0.009
 yogurt+0.009
melon+0.008
 mun+0.008
Neg logit contributions
 radio-0.010
 Margaret-0.009
Marg-0.009
 flashlight-0.009
 wool-0.009
Token saw
Token ID2497
Feature activation+3.78e-02

Pos logit contributions
 radio+0.006
 Margaret+0.006
 flashlight+0.006
 grandparents+0.006
Marg+0.006
Neg logit contributions
iw-0.006
ho-0.005
chard-0.005
 yogurt-0.005
melon-0.005
Token the
Token ID262
Feature activation-1.06e-01

Pos logit contributions
 radio+0.001
 Margaret+0.001
 flashlight+0.001
Marg+0.001
 wool+0.001
Neg logit contributions
iw-0.001
ho-0.001
 yogurt-0.001
chard-0.001
melon-0.001
Token When
Token ID1649
Feature activation-5.41e-01

Pos logit contributions
iw+0.022
ho+0.021
chard+0.019
 mun+0.019
 yogurt+0.019
Neg logit contributions
 radio-0.025
 Margaret-0.024
 flashlight-0.023
 wool-0.023
 grandparents-0.022
Token they
Token ID484
Feature activation-7.78e-01

Pos logit contributions
iw+0.014
ho+0.013
chard+0.012
 yogurt+0.012
 mun+0.012
Neg logit contributions
 radio-0.015
 Margaret-0.014
 flashlight-0.014
 wool-0.014
Marg-0.014
Token got
Token ID1392
Feature activation-6.71e-02

Pos logit contributions
iw+0.022
ho+0.020
ro+0.020
chard+0.019
 Fro+0.018
Neg logit contributions
 flashlight-0.015
 grandparents-0.014
 extingu-0.014
 Margaret-0.014
 sense-0.014
Token to
Token ID284
Feature activation-2.08e-01

Pos logit contributions
ho+0.002
iw+0.002
ro+0.002
 mun+0.002
 mango+0.002
Neg logit contributions
 radio-0.001
 grandparents-0.001
 Margaret-0.001
 flashlight-0.001
 wool-0.001
Token the
Token ID262
Feature activation-3.32e-01

Pos logit contributions
iw+0.004
ho+0.004
 yogurt+0.004
chard+0.004
 mun+0.004
Neg logit contributions
 radio-0.007
 Margaret-0.007
 flashlight-0.006
Marg-0.006
 wool-0.006
Token checkout
Token ID28006
Feature activation-2.77e+00

Pos logit contributions
ho+0.008
iw+0.008
 mun+0.008
 Fro+0.007
ipop+0.007
Neg logit contributions
 radio-0.010
 flashlight-0.009
 wool-0.009
 grandparents-0.008
 extingu-0.008
Token,
Token ID11
Feature activation+3.38e-01

Pos logit contributions
iw+0.089
ho+0.084
chard+0.084
years+0.081
 mun+0.080
Neg logit contributions
 radio-0.049
 grandparents-0.046
 Margaret-0.043
 grandmother-0.043
 fault-0.043
Token Tim
Token ID5045
Feature activation+3.27e-01

Pos logit contributions
 radio+0.011
 Margaret+0.010
Marg+0.010
 wool+0.010
 flashlight+0.010
Neg logit contributions
iw-0.007
ho-0.007
 yogurt-0.007
chard-0.007
melon-0.006
Tokenmy
Token ID1820
Feature activation+5.90e-02

Pos logit contributions
 radio+0.008
 Margaret+0.007
 flashlight+0.007
Marg+0.007
 wool+0.007
Neg logit contributions
iw-0.010
ho-0.009
chard-0.009
 yogurt-0.009
melon-0.009
Token saw
Token ID2497
Feature activation-9.30e-02

Pos logit contributions
 grandparents+0.001
 Margaret+0.001
 radio+0.001
 grandmother+0.001
 flashlight+0.001
Neg logit contributions
ho-0.002
iw-0.002
chard-0.002
melon-0.002
ro-0.001
Token the
Token ID262
Feature activation-2.51e-01

Pos logit contributions
iw+0.003
ho+0.002
 mun+0.002
chard+0.002
 yogurt+0.002
Neg logit contributions
 radio-0.002
 Margaret-0.002
 wool-0.002
 flashlight-0.002
Marg-0.002
Token likes
Token ID7832
Feature activation-2.71e-01

Pos logit contributions
 radio+0.011
 Margaret+0.010
Marg+0.010
 flashlight+0.010
 extingu+0.010
Neg logit contributions
iw-0.008
ho-0.008
 yogurt-0.007
melon-0.007
 banana-0.007
Token honey
Token ID12498
Feature activation-5.47e-01

Pos logit contributions
iw+0.006
ho+0.006
chard+0.006
 mun+0.006
melon+0.005
Neg logit contributions
 radio-0.008
 wool-0.008
 Margaret-0.007
 flashlight-0.007
Marg-0.007
Token and
Token ID290
Feature activation-3.06e-01

Pos logit contributions
iw+0.013
ho+0.012
 yogurt+0.012
 banana+0.012
chard+0.011
Neg logit contributions
 radio-0.016
 Margaret-0.015
Marg-0.015
 flashlight-0.015
 grandparents-0.014
Token Ben
Token ID3932
Feature activation-2.17e-01

Pos logit contributions
iw+0.007
ho+0.007
chard+0.006
 mun+0.006
melon+0.006
Neg logit contributions
 radio-0.009
 wool-0.009
 Margaret-0.009
 flashlight-0.008
Marg-0.008
Token likes
Token ID7832
Feature activation-7.19e-02

Pos logit contributions
iw+0.005
ho+0.005
 yogurt+0.005
 banana+0.005
chard+0.005
Neg logit contributions
 radio-0.006
 Margaret-0.006
Marg-0.006
 flashlight-0.005
 extingu-0.005
Token salsa
Token ID49529
Feature activation-2.76e+00

Pos logit contributions
iw+0.002
ho+0.002
chard+0.002
 mun+0.002
melon+0.002
Neg logit contributions
 radio-0.002
 wool-0.002
 Margaret-0.002
 flashlight-0.002
Marg-0.002
Token.
Token ID13
Feature activation+3.74e-02

Pos logit contributions
iw+0.084
ho+0.082
 mun+0.079
chard+0.078
 mango+0.075
Neg logit contributions
 radio-0.060
 wool-0.055
 Margaret-0.052
 flashlight-0.052
 extingu-0.051
Token They
Token ID1119
Feature activation-2.85e-01

Pos logit contributions
 radio+0.001
 Margaret+0.001
 flashlight+0.001
Marg+0.001
 wool+0.001
Neg logit contributions
iw-0.001
ho-0.001
 yogurt-0.001
chard-0.001
 mun-0.001
Token are
Token ID389
Feature activation+3.63e-01

Pos logit contributions
iw+0.008
ho+0.008
 yogurt+0.007
 mun+0.007
chard+0.007
Neg logit contributions
 radio-0.007
 Margaret-0.007
 flashlight-0.006
Marg-0.006
 wool-0.006
Token both
Token ID1111
Feature activation-1.33e-01

Pos logit contributions
 radio+0.010
 Margaret+0.009
 wool+0.009
 flashlight+0.009
Marg+0.009
Neg logit contributions
iw-0.010
ho-0.009
 yogurt-0.009
 mun-0.009
chard-0.008
Token sweet
Token ID6029
Feature activation+7.20e-02

Pos logit contributions
iw+0.004
ho+0.004
chard+0.003
 mun+0.003
 yogurt+0.003
Neg logit contributions
 radio-0.003
 wool-0.003
 Margaret-0.003
 flashlight-0.003
 extingu-0.003
Token!
Token ID0
Feature activation-4.56e-01

Pos logit contributions
iw+0.052
ho+0.050
 yogurt+0.048
chard+0.047
 mun+0.047
Neg logit contributions
 radio-0.050
 Margaret-0.047
 flashlight-0.046
 wool-0.045
Marg-0.045
Token But
Token ID887
Feature activation-8.49e-02

Pos logit contributions
iw+0.010
ho+0.010
 yogurt+0.009
chard+0.009
 mun+0.009
Neg logit contributions
 radio-0.014
 Margaret-0.013
Marg-0.013
 wool-0.013
 flashlight-0.013
Token it
Token ID340
Feature activation-1.34e-02

Pos logit contributions
iw+0.002
ho+0.002
 yogurt+0.002
 mango+0.002
 banana+0.002
Neg logit contributions
 radio-0.003
 Margaret-0.003
 flashlight-0.002
 wool-0.002
 grandparents-0.002
Token wasn
Token ID2492
Feature activation+1.23e-01

Pos logit contributions
iw+0.000
ho+0.000
 yogurt+0.000
 mun+0.000
chard+0.000
Neg logit contributions
 radio-0.000
 Margaret-0.000
Marg-0.000
 flashlight-0.000
 wool-0.000
Token't
Token ID470
Feature activation+1.39e-01

Pos logit contributions
 radio+0.004
 Margaret+0.004
 flashlight+0.004
 wool+0.004
Marg+0.004
Neg logit contributions
ho-0.002
iw-0.002
melon-0.002
 yogurt-0.002
chard-0.002
Token able
Token ID1498
Feature activation-2.75e+00

Pos logit contributions
 radio+0.005
 grandparents+0.004
Marg+0.004
 Margaret+0.004
 flashlight+0.004
Neg logit contributions
 banana-0.003
iw-0.003
 he-0.003
chard-0.003
 yogurt-0.003
Token to
Token ID284
Feature activation+2.61e-01

Pos logit contributions
iw+0.046
ho+0.044
 yogurt+0.041
 mun+0.039
 banana+0.039
Neg logit contributions
 radio-0.100
 Margaret-0.097
Marg-0.095
 flashlight-0.094
 wool-0.093
Token move
Token ID1445
Feature activation-3.42e-01

Pos logit contributions
 radio+0.008
 flashlight+0.007
 wool+0.007
 Margaret+0.007
 extingu+0.007
Neg logit contributions
iw-0.006
ho-0.006
 yogurt-0.005
chard-0.005
 mun-0.005
Token much
Token ID881
Feature activation+2.04e-01

Pos logit contributions
iw+0.008
ho+0.008
 yogurt+0.008
chard+0.007
melon+0.007
Neg logit contributions
 radio-0.010
 Margaret-0.009
Marg-0.009
 flashlight-0.009
 grandparents-0.009
Token because
Token ID780
Feature activation+4.88e-01

Pos logit contributions
 radio+0.006
 wool+0.005
 flashlight+0.005
 extingu+0.005
 aliens+0.005
Neg logit contributions
iw-0.005
ho-0.005
chard-0.005
 yogurt-0.005
 mun-0.005
Token the
Token ID262
Feature activation+1.47e-01

Pos logit contributions
 radio+0.017
 Margaret+0.017
Marg+0.016
 flashlight+0.016
 wool+0.016
Neg logit contributions
iw-0.008
ho-0.008
 yogurt-0.007
chard-0.007
melon-0.007
Token toys
Token ID14958
Feature activation-6.17e-01

Pos logit contributions
Marg+0.000
 Margaret+0.000
 radio+0.000
 extingu+0.000
Karl+0.000
Neg logit contributions
 yogurt-0.000
iw-0.000
 l-0.000
 banana-0.000
ipop-0.000
Token she
Token ID673
Feature activation-4.56e-01

Pos logit contributions
iw+0.014
ho+0.013
 yogurt+0.012
 mun+0.012
chard+0.012
Neg logit contributions
 radio-0.019
 Margaret-0.018
 flashlight-0.018
Marg-0.018
 wool-0.018
Token wouldn
Token ID3636
Feature activation+1.48e-01

Pos logit contributions
iw+0.010
ho+0.009
 mun+0.009
 yogurt+0.009
chard+0.008
Neg logit contributions
 radio-0.015
 Margaret-0.014
Marg-0.014
 extingu-0.014
 flashlight-0.014
Token't
Token ID470
Feature activation+2.96e-01

Pos logit contributions
 radio+0.005
 Margaret+0.005
 flashlight+0.005
 wool+0.005
Marg+0.005
Neg logit contributions
ho-0.003
iw-0.002
 yogurt-0.002
melon-0.002
 banana-0.002
Token be
Token ID307
Feature activation-7.91e-01

Pos logit contributions
 radio+0.009
 Margaret+0.009
Marg+0.009
 flashlight+0.008
 grandparents+0.008
Neg logit contributions
ho-0.006
iw-0.006
 yogurt-0.006
melon-0.006
 mun-0.006
Token able
Token ID1498
Feature activation-2.74e+00

Pos logit contributions
iw+0.019
ho+0.018
 yogurt+0.018
chard+0.017
 mun+0.017
Neg logit contributions
 radio-0.023
 Margaret-0.022
 flashlight-0.021
Marg-0.021
 wool-0.021
Token to
Token ID284
Feature activation+3.39e-01

Pos logit contributions
iw+0.044
ho+0.042
 yogurt+0.040
melon+0.037
 mun+0.037
Neg logit contributions
 radio-0.101
 Margaret-0.097
Marg-0.095
 flashlight-0.094
 extingu-0.093
Token play
Token ID711
Feature activation+9.31e-02

Pos logit contributions
 radio+0.010
 Margaret+0.009
Marg+0.009
 flashlight+0.009
 wool+0.009
Neg logit contributions
iw-0.008
ho-0.008
 yogurt-0.007
chard-0.007
 mun-0.007
Token with
Token ID351
Feature activation+1.28e-01

Pos logit contributions
 radio+0.004
 Margaret+0.004
Marg+0.003
 flashlight+0.003
 extingu+0.003
Neg logit contributions
iw-0.001
 yogurt-0.001
ho-0.001
chard-0.001
 banana-0.001
Token her
Token ID607
Feature activation-9.73e-02

Pos logit contributions
 radio+0.004
 Margaret+0.004
Marg+0.003
 flashlight+0.003
 extingu+0.003
Neg logit contributions
iw-0.003
 yogurt-0.003
ho-0.003
chard-0.003
 banana-0.003
Token dress
Token ID6576
Feature activation+1.17e-01

Pos logit contributions
 yogurt+0.002
iw+0.002
ho+0.002
 l+0.002
 banana+0.002
Neg logit contributions
Marg-0.003
 Margaret-0.003
 extingu-0.003
 radio-0.003
 Karl-0.003
Token have
Token ID423
Feature activation+5.29e-01

Pos logit contributions
iw+0.006
ho+0.005
 yogurt+0.005
 banana+0.005
 mun+0.005
Neg logit contributions
 radio-0.005
 Margaret-0.005
Marg-0.005
 wool-0.005
 grandparents-0.005
Token four
Token ID1440
Feature activation-1.84e-01

Pos logit contributions
 radio+0.016
 Margaret+0.015
 flashlight+0.015
Marg+0.015
 wool+0.015
Neg logit contributions
iw-0.012
ho-0.011
 yogurt-0.011
 mun-0.011
chard-0.011
Token fingers
Token ID9353
Feature activation-2.55e-01

Pos logit contributions
iw+0.005
ho+0.005
 yogurt+0.004
 banana+0.004
 mun+0.004
Neg logit contributions
 radio-0.005
 Margaret-0.005
Marg-0.004
 flashlight-0.004
 wool-0.004
Token.
Token ID13
Feature activation+1.21e-01

Pos logit contributions
iw+0.006
ho+0.006
 yogurt+0.006
melon+0.005
 banana+0.005
Neg logit contributions
 radio-0.007
 Margaret-0.007
 flashlight-0.007
Marg-0.007
 grandparents-0.007
Token Two
Token ID4930
Feature activation-6.76e-01

Pos logit contributions
 radio+0.004
 flashlight+0.004
 wool+0.003
 Margaret+0.003
 fault+0.003
Neg logit contributions
iw-0.003
 yogurt-0.002
ho-0.002
 banana-0.002
 mango-0.002
Token plus
Token ID5556
Feature activation-2.72e+00

Pos logit contributions
iw+0.016
ho+0.015
 yogurt+0.014
chard+0.014
 mun+0.014
Neg logit contributions
 radio-0.020
 Margaret-0.019
 flashlight-0.019
 wool-0.019
Marg-0.019
Token two
Token ID734
Feature activation-7.12e-02

Pos logit contributions
iw+0.072
 yogurt+0.072
 banana+0.069
 he+0.065
 mango+0.064
Neg logit contributions
 radio-0.068
 extingu-0.064
 flashlight-0.064
 Margaret-0.064
Marg-0.063
Token is
Token ID318
Feature activation-1.39e-01

Pos logit contributions
iw+0.002
ho+0.002
 yogurt+0.002
 banana+0.001
 mun+0.001
Neg logit contributions
 radio-0.002
 Margaret-0.002
Marg-0.002
 flashlight-0.002
 extingu-0.002
Token four
Token ID1440
Feature activation-3.08e-01

Pos logit contributions
iw+0.003
 yogurt+0.003
ho+0.003
chard+0.003
 banana+0.003
Neg logit contributions
 radio-0.004
 flashlight-0.004
Marg-0.004
 Margaret-0.004
 extingu-0.004
Token.
Token ID13
Feature activation+1.58e-01

Pos logit contributions
iw+0.008
ho+0.008
 yogurt+0.007
 mun+0.007
 banana+0.007
Neg logit contributions
 radio-0.008
 Margaret-0.008
 flashlight-0.008
Marg-0.008
 wool-0.008
Token See
Token ID4091
Feature activation-8.81e-01

Pos logit contributions
 flashlight+0.005
 radio+0.005
 wool+0.005
 fault+0.005
 Margaret+0.004
Neg logit contributions
iw-0.003
 yogurt-0.003
 banana-0.003
chard-0.003
 mango-0.003
Token
Token ID220
Feature activation-1.71e-02

Pos logit contributions
iw+0.010
 yogurt+0.010
 he+0.010
 mun+0.010
ho+0.010
Neg logit contributions
 radio-0.021
 aliens-0.020
Marg-0.020
 wool-0.020
 flashlight-0.020
Token
Token ID198
Feature activation-5.87e-02

Pos logit contributions
iw+0.000
 he+0.000
 yogurt+0.000
 Fro+0.000
melon+0.000
Neg logit contributions
 radio-0.001
 flashlight-0.001
 wool-0.001
 aliens-0.001
 Margaret-0.001
Token
Token ID198
Feature activation-1.13e-01

Pos logit contributions
iw+0.001
ho+0.001
 yogurt+0.001
 mun+0.001
 banana+0.001
Neg logit contributions
 radio-0.002
 fault-0.002
 Margaret-0.002
 flashlight-0.002
 aliens-0.002
TokenWhen
Token ID2215
Feature activation-5.55e-01

Pos logit contributions
iw+0.002
ho+0.002
 yogurt+0.002
 mun+0.002
 banana+0.002
Neg logit contributions
 radio-0.004
 fault-0.004
 Margaret-0.004
 flashlight-0.003
 aliens-0.003
Token the
Token ID262
Feature activation-6.19e-02

Pos logit contributions
 he+0.010
 yogurt+0.010
iw+0.009
ho+0.009
 mun+0.009
Neg logit contributions
 radio-0.019
Marg-0.018
 fault-0.018
 Margaret-0.018
 aliens-0.018
Token basket
Token ID7988
Feature activation-2.71e+00

Pos logit contributions
iw+0.002
ho+0.002
 yogurt+0.001
chard+0.001
 mun+0.001
Neg logit contributions
 radio-0.002
 Margaret-0.002
 flashlight-0.002
Marg-0.002
 wool-0.002
Token was
Token ID373
Feature activation-7.72e-01

Pos logit contributions
iw+0.080
chard+0.072
 mango+0.070
 yogurt+0.070
 mun+0.069
Neg logit contributions
 radio-0.059
 flashlight-0.057
 Margaret-0.055
 grandparents-0.055
 parent-0.053
Token full
Token ID1336
Feature activation-1.67e+00

Pos logit contributions
iw+0.020
ho+0.019
 yogurt+0.018
 mun+0.018
chard+0.017
Neg logit contributions
 radio-0.021
 Margaret-0.020
 wool-0.019
 flashlight-0.019
Marg-0.019
Token,
Token ID11
Feature activation+6.61e-01

Pos logit contributions
iw+0.038
ho+0.035
chard+0.034
 yogurt+0.032
 banana+0.031
Neg logit contributions
 radio-0.051
 Margaret-0.048
 wool-0.048
 extingu-0.047
 flashlight-0.047
Token it
Token ID340
Feature activation+4.41e-01

Pos logit contributions
 fault+0.025
Marg+0.024
 radio+0.024
rina+0.023
 Margaret+0.023
Neg logit contributions
 he-0.012
 yogurt-0.011
 banana-0.008
iw-0.008
chard-0.008
Token was
Token ID373
Feature activation-3.68e-01

Pos logit contributions
 radio+0.014
 Margaret+0.013
 flashlight+0.013
 wool+0.013
Marg+0.013
Neg logit contributions
iw-0.010
ho-0.009
 yogurt-0.009
chard-0.009
 mun-0.008
Token very
Token ID845
Feature activation-5.44e-01

Pos logit contributions
iw+0.007
ho+0.007
 yogurt+0.006
 mun+0.006
melon+0.006
Neg logit contributions
 radio-0.010
 Margaret-0.009
Marg-0.009
 wool-0.009
 flashlight-0.009
Token frustrated
Token ID14718
Feature activation-5.40e-01

Pos logit contributions
iw+0.016
ho+0.015
 yogurt+0.014
 mun+0.014
melon+0.014
Neg logit contributions
 radio-0.013
 Margaret-0.012
 parent-0.012
 true-0.012
 sense-0.012
Token at
Token ID379
Feature activation-1.74e-01

Pos logit contributions
iw+0.011
ho+0.011
 yogurt+0.010
 mun+0.009
chard+0.009
Neg logit contributions
 radio-0.018
 Margaret-0.017
Marg-0.016
 flashlight-0.016
 wool-0.016
Token not
Token ID407
Feature activation-6.19e-01

Pos logit contributions
iw+0.003
ho+0.003
 yogurt+0.003
 mun+0.003
chard+0.003
Neg logit contributions
 radio-0.006
 Margaret-0.006
 flashlight-0.006
Marg-0.006
 wool-0.006
Token being
Token ID852
Feature activation-9.20e-01

Pos logit contributions
iw+0.015
ho+0.015
 yogurt+0.014
chard+0.014
 mun+0.013
Neg logit contributions
 radio-0.018
 Margaret-0.017
Marg-0.016
 flashlight-0.016
 wool-0.016
Token able
Token ID1498
Feature activation-2.65e+00

Pos logit contributions
iw+0.023
ho+0.023
 yogurt+0.021
melon+0.021
 mun+0.021
Neg logit contributions
 radio-0.026
 Margaret-0.024
 wool-0.024
Marg-0.024
 flashlight-0.024
Token to
Token ID284
Feature activation+3.39e-01

Pos logit contributions
iw+0.045
ho+0.044
 yogurt+0.040
 mun+0.038
chard+0.038
Neg logit contributions
 radio-0.095
 Margaret-0.091
Marg-0.089
 flashlight-0.089
 wool-0.088
Token reach
Token ID3151
Feature activation-2.53e-01

Pos logit contributions
 radio+0.010
 Margaret+0.010
Marg+0.010
 wool+0.009
 flashlight+0.009
Neg logit contributions
iw-0.008
ho-0.007
 yogurt-0.007
chard-0.007
 mun-0.007
Token it
Token ID340
Feature activation-3.92e-01

Pos logit contributions
 yogurt+0.004
iw+0.004
ho+0.004
chard+0.004
 he+0.004
Neg logit contributions
 radio-0.009
Marg-0.008
 extingu-0.008
 Margaret-0.008
 grandparents-0.008
Token.
Token ID13
Feature activation+2.75e-01

Pos logit contributions
ho+0.008
iw+0.008
 yogurt+0.007
chard+0.007
 banana+0.007
Neg logit contributions
 radio-0.012
Marg-0.012
 Margaret-0.012
 grandparents-0.012
 extingu-0.012
Token
Token ID198
Feature activation+4.44e-01

Pos logit contributions
 radio+0.008
 Margaret+0.008
Marg+0.007
 flashlight+0.007
 grandparents+0.007
Neg logit contributions
iw-0.007
ho-0.006
 yogurt-0.006
 mun-0.006
chard-0.006
Token get
Token ID651
Feature activation+2.91e-01

Pos logit contributions
 radio+0.003
 Margaret+0.003
Marg+0.003
 flashlight+0.003
 grandparents+0.003
Neg logit contributions
iw-0.002
ho-0.002
 yogurt-0.002
 mun-0.002
chard-0.002
Token the
Token ID262
Feature activation-3.66e-01

Pos logit contributions
 wool+0.006
 radio+0.006
 flashlight+0.006
 parent+0.005
 Margaret+0.005
Neg logit contributions
board-0.009
iw-0.009
 salon-0.008
ro-0.008
ho-0.008
Token bread
Token ID8509
Feature activation-1.75e+00

Pos logit contributions
iw+0.005
ho+0.005
 yogurt+0.005
melon+0.004
chard+0.004
Neg logit contributions
 radio-0.014
 Margaret-0.014
Marg-0.013
 flashlight-0.013
 grandparents-0.013
Token.
Token ID13
Feature activation-2.66e-01

Pos logit contributions
iw+0.047
ho+0.043
 yogurt+0.042
 mun+0.042
chard+0.042
Neg logit contributions
 radio-0.046
 wool-0.043
 flashlight-0.043
 Margaret-0.042
 extingu-0.041
Token It
Token ID632
Feature activation+9.28e-02

Pos logit contributions
ho+0.008
iw+0.008
chard+0.007
 mun+0.007
melon+0.007
Neg logit contributions
 radio-0.006
 Margaret-0.006
 fault-0.006
 flashlight-0.005
Marg-0.005
Token nib
Token ID33272
Feature activation-2.64e+00

Pos logit contributions
 radio+0.002
 wool+0.002
 flashlight+0.002
 extingu+0.002
 aliens+0.002
Neg logit contributions
iw-0.002
 yogurt-0.002
 mango-0.002
chard-0.002
 salon-0.002
Tokenbled
Token ID9342
Feature activation-1.07e+00

Pos logit contributions
iw+0.072
 yogurt+0.065
 mun+0.064
 mango+0.064
ho+0.064
Neg logit contributions
 radio-0.066
 wool-0.064
 flashlight-0.061
 Margaret-0.060
Marg-0.060
Token her
Token ID607
Feature activation-2.19e-01

Pos logit contributions
iw+0.030
ho+0.028
 mun+0.027
chard+0.027
 yogurt+0.027
Neg logit contributions
 radio-0.026
 wool-0.025
 Margaret-0.024
 flashlight-0.024
 extingu-0.023
Token fingers
Token ID9353
Feature activation-4.09e-01

Pos logit contributions
iw+0.006
ho+0.005
 yogurt+0.005
chard+0.005
 mun+0.005
Neg logit contributions
 radio-0.006
 Margaret-0.006
 wool-0.006
 flashlight-0.006
Marg-0.006
Token and
Token ID290
Feature activation-7.23e-02

Pos logit contributions
iw+0.012
ho+0.011
 yogurt+0.010
 mun+0.010
chard+0.010
Neg logit contributions
 radio-0.010
 Margaret-0.009
 wool-0.009
 flashlight-0.009
Marg-0.009
Token made
Token ID925
Feature activation-5.74e-01

Pos logit contributions
iw+0.002
ho+0.002
 yogurt+0.001
chard+0.001
melon+0.001
Neg logit contributions
 radio-0.002
 wool-0.002
 flashlight-0.002
 Margaret-0.002
 extingu-0.002
Token sad
Token ID6507
Feature activation+4.23e-01

Pos logit contributions
 radio+0.001
 true+0.001
 flashlight+0.001
 grandparents+0.001
 Margaret+0.001
Neg logit contributions
iw-0.001
ears-0.001
 mun-0.001
ho-0.001
ro-0.001
Token that
Token ID326
Feature activation+3.36e-01

Pos logit contributions
 radio+0.014
 Margaret+0.013
 flashlight+0.013
 wool+0.013
 grandparents+0.013
Neg logit contributions
iw-0.009
ho-0.008
chard-0.007
 yogurt-0.007
 mun-0.007
Token he
Token ID339
Feature activation+1.99e-01

Pos logit contributions
 radio+0.013
 Margaret+0.013
 flashlight+0.012
 grandparents+0.012
 aliens+0.012
Neg logit contributions
iw-0.005
ho-0.004
 mun-0.004
melon-0.004
 yogurt-0.003
Token wasn
Token ID2492
Feature activation+5.71e-01

Pos logit contributions
 radio+0.006
 Margaret+0.005
 flashlight+0.005
 wool+0.005
Marg+0.005
Neg logit contributions
iw-0.005
ho-0.005
 yogurt-0.005
chard-0.004
melon-0.004
Token't
Token ID470
Feature activation-9.74e-02

Pos logit contributions
 radio+0.018
 Margaret+0.017
 flashlight+0.017
 wool+0.017
Marg+0.017
Neg logit contributions
iw-0.012
ho-0.012
 yogurt-0.011
chard-0.011
 mun-0.011
Token able
Token ID1498
Feature activation-2.60e+00

Pos logit contributions
iw+0.002
ho+0.002
 yogurt+0.002
chard+0.002
 banana+0.002
Neg logit contributions
 radio-0.003
 Margaret-0.003
 flashlight-0.003
Marg-0.003
 grandparents-0.003
Token to
Token ID284
Feature activation+5.17e-01

Pos logit contributions
iw+0.054
ho+0.052
 yogurt+0.047
chard+0.047
melon+0.046
Neg logit contributions
 radio-0.084
 Margaret-0.080
 flashlight-0.079
 wool-0.079
Marg-0.078
Token squeeze
Token ID21229
Feature activation-3.04e-01

Pos logit contributions
 radio+0.015
 Margaret+0.015
Marg+0.014
 flashlight+0.014
 wool+0.014
Neg logit contributions
iw-0.012
ho-0.012
 yogurt-0.011
 mun-0.011
chard-0.011
Token into
Token ID656
Feature activation+9.03e-01

Pos logit contributions
iw+0.007
ho+0.006
 yogurt+0.006
 mun+0.006
chard+0.006
Neg logit contributions
 radio-0.010
 Margaret-0.009
 flashlight-0.009
 wool-0.009
Marg-0.009
Token the
Token ID262
Feature activation-6.71e-02

Pos logit contributions
 radio+0.028
 Margaret+0.027
Marg+0.027
 grandparents+0.026
 extingu+0.026
Neg logit contributions
iw-0.019
 yogurt-0.018
ho-0.018
 banana-0.018
chard-0.017
Token frame
Token ID5739
Feature activation+2.54e-01

Pos logit contributions
iw+0.002
ho+0.002
 yogurt+0.001
chard+0.001
 mun+0.001
Neg logit contributions
 radio-0.002
 Margaret-0.002
Marg-0.002
 flashlight-0.002
 wool-0.002
Token the
Token ID262
Feature activation+1.14e-01

Pos logit contributions
 radio+0.009
 wool+0.009
 Margaret+0.009
 extingu+0.009
 flashlight+0.008
Neg logit contributions
ho-0.009
iw-0.009
ipop-0.008
 mun-0.008
chard-0.008
Token baskets
Token ID46530
Feature activation-1.23e+00

Pos logit contributions
 radio+0.003
 wool+0.003
 flashlight+0.003
 Margaret+0.003
 grandparents+0.003
Neg logit contributions
ho-0.003
iw-0.003
chard-0.002
 mun-0.002
melon-0.002
Token.
Token ID13
Feature activation-1.51e-01

Pos logit contributions
iw+0.037
 mango+0.036
 Fro+0.036
ho+0.034
 banana+0.034
Neg logit contributions
 extingu-0.022
 radio-0.022
 wool-0.020
 flashlight-0.020
 grandparents-0.020
Token When
Token ID1649
Feature activation-1.62e-01

Pos logit contributions
iw+0.003
ho+0.003
 yogurt+0.003
 mun+0.003
chard+0.003
Neg logit contributions
 radio-0.005
 flashlight-0.004
 Margaret-0.004
 wool-0.004
 aliens-0.004
Token the
Token ID262
Feature activation-8.92e-02

Pos logit contributions
iw+0.003
ho+0.003
chard+0.003
 yogurt+0.003
 mun+0.003
Neg logit contributions
 radio-0.006
 Margaret-0.005
Marg-0.005
 wool-0.005
 grandparents-0.005
Token basket
Token ID7988
Feature activation-2.59e+00

Pos logit contributions
iw+0.002
ho+0.002
chard+0.002
melon+0.002
 mun+0.002
Neg logit contributions
 radio-0.002
 wool-0.002
 flashlight-0.002
 Margaret-0.002
 grandparents-0.002
Token was
Token ID373
Feature activation-5.28e-01

Pos logit contributions
iw+0.082
chard+0.075
 mango+0.074
years+0.073
 mun+0.072
Neg logit contributions
 radio-0.046
 flashlight-0.045
 Margaret-0.043
 Tess-0.042
 grandparents-0.041
Token full
Token ID1336
Feature activation-1.58e+00

Pos logit contributions
iw+0.014
ho+0.014
 mun+0.012
 yogurt+0.012
 mango+0.012
Neg logit contributions
 radio-0.014
 Margaret-0.013
 wool-0.013
 flashlight-0.013
Marg-0.013
Token,
Token ID11
Feature activation+4.44e-01

Pos logit contributions
iw+0.039
chard+0.037
ho+0.037
 banana+0.033
 mun+0.033
Neg logit contributions
 radio-0.042
 wool-0.042
 Margaret-0.040
 extingu-0.040
 flashlight-0.039
Token she
Token ID673
Feature activation-3.84e-01

Pos logit contributions
 radio+0.015
 Margaret+0.015
 flashlight+0.014
Marg+0.014
 wool+0.014
Neg logit contributions
iw-0.008
ho-0.008
 yogurt-0.007
chard-0.007
 mun-0.007
Token went
Token ID1816
Feature activation-6.01e-01

Pos logit contributions
iw+0.010
ho+0.010
chard+0.009
melon+0.009
 mango+0.009
Neg logit contributions
 radio-0.010
 Margaret-0.009
 grandparents-0.009
 flashlight-0.009
 wool-0.009
TokenWhen
Token ID2215
Feature activation-5.84e-01

Pos logit contributions
iw+0.009
ho+0.008
 yogurt+0.008
chard+0.008
melon+0.008
Neg logit contributions
 radio-0.006
Marg-0.006
 Margaret-0.006
 wool-0.006
 flashlight-0.005
Token they
Token ID484
Feature activation-7.68e-01

Pos logit contributions
iw+0.016
ho+0.015
 yogurt+0.014
chard+0.014
 mun+0.014
Neg logit contributions
 radio-0.016
 Margaret-0.015
 flashlight-0.015
 wool-0.014
 grandparents-0.014
Token got
Token ID1392
Feature activation-1.47e-01

Pos logit contributions
iw+0.020
ho+0.017
chard+0.016
 banana+0.016
 mango+0.016
Neg logit contributions
 Margaret-0.019
 radio-0.019
 flashlight-0.019
 extingu-0.018
 grandparents-0.018
Token to
Token ID284
Feature activation-7.30e-02

Pos logit contributions
iw+0.004
ho+0.004
 mun+0.004
 mango+0.004
ro+0.004
Neg logit contributions
 radio-0.003
 Margaret-0.003
 grandparents-0.003
 flashlight-0.003
 wool-0.003
Token the
Token ID262
Feature activation-2.92e-01

Pos logit contributions
iw+0.001
 yogurt+0.001
chard+0.001
 mango+0.001
 banana+0.001
Neg logit contributions
 radio-0.003
Marg-0.003
 Margaret-0.003
 flashlight-0.003
 wool-0.003
Token checkout
Token ID28006
Feature activation-2.59e+00

Pos logit contributions
ho+0.006
iw+0.006
 mun+0.006
melon+0.005
ipop+0.005
Neg logit contributions
 radio-0.010
 wool-0.009
 grandparents-0.009
 flashlight-0.009
 Margaret-0.009
Token counter
Token ID3753
Feature activation-1.50e+00

Pos logit contributions
iw+0.074
ho+0.070
chard+0.069
 mun+0.069
 mango+0.066
Neg logit contributions
 radio-0.062
 grandparents-0.059
 Margaret-0.058
 grandmother-0.056
 flashlight-0.052
Token,
Token ID11
Feature activation+1.06e-01

Pos logit contributions
iw+0.037
ho+0.035
 mun+0.033
 mango+0.033
chard+0.033
Neg logit contributions
 radio-0.040
 Margaret-0.039
 grandparents-0.039
 flashlight-0.037
 grandmother-0.036
Token the
Token ID262
Feature activation+1.93e-01

Pos logit contributions
 radio+0.003
 Margaret+0.003
Marg+0.003
 flashlight+0.003
 wool+0.003
Neg logit contributions
iw-0.002
ho-0.002
 yogurt-0.002
chard-0.002
melon-0.002
Token cash
Token ID5003
Feature activation-1.82e+00

Pos logit contributions
 radio+0.005
 flashlight+0.004
 grandparents+0.004
 Margaret+0.004
 wool+0.004
Neg logit contributions
iw-0.006
ho-0.005
 mun-0.005
chard-0.005
melon-0.005
Tokenier
Token ID959
Feature activation-1.88e-01

Pos logit contributions
iw+0.055
ho+0.053
chard+0.051
 yogurt+0.050
 mun+0.050
Neg logit contributions
 radio-0.040
 Margaret-0.038
 flashlight-0.037
 extingu-0.036
Marg-0.036
Token her
Token ID607
Feature activation-2.34e-01

Pos logit contributions
iw+0.006
ho+0.006
 yogurt+0.006
 banana+0.006
 mun+0.006
Neg logit contributions
 radio-0.009
 grandparents-0.009
 Margaret-0.009
Marg-0.009
 aliens-0.008
Token.
Token ID13
Feature activation-1.23e-01

Pos logit contributions
ho+0.006
iw+0.006
 yogurt+0.005
 banana+0.005
 mun+0.005
Neg logit contributions
Marg-0.006
 Margaret-0.006
 radio-0.006
 aliens-0.006
 extingu-0.006
Token She
Token ID1375
Feature activation-4.85e-01

Pos logit contributions
iw+0.003
ho+0.003
 yogurt+0.003
chard+0.003
melon+0.003
Neg logit contributions
 radio-0.004
 Margaret-0.003
Marg-0.003
 wool-0.003
 flashlight-0.003
Token wasn
Token ID2492
Feature activation+2.24e-01

Pos logit contributions
iw+0.011
ho+0.010
 yogurt+0.010
chard+0.010
melon+0.010
Neg logit contributions
 radio-0.015
 Margaret-0.014
 flashlight-0.014
 wool-0.014
Marg-0.014
Token't
Token ID470
Feature activation-8.02e-03

Pos logit contributions
 radio+0.007
 Margaret+0.007
 flashlight+0.007
 wool+0.007
Marg+0.007
Neg logit contributions
iw-0.005
ho-0.004
 yogurt-0.004
chard-0.004
melon-0.004
Token able
Token ID1498
Feature activation-2.58e+00

Pos logit contributions
iw+0.000
 yogurt+0.000
 banana+0.000
 mun+0.000
chard+0.000
Neg logit contributions
 radio-0.000
 grandparents-0.000
 flashlight-0.000
 Margaret-0.000
 aliens-0.000
Token to
Token ID284
Feature activation+6.07e-01

Pos logit contributions
iw+0.041
ho+0.038
 yogurt+0.037
 mun+0.035
 banana+0.034
Neg logit contributions
 radio-0.096
 Margaret-0.092
Marg-0.090
 flashlight-0.089
 extingu-0.088
Token move
Token ID1445
Feature activation-4.10e-02

Pos logit contributions
 radio+0.019
 Margaret+0.018
Marg+0.018
 flashlight+0.017
 wool+0.017
Neg logit contributions
iw-0.013
ho-0.013
 yogurt-0.012
 mun-0.012
chard-0.012
Token as
Token ID355
Feature activation-5.55e-02

Pos logit contributions
iw+0.001
ho+0.001
 yogurt+0.001
chard+0.001
melon+0.001
Neg logit contributions
 radio-0.001
 Margaret-0.001
Marg-0.001
 flashlight-0.001
 grandparents-0.001
Token quickly
Token ID2952
Feature activation-6.63e-01

Pos logit contributions
iw+0.001
ho+0.001
 yogurt+0.001
chard+0.001
 mun+0.001
Neg logit contributions
 radio-0.002
 Margaret-0.001
 flashlight-0.001
Marg-0.001
 wool-0.001
Token as
Token ID355
Feature activation+2.06e-01

Pos logit contributions
iw+0.015
ho+0.014
chard+0.013
 yogurt+0.013
melon+0.013
Neg logit contributions
 radio-0.020
 Margaret-0.019
 wool-0.019
 flashlight-0.019
Marg-0.018
Token and
Token ID290
Feature activation+2.65e-01

Pos logit contributions
 radio+0.036
 Margaret+0.035
Marg+0.034
 flashlight+0.034
 grandparents+0.034
Neg logit contributions
iw-0.028
ho-0.028
 yogurt-0.026
 mun-0.025
chard-0.025
Token the
Token ID262
Feature activation+4.82e-01

Pos logit contributions
 radio+0.008
 Margaret+0.008
Marg+0.008
 flashlight+0.008
 wool+0.008
Neg logit contributions
iw-0.006
ho-0.005
 yogurt-0.005
chard-0.005
 mun-0.005
Token monster
Token ID9234
Feature activation+1.25e+00

Pos logit contributions
 radio+0.016
 flashlight+0.015
 aliens+0.015
 parent+0.014
 grandparents+0.014
Neg logit contributions
iw-0.011
ho-0.011
 mun-0.010
ipop-0.009
 Fro-0.009
Token wasn
Token ID2492
Feature activation+4.09e-01

Pos logit contributions
 radio+0.039
 Margaret+0.037
Marg+0.037
 flashlight+0.036
 wool+0.036
Neg logit contributions
iw-0.027
ho-0.026
 mun-0.024
 yogurt-0.024
chard-0.023
Token't
Token ID470
Feature activation+1.27e-01

Pos logit contributions
 radio+0.014
 flashlight+0.014
 Margaret+0.014
 wool+0.014
 extingu+0.013
Neg logit contributions
ho-0.007
iw-0.007
chard-0.006
melon-0.006
 yogurt-0.006
Token able
Token ID1498
Feature activation-2.57e+00

Pos logit contributions
 radio+0.004
 grandparents+0.004
 extingu+0.004
Marg+0.004
 flashlight+0.004
Neg logit contributions
 he-0.002
chard-0.002
 banana-0.002
iw-0.002
 mango-0.002
Token to
Token ID284
Feature activation+4.49e-01

Pos logit contributions
iw+0.035
ho+0.033
 yogurt+0.032
 banana+0.031
 mango+0.030
Neg logit contributions
 radio-0.100
 Margaret-0.097
Marg-0.095
 extingu-0.094
 flashlight-0.093
Token break
Token ID2270
Feature activation+3.48e-01

Pos logit contributions
 radio+0.014
 Margaret+0.014
Marg+0.013
 wool+0.013
 grandparents+0.013
Neg logit contributions
iw-0.010
ho-0.010
 yogurt-0.009
 banana-0.009
chard-0.009
Token it
Token ID340
Feature activation-1.55e-01

Pos logit contributions
 radio+0.012
 Margaret+0.012
Marg+0.012
 grandparents+0.012
 extingu+0.011
Neg logit contributions
iw-0.006
 yogurt-0.005
ho-0.005
 he-0.005
 banana-0.005
Token.
Token ID13
Feature activation-9.89e-02

Pos logit contributions
ho+0.004
iw+0.003
 yogurt+0.003
melon+0.003
 banana+0.003
Neg logit contributions
 radio-0.005
 Margaret-0.004
Marg-0.004
 grandparents-0.004
 flashlight-0.004
Token
Token ID198
Feature activation+1.60e-01

Pos logit contributions
iw+0.002
ho+0.002
 mun+0.002
 yogurt+0.002
chard+0.002
Neg logit contributions
 radio-0.003
 flashlight-0.003
 wool-0.003
 Margaret-0.003
 extingu-0.003
Token that
Token ID326
Feature activation+5.13e-01

Pos logit contributions
iw+0.010
ho+0.009
 yogurt+0.009
 banana+0.008
chard+0.008
Neg logit contributions
 radio-0.012
 Margaret-0.011
 wool-0.011
 flashlight-0.011
 grandparents-0.011
Token accidents
Token ID17390
Feature activation+1.01e+00

Pos logit contributions
 radio+0.015
 Margaret+0.014
 flashlight+0.014
Marg+0.014
 wool+0.014
Neg logit contributions
iw-0.012
ho-0.012
 yogurt-0.011
chard-0.011
 mun-0.011
Token happen
Token ID1645
Feature activation+3.27e-01

Pos logit contributions
 Margaret+0.027
 radio+0.027
Marg+0.027
 grandparents+0.026
 nephew+0.026
Neg logit contributions
ho-0.025
 he-0.025
melon-0.025
 yogurt-0.024
iw-0.023
Token.
Token ID13
Feature activation-2.36e-02

Pos logit contributions
 radio+0.011
 Margaret+0.011
Marg+0.010
 extingu+0.010
 flashlight+0.010
Neg logit contributions
 he-0.006
iw-0.006
ho-0.006
 yogurt-0.006
melon-0.005
Token Ho
Token ID9544
Feature activation-8.85e-01

Pos logit contributions
iw+0.001
ho+0.001
 yogurt+0.001
chard+0.001
 banana+0.001
Neg logit contributions
 radio-0.001
 Margaret-0.001
Marg-0.001
 wool-0.001
 Tess-0.001
Tokenpper
Token ID2848
Feature activation-2.57e+00

Pos logit contributions
iw+0.018
ho+0.017
 mun+0.015
chard+0.015
 mango+0.014
Neg logit contributions
 radio-0.031
 Margaret-0.029
 flashlight-0.029
 grandparents-0.028
 aliens-0.028
Token learned
Token ID4499
Feature activation-8.35e-01

Pos logit contributions
iw+0.070
chard+0.064
ho+0.064
 yogurt+0.062
 mango+0.059
Neg logit contributions
 radio-0.065
 Margaret-0.062
 flashlight-0.062
 wool-0.061
 parent-0.059
Token his
Token ID465
Feature activation-6.23e-01

Pos logit contributions
iw+0.020
ho+0.019
 mun+0.017
chard+0.017
melon+0.017
Neg logit contributions
 radio-0.026
 wool-0.024
 Margaret-0.024
Marg-0.024
 flashlight-0.024
Token lesson
Token ID11483
Feature activation-6.51e-01

Pos logit contributions
iw+0.016
ho+0.016
 yogurt+0.016
 banana+0.015
 mango+0.015
Neg logit contributions
 radio-0.016
 Margaret-0.016
Marg-0.015
 extingu-0.015
 flashlight-0.014
Token and
Token ID290
Feature activation-8.18e-01

Pos logit contributions
iw+0.021
chard+0.019
ho+0.019
 yogurt+0.019
 mun+0.018
Neg logit contributions
 radio-0.015
 Margaret-0.014
 wool-0.013
Marg-0.013
 flashlight-0.013
Token promised
Token ID8072
Feature activation-3.93e-01

Pos logit contributions
iw+0.021
ho+0.020
chard+0.019
 yogurt+0.019
 banana+0.019
Neg logit contributions
 radio-0.022
 Margaret-0.021
 flashlight-0.020
 wool-0.020
 grandparents-0.020
Token away
Token ID1497
Feature activation+8.63e-01

Pos logit contributions
iw+0.001
ho+0.001
 yogurt+0.001
chard+0.001
melon+0.001
Neg logit contributions
 radio-0.001
 Margaret-0.001
Marg-0.001
 flashlight-0.001
 wool-0.001
Token.
Token ID13
Feature activation+1.11e-01

Pos logit contributions
 radio+0.022
 Margaret+0.021
 flashlight+0.020
Marg+0.020
 wool+0.020
Neg logit contributions
iw-0.023
ho-0.023
 yogurt-0.022
chard-0.021
 mun-0.021
Token Tom
Token ID4186
Feature activation+8.84e-02

Pos logit contributions
 radio+0.003
 Margaret+0.003
 fault+0.003
 grandparents+0.003
Marg+0.003
Neg logit contributions
iw-0.003
ho-0.003
chard-0.003
melon-0.003
 mun-0.003
Token wasn
Token ID2492
Feature activation+1.54e-01

Pos logit contributions
 radio+0.003
 Margaret+0.002
 wool+0.002
Marg+0.002
 flashlight+0.002
Neg logit contributions
iw-0.002
ho-0.002
 yogurt-0.002
chard-0.002
melon-0.002
Token't
Token ID470
Feature activation-4.33e-02

Pos logit contributions
 radio+0.005
 Margaret+0.005
 flashlight+0.004
Marg+0.004
 wool+0.004
Neg logit contributions
iw-0.003
ho-0.003
 yogurt-0.003
chard-0.003
 mun-0.003
Token able
Token ID1498
Feature activation-2.56e+00

Pos logit contributions
iw+0.001
chard+0.001
 yogurt+0.001
ho+0.001
 banana+0.001
Neg logit contributions
 radio-0.001
 flashlight-0.001
 grandparents-0.001
 wool-0.001
 Margaret-0.001
Token to
Token ID284
Feature activation+6.84e-01

Pos logit contributions
iw+0.046
ho+0.044
 yogurt+0.041
 mun+0.039
chard+0.039
Neg logit contributions
 radio-0.090
 Margaret-0.086
Marg-0.084
 flashlight-0.084
 wool-0.084
Token keep
Token ID1394
Feature activation+4.05e-01

Pos logit contributions
 radio+0.021
 Margaret+0.020
Marg+0.019
 flashlight+0.019
 wool+0.019
Neg logit contributions
iw-0.015
ho-0.015
 yogurt-0.014
 mun-0.014
chard-0.014
Token up
Token ID510
Feature activation-9.45e-02

Pos logit contributions
 radio+0.012
 Margaret+0.011
Marg+0.011
 flashlight+0.011
 extingu+0.011
Neg logit contributions
iw-0.009
ho-0.009
 yogurt-0.009
chard-0.009
 banana-0.008
Token with
Token ID351
Feature activation+1.49e-02

Pos logit contributions
iw+0.002
ho+0.002
 yogurt+0.002
chard+0.002
 mun+0.002
Neg logit contributions
 radio-0.003
 Margaret-0.003
 flashlight-0.003
Marg-0.003
 wool-0.003
Token the
Token ID262
Feature activation-3.27e-01

Pos logit contributions
 radio+0.000
 Margaret+0.000
 flashlight+0.000
 wool+0.000
 grandparents+0.000
Neg logit contributions
iw-0.000
ho-0.000
 mun-0.000
melon-0.000
chard-0.000
Token\n
Token ID198
Feature activation+1.26e-01

Pos logit contributions
 radio+0.007
 flashlight+0.006
 wool+0.006
 extingu+0.006
 Margaret+0.006
Neg logit contributions
iw-0.006
ho-0.006
 yogurt-0.005
chard-0.005
 banana-0.005
TokenWhen
Token ID2215
Feature activation-1.01e-01

Pos logit contributions
 radio+0.004
 flashlight+0.004
 Margaret+0.004
 wool+0.004
 extingu+0.004
Neg logit contributions
iw-0.003
ho-0.003
 yogurt-0.002
chard-0.002
 mun-0.002
Token they
Token ID484
Feature activation-5.01e-01

Pos logit contributions
iw+0.002
ho+0.002
 yogurt+0.002
chard+0.002
 mun+0.002
Neg logit contributions
 radio-0.003
 flashlight-0.003
 extingu-0.003
 wool-0.003
Marg-0.003
Token reached
Token ID4251
Feature activation-5.78e-01

Pos logit contributions
iw+0.013
ho+0.012
 banana+0.011
chard+0.011
 mango+0.011
Neg logit contributions
 Margaret-0.012
 flashlight-0.012
 grandparents-0.011
 radio-0.011
 extingu-0.011
Token the
Token ID262
Feature activation-6.36e-01

Pos logit contributions
iw+0.009
 yogurt+0.009
ho+0.008
chard+0.008
melon+0.008
Neg logit contributions
 radio-0.021
Marg-0.021
 Margaret-0.020
 flashlight-0.020
 aliens-0.020
Token checkout
Token ID28006
Feature activation-2.56e+00

Pos logit contributions
iw+0.017
ho+0.017
 mun+0.016
 Fro+0.015
chard+0.014
Neg logit contributions
 radio-0.017
 flashlight-0.016
 wool-0.015
 grandparents-0.015
 Margaret-0.015
Token,
Token ID11
Feature activation+2.01e-01

Pos logit contributions
iw+0.069
ho+0.065
 Fro+0.062
 mun+0.062
chard+0.061
Neg logit contributions
 radio-0.065
 grandparents-0.061
 Margaret-0.059
 grandmother-0.057
 extingu-0.056
Token Alice
Token ID14862
Feature activation-2.41e-01

Pos logit contributions
 fault+0.006
 radio+0.006
 extingu+0.006
 wool+0.005
Marg+0.005
Neg logit contributions
chard-0.005
 yogurt-0.004
ho-0.004
 he-0.004
iw-0.004
Token's
Token ID338
Feature activation-2.06e-01

Pos logit contributions
iw+0.006
ho+0.005
 yogurt+0.005
chard+0.005
melon+0.005
Neg logit contributions
 radio-0.007
 Margaret-0.007
Marg-0.006
 grandparents-0.006
 flashlight-0.006
Token mom
Token ID1995
Feature activation-5.86e-01

Pos logit contributions
iw+0.007
ho+0.007
 mun+0.006
chard+0.006
melon+0.006
Neg logit contributions
 radio-0.004
 grandparents-0.004
 flashlight-0.004
 wool-0.004
 Margaret-0.004
Token picked
Token ID6497
Feature activation+2.40e-01

Pos logit contributions
iw+0.013
ho+0.013
 yogurt+0.012
chard+0.012
melon+0.011
Neg logit contributions
 radio-0.018
 Margaret-0.017
Marg-0.017
 flashlight-0.016
 wool-0.016
Token"
Token ID1
Feature activation-6.81e-01

Pos logit contributions
 radio+0.003
 Margaret+0.002
 flashlight+0.002
 fault+0.002
 wool+0.002
Neg logit contributions
iw-0.002
ho-0.002
 yogurt-0.001
chard-0.001
melon-0.001
TokenYes
Token ID5297
Feature activation-7.49e-01

Pos logit contributions
iw+0.017
ho+0.017
 yogurt+0.016
chard+0.015
melon+0.015
Neg logit contributions
 radio-0.019
 Margaret-0.018
 flashlight-0.018
 wool-0.018
Marg-0.018
Token,
Token ID11
Feature activation-5.87e-01

Pos logit contributions
iw+0.013
ho+0.013
 yogurt+0.012
melon+0.011
 banana+0.011
Neg logit contributions
 radio-0.026
 Margaret-0.025
Marg-0.025
 flashlight-0.025
 wool-0.025
Token honey
Token ID12498
Feature activation-8.44e-01

Pos logit contributions
iw+0.013
ho+0.012
 yogurt+0.011
chard+0.011
 mun+0.011
Neg logit contributions
 radio-0.019
 Margaret-0.018
 flashlight-0.017
 wool-0.017
Marg-0.017
Token and
Token ID290
Feature activation-6.65e-01

Pos logit contributions
ho+0.020
iw+0.019
 yogurt+0.018
melon+0.018
 banana+0.017
Neg logit contributions
 radio-0.026
Marg-0.026
 Margaret-0.025
 grandparents-0.024
 flashlight-0.024
Token salsa
Token ID49529
Feature activation-2.54e+00

Pos logit contributions
iw+0.016
ho+0.016
chard+0.014
 mun+0.014
 yogurt+0.014
Neg logit contributions
 radio-0.019
 wool-0.018
 Margaret-0.018
 flashlight-0.017
Marg-0.017
Token.
Token ID13
Feature activation-8.54e-03

Pos logit contributions
iw+0.070
ho+0.067
chard+0.065
 mun+0.063
 yogurt+0.063
Neg logit contributions
 radio-0.063
 wool-0.058
 Margaret-0.058
 flashlight-0.057
 extingu-0.057
Token You
Token ID921
Feature activation-5.14e-01

Pos logit contributions
iw+0.000
ho+0.000
 yogurt+0.000
chard+0.000
 mun+0.000
Neg logit contributions
 radio-0.000
 Margaret-0.000
 wool-0.000
 flashlight-0.000
Marg-0.000
Token know
Token ID760
Feature activation-6.90e-01

Pos logit contributions
iw+0.013
ho+0.013
melon+0.012
 yogurt+0.012
 banana+0.012
Neg logit contributions
 radio-0.015
 Margaret-0.014
Marg-0.014
 grandparents-0.013
 wool-0.013
Token,
Token ID11
Feature activation+2.27e-01

Pos logit contributions
iw+0.017
ho+0.017
 yogurt+0.016
 mun+0.016
chard+0.016
Neg logit contributions
 radio-0.019
 Margaret-0.018
 wool-0.018
 flashlight-0.018
Marg-0.017
Token Lily
Token ID20037
Feature activation+3.54e-01

Pos logit contributions
 radio+0.007
 Margaret+0.006
 wool+0.006
 flashlight+0.006
Marg+0.006
Neg logit contributions
iw-0.005
ho-0.005
chard-0.005
 mun-0.005
 yogurt-0.005
Token frogs
Token ID37475
Feature activation+2.06e-02

Pos logit contributions
 radio+0.008
 Margaret+0.008
 flashlight+0.008
 wool+0.008
Marg+0.008
Neg logit contributions
iw-0.010
ho-0.010
 yogurt-0.009
chard-0.009
 mun-0.009
Token,
Token ID11
Feature activation-3.08e-01

Pos logit contributions
 radio+0.001
 Margaret+0.001
 flashlight+0.001
Marg+0.001
 wool+0.001
Neg logit contributions
iw-0.001
ho-0.001
 yogurt-0.000
 mun-0.000
chard-0.000
Token Big
Token ID4403
Feature activation+7.05e-01

Pos logit contributions
iw+0.008
ho+0.007
chard+0.007
 yogurt+0.007
 banana+0.007
Neg logit contributions
 radio-0.009
 fault-0.008
 Margaret-0.008
 grandparents-0.008
 flashlight-0.008
Token Frog
Token ID28328
Feature activation+7.48e-01

Pos logit contributions
 radio+0.019
 flashlight+0.019
Marg+0.019
 grandparents+0.019
 basement+0.018
Neg logit contributions
ho-0.017
iw-0.016
 Fro-0.016
 l-0.016
 mun-0.016
Token and
Token ID290
Feature activation+4.62e-02

Pos logit contributions
 radio+0.024
 Margaret+0.023
 flashlight+0.022
Marg+0.022
 grandparents+0.022
Neg logit contributions
ho-0.016
iw-0.016
 mun-0.014
melon-0.014
 yogurt-0.014
Token Little
Token ID7703
Feature activation+2.58e+00

Pos logit contributions
 radio+0.001
 fault+0.001
 grandparents+0.001
 flashlight+0.001
Marg+0.001
Neg logit contributions
ho-0.001
iw-0.001
 Fro-0.001
 mun-0.001
chard-0.001
Token Frog
Token ID28328
Feature activation+1.14e+00

Pos logit contributions
 radio+0.084
Marg+0.079
 flashlight+0.079
 Margaret+0.078
 grandparents+0.077
Neg logit contributions
ho-0.054
iw-0.053
 yogurt-0.049
 mun-0.049
 Fro-0.048
Token.
Token ID13
Feature activation+6.16e-01

Pos logit contributions
 radio+0.035
 Margaret+0.033
 flashlight+0.032
Marg+0.031
 grandparents+0.031
Neg logit contributions
ho-0.028
iw-0.027
 mun-0.024
melon-0.024
 yogurt-0.024
Token Big
Token ID4403
Feature activation-2.60e-01

Pos logit contributions
 fault+0.017
 radio+0.017
 flashlight+0.016
 alright+0.016
 grandparents+0.016
Neg logit contributions
 Fro-0.014
iw-0.013
ho-0.013
 Crab-0.013
 he-0.013
Token Frog
Token ID28328
Feature activation+2.16e-01

Pos logit contributions
iw+0.005
ho+0.005
 yogurt+0.005
melon+0.005
 mun+0.005
Neg logit contributions
 radio-0.008
 Margaret-0.008
Marg-0.008
 flashlight-0.008
 grandparents-0.008
Token was
Token ID373
Feature activation+1.80e-02

Pos logit contributions
 radio+0.007
 Margaret+0.006
 flashlight+0.006
Marg+0.006
 extingu+0.006
Neg logit contributions
iw-0.005
ho-0.005
 mun-0.004
 yogurt-0.004
melon-0.004
Token there
Token ID612
Feature activation+2.66e-03

Pos logit contributions
iw+0.005
ho+0.005
chard+0.004
 yogurt+0.004
melon+0.004
Neg logit contributions
 radio-0.006
 Margaret-0.005
Marg-0.005
 wool-0.005
 flashlight-0.005
Token was
Token ID373
Feature activation+4.42e-02

Pos logit contributions
 radio+0.000
 Margaret+0.000
Marg+0.000
 extingu+0.000
 wool+0.000
Neg logit contributions
ho-0.000
iw-0.000
 yogurt-0.000
 mun-0.000
 banana-0.000
Token an
Token ID281
Feature activation-5.88e-01

Pos logit contributions
 radio+0.001
 extingu+0.001
 flashlight+0.001
 Margaret+0.001
Marg+0.001
Neg logit contributions
 yogurt-0.001
iw-0.001
 banana-0.001
 mun-0.001
chard-0.001
Token ordinary
Token ID8850
Feature activation+2.45e-01

Pos logit contributions
iw+0.015
ho+0.015
 yogurt+0.014
 mun+0.014
chard+0.014
Neg logit contributions
 radio-0.016
 Margaret-0.015
 flashlight-0.014
 wool-0.014
Marg-0.014
Token ash
Token ID12530
Feature activation+1.34e+00

Pos logit contributions
 radio+0.007
Marg+0.007
 Margaret+0.007
 wool+0.006
 extingu+0.006
Neg logit contributions
iw-0.006
ho-0.006
 yogurt-0.006
 banana-0.006
melon-0.005
Tokentr
Token ID2213
Feature activation+2.43e+00

Pos logit contributions
 radio+0.022
 Margaret+0.020
Marg+0.020
 flashlight+0.019
 wool+0.019
Neg logit contributions
iw-0.049
ho-0.048
 yogurt-0.046
melon-0.045
chard-0.045
Tokenay
Token ID323
Feature activation+1.36e-02

Pos logit contributions
 radio+0.048
 Margaret+0.044
 wool+0.043
Marg+0.042
 flashlight+0.042
Neg logit contributions
iw-0.083
ho-0.080
 yogurt-0.077
melon-0.075
 banana-0.074
Token.
Token ID13
Feature activation+2.56e-01

Pos logit contributions
 radio+0.000
Marg+0.000
 Margaret+0.000
 grandparents+0.000
 flashlight+0.000
Neg logit contributions
ho-0.000
iw-0.000
 yogurt-0.000
melon-0.000
 banana-0.000
Token The
Token ID383
Feature activation+1.14e-01

Pos logit contributions
 radio+0.006
 Margaret+0.006
Marg+0.006
 flashlight+0.006
 wool+0.006
Neg logit contributions
iw-0.007
ho-0.007
 yogurt-0.007
chard-0.007
melon-0.007
Token ash
Token ID12530
Feature activation+8.15e-01

Pos logit contributions
 radio+0.003
 flashlight+0.003
 wool+0.003
 Margaret+0.002
 grandparents+0.002
Neg logit contributions
iw-0.003
ho-0.003
 mun-0.003
chard-0.003
 yogurt-0.003
Tokentr
Token ID2213
Feature activation+1.95e+00

Pos logit contributions
 radio+0.013
 Margaret+0.013
Marg+0.012
 grandparents+0.012
 wool+0.011
Neg logit contributions
ho-0.030
iw-0.029
 yogurt-0.028
 banana-0.028
melon-0.027
Token stretch
Token ID7539
Feature activation-3.28e-01

Pos logit contributions
iw+0.010
ho+0.010
 yogurt+0.009
 mun+0.009
melon+0.009
Neg logit contributions
 radio-0.012
 Margaret-0.011
Marg-0.011
 flashlight-0.011
 grandparents-0.011
Token out
Token ID503
Feature activation+1.81e-01

Pos logit contributions
ho+0.008
iw+0.007
 yogurt+0.007
melon+0.007
chard+0.007
Neg logit contributions
 radio-0.010
 Margaret-0.009
 grandparents-0.009
Marg-0.009
 flashlight-0.009
Token and
Token ID290
Feature activation+6.92e-03

Pos logit contributions
 radio+0.006
 Margaret+0.006
Marg+0.005
 flashlight+0.005
 grandparents+0.005
Neg logit contributions
iw-0.004
ho-0.004
 yogurt-0.003
 mun-0.003
chard-0.003
Token enormous
Token ID9812
Feature activation-3.96e-03

Pos logit contributions
 radio+0.000
 Margaret+0.000
Marg+0.000
 flashlight+0.000
 grandparents+0.000
Neg logit contributions
iw-0.000
ho-0.000
 yogurt-0.000
chard-0.000
 mun-0.000
Token sun
Token ID4252
Feature activation-1.72e-01

Pos logit contributions
iw+0.000
ho+0.000
 yogurt+0.000
 mun+0.000
melon+0.000
Neg logit contributions
 radio-0.000
 Margaret-0.000
Marg-0.000
 flashlight-0.000
 grandparents-0.000
Tokenfl
Token ID2704
Feature activation+2.88e+00

Pos logit contributions
ho+0.005
iw+0.004
melon+0.004
 yogurt+0.004
 mun+0.004
Neg logit contributions
 radio-0.004
Marg-0.004
 Margaret-0.004
 flashlight-0.004
 grandparents-0.004
Tokenowers
Token ID3618
Feature activation+6.10e-02

Pos logit contributions
 radio+0.071
 Margaret+0.066
 flashlight+0.065
 grandparents+0.063
Marg+0.062
Neg logit contributions
iw-0.087
ho-0.086
 mun-0.080
melon-0.079
 yogurt-0.078
Token would
Token ID561
Feature activation-4.72e-01

Pos logit contributions
 radio+0.002
 Margaret+0.002
Marg+0.002
 grandparents+0.002
 flashlight+0.002
Neg logit contributions
ho-0.001
iw-0.001
melon-0.001
 yogurt-0.001
 mun-0.001
Token show
Token ID905
Feature activation+2.76e-01

Pos logit contributions
ho+0.011
iw+0.011
 yogurt+0.010
 mun+0.010
melon+0.010
Neg logit contributions
 radio-0.014
 Margaret-0.013
Marg-0.013
 grandparents-0.013
 flashlight-0.013
Token their
Token ID511
Feature activation+3.07e-01

Pos logit contributions
 radio+0.008
 extingu+0.008
 flashlight+0.008
 Margaret+0.007
Marg+0.007
Neg logit contributions
ho-0.006
iw-0.006
 yogurt-0.006
 mun-0.006
melon-0.006
Token bright
Token ID6016
Feature activation-1.89e-01

Pos logit contributions
 radio+0.007
 Margaret+0.007
Marg+0.007
 extingu+0.007
 flashlight+0.006
Neg logit contributions
ho-0.009
iw-0.009
 yogurt-0.009
 mun-0.008
 banana-0.008
Token around
Token ID1088
Feature activation+1.85e-01

Pos logit contributions
iw+0.004
ho+0.004
 yogurt+0.004
chard+0.004
 mun+0.004
Neg logit contributions
 radio-0.006
 Margaret-0.006
Marg-0.006
 flashlight-0.006
 wool-0.006
Token on
Token ID319
Feature activation+2.58e-01

Pos logit contributions
 radio+0.006
 Margaret+0.006
Marg+0.006
 flashlight+0.006
 wool+0.006
Neg logit contributions
iw-0.004
ho-0.004
 yogurt-0.003
chard-0.003
 mun-0.003
Token a
Token ID257
Feature activation+2.54e-01

Pos logit contributions
 radio+0.008
 Margaret+0.008
Marg+0.008
 flashlight+0.008
 wool+0.008
Neg logit contributions
iw-0.005
ho-0.005
 yogurt-0.005
 banana-0.005
chard-0.004
Token tight
Token ID5381
Feature activation-2.05e-01

Pos logit contributions
 radio+0.007
 Margaret+0.007
 flashlight+0.006
 wool+0.006
Marg+0.006
Neg logit contributions
iw-0.007
ho-0.006
 yogurt-0.006
chard-0.006
 mun-0.006
Tokenro
Token ID305
Feature activation-3.33e-01

Pos logit contributions
iw+0.005
ho+0.004
 yogurt+0.004
 mun+0.004
chard+0.004
Neg logit contributions
 radio-0.006
 flashlight-0.006
 wool-0.006
 Margaret-0.006
 extingu-0.005
Tokenpe
Token ID431
Feature activation+2.56e+00

Pos logit contributions
iw+0.008
ho+0.007
 yogurt+0.007
chard+0.007
melon+0.007
Neg logit contributions
 radio-0.010
 Margaret-0.009
 wool-0.009
 flashlight-0.009
Marg-0.009
Token,
Token ID11
Feature activation+5.06e-01

Pos logit contributions
 radio+0.064
 wool+0.058
 Margaret+0.057
 flashlight+0.056
Marg+0.056
Neg logit contributions
iw-0.074
 yogurt-0.068
ho-0.067
 mun-0.065
chard-0.064
Token he
Token ID339
Feature activation-5.78e-01

Pos logit contributions
 radio+0.013
 Margaret+0.012
 wool+0.012
 flashlight+0.012
Marg+0.012
Neg logit contributions
iw-0.014
ho-0.013
 yogurt-0.013
melon-0.012
 mango-0.012
Token could
Token ID714
Feature activation-1.18e-01

Pos logit contributions
iw+0.016
ho+0.015
 yogurt+0.015
melon+0.014
chard+0.014
Neg logit contributions
 radio-0.015
 Margaret-0.014
 flashlight-0.014
Marg-0.013
 wool-0.013
Token even
Token ID772
Feature activation-9.08e-01

Pos logit contributions
iw+0.003
ho+0.003
 yogurt+0.003
 mun+0.003
 banana+0.003
Neg logit contributions
 radio-0.003
 Margaret-0.003
Marg-0.003
 wool-0.003
 flashlight-0.003
Token turn
Token ID1210
Feature activation+1.39e-01

Pos logit contributions
iw+0.026
ho+0.025
melon+0.023
 yogurt+0.023
 mun+0.023
Neg logit contributions
 radio-0.023
 flashlight-0.021
 Margaret-0.021
 wool-0.021
 aliens-0.020
Token need
Token ID761
Feature activation+7.21e-01

Pos logit contributions
 radio+0.013
Marg+0.013
 Margaret+0.013
 extingu+0.013
 grandparents+0.013
Neg logit contributions
 yogurt-0.007
iw-0.007
 banana-0.007
ho-0.007
 he-0.007
Token.
Token ID13
Feature activation+1.86e-01

Pos logit contributions
 radio+0.020
 Margaret+0.019
Marg+0.019
 flashlight+0.019
 extingu+0.018
Neg logit contributions
iw-0.018
 yogurt-0.017
ho-0.017
 banana-0.016
chard-0.016
Token Even
Token ID3412
Feature activation+4.66e-01

Pos logit contributions
 radio+0.005
 Margaret+0.005
Marg+0.005
 flashlight+0.005
 wool+0.005
Neg logit contributions
iw-0.005
ho-0.004
 yogurt-0.004
 mun-0.004
chard-0.004
Token the
Token ID262
Feature activation-3.08e-01

Pos logit contributions
 radio+0.015
 Margaret+0.014
 wool+0.014
 flashlight+0.014
Marg+0.014
Neg logit contributions
iw-0.010
ho-0.010
 mun-0.009
 yogurt-0.008
melon-0.008
Token t
Token ID256
Feature activation+7.71e-01

Pos logit contributions
iw+0.008
ho+0.008
 yogurt+0.007
 banana+0.007
chard+0.007
Neg logit contributions
 radio-0.008
 Margaret-0.008
Marg-0.008
 flashlight-0.007
 wool-0.007
Tokenini
Token ID5362
Feature activation+2.72e+00

Pos logit contributions
 radio+0.013
 Margaret+0.012
 flashlight+0.012
Marg+0.012
 wool+0.012
Neg logit contributions
iw-0.027
ho-0.027
 yogurt-0.026
chard-0.025
 mun-0.025
Tokenest
Token ID395
Feature activation+3.18e-01

Pos logit contributions
 radio+0.073
 Margaret+0.069
Marg+0.068
 flashlight+0.067
 wool+0.066
Neg logit contributions
iw-0.071
ho-0.069
 yogurt-0.065
melon-0.063
 mun-0.063
Token creature
Token ID7185
Feature activation+2.53e-01

Pos logit contributions
 radio+0.009
 Margaret+0.008
Marg+0.008
 flashlight+0.008
 wool+0.008
Neg logit contributions
iw-0.008
ho-0.008
 yogurt-0.008
 mun-0.007
melon-0.007
Token needs
Token ID2476
Feature activation-1.22e-01

Pos logit contributions
 radio+0.008
 Margaret+0.007
Marg+0.007
 flashlight+0.007
 wool+0.007
Neg logit contributions
iw-0.005
ho-0.005
 yogurt-0.005
 mun-0.005
melon-0.005
Token our
Token ID674
Feature activation+1.82e-01

Pos logit contributions
iw+0.003
ho+0.003
 yogurt+0.003
 mun+0.002
 banana+0.002
Neg logit contributions
 radio-0.004
 Margaret-0.003
Marg-0.003
 flashlight-0.003
 grandparents-0.003
Token help
Token ID1037
Feature activation-4.06e-01

Pos logit contributions
 radio+0.004
 Margaret+0.004
Marg+0.004
 flashlight+0.004
 extingu+0.004
Neg logit contributions
iw-0.005
ho-0.005
 yogurt-0.005
 mun-0.005
 banana-0.005
Token to
Token ID284
Feature activation-3.95e-01

Pos logit contributions
iw+0.008
ho+0.007
 yogurt+0.007
chard+0.007
 mango+0.007
Neg logit contributions
 radio-0.012
 Margaret-0.012
 wool-0.011
 flashlight-0.011
Marg-0.011
Token float
Token ID12178
Feature activation+8.37e-02

Pos logit contributions
ho+0.009
iw+0.009
melon+0.009
 yogurt+0.009
 banana+0.008
Neg logit contributions
 radio-0.011
 grandparents-0.011
 Margaret-0.011
Marg-0.011
 flashlight-0.010
Token away
Token ID1497
Feature activation+5.83e-01

Pos logit contributions
 radio+0.003
 Margaret+0.003
Marg+0.003
 grandparents+0.003
 extingu+0.003
Neg logit contributions
ho-0.002
iw-0.002
chard-0.001
 yogurt-0.001
melon-0.001
Token faster
Token ID5443
Feature activation-9.32e-01

Pos logit contributions
 radio+0.018
 Margaret+0.018
Marg+0.017
 grandparents+0.017
 flashlight+0.017
Neg logit contributions
ho-0.013
iw-0.012
melon-0.011
 yogurt-0.011
chard-0.011
Token.
Token ID13
Feature activation-3.83e-02

Pos logit contributions
iw+0.024
ho+0.024
 yogurt+0.022
 mun+0.022
chard+0.022
Neg logit contributions
 radio-0.025
 Margaret-0.024
Marg-0.023
 flashlight-0.023
 wool-0.023
Token The
Token ID383
Feature activation+6.41e-01

Pos logit contributions
iw+0.001
ho+0.001
chard+0.001
 mun+0.001
melon+0.001
Neg logit contributions
 radio-0.001
 Margaret-0.001
Marg-0.001
 grandparents-0.001
 wool-0.001
Token girl
Token ID2576
Feature activation+1.57e-03

Pos logit contributions
 Margaret+0.018
Marg+0.018
 radio+0.017
 extingu+0.017
 fault+0.016
Neg logit contributions
iw-0.015
ho-0.015
 yogurt-0.015
chard-0.015
 banana-0.014
Token tried
Token ID3088
Feature activation-5.32e-01

Pos logit contributions
 radio+0.000
 Margaret+0.000
 flashlight+0.000
 wool+0.000
Marg+0.000
Neg logit contributions
iw-0.000
 yogurt-0.000
chard-0.000
ho-0.000
melon-0.000
Token and
Token ID290
Feature activation-5.99e-01

Pos logit contributions
iw+0.011
ho+0.010
 yogurt+0.010
 mun+0.010
chard+0.010
Neg logit contributions
 radio-0.017
 Margaret-0.016
 wool-0.016
 flashlight-0.016
Marg-0.016
Token tried
Token ID3088
Feature activation-3.42e-01

Pos logit contributions
iw+0.012
ho+0.011
 yogurt+0.010
chard+0.010
melon+0.010
Neg logit contributions
 radio-0.020
 Margaret-0.019
 flashlight-0.019
Marg-0.019
 wool-0.019
Token to
Token ID284
Feature activation+3.73e-03

Pos logit contributions
iw+0.007
ho+0.007
 yogurt+0.007
 mun+0.006
chard+0.006
Neg logit contributions
 radio-0.011
 Margaret-0.010
 flashlight-0.010
 wool-0.010
Marg-0.010
Token.
Token ID13
Feature activation-2.28e-02

Pos logit contributions
 radio+0.001
 extingu+0.001
 flashlight+0.001
 Margaret+0.001
 grandparents+0.001
Neg logit contributions
iw-0.001
ho-0.001
 mango-0.001
 mun-0.001
 yogurt-0.001
Token They
Token ID1119
Feature activation-7.34e-02

Pos logit contributions
iw+0.001
ho+0.001
 yogurt+0.001
 Fro+0.001
melon+0.001
Neg logit contributions
 radio-0.001
 flashlight-0.001
Marg-0.001
 Margaret-0.001
 wool-0.001
Token hugged
Token ID41130
Feature activation+5.72e-01

Pos logit contributions
iw+0.002
ho+0.002
 yogurt+0.002
 mango+0.002
 banana+0.002
Neg logit contributions
 radio-0.002
 flashlight-0.002
 Margaret-0.002
Marg-0.002
 grandparents-0.002
Token mom
Token ID1995
Feature activation-5.13e-01

Pos logit contributions
 radio+0.016
 Margaret+0.015
Marg+0.015
 grandparents+0.015
 fault+0.015
Neg logit contributions
ho-0.014
iw-0.014
 yogurt-0.013
 Fro-0.013
 banana-0.012
Token and
Token ID290
Feature activation-2.77e-02

Pos logit contributions
iw+0.015
ho+0.015
chard+0.014
 mun+0.014
 yogurt+0.014
Neg logit contributions
 radio-0.012
 Margaret-0.012
 wool-0.011
 flashlight-0.011
Marg-0.011
Token thanked
Token ID26280
Feature activation+8.39e-01

Pos logit contributions
iw+0.001
ho+0.001
 yogurt+0.001
 mun+0.001
chard+0.001
Neg logit contributions
 radio-0.001
 Margaret-0.001
Marg-0.001
 flashlight-0.001
 wool-0.001
Token her
Token ID607
Feature activation-4.08e-01

Pos logit contributions
 radio+0.024
 fault+0.024
 flashlight+0.023
 Margaret+0.023
Marg+0.023
Neg logit contributions
ho-0.019
iw-0.019
 yogurt-0.018
 Fro-0.016
chard-0.016
Token for
Token ID329
Feature activation+4.60e-01

Pos logit contributions
iw+0.008
ho+0.008
 yogurt+0.008
 banana+0.008
melon+0.008
Neg logit contributions
 radio-0.012
Marg-0.012
 Margaret-0.012
 flashlight-0.011
 wool-0.011
Token the
Token ID262
Feature activation-3.29e-01

Pos logit contributions
 radio+0.013
 Margaret+0.012
 wool+0.012
Marg+0.012
 flashlight+0.012
Neg logit contributions
iw-0.012
ho-0.012
 mun-0.011
 yogurt-0.011
 mango-0.010
Token lunch
Token ID9965
Feature activation-6.75e-01

Pos logit contributions
 yogurt+0.008
 banana+0.008
melon+0.008
 crab+0.008
iw+0.008
Neg logit contributions
Marg-0.009
 Margaret-0.009
rina-0.008
 Tess-0.008
 Karl-0.008
Token.
Token ID13
Feature activation+2.78e-02

Pos logit contributions
iw+0.017
ho+0.017
 yogurt+0.016
melon+0.016
 mun+0.016
Neg logit contributions
 radio-0.018
 Margaret-0.017
Marg-0.017
 flashlight-0.017
 wool-0.017
Token nodded
Token ID14464
Feature activation-7.34e-02

Pos logit contributions
 radio+0.006
 Margaret+0.006
 flashlight+0.005
 wool+0.005
 grandparents+0.005
Neg logit contributions
iw-0.005
ho-0.005
chard-0.005
 yogurt-0.005
melon-0.005
Token his
Token ID465
Feature activation-4.46e-01

Pos logit contributions
ho+0.002
iw+0.002
 yogurt+0.002
chard+0.001
 mun+0.001
Neg logit contributions
 radio-0.002
 Margaret-0.002
Marg-0.002
 flashlight-0.002
 fault-0.002
Token head
Token ID1182
Feature activation-5.01e-02

Pos logit contributions
iw+0.010
ho+0.009
 yogurt+0.009
 banana+0.009
 mun+0.009
Neg logit contributions
 radio-0.014
 Margaret-0.013
Marg-0.013
 wool-0.013
 flashlight-0.013
Token and
Token ID290
Feature activation+1.18e-01

Pos logit contributions
iw+0.001
ho+0.001
 yogurt+0.001
 mun+0.001
chard+0.001
Neg logit contributions
 radio-0.001
 Margaret-0.001
 flashlight-0.001
Marg-0.001
 wool-0.001
Token scam
Token ID19126
Feature activation-1.12e+00

Pos logit contributions
 radio+0.004
 Margaret+0.003
 flashlight+0.003
 wool+0.003
Marg+0.003
Neg logit contributions
iw-0.003
ho-0.003
 yogurt-0.002
chard-0.002
melon-0.002
Tokenpered
Token ID13653
Feature activation+5.93e-01

Pos logit contributions
iw+0.039
ho+0.037
 yogurt+0.037
 banana+0.036
chard+0.036
Neg logit contributions
 extingu-0.020
 Margaret-0.017
 wool-0.017
 radio-0.017
Marg-0.017
Token down
Token ID866
Feature activation-1.26e-01

Pos logit contributions
 radio+0.017
 Margaret+0.016
 flashlight+0.016
Marg+0.016
 wool+0.016
Neg logit contributions
iw-0.014
ho-0.014
 yogurt-0.013
chard-0.012
 mun-0.012
Token the
Token ID262
Feature activation+4.82e-01

Pos logit contributions
iw+0.003
ho+0.003
 mun+0.003
chard+0.003
 yogurt+0.003
Neg logit contributions
 radio-0.004
 Margaret-0.004
 extingu-0.003
 flashlight-0.003
Marg-0.003
Token street
Token ID4675
Feature activation+5.86e-01

Pos logit contributions
 radio+0.015
 flashlight+0.014
 Margaret+0.013
 grandparents+0.013
Marg+0.013
Neg logit contributions
iw-0.012
ho-0.012
melon-0.011
 mun-0.011
 Fro-0.011
Token.
Token ID13
Feature activation-1.99e-01

Pos logit contributions
 radio+0.014
 Margaret+0.013
 grandparents+0.013
 flashlight+0.013
 extingu+0.012
Neg logit contributions
iw-0.017
ho-0.016
 yogurt-0.016
 mango-0.015
chard-0.015
Token Soon
Token ID15894
Feature activation-7.71e-01

Pos logit contributions
iw+0.004
ho+0.004
 mun+0.004
chard+0.004
 yogurt+0.004
Neg logit contributions
 radio-0.006
 Margaret-0.006
 grandparents-0.006
 flashlight-0.006
Marg-0.006
Token there
Token ID612
Feature activation-2.02e-01

Pos logit contributions
iw+0.018
ho+0.017
 yogurt+0.016
 mun+0.016
chard+0.016
Neg logit contributions
 radio-0.017
 Margaret-0.016
 flashlight-0.016
 grandparents-0.015
Marg-0.015
Token was
Token ID373
Feature activation+1.86e-01

Pos logit contributions
iw+0.005
ho+0.005
chard+0.005
 mun+0.005
 yogurt+0.005
Neg logit contributions
 radio-0.005
 Margaret-0.005
 flashlight-0.005
 grandparents-0.005
Marg-0.005
Token a
Token ID257
Feature activation+2.29e-02

Pos logit contributions
 radio+0.005
 Margaret+0.005
 flashlight+0.005
Marg+0.005
 extingu+0.004
Neg logit contributions
 yogurt-0.005
iw-0.005
 banana-0.005
 pear-0.004
 mun-0.004
Token boy
Token ID2933
Feature activation+1.28e-01

Pos logit contributions
 radio+0.001
 Margaret+0.001
Marg+0.001
 flashlight+0.001
 wool+0.001
Neg logit contributions
iw-0.001
ho-0.001
 yogurt-0.001
chard-0.001
 mun-0.001
Token named
Token ID3706
Feature activation-8.08e-01

Pos logit contributions
 radio+0.003
 wool+0.003
 parent+0.003
Marg+0.003
 Margaret+0.003
Neg logit contributions
iw-0.003
 mango-0.003
chard-0.003
 yogurt-0.003
ho-0.003
Token Tim
Token ID5045
Feature activation+9.04e-01

Pos logit contributions
iw+0.022
ho+0.021
 mun+0.019
 yogurt+0.019
chard+0.019
Neg logit contributions
 radio-0.022
 Margaret-0.021
 flashlight-0.020
 wool-0.020
Marg-0.020
Token.
Token ID13
Feature activation+3.95e-01

Pos logit contributions
 radio+0.026
 Margaret+0.024
Marg+0.023
 wool+0.023
 flashlight+0.023
Neg logit contributions
iw-0.022
ho-0.022
 yogurt-0.021
 banana-0.020
 mun-0.020
Token Tim
Token ID5045
Feature activation+9.43e-01

Pos logit contributions
 radio+0.010
 flashlight+0.010
 Margaret+0.009
 extingu+0.009
 wool+0.009
Neg logit contributions
iw-0.010
ho-0.010
chard-0.010
 mun-0.010
 salon-0.009
Token was
Token ID373
Feature activation-3.42e-03

Pos logit contributions
 radio+0.028
 Margaret+0.027
 wool+0.026
Marg+0.026
 extingu+0.026
Neg logit contributions
iw-0.022
ho-0.021
 yogurt-0.020
 mun-0.019
 banana-0.019
Token feeling
Token ID4203
Feature activation-7.29e-01

Pos logit contributions
iw+0.000
ho+0.000
 mun+0.000
 yogurt+0.000
 mango+0.000
Neg logit contributions
 radio-0.000
 Margaret-0.000
 wool-0.000
Marg-0.000
 flashlight-0.000
Token sad
Token ID6507
Feature activation+1.81e-02

Pos logit contributions
iw+0.020
ho+0.020
 mun+0.018
 mango+0.018
 yogurt+0.017
Neg logit contributions
 radio-0.019
 Margaret-0.018
 wool-0.018
 flashlight-0.017
Marg-0.017
Token door
Token ID3420
Feature activation+1.31e+00

Pos logit contributions
ho+0.002
iw+0.002
 yogurt+0.002
chard+0.002
 l+0.002
Neg logit contributions
Marg-0.002
 Margaret-0.002
 radio-0.002
 aliens-0.002
 fault-0.002
Token.
Token ID13
Feature activation-1.94e-01

Pos logit contributions
 radio+0.032
 Margaret+0.029
 flashlight+0.028
 wool+0.028
Marg+0.028
Neg logit contributions
iw-0.039
ho-0.037
 yogurt-0.035
chard-0.035
 mun-0.035
Token He
Token ID679
Feature activation-2.15e-01

Pos logit contributions
iw+0.005
ho+0.005
 mun+0.004
chard+0.004
melon+0.004
Neg logit contributions
 radio-0.006
 Margaret-0.006
 grandparents-0.005
 flashlight-0.005
 extingu-0.005
Token asked
Token ID1965
Feature activation+1.84e-01

Pos logit contributions
iw+0.005
ho+0.004
 yogurt+0.004
 mun+0.004
chard+0.004
Neg logit contributions
 radio-0.007
 Margaret-0.006
 wool-0.006
 flashlight-0.006
Marg-0.006
Token her
Token ID607
Feature activation-6.84e-01

Pos logit contributions
 radio+0.005
 Margaret+0.005
 grandparents+0.005
 flashlight+0.005
 wool+0.005
Neg logit contributions
iw-0.004
ho-0.004
 mun-0.004
melon-0.004
chard-0.004
Token if
Token ID611
Feature activation+5.72e-01

Pos logit contributions
 yogurt+0.016
 banana+0.015
iw+0.015
ho+0.015
 mango+0.015
Neg logit contributions
 radio-0.019
Marg-0.019
 Margaret-0.018
arse-0.018
 extingu-0.018
Token she
Token ID673
Feature activation-5.83e-01

Pos logit contributions
 radio+0.023
 Margaret+0.023
 grandparents+0.022
 flashlight+0.022
Marg+0.022
Neg logit contributions
iw-0.007
ho-0.006
 mun-0.006
melon-0.006
 yogurt-0.005
Token could
Token ID714
Feature activation+2.51e-01

Pos logit contributions
iw+0.011
ho+0.011
 yogurt+0.010
 mun+0.010
chard+0.010
Neg logit contributions
 radio-0.020
 Margaret-0.019
Marg-0.019
 wool-0.019
 flashlight-0.018
Token help
Token ID1037
Feature activation+9.56e-02

Pos logit contributions
 radio+0.007
 Margaret+0.007
Marg+0.007
 fault+0.007
 wool+0.007
Neg logit contributions
ho-0.006
iw-0.006
 yogurt-0.005
 banana-0.005
 mun-0.005
Token him
Token ID683
Feature activation-1.77e-01

Pos logit contributions
 radio+0.003
 Margaret+0.003
 wool+0.003
Marg+0.003
 flashlight+0.003
Neg logit contributions
iw-0.002
ho-0.002
 mun-0.002
 yogurt-0.002
chard-0.002
Token with
Token ID351
Feature activation+4.76e-01

Pos logit contributions
 yogurt+0.004
iw+0.003
 banana+0.003
chard+0.003
melon+0.003
Neg logit contributions
 radio-0.005
 Margaret-0.005
Marg-0.005
 aliens-0.005
 fault-0.005
Token forever
Token ID8097
Feature activation-2.89e-01

Pos logit contributions
iw+0.008
ho+0.007
 mun+0.007
 yogurt+0.007
 banana+0.007
Neg logit contributions
 radio-0.011
 extingu-0.011
 Margaret-0.011
 flashlight-0.011
 wool-0.011
Token.
Token ID13
Feature activation-1.83e-01

Pos logit contributions
iw+0.007
ho+0.006
 mun+0.006
 yogurt+0.006
 banana+0.006
Neg logit contributions
 radio-0.009
 Margaret-0.009
 flashlight-0.008
 extingu-0.008
Marg-0.008
Token<|endoftext|>
Token ID50256
Feature activation+3.50e-02

Pos logit contributions
iw+0.005
ho+0.005
 yogurt+0.005
chard+0.004
 mun+0.004
Neg logit contributions
 radio-0.005
 Margaret-0.004
 flashlight-0.004
Marg-0.004
 wool-0.004
TokenOnce
Token ID7454
Feature activation-3.14e-01

Pos logit contributions
 radio+0.001
 Margaret+0.001
 wool+0.001
 flashlight+0.001
 extingu+0.001
Neg logit contributions
iw-0.001
ho-0.001
 yogurt-0.001
 banana-0.001
chard-0.001
Token upon
Token ID2402
Feature activation-7.13e-01

Pos logit contributions
iw+0.008
ho+0.008
 yogurt+0.008
 mun+0.008
chard+0.007
Neg logit contributions
 radio-0.008
 Margaret-0.008
 flashlight-0.008
Marg-0.008
 grandparents-0.007
Token a
Token ID257
Feature activation+4.62e-01

Pos logit contributions
iw+0.021
ho+0.021
 yogurt+0.020
 banana+0.020
 mun+0.020
Neg logit contributions
 radio-0.016
 Margaret-0.015
 extingu-0.014
 flashlight-0.014
 grandparents-0.014
Token time
Token ID640
Feature activation+4.72e-01

Pos logit contributions
 radio+0.011
 Margaret+0.010
 extingu+0.010
 aliens+0.010
 flashlight+0.010
Neg logit contributions
iw-0.013
ho-0.013
 mun-0.013
 banana-0.013
 yogurt-0.013
Token,
Token ID11
Feature activation-4.59e-01

Pos logit contributions
 radio+0.016
 Margaret+0.015
 flashlight+0.015
Marg+0.015
 wool+0.015
Neg logit contributions
iw-0.009
ho-0.009
 yogurt-0.008
chard-0.008
 mun-0.008
Token there
Token ID612
Feature activation+2.02e-02

Pos logit contributions
iw+0.010
ho+0.010
chard+0.010
 yogurt+0.010
melon+0.009
Neg logit contributions
 radio-0.014
 Margaret-0.013
Marg-0.013
 grandparents-0.013
 wool-0.013
Token was
Token ID373
Feature activation+5.14e-02

Pos logit contributions
 radio+0.001
 flashlight+0.001
 Margaret+0.001
 wool+0.000
Marg+0.000
Neg logit contributions
iw-0.001
ho-0.000
chard-0.000
 mun-0.000
 yogurt-0.000
Token a
Token ID257
Feature activation+4.91e-02

Pos logit contributions
 radio+0.001
 Margaret+0.001
 flashlight+0.001
 extingu+0.001
 grandparents+0.001
Neg logit contributions
 yogurt-0.001
iw-0.001
 banana-0.001
chard-0.001
 mango-0.001
Token stained
Token ID34568
Feature activation-6.23e-01

Pos logit contributions
iw+0.020
ho+0.019
chard+0.018
 yogurt+0.018
 mun+0.018
Neg logit contributions
 radio-0.023
 Margaret-0.022
 flashlight-0.022
 wool-0.022
Marg-0.021
Token Anna
Token ID11735
Feature activation-4.03e-01

Pos logit contributions
iw+0.014
ho+0.014
chard+0.013
 mun+0.013
 yogurt+0.012
Neg logit contributions
 radio-0.019
 Margaret-0.018
 wool-0.018
 flashlight-0.018
Marg-0.017
Token's
Token ID338
Feature activation-1.36e-01

Pos logit contributions
iw+0.011
ho+0.011
chard+0.010
 yogurt+0.010
melon+0.010
Neg logit contributions
 radio-0.010
 Margaret-0.010
 wool-0.009
 flashlight-0.009
Marg-0.009
Token dress
Token ID6576
Feature activation+4.84e-01

Pos logit contributions
iw+0.004
ho+0.003
chard+0.003
 yogurt+0.003
 mun+0.003
Neg logit contributions
 radio-0.004
 flashlight-0.003
 Margaret-0.003
 wool-0.003
 grandparents-0.003
Token and
Token ID290
Feature activation-6.24e-02

Pos logit contributions
 radio+0.013
 Margaret+0.013
 flashlight+0.012
Marg+0.012
 wool+0.012
Neg logit contributions
iw-0.012
ho-0.012
 yogurt-0.011
chard-0.011
 mun-0.011
Token made
Token ID925
Feature activation-2.00e-01

Pos logit contributions
iw+0.002
ho+0.002
chard+0.001
melon+0.001
 mango+0.001
Neg logit contributions
 radio-0.002
 Margaret-0.002
 wool-0.002
 flashlight-0.002
 grandparents-0.002
Token it
Token ID340
Feature activation-5.04e-01

Pos logit contributions
iw+0.006
ho+0.005
 yogurt+0.005
 mun+0.005
 banana+0.005
Neg logit contributions
 radio-0.005
 Margaret-0.005
Marg-0.005
 flashlight-0.005
 wool-0.004
Token look
Token ID804
Feature activation-7.22e-02

Pos logit contributions
iw+0.013
ho+0.012
chard+0.012
 yogurt+0.012
 mun+0.011
Neg logit contributions
 radio-0.014
 flashlight-0.013
 Margaret-0.013
 wool-0.013
Marg-0.012
Token dirty
Token ID11841
Feature activation+7.82e-01

Pos logit contributions
iw+0.002
ho+0.002
 yogurt+0.002
 mun+0.002
chard+0.002
Neg logit contributions
 radio-0.002
 Margaret-0.002
Marg-0.002
 flashlight-0.002
 grandparents-0.002
Token.
Token ID13
Feature activation-2.06e-01

Pos logit contributions
 radio+0.023
 Margaret+0.023
Marg+0.023
 grandparents+0.022
 flashlight+0.022
Neg logit contributions
ho-0.017
iw-0.017
 yogurt-0.017
melon-0.016
 banana-0.015
Token\n
Token ID198
Feature activation+9.53e-02

Pos logit contributions
iw+0.005
ho+0.005
 yogurt+0.005
 mun+0.005
chard+0.005
Neg logit contributions
 radio-0.006
 Margaret-0.006
Marg-0.005
 flashlight-0.005
 wool-0.005
Token upon
Token ID2402
Feature activation-4.62e-01

Pos logit contributions
iw+0.014
ho+0.013
 yogurt+0.012
chard+0.012
 mun+0.012
Neg logit contributions
 radio-0.015
 Margaret-0.014
 flashlight-0.014
Marg-0.014
 wool-0.014
Token a
Token ID257
Feature activation+4.24e-01

Pos logit contributions
 yogurt+0.012
 banana+0.011
 he+0.011
 mun+0.011
iw+0.011
Neg logit contributions
 radio-0.012
 extingu-0.012
 Margaret-0.012
 fault-0.011
 grandparents-0.011
Token time
Token ID640
Feature activation+3.98e-01

Pos logit contributions
 radio+0.011
 Margaret+0.010
 aliens+0.010
 extingu+0.010
 pilots+0.010
Neg logit contributions
 mun-0.011
iw-0.011
 banana-0.011
ho-0.011
 yogurt-0.011
Token there
Token ID612
Feature activation-1.18e-01

Pos logit contributions
 radio+0.013
 Margaret+0.013
 flashlight+0.012
Marg+0.012
 wool+0.012
Neg logit contributions
iw-0.008
ho-0.007
 yogurt-0.007
chard-0.007
 mun-0.007
Token was
Token ID373
Feature activation+2.01e-01

Pos logit contributions
iw+0.003
ho+0.003
chard+0.003
 yogurt+0.003
 mun+0.003
Neg logit contributions
 radio-0.003
 Margaret-0.003
 flashlight-0.003
Marg-0.003
 wool-0.003
Token a
Token ID257
Feature activation+9.02e-02

Pos logit contributions
 radio+0.006
 grandparents+0.006
 extingu+0.006
 Margaret+0.006
rina+0.005
Neg logit contributions
 yogurt-0.005
 banana-0.004
iw-0.004
 mun-0.004
 pear-0.004
Token girl
Token ID2576
Feature activation-4.23e-01

Pos logit contributions
 radio+0.002
 Margaret+0.002
 flashlight+0.002
Marg+0.002
 wool+0.002
Neg logit contributions
iw-0.003
ho-0.002
 yogurt-0.002
chard-0.002
 mun-0.002
Token who
Token ID508
Feature activation+3.80e-01

Pos logit contributions
iw+0.011
 mango+0.010
melon+0.010
ho+0.010
 yogurt+0.010
Neg logit contributions
 radio-0.011
 Margaret-0.011
 wool-0.011
Marg-0.011
 parent-0.010
Token wanted
Token ID2227
Feature activation+1.17e+00

Pos logit contributions
 radio+0.010
 Margaret+0.009
 flashlight+0.009
Marg+0.009
 wool+0.009
Neg logit contributions
iw-0.010
ho-0.010
 yogurt-0.009
 mun-0.009
chard-0.009
Token to
Token ID284
Feature activation-8.15e-03

Pos logit contributions
 radio+0.035
 wool+0.033
 Margaret+0.032
 grandparents+0.032
 flashlight+0.032
Neg logit contributions
iw-0.029
ho-0.028
melon-0.025
 mun-0.025
chard-0.024
Token have
Token ID423
Feature activation+1.66e+00

Pos logit contributions
iw+0.000
ho+0.000
chard+0.000
melon+0.000
 mun+0.000
Neg logit contributions
 radio-0.000
 Margaret-0.000
 flashlight-0.000
 wool-0.000
Marg-0.000
Token them
Token ID606
Feature activation+2.59e-01

Pos logit contributions
iw+0.020
ho+0.019
 yogurt+0.018
chard+0.018
melon+0.018
Neg logit contributions
 radio-0.019
 Margaret-0.017
 flashlight-0.017
Marg-0.017
 wool-0.017
Token again
Token ID757
Feature activation+4.56e-01

Pos logit contributions
 radio+0.008
 Margaret+0.008
Marg+0.008
 flashlight+0.007
 wool+0.007
Neg logit contributions
iw-0.006
 yogurt-0.006
ho-0.006
melon-0.005
chard-0.005
Token.
Token ID13
Feature activation+1.82e-01

Pos logit contributions
 radio+0.014
 Margaret+0.013
Marg+0.013
 flashlight+0.013
 extingu+0.012
Neg logit contributions
iw-0.011
 yogurt-0.010
ho-0.010
chard-0.010
 banana-0.010
Token But
Token ID887
Feature activation+5.23e-01

Pos logit contributions
 radio+0.005
 Margaret+0.005
Marg+0.005
 grandparents+0.005
 flashlight+0.005
Neg logit contributions
iw-0.004
ho-0.004
 yogurt-0.004
chard-0.004
 mun-0.004
Token then
Token ID788
Feature activation-1.89e-01

Pos logit contributions
 radio+0.016
 Margaret+0.015
 grandparents+0.014
 wool+0.014
 flashlight+0.014
Neg logit contributions
iw-0.013
ho-0.012
 mun-0.011
 yogurt-0.011
melon-0.011
Token her
Token ID607
Feature activation+2.75e-02

Pos logit contributions
iw+0.003
ho+0.003
 yogurt+0.003
chard+0.003
 mun+0.003
Neg logit contributions
 radio-0.007
 Margaret-0.006
Marg-0.006
 flashlight-0.006
 wool-0.006
Token brother
Token ID3956
Feature activation-3.16e-01

Pos logit contributions
Marg+0.001
 Margaret+0.001
 radio+0.001
 extingu+0.001
 Karl+0.001
Neg logit contributions
 yogurt-0.001
iw-0.001
 banana-0.001
ho-0.001
 l-0.001
Token had
Token ID550
Feature activation+1.87e-01

Pos logit contributions
ho+0.006
iw+0.006
 yogurt+0.006
melon+0.006
 mun+0.006
Neg logit contributions
 radio-0.010
Marg-0.009
 flashlight-0.009
 Margaret-0.009
 wool-0.009
Token an
Token ID281
Feature activation-4.53e-01

Pos logit contributions
 radio+0.004
 Margaret+0.003
 flashlight+0.003
Marg+0.003
 wool+0.003
Neg logit contributions
iw-0.006
ho-0.006
 yogurt-0.006
chard-0.006
 mun-0.006
Token idea
Token ID2126
Feature activation+4.80e-01

Pos logit contributions
iw+0.016
 yogurt+0.016
ho+0.016
melon+0.015
 banana+0.015
Neg logit contributions
 radio-0.007
 Margaret-0.007
Marg-0.007
 flashlight-0.006
 wool-0.006
Token -
Token ID532
Feature activation+4.55e-01

Pos logit contributions
 radio+0.013
 Margaret+0.013
Marg+0.012
 wool+0.012
 flashlight+0.012
Neg logit contributions
iw-0.012
ho-0.012
 yogurt-0.011
 mun-0.011
 banana-0.011
Token is
Token ID318
Feature activation-5.35e-01

Pos logit contributions
iw+0.008
ho+0.008
 yogurt+0.007
chard+0.007
 mun+0.007
Neg logit contributions
 radio-0.007
 Margaret-0.006
 flashlight-0.006
Marg-0.006
 wool-0.006
Token that
Token ID326
Feature activation-8.85e-01

Pos logit contributions
iw+0.011
ho+0.011
chard+0.010
 yogurt+0.010
 mun+0.010
Neg logit contributions
 radio-0.017
 Margaret-0.016
Marg-0.016
 wool-0.016
 grandparents-0.016
Token it
Token ID340
Feature activation+1.92e-01

Pos logit contributions
iw+0.019
ho+0.018
 yogurt+0.017
chard+0.016
melon+0.016
Neg logit contributions
 radio-0.028
 Margaret-0.027
Marg-0.027
 flashlight-0.026
 extingu-0.026
Token's
Token ID338
Feature activation+1.74e-01

Pos logit contributions
 radio+0.006
 extingu+0.006
 Margaret+0.006
Marg+0.006
 flashlight+0.006
Neg logit contributions
iw-0.004
ho-0.004
melon-0.003
 yogurt-0.003
 he-0.003
Token important
Token ID1593
Feature activation-1.20e-01

Pos logit contributions
 radio+0.005
 Margaret+0.004
 flashlight+0.004
 extingu+0.004
Marg+0.004
Neg logit contributions
chard-0.004
iw-0.004
 yogurt-0.004
melon-0.004
 mango-0.004
Token to
Token ID284
Feature activation-8.92e-01

Pos logit contributions
iw+0.002
ho+0.002
 yogurt+0.002
melon+0.002
chard+0.002
Neg logit contributions
 radio-0.004
 Margaret-0.004
Marg-0.004
 flashlight-0.004
 wool-0.004
Token share
Token ID2648
Feature activation-2.67e-03

Pos logit contributions
iw+0.018
ho+0.018
 yogurt+0.017
melon+0.016
 mango+0.016
Neg logit contributions
 radio-0.029
 Margaret-0.028
 flashlight-0.027
Marg-0.027
 extingu-0.026
Token and
Token ID290
Feature activation+2.93e-01

Pos logit contributions
iw+0.000
 yogurt+0.000
ho+0.000
 banana+0.000
chard+0.000
Neg logit contributions
 radio-0.000
 Margaret-0.000
Marg-0.000
 flashlight-0.000
 extingu-0.000
Token be
Token ID307
Feature activation+8.00e-02

Pos logit contributions
 radio+0.008
 flashlight+0.007
 Margaret+0.007
 wool+0.007
 grandparents+0.007
Neg logit contributions
iw-0.008
ho-0.007
 yogurt-0.007
 mun-0.007
melon-0.007
Token kind
Token ID1611
Feature activation-3.67e-01

Pos logit contributions
 radio+0.002
 Margaret+0.002
Marg+0.002
 wool+0.002
 flashlight+0.002
Neg logit contributions
iw-0.002
ho-0.002
 yogurt-0.002
 mun-0.002
chard-0.002
Token to
Token ID284
Feature activation-2.46e-01

Pos logit contributions
iw+0.008
ho+0.008
 yogurt+0.007
chard+0.007
 mun+0.007
Neg logit contributions
 radio-0.011
 Margaret-0.011
 flashlight-0.011
 wool-0.011
Marg-0.011
Token and
Token ID290
Feature activation+2.23e-01

Pos logit contributions
iw+0.009
ho+0.009
 yogurt+0.009
 mun+0.009
chard+0.008
Neg logit contributions
 radio-0.010
 Margaret-0.009
Marg-0.009
 flashlight-0.009
 wool-0.009
Token says
Token ID1139
Feature activation-5.86e-02

Pos logit contributions
 radio+0.006
 Margaret+0.006
Marg+0.006
 flashlight+0.006
 wool+0.006
Neg logit contributions
iw-0.005
ho-0.005
 yogurt-0.005
 mun-0.005
 banana-0.005
Token,
Token ID11
Feature activation+2.39e-01

Pos logit contributions
iw+0.001
chard+0.001
ho+0.001
 mun+0.001
 mango+0.001
Neg logit contributions
 radio-0.002
 Margaret-0.002
 wool-0.002
 flashlight-0.002
Marg-0.002
Token "
Token ID366
Feature activation-2.54e-01

Pos logit contributions
 radio+0.007
 Margaret+0.007
 flashlight+0.007
Marg+0.007
 wool+0.007
Neg logit contributions
iw-0.005
ho-0.005
 yogurt-0.005
chard-0.005
 mun-0.005
TokenThank
Token ID10449
Feature activation+3.46e-01

Pos logit contributions
iw+0.006
ho+0.005
 yogurt+0.005
chard+0.005
melon+0.005
Neg logit contributions
 radio-0.008
 Margaret-0.008
Marg-0.008
 wool-0.007
 flashlight-0.007
Token you
Token ID345
Feature activation-1.53e+00

Pos logit contributions
 radio+0.011
Marg+0.011
 Margaret+0.011
 wool+0.011
 aliens+0.010
Neg logit contributions
iw-0.007
chard-0.007
ho-0.007
melon-0.007
 mun-0.006
Token,
Token ID11
Feature activation-9.85e-01

Pos logit contributions
iw+0.027
ho+0.026
 yogurt+0.025
melon+0.023
 banana+0.023
Neg logit contributions
 radio-0.054
 Margaret-0.051
Marg-0.051
 flashlight-0.051
 wool-0.050
Token Ben
Token ID3932
Feature activation-2.93e-01

Pos logit contributions
iw+0.022
ho+0.021
chard+0.020
 mun+0.019
 yogurt+0.019
Neg logit contributions
 radio-0.031
 Margaret-0.029
 flashlight-0.028
 wool-0.028
 grandparents-0.028
Token.
Token ID13
Feature activation+2.97e-01

Pos logit contributions
iw+0.006
ho+0.006
 yogurt+0.005
 banana+0.005
melon+0.005
Neg logit contributions
 radio-0.010
Marg-0.009
 Margaret-0.009
 flashlight-0.009
 grandparents-0.009
Token You
Token ID921
Feature activation-6.74e-01

Pos logit contributions
 radio+0.010
 Margaret+0.009
 flashlight+0.009
Marg+0.009
 wool+0.009
Neg logit contributions
iw-0.006
ho-0.006
 yogurt-0.005
melon-0.005
 mun-0.005
Token are
Token ID389
Feature activation-1.29e-01

Pos logit contributions
 yogurt+0.013
 mun+0.013
iw+0.013
 l+0.013
 banana+0.012
Neg logit contributions
 radio-0.022
 Margaret-0.021
Marg-0.020
 grandparents-0.020
 extingu-0.020
Token the
Token ID262
Feature activation-1.18e-01

Pos logit contributions
iw+0.014
ho+0.013
 mun+0.012
 yogurt+0.012
chard+0.012
Neg logit contributions
 radio-0.020
 Margaret-0.019
 wool-0.019
 flashlight-0.019
Marg-0.018
Token porch
Token ID33179
Feature activation+8.41e-02

Pos logit contributions
iw+0.004
ho+0.003
 mun+0.003
chard+0.003
 mango+0.003
Neg logit contributions
 radio-0.003
 wool-0.002
 flashlight-0.002
 Margaret-0.002
 extingu-0.002
Token.
Token ID13
Feature activation-6.16e-02

Pos logit contributions
 radio+0.002
 Margaret+0.002
 flashlight+0.002
 extingu+0.001
 wool+0.001
Neg logit contributions
iw-0.003
ho-0.002
chard-0.002
 mango-0.002
 Fro-0.002
Token It
Token ID632
Feature activation+3.53e-01

Pos logit contributions
ho+0.002
iw+0.002
melon+0.002
 mun+0.002
chard+0.002
Neg logit contributions
 Margaret-0.001
 radio-0.001
Marg-0.001
 Tess-0.001
 wool-0.001
Token was
Token ID373
Feature activation-4.16e-01

Pos logit contributions
 radio+0.010
 Margaret+0.010
 flashlight+0.010
 wool+0.010
Marg+0.009
Neg logit contributions
iw-0.009
ho-0.008
 yogurt-0.008
chard-0.008
 mun-0.007
Token full
Token ID1336
Feature activation-1.74e+00

Pos logit contributions
iw+0.013
ho+0.012
 mun+0.012
chard+0.011
 yogurt+0.011
Neg logit contributions
 radio-0.010
 wool-0.009
 Margaret-0.009
 flashlight-0.009
Marg-0.008
Token of
Token ID286
Feature activation+1.83e-01

Pos logit contributions
iw+0.038
ho+0.035
chard+0.034
 mango+0.031
 mun+0.029
Neg logit contributions
 radio-0.057
 wool-0.055
 flashlight-0.053
 Margaret-0.052
 extingu-0.052
Token y
Token ID331
Feature activation-2.81e-01

Pos logit contributions
 radio+0.005
 Margaret+0.005
 wool+0.005
 flashlight+0.005
Marg+0.004
Neg logit contributions
iw-0.005
ho-0.005
chard-0.004
 mun-0.004
melon-0.004
Tokenummy
Token ID13513
Feature activation-2.55e-02

Pos logit contributions
chard+0.008
iw+0.007
ho+0.007
 mun+0.007
melon+0.007
Neg logit contributions
 wool-0.007
 radio-0.007
 flashlight-0.006
 Margaret-0.006
Marg-0.006
Token juice
Token ID13135
Feature activation-3.92e-01

Pos logit contributions
iw+0.001
ho+0.001
 yogurt+0.001
 mun+0.001
melon+0.001
Neg logit contributions
 radio-0.001
 Margaret-0.001
Marg-0.001
 flashlight-0.001
 wool-0.001
Token.
Token ID13
Feature activation-5.67e-02

Pos logit contributions
iw+0.010
ho+0.010
 yogurt+0.010
 mun+0.009
melon+0.009
Neg logit contributions
 radio-0.010
 Margaret-0.010
Marg-0.010
 flashlight-0.009
 wool-0.009
Token the
Token ID262
Feature activation-3.22e-01

Pos logit contributions
 radio+0.008
 wool+0.008
 Margaret+0.008
Marg+0.008
 flashlight+0.008
Neg logit contributions
iw-0.006
ho-0.006
melon-0.005
chard-0.005
 mun-0.005
Token feel
Token ID1254
Feature activation-9.94e-01

Pos logit contributions
iw+0.008
ho+0.008
 yogurt+0.007
chard+0.007
 mun+0.007
Neg logit contributions
 radio-0.009
 Margaret-0.009
Marg-0.008
 flashlight-0.008
 wool-0.008
Token of
Token ID286
Feature activation-1.11e+00

Pos logit contributions
iw+0.020
ho+0.019
chard+0.017
 yogurt+0.017
melon+0.017
Neg logit contributions
 radio-0.033
 Margaret-0.032
 wool-0.032
Marg-0.031
 flashlight-0.031
Token the
Token ID262
Feature activation-6.46e-02

Pos logit contributions
iw+0.028
ho+0.028
chard+0.025
 mun+0.025
melon+0.025
Neg logit contributions
 radio-0.032
 wool-0.030
 Margaret-0.029
 flashlight-0.029
Marg-0.028
Token wind
Token ID2344
Feature activation+1.67e-01

Pos logit contributions
iw+0.002
ho+0.002
 yogurt+0.002
chard+0.001
 mun+0.001
Neg logit contributions
 radio-0.002
 Margaret-0.002
Marg-0.002
 flashlight-0.002
 extingu-0.002
Token on
Token ID319
Feature activation-1.42e+00

Pos logit contributions
 radio+0.005
 Margaret+0.005
Marg+0.005
 grandparents+0.005
 flashlight+0.005
Neg logit contributions
ho-0.004
iw-0.003
melon-0.003
 yogurt-0.003
chard-0.003
Token her
Token ID607
Feature activation-3.65e-01

Pos logit contributions
iw+0.032
ho+0.031
 yogurt+0.030
chard+0.028
 banana+0.028
Neg logit contributions
 radio-0.042
 Margaret-0.041
 flashlight-0.040
Marg-0.040
 grandparents-0.039
Token face
Token ID1986
Feature activation-1.70e-02

Pos logit contributions
iw+0.008
ho+0.007
 yogurt+0.007
chard+0.007
 mun+0.007
Neg logit contributions
 radio-0.012
 Margaret-0.011
Marg-0.011
 flashlight-0.011
 wool-0.011
Token and
Token ID290
Feature activation+1.84e-01

Pos logit contributions
iw+0.000
ho+0.000
 yogurt+0.000
chard+0.000
 mun+0.000
Neg logit contributions
 radio-0.000
 Margaret-0.000
Marg-0.000
 flashlight-0.000
 grandparents-0.000
Token the
Token ID262
Feature activation+6.73e-01

Pos logit contributions
 radio+0.005
 Margaret+0.005
 wool+0.005
 flashlight+0.005
Marg+0.005
Neg logit contributions
iw-0.005
ho-0.004
melon-0.004
 yogurt-0.004
chard-0.004
Token sun
Token ID4252
Feature activation-3.09e-01

Pos logit contributions
 radio+0.019
 Margaret+0.018
Marg+0.018
 flashlight+0.018
 extingu+0.018
Neg logit contributions
iw-0.016
ho-0.016
 yogurt-0.015
chard-0.014
 mun-0.014
Token,
Token ID11
Feature activation+1.16e-01

Pos logit contributions
iw+0.014
ho+0.013
 yogurt+0.013
 mun+0.012
chard+0.012
Neg logit contributions
 radio-0.018
 Margaret-0.017
Marg-0.017
 extingu-0.017
 flashlight-0.017
Token because
Token ID780
Feature activation+6.45e-01

Pos logit contributions
 radio+0.004
Marg+0.003
 Margaret+0.003
 flashlight+0.003
 extingu+0.003
Neg logit contributions
iw-0.002
ho-0.002
 yogurt-0.002
 banana-0.002
 mun-0.002
Token you
Token ID345
Feature activation-3.37e-01

Pos logit contributions
 radio+0.022
 Margaret+0.021
Marg+0.021
 flashlight+0.021
 extingu+0.021
Neg logit contributions
iw-0.012
ho-0.011
 yogurt-0.011
chard-0.010
 banana-0.010
Token have
Token ID423
Feature activation+1.96e-01

Pos logit contributions
iw+0.008
 mun+0.008
ho+0.007
 yogurt+0.007
 banana+0.007
Neg logit contributions
 radio-0.010
 Margaret-0.010
 extingu-0.009
 wool-0.009
Marg-0.009
Token been
Token ID587
Feature activation+2.17e-01

Pos logit contributions
 radio+0.006
 Margaret+0.005
 flashlight+0.005
Marg+0.005
 wool+0.005
Neg logit contributions
iw-0.005
ho-0.004
 yogurt-0.004
chard-0.004
 mun-0.004
Token such
Token ID884
Feature activation-1.56e+00

Pos logit contributions
 radio+0.006
 Margaret+0.006
 wool+0.006
Marg+0.006
 flashlight+0.005
Neg logit contributions
iw-0.006
ho-0.005
 yogurt-0.005
 mun-0.005
melon-0.005
Token a
Token ID257
Feature activation+1.45e-01

Pos logit contributions
iw+0.042
 yogurt+0.041
chard+0.040
 mango+0.040
 banana+0.040
Neg logit contributions
 radio-0.039
 Margaret-0.038
Marg-0.036
 flashlight-0.036
 grandparents-0.035
Token good
Token ID922
Feature activation+1.95e-01

Pos logit contributions
 radio+0.004
 Margaret+0.003
Marg+0.003
 flashlight+0.003
 wool+0.003
Neg logit contributions
iw-0.004
ho-0.004
 yogurt-0.004
 banana-0.004
chard-0.004
Token girl
Token ID2576
Feature activation+7.27e-02

Pos logit contributions
 radio+0.005
 Margaret+0.005
Marg+0.005
 flashlight+0.005
 wool+0.004
Neg logit contributions
iw-0.005
ho-0.005
 yogurt-0.005
 banana-0.005
melon-0.005
Token.
Token ID13
Feature activation+3.00e-01

Pos logit contributions
 radio+0.002
 flashlight+0.002
 Margaret+0.002
Marg+0.002
 extingu+0.002
Neg logit contributions
iw-0.002
ho-0.002
 yogurt-0.002
chard-0.002
 mun-0.002
Token Now
Token ID2735
Feature activation-8.37e-01

Pos logit contributions
 radio+0.009
 flashlight+0.009
 Margaret+0.009
 wool+0.008
Marg+0.008
Neg logit contributions
iw-0.007
ho-0.006
 yogurt-0.006
chard-0.006
 mun-0.006
Token It
Token ID632
Feature activation+2.04e-01

Pos logit contributions
ho+0.013
iw+0.013
 yogurt+0.011
chard+0.011
 mun+0.011
Neg logit contributions
 radio-0.015
 Margaret-0.015
Marg-0.014
 flashlight-0.013
 grandparents-0.013
Token was
Token ID373
Feature activation-4.85e-01

Pos logit contributions
 radio+0.007
 Margaret+0.006
Marg+0.006
 wool+0.006
 grandparents+0.006
Neg logit contributions
iw-0.004
ho-0.004
 mun-0.004
 yogurt-0.004
 banana-0.004
Token a
Token ID257
Feature activation-3.01e-01

Pos logit contributions
iw+0.011
ho+0.011
 yogurt+0.010
chard+0.010
melon+0.010
Neg logit contributions
 radio-0.014
 Margaret-0.014
Marg-0.013
 flashlight-0.013
 wool-0.013
Token big
Token ID1263
Feature activation-5.49e-01

Pos logit contributions
iw+0.009
ho+0.009
 yogurt+0.008
 mun+0.008
chard+0.008
Neg logit contributions
 radio-0.007
 Margaret-0.007
 flashlight-0.006
 wool-0.006
Marg-0.006
Token basket
Token ID7988
Feature activation-1.41e+00

Pos logit contributions
iw+0.012
ho+0.012
 yogurt+0.011
chard+0.011
 mun+0.011
Neg logit contributions
 radio-0.017
 Margaret-0.016
 flashlight-0.016
 wool-0.016
Marg-0.016
Token with
Token ID351
Feature activation-1.35e+00

Pos logit contributions
iw+0.033
ho+0.029
 yogurt+0.028
 mango+0.028
chard+0.028
Neg logit contributions
 radio-0.041
 flashlight-0.038
 Margaret-0.038
 extingu-0.037
 wool-0.037
Token all
Token ID477
Feature activation-6.37e-01

Pos logit contributions
iw+0.040
ho+0.039
chard+0.037
 mun+0.036
 yogurt+0.036
Neg logit contributions
 radio-0.031
 wool-0.029
 Margaret-0.028
 flashlight-0.028
Marg-0.028
Token sorts
Token ID10524
Feature activation-5.52e-01

Pos logit contributions
iw+0.016
ho+0.015
chard+0.014
 mun+0.014
 yogurt+0.014
Neg logit contributions
 radio-0.018
 Margaret-0.017
 wool-0.017
 flashlight-0.017
Marg-0.017
Token of
Token ID286
Feature activation+3.38e-01

Pos logit contributions
iw+0.010
ho+0.009
chard+0.009
 yogurt+0.008
 mun+0.008
Neg logit contributions
 radio-0.019
 wool-0.018
 aliens-0.018
 Margaret-0.018
 flashlight-0.018
Token colorful
Token ID20239
Feature activation+4.29e-01

Pos logit contributions
 radio+0.009
 wool+0.008
 Margaret+0.008
 flashlight+0.008
Marg+0.008
Neg logit contributions
iw-0.009
ho-0.009
chard-0.008
melon-0.008
 mun-0.008
Token things
Token ID1243
Feature activation+2.35e-01

Pos logit contributions
 radio+0.012
 Margaret+0.012
 flashlight+0.012
Marg+0.011
 wool+0.011
Neg logit contributions
iw-0.010
ho-0.010
 yogurt-0.009
chard-0.009
 mun-0.009