TOP ACTIVATIONS
MAX = 0.084

would try to find something y ummy to snack on !
to travel to find something y ummy to taste .
back and the skirt was fr illy . Amy tw irled
a chance to find something y ummy and fun to explore
. He thinks jam is y ucky . He wants something
He wanted to find something y ummy to eat . Sam
doctor , which was an err and he didn 't understand
. She thinks cereal is y ummy and good for her
side , she found something y ummy waiting for her .
box of tea that looked y ummy . Lucy asked her
Inside the bone was a y ummy sausage .
Lily thought the oats looked y ummy . But
<|endoftext|> Billy was a 3 year old boy who was very
M andy was a 3 year old girl who lived with
. He thought it looked y ummy and wanted to try
lobby . The cake looks y ummy . They want to
It was shiny and looked y ummy . So , the
is this ? It looks y ucky . I don 't
. Inside was a three year old girl .
. He thought it looked y ummy , so he tried

MIN ACTIVATIONS
MIN = -0.072

was a clever and hard working boy who always thought things
Sammy was a curious and outgoing three - year - old
a beautiful girl and a handsome boy . They loved each
and friends and not take unnecessary risks . We should always
responsible and making sure to double - check before doing something
be careful and not take unnecessary risks . The moral of
Tommy and a c udd ly friend like her t eddy
monkeys and elephants and gir aff es . They were very
" Thank you , Che wy !" They became best friends
a chees eb urger and fries all by herself . Her
fun . The younger sister was very thankful for
on , Ben and T oni spent a lot of time
lions and monkeys and gir aff es . They
at the game and the younger sister said she was happy
angry . The younger brother saw this and felt
saw plates with pizza and fries and ice cream . They
be prepared and not take unnecessary risks . <|endoftext|> Once upon
The hard working ant said , " If
paws and a w agg ly tail . One
her cows and sheep and chickens . She gave them some

INTERVAL 0.045 - 0.084
CONTAINS 0.039%

<|endoftext|> Alice was a 3 year old girl who loved to
fun was done , Tom and Jack said goodbye to their
gy and Jessica felt a bit scared . Suddenly , a
boy . He saw a y ummy cel ery on the
She was a lively three year old with big , bright

INTERVAL 0.006 - 0.045
CONTAINS 27.657%

the voice that she wanted to explore the cabin . After
cared for it . <|endoftext|> Once upon a time there was
you mum !' She was so proud of her son that
, let 's fix it together ." They worked together to
the bird , " Why are you in the cell ?"

INTERVAL -0.033 - 0.006
CONTAINS 72.044%

said , " Y oga , can you teach me some
the bird and it got burned . The other animals found
, his little dog , Spot , started to bark loudly
them on the oak . He was surprised and worried .
Mam ma remembered something . " Wait ," she said ,

INTERVAL -0.072 - -0.033
CONTAINS 0.260%

, disgusting place for a 3 - year - old to
excited and asked lots of questions . Anna 's
him about his love for rad ishes . He knew they
mother and her three year old son were walking down the
." The wise old owl thought for a moment
Token would
Token ID561
Feature activation-9.44e-03

Pos logit contributions
ga+0.015
be+0.012
ric+0.012
aff+0.012
Ar+0.012
Neg logit contributions
 Amy-0.018
 Daisy-0.017
 there-0.017
 Sally-0.016
 Emily-0.016
Token try
Token ID1949
Feature activation+1.76e-02

Pos logit contributions
ga+0.019
be+0.017
Ar+0.016
zee+0.016
aff+0.016
Neg logit contributions
 Amy-0.012
 Daisy-0.011
 Sally-0.011
 Meg-0.011
 Jane-0.011
Token to
Token ID284
Feature activation-2.17e-03

Pos logit contributions
 Amy+0.023
 Daisy+0.022
 Meg+0.021
 Sally+0.021
 there+0.021
Neg logit contributions
ga-0.034
be-0.031
Ar-0.030
Bang-0.029
zee-0.029
Token find
Token ID1064
Feature activation-3.94e-03

Pos logit contributions
ga+0.003
be+0.003
Ar+0.003
 rh+0.003
aff+0.003
Neg logit contributions
 Amy-0.003
 Meg-0.003
 Daisy-0.003
uct-0.003
igsaw-0.003
Token something
Token ID1223
Feature activation+1.12e-02

Pos logit contributions
ga+0.007
be+0.006
Ar+0.006
aff+0.006
 rh+0.006
Neg logit contributions
uct-0.005
 Amy-0.005
 Meg-0.005
igsaw-0.005
kins-0.004
Token y
Token ID331
Feature activation+6.66e-02

Pos logit contributions
igsaw+0.015
uct+0.015
 Meg+0.014
kin+0.014
 Amy+0.014
Neg logit contributions
by-0.017
ga-0.017
be-0.017
please-0.016
aff-0.016
Tokenummy
Token ID13513
Feature activation+1.04e-03

Pos logit contributions
 Amy+0.087
uct+0.084
 Meg+0.083
 Emily+0.078
 reassured+0.078
Neg logit contributions
ga-0.115
aff-0.114
be-0.113
by-0.112
ruck-0.104
Token to
Token ID284
Feature activation+1.30e-02

Pos logit contributions
 Amy+0.002
uct+0.002
 Meg+0.002
 Sarah+0.001
igsaw+0.001
Neg logit contributions
ga-0.002
be-0.002
aff-0.001
Ar-0.001
zee-0.001
Token snack
Token ID26906
Feature activation-3.00e-03

Pos logit contributions
 Amy+0.019
 Meg+0.019
uct+0.018
igsaw+0.018
 Emily+0.018
Neg logit contributions
ga-0.021
be-0.019
Ar-0.018
 rh-0.018
aff-0.018
Token on
Token ID319
Feature activation+1.99e-02

Pos logit contributions
ga+0.005
be+0.004
aff+0.004
 rh+0.004
zee+0.004
Neg logit contributions
uct-0.004
igsaw-0.004
 refreshing-0.004
 Meg-0.004
 reassured-0.004
Token!
Token ID0
Feature activation-1.30e-03

Pos logit contributions
uct+0.028
 reassured+0.028
 Meg+0.026
 refreshing+0.026
 Linda+0.025
Neg logit contributions
ga-0.031
 rh-0.028
be-0.025
 gor-0.025
smith-0.025
Token to
Token ID284
Feature activation+8.60e-03

Pos logit contributions
 Amy+0.008
 there+0.008
 Sally+0.008
 Daisy+0.008
 Meg+0.008
Neg logit contributions
ga-0.008
be-0.007
Ar-0.007
aff-0.007
ric-0.007
Token travel
Token ID3067
Feature activation+4.58e-03

Pos logit contributions
 Amy+0.014
uct+0.014
ement+0.013
 Meg+0.013
 refreshing+0.013
Neg logit contributions
ga-0.013
be-0.011
Ar-0.011
Stretch-0.011
 rh-0.010
Token to
Token ID284
Feature activation-4.17e-03

Pos logit contributions
 Amy+0.008
 Meg+0.008
uct+0.008
 Daisy+0.008
igsaw+0.008
Neg logit contributions
ga-0.006
be-0.005
Ar-0.005
zee-0.005
by-0.005
Token find
Token ID1064
Feature activation-4.47e-03

Pos logit contributions
ga+0.009
be+0.009
Ar+0.008
aff+0.008
zee+0.008
Neg logit contributions
 Amy-0.004
 Daisy-0.004
 Meg-0.004
 Sally-0.004
 Sarah-0.004
Token something
Token ID1223
Feature activation+1.04e-02

Pos logit contributions
ga+0.008
be+0.007
Ar+0.007
ruck+0.007
 rh+0.007
Neg logit contributions
 Amy-0.006
uct-0.006
 Meg-0.006
igsaw-0.005
kins-0.005
Token y
Token ID331
Feature activation+6.58e-02

Pos logit contributions
uct+0.015
 Meg+0.015
igsaw+0.015
 Amy+0.014
kins+0.014
Neg logit contributions
ga-0.015
be-0.015
 pengu-0.015
 rh-0.015
Ar-0.014
Tokenummy
Token ID13513
Feature activation-2.41e-04

Pos logit contributions
 Amy+0.082
 Meg+0.077
uct+0.075
 Emily+0.071
 firm+0.070
Neg logit contributions
ga-0.124
be-0.119
by-0.119
aff-0.118
ruck-0.113
Token to
Token ID284
Feature activation+1.26e-02

Pos logit contributions
ga+0.000
be+0.000
Ar+0.000
eer+0.000
zee+0.000
Neg logit contributions
 Amy-0.000
 Meg-0.000
uct-0.000
 Daisy-0.000
 Sarah-0.000
Token taste
Token ID6938
Feature activation-4.70e-03

Pos logit contributions
uct+0.020
 Amy+0.020
 Meg+0.020
igsaw+0.019
 Emily+0.019
Neg logit contributions
ga-0.019
 rh-0.017
Ar-0.016
be-0.016
eer-0.015
Token.
Token ID13
Feature activation-6.60e-04

Pos logit contributions
ga+0.007
be+0.007
aff+0.007
ruck+0.007
Ar+0.007
Neg logit contributions
 Meg-0.007
 Amy-0.007
igsaw-0.006
uct-0.006
 Sarah-0.006
Token
Token ID198
Feature activation-2.61e-03

Pos logit contributions
ga+0.001
be+0.001
zee+0.001
aff+0.001
ruck+0.001
Neg logit contributions
 Amy-0.001
 Sally-0.001
 Daisy-0.001
 there-0.001
 Sarah-0.001
Token back
Token ID736
Feature activation+7.23e-03

Pos logit contributions
 Amy+0.009
 Daisy+0.009
 Meg+0.009
 Sally+0.009
 refreshing+0.008
Neg logit contributions
ga-0.015
be-0.014
Ar-0.014
ric-0.013
aff-0.013
Token and
Token ID290
Feature activation-2.90e-03

Pos logit contributions
 Amy+0.009
 Meg+0.008
 Daisy+0.008
uct+0.008
 refreshing+0.008
Neg logit contributions
ga-0.014
be-0.013
Ar-0.012
aff-0.012
ruck-0.012
Token the
Token ID262
Feature activation+1.31e-03

Pos logit contributions
ga+0.005
be+0.005
Ar+0.005
 rh+0.005
zee+0.005
Neg logit contributions
igsaw-0.003
uct-0.003
 Amy-0.003
 Meg-0.003
kins-0.003
Token skirt
Token ID23967
Feature activation-5.47e-03

Pos logit contributions
 Amy+0.002
 Meg+0.002
 Daisy+0.002
uct+0.002
 Sally+0.002
Neg logit contributions
ga-0.002
be-0.002
Ar-0.002
zee-0.002
aff-0.002
Token was
Token ID373
Feature activation-2.06e-03

Pos logit contributions
ga+0.010
be+0.009
Ar+0.008
aff+0.008
zee+0.008
Neg logit contributions
 Amy-0.007
 Meg-0.007
 Daisy-0.007
uct-0.007
igsaw-0.007
Token fr
Token ID1216
Feature activation+6.49e-02

Pos logit contributions
ga+0.004
be+0.004
Ar+0.003
ruck+0.003
aff+0.003
Neg logit contributions
 Amy-0.003
 Meg-0.002
 Daisy-0.002
uct-0.002
igsaw-0.002
Tokenilly
Token ID6548
Feature activation-8.11e-03

Pos logit contributions
 Amy+0.123
 Daisy+0.117
 Meg+0.114
 there+0.110
 Sally+0.110
Neg logit contributions
ga-0.092
be-0.084
aff-0.083
ruck-0.076
by-0.070
Token.
Token ID13
Feature activation+2.88e-03

Pos logit contributions
ga+0.016
be+0.014
zee+0.013
by+0.013
k+0.013
Neg logit contributions
 Amy-0.010
 Meg-0.010
uct-0.010
 Daisy-0.010
 Sally-0.009
Token Amy
Token ID14235
Feature activation+2.20e-02

Pos logit contributions
kins+0.003
 refreshing+0.003
kin+0.003
igsaw+0.003
uct+0.003
Neg logit contributions
ga-0.005
be-0.005
Ar-0.005
aff-0.005
Bang-0.005
Token tw
Token ID665
Feature activation+1.12e-02

Pos logit contributions
 Amy+0.028
 Daisy+0.027
 Emily+0.026
 Sarah+0.026
 there+0.026
Neg logit contributions
ga-0.041
Ar-0.038
ruck-0.037
aff-0.036
ric-0.036
Tokenirled
Token ID49376
Feature activation+2.22e-02

Pos logit contributions
 Amy+0.014
 Daisy+0.013
 Meg+0.013
 Sally+0.012
 Emily+0.012
Neg logit contributions
ga-0.023
be-0.021
Ar-0.020
ric-0.019
aff-0.019
Token a
Token ID257
Feature activation-1.92e-02

Pos logit contributions
ga+0.023
Ar+0.022
ruck+0.020
Stretch+0.020
Bang+0.020
Neg logit contributions
uct-0.021
kins-0.020
igsaw-0.020
ement-0.020
kin-0.019
Token chance
Token ID2863
Feature activation+3.45e-03

Pos logit contributions
ga+0.039
be+0.033
ric+0.033
aff+0.032
smith+0.031
Neg logit contributions
 Amy-0.024
 Daisy-0.024
 refreshing-0.022
 Sally-0.022
 Meg-0.022
Token to
Token ID284
Feature activation+1.03e-02

Pos logit contributions
 Amy+0.006
uct+0.006
 Meg+0.006
 refreshing+0.006
igsaw+0.006
Neg logit contributions
ga-0.005
be-0.004
Ar-0.004
ruck-0.004
zee-0.004
Token find
Token ID1064
Feature activation-6.16e-03

Pos logit contributions
igsaw+0.019
 Amy+0.019
ement+0.019
kins+0.018
uct+0.018
Neg logit contributions
ga-0.012
be-0.011
Ar-0.010
Stretch-0.010
check-0.010
Token something
Token ID1223
Feature activation+5.63e-03

Pos logit contributions
ga+0.008
ruck+0.007
be+0.007
eer+0.007
by+0.007
Neg logit contributions
uct-0.011
igsaw-0.011
kins-0.010
kin-0.010
 refreshing-0.009
Token y
Token ID331
Feature activation+6.42e-02

Pos logit contributions
uct+0.010
igsaw+0.010
 Meg+0.010
 Amy+0.010
ement+0.009
Neg logit contributions
ga-0.006
by-0.006
be-0.006
Ar-0.006
 rh-0.006
Tokenummy
Token ID13513
Feature activation-5.22e-03

Pos logit contributions
 Amy+0.085
uct+0.080
 Meg+0.078
 firm+0.075
 Emily+0.075
Neg logit contributions
ga-0.111
be-0.110
aff-0.109
by-0.109
ruck-0.100
Token and
Token ID290
Feature activation+9.79e-03

Pos logit contributions
ga+0.007
be+0.006
by+0.006
eer+0.006
ruck+0.006
Neg logit contributions
uct-0.009
 Amy-0.009
 Meg-0.009
igsaw-0.009
 refreshing-0.008
Token fun
Token ID1257
Feature activation+7.02e-03

Pos logit contributions
ement+0.011
igsaw+0.010
uct+0.010
 Meg+0.009
 Amy+0.009
Neg logit contributions
Ar-0.017
by-0.017
 rh-0.017
please-0.017
ga-0.017
Token to
Token ID284
Feature activation+1.14e-02

Pos logit contributions
uct+0.011
 Amy+0.010
 Meg+0.010
 firm+0.010
ement+0.010
Neg logit contributions
ga-0.010
by-0.010
be-0.009
ruck-0.009
Ar-0.009
Token explore
Token ID7301
Feature activation+3.63e-03

Pos logit contributions
 Amy+0.017
uct+0.016
 Meg+0.016
kins+0.016
igsaw+0.015
Neg logit contributions
ga-0.019
be-0.017
Ar-0.016
 rh-0.015
Stretch-0.015
Token.
Token ID13
Feature activation+1.74e-03

Pos logit contributions
ga+0.030
be+0.028
aff+0.026
eer+0.025
 rh+0.025
Neg logit contributions
 Amy-0.021
uct-0.021
 Meg-0.021
 Daisy-0.021
igsaw-0.021
Token He
Token ID679
Feature activation-5.01e-03

Pos logit contributions
 Amy+0.003
uct+0.003
kins+0.003
 refreshing+0.003
 Meg+0.003
Neg logit contributions
ga-0.002
be-0.002
Ar-0.001
Bang-0.001
aff-0.001
Token thinks
Token ID6834
Feature activation+3.91e-03

Pos logit contributions
ga+0.009
be+0.008
Ar+0.008
ruck+0.007
zee+0.007
Neg logit contributions
 Amy-0.007
 Meg-0.007
 Daisy-0.007
 refreshing-0.007
uct-0.007
Token jam
Token ID16853
Feature activation-1.09e-02

Pos logit contributions
uct+0.005
 Meg+0.005
 refreshing+0.005
igsaw+0.005
 Amy+0.005
Neg logit contributions
ga-0.007
 rh-0.006
be-0.006
eer-0.006
smith-0.006
Token is
Token ID318
Feature activation+1.63e-02

Pos logit contributions
ga+0.012
be+0.011
eer+0.010
aff+0.010
 rh+0.010
Neg logit contributions
 Meg-0.021
uct-0.021
 Amy-0.020
igsaw-0.020
 Sally-0.020
Token y
Token ID331
Feature activation+6.40e-02

Pos logit contributions
 Amy+0.017
uct+0.017
 Meg+0.017
igsaw+0.017
 Daisy+0.016
Neg logit contributions
ga-0.031
be-0.029
 rh-0.029
Ar-0.028
ric-0.028
Tokenucky
Token ID5309
Feature activation+1.95e-02

Pos logit contributions
uct+0.078
 Amy+0.078
 Meg+0.075
 reassured+0.073
 Daisy+0.071
Neg logit contributions
ga-0.119
be-0.114
aff-0.114
by-0.106
ruck-0.104
Token.
Token ID13
Feature activation+5.85e-03

Pos logit contributions
 Amy+0.024
uct+0.024
 Meg+0.024
 Daisy+0.023
 Sarah+0.022
Neg logit contributions
ga-0.037
be-0.034
eer-0.032
 rh-0.031
Ar-0.031
Token He
Token ID679
Feature activation-5.70e-03

Pos logit contributions
 Amy+0.011
 Meg+0.011
 Daisy+0.011
uct+0.011
 Sally+0.010
Neg logit contributions
ga-0.008
be-0.006
Ar-0.006
aff-0.006
zee-0.006
Token wants
Token ID3382
Feature activation+1.76e-02

Pos logit contributions
ga+0.010
be+0.009
Ar+0.008
ruck+0.008
zee+0.008
Neg logit contributions
 Amy-0.008
 Meg-0.008
 Daisy-0.008
 refreshing-0.008
uct-0.008
Token something
Token ID1223
Feature activation+1.18e-03

Pos logit contributions
 Amy+0.029
 Daisy+0.027
 Meg+0.027
 Sally+0.027
 Emily+0.026
Neg logit contributions
ga-0.030
be-0.026
Ar-0.025
ric-0.024
zee-0.024
Token He
Token ID679
Feature activation-1.32e-02

Pos logit contributions
 Amy+0.013
 Meg+0.012
 Daisy+0.012
uct+0.012
 Sally+0.012
Neg logit contributions
ga-0.014
be-0.012
Ar-0.012
zee-0.011
aff-0.011
Token wanted
Token ID2227
Feature activation+1.00e-02

Pos logit contributions
ga+0.023
be+0.020
ric+0.019
aff+0.019
Ar+0.018
Neg logit contributions
 Amy-0.021
 Daisy-0.019
 there-0.019
 Sally-0.019
 Meg-0.019
Token to
Token ID284
Feature activation+4.92e-03

Pos logit contributions
 Amy+0.017
 Daisy+0.016
 Meg+0.016
 Sally+0.016
 there+0.016
Neg logit contributions
ga-0.016
be-0.014
Ar-0.013
aff-0.012
zee-0.012
Token find
Token ID1064
Feature activation-4.72e-03

Pos logit contributions
ement+0.008
uct+0.008
igsaw+0.008
 Meg+0.008
 Amy+0.008
Neg logit contributions
ga-0.007
be-0.006
Ar-0.006
 rh-0.006
check-0.005
Token something
Token ID1223
Feature activation+9.95e-03

Pos logit contributions
ga+0.010
be+0.009
Ar+0.008
zee+0.008
aff+0.008
Neg logit contributions
 Amy-0.006
 Meg-0.005
 Daisy-0.005
 Sally-0.005
uct-0.005
Token y
Token ID331
Feature activation+6.38e-02

Pos logit contributions
igsaw+0.015
uct+0.015
 Meg+0.015
ement+0.014
 Amy+0.013
Neg logit contributions
by-0.013
be-0.013
 pengu-0.013
Ar-0.013
 rh-0.013
Tokenummy
Token ID13513
Feature activation-1.34e-04

Pos logit contributions
uct+0.081
 Amy+0.080
 Meg+0.077
 reassured+0.075
 firm+0.074
Neg logit contributions
by-0.112
ga-0.111
be-0.110
aff-0.108
ruck-0.101
Token to
Token ID284
Feature activation+1.27e-02

Pos logit contributions
ga+0.000
be+0.000
eer+0.000
Ar+0.000
by+0.000
Neg logit contributions
uct-0.000
 Amy-0.000
 Meg-0.000
igsaw-0.000
 Sarah-0.000
Token eat
Token ID4483
Feature activation+2.34e-03

Pos logit contributions
uct+0.020
 Meg+0.020
igsaw+0.020
 Amy+0.019
ement+0.019
Neg logit contributions
ga-0.018
be-0.016
 rh-0.016
Ar-0.016
eer-0.015
Token.
Token ID13
Feature activation+2.99e-03

Pos logit contributions
igsaw+0.004
uct+0.004
 Meg+0.004
 reassured+0.003
 Amy+0.003
Neg logit contributions
ga-0.003
eer-0.003
be-0.003
 rh-0.003
by-0.003
Token Sam
Token ID3409
Feature activation+9.29e-03

Pos logit contributions
 Amy+0.005
 Meg+0.004
 Daisy+0.004
 Sally+0.004
 Sarah+0.004
Neg logit contributions
ga-0.005
be-0.005
Ar-0.004
aff-0.004
ruck-0.004
Token doctor
Token ID6253
Feature activation-6.88e-03

Pos logit contributions
ga+0.012
aff+0.011
eer+0.011
ked+0.011
Ar+0.011
Neg logit contributions
 there-0.005
 Amy-0.005
 empty-0.005
 dry-0.005
 Jane-0.005
Token,
Token ID11
Feature activation-1.89e-02

Pos logit contributions
ga+0.014
be+0.013
Ar+0.012
Bang+0.012
aff+0.012
Neg logit contributions
 Amy-0.008
 Daisy-0.007
 Meg-0.007
 Sally-0.007
 Sarah-0.007
Token which
Token ID543
Feature activation+8.79e-03

Pos logit contributions
ga+0.035
be+0.031
Ar+0.030
ruck+0.029
ric+0.029
Neg logit contributions
 Amy-0.025
 Meg-0.024
uct-0.023
 Daisy-0.023
igsaw-0.023
Token was
Token ID373
Feature activation-5.92e-03

Pos logit contributions
uct+0.015
 Meg+0.014
igsaw+0.014
 Amy+0.014
 refreshing+0.014
Neg logit contributions
ga-0.012
be-0.011
 rh-0.011
Ar-0.011
smith-0.010
Token an
Token ID281
Feature activation+1.81e-02

Pos logit contributions
ga+0.011
be+0.010
Ar+0.010
zee+0.010
aff+0.010
Neg logit contributions
 Amy-0.008
 Daisy-0.007
 Meg-0.007
 Sally-0.007
uct-0.007
Token err
Token ID11454
Feature activation+6.35e-02

Pos logit contributions
 Amy+0.025
 Meg+0.024
 Daisy+0.023
 Sally+0.022
uct+0.022
Neg logit contributions
ga-0.034
be-0.030
Ar-0.028
ruck-0.028
eer-0.028
Tokenand
Token ID392
Feature activation+1.04e-02

Pos logit contributions
 Amy+0.111
 Meg+0.109
 Daisy+0.108
 refreshing+0.104
uct+0.104
Neg logit contributions
ga-0.098
be-0.087
Ar-0.079
ruck-0.075
aff-0.075
Token he
Token ID339
Feature activation-6.83e-03

Pos logit contributions
 Amy+0.016
 Meg+0.015
uct+0.015
 Daisy+0.015
 refreshing+0.014
Neg logit contributions
ga-0.018
be-0.016
Ar-0.014
ruck-0.014
eer-0.014
Token didn
Token ID1422
Feature activation+5.80e-03

Pos logit contributions
ga+0.011
be+0.010
Ar+0.009
aff+0.009
zee+0.009
Neg logit contributions
 Amy-0.011
 Daisy-0.010
 Meg-0.010
 Sally-0.010
uct-0.010
Token't
Token ID470
Feature activation-3.77e-03

Pos logit contributions
 Amy+0.009
 Meg+0.009
 Daisy+0.009
 Sally+0.009
uct+0.009
Neg logit contributions
ga-0.009
be-0.008
Ar-0.008
zee-0.007
aff-0.007
Token understand
Token ID1833
Feature activation-2.26e-02

Pos logit contributions
ga+0.007
be+0.006
Ar+0.006
zee+0.005
aff+0.005
Neg logit contributions
 Amy-0.006
 Meg-0.005
 Daisy-0.005
 Sally-0.005
uct-0.005
Token.
Token ID13
Feature activation-2.43e-03

Pos logit contributions
ga+0.010
be+0.009
Ar+0.009
aff+0.008
k+0.008
Neg logit contributions
 Amy-0.006
 Meg-0.006
uct-0.006
 Daisy-0.006
 Sally-0.006
Token She
Token ID1375
Feature activation-3.56e-03

Pos logit contributions
ga+0.005
be+0.004
Ar+0.004
zee+0.004
k+0.004
Neg logit contributions
 Amy-0.003
 Daisy-0.003
 Sally-0.003
 Sarah-0.003
 Meg-0.003
Token thinks
Token ID6834
Feature activation+3.64e-03

Pos logit contributions
ga+0.006
be+0.005
Ar+0.005
zee+0.005
aff+0.005
Neg logit contributions
 Amy-0.006
 Daisy-0.005
 refreshing-0.005
 Meg-0.005
 there-0.005
Token cereal
Token ID33158
Feature activation-9.97e-03

Pos logit contributions
 Amy+0.004
 Daisy+0.003
 Meg+0.003
 Sally+0.003
 Sarah+0.003
Neg logit contributions
ga-0.008
be-0.007
Ar-0.007
zee-0.007
aff-0.007
Token is
Token ID318
Feature activation+1.58e-02

Pos logit contributions
ga+0.013
be+0.012
Ar+0.011
 rh+0.010
eer+0.010
Neg logit contributions
 Amy-0.018
 Meg-0.018
uct-0.018
 Daisy-0.017
 Sally-0.017
Token y
Token ID331
Feature activation+6.32e-02

Pos logit contributions
igsaw+0.023
 Meg+0.022
ogging+0.022
olt+0.021
ement+0.021
Neg logit contributions
ga-0.022
 rh-0.021
please-0.021
gen-0.021
be-0.021
Tokenummy
Token ID13513
Feature activation+8.18e-03

Pos logit contributions
uct+0.080
 Amy+0.078
 Meg+0.078
 reassured+0.074
 firm+0.074
Neg logit contributions
ga-0.113
aff-0.109
be-0.107
by-0.102
ruck-0.100
Token and
Token ID290
Feature activation+1.26e-02

Pos logit contributions
uct+0.013
 Meg+0.012
 Amy+0.012
 refreshing+0.012
igsaw+0.012
Neg logit contributions
ga-0.013
be-0.012
eer-0.011
 rh-0.011
ruck-0.011
Token good
Token ID922
Feature activation+5.81e-03

Pos logit contributions
 Meg+0.014
igsaw+0.014
uct+0.013
ement+0.013
 Amy+0.013
Neg logit contributions
ga-0.022
 rh-0.022
be-0.022
gen-0.021
eer-0.020
Token for
Token ID329
Feature activation-5.83e-03

Pos logit contributions
 Amy+0.006
 Daisy+0.005
 Sally+0.005
 there+0.005
 Meg+0.005
Neg logit contributions
ga-0.013
Ar-0.012
be-0.012
Bang-0.012
zee-0.011
Token her
Token ID607
Feature activation+9.88e-03

Pos logit contributions
ga+0.013
be+0.012
Ar+0.011
aff+0.011
zee+0.011
Neg logit contributions
 Amy-0.006
 Meg-0.005
 Daisy-0.005
uct-0.005
 Sally-0.005
Token side
Token ID1735
Feature activation+2.42e-02

Pos logit contributions
ga+0.034
be+0.031
Ar+0.029
zee+0.027
 rh+0.027
Neg logit contributions
 Amy-0.037
uct-0.036
 Meg-0.036
 Daisy-0.036
igsaw-0.035
Token,
Token ID11
Feature activation-1.27e-03

Pos logit contributions
uct+0.036
kins+0.031
 refreshing+0.031
igsaw+0.031
 reassured+0.030
Neg logit contributions
ga-0.040
be-0.039
by-0.035
aff-0.033
 pengu-0.033
Token she
Token ID673
Feature activation+6.44e-03

Pos logit contributions
be+0.002
ruck+0.002
ga+0.002
 Where+0.002
 Suddenly+0.002
Neg logit contributions
igsaw-0.002
kin-0.001
uct-0.001
ement-0.001
kins-0.001
Token found
Token ID1043
Feature activation+2.57e-04

Pos logit contributions
 Amy+0.010
 there+0.009
 Daisy+0.009
 Meg+0.009
 refreshing+0.009
Neg logit contributions
ga-0.011
be-0.009
ruck-0.009
aff-0.009
Ar-0.009
Token something
Token ID1223
Feature activation+8.55e-03

Pos logit contributions
 Amy+0.000
uct+0.000
 Meg+0.000
 Daisy+0.000
igsaw+0.000
Neg logit contributions
ga-0.001
be-0.000
Ar-0.000
zee-0.000
ruck-0.000
Token y
Token ID331
Feature activation+6.31e-02

Pos logit contributions
uct+0.012
 Amy+0.012
igsaw+0.012
 Meg+0.011
kins+0.010
Neg logit contributions
ga-0.013
be-0.013
zee-0.012
 rh-0.012
by-0.012
Tokenummy
Token ID13513
Feature activation+7.37e-04

Pos logit contributions
 Amy+0.078
uct+0.072
 Meg+0.071
 Emily+0.068
 Sarah+0.066
Neg logit contributions
ga-0.121
be-0.116
aff-0.114
by-0.110
ruck-0.108
Token waiting
Token ID4953
Feature activation-1.40e-03

Pos logit contributions
 Amy+0.001
 Meg+0.001
uct+0.001
 Daisy+0.001
 Sarah+0.001
Neg logit contributions
ga-0.001
be-0.001
Ar-0.001
ruck-0.001
aff-0.001
Token for
Token ID329
Feature activation+8.08e-03

Pos logit contributions
ga+0.002
be+0.002
ruck+0.002
by+0.002
ric+0.002
Neg logit contributions
 Amy-0.002
uct-0.002
 Meg-0.002
 Daisy-0.002
igsaw-0.002
Token her
Token ID607
Feature activation+8.27e-03

Pos logit contributions
igsaw+0.012
uct+0.012
 refreshing+0.012
 firm+0.011
 reassured+0.011
Neg logit contributions
ga-0.012
ruck-0.011
be-0.011
aff-0.010
please-0.009
Token.
Token ID13
Feature activation-9.98e-04

Pos logit contributions
 Amy+0.011
 Meg+0.010
uct+0.010
 Daisy+0.010
 Sally+0.010
Neg logit contributions
ga-0.016
be-0.014
Ar-0.014
 rh-0.013
zee-0.013
Token box
Token ID3091
Feature activation+1.66e-02

Pos logit contributions
uct+0.031
 Meg+0.026
 Amy+0.025
 reassured+0.023
kins+0.023
Neg logit contributions
ga-0.042
 rh-0.040
be-0.039
 pengu-0.038
Ar-0.037
Token of
Token ID286
Feature activation+2.10e-02

Pos logit contributions
 Meg+0.024
 Amy+0.024
 Daisy+0.023
uct+0.023
igsaw+0.023
Neg logit contributions
ga-0.028
be-0.027
ruck-0.024
k-0.023
aff-0.023
Token tea
Token ID8887
Feature activation+8.60e-03

Pos logit contributions
uct+0.028
 Meg+0.026
 Amy+0.025
igsaw+0.024
 Sally+0.023
Neg logit contributions
ga-0.038
be-0.035
Ar-0.034
 rh-0.034
 pengu-0.034
Token that
Token ID326
Feature activation+8.43e-03

Pos logit contributions
uct+0.012
 Meg+0.012
 Amy+0.011
 Daisy+0.011
 refreshing+0.011
Neg logit contributions
ga-0.015
be-0.014
aff-0.014
zee-0.013
Ar-0.012
Token looked
Token ID3114
Feature activation+3.67e-03

Pos logit contributions
igsaw+0.014
ogging+0.014
ement+0.014
uct+0.013
 firm+0.012
Neg logit contributions
ga-0.011
be-0.010
Ar-0.010
by-0.010
 rh-0.009
Token y
Token ID331
Feature activation+6.28e-02

Pos logit contributions
 Meg+0.005
 Amy+0.005
uct+0.005
igsaw+0.005
 Daisy+0.005
Neg logit contributions
ga-0.006
be-0.006
Ar-0.006
aff-0.005
 rh-0.005
Tokenummy
Token ID13513
Feature activation+7.48e-03

Pos logit contributions
 Amy+0.078
 Meg+0.076
uct+0.073
 Emily+0.069
 firm+0.069
Neg logit contributions
ga-0.119
be-0.116
aff-0.114
ruck-0.105
by-0.102
Token.
Token ID13
Feature activation+5.90e-03

Pos logit contributions
 Meg+0.010
 Amy+0.010
uct+0.010
 Sarah+0.009
igsaw+0.009
Neg logit contributions
ga-0.013
be-0.012
aff-0.011
ruck-0.011
 rh-0.011
Token Lucy
Token ID22162
Feature activation+1.76e-02

Pos logit contributions
 Amy+0.006
 Meg+0.006
uct+0.006
 Daisy+0.006
kins+0.006
Neg logit contributions
ga-0.012
be-0.011
Ar-0.011
zee-0.010
aff-0.010
Token asked
Token ID1965
Feature activation+1.50e-03

Pos logit contributions
 Amy+0.024
 Daisy+0.023
 Sally+0.022
 Meg+0.022
 Emily+0.022
Neg logit contributions
ga-0.033
Ar-0.029
be-0.028
ruck-0.028
ric-0.028
Token her
Token ID607
Feature activation+7.48e-03

Pos logit contributions
 Amy+0.001
 Meg+0.001
 refreshing+0.001
uct+0.001
 Daisy+0.001
Neg logit contributions
ga-0.003
be-0.003
Ar-0.003
aff-0.003
zee-0.003
Token Inside
Token ID14384
Feature activation+3.55e-02

Pos logit contributions
 Amy+0.009
 Sarah+0.009
 Emily+0.008
 there+0.008
 Meg+0.008
Neg logit contributions
ga-0.013
be-0.012
ruck-0.012
eers-0.012
ric-0.012
Token the
Token ID262
Feature activation+9.40e-03

Pos logit contributions
uct+0.046
 refreshing+0.045
igsaw+0.045
 Meg+0.043
 Amy+0.043
Neg logit contributions
ga-0.061
be-0.056
smith-0.055
 rh-0.055
Ar-0.054
Token bone
Token ID9970
Feature activation+7.98e-03

Pos logit contributions
 Amy+0.015
 Meg+0.014
uct+0.014
kins+0.014
 Jane+0.014
Neg logit contributions
ga-0.014
be-0.013
 rh-0.013
Ar-0.013
smith-0.013
Token was
Token ID373
Feature activation-3.66e-03

Pos logit contributions
uct+0.013
 Amy+0.013
igsaw+0.012
 refreshing+0.012
 Meg+0.012
Neg logit contributions
ga-0.011
be-0.010
smith-0.010
 rh-0.010
Ar-0.010
Token a
Token ID257
Feature activation+8.25e-03

Pos logit contributions
ga+0.008
be+0.007
Ar+0.007
aff+0.007
zee+0.006
Neg logit contributions
 Amy-0.004
 Daisy-0.004
 Meg-0.004
 Sally-0.004
 Sarah-0.004
Token y
Token ID331
Feature activation+6.25e-02

Pos logit contributions
 Amy+0.009
 Daisy+0.009
 Meg+0.008
 Sally+0.008
 refreshing+0.008
Neg logit contributions
ga-0.018
be-0.016
Ar-0.015
aff-0.015
ric-0.015
Tokenummy
Token ID13513
Feature activation-5.54e-03

Pos logit contributions
 Amy+0.066
uct+0.062
 Meg+0.062
kins+0.057
 Daisy+0.056
Neg logit contributions
ga-0.132
be-0.124
aff-0.120
by-0.117
Ar-0.116
Token sausage
Token ID31889
Feature activation-1.41e-02

Pos logit contributions
ga+0.010
be+0.009
 rh+0.009
ruck+0.009
Ar+0.009
Neg logit contributions
uct-0.007
 Amy-0.007
 Meg-0.007
kins-0.006
 Daisy-0.006
Token.
Token ID13
Feature activation+5.49e-03

Pos logit contributions
ga+0.029
be+0.026
 rh+0.025
ruck+0.025
Ar+0.025
Neg logit contributions
 Amy-0.016
uct-0.015
 Meg-0.015
 Daisy-0.014
 refreshing-0.014
Token
Token ID220
Feature activation-2.75e-02

Pos logit contributions
 Amy+0.007
 Sarah+0.006
 there+0.006
 Emily+0.006
 Meg+0.006
Neg logit contributions
ga-0.011
be-0.010
ruck-0.010
eers-0.010
eer-0.010
Token
Token ID198
Feature activation-1.45e-03

Pos logit contributions
ga+0.054
be+0.048
zee+0.046
Ar+0.045
ruck+0.045
Neg logit contributions
 Amy-0.036
 Daisy-0.033
 there-0.032
 Meg-0.032
 Sarah-0.032
Token Lily
Token ID20037
Feature activation+2.05e-02

Pos logit contributions
 Amy+0.002
 Meg+0.002
 Daisy+0.002
uct+0.002
 Sally+0.002
Neg logit contributions
ga-0.002
be-0.002
Ar-0.002
aff-0.002
zee-0.002
Token thought
Token ID1807
Feature activation+3.29e-03

Pos logit contributions
 Amy+0.031
 there+0.029
 Daisy+0.029
 Sally+0.029
 Sarah+0.028
Neg logit contributions
ga-0.034
aff-0.030
Ar-0.029
be-0.029
zee-0.029
Token the
Token ID262
Feature activation+3.77e-03

Pos logit contributions
uct+0.004
igsaw+0.004
 Meg+0.004
 Amy+0.004
 refreshing+0.003
Neg logit contributions
ga-0.006
be-0.006
 rh-0.006
eer-0.005
ruck-0.005
Token oats
Token ID47009
Feature activation-1.37e-02

Pos logit contributions
 refreshing+0.006
 there+0.006
 Amy+0.006
 Daisy+0.006
 Sally+0.006
Neg logit contributions
ga-0.006
be-0.005
Bang-0.004
zee-0.004
aff-0.004
Token looked
Token ID3114
Feature activation+5.33e-03

Pos logit contributions
ga+0.015
be+0.014
eer+0.013
 rh+0.012
ruck+0.012
Neg logit contributions
 Meg-0.027
igsaw-0.027
uct-0.026
 Amy-0.026
 Emily-0.025
Token y
Token ID331
Feature activation+6.24e-02

Pos logit contributions
 Meg+0.008
 Amy+0.007
igsaw+0.007
uct+0.007
kins+0.007
Neg logit contributions
ga-0.009
be-0.008
Ar-0.008
 rh-0.008
zee-0.007
Tokenummy
Token ID13513
Feature activation+1.22e-02

Pos logit contributions
 Amy+0.082
 Meg+0.079
uct+0.079
 firm+0.073
 Emily+0.072
Neg logit contributions
ga-0.113
be-0.109
aff-0.108
by-0.101
ruck-0.100
Token.
Token ID13
Feature activation+3.19e-03

Pos logit contributions
uct+0.020
 Meg+0.020
igsaw+0.019
 Amy+0.019
 Linda+0.018
Neg logit contributions
ga-0.017
be-0.016
by-0.015
eer-0.015
 rh-0.015
Token
Token ID198
Feature activation-5.63e-03

Pos logit contributions
 Amy+0.004
 Daisy+0.004
 Sally+0.004
 Sarah+0.004
 Meg+0.004
Neg logit contributions
ga-0.007
be-0.006
Ar-0.006
ruck-0.006
eer-0.006
Token
Token ID198
Feature activation+3.88e-03

Pos logit contributions
ga+0.010
be+0.009
Ar+0.009
zee+0.009
aff+0.008
Neg logit contributions
 Amy-0.008
 Daisy-0.008
 Meg-0.007
 Sally-0.007
 there-0.007
TokenBut
Token ID1537
Feature activation-5.52e-03

Pos logit contributions
 Amy+0.007
 Daisy+0.006
 Meg+0.006
uct+0.006
 there+0.006
Neg logit contributions
ga-0.006
be-0.005
eer-0.005
ric-0.005
smith-0.005
Token<|endoftext|>
Token ID50256
Feature activation+8.26e-04

Pos logit contributions
 Amy+0.009
 Meg+0.008
 Daisy+0.008
uct+0.008
 Sally+0.008
Neg logit contributions
ga-0.007
be-0.007
Ar-0.006
aff-0.006
zee-0.006
TokenBilly
Token ID49640
Feature activation-9.96e-03

Pos logit contributions
 Meg+0.001
 refreshing+0.001
uct+0.001
 Amy+0.001
igsaw+0.001
Neg logit contributions
ga-0.001
Ar-0.001
be-0.001
Bang-0.001
smith-0.001
Token was
Token ID373
Feature activation-9.11e-04

Pos logit contributions
ga+0.017
be+0.016
zee+0.015
Ar+0.015
aff+0.015
Neg logit contributions
 Amy-0.015
 Daisy-0.014
 Sally-0.014
 there-0.014
 Sarah-0.014
Token a
Token ID257
Feature activation-1.10e-03

Pos logit contributions
ga+0.002
be+0.002
Ar+0.002
zee+0.002
aff+0.002
Neg logit contributions
 Amy-0.001
 Meg-0.001
 Daisy-0.001
uct-0.001
 Sally-0.001
Token 3
Token ID513
Feature activation-2.70e-02

Pos logit contributions
ga+0.002
be+0.002
Ar+0.002
aff+0.002
ric+0.002
Neg logit contributions
 Amy-0.001
 Daisy-0.001
 there-0.001
 Sally-0.001
 refreshing-0.001
Token year
Token ID614
Feature activation+6.23e-02

Pos logit contributions
ga+0.050
be+0.046
eer+0.042
 rh+0.041
Ar+0.041
Neg logit contributions
 Meg-0.036
 Amy-0.036
uct-0.035
 Daisy-0.035
 refreshing-0.034
Token old
Token ID1468
Feature activation-2.63e-02

Pos logit contributions
 Amy+0.082
 Meg+0.080
 Daisy+0.077
uct+0.077
 Sally+0.073
Neg logit contributions
ga-0.117
be-0.107
aff-0.100
Ar-0.098
 rh-0.098
Token boy
Token ID2933
Feature activation-2.99e-03

Pos logit contributions
ga+0.044
Ar+0.042
Bang+0.041
ric+0.040
be+0.040
Neg logit contributions
 Amy-0.038
 there-0.037
 Daisy-0.036
 Sally-0.036
 Sarah-0.035
Token who
Token ID508
Feature activation-5.26e-03

Pos logit contributions
ga+0.006
Ar+0.005
be+0.005
Bang+0.005
aff+0.005
Neg logit contributions
 Amy-0.004
 Daisy-0.003
 there-0.003
 Sally-0.003
 Sarah-0.003
Token was
Token ID373
Feature activation-9.45e-03

Pos logit contributions
ga+0.006
be+0.005
Ar+0.005
aff+0.005
zee+0.005
Neg logit contributions
 Amy-0.010
 Daisy-0.010
 Meg-0.010
 Sally-0.010
 Sarah-0.010
Token very
Token ID845
Feature activation+2.21e-02

Pos logit contributions
ga+0.018
Ar+0.017
aff+0.017
zee+0.017
Bang+0.017
Neg logit contributions
 Amy-0.012
 there-0.012
 Sally-0.011
 Sarah-0.011
 Daisy-0.011
TokenM
Token ID44
Feature activation+1.66e-02

Pos logit contributions
 Amy+0.002
 Meg+0.002
uct+0.002
 Daisy+0.002
 refreshing+0.002
Neg logit contributions
ga-0.002
be-0.002
Ar-0.002
aff-0.002
smith-0.002
Tokenandy
Token ID10757
Feature activation+7.48e-03

Pos logit contributions
 Amy+0.022
 Meg+0.021
kins+0.021
 Daisy+0.021
 refreshing+0.021
Neg logit contributions
be-0.028
by-0.027
ga-0.026
aff-0.025
Ar-0.024
Token was
Token ID373
Feature activation-6.39e-04

Pos logit contributions
 Amy+0.012
 Meg+0.011
uct+0.011
 Daisy+0.011
 refreshing+0.011
Neg logit contributions
ga-0.012
be-0.011
Ar-0.010
smith-0.010
 rh-0.010
Token a
Token ID257
Feature activation-3.09e-03

Pos logit contributions
ga+0.001
be+0.001
Ar+0.001
 rh+0.001
ric+0.001
Neg logit contributions
 Amy-0.001
 Meg-0.001
 Daisy-0.001
uct-0.001
igsaw-0.001
Token 3
Token ID513
Feature activation-2.65e-02

Pos logit contributions
ga+0.006
be+0.005
Ar+0.005
aff+0.005
zee+0.005
Neg logit contributions
 Amy-0.004
 Daisy-0.004
 Meg-0.004
 Sally-0.004
 refreshing-0.004
Token year
Token ID614
Feature activation+6.23e-02

Pos logit contributions
ga+0.048
be+0.045
eer+0.040
 pengu+0.039
 rh+0.039
Neg logit contributions
 Meg-0.036
uct-0.035
 Amy-0.035
 Daisy-0.035
 refreshing-0.035
Token old
Token ID1468
Feature activation-2.62e-02

Pos logit contributions
 Amy+0.084
 Meg+0.083
uct+0.080
 Daisy+0.078
kins+0.076
Neg logit contributions
ga-0.112
be-0.103
aff-0.095
 rh-0.093
Ar-0.093
Token girl
Token ID2576
Feature activation+4.22e-03

Pos logit contributions
ga+0.045
Ar+0.042
Bang+0.041
be+0.040
ric+0.039
Neg logit contributions
 Amy-0.038
 Daisy-0.036
 there-0.036
 Sally-0.035
 Sarah-0.035
Token who
Token ID508
Feature activation-5.88e-03

Pos logit contributions
 Amy+0.006
 Daisy+0.005
 Meg+0.005
 Sally+0.005
 Sarah+0.005
Neg logit contributions
ga-0.008
be-0.007
Ar-0.007
zee-0.007
aff-0.007
Token lived
Token ID5615
Feature activation-9.25e-04

Pos logit contributions
ga+0.006
be+0.005
Ar+0.005
zee+0.004
 rh+0.004
Neg logit contributions
 Amy-0.013
 Meg-0.012
 Daisy-0.012
uct-0.012
 Sally-0.012
Token with
Token ID351
Feature activation+1.11e-02

Pos logit contributions
ga+0.002
be+0.001
Ar+0.001
k+0.001
by+0.001
Neg logit contributions
 Amy-0.001
 Meg-0.001
 refreshing-0.001
 Daisy-0.001
uct-0.001
Token.
Token ID13
Feature activation+1.13e-02

Pos logit contributions
ga+0.002
be+0.002
Ar+0.002
ric+0.002
ruck+0.002
Neg logit contributions
 Amy-0.001
 Meg-0.001
 Daisy-0.001
uct-0.001
igsaw-0.001
Token He
Token ID679
Feature activation-7.06e-03

Pos logit contributions
 Amy+0.018
 Daisy+0.017
 Meg+0.017
 Sally+0.017
uct+0.017
Neg logit contributions
ga-0.019
be-0.017
Ar-0.016
zee-0.015
ruck-0.015
Token thought
Token ID1807
Feature activation+7.88e-03

Pos logit contributions
ga+0.014
ric+0.011
ruck+0.011
be+0.011
aff+0.011
Neg logit contributions
 Amy-0.010
 Daisy-0.010
 Sally-0.009
 there-0.009
 Meg-0.009
Token it
Token ID340
Feature activation+1.21e-02

Pos logit contributions
 Amy+0.010
uct+0.010
 Meg+0.009
 Daisy+0.009
igsaw+0.009
Neg logit contributions
ga-0.015
be-0.013
Ar-0.013
 rh-0.012
aff-0.012
Token looked
Token ID3114
Feature activation+5.51e-03

Pos logit contributions
 Amy+0.022
 Meg+0.021
 Daisy+0.021
uct+0.021
 Sally+0.021
Neg logit contributions
ga-0.016
be-0.014
Ar-0.014
zee-0.013
aff-0.013
Token y
Token ID331
Feature activation+6.21e-02

Pos logit contributions
 Amy+0.008
igsaw+0.008
 Meg+0.008
uct+0.007
kins+0.007
Neg logit contributions
ga-0.009
be-0.008
Ar-0.008
Bang-0.008
zee-0.008
Tokenummy
Token ID13513
Feature activation+1.42e-02

Pos logit contributions
 Amy+0.079
uct+0.077
 Meg+0.076
 firm+0.071
kins+0.069
Neg logit contributions
ga-0.111
be-0.108
aff-0.105
by-0.101
ruck-0.098
Token and
Token ID290
Feature activation+2.05e-02

Pos logit contributions
uct+0.022
 Meg+0.022
igsaw+0.021
 Amy+0.021
 Linda+0.020
Neg logit contributions
ga-0.020
be-0.019
by-0.019
eer-0.018
zee-0.018
Token wanted
Token ID2227
Feature activation+1.67e-02

Pos logit contributions
igsaw+0.041
ement+0.036
 Meg+0.036
uct+0.035
 Amy+0.035
Neg logit contributions
ga-0.022
 rh-0.021
please-0.020
be-0.020
Ar-0.020
Token to
Token ID284
Feature activation+8.04e-03

Pos logit contributions
 Amy+0.028
 Daisy+0.027
 Sally+0.027
 there+0.026
 Meg+0.026
Neg logit contributions
ga-0.028
be-0.023
Ar-0.023
aff-0.022
ric-0.022
Token try
Token ID1949
Feature activation+3.57e-03

Pos logit contributions
 Amy+0.011
 Meg+0.010
 Daisy+0.010
 Sally+0.010
uct+0.010
Neg logit contributions
ga-0.015
be-0.014
Ar-0.013
zee-0.013
aff-0.013
Token lobby
Token ID10866
Feature activation+1.18e-02

Pos logit contributions
 Daisy+0.015
 Amy+0.015
 Meg+0.015
uct+0.015
igsaw+0.014
Neg logit contributions
ga-0.019
be-0.018
k-0.016
aff-0.016
Ar-0.016
Token.
Token ID13
Feature activation-3.32e-03

Pos logit contributions
 Daisy+0.016
 Amy+0.016
 Meg+0.016
igsaw+0.015
 Emily+0.015
Neg logit contributions
ga-0.021
be-0.020
k-0.018
aff-0.018
ruck-0.018
Token The
Token ID383
Feature activation-7.90e-03

Pos logit contributions
ga+0.006
be+0.005
Ar+0.005
ruck+0.005
zee+0.005
Neg logit contributions
 Amy-0.005
 Daisy-0.005
 Sally-0.005
 Meg-0.005
 Jane-0.005
Token cake
Token ID12187
Feature activation+1.64e-04

Pos logit contributions
ga+0.017
be+0.016
Ar+0.015
zee+0.015
aff+0.015
Neg logit contributions
 Amy-0.008
 Meg-0.007
 Daisy-0.007
 Sally-0.007
uct-0.007
Token looks
Token ID3073
Feature activation+1.05e-02

Pos logit contributions
 Meg+0.000
 Amy+0.000
uct+0.000
 Daisy+0.000
 Sally+0.000
Neg logit contributions
ga-0.000
be-0.000
 rh-0.000
Ar-0.000
aff-0.000
Token y
Token ID331
Feature activation+6.21e-02

Pos logit contributions
 Amy+0.013
 Meg+0.012
igsaw+0.012
 Daisy+0.012
uct+0.012
Neg logit contributions
ga-0.020
be-0.019
Ar-0.018
 rh-0.018
aff-0.017
Tokenummy
Token ID13513
Feature activation+1.15e-02

Pos logit contributions
 Amy+0.078
uct+0.076
 Meg+0.076
 reassured+0.072
 Emily+0.071
Neg logit contributions
ga-0.114
be-0.110
aff-0.109
ruck-0.101
by-0.100
Token.
Token ID13
Feature activation+9.95e-04

Pos logit contributions
uct+0.018
 Meg+0.018
 Amy+0.017
 Linda+0.017
igsaw+0.017
Neg logit contributions
ga-0.017
be-0.016
ruck-0.015
 rh-0.015
eer-0.015
Token They
Token ID1119
Feature activation+4.82e-03

Pos logit contributions
 Amy+0.001
 Daisy+0.001
 Sally+0.001
 Meg+0.001
 Jane+0.001
Neg logit contributions
ga-0.002
be-0.002
Ar-0.002
ruck-0.002
zee-0.002
Token want
Token ID765
Feature activation+1.51e-02

Pos logit contributions
 Amy+0.007
 Daisy+0.006
 Meg+0.006
 Sally+0.006
uct+0.006
Neg logit contributions
ga-0.009
be-0.008
Ar-0.008
zee-0.007
aff-0.007
Token to
Token ID284
Feature activation+6.92e-03

Pos logit contributions
 Amy+0.026
 Daisy+0.024
 Sally+0.024
 Meg+0.024
 Jane+0.023
Neg logit contributions
ga-0.025
be-0.021
Ar-0.021
ric-0.020
Bang-0.019
Token It
Token ID632
Feature activation+1.39e-02

Pos logit contributions
 Amy+0.015
 Daisy+0.014
 Meg+0.014
 Sally+0.014
 Sarah+0.013
Neg logit contributions
ga-0.014
be-0.012
Ar-0.012
ruck-0.011
zee-0.011
Token was
Token ID373
Feature activation-3.64e-03

Pos logit contributions
 Amy+0.022
 Meg+0.022
igsaw+0.021
 Daisy+0.021
uct+0.021
Neg logit contributions
ga-0.021
be-0.019
Ar-0.018
Bang-0.017
 rh-0.017
Token shiny
Token ID22441
Feature activation+2.96e-02

Pos logit contributions
ga+0.007
be+0.006
Ar+0.006
 rh+0.006
zee+0.006
Neg logit contributions
 Amy-0.004
 Meg-0.004
 Daisy-0.004
uct-0.004
 Sally-0.004
Token and
Token ID290
Feature activation+4.02e-02

Pos logit contributions
 Amy+0.046
 Meg+0.045
uct+0.044
 Daisy+0.042
 Emily+0.041
Neg logit contributions
ga-0.048
be-0.043
 rh-0.041
ruck-0.040
zee-0.040
Token looked
Token ID3114
Feature activation-3.45e-04

Pos logit contributions
igsaw+0.059
ogging+0.058
uct+0.055
ement+0.054
 Meg+0.053
Neg logit contributions
be-0.056
ga-0.056
 rh-0.056
Bang-0.054
Stretch-0.053
Token y
Token ID331
Feature activation+6.21e-02

Pos logit contributions
ga+0.001
be+0.001
Ar+0.001
 rh+0.000
aff+0.000
Neg logit contributions
 Amy-0.000
 Meg-0.000
igsaw-0.000
 Daisy-0.000
uct-0.000
Tokenummy
Token ID13513
Feature activation+9.07e-03

Pos logit contributions
 Amy+0.079
 Meg+0.075
uct+0.070
 Emily+0.069
kins+0.068
Neg logit contributions
ga-0.115
be-0.113
aff-0.109
by-0.102
ruck-0.101
Token.
Token ID13
Feature activation+9.94e-03

Pos logit contributions
 Meg+0.012
 Amy+0.011
uct+0.011
 Emily+0.011
igsaw+0.010
Neg logit contributions
ga-0.016
be-0.015
ruck-0.014
 rh-0.014
by-0.014
Token So
Token ID1406
Feature activation+6.44e-03

Pos logit contributions
 Amy+0.016
 Daisy+0.016
 Meg+0.015
 Sally+0.015
 Sarah+0.015
Neg logit contributions
ga-0.016
be-0.014
Ar-0.014
ruck-0.013
zee-0.013
Token,
Token ID11
Feature activation+7.28e-03

Pos logit contributions
uct+0.009
igsaw+0.009
 refreshing+0.008
 Amy+0.008
kins+0.008
Neg logit contributions
ga-0.011
be-0.010
 rh-0.010
Bang-0.010
Ar-0.010
Token the
Token ID262
Feature activation-3.61e-03

Pos logit contributions
igsaw+0.010
uct+0.009
 Amy+0.009
kin+0.009
kins+0.009
Neg logit contributions
ga-0.012
be-0.012
 rh-0.011
Bang-0.011
Ar-0.011
Token is
Token ID318
Feature activation+2.34e-02

Pos logit contributions
uct+0.005
 refreshing+0.005
igsaw+0.005
 Daisy+0.005
 Meg+0.005
Neg logit contributions
be-0.004
ga-0.004
 rh-0.004
smith-0.004
Ar-0.004
Token this
Token ID428
Feature activation+1.76e-03

Pos logit contributions
 Amy+0.026
 Daisy+0.024
 Meg+0.024
 Sally+0.023
uct+0.023
Neg logit contributions
ga-0.049
be-0.045
Ar-0.044
zee-0.042
aff-0.042
Token?
Token ID30
Feature activation-6.41e-03

Pos logit contributions
 Amy+0.002
 Meg+0.002
 Daisy+0.002
uct+0.002
 Sally+0.002
Neg logit contributions
ga-0.004
be-0.003
Ar-0.003
aff-0.003
zee-0.003
Token It
Token ID632
Feature activation+1.68e-02

Pos logit contributions
ga+0.011
Ar+0.009
ruck+0.009
by+0.009
be+0.009
Neg logit contributions
 Amy-0.010
 Daisy-0.010
 Sally-0.009
 there-0.009
 Emily-0.009
Token looks
Token ID3073
Feature activation+1.04e-02

Pos logit contributions
 Amy+0.026
 Daisy+0.025
 Meg+0.025
 Sally+0.024
uct+0.024
Neg logit contributions
ga-0.028
be-0.025
Ar-0.024
zee-0.023
aff-0.022
Token y
Token ID331
Feature activation+6.21e-02

Pos logit contributions
 Amy+0.012
 Meg+0.012
uct+0.011
igsaw+0.011
 Daisy+0.011
Neg logit contributions
ga-0.020
be-0.019
Ar-0.018
 rh-0.017
aff-0.017
Tokenucky
Token ID5309
Feature activation+1.94e-02

Pos logit contributions
 Amy+0.079
uct+0.077
 Meg+0.076
 firm+0.072
 reassured+0.071
Neg logit contributions
ga-0.114
be-0.110
aff-0.109
ruck-0.101
by-0.099
Token.
Token ID13
Feature activation+6.63e-03

Pos logit contributions
 Amy+0.027
 Meg+0.027
uct+0.027
 Sarah+0.026
 Daisy+0.025
Neg logit contributions
ga-0.033
be-0.031
eer-0.028
Ar-0.028
aff-0.028
Token I
Token ID314
Feature activation-1.10e-02

Pos logit contributions
 Amy+0.011
 Daisy+0.010
 Meg+0.010
 Sally+0.010
 Sarah+0.010
Neg logit contributions
ga-0.011
be-0.009
Ar-0.009
ruck-0.009
by-0.009
Token don
Token ID836
Feature activation+2.09e-02

Pos logit contributions
ga+0.020
be+0.018
Ar+0.017
zee+0.017
aff+0.017
Neg logit contributions
 Amy-0.015
 Daisy-0.014
 there-0.014
 Sally-0.014
 Meg-0.014
Token't
Token ID470
Feature activation-7.73e-03

Pos logit contributions
 Amy+0.039
 Meg+0.038
 refreshing+0.037
uct+0.037
 Daisy+0.037
Neg logit contributions
ga-0.028
be-0.025
Ar-0.022
Bang-0.020
Stretch-0.020
Token.
Token ID13
Feature activation+9.79e-03

Pos logit contributions
 Amy+0.002
 Meg+0.002
 Daisy+0.002
uct+0.002
 refreshing+0.002
Neg logit contributions
ga-0.003
be-0.003
ruck-0.003
k-0.003
Ar-0.003
Token Inside
Token ID14384
Feature activation+3.19e-02

Pos logit contributions
 Amy+0.016
 Meg+0.015
 Daisy+0.015
uct+0.015
 Sally+0.015
Neg logit contributions
ga-0.015
be-0.013
Ar-0.013
zee-0.012
aff-0.012
Token was
Token ID373
Feature activation-3.75e-03

Pos logit contributions
uct+0.044
igsaw+0.043
 refreshing+0.042
 Meg+0.040
 Amy+0.039
Neg logit contributions
ga-0.051
be-0.049
smith-0.048
 rh-0.047
by-0.047
Token a
Token ID257
Feature activation+7.43e-03

Pos logit contributions
ga+0.007
be+0.006
Ar+0.006
zee+0.006
 rh+0.006
Neg logit contributions
 Amy-0.005
 Meg-0.004
uct-0.004
 Daisy-0.004
igsaw-0.004
Token three
Token ID1115
Feature activation-7.91e-03

Pos logit contributions
 Amy+0.009
 Daisy+0.008
 Meg+0.008
 Sally+0.008
 Sarah+0.008
Neg logit contributions
ga-0.015
be-0.014
Ar-0.013
aff-0.013
zee-0.013
Token year
Token ID614
Feature activation+6.20e-02

Pos logit contributions
ga+0.014
be+0.013
 rh+0.012
 pengu+0.012
Ar+0.012
Neg logit contributions
uct-0.011
 Meg-0.011
 Daisy-0.010
 Amy-0.010
 refreshing-0.010
Token old
Token ID1468
Feature activation-1.64e-02

Pos logit contributions
uct+0.094
 Amy+0.093
 Meg+0.092
kin+0.089
kins+0.089
Neg logit contributions
ga-0.098
be-0.089
aff-0.088
 rh-0.085
ruck-0.081
Token girl
Token ID2576
Feature activation+8.04e-03

Pos logit contributions
ga+0.028
be+0.025
Ar+0.024
ric+0.023
Bang+0.023
Neg logit contributions
 Amy-0.025
 Daisy-0.023
 Meg-0.023
 Sally-0.023
 Sarah-0.023
Token.
Token ID13
Feature activation+8.46e-03

Pos logit contributions
uct+0.013
 refreshing+0.013
igsaw+0.013
 Amy+0.012
 Meg+0.012
Neg logit contributions
ga-0.013
be-0.012
 rh-0.010
by-0.010
k-0.010
Token\n
Token ID198
Feature activation-1.16e-03

Pos logit contributions
 refreshing+0.012
 Amy+0.012
uct+0.012
kins+0.012
 Meg+0.012
Neg logit contributions
ga-0.014
be-0.012
Ar-0.012
aff-0.012
zee-0.011
Token\n
Token ID198
Feature activation+8.45e-03

Pos logit contributions
ga+0.002
be+0.002
Ar+0.002
zee+0.002
ric+0.002
Neg logit contributions
 Amy-0.002
 Daisy-0.002
 Meg-0.002
 Sally-0.002
 Sarah-0.002
Token.
Token ID13
Feature activation+9.08e-03

Pos logit contributions
 Amy+0.015
igsaw+0.014
 Meg+0.014
 Daisy+0.014
uct+0.014
Neg logit contributions
ga-0.023
be-0.021
k-0.020
Ar-0.019
aff-0.019
Token He
Token ID679
Feature activation-3.84e-03

Pos logit contributions
 Amy+0.016
 Meg+0.016
uct+0.015
 Daisy+0.015
kins+0.015
Neg logit contributions
ga-0.012
be-0.011
Ar-0.010
Bang-0.010
aff-0.009
Token thought
Token ID1807
Feature activation+7.74e-03

Pos logit contributions
ga+0.007
ric+0.006
aff+0.006
be+0.006
Ar+0.006
Neg logit contributions
 Amy-0.005
 Daisy-0.005
 Sally-0.005
 there-0.005
uct-0.005
Token it
Token ID340
Feature activation+1.41e-02

Pos logit contributions
uct+0.012
 Meg+0.011
igsaw+0.011
 Amy+0.011
kins+0.011
Neg logit contributions
ga-0.012
be-0.011
ruck-0.010
 rh-0.010
smith-0.010
Token looked
Token ID3114
Feature activation+5.69e-03

Pos logit contributions
 Amy+0.024
 Meg+0.024
 Daisy+0.023
uct+0.023
 Sally+0.023
Neg logit contributions
ga-0.020
be-0.018
Ar-0.017
zee-0.016
Bang-0.016
Token y
Token ID331
Feature activation+6.20e-02

Pos logit contributions
 Amy+0.008
igsaw+0.008
 Meg+0.008
uct+0.008
kins+0.008
Neg logit contributions
ga-0.009
be-0.008
Ar-0.008
Bang-0.008
zee-0.007
Tokenummy
Token ID13513
Feature activation+1.51e-02

Pos logit contributions
 Amy+0.077
 Meg+0.074
uct+0.074
 firm+0.068
kins+0.067
Neg logit contributions
ga-0.114
be-0.112
aff-0.107
ruck-0.102
by-0.102
Token,
Token ID11
Feature activation+7.52e-03

Pos logit contributions
uct+0.024
 Meg+0.023
 Amy+0.022
igsaw+0.022
ulu+0.022
Neg logit contributions
ga-0.021
be-0.020
by-0.019
eer-0.019
 rh-0.019
Token so
Token ID523
Feature activation+7.92e-03

Pos logit contributions
igsaw+0.009
 Amy+0.008
kins+0.008
 Meg+0.008
kin+0.008
Neg logit contributions
ga-0.015
be-0.014
Ar-0.013
 rh-0.013
ruck-0.013
Token he
Token ID339
Feature activation-8.78e-03

Pos logit contributions
igsaw+0.012
 Meg+0.011
uct+0.011
 Amy+0.011
 Daisy+0.011
Neg logit contributions
ga-0.012
 rh-0.011
be-0.011
Ar-0.011
ruck-0.011
Token tried
Token ID3088
Feature activation+9.40e-03

Pos logit contributions
ga+0.016
be+0.014
ric+0.014
Ar+0.013
aff+0.013
Neg logit contributions
 Amy-0.013
 Daisy-0.012
 Sally-0.012
 Meg-0.012
 there-0.012
Token was
Token ID373
Feature activation-1.35e-02

Pos logit contributions
ga+0.016
be+0.014
Ar+0.013
aff+0.013
zee+0.013
Neg logit contributions
 Amy-0.012
 Daisy-0.011
 Meg-0.011
 Sally-0.011
 Sarah-0.011
Token a
Token ID257
Feature activation-9.34e-03

Pos logit contributions
ga+0.029
be+0.026
Ar+0.026
aff+0.025
zee+0.025
Neg logit contributions
 Amy-0.016
 Daisy-0.014
 Sally-0.014
 there-0.014
 Meg-0.014
Token clever
Token ID14169
Feature activation-3.11e-02

Pos logit contributions
ga+0.022
ric+0.019
aff+0.019
be+0.019
ruck+0.019
Neg logit contributions
 Amy-0.009
 Daisy-0.008
 Emily-0.008
 Sally-0.008
 refreshing-0.008
Token and
Token ID290
Feature activation+2.87e-02

Pos logit contributions
ga+0.058
ric+0.053
Stretch+0.050
be+0.049
Ar+0.049
Neg logit contributions
 Amy-0.044
 Emily-0.042
 Sarah-0.042
 Daisy-0.041
 Sally-0.041
Token hard
Token ID1327
Feature activation-2.90e-02

Pos logit contributions
uct+0.038
ement+0.037
igsaw+0.036
kins+0.032
 Meg+0.032
Neg logit contributions
ga-0.047
Ar-0.046
Bang-0.045
be-0.045
 rh-0.043
Tokenworking
Token ID16090
Feature activation-7.19e-02

Pos logit contributions
ga+0.047
ric+0.042
aff+0.039
be+0.039
zee+0.038
Neg logit contributions
 Amy-0.043
kin-0.042
 Emily-0.042
 there-0.042
 Daisy-0.041
Token boy
Token ID2933
Feature activation-7.88e-03

Pos logit contributions
ga+0.142
ric+0.124
Stretch+0.122
aff+0.122
Ar+0.120
Neg logit contributions
 Amy-0.092
 Daisy-0.083
 Sally-0.082
 Sarah-0.082
 there-0.081
Token who
Token ID508
Feature activation-5.01e-03

Pos logit contributions
ga+0.019
Ar+0.018
be+0.017
Stretch+0.017
aff+0.017
Neg logit contributions
 Amy-0.007
 Sally-0.006
 Daisy-0.006
 Sarah-0.006
 there-0.006
Token always
Token ID1464
Feature activation-1.31e-02

Pos logit contributions
ga+0.007
be+0.006
Ar+0.005
aff+0.005
zee+0.005
Neg logit contributions
 Amy-0.010
 Daisy-0.009
 Meg-0.009
 Sally-0.009
 Sarah-0.009
Token thought
Token ID1807
Feature activation-1.92e-03

Pos logit contributions
ga+0.021
be+0.018
Ar+0.018
aff+0.017
zee+0.017
Neg logit contributions
 Amy-0.021
 Daisy-0.020
 Meg-0.020
 Sally-0.020
 Sarah-0.019
Token things
Token ID1243
Feature activation+4.69e-03

Pos logit contributions
ga+0.003
be+0.003
ruck+0.003
smith+0.003
eer+0.003
Neg logit contributions
uct-0.002
 refreshing-0.002
igsaw-0.002
 Meg-0.002
 Daisy-0.002
Token Sammy
Token ID44067
Feature activation-2.03e-02

Pos logit contributions
ga+0.011
be+0.010
Ar+0.009
zee+0.009
aff+0.009
Neg logit contributions
 Amy-0.014
 Daisy-0.014
 Meg-0.014
 Sally-0.013
 Sarah-0.013
Token was
Token ID373
Feature activation-1.47e-02

Pos logit contributions
ga+0.030
be+0.026
Ar+0.025
zee+0.024
aff+0.024
Neg logit contributions
 Amy-0.035
 Meg-0.033
 Daisy-0.033
uct-0.032
 Sally-0.032
Token a
Token ID257
Feature activation-1.25e-02

Pos logit contributions
ga+0.029
be+0.026
Ar+0.026
aff+0.025
zee+0.025
Neg logit contributions
 Amy-0.019
 Daisy-0.017
 Sally-0.017
 Meg-0.017
 there-0.016
Token curious
Token ID11040
Feature activation-2.82e-02

Pos logit contributions
ga+0.027
aff+0.024
ric+0.024
ruck+0.024
be+0.024
Neg logit contributions
 Amy-0.013
 Daisy-0.012
 Sally-0.012
 there-0.012
 refreshing-0.012
Token and
Token ID290
Feature activation+1.96e-02

Pos logit contributions
ga+0.047
Ar+0.042
be+0.041
Stretch+0.040
ric+0.039
Neg logit contributions
 Amy-0.044
 Sally-0.043
 Daisy-0.042
 Sarah-0.042
 Jane-0.041
Token outgoing
Token ID28181
Feature activation-6.93e-02

Pos logit contributions
 Amy+0.029
 Meg+0.028
 Daisy+0.027
uct+0.027
 Sally+0.027
Neg logit contributions
ga-0.033
be-0.030
Ar-0.029
zee-0.027
aff-0.027
Token three
Token ID1115
Feature activation-2.03e-02

Pos logit contributions
ga+0.119
Ar+0.114
Bang+0.111
Stretch+0.110
be+0.109
Neg logit contributions
 Amy-0.098
 Sally-0.097
 Daisy-0.096
 Sarah-0.094
 Jane-0.091
Token-
Token ID12
Feature activation-4.53e-03

Pos logit contributions
ga+0.033
be+0.031
eer+0.028
aff+0.028
 rh+0.027
Neg logit contributions
 Meg-0.031
uct-0.031
 Amy-0.030
igsaw-0.030
 Daisy-0.030
Tokenyear
Token ID1941
Feature activation-3.42e-02

Pos logit contributions
ga+0.009
ric+0.008
OO+0.008
Ar+0.008
be+0.008
Neg logit contributions
 Amy-0.005
 Sally-0.005
 Jane-0.005
 Sarah-0.005
 Meg-0.005
Token-
Token ID12
Feature activation-4.74e-03

Pos logit contributions
ga+0.059
be+0.053
Ar+0.052
Bang+0.050
ric+0.050
Neg logit contributions
 Amy-0.049
 Daisy-0.046
 Meg-0.046
 Sally-0.046
 refreshing-0.044
Tokenold
Token ID727
Feature activation-2.40e-02

Pos logit contributions
ga+0.009
ric+0.008
be+0.008
OO+0.008
Ar+0.008
Neg logit contributions
 Amy-0.006
 Meg-0.005
 Sarah-0.005
 Sally-0.005
 Jane-0.005
Token a
Token ID257
Feature activation-1.14e-03

Pos logit contributions
ga+0.007
Ar+0.007
be+0.006
Bang+0.006
 rh+0.006
Neg logit contributions
uct-0.006
kins-0.006
 Meg-0.006
 Amy-0.006
igsaw-0.006
Token beautiful
Token ID4950
Feature activation-2.18e-03

Pos logit contributions
ga+0.002
be+0.002
Ar+0.002
zee+0.002
aff+0.002
Neg logit contributions
 Amy-0.002
 Daisy-0.001
 Meg-0.001
 Sally-0.001
 Sarah-0.001
Token girl
Token ID2576
Feature activation-1.47e-03

Pos logit contributions
ga+0.004
be+0.003
Ar+0.003
 rh+0.003
aff+0.003
Neg logit contributions
 Amy-0.003
uct-0.003
 Meg-0.003
 Daisy-0.003
 Sally-0.003
Token and
Token ID290
Feature activation+1.97e-02

Pos logit contributions
ga+0.002
be+0.002
 rh+0.002
k+0.002
Ar+0.002
Neg logit contributions
uct-0.003
igsaw-0.002
 Meg-0.002
 refreshing-0.002
 Amy-0.002
Token a
Token ID257
Feature activation-1.26e-02

Pos logit contributions
igsaw+0.024
uct+0.024
ement+0.023
kins+0.021
 Meg+0.020
Neg logit contributions
ga-0.036
 rh-0.033
be-0.033
Ar-0.032
Bang-0.032
Token handsome
Token ID22665
Feature activation-6.46e-02

Pos logit contributions
ga+0.025
be+0.023
Ar+0.022
aff+0.021
zee+0.021
Neg logit contributions
 Amy-0.016
 Daisy-0.015
 Sally-0.014
 Meg-0.014
 Emily-0.014
Token boy
Token ID2933
Feature activation-1.86e-02

Pos logit contributions
ga+0.108
be+0.099
Ar+0.095
ric+0.092
Bang+0.091
Neg logit contributions
 Amy-0.097
 Daisy-0.093
 Meg-0.091
 Sally-0.091
 Emily-0.089
Token.
Token ID13
Feature activation-2.77e-04

Pos logit contributions
ga+0.038
Ar+0.035
be+0.035
zee+0.033
aff+0.033
Neg logit contributions
 Amy-0.021
 Daisy-0.019
 Sally-0.019
 Meg-0.019
 there-0.019
Token They
Token ID1119
Feature activation+4.06e-03

Pos logit contributions
ga+0.000
be+0.000
Ar+0.000
aff+0.000
zee+0.000
Neg logit contributions
 Amy-0.000
uct-0.000
 Meg-0.000
 refreshing-0.000
 Daisy-0.000
Token loved
Token ID6151
Feature activation+2.08e-03

Pos logit contributions
 Amy+0.007
 Daisy+0.007
 there+0.007
 Sally+0.007
 Sarah+0.006
Neg logit contributions
ga-0.006
be-0.005
Ar-0.005
zee-0.005
smith-0.005
Token each
Token ID1123
Feature activation-1.47e-02

Pos logit contributions
 Amy+0.003
 Daisy+0.003
 Sally+0.003
 there+0.003
 Sarah+0.003
Neg logit contributions
ga-0.004
be-0.003
Ar-0.003
zee-0.003
aff-0.003
Token and
Token ID290
Feature activation-3.27e-02

Pos logit contributions
ga+0.036
be+0.033
Ar+0.029
ruck+0.029
k+0.029
Neg logit contributions
 Amy-0.026
 Meg-0.025
uct-0.024
 Daisy-0.024
 refreshing-0.024
Token friends
Token ID2460
Feature activation-7.07e-03

Pos logit contributions
ga+0.056
be+0.050
Ar+0.048
zee+0.045
aff+0.045
Neg logit contributions
 Amy-0.048
 Meg-0.046
 Daisy-0.046
uct-0.045
 Sally-0.044
Token and
Token ID290
Feature activation-3.04e-02

Pos logit contributions
ga+0.014
be+0.013
Ar+0.012
zee+0.012
aff+0.012
Neg logit contributions
 Amy-0.009
 Meg-0.008
 Daisy-0.008
 Sally-0.008
uct-0.008
Token not
Token ID407
Feature activation-1.31e-02

Pos logit contributions
ga+0.054
be+0.048
Ar+0.046
aff+0.045
zee+0.044
Neg logit contributions
 Amy-0.044
 Daisy-0.041
 Meg-0.041
 Sally-0.040
 Sarah-0.039
Token take
Token ID1011
Feature activation-1.04e-02

Pos logit contributions
ga+0.027
smith+0.024
Ar+0.024
aff+0.023
zee+0.023
Neg logit contributions
 Amy-0.016
 there-0.015
 Daisy-0.015
 Sally-0.015
 Sue-0.014
Token unnecessary
Token ID13114
Feature activation-6.39e-02

Pos logit contributions
ga+0.020
be+0.018
Ar+0.018
smith+0.017
aff+0.017
Neg logit contributions
 Amy-0.013
 Daisy-0.013
 Sally-0.012
 there-0.012
 Meg-0.012
Token risks
Token ID7476
Feature activation-1.84e-02

Pos logit contributions
ga+0.124
be+0.111
Ar+0.108
zee+0.107
aff+0.107
Neg logit contributions
 Amy-0.079
 there-0.076
 Daisy-0.074
 Meg-0.073
 Sally-0.072
Token.
Token ID13
Feature activation+3.43e-03

Pos logit contributions
ga+0.037
be+0.034
Ar+0.032
aff+0.031
ruck+0.031
Neg logit contributions
 Amy-0.022
 Meg-0.021
 Daisy-0.020
uct-0.020
 Sally-0.020
Token We
Token ID775
Feature activation-9.45e-03

Pos logit contributions
 Amy+0.005
 Meg+0.005
 Daisy+0.005
 Sally+0.005
uct+0.005
Neg logit contributions
ga-0.006
be-0.005
Ar-0.005
zee-0.004
aff-0.004
Token should
Token ID815
Feature activation-1.19e-02

Pos logit contributions
ga+0.015
be+0.013
Ar+0.012
zee+0.012
aff+0.012
Neg logit contributions
 Amy-0.015
 Daisy-0.015
 Meg-0.014
 Sally-0.014
 there-0.014
Token always
Token ID1464
Feature activation-2.81e-02

Pos logit contributions
ga+0.016
be+0.014
Ar+0.013
ric+0.012
 rh+0.012
Neg logit contributions
 Amy-0.022
 Meg-0.021
 Daisy-0.021
uct-0.021
igsaw-0.020
Token responsible
Token ID4497
Feature activation-7.80e-03

Pos logit contributions
 Amy+0.006
 Meg+0.006
uct+0.006
igsaw+0.006
 Daisy+0.006
Neg logit contributions
ga-0.010
be-0.009
Ar-0.009
k-0.008
eer-0.008
Token and
Token ID290
Feature activation-9.23e-03

Pos logit contributions
ga+0.016
be+0.014
Ar+0.014
zee+0.013
aff+0.013
Neg logit contributions
 Amy-0.009
 Daisy-0.009
 Meg-0.009
 Sally-0.008
uct-0.008
Token making
Token ID1642
Feature activation+8.95e-04

Pos logit contributions
ga+0.017
be+0.016
Ar+0.015
ruck+0.014
zee+0.014
Neg logit contributions
 Amy-0.012
 Meg-0.012
 Daisy-0.011
uct-0.011
 Sally-0.011
Token sure
Token ID1654
Feature activation-1.09e-02

Pos logit contributions
 Amy+0.001
 Meg+0.001
uct+0.001
 Daisy+0.001
 Sally+0.001
Neg logit contributions
ga-0.002
be-0.002
Ar-0.002
ruck-0.001
ric-0.001
Token to
Token ID284
Feature activation+2.69e-04

Pos logit contributions
ga+0.020
be+0.018
Ar+0.018
zee+0.017
aff+0.017
Neg logit contributions
 Amy-0.015
 Daisy-0.014
 Meg-0.014
 Sally-0.013
 there-0.013
Token double
Token ID4274
Feature activation-6.31e-02

Pos logit contributions
 Amy+0.000
 Meg+0.000
 Daisy+0.000
uct+0.000
 Sally+0.000
Neg logit contributions
ga-0.000
be-0.000
Ar-0.000
ric-0.000
ruck-0.000
Token-
Token ID12
Feature activation-4.15e-03

Pos logit contributions
ga+0.119
be+0.108
Ar+0.106
aff+0.103
zee+0.102
Neg logit contributions
 Amy-0.080
 Daisy-0.076
 there-0.075
 Meg-0.075
 Sally-0.074
Tokencheck
Token ID9122
Feature activation-5.03e-02

Pos logit contributions
ga+0.006
Ar+0.006
aff+0.006
zee+0.005
Bang+0.005
Neg logit contributions
 there-0.007
 Amy-0.007
 Sally-0.006
 Daisy-0.006
uct-0.006
Token before
Token ID878
Feature activation+1.63e-03

Pos logit contributions
ga+0.100
be+0.092
Ar+0.090
zee+0.087
aff+0.086
Neg logit contributions
 Amy-0.059
 Daisy-0.055
 Meg-0.055
 Sally-0.053
 Sarah-0.053
Token doing
Token ID1804
Feature activation-1.95e-02

Pos logit contributions
 Amy+0.002
 Meg+0.002
uct+0.002
 Daisy+0.002
igsaw+0.002
Neg logit contributions
ga-0.003
be-0.003
Ar-0.003
ruck-0.003
ric-0.002
Token something
Token ID1223
Feature activation-4.39e-03

Pos logit contributions
ga+0.037
be+0.034
Ar+0.032
ruck+0.031
ric+0.031
Neg logit contributions
 Amy-0.025
 Meg-0.024
 Daisy-0.023
uct-0.023
 Sally-0.023
Token be
Token ID307
Feature activation+4.45e-03

Pos logit contributions
ga+0.013
be+0.012
Ar+0.011
ric+0.010
ruck+0.010
Neg logit contributions
 Amy-0.013
 Meg-0.012
 Daisy-0.012
uct-0.012
 refreshing-0.012
Token careful
Token ID8161
Feature activation+7.73e-03

Pos logit contributions
igsaw+0.006
 Amy+0.005
uct+0.005
 Meg+0.005
 Daisy+0.005
Neg logit contributions
ga-0.008
be-0.008
k-0.007
please-0.007
Ar-0.007
Token and
Token ID290
Feature activation-2.42e-02

Pos logit contributions
 Amy+0.010
 Meg+0.010
uct+0.010
 refreshing+0.010
 Daisy+0.010
Neg logit contributions
ga-0.014
be-0.013
ruck-0.012
k-0.012
Ar-0.011
Token not
Token ID407
Feature activation-9.09e-03

Pos logit contributions
ga+0.046
be+0.041
Ar+0.040
zee+0.038
aff+0.038
Neg logit contributions
 Amy-0.032
 Meg-0.030
 Daisy-0.030
 Sally-0.029
uct-0.029
Token take
Token ID1011
Feature activation-1.20e-02

Pos logit contributions
ga+0.018
Ar+0.016
smith+0.015
be+0.015
zee+0.015
Neg logit contributions
 Amy-0.012
 Sally-0.011
 Daisy-0.011
 there-0.011
 Meg-0.011
Token unnecessary
Token ID13114
Feature activation-6.30e-02

Pos logit contributions
ga+0.022
be+0.020
Ar+0.019
zee+0.018
aff+0.018
Neg logit contributions
 Amy-0.016
 Meg-0.015
 Daisy-0.015
uct-0.015
 Sally-0.015
Token risks
Token ID7476
Feature activation-1.74e-02

Pos logit contributions
ga+0.122
be+0.109
Ar+0.107
zee+0.105
smith+0.104
Neg logit contributions
 Amy-0.081
 there-0.075
 Meg-0.075
 Daisy-0.075
 Sally-0.073
Token.
Token ID13
Feature activation-5.30e-03

Pos logit contributions
ga+0.034
be+0.031
Ar+0.028
aff+0.028
ruck+0.028
Neg logit contributions
 Amy-0.021
 Meg-0.021
uct-0.020
 Daisy-0.020
 Sally-0.020
Token The
Token ID383
Feature activation-2.23e-03

Pos logit contributions
ga+0.008
be+0.007
Ar+0.007
aff+0.007
Bang+0.007
Neg logit contributions
 Amy-0.008
 Meg-0.008
 refreshing-0.008
uct-0.008
 Daisy-0.008
Token moral
Token ID6573
Feature activation-1.80e-02

Pos logit contributions
ga+0.003
be+0.003
Ar+0.003
 rh+0.003
Bang+0.003
Neg logit contributions
 Amy-0.004
 Meg-0.004
 Daisy-0.004
 Sally-0.004
uct-0.004
Token of
Token ID286
Feature activation+1.90e-02

Pos logit contributions
ga+0.041
be+0.039
Ar+0.038
ruck+0.036
Bang+0.036
Neg logit contributions
 Amy-0.016
uct-0.015
 Meg-0.014
 Daisy-0.014
 refreshing-0.014
Token Tommy
Token ID19919
Feature activation-1.55e-02

Pos logit contributions
uct+0.014
kins+0.014
igsaw+0.013
 refreshing+0.013
 Meg+0.013
Neg logit contributions
ga-0.022
be-0.019
 rh-0.019
smith-0.019
aff-0.018
Token and
Token ID290
Feature activation-3.23e-03

Pos logit contributions
ga+0.027
be+0.024
 rh+0.023
by+0.022
eer+0.022
Neg logit contributions
igsaw-0.022
uct-0.022
 refreshing-0.021
 Meg-0.021
 Amy-0.020
Token a
Token ID257
Feature activation-2.01e-02

Pos logit contributions
ga+0.006
 rh+0.005
be+0.005
Ar+0.005
eer+0.005
Neg logit contributions
ement-0.004
uct-0.004
kins-0.004
igsaw-0.004
 Amy-0.004
Token c
Token ID269
Feature activation+1.10e-02

Pos logit contributions
ga+0.037
Ar+0.035
be+0.034
 rh+0.033
by+0.032
Neg logit contributions
uct-0.025
 Amy-0.024
 Meg-0.024
kins-0.023
 Daisy-0.023
Tokenudd
Token ID4185
Feature activation+2.09e-03

Pos logit contributions
 Amy+0.014
 Meg+0.013
 Daisy+0.013
 Sally+0.013
uct+0.013
Neg logit contributions
ga-0.022
be-0.020
Ar-0.019
aff-0.018
zee-0.018
Tokenly
Token ID306
Feature activation-6.28e-02

Pos logit contributions
 Amy+0.005
 Meg+0.004
 Daisy+0.004
uct+0.004
 Emily+0.004
Neg logit contributions
ga-0.002
be-0.002
aff-0.002
ruck-0.002
eer-0.002
Token friend
Token ID1545
Feature activation+5.38e-04

Pos logit contributions
ga+0.098
be+0.087
Ar+0.083
zee+0.079
aff+0.078
Neg logit contributions
 Amy-0.102
 Meg-0.097
 Daisy-0.096
uct-0.096
 Sally-0.094
Token like
Token ID588
Feature activation+7.06e-03

Pos logit contributions
 Amy+0.001
uct+0.001
 Meg+0.001
igsaw+0.001
 refreshing+0.001
Neg logit contributions
ga-0.001
be-0.001
Ar-0.001
zee-0.001
aff-0.001
Token her
Token ID607
Feature activation-1.26e-03

Pos logit contributions
uct+0.009
 Amy+0.009
 Meg+0.009
kins+0.009
 refreshing+0.009
Neg logit contributions
ga-0.013
be-0.011
Ar-0.011
 rh-0.011
aff-0.011
Token t
Token ID256
Feature activation+4.24e-03

Pos logit contributions
 rh+0.002
ga+0.002
 pengu+0.002
by+0.002
eer+0.002
Neg logit contributions
uct-0.002
igsaw-0.002
kins-0.002
 Meg-0.002
 Amy-0.002
Tokeneddy
Token ID21874
Feature activation-2.47e-02

Pos logit contributions
 Amy+0.009
 Daisy+0.008
 Meg+0.008
 Sally+0.008
uct+0.008
Neg logit contributions
ga-0.005
be-0.004
Ar-0.004
zee-0.004
aff-0.004
Token monkeys
Token ID26697
Feature activation-8.85e-03

Pos logit contributions
ga+0.002
be+0.002
zee+0.002
Ar+0.002
check+0.002
Neg logit contributions
 Amy-0.002
 Daisy-0.002
 Sue-0.002
 there-0.002
 Sally-0.002
Token and
Token ID290
Feature activation-1.12e-02

Pos logit contributions
ga+0.019
be+0.017
Ar+0.017
zee+0.016
Bang+0.016
Neg logit contributions
 Amy-0.009
 Meg-0.009
 Daisy-0.009
 Sally-0.008
 there-0.008
Token elephants
Token ID25368
Feature activation-2.76e-02

Pos logit contributions
ga+0.018
be+0.016
Ar+0.015
zee+0.015
net+0.015
Neg logit contributions
 Amy-0.020
 Daisy-0.018
 Sue-0.018
 there-0.017
 Meg-0.017
Token and
Token ID290
Feature activation-1.35e-02

Pos logit contributions
ga+0.063
Ar+0.058
be+0.057
Bang+0.055
zee+0.055
Neg logit contributions
 Amy-0.025
 Daisy-0.022
 Meg-0.022
 there-0.022
 Sally-0.021
Token gir
Token ID37370
Feature activation-1.70e-02

Pos logit contributions
ga+0.022
be+0.019
Ar+0.018
zee+0.018
aff+0.017
Neg logit contributions
 Amy-0.024
 Daisy-0.022
 Sue-0.021
 Meg-0.021
 there-0.021
Tokenaff
Token ID2001
Feature activation-6.25e-02

Pos logit contributions
ga+0.014
be+0.011
Ar+0.011
Bang+0.010
Stretch+0.010
Neg logit contributions
 Amy-0.039
 Meg-0.038
 Sue-0.038
 Daisy-0.037
 Sally-0.037
Tokenes
Token ID274
Feature activation-1.17e-02

Pos logit contributions
ga+0.125
Ar+0.121
Bang+0.119
be+0.117
 mechanic+0.117
Neg logit contributions
 Amy-0.064
 there-0.063
kins-0.062
 Meg-0.059
 Daisy-0.057
Token.
Token ID13
Feature activation-1.08e-03

Pos logit contributions
ga+0.024
be+0.022
Ar+0.021
aff+0.021
zee+0.020
Neg logit contributions
 Amy-0.013
 Meg-0.012
 Daisy-0.012
uct-0.012
 Sally-0.012
Token They
Token ID1119
Feature activation+4.89e-03

Pos logit contributions
ga+0.002
be+0.001
Ar+0.001
zee+0.001
ruck+0.001
Neg logit contributions
 Amy-0.002
 Daisy-0.002
 Meg-0.002
 Sally-0.002
 Sarah-0.002
Token were
Token ID547
Feature activation+2.98e-03

Pos logit contributions
 Amy+0.007
 Daisy+0.007
 Meg+0.007
 Sally+0.006
 there+0.006
Neg logit contributions
ga-0.009
be-0.008
Ar-0.008
zee-0.007
ruck-0.007
Token very
Token ID845
Feature activation+7.39e-03

Pos logit contributions
 Amy+0.004
 Meg+0.004
 Daisy+0.004
 Sally+0.004
 Sarah+0.003
Neg logit contributions
ga-0.006
be-0.005
Ar-0.005
zee-0.005
aff-0.005
Token "
Token ID366
Feature activation-1.45e-02

Pos logit contributions
 Amy+0.009
 Daisy+0.009
 there+0.009
 Sally+0.009
 Sarah+0.008
Neg logit contributions
ga-0.009
by-0.009
Ar-0.009
zee-0.009
be-0.009
TokenThank
Token ID10449
Feature activation+6.72e-03

Pos logit contributions
ga+0.027
ric+0.025
be+0.024
aff+0.024
eer+0.024
Neg logit contributions
 Amy-0.017
uct-0.017
 Daisy-0.017
 Sally-0.017
 Meg-0.017
Token you
Token ID345
Feature activation+1.34e-03

Pos logit contributions
 Amy+0.005
 Meg+0.005
 refreshing+0.005
uct+0.005
 Daisy+0.005
Neg logit contributions
ga-0.016
be-0.015
Ar-0.015
aff-0.014
 rh-0.014
Token,
Token ID11
Feature activation-3.72e-03

Pos logit contributions
uct+0.002
 Amy+0.002
 Meg+0.002
 refreshing+0.001
 Daisy+0.001
Neg logit contributions
ga-0.003
be-0.002
Ar-0.002
aff-0.002
 rh-0.002
Token Che
Token ID2580
Feature activation-1.53e-02

Pos logit contributions
ga+0.005
 rh+0.005
be+0.005
 pengu+0.005
 horn+0.004
Neg logit contributions
kins-0.006
 Amy-0.006
 Meg-0.006
uct-0.006
igsaw-0.006
Tokenwy
Token ID21768
Feature activation-6.20e-02

Pos logit contributions
ga+0.025
be+0.022
Ar+0.021
zee+0.020
aff+0.020
Neg logit contributions
 Amy-0.025
 Daisy-0.023
 Meg-0.023
 Sally-0.023
uct-0.023
Token!"
Token ID2474
Feature activation-9.73e-03

Pos logit contributions
ga+0.140
Ar+0.128
be+0.125
Bang+0.122
zee+0.121
Neg logit contributions
 Amy-0.062
 Daisy-0.056
 Sally-0.055
 Sarah-0.054
 Jane-0.052
Token They
Token ID1119
Feature activation+2.87e-03

Pos logit contributions
ga+0.016
be+0.015
Ar+0.015
zee+0.015
ruck+0.014
Neg logit contributions
 Amy-0.013
 Sally-0.013
 Daisy-0.013
uct-0.013
 Meg-0.013
Token became
Token ID2627
Feature activation-8.09e-03

Pos logit contributions
 Amy+0.004
 Meg+0.004
igsaw+0.004
 Daisy+0.004
uct+0.004
Neg logit contributions
ga-0.005
be-0.004
Ar-0.004
Bang-0.004
 rh-0.004
Token best
Token ID1266
Feature activation-4.03e-03

Pos logit contributions
ga+0.015
be+0.013
Ar+0.013
 rh+0.013
Bang+0.013
Neg logit contributions
uct-0.011
 Amy-0.010
igsaw-0.010
 Meg-0.010
 Daisy-0.009
Token friends
Token ID2460
Feature activation+3.00e-03

Pos logit contributions
ga+0.008
be+0.007
Ar+0.007
aff+0.007
ruck+0.007
Neg logit contributions
uct-0.005
 Amy-0.005
 Meg-0.005
 Daisy-0.005
igsaw-0.005
Token a
Token ID257
Feature activation-1.67e-02

Pos logit contributions
ga+0.044
be+0.040
Ar+0.037
aff+0.037
ruck+0.037
Neg logit contributions
uct-0.032
 Amy-0.032
 Meg-0.031
 refreshing-0.030
igsaw-0.030
Token chees
Token ID27384
Feature activation-3.29e-02

Pos logit contributions
ga+0.034
be+0.030
Ar+0.028
ric+0.028
Bang+0.028
Neg logit contributions
 Amy-0.021
 Daisy-0.019
 Meg-0.019
 Sally-0.019
 refreshing-0.019
Tokeneb
Token ID1765
Feature activation-1.58e-02

Pos logit contributions
ga+0.036
be+0.031
Ar+0.028
aff+0.026
zee+0.026
Neg logit contributions
 Amy-0.069
 Meg-0.067
 Daisy-0.067
uct-0.066
 Sally-0.065
Tokenurger
Token ID32650
Feature activation-4.82e-02

Pos logit contributions
ga+0.022
Bang+0.021
Ar+0.019
Stretch+0.019
Charge+0.018
Neg logit contributions
kins-0.028
 there-0.026
 y-0.025
ma-0.025
 Amy-0.025
Token and
Token ID290
Feature activation-2.18e-02

Pos logit contributions
ga+0.097
be+0.089
Ar+0.086
zee+0.082
aff+0.082
Neg logit contributions
 Amy-0.057
 Meg-0.053
 Daisy-0.053
 Sally-0.051
uct-0.050
Token fries
Token ID31757
Feature activation-6.18e-02

Pos logit contributions
ga+0.044
be+0.038
ric+0.036
aff+0.036
zee+0.036
Neg logit contributions
 Amy-0.030
 Daisy-0.028
 Sally-0.027
 Sarah-0.026
 Meg-0.026
Token all
Token ID477
Feature activation-5.64e-03

Pos logit contributions
ga+0.130
be+0.119
Ar+0.115
aff+0.110
zee+0.110
Neg logit contributions
 Amy-0.067
 Meg-0.062
 Daisy-0.062
uct-0.060
 Sally-0.060
Token by
Token ID416
Feature activation-4.18e-03

Pos logit contributions
ga+0.010
be+0.010
ruck+0.009
by+0.009
eer+0.009
Neg logit contributions
 Meg-0.007
 Amy-0.007
uct-0.007
 Daisy-0.007
igsaw-0.007
Token herself
Token ID5223
Feature activation-7.02e-03

Pos logit contributions
ga+0.008
be+0.007
Ar+0.007
ruck+0.007
aff+0.007
Neg logit contributions
 Amy-0.005
 Meg-0.005
uct-0.005
 Daisy-0.005
 refreshing-0.005
Token.
Token ID13
Feature activation-2.47e-03

Pos logit contributions
ga+0.013
be+0.012
by+0.011
k+0.011
aff+0.011
Neg logit contributions
 Amy-0.009
 Meg-0.009
uct-0.008
 Daisy-0.008
 refreshing-0.008
Token Her
Token ID2332
Feature activation-2.26e-03

Pos logit contributions
ga+0.005
be+0.004
Ar+0.004
zee+0.004
eer+0.004
Neg logit contributions
 Amy-0.004
 Daisy-0.003
 Sally-0.003
 Sarah-0.003
 Meg-0.003
Token fun
Token ID1257
Feature activation-1.21e-02

Pos logit contributions
ga+0.006
Ar+0.006
be+0.006
ruck+0.006
aff+0.006
Neg logit contributions
uct-0.003
 Amy-0.003
 Meg-0.003
 Daisy-0.003
 Sally-0.003
Token.
Token ID13
Feature activation-2.77e-03

Pos logit contributions
ga+0.023
be+0.022
ruck+0.021
Ar+0.020
aff+0.020
Neg logit contributions
 Amy-0.015
 Meg-0.014
uct-0.014
 Daisy-0.014
 Sally-0.013
Token
Token ID198
Feature activation-4.87e-03

Pos logit contributions
ga+0.005
be+0.004
Ar+0.004
zee+0.004
aff+0.004
Neg logit contributions
 Amy-0.004
 Meg-0.004
 Daisy-0.004
 Sally-0.004
uct-0.004
Token
Token ID198
Feature activation+3.69e-03

Pos logit contributions
ga+0.010
be+0.009
zee+0.008
Ar+0.008
aff+0.008
Neg logit contributions
 Amy-0.006
 Daisy-0.006
 Meg-0.006
 Sally-0.005
uct-0.005
TokenThe
Token ID464
Feature activation-7.67e-03

Pos logit contributions
 Amy+0.005
 Meg+0.005
 Daisy+0.005
uct+0.005
 Sally+0.005
Neg logit contributions
ga-0.007
be-0.006
Ar-0.006
zee-0.005
aff-0.005
Token younger
Token ID7099
Feature activation-6.15e-02

Pos logit contributions
ga+0.012
be+0.011
Ar+0.010
 rh+0.010
by+0.010
Neg logit contributions
 Amy-0.012
uct-0.012
 Sally-0.012
 Meg-0.012
 Sue-0.011
Token sister
Token ID6621
Feature activation-1.20e-02

Pos logit contributions
ga+0.109
Ar+0.098
be+0.097
ric+0.093
zee+0.093
Neg logit contributions
 Amy-0.088
 Daisy-0.083
 Meg-0.082
 Emily-0.080
 Sarah-0.079
Token was
Token ID373
Feature activation-1.08e-02

Pos logit contributions
ga+0.022
be+0.019
Ar+0.019
zee+0.018
aff+0.018
Neg logit contributions
 Amy-0.016
 Daisy-0.015
 Meg-0.015
 Sally-0.015
uct-0.015
Token very
Token ID845
Feature activation+7.96e-03

Pos logit contributions
ga+0.021
be+0.020
Ar+0.018
ric+0.018
 rh+0.017
Neg logit contributions
 Amy-0.013
 Meg-0.012
 Daisy-0.012
igsaw-0.012
kins-0.012
Token thankful
Token ID25535
Feature activation+3.41e-03

Pos logit contributions
 Amy+0.010
 Daisy+0.009
 Meg+0.009
uct+0.009
igsaw+0.009
Neg logit contributions
ga-0.015
be-0.014
Ar-0.013
k-0.012
by-0.012
Token for
Token ID329
Feature activation-1.05e-02

Pos logit contributions
uct+0.005
 Amy+0.005
igsaw+0.005
 Meg+0.005
 refreshing+0.005
Neg logit contributions
ga-0.005
be-0.005
by-0.004
ruck-0.004
aff-0.004
Token on
Token ID319
Feature activation+2.94e-02

Pos logit contributions
ga+0.042
be+0.039
Ar+0.037
Bang+0.035
aff+0.035
Neg logit contributions
 Amy-0.024
 Meg-0.023
 Daisy-0.023
uct-0.022
 Sally-0.022
Token,
Token ID11
Feature activation+3.55e-03

Pos logit contributions
uct+0.039
 Amy+0.036
 Meg+0.035
 refreshing+0.035
igsaw+0.035
Neg logit contributions
ga-0.054
be-0.048
Ar-0.046
by-0.045
 rh-0.044
Token Ben
Token ID3932
Feature activation-8.19e-03

Pos logit contributions
 Amy+0.004
uct+0.004
 Meg+0.004
ement+0.004
igsaw+0.004
Neg logit contributions
ga-0.007
be-0.006
Ar-0.006
by-0.006
 rh-0.005
Token and
Token ID290
Feature activation+3.15e-02

Pos logit contributions
ga+0.014
be+0.013
Ar+0.013
zee+0.013
aff+0.012
Neg logit contributions
 Amy-0.011
 Daisy-0.011
 Meg-0.011
 Sally-0.010
 Sarah-0.010
Token T
Token ID309
Feature activation-2.32e-02

Pos logit contributions
 Amy+0.035
uct+0.034
 refreshing+0.034
kins+0.034
ement+0.033
Neg logit contributions
ga-0.064
be-0.056
Ar-0.055
by-0.054
 rh-0.053
Tokenoni
Token ID14651
Feature activation-6.11e-02

Pos logit contributions
ga+0.036
be+0.032
Ar+0.031
Bang+0.029
zee+0.029
Neg logit contributions
 Amy-0.038
 Meg-0.037
 Sally-0.036
 Daisy-0.036
uct-0.036
Token spent
Token ID3377
Feature activation-2.13e-02

Pos logit contributions
ga+0.106
be+0.094
Ar+0.091
zee+0.087
aff+0.087
Neg logit contributions
 Amy-0.089
 Meg-0.084
 Daisy-0.084
 Sally-0.082
uct-0.081
Token a
Token ID257
Feature activation-2.92e-02

Pos logit contributions
ga+0.047
be+0.043
Ar+0.042
zee+0.040
aff+0.040
Neg logit contributions
 Amy-0.022
 Meg-0.020
 Daisy-0.020
 Sally-0.019
uct-0.019
Token lot
Token ID1256
Feature activation-3.73e-03

Pos logit contributions
ga+0.049
be+0.044
Ar+0.043
zee+0.039
Bang+0.039
Neg logit contributions
 Amy-0.044
 Meg-0.042
 Daisy-0.042
uct-0.041
 Sally-0.041
Token of
Token ID286
Feature activation-1.36e-02

Pos logit contributions
ga+0.008
be+0.008
Ar+0.007
ruck+0.007
aff+0.007
Neg logit contributions
 Amy-0.004
uct-0.004
 Daisy-0.004
 Meg-0.004
 Sally-0.003
Token time
Token ID640
Feature activation-1.85e-02

Pos logit contributions
ga+0.027
be+0.024
Ar+0.024
Bang+0.022
ruck+0.022
Neg logit contributions
 Amy-0.016
 Meg-0.016
 Daisy-0.016
uct-0.015
 Sally-0.015
Token lions
Token ID31420
Feature activation-6.59e-03

Pos logit contributions
ga+0.005
be+0.005
Ar+0.004
Stretch+0.004
zee+0.004
Neg logit contributions
 Amy-0.006
 Daisy-0.006
 Meg-0.006
 Sally-0.005
 refreshing-0.005
Token and
Token ID290
Feature activation-1.93e-02

Pos logit contributions
ga+0.013
be+0.012
Ar+0.011
aff+0.011
 rh+0.011
Neg logit contributions
 Amy-0.008
 Meg-0.008
 Daisy-0.008
uct-0.007
 Sally-0.007
Token monkeys
Token ID26697
Feature activation-1.44e-02

Pos logit contributions
ga+0.041
be+0.038
Ar+0.035
zee+0.035
OO+0.034
Neg logit contributions
 Amy-0.025
 Daisy-0.023
 Sue-0.022
 Sally-0.022
 Meg-0.022
Token and
Token ID290
Feature activation-2.45e-02

Pos logit contributions
ga+0.031
be+0.028
Ar+0.028
zee+0.027
Bang+0.026
Neg logit contributions
 Amy-0.015
 Meg-0.014
 Daisy-0.014
 Sally-0.013
uct-0.013
Token gir
Token ID37370
Feature activation-1.98e-02

Pos logit contributions
ga+0.041
be+0.037
Ar+0.035
zee+0.033
net+0.033
Neg logit contributions
 Amy-0.043
 Daisy-0.039
 Sue-0.038
 Meg-0.038
 Sally-0.038
Tokenaff
Token ID2001
Feature activation-6.11e-02

Pos logit contributions
ga+0.016
be+0.013
Ar+0.013
Bang+0.012
Stretch+0.012
Neg logit contributions
 Amy-0.045
 Meg-0.044
 Sue-0.044
ement-0.043
 Daisy-0.043
Tokenes
Token ID274
Feature activation-1.49e-02

Pos logit contributions
ga+0.122
Ar+0.120
Bang+0.117
be+0.115
zee+0.112
Neg logit contributions
 Amy-0.064
 there-0.061
kins-0.060
 Meg-0.059
 Daisy-0.056
Token.
Token ID13
Feature activation-5.90e-03

Pos logit contributions
ga+0.031
be+0.029
Ar+0.028
aff+0.027
zee+0.027
Neg logit contributions
 Amy-0.016
 Meg-0.015
 Daisy-0.015
uct-0.014
 Sally-0.014
Token
Token ID198
Feature activation-5.38e-03

Pos logit contributions
ga+0.009
be+0.007
Ar+0.007
aff+0.007
zee+0.007
Neg logit contributions
 Amy-0.010
 Meg-0.010
 Daisy-0.010
uct-0.010
 Sally-0.010
Token
Token ID198
Feature activation+2.04e-03

Pos logit contributions
ga+0.010
be+0.009
zee+0.009
eer+0.009
k+0.009
Neg logit contributions
 Amy-0.007
 Daisy-0.007
 Meg-0.006
 there-0.006
 Sally-0.006
TokenThey
Token ID2990
Feature activation-2.80e-03

Pos logit contributions
 Amy+0.003
 Meg+0.003
 Daisy+0.003
 Sally+0.003
 there+0.003
Neg logit contributions
ga-0.004
be-0.003
zee-0.003
aff-0.003
eer-0.003
Token at
Token ID379
Feature activation+1.89e-02

Pos logit contributions
uct+0.010
 refreshing+0.010
 Amy+0.010
igsaw+0.009
 Meg+0.009
Neg logit contributions
ga-0.013
be-0.012
 lear-0.011
smith-0.011
 fighter-0.011
Token the
Token ID262
Feature activation-1.53e-02

Pos logit contributions
uct+0.024
 Amy+0.024
 refreshing+0.024
 Meg+0.023
kins+0.023
Neg logit contributions
ga-0.034
be-0.030
Ar-0.029
Bang-0.028
ruck-0.028
Token game
Token ID983
Feature activation+1.35e-02

Pos logit contributions
ga+0.028
be+0.025
Ar+0.025
aff+0.024
ric+0.024
Neg logit contributions
 Amy-0.021
 Meg-0.020
 Daisy-0.020
 Sally-0.019
 Sarah-0.019
Token and
Token ID290
Feature activation-7.63e-03

Pos logit contributions
uct+0.017
 Amy+0.016
 refreshing+0.016
 Meg+0.016
igsaw+0.015
Neg logit contributions
ga-0.025
be-0.023
ruck-0.021
smith-0.021
Ar-0.021
Token the
Token ID262
Feature activation-9.35e-03

Pos logit contributions
ga+0.015
be+0.014
Ar+0.013
by+0.013
 rh+0.013
Neg logit contributions
 Amy-0.008
igsaw-0.008
uct-0.008
 Meg-0.008
 refreshing-0.008
Token younger
Token ID7099
Feature activation-6.11e-02

Pos logit contributions
ga+0.017
be+0.015
Ar+0.015
zee+0.014
smith+0.014
Neg logit contributions
 Amy-0.013
 Meg-0.012
 Daisy-0.012
uct-0.012
 Sally-0.012
Token sister
Token ID6621
Feature activation-1.48e-02

Pos logit contributions
ga+0.108
Ar+0.097
be+0.095
aff+0.093
ric+0.092
Neg logit contributions
 Amy-0.088
 Meg-0.082
 Daisy-0.081
 Sarah-0.080
 Sally-0.080
Token said
Token ID531
Feature activation-6.64e-03

Pos logit contributions
ga+0.026
be+0.023
Ar+0.022
zee+0.021
aff+0.021
Neg logit contributions
 Amy-0.021
 Meg-0.020
 Daisy-0.020
 Sally-0.020
uct-0.020
Token she
Token ID673
Feature activation+4.07e-03

Pos logit contributions
ga+0.014
be+0.013
Ar+0.012
aff+0.012
 rh+0.012
Neg logit contributions
 Amy-0.007
 Meg-0.006
uct-0.006
 refreshing-0.006
 Daisy-0.006
Token was
Token ID373
Feature activation-6.54e-03

Pos logit contributions
 Amy+0.007
 Meg+0.007
 Daisy+0.007
 Sally+0.007
uct+0.007
Neg logit contributions
ga-0.006
be-0.005
Ar-0.005
aff-0.005
zee-0.005
Token happy
Token ID3772
Feature activation+9.58e-03

Pos logit contributions
ga+0.012
be+0.011
Ar+0.010
 rh+0.010
ric+0.010
Neg logit contributions
 Amy-0.008
igsaw-0.008
 Meg-0.008
 Daisy-0.007
 refreshing-0.007
Token angry
Token ID7954
Feature activation-7.17e-03

Pos logit contributions
 Amy+0.026
 Meg+0.025
 Daisy+0.024
igsaw+0.024
uct+0.024
Neg logit contributions
ga-0.032
be-0.029
Ar-0.027
Bang-0.026
 rh-0.026
Token.
Token ID13
Feature activation+3.49e-03

Pos logit contributions
ga+0.017
Ar+0.016
be+0.016
ric+0.016
Stretch+0.015
Neg logit contributions
 Amy-0.006
 Sally-0.006
 there-0.006
 Sarah-0.005
 Daisy-0.005
Token
Token ID198
Feature activation-4.00e-03

Pos logit contributions
 Amy+0.005
 Sally+0.004
 Daisy+0.004
 Sarah+0.004
 Emily+0.004
Neg logit contributions
ga-0.007
be-0.006
ruck-0.006
Ar-0.006
zee-0.006
Token
Token ID198
Feature activation+5.64e-03

Pos logit contributions
ga+0.008
zee+0.007
be+0.007
eer+0.007
aff+0.007
Neg logit contributions
 Amy-0.005
 there-0.004
 Daisy-0.004
 Sally-0.004
 Meg-0.004
TokenThe
Token ID464
Feature activation-5.87e-03

Pos logit contributions
 Amy+0.008
uct+0.007
 there+0.007
 Meg+0.007
 Daisy+0.007
Neg logit contributions
ga-0.010
aff-0.009
zee-0.009
eer-0.009
ric-0.009
Token younger
Token ID7099
Feature activation-6.08e-02

Pos logit contributions
ga+0.012
be+0.011
aff+0.011
Ar+0.010
zee+0.010
Neg logit contributions
 Amy-0.007
 Daisy-0.006
 Meg-0.006
 Emily-0.006
 Sarah-0.006
Token brother
Token ID3956
Feature activation-1.84e-02

Pos logit contributions
ga+0.124
Ar+0.115
aff+0.113
zee+0.111
Stretch+0.110
Neg logit contributions
 Amy-0.075
 Sarah-0.068
 Emily-0.066
 there-0.063
 Daisy-0.063
Token saw
Token ID2497
Feature activation+7.18e-03

Pos logit contributions
ga+0.034
Ar+0.031
ric+0.030
be+0.029
aff+0.029
Neg logit contributions
 Amy-0.026
 Daisy-0.024
 Sarah-0.024
 Sally-0.024
 Meg-0.024
Token this
Token ID428
Feature activation-4.61e-03

Pos logit contributions
 Amy+0.009
 Daisy+0.008
 Meg+0.008
 Sally+0.008
 Sarah+0.008
Neg logit contributions
ga-0.015
be-0.014
Ar-0.013
zee-0.013
aff-0.013
Token and
Token ID290
Feature activation+8.30e-03

Pos logit contributions
ga+0.008
be+0.007
ruck+0.007
Ar+0.007
 rh+0.007
Neg logit contributions
 Amy-0.006
 Meg-0.006
uct-0.006
 Daisy-0.006
 Sally-0.006
Token felt
Token ID2936
Feature activation-1.22e-02

Pos logit contributions
 Amy+0.011
 Daisy+0.010
 Sally+0.010
 Meg+0.010
 Sarah+0.010
Neg logit contributions
ga-0.017
be-0.015
Ar-0.015
aff-0.014
zee-0.014
Token saw
Token ID2497
Feature activation+3.10e-03

Pos logit contributions
 Amy+0.005
 Meg+0.005
 Daisy+0.005
 Sally+0.005
uct+0.005
Neg logit contributions
ga-0.006
be-0.005
Ar-0.005
zee-0.005
aff-0.005
Token plates
Token ID13375
Feature activation-1.07e-02

Pos logit contributions
 Amy+0.004
 Daisy+0.003
 Meg+0.003
 Sally+0.003
 Sarah+0.003
Neg logit contributions
ga-0.006
be-0.006
Ar-0.006
zee-0.005
aff-0.005
Token with
Token ID351
Feature activation+9.58e-03

Pos logit contributions
ga+0.014
be+0.013
ruck+0.012
 rh+0.011
aff+0.011
Neg logit contributions
uct-0.020
igsaw-0.019
 Meg-0.018
 Amy-0.018
 Daisy-0.018
Token pizza
Token ID14256
Feature activation-2.23e-02

Pos logit contributions
 Amy+0.012
 Daisy+0.011
 Meg+0.011
 Sally+0.011
 refreshing+0.010
Neg logit contributions
ga-0.019
be-0.017
Ar-0.017
zee-0.016
aff-0.016
Token and
Token ID290
Feature activation-1.64e-02

Pos logit contributions
ga+0.041
be+0.038
Ar+0.036
aff+0.035
 rh+0.035
Neg logit contributions
uct-0.029
 Amy-0.029
 Meg-0.028
 Daisy-0.028
 Sally-0.027
Token fries
Token ID31757
Feature activation-6.07e-02

Pos logit contributions
ga+0.035
be+0.031
Ar+0.030
zee+0.029
aff+0.029
Neg logit contributions
 Amy-0.019
 Daisy-0.017
 Meg-0.017
 Sally-0.017
 there-0.016
Token and
Token ID290
Feature activation-2.28e-02

Pos logit contributions
ga+0.129
be+0.118
Ar+0.115
zee+0.111
aff+0.110
Neg logit contributions
 Amy-0.064
 Meg-0.059
 Daisy-0.059
 Sally-0.057
uct-0.057
Token ice
Token ID4771
Feature activation-1.53e-02

Pos logit contributions
ga+0.047
be+0.042
Ar+0.041
zee+0.040
aff+0.040
Neg logit contributions
 Amy-0.027
 Daisy-0.025
 Meg-0.024
 Sally-0.024
 there-0.024
Token cream
Token ID8566
Feature activation-3.88e-02

Pos logit contributions
ga+0.033
be+0.030
Ar+0.030
Bang+0.028
zee+0.028
Neg logit contributions
 Amy-0.016
 Daisy-0.014
 Meg-0.014
 Sally-0.014
 there-0.014
Token.
Token ID13
Feature activation+7.42e-05

Pos logit contributions
ga+0.084
be+0.077
Ar+0.075
zee+0.072
aff+0.071
Neg logit contributions
 Amy-0.040
 Daisy-0.037
 Meg-0.037
 Sally-0.036
uct-0.035
Token They
Token ID1119
Feature activation+5.34e-03

Pos logit contributions
 Amy+0.000
 Meg+0.000
uct+0.000
 Daisy+0.000
 Sally+0.000
Neg logit contributions
ga-0.000
be-0.000
Ar-0.000
aff-0.000
zee-0.000
Token be
Token ID307
Feature activation+1.16e-02

Pos logit contributions
 Amy+0.010
 Daisy+0.010
 Meg+0.010
 Sally+0.010
uct+0.009
Neg logit contributions
ga-0.011
be-0.010
Ar-0.009
zee-0.009
aff-0.009
Token prepared
Token ID5597
Feature activation-8.32e-03

Pos logit contributions
 Amy+0.012
uct+0.012
 Meg+0.011
igsaw+0.011
 Daisy+0.011
Neg logit contributions
ga-0.025
be-0.023
Ar-0.022
ruck-0.021
k-0.021
Token and
Token ID290
Feature activation-2.24e-02

Pos logit contributions
ga+0.013
be+0.012
Ar+0.011
ruck+0.011
by+0.011
Neg logit contributions
 Amy-0.013
uct-0.013
 Meg-0.013
igsaw-0.012
 Daisy-0.012
Token not
Token ID407
Feature activation-1.41e-02

Pos logit contributions
ga+0.038
be+0.034
Ar+0.033
zee+0.031
ruck+0.031
Neg logit contributions
 Amy-0.033
 Meg-0.031
 Daisy-0.031
uct-0.031
 Sally-0.030
Token take
Token ID1011
Feature activation-5.60e-03

Pos logit contributions
ga+0.028
smith+0.025
Ar+0.025
zee+0.024
be+0.024
Neg logit contributions
 Daisy-0.018
 there-0.018
 Sally-0.018
 Amy-0.018
 refreshing-0.017
Token unnecessary
Token ID13114
Feature activation-6.06e-02

Pos logit contributions
ga+0.010
be+0.009
Ar+0.009
zee+0.008
aff+0.008
Neg logit contributions
 Amy-0.008
 Daisy-0.007
 Meg-0.007
 Sally-0.007
uct-0.007
Token risks
Token ID7476
Feature activation-1.42e-02

Pos logit contributions
ga+0.117
be+0.105
Ar+0.102
zee+0.101
smith+0.101
Neg logit contributions
 Amy-0.075
 there-0.072
 Daisy-0.071
 Meg-0.070
 Sally-0.068
Token.
Token ID13
Feature activation+1.43e-03

Pos logit contributions
ga+0.029
be+0.027
Ar+0.025
aff+0.025
ruck+0.024
Neg logit contributions
 Amy-0.016
 Meg-0.015
 Daisy-0.015
uct-0.015
 Sally-0.014
Token<|endoftext|>
Token ID50256
Feature activation-1.83e-03

Pos logit contributions
 Amy+0.002
 Meg+0.002
 Daisy+0.002
uct+0.002
 refreshing+0.002
Neg logit contributions
ga-0.002
be-0.002
Ar-0.002
ric-0.002
Bang-0.002
TokenOnce
Token ID7454
Feature activation+3.04e-02

Pos logit contributions
ga+0.003
be+0.002
Ar+0.002
aff+0.002
Bang+0.002
Neg logit contributions
 Amy-0.003
 Meg-0.003
uct-0.003
 Daisy-0.003
 refreshing-0.003
Token upon
Token ID2402
Feature activation-1.20e-02

Pos logit contributions
uct+0.047
 refreshing+0.047
 Meg+0.045
 reassured+0.044
 Amy+0.044
Neg logit contributions
by-0.041
ga-0.041
be-0.039
ric-0.036
aff-0.036
Token
Token ID220
Feature activation-2.76e-02

Pos logit contributions
ga+0.005
be+0.005
Ar+0.004
aff+0.004
zee+0.004
Neg logit contributions
 Amy-0.004
 refreshing-0.004
 Meg-0.004
uct-0.004
kins-0.004
Token
Token ID198
Feature activation-3.34e-03

Pos logit contributions
ga+0.048
be+0.043
Ar+0.041
zee+0.040
ruck+0.039
Neg logit contributions
 Amy-0.041
 Meg-0.038
 Daisy-0.038
 Sally-0.038
uct-0.037
Token
Token ID198
Feature activation+4.34e-03

Pos logit contributions
ga+0.006
be+0.005
Ar+0.005
zee+0.005
eer+0.005
Neg logit contributions
 Amy-0.005
 Daisy-0.005
 Meg-0.004
 Sally-0.004
 Sarah-0.004
TokenThe
Token ID464
Feature activation-5.62e-03

Pos logit contributions
 Amy+0.006
 Meg+0.005
 Daisy+0.005
uct+0.005
 Sally+0.005
Neg logit contributions
ga-0.008
be-0.007
eer-0.007
Ar-0.007
aff-0.007
Token hard
Token ID1327
Feature activation-1.10e-02

Pos logit contributions
ga+0.011
be+0.009
Ar+0.009
aff+0.009
zee+0.009
Neg logit contributions
 Amy-0.008
 Meg-0.007
 Daisy-0.007
 Sally-0.007
uct-0.007
Tokenworking
Token ID16090
Feature activation-6.05e-02

Pos logit contributions
ga+0.019
be+0.017
Ar+0.016
zee+0.015
aff+0.015
Neg logit contributions
 Amy-0.016
 Meg-0.016
 Daisy-0.016
 Sally-0.015
uct-0.015
Token ant
Token ID1885
Feature activation-1.17e-02

Pos logit contributions
ga+0.101
Ar+0.088
be+0.088
ric+0.084
zee+0.082
Neg logit contributions
 Amy-0.095
 Daisy-0.089
 Meg-0.089
 Sally-0.088
 Jane-0.086
Token said
Token ID531
Feature activation-4.13e-03

Pos logit contributions
ga+0.021
be+0.019
Ar+0.018
zee+0.017
smith+0.017
Neg logit contributions
 Amy-0.016
 Meg-0.015
 Daisy-0.015
 Sally-0.015
igsaw-0.015
Token,
Token ID11
Feature activation+4.03e-03

Pos logit contributions
ga+0.010
be+0.009
Ar+0.009
ric+0.008
zee+0.008
Neg logit contributions
 Amy-0.004
 Daisy-0.003
 Meg-0.003
 Sally-0.003
 Sarah-0.003
Token "
Token ID366
Feature activation-1.77e-02

Pos logit contributions
 Amy+0.006
 Daisy+0.006
 Sally+0.006
 there+0.006
 Sarah+0.006
Neg logit contributions
ga-0.007
Ar-0.006
by-0.006
be-0.006
zee-0.006
TokenIf
Token ID1532
Feature activation+1.77e-03

Pos logit contributions
ga+0.030
be+0.027
aff+0.026
ric+0.026
eer+0.026
Neg logit contributions
 Amy-0.025
uct-0.025
 Meg-0.024
 Daisy-0.024
 Sally-0.024
Token paws
Token ID49193
Feature activation+9.86e-03

Pos logit contributions
 Amy+0.004
 Daisy+0.004
 Sally+0.004
 Sarah+0.004
 there+0.004
Neg logit contributions
ga-0.006
Ar-0.006
ric-0.006
be-0.005
check-0.005
Token and
Token ID290
Feature activation+1.54e-02

Pos logit contributions
 Amy+0.009
 there+0.009
 Daisy+0.008
 Sally+0.008
 Sarah+0.008
Neg logit contributions
ga-0.022
Ar-0.021
be-0.020
Bang-0.020
zee-0.020
Token a
Token ID257
Feature activation-1.11e-02

Pos logit contributions
 Amy+0.020
 Daisy+0.019
 Meg+0.018
 Sally+0.018
 Sarah+0.018
Neg logit contributions
ga-0.031
be-0.027
Ar-0.026
aff-0.026
zee-0.025
Token w
Token ID266
Feature activation+1.16e-03

Pos logit contributions
ga+0.023
be+0.021
aff+0.020
ruck+0.020
zee+0.020
Neg logit contributions
 Amy-0.013
 Daisy-0.013
 Sally-0.012
 Emily-0.012
 Sarah-0.012
Tokenagg
Token ID9460
Feature activation-1.94e-02

Pos logit contributions
 Amy+0.002
 Meg+0.001
 Daisy+0.001
 Sally+0.001
uct+0.001
Neg logit contributions
ga-0.002
be-0.002
Ar-0.002
zee-0.002
aff-0.002
Tokenly
Token ID306
Feature activation-6.05e-02

Pos logit contributions
ga+0.030
Ar+0.028
Bang+0.027
be+0.027
check+0.025
Neg logit contributions
 Amy-0.031
 Daisy-0.029
 Sally-0.029
 there-0.029
kins-0.028
Token tail
Token ID7894
Feature activation-7.52e-03

Pos logit contributions
ga+0.102
be+0.094
Ar+0.090
ric+0.088
check+0.087
Neg logit contributions
 Amy-0.090
 Daisy-0.089
 Sally-0.086
 Meg-0.082
 Sarah-0.081
Token.
Token ID13
Feature activation+1.26e-03

Pos logit contributions
ga+0.018
Ar+0.017
be+0.016
Bang+0.016
zee+0.016
Neg logit contributions
 Amy-0.006
 there-0.006
 Daisy-0.005
 Sally-0.005
 Meg-0.005
Token
Token ID198
Feature activation-3.49e-03

Pos logit contributions
 Amy+0.002
 Daisy+0.002
 Meg+0.002
 Sally+0.002
 Sarah+0.002
Neg logit contributions
ga-0.002
be-0.002
Ar-0.002
zee-0.002
aff-0.002
Token
Token ID198
Feature activation+6.11e-03

Pos logit contributions
ga+0.007
be+0.006
zee+0.006
aff+0.006
ric+0.006
Neg logit contributions
 Amy-0.005
 Daisy-0.004
 Meg-0.004
 there-0.004
 Emily-0.004
TokenOne
Token ID3198
Feature activation+7.68e-03

Pos logit contributions
 Amy+0.010
 Daisy+0.009
 Meg+0.009
 Emily+0.009
 Sally+0.009
Neg logit contributions
ga-0.010
be-0.009
Ar-0.008
aff-0.008
zee-0.008
Token her
Token ID607
Feature activation+1.02e-02

Pos logit contributions
 Amy+0.007
uct+0.007
 Meg+0.007
 Daisy+0.007
 refreshing+0.007
Neg logit contributions
ga-0.011
be-0.010
Ar-0.010
aff-0.010
zee-0.010
Token cows
Token ID22575
Feature activation-1.28e-02

Pos logit contributions
 Amy+0.015
uct+0.015
 Meg+0.015
 Daisy+0.014
 Emily+0.014
Neg logit contributions
ga-0.017
be-0.015
 rh-0.014
ric-0.014
eer-0.014
Token and
Token ID290
Feature activation-2.23e-02

Pos logit contributions
ga+0.022
be+0.019
 rh+0.019
ric+0.019
aff+0.018
Neg logit contributions
uct-0.019
 Amy-0.018
igsaw-0.018
 Meg-0.017
 refreshing-0.017
Token sheep
Token ID15900
Feature activation-2.84e-02

Pos logit contributions
ga+0.037
be+0.034
Ar+0.032
zee+0.031
aff+0.031
Neg logit contributions
 Amy-0.035
 Daisy-0.033
 Meg-0.033
 Sally-0.032
 Sarah-0.032
Token and
Token ID290
Feature activation-2.55e-02

Pos logit contributions
ga+0.055
be+0.050
Ar+0.046
aff+0.046
ric+0.046
Neg logit contributions
 Amy-0.035
uct-0.034
 Meg-0.034
 Daisy-0.033
igsaw-0.032
Token chickens
Token ID25972
Feature activation-6.03e-02

Pos logit contributions
ga+0.042
be+0.038
Ar+0.036
zee+0.035
aff+0.034
Neg logit contributions
 Amy-0.041
 Daisy-0.039
 Meg-0.039
 Sally-0.038
 Sarah-0.037
Token.
Token ID13
Feature activation-5.38e-03

Pos logit contributions
ga+0.119
be+0.109
Ar+0.101
aff+0.101
ric+0.099
Neg logit contributions
 Amy-0.073
 Meg-0.069
uct-0.069
 Daisy-0.068
igsaw-0.067
Token She
Token ID1375
Feature activation+6.15e-03

Pos logit contributions
ga+0.009
be+0.008
Ar+0.008
aff+0.007
zee+0.007
Neg logit contributions
 Amy-0.008
 Meg-0.008
 Daisy-0.008
uct-0.008
 Sally-0.008
Token gave
Token ID2921
Feature activation+1.46e-02

Pos logit contributions
 Amy+0.008
 Meg+0.008
 Daisy+0.008
uct+0.008
 Sally+0.008
Neg logit contributions
ga-0.011
be-0.010
Ar-0.010
zee-0.009
aff-0.009
Token them
Token ID606
Feature activation+9.77e-03

Pos logit contributions
 Amy+0.025
 Daisy+0.024
 Meg+0.023
 Sally+0.023
 Emily+0.022
Neg logit contributions
ga-0.023
be-0.021
Ar-0.020
ric-0.019
ruck-0.018
Token some
Token ID617
Feature activation+2.31e-02

Pos logit contributions
uct+0.014
 Amy+0.014
 Meg+0.013
ement+0.013
igsaw+0.013
Neg logit contributions
ga-0.016
be-0.015
Ar-0.014
by-0.014
 rh-0.014
Token<|endoftext|>
Token ID50256
Feature activation-3.99e-03

Pos logit contributions
kins+0.001
 refreshing+0.001
 Amy+0.001
uct+0.001
kin+0.001
Neg logit contributions
ga-0.001
be-0.001
Ar-0.001
ric-0.001
Bang-0.001
TokenAlice
Token ID44484
Feature activation-8.57e-03

Pos logit contributions
ga+0.005
be+0.005
Ar+0.004
Bang+0.004
smith+0.004
Neg logit contributions
 Amy-0.007
 Meg-0.007
uct-0.007
 refreshing-0.007
 Daisy-0.007
Token was
Token ID373
Feature activation-7.67e-03

Pos logit contributions
ga+0.015
be+0.014
Ar+0.013
zee+0.013
aff+0.013
Neg logit contributions
 Amy-0.012
 Daisy-0.011
 Meg-0.011
 Sally-0.011
 Sarah-0.011
Token a
Token ID257
Feature activation-1.05e-02

Pos logit contributions
ga+0.014
be+0.013
Ar+0.012
Bang+0.011
 rh+0.011
Neg logit contributions
igsaw-0.010
 Amy-0.010
 Meg-0.010
kins-0.009
uct-0.009
Token 3
Token ID513
Feature activation-3.44e-02

Pos logit contributions
ga+0.021
be+0.019
Ar+0.019
aff+0.018
ruck+0.018
Neg logit contributions
 Amy-0.012
 Daisy-0.012
 Meg-0.011
 Sally-0.011
 Sarah-0.011
Token year
Token ID614
Feature activation+5.28e-02

Pos logit contributions
ga+0.064
be+0.060
eer+0.055
smith+0.052
by+0.052
Neg logit contributions
 Meg-0.044
uct-0.044
 Amy-0.044
 Daisy-0.044
 refreshing-0.043
Token old
Token ID1468
Feature activation-2.90e-02

Pos logit contributions
 Amy+0.068
 Meg+0.067
 Daisy+0.065
uct+0.064
 refreshing+0.062
Neg logit contributions
ga-0.099
be-0.091
aff-0.085
Ar-0.082
 rh-0.081
Token girl
Token ID2576
Feature activation+1.83e-04

Pos logit contributions
ga+0.050
Ar+0.046
be+0.045
Bang+0.044
ric+0.043
Neg logit contributions
 Amy-0.042
 Daisy-0.040
 Sally-0.039
 there-0.039
 Sarah-0.039
Token who
Token ID508
Feature activation-1.02e-02

Pos logit contributions
 Amy+0.000
 Daisy+0.000
 Meg+0.000
 Sally+0.000
 Sarah+0.000
Neg logit contributions
ga-0.000
be-0.000
Ar-0.000
zee-0.000
aff-0.000
Token loved
Token ID6151
Feature activation-1.63e-03

Pos logit contributions
ga+0.012
be+0.010
Ar+0.009
zee+0.009
aff+0.009
Neg logit contributions
 Amy-0.021
 Meg-0.020
 Daisy-0.020
 Sally-0.019
uct-0.019
Token to
Token ID284
Feature activation-4.75e-03

Pos logit contributions
ga+0.003
be+0.002
Ar+0.002
zee+0.002
Bang+0.002
Neg logit contributions
 Amy-0.003
 Daisy-0.002
 Sally-0.002
 Sarah-0.002
 there-0.002
Token fun
Token ID1257
Feature activation+9.68e-03

Pos logit contributions
 Amy+0.018
 Meg+0.018
 Daisy+0.017
uct+0.017
 Sally+0.017
Neg logit contributions
ga-0.025
be-0.023
Ar-0.022
 rh-0.021
zee-0.021
Token was
Token ID373
Feature activation-1.98e-03

Pos logit contributions
 Amy+0.013
 Daisy+0.012
 Meg+0.012
 Sally+0.012
 Sarah+0.011
Neg logit contributions
ga-0.018
be-0.017
Ar-0.016
zee-0.015
aff-0.015
Token done
Token ID1760
Feature activation-8.69e-03

Pos logit contributions
ga+0.004
be+0.003
Ar+0.003
ruck+0.003
Bang+0.003
Neg logit contributions
 Amy-0.003
 Meg-0.003
 Daisy-0.003
uct-0.003
kins-0.003
Token,
Token ID11
Feature activation+3.78e-03

Pos logit contributions
ga+0.019
be+0.017
Ar+0.016
ruck+0.016
aff+0.016
Neg logit contributions
 Amy-0.009
 Meg-0.009
uct-0.008
 Daisy-0.008
 refreshing-0.008
Token Tom
Token ID4186
Feature activation+6.54e-03

Pos logit contributions
igsaw+0.005
kins+0.005
ulu+0.005
kin+0.005
quet+0.005
Neg logit contributions
ga-0.006
Bang-0.005
ruck-0.005
be-0.005
Ar-0.005
Token and
Token ID290
Feature activation+4.52e-02

Pos logit contributions
 Amy+0.008
 Meg+0.008
 Daisy+0.007
uct+0.007
 Sally+0.007
Neg logit contributions
ga-0.013
be-0.012
Ar-0.011
zee-0.011
aff-0.011
Token Jack
Token ID3619
Feature activation-1.39e-02

Pos logit contributions
kins+0.038
 refreshing+0.037
ement+0.035
kin+0.032
quet+0.032
Neg logit contributions
ga-0.097
by-0.090
Bang-0.088
ruck-0.087
 lear-0.086
Token said
Token ID531
Feature activation-7.63e-03

Pos logit contributions
ga+0.026
be+0.023
Ar+0.023
zee+0.022
aff+0.022
Neg logit contributions
 Amy-0.018
 Daisy-0.017
 Sally-0.017
 Sarah-0.017
 Meg-0.017
Token goodbye
Token ID24829
Feature activation+1.27e-02

Pos logit contributions
ga+0.016
be+0.015
Ar+0.015
ric+0.014
zee+0.014
Neg logit contributions
 Amy-0.008
 Daisy-0.008
 Meg-0.008
 Sally-0.007
 there-0.007
Token to
Token ID284
Feature activation+3.98e-03

Pos logit contributions
 Amy+0.017
 Daisy+0.016
 Sally+0.015
 Meg+0.015
 Sarah+0.015
Neg logit contributions
ga-0.024
Ar-0.021
be-0.021
smith-0.020
Bang-0.020
Token their
Token ID511
Feature activation+3.39e-03

Pos logit contributions
kins+0.005
 Amy+0.005
uct+0.005
igsaw+0.005
kin+0.005
Neg logit contributions
ga-0.007
be-0.006
ric-0.006
by-0.006
ruck-0.006
Tokengy
Token ID1360
Feature activation+7.25e-03

Pos logit contributions
 Amy+0.018
 Meg+0.017
 Daisy+0.017
uct+0.017
 Sally+0.016
Neg logit contributions
ga-0.029
be-0.026
Ar-0.025
zee-0.024
aff-0.024
Token and
Token ID290
Feature activation+1.40e-02

Pos logit contributions
 Amy+0.008
 Daisy+0.007
 Sally+0.007
 Meg+0.007
 refreshing+0.007
Neg logit contributions
ga-0.015
be-0.014
Ar-0.014
Bang-0.013
aff-0.013
Token Jessica
Token ID17352
Feature activation-1.03e-02

Pos logit contributions
uct+0.018
igsaw+0.017
ement+0.016
ogging+0.015
kins+0.015
Neg logit contributions
ga-0.024
be-0.022
ruck-0.021
Ar-0.021
Bang-0.020
Token felt
Token ID2936
Feature activation-6.67e-03

Pos logit contributions
ga+0.016
be+0.014
Ar+0.013
aff+0.013
zee+0.013
Neg logit contributions
 Amy-0.017
 Daisy-0.016
 Meg-0.016
 Sally-0.016
uct-0.016
Token a
Token ID257
Feature activation-1.89e-03

Pos logit contributions
ga+0.012
be+0.011
Ar+0.011
 rh+0.010
Bang+0.010
Neg logit contributions
 Meg-0.009
 Amy-0.009
igsaw-0.009
uct-0.008
kins-0.008
Token bit
Token ID1643
Feature activation+4.75e-02

Pos logit contributions
ga+0.004
be+0.003
aff+0.003
zee+0.003
smith+0.003
Neg logit contributions
 refreshing-0.002
 Amy-0.002
 Daisy-0.002
 there-0.002
 Sally-0.002
Token scared
Token ID12008
Feature activation+1.11e-02

Pos logit contributions
 Amy+0.057
 Meg+0.056
uct+0.053
kins+0.052
igsaw+0.051
Neg logit contributions
ga-0.092
be-0.089
Ar-0.080
aff-0.079
 rh-0.079
Token.
Token ID13
Feature activation+9.09e-03

Pos logit contributions
uct+0.014
 Amy+0.013
 Meg+0.013
kins+0.013
igsaw+0.013
Neg logit contributions
ga-0.021
be-0.019
ruck-0.018
aff-0.018
eer-0.017
Token Suddenly
Token ID24975
Feature activation+2.92e-03

Pos logit contributions
 Amy+0.014
 Sarah+0.013
 Daisy+0.013
 Emily+0.013
 Meg+0.013
Neg logit contributions
ga-0.016
be-0.015
Ar-0.014
eer-0.014
ric-0.014
Token,
Token ID11
Feature activation+6.49e-03

Pos logit contributions
uct+0.004
 refreshing+0.004
igsaw+0.004
kins+0.003
ulu+0.003
Neg logit contributions
ga-0.005
be-0.005
by-0.004
smith-0.004
ically-0.004
Token a
Token ID257
Feature activation+4.91e-03

Pos logit contributions
 refreshing+0.008
uct+0.007
igsaw+0.007
ulu+0.007
kins+0.007
Neg logit contributions
ga-0.012
be-0.011
 lear-0.010
Ar-0.010
ically-0.010
Token boy
Token ID2933
Feature activation-7.28e-03

Pos logit contributions
ga+0.017
be+0.015
 rh+0.014
Ar+0.014
aff+0.014
Neg logit contributions
 Amy-0.022
uct-0.022
 Meg-0.022
 Daisy-0.021
 refreshing-0.020
Token.
Token ID13
Feature activation+5.36e-03

Pos logit contributions
ga+0.013
be+0.012
Ar+0.011
zee+0.010
aff+0.010
Neg logit contributions
 Amy-0.010
 Meg-0.010
 Daisy-0.010
uct-0.010
 Sally-0.010
Token He
Token ID679
Feature activation-1.76e-02

Pos logit contributions
 Amy+0.011
 Meg+0.010
uct+0.010
 Daisy+0.010
 refreshing+0.010
Neg logit contributions
ga-0.006
be-0.005
Ar-0.005
 rh-0.005
Bang-0.005
Token saw
Token ID2497
Feature activation+5.95e-03

Pos logit contributions
ga+0.032
be+0.028
aff+0.027
ric+0.027
zee+0.026
Neg logit contributions
 Amy-0.027
 Daisy-0.025
 there-0.025
 Sally-0.024
 Emily-0.024
Token a
Token ID257
Feature activation+6.86e-03

Pos logit contributions
 Amy+0.007
 Daisy+0.006
 Meg+0.006
 Sally+0.006
 there+0.006
Neg logit contributions
ga-0.013
be-0.012
Ar-0.011
aff-0.011
zee-0.011
Token y
Token ID331
Feature activation+5.44e-02

Pos logit contributions
 Amy+0.008
 Daisy+0.007
 Meg+0.007
 Sally+0.007
 refreshing+0.007
Neg logit contributions
ga-0.014
be-0.013
Ar-0.012
aff-0.012
zee-0.012
Tokenummy
Token ID13513
Feature activation-1.70e-02

Pos logit contributions
 Amy+0.057
 Meg+0.053
uct+0.053
 Daisy+0.049
 Emily+0.048
Neg logit contributions
ga-0.116
be-0.109
aff-0.105
by-0.104
 rh-0.102
Token cel
Token ID18725
Feature activation-2.21e-03

Pos logit contributions
ga+0.031
be+0.028
 rh+0.028
Ar+0.027
ruck+0.027
Neg logit contributions
uct-0.022
 Amy-0.022
 Meg-0.022
 Daisy-0.020
 Sarah-0.020
Tokenery
Token ID1924
Feature activation-1.71e-02

Pos logit contributions
ga+0.004
be+0.003
Ar+0.003
zee+0.003
ric+0.003
Neg logit contributions
 Amy-0.003
 Daisy-0.003
 Meg-0.003
kins-0.003
 there-0.003
Token on
Token ID319
Feature activation+2.29e-02

Pos logit contributions
ga+0.034
be+0.031
Ar+0.029
zee+0.028
aff+0.028
Neg logit contributions
 Amy-0.021
 Meg-0.020
 Daisy-0.019
uct-0.019
 Sally-0.019
Token the
Token ID262
Feature activation+2.14e-03

Pos logit contributions
 Amy+0.021
 Daisy+0.020
 Meg+0.019
 Sally+0.019
uct+0.018
Neg logit contributions
ga-0.052
be-0.049
Ar-0.047
zee-0.045
aff-0.045
Token She
Token ID1375
Feature activation-1.79e-02

Pos logit contributions
ga+0.005
be+0.004
Ar+0.004
Stretch+0.004
 rh+0.004
Neg logit contributions
kins-0.005
uct-0.004
igsaw-0.004
 refreshing-0.004
 Meg-0.004
Token was
Token ID373
Feature activation-1.18e-02

Pos logit contributions
ga+0.030
aff+0.030
zee+0.029
ric+0.027
 trigger+0.027
Neg logit contributions
 Amy-0.027
 there-0.027
 she-0.025
 Daisy-0.024
 Sally-0.024
Token a
Token ID257
Feature activation-7.03e-03

Pos logit contributions
ga+0.021
be+0.019
Ar+0.018
zee+0.018
 rh+0.018
Neg logit contributions
 Amy-0.016
 Meg-0.015
 Daisy-0.015
uct-0.015
 Sally-0.015
Token lively
Token ID29696
Feature activation-1.05e-02

Pos logit contributions
ga+0.015
be+0.013
Ar+0.013
aff+0.013
ric+0.013
Neg logit contributions
 Amy-0.008
 Daisy-0.007
 Sally-0.007
 there-0.007
 Meg-0.007
Token three
Token ID1115
Feature activation-2.05e-02

Pos logit contributions
ga+0.017
be+0.016
Ar+0.015
ric+0.014
zee+0.014
Neg logit contributions
 Amy-0.016
 Daisy-0.015
 Meg-0.015
 Sally-0.015
 Sarah-0.015
Token year
Token ID614
Feature activation+4.85e-02

Pos logit contributions
ga+0.029
be+0.028
eer+0.026
 pengu+0.025
smith+0.024
Neg logit contributions
uct-0.036
 Meg-0.034
 Daisy-0.033
 refreshing-0.033
 Amy-0.032
Token old
Token ID1468
Feature activation-2.79e-02

Pos logit contributions
uct+0.071
 Meg+0.071
 Amy+0.069
 reassured+0.066
 Daisy+0.066
Neg logit contributions
ga-0.083
aff-0.073
be-0.072
 rh-0.068
ruck-0.065
Token with
Token ID351
Feature activation+9.06e-03

Pos logit contributions
ga+0.056
Ar+0.053
Bang+0.052
be+0.050
Stretch+0.050
Neg logit contributions
 Amy-0.034
 Daisy-0.031
 Sally-0.031
 there-0.031
 Sarah-0.030
Token big
Token ID1263
Feature activation+1.92e-02

Pos logit contributions
 Amy+0.009
 Daisy+0.009
 Meg+0.008
 Sally+0.008
 Sarah+0.008
Neg logit contributions
ga-0.020
be-0.018
Ar-0.018
zee-0.017
aff-0.017
Token,
Token ID11
Feature activation+9.08e-03

Pos logit contributions
 Amy+0.025
 Meg+0.024
 Daisy+0.024
 Sally+0.023
uct+0.023
Neg logit contributions
ga-0.036
be-0.033
Ar-0.032
zee-0.030
aff-0.030
Token bright
Token ID6016
Feature activation+3.72e-03

Pos logit contributions
uct+0.015
 Meg+0.014
kins+0.014
igsaw+0.014
ement+0.013
Neg logit contributions
ga-0.013
 rh-0.012
Ar-0.011
be-0.011
 horn-0.011
Token the
Token ID262
Feature activation-5.95e-03

Pos logit contributions
 refreshing+0.004
uct+0.004
kins+0.004
 Amy+0.004
igsaw+0.004
Neg logit contributions
ga-0.009
be-0.008
smith-0.007
ruck-0.007
by-0.007
Token voice
Token ID3809
Feature activation+1.75e-02

Pos logit contributions
ga+0.010
be+0.009
Ar+0.008
smith+0.008
 rh+0.008
Neg logit contributions
 Amy-0.009
 Meg-0.009
 Daisy-0.009
uct-0.009
 Sally-0.008
Token that
Token ID326
Feature activation+1.48e-02

Pos logit contributions
igsaw+0.029
 Amy+0.028
 Meg+0.028
 refreshing+0.028
kins+0.028
Neg logit contributions
ga-0.024
be-0.023
zee-0.020
by-0.020
ruck-0.020
Token she
Token ID673
Feature activation-5.03e-03

Pos logit contributions
igsaw+0.018
uct+0.017
 refreshing+0.017
ement+0.016
 Meg+0.015
Neg logit contributions
ga-0.027
be-0.025
 rh-0.024
smith-0.023
Stretch-0.023
Token wanted
Token ID2227
Feature activation+5.34e-03

Pos logit contributions
ga+0.008
be+0.007
Ar+0.006
zee+0.006
aff+0.006
Neg logit contributions
 Amy-0.008
 Meg-0.008
 Daisy-0.008
uct-0.008
 Sally-0.008
Token to
Token ID284
Feature activation+8.72e-03

Pos logit contributions
uct+0.010
igsaw+0.010
 refreshing+0.010
 Amy+0.010
kins+0.010
Neg logit contributions
ga-0.006
be-0.005
Ar-0.005
ruck-0.005
by-0.005
Token explore
Token ID7301
Feature activation+4.12e-03

Pos logit contributions
igsaw+0.015
uct+0.015
kins+0.015
 Amy+0.015
ement+0.015
Neg logit contributions
ga-0.012
be-0.010
Ar-0.010
 rh-0.010
ruck-0.009
Token the
Token ID262
Feature activation-6.85e-03

Pos logit contributions
uct+0.007
igsaw+0.007
 Meg+0.006
kin+0.006
 refreshing+0.006
Neg logit contributions
ga-0.006
ruck-0.005
 rh-0.005
be-0.005
Ar-0.005
Token cabin
Token ID9351
Feature activation-3.65e-03

Pos logit contributions
ga+0.011
be+0.010
Ar+0.009
aff+0.009
zee+0.009
Neg logit contributions
 Amy-0.011
 Daisy-0.010
 Meg-0.010
 Sally-0.010
uct-0.010
Token.
Token ID13
Feature activation+1.89e-03

Pos logit contributions
ga+0.006
be+0.006
k+0.005
ruck+0.005
 rh+0.005
Neg logit contributions
 Meg-0.005
 Amy-0.005
igsaw-0.005
 refreshing-0.005
uct-0.005
Token After
Token ID2293
Feature activation-5.84e-03

Pos logit contributions
 Amy+0.003
 Daisy+0.002
 Meg+0.002
 Sarah+0.002
 Emily+0.002
Neg logit contributions
ga-0.004
be-0.003
Ar-0.003
ruck-0.003
eer-0.003
Token cared
Token ID19951
Feature activation-1.06e-02

Pos logit contributions
ga+0.022
be+0.019
Ar+0.017
by+0.017
eer+0.017
Neg logit contributions
 Amy-0.035
 Meg-0.035
uct-0.034
 Daisy-0.033
igsaw-0.033
Token for
Token ID329
Feature activation-7.95e-03

Pos logit contributions
ga+0.017
be+0.015
ruck+0.014
aff+0.014
Ar+0.014
Neg logit contributions
uct-0.016
 Meg-0.016
 Amy-0.015
 refreshing-0.015
igsaw-0.015
Token it
Token ID340
Feature activation+1.48e-02

Pos logit contributions
ga+0.016
be+0.014
ruck+0.014
by+0.014
Ar+0.013
Neg logit contributions
uct-0.009
 refreshing-0.009
igsaw-0.009
 Meg-0.008
 Amy-0.008
Token.
Token ID13
Feature activation+4.60e-05

Pos logit contributions
igsaw+0.024
uct+0.024
 Meg+0.024
 refreshing+0.023
 Amy+0.022
Neg logit contributions
ga-0.020
be-0.018
by-0.018
gen-0.017
eer-0.017
Token<|endoftext|>
Token ID50256
Feature activation-1.27e-03

Pos logit contributions
 Amy+0.000
 Meg+0.000
 Daisy+0.000
uct+0.000
 Sally+0.000
Neg logit contributions
ga-0.000
be-0.000
Ar-0.000
aff-0.000
zee-0.000
TokenOnce
Token ID7454
Feature activation+2.68e-02

Pos logit contributions
ga+0.002
be+0.001
Ar+0.001
Bang+0.001
aff+0.001
Neg logit contributions
 Amy-0.002
 Meg-0.002
uct-0.002
 refreshing-0.002
 Daisy-0.002
Token upon
Token ID2402
Feature activation-1.35e-02

Pos logit contributions
uct+0.041
 refreshing+0.041
 Meg+0.040
 Amy+0.040
igsaw+0.039
Neg logit contributions
ga-0.037
by-0.037
be-0.034
ric-0.032
aff-0.032
Token a
Token ID257
Feature activation-2.26e-03

Pos logit contributions
ga+0.024
be+0.022
Ar+0.021
ruck+0.020
Bang+0.020
Neg logit contributions
 Amy-0.018
 Meg-0.017
uct-0.017
 Daisy-0.017
igsaw-0.017
Token time
Token ID640
Feature activation+5.48e-03

Pos logit contributions
ga+0.004
be+0.003
Ar+0.003
by+0.003
Bang+0.003
Neg logit contributions
 Amy-0.003
 Meg-0.003
uct-0.003
 Daisy-0.003
 Sally-0.003
Token there
Token ID612
Feature activation+6.11e-03

Pos logit contributions
uct+0.008
 refreshing+0.008
kins+0.007
 reassured+0.007
 Meg+0.007
Neg logit contributions
ga-0.009
be-0.008
Ar-0.008
 rh-0.008
by-0.007
Token was
Token ID373
Feature activation-7.86e-03

Pos logit contributions
 Meg+0.011
uct+0.011
igsaw+0.011
 Amy+0.011
 refreshing+0.011
Neg logit contributions
ga-0.007
be-0.007
Ar-0.006
 rh-0.006
Bang-0.006
Token you
Token ID345
Feature activation+5.71e-03

Pos logit contributions
ga+0.007
be+0.007
Ar+0.006
aff+0.006
 rh+0.006
Neg logit contributions
 refreshing-0.003
 Amy-0.003
uct-0.003
 Meg-0.003
kins-0.003
Token mum
Token ID25682
Feature activation-5.41e-03

Pos logit contributions
 Amy+0.008
 Meg+0.008
 Daisy+0.008
uct+0.008
 refreshing+0.008
Neg logit contributions
ga-0.010
be-0.009
Ar-0.008
aff-0.008
zee-0.008
Token!'
Token ID13679
Feature activation-1.79e-02

Pos logit contributions
ga+0.010
be+0.009
Ar+0.009
aff+0.008
by+0.008
Neg logit contributions
 Amy-0.007
 Meg-0.007
uct-0.006
 refreshing-0.006
 Daisy-0.006
Token She
Token ID1375
Feature activation+1.49e-03

Pos logit contributions
ga+0.030
be+0.026
Ar+0.024
aff+0.024
k+0.023
Neg logit contributions
 Amy-0.027
 Meg-0.026
uct-0.025
 refreshing-0.025
igsaw-0.025
Token was
Token ID373
Feature activation-6.62e-03

Pos logit contributions
 Amy+0.002
 Meg+0.002
 Daisy+0.002
uct+0.002
igsaw+0.002
Neg logit contributions
ga-0.003
be-0.002
Ar-0.002
zee-0.002
 rh-0.002
Token so
Token ID523
Feature activation+8.65e-03

Pos logit contributions
ga+0.013
be+0.012
Ar+0.011
ric+0.011
 rh+0.010
Neg logit contributions
igsaw-0.008
 Amy-0.008
 Meg-0.008
kins-0.007
 Daisy-0.007
Token proud
Token ID6613
Feature activation+1.75e-02

Pos logit contributions
 Meg+0.012
 Amy+0.012
igsaw+0.011
uct+0.011
 Daisy+0.011
Neg logit contributions
ga-0.015
be-0.014
Ar-0.013
 rh-0.012
by-0.012
Token of
Token ID286
Feature activation+3.71e-03

Pos logit contributions
 Amy+0.021
uct+0.020
 Meg+0.020
kins+0.020
igsaw+0.020
Neg logit contributions
ga-0.033
be-0.031
ruck-0.030
aff-0.029
smith-0.028
Token her
Token ID607
Feature activation-2.69e-03

Pos logit contributions
 refreshing+0.005
igsaw+0.005
uct+0.005
kins+0.005
 jar+0.005
Neg logit contributions
ga-0.006
ric-0.005
ruck-0.005
 rh-0.004
 gor-0.004
Token son
Token ID3367
Feature activation-2.13e-02

Pos logit contributions
ga+0.005
by+0.004
be+0.004
 rh+0.004
 pengu+0.004
Neg logit contributions
 Amy-0.004
 Meg-0.003
uct-0.003
 Daisy-0.003
 Sally-0.003
Token that
Token ID326
Feature activation+8.76e-04

Pos logit contributions
ga+0.037
be+0.034
by+0.032
k+0.031
 rh+0.031
Neg logit contributions
 Amy-0.029
uct-0.028
 refreshing-0.028
 Meg-0.028
igsaw-0.027
Token,
Token ID11
Feature activation-3.30e-03

Pos logit contributions
uct+0.031
 refreshing+0.029
igsaw+0.029
 Meg+0.028
kins+0.028
Neg logit contributions
ga-0.038
be-0.036
by-0.032
zee-0.030
k-0.030
Token let
Token ID1309
Feature activation+2.12e-02

Pos logit contributions
ga+0.005
be+0.005
 pengu+0.005
 rh+0.005
 Who+0.004
Neg logit contributions
kins-0.005
ement-0.004
uct-0.004
igsaw-0.004
ized-0.004
Token's
Token ID338
Feature activation-1.27e-02

Pos logit contributions
igsaw+0.036
kins+0.036
 Amy+0.036
 refreshing+0.035
kin+0.035
Neg logit contributions
ga-0.029
be-0.027
by-0.024
check-0.023
Ar-0.023
Token fix
Token ID4259
Feature activation-2.04e-02

Pos logit contributions
ga+0.021
be+0.020
Ar+0.018
Bang+0.017
ruck+0.017
Neg logit contributions
 Amy-0.018
 Meg-0.018
 Daisy-0.017
uct-0.017
 refreshing-0.017
Token it
Token ID340
Feature activation+7.04e-03

Pos logit contributions
ga+0.035
be+0.031
 rh+0.029
ruck+0.028
ric+0.028
Neg logit contributions
uct-0.028
 Meg-0.027
 Amy-0.027
 refreshing-0.026
igsaw-0.026
Token together
Token ID1978
Feature activation+1.41e-02

Pos logit contributions
 Meg+0.011
 refreshing+0.010
uct+0.010
 Amy+0.010
 Daisy+0.010
Neg logit contributions
ga-0.011
be-0.011
ruck-0.009
k-0.009
Ar-0.009
Token."
Token ID526
Feature activation-4.43e-03

Pos logit contributions
 Amy+0.020
 Meg+0.020
 Daisy+0.019
uct+0.019
 Sally+0.019
Neg logit contributions
ga-0.024
be-0.022
Ar-0.020
ruck-0.020
k-0.020
Token They
Token ID1119
Feature activation+6.92e-03

Pos logit contributions
ga+0.006
be+0.006
Ar+0.005
aff+0.005
Bang+0.005
Neg logit contributions
 refreshing-0.007
uct-0.007
 Amy-0.007
kins-0.007
 Meg-0.007
Token worked
Token ID3111
Feature activation-7.83e-03

Pos logit contributions
 Amy+0.011
 Meg+0.010
 Daisy+0.010
igsaw+0.010
 Sally+0.010
Neg logit contributions
ga-0.011
be-0.010
Ar-0.009
Bang-0.009
k-0.009
Token together
Token ID1978
Feature activation+2.34e-03

Pos logit contributions
ga+0.017
be+0.016
Ar+0.015
by+0.015
Bang+0.015
Neg logit contributions
 Amy-0.008
 Meg-0.007
 Daisy-0.007
uct-0.007
 refreshing-0.007
Token to
Token ID284
Feature activation-1.13e-02

Pos logit contributions
 Amy+0.004
 Meg+0.004
 Daisy+0.004
uct+0.004
 Sally+0.003
Neg logit contributions
ga-0.004
be-0.003
Ar-0.003
zee-0.003
aff-0.003
Token the
Token ID262
Feature activation-4.81e-03

Pos logit contributions
 Amy+0.002
 Meg+0.002
 refreshing+0.002
uct+0.002
 Daisy+0.001
Neg logit contributions
ga-0.004
be-0.004
Ar-0.004
aff-0.004
zee-0.004
Token bird
Token ID6512
Feature activation+3.62e-03

Pos logit contributions
ga+0.008
be+0.007
Ar+0.007
zee+0.007
aff+0.007
Neg logit contributions
 Amy-0.007
 Daisy-0.007
 Meg-0.007
 Sally-0.007
 there-0.007
Token,
Token ID11
Feature activation-1.57e-02

Pos logit contributions
 Amy+0.004
uct+0.004
 Meg+0.004
 refreshing+0.004
 Daisy+0.004
Neg logit contributions
ga-0.007
be-0.006
Ar-0.006
aff-0.006
zee-0.006
Token "
Token ID366
Feature activation-2.02e-02

Pos logit contributions
ga+0.022
Ar+0.019
be+0.019
zee+0.018
aff+0.018
Neg logit contributions
 Amy-0.029
 Daisy-0.028
 Sally-0.027
 there-0.027
 Meg-0.027
TokenWhy
Token ID5195
Feature activation-1.09e-02

Pos logit contributions
ga+0.028
be+0.024
Ar+0.023
aff+0.023
zee+0.022
Neg logit contributions
 Amy-0.036
 Daisy-0.035
 there-0.035
 Meg-0.035
uct-0.034
Token are
Token ID389
Feature activation+2.18e-02

Pos logit contributions
ga+0.018
be+0.016
 rh+0.015
Ar+0.015
eer+0.015
Neg logit contributions
uct-0.016
igsaw-0.016
 Amy-0.016
 refreshing-0.016
 Meg-0.015
Token you
Token ID345
Feature activation-4.26e-03

Pos logit contributions
igsaw+0.020
 Amy+0.019
 Meg+0.019
uct+0.019
kin+0.019
Neg logit contributions
ga-0.045
be-0.044
 rh-0.043
Stretch-0.040
 pengu-0.040
Token in
Token ID287
Feature activation+1.20e-02

Pos logit contributions
ga+0.009
be+0.008
ric+0.007
Ar+0.007
 rh+0.007
Neg logit contributions
 Amy-0.005
 Meg-0.005
uct-0.005
 Daisy-0.005
 refreshing-0.004
Token the
Token ID262
Feature activation-1.26e-03

Pos logit contributions
uct+0.019
kins+0.018
cakes+0.017
ement+0.017
 Meg+0.017
Neg logit contributions
 rh-0.020
ga-0.019
 pengu-0.017
be-0.017
 horn-0.016
Token cell
Token ID2685
Feature activation-4.20e-03

Pos logit contributions
ga+0.002
be+0.002
Ar+0.002
zee+0.002
aff+0.002
Neg logit contributions
 Amy-0.002
 Daisy-0.002
 Meg-0.002
 Sally-0.002
uct-0.002
Token?"
Token ID1701
Feature activation+5.95e-04

Pos logit contributions
ga+0.007
be+0.007
ric+0.006
smith+0.006
 rh+0.006
Neg logit contributions
 Amy-0.006
 Meg-0.006
uct-0.005
igsaw-0.005
 Daisy-0.005
Token said
Token ID531
Feature activation-3.83e-03

Pos logit contributions
ga+0.013
be+0.011
Ar+0.011
zee+0.010
ric+0.010
Neg logit contributions
 Amy-0.011
 Daisy-0.010
 Meg-0.010
 Sally-0.010
uct-0.010
Token,
Token ID11
Feature activation+7.96e-03

Pos logit contributions
ga+0.008
Ar+0.007
ric+0.007
be+0.007
ruck+0.007
Neg logit contributions
 Amy-0.005
 Daisy-0.004
 Meg-0.004
 Sally-0.004
 there-0.004
Token "
Token ID366
Feature activation-1.48e-02

Pos logit contributions
 Amy+0.013
 Daisy+0.013
 Sally+0.012
 Emily+0.012
 there+0.012
Neg logit contributions
ga-0.013
ric-0.012
be-0.012
by-0.012
Ar-0.012
TokenY
Token ID56
Feature activation+2.27e-02

Pos logit contributions
ga+0.026
ric+0.024
eer+0.023
OO+0.023
ruck+0.023
Neg logit contributions
 Daisy-0.020
 Amy-0.020
 Meg-0.019
 Sally-0.019
 Emily-0.018
Tokenoga
Token ID10949
Feature activation+1.43e-03

Pos logit contributions
 Amy+0.030
 Daisy+0.028
 Meg+0.028
uct+0.027
 refreshing+0.027
Neg logit contributions
ga-0.042
be-0.039
aff-0.036
zee-0.036
Ar-0.036
Token,
Token ID11
Feature activation-4.03e-03

Pos logit contributions
uct+0.002
 Meg+0.002
 Amy+0.002
 Sally+0.002
 Daisy+0.002
Neg logit contributions
ga-0.002
be-0.002
Ar-0.002
zee-0.002
ric-0.002
Token can
Token ID460
Feature activation+2.23e-02

Pos logit contributions
ga+0.007
be+0.007
 rh+0.006
Ar+0.006
Bang+0.006
Neg logit contributions
 Amy-0.005
kins-0.005
uct-0.005
 Meg-0.005
ement-0.005
Token you
Token ID345
Feature activation-1.05e-02

Pos logit contributions
 Amy+0.024
 Meg+0.024
igsaw+0.024
uct+0.024
kins+0.023
Neg logit contributions
ga-0.044
be-0.042
Stretch-0.039
 rh-0.038
Ar-0.038
Token teach
Token ID4545
Feature activation+3.62e-03

Pos logit contributions
ga+0.018
be+0.017
Ar+0.016
zee+0.015
aff+0.015
Neg logit contributions
 Amy-0.015
 Meg-0.014
uct-0.014
 Daisy-0.014
 Sally-0.014
Token me
Token ID502
Feature activation+3.99e-03

Pos logit contributions
 Amy+0.004
 Daisy+0.004
 Meg+0.004
 Emily+0.004
 Sally+0.004
Neg logit contributions
ga-0.007
be-0.007
Ar-0.007
zee-0.006
ruck-0.006
Token some
Token ID617
Feature activation+1.09e-02

Pos logit contributions
 Amy+0.005
 Meg+0.005
 Daisy+0.005
uct+0.005
 Sally+0.005
Neg logit contributions
ga-0.007
be-0.007
Ar-0.006
ric-0.006
ruck-0.006
Token the
Token ID262
Feature activation+7.37e-03

Pos logit contributions
 Amy+0.008
 Meg+0.007
 Daisy+0.007
uct+0.007
 Sally+0.007
Neg logit contributions
ga-0.018
be-0.017
Ar-0.016
zee-0.016
aff-0.016
Token bird
Token ID6512
Feature activation+1.79e-02

Pos logit contributions
 Amy+0.009
 there+0.009
 refreshing+0.009
 Daisy+0.009
 dry+0.009
Neg logit contributions
ga-0.015
be-0.012
zee-0.012
Ar-0.012
aff-0.012
Token and
Token ID290
Feature activation-1.44e-03

Pos logit contributions
 Amy+0.026
 Meg+0.025
igsaw+0.025
uct+0.025
 Sally+0.024
Neg logit contributions
ga-0.030
be-0.028
k-0.025
 rh-0.025
ric-0.025
Token it
Token ID340
Feature activation+7.34e-03

Pos logit contributions
ga+0.003
be+0.003
Ar+0.002
zee+0.002
aff+0.002
Neg logit contributions
 Amy-0.002
 Meg-0.002
 Daisy-0.002
uct-0.002
 Sally-0.002
Token got
Token ID1392
Feature activation+4.35e-03

Pos logit contributions
 Amy+0.011
 Daisy+0.011
 Meg+0.011
uct+0.010
 Sally+0.010
Neg logit contributions
ga-0.012
be-0.011
Ar-0.011
zee-0.010
aff-0.010
Token burned
Token ID11544
Feature activation-2.00e-03

Pos logit contributions
 Amy+0.005
 Daisy+0.005
 Meg+0.005
 Sally+0.004
uct+0.004
Neg logit contributions
ga-0.009
be-0.008
Ar-0.008
zee-0.008
aff-0.008
Token.
Token ID13
Feature activation+7.03e-03

Pos logit contributions
ga+0.004
be+0.004
Ar+0.003
ruck+0.003
ric+0.003
Neg logit contributions
 Amy-0.002
 Meg-0.002
uct-0.002
igsaw-0.002
 Daisy-0.002
Token The
Token ID383
Feature activation+2.71e-04

Pos logit contributions
 Amy+0.012
 Meg+0.011
 Daisy+0.011
uct+0.011
 Sally+0.011
Neg logit contributions
ga-0.011
be-0.009
Ar-0.009
zee-0.008
aff-0.008
Token other
Token ID584
Feature activation-2.52e-02

Pos logit contributions
 Amy+0.000
 there+0.000
 refreshing+0.000
 Daisy+0.000
 Sally+0.000
Neg logit contributions
ga-0.001
be-0.000
zee-0.000
Ar-0.000
aff-0.000
Token animals
Token ID4695
Feature activation-4.31e-03

Pos logit contributions
ga+0.039
Ar+0.032
be+0.031
zee+0.031
aff+0.030
Neg logit contributions
 Amy-0.043
 there-0.042
 Daisy-0.041
 Sally-0.041
 Sue-0.040
Token found
Token ID1043
Feature activation+5.60e-03

Pos logit contributions
ga+0.007
be+0.007
Ar+0.006
aff+0.006
zee+0.006
Neg logit contributions
 Amy-0.006
 Meg-0.006
 Daisy-0.006
uct-0.006
 Sally-0.006
Token,
Token ID11
Feature activation+2.02e-03

Pos logit contributions
ga+0.023
be+0.021
ruck+0.019
 rh+0.019
Bang+0.018
Neg logit contributions
igsaw-0.022
uct-0.021
 Meg-0.021
 refreshing-0.020
 Amy-0.020
Token his
Token ID465
Feature activation+3.71e-03

Pos logit contributions
igsaw+0.003
 Meg+0.003
uct+0.003
 refreshing+0.003
ement+0.003
Neg logit contributions
ga-0.003
be-0.003
ruck-0.003
 rh-0.003
zee-0.003
Token little
Token ID1310
Feature activation+3.39e-03

Pos logit contributions
uct+0.005
 Meg+0.005
 Amy+0.005
 Daisy+0.005
 Sally+0.005
Neg logit contributions
ga-0.006
be-0.006
Ar-0.006
 rh-0.005
zee-0.005
Token dog
Token ID3290
Feature activation-5.74e-03

Pos logit contributions
 Amy+0.005
 Meg+0.005
 Daisy+0.005
uct+0.005
 Sally+0.005
Neg logit contributions
ga-0.005
be-0.005
Ar-0.005
zee-0.004
aff-0.004
Token,
Token ID11
Feature activation-5.40e-03

Pos logit contributions
ga+0.012
be+0.011
Ar+0.010
aff+0.010
zee+0.010
Neg logit contributions
 Amy-0.007
 Daisy-0.006
 Meg-0.006
 Sally-0.006
 Sarah-0.006
Token Spot
Token ID15899
Feature activation-8.71e-03

Pos logit contributions
ga+0.011
be+0.010
Ar+0.010
zee+0.009
aff+0.009
Neg logit contributions
 Amy-0.006
 Meg-0.006
 Daisy-0.006
uct-0.006
 Sally-0.006
Token,
Token ID11
Feature activation-3.97e-03

Pos logit contributions
ga+0.020
Ar+0.018
be+0.018
ruck+0.018
aff+0.018
Neg logit contributions
 Amy-0.009
 Daisy-0.008
 Meg-0.007
 there-0.007
 Sally-0.007
Token started
Token ID2067
Feature activation-2.67e-03

Pos logit contributions
ga+0.007
be+0.006
Ar+0.006
aff+0.006
zee+0.006
Neg logit contributions
 Amy-0.006
 Daisy-0.006
 Sally-0.006
 Meg-0.006
 Sarah-0.006
Token to
Token ID284
Feature activation+5.45e-03

Pos logit contributions
ga+0.004
be+0.003
Ar+0.003
zee+0.003
aff+0.003
Neg logit contributions
 Amy-0.005
 Meg-0.004
 Daisy-0.004
 Sally-0.004
uct-0.004
Token bark
Token ID21405
Feature activation-7.16e-03

Pos logit contributions
ement+0.008
uct+0.008
 Meg+0.008
igsaw+0.008
kins+0.008
Neg logit contributions
ga-0.008
be-0.008
Ar-0.007
 rh-0.007
Bang-0.007
Token loudly
Token ID23112
Feature activation-1.54e-02

Pos logit contributions
ga+0.014
be+0.013
Ar+0.012
zee+0.012
aff+0.012
Neg logit contributions
 Amy-0.009
 Meg-0.008
uct-0.008
 Daisy-0.008
 refreshing-0.008
Token them
Token ID606
Feature activation+7.58e-03

Pos logit contributions
uct+0.006
 Amy+0.006
 Meg+0.006
 Daisy+0.006
igsaw+0.005
Neg logit contributions
ga-0.011
be-0.010
Ar-0.010
 rh-0.010
aff-0.010
Token on
Token ID319
Feature activation+9.71e-03

Pos logit contributions
 Meg+0.010
 Amy+0.010
 Daisy+0.010
igsaw+0.010
uct+0.010
Neg logit contributions
ga-0.013
be-0.012
ric-0.011
ruck-0.011
 rh-0.011
Token the
Token ID262
Feature activation-9.47e-03

Pos logit contributions
 Amy+0.008
 Daisy+0.008
 Meg+0.008
 Sally+0.007
 Sarah+0.007
Neg logit contributions
ga-0.023
be-0.021
Ar-0.020
zee-0.020
aff-0.020
Token oak
Token ID26210
Feature activation+1.98e-03

Pos logit contributions
ga+0.018
be+0.016
Ar+0.016
zee+0.015
aff+0.015
Neg logit contributions
 Amy-0.012
 Daisy-0.012
 Sally-0.012
 Meg-0.012
uct-0.011
Token.
Token ID13
Feature activation-3.41e-03

Pos logit contributions
 Amy+0.003
 Meg+0.003
 Daisy+0.002
uct+0.002
 refreshing+0.002
Neg logit contributions
ga-0.004
be-0.003
 rh-0.003
smith-0.003
aff-0.003
Token He
Token ID679
Feature activation-8.27e-03

Pos logit contributions
ga+0.005
be+0.005
Ar+0.004
ric+0.004
by+0.004
Neg logit contributions
 Amy-0.006
 Sally-0.006
 Daisy-0.006
 Meg-0.006
 Sarah-0.006
Token was
Token ID373
Feature activation-1.02e-02

Pos logit contributions
ga+0.014
be+0.012
Ar+0.012
aff+0.011
zee+0.011
Neg logit contributions
 Amy-0.013
 Meg-0.012
 Daisy-0.012
 Sally-0.012
uct-0.012
Token surprised
Token ID6655
Feature activation+1.81e-02

Pos logit contributions
ga+0.020
be+0.018
Ar+0.018
ric+0.017
 rh+0.017
Neg logit contributions
igsaw-0.012
 Daisy-0.011
 Amy-0.011
uct-0.011
 Meg-0.011
Token and
Token ID290
Feature activation+1.72e-02

Pos logit contributions
igsaw+0.026
 Amy+0.026
uct+0.026
kin+0.025
 Meg+0.025
Neg logit contributions
ga-0.030
be-0.028
ruck-0.026
aff-0.026
by-0.025
Token worried
Token ID7960
Feature activation-1.30e-03

Pos logit contributions
igsaw+0.020
 Amy+0.019
 Meg+0.018
 Daisy+0.018
uct+0.018
Neg logit contributions
ga-0.034
be-0.032
Ar-0.030
 rh-0.029
Bang-0.029
Token.
Token ID13
Feature activation+1.86e-03

Pos logit contributions
ga+0.002
be+0.002
ruck+0.002
aff+0.002
zee+0.002
Neg logit contributions
 Meg-0.002
uct-0.002
igsaw-0.002
 Amy-0.002
 Daisy-0.002
Token Mam
Token ID29926
Feature activation-1.44e-03

Pos logit contributions
ga+0.031
be+0.029
Ar+0.027
zee+0.026
 rh+0.026
Neg logit contributions
 Amy-0.015
uct-0.014
 Meg-0.014
 Daisy-0.013
 refreshing-0.013
Tokenma
Token ID2611
Feature activation+9.74e-03

Pos logit contributions
ga+0.003
be+0.003
by+0.003
ruck+0.003
smith+0.003
Neg logit contributions
 Amy-0.001
igsaw-0.001
 Daisy-0.001
 refreshing-0.001
 Meg-0.001
Token remembered
Token ID12086
Feature activation-3.90e-03

Pos logit contributions
 Amy+0.013
 Meg+0.012
 Daisy+0.012
 Sally+0.012
uct+0.012
Neg logit contributions
ga-0.018
be-0.016
Ar-0.016
zee-0.015
aff-0.015
Token something
Token ID1223
Feature activation+2.07e-04

Pos logit contributions
ga+0.007
be+0.006
ruck+0.006
by+0.006
smith+0.006
Neg logit contributions
 Amy-0.005
igsaw-0.005
uct-0.005
 Meg-0.005
 refreshing-0.005
Token.
Token ID13
Feature activation+7.71e-04

Pos logit contributions
uct+0.000
igsaw+0.000
 Amy+0.000
 Meg+0.000
mm+0.000
Neg logit contributions
ga-0.000
be-0.000
ruck-0.000
by-0.000
 motors-0.000
Token "
Token ID366
Feature activation-6.13e-03

Pos logit contributions
 Amy+0.001
 Sally+0.001
 Daisy+0.001
 Jane+0.001
 Sarah+0.001
Neg logit contributions
ga-0.002
Ar-0.001
be-0.001
zee-0.001
Bang-0.001
TokenWait
Token ID21321
Feature activation+2.93e-03

Pos logit contributions
ga+0.010
be+0.008
Ar+0.008
aff+0.008
zee+0.008
Neg logit contributions
 Amy-0.010
 Meg-0.009
uct-0.009
 Daisy-0.009
 Sally-0.009
Token,"
Token ID553
Feature activation-1.42e-02

Pos logit contributions
 Amy+0.003
 Meg+0.003
uct+0.003
 refreshing+0.003
 Daisy+0.003
Neg logit contributions
ga-0.006
be-0.005
Ar-0.005
zee-0.005
ruck-0.005
Token she
Token ID673
Feature activation+1.15e-03

Pos logit contributions
be+0.030
ga+0.030
Ar+0.026
 rh+0.025
please+0.025
Neg logit contributions
igsaw-0.015
 Amy-0.013
 Meg-0.013
kin-0.012
 refreshing-0.012
Token said
Token ID531
Feature activation-8.73e-03

Pos logit contributions
igsaw+0.001
 refreshing+0.001
 Amy+0.001
uct+0.001
 Meg+0.001
Neg logit contributions
ga-0.002
be-0.002
Ar-0.002
zee-0.002
by-0.002
Token,
Token ID11
Feature activation-4.67e-03

Pos logit contributions
ga+0.016
be+0.015
k+0.013
by+0.013
ically+0.013
Neg logit contributions
 refreshing-0.011
igsaw-0.011
uct-0.011
 Meg-0.011
 Amy-0.010
Token,
Token ID11
Feature activation+1.67e-03

Pos logit contributions
ga+0.022
be+0.020
Ar+0.019
aff+0.018
ric+0.018
Neg logit contributions
 Amy-0.015
 Daisy-0.015
 Sally-0.014
 Meg-0.014
 Sarah-0.014
Token disgusting
Token ID23374
Feature activation-1.12e-02

Pos logit contributions
igsaw+0.002
uct+0.002
kins+0.002
ement+0.002
 Meg+0.002
Neg logit contributions
ga-0.003
be-0.003
 rh-0.003
ruck-0.002
 horn-0.002
Token place
Token ID1295
Feature activation+1.59e-02

Pos logit contributions
ga+0.021
be+0.019
Ar+0.018
zee+0.017
aff+0.017
Neg logit contributions
 Amy-0.015
 Meg-0.014
 Daisy-0.014
uct-0.014
 Sally-0.014
Token for
Token ID329
Feature activation-4.24e-03

Pos logit contributions
uct+0.022
 Meg+0.021
igsaw+0.020
 Amy+0.020
 refreshing+0.020
Neg logit contributions
ga-0.026
be-0.024
ruck-0.023
ric-0.023
eer-0.022
Token a
Token ID257
Feature activation-1.70e-02

Pos logit contributions
ga+0.008
be+0.007
Ar+0.007
 rh+0.007
ruck+0.007
Neg logit contributions
uct-0.005
 Amy-0.005
 Meg-0.005
 refreshing-0.005
igsaw-0.005
Token 3
Token ID513
Feature activation-3.48e-02

Pos logit contributions
ga+0.035
be+0.031
ric+0.029
aff+0.029
Ar+0.029
Neg logit contributions
 Amy-0.021
 Daisy-0.020
 Sally-0.019
 refreshing-0.019
 Meg-0.019
Token-
Token ID12
Feature activation+1.53e-02

Pos logit contributions
ga+0.060
be+0.057
eer+0.053
Ar+0.053
smith+0.051
Neg logit contributions
uct-0.047
 Daisy-0.045
 refreshing-0.044
 Emily-0.044
 Meg-0.044
Tokenyear
Token ID1941
Feature activation-2.03e-02

Pos logit contributions
 Amy+0.021
 Meg+0.020
 Daisy+0.020
 Sally+0.020
uct+0.019
Neg logit contributions
ga-0.028
be-0.024
Ar-0.024
ric-0.023
aff-0.023
Token-
Token ID12
Feature activation+1.20e-02

Pos logit contributions
ga+0.031
be+0.028
Ar+0.026
aff+0.025
zee+0.025
Neg logit contributions
 Amy-0.034
 Meg-0.032
 Daisy-0.032
uct-0.032
 Sally-0.031
Tokenold
Token ID727
Feature activation-1.06e-02

Pos logit contributions
 Amy+0.017
 Meg+0.016
 Daisy+0.016
 Sally+0.016
 Sarah+0.015
Neg logit contributions
ga-0.022
be-0.019
Ar-0.019
ric-0.018
aff-0.018
Token to
Token ID284
Feature activation+1.01e-02

Pos logit contributions
ga+0.020
be+0.018
Ar+0.017
aff+0.016
zee+0.016
Neg logit contributions
 Amy-0.014
 Daisy-0.014
 Meg-0.014
 Sally-0.013
 Sarah-0.013
Token excited
Token ID6568
Feature activation+3.09e-02

Pos logit contributions
 Amy+0.011
igsaw+0.010
 Meg+0.010
uct+0.010
 Daisy+0.010
Neg logit contributions
ga-0.015
be-0.014
Ar-0.013
 rh-0.012
by-0.012
Token and
Token ID290
Feature activation+7.36e-03

Pos logit contributions
kin+0.049
kins+0.048
igsaw+0.048
 refreshing+0.047
uct+0.047
Neg logit contributions
ga-0.045
be-0.039
ruck-0.039
aff-0.038
by-0.037
Token asked
Token ID1965
Feature activation-1.06e-02

Pos logit contributions
igsaw+0.013
 Amy+0.013
 Meg+0.012
ement+0.012
kins+0.012
Neg logit contributions
ga-0.009
be-0.009
k-0.007
Ar-0.007
check-0.007
Token lots
Token ID6041
Feature activation+1.92e-04

Pos logit contributions
ga+0.023
be+0.021
Ar+0.019
aff+0.019
by+0.019
Neg logit contributions
 Amy-0.010
 refreshing-0.010
uct-0.010
igsaw-0.010
 Meg-0.010
Token of
Token ID286
Feature activation+3.54e-03

Pos logit contributions
 Amy+0.000
uct+0.000
 Meg+0.000
kins+0.000
kin+0.000
Neg logit contributions
ga-0.000
aff-0.000
ruck-0.000
be-0.000
eer-0.000
Token questions
Token ID2683
Feature activation-3.81e-02

Pos logit contributions
 Amy+0.005
 Meg+0.005
 Daisy+0.005
uct+0.005
 Sally+0.005
Neg logit contributions
ga-0.006
be-0.005
Ar-0.005
aff-0.005
zee-0.005
Token.
Token ID13
Feature activation-1.62e-04

Pos logit contributions
ga+0.071
be+0.065
aff+0.060
zee+0.059
ruck+0.059
Neg logit contributions
 refreshing-0.048
 Amy-0.048
 Meg-0.046
 Daisy-0.045
uct-0.045
Token\n
Token ID198
Feature activation-5.95e-03

Pos logit contributions
ga+0.000
be+0.000
Ar+0.000
zee+0.000
ruck+0.000
Neg logit contributions
 Amy-0.000
 Daisy-0.000
 Meg-0.000
 Sally-0.000
uct-0.000
Token\n
Token ID198
Feature activation+3.45e-03

Pos logit contributions
ga+0.011
be+0.010
zee+0.010
eer+0.010
Ar+0.010
Neg logit contributions
 Amy-0.008
 Daisy-0.008
 Meg-0.007
 Sally-0.007
 Sarah-0.007
TokenAnna
Token ID31160
Feature activation+1.95e-02

Pos logit contributions
 Amy+0.005
 Meg+0.005
 Daisy+0.005
uct+0.004
 Sally+0.004
Neg logit contributions
ga-0.006
be-0.006
Ar-0.005
zee-0.005
aff-0.005
Token's
Token ID338
Feature activation+2.18e-03

Pos logit contributions
 Amy+0.024
 Daisy+0.023
 Meg+0.022
uct+0.022
 Sally+0.022
Neg logit contributions
ga-0.039
be-0.034
Ar-0.034
ruck-0.033
ric-0.033
Token him
Token ID683
Feature activation+4.64e-03

Pos logit contributions
 Amy+0.006
 Daisy+0.005
 Meg+0.005
 Sally+0.005
 Sarah+0.005
Neg logit contributions
ga-0.007
be-0.006
Ar-0.006
zee-0.006
aff-0.006
Token about
Token ID546
Feature activation-1.07e-02

Pos logit contributions
 Amy+0.006
 Meg+0.006
uct+0.006
 Daisy+0.006
 refreshing+0.006
Neg logit contributions
ga-0.009
be-0.008
aff-0.007
 rh-0.007
k-0.007
Token his
Token ID465
Feature activation+1.11e-02

Pos logit contributions
ga+0.021
be+0.018
Ar+0.018
 rh+0.018
aff+0.017
Neg logit contributions
 Amy-0.013
 Meg-0.012
uct-0.012
 Daisy-0.012
 refreshing-0.011
Token love
Token ID1842
Feature activation-2.99e-03

Pos logit contributions
 Amy+0.014
 Daisy+0.013
 Sally+0.013
 Meg+0.013
 refreshing+0.013
Neg logit contributions
ga-0.022
be-0.019
Ar-0.019
zee-0.018
Bang-0.018
Token for
Token ID329
Feature activation-2.03e-02

Pos logit contributions
ga+0.006
be+0.005
Ar+0.005
aff+0.005
ruck+0.005
Neg logit contributions
 Amy-0.004
 Meg-0.003
uct-0.003
 Daisy-0.003
 Sally-0.003
Token rad
Token ID2511
Feature activation-3.67e-02

Pos logit contributions
ga+0.037
be+0.033
Ar+0.032
zee+0.030
aff+0.030
Neg logit contributions
 Amy-0.028
 Meg-0.026
 Daisy-0.026
uct-0.026
 Sally-0.025
Tokenishes
Token ID5614
Feature activation-3.10e-02

Pos logit contributions
ga+0.076
Bang+0.069
Ar+0.069
Charge+0.066
zee+0.065
Neg logit contributions
 Daisy-0.039
 Sally-0.038
 there-0.037
 promises-0.037
kins-0.036
Token.
Token ID13
Feature activation+3.14e-03

Pos logit contributions
ga+0.053
be+0.049
aff+0.045
ruck+0.045
 rh+0.044
Neg logit contributions
 Amy-0.043
 Meg-0.042
uct-0.042
igsaw-0.040
 refreshing-0.040
Token He
Token ID679
Feature activation-6.50e-03

Pos logit contributions
 Amy+0.005
 Daisy+0.005
 Sally+0.005
 Sarah+0.005
 Emily+0.005
Neg logit contributions
ga-0.005
be-0.005
Ar-0.005
zee-0.004
k-0.004
Token knew
Token ID2993
Feature activation+1.36e-03

Pos logit contributions
ga+0.011
be+0.010
Ar+0.010
ruck+0.009
zee+0.009
Neg logit contributions
 Amy-0.009
 Daisy-0.009
 Meg-0.009
 Sally-0.009
 refreshing-0.009
Token they
Token ID484
Feature activation+3.12e-03

Pos logit contributions
 Amy+0.002
igsaw+0.002
 Meg+0.002
uct+0.002
kins+0.002
Neg logit contributions
ga-0.002
be-0.002
ruck-0.002
 rh-0.002
Ar-0.002
Token mother
Token ID2802
Feature activation-1.70e-03

Pos logit contributions
 Amy+0.004
 Daisy+0.004
 Sarah+0.004
 Sally+0.004
 Meg+0.004
Neg logit contributions
ga-0.007
be-0.006
ric-0.006
zee-0.006
eer-0.006
Token and
Token ID290
Feature activation+1.55e-02

Pos logit contributions
ga+0.003
be+0.002
ruck+0.002
Ar+0.002
smith+0.002
Neg logit contributions
 Amy-0.003
 Meg-0.003
uct-0.003
igsaw-0.003
 refreshing-0.002
Token her
Token ID607
Feature activation-1.50e-02

Pos logit contributions
 Amy+0.012
 Daisy+0.011
 Meg+0.011
 Sally+0.011
 Sarah+0.010
Neg logit contributions
ga-0.038
be-0.035
Ar-0.034
aff-0.033
zee-0.033
Token three
Token ID1115
Feature activation-1.23e-02

Pos logit contributions
ga+0.027
Ar+0.024
ric+0.024
be+0.024
Bang+0.023
Neg logit contributions
 Amy-0.021
 Daisy-0.021
 Sarah-0.020
 Sally-0.020
 Emily-0.019
Token year
Token ID614
Feature activation+4.44e-02

Pos logit contributions
ga+0.022
be+0.020
Ar+0.019
aff+0.019
zee+0.018
Neg logit contributions
 Amy-0.017
 Meg-0.016
 Daisy-0.016
uct-0.016
 Sally-0.015
Token old
Token ID1468
Feature activation-3.70e-02

Pos logit contributions
 Amy+0.051
 Meg+0.049
 Daisy+0.048
uct+0.047
 refreshing+0.046
Neg logit contributions
ga-0.089
be-0.082
Ar-0.077
aff-0.077
zee-0.075
Token son
Token ID3367
Feature activation-2.50e-02

Pos logit contributions
ga+0.068
Ar+0.063
be+0.060
Bang+0.060
ric+0.058
Neg logit contributions
 Amy-0.052
 Sally-0.051
 Daisy-0.051
 Sarah-0.050
 Jane-0.048
Token were
Token ID547
Feature activation-6.31e-03

Pos logit contributions
ga+0.051
Ar+0.047
be+0.045
zee+0.045
Bang+0.044
Neg logit contributions
 Amy-0.030
 Sally-0.029
 Daisy-0.029
 Sarah-0.029
 Jane-0.028
Token walking
Token ID6155
Feature activation-9.78e-03

Pos logit contributions
ga+0.013
Ar+0.012
be+0.011
zee+0.011
aff+0.011
Neg logit contributions
 Amy-0.008
 Daisy-0.007
 Sally-0.007
 Sarah-0.007
 there-0.007
Token down
Token ID866
Feature activation-1.81e-02

Pos logit contributions
ga+0.018
be+0.016
Ar+0.016
zee+0.015
aff+0.015
Neg logit contributions
 Amy-0.013
 Meg-0.012
 Daisy-0.012
 Sally-0.012
uct-0.012
Token the
Token ID262
Feature activation-9.76e-03

Pos logit contributions
ga+0.039
be+0.036
Ar+0.036
ruck+0.034
 rh+0.034
Neg logit contributions
 Amy-0.017
uct-0.017
 Meg-0.017
 refreshing-0.016
 Daisy-0.016
Token."
Token ID526
Feature activation+1.14e-02

Pos logit contributions
 Amy+0.012
uct+0.011
 Meg+0.011
 refreshing+0.011
igsaw+0.011
Neg logit contributions
ga-0.015
be-0.013
eer-0.012
aff-0.012
 rh-0.012
Token\n
Token ID198
Feature activation+4.95e-03

Pos logit contributions
 Amy+0.016
 Meg+0.015
uct+0.015
 refreshing+0.015
 Daisy+0.015
Neg logit contributions
ga-0.020
be-0.018
Ar-0.017
aff-0.016
zee-0.016
Token\n
Token ID198
Feature activation+9.71e-03

Pos logit contributions
 Amy+0.008
 Meg+0.007
 Daisy+0.007
 Sally+0.007
uct+0.007
Neg logit contributions
ga-0.008
be-0.008
Ar-0.007
zee-0.007
aff-0.007
TokenThe
Token ID464
Feature activation-8.31e-03

Pos logit contributions
 Amy+0.014
 Meg+0.013
 Daisy+0.013
 Sally+0.013
uct+0.013
Neg logit contributions
ga-0.017
be-0.015
Ar-0.015
zee-0.014
aff-0.014
Token wise
Token ID10787
Feature activation-3.89e-02

Pos logit contributions
ga+0.014
be+0.013
Ar+0.012
ric+0.012
aff+0.011
Neg logit contributions
 Amy-0.012
 Daisy-0.012
 Meg-0.012
 Sally-0.011
 there-0.011
Token old
Token ID1468
Feature activation-3.98e-02

Pos logit contributions
ga+0.075
be+0.072
Ar+0.070
ric+0.069
Bang+0.066
Neg logit contributions
 Amy-0.047
 Sally-0.046
 Daisy-0.046
 Meg-0.044
 Sarah-0.043
Token owl
Token ID39610
Feature activation-8.74e-03

Pos logit contributions
ga+0.068
be+0.063
Ar+0.060
ric+0.060
zee+0.057
Neg logit contributions
 Amy-0.059
 Daisy-0.057
 Sally-0.057
 Meg-0.055
 there-0.055
Token thought
Token ID1807
Feature activation+7.55e-04

Pos logit contributions
ga+0.017
be+0.016
Ar+0.015
zee+0.014
aff+0.014
Neg logit contributions
 Amy-0.011
 Meg-0.010
 Daisy-0.010
 Sally-0.010
uct-0.010
Token for
Token ID329
Feature activation-2.78e-03

Pos logit contributions
uct+0.001
 refreshing+0.001
kins+0.001
igsaw+0.001
 Meg+0.001
Neg logit contributions
ga-0.001
be-0.001
ruck-0.001
eer-0.001
smith-0.001
Token a
Token ID257
Feature activation-2.08e-02

Pos logit contributions
ga+0.005
Ar+0.004
be+0.004
zee+0.004
Bang+0.004
Neg logit contributions
uct-0.004
 Amy-0.004
 refreshing-0.004
 Meg-0.004
 Sally-0.003
Token moment
Token ID2589
Feature activation+8.49e-03

Pos logit contributions
ga+0.045
be+0.040
aff+0.038
ric+0.038
k+0.037
Neg logit contributions
 Amy-0.024
 Daisy-0.023
 Meg-0.022
 Sally-0.021
 Emily-0.021