TOP ACTIVATIONS
MAX = 3.937

. But those are not tr unks . Those are my
She writes " To Ben " on the heart .
they did not eat the tr uffle . "
ig , what is that truck for ?" asked B ess
was green and had a high tree - like shape .
Suddenly , the screen goes dark . The computer
big pond with lots of y ummy bugs and fish ."
. She forgot about her m ama . She forgot about
talk . It had many y ummy foods that smell good
they had never seen the barrel . They wished they had
" Where does the bus go ?" L ila asks
" Wow , that truck is cool !" Tom says
It was pink and had high r uff les on the
grandma because she always had y ummy treats and fun toys
. It has lots of y ummy food . What do
see only a basket with y ummy food . They say
potato . This is a tr uffle . It is a
because it had lots of y ummy decorations on it .
" What is the truck doing ?" Ben asks .
that he had earned his m ama 's permission . <|endoftext|>

MIN ACTIVATIONS
MIN = -4.034

and let the ha ird ress er go to work .
trick - or - t reating anymore . She sadly put
trick - or - t reating with her . Tim my
again ." Daddy put a band aid on his thumb and
haircut . The ha ird ress er started cutting Tim my
and gently wiped the little girl 's skin .
," replied the ha ird ress er . " It feels
he asked the ha ird ress er to give him a
and it made the little girl scared . The
mom my said she would replace it with a new one
. He fet ched a band age he found in the
Let 's go get some thread and needle ."
up , she would cl ench her fists and keep on
Lily 's mother wrapped a band age around her friend 's
ended up in such bad situations . The
. We 'll put a band - aid on it and
. Her mom put a band - aid on his knee
out a needle and some thread and showed Lily how to
help . She put a band - aid on his knee
Let 's go to the doctor and see if you need

INTERVAL 1.944 - 3.937
CONTAINS 0.314%

One day , Ben goes to the lake with
your kids not to climb high trees and forget about everything
very excited and had a big smile on his face .
staring at him with her big eyes . The
and brown and had a red cap . It looked at

INTERVAL -0.049 - 1.944
CONTAINS 52.858%

ays water on the fire . He puts out the fire
and pulled out a cheap shiny toy . She was so
But then , the teacher comes in . She
cricket ch ir ped happily in the background .
't get up . Suddenly , she heard a

INTERVAL -2.041 - -0.049
CONTAINS 46.578%

cover what a pupil is . A pupil is someone who
S ara does not like that . She says , "
, Anna . I will be gentle ." Ben says .
her and asked her for the key .
" Please give me one piece of candy . I am

INTERVAL -4.034 - -2.041
CONTAINS 0.250%

be brave and try to fix the pear . He used
" Don 't worry , you can build it again .
's dad said , " Let 's test out our new
how much she loved playing doctor and decided to hang her
. " The two of them were very happy as
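
The four INTERVAL ranges above look like equal-width bins spanning MIN to MAX, and the CONTAINS shares look like they partition all sampled activations. The source does not say how the bins were built, so equal-width binning is an assumption here, and `bin_edges` is a hypothetical helper name. A minimal sketch that checks both readings against the reported numbers:

```python
# Values copied from the MAX / MIN / INTERVAL / CONTAINS lines above.
ACT_MIN = -4.034
ACT_MAX = 3.937
REPORTED_EDGES = [-4.034, -2.041, -0.049, 1.944, 3.937]
REPORTED_SHARES = [0.250, 46.578, 52.858, 0.314]  # percent, lowest bin first


def bin_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced edges from lo to hi.

    Equal-width binning is an assumption inferred from the numbers,
    not something the dashboard output states.
    """
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


edges = bin_edges(ACT_MIN, ACT_MAX, 4)

# Largest gap between a computed edge and the edge printed in the dump.
max_edge_error = max(abs(e - r) for e, r in zip(edges, REPORTED_EDGES))

# Sum of the four CONTAINS percentages.
total_share = sum(REPORTED_SHARES)

print("computed edges:", [round(e, 3) for e in edges])
print("max edge error:", max_edge_error)
print("total share (%):", total_share)
```

If the equal-width reading holds, the edge error should stay within the three-decimal rounding of the printed values, and the shares should sum to roughly 100%.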
Token.
Token ID13
Feature activation-1.36e-01

Pos logit contributions
umm+0.042
oops+0.042
 enclosure+0.041
cks+0.041
opl+0.041
Neg logit contributions
 Lily-0.026
 Emma-0.023
Prin-0.022
 Emily-0.021
 someone-0.021
Token But
Token ID887
Feature activation-1.07e-02

Pos logit contributions
 Lily+0.003
Prin+0.003
 Emma+0.003
 accidentally+0.003
 someone+0.003
Neg logit contributions
oops-0.004
cks-0.004
umm-0.004
opl-0.004
 enclosure-0.003
Token those
Token ID883
Feature activation+2.50e-01

Pos logit contributions
Prin+0.000
 Bun+0.000
 Midnight+0.000
 Emma+0.000
 accidentally+0.000
Neg logit contributions
oops-0.000
cks-0.000
umm-0.000
 enclosure-0.000
ires-0.000
Token are
Token ID389
Feature activation+5.89e-01

Pos logit contributions
cks+0.008
oops+0.008
opl+0.008
umm+0.008
 enclosure+0.008
Neg logit contributions
 Lily-0.004
Prin-0.004
 Emma-0.004
 accidentally-0.004
 someone-0.004
Token not
Token ID407
Feature activation-3.53e-01

Pos logit contributions
umm+0.020
opl+0.020
 enclosure+0.020
oops+0.020
cks+0.020
Neg logit contributions
 Lily-0.011
Prin-0.009
 Emma-0.009
 someone-0.009
 Emily-0.008
Token tr
Token ID491
Feature activation+3.94e+00

Pos logit contributions
Prin+0.007
 Lily+0.006
 Emma+0.006
 Bun+0.006
 Benny+0.006
Neg logit contributions
cks-0.011
oops-0.011
ines-0.010
 enclosure-0.010
opl-0.010
Tokenunks
Token ID14125
Feature activation+1.78e+00

Pos logit contributions
 enclosure+0.063
iously+0.063
umm+0.062
oops+0.060
 entrance+0.055
Neg logit contributions
 accidentally-0.117
 Lily-0.115
 Emma-0.110
 Emily-0.108
arre-0.107
Token.
Token ID13
Feature activation+2.36e-01

Pos logit contributions
cks+0.059
oops+0.059
umm+0.059
 enclosure+0.058
opl+0.058
Neg logit contributions
 Lily-0.032
Prin-0.028
 Emma-0.028
 someone-0.027
 accidentally-0.026
Token Those
Token ID5845
Feature activation+7.84e-01

Pos logit contributions
oops+0.008
 enclosure+0.008
opl+0.008
umm+0.008
cks+0.008
Neg logit contributions
Prin-0.004
 Lily-0.004
 accidentally-0.004
 Emma-0.004
 Bun-0.004
Token are
Token ID389
Feature activation+5.09e-01

Pos logit contributions
oops+0.026
cks+0.026
opl+0.026
 enclosure+0.026
umm+0.026
Neg logit contributions
 Lily-0.014
Prin-0.013
 Emma-0.012
 accidentally-0.012
 someone-0.012
Token my
Token ID616
Feature activation-2.00e-02

Pos logit contributions
cks+0.016
oops+0.016
 enclosure+0.016
ines+0.016
opl+0.015
Neg logit contributions
Prin-0.009
 Lily-0.009
 Emma-0.009
 accidentally-0.008
 Bun-0.008
Token She
Token ID1375
Feature activation+1.17e+00

Pos logit contributions
cks+0.014
opl+0.014
umm+0.014
 enclosure+0.014
ines+0.014
Neg logit contributions
 Lily-0.007
Prin-0.006
 Emma-0.006
 someone-0.005
 Benny-0.005
Token writes
Token ID6797
Feature activation+4.98e-02

Pos logit contributions
cks+0.028
oops+0.028
umm+0.028
 enclosure+0.028
opl+0.028
Neg logit contributions
 Lily-0.032
Prin-0.030
 Emma-0.029
 accidentally-0.028
 Emily-0.028
Token "
Token ID366
Feature activation+2.16e+00

Pos logit contributions
oops+0.002
cks+0.002
ines+0.002
opl+0.002
 enclosure+0.002
Neg logit contributions
Prin-0.001
 Lily-0.001
 Bun-0.001
 accidentally-0.001
 Emma-0.001
TokenTo
Token ID2514
Feature activation+1.74e+00

Pos logit contributions
opl+0.057
cks+0.055
umm+0.054
ines+0.054
oops+0.054
Neg logit contributions
 Lily-0.049
 Emma-0.049
 Benny-0.048
 Emily-0.047
 war-0.047
Token Ben
Token ID3932
Feature activation+1.01e+00

Pos logit contributions
oops+0.052
cks+0.052
opl+0.051
 enclosure+0.050
ines+0.050
Neg logit contributions
Prin-0.033
 Lily-0.031
 accidentally-0.031
 Bun-0.030
 Emma-0.030
Token"
Token ID1
Feature activation+3.88e+00

Pos logit contributions
oops+0.030
 enclosure+0.029
cks+0.029
opl+0.029
ines+0.028
Neg logit contributions
 Lily-0.019
Prin-0.018
 Emma-0.017
 Bun-0.017
 Benny-0.016
Token on
Token ID319
Feature activation+4.12e-01

Pos logit contributions
oops+0.115
umm+0.111
opl+0.111
ines+0.111
cks+0.108
Neg logit contributions
 war-0.071
 Benny-0.071
 Emma-0.070
 Lily-0.070
 accidentally-0.069
Token the
Token ID262
Feature activation+7.36e-01

Pos logit contributions
opl+0.015
cks+0.014
oops+0.014
ines+0.014
 enclosure+0.014
Neg logit contributions
Prin-0.006
 Bun-0.005
 Lily-0.005
 Benny-0.004
 forgive-0.004
Token heart
Token ID2612
Feature activation-4.59e-01

Pos logit contributions
cks+0.024
oops+0.024
ines+0.024
umm+0.024
 enclosure+0.024
Neg logit contributions
 Lily-0.014
Prin-0.013
 Emma-0.013
 someone-0.012
 Emily-0.012
Token.
Token ID13
Feature activation+5.32e-01

Pos logit contributions
 Lily+0.009
Prin+0.009
 Benny+0.009
 Bun+0.008
 Emma+0.008
Neg logit contributions
oops-0.014
ines-0.014
opl-0.013
 enclosure-0.013
cks-0.013
Token
Token ID198
Feature activation+1.14e+00

Pos logit contributions
oops+0.018
cks+0.017
 enclosure+0.017
umm+0.017
opl+0.017
Neg logit contributions
 Lily-0.009
Prin-0.009
 Emma-0.008
 accidentally-0.008
 someone-0.008
Token they
Token ID484
Feature activation-2.94e-01

Pos logit contributions
Prin+0.006
 Bun+0.005
 accidentally+0.005
 Penny+0.005
 Benny+0.005
Neg logit contributions
oops-0.006
cks-0.006
opl-0.006
 cones-0.006
 enclosure-0.006
Token did
Token ID750
Feature activation+1.69e-01

Pos logit contributions
 Lily+0.005
Prin+0.005
 Emma+0.005
 Emily+0.004
 someone+0.004
Neg logit contributions
ines-0.009
oops-0.009
cks-0.009
 enclosure-0.009
opl-0.009
Token not
Token ID407
Feature activation+1.05e+00

Pos logit contributions
cks+0.005
oops+0.005
ines+0.005
opl+0.005
 enclosure+0.005
Neg logit contributions
Prin-0.003
 Lily-0.003
 Emma-0.003
 Bun-0.003
 Benny-0.003
Token eat
Token ID4483
Feature activation+5.34e-01

Pos logit contributions
oops+0.034
cks+0.033
 enclosure+0.033
opl+0.033
ines+0.033
Neg logit contributions
 Lily-0.020
Prin-0.019
 Emma-0.018
 Emily-0.017
 Bun-0.017
Token the
Token ID262
Feature activation+9.54e-01

Pos logit contributions
cks+0.018
oops+0.017
 enclosure+0.017
ines+0.017
umm+0.016
Neg logit contributions
Prin-0.010
 Emma-0.008
 Lily-0.008
L-0.008
 Emily-0.008
Token tr
Token ID491
Feature activation+3.81e+00

Pos logit contributions
cks+0.032
 enclosure+0.030
oops+0.030
opl+0.030
ines+0.029
Neg logit contributions
 Lily-0.020
 Benny-0.018
 Penny-0.018
 Emma-0.018
 something-0.018
Tokenuffle
Token ID18137
Feature activation+6.35e-01

Pos logit contributions
oops+0.057
iously+0.054
 enclosure+0.054
cks+0.051
sed+0.048
Neg logit contributions
 accidentally-0.126
 Benny-0.123
 Emily-0.121
 Emma-0.119
 Penny-0.118
Token.
Token ID13
Feature activation-6.34e-01

Pos logit contributions
 enclosure+0.019
ines+0.019
opl+0.019
cks+0.019
iously+0.018
Neg logit contributions
Prin-0.012
 Lily-0.011
 Emma-0.011
 Emily-0.010
 Bun-0.010
Token
Token ID198
Feature activation-9.77e-02

Pos logit contributions
 Lily+0.011
Prin+0.010
 Emma+0.010
 someone+0.009
 accidentally+0.009
Neg logit contributions
cks-0.021
oops-0.021
opl-0.021
 enclosure-0.021
umm-0.021
Token
Token ID198
Feature activation+1.71e-01

Pos logit contributions
 Lily+0.001
L+0.000
 someone+0.000
 Max+0.000
 it+0.000
Neg logit contributions
umm-0.005
iously-0.005
oops-0.005
 enclosure-0.004
 ounces-0.004
Token"
Token ID1
Feature activation+7.92e-01

Pos logit contributions
umm+0.007
iously+0.007
oops+0.007
 enclosure+0.007
cks+0.007
Neg logit contributions
L-0.002
 Lily-0.002
Prin-0.002
 they-0.002
 someone-0.001
Tokenig
Token ID328
Feature activation-3.14e-01

Pos logit contributions
cks+0.009
umm+0.009
opl+0.008
 tallest+0.008
 enclosure+0.008
Neg logit contributions
 Lily-0.013
Prin-0.012
 Bun-0.012
 someone-0.012
L-0.012
Token,
Token ID11
Feature activation+5.46e-01

Pos logit contributions
 Lily+0.009
Prin+0.009
 Emma+0.009
 someone+0.009
 accidentally+0.009
Neg logit contributions
cks-0.006
oops-0.006
 enclosure-0.006
opl-0.006
ines-0.006
Token what
Token ID644
Feature activation+1.43e-01

Pos logit contributions
oops+0.016
 enclosure+0.016
cks+0.015
 ig+0.015
umm+0.015
Neg logit contributions
Prin-0.011
 Lily-0.009
 Penny-0.009
 Emma-0.009
 accidentally-0.009
Token is
Token ID318
Feature activation-1.41e-01

Pos logit contributions
oops+0.003
cks+0.003
asp+0.003
 supposed+0.003
 enclosure+0.003
Neg logit contributions
Prin-0.004
 Bun-0.004
 Penny-0.004
 Beat-0.003
 louder-0.003
Token that
Token ID326
Feature activation+1.13e+00

Pos logit contributions
 Lily+0.003
 Emma+0.002
 someone+0.002
Prin+0.002
 Benny+0.002
Neg logit contributions
umm-0.005
opl-0.005
ines-0.005
 enclosure-0.005
oops-0.005
Token truck
Token ID7779
Feature activation+3.78e+00

Pos logit contributions
oops+0.033
opl+0.033
umm+0.033
cks+0.032
 enclosure+0.032
Neg logit contributions
 Lily-0.023
Prin-0.023
 Emma-0.022
 Emily-0.021
 Penny-0.021
Token for
Token ID329
Feature activation+1.06e+00

Pos logit contributions
cks+0.126
oops+0.126
umm+0.125
opl+0.125
 enclosure+0.124
Neg logit contributions
 Lily-0.066
Prin-0.060
 Emma-0.059
 accidentally-0.055
 someone-0.055
Token?"
Token ID1701
Feature activation+5.18e-01

Pos logit contributions
umm+0.034
oops+0.034
cks+0.034
opl+0.034
ines+0.033
Neg logit contributions
 Lily-0.021
 Emma-0.018
Prin-0.018
 someone-0.018
 Benny-0.017
Token asked
Token ID1965
Feature activation+1.44e+00

Pos logit contributions
cks+0.014
oops+0.014
 enclosure+0.014
opl+0.014
umm+0.014
Neg logit contributions
Prin-0.010
 war-0.009
 Lily-0.009
 Emma-0.008
 someone-0.008
Token B
Token ID347
Feature activation-1.32e-01

Pos logit contributions
cks+0.042
 enclosure+0.042
oops+0.042
iously+0.041
umm+0.041
Neg logit contributions
 war-0.023
Prin-0.022
 accidentally-0.020
arre-0.019
 louder-0.018
Tokeness
Token ID408
Feature activation-8.53e-01

Pos logit contributions
 Lily+0.003
Prin+0.003
 Bun+0.003
L+0.003
 someone+0.003
Neg logit contributions
cks-0.004
oops-0.004
opl-0.004
umm-0.004
ines-0.004
Token was
Token ID373
Feature activation-2.41e-01

Pos logit contributions
 Lily+0.007
Prin+0.006
 Emma+0.006
 someone+0.005
 war+0.005
Neg logit contributions
cks-0.021
oops-0.021
umm-0.021
 enclosure-0.021
opl-0.021
Token green
Token ID4077
Feature activation+1.24e+00

Pos logit contributions
 Lily+0.004
Prin+0.004
 Emma+0.004
 accidentally+0.003
 Bun+0.003
Neg logit contributions
oops-0.008
cks-0.008
umm-0.008
opl-0.008
 enclosure-0.008
Token and
Token ID290
Feature activation-8.49e-01

Pos logit contributions
umm+0.045
oops+0.045
 enclosure+0.044
ines+0.044
cks+0.044
Neg logit contributions
 Lily-0.017
Prin-0.016
 Emma-0.015
 Emily-0.014
 someone-0.014
Token had
Token ID550
Feature activation+7.39e-01

Pos logit contributions
Prin+0.021
 war+0.017
 forgive+0.017
 band+0.017
 Bun+0.016
Neg logit contributions
cks-0.021
oops-0.021
umm-0.021
ines-0.020
 tallest-0.020
Token a
Token ID257
Feature activation+1.18e+00

Pos logit contributions
oops+0.025
cks+0.024
umm+0.024
 enclosure+0.024
opl+0.024
Neg logit contributions
 Lily-0.013
Prin-0.012
 Emma-0.011
 accidentally-0.011
 Emily-0.011
Token high
Token ID1029
Feature activation+3.77e+00

Pos logit contributions
cks+0.034
oops+0.034
opl+0.034
 enclosure+0.033
umm+0.033
Neg logit contributions
 Lily-0.026
Prin-0.024
 Emma-0.024
 accidentally-0.023
 someone-0.023
Token tree
Token ID5509
Feature activation+9.99e-01

Pos logit contributions
oops+0.089
cks+0.088
umm+0.087
opl+0.087
 enclosure+0.087
Neg logit contributions
 Lily-0.102
Prin-0.098
 Emma-0.096
 Emily-0.092
 someone-0.091
Token-
Token ID12
Feature activation+1.88e+00

Pos logit contributions
umm+0.038
oops+0.037
opl+0.037
cks+0.037
 enclosure+0.036
Neg logit contributions
 Lily-0.014
 accidentally-0.011
 it-0.011
 Emma-0.011
 while-0.011
Tokenlike
Token ID2339
Feature activation+9.84e-01

Pos logit contributions
cks+0.035
umm+0.035
opl+0.035
 enclosure+0.035
oops+0.034
Neg logit contributions
 Lily-0.061
Prin-0.059
 Emma-0.056
 someone-0.055
 accidentally-0.055
Token shape
Token ID5485
Feature activation+1.23e+00

Pos logit contributions
oops+0.025
cks+0.025
umm+0.025
 enclosure+0.025
opl+0.025
Neg logit contributions
 Lily-0.025
Prin-0.024
 Emma-0.023
 Emily-0.022
 Bun-0.022
Token.
Token ID13
Feature activation+2.09e-01

Pos logit contributions
cks+0.046
opl+0.046
 enclosure+0.045
oops+0.045
umm+0.044
Neg logit contributions
 Lily-0.019
 it-0.015
 someone-0.015
 Emma-0.015
 accidentally-0.015
Token
Token ID198
Feature activation+2.60e-01

Pos logit contributions
oops+0.016
cks+0.016
opl+0.015
 enclosure+0.015
ines+0.015
Neg logit contributions
Prin-0.009
 Lily-0.009
 Emma-0.009
 war-0.009
 accidentally-0.008
Token
Token ID198
Feature activation+4.64e-01

Pos logit contributions
umm+0.011
oops+0.011
 enclosure+0.011
cks+0.011
iously+0.011
Neg logit contributions
 Lily-0.003
Prin-0.002
 Emma-0.002
 someone-0.002
 Emily-0.002
TokenSuddenly
Token ID38582
Feature activation+1.98e-01

Pos logit contributions
umm+0.016
oops+0.016
opl+0.015
 enclosure+0.015
cks+0.015
Neg logit contributions
 Lily-0.008
Prin-0.008
 Emma-0.007
L-0.007
 someone-0.007
Token,
Token ID11
Feature activation+2.95e-01

Pos logit contributions
ines+0.005
oops+0.005
 enclosure+0.005
cks+0.005
iously+0.005
Neg logit contributions
Prin-0.004
 Midnight-0.004
 Bun-0.004
 Benny-0.004
 Emma-0.004
Token the
Token ID262
Feature activation+2.74e-01

Pos logit contributions
oops+0.007
 cones+0.007
cks+0.007
ines+0.007
 tallest+0.007
Neg logit contributions
Prin-0.006
arre-0.006
 Bun-0.005
 Benny-0.005
 forgive-0.005
Token screen
Token ID3159
Feature activation+3.69e+00

Pos logit contributions
opl+0.007
umm+0.007
cks+0.007
ines+0.007
oops+0.007
Neg logit contributions
 Lily-0.007
Prin-0.006
 Emma-0.006
 Bun-0.006
 Benny-0.006
Token goes
Token ID2925
Feature activation+3.71e-01

Pos logit contributions
umm+0.109
opl+0.108
oops+0.108
 enclosure+0.107
cks+0.107
Neg logit contributions
 Lily-0.079
Prin-0.070
 someone-0.069
 war-0.069
 Emma-0.069
Token dark
Token ID3223
Feature activation+8.23e-01

Pos logit contributions
oops+0.011
cks+0.011
ines+0.011
 enclosure+0.011
umm+0.011
Neg logit contributions
 Lily-0.007
Prin-0.007
 Emma-0.007
 Benny-0.006
 Midnight-0.006
Token.
Token ID13
Feature activation+1.27e-01

Pos logit contributions
cks+0.026
oops+0.026
umm+0.026
 enclosure+0.026
opl+0.026
Neg logit contributions
 Lily-0.016
Prin-0.014
 Emma-0.014
 accidentally-0.013
 Benny-0.013
Token The
Token ID383
Feature activation-8.56e-02

Pos logit contributions
oops+0.004
 enclosure+0.003
cks+0.003
umm+0.003
iously+0.003
Neg logit contributions
Prin-0.003
 accidentally-0.002
 Bun-0.002
 Benny-0.002
 Penny-0.002
Token computer
Token ID3644
Feature activation+1.82e+00

Pos logit contributions
 Lily+0.002
Prin+0.002
 Emma+0.002
 accidentally+0.002
 Bun+0.002
Neg logit contributions
umm-0.002
oops-0.002
cks-0.002
 enclosure-0.002
opl-0.002
Token big
Token ID1263
Feature activation+1.13e+00

Pos logit contributions
 Lily+0.011
Prin+0.011
 Emma+0.010
 Emily+0.010
 Penny+0.010
Neg logit contributions
 enclosure-0.013
umm-0.012
opl-0.012
ines-0.012
cks-0.012
Token pond
Token ID16723
Feature activation+5.71e-01

Pos logit contributions
 enclosure+0.029
opl+0.029
umm+0.028
oops+0.028
ines+0.028
Neg logit contributions
Prin-0.028
 Lily-0.028
 Emma-0.026
L-0.025
 Emily-0.025
Token with
Token ID351
Feature activation+1.45e+00

Pos logit contributions
cks+0.018
oops+0.018
 enclosure+0.018
opl+0.018
umm+0.018
Neg logit contributions
 Lily-0.011
Prin-0.010
 Emma-0.010
 accidentally-0.009
 someone-0.009
Token lots
Token ID6041
Feature activation+1.63e+00

Pos logit contributions
oops+0.048
cks+0.048
umm+0.047
opl+0.047
 enclosure+0.047
Neg logit contributions
 Lily-0.026
Prin-0.024
 Emma-0.023
 someone-0.022
 accidentally-0.022
Token of
Token ID286
Feature activation-2.41e-01

Pos logit contributions
cks+0.061
umm+0.059
oops+0.059
iously+0.058
opl+0.058
Neg logit contributions
 Lily-0.024
 someone-0.019
Prin-0.019
 Emma-0.019
 it-0.018
Token y
Token ID331
Feature activation+3.68e+00

Pos logit contributions
 Lily+0.006
Prin+0.006
 Emma+0.006
 someone+0.005
 accidentally+0.005
Neg logit contributions
cks-0.006
oops-0.006
umm-0.006
opl-0.006
 enclosure-0.006
Tokenummy
Token ID13513
Feature activation+4.64e-01

Pos logit contributions
cks+0.052
 enclosure+0.049
oops+0.047
wen+0.047
opl+0.047
Neg logit contributions
 Lily-0.124
 Benny-0.123
 war-0.122
 it-0.116
 someone-0.115
Token bugs
Token ID11316
Feature activation+1.70e-01

Pos logit contributions
cks+0.012
oops+0.012
umm+0.012
opl+0.012
 enclosure+0.012
Neg logit contributions
 Lily-0.011
Prin-0.010
 Emma-0.010
 someone-0.010
 Benny-0.010
Token and
Token ID290
Feature activation+5.98e-01

Pos logit contributions
umm+0.005
oops+0.005
cks+0.005
opl+0.005
 enclosure+0.005
Neg logit contributions
 Lily-0.003
Prin-0.003
 Emma-0.003
 accidentally-0.003
 Benny-0.003
Token fish
Token ID5916
Feature activation+2.59e-01

Pos logit contributions
 enclosure+0.013
cks+0.012
oops+0.012
opl+0.012
umm+0.012
Neg logit contributions
Prin-0.017
 Emma-0.016
 Lily-0.016
L-0.015
 Penny-0.015
Token."
Token ID526
Feature activation-5.72e-01

Pos logit contributions
umm+0.008
oops+0.008
opl+0.008
cks+0.008
iously+0.007
Neg logit contributions
 Lily-0.005
Prin-0.005
 Emma-0.005
 accidentally-0.005
 someone-0.005
Token.
Token ID13
Feature activation-1.60e-01

Pos logit contributions
ines+0.020
cks+0.020
oops+0.020
opl+0.020
 enclosure+0.020
Neg logit contributions
 Lily-0.015
Prin-0.015
 Emma-0.014
 Benny-0.014
 accidentally-0.014
Token She
Token ID1375
Feature activation+2.96e-01

Pos logit contributions
Prin+0.003
 accidentally+0.003
 war+0.003
 Lily+0.003
 Emma+0.003
Neg logit contributions
oops-0.005
opl-0.005
umm-0.005
cks-0.005
ines-0.005
Token forgot
Token ID16453
Feature activation-3.70e-01

Pos logit contributions
oops+0.009
 enclosure+0.009
cks+0.009
ines+0.009
opl+0.009
Neg logit contributions
 Lily-0.005
Prin-0.005
 Emma-0.005
 war-0.005
 Benny-0.005
Token about
Token ID546
Feature activation+1.12e+00

Pos logit contributions
Prin+0.010
 accidentally+0.009
 Bun+0.009
L+0.009
 Lily+0.009
Neg logit contributions
oops-0.009
ines-0.009
cks-0.008
 enclosure-0.008
opl-0.008
Token her
Token ID607
Feature activation+1.84e+00

Pos logit contributions
cks+0.041
opl+0.039
 enclosure+0.039
ines+0.038
oops+0.038
Neg logit contributions
Prin-0.017
 accidentally-0.013
L-0.013
 Bun-0.013
 Emma-0.012
Token m
Token ID285
Feature activation+3.60e+00

Pos logit contributions
cks+0.055
oops+0.055
umm+0.054
opl+0.053
ram+0.052
Neg logit contributions
 Lily-0.041
 Emma-0.037
 war-0.036
 someone-0.036
 Benny-0.035
Tokenama
Token ID1689
Feature activation+1.96e+00

Pos logit contributions
cks+0.063
oops+0.059
 exact+0.059
 tallest+0.052
 thirteen+0.051
Neg logit contributions
 band-0.099
 someone-0.097
 Bun-0.096
 again-0.094
 Benny-0.093
Token.
Token ID13
Feature activation+5.54e-01

Pos logit contributions
cks+0.066
oops+0.066
umm+0.065
opl+0.065
 enclosure+0.065
Neg logit contributions
 Lily-0.034
Prin-0.030
 Emma-0.030
 someone-0.028
 accidentally-0.028
Token She
Token ID1375
Feature activation+2.58e-01

Pos logit contributions
oops+0.017
opl+0.017
cks+0.017
 enclosure+0.017
umm+0.017
Neg logit contributions
Prin-0.009
 Lily-0.009
 accidentally-0.009
 war-0.009
 Emma-0.009
Token forgot
Token ID16453
Feature activation-1.91e-01

Pos logit contributions
 enclosure+0.008
oops+0.008
ines+0.008
cks+0.008
opl+0.008
Neg logit contributions
 Lily-0.005
Prin-0.005
 Emma-0.004
 war-0.004
 Benny-0.004
Token about
Token ID546
Feature activation+1.13e+00

Pos logit contributions
Prin+0.005
 accidentally+0.005
 Bun+0.005
L+0.005
 Lily+0.005
Neg logit contributions
oops-0.005
ines-0.004
cks-0.004
 enclosure-0.004
opl-0.004
Token talk
Token ID1561
Feature activation+9.04e-01

Pos logit contributions
oops+0.020
cks+0.020
 enclosure+0.019
ines+0.019
opl+0.019
Neg logit contributions
Prin-0.020
 Bun-0.018
L-0.018
 Lily-0.017
 Midnight-0.017
Token.
Token ID13
Feature activation+1.55e-01

Pos logit contributions
oops+0.026
ines+0.026
cks+0.026
 enclosure+0.025
opl+0.025
Neg logit contributions
Prin-0.019
 Lily-0.018
 accidentally-0.017
 Emma-0.017
 Benny-0.017
Token It
Token ID632
Feature activation-1.23e-01

Pos logit contributions
oops+0.006
umm+0.006
 enclosure+0.006
opl+0.006
ines+0.006
Neg logit contributions
Prin-0.002
 war-0.002
 sounded-0.001
 accidentally-0.001
 band-0.001
Token had
Token ID550
Feature activation+9.68e-01

Pos logit contributions
 Lily+0.001
Prin+0.001
 Emma+0.001
 someone+0.001
 accidentally+0.001
Neg logit contributions
oops-0.005
cks-0.005
 enclosure-0.005
opl-0.005
ines-0.005
Token many
Token ID867
Feature activation+4.85e-01

Pos logit contributions
oops+0.030
 enclosure+0.030
cks+0.030
ines+0.030
opl+0.030
Neg logit contributions
 Lily-0.018
Prin-0.017
 Emma-0.017
 accidentally-0.016
 Bun-0.016
Token y
Token ID331
Feature activation+3.54e+00

Pos logit contributions
umm+0.015
cks+0.015
opl+0.015
ram+0.014
iously+0.014
Neg logit contributions
 Lily-0.011
 someone-0.009
 something-0.009
 Emma-0.009
 it-0.009
Tokenummy
Token ID13513
Feature activation+6.16e-01

Pos logit contributions
 enclosure+0.051
cks+0.050
opl+0.046
 tallest+0.045
 main+0.041
Neg logit contributions
 Lily-0.124
 Benny-0.115
 someone-0.115
 Emma-0.114
 something-0.113
Token foods
Token ID9013
Feature activation+4.40e-01

Pos logit contributions
cks+0.018
umm+0.018
opl+0.018
 enclosure+0.018
oops+0.018
Neg logit contributions
 Lily-0.013
 someone-0.012
 something-0.012
 Emma-0.012
Prin-0.011
Token that
Token ID326
Feature activation+5.33e-01

Pos logit contributions
umm+0.015
cks+0.015
opl+0.015
oops+0.015
 enclosure+0.015
Neg logit contributions
 Lily-0.008
 Emma-0.006
Prin-0.006
 someone-0.006
 Emily-0.006
Token smell
Token ID8508
Feature activation+9.75e-01

Pos logit contributions
oops+0.016
cks+0.015
umm+0.015
ines+0.015
 enclosure+0.014
Neg logit contributions
Prin-0.010
 Bun-0.009
 louder-0.009
 war-0.008
 accidentally-0.008
Token good
Token ID922
Feature activation+1.20e-01

Pos logit contributions
cks+0.031
oops+0.031
ines+0.031
opl+0.030
umm+0.030
Neg logit contributions
Prin-0.017
 Lily-0.017
 Emma-0.016
 accidentally-0.015
 Bun-0.015
Token they
Token ID484
Feature activation+5.04e-01

Pos logit contributions
ines+0.016
oops+0.016
iously+0.016
ires+0.016
cks+0.016
Neg logit contributions
Prin-0.010
 Midnight-0.006
 accidentally-0.006
L-0.006
 band-0.006
Token had
Token ID550
Feature activation+5.53e-01

Pos logit contributions
ines+0.012
oops+0.012
 enclosure+0.012
 cones+0.012
cks+0.012
Neg logit contributions
Prin-0.012
 Emily-0.011
 Lily-0.011
 Penny-0.010
 war-0.010
Token never
Token ID1239
Feature activation+1.33e+00

Pos logit contributions
cks+0.014
oops+0.014
 enclosure+0.013
 cones+0.013
iously+0.013
Neg logit contributions
Prin-0.013
 Penny-0.012
 Emily-0.012
 Emma-0.012
 Lily-0.012
Token seen
Token ID1775
Feature activation+5.41e-02

Pos logit contributions
oops+0.034
cks+0.033
 enclosure+0.032
ines+0.031
iously+0.031
Neg logit contributions
Prin-0.032
 Emma-0.031
 Lily-0.031
 Midnight-0.031
 Penny-0.030
Token the
Token ID262
Feature activation+1.21e+00

Pos logit contributions
opl+0.002
cks+0.002
oops+0.002
 cones+0.002
 tallest+0.002
Neg logit contributions
Prin-0.001
L-0.001
Da-0.001
 accidentally-0.001
Rose-0.001
Token barrel
Token ID9036
Feature activation+3.54e+00

Pos logit contributions
umm+0.034
oops+0.033
 enclosure+0.033
opl+0.033
cks+0.033
Neg logit contributions
 Lily-0.027
Prin-0.026
 Emma-0.025
 accidentally-0.024
 Benny-0.024
Token.
Token ID13
Feature activation-1.30e-01

Pos logit contributions
oops+0.132
cks+0.132
 enclosure+0.131
opl+0.131
umm+0.131
Neg logit contributions
 Lily-0.047
Prin-0.042
 Emma-0.040
 accidentally-0.037
 Emily-0.037
Token They
Token ID1119
Feature activation-5.59e-01

Pos logit contributions
Prin+0.002
 Lily+0.002
 Emma+0.002
 accidentally+0.002
 band+0.002
Neg logit contributions
oops-0.004
 enclosure-0.004
cks-0.004
umm-0.004
opl-0.004
Token wished
Token ID16555
Feature activation+5.64e-01

Pos logit contributions
 Lily+0.014
Prin+0.013
 Emma+0.013
 Emily+0.013
 Penny+0.013
Neg logit contributions
 enclosure-0.014
cks-0.013
ines-0.013
oops-0.013
iously-0.013
Token they
Token ID484
Feature activation+5.89e-01

Pos logit contributions
ines+0.018
oops+0.018
iously+0.018
cks+0.018
ires+0.018
Neg logit contributions
Prin-0.011
L-0.007
 accidentally-0.007
 band-0.006
 Midnight-0.006
Token had
Token ID550
Feature activation+6.88e-01

Pos logit contributions
ines+0.014
oops+0.014
 enclosure+0.013
 cones+0.013
cks+0.013
Neg logit contributions
Prin-0.014
 Emily-0.013
 Lily-0.013
 war-0.012
 Penny-0.012
Token
Token ID198
Feature activation+1.88e-01

Pos logit contributions
oops+0.030
cks+0.030
umm+0.030
opl+0.030
 enclosure+0.030
Neg logit contributions
 Lily-0.010
Prin-0.009
 Emma-0.008
 someone-0.007
 accidentally-0.007
Token"
Token ID1
Feature activation+1.47e+00

Pos logit contributions
umm+0.007
opl+0.007
oops+0.007
iously+0.007
cks+0.007
Neg logit contributions
 Lily-0.003
Prin-0.002
L-0.002
 Emma-0.002
 accidentally-0.002
TokenWhere
Token ID8496
Feature activation+3.79e-01

Pos logit contributions
cks+0.044
oops+0.044
umm+0.044
 enclosure+0.044
opl+0.044
Neg logit contributions
 Lily-0.030
Prin-0.027
 Emma-0.027
 accidentally-0.026
 Benny-0.026
Token does
Token ID857
Feature activation-2.41e-01

Pos logit contributions
oops+0.010
cks+0.009
ines+0.009
 enclosure+0.009
opl+0.009
Neg logit contributions
Prin-0.010
 Bun-0.009
 Midnight-0.009
 Penny-0.009
 Benny-0.009
Token the
Token ID262
Feature activation+2.62e-01

Pos logit contributions
Prin+0.003
 Lily+0.003
 Emma+0.003
 Midnight+0.003
L+0.003
Neg logit contributions
cks-0.009
oops-0.009
 enclosure-0.009
opl-0.008
ines-0.008
Token bus
Token ID1323
Feature activation+3.53e+00

Pos logit contributions
cks+0.007
umm+0.007
oops+0.007
unks+0.007
ram+0.007
Neg logit contributions
 Lily-0.007
 someone-0.007
 war-0.006
 band-0.006
 sounded-0.006
Token go
Token ID467
Feature activation+1.07e+00

Pos logit contributions
oops+0.114
opl+0.113
 enclosure+0.113
cks+0.112
ines+0.111
Neg logit contributions
 Lily-0.065
Prin-0.062
 Emma-0.060
 Benny-0.058
 Penny-0.058
Token?"
Token ID1701
Feature activation+1.20e+00

Pos logit contributions
cks+0.032
 enclosure+0.032
opl+0.031
ines+0.031
oops+0.031
Neg logit contributions
 Lily-0.021
Prin-0.020
 Emma-0.020
 Emily-0.019
 Midnight-0.019
Token L
Token ID406
Feature activation-4.95e-01

Pos logit contributions
umm+0.040
opl+0.040
cks+0.040
oops+0.040
 enclosure+0.040
Neg logit contributions
 Lily-0.022
Prin-0.019
 Emma-0.019
 someone-0.018
 accidentally-0.018
Tokenila
Token ID10102
Feature activation+6.45e-01

Pos logit contributions
 Lily+0.015
Prin+0.014
 Emma+0.013
 someone+0.013
 accidentally+0.013
Neg logit contributions
umm-0.011
 enclosure-0.011
opl-0.011
oops-0.011
cks-0.010
Token asks
Token ID7893
Feature activation+1.34e+00

Pos logit contributions
umm+0.016
oops+0.015
cks+0.015
opl+0.015
 enclosure+0.015
Neg logit contributions
 Lily-0.018
Prin-0.016
 Emma-0.016
 accidentally-0.016
 Emily-0.016
Token
Token ID198
Feature activation+4.64e-01

Pos logit contributions
umm+0.035
oops+0.034
 enclosure+0.034
opl+0.034
iously+0.034
Neg logit contributions
 Lily-0.009
Prin-0.007
 Emma-0.007
 someone-0.007
 Emily-0.006
Token"
Token ID1
Feature activation+1.11e+00

Pos logit contributions
umm+0.016
oops+0.016
 enclosure+0.016
opl+0.016
iously+0.016
Neg logit contributions
 Lily-0.008
Prin-0.007
L-0.007
 accidentally-0.007
 someone-0.007
TokenWow
Token ID22017
Feature activation-4.04e-01

Pos logit contributions
cks+0.029
umm+0.029
opl+0.029
oops+0.028
asp+0.028
Neg logit contributions
 Lily-0.025
 Emma-0.024
 Penny-0.023
 Benny-0.023
 Emily-0.023
Token,
Token ID11
Feature activation+3.29e-01

Pos logit contributions
 Lily+0.007
 Emma+0.006
 Benny+0.005
 Emily+0.005
 it+0.005
Neg logit contributions
opl-0.015
umm-0.015
 enclosure-0.015
oops-0.014
cks-0.014
Token that
Token ID326
Feature activation-6.51e-02

Pos logit contributions
oops+0.010
cks+0.010
 enclosure+0.010
opl+0.009
umm+0.009
Neg logit contributions
Prin-0.007
 Midnight-0.006
 Emma-0.006
 Penny-0.005
 accidentally-0.005
Token truck
Token ID7779
Feature activation+3.52e+00

Pos logit contributions
 Lily+0.001
Prin+0.001
 Penny+0.001
 Emma+0.001
 Midnight+0.001
Neg logit contributions
opl-0.002
cks-0.002
 enclosure-0.002
oops-0.002
ines-0.002
Token is
Token ID318
Feature activation+6.67e-01

Pos logit contributions
umm+0.125
iously+0.121
oops+0.120
 enclosure+0.120
ram+0.119
Neg logit contributions
 Lily-0.058
 sounded-0.047
 war-0.047
 Max-0.046
 Emma-0.046
Token cool
Token ID3608
Feature activation+8.95e-01

Pos logit contributions
 enclosure+0.022
oops+0.022
umm+0.022
ines+0.021
opl+0.021
Neg logit contributions
 Lily-0.013
 Emma-0.012
 someone-0.011
Prin-0.011
 accidentally-0.011
Token!"
Token ID2474
Feature activation+1.36e+00

Pos logit contributions
cks+0.021
oops+0.021
 enclosure+0.021
opl+0.021
umm+0.021
Neg logit contributions
 Lily-0.025
 Emma-0.023
Prin-0.023
 Benny-0.022
 someone-0.022
Token Tom
Token ID4186
Feature activation+1.90e+00

Pos logit contributions
cks+0.041
oops+0.041
 enclosure+0.040
opl+0.039
ines+0.039
Neg logit contributions
Prin-0.027
 Lily-0.026
 Emma-0.025
 Bun-0.025
 Midnight-0.025
Token says
Token ID1139
Feature activation+1.11e+00

Pos logit contributions
umm+0.043
cks+0.041
opl+0.040
oops+0.040
 enclosure+0.040
Neg logit contributions
 Lily-0.057
 Emma-0.053
Prin-0.053
 Emily-0.051
 accidentally-0.050
Token It
Token ID632
Feature activation-2.28e-01

Pos logit contributions
 Lily+0.004
 Emma+0.003
 Max+0.003
 someone+0.003
 it+0.003
Neg logit contributions
umm-0.006
cks-0.006
opl-0.006
ram-0.006
unks-0.006
Token was
Token ID373
Feature activation+2.19e-01

Pos logit contributions
Prin+0.003
 Lily+0.003
 war+0.003
 Emma+0.003
 Benny+0.003
Neg logit contributions
ines-0.008
oops-0.008
cks-0.008
 enclosure-0.008
umm-0.008
Token pink
Token ID11398
Feature activation+9.12e-01

Pos logit contributions
 enclosure+0.008
umm+0.008
cks+0.008
opl+0.008
oops+0.008
Neg logit contributions
 Lily-0.004
 Emma-0.003
Prin-0.003
 someone-0.003
 Emily-0.003
Token and
Token ID290
Feature activation-7.32e-01

Pos logit contributions
cks+0.036
oops+0.035
opl+0.035
 enclosure+0.035
umm+0.035
Neg logit contributions
 Lily-0.011
Prin-0.009
 Emma-0.009
 someone-0.008
 accidentally-0.008
Token had
Token ID550
Feature activation+1.10e+00

Pos logit contributions
Prin+0.019
 war+0.017
 accidentally+0.017
 Midnight+0.016
 Benny+0.016
Neg logit contributions
cks-0.017
oops-0.017
ines-0.017
umm-0.016
 tallest-0.016
Token high
Token ID1029
Feature activation+3.51e+00

Pos logit contributions
oops+0.037
cks+0.037
 enclosure+0.036
umm+0.036
ines+0.036
Neg logit contributions
 Lily-0.018
Prin-0.017
 Emma-0.017
 Benny-0.016
 accidentally-0.016
Token r
Token ID374
Feature activation+9.49e-01

Pos logit contributions
cks+0.085
oops+0.085
 enclosure+0.084
opl+0.083
umm+0.083
Neg logit contributions
 Lily-0.093
Prin-0.088
 Emma-0.086
 Emily-0.083
 accidentally-0.083
Tokenuff
Token ID1648
Feature activation+1.75e+00

Pos logit contributions
 enclosure+0.021
cks+0.019
 ounces+0.018
oops+0.018
iously+0.018
Neg logit contributions
 Lily-0.028
Prin-0.027
L-0.026
 Emma-0.024
 band-0.024
Tokenles
Token ID829
Feature activation+1.34e+00

Pos logit contributions
umm+0.051
 enclosure+0.050
cks+0.050
oops+0.050
opl+0.050
Neg logit contributions
 Lily-0.039
Prin-0.035
 Emma-0.035
 someone-0.034
 accidentally-0.034
Token on
Token ID319
Feature activation+1.29e+00

Pos logit contributions
umm+0.048
oops+0.047
opl+0.047
cks+0.047
 enclosure+0.046
Neg logit contributions
 Lily-0.022
Prin-0.018
 Emma-0.018
 someone-0.018
 accidentally-0.018
Token the
Token ID262
Feature activation+8.34e-01

Pos logit contributions
cks+0.054
oops+0.054
opl+0.053
 enclosure+0.053
ines+0.053
Neg logit contributions
 Lily-0.011
Prin-0.010
 Emma-0.009
 accidentally-0.008
 Benny-0.008
Token grandma
Token ID49890
Feature activation-2.84e-02

Pos logit contributions
 Lily+0.011
 Emma+0.009
 someone+0.009
Prin+0.008
 Max+0.008
Neg logit contributions
cks-0.025
oops-0.025
umm-0.025
opl-0.024
ines-0.024
Token because
Token ID780
Feature activation+4.28e-01

Pos logit contributions
 Lily+0.000
Prin+0.000
 Emma+0.000
 accidentally+0.000
 someone+0.000
Neg logit contributions
cks-0.001
oops-0.001
opl-0.001
umm-0.001
 enclosure-0.001
Token she
Token ID673
Feature activation-4.47e-01

Pos logit contributions
oops+0.015
opl+0.015
ines+0.015
 enclosure+0.015
umm+0.015
Neg logit contributions
Prin-0.006
 accidentally-0.004
 Lily-0.004
 war-0.004
 Benny-0.004
Token always
Token ID1464
Feature activation-4.12e-02

Pos logit contributions
 Lily+0.007
Prin+0.006
 Emma+0.006
 war+0.006
 someone+0.006
Neg logit contributions
ines-0.015
 enclosure-0.015
oops-0.015
cks-0.015
opl-0.015
Token had
Token ID550
Feature activation+8.62e-01

Pos logit contributions
Prin+0.001
 war+0.001
 Lily+0.001
 Midnight+0.001
 Benny+0.001
Neg logit contributions
oops-0.001
cks-0.001
ines-0.001
 enclosure-0.001
opl-0.001
Token y
Token ID331
Feature activation+3.49e+00

Pos logit contributions
oops+0.031
cks+0.031
 enclosure+0.030
opl+0.029
ines+0.029
Neg logit contributions
Prin-0.011
 Lily-0.010
 accidentally-0.010
 Benny-0.010
 Emma-0.009
Tokenummy
Token ID13513
Feature activation+5.00e-01

Pos logit contributions
cks+0.060
 tallest+0.056
 enclosure+0.056
stack+0.053
opl+0.051
Neg logit contributions
 Lily-0.115
 Benny-0.107
 Emma-0.107
 someone-0.103
 band-0.103
Token treats
Token ID18432
Feature activation+5.98e-01

Pos logit contributions
cks+0.015
umm+0.014
opl+0.014
oops+0.014
 enclosure+0.014
Neg logit contributions
 Lily-0.012
 Emma-0.011
Prin-0.010
 someone-0.010
 something-0.010
Token and
Token ID290
Feature activation+4.47e-01

Pos logit contributions
cks+0.020
opl+0.020
umm+0.020
oops+0.020
 enclosure+0.020
Neg logit contributions
 Lily-0.011
 Emma-0.009
Prin-0.009
 someone-0.009
 Penny-0.009
Token fun
Token ID1257
Feature activation+4.99e-01

Pos logit contributions
oops+0.011
 cones+0.011
umm+0.010
ines+0.010
 enclosure+0.010
Neg logit contributions
Prin-0.011
L-0.009
 war-0.009
 Bun-0.009
 Lily-0.009
Token toys
Token ID14958
Feature activation-1.61e-01

Pos logit contributions
cks+0.016
oops+0.016
umm+0.015
opl+0.015
 enclosure+0.015
Neg logit contributions
 Lily-0.010
Prin-0.009
 Emma-0.009
 someone-0.008
 accidentally-0.008
Token.
Token ID13
Feature activation-5.65e-01

Pos logit contributions
cks+0.042
oops+0.041
opl+0.041
 enclosure+0.040
umm+0.040
Neg logit contributions
 Lily-0.025
 Emma-0.023
Prin-0.022
 someone-0.021
 Emily-0.021
Token It
Token ID632
Feature activation-5.53e-01

Pos logit contributions
 Lily+0.011
Prin+0.011
 accidentally+0.010
 Emma+0.010
 war+0.010
Neg logit contributions
oops-0.017
umm-0.017
 enclosure-0.017
opl-0.016
cks-0.016
Token has
Token ID468
Feature activation+9.08e-01

Pos logit contributions
 Lily+0.010
Prin+0.009
 Emma+0.009
 accidentally+0.008
 someone+0.008
Neg logit contributions
oops-0.018
 enclosure-0.018
cks-0.018
opl-0.018
umm-0.018
Token lots
Token ID6041
Feature activation+1.45e+00

Pos logit contributions
cks+0.032
oops+0.032
opl+0.032
umm+0.032
 enclosure+0.032
Neg logit contributions
 Lily-0.014
Prin-0.013
 Emma-0.012
 someone-0.012
 accidentally-0.011
Token of
Token ID286
Feature activation+6.73e-02

Pos logit contributions
cks+0.057
 enclosure+0.054
opl+0.053
iously+0.053
umm+0.052
Neg logit contributions
 Lily-0.022
 Emma-0.017
 someone-0.017
 while-0.017
 it-0.016
Token y
Token ID331
Feature activation+3.47e+00

Pos logit contributions
umm+0.002
cks+0.002
iously+0.002
opl+0.002
 enclosure+0.002
Neg logit contributions
 Lily-0.002
 something-0.001
 someone-0.001
 it-0.001
 Emma-0.001
Tokenummy
Token ID13513
Feature activation+4.86e-01

Pos logit contributions
cks+0.061
 enclosure+0.059
 tallest+0.056
wen+0.054
stack+0.054
Neg logit contributions
 Lily-0.112
 Benny-0.107
 it-0.101
 Emma-0.100
 Penny-0.100
Token food
Token ID2057
Feature activation+7.31e-01

Pos logit contributions
cks+0.015
opl+0.015
 enclosure+0.014
umm+0.014
iously+0.014
Neg logit contributions
 Lily-0.011
 something-0.010
 while-0.009
 someone-0.009
 it-0.009
Token.
Token ID13
Feature activation+3.87e-02

Pos logit contributions
opl+0.026
cks+0.026
umm+0.026
oops+0.025
 enclosure+0.025
Neg logit contributions
 Lily-0.012
 Emma-0.010
 they-0.010
 someone-0.010
 Penny-0.009
Token What
Token ID1867
Feature activation+1.11e-01

Pos logit contributions
oops+0.001
cks+0.001
 enclosure+0.001
umm+0.001
opl+0.001
Neg logit contributions
 Lily-0.001
Prin-0.001
 Emma-0.001
 accidentally-0.001
 Emily-0.001
Token do
Token ID466
Feature activation-1.22e+00

Pos logit contributions
oops+0.003
 pounds+0.003
ines+0.003
 kil+0.003
asp+0.003
Neg logit contributions
 Bun-0.002
Prin-0.002
 Midnight-0.002
 Benny-0.002
 war-0.002
Token see
Token ID766
Feature activation+8.25e-01

Pos logit contributions
oops+0.020
cks+0.019
 enclosure+0.019
ines+0.019
opl+0.019
Neg logit contributions
 Lily-0.012
Prin-0.011
 Emma-0.011
 Emily-0.010
 Benny-0.010
Token only
Token ID691
Feature activation+5.04e-01

Pos logit contributions
 enclosure+0.031
cks+0.031
oops+0.031
opl+0.030
ines+0.030
Neg logit contributions
Prin-0.010
 Lily-0.009
 Emma-0.009
 accidentally-0.009
 Penny-0.008
Token a
Token ID257
Feature activation-1.04e+00

Pos logit contributions
oops+0.018
cks+0.018
 enclosure+0.018
opl+0.018
ines+0.018
Neg logit contributions
 Lily-0.007
Prin-0.007
 Emma-0.007
 Bun-0.006
 accidentally-0.006
Token basket
Token ID7988
Feature activation+1.10e+00

Pos logit contributions
Prin+0.024
 Lily+0.024
 Emma+0.023
 Emily+0.022
 Bun+0.022
Neg logit contributions
 enclosure-0.027
umm-0.026
ines-0.025
opl-0.025
 cones-0.024
Token with
Token ID351
Feature activation+1.10e+00

Pos logit contributions
umm+0.039
opl+0.039
cks+0.039
oops+0.039
iously+0.038
Neg logit contributions
 Lily-0.018
 they-0.015
 Emma-0.015
 someone-0.015
Prin-0.014
Token y
Token ID331
Feature activation+3.45e+00

Pos logit contributions
oops+0.036
cks+0.036
 enclosure+0.036
opl+0.036
umm+0.036
Neg logit contributions
 Lily-0.019
Prin-0.018
 Emma-0.017
 accidentally-0.016
 Emily-0.016
Tokenummy
Token ID13513
Feature activation+2.68e-01

Pos logit contributions
cks+0.048
opl+0.047
 enclosure+0.044
 tallest+0.043
wen+0.041
Neg logit contributions
 Lily-0.126
 Benny-0.116
 Emma-0.115
 someone-0.115
 it-0.114
Token food
Token ID2057
Feature activation+4.97e-01

Pos logit contributions
cks+0.007
opl+0.007
oops+0.007
umm+0.007
 enclosure+0.007
Neg logit contributions
 Lily-0.006
 Emma-0.006
Prin-0.006
 someone-0.006
 Benny-0.005
Token.
Token ID13
Feature activation+6.49e-02

Pos logit contributions
umm+0.018
opl+0.018
cks+0.018
oops+0.018
iously+0.017
Neg logit contributions
 Lily-0.008
 they-0.007
 Emma-0.007
 someone-0.007
 it-0.006
Token They
Token ID1119
Feature activation+6.44e-01

Pos logit contributions
oops+0.002
 enclosure+0.002
cks+0.002
umm+0.002
opl+0.002
Neg logit contributions
Prin-0.001
 Lily-0.001
 Emma-0.001
 accidentally-0.001
 Bun-0.001
Token say
Token ID910
Feature activation+2.74e-02

Pos logit contributions
oops+0.020
cks+0.020
 enclosure+0.020
ines+0.020
opl+0.019
Neg logit contributions
 Lily-0.013
Prin-0.012
 Emma-0.011
 Bun-0.011
 Emily-0.011
Token potato
Token ID21219
Feature activation+8.93e-01

Pos logit contributions
umm+0.002
ines+0.002
opl+0.002
oops+0.002
cks+0.002
Neg logit contributions
 Lily-0.002
Prin-0.002
 Emma-0.002
 Bun-0.002
 Emily-0.002
Token.
Token ID13
Feature activation-9.56e-02

Pos logit contributions
cks+0.028
oops+0.028
opl+0.028
umm+0.028
 enclosure+0.028
Neg logit contributions
 Lily-0.017
Prin-0.016
 Emma-0.016
 accidentally-0.015
 someone-0.015
Token This
Token ID770
Feature activation+1.05e+00

Pos logit contributions
 Lily+0.002
Prin+0.002
 Emma+0.002
 accidentally+0.002
 someone+0.002
Neg logit contributions
oops-0.003
cks-0.003
opl-0.003
umm-0.003
 enclosure-0.003
Token is
Token ID318
Feature activation+1.45e-01

Pos logit contributions
opl+0.028
oops+0.028
ines+0.028
cks+0.028
umm+0.027
Neg logit contributions
 Lily-0.025
Prin-0.024
 Emma-0.023
 Bun-0.022
 Emily-0.022
Token a
Token ID257
Feature activation-3.52e-01

Pos logit contributions
cks+0.005
oops+0.005
ines+0.004
 enclosure+0.004
opl+0.004
Neg logit contributions
Prin-0.003
 Emma-0.002
 Bun-0.002
 Lily-0.002
 accidentally-0.002
Token tr
Token ID491
Feature activation+3.45e+00

Pos logit contributions
 Lily+0.009
Prin+0.008
 Emma+0.008
 Bun+0.008
 Emily+0.008
Neg logit contributions
umm-0.009
ines-0.009
opl-0.009
oops-0.009
cks-0.008
Tokenuffle
Token ID18137
Feature activation-4.71e-01

Pos logit contributions
 enclosure+0.057
oops+0.054
iously+0.053
cks+0.045
umm+0.044
Neg logit contributions
 accidentally-0.113
 Benny-0.108
 Penny-0.107
 Emily-0.106
 Lily-0.105
Token.
Token ID13
Feature activation-2.79e-01

Pos logit contributions
Prin+0.010
 Lily+0.010
 Emma+0.010
 Emily+0.009
 Bun+0.009
Neg logit contributions
cks-0.013
ines-0.013
opl-0.013
 enclosure-0.013
oops-0.013
Token It
Token ID632
Feature activation+3.75e-01

Pos logit contributions
 Lily+0.005
Prin+0.005
 Emma+0.005
 accidentally+0.005
 Bun+0.005
Neg logit contributions
oops-0.009
cks-0.009
umm-0.009
opl-0.009
 enclosure-0.009
Token is
Token ID318
Feature activation-6.89e-02

Pos logit contributions
cks+0.011
oops+0.011
 enclosure+0.010
ines+0.010
opl+0.010
Neg logit contributions
 Lily-0.008
Prin-0.008
 Emma-0.008
 Penny-0.007
 Bun-0.007
Token a
Token ID257
Feature activation-7.50e-01

Pos logit contributions
Prin+0.001
 Lily+0.001
 Emma+0.001
 Bun+0.001
 Benny+0.001
Neg logit contributions
cks-0.002
oops-0.002
opl-0.002
ines-0.002
 enclosure-0.002
Token because
Token ID780
Feature activation+9.99e-02

Pos logit contributions
 Lily+0.026
Prin+0.025
 accidentally+0.023
 Emma+0.023
 Bun+0.023
Neg logit contributions
umm-0.030
oops-0.030
 enclosure-0.030
ines-0.030
cks-0.029
Token it
Token ID340
Feature activation-2.55e-01

Pos logit contributions
umm+0.004
oops+0.004
ines+0.004
ires+0.003
cks+0.003
Neg logit contributions
Prin-0.001
 Midnight-0.001
 Bun-0.001
L-0.001
 accidentally-0.001
Token had
Token ID550
Feature activation+7.80e-01

Pos logit contributions
Prin+0.005
 Lily+0.004
 louder+0.004
 war+0.004
 Bun+0.004
Neg logit contributions
umm-0.008
ines-0.007
oops-0.007
cks-0.007
opl-0.007
Token lots
Token ID6041
Feature activation+1.61e+00

Pos logit contributions
oops+0.024
cks+0.024
umm+0.024
 enclosure+0.024
opl+0.023
Neg logit contributions
 Lily-0.015
Prin-0.014
 Emma-0.013
 Bun-0.013
 Emily-0.013
Token of
Token ID286
Feature activation+2.53e-01

Pos logit contributions
cks+0.060
 enclosure+0.057
oops+0.057
opl+0.056
ram+0.055
Neg logit contributions
 Lily-0.026
 Emma-0.022
 someone-0.021
Prin-0.021
 while-0.020
Token y
Token ID331
Feature activation+3.44e+00

Pos logit contributions
cks+0.007
 enclosure+0.007
opl+0.007
umm+0.007
oops+0.007
Neg logit contributions
 Lily-0.006
 Emma-0.006
 it-0.006
 someone-0.006
 something-0.006
Tokenummy
Token ID13513
Feature activation+8.37e-01

Pos logit contributions
 enclosure+0.061
cks+0.061
opl+0.055
 tallest+0.054
wen+0.053
Neg logit contributions
 Lily-0.110
 Benny-0.106
 Emma-0.104
 it-0.100
 something-0.098
Token decorations
Token ID39011
Feature activation-5.74e-02

Pos logit contributions
cks+0.021
opl+0.021
 enclosure+0.021
oops+0.021
umm+0.020
Neg logit contributions
 Lily-0.022
 Emma-0.020
 something-0.020
 someone-0.020
 it-0.019
Token on
Token ID319
Feature activation+6.48e-01

Pos logit contributions
 Lily+0.001
Prin+0.001
 Emma+0.001
 accidentally+0.001
 someone+0.001
Neg logit contributions
oops-0.002
cks-0.002
umm-0.002
opl-0.002
 enclosure-0.002
Token it
Token ID340
Feature activation+1.01e+00

Pos logit contributions
cks+0.026
 enclosure+0.026
oops+0.025
umm+0.025
opl+0.025
Neg logit contributions
 Lily-0.006
Prin-0.006
 Bun-0.005
 Emma-0.005
 accidentally-0.005
Token.
Token ID13
Feature activation+6.41e-01

Pos logit contributions
oops+0.029
cks+0.029
opl+0.029
umm+0.029
 enclosure+0.029
Neg logit contributions
 Lily-0.022
Prin-0.020
 Emma-0.020
 accidentally-0.019
 someone-0.019
Token\n
Token ID198
Feature activation+4.47e-01

Pos logit contributions
oops+0.008
umm+0.008
cks+0.008
opl+0.008
 enclosure+0.008
Neg logit contributions
 Lily-0.003
Prin-0.002
 Emma-0.002
 someone-0.002
 Emily-0.002
Token"
Token ID1
Feature activation+1.39e+00

Pos logit contributions
cks+0.014
oops+0.014
 enclosure+0.014
ines+0.013
opl+0.013
Neg logit contributions
 Lily-0.009
 Emma-0.008
Prin-0.008
 accidentally-0.008
 Benny-0.008
TokenWhat
Token ID2061
Feature activation+5.89e-01

Pos logit contributions
cks+0.041
oops+0.040
umm+0.040
 enclosure+0.040
opl+0.040
Neg logit contributions
 Lily-0.029
 Emma-0.027
Prin-0.027
 accidentally-0.026
 Emily-0.026
Token is
Token ID318
Feature activation+2.73e-01

Pos logit contributions
oops+0.015
cks+0.015
opl+0.014
ines+0.014
 enclosure+0.014
Neg logit contributions
Prin-0.014
 Bun-0.014
 Penny-0.013
 Lily-0.013
 Midnight-0.013
Token the
Token ID262
Feature activation-5.50e-02

Pos logit contributions
umm+0.011
ines+0.011
opl+0.011
 enclosure+0.011
iously+0.011
Neg logit contributions
 Lily-0.004
 it-0.004
 they-0.003
 something-0.003
 someone-0.003
Token truck
Token ID7779
Feature activation+3.44e+00

Pos logit contributions
 Lily+0.001
Prin+0.001
 someone+0.001
 war+0.001
 Emma+0.001
Neg logit contributions
oops-0.002
umm-0.002
cks-0.002
ines-0.002
iously-0.002
Token doing
Token ID1804
Feature activation+1.01e+00

Pos logit contributions
cks+0.114
oops+0.114
opl+0.113
 enclosure+0.113
umm+0.113
Neg logit contributions
 Lily-0.060
Prin-0.055
 Emma-0.054
 accidentally-0.050
 someone-0.050
Token?"
Token ID1701
Feature activation+2.02e+00

Pos logit contributions
cks+0.031
ines+0.031
oops+0.031
opl+0.030
 enclosure+0.030
Neg logit contributions
 Lily-0.019
Prin-0.019
 Emma-0.018
 Bun-0.017
 Benny-0.017
Token Ben
Token ID3932
Feature activation+1.67e+00

Pos logit contributions
cks+0.062
 enclosure+0.061
oops+0.061
ines+0.058
iously+0.057
Neg logit contributions
Prin-0.035
 Bun-0.033
 Lily-0.031
 Midnight-0.030
 accidentally-0.029
Token asks
Token ID7893
Feature activation+1.26e+00

Pos logit contributions
umm+0.040
oops+0.039
opl+0.038
 enclosure+0.038
cks+0.038
Neg logit contributions
 Lily-0.048
Prin-0.044
 Emma-0.044
 Emily-0.042
 someone-0.042
Token.
Token ID13
Feature activation-2.83e-01

Pos logit contributions
oops+0.028
cks+0.027
ines+0.027
 enclosure+0.026
iously+0.026
Neg logit contributions
Prin-0.033
 accidentally-0.032
 Bun-0.031
 Midnight-0.031
 war-0.031
Token that
Token ID326
Feature activation-6.87e-01

Pos logit contributions
cks+0.020
opl+0.020
oops+0.019
 enclosure+0.019
umm+0.019
Neg logit contributions
 Lily-0.006
 Emma-0.005
Prin-0.005
 someone-0.005
 Emily-0.005
Token he
Token ID339
Feature activation+2.68e-01

Pos logit contributions
Prin+0.010
 Lily+0.009
 Emma+0.008
 accidentally+0.008
 war+0.008
Neg logit contributions
oops-0.025
opl-0.024
umm-0.024
 enclosure-0.024
cks-0.024
Token had
Token ID550
Feature activation+7.42e-01

Pos logit contributions
oops+0.009
ines+0.009
 enclosure+0.009
cks+0.009
opl+0.009
Neg logit contributions
 Lily-0.004
Prin-0.004
 Emma-0.004
 war-0.004
 someone-0.004
Token earned
Token ID7366
Feature activation+1.35e-01

Pos logit contributions
oops+0.022
cks+0.022
 enclosure+0.021
ines+0.021
opl+0.021
Neg logit contributions
Prin-0.015
 Lily-0.015
 Emma-0.014
 war-0.013
 Midnight-0.013
Token his
Token ID465
Feature activation+2.14e+00

Pos logit contributions
oops+0.005
cks+0.005
opl+0.005
ines+0.005
 enclosure+0.005
Neg logit contributions
Prin-0.002
 Lily-0.002
 Emma-0.002
 accidentally-0.002
 Bun-0.002
Token m
Token ID285
Feature activation+3.43e+00

Pos logit contributions
cks+0.068
ines+0.067
opl+0.066
umm+0.064
asp+0.062
Neg logit contributions
 Lily-0.045
 something-0.044
 someone-0.044
 band-0.043
 a-0.041
Tokenama
Token ID1689
Feature activation+1.80e+00

Pos logit contributions
cks+0.076
 tallest+0.063
 enclosure+0.063
asp+0.062
ines+0.060
Neg logit contributions
 band-0.091
 someone-0.081
 forgive-0.079
 something-0.076
 feel-0.075
Token's
Token ID338
Feature activation+9.73e-01

Pos logit contributions
cks+0.068
opl+0.068
umm+0.068
ines+0.067
oops+0.066
Neg logit contributions
 Lily-0.025
 Emma-0.022
 someone-0.021
 something-0.020
Prin-0.019
Token permission
Token ID7170
Feature activation+1.51e+00

Pos logit contributions
cks+0.032
opl+0.030
ines+0.030
unks+0.029
umm+0.029
Neg logit contributions
 Lily-0.020
 something-0.020
 someone-0.020
 band-0.020
 it-0.019
Token.
Token ID13
Feature activation+1.58e-01

Pos logit contributions
cks+0.053
opl+0.052
umm+0.052
oops+0.051
 enclosure+0.051
Neg logit contributions
 Lily-0.025
 Emma-0.022
Prin-0.022
 someone-0.021
 Emily-0.020
Token<|endoftext|>
Token ID50256
Feature activation+1.71e-01

Pos logit contributions
cks+0.006
oops+0.006
 enclosure+0.006
umm+0.006
opl+0.006
Neg logit contributions
 Lily-0.002
Prin-0.002
 Emma-0.002
 Emily-0.002
 Benny-0.002
Token and
Token ID290
Feature activation-1.10e+00

Pos logit contributions
 Lily+0.003
 Emma+0.003
Prin+0.003
 Emily+0.003
 someone+0.003
Neg logit contributions
cks-0.010
umm-0.010
oops-0.010
opl-0.010
 enclosure-0.010
Token let
Token ID1309
Feature activation-1.64e+00

Pos logit contributions
Prin+0.024
 war+0.024
 Lily+0.022
 someone+0.021
 sounded+0.021
Neg logit contributions
oops-0.030
 enclosure-0.029
cks-0.028
iously-0.027
ram-0.027
Token the
Token ID262
Feature activation-8.51e-01

Pos logit contributions
Prin+0.027
 war+0.022
 accidentally+0.022
 Bun+0.021
 Benny+0.020
Neg logit contributions
oops-0.053
cks-0.049
zy-0.048
 enclosure-0.048
opl-0.047
Token ha
Token ID387
Feature activation-2.12e+00

Pos logit contributions
 Lily+0.019
 Emma+0.017
Prin+0.017
 someone+0.017
 band+0.016
Neg logit contributions
cks-0.026
opl-0.025
oops-0.025
 enclosure-0.025
umm-0.025
Tokenird
Token ID1447
Feature activation-2.24e+00

Pos logit contributions
 Lily+0.047
Prin+0.044
 Emma+0.043
 accidentally+0.041
 someone+0.041
Neg logit contributions
cks-0.061
oops-0.061
opl-0.060
 enclosure-0.060
umm-0.060
Tokenress
Token ID601
Feature activation-4.03e+00

Pos logit contributions
 Lily+0.050
 Emma+0.044
Prin+0.044
 someone+0.043
 accidentally+0.043
Neg logit contributions
cks-0.067
umm-0.065
opl-0.064
oops-0.063
 enclosure-0.062
Tokener
Token ID263
Feature activation-2.44e+00

Pos logit contributions
 war+0.099
Prin+0.099
 Lily+0.095
 accidentally+0.093
 Benny+0.092
Neg logit contributions
oops-0.102
iously-0.099
ines-0.099
cks-0.098
 enclosure-0.098
Token go
Token ID467
Feature activation-8.99e-01

Pos logit contributions
Prin+0.050
 war+0.047
 Lily+0.046
 Emma+0.044
 Midnight+0.043
Neg logit contributions
oops-0.073
 enclosure-0.071
cks-0.071
ines-0.071
iously-0.069
Token to
Token ID284
Feature activation-7.55e-01

Pos logit contributions
 Lily+0.018
Prin+0.018
 Emma+0.017
 accidentally+0.016
 someone+0.016
Neg logit contributions
oops-0.027
cks-0.027
 enclosure-0.027
ines-0.027
opl-0.027
Token work
Token ID670
Feature activation-1.61e+00

Pos logit contributions
Prin+0.012
 sounded+0.010
 accidentally+0.010
L+0.009
 war+0.009
Neg logit contributions
 enclosure-0.025
oops-0.025
cks-0.025
opl-0.025
iously-0.024
Token.
Token ID13
Feature activation-1.69e-01

Pos logit contributions
Prin+0.038
 Lily+0.035
 accidentally+0.034
 war+0.033
 Bun+0.033
Neg logit contributions
ines-0.043
cks-0.042
oops-0.041
opl-0.041
iously-0.040
Token trick
Token ID6908
Feature activation-2.15e-01

Pos logit contributions
ines+0.004
 enclosure+0.004
oops+0.004
opl+0.004
cks+0.004
Neg logit contributions
Prin-0.003
 Lily-0.002
 Emma-0.002
 Emily-0.002
 sounded-0.002
Token-
Token ID12
Feature activation-7.08e-01

Pos logit contributions
Prin+0.004
 accidentally+0.004
 war+0.004
 Lily+0.004
 sounded+0.004
Neg logit contributions
cks-0.006
ines-0.006
iously-0.006
oops-0.006
opl-0.006
Tokenor
Token ID273
Feature activation-1.64e+00

Pos logit contributions
 Lily+0.022
 war+0.022
 Emma+0.021
 Emily+0.021
 Benny+0.021
Neg logit contributions
oops-0.012
opl-0.012
ines-0.012
cks-0.011
used-0.011
Token-
Token ID12
Feature activation-7.75e-01

Pos logit contributions
Prin+0.044
 Lily+0.042
 war+0.042
 accidentally+0.041
 Benny+0.040
Neg logit contributions
iously-0.039
oops-0.038
cks-0.038
ines-0.037
opl-0.037
Tokent
Token ID83
Feature activation-7.51e-01

Pos logit contributions
 Lily+0.024
 war+0.024
 Emma+0.023
 Emily+0.023
 Benny+0.023
Neg logit contributions
opl-0.013
oops-0.013
ines-0.013
ram-0.012
iously-0.012
Tokenreating
Token ID34567
Feature activation-3.90e+00

Pos logit contributions
Prin+0.015
 war+0.014
 accidentally+0.014
 Midnight+0.013
 Penny+0.013
Neg logit contributions
ines-0.023
opl-0.022
cks-0.020
ram-0.020
oops-0.020
Token anymore
Token ID7471
Feature activation-1.48e+00

Pos logit contributions
 accidentally+0.088
Prin+0.088
 Benny+0.085
 war+0.082
 Emily+0.081
Neg logit contributions
ines-0.110
oops-0.102
cks-0.101
opl-0.101
 cones-0.095
Token.
Token ID13
Feature activation+1.04e-01

Pos logit contributions
Prin+0.034
 Lily+0.032
 Emma+0.030
 war+0.030
 Penny+0.030
Neg logit contributions
ines-0.040
cks-0.039
opl-0.039
 enclosure-0.038
oops-0.038
Token She
Token ID1375
Feature activation-6.01e-01

Pos logit contributions
 enclosure+0.003
opl+0.003
oops+0.003
umm+0.003
ines+0.003
Neg logit contributions
Prin-0.002
 war-0.002
 band-0.002
 sounded-0.002
 Lily-0.002
Token sadly
Token ID21098
Feature activation+3.65e-01

Pos logit contributions
Prin+0.012
 Lily+0.012
 war+0.012
 Emma+0.011
 Penny+0.011
Neg logit contributions
 enclosure-0.017
iously-0.016
ines-0.016
oops-0.016
anda-0.015
Token put
Token ID1234
Feature activation-1.96e+00

Pos logit contributions
iously+0.010
oops+0.010
 enclosure+0.010
cks+0.009
ines+0.009
Neg logit contributions
Prin-0.008
 Lily-0.007
 war-0.007
 Emma-0.007
 Emily-0.007
Token trick
Token ID6908
Feature activation-5.70e-01

Pos logit contributions
oops+0.006
 enclosure+0.006
ines+0.006
opl+0.006
cks+0.006
Neg logit contributions
Prin-0.004
 Lily-0.003
 Emma-0.003
 accidentally-0.003
 Midnight-0.003
Token-
Token ID12
Feature activation-9.46e-01

Pos logit contributions
Prin+0.011
 accidentally+0.009
 war+0.009
 Lily+0.009
 Emma+0.008
Neg logit contributions
oops-0.018
iously-0.017
cks-0.017
ines-0.017
cles-0.017
Tokenor
Token ID273
Feature activation-1.92e+00

Pos logit contributions
 Lily+0.027
 Emma+0.027
 Emily+0.026
 war+0.026
 Benny+0.026
Neg logit contributions
oops-0.019
opl-0.018
cks-0.018
ines-0.017
 enclosure-0.017
Token-
Token ID12
Feature activation-9.75e-01

Pos logit contributions
Prin+0.044
 Lily+0.043
 Emma+0.041
 accidentally+0.040
 war+0.040
Neg logit contributions
oops-0.053
cks-0.052
iously-0.051
opl-0.051
ines-0.051
Tokent
Token ID83
Feature activation-7.54e-01

Pos logit contributions
 Lily+0.028
 Emma+0.028
 Emily+0.027
 war+0.027
 Benny+0.027
Neg logit contributions
oops-0.019
opl-0.019
cks-0.018
ines-0.018
 enclosure-0.018
Tokenreating
Token ID34567
Feature activation-3.76e+00

Pos logit contributions
Prin+0.019
 Emma+0.018
 accidentally+0.018
 Midnight+0.018
 Emily+0.017
Neg logit contributions
opl-0.019
ines-0.019
cks-0.018
 enclosure-0.017
ram-0.017
Token with
Token ID351
Feature activation+1.02e-01

Pos logit contributions
Prin+0.083
 accidentally+0.079
 Lily+0.077
 Emma+0.075
 Benny+0.074
Neg logit contributions
ines-0.108
oops-0.106
cks-0.104
opl-0.101
 enclosure-0.099
Token her
Token ID607
Feature activation-7.28e-01

Pos logit contributions
cks+0.003
oops+0.003
opl+0.003
 enclosure+0.003
iously+0.003
Neg logit contributions
Prin-0.002
 Lily-0.002
 Emma-0.002
 accidentally-0.002
 sounded-0.002
Token.
Token ID13
Feature activation-1.04e+00

Pos logit contributions
 Lily+0.014
Prin+0.013
 Emma+0.012
 someone+0.012
 accidentally+0.012
Neg logit contributions
cks-0.024
oops-0.023
opl-0.023
umm-0.023
 enclosure-0.023
Token Tim
Token ID5045
Feature activation-6.60e-01

Pos logit contributions
Prin+0.016
 Lily+0.015
 Emma+0.014
 someone+0.013
 war+0.013
Neg logit contributions
oops-0.036
 enclosure-0.036
umm-0.036
opl-0.035
cks-0.035
Tokenmy
Token ID1820
Feature activation-1.16e+00

Pos logit contributions
 Lily+0.011
Prin+0.010
 Emma+0.010
 accidentally+0.010
 Emily+0.010
Neg logit contributions
cks-0.022
oops-0.022
umm-0.022
opl-0.022
 enclosure-0.022
Token again
Token ID757
Feature activation-1.33e+00

Pos logit contributions
Prin+0.018
 Lily+0.017
 Penny+0.017
 Emily+0.017
 sounded+0.017
Neg logit contributions
ines-0.014
umm-0.014
cks-0.013
opl-0.013
asp-0.013
Token."
Token ID526
Feature activation+4.28e-01

Pos logit contributions
 Lily+0.036
Prin+0.036
 Emma+0.034
 Bun+0.033
 Emily+0.033
Neg logit contributions
cks-0.031
oops-0.030
ines-0.030
opl-0.030
 enclosure-0.029
Token Daddy
Token ID26926
Feature activation-3.85e-01

Pos logit contributions
oops+0.012
umm+0.012
 enclosure+0.011
iously+0.011
ummy+0.011
Neg logit contributions
Prin-0.009
 war-0.009
 Lily-0.008
 accidentally-0.008
 Bun-0.008
Token put
Token ID1234
Feature activation-2.66e+00

Pos logit contributions
Prin+0.007
 Lily+0.007
 war+0.007
 someone+0.007
 Penny+0.006
Neg logit contributions
iously-0.012
oops-0.012
 enclosure-0.012
cks-0.011
ines-0.011
Token a
Token ID257
Feature activation-1.36e+00

Pos logit contributions
Prin+0.048
 accidentally+0.044
 war+0.042
 Bun+0.037
 practiced+0.036
Neg logit contributions
oops-0.086
cks-0.081
umm-0.080
iously-0.076
icer-0.074
Token band
Token ID4097
Feature activation-3.47e+00

Pos logit contributions
 Lily+0.032
Prin+0.032
 Emma+0.030
 Red+0.029
 Penny+0.029
Neg logit contributions
 sle-0.032
 enclosure-0.031
 ig-0.031
oops-0.031
 cones-0.029
Tokenaid
Token ID1698
Feature activation-2.07e+00

Pos logit contributions
 Lily+0.096
Prin+0.094
 Emma+0.092
 accidentally+0.091
 Penny+0.091
Neg logit contributions
oops-0.082
opl-0.078
cks-0.076
ines-0.074
 enclosure-0.074
Token on
Token ID319
Feature activation-5.38e-01

Pos logit contributions
Prin+0.047
 Lily+0.046
 accidentally+0.045
 Emma+0.043
 Penny+0.041
Neg logit contributions
oops-0.057
cks-0.054
opl-0.053
iously-0.053
ines-0.052
Token his
Token ID465
Feature activation-7.22e-01

Pos logit contributions
Prin+0.009
 war+0.008
 practiced+0.007
 accidentally+0.007
 sounded+0.007
Neg logit contributions
iously-0.016
cks-0.016
umm-0.016
ummy-0.016
oops-0.015
Token thumb
Token ID15683
Feature activation-1.69e+00

Pos logit contributions
 Lily+0.020
Prin+0.020
 Emma+0.019
 accidentally+0.018
 Emily+0.018
Neg logit contributions
oops-0.016
 enclosure-0.016
cks-0.016
umm-0.016
opl-0.016
Token and
Token ID290
Feature activation-1.10e+00

Pos logit contributions
Prin+0.035
 Lily+0.031
 war+0.030
L+0.028
 Emma+0.028
Neg logit contributions
 enclosure-0.049
iously-0.049
opl-0.048
oops-0.048
ines-0.047
Token haircut
Token ID45548
Feature activation-1.71e+00

Pos logit contributions
oops+0.040
umm+0.039
iously+0.039
opl+0.039
ines+0.039
Neg logit contributions
Prin-0.025
 war-0.024
 Emma-0.023
 Lily-0.023
 Midnight-0.023
Token.
Token ID13
Feature activation-1.58e-01

Pos logit contributions
Prin+0.031
 Lily+0.030
 Emma+0.028
 war+0.028
 Emily+0.027
Neg logit contributions
oops-0.055
ines-0.054
iously-0.053
umm-0.053
 enclosure-0.053
Token The
Token ID383
Feature activation-8.45e-01

Pos logit contributions
Prin+0.003
 Lily+0.003
 Emma+0.003
 Penny+0.003
 war+0.003
Neg logit contributions
oops-0.005
umm-0.005
 enclosure-0.005
opl-0.005
cks-0.004
Token ha
Token ID387
Feature activation-2.20e+00

Pos logit contributions
 Lily+0.019
Prin+0.017
 someone+0.017
 band+0.016
 accidentally+0.016
Neg logit contributions
cks-0.027
opl-0.025
 enclosure-0.025
umm-0.025
oops-0.025
Tokenird
Token ID1447
Feature activation-2.27e+00

Pos logit contributions
 Lily+0.058
 war+0.058
Prin+0.057
 Benny+0.057
 Emma+0.056
Neg logit contributions
ines-0.052
oops-0.051
opl-0.049
cks-0.049
 enclosure-0.048
Tokenress
Token ID601
Feature activation-3.47e+00

Pos logit contributions
 Lily+0.054
Prin+0.050
 Emma+0.049
 accidentally+0.047
 someone+0.047
Neg logit contributions
cks-0.062
oops-0.062
opl-0.061
umm-0.061
 enclosure-0.061
Tokener
Token ID263
Feature activation-1.57e+00

Pos logit contributions
 war+0.102
Prin+0.098
 Penny+0.092
 Emma+0.091
 Emily+0.091
Neg logit contributions
iously-0.071
ines-0.067
oops-0.066
anda-0.065
umm-0.064
Token started
Token ID2067
Feature activation+1.88e-01

Pos logit contributions
Prin+0.029
 war+0.029
 Lily+0.029
 Emma+0.027
 Midnight+0.027
Neg logit contributions
ines-0.047
oops-0.047
 enclosure-0.046
iously-0.046
opl-0.045
Token cutting
Token ID7720
Feature activation-8.22e-01

Pos logit contributions
oops+0.006
 enclosure+0.006
cks+0.006
ines+0.006
opl+0.006
Neg logit contributions
 Lily-0.003
Prin-0.003
 Emma-0.003
 Benny-0.003
 Penny-0.003
Token Tim
Token ID5045
Feature activation-3.77e-01

Pos logit contributions
Prin+0.015
 accidentally+0.012
 Emma+0.012
 war+0.012
 Midnight+0.012
Neg logit contributions
oops-0.027
iously-0.026
cks-0.026
 cones-0.026
opl-0.026
Tokenmy
Token ID1820
Feature activation-3.73e-01

Pos logit contributions
 Lily+0.007
Prin+0.006
 Emma+0.006
 accidentally+0.006
 someone+0.006
Neg logit contributions
oops-0.012
cks-0.012
opl-0.012
 enclosure-0.012
umm-0.012
Token and
Token ID290
Feature activation-1.32e+00

Pos logit contributions
Prin+0.026
 Lily+0.025
 war+0.023
 Emma+0.023
 Emily+0.022
Neg logit contributions
 enclosure-0.048
oops-0.048
iously-0.048
ines-0.047
opl-0.047
Token gently
Token ID15165
Feature activation-4.38e-01

Pos logit contributions
Prin+0.035
 war+0.033
 villain+0.029
arre+0.028
L+0.026
Neg logit contributions
oops-0.031
iously-0.030
 cones-0.028
 gob-0.027
cks-0.026
Token wiped
Token ID21122
Feature activation-1.84e+00

Pos logit contributions
Prin+0.010
 war+0.010
 Lily+0.010
 Emma+0.009
 Emily+0.009
Neg logit contributions
oops-0.012
cks-0.011
iously-0.011
 enclosure-0.011
ram-0.010
Token the
Token ID262
Feature activation-5.02e-01

Pos logit contributions
Prin+0.030
 war+0.028
 accidentally+0.025
 Midnight+0.022
 villain+0.022
Neg logit contributions
cks-0.064
oops-0.063
ines-0.061
opl-0.059
ram-0.059
Token little
Token ID1310
Feature activation-1.89e+00

Pos logit contributions
 Lily+0.013
Prin+0.013
 Emma+0.012
 Benny+0.012
 Emily+0.012
Neg logit contributions
oops-0.012
 enclosure-0.012
umm-0.012
ines-0.012
iously-0.012
Token girl
Token ID2576
Feature activation-3.44e+00

Pos logit contributions
Prin+0.036
 war+0.033
 Bun+0.033
 accidentally+0.031
 Lily+0.031
Neg logit contributions
umm-0.057
 enclosure-0.056
oops-0.055
iously-0.054
ines-0.053
Token's
Token ID338
Feature activation-1.15e+00

Pos logit contributions
 Lily+0.060
Prin+0.058
 Emma+0.052
 war+0.051
 accidentally+0.051
Neg logit contributions
oops-0.114
cks-0.114
 enclosure-0.113
iously-0.110
ines-0.110
Token skin
Token ID4168
Feature activation-9.52e-01

Pos logit contributions
Prin+0.030
 Lily+0.030
 Emma+0.028
 war+0.028
 accidentally+0.028
Neg logit contributions
oops-0.029
cks-0.028
iously-0.027
opl-0.027
 enclosure-0.027
Token.
Token ID13
Feature activation-1.59e-01

Pos logit contributions
Prin+0.022
 Lily+0.021
 war+0.020
 Emma+0.020
 Emily+0.019
Neg logit contributions
oops-0.025
iously-0.025
cks-0.025
ines-0.025
umm-0.024
Token\n
Token ID198
Feature activation+2.63e-03

Pos logit contributions
 Lily+0.003
Prin+0.003
 Emma+0.002
 accidentally+0.002
 war+0.002
Neg logit contributions
oops-0.005
 enclosure-0.005
cks-0.005
opl-0.005
umm-0.005
Token\n
Token ID198
Feature activation+1.29e-01

Pos logit contributions
cks+0.000
oops+0.000
opl+0.000
umm+0.000
 enclosure+0.000
Neg logit contributions
 Lily-0.000
Prin-0.000
 Emma-0.000
 accidentally-0.000
 someone-0.000
Token,"
Token ID553
Feature activation+5.29e-01

Pos logit contributions
cks+0.003
oops+0.003
umm+0.003
 enclosure+0.003
opl+0.003
Neg logit contributions
 Lily-0.002
Prin-0.002
 Emma-0.002
 someone-0.002
 Emily-0.002
Token replied
Token ID8712
Feature activation+2.73e-01

Pos logit contributions
oops+0.010
iously+0.010
 cones+0.009
umm+0.009
cks+0.009
Neg logit contributions
 war-0.014
Prin-0.013
 villain-0.013
 accidentally-0.012
arre-0.012
Token the
Token ID262
Feature activation-5.79e-01

Pos logit contributions
iously+0.008
used+0.008
oops+0.008
cks+0.008
asp+0.008
Neg logit contributions
Prin-0.004
 war-0.004
arre-0.003
 sounded-0.003
 accidentally-0.003
Token ha
Token ID387
Feature activation-1.88e+00

Pos logit contributions
 Lily+0.013
Prin+0.012
 someone+0.012
 Emma+0.012
 accidentally+0.011
Neg logit contributions
cks-0.017
oops-0.017
 enclosure-0.017
opl-0.016
umm-0.016
Tokenird
Token ID1447
Feature activation-2.08e+00

Pos logit contributions
 Lily+0.045
Prin+0.042
 Emma+0.041
 war+0.040
 Benny+0.040
Neg logit contributions
oops-0.050
ines-0.050
cks-0.050
 enclosure-0.049
opl-0.049
Tokenress
Token ID601
Feature activation-3.42e+00

Pos logit contributions
 Lily+0.046
Prin+0.041
 Emma+0.041
 accidentally+0.039
 someone+0.039
Neg logit contributions
cks-0.063
opl-0.061
umm-0.060
oops-0.060
 enclosure-0.059
Tokener
Token ID263
Feature activation-2.00e+00

Pos logit contributions
 war+0.094
Prin+0.086
 accidentally+0.085
 Lily+0.084
 Emma+0.084
Neg logit contributions
iously-0.080
ines-0.079
oops-0.077
opl-0.074
 enclosure-0.074
Token.
Token ID13
Feature activation-7.78e-01

Pos logit contributions
Prin+0.042
 Lily+0.041
 war+0.040
 Emma+0.039
 someone+0.037
Neg logit contributions
iously-0.056
ines-0.056
 enclosure-0.056
oops-0.055
cks-0.055
Token "
Token ID366
Feature activation-6.20e-01

Pos logit contributions
 war+0.013
Prin+0.012
 sounded+0.012
 band+0.012
 accidentally+0.011
Neg logit contributions
oops-0.025
umm-0.024
 enclosure-0.023
asp-0.022
iously-0.022
TokenIt
Token ID1026
Feature activation-8.67e-01

Pos logit contributions
 Lily+0.014
 Emma+0.014
 Benny+0.014
 Emily+0.014
 war+0.014
Neg logit contributions
umm-0.016
oops-0.016
cks-0.016
opl-0.016
asp-0.016
Token feels
Token ID5300
Feature activation-5.75e-01

Pos logit contributions
Prin+0.019
 Lily+0.018
 Penny+0.017
 war+0.017
 Emma+0.017
Neg logit contributions
oops-0.024
cks-0.023
ines-0.023
 enclosure-0.023
iously-0.022
Token he
Token ID339
Feature activation-1.99e-01

Pos logit contributions
Prin+0.008
 war+0.006
 Penny+0.006
 Midnight+0.006
 Emma+0.006
Neg logit contributions
iously-0.010
oops-0.010
ines-0.010
cks-0.010
umm-0.010
Token asked
Token ID1965
Feature activation+3.57e-01

Pos logit contributions
 Lily+0.004
Prin+0.004
 war+0.004
 Emma+0.004
 someone+0.003
Neg logit contributions
oops-0.006
iously-0.006
 enclosure-0.006
ines-0.006
cks-0.006
Token the
Token ID262
Feature activation-4.27e-01

Pos logit contributions
oops+0.011
ines+0.010
iously+0.010
 enclosure+0.009
cks+0.009
Neg logit contributions
Prin-0.007
 accidentally-0.006
 war-0.006
 Midnight-0.005
arre-0.005
Token ha
Token ID387
Feature activation-1.71e+00

Pos logit contributions
 Lily+0.008
Prin+0.007
 someone+0.007
 band+0.007
L+0.007
Neg logit contributions
cks-0.016
opl-0.015
 enclosure-0.015
umm-0.014
oops-0.014
Tokenird
Token ID1447
Feature activation-1.67e+00

Pos logit contributions
 Lily+0.042
Prin+0.040
 Emma+0.039
 war+0.038
 Benny+0.038
Neg logit contributions
oops-0.045
ines-0.045
 enclosure-0.044
opl-0.043
cks-0.043
Tokenress
Token ID601
Feature activation-3.41e+00

Pos logit contributions
 Lily+0.039
Prin+0.036
 Emma+0.035
 someone+0.034
 accidentally+0.034
Neg logit contributions
cks-0.047
oops-0.046
opl-0.046
umm-0.046
 enclosure-0.046
Tokener
Token ID263
Feature activation-1.79e+00

Pos logit contributions
 war+0.091
Prin+0.088
 Lily+0.085
 accidentally+0.085
 Emma+0.085
Neg logit contributions
ines-0.080
oops-0.078
 enclosure-0.077
iously-0.076
opl-0.073
Token to
Token ID284
Feature activation-6.77e-01

Pos logit contributions
Prin+0.031
 war+0.029
 Lily+0.028
 Midnight+0.027
 Emma+0.027
Neg logit contributions
ines-0.058
 enclosure-0.057
oops-0.056
ram-0.055
iously-0.054
Token give
Token ID1577
Feature activation-1.47e+00

Pos logit contributions
Prin+0.016
 Lily+0.015
 Bun+0.015
 accidentally+0.015
 Emma+0.015
Neg logit contributions
 enclosure-0.017
oops-0.017
opl-0.017
cks-0.016
ines-0.016
Token him
Token ID683
Feature activation-3.52e-01

Pos logit contributions
Prin+0.029
 war+0.026
 sounded+0.023
 villain+0.023
 accidentally+0.022
Neg logit contributions
oops-0.046
cks-0.042
iously-0.042
 enclosure-0.040
 cones-0.040
Token a
Token ID257
Feature activation-5.72e-01

Pos logit contributions
 Lily+0.004
Prin+0.004
 Emma+0.003
 accidentally+0.003
 Emily+0.003
Neg logit contributions
oops-0.014
 enclosure-0.014
cks-0.014
opl-0.014
ines-0.014
Token and
Token ID290
Feature activation-1.23e+00

Pos logit contributions
cks+0.004
oops+0.004
umm+0.004
 enclosure+0.004
opl+0.004
Neg logit contributions
 Lily-0.002
 Emma-0.002
Prin-0.002
 someone-0.002
 Benny-0.002
Token it
Token ID340
Feature activation-7.79e-01

Pos logit contributions
Prin+0.025
 Emma+0.020
 war+0.019
 Penny+0.019
 forgive+0.019
Neg logit contributions
oops-0.037
iously-0.035
 cones-0.034
cks-0.034
 ounces-0.033
Token made
Token ID925
Feature activation-2.52e-01

Pos logit contributions
Prin+0.012
 Lily+0.011
 Emma+0.011
 Penny+0.011
 Emily+0.010
Neg logit contributions
ines-0.026
cks-0.026
oops-0.026
iously-0.026
umm-0.026
Token the
Token ID262
Feature activation-3.95e-01

Pos logit contributions
Prin+0.005
 accidentally+0.004
 Penny+0.004
 Midnight+0.004
 Emma+0.004
Neg logit contributions
oops-0.007
iously-0.007
umm-0.007
cks-0.007
opl-0.007
Token little
Token ID1310
Feature activation-1.40e+00

Pos logit contributions
 Lily+0.005
Prin+0.005
 Emma+0.005
 accidentally+0.004
 Emily+0.004
Neg logit contributions
oops-0.015
cks-0.015
umm-0.015
 enclosure-0.015
opl-0.015
Token girl
Token ID2576
Feature activation-3.38e+00

Pos logit contributions
Prin+0.025
 Midnight+0.022
 Bun+0.022
 Penny+0.022
 Lily+0.022
Neg logit contributions
umm-0.044
 enclosure-0.044
oops-0.043
iously-0.042
 cones-0.041
Token scared
Token ID12008
Feature activation+1.22e-01

Pos logit contributions
 Lily+0.054
Prin+0.053
 Emma+0.049
 Penny+0.046
 Benny+0.046
Neg logit contributions
oops-0.116
cks-0.115
 enclosure-0.113
ines-0.113
opl-0.113
Token.
Token ID13
Feature activation-6.54e-01

Pos logit contributions
ines+0.003
oops+0.003
umm+0.003
iously+0.003
cks+0.003
Neg logit contributions
Prin-0.003
 Lily-0.003
 accidentally-0.003
 Emma-0.002
 war-0.002
Token\n
Token ID198
Feature activation-2.55e-02

Pos logit contributions
 Lily+0.012
Prin+0.012
 war+0.012
 Emma+0.011
 accidentally+0.011
Neg logit contributions
oops-0.020
 enclosure-0.020
umm-0.019
opl-0.019
cks-0.019
Token\n
Token ID198
Feature activation-2.36e-01

Pos logit contributions
 Lily+0.000
Prin+0.000
 Emma+0.000
 someone+0.000
 accidentally+0.000
Neg logit contributions
oops-0.001
cks-0.001
umm-0.001
opl-0.001
 enclosure-0.001
TokenThe
Token ID464
Feature activation-4.22e-01

Pos logit contributions
 Lily+0.004
Prin+0.004
 Emma+0.003
 accidentally+0.003
 someone+0.003
Neg logit contributions
oops-0.008
umm-0.008
cks-0.008
opl-0.008
 enclosure-0.008
Token mom
Token ID1995
Feature activation-1.36e+00

Pos logit contributions
cks+0.008
oops+0.008
opl+0.008
 enclosure+0.008
umm+0.008
Neg logit contributions
 Lily-0.005
Prin-0.005
 Emma-0.005
 accidentally-0.004
 someone-0.004
Tokenmy
Token ID1820
Feature activation-1.05e+00

Pos logit contributions
 Lily+0.022
Prin+0.021
 Emma+0.019
 someone+0.019
 war+0.019
Neg logit contributions
ines-0.046
cks-0.045
oops-0.045
 enclosure-0.045
opl-0.044
Token said
Token ID531
Feature activation-1.57e+00

Pos logit contributions
Prin+0.016
 Lily+0.016
 someone+0.014
 Emma+0.014
 war+0.014
Neg logit contributions
ines-0.035
oops-0.034
cks-0.034
 enclosure-0.034
iously-0.033
Token she
Token ID673
Feature activation-1.55e+00

Pos logit contributions
Prin+0.035
L+0.027
 sounded+0.026
Rose+0.026
 practiced+0.025
Neg logit contributions
iously-0.042
oops-0.041
ummy-0.041
ule-0.039
ummies-0.038
Token would
Token ID561
Feature activation-1.16e+00

Pos logit contributions
 Lily+0.026
Prin+0.026
 Emma+0.023
 someone+0.023
 war+0.023
Neg logit contributions
ines-0.051
oops-0.051
cks-0.049
 enclosure-0.049
opl-0.049
Token replace
Token ID6330
Feature activation-3.35e+00

Pos logit contributions
Prin+0.024
 Lily+0.022
 Emma+0.021
 Emily+0.021
 Penny+0.021
Neg logit contributions
oops-0.032
ines-0.031
 enclosure-0.031
opl-0.031
cks-0.030
Token it
Token ID340
Feature activation-9.76e-01

Pos logit contributions
Prin+0.058
 accidentally+0.042
 war+0.042
L+0.040
 sounded+0.039
Neg logit contributions
cks-0.116
opl-0.113
oops-0.111
 cones-0.108
 enclosure-0.107
Token with
Token ID351
Feature activation-1.09e+00

Pos logit contributions
Prin+0.018
 Lily+0.015
 war+0.015
 accidentally+0.014
 Emma+0.014
Neg logit contributions
oops-0.031
ines-0.031
iously-0.031
cks-0.030
 enclosure-0.030
Token a
Token ID257
Feature activation-1.51e+00

Pos logit contributions
Prin+0.022
Da+0.016
 Emily+0.016
L+0.016
Rose+0.016
Neg logit contributions
oops-0.033
 cones-0.032
cks-0.031
opl-0.031
 ig-0.030
Token new
Token ID649
Feature activation-1.43e+00

Pos logit contributions
Prin+0.039
 Lily+0.036
 Emily+0.035
 Penny+0.035
 sounded+0.035
Neg logit contributions
 enclosure-0.033
 ig-0.031
 tallest-0.030
oops-0.030
ines-0.030
Token one
Token ID530
Feature activation-4.17e-01

Pos logit contributions
Prin+0.038
 Lily+0.033
 Midnight+0.033
 Emily+0.033
 Penny+0.032
Neg logit contributions
 enclosure-0.036
 ig-0.031
oops-0.031
opl-0.031
ines-0.031
Token.
Token ID13
Feature activation-2.92e-01

Pos logit contributions
Prin+0.018
 Lily+0.015
 Emma+0.015
 accidentally+0.015
 war+0.015
Neg logit contributions
cks-0.025
opl-0.025
ram-0.025
ines-0.025
oops-0.025
Token He
Token ID679
Feature activation-2.81e-01

Pos logit contributions
Prin+0.006
 war+0.006
 Emma+0.006
 Lily+0.006
 accidentally+0.006
Neg logit contributions
oops-0.008
ram-0.008
 enclosure-0.007
opl-0.007
umm-0.007
Token fet
Token ID11351
Feature activation-3.35e-01

Pos logit contributions
 Lily+0.006
Prin+0.006
 war+0.005
 Emma+0.005
 Penny+0.005
Neg logit contributions
 enclosure-0.008
oops-0.008
iously-0.008
ines-0.007
cks-0.007
Tokenched
Token ID1740
Feature activation-1.43e+00

Pos logit contributions
 Lily+0.008
 Emma+0.008
Prin+0.008
 someone+0.008
 war+0.008
Neg logit contributions
oops-0.008
ines-0.008
cks-0.008
ram-0.007
opl-0.007
Token a
Token ID257
Feature activation-1.49e+00

Pos logit contributions
Prin+0.030
 accidentally+0.026
 Emma+0.025
 war+0.025
 Emily+0.023
Neg logit contributions
oops-0.042
 cones-0.040
cks-0.040
iously-0.039
opl-0.038
Token band
Token ID4097
Feature activation-3.34e+00

Pos logit contributions
 Lily+0.037
Prin+0.037
 Emma+0.036
 Penny+0.035
 Emily+0.035
Neg logit contributions
 enclosure-0.033
 cones-0.032
 pounds-0.031
 sle-0.031
itten-0.031
Tokenage
Token ID496
Feature activation-1.79e+00

Pos logit contributions
 Lily+0.103
Prin+0.102
 Emma+0.100
 accidentally+0.097
 Penny+0.097
Neg logit contributions
opl-0.071
oops-0.065
ines-0.065
 enclosure-0.064
cks-0.063
Token he
Token ID339
Feature activation+1.68e-02

Pos logit contributions
Prin+0.036
 Lily+0.034
 accidentally+0.032
 Emma+0.031
L+0.030
Neg logit contributions
opl-0.054
oops-0.054
ines-0.053
iously-0.053
 enclosure-0.052
Token found
Token ID1043
Feature activation-2.10e-01

Pos logit contributions
oops+0.001
cks+0.001
 enclosure+0.001
ines+0.001
opl+0.001
Neg logit contributions
 Lily-0.000
Prin-0.000
 Emma-0.000
 Emily-0.000
 Penny-0.000
Token in
Token ID287
Feature activation-8.24e-02

Pos logit contributions
 Lily+0.004
Prin+0.004
 Emma+0.004
 Emily+0.003
 accidentally+0.003
Neg logit contributions
oops-0.007
cks-0.007
opl-0.006
 enclosure-0.006
ines-0.006
Token the
Token ID262
Feature activation-8.39e-01

Pos logit contributions
Prin+0.001
 Lily+0.001
 Emma+0.001
 accidentally+0.001
 Midnight+0.001
Neg logit contributions
opl-0.003
 enclosure-0.003
cks-0.003
ines-0.003
oops-0.003
Token Let
Token ID3914
Feature activation-2.50e+00

Pos logit contributions
Prin+0.030
 accidentally+0.028
 band+0.028
 war+0.028
 something+0.027
Neg logit contributions
oops-0.026
umm-0.025
 enclosure-0.024
opl-0.022
cks-0.022
Token's
Token ID338
Feature activation-1.39e+00

Pos logit contributions
Prin+0.059
 chess+0.054
 villain+0.054
 war+0.053
 sounded+0.053
Neg logit contributions
zy-0.050
oops-0.049
opl-0.048
cks-0.047
ram-0.047
Token go
Token ID467
Feature activation-1.19e+00

Pos logit contributions
Prin+0.033
 Lily+0.029
Da+0.028
 sounded+0.027
 war+0.026
Neg logit contributions
oops-0.036
 enclosure-0.035
cks-0.034
umm-0.034
 doorstep-0.034
Token get
Token ID651
Feature activation-1.65e+00

Pos logit contributions
Prin+0.024
 Lily+0.022
 sounded+0.022
 Emma+0.022
 war+0.021
Neg logit contributions
 enclosure-0.036
oops-0.035
ines-0.035
cks-0.033
opl-0.032
Token some
Token ID617
Feature activation-1.26e+00

Pos logit contributions
Prin+0.029
 sounded+0.025
 war+0.023
 accidentally+0.023
 Emily+0.021
Neg logit contributions
oops-0.053
cks-0.050
 enclosure-0.048
iously-0.048
ines-0.047
Token thread
Token ID4704
Feature activation-3.34e+00

Pos logit contributions
 Lily+0.028
Prin+0.026
 Emma+0.026
 accidentally+0.025
 Emily+0.025
Neg logit contributions
oops-0.036
cks-0.036
umm-0.035
opl-0.035
 enclosure-0.035
Token and
Token ID290
Feature activation-9.94e-01

Pos logit contributions
Prin+0.073
 accidentally+0.069
 Emma+0.068
 Lily+0.068
 sounded+0.067
Neg logit contributions
oops-0.096
ines-0.092
 enclosure-0.090
opl-0.089
umm-0.088
Token needle
Token ID17598
Feature activation-2.74e+00

Pos logit contributions
Prin+0.023
Da+0.020
 sounded+0.019
 Penny+0.019
 Emma+0.019
Neg logit contributions
oops-0.027
 cones-0.024
 enclosure-0.024
cks-0.023
 ig-0.023
Token."
Token ID526
Feature activation-1.15e+00

Pos logit contributions
Prin+0.071
 Emma+0.067
 accidentally+0.065
 Lily+0.064
 Emily+0.064
Neg logit contributions
oops-0.070
ines-0.066
cks-0.066
 enclosure-0.064
opl-0.063
Token\n
Token ID198
Feature activation-3.54e-01

Pos logit contributions
Prin+0.019
 war+0.017
 someone+0.017
 sounded+0.016
 accidentally+0.016
Neg logit contributions
oops-0.038
 enclosure-0.037
ines-0.037
cks-0.036
umm-0.035
Token\n
Token ID198
Feature activation-3.11e-01

Pos logit contributions
 Lily+0.005
Prin+0.004
 Emma+0.004
 accidentally+0.004
 someone+0.004
Neg logit contributions
cks-0.013
oops-0.013
ines-0.013
 enclosure-0.013
opl-0.013
Token up
Token ID510
Feature activation-4.74e-01

Pos logit contributions
Prin+0.004
 war+0.004
 accidentally+0.004
 sounded+0.004
 villain+0.003
Neg logit contributions
oops-0.007
iously-0.007
umm-0.007
 enclosure-0.007
 cones-0.007
Token,
Token ID11
Feature activation+8.26e-02

Pos logit contributions
Prin+0.008
 accidentally+0.007
 Lily+0.007
 sounded+0.006
 Bun+0.006
Neg logit contributions
oops-0.015
iously-0.015
ines-0.015
 enclosure-0.015
opl-0.015
Token she
Token ID673
Feature activation-5.86e-01

Pos logit contributions
oops+0.002
iously+0.002
 cones+0.002
umm+0.002
 enclosure+0.002
Neg logit contributions
Prin-0.002
 war-0.001
 Bun-0.001
 sounded-0.001
 accidentally-0.001
Token would
Token ID561
Feature activation+3.94e-02

Pos logit contributions
 Lily+0.011
Prin+0.011
 Emma+0.010
 war+0.010
 Bun+0.010
Neg logit contributions
oops-0.018
 enclosure-0.018
ines-0.017
cks-0.017
iously-0.017
Token cl
Token ID537
Feature activation-3.09e-01

Pos logit contributions
 enclosure+0.001
oops+0.001
opl+0.001
cks+0.001
ines+0.001
Neg logit contributions
 Lily-0.001
Prin-0.001
 Emma-0.001
 Emily-0.001
 accidentally-0.001
Tokenench
Token ID24421
Feature activation-3.34e+00

Pos logit contributions
 Lily+0.007
Prin+0.007
 Emma+0.006
 Bun+0.006
 accidentally+0.006
Neg logit contributions
oops-0.009
umm-0.009
cks-0.009
opl-0.009
ines-0.009
Token her
Token ID607
Feature activation-1.35e+00

Pos logit contributions
Prin+0.062
 war+0.060
 accidentally+0.059
 villain+0.058
 Bun+0.056
Neg logit contributions
oops-0.101
cks-0.101
ines-0.098
ums-0.092
opl-0.092
Token fists
Token ID33053
Feature activation-5.78e-01

Pos logit contributions
Prin+0.035
 Lily+0.035
 accidentally+0.033
 Emma+0.033
 Bun+0.032
Neg logit contributions
oops-0.034
opl-0.032
 enclosure-0.032
cks-0.032
umm-0.031
Token and
Token ID290
Feature activation-8.39e-02

Pos logit contributions
 Lily+0.009
Prin+0.009
 Emma+0.008
 accidentally+0.008
 someone+0.008
Neg logit contributions
oops-0.020
cks-0.020
 enclosure-0.020
opl-0.020
ines-0.020
Token keep
Token ID1394
Feature activation-8.98e-01

Pos logit contributions
Prin+0.002
L+0.002
 Lily+0.002
 Bun+0.002
 war+0.002
Neg logit contributions
oops-0.002
 enclosure-0.002
iously-0.002
cks-0.002
opl-0.002
Token on
Token ID319
Feature activation-2.91e-01

Pos logit contributions
Prin+0.021
 Lily+0.019
L+0.019
 Bun+0.018
 sounded+0.018
Neg logit contributions
cks-0.025
oops-0.024
opl-0.024
iously-0.023
ines-0.023
Token Lily
Token ID20037
Feature activation-1.15e+00

Pos logit contributions
oops+0.001
 enclosure+0.001
iously+0.001
umm+0.001
opl+0.001
Neg logit contributions
Prin-0.001
 sounded-0.001
 war-0.001
 band-0.001
 louder-0.001
Token's
Token ID338
Feature activation-1.94e-01

Pos logit contributions
Prin+0.021
 war+0.021
 Lily+0.021
 Emma+0.019
 Penny+0.019
Neg logit contributions
iously-0.033
oops-0.032
 enclosure-0.031
cks-0.031
ines-0.031
Token mother
Token ID2802
Feature activation-8.79e-01

Pos logit contributions
 Lily+0.004
Prin+0.004
 Emma+0.004
 accidentally+0.004
 someone+0.004
Neg logit contributions
cks-0.006
oops-0.006
opl-0.006
umm-0.006
 enclosure-0.006
Token wrapped
Token ID12908
Feature activation-1.68e+00

Pos logit contributions
Prin+0.015
 war+0.015
 Lily+0.014
 Penny+0.014
 Emma+0.014
Neg logit contributions
ines-0.026
oops-0.026
cks-0.026
iously-0.025
 enclosure-0.025
Token a
Token ID257
Feature activation-1.67e+00

Pos logit contributions
 war+0.033
Prin+0.032
 accidentally+0.031
arre+0.030
 villain+0.028
Neg logit contributions
cks-0.045
oops-0.045
iously-0.044
asp-0.043
 cones-0.042
Token band
Token ID4097
Feature activation-3.33e+00

Pos logit contributions
 Emma+0.036
 Lily+0.035
 Emily+0.034
Prin+0.034
 war+0.032
Neg logit contributions
ines-0.043
itten-0.042
oops-0.042
cks-0.041
ram-0.041
Tokenage
Token ID496
Feature activation-2.07e+00

Pos logit contributions
 Emma+0.097
 Lily+0.097
 accidentally+0.096
Prin+0.095
 Penny+0.095
Neg logit contributions
oops-0.065
opl-0.063
ines-0.061
cks-0.059
asp-0.059
Token around
Token ID1088
Feature activation-7.62e-01

Pos logit contributions
Prin+0.045
 accidentally+0.044
 Lily+0.043
 war+0.042
 Benny+0.041
Neg logit contributions
oops-0.054
ines-0.052
opl-0.051
iously-0.049
cks-0.049
Token her
Token ID607
Feature activation-1.81e+00

Pos logit contributions
Prin+0.013
arre+0.010
 war+0.010
severe+0.010
 villain+0.009
Neg logit contributions
iously-0.022
cks-0.022
zy-0.021
ines-0.021
arge-0.021
Token friend
Token ID1545
Feature activation-1.80e+00

Pos logit contributions
Prin+0.052
 Emma+0.047
 war+0.047
 Lily+0.047
 Emily+0.046
Neg logit contributions
 enclosure-0.041
iously-0.040
oops-0.038
cks-0.038
asp-0.036
Token's
Token ID338
Feature activation-1.79e+00

Pos logit contributions
Prin+0.038
 Lily+0.033
 war+0.033
 accidentally+0.031
 someone+0.031
Neg logit contributions
 enclosure-0.053
oops-0.051
iously-0.048
cks-0.047
ines-0.046
Token ended
Token ID4444
Feature activation-9.99e-01

Pos logit contributions
Prin+0.002
 Lily+0.002
 Emma+0.002
 Midnight+0.002
L+0.002
Neg logit contributions
oops-0.003
cks-0.003
opl-0.003
iously-0.003
 enclosure-0.003
Token up
Token ID510
Feature activation-4.62e-02

Pos logit contributions
Prin+0.023
 sounded+0.020
 Lily+0.020
L+0.019
 Emma+0.019
Neg logit contributions
cks-0.030
opl-0.028
iously-0.027
 enclosure-0.027
oops-0.027
Token in
Token ID287
Feature activation-7.38e-02

Pos logit contributions
Prin+0.001
 Lily+0.001
 Emma+0.001
 Emily+0.001
 Midnight+0.001
Neg logit contributions
ines-0.001
cks-0.001
opl-0.001
 enclosure-0.001
oops-0.001
Token such
Token ID884
Feature activation-6.58e-01

Pos logit contributions
 Lily+0.001
Prin+0.001
 Emma+0.001
 accidentally+0.001
 Bun+0.001
Neg logit contributions
cks-0.003
opl-0.003
 enclosure-0.003
ines-0.003
oops-0.003
Token bad
Token ID2089
Feature activation+3.28e-01

Pos logit contributions
Prin+0.009
 Lily+0.009
 Emma+0.008
 Emily+0.008
 Penny+0.007
Neg logit contributions
opl-0.023
oops-0.023
cks-0.023
 enclosure-0.023
iously-0.023
Token situations
Token ID7445
Feature activation-3.33e+00

Pos logit contributions
umm+0.008
oops+0.008
opl+0.008
cks+0.008
ines+0.008
Neg logit contributions
 Lily-0.008
Prin-0.008
 Emma-0.008
 Emily-0.007
 accidentally-0.007
Token.
Token ID13
Feature activation-3.17e-01

Pos logit contributions
 Lily+0.073
Prin+0.072
 Emma+0.066
 Emily+0.065
 Penny+0.065
Neg logit contributions
ines-0.095
 enclosure-0.093
opl-0.093
cks-0.093
oops-0.093
Token
Token ID220
Feature activation-8.64e-01

Pos logit contributions
Prin+0.005
 Lily+0.005
 war+0.005
 band+0.005
 accidentally+0.005
Neg logit contributions
 enclosure-0.010
opl-0.010
oops-0.010
umm-0.010
cks-0.010
Token\n
Token ID198
Feature activation-1.90e-01

Pos logit contributions
 Lily+0.013
Prin+0.013
 war+0.012
 Emma+0.012
 accidentally+0.012
Neg logit contributions
opl-0.029
oops-0.029
cks-0.029
ines-0.029
 enclosure-0.029
Token\n
Token ID198
Feature activation-2.63e-01

Pos logit contributions
 Lily+0.003
Prin+0.002
 Emma+0.002
 accidentally+0.002
 someone+0.002
Neg logit contributions
cks-0.007
oops-0.007
umm-0.007
opl-0.007
 enclosure-0.007
TokenThe
Token ID464
Feature activation-5.82e-01

Pos logit contributions
 Lily+0.005
Prin+0.005
 Emma+0.005
 accidentally+0.005
 Emily+0.005
Neg logit contributions
cks-0.008
opl-0.008
 enclosure-0.008
oops-0.008
ines-0.008
Token.
Token ID13
Feature activation-1.46e+00

Pos logit contributions
 Lily+0.022
Prin+0.021
 Emma+0.020
 accidentally+0.019
 someone+0.019
Neg logit contributions
cks-0.034
oops-0.034
 enclosure-0.033
opl-0.033
umm-0.033
Token We
Token ID775
Feature activation-1.91e+00

Pos logit contributions
Prin+0.037
 war+0.032
 Lily+0.032
 sounded+0.032
 accidentally+0.031
Neg logit contributions
oops-0.038
umm-0.037
 enclosure-0.037
opl-0.035
iously-0.035
Token'll
Token ID1183
Feature activation-1.52e+00

Pos logit contributions
 Lily+0.039
Prin+0.038
 Emily+0.036
 Emma+0.036
 someone+0.035
Neg logit contributions
ines-0.056
oops-0.055
 enclosure-0.054
ram-0.052
opl-0.051
Token put
Token ID1234
Feature activation-2.31e+00

Pos logit contributions
Prin+0.030
 Lily+0.030
 Emma+0.027
 Emily+0.027
 war+0.027
Neg logit contributions
oops-0.045
 enclosure-0.044
opl-0.044
cks-0.044
ines-0.044
Token a
Token ID257
Feature activation-1.00e+00

Pos logit contributions
Prin+0.051
 accidentally+0.045
 Emma+0.042
 Emily+0.042
 Penny+0.040
Neg logit contributions
cks-0.073
oops-0.073
 cones-0.068
opl-0.067
 enclosure-0.063
Token band
Token ID4097
Feature activation-3.31e+00

Pos logit contributions
 Lily+0.013
Prin+0.012
 Emma+0.012
 Emily+0.012
 Penny+0.012
Neg logit contributions
 enclosure-0.034
oops-0.034
itten-0.034
 cones-0.034
 ig-0.034
Token-
Token ID12
Feature activation-1.05e+00

Pos logit contributions
 Emma+0.099
 Lily+0.099
 Penny+0.097
 accidentally+0.096
Prin+0.096
Neg logit contributions
oops-0.070
opl-0.066
ines-0.066
cks-0.064
 enclosure-0.062
Tokenaid
Token ID1698
Feature activation-2.21e+00

Pos logit contributions
 Emma+0.027
 Lily+0.027
 Emily+0.026
 Penny+0.026
 war+0.025
Neg logit contributions
oops-0.027
cks-0.025
opl-0.024
asp-0.024
at-0.024
Token on
Token ID319
Feature activation-1.09e+00

Pos logit contributions
Prin+0.047
 Lily+0.047
 accidentally+0.047
 Emma+0.046
 Emily+0.045
Neg logit contributions
oops-0.064
ines-0.061
opl-0.060
cks-0.059
 enclosure-0.059
Token it
Token ID340
Feature activation-3.68e-01

Pos logit contributions
Prin+0.015
 Lily+0.012
 Emma+0.012
 accidentally+0.012
 sounded+0.011
Neg logit contributions
cks-0.042
opl-0.041
oops-0.040
 enclosure-0.039
iously-0.038
Token and
Token ID290
Feature activation-8.87e-01

Pos logit contributions
Prin+0.007
 Lily+0.007
 Emma+0.006
 war+0.006
 Emily+0.006
Neg logit contributions
oops-0.011
 enclosure-0.011
ines-0.011
cks-0.011
opl-0.011
Token.
Token ID13
Feature activation-7.46e-02

Pos logit contributions
Prin+0.025
 Lily+0.023
 war+0.022
 accidentally+0.021
 Emma+0.021
Neg logit contributions
 enclosure-0.043
oops-0.043
cks-0.042
ines-0.042
iously-0.041
Token Her
Token ID2332
Feature activation-8.85e-02

Pos logit contributions
Prin+0.002
 war+0.002
 band+0.002
 sounded+0.001
 villain+0.001
Neg logit contributions
oops-0.002
 enclosure-0.002
opl-0.002
umm-0.002
iously-0.002
Token mom
Token ID1995
Feature activation-1.09e+00

Pos logit contributions
Prin+0.002
 Lily+0.002
 Emma+0.002
 Penny+0.002
 Emily+0.002
Neg logit contributions
oops-0.003
opl-0.002
umm-0.002
ines-0.002
 enclosure-0.002
Token put
Token ID1234
Feature activation-2.46e+00

Pos logit contributions
Prin+0.018
 Lily+0.018
 war+0.017
 Penny+0.016
 Emma+0.016
Neg logit contributions
ines-0.034
 enclosure-0.034
oops-0.034
cks-0.034
iously-0.033
Token a
Token ID257
Feature activation-1.42e+00

Pos logit contributions
Prin+0.051
 accidentally+0.046
 war+0.043
 practiced+0.041
 villains+0.037
Neg logit contributions
oops-0.070
cks-0.068
 cones-0.067
icer-0.061
els-0.060
Token band
Token ID4097
Feature activation-3.30e+00

Pos logit contributions
 Lily+0.023
Prin+0.023
 Emma+0.023
 Emily+0.021
 Penny+0.021
Neg logit contributions
 ig-0.042
itten-0.041
 sle-0.041
 enclosure-0.040
 cones-0.039
Token-
Token ID12
Feature activation-1.47e+00

Pos logit contributions
 Penny+0.093
 Emma+0.093
 Lily+0.092
Prin+0.091
 accidentally+0.091
Neg logit contributions
oops-0.070
opl-0.068
ines-0.065
cks-0.064
 enclosure-0.063
Tokenaid
Token ID1698
Feature activation-1.77e+00

Pos logit contributions
 Emma+0.037
 Lily+0.036
 Penny+0.035
 war+0.035
 Emily+0.035
Neg logit contributions
oops-0.038
cks-0.034
asp-0.034
at-0.033
umm-0.033
Token on
Token ID319
Feature activation-5.17e-01

Pos logit contributions
Prin+0.041
 Lily+0.040
 accidentally+0.040
 Emma+0.039
 Emily+0.037
Neg logit contributions
oops-0.046
iously-0.044
ines-0.043
opl-0.043
cks-0.043
Token his
Token ID465
Feature activation-7.68e-01

Pos logit contributions
Prin+0.012
 practiced+0.010
arre+0.010
 war+0.009
 sounded+0.009
Neg logit contributions
opl-0.012
cks-0.012
 front-0.012
zy-0.012
oops-0.011
Token knee
Token ID10329
Feature activation-1.54e+00

Pos logit contributions
 Lily+0.020
Prin+0.020
 Emma+0.019
 accidentally+0.018
 Emily+0.018
Neg logit contributions
oops-0.019
 enclosure-0.018
cks-0.018
opl-0.018
umm-0.018
Token out
Token ID503
Feature activation+4.86e-02

Pos logit contributions
Prin+0.019
 sounded+0.017
 accidentally+0.017
 war+0.015
 forgive+0.015
Neg logit contributions
oops-0.044
cks-0.038
ines-0.038
umm-0.037
 enclosure-0.037
Token a
Token ID257
Feature activation-8.67e-01

Pos logit contributions
oops+0.002
ines+0.001
 enclosure+0.001
 cones+0.001
asp+0.001
Neg logit contributions
Prin-0.001
 accidentally-0.001
 war-0.001
 Bun-0.001
 Midnight-0.001
Token needle
Token ID17598
Feature activation-2.29e+00

Pos logit contributions
Prin+0.022
 Lily+0.021
 Bun+0.020
 someone+0.020
 Benny+0.020
Neg logit contributions
ines-0.019
 enclosure-0.019
oops-0.018
 main-0.017
 ig-0.017
Token and
Token ID290
Feature activation-1.43e+00

Pos logit contributions
Prin+0.043
 Lily+0.041
 someone+0.038
 war+0.037
 Emma+0.037
Neg logit contributions
 enclosure-0.071
ines-0.071
oops-0.068
iously-0.068
ram-0.068
Token some
Token ID617
Feature activation-1.68e+00

Pos logit contributions
Prin+0.032
 war+0.029
 villain+0.027
arre+0.026
 Beat+0.026
Neg logit contributions
oops-0.036
 cones-0.034
 handles-0.032
 ig-0.032
iously-0.030
Token thread
Token ID4704
Feature activation-3.30e+00

Pos logit contributions
Prin+0.038
 Lily+0.033
 Emma+0.033
 Emily+0.033
 Penny+0.032
Neg logit contributions
oops-0.045
ines-0.044
 cones-0.043
umm-0.043
 ounces-0.041
Token and
Token ID290
Feature activation-1.80e+00

Pos logit contributions
Prin+0.067
 Lily+0.062
 Emma+0.060
 war+0.059
 someone+0.057
Neg logit contributions
ines-0.100
oops-0.098
cks-0.096
 enclosure-0.095
iously-0.094
Token showed
Token ID3751
Feature activation-8.02e-01

Pos logit contributions
Prin+0.039
 war+0.036
arre+0.031
 villain+0.031
 Beat+0.030
Neg logit contributions
oops-0.047
 cones-0.043
 handles-0.042
 ig-0.041
iously-0.040
Token Lily
Token ID20037
Feature activation-9.87e-01

Pos logit contributions
Prin+0.010
 war+0.010
 accidentally+0.009
 sounded+0.008
 band+0.007
Neg logit contributions
oops-0.028
ines-0.027
cks-0.026
iously-0.026
umm-0.026
Token how
Token ID703
Feature activation+9.92e-01

Pos logit contributions
Prin+0.016
 Lily+0.015
 Emma+0.014
 accidentally+0.014
 war+0.014
Neg logit contributions
oops-0.034
cks-0.033
 enclosure-0.033
ines-0.033
opl-0.032
Token to
Token ID284
Feature activation-9.51e-02

Pos logit contributions
oops+0.032
ines+0.031
cks+0.029
iously+0.029
umm+0.029
Neg logit contributions
Prin-0.018
 Lily-0.015
 Emma-0.015
 war-0.015
 sounded-0.015
Token help
Token ID1037
Feature activation-1.62e+00

Pos logit contributions
Prin+0.016
 Midnight+0.014
 Penny+0.014
 war+0.014
 sounded+0.014
Neg logit contributions
cks-0.013
oops-0.013
 enclosure-0.013
iously-0.012
opl-0.012
Token.
Token ID13
Feature activation-6.06e-01

Pos logit contributions
Prin+0.034
 Lily+0.032
 Emma+0.031
 Penny+0.030
 war+0.030
Neg logit contributions
oops-0.047
cks-0.047
ines-0.046
 enclosure-0.046
opl-0.045
Token She
Token ID1375
Feature activation-7.20e-01

Pos logit contributions
 Lily+0.010
 Emma+0.009
Prin+0.008
 Emily+0.008
 Benny+0.008
Neg logit contributions
cks-0.022
opl-0.022
 enclosure-0.021
umm-0.021
ines-0.021
Token put
Token ID1234
Feature activation-2.47e+00

Pos logit contributions
Prin+0.014
 war+0.014
 Lily+0.013
 Emma+0.013
 Penny+0.013
Neg logit contributions
oops-0.021
ines-0.020
 enclosure-0.020
cks-0.020
iously-0.019
Token a
Token ID257
Feature activation-1.43e+00

Pos logit contributions
Prin+0.047
 war+0.038
 accidentally+0.037
 Emma+0.036
 Penny+0.034
Neg logit contributions
oops-0.086
cks-0.082
 cones-0.079
opl-0.076
umm-0.074
Token band
Token ID4097
Feature activation-3.29e+00

Pos logit contributions
 Emma+0.020
 Lily+0.020
 Penny+0.019
Prin+0.019
 Emily+0.019
Neg logit contributions
 cones-0.047
 ig-0.046
itten-0.046
oops-0.046
 enclosure-0.045
Token-
Token ID12
Feature activation-1.19e+00

Pos logit contributions
 Emma+0.093
 Lily+0.093
 Penny+0.091
Prin+0.091
 accidentally+0.090
Neg logit contributions
oops-0.074
opl-0.073
cks-0.070
ines-0.069
 enclosure-0.066
Tokenaid
Token ID1698
Feature activation-2.00e+00

Pos logit contributions
 Emma+0.031
 Lily+0.030
 Penny+0.030
 Emily+0.030
 war+0.029
Neg logit contributions
oops-0.031
cks-0.028
opl-0.027
ines-0.027
asp-0.026
Token on
Token ID319
Feature activation-7.67e-01

Pos logit contributions
Prin+0.042
 accidentally+0.041
 Lily+0.041
 Emma+0.041
 Penny+0.038
Neg logit contributions
oops-0.059
cks-0.056
opl-0.056
ines-0.055
 cones-0.054
Token his
Token ID465
Feature activation-6.77e-01

Pos logit contributions
Prin+0.014
 war+0.010
 Emma+0.010
 Penny+0.010
 sounded+0.010
Neg logit contributions
cks-0.025
oops-0.025
opl-0.025
iously-0.024
 enclosure-0.024
Token knee
Token ID10329
Feature activation-1.29e+00

Pos logit contributions
 Lily+0.017
Prin+0.017
 Emma+0.016
 Emily+0.015
 accidentally+0.015
Neg logit contributions
oops-0.017
cks-0.017
opl-0.017
 enclosure-0.017
umm-0.017
Token Let
Token ID3914
Feature activation-1.85e+00

Pos logit contributions
Prin+0.009
 accidentally+0.009
 war+0.008
 sounded+0.008
 band+0.008
Neg logit contributions
oops-0.010
umm-0.009
 enclosure-0.009
opl-0.009
cks-0.009
Token's
Token ID338
Feature activation-1.23e+00

Pos logit contributions
Prin+0.039
 war+0.036
 Emily+0.036
 Lily+0.036
 accidentally+0.036
Neg logit contributions
oops-0.047
opl-0.047
umm-0.046
 enclosure-0.046
ram-0.046
Token go
Token ID467
Feature activation-7.76e-01

Pos logit contributions
Prin+0.029
 Lily+0.025
Da+0.025
 Bun+0.024
 Penny+0.024
Neg logit contributions
 enclosure-0.033
oops-0.032
cks-0.031
opl-0.031
ines-0.030
Token to
Token ID284
Feature activation-1.25e+00

Pos logit contributions
Prin+0.015
 Lily+0.014
 Emma+0.014
 accidentally+0.013
 Emily+0.013
Neg logit contributions
oops-0.025
 enclosure-0.025
ines-0.024
cks-0.024
opl-0.024
Token the
Token ID262
Feature activation-1.00e+00

Pos logit contributions
Prin+0.021
 sounded+0.020
 accidentally+0.017
 didn+0.016
L+0.016
Neg logit contributions
 enclosure-0.042
cks-0.040
oops-0.040
opl-0.039
ines-0.037
Token doctor
Token ID6253
Feature activation-3.29e+00

Pos logit contributions
 Lily+0.024
 Emma+0.022
Prin+0.021
 someone+0.021
 war+0.020
Neg logit contributions
cks-0.030
oops-0.029
iously-0.028
opl-0.028
ines-0.028
Token and
Token ID290
Feature activation-1.79e-01

Pos logit contributions
Prin+0.069
 Lily+0.063
 Benny+0.062
 Bun+0.061
 accidentally+0.061
Neg logit contributions
 enclosure-0.102
oops-0.100
ines-0.094
cks-0.093
opl-0.090
Token see
Token ID766
Feature activation-4.65e-01

Pos logit contributions
Prin+0.005
 sounded+0.004
Da+0.004
 Penny+0.004
 war+0.004
Neg logit contributions
 enclosure-0.005
oops-0.005
cks-0.004
 doorstep-0.004
 cones-0.004
Token if
Token ID611
Feature activation-5.62e-02

Pos logit contributions
Prin+0.007
 Lily+0.007
 Emma+0.007
 accidentally+0.007
 Emily+0.006
Neg logit contributions
oops-0.016
cks-0.016
 enclosure-0.015
opl-0.015
ines-0.015
Token you
Token ID345
Feature activation-2.47e-01

Pos logit contributions
Prin+0.001
 Emily+0.001
 Penny+0.001
 war+0.001
 Emma+0.001
Neg logit contributions
oops-0.002
iously-0.002
 cones-0.002
 enclosure-0.002
cks-0.002
Token need
Token ID761
Feature activation+4.26e-01

Pos logit contributions
Prin+0.005
 Lily+0.005
 Emma+0.005
 Emily+0.005
 Penny+0.005
Neg logit contributions
oops-0.007
ines-0.007
 enclosure-0.007
cks-0.007
opl-0.006
Token
Token ID198
Feature activation+8.82e-01

Pos logit contributions
 Lily+0.004
Prin+0.004
 Emma+0.004
 accidentally+0.004
 someone+0.004
Neg logit contributions
oops-0.009
cks-0.008
 enclosure-0.008
opl-0.008
umm-0.008
Token
Token ID198
Feature activation+5.13e-01

Pos logit contributions
oops+0.034
cks+0.034
umm+0.034
opl+0.034
 enclosure+0.034
Neg logit contributions
 Lily-0.011
Prin-0.010
 Emma-0.009
 someone-0.008
 accidentally-0.008
TokenOne
Token ID3198
Feature activation+8.25e-01

Pos logit contributions
cks+0.016
umm+0.016
oops+0.016
opl+0.016
 enclosure+0.016
Neg logit contributions
 Lily-0.010
Prin-0.010
 Emma-0.009
 someone-0.009
 accidentally-0.009
Token day
Token ID1110
Feature activation-3.86e-01

Pos logit contributions
oops+0.022
cks+0.022
 enclosure+0.021
opl+0.021
umm+0.021
Neg logit contributions
 Lily-0.020
Prin-0.019
 Emma-0.019
 accidentally-0.018
 Emily-0.018
Token,
Token ID11
Feature activation+2.71e-01

Pos logit contributions
Prin+0.008
 Bun+0.007
 Emma+0.007
 Midnight+0.006
 Lily+0.006
Neg logit contributions
oops-0.012
ines-0.011
 enclosure-0.011
cks-0.011
 cones-0.011
Token Ben
Token ID3932
Feature activation+2.04e+00

Pos logit contributions
oops+0.008
 cones+0.008
cks+0.007
ires+0.007
iously+0.007
Neg logit contributions
Prin-0.006
 Bun-0.005
Ell-0.005
 Emma-0.005
 Midnight-0.004
Token goes
Token ID2925
Feature activation+1.25e+00

Pos logit contributions
oops+0.054
 enclosure+0.054
cks+0.053
ines+0.052
opl+0.052
Neg logit contributions
 Lily-0.048
Prin-0.046
 Emma-0.045
 Bun-0.044
 someone-0.042
Token to
Token ID284
Feature activation+4.38e-01

Pos logit contributions
oops+0.045
 enclosure+0.044
ines+0.043
cks+0.043
opl+0.042
Neg logit contributions
 Lily-0.018
Prin-0.018
 Emma-0.017
 Bun-0.016
 Benny-0.016
Token the
Token ID262
Feature activation+1.72e-01

Pos logit contributions
 enclosure+0.015
oops+0.014
cks+0.013
 cones+0.013
 ig+0.013
Neg logit contributions
Prin-0.008
 sounded-0.007
L-0.006
 Midnight-0.006
 Bun-0.006
Token lake
Token ID13546
Feature activation+5.21e-01

Pos logit contributions
 enclosure+0.003
oops+0.003
umm+0.003
cks+0.003
opl+0.003
Neg logit contributions
 Lily-0.005
Prin-0.005
 Emma-0.005
 Emily-0.005
 Bun-0.005
Token with
Token ID351
Feature activation+9.60e-01

Pos logit contributions
oops+0.019
 enclosure+0.019
cks+0.019
ines+0.019
opl+0.019
Neg logit contributions
 Lily-0.007
Prin-0.007
 Emma-0.007
 Bun-0.006
 Emily-0.006
Token your
Token ID534
Feature activation-1.47e-02

Pos logit contributions
cks+0.003
ines+0.002
 enclosure+0.002
opl+0.002
oops+0.002
Neg logit contributions
Prin-0.002
 accidentally-0.002
 Lily-0.002
 Emma-0.002
 Bun-0.002
Token kids
Token ID3988
Feature activation-4.73e-01

Pos logit contributions
 Lily+0.000
Prin+0.000
 Emma+0.000
 accidentally+0.000
 Emily+0.000
Neg logit contributions
oops-0.000
cks-0.000
 enclosure-0.000
opl-0.000
ines-0.000
Token not
Token ID407
Feature activation+7.29e-01

Pos logit contributions
Prin+0.011
 accidentally+0.010
 Lily+0.010
 Bun+0.010
 Emma+0.009
Neg logit contributions
oops-0.014
cks-0.014
 enclosure-0.013
ines-0.013
opl-0.013
Token to
Token ID284
Feature activation+3.94e-01

Pos logit contributions
cks+0.024
oops+0.024
ines+0.023
 enclosure+0.023
opl+0.023
Neg logit contributions
Prin-0.012
 Emma-0.012
 Lily-0.012
 Penny-0.012
 Emily-0.011
Token climb
Token ID12080
Feature activation+1.16e+00

Pos logit contributions
cks+0.011
 enclosure+0.011
opl+0.010
ines+0.010
oops+0.010
Neg logit contributions
Prin-0.009
 Lily-0.008
 Emma-0.008
 Penny-0.008
 Emily-0.008
Token high
Token ID1029
Feature activation+2.36e+00

Pos logit contributions
cks+0.038
opl+0.037
ines+0.037
 enclosure+0.037
oops+0.035
Neg logit contributions
Prin-0.021
 Emma-0.019
 Lily-0.019
 Emily-0.019
 Penny-0.018
Token trees
Token ID7150
Feature activation+5.30e-01

Pos logit contributions
cks+0.071
ines+0.070
 enclosure+0.069
oops+0.069
opl+0.069
Neg logit contributions
 Lily-0.048
Prin-0.046
 Emma-0.045
 Emily-0.043
 Penny-0.043
Token and
Token ID290
Feature activation+9.35e-01

Pos logit contributions
 enclosure+0.016
cks+0.016
opl+0.016
ines+0.016
oops+0.016
Neg logit contributions
 Lily-0.011
Prin-0.010
 Emma-0.010
 Emily-0.010
 Penny-0.010
Token forget
Token ID6044
Feature activation-8.03e-01

Pos logit contributions
 enclosure+0.023
oops+0.022
cks+0.022
 tallest+0.022
 cones+0.021
Neg logit contributions
 Penny-0.025
 Emma-0.024
 Midnight-0.024
Prin-0.024
 Emily-0.023
Token about
Token ID546
Feature activation+9.83e-01

Pos logit contributions
Prin+0.017
 accidentally+0.015
 Lily+0.014
 Penny+0.014
 Emma+0.014
Neg logit contributions
 enclosure-0.023
oops-0.023
ines-0.023
cks-0.023
opl-0.023
Token everything
Token ID2279
Feature activation+1.07e+00

Pos logit contributions
cks+0.034
opl+0.033
 enclosure+0.033
oops+0.032
ines+0.032
Neg logit contributions
Prin-0.017
 Lily-0.015
 Emma-0.014
 accidentally-0.014
 Penny-0.014
Token very
Token ID845
Feature activation+1.32e+00

Pos logit contributions
cks+0.011
oops+0.011
 enclosure+0.011
opl+0.011
ines+0.011
Neg logit contributions
 Lily-0.008
Prin-0.008
 Emma-0.008
 Benny-0.007
 someone-0.007
Token excited
Token ID6568
Feature activation+3.68e-01

Pos logit contributions
cks+0.040
umm+0.039
 enclosure+0.039
opl+0.039
oops+0.039
Neg logit contributions
 Lily-0.029
Prin-0.026
 Emma-0.026
 accidentally-0.025
 someone-0.024
Token and
Token ID290
Feature activation-3.14e-01

Pos logit contributions
cks+0.014
oops+0.014
opl+0.014
umm+0.014
 enclosure+0.014
Neg logit contributions
 Lily-0.005
 Emma-0.004
Prin-0.004
 someone-0.004
 accidentally-0.004
Token had
Token ID550
Feature activation+7.44e-02

Pos logit contributions
Prin+0.006
 Lily+0.006
 Emma+0.006
 Midnight+0.006
 war+0.006
Neg logit contributions
oops-0.009
cks-0.009
 enclosure-0.009
ines-0.009
opl-0.009
Token a
Token ID257
Feature activation+9.57e-01

Pos logit contributions
oops+0.003
 enclosure+0.003
cks+0.003
opl+0.003
ines+0.003
Neg logit contributions
 Lily-0.001
Prin-0.001
 Emma-0.001
 Midnight-0.001
 Penny-0.001
Token big
Token ID1263
Feature activation+2.17e+00

Pos logit contributions
 enclosure+0.026
oops+0.025
opl+0.025
umm+0.025
ines+0.025
Neg logit contributions
 Lily-0.022
Prin-0.021
 Emma-0.020
 someone-0.019
 Bun-0.019
Token smile
Token ID8212
Feature activation+3.58e-01

Pos logit contributions
 enclosure+0.067
opl+0.067
oops+0.067
ines+0.066
umm+0.066
Neg logit contributions
 Lily-0.042
Prin-0.041
 Emma-0.038
 someone-0.037
 Bun-0.036
Token on
Token ID319
Feature activation+1.08e+00

Pos logit contributions
opl+0.011
oops+0.011
 enclosure+0.011
ines+0.011
cks+0.011
Neg logit contributions
Prin-0.007
 Lily-0.007
 Emma-0.007
 Emily-0.006
 Benny-0.006
Token his
Token ID465
Feature activation+1.16e-01

Pos logit contributions
cks+0.045
umm+0.045
oops+0.045
 enclosure+0.045
opl+0.045
Neg logit contributions
 Lily-0.010
Prin-0.008
 Emma-0.008
 someone-0.007
 accidentally-0.007
Token face
Token ID1986
Feature activation+7.91e-01

Pos logit contributions
umm+0.004
cks+0.004
ines+0.004
opl+0.004
oops+0.004
Neg logit contributions
 Lily-0.002
 Emma-0.002
 someone-0.002
Prin-0.001
 Emily-0.001
Token.
Token ID13
Feature activation+1.06e-01

Pos logit contributions
cks+0.025
oops+0.025
umm+0.025
opl+0.025
 enclosure+0.025
Neg logit contributions
 Lily-0.015
Prin-0.013
 Emma-0.013
 someone-0.013
 accidentally-0.013
Token staring
Token ID16143
Feature activation+1.30e+00

Pos logit contributions
 enclosure+0.003
cks+0.003
oops+0.003
opl+0.003
iously+0.003
Neg logit contributions
Prin-0.003
 Lily-0.003
 Emma-0.003
 Bun-0.003
 Benny-0.003
Token at
Token ID379
Feature activation+1.10e+00

Pos logit contributions
oops+0.034
cks+0.033
ines+0.033
 enclosure+0.033
umm+0.032
Neg logit contributions
 Lily-0.032
Prin-0.031
 Emma-0.030
 Bun-0.029
 accidentally-0.029
Token him
Token ID683
Feature activation+8.48e-01

Pos logit contributions
 enclosure+0.035
cks+0.035
opl+0.034
ines+0.034
iously+0.034
Neg logit contributions
Prin-0.020
 accidentally-0.018
 Lily-0.017
L-0.017
 Emma-0.017
Token with
Token ID351
Feature activation+6.63e-01

Pos logit contributions
oops+0.029
cks+0.029
umm+0.029
opl+0.029
 enclosure+0.028
Neg logit contributions
 Lily-0.014
 Emma-0.012
Prin-0.012
 someone-0.012
 Emily-0.012
Token her
Token ID607
Feature activation+6.75e-01

Pos logit contributions
cks+0.022
oops+0.022
opl+0.022
 enclosure+0.022
umm+0.021
Neg logit contributions
Prin-0.011
 Lily-0.011
 accidentally-0.010
 Emma-0.009
 Bun-0.009
Token big
Token ID1263
Feature activation+2.01e+00

Pos logit contributions
oops+0.019
cks+0.019
umm+0.019
opl+0.019
 enclosure+0.019
Neg logit contributions
 Lily-0.015
 Emma-0.014
Prin-0.014
 someone-0.014
 Emily-0.014
Token eyes
Token ID2951
Feature activation+1.43e+00

Pos logit contributions
oops+0.053
cks+0.052
opl+0.052
 enclosure+0.052
umm+0.052
Neg logit contributions
 Lily-0.049
Prin-0.047
 Emma-0.045
 Bun-0.043
 accidentally-0.043
Token.
Token ID13
Feature activation+7.82e-02

Pos logit contributions
cks+0.047
oops+0.047
umm+0.047
opl+0.047
 enclosure+0.047
Neg logit contributions
 Lily-0.025
Prin-0.023
 Emma-0.023
 someone-0.021
 accidentally-0.021
Token
Token ID198
Feature activation+3.10e-01

Pos logit contributions
 enclosure+0.002
oops+0.002
umm+0.002
opl+0.002
iously+0.002
Neg logit contributions
Prin-0.001
 Lily-0.001
 war-0.001
 sounded-0.001
 band-0.001
Token
Token ID198
Feature activation+1.46e-01

Pos logit contributions
cks+0.011
opl+0.011
ines+0.011
 enclosure+0.011
oops+0.011
Neg logit contributions
 Lily-0.005
Prin-0.004
 Emma-0.004
 war-0.004
 accidentally-0.004
TokenThe
Token ID464
Feature activation-9.55e-02

Pos logit contributions
cks+0.004
opl+0.004
 enclosure+0.004
oops+0.004
ines+0.004
Neg logit contributions
 Lily-0.003
 Emma-0.003
Prin-0.003
 accidentally-0.003
 war-0.003
Token and
Token ID290
Feature activation-4.19e-01

Pos logit contributions
cks+0.019
oops+0.019
opl+0.019
umm+0.019
 enclosure+0.018
Neg logit contributions
 Lily-0.006
Prin-0.005
 Emma-0.005
 accidentally-0.004
 someone-0.004
Token brown
Token ID7586
Feature activation+1.20e+00

Pos logit contributions
Prin+0.012
 Emma+0.011
 Penny+0.010
 Lily+0.010
 war+0.010
Neg logit contributions
cks-0.010
oops-0.008
 tallest-0.008
opl-0.008
ines-0.008
Token and
Token ID290
Feature activation-6.43e-02

Pos logit contributions
cks+0.040
oops+0.040
 enclosure+0.040
opl+0.040
umm+0.040
Neg logit contributions
 Lily-0.020
Prin-0.019
 Emma-0.018
 accidentally-0.017
 Emily-0.017
Token had
Token ID550
Feature activation+8.78e-01

Pos logit contributions
Prin+0.002
arre+0.001
 Emma+0.001
 forgive+0.001
 war+0.001
Neg logit contributions
cks-0.002
 tallest-0.001
opl-0.001
oops-0.001
ines-0.001
Token a
Token ID257
Feature activation+8.25e-01

Pos logit contributions
cks+0.027
oops+0.027
 enclosure+0.026
opl+0.026
umm+0.025
Neg logit contributions
 Lily-0.016
 Emma-0.016
Prin-0.016
 Benny-0.015
 Penny-0.015
Token red
Token ID2266
Feature activation+2.36e+00

Pos logit contributions
cks+0.021
opl+0.021
 enclosure+0.021
ines+0.021
oops+0.021
Neg logit contributions
 Lily-0.020
Prin-0.019
 Emma-0.019
 Benny-0.018
 Emily-0.018
Token cap
Token ID1451
Feature activation+9.66e-01

Pos logit contributions
 enclosure+0.057
opl+0.056
oops+0.056
cks+0.055
umm+0.054
Neg logit contributions
 Lily-0.062
 Emma-0.061
Prin-0.061
 Penny-0.058
 Emily-0.058
Token.
Token ID13
Feature activation-1.29e-02

Pos logit contributions
umm+0.036
oops+0.035
opl+0.035
 enclosure+0.034
cks+0.034
Neg logit contributions
 Lily-0.015
Prin-0.012
 someone-0.012
 Emma-0.012
 accidentally-0.011
Token It
Token ID632
Feature activation-3.45e-01

Pos logit contributions
 Lily+0.000
Prin+0.000
 Emma+0.000
 accidentally+0.000
 war+0.000
Neg logit contributions
oops-0.000
umm-0.000
 enclosure-0.000
cks-0.000
opl-0.000
Token looked
Token ID3114
Feature activation+6.59e-01

Pos logit contributions
Prin+0.006
 Emma+0.006
 Penny+0.006
 war+0.006
 Lily+0.006
Neg logit contributions
cks-0.011
ines-0.010
 enclosure-0.010
oops-0.010
opl-0.010
Token at
Token ID379
Feature activation+1.15e+00

Pos logit contributions
oops+0.019
cks+0.019
ines+0.018
 enclosure+0.018
opl+0.018
Neg logit contributions
Prin-0.014
 Lily-0.014
 Emma-0.013
 Midnight-0.012
 Emily-0.012
Tokenays
Token ID592
Feature activation+1.26e-01

Pos logit contributions
 Lily+0.005
Prin+0.005
 Emma+0.005
 accidentally+0.005
 someone+0.005
Neg logit contributions
umm-0.002
opl-0.002
 enclosure-0.002
oops-0.002
cks-0.002
Token water
Token ID1660
Feature activation-9.18e-03

Pos logit contributions
oops+0.005
cks+0.004
 enclosure+0.004
ines+0.004
opl+0.004
Neg logit contributions
Prin-0.002
 Emma-0.002
 Lily-0.002
 accidentally-0.002
 Bun-0.002
Token on
Token ID319
Feature activation-3.03e-01

Pos logit contributions
 Lily+0.000
Prin+0.000
 Emma+0.000
 someone+0.000
 accidentally+0.000
Neg logit contributions
umm-0.000
opl-0.000
cks-0.000
 enclosure-0.000
oops-0.000
Token the
Token ID262
Feature activation+2.50e-01

Pos logit contributions
Prin+0.003
 Lily+0.003
 Emma+0.003
 accidentally+0.003
 war+0.002
Neg logit contributions
cks-0.012
oops-0.012
 enclosure-0.012
umm-0.011
opl-0.011
Token fire
Token ID2046
Feature activation+8.34e-01

Pos logit contributions
oops+0.006
cks+0.006
opl+0.006
umm+0.006
ines+0.006
Neg logit contributions
 Lily-0.007
Prin-0.006
 Emma-0.006
 someone-0.006
 accidentally-0.006
Token.
Token ID13
Feature activation+4.14e-01

Pos logit contributions
oops+0.027
opl+0.027
cks+0.027
umm+0.027
ines+0.026
Neg logit contributions
 Lily-0.016
Prin-0.014
 Emma-0.013
 someone-0.013
 accidentally-0.013
Token He
Token ID679
Feature activation+5.48e-01

Pos logit contributions
cks+0.014
opl+0.014
oops+0.014
 enclosure+0.014
umm+0.014
Neg logit contributions
 Lily-0.008
Prin-0.007
 Emma-0.007
 someone-0.006
 accidentally-0.006
Token puts
Token ID7584
Feature activation-1.66e+00

Pos logit contributions
umm+0.015
opl+0.014
 enclosure+0.014
iously+0.013
oops+0.013
Neg logit contributions
 Lily-0.014
 accidentally-0.013
Prin-0.013
 Emma-0.012
 Emily-0.012
Token out
Token ID503
Feature activation+7.60e-01

Pos logit contributions
Prin+0.019
 Emma+0.016
 accidentally+0.015
 war+0.015
 Emily+0.015
Neg logit contributions
oops-0.067
cks-0.066
 enclosure-0.062
ines-0.061
umm-0.061
Token the
Token ID262
Feature activation+5.07e-01

Pos logit contributions
oops+0.030
cks+0.029
 enclosure+0.028
ines+0.028
opl+0.028
Neg logit contributions
Prin-0.009
 Emma-0.008
 Lily-0.008
 accidentally-0.008
 Penny-0.007
Token fire
Token ID2046
Feature activation+1.47e+00

Pos logit contributions
umm+0.016
opl+0.016
oops+0.016
cks+0.016
ines+0.015
Neg logit contributions
 Lily-0.010
Prin-0.009
 Emma-0.009
 someone-0.009
 accidentally-0.009
Token and
Token ID290
Feature activation-4.68e-01

Pos logit contributions
 Lily+0.019
Prin+0.018
 Emma+0.017
 Bun+0.016
 Benny+0.016
Neg logit contributions
 enclosure-0.040
ines-0.040
oops-0.040
cks-0.040
opl-0.039
Token pulled
Token ID5954
Feature activation-4.73e-01

Pos logit contributions
Prin+0.010
 war+0.010
 Midnight+0.009
 Bun+0.009
 Lily+0.009
Neg logit contributions
oops-0.013
iously-0.012
 enclosure-0.012
cks-0.011
used-0.011
Token out
Token ID503
Feature activation+3.25e-01

Pos logit contributions
 Lily+0.006
Prin+0.006
 Emma+0.005
 accidentally+0.005
 war+0.005
Neg logit contributions
oops-0.017
cks-0.017
ines-0.017
 enclosure-0.017
opl-0.017
Token a
Token ID257
Feature activation-8.77e-01

Pos logit contributions
oops+0.012
ines+0.012
 enclosure+0.012
cks+0.012
iously+0.011
Neg logit contributions
Prin-0.004
 accidentally-0.004
 Midnight-0.004
 Lily-0.004
 war-0.004
Token cheap
Token ID7026
Feature activation-7.91e-01

Pos logit contributions
 Lily+0.022
Prin+0.021
 Bun+0.020
 Emily+0.020
 Emma+0.020
Neg logit contributions
 enclosure-0.021
ines-0.021
oops-0.020
umm-0.020
 pounds-0.020
Token shiny
Token ID22441
Feature activation-3.66e-02

Pos logit contributions
Prin+0.020
 Lily+0.020
 Emily+0.019
 Emma+0.018
 Bun+0.018
Neg logit contributions
 enclosure-0.020
umm-0.019
oops-0.019
ines-0.018
opl-0.018
Token toy
Token ID13373
Feature activation-1.00e-01

Pos logit contributions
Prin+0.001
 Lily+0.001
 Emily+0.001
 Emma+0.001
 Bun+0.001
Neg logit contributions
umm-0.001
 enclosure-0.001
oops-0.001
ines-0.001
 pounds-0.001
Token.
Token ID13
Feature activation+6.86e-02

Pos logit contributions
 Lily+0.002
 Emma+0.001
 someone+0.001
 they+0.001
 Benny+0.001
Neg logit contributions
cks-0.004
opl-0.004
oops-0.004
umm-0.003
ines-0.003
Token She
Token ID1375
Feature activation-7.80e-01

Pos logit contributions
oops+0.002
 enclosure+0.002
opl+0.002
cks+0.002
ines+0.002
Neg logit contributions
 Lily-0.001
Prin-0.001
 war-0.001
 accidentally-0.001
 Bun-0.001
Token was
Token ID373
Feature activation-2.43e-01

Pos logit contributions
Prin+0.014
 Lily+0.014
 war+0.014
 Midnight+0.013
 Emma+0.013
Neg logit contributions
oops-0.024
 enclosure-0.023
ines-0.023
cks-0.023
iously-0.023
Token so
Token ID523
Feature activation+9.28e-01

Pos logit contributions
 Lily+0.004
 Emma+0.004
Prin+0.004
 accidentally+0.004
 someone+0.004
Neg logit contributions
umm-0.008
cks-0.008
opl-0.008
oops-0.008
 enclosure-0.008
Token
Token ID198
Feature activation+3.98e-01

Pos logit contributions
Prin+0.001
 Emma+0.001
 Lily+0.001
 accidentally+0.001
 Bun+0.001
Neg logit contributions
oops-0.002
cks-0.002
 enclosure-0.002
opl-0.002
ines-0.002
Token
Token ID198
Feature activation+6.23e-01

Pos logit contributions
umm+0.018
iously+0.017
oops+0.016
 enclosure+0.016
opl+0.016
Neg logit contributions
 Lily-0.005
 someone-0.004
 it-0.003
Prin-0.003
 they-0.003
TokenBut
Token ID1537
Feature activation+1.34e-01

Pos logit contributions
umm+0.022
oops+0.021
opl+0.021
iously+0.021
 enclosure+0.021
Neg logit contributions
 Lily-0.011
Prin-0.010
L-0.010
 someone-0.009
 accidentally-0.009
Token then
Token ID788
Feature activation-2.77e-01

Pos logit contributions
cks+0.003
 enclosure+0.003
oops+0.003
iously+0.003
ines+0.003
Neg logit contributions
Prin-0.004
 Emma-0.003
 Bun-0.003
 Penny-0.003
 Midnight-0.003
Token,
Token ID11
Feature activation-2.49e-02

Pos logit contributions
Prin+0.006
Da+0.005
Em+0.005
 Bun+0.005
 Midnight+0.005
Neg logit contributions
oops-0.007
 cones-0.007
cks-0.007
 enclosure-0.007
ines-0.007
Token the
Token ID262
Feature activation+5.37e-01

Pos logit contributions
Prin+0.001
arre+0.000
Em+0.000
Da+0.000
 Bun+0.000
Neg logit contributions
 cones-0.001
oops-0.001
 enclosure-0.001
 ig-0.001
 ost-0.001
Token teacher
Token ID4701
Feature activation+5.62e-01

Pos logit contributions
oops+0.014
cks+0.013
 enclosure+0.013
opl+0.013
umm+0.013
Neg logit contributions
 Lily-0.013
Prin-0.013
 Emma-0.013
 Bun-0.012
 accidentally-0.012
Token comes
Token ID2058
Feature activation+4.55e-01

Pos logit contributions
cks+0.018
oops+0.018
umm+0.017
opl+0.017
 enclosure+0.017
Neg logit contributions
 Lily-0.011
Prin-0.010
 Emma-0.010
 accidentally-0.009
 someone-0.009
Token in
Token ID287
Feature activation+3.57e-01

Pos logit contributions
oops+0.014
ines+0.014
 enclosure+0.014
cks+0.014
unks+0.013
Neg logit contributions
Prin-0.009
 Lily-0.008
 Emma-0.008
 Midnight-0.008
 accidentally-0.008
Token.
Token ID13
Feature activation+1.97e-02

Pos logit contributions
oops+0.013
cks+0.013
umm+0.013
opl+0.012
 enclosure+0.012
Neg logit contributions
 Lily-0.006
Prin-0.005
 Emma-0.005
 someone-0.005
 Emily-0.005
Token She
Token ID1375
Feature activation+3.44e-01

Pos logit contributions
oops+0.001
opl+0.001
 enclosure+0.001
umm+0.001
ines+0.001
Neg logit contributions
Prin-0.000
 Bun-0.000
 accidentally-0.000
 Benny-0.000
 Emma-0.000
Token cricket
Token ID18836
Feature activation+5.33e-01

Pos logit contributions
cks+0.023
oops+0.023
ines+0.023
opl+0.022
 enclosure+0.022
Neg logit contributions
 Lily-0.014
 Emma-0.012
Prin-0.012
 someone-0.012
 war-0.012
Token ch
Token ID442
Feature activation+6.51e-01

Pos logit contributions
 enclosure+0.018
ines+0.017
opl+0.017
cks+0.017
oops+0.017
Neg logit contributions
 Lily-0.009
Prin-0.008
 Emma-0.008
 Penny-0.008
 someone-0.007
Tokenir
Token ID343
Feature activation+5.70e-01

Pos logit contributions
cks+0.011
oops+0.010
ines+0.010
opl+0.010
 enclosure+0.009
Neg logit contributions
 Lily-0.024
 Emma-0.023
Prin-0.023
 Bun-0.022
 Emily-0.022
Tokenped
Token ID9124
Feature activation+6.85e-02

Pos logit contributions
opl+0.008
cks+0.008
oops+0.007
iously+0.007
ines+0.007
Neg logit contributions
 Lily-0.021
 Emma-0.021
Prin-0.021
 Penny-0.020
 Emily-0.020
Token happily
Token ID18177
Feature activation-8.83e-01

Pos logit contributions
iously+0.002
oops+0.002
 enclosure+0.002
ines+0.002
cks+0.002
Neg logit contributions
Prin-0.001
 accidentally-0.001
 Lily-0.001
 Penny-0.001
 Midnight-0.001
Token in
Token ID287
Feature activation+3.56e-01

Pos logit contributions
 Lily+0.019
Prin+0.019
 Emma+0.018
 accidentally+0.017
 Emily+0.017
Neg logit contributions
 enclosure-0.025
oops-0.025
ines-0.025
cks-0.025
iously-0.024
Token the
Token ID262
Feature activation-7.73e-01

Pos logit contributions
cks+0.012
opl+0.012
 enclosure+0.012
ines+0.012
umm+0.012
Neg logit contributions
Prin-0.006
 Lily-0.006
 accidentally-0.005
 Emma-0.005
 Bun-0.005
Token background
Token ID4469
Feature activation+1.54e-01

Pos logit contributions
 Lily+0.021
Prin+0.020
 accidentally+0.020
 Emma+0.019
 Bun+0.019
Neg logit contributions
 enclosure-0.018
opl-0.018
umm-0.017
oops-0.016
ines-0.016
Token.
Token ID13
Feature activation+1.63e-01

Pos logit contributions
umm+0.004
ines+0.004
opl+0.004
oops+0.004
 enclosure+0.004
Neg logit contributions
Prin-0.004
 Lily-0.003
 accidentally-0.003
 Emma-0.003
 war-0.003
Token
Token ID220
Feature activation-7.40e-01

Pos logit contributions
oops+0.006
 enclosure+0.005
umm+0.005
opl+0.005
cks+0.005
Neg logit contributions
Prin-0.003
 Lily-0.002
 accidentally-0.002
 war-0.002
 band-0.002
Token
Token ID198
Feature activation+1.79e-01

Pos logit contributions
Prin+0.011
 Lily+0.011
 war+0.010
 accidentally+0.010
 Emma+0.010
Neg logit contributions
oops-0.025
cks-0.025
opl-0.025
 enclosure-0.025
ines-0.025
Token't
Token ID470
Feature activation+1.11e+00

Pos logit contributions
opl+0.003
cks+0.003
oops+0.003
ines+0.003
umm+0.003
Neg logit contributions
 Lily-0.003
Prin-0.003
 Emma-0.003
 Penny-0.003
 war-0.002
Token get
Token ID651
Feature activation+9.71e-01

Pos logit contributions
oops+0.033
cks+0.033
opl+0.033
 enclosure+0.032
ines+0.032
Neg logit contributions
 Lily-0.023
Prin-0.022
 Emma-0.021
 Emily-0.020
 accidentally-0.020
Token up
Token ID510
Feature activation+5.18e-01

Pos logit contributions
oops+0.033
cks+0.033
ines+0.031
 enclosure+0.031
umm+0.031
Neg logit contributions
Prin-0.016
 Lily-0.015
 Emma-0.014
 war-0.013
 Emily-0.013
Token.
Token ID13
Feature activation-5.61e-01

Pos logit contributions
oops+0.018
cks+0.018
umm+0.017
 enclosure+0.017
opl+0.017
Neg logit contributions
 Lily-0.009
Prin-0.008
 Emma-0.008
 accidentally-0.007
 someone-0.007
Token
Token ID198
Feature activation+4.59e-01

Pos logit contributions
Prin+0.009
 war+0.009
 Lily+0.008
 Emma+0.008
 band+0.008
Neg logit contributions
oops-0.018
 enclosure-0.018
opl-0.017
umm-0.017
cks-0.017
Token
Token ID198
Feature activation+8.28e-02

Pos logit contributions
cks+0.017
oops+0.017
ines+0.017
opl+0.017
 enclosure+0.017
Neg logit contributions
 Lily-0.006
Prin-0.006
 Emma-0.005
 accidentally-0.005
 Emily-0.005
TokenSuddenly
Token ID38582
Feature activation-5.60e-01

Pos logit contributions
cks+0.003
oops+0.003
ines+0.003
 enclosure+0.003
opl+0.003
Neg logit contributions
 Lily-0.001
Prin-0.001
 Emma-0.001
 Benny-0.001
 Emily-0.001
Token,
Token ID11
Feature activation-1.34e-01

Pos logit contributions
Prin+0.014
 war+0.012
arre+0.012
 forgive+0.012
 sounded+0.012
Neg logit contributions
iously-0.012
used-0.011
ummy-0.011
ines-0.011
oops-0.011
Token she
Token ID673
Feature activation+7.84e-02

Pos logit contributions
arre+0.003
Prin+0.003
 chord+0.002
 war+0.002
 forgive+0.002
Neg logit contributions
 enormous-0.003
iously-0.003
oops-0.002
 goodness-0.002
 ig-0.002
Token heard
Token ID2982
Feature activation+1.22e+00

Pos logit contributions
cks+0.002
ines+0.002
 enclosure+0.002
unks+0.002
oops+0.002
Neg logit contributions
 Lily-0.001
Prin-0.001
 Emma-0.001
 war-0.001
 Midnight-0.001
Token a
Token ID257
Feature activation-8.71e-01

Pos logit contributions
 enclosure+0.032
 ig+0.031
iously+0.031
oops+0.031
opl+0.030
Neg logit contributions
Prin-0.021
 accidentally-0.020
 Penny-0.019
 Midnight-0.019
arre-0.018
Token cover
Token ID3002
Feature activation-1.21e+00

Pos logit contributions
 enclosure+0.014
opl+0.014
cks+0.013
oops+0.013
umm+0.013
Neg logit contributions
 Lily-0.010
Prin-0.010
 Midnight-0.009
L-0.009
 accidentally-0.009
Token what
Token ID644
Feature activation+8.89e-02

Pos logit contributions
Prin+0.023
 accidentally+0.022
 Lily+0.020
 war+0.019
 Midnight+0.019
Neg logit contributions
cks-0.039
 enclosure-0.037
opl-0.037
ines-0.037
oops-0.036
Token a
Token ID257
Feature activation+7.55e-02

Pos logit contributions
 cones+0.002
oops+0.002
 enclosure+0.002
cks+0.002
umm+0.002
Neg logit contributions
Prin-0.002
 Lily-0.002
 Bun-0.002
 accidentally-0.002
L-0.002
Token pupil
Token ID34047
Feature activation-7.28e-01

Pos logit contributions
umm+0.002
 enclosure+0.002
oops+0.002
ines+0.002
opl+0.002
Neg logit contributions
 Lily-0.002
Prin-0.002
 Emma-0.002
 Bun-0.002
 Benny-0.002
Token is
Token ID318
Feature activation-1.40e-01

Pos logit contributions
 Lily+0.014
Prin+0.013
 Emma+0.013
 Benny+0.012
 accidentally+0.012
Neg logit contributions
oops-0.023
cks-0.022
 enclosure-0.022
ines-0.022
opl-0.022
Token.
Token ID13
Feature activation-6.35e-01

Pos logit contributions
 Lily+0.003
 Emma+0.003
Prin+0.003
 accidentally+0.003
 someone+0.003
Neg logit contributions
umm-0.004
oops-0.004
cks-0.004
 enclosure-0.004
opl-0.004
Token A
Token ID317
Feature activation-4.77e-01

Pos logit contributions
 Lily+0.015
Prin+0.014
 accidentally+0.014
 Emma+0.014
 war+0.013
Neg logit contributions
oops-0.017
 enclosure-0.017
umm-0.017
opl-0.017
cks-0.017
Token pupil
Token ID34047
Feature activation-1.11e+00

Pos logit contributions
 Lily+0.011
Prin+0.010
 Emma+0.010
 Bun+0.010
 Midnight+0.010
Neg logit contributions
umm-0.013
 enclosure-0.012
ines-0.012
opl-0.012
oops-0.012
Token is
Token ID318
Feature activation-3.71e-01

Pos logit contributions
 Lily+0.021
Prin+0.020
 Emma+0.019
 accidentally+0.018
 Benny+0.018
Neg logit contributions
oops-0.035
cks-0.035
 enclosure-0.035
ines-0.034
opl-0.034
Token someone
Token ID2130
Feature activation-6.45e-01

Pos logit contributions
 Lily+0.008
Prin+0.007
 Emma+0.007
 Benny+0.007
 Bun+0.007
Neg logit contributions
cks-0.011
oops-0.011
opl-0.011
 enclosure-0.011
ines-0.011
Token who
Token ID508
Feature activation+8.41e-01

Pos logit contributions
 Lily+0.012
Prin+0.011
 Emma+0.010
 Benny+0.010
 Emily+0.010
Neg logit contributions
cks-0.021
oops-0.021
 enclosure-0.021
ines-0.021
opl-0.020
TokenS
Token ID50
Feature activation+1.17e+00

Pos logit contributions
cks+0.003
 enclosure+0.003
opl+0.003
ines+0.003
oops+0.003
Neg logit contributions
 Lily-0.001
 Emma-0.001
Prin-0.001
 accidentally-0.001
 war-0.001
Tokenara
Token ID3301
Feature activation+6.97e-01

Pos logit contributions
cks+0.038
opl+0.038
umm+0.036
oops+0.036
 enclosure+0.036
Neg logit contributions
 Lily-0.024
Prin-0.022
L-0.020
 Emma-0.020
 accidentally-0.019
Token does
Token ID857
Feature activation+2.97e-01

Pos logit contributions
oops+0.022
 enclosure+0.021
ines+0.021
cks+0.020
iously+0.020
Neg logit contributions
Prin-0.013
 Lily-0.013
 Bun-0.013
 war-0.013
 Benny-0.013
Token not
Token ID407
Feature activation+3.96e-01

Pos logit contributions
cks+0.009
oops+0.009
ines+0.009
opl+0.009
iously+0.008
Neg logit contributions
Prin-0.006
 Lily-0.006
 Bun-0.005
 Emma-0.005
 Benny-0.005
Token like
Token ID588
Feature activation+4.25e-01

Pos logit contributions
ines+0.010
oops+0.009
cks+0.009
asp+0.009
ram+0.008
Neg logit contributions
 Bun-0.010
 Midnight-0.010
 accidentally-0.010
 Benny-0.010
Prin-0.010
Token that
Token ID326
Feature activation-6.56e-01

Pos logit contributions
oops+0.015
cks+0.015
opl+0.015
ines+0.015
 enclosure+0.015
Neg logit contributions
Prin-0.006
 Lily-0.006
 Emma-0.006
 Bun-0.005
 accidentally-0.005
Token.
Token ID13
Feature activation-6.38e-02

Pos logit contributions
Prin+0.014
 Lily+0.014
 Bun+0.013
 Emma+0.013
 Benny+0.013
Neg logit contributions
opl-0.018
oops-0.018
cks-0.018
ines-0.018
umm-0.018
Token She
Token ID1375
Feature activation+5.32e-01

Pos logit contributions
 Lily+0.001
Prin+0.001
 Emma+0.001
 accidentally+0.001
 Bun+0.001
Neg logit contributions
oops-0.002
cks-0.002
umm-0.002
 enclosure-0.002
opl-0.002
Token says
Token ID1139
Feature activation+4.92e-01

Pos logit contributions
oops+0.012
cks+0.012
 enclosure+0.012
ines+0.012
opl+0.012
Neg logit contributions
 Lily-0.015
Prin-0.014
 Emma-0.014
 Bun-0.014
 Benny-0.014
Token,
Token ID11
Feature activation+1.04e+00

Pos logit contributions
oops+0.014
ines+0.013
 cones+0.012
ule+0.012
asp+0.012
Neg logit contributions
Prin-0.010
 Bun-0.010
 Emma-0.009
 Benny-0.009
 Midnight-0.009
Token "
Token ID366
Feature activation-3.18e-01

Pos logit contributions
oops+0.031
 enclosure+0.029
cks+0.029
opl+0.028
iously+0.028
Neg logit contributions
Prin-0.021
 Emma-0.018
 Bun-0.018
 Lily-0.018
 Penny-0.018
Token,
Token ID11
Feature activation+3.41e-01

Pos logit contributions
 Lily+0.032
Prin+0.032
 Emma+0.029
 accidentally+0.028
 war+0.028
Neg logit contributions
oops-0.052
cks-0.052
 enclosure-0.052
umm-0.051
ines-0.051
Token Anna
Token ID11735
Feature activation-4.26e-01

Pos logit contributions
oops+0.010
 enclosure+0.010
umm+0.010
cks+0.010
iously+0.010
Neg logit contributions
Prin-0.007
 Bun-0.006
 accidentally-0.006
 Emma-0.006
 war-0.006
Token.
Token ID13
Feature activation-1.04e+00

Pos logit contributions
 Lily+0.009
Prin+0.008
 Emma+0.008
 accidentally+0.008
 Emily+0.007
Neg logit contributions
oops-0.013
cks-0.013
ines-0.013
 enclosure-0.013
opl-0.012
Token I
Token ID314
Feature activation-1.65e+00

Pos logit contributions
 Lily+0.021
Prin+0.020
 Emma+0.019
 accidentally+0.019
 someone+0.018
Neg logit contributions
oops-0.032
cks-0.031
 enclosure-0.031
umm-0.031
opl-0.031
Token will
Token ID481
Feature activation-5.83e-01

Pos logit contributions
Prin+0.037
 Emma+0.037
 Lily+0.037
 Penny+0.037
 Emily+0.037
Neg logit contributions
oops-0.045
ines-0.045
 enclosure-0.042
ram-0.041
asp-0.040
Token be
Token ID307
Feature activation-1.22e+00

Pos logit contributions
Prin+0.011
 Lily+0.011
 Emma+0.010
 Emily+0.010
 war+0.010
Neg logit contributions
cks-0.017
oops-0.017
 enclosure-0.017
opl-0.017
ines-0.017
Token gentle
Token ID10296
Feature activation-9.90e-01

Pos logit contributions
Prin+0.029
 Emma+0.028
 Lily+0.027
 Bun+0.027
 Benny+0.027
Neg logit contributions
oops-0.033
cks-0.033
opl-0.031
ines-0.030
asp-0.029
Token."
Token ID526
Feature activation+8.36e-01

Pos logit contributions
Prin+0.018
 Lily+0.018
 accidentally+0.017
 Penny+0.016
 Emma+0.016
Neg logit contributions
oops-0.031
 enclosure-0.031
ines-0.031
cks-0.031
umm-0.030
Token Ben
Token ID3932
Feature activation+8.98e-01

Pos logit contributions
 enclosure+0.028
oops+0.028
cks+0.028
opl+0.027
ines+0.027
Neg logit contributions
Prin-0.013
 Lily-0.012
 accidentally-0.012
 war-0.012
 Emma-0.011
Token says
Token ID1139
Feature activation+3.50e-01

Pos logit contributions
oops+0.013
 enclosure+0.013
cks+0.012
ines+0.011
iously+0.011
Neg logit contributions
Prin-0.031
 Bun-0.031
 Lily-0.031
 accidentally-0.031
 Emma-0.030
Token.
Token ID13
Feature activation-9.63e-01

Pos logit contributions
oops+0.009
iously+0.008
cks+0.008
ines+0.008
 enclosure+0.007
Neg logit contributions
Prin-0.010
 accidentally-0.009
 war-0.009
 Bun-0.008
 Penny-0.008
Token her
Token ID607
Feature activation-1.57e+00

Pos logit contributions
Prin+0.009
 Lily+0.008
 accidentally+0.008
 Midnight+0.008
 Emma+0.007
Neg logit contributions
 enclosure-0.021
oops-0.021
cks-0.021
opl-0.020
ines-0.020
Token and
Token ID290
Feature activation-1.20e+00

Pos logit contributions
 Lily+0.025
 Emma+0.022
Prin+0.021
 Emily+0.020
 someone+0.020
Neg logit contributions
cks-0.057
opl-0.056
oops-0.056
umm-0.056
ines-0.055
Token asked
Token ID1965
Feature activation+1.50e-01

Pos logit contributions
Prin+0.026
 war+0.025
 Lily+0.023
 Midnight+0.022
 someone+0.022
Neg logit contributions
oops-0.033
 enclosure-0.033
cks-0.031
iously-0.031
ram-0.030
Token her
Token ID607
Feature activation-8.44e-01

Pos logit contributions
oops+0.005
ines+0.005
iously+0.005
 enclosure+0.005
cks+0.005
Neg logit contributions
Prin-0.002
 war-0.002
 accidentally-0.002
 Lily-0.002
 Midnight-0.002
Token for
Token ID329
Feature activation-5.08e-01

Pos logit contributions
Prin+0.015
 Lily+0.015
 Emma+0.013
 accidentally+0.013
 war+0.013
Neg logit contributions
oops-0.027
 enclosure-0.027
ines-0.027
cks-0.026
umm-0.026
Token the
Token ID262
Feature activation-1.11e-01

Pos logit contributions
Prin+0.012
 Lily+0.010
 accidentally+0.010
 Emma+0.010
 Midnight+0.010
Neg logit contributions
oops-0.014
cks-0.014
 enclosure-0.013
iously-0.013
ines-0.013
Token key
Token ID1994
Feature activation-1.71e-01

Pos logit contributions
 Lily+0.002
 someone+0.002
 Emma+0.002
Prin+0.002
 something+0.002
Neg logit contributions
cks-0.004
opl-0.004
ines-0.004
oops-0.003
umm-0.003
Token.
Token ID13
Feature activation-3.85e-01

Pos logit contributions
 Lily+0.003
Prin+0.003
 Emma+0.003
 accidentally+0.002
 Benny+0.002
Neg logit contributions
oops-0.006
cks-0.006
 enclosure-0.006
opl-0.006
ines-0.006
Token [space]
Token ID220
Feature activation-8.41e-01

Pos logit contributions
Prin+0.007
 war+0.007
 accidentally+0.007
 Lily+0.006
 sounded+0.006
Neg logit contributions
oops-0.012
 enclosure-0.012
umm-0.011
opl-0.011
iously-0.011
Token [newline]
Token ID198
Feature activation-2.11e-01

Pos logit contributions
 Lily+0.011
Prin+0.010
 Emma+0.010
 accidentally+0.009
 war+0.009
Neg logit contributions
oops-0.031
cks-0.031
opl-0.031
 enclosure-0.031
ines-0.030
Token [newline]
Token ID198
Feature activation-2.29e-01

Pos logit contributions
 Lily+0.003
Prin+0.003
 war+0.003
 Emma+0.003
 accidentally+0.003
Neg logit contributions
cks-0.007
oops-0.007
opl-0.007
 enclosure-0.007
ines-0.007
Token "
Token ID366
Feature activation-1.30e+00

Pos logit contributions
Prin+0.017
 war+0.015
 Lily+0.014
 Emma+0.014
 accidentally+0.014
Neg logit contributions
oops-0.026
iously-0.025
 enclosure-0.025
cks-0.024
umm-0.024
TokenPlease
Token ID5492
Feature activation-3.91e-01

Pos logit contributions
 war+0.028
 Lily+0.027
 Emma+0.026
 Emily+0.026
 Benny+0.026
Neg logit contributions
umm-0.034
oops-0.032
ram-0.032
 enclosure-0.032
opl-0.031
Token give
Token ID1577
Feature activation-8.36e-01

Pos logit contributions
Prin+0.007
 Lily+0.006
 Emma+0.006
 sounded+0.006
 Emily+0.006
Neg logit contributions
oops-0.012
 enclosure-0.012
opl-0.012
ines-0.012
cks-0.011
Token me
Token ID502
Feature activation-9.23e-01

Pos logit contributions
Prin+0.016
 Lily+0.016
 sounded+0.015
 war+0.015
 Emma+0.015
Neg logit contributions
oops-0.025
cks-0.024
 enclosure-0.024
ines-0.023
opl-0.023
Token one
Token ID530
Feature activation-1.52e-01

Pos logit contributions
Prin+0.012
 Lily+0.012
 accidentally+0.011
 Emma+0.011
 sounded+0.010
Neg logit contributions
oops-0.034
 enclosure-0.033
cks-0.032
opl-0.032
ines-0.032
Token piece
Token ID3704
Feature activation-8.06e-01

Pos logit contributions
 Lily+0.003
 someone+0.003
 Emma+0.003
 something+0.003
Prin+0.003
Neg logit contributions
cks-0.005
opl-0.005
ines-0.005
umm-0.005
 enclosure-0.005
Token of
Token ID286
Feature activation-1.60e+00

Pos logit contributions
Prin+0.017
 Lily+0.016
 Emma+0.015
 accidentally+0.015
 war+0.015
Neg logit contributions
oops-0.024
ines-0.023
 enclosure-0.023
opl-0.023
umm-0.022
Token candy
Token ID18550
Feature activation-7.99e-01

Pos logit contributions
 Lily+0.028
Prin+0.026
 Emma+0.025
 accidentally+0.024
 Emily+0.024
Neg logit contributions
oops-0.053
cks-0.052
opl-0.052
umm-0.052
 enclosure-0.052
Token.
Token ID13
Feature activation-9.55e-01

Pos logit contributions
 Lily+0.015
Prin+0.014
 Emma+0.013
 accidentally+0.012
 someone+0.012
Neg logit contributions
oops-0.026
cks-0.026
opl-0.026
 enclosure-0.026
umm-0.026
Token I
Token ID314
Feature activation-1.77e+00

Pos logit contributions
Prin+0.019
 Lily+0.019
 Emma+0.018
 accidentally+0.018
 war+0.017
Neg logit contributions
oops-0.029
umm-0.028
 enclosure-0.028
opl-0.027
cks-0.027
Token am
Token ID716
Feature activation-9.78e-01

Pos logit contributions
 Lily+0.041
Prin+0.041
 Emma+0.039
 Emily+0.038
 Bun+0.038
Neg logit contributions
oops-0.048
ines-0.047
 enclosure-0.046
cks-0.044
opl-0.043
Token be
Token ID307
Feature activation-7.30e-01

Pos logit contributions
oops+0.017
 enclosure+0.016
umm+0.016
cks+0.016
opl+0.016
Neg logit contributions
Prin-0.012
 Lily-0.012
 Emma-0.011
 someone-0.010
 Bun-0.010
Token brave
Token ID14802
Feature activation+5.74e-01

Pos logit contributions
Prin+0.016
 Lily+0.015
 Emma+0.015
 Bun+0.015
 Benny+0.015
Neg logit contributions
oops-0.021
cks-0.020
opl-0.019
ines-0.019
umm-0.019
Token and
Token ID290
Feature activation+1.04e+00

Pos logit contributions
oops+0.020
ines+0.020
 enclosure+0.020
umm+0.020
iously+0.019
Neg logit contributions
Prin-0.008
 Lily-0.008
 Emma-0.007
 someone-0.007
 accidentally-0.007
Token try
Token ID1949
Feature activation-9.56e-01

Pos logit contributions
oops+0.030
 enclosure+0.025
umm+0.024
 cones+0.024
iously+0.024
Neg logit contributions
Prin-0.025
 Emma-0.022
 Midnight-0.022
 Bun-0.022
 Penny-0.022
Token to
Token ID284
Feature activation+1.76e-01

Pos logit contributions
Prin+0.019
 Emma+0.017
 Lily+0.016
 Midnight+0.016
L+0.016
Neg logit contributions
oops-0.029
cks-0.029
ines-0.029
 enclosure-0.028
unks-0.028
Token fix
Token ID4259
Feature activation-2.26e+00

Pos logit contributions
cks+0.006
oops+0.006
opl+0.006
umm+0.005
 enclosure+0.005
Neg logit contributions
 Lily-0.004
Prin-0.003
 Emma-0.003
 accidentally-0.003
 someone-0.003
Token the
Token ID262
Feature activation+3.73e-01

Pos logit contributions
Prin+0.035
 Emma+0.027
 Midnight+0.026
 accidentally+0.026
 Emily+0.026
Neg logit contributions
cks-0.084
oops-0.081
opl-0.079
 enclosure-0.078
 cones-0.077
Token pear
Token ID25286
Feature activation-2.37e-01

Pos logit contributions
cks+0.012
opl+0.012
oops+0.011
 enclosure+0.011
ines+0.011
Neg logit contributions
 Lily-0.009
 someone-0.007
 accidentally-0.007
Prin-0.007
 Emma-0.007
Token.
Token ID13
Feature activation-7.57e-02

Pos logit contributions
 Lily+0.004
Prin+0.004
 Emma+0.004
 someone+0.004
 accidentally+0.004
Neg logit contributions
cks-0.008
opl-0.008
oops-0.008
umm-0.008
 enclosure-0.008
Token He
Token ID679
Feature activation-1.36e-01

Pos logit contributions
 Lily+0.001
Prin+0.001
 Emma+0.001
 accidentally+0.001
 someone+0.001
Neg logit contributions
oops-0.002
umm-0.002
 enclosure-0.002
cks-0.002
opl-0.002
Token used
Token ID973
Feature activation-4.41e-01

Pos logit contributions
 Lily+0.002
Prin+0.002
 Emma+0.002
 someone+0.002
 accidentally+0.002
Neg logit contributions
oops-0.005
cks-0.005
 enclosure-0.005
opl-0.005
umm-0.005
Token "
Token ID366
Feature activation-1.73e+00

Pos logit contributions
Prin+0.021
 war+0.019
 Penny+0.017
 Lily+0.017
 Emma+0.017
Neg logit contributions
iously-0.025
oops-0.025
cks-0.023
 enclosure-0.023
umm-0.022
TokenDon
Token ID3987
Feature activation-7.35e-01

Pos logit contributions
 war+0.039
 Lily+0.038
 Emma+0.037
 Benny+0.036
 Penny+0.036
Neg logit contributions
umm-0.044
cks-0.042
opl-0.041
iring-0.041
umbling-0.041
Token't
Token ID470
Feature activation-1.65e+00

Pos logit contributions
 Lily+0.017
Prin+0.017
 accidentally+0.016
 Emma+0.016
 Penny+0.016
Neg logit contributions
opl-0.018
umm-0.018
ines-0.018
cks-0.018
oops-0.018
Token worry
Token ID5490
Feature activation-1.62e+00

Pos logit contributions
Prin+0.040
 Lily+0.036
 Midnight+0.035
 Penny+0.035
 Emma+0.035
Neg logit contributions
iously-0.040
umm-0.039
cks-0.039
oops-0.038
ram-0.038
Token,
Token ID11
Feature activation-7.08e-01

Pos logit contributions
Prin+0.034
 sounded+0.027
 practiced+0.027
 war+0.026
 accidentally+0.026
Neg logit contributions
ummy-0.049
ines-0.048
iously-0.047
umm-0.046
oops-0.046
Token you
Token ID345
Feature activation-2.79e+00

Pos logit contributions
Prin+0.018
 practiced+0.013
 sounded+0.013
Da+0.012
 war+0.012
Neg logit contributions
umm-0.018
ummy-0.017
oops-0.017
iously-0.016
 pounds-0.016
Token can
Token ID460
Feature activation-7.35e-01

Pos logit contributions
Prin+0.059
 Lily+0.055
Da+0.053
 Emma+0.052
 Midnight+0.052
Neg logit contributions
oops-0.082
ines-0.080
umm-0.074
asp-0.074
anda-0.073
Token build
Token ID1382
Feature activation-6.60e-01

Pos logit contributions
 Lily+0.012
Prin+0.011
 Emma+0.011
 Emily+0.010
 accidentally+0.010
Neg logit contributions
oops-0.025
cks-0.025
opl-0.025
 enclosure-0.025
umm-0.025
Token it
Token ID340
Feature activation-7.06e-01

Pos logit contributions
Prin+0.010
 Midnight+0.009
 Penny+0.009
 Emma+0.008
 sounded+0.008
Neg logit contributions
cks-0.024
oops-0.023
opl-0.023
 tallest-0.022
ines-0.022
Token again
Token ID757
Feature activation-1.67e+00

Pos logit contributions
Prin+0.011
 Emma+0.010
 Lily+0.010
 Penny+0.010
 Midnight+0.010
Neg logit contributions
oops-0.025
cks-0.025
ines-0.024
opl-0.024
 enclosure-0.024
Token.
Token ID13
Feature activation-1.09e+00

Pos logit contributions
Prin+0.042
 Emma+0.041
 Lily+0.041
 Penny+0.040
 Midnight+0.040
Neg logit contributions
cks-0.041
ines-0.040
oops-0.040
opl-0.040
 enclosure-0.039
Token's
Token ID338
Feature activation+3.48e-01

Pos logit contributions
umm+0.025
cks+0.024
opl+0.024
oops+0.024
ines+0.024
Neg logit contributions
 Lily-0.011
 accidentally-0.010
Prin-0.010
 Emma-0.010
 someone-0.009
Token dad
Token ID9955
Feature activation-1.33e-01

Pos logit contributions
cks+0.011
umm+0.011
opl+0.011
 enclosure+0.011
ines+0.010
Neg logit contributions
 Lily-0.007
 someone-0.006
 Emma-0.006
 accidentally-0.006
Prin-0.006
Token said
Token ID531
Feature activation-7.99e-01

Pos logit contributions
 Lily+0.002
 accidentally+0.002
 Emma+0.002
 they+0.002
 sounded+0.002
Neg logit contributions
umm-0.005
opl-0.005
oops-0.005
cks-0.005
 enclosure-0.005
Token,
Token ID11
Feature activation-6.15e-01

Pos logit contributions
Prin+0.018
L+0.015
 Midnight+0.015
 Penny+0.014
 Emily+0.014
Neg logit contributions
oops-0.024
iously-0.022
 enclosure-0.021
 cones-0.021
ule-0.021
Token "
Token ID366
Feature activation-6.97e-01

Pos logit contributions
Prin+0.013
 war+0.011
 Lily+0.011
 Emma+0.011
 Midnight+0.011
Neg logit contributions
oops-0.018
 enclosure-0.017
iously-0.017
cks-0.016
ule-0.016
TokenLet
Token ID5756
Feature activation-2.27e+00

Pos logit contributions
 Lily+0.013
Prin+0.012
 Emma+0.012
 accidentally+0.011
 Emily+0.011
Neg logit contributions
oops-0.022
cks-0.022
umm-0.022
opl-0.022
 enclosure-0.022
Token's
Token ID338
Feature activation-1.08e+00

Pos logit contributions
Prin+0.043
 Lily+0.042
 Emma+0.039
 war+0.038
 Penny+0.038
Neg logit contributions
oops-0.066
opl-0.065
 enclosure-0.065
cks-0.065
ram-0.063
Token test
Token ID1332
Feature activation-1.60e+00

Pos logit contributions
 Lily+0.019
Prin+0.019
 Emma+0.017
 Emily+0.017
 Penny+0.017
Neg logit contributions
oops-0.035
 enclosure-0.035
cks-0.035
opl-0.035
umm-0.034
Token out
Token ID503
Feature activation-2.63e-01

Pos logit contributions
Prin+0.030
 Lily+0.025
 Midnight+0.024
 Emma+0.024
L+0.024
Neg logit contributions
cks-0.053
oops-0.052
 enclosure-0.052
iously-0.051
ines-0.050
Token our
Token ID674
Feature activation+6.08e-01

Pos logit contributions
 Lily+0.005
Prin+0.005
 Emma+0.005
 Midnight+0.004
 Penny+0.004
Neg logit contributions
cks-0.008
oops-0.008
 enclosure-0.008
ines-0.008
opl-0.008
Token new
Token ID649
Feature activation-2.77e-01

Pos logit contributions
umm+0.019
cks+0.019
oops+0.019
asp+0.019
ines+0.018
Neg logit contributions
 Lily-0.012
 war-0.012
 someone-0.011
 band-0.011
 something-0.010
Token how
Token ID703
Feature activation+2.72e-01

Pos logit contributions
iously+0.019
 enclosure+0.019
cks+0.019
ines+0.019
oops+0.018
Neg logit contributions
Prin-0.018
L-0.015
 Bun-0.015
 Lily-0.015
 war-0.014
Token much
Token ID881
Feature activation-2.10e+00

Pos logit contributions
oops+0.008
ines+0.008
cks+0.007
asp+0.007
opl+0.007
Neg logit contributions
Prin-0.006
 sounded-0.005
 villain-0.005
 Lily-0.005
 Midnight-0.005
Token she
Token ID673
Feature activation+2.37e-01

Pos logit contributions
Prin+0.047
 accidentally+0.037
L+0.036
T+0.035
 Bun+0.035
Neg logit contributions
 cones-0.055
umm-0.055
oops-0.054
ines-0.054
opl-0.053
Token loved
Token ID6151
Feature activation-2.83e-01

Pos logit contributions
oops+0.008
ines+0.008
cks+0.008
 enclosure+0.008
opl+0.008
Neg logit contributions
 Lily-0.004
Prin-0.004
 Emma-0.004
 someone-0.004
 war-0.004
Token playing
Token ID2712
Feature activation-9.53e-01

Pos logit contributions
Prin+0.004
 Lily+0.004
L+0.004
 accidentally+0.003
 Emma+0.003
Neg logit contributions
cks-0.010
ines-0.010
opl-0.010
 enclosure-0.009
oops-0.009
Token doctor
Token ID6253
Feature activation-2.27e+00

Pos logit contributions
Prin+0.022
 accidentally+0.020
L+0.017
 Penny+0.017
Da+0.017
Neg logit contributions
cks-0.024
opl-0.024
ines-0.023
 enclosure-0.022
cles-0.022
Token and
Token ID290
Feature activation-2.58e-01

Pos logit contributions
Prin+0.046
 Lily+0.042
 accidentally+0.041
L+0.039
 war+0.037
Neg logit contributions
 enclosure-0.068
opl-0.067
oops-0.066
ines-0.066
cks-0.065
Token decided
Token ID3066
Feature activation+6.60e-01

Pos logit contributions
Prin+0.006
L+0.006
 Lily+0.005
 Bun+0.005
 Midnight+0.005
Neg logit contributions
oops-0.006
 enclosure-0.006
iously-0.006
ines-0.006
opl-0.006
Token to
Token ID284
Feature activation+7.17e-01

Pos logit contributions
oops+0.019
iously+0.018
 enclosure+0.018
ines+0.018
cks+0.017
Neg logit contributions
Prin-0.014
 Lily-0.012
 Emma-0.012
L-0.012
 Midnight-0.011
Token hang
Token ID8181
Feature activation-5.13e-01

Pos logit contributions
 enclosure+0.020
opl+0.020
oops+0.019
ines+0.018
cks+0.018
Neg logit contributions
Prin-0.016
 Lily-0.016
 Emma-0.015
 Emily-0.014
 accidentally-0.014
Token her
Token ID607
Feature activation+8.76e-01

Pos logit contributions
 accidentally+0.010
Prin+0.010
 war+0.008
 didn+0.008
 practiced+0.008
Neg logit contributions
oops-0.016
cks-0.016
opl-0.015
ines-0.015
ires-0.014
Token.
Token ID13
Feature activation-3.42e-01

Pos logit contributions
 Lily+0.007
Prin+0.007
 Emma+0.006
 accidentally+0.006
 Emily+0.006
Neg logit contributions
oops-0.010
ines-0.010
 enclosure-0.010
cks-0.010
opl-0.010
Tokenâ
Token ID22940
Feature activation-6.69e-01

Pos logit contributions
 Lily+0.008
Prin+0.008
 accidentally+0.007
 war+0.007
 Emma+0.007
Neg logit contributions
oops-0.009
 enclosure-0.009
opl-0.009
umm-0.009
cks-0.009
Token
Token ID26391
Feature activation-7.16e-01

Pos logit contributions
 Lily+0.018
Prin+0.017
 Emma+0.016
 accidentally+0.016
 someone+0.016
Neg logit contributions
oops-0.016
cks-0.016
opl-0.016
umm-0.016
 enclosure-0.016
Token The
Token ID383
Feature activation-5.21e-01

Pos logit contributions
 Lily+0.014
Prin+0.013
 Emma+0.012
 accidentally+0.012
 Emily+0.012
Neg logit contributions
cks-0.023
oops-0.023
opl-0.023
 enclosure-0.023
ines-0.022
Token two
Token ID734
Feature activation-5.62e-01

Pos logit contributions
 Lily+0.010
Prin+0.009
 someone+0.008
 Emma+0.008
 Benny+0.008
Neg logit contributions
cks-0.017
oops-0.017
ines-0.017
umm-0.017
opl-0.017
Token of
Token ID286
Feature activation-2.07e+00

Pos logit contributions
 Lily+0.012
Prin+0.010
 Emma+0.010
 someone+0.010
 Benny+0.010
Neg logit contributions
opl-0.017
cks-0.017
umm-0.017
oops-0.017
 enclosure-0.017
Token them
Token ID606
Feature activation+6.07e-01

Pos logit contributions
Prin+0.028
 Lily+0.027
 Emma+0.027
 accidentally+0.026
 Emily+0.024
Neg logit contributions
cks-0.075
opl-0.073
ines-0.073
 enclosure-0.073
oops-0.073
Token were
Token ID547
Feature activation-1.19e-01

Pos logit contributions
cks+0.021
umm+0.021
oops+0.021
opl+0.021
 enclosure+0.021
Neg logit contributions
 Lily-0.010
Prin-0.009
 Emma-0.008
 accidentally-0.008
 Benny-0.008
Token very
Token ID845
Feature activation+5.72e-01

Pos logit contributions
 Lily+0.002
 Emma+0.002
 Benny+0.002
 someone+0.002
 Emily+0.002
Neg logit contributions
umm-0.004
 enclosure-0.004
oops-0.004
opl-0.004
ines-0.004
Token happy
Token ID3772
Feature activation+2.74e-01

Pos logit contributions
opl+0.019
umm+0.019
 enclosure+0.019
cks+0.019
ines+0.018
Neg logit contributions
 Lily-0.011
 Emma-0.009
Prin-0.009
 they-0.009
 Benny-0.009
Token as
Token ID355
Feature activation+5.55e-01

Pos logit contributions
oops+0.009
cks+0.008
 enclosure+0.008
ines+0.008
umm+0.008
Neg logit contributions
 Lily-0.005
Prin-0.005
 Emma-0.005
 accidentally-0.005
 Emily-0.004