TOP ACTIVATIONS
MAX = 4.101

even see other people having pic n ics .
the berries and have something delicious to eat . But first
saw lots of people having pic n ics and walking their
fun . Lily realized that pic n ics were not so
went on adventures , had pic n ics , and even
the warm sunshine and having pic n ics with her family
Can I cut the box ?" Tom asks Jen .
see the bird and have pic n ics . <|endoftext|> Once
he wanted for his 3 rd birthday was to try something
outside . When the cup cakes were done , they sat
near the cliff and have pic n ics . One day
When the cup cakes were done baking , they
what is in the sack ," Tim says .
's look inside the sack ," Tom says .
bodies and make a tower ," Tom says . " Then
and he wanted to jump in it . Emily said "
's open it and see ," Tim says . He tries
many steps to the hole ," Tom says .
Let 's drink some water ," Tim says .
When the cup cakes were done , Mill ie

MIN ACTIVATIONS
MIN = -4.563

should be compassionate . Comp assion ate means you care about
time there was a little av ac ado . It was
: " Nice job , Elliot !" Elliot had a lot
, there was a loud av ac ado . It was
Once , there was a copper dog who was very exciting
there were two dogs , Pick and Race . They were
grateful fox es thanked the copper for his help . He
, there was an adventurous copper . He was always curious
playing together under the big oak tree . It was like
happened . He saw the broken axe and the rock .
cater pillar climbed up the oak tree , sure - footed
he came across a small ac orn that was lying on
there were two friends named Walter and L ila . They
wrong . She sees the broken brown engine and the crying
As the farmer scattered the av oc ados , he noticed
Suddenly , she spotted an ac orn that had fallen from
time there was a strong copper box . It was very
tree was sad that the av oc ados were being taken
time , there was a copper bucket . It was empty
here ? You are tresp assing ! You are thieves !

INTERVAL 1.935 - 4.101
CONTAINS 0.902%

respect other animals and people ." Jen said . She was
big wheel . Her mom says yes , and they put
agreed , and they went under a big tree to cool
water . She wanted something warm to feed her .
her . He remembered what Mom said . He ran to

INTERVAL -0.231 - 1.935
CONTAINS 60.668%

, " Everyone is different and special in their own way
. They saw the big dog lying on a rug .
. " The little girl wondered who would
!" They walked to the bridge and looked at
back to normal . <|endoftext|> Once upon a time there was

INTERVAL -2.397 - -0.231
CONTAINS 38.063%

very big and friendly . You will like them ."
big world . She saw the sun and the flowers and
. She searched through the forest , high and low ,
is the last thing you want , remember that ".
, Ben . He 's gone now . He was just

INTERVAL -4.563 - -2.397
CONTAINS 0.367%

scary . Later that day , Lily 's mom
were having so much fun . April was so excited ,
in a hurry . From that day forward , the cat
. From that day on , the musician
time . Lucy was happy again . Mia and Lucy made
Token even
Token ID772
Feature activation+7.55e-01

Pos logit contributions
 broom+0.004
 Rat+0.004
 spider+0.004
 flashlight+0.004
 drill+0.003
Neg logit contributions
ams-0.007
ishment-0.006
rimp-0.006
ited-0.006
ackers-0.006
Token see
Token ID766
Feature activation+1.61e-01

Pos logit contributions
ement+0.021
umbs+0.020
ams+0.020
aws+0.020
ishment+0.019
Neg logit contributions
ray-0.015
 drill-0.013
 scientist-0.013
 spider-0.013
 if-0.012
Token other
Token ID584
Feature activation+5.42e-01

Pos logit contributions
ited+0.005
ackers+0.005
umbs+0.004
rimp+0.004
 swapped+0.004
Neg logit contributions
ray-0.003
 spider-0.003
 living-0.003
 Rat-0.002
 scientist-0.002
Token people
Token ID661
Feature activation-3.81e-01

Pos logit contributions
 swapped+0.014
rimp+0.014
ackers+0.014
ams+0.013
ited+0.013
Neg logit contributions
ray-0.011
 spider-0.010
 Rat-0.009
 broom-0.009
 mouse-0.009
Token having
Token ID1719
Feature activation+8.24e-01

Pos logit contributions
ray+0.006
 living+0.006
 scientist+0.006
ous+0.005
 spider+0.005
Neg logit contributions
 tongues-0.011
agged-0.011
ishment-0.011
ams-0.011
 crust-0.011
Token pic
Token ID8301
Feature activation+4.10e+00

Pos logit contributions
ackers+0.022
ishment+0.022
ited+0.022
glers+0.021
lights+0.021
Neg logit contributions
 living-0.015
 to-0.015
 broom-0.014
 spider-0.013
 if-0.013
Tokenn
Token ID77
Feature activation-2.94e-01

Pos logit contributions
ishment+0.085
aws+0.083
ams+0.081
 tongues+0.081
 swapped+0.079
Neg logit contributions
ray-0.100
 scientist-0.090
arsh-0.089
 drill-0.088
 broom-0.087
Tokenics
Token ID873
Feature activation+1.41e+00

Pos logit contributions
ray+0.008
 scientist+0.007
 drill+0.007
arsh+0.007
 broom+0.007
Neg logit contributions
 tongues-0.006
ishment-0.006
 swapped-0.006
 sav-0.006
 crust-0.005
Token.
Token ID13
Feature activation+3.90e-01

Pos logit contributions
 tongues+0.044
umbs+0.042
ited+0.042
 swapped+0.041
aney+0.041
Neg logit contributions
 scientist-0.022
ray-0.021
 if-0.020
 living-0.018
 for-0.018
Token
Token ID198
Feature activation+6.67e-01

Pos logit contributions
glers+0.009
 ne+0.009
ited+0.009
ams+0.009
 tongues+0.009
Neg logit contributions
ray-0.009
 scientist-0.008
 Monster-0.008
 Rat-0.008
oris-0.007
Token
Token ID198
Feature activation+7.58e-01

Pos logit contributions
aws+0.014
ams+0.014
 swapped+0.014
 tongues+0.014
 sav+0.014
Neg logit contributions
ray-0.017
 scientist-0.015
 drill-0.014
 Rat-0.014
oris-0.014
Token the
Token ID262
Feature activation+6.94e-01

Pos logit contributions
 swapped+0.009
ishment+0.009
 tongues+0.009
aws+0.009
ams+0.009
Neg logit contributions
ray-0.007
 scientist-0.006
 Rat-0.006
 spider-0.006
oris-0.006
Token berries
Token ID36322
Feature activation+4.52e-01

Pos logit contributions
ams+0.015
 swapped+0.015
ishment+0.015
 tongues+0.015
aws+0.015
Neg logit contributions
ray-0.017
 scientist-0.016
 Rat-0.015
 drill-0.015
 spider-0.015
Token and
Token ID290
Feature activation+1.30e+00

Pos logit contributions
 tongues+0.013
 swapped+0.013
ishment+0.013
ams+0.013
 unison+0.013
Neg logit contributions
ray-0.008
 scientist-0.007
 if-0.006
 Rat-0.006
cher-0.006
Token have
Token ID423
Feature activation+2.27e+00

Pos logit contributions
ement+0.037
agged+0.035
ams+0.035
aws+0.035
umbs+0.034
Neg logit contributions
ray-0.024
 scientist-0.024
 Rat-0.023
 flashlight-0.022
 broom-0.022
Token something
Token ID1223
Feature activation+2.34e+00

Pos logit contributions
ishment+0.079
lights+0.078
glers+0.076
 tongues+0.075
ilt+0.075
Neg logit contributions
 to-0.026
ray-0.026
 Rat-0.024
 if-0.022
 living-0.022
Token delicious
Token ID12625
Feature activation+4.08e+00

Pos logit contributions
ishment+0.074
ople+0.071
 unison+0.071
umbs+0.070
folk+0.069
Neg logit contributions
ray-0.040
 if-0.034
 scientist-0.034
 for-0.033
 to-0.033
Token to
Token ID284
Feature activation+2.26e+00

Pos logit contributions
ishment+0.137
 tongues+0.134
 swapped+0.134
ited+0.134
ized+0.134
Neg logit contributions
ray-0.050
 scientist-0.045
 if-0.039
 for-0.038
 broom-0.038
Token eat
Token ID4483
Feature activation+1.90e+00

Pos logit contributions
lights+0.064
ams+0.063
ishment+0.063
pool+0.059
aws+0.058
Neg logit contributions
ray-0.040
 spider-0.040
 Rat-0.040
 drill-0.038
 scientist-0.037
Token.
Token ID13
Feature activation+1.27e+00

Pos logit contributions
 swapped+0.063
ishment+0.062
 tongues+0.061
ited+0.061
 unison+0.059
Neg logit contributions
 if-0.025
 for-0.024
ray-0.024
 scientist-0.020
 to-0.020
Token But
Token ID887
Feature activation+1.22e+00

Pos logit contributions
ilt+0.038
glers+0.038
 tongues+0.038
ited+0.037
ams+0.036
Neg logit contributions
ray-0.024
 Rat-0.020
 scientist-0.018
ometer-0.017
 drill-0.017
Token first
Token ID717
Feature activation+1.19e+00

Pos logit contributions
umbs+0.034
aney+0.032
glers+0.032
aws+0.032
 tongues+0.032
Neg logit contributions
ray-0.024
 Rat-0.022
 scientist-0.021
 drill-0.021
 Monster-0.020
Token saw
Token ID2497
Feature activation-2.84e-01

Pos logit contributions
ams+0.007
ishment+0.007
 tongues+0.007
 crust+0.007
glers+0.007
Neg logit contributions
ray-0.005
 scientist-0.005
 Rat-0.004
 Monster-0.004
 spider-0.004
Token lots
Token ID6041
Feature activation+4.00e-01

Pos logit contributions
ray+0.006
 scientist+0.006
 drill+0.005
 Rat+0.005
oris+0.005
Neg logit contributions
ams-0.007
aws-0.007
 swapped-0.007
 tongues-0.007
ishment-0.007
Token of
Token ID286
Feature activation+4.42e-01

Pos logit contributions
ishment+0.013
 tongues+0.013
ackers+0.013
 argued+0.012
 swapped+0.012
Neg logit contributions
ray-0.006
 spider-0.005
 scientist-0.005
 living-0.005
ing-0.004
Token people
Token ID661
Feature activation-2.08e-01

Pos logit contributions
ishment+0.010
ams+0.010
 tongues+0.010
 crust+0.010
umbs+0.010
Neg logit contributions
ray-0.010
 scientist-0.010
 spider-0.009
 broom-0.009
 flashlight-0.009
Token having
Token ID1719
Feature activation+7.90e-01

Pos logit contributions
ray+0.003
 living+0.003
 scientist+0.003
 spider+0.003
 if+0.002
Neg logit contributions
 tongues-0.007
ishment-0.006
 crust-0.006
umbs-0.006
ams-0.006
Token pic
Token ID8301
Feature activation+4.01e+00

Pos logit contributions
ishment+0.021
ited+0.021
ackers+0.020
 crust+0.020
ams+0.020
Neg logit contributions
 living-0.015
 broom-0.015
 to-0.015
 spider-0.014
 Rat-0.014
Tokenn
Token ID77
Feature activation-4.89e-01

Pos logit contributions
ishment+0.076
aws+0.075
ams+0.074
 tongues+0.073
 swapped+0.072
Neg logit contributions
ray-0.106
 scientist-0.098
 drill-0.094
arsh-0.093
 broom-0.093
Tokenics
Token ID873
Feature activation+1.24e+00

Pos logit contributions
ray+0.014
 scientist+0.012
 drill+0.012
 broom+0.012
oris+0.012
Neg logit contributions
 tongues-0.009
ishment-0.009
 sav-0.009
 crust-0.009
 swapped-0.009
Token and
Token ID290
Feature activation+6.10e-01

Pos logit contributions
 tongues+0.038
umbs+0.037
ited+0.036
 swapped+0.036
ishment+0.036
Neg logit contributions
 scientist-0.020
ray-0.019
 if-0.018
 living-0.017
 spider-0.016
Token walking
Token ID6155
Feature activation-5.29e-01

Pos logit contributions
umbs+0.016
ited+0.016
ishment+0.016
ams+0.016
ement+0.016
Neg logit contributions
 scientist-0.011
ray-0.011
 Rat-0.010
 spider-0.010
 drill-0.010
Token their
Token ID511
Feature activation-1.94e-01

Pos logit contributions
ray+0.009
 scientist+0.008
 Monster+0.007
 flashlight+0.007
oris+0.007
Neg logit contributions
 tongues-0.015
ishment-0.015
umbs-0.015
 crust-0.015
ited-0.015
Token fun
Token ID1257
Feature activation+6.54e-01

Pos logit contributions
ishment+0.041
rily+0.038
ams+0.036
 wrists+0.035
 tongues+0.035
Neg logit contributions
ray-0.031
 spider-0.029
 scientist-0.028
 living-0.028
 Rat-0.027
Token.
Token ID13
Feature activation-9.20e-01

Pos logit contributions
ishment+0.017
 swapped+0.016
aney+0.016
ited+0.016
 tongues+0.016
Neg logit contributions
ray-0.013
 scientist-0.013
 spider-0.012
 broom-0.011
 Monster-0.011
Token Lily
Token ID20037
Feature activation-7.15e-01

Pos logit contributions
 Rat+0.017
ray+0.017
 scientist+0.015
 Monster+0.014
 spider+0.014
Neg logit contributions
ited-0.027
glers-0.026
ams-0.026
ized-0.026
 contain-0.025
Token realized
Token ID6939
Feature activation-8.80e-01

Pos logit contributions
ray+0.015
 scientist+0.014
 Rat+0.013
 spider+0.013
 Monster+0.012
Neg logit contributions
ishment-0.018
ams-0.018
 tongues-0.018
umbs-0.018
aws-0.018
Token that
Token ID326
Feature activation-2.51e+00

Pos logit contributions
ray+0.018
 scientist+0.017
 Rat+0.016
 spider+0.016
 drill+0.015
Neg logit contributions
 tongues-0.022
ams-0.022
 swapped-0.022
ishment-0.022
 crust-0.022
Token pic
Token ID8301
Feature activation+4.00e+00

Pos logit contributions
ray+0.057
 scientist+0.055
 Rat+0.051
 spider+0.050
 drill+0.050
Neg logit contributions
ams-0.058
aws-0.058
 swapped-0.057
ishment-0.057
 tongues-0.057
Tokenn
Token ID77
Feature activation+4.79e-01

Pos logit contributions
ams+0.088
umbs+0.087
ishment+0.085
 sweaty+0.084
aws+0.084
Neg logit contributions
ray-0.092
arsh-0.079
 Rat-0.077
 scientist-0.076
oris-0.076
Tokenics
Token ID873
Feature activation+1.02e+00

Pos logit contributions
ishment+0.009
 tongues+0.009
 sav+0.009
 crust+0.009
ams+0.009
Neg logit contributions
ray-0.013
 scientist-0.012
 spider-0.011
oris-0.011
 Monster-0.011
Token were
Token ID547
Feature activation+7.80e-01

Pos logit contributions
ishment+0.033
ams+0.032
umbs+0.032
 tongues+0.032
 crust+0.031
Neg logit contributions
ray-0.015
 scientist-0.014
 spider-0.013
 Monster-0.012
 living-0.011
Token not
Token ID407
Feature activation-1.52e+00

Pos logit contributions
ishment+0.022
umbs+0.021
 tongues+0.021
ams+0.021
aney+0.021
Neg logit contributions
 scientist-0.014
 spider-0.013
 living-0.013
ray-0.013
 Rat-0.013
Token so
Token ID523
Feature activation+6.81e-01

Pos logit contributions
 scientist+0.027
 if+0.027
 spider+0.026
 mouse+0.025
 living+0.024
Neg logit contributions
ement-0.042
 tongues-0.042
ishment-0.040
umbs-0.040
ited-0.040
Token went
Token ID1816
Feature activation-6.06e-01

Pos logit contributions
ray+0.015
 scientist+0.014
 drill+0.014
oris+0.013
 broom+0.013
Neg logit contributions
 swapped-0.007
aws-0.007
ams-0.007
 sav-0.007
 tongues-0.007
Token on
Token ID319
Feature activation-5.57e-02

Pos logit contributions
ray+0.014
 scientist+0.012
 drill+0.011
oris+0.011
 spider+0.011
Neg logit contributions
aws-0.015
ams-0.015
glers-0.015
 swapped-0.015
 sav-0.014
Token adventures
Token ID17545
Feature activation-1.51e+00

Pos logit contributions
ray+0.001
 scientist+0.001
 drill+0.001
oris+0.001
 Rat+0.001
Neg logit contributions
aws-0.001
ams-0.001
 swapped-0.001
 tongues-0.001
ishment-0.001
Token,
Token ID11
Feature activation-1.30e+00

Pos logit contributions
ray+0.037
 scientist+0.034
 drill+0.032
 Rat+0.031
oris+0.031
Neg logit contributions
aws-0.033
ams-0.033
 swapped-0.033
 tongues-0.033
ishment-0.032
Token had
Token ID550
Feature activation-1.48e-01

Pos logit contributions
ray+0.034
 scientist+0.031
 drill+0.029
oris+0.028
 spider+0.028
Neg logit contributions
aws-0.027
ams-0.027
 swapped-0.027
 sav-0.027
 crust-0.026
Token pic
Token ID8301
Feature activation+3.92e+00

Pos logit contributions
 Rat+0.002
ray+0.002
 scientist+0.002
 living+0.002
 mouse+0.002
Neg logit contributions
ishment-0.004
ited-0.004
 tongues-0.004
lights-0.004
ement-0.004
Tokenn
Token ID77
Feature activation-1.35e+00

Pos logit contributions
ishment+0.068
ams+0.067
 tongues+0.066
aws+0.065
 swapped+0.064
Neg logit contributions
ray-0.111
 scientist-0.103
 Rat-0.099
oris-0.098
 drill-0.098
Tokenics
Token ID873
Feature activation+1.08e+00

Pos logit contributions
ray+0.040
 scientist+0.038
 drill+0.036
 Rat+0.036
oris+0.035
Neg logit contributions
ishment-0.022
 tongues-0.022
ams-0.021
 swapped-0.021
 crust-0.021
Token,
Token ID11
Feature activation-9.33e-01

Pos logit contributions
 tongues+0.031
ishment+0.030
ited+0.030
aney+0.030
umbs+0.029
Neg logit contributions
ray-0.018
 scientist-0.018
 Rat-0.016
 Monster-0.016
 broom-0.016
Token and
Token ID290
Feature activation-5.00e-01

Pos logit contributions
ray+0.020
 scientist+0.019
 Rat+0.018
 drill+0.017
 spider+0.017
Neg logit contributions
ams-0.023
 tongues-0.022
ishment-0.022
aws-0.022
 swapped-0.022
Token even
Token ID772
Feature activation-9.63e-01

Pos logit contributions
ray+0.012
 scientist+0.012
 Rat+0.012
 drill+0.011
 spider+0.011
Neg logit contributions
ishment-0.010
 tongues-0.010
aws-0.010
ams-0.010
aney-0.010
Token the
Token ID262
Feature activation+9.84e-02

Pos logit contributions
ams+0.020
 swapped+0.020
ited+0.020
ishment+0.019
aws+0.019
Neg logit contributions
 spider-0.010
 scientist-0.010
 Rat-0.009
 broom-0.009
 drill-0.009
Token warm
Token ID5814
Feature activation+4.77e-01

Pos logit contributions
ams+0.002
ishment+0.002
 tongues+0.002
aws+0.002
 swapped+0.002
Neg logit contributions
ray-0.002
 scientist-0.002
 spider-0.002
 broom-0.002
 drill-0.002
Token sunshine
Token ID34488
Feature activation-5.67e-01

Pos logit contributions
 swapped+0.010
 argued+0.010
ishment+0.010
aws+0.010
ited+0.010
Neg logit contributions
ray-0.012
 scientist-0.011
 spider-0.011
 flashlight-0.011
 broom-0.011
Token and
Token ID290
Feature activation-3.23e-01

Pos logit contributions
 scientist+0.010
ray+0.009
 spider+0.009
 Monster+0.009
 mouse+0.008
Neg logit contributions
 swapped-0.016
ited-0.016
aney-0.016
ams-0.016
 tongues-0.016
Token having
Token ID1719
Feature activation+5.37e-01

Pos logit contributions
ray+0.007
 scientist+0.006
 Rat+0.006
 drill+0.006
 spider+0.006
Neg logit contributions
aws-0.008
ams-0.008
umbs-0.008
ishment-0.008
 swapped-0.008
Token pic
Token ID8301
Feature activation+3.89e+00

Pos logit contributions
ishment+0.015
ited+0.014
ams+0.013
melon+0.013
ole+0.013
Neg logit contributions
 Rat-0.011
 spider-0.010
 living-0.010
 broom-0.010
 Cat-0.009
Tokenn
Token ID77
Feature activation-1.02e+00

Pos logit contributions
ishment+0.074
aws+0.072
ams+0.072
 tongues+0.070
 swapped+0.069
Neg logit contributions
ray-0.105
 scientist-0.096
 Rat-0.093
 drill-0.093
oris-0.091
Tokenics
Token ID873
Feature activation+9.82e-01

Pos logit contributions
ray+0.030
 scientist+0.028
 drill+0.027
oris+0.027
 broom+0.027
Neg logit contributions
ishment-0.017
 tongues-0.017
 swapped-0.016
 crust-0.016
ams-0.016
Token with
Token ID351
Feature activation+2.44e-02

Pos logit contributions
 tongues+0.028
umbs+0.028
ishment+0.027
aney+0.027
 swapped+0.027
Neg logit contributions
 scientist-0.017
ray-0.016
 spider-0.015
 if-0.015
 Rat-0.015
Token her
Token ID607
Feature activation-2.20e-01

Pos logit contributions
ams+0.001
ited+0.001
ishment+0.001
ulating+0.001
 swapped+0.001
Neg logit contributions
 Rat-0.000
 spider-0.000
 Mouse-0.000
 mouse-0.000
 Oliver-0.000
Token family
Token ID1641
Feature activation+6.67e-01

Pos logit contributions
 scientist+0.004
ray+0.004
 spider+0.004
 broom+0.004
 Monster+0.004
Neg logit contributions
ams-0.006
ited-0.006
ishment-0.006
 swapped-0.006
umbs-0.006
TokenCan
Token ID6090
Feature activation-9.31e-02

Pos logit contributions
ishment+0.034
 tongues+0.033
lights+0.032
ams+0.032
 refreshing+0.031
Neg logit contributions
ray-0.027
 scientist-0.024
 drill-0.022
 Rat-0.022
 spider-0.021
Token I
Token ID314
Feature activation+8.96e-01

Pos logit contributions
ray+0.002
 scientist+0.002
 Rat+0.002
 drill+0.002
 spider+0.002
Neg logit contributions
ams-0.002
aws-0.002
ited-0.002
ishment-0.002
umbs-0.002
Token cut
Token ID2005
Feature activation-3.32e-01

Pos logit contributions
ams+0.022
aws+0.022
ishment+0.022
 tongues+0.022
 swapped+0.021
Neg logit contributions
ray-0.020
 scientist-0.018
 drill-0.017
 Rat-0.017
 spider-0.016
Token the
Token ID262
Feature activation+1.80e-01

Pos logit contributions
ray+0.008
 scientist+0.007
 drill+0.007
 Rat+0.007
 spider+0.006
Neg logit contributions
ams-0.008
aws-0.008
 swapped-0.008
 tongues-0.008
ishment-0.007
Token box
Token ID3091
Feature activation+8.75e-01

Pos logit contributions
 batter+0.004
 crust+0.004
 tongues+0.004
 baskets+0.004
 swapped+0.004
Neg logit contributions
ray-0.004
arsh-0.004
 scientist-0.004
oris-0.004
ous-0.003
Token?"
Token ID1701
Feature activation+3.88e+00

Pos logit contributions
 tongues+0.025
aws+0.025
 swapped+0.024
ams+0.024
 sav+0.024
Neg logit contributions
ray-0.017
 scientist-0.015
 drill-0.014
 spider-0.013
 broom-0.013
Token Tom
Token ID4186
Feature activation+2.29e+00

Pos logit contributions
ilt+0.128
umbs+0.128
ishment+0.126
 tongues+0.124
ackers+0.122
Neg logit contributions
 Cat-0.048
 Rat-0.047
 if-0.045
 Quick-0.043
 drill-0.041
Token asks
Token ID7893
Feature activation+3.57e-01

Pos logit contributions
 tongues+0.056
 refreshing+0.055
umbs+0.054
ams+0.054
 crust+0.054
Neg logit contributions
ray-0.052
 scientist-0.047
 drill-0.043
 Rat-0.043
 spider-0.043
Token Jen
Token ID13364
Feature activation+2.44e+00

Pos logit contributions
ited+0.009
ishment+0.009
 tongues+0.009
 sav+0.009
 refreshing+0.009
Neg logit contributions
ray-0.008
 scientist-0.007
 drill-0.007
 Rat-0.007
 spider-0.006
Token.
Token ID13
Feature activation+1.58e+00

Pos logit contributions
 tongues+0.059
aws+0.058
 refreshing+0.058
ams+0.058
umbs+0.056
Neg logit contributions
ray-0.057
 scientist-0.051
 drill-0.048
 Rat-0.046
 spider-0.046
Token
Token ID198
Feature activation+1.17e+00

Pos logit contributions
ishment+0.049
ited+0.047
ilt+0.047
ams+0.046
 tongues+0.046
Neg logit contributions
ray-0.027
 Rat-0.025
 scientist-0.024
 drill-0.024
oris-0.023
Token see
Token ID766
Feature activation-3.92e-01

Pos logit contributions
ams+0.023
aney+0.023
 tongues+0.023
aws+0.022
ishment+0.022
Neg logit contributions
ray-0.024
 scientist-0.023
 Rat-0.021
 drill-0.021
 spider-0.021
Token the
Token ID262
Feature activation+3.48e-01

Pos logit contributions
ray+0.009
 scientist+0.008
 Rat+0.007
 spider+0.007
 drill+0.007
Neg logit contributions
ams-0.010
 tongues-0.010
 swapped-0.010
ishment-0.009
ited-0.009
Token bird
Token ID6512
Feature activation-5.44e-01

Pos logit contributions
ams+0.010
 swapped+0.010
aney+0.010
rimp+0.010
 contain+0.009
Neg logit contributions
 spider-0.006
 mouse-0.006
ray-0.006
 owl-0.006
 flashlight-0.006
Token and
Token ID290
Feature activation+4.09e-01

Pos logit contributions
ray+0.010
 spider+0.008
 Owl+0.008
 living+0.008
 mouse+0.008
Neg logit contributions
 swapped-0.016
ishment-0.015
ams-0.015
 tongues-0.015
aney-0.015
Token have
Token ID423
Feature activation+1.02e+00

Pos logit contributions
ams+0.008
 swapped+0.008
aws+0.008
 tongues+0.008
ishment+0.008
Neg logit contributions
ray-0.011
 scientist-0.010
 drill-0.010
oris-0.009
 Rat-0.009
Token pic
Token ID8301
Feature activation+3.84e+00

Pos logit contributions
ishment+0.028
ited+0.028
glers+0.027
ement+0.027
ized+0.026
Neg logit contributions
ray-0.018
 Rat-0.017
 living-0.017
 broom-0.016
 scientist-0.016
Tokenn
Token ID77
Feature activation-7.32e-01

Pos logit contributions
ishment+0.070
ams+0.069
aws+0.069
 tongues+0.067
 swapped+0.065
Neg logit contributions
ray-0.107
 scientist-0.096
 drill-0.093
 Rat-0.093
 broom-0.092
Tokenics
Token ID873
Feature activation+1.65e+00

Pos logit contributions
ray+0.022
 scientist+0.021
 drill+0.020
 Rat+0.020
oris+0.020
Neg logit contributions
 tongues-0.011
ishment-0.011
ams-0.011
 swapped-0.011
 crust-0.011
Token.
Token ID13
Feature activation-6.08e-01

Pos logit contributions
 tongues+0.049
ishment+0.047
aney+0.046
ited+0.046
umbs+0.046
Neg logit contributions
ray-0.028
 scientist-0.027
cher-0.024
 if-0.023
 Rat-0.023
Token<|endoftext|>
Token ID50256
Feature activation-1.90e-01

Pos logit contributions
ray+0.015
 scientist+0.013
 Rat+0.012
 drill+0.012
oris+0.012
Neg logit contributions
ams-0.014
aws-0.013
 swapped-0.013
 tongues-0.013
ishment-0.013
TokenOnce
Token ID7454
Feature activation-2.69e-01

Pos logit contributions
ray+0.004
 scientist+0.004
 Rat+0.004
 drill+0.004
 spider+0.004
Neg logit contributions
ams-0.004
 tongues-0.004
ishment-0.004
 swapped-0.004
aws-0.004
Token he
Token ID339
Feature activation-1.35e+00

Pos logit contributions
ray+0.021
 scientist+0.019
 drill+0.017
 Rat+0.017
 spider+0.017
Neg logit contributions
ishment-0.021
aws-0.021
 tongues-0.021
 crust-0.021
ams-0.021
Token wanted
Token ID2227
Feature activation-1.24e-01

Pos logit contributions
ray+0.025
 scientist+0.023
 Rat+0.021
 Monster+0.020
 drill+0.020
Neg logit contributions
ishment-0.040
 tongues-0.038
 crust-0.038
ams-0.038
umbs-0.038
Token for
Token ID329
Feature activation-1.97e-01

Pos logit contributions
ray+0.002
 scientist+0.001
 drill+0.001
oris+0.001
 Rat+0.001
Neg logit contributions
ams-0.004
aws-0.004
 swapped-0.004
 tongues-0.004
ishment-0.004
Token his
Token ID465
Feature activation+6.51e-01

Pos logit contributions
ray+0.004
 scientist+0.003
 Rat+0.003
 drill+0.003
 Monster+0.003
Neg logit contributions
ited-0.006
ams-0.005
aney-0.005
 swapped-0.005
glers-0.005
Token 3
Token ID513
Feature activation+1.95e+00

Pos logit contributions
ams+0.016
ishment+0.015
aws+0.015
ited+0.015
 swapped+0.014
Neg logit contributions
ray-0.015
 scientist-0.015
 spider-0.014
 drill-0.014
 flashlight-0.014
Tokenrd
Token ID4372
Feature activation+3.83e+00

Pos logit contributions
 swapped+0.037
ishment+0.037
aws+0.036
glers+0.035
 tongues+0.034
Neg logit contributions
ray-0.055
 scientist-0.052
 spider-0.050
 drill-0.050
 broom-0.050
Token birthday
Token ID10955
Feature activation+2.44e+00

Pos logit contributions
ized+0.097
 widened+0.096
ishment+0.096
 swapped+0.096
 refreshing+0.095
Neg logit contributions
ray-0.079
 scientist-0.071
 living-0.071
 computer-0.067
 broom-0.066
Token was
Token ID373
Feature activation-2.38e-01

Pos logit contributions
 tongues+0.067
ishment+0.064
ized+0.063
ited+0.062
umbs+0.062
Neg logit contributions
ray-0.051
 scientist-0.051
 mouse-0.042
 Monster-0.042
 spider-0.042
Token to
Token ID284
Feature activation-7.52e-02

Pos logit contributions
 scientist+0.004
ray+0.004
 Rat+0.004
 drill+0.004
 Taylor+0.004
Neg logit contributions
 tongues-0.007
ishment-0.007
ited-0.007
umbs-0.007
ized-0.007
Token try
Token ID1949
Feature activation+1.97e-01

Pos logit contributions
 scientist+0.002
ray+0.001
 drill+0.001
 Monster+0.001
 Taylor+0.001
Neg logit contributions
ishment-0.002
ited-0.002
ized-0.002
umbs-0.002
aws-0.002
Token something
Token ID1223
Feature activation+1.27e+00

Pos logit contributions
 swapped+0.005
ited+0.005
ishment+0.005
 tongues+0.005
ams+0.005
Neg logit contributions
ray-0.004
 scientist-0.004
 drill-0.004
 Rat-0.004
 spider-0.004
Token outside
Token ID2354
Feature activation-3.28e-01

Pos logit contributions
ited+0.002
ishment+0.002
ackers+0.002
rimp+0.002
 swapped+0.002
Neg logit contributions
 broom-0.001
ray-0.001
 911-0.001
 spider-0.001
 for-0.001
Token.
Token ID13
Feature activation-8.31e-01

Pos logit contributions
ray+0.006
 scientist+0.006
 broom+0.006
 spider+0.005
 drill+0.005
Neg logit contributions
aws-0.009
ishment-0.009
 swapped-0.009
ited-0.009
 tongues-0.009
Token When
Token ID1649
Feature activation+1.10e-01

Pos logit contributions
ray+0.020
 scientist+0.018
 Rat+0.017
 drill+0.017
 spider+0.017
Neg logit contributions
ams-0.019
 tongues-0.018
glers-0.018
ishment-0.018
aws-0.018
Token the
Token ID262
Feature activation+1.52e+00

Pos logit contributions
ams+0.003
aws+0.003
ishment+0.003
 swapped+0.002
umbs+0.002
Neg logit contributions
ray-0.002
 scientist-0.002
 Rat-0.002
 spider-0.002
 drill-0.002
Token cup
Token ID6508
Feature activation+1.63e+00

Pos logit contributions
ishment+0.032
aws+0.031
ams+0.031
ackers+0.030
aney+0.029
Neg logit contributions
ray-0.038
 scientist-0.037
 spider-0.037
 broom-0.037
 drill-0.035
Tokencakes
Token ID37263
Feature activation+3.82e+00

Pos logit contributions
ishment+0.043
ams+0.041
aws+0.040
 tongues+0.040
 swapped+0.039
Neg logit contributions
ray-0.035
 scientist-0.032
 spider-0.032
 Monster-0.030
 broom-0.030
Token were
Token ID547
Feature activation+1.49e+00

Pos logit contributions
ishment+0.098
ams+0.089
aws+0.086
rimp+0.085
 tongues+0.083
Neg logit contributions
ray-0.084
 spider-0.078
 scientist-0.076
 Rat-0.075
 drill-0.075
Token done
Token ID1760
Feature activation+7.34e-01

Pos logit contributions
ishment+0.040
aws+0.038
 tongues+0.038
 unison+0.037
Twe+0.036
Neg logit contributions
 broom-0.027
 for-0.025
ray-0.025
 drill-0.024
 scientist-0.024
Token,
Token ID11
Feature activation-9.23e-01

Pos logit contributions
ishment+0.022
ams+0.022
aws+0.022
ited+0.021
 tongues+0.021
Neg logit contributions
ray-0.012
 scientist-0.012
 drill-0.010
 spider-0.010
 Rat-0.010
Token they
Token ID484
Feature activation+4.89e-01

Pos logit contributions
 scientist+0.019
ray+0.018
 spider+0.018
 Rat+0.018
 broom+0.017
Neg logit contributions
ited-0.023
ams-0.023
aney-0.022
umbs-0.022
rimp-0.022
Token sat
Token ID3332
Feature activation+2.75e-01

Pos logit contributions
ishment+0.011
ams+0.011
ement+0.011
aws+0.011
lights+0.011
Neg logit contributions
ray-0.011
 scientist-0.010
 broom-0.010
 flashlight-0.010
 Rat-0.010
Token near
Token ID1474
Feature activation+1.25e+00

Pos logit contributions
ray+0.027
 scientist+0.025
oris+0.022
 Rat+0.022
 drill+0.022
Neg logit contributions
 tongues-0.032
 swapped-0.032
ishment-0.032
ams-0.031
ited-0.031
Token the
Token ID262
Feature activation+2.36e-01

Pos logit contributions
ams+0.032
 swapped+0.031
ited+0.031
aws+0.031
umbs+0.031
Neg logit contributions
ray-0.026
 scientist-0.024
 drill-0.023
 spider-0.022
 Rat-0.022
Token cliff
Token ID19516
Feature activation-1.21e+00

Pos logit contributions
aws+0.004
ams+0.004
 tongues+0.004
 swapped+0.004
ishment+0.004
Neg logit contributions
ray-0.007
 scientist-0.006
 drill-0.006
 Rat-0.006
oris-0.006
Token and
Token ID290
Feature activation+9.04e-01

Pos logit contributions
ray+0.026
 scientist+0.023
 drill+0.021
 spider+0.021
 broom+0.020
Neg logit contributions
 swapped-0.031
 tongues-0.031
ishment-0.030
ams-0.030
aws-0.030
Token have
Token ID423
Feature activation+1.42e+00

Pos logit contributions
ams+0.018
aws+0.018
ishment+0.018
 tongues+0.018
 swapped+0.018
Neg logit contributions
ray-0.023
 scientist-0.022
 drill-0.021
 Rat-0.020
 spider-0.020
Token pic
Token ID8301
Feature activation+3.81e+00

Pos logit contributions
ishment+0.030
 tongues+0.028
ams+0.028
ited+0.028
glers+0.027
Neg logit contributions
ray-0.036
 scientist-0.033
 Rat-0.033
 broom-0.032
 spider-0.032
Tokenn
Token ID77
Feature activation-5.27e-01

Pos logit contributions
ams+0.063
ishment+0.063
aws+0.062
 tongues+0.062
 swapped+0.061
Neg logit contributions
ray-0.112
 scientist-0.103
 Rat-0.099
 drill-0.099
oris-0.098
Tokenics
Token ID873
Feature activation+1.55e+00

Pos logit contributions
ray+0.016
 scientist+0.015
 drill+0.015
 Rat+0.014
oris+0.014
Neg logit contributions
ishment-0.008
 tongues-0.008
ams-0.008
 swapped-0.008
 crust-0.008
Token.
Token ID13
Feature activation+2.86e-01

Pos logit contributions
 tongues+0.043
ishment+0.041
aney+0.041
ited+0.041
umbs+0.041
Neg logit contributions
ray-0.029
 scientist-0.028
cher-0.024
 Rat-0.024
 broom-0.024
Token One
Token ID1881
Feature activation-6.75e-01

Pos logit contributions
ited+0.008
ams+0.008
 tongues+0.007
ized+0.007
ishment+0.007
Neg logit contributions
ray-0.006
 scientist-0.005
 Rat-0.005
 Monster-0.005
 drill-0.005
Token day
Token ID1110
Feature activation-7.21e-01

Pos logit contributions
ray+0.017
 scientist+0.016
 drill+0.015
 Rat+0.015
oris+0.015
Neg logit contributions
ams-0.014
aws-0.014
 swapped-0.014
 tongues-0.014
ishment-0.014
Token
Token ID198
Feature activation-5.28e-01

Pos logit contributions
ray+0.037
 scientist+0.035
 drill+0.033
cher+0.033
 broom+0.032
Neg logit contributions
aws-0.027
 swapped-0.026
ams-0.026
 sav-0.026
 tongues-0.025
Token
Token ID198
Feature activation-6.14e-01

Pos logit contributions
ray+0.016
 scientist+0.016
cher+0.015
 drill+0.015
 broom+0.015
Neg logit contributions
aws-0.008
 sav-0.007
aney-0.007
umbs-0.007
ackers-0.007
TokenWhen
Token ID2215
Feature activation-2.22e-01

Pos logit contributions
ray+0.017
 scientist+0.016
 drill+0.015
 Rat+0.015
oris+0.015
Neg logit contributions
aws-0.012
 swapped-0.012
ams-0.012
 sav-0.011
 tongues-0.011
Token the
Token ID262
Feature activation+1.37e+00

Pos logit contributions
ray+0.006
 scientist+0.006
 drill+0.005
 Rat+0.005
oris+0.005
Neg logit contributions
ams-0.004
aws-0.004
 swapped-0.004
 tongues-0.004
ishment-0.004
Token cup
Token ID6508
Feature activation+1.38e+00

Pos logit contributions
ishment+0.029
aws+0.028
ams+0.028
 tongues+0.027
ackers+0.027
Neg logit contributions
ray-0.036
 scientist-0.035
 drill-0.032
 spider-0.032
 broom-0.032
Tokencakes
Token ID37263
Feature activation+3.80e+00

Pos logit contributions
ishment+0.028
ams+0.027
 tongues+0.026
aws+0.026
 swapped+0.024
Neg logit contributions
ray-0.039
 scientist-0.036
 spider-0.035
 Monster-0.034
 drill-0.034
Token were
Token ID547
Feature activation+1.09e+00

Pos logit contributions
ishment+0.092
ams+0.086
 tongues+0.083
rimp+0.083
aws+0.082
Neg logit contributions
ray-0.089
 scientist-0.080
 spider-0.080
 Monster-0.077
 Rat-0.076
Token done
Token ID1760
Feature activation+6.46e-01

Pos logit contributions
ishment+0.031
 tongues+0.030
 unison+0.029
Twe+0.029
pool+0.029
Neg logit contributions
 for-0.018
 broom-0.017
ray-0.016
 living-0.016
 scientist-0.015
Token baking
Token ID16871
Feature activation+1.84e+00

Pos logit contributions
ishment+0.019
ams+0.019
aws+0.019
 tongues+0.019
ited+0.019
Neg logit contributions
ray-0.011
 scientist-0.011
 spider-0.009
 drill-0.009
 Rat-0.009
Token,
Token ID11
Feature activation-5.59e-01

Pos logit contributions
ited+0.066
ishment+0.065
 tongues+0.063
ams+0.063
 swapped+0.063
Neg logit contributions
 spider-0.017
ray-0.017
 scientist-0.016
 Rat-0.015
 Monster-0.015
Token they
Token ID484
Feature activation+6.25e-01

Pos logit contributions
 Rat+0.012
 scientist+0.012
 spider+0.011
ray+0.011
 mouse+0.011
Neg logit contributions
ams-0.014
ited-0.014
rimp-0.014
 tongues-0.013
umbs-0.013
Token what
Token ID644
Feature activation+1.06e+00

Pos logit contributions
 scientist+0.005
ray+0.005
oris+0.004
 drill+0.004
 Monster+0.004
Neg logit contributions
aws-0.005
ams-0.005
ishment-0.004
 crust-0.004
 sav-0.004
Token is
Token ID318
Feature activation+6.69e-01

Pos logit contributions
ishment+0.029
ams+0.029
 swapped+0.029
 tongues+0.029
aws+0.029
Neg logit contributions
ray-0.020
 scientist-0.018
 spider-0.016
 Rat-0.016
 drill-0.016
Token in
Token ID287
Feature activation+1.56e+00

Pos logit contributions
ams+0.019
 crust+0.019
aws+0.019
 swapped+0.019
ishment+0.018
Neg logit contributions
ray-0.012
 scientist-0.011
oris-0.010
 Monster-0.010
 drill-0.010
Token the
Token ID262
Feature activation-3.84e-02

Pos logit contributions
 swapped+0.040
ams+0.040
 tongues+0.040
aws+0.039
aney+0.039
Neg logit contributions
ray-0.033
 scientist-0.030
 spider-0.028
 drill-0.027
 Rat-0.027
Token sack
Token ID23704
Feature activation-1.62e+00

Pos logit contributions
ray+0.001
 scientist+0.001
 drill+0.001
 Rat+0.001
 spider+0.001
Neg logit contributions
ams-0.001
aws-0.001
 swapped-0.001
 tongues-0.001
ishment-0.001
Token,"
Token ID553
Feature activation+3.80e+00

Pos logit contributions
ray+0.031
 scientist+0.026
 spider+0.025
 drill+0.024
 broom+0.024
Neg logit contributions
 tongues-0.046
 swapped-0.046
ams-0.045
ishment-0.045
aws-0.044
Token Tim
Token ID5045
Feature activation+1.38e+00

Pos logit contributions
umbs+0.120
ackers+0.118
ishment+0.115
ited+0.114
ilt+0.113
Neg logit contributions
 Cat-0.055
 if-0.054
 Quick-0.052
 oh-0.049
 "-0.049
Token says
Token ID1139
Feature activation+8.72e-01

Pos logit contributions
ams+0.031
aws+0.031
 tongues+0.031
 swapped+0.031
ishment+0.031
Neg logit contributions
ray-0.033
 scientist-0.030
 drill-0.028
 Rat-0.028
oris-0.027
Token.
Token ID13
Feature activation+1.47e+00

Pos logit contributions
ams+0.015
 swapped+0.015
aws+0.015
 tongues+0.015
 crust+0.015
Neg logit contributions
ray-0.025
 scientist-0.023
 Rat-0.022
oris-0.022
 drill-0.022
Token
Token ID198
Feature activation+1.50e+00

Pos logit contributions
ishment+0.046
ited+0.045
ams+0.043
ilt+0.043
ized+0.042
Neg logit contributions
ray-0.025
 scientist-0.023
 Rat-0.023
 Cat-0.022
 drill-0.022
Token
Token ID198
Feature activation+1.39e+00

Pos logit contributions
ishment+0.040
 tongues+0.039
ams+0.038
 swapped+0.038
aws+0.037
Neg logit contributions
ray-0.031
 scientist-0.028
 Rat-0.026
 drill-0.026
oris-0.025
Token's
Token ID338
Feature activation+8.77e-01

Pos logit contributions
ray+0.003
 scientist+0.002
 Rat+0.002
 spider+0.002
 broom+0.002
Neg logit contributions
ams-0.004
ishment-0.004
 tongues-0.004
 crust-0.004
 swapped-0.004
Token look
Token ID804
Feature activation-2.00e-01

Pos logit contributions
ams+0.024
lights+0.023
umbs+0.023
ited+0.023
ement+0.023
Neg logit contributions
ray-0.019
 flashlight-0.018
 scientist-0.018
 Rat-0.017
 drill-0.017
Token inside
Token ID2641
Feature activation-1.14e-01

Pos logit contributions
ray+0.004
 scientist+0.004
 drill+0.004
oris+0.004
 Rat+0.004
Neg logit contributions
ams-0.005
 swapped-0.005
aws-0.005
ishment-0.005
 refreshing-0.004
Token the
Token ID262
Feature activation-2.65e-02

Pos logit contributions
ray+0.002
 scientist+0.002
 drill+0.002
 spider+0.002
 Rat+0.002
Neg logit contributions
 swapped-0.003
 tongues-0.003
aws-0.003
ams-0.003
ishment-0.003
Token sack
Token ID23704
Feature activation-1.43e+00

Pos logit contributions
ray+0.001
 scientist+0.001
 drill+0.001
 Rat+0.001
oris+0.001
Neg logit contributions
ams-0.001
aws-0.001
 swapped-0.001
 tongues-0.001
ishment-0.001
Token,"
Token ID553
Feature activation+3.79e+00

Pos logit contributions
ray+0.031
 scientist+0.029
 drill+0.026
 spider+0.026
 Rat+0.026
Neg logit contributions
ams-0.035
 tongues-0.035
 swapped-0.035
aws-0.035
ishment-0.034
Token Tom
Token ID4186
Feature activation+1.96e+00

Pos logit contributions
ilt+0.127
umbs+0.127
ackers+0.124
ishment+0.120
immer+0.120
Neg logit contributions
 if-0.046
 Cat-0.042
 means-0.041
 "-0.041
 oh-0.040
Token says
Token ID1139
Feature activation+9.03e-01

Pos logit contributions
ams+0.049
 tongues+0.049
 swapped+0.048
 crust+0.047
aws+0.047
Neg logit contributions
ray-0.043
 scientist-0.039
 drill-0.036
 Rat-0.036
oris-0.036
Token.
Token ID13
Feature activation+1.05e+00

Pos logit contributions
ams+0.018
aws+0.018
 swapped+0.018
 tongues+0.018
ishment+0.018
Neg logit contributions
ray-0.024
 scientist-0.022
 drill-0.021
 Rat-0.020
oris-0.020
Token
Token ID198
Feature activation+1.27e+00

Pos logit contributions
ited+0.035
ishment+0.035
ilt+0.034
ized+0.033
ams+0.033
Neg logit contributions
ray-0.016
 scientist-0.015
 Cat-0.014
 "-0.014
 drill-0.014
Token
Token ID198
Feature activation+1.20e+00

Pos logit contributions
ishment+0.038
 tongues+0.037
ited+0.035
ams+0.035
 swapped+0.035
Neg logit contributions
ray-0.024
 scientist-0.020
 Rat-0.019
 drill-0.019
oris-0.018
Token bodies
Token ID5920
Feature activation+1.08e+00

Pos logit contributions
ams+0.050
ited+0.050
umbs+0.050
ackers+0.050
ishment+0.050
Neg logit contributions
ray-0.038
 broom-0.037
 spider-0.036
 drill-0.036
 flashlight-0.036
Token and
Token ID290
Feature activation+1.34e+00

Pos logit contributions
 swapped+0.029
 tongues+0.029
ited+0.029
aws+0.029
ishment+0.029
Neg logit contributions
ray-0.021
 scientist-0.018
 broom-0.017
 drill-0.017
 spider-0.017
Token make
Token ID787
Feature activation-2.85e-01

Pos logit contributions
ement+0.034
umbs+0.034
ited+0.034
aws+0.034
aney+0.034
Neg logit contributions
ray-0.028
 drill-0.026
 scientist-0.026
 broom-0.025
 flashlight-0.025
Token a
Token ID257
Feature activation+8.86e-01

Pos logit contributions
ray+0.005
 spider+0.004
 broom+0.004
 Rat+0.004
 drill+0.004
Neg logit contributions
ited-0.009
ams-0.008
aney-0.008
glers-0.008
umbs-0.008
Token tower
Token ID10580
Feature activation+1.04e+00

Pos logit contributions
ams+0.020
ishment+0.019
aws+0.019
umbs+0.019
 swapped+0.019
Neg logit contributions
ray-0.021
 scientist-0.021
 drill-0.019
 spider-0.019
 broom-0.019
Token,"
Token ID553
Feature activation+3.79e+00

Pos logit contributions
 swapped+0.034
 tongues+0.034
ited+0.034
ishment+0.034
agged+0.033
Neg logit contributions
ray-0.015
 drill-0.013
 spider-0.012
 scientist-0.012
 broom-0.012
Token Tom
Token ID4186
Feature activation+2.14e+00

Pos logit contributions
umbs+0.126
ackers+0.122
ilt+0.122
ishment+0.118
 tongues+0.114
Neg logit contributions
 if-0.050
 "-0.045
 oh-0.045
 means-0.041
 Cat-0.041
Token says
Token ID1139
Feature activation+1.07e+00

Pos logit contributions
 tongues+0.058
ams+0.057
 crust+0.056
 refreshing+0.056
ized+0.056
Neg logit contributions
ray-0.044
 scientist-0.037
 drill-0.035
 Rat-0.035
oris-0.035
Token.
Token ID13
Feature activation+1.40e+00

Pos logit contributions
aws+0.023
ams+0.023
ishment+0.022
 tongues+0.022
 swapped+0.022
Neg logit contributions
ray-0.027
 scientist-0.025
 drill-0.023
 broom-0.023
 Rat-0.023
Token "
Token ID366
Feature activation+4.86e-01

Pos logit contributions
ishment+0.048
ited+0.047
ilt+0.046
ized+0.045
ams+0.044
Neg logit contributions
ray-0.021
 "-0.020
 scientist-0.019
 Cat-0.019
 Help-0.019
TokenThen
Token ID6423
Feature activation+2.48e+00

Pos logit contributions
ishment+0.013
 tongues+0.013
glers+0.012
 refreshing+0.012
ams+0.012
Neg logit contributions
ray-0.011
 scientist-0.009
 drill-0.009
cher-0.008
 spider-0.008
Token and
Token ID290
Feature activation-2.92e-01

Pos logit contributions
ishment+0.029
ams+0.029
 tongues+0.029
 swapped+0.029
aws+0.029
Neg logit contributions
ray-0.024
 scientist-0.022
 drill-0.020
 spider-0.020
 broom-0.020
Token he
Token ID339
Feature activation-2.86e-02

Pos logit contributions
ray+0.009
 scientist+0.008
oris+0.008
 drill+0.008
inea+0.007
Neg logit contributions
 swapped-0.005
 sav-0.005
 tongues-0.005
ams-0.005
 refreshing-0.005
Token wanted
Token ID2227
Feature activation+1.39e+00

Pos logit contributions
ray+0.001
 scientist+0.001
 Rat+0.001
 drill+0.001
 spider+0.001
Neg logit contributions
ishment-0.001
ams-0.001
aws-0.001
 tongues-0.001
umbs-0.001
Token to
Token ID284
Feature activation+2.23e+00

Pos logit contributions
 swapped+0.044
aws+0.044
 tongues+0.044
ams+0.044
 crust+0.043
Neg logit contributions
ray-0.020
 scientist-0.018
oris-0.016
 drill-0.016
 Monster-0.015
Token jump
Token ID4391
Feature activation+2.78e+00

Pos logit contributions
ishment+0.053
ams+0.053
aws+0.050
ement+0.050
ited+0.048
Neg logit contributions
ray-0.052
 broom-0.048
 spider-0.048
 scientist-0.048
 Rat-0.047
Token in
Token ID287
Feature activation+3.78e+00

Pos logit contributions
ishment+0.076
ams+0.076
 tongues+0.075
aws+0.073
 swapped+0.073
Neg logit contributions
ray-0.053
 scientist-0.048
 spider-0.045
 drill-0.045
 Rat-0.045
Token it
Token ID340
Feature activation+1.01e+00

Pos logit contributions
ams+0.111
 swapped+0.107
aney+0.105
ishment+0.105
aws+0.105
Neg logit contributions
 spider-0.063
ray-0.062
 broom-0.060
 scientist-0.058
 Rat-0.058
Token.
Token ID13
Feature activation-2.84e-01

Pos logit contributions
ams+0.029
 swapped+0.028
 tongues+0.028
aney+0.027
aws+0.027
Neg logit contributions
ray-0.019
 spider-0.018
 scientist-0.017
 broom-0.017
 Rat-0.017
Token Emily
Token ID17608
Feature activation+1.67e+00

Pos logit contributions
ray+0.008
inea+0.008
 scientist+0.007
cher+0.007
 drill+0.007
Neg logit contributions
 swapped-0.005
 sav-0.005
ishment-0.005
aws-0.004
ams-0.004
Token said
Token ID531
Feature activation+1.70e+00

Pos logit contributions
ams+0.035
aws+0.034
ishment+0.034
 tongues+0.034
 swapped+0.034
Neg logit contributions
ray-0.043
 scientist-0.040
 Rat-0.037
 drill-0.037
 spider-0.037
Token "
Token ID366
Feature activation+5.22e-01

Pos logit contributions
 swapped+0.031
 Stella+0.031
 ne+0.030
 Ph+0.029
 Nem+0.028
Neg logit contributions
ray-0.051
 scientist-0.048
oris-0.045
 drill-0.045
 outage-0.043
Token's
Token ID338
Feature activation+1.03e+00

Pos logit contributions
ishment+0.003
ams+0.003
lights+0.003
aws+0.003
 tongues+0.003
Neg logit contributions
ray-0.001
 spider-0.001
 scientist-0.001
 Rat-0.001
 broom-0.001
Token open
Token ID1280
Feature activation+1.10e+00

Pos logit contributions
ams+0.030
ement+0.029
aws+0.029
agged+0.029
aney+0.028
Neg logit contributions
ray-0.020
 flashlight-0.019
 scientist-0.019
 drill-0.019
 Rat-0.018
Token it
Token ID340
Feature activation+2.61e-01

Pos logit contributions
 swapped+0.029
aws+0.028
ams+0.028
umbs+0.028
 tongues+0.028
Neg logit contributions
ray-0.023
 scientist-0.020
 Rat-0.019
 spider-0.019
 drill-0.019
Token and
Token ID290
Feature activation+1.32e+00

Pos logit contributions
aws+0.005
ams+0.005
 crust+0.005
ishment+0.005
 sav+0.005
Neg logit contributions
ray-0.006
 scientist-0.006
 drill-0.006
 Rat-0.006
oris-0.006
Token see
Token ID766
Feature activation-3.43e-01

Pos logit contributions
ams+0.027
aws+0.027
 tongues+0.026
 swapped+0.026
ishment+0.026
Neg logit contributions
ray-0.035
 scientist-0.033
 drill-0.031
 Rat-0.030
 spider-0.030
Token,"
Token ID553
Feature activation+3.78e+00

Pos logit contributions
ray+0.008
 scientist+0.008
oris+0.007
 drill+0.007
 Monster+0.007
Neg logit contributions
aws-0.008
ishment-0.008
ams-0.008
 crust-0.007
 sav-0.007
Token Tim
Token ID5045
Feature activation+1.10e+00

Pos logit contributions
umbs+0.124
ackers+0.117
ams+0.116
ishment+0.116
lights+0.114
Neg logit contributions
 Cat-0.052
 if-0.051
 Quick-0.051
 Mouse-0.050
 oh-0.050
Token says
Token ID1139
Feature activation+7.14e-01

Pos logit contributions
ams+0.020
aws+0.020
 swapped+0.020
ishment+0.019
 sav+0.019
Neg logit contributions
ray-0.030
 scientist-0.029
 drill-0.027
 spider-0.027
 Rat-0.027
Token.
Token ID13
Feature activation+1.00e+00

Pos logit contributions
ams+0.013
 swapped+0.013
 tongues+0.013
aws+0.013
 crust+0.012
Neg logit contributions
ray-0.020
 scientist-0.019
 Rat-0.018
oris-0.017
 drill-0.017
Token He
Token ID679
Feature activation-5.68e-01

Pos logit contributions
ited+0.033
ishment+0.033
ams+0.032
ilt+0.032
 festivities+0.031
Neg logit contributions
ray-0.015
 scientist-0.014
 Rat-0.013
 drill-0.013
 Cat-0.013
Token tries
Token ID8404
Feature activation-2.18e-01

Pos logit contributions
ray+0.014
 scientist+0.014
inea+0.013
 drill+0.013
 Rat+0.013
Neg logit contributions
 swapped-0.012
aws-0.011
 sav-0.011
 ne-0.011
 tongues-0.011
Token many
Token ID867
Feature activation+3.74e-01

Pos logit contributions
ray+0.017
 scientist+0.016
oris+0.015
 Rat+0.015
 drill+0.015
Neg logit contributions
ams-0.021
aws-0.021
 swapped-0.021
 tongues-0.021
 crust-0.021
Token steps
Token ID4831
Feature activation+4.07e-01

Pos logit contributions
 swapped+0.010
ited+0.010
ams+0.010
ishment+0.010
aws+0.010
Neg logit contributions
ray-0.008
 spider-0.007
 drill-0.007
 scientist-0.007
 broom-0.007
Token to
Token ID284
Feature activation+8.75e-01

Pos logit contributions
ishment+0.011
ited+0.011
 sav+0.011
 swapped+0.011
ackers+0.011
Neg logit contributions
ray-0.008
 drill-0.006
 spider-0.006
 scientist-0.006
 broom-0.006
Token the
Token ID262
Feature activation-1.70e-01

Pos logit contributions
ams+0.023
aws+0.022
 swapped+0.022
ishment+0.022
umbs+0.022
Neg logit contributions
ray-0.019
 drill-0.018
 spider-0.016
 broom-0.016
 Rat-0.016
Token hole
Token ID7604
Feature activation-4.67e-01

Pos logit contributions
ray+0.004
 scientist+0.004
oris+0.004
 Rat+0.004
 drill+0.004
Neg logit contributions
 tongues-0.003
aws-0.003
 crust-0.003
 sav-0.003
ams-0.003
Token,"
Token ID553
Feature activation+3.75e+00

Pos logit contributions
ray+0.009
 spider+0.007
 drill+0.007
 scientist+0.007
 broom+0.007
Neg logit contributions
 swapped-0.014
ited-0.013
 sav-0.013
ishment-0.013
ams-0.013
Token Tom
Token ID4186
Feature activation+2.32e+00

Pos logit contributions
ackers+0.117
umbs+0.117
ilt+0.114
ishment+0.112
 tongues+0.109
Neg logit contributions
 if-0.054
 oh-0.051
 "-0.050
 Cat-0.050
ray-0.049
Token says
Token ID1139
Feature activation+9.97e-01

Pos logit contributions
ams+0.052
aws+0.052
 swapped+0.052
 tongues+0.051
ishment+0.051
Neg logit contributions
ray-0.055
 scientist-0.051
 drill-0.047
 Rat-0.047
oris-0.046
Token.
Token ID13
Feature activation+1.65e+00

Pos logit contributions
ams+0.020
aws+0.020
 tongues+0.020
 swapped+0.020
ishment+0.020
Neg logit contributions
ray-0.026
 scientist-0.024
 drill-0.022
 Rat-0.022
oris-0.022
Token
Token ID198
Feature activation+1.61e+00

Pos logit contributions
ishment+0.055
ited+0.054
ilt+0.053
ized+0.052
 tongues+0.049
Neg logit contributions
ray-0.026
 "-0.025
 Cat-0.024
 Help-0.023
 scientist-0.022
Token
Token ID198
Feature activation+1.43e+00

Pos logit contributions
ishment+0.051
 tongues+0.050
ited+0.046
ams+0.046
 swapped+0.046
Neg logit contributions
ray-0.029
 scientist-0.022
 Rat-0.022
 drill-0.021
oris-0.020
TokenLet
Token ID5756
Feature activation+2.68e-01

Pos logit contributions
ishment+0.029
ams+0.028
ized+0.028
 tongues+0.028
ement+0.027
Neg logit contributions
ray-0.017
 scientist-0.017
 spider-0.015
 Monster-0.015
 Rat-0.015
Token's
Token ID338
Feature activation+1.32e+00

Pos logit contributions
ishment+0.009
lights+0.009
ams+0.009
 tongues+0.009
glers+0.009
Neg logit contributions
ray-0.004
 spider-0.003
 scientist-0.003
 Rat-0.003
 broom-0.003
Token drink
Token ID4144
Feature activation+1.43e+00

Pos logit contributions
ement+0.043
lights+0.042
agged+0.041
ams+0.041
ited+0.040
Neg logit contributions
ray-0.020
 drill-0.019
 flashlight-0.019
 if-0.019
 tool-0.019
Token some
Token ID617
Feature activation+1.59e+00

Pos logit contributions
ishment+0.037
 swapped+0.036
ited+0.036
ams+0.035
aws+0.035
Neg logit contributions
ray-0.030
 scientist-0.026
 spider-0.025
 Rat-0.025
 broom-0.025
Token water
Token ID1660
Feature activation+1.23e+00

Pos logit contributions
ishment+0.033
 swapped+0.030
ams+0.030
aws+0.030
 argued+0.029
Neg logit contributions
ray-0.042
 broom-0.039
 spider-0.039
 scientist-0.038
 flashlight-0.037
Token,"
Token ID553
Feature activation+3.73e+00

Pos logit contributions
 swapped+0.038
ishment+0.038
ited+0.037
aws+0.037
 blonde+0.037
Neg logit contributions
ray-0.019
 scientist-0.016
 broom-0.015
 spider-0.015
 drill-0.015
Token Tim
Token ID5045
Feature activation+1.36e+00

Pos logit contributions
umbs+0.112
ishment+0.108
ams+0.108
ited+0.107
 tongues+0.104
Neg logit contributions
 Mouse-0.057
 Cat-0.057
 Oliver-0.056
ray-0.056
 Rat-0.055
Token says
Token ID1139
Feature activation+8.95e-01

Pos logit contributions
 tongues+0.030
ams+0.030
aws+0.030
 swapped+0.029
 crust+0.029
Neg logit contributions
ray-0.034
 scientist-0.031
 Rat-0.028
 drill-0.028
oris-0.028
Token.
Token ID13
Feature activation+1.26e+00

Pos logit contributions
aws+0.021
ams+0.020
 sav+0.020
ishment+0.020
 tongues+0.020
Neg logit contributions
ray-0.021
 scientist-0.020
 drill-0.018
 broom-0.018
 Rat-0.018
Token
Token ID198
Feature activation+1.24e+00

Pos logit contributions
ited+0.039
ishment+0.039
ams+0.038
ized+0.037
ilt+0.037
Neg logit contributions
ray-0.021
 scientist-0.020
 Rat-0.019
 Monster-0.018
 drill-0.018
Token
Token ID198
Feature activation+1.21e+00

Pos logit contributions
ishment+0.029
 tongues+0.029
ams+0.029
 swapped+0.029
aws+0.029
Neg logit contributions
ray-0.028
 scientist-0.026
 drill-0.024
 Rat-0.024
oris-0.024
Token
Token ID198
Feature activation-4.30e-01

Pos logit contributions
ray+0.027
 scientist+0.027
cher+0.026
 drill+0.026
 spider+0.025
Neg logit contributions
 sav-0.016
aws-0.015
 Stella-0.015
ams-0.015
 crust-0.015
Token
Token ID198
Feature activation-4.93e-01

Pos logit contributions
ray+0.013
 scientist+0.012
 drill+0.011
cher+0.011
 broom+0.011
Neg logit contributions
aws-0.007
aney-0.007
umbs-0.007
 sav-0.007
ams-0.006
TokenWhen
Token ID2215
Feature activation-4.71e-01

Pos logit contributions
ray+0.015
 scientist+0.014
 drill+0.014
 Rat+0.013
cher+0.013
Neg logit contributions
ized-0.008
 sav-0.008
ams-0.008
 swapped-0.007
ited-0.007
Token the
Token ID262
Feature activation+1.59e+00

Pos logit contributions
ray+0.014
 scientist+0.013
cher+0.013
 drill+0.012
oris+0.012
Neg logit contributions
 sav-0.008
 Stella-0.008
 Ph-0.008
ized-0.008
 refreshing-0.008
Token cup
Token ID6508
Feature activation+1.48e+00

Pos logit contributions
aws+0.030
ams+0.030
ishment+0.030
 swapped+0.029
 tongues+0.029
Neg logit contributions
ray-0.044
 scientist-0.043
 spider-0.040
 drill-0.040
 broom-0.039
Tokencakes
Token ID37263
Feature activation+3.73e+00

Pos logit contributions
ishment+0.030
ams+0.029
aws+0.028
 tongues+0.028
 swapped+0.027
Neg logit contributions
ray-0.041
 scientist-0.038
 spider-0.036
 drill-0.035
 Monster-0.035
Token were
Token ID547
Feature activation+1.51e+00

Pos logit contributions
ishment+0.092
aws+0.085
ams+0.083
 tongues+0.081
rimp+0.080
Neg logit contributions
ray-0.090
 scientist-0.080
 spider-0.079
 drill-0.078
 broom-0.077
Token done
Token ID1760
Feature activation+1.03e+00

Pos logit contributions
ishment+0.041
aws+0.039
 tongues+0.038
Twe+0.037
 unison+0.037
Neg logit contributions
 broom-0.028
ray-0.027
 scientist-0.026
 drill-0.026
 for-0.025
Token,
Token ID11
Feature activation-1.15e+00

Pos logit contributions
ishment+0.032
aws+0.032
ams+0.032
ited+0.031
 tongues+0.031
Neg logit contributions
 scientist-0.015
ray-0.015
 drill-0.014
 spider-0.013
 broom-0.013
Token Mill
Token ID9212
Feature activation-6.74e-01

Pos logit contributions
ray+0.031
 scientist+0.029
 drill+0.027
oris+0.027
 Rat+0.026
Neg logit contributions
ams-0.022
aws-0.022
ishment-0.022
 swapped-0.022
 tongues-0.022
Tokenie
Token ID494
Feature activation-4.00e-01

Pos logit contributions
ray+0.021
 scientist+0.020
 drill+0.020
oris+0.019
 Rat+0.019
Neg logit contributions
ams-0.010
ited-0.010
aws-0.010
 tongues-0.010
ishment-0.009
Token should
Token ID815
Feature activation-9.43e-01

Pos logit contributions
ray+0.031
 scientist+0.028
 drill+0.026
 Rat+0.026
 spider+0.026
Neg logit contributions
ams-0.031
ishment-0.030
aws-0.030
 tongues-0.030
 swapped-0.030
Token be
Token ID307
Feature activation-7.14e-01

Pos logit contributions
ray+0.018
 scientist+0.016
 drill+0.016
 Rat+0.016
 spider+0.015
Neg logit contributions
ement-0.026
umbs-0.026
ams-0.026
ited-0.026
ishment-0.026
Token compassionate
Token ID32533
Feature activation-3.14e-01

Pos logit contributions
ray+0.018
 scientist+0.017
 drill+0.016
 Rat+0.015
oris+0.015
Neg logit contributions
ams-0.015
aws-0.015
 tongues-0.015
 swapped-0.015
ishment-0.015
Token.
Token ID13
Feature activation+1.36e-01

Pos logit contributions
ray+0.008
 scientist+0.008
 drill+0.007
 Rat+0.007
oris+0.007
Neg logit contributions
ams-0.006
aws-0.006
 tongues-0.006
 swapped-0.006
ishment-0.006
Token Comp
Token ID3082
Feature activation-1.27e+00

Pos logit contributions
ams+0.003
aws+0.003
 swapped+0.003
 tongues+0.003
ishment+0.003
Neg logit contributions
ray-0.003
 scientist-0.003
 drill-0.003
 Rat-0.003
oris-0.003
Tokenassion
Token ID11857
Feature activation-4.56e+00

Pos logit contributions
 scientist+0.025
ray+0.024
 broom+0.023
 spider+0.023
 drill+0.023
Neg logit contributions
aws-0.033
ams-0.033
ishment-0.032
 swapped-0.032
glers-0.032
Tokenate
Token ID378
Feature activation-1.11e+00

Pos logit contributions
ray+0.118
 scientist+0.118
 broom+0.116
 Rat+0.115
 drill+0.114
Neg logit contributions
aws-0.090
ams-0.087
 crust-0.079
ishment-0.078
ized-0.077
Token means
Token ID1724
Feature activation+5.96e-02

Pos logit contributions
ray+0.019
 scientist+0.016
 spider+0.015
 Monster+0.014
 Rat+0.014
Neg logit contributions
ishment-0.034
 swapped-0.034
 sav-0.033
ams-0.033
 crust-0.033
Token you
Token ID345
Feature activation-1.30e+00

Pos logit contributions
ited+0.002
umbs+0.002
ams+0.001
ishment+0.001
aney+0.001
Neg logit contributions
ray-0.001
 scientist-0.001
 spider-0.001
 Rat-0.001
 drill-0.001
Token care
Token ID1337
Feature activation-2.24e+00

Pos logit contributions
ray+0.027
 scientist+0.025
 drill+0.023
 Rat+0.023
oris+0.023
Neg logit contributions
ams-0.032
aws-0.032
 swapped-0.032
 tongues-0.032
ishment-0.032
Token about
Token ID546
Feature activation-4.01e-01

Pos logit contributions
 scientist+0.059
oris+0.057
ray+0.057
 broom+0.056
 spider+0.054
Neg logit contributions
ams-0.038
aws-0.038
 en-0.036
 both-0.033
lights-0.033
Token time
Token ID640
Feature activation+9.14e-01

Pos logit contributions
glers+0.027
ams+0.026
ishment+0.026
 swapped+0.026
 tongues+0.025
Neg logit contributions
ray-0.051
 spider-0.047
 scientist-0.047
 broom-0.047
 flashlight-0.046
Token there
Token ID612
Feature activation+1.11e+00

Pos logit contributions
aws+0.020
ams+0.020
 tongues+0.020
aney+0.019
 swapped+0.019
Neg logit contributions
ray-0.022
 scientist-0.021
 drill-0.019
oris-0.018
 Rat-0.018
Token was
Token ID373
Feature activation-6.89e-01

Pos logit contributions
ishment+0.037
ams+0.037
 tongues+0.036
 crust+0.035
 swapped+0.035
Neg logit contributions
ray-0.015
 Rat-0.014
 Monster-0.013
 spider-0.013
 scientist-0.012
Token a
Token ID257
Feature activation-1.45e+00

Pos logit contributions
ray+0.013
 scientist+0.012
 Rat+0.011
 drill+0.011
 spider+0.011
Neg logit contributions
ams-0.019
 tongues-0.018
 swapped-0.018
ishment-0.018
aws-0.018
Token little
Token ID1310
Feature activation-1.02e+00

Pos logit contributions
ray+0.040
 scientist+0.036
oris+0.035
 drill+0.034
cher+0.034
Neg logit contributions
 sav-0.027
 crust-0.027
 swapped-0.026
aws-0.026
 refreshing-0.026
Token av
Token ID1196
Feature activation-4.30e+00

Pos logit contributions
ray+0.020
 scientist+0.018
 drill+0.016
 Rat+0.016
oris+0.016
Neg logit contributions
ams-0.027
aws-0.027
 swapped-0.027
 tongues-0.027
ishment-0.027
Tokenac
Token ID330
Feature activation-8.05e-01

Pos logit contributions
ray+0.089
 scientist+0.084
 drill+0.076
 911+0.076
 broom+0.076
Neg logit contributions
ams-0.114
 sav-0.111
 crust-0.109
ited-0.107
umbs-0.106
Tokenado
Token ID4533
Feature activation-1.73e+00

Pos logit contributions
ray+0.022
 scientist+0.019
 Rat+0.018
 spider+0.018
 drill+0.018
Neg logit contributions
 swapped-0.016
ishment-0.016
 tongues-0.016
 crust-0.015
 refreshing-0.015
Token.
Token ID13
Feature activation-1.17e-01

Pos logit contributions
ray+0.040
 scientist+0.037
 spider+0.035
 drill+0.034
 Monster+0.034
Neg logit contributions
ishment-0.041
 swapped-0.041
ams-0.041
 tongues-0.040
aws-0.040
Token It
Token ID632
Feature activation-5.37e-01

Pos logit contributions
ray+0.003
 scientist+0.003
 Rat+0.003
 drill+0.003
 Monster+0.003
Neg logit contributions
ams-0.002
ishment-0.002
aws-0.002
 swapped-0.002
 tongues-0.002
Token was
Token ID373
Feature activation-3.99e-01

Pos logit contributions
ray+0.010
 scientist+0.009
 Rat+0.009
 spider+0.008
 drill+0.008
Neg logit contributions
ishment-0.015
ams-0.015
aws-0.015
 tongues-0.015
 swapped-0.015
Token:
Token ID25
Feature activation-9.17e-01

Pos logit contributions
ray+0.001
 scientist+0.000
 drill+0.000
 Rat+0.000
oris+0.000
Neg logit contributions
ams-0.001
aws-0.001
 tongues-0.001
 swapped-0.001
ishment-0.001
Token "
Token ID366
Feature activation-5.96e-01

Pos logit contributions
ray+0.013
 scientist+0.011
 drill+0.010
 Rat+0.010
oris+0.009
Neg logit contributions
ams-0.030
aws-0.030
 tongues-0.029
 swapped-0.029
ishment-0.029
TokenNice
Token ID35284
Feature activation-1.76e+00

Pos logit contributions
ray+0.013
 scientist+0.012
oris+0.011
 Rat+0.011
 Monster+0.011
Neg logit contributions
 tongues-0.016
 crust-0.015
ishment-0.015
aws-0.015
glers-0.014
Token job
Token ID1693
Feature activation-1.47e+00

Pos logit contributions
ray+0.036
 scientist+0.032
 Rat+0.031
 drill+0.030
oris+0.029
Neg logit contributions
 swapped-0.043
aws-0.043
ams-0.043
 tongues-0.043
 sav-0.042
Token,
Token ID11
Feature activation-1.01e+00

Pos logit contributions
ray+0.028
 scientist+0.026
 Monster+0.024
 drill+0.023
 spider+0.023
Neg logit contributions
ishment-0.041
 swapped-0.041
ams-0.041
ited-0.041
 tongues-0.040
Token Elliot
Token ID30451
Feature activation-4.27e+00

Pos logit contributions
ray+0.025
 scientist+0.024
 drill+0.022
 Rat+0.022
oris+0.022
Neg logit contributions
ams-0.021
 tongues-0.021
 swapped-0.021
aws-0.021
ishment-0.021
Token!"
Token ID2474
Feature activation-1.24e+00

Pos logit contributions
ray+0.107
 scientist+0.100
 drill+0.099
 broom+0.098
 Rat+0.097
Neg logit contributions
ams-0.087
aws-0.082
 sav-0.080
glers-0.078
 swapped-0.078
Token Elliot
Token ID30451
Feature activation-2.83e+00

Pos logit contributions
ray+0.032
 scientist+0.029
 Rat+0.027
 drill+0.027
oris+0.027
Neg logit contributions
ams-0.026
 tongues-0.026
aws-0.026
ishment-0.026
 swapped-0.026
Token had
Token ID550
Feature activation-4.33e-01

Pos logit contributions
ray+0.063
 scientist+0.057
 Monster+0.052
oris+0.052
 Rat+0.050
Neg logit contributions
ishment-0.070
 tongues-0.069
umbs-0.069
 crust-0.069
aws-0.068
Token a
Token ID257
Feature activation+1.69e-01

Pos logit contributions
ray+0.009
 scientist+0.008
 Rat+0.008
 drill+0.007
 spider+0.007
Neg logit contributions
 tongues-0.011
ishment-0.011
glers-0.011
ams-0.011
lights-0.011
Token lot
Token ID1256
Feature activation+9.94e-02

Pos logit contributions
 swapped+0.003
ams+0.003
aws+0.003
 tongues+0.003
ishment+0.003
Neg logit contributions
ray-0.005
 scientist-0.004
 drill-0.004
 Rat-0.004
oris-0.004
Token,
Token ID11
Feature activation+4.17e-01

Pos logit contributions
aws+0.021
aney+0.020
ams+0.020
 tongues+0.019
 swapped+0.019
Neg logit contributions
ray-0.028
 scientist-0.026
 drill-0.024
oris-0.024
cher-0.023
Token there
Token ID612
Feature activation+1.00e+00

Pos logit contributions
aws+0.007
aney+0.007
 swapped+0.007
 crust+0.006
ackers+0.006
Neg logit contributions
ray-0.012
 scientist-0.012
oris-0.011
 drill-0.011
cher-0.011
Token was
Token ID373
Feature activation-4.09e-01

Pos logit contributions
ams+0.026
ishment+0.026
 swapped+0.025
aws+0.025
 tongues+0.025
Neg logit contributions
ray-0.020
 scientist-0.019
 Rat-0.017
 drill-0.017
 spider-0.017
Token a
Token ID257
Feature activation-1.36e+00

Pos logit contributions
ray+0.008
 scientist+0.008
 drill+0.007
 Rat+0.007
oris+0.007
Neg logit contributions
ams-0.011
aws-0.011
 swapped-0.010
 tongues-0.010
ishment-0.010
Token loud
Token ID7812
Feature activation-4.24e-01

Pos logit contributions
ray+0.040
oris+0.035
 scientist+0.034
cher+0.034
 Rat+0.034
Neg logit contributions
 crust-0.024
 sav-0.024
 swapped-0.024
aws-0.023
 refreshing-0.023
Token av
Token ID1196
Feature activation-4.24e+00

Pos logit contributions
ray+0.008
 scientist+0.007
 drill+0.007
 spider+0.006
 Rat+0.006
Neg logit contributions
ams-0.012
aws-0.012
 swapped-0.012
ishment-0.012
 tongues-0.012
Tokenac
Token ID330
Feature activation-4.58e-01

Pos logit contributions
ray+0.094
 scientist+0.087
 911+0.081
cher+0.080
 broom+0.080
Neg logit contributions
ams-0.108
 sav-0.105
 crust-0.104
umbs-0.100
aws-0.100
Tokenado
Token ID4533
Feature activation-9.59e-01

Pos logit contributions
ray+0.013
 scientist+0.012
 drill+0.012
 Rat+0.012
 spider+0.012
Neg logit contributions
 swapped-0.008
ishment-0.008
 tongues-0.008
 refreshing-0.008
 crust-0.008
Token.
Token ID13
Feature activation+4.77e-01

Pos logit contributions
ray+0.022
 scientist+0.022
 spider+0.020
 drill+0.019
 Monster+0.019
Neg logit contributions
ishment-0.023
 swapped-0.023
 tongues-0.022
ams-0.022
aws-0.022
Token It
Token ID632
Feature activation-4.03e-01

Pos logit contributions
ams+0.010
aws+0.010
ishment+0.010
 tongues+0.010
 swapped+0.010
Neg logit contributions
ray-0.012
 scientist-0.011
 drill-0.011
 Rat-0.010
 spider-0.010
Token was
Token ID373
Feature activation-5.35e-01

Pos logit contributions
ray+0.009
 scientist+0.009
oris+0.008
 drill+0.008
 broom+0.008
Neg logit contributions
 sav-0.009
 swapped-0.009
ams-0.009
aws-0.009
 tongues-0.009
TokenOnce
Token ID7454
Feature activation+6.41e-02

Pos logit contributions
ray+0.010
 scientist+0.009
 Rat+0.008
 Monster+0.008
 drill+0.008
Neg logit contributions
ams-0.012
ishment-0.012
 tongues-0.012
 swapped-0.012
 crust-0.011
Token,
Token ID11
Feature activation+9.81e-01

Pos logit contributions
aws+0.002
ams+0.002
aney+0.002
 swapped+0.002
ackers+0.002
Neg logit contributions
ray-0.001
 scientist-0.001
 drill-0.001
 Rat-0.001
cher-0.001
Token there
Token ID612
Feature activation+4.18e-01

Pos logit contributions
aws+0.020
ams+0.020
aney+0.019
 swapped+0.019
ackers+0.019
Neg logit contributions
ray-0.026
 scientist-0.024
 drill-0.023
oris-0.022
 Rat-0.022
Token was
Token ID373
Feature activation-3.39e-01

Pos logit contributions
ishment+0.013
ams+0.013
 tongues+0.013
 swapped+0.013
 crust+0.013
Neg logit contributions
ray-0.006
 scientist-0.005
 Rat-0.005
 spider-0.005
 Monster-0.005
Token a
Token ID257
Feature activation-1.99e+00

Pos logit contributions
ray+0.006
 scientist+0.005
 spider+0.005
 Rat+0.005
 broom+0.005
Neg logit contributions
ams-0.010
 tongues-0.010
ishment-0.010
 swapped-0.010
glers-0.010
Token copper
Token ID15317
Feature activation-4.15e+00

Pos logit contributions
ray+0.056
 scientist+0.049
oris+0.049
cher+0.048
 drill+0.048
Neg logit contributions
 sav-0.036
 swapped-0.036
 crust-0.036
aws-0.036
ams-0.035
Token dog
Token ID3290
Feature activation-7.78e-01

Pos logit contributions
ray+0.093
 scientist+0.086
oris+0.082
 Rat+0.082
 911+0.081
Neg logit contributions
 crust-0.095
aws-0.094
ams-0.092
 sav-0.091
 tongues-0.089
Token who
Token ID508
Feature activation-1.43e+00

Pos logit contributions
ray+0.021
 scientist+0.019
 drill+0.018
 Rat+0.018
 spider+0.018
Neg logit contributions
ams-0.015
aws-0.015
 swapped-0.015
 tongues-0.015
ishment-0.015
Token was
Token ID373
Feature activation-1.37e+00

Pos logit contributions
ray+0.027
 scientist+0.024
 Rat+0.023
 drill+0.021
 Monster+0.021
Neg logit contributions
ishment-0.042
ams-0.040
aws-0.040
 tongues-0.040
aney-0.040
Token very
Token ID845
Feature activation+3.73e-02

Pos logit contributions
ray+0.025
 Rat+0.025
 living+0.024
 scientist+0.024
 Owl+0.023
Neg logit contributions
ited-0.039
ishment-0.038
ams-0.038
lights-0.038
 tongues-0.037
Token exciting
Token ID7895
Feature activation-5.09e-01

Pos logit contributions
ishment+0.001
umbs+0.001
lights+0.001
 unison+0.001
ited+0.001
Neg logit contributions
 scientist-0.001
ray-0.001
 spider-0.001
 living-0.001
 Rat-0.001
Token there
Token ID612
Feature activation+1.31e+00

Pos logit contributions
aws+0.019
ams+0.019
 swapped+0.018
 tongues+0.018
aney+0.018
Neg logit contributions
ray-0.020
 scientist-0.018
 drill-0.017
 Rat-0.016
oris-0.016
Token were
Token ID547
Feature activation-2.26e-01

Pos logit contributions
ams+0.041
ishment+0.041
 tongues+0.040
 swapped+0.039
 crust+0.039
Neg logit contributions
ray-0.021
 Rat-0.018
 scientist-0.018
 Monster-0.018
 spider-0.017
Token two
Token ID734
Feature activation-4.11e-02

Pos logit contributions
ray+0.004
 scientist+0.004
 spider+0.004
 Rat+0.004
 broom+0.003
Neg logit contributions
ishment-0.006
 tongues-0.006
 swapped-0.006
ams-0.006
ited-0.006
Token dogs
Token ID6844
Feature activation+3.20e-02

Pos logit contributions
 brother+0.001
 spider+0.001
 scientist+0.001
 broom+0.001
 mouse+0.001
Neg logit contributions
ishment-0.001
pool-0.001
 swapped-0.001
 contain-0.001
ited-0.001
Token,
Token ID11
Feature activation-6.80e-01

Pos logit contributions
ishment+0.001
 tongues+0.001
ited+0.001
ams+0.001
umbs+0.001
Neg logit contributions
ray-0.001
 scientist-0.001
 Rat-0.001
 Monster-0.001
 drill-0.001
Token Pick
Token ID12346
Feature activation-4.11e+00

Pos logit contributions
ray+0.014
 scientist+0.013
 Rat+0.013
 spider+0.012
 drill+0.012
Neg logit contributions
ams-0.018
 tongues-0.017
ishment-0.017
ited-0.017
ized-0.017
Token and
Token ID290
Feature activation-1.13e+00

Pos logit contributions
ray+0.109
 scientist+0.105
 drill+0.100
 broom+0.096
 spider+0.096
Neg logit contributions
aws-0.085
ams-0.081
 swapped-0.077
ishment-0.075
 sav-0.074
Token Race
Token ID12588
Feature activation+2.16e-02

Pos logit contributions
ray+0.023
 scientist+0.022
 Rat+0.021
oris+0.020
 drill+0.020
Neg logit contributions
ishment-0.029
ams-0.028
 tongues-0.028
ited-0.028
umbs-0.028
Token.
Token ID13
Feature activation-9.92e-01

Pos logit contributions
 tongues+0.001
ishment+0.001
 refreshing+0.001
ited+0.001
 sav+0.001
Neg logit contributions
ray-0.000
 scientist-0.000
 Monster-0.000
 living-0.000
oris-0.000
Token They
Token ID1119
Feature activation+4.14e-01

Pos logit contributions
ray+0.023
 scientist+0.021
 Rat+0.020
oris+0.020
 drill+0.020
Neg logit contributions
ams-0.023
 tongues-0.023
 swapped-0.022
aws-0.022
ishment-0.022
Token were
Token ID547
Feature activation-5.72e-01

Pos logit contributions
ishment+0.012
 crust+0.011
lights+0.011
umbs+0.011
 tongues+0.011
Neg logit contributions
ray-0.008
 scientist-0.008
 Monster-0.007
 knows-0.007
oris-0.007
Token grateful
Token ID14066
Feature activation-1.49e+00

Pos logit contributions
ray+0.015
 scientist+0.012
 drill+0.011
 broom+0.011
oris+0.011
Neg logit contributions
 sav-0.030
 crust-0.029
aws-0.028
ishment-0.028
 refreshing-0.028
Token fox
Token ID21831
Feature activation-1.58e+00

Pos logit contributions
ray+0.029
 scientist+0.027
 drill+0.024
 spider+0.024
 Rat+0.024
Neg logit contributions
ams-0.040
 swapped-0.040
aws-0.040
 tongues-0.039
ishment-0.039
Tokenes
Token ID274
Feature activation-1.92e-02

Pos logit contributions
ray+0.036
 scientist+0.033
 Monster+0.030
 Rat+0.030
oris+0.029
Neg logit contributions
ishment-0.039
 tongues-0.038
aws-0.038
ams-0.038
 swapped-0.037
Token thanked
Token ID26280
Feature activation-7.74e-01

Pos logit contributions
ray+0.000
 scientist+0.000
 Monster+0.000
cher+0.000
 Rat+0.000
Neg logit contributions
ishment-0.000
 tongues-0.000
 crust-0.000
aws-0.000
ams-0.000
Token the
Token ID262
Feature activation-1.00e+00

Pos logit contributions
 scientist+0.015
ray+0.015
 Rat+0.015
 Monster+0.014
 Oliver+0.014
Neg logit contributions
ams-0.021
 swapped-0.020
ited-0.020
umbs-0.020
 tongues-0.020
Token copper
Token ID15317
Feature activation-4.07e+00

Pos logit contributions
ray+0.017
 drill+0.013
 scientist+0.013
arsh+0.013
 broom+0.013
Neg logit contributions
 sav-0.030
aws-0.028
 crust-0.028
 refreshing-0.028
 delight-0.027
Token for
Token ID329
Feature activation-7.53e-01

Pos logit contributions
ray+0.069
 Mechan+0.066
 Polic+0.064
 outage+0.063
arsh+0.062
Neg logit contributions
 crust-0.102
ams-0.097
 sav-0.096
 ne-0.094
ole-0.094
Token his
Token ID465
Feature activation-5.41e-01

Pos logit contributions
ray+0.016
 scientist+0.015
 drill+0.014
 Rat+0.014
 spider+0.013
Neg logit contributions
aws-0.019
ams-0.019
 tongues-0.018
 swapped-0.018
ishment-0.018
Token help
Token ID1037
Feature activation-6.82e-01

Pos logit contributions
ray+0.011
 scientist+0.010
oris+0.009
 drill+0.009
 Rat+0.009
Neg logit contributions
 tongues-0.014
ams-0.014
aws-0.014
 swapped-0.014
ishment-0.013
Token.
Token ID13
Feature activation-9.70e-01

Pos logit contributions
ray+0.013
 scientist+0.013
 drill+0.012
 broom+0.011
 Rat+0.011
Neg logit contributions
aws-0.018
 swapped-0.018
ishment-0.018
ams-0.018
 tongues-0.018
Token He
Token ID679
Feature activation-1.62e+00

Pos logit contributions
ray+0.022
 scientist+0.020
 Rat+0.020
 Monster+0.019
oris+0.019
Neg logit contributions
ited-0.025
ams-0.024
 tongues-0.024
glers-0.023
 swapped-0.023
Token,
Token ID11
Feature activation+3.18e-01

Pos logit contributions
aws+0.019
aney+0.018
ams+0.018
 tongues+0.018
 swapped+0.018
Neg logit contributions
ray-0.023
 scientist-0.022
 drill-0.020
oris-0.019
 broom-0.019
Token there
Token ID612
Feature activation+9.44e-01

Pos logit contributions
aws+0.007
ams+0.007
 swapped+0.006
 tongues+0.006
ishment+0.006
Neg logit contributions
ray-0.008
 scientist-0.008
 drill-0.007
oris-0.007
 Rat-0.007
Token was
Token ID373
Feature activation-4.49e-01

Pos logit contributions
ams+0.028
ishment+0.028
 tongues+0.027
 swapped+0.027
 crust+0.026
Neg logit contributions
ray-0.016
 scientist-0.015
 Rat-0.014
 Monster-0.014
 spider-0.014
Token an
Token ID281
Feature activation-6.29e-01

Pos logit contributions
ray+0.007
 scientist+0.007
 Rat+0.006
 spider+0.006
 drill+0.006
Neg logit contributions
ams-0.013
ishment-0.013
 tongues-0.013
 swapped-0.013
glers-0.013
Token adventurous
Token ID39770
Feature activation-1.13e+00

Pos logit contributions
ray+0.015
 scientist+0.014
 spider+0.013
 drill+0.013
 Rat+0.013
Neg logit contributions
 tongues-0.014
 swapped-0.014
umbs-0.014
aws-0.014
ams-0.014
Token copper
Token ID15317
Feature activation-4.05e+00

Pos logit contributions
ray+0.024
 scientist+0.021
 drill+0.020
 Rat+0.020
oris+0.019
Neg logit contributions
aws-0.028
ams-0.028
 swapped-0.028
 tongues-0.028
ishment-0.028
Token.
Token ID13
Feature activation+3.26e-02

Pos logit contributions
ray+0.097
 911+0.091
 Rat+0.089
oris+0.088
 Taylor+0.087
Neg logit contributions
 crust-0.083
aws-0.082
 sav-0.079
ams-0.078
glers-0.076
Token He
Token ID679
Feature activation-1.43e+00

Pos logit contributions
ams+0.001
aws+0.001
 swapped+0.001
 tongues+0.001
ishment+0.001
Neg logit contributions
ray-0.001
 scientist-0.001
 drill-0.001
 Rat-0.001
oris-0.001
Token was
Token ID373
Feature activation-7.18e-01

Pos logit contributions
ray+0.037
 scientist+0.034
 drill+0.033
oris+0.033
 broom+0.032
Neg logit contributions
 sav-0.029
ams-0.029
 swapped-0.029
aws-0.029
 tongues-0.027
Token always
Token ID1464
Feature activation-1.25e+00

Pos logit contributions
ray+0.022
 scientist+0.020
oris+0.019
 drill+0.019
 spider+0.019
Neg logit contributions
ams-0.012
 swapped-0.012
aws-0.012
 sav-0.012
 crust-0.011
Token curious
Token ID11040
Feature activation-1.63e-02

Pos logit contributions
ray+0.035
 scientist+0.031
 drill+0.030
 Rat+0.030
oris+0.030
Neg logit contributions
ams-0.024
 sav-0.023
 swapped-0.023
ishment-0.023
 sweaty-0.023
Token playing
Token ID2712
Feature activation-2.31e-01

Pos logit contributions
 scientist+0.004
ray+0.004
 Rat+0.003
 drill+0.003
 spider+0.003
Neg logit contributions
umbs-0.005
ams-0.005
aws-0.005
ited-0.005
ishment-0.004
Token together
Token ID1978
Feature activation+1.17e-01

Pos logit contributions
ray+0.006
 scientist+0.006
 drill+0.005
 spider+0.005
 Rat+0.005
Neg logit contributions
ams-0.005
 tongues-0.005
aws-0.005
 swapped-0.005
ishment-0.005
Token under
Token ID739
Feature activation+1.52e+00

Pos logit contributions
ams+0.003
umbs+0.003
aney+0.003
ishment+0.003
ited+0.003
Neg logit contributions
 scientist-0.002
ray-0.002
 Monster-0.002
 spider-0.002
 living-0.002
Token the
Token ID262
Feature activation-3.70e-01

Pos logit contributions
ams+0.049
ited+0.047
aney+0.046
ishment+0.046
ount+0.046
Neg logit contributions
 scientist-0.024
 spider-0.023
 drill-0.022
 Rat-0.022
 Monster-0.022
Token big
Token ID1263
Feature activation+3.66e-01

Pos logit contributions
ray+0.008
 scientist+0.008
 spider+0.007
 broom+0.007
 drill+0.007
Neg logit contributions
ams-0.010
ishment-0.009
 swapped-0.009
aws-0.009
ited-0.009
Token oak
Token ID26210
Feature activation-4.03e+00

Pos logit contributions
ams+0.010
 swapped+0.010
aws+0.010
ishment+0.010
aney+0.010
Neg logit contributions
ray-0.007
 scientist-0.007
 spider-0.006
 drill-0.006
 broom-0.006
Token tree
Token ID5509
Feature activation-1.46e+00

Pos logit contributions
ray+0.089
 scientist+0.082
 drill+0.076
 Rat+0.075
 spider+0.074
Neg logit contributions
ams-0.097
 swapped-0.096
aws-0.096
 tongues-0.096
ishment-0.096
Token.
Token ID13
Feature activation-9.36e-02

Pos logit contributions
ray+0.029
 scientist+0.027
 drill+0.026
 spider+0.025
 broom+0.025
Neg logit contributions
ams-0.038
 swapped-0.037
ishment-0.037
 tongues-0.037
 Ph-0.037
Token It
Token ID632
Feature activation+7.64e-02

Pos logit contributions
ray+0.002
 scientist+0.002
 Monster+0.002
 Mouse+0.002
 Rat+0.002
Neg logit contributions
ams-0.002
glers-0.002
ited-0.002
 swapped-0.002
ized-0.002
Token was
Token ID373
Feature activation+1.85e-01

Pos logit contributions
ishment+0.003
ams+0.003
 tongues+0.002
umbs+0.002
 crust+0.002
Neg logit contributions
ray-0.001
 scientist-0.001
 Monster-0.001
 Rat-0.001
 spider-0.001
Token like
Token ID588
Feature activation-3.69e-01

Pos logit contributions
umbs+0.006
 tongues+0.006
ishment+0.006
ized+0.006
ited+0.006
Neg logit contributions
 Rat-0.002
 scientist-0.002
 drill-0.002
 if-0.002
 spider-0.002
Token happened
Token ID3022
Feature activation-5.88e-01

Pos logit contributions
 swapped+0.014
aws+0.014
ams+0.014
 tongues+0.013
glers+0.013
Neg logit contributions
ray-0.014
 scientist-0.013
oris-0.012
 broom-0.012
 drill-0.011
Token.
Token ID13
Feature activation+6.27e-03

Pos logit contributions
ray+0.017
 scientist+0.016
oris+0.016
 Rat+0.016
 drill+0.015
Neg logit contributions
aws-0.010
ams-0.010
ishment-0.009
 swapped-0.009
 tongues-0.009
Token He
Token ID679
Feature activation-4.25e-01

Pos logit contributions
aws+0.000
ams+0.000
 swapped+0.000
 tongues+0.000
ishment+0.000
Neg logit contributions
ray-0.000
 scientist-0.000
 drill-0.000
 Rat-0.000
oris-0.000
Token saw
Token ID2497
Feature activation-1.17e+00

Pos logit contributions
ray+0.011
 scientist+0.010
oris+0.009
 drill+0.009
inea+0.009
Neg logit contributions
 swapped-0.009
 sav-0.009
aws-0.008
 ne-0.008
ams-0.008
Token the
Token ID262
Feature activation-1.20e+00

Pos logit contributions
ray+0.028
 scientist+0.026
 drill+0.024
 Rat+0.024
 spider+0.024
Neg logit contributions
ams-0.027
 tongues-0.027
ishment-0.026
 swapped-0.026
aws-0.026
Token broken
Token ID5445
Feature activation-4.03e+00

Pos logit contributions
ray+0.030
oris+0.026
cher+0.026
 scientist+0.026
inea+0.026
Neg logit contributions
 crust-0.027
 sav-0.026
 sweaty-0.025
 tongues-0.024
 refreshing-0.024
Token axe
Token ID23272
Feature activation-1.71e+00

Pos logit contributions
ray+0.093
cher+0.084
oris+0.082
Fel+0.082
 scientist+0.080
Neg logit contributions
 crust-0.101
 sav-0.090
 wheels-0.089
 tongues-0.089
ized-0.088
Token and
Token ID290
Feature activation-1.11e-01

Pos logit contributions
ray+0.034
 scientist+0.031
 drill+0.030
 Rat+0.028
 spider+0.028
Neg logit contributions
ams-0.046
aws-0.045
 swapped-0.045
ishment-0.044
 tongues-0.044
Token the
Token ID262
Feature activation-4.11e-01

Pos logit contributions
ray+0.003
 scientist+0.003
oris+0.003
 drill+0.002
 Rat+0.002
Neg logit contributions
 swapped-0.002
ams-0.002
aws-0.002
 tongues-0.002
 sav-0.002
Token rock
Token ID3881
Feature activation-1.15e+00

Pos logit contributions
ray+0.009
 scientist+0.008
oris+0.008
 Monster+0.008
cher+0.008
Neg logit contributions
 crust-0.010
 tongues-0.010
 sav-0.010
aws-0.009
 refreshing-0.009
Token.
Token ID13
Feature activation-2.03e-01

Pos logit contributions
ray+0.021
 drill+0.020
 scientist+0.019
 broom+0.019
 spider+0.018
Neg logit contributions
ishment-0.032
ams-0.032
ackers-0.032
aney-0.031
aws-0.031
Token cater
Token ID20825
Feature activation-1.47e+00

Pos logit contributions
ray+0.002
 scientist+0.002
 drill+0.002
cher+0.002
oris+0.002
Neg logit contributions
 sav-0.003
 refreshing-0.003
aws-0.003
 tongues-0.003
 crust-0.003
Tokenpillar
Token ID41643
Feature activation-8.02e-01

Pos logit contributions
ray+0.030
 drill+0.028
 scientist+0.028
 flashlight+0.027
 911+0.027
Neg logit contributions
aws-0.037
ized-0.037
ited-0.036
umbs-0.036
 sav-0.036
Token climbed
Token ID19952
Feature activation-3.05e-01

Pos logit contributions
ray+0.020
 scientist+0.019
 drill+0.017
 Rat+0.017
 spider+0.017
Neg logit contributions
ams-0.017
aws-0.017
 tongues-0.017
ishment-0.017
 swapped-0.017
Token up
Token ID510
Feature activation+1.10e+00

Pos logit contributions
ray+0.009
 scientist+0.008
 drill+0.008
cher+0.007
oris+0.007
Neg logit contributions
aws-0.006
 sav-0.006
 crust-0.006
ams-0.005
 swapped-0.005
Token the
Token ID262
Feature activation-6.60e-01

Pos logit contributions
ams+0.029
 swapped+0.028
aws+0.028
 tongues+0.028
ishment+0.028
Neg logit contributions
ray-0.022
 scientist-0.020
 drill-0.019
 Rat-0.019
 spider-0.018
Token oak
Token ID26210
Feature activation-4.01e+00

Pos logit contributions
ray+0.018
 scientist+0.017
 drill+0.016
cher+0.016
oris+0.015
Neg logit contributions
 sav-0.013
 refreshing-0.013
 crust-0.013
 tongues-0.013
aws-0.012
Token tree
Token ID5509
Feature activation-1.52e+00

Pos logit contributions
ray+0.100
 scientist+0.095
 drill+0.088
 Rat+0.087
oris+0.086
Neg logit contributions
 sav-0.084
 crust-0.084
aws-0.084
ams-0.082
 tongues-0.080
Token,
Token ID11
Feature activation-1.14e+00

Pos logit contributions
ray+0.031
 scientist+0.028
 drill+0.026
 Rat+0.025
 broom+0.025
Neg logit contributions
ams-0.040
 swapped-0.039
aws-0.039
 tongues-0.039
ishment-0.039
Token sure
Token ID1654
Feature activation+1.26e-01

Pos logit contributions
ray+0.026
 scientist+0.024
 drill+0.022
 Rat+0.022
oris+0.022
Neg logit contributions
ams-0.026
aws-0.026
 swapped-0.026
 tongues-0.026
ishment-0.026
Token-
Token ID12
Feature activation-6.23e-01

Pos logit contributions
aws+0.003
 swapped+0.003
ams+0.003
 tongues+0.003
ishment+0.003
Neg logit contributions
ray-0.003
 scientist-0.003
 drill-0.003
oris-0.003
 Rat-0.003
Tokenfooted
Token ID43127
Feature activation-8.37e-02

Pos logit contributions
 scientist+0.014
ray+0.014
 drill+0.013
 broom+0.013
 flashlight+0.013
Neg logit contributions
ams-0.014
aws-0.014
 swapped-0.013
 sav-0.013
ized-0.013
Token he
Token ID339
Feature activation+4.38e-01

Pos logit contributions
ray+0.028
 scientist+0.025
oris+0.024
 drill+0.024
cher+0.023
Neg logit contributions
 refreshing-0.015
 Nem-0.015
 sav-0.014
 ne-0.014
 Ph-0.014
Token came
Token ID1625
Feature activation-1.13e+00

Pos logit contributions
 sav+0.009
aws+0.008
 swapped+0.008
 ne+0.008
ams+0.008
Neg logit contributions
ray-0.012
 scientist-0.011
oris-0.010
 drill-0.010
 outage-0.010
Token across
Token ID1973
Feature activation+9.10e-02

Pos logit contributions
 drill+0.033
 scientist+0.032
ray+0.032
 spider+0.031
 outage+0.030
Neg logit contributions
aws-0.019
 both-0.017
 sav-0.017
ams-0.016
 so-0.016
Token a
Token ID257
Feature activation-8.13e-01

Pos logit contributions
 sav+0.001
 ne+0.001
 Nem+0.001
 refreshing+0.001
aws+0.001
Neg logit contributions
ray-0.003
cher-0.003
ometer-0.003
oris-0.003
 outage-0.003
Token small
Token ID1402
Feature activation-9.25e-01

Pos logit contributions
ray+0.023
 scientist+0.021
oris+0.020
cher+0.020
inea+0.020
Neg logit contributions
 sav-0.015
 refreshing-0.014
 crust-0.014
aws-0.014
 ne-0.013
Token ac
Token ID936
Feature activation-4.00e+00

Pos logit contributions
ray+0.022
 scientist+0.020
 drill+0.018
 Rat+0.018
oris+0.018
Neg logit contributions
aws-0.021
ishment-0.021
ams-0.021
 sav-0.021
 crust-0.021
Tokenorn
Token ID1211
Feature activation-1.49e+00

Pos logit contributions
 911+0.092
 scientist+0.090
ray+0.087
 department+0.087
 permission+0.085
Neg logit contributions
 sav-0.090
umbs-0.085
aws-0.084
glers-0.084
ized-0.083
Token that
Token ID326
Feature activation-3.61e-01

Pos logit contributions
ray+0.044
 scientist+0.043
oris+0.041
 Rat+0.040
inea+0.039
Neg logit contributions
 crust-0.024
 sav-0.023
aws-0.023
ized-0.022
ams-0.022
Token was
Token ID373
Feature activation+6.26e-01

Pos logit contributions
ray+0.009
 scientist+0.008
 drill+0.008
oris+0.008
 broom+0.008
Neg logit contributions
 sav-0.008
 swapped-0.007
ams-0.007
 tongues-0.007
aws-0.007
Token lying
Token ID9105
Feature activation-1.09e+00

Pos logit contributions
ams+0.015
ishment+0.014
aws+0.014
 tongues+0.014
 swapped+0.014
Neg logit contributions
ray-0.014
 scientist-0.014
 drill-0.013
 Rat-0.013
 broom-0.013
Token on
Token ID319
Feature activation-3.66e-01

Pos logit contributions
ray+0.024
 scientist+0.023
 drill+0.021
oris+0.021
 Rat+0.021
Neg logit contributions
ams-0.026
aws-0.025
 swapped-0.025
 crust-0.025
 sav-0.025
Token there
Token ID612
Feature activation+3.66e-01

Pos logit contributions
ray+0.004
 scientist+0.004
 drill+0.004
oris+0.004
 Rat+0.004
Neg logit contributions
aws-0.003
ams-0.003
aney-0.003
 swapped-0.003
 crust-0.003
Token were
Token ID547
Feature activation-4.99e-01

Pos logit contributions
ams+0.010
ishment+0.010
 tongues+0.010
 swapped+0.010
aws+0.010
Neg logit contributions
ray-0.007
 scientist-0.006
 Rat-0.006
 spider-0.006
 drill-0.006
Token two
Token ID734
Feature activation-3.10e-01

Pos logit contributions
ray+0.009
 scientist+0.008
 spider+0.008
 Rat+0.008
 mouse+0.007
Neg logit contributions
ishment-0.014
 tongues-0.013
ited-0.013
ams-0.013
 swapped-0.013
Token friends
Token ID2460
Feature activation-5.98e-02

Pos logit contributions
 scientist+0.005
 spider+0.005
 broom+0.005
 brother+0.004
ray+0.004
Neg logit contributions
ishment-0.009
pool-0.009
 swapped-0.009
 contain-0.009
umbs-0.009
Token named
Token ID3706
Feature activation-1.54e-01

Pos logit contributions
ray+0.001
 scientist+0.001
 drill+0.001
 Rat+0.001
 spider+0.001
Neg logit contributions
ams-0.001
aws-0.001
 tongues-0.001
 swapped-0.001
ishment-0.001
Token Walter
Token ID15819
Feature activation-3.97e+00

Pos logit contributions
ray+0.004
 scientist+0.004
 Rat+0.004
 spider+0.003
 broom+0.003
Neg logit contributions
ams-0.003
ited-0.003
 tongues-0.003
ized-0.003
ishment-0.003
Token and
Token ID290
Feature activation-6.70e-01

Pos logit contributions
ray+0.112
 drill+0.107
 scientist+0.106
 spider+0.104
oris+0.103
Neg logit contributions
ams-0.073
aws-0.065
 crust-0.062
 sav-0.062
ackers-0.061
Token L
Token ID406
Feature activation+1.78e-01

Pos logit contributions
ray+0.017
 scientist+0.015
 drill+0.014
 Rat+0.014
oris+0.014
Neg logit contributions
ams-0.014
aws-0.014
 swapped-0.014
 tongues-0.014
ishment-0.014
Tokenila
Token ID10102
Feature activation+3.01e-02

Pos logit contributions
 tongues+0.004
 swapped+0.004
ishment+0.004
 crust+0.004
ams+0.004
Neg logit contributions
ray-0.005
 scientist-0.004
oris-0.004
 Rat-0.004
 drill-0.004
Token.
Token ID13
Feature activation-1.75e+00

Pos logit contributions
ams+0.001
 swapped+0.001
 tongues+0.001
ishment+0.001
aws+0.001
Neg logit contributions
ray-0.001
 scientist-0.001
 drill-0.001
 Rat-0.001
 spider-0.001
Token They
Token ID1119
Feature activation-6.93e-02

Pos logit contributions
ray+0.057
 scientist+0.055
 drill+0.052
 Rat+0.051
oris+0.051
Neg logit contributions
ams-0.023
 swapped-0.023
aws-0.023
 tongues-0.023
ishment-0.022
Token wrong
Token ID2642
Feature activation-7.34e-01

Pos logit contributions
ray+0.003
oris+0.003
inea+0.003
 Monster+0.003
 scientist+0.002
Neg logit contributions
aws-0.002
 sweaty-0.002
ized-0.002
 crust-0.002
 swapped-0.002
Token.
Token ID13
Feature activation-3.10e-01

Pos logit contributions
ray+0.019
 scientist+0.017
 drill+0.016
 spider+0.016
 Rat+0.016
Neg logit contributions
 swapped-0.015
ams-0.015
aws-0.015
 tongues-0.015
ishment-0.015
Token She
Token ID1375
Feature activation-7.19e-01

Pos logit contributions
cher+0.012
ray+0.011
inea+0.011
 scientist+0.010
 compass+0.010
Neg logit contributions
aws-0.003
 tongues-0.003
 Stella-0.003
 sav-0.003
 swapped-0.003
Token sees
Token ID7224
Feature activation-1.31e+00

Pos logit contributions
 scientist+0.016
ray+0.016
arsh+0.016
inea+0.015
cher+0.015
Neg logit contributions
aws-0.015
 swapped-0.015
 sav-0.015
glers-0.015
 ne-0.015
Token the
Token ID262
Feature activation-1.25e+00

Pos logit contributions
ray+0.032
 scientist+0.030
 drill+0.028
 Rat+0.028
 spider+0.027
Neg logit contributions
ams-0.029
 tongues-0.029
 swapped-0.029
aws-0.028
ishment-0.028
Token broken
Token ID5445
Feature activation-3.97e+00

Pos logit contributions
ray+0.038
 scientist+0.034
oris+0.033
inea+0.032
cher+0.032
Neg logit contributions
 swapped-0.021
 tongues-0.021
 crust-0.021
aws-0.021
ams-0.020
Token brown
Token ID7586
Feature activation-2.39e+00

Pos logit contributions
ray+0.111
oris+0.098
 scientist+0.097
 Rat+0.094
cher+0.092
Neg logit contributions
 crust-0.076
ams-0.076
 tongues-0.074
 floats-0.074
 swapped-0.073
Token engine
Token ID3113
Feature activation-1.16e+00

Pos logit contributions
ray+0.062
 scientist+0.055
oris+0.055
cher+0.053
 Rat+0.053
Neg logit contributions
 crust-0.049
 tongues-0.049
ams-0.048
 floats-0.047
aws-0.046
Token and
Token ID290
Feature activation-1.96e-01

Pos logit contributions
ray+0.020
 drill+0.018
 scientist+0.017
 broom+0.016
 spider+0.015
Neg logit contributions
ackers-0.036
aws-0.035
ams-0.035
aney-0.035
ishment-0.035
Token the
Token ID262
Feature activation-4.78e-01

Pos logit contributions
ray+0.006
 scientist+0.005
oris+0.005
inea+0.005
cher+0.005
Neg logit contributions
 swapped-0.003
 tongues-0.003
 floats-0.003
aws-0.003
 sav-0.003
Token crying
Token ID13774
Feature activation-2.53e+00

Pos logit contributions
ray+0.012
 scientist+0.011
 drill+0.011
 Rat+0.011
oris+0.011
Neg logit contributions
ams-0.010
aws-0.010
 swapped-0.010
 tongues-0.010
ishment-0.010
Token As
Token ID1081
Feature activation-1.37e+00

Pos logit contributions
ray+0.006
 scientist+0.005
 Monster+0.005
 Rat+0.005
 Oliver+0.005
Neg logit contributions
 tongues-0.007
ams-0.007
ited-0.007
glers-0.007
ishment-0.007
Token the
Token ID262
Feature activation+2.66e-01

Pos logit contributions
 scientist+0.026
 Rat+0.023
ray+0.023
 Monster+0.023
 Oliver+0.022
Neg logit contributions
umbs-0.040
aws-0.040
 tongues-0.039
aney-0.039
ams-0.038
Token farmer
Token ID18739
Feature activation+6.36e-01

Pos logit contributions
aws+0.008
aney+0.007
ishment+0.007
ams+0.007
ackers+0.007
Neg logit contributions
 scientist-0.006
ray-0.005
 mouse-0.005
 drill-0.005
 owl-0.005
Token scattered
Token ID16830
Feature activation+2.66e-01

Pos logit contributions
ishment+0.016
 tongues+0.016
ams+0.016
aws+0.016
lights+0.015
Neg logit contributions
ray-0.015
 scientist-0.013
 Monster-0.012
 spider-0.012
 mouse-0.012
Token the
Token ID262
Feature activation-9.34e-01

Pos logit contributions
aws+0.008
rimp+0.008
ishment+0.008
aney+0.008
ackers+0.008
Neg logit contributions
 scientist-0.004
 drill-0.004
 spider-0.004
 broom-0.004
 Monster-0.004
Token av
Token ID1196
Feature activation-3.95e+00

Pos logit contributions
ray+0.022
 scientist+0.021
 drill+0.019
 broom+0.019
 spider+0.019
Neg logit contributions
aws-0.022
 swapped-0.021
ishment-0.021
ams-0.021
 tongues-0.021
Tokenoc
Token ID420
Feature activation-2.38e+00

Pos logit contributions
ray+0.069
 scientist+0.063
 drill+0.057
 spider+0.055
 Rat+0.055
Neg logit contributions
ams-0.115
aws-0.112
 crust-0.111
 tongues-0.111
 swapped-0.111
Tokenados
Token ID22484
Feature activation-1.95e-01

Pos logit contributions
ray+0.068
 scientist+0.064
 drill+0.061
 Rat+0.060
 spider+0.060
Neg logit contributions
ams-0.042
 tongues-0.041
 crust-0.041
ishment-0.041
aws-0.041
Token,
Token ID11
Feature activation-1.30e+00

Pos logit contributions
ray+0.004
 scientist+0.004
 drill+0.004
 broom+0.004
 spider+0.003
Neg logit contributions
ams-0.005
ishment-0.005
aws-0.005
 tongues-0.005
aney-0.005
Token he
Token ID339
Feature activation+1.74e-02

Pos logit contributions
 scientist+0.025
ray+0.025
 Rat+0.023
 spider+0.023
 drill+0.023
Neg logit contributions
ams-0.035
ishment-0.034
aws-0.034
 tongues-0.034
 swapped-0.034
Token noticed
Token ID6810
Feature activation+1.26e-02

Pos logit contributions
ams+0.000
ishment+0.000
aws+0.000
umbs+0.000
 tongues+0.000
Neg logit contributions
ray-0.000
 scientist-0.000
 drill-0.000
 spider-0.000
 Monster-0.000
Token Suddenly
Token ID24975
Feature activation-6.06e-01

Pos logit contributions
ams+0.016
ited+0.015
glers+0.015
 refreshing+0.015
ishment+0.015
Neg logit contributions
ray-0.014
 scientist-0.013
 spider-0.012
 Rat-0.012
 Monster-0.012
Token,
Token ID11
Feature activation-7.29e-02

Pos logit contributions
ray+0.015
 scientist+0.014
 drill+0.013
oris+0.013
cher+0.013
Neg logit contributions
aws-0.013
 tongues-0.012
 swapped-0.012
ams-0.012
 crust-0.012
Token she
Token ID673
Feature activation+1.21e+00

Pos logit contributions
ray+0.002
 scientist+0.002
 drill+0.001
 Rat+0.001
 spider+0.001
Neg logit contributions
ams-0.002
aws-0.002
 swapped-0.002
 tongues-0.002
ishment-0.002
Token spotted
Token ID13489
Feature activation-1.88e-01

Pos logit contributions
ishment+0.035
ams+0.035
 crust+0.035
umbs+0.035
 tongues+0.035
Neg logit contributions
ray-0.020
 scientist-0.019
 Rat-0.017
 spider-0.017
 drill-0.017
Token an
Token ID281
Feature activation-8.79e-01

Pos logit contributions
ray+0.003
 scientist+0.003
 spider+0.003
 Rat+0.003
 drill+0.003
Neg logit contributions
ams-0.006
 swapped-0.006
 tongues-0.006
ishment-0.006
ited-0.005
Token ac
Token ID936
Feature activation-3.95e+00

Pos logit contributions
ray+0.020
 scientist+0.019
 spider+0.018
 drill+0.017
 Rat+0.017
Neg logit contributions
 swapped-0.021
ams-0.021
 tongues-0.020
aws-0.020
umbs-0.020
Tokenorn
Token ID1211
Feature activation-1.63e+00

Pos logit contributions
ray+0.083
 scientist+0.081
 911+0.079
 department+0.072
 drill+0.072
Neg logit contributions
 sav-0.093
umbs-0.093
glers-0.093
 crust-0.092
aws-0.091
Token that
Token ID326
Feature activation-8.31e-01

Pos logit contributions
ray+0.036
 scientist+0.034
 drill+0.031
 Rat+0.030
 spider+0.030
Neg logit contributions
ams-0.039
aws-0.039
 swapped-0.039
 tongues-0.038
ishment-0.038
Token had
Token ID550
Feature activation-1.22e+00

Pos logit contributions
 spider+0.013
ray+0.013
 Rat+0.013
 scientist+0.012
 flashlight+0.012
Neg logit contributions
ishment-0.026
ams-0.026
umbs-0.025
 tongues-0.025
aws-0.024
Token fallen
Token ID9292
Feature activation-1.50e+00

Pos logit contributions
ray+0.027
 Rat+0.024
 broom+0.024
 spider+0.024
 scientist+0.024
Neg logit contributions
ishment-0.032
ams-0.030
glers-0.029
rimp-0.029
 tongues-0.028
Token from
Token ID422
Feature activation+3.34e-01

Pos logit contributions
ray+0.032
 scientist+0.030
 drill+0.027
 Rat+0.027
oris+0.027
Neg logit contributions
ams-0.037
aws-0.037
 swapped-0.037
 tongues-0.036
ishment-0.036
Token time
Token ID640
Feature activation+7.61e-01

Pos logit contributions
glers+0.032
ams+0.030
 swapped+0.029
ishment+0.029
 tongues+0.029
Neg logit contributions
ray-0.053
 scientist-0.050
 spider-0.050
 broom-0.049
 drill-0.048
Token there
Token ID612
Feature activation+1.07e+00

Pos logit contributions
aws+0.016
aney+0.015
ams+0.015
 tongues+0.015
 swapped+0.015
Neg logit contributions
ray-0.020
 scientist-0.018
 drill-0.017
oris-0.016
cher-0.016
Token was
Token ID373
Feature activation-6.28e-01

Pos logit contributions
ishment+0.033
ams+0.033
 tongues+0.032
 swapped+0.032
 crust+0.031
Neg logit contributions
ray-0.017
 Rat-0.015
 scientist-0.015
 Monster-0.015
 spider-0.015
Token a
Token ID257
Feature activation-1.55e+00

Pos logit contributions
ray+0.010
 scientist+0.009
 Rat+0.009
 spider+0.009
 drill+0.008
Neg logit contributions
ams-0.019
ishment-0.019
 tongues-0.019
 swapped-0.018
glers-0.018
Token strong
Token ID1913
Feature activation-1.27e+00

Pos logit contributions
ray+0.044
 scientist+0.038
oris+0.038
cher+0.037
 drill+0.037
Neg logit contributions
 sav-0.028
 crust-0.028
 swapped-0.028
aws-0.028
 refreshing-0.027
Token copper
Token ID15317
Feature activation-3.95e+00

Pos logit contributions
ray+0.022
 scientist+0.021
 spider+0.020
 drill+0.019
 Monster+0.018
Neg logit contributions
ams-0.037
aws-0.037
 swapped-0.037
ishment-0.036
 tongues-0.036
Token box
Token ID3091
Feature activation-5.97e-01

Pos logit contributions
ray+0.092
 scientist+0.084
oris+0.082
 Rat+0.081
 911+0.080
Neg logit contributions
 crust-0.088
aws-0.085
ams-0.084
 sav-0.083
 tongues-0.082
Token.
Token ID13
Feature activation+4.74e-01

Pos logit contributions
ray+0.013
 scientist+0.012
 drill+0.011
 spider+0.011
 living+0.011
Neg logit contributions
ishment-0.015
 tongues-0.015
ams-0.015
 swapped-0.015
ited-0.015
Token It
Token ID632
Feature activation-6.01e-01

Pos logit contributions
ams+0.012
glers+0.012
aws+0.012
 tongues+0.012
ishment+0.012
Neg logit contributions
ray-0.010
 scientist-0.010
 drill-0.009
 Rat-0.009
 Monster-0.009
Token was
Token ID373
Feature activation-3.17e-01

Pos logit contributions
ray+0.014
 scientist+0.013
 drill+0.012
oris+0.012
 Rat+0.012
Neg logit contributions
ams-0.014
aws-0.014
 swapped-0.014
 tongues-0.014
 sav-0.014
Token very
Token ID845
Feature activation+3.15e-01

Pos logit contributions
ray+0.007
 scientist+0.006
 drill+0.006
 Rat+0.006
 spider+0.006
Neg logit contributions
ams-0.008
ishment-0.008
aws-0.008
 tongues-0.008
 swapped-0.008
Token tree
Token ID5509
Feature activation-5.23e-01

Pos logit contributions
umbs+0.000
ams+0.000
ishment+0.000
aney+0.000
aws+0.000
Neg logit contributions
 scientist-0.000
ray-0.000
 mouse-0.000
 spider-0.000
 drill-0.000
Token was
Token ID373
Feature activation+1.06e-01

Pos logit contributions
ray+0.010
 scientist+0.009
 drill+0.008
 spider+0.008
 Monster+0.008
Neg logit contributions
ishment-0.014
umbs-0.014
aws-0.014
ams-0.014
 tongues-0.014
Token sad
Token ID6507
Feature activation-3.92e-01

Pos logit contributions
ishment+0.003
 tongues+0.003
aws+0.003
umbs+0.003
 floats+0.003
Neg logit contributions
 scientist-0.002
 drill-0.002
 broom-0.002
 living-0.002
ray-0.002
Token that
Token ID326
Feature activation-1.41e+00

Pos logit contributions
ray+0.008
 scientist+0.007
 drill+0.006
 spider+0.006
 Rat+0.006
Neg logit contributions
ams-0.011
 swapped-0.011
aws-0.011
ited-0.010
 tongues-0.010
Token the
Token ID262
Feature activation+4.08e-01

Pos logit contributions
 scientist+0.027
ray+0.026
 Monster+0.025
 Rat+0.024
 drill+0.024
Neg logit contributions
umbs-0.039
aws-0.039
ited-0.038
ams-0.038
aney-0.038
Token av
Token ID1196
Feature activation-3.92e+00

Pos logit contributions
ams+0.011
ishment+0.010
umbs+0.010
aws+0.010
aney+0.010
Neg logit contributions
 scientist-0.009
ray-0.008
 spider-0.008
 mouse-0.008
 drill-0.008
Tokenoc
Token ID420
Feature activation-1.54e+00

Pos logit contributions
ray+0.059
 scientist+0.051
 drill+0.045
 Rat+0.044
oris+0.044
Neg logit contributions
aws-0.123
 swapped-0.122
 tongues-0.122
ishment-0.121
ams-0.121
Tokenados
Token ID22484
Feature activation-7.99e-01

Pos logit contributions
ray+0.041
 scientist+0.039
 drill+0.036
 Rat+0.036
oris+0.035
Neg logit contributions
aws-0.030
ams-0.030
 swapped-0.030
 tongues-0.029
ishment-0.029
Token were
Token ID547
Feature activation+7.00e-01

Pos logit contributions
ray+0.016
 scientist+0.015
 flashlight+0.014
 drill+0.014
 Monster+0.014
Neg logit contributions
ishment-0.022
ams-0.022
aws-0.021
aney-0.020
 tongues-0.020
Token being
Token ID852
Feature activation-6.68e-01

Pos logit contributions
umbs+0.019
aws+0.018
ishment+0.018
ited+0.018
aney+0.018
Neg logit contributions
 living-0.012
 scientist-0.012
 broom-0.012
 owl-0.012
 for-0.012
Token taken
Token ID2077
Feature activation-1.18e+00

Pos logit contributions
ray+0.012
 scientist+0.012
 living+0.011
cher+0.011
 broom+0.011
Neg logit contributions
ishment-0.018
 tongues-0.018
umbs-0.018
lights-0.017
aws-0.017
Token time
Token ID640
Feature activation+6.77e-01

Pos logit contributions
ams+0.024
ishment+0.023
glers+0.023
 swapped+0.023
 tongues+0.023
Neg logit contributions
ray-0.052
 scientist-0.049
 spider-0.048
 drill-0.048
 broom-0.047
Token,
Token ID11
Feature activation-2.55e-01

Pos logit contributions
aws+0.015
ams+0.015
 swapped+0.015
 tongues+0.015
ishment+0.015
Neg logit contributions
ray-0.016
 scientist-0.015
 drill-0.014
oris-0.013
 Rat-0.013
Token there
Token ID612
Feature activation+5.40e-01

Pos logit contributions
ray+0.007
 scientist+0.006
 drill+0.006
 Rat+0.006
oris+0.006
Neg logit contributions
aws-0.005
ams-0.005
 swapped-0.005
ishment-0.005
 tongues-0.005
Token was
Token ID373
Feature activation-6.42e-01

Pos logit contributions
ams+0.017
ishment+0.016
 tongues+0.016
 crust+0.016
 swapped+0.016
Neg logit contributions
ray-0.009
 scientist-0.008
 Rat-0.008
 Monster-0.007
 spider-0.007
Token a
Token ID257
Feature activation-1.58e+00

Pos logit contributions
ray+0.011
 scientist+0.010
 Rat+0.009
 spider+0.009
 drill+0.009
Neg logit contributions
ams-0.019
 tongues-0.019
ishment-0.019
 swapped-0.018
glers-0.018
Token copper
Token ID15317
Feature activation-3.91e+00

Pos logit contributions
ray+0.045
oris+0.039
cher+0.039
 scientist+0.038
 Rat+0.038
Neg logit contributions
 sav-0.028
 crust-0.028
 swapped-0.028
 refreshing-0.027
aws-0.027
Token bucket
Token ID19236
Feature activation-2.13e-01

Pos logit contributions
ray+0.087
 scientist+0.079
 911+0.078
 Rat+0.078
oris+0.077
Neg logit contributions
 crust-0.091
aws-0.089
ams-0.086
 sav-0.086
 tongues-0.083
Token.
Token ID13
Feature activation+4.50e-01

Pos logit contributions
ray+0.004
 scientist+0.004
 living+0.004
 spider+0.004
 drill+0.004
Neg logit contributions
ishment-0.006
 tongues-0.006
ited-0.006
aney-0.006
 swapped-0.006
Token It
Token ID632
Feature activation-1.02e-01

Pos logit contributions
ams+0.012
glers+0.012
 tongues+0.012
aws+0.011
ited+0.011
Neg logit contributions
ray-0.010
 scientist-0.009
 drill-0.008
 Rat-0.008
 Monster-0.008
Token was
Token ID373
Feature activation-4.51e-02

Pos logit contributions
ray+0.002
 scientist+0.002
 drill+0.002
 spider+0.002
 Rat+0.002
Neg logit contributions
ams-0.003
ishment-0.003
aws-0.003
 tongues-0.003
umbs-0.003
Token empty
Token ID6565
Feature activation-6.07e-01

Pos logit contributions
ray+0.001
 scientist+0.001
 drill+0.001
 Rat+0.001
 broom+0.001
Neg logit contributions
ishment-0.001
ams-0.001
aws-0.001
ited-0.001
 tongues-0.001
Token here
Token ID994
Feature activation+2.68e-01

Pos logit contributions
aws+0.001
ams+0.001
 swapped+0.001
ishment+0.001
 tongues+0.001
Neg logit contributions
ray-0.001
 scientist-0.001
 drill-0.001
 Rat-0.001
oris-0.001
Token?
Token ID30
Feature activation-9.50e-01

Pos logit contributions
ams+0.008
 tongues+0.008
 swapped+0.008
ishment+0.008
aws+0.008
Neg logit contributions
ray-0.004
 scientist-0.004
 spider-0.003
 Rat-0.003
 Monster-0.003
Token You
Token ID921
Feature activation-1.43e+00

Pos logit contributions
ray+0.017
 scientist+0.016
 drill+0.014
 Rat+0.014
oris+0.014
Neg logit contributions
ams-0.027
aws-0.027
 swapped-0.026
 tongues-0.026
ishment-0.026
Token are
Token ID389
Feature activation+6.98e-01

Pos logit contributions
ray+0.033
 scientist+0.029
 Rat+0.028
 drill+0.027
 spider+0.027
Neg logit contributions
ishment-0.035
ams-0.035
 tongues-0.034
aws-0.034
 crust-0.034
Token tresp
Token ID29723
Feature activation-9.43e-01

Pos logit contributions
ams+0.017
ishment+0.017
aws+0.016
 tongues+0.016
ited+0.016
Neg logit contributions
ray-0.016
 scientist-0.015
 drill-0.014
 Rat-0.014
 spider-0.014
Tokenassing
Token ID19696
Feature activation-3.91e+00

Pos logit contributions
ray+0.025
 scientist+0.024
 drill+0.022
 Rat+0.022
 spider+0.022
Neg logit contributions
aws-0.019
ishment-0.018
 tongues-0.018
 swapped-0.018
ams-0.018
Token!
Token ID0
Feature activation-1.33e+00

Pos logit contributions
ray+0.092
 scientist+0.090
 drill+0.084
 Rat+0.083
oris+0.083
Neg logit contributions
ams-0.087
aws-0.086
 swapped-0.081
umbs-0.081
 tongues-0.080
Token You
Token ID921
Feature activation-1.58e+00

Pos logit contributions
ray+0.030
 scientist+0.029
 drill+0.027
oris+0.026
 broom+0.026
Neg logit contributions
aws-0.031
ams-0.031
 swapped-0.030
 tongues-0.030
 sav-0.029
Token are
Token ID389
Feature activation+1.41e-01

Pos logit contributions
ray+0.037
 scientist+0.034
 drill+0.031
 Rat+0.031
 spider+0.031
Neg logit contributions
ams-0.037
ishment-0.036
aws-0.036
 tongues-0.036
 swapped-0.036
Token thieves
Token ID26655
Feature activation-2.18e+00

Pos logit contributions
ams+0.003
aws+0.003
 swapped+0.003
 tongues+0.003
ishment+0.003
Neg logit contributions
ray-0.003
 scientist-0.003
 drill-0.003
 Rat-0.003
 spider-0.003
Token!
Token ID0
Feature activation-1.53e+00

Pos logit contributions
ray+0.052
 scientist+0.048
 drill+0.045
oris+0.044
 Rat+0.044
Neg logit contributions
aws-0.048
ams-0.048
 swapped-0.047
 tongues-0.047
ishment-0.046
Token respect
Token ID2461
Feature activation-4.68e-01

Pos logit contributions
umbs+0.010
ams+0.009
ement+0.009
lights+0.009
aney+0.009
Neg logit contributions
ray-0.005
 flashlight-0.005
 drill-0.005
 broom-0.005
 scientist-0.005
Token other
Token ID584
Feature activation-5.18e-01

Pos logit contributions
ray+0.012
 scientist+0.012
 drill+0.011
 Rat+0.011
 spider+0.011
Neg logit contributions
 swapped-0.009
ams-0.009
ishment-0.009
aws-0.009
 tongues-0.009
Token animals
Token ID4695
Feature activation-4.95e-01

Pos logit contributions
ray+0.012
 scientist+0.011
 spider+0.010
 drill+0.010
 Rat+0.010
Neg logit contributions
 swapped-0.012
 crust-0.012
aney-0.012
ishment-0.012
ams-0.012
Token and
Token ID290
Feature activation+1.36e+00

Pos logit contributions
ray+0.011
 scientist+0.010
 drill+0.009
 spider+0.009
 Rat+0.009
Neg logit contributions
 swapped-0.012
ams-0.012
 tongues-0.012
ishment-0.012
aws-0.012
Token people
Token ID661
Feature activation-7.95e-01

Pos logit contributions
ams+0.029
aws+0.029
ishment+0.029
 swapped+0.028
umbs+0.028
Neg logit contributions
ray-0.034
 scientist-0.031
 drill-0.029
 Rat-0.029
 spider-0.029
Token."
Token ID526
Feature activation+1.95e+00

Pos logit contributions
ray+0.017
 scientist+0.013
cher+0.013
 drill+0.013
 living+0.013
Neg logit contributions
ishment-0.021
aney-0.021
aws-0.021
 Ph-0.021
 tongues-0.020
Token Jen
Token ID13364
Feature activation+1.32e+00

Pos logit contributions
ishment+0.051
ited+0.051
ams+0.050
 swapped+0.050
aney+0.049
Neg logit contributions
ray-0.039
 scientist-0.037
 Rat-0.036
 drill-0.036
 spider-0.035
Token said
Token ID531
Feature activation-9.47e-01

Pos logit contributions
 tongues+0.036
ams+0.036
 festivities+0.036
umm+0.035
 refreshing+0.035
Neg logit contributions
ray-0.026
 Rat-0.021
 owl-0.021
 spider-0.021
 scientist-0.020
Token.
Token ID13
Feature activation-8.30e-02

Pos logit contributions
ray+0.028
 scientist+0.027
oris+0.025
 drill+0.025
 Rat+0.025
Neg logit contributions
 swapped-0.016
ams-0.015
aws-0.015
 tongues-0.015
 crust-0.015
Token She
Token ID1375
Feature activation-1.12e+00

Pos logit contributions
ray+0.002
 scientist+0.002
 drill+0.002
oris+0.002
 Rat+0.002
Neg logit contributions
aws-0.002
ams-0.002
 swapped-0.002
 tongues-0.002
ishment-0.002
Token was
Token ID373
Feature activation+4.08e-01

Pos logit contributions
ray+0.028
 scientist+0.026
 Rat+0.024
 drill+0.023
 spider+0.023
Neg logit contributions
ams-0.025
 tongues-0.024
ishment-0.024
aws-0.024
 crust-0.023
Token big
Token ID1263
Feature activation+3.80e-01

Pos logit contributions
aws+0.004
 sav+0.004
 swapped+0.003
 crust+0.003
 tongues+0.003
Neg logit contributions
ray-0.007
 scientist-0.006
oris-0.006
cher-0.006
 Monster-0.006
Token wheel
Token ID7825
Feature activation-6.14e-02

Pos logit contributions
ams+0.008
aws+0.008
 swapped+0.008
 tongues+0.008
ishment+0.008
Neg logit contributions
ray-0.009
 scientist-0.009
 drill-0.008
 spider-0.008
 Rat-0.008
Token.
Token ID13
Feature activation+8.04e-01

Pos logit contributions
ray+0.001
 drill+0.001
 scientist+0.001
cher+0.001
 if+0.001
Neg logit contributions
 tongues-0.002
 refreshing-0.002
ishment-0.002
 swapped-0.002
 sav-0.002
Token Her
Token ID2332
Feature activation+3.45e-01

Pos logit contributions
ited+0.021
ams+0.021
glers+0.019
 refreshing+0.019
ishment+0.019
Neg logit contributions
ray-0.017
 scientist-0.016
 Rat-0.016
 spider-0.015
 drill-0.015
Token mom
Token ID1995
Feature activation+1.62e+00

Pos logit contributions
aws+0.007
 tongues+0.007
 swapped+0.006
 crust+0.006
ams+0.006
Neg logit contributions
ray-0.010
 scientist-0.009
oris-0.009
cher-0.008
 Monster-0.008
Token says
Token ID1139
Feature activation+2.37e+00

Pos logit contributions
ams+0.042
 tongues+0.041
aws+0.041
ishment+0.041
 crust+0.040
Neg logit contributions
ray-0.035
 scientist-0.030
 drill-0.028
 Rat-0.028
 spider-0.027
Token yes
Token ID3763
Feature activation+1.53e+00

Pos logit contributions
 swapped+0.040
aws+0.039
ams+0.039
 tongues+0.039
 crust+0.038
Neg logit contributions
ray-0.071
 scientist-0.067
oris-0.063
 drill-0.061
 Rat-0.060
Token,
Token ID11
Feature activation+9.49e-01

Pos logit contributions
umbs+0.053
 tongues+0.053
 swapped+0.052
ams+0.052
ishment+0.052
Neg logit contributions
ray-0.019
 scientist-0.013
 Rat-0.013
 drill-0.013
 if-0.012
Token and
Token ID290
Feature activation+1.07e-01

Pos logit contributions
glers+0.028
 tongues+0.027
ited+0.027
ishment+0.027
ilt+0.026
Neg logit contributions
ray-0.016
 Rat-0.015
 scientist-0.015
 spider-0.015
 if-0.014
Token they
Token ID484
Feature activation+1.15e+00

Pos logit contributions
ams+0.003
 tongues+0.003
umbs+0.003
ishment+0.003
ited+0.003
Neg logit contributions
ray-0.002
 scientist-0.002
 Rat-0.002
 spider-0.002
 broom-0.002
Token put
Token ID1234
Feature activation+1.58e+00

Pos logit contributions
ams+0.024
aws+0.023
 tongues+0.023
ishment+0.023
 swapped+0.023
Neg logit contributions
ray-0.030
 scientist-0.028
 drill-0.026
 Rat-0.026
oris-0.025
Token agreed
Token ID4987
Feature activation+1.82e+00

Pos logit contributions
ams+0.030
aws+0.030
ishment+0.029
 tongues+0.029
 swapped+0.029
Neg logit contributions
ray-0.041
 scientist-0.038
 drill-0.035
 Rat-0.035
 spider-0.035
Token,
Token ID11
Feature activation+5.34e-02

Pos logit contributions
ams+0.051
aws+0.051
ishment+0.051
 tongues+0.051
glers+0.050
Neg logit contributions
ray-0.032
 scientist-0.030
 broom-0.027
 drill-0.027
 Rat-0.027
Token and
Token ID290
Feature activation-6.78e-01

Pos logit contributions
glers+0.002
ishment+0.002
aws+0.002
ams+0.002
 tongues+0.002
Neg logit contributions
 scientist-0.001
ray-0.001
 broom-0.001
 Rat-0.001
 spider-0.001
Token they
Token ID484
Feature activation+9.06e-01

Pos logit contributions
ray+0.019
 scientist+0.018
 drill+0.017
 Rat+0.017
oris+0.017
Neg logit contributions
ams-0.012
aws-0.012
 swapped-0.012
 tongues-0.012
ishment-0.012
Token went
Token ID1816
Feature activation+3.42e-01

Pos logit contributions
ishment+0.017
ams+0.016
 tongues+0.016
rimp+0.016
aws+0.016
Neg logit contributions
ray-0.025
 Rat-0.023
 scientist-0.023
 broom-0.022
 flashlight-0.022
Token under
Token ID739
Feature activation+2.55e+00

Pos logit contributions
aws+0.009
ams+0.009
 swapped+0.009
glers+0.009
 tongues+0.009
Neg logit contributions
ray-0.007
 scientist-0.006
 drill-0.005
oris-0.005
 spider-0.005
Token a
Token ID257
Feature activation+6.90e-01

Pos logit contributions
aws+0.057
ams+0.057
 tongues+0.057
 swapped+0.057
ishment+0.056
Neg logit contributions
ray-0.061
 scientist-0.057
 drill-0.052
oris-0.051
 Rat-0.051
Token big
Token ID1263
Feature activation+1.41e+00

Pos logit contributions
ams+0.018
ishment+0.017
 tongues+0.016
 swapped+0.016
aws+0.016
Neg logit contributions
ray-0.015
 scientist-0.014
 spider-0.014
 broom-0.014
 drill-0.014
Token tree
Token ID5509
Feature activation-7.78e-02

Pos logit contributions
ams+0.040
aws+0.039
 swapped+0.038
ishment+0.038
 tongues+0.037
Neg logit contributions
ray-0.026
 spider-0.026
 scientist-0.025
 Rat-0.025
 drill-0.024
Token to
Token ID284
Feature activation+1.87e+00

Pos logit contributions
ray+0.002
 scientist+0.002
 drill+0.002
 broom+0.002
 spider+0.002
Neg logit contributions
ams-0.002
 tongues-0.002
 swapped-0.002
ishment-0.002
aws-0.002
Token cool
Token ID3608
Feature activation+1.45e+00

Pos logit contributions
ams+0.043
ishment+0.042
 tongues+0.041
aws+0.041
 swapped+0.040
Neg logit contributions
ray-0.045
 scientist-0.040
 Rat-0.039
 drill-0.039
 spider-0.038
Token water
Token ID1660
Feature activation+1.41e+00

Pos logit contributions
 swapped+0.033
Twe+0.032
ishment+0.032
aws+0.031
aney+0.031
Neg logit contributions
 spider-0.022
 flashlight-0.021
ray-0.020
 scientist-0.020
 mouse-0.019
Token.
Token ID13
Feature activation-3.05e-01

Pos logit contributions
ishment+0.044
ams+0.042
aney+0.041
 swapped+0.041
 unison+0.041
Neg logit contributions
 spider-0.020
 broom-0.020
ray-0.019
 living-0.019
 mouse-0.018
Token She
Token ID1375
Feature activation+2.16e-01

Pos logit contributions
ray+0.007
 scientist+0.007
 Rat+0.006
 drill+0.006
 spider+0.006
Neg logit contributions
ams-0.007
aws-0.007
 swapped-0.007
 tongues-0.007
ishment-0.007
Token wanted
Token ID2227
Feature activation-3.65e-01

Pos logit contributions
umbs+0.006
ams+0.006
ishment+0.005
aney+0.005
aws+0.005
Neg logit contributions
ray-0.005
 scientist-0.004
 Rat-0.004
 spider-0.004
 Monster-0.004
Token something
Token ID1223
Feature activation+1.97e+00

Pos logit contributions
ray+0.005
 scientist+0.004
 Rat+0.003
 drill+0.003
 spider+0.003
Neg logit contributions
ams-0.012
aws-0.012
ishment-0.012
 swapped-0.012
 tongues-0.012
Token warm
Token ID5814
Feature activation+2.78e+00

Pos logit contributions
ams+0.051
ishment+0.051
aws+0.049
 tongues+0.049
 swapped+0.049
Neg logit contributions
ray-0.041
 scientist-0.037
 broom-0.035
 spider-0.035
 Rat-0.034
Token to
Token ID284
Feature activation+1.08e+00

Pos logit contributions
 swapped+0.088
ams+0.087
aws+0.087
aney+0.086
ishment+0.086
Neg logit contributions
ray-0.036
 flashlight-0.034
 spider-0.033
 broom-0.032
 scientist-0.032
Token feed
Token ID3745
Feature activation+4.63e-01

Pos logit contributions
ishment+0.030
ams+0.029
aws+0.028
 swapped+0.028
ement+0.027
Neg logit contributions
 spider-0.020
 drill-0.020
 broom-0.020
 flashlight-0.019
 Rat-0.019
Token her
Token ID607
Feature activation+1.10e+00

Pos logit contributions
 swapped+0.014
ishment+0.013
ited+0.013
 tongues+0.013
ams+0.013
Neg logit contributions
ray-0.008
 spider-0.007
 Rat-0.007
 scientist-0.007
 if-0.006
Token.
Token ID13
Feature activation+4.54e-02

Pos logit contributions
 swapped+0.031
ited+0.031
ishment+0.030
ams+0.030
aney+0.029
Neg logit contributions
ray-0.019
 spider-0.018
 flashlight-0.017
 broom-0.017
 scientist-0.017
Token\n
Token ID198
Feature activation+4.67e-01

Pos logit contributions
aws+0.001
 swapped+0.001
ams+0.001
 tongues+0.001
ishment+0.001
Neg logit contributions
ray-0.001
 scientist-0.001
 drill-0.001
oris-0.001
 spider-0.001
Token her
Token ID607
Feature activation+1.01e+00

Pos logit contributions
 Rat+0.006
 scientist+0.006
ray+0.006
 spider+0.006
 drill+0.006
Neg logit contributions
ams-0.012
ited-0.012
ishment-0.012
 refreshing-0.012
 tongues-0.012
Token.
Token ID13
Feature activation+2.95e-01

Pos logit contributions
ams+0.024
aws+0.024
 swapped+0.023
 tongues+0.023
ishment+0.023
Neg logit contributions
ray-0.023
 scientist-0.022
 drill-0.021
 broom-0.020
 spider-0.020
Token He
Token ID679
Feature activation-1.10e+00

Pos logit contributions
ams+0.006
 tongues+0.006
ishment+0.006
aws+0.006
 swapped+0.006
Neg logit contributions
ray-0.008
 scientist-0.007
 drill-0.007
 Rat-0.007
 spider-0.007
Token remembered
Token ID12086
Feature activation+9.92e-03

Pos logit contributions
ray+0.033
 scientist+0.031
oris+0.029
 drill+0.029
 Monster+0.029
Neg logit contributions
 swapped-0.019
aws-0.018
 sav-0.018
ams-0.017
 tongues-0.017
Token what
Token ID644
Feature activation+1.39e-01

Pos logit contributions
aws+0.000
ams+0.000
 tongues+0.000
 swapped+0.000
 sav+0.000
Neg logit contributions
ray-0.000
 scientist-0.000
oris-0.000
 drill-0.000
 Monster-0.000
Token Mom
Token ID11254
Feature activation+2.11e+00

Pos logit contributions
 swapped+0.002
 ne+0.002
 Stella+0.002
 Nem+0.002
aws+0.002
Neg logit contributions
ray-0.004
oris-0.004
 scientist-0.004
cher-0.004
 broom-0.004
Token said
Token ID531
Feature activation+1.04e+00

Pos logit contributions
ishment+0.051
 tongues+0.050
ams+0.050
aws+0.050
 swapped+0.049
Neg logit contributions
ray-0.048
 scientist-0.044
 drill-0.041
 Rat-0.041
 spider-0.040
Token.
Token ID13
Feature activation-8.06e-02

Pos logit contributions
aws+0.016
ams+0.016
 swapped+0.016
 tongues+0.015
 crust+0.015
Neg logit contributions
ray-0.032
 scientist-0.031
oris-0.030
 drill-0.028
 Rat-0.028
Token He
Token ID679
Feature activation-1.18e+00

Pos logit contributions
ray+0.002
 scientist+0.002
 drill+0.002
oris+0.002
 spider+0.002
Neg logit contributions
aws-0.001
ams-0.001
 swapped-0.001
 tongues-0.001
 crust-0.001
Token ran
Token ID4966
Feature activation-4.22e-03

Pos logit contributions
ray+0.034
 scientist+0.033
 drill+0.031
oris+0.031
 spider+0.030
Neg logit contributions
 swapped-0.021
aws-0.020
 sav-0.020
ams-0.019
 crust-0.019
Token to
Token ID284
Feature activation+1.58e-01

Pos logit contributions
ray+0.000
 scientist+0.000
 drill+0.000
cher+0.000
oris+0.000
Neg logit contributions
 swapped-0.000
aws-0.000
 sav-0.000
ams-0.000
 crust-0.000
Token,
Token ID11
Feature activation-1.46e+00

Pos logit contributions
umbs+0.003
ams+0.003
 swapped+0.003
aws+0.003
 crust+0.003
Neg logit contributions
ray-0.002
 scientist-0.002
 Rat-0.002
 spider-0.002
 drill-0.002
Token "
Token ID366
Feature activation-7.10e-01

Pos logit contributions
ray+0.019
 scientist+0.018
 drill+0.016
 Rat+0.015
 broom+0.014
Neg logit contributions
aws-0.049
 tongues-0.048
ams-0.048
ishment-0.048
ited-0.048
TokenEveryone
Token ID16190
Feature activation-4.29e-01

Pos logit contributions
ray+0.016
 scientist+0.014
oris+0.013
 Rat+0.012
 drill+0.012
Neg logit contributions
 tongues-0.019
 crust-0.018
aws-0.018
glers-0.017
ishment-0.017
Token is
Token ID318
Feature activation+7.69e-02

Pos logit contributions
ray+0.008
 scientist+0.007
 drill+0.007
 spider+0.007
 Monster+0.006
Neg logit contributions
aws-0.012
ishment-0.012
 tongues-0.012
ams-0.012
umbs-0.012
Token different
Token ID1180
Feature activation-9.53e-01

Pos logit contributions
 tongues+0.002
umbs+0.002
ited+0.002
ishment+0.002
ams+0.002
Neg logit contributions
 scientist-0.001
ray-0.001
 drill-0.001
 spider-0.001
 living-0.001
Token and
Token ID290
Feature activation+2.07e-01

Pos logit contributions
ray+0.020
 scientist+0.018
 spider+0.017
 drill+0.016
 Monster+0.016
Neg logit contributions
 swapped-0.025
ited-0.025
ams-0.025
ishment-0.024
 tongues-0.024
Token special
Token ID2041
Feature activation-1.24e-01

Pos logit contributions
umbs+0.005
aws+0.005
aney+0.005
ited+0.005
ams+0.005
Neg logit contributions
ray-0.005
 scientist-0.004
 drill-0.004
 spider-0.004
 Rat-0.004
Token in
Token ID287
Feature activation+6.28e-01

Pos logit contributions
ray+0.003
 scientist+0.002
 spider+0.002
 broom+0.002
 drill+0.002
Neg logit contributions
 swapped-0.003
ited-0.003
ishment-0.003
ams-0.003
aws-0.003
Token their
Token ID511
Feature activation+1.02e-01

Pos logit contributions
ited+0.020
 swapped+0.020
umbs+0.019
agged+0.019
ams+0.019
Neg logit contributions
ray-0.009
 spider-0.009
 living-0.009
 scientist-0.009
 mouse-0.008
Token own
Token ID898
Feature activation-1.13e+00

Pos logit contributions
umbs+0.003
ams+0.003
 Ph+0.002
aney+0.002
rimp+0.002
Neg logit contributions
ray-0.002
 scientist-0.002
 spider-0.002
 flashlight-0.002
 living-0.002
Token way
Token ID835
Feature activation-8.81e-01

Pos logit contributions
ray+0.025
 scientist+0.022
 spider+0.021
 flashlight+0.021
 broom+0.021
Neg logit contributions
 swapped-0.029
aws-0.027
ams-0.027
ited-0.027
umbs-0.027
Token.
Token ID13
Feature activation-2.93e-01

Pos logit contributions
ams+0.062
ackers+0.060
ishment+0.058
ited+0.058
rimp+0.057
Neg logit contributions
 scientist-0.034
 drill-0.034
 broom-0.034
 Rat-0.033
ray-0.033
Token They
Token ID1119
Feature activation+4.43e-01

Pos logit contributions
ray+0.007
 scientist+0.007
 Rat+0.006
 drill+0.006
 Monster+0.006
Neg logit contributions
ited-0.008
ams-0.007
 tongues-0.007
ized-0.007
ilt-0.007
Token saw
Token ID2497
Feature activation-9.36e-01

Pos logit contributions
ams+0.010
 tongues+0.009
ishment+0.009
aws+0.009
umbs+0.009
Neg logit contributions
ray-0.011
 scientist-0.011
 Rat-0.010
 drill-0.010
 broom-0.010
Token the
Token ID262
Feature activation-2.41e-02

Pos logit contributions
ray+0.019
 scientist+0.018
 Rat+0.017
 spider+0.017
 drill+0.016
Neg logit contributions
ited-0.025
ams-0.025
ishment-0.024
 tongues-0.024
umbs-0.024
Token big
Token ID1263
Feature activation-1.70e-01

Pos logit contributions
ray+0.000
 scientist+0.000
 drill+0.000
oris+0.000
 Rat+0.000
Neg logit contributions
aws-0.001
 tongues-0.001
ams-0.001
 swapped-0.001
ishment-0.001
Token dog
Token ID3290
Feature activation+2.46e-01

Pos logit contributions
ray+0.004
 scientist+0.003
oris+0.003
 drill+0.003
 Rat+0.003
Neg logit contributions
 tongues-0.004
aws-0.004
ishment-0.004
ams-0.004
 swapped-0.004
Token lying
Token ID9105
Feature activation-5.72e-01

Pos logit contributions
ams+0.008
ishment+0.007
aney+0.007
ited+0.007
 tongues+0.007
Neg logit contributions
ray-0.004
 scientist-0.004
 spider-0.004
 drill-0.004
 Rat-0.004
Token on
Token ID319
Feature activation-2.30e-01

Pos logit contributions
ray+0.008
 scientist+0.008
 spider+0.007
 drill+0.007
 broom+0.007
Neg logit contributions
ishment-0.018
ams-0.018
 tongues-0.018
 swapped-0.018
aney-0.018
Token a
Token ID257
Feature activation-7.56e-01

Pos logit contributions
 scientist+0.004
ray+0.004
 Rat+0.004
 spider+0.004
 drill+0.004
Neg logit contributions
ams-0.007
ited-0.006
 swapped-0.006
 tongues-0.006
aney-0.006
Token rug
Token ID14477
Feature activation+1.37e+00

Pos logit contributions
ray+0.021
 scientist+0.019
oris+0.019
 Rat+0.018
 drill+0.018
Neg logit contributions
 crust-0.014
 tongues-0.014
aws-0.013
 sav-0.013
 swapped-0.013
Token.
Token ID13
Feature activation+1.63e-01

Pos logit contributions
ackers+0.040
ams+0.040
ishment+0.039
 tongues+0.039
Twe+0.039
Neg logit contributions
ray-0.023
 scientist-0.022
 drill-0.021
 spider-0.020
 broom-0.020
Token.
Token ID13
Feature activation+1.10e+00

Pos logit contributions
ray+0.018
 scientist+0.016
 drill+0.015
 Rat+0.015
oris+0.015
Neg logit contributions
ams-0.021
aws-0.021
 swapped-0.021
 tongues-0.021
ishment-0.021
Tokenâ
Token ID22940
Feature activation+9.14e-01

Pos logit contributions
ams+0.028
 tongues+0.028
aws+0.027
 swapped+0.027
ishment+0.027
Neg logit contributions
ray-0.024
 scientist-0.022
 Rat-0.020
 drill-0.019
oris-0.019
Token€
Token ID26391
Feature activation+1.71e-01

Pos logit contributions
 tongues+0.018
aws+0.018
ams+0.017
 swapped+0.017
ishment+0.017
Neg logit contributions
ray-0.026
 scientist-0.023
 drill-0.022
oris-0.022
 Rat-0.021
Token\n
Token ID198
Feature activation+1.44e+00

Pos logit contributions
ams+0.004
aws+0.004
 swapped+0.004
 tongues+0.004
ishment+0.004
Neg logit contributions
ray-0.004
 scientist-0.004
 drill-0.004
 Rat-0.004
oris-0.003
Token\n
Token ID198
Feature activation+1.51e+00

Pos logit contributions
ishment+0.035
ams+0.035
 tongues+0.035
 swapped+0.034
aws+0.034
Neg logit contributions
ray-0.032
 scientist-0.029
 Rat-0.028
 drill-0.027
oris-0.027
TokenThe
Token ID464
Feature activation-1.16e-01

Pos logit contributions
 tongues+0.042
ishment+0.041
 mouths+0.037
 crust+0.037
ams+0.037
Neg logit contributions
ray-0.032
 scientist-0.030
oris-0.028
Ol-0.027
 Rat-0.027
Token little
Token ID1310
Feature activation-5.73e-02

Pos logit contributions
ray+0.002
 scientist+0.002
 spider+0.002
 drill+0.002
 Rat+0.002
Neg logit contributions
ams-0.003
aws-0.003
 swapped-0.003
ishment-0.003
 tongues-0.003
Token girl
Token ID2576
Feature activation-1.54e-01

Pos logit contributions
ray+0.001
 scientist+0.001
 spider+0.001
 broom+0.001
 drill+0.001
Neg logit contributions
umbs-0.002
ams-0.002
 swapped-0.002
aney-0.002
 tongues-0.002
Token wondered
Token ID14028
Feature activation-5.01e-01

Pos logit contributions
ray+0.003
 scientist+0.003
 Rat+0.003
 spider+0.003
 Monster+0.003
Neg logit contributions
ams-0.004
ishment-0.004
umbs-0.004
aws-0.004
 tongues-0.004
Token who
Token ID508
Feature activation-9.17e-01

Pos logit contributions
ray+0.009
 scientist+0.008
 drill+0.007
oris+0.007
 Rat+0.007
Neg logit contributions
aws-0.014
ams-0.014
ishment-0.014
 tongues-0.013
 swapped-0.013
Token would
Token ID561
Feature activation-6.40e-02

Pos logit contributions
ray+0.019
 scientist+0.017
 Rat+0.016
 drill+0.016
 spider+0.016
Neg logit contributions
ams-0.024
aws-0.024
 tongues-0.024
ishment-0.024
 crust-0.023
Token!"
Token ID2474
Feature activation+6.41e-01

Pos logit contributions
ray+0.016
 scientist+0.015
 drill+0.014
oris+0.014
 Rat+0.014
Neg logit contributions
ams-0.013
aws-0.013
 tongues-0.013
 swapped-0.013
ishment-0.013
Token\n
Token ID198
Feature activation-3.64e-01

Pos logit contributions
 tongues+0.019
ited+0.019
ishment+0.019
ams+0.019
glers+0.018
Neg logit contributions
ray-0.011
 scientist-0.010
 Rat-0.010
 drill-0.010
 spider-0.009
Token\n
Token ID198
Feature activation-4.73e-01

Pos logit contributions
ray+0.010
 scientist+0.010
oris+0.009
 drill+0.009
 broom+0.009
Neg logit contributions
aws-0.007
ams-0.006
 sav-0.006
 swapped-0.006
 crust-0.006
TokenThey
Token ID2990
Feature activation+3.08e-01

Pos logit contributions
ray+0.014
 scientist+0.013
 drill+0.012
 Rat+0.012
oris+0.012
Neg logit contributions
ams-0.008
aws-0.008
 swapped-0.008
 crust-0.008
 tongues-0.008
Token walked
Token ID6807
Feature activation-3.43e-01

Pos logit contributions
ams+0.007
umbs+0.007
 festivities+0.007
ishment+0.007
 crust+0.007
Neg logit contributions
ray-0.007
 scientist-0.007
 broom-0.007
 flashlight-0.006
 drill-0.006
Token to
Token ID284
Feature activation+5.65e-01

Pos logit contributions
ray+0.008
 scientist+0.007
 drill+0.007
 Rat+0.007
 spider+0.007
Neg logit contributions
aws-0.008
ams-0.008
 swapped-0.008
glers-0.008
ishment-0.008
Token the
Token ID262
Feature activation-4.12e-01

Pos logit contributions
ams+0.012
aws+0.012
 swapped+0.012
 tongues+0.012
ishment+0.012
Neg logit contributions
ray-0.014
 scientist-0.013
 drill-0.012
 Rat-0.012
oris-0.012
Token bridge
Token ID7696
Feature activation+2.75e-01

Pos logit contributions
ray+0.012
oris+0.011
 Rat+0.011
 scientist+0.011
arsh+0.010
Neg logit contributions
 crust-0.007
 sav-0.007
aws-0.007
 refreshing-0.007
glers-0.007
Token and
Token ID290
Feature activation+9.87e-01

Pos logit contributions
 tongues+0.007
aws+0.007
ams+0.007
ishment+0.007
 swapped+0.007
Neg logit contributions
ray-0.006
 scientist-0.005
 drill-0.005
 broom-0.005
 spider-0.005
Token looked
Token ID3114
Feature activation-8.77e-01

Pos logit contributions
 swapped+0.016
 sav+0.014
ishment+0.014
ams+0.014
aws+0.014
Neg logit contributions
ray-0.029
 scientist-0.029
oris-0.027
 drill-0.027
 spider-0.027
Token at
Token ID379
Feature activation+5.48e-01

Pos logit contributions
ray+0.021
 scientist+0.021
 drill+0.020
oris+0.020
 Rat+0.020
Neg logit contributions
 swapped-0.019
 refreshing-0.019
aws-0.019
ishment-0.019
ams-0.018
Token back
Token ID736
Feature activation-5.98e-01

Pos logit contributions
umbs+0.037
 tongues+0.037
aws+0.036
ams+0.035
 swapped+0.035
Neg logit contributions
 scientist-0.024
ray-0.023
 living-0.021
 drill-0.021
 flashlight-0.021
Token to
Token ID284
Feature activation-1.82e-01

Pos logit contributions
ray+0.011
 scientist+0.010
 drill+0.008
 Rat+0.008
oris+0.008
Neg logit contributions
ams-0.017
aws-0.017
 tongues-0.017
 swapped-0.017
ishment-0.017
Token normal
Token ID3487
Feature activation+2.37e-01

Pos logit contributions
ray+0.004
 scientist+0.004
 drill+0.003
 Rat+0.003
 spider+0.003
Neg logit contributions
aws-0.005
ams-0.005
umbs-0.005
 swapped-0.005
glers-0.005
Token.
Token ID13
Feature activation-1.52e+00

Pos logit contributions
 swapped+0.006
 crust+0.006
aws+0.006
ams+0.006
umbs+0.006
Neg logit contributions
ray-0.005
 scientist-0.005
 Monster-0.004
 spider-0.004
 drill-0.004
Token<|endoftext|>
Token ID50256
Feature activation+4.38e-02

Pos logit contributions
 scientist+0.046
 broom+0.046
ray+0.046
 spider+0.045
cher+0.045
Neg logit contributions
 sav-0.020
ams-0.019
 delight-0.018
umbs-0.018
ishment-0.017
TokenOnce
Token ID7454
Feature activation+1.53e-02

Pos logit contributions
ams+0.001
aws+0.001
 tongues+0.001
 swapped+0.001
ishment+0.001
Neg logit contributions
ray-0.001
 scientist-0.001
 drill-0.001
 Rat-0.001
oris-0.001
Token upon
Token ID2402
Feature activation-6.83e-02

Pos logit contributions
aws+0.000
aney+0.000
ams+0.000
glers+0.000
ackers+0.000
Neg logit contributions
ray-0.000
 scientist-0.000
 drill-0.000
cher-0.000
 broom-0.000
Token a
Token ID257
Feature activation+2.03e+00

Pos logit contributions
ray+0.001
 scientist+0.001
 drill+0.001
 Rat+0.001
 spider+0.001
Neg logit contributions
ams-0.002
aws-0.002
 tongues-0.002
 swapped-0.002
ishment-0.002
Token time
Token ID640
Feature activation+8.93e-01

Pos logit contributions
glers+0.035
ams+0.034
 swapped+0.033
ishment+0.033
 tongues+0.032
Neg logit contributions
ray-0.059
 spider-0.055
 scientist-0.055
 broom-0.055
 Rat-0.053
Token there
Token ID612
Feature activation+1.25e+00

Pos logit contributions
aws+0.020
ams+0.019
 tongues+0.019
aney+0.019
 swapped+0.019
Neg logit contributions
ray-0.022
 scientist-0.020
 drill-0.019
oris-0.018
 Rat-0.018
Token was
Token ID373
Feature activation-5.80e-01

Pos logit contributions
ishment+0.040
ams+0.040
 tongues+0.039
 swapped+0.039
 crust+0.038
Neg logit contributions
ray-0.019
 Rat-0.017
 Monster-0.016
 spider-0.016
 scientist-0.016
Token very
Token ID845
Feature activation+1.42e+00

Pos logit contributions
ited+0.030
umbs+0.029
ement+0.029
ishment+0.029
aney+0.029
Neg logit contributions
 scientist-0.019
 drill-0.018
 broom-0.018
 spider-0.018
ray-0.017
Token big
Token ID1263
Feature activation-4.20e-02

Pos logit contributions
umbs+0.041
ishment+0.039
 unison+0.038
�+0.038
ement+0.038
Neg logit contributions
 scientist-0.027
ray-0.026
 spider-0.023
 drill-0.023
 knows-0.023
Token and
Token ID290
Feature activation+1.64e+00

Pos logit contributions
ray+0.001
 scientist+0.001
oris+0.001
 drill+0.001
 Rat+0.001
Neg logit contributions
ams-0.001
aws-0.001
ishment-0.001
 tongues-0.001
 swapped-0.001
Token friendly
Token ID8030
Feature activation+6.31e-01

Pos logit contributions
aws+0.043
umbs+0.042
ishment+0.042
aney+0.042
 tongues+0.041
Neg logit contributions
ray-0.034
 scientist-0.033
 drill-0.031
 spider-0.030
 Rat-0.030
Token.
Token ID13
Feature activation+6.04e-01

Pos logit contributions
 swapped+0.016
ited+0.016
ishment+0.016
ams+0.016
glers+0.016
Neg logit contributions
ray-0.013
 scientist-0.012
 spider-0.012
 Monster-0.011
 broom-0.011
Token You
Token ID921
Feature activation-1.30e+00

Pos logit contributions
ams+0.014
aws+0.013
 swapped+0.013
 tongues+0.013
ishment+0.013
Neg logit contributions
ray-0.014
 scientist-0.013
 Rat-0.012
 drill-0.012
 Monster-0.012
Token will
Token ID481
Feature activation-1.02e+00

Pos logit contributions
ray+0.025
 scientist+0.023
 drill+0.021
 Rat+0.021
 spider+0.020
Neg logit contributions
ams-0.035
aws-0.035
ishment-0.035
 tongues-0.035
 swapped-0.034
Token like
Token ID588
Feature activation-8.09e-01

Pos logit contributions
ray+0.020
 Rat+0.020
 scientist+0.019
 drill+0.019
 spider+0.019
Neg logit contributions
ams-0.029
ement-0.028
ishment-0.027
lights-0.027
ized-0.026
Token them
Token ID606
Feature activation+5.83e-01

Pos logit contributions
ray+0.016
 scientist+0.015
 spider+0.015
 drill+0.015
 Rat+0.014
Neg logit contributions
ited-0.021
ishment-0.021
ams-0.021
 swapped-0.021
umbs-0.020
Token."
Token ID526
Feature activation+6.54e-01

Pos logit contributions
ishment+0.015
 swapped+0.015
ams+0.015
 tongues+0.015
ited+0.015
Neg logit contributions
ray-0.012
 scientist-0.011
 spider-0.010
 Rat-0.010
 drill-0.010
Token[newline]
Token ID198
Feature activation+3.09e-01

Pos logit contributions
ams+0.015
aws+0.014
 swapped+0.014
 tongues+0.014
ishment+0.014
Neg logit contributions
ray-0.016
 scientist-0.015
 drill-0.014
 Rat-0.013
oris-0.013
Token big
Token ID1263
Feature activation-7.47e-01

Pos logit contributions
ray+0.017
 scientist+0.015
oris+0.014
cher+0.014
 Rat+0.014
Neg logit contributions
 tongues-0.015
 sav-0.015
aws-0.015
 crust-0.015
 refreshing-0.015
Token world
Token ID995
Feature activation+3.07e-01

Pos logit contributions
ray+0.017
 scientist+0.014
cher+0.014
oris+0.014
 drill+0.013
Neg logit contributions
 tongues-0.018
 crust-0.018
aws-0.018
ishment-0.018
 sav-0.018
Token.
Token ID13
Feature activation-8.60e-01

Pos logit contributions
 tongues+0.009
ams+0.009
ackers+0.009
ishment+0.009
 swapped+0.008
Neg logit contributions
ray-0.006
 scientist-0.005
 Monster-0.005
 spider-0.005
 living-0.005
Token She
Token ID1375
Feature activation-1.10e+00

Pos logit contributions
ray+0.024
 scientist+0.023
cher+0.022
 drill+0.022
 broom+0.021
Neg logit contributions
aws-0.015
 sav-0.015
 tongues-0.014
ishment-0.014
ams-0.014
Token saw
Token ID2497
Feature activation-1.12e+00

Pos logit contributions
ray+0.024
 scientist+0.022
 Rat+0.020
 spider+0.020
 Monster+0.020
Neg logit contributions
ams-0.028
umbs-0.027
ishment-0.027
 tongues-0.027
 crust-0.027
Token the
Token ID262
Feature activation-6.42e-01

Pos logit contributions
ray+0.026
 scientist+0.024
 drill+0.022
oris+0.022
 Rat+0.022
Neg logit contributions
aws-0.026
ams-0.026
 tongues-0.025
 swapped-0.025
ishment-0.025
Token sun
Token ID4252
Feature activation+6.30e-01

Pos logit contributions
ray+0.016
 scientist+0.015
 drill+0.014
 Rat+0.013
oris+0.013
Neg logit contributions
aws-0.014
 tongues-0.014
ams-0.014
 swapped-0.014
 crust-0.014
Token and
Token ID290
Feature activation-2.44e-01

Pos logit contributions
ackers+0.019
ams+0.018
 swapped+0.018
ishment+0.018
aws+0.018
Neg logit contributions
ray-0.010
 spider-0.009
 Monster-0.009
 Rat-0.009
 scientist-0.009
Token the
Token ID262
Feature activation-7.91e-02

Pos logit contributions
ray+0.006
 scientist+0.005
 Rat+0.005
 drill+0.005
 spider+0.005
Neg logit contributions
ams-0.006
aws-0.006
 swapped-0.006
ishment-0.006
 tongues-0.005
Token flowers
Token ID12734
Feature activation+2.22e-01

Pos logit contributions
ray+0.002
 scientist+0.002
 spider+0.002
 drill+0.002
 broom+0.002
Neg logit contributions
ams-0.002
 swapped-0.002
ishment-0.002
aws-0.002
aney-0.002
Token and
Token ID290
Feature activation-3.07e-01

Pos logit contributions
ishment+0.007
ams+0.007
ackers+0.006
 tongues+0.006
Twe+0.006
Neg logit contributions
ray-0.003
 living-0.003
 spider-0.003
 broom-0.003
 scientist-0.003
Token.
Token ID13
Feature activation+3.76e-01

Pos logit contributions
ishment+0.018
 tongues+0.017
aws+0.017
ited+0.016
 swapped+0.016
Neg logit contributions
ray-0.011
 broom-0.009
 scientist-0.009
cher-0.009
 drill-0.008
Token She
Token ID1375
Feature activation-2.67e-01

Pos logit contributions
ited+0.010
 tongues+0.010
ized+0.010
ams+0.010
glers+0.010
Neg logit contributions
ray-0.008
 scientist-0.007
 Rat-0.007
 Monster-0.007
 Owl-0.007
Token searched
Token ID16499
Feature activation-7.09e-01

Pos logit contributions
ray+0.006
 scientist+0.006
 drill+0.005
 spider+0.005
 broom+0.005
Neg logit contributions
ams-0.007
ishment-0.007
umbs-0.007
aws-0.006
 tongues-0.006
Token through
Token ID832
Feature activation+4.96e-01

Pos logit contributions
ray+0.016
 scientist+0.015
 drill+0.014
 Rat+0.014
oris+0.014
Neg logit contributions
ams-0.016
aws-0.016
 swapped-0.016
 tongues-0.016
ishment-0.016
Token the
Token ID262
Feature activation-4.24e-01

Pos logit contributions
ams+0.013
 swapped+0.013
umbs+0.013
rimp+0.013
 tongues+0.013
Neg logit contributions
ray-0.009
 scientist-0.009
 spider-0.009
 broom-0.009
 Rat-0.009
Token forest
Token ID8222
Feature activation-1.52e+00

Pos logit contributions
ray+0.010
 scientist+0.010
 spider+0.009
 broom+0.009
 drill+0.009
Neg logit contributions
ams-0.010
ishment-0.010
 swapped-0.010
aws-0.009
 tongues-0.009
Token,
Token ID11
Feature activation-9.02e-01

Pos logit contributions
ray+0.036
 scientist+0.033
 drill+0.030
 Rat+0.030
oris+0.030
Neg logit contributions
ams-0.035
aws-0.035
 tongues-0.034
ishment-0.034
 swapped-0.034
Token high
Token ID1029
Feature activation-1.75e-01

Pos logit contributions
ray+0.023
 scientist+0.021
 drill+0.020
 Rat+0.019
oris+0.019
Neg logit contributions
ams-0.019
aws-0.019
 swapped-0.019
 tongues-0.019
ishment-0.019
Token and
Token ID290
Feature activation+5.66e-01

Pos logit contributions
ray+0.004
 scientist+0.004
 drill+0.004
 Rat+0.004
 spider+0.003
Neg logit contributions
 swapped-0.004
ams-0.004
aws-0.004
 tongues-0.004
aney-0.004
Token low
Token ID1877
Feature activation-6.83e-01

Pos logit contributions
ams+0.010
aws+0.010
 tongues+0.010
umbs+0.010
ishment+0.009
Neg logit contributions
ray-0.016
 scientist-0.016
 Rat-0.015
 drill-0.015
 spider-0.015
Token,
Token ID11
Feature activation-4.95e-01

Pos logit contributions
ray+0.016
 scientist+0.015
 drill+0.014
 Rat+0.013
oris+0.013
Neg logit contributions
ams-0.015
aws-0.015
 tongues-0.015
ishment-0.015
 swapped-0.015
Token is
Token ID318
Feature activation+1.13e+00

Pos logit contributions
ray+0.033
 scientist+0.029
 drill+0.026
 spider+0.025
oris+0.025
Neg logit contributions
ishment-0.046
 swapped-0.045
 tongues-0.045
aws-0.045
ams-0.044
Token the
Token ID262
Feature activation+9.23e-01

Pos logit contributions
aws+0.031
 tongues+0.031
umbs+0.031
ishment+0.031
 swapped+0.030
Neg logit contributions
ray-0.022
 scientist-0.021
 drill-0.019
 Rat-0.019
 spider-0.018
Token last
Token ID938
Feature activation+8.87e-01

Pos logit contributions
umbs+0.024
ams+0.024
aws+0.024
aney+0.024
lights+0.023
Neg logit contributions
 scientist-0.020
ray-0.019
 drill-0.017
 spider-0.017
 mouse-0.016
Token thing
Token ID1517
Feature activation+2.31e-01

Pos logit contributions
aws+0.024
ishment+0.023
 tongues+0.022
 swapped+0.022
agged+0.022
Neg logit contributions
ray-0.020
 scientist-0.019
 tool-0.018
 spider-0.018
 drill-0.017
Token you
Token ID345
Feature activation-7.52e-01

Pos logit contributions
ishment+0.006
aws+0.006
 swapped+0.006
 tongues+0.006
ams+0.006
Neg logit contributions
ray-0.005
 scientist-0.004
 spider-0.004
 Monster-0.004
 drill-0.004
Token want
Token ID765
Feature activation-6.62e-01

Pos logit contributions
ray+0.017
 scientist+0.015
cher+0.013
 knows+0.013
 Monster+0.013
Neg logit contributions
ishment-0.019
lights-0.019
 tongues-0.019
aws-0.019
ams-0.018
Token,
Token ID11
Feature activation-2.00e-01

Pos logit contributions
ray+0.013
 scientist+0.012
 Rat+0.010
 Monster+0.010
 spider+0.010
Neg logit contributions
ishment-0.018
 swapped-0.018
glers-0.018
 tongues-0.018
aws-0.018
Token remember
Token ID3505
Feature activation+4.46e-01

Pos logit contributions
 scientist+0.004
ray+0.004
 Rat+0.004
 spider+0.004
 drill+0.004
Neg logit contributions
 tongues-0.005
 swapped-0.005
glers-0.005
aws-0.005
ams-0.005
Token that
Token ID326
Feature activation-1.52e+00

Pos logit contributions
 swapped+0.012
ams+0.012
aws+0.012
 tongues+0.012
ishment+0.012
Neg logit contributions
ray-0.009
 scientist-0.008
 drill-0.008
 Rat-0.007
 spider-0.007
Token".
Token ID1911
Feature activation-1.27e+00

Pos logit contributions
ray+0.033
 scientist+0.030
 spider+0.027
 drill+0.027
 Monster+0.026
Neg logit contributions
aws-0.040
 swapped-0.040
ishment-0.039
 tongues-0.039
glers-0.038
Token[space]
Token ID220
Feature activation-1.48e+00

Pos logit contributions
ray+0.034
 scientist+0.032
 Rat+0.030
 drill+0.030
oris+0.030
Neg logit contributions
ams-0.025
 tongues-0.025
aws-0.025
 swapped-0.025
ishment-0.024
Token,
Token ID11
Feature activation-1.45e-01

Pos logit contributions
ray+0.006
 scientist+0.006
oris+0.005
 drill+0.005
cher+0.005
Neg logit contributions
 tongues-0.005
ams-0.005
aws-0.005
ishment-0.005
 crust-0.005
Token Ben
Token ID3932
Feature activation-1.26e-01

Pos logit contributions
ray+0.003
 scientist+0.003
 spider+0.003
 Rat+0.003
 drill+0.003
Neg logit contributions
ams-0.004
 swapped-0.004
aws-0.003
 tongues-0.003
glers-0.003
Token.
Token ID13
Feature activation-2.42e-01

Pos logit contributions
ray+0.003
 scientist+0.003
 drill+0.003
 Rat+0.003
oris+0.003
Neg logit contributions
ams-0.003
aws-0.003
 swapped-0.003
 tongues-0.003
ishment-0.003
Token He
Token ID679
Feature activation-9.47e-01

Pos logit contributions
ray+0.006
 scientist+0.006
oris+0.005
 drill+0.005
 broom+0.005
Neg logit contributions
 tongues-0.005
aws-0.005
ams-0.005
 sav-0.005
ishment-0.005
Token's
Token ID338
Feature activation+5.66e-01

Pos logit contributions
ray+0.017
 scientist+0.014
 drill+0.013
 Rat+0.013
 spider+0.013
Neg logit contributions
ams-0.028
ishment-0.028
aws-0.027
 tongues-0.027
 swapped-0.027
Token gone
Token ID3750
Feature activation-1.10e+00

Pos logit contributions
glers+0.016
ited+0.015
lights+0.015
aney+0.015
 swapped+0.015
Neg logit contributions
ray-0.011
 scientist-0.010
 spider-0.010
 broom-0.010
 drill-0.010
Token now
Token ID783
Feature activation-5.21e-01

Pos logit contributions
ray+0.022
 scientist+0.020
 broom+0.018
 spider+0.018
 drill+0.018
Neg logit contributions
 swapped-0.029
ams-0.029
aws-0.029
 tongues-0.029
ishment-0.029
Token.
Token ID13
Feature activation+9.69e-02

Pos logit contributions
ray+0.010
 scientist+0.008
 broom+0.008
 drill+0.008
 spider+0.008
Neg logit contributions
aney-0.015
aws-0.015
ishment-0.015
ams-0.015
 swapped-0.015
Token He
Token ID679
Feature activation-6.27e-01

Pos logit contributions
 tongues+0.002
 sav+0.002
ams+0.002
 swapped+0.002
aws+0.002
Neg logit contributions
ray-0.003
 scientist-0.002
oris-0.002
 broom-0.002
 drill-0.002
Token was
Token ID373
Feature activation+7.24e-01

Pos logit contributions
ray+0.011
 scientist+0.009
 Rat+0.008
 drill+0.008
 spider+0.008
Neg logit contributions
ams-0.019
aws-0.019
ishment-0.019
aney-0.019
 tongues-0.019
Token just
Token ID655
Feature activation+6.85e-01

Pos logit contributions
lights+0.021
 tongues+0.020
glers+0.020
ishment+0.020
umbs+0.020
Neg logit contributions
ray-0.013
 living-0.013
 scientist-0.013
 spider-0.013
 broom-0.012
Token scary
Token ID14343
Feature activation+1.93e-01

Pos logit contributions
umbs+0.006
ams+0.006
glers+0.006
aney+0.006
ople+0.006
Neg logit contributions
ray-0.004
 flashlight-0.004
 scientist-0.004
 spider-0.004
 broom-0.004
Token.
Token ID13
Feature activation-1.79e-01

Pos logit contributions
ams+0.005
 swapped+0.005
 crust+0.005
aws+0.005
glers+0.005
Neg logit contributions
ray-0.004
 scientist-0.004
 spider-0.003
 broom-0.003
 Monster-0.003
Token[newline]
Token ID198
Feature activation+1.75e-01

Pos logit contributions
ray+0.004
 Rat+0.004
 scientist+0.004
 spider+0.003
 Monster+0.003
Neg logit contributions
ams-0.004
glers-0.004
ited-0.004
 tongues-0.004
ishment-0.004
Token[newline]
Token ID198
Feature activation+7.93e-02

Pos logit contributions
aws+0.003
ams+0.003
 sav+0.003
aney+0.003
 swapped+0.003
Neg logit contributions
ray-0.005
 scientist-0.005
 drill-0.004
 broom-0.004
 spider-0.004
TokenLater
Token ID18602
Feature activation+2.97e-01

Pos logit contributions
ams+0.002
aws+0.002
 swapped+0.002
 tongues+0.002
ishment+0.002
Neg logit contributions
ray-0.002
 scientist-0.002
 drill-0.002
 Rat-0.002
oris-0.002
Token that
Token ID326
Feature activation-3.06e+00

Pos logit contributions
ams+0.008
 tongues+0.008
aws+0.008
 swapped+0.008
ishment+0.008
Neg logit contributions
ray-0.006
 scientist-0.006
 spider-0.005
 Rat-0.005
 Monster-0.005
Token day
Token ID1110
Feature activation-7.26e-01

Pos logit contributions
ray+0.093
cher+0.088
 scientist+0.086
 drill+0.084
oris+0.084
Neg logit contributions
 sav-0.048
ited-0.043
 crust-0.043
ams-0.042
 festivities-0.042
Token,
Token ID11
Feature activation-1.17e+00

Pos logit contributions
ray+0.017
 scientist+0.015
 drill+0.014
oris+0.014
 Rat+0.014
Neg logit contributions
ams-0.017
aws-0.017
 swapped-0.017
 tongues-0.017
 crust-0.017
Token Lily
Token ID20037
Feature activation-9.90e-01

Pos logit contributions
 scientist+0.025
ray+0.025
 Rat+0.025
 spider+0.025
 Monster+0.024
Neg logit contributions
ams-0.029
ishment-0.028
ited-0.027
 tongues-0.027
umbs-0.027
Token's
Token ID338
Feature activation-5.74e-01

Pos logit contributions
ray+0.026
 scientist+0.025
 drill+0.024
oris+0.023
 Rat+0.023
Neg logit contributions
 swapped-0.019
aws-0.019
ams-0.019
 sav-0.019
 tongues-0.018
Token mom
Token ID1995
Feature activation-9.35e-02

Pos logit contributions
ray+0.014
 scientist+0.013
 drill+0.012
oris+0.012
 Rat+0.011
Neg logit contributions
aws-0.013
 tongues-0.013
 swapped-0.013
ams-0.013
ishment-0.013
Token were
Token ID547
Feature activation+1.56e-02

Pos logit contributions
ray+0.001
 scientist+0.001
cher+0.001
 flashlight+0.001
 Monster+0.001
Neg logit contributions
ishment-0.001
 tongues-0.001
 crust-0.001
umbs-0.001
ams-0.001
Token having
Token ID1719
Feature activation+8.27e-01

Pos logit contributions
ishment+0.000
 tongues+0.000
ited+0.000
umbs+0.000
ackers+0.000
Neg logit contributions
 scientist-0.000
 living-0.000
 believing-0.000
 broom-0.000
cher-0.000
Token so
Token ID523
Feature activation+8.52e-01

Pos logit contributions
ited+0.023
ishment+0.023
 crust+0.023
ams+0.023
glers+0.022
Neg logit contributions
 Rat-0.014
 spider-0.013
 scientist-0.013
 living-0.013
 Taylor-0.012
Token much
Token ID881
Feature activation+1.67e+00

Pos logit contributions
ishment+0.026
 tongues+0.023
 wrists+0.023
rily+0.023
 crust+0.023
Neg logit contributions
ray-0.016
 knows-0.013
 Rat-0.013
 spider-0.013
 scientist-0.012
Token fun
Token ID1257
Feature activation+9.67e-01

Pos logit contributions
ishment+0.037
ams+0.036
aney+0.036
rily+0.035
 tongues+0.035
Neg logit contributions
ray-0.040
 scientist-0.040
 spider-0.037
 Rat-0.036
 flashlight-0.036
Token.
Token ID13
Feature activation-2.46e+00

Pos logit contributions
 swapped+0.023
ishment+0.023
aney+0.023
 tongues+0.023
ams+0.023
Neg logit contributions
ray-0.021
 scientist-0.020
 spider-0.018
 drill-0.018
 broom-0.018
Token April
Token ID3035
Feature activation+3.16e-01

Pos logit contributions
ray+0.066
cher+0.065
 drill+0.064
 scientist+0.063
 broom+0.062
Neg logit contributions
 sav-0.044
ishment-0.044
aws-0.043
 delight-0.042
 swapped-0.041
Token was
Token ID373
Feature activation-4.17e-01

Pos logit contributions
ishment+0.008
umbs+0.008
 tongues+0.008
 crust+0.008
ams+0.008
Neg logit contributions
ray-0.006
 scientist-0.006
 Monster-0.005
 spider-0.005
 Rat-0.005
Token so
Token ID523
Feature activation+1.86e-01

Pos logit contributions
 scientist+0.009
ray+0.009
 spider+0.008
 Rat+0.008
 Monster+0.008
Neg logit contributions
 tongues-0.010
ishment-0.010
umbs-0.010
glers-0.009
 crust-0.009
Token excited
Token ID6568
Feature activation+1.15e+00

Pos logit contributions
ishment+0.005
 tongues+0.005
umbs+0.005
glers+0.005
 crust+0.005
Neg logit contributions
ray-0.003
 scientist-0.003
 knows-0.003
 spider-0.003
oding-0.003
Token,
Token ID11
Feature activation-1.23e+00

Pos logit contributions
 tongues+0.038
 swapped+0.038
 crust+0.038
ams+0.037
umbs+0.037
Neg logit contributions
ray-0.016
 scientist-0.014
 spider-0.012
 Rat-0.012
 Monster-0.012
Token in
Token ID287
Feature activation+1.37e+00

Pos logit contributions
ray+0.011
 scientist+0.011
 Rat+0.010
 drill+0.010
 Monster+0.010
Neg logit contributions
ited-0.016
aws-0.016
aney-0.016
ams-0.016
 swapped-0.016
Token a
Token ID257
Feature activation-4.32e-01

Pos logit contributions
aws+0.040
ams+0.039
 tongues+0.039
 swapped+0.039
ishment+0.039
Neg logit contributions
 scientist-0.023
ray-0.022
 drill-0.022
 Rat-0.021
 spider-0.021
Token hurry
Token ID23290
Feature activation+7.11e-01

Pos logit contributions
ray+0.010
 scientist+0.010
 broom+0.009
 spider+0.009
 drill+0.009
Neg logit contributions
ams-0.011
aws-0.010
ishment-0.010
 tongues-0.010
 swapped-0.010
Token.
Token ID13
Feature activation-9.94e-01

Pos logit contributions
 tongues+0.021
ited+0.020
 swapped+0.020
aws+0.020
ishment+0.020
Neg logit contributions
ray-0.013
 scientist-0.012
 broom-0.011
 drill-0.011
 living-0.010
Token From
Token ID3574
Feature activation+4.06e-02

Pos logit contributions
ray+0.025
 scientist+0.022
 Rat+0.021
 drill+0.021
oris+0.020
Neg logit contributions
 tongues-0.022
ams-0.022
aws-0.022
ishment-0.022
 swapped-0.022
Token that
Token ID326
Feature activation-2.87e+00

Pos logit contributions
umbs+0.001
 tongues+0.001
aney+0.001
 crust+0.001
 swapped+0.001
Neg logit contributions
ray-0.001
 scientist-0.001
 Rat-0.001
 spider-0.001
 Monster-0.001
Token day
Token ID1110
Feature activation-6.90e-01

Pos logit contributions
ray+0.101
cher+0.096
 scientist+0.094
oris+0.093
 drill+0.090
Neg logit contributions
 sav-0.034
 time-0.030
 refreshing-0.027
ited-0.027
ams-0.027
Token forward
Token ID2651
Feature activation-8.78e-01

Pos logit contributions
ray+0.021
 scientist+0.020
 outage+0.018
 spider+0.018
oris+0.018
Neg logit contributions
ams-0.012
agged-0.011
 both-0.011
 swapped-0.010
 ne-0.010
Token,
Token ID11
Feature activation-2.52e+00

Pos logit contributions
ray+0.021
 scientist+0.020
 drill+0.019
 broom+0.018
oris+0.018
Neg logit contributions
ams-0.019
 swapped-0.019
aws-0.019
ishment-0.019
 tongues-0.019
Token the
Token ID262
Feature activation-6.91e-01

Pos logit contributions
ray+0.056
 scientist+0.054
 drill+0.050
 Rat+0.049
 spider+0.048
Neg logit contributions
aws-0.060
ams-0.059
 tongues-0.059
 swapped-0.059
ishment-0.059
Token cat
Token ID3797
Feature activation-1.54e+00

Pos logit contributions
ray+0.018
 scientist+0.016
oris+0.016
 Monster+0.015
 Rat+0.015
Neg logit contributions
 crust-0.014
 swapped-0.014
aws-0.014
 sav-0.014
ams-0.014
Token.
Token ID13
Feature activation-8.20e-01

Pos logit contributions
aws+0.008
aney+0.008
glers+0.008
 swapped+0.008
ishment+0.008
Neg logit contributions
ray-0.006
 scientist-0.006
 spider-0.005
 if-0.005
 broom-0.005
Token[space]
Token ID220
Feature activation-9.56e-01

Pos logit contributions
ray+0.021
 scientist+0.020
 drill+0.019
 broom+0.019
 spider+0.019
Neg logit contributions
aws-0.017
ams-0.016
 sav-0.016
 swapped-0.016
 crust-0.016
Token[newline]
Token ID198
Feature activation-6.31e-01

Pos logit contributions
ray+0.024
 scientist+0.023
 drill+0.021
 spider+0.021
 broom+0.021
Neg logit contributions
aws-0.020
ams-0.019
 swapped-0.019
 tongues-0.019
 crust-0.019
Token[newline]
Token ID198
Feature activation-5.51e-01

Pos logit contributions
ray+0.018
 scientist+0.017
 drill+0.016
 spider+0.016
 broom+0.016
Neg logit contributions
aws-0.011
 sav-0.011
umbs-0.011
ams-0.011
 swapped-0.011
TokenFrom
Token ID4863
Feature activation+1.54e-02

Pos logit contributions
ray+0.015
 scientist+0.014
 drill+0.014
 spider+0.014
 broom+0.014
Neg logit contributions
aws-0.010
ams-0.010
 swapped-0.010
 sav-0.010
 crust-0.010
Token that
Token ID326
Feature activation-2.77e+00

Pos logit contributions
 tongues+0.000
ams+0.000
 swapped+0.000
aws+0.000
umbs+0.000
Neg logit contributions
ray-0.000
 scientist-0.000
 Rat-0.000
 drill-0.000
 spider-0.000
Token day
Token ID1110
Feature activation-8.92e-01

Pos logit contributions
ray+0.104
cher+0.102
oris+0.097
 scientist+0.095
driver+0.093
Neg logit contributions
 sav-0.029
 time-0.026
 crust-0.022
ited-0.021
 refreshing-0.020
Token on
Token ID319
Feature activation-4.84e-01

Pos logit contributions
ray+0.027
 scientist+0.026
 outage+0.024
ccoli+0.024
 spider+0.024
Neg logit contributions
ams-0.015
agged-0.014
 both-0.013
 swapped-0.013
 time-0.013
Token,
Token ID11
Feature activation-2.13e+00

Pos logit contributions
ray+0.011
 scientist+0.010
 drill+0.009
oris+0.009
 Rat+0.009
Neg logit contributions
ams-0.012
aws-0.012
 tongues-0.012
 swapped-0.011
ishment-0.011
Token the
Token ID262
Feature activation-6.79e-01

Pos logit contributions
ray+0.045
 scientist+0.044
 Rat+0.040
 drill+0.040
 Monster+0.040
Neg logit contributions
aws-0.053
ams-0.053
aney-0.053
 tongues-0.053
 swapped-0.052
Token musician
Token ID21623
Feature activation-2.22e+00

Pos logit contributions
ray+0.015
oris+0.013
 scientist+0.013
 Rat+0.013
 drill+0.013
Neg logit contributions
 swapped-0.017
 sav-0.016
ishment-0.016
ams-0.016
aws-0.016
Token time
Token ID640
Feature activation+1.23e-01

Pos logit contributions
ishment+0.048
rily+0.047
glers+0.045
ized+0.044
ams+0.043
Neg logit contributions
 spider-0.037
 flashlight-0.037
 living-0.036
 scientist-0.036
ray-0.036
Token.
Token ID13
Feature activation+2.61e-01

Pos logit contributions
ishment+0.003
 tongues+0.003
ited+0.003
aney+0.003
ackers+0.003
Neg logit contributions
ray-0.002
 scientist-0.002
 living-0.002
 spider-0.002
 drill-0.002
Token Lucy
Token ID22162
Feature activation-1.95e-01

Pos logit contributions
ited+0.007
ams+0.007
ized+0.007
glers+0.007
ishment+0.007
Neg logit contributions
ray-0.005
 Rat-0.005
 Monster-0.005
 scientist-0.005
 Mouse-0.005
Token was
Token ID373
Feature activation-7.46e-01

Pos logit contributions
ray+0.004
 scientist+0.003
 Monster+0.003
 Rat+0.003
 spider+0.003
Neg logit contributions
ishment-0.005
ams-0.005
 tongues-0.005
 crust-0.005
umbs-0.005
Token happy
Token ID3772
Feature activation+8.83e-01

Pos logit contributions
 scientist+0.012
 living+0.012
 Rat+0.012
 spider+0.012
ray+0.011
Neg logit contributions
 tongues-0.022
ishment-0.022
lights-0.021
umbs-0.021
ited-0.020
Token again
Token ID757
Feature activation-2.47e+00

Pos logit contributions
ams+0.026
ited+0.026
umbs+0.026
 tongues+0.026
ishment+0.026
Neg logit contributions
 scientist-0.015
ray-0.014
 spider-0.013
 Rat-0.013
 Monster-0.013
Token.
Token ID13
Feature activation-4.84e-01

Pos logit contributions
ray+0.057
 scientist+0.054
 Rat+0.049
 drill+0.049
 spider+0.049
Neg logit contributions
 swapped-0.057
 tongues-0.057
aws-0.057
ams-0.057
ishment-0.056
Token Mia
Token ID32189
Feature activation+4.30e-01

Pos logit contributions
ray+0.012
 scientist+0.011
 Rat+0.010
oris+0.010
 drill+0.010
Neg logit contributions
ams-0.011
 swapped-0.011
ishment-0.011
aws-0.011
 tongues-0.011
Token and
Token ID290
Feature activation+4.69e-01

Pos logit contributions
ishment+0.012
 tongues+0.012
ams+0.012
umbs+0.011
 crust+0.011
Neg logit contributions
ray-0.008
 scientist-0.008
 Monster-0.007
 Rat-0.007
 spider-0.007
Token Lucy
Token ID22162
Feature activation-4.92e-02

Pos logit contributions
ams+0.010
aws+0.010
 swapped+0.010
 tongues+0.010
ishment+0.010
Neg logit contributions
ray-0.012
 scientist-0.011
 Rat-0.010
 drill-0.010
 spider-0.010
Token made
Token ID925
Feature activation-1.29e+00

Pos logit contributions
ray+0.001
 scientist+0.001
 Rat+0.001
 Monster+0.001
inea+0.001
Neg logit contributions
ishment-0.001
lights-0.001
ams-0.001
 crust-0.001
umbs-0.001