TOP ACTIVATIONS
MAX = 0.091

to try the broccoli c asser ole . He said ,
ier said , " It seems the price of the food
" But it is just a toy , Lily .
" It 's just me ," a small voice
Ruby , even if it seems hard , don 't forget
and said , " It seems like it disappeared !"
" It 's just five dollars !" the shop
" Ben , it was just a spider . It was
" It 's just a skirt , Molly ,"
I have the z ig zag ?" M ama
" It 's probably just your water bottle and snacks
the funn iest potato h ider ever . You are the
" It 's just over there . It won
who own it don 't actually live there . They hate
said , " It 's just a harmless hook , don
said . " This station seems really old . What do
said , " It 's just an empty hole . Maybe
to be restored to its former glory ."
the clay until it looked just like the butterfly outside .
and Tom was the h ider . Sara counted to ten

MIN ACTIVATIONS
MIN = -0.103

replied . Fl umm ox ed , the little
added shells and stones and seaw eed to make it look
decorated it with shells and seaw eed . They had a
ate it with shells and seaw eed . They find a
some shells and rocks and seaw eed . He picked them
. <|endoftext|> M ama and pap a were busy in the
like fish and shells and seaw eed . But they did
made him an even y umm ier snack and they both
see shells , rocks , seaw eed , crabs and fish
of shells , rocks and seaw eed , but Jane was
and it was the y umm iest dinner they ever had
like fish and shells and seaw eed . One
shore , collecting shells and seaw eed . Jimmy took time
an amazing chess player , Mate o ! Remember to always
as solid as a table cloth ! <|endoftext|> Once upon a
a few shells and some seaw eed and built a castle
and it was the y umm iest ice cream she ever
. It was even y umm ier than the first card
it outside and a big wind blew it away . Anna
Like shells and rocks and seaw eed ." M

INTERVAL 0.043 - 0.091
CONTAINS 0.744%

shy . The big bird smiled and said , " Do
The fire man truck is here because there is a problem
. She wished that her teacher would love to dance as
Ben had a red shirt with a big button on the
good artist . One day , Anna wanted to

INTERVAL -0.006 - 0.043
CONTAINS 63.518%

friend of the jungle . They wanted to make
ran towards it . She climbed up and slid down with
came out . He smiled and said , " Welcome ,
the paper and was so proud of himself .
could such a magical creature exist in the world ?

INTERVAL -0.055 - -0.006
CONTAINS 35.579%

keep his room clean from now on . <|endoftext|> Once upon
better than pretending to be a princess with a fake necklace
their cookies . <|endoftext|> One day , a little girl named
it home with them ! Back at home ,
because his paws were too big . Max was sad and

INTERVAL -0.103 - -0.055
CONTAINS 0.159%

a button and a cr umb . They also saw a
" O w , ow , ow !" Tom cried
big tree and dug a hole . As they
with delicious potatoes all year long . The end
house , grabbed a big bag of candy , and ran
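
The four INTERVAL ranges above look like equal-width bins spanning MIN (-0.103) to MAX (0.091). The following is a minimal sketch under that assumption, not the dashboard's actual code; the function name `interval_edges` and the bin count of four are illustrative choices inferred from the listing. It recomputes the bin boundaries from the displayed, already rounded MIN and MAX, and checks that the CONTAINS percentages add up to 100.

```python
# Hypothetical helper: equal-width bin edges between the displayed MIN and MAX.
def interval_edges(lo, hi, n_bins=4):
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]

# MIN and MAX as shown in the listing (rounded to three decimals there).
edges = interval_edges(-0.103, 0.091)

# CONTAINS percentages as shown, from the highest interval to the lowest.
shares = [0.744, 63.518, 35.579, 0.159]
total_share = sum(shares)

for low, high in zip(edges[:-1], edges[1:]):
    print(f"{low:+.4f} .. {high:+.4f}")
print(f"total share: {total_share:.3f}%")
```

Because the inputs are themselves rounded, the computed edges should only be expected to agree with the displayed boundaries (-0.055, -0.006, 0.043) to within about one unit in the third decimal place.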
Token to
Token ID284
Feature activation+2.41e-02

Pos logit contributions
 seeking+0.041
 pad+0.035
 lair+0.034
 former+0.033
 wonders+0.031
Neg logit contributions
oo-0.037
aw-0.037
ug-0.036
ister-0.034
ww-0.032
Token try
Token ID1949
Feature activation+1.31e-02

Pos logit contributions
 seeking+0.033
 pad+0.024
 wonders+0.023
 hiding+0.022
 forgetting+0.022
Neg logit contributions
ug-0.060
aw-0.056
ww-0.054
oo-0.053
apy-0.053
Token the
Token ID262
Feature activation+1.37e-02

Pos logit contributions
 seeking+0.025
 pad+0.022
 slave+0.020
 former+0.020
 lair+0.019
Neg logit contributions
oo-0.020
aw-0.020
ug-0.020
ister-0.019
apy-0.018
Token broccoli
Token ID44653
Feature activation+2.55e-02

Pos logit contributions
 seeking+0.031
 pad+0.027
 former+0.025
epad+0.024
 lair+0.024
Neg logit contributions
ww-0.018
ug-0.016
aw-0.016
oo-0.016
olk-0.015
Token c
Token ID269
Feature activation+3.55e-02

Pos logit contributions
 seeking+0.048
 former+0.041
epad+0.041
 wonders+0.040
 pad+0.040
Neg logit contributions
ug-0.039
aw-0.038
ister-0.036
oo-0.035
ww-0.035
Tokenasser
Token ID24929
Feature activation+9.13e-02

Pos logit contributions
 seeking+0.083
 pad+0.071
 wonders+0.069
 pretending+0.067
 hiding+0.066
Neg logit contributions
ug-0.052
aw-0.045
oo-0.044
ww-0.043
apy-0.041
Tokenole
Token ID2305
Feature activation+4.83e-02

Pos logit contributions
 seeking+0.217
 pad+0.180
 pretending+0.178
 wonders+0.170
 forgetting+0.168
Neg logit contributions
ug-0.136
aw-0.118
oo-0.118
ww-0.109
ister-0.108
Token.
Token ID13
Feature activation-5.01e-03

Pos logit contributions
 seeking+0.096
 former+0.085
 pad+0.082
epad+0.080
 wonders+0.078
Neg logit contributions
aw-0.076
ug-0.075
ww-0.072
ister-0.068
oo-0.066
Token He
Token ID679
Feature activation-1.37e-02

Pos logit contributions
ug+0.008
aw+0.008
apy+0.007
oo+0.007
ww+0.007
Neg logit contributions
 seeking-0.010
 pad-0.008
epad-0.008
 supposed-0.008
 wonders-0.008
Token said
Token ID531
Feature activation-8.68e-04

Pos logit contributions
ug+0.025
aw+0.025
oo+0.023
ww+0.023
dd+0.022
Neg logit contributions
 seeking-0.026
epad-0.021
 pad-0.021
 pretending-0.019
 lair-0.019
Token,
Token ID11
Feature activation+8.14e-03

Pos logit contributions
ug+0.002
aw+0.002
ww+0.002
opter+0.002
olk+0.001
Neg logit contributions
 seeking-0.001
 hiding-0.001
 anyway-0.001
 pretending-0.001
 missing-0.001
Tokenier
Token ID959
Feature activation+2.32e-03

Pos logit contributions
ug+0.047
ww+0.044
apy+0.041
oo+0.041
olk+0.039
Neg logit contributions
 seeking-0.054
 pad-0.046
 anyway-0.044
 first-0.043
epad-0.043
Token said
Token ID531
Feature activation+1.26e-02

Pos logit contributions
 seeking+0.004
 pad+0.004
 wonders+0.003
 pretending+0.003
epad+0.003
Neg logit contributions
ug-0.005
aw-0.004
oo-0.004
ww-0.004
apy-0.004
Token,
Token ID11
Feature activation+2.19e-02

Pos logit contributions
 seeking+0.024
 hiding+0.020
 pretending+0.020
 anyway+0.019
 pad+0.019
Neg logit contributions
ug-0.024
ww-0.021
aw-0.021
oo-0.020
opter-0.019
Token "
Token ID366
Feature activation+1.88e-02

Pos logit contributions
 seeking+0.034
 pad+0.027
 wonders+0.025
 pretending+0.024
 hiding+0.024
Neg logit contributions
ug-0.049
aw-0.045
oo-0.044
ww-0.044
apy-0.042
TokenIt
Token ID1026
Feature activation+2.82e-02

Pos logit contributions
 seeking+0.038
 pad+0.032
 hiding+0.030
 pretending+0.030
 wonders+0.030
Neg logit contributions
ug-0.033
aw-0.031
ww-0.030
oo-0.030
apy-0.028
Token seems
Token ID2331
Feature activation+8.44e-02

Pos logit contributions
 seeking+0.051
 pad+0.044
epad+0.041
 lair+0.038
 pretending+0.038
Neg logit contributions
ug-0.053
ww-0.050
aw-0.049
oo-0.048
opter-0.046
Token the
Token ID262
Feature activation+4.15e-03

Pos logit contributions
 seeking+0.144
 pad+0.118
epad+0.106
 lair+0.106
 pretending+0.105
Neg logit contributions
ug-0.162
ww-0.149
aw-0.148
 cho-0.142
oo-0.141
Token price
Token ID2756
Feature activation-4.15e-04

Pos logit contributions
 seeking+0.007
 pad+0.006
 pretending+0.006
 wonders+0.006
epad+0.005
Neg logit contributions
ww-0.007
ug-0.007
aw-0.007
oo-0.007
ister-0.007
Token of
Token ID286
Feature activation+1.12e-02

Pos logit contributions
ug+0.001
aw+0.001
oo+0.001
ww+0.001
ister+0.001
Neg logit contributions
 seeking-0.001
 pad-0.001
 wonders-0.001
 pretending-0.001
epad-0.001
Token the
Token ID262
Feature activation+1.24e-03

Pos logit contributions
 seeking+0.022
 pad+0.018
 wonders+0.017
epad+0.017
 pretending+0.017
Neg logit contributions
ug-0.020
oo-0.018
aw-0.018
ww-0.018
ister-0.017
Token food
Token ID2057
Feature activation-1.20e-02

Pos logit contributions
 seeking+0.002
 pad+0.002
 wonders+0.002
epad+0.002
 pretending+0.002
Neg logit contributions
ug-0.002
ww-0.002
aw-0.002
oo-0.002
ister-0.002
Token
Token ID198
Feature activation+7.90e-03

Pos logit contributions
 seeking+0.023
 pad+0.019
 wonders+0.018
epad+0.018
 pretending+0.018
Neg logit contributions
ug-0.026
aw-0.024
oo-0.024
ww-0.023
apy-0.022
Token"
Token ID1
Feature activation+2.47e-02

Pos logit contributions
 seeking+0.018
 pad+0.015
 wonders+0.015
 pretending+0.014
epad+0.014
Neg logit contributions
ug-0.012
aw-0.011
oo-0.011
ww-0.010
apy-0.010
TokenBut
Token ID1537
Feature activation+1.99e-02

Pos logit contributions
 seeking+0.050
 pad+0.041
 wonders+0.039
 hiding+0.039
 pretending+0.038
Neg logit contributions
ug-0.044
aw-0.041
ww-0.040
oo-0.039
apy-0.038
Token it
Token ID340
Feature activation+9.39e-03

Pos logit contributions
 seeking+0.037
 pad+0.030
 wonders+0.029
 pretending+0.028
 hiding+0.028
Neg logit contributions
ug-0.039
aw-0.035
oo-0.034
ww-0.034
apy-0.032
Token is
Token ID318
Feature activation+3.61e-02

Pos logit contributions
 seeking+0.015
 pad+0.012
 wonders+0.011
epad+0.011
 pretending+0.011
Neg logit contributions
ug-0.021
aw-0.019
oo-0.019
ww-0.019
apy-0.018
Token just
Token ID655
Feature activation+8.17e-02

Pos logit contributions
 seeking+0.053
 pad+0.042
epad+0.039
 wonders+0.038
 pretending+0.035
Neg logit contributions
ug-0.080
ww-0.076
oo-0.074
aw-0.074
 cho-0.073
Token a
Token ID257
Feature activation+1.21e-02

Pos logit contributions
 seeking+0.144
 pad+0.118
epad+0.109
 wonders+0.108
 pretending+0.104
Neg logit contributions
ug-0.161
aw-0.152
ww-0.146
oo-0.144
apy-0.137
Token toy
Token ID13373
Feature activation+4.07e-03

Pos logit contributions
 seeking+0.024
 pad+0.019
 wonders+0.019
epad+0.018
 pretending+0.018
Neg logit contributions
ug-0.021
ww-0.020
aw-0.019
oo-0.019
apy-0.018
Token,
Token ID11
Feature activation-2.07e-04

Pos logit contributions
 seeking+0.008
 pad+0.006
 wonders+0.006
epad+0.006
 forgetting+0.006
Neg logit contributions
ug-0.007
aw-0.007
ww-0.007
oo-0.007
opter-0.006
Token Lily
Token ID20037
Feature activation+1.08e-02

Pos logit contributions
ug+0.000
aw+0.000
oo+0.000
ww+0.000
apy+0.000
Neg logit contributions
 seeking-0.000
 pad-0.000
 wonders-0.000
epad-0.000
 pretending-0.000
Token.
Token ID13
Feature activation+1.91e-02

Pos logit contributions
 seeking+0.021
 pad+0.016
 wonders+0.016
epad+0.016
 former+0.015
Neg logit contributions
ug-0.021
aw-0.019
oo-0.018
ww-0.018
ister-0.017
Token
Token ID198
Feature activation+8.27e-03

Pos logit contributions
 seeking+0.031
 pad+0.025
 wonders+0.023
 pretending+0.022
 hiding+0.022
Neg logit contributions
ug-0.046
aw-0.043
ww-0.042
oo-0.041
opter-0.040
Token
Token ID198
Feature activation+2.91e-03

Pos logit contributions
 seeking+0.009
 wonders+0.007
 pad+0.007
 anyway+0.006
 find+0.006
Neg logit contributions
ug-0.021
aw-0.020
ww-0.020
olk-0.019
dd-0.019
Token"
Token ID1
Feature activation+2.64e-02

Pos logit contributions
 seeking+0.005
 pad+0.005
 wonders+0.004
 hiding+0.004
 find+0.004
Neg logit contributions
ug-0.005
aw-0.005
oo-0.005
ww-0.005
apy-0.005
TokenIt
Token ID1026
Feature activation+3.18e-02

Pos logit contributions
 seeking+0.046
 wonders+0.037
 pad+0.037
 pretending+0.034
 forgetting+0.033
Neg logit contributions
ug-0.056
oo-0.049
aw-0.049
ww-0.047
opter-0.045
Token's
Token ID338
Feature activation+3.52e-02

Pos logit contributions
 seeking+0.051
 pad+0.041
epad+0.039
 wonders+0.036
 former+0.036
Neg logit contributions
ug-0.068
aw-0.063
ww-0.062
oo-0.062
opter-0.060
Token just
Token ID655
Feature activation+8.14e-02

Pos logit contributions
 seeking+0.058
 pad+0.046
 wonders+0.044
 pretending+0.042
epad+0.041
Neg logit contributions
ug-0.076
aw-0.070
oo-0.068
ww-0.068
apy-0.065
Token me
Token ID502
Feature activation+1.62e-02

Pos logit contributions
 seeking+0.141
 pad+0.118
epad+0.108
 wonders+0.108
 anyway+0.103
Neg logit contributions
ug-0.160
aw-0.150
oo-0.146
ww-0.143
ister-0.133
Token,"
Token ID553
Feature activation+2.03e-02

Pos logit contributions
 seeking+0.041
 wonders+0.036
 pad+0.036
epad+0.035
 former+0.035
Neg logit contributions
ug-0.019
oo-0.017
aw-0.014
ister-0.014
ww-0.013
Token a
Token ID257
Feature activation+1.02e-02

Pos logit contributions
 seeking+0.032
 wonders+0.026
 hiding+0.025
 pretending+0.024
 anyway+0.024
Neg logit contributions
ug-0.045
aw-0.041
ww-0.040
oo-0.039
opter-0.039
Token small
Token ID1402
Feature activation-2.10e-02

Pos logit contributions
 seeking+0.016
 pad+0.012
 wonders+0.012
 hiding+0.012
 anyway+0.011
Neg logit contributions
ug-0.022
aw-0.021
oo-0.020
dd-0.020
ww-0.019
Token voice
Token ID3809
Feature activation-1.28e-02

Pos logit contributions
ug+0.042
aw+0.038
oo+0.037
ww+0.037
apy+0.036
Neg logit contributions
 seeking-0.037
 pad-0.031
 wonders-0.030
 hiding-0.029
 pretending-0.028
Token Ruby
Token ID10888
Feature activation+2.56e-02

Pos logit contributions
 seeking+0.011
 pad+0.009
 wonders+0.009
 pretending+0.009
epad+0.008
Neg logit contributions
ug-0.012
aw-0.011
oo-0.010
ww-0.010
apy-0.010
Token,
Token ID11
Feature activation-1.48e-03

Pos logit contributions
 seeking+0.061
 pad+0.053
 wonders+0.051
 pretending+0.050
 hiding+0.049
Neg logit contributions
ug-0.036
aw-0.031
oo-0.030
ww-0.029
apy-0.028
Token even
Token ID772
Feature activation-2.23e-03

Pos logit contributions
ug+0.003
aw+0.003
oo+0.003
ww+0.002
apy+0.002
Neg logit contributions
 seeking-0.003
 pad-0.002
 wonders-0.002
 pretending-0.002
 hiding-0.002
Token if
Token ID611
Feature activation-2.94e-02

Pos logit contributions
ug+0.005
aw+0.004
ww+0.004
oo+0.004
apy+0.004
Neg logit contributions
 seeking-0.004
 pad-0.003
 wonders-0.003
epad-0.003
 pretending-0.003
Token it
Token ID340
Feature activation+2.33e-02

Pos logit contributions
ug+0.055
aw+0.049
oo+0.048
ww+0.047
apy+0.045
Neg logit contributions
 seeking-0.057
 pad-0.047
 wonders-0.045
 hiding-0.044
 pretending-0.043
Token seems
Token ID2331
Feature activation+8.13e-02

Pos logit contributions
 seeking+0.041
 pad+0.036
epad+0.034
 lair+0.031
 former+0.031
Neg logit contributions
ug-0.044
aw-0.041
ww-0.041
oo-0.040
apy-0.038
Token hard
Token ID1327
Feature activation+1.88e-02

Pos logit contributions
 seeking+0.137
 pad+0.110
epad+0.101
 pretending+0.099
 lair+0.097
Neg logit contributions
ug-0.162
ww-0.149
aw-0.148
oo-0.145
 cho-0.141
Token,
Token ID11
Feature activation+3.08e-03

Pos logit contributions
 seeking+0.039
 pad+0.032
 wonders+0.031
epad+0.030
 forgetting+0.029
Neg logit contributions
ug-0.032
oo-0.028
ww-0.027
aw-0.027
ister-0.025
Token don
Token ID836
Feature activation+5.56e-03

Pos logit contributions
 seeking+0.006
 pad+0.005
 wonders+0.005
epad+0.005
 supposed+0.005
Neg logit contributions
ug-0.005
aw-0.005
ww-0.005
oo-0.005
apy-0.004
Token't
Token ID470
Feature activation+2.62e-02

Pos logit contributions
 seeking+0.008
 pad+0.006
 wonders+0.006
 anyway+0.005
 pretending+0.005
Neg logit contributions
ug-0.013
aw-0.012
ww-0.012
oo-0.012
apy-0.011
Token forget
Token ID6044
Feature activation-5.54e-03

Pos logit contributions
 seeking+0.042
 pad+0.034
 wonders+0.031
 pretending+0.030
epad+0.030
Neg logit contributions
ug-0.058
aw-0.052
oo-0.052
ww-0.052
apy-0.050
Token and
Token ID290
Feature activation+1.05e-02

Pos logit contributions
 seeking+0.047
 pad+0.041
 supposed+0.041
epad+0.040
 former+0.039
Neg logit contributions
ug-0.036
aw-0.031
oo-0.031
apy-0.030
dd-0.029
Token said
Token ID531
Feature activation-1.87e-04

Pos logit contributions
 seeking+0.020
epad+0.016
 pad+0.016
 supposed+0.016
 fault+0.016
Neg logit contributions
ug-0.018
aw-0.018
oo-0.016
apy-0.016
dd-0.016
Token,
Token ID11
Feature activation+1.30e-02

Pos logit contributions
ug+0.000
aw+0.000
ww+0.000
oo+0.000
olk+0.000
Neg logit contributions
 seeking-0.000
 hiding-0.000
 pad-0.000
 pretending-0.000
 anyway-0.000
Token "
Token ID366
Feature activation+1.54e-02

Pos logit contributions
 seeking+0.021
 pad+0.017
 wonders+0.016
epad+0.016
 hiding+0.015
Neg logit contributions
ug-0.027
aw-0.025
oo-0.025
ww-0.024
apy-0.024
TokenIt
Token ID1026
Feature activation+2.67e-02

Pos logit contributions
 seeking+0.029
 pad+0.024
 wonders+0.023
 pretending+0.022
epad+0.022
Neg logit contributions
ug-0.030
aw-0.026
oo-0.026
ww-0.026
apy-0.025
Token seems
Token ID2331
Feature activation+8.01e-02

Pos logit contributions
 seeking+0.045
 pad+0.038
epad+0.035
 wonders+0.032
 lair+0.032
Neg logit contributions
ug-0.054
ww-0.050
aw-0.050
oo-0.049
opter-0.048
Token like
Token ID588
Feature activation+3.70e-02

Pos logit contributions
 seeking+0.134
 pad+0.112
epad+0.100
 wonders+0.096
 lair+0.095
Neg logit contributions
ug-0.161
ww-0.148
aw-0.148
 cho-0.143
oo-0.143
Token it
Token ID340
Feature activation+3.16e-03

Pos logit contributions
 seeking+0.068
 pad+0.061
epad+0.054
 wonders+0.053
 lair+0.050
Neg logit contributions
ug-0.068
ww-0.063
aw-0.063
oo-0.060
apy-0.059
Token disappeared
Token ID12120
Feature activation+1.93e-02

Pos logit contributions
 seeking+0.006
 pad+0.005
epad+0.005
 former+0.004
 lair+0.004
Neg logit contributions
ug-0.006
ww-0.006
aw-0.005
oo-0.005
opter-0.005
Token!"
Token ID2474
Feature activation+1.10e-02

Pos logit contributions
 seeking+0.034
 pad+0.028
 wonders+0.027
epad+0.026
 pretending+0.026
Neg logit contributions
ug-0.039
aw-0.035
oo-0.034
ww-0.034
ister-0.032
Token
Token ID198
Feature activation-9.63e-03

Pos logit contributions
 seeking+0.019
 pad+0.015
 wonders+0.014
epad+0.014
 hiding+0.014
Neg logit contributions
ug-0.023
aw-0.021
oo-0.021
ww-0.020
apy-0.020
Token
Token ID198
Feature activation+1.35e-03

Pos logit contributions
 seeking+0.030
 pad+0.024
 wonders+0.023
 supposed+0.023
epad+0.023
Neg logit contributions
ug-0.033
aw-0.030
oo-0.030
ww-0.029
apy-0.028
Token
Token ID198
Feature activation-3.67e-03

Pos logit contributions
 seeking+0.002
 pad+0.002
 wonders+0.001
 anyway+0.001
 hiding+0.001
Neg logit contributions
ug-0.003
aw-0.003
ww-0.003
oo-0.003
apy-0.003
Token"
Token ID1
Feature activation+2.09e-02

Pos logit contributions
ug+0.006
aw+0.005
oo+0.005
ww+0.005
apy+0.005
Neg logit contributions
 seeking-0.008
 pad-0.007
 wonders-0.006
epad-0.006
 hiding-0.006
TokenIt
Token ID1026
Feature activation+3.08e-02

Pos logit contributions
 seeking+0.042
 pad+0.034
 wonders+0.032
 pretending+0.032
 hiding+0.032
Neg logit contributions
ug-0.038
aw-0.035
ww-0.034
oo-0.033
apy-0.032
Token's
Token ID338
Feature activation+3.35e-02

Pos logit contributions
 seeking+0.055
 pad+0.044
epad+0.042
 lair+0.041
 forgetting+0.040
Neg logit contributions
ug-0.060
aw-0.056
ww-0.056
oo-0.055
opter-0.053
Token just
Token ID655
Feature activation+8.00e-02

Pos logit contributions
 seeking+0.055
 pad+0.044
 wonders+0.041
epad+0.040
 pretending+0.039
Neg logit contributions
ug-0.070
oo-0.065
aw-0.065
ww-0.064
apy-0.060
Token five
Token ID1936
Feature activation+3.28e-02

Pos logit contributions
 seeking+0.138
 pad+0.114
 lair+0.106
 wonders+0.103
epad+0.102
Neg logit contributions
ug-0.157
aw-0.147
oo-0.142
ww-0.142
apy-0.133
Token dollars
Token ID5054
Feature activation-1.52e-02

Pos logit contributions
 seeking+0.068
 pad+0.057
 wonders+0.054
epad+0.053
 pretending+0.052
Neg logit contributions
ug-0.057
aw-0.052
ww-0.050
oo-0.049
apy-0.046
Token!"
Token ID2474
Feature activation+9.69e-03

Pos logit contributions
ug+0.026
aw+0.024
oo+0.023
ww+0.023
apy+0.022
Neg logit contributions
 seeking-0.031
 pad-0.026
 wonders-0.025
 pretending-0.025
 hiding-0.024
Token the
Token ID262
Feature activation-1.64e-02

Pos logit contributions
 seeking+0.017
 pad+0.014
 wonders+0.013
 pretending+0.013
 hiding+0.013
Neg logit contributions
ug-0.020
aw-0.018
ww-0.018
oo-0.017
opter-0.017
Token shop
Token ID6128
Feature activation-5.76e-03

Pos logit contributions
ww+0.025
aw+0.025
ug+0.024
opter+0.023
ister+0.023
Neg logit contributions
 seeking-0.032
 fault-0.029
 pad-0.028
 supposed-0.026
asured-0.025
Token"
Token ID1
Feature activation+3.84e-02

Pos logit contributions
 seeking+0.006
 pad+0.005
 wonders+0.005
epad+0.004
 supposed+0.004
Neg logit contributions
ug-0.006
aw-0.006
oo-0.005
apy-0.005
ww-0.005
TokenBen
Token ID11696
Feature activation-4.27e-02

Pos logit contributions
 seeking+0.075
 pad+0.063
 wonders+0.060
 pretending+0.058
 hiding+0.058
Neg logit contributions
ug-0.071
aw-0.063
oo-0.061
ww-0.060
opter-0.058
Token,
Token ID11
Feature activation+1.34e-02

Pos logit contributions
ug+0.075
aw+0.065
ww+0.063
oo+0.063
apy+0.062
Neg logit contributions
 seeking-0.086
 pad-0.075
 wonders-0.071
 hiding-0.069
 pretending-0.068
Token it
Token ID340
Feature activation+2.44e-03

Pos logit contributions
 seeking+0.023
 pad+0.019
 wonders+0.018
 pretending+0.017
epad+0.017
Neg logit contributions
ug-0.028
aw-0.025
oo-0.025
ww-0.025
apy-0.024
Token was
Token ID373
Feature activation+3.04e-02

Pos logit contributions
 seeking+0.004
 pad+0.004
epad+0.003
 wonders+0.003
 pretending+0.003
Neg logit contributions
ug-0.005
aw-0.005
ww-0.004
oo-0.004
apy-0.004
Token just
Token ID655
Feature activation+7.99e-02

Pos logit contributions
 seeking+0.051
 pad+0.041
 wonders+0.039
epad+0.037
 pretending+0.037
Neg logit contributions
ug-0.064
aw-0.059
oo-0.058
ww-0.057
apy-0.055
Token a
Token ID257
Feature activation+1.45e-02

Pos logit contributions
 seeking+0.142
 pad+0.119
epad+0.109
 wonders+0.108
 anyway+0.105
Neg logit contributions
ug-0.154
aw-0.144
ww-0.138
oo-0.138
apy-0.130
Token spider
Token ID19230
Feature activation+1.37e-03

Pos logit contributions
 seeking+0.028
 pad+0.022
 wonders+0.022
epad+0.021
 forgetting+0.021
Neg logit contributions
ug-0.025
ww-0.025
aw-0.023
oo-0.022
apy-0.022
Token.
Token ID13
Feature activation+1.93e-02

Pos logit contributions
 seeking+0.002
 wonders+0.002
 pad+0.002
epad+0.002
 forgetting+0.002
Neg logit contributions
ug-0.003
aw-0.002
ww-0.002
oo-0.002
ister-0.002
Token It
Token ID632
Feature activation+1.89e-02

Pos logit contributions
 seeking+0.036
 pad+0.030
 wonders+0.029
 pretending+0.028
 hiding+0.027
Neg logit contributions
ug-0.036
aw-0.033
ww-0.032
oo-0.032
opter-0.031
Token was
Token ID373
Feature activation+1.63e-02

Pos logit contributions
 seeking+0.036
epad+0.032
 pad+0.031
 former+0.030
 fault+0.029
Neg logit contributions
ug-0.032
aw-0.030
ww-0.030
apy-0.029
oo-0.029
Token
Token ID198
Feature activation+4.10e-04

Pos logit contributions
 seeking+0.013
 pad+0.010
epad+0.010
 supposed+0.010
 wonders+0.010
Neg logit contributions
ug-0.014
aw-0.013
apy-0.013
oo-0.013
ww-0.012
Token
Token ID198
Feature activation-4.16e-03

Pos logit contributions
 seeking+0.001
 pad+0.000
 wonders+0.000
 hiding+0.000
 pretending+0.000
Neg logit contributions
ug-0.001
aw-0.001
ww-0.001
oo-0.001
opter-0.001
Token"
Token ID1
Feature activation+1.56e-02

Pos logit contributions
ug+0.007
aw+0.006
oo+0.006
ww+0.006
opter+0.006
Neg logit contributions
 seeking-0.008
 pad-0.007
 wonders-0.007
 hiding-0.007
 anyway-0.007
TokenIt
Token ID1026
Feature activation+2.83e-02

Pos logit contributions
 seeking+0.033
 pad+0.028
 wonders+0.027
 pretending+0.026
 hiding+0.026
Neg logit contributions
ug-0.027
aw-0.024
oo-0.023
ww-0.023
apy-0.022
Token's
Token ID338
Feature activation+2.96e-02

Pos logit contributions
 seeking+0.047
 pad+0.038
epad+0.036
 pretending+0.033
 lair+0.033
Neg logit contributions
ug-0.060
aw-0.055
ww-0.055
oo-0.054
opter-0.052
Token just
Token ID655
Feature activation+7.98e-02

Pos logit contributions
 seeking+0.049
 pad+0.039
epad+0.037
 wonders+0.036
 lair+0.035
Neg logit contributions
ug-0.062
oo-0.059
aw-0.057
ww-0.057
apy-0.052
Token a
Token ID257
Feature activation+1.97e-02

Pos logit contributions
 seeking+0.139
 pad+0.115
epad+0.108
 lair+0.105
 wonders+0.104
Neg logit contributions
ug-0.156
aw-0.146
oo-0.143
ww-0.141
apy-0.132
Token skirt
Token ID23967
Feature activation+5.16e-03

Pos logit contributions
 seeking+0.037
asured+0.029
 lair+0.029
 pretending+0.029
epad+0.029
Neg logit contributions
ww-0.037
ug-0.032
ister-0.030
opter-0.029
oo-0.029
Token,
Token ID11
Feature activation-6.20e-03

Pos logit contributions
 seeking+0.012
 pad+0.010
 wonders+0.010
epad+0.010
 pretending+0.010
Neg logit contributions
ug-0.007
aw-0.006
oo-0.006
ww-0.006
ister-0.006
Token Molly
Token ID30236
Feature activation-9.36e-03

Pos logit contributions
ug+0.012
aw+0.011
oo+0.011
ww+0.011
apy+0.011
Neg logit contributions
 seeking-0.011
 pad-0.009
epad-0.008
 wonders-0.008
 supposed-0.008
Token,"
Token ID553
Feature activation+1.53e-02

Pos logit contributions
ug+0.014
aw+0.012
oo+0.011
ww+0.011
opter+0.011
Neg logit contributions
 seeking-0.022
 pad-0.019
 wonders-0.018
epad-0.018
 pretending-0.018
Token I
Token ID314
Feature activation+2.86e-02

Pos logit contributions
 seeking+0.004
 pad+0.003
 pretending+0.003
 wonders+0.003
 hiding+0.003
Neg logit contributions
ug-0.006
aw-0.006
ww-0.005
oo-0.005
opter-0.005
Token have
Token ID423
Feature activation+1.60e-02

Pos logit contributions
 seeking+0.042
 pad+0.033
 wonders+0.030
 pretending+0.029
epad+0.029
Neg logit contributions
ug-0.066
oo-0.061
aw-0.061
ww-0.060
apy-0.058
Token the
Token ID262
Feature activation+1.85e-02

Pos logit contributions
 seeking+0.027
 pad+0.022
 wonders+0.021
 pretending+0.020
 hiding+0.020
Neg logit contributions
ug-0.034
aw-0.030
oo-0.030
ww-0.029
ister-0.029
Token z
Token ID1976
Feature activation+9.66e-03

Pos logit contributions
 seeking+0.039
epad+0.031
 pad+0.030
 lair+0.030
 forgetting+0.030
Neg logit contributions
ug-0.030
aw-0.028
ww-0.028
oo-0.027
ister-0.026
Tokenig
Token ID328
Feature activation+4.17e-03

Pos logit contributions
 seeking+0.023
 pad+0.020
 pretending+0.019
 hiding+0.019
 anyway+0.019
Neg logit contributions
ug-0.013
ww-0.011
aw-0.011
apy-0.011
opter-0.010
Tokenzag
Token ID50183
Feature activation+7.96e-02

Pos logit contributions
 seeking+0.007
 wonders+0.006
 pad+0.006
epad+0.006
 forgetting+0.006
Neg logit contributions
ug-0.008
oo-0.008
ister-0.007
ww-0.007
aw-0.007
Token?"
Token ID1701
Feature activation+1.51e-02

Pos logit contributions
 seeking+0.150
 pad+0.124
 wonders+0.122
epad+0.121
 forgetting+0.119
Neg logit contributions
ug-0.148
oo-0.133
ww-0.132
aw-0.132
ister-0.127
Token
Token ID198
Feature activation-3.15e-03

Pos logit contributions
 seeking+0.019
 pad+0.016
 anyway+0.016
 wonders+0.016
 own+0.014
Neg logit contributions
ug-0.036
aw-0.034
opter-0.034
ww-0.032
ister-0.032
Token
Token ID198
Feature activation-7.83e-03

Pos logit contributions
aw+0.008
ug+0.008
opter+0.008
ww+0.008
olk+0.008
Neg logit contributions
 seeking-0.002
 anyway-0.002
 wonders-0.002
 own-0.002
 pad-0.002
TokenM
Token ID44
Feature activation-6.46e-03

Pos logit contributions
ug+0.015
aw+0.015
opter+0.014
olk+0.013
ister+0.012
Neg logit contributions
 seeking-0.012
 anyway-0.011
 pad-0.011
 wonders-0.010
 own-0.010
Tokenama
Token ID1689
Feature activation-1.03e-02

Pos logit contributions
ug+0.007
aw+0.006
opter+0.006
apy+0.006
ww+0.005
Neg logit contributions
 seeking-0.015
 pad-0.014
 anyway-0.014
 wonders-0.014
 least-0.013
Token
Token ID198
Feature activation-3.94e-03

Pos logit contributions
 seeking+0.002
 wonders+0.001
 pad+0.001
 anyway+0.001
 find+0.001
Neg logit contributions
ug-0.004
aw-0.004
ww-0.004
oo-0.003
opter-0.003
Token"
Token ID1
Feature activation+1.80e-02

Pos logit contributions
ug+0.007
aw+0.007
ww+0.006
oo+0.006
apy+0.006
Neg logit contributions
 seeking-0.008
 pad-0.006
 wonders-0.006
 anyway-0.006
 find-0.006
TokenIt
Token ID1026
Feature activation+3.28e-02

Pos logit contributions
 seeking+0.030
 pad+0.024
 wonders+0.024
 pretending+0.022
 anyway+0.022
Neg logit contributions
ug-0.039
aw-0.035
oo-0.035
ww-0.033
opter-0.033
Token's
Token ID338
Feature activation+3.64e-02

Pos logit contributions
 seeking+0.052
 pad+0.041
epad+0.038
 wonders+0.037
 pretending+0.036
Neg logit contributions
ug-0.073
aw-0.066
ww-0.066
oo-0.066
opter-0.063
Token probably
Token ID2192
Feature activation+7.38e-02

Pos logit contributions
 seeking+0.058
 pad+0.046
 wonders+0.043
epad+0.042
 pretending+0.040
Neg logit contributions
ug-0.080
oo-0.073
aw-0.072
ww-0.072
apy-0.068
Token just
Token ID655
Feature activation+7.95e-02

Pos logit contributions
 seeking+0.122
 pad+0.098
epad+0.094
 wonders+0.093
 lair+0.087
Neg logit contributions
ug-0.152
oo-0.141
aw-0.139
ww-0.139
apy-0.132
Token your
Token ID534
Feature activation+2.40e-02

Pos logit contributions
 seeking+0.136
 pad+0.113
epad+0.105
 wonders+0.103
 lair+0.102
Neg logit contributions
ug-0.155
aw-0.143
ww-0.142
oo-0.142
apy-0.132
Token water
Token ID1660
Feature activation-4.72e-03

Pos logit contributions
 seeking+0.042
Acc+0.039
years+0.034
isk+0.033
 former+0.033
Neg logit contributions
ug-0.035
ww-0.034
oo-0.032
ister-0.032
 tum-0.029
Token bottle
Token ID9294
Feature activation+8.65e-03

Pos logit contributions
ug+0.008
aw+0.007
ww+0.007
oo+0.007
ister+0.007
Neg logit contributions
 seeking-0.010
 pad-0.008
 wonders-0.008
 pretending-0.008
epad-0.008
Token and
Token ID290
Feature activation+7.53e-03

Pos logit contributions
 seeking+0.019
 pad+0.016
 wonders+0.015
 hiding+0.015
 pretending+0.015
Neg logit contributions
ug-0.014
aw-0.012
oo-0.012
ww-0.012
apy-0.011
Token snacks
Token ID28487
Feature activation-2.71e-02

Pos logit contributions
 seeking+0.015
 pad+0.013
epad+0.012
 wonders+0.012
 lair+0.012
Neg logit contributions
ug-0.012
ww-0.011
apy-0.011
aw-0.011
oo-0.011
Token the
Token ID262
Feature activation-8.95e-03

Pos logit contributions
ug+0.026
aw+0.025
oo+0.024
ww+0.024
 cho+0.023
Neg logit contributions
 seeking-0.025
 pad-0.025
 former-0.022
 lair-0.021
 anyways-0.021
Token funn
Token ID36090
Feature activation-7.68e-03

Pos logit contributions
ug+0.015
ww+0.014
aw+0.014
ister+0.014
oo+0.013
Neg logit contributions
 seeking-0.018
 pad-0.014
 wonders-0.014
 forgetting-0.014
 intended-0.013
Tokeniest
Token ID6386
Feature activation-2.80e-02

Pos logit contributions
ug+0.008
aw+0.007
apy+0.006
oo+0.006
ww+0.006
Neg logit contributions
 seeking-0.019
 pad-0.017
 pretending-0.017
 supposed-0.017
epad-0.017
Token potato
Token ID21219
Feature activation-4.38e-02

Pos logit contributions
ug+0.049
ww+0.048
oo+0.046
aw+0.046
ister+0.045
Neg logit contributions
 seeking-0.054
 pad-0.044
 wonders-0.041
 intended-0.040
 forgetting-0.039
Token h
Token ID289
Feature activation-6.49e-03

Pos logit contributions
ug+0.078
aw+0.075
ww+0.074
oo+0.070
ister+0.068
Neg logit contributions
 seeking-0.080
 former-0.069
 pad-0.068
 wonders-0.067
epad-0.063
Tokenider
Token ID1304
Feature activation+7.94e-02

Pos logit contributions
ug+0.013
aw+0.012
oo+0.012
ww+0.012
apy+0.011
Neg logit contributions
 seeking-0.011
 pad-0.009
 wonders-0.009
 pretending-0.008
 hiding-0.008
Token ever
Token ID1683
Feature activation+1.44e-02

Pos logit contributions
 seeking+0.140
 pad+0.110
 wonders+0.106
epad+0.106
 forgetting+0.105
Neg logit contributions
ug-0.158
aw-0.150
oo-0.141
ww-0.137
ister-0.131
Token.
Token ID13
Feature activation-4.00e-04

Pos logit contributions
 seeking+0.028
 pad+0.022
 wonders+0.021
 lair+0.021
epad+0.021
Neg logit contributions
ug-0.025
aw-0.025
ww-0.024
oo-0.022
opter-0.021
Token You
Token ID921
Feature activation+1.42e-02

Pos logit contributions
ug+0.001
aw+0.001
oo+0.001
ww+0.001
apy+0.001
Neg logit contributions
 seeking-0.001
 pad-0.001
 wonders-0.001
 hiding-0.001
epad-0.001
Token are
Token ID389
Feature activation-1.04e-02

Pos logit contributions
 seeking+0.027
 pad+0.023
 lair+0.021
 pretending+0.021
epad+0.021
Neg logit contributions
ug-0.026
aw-0.024
opter-0.023
oo-0.023
ww-0.023
Token the
Token ID262
Feature activation-9.11e-03

Pos logit contributions
ug+0.017
oo+0.017
ww+0.016
aw+0.016
 cho+0.016
Neg logit contributions
 pad-0.018
 seeking-0.017
 former-0.016
 lair-0.016
 anyways-0.016
Token
Token ID198
Feature activation+4.82e-03

Pos logit contributions
 seeking+0.036
 pad+0.029
 wonders+0.028
 pretending+0.026
epad+0.026
Neg logit contributions
ug-0.047
aw-0.043
oo-0.042
ww-0.041
apy-0.040
Token
Token ID198
Feature activation-1.83e-03

Pos logit contributions
 seeking+0.007
 pad+0.005
 wonders+0.005
 anyway+0.005
 hiding+0.005
Neg logit contributions
ug-0.011
ww-0.010
aw-0.010
oo-0.010
opter-0.010
Token"
Token ID1
Feature activation+1.92e-02

Pos logit contributions
ug+0.003
oo+0.003
aw+0.003
ww+0.003
apy+0.002
Neg logit contributions
 seeking-0.004
 pad-0.003
 wonders-0.003
 hiding-0.003
 anyway-0.003
TokenIt
Token ID1026
Feature activation+2.99e-02

Pos logit contributions
 seeking+0.038
 pad+0.032
 wonders+0.031
 pretending+0.030
 hiding+0.029
Neg logit contributions
ug-0.035
aw-0.031
oo-0.031
ww-0.030
apy-0.029
Token's
Token ID338
Feature activation+3.45e-02

Pos logit contributions
 seeking+0.056
epad+0.046
 pad+0.046
 forgetting+0.042
 former+0.041
Neg logit contributions
ug-0.057
aw-0.053
ww-0.052
oo-0.051
opter-0.049
Token just
Token ID655
Feature activation+7.92e-02

Pos logit contributions
 seeking+0.055
 pad+0.043
 wonders+0.040
epad+0.040
 pretending+0.038
Neg logit contributions
ug-0.076
oo-0.070
aw-0.069
ww-0.069
apy-0.064
Token over
Token ID625
Feature activation+1.60e-02

Pos logit contributions
 seeking+0.132
 pad+0.109
epad+0.100
 wonders+0.099
 pretending+0.096
Neg logit contributions
ug-0.164
aw-0.153
ww-0.148
oo-0.147
apy-0.139
Token there
Token ID612
Feature activation+4.53e-02

Pos logit contributions
 seeking+0.031
 pad+0.026
epad+0.024
 wonders+0.024
 pretending+0.023
Neg logit contributions
ug-0.030
aw-0.027
oo-0.027
ww-0.026
ister-0.025
Token.
Token ID13
Feature activation+2.60e-02

Pos logit contributions
 seeking+0.109
 pad+0.091
epad+0.089
 wonders+0.089
 pretending+0.088
Neg logit contributions
ug-0.061
aw-0.055
oo-0.053
ww-0.051
ister-0.050
Token It
Token ID632
Feature activation+1.44e-02

Pos logit contributions
 seeking+0.044
 pad+0.035
 wonders+0.034
 pretending+0.032
 hiding+0.031
Neg logit contributions
ug-0.054
aw-0.049
ww-0.049
oo-0.048
opter-0.047
Token won
Token ID1839
Feature activation+3.00e-02

Pos logit contributions
 seeking+0.025
epad+0.021
 pad+0.021
 forgetting+0.018
 pretending+0.018
Neg logit contributions
ug-0.029
aw-0.027
ww-0.026
oo-0.025
opter-0.024
Token who
Token ID508
Feature activation+4.50e-02

Pos logit contributions
ug+0.008
aw+0.007
oo+0.007
ww+0.007
ister+0.007
Neg logit contributions
 seeking-0.006
 pad-0.005
 wonders-0.005
 pretending-0.004
epad-0.004
Token own
Token ID898
Feature activation+3.27e-02

Pos logit contributions
 seeking+0.068
 pad+0.055
epad+0.048
 pretending+0.048
 wonders+0.048
Neg logit contributions
ug-0.102
aw-0.094
oo-0.092
ww-0.090
ister-0.088
Token it
Token ID340
Feature activation+3.17e-03

Pos logit contributions
 seeking+0.063
 pad+0.053
 wonders+0.049
epad+0.048
 pretending+0.047
Neg logit contributions
ug-0.061
aw-0.055
oo-0.054
ww-0.053
apy-0.051
Token don
Token ID836
Feature activation+2.49e-02

Pos logit contributions
 seeking+0.006
epad+0.005
 pad+0.005
 lair+0.005
 former+0.005
Neg logit contributions
ug-0.005
aw-0.005
oo-0.005
ww-0.005
opter-0.005
Token't
Token ID470
Feature activation+4.27e-02

Pos logit contributions
 seeking+0.032
 supposed+0.023
 wonders+0.023
 anyway+0.022
 pad+0.022
Neg logit contributions
ug-0.060
aw-0.058
ww-0.055
apy-0.053
oo-0.052
Token actually
Token ID1682
Feature activation+7.90e-02

Pos logit contributions
 seeking+0.060
 pad+0.048
 wonders+0.043
 pretending+0.042
 lair+0.040
Neg logit contributions
ug-0.101
oo-0.092
aw-0.091
ww-0.091
ister-0.087
Token live
Token ID2107
Feature activation+4.48e-02

Pos logit contributions
 seeking+0.136
 pad+0.119
 lair+0.103
 wonders+0.101
 pretending+0.100
Neg logit contributions
ug-0.151
oo-0.140
ww-0.137
aw-0.135
apy-0.131
Token there
Token ID612
Feature activation+3.38e-02

Pos logit contributions
 seeking+0.085
 pad+0.072
 forgetting+0.068
epad+0.067
 wonders+0.067
Neg logit contributions
ug-0.084
aw-0.075
oo-0.072
ww-0.068
ister-0.067
Token.
Token ID13
Feature activation+1.41e-02

Pos logit contributions
 seeking+0.063
 wonders+0.052
 pad+0.051
epad+0.049
 forgetting+0.048
Neg logit contributions
ug-0.060
aw-0.054
oo-0.053
ww-0.051
ister-0.050
Token They
Token ID1119
Feature activation-6.17e-03

Pos logit contributions
 seeking+0.029
 pad+0.025
 wonders+0.023
 hiding+0.023
epad+0.023
Neg logit contributions
ug-0.025
aw-0.022
oo-0.021
ww-0.021
apy-0.020
Token hate
Token ID5465
Feature activation+2.23e-02

Pos logit contributions
ug+0.012
aw+0.011
oo+0.011
opter+0.011
ww+0.011
Neg logit contributions
 seeking-0.010
 pad-0.009
 lair-0.008
epad-0.008
 pretending-0.008
Token said
Token ID531
Feature activation+5.93e-03

Pos logit contributions
ug+0.047
aw+0.043
oo+0.042
ww+0.042
opter+0.040
Neg logit contributions
 seeking-0.044
 pad-0.035
 wonders-0.033
epad-0.033
 pretending-0.033
Token,
Token ID11
Feature activation+1.54e-02

Pos logit contributions
 seeking+0.009
 hiding+0.008
 anyway+0.008
 pretending+0.007
 pad+0.007
Neg logit contributions
ug-0.013
aw-0.011
olk-0.011
ww-0.011
oo-0.011
Token "
Token ID366
Feature activation+1.45e-02

Pos logit contributions
 seeking+0.024
 pad+0.019
 wonders+0.018
 pretending+0.017
epad+0.017
Neg logit contributions
ug-0.034
aw-0.031
oo-0.031
ww-0.030
apy-0.029
TokenIt
Token ID1026
Feature activation+3.11e-02

Pos logit contributions
 seeking+0.021
 pad+0.018
 wonders+0.018
 anyway+0.016
 forgetting+0.015
Neg logit contributions
ug-0.033
oo-0.030
aw-0.029
apy-0.029
ww-0.029
Token's
Token ID338
Feature activation+3.10e-02

Pos logit contributions
 seeking+0.056
 pad+0.046
epad+0.045
 former+0.042
 forgetting+0.041
Neg logit contributions
ug-0.061
ww-0.055
aw-0.055
oo-0.054
opter-0.052
Token just
Token ID655
Feature activation+7.89e-02

Pos logit contributions
 seeking+0.053
 pad+0.043
epad+0.041
 wonders+0.040
 lair+0.038
Neg logit contributions
ug-0.063
oo-0.058
ww-0.056
aw-0.056
 cho-0.052
Token a
Token ID257
Feature activation+2.28e-02

Pos logit contributions
 seeking+0.138
 pad+0.116
epad+0.109
 lair+0.105
 wonders+0.104
Neg logit contributions
ug-0.150
aw-0.141
ww-0.134
oo-0.134
 cho-0.124
Token harmless
Token ID23585
Feature activation+1.68e-02

Pos logit contributions
 seeking+0.042
 pad+0.033
 wonders+0.032
epad+0.031
 pretending+0.031
Neg logit contributions
ug-0.043
ww-0.041
aw-0.039
oo-0.039
opter-0.037
Token hook
Token ID8011
Feature activation-2.58e-02

Pos logit contributions
 seeking+0.035
 pad+0.030
 wonders+0.028
 hiding+0.028
 pretending+0.027
Neg logit contributions
ug-0.029
aw-0.026
oo-0.025
ww-0.025
apy-0.024
Token,
Token ID11
Feature activation-7.17e-03

Pos logit contributions
ug+0.048
aw+0.043
oo+0.043
ww+0.042
ister+0.040
Neg logit contributions
 seeking-0.047
 wonders-0.040
 former-0.040
epad-0.038
 pad-0.037
Token don
Token ID836
Feature activation-3.18e-03

Pos logit contributions
ug+0.015
aw+0.014
oo+0.014
ww+0.014
apy+0.013
Neg logit contributions
 seeking-0.012
 pad-0.009
 wonders-0.009
 pretending-0.009
 hiding-0.008
Token said
Token ID531
Feature activation-5.94e-03

Pos logit contributions
ug+0.017
aw+0.016
oo+0.015
ww+0.015
apy+0.015
Neg logit contributions
 seeking-0.016
 pad-0.013
 wonders-0.013
epad-0.012
 pretending-0.012
Token.
Token ID13
Feature activation+2.57e-02

Pos logit contributions
ug+0.015
ww+0.014
aw+0.014
oo+0.013
opter+0.013
Neg logit contributions
 seeking-0.007
 pad-0.005
 anyway-0.005
 hiding-0.005
 pretending-0.005
Token "
Token ID366
Feature activation+2.87e-02

Pos logit contributions
 seeking+0.043
 pad+0.034
 wonders+0.032
epad+0.031
 supposed+0.031
Neg logit contributions
ug-0.054
aw-0.050
oo-0.048
ww-0.047
apy-0.047
TokenThis
Token ID1212
Feature activation+1.24e-02

Pos logit contributions
 seeking+0.058
 pad+0.047
 hiding+0.046
 pretending+0.045
 wonders+0.045
Neg logit contributions
ug-0.051
aw-0.047
oo-0.045
ww-0.044
apy-0.043
Token station
Token ID4429
Feature activation+9.06e-03

Pos logit contributions
 seeking+0.022
 pad+0.018
 wonders+0.018
 pretending+0.017
 hiding+0.017
Neg logit contributions
ug-0.025
aw-0.022
oo-0.022
ww-0.022
apy-0.021
Token seems
Token ID2331
Feature activation+7.88e-02

Pos logit contributions
 seeking+0.016
 pad+0.013
 wonders+0.012
epad+0.012
 pretending+0.012
Neg logit contributions
ug-0.018
aw-0.017
oo-0.016
ww-0.016
ister-0.016
Token really
Token ID1107
Feature activation+3.27e-02

Pos logit contributions
 seeking+0.133
 pad+0.111
epad+0.099
 wonders+0.096
 lair+0.095
Neg logit contributions
ug-0.161
aw-0.145
ww-0.142
 cho-0.139
oo-0.137
Token old
Token ID1468
Feature activation+1.25e-02

Pos logit contributions
 seeking+0.052
 pad+0.042
 wonders+0.039
 pretending+0.037
epad+0.037
Neg logit contributions
ug-0.072
aw-0.066
ww-0.064
oo-0.064
apy-0.062
Token.
Token ID13
Feature activation+2.47e-02

Pos logit contributions
 seeking+0.024
 pad+0.019
epad+0.019
 former+0.019
 forgetting+0.018
Neg logit contributions
ug-0.022
aw-0.021
ww-0.018
oo-0.018
ister-0.018
Token What
Token ID1867
Feature activation-6.56e-03

Pos logit contributions
 seeking+0.046
 pad+0.038
 wonders+0.036
 hiding+0.036
epad+0.035
Neg logit contributions
ug-0.048
aw-0.043
oo-0.042
ww-0.041
apy-0.040
Token do
Token ID466
Feature activation-4.71e-02

Pos logit contributions
ug+0.014
aw+0.013
oo+0.012
ww+0.012
apy+0.012
Neg logit contributions
 seeking-0.011
 pad-0.009
 wonders-0.008
 pretending-0.008
epad-0.008
Token said
Token ID531
Feature activation+8.41e-03

Pos logit contributions
ug+0.000
aw+0.000
ww+0.000
oo+0.000
apy+0.000
Neg logit contributions
 seeking-0.000
 pad-0.000
 wonders-0.000
 hiding-0.000
 pretending-0.000
Token,
Token ID11
Feature activation+1.95e-02

Pos logit contributions
 seeking+0.012
 hiding+0.011
 pretending+0.010
 anyway+0.010
 first+0.010
Neg logit contributions
ug-0.019
aw-0.018
ww-0.017
opter-0.016
dd-0.016
Token "
Token ID366
Feature activation+2.20e-02

Pos logit contributions
 seeking+0.027
 pad+0.020
 pretending+0.019
 wonders+0.018
 hiding+0.018
Neg logit contributions
ug-0.048
aw-0.045
ww-0.044
oo-0.043
opter-0.042
TokenIt
Token ID1026
Feature activation+2.94e-02

Pos logit contributions
 seeking+0.044
 pad+0.036
 wonders+0.035
 pretending+0.034
epad+0.034
Neg logit contributions
ug-0.040
aw-0.036
oo-0.035
ww-0.035
apy-0.034
Token's
Token ID338
Feature activation+3.17e-02

Pos logit contributions
 seeking+0.051
 pad+0.043
epad+0.042
 former+0.038
 forgetting+0.038
Neg logit contributions
ug-0.058
aw-0.053
ww-0.052
oo-0.052
opter-0.050
Token just
Token ID655
Feature activation+7.87e-02

Pos logit contributions
 seeking+0.052
 pad+0.042
epad+0.039
 wonders+0.038
 pretending+0.037
Neg logit contributions
ug-0.068
oo-0.062
aw-0.061
ww-0.060
apy-0.057
Token an
Token ID281
Feature activation+2.04e-02

Pos logit contributions
 seeking+0.137
 pad+0.120
epad+0.112
 lair+0.105
 wonders+0.103
Neg logit contributions
ug-0.148
aw-0.136
oo-0.131
ww-0.130
apy-0.123
Token empty
Token ID6565
Feature activation-3.56e-03

Pos logit contributions
 seeking+0.033
 pad+0.027
 wonders+0.027
 hiding+0.025
 anyway+0.024
Neg logit contributions
ug-0.043
aw-0.041
apy-0.040
ww-0.040
oo-0.039
Token hole
Token ID7604
Feature activation-2.24e-02

Pos logit contributions
ug+0.007
aw+0.007
oo+0.007
ww+0.007
apy+0.006
Neg logit contributions
 seeking-0.006
 pad-0.005
 wonders-0.005
 pretending-0.005
 hiding-0.005
Token.
Token ID13
Feature activation+2.02e-02

Pos logit contributions
ug+0.043
aw+0.037
oo+0.037
ww+0.036
ister+0.035
Neg logit contributions
 seeking-0.041
 pad-0.035
 wonders-0.034
epad-0.033
 former-0.032
Token Maybe
Token ID6674
Feature activation+4.40e-03

Pos logit contributions
 seeking+0.039
 pad+0.032
 wonders+0.030
 pretending+0.029
epad+0.029
Neg logit contributions
ug-0.038
aw-0.034
oo-0.033
ww-0.033
apy-0.032
Token to
Token ID284
Feature activation+1.15e-02

Pos logit contributions
 seeking+0.031
 pad+0.025
 wonders+0.024
 pretending+0.023
 forgetting+0.023
Neg logit contributions
ug-0.041
aw-0.037
oo-0.036
ww-0.035
ister-0.034
Token be
Token ID307
Feature activation-3.97e-03

Pos logit contributions
 seeking+0.016
 pad+0.013
 wonders+0.012
 pretending+0.011
epad+0.011
Neg logit contributions
ug-0.027
oo-0.025
aw-0.024
ww-0.024
ister-0.024
Token restored
Token ID15032
Feature activation+1.41e-02

Pos logit contributions
ug+0.008
aw+0.008
oo+0.007
ww+0.007
apy+0.007
Neg logit contributions
 seeking-0.007
 pad-0.005
 wonders-0.005
 forgetting-0.005
epad-0.005
Token to
Token ID284
Feature activation-2.91e-03

Pos logit contributions
 seeking+0.025
 pad+0.020
 forgetting+0.019
 wonders+0.019
 pretending+0.019
Neg logit contributions
ug-0.029
aw-0.025
oo-0.024
ww-0.024
ister-0.023
Token its
Token ID663
Feature activation-2.24e-03

Pos logit contributions
ug+0.006
oo+0.005
aw+0.005
opter+0.005
apy+0.005
Neg logit contributions
 seeking-0.005
 pad-0.005
epad-0.004
 wonders-0.004
 pretending-0.004
Token former
Token ID1966
Feature activation+7.86e-02

Pos logit contributions
ug+0.004
aw+0.004
 hurricane+0.004
ww+0.004
um+0.004
Neg logit contributions
 seeking-0.004
 pad-0.003
epad-0.003
 pretending-0.003
 supposed-0.003
Token glory
Token ID13476
Feature activation+1.05e-02

Pos logit contributions
 seeking+0.141
 pad+0.114
 pretending+0.107
epad+0.106
 forgetting+0.105
Neg logit contributions
ug-0.155
aw-0.138
ww-0.132
oo-0.132
opter-0.128
Token."
Token ID526
Feature activation+8.66e-03

Pos logit contributions
 seeking+0.018
 pad+0.014
 wonders+0.013
 forgetting+0.013
 pretending+0.013
Neg logit contributions
ug-0.022
aw-0.020
oo-0.019
ister-0.019
ww-0.018
Token
Token ID220
Feature activation+1.46e-02

Pos logit contributions
 seeking+0.016
 pad+0.013
epad+0.013
 supposed+0.012
 pretending+0.012
Neg logit contributions
ug-0.017
aw-0.015
oo-0.015
apy-0.015
ww-0.014
Token\n
Token ID198
Feature activation-1.81e-02

Pos logit contributions
 seeking+0.026
 pad+0.021
 wonders+0.020
epad+0.019
 pretending+0.019
Neg logit contributions
ug-0.030
aw-0.027
oo-0.027
ww-0.026
apy-0.025
Token\n
Token ID198
Feature activation-1.95e-02

Pos logit contributions
ug+0.038
aw+0.035
oo+0.034
ww+0.034
apy+0.033
Neg logit contributions
 seeking-0.031
 pad-0.025
 wonders-0.023
epad-0.023
 pretending-0.023
Token the
Token ID262
Feature activation-2.26e-03

Pos logit contributions
ug+0.010
aw+0.009
oo+0.008
apy+0.008
ww+0.008
Neg logit contributions
 seeking-0.016
 supposed-0.014
 pad-0.014
epad-0.014
 former-0.013
Token clay
Token ID21558
Feature activation+1.60e-02

Pos logit contributions
ug+0.004
aw+0.003
ww+0.003
apy+0.003
opter+0.003
Neg logit contributions
 seeking-0.005
 pad-0.004
epad-0.004
 wonders-0.004
 forgetting-0.004
Token until
Token ID1566
Feature activation-1.25e-02

Pos logit contributions
 seeking+0.031
 supposed+0.028
 former+0.028
epad+0.028
 suppose+0.028
Neg logit contributions
ug-0.027
aw-0.023
apy-0.022
ww-0.021
olk-0.020
Token it
Token ID340
Feature activation+3.73e-03

Pos logit contributions
ug+0.022
aw+0.020
oo+0.020
ww+0.019
ister+0.019
Neg logit contributions
 seeking-0.025
 pad-0.021
 wonders-0.020
 pretending-0.020
 hiding-0.020
Token looked
Token ID3114
Feature activation+2.90e-02

Pos logit contributions
 seeking+0.006
 pad+0.005
 wonders+0.005
 pretending+0.005
 hiding+0.005
Neg logit contributions
ug-0.008
aw-0.007
oo-0.007
ww-0.007
apy-0.007
Token just
Token ID655
Feature activation+7.80e-02

Pos logit contributions
 seeking+0.052
 pad+0.040
epad+0.038
 lair+0.037
 forgetting+0.036
Neg logit contributions
aw-0.057
ug-0.056
apy-0.049
ww-0.049
 cho-0.049
Token like
Token ID588
Feature activation+3.62e-02

Pos logit contributions
 seeking+0.121
 pad+0.095
epad+0.089
 lair+0.086
 wonders+0.084
Neg logit contributions
ug-0.163
aw-0.163
oo-0.146
ww-0.145
apy-0.141
Token the
Token ID262
Feature activation+3.11e-03

Pos logit contributions
 seeking+0.074
 pad+0.066
 wonders+0.061
 former+0.060
epad+0.060
Neg logit contributions
ug-0.056
aw-0.056
 cho-0.050
apy-0.050
ww-0.049
Token butterfly
Token ID35113
Feature activation+2.69e-02

Pos logit contributions
 seeking+0.007
asured+0.006
 pad+0.006
isk+0.005
 supposed+0.005
Neg logit contributions
ug-0.004
aw-0.004
olk-0.004
 gust-0.004
ww-0.004
Token outside
Token ID2354
Feature activation+2.81e-02

Pos logit contributions
 seeking+0.052
 Deal+0.046
 former+0.045
epad+0.044
 wonders+0.041
Neg logit contributions
aw-0.046
ug-0.045
ww-0.041
 cho-0.039
oo-0.039
Token.
Token ID13
Feature activation-9.69e-03

Pos logit contributions
 seeking+0.049
 pad+0.041
 wonders+0.037
epad+0.036
 former+0.036
Neg logit contributions
ug-0.056
aw-0.055
oo-0.050
apy-0.049
ww-0.048
Token and
Token ID290
Feature activation+1.45e-02

Pos logit contributions
ug+0.026
aw+0.024
oo+0.023
ww+0.023
ister+0.022
Neg logit contributions
 seeking-0.021
 pad-0.017
 wonders-0.017
epad-0.016
 pretending-0.015
Token Tom
Token ID4186
Feature activation-6.93e-03

Pos logit contributions
 seeking+0.025
 pad+0.020
 wonders+0.019
 pretending+0.019
 hiding+0.019
Neg logit contributions
ug-0.030
aw-0.027
oo-0.026
ww-0.026
opter-0.025
Token was
Token ID373
Feature activation-8.09e-03

Pos logit contributions
aw+0.011
ug+0.010
oo+0.009
dd+0.009
ww+0.009
Neg logit contributions
 seeking-0.014
 former-0.013
epad-0.013
 pad-0.012
 least-0.012
Token the
Token ID262
Feature activation+1.08e-02

Pos logit contributions
ug+0.013
aw+0.013
oo+0.012
apy+0.011
 cho+0.011
Neg logit contributions
 seeking-0.015
 pad-0.015
 former-0.014
epad-0.014
 wonders-0.013
Token h
Token ID289
Feature activation-5.16e-03

Pos logit contributions
asured+0.020
 seeking+0.018
epad+0.018
 chooses+0.018
 former+0.017
Neg logit contributions
ww-0.017
 cho-0.014
aw-0.014
 wind-0.014
 fog-0.014
Tokenider
Token ID1304
Feature activation+7.79e-02

Pos logit contributions
ug+0.013
aw+0.012
oo+0.012
ww+0.012
apy+0.011
Neg logit contributions
 seeking-0.007
 pad-0.005
 wonders-0.005
 pretending-0.004
epad-0.004
Token.
Token ID13
Feature activation+2.65e-03

Pos logit contributions
 seeking+0.143
 former+0.117
 pad+0.115
epad+0.114
 wonders+0.113
Neg logit contributions
ug-0.145
aw-0.139
oo-0.128
ww-0.119
ister-0.118
Token Sara
Token ID24799
Feature activation+1.22e-02

Pos logit contributions
 seeking+0.005
 pad+0.004
 wonders+0.004
epad+0.004
 pretending+0.004
Neg logit contributions
ug-0.005
aw-0.004
oo-0.004
ww-0.004
apy-0.004
Token counted
Token ID14789
Feature activation+4.46e-02

Pos logit contributions
 seeking+0.022
epad+0.021
 former+0.021
 fault+0.020
 Deal+0.019
Neg logit contributions
ug-0.021
aw-0.020
oo-0.018
apy-0.016
ink-0.016
Token to
Token ID284
Feature activation+1.36e-02

Pos logit contributions
 seeking+0.080
 former+0.067
 pad+0.066
 least+0.063
 wonders+0.063
Neg logit contributions
ug-0.084
aw-0.077
oo-0.075
apy-0.070
ister-0.068
Token ten
Token ID3478
Feature activation+1.40e-02

Pos logit contributions
 seeking+0.021
 pad+0.019
 supposed+0.018
 fault+0.018
asured+0.017
Neg logit contributions
ug-0.024
aw-0.024
oo-0.022
apy-0.019
 cho-0.019
Token replied
Token ID8712
Feature activation+1.02e-03

Pos logit contributions
ug+0.016
aw+0.016
ww+0.015
oo+0.015
ister+0.014
Neg logit contributions
 seeking-0.014
epad-0.012
 pad-0.011
 forgetting-0.011
 former-0.011
Token.
Token ID13
Feature activation+4.45e-03

Pos logit contributions
 seeking+0.002
 pad+0.002
 wonders+0.001
 pretending+0.001
 hiding+0.001
Neg logit contributions
ug-0.002
aw-0.002
ww-0.002
oo-0.002
opter-0.002
Token\n
Token ID198
Feature activation-4.91e-03

Pos logit contributions
 seeking+0.008
 pad+0.006
 wonders+0.006
 pretending+0.006
 hiding+0.006
Neg logit contributions
ug-0.009
aw-0.008
oo-0.008
ww-0.008
apy-0.008
Token\n
Token ID198
Feature activation-9.28e-03

Pos logit contributions
ug+0.012
aw+0.011
ww+0.010
oo+0.010
apy+0.010
Neg logit contributions
 seeking-0.007
 pad-0.005
 wonders-0.005
 anyway-0.005
 hiding-0.005
TokenFl
Token ID7414
Feature activation+6.77e-03

Pos logit contributions
ug+0.016
oo+0.014
aw+0.014
apy+0.013
opter+0.013
Neg logit contributions
 seeking-0.018
 pad-0.016
 wonders-0.015
 anyway-0.015
 forgetting-0.014
Tokenumm
Token ID13929
Feature activation-1.03e-01

Pos logit contributions
 seeking+0.014
 pad+0.013
 hiding+0.012
 wonders+0.012
 pretending+0.011
Neg logit contributions
ug-0.010
ww-0.009
aw-0.008
apy-0.008
dd-0.008
Tokenox
Token ID1140
Feature activation-1.64e-02

Pos logit contributions
ug+0.122
apy+0.094
aw+0.093
ww+0.084
opter+0.083
Neg logit contributions
 seeking-0.250
 pad-0.226
 anyway-0.220
 wonders-0.218
 hiding-0.213
Tokened
Token ID276
Feature activation-2.88e-02

Pos logit contributions
ug+0.033
aw+0.030
oo+0.029
ww+0.029
ister+0.028
Neg logit contributions
 seeking-0.030
 pad-0.024
 wonders-0.023
epad-0.023
 pretending-0.022
Token,
Token ID11
Feature activation-4.93e-03

Pos logit contributions
ug+0.058
aw+0.051
oo+0.051
ww+0.051
opter+0.050
Neg logit contributions
 seeking-0.050
 pad-0.042
 wonders-0.040
 hiding-0.039
 pretending-0.038
Token the
Token ID262
Feature activation-1.27e-02

Pos logit contributions
ug+0.010
aw+0.009
oo+0.009
ww+0.008
apy+0.008
Neg logit contributions
 seeking-0.009
 pad-0.007
 wonders-0.007
 pretending-0.007
 hiding-0.007
Token little
Token ID1310
Feature activation-1.26e-02

Pos logit contributions
ug+0.023
aw+0.022
oo+0.021
ww+0.020
ister+0.020
Neg logit contributions
 seeking-0.024
 pad-0.020
epad-0.018
 supposed-0.018
 anyway-0.018
Token added
Token ID2087
Feature activation+4.86e-03

Pos logit contributions
ug+0.024
aw+0.022
apy+0.021
opter+0.020
oo+0.020
Neg logit contributions
 seeking-0.023
 pad-0.019
 wonders-0.018
 former-0.017
 supposed-0.017
Token shells
Token ID19679
Feature activation-3.05e-02

Pos logit contributions
 seeking+0.008
 pad+0.007
 wonders+0.006
 pretending+0.006
 hiding+0.006
Neg logit contributions
ug-0.010
oo-0.009
ww-0.009
aw-0.009
ister-0.009
Token and
Token ID290
Feature activation+1.31e-02

Pos logit contributions
ug+0.061
aw+0.056
oo+0.054
ww+0.053
apy+0.052
Neg logit contributions
 seeking-0.054
 pad-0.044
 wonders-0.042
 pretending-0.041
epad-0.040
Token stones
Token ID14966
Feature activation-5.24e-02

Pos logit contributions
 seeking+0.025
 pad+0.020
 wonders+0.019
 supposed+0.019
epad+0.019
Neg logit contributions
ug-0.024
aw-0.023
ww-0.021
apy-0.021
oo-0.021
Token and
Token ID290
Feature activation+2.99e-03

Pos logit contributions
ug+0.094
aw+0.088
oo+0.080
apy+0.077
ww+0.075
Neg logit contributions
 seeking-0.104
 wonders-0.085
 pad-0.082
 former-0.082
 supposed-0.081
Token seaw
Token ID32085
Feature activation-9.84e-02

Pos logit contributions
 seeking+0.006
 pad+0.005
 supposed+0.004
 wonders+0.004
epad+0.004
Neg logit contributions
ug-0.006
aw-0.005
ww-0.005
apy-0.005
oo-0.005
Tokeneed
Token ID2308
Feature activation-7.29e-02

Pos logit contributions
ug+0.188
ww+0.170
ister+0.169
oo+0.168
apy+0.164
Neg logit contributions
 seeking-0.173
 pad-0.156
 wonders-0.140
 hiding-0.136
 pretending-0.132
Token to
Token ID284
Feature activation+2.46e-03

Pos logit contributions
ug+0.138
aw+0.131
oo+0.121
apy+0.116
ww+0.115
Neg logit contributions
 seeking-0.137
 wonders-0.109
 pad-0.108
 former-0.105
 supposed-0.104
Token make
Token ID787
Feature activation-2.35e-02

Pos logit contributions
 seeking+0.004
 pad+0.003
 wonders+0.003
 pretending+0.003
epad+0.003
Neg logit contributions
ug-0.005
aw-0.005
oo-0.005
ww-0.005
apy-0.005
Token it
Token ID340
Feature activation-1.02e-02

Pos logit contributions
ug+0.045
aw+0.042
oo+0.040
ww+0.039
apy+0.038
Neg logit contributions
 seeking-0.045
 pad-0.037
 wonders-0.034
epad-0.033
 pretending-0.033
Token look
Token ID804
Feature activation+1.32e-02

Pos logit contributions
ug+0.017
apy+0.016
aw+0.016
 cho+0.015
oo+0.014
Neg logit contributions
 seeking-0.020
 pad-0.018
 forgetting-0.015
 least-0.015
epad-0.015
Token decorated
Token ID24789
Feature activation-1.74e-03

Pos logit contributions
ug+0.007
aw+0.007
apy+0.007
ww+0.006
oo+0.006
Neg logit contributions
 seeking-0.007
 pad-0.006
 supposed-0.006
 wonders-0.006
epad-0.006
Token it
Token ID340
Feature activation-3.28e-02

Pos logit contributions
ug+0.003
aw+0.002
oo+0.002
ww+0.002
apy+0.002
Neg logit contributions
 seeking-0.004
 pad-0.003
 wonders-0.003
 pretending-0.003
 hiding-0.003
Token with
Token ID351
Feature activation+2.28e-02

Pos logit contributions
ug+0.037
apy+0.034
aw+0.033
oo+0.029
ves+0.028
Neg logit contributions
 seeking-0.078
 pad-0.075
 supposed-0.071
 least-0.071
 suppose-0.071
Token shells
Token ID19679
Feature activation-5.07e-02

Pos logit contributions
 seeking+0.052
 pad+0.045
 forgetting+0.042
 supposed+0.042
epad+0.041
Neg logit contributions
ug-0.034
aw-0.032
apy-0.030
oo-0.027
 cho-0.027
Token and
Token ID290
Feature activation-1.04e-02

Pos logit contributions
ug+0.097
aw+0.092
oo+0.083
ister+0.080
ww+0.080
Neg logit contributions
 seeking-0.094
 wonders-0.074
 former-0.073
 supposed-0.071
epad-0.070
Token seaw
Token ID32085
Feature activation-9.81e-02

Pos logit contributions
ug+0.016
aw+0.016
apy+0.015
ww+0.014
 cho+0.013
Neg logit contributions
 seeking-0.021
 pad-0.018
 supposed-0.017
 anyway-0.017
epad-0.017
Tokeneed
Token ID2308
Feature activation-8.57e-02

Pos logit contributions
ug+0.181
oo+0.161
ww+0.159
aw+0.157
apy+0.155
Neg logit contributions
 seeking-0.185
 pad-0.164
 wonders-0.150
 hiding-0.145
 pretending-0.141
Token.
Token ID13
Feature activation-8.25e-03

Pos logit contributions
ug+0.169
aw+0.161
oo+0.146
apy+0.139
 cho+0.138
Neg logit contributions
 seeking-0.155
 wonders-0.120
 former-0.120
 least-0.118
 supposed-0.117
Token They
Token ID1119
Feature activation-1.68e-02

Pos logit contributions
ug+0.015
aw+0.014
oo+0.014
ww+0.013
apy+0.013
Neg logit contributions
 seeking-0.016
 pad-0.013
 wonders-0.013
epad-0.012
 pretending-0.012
Token had
Token ID550
Feature activation-1.15e-02

Pos logit contributions
ug+0.030
aw+0.028
apy+0.026
opter+0.025
dd+0.025
Neg logit contributions
 seeking-0.031
 pad-0.026
 supposed-0.025
 forgetting-0.024
 fault-0.024
Token a
Token ID257
Feature activation+6.09e-03

Pos logit contributions
aw+0.022
ug+0.020
 cho+0.019
ww+0.018
oo+0.017
Neg logit contributions
 seeking-0.020
 pad-0.017
epad-0.016
 slave-0.015
 former-0.015
Tokenate
Token ID378
Feature activation-4.87e-04

Pos logit contributions
 seeking+0.006
 wonders+0.005
 own+0.005
 hiding+0.004
 first+0.004
Neg logit contributions
ug-0.008
ww-0.008
apy-0.008
ister-0.007
oo-0.007
Token it
Token ID340
Feature activation-2.19e-02

Pos logit contributions
ug+0.001
aw+0.001
oo+0.001
ww+0.001
apy+0.001
Neg logit contributions
 seeking-0.001
 pad-0.001
 wonders-0.001
epad-0.001
 pretending-0.001
Token with
Token ID351
Feature activation+2.32e-02

Pos logit contributions
ug+0.032
apy+0.029
aw+0.028
oo+0.027
ww+0.025
Neg logit contributions
 seeking-0.047
 pad-0.043
 forgetting-0.041
 supposed-0.040
 least-0.040
Token shells
Token ID19679
Feature activation-4.82e-02

Pos logit contributions
 seeking+0.050
 pad+0.043
 wonders+0.039
epad+0.039
 forgetting+0.039
Neg logit contributions
ug-0.038
aw-0.035
apy-0.033
oo-0.032
ww-0.032
Token and
Token ID290
Feature activation-6.36e-03

Pos logit contributions
ug+0.099
aw+0.094
oo+0.088
ww+0.086
ister+0.084
Neg logit contributions
 seeking-0.084
 wonders-0.064
 pad-0.062
 former-0.061
epad-0.061
Token seaw
Token ID32085
Feature activation-9.76e-02

Pos logit contributions
ug+0.011
aw+0.010
apy+0.009
ww+0.009
oo+0.009
Neg logit contributions
 seeking-0.013
 pad-0.011
 supposed-0.010
epad-0.010
 wonders-0.010
Tokeneed
Token ID2308
Feature activation-8.11e-02

Pos logit contributions
ug+0.194
ister+0.173
oo+0.173
ww+0.172
apy+0.170
Neg logit contributions
 seeking-0.161
 pad-0.149
 wonders-0.131
 hiding-0.127
 pretending-0.121
Token.
Token ID13
Feature activation-5.02e-03

Pos logit contributions
ug+0.167
aw+0.159
oo+0.148
ww+0.143
apy+0.142
Neg logit contributions
 seeking-0.140
 wonders-0.106
 pad-0.104
 former-0.104
 forgetting-0.102
Token They
Token ID1119
Feature activation-1.29e-02

Pos logit contributions
ug+0.010
aw+0.009
oo+0.009
ww+0.009
ister+0.009
Neg logit contributions
 seeking-0.009
 pad-0.007
 wonders-0.007
 pretending-0.007
 hiding-0.007
Token find
Token ID1064
Feature activation-8.95e-03

Pos logit contributions
ug+0.022
aw+0.020
opter+0.019
apy+0.019
ister+0.018
Neg logit contributions
 seeking-0.024
 pad-0.021
 supposed-0.019
 former-0.019
 lair-0.019
Token a
Token ID257
Feature activation+1.45e-02

Pos logit contributions
ug+0.021
oo+0.020
ww+0.020
ister+0.020
opter+0.019
Neg logit contributions
 seeking-0.012
 wonders-0.010
 pad-0.009
 hiding-0.009
 pretending-0.009
Token some
Token ID617
Feature activation+5.44e-03

Pos logit contributions
ug+0.069
aw+0.063
oo+0.062
ww+0.061
apy+0.059
Neg logit contributions
 seeking-0.057
 pad-0.046
 wonders-0.044
 pretending-0.042
epad-0.042
Token shells
Token ID19679
Feature activation-3.67e-02

Pos logit contributions
 seeking+0.010
 wonders+0.008
 pad+0.008
 hiding+0.008
 pretending+0.008
Neg logit contributions
ug-0.011
oo-0.010
ww-0.009
opter-0.009
aw-0.009
Token and
Token ID290
Feature activation-1.39e-03

Pos logit contributions
ug+0.066
aw+0.060
oo+0.058
ww+0.057
apy+0.055
Neg logit contributions
 seeking-0.073
 pad-0.061
 wonders-0.058
 pretending-0.056
epad-0.056
Token rocks
Token ID12586
Feature activation-5.97e-02

Pos logit contributions
ug+0.002
aw+0.002
apy+0.002
ww+0.002
oo+0.002
Neg logit contributions
 seeking-0.003
 pad-0.002
 supposed-0.002
epad-0.002
 anyway-0.002
Token and
Token ID290
Feature activation-2.35e-03

Pos logit contributions
ug+0.115
aw+0.105
oo+0.101
ww+0.097
apy+0.096
Neg logit contributions
 seeking-0.111
 pad-0.091
 wonders-0.088
epad-0.084
 pretending-0.083
Token seaw
Token ID32085
Feature activation-9.68e-02

Pos logit contributions
ug+0.004
aw+0.004
apy+0.004
ww+0.003
 cho+0.003
Neg logit contributions
 seeking-0.005
 pad-0.004
 supposed-0.004
epad-0.004
 anyway-0.004
Tokeneed
Token ID2308
Feature activation-7.22e-02

Pos logit contributions
ug+0.193
ww+0.177
ister+0.175
oo+0.173
apy+0.173
Neg logit contributions
 seeking-0.160
 pad-0.147
 wonders-0.129
 hiding-0.126
 pretending-0.118
Token.
Token ID13
Feature activation-2.19e-03

Pos logit contributions
ug+0.138
aw+0.133
oo+0.121
apy+0.116
 cho+0.114
Neg logit contributions
 seeking-0.130
 former-0.105
 pad-0.104
 wonders-0.103
 least-0.100
Token He
Token ID679
Feature activation-2.57e-02

Pos logit contributions
ug+0.004
aw+0.004
oo+0.003
apy+0.003
ww+0.003
Neg logit contributions
 seeking-0.004
 pad-0.004
epad-0.004
 supposed-0.004
 wonders-0.003
Token picked
Token ID6497
Feature activation+5.37e-03

Pos logit contributions
ug+0.054
aw+0.051
oo+0.048
ww+0.047
apy+0.047
Neg logit contributions
 seeking-0.043
 pad-0.035
epad-0.032
 wonders-0.032
 supposed-0.031
Token them
Token ID606
Feature activation+6.77e-03

Pos logit contributions
 seeking+0.009
 pad+0.007
 wonders+0.007
 pretending+0.007
 hiding+0.007
Neg logit contributions
ug-0.011
aw-0.010
oo-0.010
ww-0.010
apy-0.010
Token.
Token ID13
Feature activation-1.12e-02

Pos logit contributions
 seeking+0.036
 wonders+0.030
 intended+0.030
epad+0.029
 former+0.029
Neg logit contributions
aw-0.031
ug-0.031
oo-0.027
ister-0.026
ww-0.025
Token<|endoftext|>
Token ID50256
Feature activation+3.05e-02

Pos logit contributions
ug+0.022
aw+0.020
oo+0.020
ww+0.019
apy+0.019
Neg logit contributions
 seeking-0.021
 pad-0.017
 wonders-0.016
epad-0.016
 pretending-0.016
TokenM
Token ID44
Feature activation-2.64e-02

Pos logit contributions
 seeking+0.057
 pad+0.047
 wonders+0.044
epad+0.044
 pretending+0.044
Neg logit contributions
ug-0.058
aw-0.053
oo-0.052
ww-0.051
apy-0.049
Tokenama
Token ID1689
Feature activation-4.87e-02

Pos logit contributions
ug+0.038
aw+0.032
apy+0.030
ww+0.030
oo+0.030
Neg logit contributions
 seeking-0.061
 pad-0.053
 wonders-0.052
 least-0.050
 anyway-0.049
Token and
Token ID290
Feature activation-1.66e-02

Pos logit contributions
ug+0.102
aw+0.093
oo+0.091
ww+0.090
apy+0.087
Neg logit contributions
 seeking-0.082
 pad-0.066
 wonders-0.062
 pretending-0.060
epad-0.060
Token pap
Token ID20461
Feature activation-9.55e-02

Pos logit contributions
ug+0.038
aw+0.034
oo+0.033
opter+0.033
ww+0.032
Neg logit contributions
 seeking-0.025
 wonders-0.020
 pad-0.020
 find-0.019
 think-0.019
Tokena
Token ID64
Feature activation-2.90e-02

Pos logit contributions
ug+0.172
ww+0.163
aw+0.162
apy+0.160
olk+0.149
Neg logit contributions
 seeking-0.161
 pad-0.142
 anyway-0.135
 wonders-0.133
 least-0.130
Token were
Token ID547
Feature activation-2.26e-02

Pos logit contributions
ug+0.053
aw+0.050
oo+0.047
ww+0.046
apy+0.044
Neg logit contributions
 seeking-0.056
 pad-0.046
 wonders-0.044
epad-0.043
 pretending-0.043
Token busy
Token ID8179
Feature activation-1.92e-04

Pos logit contributions
ug+0.051
aw+0.047
oo+0.046
ww+0.045
apy+0.043
Neg logit contributions
 seeking-0.035
 pad-0.028
 wonders-0.026
epad-0.025
 pretending-0.024
Token in
Token ID287
Feature activation-4.21e-03

Pos logit contributions
ug+0.000
oo+0.000
aw+0.000
apy+0.000
ww+0.000
Neg logit contributions
 seeking-0.000
 pad-0.000
 wonders-0.000
epad-0.000
 forgetting-0.000
Token the
Token ID262
Feature activation-6.98e-03

Pos logit contributions
ug+0.006
oo+0.006
aw+0.006
ww+0.005
apy+0.005
Neg logit contributions
 seeking-0.009
 pad-0.008
epad-0.008
 wonders-0.007
 lair-0.007
Token like
Token ID588
Feature activation-1.81e-03

Pos logit contributions
 seeking+0.027
 pad+0.021
 wonders+0.019
 pretending+0.018
epad+0.018
Neg logit contributions
ug-0.052
aw-0.048
oo-0.047
ww-0.047
apy-0.046
Token fish
Token ID5916
Feature activation-2.42e-02

Pos logit contributions
ug+0.003
aw+0.003
oo+0.003
ww+0.003
apy+0.003
Neg logit contributions
 seeking-0.004
 pad-0.003
 wonders-0.003
epad-0.003
 pretending-0.003
Token and
Token ID290
Feature activation-1.93e-04

Pos logit contributions
ug+0.034
aw+0.030
oo+0.029
ww+0.028
apy+0.026
Neg logit contributions
 seeking-0.057
 pad-0.049
epad-0.048
 wonders-0.048
 former-0.046
Token shells
Token ID19679
Feature activation-5.46e-02

Pos logit contributions
ug+0.000
aw+0.000
apy+0.000
ww+0.000
oo+0.000
Neg logit contributions
 seeking-0.000
 pad-0.000
epad-0.000
 supposed-0.000
 anyway-0.000
Token and
Token ID290
Feature activation-5.76e-04

Pos logit contributions
ug+0.111
aw+0.107
oo+0.099
ww+0.096
ister+0.095
Neg logit contributions
 seeking-0.093
 pad-0.072
 wonders-0.072
epad-0.071
 former-0.070
Token seaw
Token ID32085
Feature activation-9.47e-02

Pos logit contributions
ug+0.001
aw+0.001
apy+0.001
ww+0.001
oo+0.001
Neg logit contributions
 seeking-0.001
 pad-0.001
 supposed-0.001
epad-0.001
 forgetting-0.001
Tokeneed
Token ID2308
Feature activation-6.79e-02

Pos logit contributions
ug+0.188
ww+0.169
ister+0.167
apy+0.166
oo+0.165
Neg logit contributions
 seeking-0.159
 pad-0.143
 wonders-0.129
 hiding-0.127
 pretending-0.119
Token.
Token ID13
Feature activation-4.14e-03

Pos logit contributions
ug+0.126
aw+0.124
oo+0.111
ww+0.106
apy+0.105
Neg logit contributions
 seeking-0.123
 least-0.099
 former-0.099
 wonders-0.096
 pad-0.096
Token But
Token ID887
Feature activation+2.22e-03

Pos logit contributions
ug+0.009
aw+0.008
oo+0.008
ww+0.008
opter+0.008
Neg logit contributions
 seeking-0.007
 pad-0.006
 wonders-0.005
 pretending-0.005
 hiding-0.005
Token they
Token ID484
Feature activation+2.05e-02

Pos logit contributions
 seeking+0.004
 pad+0.003
 wonders+0.003
 pretending+0.003
 hiding+0.003
Neg logit contributions
ug-0.004
aw-0.004
oo-0.004
ww-0.004
apy-0.004
Token did
Token ID750
Feature activation+6.38e-04

Pos logit contributions
 seeking+0.031
 pad+0.024
 wonders+0.022
 pretending+0.021
epad+0.021
Neg logit contributions
ug-0.047
aw-0.043
oo-0.043
ww-0.042
apy-0.041
Token made
Token ID925
Feature activation-3.06e-02

Pos logit contributions
 seeking+0.010
epad+0.008
 pad+0.008
 supposed+0.007
 forgetting+0.007
Neg logit contributions
aw-0.010
ug-0.010
oo-0.009
apy-0.009
ww-0.009
Token him
Token ID683
Feature activation-1.86e-02

Pos logit contributions
ug+0.053
aw+0.051
oo+0.050
apy+0.048
 cho+0.044
Neg logit contributions
 seeking-0.063
 pad-0.054
 intended-0.047
 lair-0.046
epad-0.046
Token an
Token ID281
Feature activation-2.44e-04

Pos logit contributions
ug+0.034
apy+0.033
oo+0.033
aw+0.031
ww+0.029
Neg logit contributions
 seeking-0.034
 pad-0.028
 wonders-0.026
epad-0.026
 lair-0.026
Token even
Token ID772
Feature activation+2.94e-03

Pos logit contributions
ug+0.001
oo+0.000
aw+0.000
ww+0.000
ister+0.000
Neg logit contributions
 seeking-0.000
 pad-0.000
epad-0.000
 wonders-0.000
 forgetting-0.000
Token y
Token ID331
Feature activation-1.22e-02

Pos logit contributions
 seeking+0.005
 pad+0.004
epad+0.004
 supposed+0.004
 lair+0.004
Neg logit contributions
ug-0.006
aw-0.006
ww-0.005
apy-0.005
oo-0.005
Tokenumm
Token ID13929
Feature activation-9.42e-02

Pos logit contributions
ug+0.018
aw+0.016
apy+0.016
ww+0.015
ves+0.015
Neg logit contributions
 seeking-0.025
 hiding-0.022
 wonders-0.022
 pad-0.022
 anyway-0.021
Tokenier
Token ID959
Feature activation-2.58e-02

Pos logit contributions
ug+0.093
aw+0.077
oo+0.071
ww+0.070
apy+0.067
Neg logit contributions
 seeking-0.262
 pad-0.233
 wonders-0.223
 hiding-0.221
 anyway-0.221
Token snack
Token ID26906
Feature activation-5.34e-03

Pos logit contributions
ug+0.048
aw+0.045
ww+0.045
oo+0.043
opter+0.042
Neg logit contributions
 seeking-0.048
 pad-0.037
 wonders-0.036
epad-0.035
 forgetting-0.034
Token and
Token ID290
Feature activation+5.86e-03

Pos logit contributions
ug+0.011
aw+0.010
oo+0.010
ww+0.010
ister+0.010
Neg logit contributions
 seeking-0.009
 wonders-0.007
 pad-0.007
 former-0.007
epad-0.007
Token they
Token ID484
Feature activation-1.35e-02

Pos logit contributions
 seeking+0.011
 pad+0.010
epad+0.009
 lair+0.009
 wonders+0.009
Neg logit contributions
ug-0.010
apy-0.010
aw-0.010
ww-0.009
oo-0.009
Token both
Token ID1111
Feature activation-9.02e-05

Pos logit contributions
ug+0.023
aw+0.022
apy+0.021
dd+0.020
oo+0.020
Neg logit contributions
 seeking-0.028
 pad-0.022
 forgetting-0.021
epad-0.021
 wonders-0.021
Token see
Token ID766
Feature activation-2.80e-02

Pos logit contributions
ug+0.020
aw+0.018
ister+0.017
opter+0.017
oo+0.017
Neg logit contributions
 seeking-0.019
 pad-0.017
 supposed-0.016
 former-0.015
epad-0.015
Token shells
Token ID19679
Feature activation-3.66e-02

Pos logit contributions
ug+0.059
oo+0.053
ww+0.052
aw+0.052
ister+0.050
Neg logit contributions
 seeking-0.048
 pad-0.039
 wonders-0.037
 pretending-0.036
 hiding-0.035
Token,
Token ID11
Feature activation+2.17e-02

Pos logit contributions
ug+0.065
ww+0.058
aw+0.058
oo+0.057
apy+0.054
Neg logit contributions
 seeking-0.074
 pad-0.062
 hiding-0.058
 wonders-0.058
 pretending-0.058
Token rocks
Token ID12586
Feature activation-5.66e-02

Pos logit contributions
 seeking+0.042
 pad+0.035
 wonders+0.033
 pretending+0.032
epad+0.032
Neg logit contributions
ug-0.040
aw-0.037
oo-0.036
ww-0.035
apy-0.034
Token,
Token ID11
Feature activation+1.51e-02

Pos logit contributions
ug+0.098
ww+0.089
aw+0.088
oo+0.086
apy+0.083
Neg logit contributions
 seeking-0.116
 pad-0.098
 hiding-0.094
 pretending-0.092
 wonders-0.090
Token seaw
Token ID32085
Feature activation-9.31e-02

Pos logit contributions
 seeking+0.025
 pad+0.020
 wonders+0.019
 pretending+0.018
 hiding+0.018
Neg logit contributions
ug-0.034
oo-0.031
ww-0.030
aw-0.030
ister-0.030
Tokeneed
Token ID2308
Feature activation-6.81e-02

Pos logit contributions
ug+0.197
apy+0.185
oo+0.177
ister+0.177
ww+0.175
Neg logit contributions
 seeking-0.140
 pad-0.137
 hiding-0.111
 wonders-0.110
 count-0.103
Token,
Token ID11
Feature activation+1.10e-02

Pos logit contributions
ug+0.108
aw+0.095
oo+0.093
ww+0.092
apy+0.088
Neg logit contributions
 seeking-0.150
 pad-0.128
 wonders-0.123
 pretending-0.120
 hiding-0.119
Token crabs
Token ID49172
Feature activation-4.56e-02

Pos logit contributions
 seeking+0.017
 pad+0.014
 wonders+0.013
 pretending+0.012
 hiding+0.012
Neg logit contributions
ug-0.025
oo-0.023
ww-0.023
aw-0.022
ister-0.022
Token and
Token ID290
Feature activation+1.74e-03

Pos logit contributions
ug+0.064
aw+0.056
oo+0.054
ww+0.053
apy+0.050
Neg logit contributions
 seeking-0.108
 pad-0.093
 wonders-0.090
epad-0.088
 pretending-0.088
Token fish
Token ID5916
Feature activation-4.38e-02

Pos logit contributions
 seeking+0.003
 pad+0.003
 supposed+0.003
epad+0.003
 wonders+0.003
Neg logit contributions
ug-0.003
aw-0.003
ww-0.003
apy-0.003
oo-0.002
Token of
Token ID286
Feature activation+9.08e-03

Pos logit contributions
ug+0.002
aw+0.002
oo+0.001
ww+0.001
apy+0.001
Neg logit contributions
 seeking-0.004
 pad-0.003
 wonders-0.003
epad-0.003
 pretending-0.003
Token shells
Token ID19679
Feature activation-3.14e-02

Pos logit contributions
 seeking+0.014
 pad+0.011
 wonders+0.010
epad+0.010
 pretending+0.010
Neg logit contributions
ug-0.020
aw-0.019
oo-0.018
ww-0.018
apy-0.017
Token,
Token ID11
Feature activation+2.33e-02

Pos logit contributions
ug+0.050
aw+0.047
oo+0.042
ww+0.042
ister+0.041
Neg logit contributions
 seeking-0.065
 wonders-0.055
 former-0.054
 pad-0.054
epad-0.053
Token rocks
Token ID12586
Feature activation-5.61e-02

Pos logit contributions
 seeking+0.043
 pad+0.035
 supposed+0.033
 wonders+0.033
epad+0.033
Neg logit contributions
ug-0.044
aw-0.041
ww-0.037
oo-0.037
apy-0.037
Token and
Token ID290
Feature activation+7.84e-03

Pos logit contributions
ug+0.089
aw+0.079
oo+0.076
ww+0.074
ister+0.071
Neg logit contributions
 seeking-0.123
 pad-0.104
 wonders-0.102
epad-0.099
 pretending-0.097
Token seaw
Token ID32085
Feature activation-9.23e-02

Pos logit contributions
 seeking+0.015
 pad+0.012
 supposed+0.012
 fault+0.012
epad+0.011
Neg logit contributions
ug-0.013
aw-0.013
apy-0.012
ww-0.012
opter-0.011
Tokeneed
Token ID2308
Feature activation-6.79e-02

Pos logit contributions
ug+0.176
oo+0.158
ww+0.156
apy+0.155
ister+0.153
Neg logit contributions
 seeking-0.167
 pad-0.150
 wonders-0.135
 hiding-0.130
 pretending-0.126
Token,
Token ID11
Feature activation+4.45e-03

Pos logit contributions
ug+0.119
aw+0.114
oo+0.102
apy+0.098
ww+0.098
Neg logit contributions
 seeking-0.129
 wonders-0.106
 former-0.104
 pad-0.102
epad-0.101
Token but
Token ID475
Feature activation-1.41e-02

Pos logit contributions
 seeking+0.008
 supposed+0.007
epad+0.007
 pad+0.007
 wonders+0.006
Neg logit contributions
aw-0.008
ug-0.008
oo-0.006
apy-0.006
 cho-0.006
Token Jane
Token ID12091
Feature activation+4.31e-02

Pos logit contributions
ug+0.021
apy+0.020
aw+0.020
oo+0.019
ww+0.018
Neg logit contributions
 seeking-0.029
 pad-0.025
 lair-0.024
epad-0.023
 supposed-0.023
Token was
Token ID373
Feature activation-2.81e-04

Pos logit contributions
 seeking+0.080
epad+0.063
 former+0.058
 lair+0.058
 pad+0.057
Neg logit contributions
ug-0.084
aw-0.081
oo-0.075
dd-0.072
apy-0.070
Token and
Token ID290
Feature activation-3.81e-03

Pos logit contributions
 seeking+0.060
 pad+0.049
 wonders+0.048
epad+0.047
 lair+0.046
Neg logit contributions
ug-0.056
aw-0.053
oo-0.051
apy-0.048
ww-0.047
Token it
Token ID340
Feature activation-2.31e-02

Pos logit contributions
ug+0.007
aw+0.007
apy+0.007
oo+0.006
ww+0.006
Neg logit contributions
 seeking-0.007
 pad-0.006
 wonders-0.005
epad-0.005
 lair-0.005
Token was
Token ID373
Feature activation-7.69e-03

Pos logit contributions
ug+0.041
aw+0.039
ww+0.037
apy+0.036
oo+0.036
Neg logit contributions
 seeking-0.044
 pad-0.039
epad-0.036
 pretending-0.033
 lair-0.033
Token the
Token ID262
Feature activation-1.54e-02

Pos logit contributions
ug+0.015
oo+0.014
aw+0.014
ww+0.013
apy+0.013
Neg logit contributions
 seeking-0.014
 pad-0.012
 wonders-0.011
epad-0.010
 lair-0.010
Token y
Token ID331
Feature activation-2.39e-03

Pos logit contributions
ug+0.028
aw+0.027
ww+0.027
oo+0.026
ister+0.025
Neg logit contributions
 seeking-0.028
 pad-0.023
 wonders-0.021
epad-0.021
 pretending-0.020
Tokenumm
Token ID13929
Feature activation-9.21e-02

Pos logit contributions
ug+0.004
aw+0.003
opter+0.003
ister+0.003
oo+0.003
Neg logit contributions
 seeking-0.005
 wonders-0.004
 anyway-0.004
 pad-0.004
 former-0.004
Tokeniest
Token ID6386
Feature activation-1.52e-02

Pos logit contributions
ug+0.104
aw+0.088
ww+0.076
oo+0.076
apy+0.075
Neg logit contributions
 seeking-0.237
 pad-0.210
 anyway-0.210
 supposed-0.205
epad-0.204
Token dinner
Token ID8073
Feature activation-4.25e-02

Pos logit contributions
ug+0.024
aw+0.022
ww+0.022
oo+0.022
apy+0.021
Neg logit contributions
 seeking-0.033
 pad-0.028
epad-0.026
 wonders-0.026
 supposed-0.025
Token they
Token ID484
Feature activation-2.70e-02

Pos logit contributions
ug+0.078
aw+0.072
ww+0.070
oo+0.069
apy+0.067
Neg logit contributions
 seeking-0.080
 pad-0.068
 wonders-0.065
epad-0.061
 pretending-0.060
Token ever
Token ID1683
Feature activation-6.45e-03

Pos logit contributions
ug+0.054
dd+0.051
ww+0.050
aw+0.050
oo+0.048
Neg logit contributions
 seeking-0.043
 pad-0.037
 fault-0.036
 lair-0.035
 wonders-0.034
Token had
Token ID550
Feature activation-3.61e-02

Pos logit contributions
aw+0.013
ug+0.013
ww+0.012
dd+0.012
oo+0.011
Neg logit contributions
 seeking-0.011
 pad-0.008
 lair-0.008
 fault-0.008
 wonders-0.008
Token like
Token ID588
Feature activation+1.67e-03

Pos logit contributions
 seeking+0.025
 pad+0.019
 wonders+0.018
 pretending+0.017
 hiding+0.016
Neg logit contributions
ug-0.049
aw-0.045
oo-0.044
ww-0.044
apy-0.043
Token fish
Token ID5916
Feature activation-2.62e-02

Pos logit contributions
 seeking+0.003
 pad+0.003
 wonders+0.003
epad+0.003
 pretending+0.003
Neg logit contributions
ug-0.003
aw-0.003
oo-0.003
ww-0.003
apy-0.002
Token and
Token ID290
Feature activation-1.46e-03

Pos logit contributions
ug+0.039
aw+0.034
oo+0.033
ww+0.032
apy+0.031
Neg logit contributions
 seeking-0.060
 pad-0.052
 wonders-0.050
epad-0.049
 pretending-0.048
Token shells
Token ID19679
Feature activation-5.61e-02

Pos logit contributions
ug+0.002
aw+0.002
apy+0.002
ww+0.002
oo+0.002
Neg logit contributions
 seeking-0.003
 pad-0.003
epad-0.003
 supposed-0.003
 anyway-0.002
Token and
Token ID290
Feature activation-1.59e-03

Pos logit contributions
ug+0.117
aw+0.111
oo+0.104
ww+0.101
apy+0.099
Neg logit contributions
 seeking-0.094
 pad-0.073
 wonders-0.072
epad-0.071
 former-0.069
Token seaw
Token ID32085
Feature activation-9.21e-02

Pos logit contributions
ug+0.002
aw+0.002
apy+0.002
ww+0.002
oo+0.002
Neg logit contributions
 seeking-0.003
 pad-0.003
epad-0.003
 supposed-0.003
 anyway-0.003
Tokeneed
Token ID2308
Feature activation-6.88e-02

Pos logit contributions
ug+0.184
ww+0.166
ister+0.165
apy+0.164
oo+0.162
Neg logit contributions
 seeking-0.151
 pad-0.138
 wonders-0.122
 hiding-0.120
 pretending-0.111
Token.
Token ID13
Feature activation-1.32e-03

Pos logit contributions
ug+0.132
aw+0.128
oo+0.116
ww+0.111
apy+0.111
Neg logit contributions
 seeking-0.123
 former-0.096
 least-0.096
 wonders-0.096
 pad-0.095
Token
Token ID198
Feature activation-1.99e-03

Pos logit contributions
ug+0.003
aw+0.002
oo+0.002
ww+0.002
apy+0.002
Neg logit contributions
 seeking-0.002
 pad-0.002
 wonders-0.002
epad-0.002
 pretending-0.002
Token
Token ID198
Feature activation-4.63e-03

Pos logit contributions
ug+0.005
aw+0.004
ww+0.004
oo+0.004
opter+0.004
Neg logit contributions
 seeking-0.003
 pad-0.002
 wonders-0.002
 hiding-0.002
 pretending-0.002
TokenOne
Token ID3198
Feature activation+5.99e-02

Pos logit contributions
ug+0.011
aw+0.010
oo+0.010
ww+0.010
opter+0.010
Neg logit contributions
 seeking-0.006
 pad-0.005
 wonders-0.005
 anyway-0.004
 hiding-0.004
Token shore
Token ID15191
Feature activation-7.88e-03

Pos logit contributions
ug+0.032
aw+0.029
ww+0.028
oo+0.028
opter+0.026
Neg logit contributions
 seeking-0.041
 pad-0.034
 wonders-0.032
epad-0.032
 pretending-0.032
Token,
Token ID11
Feature activation+7.10e-03

Pos logit contributions
ug+0.015
aw+0.014
oo+0.013
ister+0.012
ww+0.012
Neg logit contributions
 seeking-0.015
 wonders-0.012
 pad-0.012
epad-0.011
 supposed-0.011
Token collecting
Token ID13157
Feature activation+1.17e-02

Pos logit contributions
 seeking+0.011
 wonders+0.009
epad+0.009
 pad+0.008
 former+0.008
Neg logit contributions
ug-0.014
aw-0.014
oo-0.013
 cho-0.013
apy-0.012
Token shells
Token ID19679
Feature activation-3.48e-02

Pos logit contributions
 seeking+0.020
 wonders+0.017
 pad+0.017
 hiding+0.016
 pretending+0.016
Neg logit contributions
ug-0.023
oo-0.021
ww-0.021
ister-0.020
opter-0.020
Token and
Token ID290
Feature activation+3.44e-03

Pos logit contributions
ug+0.063
aw+0.059
oo+0.054
ww+0.052
ister+0.051
Neg logit contributions
 seeking-0.068
 wonders-0.056
 pad-0.053
epad-0.053
 former-0.053
Token seaw
Token ID32085
Feature activation-9.19e-02

Pos logit contributions
 seeking+0.006
 pad+0.005
epad+0.005
 anyway+0.005
 wonders+0.005
Neg logit contributions
ug-0.006
aw-0.006
apy-0.006
ww-0.006
oo-0.005
Tokeneed
Token ID2308
Feature activation-6.88e-02

Pos logit contributions
ug+0.176
ww+0.158
oo+0.156
apy+0.155
ister+0.155
Neg logit contributions
 seeking-0.161
 pad-0.145
 wonders-0.130
 hiding-0.125
epad-0.121
Token.
Token ID13
Feature activation-8.57e-03

Pos logit contributions
ug+0.134
aw+0.126
oo+0.116
apy+0.110
dd+0.109
Neg logit contributions
 seeking-0.125
 wonders-0.102
 former-0.097
 pad-0.095
 least-0.095
Token Jimmy
Token ID12963
Feature activation-1.34e-02

Pos logit contributions
ug+0.016
aw+0.014
oo+0.013
apy+0.013
ww+0.013
Neg logit contributions
 seeking-0.017
 pad-0.014
 wonders-0.013
 hiding-0.013
epad-0.013
Token took
Token ID1718
Feature activation-2.29e-02

Pos logit contributions
ug+0.028
aw+0.026
oo+0.025
ww+0.025
apy+0.024
Neg logit contributions
 seeking-0.023
 pad-0.018
 wonders-0.017
epad-0.017
 pretending-0.016
Token time
Token ID640
Feature activation-1.81e-02

Pos logit contributions
ug+0.053
ister+0.050
opter+0.049
oo+0.048
ww+0.048
Neg logit contributions
 seeking-0.031
 wonders-0.025
 pad-0.025
 hiding-0.023
 pretending-0.023
Token an
Token ID281
Feature activation+3.90e-03

Pos logit contributions
 seeking+0.017
 pad+0.015
 former+0.015
 lair+0.014
epad+0.014
Neg logit contributions
ug-0.020
oo-0.018
aw-0.018
ww-0.017
ister-0.017
Token amazing
Token ID4998
Feature activation-1.15e-04

Pos logit contributions
 seeking+0.006
 pad+0.005
 hiding+0.004
epad+0.004
 wonders+0.004
Neg logit contributions
ug-0.009
aw-0.008
oo-0.008
ww-0.008
ister-0.007
Token chess
Token ID19780
Feature activation+9.29e-03

Pos logit contributions
ug+0.000
ww+0.000
oo+0.000
aw+0.000
ister+0.000
Neg logit contributions
 seeking-0.000
 pad-0.000
 wonders-0.000
 supposed-0.000
epad-0.000
Token player
Token ID2137
Feature activation+7.23e-03

Pos logit contributions
 seeking+0.017
 pad+0.014
 forgetting+0.013
 wonders+0.013
epad+0.013
Neg logit contributions
ug-0.017
ww-0.016
aw-0.016
opter-0.016
ister-0.015
Token,
Token ID11
Feature activation-1.84e-02

Pos logit contributions
 seeking+0.014
 former+0.011
 pad+0.011
 wonders+0.011
epad+0.011
Neg logit contributions
ug-0.013
aw-0.012
ww-0.012
oo-0.011
ister-0.011
Token Mate
Token ID24787
Feature activation-9.18e-02

Pos logit contributions
ug+0.037
aw+0.034
oo+0.033
ww+0.033
ister+0.031
Neg logit contributions
 seeking-0.033
 pad-0.026
 wonders-0.025
epad-0.024
 hiding-0.024
Tokeno
Token ID78
Feature activation-2.53e-02

Pos logit contributions
ug+0.201
aw+0.183
oo+0.180
apy+0.178
ww+0.177
Neg logit contributions
 seeking-0.142
 pad-0.119
 wonders-0.107
 pretending-0.103
 hiding-0.102
Token!
Token ID0
Feature activation-9.95e-03

Pos logit contributions
ug+0.052
aw+0.047
oo+0.047
ww+0.046
ister+0.044
Neg logit contributions
 seeking-0.045
 pad-0.036
 wonders-0.035
epad-0.034
 pretending-0.033
Token Remember
Token ID11436
Feature activation+1.37e-02

Pos logit contributions
ug+0.020
aw+0.018
oo+0.018
ww+0.018
apy+0.017
Neg logit contributions
 seeking-0.018
 pad-0.015
 wonders-0.014
 hiding-0.014
 pretending-0.014
Token to
Token ID284
Feature activation+1.90e-02

Pos logit contributions
 seeking+0.031
 pad+0.026
 wonders+0.025
epad+0.025
 pretending+0.025
Neg logit contributions
ug-0.021
aw-0.019
oo-0.018
ww-0.018
ister-0.017
Token always
Token ID1464
Feature activation-9.45e-03

Pos logit contributions
 seeking+0.029
 pad+0.025
 pretending+0.022
 wonders+0.022
epad+0.021
Neg logit contributions
ug-0.042
oo-0.038
ister-0.037
aw-0.036
ww-0.036
Token as
Token ID355
Feature activation-4.26e-02

Pos logit contributions
ug+0.026
aw+0.024
ww+0.024
oo+0.023
apy+0.023
Neg logit contributions
 seeking-0.023
 pad-0.020
epad-0.018
 wonders-0.017
 pretending-0.017
Token solid
Token ID4735
Feature activation-5.86e-02

Pos logit contributions
ug+0.074
aw+0.066
apy+0.064
ww+0.064
oo+0.063
Neg logit contributions
 seeking-0.082
 pad-0.075
 wonders-0.063
 lair-0.062
 former-0.062
Token as
Token ID355
Feature activation-2.99e-02

Pos logit contributions
ug+0.130
aw+0.119
oo+0.117
ww+0.115
apy+0.112
Neg logit contributions
 seeking-0.092
 pad-0.073
 wonders-0.069
 pretending-0.065
epad-0.065
Token a
Token ID257
Feature activation-1.31e-02

Pos logit contributions
ug+0.050
aw+0.044
ww+0.043
oo+0.042
apy+0.042
Neg logit contributions
 seeking-0.061
 pad-0.054
 wonders-0.048
epad-0.047
 pretending-0.046
Token table
Token ID3084
Feature activation-4.97e-02

Pos logit contributions
ug+0.023
ww+0.023
apy+0.021
opter+0.020
oo+0.020
Neg logit contributions
 seeking-0.025
asured-0.019
 pad-0.019
 supposed-0.019
 wonders-0.019
Tokencloth
Token ID44905
Feature activation-9.16e-02

Pos logit contributions
ug+0.100
aw+0.093
oo+0.089
ww+0.087
ister+0.087
Neg logit contributions
 seeking-0.084
 pad-0.069
 wonders-0.068
 pretending-0.063
epad-0.062
Token!
Token ID0
Feature activation-2.16e-02

Pos logit contributions
ug+0.195
aw+0.178
oo+0.175
ww+0.173
ister+0.166
Neg logit contributions
 seeking-0.152
 pad-0.122
 wonders-0.116
 pretending-0.111
 hiding-0.109
Token<|endoftext|>
Token ID50256
Feature activation+4.03e-02

Pos logit contributions
ug+0.041
aw+0.037
oo+0.036
ww+0.036
apy+0.036
Neg logit contributions
 seeking-0.041
 pad-0.034
epad-0.031
 pretending-0.031
 wonders-0.031
TokenOnce
Token ID7454
Feature activation-8.06e-04

Pos logit contributions
 seeking+0.077
 pad+0.064
 wonders+0.061
 pretending+0.059
epad+0.059
Neg logit contributions
ug-0.075
aw-0.068
oo-0.066
ww-0.065
apy-0.063
Token upon
Token ID2402
Feature activation+3.03e-02

Pos logit contributions
ug+0.002
aw+0.002
oo+0.002
ww+0.002
apy+0.002
Neg logit contributions
 seeking-0.001
 pad-0.000
 wonders-0.000
 pretending-0.000
 hiding-0.000
Token a
Token ID257
Feature activation+2.61e-02

Pos logit contributions
 seeking+0.062
 pad+0.052
 former+0.048
 fault+0.047
epad+0.047
Neg logit contributions
ug-0.050
aw-0.049
oo-0.048
apy-0.045
ww-0.043
Token a
Token ID257
Feature activation+8.24e-03

Pos logit contributions
ug+0.044
aw+0.040
ww+0.040
oo+0.040
opter+0.039
Neg logit contributions
 seeking-0.030
 pad-0.024
 wonders-0.023
 pretending-0.022
 anyway-0.022
Token few
Token ID1178
Feature activation-4.92e-03

Pos logit contributions
 seeking+0.015
 pad+0.013
 wonders+0.012
 hiding+0.012
 pretending+0.012
Neg logit contributions
ug-0.016
aw-0.015
oo-0.014
ww-0.014
ister-0.013
Token shells
Token ID19679
Feature activation-4.03e-02

Pos logit contributions
ug+0.010
aw+0.009
oo+0.009
ww+0.009
opter+0.009
Neg logit contributions
 seeking-0.008
 pad-0.007
 wonders-0.006
 pretending-0.006
 hiding-0.006
Token and
Token ID290
Feature activation-1.12e-02

Pos logit contributions
ug+0.081
aw+0.074
oo+0.072
ww+0.071
apy+0.068
Neg logit contributions
 seeking-0.072
 pad-0.058
 wonders-0.056
 pretending-0.054
epad-0.053
Token some
Token ID617
Feature activation-1.37e-02

Pos logit contributions
ug+0.022
aw+0.020
ww+0.019
oo+0.019
apy+0.019
Neg logit contributions
 seeking-0.020
 pad-0.016
 wonders-0.016
epad-0.015
 supposed-0.015
Token seaw
Token ID32085
Feature activation-9.10e-02

Pos logit contributions
ug+0.025
aw+0.023
ww+0.021
apy+0.021
oo+0.021
Neg logit contributions
 seeking-0.027
 pad-0.022
 wonders-0.021
epad-0.021
 supposed-0.021
Tokeneed
Token ID2308
Feature activation-7.04e-02

Pos logit contributions
ug+0.176
ww+0.160
oo+0.160
apy+0.158
ister+0.156
Neg logit contributions
 seeking-0.155
 pad-0.141
 wonders-0.122
 hiding-0.119
 pretending-0.114
Token and
Token ID290
Feature activation-1.44e-02

Pos logit contributions
ug+0.136
aw+0.125
oo+0.117
apy+0.113
ister+0.110
Neg logit contributions
 seeking-0.129
 wonders-0.106
 former-0.102
 supposed-0.101
 pad-0.101
Token built
Token ID3170
Feature activation-1.58e-02

Pos logit contributions
ug+0.027
aw+0.026
apy+0.024
ww+0.023
oo+0.023
Neg logit contributions
 seeking-0.027
 pad-0.021
 supposed-0.021
 wonders-0.020
epad-0.020
Token a
Token ID257
Feature activation+3.70e-03

Pos logit contributions
ug+0.029
aw+0.028
oo+0.026
apy+0.025
ww+0.025
Neg logit contributions
 seeking-0.030
 pad-0.025
epad-0.023
 supposed-0.023
 wonders-0.023
Token castle
Token ID16669
Feature activation-2.55e-02

Pos logit contributions
 seeking+0.008
 pad+0.007
epad+0.007
 wonders+0.007
 pretending+0.007
Neg logit contributions
ug-0.006
aw-0.005
ww-0.005
apy-0.005
oo-0.005
Token and
Token ID290
Feature activation+4.36e-03

Pos logit contributions
ug+0.022
ww+0.020
aw+0.020
oo+0.020
opter+0.019
Neg logit contributions
 seeking-0.014
 pad-0.012
 anyway-0.011
 hiding-0.011
 missing-0.010
Token it
Token ID340
Feature activation-2.51e-02

Pos logit contributions
 seeking+0.008
 pad+0.006
 wonders+0.006
 pretending+0.006
 hiding+0.006
Neg logit contributions
ug-0.009
aw-0.008
oo-0.008
ww-0.008
ister-0.008
Token was
Token ID373
Feature activation-8.22e-03

Pos logit contributions
ug+0.053
aw+0.048
oo+0.048
ww+0.047
apy+0.045
Neg logit contributions
 seeking-0.042
 pad-0.034
 wonders-0.032
 pretending-0.031
 hiding-0.031
Token the
Token ID262
Feature activation-1.45e-02

Pos logit contributions
ug+0.017
aw+0.016
oo+0.016
ww+0.015
apy+0.015
Neg logit contributions
 seeking-0.014
 pad-0.011
 wonders-0.011
epad-0.010
 pretending-0.010
Token y
Token ID331
Feature activation-1.49e-03

Pos logit contributions
ug+0.030
aw+0.029
ww+0.029
oo+0.028
ister+0.027
Neg logit contributions
 seeking-0.024
 pad-0.018
 wonders-0.017
epad-0.016
 pretending-0.016
Tokenumm
Token ID13929
Feature activation-9.08e-02

Pos logit contributions
ug+0.003
ves+0.002
opter+0.002
apy+0.002
ister+0.002
Neg logit contributions
 seeking-0.002
 pad-0.002
 anyway-0.002
 hiding-0.002
 wonders-0.002
Tokeniest
Token ID6386
Feature activation-1.93e-02

Pos logit contributions
ug+0.116
aw+0.096
apy+0.091
ww+0.088
oo+0.088
Neg logit contributions
 seeking-0.216
 pad-0.195
 anyway-0.192
 supposed-0.186
 first-0.185
Token ice
Token ID4771
Feature activation-2.94e-02

Pos logit contributions
ug+0.033
aw+0.030
oo+0.029
ww+0.029
apy+0.027
Neg logit contributions
 seeking-0.040
 pad-0.034
 wonders-0.032
epad-0.031
 pretending-0.031
Token cream
Token ID8566
Feature activation-2.57e-02

Pos logit contributions
ug+0.047
aw+0.040
oo+0.039
ww+0.038
opter+0.037
Neg logit contributions
 seeking-0.064
 pad-0.055
 anyway-0.053
 hiding-0.053
 pretending-0.052
Token she
Token ID673
Feature activation-1.04e-02

Pos logit contributions
ug+0.052
aw+0.050
ww+0.048
oo+0.048
ister+0.046
Neg logit contributions
 seeking-0.043
 wonders-0.035
 pad-0.034
 pretending-0.032
epad-0.031
Token ever
Token ID1683
Feature activation-5.71e-04

Pos logit contributions
ug+0.019
aw+0.019
ww+0.018
dd+0.018
oo+0.018
Neg logit contributions
 seeking-0.018
 fault-0.014
 pad-0.014
 lair-0.014
 former-0.014
Token.
Token ID13
Feature activation-8.40e-03

Pos logit contributions
ug+0.014
aw+0.013
oo+0.012
apy+0.012
ww+0.012
Neg logit contributions
 seeking-0.013
 pad-0.010
 pretending-0.010
 wonders-0.010
epad-0.010
Token It
Token ID632
Feature activation+8.42e-03

Pos logit contributions
ug+0.014
aw+0.013
oo+0.013
apy+0.013
ww+0.013
Neg logit contributions
 seeking-0.017
 pad-0.014
 hiding-0.013
epad-0.013
 pretending-0.013
Token was
Token ID373
Feature activation+6.10e-03

Pos logit contributions
 seeking+0.016
 pad+0.014
epad+0.013
 supposed+0.012
 pretending+0.012
Neg logit contributions
ug-0.015
aw-0.014
ww-0.014
apy-0.014
oo-0.013
Token even
Token ID772
Feature activation+1.94e-02

Pos logit contributions
 seeking+0.011
 pad+0.009
epad+0.009
 wonders+0.009
 lair+0.008
Neg logit contributions
ug-0.011
aw-0.010
oo-0.010
ww-0.010
apy-0.010
Token y
Token ID331
Feature activation-3.88e-03

Pos logit contributions
 seeking+0.033
 pad+0.027
epad+0.024
 supposed+0.023
 wonders+0.023
Neg logit contributions
ug-0.038
aw-0.037
ww-0.035
apy-0.034
 cho-0.033
Tokenumm
Token ID13929
Feature activation-9.02e-02

Pos logit contributions
ug+0.005
opter+0.004
oo+0.004
aw+0.004
ister+0.004
Neg logit contributions
 seeking-0.009
 pad-0.008
 wonders-0.007
 hiding-0.007
 supposed-0.007
Tokenier
Token ID959
Feature activation-1.01e-02

Pos logit contributions
ug+0.128
aw+0.110
oo+0.106
ww+0.106
apy+0.101
Neg logit contributions
 seeking-0.213
 pad-0.186
 wonders-0.176
 pretending-0.174
 hiding-0.174
Token than
Token ID621
Feature activation-4.26e-02

Pos logit contributions
ug+0.017
aw+0.016
ww+0.015
oo+0.015
dd+0.014
Neg logit contributions
 seeking-0.019
 pad-0.016
 wonders-0.016
epad-0.015
 suppose-0.015
Token the
Token ID262
Feature activation-6.75e-04

Pos logit contributions
ug+0.078
aw+0.072
apy+0.068
oo+0.068
ww+0.068
Neg logit contributions
 seeking-0.076
 pad-0.063
 wonders-0.062
 lair-0.061
 former-0.057
Token first
Token ID717
Feature activation+1.04e-02

Pos logit contributions
aw+0.001
ww+0.001
ug+0.001
oo+0.001
 cho+0.001
Neg logit contributions
 seeking-0.001
 wonders-0.001
asured-0.001
 pad-0.001
 anyways-0.001
Token card
Token ID2657
Feature activation+1.59e-02

Pos logit contributions
 seeking+0.019
 pad+0.015
 wonders+0.015
 pretending+0.014
epad+0.014
Neg logit contributions
ug-0.020
aw-0.019
oo-0.018
ww-0.018
opter-0.017
Token it
Token ID340
Feature activation-1.40e-02

Pos logit contributions
 seeking+0.033
 pad+0.030
epad+0.028
 supposed+0.027
 intended+0.026
Neg logit contributions
ug-0.021
aw-0.020
oo-0.018
apy-0.018
ww-0.017
Token outside
Token ID2354
Feature activation+2.15e-02

Pos logit contributions
ug+0.021
aw+0.019
oo+0.017
time+0.016
ww+0.016
Neg logit contributions
 seeking-0.028
epad-0.026
 former-0.025
 supposed-0.024
 wonders-0.024
Token and
Token ID290
Feature activation-6.26e-03

Pos logit contributions
 seeking+0.040
 pad+0.034
 wonders+0.032
epad+0.032
 supposed+0.031
Neg logit contributions
ug-0.040
aw-0.037
oo-0.035
ister-0.034
apy-0.033
Token a
Token ID257
Feature activation-1.36e-02

Pos logit contributions
aw+0.010
ug+0.010
apy+0.009
 cho+0.009
ww+0.008
Neg logit contributions
 seeking-0.011
 pad-0.011
epad-0.010
 supposed-0.009
 fault-0.009
Token big
Token ID1263
Feature activation-4.06e-02

Pos logit contributions
ww+0.022
ug+0.021
apy+0.020
 cho+0.020
aw+0.020
Neg logit contributions
 seeking-0.026
asured-0.022
 supposed-0.022
 wonders-0.021
 counts-0.021
Token wind
Token ID2344
Feature activation-9.01e-02

Pos logit contributions
ug+0.073
aw+0.068
ww+0.066
oo+0.063
ister+0.062
Neg logit contributions
 seeking-0.077
 pad-0.062
 supposed-0.062
epad-0.061
 wonders-0.060
Token blew
Token ID17948
Feature activation-1.64e-02

Pos logit contributions
ug+0.196
aw+0.179
oo+0.177
ww+0.173
apy+0.167
Neg logit contributions
 seeking-0.145
 pad-0.116
 wonders-0.111
epad-0.104
 pretending-0.104
Token it
Token ID340
Feature activation-8.19e-03

Pos logit contributions
ug+0.033
aw+0.030
oo+0.029
ww+0.029
apy+0.028
Neg logit contributions
 seeking-0.030
 pad-0.024
 wonders-0.023
 pretending-0.022
epad-0.022
Token away
Token ID1497
Feature activation-1.52e-02

Pos logit contributions
ug+0.015
aw+0.013
oo+0.012
apy+0.012
 cho+0.012
Neg logit contributions
 seeking-0.016
 pad-0.013
 wonders-0.013
 forgetting-0.013
 supposed-0.013
Token.
Token ID13
Feature activation-1.46e-02

Pos logit contributions
ug+0.030
aw+0.028
oo+0.026
apy+0.026
ww+0.025
Neg logit contributions
 seeking-0.027
 pad-0.022
 wonders-0.021
epad-0.020
 supposed-0.020
Token Anna
Token ID11735
Feature activation+1.28e-02

Pos logit contributions
ug+0.028
aw+0.026
oo+0.025
ww+0.025
apy+0.024
Neg logit contributions
 seeking-0.027
 pad-0.023
epad-0.021
 wonders-0.021
 supposed-0.021
Token Like
Token ID4525
Feature activation+3.97e-03

Pos logit contributions
ug+0.021
aw+0.019
oo+0.018
ww+0.018
apy+0.017
Neg logit contributions
 seeking-0.017
 pad-0.014
 wonders-0.013
 hiding-0.013
epad-0.012
Token shells
Token ID19679
Feature activation-4.43e-02

Pos logit contributions
 seeking+0.008
 pad+0.007
 wonders+0.006
epad+0.006
 supposed+0.006
Neg logit contributions
ug-0.007
aw-0.006
oo-0.006
ww-0.006
apy-0.006
Token and
Token ID290
Feature activation-2.91e-03

Pos logit contributions
ug+0.074
aw+0.066
oo+0.063
ww+0.063
ister+0.060
Neg logit contributions
 seeking-0.094
 pad-0.079
 wonders-0.077
epad-0.074
 pretending-0.074
Token rocks
Token ID12586
Feature activation-6.19e-02

Pos logit contributions
ug+0.005
aw+0.004
ww+0.004
apy+0.004
opter+0.004
Neg logit contributions
 seeking-0.006
 pad-0.005
 supposed-0.005
epad-0.005
 anyway-0.005
Token and
Token ID290
Feature activation-2.31e-03

Pos logit contributions
ug+0.124
aw+0.114
oo+0.109
ww+0.108
ister+0.104
Neg logit contributions
 seeking-0.111
 pad-0.089
 wonders-0.088
 pretending-0.083
epad-0.083
Token seaw
Token ID32085
Feature activation-9.00e-02

Pos logit contributions
ug+0.004
aw+0.004
ww+0.003
apy+0.003
 cho+0.003
Neg logit contributions
 seeking-0.005
 pad-0.004
 supposed-0.004
epad-0.004
 forgetting-0.004
Tokeneed
Token ID2308
Feature activation-7.13e-02

Pos logit contributions
ug+0.169
oo+0.152
ww+0.150
apy+0.150
ister+0.147
Neg logit contributions
 seeking-0.164
 pad-0.147
 wonders-0.134
 hiding-0.127
epad-0.125
Token."
Token ID526
Feature activation-4.56e-04

Pos logit contributions
ug+0.138
aw+0.132
oo+0.119
ww+0.119
ister+0.115
Neg logit contributions
 seeking-0.129
 forgetting-0.100
 wonders-0.100
 pad-0.098
 supposed-0.097
Token
Token ID198
Feature activation-1.38e-02

Pos logit contributions
ug+0.001
aw+0.001
oo+0.001
ww+0.001
apy+0.001
Neg logit contributions
 seeking-0.001
 pad-0.001
 wonders-0.001
 pretending-0.001
epad-0.001
Token
Token ID198
Feature activation-1.40e-02

Pos logit contributions
ug+0.030
aw+0.027
oo+0.027
ww+0.027
apy+0.026
Neg logit contributions
 seeking-0.022
 pad-0.018
 wonders-0.017
 pretending-0.016
 hiding-0.016
TokenM
Token ID44
Feature activation-1.29e-02

Pos logit contributions
ug+0.021
aw+0.018
ww+0.018
oo+0.018
ister+0.017
Neg logit contributions
 seeking-0.033
 pad-0.028
 wonders-0.027
 pretending-0.027
epad-0.026
Token shy
Token ID15800
Feature activation+2.96e-02

Pos logit contributions
 seeking+0.025
 pad+0.019
 wonders+0.018
 pretending+0.017
 hiding+0.017
Neg logit contributions
ug-0.039
aw-0.036
oo-0.035
ww-0.035
apy-0.034
Token.
Token ID13
Feature activation+7.84e-03

Pos logit contributions
 seeking+0.058
 former+0.050
 rightful+0.044
epad+0.044
 least+0.043
Neg logit contributions
aw-0.053
ug-0.052
oo-0.045
ww-0.043
ing-0.041
Token The
Token ID383
Feature activation+1.07e-02

Pos logit contributions
 seeking+0.012
 pad+0.010
 wonders+0.009
 pretending+0.009
 hiding+0.009
Neg logit contributions
ug-0.018
aw-0.016
oo-0.016
ww-0.016
apy-0.015
Token big
Token ID1263
Feature activation-1.55e-02

Pos logit contributions
 seeking+0.025
 pad+0.021
epad+0.020
 wonders+0.020
 supposed+0.020
Neg logit contributions
ug-0.015
aw-0.014
ww-0.014
oo-0.013
ister-0.013
Token bird
Token ID6512
Feature activation+1.99e-02

Pos logit contributions
ug+0.028
aw+0.025
oo+0.025
ww+0.025
apy+0.024
Neg logit contributions
 seeking-0.030
 pad-0.026
 wonders-0.025
 hiding-0.024
 pretending-0.024
Token smiled
Token ID13541
Feature activation+5.01e-02

Pos logit contributions
 seeking+0.044
epad+0.038
 pad+0.037
 supposed+0.037
 former+0.036
Neg logit contributions
ug-0.030
aw-0.029
oo-0.026
ww-0.025
apy-0.025
Token and
Token ID290
Feature activation-4.83e-03

Pos logit contributions
 seeking+0.107
 pad+0.081
epad+0.080
 supposed+0.080
 intended+0.078
Neg logit contributions
ug-0.086
aw-0.086
oo-0.076
ww-0.070
apy-0.070
Token said
Token ID531
Feature activation-2.01e-03

Pos logit contributions
ug+0.009
aw+0.008
ww+0.008
oo+0.008
apy+0.008
Neg logit contributions
 seeking-0.009
 pad-0.008
epad-0.007
 wonders-0.007
 supposed-0.007
Token,
Token ID11
Feature activation+3.72e-03

Pos logit contributions
ug+0.004
ww+0.003
aw+0.003
oo+0.003
opter+0.003
Neg logit contributions
 seeking-0.004
 hiding-0.003
 pad-0.003
 pretending-0.003
 anyway-0.003
Token "
Token ID366
Feature activation+7.58e-03

Pos logit contributions
 seeking+0.006
 pad+0.005
 wonders+0.005
epad+0.005
 supposed+0.005
Neg logit contributions
ug-0.007
aw-0.007
oo-0.007
ww-0.007
apy-0.006
TokenDo
Token ID5211
Feature activation-1.31e-02

Pos logit contributions
 seeking+0.015
 pad+0.012
 wonders+0.011
 pretending+0.011
 hiding+0.011
Neg logit contributions
ug-0.014
aw-0.013
oo-0.013
ww-0.013
apy-0.012
Token The
Token ID383
Feature activation+1.48e-02

Pos logit contributions
 seeking+0.045
 pad+0.036
 wonders+0.035
 pretending+0.033
 anyway+0.032
Neg logit contributions
ug-0.055
aw-0.051
ww-0.050
oo-0.050
dd-0.049
Token fire
Token ID2046
Feature activation+3.95e-03

Pos logit contributions
 seeking+0.033
 pad+0.029
epad+0.026
 supposed+0.026
 forgetting+0.025
Neg logit contributions
ug-0.021
aw-0.020
ister-0.019
oo-0.019
apy-0.018
Tokenman
Token ID805
Feature activation+1.78e-02

Pos logit contributions
 seeking+0.007
 pad+0.006
 wonders+0.006
epad+0.005
 pretending+0.005
Neg logit contributions
ug-0.008
aw-0.007
oo-0.007
ww-0.007
ister-0.007
Token truck
Token ID7779
Feature activation+1.56e-02

Pos logit contributions
 seeking+0.030
 pad+0.024
epad+0.024
 wonders+0.023
 forgetting+0.022
Neg logit contributions
ug-0.037
aw-0.035
oo-0.034
ww-0.033
ister-0.032
Token is
Token ID318
Feature activation+9.60e-04

Pos logit contributions
 seeking+0.028
 pad+0.023
epad+0.022
 wonders+0.022
 forgetting+0.021
Neg logit contributions
ug-0.031
aw-0.029
oo-0.028
ww-0.027
ister-0.027
Token here
Token ID994
Feature activation+4.68e-02

Pos logit contributions
 seeking+0.002
 pad+0.001
 wonders+0.001
epad+0.001
 former+0.001
Neg logit contributions
ug-0.002
oo-0.002
 cho-0.002
aw-0.002
ww-0.002
Token because
Token ID780
Feature activation+7.85e-03

Pos logit contributions
 seeking+0.089
 pad+0.073
 wonders+0.070
epad+0.068
 forgetting+0.068
Neg logit contributions
ug-0.082
oo-0.075
aw-0.071
ister-0.069
ww-0.069
Token there
Token ID612
Feature activation+3.81e-02

Pos logit contributions
 seeking+0.016
 wonders+0.013
 pad+0.013
 pretending+0.012
 hiding+0.012
Neg logit contributions
ug-0.014
aw-0.013
oo-0.012
ww-0.012
opter-0.012
Token is
Token ID318
Feature activation-7.52e-03

Pos logit contributions
 seeking+0.069
 pad+0.057
 wonders+0.053
epad+0.052
 pretending+0.052
Neg logit contributions
ug-0.075
aw-0.068
oo-0.066
ww-0.066
apy-0.063
Token a
Token ID257
Feature activation-4.05e-03

Pos logit contributions
ug+0.014
aw+0.013
oo+0.013
ww+0.012
 cho+0.012
Neg logit contributions
 seeking-0.014
 pad-0.012
epad-0.011
 wonders-0.011
 forgetting-0.010
Token problem
Token ID1917
Feature activation-1.96e-02

Pos logit contributions
ug+0.007
ww+0.006
aw+0.006
oo+0.006
ister+0.005
Neg logit contributions
 seeking-0.009
 pad-0.007
 wonders-0.007
epad-0.007
 pretending-0.007
Token.
Token ID13
Feature activation-8.37e-03

Pos logit contributions
aw+0.003
ug+0.003
oo+0.003
ww+0.002
 cho+0.002
Neg logit contributions
 seeking-0.003
 pad-0.002
epad-0.002
 supposed-0.002
 Deal-0.002
Token She
Token ID1375
Feature activation-4.38e-03

Pos logit contributions
ug+0.015
aw+0.014
oo+0.013
ww+0.013
apy+0.013
Neg logit contributions
 seeking-0.017
epad-0.013
 pad-0.013
 supposed-0.013
 hiding-0.013
Token wished
Token ID16555
Feature activation-7.86e-04

Pos logit contributions
ug+0.008
aw+0.008
oo+0.007
ww+0.007
apy+0.007
Neg logit contributions
 seeking-0.008
epad-0.007
 pad-0.007
 supposed-0.007
 lair-0.006
Token that
Token ID326
Feature activation-1.64e-02

Pos logit contributions
ug+0.002
aw+0.001
oo+0.001
ww+0.001
apy+0.001
Neg logit contributions
 seeking-0.001
 pad-0.001
 wonders-0.001
epad-0.001
 pretending-0.001
Token her
Token ID607
Feature activation+3.20e-02

Pos logit contributions
ug+0.028
aw+0.025
oo+0.024
ww+0.024
apy+0.023
Neg logit contributions
 seeking-0.032
 pad-0.029
 wonders-0.025
epad-0.025
 intended-0.025
Token teacher
Token ID4701
Feature activation+4.28e-02

Pos logit contributions
 seeking+0.059
 pad+0.054
epad+0.050
 supposed+0.048
 lair+0.047
Neg logit contributions
ug-0.056
oo-0.051
aw-0.049
apy-0.049
ww-0.049
Token would
Token ID561
Feature activation-1.61e-02

Pos logit contributions
 seeking+0.079
 pad+0.062
 lair+0.062
 former+0.062
epad+0.061
Neg logit contributions
ug-0.075
aw-0.073
oo-0.068
ww-0.066
ister-0.065
Token love
Token ID1842
Feature activation+1.69e-02

Pos logit contributions
ug+0.030
oo+0.029
ister+0.028
ww+0.026
aw+0.025
Neg logit contributions
 seeking-0.028
 pad-0.025
 pretending-0.022
 lair-0.022
 supposed-0.021
Token to
Token ID284
Feature activation-1.30e-02

Pos logit contributions
 seeking+0.031
 pad+0.027
 lair+0.025
epad+0.025
 former+0.024
Neg logit contributions
ug-0.030
aw-0.029
oo-0.028
ww-0.026
ister-0.026
Token dance
Token ID9280
Feature activation+2.54e-02

Pos logit contributions
ug+0.027
oo+0.025
ister+0.024
aw+0.023
ww+0.023
Neg logit contributions
 seeking-0.021
 pad-0.019
 lair-0.016
 supposed-0.016
 wonders-0.016
Token as
Token ID355
Feature activation-4.10e-02

Pos logit contributions
 seeking+0.048
 pad+0.041
 supposed+0.039
 wonders+0.038
epad+0.037
Neg logit contributions
ug-0.047
aw-0.044
oo-0.041
ww-0.040
ister-0.039
Token Ben
Token ID3932
Feature activation-2.64e-02

Pos logit contributions
ug+0.001
aw+0.001
oo+0.001
ww+0.001
apy+0.001
Neg logit contributions
 seeking-0.001
 pad-0.001
 wonders-0.001
 pretending-0.001
 hiding-0.001
Token had
Token ID550
Feature activation+4.88e-03

Pos logit contributions
ug+0.064
aw+0.058
oo+0.057
ww+0.057
apy+0.055
Neg logit contributions
 seeking-0.037
 pad-0.029
 wonders-0.028
 hiding-0.026
 pretending-0.026
Token a
Token ID257
Feature activation+2.28e-02

Pos logit contributions
 seeking+0.007
 pad+0.005
 wonders+0.005
 pretending+0.005
 hiding+0.005
Neg logit contributions
ug-0.012
aw-0.011
oo-0.010
ww-0.010
ister-0.010
Token red
Token ID2266
Feature activation+8.24e-04

Pos logit contributions
 seeking+0.039
 pad+0.031
 wonders+0.030
epad+0.029
 forgetting+0.028
Neg logit contributions
ug-0.046
ww-0.042
aw-0.042
oo-0.042
apy-0.040
Token shirt
Token ID10147
Feature activation-3.73e-02

Pos logit contributions
 seeking+0.001
 pad+0.001
 hiding+0.001
 wonders+0.001
 pretending+0.001
Neg logit contributions
ug-0.002
aw-0.002
ww-0.002
dd-0.002
opter-0.002
Token with
Token ID351
Feature activation+4.63e-02

Pos logit contributions
ug+0.058
aw+0.052
oo+0.051
ww+0.049
apy+0.047
Neg logit contributions
 seeking-0.083
 pad-0.071
 wonders-0.069
epad-0.066
 pretending-0.065
Token a
Token ID257
Feature activation+7.89e-03

Pos logit contributions
 seeking+0.067
 pad+0.056
 wonders+0.054
 pretending+0.051
 hiding+0.049
Neg logit contributions
ug-0.106
oo-0.095
ister-0.094
ww-0.093
aw-0.093
Token big
Token ID1263
Feature activation-2.38e-02

Pos logit contributions
 seeking+0.014
 pad+0.012
 wonders+0.011
 pretending+0.011
 hiding+0.011
Neg logit contributions
ug-0.015
aw-0.014
oo-0.014
ww-0.014
apy-0.013
Token button
Token ID4936
Feature activation-1.27e-02

Pos logit contributions
ug+0.050
aw+0.045
oo+0.044
ww+0.043
dd+0.042
Neg logit contributions
 seeking-0.041
 pad-0.036
 wonders-0.033
 pretending-0.032
 hiding-0.032
Token on
Token ID319
Feature activation-1.04e-02

Pos logit contributions
ug+0.027
aw+0.024
oo+0.024
ww+0.024
apy+0.023
Neg logit contributions
 seeking-0.021
 pad-0.017
 wonders-0.016
 pretending-0.016
 hiding-0.015
Token the
Token ID262
Feature activation+1.46e-03

Pos logit contributions
ug+0.022
ww+0.020
aw+0.019
oo+0.019
dd+0.019
Neg logit contributions
 seeking-0.017
 wonders-0.015
 pad-0.014
 pretending-0.014
 anyway-0.014
Token good
Token ID922
Feature activation-3.49e-02

Pos logit contributions
ww+0.001
ug+0.001
aw+0.001
ister+0.001
 cho+0.001
Neg logit contributions
 seeking-0.001
 pad-0.001
 wonders-0.000
asured-0.000
epad-0.000
Token artist
Token ID6802
Feature activation-2.12e-03

Pos logit contributions
ug+0.065
aw+0.061
oo+0.059
ww+0.059
ister+0.056
Neg logit contributions
 seeking-0.067
 pad-0.052
 wonders-0.050
 forgetting-0.049
 lair-0.048
Token.
Token ID13
Feature activation-1.67e-03

Pos logit contributions
ug+0.004
aw+0.004
oo+0.004
ister+0.003
ww+0.003
Neg logit contributions
 seeking-0.004
 wonders-0.003
 former-0.003
 intended-0.003
 pad-0.003
Token
Token ID198
Feature activation+1.05e-04

Pos logit contributions
ug+0.003
aw+0.003
oo+0.003
ww+0.003
apy+0.003
Neg logit contributions
 seeking-0.003
 pad-0.002
 wonders-0.002
epad-0.002
 pretending-0.002
Token
Token ID198
Feature activation-4.28e-03

Pos logit contributions
 seeking+0.000
 pad+0.000
 wonders+0.000
 hiding+0.000
 pretending+0.000
Neg logit contributions
ug-0.000
aw-0.000
ww-0.000
oo-0.000
opter-0.000
TokenOne
Token ID3198
Feature activation+5.80e-02

Pos logit contributions
ug+0.011
aw+0.010
oo+0.010
ww+0.010
opter+0.010
Neg logit contributions
 seeking-0.005
 pad-0.004
 wonders-0.004
 anyway-0.003
 hiding-0.003
Token day
Token ID1110
Feature activation-1.33e-02

Pos logit contributions
 seeking+0.092
 pad+0.073
 wonders+0.072
 pretending+0.069
 hiding+0.068
Neg logit contributions
ug-0.127
aw-0.117
oo-0.110
opter-0.110
ww-0.110
Token,
Token ID11
Feature activation-1.11e-02

Pos logit contributions
ug+0.029
aw+0.026
oo+0.024
ister+0.024
opter+0.024
Neg logit contributions
 seeking-0.022
 pretending-0.019
 hiding-0.019
 first-0.018
 learning-0.017
Token Anna
Token ID11735
Feature activation+2.31e-02

Pos logit contributions
ug+0.024
aw+0.021
oo+0.021
ww+0.021
opter+0.020
Neg logit contributions
 seeking-0.019
 pad-0.015
 wonders-0.014
 pretending-0.014
 hiding-0.014
Token wanted
Token ID2227
Feature activation+2.79e-03

Pos logit contributions
 seeking+0.043
epad+0.033
 pad+0.033
 forgetting+0.032
 former+0.032
Neg logit contributions
ug-0.043
aw-0.043
oo-0.037
ww-0.037
opter-0.036
Token to
Token ID284
Feature activation+1.82e-02

Pos logit contributions
 seeking+0.005
 pad+0.004
 slave+0.003
 former+0.003
 intended+0.003
Neg logit contributions
ug-0.005
aw-0.005
oo-0.005
apy-0.005
ister-0.005
Token friend
Token ID1545
Feature activation+3.18e-02

Pos logit contributions
aw+0.007
ug+0.007
ww+0.007
oo+0.006
ister+0.006
Neg logit contributions
 seeking-0.008
epad-0.006
 pad-0.006
 forgetting-0.006
 wonders-0.006
Token of
Token ID286
Feature activation-7.01e-03

Pos logit contributions
 seeking+0.058
 wonders+0.048
 former+0.047
 pad+0.046
epad+0.045
Neg logit contributions
ug-0.061
aw-0.058
oo-0.055
ww-0.052
ister-0.051
Token the
Token ID262
Feature activation-4.33e-03

Pos logit contributions
ug+0.012
aw+0.011
oo+0.010
ww+0.010
ister+0.009
Neg logit contributions
 seeking-0.015
 pad-0.013
 wonders-0.012
epad-0.012
 supposed-0.012
Token jungle
Token ID20712
Feature activation+1.26e-02

Pos logit contributions
aw+0.007
ug+0.007
ww+0.007
oo+0.006
ister+0.006
Neg logit contributions
 seeking-0.008
epad-0.007
 supposed-0.007
 pad-0.007
 anyway-0.007
Token.
Token ID13
Feature activation+1.34e-03

Pos logit contributions
 seeking+0.023
 wonders+0.019
 former+0.019
 Deal+0.018
 supposed+0.018
Neg logit contributions
aw-0.024
ug-0.023
ister-0.020
oo-0.020
 cho-0.020
Token
Token ID198
Feature activation-4.49e-03

Pos logit contributions
 seeking+0.002
 wonders+0.001
 pad+0.001
 anyway+0.001
 pretending+0.001
Neg logit contributions
ug-0.003
ww-0.003
oo-0.003
aw-0.003
opter-0.003
Token
Token ID198
Feature activation-8.58e-03

Pos logit contributions
ug+0.011
aw+0.010
ww+0.010
oo+0.010
opter+0.010
Neg logit contributions
 seeking-0.005
 wonders-0.004
 pad-0.004
 anyway-0.004
 hiding-0.003
TokenThey
Token ID2990
Feature activation-1.19e-02

Pos logit contributions
ug+0.016
aw+0.014
olk+0.014
oo+0.014
opter+0.014
Neg logit contributions
 seeking-0.015
 wonders-0.012
 pad-0.012
 learning-0.011
 first-0.011
Token wanted
Token ID2227
Feature activation+2.77e-03

Pos logit contributions
ug+0.023
aw+0.022
oo+0.021
ww+0.020
apy+0.020
Neg logit contributions
 seeking-0.022
 pad-0.018
 wonders-0.017
epad-0.016
 pretending-0.016
Token to
Token ID284
Feature activation+1.52e-02

Pos logit contributions
 seeking+0.004
 pad+0.003
epad+0.003
 wonders+0.003
 pretending+0.003
Neg logit contributions
ug-0.006
aw-0.006
oo-0.006
ww-0.006
apy-0.005
Token make
Token ID787
Feature activation-1.30e-02

Pos logit contributions
 seeking+0.026
 pad+0.023
epad+0.020
 forgetting+0.020
 supposed+0.020
Neg logit contributions
ug-0.029
oo-0.028
ister-0.027
aw-0.026
apy-0.026
Token ran
Token ID4966
Feature activation-1.52e-02

Pos logit contributions
 seeking+0.009
 pad+0.008
epad+0.007
 supposed+0.007
 wonders+0.007
Neg logit contributions
ug-0.009
aw-0.009
apy-0.008
ww-0.008
oo-0.008
Token towards
Token ID3371
Feature activation-7.61e-03

Pos logit contributions
ug+0.028
aw+0.026
oo+0.026
ww+0.024
 cho+0.023
Neg logit contributions
 seeking-0.027
epad-0.023
 pad-0.023
 wonders-0.023
 former-0.023
Token it
Token ID340
Feature activation-1.61e-02

Pos logit contributions
ug+0.012
aw+0.011
oo+0.011
ww+0.010
apy+0.010
Neg logit contributions
 seeking-0.016
 pad-0.014
epad-0.014
 wonders-0.013
 supposed-0.013
Token.
Token ID13
Feature activation-3.18e-03

Pos logit contributions
ug+0.022
aw+0.020
oo+0.019
ister+0.017
time+0.016
Neg logit contributions
 seeking-0.034
epad-0.031
 Deal-0.030
 supposed-0.030
 imag-0.029
Token She
Token ID1375
Feature activation+1.37e-03

Pos logit contributions
ug+0.006
aw+0.005
oo+0.005
apy+0.005
ww+0.005
Neg logit contributions
 seeking-0.006
 pad-0.005
epad-0.005
 supposed-0.005
 wonders-0.005
Token climbed
Token ID19952
Feature activation+7.99e-04

Pos logit contributions
 seeking+0.003
epad+0.002
 pad+0.002
 supposed+0.002
 fault+0.002
Neg logit contributions
ug-0.002
aw-0.002
oo-0.002
apy-0.002
ww-0.002
Token up
Token ID510
Feature activation-4.60e-03

Pos logit contributions
 seeking+0.002
 former+0.001
epad+0.001
 wonders+0.001
 pad+0.001
Neg logit contributions
ug-0.001
aw-0.001
oo-0.001
ww-0.001
 cho-0.001
Token and
Token ID290
Feature activation+1.78e-02

Pos logit contributions
ug+0.007
aw+0.007
oo+0.006
apy+0.006
ister+0.006
Neg logit contributions
 seeking-0.010
 pad-0.009
epad-0.008
 supposed-0.008
 wonders-0.008
Token slid
Token ID27803
Feature activation+8.30e-03

Pos logit contributions
 seeking+0.035
epad+0.029
 former+0.029
 pad+0.029
 supposed+0.029
Neg logit contributions
ug-0.029
aw-0.029
apy-0.026
ww-0.026
oo-0.025
Token down
Token ID866
Feature activation-2.83e-02

Pos logit contributions
 seeking+0.016
epad+0.014
 former+0.013
 supposed+0.013
 wonders+0.013
Neg logit contributions
ug-0.015
aw-0.014
oo-0.013
ww-0.013
apy-0.012
Token with
Token ID351
Feature activation+1.98e-02

Pos logit contributions
ug+0.043
aw+0.042
apy+0.036
ister+0.036
oo+0.035
Neg logit contributions
 seeking-0.060
 pad-0.055
epad-0.053
 former-0.051
asured-0.051
Token came
Token ID1625
Feature activation-9.03e-04

Pos logit contributions
ug+0.002
aw+0.002
oo+0.002
ww+0.002
apy+0.002
Neg logit contributions
 seeking-0.002
 pad-0.002
 wonders-0.002
 pretending-0.002
 hiding-0.002
Token out
Token ID503
Feature activation-5.44e-03

Pos logit contributions
ug+0.002
oo+0.002
aw+0.002
 cho+0.001
ww+0.001
Neg logit contributions
 seeking-0.002
 pad-0.001
 former-0.001
epad-0.001
 wonders-0.001
Token.
Token ID13
Feature activation+2.14e-03

Pos logit contributions
ug+0.008
oo+0.007
aw+0.007
apy+0.006
ister+0.006
Neg logit contributions
 seeking-0.012
 pad-0.012
epad-0.010
 former-0.010
 fault-0.010
Token He
Token ID679
Feature activation-2.24e-02

Pos logit contributions
 seeking+0.003
 pad+0.003
 wonders+0.003
 pretending+0.003
 hiding+0.003
Neg logit contributions
ug-0.005
ww-0.004
oo-0.004
aw-0.004
opter-0.004
Token smiled
Token ID13541
Feature activation+3.98e-02

Pos logit contributions
ug+0.045
aw+0.042
oo+0.041
ww+0.040
apy+0.039
Neg logit contributions
 seeking-0.039
 pad-0.032
epad-0.030
 wonders-0.029
 pretending-0.028
Token and
Token ID290
Feature activation-3.17e-03

Pos logit contributions
 seeking+0.077
 pad+0.060
epad+0.058
 supposed+0.058
 wonders+0.056
Neg logit contributions
ug-0.075
aw-0.072
oo-0.066
ww-0.063
ister-0.062
Token said
Token ID531
Feature activation-8.55e-03

Pos logit contributions
ug+0.007
aw+0.006
oo+0.006
ww+0.006
opter+0.006
Neg logit contributions
 seeking-0.005
 pad-0.004
 wonders-0.004
 pretending-0.004
 hiding-0.004
Token,
Token ID11
Feature activation+9.55e-03

Pos logit contributions
ug+0.017
ww+0.016
olk+0.015
opter+0.015
oo+0.014
Neg logit contributions
 seeking-0.014
 hiding-0.012
 pretending-0.012
 anyway-0.012
 learning-0.011
Token "
Token ID366
Feature activation-1.50e-03

Pos logit contributions
 seeking+0.016
 pad+0.013
 wonders+0.012
epad+0.012
 hiding+0.011
Neg logit contributions
ug-0.020
aw-0.018
oo-0.018
ww-0.017
apy-0.017
TokenWelcome
Token ID14618
Feature activation-1.19e-02

Pos logit contributions
ug+0.003
oo+0.003
aw+0.003
apy+0.003
ww+0.003
Neg logit contributions
 seeking-0.003
 pad-0.002
 wonders-0.002
 anyway-0.002
 forgetting-0.002
Token,
Token ID11
Feature activation-1.39e-02

Pos logit contributions
ug+0.021
aw+0.019
oo+0.019
ww+0.019
ister+0.018
Neg logit contributions
 seeking-0.024
 pad-0.020
 wonders-0.019
epad-0.019
 pretending-0.018
Token the
Token ID262
Feature activation-2.38e-04

Pos logit contributions
ug+0.005
aw+0.004
oo+0.004
ww+0.004
ister+0.004
Neg logit contributions
 seeking-0.005
 pad-0.004
 wonders-0.004
 pretending-0.004
 hiding-0.004
Token paper
Token ID3348
Feature activation-9.90e-03

Pos logit contributions
ug+0.000
aw+0.000
ww+0.000
oo+0.000
 cho+0.000
Neg logit contributions
 seeking-0.000
 wonders-0.000
asured-0.000
 suppose-0.000
 supposed-0.000
Token and
Token ID290
Feature activation-3.27e-03

Pos logit contributions
ug+0.018
aw+0.017
oo+0.016
ww+0.016
ister+0.015
Neg logit contributions
 seeking-0.019
 wonders-0.015
 pretending-0.015
 pad-0.015
 forgetting-0.015
Token was
Token ID373
Feature activation-3.35e-02

Pos logit contributions
ug+0.006
aw+0.006
apy+0.006
ww+0.006
oo+0.005
Neg logit contributions
 seeking-0.006
 pad-0.005
 supposed-0.004
 anyway-0.004
epad-0.004
Token so
Token ID523
Feature activation-2.33e-02

Pos logit contributions
aw+0.062
ug+0.061
oo+0.058
apy+0.056
 cho+0.056
Neg logit contributions
 seeking-0.058
 pad-0.048
 intended-0.044
 former-0.043
 wonders-0.042
Token proud
Token ID6613
Feature activation+1.81e-02

Pos logit contributions
ug+0.047
apy+0.044
aw+0.043
oo+0.042
ww+0.041
Neg logit contributions
 seeking-0.042
 pad-0.037
 hiding-0.031
 least-0.031
 lair-0.030
Token of
Token ID286
Feature activation-2.75e-02

Pos logit contributions
 seeking+0.047
 pad+0.041
 wonders+0.039
epad+0.038
 pretending+0.038
Neg logit contributions
ug-0.023
aw-0.020
ww-0.018
oo-0.018
ister-0.017
Token himself
Token ID2241
Feature activation-3.71e-02

Pos logit contributions
ug+0.053
oo+0.048
aw+0.048
ister+0.045
ww+0.045
Neg logit contributions
 seeking-0.051
 pad-0.045
epad-0.040
 wonders-0.040
 lair-0.038
Token.
Token ID13
Feature activation-1.19e-02

Pos logit contributions
ug+0.072
oo+0.065
aw+0.065
ww+0.062
ister+0.061
Neg logit contributions
 seeking-0.069
 wonders-0.056
 pad-0.054
 pretending-0.052
 lair-0.051
Token
Token ID198
Feature activation-1.58e-02

Pos logit contributions
ug+0.023
aw+0.021
oo+0.020
ww+0.020
apy+0.020
Neg logit contributions
 seeking-0.022
 pad-0.018
 wonders-0.017
 hiding-0.017
epad-0.017
Token
Token ID198
Feature activation-1.74e-02

Pos logit contributions
ug+0.034
aw+0.031
ww+0.031
oo+0.031
apy+0.029
Neg logit contributions
 seeking-0.025
 pad-0.020
 wonders-0.019
 pretending-0.018
 hiding-0.018
Token could
Token ID714
Feature activation-2.65e-02

Pos logit contributions
ug+0.001
aw+0.001
oo+0.000
ww+0.000
apy+0.000
Neg logit contributions
 seeking-0.000
 pad-0.000
 wonders-0.000
 pretending-0.000
 hiding-0.000
Token such
Token ID884
Feature activation-1.66e-02

Pos logit contributions
ug+0.050
aw+0.046
oo+0.044
ww+0.044
apy+0.042
Neg logit contributions
 seeking-0.050
 pad-0.041
 wonders-0.040
 pretending-0.038
 hiding-0.038
Token a
Token ID257
Feature activation+4.49e-03

Pos logit contributions
ug+0.025
aw+0.023
apy+0.022
ww+0.022
dd+0.021
Neg logit contributions
 seeking-0.038
 pad-0.030
 forgetting-0.029
isk-0.028
epad-0.028
Token magical
Token ID10883
Feature activation+7.25e-03

Pos logit contributions
 seeking+0.008
 pad+0.007
asured+0.006
epad+0.006
 supposed+0.006
Neg logit contributions
ww-0.008
ug-0.008
aw-0.007
 cho-0.007
oo-0.007
Token creature
Token ID7185
Feature activation+5.66e-03

Pos logit contributions
 seeking+0.012
 pad+0.010
 supposed+0.009
epad+0.009
 wonders+0.009
Neg logit contributions
ug-0.015
ww-0.013
oo-0.013
aw-0.013
opter-0.012
Token exist
Token ID2152
Feature activation+2.16e-02

Pos logit contributions
 seeking+0.011
 pad+0.010
epad+0.009
 wonders+0.009
 former+0.009
Neg logit contributions
ug-0.009
aw-0.008
oo-0.008
ww-0.008
ister-0.008
Token in
Token ID287
Feature activation-2.63e-03

Pos logit contributions
 seeking+0.041
 pad+0.038
epad+0.035
 wonders+0.033
 former+0.033
Neg logit contributions
ug-0.038
aw-0.035
oo-0.033
ww-0.032
ister-0.032
Token the
Token ID262
Feature activation-1.17e-02

Pos logit contributions
ug+0.004
aw+0.003
oo+0.003
ww+0.003
ister+0.003
Neg logit contributions
 seeking-0.006
 pad-0.005
epad-0.005
 supposed-0.005
 wonders-0.005
Token world
Token ID995
Feature activation-2.65e-03

Pos logit contributions
aw+0.021
ug+0.020
ww+0.019
oo+0.017
opter+0.016
Neg logit contributions
 seeking-0.022
epad-0.019
 pretending-0.018
 pad-0.018
 supposed-0.018
Token?
Token ID30
Feature activation-4.75e-03

Pos logit contributions
ug+0.005
aw+0.005
ister+0.004
ww+0.004
oo+0.004
Neg logit contributions
 seeking-0.005
 pad-0.004
 wonders-0.004
epad-0.004
 pretending-0.004
Token
Token ID220
Feature activation+1.61e-02

Pos logit contributions
ug+0.009
aw+0.008
oo+0.007
ww+0.007
apy+0.007
Neg logit contributions
 seeking-0.010
 pad-0.008
 hiding-0.008
 wonders-0.008
 pretending-0.008
Token keep
Token ID1394
Feature activation-2.56e-02

Pos logit contributions
 seeking+0.027
 pad+0.023
 wonders+0.020
 pretending+0.020
epad+0.020
Neg logit contributions
ug-0.037
oo-0.034
aw-0.033
ww-0.033
ister-0.032
Token his
Token ID465
Feature activation-1.39e-02

Pos logit contributions
ug+0.044
oo+0.040
aw+0.040
ww+0.037
dd+0.037
Neg logit contributions
 seeking-0.051
 pad-0.046
epad-0.040
 wonders-0.040
 intended-0.038
Token room
Token ID2119
Feature activation-6.91e-03

Pos logit contributions
ug+0.026
apy+0.022
oo+0.021
opter+0.021
olk+0.021
Neg logit contributions
 seeking-0.024
 pad-0.022
epad-0.019
 supposed-0.018
 wonders-0.018
Token clean
Token ID3424
Feature activation-9.21e-03

Pos logit contributions
ug+0.013
aw+0.011
apy+0.011
oo+0.011
ister+0.011
Neg logit contributions
 seeking-0.013
 pad-0.011
 wonders-0.010
 supposed-0.010
 pretending-0.010
Token from
Token ID422
Feature activation-1.50e-02

Pos logit contributions
ug+0.017
aw+0.015
oo+0.015
ww+0.015
ister+0.014
Neg logit contributions
 seeking-0.018
 pad-0.015
 wonders-0.014
epad-0.013
 supposed-0.013
Token now
Token ID783
Feature activation-1.18e-02

Pos logit contributions
ug+0.026
oo+0.022
aw+0.022
apy+0.020
ister+0.020
Neg logit contributions
 seeking-0.029
 pad-0.027
epad-0.023
 intended-0.023
 wonders-0.022
Token on
Token ID319
Feature activation-1.21e-02

Pos logit contributions
ug+0.021
aw+0.020
ww+0.019
oo+0.019
apy+0.018
Neg logit contributions
 seeking-0.023
 pad-0.019
 wonders-0.018
 pretending-0.018
 anyway-0.018
Token.
Token ID13
Feature activation-1.37e-02

Pos logit contributions
ug+0.023
aw+0.020
oo+0.020
ister+0.019
apy+0.019
Neg logit contributions
 seeking-0.022
 pad-0.020
 wonders-0.017
epad-0.017
 suppose-0.017
Token<|endoftext|>
Token ID50256
Feature activation+2.77e-02

Pos logit contributions
ug+0.028
aw+0.026
oo+0.025
ww+0.025
apy+0.024
Neg logit contributions
 seeking-0.024
 pad-0.020
 wonders-0.018
epad-0.018
 pretending-0.018
TokenOnce
Token ID7454
Feature activation-6.57e-03

Pos logit contributions
 seeking+0.045
 pad+0.036
 wonders+0.034
 pretending+0.032
 hiding+0.032
Neg logit contributions
ug-0.060
aw-0.055
oo-0.054
ww-0.054
apy-0.052
Token upon
Token ID2402
Feature activation+2.28e-02

Pos logit contributions
ug+0.019
aw+0.018
ww+0.018
oo+0.018
apy+0.017
Neg logit contributions
 seeking-0.006
 pad-0.004
 wonders-0.003
 hiding-0.003
 pretending-0.003
Token better
Token ID1365
Feature activation-1.81e-02

Pos logit contributions
 seeking+0.002
 pad+0.002
 lair+0.002
 Tent+0.002
 pretending+0.002
Neg logit contributions
ug-0.002
aw-0.002
ww-0.002
apy-0.002
 cho-0.002
Token than
Token ID621
Feature activation-5.79e-02

Pos logit contributions
ug+0.030
aw+0.028
ww+0.026
oo+0.025
dd+0.025
Neg logit contributions
 seeking-0.036
 pad-0.031
epad-0.029
 lair-0.028
 pretending-0.027
Token pretending
Token ID23662
Feature activation+3.42e-02

Pos logit contributions
ug+0.116
oo+0.104
aw+0.102
apy+0.100
ww+0.099
Neg logit contributions
 seeking-0.081
 pad-0.077
 wonders-0.071
 lair-0.070
 slave-0.062
Token to
Token ID284
Feature activation+9.19e-03

Pos logit contributions
 seeking+0.061
 pad+0.055
 wonders+0.049
 former+0.049
epad+0.048
Neg logit contributions
ug-0.063
aw-0.059
ww-0.055
apy-0.055
ister-0.054
Token be
Token ID307
Feature activation-1.70e-02

Pos logit contributions
 seeking+0.015
 pad+0.014
epad+0.012
 pretending+0.012
 wonders+0.012
Neg logit contributions
ug-0.018
oo-0.017
aw-0.016
ister-0.016
ww-0.016
Token a
Token ID257
Feature activation-1.13e-02

Pos logit contributions
ug+0.029
aw+0.028
apy+0.025
oo+0.024
ww+0.024
Neg logit contributions
 seeking-0.031
 pad-0.028
 wonders-0.025
 least-0.024
 Deal-0.024
Token princess
Token ID21752
Feature activation+8.98e-03

Pos logit contributions
ww+0.019
ug+0.018
aw+0.017
 cho+0.017
ister+0.017
Neg logit contributions
 seeking-0.021
 pad-0.018
asured-0.017
epad-0.017
 wonders-0.016
Token with
Token ID351
Feature activation+8.83e-03

Pos logit contributions
 seeking+0.017
 wonders+0.015
 intended+0.014
 supposed+0.014
epad+0.014
Neg logit contributions
ug-0.015
aw-0.015
over-0.013
ister-0.013
oo-0.013
Token a
Token ID257
Feature activation-1.28e-02

Pos logit contributions
 seeking+0.017
 pad+0.015
 lair+0.014
 forgetting+0.013
epad+0.013
Neg logit contributions
ug-0.014
aw-0.014
apy-0.013
oo-0.012
ww-0.012
Token fake
Token ID8390
Feature activation+1.15e-03

Pos logit contributions
ww+0.022
ug+0.021
aw+0.021
opter+0.020
apy+0.020
Neg logit contributions
 seeking-0.023
asured-0.019
 supposed-0.018
 wonders-0.018
 anyways-0.017
Token necklace
Token ID38987
Feature activation-1.75e-02

Pos logit contributions
 seeking+0.002
 supposed+0.002
 forgetting+0.002
epad+0.002
asured+0.002
Neg logit contributions
ug-0.002
aw-0.002
ww-0.002
opter-0.002
oo-0.002
Token their
Token ID511
Feature activation-1.36e-02

Pos logit contributions
ug+0.021
aw+0.020
oo+0.019
apy+0.019
ww+0.018
Neg logit contributions
 seeking-0.030
 pad-0.027
 wonders-0.024
epad-0.024
 supposed-0.023
Token cookies
Token ID14746
Feature activation-1.18e-02

Pos logit contributions
ug+0.025
aw+0.023
oo+0.023
ww+0.022
apy+0.021
Neg logit contributions
 seeking-0.026
 pad-0.022
epad-0.021
 wonders-0.020
 forgetting-0.020
Token.
Token ID13
Feature activation-1.09e-02

Pos logit contributions
ug+0.022
aw+0.021
oo+0.020
ister+0.019
ww+0.018
Neg logit contributions
 seeking-0.021
 wonders-0.018
 pad-0.018
epad-0.016
 intended-0.016
Token<|endoftext|>
Token ID50256
Feature activation+2.87e-02

Pos logit contributions
ug+0.023
aw+0.021
oo+0.020
ww+0.020
apy+0.019
Neg logit contributions
 seeking-0.019
 pad-0.015
 wonders-0.015
 pretending-0.014
 hiding-0.014
TokenOne
Token ID3198
Feature activation+5.25e-02

Pos logit contributions
 seeking+0.046
 pad+0.037
 wonders+0.035
 pretending+0.033
 hiding+0.033
Neg logit contributions
ug-0.063
aw-0.057
oo-0.056
ww-0.056
apy-0.054
Token day
Token ID1110
Feature activation-2.38e-02

Pos logit contributions
 seeking+0.093
 pad+0.076
 wonders+0.073
 hiding+0.070
 pretending+0.069
Neg logit contributions
ug-0.107
aw-0.097
oo-0.093
ww-0.093
opter-0.091
Token,
Token ID11
Feature activation-2.48e-02

Pos logit contributions
aw+0.051
ug+0.051
olk+0.045
opter+0.044
crow+0.043
Neg logit contributions
 seeking-0.036
 hiding-0.034
 pretending-0.031
 first-0.030
 upon-0.029
Token a
Token ID257
Feature activation+9.37e-03

Pos logit contributions
ug+0.068
aw+0.063
ister+0.062
ww+0.061
opter+0.060
Neg logit contributions
 seeking-0.026
 hiding-0.020
 pretending-0.018
 pad-0.018
 wonders-0.016
Token little
Token ID1310
Feature activation-2.12e-02

Pos logit contributions
 seeking+0.012
 pad+0.010
 hiding+0.010
 own+0.009
 wonders+0.009
Neg logit contributions
ug-0.023
aw-0.020
oo-0.020
opter-0.019
apy-0.019
Token girl
Token ID2576
Feature activation-7.57e-03

Pos logit contributions
ves+0.056
ug+0.053
aw+0.051
ink+0.051
apy+0.049
Neg logit contributions
 seeking-0.017
 hiding-0.016
 pad-0.016
 first-0.012
 too-0.012
Token named
Token ID3706
Feature activation-2.45e-03

Pos logit contributions
ug+0.016
opter+0.015
ister+0.014
ww+0.014
aw+0.014
Neg logit contributions
 seeking-0.013
 hiding-0.011
 pretending-0.010
 find-0.010
 wonders-0.010
Token it
Token ID340
Feature activation-9.85e-03

Pos logit contributions
ug+0.030
aw+0.027
oo+0.027
ww+0.026
apy+0.025
Neg logit contributions
 seeking-0.030
 pad-0.025
 wonders-0.023
 pretending-0.023
epad-0.022
Token home
Token ID1363
Feature activation+1.41e-02

Pos logit contributions
ug+0.016
oo+0.014
aw+0.013
ister+0.013
um+0.013
Neg logit contributions
 seeking-0.019
 wonders-0.017
 former-0.016
 pretending-0.016
epad-0.016
Token with
Token ID351
Feature activation+2.04e-02

Pos logit contributions
 seeking+0.028
 wonders+0.024
 pad+0.024
 Deal+0.021
 imag+0.021
Neg logit contributions
ug-0.024
oo-0.022
aw-0.021
ister-0.021
apy-0.019
Token them
Token ID606
Feature activation+5.01e-03

Pos logit contributions
 seeking+0.044
 pad+0.038
epad+0.033
 fault+0.033
 supposed+0.033
Neg logit contributions
aw-0.031
ug-0.030
oo-0.028
dd-0.027
apy-0.027
Token!
Token ID0
Feature activation-2.24e-02

Pos logit contributions
 seeking+0.010
 wonders+0.008
 former+0.008
 lair+0.008
epad+0.008
Neg logit contributions
ug-0.008
aw-0.007
oo-0.007
ister-0.007
apy-0.006
Token\n
Token ID198
Feature activation-1.98e-02

Pos logit contributions
ug+0.043
aw+0.040
oo+0.039
ww+0.039
apy+0.038
Neg logit contributions
 seeking-0.042
 pad-0.034
 wonders-0.031
 pretending-0.031
 hiding-0.031
Token\n
Token ID198
Feature activation-2.15e-02

Pos logit contributions
ug+0.040
aw+0.036
ww+0.036
oo+0.036
apy+0.034
Neg logit contributions
 seeking-0.035
 pad-0.029
 wonders-0.027
 pretending-0.026
 hiding-0.026
TokenBack
Token ID7282
Feature activation-4.29e-02

Pos logit contributions
ug+0.035
aw+0.031
oo+0.031
ww+0.030
apy+0.029
Neg logit contributions
 seeking-0.046
 pad-0.039
 wonders-0.038
 pretending-0.037
 hiding-0.036
Token at
Token ID379
Feature activation-1.27e-02

Pos logit contributions
ug+0.087
opter+0.077
ww+0.077
aw+0.076
oo+0.076
Neg logit contributions
 seeking-0.075
 pad-0.062
 wonders-0.060
 hiding-0.057
 anyway-0.057
Token home
Token ID1363
Feature activation+1.99e-03

Pos logit contributions
ug+0.025
oo+0.023
aw+0.023
ww+0.023
ister+0.022
Neg logit contributions
 seeking-0.021
 pad-0.018
epad-0.016
 wonders-0.016
 pretending-0.016
Token,
Token ID11
Feature activation-1.36e-02

Pos logit contributions
 seeking+0.005
 pad+0.004
 wonders+0.004
epad+0.004
 pretending+0.004
Neg logit contributions
ug-0.003
aw-0.002
oo-0.002
ww-0.002
apy-0.002
Token because
Token ID780
Feature activation+4.24e-04

Pos logit contributions
ug+0.004
aw+0.003
oo+0.003
ister+0.003
ww+0.003
Neg logit contributions
 seeking-0.003
 pad-0.003
 wonders-0.003
 former-0.003
epad-0.003
Token his
Token ID465
Feature activation+1.69e-02

Pos logit contributions
 seeking+0.001
 pad+0.001
 wonders+0.001
epad+0.001
 pretending+0.001
Neg logit contributions
ug-0.001
aw-0.001
oo-0.001
ww-0.001
apy-0.001
Token paws
Token ID49193
Feature activation-1.16e-02

Pos logit contributions
 seeking+0.033
 pad+0.028
epad+0.028
 wonders+0.026
 supposed+0.026
Neg logit contributions
ug-0.030
oo-0.026
aw-0.026
apy-0.025
ww-0.024
Token were
Token ID547
Feature activation+8.09e-03

Pos logit contributions
ug+0.021
aw+0.019
oo+0.018
apy+0.018
ww+0.017
Neg logit contributions
 seeking-0.023
 pad-0.019
 wonders-0.019
epad-0.018
 former-0.018
Token too
Token ID1165
Feature activation+2.93e-02

Pos logit contributions
 seeking+0.013
 pad+0.011
 wonders+0.011
epad+0.010
 former+0.010
Neg logit contributions
ug-0.016
oo-0.015
apy-0.014
aw-0.014
 cho-0.014
Token big
Token ID1263
Feature activation-1.79e-02

Pos logit contributions
 seeking+0.042
 pad+0.033
 wonders+0.031
 hiding+0.030
 anyway+0.029
Neg logit contributions
ug-0.069
aw-0.064
oo-0.062
ww-0.061
opter-0.059
Token.
Token ID13
Feature activation-5.15e-03

Pos logit contributions
ug+0.029
aw+0.027
 cho+0.024
oo+0.023
ing+0.022
Neg logit contributions
 seeking-0.037
 former-0.033
 wonders-0.030
 suppose-0.029
 Deal-0.029
Token Max
Token ID5436
Feature activation+3.84e-04

Pos logit contributions
ug+0.010
aw+0.009
oo+0.008
ww+0.008
apy+0.008
Neg logit contributions
 seeking-0.010
 pad-0.008
epad-0.008
 wonders-0.008
 supposed-0.008
Token was
Token ID373
Feature activation-4.47e-03

Pos logit contributions
 seeking+0.001
 pad+0.001
 wonders+0.000
epad+0.000
 pretending+0.000
Neg logit contributions
ug-0.001
aw-0.001
oo-0.001
ww-0.001
apy-0.001
Token sad
Token ID6507
Feature activation+2.13e-02

Pos logit contributions
ug+0.009
oo+0.008
aw+0.008
apy+0.008
ww+0.008
Neg logit contributions
 seeking-0.008
 pad-0.006
epad-0.006
 wonders-0.006
 former-0.006
Token and
Token ID290
Feature activation+4.68e-03

Pos logit contributions
 seeking+0.037
 pad+0.030
 wonders+0.028
epad+0.028
 pretending+0.027
Neg logit contributions
ug-0.044
aw-0.040
oo-0.039
ww-0.038
apy-0.037
Token a
Token ID257
Feature activation-1.73e-02

Pos logit contributions
 seeking+0.005
 pad+0.004
 wonders+0.004
epad+0.004
 supposed+0.004
Neg logit contributions
ug-0.005
aw-0.005
oo-0.004
apy-0.004
ww-0.004
Token button
Token ID4936
Feature activation-2.68e-02

Pos logit contributions
ug+0.031
ww+0.030
aw+0.029
apy+0.029
oo+0.028
Neg logit contributions
 seeking-0.032
 wonders-0.025
epad-0.025
asured-0.024
 supposed-0.024
Token and
Token ID290
Feature activation-3.72e-03

Pos logit contributions
ug+0.046
aw+0.041
oo+0.040
ww+0.040
apy+0.038
Neg logit contributions
 seeking-0.055
 pad-0.047
 wonders-0.045
 pretending-0.044
 hiding-0.043
Token a
Token ID257
Feature activation-8.93e-03

Pos logit contributions
ug+0.007
aw+0.007
apy+0.006
ww+0.006
oo+0.006
Neg logit contributions
 seeking-0.007
 pad-0.006
epad-0.005
 supposed-0.005
 fault-0.005
Token cr
Token ID1067
Feature activation-2.94e-02

Pos logit contributions
ug+0.016
ww+0.015
apy+0.015
aw+0.015
opter+0.014
Neg logit contributions
 seeking-0.017
asured-0.013
 wonders-0.013
 supposed-0.013
 least-0.013
Tokenumb
Token ID2178
Feature activation-7.34e-02

Pos logit contributions
ug+0.032
aw+0.029
ww+0.029
dd+0.026
apy+0.025
Neg logit contributions
 seeking-0.072
 pad-0.068
epad-0.062
 supposed-0.061
 wonders-0.061
Token.
Token ID13
Feature activation-1.36e-03

Pos logit contributions
ug+0.161
aw+0.147
oo+0.144
ww+0.142
apy+0.138
Neg logit contributions
 seeking-0.119
 pad-0.093
 wonders-0.090
 pretending-0.084
epad-0.084
Token They
Token ID1119
Feature activation-1.26e-02

Pos logit contributions
ug+0.003
aw+0.002
oo+0.002
ww+0.002
opter+0.002
Neg logit contributions
 seeking-0.003
 pad-0.002
 wonders-0.002
 pretending-0.002
 hiding-0.002
Token also
Token ID635
Feature activation+9.30e-04

Pos logit contributions
ug+0.024
aw+0.022
apy+0.021
oo+0.020
opter+0.020
Neg logit contributions
 seeking-0.023
 pad-0.019
 supposed-0.018
epad-0.018
 fault-0.018
Token saw
Token ID2497
Feature activation-2.47e-02

Pos logit contributions
 seeking+0.002
 fault+0.001
 pad+0.001
epad+0.001
 former+0.001
Neg logit contributions
ug-0.002
aw-0.002
apy-0.001
oo-0.001
ww-0.001
Token a
Token ID257
Feature activation+5.72e-03

Pos logit contributions
ug+0.049
aw+0.046
oo+0.043
apy+0.042
ww+0.042
Neg logit contributions
 seeking-0.045
 pad-0.037
 wonders-0.034
epad-0.034
 supposed-0.033
Token\n
Token ID198
Feature activation-1.26e-02

Pos logit contributions
ug+0.020
aw+0.018
ww+0.018
oo+0.017
opter+0.017
Neg logit contributions
 seeking-0.013
 pad-0.010
 wonders-0.009
 hiding-0.009
 pretending-0.009
Token"
Token ID1
Feature activation+5.48e-03

Pos logit contributions
ug+0.023
opter+0.019
oo+0.019
aw+0.019
olk+0.019
Neg logit contributions
 seeking-0.023
 wonders-0.019
 pad-0.019
 first-0.018
 learning-0.018
TokenO
Token ID46
Feature activation-2.18e-02

Pos logit contributions
 seeking+0.012
 wonders+0.010
 pad+0.010
 pretending+0.010
 forgetting+0.010
Neg logit contributions
ug-0.009
aw-0.007
oo-0.007
opter-0.007
ww-0.007
Tokenw
Token ID86
Feature activation-5.27e-02

Pos logit contributions
ug+0.026
apy+0.022
aw+0.021
 paramed+0.020
 yacht+0.020
Neg logit contributions
 seeking-0.049
 supposed-0.043
 pad-0.041
 anyway-0.041
 hiding-0.041
Token,
Token ID11
Feature activation+5.41e-03

Pos logit contributions
ug+0.110
aw+0.104
apy+0.098
ww+0.096
dd+0.096
Neg logit contributions
 seeking-0.073
 hiding-0.066
 pad-0.064
 pretending-0.060
 first-0.057
Token ow
Token ID12334
Feature activation-6.28e-02

Pos logit contributions
 seeking+0.011
 hiding+0.010
 pretending+0.009
 find+0.009
 forgetting+0.009
Neg logit contributions
ug-0.010
aw-0.008
oo-0.007
dd-0.007
over-0.007
Token,
Token ID11
Feature activation-1.27e-02

Pos logit contributions
ug+0.116
aw+0.108
dd+0.099
oo+0.097
ww+0.094
Neg logit contributions
 seeking-0.113
 hiding-0.096
 pad-0.094
 pretending-0.091
 first-0.091
Token ow
Token ID12334
Feature activation-6.71e-02

Pos logit contributions
ug+0.022
aw+0.018
dd+0.017
oo+0.017
opter+0.017
Neg logit contributions
 seeking-0.026
 hiding-0.023
 pretending-0.022
 find-0.022
 forgetting-0.021
Token!"
Token ID2474
Feature activation+3.20e-03

Pos logit contributions
ug+0.160
aw+0.147
oo+0.141
dd+0.139
crow+0.136
Neg logit contributions
 seeking-0.078
 hiding-0.062
 pad-0.059
 first-0.058
 anyway-0.056
Token Tom
Token ID4186
Feature activation-2.10e-02

Pos logit contributions
 seeking+0.005
 wonders+0.004
 pretending+0.004
 hiding+0.004
 anyway+0.003
Neg logit contributions
ug-0.007
ww-0.007
opter-0.006
oo-0.006
dd-0.006
Token cried
Token ID16896
Feature activation+4.94e-03

Pos logit contributions
ug+0.049
ww+0.047
opter+0.045
ister+0.044
oo+0.044
Neg logit contributions
 seeking-0.028
 pad-0.022
 wonders-0.022
 anyway-0.021
 hiding-0.021
Token big
Token ID1263
Feature activation-3.14e-02

Pos logit contributions
 seeking+0.007
 pad+0.006
 wonders+0.005
 pretending+0.005
 hiding+0.005
Neg logit contributions
ug-0.008
aw-0.007
oo-0.007
ww-0.007
apy-0.007
Token tree
Token ID5509
Feature activation-5.25e-03

Pos logit contributions
ug+0.081
aw+0.075
oo+0.075
dd+0.071
apy+0.070
Neg logit contributions
 seeking-0.042
 pad-0.033
 hiding-0.033
 find-0.028
 wonders-0.028
Token and
Token ID290
Feature activation-9.24e-03

Pos logit contributions
ug+0.009
aw+0.008
oo+0.007
ister+0.007
apy+0.007
Neg logit contributions
 seeking-0.010
 Deal-0.009
epad-0.009
 wonders-0.009
 imag-0.008
Token dug
Token ID18735
Feature activation-1.99e-02

Pos logit contributions
ug+0.018
aw+0.017
apy+0.017
oo+0.016
ww+0.016
Neg logit contributions
 seeking-0.016
 pad-0.013
 wonders-0.012
epad-0.012
 fault-0.012
Token a
Token ID257
Feature activation+7.82e-04

Pos logit contributions
ug+0.043
aw+0.039
oo+0.038
ww+0.038
ister+0.037
Neg logit contributions
 seeking-0.033
 pad-0.026
 wonders-0.025
 hiding-0.024
 pretending-0.024
Token hole
Token ID7604
Feature activation-5.72e-02

Pos logit contributions
 seeking+0.001
 pad+0.001
 wonders+0.001
 pretending+0.001
epad+0.001
Neg logit contributions
ug-0.001
aw-0.001
ww-0.001
oo-0.001
apy-0.001
Token.
Token ID13
Feature activation-1.08e-02

Pos logit contributions
ug+0.101
aw+0.089
oo+0.086
apy+0.084
dd+0.077
Neg logit contributions
 seeking-0.113
 pad-0.099
 wonders-0.094
 former-0.092
 least-0.090
Token\n
Token ID198
Feature activation-1.37e-02

Pos logit contributions
ug+0.024
aw+0.022
ww+0.022
oo+0.022
ister+0.021
Neg logit contributions
 seeking-0.017
 pad-0.013
 wonders-0.013
 anyway-0.012
 pretending-0.012
Token\n
Token ID198
Feature activation-1.64e-02

Pos logit contributions
ug+0.034
aw+0.032
ww+0.032
oo+0.030
opter+0.030
Neg logit contributions
 seeking-0.016
 pad-0.012
 wonders-0.012
 anyway-0.011
 find-0.011
TokenAs
Token ID1722
Feature activation-2.89e-03

Pos logit contributions
ug+0.033
aw+0.030
oo+0.028
ww+0.028
ister+0.028
Neg logit contributions
 seeking-0.027
 pad-0.023
 anyway-0.022
 wonders-0.021
 supposed-0.021
Token they
Token ID484
Feature activation-1.33e-04

Pos logit contributions
ug+0.006
aw+0.006
oo+0.005
opter+0.005
ww+0.005
Neg logit contributions
 seeking-0.005
 hiding-0.004
 pretending-0.004
 anyway-0.004
 first-0.004
Token with
Token ID351
Feature activation+1.78e-02

Pos logit contributions
 seeking+0.018
 pad+0.015
 wonders+0.015
epad+0.014
 forgetting+0.014
Neg logit contributions
ug-0.016
aw-0.015
oo-0.014
ww-0.014
ister-0.013
Token delicious
Token ID12625
Feature activation-6.86e-03

Pos logit contributions
 seeking+0.036
 pad+0.030
epad+0.028
 forgetting+0.028
 wonders+0.027
Neg logit contributions
ug-0.031
aw-0.028
oo-0.027
apy-0.027
ww-0.026
Token potatoes
Token ID18821
Feature activation-4.18e-02

Pos logit contributions
ug+0.014
aw+0.013
oo+0.012
ww+0.012
opter+0.012
Neg logit contributions
 seeking-0.012
 pad-0.010
 wonders-0.010
 pretending-0.009
 hiding-0.009
Token all
Token ID477
Feature activation-8.22e-03

Pos logit contributions
ug+0.069
aw+0.065
oo+0.061
ister+0.059
ww+0.058
Neg logit contributions
 seeking-0.083
 wonders-0.072
epad-0.068
 pad-0.066
 former-0.066
Token year
Token ID614
Feature activation+5.08e-03

Pos logit contributions
ug+0.015
aw+0.014
oo+0.013
ww+0.013
dd+0.013
Neg logit contributions
 seeking-0.016
 pad-0.013
 wonders-0.012
epad-0.012
 forgetting-0.012
Token long
Token ID890
Feature activation-5.63e-02

Pos logit contributions
 seeking+0.010
 pad+0.008
 wonders+0.007
 pretending+0.007
 forgetting+0.007
Neg logit contributions
ug-0.010
aw-0.009
oo-0.009
ww-0.009
ister-0.008
Token.
Token ID13
Feature activation-8.31e-03

Pos logit contributions
ug+0.105
aw+0.099
dd+0.089
oo+0.088
apy+0.083
Neg logit contributions
 seeking-0.106
 least-0.086
 former-0.084
 wonders-0.084
epad-0.082
Token\n
Token ID198
Feature activation-1.61e-02

Pos logit contributions
ug+0.017
aw+0.015
oo+0.015
ww+0.015
apy+0.014
Neg logit contributions
 seeking-0.015
 pad-0.012
 wonders-0.011
 pretending-0.011
 hiding-0.011
Token\n
Token ID198
Feature activation-1.83e-02

Pos logit contributions
ug+0.029
aw+0.026
ww+0.025
oo+0.025
apy+0.024
Neg logit contributions
 seeking-0.032
 pad-0.027
 wonders-0.026
 pretending-0.025
 hiding-0.025
TokenThe
Token ID464
Feature activation-5.11e-03

Pos logit contributions
ug+0.030
aw+0.026
oo+0.026
ww+0.025
apy+0.024
Neg logit contributions
 seeking-0.040
 pad-0.034
 wonders-0.033
 pretending-0.032
epad-0.031
Token end
Token ID886
Feature activation-1.55e-02

Pos logit contributions
ug+0.009
aw+0.008
oo+0.008
ww+0.008
ister+0.007
Neg logit contributions
 seeking-0.011
 pad-0.009
 wonders-0.008
 pretending-0.008
epad-0.008
Token house
Token ID2156
Feature activation-2.46e-03

Pos logit contributions
ug+0.022
aw+0.021
oo+0.020
ister+0.020
 taxi+0.020
Neg logit contributions
 seeking-0.030
epad-0.030
 supposed-0.027
 suppose-0.026
 haven-0.025
Token,
Token ID11
Feature activation+7.22e-03

Pos logit contributions
ug+0.004
oo+0.004
aw+0.004
ister+0.003
apy+0.003
Neg logit contributions
 seeking-0.005
 wonders-0.004
 pad-0.004
 supposed-0.004
epad-0.004
Token grabbed
Token ID13176
Feature activation-2.36e-02

Pos logit contributions
 seeking+0.012
 pad+0.009
 wonders+0.009
epad+0.009
 fault+0.009
Neg logit contributions
ug-0.014
oo-0.013
aw-0.013
apy-0.013
ww-0.012
Token a
Token ID257
Feature activation+1.80e-03

Pos logit contributions
ug+0.051
aw+0.047
ww+0.046
oo+0.046
opter+0.045
Neg logit contributions
 seeking-0.037
 pad-0.030
 wonders-0.029
 pretending-0.028
 hiding-0.027
Token big
Token ID1263
Feature activation-2.97e-02

Pos logit contributions
 seeking+0.003
 wonders+0.003
 pad+0.003
 pretending+0.003
 forgetting+0.003
Neg logit contributions
ug-0.003
ww-0.003
aw-0.003
oo-0.003
apy-0.003
Token bag
Token ID6131
Feature activation-6.15e-02

Pos logit contributions
ug+0.065
aw+0.060
oo+0.058
ww+0.057
dd+0.056
Neg logit contributions
 seeking-0.048
 pad-0.040
 hiding-0.037
 wonders-0.036
 pretending-0.035
Token of
Token ID286
Feature activation+1.27e-02

Pos logit contributions
ug+0.099
aw+0.087
oo+0.085
ww+0.083
apy+0.080
Neg logit contributions
 seeking-0.134
 pad-0.114
 wonders-0.110
 pretending-0.107
epad-0.106
Token candy
Token ID18550
Feature activation-8.43e-03

Pos logit contributions
 seeking+0.024
 pad+0.019
 wonders+0.018
epad+0.018
 pretending+0.018
Neg logit contributions
ug-0.024
aw-0.022
oo-0.022
ww-0.021
apy-0.021
Token,
Token ID11
Feature activation+1.04e-02

Pos logit contributions
ug+0.016
aw+0.015
ww+0.014
ister+0.014
oo+0.014
Neg logit contributions
 seeking-0.016
 wonders-0.013
 former-0.012
 pretending-0.012
 supposed-0.012
Token and
Token ID290
Feature activation-2.05e-02

Pos logit contributions
 seeking+0.020
 wonders+0.016
 supposed+0.016
 pad+0.016
epad+0.016
Neg logit contributions
aw-0.017
ug-0.017
apy-0.015
oo-0.015
ww-0.015
Token ran
Token ID4966
Feature activation-2.48e-02

Pos logit contributions
ug+0.037
aw+0.035
apy+0.034
ww+0.033
oo+0.032
Neg logit contributions
 seeking-0.039
 pad-0.032
 wonders-0.030
epad-0.030
 fault-0.029