TOP ACTIVATIONS
MAX = 0.094

. The flowers opened their pet als and bloom ed in
day the flower spread its pet als and left the garden
of them all . Its pet als were a deep ,
, so it spread its pet als and enjoyed the sunshine
a pretty red . Its pet als were bright and the
! A place where her pet would always stay happy and
Suddenly , the sun rose and Jack woke up .
. It had beautiful pink pet als and its leaves were
flower . It had smooth pet als that were bright and
went home and enjoyed his freshly caught fish . <|endoftext|> Once
flower . He opened the pet als very carefully . Inside
When the sun rose , Bob 's thing was
Lucy collected some bright yellow pet als and arranged them in
little girl never turned her pet away again .
She stretched out her yellow pet als and felt the sun
Emma quickly realized that her pet dog had taken it .
The flower opened up its pet als and started to feel
surprised . Even though the pet had found and ate the
. He wished that his pet fish , Fin , who
The l ily opened its pet als and felt happy and

MIN ACTIVATIONS
MIN = -0.111

decorated it with shells and seaw eed . They had a
Like shells and rocks and seaw eed ." M
ate it with shells and seaw eed . They find a
, " Good job , Anch or ! You did your
We love you too , grand pa ," Anna and Ben
sun and went back to Far ah 's house for a
some shells and rocks and seaw eed . He picked them
buckets and shovel s and seaw eed . Then we can
swimming , staying loyal to Anch or 's advice . They
" I 'm sorry , grand pa . We didn 't
pa , grand pa , grand pa !" Lily and Tom
and the crabs and the seaw eed . Sam
Don 't worry , Mar lon . You 'll be a
" Okay , grand pa . We understand ,"
see shells , rocks , seaw eed , crabs and fish
like fish and shells and seaw eed . One
" Grand pa , grand pa , grand pa !"
You did your best !" Anch or smiled and said ,
saw five fish and seven seaw eed . They were happy
's father said , " Son , it 's time to

INTERVAL 0.043 - 0.094
CONTAINS 0.782%

the wardrobe so it looked nice . And from then on
did not share well . One day , Lily wanted to
to eat jelly sandwiches . One day , her mom my
a pencil with her . One day , she met a
friends . They would work together to find more pretty leaves

INTERVAL -0.008 - 0.043
CONTAINS 69.550%

his famous neighbor , Mr . Johnson , walking his dog
healed and healthy . The little boy was so
. He dropped the little boy 's toy boat and sw
with her everywhere she went . One day , Lily went
they found while playing . <|endoftext|> One day , a little

INTERVAL -0.059 - -0.008
CONTAINS 29.586%

trace her finger over the colorful lines . One day ,
now ." Sam felt bad and wanted to make Tim happy
their mom calling them for lunch . " Stop
ran up and stayed there until it was safe . They
, so she shared the bag with Henry .

INTERVAL -0.111 - -0.059
CONTAINS 0.082%

with the cube and with each other . We will be
" Hello , Hal ley !" Tom says .
cars of different colors and sizes . He likes to make
like a flag , a map , a landmark , or
and went into a big chest . The chest was old

Token.
Token ID13
Feature activation+8.66e-03

Pos logit contributions
comb+0.062
ses+0.061
 pleasure+0.061
 hospitality+0.060
 pad+0.059
Neg logit contributions
ug-0.041
ister-0.038
P-0.035
irens-0.034
opter-0.032
Token The
Token ID383
Feature activation+1.43e-02

Pos logit contributions
comb+0.015
 seeking+0.013
 hospitality+0.013
 pleasure+0.013
 pad+0.013
Neg logit contributions
ug-0.015
ister-0.014
irens-0.014
P-0.013
iren-0.013
Token flowers
Token ID12734
Feature activation+4.40e-02

Pos logit contributions
comb+0.026
ses+0.023
 pleasure+0.023
 former+0.023
 pad+0.023
Neg logit contributions
ug-0.024
ister-0.023
P-0.022
irens-0.021
iren-0.020
Token opened
Token ID4721
Feature activation+2.34e-02

Pos logit contributions
 pleasure+0.084
ses+0.083
 former+0.080
 fault+0.079
eter+0.077
Neg logit contributions
ug-0.069
ister-0.063
P-0.058
opter-0.057
 yacht-0.053
Token their
Token ID511
Feature activation+4.13e-02

Pos logit contributions
comb+0.047
ses+0.046
 hospitality+0.046
 seeking+0.045
 pleasure+0.043
Neg logit contributions
ug-0.031
ister-0.028
irens-0.026
P-0.026
 yacht-0.025
Token pet
Token ID4273
Feature activation+9.32e-02

Pos logit contributions
comb+0.082
Aud+0.071
epad+0.071
 forgetting+0.070
 hospitality+0.070
Neg logit contributions
 yacht-0.054
ug-0.054
P-0.050
ister-0.050
ades-0.048
Tokenals
Token ID874
Feature activation+5.86e-02

Pos logit contributions
 former+0.155
comb+0.152
 pleasure+0.149
 seeking+0.147
 fault+0.147
Neg logit contributions
ug-0.153
P-0.142
ister-0.140
aw-0.137
irens-0.135
Token and
Token ID290
Feature activation+6.57e-03

Pos logit contributions
ses+0.139
 pleasure+0.135
 former+0.131
 fault+0.127
 hospitality+0.125
Neg logit contributions
ug-0.062
ister-0.057
irens-0.047
P-0.047
aw-0.046
Token bloom
Token ID29955
Feature activation+5.77e-02

Pos logit contributions
 fault+0.012
 former+0.012
 pleasure+0.012
ses+0.012
comb+0.011
Neg logit contributions
ug-0.009
P-0.009
ister-0.008
 yacht-0.008
opter-0.008
Tokened
Token ID276
Feature activation-6.66e-03

Pos logit contributions
ses+0.119
 pleasure+0.116
 seeking+0.112
 former+0.111
 hospitality+0.111
Neg logit contributions
ug-0.078
P-0.074
ister-0.068
opter-0.063
ades-0.063
Token in
Token ID287
Feature activation-2.53e-03

Pos logit contributions
ug+0.009
P+0.007
ister+0.007
aw+0.007
irens+0.007
Neg logit contributions
ses-0.014
comb-0.014
 seeking-0.013
 hospitality-0.012
 pad-0.012
Token day
Token ID1110
Feature activation+5.13e-03

Pos logit contributions
comb+0.079
 pleasure+0.074
ses+0.073
 pur+0.072
 pad+0.072
Neg logit contributions
ug-0.082
ister-0.078
irens-0.074
P-0.074
iren-0.070
Token the
Token ID262
Feature activation+2.33e-03

Pos logit contributions
comb+0.007
 nib+0.007
 pur+0.007
 watered+0.006
 pleasure+0.006
Neg logit contributions
ug-0.010
ister-0.010
irens-0.009
iren-0.009
P-0.009
Token flower
Token ID15061
Feature activation+5.49e-02

Pos logit contributions
 former+0.004
comb+0.004
eter+0.004
 hospitality+0.003
Aud+0.003
Neg logit contributions
P-0.004
ister-0.004
ug-0.004
 elevator-0.003
opter-0.003
Token spread
Token ID4104
Feature activation+1.56e-02

Pos logit contributions
 pleasure+0.105
ses+0.100
 hospitality+0.095
comb+0.094
 former+0.093
Neg logit contributions
P-0.086
ug-0.085
ister-0.083
irens-0.071
opter-0.069
Token its
Token ID663
Feature activation+4.58e-02

Pos logit contributions
comb+0.029
ses+0.028
 seeking+0.028
 pleasure+0.028
 former+0.027
Neg logit contributions
ug-0.025
P-0.023
ister-0.021
irens-0.020
iren-0.019
Token pet
Token ID4273
Feature activation+9.26e-02

Pos logit contributions
Aud+0.072
 hospitality+0.069
comb+0.068
 forgetting+0.068
 pleasure+0.066
Neg logit contributions
P-0.075
ug-0.074
ister-0.067
 elevator-0.061
 yacht-0.060
Tokenals
Token ID874
Feature activation+5.51e-02

Pos logit contributions
 pleasure+0.147
 former+0.147
comb+0.139
 seeking+0.138
ses+0.135
Neg logit contributions
ug-0.165
P-0.155
ister-0.150
irens-0.144
 yacht-0.137
Token and
Token ID290
Feature activation+7.82e-03

Pos logit contributions
 pleasure+0.116
ses+0.115
 former+0.110
 hospitality+0.107
eter+0.105
Neg logit contributions
ug-0.077
ister-0.068
P-0.064
irens-0.061
iren-0.060
Token left
Token ID1364
Feature activation-2.05e-02

Pos logit contributions
 pleasure+0.014
 fault+0.014
comb+0.014
 former+0.014
eter+0.014
Neg logit contributions
ug-0.011
P-0.011
ister-0.010
 yacht-0.010
irens-0.010
Token the
Token ID262
Feature activation-1.03e-02

Pos logit contributions
ug+0.031
ister+0.028
irens+0.026
P+0.026
iren+0.024
Neg logit contributions
comb-0.040
 former-0.038
Aud-0.037
ses-0.037
 pad-0.036
Token garden
Token ID11376
Feature activation-7.09e-03

Pos logit contributions
P+0.019
ister+0.018
ug+0.017
 elevator+0.016
 yacht+0.016
Neg logit contributions
comb-0.015
ses-0.015
 hospitality-0.015
 pleasure-0.014
Aud-0.014
Token of
Token ID286
Feature activation+1.04e-02

Pos logit contributions
P+0.017
ug+0.017
ister+0.017
opter+0.016
 yacht+0.015
Neg logit contributions
 former-0.012
ses-0.012
comb-0.012
 pleasure-0.011
 hospitality-0.011
Token them
Token ID606
Feature activation+6.12e-03

Pos logit contributions
comb+0.021
 former+0.019
 pad+0.019
ses+0.019
eter+0.019
Neg logit contributions
ug-0.015
P-0.014
ister-0.014
 yacht-0.013
irens-0.012
Token all
Token ID477
Feature activation-3.10e-02

Pos logit contributions
ses+0.013
 former+0.013
comb+0.013
 pleasure+0.012
eter+0.012
Neg logit contributions
ug-0.008
ister-0.007
P-0.007
irens-0.006
 yacht-0.006
Token.
Token ID13
Feature activation-5.04e-03

Pos logit contributions
ug+0.036
ister+0.034
P+0.031
irens+0.029
opter+0.028
Neg logit contributions
comb-0.071
ses-0.069
 former-0.066
 pleasure-0.065
ulent-0.065
Token Its
Token ID6363
Feature activation+3.81e-02

Pos logit contributions
ug+0.010
ister+0.009
P+0.009
irens+0.009
iren+0.009
Neg logit contributions
comb-0.007
 pleasure-0.007
ses-0.007
 pad-0.007
 pur-0.007
Token pet
Token ID4273
Feature activation+8.86e-02

Pos logit contributions
comb+0.061
Aud+0.061
 seeking+0.057
 forgetting+0.057
epad+0.057
Neg logit contributions
ug-0.057
P-0.055
um-0.055
ister-0.052
ole-0.051
Tokenals
Token ID874
Feature activation+4.61e-02

Pos logit contributions
comb+0.139
 former+0.136
 pleasure+0.133
 hospitality+0.132
ses+0.129
Neg logit contributions
ug-0.158
P-0.155
ister-0.151
opter-0.140
aw-0.134
Token were
Token ID547
Feature activation+5.77e-03

Pos logit contributions
ses+0.098
 pleasure+0.097
 former+0.093
Aud+0.092
eter+0.091
Neg logit contributions
ug-0.059
ister-0.057
P-0.053
opter-0.050
irens-0.045
Token a
Token ID257
Feature activation-3.22e-03

Pos logit contributions
ses+0.011
comb+0.010
 hospitality+0.010
Aud+0.010
 pleasure+0.010
Neg logit contributions
ug-0.008
ister-0.008
P-0.008
 cubes-0.007
irens-0.007
Token deep
Token ID2769
Feature activation-1.74e-02

Pos logit contributions
ister+0.005
P+0.005
ug+0.005
opter+0.005
 yacht+0.004
Neg logit contributions
comb-0.005
ses-0.005
 hospitality-0.005
 pleasure-0.005
 former-0.005
Token,
Token ID11
Feature activation+2.86e-02

Pos logit contributions
ug+0.024
ister+0.022
P+0.022
irens+0.020
opter+0.020
Neg logit contributions
comb-0.034
 hospitality-0.033
 pleasure-0.033
ses-0.032
 former-0.032
Token,
Token ID11
Feature activation+7.97e-03

Pos logit contributions
ses+0.101
Aud+0.098
comb+0.094
 hospitality+0.093
 pleasure+0.093
Neg logit contributions
ug-0.036
ister-0.031
P-0.028
?"-0.026
,-0.022
Token so
Token ID523
Feature activation-1.70e-02

Pos logit contributions
comb+0.017
Aud+0.016
 hospitality+0.016
ses+0.015
girls+0.015
Neg logit contributions
ug-0.009
ister-0.008
 yacht-0.008
 motor-0.007
P-0.006
Token it
Token ID340
Feature activation+4.19e-03

Pos logit contributions
ug+0.021
ister+0.020
P+0.018
opter+0.017
irens+0.015
Neg logit contributions
comb-0.037
 pad-0.037
ses-0.036
Aud-0.035
 hospitality-0.034
Token spread
Token ID4104
Feature activation+1.64e-02

Pos logit contributions
 pleasure+0.008
ses+0.008
 fault+0.008
epad+0.008
ulent+0.008
Neg logit contributions
P-0.006
ug-0.006
ister-0.005
ler-0.005
in-0.005
Token its
Token ID663
Feature activation+3.84e-02

Pos logit contributions
comb+0.033
 seeking+0.032
ses+0.032
 former+0.030
Aud+0.030
Neg logit contributions
ug-0.024
P-0.021
ister-0.018
aw-0.018
ler-0.017
Token pet
Token ID4273
Feature activation+8.78e-02

Pos logit contributions
Aud+0.063
epad+0.060
comb+0.060
ses+0.058
ccoli+0.057
Neg logit contributions
P-0.062
ug-0.057
 year-0.050
ister-0.049
body-0.047
Tokenals
Token ID874
Feature activation+4.82e-02

Pos logit contributions
 former+0.159
 pleasure+0.150
ses+0.148
 seeking+0.148
 fault+0.145
Neg logit contributions
ug-0.130
P-0.122
ister-0.116
aw-0.113
 yacht-0.109
Token and
Token ID290
Feature activation+7.62e-03

Pos logit contributions
ses+0.104
 pleasure+0.096
 former+0.093
sed+0.089
eter+0.088
Neg logit contributions
ug-0.064
ister-0.057
P-0.054
boat-0.051
 yacht-0.050
Token enjoyed
Token ID8359
Feature activation+1.21e-02

Pos logit contributions
comb+0.014
Aud+0.014
eter+0.014
 pad+0.014
 fault+0.014
Neg logit contributions
ug-0.009
P-0.009
 yacht-0.008
lf-0.007
ister-0.007
Token the
Token ID262
Feature activation-9.35e-03

Pos logit contributions
comb+0.028
 pad+0.028
ses+0.026
eter+0.025
Aud+0.025
Neg logit contributions
ug-0.013
P-0.012
ister-0.011
 yacht-0.010
irens-0.009
Token sunshine
Token ID34488
Feature activation+1.41e-02

Pos logit contributions
P+0.015
ister+0.013
ug+0.012
body+0.012
 year+0.012
Neg logit contributions
ses-0.018
 pad-0.017
comb-0.017
girls-0.016
eter-0.016
Token a
Token ID257
Feature activation+4.73e-03

Pos logit contributions
comb+0.027
 hospitality+0.025
 former+0.024
 pleasure+0.024
 pad+0.024
Neg logit contributions
ug-0.021
P-0.020
ister-0.019
irens-0.017
ep-0.016
Token pretty
Token ID2495
Feature activation+2.50e-02

Pos logit contributions
comb+0.008
ses+0.007
 pleasure+0.007
 former+0.007
 hospitality+0.007
Neg logit contributions
ug-0.008
ister-0.008
P-0.008
irens-0.007
opter-0.007
Token red
Token ID2266
Feature activation+6.33e-03

Pos logit contributions
 hospitality+0.033
Aud+0.033
 former+0.032
comb+0.032
 seeking+0.031
Neg logit contributions
ug-0.048
ister-0.047
P-0.046
opter-0.043
 yacht-0.042
Token.
Token ID13
Feature activation+3.21e-03

Pos logit contributions
comb+0.007
 seeking+0.007
 former+0.007
 pleasure+0.007
 hospitality+0.007
Neg logit contributions
ug-0.014
ister-0.013
P-0.013
opter-0.012
irens-0.012
Token Its
Token ID6363
Feature activation+3.44e-02

Pos logit contributions
comb+0.005
ses+0.004
 pleasure+0.004
 former+0.004
 pad+0.004
Neg logit contributions
ug-0.006
ister-0.006
P-0.006
irens-0.006
iren-0.005
Token pet
Token ID4273
Feature activation+8.45e-02

Pos logit contributions
Aud+0.061
comb+0.059
 seeking+0.058
 hospitality+0.058
 forgetting+0.057
Neg logit contributions
P-0.046
um-0.045
ug-0.044
ister-0.040
ole-0.040
Tokenals
Token ID874
Feature activation+4.24e-02

Pos logit contributions
comb+0.129
 former+0.126
 pleasure+0.122
 hospitality+0.120
 seeking+0.118
Neg logit contributions
ug-0.159
P-0.155
ister-0.151
opter-0.137
irens-0.134
Token were
Token ID547
Feature activation+8.02e-03

Pos logit contributions
 pleasure+0.084
ses+0.081
 former+0.081
comb+0.081
Aud+0.080
Neg logit contributions
ug-0.062
ister-0.058
P-0.056
opter-0.050
irens-0.047
Token bright
Token ID6016
Feature activation+2.40e-02

Pos logit contributions
comb+0.016
 hospitality+0.015
ses+0.014
 former+0.014
 pleasure+0.014
Neg logit contributions
ug-0.012
ister-0.011
P-0.011
ep-0.009
irens-0.009
Token and
Token ID290
Feature activation+2.55e-02

Pos logit contributions
comb+0.039
ses+0.036
 pleasure+0.036
 seeking+0.036
 former+0.035
Neg logit contributions
ug-0.043
ister-0.041
P-0.039
irens-0.039
iren-0.037
Token the
Token ID262
Feature activation-5.35e-03

Pos logit contributions
comb+0.049
 pad+0.048
 former+0.047
 fault+0.046
 hospitality+0.046
Neg logit contributions
ug-0.035
P-0.033
ister-0.033
lf-0.031
opter-0.030
Token!
Token ID0
Feature activation-1.74e-02

Pos logit contributions
comb+0.095
 former+0.090
 pleasure+0.090
ses+0.090
 seeking+0.086
Neg logit contributions
ug-0.095
ister-0.093
P-0.087
irens-0.083
iren-0.080
Token A
Token ID317
Feature activation+1.86e-02

Pos logit contributions
ug+0.031
ister+0.030
P+0.029
irens+0.028
iren+0.027
Neg logit contributions
comb-0.027
 former-0.026
 hospitality-0.026
ses-0.025
 seeking-0.024
Token place
Token ID1295
Feature activation-1.71e-02

Pos logit contributions
comb+0.032
ses+0.030
 pleasure+0.030
 former+0.029
 pad+0.029
Neg logit contributions
ug-0.031
ister-0.030
P-0.029
irens-0.028
iren-0.027
Token where
Token ID810
Feature activation+3.25e-02

Pos logit contributions
ister+0.028
ug+0.027
 elevator+0.025
irens+0.025
iren+0.024
Neg logit contributions
 former-0.030
comb-0.029
 pad-0.028
ulent-0.028
 pleasure-0.027
Token her
Token ID607
Feature activation+4.57e-02

Pos logit contributions
comb+0.064
ses+0.062
 pad+0.062
 former+0.060
ulent+0.059
Neg logit contributions
ug-0.046
ister-0.044
P-0.041
 elevator-0.040
irens-0.039
Token pet
Token ID4273
Feature activation+8.41e-02

Pos logit contributions
ulent+0.079
comb+0.078
Aud+0.078
epad+0.075
 pad+0.074
Neg logit contributions
ister-0.069
 elevator-0.062
ug-0.061
P-0.059
 yacht-0.059
Token would
Token ID561
Feature activation-5.24e-03

Pos logit contributions
comb+0.153
 pleasure+0.152
 former+0.152
ses+0.147
 hospitality+0.145
Neg logit contributions
ug-0.129
ister-0.128
P-0.118
irens-0.111
opter-0.108
Token always
Token ID1464
Feature activation+1.37e-03

Pos logit contributions
ister+0.009
ug+0.009
opter+0.008
 yacht+0.008
P+0.008
Neg logit contributions
 pad-0.008
ses-0.008
Aud-0.008
comb-0.008
eter-0.008
Token stay
Token ID2652
Feature activation+6.31e-04

Pos logit contributions
Aud+0.002
ses+0.002
comb+0.002
 pad+0.002
eter+0.002
Neg logit contributions
ug-0.002
ister-0.002
?"-0.002
 yacht-0.002
opter-0.002
Token happy
Token ID3772
Feature activation+3.46e-02

Pos logit contributions
comb+0.001
ses+0.001
 pleasure+0.001
 pad+0.001
 hospitality+0.001
Neg logit contributions
ug-0.001
ister-0.001
P-0.001
irens-0.001
opter-0.001
Token and
Token ID290
Feature activation-1.67e-02

Pos logit contributions
ses+0.084
Aud+0.083
comb+0.079
sed+0.076
 former+0.075
Neg logit contributions
ug-0.036
ister-0.027
?"-0.027
 cubes-0.024
,-0.024
Token\n
Token ID198
Feature activation-5.68e-03

Pos logit contributions
ug+0.024
ister+0.022
irens+0.022
opter+0.021
P+0.021
Neg logit contributions
 pad-0.009
 pleasure-0.009
 seeking-0.009
comb-0.009
 nib-0.008
TokenSuddenly
Token ID38582
Feature activation+7.36e-03

Pos logit contributions
ug+0.013
ister+0.012
irens+0.011
opter+0.011
iren+0.011
Neg logit contributions
 pad-0.007
 watered-0.006
 nib-0.006
 seeking-0.006
 pur-0.006
Token,
Token ID11
Feature activation+1.40e-02

Pos logit contributions
 nib+0.008
comb+0.008
 pleasure+0.008
 pur+0.008
 hiding+0.008
Neg logit contributions
ug-0.015
ister-0.015
P-0.015
irens-0.015
iren-0.015
Token the
Token ID262
Feature activation+2.28e-03

Pos logit contributions
 pleasure+0.020
comb+0.019
 seeking+0.019
 nib+0.019
 watered+0.018
Neg logit contributions
ug-0.028
ister-0.027
irens-0.025
iren-0.025
P-0.024
Token sun
Token ID4252
Feature activation+3.92e-02

Pos logit contributions
comb+0.004
 pleasure+0.004
 former+0.004
ses+0.004
 pad+0.004
Neg logit contributions
ister-0.004
ug-0.003
P-0.003
irens-0.003
iren-0.003
Token rose
Token ID8278
Feature activation+8.32e-02

Pos logit contributions
comb+0.059
 pur+0.054
 pad+0.052
 watered+0.052
ses+0.052
Neg logit contributions
ug-0.077
ister-0.074
P-0.070
irens-0.069
opter-0.066
Token and
Token ID290
Feature activation+4.02e-02

Pos logit contributions
comb+0.132
 pleasure+0.125
ses+0.123
 seeking+0.121
 former+0.120
Neg logit contributions
ug-0.152
ister-0.146
P-0.139
irens-0.138
iren-0.131
Token Jack
Token ID3619
Feature activation-3.33e-03

Pos logit contributions
comb+0.063
 pur+0.060
 pleasure+0.059
 watered+0.058
 nib+0.057
Neg logit contributions
ug-0.077
ister-0.072
P-0.068
irens-0.067
iren-0.065
Token woke
Token ID19092
Feature activation+4.45e-03

Pos logit contributions
ug+0.006
ister+0.006
P+0.006
irens+0.005
opter+0.005
Neg logit contributions
 pleasure-0.005
comb-0.005
ses-0.005
 former-0.005
 hospitality-0.005
Token up
Token ID510
Feature activation+2.49e-02

Pos logit contributions
 pur+0.004
comb+0.004
 former+0.004
 nib+0.003
 knowing+0.003
Neg logit contributions
ug-0.011
P-0.011
ister-0.011
iren-0.010
irens-0.010
Token.
Token ID13
Feature activation+1.41e-02

Pos logit contributions
comb+0.044
ses+0.042
 pleasure+0.041
 pad+0.040
 hospitality+0.040
Neg logit contributions
ug-0.040
ister-0.039
irens-0.037
P-0.037
iren-0.035
Token.
Token ID13
Feature activation+3.62e-03

Pos logit contributions
ses+0.069
 pleasure+0.068
 hospitality+0.067
comb+0.065
 former+0.065
Neg logit contributions
ug-0.041
ister-0.037
P-0.037
irens-0.033
ades-0.029
Token It
Token ID632
Feature activation+3.01e-02

Pos logit contributions
comb+0.005
 former+0.005
ses+0.005
 hospitality+0.005
 pleasure+0.005
Neg logit contributions
ug-0.007
ister-0.007
irens-0.006
P-0.006
iren-0.006
Token had
Token ID550
Feature activation+2.66e-03

Pos logit contributions
 pleasure+0.071
 fault+0.070
 hospitality+0.070
ses+0.069
 former+0.068
Neg logit contributions
P-0.031
ug-0.030
ister-0.028
lf-0.026
opter-0.026
Token beautiful
Token ID4950
Feature activation+4.34e-02

Pos logit contributions
comb+0.005
ccoli+0.005
ses+0.005
 hospitality+0.005
 pleasure+0.005
Neg logit contributions
P-0.004
ug-0.003
irens-0.003
opter-0.003
 elevator-0.003
Token pink
Token ID11398
Feature activation+4.70e-02

Pos logit contributions
comb+0.068
ses+0.063
 seeking+0.061
 pleasure+0.060
 hospitality+0.060
Neg logit contributions
ug-0.080
ister-0.075
P-0.075
irens-0.075
opter-0.071
Token pet
Token ID4273
Feature activation+8.31e-02

Pos logit contributions
comb+0.069
 pleasure+0.065
ses+0.064
 hospitality+0.063
 former+0.063
Neg logit contributions
ug-0.090
ister-0.085
P-0.084
irens-0.083
opter-0.078
Tokenals
Token ID874
Feature activation+3.62e-02

Pos logit contributions
comb+0.113
 former+0.111
 pleasure+0.106
ses+0.105
 hospitality+0.102
Neg logit contributions
ug-0.169
ister-0.162
P-0.157
irens-0.155
iren-0.147
Token and
Token ID290
Feature activation+7.66e-03

Pos logit contributions
 pleasure+0.084
 former+0.083
ses+0.083
 hospitality+0.079
 faith+0.076
Neg logit contributions
ug-0.039
ister-0.038
irens-0.033
iren-0.030
P-0.030
Token its
Token ID663
Feature activation+2.08e-02

Pos logit contributions
comb+0.015
 former+0.014
 hospitality+0.014
 pad+0.014
 fault+0.014
Neg logit contributions
ug-0.009
P-0.008
irens-0.008
lf-0.008
 elevator-0.008
Token leaves
Token ID5667
Feature activation-1.18e-03

Pos logit contributions
Aud+0.039
comb+0.035
 hospitality+0.035
 seeking+0.034
ccoli+0.033
Neg logit contributions
ister-0.026
ug-0.026
um-0.025
P-0.025
ole-0.023
Token were
Token ID547
Feature activation-7.53e-03

Pos logit contributions
ug+0.002
ister+0.001
opter+0.001
irens+0.001
iren+0.001
Neg logit contributions
 pleasure-0.002
 former-0.002
 hospitality-0.002
ses-0.002
Aud-0.002
Token flower
Token ID15061
Feature activation+3.58e-02

Pos logit contributions
comb+0.025
 pur+0.023
 pleasure+0.023
ses+0.023
 pad+0.023
Neg logit contributions
ug-0.036
ister-0.035
irens-0.034
P-0.034
iren-0.032
Token.
Token ID13
Feature activation+5.49e-03

Pos logit contributions
comb+0.062
 pur+0.057
 pad+0.056
ses+0.056
 seeking+0.055
Neg logit contributions
ug-0.061
ister-0.059
irens-0.057
P-0.055
iren-0.053
Token It
Token ID632
Feature activation+3.72e-02

Pos logit contributions
comb+0.009
 former+0.008
 pad+0.008
ses+0.008
 hospitality+0.008
Neg logit contributions
ug-0.010
ister-0.009
P-0.009
irens-0.009
opter-0.008
Token had
Token ID550
Feature activation+1.14e-02

Pos logit contributions
 pleasure+0.077
 hospitality+0.076
comb+0.074
 former+0.074
 fault+0.073
Neg logit contributions
P-0.050
ug-0.047
ister-0.043
opter-0.038
lf-0.037
Token smooth
Token ID7209
Feature activation+1.46e-02

Pos logit contributions
comb+0.021
 hospitality+0.020
 pleasure+0.019
 former+0.019
ses+0.019
Neg logit contributions
P-0.018
ug-0.017
irens-0.015
ister-0.015
opter-0.014
Token pet
Token ID4273
Feature activation+8.18e-02

Pos logit contributions
comb+0.019
 pleasure+0.017
ses+0.017
 pur+0.017
 pad+0.017
Neg logit contributions
ug-0.031
ister-0.030
P-0.029
irens-0.029
iren-0.027
Tokenals
Token ID874
Feature activation+3.74e-02

Pos logit contributions
comb+0.101
 pleasure+0.092
 former+0.091
ses+0.090
 seeking+0.089
Neg logit contributions
ug-0.182
ister-0.174
P-0.169
irens-0.166
iren-0.159
Token that
Token ID326
Feature activation+8.71e-03

Pos logit contributions
 pleasure+0.080
ses+0.078
 former+0.077
 hospitality+0.075
comb+0.073
Neg logit contributions
ug-0.051
ister-0.047
P-0.043
irens-0.041
iren-0.040
Token were
Token ID547
Feature activation-1.19e-02

Pos logit contributions
 former+0.014
 pad+0.014
comb+0.014
 pleasure+0.014
 fault+0.014
Neg logit contributions
ug-0.015
ister-0.014
P-0.014
opter-0.012
lf-0.012
Token bright
Token ID6016
Feature activation+8.39e-03

Pos logit contributions
ug+0.019
ister+0.017
P+0.017
irens+0.016
ep+0.016
Neg logit contributions
comb-0.021
 pleasure-0.021
 hospitality-0.021
ses-0.020
 former-0.020
Token and
Token ID290
Feature activation+1.26e-02

Pos logit contributions
comb+0.013
 pleasure+0.012
ses+0.012
 seeking+0.012
 former+0.012
Neg logit contributions
ug-0.015
ister-0.015
P-0.014
irens-0.014
iren-0.013
Token went
Token ID1816
Feature activation-1.58e-02

Pos logit contributions
comb+0.002
ses+0.002
 pleasure+0.002
 former+0.002
 seeking+0.002
Neg logit contributions
ug-0.002
ister-0.002
P-0.002
irens-0.002
opter-0.002
Token home
Token ID1363
Feature activation+4.55e-03

Pos logit contributions
ug+0.032
ister+0.032
P+0.031
irens+0.031
iren+0.030
Neg logit contributions
comb-0.022
 pleasure-0.020
ses-0.020
 former-0.020
 pad-0.019
Token and
Token ID290
Feature activation-5.73e-03

Pos logit contributions
ses+0.009
Aud+0.009
comb+0.009
 pleasure+0.009
 pad+0.009
Neg logit contributions
ister-0.006
ug-0.006
irens-0.005
iren-0.005
lf-0.005
Token enjoyed
Token ID8359
Feature activation+1.58e-02

Pos logit contributions
ug+0.009
ister+0.009
P+0.009
irens+0.008
opter+0.008
Neg logit contributions
comb-0.010
ses-0.010
 pleasure-0.009
 pad-0.009
Aud-0.009
Token his
Token ID465
Feature activation-4.62e-03

Pos logit contributions
comb+0.036
ses+0.035
 pad+0.034
Aud+0.033
 former+0.032
Neg logit contributions
ug-0.018
ister-0.017
P-0.017
irens-0.016
 yacht-0.015
Token freshly
Token ID29026
Feature activation+8.18e-02

Pos logit contributions
ug+0.008
ister+0.008
P+0.007
irens+0.007
 yacht+0.007
Neg logit contributions
comb-0.008
 pad-0.007
ses-0.007
 former-0.007
Aud-0.007
Token caught
Token ID4978
Feature activation+5.87e-04

Pos logit contributions
comb+0.143
ses+0.136
 pleasure+0.134
 former+0.133
 pad+0.132
Neg logit contributions
ug-0.136
P-0.135
ister-0.131
irens-0.122
iren-0.118
Token fish
Token ID5916
Feature activation+5.84e-03

Pos logit contributions
comb+0.001
ses+0.001
 pleasure+0.001
 pad+0.001
 seeking+0.001
Neg logit contributions
ug-0.001
ister-0.001
P-0.001
irens-0.001
iren-0.001
Token.
Token ID13
Feature activation+4.50e-03

Pos logit contributions
ses+0.011
comb+0.010
 pleasure+0.010
Aud+0.010
 former+0.009
Neg logit contributions
ug-0.010
ister-0.009
P-0.009
 cubes-0.008
 yacht-0.008
Token<|endoftext|>
Token ID50256
Feature activation+2.82e-02

Pos logit contributions
comb+0.008
ses+0.007
 seeking+0.007
 watered+0.007
 former+0.007
Neg logit contributions
ug-0.007
ister-0.007
P-0.007
irens-0.007
iren-0.006
TokenOnce
Token ID7454
Feature activation-8.77e-03

Pos logit contributions
comb+0.046
 pleasure+0.043
 pad+0.043
 seeking+0.043
 pur+0.043
Neg logit contributions
ug-0.050
ister-0.047
irens-0.044
P-0.044
opter-0.042
Token flower
Token ID15061
Feature activation+4.61e-02

Pos logit contributions
ug+0.020
ister+0.020
P+0.019
irens+0.018
opter+0.018
Neg logit contributions
comb-0.011
 pleasure-0.011
 former-0.011
 hospitality-0.010
ses-0.010
Token.
Token ID13
Feature activation+6.65e-03

Pos logit contributions
 pleasure+0.110
Aud+0.104
ses+0.100
 former+0.100
 hospitality+0.098
Neg logit contributions
ug-0.047
ister-0.046
 yacht-0.039
P-0.039
 elevator-0.036
Token He
Token ID679
Feature activation-1.71e-02

Pos logit contributions
comb+0.013
ses+0.011
 former+0.011
 pleasure+0.011
 hospitality+0.011
Neg logit contributions
ug-0.010
ister-0.009
irens-0.009
iren-0.009
P-0.009
Token opened
Token ID4721
Feature activation+1.22e-02

Pos logit contributions
ug+0.028
P+0.027
ister+0.026
ep+0.024
opter+0.023
Neg logit contributions
 pleasure-0.030
comb-0.030
ses-0.029
 former-0.028
 hospitality-0.028
Token the
Token ID262
Feature activation+1.86e-02

Pos logit contributions
comb+0.020
 pleasure+0.019
 pur+0.019
 pad+0.019
ses+0.018
Neg logit contributions
ug-0.022
ister-0.021
P-0.020
irens-0.020
iren-0.019
Token pet
Token ID4273
Feature activation+8.16e-02

Pos logit contributions
comb+0.034
 hospitality+0.034
 pleasure+0.032
 former+0.031
 seeking+0.031
Neg logit contributions
P-0.027
ister-0.026
ug-0.025
 elevator-0.024
irens-0.023
Tokenals
Token ID874
Feature activation+5.03e-02

Pos logit contributions
 former+0.113
comb+0.112
 pleasure+0.111
ses+0.108
 seeking+0.105
Neg logit contributions
ug-0.160
ister-0.156
P-0.154
irens-0.148
iren-0.144
Token very
Token ID845
Feature activation+3.05e-02

Pos logit contributions
 former+0.115
ses+0.114
 pleasure+0.111
 hospitality+0.103
 fault+0.102
Neg logit contributions
ug-0.060
ister-0.059
irens-0.051
iren-0.051
P-0.049
Token carefully
Token ID7773
Feature activation-9.42e-04

Pos logit contributions
comb+0.032
 pur+0.029
 watered+0.029
 former+0.028
 pollen+0.028
Neg logit contributions
ug-0.071
ister-0.067
irens-0.067
iren-0.066
P-0.065
Token.
Token ID13
Feature activation+6.26e-03

Pos logit contributions
ug+0.001
ister+0.001
P+0.001
irens+0.001
or+0.001
Neg logit contributions
 pleasure-0.002
ses-0.002
 former-0.002
 hospitality-0.002
comb-0.002
Token Inside
Token ID14384
Feature activation+3.09e-02

Pos logit contributions
comb+0.011
ses+0.010
 former+0.010
 pleasure+0.010
 hospitality+0.010
Neg logit contributions
ug-0.010
ister-0.010
irens-0.010
iren-0.009
P-0.009
Token\n
Token ID198
Feature activation-2.83e-03

Pos logit contributions
comb+0.006
 pleasure+0.006
ses+0.006
 former+0.006
 pad+0.006
Neg logit contributions
ug-0.006
ister-0.006
irens-0.006
P-0.006
iren-0.005
Token\n
Token ID198
Feature activation-2.01e-03

Pos logit contributions
ug+0.006
ister+0.006
irens+0.006
P+0.006
opter+0.006
Neg logit contributions
comb-0.003
 nib-0.003
 pur-0.003
 pleasure-0.003
 seeking-0.003
TokenWhen
Token ID2215
Feature activation-2.32e-03

Pos logit contributions
ug+0.004
ister+0.003
irens+0.003
opter+0.003
iren+0.003
Neg logit contributions
comb-0.003
 nib-0.003
 pur-0.003
 pad-0.003
Aud-0.003
Token the
Token ID262
Feature activation+3.70e-03

Pos logit contributions
ug+0.004
ister+0.004
irens+0.004
P+0.004
iren+0.003
Neg logit contributions
comb-0.004
 pleasure-0.004
 pur-0.004
 nib-0.004
ses-0.004
Token sun
Token ID4252
Feature activation+3.63e-02

Pos logit contributions
comb+0.007
 pleasure+0.006
ses+0.006
 pad+0.006
 pur+0.006
Neg logit contributions
ug-0.006
ister-0.006
P-0.005
irens-0.005
iren-0.005
Token rose
Token ID8278
Feature activation+8.14e-02

Pos logit contributions
comb+0.057
 pleasure+0.053
 pur+0.052
ses+0.052
 pad+0.052
Neg logit contributions
ug-0.068
ister-0.065
P-0.062
irens-0.061
iren-0.058
Token,
Token ID11
Feature activation+2.65e-02

Pos logit contributions
comb+0.141
 pur+0.131
 pleasure+0.129
ses+0.129
 pad+0.128
Neg logit contributions
ug-0.139
ister-0.131
P-0.125
irens-0.123
opter-0.116
Token Bob
Token ID5811
Feature activation+7.02e-03

Pos logit contributions
comb+0.037
 pleasure+0.034
ses+0.033
 pur+0.033
 seeking+0.032
Neg logit contributions
ug-0.055
ister-0.052
P-0.050
irens-0.050
iren-0.048
Token's
Token ID338
Feature activation+2.43e-02

Pos logit contributions
comb+0.012
 pleasure+0.012
ses+0.011
 former+0.011
 hospitality+0.011
Neg logit contributions
ug-0.012
ister-0.012
P-0.011
irens-0.011
opter-0.011
Token thing
Token ID1517
Feature activation+1.84e-02

Pos logit contributions
comb+0.045
Aud+0.043
 seeking+0.042
 pad+0.040
 hospitality+0.040
Neg logit contributions
ug-0.038
ister-0.037
irens-0.033
P-0.032
 yacht-0.031
Token was
Token ID373
Feature activation-5.82e-03

Pos logit contributions
 former+0.033
comb+0.033
 pleasure+0.032
 pad+0.031
ses+0.031
Neg logit contributions
ug-0.029
ister-0.029
irens-0.026
P-0.025
iren-0.025
Token Lucy
Token ID22162
Feature activation+2.48e-02

Pos logit contributions
ug+0.003
ister+0.003
irens+0.003
P+0.003
iren+0.003
Neg logit contributions
comb-0.004
ses-0.004
 seeking-0.004
 hospitality-0.004
 former-0.004
Token collected
Token ID7723
Feature activation+1.47e-02

Pos logit contributions
 pleasure+0.044
ses+0.043
comb+0.042
 former+0.041
 hospitality+0.040
Neg logit contributions
ug-0.043
P-0.038
ister-0.037
opter-0.035
ep-0.035
Token some
Token ID617
Feature activation+4.66e-02

Pos logit contributions
comb+0.029
ses+0.028
 former+0.027
 hospitality+0.026
 pad+0.026
Neg logit contributions
ug-0.022
P-0.020
ister-0.019
irens-0.019
 cubes-0.018
Token bright
Token ID6016
Feature activation+1.40e-02

Pos logit contributions
comb+0.083
ses+0.080
 former+0.076
 seeking+0.075
Aud+0.074
Neg logit contributions
ug-0.078
P-0.072
ister-0.070
irens-0.066
iren-0.063
Token yellow
Token ID7872
Feature activation+2.87e-02

Pos logit contributions
comb+0.024
ses+0.022
 former+0.022
 seeking+0.021
 pleasure+0.021
Neg logit contributions
ug-0.025
ister-0.023
P-0.022
irens-0.022
opter-0.021
Token pet
Token ID4273
Feature activation+8.08e-02

Pos logit contributions
 former+0.049
ses+0.049
comb+0.049
 pleasure+0.047
 seeking+0.047
Neg logit contributions
ug-0.047
P-0.044
ister-0.041
irens-0.040
 cubes-0.038
Tokenals
Token ID874
Feature activation+3.85e-02

Pos logit contributions
 former+0.151
ses+0.138
 fault+0.132
 pleasure+0.131
comb+0.131
Neg logit contributions
ug-0.132
P-0.118
ister-0.116
aw-0.108
irens-0.106
Token and
Token ID290
Feature activation-2.83e-03

Pos logit contributions
ses+0.096
 former+0.092
 pleasure+0.086
sed+0.084
Aud+0.083
Neg logit contributions
ug-0.041
ister-0.036
P-0.029
or-0.028
um-0.028
Token arranged
Token ID14921
Feature activation-3.60e-03

Pos logit contributions
ug+0.004
P+0.004
irens+0.004
 cubes+0.004
opter+0.004
Neg logit contributions
ses-0.005
comb-0.005
 fault-0.005
 former-0.005
 pleasure-0.005
Token them
Token ID606
Feature activation+1.96e-02

Pos logit contributions
ug+0.004
ister+0.004
P+0.004
irens+0.004
iren+0.003
Neg logit contributions
comb-0.008
ses-0.008
 former-0.008
 pad-0.007
 pleasure-0.007
Token in
Token ID287
Feature activation-1.16e-02

Pos logit contributions
ses+0.045
 pleasure+0.041
 former+0.040
Aud+0.040
 hospitality+0.039
Neg logit contributions
ug-0.024
P-0.020
 cubes-0.019
or-0.019
ades-0.019
Token little
Token ID1310
Feature activation-5.13e-03

Pos logit contributions
comb+0.010
ses+0.009
eter+0.009
 seeking+0.009
ulent+0.009
Neg logit contributions
P-0.010
ister-0.010
ug-0.009
 yacht-0.009
Dad-0.009
Token girl
Token ID2576
Feature activation+2.62e-02

Pos logit contributions
P+0.007
ug+0.007
ister+0.006
irens+0.005
iren+0.005
Neg logit contributions
comb-0.011
ses-0.010
 former-0.010
 seeking-0.010
Aud-0.010
Token never
Token ID1239
Feature activation+1.75e-02

Pos logit contributions
comb+0.046
ses+0.045
 former+0.044
 pleasure+0.044
 hospitality+0.042
Neg logit contributions
ug-0.043
ister-0.041
P-0.039
irens-0.034
in-0.034
Token turned
Token ID2900
Feature activation+2.92e-02

Pos logit contributions
comb+0.035
 pleasure+0.031
 former+0.031
ses+0.031
Aud+0.029
Neg logit contributions
ug-0.025
ister-0.022
P-0.022
irens-0.021
opter-0.020
Token her
Token ID607
Feature activation+3.73e-02

Pos logit contributions
comb+0.046
ses+0.041
 pleasure+0.040
 hospitality+0.039
 former+0.038
Neg logit contributions
ug-0.056
ister-0.053
irens-0.053
P-0.050
iren-0.050
Token pet
Token ID4273
Feature activation+8.08e-02

Pos logit contributions
comb+0.048
 hospitality+0.045
 former+0.044
epad+0.042
Aud+0.042
Neg logit contributions
ister-0.067
ole-0.064
ug-0.064
P-0.064
 yacht-0.061
Token away
Token ID1497
Feature activation+3.12e-03

Pos logit contributions
 former+0.146
 pleasure+0.140
ses+0.139
comb+0.136
 watered+0.135
Neg logit contributions
ister-0.125
ug-0.120
P-0.116
aw-0.105
crow-0.103
Token again
Token ID757
Feature activation+2.88e-02

Pos logit contributions
ses+0.006
 pleasure+0.005
Aud+0.005
comb+0.005
sed+0.005
Neg logit contributions
ister-0.005
ug-0.005
P-0.004
irens-0.004
in-0.004
Token.
Token ID13
Feature activation-1.05e-03

Pos logit contributions
ses+0.074
 pleasure+0.066
comb+0.065
sed+0.065
Aud+0.063
Neg logit contributions
ister-0.028
ug-0.028
P-0.025
irens-0.022
?"-0.019
Token
Token ID198
Feature activation-4.90e-03

Pos logit contributions
ug+0.002
ister+0.002
P+0.001
irens+0.001
iren+0.001
Neg logit contributions
comb-0.002
ses-0.002
 watered-0.002
 former-0.002
 seeking-0.002
Token
Token ID198
Feature activation-2.26e-03

Pos logit contributions
ug+0.006
ister+0.006
P+0.006
irens+0.005
iren+0.005
Neg logit contributions
comb-0.011
ses-0.010
 pleasure-0.010
 former-0.010
 pad-0.010
Token She
Token ID1375
Feature activation+9.79e-03

Pos logit contributions
comb+0.005
 fault+0.005
 former+0.005
ses+0.005
 hospitality+0.004
Neg logit contributions
ug-0.003
ister-0.003
P-0.003
irens-0.003
um-0.003
Token stretched
Token ID19110
Feature activation+1.45e-02

Pos logit contributions
 pleasure+0.017
comb+0.017
 fault+0.016
ses+0.016
 hospitality+0.016
Neg logit contributions
ug-0.015
P-0.015
ister-0.013
?"-0.013
ep-0.013
Token out
Token ID503
Feature activation+2.40e-02

Pos logit contributions
 former+0.026
ses+0.026
comb+0.025
 pleasure+0.025
 hospitality+0.025
Neg logit contributions
ug-0.023
ister-0.020
P-0.020
irens-0.018
iren-0.017
Token her
Token ID607
Feature activation+3.71e-02

Pos logit contributions
comb+0.049
ses+0.047
 pad+0.045
 pleasure+0.044
 former+0.044
Neg logit contributions
ug-0.035
irens-0.032
P-0.029
ister-0.029
 elevator-0.028
Token yellow
Token ID7872
Feature activation+4.60e-02

Pos logit contributions
comb+0.062
 hospitality+0.060
epad+0.057
 former+0.056
Aud+0.056
Neg logit contributions
P-0.058
ug-0.055
ister-0.050
 yacht-0.048
 elevator-0.046
Token pet
Token ID4273
Feature activation+8.08e-02

Pos logit contributions
comb+0.070
 pleasure+0.068
 seeking+0.066
 forgetting+0.066
ses+0.066
Neg logit contributions
P-0.081
ug-0.080
ister-0.076
irens-0.070
 yacht-0.068
Tokenals
Token ID874
Feature activation+5.21e-02

Pos logit contributions
 former+0.150
 pleasure+0.138
 fault+0.136
 seeking+0.136
comb+0.134
Neg logit contributions
ug-0.125
P-0.119
ister-0.111
aw-0.107
 yacht-0.103
Token and
Token ID290
Feature activation-4.03e-03

Pos logit contributions
ses+0.117
 former+0.111
 pleasure+0.111
 seeking+0.106
Aud+0.102
Neg logit contributions
ug-0.066
ister-0.058
P-0.050
iren-0.047
ler-0.046
Token felt
Token ID2936
Feature activation-2.60e-02

Pos logit contributions
P+0.006
ug+0.005
?"+0.005
 yacht+0.005
ister+0.004
Neg logit contributions
 fault-0.007
 pleasure-0.007
 hospitality-0.007
 former-0.007
comb-0.007
Token the
Token ID262
Feature activation-1.91e-02

Pos logit contributions
ug+0.038
ister+0.035
P+0.034
irens+0.030
 yacht+0.030
Neg logit contributions
comb-0.051
ses-0.048
 pad-0.048
 former-0.045
Aud-0.045
Token sun
Token ID4252
Feature activation+1.17e-02

Pos logit contributions
P+0.027
ister+0.023
ug+0.023
 yacht+0.020
um+0.020
Neg logit contributions
comb-0.035
ccoli-0.034
girls-0.033
 hospitality-0.033
Aud-0.033
Token Emma
Token ID18966
Feature activation+1.76e-02

Pos logit contributions
ug+0.017
ister+0.017
irens+0.016
P+0.016
iren+0.015
Neg logit contributions
comb-0.010
ses-0.009
 pleasure-0.009
 pad-0.009
 former-0.009
Token quickly
Token ID2952
Feature activation-1.37e-02

Pos logit contributions
comb+0.032
 pleasure+0.032
 hospitality+0.031
 former+0.031
ses+0.030
Neg logit contributions
ug-0.027
ister-0.025
P-0.024
opter-0.024
irens-0.023
Token realized
Token ID6939
Feature activation-1.41e-02

Pos logit contributions
ug+0.025
ister+0.023
P+0.023
irens+0.022
opter+0.022
Neg logit contributions
comb-0.022
 pleasure-0.021
ses-0.021
 former-0.019
 hospitality-0.019
Token that
Token ID326
Feature activation-8.00e-04

Pos logit contributions
ug+0.025
ister+0.024
P+0.023
irens+0.023
iren+0.021
Neg logit contributions
comb-0.023
 pleasure-0.022
ses-0.022
 pad-0.021
 pur-0.021
Token her
Token ID607
Feature activation+3.91e-02

Pos logit contributions
ug+0.001
ister+0.001
irens+0.001
P+0.001
lf+0.001
Neg logit contributions
comb-0.001
ses-0.001
 pad-0.001
 pleasure-0.001
ulent-0.001
Token pet
Token ID4273
Feature activation+8.04e-02

Pos logit contributions
comb+0.071
ulent+0.062
 pad+0.061
 nib+0.061
 hospitality+0.061
Neg logit contributions
ug-0.062
ister-0.061
P-0.057
irens-0.054
opter-0.053
Token dog
Token ID3290
Feature activation+3.29e-02

Pos logit contributions
comb+0.137
 pleasure+0.131
 former+0.128
 hospitality+0.126
 seeking+0.125
Neg logit contributions
ug-0.138
ister-0.134
P-0.127
irens-0.120
opter-0.117
Token had
Token ID550
Feature activation-9.76e-03

Pos logit contributions
comb+0.063
 former+0.060
 pleasure+0.060
 hospitality+0.058
ses+0.057
Neg logit contributions
ug-0.050
ister-0.047
P-0.044
irens-0.042
opter-0.040
Token taken
Token ID2077
Feature activation+7.85e-03

Pos logit contributions
ug+0.017
P+0.016
irens+0.015
ister+0.015
ep+0.014
Neg logit contributions
comb-0.017
 pleasure-0.015
ses-0.014
 hospitality-0.014
 former-0.014
Token it
Token ID340
Feature activation+3.29e-02

Pos logit contributions
comb+0.014
ses+0.013
 former+0.012
 hospitality+0.012
 seeking+0.012
Neg logit contributions
ug-0.014
P-0.012
ister-0.012
irens-0.012
iren-0.011
Token.
Token ID13
Feature activation-7.13e-03

Pos logit contributions
comb+0.057
ses+0.057
 pleasure+0.055
 former+0.053
 hospitality+0.053
Neg logit contributions
ug-0.054
ister-0.052
P-0.049
irens-0.046
iren-0.044
Token The
Token ID383
Feature activation+8.91e-03

Pos logit contributions
ug+0.004
ister+0.004
P+0.003
irens+0.003
iren+0.003
Neg logit contributions
comb-0.004
ses-0.004
 seeking-0.004
 hospitality-0.004
 watered-0.004
Token flower
Token ID15061
Feature activation+5.44e-02

Pos logit contributions
comb+0.016
Aud+0.016
ses+0.015
 seeking+0.015
eter+0.015
Neg logit contributions
ug-0.013
ister-0.012
P-0.012
 elevator-0.011
 yacht-0.011
Token opened
Token ID4721
Feature activation+2.05e-02

Pos logit contributions
 pleasure+0.107
ses+0.106
epad+0.100
 hospitality+0.099
ulent+0.099
Neg logit contributions
ug-0.075
ister-0.069
P-0.067
ler-0.063
ep-0.062
Token up
Token ID510
Feature activation+4.03e-02

Pos logit contributions
ses+0.041
comb+0.040
 hospitality+0.040
 seeking+0.039
 pleasure+0.037
Neg logit contributions
ug-0.028
ister-0.025
P-0.024
irens-0.023
 yacht-0.022
Token its
Token ID663
Feature activation+3.12e-02

Pos logit contributions
ses+0.086
comb+0.086
 hospitality+0.083
 former+0.083
Aud+0.083
Neg logit contributions
ug-0.045
ister-0.041
P-0.037
irens-0.037
 yacht-0.034
Token pet
Token ID4273
Feature activation+8.03e-02

Pos logit contributions
Aud+0.054
 hospitality+0.053
comb+0.052
ses+0.050
epad+0.050
Neg logit contributions
ug-0.041
P-0.039
um-0.039
ister-0.039
ole-0.038
Tokenals
Token ID874
Feature activation+4.57e-02

Pos logit contributions
 former+0.140
 pleasure+0.133
 fault+0.131
 seeking+0.131
ses+0.130
Neg logit contributions
ug-0.127
ister-0.121
P-0.115
aw-0.115
irens-0.110
Token and
Token ID290
Feature activation+2.20e-03

Pos logit contributions
ses+0.109
 former+0.103
 pleasure+0.102
 fault+0.098
 seeking+0.096
Neg logit contributions
ug-0.049
ister-0.046
P-0.036
iren-0.036
irens-0.036
Token started
Token ID2067
Feature activation-8.49e-03

Pos logit contributions
 fault+0.004
ses+0.004
 pleasure+0.004
 former+0.004
comb+0.004
Neg logit contributions
ug-0.003
P-0.003
ister-0.003
opter-0.002
 yacht-0.002
Token to
Token ID284
Feature activation+1.94e-03

Pos logit contributions
ug+0.011
irens+0.010
P+0.010
 kay+0.009
ister+0.009
Neg logit contributions
comb-0.016
ses-0.016
 pleasure-0.015
 former-0.015
ccoli-0.014
Token feel
Token ID1254
Feature activation-5.30e-03

Pos logit contributions
ses+0.003
Aud+0.003
eter+0.003
 fault+0.003
 pad+0.003
Neg logit contributions
?"-0.003
ister-0.003
ug-0.003
 yacht-0.003
P-0.002
Token surprised
Token ID6655
Feature activation+2.19e-03

Pos logit contributions
ug+0.029
ister+0.028
P+0.026
irens+0.026
iren+0.025
Neg logit contributions
comb-0.029
ses-0.029
 former-0.027
 hospitality-0.027
 pad-0.026
Token.
Token ID13
Feature activation+4.95e-03

Pos logit contributions
comb+0.003
ses+0.003
 pleasure+0.003
 former+0.003
 pad+0.003
Neg logit contributions
ug-0.004
ister-0.004
P-0.004
irens-0.004
iren-0.004
Token Even
Token ID3412
Feature activation-7.82e-03

Pos logit contributions
comb+0.008
 hospitality+0.007
 pleasure+0.007
ses+0.007
 former+0.007
Neg logit contributions
ug-0.009
ister-0.009
irens-0.008
P-0.008
iren-0.008
Token though
Token ID996
Feature activation+4.07e-03

Pos logit contributions
ug+0.012
ister+0.012
P+0.011
irens+0.011
iren+0.010
Neg logit contributions
comb-0.015
ses-0.014
 pleasure-0.014
 former-0.013
 pad-0.013
Token the
Token ID262
Feature activation+2.30e-03

Pos logit contributions
comb+0.008
ses+0.008
 pad+0.008
 pleasure+0.008
 former+0.008
Neg logit contributions
ug-0.005
ister-0.005
irens-0.005
P-0.005
opter-0.004
Token pet
Token ID4273
Feature activation+8.02e-02

Pos logit contributions
comb+0.004
ccoli+0.004
 seeking+0.004
 potions+0.004
epad+0.004
Neg logit contributions
P-0.003
ister-0.003
 taxi-0.003
Dad-0.003
 van-0.003
Token had
Token ID550
Feature activation+9.07e-03

Pos logit contributions
 pleasure+0.131
comb+0.128
 former+0.125
 seeking+0.124
 hospitality+0.124
Neg logit contributions
ister-0.140
ug-0.138
P-0.128
irens-0.123
iren-0.120
Token found
Token ID1043
Feature activation+1.19e-02

Pos logit contributions
comb+0.014
ses+0.013
 pleasure+0.013
 seeking+0.012
 pad+0.012
Neg logit contributions
ug-0.015
P-0.015
ister-0.015
irens-0.014
opter-0.014
Token and
Token ID290
Feature activation+7.71e-03

Pos logit contributions
comb+0.022
ses+0.021
 former+0.020
 seeking+0.020
 pad+0.020
Neg logit contributions
ug-0.018
ister-0.017
irens-0.016
P-0.016
iren-0.015
Token ate
Token ID15063
Feature activation+3.69e-02

Pos logit contributions
ses+0.014
Aud+0.014
comb+0.014
 pleasure+0.013
 fault+0.013
Neg logit contributions
ug-0.011
ister-0.010
lf-0.010
irens-0.010
 yacht-0.009
Token the
Token ID262
Feature activation+1.16e-02

Pos logit contributions
comb+0.072
ses+0.069
 pad+0.067
 former+0.067
 seeking+0.066
Neg logit contributions
ug-0.052
ister-0.049
P-0.046
irens-0.046
iren-0.042
Token.
Token ID13
Feature activation-4.20e-03

Pos logit contributions
ug+0.003
ister+0.003
iren+0.002
lf+0.002
irens+0.002
Neg logit contributions
 former-0.004
eter-0.004
 pleasure-0.004
ses-0.004
Aud-0.004
Token He
Token ID679
Feature activation-1.87e-02

Pos logit contributions
ug+0.005
ister+0.005
irens+0.005
P+0.005
iren+0.005
Neg logit contributions
comb-0.009
 former-0.008
ses-0.008
 watered-0.008
 seeking-0.008
Token wished
Token ID16555
Feature activation-1.12e-02

Pos logit contributions
ug+0.026
P+0.025
ister+0.024
ep+0.024
opter+0.023
Neg logit contributions
comb-0.035
 pleasure-0.035
ses-0.034
 fault-0.034
eter-0.033
Token that
Token ID326
Feature activation-1.71e-03

Pos logit contributions
ug+0.013
ister+0.013
irens+0.011
lf+0.010
opter+0.010
Neg logit contributions
comb-0.024
 former-0.024
 pad-0.024
ses-0.024
Aud-0.023
Token his
Token ID465
Feature activation+2.10e-02

Pos logit contributions
ister+0.002
ug+0.002
lf+0.001
irens+0.001
 elevator+0.001
Neg logit contributions
ses-0.004
comb-0.004
 pad-0.004
 former-0.004
Aud-0.004
Token pet
Token ID4273
Feature activation+8.01e-02

Pos logit contributions
Aud+0.037
comb+0.034
ulent+0.034
eter+0.031
 agreeing+0.031
Neg logit contributions
ister-0.033
ug-0.032
 elevator-0.028
lf-0.028
opter-0.027
Token fish
Token ID5916
Feature activation+2.10e-02

Pos logit contributions
comb+0.131
 pleasure+0.127
 former+0.124
ses+0.124
ulent+0.124
Neg logit contributions
ug-0.134
ister-0.134
P-0.117
opter-0.116
irens-0.108
Token,
Token ID11
Feature activation+1.66e-02

Pos logit contributions
 former+0.038
Aud+0.038
ses+0.037
 pleasure+0.036
comb+0.036
Neg logit contributions
ug-0.032
ister-0.030
opter-0.026
P-0.026
boat-0.026
Token Fin
Token ID4463
Feature activation-2.43e-02

Pos logit contributions
comb+0.033
ses+0.031
Aud+0.030
 former+0.030
eter+0.029
Neg logit contributions
ug-0.024
ister-0.022
irens-0.021
iren-0.019
 yacht-0.019
Token,
Token ID11
Feature activation+1.42e-02

Pos logit contributions
ug+0.042
ister+0.037
P+0.036
opter+0.034
ler+0.033
Neg logit contributions
comb-0.043
 former-0.042
ses-0.040
 amaz-0.039
 pleasure-0.039
Token who
Token ID508
Feature activation-6.39e-03

Pos logit contributions
comb+0.028
ses+0.026
 former+0.026
Aud+0.025
 pleasure+0.024
Neg logit contributions
ug-0.021
ister-0.020
irens-0.018
P-0.018
iren-0.017
Token The
Token ID383
Feature activation+3.80e-03

Pos logit contributions
ug+0.006
ister+0.006
P+0.005
irens+0.005
iren+0.005
Neg logit contributions
comb-0.005
ses-0.005
 watered-0.005
 seeking-0.005
 hospitality-0.005
Token l
Token ID300
Feature activation+3.69e-02

Pos logit contributions
comb+0.007
eter+0.006
ses+0.006
 pleasure+0.006
 former+0.006
Neg logit contributions
ister-0.006
P-0.006
ug-0.006
irens-0.005
 elevator-0.005
Tokenily
Token ID813
Feature activation+3.45e-02

Pos logit contributions
comb+0.071
 pleasure+0.067
ses+0.066
 pur+0.065
 pad+0.065
Neg logit contributions
ug-0.056
ister-0.053
P-0.050
irens-0.049
iren-0.046
Token opened
Token ID4721
Feature activation+9.50e-03

Pos logit contributions
 pleasure+0.065
ses+0.061
comb+0.060
 former+0.059
 hospitality+0.059
Neg logit contributions
ug-0.053
ister-0.051
P-0.045
ler-0.043
opter-0.043
Token its
Token ID663
Feature activation+3.07e-02

Pos logit contributions
comb+0.018
ses+0.018
 hospitality+0.017
 seeking+0.017
 former+0.016
Neg logit contributions
ug-0.014
ister-0.013
irens-0.012
P-0.012
iren-0.011
Token pet
Token ID4273
Feature activation+7.98e-02

Pos logit contributions
comb+0.055
 hospitality+0.054
Aud+0.052
epad+0.052
 seeking+0.049
Neg logit contributions
ister-0.037
P-0.036
um-0.036
ug-0.036
ole-0.035
Tokenals
Token ID874
Feature activation+4.69e-02

Pos logit contributions
 former+0.134
 pleasure+0.129
 fault+0.127
ses+0.125
 seeking+0.124
Neg logit contributions
ister-0.128
ug-0.127
P-0.121
irens-0.116
aw-0.115
Token and
Token ID290
Feature activation-4.64e-03

Pos logit contributions
ses+0.113
 former+0.107
 pleasure+0.106
 fault+0.100
 hospitality+0.100
Neg logit contributions
ister-0.049
ug-0.049
irens-0.038
iren-0.038
P-0.036
Token felt
Token ID2936
Feature activation-2.53e-02

Pos logit contributions
ug+0.007
P+0.007
ister+0.007
opter+0.006
irens+0.006
Neg logit contributions
comb-0.008
ses-0.008
 pleasure-0.007
 former-0.007
 pad-0.007
Token happy
Token ID3772
Feature activation+2.63e-02

Pos logit contributions
ug+0.040
ister+0.038
P+0.037
irens+0.034
iren+0.033
Neg logit contributions
comb-0.045
ses-0.044
 pad-0.042
 seeking-0.040
 hospitality-0.038
Token and
Token ID290
Feature activation-1.82e-02

Pos logit contributions
ses+0.058
Aud+0.058
comb+0.056
brance+0.054
 former+0.053
Neg logit contributions
ug-0.027
ister-0.024
 yacht-0.021
,-0.021
irens-0.020
Token decorated
Token ID24789
Feature activation-2.72e-02

Pos logit contributions
ug+0.006
 yacht+0.005
P+0.005
 cubes+0.005
irens+0.005
Neg logit contributions
comb-0.007
 pleasure-0.007
 former-0.006
Aud-0.006
eter-0.006
Token it
Token ID340
Feature activation-1.54e-02

Pos logit contributions
ug+0.044
ister+0.042
P+0.039
irens+0.039
opter+0.037
Neg logit contributions
comb-0.050
 pleasure-0.046
ses-0.046
 pad-0.046
 pur-0.045
Token with
Token ID351
Feature activation+1.38e-03

Pos logit contributions
ug+0.019
ister+0.017
P+0.016
irens+0.015
 cubes+0.014
Neg logit contributions
 pleasure-0.034
comb-0.032
 hospitality-0.032
ses-0.032
Aud-0.032
Token shells
Token ID19679
Feature activation-5.11e-02

Pos logit contributions
comb+0.003
ses+0.003
 pad+0.003
 seeking+0.003
 hospitality+0.003
Neg logit contributions
ug-0.002
P-0.002
ister-0.002
irens-0.002
 cubes-0.002
Token and
Token ID290
Feature activation-3.98e-03

Pos logit contributions
ug+0.073
ister+0.069
irens+0.061
P+0.061
iren+0.058
Neg logit contributions
 former-0.100
 pleasure-0.100
comb-0.099
ses-0.098
 hospitality-0.093
Token seaw
Token ID32085
Feature activation-1.07e-01

Pos logit contributions
ug+0.005
P+0.005
 cubes+0.005
irens+0.005
 yacht+0.005
Neg logit contributions
comb-0.008
 pleasure-0.007
Aud-0.007
 former-0.007
ses-0.007
Tokeneed
Token ID2308
Feature activation-7.73e-02

Pos logit contributions
ug+0.218
ister+0.215
irens+0.204
P+0.197
iren+0.194
Neg logit contributions
comb-0.149
 pad-0.139
ses-0.135
 pleasure-0.135
 pur-0.131
Token.
Token ID13
Feature activation+6.92e-04

Pos logit contributions
ug+0.101
ister+0.089
irens+0.079
 cubes+0.077
iren+0.077
Neg logit contributions
 former-0.163
 pleasure-0.162
comb-0.160
ses-0.154
 pur-0.151
Token They
Token ID1119
Feature activation-6.56e-03

Pos logit contributions
comb+0.001
 pleasure+0.001
ses+0.001
 pur+0.001
 pad+0.001
Neg logit contributions
ug-0.001
ister-0.001
P-0.001
irens-0.001
iren-0.001
Token had
Token ID550
Feature activation-5.89e-03

Pos logit contributions
ug+0.012
P+0.012
ister+0.012
opter+0.011
irens+0.011
Neg logit contributions
comb-0.010
 pleasure-0.009
ses-0.009
 former-0.009
 hospitality-0.009
Token a
Token ID257
Feature activation+9.43e-03

Pos logit contributions
P+0.008
 cubes+0.008
ug+0.007
ep+0.007
 seas+0.007
Neg logit contributions
comb-0.012
 former-0.010
ses-0.010
 hospitality-0.010
 pad-0.010
Token Like
Token ID4525
Feature activation-8.22e-03

Pos logit contributions
ug+0.020
ister+0.019
P+0.018
irens+0.018
opter+0.018
Neg logit contributions
comb-0.012
 pleasure-0.011
ses-0.011
 pad-0.011
 former-0.011
Token shells
Token ID19679
Feature activation-4.84e-02

Pos logit contributions
ug+0.014
ister+0.013
P+0.013
irens+0.013
iren+0.012
Neg logit contributions
comb-0.014
 pleasure-0.013
ses-0.013
 pad-0.013
 former-0.013
Token and
Token ID290
Feature activation+1.80e-03

Pos logit contributions
ug+0.067
ister+0.065
P+0.059
irens+0.058
iren+0.054
Neg logit contributions
comb-0.099
 pleasure-0.095
ses-0.095
 former-0.094
 pur-0.091
Token rocks
Token ID12586
Feature activation-6.38e-02

Pos logit contributions
comb+0.004
ses+0.003
 pleasure+0.003
 pad+0.003
 former+0.003
Neg logit contributions
ug-0.003
P-0.002
ister-0.002
irens-0.002
opter-0.002
Token and
Token ID290
Feature activation+1.89e-03

Pos logit contributions
ug+0.110
ister+0.105
P+0.099
irens+0.098
iren+0.094
Neg logit contributions
comb-0.109
 pleasure-0.104
ses-0.102
 former-0.100
 pur-0.099
Token seaw
Token ID32085
Feature activation-1.05e-01

Pos logit contributions
comb+0.004
ses+0.003
 pleasure+0.003
 former+0.003
 pad+0.003
Neg logit contributions
ug-0.003
P-0.003
ister-0.003
irens-0.003
iren-0.002
Tokeneed
Token ID2308
Feature activation-6.74e-02

Pos logit contributions
ister+0.225
ug+0.222
irens+0.212
iren+0.203
P+0.199
Neg logit contributions
comb-0.130
 pad-0.128
 pleasure-0.119
ses-0.118
 former-0.115
Token."
Token ID526
Feature activation+5.10e-04

Pos logit contributions
ug+0.100
ister+0.094
P+0.088
irens+0.084
iren+0.081
Neg logit contributions
comb-0.129
 pleasure-0.127
ses-0.126
 former-0.123
 pur-0.120
Token
Token ID198
Feature activation-2.10e-02

Pos logit contributions
comb+0.001
 pleasure+0.001
 pur+0.001
ses+0.001
 seeking+0.001
Neg logit contributions
ug-0.001
ister-0.001
irens-0.001
P-0.001
opter-0.001
Token
Token ID198
Feature activation-1.75e-02

Pos logit contributions
ug+0.046
ister+0.042
irens+0.042
opter+0.041
P+0.040
Neg logit contributions
 pur-0.023
comb-0.023
 nib-0.022
 pleasure-0.022
 pad-0.022
TokenM
Token ID44
Feature activation+1.39e-02

Pos logit contributions
ug+0.028
irens+0.024
ister+0.024
opter+0.023
iren+0.023
Neg logit contributions
comb-0.031
 pur-0.030
Aud-0.030
 pad-0.029
 former-0.029
Tokenate
Token ID378
Feature activation-3.28e-02

Pos logit contributions
ug+0.033
ister+0.033
iren+0.031
irens+0.030
opter+0.030
Neg logit contributions
 watered-0.026
 pur-0.024
comb-0.024
 pollen-0.024
 former-0.024
Token it
Token ID340
Feature activation-1.10e-02

Pos logit contributions
ug+0.047
ister+0.044
irens+0.041
P+0.040
iren+0.037
Neg logit contributions
comb-0.067
 former-0.063
ses-0.063
 pad-0.063
 pleasure-0.063
Token with
Token ID351
Feature activation-2.31e-03

Pos logit contributions
ug+0.015
ister+0.014
P+0.014
irens+0.013
 cubes+0.012
Neg logit contributions
 pleasure-0.022
comb-0.021
ses-0.021
 pad-0.021
 hospitality-0.021
Token shells
Token ID19679
Feature activation-5.12e-02

Pos logit contributions
ug+0.003
ister+0.003
P+0.003
irens+0.003
iren+0.003
Neg logit contributions
comb-0.005
ses-0.004
 pad-0.004
 pleasure-0.004
 seeking-0.004
Token and
Token ID290
Feature activation-2.13e-03

Pos logit contributions
ug+0.079
ister+0.075
P+0.070
irens+0.069
iren+0.065
Neg logit contributions
comb-0.095
 pleasure-0.093
 former-0.092
ses-0.092
 seeking-0.089
Token seaw
Token ID32085
Feature activation-1.04e-01

Pos logit contributions
ug+0.003
P+0.003
irens+0.003
ister+0.003
iren+0.003
Neg logit contributions
comb-0.004
 pleasure-0.004
 former-0.004
ses-0.004
 seeking-0.004
Tokeneed
Token ID2308
Feature activation-7.25e-02

Pos logit contributions
ister+0.222
ug+0.221
irens+0.210
iren+0.200
P+0.199
Neg logit contributions
comb-0.130
 pad-0.125
ses-0.121
 pleasure-0.119
 pur-0.116
Token.
Token ID13
Feature activation+1.83e-03

Pos logit contributions
ug+0.102
ister+0.095
irens+0.087
P+0.087
iren+0.083
Neg logit contributions
comb-0.144
 former-0.143
 pleasure-0.143
ses-0.138
 seeking-0.135
Token They
Token ID1119
Feature activation-8.15e-03

Pos logit contributions
comb+0.003
 pleasure+0.003
ses+0.003
 pad+0.003
 seeking+0.003
Neg logit contributions
ug-0.003
ister-0.003
P-0.003
irens-0.003
iren-0.003
Token find
Token ID1064
Feature activation-6.70e-03

Pos logit contributions
ug+0.015
ister+0.015
P+0.014
opter+0.014
irens+0.013
Neg logit contributions
comb-0.012
 pleasure-0.011
 former-0.011
ses-0.011
eter-0.011
Token a
Token ID257
Feature activation+2.59e-03

Pos logit contributions
ug+0.014
ister+0.013
P+0.013
irens+0.013
iren+0.012
Neg logit contributions
comb-0.009
 pleasure-0.009
ses-0.009
 pur-0.008
 pad-0.008
Token,
Token ID11
Feature activation+2.01e-02

Pos logit contributions
comb+0.020
 pleasure+0.020
 pur+0.019
 seeking+0.019
 hospitality+0.019
Neg logit contributions
ug-0.021
ister-0.020
P-0.020
irens-0.019
iren-0.018
Token "
Token ID366
Feature activation+5.28e-03

Pos logit contributions
comb+0.028
ses+0.025
 pleasure+0.025
 former+0.025
 seeking+0.024
Neg logit contributions
ug-0.041
ister-0.040
P-0.038
irens-0.038
iren-0.036
TokenGood
Token ID10248
Feature activation+8.44e-03

Pos logit contributions
comb+0.007
 pleasure+0.007
 pad+0.007
ses+0.007
 pur+0.007
Neg logit contributions
ug-0.011
ister-0.010
irens-0.010
P-0.010
iren-0.009
Token job
Token ID1693
Feature activation-4.63e-02

Pos logit contributions
 pleasure+0.013
ses+0.012
comb+0.012
 pur+0.012
 pad+0.012
Neg logit contributions
ug-0.016
ister-0.015
irens-0.015
P-0.015
iren-0.014
Token,
Token ID11
Feature activation-3.30e-02

Pos logit contributions
P+0.075
ug+0.071
irens+0.067
ister+0.067
iren+0.063
Neg logit contributions
 pleasure-0.082
comb-0.080
 seeking-0.080
 pur-0.076
ses-0.076
Token Anch
Token ID29253
Feature activation-1.03e-01

Pos logit contributions
ug+0.054
P+0.054
irens+0.054
ister+0.053
iren+0.050
Neg logit contributions
 pleasure-0.056
 seeking-0.051
 former-0.050
 hospitality-0.050
 pad-0.049
Tokenor
Token ID273
Feature activation-4.52e-02

Pos logit contributions
P+0.124
irens+0.116
Dad+0.112
lf+0.107
Daddy+0.097
Neg logit contributions
 pad-0.178
 seeking-0.169
 hope-0.158
 agrees-0.158
 hospitality-0.157
Token!
Token ID0
Feature activation-7.19e-03

Pos logit contributions
ug+0.074
ister+0.070
P+0.070
irens+0.069
iren+0.065
Neg logit contributions
comb-0.075
 pleasure-0.072
 pur-0.072
 seeking-0.071
ses-0.070
Token You
Token ID921
Feature activation+1.64e-02

Pos logit contributions
ug+0.015
ister+0.014
irens+0.014
P+0.014
opter+0.013
Neg logit contributions
 pleasure-0.009
comb-0.009
ses-0.009
 seeking-0.008
 pur-0.008
Token did
Token ID750
Feature activation-1.92e-03

Pos logit contributions
comb+0.024
ses+0.022
 former+0.021
 pleasure+0.021
 pad+0.021
Neg logit contributions
ug-0.033
ister-0.031
P-0.030
irens-0.029
opter-0.029
Token your
Token ID534
Feature activation+1.26e-02

Pos logit contributions
ug+0.003
ister+0.003
irens+0.003
P+0.003
opter+0.003
Neg logit contributions
comb-0.003
ses-0.003
 pad-0.003
 hospitality-0.003
 watered-0.003
TokenWe
Token ID1135
Feature activation+1.59e-02

Pos logit contributions
comb+0.046
ses+0.043
 pad+0.042
 seeking+0.041
 former+0.041
Neg logit contributions
P-0.057
ister-0.055
ug-0.054
irens-0.050
Dad-0.049
Token love
Token ID1842
Feature activation+4.02e-03

Pos logit contributions
comb+0.020
 seeking+0.018
ses+0.018
 former+0.018
 hospitality+0.017
Neg logit contributions
ug-0.034
ister-0.033
P-0.032
opter-0.030
irens-0.029
Token you
Token ID345
Feature activation-2.31e-03

Pos logit contributions
comb+0.010
 pad+0.009
ses+0.009
 former+0.009
 pleasure+0.009
Neg logit contributions
ug-0.004
ister-0.004
irens-0.004
P-0.004
iren-0.003
Token too
Token ID1165
Feature activation+2.54e-02

Pos logit contributions
ug+0.004
ister+0.004
P+0.003
irens+0.003
iren+0.003
Neg logit contributions
comb-0.004
 pleasure-0.004
ses-0.004
 pad-0.004
 former-0.004
Token,
Token ID11
Feature activation-3.28e-02

Pos logit contributions
 pleasure+0.049
comb+0.048
 pur+0.048
ses+0.047
 seeking+0.046
Neg logit contributions
ug-0.037
P-0.036
ister-0.035
irens-0.033
iren-0.031
Token grand
Token ID4490
Feature activation-1.03e-01

Pos logit contributions
ug+0.063
ister+0.060
P+0.058
irens+0.058
iren+0.055
Neg logit contributions
comb-0.049
 pleasure-0.047
ses-0.046
 pad-0.045
 pur-0.045
Tokenpa
Token ID8957
Feature activation-5.48e-02

Pos logit contributions
ug+0.156
P+0.153
irens+0.139
iren+0.131
ister+0.130
Neg logit contributions
 hospitality-0.193
comb-0.183
 pur-0.182
 pleasure-0.181
 generosity-0.177
Token,"
Token ID553
Feature activation+1.52e-02

Pos logit contributions
ug+0.068
ister+0.066
P+0.056
irens+0.055
iren+0.053
Neg logit contributions
comb-0.121
 former-0.116
ses-0.116
 pleasure-0.113
 seeking-0.112
Token Anna
Token ID11735
Feature activation+4.29e-03

Pos logit contributions
comb+0.022
 pleasure+0.020
ses+0.020
 pur+0.020
 pad+0.019
Neg logit contributions
ug-0.031
ister-0.030
P-0.028
irens-0.028
iren-0.027
Token and
Token ID290
Feature activation-3.32e-03

Pos logit contributions
comb+0.007
 pleasure+0.007
ses+0.007
 former+0.006
 seeking+0.006
Neg logit contributions
ug-0.008
ister-0.007
P-0.007
irens-0.007
iren-0.007
Token Ben
Token ID3932
Feature activation-1.60e-02

Pos logit contributions
ug+0.006
ister+0.006
P+0.006
irens+0.006
iren+0.005
Neg logit contributions
comb-0.005
 pleasure-0.005
ses-0.005
 pad-0.005
 pur-0.005
Token sun
Token ID4252
Feature activation+1.83e-03

Pos logit contributions
ug+0.028
ister+0.026
P+0.025
irens+0.025
iren+0.024
Neg logit contributions
comb-0.023
 pleasure-0.021
ses-0.021
 pur-0.021
 pad-0.021
Token and
Token ID290
Feature activation-1.67e-03

Pos logit contributions
 former+0.004
 pleasure+0.004
ses+0.004
 seeking+0.004
 hospitality+0.004
Neg logit contributions
ug-0.002
ister-0.002
irens-0.002
iren-0.002
opter-0.002
Token went
Token ID1816
Feature activation-2.60e-02

Pos logit contributions
ug+0.002
 yacht+0.002
ister+0.002
irens+0.002
aro+0.002
Neg logit contributions
 former-0.003
comb-0.003
 pleasure-0.003
 fault-0.003
ulent-0.003
Token back
Token ID736
Feature activation-9.48e-03

Pos logit contributions
ug+0.045
ister+0.043
P+0.042
irens+0.042
iren+0.040
Neg logit contributions
comb-0.045
ses-0.042
 pleasure-0.042
 former-0.041
 hospitality-0.040
Token to
Token ID284
Feature activation-4.37e-03

Pos logit contributions
ug+0.014
irens+0.012
ister+0.012
P+0.012
iren+0.011
Neg logit contributions
comb-0.019
ses-0.017
 hospitality-0.017
 pleasure-0.017
 reminder-0.016
Token Far
Token ID6755
Feature activation-1.03e-01

Pos logit contributions
 yacht+0.004
 his+0.004
 motor+0.004
ug+0.004
 kay+0.003
Neg logit contributions
comb-0.010
eter-0.009
ives-0.009
 pad-0.009
ulent-0.009
Tokenah
Token ID993
Feature activation-1.48e-02

Pos logit contributions
ug+0.211
irens+0.208
ister+0.206
P+0.199
iren+0.196
Neg logit contributions
 pad-0.126
comb-0.125
 pleasure-0.118
 hope-0.114
ses-0.113
Token's
Token ID338
Feature activation-1.80e-02

Pos logit contributions
ug+0.025
ister+0.023
P+0.021
irens+0.021
iren+0.020
Neg logit contributions
ses-0.027
comb-0.026
 pleasure-0.026
 seeking-0.024
 former-0.024
Token house
Token ID2156
Feature activation-2.42e-02

Pos logit contributions
ister+0.024
 yacht+0.024
 motor+0.023
ug+0.022
 elevator+0.021
Neg logit contributions
comb-0.035
Aud-0.033
ses-0.032
ulent-0.032
 pad-0.032
Token for
Token ID329
Feature activation+4.11e-03

Pos logit contributions
ug+0.029
ister+0.027
irens+0.022
 yacht+0.021
 cubes+0.020
Neg logit contributions
ses-0.054
comb-0.051
 pleasure-0.050
Aud-0.049
 pollen-0.047
Token a
Token ID257
Feature activation-2.27e-02

Pos logit contributions
comb+0.008
 pad+0.008
ses+0.007
eter+0.007
 former+0.007
Neg logit contributions
ug-0.006
P-0.006
irens-0.006
ister-0.005
 yacht-0.005
Token some
Token ID617
Feature activation+2.23e-02

Pos logit contributions
ug+0.058
ister+0.056
P+0.053
irens+0.053
iren+0.050
Neg logit contributions
comb-0.050
 pleasure-0.047
ses-0.046
 pad-0.045
 former-0.045
Token shells
Token ID19679
Feature activation-3.08e-02

Pos logit contributions
comb+0.038
 pleasure+0.036
ses+0.035
 pur+0.035
 pad+0.035
Neg logit contributions
ug-0.038
ister-0.037
P-0.035
irens-0.034
opter-0.032
Token and
Token ID290
Feature activation-2.11e-03

Pos logit contributions
ug+0.044
ister+0.042
irens+0.038
P+0.037
iren+0.036
Neg logit contributions
comb-0.061
 pleasure-0.061
 former-0.061
ses-0.059
 hospitality-0.057
Token rocks
Token ID12586
Feature activation-5.90e-02

Pos logit contributions
ug+0.003
P+0.003
irens+0.003
 cubes+0.003
ister+0.003
Neg logit contributions
comb-0.004
 pleasure-0.004
 former-0.004
ses-0.003
Aud-0.003
Token and
Token ID290
Feature activation-3.57e-03

Pos logit contributions
ug+0.094
ister+0.087
irens+0.082
iren+0.078
P+0.077
Neg logit contributions
comb-0.111
 pleasure-0.110
 former-0.108
ses-0.105
 hospitality-0.102
Token seaw
Token ID32085
Feature activation-1.03e-01

Pos logit contributions
ug+0.005
P+0.005
irens+0.005
 cubes+0.005
 yacht+0.004
Neg logit contributions
comb-0.007
 pleasure-0.007
 former-0.006
Aud-0.006
ses-0.006
Tokeneed
Token ID2308
Feature activation-6.46e-02

Pos logit contributions
ister+0.215
ug+0.214
irens+0.202
P+0.196
iren+0.193
Neg logit contributions
comb-0.133
 pad-0.125
ses-0.125
 pleasure-0.120
 pur-0.117
Token.
Token ID13
Feature activation+3.96e-03

Pos logit contributions
ug+0.078
ister+0.070
irens+0.066
iren+0.062
 cubes+0.060
Neg logit contributions
 former-0.144
 pleasure-0.143
comb-0.138
ses-0.133
Aud-0.130
Token He
Token ID679
Feature activation-3.73e-02

Pos logit contributions
comb+0.008
 hospitality+0.008
 pleasure+0.008
 former+0.008
 pur+0.008
Neg logit contributions
ug-0.005
ister-0.005
irens-0.005
P-0.005
iren-0.004
Token picked
Token ID6497
Feature activation+1.38e-02

Pos logit contributions
ug+0.076
ister+0.073
P+0.071
irens+0.069
iren+0.066
Neg logit contributions
comb-0.052
 pleasure-0.048
ses-0.047
 former-0.046
 pad-0.046
Token them
Token ID606
Feature activation+2.23e-02

Pos logit contributions
comb+0.023
 pleasure+0.022
ses+0.021
 former+0.021
 pad+0.021
Neg logit contributions
ug-0.024
ister-0.023
P-0.022
irens-0.022
iren-0.021
Token buckets
Token ID38674
Feature activation-2.06e-02

Pos logit contributions
ug+0.034
ister+0.032
P+0.031
irens+0.030
iren+0.029
Neg logit contributions
comb-0.035
ses-0.032
 pleasure-0.032
 pad-0.032
 former-0.032
Token and
Token ID290
Feature activation+1.40e-03

Pos logit contributions
ug+0.042
ister+0.041
P+0.039
irens+0.039
opter+0.037
Neg logit contributions
comb-0.028
 pleasure-0.026
ses-0.025
 pad-0.025
 pur-0.025
Token shovel
Token ID33677
Feature activation-8.03e-02

Pos logit contributions
comb+0.003
ses+0.002
 pleasure+0.002
 former+0.002
 pad+0.002
Neg logit contributions
ug-0.002
P-0.002
ister-0.002
irens-0.002
 cubes-0.002
Tokens
Token ID82
Feature activation+5.52e-03

Pos logit contributions
ug+0.146
ister+0.140
P+0.136
irens+0.134
iren+0.127
Neg logit contributions
comb-0.125
 pad-0.120
 pleasure-0.117
ses-0.114
 watered-0.112
Token and
Token ID290
Feature activation-1.48e-02

Pos logit contributions
comb+0.009
 pleasure+0.008
 pur+0.008
ses+0.008
 former+0.008
Neg logit contributions
ug-0.010
ister-0.009
P-0.009
irens-0.009
iren-0.008
Token seaw
Token ID32085
Feature activation-1.02e-01

Pos logit contributions
ug+0.024
P+0.022
ister+0.022
irens+0.021
 yacht+0.021
Neg logit contributions
comb-0.026
ses-0.023
Aud-0.023
 pleasure-0.023
 pad-0.023
Tokeneed
Token ID2308
Feature activation-6.85e-02

Pos logit contributions
ister+0.222
ug+0.219
irens+0.213
iren+0.203
opter+0.199
Neg logit contributions
comb-0.122
 pad-0.117
ses-0.109
 pleasure-0.107
 nib-0.105
Token.
Token ID13
Feature activation+1.78e-04

Pos logit contributions
ug+0.105
ister+0.100
P+0.090
irens+0.088
iren+0.085
Neg logit contributions
comb-0.127
 pleasure-0.124
 former-0.123
ses-0.122
 pur-0.121
Token Then
Token ID3244
Feature activation-2.10e-02

Pos logit contributions
comb+0.000
 seeking+0.000
 watered+0.000
 pur+0.000
 hospitality+0.000
Neg logit contributions
ug-0.000
ister-0.000
irens-0.000
P-0.000
iren-0.000
Token we
Token ID356
Feature activation+2.08e-02

Pos logit contributions
ug+0.029
ister+0.028
P+0.026
irens+0.024
opter+0.023
Neg logit contributions
comb-0.042
ses-0.040
 pad-0.040
 former-0.040
 seeking-0.039
Token can
Token ID460
Feature activation-8.73e-03

Pos logit contributions
comb+0.028
ses+0.026
 hospitality+0.026
 pleasure+0.026
 pur+0.025
Neg logit contributions
ug-0.044
ister-0.040
P-0.040
opter-0.038
irens-0.037
Token swimming
Token ID14899
Feature activation+1.68e-02

Pos logit contributions
ug+0.003
ister+0.003
irens+0.003
P+0.002
iren+0.002
Neg logit contributions
comb-0.003
ses-0.003
 pad-0.003
 former-0.003
 hospitality-0.003
Token,
Token ID11
Feature activation+1.86e-02

Pos logit contributions
comb+0.027
 pleasure+0.025
ses+0.025
 pur+0.024
 pad+0.024
Neg logit contributions
ug-0.031
ister-0.029
P-0.028
irens-0.027
iren-0.026
Token staying
Token ID10589
Feature activation+5.56e-03

Pos logit contributions
comb+0.034
ses+0.030
 former+0.029
 hospitality+0.029
 potions+0.028
Neg logit contributions
ug-0.031
ister-0.028
 yacht-0.026
irens-0.025
 motor-0.024
Token loyal
Token ID9112
Feature activation-3.30e-03

Pos logit contributions
comb+0.009
ses+0.008
 pleasure+0.008
 pad+0.008
 hospitality+0.008
Neg logit contributions
ug-0.011
ister-0.010
P-0.010
irens-0.009
iren-0.009
Token to
Token ID284
Feature activation+7.41e-03

Pos logit contributions
ug+0.006
ister+0.005
P+0.005
irens+0.005
 yacht+0.004
Neg logit contributions
comb-0.006
 former-0.005
ses-0.005
 pur-0.005
Aud-0.005
Token Anch
Token ID29253
Feature activation-1.01e-01

Pos logit contributions
comb+0.017
 pad+0.016
ses+0.015
eter+0.015
 pleasure+0.015
Neg logit contributions
ug-0.010
ister-0.009
irens-0.008
P-0.008
iren-0.007
Tokenor
Token ID273
Feature activation-4.25e-02

Pos logit contributions
P+0.105
irens+0.095
ister+0.080
lf+0.080
 cubes+0.078
Neg logit contributions
 pad-0.209
 seeking-0.199
 hospitality-0.199
 pleasure-0.194
 agrees-0.190
Token's
Token ID338
Feature activation-9.94e-03

Pos logit contributions
ug+0.074
ister+0.071
P+0.068
irens+0.067
iren+0.063
Neg logit contributions
comb-0.071
 pleasure-0.066
ses-0.065
 pad-0.065
 pur-0.065
Token advice
Token ID5608
Feature activation-1.66e-02

Pos logit contributions
ug+0.018
ister+0.017
 yacht+0.015
P+0.015
 motor+0.015
Neg logit contributions
comb-0.017
Aud-0.015
ses-0.014
 seeking-0.014
ulent-0.014
Token.
Token ID13
Feature activation-1.04e-02

Pos logit contributions
ug+0.028
ister+0.027
irens+0.024
P+0.024
iren+0.023
Neg logit contributions
comb-0.029
ses-0.028
 pleasure-0.027
 former-0.027
 seeking-0.027
Token They
Token ID1119
Feature activation+4.29e-03

Pos logit contributions
ug+0.020
ister+0.020
P+0.019
irens+0.019
iren+0.018
Neg logit contributions
comb-0.015
 pleasure-0.014
 pad-0.014
 former-0.014
ses-0.014
Token"
Token ID1
Feature activation+3.52e-02

Pos logit contributions
ug+0.014
ister+0.012
loo+0.012
opter+0.012
 elevator+0.011
Neg logit contributions
 pad-0.010
 knowing-0.010
Aud-0.010
 hopeful-0.010
 read-0.010
TokenI
Token ID40
Feature activation-2.40e-03

Pos logit contributions
comb+0.049
 pleasure+0.046
ses+0.045
 pur+0.044
 former+0.044
Neg logit contributions
ug-0.072
ister-0.068
irens-0.065
P-0.063
opter-0.062
Token'm
Token ID1101
Feature activation+1.43e-03

Pos logit contributions
ug+0.005
ister+0.005
P+0.004
opter+0.004
iren+0.004
Neg logit contributions
comb-0.003
ses-0.003
 seeking-0.003
 pollen-0.003
Aud-0.003
Token sorry
Token ID7926
Feature activation-9.63e-03

Pos logit contributions
comb+0.002
 former+0.002
 pad+0.002
ses+0.002
 hospitality+0.002
Neg logit contributions
ug-0.003
ister-0.003
P-0.002
irens-0.002
iren-0.002
Token,
Token ID11
Feature activation-3.05e-02

Pos logit contributions
P+0.017
ug+0.016
irens+0.016
iren+0.015
ister+0.015
Neg logit contributions
 pur-0.016
 pleasure-0.016
comb-0.015
 hospitality-0.015
 kindness-0.014
Token grand
Token ID4490
Feature activation-1.00e-01

Pos logit contributions
ug+0.058
P+0.055
ister+0.055
irens+0.053
iren+0.051
Neg logit contributions
 pleasure-0.046
comb-0.046
 pur-0.043
ses-0.043
 pad-0.043
Tokenpa
Token ID8957
Feature activation-5.95e-02

Pos logit contributions
ug+0.164
P+0.163
irens+0.144
iren+0.137
ister+0.137
Neg logit contributions
 hospitality-0.174
 pleasure-0.165
comb-0.165
 generosity-0.160
 pur-0.160
Token.
Token ID13
Feature activation+7.58e-03

Pos logit contributions
ug+0.080
ister+0.079
P+0.068
irens+0.068
iren+0.065
Neg logit contributions
comb-0.124
 former-0.118
ses-0.118
 pleasure-0.117
 pad-0.115
Token We
Token ID775
Feature activation+1.58e-02

Pos logit contributions
 pleasure+0.010
comb+0.010
ses+0.009
 pur+0.009
 seeking+0.009
Neg logit contributions
ug-0.016
ister-0.015
P-0.015
irens-0.015
iren-0.014
Token didn
Token ID1422
Feature activation-2.96e-03

Pos logit contributions
comb+0.023
 pleasure+0.021
ses+0.021
 seeking+0.021
 former+0.021
Neg logit contributions
ug-0.031
ister-0.030
P-0.029
irens-0.028
opter-0.027
Token't
Token ID470
Feature activation+6.57e-03

Pos logit contributions
ug+0.006
P+0.006
ister+0.006
opter+0.006
aver+0.005
Neg logit contributions
comb-0.003
 pur-0.003
 seeking-0.003
 reveal-0.003
ses-0.003
Tokenpa
Token ID8957
Feature activation-4.54e-02

Pos logit contributions
ug+0.050
P+0.046
opter+0.044
 yacht+0.042
irens+0.041
Neg logit contributions
comb-0.050
 hospitality-0.050
 seeking-0.046
 generosity-0.046
 pur-0.046
Token,
Token ID11
Feature activation-1.24e-02

Pos logit contributions
ug+0.060
ister+0.055
P+0.053
irens+0.052
iren+0.048
Neg logit contributions
comb-0.095
 pleasure-0.090
ses-0.090
 pur-0.089
 seeking-0.089
Token grand
Token ID4490
Feature activation-9.83e-02

Pos logit contributions
ug+0.028
irens+0.026
ister+0.026
P+0.026
iren+0.025
Neg logit contributions
comb-0.014
 pleasure-0.014
 seeking-0.014
 hospitality-0.014
 hope-0.013
Tokenpa
Token ID8957
Feature activation-5.58e-02

Pos logit contributions
ug+0.148
P+0.147
irens+0.129
ister+0.127
iren+0.120
Neg logit contributions
 hospitality-0.182
comb-0.174
 pleasure-0.174
 pur-0.173
ses-0.171
Token,
Token ID11
Feature activation-1.71e-02

Pos logit contributions
ug+0.078
ister+0.072
P+0.070
irens+0.069
iren+0.064
Neg logit contributions
comb-0.112
 pleasure-0.106
ses-0.105
 pur-0.105
 seeking-0.104
Token grand
Token ID4490
Feature activation-9.99e-02

Pos logit contributions
ug+0.039
ister+0.037
irens+0.036
P+0.036
iren+0.034
Neg logit contributions
comb-0.020
 pleasure-0.019
 seeking-0.018
ses-0.018
 pur-0.018
Tokenpa
Token ID8957
Feature activation-5.76e-02

Pos logit contributions
ug+0.150
P+0.148
irens+0.131
ister+0.128
iren+0.121
Neg logit contributions
 hospitality-0.185
comb-0.178
 pur-0.177
 pleasure-0.177
ses-0.173
Token!"
Token ID2474
Feature activation+9.40e-03

Pos logit contributions
ug+0.086
ister+0.080
P+0.077
irens+0.076
iren+0.071
Neg logit contributions
comb-0.111
 pleasure-0.105
ses-0.104
 pur-0.104
 seeking-0.103
Token Lily
Token ID20037
Feature activation+3.56e-04

Pos logit contributions
 pur+0.010
 forgetting+0.010
 pleasure+0.009
 hope+0.009
 amaz+0.009
Neg logit contributions
ug-0.023
P-0.021
ister-0.019
opter-0.019
or-0.019
Token and
Token ID290
Feature activation-8.51e-03

Pos logit contributions
comb+0.001
 pleasure+0.001
 former+0.001
 hospitality+0.001
ses+0.001
Neg logit contributions
ug-0.001
ister-0.001
irens-0.001
iren-0.001
P-0.001
Token Tom
Token ID4186
Feature activation-2.31e-02

Pos logit contributions
ug+0.014
ister+0.013
irens+0.013
P+0.012
iren+0.012
Neg logit contributions
comb-0.016
 former-0.014
 pleasure-0.014
 seeking-0.014
 pad-0.014
Token and
Token ID290
Feature activation-1.72e-03

Pos logit contributions
 pleasure+0.008
 former+0.008
ses+0.008
Aud+0.007
comb+0.007
Neg logit contributions
ug-0.005
ister-0.005
 yacht-0.004
in-0.004
irens-0.004
Token the
Token ID262
Feature activation-1.48e-02

Pos logit contributions
 yacht+0.002
ug+0.002
 cubes+0.001
ister+0.001
irens+0.001
Neg logit contributions
comb-0.004
Aud-0.004
 pad-0.004
 pleasure-0.004
eter-0.004
Token crabs
Token ID49172
Feature activation-2.01e-02

Pos logit contributions
ug+0.021
ister+0.021
P+0.020
irens+0.019
 yacht+0.018
Neg logit contributions
comb-0.028
 pleasure-0.025
ses-0.025
Aud-0.025
 seeking-0.025
Token and
Token ID290
Feature activation-8.04e-03

Pos logit contributions
ug+0.023
ister+0.022
 yacht+0.020
irens+0.019
iren+0.018
Neg logit contributions
 pleasure-0.046
ses-0.044
 former-0.044
eter-0.043
comb-0.041
Token the
Token ID262
Feature activation-2.21e-02

Pos logit contributions
 yacht+0.007
ug+0.007
 cubes+0.006
ister+0.006
irens+0.005
Neg logit contributions
comb-0.019
 pleasure-0.018
 former-0.018
eter-0.018
Aud-0.018
Token seaw
Token ID32085
Feature activation-9.98e-02

Pos logit contributions
ug+0.032
ister+0.031
P+0.030
irens+0.028
 yacht+0.027
Neg logit contributions
comb-0.043
 pleasure-0.038
 former-0.038
ses-0.038
Aud-0.038
Tokeneed
Token ID2308
Feature activation-6.89e-02

Pos logit contributions
ug+0.207
ister+0.203
irens+0.200
P+0.187
iren+0.184
Neg logit contributions
comb-0.136
 pad-0.129
ses-0.124
 pleasure-0.120
 former-0.119
Token.
Token ID13
Feature activation+4.89e-03

Pos logit contributions
ug+0.091
ister+0.084
 yacht+0.075
ler+0.072
irens+0.071
Neg logit contributions
 pleasure-0.143
comb-0.139
Aud-0.138
 former-0.138
ses-0.138
Token
Token ID198
Feature activation-1.25e-02

Pos logit contributions
comb+0.011
 hospitality+0.010
 pad+0.010
 former+0.010
 seeking+0.010
Neg logit contributions
ug-0.006
ister-0.006
irens-0.006
P-0.006
Dad-0.005
Token
Token ID198
Feature activation-1.32e-02

Pos logit contributions
ug+0.024
ister+0.023
P+0.022
irens+0.022
opter+0.021
Neg logit contributions
comb-0.018
 pleasure-0.017
 pur-0.017
ses-0.017
 pad-0.016
TokenSam
Token ID16305
Feature activation-1.13e-02

Pos logit contributions
ug+0.019
ister+0.018
irens+0.017
P+0.016
opter+0.016
Neg logit contributions
comb-0.026
 pad-0.024
 seeking-0.024
 pur-0.024
 pleasure-0.024
TokenDon
Token ID3987
Feature activation-1.04e-02

Pos logit contributions
comb+0.003
ses+0.003
 seeking+0.003
 pleasure+0.003
 pur+0.003
Neg logit contributions
P-0.003
ug-0.003
ister-0.003
irens-0.003
iren-0.003
Token't
Token ID470
Feature activation+9.76e-03

Pos logit contributions
ug+0.021
irens+0.019
P+0.019
ister+0.018
 cubes+0.018
Neg logit contributions
comb-0.015
 pur-0.013
 pleasure-0.013
 hospitality-0.013
 seeking-0.012
Token worry
Token ID5490
Feature activation-4.05e-02

Pos logit contributions
comb+0.012
ses+0.011
 pleasure+0.011
 pad+0.011
 seeking+0.011
Neg logit contributions
ug-0.021
ister-0.021
P-0.020
irens-0.019
opter-0.019
Token,
Token ID11
Feature activation-3.05e-02

Pos logit contributions
ug+0.059
ister+0.055
P+0.054
irens+0.053
iren+0.050
Neg logit contributions
comb-0.080
 pleasure-0.076
 pur-0.075
ses-0.074
 pad-0.074
Token Mar
Token ID1526
Feature activation-8.27e-02

Pos logit contributions
ug+0.049
ister+0.048
P+0.042
opter+0.041
irens+0.041
Neg logit contributions
comb-0.054
ses-0.052
 watered-0.051
 pad-0.050
 former-0.050
Tokenlon
Token ID14995
Feature activation-9.96e-02

Pos logit contributions
ug+0.133
ister+0.127
irens+0.122
P+0.120
iren+0.114
Neg logit contributions
comb-0.151
 pleasure-0.141
 pad-0.140
ses-0.139
 seeking-0.136
Token.
Token ID13
Feature activation+2.32e-03

Pos logit contributions
ug+0.148
ister+0.141
P+0.132
irens+0.126
opter+0.121
Neg logit contributions
comb-0.192
 pleasure-0.183
ses-0.182
 former-0.179
 seeking-0.179
Token You
Token ID921
Feature activation+1.77e-02

Pos logit contributions
comb+0.004
 watered+0.003
 pad+0.003
ses+0.003
 pur+0.003
Neg logit contributions
ug-0.004
ister-0.004
irens-0.004
P-0.004
iren-0.004
Token'll
Token ID1183
Feature activation+1.10e-02

Pos logit contributions
comb+0.022
ses+0.021
 pad+0.020
 former+0.019
 seeking+0.019
Neg logit contributions
ug-0.039
ister-0.037
opter-0.035
P-0.035
irens-0.032
Token be
Token ID307
Feature activation+7.12e-03

Pos logit contributions
ses+0.016
comb+0.015
 pad+0.015
eter+0.014
Aud+0.014
Neg logit contributions
ister-0.022
ug-0.021
P-0.019
opter-0.018
iren-0.018
Token a
Token ID257
Feature activation-1.44e-02

Pos logit contributions
comb+0.011
ses+0.011
 hospitality+0.010
 pad+0.010
 former+0.009
Neg logit contributions
ug-0.014
P-0.013
ister-0.013
irens-0.012
ep-0.012
Token
Token ID198
Feature activation-1.22e-02

Pos logit contributions
comb+0.012
 pleasure+0.012
 pur+0.012
 pad+0.012
ses+0.011
Neg logit contributions
ug-0.019
ister-0.018
P-0.017
irens-0.017
opter-0.017
Token
Token ID198
Feature activation-1.33e-02

Pos logit contributions
ug+0.028
ister+0.026
P+0.025
irens+0.025
opter+0.025
Neg logit contributions
 pur-0.012
 pleasure-0.012
comb-0.011
 pad-0.011
ses-0.011
Token"
Token ID1
Feature activation+3.16e-02

Pos logit contributions
ug+0.022
ister+0.020
irens+0.018
opter+0.018
 yacht+0.017
Neg logit contributions
 pad-0.023
Aud-0.022
comb-0.022
 pur-0.021
 hospitality-0.021
TokenOkay
Token ID16454
Feature activation-3.06e-02

Pos logit contributions
comb+0.049
ses+0.045
 pad+0.045
 pleasure+0.045
 seeking+0.045
Neg logit contributions
ug-0.058
ister-0.058
P-0.058
irens-0.054
iren-0.052
Token,
Token ID11
Feature activation-2.71e-02

Pos logit contributions
ug+0.049
P+0.045
irens+0.044
ister+0.043
iren+0.041
Neg logit contributions
comb-0.057
 pleasure-0.056
 pur-0.055
 hospitality-0.053
 pad-0.052
Token grand
Token ID4490
Feature activation-9.87e-02

Pos logit contributions
ug+0.052
ister+0.049
P+0.048
irens+0.047
iren+0.045
Neg logit contributions
comb-0.041
 pleasure-0.041
 pur-0.038
 pad-0.038
ses-0.038
Tokenpa
Token ID8957
Feature activation-5.72e-02

Pos logit contributions
P+0.158
ug+0.156
irens+0.142
ister+0.135
iren+0.135
Neg logit contributions
 hospitality-0.178
 pleasure-0.166
 generosity-0.163
comb-0.159
 kindness-0.159
Token.
Token ID13
Feature activation+5.32e-03

Pos logit contributions
ug+0.075
ister+0.073
P+0.061
irens+0.059
iren+0.059
Neg logit contributions
comb-0.122
 former-0.120
ses-0.116
 pleasure-0.114
 pad-0.113
Token We
Token ID775
Feature activation+1.56e-02

Pos logit contributions
comb+0.009
 watered+0.009
 pad+0.008
 former+0.008
 seeking+0.008
Neg logit contributions
ug-0.009
ister-0.009
P-0.008
opter-0.008
irens-0.008
Token understand
Token ID1833
Feature activation-3.39e-02

Pos logit contributions
comb+0.020
 seeking+0.019
ses+0.019
 former+0.018
 hospitality+0.017
Neg logit contributions
ug-0.033
ister-0.032
P-0.031
opter-0.029
Dad-0.027
Token,"
Token ID553
Feature activation+2.35e-02

Pos logit contributions
ug+0.060
ister+0.057
P+0.055
irens+0.054
iren+0.051
Neg logit contributions
comb-0.057
 pleasure-0.053
 pur-0.052
ses-0.052
 pad-0.051
Token see
Token ID766
Feature activation-1.83e-02

Pos logit contributions
ug+0.011
ister+0.011
P+0.010
irens+0.010
opter+0.010
Neg logit contributions
comb-0.007
 former-0.007
ses-0.006
eter-0.006
 pleasure-0.006
Token shells
Token ID19679
Feature activation-2.90e-02

Pos logit contributions
ug+0.033
ister+0.032
P+0.030
irens+0.030
iren+0.028
Neg logit contributions
comb-0.030
 pleasure-0.028
ses-0.027
 pad-0.027
 former-0.027
Token,
Token ID11
Feature activation+3.02e-02

Pos logit contributions
ug+0.038
ister+0.036
irens+0.033
P+0.033
iren+0.031
Neg logit contributions
comb-0.061
 pleasure-0.059
 former-0.059
ses-0.058
 pur-0.057
Token rocks
Token ID12586
Feature activation-5.43e-02

Pos logit contributions
comb+0.052
 pleasure+0.049
ses+0.049
 pur+0.048
 pad+0.048
Neg logit contributions
ug-0.051
ister-0.049
P-0.046
irens-0.046
iren-0.043
Token,
Token ID11
Feature activation+2.59e-02

Pos logit contributions
ug+0.075
ister+0.071
P+0.068
irens+0.066
iren+0.061
Neg logit contributions
comb-0.110
 pleasure-0.103
ses-0.103
 pad-0.103
 seeking-0.102
Token seaw
Token ID32085
Feature activation-9.86e-02

Pos logit contributions
comb+0.040
 pur+0.037
 pleasure+0.037
 pad+0.037
ses+0.037
Neg logit contributions
ug-0.049
ister-0.047
irens-0.044
P-0.044
opter-0.042
Tokeneed
Token ID2308
Feature activation-6.08e-02

Pos logit contributions
ister+0.229
ug+0.226
irens+0.217
iren+0.209
opter+0.203
Neg logit contributions
 pad-0.109
comb-0.102
 nib-0.092
 pollen-0.092
ses-0.091
Token,
Token ID11
Feature activation+2.03e-02

Pos logit contributions
ug+0.074
ister+0.069
irens+0.062
P+0.062
iren+0.057
Neg logit contributions
comb-0.133
 pleasure-0.130
 former-0.129
ses-0.127
 pur-0.125
Token crabs
Token ID49172
Feature activation-3.56e-02

Pos logit contributions
comb+0.032
 pleasure+0.030
ses+0.029
 former+0.029
 seeking+0.028
Neg logit contributions
ug-0.037
ister-0.035
P-0.035
irens-0.034
iren-0.032
Token and
Token ID290
Feature activation+1.31e-02

Pos logit contributions
ug+0.036
ister+0.034
P+0.030
irens+0.029
iren+0.027
Neg logit contributions
comb-0.085
 pleasure-0.084
 former-0.084
ses-0.082
 hospitality-0.081
Token fish
Token ID5916
Feature activation-2.91e-02

Pos logit contributions
comb+0.025
 pleasure+0.023
ses+0.023
 former+0.022
 seeking+0.022
Neg logit contributions
ug-0.020
P-0.019
ister-0.018
irens-0.018
 cubes-0.016
Token like
Token ID588
Feature activation+6.07e-03

Pos logit contributions
comb+0.025
 pleasure+0.022
ses+0.022
 former+0.021
 pad+0.021
Neg logit contributions
ug-0.047
ister-0.044
P-0.043
irens-0.043
iren-0.041
Token fish
Token ID5916
Feature activation-1.27e-02

Pos logit contributions
comb+0.011
 pleasure+0.010
ses+0.010
 former+0.010
 pad+0.010
Neg logit contributions
ug-0.010
ister-0.009
P-0.009
irens-0.009
iren-0.008
Token and
Token ID290
Feature activation+3.10e-03

Pos logit contributions
ug+0.016
ister+0.015
P+0.014
irens+0.013
iren+0.012
Neg logit contributions
comb-0.028
 pleasure-0.027
 former-0.026
ses-0.026
 pur-0.026
Token shells
Token ID19679
Feature activation-4.87e-02

Pos logit contributions
comb+0.006
 pleasure+0.006
Aud+0.005
 former+0.005
eter+0.005
Neg logit contributions
ug-0.004
P-0.004
irens-0.004
ister-0.004
 cubes-0.004
Token and
Token ID290
Feature activation+6.28e-03

Pos logit contributions
ug+0.066
ister+0.063
P+0.057
irens+0.056
iren+0.053
Neg logit contributions
 pleasure-0.099
comb-0.098
 former-0.098
ses-0.096
 hospitality-0.093
Token seaw
Token ID32085
Feature activation-9.85e-02

Pos logit contributions
comb+0.013
 pleasure+0.011
ses+0.011
 former+0.011
Aud+0.011
Neg logit contributions
ug-0.009
P-0.008
ister-0.008
irens-0.008
 cubes-0.008
Tokeneed
Token ID2308
Feature activation-5.93e-02

Pos logit contributions
ister+0.212
ug+0.210
irens+0.202
iren+0.193
P+0.189
Neg logit contributions
comb-0.121
 pad-0.115
ses-0.113
 nib-0.108
 pleasure-0.108
Token.
Token ID13
Feature activation-3.66e-03

Pos logit contributions
ug+0.073
ister+0.067
P+0.060
irens+0.060
 cubes+0.058
Neg logit contributions
 pleasure-0.128
 former-0.126
comb-0.125
ses-0.121
Aud-0.117
Token
Token ID198
Feature activation-1.03e-02

Pos logit contributions
ug+0.007
ister+0.007
P+0.007
irens+0.007
opter+0.007
Neg logit contributions
comb-0.005
 pleasure-0.005
 pad-0.005
ses-0.005
 former-0.005
Token
Token ID198
Feature activation-9.62e-03

Pos logit contributions
ug+0.024
ister+0.023
irens+0.023
opter+0.023
 yacht+0.022
Neg logit contributions
 pad-0.008
 pleasure-0.008
comb-0.008
 watered-0.008
 seeking-0.008
TokenOne
Token ID3198
Feature activation+5.07e-02

Pos logit contributions
ug+0.020
ister+0.020
irens+0.020
opter+0.019
iren+0.019
Neg logit contributions
 pad-0.011
Aud-0.010
comb-0.010
 watered-0.010
 former-0.009
Token
Token ID198
Feature activation-1.58e-02

Pos logit contributions
ug+0.036
P+0.033
ister+0.033
opter+0.032
irens+0.032
Neg logit contributions
 pur-0.015
comb-0.014
 seeking-0.014
 pleasure-0.014
ses-0.014
Token"
Token ID1
Feature activation+2.06e-02

Pos logit contributions
ug+0.034
 yacht+0.030
opter+0.028
ister+0.027
ler+0.027
Neg logit contributions
Aud-0.019
 love-0.018
 read-0.016
 solved-0.016
Thank-0.016
TokenGrand
Token ID23581
Feature activation-3.11e-02

Pos logit contributions
comb+0.033
 pur+0.032
ses+0.031
 watered+0.031
 seeking+0.031
Neg logit contributions
ug-0.038
ister-0.035
irens-0.032
opter-0.032
P-0.031
Tokenpa
Token ID8957
Feature activation-4.54e-02

Pos logit contributions
ug+0.050
P+0.046
opter+0.044
 yacht+0.042
irens+0.041
Neg logit contributions
comb-0.050
 hospitality-0.050
 seeking-0.046
 generosity-0.046
 pur-0.046
Token,
Token ID11
Feature activation-1.24e-02

Pos logit contributions
ug+0.060
ister+0.055
P+0.053
irens+0.052
iren+0.048
Neg logit contributions
comb-0.095
 pleasure-0.090
ses-0.090
 pur-0.089
 seeking-0.089
Token grand
Token ID4490
Feature activation-9.83e-02

Pos logit contributions
ug+0.028
irens+0.026
ister+0.026
P+0.026
iren+0.025
Neg logit contributions
comb-0.014
 pleasure-0.014
 seeking-0.014
 hospitality-0.014
 hope-0.013
Tokenpa
Token ID8957
Feature activation-5.58e-02

Pos logit contributions
ug+0.148
P+0.147
irens+0.129
ister+0.127
iren+0.120
Neg logit contributions
 hospitality-0.182
comb-0.174
 pleasure-0.174
 pur-0.173
ses-0.171
Token,
Token ID11
Feature activation-1.71e-02

Pos logit contributions
ug+0.078
ister+0.072
P+0.070
irens+0.069
iren+0.064
Neg logit contributions
comb-0.112
 pleasure-0.106
ses-0.105
 pur-0.105
 seeking-0.104
Token grand
Token ID4490
Feature activation-9.99e-02

Pos logit contributions
ug+0.039
ister+0.037
irens+0.036
P+0.036
iren+0.034
Neg logit contributions
comb-0.020
 pleasure-0.019
 seeking-0.018
ses-0.018
 pur-0.018
Tokenpa
Token ID8957
Feature activation-5.76e-02

Pos logit contributions
ug+0.150
P+0.148
irens+0.131
ister+0.128
iren+0.121
Neg logit contributions
 hospitality-0.185
comb-0.178
 pur-0.177
 pleasure-0.177
ses-0.173
Token!"
Token ID2474
Feature activation+9.40e-03

Pos logit contributions
ug+0.086
ister+0.080
P+0.077
irens+0.076
iren+0.071
Neg logit contributions
comb-0.111
 pleasure-0.105
ses-0.104
 pur-0.104
 seeking-0.103
Token You
Token ID921
Feature activation+1.64e-02

Pos logit contributions
ug+0.015
ister+0.014
irens+0.014
P+0.014
opter+0.013
Neg logit contributions
 pleasure-0.009
comb-0.009
ses-0.009
 seeking-0.008
 pur-0.008
Token did
Token ID750
Feature activation-1.92e-03

Pos logit contributions
comb+0.024
ses+0.022
 former+0.021
 pleasure+0.021
 pad+0.021
Neg logit contributions
ug-0.033
ister-0.031
P-0.030
irens-0.029
opter-0.029
Token your
Token ID534
Feature activation+1.26e-02

Pos logit contributions
ug+0.003
ister+0.003
irens+0.003
P+0.003
opter+0.003
Neg logit contributions
comb-0.003
ses-0.003
 pad-0.003
 hospitality-0.003
 watered-0.003
Token best
Token ID1266
Feature activation-1.88e-02

Pos logit contributions
comb+0.023
ses+0.021
 pad+0.021
 watered+0.020
 hospitality+0.020
Neg logit contributions
ug-0.021
ister-0.020
P-0.019
irens-0.018
opter-0.017
Token!"
Token ID2474
Feature activation+1.86e-02

Pos logit contributions
ug+0.035
ister+0.033
irens+0.032
P+0.032
iren+0.030
Neg logit contributions
comb-0.029
 pleasure-0.028
ses-0.027
 pad-0.027
 seeking-0.027
Token Anch
Token ID29253
Feature activation-9.81e-02

Pos logit contributions
 pleasure+0.023
 pur+0.022
ses+0.022
 pad+0.021
 seeking+0.021
Neg logit contributions
ister-0.037
P-0.035
ug-0.035
opter-0.033
ler-0.032
Tokenor
Token ID273
Feature activation-4.38e-02

Pos logit contributions
P+0.107
irens+0.105
ister+0.100
Dad+0.098
lf+0.097
Neg logit contributions
 pad-0.176
 shown-0.167
 agrees-0.165
 seeking-0.165
 hope-0.164
Token smiled
Token ID13541
Feature activation+3.16e-02

Pos logit contributions
ister+0.086
ug+0.085
irens+0.085
P+0.085
iren+0.079
Neg logit contributions
comb-0.055
 pur-0.054
 pad-0.052
ses-0.050
 watered-0.050
Token and
Token ID290
Feature activation-1.89e-03

Pos logit contributions
comb+0.056
ses+0.053
 seeking+0.051
 pleasure+0.051
 former+0.050
Neg logit contributions
ug-0.053
ister-0.050
P-0.047
irens-0.046
iren-0.043
Token said
Token ID531
Feature activation-5.90e-04

Pos logit contributions
ug+0.004
ister+0.003
P+0.003
irens+0.003
opter+0.003
Neg logit contributions
comb-0.003
 former-0.003
 pad-0.003
 pleasure-0.003
ses-0.003
Token,
Token ID11
Feature activation+7.91e-03

Pos logit contributions
P+0.001
ister+0.001
irens+0.001
ug+0.001
iren+0.001
Neg logit contributions
 pleasure-0.001
 pur-0.001
comb-0.001
 hospitality-0.001
 seeking-0.001
Token saw
Token ID2497
Feature activation-2.64e-02

Pos logit contributions
comb+0.029
 former+0.027
ses+0.027
 pleasure+0.026
 seeking+0.026
Neg logit contributions
ug-0.021
P-0.020
ister-0.019
opter-0.019
 cubes-0.018
Token five
Token ID1936
Feature activation+3.07e-02

Pos logit contributions
ug+0.048
ister+0.046
P+0.044
irens+0.043
iren+0.041
Neg logit contributions
comb-0.043
 pleasure-0.040
ses-0.039
 pad-0.039
 former-0.039
Token fish
Token ID5916
Feature activation-5.71e-03

Pos logit contributions
comb+0.046
 pad+0.043
 seeking+0.043
 pur+0.043
 watered+0.043
Neg logit contributions
ug-0.057
ister-0.056
irens-0.052
P-0.052
iren-0.051
Token and
Token ID290
Feature activation+1.12e-03

Pos logit contributions
ug+0.009
ister+0.008
irens+0.007
P+0.007
iren+0.007
Neg logit contributions
 former-0.011
 pleasure-0.011
comb-0.011
ses-0.010
 pur-0.010
Token seven
Token ID3598
Feature activation+1.21e-02

Pos logit contributions
comb+0.002
 pleasure+0.002
 former+0.002
 pad+0.002
ses+0.002
Neg logit contributions
ug-0.002
P-0.002
ister-0.002
irens-0.002
opter-0.002
Token seaw
Token ID32085
Feature activation-9.80e-02

Pos logit contributions
 pleasure+0.019
comb+0.019
 pur+0.019
 nib+0.018
 pad+0.018
Neg logit contributions
ug-0.021
ister-0.021
irens-0.019
P-0.019
iren-0.019
Tokeneed
Token ID2308
Feature activation-6.63e-02

Pos logit contributions
ug+0.210
ister+0.206
irens+0.199
P+0.191
iren+0.189
Neg logit contributions
comb-0.123
 pad-0.119
ses-0.116
 pleasure-0.111
 nib-0.109
Token.
Token ID13
Feature activation+4.41e-03

Pos logit contributions
ug+0.101
ister+0.094
irens+0.086
P+0.082
opter+0.082
Neg logit contributions
comb-0.126
 pleasure-0.124
 former-0.123
ses-0.118
 pur-0.116
Token They
Token ID1119
Feature activation-5.92e-03

Pos logit contributions
comb+0.006
 pleasure+0.006
ses+0.006
 pur+0.006
 pad+0.006
Neg logit contributions
ug-0.009
ister-0.008
P-0.008
irens-0.008
iren-0.008
Token were
Token ID547
Feature activation-1.14e-02

Pos logit contributions
ug+0.012
ister+0.011
P+0.011
irens+0.011
opter+0.010
Neg logit contributions
comb-0.009
 pleasure-0.008
ses-0.008
 former-0.008
 pad-0.008
Token happy
Token ID3772
Feature activation+2.19e-02

Pos logit contributions
ug+0.020
ister+0.019
irens+0.018
P+0.018
 cubes+0.017
Neg logit contributions
comb-0.019
 former-0.018
 hospitality-0.017
 pad-0.017
ses-0.017
Token's
Token ID338
Feature activation+1.00e-02

Pos logit contributions
comb+0.005
 pleasure+0.005
ses+0.005
 former+0.005
 seeking+0.005
Neg logit contributions
ug-0.006
ister-0.006
P-0.005
irens-0.005
iren-0.005
Token father
Token ID2988
Feature activation-1.78e-02

Pos logit contributions
comb+0.019
 seeking+0.017
Aud+0.017
 watered+0.017
 pollen+0.017
Neg logit contributions
ug-0.016
ister-0.015
irens-0.013
P-0.013
 yacht-0.012
Token said
Token ID531
Feature activation+1.20e-02

Pos logit contributions
ug+0.035
ister+0.033
P+0.032
irens+0.031
iren+0.030
Neg logit contributions
comb-0.026
 pleasure-0.025
ses-0.024
 former-0.024
 seeking-0.024
Token,
Token ID11
Feature activation+1.12e-02

Pos logit contributions
comb+0.024
 pleasure+0.022
 pur+0.022
ses+0.022
 pad+0.022
Neg logit contributions
ug-0.018
ister-0.017
P-0.016
irens-0.016
iren-0.015
Token "
Token ID366
Feature activation+8.02e-03

Pos logit contributions
comb+0.016
ses+0.015
 former+0.015
 pleasure+0.015
 seeking+0.014
Neg logit contributions
ug-0.022
ister-0.022
irens-0.020
P-0.020
iren-0.020
TokenSon
Token ID31056
Feature activation-9.79e-02

Pos logit contributions
comb+0.012
 pleasure+0.011
ses+0.011
 pur+0.010
 pad+0.010
Neg logit contributions
ug-0.016
ister-0.015
P-0.015
irens-0.015
iren-0.014
Token,
Token ID11
Feature activation-1.98e-02

Pos logit contributions
ug+0.140
P+0.130
ister+0.129
irens+0.128
iren+0.116
Neg logit contributions
comb-0.192
 pleasure-0.184
ses-0.179
 pad-0.179
 pur-0.177
Token it
Token ID340
Feature activation+6.88e-03

Pos logit contributions
ug+0.038
ister+0.037
P+0.035
irens+0.035
iren+0.033
Neg logit contributions
comb-0.029
 pleasure-0.027
ses-0.027
 pad-0.026
 pur-0.026
Token's
Token ID338
Feature activation+2.45e-02

Pos logit contributions
comb+0.012
 pleasure+0.011
ses+0.011
 hospitality+0.011
 former+0.011
Neg logit contributions
ug-0.012
ister-0.011
P-0.011
opter-0.010
irens-0.010
Token time
Token ID640
Feature activation-1.65e-02

Pos logit contributions
comb+0.043
 hospitality+0.040
ses+0.039
 seeking+0.038
Aud+0.038
Neg logit contributions
ug-0.041
ister-0.038
P-0.036
irens-0.034
iren-0.033
Token to
Token ID284
Feature activation+2.08e-02

Pos logit contributions
ug+0.037
ister+0.036
irens+0.033
iren+0.033
P+0.032
Neg logit contributions
comb-0.020
 former-0.018
 pur-0.017
ses-0.017
 pollen-0.016
Token the
Token ID262
Feature activation+5.39e-03

Pos logit contributions
ug+0.016
ister+0.016
P+0.015
irens+0.014
iren+0.014
Neg logit contributions
comb-0.015
 pleasure-0.015
 former-0.014
 pur-0.014
 pad-0.014
Token wardrobe
Token ID35593
Feature activation-1.13e-02

Pos logit contributions
comb+0.009
 pleasure+0.009
ses+0.009
 seeking+0.009
 pur+0.009
Neg logit contributions
ug-0.009
ister-0.009
P-0.008
irens-0.008
iren-0.007
Token so
Token ID523
Feature activation+1.92e-03

Pos logit contributions
ug+0.017
ister+0.017
irens+0.015
iren+0.015
P+0.014
Neg logit contributions
ses-0.021
comb-0.021
 pleasure-0.021
 hospitality-0.020
 pad-0.020
Token it
Token ID340
Feature activation+3.25e-03

Pos logit contributions
comb+0.003
 pleasure+0.003
 former+0.003
 pur+0.003
 seeking+0.003
Neg logit contributions
ug-0.003
ister-0.003
P-0.003
irens-0.003
iren-0.003
Token looked
Token ID3114
Feature activation+1.32e-02

Pos logit contributions
comb+0.005
ses+0.005
 hospitality+0.005
 pleasure+0.005
 pad+0.005
Neg logit contributions
ug-0.006
ister-0.006
P-0.005
opter-0.005
irens-0.005
Token nice
Token ID3621
Feature activation+4.80e-02

Pos logit contributions
comb+0.020
ses+0.018
 seeking+0.017
 pleasure+0.017
 hospitality+0.017
Neg logit contributions
ug-0.026
ister-0.025
P-0.024
irens-0.023
opter-0.022
Token.
Token ID13
Feature activation-1.28e-02

Pos logit contributions
ses+0.103
comb+0.097
Aud+0.096
 former+0.092
 pur+0.091
Neg logit contributions
ug-0.058
ister-0.054
 cubes-0.051
P-0.050
irens-0.048
Token And
Token ID843
Feature activation+5.79e-03

Pos logit contributions
ug+0.023
ister+0.022
P+0.021
irens+0.021
iren+0.020
Neg logit contributions
comb-0.021
ses-0.019
 pleasure-0.019
 seeking-0.018
 hospitality-0.018
Token from
Token ID422
Feature activation-3.01e-02

Pos logit contributions
comb+0.011
ses+0.011
Aud+0.010
 pad+0.010
 pur+0.010
Neg logit contributions
ug-0.009
ister-0.008
P-0.007
opter-0.007
irens-0.007
Token then
Token ID788
Feature activation+3.45e-04

Pos logit contributions
ug+0.054
ister+0.051
P+0.049
irens+0.048
iren+0.046
Neg logit contributions
comb-0.050
 pleasure-0.046
ses-0.046
 pad-0.045
 pur-0.045
Token on
Token ID319
Feature activation-2.89e-03

Pos logit contributions
comb+0.000
 pur+0.000
 pad+0.000
 pleasure+0.000
 hope+0.000
Neg logit contributions
ister-0.001
ug-0.001
irens-0.001
iren-0.001
ades-0.001
Token did
Token ID750
Feature activation-3.35e-03

Pos logit contributions
comb+0.013
 pleasure+0.013
 hospitality+0.012
 former+0.012
ses+0.012
Neg logit contributions
ug-0.016
ister-0.016
P-0.015
irens-0.014
opter-0.014
Token not
Token ID407
Feature activation+1.70e-02

Pos logit contributions
ug+0.007
ister+0.007
irens+0.007
P+0.007
iren+0.006
Neg logit contributions
comb-0.004
 pleasure-0.004
ses-0.004
 pad-0.004
 former-0.004
Token share
Token ID2648
Feature activation+8.52e-03

Pos logit contributions
comb+0.016
 pleasure+0.014
 pur+0.013
ses+0.013
 seeking+0.013
Neg logit contributions
ug-0.043
ister-0.041
P-0.040
irens-0.040
iren-0.038
Token well
Token ID880
Feature activation+2.20e-04

Pos logit contributions
comb+0.017
 pleasure+0.016
ses+0.016
 hospitality+0.016
 former+0.015
Neg logit contributions
ug-0.012
ister-0.012
irens-0.011
P-0.011
iren-0.010
Token.
Token ID13
Feature activation-5.28e-03

Pos logit contributions
comb+0.000
 pleasure+0.000
ses+0.000
 pad+0.000
 pur+0.000
Neg logit contributions
ug-0.000
ister-0.000
irens-0.000
P-0.000
iren-0.000
Token One
Token ID1881
Feature activation+4.56e-02

Pos logit contributions
ug+0.010
ister+0.010
P+0.009
irens+0.009
opter+0.009
Neg logit contributions
comb-0.008
 pleasure-0.007
ses-0.007
 pad-0.007
 former-0.007
Token day
Token ID1110
Feature activation+3.13e-05

Pos logit contributions
comb+0.071
 pleasure+0.066
ses+0.066
 seeking+0.066
 pur+0.065
Neg logit contributions
ug-0.084
ister-0.080
irens-0.078
P-0.078
opter-0.073
Token,
Token ID11
Feature activation+1.82e-03

Pos logit contributions
 pleasure+0.000
 love+0.000
 watered+0.000
comb+0.000
 seeking+0.000
Neg logit contributions
ister-0.000
iren-0.000
ug-0.000
irens-0.000
opter-0.000
Token Lily
Token ID20037
Feature activation+2.96e-02

Pos logit contributions
comb+0.003
 pleasure+0.003
ses+0.003
 pad+0.003
 pur+0.003
Neg logit contributions
ug-0.003
ister-0.003
P-0.003
irens-0.003
iren-0.002
Token wanted
Token ID2227
Feature activation-4.30e-03

Pos logit contributions
comb+0.047
 pleasure+0.044
ses+0.043
 former+0.042
 hospitality+0.042
Neg logit contributions
ug-0.055
ister-0.053
P-0.051
irens-0.050
iren-0.047
Token to
Token ID284
Feature activation+2.56e-02

Pos logit contributions
ug+0.009
ister+0.009
irens+0.009
P+0.009
iren+0.008
Neg logit contributions
comb-0.006
ses-0.005
 pleasure-0.005
 pad-0.005
 former-0.005
Token to
Token ID284
Feature activation+5.61e-03

Pos logit contributions
comb+0.013
ses+0.011
 pleasure+0.011
 former+0.011
 pad+0.010
Neg logit contributions
ug-0.019
ister-0.019
irens-0.019
P-0.019
iren-0.017
Token eat
Token ID4483
Feature activation+1.77e-02

Pos logit contributions
comb+0.006
ses+0.006
 pad+0.005
 pleasure+0.005
 former+0.005
Neg logit contributions
ug-0.013
ister-0.013
P-0.012
irens-0.012
iren-0.012
Token jelly
Token ID27644
Feature activation+5.39e-03

Pos logit contributions
 pur+0.019
comb+0.019
 nib+0.018
 pollen+0.018
 watered+0.018
Neg logit contributions
ug-0.043
ister-0.040
opter-0.038
irens-0.037
iren-0.037
Token sandwiches
Token ID29777
Feature activation+1.40e-02

Pos logit contributions
comb+0.009
 pleasure+0.008
ses+0.008
 pad+0.008
 former+0.008
Neg logit contributions
ug-0.010
ister-0.009
P-0.009
irens-0.009
iren-0.008
Token.
Token ID13
Feature activation-3.90e-03

Pos logit contributions
comb+0.028
ses+0.027
 pleasure+0.027
 former+0.027
 seeking+0.026
Neg logit contributions
ug-0.020
ister-0.019
P-0.018
irens-0.017
iren-0.016
Token One
Token ID1881
Feature activation+4.82e-02

Pos logit contributions
ug+0.006
ister+0.006
P+0.006
irens+0.006
iren+0.005
Neg logit contributions
comb-0.007
 hospitality-0.006
 seeking-0.006
ses-0.006
 former-0.006
Token day
Token ID1110
Feature activation+3.22e-03

Pos logit contributions
comb+0.071
 pleasure+0.065
 pad+0.064
 pur+0.064
 nib+0.063
Neg logit contributions
ug-0.098
irens-0.092
ister-0.091
opter-0.086
iren-0.085
Token,
Token ID11
Feature activation+5.06e-03

Pos logit contributions
 nib+0.004
 pur+0.004
comb+0.004
 love+0.004
 to+0.004
Neg logit contributions
ug-0.007
ister-0.006
irens-0.006
ep-0.006
iren-0.006
Token her
Token ID607
Feature activation+2.62e-02

Pos logit contributions
comb+0.006
 pleasure+0.006
ses+0.006
 pur+0.006
 pad+0.005
Neg logit contributions
ug-0.011
ister-0.011
irens-0.010
P-0.010
iren-0.010
Token mom
Token ID1995
Feature activation+1.59e-02

Pos logit contributions
comb+0.052
 former+0.049
 pleasure+0.049
 pad+0.048
 seeking+0.048
Neg logit contributions
ug-0.036
ister-0.036
P-0.035
irens-0.031
iren-0.029
Tokenmy
Token ID1820
Feature activation-5.01e-03

Pos logit contributions
comb+0.021
 pleasure+0.019
 pur+0.019
 pad+0.018
ses+0.018
Neg logit contributions
ug-0.034
ister-0.033
irens-0.031
P-0.031
iren-0.030
Token a
Token ID257
Feature activation-3.53e-05

Pos logit contributions
ug+0.018
P+0.017
ister+0.017
irens+0.017
iren+0.015
Neg logit contributions
comb-0.015
ses-0.013
 pleasure-0.013
 former-0.013
 hospitality-0.013
Token pencil
Token ID21613
Feature activation-2.33e-02

Pos logit contributions
ug+0.000
ister+0.000
irens+0.000
P+0.000
iren+0.000
Neg logit contributions
comb-0.000
 pad-0.000
 pleasure-0.000
 pur-0.000
 seeking-0.000
Token with
Token ID351
Feature activation+7.33e-03

Pos logit contributions
ug+0.046
ister+0.044
P+0.042
irens+0.042
iren+0.040
Neg logit contributions
comb-0.034
 pleasure-0.031
 pad-0.031
ses-0.031
 pur-0.031
Token her
Token ID607
Feature activation+1.36e-03

Pos logit contributions
comb+0.009
ses+0.008
 pleasure+0.008
 seeking+0.008
 former+0.008
Neg logit contributions
ug-0.016
ister-0.016
P-0.015
irens-0.015
iren-0.014
Token.
Token ID13
Feature activation-4.11e-03

Pos logit contributions
comb+0.003
 hospitality+0.002
 former+0.002
 pleasure+0.002
ses+0.002
Neg logit contributions
ug-0.002
ister-0.002
P-0.002
irens-0.002
 cubes-0.002
Token One
Token ID1881
Feature activation+4.88e-02

Pos logit contributions
ug+0.008
ister+0.007
P+0.007
irens+0.007
iren+0.007
Neg logit contributions
comb-0.007
ses-0.006
 pleasure-0.006
 seeking-0.006
 hospitality-0.006
Token day
Token ID1110
Feature activation+7.61e-03

Pos logit contributions
comb+0.075
 pleasure+0.070
 pad+0.069
 pur+0.069
ses+0.068
Neg logit contributions
ug-0.094
ister-0.089
irens-0.087
P-0.083
iren-0.081
Token,
Token ID11
Feature activation+6.80e-03

Pos logit contributions
 pur+0.009
 nib+0.009
 love+0.008
comb+0.008
 to+0.008
Neg logit contributions
ug-0.017
ister-0.016
irens-0.015
iren-0.015
ep-0.015
Token she
Token ID673
Feature activation+1.68e-02

Pos logit contributions
comb+0.008
 pur+0.007
 pleasure+0.007
ses+0.007
 seeking+0.007
Neg logit contributions
ug-0.016
ister-0.015
irens-0.014
P-0.014
iren-0.014
Token met
Token ID1138
Feature activation+6.28e-03

Pos logit contributions
comb+0.020
 watered+0.020
 pur+0.019
 pad+0.019
 seeking+0.018
Neg logit contributions
ug-0.038
ister-0.036
irens-0.035
iren-0.033
opter-0.033
Token a
Token ID257
Feature activation+1.21e-02

Pos logit contributions
comb+0.010
ses+0.009
 pleasure+0.009
 former+0.009
 seeking+0.008
Neg logit contributions
ug-0.012
ister-0.012
P-0.011
irens-0.011
iren-0.010
Token friends
Token ID2460
Feature activation+7.07e-04

Pos logit contributions
ug+0.029
ister+0.027
P+0.025
irens+0.024
iren+0.023
Neg logit contributions
comb-0.035
ses-0.032
 pad-0.031
 pleasure-0.030
 seeking-0.030
Token.
Token ID13
Feature activation-1.59e-03

Pos logit contributions
ses+0.001
comb+0.001
 former+0.001
 pleasure+0.001
 hospitality+0.001
Neg logit contributions
ug-0.001
ister-0.001
P-0.001
irens-0.001
opter-0.001
Token They
Token ID1119
Feature activation+1.24e-02

Pos logit contributions
ug+0.003
ister+0.002
P+0.002
irens+0.002
iren+0.002
Neg logit contributions
comb-0.003
ses-0.003
 hospitality-0.003
 seeking-0.003
 former-0.003
Token would
Token ID561
Feature activation+1.57e-02

Pos logit contributions
comb+0.021
ses+0.020
 pleasure+0.020
eter+0.020
 former+0.020
Neg logit contributions
ug-0.021
P-0.019
ister-0.019
 yacht-0.018
opter-0.017
Token work
Token ID670
Feature activation+6.66e-03

Pos logit contributions
ses+0.023
comb+0.023
eter+0.022
 pleasure+0.021
 pad+0.021
Neg logit contributions
ug-0.029
ister-0.029
 yacht-0.026
P-0.026
opter-0.024
Token together
Token ID1978
Feature activation+4.34e-02

Pos logit contributions
comb+0.013
ses+0.012
 pad+0.012
 pleasure+0.011
 hospitality+0.011
Neg logit contributions
ug-0.011
ister-0.010
irens-0.010
P-0.009
iren-0.009
Token to
Token ID284
Feature activation+2.59e-02

Pos logit contributions
ses+0.090
 pleasure+0.080
Aud+0.079
 hospitality+0.079
comb+0.078
Neg logit contributions
ug-0.063
ister-0.056
lf-0.048
 cubes-0.048
P-0.047
Token find
Token ID1064
Feature activation+4.42e-04

Pos logit contributions
ses+0.038
 pad+0.036
comb+0.035
 pleasure+0.033
Aud+0.033
Neg logit contributions
ug-0.051
ister-0.051
P-0.046
opter-0.045
irens-0.044
Token more
Token ID517
Feature activation+3.74e-02

Pos logit contributions
comb+0.001
ses+0.001
 pad+0.001
 seeking+0.001
 former+0.001
Neg logit contributions
ug-0.001
ister-0.001
P-0.001
irens-0.001
opter-0.001
Token pretty
Token ID2495
Feature activation+1.77e-02

Pos logit contributions
comb+0.067
ses+0.065
sed+0.056
ulent+0.055
Aud+0.055
Neg logit contributions
ug-0.065
ister-0.061
irens-0.060
P-0.058
 cubes-0.057
Token leaves
Token ID5667
Feature activation+9.43e-03

Pos logit contributions
comb+0.036
ses+0.035
 pleasure+0.032
Aud+0.032
 pad+0.032
Neg logit contributions
ug-0.025
ister-0.023
irens-0.022
P-0.022
 cubes-0.021
Token his
Token ID465
Feature activation+1.08e-02

Pos logit contributions
ug+0.034
ister+0.033
opter+0.031
P+0.030
irens+0.029
Neg logit contributions
 pur-0.014
 watered-0.013
 pleasure-0.013
ses-0.012
 knowing-0.012
Token famous
Token ID5863
Feature activation+1.45e-03

Pos logit contributions
comb+0.019
 pad+0.016
 pleasure+0.016
 pur+0.016
 hospitality+0.016
Neg logit contributions
ug-0.018
ister-0.018
P-0.017
irens-0.017
iren-0.016
Token neighbor
Token ID4780
Feature activation+2.76e-02

Pos logit contributions
comb+0.003
 pleasure+0.002
ses+0.002
 pad+0.002
 former+0.002
Neg logit contributions
ug-0.002
ister-0.002
irens-0.002
P-0.002
iren-0.002
Token,
Token ID11
Feature activation+1.70e-03

Pos logit contributions
comb+0.048
 pleasure+0.046
ses+0.046
 former+0.045
 hospitality+0.044
Neg logit contributions
ug-0.046
ister-0.044
irens-0.042
P-0.041
iren-0.039
Token Mr
Token ID1770
Feature activation-1.32e-02

Pos logit contributions
comb+0.003
 pleasure+0.003
ses+0.003
 former+0.003
 pad+0.003
Neg logit contributions
ug-0.003
ister-0.003
irens-0.003
P-0.003
iren-0.003
Token.
Token ID13
Feature activation+6.40e-03

Pos logit contributions
ug+0.020
ister+0.019
P+0.018
irens+0.017
iren+0.016
Neg logit contributions
comb-0.024
 pleasure-0.024
 former-0.024
 seeking-0.023
ses-0.023
Token Johnson
Token ID5030
Feature activation+6.81e-03

Pos logit contributions
comb+0.010
 watered+0.009
ses+0.009
 pleasure+0.009
 pad+0.009
Neg logit contributions
ister-0.011
ug-0.011
irens-0.011
P-0.011
iren-0.010
Token,
Token ID11
Feature activation+3.08e-03

Pos logit contributions
comb+0.013
 pleasure+0.013
ses+0.012
 former+0.012
 hospitality+0.012
Neg logit contributions
ug-0.011
ister-0.010
irens-0.009
P-0.009
iren-0.009
Token walking
Token ID6155
Feature activation-9.68e-03

Pos logit contributions
comb+0.005
 former+0.005
 pleasure+0.005
ses+0.005
eter+0.004
Neg logit contributions
ug-0.005
ister-0.005
irens-0.005
P-0.005
iren-0.005
Token his
Token ID465
Feature activation-6.11e-03

Pos logit contributions
ug+0.017
ister+0.016
P+0.015
irens+0.015
iren+0.015
Neg logit contributions
comb-0.017
 pleasure-0.016
ses-0.016
 former-0.015
 hospitality-0.015
Token dog
Token ID3290
Feature activation+1.45e-02

Pos logit contributions
ister+0.009
ug+0.009
P+0.009
irens+0.009
iren+0.008
Neg logit contributions
comb-0.011
 pleasure-0.010
 pollen-0.010
 hospitality-0.010
 pur-0.010
Token healed
Token ID28557
Feature activation+5.24e-03

Pos logit contributions
ug+0.026
ister+0.025
P+0.024
in+0.022
iren+0.022
Neg logit contributions
ses-0.026
comb-0.026
 pad-0.024
sed-0.022
Aud-0.022
Token and
Token ID290
Feature activation+1.06e-03

Pos logit contributions
ses+0.010
comb+0.009
 pleasure+0.009
 seeking+0.009
 pad+0.009
Neg logit contributions
ug-0.009
ister-0.008
P-0.008
irens-0.007
opter-0.007
Token healthy
Token ID5448
Feature activation+1.53e-03

Pos logit contributions
comb+0.002
 pad+0.002
ses+0.002
Aud+0.002
ulent+0.002
Neg logit contributions
ug-0.001
ister-0.001
opter-0.001
P-0.001
in-0.001
Token.
Token ID13
Feature activation+4.18e-03

Pos logit contributions
ses+0.003
Aud+0.003
comb+0.003
 seeking+0.003
 pad+0.003
Neg logit contributions
ug-0.002
ister-0.002
P-0.002
irens-0.002
ler-0.002
Token\n
Token ID198
Feature activation-8.04e-04

Pos logit contributions
comb+0.007
ses+0.006
 pleasure+0.006
 seeking+0.006
 pad+0.006
Neg logit contributions
ug-0.008
ister-0.007
P-0.007
irens-0.007
iren-0.006
Token\n
Token ID198
Feature activation+6.11e-04

Pos logit contributions
ug+0.001
ister+0.001
P+0.001
irens+0.001
opter+0.001
Neg logit contributions
comb-0.001
 pleasure-0.001
 pur-0.001
ses-0.001
 pad-0.001
TokenThe
Token ID464
Feature activation+9.97e-03

Pos logit contributions
comb+0.001
 pur+0.001
 pleasure+0.001
 former+0.001
 hospitality+0.001
Neg logit contributions
ug-0.001
ister-0.001
irens-0.001
opter-0.001
iren-0.001
Token little
Token ID1310
Feature activation+3.04e-03

Pos logit contributions
comb+0.019
 pleasure+0.018
ses+0.018
 pad+0.017
 seeking+0.017
Neg logit contributions
ug-0.015
ister-0.015
P-0.014
irens-0.014
iren-0.013
Token boy
Token ID2933
Feature activation+1.22e-02

Pos logit contributions
comb+0.006
 pleasure+0.006
ses+0.006
 pad+0.006
 pur+0.006
Neg logit contributions
ug-0.004
ister-0.004
P-0.004
irens-0.004
iren-0.003
Token was
Token ID373
Feature activation-3.36e-03

Pos logit contributions
ses+0.021
comb+0.021
 seeking+0.021
Aud+0.020
 pleasure+0.020
Neg logit contributions
ug-0.019
ister-0.019
P-0.018
opter-0.016
in-0.016
Token so
Token ID523
Feature activation+6.32e-03

Pos logit contributions
ug+0.006
ister+0.006
P+0.005
irens+0.005
iren+0.005
Neg logit contributions
ses-0.006
comb-0.006
 pad-0.005
 seeking-0.005
 potions-0.005
Token.
Token ID13
Feature activation-8.29e-03

Pos logit contributions
comb+0.015
ses+0.014
 pleasure+0.013
 former+0.013
 seeking+0.013
Neg logit contributions
ug-0.013
ister-0.012
irens-0.011
P-0.011
iren-0.011
Token He
Token ID679
Feature activation-3.14e-02

Pos logit contributions
ug+0.016
ister+0.015
P+0.014
irens+0.014
opter+0.014
Neg logit contributions
comb-0.013
 pleasure-0.012
ses-0.012
 pur-0.012
 pad-0.012
Token dropped
Token ID5710
Feature activation-5.35e-03

Pos logit contributions
ug+0.056
ister+0.051
P+0.050
irens+0.047
opter+0.046
Neg logit contributions
comb-0.053
 pleasure-0.051
ses-0.048
 former-0.047
 hospitality-0.047
Token the
Token ID262
Feature activation+1.15e-02

Pos logit contributions
ug+0.008
ister+0.008
P+0.008
irens+0.007
iren+0.007
Neg logit contributions
comb-0.010
 pleasure-0.009
ses-0.009
 former-0.009
 pad-0.009
Token little
Token ID1310
Feature activation-3.16e-03

Pos logit contributions
comb+0.021
 pleasure+0.020
ses+0.019
 pad+0.019
 pur+0.019
Neg logit contributions
ug-0.019
ister-0.018
P-0.017
irens-0.017
iren-0.016
Token boy
Token ID2933
Feature activation+8.09e-03

Pos logit contributions
ug+0.005
ister+0.005
P+0.005
irens+0.005
iren+0.004
Neg logit contributions
comb-0.006
 pleasure-0.005
ses-0.005
 pad-0.005
 pur-0.005
Token's
Token ID338
Feature activation-2.64e-04

Pos logit contributions
comb+0.014
 former+0.014
 pleasure+0.014
ses+0.014
 hospitality+0.013
Neg logit contributions
ug-0.014
ister-0.012
irens-0.012
P-0.012
iren-0.011
Token toy
Token ID13373
Feature activation+3.93e-04

Pos logit contributions
ug+0.000
ister+0.000
P+0.000
irens+0.000
iren+0.000
Neg logit contributions
comb-0.000
 former-0.000
 seeking-0.000
Aud-0.000
 pur-0.000
Token boat
Token ID8848
Feature activation-9.80e-03

Pos logit contributions
comb+0.001
 pleasure+0.001
ses+0.001
 pad+0.001
 seeking+0.001
Neg logit contributions
ug-0.001
ister-0.001
P-0.001
irens-0.001
iren-0.001
Token and
Token ID290
Feature activation-8.96e-03

Pos logit contributions
ug+0.017
ister+0.016
irens+0.015
P+0.015
iren+0.014
Neg logit contributions
comb-0.017
 pleasure-0.016
ses-0.016
 former-0.016
 hospitality-0.015
Token sw
Token ID1509
Feature activation-6.39e-02

Pos logit contributions
ug+0.015
ister+0.014
P+0.013
irens+0.013
opter+0.013
Neg logit contributions
comb-0.016
 pleasure-0.015
 former-0.015
ses-0.015
 pad-0.014
Token with
Token ID351
Feature activation-2.43e-03

Pos logit contributions
comb+0.010
 pleasure+0.010
ses+0.010
 pad+0.009
 former+0.009
Neg logit contributions
ug-0.010
ister-0.010
P-0.009
irens-0.009
iren-0.009
Token her
Token ID607
Feature activation+3.97e-03

Pos logit contributions
ug+0.005
ister+0.005
P+0.005
irens+0.005
iren+0.005
Neg logit contributions
comb-0.003
ses-0.003
 pleasure-0.003
 pad-0.003
 seeking-0.003
Token everywhere
Token ID8347
Feature activation-3.41e-02

Pos logit contributions
comb+0.008
 former+0.007
 hospitality+0.007
 pleasure+0.007
ses+0.007
Neg logit contributions
ug-0.006
P-0.005
ister-0.005
 cubes-0.005
irens-0.005
Token she
Token ID673
Feature activation-2.76e-03

Pos logit contributions
ug+0.067
ister+0.064
P+0.061
irens+0.061
iren+0.058
Neg logit contributions
comb-0.051
 pur-0.046
 pleasure-0.046
 seeking-0.045
 pad-0.045
Token went
Token ID1816
Feature activation-2.62e-02

Pos logit contributions
ug+0.005
ister+0.004
P+0.004
irens+0.004
iren+0.004
Neg logit contributions
comb-0.005
 pleasure-0.005
ses-0.005
 former-0.005
 seeking-0.004
Token.
Token ID13
Feature activation-6.44e-03

Pos logit contributions
ug+0.053
ister+0.053
P+0.050
opter+0.050
irens+0.049
Neg logit contributions
 seeking-0.032
 pad-0.032
comb-0.032
 pur-0.031
 nib-0.029
Token One
Token ID1881
Feature activation+4.78e-02

Pos logit contributions
ug+0.011
ister+0.011
P+0.011
irens+0.010
iren+0.010
Neg logit contributions
comb-0.011
ses-0.010
 hospitality-0.010
 seeking-0.009
 pleasure-0.009
Token day
Token ID1110
Feature activation+4.15e-03

Pos logit contributions
comb+0.067
 pleasure+0.063
 pad+0.062
 pur+0.061
 hospitality+0.060
Neg logit contributions
ug-0.097
irens-0.093
ister-0.093
P-0.087
opter-0.086
Token,
Token ID11
Feature activation+4.54e-03

Pos logit contributions
 pur+0.005
 nib+0.005
comb+0.005
 love+0.005
 pad+0.005
Neg logit contributions
ug-0.008
ister-0.008
irens-0.008
iren-0.008
ades-0.007
Token Lily
Token ID20037
Feature activation+3.17e-02

Pos logit contributions
comb+0.006
 pleasure+0.006
ses+0.006
 pad+0.006
 former+0.006
Neg logit contributions
ug-0.009
ister-0.009
P-0.009
irens-0.009
iren-0.008
Token went
Token ID1816
Feature activation-1.46e-02

Pos logit contributions
comb+0.044
 pad+0.040
 pur+0.040
 pleasure+0.039
ses+0.039
Neg logit contributions
ug-0.065
ister-0.063
irens-0.060
P-0.059
iren-0.057
Token they
Token ID484
Feature activation-4.81e-03

Pos logit contributions
ses+0.036
 former+0.034
 pleasure+0.033
comb+0.033
sed+0.032
Neg logit contributions
ug-0.024
ister-0.022
 yacht-0.020
in-0.018
ler-0.018
Token found
Token ID1043
Feature activation-1.29e-02

Pos logit contributions
ug+0.007
P+0.007
ister+0.006
ep+0.006
 yacht+0.006
Neg logit contributions
comb-0.010
ses-0.009
 fault-0.009
eter-0.009
 former-0.009
Token while
Token ID981
Feature activation+1.08e-02

Pos logit contributions
ug+0.017
P+0.015
ister+0.015
irens+0.013
aw+0.013
Neg logit contributions
comb-0.028
ses-0.027
 former-0.027
 pleasure-0.026
Aud-0.025
Token playing
Token ID2712
Feature activation+4.96e-03

Pos logit contributions
comb+0.022
 pleasure+0.021
ses+0.021
 pad+0.021
 former+0.020
Neg logit contributions
ug-0.016
ister-0.015
irens-0.013
P-0.013
opter-0.012
Token.
Token ID13
Feature activation-2.08e-03

Pos logit contributions
comb+0.010
ses+0.009
 pleasure+0.009
 pad+0.009
 hospitality+0.009
Neg logit contributions
ug-0.008
ister-0.007
P-0.007
irens-0.006
iren-0.006
Token<|endoftext|>
Token ID50256
Feature activation+2.31e-02

Pos logit contributions
ug+0.003
ister+0.003
P+0.003
irens+0.003
iren+0.003
Neg logit contributions
comb-0.004
ses-0.003
 seeking-0.003
 pad-0.003
 pleasure-0.003
TokenOne
Token ID3198
Feature activation+2.97e-02

Pos logit contributions
comb+0.035
 pleasure+0.032
 pur+0.032
ses+0.032
 seeking+0.032
Neg logit contributions
ug-0.044
ister-0.042
P-0.040
irens-0.040
opter-0.038
Token day
Token ID1110
Feature activation-1.40e-02

Pos logit contributions
comb+0.050
 pleasure+0.047
ses+0.046
 pur+0.046
 pad+0.046
Neg logit contributions
ug-0.052
ister-0.050
irens-0.047
P-0.047
iren-0.045
Token,
Token ID11
Feature activation-1.24e-02

Pos logit contributions
irens+0.033
ug+0.033
iren+0.032
arp+0.031
ister+0.031
Neg logit contributions
 to-0.013
 love-0.012
 hiding-0.012
 nib-0.012
 pur-0.011
Token a
Token ID257
Feature activation-1.42e-03

Pos logit contributions
ister+0.032
ug+0.032
irens+0.032
iren+0.031
opter+0.029
Neg logit contributions
comb-0.009
 pleasure-0.009
ses-0.009
 nib-0.009
 pad-0.009
Token little
Token ID1310
Feature activation-2.30e-02

Pos logit contributions
irens+0.003
ug+0.003
iren+0.003
ister+0.003
opter+0.002
Neg logit contributions
 pur-0.002
 pleasure-0.002
 pad-0.002
comb-0.002
ses-0.002
Token trace
Token ID12854
Feature activation+1.27e-02

Pos logit contributions
 yacht+0.004
ug+0.003
ister+0.003
 elevator+0.003
 motor+0.003
Neg logit contributions
comb-0.004
eter-0.004
ses-0.004
ulent-0.004
Aud-0.004
Token her
Token ID607
Feature activation+2.46e-02

Pos logit contributions
comb+0.025
ses+0.023
 pleasure+0.022
 seeking+0.022
 former+0.022
Neg logit contributions
ug-0.020
ister-0.018
irens-0.018
P-0.016
iren-0.016
Token finger
Token ID7660
Feature activation+8.24e-03

Pos logit contributions
comb+0.045
 hospitality+0.038
 pur+0.038
 forgetting+0.037
 pleasure+0.037
Neg logit contributions
ug-0.040
ister-0.036
P-0.036
irens-0.035
 yacht-0.035
Token over
Token ID625
Feature activation+6.65e-03

Pos logit contributions
 pleasure+0.017
 hospitality+0.017
comb+0.016
ses+0.016
Aud+0.015
Neg logit contributions
ug-0.012
ister-0.011
irens-0.010
ep-0.010
iren-0.010
Token the
Token ID262
Feature activation-1.02e-02

Pos logit contributions
comb+0.014
ses+0.013
 hospitality+0.013
 pleasure+0.013
Aud+0.012
Neg logit contributions
ug-0.009
irens-0.008
ister-0.008
P-0.008
 cubes-0.007
Token colorful
Token ID20239
Feature activation-9.74e-03

Pos logit contributions
ug+0.018
ister+0.017
P+0.017
irens+0.016
iren+0.015
Neg logit contributions
comb-0.017
 pleasure-0.016
ses-0.015
 pur-0.015
 former-0.015
Token lines
Token ID3951
Feature activation-8.34e-03

Pos logit contributions
ug+0.018
ister+0.017
P+0.016
irens+0.016
iren+0.015
Neg logit contributions
comb-0.015
 pleasure-0.014
ses-0.014
 pad-0.014
 pur-0.014
Token.
Token ID13
Feature activation-1.00e-03

Pos logit contributions
ug+0.013
ister+0.012
irens+0.011
P+0.010
iren+0.010
Neg logit contributions
 pleasure-0.016
ses-0.016
comb-0.016
 former-0.016
 hospitality-0.015
Token One
Token ID1881
Feature activation+4.71e-02

Pos logit contributions
ug+0.002
ister+0.002
P+0.002
irens+0.002
iren+0.001
Neg logit contributions
comb-0.002
 hospitality-0.002
ses-0.002
 seeking-0.002
 pleasure-0.002
Token day
Token ID1110
Feature activation+5.39e-03

Pos logit contributions
comb+0.073
 pleasure+0.068
 pad+0.067
 pur+0.066
ses+0.066
Neg logit contributions
ug-0.089
ister-0.085
irens-0.084
P-0.081
iren-0.078
Token,
Token ID11
Feature activation+7.38e-03

Pos logit contributions
comb+0.009
 pur+0.009
 nib+0.008
 pleasure+0.008
 pad+0.008
Neg logit contributions
ug-0.010
ister-0.009
irens-0.009
P-0.009
iren-0.008
Token now
Token ID783
Feature activation+1.24e-02

Pos logit contributions
comb+0.038
 pleasure+0.035
ses+0.035
 former+0.035
 hospitality+0.035
Neg logit contributions
ug-0.031
ister-0.029
P-0.026
irens-0.026
iren-0.025
Token."
Token ID526
Feature activation+1.42e-02

Pos logit contributions
comb+0.022
 pleasure+0.021
ses+0.020
 pad+0.020
 pur+0.020
Neg logit contributions
ug-0.020
ister-0.020
P-0.018
irens-0.018
iren-0.017
Token Sam
Token ID3409
Feature activation-1.27e-02

Pos logit contributions
comb+0.019
ses+0.018
 pleasure+0.018
 pur+0.018
 seeking+0.018
Neg logit contributions
ug-0.029
ister-0.028
P-0.027
irens-0.026
opter-0.025
Token felt
Token ID2936
Feature activation-1.87e-02

Pos logit contributions
ug+0.025
ister+0.024
P+0.023
irens+0.023
iren+0.022
Neg logit contributions
comb-0.018
 pleasure-0.017
ses-0.017
 former-0.016
 pad-0.016
Token bad
Token ID2089
Feature activation+1.57e-02

Pos logit contributions
ug+0.032
ister+0.030
P+0.028
irens+0.028
opter+0.027
Neg logit contributions
comb-0.032
 pleasure-0.031
ses-0.030
 pur-0.030
 seeking-0.030
Token and
Token ID290
Feature activation-1.55e-02

Pos logit contributions
comb+0.031
 former+0.031
Aud+0.030
 pleasure+0.029
 pollen+0.028
Neg logit contributions
ug-0.024
ister-0.021
irens-0.021
iren-0.020
P-0.019
Token wanted
Token ID2227
Feature activation-1.73e-02

Pos logit contributions
ug+0.026
ister+0.025
P+0.024
irens+0.022
iren+0.022
Neg logit contributions
comb-0.028
 pleasure-0.026
 pad-0.025
 former-0.025
ses-0.025
Token to
Token ID284
Feature activation+6.34e-03

Pos logit contributions
ug+0.037
ister+0.035
irens+0.035
P+0.034
iren+0.034
Neg logit contributions
comb-0.025
 pad-0.021
ses-0.020
 former-0.020
 pleasure-0.020
Token make
Token ID787
Feature activation-1.79e-02

Pos logit contributions
comb+0.007
 pad+0.006
ses+0.006
 pleasure+0.006
 former+0.006
Neg logit contributions
ug-0.015
ister-0.015
P-0.014
irens-0.014
iren-0.014
Token Tim
Token ID5045
Feature activation-1.23e-02

Pos logit contributions
ug+0.031
ister+0.030
P+0.028
irens+0.028
iren+0.026
Neg logit contributions
comb-0.030
 pleasure-0.029
ses-0.028
 pur-0.028
 pad-0.028
Token happy
Token ID3772
Feature activation+8.36e-03

Pos logit contributions
ug+0.024
ister+0.023
irens+0.022
P+0.021
iren+0.021
Neg logit contributions
comb-0.018
ses-0.017
 former-0.016
 pad-0.016
 pleasure-0.016
Token their
Token ID511
Feature activation-1.80e-02

Pos logit contributions
ug+0.004
ister+0.003
irens+0.003
P+0.003
iren+0.003
Neg logit contributions
comb-0.004
 pleasure-0.004
 pad-0.004
ses-0.004
 seeking-0.004
Token mom
Token ID1995
Feature activation+2.43e-03

Pos logit contributions
ug+0.032
ister+0.030
P+0.029
irens+0.028
iren+0.027
Neg logit contributions
comb-0.031
 pleasure-0.028
ses-0.028
 pad-0.028
 pur-0.028
Token calling
Token ID4585
Feature activation+1.10e-02

Pos logit contributions
comb+0.004
 pleasure+0.004
 former+0.004
ses+0.004
 hospitality+0.003
Neg logit contributions
ug-0.005
ister-0.004
irens-0.004
iren-0.004
P-0.004
Token them
Token ID606
Feature activation+1.38e-02

Pos logit contributions
comb+0.021
 pleasure+0.020
 pur+0.020
ses+0.020
 pad+0.020
Neg logit contributions
ug-0.016
ister-0.015
P-0.015
irens-0.014
opter-0.014
Token for
Token ID329
Feature activation+3.12e-03

Pos logit contributions
comb+0.026
 pleasure+0.024
 former+0.024
ses+0.023
 pad+0.022
Neg logit contributions
ug-0.023
irens-0.021
ister-0.021
iren-0.020
 elevator-0.017
Token lunch
Token ID9965
Feature activation-2.71e-02

Pos logit contributions
comb+0.005
 pad+0.004
ses+0.004
 former+0.004
 pleasure+0.004
Neg logit contributions
ug-0.006
ister-0.006
irens-0.006
P-0.006
iren-0.005
Token.
Token ID13
Feature activation+4.28e-03

Pos logit contributions
ug+0.043
ister+0.042
P+0.039
irens+0.039
iren+0.036
Neg logit contributions
comb-0.049
 pleasure-0.048
ses-0.047
 former-0.046
 pur-0.045
Token\n
Token ID198
Feature activation-8.17e-03

Pos logit contributions
comb+0.007
 pleasure+0.006
ses+0.006
 pad+0.006
 pur+0.006
Neg logit contributions
ug-0.008
ister-0.008
P-0.007
irens-0.007
iren-0.007
Token\n
Token ID198
Feature activation-9.73e-03

Pos logit contributions
ug+0.019
ister+0.018
P+0.018
opter+0.018
irens+0.017
Neg logit contributions
 pur-0.007
 seeking-0.007
comb-0.007
 pleasure-0.007
 watered-0.007
Token"
Token ID1
Feature activation+2.38e-02

Pos logit contributions
ug+0.016
ister+0.015
opter+0.014
loo+0.013
irens+0.013
Neg logit contributions
Aud-0.016
 watered-0.015
 seeking-0.015
comb-0.015
 pur-0.015
TokenStop
Token ID19485
Feature activation-4.60e-02

Pos logit contributions
comb+0.044
 pur+0.042
ses+0.041
 seeking+0.041
 pleasure+0.040
Neg logit contributions
ug-0.038
ister-0.036
irens-0.032
opter-0.031
P-0.031
Token ran
Token ID4966
Feature activation-1.63e-03

Pos logit contributions
ug+0.017
ister+0.016
P+0.015
irens+0.014
iren+0.014
Neg logit contributions
ses-0.018
 former-0.018
 pleasure-0.018
comb-0.017
 hospitality-0.017
Token up
Token ID510
Feature activation+6.38e-03

Pos logit contributions
ug+0.003
ister+0.003
P+0.003
irens+0.003
iren+0.003
Neg logit contributions
comb-0.002
 pleasure-0.002
ses-0.002
 pad-0.002
 former-0.002
Token and
Token ID290
Feature activation-2.21e-03

Pos logit contributions
comb+0.011
 pleasure+0.010
ses+0.010
 pad+0.010
 former+0.010
Neg logit contributions
ug-0.011
ister-0.011
irens-0.010
P-0.010
iren-0.010
Token stayed
Token ID9658
Feature activation-1.34e-02

Pos logit contributions
ug+0.003
P+0.003
ister+0.003
irens+0.003
 elevator+0.003
Neg logit contributions
 former-0.004
comb-0.004
ses-0.004
 pleasure-0.004
 fault-0.004
Token there
Token ID612
Feature activation+2.71e-03

Pos logit contributions
ug+0.022
ister+0.020
P+0.020
irens+0.019
iren+0.018
Neg logit contributions
comb-0.024
ses-0.024
 pad-0.023
 hospitality-0.023
 pleasure-0.022
Token until
Token ID1566
Feature activation-1.23e-02

Pos logit contributions
Aud+0.005
ses+0.005
comb+0.005
 former+0.005
 pollen+0.005
Neg logit contributions
ug-0.004
ister-0.003
irens-0.003
 elevator-0.003
 yacht-0.003
Token it
Token ID340
Feature activation+5.52e-03

Pos logit contributions
ug+0.018
ister+0.017
P+0.016
irens+0.016
iren+0.015
Neg logit contributions
comb-0.024
 pleasure-0.023
 seeking-0.022
ses-0.022
 pur-0.022
Token was
Token ID373
Feature activation-1.40e-03

Pos logit contributions
 pleasure+0.010
comb+0.010
ses+0.010
 hospitality+0.010
 former+0.010
Neg logit contributions
ug-0.008
ister-0.008
P-0.007
irens-0.007
opter-0.007
Token safe
Token ID3338
Feature activation+1.38e-02

Pos logit contributions
ug+0.002
ister+0.002
P+0.002
irens+0.002
iren+0.002
Neg logit contributions
comb-0.002
ses-0.002
 pleasure-0.002
 pad-0.002
 former-0.002
Token.
Token ID13
Feature activation+4.90e-03

Pos logit contributions
ses+0.030
Aud+0.029
comb+0.029
 pleasure+0.028
 hospitality+0.028
Neg logit contributions
ug-0.017
ister-0.016
irens-0.014
iren-0.013
P-0.013
Token They
Token ID1119
Feature activation+4.06e-04

Pos logit contributions
comb+0.010
 seeking+0.009
 watered+0.009
ses+0.009
 pur+0.009
Neg logit contributions
ug-0.006
ister-0.006
irens-0.006
P-0.006
iren-0.005
Token,
Token ID11
Feature activation+8.95e-04

Pos logit contributions
ug+0.003
ister+0.003
irens+0.002
P+0.002
iren+0.002
Neg logit contributions
comb-0.003
 pur-0.003
ses-0.003
 pleasure-0.003
 former-0.003
Token so
Token ID523
Feature activation-2.54e-02

Pos logit contributions
comb+0.002
ses+0.002
 seeking+0.002
 former+0.001
 watered+0.001
Neg logit contributions
ug-0.001
ister-0.001
P-0.001
irens-0.001
iren-0.001
Token she
Token ID673
Feature activation-1.54e-02

Pos logit contributions
ister+0.042
ug+0.041
P+0.039
irens+0.037
opter+0.035
Neg logit contributions
ses-0.043
comb-0.043
 pad-0.041
Aud-0.039
 seeking-0.039
Token shared
Token ID4888
Feature activation+9.42e-03

Pos logit contributions
ug+0.028
ister+0.027
P+0.026
opter+0.023
irens+0.023
Neg logit contributions
comb-0.024
 pleasure-0.024
ses-0.024
 former-0.023
 hospitality-0.022
Token the
Token ID262
Feature activation+3.75e-03

Pos logit contributions
comb+0.020
ses+0.020
 former+0.018
 seeking+0.018
 pad+0.018
Neg logit contributions
ister-0.011
P-0.011
ug-0.011
irens-0.011
iren-0.009
Token bag
Token ID6131
Feature activation-2.35e-02

Pos logit contributions
comb+0.006
ses+0.006
 former+0.006
 seeking+0.006
 pur+0.006
Neg logit contributions
ug-0.006
ister-0.006
P-0.006
irens-0.006
opter-0.006
Token with
Token ID351
Feature activation+1.23e-02

Pos logit contributions
ug+0.035
ister+0.034
P+0.031
irens+0.030
iren+0.029
Neg logit contributions
ses-0.045
comb-0.045
 pleasure-0.043
 former-0.042
 pad-0.042
Token Henry
Token ID8616
Feature activation-2.41e-02

Pos logit contributions
comb+0.022
ses+0.022
 pad+0.021
 former+0.020
 seeking+0.020
Neg logit contributions
ug-0.020
ister-0.019
P-0.019
irens-0.018
opter-0.016
Token.
Token ID13
Feature activation-3.20e-03

Pos logit contributions
ug+0.031
ister+0.029
P+0.027
in+0.025
ler+0.024
Neg logit contributions
ses-0.050
comb-0.049
 former-0.048
 pleasure-0.046
Aud-0.046
Token\n
Token ID198
Feature activation-8.42e-03

Pos logit contributions
ug+0.005
ister+0.005
P+0.005
irens+0.005
iren+0.004
Neg logit contributions
comb-0.006
ses-0.005
 former-0.005
 watered-0.005
 pad-0.005
Token\n
Token ID198
Feature activation-6.91e-03

Pos logit contributions
ug+0.016
ister+0.015
irens+0.015
P+0.015
opter+0.014
Neg logit contributions
comb-0.013
 pleasure-0.012
 pur-0.012
 pad-0.011
 seeking-0.011
Token with
Token ID351
Feature activation-2.04e-02

Pos logit contributions
ug+0.010
ister+0.010
P+0.009
irens+0.008
opter+0.008
Neg logit contributions
ses-0.014
comb-0.014
 former-0.014
 pleasure-0.013
 seeking-0.013
Token the
Token ID262
Feature activation-1.09e-02

Pos logit contributions
ug+0.031
ister+0.029
P+0.028
irens+0.027
opter+0.025
Neg logit contributions
comb-0.040
 pleasure-0.037
ses-0.037
 pad-0.037
 former-0.036
Token cube
Token ID23441
Feature activation-2.34e-02

Pos logit contributions
ug+0.016
ister+0.015
P+0.014
irens+0.014
iren+0.013
Neg logit contributions
 pleasure-0.022
comb-0.021
ses-0.021
 seeking-0.020
 former-0.020
Token and
Token ID290
Feature activation-1.13e-02

Pos logit contributions
ug+0.035
ister+0.033
P+0.028
irens+0.027
 cubes+0.027
Neg logit contributions
ses-0.045
comb-0.044
 pleasure-0.044
Aud-0.043
 former-0.043
Token with
Token ID351
Feature activation-2.17e-02

Pos logit contributions
 yacht+0.012
ug+0.011
ister+0.011
 motor+0.010
 cubes+0.010
Neg logit contributions
comb-0.026
 pad-0.024
ulent-0.024
Aud-0.023
eter-0.023
Token each
Token ID1123
Feature activation-5.98e-02

Pos logit contributions
ug+0.030
ister+0.029
P+0.027
irens+0.026
opter+0.024
Neg logit contributions
comb-0.047
 pad-0.042
 seeking-0.042
 former-0.041
ses-0.041
Token other
Token ID584
Feature activation-2.27e-02

Pos logit contributions
ug+0.095
ister+0.091
P+0.086
irens+0.084
iren+0.080
Neg logit contributions
comb-0.111
 pleasure-0.104
ses-0.103
 pur-0.102
 former-0.102
Token.
Token ID13
Feature activation+5.49e-03

Pos logit contributions
ug+0.028
ister+0.025
irens+0.021
P+0.021
ler+0.021
Neg logit contributions
comb-0.050
ses-0.049
 former-0.047
 pleasure-0.046
 seeking-0.045
Token We
Token ID775
Feature activation+1.71e-02

Pos logit contributions
comb+0.010
 pad+0.009
 watered+0.009
 seeking+0.009
 former+0.008
Neg logit contributions
ug-0.009
ister-0.008
P-0.008
irens-0.008
Dad-0.007
Token will
Token ID481
Feature activation-6.98e-03

Pos logit contributions
 seeking+0.024
comb+0.023
ses+0.023
 former+0.022
 forgetting+0.022
Neg logit contributions
ister-0.032
ug-0.032
P-0.031
opter-0.028
or-0.027
Token be
Token ID307
Feature activation-9.32e-03

Pos logit contributions
ister+0.013
ug+0.013
P+0.012
opter+0.011
?"+0.011
Neg logit contributions
ses-0.010
comb-0.010
 pad-0.010
Aud-0.009
eter-0.009
Token\n
Token ID198
Feature activation-1.08e-02

Pos logit contributions
ug+0.021
P+0.020
ister+0.020
irens+0.019
opter+0.019
Neg logit contributions
 pur-0.008
 pleasure-0.007
 nib-0.007
comb-0.007
 watered-0.007
Token"
Token ID1
Feature activation+2.79e-02

Pos logit contributions
ug+0.019
ister+0.017
opter+0.017
loo+0.016
ime+0.016
Neg logit contributions
Aud-0.016
girls-0.014
 pad-0.014
 nib-0.014
 love-0.014
TokenHello
Token ID15496
Feature activation-1.64e-02

Pos logit contributions
comb+0.052
ses+0.048
 former+0.048
 pleasure+0.048
 seeking+0.048
Neg logit contributions
P-0.043
ug-0.043
ister-0.042
irens-0.040
iren-0.038
Token,
Token ID11
Feature activation-3.21e-02

Pos logit contributions
ug+0.026
P+0.025
irens+0.024
iren+0.024
ister+0.023
Neg logit contributions
 pleasure-0.029
comb-0.028
 pur-0.027
 seeking-0.026
 hospitality-0.026
Token Hal
Token ID11023
Feature activation-7.15e-02

Pos logit contributions
ug+0.060
P+0.058
irens+0.058
ister+0.056
iren+0.054
Neg logit contributions
 pleasure-0.045
 pur-0.042
comb-0.042
 pad-0.041
 hospitality-0.041
Tokenley
Token ID1636
Feature activation-6.22e-02

Pos logit contributions
ug+0.128
irens+0.123
ister+0.122
P+0.120
iren+0.114
Neg logit contributions
comb-0.111
 pad-0.105
 pleasure-0.102
ses-0.101
 pur-0.099
Token!"
Token ID2474
Feature activation+1.24e-02

Pos logit contributions
ug+0.072
ister+0.069
P+0.058
opter+0.058
irens+0.055
Neg logit contributions
 former-0.139
comb-0.139
 pleasure-0.135
ses-0.132
 hospitality-0.130
Token Tom
Token ID4186
Feature activation-3.15e-02

Pos logit contributions
ses+0.013
 hope+0.013
 pur+0.013
 amaz+0.013
 nib+0.012
Neg logit contributions
ug-0.027
P-0.027
ister-0.025
opter-0.024
ler-0.024
Token says
Token ID1139
Feature activation-1.32e-03

Pos logit contributions
ug+0.053
ister+0.053
irens+0.051
P+0.049
iren+0.048
Neg logit contributions
comb-0.055
 former-0.050
 pleasure-0.049
ses-0.047
 hospitality-0.047
Token.
Token ID13
Feature activation+1.06e-02

Pos logit contributions
ug+0.002
ister+0.002
irens+0.002
iren+0.002
P+0.002
Neg logit contributions
comb-0.003
 former-0.003
 pleasure-0.003
 pad-0.002
ses-0.002
Token\n
Token ID198
Feature activation-8.20e-03

Pos logit contributions
comb+0.012
ses+0.011
 pur+0.011
 pleasure+0.011
 pad+0.011
Neg logit contributions
ug-0.024
ister-0.023
P-0.023
opter-0.021
irens-0.021
Token cars
Token ID5006
Feature activation+7.72e-03

Pos logit contributions
 pur+0.018
 love+0.017
 hope+0.016
 watered+0.016
 seeking+0.016
Neg logit contributions
ister-0.026
P-0.024
ug-0.024
opter-0.022
irens-0.021
Token of
Token ID286
Feature activation+1.91e-02

Pos logit contributions
comb+0.013
 pad+0.013
 seeking+0.012
 pur+0.012
ses+0.012
Neg logit contributions
ug-0.013
ister-0.012
P-0.012
irens-0.011
opter-0.011
Token different
Token ID1180
Feature activation+9.91e-03

Pos logit contributions
 pleasure+0.025
 seeking+0.024
 pur+0.024
 pad+0.024
ses+0.024
Neg logit contributions
ister-0.039
ug-0.038
P-0.037
irens-0.035
opter-0.034
Token colors
Token ID7577
Feature activation-2.02e-02

Pos logit contributions
 watered+0.009
 pad+0.009
 seeking+0.009
 pur+0.009
 love+0.009
Neg logit contributions
ister-0.024
P-0.023
ug-0.022
opter-0.020
opus-0.020
Token and
Token ID290
Feature activation+2.41e-03

Pos logit contributions
ug+0.040
ister+0.039
P+0.037
irens+0.036
opter+0.035
Neg logit contributions
comb-0.029
 pleasure-0.026
 pad-0.026
ses-0.026
 pur-0.026
Token sizes
Token ID10620
Feature activation-7.00e-02

Pos logit contributions
comb+0.003
 pleasure+0.003
ses+0.003
 former+0.003
 pad+0.003
Neg logit contributions
ug-0.005
ister-0.005
irens-0.005
P-0.005
iren-0.005
Token.
Token ID13
Feature activation-4.88e-03

Pos logit contributions
ug+0.120
ister+0.115
P+0.108
irens+0.108
iren+0.102
Neg logit contributions
comb-0.120
 pleasure-0.113
ses-0.111
 former-0.110
 pur-0.109
Token He
Token ID679
Feature activation-2.73e-02

Pos logit contributions
ug+0.011
ister+0.010
P+0.010
opter+0.010
irens+0.009
Neg logit contributions
 pleasure-0.006
 pad-0.006
ses-0.006
 pur-0.006
 former-0.005
Token likes
Token ID7832
Feature activation+1.94e-03

Pos logit contributions
ug+0.062
ister+0.059
P+0.057
irens+0.057
iren+0.054
Neg logit contributions
comb-0.032
 pleasure-0.029
ses-0.028
 pad-0.028
 former-0.028
Token to
Token ID284
Feature activation+5.06e-03

Pos logit contributions
comb+0.001
 pur+0.001
 pleasure+0.001
 seeking+0.001
 pad+0.001
Neg logit contributions
ug-0.005
ister-0.005
P-0.005
irens-0.005
opter-0.005
Token make
Token ID787
Feature activation-1.35e-02

Pos logit contributions
comb+0.005
 pur+0.005
 seeking+0.005
 pleasure+0.005
ses+0.004
Neg logit contributions
ug-0.012
ister-0.012
P-0.011
irens-0.011
opter-0.011
Token like
Token ID588
Feature activation-1.87e-02

Pos logit contributions
ug+0.019
ister+0.018
irens+0.016
P+0.016
 yacht+0.015
Neg logit contributions
comb-0.022
ses-0.020
 pleasure-0.020
 seeking-0.020
 hospitality-0.019
Token a
Token ID257
Feature activation-3.49e-02

Pos logit contributions
ug+0.028
ister+0.026
irens+0.025
P+0.024
iren+0.023
Neg logit contributions
comb-0.038
ses-0.034
 pleasure-0.034
 former-0.033
 hospitality-0.033
Token flag
Token ID6056
Feature activation-4.36e-02

Pos logit contributions
ister+0.058
ug+0.057
P+0.055
opter+0.053
irens+0.052
Neg logit contributions
comb-0.059
ses-0.055
 hospitality-0.053
 pleasure-0.053
 seeking-0.053
Token,
Token ID11
Feature activation-1.28e-02

Pos logit contributions
ug+0.055
ister+0.054
irens+0.045
P+0.044
iren+0.044
Neg logit contributions
comb-0.093
ses-0.092
 pleasure-0.092
 former-0.089
 hospitality-0.089
Token a
Token ID257
Feature activation-2.93e-02

Pos logit contributions
ug+0.021
ister+0.020
irens+0.019
P+0.018
iren+0.018
Neg logit contributions
comb-0.023
 pleasure-0.021
ses-0.020
 hospitality-0.020
 seeking-0.020
Token map
Token ID3975
Feature activation-6.09e-02

Pos logit contributions
ister+0.049
ug+0.048
P+0.047
opter+0.044
 yacht+0.043
Neg logit contributions
comb-0.049
 hospitality-0.045
ses-0.045
eter-0.045
 pleasure-0.044
Token,
Token ID11
Feature activation-1.11e-02

Pos logit contributions
ug+0.082
ister+0.079
irens+0.068
P+0.068
iren+0.068
Neg logit contributions
comb-0.125
ses-0.122
 former-0.121
 pleasure-0.120
 watered-0.117
Token a
Token ID257
Feature activation-3.00e-02

Pos logit contributions
ug+0.016
ister+0.015
irens+0.015
P+0.014
iren+0.013
Neg logit contributions
comb-0.021
 pleasure-0.019
ses-0.019
 seeking-0.019
 hospitality-0.019
Token landmark
Token ID20533
Feature activation-1.95e-02

Pos logit contributions
ister+0.050
ug+0.049
P+0.048
opter+0.045
 yacht+0.044
Neg logit contributions
comb-0.050
eter-0.046
ses-0.045
 hospitality-0.045
 former-0.045
Token,
Token ID11
Feature activation-9.88e-03

Pos logit contributions
ug+0.035
ister+0.033
irens+0.030
P+0.029
iren+0.029
Neg logit contributions
comb-0.033
 former-0.031
ses-0.031
 pur-0.031
 pleasure-0.031
Token or
Token ID393
Feature activation-1.09e-02

Pos logit contributions
ug+0.015
ister+0.014
irens+0.014
P+0.013
iren+0.013
Neg logit contributions
comb-0.018
 pleasure-0.016
ses-0.016
 seeking-0.016
 hospitality-0.016
Token and
Token ID290
Feature activation+1.46e-02

Pos logit contributions
ug+0.077
ister+0.074
P+0.073
irens+0.071
opter+0.069
Neg logit contributions
comb-0.050
 pad-0.047
 pleasure-0.045
ses-0.045
 seeking-0.044
Token went
Token ID1816
Feature activation-1.94e-02

Pos logit contributions
comb+0.024
 pleasure+0.023
ses+0.022
 former+0.022
 pad+0.022
Neg logit contributions
ug-0.026
ister-0.025
P-0.023
irens-0.023
iren-0.022
Token into
Token ID656
Feature activation-2.29e-02

Pos logit contributions
ug+0.041
ister+0.039
P+0.038
irens+0.038
iren+0.036
Neg logit contributions
comb-0.026
 pleasure-0.024
ses-0.023
 former-0.023
 hospitality-0.023
Token a
Token ID257
Feature activation-2.12e-02

Pos logit contributions
ug+0.032
ister+0.030
irens+0.029
P+0.029
iren+0.026
Neg logit contributions
comb-0.047
 pad-0.044
ses-0.044
 pleasure-0.043
 hospitality-0.042
Token big
Token ID1263
Feature activation-2.42e-02

Pos logit contributions
ug+0.037
ister+0.036
P+0.035
irens+0.034
iren+0.032
Neg logit contributions
comb-0.035
ses-0.033
 pleasure-0.033
 former-0.032
 hospitality-0.032
Token chest
Token ID7721
Feature activation-5.99e-02

Pos logit contributions
ug+0.047
ister+0.045
P+0.043
irens+0.043
iren+0.041
Neg logit contributions
comb-0.036
 pleasure-0.034
 pad-0.033
 pur-0.033
ses-0.033
Token.
Token ID13
Feature activation+3.81e-03

Pos logit contributions
ug+0.101
ister+0.097
P+0.091
irens+0.089
iren+0.085
Neg logit contributions
comb-0.103
 pleasure-0.098
ses-0.097
 pur-0.095
 former-0.095
Token The
Token ID383
Feature activation+3.04e-03

Pos logit contributions
comb+0.006
 pleasure+0.005
ses+0.005
 pur+0.005
 pad+0.005
Neg logit contributions
ug-0.007
ister-0.007
P-0.007
irens-0.007
opter-0.006
Token chest
Token ID7721
Feature activation-3.12e-02

Pos logit contributions
comb+0.006
 watered+0.005
eter+0.005
 former+0.005
Aud+0.005
Neg logit contributions
ister-0.004
ug-0.004
P-0.004
 yacht-0.004
irens-0.004
Token was
Token ID373
Feature activation-3.35e-03

Pos logit contributions
ug+0.049
ister+0.049
opter+0.043
lf+0.043
iren+0.042
Neg logit contributions
 pleasure-0.054
 former-0.054
comb-0.053
 hospitality-0.050
Aud-0.050
Token old
Token ID1468
Feature activation-1.57e-03

Pos logit contributions
ug+0.005
ister+0.005
irens+0.005
P+0.005
ep+0.005
Neg logit contributions
comb-0.006
 hospitality-0.005
eter-0.005
 pleasure-0.005
 former-0.005