TOP ACTIVATIONS
MAX = 0.192

you go , my d arl ings . Enjoy your lunch
's okay , my d arl ings . I forgive you
you too , my d arl ings . Now let 's
. Rex sn arl s . He does not
's wonderful , my d arl ings !" Mom my says
What happened , my d arl ings ? Why do you
'm sorry , my d arl ings . That was a
" Hello , my d arl ings . I 'm so
have fun , my d arl ings ?" Their mother asked
We know , my d arl ings . We miss them
have fun , my d arl ings ?" she asks .
" Hello , my d arl ings . I 'm so
Spot grow ls and sn arl s . He tries to
Not now , my d arl ings , it is too
you too , my d arl ings . But you have
you okay , my d arl ings ?" she asked .
to grow l and sn arl to try and scare her
you too , my d arl ings ," she said .
your fault , my d arl ings . You didn 't
'm sorry , my d arl ings . The tornado took

MIN ACTIVATIONS
MIN = -0.191

Hello Princess ! I am here to
. The Princess never found out what the
One day , Princess wanted to go on an
. From then on the Princess travelled around the world believing
One day , the Princess went outside and saw a
The dragon and Princess stayed up there for a
of crazy animals and amazing performers . She gasped in delight
Once inside , the Princess gave Jill a big hug
. Every day , the Princess polished her golden crown with
The Prince and Princess were so happy that they
the opportunity to marry his Princess . The King was so
One day when the Princess arrived at the lake ,
time there was a real Princess named Sally . Sally was
smiled . " Yes , Princess . I 'll hurry and
he gave the Prince and Princess his blessing .
ready . All of the performers were standing in the back
her secret hut . The Princess had lots of wonderful things
. When the Princess arrived , the King was
the park . When the performers finished , she joined the
was a kind and nice Princess . Every day , the

INTERVAL 0.096 - 0.192
CONTAINS 1.089%

. He jumped off his bed and ran to his mom
head . She felt something wet on her hand . It
happy and safe . The end . <|endoftext|> Once upon a
she walked is saw a big , shiny mailbox . It
see the movie on the big screen . As

INTERVAL 0.000 - 0.096
CONTAINS 50.288%

went back inside . She took off her wet rain coat
girl wanted to play with it , so her mom helped
strange happened . Bear wanted chocolate milk , but Rabbit wanted
and spl ashed . They did not see the sign that
happy . One day , Daisy 's parents said

INTERVAL -0.095 - 0.000
CONTAINS 46.716%

is also important to be kind and play together . <|endoftext|>
happy little girl named Jane . Jane was very excited for
the prett iest stickers . They spent a long time carefully
get together and build more things around the big house ,
<|endoftext|> Once upon a time , there was a rare bear

INTERVAL -0.191 - -0.095
CONTAINS 1.906%

a time there were two friends , Sam and Lucy .
idea . She nods and smiles . " OK
big airport . Lots of people went there to fly in
there was a little boy who liked to run . One
and the puppy became best friends and went on many adventures
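
The four INTERVAL blocks above look like equal-width slices of the range between MIN (-0.191) and MAX (0.192), with CONTAINS giving the share of activations in each slice; the four listed shares sum to roughly 100%. The sketch below illustrates that reading only. It is an assumption about how the buckets were built, not the tool's actual code, and the function names `equal_width_bins` and `bin_shares` are hypothetical. With the rounded MIN and MAX shown here, the computed edges land within about 0.001 of the listed boundaries.

```python
def equal_width_bins(lo, hi, n_bins):
    """Split [lo, hi] into n_bins equal-width (start, end) pairs."""
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    edges[-1] = hi  # pin the top edge so it is exactly the maximum
    return list(zip(edges[:-1], edges[1:]))


def bin_shares(values, bins):
    """Percentage of values per bin; the last bin includes its upper edge."""
    counts = [0] * len(bins)
    for v in values:
        for i, (start, end) in enumerate(bins):
            is_last = i == len(bins) - 1
            if start <= v < end or (is_last and v == end):
                counts[i] += 1
                break
    total = len(values)
    if total == 0:
        return [0.0] * len(bins)
    return [100.0 * c / total for c in counts]


# Assumed inputs: the rounded MIN and MAX from the listing above.
bins = equal_width_bins(-0.191, 0.192, 4)

# Print highest interval first, matching the order used in the listing.
for start, end in reversed(bins):
    print(f"INTERVAL {start:.3f} - {end:.3f}")
```

Passing the full list of per-token activations to `bin_shares(values, bins)` would give the CONTAINS percentages under this assumption; those raw values are not part of this listing, so the shares are not recomputed here.
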
Token you
Token ID345
Feature activation-3.49e-02

Pos logit contributions
 fascinated+0.083
 intrigued+0.080
 Curious+0.078
mit+0.074
nam+0.073
Neg logit contributions
loo-0.077
 dri-0.076
 dent-0.076
ister-0.074
 collapse-0.070
Token go
Token ID467
Feature activation+8.47e-02

Pos logit contributions
 fascinated+0.070
 Curious+0.068
 intrigued+0.067
nam+0.065
cious+0.064
Neg logit contributions
 dri-0.047
loo-0.045
ister-0.045
 dent-0.045
 collapse-0.041
Token,
Token ID11
Feature activation+1.49e-03

Pos logit contributions
 dent+0.146
loo+0.145
 dri+0.143
ister+0.141
imi+0.139
Neg logit contributions
 fascinated-0.140
 intrigued-0.135
 Curious-0.130
mit-0.123
 curious-0.121
Token my
Token ID616
Feature activation-2.66e-02

Pos logit contributions
 dri+0.002
loo+0.002
 dent+0.002
ister+0.002
imi+0.002
Neg logit contributions
 fascinated-0.003
 intrigued-0.003
 Curious-0.003
nam-0.003
cious-0.002
Token d
Token ID288
Feature activation-1.47e-02

Pos logit contributions
 fascinated+0.045
 intrigued+0.043
 Curious+0.042
mit+0.041
nam+0.040
Neg logit contributions
loo-0.045
 dri-0.044
 dent-0.044
ister-0.043
 collapse-0.041
Tokenarl
Token ID7063
Feature activation+1.92e-01

Pos logit contributions
 fascinated+0.035
mit+0.035
 intrigued+0.035
 Curious+0.034
cious+0.033
Neg logit contributions
 dent-0.013
 dri-0.012
loo-0.012
 swollen-0.011
 collapse-0.011
Tokenings
Token ID654
Feature activation+1.19e-02

Pos logit contributions
 dent+0.232
loo+0.231
 collapse+0.226
 dri+0.226
 flash+0.217
Neg logit contributions
 fascinated-0.389
mit-0.383
 intrigued-0.380
 Curious-0.365
cious-0.363
Token.
Token ID13
Feature activation+8.71e-03

Pos logit contributions
 dent+0.017
loo+0.017
 dri+0.016
 collapse+0.016
imi+0.015
Neg logit contributions
 fascinated-0.023
 intrigued-0.022
 Curious-0.022
mit-0.021
cious-0.020
Token Enjoy
Token ID18179
Feature activation+2.72e-02

Pos logit contributions
loo+0.014
 dri+0.014
 dent+0.014
ister+0.014
arl+0.014
Neg logit contributions
 fascinated-0.015
 intrigued-0.014
nam-0.013
 Curious-0.013
mit-0.013
Token your
Token ID534
Feature activation-1.42e-02

Pos logit contributions
 dent+0.040
loo+0.039
 collapse+0.038
 dri+0.036
 bolt+0.035
Neg logit contributions
 fascinated-0.052
 intrigued-0.050
mit-0.050
cious-0.049
nam-0.048
Token lunch
Token ID9965
Feature activation+8.43e-02

Pos logit contributions
 fascinated+0.027
 intrigued+0.027
 Curious+0.026
mit+0.025
nam+0.025
Neg logit contributions
loo-0.020
 dent-0.020
 dri-0.020
ister-0.020
imi-0.019
Token's
Token ID338
Feature activation-7.30e-03

Pos logit contributions
loo+0.046
 dent+0.045
 dri+0.045
ister+0.044
imi+0.042
Neg logit contributions
 fascinated-0.042
 intrigued-0.040
 Curious-0.039
nam-0.037
mit-0.036
Token okay
Token ID8788
Feature activation+7.25e-03

Pos logit contributions
 fascinated+0.010
 intrigued+0.010
nam+0.010
 curious+0.010
 Curious+0.009
Neg logit contributions
loo-0.015
ister-0.014
 dri-0.014
 dent-0.014
iren-0.013
Token,
Token ID11
Feature activation-3.36e-03

Pos logit contributions
 dri+0.012
 dent+0.012
loo+0.012
ister+0.011
arl+0.011
Neg logit contributions
 fascinated-0.012
 intrigued-0.011
 Curious-0.011
 curious-0.011
mit-0.011
Token my
Token ID616
Feature activation-3.56e-02

Pos logit contributions
 fascinated+0.006
 intrigued+0.006
 Curious+0.006
cious+0.005
nam+0.005
Neg logit contributions
 dent-0.006
loo-0.006
 dri-0.006
ister-0.005
imi-0.005
Token d
Token ID288
Feature activation-2.31e-02

Pos logit contributions
 fascinated+0.058
 intrigued+0.056
 Curious+0.054
nam+0.051
mit+0.051
Neg logit contributions
loo-0.063
 dent-0.062
 dri-0.062
ister-0.061
imi-0.058
Tokenarl
Token ID7063
Feature activation+1.87e-01

Pos logit contributions
mit+0.052
ked+0.051
 fascinated+0.050
 Curious+0.050
 youngest+0.049
Neg logit contributions
 swollen-0.019
 dent-0.018
 dri-0.018
 flash-0.017
 collapse-0.016
Tokenings
Token ID654
Feature activation+6.27e-03

Pos logit contributions
prise+0.225
 ig+0.225
 collapse+0.222
aster+0.218
web+0.216
Neg logit contributions
mit-0.368
cious-0.335
 subjects-0.334
 fascinated-0.333
ked-0.330
Token.
Token ID13
Feature activation+1.54e-02

Pos logit contributions
 dent+0.013
loo+0.012
 dri+0.012
omo+0.012
 swollen+0.012
Neg logit contributions
 wanted-0.007
 Curious-0.007
 curious-0.007
 fascinated-0.006
 intrigued-0.006
Token I
Token ID314
Feature activation-8.93e-03

Pos logit contributions
loo+0.028
 dent+0.027
 dri+0.027
ister+0.026
arl+0.025
Neg logit contributions
 fascinated-0.025
 intrigued-0.024
 Curious-0.023
nam-0.022
mit-0.022
Token forgive
Token ID20927
Feature activation-3.92e-03

Pos logit contributions
 fascinated+0.018
 intrigued+0.018
 Curious+0.017
mit+0.017
 curious+0.016
Neg logit contributions
loo-0.013
ister-0.012
 dent-0.012
imi-0.011
 dri-0.011
Token you
Token ID345
Feature activation-5.98e-02

Pos logit contributions
 fascinated+0.008
 intrigued+0.008
 Curious+0.008
nam+0.008
mit+0.007
Neg logit contributions
loo-0.005
 dri-0.005
 dent-0.005
ister-0.005
 collapse-0.004
Token you
Token ID345
Feature activation-5.88e-02

Pos logit contributions
 dent+0.070
loo+0.069
 collapse+0.064
ister+0.063
 dri+0.063
Neg logit contributions
 fascinated-0.071
nam-0.068
 intrigued-0.066
cious-0.064
 curious-0.064
Token too
Token ID1165
Feature activation+2.35e-02

Pos logit contributions
 fascinated+0.106
 intrigued+0.100
nam+0.099
 curious+0.098
 Curious+0.097
Neg logit contributions
loo-0.101
 dri-0.095
 dent-0.094
 collapse-0.092
ister-0.091
Token,
Token ID11
Feature activation+3.04e-04

Pos logit contributions
loo+0.036
 dri+0.035
 dent+0.035
ister+0.034
 collapse+0.032
Neg logit contributions
 fascinated-0.045
 intrigued-0.043
 Curious-0.042
nam-0.041
mit-0.040
Token my
Token ID616
Feature activation-3.42e-02

Pos logit contributions
loo+0.001
 dent+0.001
 dri+0.001
ister+0.001
 collapse+0.000
Neg logit contributions
 fascinated-0.000
 intrigued-0.000
 Curious-0.000
nam-0.000
mit-0.000
Token d
Token ID288
Feature activation-2.32e-02

Pos logit contributions
 fascinated+0.060
 intrigued+0.058
 Curious+0.056
nam+0.053
mit+0.053
Neg logit contributions
loo-0.057
 dent-0.056
 dri-0.056
ister-0.055
 collapse-0.052
Tokenarl
Token ID7063
Feature activation+1.86e-01

Pos logit contributions
mit+0.051
 fascinated+0.050
ked+0.050
 intrigued+0.048
cious+0.047
Neg logit contributions
 dent-0.023
 flash-0.021
 swollen-0.021
loo-0.020
 dri-0.019
Tokenings
Token ID654
Feature activation+6.83e-03

Pos logit contributions
 ig+0.211
 collapse+0.208
prise+0.205
loo+0.205
 dent+0.204
Neg logit contributions
mit-0.386
 fascinated-0.366
 intrigued-0.359
cious-0.356
 Curious-0.350
Token.
Token ID13
Feature activation+1.63e-02

Pos logit contributions
 dent+0.014
loo+0.013
 collapse+0.013
ig+0.012
 dri+0.012
Neg logit contributions
 wanted-0.009
 Curious-0.009
 fascinated-0.008
 curious-0.008
 Claire-0.008
Token Now
Token ID2735
Feature activation+1.44e-02

Pos logit contributions
loo+0.031
 dent+0.030
 dri+0.030
ister+0.029
imi+0.028
Neg logit contributions
 fascinated-0.025
 intrigued-0.024
 Curious-0.023
nam-0.022
mit-0.022
Token let
Token ID1309
Feature activation+4.69e-03

Pos logit contributions
loo+0.022
 dent+0.021
 dri+0.021
ister+0.021
imi+0.020
Neg logit contributions
 fascinated-0.027
 intrigued-0.026
 Curious-0.026
nam-0.025
mit-0.025
Token's
Token ID338
Feature activation-4.01e-03

Pos logit contributions
loo+0.011
 dent+0.011
 dri+0.010
ister+0.010
arl+0.010
Neg logit contributions
 fascinated-0.005
 intrigued-0.005
 Curious-0.005
mit-0.004
cious-0.004
Token.
Token ID13
Feature activation+1.70e-02

Pos logit contributions
 fascinated+0.018
 intrigued+0.017
 Curious+0.017
nam+0.017
mit+0.017
Neg logit contributions
 dent-0.008
 dri-0.008
loo-0.008
ister-0.008
 collapse-0.007
Token
Token ID198
Feature activation+2.56e-02

Pos logit contributions
loo+0.031
 dri+0.030
 dent+0.030
ister+0.029
imi+0.028
Neg logit contributions
 fascinated-0.028
 intrigued-0.026
 Curious-0.026
nam-0.024
mit-0.024
Token
Token ID198
Feature activation+2.51e-02

Pos logit contributions
loo+0.059
 dri+0.058
 dent+0.057
ister+0.054
ig+0.054
Neg logit contributions
 Curious-0.028
 fascinated-0.028
 intrigued-0.026
 curious-0.024
 Lay-0.021
TokenRex
Token ID47389
Feature activation-2.16e-02

Pos logit contributions
loo+0.039
 dri+0.038
 dent+0.036
 stain+0.034
imi+0.033
Neg logit contributions
 fascinated-0.047
 Curious-0.047
 intrigued-0.045
cious-0.044
 curious-0.044
Token sn
Token ID3013
Feature activation+3.36e-02

Pos logit contributions
nam+0.050
 fascinated+0.049
 intrigued+0.048
 Curious+0.047
cious+0.046
Neg logit contributions
 dri-0.025
 dent-0.023
loo-0.021
ister-0.020
 burns-0.020
Tokenarl
Token ID7063
Feature activation+1.86e-01

Pos logit contributions
 dent+0.053
 dri+0.052
 flash+0.050
 ig+0.049
 swollen+0.049
Neg logit contributions
 fascinated-0.055
 Curious-0.054
 intrigued-0.053
mit-0.053
ked-0.052
Tokens
Token ID82
Feature activation+3.34e-03

Pos logit contributions
prise+0.268
aster+0.260
 dent+0.256
icious+0.252
 ig+0.252
Neg logit contributions
 fascinated-0.325
mit-0.320
 intrigued-0.315
 Curious-0.303
ked-0.300
Token.
Token ID13
Feature activation+1.76e-02

Pos logit contributions
 dent+0.004
 dri+0.004
loo+0.004
ister+0.004
 collapse+0.004
Neg logit contributions
 fascinated-0.007
 intrigued-0.007
 Curious-0.007
cious-0.007
mit-0.007
Token He
Token ID679
Feature activation+3.12e-02

Pos logit contributions
loo+0.033
 dent+0.033
 dri+0.033
ister+0.032
imi+0.030
Neg logit contributions
 fascinated-0.027
 intrigued-0.026
 Curious-0.025
nam-0.024
mit-0.023
Token does
Token ID857
Feature activation+3.08e-02

Pos logit contributions
 dri+0.048
 dent+0.046
loo+0.044
ister+0.042
 collapse+0.041
Neg logit contributions
 fascinated-0.060
 intrigued-0.057
 Curious-0.056
nam-0.056
cious-0.055
Token not
Token ID407
Feature activation+2.42e-02

Pos logit contributions
 dent+0.040
 dri+0.040
ister+0.038
loo+0.038
 collapse+0.037
Neg logit contributions
 fascinated-0.064
 intrigued-0.061
 Curious-0.061
nam-0.059
cious-0.059
Token's
Token ID338
Feature activation-7.02e-03

Pos logit contributions
loo+0.036
arl+0.035
ister+0.034
 dent+0.034
 dri+0.032
Neg logit contributions
 fascinated-0.036
 intrigued-0.035
 Curious-0.034
mit-0.032
nam-0.032
Token wonderful
Token ID7932
Feature activation-2.53e-02

Pos logit contributions
nam+0.011
 fascinated+0.010
 curious+0.010
 intrigued+0.010
 wanted+0.009
Neg logit contributions
loo-0.015
ister-0.014
iren-0.014
 dri-0.013
 dent-0.013
Token,
Token ID11
Feature activation-5.65e-03

Pos logit contributions
 fascinated+0.054
 intrigued+0.052
 Curious+0.051
nam+0.049
mit+0.049
Neg logit contributions
loo-0.032
 dri-0.032
 dent-0.032
ister-0.031
imi-0.029
Token my
Token ID616
Feature activation-2.85e-02

Pos logit contributions
 fascinated+0.010
 intrigued+0.009
 Curious+0.009
nam+0.009
cious+0.008
Neg logit contributions
 dri-0.010
loo-0.009
 dent-0.009
ister-0.009
imi-0.009
Token d
Token ID288
Feature activation-1.89e-02

Pos logit contributions
 fascinated+0.045
 intrigued+0.043
 Curious+0.042
mit+0.040
nam+0.040
Neg logit contributions
loo-0.052
 dri-0.051
 dent-0.051
ister-0.050
imi-0.047
Tokenarl
Token ID7063
Feature activation+1.85e-01

Pos logit contributions
 fascinated+0.043
mit+0.043
 intrigued+0.042
 Curious+0.040
cious+0.040
Neg logit contributions
 dent-0.020
loo-0.018
 dri-0.018
 flash-0.017
 collapse-0.017
Tokenings
Token ID654
Feature activation+6.03e-03

Pos logit contributions
loo+0.176
 dent+0.172
 dri+0.169
 collapse+0.164
imi+0.156
Neg logit contributions
 fascinated-0.435
 intrigued-0.424
mit-0.423
 Curious-0.410
cious-0.406
Token!"
Token ID2474
Feature activation-3.80e-02

Pos logit contributions
loo+0.009
 dent+0.009
imi+0.008
omo+0.008
ig+0.008
Neg logit contributions
 fascinated-0.011
 intrigued-0.011
 Curious-0.011
 wanted-0.010
mit-0.010
Token Mom
Token ID11254
Feature activation+9.95e-02

Pos logit contributions
 fascinated+0.068
 intrigued+0.065
mit+0.063
nam+0.062
cious+0.062
Neg logit contributions
 dri-0.060
ister-0.058
loo-0.056
 dent-0.055
omo-0.054
Tokenmy
Token ID1820
Feature activation+4.92e-02

Pos logit contributions
loo+0.158
 dent+0.156
 dri+0.155
ister+0.151
imi+0.143
Neg logit contributions
 fascinated-0.181
 intrigued-0.174
 Curious-0.170
nam-0.162
mit-0.160
Token says
Token ID1139
Feature activation-1.02e-02

Pos logit contributions
 dri+0.077
ister+0.071
loo+0.070
 dent+0.070
imi+0.068
Neg logit contributions
 fascinated-0.090
nam-0.088
 intrigued-0.086
 Curious-0.085
cious-0.084
TokenWhat
Token ID2061
Feature activation-5.05e-02

Pos logit contributions
 fascinated+0.150
 Curious+0.145
 intrigued+0.144
nam+0.137
cious+0.133
Neg logit contributions
loo-0.119
 dent-0.115
 dri-0.112
ister-0.111
 collapse-0.106
Token happened
Token ID3022
Feature activation+2.89e-03

Pos logit contributions
 fascinated+0.096
 intrigued+0.093
 Curious+0.091
nam+0.087
mit+0.087
Neg logit contributions
loo-0.075
 dri-0.074
 dent-0.074
ister-0.072
imi-0.069
Token,
Token ID11
Feature activation-2.17e-03

Pos logit contributions
loo+0.004
 dri+0.004
 dent+0.004
ister+0.004
imi+0.003
Neg logit contributions
 fascinated-0.006
 intrigued-0.006
 Curious-0.006
nam-0.005
mit-0.005
Token my
Token ID616
Feature activation-3.68e-02

Pos logit contributions
 fascinated+0.003
 intrigued+0.003
 Curious+0.003
nam+0.003
mit+0.003
Neg logit contributions
loo-0.004
 dent-0.004
ister-0.004
 dri-0.004
 collapse-0.004
Token d
Token ID288
Feature activation-2.59e-02

Pos logit contributions
 fascinated+0.057
 intrigued+0.054
 Curious+0.053
nam+0.050
mit+0.049
Neg logit contributions
loo-0.069
 dent-0.068
 dri-0.068
ister-0.066
imi-0.064
Tokenarl
Token ID7063
Feature activation+1.85e-01

Pos logit contributions
ked+0.056
 Curious+0.055
 fascinated+0.055
mit+0.055
 intrigued+0.054
Neg logit contributions
 dent-0.022
 dri-0.021
 flash-0.021
 swollen-0.020
 collapse-0.019
Tokenings
Token ID654
Feature activation+3.13e-03

Pos logit contributions
prise+0.233
 ig+0.231
 collapse+0.228
aster+0.226
icious+0.222
Neg logit contributions
mit-0.356
ed-0.325
 fascinated-0.322
 intrigued-0.320
cious-0.319
Token?
Token ID30
Feature activation-1.35e-03

Pos logit contributions
 dent+0.004
 collapse+0.004
 snowball+0.004
loo+0.003
 ig+0.003
Neg logit contributions
 wanted-0.006
 Curious-0.006
 fascinated-0.006
 curious-0.006
 intrigued-0.006
Token Why
Token ID4162
Feature activation-5.10e-02

Pos logit contributions
 fascinated+0.002
 intrigued+0.002
 Curious+0.002
nam+0.002
mit+0.002
Neg logit contributions
loo-0.003
 dri-0.003
 dent-0.003
ister-0.003
imi-0.002
Token do
Token ID466
Feature activation+8.29e-02

Pos logit contributions
 fascinated+0.105
 intrigued+0.102
 Curious+0.100
cious+0.095
nam+0.094
Neg logit contributions
 dri-0.072
loo-0.070
 dent-0.069
ister-0.069
imi-0.065
Token you
Token ID345
Feature activation-5.99e-02

Pos logit contributions
loo+0.095
 dent+0.093
 dri+0.093
ister+0.091
imi+0.084
Neg logit contributions
 fascinated-0.188
 intrigued-0.182
 Curious-0.178
nam-0.172
mit-0.170
Token'm
Token ID1101
Feature activation+4.86e-03

Pos logit contributions
loo+0.030
 dent+0.029
ister+0.028
 dri+0.028
imi+0.027
Neg logit contributions
 fascinated-0.028
 intrigued-0.027
 Curious-0.027
 curious-0.025
mit-0.024
Token sorry
Token ID7926
Feature activation+1.48e-02

Pos logit contributions
loo+0.009
 dri+0.009
 dent+0.008
ister+0.008
omo+0.008
Neg logit contributions
 fascinated-0.008
 intrigued-0.008
 curious-0.007
 Curious-0.007
nam-0.007
Token,
Token ID11
Feature activation-2.97e-03

Pos logit contributions
 dent+0.025
loo+0.025
 dri+0.025
ister+0.023
 collapse+0.023
Neg logit contributions
 fascinated-0.024
 intrigued-0.023
 Curious-0.023
 curious-0.023
 Lay-0.022
Token my
Token ID616
Feature activation-3.32e-02

Pos logit contributions
 fascinated+0.005
 intrigued+0.005
 Curious+0.005
 curious+0.005
mit+0.005
Neg logit contributions
loo-0.005
 dent-0.005
 dri-0.005
ister-0.005
 collapse-0.004
Token d
Token ID288
Feature activation-2.37e-02

Pos logit contributions
 fascinated+0.054
 intrigued+0.052
 Curious+0.050
nam+0.049
mit+0.047
Neg logit contributions
loo-0.060
 dri-0.059
 dent-0.059
ister-0.057
imi-0.055
Tokenarl
Token ID7063
Feature activation+1.84e-01

Pos logit contributions
ked+0.051
 fascinated+0.049
mit+0.049
 intrigued+0.048
 Curious+0.048
Neg logit contributions
 dent-0.022
 flash-0.021
 dri-0.020
 swollen-0.018
 collapse-0.018
Tokenings
Token ID654
Feature activation+2.65e-03

Pos logit contributions
aster+0.221
 collapse+0.216
prise+0.214
web+0.211
 dent+0.210
Neg logit contributions
mit-0.369
 fascinated-0.341
cious-0.340
 intrigued-0.332
ed-0.327
Token.
Token ID13
Feature activation+1.26e-02

Pos logit contributions
 dent+0.005
loo+0.005
omo+0.005
 dri+0.005
aster+0.005
Neg logit contributions
 wanted-0.003
 Curious-0.003
 fascinated-0.003
 curious-0.003
 intrigued-0.003
Token That
Token ID1320
Feature activation-2.75e-02

Pos logit contributions
loo+0.023
 dri+0.023
 dent+0.023
ister+0.022
imi+0.022
Neg logit contributions
 fascinated-0.020
 intrigued-0.019
 Curious-0.018
nam-0.017
mit-0.017
Token was
Token ID373
Feature activation+4.86e-03

Pos logit contributions
 fascinated+0.040
 intrigued+0.038
 Curious+0.037
nam+0.036
mit+0.035
Neg logit contributions
loo-0.054
 dri-0.053
 dent-0.053
ister-0.052
imi-0.050
Token a
Token ID257
Feature activation+2.86e-03

Pos logit contributions
loo+0.009
 dent+0.008
 dri+0.008
ister+0.008
imi+0.008
Neg logit contributions
 fascinated-0.008
 intrigued-0.008
 Curious-0.008
nam-0.007
mit-0.007
Token "
Token ID366
Feature activation-7.56e-02

Pos logit contributions
 fascinated+0.001
 intrigued+0.001
 Curious+0.001
mit+0.001
nam+0.001
Neg logit contributions
 dri-0.001
 dent-0.001
ister-0.001
loo-0.001
arl-0.001
TokenHello
Token ID15496
Feature activation-7.51e-02

Pos logit contributions
 fascinated+0.138
 intrigued+0.132
 Curious+0.129
nam+0.124
mit+0.122
Neg logit contributions
loo-0.121
 dent-0.120
 dri-0.119
ister-0.116
imi-0.110
Token,
Token ID11
Feature activation-2.10e-03

Pos logit contributions
 fascinated+0.148
 intrigued+0.143
 Curious+0.141
mit+0.135
nam+0.134
Neg logit contributions
loo-0.106
 dent-0.106
 dri-0.105
ister-0.100
 collapse-0.095
Token my
Token ID616
Feature activation-3.51e-02

Pos logit contributions
 fascinated+0.003
 intrigued+0.003
 Curious+0.003
nam+0.003
mit+0.003
Neg logit contributions
loo-0.004
 dri-0.004
 dent-0.004
ister-0.004
imi-0.003
Token d
Token ID288
Feature activation-2.62e-02

Pos logit contributions
 fascinated+0.066
 intrigued+0.064
nam+0.064
 Curious+0.062
chard+0.062
Neg logit contributions
loo-0.054
 dent-0.052
ister-0.052
 dri-0.051
ay-0.049
Tokenarl
Token ID7063
Feature activation+1.84e-01

Pos logit contributions
 fascinated+0.065
mit+0.064
 intrigued+0.064
 Curious+0.064
 Lay+0.061
Neg logit contributions
 dent-0.019
 dri-0.018
 swollen-0.018
 flash-0.014
 collapse-0.014
Tokenings
Token ID654
Feature activation+3.25e-03

Pos logit contributions
 ig+0.227
 collapse+0.224
 snowball+0.218
 swollen+0.214
prise+0.214
Neg logit contributions
mit-0.355
 fascinated-0.330
 intrigued-0.323
ed-0.322
cious-0.322
Token.
Token ID13
Feature activation+1.34e-02

Pos logit contributions
 dent+0.006
loo+0.006
 dri+0.005
omo+0.005
 collapse+0.005
Neg logit contributions
 Curious-0.005
 fascinated-0.005
cious-0.004
 intrigued-0.004
 curious-0.004
Token I
Token ID314
Feature activation-8.43e-03

Pos logit contributions
loo+0.021
 dri+0.020
arl+0.019
ister+0.019
 dent+0.019
Neg logit contributions
 fascinated-0.026
 intrigued-0.025
nam-0.023
mit-0.022
 Curious-0.022
Token'm
Token ID1101
Feature activation+6.21e-03

Pos logit contributions
 fascinated+0.016
 intrigued+0.016
 Curious+0.015
mit+0.015
nam+0.015
Neg logit contributions
loo-0.012
 dent-0.012
 dri-0.012
ister-0.012
imi-0.011
Token so
Token ID523
Feature activation-2.97e-02

Pos logit contributions
loo+0.009
 dri+0.009
ister+0.009
 dent+0.008
imi+0.008
Neg logit contributions
 fascinated-0.012
 Curious-0.012
 intrigued-0.012
mit-0.011
cious-0.011
Token have
Token ID423
Feature activation+5.85e-02

Pos logit contributions
 fascinated+0.099
 intrigued+0.096
 Curious+0.094
 curious+0.088
mit+0.088
Neg logit contributions
loo-0.107
 dent-0.104
 dri-0.101
ister-0.099
imi-0.099
Token fun
Token ID1257
Feature activation+2.12e-02

Pos logit contributions
loo+0.088
 dent+0.086
 dri+0.086
ister+0.084
imi+0.079
Neg logit contributions
 fascinated-0.112
 intrigued-0.108
 Curious-0.106
nam-0.102
mit-0.100
Token,
Token ID11
Feature activation-1.78e-03

Pos logit contributions
 dri+0.036
loo+0.036
 dent+0.035
 collapse+0.035
arl+0.034
Neg logit contributions
 fascinated-0.036
 intrigued-0.034
 curious-0.032
nam-0.031
mit-0.030
Token my
Token ID616
Feature activation-3.54e-02

Pos logit contributions
 fascinated+0.003
 intrigued+0.003
 Curious+0.003
nam+0.003
mit+0.003
Neg logit contributions
loo-0.003
 dent-0.003
 dri-0.003
ister-0.003
 collapse-0.003
Token d
Token ID288
Feature activation-2.52e-02

Pos logit contributions
 fascinated+0.054
 intrigued+0.052
 Curious+0.051
nam+0.048
cious+0.046
Neg logit contributions
loo-0.067
 dri-0.066
 dent-0.066
ister-0.065
imi-0.062
Tokenarl
Token ID7063
Feature activation+1.84e-01

Pos logit contributions
 fascinated+0.060
mit+0.060
 intrigued+0.058
ked+0.057
 Curious+0.057
Neg logit contributions
 dent-0.021
 dri-0.019
 swollen-0.018
 flash-0.017
loo-0.017
Tokenings
Token ID654
Feature activation+2.07e-03

Pos logit contributions
 ig+0.197
 dent+0.197
aster+0.195
 collapse+0.195
prise+0.194
Neg logit contributions
mit-0.381
 fascinated-0.368
 intrigued-0.362
cious-0.357
 Curious-0.351
Token?"
Token ID1701
Feature activation-1.56e-02

Pos logit contributions
 dent+0.003
ig+0.003
 collapse+0.003
loo+0.003
 injuries+0.003
Neg logit contributions
 wanted-0.004
 Curious-0.004
 fascinated-0.004
 intrigued-0.003
 curious-0.003
Token Their
Token ID5334
Feature activation-8.89e-03

Pos logit contributions
 fascinated+0.032
 intrigued+0.031
 Curious+0.031
mit+0.030
cious+0.030
Neg logit contributions
 dri-0.020
 dent-0.020
loo-0.020
ister-0.020
imi-0.018
Token mother
Token ID2802
Feature activation+1.51e-02

Pos logit contributions
 fascinated+0.016
 intrigued+0.015
 Curious+0.015
nam+0.014
mit+0.014
Neg logit contributions
loo-0.014
 dri-0.014
 dent-0.014
ister-0.014
imi-0.013
Token asked
Token ID1965
Feature activation-3.23e-02

Pos logit contributions
loo+0.028
 dri+0.028
 dent+0.028
ister+0.027
imi+0.026
Neg logit contributions
 fascinated-0.023
 intrigued-0.022
 Curious-0.022
nam-0.021
mit-0.020
TokenWe
Token ID1135
Feature activation-4.12e-02

Pos logit contributions
 fascinated+0.147
 intrigued+0.142
 Curious+0.138
nam+0.133
mit+0.133
Neg logit contributions
loo-0.118
 dri-0.118
 dent-0.117
ister-0.115
imi-0.108
Token know
Token ID760
Feature activation+2.95e-02

Pos logit contributions
 fascinated+0.081
 intrigued+0.078
 Curious+0.077
nam+0.074
cious+0.073
Neg logit contributions
 dri-0.060
 dent-0.058
loo-0.057
ister-0.055
omo-0.052
Token,
Token ID11
Feature activation-3.01e-03

Pos logit contributions
loo+0.034
 dri+0.033
ister+0.032
 dent+0.032
imi+0.030
Neg logit contributions
 fascinated-0.067
 intrigued-0.065
 Curious-0.064
mit-0.061
nam-0.060
Token my
Token ID616
Feature activation-3.54e-02

Pos logit contributions
 fascinated+0.005
 intrigued+0.005
 Curious+0.005
nam+0.005
cious+0.005
Neg logit contributions
 dri-0.005
loo-0.005
 dent-0.005
ister-0.005
imi-0.005
Token d
Token ID288
Feature activation-2.46e-02

Pos logit contributions
 fascinated+0.068
 intrigued+0.065
nam+0.064
cious+0.063
 Curious+0.063
Neg logit contributions
 dent-0.054
loo-0.054
 dri-0.053
ister-0.049
omo-0.048
Tokenarl
Token ID7063
Feature activation+1.84e-01

Pos logit contributions
 fascinated+0.059
 intrigued+0.058
mit+0.058
 Curious+0.058
ked+0.058
Neg logit contributions
 dent-0.017
 dri-0.016
 swollen-0.015
 flash-0.015
 collapse-0.014
Tokenings
Token ID654
Feature activation+4.77e-03

Pos logit contributions
aster+0.234
 collapse+0.226
prise+0.222
 ig+0.222
 dent+0.221
Neg logit contributions
mit-0.354
 fascinated-0.325
cious-0.324
ed-0.321
 intrigued-0.320
Token.
Token ID13
Feature activation+1.27e-02

Pos logit contributions
 dent+0.009
loo+0.008
 dri+0.008
 collapse+0.008
omo+0.008
Neg logit contributions
 fascinated-0.007
 intrigued-0.007
 Curious-0.007
 wanted-0.007
 curious-0.007
Token We
Token ID775
Feature activation-5.61e-02

Pos logit contributions
loo+0.022
 dent+0.022
 dri+0.022
ister+0.021
arl+0.021
Neg logit contributions
 fascinated-0.021
 intrigued-0.020
 Curious-0.019
nam-0.019
mit-0.019
Token miss
Token ID2051
Feature activation+1.09e-02

Pos logit contributions
 fascinated+0.105
 intrigued+0.101
 Curious+0.100
nam+0.097
cious+0.095
Neg logit contributions
 dri-0.087
 dent-0.084
loo-0.082
ister-0.079
 collapse-0.075
Token them
Token ID606
Feature activation-1.01e-01

Pos logit contributions
 dri+0.013
loo+0.013
 dent+0.013
ister+0.012
imi+0.011
Neg logit contributions
 fascinated-0.024
 intrigued-0.024
 Curious-0.023
mit-0.022
cious-0.022
Token have
Token ID423
Feature activation+6.11e-02

Pos logit contributions
 fascinated+0.112
 intrigued+0.108
 Curious+0.105
nam+0.100
mit+0.099
Neg logit contributions
loo-0.104
 dent-0.102
 dri-0.102
ister-0.099
imi-0.095
Token fun
Token ID1257
Feature activation+1.99e-02

Pos logit contributions
 dri+0.084
 dent+0.083
loo+0.083
ister+0.078
 collapse+0.075
Neg logit contributions
 fascinated-0.125
 intrigued-0.121
 Curious-0.118
mit-0.114
nam-0.112
Token,
Token ID11
Feature activation-2.29e-04

Pos logit contributions
 dri+0.034
loo+0.034
 dent+0.033
 collapse+0.033
arl+0.032
Neg logit contributions
 fascinated-0.034
 intrigued-0.032
 curious-0.030
mit-0.029
 Curious-0.029
Token my
Token ID616
Feature activation-3.61e-02

Pos logit contributions
 fascinated+0.000
 intrigued+0.000
 Curious+0.000
cious+0.000
mit+0.000
Neg logit contributions
loo-0.000
ister-0.000
 dri-0.000
 dent-0.000
imi-0.000
Token d
Token ID288
Feature activation-2.62e-02

Pos logit contributions
 fascinated+0.066
 Curious+0.064
 intrigued+0.064
nam+0.060
cious+0.059
Neg logit contributions
loo-0.060
ister-0.057
 dent-0.057
 dri-0.055
imi-0.052
Tokenarl
Token ID7063
Feature activation+1.84e-01

Pos logit contributions
 fascinated+0.067
 intrigued+0.066
mit+0.065
 Curious+0.064
 Lay+0.062
Neg logit contributions
 dri-0.019
 dent-0.019
 swollen-0.016
loo-0.015
 flash-0.015
Tokenings
Token ID654
Feature activation+4.40e-03

Pos logit contributions
 dri+0.195
 collapse+0.190
 ig+0.187
 dent+0.186
aster+0.186
Neg logit contributions
mit-0.387
 fascinated-0.379
 intrigued-0.373
 Curious-0.359
 subjects-0.358
Token?"
Token ID1701
Feature activation-7.86e-03

Pos logit contributions
 dent+0.006
 dri+0.006
loo+0.005
 collapse+0.005
ig+0.005
Neg logit contributions
 fascinated-0.008
 Curious-0.008
 intrigued-0.008
 wanted-0.008
 curious-0.008
Token she
Token ID673
Feature activation+1.26e-02

Pos logit contributions
 fascinated+0.014
 intrigued+0.014
mit+0.014
cious+0.013
 Curious+0.013
Neg logit contributions
ister-0.012
 dri-0.012
 dent-0.012
loo-0.012
imi-0.011
Token asks
Token ID7893
Feature activation-2.94e-02

Pos logit contributions
 dri+0.019
 dent+0.019
imi+0.018
ilo+0.017
ister+0.017
Neg logit contributions
mit-0.023
cious-0.023
 fascinated-0.023
nam-0.023
chard-0.022
Token.
Token ID13
Feature activation+1.55e-02

Pos logit contributions
 fascinated+0.067
 intrigued+0.065
 Curious+0.064
cious+0.062
mit+0.062
Neg logit contributions
ister-0.032
 dri-0.031
 dent-0.031
loo-0.031
 collapse-0.029
Token"
Token ID1
Feature activation-5.06e-02

Pos logit contributions
loo+0.036
 dent+0.035
 dri+0.034
ister+0.031
imi+0.031
Neg logit contributions
 fascinated-0.061
 intrigued-0.059
 Curious-0.058
nam-0.056
cious-0.055
TokenHello
Token ID15496
Feature activation-7.27e-02

Pos logit contributions
 fascinated+0.098
 intrigued+0.095
 Curious+0.091
mit+0.089
nam+0.088
Neg logit contributions
 dri-0.074
loo-0.073
 dent-0.072
ister-0.070
arl-0.067
Token,
Token ID11
Feature activation-5.31e-04

Pos logit contributions
 fascinated+0.157
 intrigued+0.152
 Curious+0.148
nam+0.144
cious+0.140
Neg logit contributions
loo-0.092
 dri-0.091
ister-0.089
 dent-0.088
imi-0.084
Token my
Token ID616
Feature activation-3.52e-02

Pos logit contributions
 fascinated+0.001
 intrigued+0.001
 Curious+0.001
nam+0.001
mit+0.001
Neg logit contributions
loo-0.001
 dent-0.001
 dri-0.001
ister-0.001
 collapse-0.001
Token d
Token ID288
Feature activation-2.56e-02

Pos logit contributions
 fascinated+0.061
 intrigued+0.059
nam+0.057
 Curious+0.056
cious+0.054
Neg logit contributions
loo-0.059
 dent-0.057
 dri-0.057
ister-0.056
omo-0.054
Tokenarl
Token ID7063
Feature activation+1.84e-01

Pos logit contributions
 fascinated+0.065
 Curious+0.064
mit+0.064
 intrigued+0.064
 Lay+0.062
Neg logit contributions
 dent-0.018
 dri-0.017
 swollen-0.016
 flash-0.013
 collapse-0.013
Tokenings
Token ID654
Feature activation+5.33e-03

Pos logit contributions
 collapse+0.214
 dent+0.212
 swollen+0.208
 ig+0.207
 snowball+0.205
Neg logit contributions
mit-0.363
 fascinated-0.339
 intrigued-0.336
cious-0.334
 Curious-0.333
Token.
Token ID13
Feature activation+1.04e-02

Pos logit contributions
 dent+0.010
loo+0.009
 collapse+0.009
ig+0.009
 injuries+0.009
Neg logit contributions
 Curious-0.008
 wanted-0.007
 fascinated-0.007
 curious-0.007
 Robert-0.007
Token I
Token ID314
Feature activation-6.73e-03

Pos logit contributions
loo+0.017
arl+0.016
 dri+0.016
 dent+0.015
ister+0.015
Neg logit contributions
 fascinated-0.019
 intrigued-0.018
mit-0.017
nam-0.016
cious-0.016
Token'm
Token ID1101
Feature activation+8.07e-03

Pos logit contributions
 fascinated+0.013
 intrigued+0.012
 Curious+0.012
nam+0.012
mit+0.011
Neg logit contributions
 dri-0.010
loo-0.010
 dent-0.010
ister-0.010
 collapse-0.009
Token so
Token ID523
Feature activation-3.00e-02

Pos logit contributions
ister+0.012
 dri+0.012
loo+0.012
 dent+0.011
imi+0.011
Neg logit contributions
 Curious-0.015
 fascinated-0.015
 intrigued-0.014
mit-0.014
cious-0.014
TokenSpot
Token ID32565
Feature activation-1.95e-02

Pos logit contributions
loo+0.041
 dri+0.040
 dent+0.039
 collapse+0.036
imi+0.035
Neg logit contributions
 Curious-0.043
 fascinated-0.043
 intrigued-0.042
cious-0.041
 curious-0.041
Token grow
Token ID1663
Feature activation+1.60e-02

Pos logit contributions
nam+0.041
 fascinated+0.040
 intrigued+0.038
 Curious+0.037
 Lay+0.037
Neg logit contributions
 dri-0.027
 dent-0.025
loo-0.023
 burns-0.022
ister-0.022
Tokenls
Token ID7278
Feature activation-2.91e-02

Pos logit contributions
loo+0.027
 dri+0.027
 dent+0.026
ister+0.026
imi+0.025
Neg logit contributions
 fascinated-0.028
 intrigued-0.027
 Curious-0.026
nam-0.025
mit-0.025
Token and
Token ID290
Feature activation-7.99e-03

Pos logit contributions
 fascinated+0.054
 intrigued+0.052
 Curious+0.051
nam+0.051
mit+0.049
Neg logit contributions
 dent-0.044
 dri-0.044
loo-0.044
ister-0.042
 collapse-0.041
Token sn
Token ID3013
Feature activation+3.21e-02

Pos logit contributions
 fascinated+0.015
 intrigued+0.014
 Curious+0.014
nam+0.013
mit+0.013
Neg logit contributions
 dri-0.013
 dent-0.012
loo-0.012
ister-0.012
 collapse-0.011
Tokenarl
Token ID7063
Feature activation+1.84e-01

Pos logit contributions
 dent+0.049
 flash+0.049
 swollen+0.048
 bolt+0.047
 dri+0.046
Neg logit contributions
mit-0.051
chard-0.050
 fascinated-0.050
 intrigued-0.049
 Curious-0.049
Tokens
Token ID82
Feature activation+4.63e-03

Pos logit contributions
prise+0.300
loo+0.284
aster+0.283
 dent+0.280
 ig+0.277
Neg logit contributions
 fascinated-0.301
 intrigued-0.290
mit-0.283
 Curious-0.276
cious-0.275
Token.
Token ID13
Feature activation+1.82e-02

Pos logit contributions
 dent+0.006
 dri+0.006
loo+0.006
ister+0.006
 collapse+0.006
Neg logit contributions
 fascinated-0.009
 intrigued-0.009
 Curious-0.009
mit-0.009
cious-0.009
Token He
Token ID679
Feature activation+3.30e-02

Pos logit contributions
 dent+0.031
 dri+0.030
loo+0.030
ister+0.029
arl+0.028
Neg logit contributions
 fascinated-0.031
 intrigued-0.030
nam-0.029
mit-0.029
cious-0.028
Token tries
Token ID8404
Feature activation-2.26e-02

Pos logit contributions
 dri+0.051
 dent+0.048
loo+0.046
ister+0.043
imi+0.043
Neg logit contributions
 fascinated-0.063
 intrigued-0.060
nam-0.060
 Curious-0.059
cious-0.058
Token to
Token ID284
Feature activation+6.26e-03

Pos logit contributions
 fascinated+0.033
 intrigued+0.032
 Curious+0.031
nam+0.029
mit+0.029
Neg logit contributions
loo-0.044
 dent-0.043
 dri-0.043
ister-0.043
imi-0.041
TokenNot
Token ID3673
Feature activation+4.24e-02

Pos logit contributions
 fascinated+0.077
 intrigued+0.073
 Curious+0.071
nam+0.068
mit+0.067
Neg logit contributions
loo-0.089
 dent-0.088
 dri-0.087
ister-0.085
imi-0.082
Token now
Token ID783
Feature activation+8.12e-02

Pos logit contributions
loo+0.064
 dri+0.063
 dent+0.063
ister+0.060
ay+0.060
Neg logit contributions
 fascinated-0.080
 intrigued-0.077
 curious-0.074
 Curious-0.073
nam-0.073
Token,
Token ID11
Feature activation-7.09e-05

Pos logit contributions
loo+0.130
 dent+0.125
 dri+0.124
ister+0.116
imi+0.113
Neg logit contributions
 fascinated-0.149
 intrigued-0.144
 Curious-0.140
nam-0.134
mit-0.134
Token my
Token ID616
Feature activation-3.61e-02

Pos logit contributions
 fascinated+0.000
 intrigued+0.000
 Curious+0.000
nam+0.000
 curious+0.000
Neg logit contributions
loo-0.000
ister-0.000
 dent-0.000
 dri-0.000
iren-0.000
Token d
Token ID288
Feature activation-2.49e-02

Pos logit contributions
 fascinated+0.058
 intrigued+0.056
 curious+0.055
 Curious+0.054
mit+0.054
Neg logit contributions
 collapse-0.061
loo-0.060
 dent-0.060
ister-0.059
iren-0.057
Tokenarl
Token ID7063
Feature activation+1.84e-01

Pos logit contributions
ked+0.049
 Curious+0.046
mit+0.045
 fascinated+0.045
 intrigued+0.045
Neg logit contributions
 swollen-0.027
 dent-0.025
 flash-0.025
 dri-0.023
 mistake-0.022
Tokenings
Token ID654
Feature activation+6.30e-03

Pos logit contributions
aster+0.243
prise+0.242
 collapse+0.238
 ig+0.236
web+0.235
Neg logit contributions
mit-0.336
ed-0.310
 subjects-0.303
 fascinated-0.300
 intrigued-0.299
Token,
Token ID11
Feature activation+1.09e-04

Pos logit contributions
 dent+0.013
omo+0.013
ig+0.013
 ig+0.012
loo+0.012
Neg logit contributions
 wanted-0.007
ly-0.006
 Curious-0.006
 liked-0.005
 curious-0.005
Token it
Token ID340
Feature activation+4.69e-03

Pos logit contributions
loo+0.000
 dent+0.000
ister+0.000
 dri+0.000
arl+0.000
Neg logit contributions
 fascinated-0.000
 intrigued-0.000
 Curious-0.000
nam-0.000
 curious-0.000
Token is
Token ID318
Feature activation+2.67e-02

Pos logit contributions
loo+0.009
 dent+0.009
ister+0.009
arl+0.008
 dri+0.008
Neg logit contributions
 fascinated-0.007
 intrigued-0.007
 Curious-0.007
 curious-0.006
 wanted-0.006
Token too
Token ID1165
Feature activation+2.45e-02

Pos logit contributions
loo+0.051
ister+0.050
iren+0.046
aid+0.046
 dent+0.045
Neg logit contributions
 fascinated-0.042
 curious-0.040
nam-0.039
 intrigued-0.039
chard-0.036
Token you
Token ID345
Feature activation-5.81e-02

Pos logit contributions
loo+0.047
 dri+0.046
 dent+0.045
ister+0.044
imi+0.041
Neg logit contributions
 fascinated-0.097
 intrigued-0.094
 Curious-0.093
nam-0.089
mit-0.089
Token too
Token ID1165
Feature activation+2.50e-02

Pos logit contributions
 fascinated+0.098
 curious+0.094
 intrigued+0.093
nam+0.093
 Curious+0.091
Neg logit contributions
loo-0.104
 dri-0.100
 dent-0.096
 collapse-0.095
ister-0.093
Token,
Token ID11
Feature activation+6.41e-06

Pos logit contributions
loo+0.038
 dri+0.037
 dent+0.037
ister+0.036
imi+0.034
Neg logit contributions
 fascinated-0.048
 intrigued-0.047
 Curious-0.045
nam-0.043
mit-0.043
Token my
Token ID616
Feature activation-3.42e-02

Feature not active on prev token; no predictions were affected.
Token d
Token ID288
Feature activation-2.46e-02

Pos logit contributions
 fascinated+0.060
 intrigued+0.057
 Curious+0.056
nam+0.056
cious+0.053
Neg logit contributions
loo-0.058
 dent-0.057
 dri-0.057
ister-0.055
omo-0.053
Tokenarl
Token ID7063
Feature activation+1.83e-01

Pos logit contributions
 fascinated+0.063
mit+0.062
 intrigued+0.062
 Curious+0.061
ked+0.060
Neg logit contributions
 dent-0.016
 dri-0.016
 flash-0.014
 swollen-0.013
loo-0.013
Tokenings
Token ID654
Feature activation+5.59e-03

Pos logit contributions
aster+0.197
 collapse+0.194
 ig+0.193
 dri+0.192
 dent+0.192
Neg logit contributions
mit-0.383
 fascinated-0.365
 intrigued-0.359
cious-0.356
 Curious-0.349
Token.
Token ID13
Feature activation+1.60e-02

Pos logit contributions
 dent+0.011
loo+0.010
 collapse+0.010
 dri+0.010
omo+0.010
Neg logit contributions
 Curious-0.007
 wanted-0.007
 fascinated-0.007
 curious-0.007
 intrigued-0.007
Token But
Token ID887
Feature activation-1.05e-02

Pos logit contributions
loo+0.030
 dent+0.030
 dri+0.030
ister+0.029
imi+0.028
Neg logit contributions
 fascinated-0.024
 intrigued-0.023
 Curious-0.022
nam-0.021
mit-0.021
Token you
Token ID345
Feature activation-5.77e-02

Pos logit contributions
 fascinated+0.020
 intrigued+0.019
nam+0.019
 Curious+0.019
mit+0.018
Neg logit contributions
 dent-0.015
loo-0.015
 dri-0.015
ister-0.015
 collapse-0.014
Token have
Token ID423
Feature activation+6.56e-02

Pos logit contributions
 fascinated+0.109
 intrigued+0.105
 Curious+0.102
nam+0.097
mit+0.097
Neg logit contributions
loo-0.089
 dent-0.087
 dri-0.086
ister-0.085
imi-0.080
Token you
Token ID345
Feature activation-5.98e-02

Pos logit contributions
 fascinated+0.011
 intrigued+0.010
 Curious+0.010
mit+0.010
cious+0.010
Neg logit contributions
imi-0.002
ister-0.002
 dri-0.002
w-0.002
loo-0.002
Token okay
Token ID8788
Feature activation+9.34e-03

Pos logit contributions
 fascinated+0.120
 intrigued+0.116
 Curious+0.114
nam+0.113
mit+0.112
Neg logit contributions
 dri-0.081
 dent-0.080
loo-0.078
ister-0.076
imi-0.073
Token,
Token ID11
Feature activation+5.18e-04

Pos logit contributions
 dri+0.016
loo+0.016
 dent+0.016
aid+0.015
 collapse+0.015
Neg logit contributions
 fascinated-0.015
 intrigued-0.015
 curious-0.014
 Curious-0.014
mit-0.013
Token my
Token ID616
Feature activation-3.63e-02

Pos logit contributions
loo+0.001
 dent+0.001
 dri+0.001
ister+0.001
 collapse+0.001
Neg logit contributions
 fascinated-0.001
 Curious-0.001
 intrigued-0.001
 curious-0.001
nam-0.001
Token d
Token ID288
Feature activation-2.57e-02

Pos logit contributions
 fascinated+0.062
 intrigued+0.060
 Curious+0.058
nam+0.056
mit+0.055
Neg logit contributions
loo-0.062
 dent-0.061
 dri-0.061
ister-0.060
imi-0.057
Tokenarl
Token ID7063
Feature activation+1.83e-01

Pos logit contributions
ked+0.055
 Curious+0.054
mit+0.054
 fascinated+0.053
 intrigued+0.052
Neg logit contributions
 dri-0.024
 dent-0.023
 flash-0.022
 swollen-0.021
 stitch-0.020
Tokenings
Token ID654
Feature activation+4.24e-03

Pos logit contributions
 ig+0.209
aster+0.208
prise+0.204
 collapse+0.201
web+0.198
Neg logit contributions
mit-0.367
 fascinated-0.346
 intrigued-0.340
 Curious-0.338
ked-0.337
Token?"
Token ID1701
Feature activation-1.08e-02

Pos logit contributions
ig+0.006
 dent+0.006
igs+0.006
 collapse+0.006
 ig+0.006
Neg logit contributions
 wanted-0.007
 Curious-0.007
 liked-0.006
 met-0.006
 curious-0.006
Token she
Token ID673
Feature activation+1.38e-02

Pos logit contributions
 fascinated+0.018
 intrigued+0.017
mit+0.017
 Curious+0.017
cious+0.017
Neg logit contributions
 dri-0.019
 dent-0.019
loo-0.018
ister-0.018
imi-0.017
Token asked
Token ID1965
Feature activation-2.79e-02

Pos logit contributions
 dri+0.026
 dent+0.026
 swollen+0.025
ister+0.025
 band+0.024
Neg logit contributions
 fascinated-0.019
mit-0.019
 intrigued-0.018
cious-0.018
 Curious-0.018
Token.
Token ID13
Feature activation+1.26e-02

Pos logit contributions
 fascinated+0.070
 intrigued+0.068
 Curious+0.066
mit+0.066
cious+0.065
Neg logit contributions
 dri-0.024
 dent-0.024
ister-0.023
loo-0.021
 collapse-0.021
Token to
Token ID284
Feature activation+8.07e-03

Pos logit contributions
 fascinated+0.039
 Curious+0.039
 intrigued+0.038
nam+0.037
mit+0.037
Neg logit contributions
 dri-0.037
 dent-0.036
loo-0.034
ister-0.033
 collapse-0.033
Token grow
Token ID1663
Feature activation+1.74e-02

Pos logit contributions
loo+0.014
 dent+0.013
 dri+0.013
ister+0.013
imi+0.013
Neg logit contributions
 fascinated-0.014
 intrigued-0.013
 Curious-0.013
nam-0.012
mit-0.012
Tokenl
Token ID75
Feature activation-6.49e-02

Pos logit contributions
 dri+0.025
loo+0.024
 dent+0.023
ister+0.022
 collapse+0.021
Neg logit contributions
 fascinated-0.035
 intrigued-0.033
 Curious-0.033
nam-0.033
mit-0.032
Token and
Token ID290
Feature activation-9.94e-03

Pos logit contributions
 fascinated+0.122
 intrigued+0.117
 Curious+0.113
nam+0.108
mit+0.108
Neg logit contributions
loo-0.100
 dent-0.099
 dri-0.098
ister-0.094
imi-0.090
Token sn
Token ID3013
Feature activation+3.43e-02

Pos logit contributions
 fascinated+0.019
 intrigued+0.018
 Curious+0.017
nam+0.017
mit+0.017
Neg logit contributions
loo-0.015
 dri-0.015
 dent-0.015
ister-0.014
 collapse-0.014
Tokenarl
Token ID7063
Feature activation+1.83e-01

Pos logit contributions
 dent+0.054
 dri+0.050
loo+0.049
imi+0.048
 flash+0.047
Neg logit contributions
 fascinated-0.064
 intrigued-0.062
 Curious-0.060
mit-0.059
cious-0.058
Token to
Token ID284
Feature activation+7.96e-03

Pos logit contributions
loo+0.296
 dent+0.290
 dri+0.285
ister+0.273
imi+0.269
Neg logit contributions
 fascinated-0.327
 intrigued-0.314
 Curious-0.298
cious-0.293
mit-0.291
Token try
Token ID1949
Feature activation+4.56e-02

Pos logit contributions
 dri+0.012
 dent+0.012
loo+0.012
ister+0.011
 collapse+0.011
Neg logit contributions
 fascinated-0.015
 intrigued-0.014
 Curious-0.014
nam-0.014
mit-0.014
Token and
Token ID290
Feature activation-1.03e-02

Pos logit contributions
 dri+0.069
 dent+0.067
loo+0.066
ister+0.065
imi+0.062
Neg logit contributions
 fascinated-0.085
 intrigued-0.082
 Curious-0.082
nam-0.077
mit-0.077
Token scare
Token ID19437
Feature activation-2.78e-02

Pos logit contributions
 fascinated+0.019
 intrigued+0.018
 Curious+0.018
nam+0.017
mit+0.017
Neg logit contributions
loo-0.016
ister-0.016
 dent-0.016
 dri-0.015
imi-0.015
Token her
Token ID607
Feature activation-4.89e-03

Pos logit contributions
 fascinated+0.054
 intrigued+0.053
 Curious+0.051
nam+0.049
mit+0.048
Neg logit contributions
 dri-0.041
 dent-0.039
loo-0.038
ister-0.038
 collapse-0.036
Token you
Token ID345
Feature activation-5.86e-02

Pos logit contributions
loo+0.045
 dent+0.044
 dri+0.044
ister+0.042
 collapse+0.039
Neg logit contributions
 fascinated-0.083
 intrigued-0.081
 Curious-0.079
nam-0.077
mit-0.076
Token too
Token ID1165
Feature activation+2.17e-02

Pos logit contributions
 fascinated+0.104
 intrigued+0.099
nam+0.097
 Curious+0.096
 curious+0.095
Neg logit contributions
loo-0.100
 dri-0.096
 dent-0.094
ister-0.090
 collapse-0.090
Token,
Token ID11
Feature activation-1.14e-03

Pos logit contributions
loo+0.033
 dri+0.033
 dent+0.032
ister+0.031
imi+0.029
Neg logit contributions
 fascinated-0.042
 intrigued-0.040
 Curious-0.039
nam-0.037
 curious-0.037
Token my
Token ID616
Feature activation-3.69e-02

Pos logit contributions
 fascinated+0.002
 intrigued+0.002
 Curious+0.002
nam+0.002
mit+0.002
Neg logit contributions
loo-0.002
 dent-0.002
 dri-0.002
ister-0.002
imi-0.002
Token d
Token ID288
Feature activation-2.66e-02

Pos logit contributions
 fascinated+0.062
 intrigued+0.059
 Curious+0.057
nam+0.055
cious+0.054
Neg logit contributions
loo-0.065
 dent-0.064
 dri-0.064
ister-0.063
imi-0.060
Tokenarl
Token ID7063
Feature activation+1.83e-01

Pos logit contributions
 fascinated+0.067
mit+0.067
 intrigued+0.066
 Curious+0.066
ked+0.065
Neg logit contributions
 dent-0.018
 dri-0.018
 flash-0.015
 swollen-0.015
loo-0.015
Tokenings
Token ID654
Feature activation+2.94e-03

Pos logit contributions
aster+0.182
 dri+0.181
 dent+0.180
 collapse+0.178
loo+0.175
Neg logit contributions
mit-0.398
 fascinated-0.389
 intrigued-0.382
 Curious-0.375
cious-0.374
Token,"
Token ID553
Feature activation-1.05e-02

Pos logit contributions
 dent+0.005
loo+0.005
 collapse+0.005
ig+0.005
 dri+0.005
Neg logit contributions
 fascinated-0.004
 Curious-0.004
 intrigued-0.004
 wanted-0.004
 curious-0.004
Token she
Token ID673
Feature activation+1.17e-02

Pos logit contributions
 fascinated+0.017
 intrigued+0.017
 Curious+0.016
nam+0.015
mit+0.015
Neg logit contributions
loo-0.019
 dri-0.018
 dent-0.018
ister-0.018
imi-0.017
Token said
Token ID531
Feature activation-1.30e-02

Pos logit contributions
 dri+0.018
 dent+0.016
ister+0.015
omo+0.015
loo+0.015
Neg logit contributions
 fascinated-0.023
nam-0.022
 intrigued-0.022
mit-0.021
 Curious-0.021
Token.
Token ID13
Feature activation+9.92e-03

Pos logit contributions
 fascinated+0.028
 intrigued+0.027
 Curious+0.026
nam+0.025
mit+0.025
Neg logit contributions
 dri-0.016
loo-0.016
 dent-0.016
ister-0.015
arl-0.014
Token your
Token ID534
Feature activation-2.41e-02

Pos logit contributions
loo+0.038
 dent+0.038
 dri+0.038
ister+0.036
imi+0.034
Neg logit contributions
 fascinated-0.048
 intrigued-0.046
 Curious-0.045
nam-0.043
mit-0.043
Token fault
Token ID8046
Feature activation+3.10e-02

Pos logit contributions
 fascinated+0.044
 intrigued+0.043
 Curious+0.041
mit+0.040
nam+0.039
Neg logit contributions
loo-0.038
 dri-0.037
 dent-0.037
ister-0.036
imi-0.034
Token,
Token ID11
Feature activation-2.27e-03

Pos logit contributions
loo+0.043
 dri+0.042
 dent+0.042
ister+0.040
 collapse+0.038
Neg logit contributions
 fascinated-0.063
 intrigued-0.061
 Curious-0.059
nam-0.057
mit-0.057
Token my
Token ID616
Feature activation-3.63e-02

Pos logit contributions
 fascinated+0.004
 intrigued+0.004
 Curious+0.004
nam+0.004
mit+0.004
Neg logit contributions
loo-0.004
 dent-0.004
 dri-0.004
ister-0.004
imi-0.003
Token d
Token ID288
Feature activation-2.71e-02

Pos logit contributions
 fascinated+0.056
 intrigued+0.053
 Curious+0.051
nam+0.048
mit+0.048
Neg logit contributions
loo-0.068
 dent-0.067
 dri-0.067
ister-0.065
 collapse-0.063
Tokenarl
Token ID7063
Feature activation+1.83e-01

Pos logit contributions
ked+0.061
mit+0.060
 fascinated+0.058
 intrigued+0.058
 Curious+0.058
Neg logit contributions
 flash-0.023
 dent-0.023
 dri-0.022
loo-0.020
 collapse-0.020
Tokenings
Token ID654
Feature activation+1.78e-03

Pos logit contributions
prise+0.216
 collapse+0.214
aster+0.213
web+0.209
 ig+0.207
Neg logit contributions
mit-0.358
 fascinated-0.342
cious-0.339
 intrigued-0.337
ked-0.331
Token.
Token ID13
Feature activation+1.25e-02

Pos logit contributions
loo+0.003
 dent+0.003
ig+0.003
omo+0.003
 collapse+0.003
Neg logit contributions
 fascinated-0.003
 Curious-0.003
 wanted-0.002
 intrigued-0.002
 curious-0.002
Token You
Token ID921
Feature activation-9.24e-02

Pos logit contributions
loo+0.023
 dent+0.023
 dri+0.022
ister+0.022
arl+0.021
Neg logit contributions
 fascinated-0.020
 intrigued-0.019
 Curious-0.018
nam-0.018
mit-0.017
Token didn
Token ID1422
Feature activation-5.84e-03

Pos logit contributions
 fascinated+0.187
 intrigued+0.180
 Curious+0.177
cious+0.171
nam+0.171
Neg logit contributions
 dri-0.132
 dent-0.127
loo-0.124
ister-0.122
 collapse-0.114
Token't
Token ID470
Feature activation+1.93e-02

Pos logit contributions
 Curious+0.007
 fascinated+0.007
 intrigued+0.007
 Lay+0.006
 Int+0.006
Neg logit contributions
loo-0.012
 dri-0.012
 dent-0.012
ig-0.011
arl-0.011
Token'm
Token ID1101
Feature activation+4.30e-03

Pos logit contributions
loo+0.026
 dent+0.026
 dri+0.025
ister+0.025
imi+0.024
Neg logit contributions
 fascinated-0.026
 intrigued-0.025
 Curious-0.024
nam-0.023
mit-0.023
Token sorry
Token ID7926
Feature activation+1.12e-02

Pos logit contributions
 dent+0.006
 dri+0.006
loo+0.006
ister+0.006
imi+0.005
Neg logit contributions
 fascinated-0.008
 Curious-0.008
 intrigued-0.008
mit-0.008
cious-0.008
Token,
Token ID11
Feature activation-5.46e-03

Pos logit contributions
loo+0.016
 dent+0.016
 dri+0.016
ister+0.015
 collapse+0.014
Neg logit contributions
 fascinated-0.022
 intrigued-0.021
 Curious-0.021
nam-0.020
mit-0.020
Token my
Token ID616
Feature activation-3.76e-02

Pos logit contributions
 fascinated+0.010
 intrigued+0.009
 Curious+0.009
nam+0.009
mit+0.009
Neg logit contributions
loo-0.009
 dri-0.009
 dent-0.009
ister-0.008
imi-0.008
Token d
Token ID288
Feature activation-2.69e-02

Pos logit contributions
 fascinated+0.073
 intrigued+0.069
nam+0.067
 Curious+0.067
chard+0.066
Neg logit contributions
 dent-0.058
loo-0.058
 dri-0.057
omo-0.055
imi-0.052
Tokenarl
Token ID7063
Feature activation+1.83e-01

Pos logit contributions
 fascinated+0.064
mit+0.063
ked+0.063
 Curious+0.062
 intrigued+0.062
Neg logit contributions
 dent-0.019
 dri-0.018
 flash-0.017
 swollen-0.017
loo-0.016
Tokenings
Token ID654
Feature activation+2.16e-03

Pos logit contributions
prise+0.215
aster+0.215
 ig+0.214
 collapse+0.211
 dent+0.209
Neg logit contributions
mit-0.361
cious-0.337
 fascinated-0.335
 intrigued-0.331
ed-0.327
Token.
Token ID13
Feature activation+6.67e-03

Pos logit contributions
 dent+0.004
loo+0.004
 dri+0.004
 collapse+0.004
omo+0.004
Neg logit contributions
 fascinated-0.003
 Curious-0.003
 wanted-0.003
 intrigued-0.003
 curious-0.003
Token The
Token ID383
Feature activation-3.68e-03

Pos logit contributions
loo+0.012
 dent+0.012
 dri+0.012
ister+0.011
arl+0.011
Neg logit contributions
 fascinated-0.011
 intrigued-0.010
 Curious-0.010
nam-0.010
mit-0.009
Token tornado
Token ID33718
Feature activation+1.12e-02

Pos logit contributions
 fascinated+0.007
 intrigued+0.007
 Curious+0.007
nam+0.007
mit+0.007
Neg logit contributions
loo-0.005
 dri-0.005
 dent-0.005
ister-0.005
imi-0.005
Token took
Token ID1718
Feature activation+1.94e-02

Pos logit contributions
loo+0.013
 dent+0.013
ister+0.013
 dri+0.013
arl+0.012
Neg logit contributions
 fascinated-0.025
 intrigued-0.024
 Curious-0.023
mit-0.022
nam-0.022
Token
Token ID95
Feature activation-2.43e-02

Pos logit contributions
 fascinated+0.079
 intrigued+0.076
 Curious+0.075
nam+0.073
 youngest+0.071
Neg logit contributions
ister-0.040
loo-0.037
 dri-0.035
arl-0.034
 dent-0.034
Token
Token ID26391
Feature activation-6.30e-02

Pos logit contributions
 fascinated+0.041
 intrigued+0.039
 Curious+0.038
nam+0.037
mit+0.036
Neg logit contributions
loo-0.042
 dent-0.042
 dri-0.041
ister-0.041
imi-0.039
Token
Token ID129
Feature activation-7.82e-03

Pos logit contributions
 fascinated+0.110
nam+0.106
 intrigued+0.106
 Curious+0.102
mit+0.097
Neg logit contributions
ister-0.102
loo-0.100
 dent-0.099
 dri-0.097
omo-0.095
Token
Token ID241
Feature activation-6.63e-02

Pos logit contributions
 fascinated+0.013
 intrigued+0.013
 Curious+0.012
nam+0.012
mit+0.012
Neg logit contributions
loo-0.013
 dent-0.013
 dri-0.013
ister-0.013
imi-0.012
TokenHello
Token ID15496
Feature activation-7.52e-02

Pos logit contributions
 fascinated+0.121
 intrigued+0.118
 Curious+0.114
nam+0.113
mit+0.111
Neg logit contributions
loo-0.100
 dent-0.099
 dri-0.098
ister-0.098
imi-0.091
Token Princess
Token ID8449
Feature activation-1.91e-01

Pos logit contributions
 fascinated+0.171
 intrigued+0.166
 Curious+0.163
nam+0.158
mit+0.155
Neg logit contributions
loo-0.085
 dri-0.083
 dent-0.083
ister-0.080
imi-0.074
Token!
Token ID0
Feature activation-8.16e-03

Pos logit contributions
 fascinated+0.425
 intrigued+0.408
 Curious+0.400
nam+0.393
cious+0.373
Neg logit contributions
ister-0.240
loo-0.226
 dri-0.218
imi-0.215
 dent-0.202
Token I
Token ID314
Feature activation-1.49e-02

Pos logit contributions
 fascinated+0.013
 intrigued+0.012
nam+0.012
cious+0.012
mit+0.011
Neg logit contributions
ister-0.013
loo-0.013
arl-0.013
O-0.013
imi-0.012
Token am
Token ID716
Feature activation+3.25e-02

Pos logit contributions
 fascinated+0.029
 intrigued+0.028
nam+0.027
 Curious+0.027
cious+0.026
Neg logit contributions
 dri-0.021
loo-0.021
 dent-0.021
ister-0.020
 collapse-0.019
Token here
Token ID994
Feature activation-6.72e-02

Pos logit contributions
 dri+0.050
ister+0.049
loo+0.048
 dent+0.048
imi+0.047
Neg logit contributions
 fascinated-0.056
 Curious-0.055
 intrigued-0.053
cious-0.053
nam-0.052
Token to
Token ID284
Feature activation+5.65e-03

Pos logit contributions
 fascinated+0.127
 intrigued+0.122
 Curious+0.120
nam+0.115
mit+0.113
Neg logit contributions
loo-0.101
 dri-0.099
 dent-0.099
ister-0.097
imi-0.091
Token.
Token ID13
Feature activation+1.18e-02

Pos logit contributions
loo+0.061
 dent+0.060
 dri+0.060
ister+0.058
imi+0.055
Neg logit contributions
 fascinated-0.086
 intrigued-0.083
 Curious-0.081
nam-0.078
mit-0.077
Token
Token ID220
Feature activation-3.11e-02

Pos logit contributions
 dent+0.019
 dri+0.018
loo+0.018
ister+0.018
omo+0.018
Neg logit contributions
 fascinated-0.021
 intrigued-0.020
mit-0.019
nam-0.019
cious-0.019
Token
Token ID198
Feature activation+1.55e-02

Pos logit contributions
 fascinated+0.057
 intrigued+0.055
 Curious+0.053
nam+0.053
mit+0.052
Neg logit contributions
loo-0.048
 dri-0.047
 dent-0.047
ister-0.047
imi-0.044
Token
Token ID198
Feature activation+1.39e-02

Pos logit contributions
loo+0.029
 dent+0.029
 dri+0.028
ister+0.027
 collapse+0.026
Neg logit contributions
 fascinated-0.024
 intrigued-0.023
 Curious-0.023
cious-0.021
nam-0.021
TokenThe
Token ID464
Feature activation+4.58e-03

Pos logit contributions
loo+0.020
 dent+0.019
 dri+0.019
ister+0.017
imi+0.017
Neg logit contributions
 fascinated-0.028
 intrigued-0.027
 Curious-0.027
cious-0.025
nam-0.025
Token Princess
Token ID8449
Feature activation-1.89e-01

Pos logit contributions
loo+0.008
 dri+0.008
 dent+0.008
ister+0.008
imi+0.007
Neg logit contributions
 fascinated-0.007
 intrigued-0.007
 Curious-0.007
nam-0.007
cious-0.006
Token never
Token ID1239
Feature activation+6.11e-02

Pos logit contributions
nam+0.346
 fascinated+0.328
 Curious+0.319
 intrigued+0.310
 Fasc+0.289
Neg logit contributions
 dri-0.301
ister-0.274
loo-0.264
web-0.262
 band-0.258
Token found
Token ID1043
Feature activation-9.28e-03

Pos logit contributions
 dri+0.113
 dent+0.101
 collapse+0.099
loo+0.099
 flash+0.098
Neg logit contributions
 fascinated-0.097
nam-0.094
 intrigued-0.094
 Curious-0.091
cious-0.087
Token out
Token ID503
Feature activation-3.27e-02

Pos logit contributions
 fascinated+0.019
 intrigued+0.019
 Curious+0.018
cious+0.017
mit+0.017
Neg logit contributions
 dri-0.012
 dent-0.012
loo-0.011
 collapse-0.011
 powder-0.011
Token what
Token ID644
Feature activation-6.27e-02

Pos logit contributions
 fascinated+0.068
 intrigued+0.066
 Curious+0.065
nam+0.062
mit+0.062
Neg logit contributions
loo-0.042
 dri-0.042
 dent-0.041
ister-0.040
 collapse-0.038
Token the
Token ID262
Feature activation-1.05e-02

Pos logit contributions
 fascinated+0.114
 intrigued+0.110
 Curious+0.109
mit+0.105
nam+0.103
Neg logit contributions
 dri-0.099
loo-0.096
ister-0.094
 dent-0.093
imi-0.088
Token
Token ID198
Feature activation+2.00e-02

Pos logit contributions
 fascinated+0.055
 intrigued+0.053
 Curious+0.052
nam+0.052
mit+0.051
Neg logit contributions
loo-0.041
ister-0.040
 dri-0.039
 dent-0.039
arl-0.038
Token
Token ID198
Feature activation+1.85e-02

Pos logit contributions
loo+0.030
 dri+0.030
 dent+0.030
ister+0.030
imi+0.028
Neg logit contributions
 fascinated-0.037
 intrigued-0.035
 Curious-0.034
nam-0.034
mit-0.034
TokenOne
Token ID3198
Feature activation+2.45e-02

Pos logit contributions
loo+0.025
 dent+0.025
 dri+0.025
ister+0.024
imi+0.022
Neg logit contributions
 fascinated-0.038
 intrigued-0.037
 Curious-0.036
nam-0.034
cious-0.034
Token day
Token ID1110
Feature activation+3.73e-02

Pos logit contributions
 dent+0.041
 dri+0.039
loo+0.038
ister+0.037
arl+0.036
Neg logit contributions
 fascinated-0.045
 intrigued-0.043
 Curious-0.042
mit-0.040
 curious-0.040
Token,
Token ID11
Feature activation-1.00e-02

Pos logit contributions
loo+0.028
ister+0.028
 dri+0.028
ww+0.025
O+0.025
Neg logit contributions
 fascinated-0.097
 intrigued-0.094
nam-0.091
 Curious-0.091
cious-0.087
Token Princess
Token ID8449
Feature activation-1.89e-01

Pos logit contributions
 fascinated+0.021
 intrigued+0.020
 Curious+0.020
cious+0.019
mit+0.018
Neg logit contributions
 dri-0.012
omo-0.012
ister-0.012
O-0.011
loo-0.011
Token wanted
Token ID2227
Feature activation-2.82e-02

Pos logit contributions
nam+0.347
 fascinated+0.338
 intrigued+0.317
 Curious+0.307
 mesmer+0.294
Neg logit contributions
 dri-0.278
ister-0.271
?-0.265
imi-0.260
ww-0.259
Token to
Token ID284
Feature activation+3.98e-03

Pos logit contributions
 fascinated+0.051
 intrigued+0.049
 Curious+0.048
nam+0.045
mit+0.045
Neg logit contributions
 dri-0.046
loo-0.045
ister-0.044
 dent-0.044
imi-0.042
Token go
Token ID467
Feature activation+7.64e-02

Pos logit contributions
loo+0.006
 dent+0.006
 dri+0.006
ister+0.006
imi+0.006
Neg logit contributions
 fascinated-0.007
 intrigued-0.007
 Curious-0.007
nam-0.006
mit-0.006
Token on
Token ID319
Feature activation+2.13e-02

Pos logit contributions
 dent+0.117
loo+0.115
 dri+0.115
ister+0.111
imi+0.106
Neg logit contributions
 fascinated-0.143
 intrigued-0.138
 Curious-0.135
mit-0.129
nam-0.127
Token an
Token ID281
Feature activation-3.30e-02

Pos logit contributions
 dri+0.036
 dent+0.035
ister+0.035
loo+0.035
imi+0.033
Neg logit contributions
 fascinated-0.036
 intrigued-0.035
mit-0.033
 Curious-0.033
cious-0.032
Token.
Token ID13
Feature activation+2.89e-03

Pos logit contributions
 dri+0.048
loo+0.047
 dent+0.046
ister+0.046
 collapse+0.042
Neg logit contributions
 fascinated-0.081
 intrigued-0.079
 Curious-0.078
nam-0.076
mit-0.074
Token From
Token ID3574
Feature activation-2.63e-02

Pos logit contributions
loo+0.004
 dri+0.004
ay+0.004
ister+0.004
arl+0.004
Neg logit contributions
 fascinated-0.005
mit-0.005
nam-0.005
 intrigued-0.005
cious-0.005
Token then
Token ID788
Feature activation+2.25e-02

Pos logit contributions
 fascinated+0.056
 intrigued+0.054
 Curious+0.053
nam+0.051
cious+0.050
Neg logit contributions
loo-0.034
 dri-0.033
 dent-0.033
ister-0.032
imi-0.030
Token on
Token ID319
Feature activation+1.98e-02

Pos logit contributions
 dent+0.040
 dri+0.038
omo+0.038
loo+0.038
ister+0.035
Neg logit contributions
 fascinated-0.038
 Curious-0.037
 intrigued-0.037
 curious-0.034
cious-0.034
Token the
Token ID262
Feature activation-1.07e-02

Pos logit contributions
loo+0.026
 dri+0.026
 dent+0.025
ister+0.025
imi+0.024
Neg logit contributions
 fascinated-0.041
 intrigued-0.040
 Curious-0.039
nam-0.038
mit-0.037
Token Princess
Token ID8449
Feature activation-1.87e-01

Pos logit contributions
 fascinated+0.020
 intrigued+0.019
 Curious+0.019
nam+0.018
cious+0.018
Neg logit contributions
 dri-0.016
 dent-0.015
loo-0.015
imi-0.015
ister-0.014
Token travelled
Token ID21650
Feature activation-8.42e-02

Pos logit contributions
nam+0.362
 fascinated+0.342
 Curious+0.338
 intrigued+0.329
chard+0.302
Neg logit contributions
 dri-0.284
ister-0.262
loo-0.253
imi-0.249
web-0.239
Token around
Token ID1088
Feature activation-7.58e-02

Pos logit contributions
 fascinated+0.156
 intrigued+0.149
 Curious+0.149
nam+0.142
cious+0.142
Neg logit contributions
 dri-0.128
loo-0.127
 dent-0.123
ister-0.122
 collapse-0.114
Token the
Token ID262
Feature activation-1.12e-02

Pos logit contributions
 fascinated+0.144
 Curious+0.141
 intrigued+0.140
nam+0.135
mit+0.134
Neg logit contributions
 dri-0.108
loo-0.103
 dent-0.102
ister-0.100
imi-0.097
Token world
Token ID995
Feature activation-6.95e-02

Pos logit contributions
 fascinated+0.025
 intrigued+0.024
 Curious+0.024
nam+0.024
cious+0.023
Neg logit contributions
 dri-0.012
 dent-0.012
loo-0.012
ister-0.011
imi-0.011
Token believing
Token ID14773
Feature activation-1.25e-01

Pos logit contributions
 fascinated+0.133
 Curious+0.130
 intrigued+0.129
nam+0.128
mit+0.120
Neg logit contributions
loo-0.099
 dri-0.096
ister-0.095
 dent-0.090
 collapse-0.086
Token
Token ID198
Feature activation+2.06e-02

Pos logit contributions
loo+0.035
 dri+0.035
 dent+0.035
ister+0.034
imi+0.032
Neg logit contributions
 fascinated-0.038
 intrigued-0.036
 Curious-0.035
nam-0.034
mit-0.033
TokenOne
Token ID3198
Feature activation+2.29e-02

Pos logit contributions
loo+0.021
 dri+0.021
ister+0.021
 dent+0.020
imi+0.019
Neg logit contributions
 fascinated-0.049
 intrigued-0.047
 Curious-0.046
mit-0.045
nam-0.045
Token day
Token ID1110
Feature activation+3.84e-02

Pos logit contributions
loo+0.028
 dri+0.026
ister+0.026
 dent+0.026
imi+0.025
Neg logit contributions
 fascinated-0.051
 intrigued-0.049
 Curious-0.048
nam-0.047
mit-0.046
Token,
Token ID11
Feature activation-7.25e-03

Pos logit contributions
loo+0.037
ister+0.036
 dri+0.035
 dent+0.033
imi+0.033
Neg logit contributions
 fascinated-0.094
 intrigued-0.092
 Curious-0.088
nam-0.087
mit-0.084
Token the
Token ID262
Feature activation-6.58e-03

Pos logit contributions
 fascinated+0.016
 intrigued+0.016
mit+0.015
cious+0.015
uries+0.015
Neg logit contributions
 powder-0.008
O-0.008
 dri-0.007
 flash-0.007
ister-0.007
Token Princess
Token ID8449
Feature activation-1.87e-01

Pos logit contributions
 fascinated+0.011
 intrigued+0.010
 Curious+0.010
cious+0.009
mit+0.009
Neg logit contributions
loo-0.012
 dri-0.012
imi-0.011
ister-0.011
 dent-0.011
Token went
Token ID1816
Feature activation+3.86e-02

Pos logit contributions
nam+0.390
 fascinated+0.352
 intrigued+0.334
 Curious+0.331
folk+0.322
Neg logit contributions
 dri-0.274
loo-0.244
ister-0.236
ww-0.233
?-0.233
Token outside
Token ID2354
Feature activation-2.28e-02

Pos logit contributions
loo+0.060
 dri+0.059
 dent+0.058
ister+0.057
imi+0.054
Neg logit contributions
 fascinated-0.071
 intrigued-0.069
 Curious-0.067
nam-0.065
mit-0.064
Token and
Token ID290
Feature activation-1.45e-02

Pos logit contributions
 fascinated+0.039
 intrigued+0.038
 Curious+0.037
nam+0.037
mit+0.036
Neg logit contributions
loo-0.037
 dri-0.037
 dent-0.036
ister-0.035
imi-0.034
Token saw
Token ID2497
Feature activation-1.60e-02

Pos logit contributions
cious+0.027
mit+0.027
 fascinated+0.026
uries+0.026
 intrigued+0.025
Neg logit contributions
 dri-0.020
 flash-0.017
 band-0.016
 now-0.016
w-0.016
Token a
Token ID257
Feature activation-6.65e-03

Pos logit contributions
 fascinated+0.030
 intrigued+0.029
mit+0.029
 Curious+0.028
cious+0.028
Neg logit contributions
 dri-0.024
loo-0.023
ister-0.022
imi-0.022
 dent-0.022
Token
Token ID198
Feature activation+1.97e-02

Pos logit contributions
 fascinated+0.062
nam+0.060
 intrigued+0.060
mit+0.058
 Curious+0.058
Neg logit contributions
loo-0.043
ister-0.042
 dri-0.042
 dent-0.041
arl-0.039
Token
Token ID198
Feature activation+1.91e-02

Pos logit contributions
 dri+0.024
ister+0.023
 dent+0.023
loo+0.023
O+0.023
Neg logit contributions
 fascinated-0.040
nam-0.039
mit-0.039
 intrigued-0.039
 Curious-0.037
TokenThe
Token ID464
Feature activation+8.17e-03

Pos logit contributions
O+0.016
 dri+0.015
ister+0.015
 dent+0.014
ww+0.014
Neg logit contributions
 fascinated-0.045
mit-0.045
 intrigued-0.044
 Curious-0.044
nam-0.043
Token dragon
Token ID10441
Feature activation-5.22e-02

Pos logit contributions
loo+0.012
 dri+0.011
 dent+0.011
ister+0.011
 collapse+0.011
Neg logit contributions
 fascinated-0.016
 intrigued-0.015
 Curious-0.015
nam-0.015
cious-0.014
Token and
Token ID290
Feature activation-1.51e-02

Pos logit contributions
 fascinated+0.086
nam+0.083
 Curious+0.083
 intrigued+0.083
cious+0.077
Neg logit contributions
 dri-0.088
loo-0.088
 dent-0.086
ister-0.084
imi-0.080
Token Princess
Token ID8449
Feature activation-1.87e-01

Pos logit contributions
 fascinated+0.025
 intrigued+0.024
cious+0.023
 Curious+0.022
mit+0.021
Neg logit contributions
 dri-0.026
 dent-0.025
 flash-0.025
loo-0.025
 stitch-0.024
Token stayed
Token ID9658
Feature activation+2.63e-04

Pos logit contributions
nam+0.354
 Curious+0.354
 fascinated+0.348
 intrigued+0.343
 Fasc+0.312
Neg logit contributions
 dri-0.257
ister-0.246
loo-0.234
 band-0.221
ww-0.220
Token up
Token ID510
Feature activation-9.42e-02

Pos logit contributions
loo+0.000
 dri+0.000
 dent+0.000
ister+0.000
 collapse+0.000
Neg logit contributions
 fascinated-0.000
 intrigued-0.000
 Curious-0.000
nam-0.000
mit-0.000
Token there
Token ID612
Feature activation-1.07e-01

Pos logit contributions
 fascinated+0.167
 intrigued+0.160
 Curious+0.150
nam+0.143
 curious+0.142
Neg logit contributions
loo-0.161
 dent-0.161
 dri-0.156
ister-0.154
imi-0.152
Token for
Token ID329
Feature activation+5.37e-02

Pos logit contributions
 fascinated+0.215
 Curious+0.214
 intrigued+0.210
nam+0.205
mit+0.199
Neg logit contributions
 dri-0.136
loo-0.135
 dent-0.132
ister-0.130
 collapse-0.123
Token a
Token ID257
Feature activation-5.25e-03

Pos logit contributions
loo+0.102
 dri+0.100
 dent+0.100
ister+0.098
imi+0.093
Neg logit contributions
 fascinated-0.083
 intrigued-0.079
 Curious-0.076
nam-0.073
cious-0.072
Token of
Token ID286
Feature activation+4.32e-02

Pos logit contributions
 fascinated+0.141
 Curious+0.139
 intrigued+0.138
nam+0.135
cious+0.132
Neg logit contributions
loo-0.039
 dri-0.036
ister-0.035
 dent-0.035
imi-0.028
Token crazy
Token ID7165
Feature activation+6.04e-03

Pos logit contributions
loo+0.068
 dri+0.068
ister+0.066
 flash+0.066
 dent+0.065
Neg logit contributions
 Curious-0.071
 fascinated-0.070
mit-0.068
 intrigued-0.068
chard-0.068
Token animals
Token ID4695
Feature activation-1.22e-01

Pos logit contributions
loo+0.009
ister+0.009
ay+0.008
 snow+0.008
 dri+0.008
Neg logit contributions
chard-0.012
nam-0.011
 Curious-0.011
 fascinated-0.011
 intrigued-0.010
Token and
Token ID290
Feature activation-1.55e-02

Pos logit contributions
 Curious+0.238
 fascinated+0.238
 intrigued+0.231
nam+0.228
mit+0.214
Neg logit contributions
 dri-0.167
loo-0.165
ister-0.162
 dent-0.159
 collapse-0.148
Token amazing
Token ID4998
Feature activation-2.05e-02

Pos logit contributions
 Curious+0.023
 fascinated+0.022
 intrigued+0.022
mit+0.022
cious+0.021
Neg logit contributions
 dri-0.028
 flash-0.025
 dent-0.024
loo-0.024
 snow-0.024
Token performers
Token ID23827
Feature activation-1.86e-01

Pos logit contributions
nam+0.037
 fascinated+0.037
chard+0.037
 Curious+0.037
 intrigued+0.036
Neg logit contributions
 dri-0.031
loo-0.031
 flash-0.029
 dent-0.029
ister-0.029
Token.
Token ID13
Feature activation+2.69e-03

Pos logit contributions
 Curious+0.371
 fascinated+0.365
 intrigued+0.358
nam+0.349
 Fasc+0.329
Neg logit contributions
loo-0.247
 dri-0.246
ister-0.238
 dent-0.226
 collapse-0.210
Token She
Token ID1375
Feature activation+2.80e-02

Pos logit contributions
ay+0.004
 dri+0.004
omo+0.004
loo+0.004
arl+0.003
Neg logit contributions
nam-0.005
 fascinated-0.005
cious-0.005
mit-0.005
 intrigued-0.005
Token gasped
Token ID45236
Feature activation-2.55e-02

Pos logit contributions
 dri+0.052
 dent+0.047
omo+0.047
loo+0.047
ister+0.046
Neg logit contributions
nam-0.042
 Curious-0.041
 fascinated-0.040
 intrigued-0.039
cious-0.038
Token in
Token ID287
Feature activation-3.54e-02

Pos logit contributions
 Curious+0.047
 fascinated+0.047
 intrigued+0.046
nam+0.045
cious+0.045
Neg logit contributions
ister-0.035
 dri-0.035
 dent-0.035
loo-0.034
imi-0.033
Token delight
Token ID10974
Feature activation-5.49e-03

Pos logit contributions
 fascinated+0.056
 intrigued+0.054
 Curious+0.052
nam+0.050
cious+0.050
Neg logit contributions
 dent-0.063
 dri-0.063
loo-0.061
ister-0.060
 collapse-0.059
Token
Token ID198
Feature activation+1.90e-02

Pos logit contributions
loo+0.036
 dent+0.036
 dri+0.035
ister+0.033
 collapse+0.033
Neg logit contributions
 fascinated-0.031
 intrigued-0.030
 Curious-0.030
mit-0.027
cious-0.027
TokenOnce
Token ID7454
Feature activation+1.69e-02

Pos logit contributions
loo+0.028
 dent+0.028
 dri+0.026
imi+0.025
ister+0.024
Neg logit contributions
 fascinated-0.039
 intrigued-0.038
 Curious-0.037
cious-0.035
mit-0.034
Token inside
Token ID2641
Feature activation-2.20e-02

Pos logit contributions
 dri+0.025
loo+0.025
ister+0.023
ig+0.021
imi+0.021
Neg logit contributions
 fascinated-0.033
 intrigued-0.032
 Curious-0.032
cious-0.031
nam-0.030
Token,
Token ID11
Feature activation-5.37e-03

Pos logit contributions
 fascinated+0.040
 intrigued+0.039
 Curious+0.038
nam+0.036
mit+0.036
Neg logit contributions
loo-0.035
 dri-0.034
 dent-0.034
ister-0.033
imi-0.031
Token the
Token ID262
Feature activation-5.19e-03

Pos logit contributions
 fascinated+0.011
 intrigued+0.011
cious+0.010
 Curious+0.010
mit+0.010
Neg logit contributions
 dri-0.007
O-0.006
ister-0.006
 collapse-0.006
loo-0.006
Token Princess
Token ID8449
Feature activation-1.86e-01

Pos logit contributions
 fascinated+0.008
 intrigued+0.008
 Curious+0.007
cious+0.007
mit+0.007
Neg logit contributions
loo-0.010
 dri-0.010
 dent-0.009
imi-0.009
ister-0.009
Token gave
Token ID2921
Feature activation-3.42e-02

Pos logit contributions
nam+0.366
 fascinated+0.340
 Curious+0.335
 intrigued+0.322
 Fasc+0.302
Neg logit contributions
 dri-0.274
ister-0.243
loo-0.234
omo-0.226
imi-0.225
Token Jill
Token ID22772
Feature activation-6.16e-02

Pos logit contributions
 fascinated+0.072
 intrigued+0.070
 Curious+0.068
cious+0.065
 mesmer+0.062
Neg logit contributions
 dri-0.046
loo-0.039
ister-0.039
imi-0.038
 band-0.038
Token a
Token ID257
Feature activation-5.08e-03

Pos logit contributions
 fascinated+0.120
 intrigued+0.117
 Curious+0.113
nam+0.107
cious+0.105
Neg logit contributions
 dri-0.088
loo-0.082
aster-0.079
imi-0.077
ister-0.077
Token big
Token ID1263
Feature activation+1.16e-01

Pos logit contributions
 fascinated+0.009
 intrigued+0.009
 Curious+0.009
mit+0.008
 curious+0.008
Neg logit contributions
loo-0.008
 dent-0.008
 dri-0.008
ister-0.008
 collapse-0.008
Token hug
Token ID16225
Feature activation+4.15e-03

Pos logit contributions
loo+0.195
 dent+0.190
 collapse+0.186
 dri+0.186
ister+0.185
Neg logit contributions
 fascinated-0.196
 intrigued-0.194
 Curious-0.193
mit-0.185
 curious-0.182
Token.
Token ID13
Feature activation+3.20e-03

Pos logit contributions
 fascinated+0.467
 Curious+0.459
 intrigued+0.458
 Fasc+0.423
 mesmer+0.418
Neg logit contributions
ister-0.163
imi-0.148
loo-0.138
?-0.132
web-0.131
Token Every
Token ID3887
Feature activation-6.53e-03

Pos logit contributions
ay+0.004
arl+0.003
imi+0.003
ilo+0.003
 dri+0.003
Neg logit contributions
 fascinated-0.006
mit-0.006
 intrigued-0.006
cious-0.006
 mesmer-0.006
Token day
Token ID1110
Feature activation+4.19e-02

Pos logit contributions
 fascinated+0.010
 Curious+0.009
 intrigued+0.009
 curious+0.009
 wanted+0.009
Neg logit contributions
 dent-0.013
 dri-0.012
 collapse-0.012
 swollen-0.012
omo-0.012
Token,
Token ID11
Feature activation-4.64e-03

Pos logit contributions
loo+0.032
 dri+0.032
ister+0.030
ay+0.029
ww+0.027
Neg logit contributions
 fascinated-0.111
 intrigued-0.108
 Curious-0.102
 mesmer-0.099
cious-0.098
Token the
Token ID262
Feature activation-6.13e-03

Pos logit contributions
 fascinated+0.010
mit+0.010
 intrigued+0.010
cious+0.010
ked+0.009
Neg logit contributions
 better-0.005
 end-0.004
?-0.004
ilo-0.004
 stitch-0.004
Token Princess
Token ID8449
Feature activation-1.85e-01

Pos logit contributions
 fascinated+0.008
 intrigued+0.008
 Curious+0.007
cious+0.007
 mesmer+0.007
Neg logit contributions
loo-0.013
 dri-0.013
ister-0.013
arl-0.012
imi-0.012
Token polished
Token ID23895
Feature activation+2.80e-02

Pos logit contributions
nam+0.366
 Curious+0.355
 fascinated+0.353
 intrigued+0.346
 Int+0.318
Neg logit contributions
 dri-0.275
?-0.244
ister-0.239
loo-0.238
imi-0.233
Token her
Token ID607
Feature activation-7.68e-03

Pos logit contributions
loo+0.051
 dent+0.051
 dri+0.050
ister+0.049
 collapse+0.049
Neg logit contributions
 fascinated-0.043
 intrigued-0.041
nam-0.040
 Curious-0.040
mit-0.038
Token golden
Token ID10861
Feature activation+6.05e-04

Pos logit contributions
 fascinated+0.013
 intrigued+0.012
 Curious+0.012
nam+0.011
mit+0.011
Neg logit contributions
loo-0.013
 dent-0.013
 dri-0.013
ister-0.013
imi-0.012
Token crown
Token ID12389
Feature activation-3.77e-02

Pos logit contributions
loo+0.001
ister+0.001
 dri+0.001
 dent+0.001
 collapse+0.001
Neg logit contributions
 fascinated-0.001
 intrigued-0.001
 Curious-0.001
mit-0.001
nam-0.001
Token with
Token ID351
Feature activation-6.73e-02

Pos logit contributions
 fascinated+0.059
 intrigued+0.057
 Curious+0.055
nam+0.053
mit+0.052
Neg logit contributions
loo-0.069
 dent-0.069
 dri-0.068
ister-0.067
imi-0.064
Token
Token ID198
Feature activation+2.11e-02

Pos logit contributions
loo+0.005
 dri+0.005
ister+0.005
arl+0.005
 dent+0.005
Neg logit contributions
 fascinated-0.006
 intrigued-0.006
nam-0.006
cious-0.006
mit-0.006
Token
Token ID198
Feature activation+2.07e-02

Pos logit contributions
loo+0.036
 dent+0.036
 dri+0.036
ister+0.035
imi+0.033
Neg logit contributions
 fascinated-0.036
 intrigued-0.034
 Curious-0.034
nam-0.031
mit-0.031
TokenThe
Token ID464
Feature activation+1.08e-02

Pos logit contributions
loo+0.029
 dent+0.029
 dri+0.029
ister+0.027
imi+0.026
Neg logit contributions
 fascinated-0.042
 intrigued-0.041
 Curious-0.040
nam-0.038
cious-0.037
Token Prince
Token ID9005
Feature activation-1.31e-01

Pos logit contributions
loo+0.021
 dri+0.021
ister+0.020
 dent+0.020
ilo+0.020
Neg logit contributions
 fascinated-0.015
 intrigued-0.013
 Curious-0.013
cious-0.013
nam-0.012
Token and
Token ID290
Feature activation-1.57e-02

Pos logit contributions
nam+0.224
 fascinated+0.216
 Curious+0.209
 intrigued+0.206
cious+0.189
Neg logit contributions
 dri-0.218
loo-0.210
ister-0.207
 dent-0.197
omo-0.193
Token Princess
Token ID8449
Feature activation-1.85e-01

Pos logit contributions
 fascinated+0.020
 intrigued+0.019
cious+0.019
 Curious+0.017
mit+0.017
Neg logit contributions
 dri-0.032
loo-0.031
 dent-0.030
ister-0.030
ilo-0.030
Token were
Token ID547
Feature activation+1.51e-02

Pos logit contributions
 fascinated+0.315
 Curious+0.308
 intrigued+0.306
nam+0.300
cious+0.272
Neg logit contributions
 dri-0.301
loo-0.295
ister-0.292
 dent-0.279
 collapse-0.267
Token so
Token ID523
Feature activation-3.50e-02

Pos logit contributions
 dri+0.023
loo+0.023
ister+0.023
 dent+0.022
omo+0.022
Neg logit contributions
 Curious-0.025
nam-0.025
 fascinated-0.024
 intrigued-0.024
cious-0.023
Token happy
Token ID3772
Feature activation-4.30e-02

Pos logit contributions
 Curious+0.052
 fascinated+0.049
nam+0.048
cious+0.048
 intrigued+0.047
Neg logit contributions
ister-0.061
loo-0.059
 dri-0.059
 dent-0.057
 collapse-0.056
Token that
Token ID326
Feature activation+5.33e-02

Pos logit contributions
nam+0.095
 Curious+0.094
 fascinated+0.094
 intrigued+0.092
mit+0.088
Neg logit contributions
loo-0.050
 dri-0.047
ister-0.046
 dent-0.044
imi-0.042
Token they
Token ID484
Feature activation-2.48e-02

Pos logit contributions
 dri+0.082
loo+0.081
ister+0.079
 dent+0.076
 collapse+0.076
Neg logit contributions
 fascinated-0.095
 intrigued-0.093
 Curious-0.092
mit-0.086
cious-0.084
Token the
Token ID262
Feature activation-4.34e-03

Pos logit contributions
 fascinated+0.044
 intrigued+0.042
 Curious+0.042
nam+0.039
cious+0.038
Neg logit contributions
 dri-0.031
loo-0.029
ister-0.028
aster-0.027
 dent-0.026
Token opportunity
Token ID3663
Feature activation-6.34e-02

Pos logit contributions
 fascinated+0.009
 intrigued+0.009
cious+0.009
nam+0.009
 Curious+0.009
Neg logit contributions
 dent-0.005
loo-0.005
 dri-0.005
ister-0.005
 flash-0.005
Token to
Token ID284
Feature activation+1.96e-03

Pos logit contributions
 fascinated+0.123
 intrigued+0.118
 Curious+0.117
nam+0.113
mit+0.109
Neg logit contributions
 dri-0.092
loo-0.091
 dent-0.090
ister-0.088
 collapse-0.082
Token marry
Token ID12479
Feature activation-1.10e-01

Pos logit contributions
loo+0.003
 dent+0.003
 dri+0.003
imi+0.003
ister+0.003
Neg logit contributions
 fascinated-0.003
 intrigued-0.003
 Curious-0.003
mit-0.003
 curious-0.003
Token his
Token ID465
Feature activation-4.14e-03

Pos logit contributions
 fascinated+0.239
 intrigued+0.231
 Curious+0.227
nam+0.215
mit+0.213
Neg logit contributions
loo-0.135
ister-0.134
 dri-0.133
 dent-0.128
imi-0.118
Token Princess
Token ID8449
Feature activation-1.85e-01

Pos logit contributions
 fascinated+0.007
 Curious+0.007
 intrigued+0.007
nam+0.007
cious+0.006
Neg logit contributions
 dent-0.007
loo-0.007
 dri-0.006
ister-0.006
omo-0.006
Token.
Token ID13
Feature activation+2.36e-03

Pos logit contributions
 fascinated+0.420
 Curious+0.411
 intrigued+0.406
nam+0.396
 Fasc+0.375
Neg logit contributions
ister-0.185
 dri-0.182
loo-0.176
imi-0.165
ilo-0.164
Token The
Token ID383
Feature activation-6.70e-03

Pos logit contributions
loo+0.004
 dri+0.004
ister+0.004
arl+0.003
 dent+0.003
Neg logit contributions
 fascinated-0.004
 intrigued-0.004
nam-0.004
cious-0.004
mit-0.004
Token King
Token ID2677
Feature activation-1.41e-01

Pos logit contributions
 fascinated+0.009
 intrigued+0.008
cious+0.008
 Curious+0.007
 mesmer+0.007
Neg logit contributions
loo-0.014
 dri-0.013
ister-0.013
ilo-0.013
 dent-0.013
Token was
Token ID373
Feature activation-1.89e-03

Pos logit contributions
 fascinated+0.227
nam+0.220
 Curious+0.219
 intrigued+0.218
cious+0.201
Neg logit contributions
loo-0.245
 dri-0.243
ister-0.239
 dent-0.234
omo-0.224
Token so
Token ID523
Feature activation-3.57e-02

Pos logit contributions
 fascinated+0.003
 Curious+0.003
 intrigued+0.003
nam+0.003
cious+0.002
Neg logit contributions
 dent-0.004
loo-0.004
 dri-0.004
ister-0.003
 collapse-0.003
Token
Token ID198
Feature activation+2.24e-02

Pos logit contributions
 dent+0.043
 dri+0.042
loo+0.042
ister+0.040
 collapse+0.039
Neg logit contributions
 fascinated-0.034
 intrigued-0.033
 Curious-0.033
cious-0.029
nam-0.029
TokenOne
Token ID3198
Feature activation+2.33e-02

Pos logit contributions
loo+0.026
 dri+0.026
 dent+0.026
ister+0.025
imi+0.023
Neg logit contributions
 fascinated-0.050
 intrigued-0.049
 Curious-0.048
nam-0.046
mit-0.046
Token day
Token ID1110
Feature activation+3.87e-02

Pos logit contributions
 dent+0.035
 dri+0.034
loo+0.033
ister+0.032
 collapse+0.031
Neg logit contributions
 fascinated-0.046
 intrigued-0.044
 Curious-0.043
mit-0.041
cious-0.040
Token when
Token ID618
Feature activation-5.87e-02

Pos logit contributions
ister+0.032
 dri+0.032
loo+0.031
 flash+0.029
aster+0.028
Neg logit contributions
 fascinated-0.099
 intrigued-0.097
mit-0.092
 Curious-0.092
nam-0.091
Token the
Token ID262
Feature activation-4.31e-03

Pos logit contributions
 fascinated+0.096
 intrigued+0.092
 Curious+0.089
nam+0.085
mit+0.084
Neg logit contributions
loo-0.105
 dri-0.103
 dent-0.103
ister-0.101
imi-0.096
Token Princess
Token ID8449
Feature activation-1.85e-01

Pos logit contributions
 fascinated+0.008
 intrigued+0.008
 Curious+0.007
nam+0.007
cious+0.007
Neg logit contributions
loo-0.007
 dri-0.007
 dent-0.007
ister-0.006
imi-0.006
Token arrived
Token ID5284
Feature activation-3.11e-02

Pos logit contributions
nam+0.375
 fascinated+0.346
 intrigued+0.326
 Curious+0.317
 mesmer+0.301
Neg logit contributions
 dri-0.292
loo-0.254
ister-0.246
imi-0.239
omo-0.235
Token at
Token ID379
Feature activation-2.36e-02

Pos logit contributions
 fascinated+0.061
 Curious+0.059
 intrigued+0.058
cious+0.056
nam+0.056
Neg logit contributions
ister-0.044
loo-0.043
 dri-0.040
ig-0.039
w-0.038
Token the
Token ID262
Feature activation-4.36e-03

Pos logit contributions
 fascinated+0.039
 intrigued+0.037
 Curious+0.037
nam+0.035
mit+0.034
Neg logit contributions
loo-0.041
 dent-0.041
 dri-0.041
ister-0.040
imi-0.038
Token lake
Token ID13546
Feature activation-2.15e-02

Pos logit contributions
 fascinated+0.007
 intrigued+0.007
 Curious+0.007
mit+0.006
 curious+0.006
Neg logit contributions
 dent-0.008
 dri-0.007
loo-0.007
ister-0.007
 collapse-0.007
Token,
Token ID11
Feature activation-5.70e-03

Pos logit contributions
 fascinated+0.045
 intrigued+0.044
 Curious+0.043
nam+0.043
mit+0.041
Neg logit contributions
loo-0.028
 dri-0.027
ister-0.026
 dent-0.026
 collapse-0.024
Token time
Token ID640
Feature activation-3.84e-03

Pos logit contributions
 fascinated+0.016
 intrigued+0.015
 Curious+0.015
nam+0.014
mit+0.014
Neg logit contributions
loo-0.010
 dent-0.010
 dri-0.010
ister-0.010
imi-0.009
Token there
Token ID612
Feature activation-1.09e-01

Pos logit contributions
 fascinated+0.010
 intrigued+0.009
nam+0.009
 Curious+0.009
cious+0.009
Neg logit contributions
loo-0.004
ister-0.004
imi-0.003
 dent-0.003
ay-0.003
Token was
Token ID373
Feature activation-4.35e-03

Pos logit contributions
 fascinated+0.182
 intrigued+0.174
 Curious+0.170
nam+0.158
mit+0.158
Neg logit contributions
loo-0.188
 dent-0.187
 dri-0.186
ister-0.182
arl-0.176
Token a
Token ID257
Feature activation-7.62e-03

Pos logit contributions
 fascinated+0.008
 intrigued+0.008
mit+0.008
 Curious+0.008
nam+0.007
Neg logit contributions
loo-0.006
 dri-0.006
imi-0.006
 snowball-0.006
O-0.006
Token real
Token ID1103
Feature activation-1.15e-02

Pos logit contributions
 fascinated+0.009
 intrigued+0.008
 Curious+0.008
nam+0.007
mit+0.007
Neg logit contributions
loo-0.017
 dri-0.017
 dent-0.017
ister-0.017
 collapse-0.016
Token Princess
Token ID8449
Feature activation-1.85e-01

Pos logit contributions
 fascinated+0.018
 intrigued+0.017
 Curious+0.017
mit+0.016
nam+0.016
Neg logit contributions
 dent-0.021
loo-0.021
 dri-0.020
ister-0.020
 collapse-0.019
Token named
Token ID3706
Feature activation-8.03e-02

Pos logit contributions
 Curious+0.437
 fascinated+0.434
 intrigued+0.430
nam+0.402
 Fasc+0.399
Neg logit contributions
ister-0.189
imi-0.171
loo-0.168
web-0.167
oo-0.164
Token Sally
Token ID25737
Feature activation-4.90e-02

Pos logit contributions
 fascinated+0.174
 intrigued+0.166
 mesmer+0.150
ked+0.150
 Curious+0.147
Neg logit contributions
imi-0.093
arl-0.085
onna-0.082
ww-0.081
ister-0.080
Token.
Token ID13
Feature activation+1.65e-03

Pos logit contributions
 fascinated+0.108
 intrigued+0.106
 Curious+0.104
nam+0.099
mit+0.098
Neg logit contributions
loo-0.059
 dri-0.058
 dent-0.056
ister-0.055
imi-0.053
Token Sally
Token ID25737
Feature activation-4.86e-02

Pos logit contributions
 dri+0.002
arl+0.002
ay+0.002
loo+0.002
 collapse+0.002
Neg logit contributions
nam-0.003
 fascinated-0.003
 intrigued-0.003
mit-0.003
cious-0.003
Token was
Token ID373
Feature activation-4.38e-03

Pos logit contributions
 fascinated+0.076
 intrigued+0.074
 Curious+0.073
nam+0.072
mit+0.068
Neg logit contributions
 dri-0.088
loo-0.087
ister-0.083
imi-0.083
 dent-0.082
Token smiled
Token ID13541
Feature activation-1.09e-01

Pos logit contributions
 fascinated+0.118
nam+0.115
 Curious+0.114
 intrigued+0.111
cious+0.106
Neg logit contributions
 dri-0.127
loo-0.123
ister-0.120
imi-0.116
 dent-0.116
Token.
Token ID13
Feature activation-1.32e-03

Pos logit contributions
nam+0.238
cious+0.237
mit+0.228
 fascinated+0.228
 Curious+0.224
Neg logit contributions
 dri-0.111
 dent-0.100
 snow-0.095
ister-0.094
ay-0.093
Token "
Token ID366
Feature activation-8.61e-02

Pos logit contributions
nam+0.002
cious+0.002
 fascinated+0.002
mit+0.002
 intrigued+0.002
Neg logit contributions
 dri-0.002
imi-0.002
arl-0.002
loo-0.002
ister-0.002
TokenYes
Token ID5297
Feature activation-2.03e-02

Pos logit contributions
 fascinated+0.162
 intrigued+0.156
 Curious+0.148
mit+0.145
nam+0.144
Neg logit contributions
 dri-0.124
loo-0.120
O-0.119
aid-0.117
ister-0.117
Token,
Token ID11
Feature activation-8.85e-03

Pos logit contributions
 fascinated+0.039
 intrigued+0.038
 Curious+0.037
mit+0.036
nam+0.036
Neg logit contributions
loo-0.030
 dent-0.029
 dri-0.029
ister-0.027
 collapse-0.026
Token Princess
Token ID8449
Feature activation-1.85e-01

Pos logit contributions
 fascinated+0.019
 intrigued+0.018
 Curious+0.018
cious+0.017
nam+0.017
Neg logit contributions
 dri-0.011
ister-0.011
loo-0.011
imi-0.011
 dent-0.011
Token.
Token ID13
Feature activation-2.69e-03

Pos logit contributions
 fascinated+0.452
 intrigued+0.434
 Curious+0.425
nam+0.413
 mesmer+0.396
Neg logit contributions
ister-0.194
loo-0.182
imi-0.177
 dri-0.174
ilo-0.159
Token I
Token ID314
Feature activation-1.39e-02

Pos logit contributions
 fascinated+0.005
 intrigued+0.005
nam+0.004
 mesmer+0.004
cious+0.004
Neg logit contributions
arl-0.004
loo-0.004
ay-0.004
imi-0.004
ister-0.004
Token'll
Token ID1183
Feature activation-1.23e-02

Pos logit contributions
 fascinated+0.024
 intrigued+0.023
 Curious+0.022
nam+0.022
mit+0.021
Neg logit contributions
loo-0.023
 dri-0.023
 dent-0.023
ister-0.023
imi-0.021
Token hurry
Token ID23290
Feature activation+4.00e-02

Pos logit contributions
 fascinated+0.022
 intrigued+0.022
 Curious+0.021
mit+0.020
nam+0.020
Neg logit contributions
loo-0.020
 dent-0.020
 dri-0.019
ister-0.019
imi-0.018
Token and
Token ID290
Feature activation-1.73e-02

Pos logit contributions
loo+0.063
 dri+0.062
 dent+0.061
ister+0.061
imi+0.057
Neg logit contributions
 fascinated-0.073
 intrigued-0.070
 Curious-0.069
nam-0.066
mit-0.065
Token he
Token ID339
Feature activation+1.01e-02

Pos logit contributions
loo+0.083
 dri+0.082
ister+0.080
 dent+0.079
 collapse+0.074
Neg logit contributions
 fascinated-0.094
 intrigued-0.091
 Curious-0.089
cious-0.085
mit-0.085
Token gave
Token ID2921
Feature activation-3.16e-02

Pos logit contributions
 dri+0.018
ister+0.016
loo+0.016
 dent+0.016
 collapse+0.015
Neg logit contributions
 fascinated-0.016
nam-0.016
 Curious-0.016
 intrigued-0.016
cious-0.015
Token the
Token ID262
Feature activation-3.52e-03

Pos logit contributions
 fascinated+0.063
 intrigued+0.061
 Curious+0.058
cious+0.057
nam+0.056
Neg logit contributions
 dri-0.044
loo-0.041
ister-0.041
ay-0.040
 dent-0.039
Token Prince
Token ID9005
Feature activation-1.30e-01

Pos logit contributions
 fascinated+0.007
cious+0.007
 intrigued+0.007
nam+0.007
 Curious+0.007
Neg logit contributions
loo-0.005
 dent-0.004
 dri-0.004
ister-0.004
 powder-0.004
Token and
Token ID290
Feature activation-1.53e-02

Pos logit contributions
 fascinated+0.250
 intrigued+0.239
nam+0.237
 Curious+0.237
 Fasc+0.216
Neg logit contributions
 dri-0.174
loo-0.169
ister-0.165
ilo-0.159
 snow-0.153
Token Princess
Token ID8449
Feature activation-1.85e-01

Pos logit contributions
 fascinated+0.024
 intrigued+0.023
 Curious+0.023
nam+0.022
cious+0.021
Neg logit contributions
 dri-0.028
loo-0.027
 dent-0.027
ister-0.026
 collapse-0.025
Token his
Token ID465
Feature activation-3.73e-03

Pos logit contributions
 fascinated+0.330
 intrigued+0.316
 Curious+0.316
nam+0.310
cious+0.284
Neg logit contributions
 dri-0.283
loo-0.282
ister-0.281
 dent-0.253
imi-0.250
Token blessing
Token ID20027
Feature activation-4.78e-02

Pos logit contributions
 fascinated+0.007
 intrigued+0.006
 Curious+0.006
nam+0.006
cious+0.006
Neg logit contributions
 dent-0.006
loo-0.006
 dri-0.006
ister-0.005
arl-0.005
Token.
Token ID13
Feature activation+3.41e-03

Pos logit contributions
 fascinated+0.091
 intrigued+0.088
 Curious+0.086
nam+0.082
mit+0.082
Neg logit contributions
loo-0.071
 dent-0.070
 dri-0.070
ister-0.068
imi-0.064
Token
Token ID198
Feature activation+2.11e-02

Pos logit contributions
loo+0.005
 dri+0.005
ister+0.005
arl+0.005
 dent+0.005
Neg logit contributions
 fascinated-0.006
 intrigued-0.006
nam-0.006
cious-0.006
mit-0.006
Token
Token ID198
Feature activation+2.07e-02

Pos logit contributions
loo+0.036
 dent+0.036
 dri+0.036
ister+0.035
imi+0.033
Neg logit contributions
 fascinated-0.036
 intrigued-0.034
 Curious-0.034
nam-0.031
mit-0.031
Token ready
Token ID3492
Feature activation+1.58e-02

Pos logit contributions
 dri+0.041
loo+0.035
 band+0.034
ister+0.033
 dent+0.033
Neg logit contributions
cious-0.047
 fascinated-0.045
 Curious-0.045
mit-0.044
 intrigued-0.044
Token.
Token ID13
Feature activation+6.41e-03

Pos logit contributions
loo+0.028
 dent+0.027
 dri+0.027
ister+0.026
imi+0.025
Neg logit contributions
 fascinated-0.026
 intrigued-0.025
 Curious-0.024
nam-0.023
mit-0.023
Token All
Token ID1439
Feature activation-6.44e-02

Pos logit contributions
ay+0.010
 dri+0.009
 flash+0.009
 dent+0.009
 collapse+0.009
Neg logit contributions
mit-0.012
cious-0.011
 fascinated-0.011
nam-0.011
 intrigued-0.011
Token of
Token ID286
Feature activation+4.30e-02

Pos logit contributions
 fascinated+0.144
 intrigued+0.140
 Curious+0.138
cious+0.134
nam+0.133
Neg logit contributions
loo-0.075
 dri-0.074
 dent-0.070
ister-0.070
imi-0.066
Token the
Token ID262
Feature activation-2.41e-03

Pos logit contributions
loo+0.069
 dri+0.069
ister+0.067
 dent+0.067
imi+0.065
Neg logit contributions
 fascinated-0.075
 intrigued-0.073
 Curious-0.071
mit-0.071
cious-0.069
Token performers
Token ID23827
Feature activation-1.85e-01

Pos logit contributions
 fascinated+0.004
 Curious+0.004
 intrigued+0.004
chard+0.004
nam+0.004
Neg logit contributions
loo-0.004
 dri-0.004
ister-0.004
 flash-0.004
 dent-0.004
Token were
Token ID547
Feature activation+1.45e-02

Pos logit contributions
 Curious+0.304
 fascinated+0.299
nam+0.289
 intrigued+0.289
cious+0.273
Neg logit contributions
 dri-0.313
loo-0.305
ister-0.294
 dent-0.283
 flash-0.277
Token standing
Token ID5055
Feature activation-6.48e-02

Pos logit contributions
 dri+0.025
 dent+0.024
ister+0.024
loo+0.023
 flash+0.022
Neg logit contributions
 Curious-0.024
 fascinated-0.022
nam-0.022
 intrigued-0.022
cious-0.021
Token in
Token ID287
Feature activation-3.61e-02

Pos logit contributions
 fascinated+0.115
 Curious+0.111
 intrigued+0.110
nam+0.106
cious+0.104
Neg logit contributions
loo-0.103
 dri-0.102
ister-0.101
 dent-0.100
imi-0.093
Token the
Token ID262
Feature activation-2.40e-03

Pos logit contributions
 fascinated+0.061
 intrigued+0.058
 Curious+0.057
nam+0.054
mit+0.052
Neg logit contributions
loo-0.064
 dent-0.062
 dri-0.061
ister-0.061
imi-0.058
Token back
Token ID736
Feature activation+5.91e-02

Pos logit contributions
 fascinated+0.005
 intrigued+0.005
 Curious+0.004
nam+0.004
cious+0.004
Neg logit contributions
loo-0.004
 dri-0.003
 dent-0.003
ister-0.003
 collapse-0.003
Token her
Token ID607
Feature activation-8.11e-03

Pos logit contributions
 fascinated+0.139
 intrigued+0.134
 Curious+0.131
nam+0.125
mit+0.125
Neg logit contributions
loo-0.115
 dri-0.115
 dent-0.113
ister-0.110
imi-0.105
Token secret
Token ID3200
Feature activation+5.49e-04

Pos logit contributions
 fascinated+0.017
 Curious+0.016
 intrigued+0.016
mit+0.015
nam+0.015
Neg logit contributions
loo-0.011
 dent-0.010
 dri-0.010
ister-0.010
imi-0.009
Token hut
Token ID40812
Feature activation+2.96e-02

Pos logit contributions
loo+0.001
 dent+0.001
 dri+0.001
ister+0.001
imi+0.001
Neg logit contributions
 fascinated-0.001
 intrigued-0.001
 Curious-0.001
nam-0.001
mit-0.001
Token.
Token ID13
Feature activation+2.66e-03

Pos logit contributions
loo+0.038
 dri+0.037
 dent+0.036
ister+0.035
 collapse+0.033
Neg logit contributions
 fascinated-0.063
 intrigued-0.061
 Curious-0.060
nam-0.058
mit-0.057
Token The
Token ID383
Feature activation-7.83e-03

Pos logit contributions
 dri+0.004
loo+0.004
ay+0.004
arl+0.004
ister+0.004
Neg logit contributions
 fascinated-0.004
nam-0.004
 intrigued-0.004
cious-0.004
mit-0.004
Token Princess
Token ID8449
Feature activation-1.84e-01

Pos logit contributions
 fascinated+0.013
 intrigued+0.012
 Curious+0.012
cious+0.011
nam+0.011
Neg logit contributions
loo-0.014
 dri-0.014
 dent-0.013
web-0.013
 powder-0.012
Token had
Token ID550
Feature activation+3.40e-02

Pos logit contributions
nam+0.351
 Curious+0.329
 fascinated+0.322
 intrigued+0.307
 Fasc+0.297
Neg logit contributions
 dri-0.279
ister-0.244
loo-0.244
 band-0.234
web-0.231
Token lots
Token ID6041
Feature activation+6.96e-04

Pos logit contributions
 dri+0.056
 dent+0.050
loo+0.048
 collapse+0.048
 flash+0.047
Neg logit contributions
 fascinated-0.058
nam-0.056
mit-0.056
 intrigued-0.055
 Curious-0.055
Token of
Token ID286
Feature activation+4.31e-02

Pos logit contributions
loo+0.001
 dri+0.001
 dent+0.001
ister+0.001
imi+0.000
Neg logit contributions
 fascinated-0.002
 intrigued-0.002
 Curious-0.002
nam-0.002
mit-0.002
Token wonderful
Token ID7932
Feature activation-3.13e-02

Pos logit contributions
 dri+0.060
loo+0.060
 dent+0.058
ister+0.053
 flash+0.053
Neg logit contributions
 fascinated-0.085
 intrigued-0.081
 Curious-0.081
mit-0.080
cious-0.080
Token things
Token ID1243
Feature activation-8.21e-02

Pos logit contributions
nam+0.056
 fascinated+0.056
chard+0.055
 Curious+0.055
mit+0.054
Neg logit contributions
loo-0.049
 dri-0.048
 dent-0.045
 flash-0.045
 powder-0.043
Token.
Token ID13
Feature activation+4.39e-03

Pos logit contributions
loo+0.017
 dent+0.017
 dri+0.017
ister+0.016
imi+0.015
Neg logit contributions
 fascinated-0.025
 intrigued-0.024
 Curious-0.023
nam-0.022
mit-0.022
Token
Token ID198
Feature activation+2.11e-02

Pos logit contributions
loo+0.006
ay+0.006
 dri+0.006
imi+0.006
ister+0.006
Neg logit contributions
 fascinated-0.008
 intrigued-0.008
nam-0.008
mit-0.008
cious-0.008
Token
Token ID198
Feature activation+1.93e-02

Pos logit contributions
loo+0.033
 dent+0.033
 dri+0.033
ister+0.032
imi+0.030
Neg logit contributions
 fascinated-0.039
 intrigued-0.037
 Curious-0.036
nam-0.035
mit-0.034
TokenWhen
Token ID2215
Feature activation-1.22e-03

Pos logit contributions
loo+0.027
 dent+0.027
 dri+0.027
ister+0.025
imi+0.024
Neg logit contributions
 fascinated-0.039
 intrigued-0.038
 Curious-0.037
nam-0.035
cious-0.035
Token the
Token ID262
Feature activation-5.07e-03

Pos logit contributions
 fascinated+0.002
 intrigued+0.002
 Curious+0.002
nam+0.002
mit+0.002
Neg logit contributions
loo-0.002
 dent-0.002
 dri-0.002
ister-0.002
imi-0.002
Token Princess
Token ID8449
Feature activation-1.84e-01

Pos logit contributions
 fascinated+0.009
 intrigued+0.009
 Curious+0.008
nam+0.008
cious+0.008
Neg logit contributions
loo-0.008
 dri-0.008
 dent-0.008
ister-0.008
imi-0.007
Token arrived
Token ID5284
Feature activation-3.03e-02

Pos logit contributions
nam+0.350
 fascinated+0.334
 intrigued+0.314
 Curious+0.305
mit+0.289
Neg logit contributions
 dri-0.296
ister-0.265
loo-0.264
 collapse-0.248
imi-0.246
Token,
Token ID11
Feature activation-6.52e-03

Pos logit contributions
 fascinated+0.057
 Curious+0.055
 intrigued+0.055
nam+0.054
cious+0.052
Neg logit contributions
loo-0.045
 dri-0.044
ister-0.043
 dent-0.042
arl-0.039
Token the
Token ID262
Feature activation-5.84e-03

Pos logit contributions
 fascinated+0.012
cious+0.012
mit+0.012
 intrigued+0.012
 Curious+0.011
Neg logit contributions
 dri-0.009
ister-0.008
loo-0.008
 collapse-0.008
 dent-0.008
Token King
Token ID2677
Feature activation-1.41e-01

Pos logit contributions
 fascinated+0.010
 intrigued+0.010
cious+0.009
 Curious+0.009
nam+0.009
Neg logit contributions
loo-0.010
 dri-0.009
 dent-0.009
ister-0.009
web-0.009
Token was
Token ID373
Feature activation-3.32e-03

Pos logit contributions
 fascinated+0.231
nam+0.226
 Curious+0.223
 intrigued+0.221
cious+0.203
Neg logit contributions
 dri-0.244
loo-0.243
ister-0.234
 dent-0.231
omo-0.216
Token the
Token ID262
Feature activation-2.93e-03

Pos logit contributions
 fascinated+0.035
 intrigued+0.034
 Curious+0.032
nam+0.032
mit+0.031
Neg logit contributions
loo-0.039
 dent-0.038
 dri-0.038
ister-0.037
arl-0.036
Token park
Token ID3952
Feature activation-5.03e-03

Pos logit contributions
 fascinated+0.005
 intrigued+0.005
 Curious+0.005
nam+0.005
mit+0.005
Neg logit contributions
loo-0.005
 dri-0.004
 dent-0.004
ister-0.004
arl-0.004
Token.
Token ID13
Feature activation+4.53e-03

Pos logit contributions
 fascinated+0.010
 intrigued+0.010
 Curious+0.010
nam+0.009
mit+0.009
Neg logit contributions
loo-0.007
 dri-0.007
 dent-0.007
ister-0.007
 collapse-0.006
Token When
Token ID1649
Feature activation-1.11e-02

Pos logit contributions
 dri+0.007
ay+0.007
arl+0.006
loo+0.006
omo+0.006
Neg logit contributions
 fascinated-0.008
mit-0.008
cious-0.008
nam-0.008
 intrigued-0.008
Token the
Token ID262
Feature activation-5.13e-03

Pos logit contributions
 fascinated+0.015
 intrigued+0.014
 Curious+0.014
nam+0.014
mit+0.013
Neg logit contributions
loo-0.022
 dent-0.022
 dri-0.022
ister-0.021
arl-0.021
Token performers
Token ID23827
Feature activation-1.84e-01

Pos logit contributions
 fascinated+0.009
 intrigued+0.009
 Curious+0.009
mit+0.008
nam+0.008
Neg logit contributions
 dent-0.008
loo-0.008
 dri-0.008
ister-0.008
imi-0.007
Token finished
Token ID5201
Feature activation+7.39e-02

Pos logit contributions
 fascinated+0.291
 intrigued+0.279
 Curious+0.279
nam+0.273
cious+0.263
Neg logit contributions
 dri-0.335
loo-0.326
ister-0.320
 dent-0.316
 flash-0.297
Token,
Token ID11
Feature activation-4.66e-03

Pos logit contributions
loo+0.116
 dent+0.115
 dri+0.114
ister+0.110
imi+0.105
Neg logit contributions
 fascinated-0.136
 intrigued-0.130
 Curious-0.127
nam-0.122
mit-0.120
Token she
Token ID673
Feature activation+9.49e-03

Pos logit contributions
 fascinated+0.009
 intrigued+0.009
cious+0.008
mit+0.008
 Curious+0.008
Neg logit contributions
 dri-0.007
 dent-0.006
ister-0.006
loo-0.006
 collapse-0.006
Token joined
Token ID5399
Feature activation-1.16e-01

Pos logit contributions
 dri+0.018
 dent+0.017
loo+0.017
ister+0.017
omo+0.016
Neg logit contributions
 fascinated-0.014
 intrigued-0.014
 Curious-0.014
nam-0.014
cious-0.013
Token the
Token ID262
Feature activation-4.53e-03

Pos logit contributions
 fascinated+0.234
 intrigued+0.230
 Curious+0.222
cious+0.222
mit+0.214
Neg logit contributions
 dri-0.151
ay-0.148
loo-0.145
oo-0.143
 band-0.141
Token was
Token ID373
Feature activation+4.48e-04

Pos logit contributions
 fascinated+0.165
 intrigued+0.157
 Curious+0.153
mit+0.142
nam+0.141
Neg logit contributions
loo-0.174
 dent-0.173
 dri-0.172
ister-0.168
arl-0.163
Token a
Token ID257
Feature activation-3.13e-03

Pos logit contributions
 dri+0.001
loo+0.001
 dent+0.001
imi+0.001
omo+0.001
Neg logit contributions
 fascinated-0.001
 intrigued-0.001
mit-0.001
 Curious-0.001
nam-0.001
Token kind
Token ID1611
Feature activation-4.20e-02

Pos logit contributions
 fascinated+0.003
 intrigued+0.003
 Curious+0.003
nam+0.003
mit+0.003
Neg logit contributions
loo-0.007
 dri-0.007
 dent-0.007
ister-0.007
 collapse-0.007
Token and
Token ID290
Feature activation-1.63e-02

Pos logit contributions
 fascinated+0.060
 intrigued+0.057
 Curious+0.055
mit+0.052
nam+0.051
Neg logit contributions
 dent-0.082
loo-0.082
 dri-0.082
ister-0.079
 collapse-0.078
Token nice
Token ID3621
Feature activation+2.75e-02

Pos logit contributions
 fascinated+0.027
 intrigued+0.026
 Curious+0.025
nam+0.024
mit+0.024
Neg logit contributions
loo-0.029
 dent-0.029
 dri-0.029
ister-0.028
imi-0.026
Token Princess
Token ID8449
Feature activation-1.84e-01

Pos logit contributions
 dent+0.050
loo+0.050
 dri+0.050
ister+0.048
 collapse+0.048
Neg logit contributions
 fascinated-0.042
 intrigued-0.041
 Curious-0.040
mit-0.038
nam-0.037
Token.
Token ID13
Feature activation+3.20e-03

Pos logit contributions
 fascinated+0.467
 Curious+0.459
 intrigued+0.458
 Fasc+0.423
 mesmer+0.418
Neg logit contributions
ister-0.163
imi-0.148
loo-0.138
?-0.132
web-0.131
Token Every
Token ID3887
Feature activation-6.53e-03

Pos logit contributions
ay+0.004
arl+0.003
imi+0.003
ilo+0.003
 dri+0.003
Neg logit contributions
 fascinated-0.006
mit-0.006
 intrigued-0.006
cious-0.006
 mesmer-0.006
Token day
Token ID1110
Feature activation+4.19e-02

Pos logit contributions
 fascinated+0.010
 Curious+0.009
 intrigued+0.009
 curious+0.009
 wanted+0.009
Neg logit contributions
 dent-0.013
 dri-0.012
 collapse-0.012
 swollen-0.012
omo-0.012
Token,
Token ID11
Feature activation-4.64e-03

Pos logit contributions
loo+0.032
 dri+0.032
ister+0.030
ay+0.029
ww+0.027
Neg logit contributions
 fascinated-0.111
 intrigued-0.108
 Curious-0.102
 mesmer-0.099
cious-0.098
Token the
Token ID262
Feature activation-6.13e-03

Pos logit contributions
 fascinated+0.010
mit+0.010
 intrigued+0.010
cious+0.010
ked+0.009
Neg logit contributions
 better-0.005
 end-0.004
?-0.004
ilo-0.004
 stitch-0.004
Token.
Token ID13
Feature activation+1.33e-03

Pos logit contributions
loo+0.003
 dent+0.003
 dri+0.003
ister+0.003
imi+0.002
Neg logit contributions
 fascinated-0.004
 intrigued-0.004
 Curious-0.004
nam-0.003
cious-0.003
Token He
Token ID679
Feature activation+2.58e-02

Pos logit contributions
loo+0.002
 dri+0.002
 dent+0.002
ister+0.002
omo+0.002
Neg logit contributions
 fascinated-0.002
nam-0.002
 intrigued-0.002
mit-0.002
cious-0.002
Token jumped
Token ID11687
Feature activation-9.32e-03

Pos logit contributions
 dri+0.047
 dent+0.045
loo+0.044
ister+0.043
omo+0.042
Neg logit contributions
 fascinated-0.041
nam-0.039
 intrigued-0.039
 Curious-0.038
cious-0.036
Token off
Token ID572
Feature activation+1.81e-02

Pos logit contributions
 fascinated+0.015
 intrigued+0.014
 Curious+0.014
mit+0.013
nam+0.013
Neg logit contributions
loo-0.017
 dent-0.017
 dri-0.017
ister-0.016
imi-0.016
Token his
Token ID465
Feature activation-8.83e-04

Pos logit contributions
loo+0.036
 dent+0.035
ister+0.034
imi+0.034
omo+0.034
Neg logit contributions
 fascinated-0.026
 intrigued-0.025
 Curious-0.024
 curious-0.022
 Lay-0.022
Token bed
Token ID3996
Feature activation+1.24e-01

Pos logit contributions
 fascinated+0.002
 intrigued+0.002
 Curious+0.002
nam+0.001
mit+0.001
Neg logit contributions
loo-0.001
 dri-0.001
 dent-0.001
ister-0.001
imi-0.001
Token and
Token ID290
Feature activation-9.55e-03

Pos logit contributions
loo+0.228
 dent+0.223
 dri+0.221
ister+0.215
imi+0.211
Neg logit contributions
 fascinated-0.198
 intrigued-0.189
 Curious-0.181
cious-0.169
nam-0.169
Token ran
Token ID4966
Feature activation-8.14e-03

Pos logit contributions
 fascinated+0.016
 intrigued+0.015
 Curious+0.014
nam+0.014
mit+0.014
Neg logit contributions
 dri-0.017
loo-0.017
 dent-0.017
ister-0.016
 collapse-0.015
Token to
Token ID284
Feature activation+4.81e-03

Pos logit contributions
 fascinated+0.015
 intrigued+0.015
 Curious+0.014
nam+0.014
mit+0.014
Neg logit contributions
loo-0.013
 dri-0.012
 dent-0.012
ister-0.012
 collapse-0.011
Token his
Token ID465
Feature activation-5.18e-04

Pos logit contributions
loo+0.008
 dent+0.008
 dri+0.008
ister+0.008
imi+0.008
Neg logit contributions
 fascinated-0.008
 intrigued-0.008
 Curious-0.008
nam-0.007
mit-0.007
Token mom
Token ID1995
Feature activation+6.21e-02

Pos logit contributions
 fascinated+0.001
 intrigued+0.001
 curious+0.001
 Curious+0.001
mit+0.001
Neg logit contributions
loo-0.001
ister-0.001
 dri-0.001
arl-0.001
 dent-0.001
Token head
Token ID1182
Feature activation+1.03e-02

Pos logit contributions
 fascinated+0.010
 intrigued+0.010
 Curious+0.010
nam+0.009
mit+0.009
Neg logit contributions
loo-0.009
 dri-0.009
 dent-0.009
ister-0.009
imi-0.008
Token.
Token ID13
Feature activation+1.31e-02

Pos logit contributions
loo+0.015
 dent+0.015
 dri+0.015
ister+0.015
imi+0.014
Neg logit contributions
 fascinated-0.020
 intrigued-0.019
 Curious-0.019
nam-0.018
mit-0.018
Token She
Token ID1375
Feature activation+3.69e-02

Pos logit contributions
 dri+0.026
 dent+0.025
arl+0.025
loo+0.025
ister+0.024
Neg logit contributions
 fascinated-0.018
 intrigued-0.017
nam-0.017
mit-0.017
cious-0.016
Token felt
Token ID2936
Feature activation-5.81e-03

Pos logit contributions
loo+0.072
 dent+0.071
 dri+0.071
ister+0.070
imi+0.067
Neg logit contributions
 fascinated-0.054
 intrigued-0.052
 Curious-0.050
nam-0.047
mit-0.046
Token something
Token ID1223
Feature activation-6.94e-02

Pos logit contributions
 fascinated+0.009
 intrigued+0.009
 Curious+0.009
 curious+0.008
nam+0.008
Neg logit contributions
loo-0.011
ister-0.011
 dri-0.010
 dent-0.010
imi-0.010
Token wet
Token ID9583
Feature activation+1.02e-01

Pos logit contributions
 fascinated+0.119
 intrigued+0.115
 Curious+0.111
 curious+0.111
 wanted+0.103
Neg logit contributions
loo-0.128
ister-0.125
omo-0.116
imi-0.112
 dent-0.109
Token on
Token ID319
Feature activation+2.57e-02

Pos logit contributions
loo+0.205
ister+0.202
imi+0.201
omo+0.200
 dent+0.195
Neg logit contributions
 fascinated-0.137
 intrigued-0.121
 wanted-0.120
 curious-0.119
 Curious-0.118
Token her
Token ID607
Feature activation-5.10e-03

Pos logit contributions
loo+0.056
 dent+0.053
 dri+0.053
ister+0.052
arl+0.050
Neg logit contributions
 fascinated-0.033
 Curious-0.030
 intrigued-0.030
nam-0.029
 Lay-0.027
Token hand
Token ID1021
Feature activation+7.00e-02

Pos logit contributions
 fascinated+0.009
 intrigued+0.009
 Curious+0.009
nam+0.008
mit+0.008
Neg logit contributions
loo-0.008
 dent-0.008
 dri-0.008
ister-0.008
imi-0.007
Token.
Token ID13
Feature activation+1.30e-02

Pos logit contributions
loo+0.123
 dent+0.116
ister+0.115
 dri+0.115
imi+0.112
Neg logit contributions
 fascinated-0.117
 intrigued-0.110
 Curious-0.107
cious-0.105
 curious-0.102
Token It
Token ID632
Feature activation-1.97e-05

Pos logit contributions
 dri+0.023
 dent+0.022
loo+0.022
arl+0.021
ister+0.021
Neg logit contributions
 fascinated-0.021
 intrigued-0.020
mit-0.020
nam-0.020
cious-0.019
Token happy
Token ID3772
Feature activation-3.70e-02

Pos logit contributions
 fascinated+0.064
 intrigued+0.061
 Curious+0.059
nam+0.057
mit+0.055
Neg logit contributions
loo-0.067
 dent-0.066
 dri-0.065
ister-0.065
imi-0.062
Token and
Token ID290
Feature activation-3.05e-03

Pos logit contributions
 fascinated+0.065
 intrigued+0.062
 Curious+0.061
nam+0.058
mit+0.057
Neg logit contributions
loo-0.061
 dent-0.061
 dri-0.060
ister-0.059
imi-0.056
Token safe
Token ID3338
Feature activation+7.29e-02

Pos logit contributions
 fascinated+0.005
 intrigued+0.005
 Curious+0.005
mit+0.005
nam+0.005
Neg logit contributions
loo-0.005
 dri-0.005
 dent-0.005
ister-0.005
 collapse-0.004
Token.
Token ID13
Feature activation+1.58e-02

Pos logit contributions
loo+0.111
 dent+0.109
 dri+0.107
ister+0.106
imi+0.102
Neg logit contributions
 fascinated-0.139
 intrigued-0.133
 Curious-0.129
nam-0.124
mit-0.122
Token The
Token ID383
Feature activation+2.11e-03

Pos logit contributions
 dri+0.025
loo+0.025
 dent+0.024
ister+0.023
imi+0.023
Neg logit contributions
 fascinated-0.028
 intrigued-0.028
 Curious-0.026
mit-0.026
nam-0.026
Token end
Token ID886
Feature activation+9.72e-02

Pos logit contributions
loo+0.002
ister+0.002
 dent+0.002
arl+0.002
 collapse+0.002
Neg logit contributions
 fascinated-0.005
 Curious-0.005
 intrigued-0.005
 curious-0.004
nam-0.004
Token.
Token ID13
Feature activation+1.52e-02

Pos logit contributions
 dent+0.219
ister+0.210
loo+0.208
arl+0.205
uch+0.202
Neg logit contributions
 Curious-0.104
 curious-0.102
 fascinated-0.097
 wanted-0.093
 intrigued-0.091
Token<|endoftext|>
Token ID50256
Feature activation+8.89e-02

Pos logit contributions
loo+0.031
 dent+0.031
ister+0.030
 dri+0.030
 collapse+0.028
Neg logit contributions
 fascinated-0.021
 Curious-0.021
 intrigued-0.020
nam-0.018
 Lay-0.018
TokenOnce
Token ID7454
Feature activation+2.65e-02

Pos logit contributions
loo+0.154
 dent+0.154
 dri+0.153
ister+0.149
imi+0.142
Neg logit contributions
 fascinated-0.147
 intrigued-0.141
 Curious-0.138
nam-0.129
mit-0.128
Token upon
Token ID2402
Feature activation-3.84e-02

Pos logit contributions
loo+0.058
 dent+0.058
 dri+0.058
ister+0.056
imi+0.054
Neg logit contributions
 fascinated-0.032
 intrigued-0.030
 Curious-0.029
nam-0.027
mit-0.026
Token a
Token ID257
Feature activation-2.22e-03

Pos logit contributions
 fascinated+0.068
 intrigued+0.066
mit+0.066
 Curious+0.062
nam+0.062
Neg logit contributions
 dri-0.059
loo-0.059
 dent-0.059
ister-0.056
aster-0.055
Token she
Token ID673
Feature activation+6.62e-03

Pos logit contributions
 fascinated+0.079
 intrigued+0.076
 Curious+0.074
mit+0.069
nam+0.069
Neg logit contributions
loo-0.094
 dri-0.094
 dent-0.092
ister-0.092
imi-0.088
Token walked
Token ID6807
Feature activation-3.86e-02

Pos logit contributions
loo+0.014
 dent+0.014
 dri+0.013
ister+0.013
imi+0.013
Neg logit contributions
 fascinated-0.009
 intrigued-0.009
 Curious-0.008
nam-0.008
mit-0.008
Token is
Token ID318
Feature activation+1.67e-02

Pos logit contributions
 fascinated+0.081
 intrigued+0.078
nam+0.076
 Curious+0.075
cious+0.074
Neg logit contributions
loo-0.053
 dri-0.049
ister-0.049
 dent-0.047
imi-0.045
Token saw
Token ID2497
Feature activation-1.65e-02

Pos logit contributions
ister+0.020
 dri+0.019
 collapse+0.019
loo+0.019
 flash+0.018
Neg logit contributions
nam-0.036
 fascinated-0.034
mit-0.033
cious-0.033
 intrigued-0.033
Token a
Token ID257
Feature activation-6.57e-03

Pos logit contributions
 fascinated+0.032
 intrigued+0.031
mit+0.030
cious+0.030
 Curious+0.030
Neg logit contributions
loo-0.024
 dri-0.024
ister-0.022
 snow-0.022
 powder-0.022
Token big
Token ID1263
Feature activation+1.15e-01

Pos logit contributions
 fascinated+0.011
 intrigued+0.010
 Curious+0.010
nam+0.010
 curious+0.010
Neg logit contributions
loo-0.012
 dri-0.011
 dent-0.011
ister-0.011
arl-0.011
Token,
Token ID11
Feature activation-3.72e-03

Pos logit contributions
 dri+0.188
aid+0.185
arl+0.184
loo+0.183
 collapse+0.179
Neg logit contributions
 fascinated-0.188
 intrigued-0.182
 curious-0.180
 Curious-0.177
mit-0.173
Token shiny
Token ID22441
Feature activation+3.92e-02

Pos logit contributions
 fascinated+0.006
 intrigued+0.006
 Curious+0.006
mit+0.006
nam+0.006
Neg logit contributions
loo-0.006
 dri-0.006
 dent-0.006
ister-0.006
imi-0.006
Token mailbox
Token ID37282
Feature activation+3.09e-02

Pos logit contributions
 dri+0.061
loo+0.059
 collapse+0.058
 dent+0.058
ister+0.058
Neg logit contributions
 fascinated-0.067
 intrigued-0.066
 Curious-0.064
mit-0.064
nam-0.064
Token.
Token ID13
Feature activation+4.61e-03

Pos logit contributions
loo+0.050
 dent+0.049
 dri+0.049
ister+0.048
arl+0.047
Neg logit contributions
 fascinated-0.055
 intrigued-0.053
 Curious-0.051
nam-0.050
 curious-0.049
Token It
Token ID632
Feature activation-5.50e-03

Pos logit contributions
ay+0.005
 dri+0.005
O+0.005
imi+0.005
loo+0.005
Neg logit contributions
mit-0.010
 fascinated-0.010
nam-0.009
 intrigued-0.009
cious-0.009
Token see
Token ID766
Feature activation+1.25e-02

Pos logit contributions
loo+0.009
 dent+0.009
 dri+0.009
ister+0.009
imi+0.008
Neg logit contributions
 fascinated-0.008
 intrigued-0.008
 Curious-0.008
mit-0.007
nam-0.007
Token the
Token ID262
Feature activation+4.94e-04

Pos logit contributions
 dri+0.020
loo+0.020
ister+0.019
 dent+0.019
imi+0.018
Neg logit contributions
 fascinated-0.023
 intrigued-0.022
 Curious-0.022
cious-0.021
mit-0.021
Token movie
Token ID3807
Feature activation-2.70e-02

Pos logit contributions
loo+0.001
 dri+0.000
 dent+0.000
ister+0.000
imi+0.000
Neg logit contributions
 fascinated-0.001
 intrigued-0.001
 Curious-0.001
nam-0.001
mit-0.001
Token on
Token ID319
Feature activation+2.31e-02

Pos logit contributions
 fascinated+0.047
 intrigued+0.045
 Curious+0.043
cious+0.041
mit+0.041
Neg logit contributions
 dent-0.045
loo-0.044
 dri-0.044
arl-0.042
ister-0.041
Token the
Token ID262
Feature activation+5.37e-04

Pos logit contributions
loo+0.041
 dent+0.041
 dri+0.041
ister+0.039
arl+0.038
Neg logit contributions
 fascinated-0.037
 intrigued-0.035
 Curious-0.035
nam-0.033
cious-0.032
Token big
Token ID1263
Feature activation+1.22e-01

Pos logit contributions
loo+0.001
 dri+0.001
 dent+0.001
ister+0.001
imi+0.001
Neg logit contributions
 fascinated-0.001
 intrigued-0.001
 Curious-0.001
nam-0.001
mit-0.001
Token screen
Token ID3159
Feature activation-3.36e-02

Pos logit contributions
 dent+0.181
 dri+0.180
arl+0.176
ister+0.175
loo+0.174
Neg logit contributions
 fascinated-0.221
 intrigued-0.212
 Curious-0.212
 curious-0.200
cious-0.198
Token.
Token ID13
Feature activation+1.17e-02

Pos logit contributions
 fascinated+0.059
 intrigued+0.057
 Curious+0.055
 curious+0.051
mit+0.051
Neg logit contributions
 dent-0.055
loo-0.054
 dri-0.053
ister-0.052
arl-0.052
Token\n
Token ID198
Feature activation+3.10e-02

Pos logit contributions
 dri+0.011
omo+0.011
ay+0.011
imi+0.011
O+0.010
Neg logit contributions
mit-0.027
cious-0.026
 fascinated-0.026
nam-0.026
 intrigued-0.026
Token\n
Token ID198
Feature activation+2.96e-02

Pos logit contributions
 dent+0.056
loo+0.056
 dri+0.055
ister+0.053
 collapse+0.051
Neg logit contributions
 fascinated-0.050
 intrigued-0.048
 Curious-0.047
nam-0.043
cious-0.042
TokenAs
Token ID1722
Feature activation-3.70e-02

Pos logit contributions
ister+0.039
 dri+0.039
loo+0.038
 dent+0.038
imi+0.037
Neg logit contributions
 fascinated-0.059
 intrigued-0.057
mit-0.056
 Curious-0.055
nam-0.055
Token went
Token ID1816
Feature activation+4.84e-02

Pos logit contributions
 dri+0.007
 dent+0.006
ister+0.006
omo+0.006
loo+0.006
Neg logit contributions
nam-0.005
 fascinated-0.005
 intrigued-0.005
 Curious-0.005
 Lay-0.005
Token back
Token ID736
Feature activation+6.54e-02

Pos logit contributions
 dent+0.094
ister+0.092
loo+0.091
 dri+0.090
imi+0.088
Neg logit contributions
 fascinated-0.070
 intrigued-0.067
 Curious-0.066
 curious-0.062
mit-0.061
Token inside
Token ID2641
Feature activation-1.47e-02

Pos logit contributions
loo+0.108
 dri+0.107
 dent+0.104
ister+0.101
imi+0.098
Neg logit contributions
 fascinated-0.116
 intrigued-0.111
 Curious-0.108
nam-0.105
cious-0.103
Token.
Token ID13
Feature activation+1.17e-02

Pos logit contributions
 fascinated+0.023
 intrigued+0.021
 Curious+0.021
nam+0.019
 curious+0.019
Neg logit contributions
loo-0.028
 dent-0.028
 dri-0.027
ister-0.027
arl-0.026
Token She
Token ID1375
Feature activation+3.67e-02

Pos logit contributions
 dri+0.016
arl+0.016
imi+0.015
ay+0.014
omo+0.014
Neg logit contributions
mit-0.022
cious-0.022
 fascinated-0.021
folk-0.021
nam-0.021
Token took
Token ID1718
Feature activation+2.32e-02

Pos logit contributions
 dri+0.071
 dent+0.063
loo+0.063
omo+0.062
imi+0.061
Neg logit contributions
 fascinated-0.054
 intrigued-0.052
 Curious-0.051
mit-0.051
nam-0.051
Token off
Token ID572
Feature activation+2.13e-02

Pos logit contributions
loo+0.044
 dent+0.043
 dri+0.043
ister+0.042
imi+0.040
Neg logit contributions
 fascinated-0.036
 intrigued-0.034
 Curious-0.033
nam-0.032
mit-0.031
Token her
Token ID607
Feature activation-3.80e-03

Pos logit contributions
loo+0.040
 dent+0.040
 dri+0.039
ister+0.038
imi+0.037
Neg logit contributions
 fascinated-0.033
 intrigued-0.031
 Curious-0.031
nam-0.030
mit-0.028
Token wet
Token ID9583
Feature activation+1.02e-01

Pos logit contributions
 fascinated+0.007
 intrigued+0.007
 Curious+0.007
nam+0.007
mit+0.007
Neg logit contributions
loo-0.006
 dent-0.005
 dri-0.005
ister-0.005
 collapse-0.005
Token rain
Token ID6290
Feature activation+3.87e-02

Pos logit contributions
loo+0.162
ister+0.160
 dent+0.160
 collapse+0.154
 dri+0.152
Neg logit contributions
 fascinated-0.179
 Curious-0.175
 intrigued-0.172
mit-0.163
 curious-0.163
Tokencoat
Token ID31434
Feature activation-7.81e-03

Pos logit contributions
loo+0.071
 dent+0.071
ister+0.069
 dri+0.069
imi+0.067
Neg logit contributions
 fascinated-0.058
 intrigued-0.054
 Curious-0.054
cious-0.051
 curious-0.050
Token girl
Token ID2576
Feature activation-6.26e-02

Pos logit contributions
 fascinated+0.008
 intrigued+0.008
 Curious+0.008
nam+0.007
cious+0.007
Neg logit contributions
loo-0.010
 dri-0.010
 dent-0.009
ister-0.009
imi-0.009
Token wanted
Token ID2227
Feature activation-2.83e-02

Pos logit contributions
 fascinated+0.109
 intrigued+0.105
 Curious+0.103
nam+0.101
mit+0.098
Neg logit contributions
 dri-0.101
loo-0.099
 dent-0.099
ister-0.096
imi-0.092
Token to
Token ID284
Feature activation+1.24e-02

Pos logit contributions
 fascinated+0.051
 intrigued+0.050
 Curious+0.049
cious+0.046
mit+0.046
Neg logit contributions
 dri-0.045
loo-0.043
 dent-0.043
ister-0.042
imi-0.040
Token play
Token ID711
Feature activation+2.88e-02

Pos logit contributions
loo+0.018
 dent+0.018
 dri+0.018
ister+0.018
imi+0.017
Neg logit contributions
 fascinated-0.024
 intrigued-0.023
 Curious-0.022
nam-0.021
mit-0.021
Token with
Token ID351
Feature activation-5.42e-02

Pos logit contributions
loo+0.054
 dent+0.054
 dri+0.053
arl+0.052
ister+0.051
Neg logit contributions
 fascinated-0.044
 intrigued-0.042
 Curious-0.040
nam-0.039
 curious-0.039
Token it
Token ID340
Feature activation+1.64e-02

Pos logit contributions
 fascinated+0.089
 intrigued+0.085
 Curious+0.083
nam+0.079
mit+0.078
Neg logit contributions
loo-0.096
 dent-0.094
 dri-0.094
ister-0.092
imi-0.088
Token,
Token ID11
Feature activation+8.26e-04

Pos logit contributions
loo+0.027
 dent+0.027
 dri+0.026
ister+0.026
arl+0.025
Neg logit contributions
 fascinated-0.029
 intrigued-0.028
 Curious-0.027
nam-0.026
 curious-0.026
Token so
Token ID523
Feature activation-3.00e-02

Pos logit contributions
 dri+0.002
 dent+0.002
ister+0.002
imi+0.002
loo+0.002
Neg logit contributions
 fascinated-0.001
 intrigued-0.001
cious-0.001
mit-0.001
 Curious-0.001
Token her
Token ID607
Feature activation-3.34e-04

Pos logit contributions
 fascinated+0.049
 intrigued+0.047
 Curious+0.046
mit+0.044
nam+0.043
Neg logit contributions
 dri-0.052
ister-0.051
loo-0.051
 dent-0.050
imi-0.049
Token mom
Token ID1995
Feature activation+6.96e-02

Pos logit contributions
 fascinated+0.001
 intrigued+0.001
 Curious+0.001
nam+0.001
cious+0.001
Neg logit contributions
 dri-0.000
loo-0.000
 dent-0.000
ister-0.000
imi-0.000
Token helped
Token ID4193
Feature activation+1.37e-02

Pos logit contributions
loo+0.120
 dri+0.120
 dent+0.118
ister+0.116
imi+0.111
Neg logit contributions
 fascinated-0.116
 intrigued-0.111
 Curious-0.109
nam-0.104
mit-0.102
Token strange
Token ID6283
Feature activation+2.49e-02

Pos logit contributions
 fascinated+0.135
 intrigued+0.130
 Curious+0.127
nam+0.121
mit+0.120
Neg logit contributions
loo-0.121
 dent-0.120
 dri-0.120
ister-0.116
imi-0.111
Token happened
Token ID3022
Feature activation-1.22e-03

Pos logit contributions
loo+0.044
 dri+0.044
 dent+0.044
ister+0.042
 collapse+0.040
Neg logit contributions
 fascinated-0.041
 intrigued-0.039
 Curious-0.038
nam-0.037
mit-0.036
Token.
Token ID13
Feature activation+6.10e-03

Pos logit contributions
 fascinated+0.003
 intrigued+0.003
 Curious+0.003
nam+0.003
mit+0.002
Neg logit contributions
loo-0.001
 dri-0.001
 dent-0.001
ister-0.001
 collapse-0.001
Token Bear
Token ID14732
Feature activation-1.81e-02

Pos logit contributions
 dri+0.010
 dent+0.010
loo+0.010
ilo+0.009
imi+0.009
Neg logit contributions
 fascinated-0.010
nam-0.010
mit-0.010
 intrigued-0.009
cious-0.009
Token wanted
Token ID2227
Feature activation-2.79e-02

Pos logit contributions
nam+0.028
 fascinated+0.028
 Curious+0.027
 intrigued+0.027
mit+0.026
Neg logit contributions
 dri-0.032
 dent-0.031
ister-0.031
loo-0.030
imi-0.029
Token chocolate
Token ID11311
Feature activation+5.86e-02

Pos logit contributions
 fascinated+0.046
 intrigued+0.044
 Curious+0.043
nam+0.041
mit+0.041
Neg logit contributions
 dri-0.049
loo-0.049
 dent-0.048
ister-0.047
imi-0.044
Token milk
Token ID7545
Feature activation+4.87e-02

Pos logit contributions
 dent+0.105
loo+0.104
arl+0.103
 dri+0.097
imi+0.096
Neg logit contributions
 fascinated-0.090
 intrigued-0.085
mit-0.082
cious-0.081
 Curious-0.080
Token,
Token ID11
Feature activation-3.58e-03

Pos logit contributions
 dent+0.090
loo+0.089
 dri+0.083
arl+0.083
imi+0.083
Neg logit contributions
 fascinated-0.073
mit-0.069
 intrigued-0.069
cious-0.066
 curious-0.064
Token but
Token ID475
Feature activation-5.20e-02

Pos logit contributions
 fascinated+0.005
 intrigued+0.004
 Curious+0.004
mit+0.004
nam+0.004
Neg logit contributions
 dri-0.008
 dent-0.007
loo-0.007
ister-0.007
omo-0.007
Token Rabbit
Token ID25498
Feature activation-3.62e-02

Pos logit contributions
 fascinated+0.093
 intrigued+0.089
 Curious+0.089
nam+0.085
cious+0.085
Neg logit contributions
 dri-0.081
loo-0.080
 dent-0.080
ister-0.078
imi-0.075
Token wanted
Token ID2227
Feature activation-2.70e-02

Pos logit contributions
 fascinated+0.060
nam+0.058
 Curious+0.057
 intrigued+0.057
mit+0.053
Neg logit contributions
 dri-0.062
 dent-0.059
ister-0.059
loo-0.058
imi-0.055
Token and
Token ID290
Feature activation-1.01e-02

Pos logit contributions
nam+0.057
 fascinated+0.056
 Curious+0.055
 intrigued+0.055
chard+0.053
Neg logit contributions
 dri-0.029
 dent-0.027
ister-0.027
arl-0.026
imi-0.026
Token spl
Token ID4328
Feature activation-5.92e-03

Pos logit contributions
 fascinated+0.016
 Curious+0.016
 intrigued+0.016
mit+0.015
cious+0.015
Neg logit contributions
 dri-0.018
 dent-0.017
 collapse-0.016
 flash-0.015
 swollen-0.015
Tokenashed
Token ID5263
Feature activation-1.58e-02

Pos logit contributions
 fascinated+0.007
 intrigued+0.006
 Curious+0.006
nam+0.006
 Lay+0.005
Neg logit contributions
loo-0.014
 dri-0.014
 dent-0.013
ister-0.013
arl-0.013
Token.
Token ID13
Feature activation+1.50e-02

Pos logit contributions
 fascinated+0.029
 intrigued+0.028
 Curious+0.027
nam+0.026
cious+0.026
Neg logit contributions
 dri-0.026
loo-0.025
 dent-0.025
ister-0.024
arl-0.023
Token They
Token ID1119
Feature activation+7.15e-04

Pos logit contributions
 dri+0.025
 dent+0.025
arl+0.023
omo+0.023
O+0.023
Neg logit contributions
 fascinated-0.025
mit-0.025
 intrigued-0.025
nam-0.024
cious-0.023
Token did
Token ID750
Feature activation+3.72e-02

Pos logit contributions
 dri+0.001
 dent+0.001
 flash+0.001
 collapse+0.001
omo+0.001
Neg logit contributions
 fascinated-0.001
 Curious-0.001
 intrigued-0.001
cious-0.001
nam-0.001
Token not
Token ID407
Feature activation+2.58e-02

Pos logit contributions
 dent+0.042
 dri+0.041
 collapse+0.040
ister+0.037
 flash+0.036
Neg logit contributions
 fascinated-0.082
 Curious-0.081
nam-0.080
 intrigued-0.080
mit-0.078
Token see
Token ID766
Feature activation+1.62e-02

Pos logit contributions
 dri+0.040
 dent+0.039
ister+0.037
loo+0.037
 collapse+0.036
Neg logit contributions
 fascinated-0.049
 intrigued-0.047
 Curious-0.047
nam-0.045
cious-0.043
Token the
Token ID262
Feature activation-5.37e-04

Pos logit contributions
 dri+0.023
 dent+0.022
loo+0.022
ister+0.021
 collapse+0.021
Neg logit contributions
 fascinated-0.032
 intrigued-0.032
 Curious-0.032
cious-0.030
mit-0.029
Token sign
Token ID1051
Feature activation-1.98e-02

Pos logit contributions
 fascinated+0.001
 intrigued+0.001
 Curious+0.001
nam+0.001
cious+0.001
Neg logit contributions
loo-0.001
 dri-0.001
 dent-0.001
ister-0.001
imi-0.001
Token that
Token ID326
Feature activation+5.56e-02

Pos logit contributions
 fascinated+0.040
 intrigued+0.038
 Curious+0.037
nam+0.035
mit+0.035
Neg logit contributions
loo-0.028
 dent-0.027
 dri-0.026
ister-0.026
imi-0.025
Token happy
Token ID3772
Feature activation-4.49e-02

Pos logit contributions
 Curious+0.025
 intrigued+0.025
 fascinated+0.025
mit+0.023
cious+0.022
Neg logit contributions
 dri-0.031
ister-0.030
imi-0.030
loo-0.030
 flash-0.029
Token.
Token ID13
Feature activation-5.70e-04

Pos logit contributions
 fascinated+0.094
 intrigued+0.093
 Curious+0.092
nam+0.089
mit+0.088
Neg logit contributions
loo-0.059
 dri-0.056
ister-0.053
imi-0.052
 dent-0.052
Token\n
Token ID198
Feature activation+2.06e-02

Pos logit contributions
nam+0.001
mit+0.001
 fascinated+0.001
cious+0.001
 intrigued+0.001
Neg logit contributions
imi-0.001
loo-0.001
ay-0.001
ilo-0.001
omo-0.001
Token\n
Token ID198
Feature activation+2.02e-02

Pos logit contributions
loo+0.032
 dri+0.031
ister+0.031
 dent+0.031
imi+0.030
Neg logit contributions
 fascinated-0.037
 intrigued-0.036
 Curious-0.035
nam-0.035
mit-0.034
TokenOne
Token ID3198
Feature activation+2.07e-02

Pos logit contributions
imi+0.013
O+0.013
ister+0.012
 dri+0.012
loo+0.011
Neg logit contributions
nam-0.053
mit-0.053
 fascinated-0.053
 intrigued-0.051
 Curious-0.050
Token day
Token ID1110
Feature activation+3.78e-02

Pos logit contributions
loo+0.027
 dri+0.026
 dent+0.026
ister+0.026
imi+0.024
Neg logit contributions
 fascinated-0.044
 intrigued-0.042
 Curious-0.041
nam-0.040
mit-0.039
Token,
Token ID11
Feature activation-8.33e-03

Pos logit contributions
loo+0.029
imi+0.028
ister+0.028
 dri+0.028
ww+0.027
Neg logit contributions
 fascinated-0.095
 intrigued-0.093
nam-0.092
 Curious-0.090
mit-0.088
Token Daisy
Token ID40355
Feature activation-5.79e-02

Pos logit contributions
 fascinated+0.015
mit+0.015
cious+0.015
uries+0.015
 intrigued+0.015
Neg logit contributions
O-0.011
imi-0.011
ww-0.011
 powder-0.011
 dri-0.011
Token's
Token ID338
Feature activation-1.38e-02

Pos logit contributions
nam+0.111
 fascinated+0.095
mit+0.093
 intrigued+0.092
chard+0.090
Neg logit contributions
 dri-0.101
loo-0.087
ister-0.085
imi-0.083
?-0.083
Token parents
Token ID3397
Feature activation-5.57e-02

Pos logit contributions
 fascinated+0.030
 intrigued+0.029
 Curious+0.028
cious+0.027
nam+0.027
Neg logit contributions
loo-0.018
 dri-0.017
ister-0.016
 dent-0.016
imi-0.016
Token said
Token ID531
Feature activation-1.94e-02

Pos logit contributions
 fascinated+0.088
 intrigued+0.085
nam+0.084
 Curious+0.084
cious+0.081
Neg logit contributions
 dri-0.103
ister-0.096
loo-0.096
 dent-0.092
imi-0.088
Token is
Token ID318
Feature activation+2.01e-02

Pos logit contributions
 dri+0.005
 dent+0.004
loo+0.004
ister+0.004
 collapse+0.004
Neg logit contributions
 fascinated-0.008
nam-0.008
 Curious-0.008
 intrigued-0.008
cious-0.008
Token also
Token ID635
Feature activation+3.47e-02

Pos logit contributions
loo+0.031
ister+0.029
 dent+0.029
 dri+0.029
imi+0.028
Neg logit contributions
 fascinated-0.039
 intrigued-0.037
 Curious-0.036
nam-0.035
mit-0.034
Token important
Token ID1593
Feature activation-5.64e-03

Pos logit contributions
 dri+0.049
 dent+0.048
loo+0.047
ister+0.045
imi+0.043
Neg logit contributions
 fascinated-0.068
 Curious-0.066
 intrigued-0.066
cious-0.062
nam-0.062
Token to
Token ID284
Feature activation+7.76e-03

Pos logit contributions
 fascinated+0.010
 intrigued+0.010
 Curious+0.010
nam+0.009
cious+0.009
Neg logit contributions
loo-0.009
 dri-0.009
 dent-0.009
ister-0.009
imi-0.008
Token be
Token ID307
Feature activation+4.55e-02

Pos logit contributions
loo+0.013
 dri+0.012
 dent+0.012
ister+0.011
imi+0.011
Neg logit contributions
 fascinated-0.014
 intrigued-0.014
mit-0.013
 Curious-0.013
nam-0.013
Token kind
Token ID1611
Feature activation-4.36e-02

Pos logit contributions
 dent+0.062
 dri+0.061
loo+0.057
ister+0.057
aid+0.055
Neg logit contributions
 Curious-0.088
 fascinated-0.084
 intrigued-0.082
 Lay-0.081
cious-0.080
Token and
Token ID290
Feature activation-9.25e-03

Pos logit contributions
 fascinated+0.085
 intrigued+0.082
 Curious+0.081
nam+0.078
cious+0.075
Neg logit contributions
loo-0.064
 dri-0.063
 dent-0.062
ister-0.061
imi-0.057
Token play
Token ID711
Feature activation+2.32e-02

Pos logit contributions
 fascinated+0.017
 Curious+0.016
 intrigued+0.016
cious+0.015
mit+0.015
Neg logit contributions
 dri-0.014
 dent-0.014
loo-0.014
ister-0.013
 collapse-0.013
Token together
Token ID1978
Feature activation-3.90e-02

Pos logit contributions
loo+0.037
 dri+0.037
 dent+0.037
ister+0.036
imi+0.034
Neg logit contributions
 fascinated-0.042
 intrigued-0.040
 Curious-0.039
nam-0.038
mit-0.037
Token.
Token ID13
Feature activation+1.31e-02

Pos logit contributions
 fascinated+0.073
 intrigued+0.070
 Curious+0.067
nam+0.066
mit+0.065
Neg logit contributions
loo-0.060
 dent-0.060
 dri-0.059
ister-0.057
imi-0.055
Token<|endoftext|>
Token ID50256
Feature activation+8.17e-02

Pos logit contributions
loo+0.022
 dri+0.021
 dent+0.021
ister+0.020
imi+0.020
Neg logit contributions
 fascinated-0.023
 intrigued-0.022
 Curious-0.021
nam-0.021
mit-0.021
Token happy
Token ID3772
Feature activation-4.43e-02

Pos logit contributions
 fascinated+0.008
 intrigued+0.007
 Curious+0.007
nam+0.006
mit+0.006
Neg logit contributions
loo-0.015
 dri-0.015
 dent-0.015
ister-0.015
 collapse-0.014
Token little
Token ID1310
Feature activation-9.10e-03

Pos logit contributions
 fascinated+0.061
 intrigued+0.058
 Curious+0.056
nam+0.053
mit+0.052
Neg logit contributions
loo-0.089
 dent-0.089
 dri-0.088
ister-0.086
imi-0.083
Token girl
Token ID2576
Feature activation-6.21e-02

Pos logit contributions
 fascinated+0.011
 intrigued+0.010
 Curious+0.010
mit+0.010
nam+0.010
Neg logit contributions
 dent-0.019
 dri-0.019
loo-0.019
 collapse-0.019
ister-0.019
Token named
Token ID3706
Feature activation-8.08e-02

Pos logit contributions
 fascinated+0.103
 intrigued+0.097
 Curious+0.095
nam+0.091
mit+0.090
Neg logit contributions
loo-0.110
 dent-0.109
 dri-0.108
ister-0.105
 collapse-0.101
Token Jane
Token ID12091
Feature activation-5.76e-02

Pos logit contributions
 fascinated+0.174
 intrigued+0.169
mit+0.155
 Curious+0.152
nam+0.151
Neg logit contributions
imi-0.108
 dri-0.106
arl-0.099
loo-0.096
omo-0.095
Token.
Token ID13
Feature activation-2.33e-04

Pos logit contributions
 fascinated+0.114
 intrigued+0.109
 Curious+0.107
nam+0.103
mit+0.102
Neg logit contributions
loo-0.083
 dent-0.082
 dri-0.081
ister-0.079
imi-0.074
Token Jane
Token ID12091
Feature activation-5.77e-02

Pos logit contributions
mit+0.000
nam+0.000
cious+0.000
 fascinated+0.000
ked+0.000
Neg logit contributions
loo-0.000
 dri-0.000
ay-0.000
omo-0.000
 collapse-0.000
Token was
Token ID373
Feature activation-3.73e-03

Pos logit contributions
 fascinated+0.067
 intrigued+0.065
 Curious+0.063
nam+0.060
mit+0.057
Neg logit contributions
 dri-0.127
loo-0.126
 dent-0.124
ister-0.122
imi-0.120
Token very
Token ID845
Feature activation+1.13e-02

Pos logit contributions
 Curious+0.007
 fascinated+0.007
 intrigued+0.007
cious+0.007
mit+0.007
Neg logit contributions
 dri-0.004
 dent-0.004
imi-0.004
loo-0.004
ister-0.004
Token excited
Token ID6568
Feature activation-4.47e-02

Pos logit contributions
ister+0.020
loo+0.020
 dri+0.020
arl+0.019
imi+0.019
Neg logit contributions
 Curious-0.017
 fascinated-0.016
 intrigued-0.016
cious-0.015
nam-0.015
Token for
Token ID329
Feature activation+5.39e-02

Pos logit contributions
 fascinated+0.084
 intrigued+0.083
 Curious+0.083
mit+0.080
nam+0.079
Neg logit contributions
loo-0.065
ister-0.063
 dri-0.062
imi-0.060
 dent-0.060
Token the
Token ID262
Feature activation-2.56e-03

Pos logit contributions
 dri+0.003
loo+0.003
 dent+0.003
ister+0.003
 collapse+0.002
Neg logit contributions
 fascinated-0.003
 intrigued-0.003
 Curious-0.003
cious-0.003
mit-0.003
Token prett
Token ID46442
Feature activation-3.44e-02

Pos logit contributions
 fascinated+0.004
 intrigued+0.004
 Curious+0.004
cious+0.004
mit+0.004
Neg logit contributions
 dent-0.004
 dri-0.004
loo-0.004
ister-0.004
 collapse-0.004
Tokeniest
Token ID6386
Feature activation-2.26e-02

Pos logit contributions
 fascinated+0.069
 intrigued+0.067
mit+0.066
 Curious+0.064
nam+0.063
Neg logit contributions
 dent-0.047
 dri-0.043
ister-0.043
loo-0.042
 collapse-0.041
Token stickers
Token ID28568
Feature activation-3.58e-02

Pos logit contributions
 fascinated+0.039
 intrigued+0.038
 Curious+0.037
nam+0.035
cious+0.034
Neg logit contributions
loo-0.038
 dri-0.038
 dent-0.037
ister-0.036
imi-0.034
Token.
Token ID13
Feature activation+7.24e-03

Pos logit contributions
 fascinated+0.074
 intrigued+0.072
 Curious+0.071
nam+0.067
mit+0.065
Neg logit contributions
loo-0.048
 dri-0.047
 dent-0.046
ister-0.046
 collapse-0.042
Token They
Token ID1119
Feature activation-2.83e-03

Pos logit contributions
loo+0.013
 dri+0.013
 dent+0.012
omo+0.012
ay+0.012
Neg logit contributions
 fascinated-0.011
nam-0.011
 intrigued-0.011
cious-0.010
mit-0.010
Token spent
Token ID3377
Feature activation+5.81e-03

Pos logit contributions
 fascinated+0.004
 intrigued+0.004
 Curious+0.004
nam+0.004
mit+0.004
Neg logit contributions
 dri-0.005
loo-0.005
 dent-0.005
ister-0.005
imi-0.005
Token a
Token ID257
Feature activation-3.85e-03

Pos logit contributions
 dri+0.009
 dent+0.009
loo+0.008
ister+0.008
 flash+0.008
Neg logit contributions
 fascinated-0.011
 Curious-0.010
 intrigued-0.010
nam-0.010
mit-0.010
Token long
Token ID890
Feature activation+1.14e-01

Pos logit contributions
 fascinated+0.007
 intrigued+0.007
 Curious+0.007
nam+0.006
cious+0.006
Neg logit contributions
loo-0.006
 dent-0.006
 dri-0.006
ister-0.006
arl-0.006
Token time
Token ID640
Feature activation+7.40e-04

Pos logit contributions
loo+0.166
 dri+0.163
 dent+0.163
ister+0.159
imi+0.149
Neg logit contributions
 fascinated-0.224
 intrigued-0.215
 Curious-0.209
nam-0.200
cious-0.199
Token carefully
Token ID7773
Feature activation+4.68e-02

Pos logit contributions
loo+0.001
 dent+0.001
 dri+0.001
ister+0.001
 collapse+0.001
Neg logit contributions
 fascinated-0.001
 intrigued-0.001
 Curious-0.001
nam-0.001
mit-0.001
Token get
Token ID651
Feature activation+3.44e-02

Pos logit contributions
loo+0.024
 dent+0.024
 dri+0.023
ister+0.023
imi+0.022
Neg logit contributions
 fascinated-0.023
 intrigued-0.022
mit-0.020
 Curious-0.020
nam-0.019
Token together
Token ID1978
Feature activation-4.22e-02

Pos logit contributions
 dri+0.050
 dent+0.046
loo+0.045
ister+0.043
 collapse+0.042
Neg logit contributions
 Curious-0.066
 fascinated-0.066
cious-0.065
 intrigued-0.064
mit-0.060
Token and
Token ID290
Feature activation-1.26e-02

Pos logit contributions
 fascinated+0.073
 intrigued+0.070
 Curious+0.069
nam+0.066
cious+0.064
Neg logit contributions
loo-0.070
 dri-0.069
 dent-0.069
ister-0.067
 collapse-0.063
Token build
Token ID1382
Feature activation+2.65e-02

Pos logit contributions
 fascinated+0.024
 intrigued+0.023
 Curious+0.023
cious+0.022
nam+0.021
Neg logit contributions
 dri-0.019
loo-0.019
 dent-0.018
ister-0.018
 collapse-0.017
Token more
Token ID517
Feature activation+3.42e-02

Pos logit contributions
 dent+0.041
 dri+0.040
loo+0.039
 collapse+0.037
ister+0.037
Neg logit contributions
 fascinated-0.049
 intrigued-0.047
 Curious-0.047
cious-0.045
mit-0.045
Token things
Token ID1243
Feature activation-8.09e-02

Pos logit contributions
 dent+0.057
 dri+0.056
loo+0.056
ister+0.053
 flash+0.051
Neg logit contributions
 fascinated-0.057
 Curious-0.056
 intrigued-0.055
cious-0.053
nam-0.052
Token around
Token ID1088
Feature activation-7.21e-02

Pos logit contributions
 fascinated+0.163
 intrigued+0.157
 Curious+0.154
nam+0.148
mit+0.146
Neg logit contributions
loo-0.112
 dri-0.110
 dent-0.110
ister-0.107
imi-0.100
Token the
Token ID262
Feature activation-3.31e-03

Pos logit contributions
 fascinated+0.132
 intrigued+0.127
 Curious+0.126
mit+0.118
nam+0.118
Neg logit contributions
loo-0.113
 dent-0.112
 dri-0.112
ister-0.109
 collapse-0.103
Token big
Token ID1263
Feature activation+1.17e-01

Pos logit contributions
 fascinated+0.007
 intrigued+0.007
 Curious+0.007
mit+0.006
nam+0.006
Neg logit contributions
 dent-0.004
loo-0.004
 dri-0.004
ister-0.004
 collapse-0.004
Token house
Token ID2156
Feature activation+7.57e-02

Pos logit contributions
loo+0.183
 dri+0.180
 dent+0.180
ister+0.176
imi+0.167
Neg logit contributions
 fascinated-0.217
 intrigued-0.209
 Curious-0.203
nam-0.195
mit-0.192
Token,
Token ID11
Feature activation-3.58e-03

Pos logit contributions
loo+0.111
 dent+0.110
 dri+0.109
ister+0.106
imi+0.102
Neg logit contributions
 fascinated-0.146
 intrigued-0.141
 Curious-0.136
nam-0.131
mit-0.130
Token<|endoftext|>
Token ID50256
Feature activation+9.12e-02

Pos logit contributions
loo+0.014
omo+0.014
imi+0.014
 dri+0.014
 dent+0.014
Neg logit contributions
 fascinated-0.015
 intrigued-0.015
mit-0.014
cious-0.014
nam-0.014
TokenOnce
Token ID7454
Feature activation+2.71e-02

Pos logit contributions
 dent+0.164
loo+0.162
 dri+0.162
ister+0.158
 collapse+0.151
Neg logit contributions
 fascinated-0.144
 intrigued-0.138
 Curious-0.137
nam-0.123
cious-0.123
Token upon
Token ID2402
Feature activation-4.16e-02

Pos logit contributions
loo+0.059
 dri+0.057
 dent+0.057
ister+0.056
imi+0.055
Neg logit contributions
 fascinated-0.034
 intrigued-0.033
 Curious-0.032
cious-0.030
nam-0.029
Token a
Token ID257
Feature activation+2.47e-03

Pos logit contributions
 fascinated+0.075
mit+0.073
 intrigued+0.073
 Curious+0.069
cious+0.069
Neg logit contributions
 dent-0.061
 dri-0.061
loo-0.061
ister-0.058
aster-0.057
Token time
Token ID640
Feature activation+3.75e-03

Pos logit contributions
loo+0.004
 dent+0.004
 dri+0.004
ister+0.004
 collapse+0.003
Neg logit contributions
 fascinated-0.005
 intrigued-0.005
 Curious-0.004
cious-0.004
mit-0.004
Token,
Token ID11
Feature activation-2.39e-04

Pos logit contributions
loo+0.004
ister+0.004
 dent+0.004
imi+0.004
 dri+0.004
Neg logit contributions
 fascinated-0.009
 intrigued-0.008
 Curious-0.008
nam-0.008
mit-0.008
Token there
Token ID612
Feature activation-9.71e-02

Pos logit contributions
 fascinated+0.000
 intrigued+0.000
 Curious+0.000
mit+0.000
nam+0.000
Neg logit contributions
loo-0.000
 dent-0.000
 dri-0.000
ister-0.000
imi-0.000
Token was
Token ID373
Feature activation+1.11e-03

Pos logit contributions
 fascinated+0.152
 intrigued+0.144
 Curious+0.140
mit+0.129
cious+0.127
Neg logit contributions
 dent-0.177
loo-0.177
 dri-0.176
ister-0.173
arl-0.172
Token a
Token ID257
Feature activation+3.53e-04

Pos logit contributions
loo+0.002
 dri+0.002
 dent+0.002
imi+0.002
ister+0.002
Neg logit contributions
 fascinated-0.002
 intrigued-0.002
mit-0.002
 Curious-0.002
nam-0.002
Token rare
Token ID4071
Feature activation-4.91e-03

Pos logit contributions
loo+0.001
 dri+0.001
 dent+0.001
ister+0.001
imi+0.001
Neg logit contributions
 fascinated-0.000
 intrigued-0.000
 Curious-0.000
nam-0.000
mit-0.000
Token bear
Token ID6842
Feature activation-3.98e-02

Pos logit contributions
 fascinated+0.008
 intrigued+0.007
 Curious+0.007
mit+0.007
nam+0.007
Neg logit contributions
 dent-0.009
 dri-0.009
loo-0.009
 collapse-0.008
ister-0.008
Token a
Token ID257
Feature activation-5.48e-03

Pos logit contributions
 fascinated+0.077
mit+0.076
 intrigued+0.075
nam+0.070
 Curious+0.069
Neg logit contributions
loo-0.061
 dri-0.061
 dent-0.061
aster-0.059
ister-0.059
Token time
Token ID640
Feature activation-2.74e-03

Pos logit contributions
 fascinated+0.010
 intrigued+0.010
 Curious+0.009
cious+0.009
 curious+0.009
Neg logit contributions
loo-0.009
 dent-0.009
 dri-0.009
ister-0.008
 collapse-0.008
Token there
Token ID612
Feature activation-1.08e-01

Pos logit contributions
 fascinated+0.007
 intrigued+0.007
 Curious+0.007
nam+0.006
cious+0.006
Neg logit contributions
loo-0.003
ister-0.003
imi-0.002
 dent-0.002
ay-0.002
Token were
Token ID547
Feature activation+1.60e-02

Pos logit contributions
 fascinated+0.178
 intrigued+0.170
 Curious+0.167
mit+0.154
nam+0.153
Neg logit contributions
loo-0.188
 dent-0.188
 dri-0.187
ister-0.183
arl-0.179
Token two
Token ID734
Feature activation-4.12e-02

Pos logit contributions
loo+0.026
 dri+0.025
ister+0.024
imi+0.024
 dent+0.024
Neg logit contributions
 fascinated-0.028
 intrigued-0.027
 Curious-0.026
mit-0.026
nam-0.025
Token friends
Token ID2460
Feature activation-1.07e-01

Pos logit contributions
 fascinated+0.076
 Curious+0.074
 intrigued+0.074
nam+0.069
cious+0.069
Neg logit contributions
 dri-0.062
loo-0.062
 dent-0.061
ister-0.060
imi-0.060
Token,
Token ID11
Feature activation-4.95e-03

Pos logit contributions
 fascinated+0.210
 intrigued+0.204
 Curious+0.201
nam+0.193
cious+0.189
Neg logit contributions
loo-0.156
 dent-0.151
ister-0.150
 dri-0.150
imi-0.144
Token Sam
Token ID3409
Feature activation+8.59e-03

Pos logit contributions
 fascinated+0.010
 intrigued+0.010
mit+0.009
cious+0.009
 Curious+0.009
Neg logit contributions
omo-0.006
 snowball-0.006
 dri-0.006
imi-0.006
loo-0.006
Token and
Token ID290
Feature activation-1.33e-02

Pos logit contributions
 dri+0.011
loo+0.011
ister+0.011
 dent+0.011
imi+0.011
Neg logit contributions
 fascinated-0.018
nam-0.017
 intrigued-0.017
 Curious-0.017
mit-0.016
Token Lucy
Token ID22162
Feature activation-3.79e-02

Pos logit contributions
 fascinated+0.024
 intrigued+0.024
 Curious+0.022
 mesmer+0.021
mit+0.020
Neg logit contributions
ister-0.020
imi-0.019
loo-0.019
ww-0.019
 dri-0.019
Token.
Token ID13
Feature activation+3.97e-03

Pos logit contributions
 fascinated+0.086
 intrigued+0.085
 Curious+0.085
nam+0.080
mit+0.078
Neg logit contributions
 dent-0.039
loo-0.039
 dri-0.039
ister-0.036
imi-0.035
Token idea
Token ID2126
Feature activation-3.56e-02

Pos logit contributions
loo+0.070
 dent+0.070
 dri+0.069
ister+0.068
imi+0.064
Neg logit contributions
 fascinated-0.077
 intrigued-0.074
 Curious-0.073
nam-0.069
mit-0.069
Token.
Token ID13
Feature activation+2.25e-02

Pos logit contributions
 fascinated+0.075
 intrigued+0.072
 Curious+0.071
nam+0.069
mit+0.067
Neg logit contributions
loo-0.047
 dri-0.046
 dent-0.045
ister-0.044
imi-0.041
Token She
Token ID1375
Feature activation+4.20e-02

Pos logit contributions
arl+0.034
 dri+0.033
 dent+0.032
imi+0.032
loo+0.032
Neg logit contributions
 fascinated-0.041
 intrigued-0.040
mit-0.039
nam-0.039
cious-0.038
Token nods
Token ID34833
Feature activation-8.64e-02

Pos logit contributions
 dri+0.072
 dent+0.068
loo+0.067
ister+0.064
imi+0.064
Neg logit contributions
 fascinated-0.072
nam-0.071
 intrigued-0.069
 Curious-0.067
cious-0.066
Token and
Token ID290
Feature activation-5.36e-03

Pos logit contributions
 fascinated+0.126
 intrigued+0.120
 Curious+0.116
 curious+0.111
mit+0.105
Neg logit contributions
loo-0.177
 dent-0.173
 dri-0.168
ister-0.166
 collapse-0.163
Token smiles
Token ID21845
Feature activation-1.01e-01

Pos logit contributions
 fascinated+0.010
 intrigued+0.009
 Curious+0.009
nam+0.009
cious+0.009
Neg logit contributions
 dri-0.009
 dent-0.008
loo-0.008
ister-0.008
 collapse-0.008
Token.
Token ID13
Feature activation+2.18e-02

Pos logit contributions
 fascinated+0.203
 intrigued+0.196
 Curious+0.192
nam+0.186
mit+0.184
Neg logit contributions
loo-0.140
 dri-0.139
 dent-0.139
ister-0.134
imi-0.126
Token
Token ID198
Feature activation+3.39e-02

Pos logit contributions
 dri+0.037
loo+0.036
 dent+0.036
arl+0.036
ister+0.035
Neg logit contributions
 fascinated-0.037
 intrigued-0.035
nam-0.034
mit-0.034
 Curious-0.033
Token
Token ID198
Feature activation+3.34e-02

Pos logit contributions
loo+0.077
 dent+0.077
 dri+0.076
 collapse+0.072
uch+0.070
Neg logit contributions
 fascinated-0.038
 Curious-0.036
 intrigued-0.035
 curious-0.031
 Lay-0.030
Token"
Token ID1
Feature activation-4.57e-02

Pos logit contributions
loo+0.066
 dri+0.064
 dent+0.064
 ig+0.059
 fever+0.058
Neg logit contributions
 fascinated-0.048
 Curious-0.047
 intrigued-0.046
 curious-0.044
cious-0.044
TokenOK
Token ID11380
Feature activation-7.81e-03

Pos logit contributions
 fascinated+0.084
 intrigued+0.080
 Curious+0.079
nam+0.075
mit+0.074
Neg logit contributions
loo-0.073
 dent-0.072
 dri-0.072
ister-0.070
imi-0.067
Token big
Token ID1263
Feature activation+1.16e-01

Pos logit contributions
 fascinated+0.007
 intrigued+0.007
 Curious+0.007
nam+0.006
mit+0.006
Neg logit contributions
loo-0.013
 dri-0.013
 dent-0.013
ister-0.012
 collapse-0.012
Token airport
Token ID9003
Feature activation-7.43e-03

Pos logit contributions
 dri+0.206
 collapse+0.206
 dent+0.201
arl+0.200
loo+0.199
Neg logit contributions
 fascinated-0.175
 curious-0.169
 intrigued-0.164
mit-0.159
 Curious-0.157
Token.
Token ID13
Feature activation+3.34e-03

Pos logit contributions
 fascinated+0.015
 intrigued+0.015
 Curious+0.015
nam+0.014
mit+0.014
Neg logit contributions
loo-0.010
ister-0.010
 dent-0.009
 dri-0.009
 collapse-0.009
Token Lots
Token ID25024
Feature activation-5.83e-02

Pos logit contributions
loo+0.007
 dent+0.007
 dri+0.007
ister+0.007
ay+0.007
Neg logit contributions
 fascinated-0.004
 intrigued-0.004
mit-0.004
nam-0.004
cious-0.004
Token of
Token ID286
Feature activation+4.34e-02

Pos logit contributions
 fascinated+0.141
 intrigued+0.137
 Curious+0.135
nam+0.130
mit+0.129
Neg logit contributions
 dent-0.057
loo-0.056
 dri-0.056
ister-0.053
imi-0.049
Token people
Token ID661
Feature activation-1.04e-01

Pos logit contributions
loo+0.073
 dri+0.073
ister+0.072
 flash+0.070
 band+0.070
Neg logit contributions
mit-0.073
 fascinated-0.071
 intrigued-0.071
 Curious-0.070
cious-0.068
Token went
Token ID1816
Feature activation+3.97e-02

Pos logit contributions
 Curious+0.182
 fascinated+0.177
 intrigued+0.175
cious+0.168
nam+0.166
Neg logit contributions
loo-0.178
 dri-0.168
ister-0.160
 flash-0.155
 dent-0.148
Token there
Token ID612
Feature activation-1.06e-01

Pos logit contributions
loo+0.050
ister+0.048
 dri+0.046
arl+0.044
 dent+0.043
Neg logit contributions
 fascinated-0.085
 intrigued-0.083
 Curious-0.082
mit-0.081
nam-0.081
Token to
Token ID284
Feature activation+4.21e-03

Pos logit contributions
 Curious+0.221
 fascinated+0.219
 intrigued+0.215
nam+0.214
mit+0.209
Neg logit contributions
loo-0.127
 dri-0.120
ister-0.118
 dent-0.115
 collapse-0.107
Token fly
Token ID6129
Feature activation-3.64e-02

Pos logit contributions
loo+0.006
 dri+0.006
 dent+0.006
ister+0.006
 collapse+0.006
Neg logit contributions
 fascinated-0.008
 intrigued-0.008
 Curious-0.008
mit-0.007
cious-0.007
Token in
Token ID287
Feature activation-3.54e-02

Pos logit contributions
 fascinated+0.078
 intrigued+0.076
 Curious+0.075
nam+0.072
mit+0.072
Neg logit contributions
loo-0.046
 dri-0.045
ister-0.042
 dent-0.042
 flash-0.041
Token there
Token ID612
Feature activation-1.06e-01

Pos logit contributions
loo+0.038
 dri+0.036
ister+0.036
 dent+0.035
imi+0.035
Neg logit contributions
 fascinated-0.024
 intrigued-0.022
 Curious-0.022
cious-0.021
nam-0.020
Token was
Token ID373
Feature activation+2.37e-03

Pos logit contributions
 fascinated+0.225
nam+0.220
 intrigued+0.220
cious+0.214
 Curious+0.213
Neg logit contributions
loo-0.133
 dri-0.127
 dent-0.127
ister-0.120
omo-0.115
Token a
Token ID257
Feature activation-2.10e-03

Pos logit contributions
loo+0.003
 dri+0.003
 snowball+0.003
 dent+0.003
 snow+0.003
Neg logit contributions
 fascinated-0.004
 intrigued-0.004
mit-0.004
 Curious-0.004
nam-0.004
Token little
Token ID1310
Feature activation-6.62e-03

Pos logit contributions
 fascinated+0.002
 intrigued+0.002
 Curious+0.002
 curious+0.002
mit+0.002
Neg logit contributions
loo-0.005
 dri-0.005
 dent-0.005
ister-0.005
 collapse-0.005
Token boy
Token ID2933
Feature activation-4.58e-02

Pos logit contributions
 fascinated+0.006
 curious+0.006
nam+0.006
 intrigued+0.006
mit+0.005
Neg logit contributions
 collapse-0.016
 dri-0.016
 dent-0.016
loo-0.015
ister-0.015
Token who
Token ID508
Feature activation-9.61e-02

Pos logit contributions
 fascinated+0.065
 intrigued+0.060
 curious+0.058
nam+0.057
 Curious+0.057
Neg logit contributions
 dri-0.092
 dent-0.091
loo-0.090
ister-0.087
 collapse-0.086
Token liked
Token ID8288
Feature activation-4.62e-02

Pos logit contributions
 fascinated+0.097
 intrigued+0.090
 Curious+0.086
nam+0.079
mit+0.077
Neg logit contributions
loo-0.231
 dent-0.229
 dri-0.229
ister-0.225
imi-0.217
Token to
Token ID284
Feature activation+7.68e-03

Pos logit contributions
 fascinated+0.077
 intrigued+0.074
 Curious+0.072
mit+0.069
cious+0.068
Neg logit contributions
loo-0.080
 dri-0.080
 dent-0.079
ister-0.077
imi-0.073
Token run
Token ID1057
Feature activation+3.41e-02

Pos logit contributions
loo+0.013
 dent+0.012
 dri+0.012
ister+0.012
arl+0.012
Neg logit contributions
 fascinated-0.014
 intrigued-0.013
 Curious-0.013
nam-0.012
mit-0.012
Token.
Token ID13
Feature activation+6.41e-03

Pos logit contributions
 dri+0.058
loo+0.058
 dent+0.058
ister+0.056
arl+0.053
Neg logit contributions
 fascinated-0.058
 intrigued-0.056
 Curious-0.053
cious-0.052
mit-0.052
Token One
Token ID1881
Feature activation+1.70e-02

Pos logit contributions
loo+0.012
 dent+0.012
 dri+0.011
ister+0.011
imi+0.011
Neg logit contributions
 fascinated-0.010
 intrigued-0.010
 Curious-0.009
cious-0.009
nam-0.009
Token and
Token ID290
Feature activation-9.58e-03

Pos logit contributions
loo+0.076
 dent+0.076
 dri+0.075
ister+0.074
arl+0.070
Neg logit contributions
 fascinated-0.065
 intrigued-0.062
 Curious-0.060
mit-0.056
cious-0.056
Token the
Token ID262
Feature activation-1.91e-03

Pos logit contributions
 fascinated+0.020
 intrigued+0.020
cious+0.019
mit+0.019
 Curious+0.019
Neg logit contributions
 dri-0.013
loo-0.013
 dent-0.013
 collapse-0.012
ister-0.012
Token puppy
Token ID26188
Feature activation-5.95e-02

Pos logit contributions
 fascinated+0.003
 intrigued+0.003
mit+0.003
 Curious+0.003
cious+0.003
Neg logit contributions
 dent-0.004
 dri-0.003
loo-0.003
ister-0.003
 swollen-0.003
Token became
Token ID2627
Feature activation-2.80e-03

Pos logit contributions
 fascinated+0.094
 intrigued+0.090
 Curious+0.088
nam+0.084
mit+0.082
Neg logit contributions
loo-0.108
 dri-0.108
 dent-0.107
ister-0.104
 collapse-0.099
Token best
Token ID1266
Feature activation-4.15e-02

Pos logit contributions
 fascinated+0.006
 intrigued+0.006
 Curious+0.006
cious+0.005
mit+0.005
Neg logit contributions
 dri-0.003
 dent-0.003
loo-0.003
ister-0.003
 collapse-0.003
Token friends
Token ID2460
Feature activation-9.76e-02

Pos logit contributions
 fascinated+0.068
 intrigued+0.065
 Curious+0.063
nam+0.059
mit+0.059
Neg logit contributions
loo-0.074
 dent-0.074
 dri-0.074
ister-0.072
imi-0.068
Token and
Token ID290
Feature activation-7.65e-03

Pos logit contributions
 fascinated+0.188
 intrigued+0.182
 Curious+0.178
nam+0.173
mit+0.170
Neg logit contributions
loo-0.144
 dent-0.141
 dri-0.140
ister-0.137
imi-0.128
Token went
Token ID1816
Feature activation+4.39e-02

Pos logit contributions
 fascinated+0.012
 intrigued+0.012
 Curious+0.012
mit+0.011
nam+0.011
Neg logit contributions
 dri-0.014
loo-0.014
 dent-0.013
ister-0.013
 collapse-0.013
Token on
Token ID319
Feature activation+2.57e-02

Pos logit contributions
 dent+0.074
loo+0.070
ister+0.070
 dri+0.069
 collapse+0.067
Neg logit contributions
 fascinated-0.076
 intrigued-0.072
 Curious-0.071
cious-0.069
 curious-0.068
Token many
Token ID867
Feature activation+4.24e-02

Pos logit contributions
loo+0.047
 dent+0.047
 dri+0.046
ister+0.045
imi+0.043
Neg logit contributions
 fascinated-0.041
 intrigued-0.039
 Curious-0.038
nam-0.036
cious-0.035
Token adventures
Token ID17545
Feature activation-5.57e-02

Pos logit contributions
loo+0.066
 dri+0.065
 dent+0.065
ister+0.063
imi+0.060
Neg logit contributions
 fascinated-0.078
 intrigued-0.075
 Curious-0.074
nam-0.070
mit-0.070