TOP ACTIVATIONS
MAX = 0.078

and never take your loved ones for granted . No matter
like spending time with loved ones and being grateful for what
are truly valuable are the ones that make us happy ,
but soon she felt a pair of strong arms lifting her
inger . I can make honey too . Why do you
and spend time with loved ones , but also to help
better to share and keep things special . <|endoftext|> Once upon
's important to cherish the ones you have now . From
protect yourself and your loved ones , but sometimes it 's
most amazing things and felt how great the world looks from
bag . She keeps the ones she likes the most .
amazing day with his loved ones . <|endoftext|> Once upon a
the nice shape and felt how smooth it was . Tommy
on the wood and felt how smooth and cool it was
to share than to keep things for yourself . <|endoftext|> One
it up . He feels how soft and smooth it is
And that is how the rich man found the
you understand ?" " Yes , Mom ," Sam
And that is how Daisy learned to read .
your parents and keep your things tidy . <|endoftext|> Once upon

MIN ACTIVATIONS
MIN = -0.119

sailed for days , exploring unch arted waters until he came
up and down on the tr amp oline .
. She falls off the tr amp oline . She lands
the mixture into some baking tr ays . She put them
to the edge of the tr amp oline and took a
the weeds next to his tr amp oline . He shouted
, until at last her tr icycle ’ s
D ina was riding her tr icycle , something mysterious and
it for her tree - cl im bing adventures . She
and hop around on his tr amp oline . One day
stretches or hop on the tr amp oline . Everyone at
Together they jumped on the tr amp oline and ran races
, kissing them on their fore heads . She
the pigeon all those weeks ago . <|endoftext|> Mike and his
the sofa , on the tr amp oline . They bounce
born only a few days ago . The deer felt sorry
cut ler y and different cond iments .
decided to jump on the tr amp oline , but Lily
she got back on her tr icycle and rode around the
story . A long time ago , there was an adventurous

INTERVAL 0.029 - 0.078
CONTAINS 4.800%

They had a great time playing their new game . After
have been a waste to hang it there . Let 's
asked her mom my , " What are planets ?" Her
I am lost and I feel very sad ".
. It hurts a lot ," Tom says .

INTERVAL -0.021 - 0.029
CONTAINS 85.568%

if you like it ." L ily took
fruit was red and round . Lily wanted to eat the
stared at the big , tall freezer . It was very
are very creative and brave . We are proud of you
the boat was free . Jack was so happy and grateful

INTERVAL -0.070 - -0.021
CONTAINS 9.629%

to peek in on Mom my and Daddy 's conversations ."
don 't know . But we can count the words .
surprised . He asked the tree ,
some police officers . They were looking for a lost child
to play with the scissors anymore because they were sharp and

INTERVAL -0.119 - -0.070
CONTAINS 0.003%

and hop around on his tr amp oline . One day
Grand ma 's words fore shadow ed a happy future
gentle ticking sound as the pend ulum slowly swinging back and
something flew out of the chim ney ! It was a
began to jump on the tr amp oline .
Token and
Token ID290
Feature activation+1.70e-03

Pos logit contributions
 independent+0.013
 strong+0.013
 brave+0.013
 sure+0.013
 graceful+0.013
Neg logit contributions
ossible-0.016
 harbour-0.015
ric-0.014
 exhibit-0.014
 ago-0.014
Token never
Token ID1239
Feature activation-2.38e-02

Pos logit contributions
 brave+0.002
 independent+0.002
 sure+0.002
 strong+0.002
 clever+0.002
Neg logit contributions
ossible-0.004
ric-0.003
 harbour-0.003
 exhibit-0.003
 cinema-0.003
Token take
Token ID1011
Feature activation+1.02e-02

Pos logit contributions
ossible+0.049
ric+0.045
 harbour+0.044
 cinema+0.044
 exhibit+0.043
Neg logit contributions
 brave-0.033
 sure-0.032
 independent-0.032
 strong-0.032
 clever-0.030
Token your
Token ID534
Feature activation+1.01e-02

Pos logit contributions
 independent+0.013
 brave+0.013
 sure+0.013
 strong+0.012
 clever+0.012
Neg logit contributions
ossible-0.020
 harbour-0.018
ric-0.018
 exhibit-0.017
|-0.017
Token loved
Token ID6151
Feature activation-6.09e-03

Pos logit contributions
 sure+0.014
 independent+0.014
 brave+0.014
 strong+0.013
 graceful+0.013
Neg logit contributions
ossible-0.020
ric-0.018
 harbour-0.018
 exhibit-0.017
 cinema-0.017
Token ones
Token ID3392
Feature activation+7.78e-02

Pos logit contributions
ossible+0.010
 ago+0.010
ric+0.009
 harbour+0.009
 exhibit+0.009
Neg logit contributions
 independent-0.009
 sure-0.009
 brave-0.009
 strong-0.009
 clever-0.008
Token for
Token ID329
Feature activation-8.53e-03

Pos logit contributions
 independent+0.108
 sure+0.107
 clever+0.105
 brave+0.105
 graceful+0.105
Neg logit contributions
ossible-0.142
 harbour-0.131
 exhibit-0.122
ric-0.121
 cinema-0.119
Token granted
Token ID7520
Feature activation-1.66e-03

Pos logit contributions
ossible+0.016
ric+0.015
 harbour+0.015
 cinema+0.014
 Sale+0.014
Neg logit contributions
 independent-0.012
 brave-0.012
 strong-0.012
 sure-0.012
 graceful-0.012
Token.
Token ID13
Feature activation-4.78e-03

Pos logit contributions
ossible+0.003
ric+0.003
 harbour+0.003
 ago+0.003
 exhibit+0.003
Neg logit contributions
 independent-0.003
 brave-0.002
 sure-0.002
 graceful-0.002
 strong-0.002
Token No
Token ID1400
Feature activation+1.09e-02

Pos logit contributions
ossible+0.010
anged+0.010
ric+0.010
 cinema+0.009
 harbour+0.009
Neg logit contributions
 sure-0.006
 brave-0.005
 independent-0.005
 compassionate-0.005
 clever-0.005
Token matter
Token ID2300
Feature activation+2.19e-03

Pos logit contributions
 independent+0.015
 brave+0.015
 sure+0.015
 strong+0.015
 graceful+0.015
Neg logit contributions
ossible-0.021
ric-0.019
 harbour-0.019
 exhibit-0.018
 cinema-0.018
Token like
Token ID588
Feature activation+9.50e-03

Pos logit contributions
 brave+0.032
 sure+0.032
 independent+0.032
 clever+0.031
 bee+0.030
Neg logit contributions
ossible-0.062
 harbour-0.059
ric-0.058
 ago-0.057
 exhibit-0.054
Token spending
Token ID4581
Feature activation+3.95e-03

Pos logit contributions
 independent+0.011
 sure+0.011
 brave+0.011
 strong+0.011
 clever+0.010
Neg logit contributions
ossible-0.018
 harbour-0.017
ric-0.017
 ago-0.017
 Sale-0.017
Token time
Token ID640
Feature activation-1.02e-02

Pos logit contributions
 sure+0.006
 brave+0.006
 independent+0.006
 strong+0.006
 clever+0.006
Neg logit contributions
ossible-0.007
ric-0.006
 harbour-0.006
 exhibit-0.006
|-0.006
Token with
Token ID351
Feature activation-1.97e-02

Pos logit contributions
ossible+0.020
 harbour+0.018
ric+0.018
 exhibit+0.018
 ago+0.018
Neg logit contributions
 sure-0.013
 independent-0.013
 brave-0.013
 strong-0.013
 clever-0.013
Token loved
Token ID6151
Feature activation-8.86e-03

Pos logit contributions
ossible+0.040
ric+0.038
 harbour+0.037
 exhibit+0.035
 ago+0.034
Neg logit contributions
 independent-0.024
 brave-0.024
 sure-0.024
 strong-0.023
 clever-0.023
Token ones
Token ID3392
Feature activation+7.63e-02

Pos logit contributions
ossible+0.016
 ago+0.015
ric+0.015
 harbour+0.015
 exhibit+0.014
Neg logit contributions
 sure-0.012
 independent-0.012
 brave-0.012
 strong-0.012
 clever-0.012
Token and
Token ID290
Feature activation+3.06e-03

Pos logit contributions
 sure+0.106
 independent+0.106
 brave+0.104
 clever+0.100
 graceful+0.100
Neg logit contributions
ossible-0.147
 harbour-0.131
ric-0.130
 exhibit-0.124
 ago-0.122
Token being
Token ID852
Feature activation+2.76e-03

Pos logit contributions
 brave+0.004
 independent+0.004
 sure+0.004
 strong+0.004
 clever+0.004
Neg logit contributions
ossible-0.006
ric-0.006
 harbour-0.006
 exhibit-0.006
 cinema-0.006
Token grateful
Token ID14066
Feature activation-2.20e-02

Pos logit contributions
 bee+0.002
isy+0.002
Pen+0.002
 pony+0.002
 raven+0.002
Neg logit contributions
ossible-0.006
 harbour-0.006
 ago-0.005
ric-0.005
 tanks-0.005
Token for
Token ID329
Feature activation+1.37e-02

Pos logit contributions
ossible+0.041
ric+0.036
 harbour+0.035
 ago+0.035
 exhibit+0.034
Neg logit contributions
 independent-0.033
 sure-0.033
 brave-0.033
 strong-0.032
 clever-0.031
Token what
Token ID644
Feature activation+3.72e-02

Pos logit contributions
 independent+0.019
 brave+0.019
 strong+0.019
 bee+0.018
 graceful+0.018
Neg logit contributions
ossible-0.024
ric-0.023
 harbour-0.022
 Sale-0.022
 cinema-0.021
Token are
Token ID389
Feature activation-1.60e-02

Pos logit contributions
 independent+0.003
 brave+0.003
 sure+0.003
 strong+0.003
 graceful+0.003
Neg logit contributions
ossible-0.004
ric-0.004
 harbour-0.004
 exhibit-0.003
|-0.003
Token truly
Token ID4988
Feature activation+1.18e-02

Pos logit contributions
ossible+0.031
 harbour+0.030
ric+0.030
 Sale+0.028
ained+0.028
Neg logit contributions
 bee-0.018
 sure-0.017
Pen-0.017
 independent-0.017
 raven-0.017
Token valuable
Token ID8119
Feature activation+2.97e-03

Pos logit contributions
 bee+0.010
Pen+0.010
 raven+0.009
 pony+0.008
 swat+0.008
Neg logit contributions
ossible-0.026
 harbour-0.026
 tanks-0.025
ric-0.024
 exhibit-0.024
Token are
Token ID389
Feature activation-6.37e-03

Pos logit contributions
 independent+0.004
 sure+0.004
 brave+0.004
 strong+0.004
 graceful+0.004
Neg logit contributions
ossible-0.006
ric-0.005
 harbour-0.005
 exhibit-0.005
 ago-0.005
Token the
Token ID262
Feature activation+2.98e-02

Pos logit contributions
ossible+0.013
 harbour+0.012
ric+0.012
 ago+0.011
 Sale+0.011
Neg logit contributions
 bee-0.007
Pen-0.007
 independent-0.007
 raven-0.007
 brave-0.007
Token ones
Token ID3392
Feature activation+7.55e-02

Pos logit contributions
 sure+0.036
 independent+0.036
 brave+0.035
 graceful+0.034
 strong+0.034
Neg logit contributions
ossible-0.061
 harbour-0.056
ric-0.054
 cinema-0.053
 exhibit-0.052
Token that
Token ID326
Feature activation+1.11e-02

Pos logit contributions
 independent+0.102
 graceful+0.101
 bee+0.100
 sure+0.100
 swat+0.099
Neg logit contributions
ossible-0.131
 harbour-0.119
 ago-0.111
 exhibit-0.111
ric-0.110
Token make
Token ID787
Feature activation+4.29e-02

Pos logit contributions
 brave+0.016
 independent+0.016
 sure+0.016
 strong+0.015
 clever+0.015
Neg logit contributions
ossible-0.022
ric-0.020
 harbour-0.019
 exhibit-0.019
 cinema-0.019
Token us
Token ID514
Feature activation+2.73e-02

Pos logit contributions
 independent+0.070
 brave+0.068
 bee+0.066
 strong+0.066
 courageous+0.066
Neg logit contributions
ossible-0.068
ric-0.064
 harbour-0.060
 Sale-0.056
 cinema-0.056
Token happy
Token ID3772
Feature activation-8.07e-03

Pos logit contributions
 independent+0.025
 brave+0.024
 sure+0.023
 graceful+0.023
 bee+0.023
Neg logit contributions
ossible-0.063
ric-0.060
 harbour-0.059
 exhibit-0.055
|-0.054
Token,
Token ID11
Feature activation+2.64e-02

Pos logit contributions
ossible+0.015
ric+0.014
 harbour+0.013
 exhibit+0.013
 cinema+0.013
Neg logit contributions
 independent-0.012
 brave-0.012
 sure-0.012
 strong-0.012
 clever-0.011
Token but
Token ID475
Feature activation-1.26e-04

Pos logit contributions
 independent+0.014
 brave+0.013
 clever+0.013
 strong+0.013
 sure+0.013
Neg logit contributions
ossible-0.025
ric-0.024
 harbour-0.024
 ago-0.022
 exhibit-0.022
Token soon
Token ID2582
Feature activation+3.41e-03

Pos logit contributions
ossible+0.000
 harbour+0.000
ric+0.000
 ago+0.000
 exhibit+0.000
Neg logit contributions
 independent-0.000
 graceful-0.000
 strong-0.000
 sure-0.000
 clever-0.000
Token she
Token ID673
Feature activation+4.11e-03

Pos logit contributions
 independent+0.004
 strong+0.003
 graceful+0.003
 brave+0.003
 intelligent+0.003
Neg logit contributions
ossible-0.007
 harbour-0.006
 exhibit-0.006
 ago-0.006
anged-0.006
Token felt
Token ID2936
Feature activation+3.54e-02

Pos logit contributions
 independent+0.007
 sure+0.006
 graceful+0.006
 strong+0.006
 intelligent+0.006
Neg logit contributions
ossible-0.007
 harbour-0.006
ric-0.006
 exhibit-0.006
|-0.006
Token a
Token ID257
Feature activation+3.59e-02

Pos logit contributions
Small+0.043
 pony+0.043
isy+0.043
 butterfly+0.041
 independent+0.041
Neg logit contributions
ossible-0.059
 harbour-0.058
ric-0.057
ained-0.052
|-0.051
Token pair
Token ID5166
Feature activation+7.54e-02

Pos logit contributions
 independent+0.042
 sure+0.039
 intelligent+0.037
 graceful+0.035
 compassionate+0.035
Neg logit contributions
ossible-0.066
 harbour-0.064
ric-0.063
 cinema-0.059
|-0.059
Token of
Token ID286
Feature activation+1.34e-02

Pos logit contributions
 independent+0.105
 clever+0.102
 graceful+0.101
 sure+0.100
 brave+0.099
Neg logit contributions
ossible-0.138
 harbour-0.120
|-0.119
 exhibit-0.118
ric-0.117
Token strong
Token ID1913
Feature activation+1.61e-02

Pos logit contributions
 sure+0.015
 independent+0.015
 brave+0.014
Small+0.014
 clever+0.014
Neg logit contributions
ossible-0.026
ric-0.023
 harbour-0.022
 ago-0.022
 cinema-0.021
Token arms
Token ID5101
Feature activation+2.07e-04

Pos logit contributions
 sure+0.024
 independent+0.023
 brave+0.023
 clever+0.023
 graceful+0.022
Neg logit contributions
ossible-0.029
ric-0.027
 harbour-0.027
 exhibit-0.026
 ago-0.025
Token lifting
Token ID16842
Feature activation+1.17e-02

Pos logit contributions
 independent+0.000
 brave+0.000
 sure+0.000
 clever+0.000
 strong+0.000
Neg logit contributions
ossible-0.000
ric-0.000
 harbour-0.000
 exhibit-0.000
 cinema-0.000
Token her
Token ID607
Feature activation+1.01e-02

Pos logit contributions
 compassionate+0.015
 clever+0.015
 independent+0.015
 intelligent+0.015
 courageous+0.015
Neg logit contributions
ossible-0.019
ric-0.018
|-0.018
rollers-0.017
 ago-0.017
Tokeninger
Token ID3889
Feature activation+1.59e-02

Pos logit contributions
 brave+0.004
 sure+0.004
 independent+0.004
 strong+0.004
 graceful+0.004
Neg logit contributions
ossible-0.003
 harbour-0.003
ric-0.003
 cinema-0.003
 exhibit-0.003
Token.
Token ID13
Feature activation+5.34e-03

Pos logit contributions
 independent+0.021
 clever+0.021
 sure+0.021
 brave+0.020
 strong+0.020
Neg logit contributions
ossible-0.031
ric-0.029
 ago-0.028
 exhibit-0.027
 harbour-0.027
Token I
Token ID314
Feature activation+5.55e-03

Pos logit contributions
 sure+0.008
 independent+0.008
 brave+0.008
 clever+0.008
 strong+0.008
Neg logit contributions
ossible-0.010
ric-0.009
 harbour-0.008
|-0.008
 ago-0.008
Token can
Token ID460
Feature activation+8.18e-03

Pos logit contributions
 independent+0.009
 sure+0.008
 brave+0.008
 strong+0.008
 graceful+0.008
Neg logit contributions
ossible-0.010
ric-0.009
|-0.009
 exhibit-0.009
 harbour-0.008
Token make
Token ID787
Feature activation+4.54e-02

Pos logit contributions
 independent+0.011
 sure+0.010
 courageous+0.010
 clever+0.010
 brave+0.010
Neg logit contributions
ossible-0.017
ric-0.015
 exhibit-0.014
|-0.014
 harbour-0.013
Token honey
Token ID12498
Feature activation+7.53e-02

Pos logit contributions
 independent+0.059
 courageous+0.057
 brave+0.056
 compassionate+0.055
 strong+0.055
Neg logit contributions
ossible-0.087
ric-0.083
 exhibit-0.075
 aquarium-0.072
 cinema-0.071
Token too
Token ID1165
Feature activation+2.17e-02

Pos logit contributions
 independent+0.105
 clever+0.099
 courageous+0.099
 brave+0.099
 graceful+0.096
Neg logit contributions
ossible-0.137
ric-0.133
 cinema-0.118
 exhibit-0.117
 station-0.115
Token.
Token ID13
Feature activation+3.80e-03

Pos logit contributions
 sure+0.030
 independent+0.029
 brave+0.029
 clever+0.028
 courageous+0.027
Neg logit contributions
ossible-0.041
ric-0.036
 ago-0.036
 exhibit-0.035
 harbour-0.035
Token Why
Token ID4162
Feature activation+2.23e-02

Pos logit contributions
 sure+0.006
 independent+0.006
 clever+0.006
 brave+0.006
 graceful+0.006
Neg logit contributions
ossible-0.007
ric-0.006
 harbour-0.006
 ago-0.006
|-0.006
Token do
Token ID466
Feature activation+2.42e-02

Pos logit contributions
 independent+0.042
 courageous+0.039
 intelligent+0.037
 fearless+0.037
 graceful+0.037
Neg logit contributions
ossible-0.035
ric-0.032
 Sale-0.028
 exhibit-0.027
 cinema-0.026
Token you
Token ID345
Feature activation-1.02e-02

Pos logit contributions
 independent+0.042
 sure+0.042
 graceful+0.040
 courageous+0.040
 brave+0.039
Neg logit contributions
ossible-0.036
 Sale-0.030
 exhibit-0.029
ric-0.028
 cinema-0.027
Token and
Token ID290
Feature activation+4.75e-03

Pos logit contributions
ossible+0.018
ric+0.016
 ago+0.016
 exhibit+0.016
 harbour+0.016
Neg logit contributions
 sure-0.013
 independent-0.013
 strong-0.013
 brave-0.013
 clever-0.012
Token spend
Token ID4341
Feature activation+2.48e-02

Pos logit contributions
 brave+0.006
 independent+0.006
 sure+0.006
 strong+0.006
 clever+0.006
Neg logit contributions
ossible-0.010
ric-0.009
 harbour-0.009
 exhibit-0.009
 cinema-0.009
Token time
Token ID640
Feature activation-1.08e-02

Pos logit contributions
 sure+0.036
 independent+0.036
 brave+0.036
 strong+0.035
 clever+0.034
Neg logit contributions
ossible-0.046
ric-0.040
 harbour-0.039
|-0.039
 cinema-0.038
Token with
Token ID351
Feature activation-1.83e-02

Pos logit contributions
ossible+0.020
 harbour+0.017
ric+0.017
 ago+0.017
|+0.017
Neg logit contributions
 sure-0.015
 independent-0.015
 brave-0.015
 strong-0.015
 clever-0.015
Token loved
Token ID6151
Feature activation-1.02e-02

Pos logit contributions
ossible+0.038
ric+0.035
 harbour+0.034
 cinema+0.033
 ago+0.032
Neg logit contributions
 independent-0.020
 sure-0.020
 brave-0.020
 strong-0.020
 courageous-0.019
Token ones
Token ID3392
Feature activation+7.31e-02

Pos logit contributions
 ago+0.017
ric+0.017
ossible+0.017
 exhibit+0.016
 harbour+0.016
Neg logit contributions
 sure-0.014
 brave-0.014
 independent-0.014
 strong-0.014
 clever-0.014
Token,
Token ID11
Feature activation+2.72e-02

Pos logit contributions
 sure+0.109
 independent+0.107
 brave+0.103
 clever+0.103
 graceful+0.102
Neg logit contributions
ossible-0.128
 harbour-0.111
ric-0.108
 exhibit-0.106
 ago-0.104
Token but
Token ID475
Feature activation+1.06e-02

Pos logit contributions
 brave+0.026
 independent+0.025
 sure+0.025
 strong+0.024
 clever+0.024
Neg logit contributions
ossible-0.060
ric-0.056
 harbour-0.055
 ago-0.052
 cinema-0.052
Token also
Token ID635
Feature activation-2.93e-03

Pos logit contributions
 brave+0.011
 sure+0.011
 independent+0.011
 strong+0.010
 clever+0.010
Neg logit contributions
ossible-0.025
ric-0.024
 harbour-0.023
 cinema-0.023
 exhibit-0.023
Token to
Token ID284
Feature activation-1.18e-02

Pos logit contributions
ossible+0.006
ric+0.006
 ago+0.005
 harbour+0.005
ained+0.005
Neg logit contributions
 brave-0.003
 independent-0.003
 bee-0.003
 strong-0.003
 raven-0.003
Token help
Token ID1037
Feature activation+1.71e-02

Pos logit contributions
ossible+0.025
ric+0.023
 harbour+0.023
 exhibit+0.022
 cinema+0.022
Neg logit contributions
 brave-0.015
 independent-0.015
 sure-0.015
 strong-0.014
 clever-0.014
Token better
Token ID1365
Feature activation-1.13e-03

Pos logit contributions
ossible+0.019
ric+0.018
 harbour+0.017
 cinema+0.017
ained+0.017
Neg logit contributions
 independent-0.007
 brave-0.007
 bee-0.007
 sure-0.007
 strong-0.007
Token to
Token ID284
Feature activation+3.12e-03

Pos logit contributions
ossible+0.002
ric+0.002
 harbour+0.002
 exhibit+0.002
 cinema+0.002
Neg logit contributions
 brave-0.002
 independent-0.001
 sure-0.001
 strong-0.001
 clever-0.001
Token share
Token ID2648
Feature activation+1.39e-02

Pos logit contributions
 brave+0.004
 independent+0.004
 sure+0.004
 strong+0.004
 clever+0.004
Neg logit contributions
ossible-0.007
ric-0.006
 harbour-0.006
 exhibit-0.006
 cinema-0.006
Token and
Token ID290
Feature activation+5.03e-03

Pos logit contributions
 independent+0.018
 sure+0.018
 brave+0.017
 strong+0.017
 fearless+0.016
Neg logit contributions
ossible-0.028
 harbour-0.024
ric-0.024
 exhibit-0.023
 ago-0.023
Token keep
Token ID1394
Feature activation+1.91e-02

Pos logit contributions
 brave+0.007
 sure+0.006
 independent+0.006
 strong+0.006
 clever+0.006
Neg logit contributions
ossible-0.011
ric-0.010
 harbour-0.010
 cinema-0.009
 exhibit-0.009
Token things
Token ID1243
Feature activation+7.28e-02

Pos logit contributions
 independent+0.023
 brave+0.023
 bee+0.023
 strong+0.022
 sure+0.022
Neg logit contributions
ossible-0.037
 harbour-0.035
ric-0.033
 cinema-0.032
 exhibit-0.032
Token special
Token ID2041
Feature activation+1.62e-02

Pos logit contributions
 bee+0.079
 independent+0.075
Small+0.075
 sure+0.073
 raven+0.073
Neg logit contributions
ossible-0.151
 harbour-0.139
ric-0.127
|-0.126
 ago-0.124
Token.
Token ID13
Feature activation-2.09e-03

Pos logit contributions
 independent+0.022
 sure+0.022
 strong+0.022
 graceful+0.022
 brave+0.022
Neg logit contributions
ossible-0.031
 harbour-0.027
 ago-0.026
 exhibit-0.026
ric-0.026
Token<|endoftext|>
Token ID50256
Feature activation-3.78e-02

Pos logit contributions
ossible+0.004
 harbour+0.004
 cinema+0.004
ric+0.004
anged+0.004
Neg logit contributions
 sure-0.003
 brave-0.002
 independent-0.002
 strong-0.002
 compassionate-0.002
TokenOnce
Token ID7454
Feature activation-3.23e-02

Pos logit contributions
ossible+0.062
ric+0.058
 harbour+0.056
 cinema+0.052
 exhibit+0.052
Neg logit contributions
 brave-0.065
 sure-0.064
 independent-0.063
 strong-0.062
 clever-0.061
Token upon
Token ID2402
Feature activation-5.55e-03

Pos logit contributions
ossible+0.043
 harbour+0.042
ric+0.042
 cinema+0.040
 exhibit+0.037
Neg logit contributions
 sure-0.064
 brave-0.063
 independent-0.062
 strong-0.062
 clever-0.061
Token's
Token ID338
Feature activation-1.45e-02

Pos logit contributions
 independent+0.019
 strong+0.018
 sure+0.018
 graceful+0.018
 brave+0.018
Neg logit contributions
ossible-0.017
ric-0.016
 exhibit-0.015
|-0.015
 harbour-0.014
Token important
Token ID1593
Feature activation-1.60e-02

Pos logit contributions
ossible+0.031
ric+0.029
ained+0.029
 harbour+0.027
|+0.027
Neg logit contributions
 bee-0.015
 independent-0.014
 butterfly-0.013
 graceful-0.013
 brave-0.013
Token to
Token ID284
Feature activation+8.21e-03

Pos logit contributions
ossible+0.032
ric+0.029
 harbour+0.028
 exhibit+0.027
 cinema+0.027
Neg logit contributions
 brave-0.022
 independent-0.022
 sure-0.022
 strong-0.022
 graceful-0.021
Token cherish
Token ID48303
Feature activation+9.75e-04

Pos logit contributions
 brave+0.011
 independent+0.011
 sure+0.011
 strong+0.010
 clever+0.010
Neg logit contributions
ossible-0.017
ric-0.016
 harbour-0.016
 exhibit-0.015
 cinema-0.015
Token the
Token ID262
Feature activation+1.26e-02

Pos logit contributions
 independent+0.001
 sure+0.001
 strong+0.001
 brave+0.001
 clever+0.001
Neg logit contributions
ossible-0.002
 harbour-0.002
ric-0.002
 ago-0.002
 cinema-0.002
Token ones
Token ID3392
Feature activation+7.23e-02

Pos logit contributions
 sure+0.018
 independent+0.017
 brave+0.017
 strong+0.016
 clever+0.016
Neg logit contributions
ossible-0.023
 harbour-0.021
ric-0.021
 exhibit-0.020
 cinema-0.020
Token you
Token ID345
Feature activation+8.75e-03

Pos logit contributions
 sure+0.104
 independent+0.102
 graceful+0.100
Small+0.100
 clever+0.100
Neg logit contributions
ossible-0.124
 harbour-0.105
 exhibit-0.102
 ago-0.100
ric-0.098
Token have
Token ID423
Feature activation-5.37e-03

Pos logit contributions
 brave+0.011
 independent+0.011
 sure+0.011
 strong+0.011
 clever+0.011
Neg logit contributions
ossible-0.018
ric-0.017
 harbour-0.017
 exhibit-0.016
 cinema-0.016
Token now
Token ID783
Feature activation-1.11e-02

Pos logit contributions
ossible+0.011
ric+0.010
 harbour+0.010
 exhibit+0.009
 cinema+0.009
Neg logit contributions
 independent-0.007
 brave-0.007
 sure-0.007
 strong-0.007
 graceful-0.007
Token.
Token ID13
Feature activation-2.57e-03

Pos logit contributions
ossible+0.023
ric+0.021
 harbour+0.021
 exhibit+0.020
 cinema+0.020
Neg logit contributions
 brave-0.015
 independent-0.015
 sure-0.015
 strong-0.014
 clever-0.014
Token From
Token ID3574
Feature activation-1.61e-02

Pos logit contributions
ossible+0.005
ric+0.005
 harbour+0.005
 cinema+0.004
 exhibit+0.004
Neg logit contributions
 sure-0.004
 independent-0.004
 brave-0.004
 strong-0.003
 clever-0.003
Token protect
Token ID1805
Feature activation+9.90e-03

Pos logit contributions
 brave+0.003
 sure+0.003
 independent+0.003
 strong+0.003
 clever+0.003
Neg logit contributions
ossible-0.005
ric-0.005
 harbour-0.005
 cinema-0.005
 exhibit-0.005
Token yourself
Token ID3511
Feature activation+3.93e-02

Pos logit contributions
 independent+0.012
 sure+0.012
 graceful+0.012
 clever+0.012
 strong+0.012
Neg logit contributions
ossible-0.019
ric-0.018
 harbour-0.017
|-0.016
 exhibit-0.016
Token and
Token ID290
Feature activation+1.16e-02

Pos logit contributions
 independent+0.051
 sure+0.051
 graceful+0.049
 clever+0.049
 brave+0.049
Neg logit contributions
ossible-0.073
 harbour-0.070
ric-0.069
 exhibit-0.068
 ago-0.066
Token your
Token ID534
Feature activation+1.16e-02

Pos logit contributions
 brave+0.015
 independent+0.015
 sure+0.015
 strong+0.014
 clever+0.014
Neg logit contributions
ossible-0.024
ric-0.023
 harbour-0.022
 exhibit-0.021
 cinema-0.021
Token loved
Token ID6151
Feature activation-1.17e-02

Pos logit contributions
 brave+0.014
 independent+0.014
 sure+0.014
 strong+0.014
 clever+0.013
Neg logit contributions
ossible-0.025
ric-0.024
 harbour-0.023
 cinema-0.022
 exhibit-0.022
Token ones
Token ID3392
Feature activation+7.20e-02

Pos logit contributions
ossible+0.020
ric+0.020
 ago+0.019
 exhibit+0.018
 harbour+0.018
Neg logit contributions
 independent-0.016
 strong-0.016
 sure-0.016
 brave-0.016
 redd-0.016
Token,
Token ID11
Feature activation+3.14e-02

Pos logit contributions
 independent+0.106
 sure+0.106
 graceful+0.104
 bee+0.102
 clever+0.101
Neg logit contributions
ossible-0.113
 harbour-0.104
ric-0.099
 exhibit-0.098
 cinema-0.094
Token but
Token ID475
Feature activation+1.60e-02

Pos logit contributions
 sure+0.034
 brave+0.034
 independent+0.034
 clever+0.033
Pen+0.033
Neg logit contributions
ossible-0.063
ric-0.057
 harbour-0.056
 tanks-0.053
ained-0.052
Token sometimes
Token ID3360
Feature activation+1.04e-02

Pos logit contributions
 brave+0.017
 independent+0.017
 sure+0.017
 strong+0.016
 clever+0.016
Neg logit contributions
ossible-0.037
ric-0.035
 harbour-0.034
 exhibit-0.033
 cinema-0.033
Token it
Token ID340
Feature activation+7.24e-03

Pos logit contributions
 independent+0.016
 sure+0.015
 graceful+0.014
isy+0.014
 strong+0.014
Neg logit contributions
ossible-0.017
 harbour-0.015
 exhibit-0.015
ric-0.014
 Sale-0.014
Token's
Token ID338
Feature activation-2.09e-02

Pos logit contributions
 independent+0.011
 brave+0.011
 sure+0.011
 strong+0.011
 graceful+0.010
Neg logit contributions
ossible-0.013
ric-0.012
 harbour-0.012
 exhibit-0.012
 cinema-0.011
Token most
Token ID749
Feature activation+1.30e-02

Pos logit contributions
 sure+0.010
 independent+0.009
 brave+0.009
 compassionate+0.009
 softly+0.009
Neg logit contributions
 cinema-0.013
ossible-0.012
 harbour-0.012
ric-0.012
 station-0.012
Token amazing
Token ID4998
Feature activation-1.37e-02

Pos logit contributions
Pen+0.013
 Honey+0.012
pick+0.012
Flo+0.012
Her+0.012
Neg logit contributions
 harbour-0.024
 pool-0.023
ric-0.023
 exhibit-0.022
 cinema-0.022
Token things
Token ID1243
Feature activation+3.02e-02

Pos logit contributions
 exhibit+0.023
ossible+0.023
ric+0.022
 harbour+0.022
 cinema+0.022
Neg logit contributions
 sure-0.020
 independent-0.020
 brave-0.019
 compassionate-0.019
 clever-0.018
Token and
Token ID290
Feature activation+1.14e-02

Pos logit contributions
 independent+0.042
 sure+0.040
 brave+0.039
 clever+0.039
 graceful+0.039
Neg logit contributions
ossible-0.054
 harbour-0.052
ric-0.052
 exhibit-0.050
 ago-0.048
Token felt
Token ID2936
Feature activation+4.58e-02

Pos logit contributions
 independent+0.015
 bee+0.014
 brave+0.014
 sure+0.014
Pen+0.014
Neg logit contributions
 harbour-0.020
ossible-0.020
ric-0.019
 exhibit-0.018
 ago-0.018
Token how
Token ID703
Feature activation+7.19e-02

Pos logit contributions
 pony+0.053
isy+0.053
 bee+0.052
 hen+0.050
Bee+0.050
Neg logit contributions
 harbour-0.087
ossible-0.083
ric-0.079
|-0.076
ained-0.076
Token great
Token ID1049
Feature activation+1.06e-02

Pos logit contributions
 pony+0.060
 bee+0.059
pick+0.058
 raven+0.056
 bunny+0.056
Neg logit contributions
ossible-0.159
ric-0.148
 harbour-0.146
|-0.142
 exhibit-0.135
Token the
Token ID262
Feature activation+1.92e-02

Pos logit contributions
 independent+0.014
 compassionate+0.014
 sure+0.013
 brave+0.013
 graceful+0.013
Neg logit contributions
ossible-0.018
 harbour-0.018
 exhibit-0.017
 ago-0.017
ric-0.017
Token world
Token ID995
Feature activation+5.91e-03

Pos logit contributions
 sure+0.027
 independent+0.026
 brave+0.024
 graceful+0.024
 strong+0.024
Neg logit contributions
 harbour-0.033
 cinema-0.033
ossible-0.032
ric-0.032
 exhibit-0.032
Token looks
Token ID3073
Feature activation+1.87e-02

Pos logit contributions
 independent+0.010
 sure+0.010
 compassionate+0.010
 careful+0.009
 clever+0.009
Neg logit contributions
|-0.008
ossible-0.008
 exhibit-0.008
 harbour-0.008
ric-0.008
Token from
Token ID422
Feature activation+7.44e-03

Pos logit contributions
 independent+0.024
 sure+0.024
 bee+0.024
 raven+0.024
Small+0.024
Neg logit contributions
ossible-0.033
 harbour-0.031
 ago-0.031
ric-0.030
 exhibit-0.029
Token bag
Token ID6131
Feature activation-1.08e-02

Pos logit contributions
ossible+0.020
ric+0.017
 harbour+0.017
 exhibit+0.017
 cinema+0.016
Neg logit contributions
 sure-0.025
 brave-0.025
 independent-0.025
 strong-0.024
 graceful-0.024
Token.
Token ID13
Feature activation+8.82e-03

Pos logit contributions
ossible+0.024
ric+0.024
 harbour+0.023
 cinema+0.022
|+0.022
Neg logit contributions
 brave-0.012
 independent-0.012
 strong-0.011
 sure-0.011
 clever-0.011
Token She
Token ID1375
Feature activation-2.52e-03

Pos logit contributions
 brave+0.012
 sure+0.012
 independent+0.012
 strong+0.012
 clever+0.012
Neg logit contributions
ossible-0.018
ric-0.016
 harbour-0.016
 cinema-0.015
 exhibit-0.015
Token keeps
Token ID7622
Feature activation-6.90e-03

Pos logit contributions
ossible+0.004
ric+0.004
 harbour+0.004
 cinema+0.004
 exhibit+0.004
Neg logit contributions
 independent-0.004
 brave-0.004
 sure-0.004
 strong-0.004
 graceful-0.004
Token the
Token ID262
Feature activation+1.85e-03

Pos logit contributions
ossible+0.013
ric+0.012
 harbour+0.012
 cinema+0.012
 exhibit+0.011
Neg logit contributions
 brave-0.010
 independent-0.010
 sure-0.010
 strong-0.009
 graceful-0.009
Token ones
Token ID3392
Feature activation+7.19e-02

Pos logit contributions
 sure+0.004
 independent+0.003
 brave+0.003
 strong+0.003
 graceful+0.003
Neg logit contributions
ossible-0.003
 cinema-0.002
 exhibit-0.002
ric-0.002
 driveway-0.002
Token she
Token ID673
Feature activation+1.45e-02

Pos logit contributions
 independent+0.091
 sure+0.091
 brave+0.089
 graceful+0.087
 clever+0.086
Neg logit contributions
ossible-0.147
 harbour-0.125
ric-0.122
 exhibit-0.121
 ago-0.118
Token likes
Token ID7832
Feature activation-2.25e-02

Pos logit contributions
 butterfly+0.022
 bee+0.021
 independent+0.021
 sure+0.021
 graceful+0.021
Neg logit contributions
ossible-0.026
ric-0.022
 ago-0.022
 exhibit-0.021
 cinema-0.021
Token the
Token ID262
Feature activation+3.05e-03

Pos logit contributions
ossible+0.037
 ago+0.033
 exhibit+0.033
ric+0.033
 cinema+0.031
Neg logit contributions
 independent-0.037
 sure-0.037
 brave-0.037
 clever-0.036
 strong-0.036
Token most
Token ID749
Feature activation+8.38e-03

Pos logit contributions
 sure+0.006
 independent+0.005
 brave+0.005
 graceful+0.005
 strong+0.005
Neg logit contributions
ossible-0.005
ric-0.004
 cinema-0.004
 exhibit-0.004
 station-0.004
Token.
Token ID13
Feature activation+8.23e-03

Pos logit contributions
 sure+0.014
 brave+0.014
 independent+0.014
 swat+0.014
Small+0.014
Neg logit contributions
ossible-0.011
 ago-0.011
 exhibit-0.011
 harbour-0.010
ric-0.010
Token amazing
Token ID4998
Feature activation-1.49e-02

Pos logit contributions
 brave+0.003
 sure+0.003
 independent+0.003
 strong+0.003
 graceful+0.003
Neg logit contributions
ossible-0.004
 harbour-0.004
ric-0.004
 exhibit-0.003
 cinema-0.003
Token day
Token ID1110
Feature activation+1.45e-02

Pos logit contributions
ossible+0.025
ric+0.024
 harbour+0.023
|+0.022
 cinema+0.022
Neg logit contributions
 brave-0.025
 independent-0.025
 sure-0.025
 strong-0.024
 graceful-0.024
Token with
Token ID351
Feature activation-2.19e-02

Pos logit contributions
 brave+0.018
 independent+0.018
 sure+0.018
 strong+0.017
 clever+0.017
Neg logit contributions
ossible-0.031
ric-0.029
 harbour-0.028
 exhibit-0.027
 cinema-0.027
Token his
Token ID465
Feature activation-6.26e-04

Pos logit contributions
ossible+0.046
ric+0.042
 harbour+0.042
 exhibit+0.040
 cinema+0.039
Neg logit contributions
 independent-0.027
 sure-0.027
 brave-0.027
 strong-0.025
 clever-0.025
Token loved
Token ID6151
Feature activation-1.04e-02

Pos logit contributions
ossible+0.001
ric+0.001
 harbour+0.001
 cinema+0.001
 exhibit+0.001
Neg logit contributions
 brave-0.001
 independent-0.001
 sure-0.001
 strong-0.001
 clever-0.001
Token ones
Token ID3392
Feature activation+7.16e-02

Pos logit contributions
ossible+0.018
ric+0.017
 harbour+0.017
 exhibit+0.017
 ago+0.017
Neg logit contributions
 sure-0.015
 independent-0.015
 brave-0.015
 clever-0.015
 strong-0.015
Token.
Token ID13
Feature activation-2.98e-03

Pos logit contributions
 sure+0.107
 independent+0.107
 brave+0.105
 clever+0.103
 graceful+0.103
Neg logit contributions
ossible-0.127
 harbour-0.114
 exhibit-0.109
ric-0.108
 cinema-0.104
Token<|endoftext|>
Token ID50256
Feature activation-3.12e-02

Pos logit contributions
ossible+0.006
 harbour+0.006
 cinema+0.006
ric+0.006
 exhibit+0.006
Neg logit contributions
 sure-0.004
 brave-0.003
 independent-0.003
 strong-0.003
 clever-0.003
TokenOnce
Token ID7454
Feature activation-3.27e-02

Pos logit contributions
ossible+0.060
ric+0.057
 harbour+0.056
 cinema+0.053
 exhibit+0.052
Neg logit contributions
 brave-0.045
 sure-0.044
 independent-0.043
 strong-0.043
 clever-0.041
Token upon
Token ID2402
Feature activation-3.54e-03

Pos logit contributions
 harbour+0.048
ric+0.047
 cinema+0.046
ossible+0.045
 Sale+0.044
Neg logit contributions
 sure-0.061
 brave-0.061
 independent-0.059
 strong-0.059
 clever-0.058
Token a
Token ID257
Feature activation+1.15e-02

Pos logit contributions
ossible+0.008
ric+0.007
|+0.007
 exhibit+0.007
 harbour+0.007
Neg logit contributions
 independent-0.004
 brave-0.004
 sure-0.004
 strong-0.004
 graceful-0.004
Token the
Token ID262
Feature activation+1.40e-02

Pos logit contributions
 independent+0.031
 brave+0.031
 sure+0.029
 strong+0.029
 clever+0.028
Neg logit contributions
ossible-0.058
ric-0.054
 harbour-0.053
 exhibit-0.051
 cinema-0.050
Token nice
Token ID3621
Feature activation-1.30e-02

Pos logit contributions
 sure+0.018
 independent+0.018
 brave+0.018
 strong+0.018
 clever+0.017
Neg logit contributions
ossible-0.029
ric-0.027
 harbour-0.026
 exhibit-0.025
 cinema-0.025
Token shape
Token ID5485
Feature activation+2.01e-02

Pos logit contributions
ossible+0.024
ric+0.022
 harbour+0.022
 exhibit+0.021
 cinema+0.021
Neg logit contributions
 brave-0.020
 sure-0.020
 independent-0.020
 strong-0.019
 clever-0.019
Token and
Token ID290
Feature activation+1.25e-02

Pos logit contributions
 independent+0.026
 sure+0.025
 brave+0.025
 strong+0.024
 clever+0.024
Neg logit contributions
ossible-0.040
 harbour-0.038
 exhibit-0.038
ric-0.037
 cinema-0.036
Token felt
Token ID2936
Feature activation+5.12e-02

Pos logit contributions
 independent+0.018
 brave+0.017
 sure+0.017
 strong+0.017
 clever+0.016
Neg logit contributions
ossible-0.023
 harbour-0.022
ric-0.021
 exhibit-0.020
 cinema-0.019
Token how
Token ID703
Feature activation+7.14e-02

Pos logit contributions
Bee+0.060
 bee+0.057
isy+0.057
 pony+0.057
Small+0.057
Neg logit contributions
 harbour-0.096
ossible-0.095
ric-0.088
 aquarium-0.085
 ago-0.084
Token smooth
Token ID7209
Feature activation+9.66e-03

Pos logit contributions
 bee+0.060
 pony+0.060
 raven+0.056
isy+0.056
 bunny+0.054
Neg logit contributions
ossible-0.164
 harbour-0.151
ric-0.143
|-0.141
 tanks-0.136
Token it
Token ID340
Feature activation+2.08e-02

Pos logit contributions
 independent+0.014
 compassionate+0.014
 brave+0.014
 sure+0.014
 clever+0.014
Neg logit contributions
ossible-0.016
 harbour-0.015
ric-0.015
 ago-0.014
 tanks-0.014
Token was
Token ID373
Feature activation-4.96e-03

Pos logit contributions
 independent+0.037
 compassionate+0.035
 sure+0.034
 careful+0.034
 pony+0.034
Neg logit contributions
ossible-0.027
|-0.026
ric-0.026
 harbour-0.025
 exhibit-0.025
Token.
Token ID13
Feature activation-7.18e-03

Pos logit contributions
ossible+0.008
 harbour+0.008
ric+0.008
 ago+0.007
 exhibit+0.007
Neg logit contributions
 independent-0.008
 sure-0.007
 brave-0.007
 clever-0.007
 graceful-0.007
Token Tommy
Token ID19919
Feature activation+1.72e-02

Pos logit contributions
ossible+0.013
ric+0.012
 exhibit+0.012
 cinema+0.012
 harbour+0.011
Neg logit contributions
 sure-0.011
 independent-0.011
 brave-0.010
 strong-0.010
 clever-0.010
Token on
Token ID319
Feature activation+1.02e-02

Pos logit contributions
ossible+0.040
ric+0.037
 harbour+0.036
ained+0.034
 driveway+0.034
Neg logit contributions
 independent-0.029
 clever-0.028
 sure-0.028
 brave-0.027
 intelligent-0.027
Token the
Token ID262
Feature activation-4.54e-03

Pos logit contributions
 sure+0.014
 observ+0.014
 courageous+0.014
 independent+0.014
 brave+0.013
Neg logit contributions
ossible-0.018
 exhibit-0.017
 harbour-0.017
 cinema-0.017
ric-0.017
Token wood
Token ID4898
Feature activation+5.46e-03

Pos logit contributions
ossible+0.006
 driveway+0.005
 exhibit+0.005
ric+0.005
 station+0.005
Neg logit contributions
 sure-0.009
 independent-0.008
 brave-0.008
 compassionate-0.008
 clever-0.008
Token and
Token ID290
Feature activation+4.11e-03

Pos logit contributions
 sure+0.007
 independent+0.007
 brave+0.007
 clever+0.006
 graceful+0.006
Neg logit contributions
ossible-0.012
ric-0.011
 harbour-0.011
 exhibit-0.010
 cinema-0.010
Token felt
Token ID2936
Feature activation+5.02e-02

Pos logit contributions
 independent+0.006
 sure+0.006
 brave+0.006
 clever+0.006
 graceful+0.005
Neg logit contributions
ossible-0.007
 harbour-0.007
ric-0.007
 exhibit-0.007
 cinema-0.007
Token how
Token ID703
Feature activation+7.11e-02

Pos logit contributions
 butterfly+0.059
 pony+0.058
Small+0.058
Bee+0.058
pick+0.057
Neg logit contributions
ossible-0.092
 harbour-0.089
ric-0.084
 aquarium-0.076
ained-0.076
Token smooth
Token ID7209
Feature activation+1.14e-02

Pos logit contributions
isy+0.059
 pony+0.059
pick+0.057
 butterfly+0.056
 bee+0.055
Neg logit contributions
ossible-0.162
 harbour-0.139
ric-0.135
melon-0.130
loo-0.127
Token and
Token ID290
Feature activation+5.20e-03

Pos logit contributions
 independent+0.018
 compassionate+0.018
 clever+0.017
 sure+0.017
 brave+0.017
Neg logit contributions
ossible-0.017
 harbour-0.016
ric-0.016
melon-0.016
 tanks-0.015
Token cool
Token ID3608
Feature activation+3.66e-03

Pos logit contributions
 bee+0.005
 independent+0.005
 raven+0.005
 sure+0.005
 pony+0.005
Neg logit contributions
ossible-0.010
ric-0.010
 harbour-0.010
 cinema-0.009
 driveway-0.009
Token it
Token ID340
Feature activation+1.44e-02

Pos logit contributions
 independent+0.006
 compassionate+0.006
 clever+0.006
 intelligent+0.006
 brave+0.006
Neg logit contributions
ossible-0.005
 harbour-0.005
 ago-0.005
ric-0.005
anged-0.005
Token was
Token ID373
Feature activation-1.88e-04

Pos logit contributions
 independent+0.026
 compassionate+0.025
 raven+0.025
 pony+0.025
 lamb+0.024
Neg logit contributions
ossible-0.017
|-0.016
ric-0.016
 exhibit-0.015
 harbour-0.015
Token to
Token ID284
Feature activation+7.44e-03

Pos logit contributions
 independent+0.012
 strong+0.012
 sure+0.011
 brave+0.011
 graceful+0.011
Neg logit contributions
ossible-0.019
 ago-0.018
 exhibit-0.017
 harbour-0.017
ric-0.017
Token share
Token ID2648
Feature activation+2.20e-02

Pos logit contributions
 independent+0.010
 brave+0.010
 sure+0.010
 strong+0.010
 clever+0.010
Neg logit contributions
ossible-0.015
ric-0.014
 harbour-0.013
 exhibit-0.013
 cinema-0.013
Token than
Token ID621
Feature activation+5.25e-03

Pos logit contributions
 independent+0.028
 sure+0.028
 brave+0.028
 clever+0.027
 strong+0.027
Neg logit contributions
ossible-0.044
ric-0.038
 harbour-0.038
 exhibit-0.038
 ago-0.037
Token to
Token ID284
Feature activation+1.12e-02

Pos logit contributions
 sure+0.007
 brave+0.007
 independent+0.007
 strong+0.007
 clever+0.006
Neg logit contributions
ossible-0.011
ric-0.010
 harbour-0.010
 exhibit-0.009
 cinema-0.009
Token keep
Token ID1394
Feature activation+2.81e-02

Pos logit contributions
 independent+0.016
 sure+0.016
 brave+0.016
 clever+0.015
 graceful+0.015
Neg logit contributions
ossible-0.021
ric-0.019
 harbour-0.019
 exhibit-0.018
|-0.018
Token things
Token ID1243
Feature activation+7.07e-02

Pos logit contributions
 brave+0.034
 independent+0.033
 sure+0.032
 bee+0.032
 strong+0.032
Neg logit contributions
ossible-0.053
 harbour-0.051
ric-0.050
 exhibit-0.047
 ago-0.047
Token for
Token ID329
Feature activation+8.27e-04

Pos logit contributions
 independent+0.086
 bee+0.085
 sure+0.084
 brave+0.083
 clever+0.083
Neg logit contributions
ossible-0.135
 harbour-0.125
ric-0.117
 ago-0.115
|-0.110
Token yourself
Token ID3511
Feature activation+3.61e-02

Pos logit contributions
 independent+0.001
 brave+0.001
 sure+0.001
 graceful+0.001
 strong+0.001
Neg logit contributions
ossible-0.002
 harbour-0.001
ric-0.001
 ago-0.001
 Sale-0.001
Token.
Token ID13
Feature activation+5.81e-03

Pos logit contributions
 independent+0.056
 sure+0.055
 brave+0.055
 clever+0.054
 graceful+0.054
Neg logit contributions
ossible-0.061
ric-0.056
 ago-0.056
 harbour-0.055
 exhibit-0.053
Token<|endoftext|>
Token ID50256
Feature activation-3.96e-02

Pos logit contributions
 sure+0.008
 brave+0.008
 independent+0.008
 strong+0.007
 graceful+0.007
Neg logit contributions
ossible-0.011
ric-0.011
 harbour-0.011
 cinema-0.010
 exhibit-0.010
TokenOne
Token ID3198
Feature activation-4.97e-03

Pos logit contributions
ossible+0.064
ric+0.060
 harbour+0.058
 cinema+0.054
 exhibit+0.053
Neg logit contributions
 brave-0.069
 sure-0.068
 independent-0.067
 strong-0.066
 clever-0.065
Token it
Token ID340
Feature activation+3.03e-02

Pos logit contributions
 compassionate+0.029
 courageous+0.028
 independent+0.028
 brave+0.028
 fearless+0.027
Neg logit contributions
ossible-0.039
 exhibit-0.037
 harbour-0.035
ric-0.035
 aquarium-0.034
Token up
Token ID510
Feature activation+1.37e-02

Pos logit contributions
 courageous+0.055
 compassionate+0.054
 independent+0.054
 observ+0.053
 fearless+0.053
Neg logit contributions
ossible-0.044
 exhibit-0.041
 enclosure-0.034
ric-0.033
 harbour-0.033
Token.
Token ID13
Feature activation+5.96e-03

Pos logit contributions
 independent+0.020
 sure+0.020
 brave+0.019
 clever+0.019
 graceful+0.019
Neg logit contributions
ossible-0.026
 harbour-0.022
ric-0.022
 exhibit-0.022
 ago-0.021
Token He
Token ID679
Feature activation+9.47e-04

Pos logit contributions
 brave+0.009
 sure+0.009
 independent+0.009
 strong+0.009
 clever+0.009
Neg logit contributions
ossible-0.011
ric-0.010
 harbour-0.010
 cinema-0.010
 exhibit-0.010
Token feels
Token ID5300
Feature activation+4.35e-02

Pos logit contributions
 independent+0.002
 brave+0.002
 sure+0.002
 strong+0.002
 graceful+0.002
Neg logit contributions
ossible-0.002
ric-0.001
 harbour-0.001
 exhibit-0.001
 cinema-0.001
Token how
Token ID703
Feature activation+7.05e-02

Pos logit contributions
Pen+0.051
Bee+0.050
isy+0.049
 pony+0.049
Small+0.048
Neg logit contributions
ossible-0.084
 harbour-0.078
 ago-0.074
 aquarium-0.072
ric-0.071
Token soft
Token ID2705
Feature activation+1.63e-02

Pos logit contributions
 pony+0.067
 bee+0.066
isy+0.065
 sure+0.064
 independent+0.064
Neg logit contributions
ossible-0.158
 harbour-0.134
ric-0.133
 ago-0.127
|-0.126
Token and
Token ID290
Feature activation+1.32e-02

Pos logit contributions
 independent+0.025
 sure+0.025
 compassionate+0.024
 clever+0.024
 observ+0.024
Neg logit contributions
ossible-0.026
 harbour-0.024
melon-0.024
ric-0.024
 exhibit-0.024
Token smooth
Token ID7209
Feature activation+4.28e-03

Pos logit contributions
 sure+0.011
 bee+0.011
 raven+0.010
 pony+0.010
 lamb+0.010
Neg logit contributions
ossible-0.028
ric-0.027
|-0.025
 harbour-0.025
 tanks-0.025
Token it
Token ID340
Feature activation+1.55e-02

Pos logit contributions
 independent+0.007
 compassionate+0.007
 sure+0.007
 clever+0.007
 brave+0.007
Neg logit contributions
ossible-0.006
 ago-0.006
ric-0.006
melon-0.005
 tanks-0.005
Token is
Token ID318
Feature activation-2.86e-03

Pos logit contributions
 independent+0.030
 compassionate+0.029
 sure+0.028
 courageous+0.028
 brave+0.028
Neg logit contributions
ossible-0.019
ric-0.018
 exhibit-0.017
|-0.016
 cinema-0.015
Token
Token ID198
Feature activation-4.63e-03

Pos logit contributions
ossible+0.004
ric+0.004
 harbour+0.004
 cinema+0.004
 exhibit+0.004
Neg logit contributions
 brave-0.004
 sure-0.004
 independent-0.004
 strong-0.004
 clever-0.004
Token
Token ID198
Feature activation+4.33e-04

Pos logit contributions
ossible+0.008
ric+0.007
 harbour+0.007
|+0.006
 exhibit+0.006
Neg logit contributions
 independent-0.008
 brave-0.008
 strong-0.008
 sure-0.008
 clever-0.008
TokenAnd
Token ID1870
Feature activation-1.07e-02

Pos logit contributions
 independent+0.000
 brave+0.000
 strong+0.000
 sure+0.000
 clever+0.000
Neg logit contributions
ossible-0.001
 harbour-0.001
ric-0.001
|-0.001
 cinema-0.001
Token that
Token ID326
Feature activation+1.92e-02

Pos logit contributions
ossible+0.021
 harbour+0.019
ric+0.019
 exhibit+0.018
|+0.018
Neg logit contributions
 independent-0.014
 strong-0.014
 brave-0.014
 clever-0.014
 graceful-0.014
Token is
Token ID318
Feature activation+1.94e-03

Pos logit contributions
 independent+0.031
 brave+0.030
 strong+0.029
 sure+0.029
 clever+0.029
Neg logit contributions
ossible-0.034
ric-0.030
 exhibit-0.029
 harbour-0.029
 cinema-0.029
Token how
Token ID703
Feature activation+7.05e-02

Pos logit contributions
 independent+0.002
 bee+0.002
 graceful+0.002
 butterfly+0.002
 raven+0.002
Neg logit contributions
ossible-0.004
ric-0.004
 harbour-0.004
 cinema-0.003
ained-0.003
Token the
Token ID262
Feature activation+2.42e-02

Pos logit contributions
 independent+0.119
 bee+0.118
 pony+0.117
 raven+0.117
isy+0.116
Neg logit contributions
ossible-0.101
ric-0.083
|-0.080
 harbour-0.080
 exhibit-0.075
Token rich
Token ID5527
Feature activation-4.05e-03

Pos logit contributions
 sure+0.031
 independent+0.028
 brave+0.027
 strong+0.026
 clever+0.026
Neg logit contributions
ossible-0.049
ric-0.046
 harbour-0.045
 exhibit-0.045
 cinema-0.044
Token man
Token ID582
Feature activation-7.45e-03

Pos logit contributions
ossible+0.008
 harbour+0.007
 exhibit+0.007
ric+0.007
 station+0.007
Neg logit contributions
 sure-0.005
 independent-0.005
 brave-0.005
 clever-0.005
 graceful-0.005
Token found
Token ID1043
Feature activation+2.51e-02

Pos logit contributions
ossible+0.011
 exhibit+0.011
ric+0.011
|+0.010
 harbour+0.010
Neg logit contributions
 independent-0.012
 graceful-0.011
 silk-0.011
 sweet-0.011
 redd-0.011
Token the
Token ID262
Feature activation+3.98e-03

Pos logit contributions
 independent+0.034
 compassionate+0.032
 graceful+0.031
 raven+0.030
 intelligent+0.030
Neg logit contributions
ossible-0.043
 exhibit-0.043
 harbour-0.041
ric-0.041
 tanks-0.039
Token you
Token ID345
Feature activation+1.46e-02

Pos logit contributions
 brave+0.028
 sure+0.027
 independent+0.027
 strong+0.026
 clever+0.026
Neg logit contributions
ossible-0.051
 harbour-0.048
ric-0.048
 cinema-0.046
 exhibit-0.046
Token understand
Token ID1833
Feature activation-3.18e-03

Pos logit contributions
 independent+0.014
 graceful+0.013
 brave+0.013
 clever+0.013
 sure+0.013
Neg logit contributions
ossible-0.036
ric-0.032
|-0.031
 harbour-0.030
 ago-0.030
Token?"
Token ID1701
Feature activation+2.07e-02

Pos logit contributions
ossible+0.005
ric+0.004
 harbour+0.004
 cinema+0.004
 exhibit+0.004
Neg logit contributions
 independent-0.006
 brave-0.006
 sure-0.006
 strong-0.006
 clever-0.006
Token
Token ID198
Feature activation+1.33e-02

Pos logit contributions
 independent+0.040
 sure+0.040
 brave+0.040
 strong+0.039
 clever+0.039
Neg logit contributions
ossible-0.031
ric-0.024
 harbour-0.023
 exhibit-0.022
 ago-0.022
Token
Token ID198
Feature activation+1.84e-02

Pos logit contributions
 independent+0.024
 brave+0.023
 sure+0.023
 strong+0.023
 clever+0.022
Neg logit contributions
ossible-0.022
ric-0.019
 harbour-0.018
|-0.018
 exhibit-0.018
Token"
Token ID1
Feature activation+7.04e-02

Pos logit contributions
 sure+0.027
 independent+0.026
 brave+0.025
 strong+0.025
 clever+0.024
Neg logit contributions
ossible-0.038
ric-0.032
 harbour-0.031
|-0.030
 exhibit-0.030
TokenYes
Token ID5297
Feature activation+8.26e-03

Pos logit contributions
 sure+0.100
 independent+0.092
 brave+0.088
 strong+0.087
 clever+0.083
Neg logit contributions
ossible-0.138
 harbour-0.115
 cinema-0.115
ric-0.115
OM-0.115
Token,
Token ID11
Feature activation+9.26e-03

Pos logit contributions
 independent+0.015
 brave+0.015
 sure+0.014
 graceful+0.014
 strong+0.014
Neg logit contributions
ossible-0.013
ric-0.011
 harbour-0.011
 exhibit-0.010
 Sale-0.010
Token Mom
Token ID11254
Feature activation-1.18e-02

Pos logit contributions
 independent+0.014
 sure+0.014
 brave+0.014
 strong+0.013
 graceful+0.013
Neg logit contributions
ossible-0.017
ric-0.014
 harbour-0.014
 exhibit-0.014
 cinema-0.014
Token,"
Token ID553
Feature activation+3.44e-02

Pos logit contributions
ossible+0.025
 harbour+0.024
ric+0.024
 cinema+0.023
|+0.023
Neg logit contributions
 brave-0.015
 sure-0.014
 independent-0.014
 strong-0.014
 clever-0.013
Token Sam
Token ID3409
Feature activation+2.60e-02

Pos logit contributions
 independent+0.063
 sure+0.063
 brave+0.062
 strong+0.061
 clever+0.060
Neg logit contributions
ossible-0.056
ric-0.044
 harbour-0.043
 exhibit-0.042
 cinema-0.041
Token
Token ID198
Feature activation+3.88e-03

Pos logit contributions
ossible+0.004
ric+0.004
 harbour+0.003
 exhibit+0.003
|+0.003
Neg logit contributions
 brave-0.004
 independent-0.004
 strong-0.004
 graceful-0.004
 clever-0.004
Token
Token ID198
Feature activation+1.05e-02

Pos logit contributions
 independent+0.008
 brave+0.008
 strong+0.008
 clever+0.008
 sharp+0.008
Neg logit contributions
ossible-0.005
|-0.004
ric-0.004
 exhibit-0.004
 harbour-0.004
TokenAnd
Token ID1870
Feature activation-4.34e-03

Pos logit contributions
 independent+0.017
 strong+0.016
 sharp+0.016
 intelligent+0.015
 brave+0.015
Neg logit contributions
ossible-0.020
|-0.017
 harbour-0.014
 cinema-0.014
arium-0.014
Token that
Token ID326
Feature activation+2.18e-02

Pos logit contributions
ossible+0.008
 harbour+0.007
|+0.007
 exhibit+0.007
ric+0.007
Neg logit contributions
 independent-0.006
 strong-0.006
 brave-0.006
 clever-0.006
 graceful-0.006
Token is
Token ID318
Feature activation+5.30e-03

Pos logit contributions
 independent+0.036
 brave+0.034
 strong+0.034
 sure+0.034
 clever+0.033
Neg logit contributions
ossible-0.039
ric-0.033
 exhibit-0.032
|-0.032
 harbour-0.032
Token how
Token ID703
Feature activation+7.03e-02

Pos logit contributions
 independent+0.006
 graceful+0.006
 butterflies+0.006
 raven+0.006
 sharp+0.006
Neg logit contributions
ossible-0.011
ric-0.009
 harbour-0.009
 cinema-0.008
 Sale-0.008
Token Daisy
Token ID40355
Feature activation+3.80e-02

Pos logit contributions
 raven+0.102
 pony+0.102
pick+0.101
 brist+0.100
 independent+0.100
Neg logit contributions
ossible-0.114
|-0.094
ric-0.087
 ago-0.087
 harbour-0.086
Token learned
Token ID4499
Feature activation+1.90e-02

Pos logit contributions
 independent+0.056
 graceful+0.055
 sure+0.053
 oh+0.053
 sweet+0.053
Neg logit contributions
ossible-0.066
 harbour-0.058
 exhibit-0.057
|-0.056
 ago-0.054
Token to
Token ID284
Feature activation-9.42e-03

Pos logit contributions
 independent+0.024
 brave+0.023
 strong+0.023
 clever+0.022
 sure+0.022
Neg logit contributions
ossible-0.038
ric-0.035
 harbour-0.034
 exhibit-0.033
 cinema-0.033
Token read
Token ID1100
Feature activation-4.07e-04

Pos logit contributions
ossible+0.019
ric+0.017
 harbour+0.017
 exhibit+0.016
 Sale+0.016
Neg logit contributions
 independent-0.012
 sure-0.012
 brave-0.012
 graceful-0.012
 strong-0.012
Token.
Token ID13
Feature activation+1.00e-02

Pos logit contributions
ossible+0.001
 exhibit+0.001
 cinema+0.001
ric+0.001
 harbour+0.001
Neg logit contributions
 independent-0.001
 sure-0.001
 brave-0.000
 clever-0.000
 intelligent-0.000
Token your
Token ID534
Feature activation+4.95e-03

Pos logit contributions
worm+0.028
 redd+0.026
 brave+0.026
 raven+0.026
 courageous+0.026
Neg logit contributions
ossible-0.038
ric-0.035
 harbour-0.033
 Sale-0.032
 cinema-0.032
Token parents
Token ID3397
Feature activation-6.62e-03

Pos logit contributions
 sure+0.007
 brave+0.007
 independent+0.007
 strong+0.007
 graceful+0.007
Neg logit contributions
ossible-0.010
ric-0.009
 harbour-0.009
 exhibit-0.008
 cinema-0.008
Token and
Token ID290
Feature activation-1.35e-02

Pos logit contributions
ossible+0.014
ric+0.013
 harbour+0.013
 exhibit+0.013
 cinema+0.013
Neg logit contributions
 brave-0.008
 independent-0.008
 sure-0.008
 strong-0.008
 clever-0.008
Token keep
Token ID1394
Feature activation+6.09e-04

Pos logit contributions
ossible+0.028
ric+0.025
 harbour+0.025
 exhibit+0.024
 cinema+0.024
Neg logit contributions
 brave-0.018
 independent-0.017
 sure-0.017
 strong-0.017
 clever-0.017
Token your
Token ID534
Feature activation+2.07e-02

Pos logit contributions
 bee+0.001
 brave+0.001
 raven+0.001
 independent+0.001
worm+0.001
Neg logit contributions
ossible-0.001
ric-0.001
 harbour-0.001
 cinema-0.001
 exhibit-0.001
Token things
Token ID1243
Feature activation+7.02e-02

Pos logit contributions
 sure+0.031
 brave+0.030
 independent+0.030
 strong+0.029
 graceful+0.029
Neg logit contributions
ossible-0.039
ric-0.035
 exhibit-0.033
 harbour-0.033
 cinema-0.033
Token tidy
Token ID43044
Feature activation+4.69e-03

Pos logit contributions
 bee+0.070
 raven+0.065
 sure+0.064
 independent+0.063
 clever+0.062
Neg logit contributions
ossible-0.158
 harbour-0.144
ric-0.131
 driveway-0.130
|-0.130
Token.
Token ID13
Feature activation-1.15e-02

Pos logit contributions
 brave+0.006
 independent+0.006
 sure+0.006
 clever+0.006
 graceful+0.006
Neg logit contributions
ossible-0.009
 harbour-0.008
ric-0.008
 ago-0.008
 exhibit-0.008
Token<|endoftext|>
Token ID50256
Feature activation-4.38e-02

Pos logit contributions
ossible+0.023
 harbour+0.022
ric+0.022
 cinema+0.022
 exhibit+0.022
Neg logit contributions
 sure-0.015
 brave-0.014
 independent-0.014
 strong-0.014
 graceful-0.013
TokenOnce
Token ID7454
Feature activation-4.02e-02

Pos logit contributions
ric+0.096
ossible+0.095
 harbour+0.093
anged+0.089
 cinema+0.086
Neg logit contributions
 brave-0.051
 sure-0.050
 strong-0.048
 independent-0.047
 clever-0.045
Token upon
Token ID2402
Feature activation-1.24e-02

Pos logit contributions
 harbour+0.066
ric+0.064
 cinema+0.062
 habitat+0.059
 Sale+0.058
Neg logit contributions
 sure-0.070
 brave-0.069
 clever-0.067
 independent-0.067
 strong-0.067
Token sailed
Token ID33835
Feature activation-8.46e-03

Pos logit contributions
 independent+0.017
 brave+0.017
 strong+0.016
 sure+0.016
 clever+0.016
Neg logit contributions
ossible-0.020
ric-0.018
 harbour-0.017
 exhibit-0.017
 cinema-0.016
Token for
Token ID329
Feature activation+1.16e-03

Pos logit contributions
ossible+0.016
 harbour+0.014
 exhibit+0.013
 cinema+0.013
rollers+0.012
Neg logit contributions
 independent-0.012
 sure-0.012
 bee-0.012
 clever-0.012
 compassionate-0.012
Token days
Token ID1528
Feature activation+5.79e-03

Pos logit contributions
 independent+0.001
 sure+0.001
 brave+0.001
 strong+0.001
 clever+0.001
Neg logit contributions
ossible-0.003
 harbour-0.002
ric-0.002
 exhibit-0.002
 cinema-0.002
Token,
Token ID11
Feature activation+1.20e-02

Pos logit contributions
 brave+0.007
 sure+0.007
 independent+0.007
 strong+0.007
 clever+0.006
Neg logit contributions
ossible-0.013
ric-0.012
 harbour-0.012
 cinema-0.011
 exhibit-0.011
Token exploring
Token ID13504
Feature activation+3.10e-02

Pos logit contributions
 independent+0.017
 compassionate+0.016
 clever+0.016
 brave+0.016
 bee+0.016
Neg logit contributions
 harbour-0.021
ossible-0.020
ric-0.020
 exhibit-0.020
 station-0.019
Token unch
Token ID16258
Feature activation-9.23e-02

Pos logit contributions
 tang+0.041
 sure+0.041
 independent+0.040
 compassionate+0.039
 bee+0.038
Neg logit contributions
 exhibit-0.054
 harbour-0.054
ossible-0.054
 cinema-0.051
arium-0.049
Tokenarted
Token ID19112
Feature activation-1.46e-02

Pos logit contributions
ossible+0.159
 cinema+0.147
 exhibit+0.139
 harbour+0.137
ric+0.137
Neg logit contributions
 independent-0.155
 sure-0.144
 strong-0.140
 intelligent-0.140
 graceful-0.140
Token waters
Token ID10150
Feature activation+1.25e-02

Pos logit contributions
ossible+0.029
ric+0.027
 harbour+0.027
 exhibit+0.027
 cinema+0.026
Neg logit contributions
 independent-0.018
 sure-0.018
 brave-0.018
 clever-0.018
 compassionate-0.018
Token until
Token ID1566
Feature activation+7.95e-03

Pos logit contributions
 independent+0.017
 sure+0.017
 brave+0.017
 clever+0.016
 strong+0.016
Neg logit contributions
ossible-0.024
ric-0.023
 harbour-0.022
 exhibit-0.022
 ago-0.021
Token he
Token ID339
Feature activation-1.26e-03

Pos logit contributions
 brave+0.008
 independent+0.008
 sure+0.008
 strong+0.008
 clever+0.007
Neg logit contributions
ossible-0.019
ric-0.018
 harbour-0.018
 cinema-0.017
 exhibit-0.017
Token came
Token ID1625
Feature activation-2.06e-02

Pos logit contributions
ossible+0.003
ric+0.002
 harbour+0.002
 cinema+0.002
 exhibit+0.002
Neg logit contributions
 brave-0.002
 sure-0.002
 independent-0.002
 strong-0.002
 clever-0.002
Token up
Token ID510
Feature activation+1.14e-02

Pos logit contributions
ossible+0.023
 harbour+0.021
ric+0.021
 cinema+0.020
 exhibit+0.020
Neg logit contributions
 sure-0.015
 independent-0.015
 brave-0.014
 clever-0.014
 compassionate-0.013
Token and
Token ID290
Feature activation-1.12e-02

Pos logit contributions
 sure+0.015
 independent+0.015
 brave+0.014
 clever+0.014
 compassionate+0.014
Neg logit contributions
ossible-0.022
 harbour-0.020
ric-0.020
 exhibit-0.020
 cinema-0.020
Token down
Token ID866
Feature activation+1.99e-03

Pos logit contributions
ossible+0.019
ric+0.018
 harbour+0.018
 cinema+0.017
 exhibit+0.017
Neg logit contributions
 independent-0.017
 sure-0.017
 brave-0.017
 clever-0.016
 compassionate-0.016
Token on
Token ID319
Feature activation-1.49e-02

Pos logit contributions
 sure+0.003
 independent+0.003
 brave+0.003
 compassionate+0.003
 clever+0.003
Neg logit contributions
ossible-0.003
 harbour-0.003
 exhibit-0.003
ric-0.003
 cinema-0.003
Token the
Token ID262
Feature activation-2.54e-02

Pos logit contributions
ossible+0.033
ric+0.031
 exhibit+0.031
 cinema+0.030
 harbour+0.030
Neg logit contributions
 independent-0.017
 sure-0.016
 brave-0.016
 clever-0.015
 compassionate-0.015
Token tr
Token ID491
Feature activation-9.15e-02

Pos logit contributions
ossible+0.039
 cinema+0.037
 exhibit+0.037
ric+0.036
 harbour+0.036
Neg logit contributions
 sure-0.044
 independent-0.040
 brave-0.040
 compassionate-0.039
 clever-0.038
Tokenamp
Token ID696
Feature activation-4.86e-02

Pos logit contributions
ossible+0.152
 inconvenience+0.150
 harbour+0.147
 Sale+0.141
 tanks+0.135
Neg logit contributions
The-0.098
 independent-0.098
 flexible-0.098
 ant-0.095
in-0.095
Tokenoline
Token ID14453
Feature activation-3.54e-02

Pos logit contributions
ossible+0.081
ric+0.073
 harbour+0.071
 cinema+0.065
rollers+0.063
Neg logit contributions
 independent-0.082
 brave-0.082
 strong-0.080
 sure-0.079
 graceful-0.077
Token.
Token ID13
Feature activation-1.27e-02

Pos logit contributions
ossible+0.072
ric+0.067
 harbour+0.065
 cinema+0.063
 exhibit+0.063
Neg logit contributions
 brave-0.048
 sure-0.047
 independent-0.047
 strong-0.046
 clever-0.045
Token
Token ID198
Feature activation-1.75e-02

Pos logit contributions
ossible+0.024
ric+0.023
 harbour+0.022
 cinema+0.021
|+0.021
Neg logit contributions
 sure-0.019
 brave-0.019
 independent-0.018
 strong-0.018
 clever-0.018
Token
Token ID198
Feature activation-1.03e-02

Pos logit contributions
ossible+0.032
ric+0.030
 harbour+0.029
 cinema+0.028
 exhibit+0.028
Neg logit contributions
 brave-0.027
 sure-0.027
 independent-0.026
 strong-0.026
 clever-0.025
Token.
Token ID13
Feature activation+1.10e-02

Pos logit contributions
 independent+0.025
 sure+0.025
 compassionate+0.024
 clever+0.024
 brave+0.024
Neg logit contributions
ossible-0.032
ric-0.029
ained-0.028
 harbour-0.027
 cinema-0.027
Token She
Token ID1375
Feature activation-5.79e-03

Pos logit contributions
 brave+0.017
 sure+0.017
 independent+0.017
 strong+0.017
 clever+0.016
Neg logit contributions
ossible-0.020
ric-0.018
 ago-0.017
 harbour-0.017
|-0.017
Token falls
Token ID8953
Feature activation-3.85e-02

Pos logit contributions
ossible+0.010
ric+0.009
 harbour+0.008
 exhibit+0.008
 cinema+0.008
Neg logit contributions
 independent-0.010
 sure-0.010
 brave-0.010
 strong-0.010
 graceful-0.009
Token off
Token ID572
Feature activation+1.26e-02

Pos logit contributions
ossible+0.078
ric+0.072
 harbour+0.070
 cinema+0.068
 exhibit+0.067
Neg logit contributions
 brave-0.051
 sure-0.051
 independent-0.051
 clever-0.049
 strong-0.049
Token the
Token ID262
Feature activation-4.04e-02

Pos logit contributions
 sure+0.017
 observ+0.016
 courageous+0.016
 clever+0.015
 independent+0.015
Neg logit contributions
ossible-0.023
|-0.022
 exhibit-0.021
 cinema-0.021
 ago-0.021
Token tr
Token ID491
Feature activation-9.03e-02

Pos logit contributions
ossible+0.060
ric+0.053
 cinema+0.052
 exhibit+0.051
|+0.051
Neg logit contributions
 sure-0.077
 independent-0.071
 brave-0.071
 compassionate-0.069
 clever-0.068
Tokenamp
Token ID696
Feature activation-5.22e-02

Pos logit contributions
 harbour+0.143
ossible+0.140
 inconvenience+0.132
 Sale+0.131
 station+0.126
Neg logit contributions
 independent-0.122
 flexible-0.118
 ant-0.117
 brave-0.116
 strong-0.115
Tokenoline
Token ID14453
Feature activation-3.58e-02

Pos logit contributions
ossible+0.080
 harbour+0.072
ric+0.071
 cinema+0.064
rollers+0.063
Neg logit contributions
 independent-0.095
 brave-0.095
 strong-0.094
 sure-0.092
 graceful-0.090
Token.
Token ID13
Feature activation+1.11e-02

Pos logit contributions
ossible+0.075
ric+0.070
 harbour+0.068
 exhibit+0.067
 cinema+0.066
Neg logit contributions
 brave-0.045
 independent-0.045
 sure-0.045
 strong-0.043
 clever-0.042
Token She
Token ID1375
Feature activation-3.44e-03

Pos logit contributions
 brave+0.017
 sure+0.017
 independent+0.017
 strong+0.016
 clever+0.016
Neg logit contributions
ossible-0.020
ric-0.018
 harbour-0.017
 ago-0.017
|-0.017
Token lands
Token ID8604
Feature activation+2.41e-02

Pos logit contributions
ossible+0.006
ric+0.005
 harbour+0.005
 exhibit+0.005
|+0.005
Neg logit contributions
 independent-0.006
 sure-0.006
 brave-0.006
 strong-0.006
 clever-0.006
Token the
Token ID262
Feature activation-9.56e-03

Pos logit contributions
 brave+0.007
 courageous+0.007
 fearless+0.007
 sure+0.007
 independent+0.007
Neg logit contributions
ossible-0.009
ric-0.008
 pizz-0.007
 Sale-0.007
 exhibit-0.007
Token mixture
Token ID11710
Feature activation-3.70e-03

Pos logit contributions
ossible+0.014
 exhibit+0.011
 cinema+0.011
 ago+0.011
 station+0.011
Neg logit contributions
 sure-0.018
 fearless-0.017
 brave-0.017
 independent-0.017
Small-0.017
Token into
Token ID656
Feature activation-3.69e-03

Pos logit contributions
ossible+0.007
ric+0.007
 harbour+0.006
 exhibit+0.006
 cinema+0.006
Neg logit contributions
 brave-0.005
 independent-0.005
 sure-0.005
 clever-0.005
 compassionate-0.005
Token some
Token ID617
Feature activation+3.61e-04

Pos logit contributions
ossible+0.007
ric+0.006
 ago+0.006
 Sale+0.006
 pizz+0.006
Neg logit contributions
 courageous-0.005
 fearless-0.005
 brave-0.005
 independent-0.005
 agile-0.004
Token baking
Token ID16871
Feature activation-8.14e-03

Pos logit contributions
 sure+0.001
 fearless+0.001
 courageous+0.001
 brave+0.001
 independent+0.001
Neg logit contributions
ossible-0.001
 ago-0.000
 pizz-0.000
 exhibit-0.000
 pool-0.000
Token tr
Token ID491
Feature activation-8.86e-02

Pos logit contributions
ossible+0.015
ric+0.013
 exhibit+0.013
 cinema+0.012
 harbour+0.012
Neg logit contributions
 brave-0.013
 independent-0.013
 courageous-0.012
 sure-0.012
 compassionate-0.012
Tokenays
Token ID592
Feature activation-1.23e-02

Pos logit contributions
 harbour+0.116
ossible+0.115
 cinema+0.107
ric+0.106
 inconvenience+0.104
Neg logit contributions
 sure-0.155
 independent-0.155
 strong-0.148
 flexible-0.148
 graceful-0.145
Token.
Token ID13
Feature activation-9.51e-03

Pos logit contributions
ossible+0.022
ric+0.020
 harbour+0.019
 exhibit+0.019
 ago+0.018
Neg logit contributions
 brave-0.019
 independent-0.019
 sure-0.018
 clever-0.018
 compassionate-0.018
Token She
Token ID1375
Feature activation-1.11e-03

Pos logit contributions
ossible+0.016
ric+0.014
 harbour+0.014
 exhibit+0.013
 Sale+0.013
Neg logit contributions
 brave-0.016
 independent-0.016
 clever-0.016
 sure-0.016
 strong-0.016
Token put
Token ID1234
Feature activation+1.22e-02

Pos logit contributions
ossible+0.002
ric+0.001
 exhibit+0.001
 harbour+0.001
 cinema+0.001
Neg logit contributions
 independent-0.002
 courageous-0.002
 brave-0.002
 graceful-0.002
 fearless-0.002
Token them
Token ID606
Feature activation+2.10e-02

Pos logit contributions
 courageous+0.020
 fearless+0.019
 compassionate+0.019
 independent+0.019
 observ+0.018
Neg logit contributions
ossible-0.021
ric-0.018
 Sale-0.016
 pizz-0.016
melon-0.016
Token to
Token ID284
Feature activation-3.09e-03

Pos logit contributions
 sure+0.009
 independent+0.009
 brave+0.009
 compassionate+0.009
 clever+0.009
Neg logit contributions
ossible-0.019
 exhibit-0.018
ric-0.018
 harbour-0.018
 cinema-0.017
Token the
Token ID262
Feature activation-1.89e-02

Pos logit contributions
ossible+0.007
 exhibit+0.006
ric+0.006
 harbour+0.006
 cinema+0.006
Neg logit contributions
 independent-0.003
 brave-0.003
 compassionate-0.003
 courageous-0.003
 clever-0.003
Token edge
Token ID5743
Feature activation-8.32e-03

Pos logit contributions
ossible+0.032
 cinema+0.030
ric+0.029
 exhibit+0.029
 harbour+0.029
Neg logit contributions
 sure-0.031
 independent-0.029
 brave-0.029
 compassionate-0.027
 strong-0.027
Token of
Token ID286
Feature activation-3.02e-02

Pos logit contributions
ossible+0.017
ric+0.016
 harbour+0.016
 exhibit+0.015
 cinema+0.015
Neg logit contributions
 sure-0.011
 independent-0.011
 brave-0.010
 clever-0.010
 strong-0.010
Token the
Token ID262
Feature activation-2.69e-02

Pos logit contributions
ossible+0.073
ric+0.069
 harbour+0.068
 cinema+0.068
 exhibit+0.067
Neg logit contributions
 independent-0.027
 sure-0.027
 brave-0.027
 clever-0.024
 strong-0.024
Token tr
Token ID491
Feature activation-8.85e-02

Pos logit contributions
ossible+0.033
ric+0.028
 harbour+0.027
 exhibit+0.027
 cinema+0.027
Neg logit contributions
 sure-0.059
 independent-0.057
 brave-0.057
 strong-0.056
 clever-0.055
Tokenamp
Token ID696
Feature activation-4.85e-02

Pos logit contributions
ossible+0.141
 inconvenience+0.140
 harbour+0.138
 Sale+0.130
bank+0.123
Neg logit contributions
 ant-0.110
 independent-0.107
 flexible-0.106
in-0.105
The-0.102
Tokenoline
Token ID14453
Feature activation-3.33e-02

Pos logit contributions
ossible+0.073
ric+0.065
 harbour+0.064
 cinema+0.057
rollers+0.056
Neg logit contributions
 brave-0.090
 independent-0.090
 strong-0.089
 sure-0.087
 graceful-0.085
Token and
Token ID290
Feature activation-1.07e-03

Pos logit contributions
ossible+0.075
ric+0.070
 harbour+0.068
 cinema+0.066
|+0.065
Neg logit contributions
 brave-0.038
 sure-0.038
 independent-0.038
 strong-0.037
 clever-0.035
Token took
Token ID1718
Feature activation+8.18e-03

Pos logit contributions
ossible+0.002
 harbour+0.001
ric+0.001
 exhibit+0.001
anged+0.001
Neg logit contributions
 independent-0.002
 compassionate-0.002
 brave-0.002
 intelligent-0.002
 clever-0.002
Token a
Token ID257
Feature activation+1.18e-03

Pos logit contributions
 compassionate+0.012
 independent+0.012
 intelligent+0.012
 graceful+0.012
 courageous+0.012
Neg logit contributions
ossible-0.014
 exhibit-0.012
 Sale-0.012
 cinema-0.012
|-0.012
Token the
Token ID262
Feature activation-5.32e-03

Pos logit contributions
ossible+0.075
ric+0.072
 harbour+0.069
 cinema+0.068
 exhibit+0.068
Neg logit contributions
 brave-0.018
 independent-0.018
 strong-0.018
 sure-0.017
 graceful-0.016
Token weeds
Token ID36708
Feature activation-1.34e-03

Pos logit contributions
ossible+0.009
ric+0.009
 harbour+0.008
 exhibit+0.008
 cinema+0.008
Neg logit contributions
 brave-0.009
 independent-0.009
 strong-0.008
 graceful-0.008
 sure-0.008
Token next
Token ID1306
Feature activation-4.51e-02

Pos logit contributions
ossible+0.003
ric+0.003
|+0.003
 cinema+0.003
 harbour+0.003
Neg logit contributions
 brave-0.001
 sure-0.001
 strong-0.001
 independent-0.001
 graceful-0.001
Token to
Token ID284
Feature activation+1.18e-02

Pos logit contributions
ric+0.118
ossible+0.117
|+0.115
 cinema+0.114
 harbour+0.113
Neg logit contributions
 strong-0.034
 independent-0.033
 brave-0.032
 sweet-0.028
 graceful-0.028
Token his
Token ID465
Feature activation-4.34e-03

Pos logit contributions
 brave+0.015
 sure+0.015
 independent+0.014
 strong+0.014
 clever+0.014
Neg logit contributions
ossible-0.026
ric-0.024
 harbour-0.023
 exhibit-0.023
 cinema-0.023
Token tr
Token ID491
Feature activation-8.80e-02

Pos logit contributions
ossible+0.009
ric+0.008
 harbour+0.008
|+0.008
 cinema+0.008
Neg logit contributions
 brave-0.006
 independent-0.006
 sure-0.006
 strong-0.006
 clever-0.006
Tokenamp
Token ID696
Feature activation-5.30e-02

Pos logit contributions
 inconvenience+0.140
 Sale+0.131
 cinema+0.130
 station+0.128
 pizz+0.128
Neg logit contributions
 independent-0.116
 flexible-0.113
 brave-0.112
 sure-0.112
 ant-0.111
Tokenoline
Token ID14453
Feature activation-3.02e-02

Pos logit contributions
ossible+0.082
ric+0.074
 harbour+0.072
 cinema+0.067
rollers+0.064
Neg logit contributions
 brave-0.096
 independent-0.096
 strong-0.094
 sure-0.094
 graceful-0.090
Token.
Token ID13
Feature activation-1.24e-02

Pos logit contributions
ossible+0.070
ric+0.066
 harbour+0.063
|+0.062
 cinema+0.061
Neg logit contributions
 brave-0.033
 sure-0.032
 independent-0.031
 strong-0.031
 clever-0.028
Token He
Token ID679
Feature activation+2.76e-03

Pos logit contributions
ric+0.025
ossible+0.024
|+0.024
 cinema+0.023
 exhibit+0.022
Neg logit contributions
 sure-0.017
 strong-0.016
 independent-0.016
 brave-0.016
 clever-0.015
Token shouted
Token ID17293
Feature activation-1.29e-02

Pos logit contributions
 independent+0.004
rier+0.004
 sure+0.004
 strong+0.004
 courageous+0.004
Neg logit contributions
ossible-0.005
 exhibit-0.004
 harbour-0.004
ric-0.004
|-0.004
Token,
Token ID11
Feature activation+1.13e-02

Pos logit contributions
ossible+0.036
ric+0.033
 harbour+0.033
 exhibit+0.032
 cinema+0.031
Neg logit contributions
 brave-0.022
 sure-0.022
 independent-0.022
 strong-0.021
 clever-0.021
Token until
Token ID1566
Feature activation+6.95e-04

Pos logit contributions
 independent+0.016
 compassionate+0.015
 bee+0.015
 clever+0.015
 brave+0.015
Neg logit contributions
ossible-0.020
ric-0.019
 harbour-0.019
 driveway-0.018
 station-0.017
Token at
Token ID379
Feature activation-3.28e-02

Pos logit contributions
 brave+0.000
 sure+0.000
 independent+0.000
 strong+0.000
 clever+0.000
Neg logit contributions
ossible-0.002
ric-0.002
 harbour-0.002
 exhibit-0.002
 cinema-0.002
Token last
Token ID938
Feature activation-5.61e-03

Pos logit contributions
ossible+0.086
ric+0.081
 harbour+0.080
 exhibit+0.078
 cinema+0.078
Neg logit contributions
 brave-0.025
 independent-0.025
 sure-0.025
 strong-0.024
 graceful-0.022
Token her
Token ID607
Feature activation-1.34e-02

Pos logit contributions
ossible+0.011
 harbour+0.010
ric+0.010
 ago+0.009
 exhibit+0.009
Neg logit contributions
 independent-0.008
 sure-0.008
 brave-0.007
 clever-0.007
 compassionate-0.007
Token tr
Token ID491
Feature activation-8.57e-02

Pos logit contributions
ossible+0.025
ric+0.023
 harbour+0.022
 exhibit+0.022
 cinema+0.021
Neg logit contributions
 sure-0.020
 independent-0.019
 brave-0.019
 strong-0.019
 clever-0.018
Tokenicycle
Token ID35298
Feature activation-4.27e-02

Pos logit contributions
 cinema+0.129
 harbour+0.125
 inconvenience+0.122
 aquarium+0.122
 tanks+0.122
Neg logit contributions
 independent-0.121
 sure-0.119
 flexible-0.115
 strong-0.113
 brave-0.112
Tokenâ
Token ID22940
Feature activation-4.56e-02

Pos logit contributions
ossible+0.071
ric+0.066
 harbour+0.066
 exhibit+0.064
 cinema+0.062
Neg logit contributions
 independent-0.069
 sure-0.068
 brave-0.067
 clever-0.066
 compassionate-0.065
Token€
Token ID26391
Feature activation-3.26e-02

Pos logit contributions
ossible+0.078
ric+0.072
 harbour+0.070
 exhibit+0.068
 cinema+0.066
Neg logit contributions
 independent-0.076
 brave-0.075
 sure-0.074
 strong-0.073
 graceful-0.070
Token™
Token ID8151
Feature activation-3.84e-02

Pos logit contributions
ossible+0.052
ric+0.048
 harbour+0.047
 cinema+0.045
 exhibit+0.044
Neg logit contributions
 brave-0.057
 sure-0.057
 independent-0.056
 strong-0.055
 clever-0.054
Tokens
Token ID82
Feature activation-2.43e-02

Pos logit contributions
 harbour+0.067
ossible+0.066
ric+0.066
 cinema+0.064
 aquarium+0.062
Neg logit contributions
 sure-0.059
 brave-0.058
 strong-0.054
 independent-0.053
 sweet-0.053
Token D
Token ID360
Feature activation-1.62e-03

Pos logit contributions
 brave+0.011
 independent+0.010
 sure+0.010
 strong+0.010
 graceful+0.010
Neg logit contributions
ossible-0.017
ric-0.016
 harbour-0.016
 exhibit-0.015
 cinema-0.015
Tokenina
Token ID1437
Feature activation-1.13e-02

Pos logit contributions
ossible+0.003
ric+0.002
 harbour+0.002
 cinema+0.002
 exhibit+0.002
Neg logit contributions
 brave-0.003
 sure-0.003
 independent-0.003
 strong-0.003
 graceful-0.003
Token was
Token ID373
Feature activation-2.35e-02

Pos logit contributions
ossible+0.020
ric+0.018
 harbour+0.018
 exhibit+0.017
 cinema+0.017
Neg logit contributions
 independent-0.018
 brave-0.018
 sure-0.018
 strong-0.017
 graceful-0.017
Token riding
Token ID10311
Feature activation-1.46e-02

Pos logit contributions
ossible+0.043
ric+0.040
 harbour+0.038
 exhibit+0.037
|+0.037
Neg logit contributions
 brave-0.038
 sure-0.037
 independent-0.037
 strong-0.037
 clever-0.036
Token her
Token ID607
Feature activation-5.39e-03

Pos logit contributions
ossible+0.028
|+0.025
ained+0.024
 exhibit+0.024
ric+0.023
Neg logit contributions
 sure-0.021
 clever-0.018
 observ-0.018
 independent-0.018
 intelligent-0.017
Token tr
Token ID491
Feature activation-8.52e-02

Pos logit contributions
ossible+0.010
ric+0.009
 harbour+0.009
 cinema+0.008
 exhibit+0.008
Neg logit contributions
 brave-0.009
 independent-0.009
 sure-0.009
 strong-0.008
 clever-0.008
Tokenicycle
Token ID35298
Feature activation-3.90e-02

Pos logit contributions
 harbour+0.139
 inconvenience+0.135
 cinema+0.134
 aquarium+0.129
 station+0.128
Neg logit contributions
 independent-0.112
 flexible-0.107
 strong-0.102
 graceful-0.100
 ant-0.100
Token,
Token ID11
Feature activation+1.25e-02

Pos logit contributions
ossible+0.073
ric+0.067
 harbour+0.065
 exhibit+0.063
 cinema+0.063
Neg logit contributions
 sure-0.058
 brave-0.058
 independent-0.058
 strong-0.056
 clever-0.055
Token something
Token ID1223
Feature activation+3.99e-03

Pos logit contributions
 independent+0.014
 sure+0.014
 courageous+0.013
 intelligent+0.013
 clever+0.013
Neg logit contributions
ossible-0.024
ained-0.021
ric-0.021
 harbour-0.020
 pool-0.020
Token mysterious
Token ID11428
Feature activation-2.67e-02

Pos logit contributions
 sure+0.005
 independent+0.005
 brave+0.004
 careful+0.004
Pen+0.004
Neg logit contributions
ossible-0.007
 cinema-0.007
ric-0.007
 harbour-0.007
 exhibit-0.007
Token and
Token ID290
Feature activation+6.71e-03

Pos logit contributions
ossible+0.045
ric+0.041
 harbour+0.040
 exhibit+0.038
 cinema+0.038
Neg logit contributions
 brave-0.045
 independent-0.045
 sure-0.045
 strong-0.044
 clever-0.043
Token it
Token ID340
Feature activation+6.94e-03

Pos logit contributions
ossible+0.001
ric+0.001
 ago+0.001
 exhibit+0.001
 harbour+0.001
Neg logit contributions
 independent-0.001
 brave-0.001
 sure-0.001
 clever-0.000
 compassionate-0.000
Token for
Token ID329
Feature activation-2.31e-02

Pos logit contributions
 independent+0.010
 brave+0.010
 sure+0.010
 graceful+0.010
 clever+0.010
Neg logit contributions
ossible-0.013
ric-0.011
 exhibit-0.011
 harbour-0.011
|-0.011
Token her
Token ID607
Feature activation-7.01e-03

Pos logit contributions
ossible+0.053
ric+0.049
 exhibit+0.048
 harbour+0.047
 cinema+0.046
Neg logit contributions
 independent-0.023
 brave-0.023
 sure-0.023
 strong-0.022
 graceful-0.021
Token tree
Token ID5509
Feature activation-1.99e-02

Pos logit contributions
ossible+0.013
ric+0.011
 exhibit+0.011
 harbour+0.011
 ago+0.011
Neg logit contributions
 sure-0.011
 independent-0.011
 brave-0.010
 graceful-0.010
 strong-0.010
Token-
Token ID12
Feature activation-1.59e-02

Pos logit contributions
ossible+0.040
ric+0.037
 harbour+0.036
 exhibit+0.035
 cinema+0.035
Neg logit contributions
 brave-0.027
 sure-0.027
 independent-0.027
 strong-0.026
 clever-0.026
Tokencl
Token ID565
Feature activation-8.50e-02

Pos logit contributions
ossible+0.030
ric+0.028
 harbour+0.026
 exhibit+0.026
|+0.025
Neg logit contributions
 brave-0.024
 independent-0.024
 sure-0.024
 strong-0.023
 clever-0.023
Tokenim
Token ID320
Feature activation-3.05e-02

Pos logit contributions
 harbour+0.123
 inconvenience+0.122
 cinema+0.122
 Sale+0.114
ric+0.111
Neg logit contributions
 independent-0.135
 sure-0.130
 brave-0.130
 strong-0.128
 flexible-0.125
Tokenbing
Token ID4623
Feature activation+6.67e-03

Pos logit contributions
ossible+0.040
ric+0.039
 exhibit+0.035
|+0.033
 harbour+0.033
Neg logit contributions
 independent-0.062
 sure-0.060
 brave-0.060
 graceful-0.060
 clever-0.059
Token adventures
Token ID17545
Feature activation-5.35e-03

Pos logit contributions
 sure+0.010
 independent+0.010
 brave+0.010
 clever+0.010
 graceful+0.009
Neg logit contributions
ossible-0.012
ric-0.011
 exhibit-0.011
 harbour-0.011
 cinema-0.010
Token.
Token ID13
Feature activation-2.02e-03

Pos logit contributions
ossible+0.009
 exhibit+0.008
 harbour+0.008
ric+0.008
|+0.008
Neg logit contributions
 independent-0.008
 sure-0.008
 brave-0.008
 clever-0.008
 graceful-0.008
Token She
Token ID1375
Feature activation+6.02e-03

Pos logit contributions
ossible+0.004
ric+0.004
 harbour+0.004
 exhibit+0.003
 cinema+0.003
Neg logit contributions
 brave-0.003
 independent-0.003
 sure-0.003
 strong-0.003
 clever-0.003
Token and
Token ID290
Feature activation+2.34e-04

Pos logit contributions
ossible+0.030
ric+0.029
 harbour+0.028
|+0.028
 cinema+0.027
Neg logit contributions
 brave-0.016
 independent-0.016
 strong-0.016
 sure-0.015
 clever-0.015
Token hop
Token ID1725
Feature activation+2.24e-02

Pos logit contributions
 brave+0.000
 sure+0.000
 independent+0.000
 strong+0.000
 clever+0.000
Neg logit contributions
ossible-0.001
ric-0.000
 harbour-0.000
 cinema-0.000
|-0.000
Token around
Token ID1088
Feature activation-7.71e-03

Pos logit contributions
 brave+0.023
 sure+0.023
 independent+0.023
 strong+0.022
 clever+0.021
Neg logit contributions
ossible-0.053
ric-0.049
 harbour-0.048
 exhibit-0.047
 cinema-0.047
Token on
Token ID319
Feature activation-1.65e-03

Pos logit contributions
ossible+0.015
ric+0.014
 harbour+0.014
 exhibit+0.013
 cinema+0.013
Neg logit contributions
 sure-0.011
 independent-0.011
 brave-0.011
 strong-0.010
 clever-0.010
Token his
Token ID465
Feature activation-3.79e-03

Pos logit contributions
ossible+0.004
ric+0.004
 harbour+0.004
 exhibit+0.003
 cinema+0.003
Neg logit contributions
 sure-0.002
 brave-0.002
 independent-0.002
 strong-0.002
 clever-0.002
Token tr
Token ID491
Feature activation-8.49e-02

Pos logit contributions
ossible+0.007
ric+0.006
 harbour+0.006
 cinema+0.006
 exhibit+0.006
Neg logit contributions
 brave-0.006
 independent-0.006
 sure-0.006
 strong-0.006
 clever-0.006
Tokenamp
Token ID696
Feature activation-4.91e-02

Pos logit contributions
 inconvenience+0.138
 Sale+0.123
 tanks+0.120
 cinema+0.120
 pizz+0.119
Neg logit contributions
 independent-0.109
 flexible-0.106
The-0.104
 graceful-0.103
 brave-0.102
Tokenoline
Token ID14453
Feature activation-2.60e-02

Pos logit contributions
ossible+0.083
ric+0.075
 harbour+0.074
 cinema+0.068
rollers+0.065
Neg logit contributions
 independent-0.082
 brave-0.081
 strong-0.078
 sure-0.076
 graceful-0.075
Token.
Token ID13
Feature activation-1.11e-02

Pos logit contributions
ossible+0.062
ric+0.061
 harbour+0.057
|+0.056
 Sale+0.056
Neg logit contributions
 brave-0.028
 strong-0.026
 independent-0.025
 sure-0.025
 graceful-0.023
Token One
Token ID1881
Feature activation+1.07e-02

Pos logit contributions
ric+0.021
|+0.021
ossible+0.020
anged+0.020
 cinema+0.019
Neg logit contributions
 sure-0.015
 strong-0.014
 independent-0.014
 brave-0.014
 graceful-0.013
Token day
Token ID1110
Feature activation-8.17e-03

Pos logit contributions
 brave+0.019
 independent+0.019
 sure+0.019
 strong+0.019
 clever+0.018
Neg logit contributions
ossible-0.017
ric-0.016
 harbour-0.015
 cinema-0.014
 exhibit-0.014
Token stretches
Token ID23687
Feature activation+1.49e-02

Pos logit contributions
 sure+0.020
 independent+0.019
 redd+0.019
 brave+0.019
Bee+0.018
Neg logit contributions
ossible-0.025
 cinema-0.024
 exhibit-0.024
ric-0.023
 aquarium-0.022
Token or
Token ID393
Feature activation-2.00e-02

Pos logit contributions
 independent+0.019
 sure+0.019
 brave+0.019
 clever+0.019
 strong+0.018
Neg logit contributions
ossible-0.031
ric-0.028
 harbour-0.027
 exhibit-0.027
 cinema-0.026
Token hop
Token ID1725
Feature activation+2.20e-02

Pos logit contributions
ossible+0.039
ric+0.036
 harbour+0.036
 cinema+0.034
 exhibit+0.034
Neg logit contributions
 sure-0.027
 independent-0.027
 brave-0.026
 strong-0.025
 clever-0.025
Token on
Token ID319
Feature activation+4.59e-03

Pos logit contributions
 independent+0.031
 sure+0.030
 clever+0.030
 brave+0.030
 compassionate+0.029
Neg logit contributions
ossible-0.040
|-0.037
ric-0.037
 harbour-0.036
 exhibit-0.036
Token the
Token ID262
Feature activation-2.17e-02

Pos logit contributions
 sure+0.005
 independent+0.005
 brave+0.005
 courageous+0.005
 intelligent+0.005
Neg logit contributions
ossible-0.008
ric-0.008
 exhibit-0.008
 cinema-0.008
 ago-0.008
Token tr
Token ID491
Feature activation-8.47e-02

Pos logit contributions
 cinema+0.030
 driveway+0.030
 exhibit+0.029
 station+0.028
 pool+0.028
Neg logit contributions
 sure-0.039
 compassionate-0.035
 bee-0.034
 observ-0.034
 independent-0.034
Tokenamp
Token ID696
Feature activation-4.35e-02

Pos logit contributions
ossible+0.133
 harbour+0.123
ric+0.113
 inconvenience+0.113
 Sale+0.112
Neg logit contributions
 independent-0.122
 sure-0.119
 flexible-0.115
 strong-0.115
 brave-0.112
Tokenoline
Token ID14453
Feature activation-2.32e-02

Pos logit contributions
ossible+0.059
ric+0.052
 harbour+0.051
 cinema+0.047
 exhibit+0.047
Neg logit contributions
 brave-0.088
 independent-0.088
 sure-0.087
 strong-0.086
 clever-0.084
Token.
Token ID13
Feature activation+6.96e-03

Pos logit contributions
ossible+0.043
ric+0.040
 harbour+0.040
 exhibit+0.039
 cinema+0.038
Neg logit contributions
 independent-0.034
 brave-0.034
 sure-0.034
 strong-0.033
 clever-0.033
Token Everyone
Token ID11075
Feature activation-1.25e-02

Pos logit contributions
 brave+0.011
 independent+0.010
 sure+0.010
 strong+0.010
 clever+0.010
Neg logit contributions
ossible-0.013
ric-0.012
 harbour-0.012
 cinema-0.011
 exhibit-0.011
Token at
Token ID379
Feature activation-2.62e-02

Pos logit contributions
ossible+0.021
 exhibit+0.018
 harbour+0.018
ric+0.018
|+0.018
Neg logit contributions
 independent-0.020
 strong-0.019
 redd-0.019
 silk-0.019
 sweet-0.018
Token Together
Token ID17083
Feature activation-1.57e-03

Pos logit contributions
ossible+0.000
ric+0.000
 cinema+0.000
 harbour+0.000
|+0.000
Neg logit contributions
 sure-0.000
 brave-0.000
 independent-0.000
 strong-0.000
 clever-0.000
Token they
Token ID484
Feature activation-2.26e-02

Pos logit contributions
ossible+0.003
ric+0.003
 harbour+0.003
 cinema+0.003
|+0.003
Neg logit contributions
 sure-0.002
 brave-0.002
 independent-0.002
 strong-0.002
 clever-0.002
Token jumped
Token ID11687
Feature activation-5.02e-04

Pos logit contributions
ossible+0.039
 harbour+0.034
 cinema+0.033
ric+0.033
 exhibit+0.032
Neg logit contributions
 independent-0.037
 graceful-0.035
 brave-0.035
 clever-0.035
 strong-0.035
Token on
Token ID319
Feature activation+8.08e-03

Pos logit contributions
ossible+0.001
 cinema+0.001
 harbour+0.001
ric+0.001
|+0.001
Neg logit contributions
 independent-0.001
 compassionate-0.001
 sure-0.001
 brave-0.001
 clever-0.001
Token the
Token ID262
Feature activation-1.96e-02

Pos logit contributions
 sure+0.010
 independent+0.009
 brave+0.009
 clever+0.009
 compassionate+0.009
Neg logit contributions
ossible-0.017
 harbour-0.016
ric-0.015
 exhibit-0.015
 cinema-0.015
Token tr
Token ID491
Feature activation-8.43e-02

Pos logit contributions
ossible+0.028
 harbour+0.027
ric+0.026
 cinema+0.026
|+0.024
Neg logit contributions
 sure-0.037
 independent-0.034
 brave-0.034
 clever-0.033
 courageous-0.033
Tokenamp
Token ID696
Feature activation-3.43e-02

Pos logit contributions
 inconvenience+0.129
ossible+0.127
 Sale+0.123
 harbour+0.121
 station+0.113
Neg logit contributions
 independent-0.109
 sure-0.108
 flexible-0.106
in-0.103
 mosquito-0.102
Tokenoline
Token ID14453
Feature activation-2.19e-02

Pos logit contributions
ossible+0.045
ric+0.040
 harbour+0.038
 cinema+0.036
 exhibit+0.036
Neg logit contributions
 brave-0.071
 independent-0.071
 sure-0.071
 strong-0.070
 clever-0.068
Token and
Token ID290
Feature activation+6.40e-03

Pos logit contributions
ossible+0.046
ric+0.042
 harbour+0.042
 exhibit+0.041
 cinema+0.041
Neg logit contributions
 independent-0.027
 brave-0.027
 sure-0.027
 strong-0.026
 clever-0.026
Token ran
Token ID4966
Feature activation-6.47e-03

Pos logit contributions
 independent+0.010
 intelligent+0.010
 clever+0.010
 compassionate+0.010
 brave+0.010
Neg logit contributions
ossible-0.010
 harbour-0.009
|-0.009
ric-0.009
 cinema-0.009
Token races
Token ID9558
Feature activation-7.91e-03

Pos logit contributions
ossible+0.013
 harbour+0.012
ric+0.012
 exhibit+0.011
|+0.011
Neg logit contributions
 brave-0.009
 independent-0.009
 sure-0.008
 clever-0.008
 strong-0.008
Token,
Token ID11
Feature activation+2.06e-02

Pos logit contributions
 independent+0.043
 brave+0.041
 sure+0.041
 clever+0.041
 graceful+0.041
Neg logit contributions
ossible-0.050
ric-0.043
 harbour-0.043
 exhibit-0.041
 cinema-0.041
Token kissing
Token ID25847
Feature activation-1.66e-02

Pos logit contributions
 independent+0.031
 sure+0.031
 clever+0.030
 graceful+0.030
 brave+0.030
Neg logit contributions
ossible-0.035
 harbour-0.032
ric-0.032
 cinema-0.030
 dealership-0.029
Token them
Token ID606
Feature activation+1.31e-02

Pos logit contributions
ossible+0.033
ric+0.029
 cinema+0.027
|+0.027
 harbour+0.026
Neg logit contributions
 independent-0.024
 sure-0.023
 courageous-0.023
 clever-0.022
 brave-0.022
Token on
Token ID319
Feature activation+9.60e-03

Pos logit contributions
 independent+0.019
 sure+0.019
 clever+0.019
 brave+0.019
 graceful+0.018
Neg logit contributions
ossible-0.024
ric-0.021
 harbour-0.021
|-0.020
 cinema-0.020
Token their
Token ID511
Feature activation-2.50e-02

Pos logit contributions
 sure+0.012
 brave+0.011
 independent+0.011
 clever+0.011
 strong+0.011
Neg logit contributions
ossible-0.021
ric-0.019
 harbour-0.019
 cinema-0.018
 exhibit-0.018
Token fore
Token ID1674
Feature activation-8.35e-02

Pos logit contributions
ossible+0.036
ric+0.032
 harbour+0.031
 exhibit+0.029
 cinema+0.029
Neg logit contributions
 brave-0.049
 sure-0.048
 independent-0.048
 strong-0.047
 clever-0.047
Tokenheads
Token ID16600
Feature activation-2.53e-02

Pos logit contributions
ossible+0.120
ric+0.109
 harbour+0.107
 cinema+0.101
 exhibit+0.101
Neg logit contributions
 brave-0.163
 sure-0.162
 independent-0.162
 strong-0.159
 clever-0.155
Token.
Token ID13
Feature activation+2.98e-03

Pos logit contributions
ossible+0.054
ric+0.050
 harbour+0.050
 cinema+0.048
 exhibit+0.048
Neg logit contributions
 brave-0.032
 sure-0.032
 independent-0.031
 strong-0.030
 clever-0.029
Token
Token ID198
Feature activation-1.31e-03

Pos logit contributions
 brave+0.005
 sure+0.005
 independent+0.005
 strong+0.005
 clever+0.004
Neg logit contributions
ossible-0.005
ric-0.005
 harbour-0.005
 exhibit-0.005
 cinema-0.005
Token
Token ID198
Feature activation+4.35e-03

Pos logit contributions
ossible+0.002
ric+0.002
 harbour+0.002
 exhibit+0.002
 cinema+0.002
Neg logit contributions
 brave-0.002
 sure-0.002
 independent-0.002
 strong-0.002
 graceful-0.002
TokenShe
Token ID3347
Feature activation-2.13e-03

Pos logit contributions
 brave+0.005
 independent+0.004
 sure+0.004
 strong+0.004
 gentle+0.004
Neg logit contributions
ric-0.010
ossible-0.010
 exhibit-0.009
 harbour-0.009
|-0.009
Token the
Token ID262
Feature activation+3.12e-03

Pos logit contributions
 redd+0.035
 agile+0.032
 independent+0.032
 clever+0.031
 intelligent+0.031
Neg logit contributions
ossible-0.039
ric-0.039
 cinema-0.035
 station-0.035
|-0.035
Token pigeon
Token ID46387
Feature activation-1.51e-02

Pos logit contributions
 sure+0.004
 independent+0.004
 strong+0.003
 brave+0.003
 clever+0.003
Neg logit contributions
ossible-0.007
 harbour-0.006
ric-0.006
 cinema-0.006
 exhibit-0.006
Token all
Token ID477
Feature activation-1.70e-02

Pos logit contributions
ossible+0.027
 exhibit+0.025
ric+0.025
 harbour+0.024
|+0.023
Neg logit contributions
 independent-0.022
 clever-0.021
 brave-0.021
 graceful-0.021
 strong-0.021
Token those
Token ID883
Feature activation+2.21e-02

Pos logit contributions
ossible+0.037
ric+0.034
|+0.033
 harbour+0.032
 exhibit+0.032
Neg logit contributions
 brave-0.020
 independent-0.020
 sure-0.020
 clever-0.019
 strong-0.019
Token weeks
Token ID2745
Feature activation-2.09e-02

Pos logit contributions
pick+0.025
Small+0.024
hire+0.024
 redd+0.023
 independent+0.023
Neg logit contributions
 exhibit-0.036
ossible-0.036
ric-0.036
|-0.035
 ago-0.035
Token ago
Token ID2084
Feature activation-8.31e-02

Pos logit contributions
ossible+0.042
ric+0.038
 harbour+0.038
 exhibit+0.037
 cinema+0.036
Neg logit contributions
 independent-0.029
 brave-0.029
 sure-0.028
 strong-0.028
 clever-0.027
Token.
Token ID13
Feature activation+8.62e-04

Pos logit contributions
ossible+0.160
ric+0.147
 harbour+0.144
 exhibit+0.140
 ago+0.137
Neg logit contributions
 independent-0.119
 brave-0.118
 sure-0.117
 strong-0.115
 clever-0.114
Token<|endoftext|>
Token ID50256
Feature activation-3.76e-02

Pos logit contributions
 independent+0.002
 brave+0.002
 strong+0.002
 clever+0.001
 sure+0.001
Neg logit contributions
ossible-0.001
ric-0.001
 harbour-0.001
|-0.001
 exhibit-0.001
TokenMike
Token ID16073
Feature activation-5.60e-03

Pos logit contributions
ossible+0.055
ric+0.050
 harbour+0.048
 cinema+0.046
 exhibit+0.046
Neg logit contributions
 brave-0.071
 sure-0.071
 independent-0.070
 strong-0.069
 clever-0.068
Token and
Token ID290
Feature activation+4.98e-03

Pos logit contributions
ossible+0.009
ric+0.008
 harbour+0.007
 exhibit+0.007
|+0.007
Neg logit contributions
 independent-0.011
 brave-0.010
 sure-0.010
 strong-0.010
 clever-0.010
Token his
Token ID465
Feature activation+1.19e-02

Pos logit contributions
 sure+0.007
 independent+0.007
 brave+0.007
 clever+0.006
 graceful+0.006
Neg logit contributions
ossible-0.010
 harbour-0.009
ric-0.009
|-0.008
 exhibit-0.008
Token the
Token ID262
Feature activation-1.31e-02

Pos logit contributions
ossible+0.030
ric+0.029
 harbour+0.028
 cinema+0.028
 exhibit+0.028
Neg logit contributions
 sure-0.011
 independent-0.011
 brave-0.011
 strong-0.010
 clever-0.010
Token sofa
Token ID34902
Feature activation-1.84e-02

Pos logit contributions
ossible+0.020
ric+0.019
 cinema+0.018
 harbour+0.018
 exhibit+0.018
Neg logit contributions
 sure-0.024
 brave-0.023
 independent-0.023
 clever-0.022
 strong-0.022
Token,
Token ID11
Feature activation-1.84e-03

Pos logit contributions
ossible+0.039
ric+0.037
 harbour+0.036
|+0.035
 cinema+0.034
Neg logit contributions
 brave-0.023
 sure-0.022
 independent-0.022
 strong-0.022
 graceful-0.021
Token on
Token ID319
Feature activation-1.46e-02

Pos logit contributions
ossible+0.004
ric+0.004
 cinema+0.003
 harbour+0.003
 exhibit+0.003
Neg logit contributions
 sure-0.002
 independent-0.002
 clever-0.002
 brave-0.002
 courageous-0.002
Token the
Token ID262
Feature activation-1.22e-02

Pos logit contributions
ossible+0.034
ric+0.033
 cinema+0.032
 harbour+0.032
 exhibit+0.031
Neg logit contributions
 sure-0.015
 independent-0.014
 brave-0.014
 compassionate-0.013
 courageous-0.013
Token tr
Token ID491
Feature activation-8.29e-02

Pos logit contributions
ossible+0.016
 cinema+0.014
ric+0.014
 harbour+0.014
 exhibit+0.013
Neg logit contributions
 sure-0.026
 independent-0.024
 brave-0.024
 clever-0.023
 strong-0.023
Tokenamp
Token ID696
Feature activation-4.37e-02

Pos logit contributions
 inconvenience+0.124
 harbour+0.123
ossible+0.116
 Sale+0.109
 officers+0.109
Neg logit contributions
 sure-0.112
 flexible-0.109
 independent-0.109
The-0.108
 strong-0.105
Tokenoline
Token ID14453
Feature activation-2.66e-02

Pos logit contributions
ossible+0.064
ric+0.058
 harbour+0.057
 cinema+0.051
 exhibit+0.050
Neg logit contributions
 brave-0.082
 independent-0.082
 sure-0.081
 strong-0.080
 graceful-0.078
Token.
Token ID13
Feature activation-3.24e-03

Pos logit contributions
ossible+0.056
ric+0.052
 harbour+0.051
|+0.049
 cinema+0.049
Neg logit contributions
 brave-0.034
 sure-0.033
 independent-0.033
 strong-0.033
 graceful-0.032
Token They
Token ID1119
Feature activation-6.20e-03

Pos logit contributions
ossible+0.006
ric+0.005
 harbour+0.005
 exhibit+0.005
|+0.005
Neg logit contributions
 sure-0.005
 brave-0.005
 independent-0.005
 strong-0.005
 graceful-0.005
Token bounce
Token ID20110
Feature activation+1.28e-02

Pos logit contributions
ossible+0.011
ric+0.010
|+0.009
 harbour+0.009
 driveway+0.008
Neg logit contributions
 independent-0.010
 sure-0.010
 compassionate-0.010
 courageous-0.010
rier-0.009
Token born
Token ID4642
Feature activation+2.15e-02

Pos logit contributions
ossible+0.024
 harbour+0.022
ric+0.022
 ago+0.021
 cinema+0.021
Neg logit contributions
 independent-0.020
 graceful-0.019
 sweet-0.019
 clever-0.019
 strong-0.018
Token only
Token ID691
Feature activation-1.88e-02

Pos logit contributions
 sure+0.033
 independent+0.032
 clever+0.030
 graceful+0.030
Pen+0.029
Neg logit contributions
ossible-0.033
 harbour-0.032
ric-0.032
 exhibit-0.031
 ago-0.031
Token a
Token ID257
Feature activation-1.43e-03

Pos logit contributions
ossible+0.035
ric+0.034
 ago+0.033
 harbour+0.032
 exhibit+0.032
Neg logit contributions
 independent-0.025
 sure-0.025
 brave-0.024
 strong-0.024
 clever-0.024
Token few
Token ID1178
Feature activation+2.43e-02

Pos logit contributions
 harbour+0.003
ossible+0.003
 exhibit+0.003
ric+0.003
 cinema+0.003
Neg logit contributions
 sure-0.002
 independent-0.002
 herself-0.001
 brave-0.001
 intelligent-0.001
Token days
Token ID1528
Feature activation-3.92e-03

Pos logit contributions
 sure+0.036
 independent+0.034
 strong+0.032
 graceful+0.032
 clever+0.032
Neg logit contributions
 ago-0.040
ossible-0.040
 exhibit-0.039
 harbour-0.036
 habitat-0.035
Token ago
Token ID2084
Feature activation-8.16e-02

Pos logit contributions
ossible+0.007
 harbour+0.006
ric+0.006
 ago+0.006
 exhibit+0.006
Neg logit contributions
 independent-0.006
 sure-0.006
 brave-0.006
 clever-0.006
 strong-0.006
Token.
Token ID13
Feature activation-1.76e-02

Pos logit contributions
ossible+0.148
ric+0.137
 harbour+0.133
 exhibit+0.130
 ago+0.127
Neg logit contributions
 independent-0.126
 brave-0.124
 sure-0.124
 strong-0.121
 clever-0.120
Token The
Token ID383
Feature activation+1.58e-05

Pos logit contributions
ossible+0.030
ric+0.028
 harbour+0.027
 cinema+0.026
 exhibit+0.026
Neg logit contributions
 sure-0.029
 brave-0.029
 independent-0.029
 strong-0.028
 clever-0.027
Token deer
Token ID20096
Feature activation-1.38e-02

Pos logit contributions
 sure+0.000
 independent+0.000
 brave+0.000
 strong+0.000
 clever+0.000
Neg logit contributions
ossible-0.000
 harbour-0.000
ric-0.000
 cinema-0.000
 exhibit-0.000
Token felt
Token ID2936
Feature activation+4.68e-02

Pos logit contributions
ossible+0.022
ric+0.021
 exhibit+0.021
 harbour+0.020
|+0.020
Neg logit contributions
 independent-0.023
 sure-0.022
 strong-0.021
 clever-0.021
 graceful-0.021
Token sorry
Token ID7926
Feature activation+8.42e-03

Pos logit contributions
 pony+0.053
 butterfly+0.050
Small+0.050
 bee+0.049
 independent+0.049
Neg logit contributions
ossible-0.087
 harbour-0.084
ric-0.080
|-0.077
 ago-0.076
Token cut
Token ID2005
Feature activation+1.06e-02

Pos logit contributions
 independent+0.001
 courageous+0.001
 brave+0.001
 observ+0.001
 sure+0.001
Neg logit contributions
ossible-0.001
 pizz-0.001
 ago-0.001
 harbour-0.001
ric-0.001
Tokenler
Token ID1754
Feature activation-2.40e-03

Pos logit contributions
 independent+0.018
 courageous+0.017
 brave+0.017
 compassionate+0.017
 fearless+0.017
Neg logit contributions
ossible-0.016
ric-0.015
 pizz-0.015
 exhibit-0.014
 tanks-0.014
Tokeny
Token ID88
Feature activation+5.07e-04

Pos logit contributions
ossible+0.005
ric+0.004
 exhibit+0.004
 harbour+0.004
 cinema+0.004
Neg logit contributions
 independent-0.003
 brave-0.003
 sure-0.003
 clever-0.003
 graceful-0.003
Token and
Token ID290
Feature activation+1.21e-03

Pos logit contributions
 independent+0.001
 brave+0.001
 sure+0.001
 compassionate+0.001
 clever+0.001
Neg logit contributions
ossible-0.001
 exhibit-0.001
ric-0.001
 pizz-0.001
 harbour-0.001
Token different
Token ID1180
Feature activation-2.64e-02

Pos logit contributions
 independent+0.002
 brave+0.002
 sure+0.002
 graceful+0.002
 clever+0.002
Neg logit contributions
ossible-0.002
 harbour-0.002
ric-0.002
 ago-0.002
 cinema-0.002
Token cond
Token ID1779
Feature activation-8.13e-02

Pos logit contributions
ossible+0.044
ric+0.042
 exhibit+0.041
 harbour+0.039
 ago+0.038
Neg logit contributions
 brave-0.042
 independent-0.042
 sure-0.041
 strong-0.040
 clever-0.040
Tokeniments
Token ID6800
Feature activation+1.24e-02

Pos logit contributions
ossible+0.131
 harbour+0.122
ric+0.118
 cinema+0.114
rollers+0.106
Neg logit contributions
 sure-0.136
 independent-0.130
 brave-0.127
 clever-0.121
 strong-0.121
Token.
Token ID13
Feature activation-1.22e-02

Pos logit contributions
 independent+0.020
 brave+0.019
 graceful+0.019
 sure+0.019
 compassionate+0.019
Neg logit contributions
ossible-0.021
ric-0.019
 ago-0.019
 exhibit-0.019
 harbour-0.018
Token [space]
Token ID220
Feature activation-1.68e-02

Pos logit contributions
ossible+0.021
ric+0.020
 cinema+0.019
 exhibit+0.019
|+0.018
Neg logit contributions
 sure-0.020
 independent-0.020
 strong-0.019
 brave-0.019
 graceful-0.018
Token [newline]
Token ID198
Feature activation-1.49e-02

Pos logit contributions
ossible+0.030
ric+0.027
 harbour+0.026
 exhibit+0.025
 cinema+0.025
Neg logit contributions
 brave-0.027
 independent-0.027
 sure-0.027
 strong-0.026
 clever-0.026
Token [newline]
Token ID198
Feature activation-9.18e-03

Pos logit contributions
ossible+0.027
ric+0.025
 harbour+0.024
 cinema+0.024
 exhibit+0.024
Neg logit contributions
 brave-0.023
 sure-0.023
 independent-0.023
 strong-0.022
 clever-0.022
Token decided
Token ID3066
Feature activation+1.84e-04

Pos logit contributions
ossible+0.025
ric+0.023
 harbour+0.022
 exhibit+0.021
 cinema+0.021
Neg logit contributions
 independent-0.024
 brave-0.024
 sure-0.024
 strong-0.023
 clever-0.023
Token to
Token ID284
Feature activation-1.61e-02

Pos logit contributions
 independent+0.000
 brave+0.000
 sure+0.000
 strong+0.000
 clever+0.000
Neg logit contributions
ossible-0.000
 harbour-0.000
 exhibit-0.000
ric-0.000
 cinema-0.000
Token jump
Token ID4391
Feature activation+5.67e-03

Pos logit contributions
ossible+0.029
ric+0.026
 harbour+0.026
 exhibit+0.025
 cinema+0.025
Neg logit contributions
 brave-0.025
 sure-0.025
 independent-0.025
 strong-0.024
 clever-0.024
Token on
Token ID319
Feature activation+1.45e-03

Pos logit contributions
 independent+0.007
 sure+0.007
 compassionate+0.007
 brave+0.007
 clever+0.007
Neg logit contributions
ossible-0.012
|-0.011
ric-0.010
 exhibit-0.010
 cinema-0.010
Token the
Token ID262
Feature activation-1.11e-02

Pos logit contributions
 independent+0.002
 sure+0.002
 brave+0.001
 compassionate+0.001
 courageous+0.001
Neg logit contributions
ossible-0.003
 exhibit-0.003
ric-0.003
 cinema-0.003
 harbour-0.003
Token tr
Token ID491
Feature activation-8.11e-02

Pos logit contributions
ossible+0.013
 cinema+0.012
 exhibit+0.011
ric+0.010
 harbour+0.010
Neg logit contributions
 sure-0.024
 independent-0.024
 compassionate-0.023
 brave-0.023
 courageous-0.023
Tokenamp
Token ID696
Feature activation-4.35e-02

Pos logit contributions
 inconvenience+0.117
 harbour+0.114
ossible+0.112
ric+0.104
 officers+0.103
Neg logit contributions
 sure-0.110
 flexible-0.108
The-0.107
 independent-0.107
 graceful-0.104
Tokenoline
Token ID14453
Feature activation-2.44e-02

Pos logit contributions
ossible+0.069
ric+0.063
 harbour+0.062
 cinema+0.055
rollers+0.054
Neg logit contributions
 independent-0.078
 brave-0.077
 strong-0.075
 sure-0.075
 graceful-0.072
Token,
Token ID11
Feature activation-3.15e-03

Pos logit contributions
ossible+0.051
ric+0.047
 harbour+0.046
 exhibit+0.045
 cinema+0.045
Neg logit contributions
 brave-0.031
 independent-0.031
 sure-0.031
 strong-0.030
 clever-0.029
Token but
Token ID475
Feature activation-1.58e-02

Pos logit contributions
ossible+0.006
ric+0.006
 cinema+0.006
 exhibit+0.005
 harbour+0.005
Neg logit contributions
 independent-0.004
 clever-0.004
 brave-0.004
 compassionate-0.004
 sure-0.004
Token Lily
Token ID20037
Feature activation+7.80e-03

Pos logit contributions
ossible+0.033
ric+0.031
 harbour+0.030
 cinema+0.029
 exhibit+0.029
Neg logit contributions
 brave-0.020
 independent-0.020
 sure-0.019
 strong-0.019
 clever-0.018
Token she
Token ID673
Feature activation+1.05e-02

Pos logit contributions
 independent+0.003
 intelligent+0.003
 sure+0.003
 clever+0.002
 redd+0.002
Neg logit contributions
ossible-0.002
 harbour-0.002
 exhibit-0.002
anged-0.002
 station-0.002
Token got
Token ID1392
Feature activation+5.90e-03

Pos logit contributions
 independent+0.018
 sure+0.017
 graceful+0.016
 brave+0.016
 compassionate+0.016
Neg logit contributions
ossible-0.017
 harbour-0.015
ric-0.015
 exhibit-0.014
 cinema-0.014
Token back
Token ID736
Feature activation-6.04e-04

Pos logit contributions
 independent+0.008
isy+0.008
 clever+0.007
 brave+0.007
 graceful+0.007
Neg logit contributions
ossible-0.011
 harbour-0.010
ric-0.010
ained-0.009
 cinema-0.009
Token on
Token ID319
Feature activation-1.05e-03

Pos logit contributions
ossible+0.001
ric+0.001
 harbour+0.001
 exhibit+0.001
 cinema+0.001
Neg logit contributions
 brave-0.001
 sure-0.001
 independent-0.001
 strong-0.001
 clever-0.001
Token her
Token ID607
Feature activation-8.41e-03

Pos logit contributions
ossible+0.002
 harbour+0.002
 cinema+0.002
ric+0.002
 exhibit+0.002
Neg logit contributions
 sure-0.001
 brave-0.001
 independent-0.001
 clever-0.001
 courageous-0.001
Token tr
Token ID491
Feature activation-8.11e-02

Pos logit contributions
ossible+0.014
ric+0.012
 harbour+0.012
 exhibit+0.012
 cinema+0.012
Neg logit contributions
 sure-0.014
 independent-0.014
 brave-0.014
 strong-0.013
 clever-0.013
Tokenicycle
Token ID35298
Feature activation-3.50e-02

Pos logit contributions
 inconvenience+0.133
 harbour+0.123
 station+0.122
 cinema+0.121
 Sale+0.121
Neg logit contributions
 independent-0.104
 sure-0.100
 flexible-0.100
 strong-0.095
 brave-0.095
Token and
Token ID290
Feature activation-4.09e-03

Pos logit contributions
ossible+0.083
ric+0.078
 harbour+0.076
 exhibit+0.074
 cinema+0.074
Neg logit contributions
 brave-0.035
 independent-0.035
 sure-0.035
 strong-0.034
 clever-0.032
Token rode
Token ID22075
Feature activation-3.63e-03

Pos logit contributions
ossible+0.006
 harbour+0.005
anged+0.005
 cinema+0.005
 exhibit+0.005
Neg logit contributions
 independent-0.007
 intelligent-0.007
 clever-0.007
 brave-0.007
 courageous-0.007
Token around
Token ID1088
Feature activation-7.93e-03

Pos logit contributions
ossible+0.007
|+0.005
ulk+0.005
 harbour+0.005
 recess+0.005
Neg logit contributions
 bee-0.005
 clever-0.005
 compassionate-0.005
 intelligent-0.005
 sure-0.005
Token the
Token ID262
Feature activation-4.91e-03

Pos logit contributions
ossible+0.014
 exhibit+0.014
 cinema+0.013
|+0.013
 ago+0.013
Neg logit contributions
 redd-0.010
hire-0.010
 independent-0.010
 sure-0.010
 compassionate-0.010
Token story
Token ID1621
Feature activation-1.95e-02

Pos logit contributions
ossible+0.019
ric+0.017
 exhibit+0.017
 harbour+0.016
 cinema+0.016
Neg logit contributions
 independent-0.020
 sure-0.019
 brave-0.019
 strong-0.019
 graceful-0.018
Token.
Token ID13
Feature activation+3.75e-04

Pos logit contributions
ossible+0.040
ric+0.036
 harbour+0.035
 cinema+0.035
 exhibit+0.035
Neg logit contributions
 independent-0.026
 brave-0.026
 sure-0.026
 strong-0.025
 clever-0.025
Token A
Token ID317
Feature activation-5.11e-03

Pos logit contributions
 brave+0.001
 independent+0.001
 sure+0.001
 strong+0.001
 clever+0.001
Neg logit contributions
ossible-0.001
ric-0.001
 harbour-0.001
 exhibit-0.001
 cinema-0.001
Token long
Token ID890
Feature activation+1.70e-03

Pos logit contributions
ossible+0.010
 exhibit+0.009
 cinema+0.009
 harbour+0.009
ric+0.009
Neg logit contributions
 sure-0.007
 independent-0.007
 brave-0.006
 strong-0.006
 careful-0.006
Token time
Token ID640
Feature activation-2.58e-02

Pos logit contributions
 sure+0.003
 independent+0.003
 brave+0.003
 clever+0.002
 strong+0.002
Neg logit contributions
ossible-0.003
ric-0.003
 harbour-0.003
 exhibit-0.003
 cinema-0.003
Token ago
Token ID2084
Feature activation-8.10e-02

Pos logit contributions
ossible+0.047
ric+0.043
 harbour+0.042
 exhibit+0.041
 cinema+0.041
Neg logit contributions
 independent-0.039
 brave-0.038
 sure-0.038
 strong-0.038
 clever-0.037
Token,
Token ID11
Feature activation+5.93e-03

Pos logit contributions
ossible+0.156
ric+0.144
 harbour+0.141
 cinema+0.136
 exhibit+0.136
Neg logit contributions
 brave-0.117
 sure-0.116
 independent-0.115
 strong-0.112
 clever-0.110
Token there
Token ID612
Feature activation-1.52e-02

Pos logit contributions
 independent+0.008
 brave+0.008
 compassionate+0.008
 strong+0.008
 clever+0.008
Neg logit contributions
ossible-0.011
 harbour-0.010
ric-0.010
 ago-0.010
 exhibit-0.010
Token was
Token ID373
Feature activation-6.21e-03

Pos logit contributions
ossible+0.022
 exhibit+0.019
ric+0.018
 harbour+0.017
 cinema+0.017
Neg logit contributions
 independent-0.030
 brave-0.029
 strong-0.029
 sure-0.029
 graceful-0.028
Token an
Token ID281
Feature activation+4.52e-03

Pos logit contributions
ossible+0.014
ric+0.013
 cinema+0.013
 harbour+0.013
 exhibit+0.012
Neg logit contributions
 independent-0.007
 sure-0.007
 brave-0.007
 strong-0.006
 graceful-0.006
Token adventurous
Token ID39770
Feature activation+6.79e-03

Pos logit contributions
 sure+0.005
 brave+0.005
 strong+0.005
 clever+0.005
 independent+0.005
Neg logit contributions
ossible-0.009
 harbour-0.009
 exhibit-0.009
ric-0.009
 aquarium-0.009
Token They
Token ID1119
Feature activation-1.97e-04

Pos logit contributions
ossible+0.009
ric+0.008
 harbour+0.008
 exhibit+0.008
 cinema+0.008
Neg logit contributions
 sure-0.008
 brave-0.008
 independent-0.008
 strong-0.008
 clever-0.008
Token had
Token ID550
Feature activation-7.26e-03

Pos logit contributions
ossible+0.000
ric+0.000
 harbour+0.000
|+0.000
 cinema+0.000
Neg logit contributions
 independent-0.000
 sure-0.000
 brave-0.000
 strong-0.000
 clever-0.000
Token a
Token ID257
Feature activation+9.35e-03

Pos logit contributions
ossible+0.015
ric+0.014
 harbour+0.014
|+0.013
 cinema+0.013
Neg logit contributions
 independent-0.009
 brave-0.009
 sure-0.009
 strong-0.008
 clever-0.008
Token great
Token ID1049
Feature activation+2.68e-03

Pos logit contributions
 independent+0.009
 sure+0.009
 brave+0.009
 strong+0.008
 clever+0.008
Neg logit contributions
ossible-0.023
ric-0.021
 harbour-0.021
 cinema-0.020
 exhibit-0.020
Token time
Token ID640
Feature activation-6.94e-03

Pos logit contributions
 brave+0.004
 independent+0.004
 sure+0.004
 strong+0.004
 graceful+0.004
Neg logit contributions
ossible-0.005
ric-0.005
 harbour-0.005
 exhibit-0.004
 cinema-0.004
Token playing
Token ID2712
Feature activation+3.28e-02

Pos logit contributions
ossible+0.014
ric+0.013
 harbour+0.013
|+0.012
 cinema+0.012
Neg logit contributions
 brave-0.010
 sure-0.010
 independent-0.010
 strong-0.009
 graceful-0.009
Token their
Token ID511
Feature activation-5.88e-03

Pos logit contributions
 sure+0.046
 independent+0.045
 clever+0.043
 compassionate+0.043
 courageous+0.043
Neg logit contributions
ossible-0.060
 cinema-0.057
 exhibit-0.055
 harbour-0.054
 pool-0.054
Token new
Token ID649
Feature activation-1.12e-02

Pos logit contributions
ossible+0.011
ric+0.010
 harbour+0.010
 exhibit+0.010
 cinema+0.010
Neg logit contributions
 sure-0.009
 brave-0.008
 independent-0.008
 strong-0.008
 clever-0.008
Token game
Token ID983
Feature activation+4.87e-03

Pos logit contributions
ossible+0.019
ric+0.017
 harbour+0.017
 exhibit+0.016
 cinema+0.016
Neg logit contributions
 sure-0.019
 brave-0.019
 independent-0.019
 strong-0.018
 clever-0.018
Token.
Token ID13
Feature activation-1.24e-03

Pos logit contributions
 independent+0.007
 brave+0.007
 sure+0.007
 strong+0.007
 clever+0.006
Neg logit contributions
ossible-0.010
ric-0.009
 harbour-0.009
 exhibit-0.008
 cinema-0.008
Token After
Token ID2293
Feature activation-1.75e-02

Pos logit contributions
ossible+0.002
 harbour+0.002
ric+0.002
 exhibit+0.002
 cinema+0.002
Neg logit contributions
 sure-0.002
 brave-0.002
 independent-0.002
 strong-0.001
 graceful-0.001
Token have
Token ID423
Feature activation-1.91e-02

Pos logit contributions
 sure+0.012
 independent+0.012
 brave+0.011
 graceful+0.011
 clever+0.011
Neg logit contributions
ossible-0.014
ric-0.013
|-0.012
 harbour-0.012
OM-0.011
Token been
Token ID587
Feature activation-2.68e-02

Pos logit contributions
ossible+0.034
ric+0.031
|+0.030
 exhibit+0.030
 harbour+0.029
Neg logit contributions
 sure-0.028
 independent-0.028
 brave-0.028
 strong-0.027
 clever-0.027
Token a
Token ID257
Feature activation+1.23e-02

Pos logit contributions
ossible+0.048
ric+0.045
|+0.044
ained+0.043
 harbour+0.042
Neg logit contributions
 sure-0.035
 independent-0.034
 brave-0.033
 clever-0.033
 courageous-0.033
Token waste
Token ID7030
Feature activation+1.45e-02

Pos logit contributions
 sure+0.014
 independent+0.013
 brave+0.012
 intelligent+0.012
 strong+0.011
Neg logit contributions
ossible-0.025
 exhibit-0.023
 harbour-0.023
 cinema-0.023
ric-0.022
Token to
Token ID284
Feature activation+1.23e-02

Pos logit contributions
 sure+0.022
 independent+0.021
 brave+0.021
 clever+0.020
 graceful+0.020
Neg logit contributions
ossible-0.026
 exhibit-0.023
 harbour-0.023
 cinema-0.023
 ago-0.023
Token hang
Token ID8181
Feature activation+3.95e-02

Pos logit contributions
 sure+0.017
 independent+0.017
 brave+0.017
 clever+0.016
 graceful+0.016
Neg logit contributions
ossible-0.023
ric-0.021
|-0.020
 exhibit-0.020
 harbour-0.020
Token it
Token ID340
Feature activation+1.84e-02

Pos logit contributions
 clever+0.058
 compassionate+0.056
 sure+0.056
 courageous+0.056
 brave+0.056
Neg logit contributions
ossible-0.063
 ago-0.059
ained-0.059
ric-0.058
|-0.055
Token there
Token ID612
Feature activation-6.43e-03

Pos logit contributions
 independent+0.027
 brave+0.027
 compassionate+0.026
 clever+0.026
 sure+0.026
Neg logit contributions
ossible-0.032
|-0.028
ric-0.028
 exhibit-0.028
ained-0.027
Token.
Token ID13
Feature activation-1.81e-02

Pos logit contributions
ossible+0.012
ric+0.011
 harbour+0.010
 exhibit+0.010
|+0.010
Neg logit contributions
 independent-0.010
 brave-0.009
 sure-0.009
 clever-0.009
 graceful-0.009
Token Let
Token ID3914
Feature activation+3.19e-02

Pos logit contributions
ossible+0.034
ric+0.031
 harbour+0.030
 exhibit+0.029
 cinema+0.029
Neg logit contributions
 brave-0.028
 sure-0.027
 independent-0.027
 strong-0.027
 clever-0.026
Token's
Token ID338
Feature activation-2.95e-02

Pos logit contributions
 independent+0.056
 brave+0.054
 graceful+0.053
 strong+0.053
 clever+0.053
Neg logit contributions
ossible-0.052
 harbour-0.042
ric-0.042
|-0.042
 exhibit-0.041
Token asked
Token ID1965
Feature activation+1.16e-03

Pos logit contributions
 brave+0.023
 independent+0.023
 sure+0.023
 strong+0.022
 clever+0.022
Neg logit contributions
ossible-0.032
ric-0.030
 harbour-0.029
 exhibit-0.028
 cinema-0.028
Token her
Token ID607
Feature activation-4.83e-03

Pos logit contributions
worm+0.002
 agile+0.002
 courageous+0.002
 fearless+0.002
 tang+0.001
Neg logit contributions
ossible-0.002
 Sale-0.002
 ago-0.002
 exhibit-0.002
 cinema-0.002
Token mom
Token ID1995
Feature activation-3.05e-02

Pos logit contributions
ossible+0.008
 exhibit+0.006
 station+0.006
 Sale+0.006
 cinema+0.006
Neg logit contributions
 agile-0.008
 sure-0.008
 courageous-0.008
 fearless-0.008
rier-0.008
Tokenmy
Token ID1820
Feature activation-3.27e-02

Pos logit contributions
ossible+0.059
ric+0.054
 harbour+0.053
 exhibit+0.051
 cinema+0.051
Neg logit contributions
 brave-0.044
 sure-0.044
 independent-0.044
 strong-0.043
 clever-0.042
Token,
Token ID11
Feature activation-2.40e-02

Pos logit contributions
ossible+0.072
ric+0.066
 exhibit+0.064
 harbour+0.063
 cinema+0.062
Neg logit contributions
 independent-0.039
 brave-0.039
 strong-0.037
 graceful-0.037
 clever-0.037
Token "
Token ID366
Feature activation+2.91e-02

Pos logit contributions
ossible+0.041
ric+0.034
 station+0.033
 harbour+0.032
 cinema+0.031
Neg logit contributions
 independent-0.038
 agile-0.038
worm-0.037
 courageous-0.037
 graceful-0.036
TokenWhat
Token ID2061
Feature activation+4.46e-02

Pos logit contributions
 independent+0.038
 brave+0.037
 sure+0.036
 strong+0.036
 clever+0.035
Neg logit contributions
ossible-0.062
 harbour-0.057
ric-0.057
 cinema-0.056
 exhibit-0.054
Token are
Token ID389
Feature activation-2.00e-03

Pos logit contributions
 independent+0.078
 courageous+0.073
 sure+0.073
 strong+0.072
 brave+0.072
Neg logit contributions
ossible-0.070
 exhibit-0.059
 Sale-0.057
ric-0.055
 cinema-0.054
Token planets
Token ID14705
Feature activation+2.42e-02

Pos logit contributions
ossible+0.004
ric+0.003
 harbour+0.003
 exhibit+0.003
 Sale+0.003
Neg logit contributions
 sure-0.003
 brave-0.003
 independent-0.003
 strong-0.003
 clever-0.003
Token?"
Token ID1701
Feature activation-2.43e-03

Pos logit contributions
 independent+0.042
 sure+0.042
 brave+0.042
 clever+0.041
 strong+0.040
Neg logit contributions
ossible-0.038
ric-0.034
 harbour-0.033
 exhibit-0.032
 cinema-0.032
Token Her
Token ID2332
Feature activation+1.92e-02

Pos logit contributions
ric+0.005
ossible+0.005
 harbour+0.005
 cinema+0.005
 exhibit+0.005
Neg logit contributions
 sure-0.003
 brave-0.003
 independent-0.003
 strong-0.003
 graceful-0.003
TokenI
Token ID40
Feature activation+7.40e-03

Pos logit contributions
 sure+0.037
 independent+0.036
 brave+0.036
 strong+0.034
 clever+0.034
Neg logit contributions
ossible-0.082
ric-0.078
|-0.075
 harbour-0.075
 cinema-0.071
Token am
Token ID716
Feature activation+8.41e-03

Pos logit contributions
 brave+0.011
 sure+0.011
 independent+0.011
 strong+0.010
 clever+0.010
Neg logit contributions
ossible-0.014
ric-0.013
 harbour-0.013
 cinema-0.012
 exhibit-0.012
Token lost
Token ID2626
Feature activation+2.23e-02

Pos logit contributions
 independent+0.011
 sure+0.010
 strong+0.010
 brave+0.010
 graceful+0.010
Neg logit contributions
ossible-0.016
ric-0.015
 harbour-0.015
 ago-0.014
 exhibit-0.014
Token and
Token ID290
Feature activation+2.56e-02

Pos logit contributions
 sure+0.035
 clever+0.033
 independent+0.033
 graceful+0.032
Small+0.032
Neg logit contributions
ossible-0.035
 harbour-0.034
 ago-0.034
 exhibit-0.033
ric-0.032
Token I
Token ID314
Feature activation+1.56e-02

Pos logit contributions
 sure+0.031
 independent+0.031
 brave+0.031
 strong+0.030
 clever+0.029
Neg logit contributions
ossible-0.054
ric-0.050
 harbour-0.049
|-0.047
 exhibit-0.047
Token feel
Token ID1254
Feature activation+4.46e-02

Pos logit contributions
 brave+0.022
 independent+0.022
 sure+0.022
 strong+0.021
 clever+0.021
Neg logit contributions
ossible-0.031
ric-0.028
 harbour-0.028
 exhibit-0.027
 cinema-0.027
Token very
Token ID845
Feature activation+2.80e-02

Pos logit contributions
Small+0.054
 independent+0.053
 sure+0.053
 clever+0.051
 pony+0.051
Neg logit contributions
ossible-0.082
ric-0.076
 harbour-0.076
|-0.073
 ago-0.072
Token sad
Token ID6507
Feature activation+7.26e-03

Pos logit contributions
 sure+0.034
 independent+0.033
 brave+0.032
 strong+0.032
 graceful+0.031
Neg logit contributions
ossible-0.057
ric-0.052
 harbour-0.052
|-0.050
 exhibit-0.050
Token".
Token ID1911
Feature activation+9.74e-03

Pos logit contributions
 independent+0.011
 sure+0.011
 clever+0.011
 brave+0.011
 strong+0.011
Neg logit contributions
ossible-0.012
ric-0.011
 ago-0.011
 harbour-0.011
 exhibit-0.010
Token [newline]
Token ID198
Feature activation+2.32e-03

Pos logit contributions
 brave+0.015
 independent+0.015
 sure+0.015
 strong+0.014
 clever+0.014
Neg logit contributions
ossible-0.018
ric-0.017
 harbour-0.016
 exhibit-0.016
 cinema-0.016
Token [newline]
Token ID198
Feature activation+8.32e-03

Pos logit contributions
 independent+0.005
 brave+0.004
 sure+0.004
 strong+0.004
 clever+0.004
Neg logit contributions
ossible-0.004
ric-0.003
|-0.003
 harbour-0.003
 exhibit-0.003
Token.
Token ID13
Feature activation+8.98e-03

Pos logit contributions
ossible+0.002
ric+0.002
 harbour+0.002
 exhibit+0.002
 ago+0.002
Neg logit contributions
 independent-0.002
 sure-0.002
 brave-0.002
 clever-0.001
 graceful-0.001
Token It
Token ID632
Feature activation-8.59e-03

Pos logit contributions
 sure+0.014
 brave+0.014
 independent+0.013
 graceful+0.013
 clever+0.013
Neg logit contributions
ossible-0.016
ric-0.014
 ago-0.014
 harbour-0.014
|-0.013
Token hurts
Token ID20406
Feature activation+3.31e-03

Pos logit contributions
ossible+0.015
ric+0.014
 harbour+0.014
 exhibit+0.013
 cinema+0.013
Neg logit contributions
 independent-0.014
 brave-0.013
 sure-0.013
 strong-0.013
 graceful-0.013
Token a
Token ID257
Feature activation+1.33e-02

Pos logit contributions
 sure+0.005
 independent+0.005
 clever+0.005
 graceful+0.004
 brave+0.004
Neg logit contributions
ossible-0.006
ric-0.005
|-0.005
 harbour-0.005
 Sale-0.005
Token lot
Token ID1256
Feature activation-6.74e-03

Pos logit contributions
 independent+0.019
 sure+0.018
 brave+0.018
 strong+0.017
 clever+0.017
Neg logit contributions
ossible-0.026
ric-0.024
 harbour-0.023
 cinema-0.022
 exhibit-0.022
Token,"
Token ID553
Feature activation+3.41e-02

Pos logit contributions
ossible+0.014
ric+0.013
 harbour+0.013
 exhibit+0.012
 cinema+0.012
Neg logit contributions
 brave-0.009
 independent-0.009
 sure-0.009
 strong-0.008
 clever-0.008
Token Tom
Token ID4186
Feature activation+4.53e-03

Pos logit contributions
 brave+0.054
 independent+0.054
 sure+0.054
 strong+0.052
 clever+0.051
Neg logit contributions
ossible-0.062
ric-0.056
 harbour-0.055
 exhibit-0.052
 cinema-0.052
Token says
Token ID1139
Feature activation+2.89e-02

Pos logit contributions
 brave+0.009
 sure+0.009
 independent+0.008
 strong+0.008
 clever+0.008
Neg logit contributions
ossible-0.007
ric-0.007
 harbour-0.006
 cinema-0.006
 exhibit-0.006
Token.
Token ID13
Feature activation+1.45e-02

Pos logit contributions
 independent+0.043
 brave+0.042
 sure+0.042
 strong+0.041
 graceful+0.041
Neg logit contributions
ossible-0.054
ric-0.049
 harbour-0.047
 exhibit-0.046
 cinema-0.045
Token [newline]
Token ID198
Feature activation+7.69e-03

Pos logit contributions
 sure+0.026
 independent+0.026
 brave+0.026
 graceful+0.025
 clever+0.025
Neg logit contributions
ossible-0.023
ric-0.019
 harbour-0.019
 cinema-0.018
 exhibit-0.018
Token [newline]
Token ID198
Feature activation+1.34e-02

Pos logit contributions
 brave+0.012
 independent+0.012
 sure+0.012
 strong+0.012
 clever+0.012
Neg logit contributions
ossible-0.014
ric-0.012
 harbour-0.012
 exhibit-0.011
 cinema-0.011
Token if
Token ID611
Feature activation+2.29e-02

Pos logit contributions
 courageous+0.004
 independent+0.004
 fearless+0.004
 agile+0.004
 graceful+0.004
Neg logit contributions
ossible-0.004
ric-0.004
 exhibit-0.004
 cinema-0.004
 Sale-0.004
Token you
Token ID345
Feature activation-3.09e-02

Pos logit contributions
 independent+0.033
 sure+0.033
 courageous+0.032
 graceful+0.032
 brave+0.032
Neg logit contributions
ossible-0.040
 Sale-0.036
 ago-0.033
 exhibit-0.033
 harbour-0.033
Token like
Token ID588
Feature activation-2.34e-02

Pos logit contributions
ossible+0.059
ric+0.054
 exhibit+0.051
 harbour+0.050
|+0.050
Neg logit contributions
 independent-0.045
 graceful-0.044
 brave-0.043
 sure-0.043
 clever-0.042
Token it
Token ID340
Feature activation+8.68e-03

Pos logit contributions
ossible+0.048
ric+0.045
 exhibit+0.043
 Sale+0.043
 ago+0.042
Neg logit contributions
 courageous-0.028
 brave-0.028
 independent-0.027
 fearless-0.026
 sure-0.026
Token."
Token ID526
Feature activation-1.01e-02

Pos logit contributions
 independent+0.014
 courageous+0.013
 graceful+0.013
 brave+0.013
 compassionate+0.013
Neg logit contributions
ossible-0.014
 exhibit-0.012
ained-0.012
ric-0.012
 cinema-0.011
Token [space]
Token ID220
Feature activation-1.39e-02

Pos logit contributions
ossible+0.018
ric+0.018
 harbour+0.017
 cinema+0.017
 exhibit+0.016
Neg logit contributions
 sure-0.016
 brave-0.016
 independent-0.016
 strong-0.015
 graceful-0.015
Token [newline]
Token ID198
Feature activation-1.22e-02

Pos logit contributions
ossible+0.025
ric+0.023
 harbour+0.022
 exhibit+0.021
 cinema+0.021
Neg logit contributions
 brave-0.022
 independent-0.022
 sure-0.022
 strong-0.022
 clever-0.021
Token [newline]
Token ID198
Feature activation-5.61e-03

Pos logit contributions
ossible+0.020
ric+0.017
|+0.017
 harbour+0.016
 exhibit+0.016
Neg logit contributions
 independent-0.022
 brave-0.022
 sure-0.021
 strong-0.021
 clever-0.021
TokenL
Token ID43
Feature activation-1.51e-03

Pos logit contributions
ossible+0.011
|+0.009
ric+0.009
 harbour+0.008
rollers+0.008
Neg logit contributions
 independent-0.009
 brave-0.009
 sure-0.008
 strong-0.008
 clever-0.008
Tokenily
Token ID813
Feature activation+1.83e-02

Pos logit contributions
ossible+0.004
ric+0.004
|+0.003
 harbour+0.003
 exhibit+0.003
Neg logit contributions
 brave-0.001
 independent-0.001
 strong-0.001
 sure-0.001
 clever-0.001
Token took
Token ID1718
Feature activation+1.60e-02

Pos logit contributions
 independent+0.029
 graceful+0.027
 strong+0.027
 brave+0.027
 sure+0.027
Neg logit contributions
ossible-0.034
ric-0.028
 harbour-0.028
 exhibit-0.028
|-0.027
Token fruit
Token ID8234
Feature activation-2.10e-02

Pos logit contributions
 sure+0.006
 brave+0.006
 independent+0.006
 strong+0.005
 courageous+0.005
Neg logit contributions
ossible-0.009
ric-0.008
 exhibit-0.008
 harbour-0.008
 Sale-0.007
Token was
Token ID373
Feature activation-1.59e-02

Pos logit contributions
ossible+0.037
ric+0.033
 exhibit+0.033
 harbour+0.032
 cinema+0.031
Neg logit contributions
 sure-0.033
 independent-0.033
 brave-0.032
 strong-0.031
 graceful-0.030
Token red
Token ID2266
Feature activation-7.76e-03

Pos logit contributions
ossible+0.033
ric+0.030
 ago+0.029
ained+0.028
 harbour+0.028
Neg logit contributions
 sure-0.018
 brave-0.017
 courageous-0.017
 independent-0.016
Small-0.016
Token and
Token ID290
Feature activation+1.75e-02

Pos logit contributions
ossible+0.022
ric+0.021
 harbour+0.020
|+0.020
 cinema+0.020
Neg logit contributions
 brave-0.005
 strong-0.005
 independent-0.005
 sweet-0.005
 sure-0.005
Token round
Token ID2835
Feature activation-1.34e-02

Pos logit contributions
 sure+0.023
 independent+0.021
 courageous+0.021
 brave+0.020
 fearless+0.019
Neg logit contributions
ossible-0.033
ric-0.030
 Sale-0.029
melon-0.029
 ago-0.028
Token.
Token ID13
Feature activation-1.75e-02

Pos logit contributions
ossible+0.022
ric+0.020
 exhibit+0.020
 ago+0.019
 harbour+0.019
Neg logit contributions
 sure-0.022
 independent-0.022
 brave-0.021
 clever-0.021
 strong-0.021
Token Lily
Token ID20037
Feature activation+2.20e-02

Pos logit contributions
ossible+0.032
ric+0.028
|+0.028
 harbour+0.027
 exhibit+0.027
Neg logit contributions
 brave-0.027
 sure-0.027
 independent-0.027
 strong-0.026
 clever-0.026
Token wanted
Token ID2227
Feature activation-8.21e-03

Pos logit contributions
 independent+0.031
 sure+0.031
 courageous+0.030
 brave+0.030
rier+0.030
Neg logit contributions
ossible-0.042
 exhibit-0.035
ric-0.034
 harbour-0.034
|-0.033
Token to
Token ID284
Feature activation-4.62e-03

Pos logit contributions
ossible+0.014
 exhibit+0.012
ric+0.012
 Sale+0.011
 harbour+0.011
Neg logit contributions
 brave-0.013
 sure-0.013
 independent-0.013
 courageous-0.013
 graceful-0.012
Token eat
Token ID4483
Feature activation+6.65e-03

Pos logit contributions
ossible+0.009
ric+0.008
 exhibit+0.008
 harbour+0.008
|+0.008
Neg logit contributions
 sure-0.007
 brave-0.006
 independent-0.006
 graceful-0.006
 clever-0.006
Token the
Token ID262
Feature activation+3.89e-03

Pos logit contributions
 courageous+0.008
 independent+0.008
 sure+0.008
 brave+0.008
 fearless+0.008
Neg logit contributions
ossible-0.013
ric-0.012
 exhibit-0.012
 Sale-0.011
 pizz-0.011
Token stared
Token ID18484
Feature activation-1.65e-03

Pos logit contributions
 independent+0.003
 sure+0.003
 brave+0.003
 strong+0.003
 clever+0.003
Neg logit contributions
ossible-0.005
 harbour-0.004
ric-0.004
 exhibit-0.004
 cinema-0.004
Token at
Token ID379
Feature activation+2.00e-02

Pos logit contributions
ossible+0.003
 harbour+0.003
ric+0.003
 cinema+0.003
 exhibit+0.003
Neg logit contributions
 sure-0.002
 brave-0.002
 independent-0.002
 clever-0.002
 strong-0.002
Token the
Token ID262
Feature activation+1.90e-02

Pos logit contributions
 brave+0.020
 sure+0.019
 independent+0.019
 strong+0.018
 clever+0.018
Neg logit contributions
ossible-0.048
ric-0.046
 harbour-0.044
 exhibit-0.043
 cinema-0.043
Token big
Token ID1263
Feature activation-1.13e-02

Pos logit contributions
 sure+0.028
 independent+0.026
 brave+0.026
 strong+0.025
 clever+0.024
Neg logit contributions
ossible-0.037
 harbour-0.034
ric-0.034
 cinema-0.033
 exhibit-0.033
Token,
Token ID11
Feature activation+1.98e-02

Pos logit contributions
ossible+0.022
ric+0.020
 harbour+0.020
 exhibit+0.020
 cinema+0.020
Neg logit contributions
 sure-0.017
 brave-0.016
 independent-0.016
 clever-0.015
 strong-0.015
Token tall
Token ID7331
Feature activation+4.87e-03

Pos logit contributions
 sure+0.026
isy+0.025
pick+0.023
Small+0.022
 observ+0.022
Neg logit contributions
 harbour-0.035
 pool-0.033
|-0.032
 cinema-0.032
ossible-0.031
Token freezer
Token ID30967
Feature activation-3.44e-03

Pos logit contributions
 sure+0.008
 independent+0.007
 brave+0.007
 clever+0.007
 compassionate+0.007
Neg logit contributions
ossible-0.008
 exhibit-0.008
 harbour-0.008
ric-0.008
 cinema-0.008
Token.
Token ID13
Feature activation+1.35e-03

Pos logit contributions
ossible+0.008
ric+0.007
 harbour+0.007
|+0.007
 cinema+0.007
Neg logit contributions
 brave-0.004
 strong-0.004
 independent-0.004
 sure-0.004
 clever-0.004
Token It
Token ID632
Feature activation+3.39e-03

Pos logit contributions
 brave+0.002
 independent+0.002
 sure+0.002
 strong+0.002
 clever+0.002
Neg logit contributions
ossible-0.002
ric-0.002
 harbour-0.002
 cinema-0.002
 exhibit-0.002
Token was
Token ID373
Feature activation-8.21e-03

Pos logit contributions
 sure+0.005
 independent+0.005
 brave+0.005
 graceful+0.005
 strong+0.005
Neg logit contributions
ossible-0.006
ric-0.005
 harbour-0.005
 exhibit-0.005
|-0.005
Token very
Token ID845
Feature activation+1.47e-02

Pos logit contributions
ossible+0.017
ric+0.016
 harbour+0.016
 cinema+0.015
 ago+0.015
Neg logit contributions
 sure-0.010
 brave-0.009
 independent-0.009
 clever-0.009
 graceful-0.009
Token are
Token ID389
Feature activation-2.57e-02

Pos logit contributions
ossible+0.006
ric+0.006
 harbour+0.005
 exhibit+0.005
 cinema+0.005
Neg logit contributions
 independent-0.005
 brave-0.005
 sure-0.005
 strong-0.005
 graceful-0.005
Token very
Token ID845
Feature activation+1.00e-02

Pos logit contributions
ossible+0.051
ric+0.050
 harbour+0.049
 cinema+0.048
 exhibit+0.047
Neg logit contributions
 independent-0.028
 sure-0.028
 bee-0.027
 graceful-0.026
 brave-0.026
Token creative
Token ID7325
Feature activation+1.54e-02

Pos logit contributions
 pony+0.004
 butterfly+0.004
 butterflies+0.004
 bunny+0.004
 bee+0.004
Neg logit contributions
ossible-0.026
 exhibit-0.025
 aquarium-0.025
 harbour-0.024
ric-0.024
Token and
Token ID290
Feature activation+2.08e-02

Pos logit contributions
 independent+0.024
 graceful+0.024
 sure+0.024
 clever+0.023
 compassionate+0.023
Neg logit contributions
ossible-0.024
 ago-0.023
 exhibit-0.022
 harbour-0.022
 cinema-0.021
Token brave
Token ID14802
Feature activation+1.36e-02

Pos logit contributions
 bee+0.011
isy+0.010
 pony+0.010
 mosquito+0.009
 butterfly+0.009
Neg logit contributions
ossible-0.051
 harbour-0.048
 ago-0.047
ric-0.046
 tanks-0.046
Token.
Token ID13
Feature activation+1.03e-02

Pos logit contributions
 independent+0.021
 compassionate+0.020
 graceful+0.020
 clever+0.020
 sure+0.020
Neg logit contributions
ossible-0.023
 ago-0.022
 harbour-0.021
 exhibit-0.020
 cinema-0.020
Token We
Token ID775
Feature activation-5.58e-03

Pos logit contributions
 brave+0.017
 sure+0.017
 independent+0.017
 strong+0.016
 clever+0.016
Neg logit contributions
ossible-0.018
ric-0.016
 harbour-0.016
 cinema-0.015
 exhibit-0.015
Token are
Token ID389
Feature activation-2.36e-02

Pos logit contributions
ossible+0.011
ric+0.010
 harbour+0.010
 exhibit+0.010
|+0.010
Neg logit contributions
 independent-0.007
 brave-0.007
 sure-0.007
 strong-0.007
 graceful-0.007
Token proud
Token ID6613
Feature activation-1.71e-02

Pos logit contributions
ric+0.044
ossible+0.044
 harbour+0.043
 cinema+0.042
 aquarium+0.041
Neg logit contributions
Sweet-0.026
worm-0.026
 redd-0.026
 bee-0.025
 tang-0.025
Token of
Token ID286
Feature activation+9.86e-03

Pos logit contributions
ossible+0.035
 harbour+0.031
ric+0.030
 exhibit+0.030
 cinema+0.030
Neg logit contributions
 independent-0.023
 sure-0.022
 brave-0.022
 strong-0.022
 clever-0.022
Token you
Token ID345
Feature activation+2.08e-02

Pos logit contributions
 independent+0.016
 sure+0.016
 brave+0.016
 strong+0.016
 graceful+0.015
Neg logit contributions
ossible-0.017
ric-0.015
 Sale-0.014
 harbour-0.014
 exhibit-0.014
Token the
Token ID262
Feature activation-1.39e-02

Pos logit contributions
ossible+0.007
 harbour+0.006
ric+0.006
 cinema+0.006
 exhibit+0.006
Neg logit contributions
 independent-0.003
 sure-0.003
 brave-0.003
 clever-0.003
 graceful-0.003
Token boat
Token ID8848
Feature activation-2.99e-02

Pos logit contributions
ossible+0.025
ric+0.023
|+0.022
 exhibit+0.021
 harbour+0.021
Neg logit contributions
 brave-0.023
 strong-0.023
 independent-0.023
 sure-0.022
 graceful-0.022
Token was
Token ID373
Feature activation-2.85e-02

Pos logit contributions
ossible+0.051
ric+0.047
 harbour+0.046
 exhibit+0.045
 cinema+0.044
Neg logit contributions
 independent-0.048
 sure-0.048
 brave-0.048
 strong-0.046
 clever-0.046
Token free
Token ID1479
Feature activation+8.89e-03

Pos logit contributions
ossible+0.061
ric+0.056
 harbour+0.055
 exhibit+0.054
 cinema+0.053
Neg logit contributions
 brave-0.036
 sure-0.036
 independent-0.036
 strong-0.035
 clever-0.034
Token.
Token ID13
Feature activation-1.28e-02

Pos logit contributions
 independent+0.014
 brave+0.013
 sure+0.013
 clever+0.013
 graceful+0.013
Neg logit contributions
ossible-0.015
 harbour-0.014
ric-0.014
 exhibit-0.013
 ago-0.013
Token Jack
Token ID3619
Feature activation+6.88e-03

Pos logit contributions
ossible+0.019
ric+0.017
 harbour+0.016
 exhibit+0.016
 cinema+0.016
Neg logit contributions
 brave-0.024
 independent-0.024
 sure-0.024
 strong-0.024
 clever-0.023
Token was
Token ID373
Feature activation-3.34e-03

Pos logit contributions
 independent+0.012
 compassionate+0.011
 intelligent+0.011
 redd+0.011
 courageous+0.011
Neg logit contributions
ossible-0.010
ric-0.009
 harbour-0.009
 exhibit-0.009
|-0.009
Token so
Token ID523
Feature activation+2.83e-02

Pos logit contributions
ossible+0.006
ric+0.006
 harbour+0.006
ained+0.005
|+0.005
Neg logit contributions
 bee-0.004
 bunny-0.004
 butterfly-0.004
 independent-0.004
Bee-0.004
Token happy
Token ID3772
Feature activation-5.03e-03

Pos logit contributions
 pony+0.020
 bee+0.020
 bunny+0.020
 butterflies+0.019
 butterfly+0.019
Neg logit contributions
ossible-0.070
ric-0.063
 harbour-0.062
|-0.062
rollers-0.060
Token and
Token ID290
Feature activation+1.39e-02

Pos logit contributions
ossible+0.010
ric+0.009
 exhibit+0.009
 harbour+0.009
 cinema+0.008
Neg logit contributions
 independent-0.007
 brave-0.007
 strong-0.007
 sure-0.007
 clever-0.007
Token grateful
Token ID14066
Feature activation-1.45e-02

Pos logit contributions
 bee+0.016
 sure+0.016
 brave+0.016
 independent+0.016
 raven+0.015
Neg logit contributions
ossible-0.027
 harbour-0.024
|-0.023
ric-0.022
 exhibit-0.022
Token to
Token ID284
Feature activation-2.54e-02

Pos logit contributions
 independent+0.010
 brave+0.010
 compassionate+0.009
 clever+0.009
 sure+0.009
Neg logit contributions
ossible-0.011
ric-0.011
 exhibit-0.010
rollers-0.010
 harbour-0.010
Token peek
Token ID27185
Feature activation+8.45e-03

Pos logit contributions
ossible+0.045
ric+0.044
 harbour+0.042
 exhibit+0.041
 cinema+0.040
Neg logit contributions
 independent-0.036
 graceful-0.034
 brave-0.034
 sure-0.033
 clever-0.033
Token in
Token ID287
Feature activation-2.68e-02

Pos logit contributions
 compassionate+0.013
 clever+0.012
 raven+0.012
 tang+0.012
 independent+0.012
Neg logit contributions
ossible-0.012
 ago-0.012
 exhibit-0.012
 harbour-0.012
 cinema-0.012
Token on
Token ID319
Feature activation-3.00e-03

Pos logit contributions
ossible+0.055
 harbour+0.051
 ago+0.050
 exhibit+0.050
ric+0.049
Neg logit contributions
 sure-0.032
 independent-0.032
 brave-0.031
 clever-0.031
 compassionate-0.031
Token Mom
Token ID11254
Feature activation-5.26e-02

Pos logit contributions
ossible+0.006
 exhibit+0.006
ric+0.006
 harbour+0.006
 cinema+0.006
Neg logit contributions
 sure-0.003
 independent-0.003
 brave-0.003
 clever-0.003
 compassionate-0.003
Tokenmy
Token ID1820
Feature activation-3.15e-02

Pos logit contributions
ossible+0.067
ric+0.059
 harbour+0.057
 exhibit+0.054
 cinema+0.054
Neg logit contributions
 brave-0.111
 independent-0.110
 sure-0.110
 strong-0.108
 clever-0.106
Token and
Token ID290
Feature activation-2.62e-02

Pos logit contributions
ossible+0.052
ric+0.048
 exhibit+0.047
 harbour+0.046
 cinema+0.045
Neg logit contributions
 independent-0.052
 brave-0.051
 graceful-0.051
 clever-0.050
 strong-0.050
Token Daddy
Token ID26926
Feature activation-3.36e-02

Pos logit contributions
ossible+0.050
 harbour+0.045
ric+0.045
 cinema+0.042
 exhibit+0.042
Neg logit contributions
 independent-0.037
 brave-0.036
 sure-0.035
 graceful-0.035
 strong-0.035
Token's
Token ID338
Feature activation-2.61e-02

Pos logit contributions
ossible+0.053
ric+0.049
 exhibit+0.048
 harbour+0.047
 station+0.046
Neg logit contributions
 independent-0.057
 graceful-0.055
 clever-0.054
 compassionate-0.054
 brave-0.054
Token conversations
Token ID10275
Feature activation-4.67e-03

Pos logit contributions
ossible+0.043
ric+0.038
 harbour+0.037
 exhibit+0.036
 driveway+0.035
Neg logit contributions
 sure-0.043
 independent-0.043
 graceful-0.042
 brave-0.042
 strong-0.042
Token."
Token ID526
Feature activation+5.88e-03

Pos logit contributions
ossible+0.008
ric+0.007
 Sale+0.007
 cinema+0.007
 harbour+0.007
Neg logit contributions
 independent-0.007
 graceful-0.007
 clever-0.007
 brave-0.007
 compassionate-0.007
Token don
Token ID836
Feature activation-1.93e-02

Pos logit contributions
ossible+0.019
ric+0.017
 exhibit+0.016
|+0.016
 harbour+0.016
Neg logit contributions
 independent-0.012
 brave-0.012
 sure-0.012
 graceful-0.012
 strong-0.012
Token't
Token ID470
Feature activation-3.64e-02

Pos logit contributions
ossible+0.030
ric+0.027
 harbour+0.026
 exhibit+0.025
 cinema+0.025
Neg logit contributions
 brave-0.035
 sure-0.035
 independent-0.035
 strong-0.034
 clever-0.034
Token know
Token ID760
Feature activation-1.04e-03

Pos logit contributions
ossible+0.070
ric+0.061
 ago+0.058
 harbour+0.058
|+0.058
Neg logit contributions
 independent-0.050
 brave-0.049
 clever-0.049
 graceful-0.049
 sure-0.048
Token.
Token ID13
Feature activation-1.14e-02

Pos logit contributions
ossible+0.002
ric+0.002
 ago+0.002
 exhibit+0.002
 harbour+0.002
Neg logit contributions
 independent-0.002
 brave-0.002
 strong-0.002
 sure-0.002
 clever-0.002
Token But
Token ID887
Feature activation-1.01e-02

Pos logit contributions
ossible+0.020
ric+0.018
 harbour+0.017
 exhibit+0.017
 cinema+0.017
Neg logit contributions
 independent-0.018
 sure-0.018
 brave-0.018
 strong-0.018
 clever-0.018
Token we
Token ID356
Feature activation-2.47e-02

Pos logit contributions
ossible+0.019
ric+0.017
 ago+0.016
 harbour+0.016
 exhibit+0.016
Neg logit contributions
 independent-0.014
 sure-0.013
 graceful-0.013
 brave-0.013
 courageous-0.013
Token can
Token ID460
Feature activation-1.87e-02

Pos logit contributions
ossible+0.047
ric+0.043
 harbour+0.041
 exhibit+0.040
|+0.040
Neg logit contributions
 independent-0.036
 brave-0.036
 sure-0.036
 graceful-0.035
 strong-0.035
Token count
Token ID954
Feature activation-1.42e-02

Pos logit contributions
ossible+0.037
ric+0.034
 harbour+0.033
 exhibit+0.032
 cinema+0.032
Neg logit contributions
 independent-0.026
 brave-0.026
 sure-0.026
 strong-0.025
 clever-0.025
Token the
Token ID262
Feature activation-1.81e-02

Pos logit contributions
ossible+0.028
ric+0.027
 exhibit+0.025
 ago+0.024
 harbour+0.024
Neg logit contributions
 independent-0.019
 brave-0.019
 sure-0.019
 clever-0.018
 compassionate-0.018
Token words
Token ID2456
Feature activation-8.93e-03

Pos logit contributions
ossible+0.036
ric+0.033
 harbour+0.032
 exhibit+0.032
 cinema+0.031
Neg logit contributions
 brave-0.025
 sure-0.025
 independent-0.025
 strong-0.024
 clever-0.024
Token.
Token ID13
Feature activation-1.06e-02

Pos logit contributions
ossible+0.017
 ago+0.015
 exhibit+0.014
ric+0.014
 harbour+0.014
Neg logit contributions
 independent-0.013
 brave-0.012
 clever-0.012
 sure-0.012
 compassionate-0.012
Token surprised
Token ID6655
Feature activation-1.94e-02

Pos logit contributions
 butterflies+0.019
 pony+0.019
 silk+0.018
 butterfly+0.018
 bunny+0.018
Neg logit contributions
ossible-0.051
 harbour-0.045
ric-0.044
 ago-0.043
rollers-0.042
Token.
Token ID13
Feature activation-2.53e-03

Pos logit contributions
ossible+0.039
ric+0.036
 harbour+0.035
 exhibit+0.034
 cinema+0.033
Neg logit contributions
 independent-0.027
 brave-0.027
 sure-0.026
 strong-0.026
 clever-0.025
Token He
Token ID679
Feature activation-4.39e-03

Pos logit contributions
ossible+0.004
 harbour+0.004
ric+0.004
 exhibit+0.004
 cinema+0.004
Neg logit contributions
 brave-0.004
 sure-0.004
 independent-0.004
 strong-0.004
 clever-0.004
Token asked
Token ID1965
Feature activation-7.87e-03

Pos logit contributions
ossible+0.007
 harbour+0.006
ric+0.006
 exhibit+0.006
|+0.005
Neg logit contributions
 independent-0.008
 sure-0.007
 brave-0.007
 strong-0.007
 graceful-0.007
Token the
Token ID262
Feature activation-2.51e-03

Pos logit contributions
ossible+0.014
 exhibit+0.012
 Sale+0.012
 tanks+0.011
 harbour+0.011
Neg logit contributions
 graceful-0.012
 independent-0.012
 tang-0.012
 clever-0.012
 courageous-0.011
Token tree
Token ID5509
Feature activation-3.55e-02

Pos logit contributions
ossible+0.005
 exhibit+0.004
loo+0.004
 driveway+0.004
ric+0.004
Neg logit contributions
 sure-0.003
 independent-0.003
 swat-0.003
 graceful-0.003
 courageous-0.003
Token,
Token ID11
Feature activation-1.13e-02

Pos logit contributions
ossible+0.064
ric+0.059
 exhibit+0.058
 harbour+0.056
 cinema+0.054
Neg logit contributions
 independent-0.054
 sure-0.054
 brave-0.053
 strong-0.052
 clever-0.051
Token
Token ID6184
Feature activation-8.87e-03

Pos logit contributions
ossible+0.021
ric+0.019
 harbour+0.018
 ago+0.018
 cinema+0.017
Neg logit contributions
 independent-0.016
 brave-0.015
 graceful-0.015
 clever-0.015
 strong-0.015
Token
Token ID95
Feature activation-2.33e-02

Pos logit contributions
ossible+0.015
ric+0.014
 exhibit+0.014
 harbour+0.014
 cinema+0.013
Neg logit contributions
 sure-0.014
 independent-0.013
 brave-0.013
 graceful-0.013
 clever-0.013
Token
Token ID26391
Feature activation-4.79e-03

Pos logit contributions
ossible+0.036
ric+0.033
 exhibit+0.032
 harbour+0.031
|+0.031
Neg logit contributions
 sure-0.041
 brave-0.040
 independent-0.040
 graceful-0.040
 strong-0.039
Token
Token ID129
Feature activation-2.94e-02

Pos logit contributions
ossible+0.008
ric+0.008
 harbour+0.007
 exhibit+0.007
 cinema+0.007
Neg logit contributions
 brave-0.008
 independent-0.008
 sure-0.008
 strong-0.008
 clever-0.007
Token some
Token ID617
Feature activation+1.13e-02

Pos logit contributions
 independent+0.006
 brave+0.006
 sure+0.006
 strong+0.006
 clever+0.006
Neg logit contributions
ossible-0.015
ric-0.014
 harbour-0.014
 cinema-0.014
 exhibit-0.014
Token police
Token ID1644
Feature activation-1.32e-02

Pos logit contributions
 sure+0.014
 independent+0.013
 brave+0.013
 strong+0.012
 courageous+0.012
Neg logit contributions
ossible-0.022
 harbour-0.021
ric-0.020
 cinema-0.020
 aquarium-0.019
Token officers
Token ID3790
Feature activation-1.94e-02

Pos logit contributions
ossible+0.024
ric+0.021
|+0.020
 harbour+0.020
 cinema+0.019
Neg logit contributions
 brave-0.022
 independent-0.021
 strong-0.021
 sure-0.021
 clever-0.021
Token.
Token ID13
Feature activation-1.30e-03

Pos logit contributions
ossible+0.042
ric+0.038
|+0.036
 Sale+0.036
 harbour+0.036
Neg logit contributions
 brave-0.026
 sure-0.025
 strong-0.024
 independent-0.024
 clever-0.022
Token They
Token ID1119
Feature activation-1.33e-02

Pos logit contributions
ossible+0.002
ric+0.002
 harbour+0.002
 cinema+0.002
 exhibit+0.002
Neg logit contributions
 brave-0.002
 independent-0.002
 sure-0.002
 strong-0.002
 clever-0.002
Token were
Token ID547
Feature activation-2.94e-02

Pos logit contributions
ossible+0.026
ric+0.025
 cinema+0.024
|+0.024
 exhibit+0.023
Neg logit contributions
 brave-0.021
 sure-0.020
 strong-0.020
 clever-0.019
 independent-0.019
Token looking
Token ID2045
Feature activation+2.08e-02

Pos logit contributions
ossible+0.065
ric+0.058
|+0.056
 exhibit+0.056
OM+0.056
Neg logit contributions
 brave-0.043
 sure-0.041
 strong-0.040
 independent-0.040
 clever-0.039
Token for
Token ID329
Feature activation-1.75e-03

Pos logit contributions
 sure+0.025
 independent+0.025
 brave+0.025
 strong+0.024
 clever+0.024
Neg logit contributions
ossible-0.044
ric-0.041
 harbour-0.040
 cinema-0.039
 exhibit-0.039
Token a
Token ID257
Feature activation+1.08e-03

Pos logit contributions
ossible+0.003
 harbour+0.003
ric+0.003
 exhibit+0.003
 cinema+0.003
Neg logit contributions
 independent-0.003
 sure-0.003
 brave-0.003
 strong-0.003
 graceful-0.003
Token lost
Token ID2626
Feature activation-3.70e-03

Pos logit contributions
 brave+0.001
 independent+0.001
 sure+0.001
 strong+0.001
 clever+0.001
Neg logit contributions
ossible-0.003
ric-0.002
 harbour-0.002
 exhibit-0.002
 cinema-0.002
Token child
Token ID1200
Feature activation+9.56e-04

Pos logit contributions
ossible+0.007
ric+0.007
 harbour+0.006
|+0.006
 cinema+0.006
Neg logit contributions
 brave-0.006
 independent-0.005
 sure-0.005
 strong-0.005
 graceful-0.005
Token to
Token ID284
Feature activation-2.06e-02

Pos logit contributions
ossible+0.132
ric+0.124
 harbour+0.122
 cinema+0.119
|+0.119
Neg logit contributions
 independent-0.067
 sure-0.067
 brave-0.066
 strong-0.065
 clever-0.062
Token play
Token ID711
Feature activation+1.66e-02

Pos logit contributions
ossible+0.046
ric+0.043
 harbour+0.043
 cinema+0.042
 exhibit+0.041
Neg logit contributions
 brave-0.024
 independent-0.024
 sure-0.024
 strong-0.024
 clever-0.022
Token with
Token ID351
Feature activation-2.69e-02

Pos logit contributions
 Lex+0.026
 courageous+0.026
 fearless+0.025
Pen+0.025
Flash+0.025
Neg logit contributions
 cinema-0.023
 exhibit-0.022
ossible-0.022
 factory-0.022
 station-0.022
Token the
Token ID262
Feature activation-9.96e-03

Pos logit contributions
ossible+0.049
ric+0.048
 ago+0.045
 exhibit+0.044
 cinema+0.043
Neg logit contributions
 courageous-0.037
 fearless-0.036
 sure-0.036
 brave-0.036
 independent-0.035
Token scissors
Token ID41786
Feature activation+1.72e-02

Pos logit contributions
ossible+0.016
 driveway+0.015
box+0.015
 station+0.015
 cinema+0.015
Neg logit contributions
 sure-0.014
 courageous-0.014
 fearless-0.013
 brave-0.013
 Honey-0.012
Token anymore
Token ID7471
Feature activation-3.85e-02

Pos logit contributions
 independent+0.026
 sure+0.026
 brave+0.026
 graceful+0.025
 clever+0.025
Neg logit contributions
ossible-0.030
ric-0.027
 exhibit-0.026
 harbour-0.026
 ago-0.026
Token because
Token ID780
Feature activation-1.65e-02

Pos logit contributions
ossible+0.066
 exhibit+0.058
ric+0.058
 ago+0.056
 cinema+0.055
Neg logit contributions
 independent-0.062
 brave-0.061
 strong-0.061
 graceful-0.060
 clever-0.059
Token they
Token ID484
Feature activation-2.11e-02

Pos logit contributions
ossible+0.035
 ago+0.032
ric+0.031
 cinema+0.031
 Sale+0.031
Neg logit contributions
 brave-0.018
 sure-0.017
 independent-0.017
 courageous-0.017
 fearless-0.016
Token were
Token ID547
Feature activation-3.84e-02

Pos logit contributions
ossible+0.038
ric+0.035
 harbour+0.034
 exhibit+0.032
 cinema+0.032
Neg logit contributions
 brave-0.033
 independent-0.033
 sure-0.033
 strong-0.032
 clever-0.032
Token sharp
Token ID7786
Feature activation+8.20e-03

Pos logit contributions
ric+0.060
ossible+0.059
 ago+0.057
ained+0.057
 harbour+0.054
Neg logit contributions
 brave-0.054
 courageous-0.054
Small-0.053
 raven-0.053
 fearless-0.053
Token and
Token ID290
Feature activation+2.13e-02

Pos logit contributions
 sure+0.012
 compassionate+0.012
 brave+0.012
 independent+0.012
 clever+0.012
Neg logit contributions
 ago-0.013
ossible-0.013
anged-0.013
ric-0.013
 exhibit-0.012
Token and
Token ID290
Feature activation+2.34e-04

Pos logit contributions
ossible+0.030
ric+0.029
 harbour+0.028
|+0.028
 cinema+0.027
Neg logit contributions
 brave-0.016
 independent-0.016
 strong-0.016
 sure-0.015
 clever-0.015
Token hop
Token ID1725
Feature activation+2.24e-02

Pos logit contributions
 brave+0.000
 sure+0.000
 independent+0.000
 strong+0.000
 clever+0.000
Neg logit contributions
ossible-0.001
ric-0.000
 harbour-0.000
 cinema-0.000
|-0.000
Token around
Token ID1088
Feature activation-7.71e-03

Pos logit contributions
 brave+0.023
 sure+0.023
 independent+0.023
 strong+0.022
 clever+0.021
Neg logit contributions
ossible-0.053
ric-0.049
 harbour-0.048
 exhibit-0.047
 cinema-0.047
Token on
Token ID319
Feature activation-1.65e-03

Pos logit contributions
ossible+0.015
ric+0.014
 harbour+0.014
 exhibit+0.013
 cinema+0.013
Neg logit contributions
 sure-0.011
 independent-0.011
 brave-0.011
 strong-0.010
 clever-0.010
Token his
Token ID465
Feature activation-3.79e-03

Pos logit contributions
ossible+0.004
ric+0.004
 harbour+0.004
 exhibit+0.003
 cinema+0.003
Neg logit contributions
 sure-0.002
 brave-0.002
 independent-0.002
 strong-0.002
 clever-0.002
Token tr
Token ID491
Feature activation-8.49e-02

Pos logit contributions
ossible+0.007
ric+0.006
 harbour+0.006
 cinema+0.006
 exhibit+0.006
Neg logit contributions
 brave-0.006
 independent-0.006
 sure-0.006
 strong-0.006
 clever-0.006
Tokenamp
Token ID696
Feature activation-4.91e-02

Pos logit contributions
 inconvenience+0.138
 Sale+0.123
 tanks+0.120
 cinema+0.120
 pizz+0.119
Neg logit contributions
 independent-0.109
 flexible-0.106
The-0.104
 graceful-0.103
 brave-0.102
Tokenoline
Token ID14453
Feature activation-2.60e-02

Pos logit contributions
ossible+0.083
ric+0.075
 harbour+0.074
 cinema+0.068
rollers+0.065
Neg logit contributions
 independent-0.082
 brave-0.081
 strong-0.078
 sure-0.076
 graceful-0.075
Token.
Token ID13
Feature activation-1.11e-02

Pos logit contributions
ossible+0.062
ric+0.061
 harbour+0.057
|+0.056
 Sale+0.056
Neg logit contributions
 brave-0.028
 strong-0.026
 independent-0.025
 sure-0.025
 graceful-0.023
Token One
Token ID1881
Feature activation+1.07e-02

Pos logit contributions
ric+0.021
|+0.021
ossible+0.020
anged+0.020
 cinema+0.019
Neg logit contributions
 sure-0.015
 strong-0.014
 independent-0.014
 brave-0.014
 graceful-0.013
Token day
Token ID1110
Feature activation-8.17e-03

Pos logit contributions
 brave+0.019
 independent+0.019
 sure+0.019
 strong+0.019
 clever+0.018
Neg logit contributions
ossible-0.017
ric-0.016
 harbour-0.015
 cinema-0.014
 exhibit-0.014
Token\n
Token ID198
Feature activation-4.19e-03

Pos logit contributions
ossible+0.018
ric+0.017
 harbour+0.017
 cinema+0.016
 exhibit+0.016
Neg logit contributions
 sure-0.017
 brave-0.017
 independent-0.017
 strong-0.017
 clever-0.016
TokenGrand
Token ID23581
Feature activation-2.04e-02

Pos logit contributions
ossible+0.009
 harbour+0.008
ric+0.008
 cinema+0.008
 exhibit+0.008
Neg logit contributions
 independent-0.005
 brave-0.005
 strong-0.005
 sure-0.005
 clever-0.005
Tokenma
Token ID2611
Feature activation-2.53e-02

Pos logit contributions
ossible+0.036
 harbour+0.035
ric+0.034
 cinema+0.033
bank+0.033
Neg logit contributions
 brave-0.030
 sweet-0.029
 sure-0.029
 independent-0.029
 graceful-0.028
Token's
Token ID338
Feature activation-1.78e-02

Pos logit contributions
ossible+0.047
 exhibit+0.042
ric+0.041
 harbour+0.041
 station+0.039
Neg logit contributions
 independent-0.038
 strong-0.036
 graceful-0.036
 clever-0.035
 sure-0.035
Token words
Token ID2456
Feature activation-1.72e-02

Pos logit contributions
ossible+0.037
 harbour+0.034
ric+0.034
 cinema+0.033
 exhibit+0.033
Neg logit contributions
 independent-0.021
 sure-0.021
 strong-0.020
 brave-0.020
 graceful-0.020
Token fore
Token ID1674
Feature activation-7.25e-02

Pos logit contributions
ossible+0.031
ric+0.027
 harbour+0.027
 ago+0.026
 exhibit+0.025
Neg logit contributions
 independent-0.027
 brave-0.025
 sure-0.025
 clever-0.025
 strong-0.025
Tokenshadow
Token ID19106
Feature activation-2.24e-02

Pos logit contributions
ossible+0.126
ric+0.115
 harbour+0.112
 exhibit+0.107
 cinema+0.107
Neg logit contributions
 brave-0.118
 independent-0.117
 sure-0.117
 strong-0.115
 clever-0.112
Tokened
Token ID276
Feature activation-2.57e-02

Pos logit contributions
ossible+0.041
ric+0.038
 harbour+0.036
 exhibit+0.036
|+0.035
Neg logit contributions
 independent-0.033
 brave-0.033
 sure-0.033
 strong-0.032
 clever-0.032
Token a
Token ID257
Feature activation-1.01e-02

Pos logit contributions
ossible+0.048
ric+0.044
 harbour+0.043
 exhibit+0.042
 ago+0.041
Neg logit contributions
 independent-0.036
 brave-0.035
 sure-0.035
 graceful-0.034
 clever-0.034
Token happy
Token ID3772
Feature activation-1.83e-02

Pos logit contributions
ossible+0.021
ric+0.019
 harbour+0.019
 cinema+0.018
 exhibit+0.018
Neg logit contributions
 brave-0.013
 independent-0.013
 sure-0.013
 strong-0.013
 clever-0.013
Token future
Token ID2003
Feature activation-1.94e-02

Pos logit contributions
ossible+0.036
ric+0.033
 harbour+0.032
 exhibit+0.031
 cinema+0.031
Neg logit contributions
 sure-0.026
 brave-0.026
 independent-0.026
 strong-0.025
 clever-0.024
Token gentle
Token ID10296
Feature activation+9.60e-03

Pos logit contributions
 sure+0.013
 independent+0.013
 brave+0.012
 strong+0.011
 clever+0.011
Neg logit contributions
ossible-0.034
 harbour-0.031
ric-0.031
 cinema-0.030
 exhibit-0.030
Token ticking
Token ID40212
Feature activation-1.79e-03

Pos logit contributions
 sure+0.014
 independent+0.013
 brave+0.013
 clever+0.013
 strong+0.013
Neg logit contributions
ossible-0.019
ric-0.017
 harbour-0.017
 exhibit-0.016
 cinema-0.016
Token sound
Token ID2128
Feature activation+4.28e-03

Pos logit contributions
ossible+0.003
ric+0.003
 harbour+0.003
 exhibit+0.003
 cinema+0.003
Neg logit contributions
 sure-0.002
 independent-0.002
 brave-0.002
 clever-0.002
 graceful-0.002
Token as
Token ID355
Feature activation+2.41e-02

Pos logit contributions
 independent+0.007
 clever+0.006
 sure+0.006
 brave+0.006
 compassionate+0.006
Neg logit contributions
ossible-0.007
 harbour-0.007
ric-0.006
 ago-0.006
 exhibit-0.006
Token the
Token ID262
Feature activation-2.20e-03

Pos logit contributions
 independent+0.033
 raven+0.032
 pony+0.032
Small+0.032
 bunny+0.031
Neg logit contributions
 ago-0.039
ossible-0.037
 harbour-0.035
OM-0.034
ric-0.034
Token pend
Token ID44017
Feature activation-7.25e-02

Pos logit contributions
ossible+0.004
 harbour+0.004
 cinema+0.004
ric+0.004
 exhibit+0.004
Neg logit contributions
 sure-0.003
 independent-0.003
 brave-0.003
 graceful-0.003
 clever-0.003
Tokenulum
Token ID14452
Feature activation-2.37e-02

Pos logit contributions
ossible+0.081
|+0.062
ric+0.062
 exhibit+0.059
 Sale+0.057
Neg logit contributions
 independent-0.163
 strong-0.160
 brave-0.160
 sure-0.156
 graceful-0.155
Token slowly
Token ID6364
Feature activation-4.88e-03

Pos logit contributions
ossible+0.039
ric+0.037
 harbour+0.036
 exhibit+0.035
 cinema+0.034
Neg logit contributions
 independent-0.039
 brave-0.038
 sure-0.038
 clever-0.037
 strong-0.036
Token swinging
Token ID25635
Feature activation+5.86e-03

Pos logit contributions
ossible+0.008
 harbour+0.007
ric+0.007
 exhibit+0.007
 cinema+0.007
Neg logit contributions
 independent-0.008
 sure-0.008
 brave-0.008
 clever-0.008
 graceful-0.007
Token back
Token ID736
Feature activation-1.95e-02

Pos logit contributions
 sure+0.008
 independent+0.008
 clever+0.008
 brave+0.008
 compassionate+0.008
Neg logit contributions
ossible-0.011
ric-0.010
 harbour-0.010
 ago-0.009
 exhibit-0.009
Token and
Token ID290
Feature activation-1.52e-02

Pos logit contributions
ossible+0.039
ric+0.037
 harbour+0.035
 exhibit+0.035
 cinema+0.034
Neg logit contributions
 brave-0.026
 sure-0.026
 independent-0.026
 clever-0.025
 strong-0.025
Token something
Token ID1223
Feature activation+2.99e-03

Pos logit contributions
 independent+0.011
 sure+0.010
 brave+0.010
 graceful+0.010
 clever+0.010
Neg logit contributions
ossible-0.022
 harbour-0.021
ric-0.020
 cinema-0.020
 exhibit-0.020
Token flew
Token ID13112
Feature activation-2.48e-03

Pos logit contributions
 independent+0.004
 sure+0.004
 brave+0.003
 careful+0.003
 intelligent+0.003
Neg logit contributions
ossible-0.006
 harbour-0.005
 cinema-0.005
 exhibit-0.005
ric-0.005
Token out
Token ID503
Feature activation-3.65e-03

Pos logit contributions
ossible+0.006
ric+0.005
 exhibit+0.005
 harbour+0.005
 cinema+0.005
Neg logit contributions
 strong-0.003
 brave-0.003
 independent-0.003
 sure-0.003
 graceful-0.003
Token of
Token ID286
Feature activation-2.60e-02

Pos logit contributions
ossible+0.008
ric+0.008
 harbour+0.007
 exhibit+0.007
 cinema+0.007
Neg logit contributions
 brave-0.004
 independent-0.004
 sure-0.004
 strong-0.004
 clever-0.004
Token the
Token ID262
Feature activation-2.85e-02

Pos logit contributions
ossible+0.073
ric+0.070
 exhibit+0.067
 harbour+0.066
rollers+0.066
Neg logit contributions
 strong-0.017
 brave-0.016
 independent-0.014
 sure-0.014
 gentle-0.013
Token chim
Token ID18205
Feature activation-7.15e-02

Pos logit contributions
ossible+0.046
ric+0.040
 harbour+0.040
 cinema+0.038
|+0.038
Neg logit contributions
 brave-0.053
 strong-0.053
 independent-0.053
 graceful-0.050
 clever-0.050
Tokenney
Token ID1681
Feature activation-4.99e-02

Pos logit contributions
ossible+0.136
|+0.129
 cinema+0.127
 harbour+0.126
 Sale+0.124
Neg logit contributions
 strong-0.098
 brave-0.095
 independent-0.093
 flexible-0.088
 graceful-0.087
Token!
Token ID0
Feature activation-1.47e-02

Pos logit contributions
ossible+0.118
ric+0.108
|+0.104
 harbour+0.103
 cinema+0.102
Neg logit contributions
 brave-0.056
 strong-0.055
 sure-0.054
 independent-0.053
 graceful-0.050
Token It
Token ID632
Feature activation-6.44e-03

Pos logit contributions
ric+0.032
ossible+0.031
 harbour+0.028
|+0.028
anged+0.028
Neg logit contributions
 sure-0.018
 brave-0.018
 strong-0.018
 independent-0.017
 clever-0.016
Token was
Token ID373
Feature activation-1.76e-02

Pos logit contributions
ossible+0.010
 harbour+0.010
ric+0.009
 exhibit+0.009
 cinema+0.009
Neg logit contributions
 independent-0.010
 sure-0.010
 brave-0.010
 compassionate-0.010
 intelligent-0.010
Token a
Token ID257
Feature activation+2.22e-02

Pos logit contributions
ossible+0.039
 harbour+0.037
ric+0.037
 cinema+0.036
 exhibit+0.035
Neg logit contributions
 independent-0.018
 sure-0.018
 brave-0.018
 clever-0.017
 graceful-0.017
Token began
Token ID2540
Feature activation+8.51e-03

Pos logit contributions
 independent+0.002
 brave+0.002
 clever+0.002
 compassionate+0.002
 sure+0.002
Neg logit contributions
ossible-0.002
 harbour-0.002
ric-0.002
|-0.002
 cinema-0.002
Token to
Token ID284
Feature activation-4.49e-03

Pos logit contributions
 brave+0.013
 independent+0.013
 sure+0.013
 clever+0.013
 strong+0.012
Neg logit contributions
ossible-0.015
ric-0.014
 harbour-0.013
 exhibit-0.012
 cinema-0.012
Token jump
Token ID4391
Feature activation+1.62e-02

Pos logit contributions
ossible+0.008
ric+0.008
 harbour+0.007
 exhibit+0.007
|+0.007
Neg logit contributions
 independent-0.007
 brave-0.007
 sure-0.007
 clever-0.007
 graceful-0.006
Token on
Token ID319
Feature activation+1.07e-02

Pos logit contributions
 independent+0.024
 compassionate+0.024
 sure+0.024
 clever+0.024
 intelligent+0.023
Neg logit contributions
ossible-0.027
ric-0.025
|-0.025
 harbour-0.024
 exhibit-0.024
Token the
Token ID262
Feature activation-1.15e-02

Pos logit contributions
 compassionate+0.014
 sure+0.014
 observ+0.014
 independent+0.014
 courageous+0.013
Neg logit contributions
ossible-0.019
ric-0.018
 exhibit-0.018
 pool-0.018
 harbour-0.017
Token tr
Token ID491
Feature activation-7.14e-02

Pos logit contributions
ossible+0.017
ric+0.017
 harbour+0.015
 exhibit+0.015
 driveway+0.015
Neg logit contributions
 sure-0.021
 independent-0.020
 brave-0.019
 compassionate-0.019
 clever-0.019
Tokenamp
Token ID696
Feature activation-3.23e-02

Pos logit contributions
 harbour+0.114
ossible+0.113
 inconvenience+0.112
 Sale+0.111
 station+0.110
Neg logit contributions
 flexible-0.084
 independent-0.083
 sure-0.081
 strong-0.081
 graceful-0.080
Tokenoline
Token ID14453
Feature activation-1.54e-02

Pos logit contributions
ossible+0.049
 harbour+0.043
ric+0.043
 cinema+0.039
 exhibit+0.037
Neg logit contributions
 independent-0.060
 brave-0.060
 sure-0.060
 strong-0.059
 graceful-0.057
Token.
Token ID13
Feature activation-1.01e-02

Pos logit contributions
ossible+0.029
ric+0.027
 harbour+0.026
 exhibit+0.026
 cinema+0.026
Neg logit contributions
 independent-0.022
 sure-0.022
 brave-0.022
 clever-0.021
 strong-0.021
Token\n
Token ID198
Feature activation-1.13e-02

Pos logit contributions
ossible+0.016
ric+0.014
 harbour+0.013
|+0.013
 exhibit+0.013
Neg logit contributions
 brave-0.019
 independent-0.018
 strong-0.018
 clever-0.018
 sure-0.018
Token\n
Token ID198
Feature activation-4.01e-03

Pos logit contributions
ossible+0.018
ric+0.015
|+0.014
 harbour+0.014
 exhibit+0.014
Neg logit contributions
 independent-0.021
 brave-0.021
 strong-0.020
 clever-0.020
 sure-0.020