TOP ACTIVATIONS
MAX = 2.439

. " Hello , little girl ," said the bear .
" Hello , little girl ," said Tw inkle .
and said : " Hello Ab ig ail ! I 'm
. " Hello , little girl ," said the rabbit .
good student . Em ma was happy that she
the best of friends . Ab bie 's original number was
" Hello , little girl ," said the bear .
and twist ! Em ma was very happy to
some tasty leaves on the other side of the garden .
singing . Em ma suddenly stopped in her
from it . Em ma was afraid , uncertain
. " Hello , little girl ," said the rabbit .
tried to land on the other side of the bin .
little girl ." Em ma said , " That
Daisy saw this and asked Ab bie if she could share
the gate keeper agreed that Ab bie had worked hard and
right away . Em ma spent hours gazing at
" Hello , little girl ," said the spirit .
The younger sister smiled and said ,
" Hi there , little girl ! Do you want to

MIN ACTIVATIONS
MIN = -2.778

giving him too many ch ocol ates ." The king learned
oll ip ops , ch ocol ates , g ummies ,
" I will buy more av oc ados tomorrow . You
sad , Mr . Cro cod ile ?" The crocod ile
reward . It was ch ocol ates and sweets !
old man who was sun bat hing in the shed .
So , Stri pe and his friends got their
birds that are not par rots . They might bite you
was a box of ch ocol ates . Lily loved ch
as he opened the ch ocol ates and started eating them
big . The wood pe cker was sad . Then
." St ri pe y felt sad because he
not eat too many ch ocol ates and his friend learned
noticed something . The ch ocol ates were bitter . They
or get you a new one ." L ily
they come out and sun bat he on the logs .
here ? You are tresp assing ! You are thieves !
said , " I love av oc ados ! They are
. St ri pe and his friends were scared
?" The wood pe cker thought it over .

INTERVAL 1.134 - 2.439
CONTAINS 1.006%

he was going to do something fun . He couldn 't
very sad and started to cry . The
with everyone ." M ia and Ben nodded and
dinner time ." L ily and her mom my
something special . She went on an adventure to find it

INTERVAL -0.170 - 1.134
CONTAINS 63.652%

climbing up the walls . After a while , she realized
The all igator was so happy to have a
<|endoftext|> L ily and Ben were playing in the park .
. Lily was sitting under a tree . " Hi Lily
and thanked Tim my by jumping out of the water .

INTERVAL -1.474 - -0.170
CONTAINS 35.215%

slowly fade away . <|endoftext|> Once upon a time , there
Tom and Lily wanted to say hello to the old
slowly to the ground . Everyone was wondering what would happen
Jenny . Jenny was scared and started running around .
, Max ," Tim said . " We made the mess

INTERVAL -2.778 - -1.474
CONTAINS 0.128%

, there was a little gir affe named Gig i .
. We were just playing . Our parents are at the
careful , Lily . Cater pill ars can be tricky .
creative muff ins like this one at home ?"
Did you not remember what I told you ?"
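
The four INTERVAL ranges above look like equal-width bins spanning MIN to MAX. This is inferred from the numbers, not stated by the source. A minimal Python sketch under that assumption (the helper name `equal_width_edges` is hypothetical) recomputes the bin edges and checks that the CONTAINS percentages add up to roughly 100%:

```python
# Values copied from the dump above.
ACT_MIN, ACT_MAX = -2.778, 2.439
REPORTED_EDGES = [-2.778, -1.474, -0.170, 1.134, 2.439]
REPORTED_SHARES = [0.128, 35.215, 63.652, 1.006]  # percent, lowest bin first


def equal_width_edges(lo, hi, n_bins):
    """Return the n_bins + 1 edges of equal-width bins over [lo, hi].

    Hypothetical helper: the dump does not say how its intervals were chosen.
    """
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


edges = equal_width_edges(ACT_MIN, ACT_MAX, 4)

# Largest absolute difference between a computed edge and the reported one.
max_edge_error = max(abs(a - b) for a, b in zip(edges, REPORTED_EDGES))
total_share = sum(REPORTED_SHARES)

print("computed edges:", edges)
print("largest gap to reported edges:", max_edge_error)
print("sum of shares (%):", total_share)
```

If the assumption holds, the computed edges should differ from the reported ones by less than 0.001, which would be consistent with MIN and MAX having been rounded to three decimals for display.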

TOKEN DETAILS

Token.
Token ID13
Feature activation-6.02e-02

Pos logit contributions
 Grac+0.010
 gn+0.009
 Nib+0.009
 Maur+0.009
 Po+0.009
Neg logit contributions
 differences-0.005
 team-0.005
affe-0.005
 law-0.004
 sides-0.004
Token "
Token ID366
Feature activation+7.14e-01

Pos logit contributions
 differences+0.002
 losers+0.002
oom+0.002
 pleasures+0.002
 similarities+0.002
Neg logit contributions
 gn-0.001
 Grac-0.001
 Nib-0.001
 Po-0.001
 Maur-0.001
TokenHello
Token ID15496
Feature activation+6.64e-01

Pos logit contributions
 Grac+0.016
 gn+0.015
 Nib+0.015
ina+0.015
 Po+0.014
Neg logit contributions
 differences-0.020
 losers-0.019
 pleasures-0.019
 similarities-0.018
affe-0.018
Token,
Token ID11
Feature activation+6.11e-01

Pos logit contributions
 gn+0.013
 Grac+0.013
 Nib+0.013
 Po+0.012
 Maur+0.012
Neg logit contributions
 differences-0.021
 losers-0.020
 pleasures-0.019
 similarities-0.019
oom-0.019
Token little
Token ID1310
Feature activation+1.65e+00

Pos logit contributions
 Grac+0.014
 gn+0.014
 Nib+0.013
 Po+0.013
 Ho+0.012
Neg logit contributions
 differences-0.018
 pleasures-0.017
 losers-0.017
 similarities-0.016
oom-0.016
Token girl
Token ID2576
Feature activation+2.44e+00

Pos logit contributions
 gn+0.041
 Grac+0.041
 Nib+0.040
 Po+0.038
ina+0.038
Neg logit contributions
 differences-0.043
 losers-0.040
 pleasures-0.039
oom-0.038
 similarities-0.037
Token,"
Token ID553
Feature activation+8.94e-01

Pos logit contributions
 Grac+0.056
 gn+0.056
 Nib+0.055
 Po+0.053
ina+0.051
Neg logit contributions
 differences-0.067
 losers-0.063
 pleasures-0.061
oom-0.060
affe-0.059
Token said
Token ID531
Feature activation-3.80e-01

Pos logit contributions
 gn+0.015
 Grac+0.014
 Nib+0.014
 Po+0.013
 Maur+0.012
Neg logit contributions
 differences-0.032
 losers-0.030
 pleasures-0.030
oom-0.029
 similarities-0.029
Token the
Token ID262
Feature activation+3.00e-01

Pos logit contributions
 differences+0.011
 pleasures+0.010
 losers+0.010
 similarities+0.010
oom+0.009
Neg logit contributions
 Grac-0.010
 Nib-0.009
 gn-0.009
 Po-0.009
 Maur-0.009
Token bear
Token ID6842
Feature activation-1.63e-01

Pos logit contributions
 Grac+0.008
 gn+0.008
 Nib+0.008
ina+0.008
 Maur+0.008
Neg logit contributions
 differences-0.007
 losers-0.006
 law-0.006
 pleasures-0.006
 team-0.006
Token.
Token ID13
Feature activation+2.23e-01

Pos logit contributions
 differences+0.004
 losers+0.004
 pleasures+0.004
 similarities+0.004
oom+0.004
Neg logit contributions
 gn-0.004
 Nib-0.004
 Grac-0.004
 Po-0.004
ina-0.004
Token
Token ID198
Feature activation+1.65e-01

Pos logit contributions
 gn+0.020
icol+0.019
 Nib+0.019
 Grac+0.018
 blo+0.018
Neg logit contributions
 differences-0.017
Everyone-0.017
oom-0.017
affe-0.016
 similarities-0.016
Token"
Token ID1
Feature activation+9.44e-01

Pos logit contributions
 gn+0.004
 Grac+0.004
 Nib+0.004
 Po+0.004
ina+0.004
Neg logit contributions
 differences-0.004
 losers-0.004
 pleasures-0.004
 similarities-0.004
oom-0.004
TokenHello
Token ID15496
Feature activation+6.90e-01

Pos logit contributions
 Nib+0.021
 gn+0.021
 Grac+0.021
 Po+0.020
 Maur+0.019
Neg logit contributions
 differences-0.026
 pleasures-0.024
Everyone-0.024
 losers-0.024
 similarities-0.024
Token,
Token ID11
Feature activation+7.74e-01

Pos logit contributions
 gn+0.017
 Grac+0.016
 Nib+0.016
icol+0.016
 Po+0.015
Neg logit contributions
 differences-0.017
oom-0.016
 losers-0.016
 team-0.015
affe-0.015
Token little
Token ID1310
Feature activation+1.63e+00

Pos logit contributions
 gn+0.018
icol+0.018
 Grac+0.017
 Nib+0.017
ina+0.016
Neg logit contributions
 differences-0.020
 losers-0.019
 similarities-0.018
 law-0.017
 team-0.017
Token girl
Token ID2576
Feature activation+2.38e+00

Pos logit contributions
 gn+0.040
 Grac+0.039
 Nib+0.039
ina+0.037
icol+0.037
Neg logit contributions
 differences-0.041
 pleasures-0.037
 losers-0.037
oom-0.036
 similarities-0.035
Token,"
Token ID553
Feature activation+1.01e+00

Pos logit contributions
 Grac+0.057
 gn+0.057
 Nib+0.056
 Po+0.053
 Maur+0.052
Neg logit contributions
 differences-0.063
 losers-0.059
 pleasures-0.056
oom-0.056
affe-0.056
Token said
Token ID531
Feature activation-4.76e-01

Pos logit contributions
 gn+0.020
 Grac+0.019
 Nib+0.018
ina+0.017
icol+0.017
Neg logit contributions
 differences-0.032
 losers-0.030
 similarities-0.029
 pleasures-0.028
oom-0.028
Token Tw
Token ID1815
Feature activation+5.32e-01

Pos logit contributions
 differences+0.014
 losers+0.013
oom+0.012
 pleasures+0.012
 similarities+0.012
Neg logit contributions
 gn-0.010
 Grac-0.010
 Nib-0.010
 Po-0.009
ina-0.009
Tokeninkle
Token ID19894
Feature activation-1.83e-01

Pos logit contributions
 gn+0.014
 Grac+0.013
ina+0.013
 Nib+0.013
 Po+0.013
Neg logit contributions
 differences-0.014
 losers-0.013
 missiles-0.013
 laws-0.013
 pleasures-0.013
Token.
Token ID13
Feature activation+2.05e-01

Pos logit contributions
 differences+0.006
 losers+0.005
 pleasures+0.005
 similarities+0.005
 laws+0.005
Neg logit contributions
 gn-0.004
 Grac-0.004
 Nib-0.004
 Po-0.004
ina-0.004
Token and
Token ID290
Feature activation+3.47e-01

Pos logit contributions
 differences+0.001
 pleasures+0.001
 losers+0.001
ness+0.001
 team+0.001
Neg logit contributions
 gn-0.001
 Nib-0.001
 Grac-0.001
 Maur-0.001
 Po-0.001
Token said
Token ID531
Feature activation-1.88e-01

Pos logit contributions
 gn+0.007
 Grac+0.007
 Nib+0.007
 Po+0.006
ina+0.006
Neg logit contributions
 differences-0.011
 losers-0.010
 pleasures-0.010
 similarities-0.010
oom-0.010
Token:
Token ID25
Feature activation-3.51e-01

Pos logit contributions
 differences+0.006
 losers+0.006
 pleasures+0.006
 similarities+0.006
oom+0.006
Neg logit contributions
 gn-0.003
 Grac-0.003
 Nib-0.003
 Po-0.003
ina-0.003
Token "
Token ID366
Feature activation+1.45e+00

Pos logit contributions
 differences+0.007
 losers+0.007
 similarities+0.006
 pleasures+0.006
oom+0.006
Neg logit contributions
 gn-0.010
 Nib-0.010
 Grac-0.010
ina-0.009
 Maur-0.009
TokenHello
Token ID15496
Feature activation+9.78e-01

Pos logit contributions
 gn+0.035
 Grac+0.034
 Nib+0.034
 Po+0.032
ina+0.031
Neg logit contributions
 differences-0.040
 losers-0.037
 pleasures-0.036
 similarities-0.035
oom-0.035
Token Ab
Token ID2275
Feature activation+2.36e+00

Pos logit contributions
 Grac+0.019
 gn+0.019
 Nib+0.018
 Po+0.018
ina+0.017
Neg logit contributions
 differences-0.032
 losers-0.031
 pleasures-0.029
 similarities-0.029
affe-0.029
Tokenig
Token ID328
Feature activation+9.99e-01

Pos logit contributions
 gn+0.072
 Grac+0.072
 Nib+0.071
 Po+0.069
ina+0.067
Neg logit contributions
 differences-0.048
 losers-0.044
 pleasures-0.042
oom-0.041
 similarities-0.041
Tokenail
Token ID603
Feature activation+8.73e-01

Pos logit contributions
 gn+0.021
 Grac+0.020
ina+0.019
 Nib+0.019
 Po+0.019
Neg logit contributions
 differences-0.032
 losers-0.030
 pleasures-0.029
 similarities-0.029
 laws-0.028
Token!
Token ID0
Feature activation-5.84e-01

Pos logit contributions
 Grac+0.016
 gn+0.016
ina+0.016
 Po+0.015
 Nib+0.015
Neg logit contributions
 differences-0.029
 losers-0.027
 similarities-0.027
 pleasures-0.027
 missiles-0.026
Token I
Token ID314
Feature activation+6.57e-01

Pos logit contributions
 differences+0.015
 losers+0.014
 pleasures+0.014
 similarities+0.014
oom+0.014
Neg logit contributions
 gn-0.014
 Grac-0.014
 Nib-0.014
 Po-0.013
ina-0.013
Token'm
Token ID1101
Feature activation+6.17e-01

Pos logit contributions
 Nib+0.015
 gn+0.015
 Maur+0.014
icol+0.014
 Grac+0.014
Neg logit contributions
 differences-0.018
oom-0.017
 losers-0.016
 pleasures-0.016
 team-0.016
Token.
Token ID13
Feature activation+1.17e-01

Pos logit contributions
 Grac+0.005
 gn+0.005
 Nib+0.005
 Po+0.005
 Maur+0.005
Neg logit contributions
 differences-0.005
 losers-0.004
 pleasures-0.004
affe-0.004
oom-0.004
Token "
Token ID366
Feature activation+8.10e-01

Pos logit contributions
 gn+0.003
 Grac+0.003
 Nib+0.003
ina+0.003
 Po+0.003
Neg logit contributions
 differences-0.003
 losers-0.003
 pleasures-0.003
 similarities-0.003
affe-0.003
TokenHello
Token ID15496
Feature activation+8.63e-01

Pos logit contributions
 gn+0.016
 Grac+0.015
 Nib+0.015
 Po+0.014
ina+0.014
Neg logit contributions
 differences-0.026
 losers-0.025
 pleasures-0.024
 similarities-0.024
oom-0.023
Token,
Token ID11
Feature activation+7.35e-01

Pos logit contributions
 gn+0.017
 Grac+0.017
 Nib+0.017
 Po+0.016
 Ho+0.015
Neg logit contributions
 differences-0.028
 losers-0.027
 pleasures-0.026
 similarities-0.026
 laws-0.025
Token little
Token ID1310
Feature activation+1.54e+00

Pos logit contributions
 Grac+0.023
 gn+0.023
 Nib+0.023
 Po+0.022
 Ho+0.021
Neg logit contributions
 differences-0.015
 losers-0.013
 pleasures-0.013
 similarities-0.013
oom-0.013
Token girl
Token ID2576
Feature activation+2.34e+00

Pos logit contributions
 gn+0.040
 Grac+0.040
 Nib+0.039
 Po+0.038
ina+0.037
Neg logit contributions
 differences-0.039
 losers-0.037
 pleasures-0.035
 similarities-0.035
oom-0.034
Token,"
Token ID553
Feature activation+9.91e-01

Pos logit contributions
 gn+0.052
 Grac+0.051
 Nib+0.050
 Po+0.048
ina+0.046
Neg logit contributions
 differences-0.070
 losers-0.065
 pleasures-0.063
 similarities-0.062
oom-0.061
Token said
Token ID531
Feature activation-3.01e-01

Pos logit contributions
 gn+0.017
 Grac+0.017
 Nib+0.016
 Po+0.016
 called+0.015
Neg logit contributions
 differences-0.036
 pleasures-0.034
oom-0.034
 losers-0.034
 similarities-0.033
Token the
Token ID262
Feature activation+2.14e-01

Pos logit contributions
 differences+0.009
 pleasures+0.009
 similarities+0.008
 losers+0.008
oom+0.008
Neg logit contributions
 Grac-0.008
 Nib-0.008
 Po-0.007
 gn-0.007
 Ho-0.007
Token rabbit
Token ID22746
Feature activation+1.95e-01

Pos logit contributions
 Nib+0.006
 Grac+0.006
 Maur+0.006
 gn+0.006
icol+0.006
Neg logit contributions
 differences-0.004
 team-0.004
affe-0.004
 losers-0.004
 law-0.004
Token.
Token ID13
Feature activation+2.88e-01

Pos logit contributions
 gn+0.005
 Nib+0.005
 Grac+0.004
 Po+0.004
 called+0.004
Neg logit contributions
 differences-0.006
 losers-0.005
 pleasures-0.005
 similarities-0.005
oom-0.005
Token good
Token ID922
Feature activation+6.12e-01

Pos logit contributions
 differences+0.005
 losers+0.005
 pleasures+0.005
oom+0.005
 similarities+0.005
Neg logit contributions
 gn-0.006
 Grac-0.006
 Nib-0.006
 Po-0.006
ina-0.005
Token student
Token ID3710
Feature activation+9.09e-01

Pos logit contributions
 gn+0.017
 Grac+0.017
ina+0.016
 Po+0.016
 Nib+0.016
Neg logit contributions
 differences-0.014
 losers-0.014
 similarities-0.013
 missiles-0.013
affe-0.013
Token.
Token ID13
Feature activation+7.28e-01

Pos logit contributions
 gn+0.019
 Grac+0.019
ina+0.018
 called+0.017
 Po+0.017
Neg logit contributions
 differences-0.028
 losers-0.026
 similarities-0.026
oom-0.026
 pleasures-0.025
Token
Token ID198
Feature activation+7.31e-01

Pos logit contributions
 gn+0.015
 Nib+0.014
 Grac+0.014
ina+0.013
 Po+0.013
Neg logit contributions
 differences-0.023
 losers-0.021
 pleasures-0.021
 similarities-0.020
oom-0.020
Token
Token ID198
Feature activation+2.44e-01

Pos logit contributions
 gn+0.017
 Nib+0.016
 Grac+0.016
 Po+0.015
ina+0.015
Neg logit contributions
 differences-0.021
 losers-0.019
Everyone-0.019
 pleasures-0.019
oom-0.018
TokenEm
Token ID10161
Feature activation+2.32e+00

Pos logit contributions
 Grac+0.006
 gn+0.005
 Po+0.005
ina+0.005
 Nib+0.005
Neg logit contributions
 differences-0.007
 losers-0.007
oom-0.006
 winds-0.006
 attitude-0.006
Tokenma
Token ID2611
Feature activation+3.88e-01

Pos logit contributions
 Grac+0.061
 gn+0.061
 Nib+0.059
 Po+0.057
ina+0.057
Neg logit contributions
 differences-0.058
 losers-0.054
 pleasures-0.052
 similarities-0.050
oom-0.050
Token was
Token ID373
Feature activation+3.07e-01

Pos logit contributions
 gn+0.008
 Grac+0.008
 Nib+0.007
ina+0.007
 Po+0.007
Neg logit contributions
 differences-0.012
 losers-0.012
 pleasures-0.011
 similarities-0.011
oom-0.011
Token happy
Token ID3772
Feature activation+3.20e-02

Pos logit contributions
 gn+0.004
 Grac+0.004
 Nib+0.004
 Po+0.004
ina+0.003
Neg logit contributions
 differences-0.012
 losers-0.011
 pleasures-0.011
 similarities-0.011
oom-0.011
Token that
Token ID326
Feature activation-4.80e-02

Pos logit contributions
 gn+0.001
 Grac+0.001
 Nib+0.001
 Po+0.001
ina+0.001
Neg logit contributions
 differences-0.001
 losers-0.001
 pleasures-0.001
 similarities-0.001
oom-0.001
Token she
Token ID673
Feature activation+4.61e-01

Pos logit contributions
 differences+0.002
 losers+0.002
affe+0.002
oom+0.002
 similarities+0.002
Neg logit contributions
 Grac-0.000
 gn-0.000
 Po-0.000
ina-0.000
 Nib-0.000
Token the
Token ID262
Feature activation-2.62e-02

Pos logit contributions
 differences+0.004
 losers+0.004
 pleasures+0.004
oom+0.004
affe+0.004
Neg logit contributions
 Grac-0.005
 gn-0.005
 Nib-0.005
 Po-0.004
ina-0.004
Token best
Token ID1266
Feature activation-1.10e-01

Pos logit contributions
 differences+0.001
 pleasures+0.001
 losers+0.001
 similarities+0.001
oom+0.001
Neg logit contributions
 gn-0.001
 Grac-0.001
 Nib-0.001
 Po-0.000
ina-0.000
Token of
Token ID286
Feature activation-1.32e+00

Pos logit contributions
 differences+0.003
 similarities+0.003
 losers+0.002
 pleasures+0.002
Everyone+0.002
Neg logit contributions
 gn-0.003
 Grac-0.003
ina-0.003
 imagining-0.003
 Nib-0.003
Token friends
Token ID2460
Feature activation-5.22e-02

Pos logit contributions
 pleasures+0.035
affe+0.034
oom+0.034
 differences+0.033
vent+0.032
Neg logit contributions
 Grac-0.031
 blo-0.031
 called-0.030
 searching-0.029
 Po-0.028
Token.
Token ID13
Feature activation+9.35e-02

Pos logit contributions
 differences+0.002
 losers+0.001
 pleasures+0.001
 similarities+0.001
oom+0.001
Neg logit contributions
 Grac-0.001
 gn-0.001
 Nib-0.001
ina-0.001
 Po-0.001
Token Ab
Token ID2275
Feature activation+2.31e+00

Pos logit contributions
 gn+0.002
 Grac+0.002
 Nib+0.002
 Po+0.002
ina+0.002
Neg logit contributions
 differences-0.003
 pleasures-0.003
 losers-0.003
 similarities-0.003
oom-0.002
Tokenbie
Token ID12590
Feature activation+1.89e-01

Pos logit contributions
 gn+0.036
 Nib+0.035
 Grac+0.034
 Po+0.032
 Ho+0.030
Neg logit contributions
 differences-0.082
oom-0.077
 losers-0.076
 pleasures-0.075
 similarities-0.074
Token's
Token ID338
Feature activation+6.26e-02

Pos logit contributions
 gn+0.004
 Grac+0.003
 Nib+0.003
ina+0.003
 Po+0.003
Neg logit contributions
 differences-0.006
 pleasures-0.006
 losers-0.006
 laws-0.006
 similarities-0.006
Token original
Token ID2656
Feature activation+5.30e-01

Pos logit contributions
 gn+0.002
 Grac+0.001
 Nib+0.001
 searching+0.001
 imagining+0.001
Neg logit contributions
 losers-0.002
 pleasures-0.002
 differences-0.001
Everyone-0.001
oom-0.001
Token number
Token ID1271
Feature activation-2.29e-01

Pos logit contributions
 gn+0.014
 Grac+0.013
 imagining+0.013
 Nib+0.013
ina+0.012
Neg logit contributions
 differences-0.013
 losers-0.013
 similarities-0.012
 missiles-0.012
Everyone-0.012
Token was
Token ID373
Feature activation+8.64e-02

Pos logit contributions
 differences+0.007
 losers+0.007
 pleasures+0.007
 similarities+0.007
oom+0.006
Neg logit contributions
 Grac-0.004
 gn-0.004
ina-0.004
 Nib-0.004
 called-0.004
Token
Token ID198
Feature activation+1.10e-01

Pos logit contributions
 gn+0.008
 Nib+0.008
 Grac+0.007
ina+0.007
 blo+0.007
Neg logit contributions
 differences-0.007
Everyone-0.007
oom-0.007
 losers-0.007
 pleasures-0.006
Token"
Token ID1
Feature activation+5.72e-01

Pos logit contributions
 gn+0.003
 Grac+0.003
 Po+0.002
 Nib+0.002
 called+0.002
Neg logit contributions
 differences-0.003
 losers-0.003
 missiles-0.003
 pleasures-0.003
 similarities-0.003
TokenHello
Token ID15496
Feature activation+8.88e-01

Pos logit contributions
 gn+0.013
 Grac+0.013
 Nib+0.013
ina+0.012
 Po+0.012
Neg logit contributions
 differences-0.016
 losers-0.015
 pleasures-0.015
 similarities-0.014
oom-0.014
Token,
Token ID11
Feature activation+7.25e-01

Pos logit contributions
 gn+0.020
 Grac+0.020
 Nib+0.019
 Po+0.018
ina+0.018
Neg logit contributions
 differences-0.025
 losers-0.023
oom-0.023
 pleasures-0.022
affe-0.022
Token little
Token ID1310
Feature activation+1.84e+00

Pos logit contributions
 gn+0.021
 Grac+0.021
 Nib+0.020
ina+0.020
 Maur+0.020
Neg logit contributions
 differences-0.015
 losers-0.015
affe-0.013
 pleasures-0.013
 similarities-0.013
Token girl
Token ID2576
Feature activation+2.28e+00

Pos logit contributions
 Grac+0.052
 gn+0.052
 Nib+0.051
ina+0.051
 Maur+0.050
Neg logit contributions
 differences-0.039
 losers-0.036
 pleasures-0.035
oom-0.034
affe-0.033
Token,"
Token ID553
Feature activation+7.16e-01

Pos logit contributions
 Grac+0.054
 gn+0.054
 Nib+0.053
 Po+0.051
 Maur+0.050
Neg logit contributions
 differences-0.061
 losers-0.057
 pleasures-0.055
oom-0.054
affe-0.054
Token said
Token ID531
Feature activation-7.02e-01

Pos logit contributions
 gn+0.013
 Grac+0.013
 Nib+0.012
 Po+0.012
 Maur+0.011
Neg logit contributions
 differences-0.024
 losers-0.023
 pleasures-0.022
oom-0.022
 similarities-0.022
Token the
Token ID262
Feature activation+7.91e-01

Pos logit contributions
 differences+0.020
 pleasures+0.019
 losers+0.018
 similarities+0.018
oom+0.017
Neg logit contributions
 Grac-0.018
 Nib-0.018
 gn-0.017
 Po-0.017
 Ho-0.016
Token bear
Token ID6842
Feature activation-1.64e-01

Pos logit contributions
ina+0.023
 Nib+0.023
 Grac+0.023
icol+0.023
 Maur+0.022
Neg logit contributions
 differences-0.016
 losers-0.015
 law-0.014
 team-0.014
affe-0.014
Token.
Token ID13
Feature activation+3.83e-01

Pos logit contributions
 differences+0.005
 losers+0.004
 pleasures+0.004
 similarities+0.004
oom+0.004
Neg logit contributions
 gn-0.004
 Grac-0.004
 Nib-0.004
 Po-0.004
ina-0.004
Token and
Token ID290
Feature activation+4.56e-01

Pos logit contributions
 Grac+0.004
 gn+0.004
 Nib+0.004
 Po+0.004
ina+0.004
Neg logit contributions
 differences-0.005
 losers-0.005
 pleasures-0.005
 similarities-0.005
oom-0.005
Token twist
Token ID14528
Feature activation+2.73e-01

Pos logit contributions
 gn+0.012
 Grac+0.012
 Nib+0.011
 Po+0.011
 blo+0.011
Neg logit contributions
 differences-0.012
oom-0.011
 pleasures-0.011
 losers-0.011
affe-0.011
Token!
Token ID0
Feature activation+8.01e-01

Pos logit contributions
 Grac+0.006
 Po+0.006
 Nib+0.006
 gn+0.006
ina+0.005
Neg logit contributions
 differences-0.008
 losers-0.008
 pleasures-0.008
 similarities-0.008
affe-0.007
Token
Token ID198
Feature activation+8.98e-01

Pos logit contributions
 Grac+0.014
 gn+0.014
 Nib+0.014
 Po+0.013
ina+0.012
Neg logit contributions
 differences-0.027
 losers-0.026
 pleasures-0.025
 similarities-0.025
oom-0.024
Token
Token ID198
Feature activation+4.66e-01

Pos logit contributions
 gn+0.018
 Nib+0.017
 Grac+0.017
 Po+0.016
ina+0.015
Neg logit contributions
 differences-0.029
 losers-0.027
 pleasures-0.026
oom-0.026
 similarities-0.026
TokenEm
Token ID10161
Feature activation+2.28e+00

Pos logit contributions
 Grac+0.008
ina+0.007
 Po+0.007
 gn+0.007
 called+0.007
Neg logit contributions
 differences-0.016
 losers-0.015
affe-0.015
 attitude-0.015
 winds-0.015
Tokenma
Token ID2611
Feature activation-5.82e-02

Pos logit contributions
 Grac+0.059
 gn+0.058
 Nib+0.056
ina+0.055
 Po+0.055
Neg logit contributions
 differences-0.058
 losers-0.055
 pleasures-0.052
 similarities-0.051
affe-0.051
Token was
Token ID373
Feature activation-2.48e-01

Pos logit contributions
 differences+0.002
 pleasures+0.002
 losers+0.002
 similarities+0.002
 laws+0.002
Neg logit contributions
 gn-0.001
 called-0.001
 Grac-0.001
ina-0.001
 Nib-0.001
Token very
Token ID845
Feature activation+2.45e-01

Pos logit contributions
 differences+0.009
 losers+0.009
 pleasures+0.008
 laws+0.008
 sides+0.008
Neg logit contributions
 Nib-0.004
 Grac-0.004
 gn-0.003
 Po-0.003
ina-0.003
Token happy
Token ID3772
Feature activation-1.36e-01

Pos logit contributions
 gn+0.005
 Grac+0.005
 Nib+0.005
 Po+0.005
 Maur+0.005
Neg logit contributions
 differences-0.007
 pleasures-0.007
 losers-0.007
oom-0.007
affe-0.007
Token to
Token ID284
Feature activation-5.38e-01

Pos logit contributions
 differences+0.003
 losers+0.003
 pleasures+0.003
 similarities+0.003
 laws+0.003
Neg logit contributions
 Nib-0.004
 gn-0.004
 Grac-0.004
 Po-0.003
 Maur-0.003
Token some
Token ID617
Feature activation-7.86e-01

Pos logit contributions
 gn+0.004
 Grac+0.004
 Nib+0.004
ina+0.003
 Maur+0.003
Neg logit contributions
 differences-0.004
 losers-0.004
 missiles-0.004
 laws-0.004
 pleasures-0.004
Token tasty
Token ID25103
Feature activation+2.71e-01

Pos logit contributions
 differences+0.022
 losers+0.021
oom+0.021
 similarities+0.020
 pleasures+0.020
Neg logit contributions
 gn-0.019
 Grac-0.018
 Nib-0.018
 blo-0.018
 Po-0.018
Token leaves
Token ID5667
Feature activation-1.00e-01

Pos logit contributions
 gn+0.006
 Grac+0.006
 Nib+0.006
 Po+0.006
ina+0.005
Neg logit contributions
 differences-0.008
 losers-0.007
 pleasures-0.007
 similarities-0.007
oom-0.007
Token on
Token ID319
Feature activation+5.14e-01

Pos logit contributions
 differences+0.003
 losers+0.002
 pleasures+0.002
oom+0.002
affe+0.002
Neg logit contributions
 Grac-0.003
 gn-0.003
 Nib-0.002
 Po-0.002
 Maur-0.002
Token the
Token ID262
Feature activation+8.75e-01

Pos logit contributions
 gn+0.016
 Grac+0.016
 Nib+0.016
 Po+0.015
 Maur+0.015
Neg logit contributions
 differences-0.011
 losers-0.010
 pleasures-0.009
 similarities-0.009
oom-0.009
Token other
Token ID584
Feature activation+2.27e+00

Pos logit contributions
 gn+0.027
 Grac+0.027
 Nib+0.026
 Po+0.026
ina+0.025
Neg logit contributions
 differences-0.018
 losers-0.016
 pleasures-0.016
oom-0.015
 similarities-0.015
Token side
Token ID1735
Feature activation-1.12e-02

Pos logit contributions
 gn+0.081
 Grac+0.081
 Nib+0.080
 Po+0.078
ina+0.077
Neg logit contributions
 differences-0.035
 losers-0.030
 pleasures-0.029
 similarities-0.027
oom-0.027
Token of
Token ID286
Feature activation-2.26e-01

Pos logit contributions
 differences+0.000
 losers+0.000
 similarities+0.000
 pleasures+0.000
oom+0.000
Neg logit contributions
 gn-0.000
 Grac-0.000
 Po-0.000
ina-0.000
 Nib-0.000
Token the
Token ID262
Feature activation+3.03e-01

Pos logit contributions
 differences+0.004
 losers+0.004
 pleasures+0.004
 similarities+0.003
oom+0.003
Neg logit contributions
 gn-0.007
 Grac-0.007
 Nib-0.007
 Po-0.007
ina-0.007
Token garden
Token ID11376
Feature activation+1.03e+00

Pos logit contributions
 gn+0.009
 Grac+0.009
 Nib+0.008
 Po+0.008
ina+0.008
Neg logit contributions
 differences-0.007
 losers-0.006
 pleasures-0.006
 similarities-0.006
oom-0.006
Token.
Token ID13
Feature activation-1.22e+00

Pos logit contributions
 gn+0.027
 Grac+0.027
 Nib+0.026
 Po+0.026
ina+0.025
Neg logit contributions
 differences-0.025
 losers-0.024
 pleasures-0.023
 similarities-0.022
oom-0.022
Token singing
Token ID13777
Feature activation-3.63e-01

Pos logit contributions
 differences+0.001
 losers+0.001
 pleasures+0.001
 similarities+0.001
oom+0.001
Neg logit contributions
 gn-0.001
 Grac-0.001
 Nib-0.001
 Po-0.001
 blo-0.001
Token.
Token ID13
Feature activation+4.46e-01

Pos logit contributions
 differences+0.011
 losers+0.011
 similarities+0.010
 pleasures+0.010
affe+0.010
Neg logit contributions
 gn-0.007
 Grac-0.007
 blo-0.007
 Po-0.007
ina-0.007
Token
Token ID220
Feature activation+4.63e-01

Pos logit contributions
 gn+0.009
 Nib+0.008
 Grac+0.008
ina+0.008
 Po+0.008
Neg logit contributions
 differences-0.014
 losers-0.013
 pleasures-0.013
 similarities-0.013
oom-0.013
Token
Token ID198
Feature activation+2.16e-01

Pos logit contributions
 gn+0.009
 Grac+0.009
 Nib+0.008
 Po+0.008
ina+0.008
Neg logit contributions
 differences-0.015
 losers-0.014
 pleasures-0.014
oom-0.013
 similarities-0.013
Token
Token ID198
Feature activation+8.20e-02

Pos logit contributions
 gn+0.005
 Nib+0.005
 Grac+0.005
 Po+0.004
ina+0.004
Neg logit contributions
 differences-0.006
Everyone-0.006
 losers-0.006
 pleasures-0.006
oom-0.006
TokenEm
Token ID10161
Feature activation+2.26e+00

Pos logit contributions
 Grac+0.002
 gn+0.002
 Po+0.001
 called+0.001
ina+0.001
Neg logit contributions
 differences-0.003
 losers-0.002
oom-0.002
 attitude-0.002
 similarities-0.002
Tokenma
Token ID2611
Feature activation+2.18e-01

Pos logit contributions
 gn+0.068
 Nib+0.068
 Grac+0.066
 Po+0.064
 Ho+0.062
Neg logit contributions
 differences-0.050
 losers-0.044
 pleasures-0.043
oom-0.043
 similarities-0.042
Token suddenly
Token ID6451
Feature activation-2.05e-01

Pos logit contributions
 gn+0.004
 Grac+0.004
 Nib+0.004
 Po+0.004
ina+0.004
Neg logit contributions
 differences-0.007
 losers-0.007
 pleasures-0.007
 similarities-0.007
oom-0.006
Token stopped
Token ID5025
Feature activation-4.77e-01

Pos logit contributions
 differences+0.008
 losers+0.007
 pleasures+0.007
 similarities+0.007
affe+0.007
Neg logit contributions
 gn-0.003
 Grac-0.003
 Po-0.003
ina-0.003
 Nib-0.003
Token in
Token ID287
Feature activation-7.75e-02

Pos logit contributions
 differences+0.013
 losers+0.011
 pleasures+0.011
 similarities+0.011
 laws+0.011
Neg logit contributions
 gn-0.012
 Grac-0.012
 Nib-0.012
 Maur-0.011
 Po-0.011
Token her
Token ID607
Feature activation+3.62e-01

Pos logit contributions
 differences+0.002
 losers+0.002
 pleasures+0.002
 law+0.002
 laws+0.002
Neg logit contributions
 Grac-0.002
 Nib-0.002
 gn-0.002
 Maur-0.002
ina-0.002
Token from
Token ID422
Feature activation-9.37e-02

Pos logit contributions
 gn+0.007
 Nib+0.006
 Grac+0.006
 Maur+0.006
 Po+0.006
Neg logit contributions
 differences-0.009
 losers-0.009
oom-0.009
 sides-0.009
 pleasures-0.009
Token it
Token ID340
Feature activation+3.95e-01

Pos logit contributions
 differences+0.002
 losers+0.002
 pleasures+0.002
 laws+0.002
oom+0.002
Neg logit contributions
 gn-0.002
 Grac-0.002
 Nib-0.002
 Maur-0.002
 Po-0.002
Token.
Token ID13
Feature activation+4.53e-01

Pos logit contributions
 gn+0.010
 Grac+0.010
 Nib+0.010
 Po+0.009
ina+0.009
Neg logit contributions
 differences-0.011
 losers-0.010
 pleasures-0.009
 similarities-0.009
oom-0.009
Token
Token ID198
Feature activation+4.90e-01

Pos logit contributions
 Grac+0.008
 gn+0.008
 Nib+0.008
 Po+0.007
 Maur+0.007
Neg logit contributions
 differences-0.015
 losers-0.014
 pleasures-0.014
 similarities-0.014
oom-0.014
Token
Token ID198
Feature activation+2.53e-01

Pos logit contributions
 gn+0.010
 Nib+0.009
 Grac+0.009
 Po+0.009
ina+0.008
Neg logit contributions
 differences-0.016
 losers-0.015
 pleasures-0.014
oom-0.014
 similarities-0.014
TokenEm
Token ID10161
Feature activation+2.25e+00

Pos logit contributions
 Grac+0.005
 gn+0.005
ina+0.004
 Po+0.004
 called+0.004
Neg logit contributions
 differences-0.008
 losers-0.008
 winds-0.008
 attitude-0.007
 missiles-0.007
Tokenma
Token ID2611
Feature activation+2.32e-01

Pos logit contributions
 gn+0.060
 Grac+0.060
 Nib+0.058
 Po+0.056
ina+0.055
Neg logit contributions
 differences-0.056
 losers-0.051
 pleasures-0.050
 similarities-0.048
oom-0.048
Token was
Token ID373
Feature activation-5.61e-02

Pos logit contributions
 gn+0.004
 Grac+0.004
 Nib+0.004
 called+0.004
ina+0.004
Neg logit contributions
 differences-0.008
 pleasures-0.007
 losers-0.007
 similarities-0.007
oom-0.007
Token afraid
Token ID7787
Feature activation+3.69e-01

Pos logit contributions
 differences+0.002
 losers+0.002
 pleasures+0.002
 similarities+0.002
oom+0.002
Neg logit contributions
 gn-0.001
 Grac-0.001
 Nib-0.001
 Po-0.001
ina-0.001
Token,
Token ID11
Feature activation+8.93e-02

Pos logit contributions
 Grac+0.007
 Po+0.006
 gn+0.006
 called+0.006
 Nib+0.006
Neg logit contributions
 differences-0.013
 losers-0.012
 similarities-0.012
oom-0.012
 pleasures-0.012
Token uncertain
Token ID8627
Feature activation-3.28e-01

Pos logit contributions
 gn+0.002
 Grac+0.002
 Nib+0.002
 Po+0.002
ina+0.001
Neg logit contributions
 differences-0.003
 losers-0.003
 pleasures-0.003
oom-0.003
 similarities-0.003
Token.
Token ID13
Feature activation+8.91e-02

Pos logit contributions
 Grac+0.019
 gn+0.019
 Nib+0.018
 Maur+0.018
ina+0.017
Neg logit contributions
 differences-0.028
 losers-0.026
affe-0.025
 pleasures-0.025
 team-0.025
Token "
Token ID366
Feature activation+7.92e-01

Pos logit contributions
 gn+0.003
ina+0.003
 Grac+0.003
 blo+0.003
 Nib+0.003
Neg logit contributions
 differences-0.002
 losers-0.002
 similarities-0.002
 team-0.002
affe-0.002
TokenHello
Token ID15496
Feature activation+7.85e-01

Pos logit contributions
 Nib+0.020
 gn+0.020
 Grac+0.019
 Maur+0.018
ina+0.018
Neg logit contributions
 differences-0.020
Everyone-0.019
 pleasures-0.018
 similarities-0.018
 losers-0.018
Token,
Token ID11
Feature activation+6.41e-01

Pos logit contributions
 gn+0.016
 Grac+0.016
 Nib+0.016
 Po+0.015
ina+0.015
Neg logit contributions
 differences-0.024
 losers-0.023
 pleasures-0.022
 similarities-0.022
oom-0.021
Token little
Token ID1310
Feature activation+1.36e+00

Pos logit contributions
 gn+0.021
 Grac+0.021
 Nib+0.021
 Po+0.020
ina+0.020
Neg logit contributions
 differences-0.012
 losers-0.011
 pleasures-0.010
 similarities-0.010
oom-0.010
Token girl
Token ID2576
Feature activation+2.24e+00

Pos logit contributions
 gn+0.037
 Po+0.037
 Grac+0.036
 Nib+0.035
 Ho+0.034
Neg logit contributions
 differences-0.035
 losers-0.032
 pleasures-0.031
 similarities-0.031
 laws-0.030
Token,"
Token ID553
Feature activation+1.13e+00

Pos logit contributions
 gn+0.051
 Grac+0.050
 Nib+0.049
 Po+0.047
ina+0.046
Neg logit contributions
 differences-0.065
 losers-0.060
 pleasures-0.059
 similarities-0.058
oom-0.057
Token said
Token ID531
Feature activation-5.24e-01

Pos logit contributions
 Grac+0.016
 gn+0.016
 Nib+0.016
 Po+0.015
ina+0.014
Neg logit contributions
 differences-0.042
 losers-0.040
 pleasures-0.039
oom-0.039
 similarities-0.038
Token the
Token ID262
Feature activation+1.43e-01

Pos logit contributions
 differences+0.016
 pleasures+0.015
 losers+0.014
 similarities+0.014
 sides+0.014
Neg logit contributions
 Grac-0.013
 Nib-0.013
 Po-0.013
 gn-0.012
 Ho-0.012
Token rabbit
Token ID22746
Feature activation+2.49e-01

Pos logit contributions
 Grac+0.004
 Nib+0.004
 gn+0.004
icol+0.004
 Maur+0.004
Neg logit contributions
 differences-0.003
affe-0.003
 team-0.003
 law-0.002
 losers-0.002
Token.
Token ID13
Feature activation+4.91e-01

Pos logit contributions
 gn+0.006
 Grac+0.006
 Nib+0.006
 Po+0.006
ina+0.006
Neg logit contributions
 differences-0.007
 losers-0.006
 pleasures-0.006
 similarities-0.006
oom-0.006
Token tried
Token ID3088
Feature activation+4.86e-01

Pos logit contributions
 Nib+0.006
 Grac+0.006
 gn+0.006
ina+0.006
 Maur+0.006
Neg logit contributions
 differences-0.008
 losers-0.008
 pleasures-0.007
oom-0.007
 similarities-0.007
Token to
Token ID284
Feature activation-1.66e-01

Pos logit contributions
 gn+0.012
 Grac+0.012
 Nib+0.012
 Po+0.011
ina+0.011
Neg logit contributions
 differences-0.013
 losers-0.012
 pleasures-0.012
 similarities-0.012
oom-0.011
Token land
Token ID1956
Feature activation-7.03e-02

Pos logit contributions
 differences+0.004
 pleasures+0.004
 similarities+0.004
 losers+0.004
affe+0.004
Neg logit contributions
 gn-0.004
 Grac-0.004
 Nib-0.004
 Po-0.004
 Maur-0.004
Token on
Token ID319
Feature activation+5.59e-01

Pos logit contributions
 differences+0.002
 losers+0.002
 pleasures+0.002
 similarities+0.002
affe+0.002
Neg logit contributions
 gn-0.002
 Nib-0.001
 Grac-0.001
 Po-0.001
 blo-0.001
Token the
Token ID262
Feature activation+1.01e+00

Pos logit contributions
 gn+0.015
 Grac+0.015
 Nib+0.015
 Po+0.015
 Maur+0.014
Neg logit contributions
 differences-0.013
 losers-0.012
 pleasures-0.012
 similarities-0.012
oom-0.012
Token other
Token ID584
Feature activation+2.23e+00

Pos logit contributions
 Grac+0.032
 Nib+0.031
 gn+0.031
ina+0.031
 Po+0.031
Neg logit contributions
 differences-0.020
 losers-0.018
 sides-0.017
 law-0.017
 laws-0.017
Token side
Token ID1735
Feature activation-1.48e-01

Pos logit contributions
 Grac+0.088
 gn+0.088
 Nib+0.087
 Po+0.085
ina+0.084
Neg logit contributions
 differences-0.026
 losers-0.020
 pleasures-0.020
 sides-0.019
 similarities-0.018
Token of
Token ID286
Feature activation-3.85e-01

Pos logit contributions
 differences+0.004
 similarities+0.003
 losers+0.003
 pleasures+0.003
 laws+0.003
Neg logit contributions
 gn-0.004
 Grac-0.004
 Po-0.004
 Nib-0.004
ina-0.004
Token the
Token ID262
Feature activation+5.66e-01

Pos logit contributions
 differences+0.007
 losers+0.007
 pleasures+0.006
 similarities+0.006
oom+0.006
Neg logit contributions
 gn-0.012
 Grac-0.012
 Nib-0.012
 Po-0.012
ina-0.012
Token bin
Token ID9874
Feature activation+8.19e-01

Pos logit contributions
 Grac+0.016
 Nib+0.015
 gn+0.015
ina+0.015
 Po+0.015
Neg logit contributions
 differences-0.014
 losers-0.012
 law-0.012
 pleasures-0.012
 laws-0.012
Token.
Token ID13
Feature activation-2.62e-01

Pos logit contributions
 gn+0.020
 Grac+0.020
 Nib+0.020
 Po+0.019
ina+0.019
Neg logit contributions
 differences-0.022
 losers-0.020
 pleasures-0.020
 similarities-0.019
oom-0.019
Token little
Token ID1310
Feature activation+7.18e-01

Pos logit contributions
 differences+0.008
 losers+0.007
 pleasures+0.007
 similarities+0.007
oom+0.007
Neg logit contributions
 gn-0.013
 Grac-0.013
 Nib-0.013
 Po-0.012
ina-0.012
Token girl
Token ID2576
Feature activation+1.32e+00

Pos logit contributions
 gn+0.018
 Grac+0.018
 Nib+0.018
 Po+0.017
ina+0.017
Neg logit contributions
 differences-0.018
 losers-0.017
 pleasures-0.016
 similarities-0.016
oom-0.016
Token."
Token ID526
Feature activation+1.03e+00

Pos logit contributions
 gn+0.034
 Grac+0.034
 Nib+0.034
 Po+0.032
 Maur+0.032
Neg logit contributions
 differences-0.033
 losers-0.031
 pleasures-0.029
oom-0.029
 similarities-0.029
Token\n
Token ID198
Feature activation+4.66e-01

Pos logit contributions
 gn+0.030
 Nib+0.027
 blo+0.027
 Maur+0.026
ina+0.025
Neg logit contributions
 differences-0.023
Everyone-0.021
 team-0.021
oom-0.020
 similarities-0.020
Token\n
Token ID198
Feature activation-4.65e-03

Pos logit contributions
 gn+0.012
 Nib+0.011
 Grac+0.011
 Maur+0.010
 blo+0.010
Neg logit contributions
 differences-0.012
Everyone-0.012
oom-0.011
 similarities-0.011
 losers-0.011
TokenEm
Token ID10161
Feature activation+2.23e+00

Pos logit contributions
 differences+0.000
 losers+0.000
 pleasures+0.000
oom+0.000
affe+0.000
Neg logit contributions
 Grac-0.000
 gn-0.000
 Po-0.000
 Nib-0.000
ina-0.000
Tokenma
Token ID2611
Feature activation-1.45e-01

Pos logit contributions
 gn+0.068
 Nib+0.068
 Grac+0.065
 Po+0.064
 Maur+0.062
Neg logit contributions
 differences-0.049
oom-0.043
 losers-0.042
 pleasures-0.042
 similarities-0.041
Token said
Token ID531
Feature activation-7.78e-01

Pos logit contributions
 differences+0.004
 losers+0.004
 pleasures+0.004
 similarities+0.004
oom+0.004
Neg logit contributions
 gn-0.003
 Grac-0.003
 Nib-0.003
 Po-0.003
ina-0.003
Token,
Token ID11
Feature activation-5.32e-01

Pos logit contributions
 differences+0.025
 losers+0.023
 pleasures+0.022
 similarities+0.022
oom+0.022
Neg logit contributions
 gn-0.016
 Grac-0.016
 Nib-0.015
 Po-0.015
ina-0.014
Token "
Token ID366
Feature activation+9.74e-01

Pos logit contributions
 differences+0.011
 losers+0.010
 similarities+0.009
 pleasures+0.009
oom+0.009
Neg logit contributions
 gn-0.016
 Nib-0.016
 Grac-0.015
ina-0.015
 Po-0.015
TokenThat
Token ID2504
Feature activation+2.74e-01

Pos logit contributions
 Nib+0.027
 gn+0.026
 Grac+0.026
 Po+0.025
 Maur+0.024
Neg logit contributions
 differences-0.023
 losers-0.021
 pleasures-0.021
 similarities-0.021
Everyone-0.020
Token Daisy
Token ID40355
Feature activation+3.14e-01

Pos logit contributions
 differences+0.008
 losers+0.007
 pleasures+0.007
 laws+0.007
 similarities+0.007
Neg logit contributions
 gn-0.003
 Nib-0.003
 Grac-0.003
ina-0.003
 Po-0.003
Token saw
Token ID2497
Feature activation-2.06e-01

Pos logit contributions
 gn+0.007
 Grac+0.007
 Nib+0.007
 Po+0.006
 Maur+0.006
Neg logit contributions
 differences-0.009
 losers-0.009
 pleasures-0.008
 similarities-0.008
oom-0.008
Token this
Token ID428
Feature activation+2.06e-01

Pos logit contributions
 differences+0.006
 laws+0.005
 losers+0.005
 missiles+0.005
 sides+0.005
Neg logit contributions
 gn-0.005
 Nib-0.005
 Grac-0.005
 Maur-0.004
ina-0.004
Token and
Token ID290
Feature activation-1.85e-01

Pos logit contributions
 gn+0.005
 Grac+0.005
 Nib+0.005
 Po+0.005
ina+0.005
Neg logit contributions
 differences-0.005
 losers-0.005
 pleasures-0.005
 similarities-0.005
oom-0.005
Token asked
Token ID1965
Feature activation-1.38e-01

Pos logit contributions
 differences+0.006
 losers+0.006
 pleasures+0.005
 similarities+0.005
oom+0.005
Neg logit contributions
 gn-0.003
 Grac-0.003
 Nib-0.003
 Po-0.003
ina-0.003
Token Ab
Token ID2275
Feature activation+2.23e+00

Pos logit contributions
 differences+0.004
 losers+0.004
 pleasures+0.004
 similarities+0.004
 laws+0.004
Neg logit contributions
 gn-0.003
 Nib-0.003
 Grac-0.003
ina-0.003
 Maur-0.003
Tokenbie
Token ID12590
Feature activation+4.85e-01

Pos logit contributions
 gn+0.030
 Grac+0.029
 Nib+0.029
 Po+0.026
 Maur+0.024
Neg logit contributions
 differences-0.084
 losers-0.079
oom-0.078
 pleasures-0.078
 similarities-0.077
Token if
Token ID611
Feature activation+4.48e-01

Pos logit contributions
 gn+0.010
 Grac+0.010
 Nib+0.010
 Po+0.009
ina+0.009
Neg logit contributions
 differences-0.015
 losers-0.015
 pleasures-0.014
oom-0.014
 similarities-0.014
Token she
Token ID673
Feature activation-1.35e-01

Pos logit contributions
 Grac+0.004
 gn+0.004
 Nib+0.004
 Po+0.004
 Ho+0.003
Neg logit contributions
 differences-0.020
 losers-0.019
 pleasures-0.019
oom-0.018
affe-0.018
Token could
Token ID714
Feature activation+5.67e-02

Pos logit contributions
 differences+0.005
 losers+0.004
 pleasures+0.004
oom+0.004
 similarities+0.004
Neg logit contributions
 gn-0.003
 Grac-0.002
 Nib-0.002
 Po-0.002
 called-0.002
Token share
Token ID2648
Feature activation+7.37e-02

Pos logit contributions
 gn+0.001
 Grac+0.001
 Nib+0.001
 Po+0.001
 called+0.001
Neg logit contributions
 differences-0.002
 losers-0.001
 pleasures-0.001
 similarities-0.001
 missiles-0.001
Token the
Token ID262
Feature activation+1.59e-01

Pos logit contributions
 differences+0.020
 losers+0.019
 pleasures+0.018
oom+0.018
 similarities+0.018
Neg logit contributions
 Grac-0.009
 gn-0.009
 Nib-0.009
 Maur-0.008
 Po-0.008
Token gate
Token ID8946
Feature activation-2.15e-01

Pos logit contributions
 gn+0.004
 blo+0.003
 called+0.003
 searching+0.003
 Po+0.003
Neg logit contributions
 differences-0.005
 losers-0.004
oom-0.004
 pleasures-0.004
 missiles-0.004
Tokenkeeper
Token ID13884
Feature activation-2.11e-01

Pos logit contributions
 differences+0.008
 pleasures+0.007
 losers+0.007
 similarities+0.007
oom+0.007
Neg logit contributions
 gn-0.004
 called-0.003
 tucked-0.003
 Grac-0.003
 Nib-0.003
Token agreed
Token ID4987
Feature activation+3.94e-01

Pos logit contributions
 differences+0.007
 losers+0.007
 pleasures+0.006
oom+0.006
 similarities+0.006
Neg logit contributions
 gn-0.004
 Grac-0.004
 Nib-0.004
 Po-0.004
ina-0.003
Token that
Token ID326
Feature activation-2.72e-02

Pos logit contributions
 gn+0.008
 Grac+0.008
 Nib+0.008
 Po+0.007
ina+0.007
Neg logit contributions
 differences-0.012
 losers-0.012
 pleasures-0.011
oom-0.011
 similarities-0.011
Token Ab
Token ID2275
Feature activation+2.23e+00

Pos logit contributions
 differences+0.001
 losers+0.001
affe+0.001
oom+0.001
 pleasures+0.001
Neg logit contributions
 Grac-0.000
 gn-0.000
 Nib-0.000
 Po-0.000
 Maur-0.000
Tokenbie
Token ID12590
Feature activation-2.52e-01

Pos logit contributions
 gn+0.031
 Grac+0.031
 Nib+0.030
 Po+0.028
 Ho+0.026
Neg logit contributions
 differences-0.083
 losers-0.077
oom-0.077
 pleasures-0.076
 similarities-0.076
Token had
Token ID550
Feature activation-2.84e-01

Pos logit contributions
 differences+0.009
 pleasures+0.008
 losers+0.008
 similarities+0.008
oom+0.008
Neg logit contributions
 gn-0.004
 called-0.004
 Grac-0.004
ina-0.004
 Nib-0.004
Token worked
Token ID3111
Feature activation-1.18e-01

Pos logit contributions
 differences+0.010
 pleasures+0.009
affe+0.009
oom+0.009
 losers+0.009
Neg logit contributions
 gn-0.005
 Grac-0.005
 Nib-0.005
 called-0.005
 Po-0.005
Token hard
Token ID1327
Feature activation+4.15e-01

Pos logit contributions
 differences+0.004
 losers+0.004
affe+0.003
 pleasures+0.003
 similarities+0.003
Neg logit contributions
 Grac-0.002
 gn-0.002
 Nib-0.002
 Po-0.002
ina-0.002
Token and
Token ID290
Feature activation-1.84e-01

Pos logit contributions
 gn+0.010
 Grac+0.010
 Nib+0.010
 Po+0.009
 Maur+0.009
Neg logit contributions
 differences-0.012
 losers-0.011
 pleasures-0.011
oom-0.010
 similarities-0.010
Token right
Token ID826
Feature activation+1.38e-01

Pos logit contributions
 gn+0.011
 Grac+0.011
 Nib+0.011
 Po+0.010
ina+0.010
Neg logit contributions
 differences-0.014
 losers-0.013
 pleasures-0.012
 similarities-0.012
oom-0.012
Token away
Token ID1497
Feature activation+9.15e-01

Pos logit contributions
 gn+0.004
 Grac+0.004
 Nib+0.004
 Po+0.004
 Maur+0.003
Neg logit contributions
 differences-0.003
 losers-0.003
 pleasures-0.003
oom-0.003
 similarities-0.003
Token.
Token ID13
Feature activation+6.37e-01

Pos logit contributions
 Grac+0.020
 gn+0.019
ina+0.018
 Po+0.018
 Nib+0.018
Neg logit contributions
 differences-0.027
 losers-0.026
 similarities-0.025
 pleasures-0.025
affe-0.025
Token\n
Token ID198
Feature activation+7.37e-01

Pos logit contributions
 gn+0.014
 Nib+0.013
 Grac+0.012
ina+0.012
 blo+0.012
Neg logit contributions
 differences-0.019
 losers-0.018
 pleasures-0.018
oom-0.017
 similarities-0.017
Token\n
Token ID198
Feature activation+3.45e-01

Pos logit contributions
 gn+0.017
 Nib+0.017
 Grac+0.016
 Po+0.016
icol+0.015
Neg logit contributions
 differences-0.020
Everyone-0.019
oom-0.019
 pleasures-0.018
 losers-0.018
TokenEm
Token ID10161
Feature activation+2.22e+00

Pos logit contributions
 Grac+0.007
 gn+0.007
ina+0.007
 Po+0.007
 Nib+0.007
Neg logit contributions
 differences-0.010
 losers-0.010
affe-0.009
 winds-0.009
 similarities-0.009
Tokenma
Token ID2611
Feature activation+1.31e-01

Pos logit contributions
 gn+0.062
 Nib+0.061
 Grac+0.061
 Po+0.058
 Maur+0.057
Neg logit contributions
 differences-0.053
 losers-0.048
 pleasures-0.047
oom-0.046
 similarities-0.046
Token spent
Token ID3377
Feature activation+1.21e-01

Pos logit contributions
 gn+0.002
 Grac+0.002
ina+0.002
 Nib+0.002
 called+0.002
Neg logit contributions
 differences-0.004
 pleasures-0.004
 losers-0.004
 similarities-0.004
 laws-0.004
Token hours
Token ID2250
Feature activation-4.33e-02

Pos logit contributions
 Grac+0.003
 gn+0.003
 Po+0.003
 Nib+0.003
 called+0.003
Neg logit contributions
 differences-0.003
 losers-0.003
affe-0.003
 pleasures-0.003
oom-0.003
Token gazing
Token ID50234
Feature activation-6.27e-02

Pos logit contributions
 differences+0.002
 pleasures+0.002
 similarities+0.002
affe+0.001
 losers+0.001
Neg logit contributions
 gn-0.001
 Grac-0.001
 imagining-0.001
 called-0.001
 searching-0.001
Token at
Token ID379
Feature activation-3.94e-01

Pos logit contributions
 differences+0.002
 losers+0.002
 similarities+0.002
 pleasures+0.002
affe+0.002
Neg logit contributions
 Grac-0.001
 Po-0.001
ina-0.001
 gn-0.001
 Nib-0.001
Token\n
Token ID198
Feature activation+1.41e-01

Pos logit contributions
 gn+0.017
 Nib+0.016
 Grac+0.016
icol+0.015
 blo+0.015
Neg logit contributions
 differences-0.016
oom-0.015
Everyone-0.015
 pleasures-0.014
affe-0.014
Token"
Token ID1
Feature activation+8.06e-01

Pos logit contributions
 gn+0.003
 Grac+0.003
 Nib+0.003
 Po+0.003
ina+0.003
Neg logit contributions
 differences-0.004
 losers-0.004
 pleasures-0.004
oom-0.004
 similarities-0.004
TokenHello
Token ID15496
Feature activation+5.90e-01

Pos logit contributions
 Grac+0.017
 Nib+0.017
 gn+0.017
 Po+0.016
ina+0.016
Neg logit contributions
 differences-0.023
 losers-0.022
 pleasures-0.022
 similarities-0.021
Everyone-0.021
Token,
Token ID11
Feature activation+5.65e-01

Pos logit contributions
 Grac+0.011
 gn+0.011
 Nib+0.011
 Po+0.011
ina+0.010
Neg logit contributions
 differences-0.020
 losers-0.018
 pleasures-0.018
 similarities-0.018
 laws-0.017
Token little
Token ID1310
Feature activation+1.52e+00

Pos logit contributions
 Grac+0.018
 gn+0.018
 Nib+0.018
 Po+0.017
ina+0.017
Neg logit contributions
 differences-0.012
 pleasures-0.011
 losers-0.010
 similarities-0.010
oom-0.010
Token girl
Token ID2576
Feature activation+2.22e+00

Pos logit contributions
 Grac+0.039
 gn+0.039
 Po+0.038
 Nib+0.037
 Ho+0.036
Neg logit contributions
 differences-0.040
 losers-0.039
 similarities-0.037
 pleasures-0.036
affe-0.035
Token,"
Token ID553
Feature activation+8.19e-01

Pos logit contributions
 Grac+0.054
 gn+0.054
 Nib+0.053
 Po+0.051
 Maur+0.050
Neg logit contributions
 differences-0.057
 losers-0.054
 pleasures-0.051
affe-0.051
oom-0.051
Token said
Token ID531
Feature activation-7.23e-01

Pos logit contributions
 gn+0.013
 Grac+0.013
 Nib+0.013
 Po+0.012
ina+0.011
Neg logit contributions
 differences-0.029
 losers-0.028
 pleasures-0.027
oom-0.027
 similarities-0.026
Token the
Token ID262
Feature activation+1.43e-01

Pos logit contributions
 differences+0.022
 pleasures+0.021
 losers+0.019
 similarities+0.019
affe+0.019
Neg logit contributions
 Grac-0.018
 Nib-0.018
 Po-0.017
 gn-0.017
 Ho-0.016
Token spirit
Token ID4437
Feature activation-4.32e-03

Pos logit contributions
 gn+0.005
 Grac+0.005
 Po+0.005
 Nib+0.005
ina+0.004
Neg logit contributions
 differences-0.003
 losers-0.002
 pleasures-0.002
 similarities-0.002
oom-0.002
Token.
Token ID13
Feature activation+3.80e-01

Pos logit contributions
 differences+0.000
 losers+0.000
 pleasures+0.000
 similarities+0.000
oom+0.000
Neg logit contributions
 gn-0.000
 Grac-0.000
 Nib-0.000
 called-0.000
 Po-0.000
Token
Token ID26391
Feature activation+1.58e-01

Pos logit contributions
ina+0.001
 called+0.001
 gn+0.001
 Grac+0.001
 Gab+0.001
Neg logit contributions
 missiles-0.002
 differences-0.002
 losers-0.002
 laws-0.002
 sides-0.002
Token
Token ID220
Feature activation-2.43e-01

Pos logit contributions
 gn+0.003
 Grac+0.003
 Po+0.003
 Nib+0.003
ina+0.003
Neg logit contributions
 differences-0.005
 losers-0.005
 matter-0.005
 pleasures-0.005
 similarities-0.005
Token\n
Token ID198
Feature activation+1.37e-01

Pos logit contributions
 differences+0.007
 losers+0.007
 pleasures+0.007
 similarities+0.007
affe+0.007
Neg logit contributions
 Grac-0.005
 gn-0.005
 Nib-0.005
 Po-0.005
ina-0.005
Token\n
Token ID198
Feature activation-4.40e-01

Pos logit contributions
 gn+0.003
 Grac+0.003
 Nib+0.003
ina+0.003
 Po+0.003
Neg logit contributions
 differences-0.004
Everyone-0.004
 losers-0.004
 similarities-0.004
 pleasures-0.003
TokenThe
Token ID464
Feature activation+7.17e-01

Pos logit contributions
 differences+0.010
 losers+0.009
 attitude+0.009
 pleasures+0.009
oom+0.009
Neg logit contributions
 Grac-0.013
 gn-0.013
 Nib-0.012
 Po-0.012
ina-0.012
Token younger
Token ID7099
Feature activation+2.22e+00

Pos logit contributions
 Grac+0.018
 gn+0.018
 Po+0.018
 Nib+0.016
ina+0.016
Neg logit contributions
affe-0.019
 differences-0.019
 losers-0.018
oom-0.018
 missiles-0.018
Token sister
Token ID6621
Feature activation+5.84e-01

Pos logit contributions
 Grac+0.051
 gn+0.048
ina+0.047
 Po+0.046
 called+0.041
Neg logit contributions
 differences-0.067
 losers-0.064
 similarities-0.063
ag-0.061
 missiles-0.061
Token smiled
Token ID13541
Feature activation+1.03e-01

Pos logit contributions
 Grac+0.011
 gn+0.011
 Nib+0.010
ina+0.010
 Po+0.010
Neg logit contributions
 differences-0.019
 losers-0.018
 pleasures-0.018
 similarities-0.018
oom-0.017
Token and
Token ID290
Feature activation+4.55e-01

Pos logit contributions
 gn+0.002
 Nib+0.002
 Grac+0.002
 Po+0.002
 Maur+0.002
Neg logit contributions
 differences-0.003
 losers-0.003
 pleasures-0.002
oom-0.002
 similarities-0.002
Token said
Token ID531
Feature activation-9.08e-02

Pos logit contributions
 gn+0.008
 Grac+0.008
 Nib+0.007
 Po+0.007
ina+0.007
Neg logit contributions
 differences-0.015
 losers-0.015
 pleasures-0.014
 similarities-0.014
oom-0.014
Token,
Token ID11
Feature activation-2.31e-01

Pos logit contributions
 differences+0.003
 losers+0.003
 pleasures+0.003
 similarities+0.003
 missiles+0.003
Neg logit contributions
 Grac-0.002
 gn-0.002
 Nib-0.002
 Po-0.002
 Maur-0.001
Token"
Token ID1
Feature activation+6.98e-01

Pos logit contributions
 differences+0.002
 losers+0.002
 pleasures+0.002
 similarities+0.002
oom+0.002
Neg logit contributions
 gn-0.002
 Grac-0.002
 Nib-0.002
 Po-0.002
 Maur-0.002
TokenHi
Token ID17250
Feature activation+7.32e-01

Pos logit contributions
 gn+0.014
 Grac+0.014
 Nib+0.014
ina+0.013
 Po+0.013
Neg logit contributions
 differences-0.021
 losers-0.020
 pleasures-0.019
 similarities-0.019
oom-0.019
Token there
Token ID612
Feature activation+2.85e-01

Pos logit contributions
 gn+0.017
 Grac+0.017
 Nib+0.016
ina+0.016
 Po+0.016
Neg logit contributions
 differences-0.020
 losers-0.019
oom-0.018
 pleasures-0.018
affe-0.018
Token,
Token ID11
Feature activation+4.90e-01

Pos logit contributions
 gn+0.006
 Grac+0.006
 Nib+0.006
 Po+0.006
ina+0.006
Neg logit contributions
 differences-0.008
 losers-0.008
 pleasures-0.007
oom-0.007
 similarities-0.007
Token little
Token ID1310
Feature activation+1.52e+00

Pos logit contributions
 Grac+0.011
 gn+0.011
 Nib+0.011
 Po+0.011
 Ho+0.010
Neg logit contributions
 differences-0.014
 pleasures-0.013
 losers-0.013
oom-0.013
 similarities-0.013
Token girl
Token ID2576
Feature activation+2.22e+00

Pos logit contributions
 gn+0.038
 Grac+0.038
 Nib+0.036
 Po+0.036
 blo+0.034
Neg logit contributions
 differences-0.042
 losers-0.039
 similarities-0.038
 pleasures-0.037
 missiles-0.036
Token!
Token ID0
Feature activation-8.14e-01

Pos logit contributions
 Grac+0.058
 Nib+0.056
 gn+0.056
 Maur+0.054
 Po+0.053
Neg logit contributions
 differences-0.050
 losers-0.048
affe-0.048
oom-0.047
 pleasures-0.045
Token Do
Token ID2141
Feature activation+6.59e-01

Pos logit contributions
 differences+0.026
 losers+0.024
 pleasures+0.024
 similarities+0.023
oom+0.023
Neg logit contributions
 gn-0.016
 Grac-0.015
 Nib-0.015
 Po-0.014
ina-0.014
Token you
Token ID345
Feature activation+4.00e-02

Pos logit contributions
 Po+0.016
 Nib+0.016
 gn+0.016
 Grac+0.016
 Ho+0.015
Neg logit contributions
 differences-0.017
 pleasures-0.016
 losers-0.016
oom-0.015
 winds-0.015
Token want
Token ID765
Feature activation+5.28e-01

Pos logit contributions
 gn+0.001
 Grac+0.001
 Nib+0.001
 Po+0.001
ina+0.001
Neg logit contributions
 differences-0.001
 losers-0.001
 pleasures-0.001
oom-0.001
 similarities-0.001
Token to
Token ID284
Feature activation+2.91e-01

Pos logit contributions
 gn+0.013
 Grac+0.013
 Nib+0.013
 Po+0.012
ina+0.012
Neg logit contributions
 differences-0.014
 losers-0.013
 pleasures-0.012
 similarities-0.012
 laws-0.012
Token giving
Token ID3501
Feature activation-3.73e-01

Pos logit contributions
 differences+0.017
 losers+0.016
 pleasures+0.015
oom+0.015
 similarities+0.015
Neg logit contributions
 gn-0.012
 Grac-0.012
 Nib-0.012
 Po-0.011
ina-0.011
Token him
Token ID683
Feature activation+8.62e-03

Pos logit contributions
 differences+0.010
 losers+0.010
 pleasures+0.009
 similarities+0.009
oom+0.009
Neg logit contributions
 Grac-0.009
 Nib-0.009
 gn-0.009
 Po-0.009
 Ho-0.008
Token too
Token ID1165
Feature activation-7.16e-01

Pos logit contributions
 Nib+0.000
 gn+0.000
 Grac+0.000
 Po+0.000
 Ho+0.000
Neg logit contributions
 differences-0.000
 losers-0.000
affe-0.000
 similarities-0.000
oom-0.000
Token many
Token ID867
Feature activation-7.80e-01

Pos logit contributions
 differences+0.024
 losers+0.023
folk+0.022
 similarities+0.022
affe+0.022
Neg logit contributions
 gn-0.012
 blo-0.012
 called-0.011
 Grac-0.011
 Nib-0.011
Token ch
Token ID442
Feature activation-1.16e+00

Pos logit contributions
oom+0.020
 differences+0.020
 losers+0.019
 similarities+0.019
affe+0.018
Neg logit contributions
 gn-0.019
 Nib-0.018
 Grac-0.018
 blo-0.018
 called-0.018
Tokenocol
Token ID4668
Feature activation-2.78e+00

Pos logit contributions
 differences+0.025
 pleasures+0.024
 losers+0.024
Everyone+0.024
 similarities+0.024
Neg logit contributions
 Nib-0.032
ina-0.031
 Po-0.031
 gn-0.031
 Grac-0.030
Tokenates
Token ID689
Feature activation-6.81e-01

Pos logit contributions
 differences+0.083
 losers+0.079
 pleasures+0.076
 similarities+0.075
 laws+0.074
Neg logit contributions
 gn-0.057
 Nib-0.057
 Grac-0.056
ina-0.054
 Po-0.053
Token."
Token ID526
Feature activation+6.04e-01

Pos logit contributions
 differences+0.019
 losers+0.017
 similarities+0.017
 pleasures+0.017
oom+0.016
Neg logit contributions
 gn-0.016
 Grac-0.016
 Nib-0.016
 Po-0.015
ina-0.015
Token The
Token ID383
Feature activation+1.98e-01

Pos logit contributions
 gn+0.013
 Grac+0.012
 Nib+0.012
ina+0.011
 blo+0.011
Neg logit contributions
 differences-0.018
 losers-0.017
affe-0.016
 pleasures-0.016
 similarities-0.016
Token king
Token ID5822
Feature activation-7.45e-01

Pos logit contributions
 gn+0.006
 Nib+0.006
 Grac+0.006
 Po+0.006
 Ho+0.005
Neg logit contributions
 differences-0.005
 losers-0.004
 pleasures-0.004
oom-0.004
 similarities-0.004
Token learned
Token ID4499
Feature activation+8.35e-01

Pos logit contributions
 differences+0.025
 pleasures+0.023
 losers+0.023
 similarities+0.022
 laws+0.022
Neg logit contributions
 gn-0.014
 Grac-0.014
 Nib-0.013
ina-0.013
 called-0.013
Tokenoll
Token ID692
Feature activation+5.72e-03

Pos logit contributions
 Grac+0.002
 gn+0.002
 Nib+0.002
 Po+0.001
ina+0.001
Neg logit contributions
 differences-0.004
 losers-0.004
 pleasures-0.004
affe-0.004
oom-0.004
Tokenip
Token ID541
Feature activation-7.88e-02

Pos logit contributions
 gn+0.000
 Grac+0.000
 Nib+0.000
 Po+0.000
ina+0.000
Neg logit contributions
 differences-0.000
 losers-0.000
 pleasures-0.000
 similarities-0.000
 laws-0.000
Tokenops
Token ID2840
Feature activation-4.98e-01

Pos logit contributions
 differences+0.002
 losers+0.002
 similarities+0.002
 pleasures+0.002
oom+0.002
Neg logit contributions
 gn-0.002
 Nib-0.002
 Grac-0.002
 Po-0.002
ina-0.002
Token,
Token ID11
Feature activation-2.25e-01

Pos logit contributions
 differences+0.016
 losers+0.014
 pleasures+0.014
 similarities+0.014
oom+0.014
Neg logit contributions
 gn-0.010
 Grac-0.010
 Nib-0.010
 Po-0.009
ina-0.009
Token ch
Token ID442
Feature activation-6.08e-01

Pos logit contributions
 differences+0.007
 losers+0.006
 pleasures+0.006
 similarities+0.006
oom+0.006
Neg logit contributions
 gn-0.005
 Grac-0.005
 Nib-0.005
 Po-0.005
ina-0.004
Tokenocol
Token ID4668
Feature activation-2.56e+00

Pos logit contributions
 differences+0.014
 pleasures+0.013
 similarities+0.012
 laws+0.012
 law+0.012
Neg logit contributions
 Nib-0.018
 Po-0.018
 gn-0.018
 Grac-0.017
 Ho-0.017
Tokenates
Token ID689
Feature activation-6.05e-01

Pos logit contributions
 differences+0.064
 losers+0.060
oom+0.057
 pleasures+0.057
affe+0.056
Neg logit contributions
 Grac-0.068
 gn-0.067
 Nib-0.064
 Po-0.063
 Maur-0.060
Token,
Token ID11
Feature activation-2.20e-01

Pos logit contributions
 differences+0.017
 losers+0.016
 pleasures+0.016
 similarities+0.016
oom+0.015
Neg logit contributions
 gn-0.013
 Grac-0.013
 Nib-0.013
 Po-0.012
ina-0.012
Token g
Token ID308
Feature activation+4.87e-01

Pos logit contributions
 differences+0.006
 losers+0.006
 pleasures+0.006
 similarities+0.005
affe+0.005
Neg logit contributions
 Grac-0.005
 gn-0.005
 Nib-0.005
ina-0.005
 Po-0.005
Tokenummies
Token ID39578
Feature activation-7.85e-01

Pos logit contributions
 gn+0.012
 Grac+0.012
 Nib+0.012
 Po+0.011
ina+0.011
Neg logit contributions
 differences-0.013
 losers-0.012
 pleasures-0.012
 similarities-0.011
 laws-0.011
Token,
Token ID11
Feature activation-2.20e-01

Pos logit contributions
 differences+0.024
 losers+0.022
 pleasures+0.022
 similarities+0.022
oom+0.021
Neg logit contributions
 gn-0.017
 Grac-0.016
 Nib-0.016
 Po-0.015
ina-0.015
Token "
Token ID366
Feature activation+5.73e-01

Pos logit contributions
 differences+0.010
 losers+0.009
 pleasures+0.009
oom+0.009
 similarities+0.009
Neg logit contributions
 gn-0.009
 Grac-0.009
 Nib-0.009
 Po-0.009
ina-0.008
TokenI
Token ID40
Feature activation+6.27e-01

Pos logit contributions
 gn+0.011
 Grac+0.011
 Nib+0.011
 Po+0.010
 Gab+0.010
Neg logit contributions
 differences-0.018
oom-0.017
 attitude-0.017
 missiles-0.017
 losers-0.017
Token will
Token ID481
Feature activation-3.56e-01

Pos logit contributions
 gn+0.014
 Grac+0.014
 Nib+0.014
 Po+0.013
ina+0.013
Neg logit contributions
 differences-0.018
 losers-0.017
 pleasures-0.016
oom-0.016
 similarities-0.016
Token buy
Token ID2822
Feature activation-7.85e-02

Pos logit contributions
 differences+0.009
 losers+0.008
 pleasures+0.008
 similarities+0.008
oom+0.008
Neg logit contributions
 gn-0.009
 Grac-0.009
 Nib-0.009
 Po-0.009
ina-0.009
Token more
Token ID517
Feature activation-6.08e-01

Pos logit contributions
 differences+0.002
 pleasures+0.002
oom+0.002
 attitude+0.002
 similarities+0.002
Neg logit contributions
 Nib-0.002
 Grac-0.002
 Po-0.002
 gn-0.002
 Ho-0.002
Token av
Token ID1196
Feature activation-2.52e+00

Pos logit contributions
 differences+0.015
 losers+0.014
oom+0.014
 similarities+0.013
 winds+0.013
Neg logit contributions
 gn-0.016
 Nib-0.016
 Grac-0.016
 called-0.015
 blo-0.015
Tokenoc
Token ID420
Feature activation-1.56e+00

Pos logit contributions
 differences+0.069
 losers+0.063
 similarities+0.061
 pleasures+0.060
 attitude+0.059
Neg logit contributions
 gn-0.061
 Nib-0.059
 Grac-0.058
 Po-0.055
 Ho-0.054
Tokenados
Token ID22484
Feature activation-7.30e-01

Pos logit contributions
 thrill+0.046
 differences+0.045
 matter+0.043
 similarities+0.042
oom+0.042
Neg logit contributions
 gn-0.032
 Nib-0.031
us-0.027
 Grac-0.027
 Ho-0.027
Token tomorrow
Token ID9439
Feature activation-1.55e-01

Pos logit contributions
 differences+0.020
 losers+0.019
 pleasures+0.019
 similarities+0.018
oom+0.018
Neg logit contributions
 gn-0.017
 Grac-0.017
 Nib-0.016
 Po-0.016
ina-0.015
Token.
Token ID13
Feature activation-8.92e-01

Pos logit contributions
 differences+0.004
 losers+0.004
 pleasures+0.004
 similarities+0.004
oom+0.004
Neg logit contributions
 gn-0.004
 Grac-0.004
 Nib-0.004
 Po-0.004
ina-0.004
Token You
Token ID921
Feature activation+1.39e-01

Pos logit contributions
 differences+0.025
 losers+0.023
 pleasures+0.023
oom+0.023
 similarities+0.022
Neg logit contributions
 gn-0.020
 Grac-0.020
 Nib-0.020
 Po-0.019
 Maur-0.018
Token sad
Token ID6507
Feature activation-5.66e-01

Pos logit contributions
 differences+0.015
 losers+0.014
 law+0.013
 team+0.013
 missiles+0.013
Neg logit contributions
 gn-0.013
 Nib-0.013
 Grac-0.013
icol-0.012
 Maur-0.012
Token,
Token ID11
Feature activation-6.07e-01

Pos logit contributions
 differences+0.021
 losers+0.020
 pleasures+0.020
 similarities+0.019
oom+0.019
Neg logit contributions
 gn-0.008
 Grac-0.008
 Nib-0.008
 Po-0.007
ina-0.007
Token Mr
Token ID1770
Feature activation+7.54e-02

Pos logit contributions
 differences+0.017
 pleasures+0.017
oom+0.016
 similarities+0.015
ag+0.014
Neg logit contributions
 gn-0.015
 Po-0.015
 Nib-0.015
 Grac-0.015
 called-0.015
Token.
Token ID13
Feature activation+8.16e-01

Pos logit contributions
 gn+0.002
 Nib+0.002
 Grac+0.002
 Po+0.002
ina+0.002
Neg logit contributions
 differences-0.002
 losers-0.002
 pleasures-0.002
 similarities-0.002
oom-0.002
Token Cro
Token ID9325
Feature activation-4.68e-01

Pos logit contributions
 gn+0.018
 Nib+0.018
 Grac+0.017
 Po+0.017
ina+0.016
Neg logit contributions
 differences-0.025
 pleasures-0.024
 losers-0.023
 similarities-0.023
oom-0.022
Tokencod
Token ID19815
Feature activation-2.45e+00

Pos logit contributions
 differences+0.012
 pleasures+0.012
 winds+0.011
adoes+0.011
 losers+0.011
Neg logit contributions
ina-0.011
 gn-0.011
 Nib-0.011
 called-0.011
us-0.010
Tokenile
Token ID576
Feature activation-8.24e-01

Pos logit contributions
 differences+0.055
 efforts+0.055
 pleasures+0.054
oom+0.053
 losers+0.053
Neg logit contributions
ina-0.064
 called-0.059
 blo-0.058
 gn-0.057
 Nib-0.056
Token?"
Token ID1701
Feature activation+4.55e-01

Pos logit contributions
 differences+0.030
 pleasures+0.028
 losers+0.028
 similarities+0.028
oom+0.027
Neg logit contributions
 gn-0.012
 Grac-0.012
 Nib-0.012
 Po-0.011
ina-0.011
Token The
Token ID383
Feature activation-4.41e-01

Pos logit contributions
 gn+0.009
 Grac+0.009
 Nib+0.009
ina+0.008
 Po+0.008
Neg logit contributions
 differences-0.014
 losers-0.013
 similarities-0.012
 pleasures-0.012
affe-0.012
Token crocod
Token ID37565
Feature activation-1.02e+00

Pos logit contributions
 differences+0.010
 pleasures+0.009
 losers+0.009
oom+0.009
 similarities+0.009
Neg logit contributions
 gn-0.013
 Nib-0.013
 Grac-0.012
 Po-0.012
 Ho-0.012
Tokenile
Token ID576
Feature activation-7.20e-01

Pos logit contributions
 differences+0.022
 pleasures+0.021
 similarities+0.019
 losers+0.019
 efforts+0.019
Neg logit contributions
 gn-0.030
 Nib-0.029
 called-0.028
ina-0.028
 Po-0.028
Token reward
Token ID6721
Feature activation+6.97e-01

Pos logit contributions
 differences+0.002
 losers+0.001
 similarities+0.001
 pleasures+0.001
oom+0.001
Neg logit contributions
 gn-0.002
 Grac-0.002
 Nib-0.002
 Po-0.002
ina-0.002
Token.
Token ID13
Feature activation+1.33e-01

Pos logit contributions
 Grac+0.015
 gn+0.015
 Nib+0.015
 Po+0.014
ina+0.014
Neg logit contributions
 differences-0.020
 losers-0.019
 pleasures-0.018
 similarities-0.018
oom-0.018
Token It
Token ID632
Feature activation+3.38e-01

Pos logit contributions
 gn+0.003
 Grac+0.003
 Nib+0.003
 Po+0.003
ina+0.003
Neg logit contributions
 differences-0.004
 losers-0.004
 pleasures-0.004
 similarities-0.003
oom-0.003
Token was
Token ID373
Feature activation+3.89e-01

Pos logit contributions
 gn+0.007
 Grac+0.007
 Nib+0.007
 Po+0.007
ina+0.007
Neg logit contributions
 differences-0.010
 losers-0.009
 pleasures-0.009
 similarities-0.009
oom-0.009
Token ch
Token ID442
Feature activation-7.08e-01

Pos logit contributions
 gn+0.008
 Grac+0.008
 Nib+0.008
ina+0.007
 Po+0.007
Neg logit contributions
 differences-0.012
 losers-0.011
 pleasures-0.011
 similarities-0.011
 laws-0.011
Tokenocol
Token ID4668
Feature activation-2.44e+00

Pos logit contributions
 differences+0.016
 similarities+0.016
 pleasures+0.016
 laws+0.015
 law+0.015
Neg logit contributions
 Po-0.020
 Grac-0.020
ina-0.018
 Nib-0.018
 gn-0.018
Tokenates
Token ID689
Feature activation-4.78e-01

Pos logit contributions
 differences+0.068
 losers+0.063
 pleasures+0.062
oom+0.060
 similarities+0.060
Neg logit contributions
 gn-0.057
 Grac-0.056
 Nib-0.055
 Po-0.053
ina-0.051
Token and
Token ID290
Feature activation+8.39e-02

Pos logit contributions
 differences+0.014
 losers+0.013
 pleasures+0.012
 similarities+0.012
oom+0.012
Neg logit contributions
 gn-0.011
 Grac-0.011
 Nib-0.010
 Po-0.010
ina-0.010
Token sweets
Token ID42402
Feature activation+9.31e-03

Pos logit contributions
 gn+0.002
 Grac+0.002
 Nib+0.002
 Po+0.002
ina+0.002
Neg logit contributions
 differences-0.002
 losers-0.002
 pleasures-0.002
oom-0.002
 similarities-0.002
Token!
Token ID0
Feature activation+3.42e-01

Pos logit contributions
 Grac+0.000
 Nib+0.000
 gn+0.000
 Po+0.000
 Maur+0.000
Neg logit contributions
 differences-0.000
 losers-0.000
 pleasures-0.000
affe-0.000
oom-0.000
Token
Token ID198
Feature activation+1.08e-01

Pos logit contributions
 Grac+0.008
 gn+0.007
 Nib+0.007
 Po+0.007
 Ho+0.007
Neg logit contributions
 differences-0.010
 losers-0.010
 pleasures-0.009
 similarities-0.009
oom-0.009
Token old
Token ID1468
Feature activation-7.00e-02

Pos logit contributions
 gn+0.005
ina+0.005
 Grac+0.005
 Po+0.004
 Nib+0.004
Neg logit contributions
 differences-0.004
 losers-0.004
 similarities-0.004
 missiles-0.004
Everyone-0.004
Token man
Token ID582
Feature activation-2.53e-01

Pos logit contributions
 differences+0.002
 losers+0.002
 similarities+0.001
oom+0.001
 pleasures+0.001
Neg logit contributions
 gn-0.002
 Grac-0.002
 Nib-0.002
ina-0.002
 Po-0.002
Token who
Token ID508
Feature activation-4.30e-01

Pos logit contributions
 differences+0.008
 losers+0.008
 similarities+0.008
 pleasures+0.008
oom+0.007
Neg logit contributions
 gn-0.005
 called-0.005
 Nib-0.005
 Grac-0.005
ina-0.005
Token was
Token ID373
Feature activation+2.61e-01

Pos logit contributions
 differences+0.015
 pleasures+0.014
 losers+0.014
 similarities+0.014
oom+0.014
Neg logit contributions
 called-0.007
 gn-0.007
ina-0.007
 Grac-0.007
 tucked-0.006
Token sun
Token ID4252
Feature activation-7.04e-01

Pos logit contributions
 gn+0.004
 Grac+0.004
 called+0.004
 Nib+0.004
 blo+0.004
Neg logit contributions
 differences-0.009
oom-0.009
 pleasures-0.009
affe-0.009
 similarities-0.009
Tokenbat
Token ID8664
Feature activation-2.43e+00

Pos logit contributions
 differences+0.022
 losers+0.021
 similarities+0.020
 pleasures+0.020
affe+0.019
Neg logit contributions
ina-0.015
 Grac-0.015
 gn-0.015
 Nib-0.014
 Po-0.014
Tokenhing
Token ID722
Feature activation-3.97e-01

Pos logit contributions
 differences+0.067
 similarities+0.063
 losers+0.061
 pleasures+0.060
 laws+0.058
Neg logit contributions
ina-0.060
 Grac-0.058
 Nib-0.053
 called-0.053
 gn-0.052
Token in
Token ID287
Feature activation+9.02e-02

Pos logit contributions
 differences+0.011
 losers+0.011
 similarities+0.011
affe+0.010
 pleasures+0.010
Neg logit contributions
 gn-0.009
 Grac-0.009
ina-0.008
 Po-0.008
 Nib-0.008
Token the
Token ID262
Feature activation+2.64e-01

Pos logit contributions
 Grac+0.002
 gn+0.002
 Po+0.002
 Nib+0.002
 called+0.002
Neg logit contributions
 differences-0.003
affe-0.002
 losers-0.002
oom-0.002
 pleasures-0.002
Token shed
Token ID14999
Feature activation+1.90e-01

Pos logit contributions
 gn+0.006
 Po+0.006
 Nib+0.006
 Grac+0.006
 blo+0.005
Neg logit contributions
 differences-0.008
 losers-0.007
 similarities-0.007
 pleasures-0.007
oom-0.007
Token.
Token ID13
Feature activation+2.87e-01

Pos logit contributions
 gn+0.004
 Grac+0.004
 Nib+0.004
ina+0.004
 called+0.004
Neg logit contributions
 differences-0.006
 losers-0.005
 similarities-0.005
 pleasures-0.005
oom-0.005
Token
Token ID198
Feature activation+2.39e-01

Pos logit contributions
 Grac+0.007
 gn+0.007
 Nib+0.007
 Po+0.007
ina+0.006
Neg logit contributions
 differences-0.016
 losers-0.015
 pleasures-0.015
 similarities-0.015
oom-0.015
Token
Token ID198
Feature activation-4.50e-01

Pos logit contributions
 gn+0.005
 Grac+0.005
 Nib+0.005
 Po+0.004
ina+0.004
Neg logit contributions
 differences-0.007
 losers-0.007
oom-0.007
 pleasures-0.007
 similarities-0.007
TokenSo
Token ID2396
Feature activation+8.09e-02

Pos logit contributions
 differences+0.010
 losers+0.009
 pleasures+0.009
 similarities+0.008
oom+0.008
Neg logit contributions
 gn-0.013
 Nib-0.013
 Grac-0.013
 Po-0.013
 Maur-0.012
Token,
Token ID11
Feature activation-1.85e-01

Pos logit contributions
 gn+0.002
 Grac+0.002
 Nib+0.002
ina+0.002
 blo+0.002
Neg logit contributions
 differences-0.002
 losers-0.002
oom-0.002
 similarities-0.002
affe-0.002
Token Stri
Token ID26137
Feature activation-4.83e-01

Pos logit contributions
 differences+0.005
 losers+0.005
oom+0.005
 similarities+0.005
 pleasures+0.005
Neg logit contributions
 gn-0.004
 Grac-0.004
 Nib-0.004
ina-0.004
 Po-0.004
Tokenpe
Token ID431
Feature activation-2.39e+00

Pos logit contributions
 differences+0.011
 pleasures+0.010
 losers+0.010
 similarities+0.009
 attitude+0.009
Neg logit contributions
 gn-0.015
 Nib-0.014
 called-0.014
ina-0.013
 Ho-0.013
Token and
Token ID290
Feature activation-7.51e-02

Pos logit contributions
 differences+0.077
 losers+0.073
 pleasures+0.071
 similarities+0.070
 laws+0.068
Neg logit contributions
 gn-0.046
 Nib-0.043
ina-0.043
 Grac-0.042
 called-0.042
Token his
Token ID465
Feature activation-1.03e-01

Pos logit contributions
 differences+0.002
 pleasures+0.002
 losers+0.002
oom+0.002
 similarities+0.002
Neg logit contributions
 gn-0.002
 Nib-0.002
 Grac-0.002
 Po-0.002
 Ho-0.002
Token friends
Token ID2460
Feature activation-3.40e-01

Pos logit contributions
 differences+0.002
 pleasures+0.002
 losers+0.002
 similarities+0.002
oom+0.002
Neg logit contributions
 gn-0.003
 Nib-0.003
 Po-0.003
 Grac-0.003
 Ho-0.003
Token got
Token ID1392
Feature activation+1.19e-02

Pos logit contributions
 differences+0.013
 pleasures+0.012
 losers+0.012
 similarities+0.012
 laws+0.011
Neg logit contributions
 called-0.005
 gn-0.005
 Grac-0.005
 Nib-0.005
ina-0.005
Token their
Token ID511
Feature activation-2.16e-01

Pos logit contributions
 gn+0.000
 Grac+0.000
 Nib+0.000
 Po+0.000
ina+0.000
Neg logit contributions
 differences-0.000
 losers-0.000
 pleasures-0.000
 similarities-0.000
oom-0.000
Token birds
Token ID10087
Feature activation-1.20e+00

Pos logit contributions
 Po+0.004
 Nib+0.004
 Ho+0.004
 Grac+0.004
 gn+0.004
Neg logit contributions
 differences-0.003
 pleasures-0.003
 losers-0.003
 similarities-0.003
oom-0.003
Token that
Token ID326
Feature activation-6.86e-01

Pos logit contributions
 differences+0.041
 similarities+0.039
 pleasures+0.038
Everyone+0.037
oom+0.037
Neg logit contributions
 Nib-0.021
 Po-0.020
 Ho-0.020
 gn-0.020
 called-0.019
Token are
Token ID389
Feature activation-5.62e-01

Pos logit contributions
 differences+0.020
 losers+0.019
 pleasures+0.018
oom+0.018
 similarities+0.018
Neg logit contributions
 gn-0.015
 Nib-0.015
 Po-0.015
 Grac-0.014
 blo-0.014
Token not
Token ID407
Feature activation-7.68e-01

Pos logit contributions
oom+0.016
 differences+0.016
 pleasures+0.015
affe+0.015
 losers+0.015
Neg logit contributions
 gn-0.012
 Nib-0.012
 Po-0.012
 Grac-0.012
 called-0.012
Token par
Token ID1582
Feature activation-1.26e+00

Pos logit contributions
 differences+0.022
oom+0.021
 pleasures+0.020
 losers+0.020
 similarities+0.020
Neg logit contributions
 Nib-0.017
 gn-0.017
 Grac-0.016
 called-0.016
 Po-0.016
Tokenrots
Token ID24744
Feature activation-2.39e+00

Pos logit contributions
 pleasures+0.041
 matter+0.041
 winds+0.041
 times+0.041
 differences+0.041
Neg logit contributions
 Nib-0.021
ina-0.020
 gn-0.020
 Ho-0.020
 Po-0.020
Token.
Token ID13
Feature activation-4.34e-01

Pos logit contributions
 differences+0.068
 similarities+0.065
 losers+0.063
 pleasures+0.063
oom+0.062
Neg logit contributions
 Nib-0.050
 gn-0.048
 Po-0.048
 Grac-0.048
 called-0.047
Token They
Token ID1119
Feature activation+5.25e-01

Pos logit contributions
 differences+0.012
 losers+0.011
 pleasures+0.010
 similarities+0.010
affe+0.010
Neg logit contributions
 gn-0.011
 Grac-0.010
 Nib-0.010
ina-0.010
 Po-0.010
Token might
Token ID1244
Feature activation+7.47e-01

Pos logit contributions
 Nib+0.011
 gn+0.011
 called+0.010
 Grac+0.010
 Po+0.010
Neg logit contributions
 differences-0.015
 pleasures-0.015
oom-0.015
 losers-0.014
 similarities-0.014
Token bite
Token ID13197
Feature activation-3.94e-01

Pos logit contributions
 Grac+0.022
 gn+0.021
 Nib+0.021
 Po+0.020
ina+0.020
Neg logit contributions
 differences-0.017
 losers-0.016
 pleasures-0.014
 similarities-0.014
 laws-0.014
Token you
Token ID345
Feature activation-4.45e-01

Pos logit contributions
 differences+0.010
 losers+0.010
 pleasures+0.010
 similarities+0.010
oom+0.009
Neg logit contributions
 Nib-0.009
 Grac-0.009
 Po-0.009
 gn-0.009
 Ho-0.009
Token was
Token ID373
Feature activation+5.51e-01

Pos logit contributions
icol+0.010
 imagining+0.010
 Grac+0.010
ina+0.009
 Gab+0.009
Neg logit contributions
 differences-0.007
 losers-0.007
 sides-0.007
 ends-0.007
 tips-0.006
Token a
Token ID257
Feature activation+3.18e-01

Pos logit contributions
ina+0.014
 Grac+0.014
 gn+0.014
 Maur+0.014
icol+0.013
Neg logit contributions
 losers-0.013
 differences-0.013
 wind-0.012
 missiles-0.012
 sides-0.012
Token box
Token ID3091
Feature activation+2.64e-01

Pos logit contributions
 Grac+0.009
ina+0.009
 Nib+0.009
 Maur+0.009
 gn+0.009
Neg logit contributions
 differences-0.006
 losers-0.006
 team-0.006
 law-0.006
 similarities-0.005
Token of
Token ID286
Feature activation-5.40e-02

Pos logit contributions
 Grac+0.007
 Nib+0.007
 gn+0.007
 Po+0.007
 Maur+0.007
Neg logit contributions
 differences-0.006
 losers-0.006
oom-0.005
affe-0.005
 sides-0.005
Token ch
Token ID442
Feature activation-3.31e-01

Pos logit contributions
 differences+0.001
 losers+0.001
 pleasures+0.001
 similarities+0.001
oom+0.001
Neg logit contributions
 gn-0.002
 Grac-0.002
 Nib-0.001
 Po-0.001
ina-0.001
Tokenocol
Token ID4668
Feature activation-2.38e+00

Pos logit contributions
 differences+0.008
 pleasures+0.007
 losers+0.007
 similarities+0.007
 matter+0.007
Neg logit contributions
 Po-0.009
 Nib-0.009
 Grac-0.009
 Ho-0.009
ina-0.008
Tokenates
Token ID689
Feature activation-2.96e-01

Pos logit contributions
 differences+0.064
 losers+0.060
 pleasures+0.058
oom+0.057
 similarities+0.057
Neg logit contributions
 gn-0.057
 Grac-0.057
 Nib-0.056
 Po-0.054
 Maur-0.052
Token.
Token ID13
Feature activation+1.64e-01

Pos logit contributions
 differences+0.007
 losers+0.007
 pleasures+0.007
oom+0.006
affe+0.006
Neg logit contributions
 Grac-0.008
 gn-0.008
 Nib-0.008
 Po-0.007
 Maur-0.007
Token Lily
Token ID20037
Feature activation+3.20e-01

Pos logit contributions
 gn+0.004
 Grac+0.004
ina+0.004
 Nib+0.004
 blo+0.004
Neg logit contributions
 differences-0.004
 losers-0.004
 pleasures-0.003
 similarities-0.003
affe-0.003
Token loved
Token ID6151
Feature activation-4.48e-02

Pos logit contributions
 Nib+0.008
 Grac+0.008
 Po+0.008
 gn+0.007
 Maur+0.007
Neg logit contributions
 differences-0.008
 losers-0.007
 team-0.007
affe-0.007
 similarities-0.007
Token ch
Token ID442
Feature activation-4.43e-01

Pos logit contributions
 differences+0.001
 losers+0.001
 similarities+0.001
 pleasures+0.001
 laws+0.001
Neg logit contributions
 gn-0.001
 Grac-0.001
 Nib-0.001
 Po-0.001
ina-0.001
Token as
Token ID355
Feature activation-8.09e-01

Pos logit contributions
 gn+0.000
 Grac+0.000
 Nib+0.000
 Po+0.000
 called+0.000
Neg logit contributions
 differences-0.000
 losers-0.000
 similarities-0.000
oom-0.000
 pleasures-0.000
Token he
Token ID339
Feature activation-1.31e-01

Pos logit contributions
 differences+0.027
 losers+0.025
 pleasures+0.025
 similarities+0.024
affe+0.024
Neg logit contributions
 gn-0.016
 Grac-0.015
 Po-0.015
 Nib-0.015
ina-0.014
Token opened
Token ID4721
Feature activation+1.24e-01

Pos logit contributions
 differences+0.004
 pleasures+0.004
 losers+0.004
 similarities+0.004
oom+0.004
Neg logit contributions
 gn-0.003
 Grac-0.002
 Nib-0.002
 called-0.002
ina-0.002
Token the
Token ID262
Feature activation+3.13e-01

Pos logit contributions
 Grac+0.003
 gn+0.003
 Nib+0.003
 Po+0.003
 Ho+0.003
Neg logit contributions
 differences-0.004
 losers-0.003
affe-0.003
 pleasures-0.003
 similarities-0.003
Token ch
Token ID442
Feature activation-4.56e-01

Pos logit contributions
 gn+0.007
 Grac+0.007
 Nib+0.007
 Po+0.007
 blo+0.006
Neg logit contributions
 differences-0.009
 losers-0.008
 pleasures-0.008
oom-0.008
 similarities-0.008
Tokenocol
Token ID4668
Feature activation-2.33e+00

Pos logit contributions
 similarities+0.011
 differences+0.010
 missiles+0.010
Tom+0.010
 pleasures+0.010
Neg logit contributions
ina-0.011
 gn-0.011
 Grac-0.011
 Po-0.010
 blo-0.010
Tokenates
Token ID689
Feature activation+1.68e-01

Pos logit contributions
 differences+0.069
 losers+0.065
 pleasures+0.062
 similarities+0.061
oom+0.061
Neg logit contributions
 gn-0.050
 Grac-0.049
 Nib-0.049
 Po-0.047
ina-0.046
Token and
Token ID290
Feature activation+1.07e-01

Pos logit contributions
 gn+0.004
 Grac+0.004
 Nib+0.004
ina+0.003
 Po+0.003
Neg logit contributions
 differences-0.005
 losers-0.005
 similarities-0.005
 pleasures-0.005
oom-0.004
Token started
Token ID2067
Feature activation+5.40e-01

Pos logit contributions
 gn+0.002
 Grac+0.002
 Nib+0.002
 Po+0.002
 Ho+0.002
Neg logit contributions
 differences-0.003
 losers-0.003
 pleasures-0.003
oom-0.003
 similarities-0.003
Token eating
Token ID6600
Feature activation+5.62e-01

Pos logit contributions
 gn+0.011
 Grac+0.010
 Nib+0.010
 Po+0.010
ina+0.009
Neg logit contributions
 differences-0.017
 losers-0.017
 pleasures-0.016
affe-0.016
oom-0.016
Token them
Token ID606
Feature activation+5.73e-01

Pos logit contributions
 gn+0.014
 Nib+0.014
 Grac+0.013
 Po+0.013
 Ho+0.012
Neg logit contributions
 differences-0.015
 losers-0.014
oom-0.014
 pleasures-0.014
affe-0.014
Token big
Token ID1263
Feature activation+1.14e-01

Pos logit contributions
 differences+0.016
 pleasures+0.014
 losers+0.014
 similarities+0.014
 competitors+0.013
Neg logit contributions
 called-0.012
 gn-0.012
 very-0.012
 Nib-0.011
 Gab-0.011
Token.
Token ID13
Feature activation-1.41e-01

Pos logit contributions
 Grac+0.003
 gn+0.003
 Nib+0.002
 Po+0.002
 Maur+0.002
Neg logit contributions
 differences-0.003
 losers-0.003
 pleasures-0.003
oom-0.003
 similarities-0.003
Token
Token ID198
Feature activation+3.01e-01

Pos logit contributions
 differences+0.005
 pleasures+0.005
 losers+0.005
 similarities+0.005
 efforts+0.004
Neg logit contributions
 Nib-0.003
 gn-0.002
 Grac-0.002
 called-0.002
 Ho-0.002
Token The
Token ID383
Feature activation+3.85e-01

Pos logit contributions
 gn+0.005
 Grac+0.005
 Nib+0.005
 Po+0.005
ina+0.004
Neg logit contributions
 differences-0.010
 losers-0.010
 pleasures-0.010
 similarities-0.009
oom-0.009
Token wood
Token ID4898
Feature activation-3.76e-01

Pos logit contributions
 gn+0.010
 Nib+0.009
 called+0.009
 Grac+0.008
 imagining+0.008
Neg logit contributions
 differences-0.011
 losers-0.010
oom-0.010
 pleasures-0.010
 similarities-0.010
Tokenpe
Token ID431
Feature activation-2.32e+00

Pos logit contributions
 differences+0.013
 losers+0.013
 pleasures+0.013
 similarities+0.012
 goals+0.012
Neg logit contributions
 called-0.007
 gn-0.007
ina-0.006
 Nib-0.006
 Grac-0.006
Tokencker
Token ID15280
Feature activation-5.90e-01

Pos logit contributions
 differences+0.063
 losers+0.062
 pleasures+0.058
 similarities+0.056
 laws+0.056
Neg logit contributions
ina-0.056
 called-0.056
 gn-0.054
 Nib-0.052
 watered-0.049
Token was
Token ID373
Feature activation+2.78e-01

Pos logit contributions
 differences+0.023
 pleasures+0.023
 losers+0.022
 similarities+0.022
oom+0.021
Neg logit contributions
 called-0.009
 gn-0.008
 Nib-0.007
 Grac-0.007
 tucked-0.006
Token sad
Token ID6507
Feature activation+1.29e+00

Pos logit contributions
 gn+0.005
 Nib+0.005
 Grac+0.005
 called+0.005
 Po+0.005
Neg logit contributions
 differences-0.009
 pleasures-0.009
oom-0.008
 similarities-0.008
 losers-0.008
Token.
Token ID13
Feature activation-5.37e-01

Pos logit contributions
 gn+0.034
 Grac+0.033
 Nib+0.032
 Po+0.031
 Maur+0.030
Neg logit contributions
 differences-0.033
 losers-0.030
 pleasures-0.029
 attitude-0.029
oom-0.029
Token Then
Token ID3244
Feature activation+5.58e-01

Pos logit contributions
 differences+0.019
 pleasures+0.019
oom+0.018
 similarities+0.018
 losers+0.018
Neg logit contributions
 Nib-0.009
 gn-0.009
 called-0.009
 Grac-0.008
 Po-0.008
Token."
Token ID526
Feature activation+8.34e-01

Pos logit contributions
 gn+0.002
 Grac+0.002
 Nib+0.002
 Po+0.002
ina+0.002
Neg logit contributions
 differences-0.003
 losers-0.003
 pleasures-0.003
 similarities-0.003
oom-0.003
Token
Token ID198
Feature activation+4.94e-01

Pos logit contributions
 gn+0.018
 Grac+0.018
 blo+0.016
ina+0.016
 Nib+0.016
Neg logit contributions
 differences-0.024
 losers-0.022
affe-0.022
oom-0.022
 similarities-0.022
Token
Token ID198
Feature activation-1.50e-01

Pos logit contributions
 gn+0.012
 Grac+0.011
 Nib+0.011
 blo+0.011
icol+0.011
Neg logit contributions
 differences-0.013
oom-0.012
Everyone-0.012
affe-0.012
 losers-0.012
TokenSt
Token ID1273
Feature activation-5.82e-02

Pos logit contributions
 differences+0.004
 losers+0.003
 pleasures+0.003
 similarities+0.003
oom+0.003
Neg logit contributions
 Grac-0.004
 gn-0.004
 Nib-0.004
 Po-0.004
ina-0.004
Tokenri
Token ID380
Feature activation-7.86e-01

Pos logit contributions
 differences+0.001
 losers+0.001
 pleasures+0.001
 laws+0.001
 similarities+0.001
Neg logit contributions
 Nib-0.002
ina-0.002
 Grac-0.002
 gn-0.002
 Maur-0.001
Tokenpe
Token ID431
Feature activation-2.30e+00

Pos logit contributions
 differences+0.021
 losers+0.019
 pleasures+0.019
 similarities+0.019
 laws+0.018
Neg logit contributions
ina-0.021
 gn-0.020
 Nib-0.019
 called-0.019
 Grac-0.018
Tokeny
Token ID88
Feature activation-8.58e-01

Pos logit contributions
 differences+0.067
 losers+0.064
 pleasures+0.061
 similarities+0.060
 laws+0.059
Neg logit contributions
 gn-0.051
 Nib-0.049
 Grac-0.049
ina-0.049
 Po-0.046
Token felt
Token ID2936
Feature activation+2.96e-01

Pos logit contributions
 differences+0.031
 pleasures+0.029
 losers+0.029
 similarities+0.028
oom+0.028
Neg logit contributions
 gn-0.016
 called-0.014
 Nib-0.014
 Grac-0.014
 Po-0.013
Token sad
Token ID6507
Feature activation+1.44e+00

Pos logit contributions
ina+0.009
 Grac+0.009
 gn+0.008
icol+0.008
 Po+0.008
Neg logit contributions
 differences-0.007
 losers-0.006
 pleasures-0.006
 laws-0.006
 attitude-0.005
Token because
Token ID780
Feature activation-4.04e-01

Pos logit contributions
 gn+0.039
 Grac+0.039
 Nib+0.037
 Maur+0.036
icol+0.036
Neg logit contributions
 differences-0.034
 pleasures-0.030
 losers-0.030
 attitude-0.030
 anymore-0.030
Token he
Token ID339
Feature activation+4.99e-01

Pos logit contributions
 differences+0.015
 pleasures+0.014
oom+0.014
 losers+0.014
 similarities+0.014
Neg logit contributions
 Nib-0.007
 gn-0.007
 called-0.006
 Grac-0.006
 Po-0.006
Token not
Token ID407
Feature activation+9.85e-02

Pos logit contributions
 gn+0.004
 Nib+0.004
 Grac+0.003
 Po+0.003
ina+0.003
Neg logit contributions
 differences-0.005
 losers-0.004
oom-0.004
 missiles-0.004
affe-0.004
Token eat
Token ID4483
Feature activation-2.75e-01

Pos logit contributions
 gn+0.002
 called+0.002
 Nib+0.002
 Po+0.002
 Grac+0.002
Neg logit contributions
 differences-0.003
folk-0.003
 missiles-0.003
 losers-0.003
oom-0.003
Token too
Token ID1165
Feature activation-8.02e-01

Pos logit contributions
 differences+0.007
 losers+0.006
 pleasures+0.006
oom+0.006
affe+0.006
Neg logit contributions
 Nib-0.007
 gn-0.007
 Grac-0.007
 Po-0.007
 Ho-0.006
Token many
Token ID867
Feature activation-2.70e-01

Pos logit contributions
 differences+0.031
 losers+0.029
 similarities+0.029
 pleasures+0.028
oom+0.028
Neg logit contributions
 gn-0.010
 Nib-0.010
 Po-0.010
 called-0.010
 very-0.009
Token ch
Token ID442
Feature activation-8.74e-01

Pos logit contributions
oom+0.006
 differences+0.006
 losers+0.006
affe+0.005
 pleasures+0.005
Neg logit contributions
 gn-0.008
 Nib-0.008
 Grac-0.008
 blo-0.008
 called-0.007
Tokenocol
Token ID4668
Feature activation-2.29e+00

Pos logit contributions
 pleasures+0.020
Everyone+0.019
 losers+0.019
 matter+0.019
 differences+0.019
Neg logit contributions
 Nib-0.024
 gn-0.023
 Po-0.023
 Grac-0.022
 Ho-0.022
Tokenates
Token ID689
Feature activation-6.79e-01

Pos logit contributions
 differences+0.066
 losers+0.062
 pleasures+0.060
 similarities+0.059
oom+0.059
Neg logit contributions
 gn-0.051
 Grac-0.050
 Nib-0.050
 Po-0.047
ina-0.046
Token and
Token ID290
Feature activation-8.34e-03

Pos logit contributions
 differences+0.020
 losers+0.019
 pleasures+0.018
 similarities+0.018
oom+0.018
Neg logit contributions
 gn-0.015
 Grac-0.015
 Nib-0.015
 Po-0.014
ina-0.014
Token his
Token ID465
Feature activation+3.80e-02

Pos logit contributions
 differences+0.000
oom+0.000
 losers+0.000
 pleasures+0.000
affe+0.000
Neg logit contributions
 gn-0.000
 Nib-0.000
 Grac-0.000
 Po-0.000
 Ho-0.000
Token friend
Token ID1545
Feature activation-3.75e-01

Pos logit contributions
 gn+0.001
 Nib+0.001
 Grac+0.001
 Po+0.001
 Ho+0.001
Neg logit contributions
 differences-0.001
 losers-0.001
oom-0.001
 pleasures-0.001
Everyone-0.001
Token learned
Token ID4499
Feature activation+5.70e-01

Pos logit contributions
 differences+0.012
 losers+0.012
 pleasures+0.011
oom+0.011
 similarities+0.011
Neg logit contributions
 gn-0.007
 Nib-0.007
 Grac-0.007
 Po-0.007
 called-0.007
Token noticed
Token ID6810
Feature activation-2.22e-01

Pos logit contributions
 differences+0.009
 losers+0.008
 pleasures+0.008
 similarities+0.008
oom+0.008
Neg logit contributions
 gn-0.006
 Grac-0.005
 Nib-0.005
 Po-0.005
ina-0.005
Token something
Token ID1223
Feature activation+1.06e+00

Pos logit contributions
 differences+0.006
 losers+0.005
 missiles+0.005
 pleasures+0.005
 laws+0.005
Neg logit contributions
 Grac-0.005
 gn-0.005
 Nib-0.005
 Maur-0.005
icol-0.005
Token.
Token ID13
Feature activation-2.11e-01

Pos logit contributions
 Grac+0.032
 gn+0.031
 Nib+0.031
 Po+0.030
ina+0.030
Neg logit contributions
 differences-0.022
 losers-0.020
 pleasures-0.019
 laws-0.018
 similarities-0.018
Token The
Token ID383
Feature activation+2.05e-01

Pos logit contributions
 differences+0.004
affe+0.003
 losers+0.003
 pleasures+0.003
 similarities+0.003
Neg logit contributions
 gn-0.007
ina-0.007
 Grac-0.006
 Nib-0.006
 blo-0.006
Token ch
Token ID442
Feature activation-3.64e-04

Pos logit contributions
 gn+0.005
 Grac+0.005
 Nib+0.005
 Po+0.005
ina+0.005
Neg logit contributions
 differences-0.005
 losers-0.005
 pleasures-0.005
oom-0.005
 similarities-0.005
Tokenocol
Token ID4668
Feature activation-2.28e+00

Pos logit contributions
 differences+0.000
 similarities+0.000
 pleasures+0.000
 losers+0.000
 laws+0.000
Neg logit contributions
 Grac-0.000
 Nib-0.000
ina-0.000
 Po-0.000
 gn-0.000
Tokenates
Token ID689
Feature activation-6.64e-01

Pos logit contributions
 differences+0.057
oom+0.051
 losers+0.051
 pleasures+0.050
 similarities+0.049
Neg logit contributions
 gn-0.061
 Grac-0.061
 Nib-0.058
 Po-0.057
 Maur-0.054
Token were
Token ID547
Feature activation-1.20e-01

Pos logit contributions
 differences+0.024
 losers+0.022
 pleasures+0.022
 similarities+0.022
 laws+0.021
Neg logit contributions
 gn-0.011
 Grac-0.011
 Nib-0.010
ina-0.010
 Po-0.010
Token bitter
Token ID12922
Feature activation+2.55e-01

Pos logit contributions
 differences+0.003
 losers+0.003
 similarities+0.003
 laws+0.003
 sides+0.003
Neg logit contributions
 gn-0.003
 Grac-0.003
 Nib-0.003
ina-0.003
 Po-0.003
Token.
Token ID13
Feature activation-8.22e-02

Pos logit contributions
 Grac+0.007
 gn+0.007
 Nib+0.007
 Maur+0.007
 Po+0.007
Neg logit contributions
 differences-0.006
 pleasures-0.005
 losers-0.005
 winds-0.005
 sides-0.005
Token They
Token ID1119
Feature activation+1.36e-01

Pos logit contributions
 differences+0.002
 losers+0.001
 pleasures+0.001
affe+0.001
 similarities+0.001
Neg logit contributions
 gn-0.003
 Grac-0.002
ina-0.002
 Nib-0.002
 blo-0.002
Token or
Token ID393
Feature activation-9.01e-01

Pos logit contributions
 differences+0.034
 similarities+0.033
 losers+0.031
 pleasures+0.031
 winds+0.030
Neg logit contributions
 gn-0.020
 Nib-0.019
 called-0.018
 Grac-0.018
 Po-0.017
Token get
Token ID651
Feature activation+2.49e-01

Pos logit contributions
 differences+0.025
 pleasures+0.024
oom+0.023
 similarities+0.022
affe+0.022
Neg logit contributions
 gn-0.021
 Nib-0.021
 Grac-0.020
 called-0.020
 Po-0.020
Token you
Token ID345
Feature activation-5.10e-01

Pos logit contributions
 Grac+0.006
 Nib+0.006
 gn+0.006
 Po+0.005
 Ho+0.005
Neg logit contributions
 differences-0.007
 pleasures-0.007
affe-0.007
 similarities-0.006
oom-0.006
Token a
Token ID257
Feature activation-3.19e-01

Pos logit contributions
 differences+0.018
 similarities+0.017
affe+0.017
 pleasures+0.016
folk+0.016
Neg logit contributions
 Nib-0.008
 ready-0.007
 called-0.007
 gn-0.007
 Grac-0.007
Token new
Token ID649
Feature activation-2.97e-01

Pos logit contributions
 differences+0.007
 losers+0.006
 pleasures+0.006
 similarities+0.006
oom+0.006
Neg logit contributions
 gn-0.009
 Grac-0.009
 Nib-0.009
 Po-0.009
ina-0.009
Token one
Token ID530
Feature activation-2.25e+00

Pos logit contributions
 differences+0.006
 losers+0.006
 pleasures+0.006
 similarities+0.005
oom+0.005
Neg logit contributions
 gn-0.009
 Grac-0.009
 Nib-0.009
 Po-0.008
ina-0.008
Token."
Token ID526
Feature activation+2.79e-01

Pos logit contributions
 differences+0.069
 similarities+0.065
 pleasures+0.063
 losers+0.061
Everyone+0.059
Neg logit contributions
 called-0.045
 gn-0.044
 Nib-0.043
 Grac-0.042
 least-0.042
Token
Token ID198
Feature activation+3.33e-02

Pos logit contributions
 Grac+0.007
 gn+0.007
ina+0.007
 blo+0.006
 Nib+0.006
Neg logit contributions
 differences-0.007
 losers-0.007
Everyone-0.007
 pleasures-0.007
oom-0.007
Token
Token ID198
Feature activation-2.50e-01

Pos logit contributions
 gn+0.001
 Grac+0.001
 Nib+0.001
ina+0.001
 blo+0.001
Neg logit contributions
 differences-0.001
Everyone-0.001
 losers-0.001
affe-0.001
oom-0.001
TokenL
Token ID43
Feature activation+5.62e-01

Pos logit contributions
 differences+0.007
 losers+0.007
 pleasures+0.007
oom+0.007
 similarities+0.006
Neg logit contributions
 gn-0.006
 Grac-0.006
 Nib-0.005
 Po-0.005
ina-0.005
Tokenily
Token ID813
Feature activation-1.48e-01

Pos logit contributions
 gn+0.013
 Grac+0.012
 Nib+0.012
 Po+0.012
ina+0.012
Neg logit contributions
 differences-0.016
 losers-0.015
 pleasures-0.015
 similarities-0.015
oom-0.014
Token they
Token ID484
Feature activation+1.35e-01

Pos logit contributions
 differences+0.044
oom+0.043
 losers+0.043
 similarities+0.043
affe+0.042
Neg logit contributions
 called-0.020
 blo-0.018
 gn-0.018
 Grac-0.018
 Po-0.018
Token come
Token ID1282
Feature activation-3.77e-01

Pos logit contributions
 gn+0.003
 Grac+0.003
 Nib+0.003
 Po+0.003
ina+0.003
Neg logit contributions
 differences-0.004
 losers-0.003
 pleasures-0.003
oom-0.003
 similarities-0.003
Token out
Token ID503
Feature activation+4.94e-01

Pos logit contributions
 differences+0.010
 losers+0.009
oom+0.009
 pleasures+0.009
 team+0.009
Neg logit contributions
 gn-0.009
 Grac-0.009
 Nib-0.008
 Maur-0.008
icol-0.008
Token and
Token ID290
Feature activation-3.36e-01

Pos logit contributions
 gn+0.014
 Grac+0.014
 Nib+0.014
 Po+0.013
 Maur+0.013
Neg logit contributions
 differences-0.011
 losers-0.010
 pleasures-0.010
oom-0.010
 similarities-0.010
Token sun
Token ID4252
Feature activation-8.59e-01

Pos logit contributions
 differences+0.010
oom+0.009
 pleasures+0.009
 losers+0.009
affe+0.009
Neg logit contributions
 gn-0.008
 Nib-0.008
 Grac-0.007
 Po-0.007
 called-0.007
Tokenbat
Token ID8664
Feature activation-2.25e+00

Pos logit contributions
 differences+0.027
 losers+0.025
 pleasures+0.025
 similarities+0.024
oom+0.024
Neg logit contributions
 gn-0.017
 Grac-0.017
 Nib-0.017
 Po-0.016
ina-0.015
Tokenhe
Token ID258
Feature activation-1.49e-01

Pos logit contributions
 differences+0.053
 losers+0.048
 pleasures+0.047
 similarities+0.046
 laws+0.044
Neg logit contributions
 Grac-0.062
 Nib-0.061
 gn-0.061
ina-0.059
 Po-0.059
Token on
Token ID319
Feature activation+3.84e-01

Pos logit contributions
 differences+0.005
 losers+0.004
 similarities+0.004
 pleasures+0.004
oom+0.004
Neg logit contributions
 gn-0.003
 Grac-0.003
 Nib-0.003
 Po-0.003
ina-0.003
Token the
Token ID262
Feature activation+1.05e+00

Pos logit contributions
 Grac+0.009
 Po+0.009
 Nib+0.009
 gn+0.009
 Ho+0.009
Neg logit contributions
 differences-0.010
 pleasures-0.010
 losers-0.010
 similarities-0.010
affe-0.010
Token logs
Token ID17259
Feature activation-1.69e-01

Pos logit contributions
 gn+0.026
 Nib+0.025
 blo+0.024
 Grac+0.024
 Po+0.024
Neg logit contributions
 differences-0.028
oom-0.026
 pleasures-0.026
 similarities-0.026
 efforts-0.025
Token.
Token ID13
Feature activation-3.55e-01

Pos logit contributions
 differences+0.005
 losers+0.005
 similarities+0.005
 pleasures+0.005
oom+0.005
Neg logit contributions
 gn-0.004
 Grac-0.003
 Nib-0.003
 Po-0.003
 called-0.003
Token here
Token ID994
Feature activation-2.41e-01

Pos logit contributions
 differences+0.012
 losers+0.011
 pleasures+0.011
 similarities+0.011
affe+0.011
Neg logit contributions
 gn-0.006
 Grac-0.006
 Nib-0.006
 Po-0.006
ina-0.006
Token?
Token ID30
Feature activation-9.69e-01

Pos logit contributions
 differences+0.007
 losers+0.007
 pleasures+0.007
oom+0.007
 similarities+0.007
Neg logit contributions
 gn-0.005
 Grac-0.005
 Nib-0.005
 Po-0.004
 Maur-0.004
Token You
Token ID921
Feature activation-3.65e-01

Pos logit contributions
 differences+0.028
 losers+0.027
 pleasures+0.025
oom+0.025
 similarities+0.025
Neg logit contributions
 gn-0.021
 Grac-0.020
 Nib-0.020
 Po-0.019
ina-0.019
Token are
Token ID389
Feature activation+3.52e-01

Pos logit contributions
 differences+0.010
 pleasures+0.010
 similarities+0.009
 losers+0.009
oom+0.009
Neg logit contributions
 Grac-0.009
 gn-0.008
 Nib-0.008
 Po-0.008
 called-0.008
Token tresp
Token ID29723
Feature activation+3.56e-01

Pos logit contributions
 gn+0.007
 Grac+0.007
 Nib+0.007
 Po+0.007
ina+0.007
Neg logit contributions
 differences-0.011
 pleasures-0.010
 losers-0.010
oom-0.010
 similarities-0.010
Tokenassing
Token ID19696
Feature activation-2.24e+00

Pos logit contributions
 gn+0.010
 Grac+0.009
 Nib+0.009
 Po+0.009
ina+0.009
Neg logit contributions
 differences-0.009
 losers-0.008
 pleasures-0.008
 similarities-0.008
oom-0.008
Token!
Token ID0
Feature activation-8.90e-01

Pos logit contributions
 differences+0.074
 similarities+0.069
 pleasures+0.068
 losers+0.067
 efforts+0.065
Neg logit contributions
 gn-0.040
 called-0.039
ina-0.038
 Grac-0.038
 Nib-0.037
Token You
Token ID921
Feature activation-4.23e-01

Pos logit contributions
 differences+0.025
 losers+0.023
 pleasures+0.022
 similarities+0.022
oom+0.021
Neg logit contributions
 gn-0.021
 Grac-0.021
 Nib-0.021
 Po-0.021
ina-0.020
Token are
Token ID389
Feature activation+4.15e-01

Pos logit contributions
 differences+0.012
 pleasures+0.011
 similarities+0.011
 losers+0.011
oom+0.010
Neg logit contributions
 gn-0.010
 Grac-0.010
 Nib-0.010
 Po-0.009
 called-0.009
Token thieves
Token ID26655
Feature activation+1.30e-01

Pos logit contributions
 gn+0.008
 Grac+0.008
 Nib+0.008
 Po+0.008
 Ho+0.007
Neg logit contributions
 differences-0.013
 pleasures-0.012
oom-0.012
 losers-0.012
 similarities-0.012
Token!
Token ID0
Feature activation-9.39e-01

Pos logit contributions
 gn+0.003
 Grac+0.003
 Nib+0.003
 Po+0.002
ina+0.002
Neg logit contributions
 differences-0.004
 similarities-0.004
 pleasures-0.004
 losers-0.004
oom-0.004
Token said
Token ID531
Feature activation-6.36e-01

Pos logit contributions
 differences+0.012
 losers+0.011
 pleasures+0.011
oom+0.011
 similarities+0.011
Neg logit contributions
 gn-0.007
 Grac-0.007
 Nib-0.006
 Po-0.006
ina-0.006
Token,
Token ID11
Feature activation-3.08e-01

Pos logit contributions
 differences+0.023
 pleasures+0.022
oom+0.021
 similarities+0.021
 sides+0.021
Neg logit contributions
 gn-0.011
 Nib-0.011
 Grac-0.011
 Po-0.010
 Ho-0.010
Token "
Token ID366
Feature activation+7.73e-01

Pos logit contributions
 differences+0.009
 pleasures+0.009
 losers+0.008
oom+0.008
 similarities+0.008
Neg logit contributions
 Grac-0.007
 gn-0.007
 Nib-0.007
 Po-0.007
 Ho-0.006
TokenI
Token ID40
Feature activation+8.16e-01

Pos logit contributions
 Grac+0.014
 gn+0.014
ina+0.013
 Po+0.013
 Nib+0.013
Neg logit contributions
 differences-0.026
 losers-0.024
 attitude-0.023
oom-0.023
 efforts-0.023
Token love
Token ID1842
Feature activation+1.20e-01

Pos logit contributions
 gn+0.019
 Grac+0.019
 Nib+0.018
 Po+0.018
ina+0.017
Neg logit contributions
 differences-0.023
 losers-0.022
 pleasures-0.021
 similarities-0.021
oom-0.020
Token av
Token ID1196
Feature activation-2.23e+00

Pos logit contributions
 gn+0.003
 Grac+0.003
 Nib+0.003
 Po+0.003
 Ho+0.003
Neg logit contributions
 differences-0.003
oom-0.003
 pleasures-0.003
 losers-0.003
affe-0.003
Tokenoc
Token ID420
Feature activation-9.37e-01

Pos logit contributions
 differences+0.056
 losers+0.052
 pleasures+0.050
 similarities+0.050
oom+0.049
Neg logit contributions
 gn-0.058
 Grac-0.057
 Nib-0.057
 Po-0.054
ina-0.052
Tokenados
Token ID22484
Feature activation-8.45e-01

Pos logit contributions
 differences+0.025
 thrill+0.024
 losers+0.023
 similarities+0.023
 pleasures+0.023
Neg logit contributions
 gn-0.022
 Nib-0.022
 Grac-0.021
 Po-0.020
 Ho-0.019
Token!
Token ID0
Feature activation-1.37e+00

Pos logit contributions
 differences+0.025
 losers+0.023
 similarities+0.023
 pleasures+0.023
oom+0.023
Neg logit contributions
 gn-0.018
 Grac-0.018
 Nib-0.018
 Po-0.017
ina-0.016
Token They
Token ID1119
Feature activation+2.56e-01

Pos logit contributions
 differences+0.036
 losers+0.033
 pleasures+0.032
oom+0.031
 similarities+0.031
Neg logit contributions
 gn-0.035
 Grac-0.034
 Nib-0.034
 Po-0.033
ina-0.031
Token are
Token ID389
Feature activation+1.77e-01

Pos logit contributions
 Grac+0.006
 gn+0.006
 Nib+0.006
 Po+0.005
 blo+0.005
Neg logit contributions
 differences-0.007
 losers-0.007
 pleasures-0.007
oom-0.007
 similarities-0.006
Token.
Token ID13
Feature activation+1.20e-01

Pos logit contributions
 Grac+0.015
 gn+0.015
 Nib+0.015
 Po+0.014
 Maur+0.014
Neg logit contributions
 differences-0.012
 losers-0.012
oom-0.011
 pleasures-0.011
 team-0.011
Token
Token ID198
Feature activation+5.92e-01

Pos logit contributions
 gn+0.002
 Grac+0.002
 Nib+0.002
 Po+0.002
ina+0.002
Neg logit contributions
 differences-0.004
 losers-0.004
 pleasures-0.003
 similarities-0.003
oom-0.003
Token
Token ID198
Feature activation+9.04e-02

Pos logit contributions
 gn+0.013
 Grac+0.012
 Nib+0.012
 Po+0.011
ina+0.011
Neg logit contributions
 differences-0.017
oom-0.016
 losers-0.016
 pleasures-0.016
affe-0.016
TokenSt
Token ID1273
Feature activation+1.02e-01

Pos logit contributions
 Grac+0.002
 gn+0.002
 Nib+0.002
 Po+0.002
ina+0.002
Neg logit contributions
 differences-0.002
 losers-0.002
 pleasures-0.002
oom-0.002
 similarities-0.002
Tokenri
Token ID380
Feature activation-8.86e-01

Pos logit contributions
 Nib+0.003
 gn+0.003
 Grac+0.003
ina+0.003
 Po+0.002
Neg logit contributions
 differences-0.003
 losers-0.002
 pleasures-0.002
 similarities-0.002
 laws-0.002
Tokenpe
Token ID431
Feature activation-2.23e+00

Pos logit contributions
 differences+0.031
 pleasures+0.029
 losers+0.028
 similarities+0.028
 impact+0.027
Neg logit contributions
 gn-0.017
 called-0.017
ina-0.017
 Nib-0.015
 Grac-0.013
Token and
Token ID290
Feature activation+1.43e-01

Pos logit contributions
 differences+0.071
 losers+0.068
 pleasures+0.066
 similarities+0.064
oom+0.063
Neg logit contributions
 gn-0.044
 Nib-0.042
 Grac-0.041
ina-0.040
 called-0.039
Token his
Token ID465
Feature activation-5.74e-03

Pos logit contributions
 Nib+0.003
 gn+0.003
 Ho+0.003
 Po+0.003
 called+0.003
Neg logit contributions
 pleasures-0.005
 differences-0.005
 impact-0.004
 anymore-0.004
 losers-0.004
Token friends
Token ID2460
Feature activation-6.07e-01

Pos logit contributions
 differences+0.000
 losers+0.000
 pleasures+0.000
 similarities+0.000
oom+0.000
Neg logit contributions
 gn-0.000
 Grac-0.000
 Nib-0.000
 Po-0.000
ina-0.000
Token were
Token ID547
Feature activation-1.42e-01

Pos logit contributions
 differences+0.021
 pleasures+0.020
 losers+0.020
 similarities+0.019
 laws+0.019
Neg logit contributions
 Grac-0.010
 gn-0.010
 called-0.010
 Nib-0.010
 Po-0.010
Token scared
Token ID12008
Feature activation+7.72e-01

Pos logit contributions
 differences+0.005
 losers+0.004
 pleasures+0.004
 similarities+0.004
oom+0.004
Neg logit contributions
 gn-0.003
 Grac-0.003
 Nib-0.003
 Po-0.002
ina-0.002
Token?"
Token ID1701
Feature activation+7.47e-01

Pos logit contributions
 differences+0.027
 losers+0.026
oom+0.025
 similarities+0.025
 pleasures+0.025
Neg logit contributions
 gn-0.019
 blo-0.019
 called-0.019
 Grac-0.019
 Nib-0.018
Token
Token ID198
Feature activation+3.89e-01

Pos logit contributions
 gn+0.012
 Grac+0.012
 Nib+0.011
 Po+0.011
 Maur+0.010
Neg logit contributions
 differences-0.028
 losers-0.026
 pleasures-0.026
 similarities-0.025
oom-0.025
Token
Token ID198
Feature activation+2.29e-01

Pos logit contributions
 gn+0.005
 Grac+0.005
 Nib+0.005
 Po+0.004
 Maur+0.004
Neg logit contributions
 differences-0.016
 pleasures-0.015
 losers-0.015
 similarities-0.015
 attitude-0.015
TokenThe
Token ID464
Feature activation+7.74e-02

Pos logit contributions
 gn+0.007
 Grac+0.006
 Nib+0.006
 Maur+0.005
 called+0.005
Neg logit contributions
 differences-0.006
 attitude-0.006
ness-0.006
 pleasures-0.006
adoes-0.005
Token wood
Token ID4898
Feature activation-6.45e-01

Pos logit contributions
 gn+0.002
 very+0.002
 called+0.001
 pap+0.001
 searching+0.001
Neg logit contributions
oom-0.003
 losers-0.002
 differences-0.002
 pleasures-0.002
 similarities-0.002
Tokenpe
Token ID431
Feature activation-2.23e+00

Pos logit contributions
 differences+0.020
 losers+0.019
 pleasures+0.018
 similarities+0.018
oom+0.017
Neg logit contributions
 gn-0.015
 called-0.015
ina-0.014
 Grac-0.013
 blo-0.013
Tokencker
Token ID15280
Feature activation-4.90e-01

Pos logit contributions
 differences+0.058
 losers+0.057
 pleasures+0.052
 similarities+0.052
 laws+0.051
Neg logit contributions
 gn-0.057
ina-0.057
 called-0.054
 Nib-0.053
 Grac-0.053
Token thought
Token ID1807
Feature activation+2.17e-01

Pos logit contributions
 differences+0.018
 pleasures+0.017
 losers+0.017
 similarities+0.017
oom+0.016
Neg logit contributions
 gn-0.008
 called-0.008
 Grac-0.008
 Nib-0.008
ina-0.007
Token it
Token ID340
Feature activation+4.13e-01

Pos logit contributions
 Grac+0.003
 Nib+0.003
 gn+0.003
 Po+0.003
 called+0.003
Neg logit contributions
 differences-0.008
 pleasures-0.007
 losers-0.007
 similarities-0.007
oom-0.007
Token over
Token ID625
Feature activation-2.43e-01

Pos logit contributions
 gn+0.009
 Grac+0.008
 Nib+0.008
 Po+0.008
 Maur+0.008
Neg logit contributions
 differences-0.012
 losers-0.012
 pleasures-0.011
oom-0.011
 similarities-0.011
Token.
Token ID13
Feature activation+1.54e-01

Pos logit contributions
 differences+0.008
 losers+0.008
oom+0.008
 pleasures+0.008
 similarities+0.008
Neg logit contributions
 gn-0.004
 Po-0.004
 Nib-0.004
 Grac-0.004
 called-0.004
Token he
Token ID339
Feature activation+1.92e-01

Pos logit contributions
 Grac+0.002
 gn+0.002
 Nib+0.002
 Po+0.002
 called+0.002
Neg logit contributions
 differences-0.004
 losers-0.004
affe-0.004
 similarities-0.004
 pleasures-0.004
Token was
Token ID373
Feature activation+2.19e-02

Pos logit contributions
 gn+0.004
 Grac+0.004
 Nib+0.003
 Po+0.003
ina+0.003
Neg logit contributions
 differences-0.006
 pleasures-0.006
 losers-0.006
 similarities-0.006
 laws-0.006
Token going
Token ID1016
Feature activation-6.11e-02

Pos logit contributions
 gn+0.000
 Grac+0.000
 Nib+0.000
 Po+0.000
ina+0.000
Neg logit contributions
 differences-0.001
 losers-0.001
 pleasures-0.001
 similarities-0.001
oom-0.001
Token to
Token ID284
Feature activation+5.38e-01

Pos logit contributions
 differences+0.002
 losers+0.002
 pleasures+0.002
 similarities+0.002
oom+0.002
Neg logit contributions
 gn-0.001
 Grac-0.001
 Nib-0.001
 Po-0.001
ina-0.001
Token do
Token ID466
Feature activation+1.75e+00

Pos logit contributions
 gn+0.014
 Grac+0.013
 Nib+0.013
 Po+0.012
 called+0.012
Neg logit contributions
 differences-0.015
affe-0.014
 pleasures-0.013
 losers-0.013
 similarities-0.013
Token something
Token ID1223
Feature activation+1.39e+00

Pos logit contributions
 gn+0.034
 Grac+0.031
 imagining+0.030
 called+0.030
 Nib+0.030
Neg logit contributions
 differences-0.056
 losers-0.054
affe-0.054
 similarities-0.051
oom-0.051
Token fun
Token ID1257
Feature activation+9.97e-01

Pos logit contributions
 gn+0.041
 Grac+0.041
 Nib+0.040
 Po+0.039
ina+0.038
Neg logit contributions
 differences-0.030
 losers-0.027
 pleasures-0.026
 similarities-0.025
oom-0.025
Token.
Token ID13
Feature activation+3.62e-01

Pos logit contributions
 gn+0.027
 Grac+0.027
 Nib+0.027
 Po+0.026
 Maur+0.025
Neg logit contributions
 differences-0.024
 losers-0.022
 pleasures-0.022
oom-0.021
 similarities-0.021
Token He
Token ID679
Feature activation-7.59e-02

Pos logit contributions
 gn+0.009
 Grac+0.009
 Nib+0.009
ina+0.008
 Po+0.008
Neg logit contributions
 differences-0.010
 losers-0.009
 pleasures-0.009
oom-0.008
 similarities-0.008
Token couldn
Token ID3521
Feature activation+1.55e-01

Pos logit contributions
 differences+0.002
 losers+0.002
 pleasures+0.002
 similarities+0.002
oom+0.002
Neg logit contributions
 gn-0.002
 Grac-0.002
 Nib-0.001
 Po-0.001
ina-0.001
Token't
Token ID470
Feature activation+1.03e+00

Pos logit contributions
 gn+0.003
 Grac+0.003
 Nib+0.003
 Po+0.003
ina+0.003
Neg logit contributions
 differences-0.005
 pleasures-0.005
 similarities-0.005
 losers-0.004
affe-0.004
Token very
Token ID845
Feature activation+1.75e-01

Pos logit contributions
 differences+0.009
 losers+0.009
 laws+0.008
 law+0.008
 missiles+0.008
Neg logit contributions
 Grac-0.005
 gn-0.005
 Nib-0.005
ina-0.005
 Po-0.005
Token sad
Token ID6507
Feature activation+1.12e+00

Pos logit contributions
 gn+0.005
 Grac+0.005
 Nib+0.005
 Po+0.004
ina+0.004
Neg logit contributions
 differences-0.004
 losers-0.004
 pleasures-0.004
 similarities-0.004
oom-0.004
Token and
Token ID290
Feature activation-2.41e-01

Pos logit contributions
 Grac+0.035
 gn+0.035
 Nib+0.035
 Maur+0.033
icol+0.033
Neg logit contributions
 differences-0.021
 anymore-0.020
 times-0.019
 pleasures-0.018
 attitude-0.018
Token started
Token ID2067
Feature activation+6.71e-01

Pos logit contributions
 differences+0.007
 losers+0.006
 similarities+0.006
 pleasures+0.006
 laws+0.006
Neg logit contributions
 Grac-0.005
 gn-0.005
 Nib-0.005
ina-0.005
 Po-0.005
Token to
Token ID284
Feature activation-6.35e-01

Pos logit contributions
 Grac+0.017
 Nib+0.017
 gn+0.016
ina+0.016
 Po+0.016
Neg logit contributions
 differences-0.017
 losers-0.016
 laws-0.015
 pleasures-0.015
 similarities-0.015
Token cry
Token ID3960
Feature activation+1.17e+00

Pos logit contributions
 differences+0.013
 losers+0.011
 attitude+0.010
 laws+0.010
 law+0.010
Neg logit contributions
 Grac-0.020
 Nib-0.020
 gn-0.019
ina-0.019
 Maur-0.019
Token.
Token ID13
Feature activation+4.14e-02

Pos logit contributions
 gn+0.032
 Grac+0.032
 Nib+0.031
 Po+0.030
ina+0.030
Neg logit contributions
 differences-0.027
 losers-0.025
 pleasures-0.024
oom-0.024
 similarities-0.024
Token
Token ID220
Feature activation+5.02e-01

Pos logit contributions
 gn+0.001
 Grac+0.001
 Nib+0.001
 Po+0.001
ina+0.001
Neg logit contributions
 differences-0.001
 losers-0.001
 pleasures-0.001
 similarities-0.001
 laws-0.001
Token
Token ID198
Feature activation+4.43e-02

Pos logit contributions
 gn+0.011
 Grac+0.010
 Nib+0.010
 Po+0.010
ina+0.010
Neg logit contributions
 differences-0.015
 losers-0.014
oom-0.013
 pleasures-0.013
Everyone-0.013
Token
Token ID198
Feature activation-3.36e-01

Pos logit contributions
 gn+0.001
 Nib+0.001
 Grac+0.001
icol+0.001
 Po+0.001
Neg logit contributions
 differences-0.001
Everyone-0.001
 losers-0.001
oom-0.001
 laws-0.001
TokenThe
Token ID464
Feature activation+1.49e-01

Pos logit contributions
 differences+0.009
 losers+0.008
 missiles+0.008
oom+0.008
 pleasures+0.008
Neg logit contributions
 gn-0.009
 Grac-0.009
 Nib-0.008
 Maur-0.008
 Po-0.008
Token with
Token ID351
Feature activation-7.11e-01

Pos logit contributions
 gn+0.006
 Nib+0.005
 Grac+0.005
 Po+0.005
ina+0.005
Neg logit contributions
 differences-0.009
 losers-0.008
 pleasures-0.008
 similarities-0.008
oom-0.008
Token everyone
Token ID2506
Feature activation-1.00e+00

Pos logit contributions
 pleasures+0.019
 thrill+0.018
 differences+0.018
oom+0.017
 losers+0.017
Neg logit contributions
 Po-0.019
 Grac-0.017
 called-0.017
 Ho-0.017
 Nib-0.016
Token."
Token ID526
Feature activation+1.30e+00

Pos logit contributions
 differences+0.033
oom+0.032
 similarities+0.032
 efforts+0.032
 winds+0.032
Neg logit contributions
 blo-0.017
 called-0.016
 very-0.016
 Grac-0.015
ina-0.015
Token
Token ID198
Feature activation+7.90e-01

Pos logit contributions
 gn+0.026
 Grac+0.026
 Nib+0.025
ina+0.024
 Po+0.023
Neg logit contributions
 differences-0.040
 losers-0.037
 pleasures-0.036
 similarities-0.036
affe-0.035
Token
Token ID198
Feature activation+6.22e-01

Pos logit contributions
 gn+0.016
 Grac+0.016
 Nib+0.016
ina+0.015
 Po+0.015
Neg logit contributions
 differences-0.024
 losers-0.022
 similarities-0.021
 pleasures-0.021
oom-0.021
TokenM
Token ID44
Feature activation+1.42e+00

Pos logit contributions
 gn+0.013
 Grac+0.012
 Nib+0.012
 Po+0.012
ina+0.011
Neg logit contributions
 differences-0.020
 losers-0.019
 pleasures-0.018
oom-0.018
 laws-0.018
Tokenia
Token ID544
Feature activation-4.09e-01

Pos logit contributions
ina+0.036
 gn+0.035
 Nib+0.033
 Po+0.032
 Maur+0.031
Neg logit contributions
 differences-0.036
 losers-0.035
 pleasures-0.034
 missiles-0.034
 similarities-0.032
Token and
Token ID290
Feature activation+2.52e-01

Pos logit contributions
 differences+0.013
 losers+0.012
 pleasures+0.012
 similarities+0.012
oom+0.012
Neg logit contributions
 gn-0.008
 Grac-0.008
 Nib-0.008
 Po-0.008
ina-0.007
Token Ben
Token ID3932
Feature activation-2.85e-02

Pos logit contributions
 Po+0.005
 Nib+0.005
 gn+0.004
 Grac+0.004
 Ho+0.004
Neg logit contributions
 differences-0.009
 pleasures-0.009
 losers-0.008
 similarities-0.008
oom-0.008
Token nodded
Token ID14464
Feature activation-7.11e-01

Pos logit contributions
 differences+0.001
 losers+0.001
 pleasures+0.001
 similarities+0.001
oom+0.001
Neg logit contributions
 gn-0.001
 Grac-0.001
 Nib-0.001
 Po-0.001
ina-0.001
Token and
Token ID290
Feature activation-1.29e-01

Pos logit contributions
 differences+0.020
 losers+0.018
 pleasures+0.018
 similarities+0.017
oom+0.017
Neg logit contributions
 gn-0.017
 Grac-0.017
 Nib-0.016
 Po-0.016
ina-0.015
Token dinner
Token ID8073
Feature activation-2.41e-01

Pos logit contributions
 differences+0.038
 losers+0.035
 pleasures+0.034
affe+0.034
 similarities+0.034
Neg logit contributions
 gn-0.036
 Grac-0.036
 Nib-0.035
 Po-0.035
ina-0.033
Token time
Token ID640
Feature activation-2.58e-01

Pos logit contributions
 differences+0.006
 losers+0.005
 pleasures+0.005
 similarities+0.005
affe+0.005
Neg logit contributions
 gn-0.006
 Grac-0.006
 Nib-0.006
 Po-0.006
ina-0.006
Token."
Token ID526
Feature activation+6.28e-01

Pos logit contributions
 differences+0.007
 losers+0.006
 pleasures+0.006
 similarities+0.006
affe+0.006
Neg logit contributions
 Grac-0.006
 gn-0.006
 Nib-0.006
 Po-0.006
ina-0.006
Token
Token ID198
Feature activation+6.73e-01

Pos logit contributions
 gn+0.017
 Nib+0.015
ina+0.015
 blo+0.015
 Grac+0.015
Neg logit contributions
 differences-0.015
Everyone-0.014
oom-0.014
 pleasures-0.014
 similarities-0.014
Token
Token ID198
Feature activation-1.14e-01

Pos logit contributions
 gn+0.018
 Grac+0.016
 Nib+0.016
 blo+0.015
icol+0.015
Neg logit contributions
 differences-0.017
Everyone-0.016
oom-0.016
 similarities-0.015
 losers-0.015
TokenL
Token ID43
Feature activation+1.17e+00

Pos logit contributions
 differences+0.003
 losers+0.003
 pleasures+0.003
 similarities+0.003
oom+0.003
Neg logit contributions
 gn-0.003
 Grac-0.003
 Nib-0.003
 Po-0.003
ina-0.003
Tokenily
Token ID813
Feature activation-3.30e-01

Pos logit contributions
 gn+0.038
 Nib+0.037
 Grac+0.036
icol+0.036
 Ho+0.035
Neg logit contributions
 differences-0.024
 losers-0.022
 pleasures-0.022
oom-0.021
affe-0.020
Token and
Token ID290
Feature activation+1.09e-02

Pos logit contributions
 differences+0.011
 losers+0.010
 pleasures+0.010
 similarities+0.010
 missiles+0.010
Neg logit contributions
 gn-0.006
 Grac-0.006
 Nib-0.006
 called-0.006
 Po-0.006
Token her
Token ID607
Feature activation-8.49e-02

Pos logit contributions
 Grac+0.000
 gn+0.000
 Nib+0.000
 Po+0.000
ina+0.000
Neg logit contributions
 differences-0.000
 losers-0.000
 pleasures-0.000
 similarities-0.000
oom-0.000
Token mom
Token ID1995
Feature activation+5.16e-01

Pos logit contributions
 differences+0.002
 losers+0.002
 pleasures+0.002
 similarities+0.002
oom+0.002
Neg logit contributions
 gn-0.003
 Grac-0.003
 Nib-0.003
 Po-0.002
ina-0.002
Tokenmy
Token ID1820
Feature activation-4.71e-01

Pos logit contributions
 gn+0.009
 Grac+0.009
 Nib+0.009
 Po+0.009
 Maur+0.008
Neg logit contributions
 differences-0.017
 losers-0.016
 pleasures-0.016
oom-0.016
 similarities-0.015
Token something
Token ID1223
Feature activation+1.32e+00

Pos logit contributions
 gn+0.003
 Grac+0.003
 Nib+0.003
 Po+0.003
ina+0.003
Neg logit contributions
 differences-0.005
 losers-0.004
 pleasures-0.004
oom-0.004
 similarities-0.004
Token special
Token ID2041
Feature activation-8.30e-02

Pos logit contributions
 gn+0.041
 Grac+0.040
 Nib+0.040
ina+0.040
icol+0.040
Neg logit contributions
 differences-0.026
 law-0.022
 losers-0.022
 laws-0.022
 similarities-0.021
Token.
Token ID13
Feature activation-3.15e-01

Pos logit contributions
 differences+0.002
 law+0.001
 pleasures+0.001
 losers+0.001
 thing+0.001
Neg logit contributions
 gn-0.003
 Grac-0.003
 Nib-0.003
 Maur-0.003
 Po-0.003
Token She
Token ID1375
Feature activation+1.48e-01

Pos logit contributions
 differences+0.014
 losers+0.014
 pleasures+0.013
 missiles+0.013
oom+0.013
Neg logit contributions
 Grac-0.002
 gn-0.002
 Nib-0.002
 Po-0.002
 Maur-0.002
Token went
Token ID1816
Feature activation+1.70e-01

Pos logit contributions
 Grac+0.004
 Nib+0.004
 gn+0.003
 Po+0.003
 blo+0.003
Neg logit contributions
 differences-0.004
 losers-0.004
oom-0.003
 similarities-0.003
 team-0.003
Token on
Token ID319
Feature activation+1.52e+00

Pos logit contributions
 Maur+0.006
 Grac+0.006
 Nib+0.006
icol+0.006
ina+0.006
Neg logit contributions
 law-0.002
ag-0.002
 differences-0.002
 laws-0.002
 tips-0.002
Token an
Token ID281
Feature activation-3.48e-01

Pos logit contributions
 Grac+0.028
 gn+0.028
 Nib+0.027
ina+0.026
 Po+0.026
Neg logit contributions
 differences-0.049
 losers-0.046
 pleasures-0.045
 laws-0.045
 similarities-0.044
Token adventure
Token ID8855
Feature activation-3.49e-01

Pos logit contributions
 differences+0.004
 losers+0.004
 law+0.004
 laws+0.004
 sides+0.004
Neg logit contributions
 Grac-0.014
 Nib-0.013
 gn-0.013
 Po-0.013
 Ho-0.013
Token to
Token ID284
Feature activation+4.83e-01

Pos logit contributions
 differences+0.010
 losers+0.009
 pleasures+0.009
oom+0.009
 similarities+0.008
Neg logit contributions
 gn-0.008
 Grac-0.008
 Nib-0.008
 Po-0.008
 Maur-0.007
Token find
Token ID1064
Feature activation+1.04e+00

Pos logit contributions
 gn+0.010
 Grac+0.010
 Nib+0.010
 Po+0.010
ina+0.009
Neg logit contributions
 differences-0.015
 losers-0.014
 pleasures-0.013
oom-0.013
 similarities-0.013
Token it
Token ID340
Feature activation+6.80e-01

Pos logit contributions
 gn+0.023
 Grac+0.023
 Nib+0.022
ina+0.021
 Po+0.021
Neg logit contributions
 differences-0.030
 losers-0.028
 pleasures-0.027
 laws-0.027
 similarities-0.027
Token climbing
Token ID14281
Feature activation+5.89e-01

Pos logit contributions
 gn+0.002
 Grac+0.002
 Nib+0.002
 Po+0.002
 Maur+0.002
Neg logit contributions
 differences-0.004
 losers-0.003
 pleasures-0.003
 similarities-0.003
oom-0.003
Token up
Token ID510
Feature activation+1.38e-01

Pos logit contributions
 gn+0.015
 Grac+0.015
 Nib+0.014
 Po+0.014
ina+0.013
Neg logit contributions
 differences-0.016
 losers-0.015
 pleasures-0.014
 similarities-0.014
oom-0.014
Token the
Token ID262
Feature activation+3.01e-01

Pos logit contributions
 gn+0.004
 Grac+0.004
 Nib+0.003
 Po+0.003
 blo+0.003
Neg logit contributions
 differences-0.004
 losers-0.003
affe-0.003
 pleasures-0.003
oom-0.003
Token walls
Token ID7714
Feature activation+2.58e-01

Pos logit contributions
 gn+0.006
 Grac+0.006
 Nib+0.005
 blo+0.005
 Maur+0.005
Neg logit contributions
 differences-0.010
 losers-0.010
 pleasures-0.009
 similarities-0.009
affe-0.009
Token.
Token ID13
Feature activation-1.82e-02

Pos logit contributions
 gn+0.006
 Grac+0.006
 blo+0.005
ina+0.005
 Nib+0.005
Neg logit contributions
 differences-0.008
 losers-0.007
 similarities-0.007
 pleasures-0.007
affe-0.007
Token After
Token ID2293
Feature activation+9.50e-01

Pos logit contributions
 differences+0.001
 losers+0.001
 pleasures+0.001
 similarities+0.001
oom+0.001
Neg logit contributions
 gn-0.000
 Grac-0.000
 Nib-0.000
 Po-0.000
 Maur-0.000
Token a
Token ID257
Feature activation-2.47e-01

Pos logit contributions
 Grac+0.021
 Nib+0.020
ina+0.020
 gn+0.020
 Maur+0.020
Neg logit contributions
 differences-0.026
 losers-0.024
 laws-0.024
 missiles-0.024
 winds-0.024
Token while
Token ID981
Feature activation+4.59e-01

Pos logit contributions
 differences+0.007
 losers+0.006
 pleasures+0.006
 similarities+0.006
 laws+0.006
Neg logit contributions
 gn-0.006
 Grac-0.006
 Nib-0.006
 Po-0.006
 Maur-0.005
Token,
Token ID11
Feature activation-2.91e-01

Pos logit contributions
 Grac+0.009
 gn+0.009
 Nib+0.008
 Po+0.008
 Maur+0.007
Neg logit contributions
 differences-0.015
 losers-0.015
 pleasures-0.014
 similarities-0.014
oom-0.014
Token she
Token ID673
Feature activation-2.98e-01

Pos logit contributions
 differences+0.010
 losers+0.009
 pleasures+0.009
 similarities+0.009
oom+0.009
Neg logit contributions
 Grac-0.006
 Po-0.006
 gn-0.006
 Nib-0.006
 Maur-0.005
Token realized
Token ID6939
Feature activation+5.15e-01

Pos logit contributions
 differences+0.008
 losers+0.008
oom+0.008
affe+0.008
 team+0.008
Neg logit contributions
 Grac-0.007
 Nib-0.006
 gn-0.006
 Po-0.006
 Maur-0.006
Token
Token ID198
Feature activation+1.94e-01

Pos logit contributions
 Grac+0.001
 Nib+0.001
 Po+0.001
 gn+0.001
 Ho+0.001
Neg logit contributions
 differences-0.001
 losers-0.001
 pleasures-0.001
 similarities-0.001
oom-0.001
Token
Token ID198
Feature activation-5.17e-01

Pos logit contributions
 gn+0.005
 Grac+0.005
 Nib+0.005
icol+0.005
ina+0.004
Neg logit contributions
 differences-0.005
Everyone-0.005
oom-0.005
 similarities-0.004
affe-0.004
TokenThe
Token ID464
Feature activation+3.17e-01

Pos logit contributions
 differences+0.014
 attitude+0.013
 losers+0.013
 winds+0.013
ness+0.013
Neg logit contributions
 gn-0.012
 Grac-0.012
 Po-0.011
 Nib-0.011
 Maur-0.011
Token all
Token ID477
Feature activation-6.65e-01

Pos logit contributions
 gn+0.007
 blo+0.007
 Grac+0.007
 Nib+0.007
 Po+0.006
Neg logit contributions
 differences-0.009
oom-0.009
 losers-0.008
 similarities-0.008
 pleasures-0.008
Tokenigator
Token ID23823
Feature activation-5.67e-01

Pos logit contributions
 differences+0.018
 pleasures+0.017
 similarities+0.016
oom+0.016
 losers+0.015
Neg logit contributions
 Grac-0.017
 Po-0.017
 called-0.016
 gn-0.016
ina-0.016
Token was
Token ID373
Feature activation+3.31e-02

Pos logit contributions
 differences+0.020
 pleasures+0.020
 losers+0.019
 similarities+0.019
oom+0.018
Neg logit contributions
 gn-0.009
 called-0.009
 Nib-0.009
 Po-0.008
 Grac-0.008
Token so
Token ID523
Feature activation+2.45e-01

Pos logit contributions
 gn+0.001
 Grac+0.001
 Nib+0.001
 Po+0.001
ina+0.001
Neg logit contributions
 differences-0.001
 losers-0.001
 pleasures-0.001
oom-0.001
 similarities-0.001
Token happy
Token ID3772
Feature activation-4.62e-01

Pos logit contributions
 gn+0.005
 Nib+0.005
 Grac+0.005
 Po+0.005
 blo+0.004
Neg logit contributions
 differences-0.008
 pleasures-0.007
 losers-0.007
 laws-0.007
 missiles-0.007
Token to
Token ID284
Feature activation-3.84e-01

Pos logit contributions
 differences+0.014
 losers+0.013
 pleasures+0.013
 similarities+0.012
oom+0.012
Neg logit contributions
 gn-0.010
 Grac-0.010
 Nib-0.010
 Po-0.009
ina-0.009
Token have
Token ID423
Feature activation-4.25e-01

Pos logit contributions
 differences+0.009
 losers+0.008
 pleasures+0.008
 similarities+0.007
oom+0.007
Neg logit contributions
 gn-0.011
 Grac-0.011
 Nib-0.011
 Po-0.010
ina-0.010
Token a
Token ID257
Feature activation-2.97e-01

Pos logit contributions
 differences+0.014
 pleasures+0.014
affe+0.013
oom+0.013
 losers+0.013
Neg logit contributions
 Po-0.008
 called-0.007
 Grac-0.007
 gn-0.007
 Nib-0.007
Token<|endoftext|>
Token ID50256
Feature activation-4.66e-01

Pos logit contributions
 differences+0.012
 losers+0.011
 similarities+0.011
 pleasures+0.011
oom+0.011
Neg logit contributions
 gn-0.009
 Grac-0.009
 Nib-0.008
ina-0.008
 blo-0.008
TokenL
Token ID43
Feature activation+2.83e-01

Pos logit contributions
 differences+0.018
 losers+0.017
 pleasures+0.017
oom+0.017
 similarities+0.016
Neg logit contributions
 gn-0.006
 Grac-0.006
 Nib-0.006
 Po-0.005
ina-0.005
Tokenily
Token ID813
Feature activation-4.14e-01

Pos logit contributions
 gn+0.009
icol+0.009
 Grac+0.009
 Nib+0.008
 blo+0.008
Neg logit contributions
 differences-0.005
oom-0.005
 losers-0.005
 sides-0.005
 times-0.005
Token and
Token ID290
Feature activation-5.76e-01

Pos logit contributions
 differences+0.011
 losers+0.010
 pleasures+0.010
oom+0.010
 similarities+0.010
Neg logit contributions
 Grac-0.010
 Nib-0.010
 gn-0.010
 Po-0.009
 Maur-0.009
Token Ben
Token ID3932
Feature activation-4.46e-01

Pos logit contributions
 differences+0.014
 losers+0.013
oom+0.012
 similarities+0.012
 pleasures+0.012
Neg logit contributions
 gn-0.016
 Grac-0.015
 Nib-0.015
ina-0.014
 Po-0.014
Token were
Token ID547
Feature activation-1.25e-01

Pos logit contributions
 differences+0.011
 losers+0.011
 pleasures+0.010
oom+0.010
 similarities+0.010
Neg logit contributions
 Grac-0.011
 gn-0.011
 Nib-0.011
 Po-0.010
ina-0.010
Token playing
Token ID2712
Feature activation-9.03e-01

Pos logit contributions
 differences+0.003
 losers+0.003
 team+0.003
 goals+0.002
 laws+0.002
Neg logit contributions
 gn-0.004
 Grac-0.004
 Nib-0.004
icol-0.003
 Maur-0.003
Token in
Token ID287
Feature activation+1.93e-02

Pos logit contributions
 differences+0.028
 losers+0.026
 pleasures+0.025
 similarities+0.025
affe+0.025
Neg logit contributions
 gn-0.019
 Grac-0.018
 Nib-0.018
 Po-0.018
 Ho-0.017
Token the
Token ID262
Feature activation-1.31e-01

Pos logit contributions
 gn+0.001
 Grac+0.001
 Nib+0.001
 Po+0.001
ina+0.001
Neg logit contributions
 differences-0.000
 losers-0.000
 pleasures-0.000
 similarities-0.000
oom-0.000
Token park
Token ID3952
Feature activation-3.52e-01

Pos logit contributions
 differences+0.003
 losers+0.003
 sides+0.003
 team+0.003
 pleasures+0.003
Neg logit contributions
 Grac-0.003
 Nib-0.003
 gn-0.003
ina-0.003
 Maur-0.003
Token.
Token ID13
Feature activation-2.25e-01

Pos logit contributions
 differences+0.010
 losers+0.010
 pleasures+0.009
 similarities+0.009
oom+0.009
Neg logit contributions
 gn-0.008
 Nib-0.008
 Grac-0.008
 Po-0.008
 called-0.007
Token.
Token ID13
Feature activation+1.61e-01

Pos logit contributions
 Grac+0.012
 gn+0.012
 Nib+0.012
 Maur+0.011
 Po+0.011
Neg logit contributions
 differences-0.010
 losers-0.009
oom-0.008
 pleasures-0.008
 similarities-0.008
Token Lily
Token ID20037
Feature activation+3.22e-03

Pos logit contributions
 gn+0.004
 Grac+0.004
 Nib+0.004
ina+0.004
 Po+0.004
Neg logit contributions
 differences-0.004
 losers-0.004
 pleasures-0.004
 similarities-0.004
oom-0.004
Token was
Token ID373
Feature activation-3.41e-01

Pos logit contributions
 Grac+0.000
 gn+0.000
 Nib+0.000
 Po+0.000
 Maur+0.000
Neg logit contributions
 differences-0.000
 losers-0.000
 pleasures-0.000
 similarities-0.000
oom-0.000
Token sitting
Token ID5586
Feature activation-8.23e-01

Pos logit contributions
 differences+0.009
 losers+0.009
 law+0.008
 team+0.008
 laws+0.008
Neg logit contributions
icol-0.008
 Grac-0.008
 gn-0.008
 Nib-0.008
ina-0.007
Token under
Token ID739
Feature activation+3.81e-01

Pos logit contributions
 differences+0.026
 losers+0.024
 pleasures+0.023
 similarities+0.023
oom+0.023
Neg logit contributions
 gn-0.017
 Grac-0.016
 Nib-0.016
 Po-0.016
ina-0.015
Token a
Token ID257
Feature activation+7.45e-01

Pos logit contributions
 gn+0.006
 Grac+0.006
 Nib+0.006
 Po+0.006
 Ho+0.005
Neg logit contributions
 differences-0.014
 losers-0.013
 pleasures-0.013
 similarities-0.013
oom-0.013
Token tree
Token ID5509
Feature activation+7.18e-02

Pos logit contributions
 gn+0.024
 Grac+0.024
 Nib+0.024
 Po+0.023
ina+0.022
Neg logit contributions
 differences-0.014
 losers-0.013
 pleasures-0.012
 similarities-0.012
oom-0.012
Token.
Token ID13
Feature activation+2.18e-01

Pos logit contributions
 gn+0.002
 Nib+0.002
 Grac+0.002
 Po+0.002
ina+0.001
Neg logit contributions
 differences-0.002
 losers-0.002
 pleasures-0.002
 similarities-0.002
oom-0.002
Token "
Token ID366
Feature activation+7.76e-01

Pos logit contributions
 gn+0.005
 Grac+0.005
 Nib+0.005
 Po+0.005
ina+0.005
Neg logit contributions
 differences-0.006
 losers-0.005
 pleasures-0.005
 similarities-0.005
oom-0.005
TokenHi
Token ID17250
Feature activation+3.70e-01

Pos logit contributions
 gn+0.015
 Grac+0.015
 Nib+0.015
 Po+0.014
ina+0.013
Neg logit contributions
 differences-0.025
 losers-0.023
 pleasures-0.023
 similarities-0.022
oom-0.022
Token Lily
Token ID20037
Feature activation+1.14e+00

Pos logit contributions
 Grac+0.007
 gn+0.007
 Nib+0.007
 Po+0.007
 Ho+0.007
Neg logit contributions
 differences-0.012
 pleasures-0.011
 losers-0.011
 similarities-0.011
 laws-0.010
Token and
Token ID290
Feature activation-1.23e-01

Pos logit contributions
 gn+0.002
 Grac+0.002
 Nib+0.002
 Po+0.002
 Maur+0.002
Neg logit contributions
 differences-0.003
 losers-0.003
 pleasures-0.002
 similarities-0.002
oom-0.002
Token thanked
Token ID26280
Feature activation+1.79e-01

Pos logit contributions
 differences+0.004
 losers+0.003
 pleasures+0.003
 similarities+0.003
oom+0.003
Neg logit contributions
 gn-0.003
 Grac-0.003
 Nib-0.003
 Po-0.002
ina-0.002
Token Tim
Token ID5045
Feature activation+6.53e-01

Pos logit contributions
 Nib+0.004
 Grac+0.004
 Po+0.004
 Ros+0.003
 Ho+0.003
Neg logit contributions
 differences-0.006
 pleasures-0.006
 losers-0.006
oom-0.005
 similarities-0.005
Tokenmy
Token ID1820
Feature activation+1.84e-01

Pos logit contributions
 gn+0.010
 Grac+0.010
 Nib+0.010
 Po+0.009
ina+0.009
Neg logit contributions
 differences-0.024
 losers-0.023
 pleasures-0.022
 similarities-0.022
 laws-0.022
Token by
Token ID416
Feature activation-6.22e-01

Pos logit contributions
 Nib+0.003
 gn+0.003
 Grac+0.003
 Po+0.003
 Ho+0.003
Neg logit contributions
 differences-0.006
 losers-0.006
 similarities-0.006
oom-0.006
 pleasures-0.006
Token jumping
Token ID14284
Feature activation+9.12e-02

Pos logit contributions
 differences+0.019
 losers+0.018
 pleasures+0.018
oom+0.018
affe+0.018
Neg logit contributions
 gn-0.013
 Grac-0.013
 Nib-0.013
 Po-0.012
 blo-0.012
Token out
Token ID503
Feature activation+2.96e-01

Pos logit contributions
 gn+0.002
 Grac+0.002
 Nib+0.002
 Po+0.002
 Ho+0.002
Neg logit contributions
 differences-0.003
 losers-0.003
 pleasures-0.002
 similarities-0.002
affe-0.002
Token of
Token ID286
Feature activation-5.65e-01

Pos logit contributions
 Grac+0.008
 gn+0.008
 Po+0.008
 Nib+0.008
 blo+0.007
Neg logit contributions
 differences-0.007
 losers-0.007
 pleasures-0.007
affe-0.007
 similarities-0.007
Token the
Token ID262
Feature activation-4.00e-01

Pos logit contributions
 differences+0.013
 losers+0.012
 pleasures+0.011
affe+0.011
oom+0.011
Neg logit contributions
 gn-0.016
 Grac-0.016
 Nib-0.016
 Po-0.016
 Ho-0.015
Token water
Token ID1660
Feature activation-2.72e-01

Pos logit contributions
 differences+0.012
 losers+0.011
affe+0.011
 similarities+0.011
 pleasures+0.011
Neg logit contributions
 gn-0.009
 Po-0.009
 Grac-0.009
 Nib-0.008
 blo-0.008
Token.
Token ID13
Feature activation+2.29e-01

Pos logit contributions
 differences+0.008
 losers+0.008
 pleasures+0.008
 similarities+0.008
oom+0.007
Neg logit contributions
 gn-0.006
 Nib-0.006
 Grac-0.006
 Po-0.006
 called-0.005
Token slowly
Token ID6364
Feature activation-5.34e-01

Pos logit contributions
 differences+0.016
 losers+0.015
 similarities+0.015
 laws+0.014
affe+0.014
Neg logit contributions
ina-0.011
 blo-0.011
 Grac-0.010
 gn-0.010
 tucked-0.010
Token fade
Token ID22100
Feature activation-5.26e-01

Pos logit contributions
 differences+0.017
 losers+0.016
 similarities+0.015
affe+0.015
 pleasures+0.015
Neg logit contributions
 gn-0.011
 blo-0.011
 Nib-0.010
 Grac-0.010
 called-0.010
Token away
Token ID1497
Feature activation+6.72e-01

Pos logit contributions
 differences+0.017
 losers+0.017
 similarities+0.016
affe+0.016
 pleasures+0.016
Neg logit contributions
 Grac-0.009
 called-0.009
 gn-0.009
 Po-0.009
 Nib-0.009
Token.
Token ID13
Feature activation+2.48e-01

Pos logit contributions
 gn+0.015
 Grac+0.015
 Nib+0.014
 Po+0.014
ina+0.014
Neg logit contributions
 differences-0.020
 losers-0.019
 similarities-0.018
 pleasures-0.018
affe-0.018
Token<|endoftext|>
Token ID50256
Feature activation-3.61e-01

Pos logit contributions
 gn+0.004
 Grac+0.004
 Nib+0.004
 Po+0.004
 Maur+0.004
Neg logit contributions
 differences-0.009
 losers-0.008
 pleasures-0.008
 similarities-0.008
oom-0.008
TokenOnce
Token ID7454
Feature activation-3.13e-01

Pos logit contributions
 differences+0.013
 losers+0.012
 pleasures+0.012
oom+0.012
 similarities+0.012
Neg logit contributions
 gn-0.005
 Grac-0.005
 Nib-0.005
 Po-0.005
ina-0.004
Token upon
Token ID2402
Feature activation+8.26e-01

Pos logit contributions
 differences+0.011
 losers+0.011
 pleasures+0.010
 similarities+0.010
 laws+0.010
Neg logit contributions
 Grac-0.005
 Nib-0.005
 gn-0.005
 Po-0.004
ina-0.004
Token a
Token ID257
Feature activation-7.42e-01

Pos logit contributions
 Grac+0.018
 gn+0.017
 Nib+0.017
ina+0.016
icol+0.016
Neg logit contributions
 differences-0.024
 losers-0.023
 pleasures-0.022
Everyone-0.022
 team-0.021
Token time
Token ID640
Feature activation-5.19e-01

Pos logit contributions
 differences+0.010
 losers+0.008
 pleasures+0.007
 similarities+0.007
 laws+0.007
Neg logit contributions
 gn-0.028
 Grac-0.028
 Nib-0.028
 Po-0.027
ina-0.027
Token,
Token ID11
Feature activation-1.03e+00

Pos logit contributions
 differences+0.012
oom+0.012
 team+0.011
 losers+0.011
 pleasures+0.011
Neg logit contributions
 gn-0.013
 Grac-0.013
 Nib-0.013
ina-0.012
icol-0.012
Token there
Token ID612
Feature activation-6.07e-01

Pos logit contributions
 differences+0.026
 losers+0.024
oom+0.023
 pleasures+0.023
 similarities+0.023
Neg logit contributions
 gn-0.026
 Grac-0.026
 Nib-0.026
 Po-0.024
ina-0.024
Token\n
Token ID198
Feature activation+1.70e-02

Pos logit contributions
 gn+0.002
 Grac+0.002
 Nib+0.002
 Po+0.002
ina+0.002
Neg logit contributions
 differences-0.004
 losers-0.004
 pleasures-0.004
 similarities-0.004
oom-0.004
TokenTom
Token ID13787
Feature activation-9.10e-02

Pos logit contributions
 gn+0.000
 Nib+0.000
 Grac+0.000
 Po+0.000
 called+0.000
Neg logit contributions
 differences-0.000
 losers-0.000
 attitude-0.000
 pleasures-0.000
 missiles-0.000
Token and
Token ID290
Feature activation+3.15e-01

Pos logit contributions
 differences+0.002
 losers+0.002
affe+0.002
oom+0.002
 team+0.002
Neg logit contributions
 Grac-0.002
 Nib-0.002
 gn-0.002
 Maur-0.002
 Po-0.002
Token Lily
Token ID20037
Feature activation+3.14e-01

Pos logit contributions
 gn+0.005
 Po+0.005
 Nib+0.005
 called+0.005
 Ho+0.005
Neg logit contributions
 differences-0.012
 pleasures-0.012
oom-0.012
izing-0.011
 similarities-0.011
Token wanted
Token ID2227
Feature activation-6.62e-02

Pos logit contributions
 Grac+0.007
 gn+0.007
 Nib+0.007
 Po+0.006
 Maur+0.006
Neg logit contributions
 differences-0.009
 losers-0.008
 team-0.008
affe-0.008
 winds-0.008
Token to
Token ID284
Feature activation-5.05e-01

Pos logit contributions
 differences+0.002
 losers+0.002
 pleasures+0.002
oom+0.002
affe+0.002
Neg logit contributions
 gn-0.001
 Nib-0.001
 Grac-0.001
 Po-0.001
 Ho-0.001
Token say
Token ID910
Feature activation-3.81e-01

Pos logit contributions
 differences+0.014
 pleasures+0.013
 losers+0.013
 similarities+0.013
oom+0.013
Neg logit contributions
 gn-0.012
 Grac-0.012
 Nib-0.011
 Po-0.011
 called-0.010
Token hello
Token ID23748
Feature activation-1.45e-01

Pos logit contributions
 differences+0.011
 pleasures+0.011
oom+0.010
 losers+0.010
 similarities+0.010
Neg logit contributions
 Nib-0.009
 gn-0.009
 Po-0.008
 Grac-0.008
 Flor-0.008
Token to
Token ID284
Feature activation-4.00e-01

Pos logit contributions
 differences+0.003
 losers+0.003
oom+0.003
affe+0.003
 team+0.003
Neg logit contributions
 Grac-0.004
 gn-0.004
 Nib-0.004
 Po-0.004
 blo-0.004
Token the
Token ID262
Feature activation+6.69e-02

Pos logit contributions
 pleasures+0.013
 differences+0.013
oom+0.012
 losers+0.012
 efforts+0.011
Neg logit contributions
 Nib-0.009
 Po-0.008
 Ho-0.008
 called-0.008
 Gab-0.007
Token old
Token ID1468
Feature activation+2.02e-01

Pos logit contributions
 gn+0.002
 Nib+0.001
 closet+0.001
 waiting+0.001
 very+0.001
Neg logit contributions
oom-0.002
 differences-0.002
 losers-0.002
 pleasures-0.002
 efforts-0.002
Token slowly
Token ID6364
Feature activation-1.28e-01

Pos logit contributions
 differences+0.024
 pleasures+0.023
 losers+0.022
affe+0.022
 similarities+0.022
Neg logit contributions
 gn-0.017
 Po-0.016
 Nib-0.016
 Grac-0.016
 Maur-0.015
Token to
Token ID284
Feature activation-4.98e-01

Pos logit contributions
 differences+0.004
 losers+0.004
 similarities+0.004
 pleasures+0.004
affe+0.004
Neg logit contributions
 gn-0.003
 called-0.002
 Nib-0.002
 blo-0.002
 Grac-0.002
Token the
Token ID262
Feature activation-5.10e-01

Pos logit contributions
 differences+0.014
 pleasures+0.013
affe+0.013
 losers+0.013
oom+0.013
Neg logit contributions
 gn-0.012
 Nib-0.012
 Po-0.012
 Grac-0.012
 called-0.011
Token ground
Token ID2323
Feature activation+3.33e-01

Pos logit contributions
 differences+0.012
oom+0.012
 pleasures+0.012
 similarities+0.012
 losers+0.011
Neg logit contributions
 gn-0.013
 called-0.012
 least-0.012
 Po-0.012
 blo-0.012
Token.
Token ID13
Feature activation-2.01e-01

Pos logit contributions
 gn+0.007
 called+0.006
 Nib+0.006
 blo+0.006
 Grac+0.006
Neg logit contributions
 differences-0.011
 losers-0.010
 pleasures-0.010
 similarities-0.010
affe-0.010
Token Everyone
Token ID11075
Feature activation-7.29e-01

Pos logit contributions
 differences+0.006
 pleasures+0.005
 losers+0.005
emaker+0.005
oom+0.005
Neg logit contributions
 Grac-0.005
 Po-0.005
 gn-0.005
 Nib-0.005
 called-0.004
Token was
Token ID373
Feature activation-2.67e-01

Pos logit contributions
 differences+0.027
 pleasures+0.026
 losers+0.025
 similarities+0.025
affe+0.024
Neg logit contributions
 called-0.012
 Grac-0.011
 gn-0.010
ina-0.010
 blo-0.010
Token wondering
Token ID11263
Feature activation+3.80e-01

Pos logit contributions
 differences+0.009
 losers+0.009
 pleasures+0.008
 similarities+0.008
oom+0.008
Neg logit contributions
 gn-0.005
 Grac-0.005
 Nib-0.005
 Po-0.004
ina-0.004
Token what
Token ID644
Feature activation-5.40e-01

Pos logit contributions
 gn+0.006
 Grac+0.006
 Nib+0.006
 Po+0.006
 called+0.006
Neg logit contributions
 differences-0.013
 losers-0.013
oom-0.013
 pleasures-0.013
 similarities-0.012
Token would
Token ID561
Feature activation+2.20e-01

Pos logit contributions
 pleasures+0.021
affe+0.020
 thrill+0.020
oom+0.020
 losers+0.020
Neg logit contributions
 called-0.007
 very-0.007
 Po-0.006
 she-0.006
 searching-0.006
Token happen
Token ID1645
Feature activation+4.41e-01

Pos logit contributions
 gn+0.005
 Grac+0.005
 imagining+0.005
 searching+0.005
 called+0.005
Neg logit contributions
 differences-0.006
oom-0.006
affe-0.005
 pleasures-0.005
 losers-0.005
Token Jenny
Token ID28437
Feature activation+4.65e-01

Pos logit contributions
 Po+0.003
 Grac+0.003
 Nib+0.003
 gn+0.003
ina+0.003
Neg logit contributions
 differences-0.005
 losers-0.005
 pleasures-0.005
affe-0.005
 similarities-0.005
Token.
Token ID13
Feature activation+5.20e-01

Pos logit contributions
 gn+0.010
 Grac+0.010
 Po+0.010
 Nib+0.010
 called+0.009
Neg logit contributions
 differences-0.014
 losers-0.013
 pleasures-0.013
 similarities-0.013
 laws-0.012
Token Jenny
Token ID28437
Feature activation+8.97e-02

Pos logit contributions
 gn+0.011
 Nib+0.010
 Grac+0.010
ina+0.009
 Maur+0.009
Neg logit contributions
 differences-0.016
 losers-0.014
 similarities-0.014
oom-0.014
 pleasures-0.014
Token was
Token ID373
Feature activation-1.12e-01

Pos logit contributions
 gn+0.002
 Grac+0.002
 Nib+0.002
 Po+0.002
 called+0.002
Neg logit contributions
 differences-0.003
 pleasures-0.003
 losers-0.003
 similarities-0.003
oom-0.003
Token scared
Token ID12008
Feature activation+6.90e-01

Pos logit contributions
 differences+0.003
 losers+0.003
 laws+0.003
 law+0.003
 bully+0.003
Neg logit contributions
 Grac-0.003
 gn-0.003
 Nib-0.003
ina-0.002
icol-0.002
Token and
Token ID290
Feature activation-2.70e-01

Pos logit contributions
 Grac+0.013
 gn+0.013
 Nib+0.013
 Po+0.013
ina+0.012
Neg logit contributions
 differences-0.022
 losers-0.021
 similarities-0.021
 pleasures-0.021
oom-0.020
Token started
Token ID2067
Feature activation+7.07e-01

Pos logit contributions
 differences+0.008
 pleasures+0.008
 losers+0.008
oom+0.008
 similarities+0.008
Neg logit contributions
 gn-0.006
 Grac-0.006
 Nib-0.005
 Po-0.005
 called-0.005
Token running
Token ID2491
Feature activation+5.59e-01

Pos logit contributions
 Grac+0.017
 gn+0.017
 Nib+0.016
 Po+0.016
ina+0.015
Neg logit contributions
 differences-0.020
 losers-0.018
 pleasures-0.018
 similarities-0.017
oom-0.017
Token around
Token ID1088
Feature activation-1.40e-01

Pos logit contributions
 Grac+0.015
 gn+0.015
 Nib+0.015
 Po+0.014
ina+0.014
Neg logit contributions
 differences-0.014
 losers-0.013
 pleasures-0.012
oom-0.012
 laws-0.012
Token.
Token ID13
Feature activation+1.42e-01

Pos logit contributions
 differences+0.004
 losers+0.004
affe+0.004
oom+0.004
 pleasures+0.004
Neg logit contributions
 gn-0.003
 Grac-0.003
 Nib-0.003
 Po-0.003
 imagining-0.003
Token\n
Token ID198
Feature activation-9.59e-02

Pos logit contributions
 gn+0.003
 Nib+0.003
 Grac+0.003
ina+0.003
 blo+0.002
Neg logit contributions
 differences-0.004
 losers-0.004
 similarities-0.004
oom-0.004
 pleasures-0.004
Token,
Token ID11
Feature activation-4.76e-01

Pos logit contributions
 differences+0.019
 losers+0.018
 pleasures+0.018
 similarities+0.018
oom+0.017
Neg logit contributions
 Grac-0.011
 gn-0.011
 Nib-0.011
 Po-0.011
ina-0.010
Token Max
Token ID5436
Feature activation+1.28e-02

Pos logit contributions
 differences+0.016
 pleasures+0.015
oom+0.015
ag+0.015
 similarities+0.015
Neg logit contributions
 Po-0.010
 Grac-0.009
 Nib-0.009
 Ho-0.009
 called-0.008
Token,"
Token ID553
Feature activation+4.06e-01

Pos logit contributions
 gn+0.000
 Grac+0.000
 Nib+0.000
 Po+0.000
ina+0.000
Neg logit contributions
 differences-0.000
 losers-0.000
 pleasures-0.000
 similarities-0.000
oom-0.000
Token Tim
Token ID5045
Feature activation-7.19e-02

Pos logit contributions
 Grac+0.010
 gn+0.010
 Nib+0.010
 Po+0.010
 Ho+0.009
Neg logit contributions
 differences-0.011
 losers-0.010
 pleasures-0.010
oom-0.009
 similarities-0.009
Token said
Token ID531
Feature activation-1.14e+00

Pos logit contributions
 differences+0.002
 losers+0.002
affe+0.002
oom+0.002
 team+0.002
Neg logit contributions
 Grac-0.002
 Nib-0.001
 gn-0.001
 Maur-0.001
icol-0.001
Token.
Token ID13
Feature activation-5.37e-01

Pos logit contributions
 differences+0.034
 losers+0.031
 pleasures+0.031
 similarities+0.031
affe+0.029
Neg logit contributions
 Grac-0.026
 gn-0.026
 Nib-0.026
 Po-0.025
 Maur-0.024
Token "
Token ID366
Feature activation+9.31e-03

Pos logit contributions
 differences+0.013
 losers+0.012
 pleasures+0.011
affe+0.011
 similarities+0.011
Neg logit contributions
 gn-0.014
 Grac-0.014
 Nib-0.014
ina-0.014
 Po-0.013
TokenWe
Token ID1135
Feature activation+6.40e-01

Pos logit contributions
 gn+0.000
 Grac+0.000
 Po+0.000
 Nib+0.000
ina+0.000
Neg logit contributions
 differences-0.000
 losers-0.000
oom-0.000
 efforts-0.000
affe-0.000
Token made
Token ID925
Feature activation+7.90e-02

Pos logit contributions
 Grac+0.014
 Nib+0.013
 Po+0.013
 called+0.013
 gn+0.013
Neg logit contributions
 differences-0.019
 pleasures-0.018
 laws-0.018
 losers-0.017
 similarities-0.017
Token the
Token ID262
Feature activation-1.39e-01

Pos logit contributions
 Grac+0.002
 Nib+0.002
 Po+0.002
 gn+0.002
 Ho+0.002
Neg logit contributions
 differences-0.002
 pleasures-0.002
 losers-0.002
affe-0.002
 similarities-0.002
Token mess
Token ID2085
Feature activation+6.42e-01

Pos logit contributions
 differences+0.003
 losers+0.003
 pleasures+0.003
affe+0.003
oom+0.003
Neg logit contributions
 gn-0.004
 Nib-0.003
 Grac-0.003
 Po-0.003
 Ho-0.003
Token,
Token ID11
Feature activation-8.66e-01

Pos logit contributions
 differences+0.010
oom+0.009
 losers+0.009
 pleasures+0.009
 team+0.009
Neg logit contributions
 gn-0.010
 Grac-0.009
 Nib-0.009
ina-0.009
 Po-0.009
Token there
Token ID612
Feature activation-4.94e-01

Pos logit contributions
 differences+0.023
 losers+0.021
 pleasures+0.021
oom+0.020
 similarities+0.020
Neg logit contributions
 gn-0.021
 Grac-0.021
 Nib-0.021
 Po-0.020
ina-0.020
Token was
Token ID373
Feature activation+2.58e-01

Pos logit contributions
 differences+0.015
 losers+0.013
 pleasures+0.013
oom+0.013
 similarities+0.013
Neg logit contributions
 gn-0.011
 Grac-0.010
 Nib-0.010
 Po-0.010
ina-0.009
Token a
Token ID257
Feature activation-7.34e-01

Pos logit contributions
 gn+0.008
 Grac+0.007
ina+0.007
icol+0.007
 Nib+0.007
Neg logit contributions
 time-0.005
 missiles-0.005
 losers-0.005
 team-0.005
 victory-0.005
Token little
Token ID1310
Feature activation+3.27e-01

Pos logit contributions
 differences+0.016
 pleasures+0.014
 losers+0.013
 similarities+0.013
 laws+0.013
Neg logit contributions
 gn-0.024
 Grac-0.023
 Nib-0.023
 Po-0.023
 Ho-0.022
Token gir
Token ID37370
Feature activation-1.72e+00

Pos logit contributions
 gn+0.009
 Grac+0.009
 Nib+0.008
 Po+0.008
 Ho+0.008
Neg logit contributions
 differences-0.009
 pleasures-0.008
 similarities-0.008
 losers-0.008
 laws-0.007
Tokenaffe
Token ID21223
Feature activation+1.37e-01

Pos logit contributions
 differences+0.033
 pleasures+0.030
 shortcuts+0.027
 similarities+0.026
 minds+0.025
Neg logit contributions
ina-0.059
 called-0.059
 gn-0.058
 Nib-0.058
 Ho-0.055
Token named
Token ID3706
Feature activation+5.88e-01

Pos logit contributions
 Grac+0.002
 gn+0.002
 Nib+0.002
 Po+0.002
 Maur+0.002
Neg logit contributions
 differences-0.005
 losers-0.005
affe-0.005
 pleasures-0.005
 similarities-0.005
Token Gig
Token ID19525
Feature activation+5.50e-01

Pos logit contributions
 gn+0.014
 Grac+0.014
 Nib+0.014
 Po+0.013
ina+0.013
Neg logit contributions
 differences-0.016
 losers-0.015
 pleasures-0.015
 similarities-0.014
oom-0.014
Tokeni
Token ID72
Feature activation-6.04e-01

Pos logit contributions
 gn+0.007
 Grac+0.007
 Nib+0.007
 Po+0.006
ina+0.006
Neg logit contributions
 differences-0.021
 losers-0.020
 pleasures-0.020
 similarities-0.020
oom-0.019
Token.
Token ID13
Feature activation-3.89e-03

Pos logit contributions
 differences+0.014
 losers+0.013
affe+0.012
 pleasures+0.012
oom+0.012
Neg logit contributions
 Grac-0.017
 gn-0.017
 Nib-0.016
 Po-0.016
 Maur-0.015
Token.
Token ID13
Feature activation-1.12e+00

Pos logit contributions
 differences+0.023
 similarities+0.022
 losers+0.021
 pleasures+0.021
 sides+0.021
Neg logit contributions
 gn-0.014
 Nib-0.014
ina-0.013
 called-0.013
 Grac-0.013
Token We
Token ID775
Feature activation+5.09e-01

Pos logit contributions
 differences+0.031
 pleasures+0.029
 losers+0.028
oom+0.028
affe+0.028
Neg logit contributions
 Grac-0.027
 Nib-0.027
 Po-0.026
 gn-0.026
 Ho-0.025
Token were
Token ID547
Feature activation+3.27e-01

Pos logit contributions
 Grac+0.011
 gn+0.011
 Nib+0.011
ina+0.010
 Po+0.010
Neg logit contributions
 differences-0.015
 losers-0.014
 pleasures-0.014
 similarities-0.014
 laws-0.014
Token just
Token ID655
Feature activation-1.08e-01

Pos logit contributions
 gn+0.006
 Nib+0.006
 Grac+0.006
 Po+0.006
 called+0.006
Neg logit contributions
 differences-0.011
 pleasures-0.010
oom-0.010
affe-0.010
 similarities-0.010
Token playing
Token ID2712
Feature activation-8.06e-01

Pos logit contributions
 differences+0.004
 pleasures+0.004
 losers+0.004
affe+0.004
oom+0.004
Neg logit contributions
 gn-0.002
 Grac-0.002
 Nib-0.002
 Po-0.002
 imagining-0.002
Token.
Token ID13
Feature activation-1.55e+00

Pos logit contributions
 differences+0.025
affe+0.023
 similarities+0.023
 losers+0.023
 pleasures+0.023
Neg logit contributions
 gn-0.016
 Grac-0.016
 Po-0.016
 Nib-0.016
 called-0.015
Token Our
Token ID3954
Feature activation+2.37e-01

Pos logit contributions
 differences+0.038
 losers+0.035
 pleasures+0.034
oom+0.034
 similarities+0.033
Neg logit contributions
 Grac-0.041
 Nib-0.041
 gn-0.040
 Po-0.040
 Ho-0.038
Token parents
Token ID3397
Feature activation-4.57e-01

Pos logit contributions
 gn+0.006
 Nib+0.006
 Po+0.005
 Grac+0.005
 closet+0.005
Neg logit contributions
affe-0.006
 losers-0.006
 differences-0.006
 pleasures-0.006
folk-0.006
Token are
Token ID389
Feature activation-1.91e-01

Pos logit contributions
 differences+0.014
 pleasures+0.014
 missiles+0.014
 similarities+0.013
 losers+0.013
Neg logit contributions
 called-0.009
 Grac-0.008
 blo-0.008
 gn-0.008
 tucked-0.008
Token at
Token ID379
Feature activation-9.09e-01

Pos logit contributions
 differences+0.006
 pleasures+0.006
 losers+0.005
oom+0.005
 similarities+0.005
Neg logit contributions
 gn-0.004
 Grac-0.004
 Nib-0.004
 Po-0.004
ina-0.003
Token the
Token ID262
Feature activation-3.38e-01

Pos logit contributions
 differences+0.022
 losers+0.021
 pleasures+0.020
 similarities+0.019
oom+0.019
Neg logit contributions
 gn-0.024
 Grac-0.024
 Nib-0.024
 Po-0.023
ina-0.022
Token careful
Token ID8161
Feature activation-1.16e+00

Pos logit contributions
 differences+0.010
 losers+0.010
 pleasures+0.009
affe+0.009
 similarities+0.009
Neg logit contributions
 gn-0.013
 Grac-0.013
 Nib-0.013
 Po-0.012
ina-0.012
Token,
Token ID11
Feature activation-3.18e-01

Pos logit contributions
 similarities+0.041
 differences+0.041
 losers+0.041
affe+0.039
 pleasures+0.039
Neg logit contributions
ina-0.017
 Grac-0.016
 called-0.016
 gn-0.016
 Po-0.014
Token Lily
Token ID20037
Feature activation+2.39e-01

Pos logit contributions
 differences+0.010
oom+0.009
ag+0.009
affe+0.009
 pleasures+0.009
Neg logit contributions
 Grac-0.007
 Po-0.007
 gn-0.007
 Nib-0.006
 called-0.006
Token.
Token ID13
Feature activation-7.56e-01

Pos logit contributions
ina+0.004
 gn+0.004
 called+0.004
 Grac+0.004
 Po+0.004
Neg logit contributions
 differences-0.008
 losers-0.008
 similarities-0.008
 pleasures-0.008
 missiles-0.007
Token Cater
Token ID40742
Feature activation+6.61e-01

Pos logit contributions
 differences+0.023
 pleasures+0.022
 losers+0.022
 missiles+0.021
oom+0.021
Neg logit contributions
 Po-0.018
 Grac-0.017
 Nib-0.017
 Ho-0.016
 gn-0.016
Tokenpill
Token ID27215
Feature activation-1.63e+00

Pos logit contributions
 Grac+0.016
 gn+0.016
 Po+0.016
ina+0.015
 Nib+0.015
Neg logit contributions
 differences-0.019
 losers-0.017
 similarities-0.017
 pleasures-0.017
 missiles-0.016
Tokenars
Token ID945
Feature activation-3.55e-01

Pos logit contributions
 differences+0.030
oom+0.028
adoes+0.028
 pleasures+0.027
 missiles+0.026
Neg logit contributions
 Nib-0.048
 called-0.048
 gn-0.048
 blo-0.047
 Grac-0.046
Token can
Token ID460
Feature activation+4.77e-01

Pos logit contributions
 differences+0.011
 pleasures+0.010
affe+0.010
 similarities+0.010
 laws+0.010
Neg logit contributions
 called-0.006
 Nib-0.006
 Grac-0.006
ina-0.006
 Po-0.006
Token be
Token ID307
Feature activation-1.04e-01

Pos logit contributions
 gn+0.011
 Nib+0.011
 Grac+0.011
 Po+0.011
 Ho+0.010
Neg logit contributions
 differences-0.014
 pleasures-0.013
 similarities-0.013
 attitude-0.012
 losers-0.012
Token tricky
Token ID17198
Feature activation+7.22e-01

Pos logit contributions
 differences+0.003
oom+0.003
 pleasures+0.003
affe+0.003
 similarities+0.003
Neg logit contributions
 gn-0.002
 called-0.002
 Po-0.002
 Nib-0.002
 Grac-0.002
Token.
Token ID13
Feature activation-8.84e-01

Pos logit contributions
 gn+0.014
 Grac+0.014
 Po+0.013
 Nib+0.013
 called+0.013
Neg logit contributions
 differences-0.022
 losers-0.021
 similarities-0.021
oom-0.020
affe-0.020
Token creative
Token ID7325
Feature activation-1.16e-01

Pos logit contributions
 gn+0.015
 Grac+0.015
 Nib+0.014
ina+0.014
 Po+0.014
Neg logit contributions
 differences-0.017
 losers-0.016
 laws-0.015
 pleasures-0.015
 missiles-0.015
Token muff
Token ID27563
Feature activation-3.03e-01

Pos logit contributions
 differences+0.003
 losers+0.003
 pleasures+0.003
 similarities+0.003
oom+0.003
Neg logit contributions
 gn-0.003
 Grac-0.003
 Nib-0.003
 Po-0.003
 Maur-0.002
Tokenins
Token ID1040
Feature activation-6.41e-01

Pos logit contributions
 differences+0.008
 losers+0.007
 pleasures+0.007
 similarities+0.007
affe+0.007
Neg logit contributions
 Grac-0.008
 gn-0.007
 Nib-0.007
ina-0.007
 Po-0.007
Token like
Token ID588
Feature activation-6.13e-01

Pos logit contributions
 differences+0.021
 losers+0.019
 pleasures+0.019
 similarities+0.019
oom+0.018
Neg logit contributions
 Grac-0.012
 gn-0.012
 Nib-0.012
 Po-0.011
ina-0.011
Token this
Token ID428
Feature activation-6.60e-01

Pos logit contributions
 differences+0.016
 pleasures+0.015
 losers+0.015
oom+0.015
 similarities+0.014
Neg logit contributions
 Grac-0.015
 Nib-0.015
 gn-0.015
 Po-0.015
 Ho-0.014
Token one
Token ID530
Feature activation-1.57e+00

Pos logit contributions
 differences+0.021
 losers+0.019
 pleasures+0.019
 similarities+0.018
oom+0.018
Neg logit contributions
 gn-0.013
 Grac-0.013
 Nib-0.013
 Po-0.012
ina-0.012
Token at
Token ID379
Feature activation-4.70e-01

Pos logit contributions
 differences+0.060
 pleasures+0.057
 losers+0.056
 similarities+0.056
oom+0.055
Neg logit contributions
 Grac-0.020
 gn-0.020
 Nib-0.019
 called-0.018
 Po-0.018
Token home
Token ID1363
Feature activation+3.29e-01

Pos logit contributions
 differences+0.010
 losers+0.009
 pleasures+0.009
 similarities+0.009
oom+0.009
Neg logit contributions
 gn-0.014
 Grac-0.014
 Nib-0.014
 Po-0.013
ina-0.013
Token?"
Token ID1701
Feature activation+4.70e-01

Pos logit contributions
 gn+0.005
 Grac+0.005
 Nib+0.005
ina+0.004
 Po+0.004
Neg logit contributions
 differences-0.012
 losers-0.011
 pleasures-0.011
 similarities-0.011
oom-0.011
Token\n
Token ID198
Feature activation+1.34e-03

Pos logit contributions
 gn+0.008
 Grac+0.008
 Nib+0.007
 Po+0.007
 Maur+0.006
Neg logit contributions
 differences-0.017
 losers-0.016
 pleasures-0.015
 similarities-0.015
oom-0.015
Token\n
Token ID198
Feature activation-2.79e-01

Pos logit contributions
 gn+0.000
 Grac+0.000
 Nib+0.000
 Po+0.000
 Maur+0.000
Neg logit contributions
 differences-0.000
 losers-0.000
 pleasures-0.000
 similarities-0.000
 laws-0.000
TokenDid
Token ID11633
Feature activation-1.49e-01

Pos logit contributions
 differences+0.005
 losers+0.004
 pleasures+0.004
 similarities+0.004
oom+0.004
Neg logit contributions
 gn-0.004
 Grac-0.004
 Nib-0.004
 Po-0.004
ina-0.004
Token you
Token ID345
Feature activation-5.90e-01

Pos logit contributions
 pleasures+0.005
oom+0.005
 differences+0.005
 thrill+0.005
 attitude+0.005
Neg logit contributions
 Nib-0.003
 Po-0.003
 Ho-0.003
 Grac-0.003
 gn-0.003
Token not
Token ID407
Feature activation+2.65e-01

Pos logit contributions
 differences+0.018
 pleasures+0.018
 similarities+0.017
 laws+0.016
 sides+0.016
Neg logit contributions
 called-0.013
 Nib-0.013
 gn-0.012
 Grac-0.012
 Po-0.012
Token remember
Token ID3505
Feature activation-1.30e-01

Pos logit contributions
 gn+0.006
 Nib+0.006
 Grac+0.006
 called+0.006
 Po+0.006
Neg logit contributions
 differences-0.008
 pleasures-0.007
 similarities-0.007
 losers-0.007
 laws-0.007
Token what
Token ID644
Feature activation-1.22e+00

Pos logit contributions
 differences+0.004
 pleasures+0.004
 losers+0.003
affe+0.003
 similarities+0.003
Neg logit contributions
 gn-0.003
 Nib-0.003
 Grac-0.003
 Po-0.003
 Ho-0.003
Token I
Token ID314
Feature activation-1.73e+00

Pos logit contributions
 differences+0.037
 pleasures+0.036
oom+0.035
 thrill+0.034
 similarities+0.034
Neg logit contributions
 called-0.026
 Po-0.025
 gn-0.024
 Grac-0.024
 Nib-0.024
Token told
Token ID1297
Feature activation-1.65e-01

Pos logit contributions
 differences+0.044
 losers+0.042
oom+0.039
 team+0.039
 similarities+0.038
Neg logit contributions
 Grac-0.043
 gn-0.042
 Nib-0.042
 Maur-0.042
icol-0.041
Token you
Token ID345
Feature activation-1.03e+00

Pos logit contributions
 differences+0.004
 pleasures+0.004
 similarities+0.004
 losers+0.004
affe+0.004
Neg logit contributions
 Grac-0.004
 Nib-0.004
 Po-0.004
 gn-0.004
 called-0.004
Token?"
Token ID1701
Feature activation+3.36e-01

Pos logit contributions
 differences+0.037
 losers+0.035
 pleasures+0.035
 similarities+0.034
oom+0.034
Neg logit contributions
 gn-0.016
 Nib-0.015
 Grac-0.015
 called-0.014
 Po-0.014
Token\n
Token ID198
Feature activation-6.63e-02

Pos logit contributions
 gn+0.009
icol+0.009
 Grac+0.008
 Nib+0.008
 blo+0.008
Neg logit contributions
 differences-0.007
Everyone-0.007
 law-0.007
 laws-0.007
 similarities-0.006
Token\n
Token ID198
Feature activation-2.66e-01

Pos logit contributions
 differences+0.002
Everyone+0.002
 losers+0.002
oom+0.002
 similarities+0.002
Neg logit contributions
 gn-0.001
 Grac-0.001
 Nib-0.001
ina-0.001
 blo-0.001