TOP ACTIVATIONS
MAX = 2.698

at the red blocks in amaz ement . He couldn 't
She opened her mouth in amaz ement . She had never
The girl stepped back in amaz ement as the dark started
! You are ruining my deal !" The second
books back on the books he lf , smiling proudly .
the people cl apped in amaz ement . The singing rock
watched the colorful lights in amaz ement , feeling truly lucky
bird looked at her in amaz ement , and said :
did not make a bad deal . <|endoftext|> Tom and Mia
looked at the bug in amaz ement . She was so
gazed at the butterfly in amaz ement . But
just stared at him in amaz ement . So , he
gazed at Ad eline in amaz ement . As she sang
but they ran back in amaz ement ! The post had
shaking . Mom watched in amaz ement from a distance ,
the books on his books he lf .
up at the tree in amaz ement , wondering how it
trick . Everyone watched in amaz ement and applauded the young
not mean to break the rules . They did not mean
his mummy , trembling in amaz ement . They had just

MIN ACTIVATIONS
MIN = -4.877

When Daisy put the magn ifying glass down , the cater
She takes out a magn ifying glass . She looks at
flashlight , and a magn ifying glass . As
the necklace and the magn ifying glass and started to examine
a flashlight and a magn ifying glass to examine their finds
the watch with the magn ifying glass . He saw the
said , holding the magn ifying glass over a black ant
He took out a magn ifying glass and a notebook and
, handing him the magn ifying glass . She looked around
He brought his magn ifying glass and looked closely .
, next to her magn ifying glass . She looked at
it she had a magn ifying glass , a tiny torch
the necklace and the magn ifying glass and told him that
He pointed his magn ifying glass at different classmates and
She had a big magn ifying glass that made things look
She took out a magn ifying glass and looked very carefully
hat and grabbed her magn ifying glass . She was ready
borrowed her mom 's magn ifying glass to take a closer
examine it with the magn ifying glass ?" "
cried , dropping the magn ifying glass . "

INTERVAL 0.804 - 2.698
CONTAINS 2.454%

laughed . Then Alice said it was time for dinner ,
man and he wanted a wife to help him count .
She loved to drink tea with her mom every day .
. And you rang the bell . That is very bad
She loved to bake cookies with her mom my in the

INTERVAL -1.090 - 0.804
CONTAINS 96.944%

. Suddenly , Jack heard a voice saying
That is not nice !" " We want the
now ." Tim my was sad , but he got up
" Hello , Mr . Snow man !" Sara said .
day . Every day , people passed by his shop and

INTERVAL -2.984 - -1.090
CONTAINS 0.601%

, " Mom my , are they done baking yet ?"
Welcome to our restaurant . What would you like to order
two dogs fighting outside . Mitt ens didn 't like fighting
so he requested , " Bird , come play with me
know what to do . Mitt ens saw that Lucy was

INTERVAL -4.877 - -2.984
CONTAINS 0.000%

He took out a magn ifying glass and a notebook and
. They had a magn ifying glass and a small box
He pointed his magn ifying glass at different classmates and
borrowed her mom 's magn ifying glass to take a closer
said , holding the magn ifying glass over a black ant
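
The four INTERVAL ranges above appear to be equal-width bins spanning MIN to MAX. The dump does not state the binning rule, so that is an assumption here. The sketch below recomputes the edges from the two header values and compares them with the listed ones; the names `equal_width_edges`, `listed_edges`, and `listed_shares` are illustrative, not part of the dashboard.

```python
# Values copied from the header lines of the dump.
ACT_MIN = -4.877
ACT_MAX = 2.698
N_BINS = 4  # assumption: four equal-width intervals


def equal_width_edges(lo, hi, n_bins):
    """Return the n_bins + 1 edges of equal-width bins over [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


edges = equal_width_edges(ACT_MIN, ACT_MAX, N_BINS)

# Interval edges as printed in the dump, lowest first.
listed_edges = [-4.877, -2.984, -1.090, 0.804, 2.698]

# "CONTAINS" shares as printed (percent), lowest interval first.
listed_shares = [0.000, 0.601, 96.944, 2.454]

max_edge_gap = max(abs(a - b) for a, b in zip(edges, listed_edges))
total_share = sum(listed_shares)

print([round(e, 3) for e in edges])
print(f"largest gap to listed edges: {max_edge_gap:.4f}")
print(f"sum of listed shares: {total_share:.3f}%")
```

The recomputed edges agree with the listed ones to within about 0.001, which is consistent with MIN and MAX having been rounded to three decimals before display. The same rounding would explain why the lowest interval reports 0.000% while still listing example sequences.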
Token at
Token ID379
Feature activation-6.73e-02

Pos logit contributions
izza+0.004
 Acc+0.004
 cooled+0.004
ugg+0.004
 cinnamon+0.004
Neg logit contributions
 girl-0.001
 after-0.001
 Lena-0.001
 calcul-0.001
 imagining-0.001
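
The per-token records in this dump fuse each label with its value ("Token ID379", "Feature activation-6.73e-02", "izza+0.004"). A minimal sketch for splitting such lines is given below, assuming every value is a signed decimal, optionally in scientific notation, at the end of the line. The helper names `split_label_value` and `parse_token_id` are hypothetical. Leading whitespace is kept because it appears to be part of the token itself (" Acc" versus "izza").

```python
import re

# Trailing signed number, optionally in scientific notation (e.g. -6.73e-02).
# The non-greedy prefix keeps the label as short as possible.
VALUE_RE = re.compile(r"^(.*?)([+-]\d+\.\d+(?:e[+-]\d+)?)$")
TOKEN_ID_RE = re.compile(r"^Token ID(\d+)$")


def split_label_value(line):
    """Split 'label<signed number>' into (label, value); None if no match."""
    m = VALUE_RE.match(line)
    if m is None:
        return None
    return m.group(1), float(m.group(2))


def parse_token_id(line):
    """Return the integer ID from a 'Token ID<digits>' line, else None."""
    m = TOKEN_ID_RE.match(line)
    return int(m.group(1)) if m else None


print(parse_token_id("Token ID379"))
print(split_label_value("Feature activation-6.73e-02"))
print(split_label_value(" Acc+0.004"))
print(split_label_value("Token at"))
```

Lines without a trailing signed number, such as "Token at" or "Pos logit contributions", return `None`, so they can be handled as labels or section headers by the caller. A contribution line whose token is empty or pure whitespace yields an empty or whitespace-only label rather than an error.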
Token the
Token ID262
Feature activation-2.26e-01

Pos logit contributions
izza+0.002
 Acc+0.002
melon+0.002
 cooled+0.002
icking+0.002
Neg logit contributions
 Lena-0.001
Sab-0.001
robat-0.001
 calcul-0.001
rob-0.001
Token red
Token ID2266
Feature activation+9.41e-01

Pos logit contributions
izza+0.007
melon+0.006
 Acc+0.006
 cooled+0.006
icking+0.006
Neg logit contributions
 Lena-0.004
Sab-0.004
robat-0.004
 calcul-0.004
rob-0.004
Token blocks
Token ID7021
Feature activation-6.30e-01

Pos logit contributions
 Lena+0.018
 calcul+0.017
robat+0.016
Sab+0.016
 girl+0.016
Neg logit contributions
 Acc-0.028
izza-0.028
 cooled-0.028
ugg-0.027
icking-0.027
Token in
Token ID287
Feature activation+5.43e-02

Pos logit contributions
 Acc+0.026
izza+0.025
ugg+0.025
 cooled+0.024
acket+0.024
Neg logit contributions
 Lena-0.005
 calcul-0.005
b-0.005
 imagining-0.005
 after-0.005
Token amaz
Token ID40642
Feature activation+2.70e+00

Pos logit contributions
 Lena+0.001
robat+0.001
Sab+0.001
rob+0.001
 calcul+0.001
Neg logit contributions
izza-0.002
melon-0.002
 Acc-0.002
 cooled-0.002
 cinnamon-0.002
Tokenement
Token ID972
Feature activation+7.65e-01

Pos logit contributions
 Lena+0.093
Sab+0.090
robat+0.089
 calcul+0.089
 pupil+0.087
Neg logit contributions
izza-0.037
 Acc-0.036
 cooled-0.035
melon-0.035
icking-0.034
Token.
Token ID13
Feature activation-8.00e-02

Pos logit contributions
 Lena+0.009
 imagining+0.008
b+0.008
 than+0.007
 after+0.007
Neg logit contributions
 Acc-0.029
izza-0.028
 cooled-0.028
melon-0.028
 cinnamon-0.027
Token He
Token ID679
Feature activation-3.25e-01

Pos logit contributions
 cooled+0.003
 drank+0.003
icking+0.003
uddled+0.003
melon+0.003
Neg logit contributions
 Lena-0.001
 Mir-0.001
Sab-0.001
 Made-0.001
 Taylor-0.001
Token couldn
Token ID3521
Feature activation-1.87e-01

Pos logit contributions
 Acc+0.010
ugg+0.010
melon+0.010
icking+0.010
izza+0.010
Neg logit contributions
 Lena-0.006
b-0.006
 imagined-0.006
Grand-0.005
 figures-0.005
Token't
Token ID470
Feature activation-1.73e-01

Pos logit contributions
izza+0.007
 Acc+0.007
 cooled+0.006
atted+0.006
melon+0.006
Neg logit contributions
 Lena-0.003
Sab-0.002
 calcul-0.002
robat-0.002
Mel-0.002
Token She
Token ID1375
Feature activation-3.03e-01

Pos logit contributions
melon+0.006
icking+0.006
 cooled+0.006
izza+0.006
uddled+0.006
Neg logit contributions
 Lena-0.002
 girl-0.001
Sab-0.001
 calcul-0.001
 Taylor-0.001
Token opened
Token ID4721
Feature activation-1.66e-02

Pos logit contributions
izza+0.009
melon+0.009
 Acc+0.009
icking+0.008
 cinnamon+0.008
Neg logit contributions
 Lena-0.006
Sab-0.006
 imagined-0.006
 imagining-0.006
b-0.006
Token her
Token ID607
Feature activation-3.74e-02

Pos logit contributions
izza+0.001
 cooled+0.001
melon+0.001
 Acc+0.001
uttered+0.001
Neg logit contributions
robat-0.000
 Lena-0.000
rob-0.000
 calcul-0.000
Sab-0.000
Token mouth
Token ID5422
Feature activation-2.38e-01

Pos logit contributions
izza+0.001
 Acc+0.001
 cooled+0.001
melon+0.001
icking+0.001
Neg logit contributions
 Lena-0.001
Sab-0.001
robat-0.001
 calcul-0.001
rob-0.001
Token in
Token ID287
Feature activation-2.54e-01

Pos logit contributions
 Acc+0.009
izza+0.009
melon+0.008
 cooled+0.008
 cinnamon+0.008
Neg logit contributions
 Lena-0.003
Sab-0.002
Grand-0.002
b-0.002
 woman-0.002
Token amaz
Token ID40642
Feature activation+2.62e+00

Pos logit contributions
izza+0.009
 Acc+0.009
melon+0.009
 cooled+0.009
icking+0.009
Neg logit contributions
 Lena-0.003
Sab-0.003
 calcul-0.003
robat-0.003
 imagining-0.003
Tokenement
Token ID972
Feature activation+6.82e-01

Pos logit contributions
 Lena+0.087
Sab+0.083
robat+0.083
 calcul+0.082
rob+0.080
Neg logit contributions
izza-0.041
 Acc-0.040
 cooled-0.039
melon-0.039
icking-0.037
Token.
Token ID13
Feature activation-1.16e-01

Pos logit contributions
 Lena+0.008
 imagining+0.008
 after+0.007
b+0.007
Sab+0.007
Neg logit contributions
izza-0.026
 Acc-0.025
 cooled-0.025
melon-0.025
 cinnamon-0.024
Token She
Token ID1375
Feature activation-2.81e-01

Pos logit contributions
icking+0.005
melon+0.004
 cooled+0.004
uddled+0.004
atted+0.004
Neg logit contributions
 Lena-0.002
Sab-0.001
 Mir-0.001
 calcul-0.001
 Taylor-0.001
Token had
Token ID550
Feature activation-4.19e-01

Pos logit contributions
melon+0.008
izza+0.008
 Acc+0.008
icking+0.008
ugg+0.007
Neg logit contributions
 Lena-0.006
 imagined-0.006
b-0.006
 imagining-0.005
Sab-0.005
Token never
Token ID1239
Feature activation-7.23e-01

Pos logit contributions
izza+0.012
 Acc+0.012
melon+0.012
icking+0.012
uddled+0.011
Neg logit contributions
 Lena-0.009
Sab-0.008
 calcul-0.008
 imagining-0.008
 woman-0.007
Token The
Token ID383
Feature activation+1.09e-01

Pos logit contributions
melon+0.004
icking+0.004
 cooled+0.003
izza+0.003
acket+0.003
Neg logit contributions
 Lena-0.001
 Made-0.001
 Mir-0.001
 Taylor-0.001
 girl-0.001
Token girl
Token ID2576
Feature activation+4.22e-01

Pos logit contributions
 Lena+0.002
robat+0.001
rob+0.001
Sab+0.001
 calcul+0.001
Neg logit contributions
 Acc-0.004
izza-0.004
 cooled-0.004
melon-0.004
icking-0.004
Token stepped
Token ID10764
Feature activation+3.96e-01

Pos logit contributions
 Lena+0.008
b+0.008
Sab+0.007
 girl+0.007
 figures+0.007
Neg logit contributions
izza-0.014
melon-0.013
 Acc-0.013
 cinnamon-0.012
 Enjoy-0.012
Token back
Token ID736
Feature activation-2.02e-01

Pos logit contributions
 Lena+0.006
Sab+0.005
 calcul+0.005
 girl+0.005
b+0.005
Neg logit contributions
izza-0.014
 Acc-0.014
melon-0.013
 cooled-0.013
 cinnamon-0.013
Token in
Token ID287
Feature activation-2.47e-01

Pos logit contributions
izza+0.007
 Acc+0.007
melon+0.007
 cooled+0.007
icking+0.006
Neg logit contributions
 Lena-0.003
Sab-0.003
 calcul-0.003
robat-0.003
 pupil-0.003
Token amaz
Token ID40642
Feature activation+2.51e+00

Pos logit contributions
 cooled+0.007
 Acc+0.007
izza+0.007
melon+0.007
 drank+0.007
Neg logit contributions
rob-0.005
robat-0.005
Sab-0.005
 Lena-0.005
b-0.005
Tokenement
Token ID972
Feature activation+6.94e-01

Pos logit contributions
 Lena+0.092
Sab+0.090
robat+0.090
 calcul+0.089
 pupil+0.089
Neg logit contributions
atted-0.028
uddled-0.028
 Acc-0.027
izza-0.027
icking-0.027
Token as
Token ID355
Feature activation+2.83e-01

Pos logit contributions
 Lena+0.008
 imagining+0.007
 after+0.007
Sab+0.007
 woman+0.007
Neg logit contributions
izza-0.026
 Acc-0.026
 cooled-0.025
melon-0.025
 cinnamon-0.025
Token the
Token ID262
Feature activation+1.91e-01

Pos logit contributions
 Lena+0.003
Sab+0.003
 woman+0.002
 girl+0.002
 pupil+0.002
Neg logit contributions
izza-0.011
melon-0.011
 cooled-0.011
 Acc-0.011
icking-0.011
Token dark
Token ID3223
Feature activation+2.07e-01

Pos logit contributions
rob+0.004
robat+0.004
Sab+0.004
 Lena+0.004
 calcul+0.003
Neg logit contributions
 Acc-0.005
 cooled-0.005
izza-0.005
 drank-0.005
melon-0.005
Token started
Token ID2067
Feature activation-5.56e-01

Pos logit contributions
 Lena+0.004
Sab+0.003
 calcul+0.003
robat+0.003
 girl+0.003
Neg logit contributions
izza-0.007
 Acc-0.007
melon-0.007
 cooled-0.006
icking-0.006
Token!
Token ID0
Feature activation-8.82e-02

Pos logit contributions
izza+0.020
 Acc+0.020
uttered+0.019
 cooled+0.019
acket+0.019
Neg logit contributions
 Lena-0.007
cest-0.007
b-0.006
 witch-0.006
 girl-0.006
Token You
Token ID921
Feature activation+8.26e-01

Pos logit contributions
 cooled+0.003
uddled+0.003
izza+0.003
melon+0.003
icking+0.003
Neg logit contributions
 Lena-0.001
Sab-0.001
 calcul-0.001
 pupil-0.001
 Made-0.001
Token are
Token ID389
Feature activation-1.63e-01

Pos logit contributions
 Lena+0.016
 calcul+0.015
 figures+0.015
 spy+0.015
robat+0.015
Neg logit contributions
 Acc-0.025
izza-0.024
melon-0.023
uddled-0.023
ugg-0.023
Token ruining
Token ID39842
Feature activation-4.38e-02

Pos logit contributions
izza+0.005
 Acc+0.005
 cooled+0.005
melon+0.005
icking+0.004
Neg logit contributions
 Lena-0.003
Sab-0.003
 calcul-0.003
robat-0.003
 pupil-0.003
Token my
Token ID616
Feature activation+2.46e-01

Pos logit contributions
izza+0.002
 cooled+0.002
 Acc+0.002
melon+0.002
uttered+0.002
Neg logit contributions
 Lena-0.001
Sab-0.000
 calcul-0.000
robat-0.000
 pupil-0.000
Token deal
Token ID1730
Feature activation+2.51e+00

Pos logit contributions
 Lena+0.004
Sab+0.004
robat+0.004
 calcul+0.004
 imagining+0.004
Neg logit contributions
izza-0.008
 Acc-0.008
melon-0.008
 cooled-0.008
icking-0.007
Token!"
Token ID2474
Feature activation-3.11e-02

Pos logit contributions
 Lena+0.050
Sab+0.048
 calcul+0.046
robat+0.046
rob+0.044
Neg logit contributions
izza-0.072
melon-0.070
 Acc-0.069
 cooled-0.069
icking-0.068
Token
Token ID198
Feature activation+5.63e-03

Pos logit contributions
izza+0.001
 Acc+0.001
melon+0.001
 cooled+0.001
icking+0.001
Neg logit contributions
 Lena-0.001
Sab-0.000
robat-0.000
 calcul-0.000
rob-0.000
Token
Token ID198
Feature activation+2.83e-01

Pos logit contributions
robat+0.000
 calcul+0.000
 imagining+0.000
 pupil+0.000
 student+0.000
Neg logit contributions
izza-0.000
 Acc-0.000
 cooled-0.000
ugg-0.000
melon-0.000
TokenThe
Token ID464
Feature activation-4.30e-02

Pos logit contributions
 imagining+0.006
robat+0.006
Sab+0.006
rob+0.006
 calcul+0.006
Neg logit contributions
izza-0.007
 apologised-0.007
 Acc-0.007
 apologized-0.007
 calmed-0.007
Token second
Token ID1218
Feature activation+3.38e-01

Pos logit contributions
izza+0.001
melon+0.001
 cooled+0.001
 Acc+0.001
 cinnamon+0.001
Neg logit contributions
 Lena-0.001
Sab-0.001
robat-0.001
rob-0.001
 calcul-0.001
Token books
Token ID3835
Feature activation-4.60e-01

Pos logit contributions
Sab+0.001
cest+0.001
 Lena+0.001
quel+0.001
ang+0.001
Neg logit contributions
izza-0.001
 cinnamon-0.001
 cooled-0.001
 stomach-0.001
 farewell-0.001
Token back
Token ID736
Feature activation-6.07e-02

Pos logit contributions
izza+0.015
 Acc+0.015
 cooled+0.015
melon+0.015
icking+0.014
Neg logit contributions
 Lena-0.007
robat-0.006
Sab-0.006
 calcul-0.006
 pupil-0.006
Token on
Token ID319
Feature activation-2.83e-02

Pos logit contributions
 Acc+0.002
izza+0.002
 cooled+0.002
melon+0.002
 cinnamon+0.002
Neg logit contributions
 Lena-0.001
b-0.001
 imagining-0.001
 calcul-0.001
 girl-0.001
Token the
Token ID262
Feature activation+5.00e-01

Pos logit contributions
izza+0.001
 Acc+0.001
 cooled+0.001
melon+0.001
uddled+0.001
Neg logit contributions
 Lena-0.000
 calcul-0.000
robat-0.000
Sab-0.000
 student-0.000
Token books
Token ID3835
Feature activation+3.85e-01

Pos logit contributions
Sab+0.009
 Lena+0.009
rob+0.009
b+0.009
robat+0.009
Neg logit contributions
izza-0.015
 Acc-0.014
 cooled-0.014
melon-0.013
 cinnamon-0.013
Tokenhe
Token ID258
Feature activation+2.50e+00

Pos logit contributions
 Lena+0.005
Sab+0.004
robat+0.004
 calcul+0.004
rob+0.004
Neg logit contributions
izza-0.014
 Acc-0.014
melon-0.013
 cooled-0.013
icking-0.013
Tokenlf
Token ID1652
Feature activation+6.95e-01

Pos logit contributions
Sab+0.060
b+0.056
ang+0.055
Do+0.055
Gu+0.055
Neg logit contributions
izza-0.055
 cooled-0.054
 stomach-0.053
melon-0.052
 drank-0.052
Token,
Token ID11
Feature activation+1.64e-01

Pos logit contributions
 Lena+0.011
robat+0.010
 calcul+0.010
Sab+0.009
 student+0.009
Neg logit contributions
 Acc-0.023
izza-0.023
 cooled-0.022
melon-0.022
ugg-0.022
Token smiling
Token ID16755
Feature activation+1.61e-01

Pos logit contributions
 Lena+0.002
 imagining+0.002
 woman+0.002
 calcul+0.002
 girl+0.002
Neg logit contributions
izza-0.006
 Acc-0.006
ugg-0.006
uttered-0.006
uned-0.006
Token proudly
Token ID23017
Feature activation+1.88e-01

Pos logit contributions
 Lena+0.002
 girl+0.002
 pupil+0.002
 after+0.002
cest+0.002
Neg logit contributions
 Acc-0.006
izza-0.006
melon-0.006
 cooled-0.006
 cinnamon-0.006
Token.
Token ID13
Feature activation-2.75e-01

Pos logit contributions
 Lena+0.003
 calcul+0.003
robat+0.003
Sab+0.003
 pupil+0.003
Neg logit contributions
izza-0.006
 Acc-0.006
melon-0.006
 cooled-0.006
 cinnamon-0.006
Token the
Token ID262
Feature activation+7.79e-03

Pos logit contributions
 Lena+0.005
 girl+0.004
 woman+0.004
Sab+0.004
 after+0.004
Neg logit contributions
izza-0.011
 Acc-0.011
melon-0.011
ugg-0.010
icking-0.010
Token people
Token ID661
Feature activation+4.85e-01

Pos logit contributions
 girl+0.000
 woman+0.000
 princess+0.000
 knight+0.000
 witch+0.000
Neg logit contributions
izza-0.000
 Acc-0.000
uddled-0.000
ugg-0.000
icked-0.000
Token cl
Token ID537
Feature activation+5.66e-01

Pos logit contributions
b+0.009
 woman+0.008
 after+0.008
 girl+0.008
 imagining+0.008
Neg logit contributions
izza-0.016
 Acc-0.016
 cinnamon-0.015
melon-0.014
 Enjoy-0.014
Tokenapped
Token ID6320
Feature activation-4.36e-01

Pos logit contributions
cest+0.016
ang+0.015
 Lena+0.015
b+0.015
Sab+0.015
Neg logit contributions
izza-0.011
 cinnamon-0.011
melon-0.011
 Acc-0.011
 cooled-0.010
Token in
Token ID287
Feature activation-1.20e-01

Pos logit contributions
izza+0.015
 Acc+0.015
melon+0.015
 cooled+0.014
ugg+0.014
Neg logit contributions
 Lena-0.006
 after-0.006
 girl-0.006
Sab-0.006
 imagining-0.006
Token amaz
Token ID40642
Feature activation+2.47e+00

Pos logit contributions
izza+0.004
uddled+0.004
icking+0.004
melon+0.004
ugg+0.004
Neg logit contributions
 Lena-0.002
 calcul-0.002
 girl-0.002
Sab-0.002
 imagining-0.002
Tokenement
Token ID972
Feature activation+6.12e-01

Pos logit contributions
 Lena+0.075
Sab+0.072
robat+0.071
 calcul+0.071
rob+0.069
Neg logit contributions
izza-0.046
melon-0.044
 Acc-0.044
 cooled-0.043
icking-0.039
Token.
Token ID13
Feature activation-1.45e-01

Pos logit contributions
 imagining+0.006
 after+0.006
 than+0.006
b+0.006
 Lena+0.006
Neg logit contributions
 Acc-0.024
izza-0.024
 cooled-0.023
 cinnamon-0.023
melon-0.023
Token The
Token ID383
Feature activation-3.29e-02

Pos logit contributions
icking+0.006
uddled+0.005
 cooled+0.005
melon+0.005
ades+0.005
Neg logit contributions
 Lena-0.002
-0.002
 Mir-0.001
 Made-0.001
 Taylor-0.001
Token singing
Token ID13777
Feature activation-2.20e-01

Pos logit contributions
izza+0.001
 Acc+0.001
icked+0.001
uddled+0.001
ugg+0.001
Neg logit contributions
 girl-0.000
 woman-0.000
 princess-0.000
 knight-0.000
 witch-0.000
Token rock
Token ID3881
Feature activation+5.36e-01

Pos logit contributions
izza+0.008
 Acc+0.008
icking+0.007
 Enjoy+0.007
ugg+0.007
Neg logit contributions
 girl-0.004
 woman-0.003
 princess-0.003
 figures-0.003
 pupil-0.003
Token watched
Token ID7342
Feature activation+1.64e-01

Pos logit contributions
izza+0.001
 Acc+0.001
melon+0.001
icking+0.001
acket+0.001
Neg logit contributions
 imagined-0.000
 Lena-0.000
b-0.000
Sab-0.000
 imagining-0.000
Token the
Token ID262
Feature activation-3.17e-01

Pos logit contributions
 Lena+0.002
 imagining+0.002
Sab+0.002
 after+0.002
 princess+0.002
Neg logit contributions
izza-0.006
 Acc-0.006
 cooled-0.006
ugg-0.006
melon-0.006
Token colorful
Token ID20239
Feature activation+5.07e-01

Pos logit contributions
izza+0.011
icking+0.011
icked+0.011
 Acc+0.011
acket+0.011
Neg logit contributions
 girl-0.005
 princess-0.005
 woman-0.005
 figures-0.005
 witch-0.005
Token lights
Token ID7588
Feature activation-1.04e+00

Pos logit contributions
 girl+0.008
 princess+0.008
 figures+0.007
 woman+0.007
 places+0.007
Neg logit contributions
izza-0.019
 Acc-0.018
icking-0.018
 cooled-0.017
icked-0.017
Token in
Token ID287
Feature activation+1.02e-01

Pos logit contributions
 Acc+0.042
izza+0.041
ugg+0.039
 Enjoy+0.038
uddled+0.037
Neg logit contributions
 after-0.012
 imagining-0.011
 girl-0.010
 Lena-0.010
 princess-0.010
Token amaz
Token ID40642
Feature activation+2.46e+00

Pos logit contributions
 Lena+0.002
Sab+0.001
robat+0.001
 calcul+0.001
 pupil+0.001
Neg logit contributions
izza-0.004
 Acc-0.003
 cooled-0.003
melon-0.003
icking-0.003
Tokenement
Token ID972
Feature activation+6.82e-01

Pos logit contributions
 Lena+0.080
Sab+0.076
robat+0.076
 calcul+0.075
rob+0.074
Neg logit contributions
izza-0.041
 Acc-0.039
melon-0.039
 cooled-0.039
icking-0.036
Token,
Token ID11
Feature activation+7.44e-02

Pos logit contributions
 Lena+0.008
 imagining+0.008
 after+0.007
 girl+0.007
Sab+0.007
Neg logit contributions
 Acc-0.025
izza-0.025
 cooled-0.024
melon-0.024
 cinnamon-0.023
Token feeling
Token ID4203
Feature activation+3.29e-01

Pos logit contributions
 Lena+0.001
 imagining+0.001
 woman+0.001
 calcul+0.001
Sab+0.001
Neg logit contributions
izza-0.003
 Acc-0.002
melon-0.002
ugg-0.002
 cooled-0.002
Token truly
Token ID4988
Feature activation-7.41e-01

Pos logit contributions
 Lena+0.005
 calcul+0.005
 imagining+0.005
Sab+0.004
b+0.004
Neg logit contributions
izza-0.012
 Acc-0.011
icking-0.011
melon-0.011
uddled-0.011
Token lucky
Token ID9670
Feature activation-7.30e-01

Pos logit contributions
izza+0.024
melon+0.022
 Acc+0.022
ugg+0.022
 cooled+0.022
Neg logit contributions
 imagining-0.013
 Lena-0.013
 calcul-0.012
b-0.012
cest-0.012
Token bird
Token ID6512
Feature activation-3.93e-01

Pos logit contributions
 girl+0.000
 woman+0.000
 princess+0.000
 witch+0.000
 pupil+0.000
Neg logit contributions
izza-0.001
icking-0.001
 Acc-0.001
melon-0.001
 cooled-0.001
Token looked
Token ID3114
Feature activation-1.19e-01

Pos logit contributions
izza+0.012
 Acc+0.011
melon+0.011
 cinnamon+0.010
icking+0.010
Neg logit contributions
b-0.008
 Lena-0.008
Sab-0.008
 girl-0.008
robat-0.007
Token at
Token ID379
Feature activation-9.26e-02

Pos logit contributions
izza+0.004
 Acc+0.004
ugg+0.004
 cinnamon+0.004
 cooled+0.004
Neg logit contributions
 Lena-0.002
 after-0.002
 imagining-0.001
 girl-0.001
 calcul-0.001
Token her
Token ID607
Feature activation+6.69e-02

Pos logit contributions
izza+0.004
 Acc+0.004
 cooled+0.004
melon+0.004
icking+0.003
Neg logit contributions
 Lena-0.001
Sab-0.001
robat-0.001
 calcul-0.001
 girl-0.001
Token in
Token ID287
Feature activation-3.15e-01

Pos logit contributions
 imagining+0.001
 girl+0.001
 pupil+0.001
 woman+0.001
 princess+0.001
Neg logit contributions
izza-0.003
 Acc-0.002
 cooled-0.002
ugg-0.002
melon-0.002
Token amaz
Token ID40642
Feature activation+2.45e+00

Pos logit contributions
izza+0.010
 Acc+0.010
melon+0.009
 cooled+0.009
icking+0.009
Neg logit contributions
 Lena-0.006
 calcul-0.005
Sab-0.005
 speaker-0.005
 pupil-0.005
Tokenement
Token ID972
Feature activation+8.01e-01

Pos logit contributions
 Lena+0.083
Sab+0.080
robat+0.079
 calcul+0.079
rob+0.077
Neg logit contributions
izza-0.036
 Acc-0.035
 cooled-0.034
melon-0.034
icking-0.033
Token,
Token ID11
Feature activation-1.70e-01

Pos logit contributions
 imagining+0.008
 Lena+0.008
 after+0.007
 than+0.007
 girl+0.007
Neg logit contributions
izza-0.032
 Acc-0.031
 cooled-0.030
melon-0.029
 Enjoy-0.029
Token and
Token ID290
Feature activation+9.51e-02

Pos logit contributions
izza+0.007
uned+0.006
ugg+0.006
icking+0.006
 Acc+0.006
Neg logit contributions
 imagining-0.002
 Lena-0.002
 woman-0.002
 girl-0.002
 speaker-0.002
Token said
Token ID531
Feature activation+4.80e-02

Pos logit contributions
 Lena+0.001
Sab+0.001
 girl+0.001
 woman+0.001
 calcul+0.001
Neg logit contributions
izza-0.003
 Acc-0.003
melon-0.003
 cooled-0.003
icking-0.003
Token:
Token ID25
Feature activation+5.86e-01

Pos logit contributions
 Lena+0.001
 her+0.000
Sab+0.000
 she+0.000
 after+0.000
Neg logit contributions
acket-0.002
izza-0.002
icking-0.002
 cooled-0.002
melon-0.002
Token did
Token ID750
Feature activation+2.42e-01

Pos logit contributions
 Acc+0.025
izza+0.025
melon+0.024
uttered+0.023
icking+0.023
Neg logit contributions
 Lena-0.014
 girl-0.013
 calcul-0.013
 witch-0.013
b-0.013
Token not
Token ID407
Feature activation-3.84e-01

Pos logit contributions
 Lena+0.006
Sab+0.005
 calcul+0.005
 imagining+0.005
b+0.005
Neg logit contributions
izza-0.006
melon-0.006
 Acc-0.006
 cooled-0.006
icking-0.006
Token make
Token ID787
Feature activation+3.36e-01

Pos logit contributions
uddled+0.015
uttered+0.013
izza+0.013
uffed+0.013
ugg+0.013
Neg logit contributions
 spy-0.007
 figure-0.006
 after-0.006
cest-0.005
 than-0.005
Token a
Token ID257
Feature activation-4.44e-01

Pos logit contributions
robat+0.007
 Lena+0.007
rob+0.007
Sab+0.007
 calcul+0.007
Neg logit contributions
izza-0.009
 Acc-0.009
 cooled-0.009
melon-0.009
icking-0.008
Token bad
Token ID2089
Feature activation-1.78e-01

Pos logit contributions
 Acc+0.011
melon+0.011
izza+0.011
 cooled+0.011
 stomach+0.011
Neg logit contributions
 Lena-0.010
robat-0.010
Sab-0.010
rob-0.009
 calcul-0.009
Token deal
Token ID1730
Feature activation+2.44e+00

Pos logit contributions
melon+0.005
izza+0.005
 cooled+0.005
 Acc+0.005
icking+0.005
Neg logit contributions
 Lena-0.004
Sab-0.003
robat-0.003
 calcul-0.003
rob-0.003
Token.
Token ID13
Feature activation+5.45e-01

Pos logit contributions
Sab+0.071
rob+0.069
robat+0.068
 calcul+0.068
 Lena+0.068
Neg logit contributions
icking-0.046
melon-0.045
izza-0.042
icks-0.039
icked-0.039
Token<|endoftext|>
Token ID50256
Feature activation-3.83e-01

Pos logit contributions
robat+0.011
rob+0.010
b+0.010
 Lena+0.010
cest+0.010
Neg logit contributions
 Acc-0.017
izza-0.016
ugg-0.015
 cooled-0.015
uttered-0.015
TokenTom
Token ID13787
Feature activation+2.92e-01

Pos logit contributions
izza+0.015
 Acc+0.015
melon+0.015
 cooled+0.015
icking+0.014
Neg logit contributions
 Lena-0.004
Sab-0.003
robat-0.003
 calcul-0.003
 pupil-0.003
Token and
Token ID290
Feature activation-1.16e-01

Pos logit contributions
 Lena+0.005
 girl+0.004
robat+0.004
 calcul+0.004
 pupil+0.004
Neg logit contributions
izza-0.010
uttered-0.010
 Acc-0.009
uddled-0.009
 cooled-0.009
Token Mia
Token ID32189
Feature activation-1.18e-01

Pos logit contributions
uddled+0.004
izza+0.004
icking+0.004
melon+0.004
 cooled+0.004
Neg logit contributions
 Lena-0.002
 woman-0.002
 girl-0.001
 pupil-0.001
 student-0.001
Token looked
Token ID3114
Feature activation-9.07e-02

Pos logit contributions
 imagined+0.004
 girl+0.004
b+0.003
 woman+0.003
 Lena+0.003
Neg logit contributions
izza-0.007
 Acc-0.007
melon-0.007
acket-0.007
 Enjoy-0.007
Token at
Token ID379
Feature activation-1.08e-01

Pos logit contributions
izza+0.003
 Acc+0.003
 cooled+0.003
 cinnamon+0.003
melon+0.003
Neg logit contributions
 after-0.001
 girl-0.001
 Lena-0.001
 imagining-0.001
 calcul-0.001
Token the
Token ID262
Feature activation-1.08e-01

Pos logit contributions
izza+0.004
 cooled+0.004
melon+0.004
 Acc+0.004
uddled+0.004
Neg logit contributions
 Lena-0.001
 girl-0.001
 woman-0.001
Sab-0.001
robat-0.001
Token bug
Token ID5434
Feature activation+5.67e-01

Pos logit contributions
 Acc+0.004
izza+0.004
ugg+0.004
icked+0.004
atted+0.004
Neg logit contributions
 girl-0.002
 woman-0.002
 witch-0.002
 pupil-0.002
 student-0.002
Token in
Token ID287
Feature activation-1.16e-01

Pos logit contributions
 imagining+0.004
 Lena+0.004
 after+0.004
 girl+0.004
b+0.004
Neg logit contributions
izza-0.024
 Acc-0.024
acket-0.023
ugg-0.023
 cooled-0.022
Token amaz
Token ID40642
Feature activation+2.43e+00

Pos logit contributions
izza+0.003
 Acc+0.003
 cooled+0.003
melon+0.003
icking+0.003
Neg logit contributions
 Lena-0.002
Sab-0.002
robat-0.002
rob-0.002
 calcul-0.002
Tokenement
Token ID972
Feature activation+6.99e-01

Pos logit contributions
 Lena+0.082
Sab+0.079
robat+0.078
 calcul+0.078
rob+0.076
Neg logit contributions
izza-0.036
 Acc-0.035
 cooled-0.034
melon-0.034
icking-0.033
Token.
Token ID13
Feature activation-6.39e-02

Pos logit contributions
 Lena+0.008
 imagining+0.007
 after+0.007
 than+0.006
b+0.006
Neg logit contributions
izza-0.027
 Acc-0.027
 cooled-0.026
melon-0.026
 cinnamon-0.025
Token She
Token ID1375
Feature activation-2.94e-01

Pos logit contributions
icking+0.003
 cooled+0.003
melon+0.003
uddled+0.003
acket+0.003
Neg logit contributions
 Lena-0.000
 Mir-0.000
 Taylor-0.000
 Made-0.000
 She-0.000
Token was
Token ID373
Feature activation-3.33e-01

Pos logit contributions
izza+0.009
melon+0.009
 Acc+0.009
icking+0.009
ugg+0.008
Neg logit contributions
b-0.006
 imagined-0.006
 Lena-0.006
 woman-0.005
Sab-0.005
Token so
Token ID523
Feature activation+3.74e-01

Pos logit contributions
izza+0.013
ugg+0.012
 Acc+0.012
melon+0.012
uddled+0.012
Neg logit contributions
 Lena-0.004
 imagining-0.004
Sab-0.003
b-0.003
 woman-0.003
Token gazed
Token ID50255
Feature activation-1.61e-01

Pos logit contributions
b+0.001
 imagined+0.001
 girl+0.001
 h+0.001
cest+0.001
Neg logit contributions
izza-0.002
 Acc-0.001
acket-0.001
�-0.001
uning-0.001
Token at
Token ID379
Feature activation+3.23e-02

Pos logit contributions
izza+0.008
 Acc+0.008
OP+0.007
 cooled+0.007
 cinnamon+0.007
Neg logit contributions
 after-0.001
 girl-0.001
 though-0.000
 imagining-0.000
 than-0.000
Token the
Token ID262
Feature activation-2.01e-01

Pos logit contributions
 Lena+0.000
 girl+0.000
 figures+0.000
 places+0.000
 woman+0.000
Neg logit contributions
izza-0.001
 cooled-0.001
acket-0.001
ugg-0.001
 Acc-0.001
Token butterfly
Token ID35113
Feature activation+1.59e-01

Pos logit contributions
izza+0.007
 Acc+0.006
 cooled+0.006
ugg+0.006
melon+0.006
Neg logit contributions
 girl-0.004
 woman-0.003
 Lena-0.003
robat-0.003
 pupil-0.003
Token in
Token ID287
Feature activation+1.10e-02

Pos logit contributions
 Lena+0.002
 imagining+0.002
 after+0.001
 figures+0.001
 girl+0.001
Neg logit contributions
 Acc-0.006
izza-0.006
 cooled-0.006
acket-0.006
ugg-0.006
Token amaz
Token ID40642
Feature activation+2.43e+00

Pos logit contributions
 Lena+0.000
Sab+0.000
robat+0.000
 calcul+0.000
rob+0.000
Neg logit contributions
izza-0.000
 Acc-0.000
 cooled-0.000
melon-0.000
icking-0.000
Tokenement
Token ID972
Feature activation+6.51e-01

Pos logit contributions
 Lena+0.082
Sab+0.078
robat+0.077
 calcul+0.077
 pupil+0.075
Neg logit contributions
izza-0.037
 Acc-0.035
 cooled-0.035
melon-0.035
icking-0.033
Token.
Token ID13
Feature activation-1.47e-01

Pos logit contributions
 Lena+0.007
 imagining+0.007
 after+0.007
 than+0.006
cest+0.006
Neg logit contributions
izza-0.025
 Acc-0.025
 cooled-0.024
melon-0.024
 cinnamon-0.023
Token
Token ID198
Feature activation-5.56e-01

Pos logit contributions
icking+0.005
 cooled+0.005
izza+0.005
melon+0.005
 Acc+0.005
Neg logit contributions
 Lena-0.002
Sab-0.002
 calcul-0.002
 girl-0.002
L-0.002
Token
Token ID198
Feature activation-2.74e-01

Pos logit contributions
izza+0.019
 Acc+0.019
 cooled+0.019
melon+0.019
uddled+0.018
Neg logit contributions
 Lena-0.008
robat-0.007
 calcul-0.007
Sab-0.007
rob-0.007
TokenBut
Token ID1537
Feature activation+2.50e-01

Pos logit contributions
izza+0.009
 Acc+0.009
 cooled+0.009
melon+0.009
icking+0.009
Neg logit contributions
 Lena-0.004
Sab-0.004
robat-0.004
 calcul-0.004
rob-0.003
Token just
Token ID655
Feature activation+7.66e-02

Pos logit contributions
 imagined+0.002
 Lena+0.002
b+0.002
 pupil+0.002
Mel+0.002
Neg logit contributions
izza-0.004
 Acc-0.004
 cinnamon-0.003
melon-0.003
ugg-0.003
Token stared
Token ID18484
Feature activation-4.43e-02

Pos logit contributions
 imagined+0.002
b+0.001
 imagining+0.001
 Lena+0.001
 h+0.001
Neg logit contributions
izza-0.003
 Enjoy-0.002
 Acc-0.002
ades-0.002
 Drink-0.002
Token at
Token ID379
Feature activation+1.79e-01

Pos logit contributions
izza+0.002
 cooled+0.002
 cinnamon+0.002
 Acc+0.002
ugg+0.002
Neg logit contributions
 after-0.000
 Lena-0.000
 imagining-0.000
 girl-0.000
 pupil-0.000
Token him
Token ID683
Feature activation-2.36e-02

Pos logit contributions
 Lena+0.002
 woman+0.002
 girl+0.002
Sab+0.002
 figures+0.002
Neg logit contributions
izza-0.007
 cooled-0.007
ugg-0.007
 Acc-0.007
uddled-0.007
Token in
Token ID287
Feature activation-1.64e-01

Pos logit contributions
 Acc+0.001
izza+0.001
 cooled+0.001
 cinnamon+0.001
ugg+0.001
Neg logit contributions
 Lena-0.000
 after-0.000
 imagining-0.000
 pupil-0.000
b-0.000
Token amaz
Token ID40642
Feature activation+2.41e+00

Pos logit contributions
izza+0.004
 Acc+0.004
 cooled+0.004
melon+0.004
 cinnamon+0.004
Neg logit contributions
 Lena-0.003
Sab-0.003
robat-0.003
rob-0.003
 calcul-0.003
Tokenement
Token ID972
Feature activation+5.86e-01

Pos logit contributions
 Lena+0.077
Sab+0.074
robat+0.073
 calcul+0.073
rob+0.071
Neg logit contributions
izza-0.041
 Acc-0.039
melon-0.039
 cooled-0.039
icking-0.036
Token.
Token ID13
Feature activation-1.92e-01

Pos logit contributions
 Lena+0.007
 imagining+0.006
b+0.006
 after+0.006
cest+0.005
Neg logit contributions
izza-0.022
 Acc-0.022
 cooled-0.021
melon-0.021
 cinnamon-0.021
Token So
Token ID1406
Feature activation-9.48e-02

Pos logit contributions
icking+0.007
 cooled+0.007
uddled+0.007
 drank+0.007
bled+0.007
Neg logit contributions
 Lena-0.003
 Taylor-0.002
 Mir-0.002
 Made-0.002
Sab-0.002
Token,
Token ID11
Feature activation-6.91e-01

Pos logit contributions
izza+0.003
 Acc+0.003
melon+0.003
 cooled+0.003
uddled+0.003
Neg logit contributions
 Lena-0.001
 girl-0.001
Sab-0.001
 calcul-0.001
 woman-0.001
Token he
Token ID339
Feature activation-1.18e-01

Pos logit contributions
izza+0.025
 Acc+0.025
melon+0.024
 cooled+0.024
icking+0.024
Neg logit contributions
 Lena-0.009
Sab-0.008
 calcul-0.008
robat-0.007
 pupil-0.007
Token gazed
Token ID50255
Feature activation+4.97e-02

Pos logit contributions
 Lena+0.004
Sab+0.004
 girl+0.004
 woman+0.004
 speaker+0.003
Neg logit contributions
izza-0.005
melon-0.005
 Acc-0.005
acket-0.004
icking-0.004
Token at
Token ID379
Feature activation+2.41e-01

Pos logit contributions
 Lena+0.001
 girl+0.001
Sab+0.001
 calcul+0.001
 imagining+0.001
Neg logit contributions
izza-0.002
 Acc-0.002
 cooled-0.002
melon-0.002
 cinnamon-0.002
Token Ad
Token ID1215
Feature activation+2.45e-01

Pos logit contributions
 Lena+0.004
Sab+0.003
 calcul+0.003
 woman+0.003
robat+0.003
Neg logit contributions
izza-0.008
melon-0.008
 cooled-0.008
 Acc-0.008
icking-0.008
Tokeneline
Token ID4470
Feature activation+1.46e+00

Pos logit contributions
 Lena+0.007
robat+0.006
Sab+0.006
 speaker+0.006
 imagining+0.006
Neg logit contributions
izza-0.005
icking-0.005
atted-0.005
uddled-0.005
ugg-0.005
Token in
Token ID287
Feature activation-1.02e-01

Pos logit contributions
 Lena+0.018
 imagining+0.017
 after+0.016
 girl+0.016
 student+0.015
Neg logit contributions
 Acc-0.055
izza-0.054
 cooled-0.053
melon-0.053
 cinnamon-0.052
Token amaz
Token ID40642
Feature activation+2.40e+00

Pos logit contributions
izza+0.003
 Acc+0.003
 cooled+0.003
melon+0.003
 cinnamon+0.003
Neg logit contributions
 Lena-0.002
robat-0.002
Sab-0.002
rob-0.002
 calcul-0.002
Tokenement
Token ID972
Feature activation+7.01e-01

Pos logit contributions
 Lena+0.089
Sab+0.087
robat+0.086
 calcul+0.085
 pupil+0.084
Neg logit contributions
uddled-0.026
atted-0.026
 Acc-0.026
icking-0.026
izza-0.025
Token.
Token ID13
Feature activation-7.74e-02

Pos logit contributions
 Lena+0.009
 imagining+0.008
 pupil+0.007
Sab+0.007
 girl+0.007
Neg logit contributions
izza-0.026
 Acc-0.026
 cooled-0.025
melon-0.025
 cinnamon-0.025
Token As
Token ID1081
Feature activation+3.37e-02

Pos logit contributions
izza+0.002
 cooled+0.002
melon+0.002
 Acc+0.002
icking+0.002
Neg logit contributions
 Lena-0.002
Sab-0.001
 calcul-0.001
 pupil-0.001
 woman-0.001
Token she
Token ID673
Feature activation-3.85e-01

Pos logit contributions
robat+0.001
 Lena+0.001
rob+0.001
 calcul+0.001
Sab+0.001
Neg logit contributions
 cooled-0.001
izza-0.001
 Acc-0.001
melon-0.001
uttered-0.001
Token sang
Token ID25889
Feature activation-2.61e-01

Pos logit contributions
izza+0.010
 Acc+0.010
melon+0.010
icking+0.009
 cooled+0.009
Neg logit contributions
 Lena-0.009
Sab-0.008
robat-0.008
 calcul-0.008
b-0.008
Token but
Token ID475
Feature activation+2.11e-01

Pos logit contributions
 Lena+0.002
 after+0.002
 places+0.001
 calcul+0.001
 than+0.001
Neg logit contributions
ugg-0.008
izza-0.008
 Acc-0.008
 cooled-0.008
acket-0.007
Token they
Token ID484
Feature activation-6.89e-01

Pos logit contributions
 after+0.003
 girl+0.002
 her+0.002
 imagining+0.002
 Lena+0.002
Neg logit contributions
ugg-0.009
acket-0.009
izza-0.009
uddled-0.008
ades-0.008
Token ran
Token ID4966
Feature activation-5.03e-02

Pos logit contributions
izza+0.027
 Acc+0.027
ugg+0.025
 cinnamon+0.025
 Enjoy+0.024
Neg logit contributions
 imagined-0.011
 saw-0.010
 h-0.010
 figure-0.010
 after-0.010
Token back
Token ID736
Feature activation-1.85e-01

Pos logit contributions
izza+0.002
 Acc+0.002
 cinnamon+0.002
 cooled+0.002
melon+0.002
Neg logit contributions
 Lena-0.001
 after-0.001
 pupil-0.001
 calcul-0.001
 speaker-0.001
Token in
Token ID287
Feature activation-4.01e-01

Pos logit contributions
izza+0.006
 Acc+0.006
melon+0.006
 cooled+0.006
 cinnamon+0.006
Neg logit contributions
 Lena-0.003
Sab-0.003
robat-0.003
 calcul-0.003
 imagining-0.003
Token amaz
Token ID40642
Feature activation+2.39e+00

Pos logit contributions
izza+0.012
 cooled+0.012
 Acc+0.012
melon+0.012
icking+0.012
Neg logit contributions
 Lena-0.007
Sab-0.007
robat-0.007
 calcul-0.007
rob-0.006
Tokenement
Token ID972
Feature activation+6.58e-01

Pos logit contributions
 Lena+0.080
Sab+0.077
robat+0.076
 calcul+0.076
 pupil+0.074
Neg logit contributions
izza-0.036
 Acc-0.035
 cooled-0.035
melon-0.034
icking-0.033
Token!
Token ID0
Feature activation-3.09e-01

Pos logit contributions
 Lena+0.007
 imagining+0.006
 after+0.006
b+0.006
 than+0.006
Neg logit contributions
izza-0.025
 Acc-0.025
 cooled-0.025
 cinnamon-0.025
melon-0.024
Token The
Token ID383
Feature activation-8.27e-02

Pos logit contributions
 cooled+0.010
icking+0.010
melon+0.010
 cinnamon+0.010
uddled+0.010
Neg logit contributions
 Lena-0.006
 Taylor-0.004
Sab-0.004
 Made-0.004
 pupil-0.004
Token post
Token ID1281
Feature activation+1.08e+00

Pos logit contributions
izza+0.003
 Acc+0.003
melon+0.003
 cooled+0.003
ugg+0.003
Neg logit contributions
 Lena-0.001
 calcul-0.001
Sab-0.001
 woman-0.001
 girl-0.001
Token had
Token ID550
Feature activation-2.89e-01

Pos logit contributions
 Lena+0.020
 woman+0.019
 girl+0.019
Sab+0.018
b+0.018
Neg logit contributions
 Acc-0.034
izza-0.034
 Enjoy-0.032
melon-0.031
ugg-0.031
Token shaking
Token ID17275
Feature activation+2.10e-01

Pos logit contributions
 Lena+0.003
 imagining+0.003
b+0.002
 circles+0.002
 after+0.002
Neg logit contributions
izza-0.006
ugg-0.006
 Acc-0.006
melon-0.005
 Enjoy-0.005
Token.
Token ID13
Feature activation-1.86e-01

Pos logit contributions
 Lena+0.002
 after+0.002
 her+0.002
 girl+0.002
 than+0.002
Neg logit contributions
izza-0.009
ugg-0.009
 Acc-0.008
melon-0.008
 Enjoy-0.008
Token Mom
Token ID11254
Feature activation-1.55e-01

Pos logit contributions
melon+0.007
icking+0.007
izza+0.007
 cooled+0.006
uddled+0.006
Neg logit contributions
 Lena-0.003
Sab-0.002
 girl-0.002
 calcul-0.002
 Mir-0.002
Token watched
Token ID7342
Feature activation+2.34e-01

Pos logit contributions
izza+0.005
 Acc+0.005
melon+0.005
uned+0.005
 Enjoy+0.005
Neg logit contributions
 Lena-0.003
Grand-0.003
b-0.003
Sab-0.003
 girl-0.003
Token in
Token ID287
Feature activation-5.28e-01

Pos logit contributions
 Lena+0.003
 after+0.002
 girl+0.002
 her+0.002
 calcul+0.002
Neg logit contributions
izza-0.009
melon-0.009
 Acc-0.009
 cooled-0.009
acket-0.009
Token amaz
Token ID40642
Feature activation+2.37e+00

Pos logit contributions
izza+0.016
melon+0.016
 Acc+0.016
 cooled+0.016
icking+0.015
Neg logit contributions
 Lena-0.010
 calcul-0.009
Sab-0.009
 speaker-0.009
 girl-0.009
Tokenement
Token ID972
Feature activation+4.18e-01

Pos logit contributions
 Lena+0.079
Sab+0.076
robat+0.075
 calcul+0.075
rob+0.073
Neg logit contributions
izza-0.037
 Acc-0.036
 cooled-0.035
melon-0.035
icking-0.033
Token from
Token ID422
Feature activation-2.15e-01

Pos logit contributions
 Lena+0.005
 after+0.005
 imagining+0.004
 girl+0.004
 woman+0.004
Neg logit contributions
izza-0.016
 Acc-0.016
 cooled-0.015
melon-0.015
 cinnamon-0.015
Token a
Token ID257
Feature activation+4.54e-01

Pos logit contributions
izza+0.008
 Acc+0.008
 cooled+0.008
melon+0.008
icking+0.008
Neg logit contributions
 Lena-0.003
 girl-0.002
 woman-0.002
 pupil-0.002
 calcul-0.002
Token distance
Token ID5253
Feature activation-9.42e-01

Pos logit contributions
 Lena+0.008
 woman+0.007
 speaker+0.007
 girl+0.007
 calcul+0.007
Neg logit contributions
izza-0.015
 Acc-0.014
melon-0.014
icked-0.014
icking-0.014
Token,
Token ID11
Feature activation-8.03e-02

Pos logit contributions
izza+0.034
 Acc+0.033
melon+0.032
 cooled+0.032
ugg+0.032
Neg logit contributions
 Lena-0.014
 woman-0.012
 girl-0.012
Sab-0.011
 after-0.011
Token the
Token ID262
Feature activation+3.04e-01

Pos logit contributions
izza+0.018
melon+0.017
 Acc+0.017
 cooled+0.017
icking+0.017
Neg logit contributions
 Lena-0.010
Sab-0.010
robat-0.010
 calcul-0.009
rob-0.009
Token books
Token ID3835
Feature activation-1.76e-02

Pos logit contributions
ang+0.009
rob+0.009
Sab+0.009
cest+0.009
quel+0.009
Neg logit contributions
izza-0.005
 stomach-0.005
 cooled-0.004
 wipes-0.004
 milk-0.004
Token on
Token ID319
Feature activation+8.60e-02

Pos logit contributions
izza+0.001
melon+0.001
 cooled+0.001
 Acc+0.001
icking+0.001
Neg logit contributions
 Lena-0.000
Sab-0.000
 calcul-0.000
robat-0.000
rob-0.000
Token his
Token ID465
Feature activation+5.39e-01

Pos logit contributions
 Lena+0.001
Sab+0.001
robat+0.001
 calcul+0.001
rob+0.001
Neg logit contributions
izza-0.003
 Acc-0.003
 cooled-0.003
melon-0.003
icking-0.003
Token books
Token ID3835
Feature activation+5.22e-01

Pos logit contributions
rob+0.014
Sab+0.013
b+0.012
robat+0.012
 Lena+0.012
Neg logit contributions
 stomach-0.014
izza-0.013
 Acc-0.012
 farewell-0.011
 brow-0.011
Tokenhe
Token ID258
Feature activation+2.36e+00

Pos logit contributions
Sab+0.008
 Lena+0.008
 calcul+0.007
robat+0.007
rob+0.007
Neg logit contributions
izza-0.017
melon-0.017
 cooled-0.017
icking-0.017
 Acc-0.017
Tokenlf
Token ID1652
Feature activation+5.39e-01

Pos logit contributions
Sab+0.068
ang+0.067
b+0.065
 Chr+0.062
lves+0.062
Neg logit contributions
 stomach-0.041
 drank-0.040
uddled-0.039
izza-0.038
 cooled-0.037
Token.
Token ID13
Feature activation-3.56e-01

Pos logit contributions
 Lena+0.010
Sab+0.009
robat+0.009
 calcul+0.008
 pupil+0.008
Neg logit contributions
izza-0.017
 Acc-0.017
 cooled-0.016
melon-0.016
icking-0.016
Token " "
Token ID220
Feature activation+6.25e-02

Pos logit contributions
 cooled+0.010
izza+0.010
 Acc+0.010
melon+0.010
icking+0.010
Neg logit contributions
 Lena-0.007
 calcul-0.006
Sab-0.006
 girl-0.006
robat-0.006
Token "\n"
Token ID198
Feature activation-4.28e-01

Pos logit contributions
 Lena+0.001
robat+0.001
 calcul+0.001
Sab+0.001
 pupil+0.001
Neg logit contributions
izza-0.002
 Acc-0.002
melon-0.002
 cooled-0.002
icking-0.002
Token "\n"
Token ID198
Feature activation-1.55e-02

Pos logit contributions
izza+0.011
ugg+0.011
 cooled+0.010
 Acc+0.010
 Enjoy+0.010
Neg logit contributions
 calcul-0.010
 witch-0.010
cest-0.009
 student-0.009
 pupil-0.009
Token up
Token ID510
Feature activation+2.92e-01

Pos logit contributions
 after+0.000
 girl+0.000
 calcul+0.000
 imagining+0.000
 Lena+0.000
Neg logit contributions
 Acc-0.001
izza-0.001
 cooled-0.001
ugg-0.001
uddled-0.001
Token at
Token ID379
Feature activation+2.00e-01

Pos logit contributions
 Lena+0.003
 after+0.003
 girl+0.003
 imagining+0.003
 woman+0.002
Neg logit contributions
izza-0.012
 Acc-0.011
 cooled-0.011
melon-0.011
 cinnamon-0.011
Token the
Token ID262
Feature activation-1.44e-01

Pos logit contributions
 Lena+0.002
 girl+0.002
 woman+0.002
 places+0.002
 figures+0.002
Neg logit contributions
izza-0.008
 cooled-0.008
uddled-0.008
 Acc-0.007
ugg-0.007
Token tree
Token ID5509
Feature activation+2.37e-01

Pos logit contributions
izza+0.005
 Acc+0.005
icking+0.005
icked+0.005
uddled+0.005
Neg logit contributions
 girl-0.002
 witch-0.002
 woman-0.002
 calcul-0.002
 princess-0.002
Token in
Token ID287
Feature activation-1.98e-01

Pos logit contributions
 imagining+0.002
 Lena+0.002
 woman+0.002
 girl+0.002
 after+0.002
Neg logit contributions
 Acc-0.009
izza-0.009
 cooled-0.009
ugg-0.009
acket-0.009
Token amaz
Token ID40642
Feature activation+2.35e+00

Pos logit contributions
izza+0.006
 Acc+0.006
melon+0.006
 cooled+0.006
icking+0.005
Neg logit contributions
 Lena-0.004
Sab-0.004
robat-0.004
 calcul-0.004
rob-0.003
Tokenement
Token ID972
Feature activation+6.36e-01

Pos logit contributions
 Lena+0.077
Sab+0.074
robat+0.073
 calcul+0.073
rob+0.071
Neg logit contributions
izza-0.038
 Acc-0.037
melon-0.036
 cooled-0.036
icking-0.034
Token,
Token ID11
Feature activation-8.32e-03

Pos logit contributions
 Lena+0.007
 imagining+0.007
 after+0.007
b+0.006
 woman+0.006
Neg logit contributions
 Acc-0.024
izza-0.023
 cooled-0.023
melon-0.023
 cinnamon-0.022
Token wondering
Token ID11263
Feature activation-1.64e-01

Pos logit contributions
izza+0.000
uned+0.000
icked+0.000
ugg+0.000
 cooled+0.000
Neg logit contributions
 imagining-0.000
 woman-0.000
 after-0.000
 Lena-0.000
 pretending-0.000
Token how
Token ID703
Feature activation-4.67e-01

Pos logit contributions
ugg+0.005
 cooled+0.005
izza+0.005
melon+0.005
uddled+0.005
Neg logit contributions
 imagining-0.003
 Lena-0.003
 after-0.003
 though-0.003
 woman-0.003
Token it
Token ID340
Feature activation-4.86e-01

Pos logit contributions
uddled+0.018
icking+0.018
acket+0.018
ugg+0.018
izza+0.018
Neg logit contributions
 girl-0.005
 woman-0.005
 she-0.005
 after-0.005
b-0.005
Token trick
Token ID6908
Feature activation+1.06e+00

Pos logit contributions
 Lena+0.009
robat+0.009
Sab+0.008
 calcul+0.008
 pupil+0.008
Neg logit contributions
izza-0.016
 Acc-0.016
melon-0.016
 cooled-0.016
icking-0.015
Token.
Token ID13
Feature activation-2.03e-01

Pos logit contributions
 Lena+0.018
robat+0.016
Sab+0.016
 calcul+0.016
 pupil+0.015
Neg logit contributions
izza-0.035
 Acc-0.034
 cooled-0.034
melon-0.033
ugg-0.033
Token Everyone
Token ID11075
Feature activation-1.90e-01

Pos logit contributions
 cooled+0.007
melon+0.007
izza+0.006
uddled+0.006
icking+0.006
Neg logit contributions
 Lena-0.004
Sab-0.003
 Taylor-0.003
 calcul-0.003
 pupil-0.003
Token watched
Token ID7342
Feature activation+8.19e-02

Pos logit contributions
izza+0.006
melon+0.006
 Acc+0.005
icking+0.005
 cinnamon+0.005
Neg logit contributions
 Lena-0.004
b-0.004
 girl-0.003
robat-0.003
 pupil-0.003
Token in
Token ID287
Feature activation-4.61e-01

Pos logit contributions
 Lena+0.001
 after+0.001
 calcul+0.001
 Taylor+0.001
 imagining+0.001
Neg logit contributions
izza-0.003
 cooled-0.003
ugg-0.003
melon-0.003
 Acc-0.003
Token amaz
Token ID40642
Feature activation+2.35e+00

Pos logit contributions
izza+0.013
 Acc+0.013
melon+0.013
 cooled+0.013
icking+0.012
Neg logit contributions
 Lena-0.009
Sab-0.009
robat-0.009
 calcul-0.008
rob-0.008
Tokenement
Token ID972
Feature activation+4.37e-01

Pos logit contributions
 Lena+0.075
Sab+0.072
robat+0.071
 calcul+0.071
rob+0.070
Neg logit contributions
izza-0.040
 Acc-0.038
melon-0.038
 cooled-0.038
icking-0.035
Token and
Token ID290
Feature activation+1.26e-01

Pos logit contributions
 Lena+0.005
 imagining+0.005
 after+0.005
cest+0.004
 girl+0.004
Neg logit contributions
 Acc-0.016
izza-0.016
 cooled-0.016
melon-0.016
 cinnamon-0.015
Token applauded
Token ID38911
Feature activation+3.43e-01

Pos logit contributions
 Lena+0.003
Sab+0.002
 calcul+0.002
 imagining+0.002
robat+0.002
Neg logit contributions
izza-0.004
melon-0.003
 Acc-0.003
 cooled-0.003
uddled-0.003
Token the
Token ID262
Feature activation-6.35e-02

Pos logit contributions
 Lena+0.007
Sab+0.006
 calcul+0.006
robat+0.006
 pupil+0.006
Neg logit contributions
izza-0.010
 cooled-0.010
melon-0.010
 Acc-0.010
ugg-0.010
Token young
Token ID1862
Feature activation+5.73e-01

Pos logit contributions
icked+0.002
 Acc+0.002
melon+0.002
izza+0.002
atted+0.002
Neg logit contributions
 girl-0.001
 woman-0.001
 pupil-0.001
 student-0.001
robat-0.001
Token not
Token ID407
Feature activation-5.77e-02

Pos logit contributions
 Lena+0.021
Sab+0.020
 calcul+0.019
robat+0.019
 imagining+0.019
Neg logit contributions
izza-0.016
 Acc-0.015
melon-0.015
 cooled-0.015
icking-0.014
Token mean
Token ID1612
Feature activation+5.12e-03

Pos logit contributions
izza+0.002
 Acc+0.002
melon+0.002
 cooled+0.002
uddled+0.002
Neg logit contributions
 Lena-0.001
Sab-0.001
robat-0.001
 calcul-0.001
 girl-0.001
Token to
Token ID284
Feature activation+8.19e-01

Pos logit contributions
 Lena+0.000
 after+0.000
 places+0.000
 girl+0.000
 than+0.000
Neg logit contributions
izza-0.000
melon-0.000
 Acc-0.000
 cooled-0.000
uttered-0.000
Token break
Token ID2270
Feature activation+2.63e-01

Pos logit contributions
robat+0.020
 Lena+0.019
 calcul+0.019
rob+0.019
Sab+0.019
Neg logit contributions
 Acc-0.019
melon-0.019
 cooled-0.019
izza-0.018
icking-0.017
Token the
Token ID262
Feature activation+3.53e-01

Pos logit contributions
 Lena+0.004
Sab+0.004
robat+0.004
 calcul+0.004
rob+0.004
Neg logit contributions
izza-0.008
 Acc-0.008
melon-0.008
 cooled-0.008
icking-0.008
Token rules
Token ID3173
Feature activation+2.33e+00

Pos logit contributions
 Lena+0.007
Sab+0.007
robat+0.007
rob+0.007
 calcul+0.007
Neg logit contributions
izza-0.009
 cooled-0.009
melon-0.009
uddled-0.009
icking-0.009
Token.
Token ID13
Feature activation+4.90e-01

Pos logit contributions
 Lena+0.035
robat+0.030
 calcul+0.030
Sab+0.029
 student+0.029
Neg logit contributions
izza-0.080
 Acc-0.080
 cooled-0.077
melon-0.076
 cinnamon-0.075
Token They
Token ID1119
Feature activation+6.93e-02

Pos logit contributions
 Lena+0.012
 calcul+0.011
Sab+0.010
 girl+0.010
 speaker+0.010
Neg logit contributions
icking-0.013
 cooled-0.013
melon-0.013
izza-0.013
uddled-0.012
Token did
Token ID750
Feature activation+7.36e-01

Pos logit contributions
 Lena+0.001
 girl+0.001
L+0.001
 woman+0.001
 h+0.001
Neg logit contributions
izza-0.003
 Acc-0.002
melon-0.002
acket-0.002
icking-0.002
Token not
Token ID407
Feature activation-6.45e-02

Pos logit contributions
 Lena+0.018
 calcul+0.016
Sab+0.016
 imagining+0.016
 after+0.016
Neg logit contributions
izza-0.019
melon-0.018
 Acc-0.018
ugg-0.018
 cooled-0.018
Token mean
Token ID1612
Feature activation-7.30e-02

Pos logit contributions
izza+0.002
uddled+0.002
uttered+0.002
melon+0.002
 Acc+0.002
Neg logit contributions
 Lena-0.001
 spy-0.001
 girl-0.001
 calcul-0.001
 pupil-0.001
Token his
Token ID465
Feature activation-3.42e-01

Pos logit contributions
izza+0.036
uddled+0.036
 Acc+0.035
OP+0.034
melon+0.033
Neg logit contributions
 Lena-0.011
 girl-0.010
 after-0.010
 Mir-0.009
 imagining-0.009
Token mummy
Token ID38567
Feature activation-3.50e-01

Pos logit contributions
melon+0.011
 cooled+0.011
uddled+0.011
 Acc+0.011
ugg+0.011
Neg logit contributions
 girl-0.006
 pupil-0.005
b-0.005
 imagining-0.005
 woman-0.005
Token,
Token ID11
Feature activation+2.53e-01

Pos logit contributions
 Acc+0.013
izza+0.013
uddled+0.013
 cooled+0.013
melon+0.013
Neg logit contributions
 Lena-0.004
 girl-0.004
 woman-0.004
b-0.004
 after-0.004
Token trembling
Token ID39228
Feature activation-8.50e-02

Pos logit contributions
 Lena+0.004
 woman+0.004
 imagining+0.004
 girl+0.003
 speaker+0.003
Neg logit contributions
izza-0.009
 Acc-0.009
melon-0.009
uttered-0.008
uned-0.008
Token in
Token ID287
Feature activation-6.51e-01

Pos logit contributions
izza+0.003
 Acc+0.003
melon+0.003
ugg+0.003
 cinnamon+0.003
Neg logit contributions
 imagining-0.001
b-0.001
 after-0.001
 figures-0.001
 than-0.001
Token amaz
Token ID40642
Feature activation+2.32e+00

Pos logit contributions
izza+0.018
 Acc+0.018
 cooled+0.018
melon+0.017
icking+0.017
Neg logit contributions
 Lena-0.014
Sab-0.013
robat-0.013
 calcul-0.013
rob-0.012
Tokenement
Token ID972
Feature activation+4.53e-01

Pos logit contributions
 Lena+0.077
Sab+0.074
robat+0.074
 calcul+0.073
rob+0.072
Neg logit contributions
izza-0.036
 Acc-0.035
 cooled-0.034
melon-0.034
icking-0.033
Token.
Token ID13
Feature activation-3.11e-01

Pos logit contributions
 Lena+0.005
 imagining+0.005
b+0.005
 after+0.005
cest+0.005
Neg logit contributions
izza-0.017
 Acc-0.017
 cooled-0.016
melon-0.016
 cinnamon-0.016
Token They
Token ID1119
Feature activation-4.28e-01

Pos logit contributions
icking+0.012
melon+0.011
uddled+0.011
 cooled+0.011
izza+0.011
Neg logit contributions
 Lena-0.004
 Mir-0.004
Sab-0.004
Grand-0.003
 Taylor-0.003
Token had
Token ID550
Feature activation-2.70e-01

Pos logit contributions
 Acc+0.014
izza+0.013
 cinnamon+0.013
melon+0.012
ucker+0.012
Neg logit contributions
 imagined-0.008
 h-0.008
 Lena-0.008
b-0.008
 woman-0.008
Token just
Token ID655
Feature activation-1.87e-01

Pos logit contributions
 Acc+0.009
izza+0.009
melon+0.009
ugg+0.008
acket+0.008
Neg logit contributions
 Lena-0.005
 imagined-0.005
 imagining-0.004
Sab-0.004
b-0.004
Token When
Token ID1649
Feature activation-4.00e-01

Pos logit contributions
 Lena+0.007
Sab+0.006
 calcul+0.005
 pupil+0.005
 girl+0.005
Neg logit contributions
 cooled-0.012
icking-0.012
izza-0.012
melon-0.011
uddled-0.011
Token Daisy
Token ID40355
Feature activation-4.25e-01

Pos logit contributions
 Acc+0.015
izza+0.015
uddled+0.015
icking+0.014
atted+0.014
Neg logit contributions
 Lena-0.006
Sab-0.004
 girl-0.004
 hall-0.004
 imagining-0.004
Token put
Token ID1234
Feature activation+1.33e-01

Pos logit contributions
 Acc+0.014
izza+0.014
icking+0.013
melon+0.013
uddled+0.013
Neg logit contributions
 Lena-0.008
 girl-0.007
 witch-0.007
Sab-0.007
 Taylor-0.007
Token the
Token ID262
Feature activation-1.55e-01

Pos logit contributions
 Lena+0.002
Sab+0.002
robat+0.002
 calcul+0.002
rob+0.002
Neg logit contributions
izza-0.004
 Acc-0.004
melon-0.004
 cooled-0.004
icking-0.004
Token magn
Token ID7842
Feature activation-2.37e-01

Pos logit contributions
 Acc+0.005
izza+0.005
 cooled+0.005
uttered+0.005
icked+0.005
Neg logit contributions
 girl-0.003
 woman-0.003
 witch-0.003
 pupil-0.003
 calcul-0.003
Tokenifying
Token ID4035
Feature activation-4.88e+00

Pos logit contributions
 Acc+0.006
izza+0.005
 cooled+0.005
melon+0.005
 cinnamon+0.005
Neg logit contributions
 Lena-0.006
Sab-0.006
robat-0.006
 calcul-0.006
 pupil-0.006
Token glass
Token ID5405
Feature activation-9.38e-01

Pos logit contributions
izza+0.100
 Acc+0.100
 cooled+0.096
melon+0.095
icking+0.094
Neg logit contributions
 Lena-0.136
Sab-0.129
robat-0.128
 calcul-0.127
rob-0.124
Token down
Token ID866
Feature activation+3.39e-01

Pos logit contributions
 Acc+0.032
izza+0.031
icking+0.030
 cooled+0.029
uddled+0.029
Neg logit contributions
 Lena-0.015
 girl-0.013
robat-0.013
 calcul-0.013
 hall-0.013
Token,
Token ID11
Feature activation+4.58e-01

Pos logit contributions
 Lena+0.005
 girl+0.004
 calcul+0.004
 after+0.003
Sab+0.003
Neg logit contributions
izza-0.013
 Acc-0.012
acket-0.012
uddled-0.012
 cooled-0.012
Token the
Token ID262
Feature activation-2.63e-02

Pos logit contributions
 Lena+0.005
Sab+0.004
 calcul+0.004
robat+0.004
 pupil+0.004
Neg logit contributions
izza-0.018
 Acc-0.018
melon-0.017
 cooled-0.017
icking-0.017
Token cater
Token ID20825
Feature activation-9.74e-01

Pos logit contributions
izza+0.001
icking+0.001
 Acc+0.001
icked+0.001
 cooled+0.001
Neg logit contributions
 girl-0.000
 woman-0.000
 witch-0.000
 pupil-0.000
 princess-0.000
Token She
Token ID1375
Feature activation+2.08e-01

Pos logit contributions
 Lena+0.003
robat+0.003
Sab+0.003
 calcul+0.003
rob+0.003
Neg logit contributions
izza-0.004
 Acc-0.004
melon-0.004
 cooled-0.004
uttered-0.004
Token takes
Token ID2753
Feature activation+3.94e-01

Pos logit contributions
 Lena+0.005
 calcul+0.004
Sab+0.004
 girl+0.004
 Taylor+0.004
Neg logit contributions
melon-0.006
izza-0.006
 Acc-0.006
 cooled-0.006
ugg-0.006
Token out
Token ID503
Feature activation-3.51e-01

Pos logit contributions
robat+0.008
rob+0.008
cest+0.007
 Lena+0.007
Sab+0.007
Neg logit contributions
izza-0.011
 cooled-0.011
melon-0.010
 cinnamon-0.010
icking-0.010
Token a
Token ID257
Feature activation+1.27e-01

Pos logit contributions
izza+0.010
 cooled+0.010
 Acc+0.010
melon+0.010
icking+0.010
Neg logit contributions
robat-0.007
 Lena-0.007
rob-0.006
Sab-0.006
 calcul-0.006
Token magn
Token ID7842
Feature activation-1.14e-01

Pos logit contributions
 Lena+0.003
Sab+0.002
 calcul+0.002
robat+0.002
 pupil+0.002
Neg logit contributions
izza-0.004
 Acc-0.004
melon-0.004
 cooled-0.004
icking-0.003
Tokenifying
Token ID4035
Feature activation-4.50e+00

Pos logit contributions
izza+0.002
 Acc+0.002
 cooled+0.002
melon+0.002
icking+0.002
Neg logit contributions
 Lena-0.003
Sab-0.003
robat-0.003
 calcul-0.003
 pupil-0.003
Token glass
Token ID5405
Feature activation-7.18e-01

Pos logit contributions
izza+0.086
 Acc+0.085
 cooled+0.082
icking+0.081
melon+0.081
Neg logit contributions
 Lena-0.133
Sab-0.125
 calcul-0.124
robat-0.124
 girl-0.120
Token.
Token ID13
Feature activation+5.93e-01

Pos logit contributions
 Acc+0.024
izza+0.024
 cooled+0.023
icking+0.022
melon+0.022
Neg logit contributions
 Lena-0.012
 calcul-0.010
robat-0.010
Sab-0.010
 girl-0.010
Token She
Token ID1375
Feature activation+3.61e-02

Pos logit contributions
 Lena+0.011
Sab+0.009
 Mir+0.008
 calcul+0.008
 Taylor+0.008
Neg logit contributions
 cooled-0.020
melon-0.019
izza-0.019
icking-0.019
 Acc-0.019
Token looks
Token ID3073
Feature activation+1.05e-01

Pos logit contributions
 Lena+0.001
 calcul+0.001
Sab+0.001
 imagining+0.001
 Taylor+0.001
Neg logit contributions
melon-0.001
izza-0.001
 Acc-0.001
icking-0.001
 cooled-0.001
Token at
Token ID379
Feature activation-9.01e-02

Pos logit contributions
 Lena+0.002
Sab+0.002
robat+0.002
rob+0.002
 calcul+0.002
Neg logit contributions
 cooled-0.003
izza-0.003
melon-0.003
 Acc-0.003
 drank-0.003
Token flashlight
Token ID38371
Feature activation-4.34e-01

Pos logit contributions
icking+0.007
 Acc+0.007
izza+0.007
uttered+0.007
icked+0.007
Neg logit contributions
 Lena-0.003
 girl-0.003
 woman-0.003
 pupil-0.003
 student-0.003
Token,
Token ID11
Feature activation+6.90e-02

Pos logit contributions
izza+0.015
 Acc+0.015
 cooled+0.014
ugg+0.014
 cinnamon+0.014
Neg logit contributions
 Lena-0.006
 woman-0.005
 speaker-0.005
robat-0.005
 hall-0.005
Token and
Token ID290
Feature activation-1.87e-01

Pos logit contributions
 Lena+0.001
 calcul+0.001
Sab+0.001
 imagining+0.001
 woman+0.001
Neg logit contributions
izza-0.002
 Acc-0.002
melon-0.002
 cooled-0.002
ugg-0.002
Token a
Token ID257
Feature activation-6.10e-01

Pos logit contributions
izza+0.006
 Acc+0.006
melon+0.006
 cooled+0.006
icking+0.005
Neg logit contributions
 Lena-0.003
Sab-0.003
 calcul-0.003
robat-0.003
 pupil-0.003
Token magn
Token ID7842
Feature activation-2.29e-01

Pos logit contributions
icking+0.020
izza+0.020
icked+0.020
uttered+0.020
 Acc+0.020
Neg logit contributions
 girl-0.010
 Lena-0.010
 woman-0.010
 pupil-0.010
 witch-0.010
Tokenifying
Token ID4035
Feature activation-4.47e+00

Pos logit contributions
 Acc+0.005
izza+0.005
 cooled+0.005
melon+0.005
 cinnamon+0.004
Neg logit contributions
 Lena-0.006
Sab-0.006
robat-0.006
 calcul-0.006
 pupil-0.006
Token glass
Token ID5405
Feature activation-7.11e-01

Pos logit contributions
 Acc+0.098
izza+0.098
icking+0.094
uddled+0.092
 cooled+0.091
Neg logit contributions
 Lena-0.119
 calcul-0.111
Sab-0.110
robat-0.110
 girl-0.108
Token.
Token ID13
Feature activation+1.57e-01

Pos logit contributions
 Acc+0.026
izza+0.025
 cooled+0.023
icking+0.023
ugg+0.023
Neg logit contributions
 Lena-0.010
 girl-0.009
robat-0.009
 calcul-0.009
 figures-0.009
Token "\n"
Token ID198
Feature activation+5.42e-01

Pos logit contributions
 Lena+0.003
Sab+0.003
robat+0.003
 calcul+0.003
rob+0.003
Neg logit contributions
izza-0.005
 Acc-0.005
 cooled-0.005
melon-0.005
icking-0.004
Token "\n"
Token ID198
Feature activation+8.45e-01

Pos logit contributions
robat+0.010
 calcul+0.009
cest+0.009
rob+0.009
 imagining+0.009
Neg logit contributions
izza-0.017
 Acc-0.016
uttered-0.015
 cooled-0.015
ugg-0.015
TokenAs
Token ID1722
Feature activation-1.14e-01

Pos logit contributions
rob+0.013
robat+0.013
 Lena+0.013
cest+0.012
 calcul+0.012
Neg logit contributions
izza-0.027
 Acc-0.027
 cooled-0.025
ugg-0.025
melon-0.025
Token the
Token ID262
Feature activation-9.70e-03

Pos logit contributions
izza+0.026
 Acc+0.026
melon+0.025
 cooled+0.025
acket+0.025
Neg logit contributions
 Lena-0.005
Sab-0.004
 girl-0.003
 calcul-0.003
 woman-0.003
Token necklace
Token ID38987
Feature activation+6.69e-01

Pos logit contributions
izza+0.000
 Acc+0.000
 cooled+0.000
melon+0.000
icking+0.000
Neg logit contributions
 Lena-0.000
 girl-0.000
 calcul-0.000
Sab-0.000
 woman-0.000
Token and
Token ID290
Feature activation+2.94e-02

Pos logit contributions
 Lena+0.011
Sab+0.011
robat+0.010
 calcul+0.010
rob+0.010
Neg logit contributions
izza-0.021
melon-0.021
 cooled-0.021
icking-0.020
 Acc-0.020
Token the
Token ID262
Feature activation-3.06e-01

Pos logit contributions
 Lena+0.001
 woman+0.001
 girl+0.001
 calcul+0.001
 imagining+0.001
Neg logit contributions
izza-0.001
 Acc-0.001
melon-0.001
icking-0.001
ugg-0.001
Token magn
Token ID7842
Feature activation-8.88e-02

Pos logit contributions
izza+0.009
 Acc+0.009
 cooled+0.009
icking+0.009
uttered+0.009
Neg logit contributions
 Lena-0.006
 girl-0.006
 woman-0.005
 witch-0.005
Sab-0.005
Tokenifying
Token ID4035
Feature activation-4.29e+00

Pos logit contributions
 Acc+0.002
izza+0.002
 cooled+0.002
melon+0.002
 cinnamon+0.002
Neg logit contributions
 Lena-0.003
Sab-0.002
robat-0.002
 calcul-0.002
 pupil-0.002
Token glass
Token ID5405
Feature activation-6.13e-01

Pos logit contributions
izza+0.078
 Acc+0.077
icking+0.074
 cooled+0.073
melon+0.072
Neg logit contributions
 Lena-0.130
Sab-0.124
 calcul-0.122
robat-0.121
 girl-0.119
Token and
Token ID290
Feature activation-2.36e-02

Pos logit contributions
 Acc+0.025
izza+0.024
acket+0.023
 Enjoy+0.023
icking+0.022
Neg logit contributions
 Lena-0.006
 girl-0.006
 woman-0.005
 figures-0.005
 ones-0.005
Token started
Token ID2067
Feature activation-4.13e-01

Pos logit contributions
izza+0.001
 Acc+0.001
melon+0.001
icking+0.001
acket+0.001
Neg logit contributions
 Lena-0.001
 woman-0.000
 imagining-0.000
 girl-0.000
 calcul-0.000
Token to
Token ID284
Feature activation+1.63e-01

Pos logit contributions
izza+0.014
 Acc+0.014
acket+0.013
icking+0.013
melon+0.013
Neg logit contributions
 Lena-0.007
 imagining-0.007
Sab-0.006
 calcul-0.006
b-0.006
Token examine
Token ID10716
Feature activation-1.47e-01

Pos logit contributions
 Lena+0.003
 imagining+0.003
 princess+0.003
Sab+0.003
 calcul+0.003
Neg logit contributions
izza-0.005
uddled-0.005
atted-0.005
uttered-0.005
icking-0.005
Token a
Token ID257
Feature activation+2.89e-01

Pos logit contributions
izza+0.003
 Acc+0.003
melon+0.003
 cooled+0.003
icking+0.003
Neg logit contributions
 Lena-0.002
Sab-0.002
 calcul-0.002
robat-0.002
 imagining-0.001
Token flashlight
Token ID38371
Feature activation-1.88e-01

Pos logit contributions
 Lena+0.005
Sab+0.005
 calcul+0.005
robat+0.005
 pupil+0.005
Neg logit contributions
izza-0.009
 Acc-0.009
 cooled-0.009
melon-0.009
icking-0.009
Token and
Token ID290
Feature activation+2.11e-01

Pos logit contributions
 Acc+0.007
izza+0.007
 cinnamon+0.006
 cooled+0.006
ugg+0.006
Neg logit contributions
 Lena-0.003
 imagining-0.002
 woman-0.002
 speaker-0.002
 hall-0.002
Token a
Token ID257
Feature activation-2.11e-01

Pos logit contributions
 Lena+0.004
Sab+0.003
 calcul+0.003
 woman+0.003
 imagining+0.003
Neg logit contributions
 Acc-0.007
izza-0.007
melon-0.007
icking-0.007
 cooled-0.007
Token magn
Token ID7842
Feature activation-2.16e-01

Pos logit contributions
izza+0.007
 Acc+0.007
icking+0.007
 cooled+0.006
melon+0.006
Neg logit contributions
 Lena-0.004
 girl-0.003
 woman-0.003
 pupil-0.003
 calcul-0.003
Tokenifying
Token ID4035
Feature activation-4.19e+00

Pos logit contributions
 Acc+0.004
izza+0.004
 cooled+0.004
melon+0.004
 cinnamon+0.004
Neg logit contributions
 Lena-0.006
Sab-0.006
robat-0.006
 calcul-0.006
 pupil-0.006
Token glass
Token ID5405
Feature activation-5.04e-01

Pos logit contributions
izza+0.081
 Acc+0.079
 cooled+0.077
melon+0.076
icking+0.074
Neg logit contributions
 Lena-0.123
Sab-0.117
robat-0.116
 calcul-0.116
rob-0.113
Token to
Token ID284
Feature activation+8.69e-01

Pos logit contributions
 Acc+0.019
izza+0.018
 apologised+0.017
 cooled+0.017
acket+0.017
Neg logit contributions
 Lena-0.007
 girl-0.006
 woman-0.006
 hall-0.006
 figures-0.006
Token examine
Token ID10716
Feature activation+2.15e-01

Pos logit contributions
 Lena+0.019
Sab+0.018
robat+0.018
 calcul+0.018
rob+0.017
Neg logit contributions
izza-0.023
 Acc-0.023
 cooled-0.022
melon-0.022
icking-0.021
Token their
Token ID511
Feature activation-8.68e-02

Pos logit contributions
 Lena+0.004
Sab+0.003
 calcul+0.003
robat+0.003
 girl+0.003
Neg logit contributions
izza-0.007
 Acc-0.007
 cooled-0.007
melon-0.007
ugg-0.007
Token finds
Token ID7228
Feature activation+2.72e-01

Pos logit contributions
izza+0.003
 Acc+0.003
melon+0.003
 cooled+0.003
icking+0.003
Neg logit contributions
 Lena-0.002
Sab-0.001
robat-0.001
 calcul-0.001
rob-0.001
Token the
Token ID262
Feature activation-1.22e-01

Pos logit contributions
melon+0.010
 Acc+0.010
 cooled+0.010
izza+0.010
icking+0.009
Neg logit contributions
 calcul-0.007
Sab-0.007
robat-0.007
 Lena-0.006
Grand-0.006
Token watch
Token ID2342
Feature activation+9.29e-01

Pos logit contributions
izza+0.004
 Acc+0.004
 cooled+0.004
melon+0.004
icking+0.004
Neg logit contributions
 Lena-0.002
 calcul-0.002
 woman-0.002
 girl-0.002
robat-0.002
Token with
Token ID351
Feature activation-4.24e-01

Pos logit contributions
 Lena+0.015
 student+0.012
 speaker+0.012
 calcul+0.012
robat+0.012
Neg logit contributions
 Acc-0.033
izza-0.031
 cooled-0.031
ugg-0.031
melon-0.030
Token the
Token ID262
Feature activation+7.94e-02

Pos logit contributions
izza+0.015
 Acc+0.015
uddled+0.014
ugg+0.014
melon+0.014
Neg logit contributions
 Lena-0.007
 calcul-0.005
 imagining-0.005
 girl-0.005
 woman-0.005
Token magn
Token ID7842
Feature activation+1.97e-01

Pos logit contributions
 Lena+0.001
 woman+0.001
 calcul+0.001
 girl+0.001
 witch+0.001
Neg logit contributions
izza-0.003
 Acc-0.003
melon-0.002
 cooled-0.002
icking-0.002
Tokenifying
Token ID4035
Feature activation-4.08e+00

Pos logit contributions
 Lena+0.005
Sab+0.005
robat+0.005
 Taylor+0.004
Mel+0.004
Neg logit contributions
 Acc-0.005
izza-0.005
 cooled-0.005
melon-0.004
 drank-0.004
Token glass
Token ID5405
Feature activation-5.09e-01

Pos logit contributions
izza+0.077
 Acc+0.076
icking+0.073
 cooled+0.072
melon+0.071
Neg logit contributions
 Lena-0.123
Sab-0.116
 calcul-0.115
robat-0.114
 girl-0.111
Token.
Token ID13
Feature activation+6.72e-02

Pos logit contributions
 Acc+0.021
izza+0.019
 Enjoy+0.019
acket+0.019
ugg+0.019
Neg logit contributions
 Lena-0.006
 girl-0.006
 figures-0.005
 witch-0.005
 woman-0.005
Token He
Token ID679
Feature activation-3.16e-01

Pos logit contributions
 Lena+0.001
 Mir+0.001
 L+0.001
 girl+0.000
 Made+0.000
Neg logit contributions
icking-0.003
 cooled-0.003
ugg-0.003
atted-0.003
fed-0.003
Token saw
Token ID2497
Feature activation-6.42e-01

Pos logit contributions
izza+0.011
 Acc+0.011
melon+0.010
 cooled+0.010
icking+0.010
Neg logit contributions
 Lena-0.005
Sab-0.004
 calcul-0.004
robat-0.004
 pupil-0.004
Token the
Token ID262
Feature activation-1.18e-01

Pos logit contributions
izza+0.022
 Acc+0.021
 cooled+0.021
ugg+0.021
icking+0.021
Neg logit contributions
 Lena-0.011
Sab-0.009
 girl-0.009
 figures-0.009
 calcul-0.009
Token said
Token ID531
Feature activation-2.56e-01

Pos logit contributions
robat+0.005
rob+0.005
Sab+0.005
cest+0.005
 Lena+0.005
Neg logit contributions
izza-0.006
 drank-0.006
 cooled-0.006
 Acc-0.006
 apologised-0.006
Token,
Token ID11
Feature activation+5.09e-01

Pos logit contributions
izza+0.009
melon+0.009
 cooled+0.009
 cinnamon+0.009
acket+0.009
Neg logit contributions
 Lena-0.004
 pupil-0.003
 girl-0.003
 after-0.003
 imagining-0.003
Token holding
Token ID4769
Feature activation+1.76e-01

Pos logit contributions
 Lena+0.008
 imagining+0.008
 woman+0.008
 calcul+0.007
 speaker+0.007
Neg logit contributions
izza-0.018
ugg-0.016
 Acc-0.016
 cooled-0.016
icking-0.016
Token the
Token ID262
Feature activation+4.00e-02

Pos logit contributions
 Lena+0.003
 calcul+0.003
Sab+0.003
 pupil+0.003
robat+0.003
Neg logit contributions
izza-0.006
 Acc-0.006
 cooled-0.005
melon-0.005
ugg-0.005
Token magn
Token ID7842
Feature activation+2.89e-01

Pos logit contributions
 Lena+0.001
 girl+0.001
robat+0.001
 calcul+0.001
 woman+0.001
Neg logit contributions
 Acc-0.001
izza-0.001
 cooled-0.001
ugg-0.001
uddled-0.001
Tokenifying
Token ID4035
Feature activation-3.99e+00

Pos logit contributions
 Lena+0.008
Sab+0.008
robat+0.008
 pupil+0.007
 calcul+0.007
Neg logit contributions
 Acc-0.007
izza-0.006
 cooled-0.006
melon-0.006
uddled-0.005
Token glass
Token ID5405
Feature activation-2.26e-01

Pos logit contributions
izza+0.068
 Acc+0.065
melon+0.064
 cooled+0.064
icking+0.060
Neg logit contributions
 Lena-0.127
Sab-0.122
robat-0.121
 calcul-0.120
rob-0.117
Token over
Token ID625
Feature activation+1.72e-02

Pos logit contributions
 Acc+0.009
izza+0.009
acket+0.009
uddled+0.008
ugg+0.008
Neg logit contributions
 girl-0.002
 Lena-0.002
 after-0.002
 figures-0.002
 pupil-0.002
Token a
Token ID257
Feature activation+3.72e-01

Pos logit contributions
 Lena+0.000
 girl+0.000
 woman+0.000
Sab+0.000
 pupil+0.000
Neg logit contributions
izza-0.001
 Acc-0.001
 cooled-0.001
acket-0.001
ugg-0.001
Token black
Token ID2042
Feature activation+5.39e-01

Pos logit contributions
 Lena+0.008
Sab+0.007
robat+0.007
 calcul+0.007
 pupil+0.007
Neg logit contributions
izza-0.010
 Acc-0.010
 cooled-0.010
melon-0.010
icking-0.010
Token ant
Token ID1885
Feature activation-5.58e-02

Pos logit contributions
 Lena+0.009
robat+0.009
 girl+0.009
 pupil+0.009
 calcul+0.009
Neg logit contributions
 Acc-0.017
izza-0.017
 cooled-0.017
icking-0.016
uddled-0.016
Token He
Token ID679
Feature activation-2.67e-01

Pos logit contributions
izza+0.007
 cooled+0.007
melon+0.007
 Acc+0.006
icking+0.006
Neg logit contributions
 Lena-0.005
Sab-0.004
 calcul-0.004
robat-0.004
 pupil-0.004
Token took
Token ID1718
Feature activation+3.25e-03

Pos logit contributions
 Acc+0.007
izza+0.007
melon+0.007
 cooled+0.007
icking+0.007
Neg logit contributions
 Lena-0.006
Sab-0.006
 calcul-0.006
robat-0.006
 pupil-0.005
Token out
Token ID503
Feature activation-3.22e-01

Pos logit contributions
robat+0.000
rob+0.000
Sab+0.000
 Lena+0.000
 calcul+0.000
Neg logit contributions
izza-0.000
 cooled-0.000
icked-0.000
 farewell-0.000
uttered-0.000
Token a
Token ID257
Feature activation+3.45e-01

Pos logit contributions
izza+0.010
 Acc+0.010
melon+0.010
 cooled+0.010
icking+0.010
Neg logit contributions
 Lena-0.005
Sab-0.005
 calcul-0.005
robat-0.005
 pupil-0.005
Token magn
Token ID7842
Feature activation+1.56e-01

Pos logit contributions
 Lena+0.007
Sab+0.006
robat+0.006
 calcul+0.006
rob+0.006
Neg logit contributions
izza-0.010
 Acc-0.010
 cooled-0.010
melon-0.010
icking-0.009
Tokenifying
Token ID4035
Feature activation-3.98e+00

Pos logit contributions
 Lena+0.005
Sab+0.005
robat+0.005
 calcul+0.004
 pupil+0.004
Neg logit contributions
izza-0.003
 Acc-0.003
 cooled-0.003
melon-0.003
icking-0.002
Token glass
Token ID5405
Feature activation-4.17e-01

Pos logit contributions
izza+0.081
 Acc+0.079
 cooled+0.077
melon+0.076
icking+0.075
Neg logit contributions
 Lena-0.114
Sab-0.108
 calcul-0.106
robat-0.106
 pupil-0.103
Token and
Token ID290
Feature activation+1.82e-01

Pos logit contributions
 Acc+0.016
izza+0.015
 cooled+0.014
icking+0.014
uttered+0.014
Neg logit contributions
 Lena-0.006
robat-0.005
 girl-0.005
 calcul-0.005
 woman-0.005
Token a
Token ID257
Feature activation-2.75e-01

Pos logit contributions
 Lena+0.004
 calcul+0.004
Sab+0.004
 imagining+0.003
 woman+0.003
Neg logit contributions
izza-0.005
 Acc-0.005
melon-0.005
icking-0.005
 cooled-0.005
Token notebook
Token ID20922
Feature activation+4.16e-01

Pos logit contributions
icking+0.009
 Acc+0.009
izza+0.009
melon+0.008
uddled+0.008
Neg logit contributions
 Lena-0.005
 girl-0.005
 woman-0.005
 pupil-0.005
 calcul-0.005
Token and
Token ID290
Feature activation+8.91e-02

Pos logit contributions
 Lena+0.007
robat+0.006
Sab+0.006
 calcul+0.006
 pupil+0.006
Neg logit contributions
izza-0.013
 Acc-0.013
 cooled-0.013
melon-0.013
icking-0.012
Token,
Token ID11
Feature activation+2.43e-01

Pos logit contributions
acket+0.019
 cinnamon+0.018
izza+0.017
cream+0.017
latable+0.017
Neg logit contributions
 she-0.002
 after-0.001
 her-0.001
 but-0.001
 one-0.001
Token handing
Token ID22786
Feature activation+1.90e-01

Pos logit contributions
 Lena+0.003
 imagining+0.003
 woman+0.003
 girl+0.003
 after+0.003
Neg logit contributions
izza-0.010
uned-0.009
icking-0.009
acket-0.009
uttered-0.009
Token him
Token ID683
Feature activation-2.84e-01

Pos logit contributions
 Lena+0.004
Sab+0.004
robat+0.004
 calcul+0.004
rob+0.003
Neg logit contributions
izza-0.005
melon-0.005
 Acc-0.005
 cooled-0.005
icking-0.005
Token the
Token ID262
Feature activation+3.13e-01

Pos logit contributions
 Acc+0.012
acket+0.012
izza+0.011
uddled+0.011
 cooled+0.011
Neg logit contributions
 Lena-0.003
 her-0.002
 girl-0.002
 one-0.002
 Mir-0.002
Token magn
Token ID7842
Feature activation+2.36e-01

Pos logit contributions
 woman+0.005
 girl+0.005
 witch+0.005
 spy+0.005
 speaker+0.005
Neg logit contributions
uddled-0.010
 Acc-0.010
atted-0.010
ugg-0.010
uttered-0.010
Tokenifying
Token ID4035
Feature activation-3.93e+00

Pos logit contributions
 Lena+0.006
robat+0.006
Sab+0.006
 pupil+0.006
 student+0.006
Neg logit contributions
 Acc-0.006
izza-0.005
 cooled-0.005
 cinnamon-0.005
melon-0.005
Token glass
Token ID5405
Feature activation-2.58e-01

Pos logit contributions
 Acc+0.074
izza+0.074
icking+0.072
 cooled+0.071
uddled+0.070
Neg logit contributions
 Lena-0.117
Sab-0.109
 calcul-0.109
robat-0.108
 girl-0.106
Token.
Token ID13
Feature activation+4.40e-01

Pos logit contributions
 Acc+0.010
izza+0.009
acket+0.009
icking+0.009
uddled+0.009
Neg logit contributions
 Lena-0.003
 girl-0.003
 woman-0.003
 figures-0.003
robat-0.003
Token She
Token ID1375
Feature activation+3.60e-03

Pos logit contributions
 Lena+0.006
Sab+0.004
 Taylor+0.003
L+0.003
 calcul+0.003
Neg logit contributions
icking-0.017
 cooled-0.017
izza-0.017
melon-0.017
acket-0.016
Token looked
Token ID3114
Feature activation-1.19e-01

Pos logit contributions
 Lena+0.000
 woman+0.000
 imagined+0.000
 girl+0.000
L+0.000
Neg logit contributions
izza-0.000
 Acc-0.000
melon-0.000
icking-0.000
uning-0.000
Token around
Token ID1088
Feature activation+1.41e-01

Pos logit contributions
izza+0.005
 Acc+0.004
 cooled+0.004
melon+0.004
 cinnamon+0.004
Neg logit contributions
 Lena-0.001
 imagining-0.001
 girl-0.001
 calcul-0.001
 woman-0.001
Token\n
Token ID198
Feature activation+1.54e-01

Pos logit contributions
izza+0.007
ete+0.006
 Acc+0.006
ugg+0.006
 cooled+0.006
Neg logit contributions
 calcul-0.007
cest-0.006
robat-0.006
 detective-0.006
 pupil-0.006
TokenHe
Token ID1544
Feature activation-4.86e-01

Pos logit contributions
 Lena+0.003
robat+0.003
 calcul+0.003
Sab+0.003
rob+0.003
Neg logit contributions
izza-0.005
 cooled-0.005
 Acc-0.005
melon-0.004
icking-0.004
Token brought
Token ID3181
Feature activation-2.32e-01

Pos logit contributions
izza+0.011
 Acc+0.011
melon+0.011
 cooled+0.011
icking+0.010
Neg logit contributions
 Lena-0.012
Sab-0.012
robat-0.012
 calcul-0.012
rob-0.011
Token his
Token ID465
Feature activation-3.22e-01

Pos logit contributions
 Acc+0.008
izza+0.008
melon+0.008
uddled+0.008
icking+0.008
Neg logit contributions
 Lena-0.003
Sab-0.003
 girl-0.003
 calcul-0.003
b-0.003
Token magn
Token ID7842
Feature activation+2.62e-01

Pos logit contributions
izza+0.010
 Acc+0.010
melon+0.010
 cooled+0.010
ugg+0.010
Neg logit contributions
 girl-0.005
 Lena-0.005
 woman-0.005
 pupil-0.005
 student-0.005
Tokenifying
Token ID4035
Feature activation-3.89e+00

Pos logit contributions
 Lena+0.006
Sab+0.006
b+0.006
 pupil+0.006
Mel+0.006
Neg logit contributions
 Acc-0.007
izza-0.006
 cooled-0.006
 cinnamon-0.006
melon-0.006
Token glass
Token ID5405
Feature activation-5.00e-01

Pos logit contributions
izza+0.081
 Acc+0.080
 cooled+0.077
icking+0.076
melon+0.076
Neg logit contributions
 Lena-0.109
Sab-0.102
 calcul-0.102
robat-0.099
 girl-0.098
Token and
Token ID290
Feature activation+2.49e-01

Pos logit contributions
 Acc+0.018
izza+0.017
 cooled+0.016
acket+0.016
uddled+0.016
Neg logit contributions
 Lena-0.007
 figures-0.007
 girl-0.007
b-0.007
 pupil-0.006
Token looked
Token ID3114
Feature activation-5.04e-01

Pos logit contributions
 Lena+0.005
 figures+0.005
 imagining+0.005
b+0.005
 imagined+0.005
Neg logit contributions
 Acc-0.008
izza-0.007
melon-0.007
ugg-0.007
icking-0.007
Token closely
Token ID7173
Feature activation-3.20e-02

Pos logit contributions
izza+0.019
 Acc+0.019
 cooled+0.018
 cinnamon+0.018
ugg+0.018
Neg logit contributions
 Lena-0.006
 after-0.006
 imagining-0.005
b-0.005
 girl-0.005
Token.
Token ID13
Feature activation-2.74e-01

Pos logit contributions
 Acc+0.001
izza+0.001
 cooled+0.001
melon+0.001
ugg+0.001
Neg logit contributions
 after-0.000
 Lena-0.000
 imagining-0.000
b-0.000
 though-0.000
Token,
Token ID11
Feature activation+4.82e-01

Pos logit contributions
 Lena+0.017
Sab+0.015
robat+0.015
 calcul+0.015
 pupil+0.014
Neg logit contributions
izza-0.037
 Acc-0.036
 cooled-0.036
melon-0.036
icking-0.035
Token next
Token ID1306
Feature activation-3.65e-01

Pos logit contributions
 Lena+0.008
Sab+0.008
 calcul+0.007
robat+0.007
 imagining+0.007
Neg logit contributions
izza-0.015
 Acc-0.015
melon-0.015
 cooled-0.015
icking-0.014
Token to
Token ID284
Feature activation-2.54e-02

Pos logit contributions
izza+0.010
 cooled+0.009
 Acc+0.009
melon+0.009
icking+0.009
Neg logit contributions
 Lena-0.008
robat-0.008
Sab-0.008
 calcul-0.008
rob-0.008
Token her
Token ID607
Feature activation-6.31e-01

Pos logit contributions
izza+0.001
 Acc+0.001
melon+0.001
 cooled+0.001
icking+0.001
Neg logit contributions
 Lena-0.000
Sab-0.000
 calcul-0.000
robat-0.000
 pupil-0.000
Token magn
Token ID7842
Feature activation+1.25e-01

Pos logit contributions
izza+0.018
 Acc+0.018
melon+0.017
 cooled+0.017
icking+0.017
Neg logit contributions
 Lena-0.013
robat-0.012
Sab-0.012
rob-0.012
 calcul-0.012
Tokenifying
Token ID4035
Feature activation-3.79e+00

Pos logit contributions
 Lena+0.003
Sab+0.003
Mel+0.003
 pupil+0.003
 Taylor+0.003
Neg logit contributions
 Acc-0.003
 cooled-0.003
izza-0.003
melon-0.003
 cinnamon-0.003
Token glass
Token ID5405
Feature activation-6.87e-01

Pos logit contributions
izza+0.070
 Acc+0.068
 cooled+0.066
melon+0.065
icking+0.065
Neg logit contributions
 Lena-0.116
Sab-0.109
 calcul-0.108
robat-0.107
 pupil-0.105
Token.
Token ID13
Feature activation+3.88e-01

Pos logit contributions
 Acc+0.023
izza+0.023
 cooled+0.022
icking+0.021
acket+0.021
Neg logit contributions
 Lena-0.012
 girl-0.010
robat-0.010
 woman-0.010
 witch-0.010
Token She
Token ID1375
Feature activation-6.08e-02

Pos logit contributions
 Lena+0.006
Sab+0.006
 calcul+0.005
 pupil+0.005
robat+0.005
Neg logit contributions
izza-0.013
melon-0.013
 cooled-0.013
 Acc-0.013
icking-0.013
Token looked
Token ID3114
Feature activation-2.22e-01

Pos logit contributions
melon+0.002
izza+0.002
 Acc+0.002
icking+0.002
 cooled+0.002
Neg logit contributions
 Lena-0.001
Sab-0.001
 calcul-0.001
 woman-0.001
 imagining-0.001
Token at
Token ID379
Feature activation-2.31e-01

Pos logit contributions
izza+0.007
 Acc+0.007
 cooled+0.007
melon+0.007
icking+0.007
Neg logit contributions
 Lena-0.004
Sab-0.004
robat-0.004
rob-0.004
 calcul-0.003
Token it
Token ID340
Feature activation+1.61e-01

Pos logit contributions
robat+0.002
cest+0.002
b+0.002
rob+0.002
Sab+0.002
Neg logit contributions
melon-0.003
 Acc-0.003
 cooled-0.003
uttered-0.003
 farewell-0.003
Token she
Token ID673
Feature activation+1.58e+00

Pos logit contributions
 Lena+0.004
robat+0.003
Sab+0.003
rob+0.003
 calcul+0.003
Neg logit contributions
 cooled-0.004
izza-0.004
 Acc-0.004
icking-0.004
melon-0.004
Token had
Token ID550
Feature activation+8.38e-02

Pos logit contributions
robat+0.037
 Lena+0.037
Sab+0.035
 calcul+0.034
rob+0.034
Neg logit contributions
 cooled-0.041
izza-0.039
 Acc-0.039
 drank-0.039
icking-0.038
Token a
Token ID257
Feature activation+5.18e-01

Pos logit contributions
 Lena+0.002
Sab+0.002
 calcul+0.002
robat+0.002
 imagining+0.001
Neg logit contributions
izza-0.002
 Acc-0.002
melon-0.002
 cooled-0.002
icking-0.002
Token magn
Token ID7842
Feature activation+2.41e-01

Pos logit contributions
 Lena+0.008
 pupil+0.008
 girl+0.008
 woman+0.008
Sab+0.008
Neg logit contributions
izza-0.017
 Acc-0.017
icking-0.017
melon-0.016
icked-0.016
Tokenifying
Token ID4035
Feature activation-3.74e+00

Pos logit contributions
 Lena+0.006
Sab+0.006
robat+0.006
 pupil+0.006
 calcul+0.006
Neg logit contributions
 Acc-0.005
izza-0.005
 cooled-0.005
melon-0.005
 cinnamon-0.005
Token glass
Token ID5405
Feature activation-4.36e-01

Pos logit contributions
izza+0.078
 Acc+0.077
icking+0.074
uddled+0.072
atted+0.072
Neg logit contributions
 Lena-0.104
Sab-0.098
 calcul-0.098
robat-0.096
 speaker-0.095
Token,
Token ID11
Feature activation+8.54e-01

Pos logit contributions
 Acc+0.017
izza+0.016
acket+0.015
 cooled+0.015
icking+0.014
Neg logit contributions
 Lena-0.006
 girl-0.006
 pupil-0.005
 woman-0.005
 witch-0.005
Token a
Token ID257
Feature activation+7.63e-02

Pos logit contributions
 Lena+0.013
 imagining+0.011
 calcul+0.011
Sab+0.011
 woman+0.011
Neg logit contributions
izza-0.030
 Acc-0.029
melon-0.029
 cooled-0.029
ugg-0.028
Token tiny
Token ID7009
Feature activation-1.47e-01

Pos logit contributions
 girl+0.001
 woman+0.001
 pupil+0.001
 witch+0.001
 princess+0.001
Neg logit contributions
icking-0.003
atted-0.003
izza-0.003
icked-0.003
 Acc-0.003
Token torch
Token ID28034
Feature activation-2.52e-01

Pos logit contributions
izza+0.005
 Acc+0.005
uddled+0.005
icking+0.005
 cooled+0.005
Neg logit contributions
 girl-0.003
 pupil-0.002
 princess-0.002
 woman-0.002
 witch-0.002
Token the
Token ID262
Feature activation+9.29e-02

Pos logit contributions
acket+0.024
 Acc+0.023
izza+0.023
 cooled+0.023
ugg+0.022
Neg logit contributions
 Lena-0.008
 her-0.007
 figures-0.006
 Mir-0.006
 girl-0.006
Token necklace
Token ID38987
Feature activation+8.23e-01

Pos logit contributions
 Lena+0.002
robat+0.002
rob+0.002
Sab+0.002
Grand+0.002
Neg logit contributions
izza-0.002
melon-0.002
 cooled-0.002
 Acc-0.002
 cinnamon-0.002
Token and
Token ID290
Feature activation+3.79e-01

Pos logit contributions
 Lena+0.011
Sab+0.009
 calcul+0.009
 woman+0.009
 girl+0.009
Neg logit contributions
 Acc-0.030
izza-0.030
 cooled-0.029
ugg-0.028
melon-0.028
Token the
Token ID262
Feature activation-3.59e-01

Pos logit contributions
 Lena+0.008
robat+0.007
Sab+0.007
 calcul+0.007
rob+0.007
Neg logit contributions
izza-0.011
melon-0.011
 cooled-0.011
 Acc-0.011
icking-0.010
Token magn
Token ID7842
Feature activation+1.22e-01

Pos logit contributions
izza+0.010
 Acc+0.010
melon+0.010
 cooled+0.010
icking+0.009
Neg logit contributions
 Lena-0.008
Sab-0.007
robat-0.007
 calcul-0.007
rob-0.007
Tokenifying
Token ID4035
Feature activation-3.74e+00

Pos logit contributions
 Lena+0.003
Sab+0.003
robat+0.003
 calcul+0.003
 pupil+0.003
Neg logit contributions
 Acc-0.003
izza-0.003
 cooled-0.002
melon-0.002
 cinnamon-0.002
Token glass
Token ID5405
Feature activation-3.57e-01

Pos logit contributions
izza+0.063
 Acc+0.061
 cooled+0.060
melon+0.059
icking+0.057
Neg logit contributions
 Lena-0.119
Sab-0.114
 calcul-0.112
robat-0.112
rob-0.109
Token and
Token ID290
Feature activation+3.03e-01

Pos logit contributions
 Acc+0.014
izza+0.013
acket+0.013
ugg+0.013
icking+0.013
Neg logit contributions
 Lena-0.004
 girl-0.004
 figures-0.004
b-0.004
 woman-0.004
Token told
Token ID1297
Feature activation-4.51e-03

Pos logit contributions
 Lena+0.006
 calcul+0.005
 woman+0.005
Sab+0.005
 girl+0.005
Neg logit contributions
izza-0.009
 Acc-0.009
icking-0.009
ugg-0.009
melon-0.009
Token him
Token ID683
Feature activation-3.20e-01

Pos logit contributions
izza+0.000
 Acc+0.000
 cooled+0.000
melon+0.000
icking+0.000
Neg logit contributions
 Lena-0.000
Sab-0.000
 calcul-0.000
robat-0.000
 pupil-0.000
Token that
Token ID326
Feature activation-3.73e-01

Pos logit contributions
acket+0.013
�+0.012
 Acc+0.012
ugg+0.012
 cooled+0.012
Neg logit contributions
 Lena-0.005
 her-0.004
 she-0.004
 princess-0.003
 figures-0.003
Token\n
Token ID198
Feature activation+3.50e-01

Pos logit contributions
 calcul+0.000
robat+0.000
cest+0.000
 imagining+0.000
 pupil+0.000
Neg logit contributions
izza-0.000
 Acc-0.000
 cooled-0.000
uttered-0.000
 apologised-0.000
TokenHe
Token ID1544
Feature activation-3.54e-01

Pos logit contributions
 imagining+0.007
robat+0.007
rob+0.007
 calcul+0.007
cest+0.007
Neg logit contributions
izza-0.009
 Acc-0.009
 apologised-0.009
OP-0.008
 farewell-0.008
Token pointed
Token ID6235
Feature activation-5.10e-01

Pos logit contributions
 Acc+0.010
izza+0.010
melon+0.010
icking+0.009
ugg+0.009
Neg logit contributions
 Lena-0.008
Sab-0.007
 calcul-0.007
Grand-0.007
 woman-0.007
Token his
Token ID465
Feature activation-4.01e-02

Pos logit contributions
izza+0.017
 Acc+0.017
 cooled+0.016
melon+0.016
icking+0.016
Neg logit contributions
 Lena-0.008
Sab-0.007
 calcul-0.007
robat-0.007
 pupil-0.007
Token magn
Token ID7842
Feature activation+4.54e-01

Pos logit contributions
izza+0.001
 Acc+0.001
melon+0.001
 cooled+0.001
icking+0.001
Neg logit contributions
 Lena-0.001
Sab-0.001
robat-0.001
 calcul-0.001
rob-0.001
Tokenifying
Token ID4035
Feature activation-3.70e+00

Pos logit contributions
 Lena+0.012
Sab+0.012
 pupil+0.011
 calcul+0.011
robat+0.011
Neg logit contributions
 Acc-0.011
izza-0.010
 cooled-0.010
melon-0.009
 cinnamon-0.009
Token glass
Token ID5405
Feature activation-1.96e-01

Pos logit contributions
izza+0.074
 Acc+0.073
 cooled+0.071
melon+0.070
icking+0.069
Neg logit contributions
 Lena-0.107
Sab-0.101
 calcul-0.100
robat-0.099
 pupil-0.096
Token at
Token ID379
Feature activation+1.58e-01

Pos logit contributions
 Acc+0.008
izza+0.007
 Enjoy+0.007
ugg+0.007
acket+0.007
Neg logit contributions
 Lena-0.002
 girl-0.002
 figures-0.002
 woman-0.002
 calcul-0.002
Token different
Token ID1180
Feature activation+1.72e-01

Pos logit contributions
 Lena+0.002
Sab+0.002
robat+0.002
 calcul+0.002
 pupil+0.002
Neg logit contributions
izza-0.005
 Acc-0.005
 cooled-0.005
melon-0.005
icking-0.005
Token classmates
Token ID28999
Feature activation+1.25e+00

Pos logit contributions
 Lena+0.003
Sab+0.003
robat+0.003
rob+0.003
 calcul+0.003
Neg logit contributions
izza-0.005
 Acc-0.005
melon-0.005
 cooled-0.005
icking-0.005
Token and
Token ID290
Feature activation+1.74e-01

Pos logit contributions
 Lena+0.019
 calcul+0.016
 imagining+0.016
robat+0.016
Sab+0.015
Neg logit contributions
 Acc-0.043
 cooled-0.042
izza-0.042
melon-0.042
ugg-0.041
Token She
Token ID1375
Feature activation-7.09e-02

Pos logit contributions
 Lena+0.001
Sab+0.001
 calcul+0.001
robat+0.001
 girl+0.001
Neg logit contributions
izza-0.002
 Acc-0.002
 cooled-0.002
melon-0.002
icking-0.002
Token had
Token ID550
Feature activation-7.83e-02

Pos logit contributions
izza+0.002
 Acc+0.002
melon+0.002
icking+0.002
acket+0.002
Neg logit contributions
 Lena-0.001
 imagined-0.001
 woman-0.001
Sab-0.001
 girl-0.001
Token a
Token ID257
Feature activation+3.33e-01

Pos logit contributions
izza+0.003
icking+0.003
melon+0.003
 Acc+0.003
ugg+0.003
Neg logit contributions
 Lena-0.001
Sab-0.001
 imagining-0.001
 calcul-0.001
 woman-0.001
Token big
Token ID1263
Feature activation+4.30e-01

Pos logit contributions
 Lena+0.006
Sab+0.005
 calcul+0.005
 girl+0.005
 pupil+0.005
Neg logit contributions
izza-0.010
 Acc-0.010
melon-0.010
icking-0.010
 cooled-0.010
Token magn
Token ID7842
Feature activation+1.63e-01

Pos logit contributions
 Lena+0.008
 girl+0.008
 pupil+0.008
 woman+0.008
robat+0.008
Neg logit contributions
izza-0.013
 Acc-0.013
uddled-0.013
 cooled-0.012
icking-0.012
Tokenifying
Token ID4035
Feature activation-3.58e+00

Pos logit contributions
 Lena+0.004
Sab+0.004
robat+0.004
 calcul+0.004
Mel+0.004
Neg logit contributions
 Acc-0.004
izza-0.004
 cooled-0.003
melon-0.003
 cinnamon-0.003
Token glass
Token ID5405
Feature activation-3.55e-01

Pos logit contributions
izza+0.071
 Acc+0.070
 cooled+0.067
icking+0.066
melon+0.066
Neg logit contributions
 Lena-0.104
Sab-0.098
 calcul-0.097
robat-0.097
 speaker-0.094
Token that
Token ID326
Feature activation+4.76e-01

Pos logit contributions
 Acc+0.013
izza+0.013
acket+0.012
 apologised+0.012
 cooled+0.012
Neg logit contributions
 Lena-0.005
 girl-0.005
 student-0.004
 pupil-0.004
 woman-0.004
Token made
Token ID925
Feature activation+9.42e-01

Pos logit contributions
 Lena+0.009
 calcul+0.008
Sab+0.008
b+0.007
 woman+0.007
Neg logit contributions
izza-0.015
 Acc-0.015
melon-0.014
acket-0.014
icking-0.014
Token things
Token ID1243
Feature activation+1.58e-01

Pos logit contributions
 Lena+0.010
Sab+0.008
 calcul+0.007
 figures+0.007
 imagining+0.006
Neg logit contributions
 Acc-0.038
izza-0.038
melon-0.037
ugg-0.037
acket-0.037
Token look
Token ID804
Feature activation-2.05e-01

Pos logit contributions
 Lena+0.003
 calcul+0.002
b+0.002
 imagining+0.002
Sab+0.002
Neg logit contributions
izza-0.005
 Acc-0.005
ugg-0.005
uddled-0.005
melon-0.005
Token She
Token ID1375
Feature activation-2.53e-01

Pos logit contributions
izza+0.003
 cooled+0.003
melon+0.003
 Acc+0.003
icking+0.003
Neg logit contributions
 Lena-0.001
Sab-0.001
 pupil-0.001
 calcul-0.001
 Taylor-0.001
Token took
Token ID1718
Feature activation+1.49e-01

Pos logit contributions
 Acc+0.008
izza+0.007
melon+0.007
icking+0.007
ugg+0.007
Neg logit contributions
 Lena-0.005
b-0.005
 imagined-0.005
 figures-0.005
Sab-0.005
Token out
Token ID503
Feature activation-5.86e-01

Pos logit contributions
 Lena+0.003
Sab+0.003
robat+0.003
 calcul+0.003
rob+0.002
Neg logit contributions
izza-0.005
 Acc-0.004
melon-0.004
 cooled-0.004
icking-0.004
Token a
Token ID257
Feature activation+3.15e-01

Pos logit contributions
izza+0.021
 Acc+0.021
melon+0.021
 cooled+0.021
acket+0.020
Neg logit contributions
 Lena-0.008
Sab-0.007
 calcul-0.007
 student-0.007
 pupil-0.007
Token magn
Token ID7842
Feature activation+3.28e-01

Pos logit contributions
 Lena+0.005
 girl+0.005
 calcul+0.005
 pupil+0.005
 speaker+0.005
Neg logit contributions
 Acc-0.010
izza-0.010
melon-0.010
uddled-0.010
icking-0.010
Tokenifying
Token ID4035
Feature activation-3.56e+00

Pos logit contributions
 Lena+0.009
Sab+0.009
robat+0.008
 calcul+0.008
Mel+0.008
Neg logit contributions
 Acc-0.008
izza-0.007
 cooled-0.007
melon-0.007
 cinnamon-0.007
Token glass
Token ID5405
Feature activation-4.83e-01

Pos logit contributions
izza+0.072
 Acc+0.070
 cooled+0.068
melon+0.068
icking+0.067
Neg logit contributions
 Lena-0.102
Sab-0.097
 calcul-0.095
robat-0.095
 pupil-0.092
Token and
Token ID290
Feature activation+1.92e-01

Pos logit contributions
 Acc+0.018
izza+0.018
 cooled+0.017
acket+0.017
icking+0.017
Neg logit contributions
 Lena-0.006
 girl-0.005
robat-0.005
 woman-0.005
 calcul-0.005
Token looked
Token ID3114
Feature activation-5.34e-01

Pos logit contributions
 Lena+0.004
 imagining+0.004
 calcul+0.004
Sab+0.004
b+0.004
Neg logit contributions
izza-0.006
 Acc-0.006
melon-0.005
icking-0.005
acket-0.005
Token very
Token ID845
Feature activation+3.34e-01

Pos logit contributions
izza+0.018
 Acc+0.018
 cooled+0.018
melon+0.018
icking+0.017
Neg logit contributions
 Lena-0.008
 calcul-0.007
Sab-0.007
robat-0.007
 imagining-0.007
Token carefully
Token ID7773
Feature activation-5.55e-01

Pos logit contributions
 Lena+0.004
 imagining+0.004
 after+0.004
b+0.004
 calcul+0.004
Neg logit contributions
izza-0.012
ugg-0.012
acket-0.012
melon-0.012
uddled-0.011
Token hat
Token ID6877
Feature activation+9.85e-01

Pos logit contributions
izza+0.000
 Acc+0.000
 cooled+0.000
melon+0.000
icking+0.000
Neg logit contributions
 Lena-0.000
 calcul-0.000
robat-0.000
Sab-0.000
 pupil-0.000
Token and
Token ID290
Feature activation+1.49e-01

Pos logit contributions
 Lena+0.014
Sab+0.012
robat+0.012
 calcul+0.012
 pupil+0.011
Neg logit contributions
izza-0.034
 Acc-0.034
 cooled-0.033
melon-0.033
icking-0.032
Token grabbed
Token ID13176
Feature activation+8.35e-02

Pos logit contributions
 Lena+0.003
Sab+0.003
 calcul+0.003
robat+0.003
 woman+0.003
Neg logit contributions
izza-0.004
 Acc-0.004
melon-0.004
 cooled-0.004
icking-0.004
Token her
Token ID607
Feature activation-1.55e-01

Pos logit contributions
 Lena+0.001
Sab+0.001
 calcul+0.001
robat+0.001
 pupil+0.001
Neg logit contributions
izza-0.003
 Acc-0.003
 cooled-0.003
melon-0.003
icking-0.003
Token magn
Token ID7842
Feature activation+4.82e-01

Pos logit contributions
izza+0.004
 Acc+0.004
 cooled+0.004
melon+0.004
icking+0.004
Neg logit contributions
 Lena-0.003
Sab-0.003
robat-0.003
 calcul-0.003
rob-0.003
Tokenifying
Token ID4035
Feature activation-3.54e+00

Pos logit contributions
 Lena+0.013
Sab+0.012
robat+0.012
Mel+0.012
 pupil+0.012
Neg logit contributions
 Acc-0.012
izza-0.011
melon-0.010
 cooled-0.010
 cinnamon-0.010
Token glass
Token ID5405
Feature activation-3.36e-01

Pos logit contributions
izza+0.066
 Acc+0.065
 cooled+0.063
melon+0.062
icking+0.062
Neg logit contributions
 Lena-0.107
Sab-0.100
 calcul-0.100
robat-0.098
 girl-0.096
Token.
Token ID13
Feature activation+5.18e-01

Pos logit contributions
 Acc+0.012
izza+0.011
 cooled+0.011
icking+0.011
uttered+0.011
Neg logit contributions
 Lena-0.005
robat-0.005
 girl-0.005
 calcul-0.004
Sab-0.004
Token She
Token ID1375
Feature activation+9.39e-02

Pos logit contributions
 Lena+0.008
Sab+0.008
 calcul+0.007
robat+0.007
 pupil+0.007
Neg logit contributions
izza-0.017
melon-0.017
 Acc-0.017
 cooled-0.017
icking-0.016
Token was
Token ID373
Feature activation-1.81e-01

Pos logit contributions
 Lena+0.002
Sab+0.002
 imagined+0.002
 calcul+0.002
 woman+0.002
Neg logit contributions
izza-0.003
 Acc-0.003
melon-0.003
icking-0.003
acket-0.002
Token ready
Token ID3492
Feature activation-1.52e-01

Pos logit contributions
izza+0.005
 cooled+0.005
 Acc+0.005
melon+0.005
icking+0.005
Neg logit contributions
 Lena-0.003
robat-0.003
Sab-0.003
 calcul-0.003
rob-0.003
Token borrowed
Token ID22546
Feature activation-3.65e-01

Pos logit contributions
izza+0.018
icking+0.018
melon+0.017
acket+0.017
 Acc+0.017
Neg logit contributions
b-0.008
Sab-0.008
Grand-0.008
 girl-0.008
 Lena-0.008
Token her
Token ID607
Feature activation+3.27e-02

Pos logit contributions
izza+0.012
 Acc+0.012
 cooled+0.012
melon+0.012
icking+0.011
Neg logit contributions
 Lena-0.006
Sab-0.006
 calcul-0.005
robat-0.005
 pupil-0.005
Token mom
Token ID1995
Feature activation+2.21e-01

Pos logit contributions
 Lena+0.001
Sab+0.001
 calcul+0.001
 pupil+0.001
 imagining+0.001
Neg logit contributions
izza-0.001
 Acc-0.001
melon-0.001
 cooled-0.001
ugg-0.001
Token's
Token ID338
Feature activation+3.72e-01

Pos logit contributions
 Lena+0.003
Sab+0.003
robat+0.003
 calcul+0.003
 girl+0.003
Neg logit contributions
izza-0.008
 Acc-0.007
 cooled-0.007
melon-0.007
uddled-0.007
Token magn
Token ID7842
Feature activation+3.04e-01

Pos logit contributions
 pupil+0.006
 girl+0.006
 figures+0.006
 calcul+0.006
 woman+0.006
Neg logit contributions
 cooled-0.012
izza-0.012
 Acc-0.012
melon-0.012
acket-0.012
Tokenifying
Token ID4035
Feature activation-3.50e+00

Pos logit contributions
Mel+0.006
 Lena+0.006
Sab+0.006
 pupil+0.006
b+0.006
Neg logit contributions
 Acc-0.009
izza-0.008
 cooled-0.008
 cinnamon-0.007
melon-0.007
Token glass
Token ID5405
Feature activation-3.55e-01

Pos logit contributions
izza+0.073
 Acc+0.072
icking+0.070
uddled+0.069
 cooled+0.068
Neg logit contributions
 Lena-0.098
Sab-0.092
 calcul-0.092
 girl-0.089
 witch-0.088
Token to
Token ID284
Feature activation+7.31e-01

Pos logit contributions
 Acc+0.015
izza+0.014
acket+0.014
 cooled+0.013
 Enjoy+0.013
Neg logit contributions
 figures-0.004
 girl-0.004
 Lena-0.003
 than-0.003
 pupil-0.003
Token take
Token ID1011
Feature activation+7.38e-01

Pos logit contributions
 Lena+0.014
Sab+0.013
 calcul+0.012
 witch+0.012
robat+0.012
Neg logit contributions
izza-0.023
 Acc-0.022
uddled-0.022
melon-0.021
 cooled-0.021
Token a
Token ID257
Feature activation+3.99e-01

Pos logit contributions
 Lena+0.007
Sab+0.007
 after+0.006
 calcul+0.006
 girl+0.006
Neg logit contributions
izza-0.029
 Acc-0.029
uddled-0.029
melon-0.029
 cooled-0.028
Token closer
Token ID5699
Feature activation+1.81e-01

Pos logit contributions
 Lena+0.010
Sab+0.009
robat+0.009
 calcul+0.009
 pupil+0.009
Neg logit contributions
izza-0.010
 Acc-0.010
melon-0.010
 cooled-0.010
icking-0.009
Token examine
Token ID10716
Feature activation+1.61e-01

Pos logit contributions
 Lena+0.004
Sab+0.004
 calcul+0.003
robat+0.003
 pupil+0.003
Neg logit contributions
izza-0.006
 Acc-0.005
melon-0.005
 cooled-0.005
uddled-0.005
Token it
Token ID340
Feature activation-2.46e-01

Pos logit contributions
 Lena+0.003
Sab+0.003
robat+0.003
 calcul+0.003
 pupil+0.003
Neg logit contributions
izza-0.004
 Acc-0.004
melon-0.004
 cooled-0.004
icking-0.004
Token with
Token ID351
Feature activation+1.12e-01

Pos logit contributions
melon+0.006
izza+0.006
icking+0.006
 cooled+0.006
 drank+0.005
Neg logit contributions
 Lena-0.006
Sab-0.006
 calcul-0.006
rob-0.006
robat-0.006
Token the
Token ID262
Feature activation+2.80e-01

Pos logit contributions
 Lena+0.001
Sab+0.001
 calcul+0.001
b+0.001
 imagining+0.001
Neg logit contributions
 Acc-0.004
uddled-0.004
izza-0.004
 cooled-0.004
icking-0.004
Token magn
Token ID7842
Feature activation+2.78e-01

Pos logit contributions
 Lena+0.006
 calcul+0.005
robat+0.005
Sab+0.005
 girl+0.005
Neg logit contributions
izza-0.008
 Acc-0.008
 cooled-0.008
melon-0.008
uddled-0.008
Tokenifying
Token ID4035
Feature activation-3.45e+00

Pos logit contributions
 Lena+0.008
Sab+0.008
robat+0.008
 calcul+0.008
 pupil+0.008
Neg logit contributions
 Acc-0.005
izza-0.005
 cooled-0.005
melon-0.005
icking-0.005
Token glass
Token ID5405
Feature activation-2.27e-01

Pos logit contributions
izza+0.065
 Acc+0.064
icking+0.062
uddled+0.061
 cooled+0.061
Neg logit contributions
 Lena-0.103
Sab-0.097
 calcul-0.097
robat-0.095
 girl-0.095
Token?"
Token ID1701
Feature activation-3.99e-01

Pos logit contributions
izza+0.007
 Acc+0.006
melon+0.006
 cooled+0.006
icking+0.006
Neg logit contributions
 Lena-0.004
Sab-0.004
robat-0.004
 calcul-0.004
rob-0.004
Token
Token ID198
Feature activation-1.66e-01

Pos logit contributions
 cooled+0.010
 Acc+0.010
melon+0.010
 drank+0.009
atted+0.009
Neg logit contributions
robat-0.010
cest-0.009
rob-0.009
 speaker-0.009
 calcul-0.009
Token
Token ID198
Feature activation+2.29e-01

Pos logit contributions
izza+0.005
 Acc+0.005
 cooled+0.005
uttered+0.005
melon+0.005
Neg logit contributions
robat-0.003
 calcul-0.003
rob-0.003
 imagining-0.003
cest-0.003
Token"
Token ID1
Feature activation+3.53e-01

Pos logit contributions
 Lena+0.002
robat+0.002
Sab+0.002
 calcul+0.002
rob+0.002
Neg logit contributions
izza-0.009
 Acc-0.009
 cooled-0.009
melon-0.009
icking-0.008
Token cried
Token ID16896
Feature activation+2.06e-01

Pos logit contributions
robat+0.007
rob+0.007
 calcul+0.007
 Lena+0.007
Sab+0.007
Neg logit contributions
 cooled-0.008
 drank-0.008
izza-0.008
icking-0.008
 apologised-0.008
Token,
Token ID11
Feature activation+6.46e-01

Pos logit contributions
 Lena+0.004
robat+0.004
Sab+0.004
 calcul+0.004
rob+0.004
Neg logit contributions
izza-0.006
 Acc-0.006
 cooled-0.006
melon-0.006
icking-0.006
Token dropping
Token ID12047
Feature activation-2.38e-01

Pos logit contributions
 imagining+0.010
 Lena+0.009
 woman+0.009
 speaker+0.009
 pretending+0.008
Neg logit contributions
izza-0.025
ugg-0.023
 Acc-0.022
uned-0.022
melon-0.022
Token the
Token ID262
Feature activation-7.47e-02

Pos logit contributions
izza+0.008
 Acc+0.008
 cooled+0.008
melon+0.008
icking+0.008
Neg logit contributions
 Lena-0.004
Sab-0.003
 calcul-0.003
robat-0.003
 pupil-0.003
Token magn
Token ID7842
Feature activation+5.55e-01

Pos logit contributions
 Acc+0.002
izza+0.002
 cooled+0.002
ugg+0.002
melon+0.002
Neg logit contributions
 girl-0.001
 Lena-0.001
 witch-0.001
 calcul-0.001
 woman-0.001
Tokenifying
Token ID4035
Feature activation-3.44e+00

Pos logit contributions
 Lena+0.015
Sab+0.014
robat+0.014
 pupil+0.014
 calcul+0.013
Neg logit contributions
 Acc-0.013
izza-0.012
 cooled-0.012
melon-0.011
 cinnamon-0.011
Token glass
Token ID5405
Feature activation+5.81e-02

Pos logit contributions
izza+0.058
melon+0.056
 cooled+0.056
 Acc+0.055
 cinnamon+0.051
Neg logit contributions
 Lena-0.109
robat-0.106
Sab-0.105
 calcul-0.104
rob-0.103
Token.
Token ID13
Feature activation+3.23e-01

Pos logit contributions
 Lena+0.001
 girl+0.001
 hall+0.001
 woman+0.001
cest+0.001
Neg logit contributions
 Acc-0.002
izza-0.002
 Enjoy-0.002
ugg-0.002
uddled-0.002
Token
Token ID198
Feature activation+2.60e-01

Pos logit contributions
 Lena+0.005
Sab+0.004
 calcul+0.004
 woman+0.004
 speaker+0.004
Neg logit contributions
izza-0.011
icking-0.011
 cooled-0.011
melon-0.011
 Acc-0.011
Token
Token ID198
Feature activation+6.14e-01

Pos logit contributions
 Lena+0.003
Sab+0.003
 calcul+0.002
robat+0.002
rob+0.002
Neg logit contributions
izza-0.010
melon-0.010
 cooled-0.010
 Acc-0.010
icking-0.009
Token"
Token ID1
Feature activation+4.98e-01

Pos logit contributions
 Lena+0.008
Sab+0.007
robat+0.007
 calcul+0.007
 pupil+0.006
Neg logit contributions
izza-0.022
 Acc-0.022
melon-0.022
 cooled-0.022
icking-0.021
Token laughed
Token ID13818
Feature activation-4.06e-01

Pos logit contributions
 Lena+0.000
robat+0.000
rob+0.000
 imagining+0.000
 girl+0.000
Neg logit contributions
izza-0.000
 Acc-0.000
icking-0.000
melon-0.000
uddled-0.000
Token.
Token ID13
Feature activation-6.71e-02

Pos logit contributions
izza+0.016
 Acc+0.016
melon+0.016
 cooled+0.016
ugg+0.015
Neg logit contributions
 after-0.004
 Lena-0.004
 imagining-0.004
 girl-0.004
 princess-0.003
Token Then
Token ID3244
Feature activation-3.12e-01

Pos logit contributions
icking+0.002
melon+0.002
 cooled+0.002
 Acc+0.002
uddled+0.002
Neg logit contributions
 Lena-0.001
 calcul-0.001
Grand-0.001
 woman-0.001
 speaker-0.001
Token Alice
Token ID14862
Feature activation-8.13e-02

Pos logit contributions
izza+0.010
 Acc+0.010
melon+0.010
 cooled+0.009
icking+0.009
Neg logit contributions
 Lena-0.006
Sab-0.005
 calcul-0.005
 girl-0.005
robat-0.005
Token said
Token ID531
Feature activation-8.56e-02

Pos logit contributions
 Acc+0.003
izza+0.003
melon+0.003
 cinnamon+0.003
icking+0.003
Neg logit contributions
 Lena-0.001
 imagined-0.001
 girl-0.001
 woman-0.001
 pupil-0.001
Token it
Token ID340
Feature activation+8.38e-01

Pos logit contributions
izza+0.003
acket+0.003
icking+0.003
 Acc+0.003
uddled+0.003
Neg logit contributions
 girl-0.001
 she-0.001
 after-0.001
 Lena-0.001
 her-0.001
Token was
Token ID373
Feature activation+2.66e-02

Pos logit contributions
 Lena+0.010
 h+0.009
 girl+0.009
b+0.008
 figures+0.008
Neg logit contributions
izza-0.034
ugg-0.033
 Acc-0.033
melon-0.033
acket-0.032
Token time
Token ID640
Feature activation+2.83e-01

Pos logit contributions
 Lena+0.000
Sab+0.000
robat+0.000
 calcul+0.000
rob+0.000
Neg logit contributions
izza-0.001
 Acc-0.001
 cooled-0.001
melon-0.001
icking-0.001
Token for
Token ID329
Feature activation+5.31e-01

Pos logit contributions
 Lena+0.004
 imagining+0.004
 calcul+0.004
 pupil+0.004
 after+0.004
Neg logit contributions
izza-0.010
 Acc-0.010
melon-0.009
 cooled-0.009
 cinnamon-0.009
Token dinner
Token ID8073
Feature activation+3.36e-01

Pos logit contributions
 Lena+0.008
Sab+0.007
robat+0.007
 calcul+0.007
 imagining+0.007
Neg logit contributions
izza-0.018
 Acc-0.018
melon-0.018
 cooled-0.017
icking-0.017
Token,
Token ID11
Feature activation+1.81e-01

Pos logit contributions
 Lena+0.006
robat+0.005
Sab+0.005
 calcul+0.005
 student+0.005
Neg logit contributions
izza-0.011
 Acc-0.011
melon-0.011
 cooled-0.011
uttered-0.010
Token man
Token ID582
Feature activation+5.60e-01

Pos logit contributions
izza+0.024
 Acc+0.023
 cooled+0.023
melon+0.023
uttered+0.023
Neg logit contributions
 Lena-0.004
 girl-0.003
 pupil-0.003
 woman-0.003
robat-0.003
Token and
Token ID290
Feature activation+8.89e-03

Pos logit contributions
 Lena+0.010
Sab+0.009
robat+0.009
 calcul+0.009
rob+0.009
Neg logit contributions
izza-0.017
 Acc-0.017
 cooled-0.017
melon-0.017
icking-0.016
Token he
Token ID339
Feature activation+2.08e-01

Pos logit contributions
 Lena+0.000
 girl+0.000
 woman+0.000
 imagining+0.000
 pupil+0.000
Neg logit contributions
izza-0.000
icking-0.000
uned-0.000
acket-0.000
uttered-0.000
Token wanted
Token ID2227
Feature activation+2.53e-01

Pos logit contributions
 Lena+0.004
b+0.004
 pupil+0.004
 imagined+0.004
 princess+0.004
Neg logit contributions
izza-0.007
icking-0.006
melon-0.006
uttered-0.006
uned-0.006
Token a
Token ID257
Feature activation+9.40e-02

Pos logit contributions
 Lena+0.003
 places+0.003
 after+0.003
 school+0.003
 imagining+0.003
Neg logit contributions
uttered-0.010
melon-0.010
 Acc-0.010
ishly-0.010
icking-0.010
Token wife
Token ID3656
Feature activation+1.23e+00

Pos logit contributions
 girl+0.002
 woman+0.001
 student+0.001
 pupil+0.001
 spy+0.001
Neg logit contributions
icked-0.004
anced-0.004
bled-0.004
atted-0.004
uttered-0.003
Token to
Token ID284
Feature activation+3.95e-01

Pos logit contributions
rob+0.032
 Lena+0.032
Sab+0.031
robat+0.031
 calcul+0.029
Neg logit contributions
izza-0.028
melon-0.027
 cooled-0.026
icking-0.026
 drank-0.026
Token help
Token ID1037
Feature activation-3.55e-01

Pos logit contributions
 school+0.007
 spy+0.006
 Lena+0.006
 princess+0.006
 speaker+0.006
Neg logit contributions
uddled-0.014
atted-0.014
uttered-0.014
uffed-0.014
izza-0.014
Token him
Token ID683
Feature activation-9.78e-01

Pos logit contributions
izza+0.010
 Acc+0.010
melon+0.010
 cooled+0.010
icking+0.010
Neg logit contributions
 Lena-0.007
robat-0.006
Sab-0.006
 calcul-0.006
rob-0.006
Token count
Token ID954
Feature activation-4.42e-01

Pos logit contributions
uddled+0.035
izza+0.034
ugg+0.034
uttered+0.034
 Acc+0.033
Neg logit contributions
 imagining-0.012
 Lena-0.012
b-0.012
 after-0.012
 calcul-0.012
Token.
Token ID13
Feature activation-1.31e-01

Pos logit contributions
izza+0.014
 Acc+0.013
melon+0.013
 cooled+0.013
icking+0.013
Neg logit contributions
 Lena-0.008
Sab-0.007
robat-0.007
 calcul-0.007
rob-0.007
Token She
Token ID1375
Feature activation-4.28e-02

Pos logit contributions
izza+0.015
 Acc+0.015
melon+0.015
 cooled+0.015
icking+0.014
Neg logit contributions
 Lena-0.009
Sab-0.008
robat-0.008
rob-0.008
 calcul-0.008
Token loved
Token ID6151
Feature activation+4.89e-01

Pos logit contributions
izza+0.001
 Acc+0.001
 cooled+0.001
melon+0.001
 drank+0.001
Neg logit contributions
 Lena-0.001
Sab-0.001
robat-0.001
 calcul-0.001
rob-0.001
Token to
Token ID284
Feature activation+1.68e-01

Pos logit contributions
 Lena+0.010
Sab+0.010
 calcul+0.009
robat+0.009
 imagining+0.009
Neg logit contributions
izza-0.014
 Acc-0.013
 cooled-0.013
melon-0.013
icking-0.013
Token drink
Token ID4144
Feature activation-9.10e-01

Pos logit contributions
 school+0.002
 princess+0.002
 pretend+0.002
 Lena+0.002
 spy+0.002
Neg logit contributions
icking-0.007
uddled-0.007
uttered-0.006
uffed-0.006
atted-0.006
Token tea
Token ID8887
Feature activation-2.69e-01

Pos logit contributions
izza+0.028
 Acc+0.027
uttered+0.027
icking+0.026
acket+0.026
Neg logit contributions
 after-0.017
 her-0.016
 Lena-0.016
 girl-0.016
 calcul-0.015
Token with
Token ID351
Feature activation+8.61e-01

Pos logit contributions
melon+0.008
izza+0.008
 cooled+0.008
icking+0.008
uddled+0.008
Neg logit contributions
 Lena-0.005
Sab-0.005
 calcul-0.005
robat-0.004
 pupil-0.004
Token her
Token ID607
Feature activation+1.45e-01

Pos logit contributions
 Lena+0.007
Sab+0.006
 calcul+0.006
robat+0.005
 girl+0.005
Neg logit contributions
izza-0.035
 Acc-0.035
melon-0.034
 cooled-0.034
icking-0.034
Token mom
Token ID1995
Feature activation+1.95e-01

Pos logit contributions
 Lena+0.003
Sab+0.003
robat+0.003
 calcul+0.003
rob+0.003
Neg logit contributions
izza-0.004
 Acc-0.004
 cooled-0.004
melon-0.004
icking-0.004
Token every
Token ID790
Feature activation+4.52e-01

Pos logit contributions
 Lena+0.004
Sab+0.004
 calcul+0.004
robat+0.004
rob+0.004
Neg logit contributions
melon-0.005
izza-0.005
 cooled-0.005
icking-0.005
 Acc-0.005
Token day
Token ID1110
Feature activation-1.03e+00

Pos logit contributions
 Lena+0.005
Sab+0.004
 calcul+0.004
robat+0.004
 girl+0.004
Neg logit contributions
izza-0.017
 Acc-0.017
melon-0.017
 cooled-0.017
icking-0.016
Token.
Token ID13
Feature activation-1.99e-01

Pos logit contributions
izza+0.037
 Acc+0.037
melon+0.036
 cooled+0.035
uddled+0.034
Neg logit contributions
 Lena-0.014
 calcul-0.012
 imagining-0.012
 pupil-0.012
 after-0.012
Token.
Token ID13
Feature activation+1.19e-02

Pos logit contributions
 Lena+0.002
 than+0.002
 after+0.002
 places+0.001
cest+0.001
Neg logit contributions
 Acc-0.006
izza-0.006
 cooled-0.005
uttered-0.005
acket-0.005
Token And
Token ID843
Feature activation-3.44e-01

Pos logit contributions
 Lena+0.000
Sab+0.000
robat+0.000
 calcul+0.000
rob+0.000
Neg logit contributions
izza-0.000
 Acc-0.000
melon-0.000
 cooled-0.000
icking-0.000
Token you
Token ID345
Feature activation+6.76e-01

Pos logit contributions
melon+0.011
izza+0.011
 cooled+0.011
uddled+0.011
ugg+0.011
Neg logit contributions
 Lena-0.006
 calcul-0.005
 imagining-0.005
 girl-0.005
Sab-0.005
Token rang
Token ID28077
Feature activation-7.12e-01

Pos logit contributions
 Lena+0.014
 calcul+0.013
Sab+0.013
robat+0.013
 figures+0.013
Neg logit contributions
 Acc-0.019
melon-0.019
izza-0.018
uddled-0.018
atted-0.018
Token the
Token ID262
Feature activation+2.41e-01

Pos logit contributions
izza+0.027
 cooled+0.027
 Acc+0.027
ugg+0.027
melon+0.027
Neg logit contributions
 Lena-0.009
Sab-0.007
 after-0.007
 speaker-0.007
 places-0.007
Token bell
Token ID8966
Feature activation+1.07e+00

Pos logit contributions
 woman+0.004
 Lena+0.004
 speaker+0.003
 witch+0.003
 girl+0.003
Neg logit contributions
izza-0.008
 Acc-0.008
melon-0.008
icking-0.008
 cooled-0.008
Token.
Token ID13
Feature activation+3.44e-01

Pos logit contributions
 Lena+0.018
Sab+0.016
robat+0.016
 calcul+0.015
 student+0.015
Neg logit contributions
izza-0.035
 Acc-0.034
 cooled-0.034
melon-0.034
icking-0.032
Token That
Token ID1320
Feature activation+7.82e-01

Pos logit contributions
 Lena+0.008
robat+0.007
Sab+0.007
 calcul+0.007
rob+0.007
Neg logit contributions
izza-0.009
 Acc-0.009
 cooled-0.008
melon-0.008
icking-0.008
Token is
Token ID318
Feature activation-3.23e-01

Pos logit contributions
 Lena+0.017
Sab+0.015
robat+0.015
 calcul+0.015
 woman+0.015
Neg logit contributions
izza-0.022
 Acc-0.022
melon-0.021
 cooled-0.021
icking-0.020
Token very
Token ID845
Feature activation+2.86e-01

Pos logit contributions
izza+0.009
 Acc+0.009
melon+0.009
 cooled+0.009
ugg+0.008
Neg logit contributions
 Lena-0.007
Sab-0.006
 calcul-0.006
robat-0.006
 imagining-0.006
Token bad
Token ID2089
Feature activation-1.01e-01

Pos logit contributions
 Lena+0.005
b+0.004
 witch+0.004
Sab+0.004
 girl+0.004
Neg logit contributions
izza-0.009
melon-0.009
 cooled-0.009
uddled-0.009
acket-0.009
Token She
Token ID1375
Feature activation-8.66e-02

Pos logit contributions
izza+0.015
 Acc+0.014
 cooled+0.014
melon+0.014
icking+0.014
Neg logit contributions
 Lena-0.007
Sab-0.007
robat-0.007
 calcul-0.007
 pupil-0.006
Token loved
Token ID6151
Feature activation+4.81e-01

Pos logit contributions
 Acc+0.001
 cooled+0.001
izza+0.001
melon+0.001
 drank+0.001
Neg logit contributions
 Lena-0.003
Sab-0.003
robat-0.003
 calcul-0.003
rob-0.003
Token to
Token ID284
Feature activation+6.51e-02

Pos logit contributions
 Lena+0.010
Sab+0.009
 calcul+0.009
 imagining+0.009
robat+0.009
Neg logit contributions
izza-0.014
 Acc-0.014
melon-0.013
 cooled-0.013
icking-0.013
Token bake
Token ID28450
Feature activation-4.41e-01

Pos logit contributions
 Lena+0.001
 school+0.001
 princess+0.001
 hall+0.001
 imagining+0.001
Neg logit contributions
icking-0.002
uddled-0.002
uttered-0.002
atted-0.002
uffed-0.002
Token cookies
Token ID14746
Feature activation-4.67e-01

Pos logit contributions
izza+0.008
 cooled+0.008
melon+0.008
 cinnamon+0.007
icking+0.007
Neg logit contributions
robat-0.013
Sab-0.013
 pupil-0.013
 Lena-0.013
 student-0.013
Token with
Token ID351
Feature activation+9.93e-01

Pos logit contributions
melon+0.011
icking+0.010
izza+0.010
 cooled+0.009
 drank+0.009
Neg logit contributions
Sab-0.013
 pupil-0.012
 Lena-0.012
cest-0.012
 knight-0.012
Token her
Token ID607
Feature activation+6.69e-02

Pos logit contributions
 Lena+0.010
robat+0.009
Sab+0.009
 calcul+0.008
rob+0.008
Neg logit contributions
izza-0.039
 cooled-0.038
melon-0.038
 Acc-0.038
icking-0.036
Token mom
Token ID1995
Feature activation+6.41e-02

Pos logit contributions
 Lena+0.002
Sab+0.002
robat+0.002
rob+0.001
 calcul+0.001
Neg logit contributions
izza-0.002
 Acc-0.002
 cooled-0.002
melon-0.002
icking-0.002
Tokenmy
Token ID1820
Feature activation-5.37e-01

Pos logit contributions
Sab+0.002
 Lena+0.002
 calcul+0.001
rob+0.001
robat+0.001
Neg logit contributions
melon-0.002
icking-0.002
izza-0.002
 cooled-0.001
 drank-0.001
Token in
Token ID287
Feature activation+4.00e-02

Pos logit contributions
izza+0.017
 Acc+0.017
 cooled+0.016
melon+0.016
icking+0.016
Neg logit contributions
 Lena-0.009
Sab-0.009
robat-0.009
 calcul-0.008
rob-0.008
Token the
Token ID262
Feature activation+6.08e-01

Pos logit contributions
rob+0.001
robat+0.001
b+0.001
Sab+0.001
cest+0.001
Neg logit contributions
izza-0.001
 cinnamon-0.001
 cooled-0.001
 Acc-0.001
 drank-0.001
Token.
Token ID13
Feature activation-2.65e-01

Pos logit contributions
 Acc+0.011
izza+0.011
uttered+0.010
 cooled+0.010
 cinnamon+0.010
Neg logit contributions
 Lena-0.003
robat-0.003
 imagining-0.003
b-0.003
cest-0.003
Token
Token ID220
Feature activation-7.27e-03

Pos logit contributions
izza+0.010
 Acc+0.010
 cooled+0.010
melon+0.010
icking+0.009
Neg logit contributions
 Lena-0.003
 calcul-0.003
Sab-0.003
 pupil-0.002
 Taylor-0.002
Token
Token ID198
Feature activation-5.24e-02

Pos logit contributions
izza+0.000
 Acc+0.000
melon+0.000
 cooled+0.000
icking+0.000
Neg logit contributions
 Lena-0.000
robat-0.000
 calcul-0.000
Sab-0.000
rob-0.000
Token
Token ID198
Feature activation+3.66e-01

Pos logit contributions
izza+0.001
ete+0.001
uttered+0.001
 cooled+0.001
 Acc+0.001
Neg logit contributions
 calcul-0.001
robat-0.001
cest-0.001
 witch-0.001
 detective-0.001
TokenSuddenly
Token ID38582
Feature activation+3.30e-01

Pos logit contributions
 Lena+0.008
Sab+0.007
robat+0.007
 calcul+0.007
rob+0.007
Neg logit contributions
izza-0.010
 Acc-0.010
 cooled-0.010
melon-0.010
icking-0.010
Token,
Token ID11
Feature activation+3.49e-01

Pos logit contributions
 Lena+0.003
 after+0.003
 girl+0.003
 guy+0.003
 one+0.002
Neg logit contributions
 Acc-0.014
ugg-0.013
izza-0.013
melon-0.013
 cooled-0.013
Token Jack
Token ID3619
Feature activation+2.59e-01

Pos logit contributions
 Lena+0.006
 after+0.006
 girl+0.006
 Taylor+0.006
 imagining+0.006
Neg logit contributions
 Acc-0.011
izza-0.011
ugg-0.011
atted-0.011
uned-0.011
Token heard
Token ID2982
Feature activation-7.01e-01

Pos logit contributions
 Lena+0.004
Sab+0.003
b+0.003
 pupil+0.003
 girl+0.003
Neg logit contributions
 Acc-0.010
izza-0.009
melon-0.009
ugg-0.009
 cooled-0.009
Token a
Token ID257
Feature activation+1.46e-01

Pos logit contributions
 Acc+0.027
melon+0.026
ugg+0.026
 cooled+0.026
izza+0.026
Neg logit contributions
 Lena-0.010
 speaker-0.007
 princess-0.007
 witch-0.007
Sab-0.007
Token voice
Token ID3809
Feature activation+7.31e-01

Pos logit contributions
 Lena+0.003
Sab+0.002
robat+0.002
 calcul+0.002
rob+0.002
Neg logit contributions
izza-0.005
 Acc-0.004
melon-0.004
 cooled-0.004
icking-0.004
Token saying
Token ID2282
Feature activation+1.09e-01

Pos logit contributions
 Lena+0.016
robat+0.016
Sab+0.015
 calcul+0.015
 student+0.015
Neg logit contributions
melon-0.019
 Acc-0.018
icking-0.018
 cooled-0.018
 calmed-0.018
Token That
Token ID1320
Feature activation+8.15e-01

Pos logit contributions
 Lena+0.003
 calcul+0.003
Sab+0.003
 pupil+0.003
robat+0.003
Neg logit contributions
 cooled-0.006
izza-0.006
melon-0.006
icking-0.006
 Acc-0.006
Token is
Token ID318
Feature activation-3.39e-01

Pos logit contributions
 Lena+0.015
 girl+0.014
robat+0.014
 calcul+0.014
 woman+0.014
Neg logit contributions
izza-0.025
 Acc-0.025
ugg-0.024
uttered-0.024
melon-0.024
Token not
Token ID407
Feature activation+5.56e-02

Pos logit contributions
ugg+0.010
uttered+0.010
uddled+0.010
izza+0.010
 Acc+0.010
Neg logit contributions
 Lena-0.007
 calcul-0.006
 Taylor-0.006
 student-0.006
Grand-0.006
Token nice
Token ID3621
Feature activation-8.28e-02

Pos logit contributions
 Lena+0.001
cest+0.001
 girl+0.001
 princess+0.001
 than+0.001
Neg logit contributions
ugg-0.002
uttered-0.002
�-0.002
 Acc-0.002
uddled-0.002
Token!"
Token ID2474
Feature activation+5.40e-02

Pos logit contributions
 Acc+0.003
izza+0.003
 cooled+0.003
uttered+0.003
ugg+0.003
Neg logit contributions
 Lena-0.001
 girl-0.001
 calcul-0.001
 student-0.001
 after-0.001
Token
Token ID198
Feature activation+4.95e-02

Pos logit contributions
robat+0.002
rob+0.002
cest+0.001
b+0.001
oc+0.001
Neg logit contributions
 Acc-0.001
 wipes-0.001
izza-0.001
 stomach-0.001
 calmed-0.001
Token
Token ID198
Feature activation+3.70e-01

Pos logit contributions
robat+0.001
 calcul+0.001
cest+0.001
rob+0.001
 imagining+0.001
Neg logit contributions
 Acc-0.001
izza-0.001
ugg-0.001
uddled-0.001
uttered-0.001
Token"
Token ID1
Feature activation+5.51e-01

Pos logit contributions
robat+0.008
rob+0.008
 imagining+0.008
 calcul+0.007
cest+0.007
Neg logit contributions
 Acc-0.010
izza-0.010
ugg-0.009
 drank-0.009
uddled-0.009
TokenWe
Token ID1135
Feature activation+3.22e-01

Pos logit contributions
robat+0.012
 Lena+0.012
 imagining+0.012
Sab+0.012
 calcul+0.011
Neg logit contributions
 Acc-0.014
 cooled-0.013
izza-0.013
melon-0.013
uddled-0.013
Token want
Token ID765
Feature activation-1.77e-01

Pos logit contributions
 spy+0.005
 Lena+0.005
robat+0.005
 calcul+0.005
 h+0.005
Neg logit contributions
izza-0.011
 Acc-0.010
melon-0.010
 cinnamon-0.010
uddled-0.010
Token the
Token ID262
Feature activation+2.81e-01

Pos logit contributions
uttered+0.007
icking+0.007
uffed+0.007
ugg+0.007
ushed+0.007
Neg logit contributions
 Lena-0.002
 school-0.002
 after-0.002
 one-0.002
 ones-0.002
Token now
Token ID783
Feature activation-8.75e-02

Pos logit contributions
 Lena+0.012
Sab+0.011
robat+0.011
 calcul+0.011
rob+0.011
Neg logit contributions
izza-0.017
 Acc-0.017
 cooled-0.017
icking-0.016
melon-0.016
Token."
Token ID526
Feature activation+4.02e-02

Pos logit contributions
izza+0.003
 Acc+0.003
melon+0.003
 cooled+0.003
icking+0.003
Neg logit contributions
 Lena-0.002
Sab-0.001
robat-0.001
 calcul-0.001
 imagining-0.001
Token Tim
Token ID5045
Feature activation-7.05e-01

Pos logit contributions
 Lena+0.001
L+0.001
 calcul+0.001
oc+0.001
 imagining+0.001
Neg logit contributions
 cooled-0.001
melon-0.001
izza-0.001
uttered-0.001
icking-0.001
Tokenmy
Token ID1820
Feature activation-8.01e-01

Pos logit contributions
izza+0.027
melon+0.026
 Acc+0.026
 cinnamon+0.025
 cooled+0.025
Neg logit contributions
 Lena-0.009
 girl-0.008
robat-0.008
 pupil-0.007
b-0.007
Token was
Token ID373
Feature activation-1.38e-01

Pos logit contributions
izza+0.025
 Acc+0.024
melon+0.024
 cinnamon+0.023
uned+0.023
Neg logit contributions
 Lena-0.016
 girl-0.016
 woman-0.015
 pupil-0.015
b-0.014
Token sad
Token ID6507
Feature activation-4.37e-01

Pos logit contributions
izza+0.004
 Acc+0.004
melon+0.004
 cooled+0.004
icking+0.004
Neg logit contributions
 Lena-0.003
Sab-0.002
 calcul-0.002
 imagining-0.002
 pupil-0.002
Token,
Token ID11
Feature activation+7.31e-02

Pos logit contributions
izza+0.013
 Acc+0.013
 cooled+0.013
melon+0.013
icking+0.012
Neg logit contributions
 Lena-0.008
Sab-0.008
robat-0.008
 calcul-0.008
rob-0.008
Token but
Token ID475
Feature activation+9.39e-02

Pos logit contributions
 Lena+0.001
Sab+0.001
robat+0.001
 calcul+0.001
rob+0.001
Neg logit contributions
melon-0.003
 cooled-0.003
izza-0.003
 Acc-0.003
icking-0.003
Token he
Token ID339
Feature activation-3.32e-01

Pos logit contributions
 Lena+0.002
Sab+0.001
robat+0.001
 calcul+0.001
rob+0.001
Neg logit contributions
izza-0.003
 Acc-0.003
 cooled-0.003
melon-0.003
icking-0.003
Token got
Token ID1392
Feature activation-3.13e-01

Pos logit contributions
izza+0.010
 Acc+0.010
melon+0.010
 cinnamon+0.009
icks+0.009
Neg logit contributions
 Lena-0.007
 woman-0.006
 girl-0.006
 pupil-0.006
b-0.006
Token up
Token ID510
Feature activation+4.15e-01

Pos logit contributions
izza+0.012
 Acc+0.011
 cinnamon+0.011
melon+0.011
acket+0.011
Neg logit contributions
 Lena-0.005
 girl-0.004
 after-0.004
b-0.004
 pupil-0.004
Token"
Token ID1
Feature activation+4.44e-01

Pos logit contributions
 Lena+0.016
robat+0.015
 calcul+0.015
rob+0.015
Sab+0.015
Neg logit contributions
izza-0.031
 Acc-0.030
melon-0.029
 cooled-0.029
ugg-0.028
TokenHello
Token ID15496
Feature activation-2.51e-01

Pos logit contributions
 Lena+0.009
Sab+0.008
robat+0.008
 calcul+0.008
rob+0.007
Neg logit contributions
izza-0.013
 Acc-0.013
 cooled-0.013
melon-0.013
icking-0.012
Token,
Token ID11
Feature activation-2.42e-01

Pos logit contributions
 Acc+0.008
izza+0.008
 cooled+0.008
melon+0.008
ugg+0.008
Neg logit contributions
 Lena-0.004
 girl-0.004
Sab-0.004
 woman-0.004
 witch-0.004
Token Mr
Token ID1770
Feature activation-7.66e-01

Pos logit contributions
uddled+0.008
 cooled+0.008
uttered+0.008
izza+0.008
 Acc+0.008
Neg logit contributions
 Lena-0.004
 girl-0.004
 woman-0.004
 witch-0.004
 princess-0.004
Token.
Token ID13
Feature activation-2.23e-01

Pos logit contributions
 cooled+0.025
 Acc+0.025
melon+0.025
izza+0.025
icking+0.024
Neg logit contributions
 Lena-0.012
 girl-0.011
Sab-0.011
 woman-0.011
robat-0.011
Token Snow
Token ID7967
Feature activation+1.43e-02

Pos logit contributions
izza+0.007
 cooled+0.007
 Acc+0.007
uddled+0.007
melon+0.006
Neg logit contributions
 Lena-0.004
Sab-0.004
 pupil-0.004
 girl-0.004
robat-0.004
Tokenman
Token ID805
Feature activation-4.79e-02

Pos logit contributions
 Lena+0.000
 girl+0.000
robat+0.000
 witch+0.000
 woman+0.000
Neg logit contributions
izza-0.000
 Acc-0.000
 cooled-0.000
melon-0.000
 cinnamon-0.000
Token!"
Token ID2474
Feature activation-1.54e-02

Pos logit contributions
izza+0.002
 Acc+0.002
uttered+0.001
melon+0.001
 cooled+0.001
Neg logit contributions
 Lena-0.001
 girl-0.001
robat-0.001
 witch-0.001
cest-0.001
Token Sara
Token ID24799
Feature activation+1.81e-01

Pos logit contributions
izza+0.000
melon+0.000
 cooled+0.000
 Acc+0.000
 drank+0.000
Neg logit contributions
robat-0.000
 calcul-0.000
cest-0.000
 Lena-0.000
Sab-0.000
Token said
Token ID531
Feature activation-6.44e-02

Pos logit contributions
 Lena+0.002
 girl+0.002
 woman+0.002
Sab+0.002
 Taylor+0.002
Neg logit contributions
izza-0.007
 Acc-0.007
melon-0.007
 cooled-0.007
ugg-0.007
Token.
Token ID13
Feature activation+2.59e-01

Pos logit contributions
izza+0.003
 cinnamon+0.003
acket+0.003
 cooled+0.003
 Acc+0.003
Neg logit contributions
 Lena-0.001
 after-0.001
 woman-0.000
 imagining-0.000
 her-0.000
Token day
Token ID1110
Feature activation-3.95e-01

Pos logit contributions
melon+0.000
 cooled+0.000
izza+0.000
 Acc+0.000
icking+0.000
Neg logit contributions
 Lena-0.000
Sab-0.000
robat-0.000
 calcul-0.000
rob-0.000
Token.
Token ID13
Feature activation-1.53e-01

Pos logit contributions
izza+0.012
melon+0.011
 cooled+0.011
 Acc+0.011
icking+0.011
Neg logit contributions
 Lena-0.008
Sab-0.007
robat-0.007
 calcul-0.007
rob-0.007
Token Every
Token ID3887
Feature activation+6.99e-02

Pos logit contributions
izza+0.004
 cooled+0.004
melon+0.004
 Acc+0.004
icking+0.004
Neg logit contributions
 Lena-0.003
Sab-0.003
 calcul-0.003
robat-0.003
 pupil-0.003
Token day
Token ID1110
Feature activation-4.26e-01

Pos logit contributions
 Lena+0.000
 student+0.000
 girl+0.000
 pupil+0.000
 woman+0.000
Neg logit contributions
izza-0.003
 Acc-0.003
icking-0.003
uddled-0.003
melon-0.003
Token,
Token ID11
Feature activation+3.00e-02

Pos logit contributions
 Acc+0.015
izza+0.015
melon+0.015
uddled+0.014
uttered+0.014
Neg logit contributions
 Lena-0.007
 calcul-0.006
 girl-0.006
 after-0.005
Sab-0.005
Token people
Token ID661
Feature activation+7.73e-02

Pos logit contributions
 Lena+0.000
Sab+0.000
 calcul+0.000
robat+0.000
 pupil+0.000
Neg logit contributions
izza-0.001
 Acc-0.001
 cooled-0.001
melon-0.001
icking-0.001
Token passed
Token ID3804
Feature activation+8.57e-02

Pos logit contributions
 girl+0.001
 student+0.001
b+0.001
 though+0.001
 after+0.001
Neg logit contributions
izza-0.003
 Acc-0.002
 cinnamon-0.002
uttered-0.002
acket-0.002
Token by
Token ID416
Feature activation-5.35e-01

Pos logit contributions
 Lena+0.001
Sab+0.001
robat+0.001
 calcul+0.001
rob+0.001
Neg logit contributions
izza-0.003
 Acc-0.003
 cooled-0.003
melon-0.003
icking-0.003
Token his
Token ID465
Feature activation+1.70e-01

Pos logit contributions
ugg+0.020
izza+0.020
melon+0.020
 cooled+0.020
uttered+0.020
Neg logit contributions
 Lena-0.007
 places-0.005
 imagining-0.005
 figures-0.005
 student-0.005
Token shop
Token ID6128
Feature activation+2.13e-01

Pos logit contributions
 Lena+0.003
Sab+0.003
robat+0.003
 calcul+0.003
rob+0.003
Neg logit contributions
izza-0.005
 Acc-0.005
 cooled-0.005
melon-0.005
icking-0.005
Token and
Token ID290
Feature activation+3.76e-01

Pos logit contributions
 Lena+0.003
 student+0.003
 woman+0.003
 pupil+0.003
 girl+0.003
Neg logit contributions
 Acc-0.008
izza-0.007
 cooled-0.007
 cinnamon-0.007
melon-0.007
Token,
Token ID11
Feature activation-1.37e-01

Pos logit contributions
acket+0.004
icking+0.004
izza+0.004
uddled+0.004
 Acc+0.004
Neg logit contributions
 Lena-0.001
 after-0.001
 girl-0.001
 Mir-0.001
Sab-0.001
Token "
Token ID366
Feature activation-7.41e-01

Pos logit contributions
izza+0.005
melon+0.004
 Acc+0.004
icking+0.004
 cooled+0.004
Neg logit contributions
 Lena-0.002
Sab-0.002
 woman-0.002
Grand-0.002
 imagining-0.002
TokenMom
Token ID29252
Feature activation+2.89e-01

Pos logit contributions
 cooled+0.020
izza+0.020
 Acc+0.020
 cinnamon+0.019
ugg+0.019
Neg logit contributions
 Lena-0.016
 figures-0.015
robat-0.015
 imagining-0.014
 calcul-0.014
Tokenmy
Token ID1820
Feature activation-2.06e-01

Pos logit contributions
 Lena+0.004
 girl+0.003
 Taylor+0.003
 imagining+0.002
Grand+0.002
Neg logit contributions
uddled-0.011
uttered-0.011
 Acc-0.011
melon-0.011
izza-0.011
Token,
Token ID11
Feature activation-1.35e-01

Pos logit contributions
uttered+0.009
uddled+0.009
add+0.009
utt+0.009
 Acc+0.009
Neg logit contributions
 girl-0.001
 Lena-0.001
 after-0.001
 Taylor-0.001
 witch-0.001
Token are
Token ID389
Feature activation-1.21e+00

Pos logit contributions
uddled+0.004
uttered+0.004
atted+0.004
acket+0.004
uffed+0.004
Neg logit contributions
 Lena-0.003
 girl-0.002
 princess-0.002
 after-0.002
 calcul-0.002
Token they
Token ID484
Feature activation+2.58e-02

Pos logit contributions
izza+0.038
 Acc+0.038
melon+0.038
 cooled+0.038
icking+0.036
Neg logit contributions
 Lena-0.021
Sab-0.019
 calcul-0.018
robat-0.018
 pupil-0.017
Token done
Token ID1760
Feature activation-1.18e-01

Pos logit contributions
 Lena+0.000
 girl+0.000
 imagining+0.000
 pupil+0.000
 calcul+0.000
Neg logit contributions
 Acc-0.001
izza-0.001
melon-0.001
atted-0.001
uttered-0.001
Token baking
Token ID16871
Feature activation+3.30e-01

Pos logit contributions
izza+0.003
 Acc+0.003
melon+0.003
uttered+0.003
 cooled+0.003
Neg logit contributions
 Lena-0.002
 imagining-0.002
 calcul-0.002
Sab-0.002
robat-0.002
Token yet
Token ID1865
Feature activation+4.59e-02

Pos logit contributions
 Lena+0.006
Sab+0.006
robat+0.006
 calcul+0.006
 pupil+0.006
Neg logit contributions
izza-0.010
 Acc-0.009
 cooled-0.009
melon-0.009
icking-0.009
Token?"
Token ID1701
Feature activation+7.25e-02

Pos logit contributions
 Lena+0.001
 after+0.001
Grand+0.001
 imagining+0.001
 calcul+0.001
Neg logit contributions
uttered-0.002
atted-0.002
 Acc-0.002
melon-0.002
izza-0.002
TokenWelcome
Token ID14618
Feature activation-1.41e-01

Pos logit contributions
 cooled+0.016
ugg+0.016
 drank+0.015
 cinnamon+0.015
izza+0.015
Neg logit contributions
 Lena-0.016
 imagining-0.016
 figures-0.015
robat-0.015
cest-0.014
Token to
Token ID284
Feature activation+3.50e-01

Pos logit contributions
 Acc+0.005
izza+0.005
melon+0.004
 cooled+0.004
 cinnamon+0.004
Neg logit contributions
 Lena-0.002
 girl-0.002
 pupil-0.002
 witch-0.002
Sab-0.002
Token our
Token ID674
Feature activation-4.99e-01

Pos logit contributions
 Lena+0.005
Sab+0.004
 calcul+0.004
 imagining+0.004
 student+0.004
Neg logit contributions
 Acc-0.012
izza-0.012
uttered-0.012
uddled-0.012
 cooled-0.012
Token restaurant
Token ID7072
Feature activation+1.00e+00

Pos logit contributions
izza+0.016
 Acc+0.016
 cooled+0.016
melon+0.016
uttered+0.016
Neg logit contributions
 Lena-0.008
 calcul-0.007
 girl-0.007
 pupil-0.007
 student-0.007
Token.
Token ID13
Feature activation+2.30e-02

Pos logit contributions
 Lena+0.016
 student+0.014
 girl+0.014
robat+0.014
 pupil+0.014
Neg logit contributions
 Acc-0.034
izza-0.033
uttered-0.032
 cooled-0.032
melon-0.032
Token What
Token ID1867
Feature activation-1.18e+00

Pos logit contributions
 Lena+0.000
 calcul+0.000
Sab+0.000
 Taylor+0.000
Grand+0.000
Neg logit contributions
uddled-0.001
 cooled-0.001
icking-0.001
melon-0.001
atted-0.001
Token would
Token ID561
Feature activation-8.29e-01

Pos logit contributions
melon+0.047
ishly+0.046
ugg+0.045
atted+0.045
uffed+0.043
Neg logit contributions
 student-0.016
 girl-0.016
 princess-0.015
 Lena-0.015
 calcul-0.015
Token you
Token ID345
Feature activation-2.33e-01

Pos logit contributions
izza+0.025
 Acc+0.025
melon+0.025
 cooled+0.025
icking+0.024
Neg logit contributions
 Lena-0.015
Sab-0.014
robat-0.014
 calcul-0.014
rob-0.013
Token like
Token ID588
Feature activation-6.93e-02

Pos logit contributions
izza+0.007
 Acc+0.007
 cooled+0.007
melon+0.007
icking+0.007
Neg logit contributions
 Lena-0.004
Sab-0.004
robat-0.004
 calcul-0.004
rob-0.004
Token to
Token ID284
Feature activation+4.08e-01

Pos logit contributions
izza+0.002
 Acc+0.002
 cooled+0.002
melon+0.002
icking+0.002
Neg logit contributions
 Lena-0.001
Sab-0.001
robat-0.001
 calcul-0.001
rob-0.001
Token order
Token ID1502
Feature activation+2.87e-01

Pos logit contributions
 Lena+0.010
Sab+0.009
 calcul+0.009
robat+0.009
 imagining+0.009
Neg logit contributions
izza-0.010
 Acc-0.010
melon-0.010
 cooled-0.010
icking-0.009
Token two
Token ID734
Feature activation-6.73e-01

Pos logit contributions
izza+0.012
 Acc+0.011
melon+0.011
 cooled+0.011
icking+0.011
Neg logit contributions
 Lena-0.007
Sab-0.006
robat-0.006
 calcul-0.006
rob-0.006
Token dogs
Token ID6844
Feature activation-2.25e-01

Pos logit contributions
 Acc+0.024
izza+0.024
ugg+0.024
uttered+0.023
 cooled+0.023
Neg logit contributions
 Lena-0.009
 girl-0.009
 calcul-0.009
 figures-0.009
 princess-0.008
Token fighting
Token ID4330
Feature activation-3.98e-03

Pos logit contributions
izza+0.010
 Acc+0.010
 farewell+0.009
 cinnamon+0.009
OP+0.009
Neg logit contributions
 after-0.002
 imagining-0.002
 pretending-0.002
 hall-0.002
 rac-0.002
Token outside
Token ID2354
Feature activation+1.56e-01

Pos logit contributions
ugg+0.000
 Acc+0.000
izza+0.000
ete+0.000
acket+0.000
Neg logit contributions
 after-0.000
 than-0.000
 Lena-0.000
 her-0.000
 though-0.000
Token.
Token ID13
Feature activation-2.28e-01

Pos logit contributions
 Lena+0.002
 after+0.002
 her+0.001
 hall+0.001
 though+0.001
Neg logit contributions
izza-0.006
acket-0.006
ugg-0.006
 Acc-0.006
 cooled-0.006
Token Mitt
Token ID16627
Feature activation-1.35e+00

Pos logit contributions
izza+0.006
melon+0.006
 cooled+0.006
 Acc+0.006
icking+0.005
Neg logit contributions
 Lena-0.005
robat-0.005
Sab-0.005
 calcul-0.005
rob-0.005
Tokenens
Token ID641
Feature activation-1.66e-01

Pos logit contributions
izza+0.035
 cooled+0.034
 Acc+0.034
melon+0.034
icking+0.033
Neg logit contributions
 Lena-0.031
Sab-0.029
 calcul-0.029
robat-0.028
 student-0.027
Token didn
Token ID1422
Feature activation+2.70e-01

Pos logit contributions
izza+0.006
 Acc+0.006
ugg+0.006
uned+0.006
 cinnamon+0.006
Neg logit contributions
 girl-0.002
 Lena-0.002
 man-0.002
 woman-0.002
 pupil-0.002
Token't
Token ID470
Feature activation-9.63e-02

Pos logit contributions
 Lena+0.005
Sab+0.005
robat+0.005
 calcul+0.005
rob+0.005
Neg logit contributions
izza-0.008
 Acc-0.008
melon-0.008
 cooled-0.008
icking-0.007
Token like
Token ID588
Feature activation+4.24e-01

Pos logit contributions
 drank+0.002
 cooled+0.002
melon+0.002
 forgive+0.002
 calmed+0.002
Neg logit contributions
rob-0.002
Sab-0.002
 Lena-0.002
robat-0.002
Mel-0.002
Token fighting
Token ID4330
Feature activation+8.25e-02

Pos logit contributions
 Lena+0.006
 imagining+0.005
 places+0.005
 calcul+0.005
 after+0.005
Neg logit contributions
izza-0.016
uttered-0.016
ugg-0.016
icking-0.016
uddled-0.015
Token so
Token ID523
Feature activation-2.92e-01

Pos logit contributions
 Lena+0.003
Sab+0.002
 calcul+0.002
 imagining+0.002
 woman+0.002
Neg logit contributions
izza-0.013
 Acc-0.013
 cooled-0.013
melon-0.013
ugg-0.013
Token he
Token ID339
Feature activation-7.57e-01

Pos logit contributions
izza+0.009
 cooled+0.009
melon+0.009
 Acc+0.009
icking+0.008
Neg logit contributions
 Lena-0.005
robat-0.005
Sab-0.005
 calcul-0.005
rob-0.005
Token requested
Token ID9167
Feature activation+3.47e-01

Pos logit contributions
 Acc+0.026
izza+0.025
icking+0.023
ugg+0.023
melon+0.022
Neg logit contributions
 Lena-0.014
b-0.013
 girl-0.013
 princess-0.013
 imagined-0.013
Token,
Token ID11
Feature activation-5.83e-02

Pos logit contributions
 Lena+0.004
 after+0.004
 her+0.003
 one+0.003
b+0.003
Neg logit contributions
acket-0.014
 Acc-0.014
izza-0.014
ugg-0.014
melon-0.013
Token "
Token ID366
Feature activation-6.84e-01

Pos logit contributions
izza+0.002
uddled+0.002
ugg+0.002
 Acc+0.002
 cooled+0.002
Neg logit contributions
 Lena-0.001
 woman-0.001
 imagining-0.001
 Mir-0.001
 after-0.001
TokenBird
Token ID42562
Feature activation-1.61e+00

Pos logit contributions
izza+0.020
 cooled+0.019
melon+0.019
 Acc+0.019
icking+0.018
Neg logit contributions
 Lena-0.014
Sab-0.013
 calcul-0.012
robat-0.012
 imagining-0.012
Token,
Token ID11
Feature activation+2.12e-02

Pos logit contributions
izza+0.056
 Acc+0.054
 cooled+0.054
uddled+0.053
 apologised+0.053
Neg logit contributions
 Lena-0.023
 girl-0.021
Sab-0.021
b-0.021
 witch-0.021
Token come
Token ID1282
Feature activation+4.43e-01

Pos logit contributions
 Lena+0.000
 girl+0.000
 witch+0.000
 princess+0.000
Sab+0.000
Neg logit contributions
uddled-0.001
uttered-0.001
ugg-0.001
atted-0.001
izza-0.001
Token play
Token ID711
Feature activation+5.99e-02

Pos logit contributions
 Lena+0.008
robat+0.007
Sab+0.007
 calcul+0.007
rob+0.007
Neg logit contributions
izza-0.014
 Acc-0.013
melon-0.013
 cooled-0.013
icking-0.013
Token with
Token ID351
Feature activation-7.01e-02

Pos logit contributions
 Lena+0.001
robat+0.001
Sab+0.001
 calcul+0.001
rob+0.001
Neg logit contributions
izza-0.002
 Acc-0.002
 cooled-0.002
melon-0.002
icking-0.002
Token me
Token ID502
Feature activation+3.11e-01

Pos logit contributions
izza+0.003
 Acc+0.003
 cooled+0.003
melon+0.003
icking+0.003
Neg logit contributions
 Lena-0.001
Sab-0.001
robat-0.001
 witch-0.001
 places-0.001
Token know
Token ID760
Feature activation-3.75e-01

Pos logit contributions
 Acc+0.000
 cooled+0.000
izza+0.000
melon+0.000
 drank+0.000
Neg logit contributions
 Lena-0.000
rob-0.000
robat-0.000
Sab-0.000
 calcul-0.000
Token what
Token ID644
Feature activation-6.81e-02

Pos logit contributions
izza+0.012
acket+0.012
ugg+0.012
 cooled+0.011
melon+0.011
Neg logit contributions
 Lena-0.008
 imagining-0.007
Sab-0.007
 though-0.007
 princess-0.007
Token to
Token ID284
Feature activation+3.01e-01

Pos logit contributions
izza+0.002
melon+0.002
 cooled+0.002
icking+0.002
 Acc+0.002
Neg logit contributions
 Lena-0.001
Sab-0.001
 calcul-0.001
 pupil-0.001
 student-0.001
Token do
Token ID466
Feature activation+5.99e-01

Pos logit contributions
Sab+0.010
rob+0.009
oc+0.009
 Africa+0.009
 Mir+0.009
Neg logit contributions
icking-0.005
 forgive-0.005
 drank-0.005
melon-0.004
izza-0.004
Token.
Token ID13
Feature activation-1.17e-01

Pos logit contributions
Sab+0.017
rob+0.016
robat+0.015
 Lena+0.015
 Africa+0.015
Neg logit contributions
icking-0.012
 Acc-0.012
melon-0.012
 cooled-0.012
izza-0.012
Token Mitt
Token ID16627
Feature activation-1.32e+00

Pos logit contributions
izza+0.004
melon+0.004
 cooled+0.004
 Acc+0.004
icking+0.004
Neg logit contributions
 Lena-0.002
Sab-0.002
 calcul-0.002
 pupil-0.002
 girl-0.002
Tokenens
Token ID641
Feature activation+2.65e-02

Pos logit contributions
izza+0.045
melon+0.044
 Acc+0.043
 cooled+0.041
uttered+0.041
Neg logit contributions
 Lena-0.022
 girl-0.021
robat-0.020
 witch-0.019
rob-0.018
Token saw
Token ID2497
Feature activation-5.35e-01

Pos logit contributions
 girl+0.000
 woman+0.000
 Lena+0.000
 witch+0.000
 pupil+0.000
Neg logit contributions
izza-0.001
 Acc-0.001
icks-0.001
 cinnamon-0.001
uttered-0.001
Token that
Token ID326
Feature activation-3.85e-01

Pos logit contributions
izza+0.019
 Acc+0.018
 cooled+0.018
melon+0.017
icking+0.017
Neg logit contributions
 Lena-0.009
Sab-0.007
 girl-0.007
 imagining-0.007
 woman-0.007
Token Lucy
Token ID22162
Feature activation-3.05e-01

Pos logit contributions
izza+0.013
 Acc+0.013
 cooled+0.013
melon+0.013
icking+0.012
Neg logit contributions
 Lena-0.006
Sab-0.005
 calcul-0.005
robat-0.005
 pupil-0.005
Token was
Token ID373
Feature activation-3.50e-02

Pos logit contributions
izza+0.013
ugg+0.012
 Enjoy+0.012
acket+0.012
uned+0.012
Neg logit contributions
 girl-0.003
 she-0.003
 Lena-0.003
 figures-0.003
 pupil-0.003
Token He
Token ID679
Feature activation-2.67e-01

Pos logit contributions
izza+0.007
 cooled+0.007
melon+0.007
 Acc+0.006
icking+0.006
Neg logit contributions
 Lena-0.005
Sab-0.004
 calcul-0.004
robat-0.004
 pupil-0.004
Token took
Token ID1718
Feature activation+3.25e-03

Pos logit contributions
 Acc+0.007
izza+0.007
melon+0.007
 cooled+0.007
icking+0.007
Neg logit contributions
 Lena-0.006
Sab-0.006
 calcul-0.006
robat-0.006
 pupil-0.005
Token out
Token ID503
Feature activation-3.22e-01

Pos logit contributions
robat+0.000
rob+0.000
Sab+0.000
 Lena+0.000
 calcul+0.000
Neg logit contributions
izza-0.000
 cooled-0.000
icked-0.000
 farewell-0.000
uttered-0.000
Token a
Token ID257
Feature activation+3.45e-01

Pos logit contributions
izza+0.010
 Acc+0.010
melon+0.010
 cooled+0.010
icking+0.010
Neg logit contributions
 Lena-0.005
Sab-0.005
 calcul-0.005
robat-0.005
 pupil-0.005
Token magn
Token ID7842
Feature activation+1.56e-01

Pos logit contributions
 Lena+0.007
Sab+0.006
robat+0.006
 calcul+0.006
rob+0.006
Neg logit contributions
izza-0.010
 Acc-0.010
 cooled-0.010
melon-0.010
icking-0.009
Tokenifying
Token ID4035
Feature activation-3.98e+00

Pos logit contributions
 Lena+0.005
Sab+0.005
robat+0.005
 calcul+0.004
 pupil+0.004
Neg logit contributions
izza-0.003
 Acc-0.003
 cooled-0.003
melon-0.003
icking-0.002
Token glass
Token ID5405
Feature activation-4.17e-01

Pos logit contributions
izza+0.081
 Acc+0.079
 cooled+0.077
melon+0.076
icking+0.075
Neg logit contributions
 Lena-0.114
Sab-0.108
 calcul-0.106
robat-0.106
 pupil-0.103
Token and
Token ID290
Feature activation+1.82e-01

Pos logit contributions
 Acc+0.016
izza+0.015
 cooled+0.014
icking+0.014
uttered+0.014
Neg logit contributions
 Lena-0.006
robat-0.005
 girl-0.005
 calcul-0.005
 woman-0.005
Token a
Token ID257
Feature activation-2.75e-01

Pos logit contributions
 Lena+0.004
 calcul+0.004
Sab+0.004
 imagining+0.003
 woman+0.003
Neg logit contributions
izza-0.005
 Acc-0.005
melon-0.005
icking-0.005
 cooled-0.005
Token notebook
Token ID20922
Feature activation+4.16e-01

Pos logit contributions
icking+0.009
 Acc+0.009
izza+0.009
melon+0.008
uddled+0.008
Neg logit contributions
 Lena-0.005
 girl-0.005
 woman-0.005
 pupil-0.005
 calcul-0.005
Token and
Token ID290
Feature activation+8.91e-02

Pos logit contributions
 Lena+0.007
robat+0.006
Sab+0.006
 calcul+0.006
 pupil+0.006
Neg logit contributions
izza-0.013
 Acc-0.013
 cooled-0.013
melon-0.013
icking-0.012
Token.
Token ID13
Feature activation-2.54e-02

Pos logit contributions
izza+0.002
 Acc+0.002
 cooled+0.002
melon+0.002
icking+0.002
Neg logit contributions
 Lena-0.001
Sab-0.001
robat-0.001
 calcul-0.001
rob-0.001
Token They
Token ID1119
Feature activation-1.96e-02

Pos logit contributions
izza+0.001
 Acc+0.001
 cooled+0.001
melon+0.001
icking+0.001
Neg logit contributions
 Lena-0.001
Sab-0.000
 calcul-0.000
robat-0.000
 girl-0.000
Token had
Token ID550
Feature activation-9.88e-02

Pos logit contributions
izza+0.001
 Acc+0.001
melon+0.001
uddled+0.001
icking+0.001
Neg logit contributions
 Lena-0.000
Sab-0.000
 calcul-0.000
 pupil-0.000
 woman-0.000
Token a
Token ID257
Feature activation+3.79e-01

Pos logit contributions
 Acc+0.004
izza+0.004
melon+0.004
icking+0.004
ugg+0.004
Neg logit contributions
 Lena-0.001
 imagining-0.001
 places-0.001
Sab-0.001
 figures-0.001
Token magn
Token ID7842
Feature activation+2.60e-01

Pos logit contributions
 girl+0.007
 Lena+0.006
 pupil+0.006
 woman+0.006
 student+0.006
Neg logit contributions
izza-0.012
icking-0.012
 Acc-0.012
icked-0.012
atted-0.012
Tokenifying
Token ID4035
Feature activation-3.36e+00

Pos logit contributions
 Lena+0.007
Sab+0.007
robat+0.007
 pupil+0.007
 calcul+0.007
Neg logit contributions
 Acc-0.006
izza-0.006
 cooled-0.005
melon-0.005
 cinnamon-0.005
Token glass
Token ID5405
Feature activation-1.94e-01

Pos logit contributions
izza+0.070
 Acc+0.068
icking+0.066
 cooled+0.065
uddled+0.064
Neg logit contributions
 Lena-0.094
Sab-0.088
 calcul-0.088
robat-0.088
 speaker-0.085
Token and
Token ID290
Feature activation+3.03e-01

Pos logit contributions
 Acc+0.007
izza+0.007
 cooled+0.006
 apologised+0.006
acket+0.006
Neg logit contributions
 Lena-0.003
 girl-0.003
 pupil-0.003
 student-0.003
 woman-0.003
Token a
Token ID257
Feature activation+9.11e-02

Pos logit contributions
 Lena+0.004
 calcul+0.004
 imagined+0.004
 figures+0.004
 imagining+0.003
Neg logit contributions
 Acc-0.011
icking-0.011
melon-0.011
ugg-0.011
izza-0.011
Token small
Token ID1402
Feature activation+3.97e-01

Pos logit contributions
 girl+0.002
 woman+0.002
 pupil+0.002
 witch+0.002
 Lena+0.002
Neg logit contributions
icking-0.003
 Acc-0.003
atted-0.003
uddled-0.003
izza-0.003
Token box
Token ID3091
Feature activation-1.10e+00

Pos logit contributions
 girl+0.008
 pupil+0.007
 woman+0.007
 witch+0.007
 princess+0.007
Neg logit contributions
uddled-0.013
 Acc-0.013
icking-0.013
izza-0.013
icked-0.013
Token\n
Token ID198
Feature activation+3.50e-01

Pos logit contributions
 calcul+0.000
robat+0.000
cest+0.000
 imagining+0.000
 pupil+0.000
Neg logit contributions
izza-0.000
 Acc-0.000
 cooled-0.000
uttered-0.000
 apologised-0.000
TokenHe
Token ID1544
Feature activation-3.54e-01

Pos logit contributions
 imagining+0.007
robat+0.007
rob+0.007
 calcul+0.007
cest+0.007
Neg logit contributions
izza-0.009
 Acc-0.009
 apologised-0.009
OP-0.008
 farewell-0.008
Token pointed
Token ID6235
Feature activation-5.10e-01

Pos logit contributions
 Acc+0.010
izza+0.010
melon+0.010
icking+0.009
ugg+0.009
Neg logit contributions
 Lena-0.008
Sab-0.007
 calcul-0.007
Grand-0.007
 woman-0.007
Token his
Token ID465
Feature activation-4.01e-02

Pos logit contributions
izza+0.017
 Acc+0.017
 cooled+0.016
melon+0.016
icking+0.016
Neg logit contributions
 Lena-0.008
Sab-0.007
 calcul-0.007
robat-0.007
 pupil-0.007
Token magn
Token ID7842
Feature activation+4.54e-01

Pos logit contributions
izza+0.001
 Acc+0.001
melon+0.001
 cooled+0.001
icking+0.001
Neg logit contributions
 Lena-0.001
Sab-0.001
robat-0.001
 calcul-0.001
rob-0.001
Tokenifying
Token ID4035
Feature activation-3.70e+00

Pos logit contributions
 Lena+0.012
Sab+0.012
 pupil+0.011
 calcul+0.011
robat+0.011
Neg logit contributions
 Acc-0.011
izza-0.010
 cooled-0.010
melon-0.009
 cinnamon-0.009
Token glass
Token ID5405
Feature activation-1.96e-01

Pos logit contributions
izza+0.074
 Acc+0.073
 cooled+0.071
melon+0.070
icking+0.069
Neg logit contributions
 Lena-0.107
Sab-0.101
 calcul-0.100
robat-0.099
 pupil-0.096
Token at
Token ID379
Feature activation+1.58e-01

Pos logit contributions
 Acc+0.008
izza+0.007
 Enjoy+0.007
ugg+0.007
acket+0.007
Neg logit contributions
 Lena-0.002
 girl-0.002
 figures-0.002
 woman-0.002
 calcul-0.002
Token different
Token ID1180
Feature activation+1.72e-01

Pos logit contributions
 Lena+0.002
Sab+0.002
robat+0.002
 calcul+0.002
 pupil+0.002
Neg logit contributions
izza-0.005
 Acc-0.005
 cooled-0.005
melon-0.005
icking-0.005
Token classmates
Token ID28999
Feature activation+1.25e+00

Pos logit contributions
 Lena+0.003
Sab+0.003
robat+0.003
rob+0.003
 calcul+0.003
Neg logit contributions
izza-0.005
 Acc-0.005
melon-0.005
 cooled-0.005
icking-0.005
Token and
Token ID290
Feature activation+1.74e-01

Pos logit contributions
 Lena+0.019
 calcul+0.016
 imagining+0.016
robat+0.016
Sab+0.015
Neg logit contributions
 Acc-0.043
 cooled-0.042
izza-0.042
melon-0.042
ugg-0.041
Token borrowed
Token ID22546
Feature activation-3.65e-01

Pos logit contributions
izza+0.018
icking+0.018
melon+0.017
acket+0.017
 Acc+0.017
Neg logit contributions
b-0.008
Sab-0.008
Grand-0.008
 girl-0.008
 Lena-0.008
Token her
Token ID607
Feature activation+3.27e-02

Pos logit contributions
izza+0.012
 Acc+0.012
 cooled+0.012
melon+0.012
icking+0.011
Neg logit contributions
 Lena-0.006
Sab-0.006
 calcul-0.005
robat-0.005
 pupil-0.005
Token mom
Token ID1995
Feature activation+2.21e-01

Pos logit contributions
 Lena+0.001
Sab+0.001
 calcul+0.001
 pupil+0.001
 imagining+0.001
Neg logit contributions
izza-0.001
 Acc-0.001
melon-0.001
 cooled-0.001
ugg-0.001
Token's
Token ID338
Feature activation+3.72e-01

Pos logit contributions
 Lena+0.003
Sab+0.003
robat+0.003
 calcul+0.003
 girl+0.003
Neg logit contributions
izza-0.008
 Acc-0.007
 cooled-0.007
melon-0.007
uddled-0.007
Token magn
Token ID7842
Feature activation+3.04e-01

Pos logit contributions
 pupil+0.006
 girl+0.006
 figures+0.006
 calcul+0.006
 woman+0.006
Neg logit contributions
 cooled-0.012
izza-0.012
 Acc-0.012
melon-0.012
acket-0.012
Tokenifying
Token ID4035
Feature activation-3.50e+00

Pos logit contributions
Mel+0.006
 Lena+0.006
Sab+0.006
 pupil+0.006
b+0.006
Neg logit contributions
 Acc-0.009
izza-0.008
 cooled-0.008
 cinnamon-0.007
melon-0.007
Token glass
Token ID5405
Feature activation-3.55e-01

Pos logit contributions
izza+0.073
 Acc+0.072
icking+0.070
uddled+0.069
 cooled+0.068
Neg logit contributions
 Lena-0.098
Sab-0.092
 calcul-0.092
 girl-0.089
 witch-0.088
Token to
Token ID284
Feature activation+7.31e-01

Pos logit contributions
 Acc+0.015
izza+0.014
acket+0.014
 cooled+0.013
 Enjoy+0.013
Neg logit contributions
 figures-0.004
 girl-0.004
 Lena-0.003
 than-0.003
 pupil-0.003
Token take
Token ID1011
Feature activation+7.38e-01

Pos logit contributions
 Lena+0.014
Sab+0.013
 calcul+0.012
 witch+0.012
robat+0.012
Neg logit contributions
izza-0.023
 Acc-0.022
uddled-0.022
melon-0.021
 cooled-0.021
Token a
Token ID257
Feature activation+3.99e-01

Pos logit contributions
 Lena+0.007
Sab+0.007
 after+0.006
 calcul+0.006
 girl+0.006
Neg logit contributions
izza-0.029
 Acc-0.029
uddled-0.029
melon-0.029
 cooled-0.028
Token closer
Token ID5699
Feature activation+1.81e-01

Pos logit contributions
 Lena+0.010
Sab+0.009
robat+0.009
 calcul+0.009
 pupil+0.009
Neg logit contributions
izza-0.010
 Acc-0.010
melon-0.010
 cooled-0.010
icking-0.009
Token said
Token ID531
Feature activation-2.56e-01

Pos logit contributions
robat+0.005
rob+0.005
Sab+0.005
cest+0.005
 Lena+0.005
Neg logit contributions
izza-0.006
 drank-0.006
 cooled-0.006
 Acc-0.006
 apologised-0.006
Token,
Token ID11
Feature activation+5.09e-01

Pos logit contributions
izza+0.009
melon+0.009
 cooled+0.009
 cinnamon+0.009
acket+0.009
Neg logit contributions
 Lena-0.004
 pupil-0.003
 girl-0.003
 after-0.003
 imagining-0.003
Token holding
Token ID4769
Feature activation+1.76e-01

Pos logit contributions
 Lena+0.008
 imagining+0.008
 woman+0.008
 calcul+0.007
 speaker+0.007
Neg logit contributions
izza-0.018
ugg-0.016
 Acc-0.016
 cooled-0.016
icking-0.016
Token the
Token ID262
Feature activation+4.00e-02

Pos logit contributions
 Lena+0.003
 calcul+0.003
Sab+0.003
 pupil+0.003
robat+0.003
Neg logit contributions
izza-0.006
 Acc-0.006
 cooled-0.005
melon-0.005
ugg-0.005
Token magn
Token ID7842
Feature activation+2.89e-01

Pos logit contributions
 Lena+0.001
 girl+0.001
robat+0.001
 calcul+0.001
 woman+0.001
Neg logit contributions
 Acc-0.001
izza-0.001
 cooled-0.001
ugg-0.001
uddled-0.001
Tokenifying
Token ID4035
Feature activation-3.99e+00

Pos logit contributions
 Lena+0.008
Sab+0.008
robat+0.008
 pupil+0.007
 calcul+0.007
Neg logit contributions
 Acc-0.007
izza-0.006
 cooled-0.006
melon-0.006
uddled-0.005
Token glass
Token ID5405
Feature activation-2.26e-01

Pos logit contributions
izza+0.068
 Acc+0.065
melon+0.064
 cooled+0.064
icking+0.060
Neg logit contributions
 Lena-0.127
Sab-0.122
robat-0.121
 calcul-0.120
rob-0.117
Token over
Token ID625
Feature activation+1.72e-02

Pos logit contributions
 Acc+0.009
izza+0.009
acket+0.009
uddled+0.008
ugg+0.008
Neg logit contributions
 girl-0.002
 Lena-0.002
 after-0.002
 figures-0.002
 pupil-0.002
Token a
Token ID257
Feature activation+3.72e-01

Pos logit contributions
 Lena+0.000
 girl+0.000
 woman+0.000
Sab+0.000
 pupil+0.000
Neg logit contributions
izza-0.001
 Acc-0.001
 cooled-0.001
acket-0.001
ugg-0.001
Token black
Token ID2042
Feature activation+5.39e-01

Pos logit contributions
 Lena+0.008
Sab+0.007
robat+0.007
 calcul+0.007
 pupil+0.007
Neg logit contributions
izza-0.010
 Acc-0.010
 cooled-0.010
melon-0.010
icking-0.010
Token ant
Token ID1885
Feature activation-5.58e-02

Pos logit contributions
 Lena+0.009
robat+0.009
 girl+0.009
 pupil+0.009
 calcul+0.009
Neg logit contributions
 Acc-0.017
izza-0.017
 cooled-0.017
icking-0.016
uddled-0.016