TOP ACTIVATIONS
MAX = 0.077

day , Red 's engine stopped working and he couldn 't
, Tim my 's radio stopped working . He was sad
to sing . Every day , Blue would sing happy songs
, the car 's wheels stopped spinning . The car was
Tim my 's toy car stopped working and he was very
One day the train stopped and the passengers all had
house . Everywhere she went , she heard the sound of
others . Everywhere he went , he made friends with the
day , Tim 's robot stopped blinking . Tim was sad
It was filled with ch ocol ates and coins . She
, Tim my 's computer stopped working . He felt sad
long tail . Every day , she flew around in the
track ? Or how Grand ma taught you how to fly
. One day , he stopped at a store and bought
One day , the television stopped working and Tim my was
proud . Everywhere he went , he held his ant lers
One day , the clock stopped ticking and Lily didn 't
. Finally , the wind stopped and the fle a was
, Tim my 's car stopped working and he couldn 't
ver he went , people stopped to listen .

MIN ACTIVATIONS
MIN = -0.082

girl named Lily who loved pic n ics . She would
their salad and have happy pic n ics together . <|endoftext|>
in the garden and have pic n ics together .
liked to play and have pic n ics . One day
favourite place to go for pic n ics . <|endoftext|> Once
with a present - a brand new battery - powered car
the store and bought a brand new sp ade . Tim
near the cliff and have pic n ics . One day
so excited ! She loved pic n ics and spending time
She loved to go on pic n ics with her family
a family who really enjoyed pic n ics . They loved
box and inside was a brand new toy with a bright
floor . " Oh , L ila , what
board . He chose a brand new box of clay and
sorry . " Oh , L ila , I
, it was like a brand new symbol !
family were ready to go pic n icking again . But
and nodded . He loved pic n ics and couldn 't
" Why do you want a bisc uit ?"
, ready to have more pic n ics in the future

INTERVAL 0.037 - 0.077
CONTAINS 0.402%

sang louder and clearer than before . The other birds in
town loved to hear the bell ring . One
When he was finished , he l icked his paws
I love you and your mom my so much . You
, she saw a man kne el down in front of

INTERVAL -0.002 - 0.037
CONTAINS 56.899%

his mother and she smiled back . The baby couldn 't
in the water . He stared at the fish . The
friends . That way the bee didn 't have to escape
bird . " Hello , little girl !"
saw a spark ly feather . She was so excited !

INTERVAL -0.042 - -0.002
CONTAINS 42.587%

Just when the rabbit was about to give up
they said goodbye to each other . The little bear went
favorite spot and he could go on magical adventures whenever he
named Lily . They loved to run , jump , and
the paper with her best friend . They shared the y

INTERVAL -0.082 - -0.042
CONTAINS 0.113%

We wanted to give you a candle , but we did
did it again and she learnt her lesson . She loved
The monkey ’ s friends were happy and
can find a princess or a dragon inside ."
day , a little girl named Lily was playing in the
Token day
Token ID1110
Feature activation+1.48e-02

Pos logit contributions
urious+0.042
 acts+0.041
 typing+0.039
itive+0.039
 recite+0.039
Neg logit contributions
ar-0.015
 Nem-0.013
 Bub-0.011
 home-0.011
 Twe-0.010
Token,
Token ID11
Feature activation+4.66e-02

Pos logit contributions
 typing+0.020
 acts+0.020
itive+0.020
pled+0.019
 explains+0.019
Neg logit contributions
ar-0.022
bank-0.020
melon-0.020
 Nem-0.020
 mainland-0.019
Token Red
Token ID2297
Feature activation-1.07e-03

Pos logit contributions
 typing+0.073
itive+0.069
 recite+0.067
 acts+0.066
urious+0.064
Neg logit contributions
ar-0.070
bank-0.056
aun-0.055
melon-0.053
orted-0.053
Token's
Token ID338
Feature activation+4.04e-03

Pos logit contributions
ar+0.002
 Nem+0.001
 Bub+0.001
 home+0.001
 Twe+0.001
Neg logit contributions
urious-0.002
 acts-0.002
ymes-0.002
 recite-0.002
itive-0.002
Token engine
Token ID3113
Feature activation+1.85e-04

Pos logit contributions
urious+0.007
 acts+0.007
 typing+0.007
itive+0.007
 recite+0.007
Neg logit contributions
ar-0.005
 Nem-0.005
 Bub-0.004
 bamboo-0.004
melon-0.004
Token stopped
Token ID5025
Feature activation+7.70e-02

Pos logit contributions
urious+0.000
 acts+0.000
 typing+0.000
itive+0.000
 recite+0.000
Neg logit contributions
ar-0.000
 Nem-0.000
 Bub-0.000
 bamboo-0.000
 home-0.000
Token working
Token ID1762
Feature activation+3.29e-02

Pos logit contributions
urious+0.127
 acts+0.125
ymes+0.118
 explains+0.116
 amaz+0.116
Neg logit contributions
 home-0.113
 Nem-0.100
 Bub-0.098
 Twe-0.097
 back-0.094
Token and
Token ID290
Feature activation+7.34e-03

Pos logit contributions
urious+0.064
 acts+0.062
 recite+0.057
 explains+0.057
 amaz+0.056
Neg logit contributions
ar-0.039
 Nem-0.036
 home-0.035
 Bub-0.033
 Twe-0.031
Token he
Token ID339
Feature activation+9.41e-03

Pos logit contributions
urious+0.013
 typing+0.013
 acts+0.013
 recite+0.012
itive+0.012
Neg logit contributions
ar-0.010
 Nem-0.008
 Bub-0.008
inky-0.007
 bamboo-0.007
Token couldn
Token ID3521
Feature activation+1.07e-02

Pos logit contributions
urious+0.016
 acts+0.016
itive+0.015
ymes+0.015
 recite+0.015
Neg logit contributions
ar-0.013
 Nem-0.012
 Bub-0.011
 home-0.011
 Twe-0.010
Token't
Token ID470
Feature activation+2.68e-02

Pos logit contributions
urious+0.024
 acts+0.023
itive+0.022
 typing+0.022
 explains+0.022
Neg logit contributions
ar-0.010
 Nem-0.008
 Bub-0.008
 Twe-0.007
 home-0.007
Token,
Token ID11
Feature activation+2.03e-02

Pos logit contributions
ar+0.015
bank+0.014
oster+0.014
 mainland+0.014
age+0.013
Neg logit contributions
 acts-0.009
itive-0.009
 typing-0.009
pled-0.009
 explains-0.009
Token Tim
Token ID5045
Feature activation+2.10e-02

Pos logit contributions
 typing+0.029
itive+0.027
urious+0.027
 explains+0.026
 acts+0.026
Neg logit contributions
ar-0.032
inky-0.028
age-0.028
bank-0.026
oster-0.026
Tokenmy
Token ID1820
Feature activation+9.96e-03

Pos logit contributions
 acts+0.030
 amaz+0.030
 typing+0.029
ymes+0.029
itive+0.029
Neg logit contributions
 home-0.032
 Nem-0.030
ar-0.029
 Bub-0.027
 back-0.027
Token's
Token ID338
Feature activation+7.96e-03

Pos logit contributions
urious+0.014
 acts+0.014
ymes+0.014
itive+0.014
asing+0.013
Neg logit contributions
 Nem-0.015
ar-0.015
 home-0.014
 Bub-0.013
 Twe-0.013
Token radio
Token ID5243
Feature activation-8.39e-03

Pos logit contributions
urious+0.015
 acts+0.014
 typing+0.014
itive+0.014
 recite+0.014
Neg logit contributions
ar-0.010
 Nem-0.009
 Bub-0.008
 home-0.008
 Twe-0.008
Token stopped
Token ID5025
Feature activation+7.18e-02

Pos logit contributions
ar+0.012
 Nem+0.011
 Bub+0.010
 home+0.009
 bamboo+0.009
Neg logit contributions
urious-0.014
 acts-0.014
 typing-0.013
 recite-0.013
itive-0.013
Token working
Token ID1762
Feature activation+3.18e-02

Pos logit contributions
 acts+0.115
urious+0.115
ymes+0.106
 explains+0.105
 amaz+0.104
Neg logit contributions
 home-0.104
 Nem-0.095
 Bub-0.092
 Twe-0.088
ar-0.088
Token.
Token ID13
Feature activation-5.35e-04

Pos logit contributions
urious+0.067
 acts+0.065
 explains+0.060
 recite+0.060
itive+0.059
Neg logit contributions
ar-0.033
 Nem-0.030
 home-0.029
 Bub-0.027
 Twe-0.025
Token He
Token ID679
Feature activation+1.69e-02

Pos logit contributions
ar+0.001
 Nem+0.001
 Bub+0.001
 home+0.001
balls+0.001
Neg logit contributions
urious-0.001
 acts-0.001
 typing-0.001
 explains-0.001
 recite-0.001
Token was
Token ID373
Feature activation+1.25e-02

Pos logit contributions
urious+0.029
 acts+0.028
itive+0.027
asing+0.027
ymes+0.027
Neg logit contributions
ar-0.022
 Nem-0.022
 Bub-0.020
 Twe-0.018
 home-0.018
Token sad
Token ID6507
Feature activation+5.18e-03

Pos logit contributions
urious+0.024
 acts+0.023
itive+0.022
ymes+0.022
 recite+0.022
Neg logit contributions
ar-0.014
 Nem-0.014
 Bub-0.013
 home-0.012
 Twe-0.012
Token to
Token ID284
Feature activation-8.64e-03

Pos logit contributions
urious+0.019
 acts+0.019
 typing+0.018
 recite+0.017
itive+0.017
Neg logit contributions
ar-0.011
 Nem-0.010
 Bub-0.009
 home-0.008
 Twe-0.008
Token sing
Token ID1702
Feature activation+2.87e-02

Pos logit contributions
ar+0.014
 Nem+0.013
 Bub+0.013
 home+0.012
 Twe+0.012
Neg logit contributions
urious-0.013
 acts-0.012
 typing-0.011
itive-0.011
 explains-0.011
Token.
Token ID13
Feature activation+2.47e-02

Pos logit contributions
urious+0.045
 acts+0.043
 typing+0.040
itive+0.040
 recite+0.040
Neg logit contributions
ar-0.045
 Nem-0.042
 Bub-0.040
 home-0.037
 Twe-0.036
Token Every
Token ID3887
Feature activation+2.92e-03

Pos logit contributions
urious+0.043
 acts+0.043
 explains+0.040
 recite+0.040
 typing+0.039
Neg logit contributions
ar-0.033
 Nem-0.029
 Bub-0.027
 home-0.026
balls-0.025
Token day
Token ID1110
Feature activation+1.42e-02

Pos logit contributions
urious+0.005
 acts+0.005
 recite+0.005
itive+0.005
 explains+0.005
Neg logit contributions
ar-0.004
 Nem-0.003
 Bub-0.003
balls-0.003
orted-0.003
Token,
Token ID11
Feature activation+7.10e-02

Pos logit contributions
 acts+0.020
itive+0.018
urious+0.018
 explains+0.017
 recite+0.017
Neg logit contributions
ar-0.022
bank-0.019
anger-0.019
 Nem-0.019
aun-0.019
Token Blue
Token ID4518
Feature activation+2.86e-02

Pos logit contributions
urious+0.167
 acts+0.164
itive+0.157
 typing+0.156
 recite+0.156
Neg logit contributions
ar-0.052
 Nem-0.042
 Bub-0.037
 home-0.033
melon-0.029
Token would
Token ID561
Feature activation+3.16e-02

Pos logit contributions
urious+0.047
 acts+0.043
 typing+0.043
itive+0.042
 recite+0.041
Neg logit contributions
ar-0.042
 Nem-0.039
 Bub-0.037
 home-0.036
 Twe-0.034
Token sing
Token ID1702
Feature activation+3.18e-02

Pos logit contributions
urious+0.043
 acts+0.037
itive+0.036
 explains+0.035
 typing+0.035
Neg logit contributions
ar-0.056
 Nem-0.053
 home-0.052
 Bub-0.052
 bamboo-0.048
Token happy
Token ID3772
Feature activation+1.66e-02

Pos logit contributions
urious+0.055
 acts+0.052
 typing+0.049
itive+0.049
 explains+0.049
Neg logit contributions
ar-0.045
 Nem-0.042
 Bub-0.040
 home-0.038
 Twe-0.036
Token songs
Token ID7259
Feature activation+2.89e-02

Pos logit contributions
urious+0.027
 acts+0.025
 typing+0.024
 explains+0.024
itive+0.024
Neg logit contributions
ar-0.025
 Nem-0.023
 Bub-0.022
 home-0.022
 bamboo-0.020
Token,
Token ID11
Feature activation+3.53e-02

Pos logit contributions
 acts+0.004
itive+0.004
 hears+0.004
 typing+0.004
 explains+0.004
Neg logit contributions
bank-0.006
ar-0.006
age-0.005
 mainland-0.005
melon-0.005
Token the
Token ID262
Feature activation+9.93e-03

Pos logit contributions
 acts+0.044
urious+0.043
 typing+0.043
itive+0.042
 recite+0.041
Neg logit contributions
ar-0.063
inky-0.053
 Nem-0.053
age-0.052
bank-0.051
Token car
Token ID1097
Feature activation+7.95e-03

Pos logit contributions
urious+0.022
 acts+0.021
 typing+0.020
itive+0.020
 recite+0.020
Neg logit contributions
ar-0.009
 Nem-0.008
 Bub-0.007
 home-0.006
 Twe-0.006
Token's
Token ID338
Feature activation+7.22e-03

Pos logit contributions
urious+0.013
 acts+0.013
itive+0.012
 typing+0.012
 recite+0.012
Neg logit contributions
ar-0.011
 Nem-0.010
 Bub-0.010
 home-0.009
 Twe-0.009
Token wheels
Token ID13666
Feature activation+2.03e-02

Pos logit contributions
urious+0.013
 acts+0.013
 typing+0.012
itive+0.012
 recite+0.012
Neg logit contributions
ar-0.009
 Nem-0.008
 Bub-0.008
 bamboo-0.007
melon-0.007
Token stopped
Token ID5025
Feature activation+7.08e-02

Pos logit contributions
urious+0.036
 acts+0.035
itive+0.033
 recite+0.033
 typing+0.032
Neg logit contributions
ar-0.027
 Nem-0.024
 Bub-0.023
 home-0.022
 Twe-0.021
Token spinning
Token ID19493
Feature activation+2.48e-02

Pos logit contributions
urious+0.116
 acts+0.113
 explains+0.104
ymes+0.104
itive+0.104
Neg logit contributions
 home-0.102
 Nem-0.095
ar-0.094
 Bub-0.093
 Twe-0.091
Token.
Token ID13
Feature activation+1.02e-02

Pos logit contributions
urious+0.039
 acts+0.038
 typing+0.036
itive+0.035
 explains+0.035
Neg logit contributions
ar-0.037
 Nem-0.034
 Bub-0.032
melon-0.030
 bamboo-0.030
Token The
Token ID383
Feature activation+1.51e-02

Pos logit contributions
urious+0.018
 acts+0.017
 explains+0.016
 typing+0.016
 recite+0.016
Neg logit contributions
ar-0.014
 home-0.011
 Nem-0.011
balls-0.011
melon-0.011
Token car
Token ID1097
Feature activation+2.20e-02

Pos logit contributions
urious+0.033
 acts+0.032
itive+0.030
 recite+0.030
 typing+0.030
Neg logit contributions
ar-0.014
 Nem-0.013
 Bub-0.012
 home-0.011
 Twe-0.010
Token was
Token ID373
Feature activation+1.88e-02

Pos logit contributions
urious+0.039
 acts+0.038
itive+0.036
 recite+0.035
ymes+0.035
Neg logit contributions
ar-0.029
 Nem-0.026
 Bub-0.025
 home-0.024
 Twe-0.023
Token Tim
Token ID5045
Feature activation+1.10e-02

Pos logit contributions
 typing+0.019
itive+0.018
urious+0.018
 explains+0.017
 wasting+0.017
Neg logit contributions
ar-0.028
age-0.024
inky-0.024
oster-0.023
anger-0.023
Tokenmy
Token ID1820
Feature activation+5.48e-03

Pos logit contributions
 acts+0.015
 amaz+0.014
 typing+0.014
ymes+0.014
 recite+0.014
Neg logit contributions
 home-0.018
 Nem-0.018
ar-0.017
 Bub-0.016
 Twe-0.015
Token's
Token ID338
Feature activation+3.95e-03

Pos logit contributions
urious+0.008
 acts+0.008
ymes+0.007
itive+0.007
 recite+0.007
Neg logit contributions
 Nem-0.009
ar-0.009
 home-0.008
 Bub-0.008
 Twe-0.008
Token toy
Token ID13373
Feature activation+3.91e-03

Pos logit contributions
urious+0.007
 acts+0.007
 typing+0.006
itive+0.006
 explains+0.006
Neg logit contributions
ar-0.006
 Nem-0.005
 Bub-0.005
orted-0.004
 bamboo-0.004
Token car
Token ID1097
Feature activation+5.42e-05

Pos logit contributions
 acts+0.007
urious+0.007
 typing+0.007
itive+0.006
 explains+0.006
Neg logit contributions
ar-0.005
 Nem-0.004
 thirst-0.004
orted-0.004
 bamboo-0.004
Token stopped
Token ID5025
Feature activation+6.99e-02

Pos logit contributions
urious+0.000
 acts+0.000
 typing+0.000
itive+0.000
 recite+0.000
Neg logit contributions
ar-0.000
 Nem-0.000
 Bub-0.000
 bamboo-0.000
balls-0.000
Token working
Token ID1762
Feature activation+2.56e-02

Pos logit contributions
urious+0.102
 acts+0.101
 explains+0.093
ymes+0.093
 amaz+0.092
Neg logit contributions
 home-0.113
 Nem-0.107
 Twe-0.103
 Bub-0.103
ar-0.101
Token and
Token ID290
Feature activation-2.95e-03

Pos logit contributions
urious+0.049
 acts+0.047
 recite+0.044
 explains+0.044
itive+0.044
Neg logit contributions
ar-0.031
 Nem-0.029
 home-0.027
 Bub-0.026
 Twe-0.025
Token he
Token ID339
Feature activation+8.26e-03

Pos logit contributions
ar+0.004
inky+0.003
 Nem+0.003
 Bub+0.003
orted+0.003
Neg logit contributions
 typing-0.005
urious-0.005
 acts-0.005
 recite-0.005
itive-0.005
Token was
Token ID373
Feature activation+2.88e-03

Pos logit contributions
urious+0.013
 acts+0.013
itive+0.012
ymes+0.012
 recite+0.012
Neg logit contributions
ar-0.012
 Nem-0.012
 Bub-0.011
 home-0.010
 Twe-0.010
Token very
Token ID845
Feature activation+4.00e-03

Pos logit contributions
urious+0.005
 acts+0.005
 typing+0.005
 recite+0.005
itive+0.005
Neg logit contributions
ar-0.004
 Nem-0.003
 Bub-0.003
 bamboo-0.003
ger-0.003
Token\n
Token ID198
Feature activation+1.05e-02

Pos logit contributions
ar+0.002
 Nem+0.002
 Bub+0.002
melon+0.002
 home+0.002
Neg logit contributions
urious-0.003
 acts-0.003
 typing-0.003
 recite-0.003
 explains-0.003
TokenOne
Token ID3198
Feature activation+1.92e-02

Pos logit contributions
urious+0.014
 acts+0.013
 typing+0.013
 recite+0.013
itive+0.012
Neg logit contributions
ar-0.018
 Nem-0.017
 Bub-0.016
 home-0.016
melon-0.015
Token day
Token ID1110
Feature activation+8.40e-03

Pos logit contributions
urious+0.046
 acts+0.045
 typing+0.043
 explains+0.043
 recite+0.043
Neg logit contributions
ar-0.014
 Nem-0.012
 Bub-0.011
 home-0.010
 bamboo-0.008
Token the
Token ID262
Feature activation+3.75e-03

Pos logit contributions
 acts+0.012
itive+0.012
 typing+0.011
 explains+0.011
urious+0.011
Neg logit contributions
ar-0.012
bank-0.012
melon-0.011
 Nem-0.011
balls-0.011
Token train
Token ID4512
Feature activation-1.27e-02

Pos logit contributions
urious+0.007
 acts+0.007
itive+0.007
 recite+0.007
 typing+0.007
Neg logit contributions
ar-0.004
 Nem-0.004
 Bub-0.004
 home-0.003
 Twe-0.003
Token stopped
Token ID5025
Feature activation+6.89e-02

Pos logit contributions
ar+0.018
 Nem+0.017
 Bub+0.016
 home+0.015
 Twe+0.014
Neg logit contributions
urious-0.021
 acts-0.021
itive-0.019
 typing-0.019
 recite-0.019
Token and
Token ID290
Feature activation-2.93e-03

Pos logit contributions
urious+0.116
 acts+0.113
ymes+0.108
 explains+0.107
itive+0.107
Neg logit contributions
 home-0.096
 Nem-0.082
 Bub-0.082
ar-0.077
 back-0.077
Token the
Token ID262
Feature activation+3.93e-03

Pos logit contributions
ar+0.004
 Nem+0.004
 Bub+0.004
 home+0.004
 Twe+0.004
Neg logit contributions
urious-0.005
 acts-0.004
 typing-0.004
itive-0.004
 recite-0.004
Token passengers
Token ID10405
Feature activation+2.53e-03

Pos logit contributions
urious+0.007
 acts+0.007
itive+0.007
 recite+0.007
 explains+0.007
Neg logit contributions
ar-0.005
 Nem-0.005
 Bub-0.004
 home-0.004
 Twe-0.004
Token all
Token ID477
Feature activation-1.08e-02

Pos logit contributions
urious+0.005
 acts+0.004
ymes+0.004
 explains+0.004
 recite+0.004
Neg logit contributions
ar-0.003
 Nem-0.003
 home-0.003
 Bub-0.003
 Twe-0.003
Token had
Token ID550
Feature activation-5.97e-03

Pos logit contributions
ar+0.016
 Nem+0.015
 Bub+0.014
 home+0.013
 Twe+0.013
Neg logit contributions
urious-0.018
 acts-0.017
itive-0.016
 typing-0.016
 recite-0.016
Token house
Token ID2156
Feature activation-1.49e-03

Pos logit contributions
urious+0.018
 acts+0.016
 typing+0.015
 recite+0.015
 amaz+0.015
Neg logit contributions
ar-0.023
orted-0.019
 Nem-0.019
 Bub-0.019
ger-0.018
Token.
Token ID13
Feature activation+2.29e-02

Pos logit contributions
ar+0.002
 Nem+0.002
 Bub+0.001
 home+0.001
 Twe+0.001
Neg logit contributions
urious-0.003
 acts-0.003
itive-0.003
 recite-0.003
 explains-0.003
Token Everywhere
Token ID48201
Feature activation+3.36e-02

Pos logit contributions
urious+0.043
 acts+0.041
itive+0.040
 typing+0.039
 recite+0.038
Neg logit contributions
ar-0.028
 Nem-0.026
 Bub-0.024
 home-0.023
 bamboo-0.022
Token she
Token ID673
Feature activation+7.34e-03

Pos logit contributions
urious+0.068
 acts+0.063
 typing+0.063
itive+0.061
 recite+0.061
Neg logit contributions
ar-0.037
 Nem-0.034
 Bub-0.030
 home-0.030
 Pink-0.028
Token went
Token ID1816
Feature activation+8.17e-03

Pos logit contributions
urious+0.013
 acts+0.012
 typing+0.011
itive+0.011
 recite+0.011
Neg logit contributions
ar-0.010
 Nem-0.010
 Bub-0.009
 home-0.008
 Twe-0.008
Token,
Token ID11
Feature activation+6.71e-02

Pos logit contributions
urious+0.017
 acts+0.016
itive+0.016
 typing+0.015
 recite+0.015
Neg logit contributions
ar-0.008
 Nem-0.008
 Bub-0.007
 home-0.006
 Twe-0.006
Token she
Token ID673
Feature activation+9.75e-03

Pos logit contributions
urious+0.175
asing+0.159
 acts+0.154
 explains+0.154
ran+0.151
Neg logit contributions
 Nem-0.044
 Pink-0.044
 home-0.037
 Bub-0.035
 it-0.034
Token heard
Token ID2982
Feature activation+1.43e-02

Pos logit contributions
urious+0.016
 acts+0.015
itive+0.015
 typing+0.015
 recite+0.015
Neg logit contributions
ar-0.014
 Nem-0.014
 Bub-0.013
 home-0.012
ger-0.011
Token the
Token ID262
Feature activation+6.81e-03

Pos logit contributions
urious+0.025
 acts+0.024
itive+0.023
 recite+0.022
 typing+0.022
Neg logit contributions
ar-0.019
 Nem-0.018
 Bub-0.017
 home-0.016
 bamboo-0.016
Token sound
Token ID2128
Feature activation+1.81e-02

Pos logit contributions
urious+0.011
 acts+0.011
 typing+0.010
itive+0.010
 recite+0.010
Neg logit contributions
ar-0.010
 Nem-0.009
 Bub-0.009
 home-0.008
 bamboo-0.008
Token of
Token ID286
Feature activation+1.32e-02

Pos logit contributions
urious+0.027
 acts+0.025
itive+0.024
 recite+0.023
 explains+0.023
Neg logit contributions
ar-0.030
 Nem-0.028
 Bub-0.026
 home-0.025
 Twe-0.024
Token others
Token ID1854
Feature activation+1.38e-02

Pos logit contributions
urious+0.015
 acts+0.014
itive+0.014
 explains+0.013
 typing+0.013
Neg logit contributions
ar-0.007
 Nem-0.007
 Bub-0.007
 home-0.007
 Twe-0.006
Token.
Token ID13
Feature activation+2.61e-02

Pos logit contributions
urious+0.026
 acts+0.025
itive+0.024
 typing+0.024
 recite+0.023
Neg logit contributions
ar-0.017
 Nem-0.016
 Bub-0.014
 home-0.013
 Twe-0.013
Token Everywhere
Token ID48201
Feature activation+3.62e-02

Pos logit contributions
urious+0.049
 acts+0.048
 typing+0.045
 explains+0.045
 recite+0.045
Neg logit contributions
ar-0.032
 Nem-0.028
 Bub-0.026
 home-0.025
melon-0.024
Token he
Token ID339
Feature activation+2.60e-02

Pos logit contributions
urious+0.086
 acts+0.079
itive+0.079
 recite+0.079
 typing+0.078
Neg logit contributions
ar-0.029
 Nem-0.026
 Bub-0.022
 home-0.020
 Twe-0.019
Token went
Token ID1816
Feature activation+1.95e-02

Pos logit contributions
urious+0.048
 acts+0.046
 explains+0.046
itive+0.046
 typing+0.045
Neg logit contributions
ar-0.033
 Nem-0.030
 Bub-0.028
ger-0.026
 home-0.025
Token,
Token ID11
Feature activation+6.70e-02

Pos logit contributions
urious+0.042
 acts+0.041
itive+0.039
 typing+0.039
 recite+0.038
Neg logit contributions
ar-0.019
 Nem-0.017
 Bub-0.015
 home-0.014
 Twe-0.013
Token he
Token ID339
Feature activation+2.76e-02

Pos logit contributions
urious+0.185
 acts+0.167
 explains+0.163
itive+0.163
asing+0.162
Neg logit contributions
 Nem-0.034
 Bub-0.030
ar-0.026
 home-0.022
 Twe-0.021
Token made
Token ID925
Feature activation+1.87e-02

Pos logit contributions
urious+0.049
 acts+0.045
itive+0.044
ymes+0.044
 explains+0.044
Neg logit contributions
ar-0.038
 Nem-0.035
 Bub-0.034
 home-0.031
 Twe-0.030
Token friends
Token ID2460
Feature activation+2.11e-02

Pos logit contributions
urious+0.031
 acts+0.029
itive+0.028
 recite+0.027
 typing+0.027
Neg logit contributions
ar-0.027
 Nem-0.026
 Bub-0.025
 home-0.024
 Twe-0.023
Token with
Token ID351
Feature activation+2.50e-02

Pos logit contributions
urious+0.035
 acts+0.033
itive+0.031
 typing+0.031
 recite+0.031
Neg logit contributions
ar-0.031
 Nem-0.029
 Bub-0.027
 home-0.026
 Twe-0.025
Token the
Token ID262
Feature activation+1.98e-02

Pos logit contributions
urious+0.046
 acts+0.044
itive+0.042
 typing+0.042
 recite+0.042
Neg logit contributions
ar-0.031
 Nem-0.030
 Bub-0.028
 home-0.026
 Twe-0.025
Token day
Token ID1110
Feature activation-4.70e-03

Pos logit contributions
urious+0.049
 acts+0.047
 typing+0.045
itive+0.045
 recite+0.045
Neg logit contributions
ar-0.018
 Nem-0.016
 Bub-0.014
 home-0.013
 Twe-0.012
Token,
Token ID11
Feature activation+1.67e-02

Pos logit contributions
ar+0.008
bank+0.008
age+0.008
oster+0.008
 mainland+0.007
Neg logit contributions
 acts-0.005
itive-0.005
 explains-0.005
urious-0.005
pled-0.005
Token Tim
Token ID5045
Feature activation+6.03e-03

Pos logit contributions
urious+0.025
 typing+0.024
 acts+0.024
itive+0.023
 explains+0.023
Neg logit contributions
ar-0.025
age-0.021
inky-0.020
bank-0.020
anger-0.019
Token's
Token ID338
Feature activation+1.36e-02

Pos logit contributions
itive+0.011
 amaz+0.011
urious+0.010
ymes+0.010
 acts+0.010
Neg logit contributions
 Nem-0.007
ar-0.007
 home-0.007
 Bub-0.006
 Twe-0.006
Token robot
Token ID9379
Feature activation+1.01e-02

Pos logit contributions
urious+0.023
 acts+0.022
itive+0.020
 typing+0.020
 recite+0.020
Neg logit contributions
ar-0.020
 Nem-0.018
 Bub-0.017
 home-0.016
 Twe-0.016
Token stopped
Token ID5025
Feature activation+6.70e-02

Pos logit contributions
urious+0.014
 acts+0.014
 typing+0.013
 explains+0.013
 recite+0.013
Neg logit contributions
ar-0.016
 Nem-0.016
 Bub-0.015
 bamboo-0.014
 home-0.014
Token blinking
Token ID43196
Feature activation+2.71e-02

Pos logit contributions
 acts+0.112
urious+0.111
 explains+0.106
 amaz+0.106
itive+0.104
Neg logit contributions
 home-0.093
 Nem-0.083
 Bub-0.081
 Twe-0.075
 very-0.074
Token.
Token ID13
Feature activation-7.70e-03

Pos logit contributions
urious+0.053
 acts+0.052
 typing+0.050
 recite+0.050
 explains+0.049
Neg logit contributions
ar-0.029
 Nem-0.027
 home-0.025
 Bub-0.024
 Twe-0.022
Token Tim
Token ID5045
Feature activation+1.71e-02

Pos logit contributions
ar+0.009
 Nem+0.008
 Bub+0.007
 home+0.007
balls+0.006
Neg logit contributions
urious-0.015
 acts-0.015
 typing-0.014
 explains-0.014
 recite-0.014
Token was
Token ID373
Feature activation+1.03e-02

Pos logit contributions
itive+0.029
urious+0.029
 amaz+0.028
ymes+0.028
asing+0.028
Neg logit contributions
ar-0.020
 Nem-0.020
 Bub-0.019
 home-0.018
 Twe-0.017
Token sad
Token ID6507
Feature activation+5.67e-03

Pos logit contributions
urious+0.017
 acts+0.017
itive+0.016
 recite+0.015
 explains+0.015
Neg logit contributions
ar-0.015
 Nem-0.014
 Bub-0.013
 home-0.013
 Twe-0.012
Token It
Token ID632
Feature activation-3.01e-03

Pos logit contributions
urious+0.012
 acts+0.011
 explains+0.011
 typing+0.011
 recite+0.011
Neg logit contributions
ar-0.010
 Nem-0.009
 Bub-0.008
balls-0.008
 bamboo-0.008
Token was
Token ID373
Feature activation-4.05e-03

Pos logit contributions
ar+0.003
 Nem+0.003
 Bub+0.003
 Twe+0.003
ger+0.003
Neg logit contributions
urious-0.006
 acts-0.006
itive-0.006
ymes-0.005
asing-0.005
Token filled
Token ID5901
Feature activation-4.55e-03

Pos logit contributions
ar+0.006
 Nem+0.005
 Bub+0.005
orted+0.005
balls+0.005
Neg logit contributions
urious-0.006
 acts-0.006
 typing-0.006
 recite-0.006
 explains-0.006
Token with
Token ID351
Feature activation+1.97e-03

Pos logit contributions
ar+0.006
 Nem+0.006
 Bub+0.005
 home+0.005
 Twe+0.005
Neg logit contributions
urious-0.008
 acts-0.008
itive-0.007
 typing-0.007
 recite-0.007
Token ch
Token ID442
Feature activation+3.28e-02

Pos logit contributions
urious+0.003
 acts+0.003
itive+0.003
 recite+0.003
 typing+0.003
Neg logit contributions
ar-0.003
 Nem-0.003
 Bub-0.003
 home-0.002
 Twe-0.002
Tokenocol
Token ID4668
Feature activation+6.67e-02

Pos logit contributions
urious+0.050
 acts+0.047
itive+0.046
 recite+0.046
 explains+0.045
Neg logit contributions
ar-0.052
 Nem-0.048
 Bub-0.045
 home-0.043
melon-0.041
Tokenates
Token ID689
Feature activation+1.09e-02

Pos logit contributions
urious+0.130
 acts+0.127
 recite+0.120
 typing+0.120
 explains+0.120
Neg logit contributions
ar-0.076
 Nem-0.067
 Bub-0.062
 home-0.059
balls-0.056
Token and
Token ID290
Feature activation-9.55e-03

Pos logit contributions
urious+0.019
 acts+0.018
itive+0.017
 typing+0.017
 recite+0.017
Neg logit contributions
ar-0.015
 Nem-0.014
 Bub-0.013
 home-0.012
 Twe-0.012
Token coins
Token ID10796
Feature activation-6.08e-03

Pos logit contributions
ar+0.014
 Nem+0.013
 Bub+0.012
 home+0.012
 Twe+0.011
Neg logit contributions
urious-0.016
 acts-0.015
itive-0.014
 recite-0.014
 typing-0.014
Token.
Token ID13
Feature activation+1.26e-02

Pos logit contributions
ar+0.008
 Nem+0.007
 Bub+0.007
 home+0.006
 Twe+0.006
Neg logit contributions
urious-0.011
 acts-0.010
 typing-0.010
itive-0.010
 recite-0.010
Token She
Token ID1375
Feature activation+2.26e-03

Pos logit contributions
urious+0.022
 acts+0.021
 explains+0.020
 recite+0.020
 typing+0.020
Neg logit contributions
ar-0.017
 Nem-0.016
 Bub-0.015
 bamboo-0.014
balls-0.014
Token,
Token ID11
Feature activation+2.44e-02

Pos logit contributions
ar+0.008
bank+0.008
oster+0.008
age+0.008
 mainland+0.008
Neg logit contributions
itive-0.006
 acts-0.005
 explains-0.005
urious-0.005
 typing-0.005
Token Tim
Token ID5045
Feature activation+1.22e-02

Pos logit contributions
 typing+0.033
urious+0.033
itive+0.032
 acts+0.031
 explains+0.031
Neg logit contributions
ar-0.040
age-0.034
inky-0.034
oster-0.032
bank-0.032
Tokenmy
Token ID1820
Feature activation+7.66e-03

Pos logit contributions
 amaz+0.016
 acts+0.016
itive+0.015
 typing+0.015
ymes+0.015
Neg logit contributions
 Nem-0.020
 home-0.020
ar-0.019
 Bub-0.018
 away-0.017
Token's
Token ID338
Feature activation+8.30e-03

Pos logit contributions
urious+0.011
 acts+0.011
ymes+0.010
itive+0.010
 recite+0.010
Neg logit contributions
 Nem-0.012
ar-0.012
 home-0.011
 Bub-0.011
 Twe-0.010
Token computer
Token ID3644
Feature activation-3.52e-03

Pos logit contributions
urious+0.016
 acts+0.016
itive+0.015
 recite+0.015
ymes+0.015
Neg logit contributions
ar-0.008
 home-0.008
 Nem-0.008
balls-0.007
 Bub-0.007
Token stopped
Token ID5025
Feature activation+6.65e-02

Pos logit contributions
ar+0.005
 Nem+0.005
 Bub+0.005
 bamboo+0.005
 home+0.004
Neg logit contributions
urious-0.005
 acts-0.005
 typing-0.005
 explains-0.005
 recite-0.005
Token working
Token ID1762
Feature activation+2.67e-02

Pos logit contributions
 acts+0.101
urious+0.099
ymes+0.093
 explains+0.091
itive+0.090
Neg logit contributions
 home-0.107
 Nem-0.093
 Bub-0.089
 Twe-0.088
 back-0.088
Token.
Token ID13
Feature activation+6.30e-05

Pos logit contributions
urious+0.054
 acts+0.052
 recite+0.048
 explains+0.048
itive+0.047
Neg logit contributions
ar-0.030
 home-0.027
 Nem-0.027
 Bub-0.024
 Twe-0.023
Token He
Token ID679
Feature activation+1.89e-02

Pos logit contributions
urious+0.000
 acts+0.000
 typing+0.000
 explains+0.000
 recite+0.000
Neg logit contributions
ar-0.000
 Nem-0.000
 bamboo-0.000
 Bub-0.000
balls-0.000
Token felt
Token ID2936
Feature activation+1.19e-02

Pos logit contributions
urious+0.031
 acts+0.031
itive+0.030
ymes+0.030
asing+0.029
Neg logit contributions
 Nem-0.025
ar-0.025
 Bub-0.023
 home-0.021
 Twe-0.021
Token sad
Token ID6507
Feature activation+6.52e-03

Pos logit contributions
 acts+0.017
urious+0.017
 typing+0.016
 explains+0.016
 recite+0.016
Neg logit contributions
ar-0.020
 Nem-0.017
orted-0.016
 Bub-0.016
balls-0.016
Token long
Token ID890
Feature activation+6.90e-03

Pos logit contributions
urious+0.019
 acts+0.018
 recite+0.018
itive+0.017
 explains+0.017
Neg logit contributions
ar-0.030
 Nem-0.028
orted-0.026
 Bub-0.026
melon-0.024
Token tail
Token ID7894
Feature activation-3.21e-03

Pos logit contributions
urious+0.009
 acts+0.009
itive+0.009
 recite+0.008
 explains+0.008
Neg logit contributions
ar-0.012
 Nem-0.011
 Bub-0.011
balls-0.010
melon-0.010
Token.
Token ID13
Feature activation+3.35e-02

Pos logit contributions
ar+0.004
 Nem+0.004
 Bub+0.004
 Twe+0.004
 home+0.003
Neg logit contributions
urious-0.005
 acts-0.005
itive-0.005
 typing-0.005
 recite-0.005
Token Every
Token ID3887
Feature activation+8.56e-03

Pos logit contributions
urious+0.072
 acts+0.069
itive+0.068
 typing+0.066
ymes+0.065
Neg logit contributions
ar-0.030
 Nem-0.027
 Bub-0.026
 home-0.025
 Twe-0.023
Token day
Token ID1110
Feature activation+1.39e-02

Pos logit contributions
urious+0.017
 acts+0.017
 recite+0.016
itive+0.015
 explains+0.015
Neg logit contributions
ar-0.010
 Nem-0.009
 Bub-0.009
balls-0.008
 Twe-0.007
Token,
Token ID11
Feature activation+6.63e-02

Pos logit contributions
urious+0.029
 acts+0.028
 typing+0.027
itive+0.027
 recite+0.027
Neg logit contributions
ar-0.014
 Nem-0.013
 Bub-0.011
 home-0.010
melon-0.010
Token she
Token ID673
Feature activation+3.02e-02

Pos logit contributions
urious+0.156
 acts+0.140
asing+0.138
itive+0.134
pled+0.133
Neg logit contributions
ar-0.050
 Bub-0.050
 Twe-0.048
 Nem-0.047
 home-0.047
Token flew
Token ID13112
Feature activation+1.93e-02

Pos logit contributions
urious+0.055
 amaz+0.050
 typing+0.049
 acts+0.049
itive+0.049
Neg logit contributions
ar-0.040
 Nem-0.037
 Bub-0.035
 home-0.032
 Twe-0.031
Token around
Token ID1088
Feature activation+1.93e-02

Pos logit contributions
urious+0.026
 acts+0.024
 recite+0.024
 hears+0.023
 explains+0.023
Neg logit contributions
ar-0.032
 Nem-0.029
inky-0.027
melon-0.026
anger-0.026
Token in
Token ID287
Feature activation+2.65e-02

Pos logit contributions
urious+0.037
 acts+0.034
 explains+0.034
itive+0.033
 typing+0.032
Neg logit contributions
ar-0.024
 Nem-0.022
 home-0.022
 Bub-0.022
 Twe-0.020
Token the
Token ID262
Feature activation+2.54e-02

Pos logit contributions
itive+0.058
asing+0.056
urious+0.055
 explains+0.054
istent+0.053
Neg logit contributions
 home-0.029
 bamboo-0.022
 back-0.021
 sand-0.021
 it-0.020
Token track
Token ID2610
Feature activation+1.71e-03

Pos logit contributions
urious+0.003
 acts+0.003
itive+0.003
 typing+0.003
 recite+0.003
Neg logit contributions
ar-0.002
 Nem-0.002
 Bub-0.002
 home-0.002
 Twe-0.002
Token?
Token ID30
Feature activation+2.17e-02

Pos logit contributions
urious+0.003
 acts+0.003
 typing+0.003
 explains+0.003
 recite+0.003
Neg logit contributions
ar-0.002
 Nem-0.002
 Bub-0.002
 Twe-0.002
 home-0.002
Token Or
Token ID1471
Feature activation+1.45e-02

Pos logit contributions
urious+0.037
 acts+0.036
itive+0.034
 typing+0.034
 recite+0.034
Neg logit contributions
ar-0.030
 Nem-0.028
 Bub-0.026
 home-0.024
 Twe-0.024
Token how
Token ID703
Feature activation+2.68e-02

Pos logit contributions
urious+0.029
itive+0.026
 acts+0.026
 recite+0.025
ymes+0.025
Neg logit contributions
ar-0.017
 Nem-0.016
 Bub-0.016
 Twe-0.014
 home-0.014
Token Grand
Token ID5675
Feature activation+2.23e-02

Pos logit contributions
urious+0.054
 acts+0.052
itive+0.051
 amaz+0.051
 recite+0.050
Neg logit contributions
ar-0.027
 Nem-0.027
 home-0.025
 Bub-0.024
 Twe-0.023
Tokenma
Token ID2611
Feature activation+6.63e-02

Pos logit contributions
urious+0.031
 acts+0.031
Ru+0.031
 amaz+0.031
 recite+0.031
Neg logit contributions
ar-0.034
 Nem-0.030
 home-0.030
melon-0.028
oe-0.028
Token taught
Token ID7817
Feature activation+1.14e-02

Pos logit contributions
urious+0.101
itive+0.093
 amaz+0.092
 acts+0.090
 abrupt+0.089
Neg logit contributions
ar-0.104
 Nem-0.092
 home-0.090
 Bub-0.090
balls-0.083
Token you
Token ID345
Feature activation-4.60e-03

Pos logit contributions
urious+0.021
 acts+0.020
 recite+0.019
 typing+0.019
 explains+0.019
Neg logit contributions
ar-0.014
 Nem-0.013
 Bub-0.012
 home-0.011
melon-0.011
Token how
Token ID703
Feature activation-9.29e-04

Pos logit contributions
ar+0.006
 Nem+0.006
 Bub+0.006
 Twe+0.005
 home+0.005
Neg logit contributions
urious-0.008
 acts-0.008
itive-0.007
 explains-0.007
 typing-0.007
Token to
Token ID284
Feature activation+5.21e-03

Pos logit contributions
ar+0.001
 Nem+0.001
 Bub+0.001
 home+0.001
 bamboo+0.001
Neg logit contributions
urious-0.002
 acts-0.002
 typing-0.002
 explains-0.002
itive-0.002
Token fly
Token ID6129
Feature activation+1.57e-02

Pos logit contributions
urious+0.007
 acts+0.007
itive+0.006
ymes+0.006
 explains+0.006
Neg logit contributions
ar-0.009
 Nem-0.009
 Bub-0.008
 home-0.008
 Twe-0.008
Token.
Token ID13
Feature activation+2.39e-03

Pos logit contributions
urious+0.014
 acts+0.014
 typing+0.013
itive+0.013
 recite+0.013
Neg logit contributions
ar-0.010
 Nem-0.009
 Bub-0.009
melon-0.008
 Twe-0.008
Token One
Token ID1881
Feature activation+2.11e-02

Pos logit contributions
urious+0.003
 acts+0.003
 explains+0.003
 typing+0.003
 recite+0.003
Neg logit contributions
ar-0.004
 thirst-0.003
orted-0.003
balls-0.003
ger-0.003
Token day
Token ID1110
Feature activation+8.18e-03

Pos logit contributions
urious+0.050
 acts+0.050
 typing+0.048
itive+0.048
 explains+0.048
Neg logit contributions
ar-0.015
 Nem-0.013
 Bub-0.011
 home-0.011
 Twe-0.009
Token,
Token ID11
Feature activation+3.03e-02

Pos logit contributions
 acts+0.010
itive+0.010
 typing+0.010
 explains+0.009
 recite+0.009
Neg logit contributions
ar-0.013
age-0.013
bank-0.012
inky-0.012
 mainland-0.012
Token he
Token ID339
Feature activation+2.25e-02

Pos logit contributions
 typing+0.043
 acts+0.041
itive+0.041
 recite+0.040
 explains+0.038
Neg logit contributions
ar-0.048
inky-0.042
age-0.041
orted-0.039
anger-0.038
Token stopped
Token ID5025
Feature activation+6.63e-02

Pos logit contributions
urious+0.042
 acts+0.039
itive+0.039
ymes+0.039
 typing+0.038
Neg logit contributions
ar-0.028
 Nem-0.027
 Bub-0.026
 home-0.025
 Twe-0.023
Token at
Token ID379
Feature activation+1.79e-02

Pos logit contributions
urious+0.128
ymes+0.120
 acts+0.120
itive+0.120
 explains+0.116
Neg logit contributions
 home-0.080
 Bub-0.069
 Nem-0.066
 back-0.064
ar-0.063
Token a
Token ID257
Feature activation-2.19e-03

Pos logit contributions
urious+0.031
 acts+0.030
 typing+0.028
itive+0.028
 recite+0.028
Neg logit contributions
ar-0.025
 Nem-0.022
 Bub-0.021
 home-0.020
 Twe-0.019
Token store
Token ID3650
Feature activation-6.56e-03

Pos logit contributions
ar+0.004
 Nem+0.003
 Bub+0.003
orted+0.003
balls+0.003
Neg logit contributions
urious-0.003
 acts-0.003
 recite-0.003
 typing-0.003
 explains-0.003
Token and
Token ID290
Feature activation-1.57e-02

Pos logit contributions
ar+0.008
 Nem+0.007
 home+0.007
 Bub+0.006
 Twe+0.006
Neg logit contributions
urious-0.013
 acts-0.012
 recite-0.012
ymes-0.011
itive-0.011
Token bought
Token ID5839
Feature activation-5.30e-03

Pos logit contributions
ar+0.026
 Nem+0.024
 Bub+0.023
 home+0.021
 bamboo+0.021
Neg logit contributions
urious-0.023
 acts-0.022
 typing-0.021
 recite-0.020
itive-0.020
Token One
Token ID1881
Feature activation+1.84e-02

Pos logit contributions
urious+0.001
 acts+0.001
 explains+0.001
 typing+0.001
 recite+0.001
Neg logit contributions
ar-0.001
 Nem-0.000
 Bub-0.000
balls-0.000
 home-0.000
Token day
Token ID1110
Feature activation-7.91e-03

Pos logit contributions
urious+0.043
 acts+0.042
 typing+0.040
itive+0.040
 recite+0.040
Neg logit contributions
ar-0.014
 Nem-0.012
 Bub-0.011
 home-0.010
 Twe-0.009
Token,
Token ID11
Feature activation+2.00e-02

Pos logit contributions
ar+0.013
oster+0.013
bank+0.013
 mainland+0.012
age+0.012
Neg logit contributions
itive-0.009
 acts-0.008
 typing-0.008
urious-0.008
 explains-0.008
Token the
Token ID262
Feature activation+1.22e-02

Pos logit contributions
 typing+0.028
itive+0.027
urious+0.027
 acts+0.025
 explains+0.025
Neg logit contributions
ar-0.033
age-0.028
inky-0.027
bank-0.026
oster-0.026
Token television
Token ID5581
Feature activation+7.21e-03

Pos logit contributions
urious+0.018
 acts+0.017
 typing+0.017
itive+0.016
 explains+0.016
Neg logit contributions
ar-0.021
 Nem-0.018
 Bub-0.018
 Twe-0.016
orted-0.016
Token stopped
Token ID5025
Feature activation+6.60e-02

Pos logit contributions
urious+0.012
 acts+0.012
itive+0.011
 typing+0.011
 recite+0.011
Neg logit contributions
ar-0.010
 Nem-0.009
 Bub-0.009
 home-0.008
 Twe-0.008
Token working
Token ID1762
Feature activation+2.01e-02

Pos logit contributions
 acts+0.092
urious+0.092
 explains+0.084
ymes+0.082
 recite+0.081
Neg logit contributions
 home-0.108
 Nem-0.103
 Bub-0.100
ar-0.096
 Twe-0.096
Token and
Token ID290
Feature activation+4.73e-03

Pos logit contributions
urious+0.040
 acts+0.038
 recite+0.036
 explains+0.036
itive+0.035
Neg logit contributions
ar-0.023
 Nem-0.021
 home-0.020
 Bub-0.019
 Twe-0.018
Token Tim
Token ID5045
Feature activation+1.25e-02

Pos logit contributions
 typing+0.008
urious+0.008
 acts+0.008
 recite+0.008
 explains+0.008
Neg logit contributions
ar-0.006
inky-0.005
 Nem-0.005
 Bub-0.005
orted-0.005
Tokenmy
Token ID1820
Feature activation+1.21e-02

Pos logit contributions
 acts+0.016
 amaz+0.015
urious+0.015
 recite+0.015
 typing+0.015
Neg logit contributions
ar-0.020
 Nem-0.020
 home-0.020
 Bub-0.019
 Twe-0.018
Token was
Token ID373
Feature activation+5.69e-03

Pos logit contributions
urious+0.021
 acts+0.021
ymes+0.021
itive+0.020
 recite+0.019
Neg logit contributions
ar-0.015
 Nem-0.014
 home-0.013
 Bub-0.012
 cab-0.012
Token proud
Token ID6613
Feature activation+3.01e-03

Pos logit contributions
ar+0.003
 Nem+0.002
 Bub+0.002
anger+0.002
ger+0.002
Neg logit contributions
urious-0.002
 acts-0.002
 typing-0.002
itive-0.002
 explains-0.002
Token.
Token ID13
Feature activation+7.77e-03

Pos logit contributions
urious+0.004
 acts+0.004
itive+0.003
 typing+0.003
 explains+0.003
Neg logit contributions
ar-0.006
 Nem-0.005
 Bub-0.005
 home-0.005
orted-0.005
Token Everywhere
Token ID48201
Feature activation+2.97e-02

Pos logit contributions
 acts+0.012
urious+0.012
 typing+0.011
 explains+0.011
 recite+0.011
Neg logit contributions
ar-0.012
balls-0.010
 home-0.010
 Nem-0.010
orted-0.009
Token he
Token ID339
Feature activation+2.68e-02

Pos logit contributions
 acts+0.054
urious+0.054
itive+0.050
 explains+0.050
 typing+0.049
Neg logit contributions
ar-0.036
 Nem-0.032
 Bub-0.031
balls-0.029
 home-0.028
Token went
Token ID1816
Feature activation+2.05e-02

Pos logit contributions
urious+0.049
 acts+0.046
 typing+0.045
itive+0.045
 explains+0.045
Neg logit contributions
ar-0.034
 Nem-0.033
 Bub-0.030
 Twe-0.027
ger-0.027
Token,
Token ID11
Feature activation+6.59e-02

Pos logit contributions
 acts+0.041
urious+0.041
 explains+0.038
itive+0.038
 typing+0.038
Neg logit contributions
ar-0.021
 Nem-0.019
 Bub-0.017
 home-0.016
balls-0.016
Token he
Token ID339
Feature activation+2.36e-02

Pos logit contributions
urious+0.158
 acts+0.145
 recite+0.142
 explains+0.141
asing+0.140
Neg logit contributions
 Nem-0.053
ar-0.049
 Bub-0.045
 Twe-0.042
 home-0.040
Token held
Token ID2714
Feature activation+2.55e-02

Pos logit contributions
urious+0.040
 acts+0.037
itive+0.037
 explains+0.036
 typing+0.036
Neg logit contributions
ar-0.034
 Nem-0.031
 Bub-0.030
 home-0.027
 Twe-0.027
Token his
Token ID465
Feature activation+1.84e-02

Pos logit contributions
 acts+0.043
urious+0.043
 explains+0.040
 typing+0.040
 recite+0.040
Neg logit contributions
ar-0.035
 Nem-0.031
 Bub-0.029
melon-0.028
orted-0.027
Token ant
Token ID1885
Feature activation+2.25e-02

Pos logit contributions
 acts+0.027
urious+0.026
 recite+0.023
itive+0.023
 explains+0.023
Neg logit contributions
ar-0.031
 Nem-0.028
 Bub-0.027
orted-0.025
melon-0.025
Tokenlers
Token ID8116
Feature activation+2.59e-02

Pos logit contributions
urious+0.034
 acts+0.033
itive+0.029
 typing+0.029
 recite+0.029
Neg logit contributions
ar-0.037
 Nem-0.034
 Bub-0.032
orted-0.030
 Twe-0.030
Token One
Token ID1881
Feature activation+2.01e-02

Pos logit contributions
urious+0.002
 acts+0.002
 explains+0.002
 recite+0.002
 typing+0.002
Neg logit contributions
ar-0.002
 Nem-0.002
balls-0.001
orted-0.001
melon-0.001
Token day
Token ID1110
Feature activation-3.14e-03

Pos logit contributions
urious+0.049
 acts+0.048
 typing+0.047
 recite+0.046
 explains+0.046
Neg logit contributions
ar-0.012
 Nem-0.010
 home-0.010
 Bub-0.008
 bamboo-0.007
Token,
Token ID11
Feature activation+1.85e-02

Pos logit contributions
bank+0.005
 mainland+0.005
ar+0.005
 Nem+0.005
melon+0.005
Neg logit contributions
 acts-0.004
itive-0.003
 typing-0.003
 hears-0.003
pled-0.003
Token the
Token ID262
Feature activation+1.96e-02

Pos logit contributions
 typing+0.018
itive+0.018
 recite+0.018
 acts+0.017
 read+0.017
Neg logit contributions
ar-0.037
bank-0.035
age-0.033
 mainland-0.032
inky-0.032
Token clock
Token ID8801
Feature activation+1.09e-02

Pos logit contributions
 acts+0.030
urious+0.030
 recite+0.027
itive+0.027
 explains+0.027
Neg logit contributions
 home-0.027
ar-0.026
 Nem-0.025
balls-0.023
time-0.023
Token stopped
Token ID5025
Feature activation+6.59e-02

Pos logit contributions
urious+0.018
 acts+0.017
 typing+0.016
 recite+0.016
 explains+0.016
Neg logit contributions
ar-0.016
 Nem-0.015
 Bub-0.014
 home-0.013
 bamboo-0.013
Token ticking
Token ID40212
Feature activation+3.37e-02

Pos logit contributions
 acts+0.108
urious+0.105
 amaz+0.099
 explains+0.098
 encourages+0.094
Neg logit contributions
 home-0.099
 Twe-0.082
 Nem-0.081
 very-0.080
 back-0.080
Token and
Token ID290
Feature activation+3.61e-04

Pos logit contributions
urious+0.057
 acts+0.055
 typing+0.051
 recite+0.051
itive+0.050
Neg logit contributions
ar-0.048
 Nem-0.044
 Bub-0.041
 home-0.041
 Twe-0.039
Token Lily
Token ID20037
Feature activation+2.69e-03

Pos logit contributions
urious+0.001
 acts+0.001
 typing+0.001
 recite+0.001
itive+0.001
Neg logit contributions
ar-0.000
 Nem-0.000
 Bub-0.000
inky-0.000
 home-0.000
Token didn
Token ID1422
Feature activation-4.22e-04

Pos logit contributions
 acts+0.005
urious+0.005
 amaz+0.005
asing+0.005
ymes+0.005
Neg logit contributions
ar-0.003
 Nem-0.003
 home-0.002
 Bub-0.002
 Twe-0.002
Token't
Token ID470
Feature activation+1.95e-02

Pos logit contributions
ar+0.000
 Nem+0.000
 Bub+0.000
 home+0.000
 Twe+0.000
Neg logit contributions
urious-0.001
 acts-0.001
 typing-0.001
 recite-0.001
itive-0.001
Token.
Token ID13
Feature activation+5.18e-04

Pos logit contributions
 acts+0.012
urious+0.012
itive+0.012
 typing+0.011
 recite+0.011
Neg logit contributions
ar-0.010
 Nem-0.010
 Bub-0.009
melon-0.009
 Twe-0.008
Token Finally
Token ID9461
Feature activation-1.88e-02

Pos logit contributions
urious+0.001
 acts+0.001
 recite+0.001
 explains+0.001
 typing+0.001
Neg logit contributions
ar-0.001
 Nem-0.001
melon-0.001
balls-0.001
 bamboo-0.001
Token,
Token ID11
Feature activation+2.26e-02

Pos logit contributions
inky+0.027
ar+0.027
orted+0.026
melon+0.026
bank+0.026
Neg logit contributions
 acts-0.024
itive-0.024
 typing-0.022
 amaz-0.022
 explains-0.022
Token the
Token ID262
Feature activation+9.07e-03

Pos logit contributions
urious+0.032
 acts+0.031
 typing+0.030
 recite+0.030
itive+0.029
Neg logit contributions
ar-0.036
 Nem-0.033
 Bub-0.032
melon-0.030
inky-0.029
Token wind
Token ID2344
Feature activation-1.14e-02

Pos logit contributions
urious+0.017
 acts+0.017
itive+0.016
 typing+0.016
 explains+0.016
Neg logit contributions
ar-0.011
 Nem-0.010
 Bub-0.009
 home-0.009
 Twe-0.008
Token stopped
Token ID5025
Feature activation+6.58e-02

Pos logit contributions
ar+0.017
 Nem+0.017
 Bub+0.016
 bamboo+0.015
 home+0.015
Neg logit contributions
urious-0.017
 acts-0.017
itive-0.015
 typing-0.015
 recite-0.015
Token and
Token ID290
Feature activation+1.38e-02

Pos logit contributions
urious+0.120
 acts+0.116
ymes+0.110
 explains+0.108
 recite+0.106
Neg logit contributions
 home-0.081
 Nem-0.081
ar-0.078
 Bub-0.078
 Twe-0.076
Token the
Token ID262
Feature activation+9.98e-03

Pos logit contributions
urious+0.022
 acts+0.022
 recite+0.020
 typing+0.020
 explains+0.020
Neg logit contributions
ar-0.020
 Nem-0.018
 Bub-0.017
 home-0.016
balls-0.016
Token fle
Token ID5104
Feature activation-1.40e-02

Pos logit contributions
urious+0.018
 acts+0.018
 typing+0.016
 recite+0.016
itive+0.016
Neg logit contributions
ar-0.013
 Nem-0.012
 Bub-0.011
 home-0.010
 Twe-0.010
Tokena
Token ID64
Feature activation+7.24e-04

Pos logit contributions
 Nem+0.028
 Bub+0.026
ar+0.025
 Twe+0.025
melon+0.025
Neg logit contributions
urious-0.015
ymes-0.014
 acts-0.013
asing-0.012
itive-0.012
Token was
Token ID373
Feature activation+7.42e-03

Pos logit contributions
urious+0.001
 acts+0.001
 typing+0.001
 explains+0.001
itive+0.001
Neg logit contributions
ar-0.001
 Nem-0.001
 Bub-0.001
 home-0.001
ger-0.001
Token,
Token ID11
Feature activation+1.71e-02

Pos logit contributions
ar+0.016
oster+0.015
bank+0.015
 mainland+0.015
age+0.014
Neg logit contributions
itive-0.010
 acts-0.009
urious-0.009
 typing-0.009
pled-0.009
Token Tim
Token ID5045
Feature activation+1.14e-02

Pos logit contributions
 typing+0.024
urious+0.023
itive+0.023
 explains+0.021
 acts+0.021
Neg logit contributions
ar-0.028
age-0.023
inky-0.023
oster-0.022
bank-0.022
Tokenmy
Token ID1820
Feature activation+4.99e-03

Pos logit contributions
 acts+0.014
 typing+0.014
itive+0.013
 recite+0.013
ymes+0.013
Neg logit contributions
 Nem-0.019
 home-0.019
ar-0.019
 Bub-0.018
 Twe-0.017
Token's
Token ID338
Feature activation-1.07e-03

Pos logit contributions
urious+0.007
 acts+0.006
itive+0.006
ymes+0.006
 recite+0.006
Neg logit contributions
ar-0.009
 Nem-0.008
 Bub-0.008
 home-0.008
 Twe-0.007
Token car
Token ID1097
Feature activation+2.38e-03

Pos logit contributions
ar+0.002
 Nem+0.001
 Bub+0.001
orted+0.001
 bamboo+0.001
Neg logit contributions
urious-0.002
 acts-0.002
 typing-0.002
 explains-0.002
itive-0.002
Token stopped
Token ID5025
Feature activation+6.58e-02

Pos logit contributions
urious+0.004
 acts+0.004
 typing+0.003
itive+0.003
 recite+0.003
Neg logit contributions
ar-0.003
 Nem-0.003
 Bub-0.003
 bamboo-0.003
balls-0.003
Token working
Token ID1762
Feature activation+2.61e-02

Pos logit contributions
 acts+0.098
urious+0.097
ymes+0.091
 explains+0.090
itive+0.088
Neg logit contributions
 home-0.106
 Nem-0.097
 Bub-0.097
 Twe-0.094
ar-0.090
Token and
Token ID290
Feature activation-3.55e-04

Pos logit contributions
urious+0.051
 acts+0.050
 recite+0.046
 explains+0.046
itive+0.046
Neg logit contributions
ar-0.030
 Nem-0.028
 home-0.026
 Bub-0.025
 Twe-0.024
Token he
Token ID339
Feature activation+9.14e-03

Pos logit contributions
ar+0.000
 Nem+0.000
 Bub+0.000
inky+0.000
 bamboo+0.000
Neg logit contributions
urious-0.001
 acts-0.001
 typing-0.001
 recite-0.001
itive-0.001
Token couldn
Token ID3521
Feature activation+1.36e-03

Pos logit contributions
urious+0.015
 acts+0.015
itive+0.014
ymes+0.014
 recite+0.014
Neg logit contributions
ar-0.013
 Nem-0.012
 Bub-0.011
 home-0.011
 Twe-0.011
Token't
Token ID470
Feature activation+2.17e-02

Pos logit contributions
urious+0.003
 acts+0.003
itive+0.003
 typing+0.003
 recite+0.003
Neg logit contributions
ar-0.001
 Nem-0.001
 Bub-0.001
 home-0.001
 Twe-0.001
Tokenver
Token ID332
Feature activation+1.60e-02

Pos logit contributions
urious+0.014
 acts+0.013
 typing+0.013
 recite+0.012
itive+0.012
Neg logit contributions
ar-0.008
 Bub-0.007
 Nem-0.007
 Twe-0.006
ger-0.006
Token he
Token ID339
Feature activation+1.37e-02

Pos logit contributions
urious+0.032
 acts+0.030
itive+0.030
 recite+0.030
 typing+0.029
Neg logit contributions
ar-0.018
 Nem-0.016
 Bub-0.015
 home-0.014
 Twe-0.013
Token went
Token ID1816
Feature activation+4.36e-03

Pos logit contributions
urious+0.023
 typing+0.023
itive+0.023
ymes+0.022
 amaz+0.022
Neg logit contributions
ar-0.018
 Nem-0.017
 Bub-0.016
 Twe-0.015
ger-0.015
Token,
Token ID11
Feature activation+3.80e-02

Pos logit contributions
urious+0.009
 acts+0.009
 typing+0.008
itive+0.008
 explains+0.008
Neg logit contributions
ar-0.005
 Nem-0.004
 Bub-0.004
 home-0.003
balls-0.003
Token people
Token ID661
Feature activation+2.77e-02

Pos logit contributions
urious+0.102
asing+0.091
 acts+0.090
itive+0.090
ymes+0.089
Neg logit contributions
 Nem-0.022
ar-0.020
 Bub-0.020
 bamboo-0.017
 home-0.014
Token stopped
Token ID5025
Feature activation+6.57e-02

Pos logit contributions
urious+0.046
 acts+0.043
itive+0.041
 typing+0.040
 explains+0.040
Neg logit contributions
ar-0.040
 Nem-0.038
 Bub-0.036
 home-0.034
melon-0.033
Token to
Token ID284
Feature activation+2.27e-02

Pos logit contributions
urious+0.120
 acts+0.108
 amaz+0.105
itive+0.104
ymes+0.104
Neg logit contributions
 Nem-0.082
 Bub-0.079
ar-0.079
 home-0.078
 Twe-0.072
Token listen
Token ID6004
Feature activation+2.36e-02

Pos logit contributions
urious+0.038
asing+0.033
ymes+0.032
 typing+0.032
itive+0.031
Neg logit contributions
 Nem-0.035
 Bub-0.033
ar-0.033
 Twe-0.032
 home-0.032
Token.
Token ID13
Feature activation+1.91e-02

Pos logit contributions
urious+0.050
 acts+0.049
itive+0.046
 recite+0.046
 typing+0.046
Neg logit contributions
ar-0.023
 Nem-0.021
 home-0.019
 Bub-0.018
 Twe-0.016
Token
Token ID198
Feature activation+7.10e-03

Pos logit contributions
 acts+0.033
urious+0.033
 explains+0.030
 recite+0.030
 typing+0.030
Neg logit contributions
ar-0.026
 Nem-0.022
 home-0.021
melon-0.021
 Bub-0.021
Token
Token ID198
Feature activation+2.79e-02

Pos logit contributions
urious+0.013
 acts+0.013
 explains+0.012
 recite+0.012
 typing+0.012
Neg logit contributions
ar-0.009
 Nem-0.008
 Bub-0.007
melon-0.007
 home-0.007
Token girl
Token ID2576
Feature activation-4.63e-03

Pos logit contributions
ar+0.000
 Nem+0.000
 Bub+0.000
balls+0.000
orted+0.000
Neg logit contributions
 acts-0.000
urious-0.000
itive-0.000
 typing-0.000
ymes-0.000
Token named
Token ID3706
Feature activation-3.34e-02

Pos logit contributions
ar+0.005
 Nem+0.004
 Bub+0.004
 home+0.003
balls+0.003
Neg logit contributions
urious-0.010
 acts-0.009
 typing-0.009
itive-0.009
 recite-0.009
Token Lily
Token ID20037
Feature activation-1.84e-03

Pos logit contributions
ar+0.036
 Nem+0.030
 Bub+0.027
orted+0.025
ger+0.025
Neg logit contributions
urious-0.068
 acts-0.064
 typing-0.063
 recite-0.062
 explains-0.062
Token who
Token ID508
Feature activation-7.92e-03

Pos logit contributions
ar+0.002
 Nem+0.002
 Bub+0.002
 home+0.002
 Twe+0.001
Neg logit contributions
urious-0.004
 acts-0.004
 recite-0.003
 explains-0.003
 typing-0.003
Token loved
Token ID6151
Feature activation-4.78e-03

Pos logit contributions
ar+0.013
 Nem+0.012
 Bub+0.011
balls+0.010
melon+0.010
Neg logit contributions
urious-0.012
 acts-0.011
 typing-0.011
 recite-0.011
itive-0.011
Token pic
Token ID8301
Feature activation-7.52e-02

Pos logit contributions
ar+0.005
 Nem+0.005
 Bub+0.004
 home+0.004
 Twe+0.004
Neg logit contributions
urious-0.010
 acts-0.009
 typing-0.009
itive-0.009
 recite-0.009
Tokenn
Token ID77
Feature activation+6.78e-03

Pos logit contributions
ar+0.092
 Nem+0.090
 Bub+0.084
 home+0.082
balls+0.079
Neg logit contributions
urious-0.128
 recite-0.120
 acts-0.119
 explains-0.114
itive-0.113
Tokenics
Token ID873
Feature activation+9.15e-03

Pos logit contributions
 acts+0.013
urious+0.012
 amaz+0.012
 explains+0.011
 recite+0.011
Neg logit contributions
ar-0.008
 home-0.007
 Nem-0.007
balls-0.007
 Twe-0.007
Token.
Token ID13
Feature activation-9.53e-04

Pos logit contributions
urious+0.017
 acts+0.016
itive+0.015
 typing+0.015
 recite+0.015
Neg logit contributions
ar-0.012
 Nem-0.011
 Bub-0.010
 home-0.009
 Twe-0.009
Token She
Token ID1375
Feature activation+5.74e-04

Pos logit contributions
ar+0.001
 Nem+0.001
 Bub+0.001
 home+0.001
 Twe+0.001
Neg logit contributions
urious-0.002
 acts-0.002
itive-0.001
 typing-0.001
 recite-0.001
Token would
Token ID561
Feature activation+2.74e-02

Pos logit contributions
urious+0.001
 acts+0.001
itive+0.001
 typing+0.001
 recite+0.001
Neg logit contributions
ar-0.001
 Nem-0.001
 Bub-0.001
 home-0.001
 Twe-0.001
Token their
Token ID511
Feature activation-8.29e-03

Pos logit contributions
 Nem+0.000
ar+0.000
 home+0.000
 Bub+0.000
 bamboo+0.000
Neg logit contributions
urious-0.000
 acts-0.000
itive-0.000
 typing-0.000
 recite-0.000
Token salad
Token ID20698
Feature activation-6.40e-03

Pos logit contributions
 Nem+0.009
ar+0.009
 home+0.008
 Bub+0.008
melon+0.007
Neg logit contributions
urious-0.016
 typing-0.015
 acts-0.015
itive-0.015
 recite-0.015
Token and
Token ID290
Feature activation-1.02e-02

Pos logit contributions
ar+0.008
 Nem+0.008
 Bub+0.007
 Twe+0.007
orted+0.006
Neg logit contributions
urious-0.011
 acts-0.011
itive-0.010
 explains-0.010
 typing-0.010
Token have
Token ID423
Feature activation-2.73e-02

Pos logit contributions
 Nem+0.015
ar+0.014
 Bub+0.014
 home+0.013
 Twe+0.013
Neg logit contributions
urious-0.018
itive-0.016
 acts-0.016
ymes-0.016
 explains-0.015
Token happy
Token ID3772
Feature activation-1.22e-02

Pos logit contributions
ar+0.035
 Nem+0.032
 Bub+0.030
 home+0.028
 Twe+0.027
Neg logit contributions
urious-0.050
 acts-0.048
 typing-0.045
itive-0.045
 recite-0.045
Token pic
Token ID8301
Feature activation-7.43e-02

Pos logit contributions
ar+0.019
 Nem+0.018
 Bub+0.017
 home+0.016
 Twe+0.016
Neg logit contributions
urious-0.019
 acts-0.018
 typing-0.017
itive-0.017
 recite-0.017
Tokenn
Token ID77
Feature activation+9.29e-03

Pos logit contributions
 Nem+0.091
ar+0.085
 Bub+0.083
 home+0.082
 Twe+0.078
Neg logit contributions
urious-0.122
 acts-0.118
 recite-0.113
 explains-0.109
ymes-0.107
Tokenics
Token ID873
Feature activation-2.02e-02

Pos logit contributions
 acts+0.017
urious+0.017
 amaz+0.017
 recite+0.016
 explains+0.016
Neg logit contributions
ar-0.011
 Nem-0.010
 home-0.010
balls-0.010
 Bub-0.009
Token together
Token ID1978
Feature activation-8.91e-03

Pos logit contributions
ar+0.026
 Nem+0.024
 Bub+0.024
 Twe+0.021
melon+0.021
Neg logit contributions
 acts-0.034
urious-0.034
itive-0.032
ymes-0.031
 explains-0.031
Token.
Token ID13
Feature activation+4.38e-03

Pos logit contributions
ar+0.011
 Nem+0.010
 Bub+0.010
 Twe+0.009
 home+0.009
Neg logit contributions
urious-0.016
 acts-0.016
itive-0.015
 typing-0.015
 recite-0.015
Token<|endoftext|>
Token ID50256
Feature activation+3.16e-03

Pos logit contributions
urious+0.009
 acts+0.008
itive+0.008
 typing+0.008
 recite+0.008
Neg logit contributions
ar-0.005
 Nem-0.004
 Bub-0.004
 home-0.004
 Twe-0.004
Token in
Token ID287
Feature activation-7.68e-03

Pos logit contributions
ar+0.014
 Nem+0.013
 Bub+0.012
 home+0.011
 bamboo+0.011
Neg logit contributions
urious-0.022
 acts-0.021
 recite-0.020
 typing-0.020
itive-0.020
Token the
Token ID262
Feature activation-1.46e-02

Pos logit contributions
ar+0.009
 Nem+0.009
 home+0.008
 Bub+0.008
 Twe+0.007
Neg logit contributions
urious-0.015
itive-0.014
 acts-0.014
asing-0.014
 recite-0.014
Token garden
Token ID11376
Feature activation-3.05e-03

Pos logit contributions
ar+0.023
 Nem+0.021
 Bub+0.021
balls+0.019
orted+0.019
Neg logit contributions
urious-0.023
 acts-0.021
 typing-0.020
 explains-0.020
 recite-0.019
Token and
Token ID290
Feature activation-6.35e-03

Pos logit contributions
ar+0.003
 Nem+0.003
 home+0.003
 Bub+0.002
melon+0.002
Neg logit contributions
urious-0.007
 acts-0.006
 recite-0.006
 explains-0.006
itive-0.006
Token have
Token ID423
Feature activation-3.01e-02

Pos logit contributions
 Nem+0.009
 Bub+0.008
 home+0.008
ar+0.008
 Twe+0.008
Neg logit contributions
urious-0.011
itive-0.010
 acts-0.010
ymes-0.010
asing-0.010
Token pic
Token ID8301
Feature activation-7.18e-02

Pos logit contributions
ar+0.039
 Nem+0.035
 Bub+0.033
 home+0.030
melon+0.029
Neg logit contributions
urious-0.055
 acts-0.053
 typing-0.050
itive-0.050
 recite-0.050
Tokenn
Token ID77
Feature activation+2.56e-03

Pos logit contributions
ar+0.084
 Nem+0.083
 Bub+0.075
 home+0.072
melon+0.069
Neg logit contributions
urious-0.128
 acts-0.123
 recite-0.119
 explains-0.116
 typing-0.114
Tokenics
Token ID873
Feature activation-1.32e-02

Pos logit contributions
 acts+0.005
urious+0.005
 amaz+0.004
 recite+0.004
 explains+0.004
Neg logit contributions
ar-0.003
 Nem-0.003
 home-0.003
balls-0.003
 Bub-0.003
Token together
Token ID1978
Feature activation-9.04e-03

Pos logit contributions
ar+0.017
 Nem+0.016
 Bub+0.015
 Twe+0.013
melon+0.013
Neg logit contributions
urious-0.023
 acts-0.023
itive-0.022
 typing-0.021
 recite-0.021
Token.
Token ID13
Feature activation+7.25e-03

Pos logit contributions
ar+0.012
 Nem+0.011
 Bub+0.011
 Twe+0.010
orted+0.009
Neg logit contributions
urious-0.016
 acts-0.015
itive-0.014
 typing-0.014
 recite-0.014
Token
Token ID220
Feature activation+8.52e-03

Pos logit contributions
urious+0.011
 acts+0.011
 explains+0.011
 typing+0.010
 recite+0.010
Neg logit contributions
ar-0.011
balls-0.009
ger-0.009
 Nem-0.009
 Bub-0.009
Token liked
Token ID8288
Feature activation+1.73e-03

Pos logit contributions
urious+0.004
 acts+0.004
itive+0.004
 explains+0.004
 typing+0.004
Neg logit contributions
ar-0.006
 Nem-0.005
 Bub-0.005
balls-0.005
 bamboo-0.005
Token to
Token ID284
Feature activation-4.99e-03

Pos logit contributions
urious+0.004
 acts+0.004
itive+0.004
 explains+0.004
 recite+0.004
Neg logit contributions
 Nem-0.001
ar-0.001
 Bub-0.001
 home-0.001
 Twe-0.001
Token play
Token ID711
Feature activation-1.04e-02

Pos logit contributions
ar+0.007
 Nem+0.007
 Bub+0.006
 home+0.006
 Twe+0.006
Neg logit contributions
urious-0.008
 acts-0.008
itive-0.007
 typing-0.007
 explains-0.007
Token and
Token ID290
Feature activation-5.14e-03

Pos logit contributions
ar+0.013
 Nem+0.012
 Bub+0.011
 home+0.010
 Twe+0.010
Neg logit contributions
urious-0.019
 acts-0.019
itive-0.018
 typing-0.018
 recite-0.018
Token have
Token ID423
Feature activation-2.63e-02

Pos logit contributions
ar+0.008
 Nem+0.008
 Bub+0.008
 home+0.007
 Twe+0.007
Neg logit contributions
urious-0.008
 acts-0.007
itive-0.007
 explains-0.007
ymes-0.007
Token pic
Token ID8301
Feature activation-7.10e-02

Pos logit contributions
ar+0.030
 Nem+0.028
 Bub+0.026
 home+0.024
 Twe+0.023
Neg logit contributions
urious-0.051
 acts-0.050
itive-0.047
 typing-0.047
 recite-0.047
Tokenn
Token ID77
Feature activation+5.10e-03

Pos logit contributions
ar+0.084
 Nem+0.083
 Bub+0.075
 home+0.072
 Twe+0.070
Neg logit contributions
urious-0.124
 acts-0.119
 recite-0.117
 explains-0.113
 typing-0.111
Tokenics
Token ID873
Feature activation-8.98e-03

Pos logit contributions
 acts+0.009
urious+0.009
 amaz+0.008
 recite+0.008
 explains+0.008
Neg logit contributions
ar-0.007
 Nem-0.006
 Bub-0.006
 home-0.006
balls-0.006
Token.
Token ID13
Feature activation-1.42e-03

Pos logit contributions
ar+0.012
 Nem+0.011
 Bub+0.010
 Twe+0.009
melon+0.009
Neg logit contributions
urious-0.015
 acts-0.015
itive-0.014
 typing-0.014
 recite-0.014
Token One
Token ID1881
Feature activation+7.07e-03

Pos logit contributions
ar+0.002
 Nem+0.002
 Bub+0.002
 bamboo+0.002
 Twe+0.001
Neg logit contributions
urious-0.003
 acts-0.002
 explains-0.002
 typing-0.002
 recite-0.002
Token day
Token ID1110
Feature activation+3.56e-03

Pos logit contributions
urious+0.016
 acts+0.016
 typing+0.015
itive+0.015
 recite+0.015
Neg logit contributions
ar-0.006
 Nem-0.005
 Bub-0.004
 home-0.004
 Twe-0.004
Token favourite
Token ID12507
Feature activation-1.42e-02

Pos logit contributions
ar+0.033
 Nem+0.030
 Bub+0.029
 Twe+0.026
 home+0.026
Neg logit contributions
urious-0.037
 acts-0.036
 typing-0.033
 recite-0.033
itive-0.033
Token place
Token ID1295
Feature activation-1.79e-02

Pos logit contributions
ar+0.020
 Nem+0.018
 Bub+0.018
orted+0.017
 Twe+0.016
Neg logit contributions
 acts-0.024
urious-0.024
itive-0.022
 recite-0.021
 typing-0.021
Token to
Token ID284
Feature activation-8.15e-03

Pos logit contributions
ar+0.024
 Nem+0.022
 Bub+0.021
 Twe+0.019
 bamboo+0.019
Neg logit contributions
urious-0.031
 acts-0.030
 typing-0.029
 recite-0.028
 explains-0.028
Token go
Token ID467
Feature activation-1.50e-03

Pos logit contributions
ar+0.012
 Nem+0.012
 Bub+0.011
 home+0.011
 bamboo+0.010
Neg logit contributions
urious-0.014
 acts-0.013
itive-0.012
 typing-0.012
ymes-0.012
Token for
Token ID329
Feature activation-1.53e-02

Pos logit contributions
ar+0.002
 Nem+0.002
 Bub+0.002
 home+0.001
 Twe+0.001
Neg logit contributions
urious-0.003
 acts-0.003
 typing-0.003
itive-0.003
 recite-0.003
Token pic
Token ID8301
Feature activation-7.04e-02

Pos logit contributions
ar+0.022
melon+0.020
 Nem+0.020
orted+0.019
 Bub+0.019
Neg logit contributions
 acts-0.023
urious-0.023
 recite-0.022
 typing-0.022
 explains-0.021
Tokenn
Token ID77
Feature activation+1.05e-03

Pos logit contributions
ar+0.078
 Nem+0.075
 Bub+0.070
 bamboo+0.062
 home+0.061
Neg logit contributions
urious-0.133
 acts-0.128
 recite-0.125
itive-0.121
 typing-0.121
Tokenics
Token ID873
Feature activation-2.20e-02

Pos logit contributions
urious+0.002
 acts+0.002
 explains+0.002
 recite+0.002
 amaz+0.002
Neg logit contributions
ar-0.001
 Nem-0.001
 home-0.001
 Bub-0.001
 Twe-0.001
Token.
Token ID13
Feature activation+5.69e-03

Pos logit contributions
ar+0.028
 Nem+0.026
 Bub+0.025
 Twe+0.022
melon+0.022
Neg logit contributions
urious-0.039
 acts-0.038
itive-0.036
 recite-0.036
 typing-0.035
Token<|endoftext|>
Token ID50256
Feature activation+1.08e-02

Pos logit contributions
urious+0.009
 acts+0.009
 explains+0.008
 recite+0.008
 typing+0.008
Neg logit contributions
ar-0.008
balls-0.007
ger-0.007
 Nem-0.007
melon-0.007
TokenOnce
Token ID7454
Feature activation-6.21e-03

Pos logit contributions
urious+0.020
 acts+0.019
 typing+0.018
 recite+0.018
 explains+0.018
Neg logit contributions
ar-0.014
 Nem-0.012
 Bub-0.012
 home-0.011
 Twe-0.010
Token with
Token ID351
Feature activation+1.55e-03

Pos logit contributions
ar+0.017
 Nem+0.016
 Bub+0.015
 Twe+0.014
ger+0.014
Neg logit contributions
urious-0.019
 acts-0.018
 typing-0.018
 recite-0.017
 explains-0.017
Token a
Token ID257
Feature activation-1.76e-02

Pos logit contributions
urious+0.003
 acts+0.003
itive+0.003
 typing+0.003
 recite+0.003
Neg logit contributions
ar-0.002
 Nem-0.002
 Bub-0.002
 home-0.002
 Twe-0.002
Token present
Token ID1944
Feature activation-2.73e-02

Pos logit contributions
ar+0.029
 Nem+0.027
 Bub+0.026
 Twe+0.024
orted+0.024
Neg logit contributions
urious-0.025
 acts-0.024
 recite-0.023
 typing-0.023
 explains-0.023
Token -
Token ID532
Feature activation-5.67e-05

Pos logit contributions
ar+0.041
 Nem+0.039
 Bub+0.038
 Twe+0.035
orted+0.034
Neg logit contributions
 acts-0.038
 typing-0.038
urious-0.038
itive-0.037
 recite-0.037
Token a
Token ID257
Feature activation-1.93e-02

Pos logit contributions
ar+0.000
 Nem+0.000
 Bub+0.000
 home+0.000
 Twe+0.000
Neg logit contributions
urious-0.000
 acts-0.000
itive-0.000
 recite-0.000
 typing-0.000
Token brand
Token ID4508
Feature activation-7.01e-02

Pos logit contributions
ar+0.033
 Nem+0.030
 Bub+0.028
orted+0.028
 Twe+0.026
Neg logit contributions
urious-0.026
 acts-0.025
 recite-0.025
 typing-0.025
 explains-0.024
Token new
Token ID649
Feature activation-3.61e-02

Pos logit contributions
ar+0.141
balls+0.134
 Nem+0.132
bank+0.132
 Bub+0.132
Neg logit contributions
urious-0.065
 explains-0.059
 acts-0.058
 types-0.055
 amaz-0.052
Token battery
Token ID6555
Feature activation-2.83e-02

Pos logit contributions
ar+0.060
 Nem+0.054
 Bub+0.051
orted+0.049
balls+0.048
Neg logit contributions
 acts-0.053
urious-0.050
 amaz-0.049
 recite-0.047
 explains-0.046
Token-
Token ID12
Feature activation-1.64e-02

Pos logit contributions
ar+0.041
 Nem+0.040
 Bub+0.038
 bamboo+0.036
 Twe+0.036
Neg logit contributions
 acts-0.043
 typing-0.039
urious-0.038
 explains-0.038
itive-0.037
Tokenpowered
Token ID12293
Feature activation-2.22e-02

Pos logit contributions
ar+0.022
 Nem+0.022
 thirst+0.021
melon+0.021
balls+0.020
Neg logit contributions
 acts-0.023
urious-0.023
asing-0.023
 explains-0.022
pled-0.022
Token car
Token ID1097
Feature activation-1.25e-02

Pos logit contributions
ar+0.032
 Nem+0.028
orted+0.027
 Bub+0.027
 Twe+0.024
Neg logit contributions
 acts-0.036
urious-0.033
 typing-0.033
 recite-0.032
 explains-0.032
Token the
Token ID262
Feature activation-8.51e-03

Pos logit contributions
 Nem+0.016
 home+0.016
 Bub+0.016
ar+0.015
 Twe+0.014
Neg logit contributions
urious-0.026
itive-0.025
 acts-0.025
ymes-0.024
asing-0.024
Token store
Token ID3650
Feature activation-9.85e-03

Pos logit contributions
ar+0.017
 Nem+0.015
 Bub+0.015
orted+0.015
aun+0.014
Neg logit contributions
urious-0.010
 typing-0.009
 acts-0.009
 recite-0.008
 explains-0.008
Token and
Token ID290
Feature activation-1.89e-02

Pos logit contributions
ar+0.015
 Nem+0.014
 Bub+0.013
 home+0.012
 bamboo+0.012
Neg logit contributions
urious-0.016
 acts-0.015
 typing-0.014
itive-0.014
 explains-0.014
Token bought
Token ID5839
Feature activation-8.67e-03

Pos logit contributions
ar+0.029
 Nem+0.025
 bamboo+0.023
 Bub+0.023
inky+0.023
Neg logit contributions
urious-0.029
 typing-0.028
 acts-0.027
 recite-0.026
 explains-0.026
Token a
Token ID257
Feature activation-3.67e-02

Pos logit contributions
ar+0.012
 Nem+0.011
 Bub+0.010
orted+0.009
 home+0.009
Neg logit contributions
urious-0.015
 acts-0.014
 explains-0.013
 typing-0.013
itive-0.013
Token brand
Token ID4508
Feature activation-6.86e-02

Pos logit contributions
ar+0.061
 Nem+0.056
 Bub+0.052
orted+0.051
 Twe+0.050
Neg logit contributions
urious-0.051
 acts-0.049
 typing-0.048
 recite-0.047
 explains-0.047
Token new
Token ID649
Feature activation-4.08e-02

Pos logit contributions
ar+0.133
 Nem+0.123
 Pink+0.121
orted+0.119
age+0.119
Neg logit contributions
urious-0.074
 explains-0.063
 acts-0.060
 types-0.058
 encourages-0.054
Token sp
Token ID599
Feature activation-1.41e-02

Pos logit contributions
ar+0.066
 Nem+0.059
orted+0.056
 Bub+0.054
 Twe+0.054
Neg logit contributions
urious-0.060
 acts-0.059
 explains-0.054
 amaz-0.053
 typing-0.053
Tokenade
Token ID671
Feature activation-3.11e-02

Pos logit contributions
ar+0.015
 Nem+0.015
 Bub+0.014
 Twe+0.013
 Pink+0.012
Neg logit contributions
urious-0.028
 acts-0.026
 typing-0.025
itive-0.025
 explains-0.025
Token.
Token ID13
Feature activation-1.01e-02

Pos logit contributions
ar+0.049
 Nem+0.044
orted+0.042
 Bub+0.042
Flo+0.041
Neg logit contributions
 acts-0.041
 typing-0.039
urious-0.038
 explains-0.038
asing-0.036
Token Tim
Token ID5045
Feature activation+1.79e-02

Pos logit contributions
ar+0.017
oster+0.014
ger+0.014
 away+0.012
balls+0.012
Neg logit contributions
urious-0.014
 typing-0.013
 explains-0.013
 amaz-0.013
 Taking-0.013
Token near
Token ID1474
Feature activation+9.40e-03

Pos logit contributions
ar+0.004
 Nem+0.004
 Bub+0.003
melon+0.003
 bamboo+0.003
Neg logit contributions
urious-0.004
 acts-0.004
itive-0.004
 typing-0.004
 recite-0.004
Token the
Token ID262
Feature activation+6.34e-03

Pos logit contributions
urious+0.020
itive+0.019
asing+0.018
 recite+0.018
 typing+0.018
Neg logit contributions
 home-0.009
 Nem-0.009
ar-0.008
 Twe-0.008
 Bub-0.008
Token cliff
Token ID19516
Feature activation+1.67e-02

Pos logit contributions
urious+0.011
 acts+0.011
itive+0.010
 typing+0.010
 recite+0.010
Neg logit contributions
ar-0.009
 Nem-0.008
 Bub-0.007
 home-0.007
 Twe-0.007
Token and
Token ID290
Feature activation+7.19e-03

Pos logit contributions
urious+0.033
 acts+0.032
 recite+0.031
itive+0.031
 typing+0.031
Neg logit contributions
ar-0.019
 Nem-0.017
 home-0.016
 Bub-0.015
melon-0.014
Token have
Token ID423
Feature activation-2.46e-02

Pos logit contributions
urious+0.013
itive+0.012
ymes+0.012
 acts+0.012
asing+0.011
Neg logit contributions
 Nem-0.010
ar-0.010
 home-0.009
 Bub-0.009
 Twe-0.009
Token pic
Token ID8301
Feature activation-6.84e-02

Pos logit contributions
ar+0.028
 Nem+0.026
 Bub+0.024
 home+0.023
 bamboo+0.022
Neg logit contributions
urious-0.048
 acts-0.046
itive-0.044
 typing-0.044
 recite-0.044
Tokenn
Token ID77
Feature activation+6.88e-03

Pos logit contributions
 Nem+0.081
ar+0.080
 Bub+0.073
 home+0.069
melon+0.068
Neg logit contributions
urious-0.118
 acts-0.115
 recite-0.112
 explains-0.107
 typing-0.105
Tokenics
Token ID873
Feature activation-9.17e-03

Pos logit contributions
urious+0.012
 acts+0.012
 recite+0.011
 amaz+0.011
 explains+0.011
Neg logit contributions
ar-0.009
 Nem-0.008
 home-0.008
 Bub-0.008
balls-0.008
Token.
Token ID13
Feature activation+7.01e-03

Pos logit contributions
ar+0.012
 Nem+0.011
 Bub+0.010
 Twe+0.009
 home+0.009
Neg logit contributions
urious-0.016
 acts-0.016
itive-0.015
 typing-0.015
 recite-0.015
Token One
Token ID1881
Feature activation+1.31e-02

Pos logit contributions
urious+0.011
 acts+0.010
 explains+0.010
 typing+0.009
 recite+0.009
Neg logit contributions
ar-0.012
balls-0.010
ger-0.009
 Nem-0.009
 Bub-0.009
Token day
Token ID1110
Feature activation+1.49e-03

Pos logit contributions
urious+0.030
 acts+0.030
 typing+0.029
 recite+0.029
itive+0.029
Neg logit contributions
ar-0.010
 Nem-0.009
 home-0.007
 Bub-0.007
 bamboo-0.006
Token so
Token ID523
Feature activation+7.46e-04

Pos logit contributions
urious+0.002
 acts+0.002
itive+0.002
 typing+0.002
 recite+0.002
Neg logit contributions
ar-0.001
 Nem-0.001
 Bub-0.001
 home-0.001
 Twe-0.001
Token excited
Token ID6568
Feature activation-1.34e-02

Pos logit contributions
urious+0.001
 acts+0.001
 recite+0.001
ymes+0.001
itive+0.001
Neg logit contributions
 Nem-0.001
ar-0.001
 Bub-0.001
 home-0.001
 Twe-0.001
Token!
Token ID0
Feature activation-9.27e-03

Pos logit contributions
ar+0.018
 Nem+0.017
 Bub+0.016
 bamboo+0.014
melon+0.014
Neg logit contributions
urious-0.024
 acts-0.023
itive-0.021
 typing-0.021
 explains-0.021
Token She
Token ID1375
Feature activation-1.09e-02

Pos logit contributions
ar+0.017
 Nem+0.014
 bamboo+0.014
 Bub+0.013
 thirst+0.013
Neg logit contributions
urious-0.013
 typing-0.012
 explains-0.011
 acts-0.011
 recite-0.011
Token loved
Token ID6151
Feature activation-1.67e-02

Pos logit contributions
ar+0.018
 Nem+0.017
 Bub+0.016
 home+0.015
 Twe+0.015
Neg logit contributions
urious-0.016
 acts-0.015
itive-0.014
 recite-0.014
 typing-0.014
Token pic
Token ID8301
Feature activation-6.81e-02

Pos logit contributions
ar+0.023
 Nem+0.023
 Bub+0.021
 home+0.021
 bamboo+0.020
Neg logit contributions
urious-0.028
 acts-0.027
itive-0.025
 recite-0.025
 explains-0.025
Tokenn
Token ID77
Feature activation+4.19e-03

Pos logit contributions
ar+0.079
 Nem+0.076
 Bub+0.071
 home+0.069
ger+0.064
Neg logit contributions
urious-0.122
 acts-0.116
 recite-0.115
 explains-0.112
 typing-0.112
Tokenics
Token ID873
Feature activation-4.99e-03

Pos logit contributions
 acts+0.009
urious+0.008
 amaz+0.008
 recite+0.008
 explains+0.008
Neg logit contributions
ar-0.004
 home-0.004
balls-0.004
 Nem-0.004
oe-0.004
Token and
Token ID290
Feature activation-1.44e-02

Pos logit contributions
ar+0.007
 Nem+0.007
 Bub+0.007
 home+0.006
 Twe+0.006
Neg logit contributions
urious-0.008
 acts-0.008
 typing-0.007
itive-0.007
 recite-0.007
Token spending
Token ID4581
Feature activation-2.98e-02

Pos logit contributions
 Nem+0.021
ar+0.020
 Bub+0.019
 Twe+0.018
 home+0.018
Neg logit contributions
urious-0.025
 acts-0.023
ymes-0.023
itive-0.022
asing-0.022
Token time
Token ID640
Feature activation-1.13e-03

Pos logit contributions
ar+0.046
 Nem+0.043
 Bub+0.042
melon+0.040
inky+0.040
Neg logit contributions
urious-0.042
 acts-0.042
 typing-0.039
 recite-0.039
 reads-0.039
Token She
Token ID1375
Feature activation+1.96e-03

Pos logit contributions
ar+0.006
 Nem+0.005
 Bub+0.005
 home+0.005
 Twe+0.005
Neg logit contributions
urious-0.007
 acts-0.007
 typing-0.006
itive-0.006
 recite-0.006
Token loved
Token ID6151
Feature activation+4.53e-03

Pos logit contributions
urious+0.003
 acts+0.003
 typing+0.002
itive+0.002
 recite+0.002
Neg logit contributions
ar-0.003
 Nem-0.003
 Bub-0.003
 home-0.003
melon-0.003
Token to
Token ID284
Feature activation-5.98e-03

Pos logit contributions
urious+0.010
 acts+0.009
itive+0.009
 explains+0.009
 recite+0.009
Neg logit contributions
 Nem-0.004
ar-0.004
 Bub-0.004
 home-0.004
 Twe-0.003
Token go
Token ID467
Feature activation+1.09e-02

Pos logit contributions
 Nem+0.007
ar+0.007
 Bub+0.007
 home+0.007
 Twe+0.006
Neg logit contributions
urious-0.011
ymes-0.010
 acts-0.010
asing-0.010
itive-0.010
Token on
Token ID319
Feature activation+2.05e-03

Pos logit contributions
urious+0.023
 acts+0.022
itive+0.021
ymes+0.020
 explains+0.020
Neg logit contributions
ar-0.012
 Nem-0.011
 home-0.011
 Bub-0.010
 bamboo-0.009
Token pic
Token ID8301
Feature activation-6.79e-02

Pos logit contributions
urious+0.003
 acts+0.003
asing+0.003
itive+0.003
 explains+0.003
Neg logit contributions
 Nem-0.003
 home-0.003
ar-0.003
 bamboo-0.003
 Bub-0.003
Tokenn
Token ID77
Feature activation+2.42e-02

Pos logit contributions
ar+0.077
 Nem+0.072
 Bub+0.066
 home+0.059
melon+0.059
Neg logit contributions
urious-0.130
 acts-0.124
 recite-0.120
 typing-0.119
 explains-0.118
Tokenics
Token ID873
Feature activation+1.21e-02

Pos logit contributions
 acts+0.046
urious+0.044
 amaz+0.042
ymes+0.042
 explains+0.041
Neg logit contributions
ar-0.027
 home-0.026
 Nem-0.025
balls-0.024
 Twe-0.024
Token with
Token ID351
Feature activation+1.11e-02

Pos logit contributions
urious+0.024
 acts+0.023
 explains+0.022
 typing+0.022
itive+0.022
Neg logit contributions
ar-0.014
 Nem-0.013
 home-0.012
 Bub-0.011
 Twe-0.011
Token her
Token ID607
Feature activation+7.00e-03

Pos logit contributions
urious+0.017
 acts+0.016
 typing+0.015
 recite+0.015
itive+0.015
Neg logit contributions
ar-0.017
 Nem-0.016
 Bub-0.015
 home-0.014
balls-0.014
Token family
Token ID1641
Feature activation+2.27e-02

Pos logit contributions
urious+0.012
 acts+0.011
itive+0.011
 typing+0.011
 recite+0.011
Neg logit contributions
ar-0.010
 Nem-0.009
 Bub-0.009
 home-0.008
 Twe-0.008
Token a
Token ID257
Feature activation+2.15e-02

Pos logit contributions
urious+0.023
 acts+0.021
itive+0.021
 typing+0.020
 recite+0.020
Neg logit contributions
ar-0.029
 Nem-0.023
orted-0.023
melon-0.023
bank-0.023
Token family
Token ID1641
Feature activation+8.44e-03

Pos logit contributions
urious+0.030
 acts+0.029
itive+0.026
 typing+0.026
 explains+0.026
Neg logit contributions
ar-0.037
 Nem-0.035
 Bub-0.033
 home-0.032
 Twe-0.031
Token who
Token ID508
Feature activation-2.67e-03

Pos logit contributions
urious+0.012
 acts+0.011
 typing+0.011
itive+0.010
 recite+0.010
Neg logit contributions
ar-0.015
 Nem-0.013
 Bub-0.013
balls-0.012
orted-0.012
Token really
Token ID1107
Feature activation-8.39e-03

Pos logit contributions
ar+0.004
 Nem+0.004
 Bub+0.004
 home+0.004
balls+0.004
Neg logit contributions
 acts-0.004
urious-0.004
 typing-0.004
itive-0.004
 recite-0.003
Token enjoyed
Token ID8359
Feature activation+2.27e-03

Pos logit contributions
ar+0.015
 Nem+0.014
 Bub+0.014
melon+0.013
 bamboo+0.013
Neg logit contributions
 acts-0.010
urious-0.010
itive-0.009
 typing-0.009
 recite-0.009
Token pic
Token ID8301
Feature activation-6.75e-02

Pos logit contributions
urious+0.004
 acts+0.004
 explains+0.003
 recite+0.003
itive+0.003
Neg logit contributions
ar-0.003
 Nem-0.003
 Bub-0.003
 home-0.003
 Twe-0.003
Tokenn
Token ID77
Feature activation+6.73e-03

Pos logit contributions
ar+0.078
 Nem+0.073
 Bub+0.067
 home+0.066
orted+0.061
Neg logit contributions
urious-0.123
 acts-0.119
 recite-0.116
itive-0.114
 typing-0.114
Tokenics
Token ID873
Feature activation+5.61e-03

Pos logit contributions
 acts+0.012
urious+0.012
 amaz+0.011
 explains+0.011
 recite+0.011
Neg logit contributions
ar-0.008
 Nem-0.008
 home-0.007
balls-0.007
 Bub-0.007
Token.
Token ID13
Feature activation-6.56e-03

Pos logit contributions
urious+0.010
 acts+0.009
itive+0.009
 typing+0.009
 recite+0.009
Neg logit contributions
ar-0.007
 Nem-0.007
 Bub-0.007
 home-0.006
 Twe-0.006
Token They
Token ID1119
Feature activation+1.47e-02

Pos logit contributions
ar+0.011
balls+0.009
 home+0.008
ger+0.008
 away+0.008
Neg logit contributions
 acts-0.010
urious-0.010
 typing-0.009
 explains-0.009
 recite-0.009
Token loved
Token ID6151
Feature activation+7.75e-03

Pos logit contributions
urious+0.025
 acts+0.024
 typing+0.022
 recite+0.022
 explains+0.022
Neg logit contributions
ar-0.021
 Nem-0.020
 Bub-0.018
 home-0.017
 Twe-0.017
Token box
Token ID3091
Feature activation+4.07e-03

Pos logit contributions
urious+0.006
 acts+0.006
 typing+0.005
 explains+0.005
 recite+0.005
Neg logit contributions
ar-0.005
 Nem-0.005
 Bub-0.004
 Twe-0.004
melon-0.004
Token and
Token ID290
Feature activation-2.59e-03

Pos logit contributions
urious+0.007
 acts+0.007
itive+0.007
 recite+0.007
ymes+0.007
Neg logit contributions
ar-0.005
 home-0.005
 Nem-0.005
 Bub-0.004
melon-0.004
Token inside
Token ID2641
Feature activation-1.07e-02

Pos logit contributions
ar+0.003
 Nem+0.003
 Bub+0.003
 home+0.003
 Twe+0.003
Neg logit contributions
urious-0.005
 acts-0.004
itive-0.004
ymes-0.004
asing-0.004
Token was
Token ID373
Feature activation-1.50e-02

Pos logit contributions
ar+0.011
 Nem+0.010
 Bub+0.009
 home+0.009
 Twe+0.008
Neg logit contributions
urious-0.023
 acts-0.021
itive-0.021
 recite-0.020
ymes-0.020
Token a
Token ID257
Feature activation-1.21e-02

Pos logit contributions
ar+0.022
 Nem+0.020
 Bub+0.018
orted+0.017
ger+0.017
Neg logit contributions
 typing-0.024
urious-0.024
 acts-0.023
 explains-0.023
 recite-0.023
Token brand
Token ID4508
Feature activation-6.72e-02

Pos logit contributions
ar+0.021
 Nem+0.019
 Bub+0.018
orted+0.017
 Twe+0.017
Neg logit contributions
urious-0.017
 acts-0.016
 typing-0.016
 recite-0.015
 explains-0.015
Token new
Token ID649
Feature activation-2.51e-02

Pos logit contributions
ar+0.130
 Nem+0.126
 Bub+0.125
bank+0.122
balls+0.120
Neg logit contributions
urious-0.074
 explains-0.062
 acts-0.058
 requests-0.054
ymes-0.053
Token toy
Token ID13373
Feature activation-2.89e-02

Pos logit contributions
ar+0.039
 Nem+0.036
 Bub+0.034
orted+0.032
melon+0.031
Neg logit contributions
 acts-0.039
urious-0.039
 typing-0.037
 explains-0.037
 recite-0.036
Token with
Token ID351
Feature activation-1.28e-02

Pos logit contributions
ar+0.041
 Nem+0.039
orted+0.036
 Bub+0.035
 thirst+0.035
Neg logit contributions
 acts-0.045
 typing-0.044
 explains-0.042
itive-0.042
urious-0.041
Token a
Token ID257
Feature activation-1.02e-02

Pos logit contributions
ar+0.021
 Nem+0.018
 Bub+0.017
orted+0.017
ger+0.016
Neg logit contributions
 acts-0.019
urious-0.018
 typing-0.018
 explains-0.017
 recite-0.017
Token bright
Token ID6016
Feature activation-1.48e-02

Pos logit contributions
ar+0.018
 Nem+0.017
 Bub+0.016
orted+0.015
melon+0.015
Neg logit contributions
 typing-0.013
urious-0.013
 recite-0.013
 acts-0.013
 explains-0.012
Token floor
Token ID4314
Feature activation-1.80e-03

Pos logit contributions
ar+0.010
 Nem+0.009
 Bub+0.009
orted+0.008
 Twe+0.008
Neg logit contributions
urious-0.007
 acts-0.007
 typing-0.007
itive-0.006
 recite-0.006
Token.
Token ID13
Feature activation-2.19e-02

Pos logit contributions
ar+0.002
 Nem+0.002
 home+0.001
 Bub+0.001
balls+0.001
Neg logit contributions
urious-0.004
 acts-0.004
 recite-0.003
itive-0.003
 explains-0.003
Token\n
Token ID198
Feature activation-9.48e-03

Pos logit contributions
ar+0.038
 Nem+0.034
melon+0.031
 Bub+0.031
 home+0.031
Neg logit contributions
urious-0.030
 acts-0.030
 explains-0.028
 typing-0.028
 recite-0.026
Token\n
Token ID198
Feature activation-5.92e-03

Pos logit contributions
ar+0.012
 Nem+0.011
 Bub+0.010
melon+0.010
 home+0.009
Neg logit contributions
urious-0.017
 acts-0.016
 typing-0.016
 explains-0.016
 recite-0.015
Token"
Token ID1
Feature activation-1.85e-02

Pos logit contributions
ar+0.008
 Nem+0.008
 Bub+0.007
 home+0.007
 Twe+0.007
Neg logit contributions
urious-0.010
 acts-0.010
itive-0.009
 typing-0.009
 recite-0.009
TokenOh
Token ID5812
Feature activation-6.68e-02

Pos logit contributions
ar+0.027
 Nem+0.025
 Bub+0.024
 home+0.022
 Twe+0.022
Neg logit contributions
urious-0.030
 acts-0.029
itive-0.027
 typing-0.027
 recite-0.027
Token,
Token ID11
Feature activation-4.98e-02

Pos logit contributions
 Nem+0.110
melon+0.108
bank+0.105
orted+0.104
ger+0.102
Neg logit contributions
 amaz-0.082
 hears-0.081
 acts-0.078
itive-0.076
urious-0.075
Token L
Token ID406
Feature activation-3.55e-02

Pos logit contributions
ar+0.074
 Nem+0.070
orted+0.064
 Bub+0.063
melon+0.062
Neg logit contributions
 acts-0.076
urious-0.075
 typing-0.070
 explains-0.069
itive-0.068
Tokenila
Token ID10102
Feature activation-3.05e-02

Pos logit contributions
ar+0.065
 Nem+0.061
 Bub+0.058
 home+0.056
 Twe+0.055
Neg logit contributions
urious-0.046
 acts-0.043
itive-0.040
 typing-0.040
 recite-0.039
Token,
Token ID11
Feature activation-4.49e-02

Pos logit contributions
ar+0.031
 Nem+0.027
 Bub+0.025
 home+0.023
 Twe+0.022
Neg logit contributions
urious-0.065
 acts-0.062
 typing-0.059
 recite-0.059
itive-0.059
Token what
Token ID644
Feature activation-6.01e-03

Pos logit contributions
ar+0.072
 Nem+0.067
orted+0.063
 Bub+0.062
ger+0.060
Neg logit contributions
 acts-0.064
urious-0.063
 typing-0.060
 explains-0.058
 amaz-0.058
Tokenboard
Token ID3526
Feature activation-1.85e-02

Pos logit contributions
ar+0.027
 Nem+0.027
 Bub+0.026
 Twe+0.025
 thirst+0.023
Neg logit contributions
urious-0.021
 acts-0.020
 explains-0.019
 typing-0.019
pled-0.018
Token.
Token ID13
Feature activation-4.55e-03

Pos logit contributions
ar+0.023
 Nem+0.022
 Bub+0.020
 home+0.018
 Twe+0.018
Neg logit contributions
urious-0.034
 acts-0.033
 typing-0.031
itive-0.031
 explains-0.031
Token He
Token ID679
Feature activation-8.09e-03

Pos logit contributions
ar+0.006
 Nem+0.005
 Bub+0.005
 home+0.004
 bamboo+0.004
Neg logit contributions
urious-0.009
 acts-0.008
 typing-0.008
 explains-0.008
 recite-0.008
Token chose
Token ID7690
Feature activation-8.92e-03

Pos logit contributions
ar+0.011
 Nem+0.011
 Bub+0.010
 home+0.009
balls+0.009
Neg logit contributions
urious-0.014
 acts-0.013
itive-0.013
ymes-0.013
 recite-0.012
Token a
Token ID257
Feature activation-3.05e-02

Pos logit contributions
ar+0.015
orted+0.012
inky+0.012
 Nem+0.012
 Bub+0.012
Neg logit contributions
urious-0.012
 acts-0.012
 typing-0.012
pled-0.011
 explains-0.011
Token brand
Token ID4508
Feature activation-6.67e-02

Pos logit contributions
ar+0.057
orted+0.051
 Nem+0.050
 Bub+0.047
ger+0.046
Neg logit contributions
 recite-0.035
 acts-0.034
 amaz-0.034
 typing-0.034
urious-0.034
Token new
Token ID649
Feature activation-2.54e-02

Pos logit contributions
ar+0.136
 Bub+0.125
 Nem+0.124
orted+0.122
age+0.121
Neg logit contributions
urious-0.062
 explains-0.058
 acts-0.057
 types-0.056
 encourages-0.047
Token box
Token ID3091
Feature activation-1.60e-02

Pos logit contributions
ar+0.042
 Nem+0.035
 Bub+0.034
orted+0.034
inky+0.031
Neg logit contributions
 acts-0.039
urious-0.038
 amaz-0.035
 explains-0.035
 recite-0.034
Token of
Token ID286
Feature activation-6.51e-03

Pos logit contributions
ar+0.024
 Nem+0.023
 Bub+0.022
 Twe+0.020
 bamboo+0.020
Neg logit contributions
urious-0.023
 acts-0.023
 typing-0.022
 recite-0.022
itive-0.022
Token clay
Token ID21558
Feature activation-1.75e-02

Pos logit contributions
ar+0.010
 Nem+0.009
 Bub+0.008
orted+0.008
balls+0.008
Neg logit contributions
 acts-0.010
urious-0.010
 typing-0.009
 explains-0.009
 recite-0.009
Token and
Token ID290
Feature activation-1.23e-02

Pos logit contributions
ar+0.026
 Nem+0.024
 Bub+0.023
orted+0.022
 Twe+0.022
Neg logit contributions
 acts-0.027
urious-0.025
 typing-0.024
itive-0.024
 explains-0.024
Token sorry
Token ID7926
Feature activation-6.71e-03

Pos logit contributions
ar+0.053
 Nem+0.051
 Bub+0.048
 home+0.046
 Twe+0.046
Neg logit contributions
urious-0.033
 acts-0.031
itive-0.028
 typing-0.028
 recite-0.028
Token.
Token ID13
Feature activation-1.98e-02

Pos logit contributions
ar+0.010
 Nem+0.009
 Bub+0.009
melon+0.008
orted+0.008
Neg logit contributions
 acts-0.011
urious-0.010
 explains-0.010
 typing-0.009
itive-0.009
Token\n
Token ID198
Feature activation-1.60e-02

Pos logit contributions
ar+0.035
melon+0.032
 thirst+0.031
 Nem+0.031
 bamboo+0.030
Neg logit contributions
urious-0.025
 explains-0.024
 acts-0.024
 typing-0.022
Ru-0.021
Token\n
Token ID198
Feature activation-1.48e-02

Pos logit contributions
ar+0.024
 Nem+0.021
melon+0.020
 Bub+0.019
ger+0.019
Neg logit contributions
urious-0.026
 explains-0.024
 acts-0.024
 typing-0.024
 recite-0.023
Token"
Token ID1
Feature activation-1.82e-02

Pos logit contributions
ar+0.024
 Nem+0.024
 Bub+0.022
melon+0.021
 home+0.021
Neg logit contributions
urious-0.021
 acts-0.019
 typing-0.019
 explains-0.019
 recite-0.018
TokenOh
Token ID5812
Feature activation-6.65e-02

Pos logit contributions
ar+0.029
 Nem+0.027
 Bub+0.026
 home+0.025
 bamboo+0.024
Neg logit contributions
urious-0.028
 acts-0.026
 typing-0.025
itive-0.024
 explains-0.024
Token,
Token ID11
Feature activation-4.49e-02

Pos logit contributions
 Nem+0.112
ar+0.110
orted+0.109
bank+0.109
melon+0.107
Neg logit contributions
 amaz-0.078
 acts-0.077
itive-0.073
 hears-0.073
urious-0.072
Token L
Token ID406
Feature activation-3.29e-02

Pos logit contributions
ar+0.071
 Nem+0.067
orted+0.065
bank+0.061
 Bub+0.061
Neg logit contributions
 acts-0.064
urious-0.060
 amaz-0.059
 explains-0.059
 typing-0.059
Tokenila
Token ID10102
Feature activation-2.36e-02

Pos logit contributions
ar+0.061
 Nem+0.058
 Bub+0.055
 home+0.053
 Twe+0.052
Neg logit contributions
urious-0.041
 acts-0.039
itive-0.036
 explains-0.035
 typing-0.035
Token,
Token ID11
Feature activation-3.57e-02

Pos logit contributions
ar+0.023
 Nem+0.021
 Bub+0.019
 home+0.017
 Twe+0.017
Neg logit contributions
urious-0.050
 acts-0.049
 typing-0.046
 recite-0.046
itive-0.046
Token I
Token ID314
Feature activation-7.50e-03

Pos logit contributions
ar+0.053
 Nem+0.048
 Bub+0.045
orted+0.044
 home+0.043
Neg logit contributions
 acts-0.056
urious-0.055
 typing-0.052
 explains-0.051
 recite-0.051
Token,
Token ID11
Feature activation+4.03e-02

Pos logit contributions
ar+0.029
 Nem+0.026
 Bub+0.024
 home+0.022
 Twe+0.021
Neg logit contributions
urious-0.055
 acts-0.054
 typing-0.051
 recite-0.051
itive-0.051
Token it
Token ID340
Feature activation-2.80e-03

Pos logit contributions
urious+0.074
 acts+0.071
itive+0.066
 recite+0.066
 explains+0.065
Neg logit contributions
ar-0.051
 Nem-0.048
 Bub-0.045
 home-0.043
 Twe-0.041
Token was
Token ID373
Feature activation-7.69e-03

Pos logit contributions
ar+0.004
 Nem+0.004
 Bub+0.003
 home+0.003
melon+0.003
Neg logit contributions
urious-0.005
 acts-0.005
itive-0.004
 typing-0.004
 recite-0.004
Token like
Token ID588
Feature activation-6.05e-05

Pos logit contributions
ar+0.013
 Nem+0.011
 Bub+0.010
orted+0.010
balls+0.010
Neg logit contributions
urious-0.011
 typing-0.011
 amaz-0.011
 acts-0.011
 recite-0.011
Token a
Token ID257
Feature activation-2.07e-02

Pos logit contributions
ar+0.000
 Nem+0.000
 Bub+0.000
 home+0.000
orted+0.000
Neg logit contributions
urious-0.000
 acts-0.000
 typing-0.000
 recite-0.000
itive-0.000
Token brand
Token ID4508
Feature activation-6.64e-02

Pos logit contributions
ar+0.034
 Nem+0.031
orted+0.031
 Bub+0.030
balls+0.028
Neg logit contributions
urious-0.028
 acts-0.027
 recite-0.027
 typing-0.026
 amaz-0.026
Token new
Token ID649
Feature activation-2.91e-02

Pos logit contributions
ar+0.122
 Nem+0.118
balls+0.117
 Bub+0.116
orted+0.115
Neg logit contributions
urious-0.077
 acts-0.062
 amaz-0.060
 explains-0.058
 abrupt-0.057
Token symbol
Token ID6194
Feature activation+2.16e-03

Pos logit contributions
ar+0.044
 Nem+0.040
 Bub+0.039
orted+0.037
balls+0.036
Neg logit contributions
 acts-0.046
urious-0.045
 amaz-0.040
 recite-0.040
itive-0.040
Token!
Token ID0
Feature activation+1.75e-02

Pos logit contributions
urious+0.004
 acts+0.004
 typing+0.003
itive+0.003
 explains+0.003
Neg logit contributions
ar-0.003
 Nem-0.003
 Bub-0.003
 home-0.003
 Twe-0.002
Token\n
Token ID198
Feature activation+1.66e-02

Pos logit contributions
urious+0.026
 acts+0.025
 explains+0.025
 typing+0.025
 recite+0.024
Neg logit contributions
ar-0.028
 bamboo-0.024
 Nem-0.023
melon-0.023
ger-0.022
Token\n
Token ID198
Feature activation+2.58e-02

Pos logit contributions
urious+0.028
 explains+0.026
 typing+0.026
 recite+0.025
 acts+0.025
Neg logit contributions
ar-0.024
 Nem-0.021
melon-0.020
 Bub-0.020
ger-0.020
Token family
Token ID1641
Feature activation-1.48e-02

Pos logit contributions
urious+0.007
 acts+0.006
itive+0.006
 typing+0.006
 recite+0.006
Neg logit contributions
ar-0.006
 Nem-0.005
 Bub-0.005
 home-0.005
 Twe-0.004
Token were
Token ID547
Feature activation-5.94e-04

Pos logit contributions
ar+0.019
 Nem+0.019
 home+0.017
 Bub+0.017
melon+0.016
Neg logit contributions
urious-0.027
 acts-0.025
itive-0.024
ymes-0.024
 recite-0.024
Token ready
Token ID3492
Feature activation-8.33e-04

Pos logit contributions
ar+0.001
 Nem+0.001
 Bub+0.001
 Twe+0.001
 bamboo+0.001
Neg logit contributions
urious-0.001
 acts-0.001
 typing-0.001
itive-0.001
 recite-0.001
Token to
Token ID284
Feature activation+3.06e-03

Pos logit contributions
ar+0.001
 home+0.001
 Nem+0.001
 Bub+0.001
balls+0.000
Neg logit contributions
urious-0.002
 acts-0.002
ymes-0.002
itive-0.002
 explains-0.002
Token go
Token ID467
Feature activation-5.04e-03

Pos logit contributions
urious+0.005
 acts+0.005
itive+0.005
ymes+0.005
asing+0.005
Neg logit contributions
ar-0.004
 Nem-0.004
 home-0.004
 Bub-0.004
 Twe-0.004
Token pic
Token ID8301
Feature activation-6.63e-02

Pos logit contributions
ar+0.005
 Nem+0.005
 Bub+0.004
 Twe+0.004
melon+0.004
Neg logit contributions
urious-0.010
 acts-0.010
 typing-0.009
 recite-0.009
 explains-0.009
Tokenn
Token ID77
Feature activation-2.62e-03

Pos logit contributions
ar+0.071
 Nem+0.065
 Bub+0.060
 home+0.054
 Twe+0.053
Neg logit contributions
urious-0.132
 acts-0.127
 recite-0.122
 typing-0.121
itive-0.121
Tokenicking
Token ID7958
Feature activation-1.09e-02

Pos logit contributions
ar+0.003
 Nem+0.003
 home+0.003
 Twe+0.003
 Bub+0.003
Neg logit contributions
urious-0.005
 acts-0.005
ymes-0.004
itive-0.004
 explains-0.004
Token again
Token ID757
Feature activation+1.09e-02

Pos logit contributions
ar+0.014
 Nem+0.013
 Bub+0.012
 home+0.011
 Twe+0.011
Neg logit contributions
urious-0.019
 acts-0.019
 typing-0.017
itive-0.017
 explains-0.017
Token.
Token ID13
Feature activation-3.09e-03

Pos logit contributions
urious+0.026
 acts+0.025
 typing+0.024
ymes+0.024
 explains+0.024
Neg logit contributions
 home-0.010
ar-0.008
 Nem-0.007
 away-0.006
 Bub-0.006
Token But
Token ID887
Feature activation-6.65e-03

Pos logit contributions
ar+0.005
ger+0.004
 Nem+0.004
balls+0.004
melon+0.004
Neg logit contributions
urious-0.005
 acts-0.005
 typing-0.004
 explains-0.004
 recite-0.004
Token and
Token ID290
Feature activation-2.26e-02

Pos logit contributions
urious+0.021
 acts+0.020
itive+0.019
 typing+0.019
 recite+0.019
Neg logit contributions
ar-0.025
 Nem-0.023
 Bub-0.022
 home-0.021
 Twe-0.021
Token nodded
Token ID14464
Feature activation+5.87e-03

Pos logit contributions
ar+0.042
 Nem+0.038
 Bub+0.037
 home+0.036
 bamboo+0.035
Neg logit contributions
urious-0.027
 acts-0.026
 explains-0.025
 typing-0.024
 recite-0.024
Token.
Token ID13
Feature activation-3.03e-03

Pos logit contributions
urious+0.012
 acts+0.011
 typing+0.010
itive+0.010
 recite+0.010
Neg logit contributions
ar-0.007
 Nem-0.006
 Bub-0.006
 home-0.005
 Twe-0.005
Token He
Token ID679
Feature activation-2.43e-03

Pos logit contributions
ar+0.004
 Nem+0.003
 Bub+0.003
 bamboo+0.003
 home+0.003
Neg logit contributions
urious-0.005
 acts-0.005
 typing-0.005
 explains-0.005
 recite-0.005
Token loved
Token ID6151
Feature activation-1.18e-02

Pos logit contributions
ar+0.004
 Nem+0.004
 Bub+0.003
 home+0.003
 Twe+0.003
Neg logit contributions
urious-0.004
 acts-0.004
itive-0.003
ymes-0.003
 typing-0.003
Token pic
Token ID8301
Feature activation-6.61e-02

Pos logit contributions
ar+0.016
 Nem+0.016
 Bub+0.015
 home+0.014
 bamboo+0.014
Neg logit contributions
urious-0.021
 acts-0.019
itive-0.019
 recite-0.018
 explains-0.018
Tokenn
Token ID77
Feature activation+5.16e-04

Pos logit contributions
ar+0.078
 Nem+0.077
 Bub+0.071
 home+0.070
 bamboo+0.066
Neg logit contributions
urious-0.114
 acts-0.109
 recite-0.109
 explains-0.106
itive-0.105
Tokenics
Token ID873
Feature activation-9.98e-03

Pos logit contributions
 acts+0.001
urious+0.001
 amaz+0.001
 recite+0.001
 explains+0.001
Neg logit contributions
ar-0.001
 home-0.001
 Nem-0.001
balls-0.001
 Bub-0.001
Token and
Token ID290
Feature activation-1.51e-02

Pos logit contributions
ar+0.014
 Nem+0.013
 Bub+0.012
 home+0.011
 Twe+0.011
Neg logit contributions
urious-0.017
 acts-0.016
itive-0.015
 typing-0.015
 recite-0.015
Token couldn
Token ID3521
Feature activation+1.97e-03

Pos logit contributions
 Nem+0.021
ar+0.021
 Bub+0.020
 Twe+0.018
 home+0.018
Neg logit contributions
urious-0.026
 acts-0.024
ymes-0.024
itive-0.024
asing-0.023
Token't
Token ID470
Feature activation+3.46e-03

Pos logit contributions
urious+0.004
 acts+0.004
itive+0.004
 typing+0.004
ymes+0.004
Neg logit contributions
ar-0.002
 Nem-0.002
 Bub-0.001
 Twe-0.001
 home-0.001
Token "
Token ID366
Feature activation+1.64e-02

Pos logit contributions
urious+0.037
 acts+0.036
itive+0.034
 typing+0.034
 recite+0.034
Neg logit contributions
ar-0.025
 Nem-0.023
 Bub-0.022
 home-0.020
 Twe-0.020
TokenWhy
Token ID5195
Feature activation-1.65e-03

Pos logit contributions
urious+0.026
 acts+0.026
itive+0.024
 typing+0.023
 explains+0.023
Neg logit contributions
ar-0.024
 Nem-0.023
 Bub-0.021
melon-0.019
 Twe-0.019
Token do
Token ID466
Feature activation+3.46e-03

Pos logit contributions
ar+0.002
 Nem+0.002
 Bub+0.002
 Twe+0.002
 home+0.002
Neg logit contributions
urious-0.003
 abrupt-0.002
 acts-0.002
ymes-0.002
 amaz-0.002
Token you
Token ID345
Feature activation+6.99e-03

Pos logit contributions
urious+0.007
 acts+0.006
itive+0.006
 typing+0.006
 recite+0.006
Neg logit contributions
ar-0.004
 Nem-0.004
 Bub-0.004
 Twe-0.003
 home-0.003
Token want
Token ID765
Feature activation-2.02e-02

Pos logit contributions
urious+0.010
 acts+0.009
ymes+0.008
itive+0.008
 typing+0.008
Neg logit contributions
ar-0.012
 Nem-0.012
 Bub-0.011
 home-0.011
melon-0.010
Token a
Token ID257
Feature activation-6.60e-02

Pos logit contributions
ar+0.029
 Nem+0.026
orted+0.025
 Bub+0.024
ger+0.023
Neg logit contributions
 acts-0.032
 recite-0.031
 explains-0.031
urious-0.030
 typing-0.030
Token bisc
Token ID36228
Feature activation-2.61e-02

Pos logit contributions
 Nem+0.093
ar+0.092
 Bub+0.085
orted+0.084
 Twe+0.081
Neg logit contributions
urious-0.097
 acts-0.096
 recite-0.094
 explains-0.093
 typing-0.086
Tokenuit
Token ID5013
Feature activation-3.28e-03

Pos logit contributions
 Nem+0.036
ar+0.035
 Bub+0.032
 Twe+0.030
 home+0.030
Neg logit contributions
 acts-0.040
urious-0.040
 recite-0.039
itive-0.038
 explains-0.038
Token?"
Token ID1701
Feature activation+1.76e-02

Pos logit contributions
ar+0.005
 Nem+0.005
 Bub+0.004
 home+0.004
 Twe+0.004
Neg logit contributions
urious-0.005
 acts-0.005
itive-0.005
 typing-0.005
 recite-0.005
Token
Token ID198
Feature activation+8.07e-03

Pos logit contributions
urious+0.031
 acts+0.030
 explains+0.029
 recite+0.029
 typing+0.029
Neg logit contributions
ar-0.023
 Nem-0.020
 Bub-0.018
 bamboo-0.018
melon-0.018
Token
Token ID198
Feature activation+1.62e-02

Pos logit contributions
urious+0.015
 acts+0.014
 explains+0.014
 recite+0.014
 typing+0.014
Neg logit contributions
ar-0.010
 Nem-0.009
 Bub-0.008
melon-0.008
 bamboo-0.008
Token,
Token ID11
Feature activation+4.84e-03

Pos logit contributions
ar+0.012
 Nem+0.010
 Bub+0.010
 home+0.009
 Twe+0.009
Neg logit contributions
urious-0.021
 acts-0.020
itive-0.019
 explains-0.019
 recite-0.019
Token ready
Token ID3492
Feature activation-2.31e-03

Pos logit contributions
urious+0.009
 acts+0.008
ymes+0.008
itive+0.008
 explains+0.008
Neg logit contributions
 Nem-0.006
ar-0.006
 home-0.006
 Bub-0.006
 Twe-0.006
Token to
Token ID284
Feature activation+2.75e-03

Pos logit contributions
ar+0.003
 Nem+0.003
 Bub+0.002
 home+0.002
 Twe+0.002
Neg logit contributions
urious-0.004
 acts-0.004
 typing-0.004
itive-0.004
 recite-0.004
Token have
Token ID423
Feature activation-3.30e-02

Pos logit contributions
urious+0.005
 acts+0.005
itive+0.005
ymes+0.004
asing+0.004
Neg logit contributions
ar-0.004
 Nem-0.004
 Bub-0.004
 home-0.004
 Twe-0.003
Token more
Token ID517
Feature activation-2.83e-03

Pos logit contributions
ar+0.048
 Nem+0.042
 Bub+0.041
orted+0.039
melon+0.038
Neg logit contributions
urious-0.052
 acts-0.051
 explains-0.048
 recite-0.048
 typing-0.048
Token pic
Token ID8301
Feature activation-6.58e-02

Pos logit contributions
ar+0.003
 Nem+0.003
 home+0.002
 Bub+0.002
 bamboo+0.002
Neg logit contributions
urious-0.006
 acts-0.006
itive-0.006
 typing-0.006
 recite-0.005
Tokenn
Token ID77
Feature activation+1.64e-02

Pos logit contributions
ar+0.071
 Nem+0.069
 Bub+0.064
 home+0.057
 Twe+0.056
Neg logit contributions
urious-0.125
 acts-0.123
 recite-0.117
 explains-0.115
 typing-0.114
Tokenics
Token ID873
Feature activation-1.72e-02

Pos logit contributions
 acts+0.032
urious+0.032
 amaz+0.032
 recite+0.030
 explains+0.029
Neg logit contributions
ar-0.018
 home-0.016
 Nem-0.015
time-0.015
balls-0.015
Token in
Token ID287
Feature activation-5.02e-03

Pos logit contributions
ar+0.023
 Nem+0.021
 Bub+0.020
 Twe+0.018
 home+0.018
Neg logit contributions
urious-0.030
 acts-0.029
itive-0.027
 typing-0.027
 recite-0.027
Token the
Token ID262
Feature activation+2.24e-03

Pos logit contributions
ar+0.007
 Nem+0.007
 Bub+0.006
 home+0.006
 Twe+0.006
Neg logit contributions
urious-0.008
 acts-0.008
itive-0.008
 typing-0.007
 explains-0.007
Token future
Token ID2003
Feature activation+6.47e-03

Pos logit contributions
urious+0.004
 acts+0.003
 recite+0.003
 explains+0.003
 typing+0.003
Neg logit contributions
ar-0.003
 Nem-0.003
 Bub-0.003
balls-0.003
orted-0.003
Token sang
Token ID25889
Feature activation+1.82e-02

Pos logit contributions
urious+0.047
itive+0.045
ymes+0.045
 typing+0.045
 acts+0.043
Neg logit contributions
ar-0.029
 Bub-0.026
 Nem-0.024
 home-0.023
 away-0.022
Token louder
Token ID27089
Feature activation+3.41e-02

Pos logit contributions
urious+0.030
 acts+0.029
 recite+0.027
itive+0.027
 typing+0.027
Neg logit contributions
ar-0.026
 Nem-0.024
 Bub-0.023
melon-0.021
 home-0.021
Token and
Token ID290
Feature activation+1.24e-02

Pos logit contributions
urious+0.055
 acts+0.052
 typing+0.049
 recite+0.048
 explains+0.048
Neg logit contributions
ar-0.052
 Nem-0.047
 Bub-0.045
 home-0.044
 Twe-0.041
Token clearer
Token ID22363
Feature activation+1.57e-02

Pos logit contributions
urious+0.022
 acts+0.020
itive+0.020
 typing+0.019
ymes+0.019
Neg logit contributions
ar-0.017
 Nem-0.017
 Bub-0.017
 home-0.015
 Twe-0.015
Token than
Token ID621
Feature activation+2.04e-02

Pos logit contributions
urious+0.033
 acts+0.030
 explains+0.029
 typing+0.029
 recite+0.029
Neg logit contributions
ar-0.017
 Nem-0.015
 home-0.014
 Bub-0.014
balls-0.012
Token before
Token ID878
Feature activation+4.06e-02

Pos logit contributions
urious+0.039
 acts+0.037
 explains+0.035
itive+0.035
 typing+0.035
Neg logit contributions
ar-0.025
 Nem-0.023
 Bub-0.022
 home-0.020
 Twe-0.020
Token.
Token ID13
Feature activation+8.79e-03

Pos logit contributions
urious+0.085
 acts+0.080
 explains+0.076
itive+0.076
ymes+0.075
Neg logit contributions
ar-0.041
 Nem-0.040
 Bub-0.036
 home-0.035
 Twe-0.033
Token The
Token ID383
Feature activation+1.35e-02

Pos logit contributions
urious+0.015
 acts+0.015
 recite+0.014
 explains+0.014
 typing+0.013
Neg logit contributions
ar-0.012
 Nem-0.010
 home-0.010
balls-0.010
melon-0.009
Token other
Token ID584
Feature activation+2.18e-02

Pos logit contributions
urious+0.022
 acts+0.021
 explains+0.020
 typing+0.020
itive+0.020
Neg logit contributions
ar-0.020
 Nem-0.019
 Bub-0.018
 home-0.018
 Twe-0.017
Token birds
Token ID10087
Feature activation+1.62e-02

Pos logit contributions
 typing+0.040
 recite+0.039
urious+0.039
 explains+0.038
 acts+0.037
Neg logit contributions
ar-0.027
 home-0.024
 Nem-0.023
 Twe-0.022
 Bub-0.022
Token in
Token ID287
Feature activation+1.63e-02

Pos logit contributions
urious+0.028
 acts+0.026
 typing+0.025
ymes+0.024
 abrupt+0.024
Neg logit contributions
ar-0.022
 Nem-0.021
 Bub-0.020
 home-0.019
 away-0.018
Token town
Token ID3240
Feature activation+1.28e-02

Pos logit contributions
urious+0.050
 acts+0.046
 typing+0.044
 recite+0.042
itive+0.042
Neg logit contributions
ar-0.066
 Nem-0.062
 Bub-0.060
balls-0.056
 bamboo-0.055
Token loved
Token ID6151
Feature activation+1.11e-02

Pos logit contributions
urious+0.021
 acts+0.020
 recite+0.020
itive+0.019
Ru+0.019
Neg logit contributions
ar-0.018
 Nem-0.017
 home-0.016
 Bub-0.016
 Twe-0.014
Token to
Token ID284
Feature activation+1.56e-03

Pos logit contributions
urious+0.018
 acts+0.018
 typing+0.017
 recite+0.017
itive+0.017
Neg logit contributions
ar-0.016
 Nem-0.014
 Bub-0.013
orted-0.013
ger-0.013
Token hear
Token ID3285
Feature activation+2.70e-02

Pos logit contributions
urious+0.002
 acts+0.002
itive+0.002
 typing+0.002
 explains+0.002
Neg logit contributions
ar-0.002
 Nem-0.002
 Bub-0.002
 home-0.002
 Twe-0.002
Token the
Token ID262
Feature activation+2.06e-02

Pos logit contributions
urious+0.053
 acts+0.050
itive+0.048
Ru+0.047
 recite+0.047
Neg logit contributions
 Nem-0.029
ar-0.029
 home-0.027
 Bub-0.027
 Twe-0.026
Token bell
Token ID8966
Feature activation+4.44e-02

Pos logit contributions
urious+0.036
 acts+0.034
itive+0.032
 recite+0.032
 typing+0.032
Neg logit contributions
ar-0.027
 Nem-0.025
 home-0.024
 Bub-0.023
 Twe-0.022
Token ring
Token ID5858
Feature activation+4.23e-02

Pos logit contributions
urious+0.079
 acts+0.074
Ru+0.072
itive+0.070
ymes+0.070
Neg logit contributions
ar-0.057
 home-0.052
 Bub-0.049
 Nem-0.048
 Twe-0.047
Token.
Token ID13
Feature activation+5.36e-03

Pos logit contributions
urious+0.090
 acts+0.084
Ru+0.082
itive+0.081
 typing+0.081
Neg logit contributions
ar-0.040
 home-0.036
 Nem-0.031
 away-0.031
 Bub-0.031
Token
Token ID198
Feature activation+3.41e-04

Pos logit contributions
urious+0.009
 acts+0.009
 explains+0.008
 typing+0.008
 recite+0.007
Neg logit contributions
ar-0.008
 Nem-0.007
balls-0.007
 home-0.006
melon-0.006
Token
Token ID198
Feature activation+6.94e-03

Pos logit contributions
urious+0.001
 acts+0.001
 explains+0.001
 recite+0.001
 typing+0.001
Neg logit contributions
ar-0.000
 Nem-0.000
 Bub-0.000
 home-0.000
melon-0.000
TokenOne
Token ID3198
Feature activation+2.24e-02

Pos logit contributions
urious+0.010
 acts+0.009
 typing+0.009
itive+0.009
 recite+0.009
Neg logit contributions
ar-0.012
 Nem-0.011
 Bub-0.011
 home-0.010
 Twe-0.010
Token
Token ID198
Feature activation+6.60e-03

Pos logit contributions
ar+0.008
 Nem+0.007
 Bub+0.006
melon+0.006
 home+0.006
Neg logit contributions
urious-0.011
 acts-0.011
 recite-0.010
 explains-0.010
 typing-0.010
TokenWhen
Token ID2215
Feature activation+2.25e-02

Pos logit contributions
 acts+0.011
urious+0.011
itive+0.010
 explains+0.010
 typing+0.010
Neg logit contributions
ar-0.009
 Nem-0.009
 Bub-0.008
 Twe-0.008
 home-0.008
Token he
Token ID339
Feature activation-2.60e-02

Pos logit contributions
urious+0.038
 acts+0.037
 typing+0.036
itive+0.035
 recite+0.035
Neg logit contributions
ar-0.030
 Nem-0.026
 Bub-0.025
inky-0.024
orted-0.024
Token was
Token ID373
Feature activation-1.12e-02

Pos logit contributions
ar+0.035
 Nem+0.033
 Bub+0.030
 home+0.028
 Twe+0.028
Neg logit contributions
urious-0.046
 acts-0.044
itive-0.042
 typing-0.042
 explains-0.041
Token finished
Token ID5201
Feature activation-1.51e-02

Pos logit contributions
ar+0.018
 Nem+0.017
 Bub+0.016
 Twe+0.015
ger+0.015
Neg logit contributions
urious-0.016
 acts-0.015
 typing-0.015
 recite-0.015
 explains-0.015
Token,
Token ID11
Feature activation+3.97e-02

Pos logit contributions
ar+0.017
 Nem+0.015
 Bub+0.014
 home+0.013
 bamboo+0.013
Neg logit contributions
urious-0.029
 acts-0.028
 typing-0.027
itive-0.027
 recite-0.027
Token he
Token ID339
Feature activation-1.31e-03

Pos logit contributions
urious+0.090
 acts+0.086
ymes+0.081
 explains+0.081
itive+0.081
Neg logit contributions
ar-0.035
 Nem-0.035
 Bub-0.031
 home-0.028
 Twe-0.028
Token l
Token ID300
Feature activation+8.11e-04

Pos logit contributions
ar+0.002
 Nem+0.002
 Bub+0.002
 home+0.002
 Twe+0.002
Neg logit contributions
urious-0.002
 acts-0.002
itive-0.002
ymes-0.002
 typing-0.002
Tokenicked
Token ID9484
Feature activation+1.46e-02

Pos logit contributions
urious+0.001
 acts+0.001
itive+0.001
asing+0.001
 explains+0.001
Neg logit contributions
ar-0.001
 Nem-0.001
 Bub-0.001
 Twe-0.001
 home-0.001
Token his
Token ID465
Feature activation+1.05e-02

Pos logit contributions
urious+0.023
 acts+0.023
 explains+0.022
itive+0.021
 recite+0.021
Neg logit contributions
ar-0.022
 Nem-0.018
 Bub-0.017
balls-0.017
ger-0.017
Token paws
Token ID49193
Feature activation-2.72e-03

Pos logit contributions
urious+0.015
 acts+0.015
 recite+0.013
 amaz+0.013
itive+0.013
Neg logit contributions
ar-0.017
 Nem-0.016
 Bub-0.015
 Twe-0.014
orted-0.014
TokenI
Token ID40
Feature activation-4.28e-03

Pos logit contributions
ar+0.024
 Nem+0.021
 Bub+0.020
 home+0.019
melon+0.019
Neg logit contributions
urious-0.031
 acts-0.030
 typing-0.029
itive-0.029
 recite-0.028
Token love
Token ID1842
Feature activation-2.62e-03

Pos logit contributions
ar+0.006
 Nem+0.006
 Bub+0.006
 home+0.005
 Twe+0.005
Neg logit contributions
urious-0.007
 acts-0.007
itive-0.006
 typing-0.006
ymes-0.006
Token you
Token ID345
Feature activation+6.02e-03

Pos logit contributions
ar+0.004
 Nem+0.003
 Bub+0.003
orted+0.003
melon+0.003
Neg logit contributions
 acts-0.004
urious-0.004
 typing-0.004
 recite-0.004
itive-0.004
Token and
Token ID290
Feature activation-3.24e-03

Pos logit contributions
urious+0.011
 acts+0.011
itive+0.010
 typing+0.010
 recite+0.010
Neg logit contributions
ar-0.008
 Nem-0.007
 Bub-0.006
 bamboo-0.006
 Twe-0.006
Token your
Token ID534
Feature activation-6.85e-03

Pos logit contributions
ar+0.004
 Nem+0.004
 Bub+0.004
 home+0.003
 Twe+0.003
Neg logit contributions
urious-0.006
 acts-0.006
ymes-0.005
itive-0.005
 recite-0.005
Token mom
Token ID1995
Feature activation+3.96e-02

Pos logit contributions
ar+0.012
orted+0.010
 Nem+0.010
 Bub+0.010
ger+0.009
Neg logit contributions
 acts-0.010
urious-0.009
itive-0.008
 typing-0.008
 amaz-0.008
Tokenmy
Token ID1820
Feature activation+3.00e-02

Pos logit contributions
urious+0.054
 acts+0.050
itive+0.048
 typing+0.047
ymes+0.046
Neg logit contributions
ar-0.068
 Nem-0.065
 Bub-0.061
 home-0.060
 Twe-0.057
Token so
Token ID523
Feature activation+9.09e-03

Pos logit contributions
urious+0.045
 acts+0.043
itive+0.040
 typing+0.040
 recite+0.040
Neg logit contributions
ar-0.047
 Nem-0.044
 Bub-0.042
 home-0.040
 Twe-0.039
Token much
Token ID881
Feature activation+1.49e-02

Pos logit contributions
urious+0.014
 acts+0.013
itive+0.012
 typing+0.012
 recite+0.012
Neg logit contributions
ar-0.014
 Nem-0.013
 Bub-0.013
 home-0.012
 Twe-0.012
Token.
Token ID13
Feature activation+1.44e-02

Pos logit contributions
urious+0.028
 acts+0.027
itive+0.026
 typing+0.026
 recite+0.025
Neg logit contributions
ar-0.018
 Nem-0.017
 Bub-0.015
 home-0.015
 Twe-0.014
Token You
Token ID921
Feature activation+2.23e-02

Pos logit contributions
urious+0.025
 acts+0.024
 typing+0.023
 explains+0.023
 recite+0.023
Neg logit contributions
ar-0.020
 Nem-0.018
 Bub-0.017
 home-0.016
 bamboo-0.015
Token,
Token ID11
Feature activation+1.66e-02

Pos logit contributions
urious+0.002
 acts+0.002
itive+0.002
 typing+0.002
 explains+0.002
Neg logit contributions
ar-0.001
 Nem-0.001
 Bub-0.001
orted-0.001
 Twe-0.001
Token she
Token ID673
Feature activation+7.94e-04

Pos logit contributions
 typing+0.019
itive+0.018
urious+0.017
 acts+0.017
 recite+0.017
Neg logit contributions
ar-0.033
inky-0.029
orted-0.027
 Nem-0.027
bank-0.027
Token saw
Token ID2497
Feature activation+1.28e-02

Pos logit contributions
 acts+0.001
urious+0.001
itive+0.001
 explains+0.001
ymes+0.001
Neg logit contributions
ar-0.001
 Nem-0.001
 Bub-0.001
 home-0.001
 Twe-0.001
Token a
Token ID257
Feature activation+2.33e-04

Pos logit contributions
urious+0.018
 acts+0.017
 explains+0.017
 typing+0.016
itive+0.016
Neg logit contributions
ar-0.023
 Nem-0.019
inky-0.019
oe-0.018
suit-0.018
Token man
Token ID582
Feature activation+1.09e-02

Pos logit contributions
urious+0.000
 acts+0.000
itive+0.000
 typing+0.000
 explains+0.000
Neg logit contributions
ar-0.000
 Nem-0.000
 Bub-0.000
 home-0.000
 Twe-0.000
Token kne
Token ID24813
Feature activation+4.03e-02

Pos logit contributions
urious+0.018
 acts+0.018
itive+0.016
ymes+0.016
 explains+0.016
Neg logit contributions
ar-0.015
 Nem-0.014
 home-0.014
 Bub-0.013
 Twe-0.013
Tokenel
Token ID417
Feature activation+1.41e-02

Pos logit contributions
 acts+0.069
 recite+0.064
urious+0.064
 typing+0.063
 explains+0.063
Neg logit contributions
ar-0.052
 home-0.052
 Nem-0.051
 away-0.047
orted-0.047
Token down
Token ID866
Feature activation+2.83e-02

Pos logit contributions
urious+0.025
 acts+0.024
itive+0.023
 typing+0.023
 recite+0.023
Neg logit contributions
ar-0.018
 Nem-0.017
 Bub-0.016
 home-0.015
 Twe-0.014
Token in
Token ID287
Feature activation+8.52e-03

Pos logit contributions
urious+0.059
 acts+0.058
itive+0.054
asing+0.052
ymes+0.052
Neg logit contributions
 home-0.030
ar-0.028
 Nem-0.026
 away-0.026
 back-0.026
Token front
Token ID2166
Feature activation+9.80e-03

Pos logit contributions
urious+0.018
asing+0.017
itive+0.017
 acts+0.016
ymes+0.016
Neg logit contributions
 Nem-0.009
 home-0.009
ar-0.008
 away-0.008
 Bub-0.007
Token of
Token ID286
Feature activation-2.11e-03

Pos logit contributions
urious+0.013
 acts+0.013
itive+0.012
 recite+0.012
 typing+0.012
Neg logit contributions
ar-0.016
 Nem-0.015
 home-0.015
 Bub-0.014
 bamboo-0.014
Token his
Token ID465
Feature activation+2.39e-02

Pos logit contributions
urious+0.038
 acts+0.036
itive+0.034
 recite+0.034
 typing+0.034
Neg logit contributions
ar-0.038
 Nem-0.033
 Bub-0.031
balls-0.030
orted-0.029
Token mother
Token ID2802
Feature activation+2.05e-02

Pos logit contributions
urious+0.042
 acts+0.039
asing+0.038
ymes+0.038
itive+0.038
Neg logit contributions
 home-0.030
 Nem-0.030
ar-0.029
 Bub-0.027
 bamboo-0.025
Token and
Token ID290
Feature activation-1.35e-02

Pos logit contributions
urious+0.038
 acts+0.035
itive+0.034
ymes+0.034
 typing+0.033
Neg logit contributions
ar-0.025
 Nem-0.024
 home-0.022
 Bub-0.022
melon-0.020
Token she
Token ID673
Feature activation+5.51e-04

Pos logit contributions
ar+0.021
 Nem+0.021
 Bub+0.019
 home+0.019
 Twe+0.018
Neg logit contributions
urious-0.020
 acts-0.019
ymes-0.018
itive-0.018
 typing-0.018
Token smiled
Token ID13541
Feature activation+8.06e-03

Pos logit contributions
urious+0.001
 acts+0.001
itive+0.001
ymes+0.001
 typing+0.001
Neg logit contributions
ar-0.001
 Nem-0.001
 Bub-0.001
 home-0.001
 Twe-0.001
Token back
Token ID736
Feature activation+1.15e-02

Pos logit contributions
urious+0.015
 acts+0.015
itive+0.014
 typing+0.014
 recite+0.014
Neg logit contributions
ar-0.010
 Nem-0.009
 Bub-0.008
 home-0.008
 Twe-0.007
Token.
Token ID13
Feature activation+3.39e-03

Pos logit contributions
urious+0.023
 acts+0.023
 recite+0.022
 typing+0.021
 explains+0.021
Neg logit contributions
ar-0.012
 Nem-0.011
 home-0.010
 Bub-0.010
 Twe-0.009
Token The
Token ID383
Feature activation+2.25e-02

Pos logit contributions
urious+0.005
 acts+0.005
 explains+0.005
 typing+0.005
 recite+0.005
Neg logit contributions
ar-0.005
 Nem-0.004
 Bub-0.004
balls-0.004
 bamboo-0.004
Token baby
Token ID5156
Feature activation+1.94e-02

Pos logit contributions
urious+0.038
 acts+0.037
itive+0.034
 typing+0.034
 recite+0.034
Neg logit contributions
ar-0.031
 Nem-0.030
 Bub-0.028
 home-0.026
 Twe-0.025
Token couldn
Token ID3521
Feature activation+5.30e-03

Pos logit contributions
urious+0.029
 acts+0.028
 typing+0.026
itive+0.026
 recite+0.026
Neg logit contributions
ar-0.031
 Nem-0.029
 Bub-0.028
 home-0.026
 Twe-0.026
Token't
Token ID470
Feature activation+1.57e-02

Pos logit contributions
urious+0.012
 acts+0.012
itive+0.011
 typing+0.011
ymes+0.011
Neg logit contributions
ar-0.005
 Nem-0.004
 Bub-0.004
 Twe-0.003
 home-0.003
Token in
Token ID287
Feature activation-1.56e-03

Pos logit contributions
urious+0.005
 acts+0.005
itive+0.004
 recite+0.004
 explains+0.004
Neg logit contributions
ar-0.003
 Nem-0.003
 Bub-0.003
 home-0.003
 bamboo-0.003
Token the
Token ID262
Feature activation+8.65e-03

Pos logit contributions
ar+0.002
 Nem+0.002
 Bub+0.002
 home+0.002
 Twe+0.002
Neg logit contributions
urious-0.003
 acts-0.003
itive-0.002
 typing-0.002
 explains-0.002
Token water
Token ID1660
Feature activation-1.09e-02

Pos logit contributions
urious+0.014
 acts+0.013
 abrupt+0.012
 hears+0.012
 typing+0.011
Neg logit contributions
ar-0.013
 Nem-0.012
 Bub-0.011
balls-0.011
orted-0.011
Token.
Token ID13
Feature activation+4.09e-03

Pos logit contributions
ar+0.014
 Nem+0.013
 Bub+0.012
 home+0.011
 Twe+0.011
Neg logit contributions
urious-0.020
 acts-0.019
 typing-0.018
itive-0.018
 explains-0.018
Token He
Token ID679
Feature activation+8.31e-03

Pos logit contributions
urious+0.007
 acts+0.007
 explains+0.006
 typing+0.006
 recite+0.006
Neg logit contributions
ar-0.006
 Nem-0.005
 Bub-0.005
 home-0.005
 bamboo-0.005
Token stared
Token ID18484
Feature activation+2.07e-02

Pos logit contributions
urious+0.011
 acts+0.010
 typing+0.009
 explains+0.009
itive+0.009
Neg logit contributions
ar-0.015
 Nem-0.014
 Bub-0.013
 home-0.013
 Twe-0.013
Token at
Token ID379
Feature activation+8.73e-03

Pos logit contributions
urious+0.025
 acts+0.023
itive+0.022
 typing+0.021
 explains+0.021
Neg logit contributions
ar-0.040
 Nem-0.037
 Bub-0.035
 Twe-0.033
 home-0.033
Token the
Token ID262
Feature activation-5.29e-04

Pos logit contributions
 acts+0.013
urious+0.013
 explains+0.011
 amaz+0.011
 abrupt+0.011
Neg logit contributions
ar-0.015
inky-0.012
orted-0.012
age-0.012
 Nem-0.011
Token fish
Token ID5916
Feature activation+5.94e-03

Pos logit contributions
ar+0.001
 Nem+0.001
 Bub+0.001
 home+0.001
 bamboo+0.001
Neg logit contributions
urious-0.001
 acts-0.001
 typing-0.001
itive-0.001
 recite-0.001
Token.
Token ID13
Feature activation-9.21e-04

Pos logit contributions
urious+0.010
 acts+0.010
itive+0.009
 typing+0.009
 recite+0.009
Neg logit contributions
ar-0.008
 Nem-0.008
 Bub-0.007
 home-0.007
 bamboo-0.007
Token The
Token ID383
Feature activation+8.87e-03

Pos logit contributions
ar+0.002
ger+0.001
 bamboo+0.001
 thirst+0.001
 away+0.001
Neg logit contributions
urious-0.001
 explains-0.001
 acts-0.001
 typing-0.001
 requests-0.001
Token friends
Token ID2460
Feature activation-7.52e-03

Pos logit contributions
urious+0.006
 acts+0.006
 typing+0.006
itive+0.006
 explains+0.006
Neg logit contributions
ar-0.004
 Nem-0.003
 home-0.003
 Bub-0.003
 bamboo-0.003
Token.
Token ID13
Feature activation+9.35e-03

Pos logit contributions
ar+0.009
 Nem+0.008
 Bub+0.008
 home+0.007
 Twe+0.007
Neg logit contributions
urious-0.014
 acts-0.014
itive-0.013
 typing-0.013
 recite-0.013
Token That
Token ID1320
Feature activation+1.63e-02

Pos logit contributions
urious+0.016
 acts+0.016
 explains+0.015
 recite+0.015
 typing+0.014
Neg logit contributions
ar-0.012
 Nem-0.011
 Bub-0.010
balls-0.010
 bamboo-0.010
Token way
Token ID835
Feature activation-1.59e-02

Pos logit contributions
 typing+0.035
urious+0.035
itive+0.034
asing+0.033
ymes+0.033
Neg logit contributions
ar-0.014
 home-0.013
 Nem-0.012
ger-0.011
 Bub-0.010
Token the
Token ID262
Feature activation+3.38e-03

Pos logit contributions
ar+0.018
 Nem+0.016
 Bub+0.015
 home+0.014
 Twe+0.014
Neg logit contributions
urious-0.031
 acts-0.030
itive-0.028
 typing-0.028
 recite-0.028
Token bee
Token ID20697
Feature activation+1.67e-02

Pos logit contributions
urious+0.005
 acts+0.005
 typing+0.005
 explains+0.005
itive+0.005
Neg logit contributions
ar-0.005
 Nem-0.005
 home-0.005
 Bub-0.005
 bamboo-0.005
Token didn
Token ID1422
Feature activation+2.10e-03

Pos logit contributions
urious+0.028
 typing+0.026
 acts+0.026
itive+0.025
ymes+0.024
Neg logit contributions
ar-0.025
 Nem-0.022
 home-0.021
 Bub-0.021
ger-0.019
Token't
Token ID470
Feature activation+1.36e-02

Pos logit contributions
urious+0.004
 acts+0.004
 typing+0.004
itive+0.004
 recite+0.004
Neg logit contributions
ar-0.002
 Nem-0.002
 Bub-0.002
 home-0.002
 Twe-0.002
Token have
Token ID423
Feature activation-1.07e-02

Pos logit contributions
urious+0.016
 acts+0.016
 recite+0.014
itive+0.014
 typing+0.014
Neg logit contributions
ar-0.026
 Nem-0.024
 Bub-0.023
balls-0.022
orted-0.022
Token to
Token ID284
Feature activation+1.41e-02

Pos logit contributions
ar+0.015
 Nem+0.014
 Bub+0.013
balls+0.012
melon+0.012
Neg logit contributions
urious-0.018
 acts-0.017
 recite-0.016
 explains-0.016
itive-0.016
Token escape
Token ID6654
Feature activation+4.29e-03

Pos logit contributions
urious+0.021
 acts+0.020
 typing+0.019
 explains+0.019
ymes+0.019
Neg logit contributions
ar-0.022
 Nem-0.022
 Bub-0.021
 home-0.020
 Twe-0.020
Token bird
Token ID6512
Feature activation+2.02e-03

Pos logit contributions
ar+0.023
 Nem+0.021
 Bub+0.020
balls+0.019
orted+0.019
Neg logit contributions
 acts-0.021
urious-0.020
itive-0.019
 recite-0.019
 typing-0.019
Token.
Token ID13
Feature activation-6.94e-03

Pos logit contributions
urious+0.004
 acts+0.004
itive+0.003
 typing+0.003
 recite+0.003
Neg logit contributions
ar-0.002
 Nem-0.002
 Bub-0.002
 home-0.002
balls-0.002
Token
Token ID220
Feature activation+7.42e-03

Pos logit contributions
ar+0.011
balls+0.009
 Nem+0.009
ger+0.009
 home+0.009
Neg logit contributions
urious-0.010
 acts-0.010
 recite-0.010
 typing-0.010
 explains-0.010
Token
Token ID198
Feature activation-4.01e-03

Pos logit contributions
urious+0.013
 acts+0.012
 explains+0.012
 typing+0.012
 recite+0.012
Neg logit contributions
ar-0.010
 Nem-0.009
 Bub-0.008
melon-0.008
 bamboo-0.007
Token
Token ID198
Feature activation+8.02e-05

Pos logit contributions
ar+0.005
 Nem+0.004
 Bub+0.004
 home+0.003
melon+0.003
Neg logit contributions
urious-0.008
 acts-0.008
 typing-0.007
 explains-0.007
 recite-0.007
Token"
Token ID1
Feature activation+9.93e-03

Pos logit contributions
 acts+0.000
urious+0.000
itive+0.000
 typing+0.000
 explains+0.000
Neg logit contributions
ar-0.000
 Nem-0.000
 Bub-0.000
 Twe-0.000
 home-0.000
TokenHello
Token ID15496
Feature activation-6.15e-03

Pos logit contributions
urious+0.019
 acts+0.018
 recite+0.017
itive+0.017
 typing+0.017
Neg logit contributions
ar-0.012
 Nem-0.011
 Bub-0.010
 home-0.010
balls-0.009
Token,
Token ID11
Feature activation-1.06e-02

Pos logit contributions
ar+0.005
 Nem+0.005
 Bub+0.004
 home+0.004
 Twe+0.004
Neg logit contributions
urious-0.014
 acts-0.013
 typing-0.013
itive-0.013
 recite-0.013
Token little
Token ID1310
Feature activation-1.82e-02

Pos logit contributions
 Nem+0.014
 Bub+0.014
ar+0.014
 Twe+0.013
 home+0.012
Neg logit contributions
urious-0.019
 acts-0.018
itive-0.017
 typing-0.017
 recite-0.017
Token girl
Token ID2576
Feature activation-1.27e-02

Pos logit contributions
ar+0.023
 Nem+0.021
 Bub+0.020
 home+0.019
 Twe+0.018
Neg logit contributions
urious-0.034
 acts-0.032
 typing-0.031
 recite-0.031
itive-0.031
Token!"
Token ID2474
Feature activation-5.59e-03

Pos logit contributions
ar+0.017
 Nem+0.015
 Bub+0.015
balls+0.014
melon+0.014
Neg logit contributions
urious-0.022
 acts-0.021
itive-0.020
 explains-0.020
 typing-0.020
Token saw
Token ID2497
Feature activation+1.85e-02

Pos logit contributions
urious+0.005
 acts+0.005
 typing+0.005
ymes+0.005
itive+0.004
Neg logit contributions
ar-0.004
 Nem-0.004
 Bub-0.003
 home-0.003
 Twe-0.003
Token a
Token ID257
Feature activation+5.53e-03

Pos logit contributions
urious+0.034
 acts+0.033
itive+0.031
 typing+0.031
 recite+0.031
Neg logit contributions
ar-0.024
 Nem-0.022
 Bub-0.020
 home-0.019
 Twe-0.018
Token spark
Token ID9009
Feature activation-8.79e-03

Pos logit contributions
urious+0.008
 acts+0.008
itive+0.007
 typing+0.007
 explains+0.007
Neg logit contributions
ar-0.009
 Nem-0.008
 Bub-0.008
 home-0.007
 Twe-0.007
Tokenly
Token ID306
Feature activation-2.48e-02

Pos logit contributions
ar+0.012
 Nem+0.011
 Bub+0.010
 home+0.010
ger+0.010
Neg logit contributions
urious-0.016
 acts-0.015
 typing-0.014
 recite-0.014
 explains-0.014
Token feather
Token ID17972
Feature activation-1.08e-02

Pos logit contributions
ar+0.039
 Nem+0.036
balls+0.034
orted+0.034
 Bub+0.034
Neg logit contributions
 acts-0.036
urious-0.034
 recite-0.033
itive-0.033
 amaz-0.031
Token.
Token ID13
Feature activation+5.24e-03

Pos logit contributions
ar+0.015
 Nem+0.014
 Bub+0.013
 home+0.012
 Twe+0.012
Neg logit contributions
urious-0.019
 acts-0.018
 typing-0.017
itive-0.017
 recite-0.017
Token She
Token ID1375
Feature activation-2.73e-03

Pos logit contributions
urious+0.009
 acts+0.009
 recite+0.008
 typing+0.008
 explains+0.008
Neg logit contributions
ar-0.007
 Nem-0.006
balls-0.006
 Bub-0.006
 bamboo-0.005
Token was
Token ID373
Feature activation+2.32e-03

Pos logit contributions
ar+0.004
 Nem+0.004
 Bub+0.004
 home+0.004
 Twe+0.004
Neg logit contributions
urious-0.004
 acts-0.004
itive-0.004
 typing-0.004
 recite-0.004
Token so
Token ID523
Feature activation+5.40e-03

Pos logit contributions
urious+0.004
 acts+0.004
 typing+0.004
 recite+0.004
 explains+0.004
Neg logit contributions
ar-0.003
 Nem-0.003
 Bub-0.002
ger-0.002
balls-0.002
Token excited
Token ID6568
Feature activation-1.51e-02

Pos logit contributions
urious+0.008
 acts+0.008
itive+0.007
 typing+0.007
 recite+0.007
Neg logit contributions
ar-0.009
 Nem-0.008
 Bub-0.008
 home-0.007
 Twe-0.007
Token!
Token ID0
Feature activation+9.93e-05

Pos logit contributions
ar+0.020
 Nem+0.018
 Bub+0.017
 bamboo+0.016
melon+0.015
Neg logit contributions
urious-0.027
 acts-0.026
itive-0.025
 typing-0.024
 explains-0.024
Token
Token ID198
Feature activation+1.48e-03

Pos logit contributions
urious+0.009
 acts+0.009
 explains+0.008
 typing+0.008
 amaz+0.008
Neg logit contributions
ar-0.007
 Nem-0.005
 bamboo-0.005
melon-0.005
 home-0.005
Token
Token ID198
Feature activation+1.06e-02

Pos logit contributions
urious+0.003
 explains+0.002
 acts+0.002
 recite+0.002
 typing+0.002
Neg logit contributions
ar-0.002
 Nem-0.002
melon-0.002
 Bub-0.002
 home-0.002
TokenJust
Token ID5703
Feature activation-5.00e-03

Pos logit contributions
urious+0.017
 acts+0.015
 typing+0.015
 recite+0.015
itive+0.015
Neg logit contributions
ar-0.016
 Nem-0.015
 Bub-0.015
 home-0.014
melon-0.014
Token when
Token ID618
Feature activation+1.96e-02

Pos logit contributions
ar+0.006
 Nem+0.006
 Bub+0.006
balls+0.005
melon+0.005
Neg logit contributions
urious-0.009
 acts-0.009
itive-0.008
 recite-0.008
 typing-0.008
Token the
Token ID262
Feature activation+3.02e-03

Pos logit contributions
 acts+0.030
 typing+0.029
 recite+0.029
urious+0.029
itive+0.029
Neg logit contributions
ar-0.025
 Nem-0.023
orted-0.022
inky-0.022
melon-0.022
Token rabbit
Token ID22746
Feature activation-2.30e-02

Pos logit contributions
urious+0.006
 acts+0.006
 typing+0.005
 recite+0.005
itive+0.005
Neg logit contributions
ar-0.004
 Nem-0.003
 Bub-0.003
 Twe-0.003
orted-0.003
Token was
Token ID373
Feature activation+3.16e-03

Pos logit contributions
ar+0.033
 Nem+0.029
 Bub+0.028
 home+0.026
ger+0.025
Neg logit contributions
urious-0.040
 acts-0.038
itive-0.038
 typing-0.037
ymes-0.037
Token about
Token ID546
Feature activation-1.07e-03

Pos logit contributions
urious+0.005
 acts+0.005
 typing+0.004
 recite+0.004
itive+0.004
Neg logit contributions
ar-0.005
 Nem-0.005
 Bub-0.004
 Twe-0.004
 bamboo-0.004
Token to
Token ID284
Feature activation+1.35e-02

Pos logit contributions
ar+0.001
 Nem+0.001
 Bub+0.001
 home+0.001
 Twe+0.001
Neg logit contributions
urious-0.002
 acts-0.002
itive-0.002
 typing-0.002
 recite-0.002
Token give
Token ID1577
Feature activation+9.74e-04

Pos logit contributions
urious+0.023
ymes+0.020
 acts+0.020
itive+0.019
asing+0.019
Neg logit contributions
 Nem-0.021
ar-0.020
 Bub-0.019
 home-0.019
 Twe-0.018
Token up
Token ID510
Feature activation+1.12e-02

Pos logit contributions
urious+0.002
 acts+0.002
 explains+0.002
 recite+0.002
 typing+0.001
Neg logit contributions
ar-0.001
 Nem-0.001
 Bub-0.001
melon-0.001
balls-0.001
Token they
Token ID484
Feature activation-6.54e-03

Pos logit contributions
ar+0.018
 Nem+0.018
 Bub+0.017
 home+0.016
 Twe+0.015
Neg logit contributions
urious-0.025
ymes-0.023
 acts-0.022
itive-0.022
 typing-0.022
Token said
Token ID531
Feature activation-1.51e-02

Pos logit contributions
ar+0.009
 Nem+0.009
 Bub+0.008
 home+0.008
 Twe+0.007
Neg logit contributions
urious-0.011
 acts-0.011
 typing-0.010
itive-0.010
 recite-0.010
Token goodbye
Token ID24829
Feature activation-1.21e-02

Pos logit contributions
ar+0.013
 Nem+0.011
 Bub+0.011
 home+0.009
melon+0.009
Neg logit contributions
 acts-0.032
urious-0.032
itive-0.031
 explains-0.031
 typing-0.030
Token to
Token ID284
Feature activation-3.37e-03

Pos logit contributions
ar+0.018
 Nem+0.017
 Bub+0.016
melon+0.016
 bamboo+0.016
Neg logit contributions
urious-0.017
 acts-0.017
itive-0.016
 typing-0.016
 recite-0.015
Token each
Token ID1123
Feature activation+1.46e-02

Pos logit contributions
 Nem+0.004
ar+0.004
 Bub+0.004
 home+0.004
 Twe+0.003
Neg logit contributions
urious-0.007
 acts-0.006
itive-0.006
 typing-0.006
ymes-0.006
Token other
Token ID584
Feature activation-1.58e-02

Pos logit contributions
urious+0.026
 acts+0.025
 typing+0.025
 explains+0.024
 recite+0.024
Neg logit contributions
ar-0.018
 Nem-0.017
 home-0.017
 Bub-0.016
 bamboo-0.015
Token.
Token ID13
Feature activation+3.03e-03

Pos logit contributions
ar+0.020
 Nem+0.018
 Bub+0.017
 home+0.017
 bamboo+0.015
Neg logit contributions
urious-0.029
 acts-0.028
itive-0.026
 typing-0.026
 recite-0.026
Token The
Token ID383
Feature activation+1.32e-02

Pos logit contributions
urious+0.005
 acts+0.004
 explains+0.004
 typing+0.004
 recite+0.004
Neg logit contributions
ar-0.005
 bamboo-0.004
 home-0.004
ger-0.004
balls-0.004
Token little
Token ID1310
Feature activation+1.32e-02

Pos logit contributions
urious+0.020
 acts+0.019
 typing+0.018
itive+0.018
 recite+0.017
Neg logit contributions
ar-0.021
 Nem-0.020
 Bub-0.019
 home-0.018
 Twe-0.017
Token bear
Token ID6842
Feature activation-2.57e-03

Pos logit contributions
urious+0.025
 acts+0.023
 typing+0.023
 explains+0.022
 recite+0.022
Neg logit contributions
ar-0.017
 Nem-0.016
 Bub-0.015
 home-0.014
 Twe-0.013
Token went
Token ID1816
Feature activation-2.35e-02

Pos logit contributions
ar+0.004
 Nem+0.004
 Bub+0.003
 home+0.003
 bamboo+0.003
Neg logit contributions
urious-0.004
 acts-0.004
 typing-0.004
itive-0.004
ymes-0.004
Token favorite
Token ID4004
Feature activation-8.69e-03

Pos logit contributions
ar+0.010
 Nem+0.009
 Bub+0.009
 Twe+0.008
orted+0.008
Neg logit contributions
urious-0.011
 acts-0.010
 typing-0.009
 recite-0.009
itive-0.009
Token spot
Token ID4136
Feature activation-1.10e-02

Pos logit contributions
ar+0.013
 Nem+0.011
 Bub+0.011
orted+0.010
 Twe+0.010
Neg logit contributions
 acts-0.014
urious-0.014
itive-0.013
 recite-0.013
 typing-0.013
Token and
Token ID290
Feature activation-1.49e-02

Pos logit contributions
ar+0.015
 Nem+0.014
 Bub+0.013
 Twe+0.012
 home+0.012
Neg logit contributions
urious-0.018
 acts-0.018
 typing-0.017
itive-0.017
 explains-0.016
Token he
Token ID339
Feature activation-4.69e-04

Pos logit contributions
ar+0.018
 Nem+0.018
 Bub+0.016
 home+0.016
 Twe+0.015
Neg logit contributions
urious-0.028
 acts-0.026
itive-0.025
ymes-0.025
 typing-0.025
Token could
Token ID714
Feature activation+1.77e-02

Pos logit contributions
ar+0.001
 Nem+0.001
 Bub+0.001
 home+0.001
 Twe+0.001
Neg logit contributions
urious-0.001
 acts-0.001
itive-0.001
 typing-0.001
 recite-0.001
Token go
Token ID467
Feature activation-9.59e-03

Pos logit contributions
urious+0.025
 acts+0.024
 recite+0.022
 typing+0.022
itive+0.022
Neg logit contributions
ar-0.030
 Nem-0.028
 Bub-0.027
 home-0.025
 Twe-0.025
Token on
Token ID319
Feature activation-7.19e-03

Pos logit contributions
ar+0.012
 Nem+0.011
 Bub+0.010
orted+0.010
balls+0.010
Neg logit contributions
 acts-0.016
urious-0.016
 recite-0.015
 typing-0.015
itive-0.014
Token magical
Token ID10883
Feature activation-2.51e-02

Pos logit contributions
ar+0.011
 Nem+0.010
 Bub+0.009
melon+0.009
orted+0.008
Neg logit contributions
urious-0.011
 acts-0.011
 typing-0.011
 recite-0.010
 explains-0.010
Token adventures
Token ID17545
Feature activation+6.45e-04

Pos logit contributions
ar+0.044
 Nem+0.042
 Bub+0.040
 home+0.038
 Twe+0.037
Neg logit contributions
urious-0.034
 acts-0.032
itive-0.029
 typing-0.029
 recite-0.029
Token whenever
Token ID8797
Feature activation+1.07e-02

Pos logit contributions
urious+0.001
 acts+0.001
itive+0.001
 typing+0.001
 recite+0.001
Neg logit contributions
ar-0.001
 Nem-0.001
 Bub-0.001
 home-0.001
 Twe-0.001
Token he
Token ID339
Feature activation+1.75e-02

Pos logit contributions
urious+0.022
 acts+0.022
itive+0.021
 typing+0.021
 recite+0.020
Neg logit contributions
ar-0.011
 Nem-0.009
 Bub-0.009
 home-0.008
melon-0.008
Token named
Token ID3706
Feature activation-3.03e-02

Pos logit contributions
ar+0.020
 Nem+0.019
balls+0.018
 Bub+0.017
orted+0.017
Neg logit contributions
 acts-0.024
urious-0.023
itive-0.022
 typing-0.021
 explains-0.021
Token Lily
Token ID20037
Feature activation-5.51e-03

Pos logit contributions
ar+0.035
 Nem+0.028
orted+0.026
melon+0.025
 Bub+0.025
Neg logit contributions
urious-0.059
 acts-0.058
 explains-0.054
 typing-0.054
 recite-0.053
Token.
Token ID13
Feature activation-4.88e-05

Pos logit contributions
ar+0.008
 Nem+0.007
 Bub+0.007
balls+0.006
orted+0.006
Neg logit contributions
urious-0.009
 acts-0.009
itive-0.008
 typing-0.008
 explains-0.008
Token They
Token ID1119
Feature activation+3.96e-03

Pos logit contributions
ar+0.000
 Nem+0.000
 home+0.000
 Bub+0.000
ger+0.000
Neg logit contributions
urious-0.000
 acts-0.000
 typing-0.000
 explains-0.000
 recite-0.000
Token loved
Token ID6151
Feature activation+3.62e-03

Pos logit contributions
urious+0.006
 acts+0.006
itive+0.005
 typing+0.005
 recite+0.005
Neg logit contributions
ar-0.006
 Nem-0.006
 Bub-0.005
 home-0.005
balls-0.005
Token to
Token ID284
Feature activation-1.43e-02

Pos logit contributions
urious+0.008
 acts+0.008
itive+0.007
 recite+0.007
 explains+0.007
Neg logit contributions
ar-0.003
 Nem-0.003
 Bub-0.003
 home-0.003
 Twe-0.003
Token run
Token ID1057
Feature activation-1.30e-02

Pos logit contributions
ar+0.018
 Nem+0.017
 Bub+0.016
 home+0.015
 Twe+0.015
Neg logit contributions
urious-0.028
 acts-0.025
ymes-0.025
itive-0.025
 typing-0.024
Token,
Token ID11
Feature activation-2.16e-03

Pos logit contributions
 Nem+0.024
ar+0.024
melon+0.024
 Bub+0.022
aun+0.022
Neg logit contributions
 acts-0.013
urious-0.013
 hears-0.011
ymes-0.011
 countries-0.011
Token jump
Token ID4391
Feature activation+4.54e-03

Pos logit contributions
 Nem+0.003
 Bub+0.003
 bamboo+0.003
ar+0.003
 Twe+0.003
Neg logit contributions
urious-0.004
 acts-0.004
ymes-0.004
itive-0.004
 explains-0.003
Token,
Token ID11
Feature activation+1.62e-02

Pos logit contributions
urious+0.005
 acts+0.005
itive+0.005
ymes+0.005
 typing+0.005
Neg logit contributions
ar-0.008
 Nem-0.008
 Bub-0.007
melon-0.007
 thirst-0.007
Token and
Token ID290
Feature activation+1.14e-02

Pos logit contributions
urious+0.032
ymes+0.029
itive+0.029
 acts+0.029
 recite+0.028
Neg logit contributions
 Nem-0.019
ar-0.019
 Bub-0.019
 home-0.018
balls-0.018
Token the
Token ID262
Feature activation+3.64e-03

Pos logit contributions
ar+0.011
 Nem+0.010
 Bub+0.009
 home+0.009
 Twe+0.009
Neg logit contributions
urious-0.015
 acts-0.014
itive-0.013
 typing-0.013
 recite-0.013
Token paper
Token ID3348
Feature activation-8.39e-03

Pos logit contributions
urious+0.006
 acts+0.006
itive+0.006
 recite+0.006
 typing+0.006
Neg logit contributions
ar-0.005
 home-0.004
 Nem-0.004
 Bub-0.004
 cab-0.004
Token with
Token ID351
Feature activation-2.01e-02

Pos logit contributions
ar+0.011
 Nem+0.011
 Bub+0.010
 Twe+0.009
 home+0.009
Neg logit contributions
urious-0.014
 acts-0.014
 recite-0.013
 typing-0.013
itive-0.013
Token her
Token ID607
Feature activation-6.78e-03

Pos logit contributions
ar+0.032
 Nem+0.027
 Bub+0.026
orted+0.026
balls+0.025
Neg logit contributions
 acts-0.030
urious-0.029
 recite-0.028
 amaz-0.027
 explains-0.027
Token best
Token ID1266
Feature activation-8.54e-03

Pos logit contributions
ar+0.010
 Nem+0.010
 Bub+0.009
 Twe+0.008
orted+0.008
Neg logit contributions
urious-0.011
 acts-0.010
 recite-0.010
itive-0.009
 explains-0.009
Token friend
Token ID1545
Feature activation-1.61e-02

Pos logit contributions
ar+0.015
 Nem+0.014
 Bub+0.014
 Twe+0.013
orted+0.013
Neg logit contributions
urious-0.011
 acts-0.011
 recite-0.010
 amaz-0.010
itive-0.010
Token.
Token ID13
Feature activation-9.84e-03

Pos logit contributions
ar+0.021
 Nem+0.019
 Bub+0.018
 Twe+0.017
 bamboo+0.016
Neg logit contributions
urious-0.028
 acts-0.028
 recite-0.026
itive-0.026
 typing-0.026
Token They
Token ID1119
Feature activation-6.68e-03

Pos logit contributions
ar+0.014
 Nem+0.013
melon+0.012
 Bub+0.011
 home+0.011
Neg logit contributions
urious-0.016
 acts-0.016
 explains-0.015
 recite-0.014
Ru-0.014
Token shared
Token ID4888
Feature activation-1.50e-02

Pos logit contributions
ar+0.010
 Nem+0.009
 Bub+0.009
 home+0.008
balls+0.008
Neg logit contributions
urious-0.010
 acts-0.010
 recite-0.009
 typing-0.009
itive-0.009
Token the
Token ID262
Feature activation-1.13e-03

Pos logit contributions
ar+0.020
 Nem+0.019
 Bub+0.018
 home+0.017
 Twe+0.016
Neg logit contributions
urious-0.026
 acts-0.025
itive-0.024
 typing-0.024
 recite-0.023
Token y
Token ID331
Feature activation-6.34e-03

Pos logit contributions
ar+0.002
 Nem+0.001
 home+0.001
 Bub+0.001
 cab+0.001
Neg logit contributions
urious-0.002
 acts-0.002
itive-0.002
 typing-0.002
 recite-0.002
Token We
Token ID775
Feature activation-7.86e-03

Pos logit contributions
ar+0.020
 Nem+0.019
 Bub+0.018
 home+0.017
 bamboo+0.017
Neg logit contributions
urious-0.017
 acts-0.017
 typing-0.016
 explains-0.016
 recite-0.015
Token wanted
Token ID2227
Feature activation-1.82e-02

Pos logit contributions
ar+0.015
 Nem+0.014
 Bub+0.013
 bamboo+0.013
balls+0.013
Neg logit contributions
urious-0.009
 acts-0.009
 explains-0.008
itive-0.008
 recite-0.008
Token to
Token ID284
Feature activation-9.20e-03

Pos logit contributions
ar+0.027
 Nem+0.023
 Bub+0.023
orted+0.022
ger+0.022
Neg logit contributions
 acts-0.029
urious-0.029
 explains-0.027
 typing-0.027
 recite-0.027
Token give
Token ID1577
Feature activation-1.39e-02

Pos logit contributions
ar+0.016
 Nem+0.015
 Bub+0.015
 home+0.014
balls+0.014
Neg logit contributions
urious-0.012
 acts-0.012
 recite-0.011
 typing-0.011
 explains-0.010
Token you
Token ID345
Feature activation-2.74e-02

Pos logit contributions
ar+0.022
 Bub+0.019
 Nem+0.019
orted+0.018
ger+0.018
Neg logit contributions
urious-0.021
 acts-0.021
 explains-0.019
ymes-0.019
 recite-0.019
Token a
Token ID257
Feature activation-4.92e-02

Pos logit contributions
ar+0.045
 Nem+0.041
 Bub+0.041
orted+0.039
balls+0.039
Neg logit contributions
urious-0.039
 acts-0.038
ymes-0.035
 explains-0.034
 recite-0.034
Token candle
Token ID26839
Feature activation-1.19e-02

Pos logit contributions
ar+0.090
 Nem+0.086
 Bub+0.080
orted+0.080
bank+0.079
Neg logit contributions
 recite-0.061
 acts-0.060
urious-0.058
 explains-0.058
ymes-0.054
Token,
Token ID11
Feature activation-2.07e-02

Pos logit contributions
ar+0.017
 Nem+0.016
 Bub+0.015
 Twe+0.014
orted+0.014
Neg logit contributions
 acts-0.019
urious-0.019
itive-0.018
 explains-0.018
 recite-0.017
Token but
Token ID475
Feature activation-1.77e-02

Pos logit contributions
ar+0.029
 Nem+0.029
 Bub+0.027
 home+0.026
 Twe+0.025
Neg logit contributions
urious-0.034
 acts-0.033
itive-0.031
 typing-0.030
 recite-0.030
Token we
Token ID356
Feature activation+5.54e-03

Pos logit contributions
 Nem+0.017
ar+0.016
 Bub+0.015
 home+0.014
 Twe+0.013
Neg logit contributions
urious-0.039
 acts-0.036
itive-0.036
 recite-0.035
ymes-0.035
Token did
Token ID750
Feature activation-6.14e-03

Pos logit contributions
urious+0.007
 acts+0.007
itive+0.006
 typing+0.006
 recite+0.006
Neg logit contributions
ar-0.010
 Nem-0.009
 Bub-0.009
 home-0.008
 Twe-0.008
Token did
Token ID750
Feature activation-2.11e-02

Pos logit contributions
 Nem+0.017
ar+0.017
 Bub+0.015
 home+0.015
melon+0.013
Neg logit contributions
urious-0.024
itive-0.024
ymes-0.024
 acts-0.023
 typing-0.023
Token it
Token ID340
Feature activation-1.61e-02

Pos logit contributions
ar+0.030
 Nem+0.028
 Bub+0.026
orted+0.026
 bamboo+0.025
Neg logit contributions
 recite-0.034
 typing-0.033
urious-0.033
 acts-0.033
 explains-0.032
Token again
Token ID757
Feature activation+1.04e-02

Pos logit contributions
ar+0.019
 Nem+0.018
 home+0.017
 Bub+0.017
melon+0.015
Neg logit contributions
urious-0.031
 acts-0.031
ymes-0.028
 typing-0.028
 explains-0.028
Token and
Token ID290
Feature activation-1.50e-02

Pos logit contributions
ymes+0.024
urious+0.023
itive+0.022
 acts+0.022
 typing+0.022
Neg logit contributions
 home-0.009
 Nem-0.008
ar-0.008
 Bub-0.006
 away-0.006
Token she
Token ID673
Feature activation-1.57e-03

Pos logit contributions
ar+0.021
 Nem+0.020
 Bub+0.019
 home+0.017
 Twe+0.017
Neg logit contributions
urious-0.025
 acts-0.024
 typing-0.023
itive-0.023
 recite-0.023
Token learnt
Token ID26338
Feature activation-4.27e-02

Pos logit contributions
ar+0.003
 Nem+0.002
 Bub+0.002
 home+0.002
 Twe+0.002
Neg logit contributions
urious-0.002
 acts-0.002
itive-0.002
 typing-0.002
 recite-0.002
Token her
Token ID607
Feature activation-3.87e-03

Pos logit contributions
ar+0.074
 Nem+0.066
 Bub+0.063
balls+0.061
orted+0.061
Neg logit contributions
urious-0.059
 explains-0.058
 acts-0.058
 typing-0.054
 recite-0.054
Token lesson
Token ID11483
Feature activation-2.25e-02

Pos logit contributions
ar+0.006
 Nem+0.005
 Bub+0.005
 Twe+0.005
 home+0.004
Neg logit contributions
urious-0.006
 acts-0.006
 explains-0.006
itive-0.006
 recite-0.006
Token.
Token ID13
Feature activation+4.61e-03

Pos logit contributions
ar+0.026
 Nem+0.024
 Bub+0.022
 home+0.020
 Twe+0.020
Neg logit contributions
urious-0.044
 acts-0.042
itive-0.040
 typing-0.040
 recite-0.040
Token She
Token ID1375
Feature activation+1.54e-03

Pos logit contributions
urious+0.008
 acts+0.008
 explains+0.007
 recite+0.007
 typing+0.007
Neg logit contributions
ar-0.007
 Nem-0.006
 Bub-0.005
 bamboo-0.005
balls-0.005
Token loved
Token ID6151
Feature activation-1.01e-02

Pos logit contributions
urious+0.002
 acts+0.002
itive+0.002
 typing+0.002
ymes+0.002
Neg logit contributions
ar-0.002
 Nem-0.002
 Bub-0.002
 home-0.002
 Twe-0.002
Token
Token ID198
Feature activation+3.42e-03

Pos logit contributions
ar+0.005
 Nem+0.004
 Bub+0.004
melon+0.004
 Twe+0.004
Neg logit contributions
urious-0.007
 acts-0.006
 explains-0.006
 recite-0.006
 typing-0.006
TokenThe
Token ID464
Feature activation-3.28e-03

Pos logit contributions
urious+0.006
 acts+0.005
 typing+0.005
 recite+0.005
itive+0.005
Neg logit contributions
ar-0.005
 Nem-0.005
 Bub-0.004
 home-0.004
 Twe-0.004
Token monkey
Token ID21657
Feature activation+3.13e-03

Pos logit contributions
ar+0.004
 Nem+0.004
 Bub+0.004
 home+0.004
 bamboo+0.004
Neg logit contributions
urious-0.006
 acts-0.006
itive-0.005
 typing-0.005
 explains-0.005
Tokenâ
Token ID22940
Feature activation+7.10e-05

Pos logit contributions
urious+0.005
 acts+0.005
 typing+0.005
itive+0.005
ymes+0.005
Neg logit contributions
ar-0.005
 Nem-0.004
 Bub-0.004
 home-0.004
ger-0.003
Token
Token ID26391
Feature activation-1.50e-02

Pos logit contributions
urious+0.000
 acts+0.000
itive+0.000
 typing+0.000
 recite+0.000
Neg logit contributions
ar-0.000
 Nem-0.000
 Bub-0.000
 home-0.000
 Twe-0.000
Token
Token ID8151
Feature activation-4.27e-02

Pos logit contributions
ar+0.022
 Nem+0.021
 Bub+0.020
 home+0.019
 Twe+0.018
Neg logit contributions
urious-0.024
 acts-0.023
itive-0.022
 typing-0.022
 recite-0.022
Tokens
Token ID82
Feature activation+2.32e-03

Pos logit contributions
ar+0.072
 Nem+0.069
 Bub+0.066
 bamboo+0.062
balls+0.061
Neg logit contributions
urious-0.055
 acts-0.049
 typing-0.047
itive-0.047
 explains-0.046
Token friends
Token ID2460
Feature activation-2.56e-03

Pos logit contributions
urious+0.003
 acts+0.003
itive+0.002
 recite+0.002
 typing+0.002
Neg logit contributions
ar-0.004
 Nem-0.004
 Bub-0.004
 Twe-0.004
orted-0.004
Token were
Token ID547
Feature activation+1.74e-03

Pos logit contributions
ar+0.004
 Nem+0.003
 Bub+0.003
 home+0.003
 Twe+0.003
Neg logit contributions
urious-0.004
 acts-0.004
itive-0.004
 typing-0.004
ymes-0.004
Token happy
Token ID3772
Feature activation-3.36e-03

Pos logit contributions
urious+0.003
 typing+0.002
 acts+0.002
 recite+0.002
itive+0.002
Neg logit contributions
ar-0.003
 Nem-0.002
 Bub-0.002
ger-0.002
orted-0.002
Token and
Token ID290
Feature activation-1.45e-02

Pos logit contributions
ar+0.005
 Nem+0.004
 Bub+0.004
inky+0.004
orted+0.004
Neg logit contributions
urious-0.006
 acts-0.006
 typing-0.005
itive-0.005
 explains-0.005
Token can
Token ID460
Feature activation+1.29e-02

Pos logit contributions
urious+0.028
 acts+0.026
itive+0.025
 typing+0.025
 recite+0.025
Neg logit contributions
ar-0.029
 Nem-0.027
 Bub-0.025
 home-0.024
 Twe-0.023
Token find
Token ID1064
Feature activation-1.32e-02

Pos logit contributions
urious+0.020
 acts+0.019
ymes+0.018
itive+0.018
 typing+0.017
Neg logit contributions
ar-0.020
 Nem-0.019
 home-0.019
 Bub-0.018
 Twe-0.018
Token a
Token ID257
Feature activation-4.53e-02

Pos logit contributions
ar+0.018
 Nem+0.016
 Bub+0.015
melon+0.015
balls+0.014
Neg logit contributions
 acts-0.023
urious-0.023
 explains-0.022
 recite-0.021
 typing-0.021
Token princess
Token ID21752
Feature activation-4.99e-03

Pos logit contributions
ar+0.079
orted+0.077
 Nem+0.071
aun+0.066
balls+0.066
Neg logit contributions
 recite-0.055
urious-0.054
 amaz-0.054
 acts-0.053
 typing-0.052
Token or
Token ID393
Feature activation-1.78e-02

Pos logit contributions
ar+0.007
 Nem+0.006
 Bub+0.006
 home+0.005
 Twe+0.005
Neg logit contributions
urious-0.009
 acts-0.008
itive-0.008
 typing-0.008
 recite-0.008
Token a
Token ID257
Feature activation-4.29e-02

Pos logit contributions
ar+0.019
 Nem+0.019
 Bub+0.018
 home+0.017
 Twe+0.017
Neg logit contributions
urious-0.036
 acts-0.033
itive-0.032
 typing-0.031
 explains-0.031
Token dragon
Token ID10441
Feature activation-8.89e-03

Pos logit contributions
ar+0.078
orted+0.077
 Nem+0.070
aun+0.066
icle+0.064
Neg logit contributions
 recite-0.047
 acts-0.046
 forget-0.044
 typing-0.043
 amaz-0.043
Token inside
Token ID2641
Feature activation-1.59e-03

Pos logit contributions
ar+0.012
 Nem+0.011
 Bub+0.010
 home+0.009
 Twe+0.009
Neg logit contributions
urious-0.016
 acts-0.015
itive-0.014
 typing-0.014
 recite-0.014
Token."
Token ID526
Feature activation-4.96e-03

Pos logit contributions
ar+0.002
 Nem+0.001
 home+0.001
 Bub+0.001
 Twe+0.001
Neg logit contributions
urious-0.003
 acts-0.003
 typing-0.003
 recite-0.003
itive-0.003
Token
Token ID198
Feature activation-5.65e-03

Pos logit contributions
ar+0.008
 bamboo+0.007
age+0.007
melon+0.007
 Bub+0.007
Neg logit contributions
urious-0.007
 acts-0.007
 explains-0.007
 typing-0.006
 encourages-0.006
Token
Token ID198
Feature activation+7.09e-04

Pos logit contributions
ar+0.008
 Nem+0.007
 Bub+0.007
melon+0.007
 bamboo+0.007
Neg logit contributions
urious-0.009
 explains-0.009
 acts-0.009
 recite-0.008
 typing-0.008
Token day
Token ID1110
Feature activation-7.61e-03

Pos logit contributions
urious+0.008
 acts+0.008
itive+0.007
 typing+0.007
 recite+0.007
Neg logit contributions
ar-0.003
 Nem-0.003
 Bub-0.002
 home-0.002
 bamboo-0.002
Token,
Token ID11
Feature activation-1.91e-02

Pos logit contributions
ar+0.010
age+0.009
 Nem+0.009
bank+0.009
 bamboo+0.009
Neg logit contributions
 acts-0.012
urious-0.012
itive-0.011
 explains-0.011
 typing-0.011
Token a
Token ID257
Feature activation+7.51e-03

Pos logit contributions
ar+0.035
age+0.033
round+0.033
bank+0.032
anger+0.032
Neg logit contributions
urious-0.020
 acts-0.019
 do-0.018
 hears-0.016
 based-0.016
Token little
Token ID1310
Feature activation-9.69e-03

Pos logit contributions
urious+0.009
 acts+0.009
 typing+0.008
itive+0.008
 recite+0.008
Neg logit contributions
ar-0.014
 Nem-0.013
 Bub-0.012
 home-0.012
orted-0.011
Token girl
Token ID2576
Feature activation-2.09e-02

Pos logit contributions
ar+0.018
 Nem+0.016
 Bub+0.015
balls+0.015
orted+0.015
Neg logit contributions
 acts-0.012
urious-0.011
itive-0.010
 typing-0.010
ymes-0.010
Token named
Token ID3706
Feature activation-4.96e-02

Pos logit contributions
ar+0.022
 Nem+0.020
 Bub+0.018
 home+0.016
balls+0.016
Neg logit contributions
urious-0.043
 acts-0.042
 typing-0.039
itive-0.039
 recite-0.039
Token Lily
Token ID20037
Feature activation-1.52e-02

Pos logit contributions
ar+0.065
 Nem+0.057
bank+0.051
orted+0.051
balls+0.051
Neg logit contributions
urious-0.087
 acts-0.083
 explains-0.079
 recite-0.079
 typing-0.079
Token was
Token ID373
Feature activation+1.79e-02

Pos logit contributions
ar+0.020
 Nem+0.019
 Bub+0.017
 home+0.017
 Twe+0.016
Neg logit contributions
urious-0.027
 acts-0.025
itive-0.025
 recite-0.024
 typing-0.024
Token playing
Token ID2712
Feature activation-9.17e-03

Pos logit contributions
urious+0.031
 acts+0.030
itive+0.028
 recite+0.028
 explains+0.028
Neg logit contributions
ar-0.024
 Nem-0.023
 Bub-0.021
 home-0.020
 Twe-0.019
Token in
Token ID287
Feature activation-9.11e-03

Pos logit contributions
ar+0.014
 Nem+0.013
 Bub+0.012
melon+0.011
orted+0.011
Neg logit contributions
urious-0.014
 acts-0.014
 typing-0.013
 recite-0.013
itive-0.013
Token the
Token ID262
Feature activation-7.60e-03

Pos logit contributions
ar+0.015
 Nem+0.014
 Bub+0.013
balls+0.012
melon+0.012
Neg logit contributions
urious-0.013
 acts-0.013
 typing-0.012
 recite-0.012
 explains-0.012