TOP ACTIVATIONS
MAX = 3.613

looks fun to play in ," thought Spe edy . So
with joy and said " me asuring is so much fun
air . M illy gasped with amaz ement !
le and he had to sne e ze ! He put
around and soon her dad joined in . They laughed as
I had something to do ," the screw cried .
The little boy smiled with delight . They
This rock is so hot ," said the little rock .
isy , always smiling and laughing . One day a little
of nowhere . John shouted with surprise and the squirrel sc
in her nose and she sne ez ed so loud that
" Me too ! Me too !" Sam said .
and sne ez ed and sne ez ed . " B
he can 't . He sne ez es very loud and
riches she had found . R yle e was glad she
the car 's owner said ," Please make my car better
on the tree and feel the sun on its skin .
do it again ! You sne e ze and I 'll
pur red and me owed with pleasure . It was very
felt sick and had to sne e ze a lot .

MIN ACTIVATIONS
MIN = -4.446

a good dog named D ott ie . D ott ie
loyal little girl named L ott ie . Every day ,
a kind deer named D ott ie appeared . " Leave
were two friends , Ab ig ail and Mat ty .
said , " Hello D ott ie ! This is a
there were two friends , Lydia and Tim . Every day
were two sisters , M illy and Molly . M illy
my told her about the broken whistle . His mom said
fun . They thanked the old man for allowing them to
had a tutor named Mrs . Smith . Mrs . Smith
she was inside , D ott ie saw lots of colorful
both danced around the z ig zag together ! <|endoftext|> John
" I own this yellow shield !" Lily replied with
even cut er than the old one . Bun Bun was
her elderly neighbor , Mrs . Johnson , struggling to carry
on the beach . The old man invited the boy to
Suddenly , they sp ott led a bicycle . It
there were two cats , Tig gy and Snow y .
not bigger than a fing ern ail . They also found
Exc use me , Mr . Dragon , are you friendly

INTERVAL 1.598 - 3.613
CONTAINS 0.555%

and Kitty both looked pretty with their makeup . They played
, there was an elderly couple who owned a big farm
She liked to run around in the bright sunlight , looking
a good boy , really ." she said to the man
now eat the mineral . Y um ! The

INTERVAL -0.416 - 1.598
CONTAINS 74.825%

us to Grand ma and Grand pa 's house . Are
a time , there was a girl named Lucy . She
inside with Mom and enjoyed their snack and their book .
playing in his room , when all of a sudden he
her Mom said yes . S ue put the

INTERVAL -2.431 - -0.416
CONTAINS 24.576%

of ways to solve the giant 's puzzle . Finally ,
to success . <|endoftext|> Jack was very angry today . He
, we are sorry ," Anna says . " We did
, I can ." She went to the kitchen
. They also liked to look at the animals that lived

INTERVAL -4.446 - -2.431
CONTAINS 0.044%

put the salt on the broken wheel , and it stuck
want to play with his toy car alone . He wished
smiled and said , " That 's a great idea ,
friends . One was an old house and one was a
was excited and thanked the old woman . As the little
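
NOTE ON INTERVALS
The four interval boundaries above look consistent with equal-width bins spanning MIN to MAX. The sketch below is a minimal check of that reading, not the tool's actual method: the helper name interval_edges and the equal-width assumption are hypothetical, and only the MIN, MAX and CONTAINS figures come from the dump.

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced edges from lo to hi (assumed binning)."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# MIN and MAX as reported in the dump above.
edges = interval_edges(-4.446, 3.613)

# CONTAINS percentages as reported, highest interval first.
shares = [0.555, 74.825, 24.576, 0.044]
total_share = sum(shares)

if __name__ == "__main__":
    # Print each assumed bin next to its reported share, highest bin first.
    bins = list(zip(edges[:-1], edges[1:]))[::-1]
    for (low, high), share in zip(bins, shares):
        print(f"INTERVAL {low:.3f} - {high:.3f} CONTAINS {share}%")
    print(f"TOTAL {total_share:.3f}%")
```

Under this assumption the computed edges agree with the printed boundaries to within display rounding, and the four reported shares sum to 100%.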
Token looks
Token ID3073
Feature activation+9.71e-01

Pos logit contributions
 faces+0.009
Thank+0.009
 Sorry+0.009
ulating+0.009
 sav+0.008
Neg logit contributions
ert-0.011
oris-0.010
izzy-0.010
ion-0.010
 log-0.009
Token fun
Token ID1257
Feature activation+2.51e-01

Pos logit contributions
 Sorry+0.026
Thank+0.025
thank+0.025
 endings+0.024
 faces+0.024
Neg logit contributions
 log-0.030
ert-0.030
izzy-0.027
oris-0.026
 stool-0.026
Token to
Token ID284
Feature activation-6.85e-01

Pos logit contributions
 faces+0.006
ulating+0.006
 Sorry+0.006
Thank+0.006
 sav+0.005
Neg logit contributions
ert-0.010
oris-0.009
izzy-0.009
ion-0.009
 log-0.008
Token play
Token ID711
Feature activation+1.77e-01

Pos logit contributions
ert+0.029
oris+0.027
izzy+0.026
 log+0.026
ion+0.026
Neg logit contributions
 faces-0.014
ulating-0.014
 Sorry-0.014
Thank-0.013
 sav-0.013
Token in
Token ID287
Feature activation+8.91e-01

Pos logit contributions
 faces+0.004
 Sorry+0.004
ulating+0.004
Thank+0.004
 sav+0.003
Neg logit contributions
ert-0.007
oris-0.007
ion-0.007
izzy-0.007
 log-0.007
Token,"
Token ID553
Feature activation+3.61e+00

Pos logit contributions
 faces+0.021
Thank+0.020
 Sorry+0.020
ulating+0.020
 sav+0.020
Neg logit contributions
ert-0.035
oris-0.033
izzy-0.031
ion-0.031
 log-0.029
Token thought
Token ID1807
Feature activation+6.43e-01

Pos logit contributions
ulating+0.124
umm+0.119
 habitat+0.114
thank+0.107
 sav+0.106
Neg logit contributions
ert-0.103
oris-0.090
 log-0.086
izzy-0.084
ion-0.079
Token Spe
Token ID2531
Feature activation-5.11e-01

Pos logit contributions
 faces+0.014
ulating+0.014
 Sorry+0.014
aw+0.013
Thank+0.013
Neg logit contributions
ert-0.025
oris-0.024
izzy-0.023
ion-0.023
 stool-0.022
Tokenedy
Token ID4716
Feature activation-6.97e-01

Pos logit contributions
ert+0.018
izzy+0.017
oris+0.017
ion+0.017
 stool+0.016
Neg logit contributions
ulating-0.014
 faces-0.013
aw-0.013
Thank-0.013
 sav-0.013
Token.
Token ID13
Feature activation+6.52e-01

Pos logit contributions
ert+0.030
oris+0.028
izzy+0.027
ion+0.026
 log+0.026
Neg logit contributions
 faces-0.014
 Sorry-0.014
Thank-0.014
ulating-0.013
 sav-0.013
Token So
Token ID1406
Feature activation+1.86e+00

Pos logit contributions
 faces+0.017
ulating+0.017
 Sorry+0.017
Thank+0.016
 sav+0.016
Neg logit contributions
ert-0.024
oris-0.022
izzy-0.021
ion-0.020
 log-0.020
Token with
Token ID351
Feature activation+2.69e+00

Pos logit contributions
 Sorry+0.018
ulating+0.016
 sav+0.016
 faces+0.016
 pige+0.016
Neg logit contributions
ert-0.018
oris-0.017
izzy-0.015
 log-0.015
 stool-0.015
Token joy
Token ID8716
Feature activation-4.07e-01

Pos logit contributions
clock+0.065
 Sorry+0.063
ulating+0.062
 faces+0.058
 habitat+0.058
Neg logit contributions
ert-0.091
ion-0.090
oris-0.088
izzy-0.086
ions-0.085
Token and
Token ID290
Feature activation+6.59e-01

Pos logit contributions
ert+0.015
oris+0.013
ion+0.012
izzy+0.012
 stool+0.012
Neg logit contributions
 Sorry-0.012
Thank-0.011
 sav-0.011
ulating-0.010
clock-0.010
Token said
Token ID531
Feature activation+1.04e+00

Pos logit contributions
ulating+0.016
clock+0.014
 habitat+0.014
 faces+0.014
aw+0.013
Neg logit contributions
ert-0.026
oris-0.023
izzy-0.022
ion-0.022
 log-0.021
Token "
Token ID366
Feature activation+5.39e-02

Pos logit contributions
ulating+0.027
 faces+0.026
 sav+0.026
clock+0.026
Thank+0.026
Neg logit contributions
ert-0.037
oris-0.034
ion-0.033
izzy-0.032
 stool-0.031
Tokenme
Token ID1326
Feature activation+3.17e+00

Pos logit contributions
 faces+0.001
ulating+0.001
 face+0.001
 Sorry+0.001
 habitat+0.001
Neg logit contributions
ert-0.002
oris-0.002
izzy-0.002
Alex-0.002
ions-0.002
Tokenasuring
Token ID45925
Feature activation-7.09e-01

Pos logit contributions
 sudden+0.101
 faces+0.096
ulating+0.092
 sav+0.087
 streets+0.085
Neg logit contributions
ert-0.104
oris-0.094
ions-0.090
ion-0.089
izzy-0.088
Token is
Token ID318
Feature activation+4.62e-01

Pos logit contributions
ert+0.027
oris+0.025
izzy+0.023
ion+0.023
 log+0.022
Neg logit contributions
 faces-0.018
 Sorry-0.018
Thank-0.017
ulating-0.017
 sav-0.017
Token so
Token ID523
Feature activation+5.66e-01

Pos logit contributions
 faces+0.012
ulating+0.012
 Sorry+0.012
 sudden+0.011
Thank+0.011
Neg logit contributions
ert-0.017
ion-0.015
oris-0.015
izzy-0.014
 log-0.014
Token much
Token ID881
Feature activation+1.01e-01

Pos logit contributions
 faces+0.015
thank+0.013
view+0.013
ulating+0.013
 habitat+0.013
Neg logit contributions
ert-0.021
oris-0.019
ion-0.019
izzy-0.019
igsaw-0.018
Token fun
Token ID1257
Feature activation+5.72e-01

Pos logit contributions
 faces+0.002
 Sorry+0.002
ulating+0.002
view+0.002
Thank+0.002
Neg logit contributions
ert-0.004
ion-0.004
oris-0.003
 log-0.003
 outage-0.003
Token air
Token ID1633
Feature activation-9.77e-02

Pos logit contributions
 Sorry+0.014
ulating+0.014
 argued+0.013
 pige+0.012
Thank+0.012
Neg logit contributions
 sheet-0.016
 log-0.016
 spider-0.016
ert-0.015
 mattress-0.015
Token.
Token ID13
Feature activation+2.27e-01

Pos logit contributions
ert+0.003
ion+0.002
 sheet+0.002
 log+0.002
zag+0.002
Neg logit contributions
 Sorry-0.003
Thank-0.003
 sudden-0.003
ulating-0.003
 pige-0.003
Token M
Token ID337
Feature activation+1.39e+00

Pos logit contributions
 faces+0.006
 Sorry+0.005
Thank+0.005
ulating+0.005
 sav+0.005
Neg logit contributions
ert-0.009
oris-0.008
izzy-0.008
ion-0.008
 log-0.007
Tokenilly
Token ID6548
Feature activation-8.85e-01

Pos logit contributions
 faces+0.048
ulating+0.047
Thank+0.047
 Sorry+0.047
 sav+0.046
Neg logit contributions
ert-0.039
oris-0.035
ion-0.033
izzy-0.032
 log-0.032
Token gasped
Token ID45236
Feature activation-3.71e-01

Pos logit contributions
ert+0.036
oris+0.032
izzy+0.031
 log+0.030
ion+0.030
Neg logit contributions
 Sorry-0.020
Thank-0.020
ulating-0.020
 faces-0.019
thank-0.018
Token with
Token ID351
Feature activation+3.07e+00

Pos logit contributions
ert+0.013
oris+0.012
 log+0.011
izzy+0.011
ion+0.011
Neg logit contributions
 Sorry-0.010
Thank-0.010
clock-0.009
ulating-0.009
 sav-0.009
Token amaz
Token ID40642
Feature activation-1.74e-01

Pos logit contributions
ulating+0.061
clock+0.061
 Sorry+0.059
thank+0.057
 faces+0.054
Neg logit contributions
ert-0.119
oris-0.114
 log-0.112
ion-0.110
izzy-0.108
Tokenement
Token ID972
Feature activation-5.26e-01

Pos logit contributions
ert+0.006
oris+0.005
ion+0.005
izzy+0.005
ions+0.005
Neg logit contributions
 Sorry-0.005
Thank-0.005
ulating-0.005
 faces-0.005
 sav-0.005
Token!
Token ID0
Feature activation+3.10e-02

Pos logit contributions
ert+0.020
oris+0.017
izzy+0.016
ion+0.016
 log+0.016
Neg logit contributions
Thank-0.014
 Sorry-0.014
 sav-0.013
ulating-0.013
thank-0.013
Token
Token ID198
Feature activation+4.50e-01

Pos logit contributions
 Sorry+0.001
 faces+0.001
Thank+0.001
ulating+0.001
 sav+0.001
Neg logit contributions
ert-0.001
oris-0.001
ion-0.001
izzy-0.001
 log-0.001
Token
Token ID198
Feature activation-1.79e-01

Pos logit contributions
Thank+0.006
 Sorry+0.006
 sudden+0.006
 faces+0.005
 sav+0.005
Neg logit contributions
ert-0.022
oris-0.021
ion-0.020
izzy-0.020
 log-0.020
Tokenle
Token ID293
Feature activation+1.33e+00

Pos logit contributions
 Sorry+0.043
Thank+0.043
ulating+0.040
 faces+0.040
 sav+0.039
Neg logit contributions
ert-0.050
oris-0.045
izzy-0.044
ion-0.044
 log-0.043
Token and
Token ID290
Feature activation+3.15e-01

Pos logit contributions
Thank+0.039
 Sorry+0.038
ulating+0.036
thank+0.035
 sav+0.034
Neg logit contributions
ert-0.047
ion-0.042
 log-0.042
oris-0.041
izzy-0.041
Token he
Token ID339
Feature activation+5.27e-01

Pos logit contributions
ulating+0.011
aw+0.010
view+0.009
 Sorry+0.009
 endings+0.009
Neg logit contributions
ert-0.009
 log-0.009
 stool-0.008
 Ted-0.008
 spider-0.008
Token had
Token ID550
Feature activation+5.85e-01

Pos logit contributions
 Sorry+0.011
ulating+0.011
 faces+0.011
Thank+0.011
clock+0.010
Neg logit contributions
ert-0.022
oris-0.020
izzy-0.020
ion-0.019
 log-0.019
Token to
Token ID284
Feature activation+3.69e-01

Pos logit contributions
 Sorry+0.018
ulating+0.016
Thank+0.016
 faces+0.015
thank+0.015
Neg logit contributions
ert-0.020
 log-0.018
oris-0.017
ion-0.017
izzy-0.016
Token sne
Token ID10505
Feature activation+3.06e+00

Pos logit contributions
ulating+0.010
 Sorry+0.010
Thank+0.010
clock+0.009
 sudden+0.009
Neg logit contributions
ert-0.013
 log-0.012
ion-0.011
oris-0.011
izzy-0.011
Tokene
Token ID68
Feature activation+5.92e-01

Pos logit contributions
 Sorry+0.095
ulating+0.094
 faces+0.092
Thank+0.091
 sav+0.087
Neg logit contributions
ert-0.095
izzy-0.091
ion-0.090
oris-0.084
ions-0.082
Tokenze
Token ID2736
Feature activation+2.31e+00

Pos logit contributions
Thank+0.014
 Sorry+0.013
ulating+0.012
 sav+0.012
clock+0.012
Neg logit contributions
ert-0.024
oris-0.021
izzy-0.021
ion-0.021
 log-0.021
Token!
Token ID0
Feature activation+8.72e-02

Pos logit contributions
 Sorry+0.065
Thank+0.062
ulating+0.060
 sav+0.058
 sudden+0.058
Neg logit contributions
ert-0.083
ion-0.075
 log-0.074
izzy-0.072
oris-0.071
Token He
Token ID679
Feature activation-1.94e-02

Pos logit contributions
 faces+0.002
 Sorry+0.002
Thank+0.002
ulating+0.002
 sav+0.002
Neg logit contributions
ert-0.004
oris-0.003
izzy-0.003
ion-0.003
 log-0.003
Token put
Token ID1234
Feature activation+1.04e+00

Pos logit contributions
ert+0.001
oris+0.001
izzy+0.001
ion+0.001
 log+0.001
Neg logit contributions
 Sorry-0.000
ulating-0.000
 faces-0.000
Thank-0.000
 sav-0.000
Token around
Token ID1088
Feature activation+5.83e-01

Pos logit contributions
 Sorry+0.000
Thank+0.000
 pige+0.000
 faces+0.000
ulating+0.000
Neg logit contributions
ert-0.000
oris-0.000
 log-0.000
ion-0.000
zag-0.000
Token and
Token ID290
Feature activation+1.00e-01

Pos logit contributions
 Sorry+0.019
ulating+0.018
Thank+0.017
 faces+0.017
clock+0.016
Neg logit contributions
ert-0.019
ion-0.017
oris-0.017
izzy-0.017
ions-0.016
Token soon
Token ID2582
Feature activation+9.39e-01

Pos logit contributions
ulating+0.003
 Sorry+0.003
umm+0.003
 habitat+0.003
clock+0.003
Neg logit contributions
ert-0.003
oris-0.003
 log-0.003
izzy-0.003
 Ted-0.003
Token her
Token ID607
Feature activation+4.40e-01

Pos logit contributions
 Sorry+0.028
ulating+0.027
Thank+0.026
clock+0.026
 faces+0.025
Neg logit contributions
ert-0.032
oris-0.029
izzy-0.028
ion-0.027
 log-0.026
Token dad
Token ID9955
Feature activation+1.08e+00

Pos logit contributions
Thank+0.014
ulating+0.014
 Sorry+0.014
thank+0.012
aw+0.012
Neg logit contributions
ert-0.015
 log-0.012
 stool-0.012
oris-0.012
 sheet-0.011
Token joined
Token ID5399
Feature activation+3.01e+00

Pos logit contributions
 Sorry+0.034
Thank+0.031
ulating+0.030
 habitat+0.029
clock+0.029
Neg logit contributions
ert-0.039
oris-0.033
izzy-0.033
ion-0.031
 log-0.031
Token in
Token ID287
Feature activation+1.55e+00

Pos logit contributions
 Sorry+0.088
ulating+0.088
Thank+0.087
 sav+0.086
clock+0.086
Neg logit contributions
ert-0.102
ion-0.092
oris-0.088
izzy-0.087
ions-0.084
Token.
Token ID13
Feature activation+8.70e-01

Pos logit contributions
 Sorry+0.039
ulating+0.039
 faces+0.038
Thank+0.037
clock+0.035
Neg logit contributions
ert-0.058
ion-0.053
oris-0.053
izzy-0.051
 log-0.051
Token They
Token ID1119
Feature activation+2.41e-01

Pos logit contributions
 faces+0.023
 Sorry+0.023
 sudden+0.022
ulating+0.022
Thank+0.022
Neg logit contributions
ert-0.029
oris-0.028
ion-0.027
izzy-0.026
 stool-0.025
Token laughed
Token ID13818
Feature activation+9.55e-01

Pos logit contributions
 Sorry+0.005
ulating+0.005
 faces+0.005
Thank+0.004
clock+0.004
Neg logit contributions
ert-0.011
oris-0.010
izzy-0.010
ion-0.009
 log-0.009
Token as
Token ID355
Feature activation+1.89e-01

Pos logit contributions
 Sorry+0.027
Thank+0.025
ulating+0.025
 sav+0.025
clock+0.025
Neg logit contributions
ert-0.032
oris-0.029
 log-0.028
ion-0.027
izzy-0.027
Token I
Token ID314
Feature activation-3.30e-01

Pos logit contributions
 faces+0.006
 Sorry+0.006
Thank+0.006
ulating+0.006
 sav+0.006
Neg logit contributions
ert-0.010
oris-0.010
izzy-0.009
ion-0.009
 log-0.009
Token had
Token ID550
Feature activation-1.16e-01

Pos logit contributions
ert+0.013
oris+0.013
ion+0.012
izzy+0.012
 log+0.012
Neg logit contributions
 faces-0.007
 Sorry-0.007
Thank-0.007
ulating-0.007
 sav-0.007
Token something
Token ID1223
Feature activation-2.15e-01

Pos logit contributions
ert+0.003
ion+0.003
 log+0.003
 sheet+0.003
oris+0.003
Neg logit contributions
ulating-0.004
 Sorry-0.004
Thank-0.004
umm-0.004
 faces-0.003
Token to
Token ID284
Feature activation-8.24e-01

Pos logit contributions
ert+0.007
ion+0.006
izzy+0.006
oris+0.006
 log+0.006
Neg logit contributions
Thank-0.007
 faces-0.006
 Sorry-0.006
 sav-0.006
thank-0.006
Token do
Token ID466
Feature activation-1.03e-01

Pos logit contributions
ert+0.030
izzy+0.027
oris+0.027
 log+0.027
ion+0.026
Neg logit contributions
ulating-0.021
Thank-0.021
aw-0.021
 Sorry-0.020
 faces-0.020
Token,"
Token ID553
Feature activation+3.01e+00

Pos logit contributions
ert+0.004
oris+0.003
ion+0.003
izzy+0.003
 log+0.003
Neg logit contributions
 Sorry-0.003
 faces-0.003
Thank-0.003
ulating-0.002
 sav-0.002
Token the
Token ID262
Feature activation+1.17e+00

Pos logit contributions
ulating+0.093
 habitat+0.083
umm+0.079
thank+0.078
view+0.077
Neg logit contributions
ert-0.093
 log-0.089
ion-0.086
izzy-0.085
 spider-0.084
Token screw
Token ID9580
Feature activation-1.91e+00

Pos logit contributions
 faces+0.043
Thank+0.043
 Sorry+0.042
 sav+0.042
ulating+0.041
Neg logit contributions
ert-0.031
oris-0.027
ion-0.026
izzy-0.025
ions-0.023
Token cried
Token ID16896
Feature activation+1.29e-01

Pos logit contributions
ert+0.081
oris+0.075
izzy+0.072
ion+0.072
 log+0.069
Neg logit contributions
 Sorry-0.039
 faces-0.039
Thank-0.039
ulating-0.038
 sav-0.037
Token.
Token ID13
Feature activation+5.65e-02

Pos logit contributions
 Sorry+0.003
 faces+0.003
Thank+0.003
ulating+0.003
 sav+0.003
Neg logit contributions
ert-0.005
oris-0.005
izzy-0.005
ion-0.005
 log-0.004
Token
Token ID198
Feature activation+1.64e-01

Pos logit contributions
 faces+0.001
 Sorry+0.001
ulating+0.001
Thank+0.001
 sav+0.001
Neg logit contributions
ert-0.002
oris-0.002
ion-0.002
izzy-0.002
 log-0.002
Token
Token ID198
Feature activation+8.05e-01

Pos logit contributions
ulating+0.023
 Sorry+0.022
Thank+0.022
 faces+0.022
 drawn+0.022
Neg logit contributions
ert-0.036
ion-0.033
oris-0.033
izzy-0.033
 stool-0.032
TokenThe
Token ID464
Feature activation-2.78e-01

Pos logit contributions
 faces+0.015
 Sorry+0.015
Thank+0.015
ulating+0.015
 sav+0.014
Neg logit contributions
ert-0.035
oris-0.033
izzy-0.032
ion-0.032
 log-0.031
Token little
Token ID1310
Feature activation-1.27e+00

Pos logit contributions
ert+0.011
oris+0.010
izzy+0.010
 log+0.010
 stool+0.010
Neg logit contributions
ulating-0.006
 Sorry-0.006
Thank-0.006
 faces-0.006
 sav-0.006
Token boy
Token ID2933
Feature activation-1.53e-01

Pos logit contributions
ert+0.058
oris+0.054
ion+0.052
izzy+0.051
ions+0.048
Neg logit contributions
 faces-0.025
Thank-0.024
 Sorry-0.024
 sav-0.022
ulating-0.022
Token smiled
Token ID13541
Feature activation+3.05e-01

Pos logit contributions
ert+0.007
oris+0.007
ion+0.006
izzy+0.006
 log+0.006
Neg logit contributions
 faces-0.003
ulating-0.003
Thank-0.003
 Sorry-0.003
 sav-0.003
Token with
Token ID351
Feature activation+3.00e+00

Pos logit contributions
 Sorry+0.007
 faces+0.007
Thank+0.007
ulating+0.007
 sav+0.007
Neg logit contributions
ert-0.012
oris-0.011
izzy-0.011
ion-0.011
 log-0.010
Token delight
Token ID10974
Feature activation-7.05e-01

Pos logit contributions
 Sorry+0.052
ulating+0.052
 faces+0.051
Thank+0.050
 sav+0.048
Neg logit contributions
ert-0.134
oris-0.125
ion-0.121
izzy-0.121
 log-0.118
Token.
Token ID13
Feature activation-2.08e-01

Pos logit contributions
ert+0.025
oris+0.021
 stool+0.021
ion+0.021
izzy+0.021
Neg logit contributions
 Sorry-0.020
Thank-0.019
 sav-0.018
ulating-0.018
 drawn-0.017
Token
Token ID198
Feature activation+4.89e-01

Pos logit contributions
ert+0.008
oris+0.008
izzy+0.007
ion+0.007
 log+0.007
Neg logit contributions
 faces-0.005
 Sorry-0.005
Thank-0.005
ulating-0.005
 sav-0.004
Token
Token ID198
Feature activation-8.47e-02

Pos logit contributions
Thank+0.007
 sudden+0.006
 Sorry+0.006
 sav+0.006
 faces+0.006
Neg logit contributions
ert-0.024
oris-0.023
ion-0.022
izzy-0.022
 log-0.021
TokenThey
Token ID2990
Feature activation+3.67e-01

Pos logit contributions
ert+0.004
oris+0.004
izzy+0.003
ion+0.003
 log+0.003
Neg logit contributions
Thank-0.002
 faces-0.001
 Sorry-0.001
ulating-0.001
 sav-0.001
TokenThis
Token ID1212
Feature activation-1.07e+00

Pos logit contributions
 faces+0.013
ulating+0.013
 Sorry+0.012
clock+0.011
aw+0.010
Neg logit contributions
ert-0.024
izzy-0.023
oris-0.023
ion-0.023
 log-0.022
Token rock
Token ID3881
Feature activation-7.36e-01

Pos logit contributions
ert+0.030
oris+0.029
ion+0.029
izzy+0.029
ions+0.026
Neg logit contributions
 sav-0.036
 sound-0.035
 faces-0.035
Thank-0.035
 Sorry-0.034
Token is
Token ID318
Feature activation+2.43e-01

Pos logit contributions
ert+0.026
oris+0.023
izzy+0.023
ion+0.023
 log+0.022
Neg logit contributions
 faces-0.020
ulating-0.020
Thank-0.020
 Sorry-0.020
 sav-0.019
Token so
Token ID523
Feature activation-4.35e-03

Pos logit contributions
 faces+0.006
 Sorry+0.006
ulating+0.005
Thank+0.005
aw+0.005
Neg logit contributions
ert-0.009
 log-0.009
ion-0.009
oris-0.008
izzy-0.008
Token hot
Token ID3024
Feature activation+1.42e+00

Pos logit contributions
ert+0.000
izzy+0.000
 log+0.000
oris+0.000
ion+0.000
Neg logit contributions
 faces-0.000
ulating-0.000
thank-0.000
Thank-0.000
 Sorry-0.000
Token,"
Token ID553
Feature activation+2.99e+00

Pos logit contributions
 faces+0.038
 Sorry+0.037
Thank+0.036
ulating+0.036
 sudden+0.034
Neg logit contributions
ert-0.052
oris-0.047
ion-0.047
izzy-0.046
 log-0.045
Token said
Token ID531
Feature activation+8.83e-01

Pos logit contributions
ulating+0.088
 habitat+0.078
view+0.073
thank+0.073
umm+0.073
Neg logit contributions
ert-0.099
 log-0.094
ion-0.089
 stool-0.088
izzy-0.088
Token the
Token ID262
Feature activation+1.49e+00

Pos logit contributions
 faces+0.022
ulating+0.022
Thank+0.022
 Sorry+0.021
 sav+0.021
Neg logit contributions
ert-0.033
oris-0.031
izzy-0.029
ion-0.029
 log-0.028
Token little
Token ID1310
Feature activation-2.32e-01

Pos logit contributions
 faces+0.057
Thank+0.056
 Sorry+0.056
ulating+0.055
 sav+0.055
Neg logit contributions
ert-0.037
oris-0.033
ion-0.031
izzy-0.031
 log-0.027
Token rock
Token ID3881
Feature activation-1.47e+00

Pos logit contributions
ert+0.009
oris+0.008
ion+0.008
izzy+0.008
ions+0.007
Neg logit contributions
 faces-0.006
 Sorry-0.006
Thank-0.006
 sav-0.006
 sound-0.005
Token.
Token ID13
Feature activation+1.15e-01

Pos logit contributions
ert+0.063
oris+0.058
izzy+0.056
ion+0.056
 log+0.055
Neg logit contributions
 faces-0.029
 Sorry-0.029
Thank-0.028
ulating-0.028
 sav-0.027
Tokenisy
Token ID13560
Feature activation+3.17e-02

Pos logit contributions
 faces+0.010
 Sorry+0.010
Thank+0.010
ulating+0.010
 sav+0.010
Neg logit contributions
ert-0.006
oris-0.005
izzy-0.005
ion-0.005
 log-0.005
Token,
Token ID11
Feature activation+3.94e-01

Pos logit contributions
 Sorry+0.001
 faces+0.001
Thank+0.001
ulating+0.001
 sav+0.001
Neg logit contributions
ert-0.001
oris-0.001
ion-0.001
izzy-0.001
 log-0.001
Token always
Token ID1464
Feature activation+8.71e-02

Pos logit contributions
 Sorry+0.013
 faces+0.012
clock+0.012
ulating+0.011
umm+0.011
Neg logit contributions
 log-0.012
ert-0.011
 stool-0.011
ion-0.011
oris-0.010
Token smiling
Token ID16755
Feature activation+2.54e+00

Pos logit contributions
 Sorry+0.003
 faces+0.002
ulating+0.002
aw+0.002
 sudden+0.002
Neg logit contributions
ert-0.003
 log-0.003
 stool-0.002
oris-0.002
ccoli-0.002
Token and
Token ID290
Feature activation+1.59e+00

Pos logit contributions
 Sorry+0.078
Thank+0.071
 sav+0.071
 sudden+0.069
ulating+0.068
Neg logit contributions
ert-0.080
oris-0.071
 log-0.067
izzy-0.064
 sheet-0.064
Token laughing
Token ID14376
Feature activation+2.97e+00

Pos logit contributions
aw+0.053
ulating+0.050
 endings+0.046
clock+0.046
view+0.044
Neg logit contributions
ert-0.044
 log-0.042
 Ted-0.038
oris-0.037
ions-0.037
Token.
Token ID13
Feature activation+1.12e+00

Pos logit contributions
 Sorry+0.095
 sav+0.087
 sudden+0.086
Thank+0.086
view+0.085
Neg logit contributions
ert-0.086
oris-0.077
 log-0.074
izzy-0.072
ion-0.071
Token One
Token ID1881
Feature activation+2.59e-01

Pos logit contributions
Thank+0.020
 Sorry+0.020
 faces+0.020
ulating+0.019
 sav+0.019
Neg logit contributions
ert-0.050
oris-0.047
izzy-0.045
ion-0.045
 log-0.044
Token day
Token ID1110
Feature activation+1.34e+00

Pos logit contributions
Thank+0.005
 faces+0.005
 Sorry+0.005
ulating+0.005
 sav+0.005
Neg logit contributions
ert-0.012
oris-0.011
ion-0.010
izzy-0.010
 stool-0.010
Token a
Token ID257
Feature activation+9.52e-01

Pos logit contributions
 faces+0.029
ulating+0.028
 Sorry+0.028
Thank+0.028
 sav+0.027
Neg logit contributions
ert-0.055
oris-0.051
izzy-0.049
ion-0.049
 log-0.048
Token little
Token ID1310
Feature activation-2.72e-02

Pos logit contributions
 faces+0.019
ulating+0.019
 Sorry+0.019
Thank+0.018
 sav+0.018
Neg logit contributions
ert-0.040
oris-0.038
izzy-0.036
ion-0.036
 log-0.035
Token of
Token ID286
Feature activation+1.17e+00

Pos logit contributions
 Sorry+0.042
ulating+0.041
 faces+0.040
Thank+0.040
 sav+0.039
Neg logit contributions
ert-0.054
oris-0.049
ion-0.048
izzy-0.046
 log-0.046
Token nowhere
Token ID12062
Feature activation-1.75e+00

Pos logit contributions
 faces+0.034
 Sorry+0.033
Thank+0.033
ulating+0.033
 sav+0.032
Neg logit contributions
ert-0.040
oris-0.036
izzy-0.035
ion-0.034
 log-0.033
Token.
Token ID13
Feature activation-1.62e-01

Pos logit contributions
ert+0.075
oris+0.071
izzy+0.068
ion+0.067
 log+0.065
Neg logit contributions
 faces-0.034
ulating-0.033
Thank-0.033
 Sorry-0.032
 sav-0.032
Token John
Token ID1757
Feature activation+2.64e-01

Pos logit contributions
ert+0.005
oris+0.004
ion+0.004
 stool+0.004
 log+0.004
Neg logit contributions
 faces-0.005
ulating-0.005
Thank-0.005
 Sorry-0.005
thank-0.005
Token shouted
Token ID17293
Feature activation+5.55e-01

Pos logit contributions
 Sorry+0.007
 faces+0.007
Thank+0.007
ulating+0.007
 endings+0.006
Neg logit contributions
ert-0.010
oris-0.009
 log-0.009
izzy-0.008
ion-0.008
Token with
Token ID351
Feature activation+2.95e+00

Pos logit contributions
 faces+0.012
 Sorry+0.011
ulating+0.011
Thank+0.011
 sav+0.011
Neg logit contributions
ert-0.023
oris-0.021
izzy-0.021
ion-0.021
 log-0.020
Token surprise
Token ID5975
Feature activation+6.07e-02

Pos logit contributions
 Sorry+0.067
clock+0.066
ulating+0.066
thank+0.062
 faces+0.062
Neg logit contributions
ert-0.108
oris-0.104
 log-0.102
ion-0.101
izzy-0.098
Token and
Token ID290
Feature activation-1.07e-01

Pos logit contributions
Thank+0.001
 Sorry+0.001
ulating+0.001
 faces+0.001
 sav+0.001
Neg logit contributions
ert-0.002
oris-0.002
ion-0.002
izzy-0.002
 log-0.002
Token the
Token ID262
Feature activation+1.29e+00

Pos logit contributions
ert+0.004
 log+0.003
oris+0.003
 Ted+0.003
 Ali+0.003
Neg logit contributions
ulating-0.003
clock-0.003
view-0.003
 habitat-0.003
 endings-0.003
Token squirrel
Token ID33039
Feature activation-7.68e-02

Pos logit contributions
ulating+0.048
 Sorry+0.046
clock+0.043
aw+0.042
 faces+0.041
Neg logit contributions
 log-0.034
ert-0.032
 spider-0.029
oris-0.029
 stool-0.027
Token sc
Token ID629
Feature activation+1.28e-01

Pos logit contributions
ert+0.003
oris+0.003
 log+0.003
ion+0.003
izzy+0.003
Neg logit contributions
 Sorry-0.002
 faces-0.002
ulating-0.002
clock-0.002
Thank-0.002
Token in
Token ID287
Feature activation+2.16e+00

Pos logit contributions
 Sorry+0.039
clock+0.036
Thank+0.036
ulating+0.036
 sav+0.035
Neg logit contributions
ert-0.054
ion-0.050
izzy-0.050
 log-0.049
oris-0.049
Token her
Token ID607
Feature activation+7.35e-01

Pos logit contributions
ulating+0.080
 Sorry+0.078
clock+0.075
thank+0.073
aw+0.071
Neg logit contributions
 spider-0.049
 sheet-0.048
ert-0.047
 log-0.047
ion-0.046
Token nose
Token ID9686
Feature activation+1.22e+00

Pos logit contributions
ulating+0.016
Thank+0.016
 Sorry+0.016
 sav+0.015
 faces+0.014
Neg logit contributions
ert-0.029
oris-0.027
 log-0.026
izzy-0.026
 stool-0.026
Token and
Token ID290
Feature activation+5.55e-01

Pos logit contributions
 Sorry+0.044
Thank+0.039
ulating+0.037
 pige+0.037
clock+0.036
Neg logit contributions
ert-0.034
ion-0.032
 log-0.031
 sheet-0.031
 stool-0.031
Token she
Token ID673
Feature activation+9.85e-01

Pos logit contributions
ulating+0.019
clock+0.018
umm+0.017
 Sorry+0.017
Thank+0.017
Neg logit contributions
ert-0.016
 log-0.014
 spider-0.014
 sheet-0.014
izzy-0.014
Token sne
Token ID10505
Feature activation+2.92e+00

Pos logit contributions
 Sorry+0.024
ulating+0.024
Thank+0.024
thank+0.022
clock+0.021
Neg logit contributions
ert-0.039
izzy-0.035
oris-0.034
 log-0.033
 spider-0.031
Tokenez
Token ID8471
Feature activation+1.37e+00

Pos logit contributions
 Sorry+0.120
ulating+0.115
 faces+0.113
Thank+0.111
 sav+0.110
Neg logit contributions
ert-0.065
izzy-0.061
ion-0.056
oris-0.054
ions-0.051
Tokened
Token ID276
Feature activation+1.34e+00

Pos logit contributions
 faces+0.027
 Sorry+0.027
Thank+0.027
ulating+0.027
 sav+0.025
Neg logit contributions
ert-0.059
oris-0.055
izzy-0.053
ion-0.053
 log-0.051
Token so
Token ID523
Feature activation+9.43e-01

Pos logit contributions
 Sorry+0.044
 pige+0.039
Thank+0.039
ulating+0.037
clock+0.036
Neg logit contributions
ert-0.040
 log-0.036
ion-0.035
 spider-0.035
oris-0.034
Token loud
Token ID7812
Feature activation+1.57e+00

Pos logit contributions
 Sorry+0.019
ulating+0.019
 faces+0.018
Thank+0.018
clock+0.018
Neg logit contributions
ert-0.039
oris-0.037
izzy-0.036
ion-0.035
 log-0.034
Token that
Token ID326
Feature activation+8.52e-03

Pos logit contributions
 Sorry+0.035
 faces+0.034
Thank+0.034
ulating+0.033
 sav+0.032
Neg logit contributions
ert-0.062
ion-0.058
oris-0.057
izzy-0.055
 log-0.055
Token
Token ID198
Feature activation-6.01e-01

Pos logit contributions
 Sorry+0.001
 faces+0.001
Thank+0.001
 sudden+0.001
aw+0.001
Neg logit contributions
ert-0.005
oris-0.004
ion-0.004
 stool-0.004
izzy-0.004
Token"
Token ID1
Feature activation+4.41e-01

Pos logit contributions
ert+0.025
ion+0.022
izzy+0.022
oris+0.022
igsaw+0.021
Neg logit contributions
Thank-0.013
 sudden-0.013
aw-0.013
 Sorry-0.013
 faces-0.013
TokenMe
Token ID5308
Feature activation+2.38e+00

Pos logit contributions
ulating+0.011
clock+0.011
 faces+0.011
 Sorry+0.011
aw+0.011
Neg logit contributions
ert-0.015
izzy-0.015
oris-0.014
ion-0.014
 stool-0.013
Token too
Token ID1165
Feature activation+1.27e-01

Pos logit contributions
ulating+0.069
 faces+0.069
Thank+0.068
 sav+0.068
 sudden+0.065
Neg logit contributions
ert-0.080
izzy-0.071
oris-0.071
ion-0.069
 log-0.068
Token!
Token ID0
Feature activation-7.28e-01

Pos logit contributions
Thank+0.004
 faces+0.004
 Sorry+0.004
ulating+0.004
thank+0.004
Neg logit contributions
ert-0.004
oris-0.003
 log-0.003
 stool-0.003
ion-0.003
Token Me
Token ID2185
Feature activation+2.92e+00

Pos logit contributions
ert+0.026
oris+0.024
izzy+0.023
ion+0.023
 log+0.022
Neg logit contributions
 faces-0.019
ulating-0.019
Thank-0.018
 sav-0.018
 sudden-0.018
Token too
Token ID1165
Feature activation-2.79e-01

Pos logit contributions
ulating+0.083
 faces+0.082
Thank+0.081
 sav+0.081
 Sorry+0.078
Neg logit contributions
ert-0.097
ion-0.087
oris-0.086
izzy-0.085
 log-0.084
Token!"
Token ID2474
Feature activation+7.95e-01

Pos logit contributions
ert+0.008
 log+0.007
oris+0.007
 stool+0.007
 spider+0.007
Neg logit contributions
 faces-0.009
Thank-0.009
ulating-0.008
thank-0.008
view-0.008
Token Sam
Token ID3409
Feature activation-3.69e-02

Pos logit contributions
ulating+0.027
Thank+0.027
 Sorry+0.027
 faces+0.027
 sav+0.026
Neg logit contributions
ert-0.023
oris-0.020
izzy-0.019
ion-0.019
 log-0.018
Token said
Token ID531
Feature activation-6.11e-01

Pos logit contributions
ert+0.001
oris+0.001
izzy+0.001
ion+0.001
 log+0.001
Neg logit contributions
 Sorry-0.001
Thank-0.001
ulating-0.001
thank-0.001
 faces-0.001
Token.
Token ID13
Feature activation-9.30e-01

Pos logit contributions
ert+0.025
oris+0.023
ion+0.022
izzy+0.022
 stool+0.021
Neg logit contributions
Thank-0.014
 Sorry-0.013
ulating-0.013
 sav-0.013
 faces-0.013
Token and
Token ID290
Feature activation+5.16e-01

Pos logit contributions
 Sorry+0.029
 pige+0.025
Thank+0.024
clock+0.024
ulating+0.024
Neg logit contributions
ert-0.026
 log-0.026
 spider-0.024
oris-0.024
 sheet-0.023
Token sne
Token ID10505
Feature activation+2.86e+00

Pos logit contributions
ulating+0.016
clock+0.015
aw+0.015
umm+0.015
 habitat+0.014
Neg logit contributions
ert-0.016
 log-0.015
izzy-0.014
 Ted-0.014
 spider-0.014
Tokenez
Token ID8471
Feature activation+1.35e+00

Pos logit contributions
 Sorry+0.114
 faces+0.111
ulating+0.110
clock+0.106
Thank+0.106
Neg logit contributions
ert-0.064
izzy-0.062
ion-0.058
oris-0.055
ions-0.052
Tokened
Token ID276
Feature activation+1.29e+00

Pos logit contributions
 Sorry+0.029
 faces+0.029
Thank+0.029
ulating+0.028
 sav+0.028
Neg logit contributions
ert-0.055
oris-0.051
izzy-0.050
ion-0.049
 log-0.047
Token and
Token ID290
Feature activation+6.32e-01

Pos logit contributions
 Sorry+0.043
 pige+0.038
view+0.036
Thank+0.036
clock+0.036
Neg logit contributions
ert-0.037
 log-0.036
 spider-0.035
oris-0.034
ion-0.033
Token sne
Token ID10505
Feature activation+2.91e+00

Pos logit contributions
ulating+0.020
aw+0.019
clock+0.018
 habitat+0.018
 pige+0.018
Neg logit contributions
ert-0.019
 Ted-0.018
 log-0.017
 spider-0.017
izzy-0.017
Tokenez
Token ID8471
Feature activation+1.39e+00

Pos logit contributions
 Sorry+0.120
 faces+0.115
ulating+0.115
clock+0.111
Thank+0.111
Neg logit contributions
ert-0.062
izzy-0.061
ion-0.056
oris-0.052
ions-0.050
Tokened
Token ID276
Feature activation+1.48e+00

Pos logit contributions
 Sorry+0.031
 faces+0.031
Thank+0.031
ulating+0.030
 sav+0.030
Neg logit contributions
ert-0.056
oris-0.052
izzy-0.051
ion-0.050
 log-0.047
Token.
Token ID13
Feature activation+3.27e-01

Pos logit contributions
 Sorry+0.050
 pige+0.045
Thank+0.043
view+0.043
ulating+0.042
Neg logit contributions
ert-0.042
 spider-0.039
 log-0.039
oris-0.037
ion-0.037
Token "
Token ID366
Feature activation+9.22e-01

Pos logit contributions
ulating+0.008
 faces+0.008
 Sorry+0.008
Thank+0.008
clock+0.008
Neg logit contributions
ert-0.012
oris-0.011
izzy-0.011
ion-0.010
 stool-0.010
TokenB
Token ID33
Feature activation+1.03e-01

Pos logit contributions
 faces+0.026
ulating+0.026
 Sorry+0.025
clock+0.024
 habitat+0.024
Neg logit contributions
izzy-0.028
ert-0.027
Ali-0.026
oris-0.026
Alex-0.026
Token he
Token ID339
Feature activation+1.11e+00

Pos logit contributions
ert+0.018
oris+0.017
izzy+0.016
ion+0.016
 log+0.015
Neg logit contributions
 faces-0.012
 Sorry-0.012
Thank-0.011
 sav-0.011
ulating-0.011
Token can
Token ID460
Feature activation-7.84e-01

Pos logit contributions
 faces+0.027
 Sorry+0.026
 sav+0.025
aw+0.025
Thank+0.024
Neg logit contributions
ert-0.043
oris-0.040
ion-0.038
izzy-0.037
 stool-0.036
Token't
Token ID470
Feature activation-5.12e-01

Pos logit contributions
ert+0.031
oris+0.028
ion+0.027
izzy+0.027
 log+0.026
Neg logit contributions
 faces-0.018
 Sorry-0.018
Thank-0.018
ulating-0.018
 sav-0.017
Token.
Token ID13
Feature activation-5.22e-01

Pos logit contributions
ert+0.025
oris+0.024
ion+0.023
igsaw+0.022
ions+0.022
Neg logit contributions
 faces-0.007
 face-0.006
 sound-0.006
 miss-0.006
 people-0.006
Token He
Token ID679
Feature activation+4.54e-01

Pos logit contributions
ert+0.020
oris+0.018
izzy+0.018
ion+0.017
 log+0.017
Neg logit contributions
 Sorry-0.013
 faces-0.013
Thank-0.013
ulating-0.013
 sav-0.012
Token sne
Token ID10505
Feature activation+2.91e+00

Pos logit contributions
 faces+0.011
 Sorry+0.011
 sav+0.011
Thank+0.011
ulating+0.011
Neg logit contributions
ert-0.017
oris-0.016
ion-0.015
izzy-0.015
 stool-0.014
Tokenez
Token ID8471
Feature activation+1.29e+00

Pos logit contributions
 Sorry+0.101
 faces+0.101
ulating+0.100
Thank+0.098
 sav+0.096
Neg logit contributions
ert-0.081
oris-0.073
izzy-0.072
ion-0.072
ions-0.064
Tokenes
Token ID274
Feature activation+1.60e+00

Pos logit contributions
 faces+0.033
 Sorry+0.032
Thank+0.032
ulating+0.032
aw+0.031
Neg logit contributions
ert-0.048
oris-0.044
ion-0.042
izzy-0.041
 log-0.041
Token very
Token ID845
Feature activation+6.63e-01

Pos logit contributions
ulating+0.043
Thank+0.043
 sav+0.042
 pige+0.042
 Sorry+0.042
Neg logit contributions
ert-0.055
izzy-0.052
ion-0.051
oris-0.050
 log-0.049
Token loud
Token ID7812
Feature activation+5.06e-01

Pos logit contributions
 faces+0.017
 Sorry+0.016
Thank+0.016
ulating+0.016
 sav+0.016
Neg logit contributions
ert-0.025
oris-0.023
izzy-0.022
ion-0.022
 log-0.021
Token and
Token ID290
Feature activation+2.88e-01

Pos logit contributions
 Sorry+0.014
Thank+0.013
 sudden+0.013
ulating+0.013
 faces+0.013
Neg logit contributions
ert-0.017
ion-0.017
oris-0.016
 log-0.015
izzy-0.015
Token riches
Token ID35000
Feature activation-5.22e-01

Pos logit contributions
 Sorry+0.027
ulating+0.026
 faces+0.025
Thank+0.025
aw+0.025
Neg logit contributions
ert-0.026
oris-0.023
 log-0.022
izzy-0.021
ion-0.021
Token she
Token ID673
Feature activation+5.51e-01

Pos logit contributions
ert+0.022
oris+0.020
izzy+0.019
ion+0.019
 log+0.018
Neg logit contributions
 faces-0.011
 Sorry-0.011
Thank-0.011
ulating-0.011
 sav-0.010
Token had
Token ID550
Feature activation-5.45e-01

Pos logit contributions
 Sorry+0.014
 faces+0.014
ulating+0.014
Thank+0.013
 sudden+0.012
Neg logit contributions
ert-0.021
oris-0.019
izzy-0.018
 log-0.017
ion-0.017
Token found
Token ID1043
Feature activation-3.86e-01

Pos logit contributions
ert+0.019
oris+0.016
ion+0.016
izzy+0.016
ions+0.015
Neg logit contributions
 Sorry-0.017
 faces-0.016
ulating-0.015
Thank-0.014
 sudden-0.014
Token.
Token ID13
Feature activation+7.44e-01

Pos logit contributions
ert+0.014
ion+0.013
oris+0.013
ions+0.012
izzy+0.012
Neg logit contributions
 Sorry-0.011
 faces-0.010
ulating-0.010
Thank-0.010
 sudden-0.009
Token R
Token ID371
Feature activation+2.91e+00

Pos logit contributions
 faces+0.018
ulating+0.017
Thank+0.017
 Sorry+0.017
 sav+0.016
Neg logit contributions
ert-0.029
oris-0.028
ion-0.026
izzy-0.026
 log-0.025
Tokenyle
Token ID2349
Feature activation-1.56e+00

Pos logit contributions
 Sorry+0.100
 faces+0.099
Thank+0.096
ulating+0.095
 sav+0.092
Neg logit contributions
ert-0.085
oris-0.078
ion-0.072
izzy-0.072
ig-0.065
Tokene
Token ID68
Feature activation-5.12e-02

Pos logit contributions
ert+0.061
oris+0.057
izzy+0.054
ion+0.054
 log+0.052
Neg logit contributions
 faces-0.037
 Sorry-0.036
Thank-0.036
ulating-0.036
 sav-0.034
Token was
Token ID373
Feature activation+1.01e-01

Pos logit contributions
ert+0.002
oris+0.002
izzy+0.002
 log+0.002
ion+0.002
Neg logit contributions
 Sorry-0.001
Thank-0.001
ulating-0.001
 faces-0.001
clock-0.001
Token glad
Token ID9675
Feature activation+7.89e-01

Pos logit contributions
 Sorry+0.002
 faces+0.002
 endings+0.002
ulating+0.002
aw+0.002
Neg logit contributions
ert-0.004
oris-0.004
ion-0.004
izzy-0.004
ela-0.004
Token she
Token ID673
Feature activation-4.13e-01

Pos logit contributions
 Sorry+0.024
ulating+0.024
Thank+0.023
 faces+0.023
thank+0.022
Neg logit contributions
ert-0.026
oris-0.023
ion-0.022
izzy-0.022
 log-0.021
Token the
Token ID262
Feature activation-4.05e-01

Pos logit contributions
ert+0.049
oris+0.046
ion+0.044
izzy+0.044
 log+0.042
Neg logit contributions
 faces-0.018
Thank-0.018
 Sorry-0.018
ulating-0.017
 sav-0.017
Token car
Token ID1097
Feature activation-9.60e-01

Pos logit contributions
ert+0.009
 stool+0.008
 log+0.008
oris+0.007
driver+0.007
Neg logit contributions
 Sorry-0.016
ulating-0.016
 faces-0.015
aw-0.015
 sav-0.015
Token's
Token ID338
Feature activation-3.89e-01

Pos logit contributions
ert+0.030
driver+0.026
 log+0.026
oris+0.026
zag+0.026
Neg logit contributions
 Sorry-0.029
ulating-0.028
 faces-0.028
umm-0.028
Thank-0.027
Token owner
Token ID4870
Feature activation+5.14e-01

Pos logit contributions
ert+0.012
driver+0.010
 stool+0.010
 log+0.010
oris+0.009
Neg logit contributions
 Sorry-0.013
Thank-0.012
ulating-0.012
aw-0.012
 sav-0.012
Token said
Token ID531
Feature activation+4.43e-01

Pos logit contributions
ulating+0.016
 Sorry+0.015
 faces+0.015
 habitat+0.015
 sav+0.015
Neg logit contributions
ert-0.016
izzy-0.015
oris-0.014
ion-0.014
driver-0.013
Token,"
Token ID553
Feature activation+2.90e+00

Pos logit contributions
 faces+0.010
ulating+0.010
Thank+0.010
 Sorry+0.010
 sav+0.009
Neg logit contributions
ert-0.018
oris-0.016
ion-0.016
izzy-0.016
 log-0.015
TokenPlease
Token ID5492
Feature activation-6.63e-01

Pos logit contributions
ulating+0.077
 habitat+0.072
 faces+0.070
 drawn+0.068
clock+0.065
Neg logit contributions
ert-0.093
 log-0.089
 stool-0.089
ion-0.086
oris-0.085
Token make
Token ID787
Feature activation-4.39e-02

Pos logit contributions
ert+0.023
oris+0.022
izzy+0.021
ion+0.020
 log+0.020
Neg logit contributions
 faces-0.019
ulating-0.018
 Sorry-0.017
 sudden-0.017
Thank-0.017
Token my
Token ID616
Feature activation+4.30e-01

Pos logit contributions
ert+0.002
oris+0.001
ion+0.001
izzy+0.001
 log+0.001
Neg logit contributions
Thank-0.001
ulating-0.001
 Sorry-0.001
 faces-0.001
 sav-0.001
Token car
Token ID1097
Feature activation-2.59e-03

Pos logit contributions
ulating+0.014
Thank+0.014
 Sorry+0.013
 sav+0.013
 faces+0.013
Neg logit contributions
ert-0.013
oris-0.011
 log-0.011
 stool-0.011
izzy-0.010
Token better
Token ID1365
Feature activation+2.36e-01

Pos logit contributions
ert+0.000
oris+0.000
izzy+0.000
ion+0.000
 log+0.000
Neg logit contributions
 Sorry-0.000
ulating-0.000
Thank-0.000
 faces-0.000
 sav-0.000
Token on
Token ID319
Feature activation+4.71e-01

Pos logit contributions
 Sorry+0.047
 endings+0.041
clock+0.039
 faces+0.039
ulating+0.038
Neg logit contributions
 log-0.040
ert-0.039
 stool-0.038
ig-0.036
 sheet-0.035
Token the
Token ID262
Feature activation+1.80e+00

Pos logit contributions
 Sorry+0.014
ulating+0.014
 faces+0.014
Thank+0.013
aw+0.013
Neg logit contributions
ert-0.015
 log-0.014
ion-0.014
 stool-0.013
oris-0.013
Token tree
Token ID5509
Feature activation-6.39e-01

Pos logit contributions
 Sorry+0.054
ulating+0.049
 faces+0.047
Thank+0.047
clock+0.046
Neg logit contributions
ert-0.060
 log-0.057
 stool-0.056
oris-0.053
ion-0.052
Token and
Token ID290
Feature activation+1.06e+00

Pos logit contributions
ert+0.024
 log+0.021
oris+0.021
ion+0.021
 stool+0.021
Neg logit contributions
 Sorry-0.018
Thank-0.016
 faces-0.016
ulating-0.016
 sudden-0.015
Token feel
Token ID1254
Feature activation+1.21e+00

Pos logit contributions
ulating+0.033
aw+0.032
 sudden+0.028
clock+0.028
umm+0.027
Neg logit contributions
 log-0.033
 stool-0.031
ert-0.031
 mat-0.028
 chair-0.027
Token the
Token ID262
Feature activation+2.90e+00

Pos logit contributions
clock+0.032
 Sorry+0.030
 faces+0.030
ulating+0.030
 habitat+0.029
Neg logit contributions
ion-0.038
 log-0.037
izzy-0.036
ert-0.036
oris-0.034
Token sun
Token ID4252
Feature activation+1.08e+00

Pos logit contributions
 Sorry+0.088
clock+0.086
ulating+0.085
aw+0.085
ach+0.081
Neg logit contributions
 log-0.076
 leaf-0.069
 stool-0.069
 teaspoon-0.066
 spider-0.064
Token on
Token ID319
Feature activation+1.71e-01

Pos logit contributions
ulating+0.031
 habitat+0.031
 Sorry+0.030
 sudden+0.029
 endings+0.028
Neg logit contributions
ert-0.034
ion-0.033
 stool-0.033
izzy-0.032
 log-0.031
Token its
Token ID663
Feature activation+1.69e+00

Pos logit contributions
ulating+0.005
Thank+0.005
clock+0.005
 Sorry+0.005
 sudden+0.005
Neg logit contributions
ert-0.005
ion-0.005
oris-0.004
 log-0.004
 stool-0.004
Token skin
Token ID4168
Feature activation+3.30e-01

Pos logit contributions
 Sorry+0.047
Thank+0.040
ulating+0.038
aw+0.036
Sorry+0.034
Neg logit contributions
ert-0.062
 log-0.061
 stool-0.056
 mat-0.054
 sheet-0.053
Token.
Token ID13
Feature activation+9.66e-01

Pos logit contributions
 Sorry+0.012
Thank+0.011
ulating+0.011
 sudden+0.011
 sav+0.010
Neg logit contributions
ert-0.009
ion-0.008
 stool-0.008
 log-0.008
 sheet-0.007
Token do
Token ID466
Feature activation+1.55e-02

Pos logit contributions
aw+0.025
Thank+0.024
ulating+0.024
clock+0.023
 pige+0.023
Neg logit contributions
ert-0.030
 log-0.027
oris-0.026
izzy-0.026
 Ted-0.024
Token it
Token ID340
Feature activation-1.19e+00

Pos logit contributions
 faces+0.000
 Sorry+0.000
ulating+0.000
Thank+0.000
 sav+0.000
Neg logit contributions
ert-0.001
oris-0.001
izzy-0.001
ion-0.001
 log-0.001
Token again
Token ID757
Feature activation-7.89e-01

Pos logit contributions
ert+0.045
oris+0.042
izzy+0.040
ion+0.040
 log+0.038
Neg logit contributions
 faces-0.029
 Sorry-0.029
Thank-0.029
ulating-0.028
 sav-0.027
Token!
Token ID0
Feature activation-1.96e-01

Pos logit contributions
ert+0.029
oris+0.024
 log+0.024
izzy+0.023
 stool+0.022
Neg logit contributions
 Sorry-0.022
 faces-0.022
ulating-0.021
Thank-0.021
aw-0.020
Token You
Token ID921
Feature activation+1.86e-01

Pos logit contributions
ert+0.007
oris+0.006
ion+0.006
izzy+0.006
 log+0.006
Neg logit contributions
 faces-0.006
ulating-0.005
Thank-0.005
 sav-0.005
 sudden-0.005
Token sne
Token ID10505
Feature activation+2.89e+00

Pos logit contributions
 faces+0.006
Thank+0.006
ulating+0.006
 sav+0.006
 Sorry+0.006
Neg logit contributions
ert-0.006
oris-0.005
izzy-0.005
 log-0.005
ion-0.005
Tokene
Token ID68
Feature activation+6.94e-01

Pos logit contributions
 faces+0.102
 Sorry+0.100
ulating+0.099
 sudden+0.097
Thank+0.095
Neg logit contributions
ert-0.077
ion-0.072
izzy-0.071
oris-0.068
ions-0.064
Tokenze
Token ID2736
Feature activation+1.93e+00

Pos logit contributions
Thank+0.015
 Sorry+0.014
ulating+0.014
clock+0.014
 sudden+0.014
Neg logit contributions
ert-0.029
oris-0.026
izzy-0.025
ion-0.025
 log-0.025
Token and
Token ID290
Feature activation+7.94e-01

Pos logit contributions
 faces+0.054
 sudden+0.054
 Sorry+0.053
Thank+0.053
ulating+0.052
Neg logit contributions
ert-0.067
oris-0.058
ion-0.058
izzy-0.057
 log-0.056
Token I
Token ID314
Feature activation-9.77e-01

Pos logit contributions
ulating+0.022
aw+0.021
clock+0.020
Thank+0.020
view+0.020
Neg logit contributions
ert-0.027
 log-0.024
oris-0.023
izzy-0.023
 spider-0.022
Token'll
Token ID1183
Feature activation+6.12e-03

Pos logit contributions
ert+0.035
oris+0.032
izzy+0.031
ion+0.029
 log+0.029
Neg logit contributions
 faces-0.027
ulating-0.026
 Sorry-0.026
Thank-0.026
 sav-0.025
Token pur
Token ID1308
Feature activation+1.85e+00

Pos logit contributions
ert+0.018
oris+0.017
izzy+0.017
 log+0.016
ion+0.016
Neg logit contributions
 Sorry-0.010
ulating-0.010
 faces-0.010
Thank-0.009
clock-0.009
Tokenred
Token ID445
Feature activation+2.01e-01

Pos logit contributions
 Sorry+0.055
 faces+0.053
 sudden+0.050
Thank+0.050
clock+0.050
Neg logit contributions
ert-0.062
oris-0.057
izzy-0.056
ion-0.055
ions-0.051
Token and
Token ID290
Feature activation+4.90e-01

Pos logit contributions
 Sorry+0.006
Thank+0.005
ulating+0.005
clock+0.005
 faces+0.005
Neg logit contributions
ert-0.007
 log-0.006
oris-0.006
ion-0.006
 stool-0.006
Token me
Token ID502
Feature activation+6.98e-01

Pos logit contributions
ulating+0.014
clock+0.013
 Sorry+0.013
 endings+0.012
view+0.012
Neg logit contributions
ert-0.016
 log-0.015
 stool-0.014
 sheet-0.014
ccoli-0.014
Tokenowed
Token ID6972
Feature activation+1.65e+00

Pos logit contributions
 Sorry+0.026
 faces+0.026
ulating+0.025
Thank+0.025
 sav+0.025
Neg logit contributions
ert-0.018
oris-0.016
izzy-0.015
ion-0.015
 log-0.014
Token with
Token ID351
Feature activation+2.89e+00

Pos logit contributions
 Sorry+0.046
clock+0.042
ulating+0.041
Thank+0.040
 habitat+0.040
Neg logit contributions
ert-0.054
oris-0.048
 log-0.048
ion-0.047
izzy-0.046
Token pleasure
Token ID9476
Feature activation+1.05e+00

Pos logit contributions
ulating+0.083
clock+0.081
 Sorry+0.071
 habitat+0.068
thank+0.068
Neg logit contributions
 log-0.086
ert-0.085
 spider-0.084
 sheet-0.083
ion-0.082
Token.
Token ID13
Feature activation+7.31e-01

Pos logit contributions
 Sorry+0.026
ulating+0.026
Thank+0.026
 sav+0.024
 faces+0.023
Neg logit contributions
ert-0.040
oris-0.036
ion-0.034
 log-0.034
izzy-0.034
Token It
Token ID632
Feature activation-5.91e-01

Pos logit contributions
 Sorry+0.014
Thank+0.014
 faces+0.014
ulating+0.013
 sav+0.013
Neg logit contributions
ert-0.033
oris-0.030
izzy-0.029
ion-0.029
 log-0.028
Token was
Token ID373
Feature activation-1.43e-01

Pos logit contributions
ert+0.026
oris+0.025
ion+0.024
izzy+0.024
 log+0.023
Neg logit contributions
 faces-0.011
Thank-0.011
 Sorry-0.011
ulating-0.010
 sav-0.010
Token very
Token ID845
Feature activation+5.82e-01

Pos logit contributions
ert+0.005
ion+0.005
 log+0.005
izzy+0.005
ions+0.004
Neg logit contributions
 Sorry-0.004
 faces-0.004
ulating-0.004
clock-0.003
 endings-0.003
Token felt
Token ID2936
Feature activation-6.81e-02

Pos logit contributions
 Sorry+0.004
ulating+0.004
 faces+0.004
Thank+0.004
clock+0.003
Neg logit contributions
ert-0.006
izzy-0.006
oris-0.006
 log-0.006
 stool-0.005
Token sick
Token ID6639
Feature activation+1.52e+00

Pos logit contributions
ert+0.003
izzy+0.003
ion+0.003
oris+0.003
 log+0.003
Neg logit contributions
 Sorry-0.001
ulating-0.001
 faces-0.001
Thank-0.001
clock-0.001
Token and
Token ID290
Feature activation+9.58e-01

Pos logit contributions
 Sorry+0.046
Thank+0.042
ulating+0.042
 faces+0.041
 sudden+0.041
Neg logit contributions
ert-0.051
ion-0.049
izzy-0.047
 log-0.047
oris-0.046
Token had
Token ID550
Feature activation+7.96e-02

Pos logit contributions
ulating+0.025
Thank+0.023
thank+0.022
 Sorry+0.022
 faces+0.021
Neg logit contributions
ert-0.034
izzy-0.033
 log-0.032
ion-0.031
 stool-0.030
Token to
Token ID284
Feature activation+5.51e-01

Pos logit contributions
ulating+0.002
 Sorry+0.002
Thank+0.002
 faces+0.002
 sav+0.002
Neg logit contributions
ert-0.003
ion-0.002
izzy-0.002
oris-0.002
 log-0.002
Token sne
Token ID10505
Feature activation+2.89e+00

Pos logit contributions
ulating+0.014
Thank+0.014
aw+0.014
 sudden+0.013
 Sorry+0.013
Neg logit contributions
ert-0.019
 log-0.018
izzy-0.018
ion-0.017
 stool-0.017
Tokene
Token ID68
Feature activation+3.26e-01

Pos logit contributions
 Sorry+0.103
ulating+0.102
Thank+0.097
 faces+0.096
 sav+0.093
Neg logit contributions
izzy-0.074
ion-0.073
ert-0.071
ions-0.066
ig-0.063
Tokenze
Token ID2736
Feature activation+2.01e+00

Pos logit contributions
Thank+0.008
 Sorry+0.008
 sav+0.007
ulating+0.007
clock+0.007
Neg logit contributions
ert-0.012
izzy-0.011
ion-0.011
 log-0.011
oris-0.011
Token a
Token ID257
Feature activation-4.29e-01

Pos logit contributions
 Sorry+0.059
Thank+0.057
ulating+0.055
 sav+0.053
 sudden+0.052
Neg logit contributions
ert-0.068
ion-0.064
 log-0.061
izzy-0.060
 stool-0.059
Token lot
Token ID1256
Feature activation-5.42e-01

Pos logit contributions
ert+0.017
 log+0.015
oris+0.015
 stool+0.015
izzy+0.015
Neg logit contributions
ulating-0.010
 Sorry-0.010
Thank-0.010
 faces-0.010
clock-0.009
Token.
Token ID13
Feature activation-4.69e-01

Pos logit contributions
ert+0.026
oris+0.025
izzy+0.023
 log+0.022
 stool+0.022
Neg logit contributions
ulating-0.007
 faces-0.007
 sav-0.007
Thank-0.007
aw-0.007
Token a
Token ID257
Feature activation-2.26e-01

Pos logit contributions
ert+0.019
oris+0.018
izzy+0.017
ion+0.017
 log+0.016
Neg logit contributions
Thank-0.009
 faces-0.009
 sav-0.009
ulating-0.009
 Sorry-0.009
Token good
Token ID922
Feature activation-6.78e-01

Pos logit contributions
ert+0.011
ion+0.010
oris+0.010
izzy+0.010
ions+0.010
Neg logit contributions
Thank-0.003
 Sorry-0.003
 faces-0.003
 sav-0.003
ulating-0.003
Token dog
Token ID3290
Feature activation+5.86e-01

Pos logit contributions
ert+0.028
oris+0.025
ion+0.025
izzy+0.024
ions+0.023
Neg logit contributions
 faces-0.015
 sav-0.015
Thank-0.015
 Sorry-0.015
ulating-0.014
Token named
Token ID3706
Feature activation-1.73e+00

Pos logit contributions
Thank+0.008
 Sorry+0.007
 sav+0.007
 faces+0.007
ulating+0.007
Neg logit contributions
ert-0.030
oris-0.028
izzy-0.026
ion-0.026
 stool-0.026
Token D
Token ID360
Feature activation-1.08e+00

Pos logit contributions
ert+0.062
oris+0.057
ion+0.054
izzy+0.054
 log+0.051
Neg logit contributions
 faces-0.047
 Sorry-0.046
Thank-0.046
ulating-0.046
 sav-0.045
Tokenott
Token ID1252
Feature activation-4.45e+00

Pos logit contributions
ert+0.035
 stool+0.033
oris+0.032
ion+0.031
 log+0.031
Neg logit contributions
ulating-0.029
Thank-0.029
aw-0.029
 faces-0.028
 sav-0.028
Tokenie
Token ID494
Feature activation-2.64e-01

Pos logit contributions
oris+0.161
 stool+0.160
igsaw+0.151
 opener+0.150
ert+0.150
Neg logit contributions
aw-0.100
thank-0.096
view-0.095
 sound-0.093
 faces-0.092
Token.
Token ID13
Feature activation+1.02e-01

Pos logit contributions
ert+0.013
oris+0.013
ion+0.012
izzy+0.012
 stool+0.012
Neg logit contributions
 faces-0.003
Thank-0.003
 sav-0.003
ulating-0.003
aw-0.003
Token D
Token ID360
Feature activation+1.21e+00

Pos logit contributions
 faces+0.003
 Sorry+0.003
ulating+0.003
Thank+0.003
 sav+0.002
Neg logit contributions
ert-0.004
oris-0.004
izzy-0.003
ion-0.003
 log-0.003
Tokenott
Token ID1252
Feature activation-2.66e+00

Pos logit contributions
Thank+0.045
 faces+0.044
 sav+0.044
 Sorry+0.044
ulating+0.044
Neg logit contributions
ert-0.029
ion-0.025
oris-0.025
 stool-0.024
 log-0.023
Tokenie
Token ID494
Feature activation+5.30e-01

Pos logit contributions
 stool+0.086
oris+0.085
ert+0.084
ion+0.080
 opener+0.079
Neg logit contributions
aw-0.068
 faces-0.066
thank-0.065
 sav-0.065
view-0.065
Token loyal
Token ID9112
Feature activation+1.51e-01

Pos logit contributions
ert+0.002
ion+0.002
oris+0.002
izzy+0.002
ions+0.002
Neg logit contributions
Thank-0.001
 Sorry-0.001
 faces-0.001
 sav-0.001
ulating-0.001
Token little
Token ID1310
Feature activation-4.71e-01

Pos logit contributions
 faces+0.003
 Sorry+0.003
Thank+0.003
 sav+0.003
 face+0.003
Neg logit contributions
ert-0.006
oris-0.006
ion-0.006
izzy-0.005
ions-0.005
Token girl
Token ID2576
Feature activation+5.10e-01

Pos logit contributions
ert+0.021
ion+0.019
oris+0.019
izzy+0.018
ions+0.017
Neg logit contributions
 Sorry-0.010
Thank-0.009
 faces-0.009
 sound-0.009
 sav-0.009
Token named
Token ID3706
Feature activation-1.37e+00

Pos logit contributions
 faces+0.010
 Sorry+0.010
ulating+0.010
Thank+0.010
 sav+0.009
Neg logit contributions
ert-0.022
oris-0.020
izzy-0.020
ion-0.019
 log-0.019
Token L
Token ID406
Feature activation-1.16e+00

Pos logit contributions
ert+0.053
oris+0.049
ion+0.046
izzy+0.046
 log+0.045
Neg logit contributions
 faces-0.033
Thank-0.033
 Sorry-0.033
ulating-0.033
 sav-0.031
Tokenott
Token ID1252
Feature activation-4.44e+00

Pos logit contributions
 log+0.030
ccoli+0.027
oris+0.027
ert+0.026
 boulder+0.025
Neg logit contributions
aw-0.041
view-0.037
 drawn-0.037
 were-0.036
 went-0.036
Tokenie
Token ID494
Feature activation-1.83e-01

Pos logit contributions
oris+0.166
 stool+0.159
ert+0.157
 opener+0.154
igsaw+0.154
Neg logit contributions
aw-0.096
thank-0.095
view-0.091
 faces-0.089
 sound-0.089
Token.
Token ID13
Feature activation+6.27e-01

Pos logit contributions
ert+0.009
oris+0.008
ion+0.008
izzy+0.008
 stool+0.007
Neg logit contributions
 faces-0.003
Thank-0.003
ulating-0.003
 Sorry-0.003
 sav-0.003
Token Every
Token ID3887
Feature activation-1.55e+00

Pos logit contributions
 faces+0.013
 Sorry+0.013
ulating+0.013
Thank+0.013
 sav+0.012
Neg logit contributions
ert-0.026
oris-0.024
izzy-0.024
ion-0.023
 log-0.022
Token day
Token ID1110
Feature activation+1.74e+00

Pos logit contributions
ert+0.072
oris+0.069
izzy+0.064
ion+0.064
ions+0.062
Neg logit contributions
Thank-0.026
 sav-0.025
 Sorry-0.024
 faces-0.024
 sound-0.023
Token,
Token ID11
Feature activation+1.20e+00

Pos logit contributions
ulating+0.058
thank+0.054
 faces+0.053
 habitat+0.053
Thank+0.052
Neg logit contributions
ert-0.054
izzy-0.053
oris-0.048
ion-0.045
ig-0.045
Token a
Token ID257
Feature activation-1.41e-01

Pos logit contributions
ert+0.016
oris+0.015
izzy+0.014
ion+0.014
 log+0.014
Neg logit contributions
 faces-0.013
 Sorry-0.013
ulating-0.013
Thank-0.012
 sav-0.012
Token kind
Token ID1611
Feature activation-6.78e-01

Pos logit contributions
ert+0.006
oris+0.005
 log+0.005
izzy+0.005
ion+0.005
Neg logit contributions
 faces-0.003
ulating-0.003
 Sorry-0.003
Thank-0.003
 sav-0.003
Token deer
Token ID20096
Feature activation-8.62e-02

Pos logit contributions
ert+0.029
oris+0.025
izzy+0.025
ions+0.025
ion+0.024
Neg logit contributions
 face-0.015
 sound-0.015
 faces-0.014
 sav-0.014
 people-0.013
Token named
Token ID3706
Feature activation-1.16e+00

Pos logit contributions
ert+0.004
oris+0.003
izzy+0.003
ion+0.003
 log+0.003
Neg logit contributions
 faces-0.002
 Sorry-0.002
Thank-0.002
ulating-0.002
 sav-0.002
Token D
Token ID360
Feature activation-1.43e+00

Pos logit contributions
ert+0.036
oris+0.032
ion+0.031
izzy+0.031
 log+0.029
Neg logit contributions
 faces-0.036
 Sorry-0.036
Thank-0.036
ulating-0.035
 sav-0.035
Tokenott
Token ID1252
Feature activation-4.06e+00

Pos logit contributions
 stool+0.044
ert+0.041
ion+0.039
 outage+0.039
 log+0.038
Neg logit contributions
aw-0.040
ulating-0.038
Thank-0.038
thank-0.036
 Instead-0.036
Tokenie
Token ID494
Feature activation-4.59e-01

Pos logit contributions
igsaw+0.139
 stool+0.139
oris+0.134
 opener+0.131
ert+0.127
Neg logit contributions
aw-0.101
 sound-0.092
thank-0.089
 people-0.088
 faces-0.087
Token appeared
Token ID4120
Feature activation+7.16e-01

Pos logit contributions
ert+0.019
oris+0.018
izzy+0.017
ion+0.017
 log+0.016
Neg logit contributions
 faces-0.010
 Sorry-0.010
Thank-0.010
ulating-0.010
 sav-0.009
Token.
Token ID13
Feature activation-7.18e-01

Pos logit contributions
 faces+0.012
Thank+0.012
ulating+0.012
 Sorry+0.012
 sav+0.011
Neg logit contributions
ert-0.032
oris-0.030
izzy-0.029
ion-0.029
 log-0.028
Token "
Token ID366
Feature activation+1.15e+00

Pos logit contributions
ert+0.026
oris+0.024
izzy+0.023
ion+0.023
 log+0.022
Neg logit contributions
 faces-0.019
 Sorry-0.019
Thank-0.019
ulating-0.019
 sav-0.018
TokenLeave
Token ID35087
Feature activation-7.97e-01

Pos logit contributions
 faces+0.031
clock+0.029
 Sorry+0.029
ulating+0.028
 face+0.026
Neg logit contributions
ert-0.041
oris-0.039
izzy-0.037
ion-0.037
Alex-0.035
Token were
Token ID547
Feature activation+1.02e-01

Pos logit contributions
 faces+0.018
 Sorry+0.018
ulating+0.018
Thank+0.018
 sav+0.017
Neg logit contributions
ert-0.034
oris-0.032
izzy-0.031
ion-0.031
 log-0.030
Token two
Token ID734
Feature activation+4.93e-01

Pos logit contributions
 Sorry+0.003
ulating+0.003
 faces+0.003
Thank+0.003
 sav+0.003
Neg logit contributions
ert-0.004
izzy-0.003
oris-0.003
ion-0.003
 log-0.003
Token friends
Token ID2460
Feature activation+5.46e-01

Pos logit contributions
 Sorry+0.012
Thank+0.011
ulating+0.011
thank+0.010
 sav+0.010
Neg logit contributions
ert-0.020
oris-0.018
izzy-0.018
 log-0.018
ion-0.017
Token,
Token ID11
Feature activation-6.57e-01

Pos logit contributions
 faces+0.010
ulating+0.009
Thank+0.009
 Sorry+0.009
 sav+0.009
Neg logit contributions
ert-0.025
oris-0.023
ion-0.022
izzy-0.022
 stool-0.022
Token Ab
Token ID2275
Feature activation-2.34e+00

Pos logit contributions
ert+0.030
ion+0.028
oris+0.028
ig+0.026
 outage+0.026
Neg logit contributions
Thank-0.012
 so-0.011
 thanked-0.010
 Sorry-0.010
 including-0.010
Tokenig
Token ID328
Feature activation-4.05e+00

Pos logit contributions
oris+0.061
ert+0.057
 opener+0.055
 outage+0.054
igsaw+0.053
Neg logit contributions
aw-0.082
 sav-0.078
 sudden-0.077
 faces-0.076
 pige-0.076
Tokenail
Token ID603
Feature activation-2.44e+00

Pos logit contributions
oris+0.069
ion+0.065
ert+0.063
igsaw+0.063
 stool+0.060
Neg logit contributions
 faces-0.166
 sudden-0.163
aw-0.159
 face-0.155
 sav-0.154
Token and
Token ID290
Feature activation-4.44e-01

Pos logit contributions
ert+0.100
oris+0.091
izzy+0.088
ion+0.088
 log+0.085
Neg logit contributions
 Sorry-0.055
 faces-0.053
Thank-0.053
ulating-0.053
 sav-0.050
Token Mat
Token ID6550
Feature activation-1.83e+00

Pos logit contributions
ion+0.019
ert+0.019
oris+0.017
 stool+0.017
 log+0.017
Neg logit contributions
 Sorry-0.010
Thank-0.009
 faces-0.009
 sav-0.008
 people-0.008
Tokenty
Token ID774
Feature activation-1.05e+00

Pos logit contributions
 log+0.055
 stool+0.054
ion+0.054
ccoli+0.053
oris+0.052
Neg logit contributions
 faces-0.048
thank-0.047
ulating-0.047
 sav-0.046
 thanked-0.045
Token.
Token ID13
Feature activation-4.62e-01

Pos logit contributions
ert+0.047
oris+0.044
izzy+0.042
ion+0.042
 log+0.040
Neg logit contributions
 faces-0.019
Thank-0.019
 Sorry-0.018
ulating-0.018
 sav-0.018
Token said
Token ID531
Feature activation+4.55e-01

Pos logit contributions
 Sorry+0.011
ulating+0.011
 faces+0.011
 sudden+0.011
Thank+0.011
Neg logit contributions
ert-0.014
ion-0.012
oris-0.012
 log-0.012
izzy-0.012
Token,
Token ID11
Feature activation+9.25e-01

Pos logit contributions
 faces+0.010
 Sorry+0.010
Thank+0.010
ulating+0.010
 sav+0.009
Neg logit contributions
ert-0.018
oris-0.017
izzy-0.016
ion-0.016
 log-0.016
Token "
Token ID366
Feature activation+9.64e-02

Pos logit contributions
 faces+0.028
 Sorry+0.028
Thank+0.028
ulating+0.028
 sav+0.027
Neg logit contributions
ert-0.030
oris-0.027
izzy-0.025
ion-0.025
 log-0.024
TokenHello
Token ID15496
Feature activation+1.05e+00

Pos logit contributions
 faces+0.002
 Sorry+0.002
ulating+0.002
 face+0.002
 sav+0.002
Neg logit contributions
ert-0.004
oris-0.004
ion-0.003
izzy-0.003
ions-0.003
Token D
Token ID360
Feature activation+1.52e-01

Pos logit contributions
 faces+0.028
ulating+0.027
 Sorry+0.027
Thank+0.027
 sav+0.025
Neg logit contributions
ert-0.039
oris-0.036
izzy-0.034
ion-0.033
 log-0.033
Tokenott
Token ID1252
Feature activation-4.01e+00

Pos logit contributions
Thank+0.005
 sav+0.005
 Sorry+0.005
 faces+0.005
ulating+0.005
Neg logit contributions
ert-0.004
ion-0.003
oris-0.003
 log-0.003
 stool-0.003
Tokenie
Token ID494
Feature activation-6.59e-01

Pos logit contributions
oris+0.130
 stool+0.127
 opener+0.124
ert+0.123
ion+0.122
Neg logit contributions
aw-0.097
thank-0.096
 sound-0.095
 faces-0.094
 people-0.092
Token!
Token ID0
Feature activation-4.15e-01

Pos logit contributions
ert+0.027
oris+0.025
izzy+0.024
ion+0.024
 log+0.023
Neg logit contributions
 faces-0.014
ulating-0.014
 Sorry-0.014
Thank-0.014
 sav-0.013
Token This
Token ID770
Feature activation-2.47e+00

Pos logit contributions
ert+0.018
oris+0.016
ion+0.016
 log+0.015
izzy+0.015
Neg logit contributions
 Sorry-0.009
Thank-0.008
ulating-0.008
 sav-0.008
 faces-0.008
Token is
Token ID318
Feature activation+2.42e-01

Pos logit contributions
ert+0.089
oris+0.081
ion+0.080
izzy+0.080
ions+0.074
Neg logit contributions
 Sorry-0.061
 faces-0.061
 sav-0.060
 people-0.059
ulating-0.059
Token a
Token ID257
Feature activation+8.86e-02

Pos logit contributions
 Sorry+0.006
 faces+0.006
Thank+0.006
ulating+0.006
 sav+0.006
Neg logit contributions
ert-0.009
oris-0.008
izzy-0.008
ion-0.008
 log-0.008
Token there
Token ID612
Feature activation+1.18e+00

Pos logit contributions
ert+0.018
oris+0.018
ion+0.018
 log+0.017
izzy+0.017
Neg logit contributions
 Sorry-0.003
 faces-0.003
 were-0.003
 so-0.003
Thank-0.003
Token were
Token ID547
Feature activation+3.05e-01

Pos logit contributions
 faces+0.027
ulating+0.026
 Sorry+0.026
Thank+0.026
 sav+0.025
Neg logit contributions
ert-0.047
oris-0.044
izzy-0.042
ion-0.042
 log-0.040
Token two
Token ID734
Feature activation+3.86e-01

Pos logit contributions
ulating+0.008
 faces+0.008
 Sorry+0.008
Thank+0.008
 sav+0.007
Neg logit contributions
ert-0.011
oris-0.010
izzy-0.010
ion-0.010
 log-0.009
Token friends
Token ID2460
Feature activation+1.57e-01

Pos logit contributions
 Sorry+0.010
Thank+0.009
ulating+0.009
thank+0.009
 sav+0.008
Neg logit contributions
ert-0.015
oris-0.014
 log-0.014
izzy-0.014
ion-0.013
Token,
Token ID11
Feature activation-1.15e+00

Pos logit contributions
 faces+0.003
 Sorry+0.003
ulating+0.003
Thank+0.003
 sav+0.003
Neg logit contributions
ert-0.007
oris-0.006
izzy-0.006
ion-0.006
 log-0.006
Token Lydia
Token ID47766
Feature activation-3.80e+00

Pos logit contributions
ert+0.048
oris+0.045
ion+0.044
 outage+0.041
izzy+0.041
Neg logit contributions
Thank-0.024
 sav-0.023
 Sorry-0.022
 faces-0.022
 thanked-0.022
Token and
Token ID290
Feature activation-6.41e-01

Pos logit contributions
ert+0.193
oris+0.183
izzy+0.177
ion+0.174
 log+0.169
Neg logit contributions
 faces-0.047
 thanked-0.046
ulating-0.045
 sav-0.044
Thank-0.041
Token Tim
Token ID5045
Feature activation-1.22e+00

Pos logit contributions
ert+0.025
ion+0.023
oris+0.023
 stool+0.022
izzy+0.022
Neg logit contributions
 Sorry-0.016
Thank-0.015
 faces-0.015
ulating-0.014
 sav-0.014
Token.
Token ID13
Feature activation-1.44e+00

Pos logit contributions
ert+0.045
oris+0.042
ion+0.040
izzy+0.039
 stool+0.038
Neg logit contributions
 faces-0.032
Thank-0.031
ulating-0.031
 Sorry-0.030
 sav-0.030
Token Every
Token ID3887
Feature activation-2.22e+00

Pos logit contributions
ert+0.063
oris+0.060
izzy+0.057
ion+0.055
 log+0.055
Neg logit contributions
 Sorry-0.027
 faces-0.025
ulating-0.024
Thank-0.024
 sav-0.023
Token day
Token ID1110
Feature activation-2.63e-01

Pos logit contributions
ert+0.103
oris+0.100
izzy+0.092
 stool+0.092
ion+0.092
Neg logit contributions
 faces-0.034
Thank-0.034
aw-0.033
 Sorry-0.033
 sound-0.033
Token were
Token ID547
Feature activation-5.08e-02

Pos logit contributions
 faces+0.022
 Sorry+0.022
ulating+0.022
Thank+0.022
 sav+0.021
Neg logit contributions
ert-0.042
oris-0.039
izzy-0.037
ion-0.037
 log-0.036
Token two
Token ID734
Feature activation+3.35e-01

Pos logit contributions
ert+0.002
oris+0.002
izzy+0.002
ion+0.002
 log+0.002
Neg logit contributions
ulating-0.001
 faces-0.001
 Sorry-0.001
Thank-0.001
 sav-0.001
Token sisters
Token ID15153
Feature activation+4.23e-01

Pos logit contributions
 Sorry+0.008
Thank+0.008
ulating+0.008
thank+0.007
 sav+0.007
Neg logit contributions
ert-0.013
oris-0.012
 log-0.012
izzy-0.012
ion-0.011
Token,
Token ID11
Feature activation-1.40e+00

Pos logit contributions
 Sorry+0.010
Thank+0.010
 faces+0.009
ulating+0.009
 sav+0.009
Neg logit contributions
ert-0.017
oris-0.016
izzy-0.015
ion-0.015
 log-0.014
Token M
Token ID337
Feature activation-1.34e+00

Pos logit contributions
ert+0.060
oris+0.055
ion+0.052
 log+0.050
izzy+0.049
Neg logit contributions
Thank-0.031
 sav-0.029
 faces-0.028
 Sorry-0.028
 thanked-0.028
Tokenilly
Token ID6548
Feature activation-3.74e+00

Pos logit contributions
oris+0.040
ert+0.037
 log+0.037
ion+0.036
driver+0.034
Neg logit contributions
 faces-0.041
 face-0.036
view-0.036
aw-0.035
clock-0.035
Token and
Token ID290
Feature activation-1.02e+00

Pos logit contributions
ert+0.196
oris+0.187
ion+0.179
izzy+0.175
igsaw+0.174
Neg logit contributions
 faces-0.040
 thanked-0.039
 sav-0.034
 face-0.032
 with-0.032
Token Molly
Token ID30236
Feature activation-1.40e+00

Pos logit contributions
ert+0.039
oris+0.035
ion+0.034
izzy+0.033
 log+0.033
Neg logit contributions
 Sorry-0.026
Thank-0.026
 faces-0.026
 sav-0.025
ulating-0.025
Token.
Token ID13
Feature activation-2.02e+00

Pos logit contributions
ert+0.059
oris+0.054
izzy+0.053
ion+0.052
 log+0.050
Neg logit contributions
 Sorry-0.029
 faces-0.029
Thank-0.028
ulating-0.028
 sav-0.027
Token M
Token ID337
Feature activation-5.87e-01

Pos logit contributions
ert+0.087
oris+0.081
izzy+0.078
ion+0.078
 log+0.075
Neg logit contributions
 faces-0.040
 Sorry-0.039
Thank-0.039
ulating-0.039
 sav-0.037
Tokenilly
Token ID6548
Feature activation-2.11e+00

Pos logit contributions
ert+0.019
oris+0.018
ion+0.018
 log+0.017
driver+0.017
Neg logit contributions
 faces-0.017
ulating-0.016
aw-0.015
 sav-0.015
Thank-0.015
Tokenmy
Token ID1820
Feature activation-2.36e-01

Pos logit contributions
 faces+0.001
Thank+0.001
 sav+0.001
 thanked+0.001
clock+0.001
Neg logit contributions
ert-0.001
ion-0.001
oris-0.001
ions-0.001
 stool-0.001
Token told
Token ID1297
Feature activation-1.41e-01

Pos logit contributions
ert+0.011
oris+0.011
ion+0.010
izzy+0.010
ions+0.010
Neg logit contributions
 faces-0.004
Thank-0.004
 sav-0.003
 Sorry-0.003
ulating-0.003
Token her
Token ID607
Feature activation-6.65e-01

Pos logit contributions
ert+0.007
oris+0.006
izzy+0.006
ccoli+0.006
 outage+0.006
Neg logit contributions
 faces-0.003
 sound-0.002
 people-0.002
 sav-0.002
 face-0.002
Token about
Token ID546
Feature activation-1.10e+00

Pos logit contributions
ert+0.035
oris+0.034
ion+0.031
izzy+0.031
ions+0.031
Neg logit contributions
 faces-0.010
 face-0.009
 people-0.008
 sound-0.007
 so-0.007
Token the
Token ID262
Feature activation-1.32e+00

Pos logit contributions
ert+0.047
oris+0.045
izzy+0.043
ion+0.040
ela+0.039
Neg logit contributions
 faces-0.023
Thank-0.022
 sav-0.022
 Sorry-0.021
 sound-0.021
Token broken
Token ID5445
Feature activation-3.71e+00

Pos logit contributions
ert+0.048
oris+0.043
 log+0.042
ion+0.041
izzy+0.040
Neg logit contributions
 Sorry-0.035
ulating-0.034
Thank-0.033
 faces-0.033
 sav-0.032
Token whistle
Token ID16121
Feature activation-6.13e-01

Pos logit contributions
ert+0.143
oris+0.131
izzy+0.126
ion+0.126
 log+0.120
Neg logit contributions
 faces-0.091
 Sorry-0.089
Thank-0.088
ulating-0.088
 sav-0.085
Token.
Token ID13
Feature activation-5.78e-01

Pos logit contributions
ert+0.022
ion+0.020
oris+0.020
izzy+0.019
 log+0.019
Neg logit contributions
 Sorry-0.017
Thank-0.016
ulating-0.016
 faces-0.016
 sav-0.016
Token His
Token ID2399
Feature activation-4.10e-02

Pos logit contributions
ert+0.023
oris+0.021
izzy+0.020
ion+0.020
 log+0.019
Neg logit contributions
 faces-0.014
 Sorry-0.013
Thank-0.013
ulating-0.013
 sav-0.013
Token mom
Token ID1995
Feature activation+9.82e-01

Pos logit contributions
ert+0.001
oris+0.001
izzy+0.001
 log+0.001
ion+0.001
Neg logit contributions
ulating-0.001
 Sorry-0.001
Thank-0.001
 faces-0.001
 sav-0.001
Token said
Token ID531
Feature activation+1.02e-01

Pos logit contributions
 Sorry+0.019
 faces+0.019
ulating+0.019
Thank+0.019
 sav+0.018
Neg logit contributions
ert-0.042
oris-0.039
izzy-0.038
ion-0.038
 log-0.036
Token fun
Token ID1257
Feature activation+2.37e-01

Pos logit contributions
 Sorry+0.006
 faces+0.006
Thank+0.006
ulating+0.006
 sav+0.005
Neg logit contributions
ert-0.020
oris-0.019
ion-0.018
izzy-0.018
 log-0.018
Token.
Token ID13
Feature activation+5.58e-02

Pos logit contributions
 Sorry+0.006
Thank+0.006
ulating+0.005
 faces+0.005
 sav+0.005
Neg logit contributions
ert-0.009
oris-0.008
ion-0.008
 log-0.008
izzy-0.008
Token They
Token ID1119
Feature activation-5.40e-01

Pos logit contributions
 faces+0.001
ulating+0.001
 Sorry+0.001
Thank+0.001
 sav+0.001
Neg logit contributions
ert-0.002
oris-0.002
ion-0.002
izzy-0.002
 log-0.002
Token thanked
Token ID26280
Feature activation-3.61e-01

Pos logit contributions
ert+0.024
oris+0.022
izzy+0.022
ion+0.022
 log+0.021
Neg logit contributions
 Sorry-0.010
 faces-0.010
ulating-0.010
Thank-0.009
 sav-0.009
Token the
Token ID262
Feature activation-1.28e+00

Pos logit contributions
ert+0.013
oris+0.012
ion+0.011
izzy+0.011
 log+0.011
Neg logit contributions
ulating-0.010
 Sorry-0.009
Thank-0.009
 faces-0.009
 sav-0.009
Token old
Token ID1468
Feature activation-3.71e+00

Pos logit contributions
ert+0.052
oris+0.047
ion+0.046
izzy+0.046
ions+0.044
Neg logit contributions
Thank-0.030
 faces-0.030
ulating-0.029
 sav-0.028
 people-0.028
Token man
Token ID582
Feature activation-1.54e+00

Pos logit contributions
ert+0.144
izzy+0.136
ion+0.133
ions+0.130
oris+0.126
Neg logit contributions
 sound-0.094
Thank-0.093
 people-0.093
 faces-0.092
thank-0.091
Token for
Token ID329
Feature activation+2.16e-01

Pos logit contributions
ert+0.058
oris+0.053
izzy+0.051
ion+0.050
 log+0.048
Neg logit contributions
 faces-0.039
 Sorry-0.039
Thank-0.038
ulating-0.038
 sav-0.037
Token allowing
Token ID5086
Feature activation-2.21e-01

Pos logit contributions
 Sorry+0.005
 faces+0.005
ulating+0.005
Thank+0.005
clock+0.005
Neg logit contributions
ert-0.008
oris-0.008
ion-0.007
izzy-0.007
 log-0.007
Token them
Token ID606
Feature activation+3.99e-01

Pos logit contributions
ert+0.007
ion+0.006
oris+0.006
ela+0.006
izzy+0.005
Neg logit contributions
 Sorry-0.008
Thank-0.007
ulating-0.007
clock-0.006
thank-0.006
Token to
Token ID284
Feature activation-4.73e-01

Pos logit contributions
 Sorry+0.013
Thank+0.013
 sav+0.013
 sudden+0.012
ulating+0.012
Neg logit contributions
ert-0.013
oris-0.011
ion-0.010
izzy-0.010
 log-0.009
Token had
Token ID550
Feature activation-2.89e-01

Pos logit contributions
 faces+0.005
 Sorry+0.005
Thank+0.005
ulating+0.005
 sav+0.004
Neg logit contributions
ert-0.015
oris-0.014
ion-0.014
izzy-0.014
 log-0.013
Token a
Token ID257
Feature activation-2.03e-01

Pos logit contributions
ert+0.006
izzy+0.005
 sheet+0.005
 log+0.005
 spider+0.005
Neg logit contributions
ulating-0.011
 Sorry-0.011
 faces-0.011
umm-0.010
thank-0.010
Token tutor
Token ID48680
Feature activation+9.04e-01

Pos logit contributions
ert+0.009
ion+0.008
oris+0.008
izzy+0.008
ions+0.008
Neg logit contributions
 Sorry-0.004
 faces-0.004
Thank-0.004
 sav-0.004
 sudden-0.004
Token named
Token ID3706
Feature activation-2.00e+00

Pos logit contributions
 faces+0.024
 Sorry+0.023
ulating+0.023
Thank+0.022
 sav+0.021
Neg logit contributions
ert-0.032
oris-0.030
izzy-0.030
ion-0.029
 log-0.028
Token Mrs
Token ID9074
Feature activation-1.52e+00

Pos logit contributions
ert+0.068
oris+0.062
ion+0.060
izzy+0.059
 log+0.057
Neg logit contributions
 Sorry-0.057
 faces-0.056
Thank-0.056
ulating-0.055
 sav-0.054
Token.
Token ID13
Feature activation-3.71e+00

Pos logit contributions
ert+0.076
oris+0.071
ion+0.070
izzy+0.068
 log+0.067
Neg logit contributions
 faces-0.021
Thank-0.018
ulating-0.017
 Sorry-0.017
clock-0.017
Token Smith
Token ID4176
Feature activation-6.48e-01

Pos logit contributions
ert+0.125
oris+0.114
izzy+0.109
ion+0.108
 log+0.103
Neg logit contributions
 faces-0.107
 Sorry-0.105
Thank-0.105
ulating-0.105
 sav-0.101
Token.
Token ID13
Feature activation-1.52e+00

Pos logit contributions
ert+0.033
oris+0.031
 stool+0.029
ion+0.029
izzy+0.029
Neg logit contributions
 faces-0.007
 thanked-0.007
Thank-0.007
 drawn-0.007
 sav-0.007
Token Mrs
Token ID9074
Feature activation-8.79e-01

Pos logit contributions
ert+0.057
oris+0.053
izzy+0.051
ion+0.050
 log+0.049
Neg logit contributions
 faces-0.037
ulating-0.036
 Sorry-0.036
Thank-0.035
aw-0.035
Token.
Token ID13
Feature activation-2.81e+00

Pos logit contributions
ert+0.044
oris+0.040
ion+0.039
izzy+0.039
 log+0.039
Neg logit contributions
 faces-0.013
Thank-0.012
 Sorry-0.011
ulating-0.011
clock-0.010
Token Smith
Token ID4176
Feature activation+4.11e-01

Pos logit contributions
ert+0.092
oris+0.083
ion+0.080
izzy+0.079
 log+0.076
Neg logit contributions
 faces-0.083
 Sorry-0.083
Thank-0.082
ulating-0.082
 sav-0.080
Token she
Token ID673
Feature activation+4.62e-02

Pos logit contributions
ert+0.042
oris+0.042
ion+0.041
 log+0.040
 outage+0.039
Neg logit contributions
 Sorry-0.022
 faces-0.022
 people-0.022
 thanked-0.020
 were-0.020
Token was
Token ID373
Feature activation-8.38e-01

Pos logit contributions
 faces+0.001
 Sorry+0.001
ulating+0.001
clock+0.001
Thank+0.001
Neg logit contributions
ert-0.002
oris-0.001
izzy-0.001
ion-0.001
 log-0.001
Token inside
Token ID2641
Feature activation-1.47e-01

Pos logit contributions
ert+0.031
oris+0.028
ion+0.027
izzy+0.027
 log+0.027
Neg logit contributions
 Sorry-0.023
 faces-0.022
Thank-0.021
thank-0.020
ulating-0.020
Token,
Token ID11
Feature activation+8.17e-02

Pos logit contributions
ert+0.006
oris+0.005
ion+0.005
izzy+0.005
 log+0.005
Neg logit contributions
 Sorry-0.004
ulating-0.003
 faces-0.003
Thank-0.003
 sav-0.003
Token D
Token ID360
Feature activation-7.55e-02

Pos logit contributions
ulating+0.002
 faces+0.002
 Sorry+0.002
Thank+0.002
 sav+0.002
Neg logit contributions
ert-0.003
oris-0.002
ion-0.002
izzy-0.002
 log-0.002
Tokenott
Token ID1252
Feature activation-3.67e+00

Pos logit contributions
ert+0.003
oris+0.002
ion+0.002
 stool+0.002
 log+0.002
Neg logit contributions
ulating-0.002
 faces-0.002
Thank-0.002
aw-0.002
 sav-0.002
Tokenie
Token ID494
Feature activation-1.75e-01

Pos logit contributions
ert+0.110
oris+0.109
ion+0.103
 stool+0.100
izzy+0.100
Neg logit contributions
 faces-0.108
aw-0.104
 sav-0.104
ulating-0.103
thank-0.101
Token saw
Token ID2497
Feature activation+8.81e-01

Pos logit contributions
ert+0.007
oris+0.006
 log+0.006
izzy+0.006
 stool+0.005
Neg logit contributions
 Sorry-0.005
ulating-0.004
Thank-0.004
 faces-0.004
thank-0.004
Token lots
Token ID6041
Feature activation+5.21e-02

Pos logit contributions
 Sorry+0.024
ulating+0.023
 faces+0.023
Thank+0.023
clock+0.021
Neg logit contributions
ert-0.032
oris-0.029
ion-0.029
izzy-0.028
 log-0.028
Token of
Token ID286
Feature activation+7.03e-01

Pos logit contributions
 Sorry+0.001
 faces+0.001
Thank+0.001
ulating+0.001
 sav+0.001
Neg logit contributions
ert-0.002
ion-0.002
oris-0.002
izzy-0.002
 log-0.002
Token colorful
Token ID20239
Feature activation-3.43e-01

Pos logit contributions
 Sorry+0.018
ulating+0.017
Thank+0.017
clock+0.015
thank+0.015
Neg logit contributions
ert-0.027
ion-0.025
 log-0.025
oris-0.024
izzy-0.024
Token both
Token ID1111
Feature activation-4.55e-01

Pos logit contributions
Thank+0.004
 sav+0.004
 faces+0.004
 thanked+0.004
 Sorry+0.004
Neg logit contributions
ert-0.017
oris-0.016
ion-0.016
izzy-0.015
 stool-0.015
Token danced
Token ID39480
Feature activation+2.54e-02

Pos logit contributions
ert+0.020
oris+0.019
ion+0.018
izzy+0.018
 stool+0.018
Neg logit contributions
Thank-0.008
 faces-0.008
 Sorry-0.008
 sav-0.008
ulating-0.008
Token around
Token ID1088
Feature activation-1.07e+00

Pos logit contributions
 Sorry+0.001
 faces+0.001
Thank+0.001
ulating+0.001
 sav+0.001
Neg logit contributions
ert-0.001
oris-0.001
izzy-0.001
ion-0.001
 log-0.001
Token the
Token ID262
Feature activation+9.63e-02

Pos logit contributions
ert+0.044
oris+0.041
izzy+0.040
ion+0.040
 stool+0.038
Neg logit contributions
 faces-0.023
Thank-0.023
ulating-0.022
 sav-0.022
 Sorry-0.022
Token z
Token ID1976
Feature activation-3.26e-01

Pos logit contributions
Thank+0.002
 faces+0.002
 sav+0.002
 Sorry+0.002
ulating+0.002
Neg logit contributions
ert-0.004
oris-0.003
ion-0.003
izzy-0.003
ions-0.003
Tokenig
Token ID328
Feature activation-3.65e+00

Pos logit contributions
ccoli+0.007
 stool+0.006
 outage+0.006
ert+0.006
oris+0.006
Neg logit contributions
aw-0.014
 faces-0.012
 sav-0.012
 Sorry-0.011
Thank-0.011
Tokenzag
Token ID50183
Feature activation-1.69e+00

Pos logit contributions
ert+0.046
oris+0.045
ion+0.040
izzy+0.038
 stool+0.035
Neg logit contributions
 faces-0.165
ulating-0.162
 sav-0.161
aw-0.161
Thank-0.161
Token together
Token ID1978
Feature activation-9.01e-01

Pos logit contributions
ert+0.069
oris+0.063
ion+0.061
izzy+0.061
 log+0.059
Neg logit contributions
 Sorry-0.037
 faces-0.037
Thank-0.037
ulating-0.036
 sav-0.035
Token!
Token ID0
Feature activation-2.36e-01

Pos logit contributions
ert+0.034
oris+0.029
 log+0.029
ion+0.028
 stool+0.027
Neg logit contributions
 Sorry-0.026
Thank-0.023
ulating-0.023
clock-0.022
 faces-0.022
Token<|endoftext|>
Token ID50256
Feature activation-2.41e-01

Pos logit contributions
ert+0.011
oris+0.010
izzy+0.010
ion+0.010
 stool+0.010
Neg logit contributions
 Sorry-0.004
Thank-0.004
ulating-0.004
 faces-0.004
 sav-0.004
TokenJohn
Token ID7554
Feature activation+3.33e-01

Pos logit contributions
ert+0.009
oris+0.009
ion+0.008
izzy+0.008
 stool+0.008
Neg logit contributions
Thank-0.006
 Sorry-0.006
 sav-0.005
 faces-0.005
aw-0.005
Token\n
Token ID198
Feature activation-9.30e-01

Pos logit contributions
ert+0.014
oris+0.013
ion+0.013
izzy+0.013
 log+0.013
Neg logit contributions
Thank-0.006
 faces-0.006
 Sorry-0.006
ulating-0.006
 sav-0.006
Token"
Token ID1
Feature activation+1.81e-01

Pos logit contributions
ert+0.031
oris+0.028
izzy+0.026
ion+0.026
 log+0.025
Neg logit contributions
 faces-0.028
 Sorry-0.027
ulating-0.027
Thank-0.027
 sav-0.026
TokenI
Token ID40
Feature activation-1.12e+00

Pos logit contributions
 faces+0.004
ulating+0.004
 sav+0.004
 Sorry+0.004
clock+0.004
Neg logit contributions
ert-0.007
oris-0.007
izzy-0.007
ion-0.006
 log-0.006
Token own
Token ID898
Feature activation-1.03e+00

Pos logit contributions
ert+0.047
oris+0.044
ion+0.042
izzy+0.042
 log+0.040
Neg logit contributions
 faces-0.023
 Sorry-0.023
Thank-0.023
ulating-0.023
 sav-0.022
Token this
Token ID428
Feature activation-3.48e+00

Pos logit contributions
ert+0.038
oris+0.035
izzy+0.035
ion+0.033
 stool+0.031
Neg logit contributions
 faces-0.026
 Sorry-0.026
 sav-0.026
Thank-0.025
ulating-0.025
Token yellow
Token ID7872
Feature activation-3.63e+00

Pos logit contributions
ert+0.094
oris+0.084
izzy+0.083
ion+0.083
ions+0.072
Neg logit contributions
Thank-0.122
 sound-0.121
 faces-0.121
 sav-0.120
 Sorry-0.120
Token shield
Token ID7614
Feature activation-6.26e-01

Pos logit contributions
ert+0.105
izzy+0.097
ion+0.096
oris+0.095
ela+0.085
Neg logit contributions
 faces-0.125
 face-0.121
 sound-0.120
 people-0.120
 pictures-0.117
Token!"
Token ID2474
Feature activation+9.26e-01

Pos logit contributions
ert+0.022
oris+0.020
ion+0.020
izzy+0.019
 log+0.019
Neg logit contributions
ulating-0.017
 faces-0.017
 sudden-0.016
 Sorry-0.016
Thank-0.016
Token Lily
Token ID20037
Feature activation-8.59e-01

Pos logit contributions
ulating+0.026
clock+0.025
 faces+0.025
 Sorry+0.024
 habitat+0.024
Neg logit contributions
ert-0.031
izzy-0.029
ion-0.029
oris-0.028
 log-0.026
Token replied
Token ID8712
Feature activation-1.18e+00

Pos logit contributions
ert+0.035
oris+0.032
izzy+0.031
ion+0.030
 log+0.029
Neg logit contributions
 Sorry-0.019
Thank-0.018
 faces-0.018
ulating-0.018
aw-0.017
Token with
Token ID351
Feature activation+1.70e+00

Pos logit contributions
ert+0.048
oris+0.044
izzy+0.043
ion+0.043
 log+0.041
Neg logit contributions
 Sorry-0.025
Thank-0.025
 faces-0.024
ulating-0.024
 sav-0.024
Token even
Token ID772
Feature activation-1.57e+00

Pos logit contributions
ert+0.035
ion+0.033
 log+0.032
 stool+0.031
 opener+0.031
Neg logit contributions
 Sorry-0.034
ulating-0.030
 faces-0.030
aw-0.030
 endings-0.029
Token cut
Token ID2005
Feature activation-5.32e-02

Pos logit contributions
ert+0.041
 log+0.040
ion+0.040
 stool+0.039
izzy+0.039
Neg logit contributions
 Sorry-0.054
 faces-0.049
ulating-0.049
 habitat-0.047
 endings-0.046
Tokener
Token ID263
Feature activation-1.05e+00

Pos logit contributions
ert+0.002
ion+0.002
oris+0.002
izzy+0.002
 log+0.002
Neg logit contributions
 Sorry-0.001
 faces-0.001
Thank-0.001
ulating-0.001
 sav-0.001
Token than
Token ID621
Feature activation-8.50e-01

Pos logit contributions
ert+0.041
oris+0.037
ion+0.036
izzy+0.036
 log+0.034
Neg logit contributions
 Sorry-0.028
 faces-0.026
Thank-0.026
 sudden-0.025
 sav-0.025
Token the
Token ID262
Feature activation-6.60e-01

Pos logit contributions
ert+0.032
oris+0.029
ion+0.029
izzy+0.028
 log+0.027
Neg logit contributions
 Sorry-0.021
 faces-0.021
ulating-0.021
Thank-0.021
 sav-0.020
Token old
Token ID1468
Feature activation-3.62e+00

Pos logit contributions
ert+0.016
 log+0.014
 stool+0.014
 spider+0.014
oris+0.014
Neg logit contributions
 Sorry-0.025
ulating-0.025
aw-0.024
 faces-0.024
Thank-0.023
Token one
Token ID530
Feature activation-7.02e-01

Pos logit contributions
ert+0.117
oris+0.105
izzy+0.102
ion+0.100
ions+0.091
Neg logit contributions
Thank-0.115
 faces-0.114
 sav-0.110
 sound-0.109
 Sorry-0.109
Token.
Token ID13
Feature activation-2.32e-01

Pos logit contributions
ert+0.028
oris+0.026
izzy+0.025
ion+0.025
 log+0.024
Neg logit contributions
 faces-0.016
 Sorry-0.016
ulating-0.016
Thank-0.016
 sav-0.015
Token Bun
Token ID28515
Feature activation-9.83e-01

Pos logit contributions
ert+0.008
oris+0.007
ion+0.007
izzy+0.007
 log+0.007
Neg logit contributions
 faces-0.006
ulating-0.006
Thank-0.006
 Sorry-0.006
 sav-0.006
Token Bun
Token ID28515
Feature activation-9.02e-01

Pos logit contributions
ert+0.033
oris+0.030
izzy+0.029
ion+0.028
 log+0.027
Neg logit contributions
 faces-0.029
 Sorry-0.029
ulating-0.028
Thank-0.028
 sav-0.027
Token was
Token ID373
Feature activation-5.39e-01

Pos logit contributions
ert+0.035
oris+0.032
izzy+0.031
ion+0.030
 log+0.029
Neg logit contributions
 Sorry-0.022
 faces-0.022
ulating-0.021
Thank-0.021
 sav-0.020
Token her
Token ID607
Feature activation-2.60e-01

Pos logit contributions
 faces+0.005
Thank+0.005
aw+0.005
 sudden+0.005
 Sorry+0.005
Neg logit contributions
ert-0.012
oris-0.011
izzy-0.010
ion-0.010
 log-0.010
Token elderly
Token ID13830
Feature activation-9.37e-01

Pos logit contributions
ert+0.009
oris+0.008
ion+0.008
izzy+0.008
 log+0.008
Neg logit contributions
 faces-0.007
 Sorry-0.007
Thank-0.007
ulating-0.007
 sav-0.007
Token neighbor
Token ID4780
Feature activation+5.20e-01

Pos logit contributions
ert+0.038
oris+0.035
ion+0.033
izzy+0.033
 log+0.031
Neg logit contributions
 faces-0.023
Thank-0.022
 Sorry-0.021
 sav-0.021
ulating-0.021
Token,
Token ID11
Feature activation+4.51e-01

Pos logit contributions
 Sorry+0.016
ulating+0.015
Thank+0.015
 faces+0.015
thank+0.014
Neg logit contributions
ert-0.016
 log-0.015
izzy-0.015
ion-0.015
 spider-0.014
Token Mrs
Token ID9074
Feature activation-1.05e+00

Pos logit contributions
 faces+0.011
 Sorry+0.011
ulating+0.011
Thank+0.011
 sav+0.011
Neg logit contributions
ert-0.017
oris-0.015
izzy-0.015
ion-0.015
 log-0.014
Token.
Token ID13
Feature activation-3.61e+00

Pos logit contributions
ert+0.050
oris+0.047
ion+0.045
izzy+0.045
 log+0.044
Neg logit contributions
 faces-0.017
Thank-0.016
ulating-0.015
 Sorry-0.015
 sav-0.014
Token Johnson
Token ID5030
Feature activation-3.21e-01

Pos logit contributions
ert+0.113
oris+0.103
izzy+0.097
ion+0.095
 log+0.092
Neg logit contributions
 faces-0.113
ulating-0.111
 Sorry-0.110
Thank-0.109
aw-0.107
Token,
Token ID11
Feature activation+1.06e+00

Pos logit contributions
ert+0.012
oris+0.011
ion+0.011
izzy+0.011
 log+0.010
Neg logit contributions
 Sorry-0.008
ulating-0.008
Thank-0.008
 faces-0.008
 sav-0.008
Token struggling
Token ID9648
Feature activation+3.05e-01

Pos logit contributions
 Sorry+0.031
ulating+0.031
 faces+0.030
clock+0.030
umm+0.028
Neg logit contributions
ert-0.032
izzy-0.030
ion-0.030
 log-0.030
oris-0.029
Token to
Token ID284
Feature activation-1.19e-01

Pos logit contributions
 Sorry+0.008
ulating+0.008
 faces+0.008
Thank+0.008
 sav+0.008
Neg logit contributions
ert-0.011
oris-0.010
ion-0.010
izzy-0.010
 log-0.009
Token carry
Token ID3283
Feature activation+2.88e-01

Pos logit contributions
ert+0.004
oris+0.004
 log+0.004
izzy+0.004
ion+0.004
Neg logit contributions
 Sorry-0.003
aw-0.003
ulating-0.003
 faces-0.003
clock-0.003
Token on
Token ID319
Feature activation-7.68e-01

Pos logit contributions
 faces+0.012
 Sorry+0.012
Thank+0.012
ulating+0.012
 sav+0.011
Neg logit contributions
ert-0.028
oris-0.026
izzy-0.025
ion-0.025
 log-0.024
Token the
Token ID262
Feature activation-1.01e+00

Pos logit contributions
ert+0.033
oris+0.031
izzy+0.030
ion+0.028
ela+0.027
Neg logit contributions
 sav-0.016
 faces-0.016
Thank-0.015
 Sorry-0.015
ulating-0.014
Token beach
Token ID10481
Feature activation-7.92e-02

Pos logit contributions
ert+0.040
oris+0.036
ion+0.035
izzy+0.035
 log+0.034
Neg logit contributions
 Sorry-0.024
Thank-0.024
 faces-0.024
ulating-0.023
 sav-0.022
Token.
Token ID13
Feature activation-9.09e-01

Pos logit contributions
ert+0.002
 log+0.002
 stool+0.002
ion+0.002
oris+0.002
Neg logit contributions
 Sorry-0.003
Thank-0.002
ulating-0.002
 sudden-0.002
 faces-0.002
Token The
Token ID383
Feature activation-1.52e+00

Pos logit contributions
ert+0.039
oris+0.036
izzy+0.035
ion+0.035
 log+0.034
Neg logit contributions
 faces-0.018
 Sorry-0.018
Thank-0.018
ulating-0.018
 sav-0.017
Token old
Token ID1468
Feature activation-3.58e+00

Pos logit contributions
ert+0.062
oris+0.057
izzy+0.055
ion+0.054
 log+0.052
Neg logit contributions
 faces-0.035
Thank-0.034
 Sorry-0.034
ulating-0.034
 sav-0.032
Token man
Token ID582
Feature activation+2.19e-01

Pos logit contributions
ert+0.171
oris+0.161
izzy+0.160
ion+0.157
ions+0.153
Neg logit contributions
 faces-0.065
Thank-0.065
 sound-0.062
 thanked-0.061
 people-0.059
Token invited
Token ID9392
Feature activation+4.40e-01

Pos logit contributions
 Sorry+0.006
ulating+0.006
umm+0.006
 faces+0.006
Thank+0.006
Neg logit contributions
ert-0.007
 log-0.006
izzy-0.006
oris-0.006
ion-0.006
Token the
Token ID262
Feature activation-1.67e+00

Pos logit contributions
 Sorry+0.011
ulating+0.011
Thank+0.011
 faces+0.011
 sav+0.010
Neg logit contributions
ert-0.017
oris-0.015
ion-0.015
 log-0.014
izzy-0.014
Token boy
Token ID2933
Feature activation-7.93e-01

Pos logit contributions
ert+0.069
oris+0.064
izzy+0.061
ion+0.061
 log+0.058
Neg logit contributions
 faces-0.037
Thank-0.036
 Sorry-0.036
ulating-0.036
 sav-0.034
Token to
Token ID284
Feature activation+3.16e-02

Pos logit contributions
ert+0.035
oris+0.033
izzy+0.032
 stool+0.030
ion+0.030
Neg logit contributions
 faces-0.015
ulating-0.014
 sav-0.014
Thank-0.013
aw-0.013
Token\n
Token ID198
Feature activation-2.54e-01

Pos logit contributions
 faces+0.009
 Sorry+0.008
Thank+0.008
ulating+0.008
 sav+0.008
Neg logit contributions
ert-0.020
oris-0.019
izzy-0.018
ion-0.018
 log-0.017
TokenSuddenly
Token ID38582
Feature activation-3.04e-01

Pos logit contributions
ert+0.011
oris+0.010
izzy+0.010
ion+0.010
 log+0.009
Neg logit contributions
 faces-0.005
 Sorry-0.005
ulating-0.005
Thank-0.005
 sav-0.005
Token,
Token ID11
Feature activation+1.73e-02

Pos logit contributions
ert+0.014
oris+0.013
izzy+0.013
ion+0.013
 stool+0.012
Neg logit contributions
 faces-0.005
 Sorry-0.005
Thank-0.005
ulating-0.005
 sudden-0.005
Token they
Token ID484
Feature activation+1.14e+00

Pos logit contributions
 faces+0.000
 Sorry+0.000
Thank+0.000
ulating+0.000
 sav+0.000
Neg logit contributions
ert-0.001
oris-0.001
izzy-0.001
ion-0.001
 log-0.001
Token sp
Token ID599
Feature activation+1.73e-01

Pos logit contributions
 Sorry+0.036
ulating+0.035
Thank+0.034
thank+0.033
umm+0.032
Neg logit contributions
ert-0.038
izzy-0.033
oris-0.033
 log-0.032
 sheet-0.030
Tokenott
Token ID1252
Feature activation-3.56e+00

Pos logit contributions
 faces+0.005
ulating+0.005
 Sorry+0.005
aw+0.005
Thank+0.005
Neg logit contributions
ert-0.006
oris-0.005
izzy-0.005
ion-0.005
 log-0.005
Tokenled
Token ID992
Feature activation+2.85e-01

Pos logit contributions
oris+0.113
ert+0.111
 stool+0.109
ion+0.105
 opener+0.105
Neg logit contributions
aw-0.097
 faces-0.093
thank-0.089
ulating-0.089
 sav-0.088
Token a
Token ID257
Feature activation-1.39e-01

Pos logit contributions
 Sorry+0.007
Thank+0.007
ulating+0.007
 sav+0.007
 faces+0.006
Neg logit contributions
ert-0.011
oris-0.010
 log-0.010
ion-0.010
izzy-0.009
Token bicycle
Token ID17026
Feature activation-1.54e-01

Pos logit contributions
ert+0.005
oris+0.005
 log+0.005
izzy+0.005
 stool+0.005
Neg logit contributions
 Sorry-0.003
ulating-0.003
 faces-0.003
Thank-0.003
 sav-0.003
Token.
Token ID13
Feature activation-4.94e-01

Pos logit contributions
ert+0.006
oris+0.005
ion+0.005
 log+0.005
izzy+0.005
Neg logit contributions
 Sorry-0.004
 faces-0.004
Thank-0.004
ulating-0.004
 sav-0.003
Token It
Token ID632
Feature activation-6.91e-01

Pos logit contributions
ert+0.021
oris+0.020
ion+0.019
izzy+0.019
 log+0.018
Neg logit contributions
 faces-0.010
Thank-0.009
 Sorry-0.009
ulating-0.009
 sav-0.009
Token there
Token ID612
Feature activation+6.77e-01

Pos logit contributions
oris+0.020
ion+0.020
 stool+0.018
ert+0.018
roach+0.018
Neg logit contributions
 were-0.019
 at-0.017
 so-0.017
 sad-0.017
 people-0.017
Token were
Token ID547
Feature activation-8.21e-02

Pos logit contributions
 faces+0.013
 Sorry+0.013
Thank+0.013
ulating+0.013
 sav+0.012
Neg logit contributions
ert-0.029
oris-0.027
ion-0.026
izzy-0.026
 log-0.025
Token two
Token ID734
Feature activation+3.52e-01

Pos logit contributions
ert+0.003
oris+0.003
izzy+0.003
ion+0.003
 log+0.003
Neg logit contributions
ulating-0.002
 faces-0.002
 Sorry-0.002
Thank-0.002
 sav-0.002
Token cats
Token ID11875
Feature activation+1.99e-01

Pos logit contributions
 Sorry+0.009
Thank+0.008
ulating+0.008
thank+0.008
 sav+0.007
Neg logit contributions
ert-0.014
oris-0.013
 log-0.012
izzy-0.012
ion-0.012
Token,
Token ID11
Feature activation-4.30e-01

Pos logit contributions
 Sorry+0.004
 faces+0.004
Thank+0.004
ulating+0.004
 sav+0.004
Neg logit contributions
ert-0.008
oris-0.008
ion-0.007
izzy-0.007
 log-0.007
Token Tig
Token ID14189
Feature activation-3.54e+00

Pos logit contributions
ert+0.019
oris+0.018
ion+0.017
 outage+0.017
igsaw+0.016
Neg logit contributions
Thank-0.008
 so-0.008
 sav-0.007
 faces-0.007
 including-0.007
Tokengy
Token ID1360
Feature activation-2.52e+00

Pos logit contributions
ert+0.105
ions+0.097
izzy+0.096
igsaw+0.096
 teaspoon+0.095
Neg logit contributions
 sav-0.109
Thank-0.106
thank-0.104
ulating-0.104
 thanked-0.103
Token and
Token ID290
Feature activation-4.54e-01

Pos logit contributions
ert+0.091
ion+0.081
oris+0.080
 log+0.078
izzy+0.075
Neg logit contributions
 Sorry-0.073
Thank-0.066
ulating-0.063
 faces-0.062
clock-0.060
Token Snow
Token ID7967
Feature activation-1.29e+00

Pos logit contributions
ert+0.018
ion+0.016
oris+0.016
 stool+0.015
igsaw+0.015
Neg logit contributions
Thank-0.011
 faces-0.011
 Sorry-0.011
 sav-0.010
 people-0.010
Tokeny
Token ID88
Feature activation-1.32e+00

Pos logit contributions
ert+0.043
ion+0.041
 stool+0.041
oris+0.040
 log+0.039
Neg logit contributions
 faces-0.033
ulating-0.032
Thank-0.032
aw-0.031
thank-0.031
Token.
Token ID13
Feature activation-4.32e-01

Pos logit contributions
ert+0.051
oris+0.045
izzy+0.044
ion+0.044
 log+0.044
Neg logit contributions
 Sorry-0.033
Thank-0.032
ulating-0.032
 faces-0.030
 sav-0.030
Token not
Token ID407
Feature activation-1.76e+00

Pos logit contributions
ert+0.025
ion+0.022
oris+0.022
 log+0.022
izzy+0.022
Neg logit contributions
 Sorry-0.017
 faces-0.016
ulating-0.015
Thank-0.015
aw-0.014
Token bigger
Token ID5749
Feature activation+2.00e-01

Pos logit contributions
ert+0.073
oris+0.066
ion+0.064
izzy+0.064
 log+0.064
Neg logit contributions
 Sorry-0.039
 faces-0.037
Thank-0.036
ulating-0.036
 sav-0.034
Token than
Token ID621
Feature activation-1.38e+00

Pos logit contributions
 Sorry+0.006
Thank+0.006
 sudden+0.005
thank+0.005
ulating+0.005
Neg logit contributions
ert-0.007
 log-0.006
oris-0.006
 spider-0.006
driver-0.006
Token a
Token ID257
Feature activation-6.91e-01

Pos logit contributions
ert+0.053
ion+0.047
oris+0.046
izzy+0.045
 log+0.044
Neg logit contributions
Thank-0.035
ulating-0.035
 Sorry-0.035
 faces-0.034
 sav-0.033
Token fing
Token ID8038
Feature activation-8.40e-01

Pos logit contributions
ert+0.021
oris+0.019
 log+0.018
izzy+0.018
ion+0.018
Neg logit contributions
ulating-0.022
 faces-0.022
Thank-0.022
 Sorry-0.022
 sav-0.021
Tokenern
Token ID1142
Feature activation-3.54e+00

Pos logit contributions
ert+0.027
oris+0.025
izzy+0.024
 log+0.024
 mattress+0.022
Neg logit contributions
 sav-0.024
 drawn-0.024
ulating-0.024
thank-0.022
aw-0.022
Tokenail
Token ID603
Feature activation-1.48e+00

Pos logit contributions
oris+0.114
ccoli+0.106
oren+0.104
ert+0.104
Alex+0.102
Neg logit contributions
 drawn-0.089
aw-0.086
clock-0.084
 sav-0.083
ulating-0.082
Token.
Token ID13
Feature activation+3.38e-01

Pos logit contributions
ert+0.054
ion+0.050
izzy+0.048
oris+0.048
 log+0.046
Neg logit contributions
 Sorry-0.042
 faces-0.039
Thank-0.039
 sudden-0.037
ulating-0.037
Token They
Token ID1119
Feature activation+5.44e-01

Pos logit contributions
 faces+0.007
Thank+0.007
 Sorry+0.007
ulating+0.007
 sav+0.007
Neg logit contributions
ert-0.014
oris-0.013
ion-0.012
izzy-0.012
 log-0.012
Token also
Token ID635
Feature activation+4.08e-01

Pos logit contributions
ulating+0.010
 Sorry+0.010
 faces+0.010
Thank+0.010
 sav+0.009
Neg logit contributions
ert-0.024
oris-0.022
izzy-0.022
ion-0.021
 log-0.021
Token found
Token ID1043
Feature activation+5.62e-01

Pos logit contributions
 Sorry+0.010
ulating+0.010
 faces+0.010
clock+0.009
 sudden+0.009
Neg logit contributions
ert-0.016
oris-0.014
izzy-0.014
 log-0.014
ion-0.013
TokenExc
Token ID40127
Feature activation-6.09e-01

Pos logit contributions
ert+0.014
oris+0.013
ion+0.012
izzy+0.012
 log+0.012
Neg logit contributions
 faces-0.005
 Sorry-0.005
ulating-0.005
Thank-0.005
 sav-0.004
Tokenuse
Token ID1904
Feature activation+4.85e-01

Pos logit contributions
ert+0.018
oris+0.017
ion+0.016
izzy+0.016
 log+0.015
Neg logit contributions
 faces-0.020
ulating-0.020
 Sorry-0.020
Thank-0.019
 sav-0.018
Token me
Token ID502
Feature activation-8.16e-01

Pos logit contributions
 Sorry+0.016
ulating+0.016
Thank+0.015
 drawn+0.015
 faces+0.015
Neg logit contributions
ert-0.015
oris-0.013
 log-0.012
izzy-0.011
leton-0.011
Token,
Token ID11
Feature activation-5.88e-01

Pos logit contributions
ert+0.034
oris+0.032
izzy+0.031
ion+0.030
 log+0.029
Neg logit contributions
 faces-0.017
 Sorry-0.017
Thank-0.017
ulating-0.017
 sav-0.016
Token Mr
Token ID1770
Feature activation-6.27e-01

Pos logit contributions
ert+0.021
oris+0.019
izzy+0.019
ion+0.018
 log+0.017
Neg logit contributions
 faces-0.016
 Sorry-0.016
Thank-0.015
ulating-0.015
 sav-0.015
Token.
Token ID13
Feature activation-3.53e+00

Pos logit contributions
ert+0.030
ion+0.028
oris+0.028
izzy+0.027
 stool+0.027
Neg logit contributions
 Sorry-0.009
 faces-0.009
Thank-0.008
aw-0.008
clock-0.008
Token Dragon
Token ID2851
Feature activation+2.56e-01

Pos logit contributions
ion+0.116
ert+0.113
izzy+0.104
oris+0.102
 stool+0.096
Neg logit contributions
 Sorry-0.110
Thank-0.102
clock-0.097
 sav-0.096
 miss-0.093
Token,
Token ID11
Feature activation+8.11e-01

Pos logit contributions
 faces+0.006
Thank+0.006
ulating+0.006
 Sorry+0.006
 sav+0.006
Neg logit contributions
ert-0.010
oris-0.009
izzy-0.009
ion-0.009
 log-0.008
Token are
Token ID389
Feature activation-5.23e-01

Pos logit contributions
ulating+0.023
 faces+0.023
Thank+0.023
 Sorry+0.022
 sav+0.022
Neg logit contributions
ert-0.028
oris-0.025
ion-0.025
izzy-0.024
 log-0.024
Token you
Token ID345
Feature activation-1.26e+00

Pos logit contributions
ert+0.019
ion+0.017
oris+0.017
 log+0.016
izzy+0.016
Neg logit contributions
ulating-0.014
 faces-0.014
Thank-0.014
thank-0.013
 Sorry-0.013
Token friendly
Token ID8030
Feature activation+2.19e-01

Pos logit contributions
ert+0.047
oris+0.043
izzy+0.041
ion+0.041
 log+0.041
Neg logit contributions
 faces-0.032
 Sorry-0.031
Thank-0.031
ulating-0.031
 sav-0.030
Token and
Token ID290
Feature activation+1.12e+00

Pos logit contributions
 Sorry+0.017
ulating+0.017
 faces+0.017
Thank+0.016
clock+0.015
Neg logit contributions
ert-0.031
oris-0.028
izzy-0.027
 log-0.026
ion-0.026
Token Kitty
Token ID21168
Feature activation+8.03e-01

Pos logit contributions
 faces+0.041
 Sorry+0.041
ulating+0.041
Thank+0.041
 sav+0.040
Neg logit contributions
ert-0.029
oris-0.026
izzy-0.024
ion-0.024
 log-0.023
Token both
Token ID1111
Feature activation+3.36e-01

Pos logit contributions
 Sorry+0.018
ulating+0.018
clock+0.017
 habitat+0.017
 endings+0.017
Neg logit contributions
ert-0.031
izzy-0.029
oris-0.028
 log-0.026
ig-0.026
Token looked
Token ID3114
Feature activation+5.15e-02

Pos logit contributions
ulating+0.007
 Sorry+0.007
 faces+0.007
Thank+0.006
clock+0.006
Neg logit contributions
ert-0.014
izzy-0.013
oris-0.013
ion-0.012
 log-0.012
Token pretty
Token ID2495
Feature activation+5.84e-02

Pos logit contributions
 Sorry+0.001
clock+0.001
 habitat+0.001
Thank+0.001
 endings+0.001
Neg logit contributions
ert-0.002
izzy-0.001
 sheet-0.001
oris-0.001
 log-0.001
Token with
Token ID351
Feature activation+1.89e+00

Pos logit contributions
 Sorry+0.002
Thank+0.002
 sav+0.002
 Leave+0.002
ulating+0.002
Neg logit contributions
ert-0.002
 sheet-0.001
 stool-0.001
 spider-0.001
ig-0.001
Token their
Token ID511
Feature activation+1.44e+00

Pos logit contributions
clock+0.055
ulating+0.052
 Sorry+0.051
thank+0.050
Please+0.050
Neg logit contributions
ert-0.052
ion-0.052
igsaw-0.052
 sheet-0.052
oris-0.050
Token makeup
Token ID16029
Feature activation+7.45e-02

Pos logit contributions
aw+0.045
ulating+0.042
 sav+0.042
 Sorry+0.042
Thank+0.041
Neg logit contributions
ert-0.042
izzy-0.039
oris-0.038
igsaw-0.038
ail-0.037
Token.
Token ID13
Feature activation+6.79e-01

Pos logit contributions
 Sorry+0.002
Thank+0.002
 sudden+0.002
ulating+0.002
 sav+0.002
Neg logit contributions
ert-0.002
 stool-0.002
 sheet-0.002
oris-0.002
izzy-0.002
Token They
Token ID1119
Feature activation+5.62e-01

Pos logit contributions
 faces+0.013
ulating+0.013
 Sorry+0.013
Thank+0.013
 sav+0.012
Neg logit contributions
ert-0.029
oris-0.027
ion-0.027
izzy-0.026
 log-0.026
Token played
Token ID2826
Feature activation+1.20e+00

Pos logit contributions
ulating+0.011
 Sorry+0.011
clock+0.010
 faces+0.010
Thank+0.010
Neg logit contributions
ert-0.024
izzy-0.022
oris-0.022
 log-0.021
ion-0.021
Token,
Token ID11
Feature activation-2.66e-01

Pos logit contributions
ert+0.020
oris+0.020
ion+0.019
izzy+0.019
 log+0.018
Neg logit contributions
 Sorry-0.003
 were-0.003
 so-0.002
 with-0.002
 faces-0.002
Token there
Token ID612
Feature activation+8.95e-01

Pos logit contributions
ert+0.013
oris+0.012
ion+0.011
izzy+0.011
 log+0.011
Neg logit contributions
Thank-0.004
 Sorry-0.004
 sudden-0.004
 faces-0.003
 so-0.003
Token was
Token ID373
Feature activation-3.86e-01

Pos logit contributions
 faces+0.020
 Sorry+0.019
ulating+0.019
Thank+0.019
 sav+0.018
Neg logit contributions
ert-0.037
oris-0.034
izzy-0.033
ion-0.032
 log-0.031
Token an
Token ID281
Feature activation-1.01e-01

Pos logit contributions
ert+0.017
oris+0.015
izzy+0.014
ion+0.014
 log+0.014
Neg logit contributions
Thank-0.008
 faces-0.008
 sav-0.008
ulating-0.008
 Sorry-0.008
Token elderly
Token ID13830
Feature activation-7.92e-01

Pos logit contributions
ert+0.004
oris+0.004
izzy+0.004
ion+0.004
 log+0.004
Neg logit contributions
 faces-0.002
ulating-0.002
Thank-0.002
 Sorry-0.002
 sav-0.002
Token couple
Token ID3155
Feature activation+1.90e+00

Pos logit contributions
ert+0.034
oris+0.030
izzy+0.029
ion+0.028
ions+0.027
Neg logit contributions
Thank-0.018
 sav-0.017
 pige-0.017
 people-0.017
aw-0.016
Token who
Token ID508
Feature activation+9.71e-01

Pos logit contributions
 Sorry+0.049
 faces+0.048
ulating+0.048
Thank+0.046
 sav+0.044
Neg logit contributions
ert-0.070
ion-0.063
oris-0.063
izzy-0.062
 log-0.061
Token owned
Token ID6898
Feature activation-4.77e-01

Pos logit contributions
 faces+0.023
ulating+0.022
 Sorry+0.021
 sudden+0.021
clock+0.020
Neg logit contributions
ert-0.037
oris-0.035
izzy-0.035
ion-0.033
 log-0.032
Token a
Token ID257
Feature activation-5.80e-01

Pos logit contributions
ert+0.017
oris+0.015
ion+0.015
izzy+0.015
 log+0.014
Neg logit contributions
 Sorry-0.013
 faces-0.013
ulating-0.013
Thank-0.012
 sav-0.012
Token big
Token ID1263
Feature activation+3.84e-02

Pos logit contributions
ert+0.022
oris+0.020
 log+0.020
 stool+0.020
izzy+0.018
Neg logit contributions
ulating-0.014
 faces-0.014
Thank-0.014
 Sorry-0.013
clock-0.013
Token farm
Token ID5318
Feature activation+1.73e+00

Pos logit contributions
ulating+0.001
Thank+0.001
 Sorry+0.001
 faces+0.001
clock+0.001
Neg logit contributions
ert-0.001
 log-0.001
 stool-0.001
oris-0.001
 sheet-0.001
Token She
Token ID1375
Feature activation+3.03e-01

Pos logit contributions
Thank+0.017
 Sorry+0.017
 faces+0.017
 sav+0.016
ulating+0.016
Neg logit contributions
ert-0.047
oris-0.044
izzy-0.042
ion-0.042
 log-0.041
Token liked
Token ID8288
Feature activation+2.54e-01

Pos logit contributions
 faces+0.006
 Sorry+0.006
Thank+0.006
ulating+0.006
 sav+0.006
Neg logit contributions
ert-0.013
oris-0.012
izzy-0.012
ion-0.012
 log-0.011
Token to
Token ID284
Feature activation+7.48e-01

Pos logit contributions
ulating+0.008
 Sorry+0.008
Thank+0.008
 faces+0.007
aw+0.007
Neg logit contributions
ert-0.008
ion-0.007
oris-0.007
izzy-0.007
 log-0.007
Token run
Token ID1057
Feature activation+1.42e+00

Pos logit contributions
ulating+0.019
 Sorry+0.019
 sudden+0.018
Thank+0.018
 faces+0.018
Neg logit contributions
ert-0.027
oris-0.024
 log-0.024
ion-0.024
 stool-0.024
Token around
Token ID1088
Feature activation-4.13e-01

Pos logit contributions
 Sorry+0.036
Thank+0.034
 faces+0.034
ulating+0.033
 sav+0.032
Neg logit contributions
ert-0.054
oris-0.051
ion-0.049
izzy-0.048
 log-0.047
Token in
Token ID287
Feature activation+1.61e+00

Pos logit contributions
ert+0.016
oris+0.015
ion+0.015
izzy+0.014
 log+0.014
Neg logit contributions
 Sorry-0.010
ulating-0.009
 faces-0.009
Thank-0.009
clock-0.009
Token the
Token ID262
Feature activation+6.46e-01

Pos logit contributions
 Sorry+0.054
ulating+0.054
Thank+0.051
thank+0.049
 faces+0.048
Neg logit contributions
ert-0.049
ion-0.044
 stool-0.043
 log-0.042
oris-0.042
Token bright
Token ID6016
Feature activation+2.36e-02

Pos logit contributions
ulating+0.021
 Sorry+0.021
Thank+0.019
thank+0.019
 faces+0.019
Neg logit contributions
 log-0.018
ert-0.018
 sheet-0.017
 stool-0.017
 spider-0.016
Token sunlight
Token ID19606
Feature activation-5.95e-01

Pos logit contributions
 Sorry+0.001
Thank+0.001
ulating+0.001
 faces+0.001
thank+0.000
Neg logit contributions
ert-0.001
oris-0.001
ion-0.001
izzy-0.001
 log-0.001
Token,
Token ID11
Feature activation-1.72e-01

Pos logit contributions
ert+0.020
ion+0.018
 stool+0.018
izzy+0.017
oris+0.017
Neg logit contributions
 Sorry-0.019
Thank-0.017
 sudden-0.017
ulating-0.017
 faces-0.017
Token looking
Token ID2045
Feature activation+6.57e-01

Pos logit contributions
 stool+0.005
ert+0.005
 log+0.005
ion+0.005
 mat+0.005
Neg logit contributions
 Sorry-0.006
ulating-0.005
 endings-0.005
umm-0.004
clock-0.004
Token a
Token ID257
Feature activation-1.20e-01

Pos logit contributions
ulating+0.023
 Sorry+0.023
Thank+0.022
thank+0.021
 faces+0.021
Neg logit contributions
ert-0.034
 log-0.030
oris-0.030
ion-0.029
 stool-0.029
Token good
Token ID922
Feature activation-9.88e-01

Pos logit contributions
ert+0.004
oris+0.004
 log+0.004
 stool+0.004
izzy+0.004
Neg logit contributions
ulating-0.003
 faces-0.003
 Sorry-0.003
Thank-0.003
aw-0.003
Token boy
Token ID2933
Feature activation+1.92e-01

Pos logit contributions
ert+0.037
oris+0.034
ion+0.033
izzy+0.033
ions+0.031
Neg logit contributions
 faces-0.025
 Sorry-0.024
Thank-0.024
ulating-0.024
 sav-0.024
Token,
Token ID11
Feature activation+3.77e-01

Pos logit contributions
 faces+0.005
ulating+0.005
 Sorry+0.005
Thank+0.005
 sudden+0.005
Neg logit contributions
ert-0.007
ion-0.006
izzy-0.006
oris-0.006
 log-0.006
Token really
Token ID1107
Feature activation-8.70e-01

Pos logit contributions
ulating+0.010
 Sorry+0.010
 faces+0.010
Thank+0.010
 sudden+0.009
Neg logit contributions
ert-0.013
 log-0.012
ion-0.012
oris-0.012
izzy-0.011
Token."
Token ID526
Feature activation+1.75e+00

Pos logit contributions
ert+0.032
oris+0.029
 log+0.029
izzy+0.028
ion+0.028
Neg logit contributions
 faces-0.020
Thank-0.020
ulating-0.019
 Sorry-0.019
umm-0.019
Token she
Token ID673
Feature activation-1.13e+00

Pos logit contributions
ulating+0.060
 habitat+0.056
 faces+0.055
 Birthday+0.052
aw+0.052
Neg logit contributions
ert-0.045
izzy-0.043
oris-0.042
 stool-0.041
ion-0.041
Token said
Token ID531
Feature activation-9.26e-01

Pos logit contributions
ert+0.042
oris+0.038
izzy+0.038
ion+0.036
 log+0.036
Neg logit contributions
 Sorry-0.028
ulating-0.028
umm-0.027
Thank-0.026
thank-0.026
Token to
Token ID284
Feature activation-1.12e+00

Pos logit contributions
ert+0.036
oris+0.033
ion+0.032
izzy+0.032
 stool+0.031
Neg logit contributions
 Sorry-0.022
Thank-0.022
ulating-0.021
 sav-0.021
 faces-0.020
Token the
Token ID262
Feature activation-2.99e-01

Pos logit contributions
ert+0.045
oris+0.042
izzy+0.040
ion+0.040
 log+0.038
Neg logit contributions
 faces-0.025
 Sorry-0.025
ulating-0.025
Thank-0.024
 sav-0.023
Token man
Token ID582
Feature activation-1.24e+00

Pos logit contributions
ert+0.010
oris+0.010
izzy+0.009
 log+0.009
 stool+0.009
Neg logit contributions
ulating-0.008
 Sorry-0.008
Thank-0.008
 sav-0.008
 faces-0.008
Token now
Token ID783
Feature activation-2.45e-01

Pos logit contributions
ert+0.035
oris+0.032
 log+0.032
izzy+0.030
 opener+0.030
Neg logit contributions
 Sorry-0.022
ulating-0.022
aw-0.022
clock-0.021
 faces-0.021
Token eat
Token ID4483
Feature activation+3.47e-01

Pos logit contributions
ert+0.010
oris+0.009
ion+0.009
 log+0.009
izzy+0.008
Neg logit contributions
 Sorry-0.006
ulating-0.006
 faces-0.006
Thank-0.006
clock-0.005
Token the
Token ID262
Feature activation+2.65e-01

Pos logit contributions
 Sorry+0.011
ulating+0.011
Thank+0.010
clock+0.010
 faces+0.010
Neg logit contributions
ert-0.011
ion-0.010
 log-0.010
ions-0.009
 stool-0.009
Token mineral
Token ID18352
Feature activation-1.62e+00

Pos logit contributions
 Sorry+0.011
ulating+0.010
 faces+0.010
Thank+0.010
aw+0.010
Neg logit contributions
ert-0.006
oris-0.005
 log-0.005
izzy-0.005
ion-0.005
Token.
Token ID13
Feature activation+7.26e-01

Pos logit contributions
ert+0.066
oris+0.060
ion+0.058
izzy+0.058
 log+0.057
Neg logit contributions
 Sorry-0.036
 faces-0.036
Thank-0.035
ulating-0.035
 sav-0.033
Token Y
Token ID575
Feature activation+1.61e+00

Pos logit contributions
 faces+0.017
 Sorry+0.016
ulating+0.016
Thank+0.016
 sav+0.015
Neg logit contributions
ert-0.029
oris-0.027
ion-0.026
izzy-0.026
 log-0.024
Tokenum
Token ID388
Feature activation+1.23e+00

Pos logit contributions
 faces+0.039
 Sorry+0.038
Thank+0.038
ulating+0.038
 sav+0.036
Neg logit contributions
ert-0.063
oris-0.058
izzy-0.055
ion-0.055
 log-0.053
Token!
Token ID0
Feature activation+1.10e+00

Pos logit contributions
 faces+0.028
 Sorry+0.028
Thank+0.028
ulating+0.027
 sav+0.026
Neg logit contributions
ert-0.049
oris-0.045
izzy-0.044
ion-0.043
 log-0.042
Token\n
Token ID198
Feature activation+1.42e+00

Pos logit contributions
 faces+0.024
 Sorry+0.023
ulating+0.023
Thank+0.023
 sav+0.022
Neg logit contributions
ert-0.046
oris-0.043
ion-0.041
izzy-0.041
 log-0.039
Token\n
Token ID198
Feature activation+7.43e-01

Pos logit contributions
Thank+0.030
 sav+0.028
 faces+0.028
aw+0.027
ulating+0.027
Neg logit contributions
ert-0.059
oris-0.056
izzy-0.052
 stool-0.051
ion-0.051
TokenThe
Token ID464
Feature activation+8.76e-01

Pos logit contributions
Thank+0.015
ulating+0.014
 sav+0.013
 Sorry+0.013
 faces+0.013
Neg logit contributions
ert-0.033
oris-0.031
izzy-0.029
ion-0.029
 log-0.028
Token us
Token ID514
Feature activation+9.38e-01

Pos logit contributions
ulating+0.007
Thank+0.007
 Sorry+0.007
thank+0.006
 sudden+0.006
Neg logit contributions
ert-0.005
ion-0.005
ions-0.004
oris-0.004
 log-0.004
Token to
Token ID284
Feature activation-3.27e-01

Pos logit contributions
Thank+0.027
 Sorry+0.027
ulating+0.026
 faces+0.025
 sav+0.025
Neg logit contributions
ert-0.034
oris-0.029
ion-0.029
 log-0.028
izzy-0.028
Token Grand
Token ID5675
Feature activation+1.34e+00

Pos logit contributions
ert+0.009
 log+0.008
ion+0.008
 sheet+0.007
 stool+0.007
Neg logit contributions
ulating-0.012
 Sorry-0.011
Thank-0.011
umm-0.011
 sudden-0.011
Tokenma
Token ID2611
Feature activation+4.81e-01

Pos logit contributions
ulating+0.056
Thank+0.056
 Sorry+0.055
 faces+0.055
 sav+0.053
Neg logit contributions
ert-0.029
oris-0.023
ion-0.023
izzy-0.022
 log-0.020
Token and
Token ID290
Feature activation-2.25e-01

Pos logit contributions
Thank+0.014
 Sorry+0.014
ulating+0.013
 faces+0.013
 sudden+0.013
Neg logit contributions
ert-0.016
ion-0.015
izzy-0.015
 stool-0.014
oris-0.014
Token Grand
Token ID5675
Feature activation+1.58e+00

Pos logit contributions
ert+0.008
oris+0.007
ion+0.007
 log+0.007
 stool+0.007
Neg logit contributions
ulating-0.006
 faces-0.006
Thank-0.006
 sudden-0.006
aw-0.006
Tokenpa
Token ID8957
Feature activation+5.87e-01

Pos logit contributions
ulating+0.059
Thank+0.058
 Sorry+0.058
 faces+0.057
 sudden+0.056
Neg logit contributions
ert-0.040
ion-0.034
oris-0.034
izzy-0.033
ions-0.030
Token's
Token ID338
Feature activation+1.41e-01

Pos logit contributions
 Sorry+0.015
Thank+0.015
ulating+0.015
 faces+0.015
 sudden+0.014
Neg logit contributions
ert-0.021
ion-0.019
oris-0.019
izzy-0.019
 stool-0.018
Token house
Token ID2156
Feature activation-5.92e-02

Pos logit contributions
ulating+0.005
Thank+0.005
 Sorry+0.005
 faces+0.005
 sav+0.004
Neg logit contributions
ert-0.004
 log-0.003
 stool-0.003
oris-0.003
 sheet-0.003
Token.
Token ID13
Feature activation-1.50e-01

Pos logit contributions
ert+0.002
ion+0.002
 stool+0.002
izzy+0.002
oris+0.002
Neg logit contributions
 Sorry-0.002
Thank-0.002
 sudden-0.002
ulating-0.002
 faces-0.002
Token Are
Token ID4231
Feature activation-8.44e-01

Pos logit contributions
ert+0.005
oris+0.005
izzy+0.005
ion+0.005
 stool+0.005
Neg logit contributions
 faces-0.004
ulating-0.004
 sudden-0.004
thank-0.004
Thank-0.003
Token a
Token ID257
Feature activation-1.13e-01

Pos logit contributions
 faces+0.005
Thank+0.005
 Sorry+0.005
ulating+0.005
 sav+0.004
Neg logit contributions
ert-0.009
oris-0.008
ion-0.008
izzy-0.008
 log-0.007
Token time
Token ID640
Feature activation-7.56e-01

Pos logit contributions
ert+0.006
oris+0.005
ion+0.005
izzy+0.005
ions+0.005
Neg logit contributions
 faces-0.002
ulating-0.002
 sav-0.002
Thank-0.002
 Sorry-0.001
Token,
Token ID11
Feature activation-3.73e-01

Pos logit contributions
ert+0.038
oris+0.036
ion+0.036
izzy+0.034
 log+0.034
Neg logit contributions
 Sorry-0.009
Thank-0.008
ulating-0.007
 faces-0.007
 so-0.007
Token there
Token ID612
Feature activation+4.36e-01

Pos logit contributions
ert+0.018
oris+0.017
ion+0.016
izzy+0.015
 log+0.015
Neg logit contributions
Thank-0.006
 Sorry-0.005
 sudden-0.005
ulating-0.005
 faces-0.005
Token was
Token ID373
Feature activation-4.74e-01

Pos logit contributions
 faces+0.009
 Sorry+0.009
Thank+0.009
ulating+0.009
 sav+0.009
Neg logit contributions
ert-0.018
oris-0.017
izzy-0.016
ion-0.016
 log-0.015
Token a
Token ID257
Feature activation-2.08e-01

Pos logit contributions
ert+0.020
oris+0.019
ion+0.017
izzy+0.017
 stool+0.017
Neg logit contributions
Thank-0.010
 faces-0.010
ulating-0.010
 sav-0.010
 Sorry-0.010
Token girl
Token ID2576
Feature activation+1.05e-01

Pos logit contributions
ert+0.010
ion+0.009
oris+0.009
izzy+0.009
ions+0.009
Neg logit contributions
Thank-0.003
 Sorry-0.003
 faces-0.003
 sav-0.003
ulating-0.003
Token named
Token ID3706
Feature activation-8.34e-01

Pos logit contributions
 faces+0.003
 Sorry+0.003
ulating+0.003
Thank+0.002
clock+0.002
Neg logit contributions
ert-0.004
izzy-0.004
oris-0.004
ion-0.004
 log-0.003
Token Lucy
Token ID22162
Feature activation+1.07e-01

Pos logit contributions
ert+0.031
oris+0.028
ion+0.027
izzy+0.027
 log+0.026
Neg logit contributions
 faces-0.022
 Sorry-0.022
Thank-0.021
ulating-0.021
 sav-0.021
Token.
Token ID13
Feature activation-1.69e-01

Pos logit contributions
 faces+0.002
Thank+0.002
 Sorry+0.002
ulating+0.002
 sav+0.002
Neg logit contributions
ert-0.005
oris-0.004
ion-0.004
izzy-0.004
 log-0.004
Token She
Token ID1375
Feature activation+8.94e-01

Pos logit contributions
ert+0.007
oris+0.006
izzy+0.006
ion+0.006
 log+0.006
Neg logit contributions
 faces-0.004
ulating-0.004
 Sorry-0.004
Thank-0.004
 sav-0.004
Token inside
Token ID2641
Feature activation-1.27e+00

Pos logit contributions
ert+0.040
oris+0.037
izzy+0.035
ion+0.035
ccoli+0.033
Neg logit contributions
 faces-0.021
Thank-0.020
ulating-0.020
 Sorry-0.019
 sav-0.019
Token with
Token ID351
Feature activation+1.55e-01

Pos logit contributions
ert+0.054
oris+0.050
izzy+0.048
ion+0.048
 log+0.046
Neg logit contributions
 faces-0.026
Thank-0.025
 Sorry-0.025
ulating-0.025
 sav-0.024
Token Mom
Token ID11254
Feature activation-2.48e-02

Pos logit contributions
ulating+0.004
 Sorry+0.004
Thank+0.004
 faces+0.004
 sav+0.004
Neg logit contributions
ert-0.006
oris-0.005
ion-0.005
izzy-0.005
 log-0.005
Token and
Token ID290
Feature activation-2.36e-01

Pos logit contributions
ert+0.001
oris+0.001
ion+0.001
izzy+0.001
 log+0.001
Neg logit contributions
 Sorry-0.001
ulating-0.001
Thank-0.001
 faces-0.001
 sav-0.001
Token enjoyed
Token ID8359
Feature activation+4.32e-03

Pos logit contributions
ert+0.008
oris+0.007
izzy+0.007
ion+0.007
 stool+0.007
Neg logit contributions
ulating-0.007
 faces-0.006
 Sorry-0.006
Thank-0.006
aw-0.006
Token their
Token ID511
Feature activation+6.80e-01

Pos logit contributions
ulating+0.000
 Sorry+0.000
Thank+0.000
clock+0.000
 faces+0.000
Neg logit contributions
ert-0.000
 log-0.000
ion-0.000
 stool-0.000
ions-0.000
Token snack
Token ID26906
Feature activation-1.18e-02

Pos logit contributions
 faces+0.015
Thank+0.013
 Sorry+0.013
 sudden+0.013
 sav+0.013
Neg logit contributions
ert-0.029
oris-0.028
izzy-0.026
ion-0.026
ela-0.025
Token and
Token ID290
Feature activation+5.20e-01

Pos logit contributions
ert+0.000
ion+0.000
 stool+0.000
izzy+0.000
ions+0.000
Neg logit contributions
 Sorry-0.000
ulating-0.000
Thank-0.000
 sudden-0.000
 faces-0.000
Token their
Token ID511
Feature activation+9.83e-01

Pos logit contributions
ulating+0.016
 Sorry+0.014
aw+0.014
Thank+0.013
clock+0.013
Neg logit contributions
ert-0.017
 log-0.016
ion-0.015
 stool-0.015
 sheet-0.015
Token book
Token ID1492
Feature activation+1.68e-01

Pos logit contributions
ulating+0.025
 Sorry+0.025
Thank+0.025
 faces+0.024
 sav+0.023
Neg logit contributions
ert-0.036
oris-0.033
ion-0.032
izzy-0.031
 log-0.031
Token.
Token ID13
Feature activation-7.61e-01

Pos logit contributions
 Sorry+0.005
Thank+0.005
 sudden+0.005
ulating+0.005
 sav+0.004
Neg logit contributions
ert-0.006
ion-0.005
 stool-0.005
izzy-0.005
zag-0.005
Token playing
Token ID2712
Feature activation-2.61e-01

Pos logit contributions
ert+0.017
izzy+0.016
oris+0.015
 stool+0.015
 log+0.015
Neg logit contributions
 Sorry-0.014
 faces-0.014
Thank-0.014
ulating-0.013
aw-0.013
Token in
Token ID287
Feature activation-2.29e-01

Pos logit contributions
ert+0.013
oris+0.013
izzy+0.012
ion+0.012
 log+0.011
Neg logit contributions
Thank-0.003
 faces-0.003
ulating-0.003
 Sorry-0.003
 sav-0.003
Token his
Token ID465
Feature activation-6.26e-01

Pos logit contributions
ert+0.011
oris+0.010
izzy+0.010
ion+0.010
ccoli+0.009
Neg logit contributions
 sav-0.004
Thank-0.004
 faces-0.004
 people-0.004
 sound-0.004
Token room
Token ID2119
Feature activation+4.34e-01

Pos logit contributions
ert+0.026
ion+0.025
oris+0.024
izzy+0.023
ions+0.022
Neg logit contributions
 faces-0.014
 sav-0.014
 Sorry-0.013
Thank-0.013
ulating-0.013
Token,
Token ID11
Feature activation-4.69e-01

Pos logit contributions
 Sorry+0.010
 faces+0.009
Thank+0.009
ulating+0.008
 sav+0.008
Neg logit contributions
ert-0.017
 stool-0.016
 log-0.016
oris-0.016
ion-0.015
Token when
Token ID618
Feature activation+4.42e-01

Pos logit contributions
ert+0.019
 log+0.017
oris+0.017
 stool+0.016
izzy+0.016
Neg logit contributions
 Sorry-0.011
 faces-0.010
ulating-0.010
clock-0.010
Thank-0.010
Token all
Token ID477
Feature activation-8.28e-01

Pos logit contributions
 faces+0.012
 Sorry+0.012
Thank+0.012
ulating+0.012
 sav+0.012
Neg logit contributions
ert-0.015
oris-0.014
ion-0.014
izzy-0.014
 log-0.013
Token of
Token ID286
Feature activation-3.01e-01

Pos logit contributions
ert+0.034
oris+0.032
ion+0.030
izzy+0.029
ccoli+0.029
Neg logit contributions
 faces-0.018
 Sorry-0.018
Thank-0.017
 sav-0.017
ulating-0.017
Token a
Token ID257
Feature activation+5.61e-01

Pos logit contributions
ert+0.012
oris+0.011
izzy+0.010
ion+0.010
 stool+0.010
Neg logit contributions
 faces-0.007
Thank-0.007
 sav-0.007
 Sorry-0.007
ulating-0.007
Token sudden
Token ID4802
Feature activation+4.09e-01

Pos logit contributions
Thank+0.008
ulating+0.008
 Sorry+0.007
thank+0.007
 faces+0.007
Neg logit contributions
ert-0.027
oris-0.025
 log-0.025
izzy-0.025
 stool-0.025
Token he
Token ID339
Feature activation+5.55e-01

Pos logit contributions
 faces+0.007
 Sorry+0.007
Thank+0.006
 sudden+0.006
 sav+0.006
Neg logit contributions
ert-0.019
oris-0.018
ion-0.017
izzy-0.017
 stool-0.016
Token her
Token ID607
Feature activation+1.12e+00

Pos logit contributions
ulating+0.000
Thank+0.000
aw+0.000
 Sorry+0.000
 sudden+0.000
Neg logit contributions
ert-0.000
ion-0.000
oris-0.000
izzy-0.000
 stool-0.000
Token Mom
Token ID11254
Feature activation+1.67e+00

Pos logit contributions
ulating+0.027
Thank+0.026
 faces+0.026
 Sorry+0.026
 sav+0.025
Neg logit contributions
ert-0.044
oris-0.040
izzy-0.039
ion-0.038
 log-0.038
Token said
Token ID531
Feature activation+7.46e-01

Pos logit contributions
 faces+0.029
 Sorry+0.029
ulating+0.028
Thank+0.028
 sav+0.027
Neg logit contributions
ert-0.076
oris-0.070
izzy-0.068
ion-0.068
 log-0.066
Token yes
Token ID3763
Feature activation+1.50e+00

Pos logit contributions
 faces+0.020
 Sorry+0.020
Thank+0.020
ulating+0.019
 sav+0.019
Neg logit contributions
ert-0.027
oris-0.025
izzy-0.024
ion-0.023
 log-0.023
Token.
Token ID13
Feature activation+4.40e-01

Pos logit contributions
Thank+0.047
 Sorry+0.045
 sudden+0.043
 faces+0.042
thank+0.041
Neg logit contributions
ert-0.051
oris-0.045
ion-0.042
izzy-0.041
 log-0.041
Token\n
Token ID198
Feature activation+6.49e-01

Pos logit contributions
 faces+0.011
ulating+0.011
 sav+0.011
 Sorry+0.011
 sudden+0.011
Neg logit contributions
ert-0.015
ion-0.014
oris-0.013
izzy-0.013
 stool-0.013
Token\n
Token ID198
Feature activation+1.03e-01

Pos logit contributions
 faces+0.015
 Sorry+0.014
ulating+0.014
Thank+0.014
 sav+0.013
Neg logit contributions
ert-0.026
oris-0.024
ion-0.024
izzy-0.024
 log-0.022
TokenS
Token ID50
Feature activation+1.15e-01

Pos logit contributions
 faces+0.003
 Sorry+0.003
clock+0.003
 endings+0.003
 habitat+0.003
Neg logit contributions
ion-0.003
ert-0.003
oris-0.003
 stool-0.003
izzy-0.003
Tokenue
Token ID518
Feature activation+2.85e-01

Pos logit contributions
Thank+0.003
aw+0.003
 faces+0.003
ulating+0.003
 Sorry+0.003
Neg logit contributions
ert-0.004
oris-0.004
izzy-0.003
 log-0.003
 stool-0.003
Token put
Token ID1234
Feature activation+7.19e-01

Pos logit contributions
 faces+0.005
 sav+0.005
Thank+0.005
 thanked+0.004
ulating+0.004
Neg logit contributions
ert-0.013
oris-0.012
ion-0.012
izzy-0.012
ions-0.011
Token the
Token ID262
Feature activation+2.74e-01

Pos logit contributions
Thank+0.008
 people+0.006
 faces+0.006
 pictures+0.006
 sav+0.006
Neg logit contributions
ert-0.039
 outage-0.036
oris-0.036
ccoli-0.034
igsaw-0.033
Token of
Token ID286
Feature activation-1.11e-01

Pos logit contributions
ert+0.000
ion+0.000
oris+0.000
izzy+0.000
 stool+0.000
Neg logit contributions
 Sorry-0.000
Thank-0.000
ulating-0.000
 faces-0.000
 sav-0.000
Token ways
Token ID2842
Feature activation+1.33e-01

Pos logit contributions
ert+0.004
oris+0.004
ion+0.004
izzy+0.004
 log+0.004
Neg logit contributions
 Sorry-0.003
 faces-0.003
ulating-0.003
Thank-0.003
 sav-0.002
Token to
Token ID284
Feature activation-6.16e-01

Pos logit contributions
 faces+0.003
 Sorry+0.003
Thank+0.003
ulating+0.003
 sav+0.003
Neg logit contributions
ert-0.005
oris-0.005
ion-0.004
izzy-0.004
 log-0.004
Token solve
Token ID8494
Feature activation-8.19e-01

Pos logit contributions
ert+0.022
oris+0.020
izzy+0.019
ion+0.019
 log+0.018
Neg logit contributions
 faces-0.017
 Sorry-0.017
ulating-0.017
Thank-0.017
 sav-0.016
Token the
Token ID262
Feature activation-7.87e-01

Pos logit contributions
ert+0.034
oris+0.032
izzy+0.031
ion+0.030
 log+0.029
Neg logit contributions
 faces-0.017
 sav-0.016
Thank-0.016
 Sorry-0.016
ulating-0.016
Token giant
Token ID6175
Feature activation-9.64e-01

Pos logit contributions
ert+0.021
oris+0.019
izzy+0.018
ion+0.018
 log+0.016
Neg logit contributions
 faces-0.028
 Sorry-0.028
Thank-0.028
ulating-0.028
 sav-0.027
Token's
Token ID338
Feature activation-1.46e+00

Pos logit contributions
ert+0.031
oris+0.028
 log+0.027
ion+0.027
izzy+0.027
Neg logit contributions
Thank-0.029
 Sorry-0.029
ulating-0.028
 faces-0.028
 sav-0.027
Token puzzle
Token ID15027
Feature activation-4.89e-01

Pos logit contributions
ert+0.041
oris+0.037
izzy+0.037
ion+0.037
ions+0.032
Neg logit contributions
 faces-0.051
ulating-0.049
 Sorry-0.048
clock-0.048
Thank-0.048
Token.
Token ID13
Feature activation-3.83e-01

Pos logit contributions
ert+0.020
oris+0.019
ion+0.018
izzy+0.018
 log+0.017
Neg logit contributions
 Sorry-0.011
 faces-0.011
Thank-0.010
ulating-0.010
 sav-0.010
Token Finally
Token ID9461
Feature activation+7.20e-01

Pos logit contributions
ert+0.016
oris+0.015
izzy+0.014
ion+0.014
 log+0.013
Neg logit contributions
 faces-0.008
 Sorry-0.008
Thank-0.008
ulating-0.008
 sav-0.007
Token,
Token ID11
Feature activation-3.44e-01

Pos logit contributions
 faces+0.017
ulating+0.016
 Sorry+0.016
Thank+0.016
clock+0.015
Neg logit contributions
ert-0.028
oris-0.026
ion-0.025
izzy-0.025
 stool-0.024
Token to
Token ID284
Feature activation+2.19e-01

Pos logit contributions
ert+0.003
oris+0.002
ion+0.002
 log+0.002
izzy+0.002
Neg logit contributions
 Sorry-0.003
Thank-0.003
ulating-0.003
 sav-0.003
 faces-0.003
Token success
Token ID1943
Feature activation-2.48e-01

Pos logit contributions
 Sorry+0.006
Thank+0.006
ulating+0.006
aw+0.006
thank+0.006
Neg logit contributions
ert-0.008
oris-0.007
 log-0.007
ion-0.006
 spider-0.006
Token.
Token ID13
Feature activation+3.36e-01

Pos logit contributions
ert+0.010
oris+0.009
ion+0.008
izzy+0.008
 log+0.008
Neg logit contributions
 Sorry-0.007
Thank-0.006
 faces-0.006
ulating-0.006
 sav-0.006
Token<|endoftext|>
Token ID50256
Feature activation+6.61e-03

Pos logit contributions
 Sorry+0.005
Thank+0.005
 sav+0.005
ulating+0.005
 faces+0.005
Neg logit contributions
ert-0.016
oris-0.015
izzy-0.014
ion-0.014
 log-0.014
TokenJack
Token ID14295
Feature activation-5.18e-01

Pos logit contributions
Thank+0.000
 Sorry+0.000
 faces+0.000
 sav+0.000
ulating+0.000
Neg logit contributions
ert-0.000
oris-0.000
ion-0.000
izzy-0.000
 log-0.000
Token was
Token ID373
Feature activation-4.31e-01

Pos logit contributions
ert+0.022
oris+0.020
izzy+0.020
ion+0.019
 log+0.019
Neg logit contributions
 Sorry-0.011
 faces-0.011
ulating-0.011
Thank-0.010
 sudden-0.009
Token very
Token ID845
Feature activation-2.93e-01

Pos logit contributions
ert+0.016
oris+0.014
izzy+0.014
ion+0.014
 log+0.014
Neg logit contributions
 faces-0.011
 Sorry-0.011
ulating-0.010
Thank-0.010
 endings-0.009
Token angry
Token ID7954
Feature activation+1.52e+00

Pos logit contributions
ert+0.012
izzy+0.011
oris+0.010
 log+0.010
 stool+0.010
Neg logit contributions
 faces-0.006
ulating-0.006
aw-0.006
 Sorry-0.006
Thank-0.006
Token today
Token ID1909
Feature activation+7.66e-01

Pos logit contributions
 Sorry+0.036
Thank+0.035
 faces+0.035
ulating+0.034
 sav+0.033
Neg logit contributions
ert-0.060
oris-0.054
ion-0.053
izzy-0.052
 log-0.051
Token.
Token ID13
Feature activation-9.68e-01

Pos logit contributions
ulating+0.022
Thank+0.022
 Sorry+0.022
 faces+0.021
 sudden+0.020
Neg logit contributions
ert-0.025
ion-0.022
 log-0.022
 stool-0.022
oris-0.021
Token He
Token ID679
Feature activation+1.82e-01

Pos logit contributions
ert+0.038
oris+0.035
izzy+0.034
ion+0.033
 log+0.032
Neg logit contributions
 faces-0.023
 Sorry-0.022
Thank-0.022
ulating-0.022
 sav-0.021
Token,
Token ID11
Feature activation+1.83e-01

Pos logit contributions
 faces+0.006
 Sorry+0.006
Thank+0.006
ulating+0.006
 sav+0.006
Neg logit contributions
ert-0.013
oris-0.012
izzy-0.011
ion-0.011
 log-0.011
Token we
Token ID356
Feature activation+5.75e-01

Pos logit contributions
 faces+0.004
 Sorry+0.004
Thank+0.003
 people+0.003
 face+0.003
Neg logit contributions
ert-0.008
oris-0.008
izzy-0.007
ion-0.007
igsaw-0.007
Token are
Token ID389
Feature activation+6.59e-01

Pos logit contributions
 faces+0.013
ulating+0.013
 Sorry+0.013
Thank+0.013
 sav+0.012
Neg logit contributions
ert-0.023
oris-0.021
izzy-0.021
ion-0.020
 log-0.020
Token sorry
Token ID7926
Feature activation-3.84e-01

Pos logit contributions
 faces+0.014
 Sorry+0.014
Thank+0.014
ulating+0.014
 sav+0.013
Neg logit contributions
ert-0.027
oris-0.025
izzy-0.024
ion-0.024
 log-0.024
Token,"
Token ID553
Feature activation+4.64e-01

Pos logit contributions
ert+0.018
oris+0.017
ion+0.016
izzy+0.016
 stool+0.015
Neg logit contributions
 faces-0.007
 Sorry-0.007
ulating-0.006
Thank-0.006
 sound-0.006
Token Anna
Token ID11735
Feature activation-5.95e-01

Pos logit contributions
 faces+0.015
 Sorry+0.015
Thank+0.015
ulating+0.015
 sav+0.015
Neg logit contributions
ert-0.014
oris-0.013
izzy-0.012
ion-0.012
 log-0.011
Token says
Token ID1139
Feature activation-1.09e+00

Pos logit contributions
ert+0.025
oris+0.023
izzy+0.022
ion+0.022
 log+0.022
Neg logit contributions
 faces-0.012
 Sorry-0.012
ulating-0.012
Thank-0.012
 sav-0.012
Token.
Token ID13
Feature activation-1.67e+00

Pos logit contributions
ert+0.049
oris+0.046
izzy+0.044
ion+0.044
 log+0.043
Neg logit contributions
 faces-0.020
 Sorry-0.019
ulating-0.019
Thank-0.019
 sav-0.017
Token "
Token ID366
Feature activation-6.12e-01

Pos logit contributions
ert+0.064
oris+0.059
izzy+0.056
ion+0.056
 log+0.054
Neg logit contributions
 faces-0.041
 Sorry-0.041
ulating-0.040
Thank-0.040
 sav-0.039
TokenWe
Token ID1135
Feature activation+1.89e-02

Pos logit contributions
ert+0.024
oris+0.022
izzy+0.022
 stool+0.022
ion+0.022
Neg logit contributions
ulating-0.012
 faces-0.012
 Sorry-0.012
 sav-0.011
clock-0.011
Token did
Token ID750
Feature activation-7.83e-02

Pos logit contributions
 faces+0.000
ulating+0.000
 Sorry+0.000
Thank+0.000
 sav+0.000
Neg logit contributions
ert-0.001
oris-0.001
izzy-0.001
ion-0.001
 log-0.001
Token,
Token ID11
Feature activation-4.00e-01

Pos logit contributions
Thank+0.010
 faces+0.010
ulating+0.010
 Sorry+0.010
 sav+0.009
Neg logit contributions
ert-0.017
oris-0.015
izzy-0.015
ion-0.014
 log-0.014
Token I
Token ID314
Feature activation-6.02e-01

Pos logit contributions
ert+0.013
oris+0.012
izzy+0.011
ion+0.011
 log+0.011
Neg logit contributions
 faces-0.012
 Sorry-0.012
Thank-0.012
ulating-0.012
 sav-0.012
Token can
Token ID460
Feature activation-4.13e-01

Pos logit contributions
ert+0.022
oris+0.020
ion+0.019
izzy+0.019
 log+0.018
Neg logit contributions
Thank-0.016
 Sorry-0.016
 faces-0.016
ulating-0.016
 sav-0.015
Token."
Token ID526
Feature activation+1.30e-01

Pos logit contributions
ert+0.014
oris+0.013
izzy+0.013
ion+0.012
 opener+0.012
Neg logit contributions
 faces-0.011
ulating-0.011
aw-0.011
 Sorry-0.011
Thank-0.011
Token
Token ID198
Feature activation+4.08e-01

Pos logit contributions
 faces+0.003
 Sorry+0.003
Thank+0.003
ulating+0.003
 sav+0.002
Neg logit contributions
ert-0.005
oris-0.005
izzy-0.005
ion-0.005
 log-0.005
Token
Token ID198
Feature activation-4.95e-01

Pos logit contributions
Thank+0.004
 sav+0.004
 faces+0.004
 sudden+0.004
 thanked+0.004
Neg logit contributions
ert-0.021
oris-0.020
 stool-0.019
ion-0.019
 log-0.019
TokenShe
Token ID3347
Feature activation-1.06e-01

Pos logit contributions
ert+0.019
oris+0.017
ion+0.016
izzy+0.016
 log+0.015
Neg logit contributions
Thank-0.013
ulating-0.013
 faces-0.013
 sav-0.013
 sudden-0.012
Token went
Token ID1816
Feature activation-8.80e-01

Pos logit contributions
ert+0.005
oris+0.005
ion+0.005
izzy+0.005
igsaw+0.005
Neg logit contributions
 thanked-0.001
 sav-0.001
Thank-0.001
 faces-0.001
 went-0.001
Token to
Token ID284
Feature activation-4.11e-01

Pos logit contributions
ert+0.046
oris+0.041
izzy+0.039
ccoli+0.038
ig+0.038
Neg logit contributions
 with-0.012
Thank-0.011
 sav-0.011
 so-0.011
thank-0.011
Token the
Token ID262
Feature activation-8.62e-01

Pos logit contributions
ert+0.019
oris+0.018
ion+0.018
izzy+0.017
ig+0.017
Neg logit contributions
 faces-0.007
 sav-0.007
Thank-0.006
 Sorry-0.006
 people-0.006
Token kitchen
Token ID9592
Feature activation+8.11e-02

Pos logit contributions
ert+0.025
 log+0.025
 stool+0.025
oris+0.023
 sheet+0.022
Neg logit contributions
ulating-0.026
 Sorry-0.026
 faces-0.025
Thank-0.025
aw-0.024
Token.
Token ID13
Feature activation-1.46e+00

Pos logit contributions
ert+0.008
ion+0.007
oris+0.007
izzy+0.007
 log+0.006
Neg logit contributions
 Sorry-0.005
ulating-0.005
Thank-0.005
 faces-0.005
 sudden-0.005
Token They
Token ID1119
Feature activation+6.57e-02

Pos logit contributions
ert+0.065
oris+0.060
izzy+0.057
ion+0.057
 log+0.055
Neg logit contributions
 faces-0.028
 Sorry-0.028
Thank-0.027
ulating-0.027
 sav-0.026
Token also
Token ID635
Feature activation-2.11e-01

Pos logit contributions
ulating+0.001
 Sorry+0.001
 faces+0.001
clock+0.001
Thank+0.001
Neg logit contributions
ert-0.003
izzy-0.002
oris-0.002
 log-0.002
ion-0.002
Token liked
Token ID8288
Feature activation-9.37e-01

Pos logit contributions
izzy+0.006
ert+0.006
inda+0.006
oris+0.006
 stool+0.005
Neg logit contributions
umm-0.006
 Sorry-0.006
 sudden-0.006
iring-0.006
 faces-0.006
Token to
Token ID284
Feature activation-3.35e-02

Pos logit contributions
ert+0.034
oris+0.031
ion+0.030
izzy+0.030
 log+0.029
Neg logit contributions
 faces-0.024
 Sorry-0.024
Thank-0.024
ulating-0.024
 sav-0.023
Token look
Token ID804
Feature activation-7.60e-01

Pos logit contributions
ert+0.001
oris+0.001
izzy+0.001
 log+0.001
ion+0.001
Neg logit contributions
ulating-0.001
aw-0.001
Thank-0.001
 Sorry-0.001
 faces-0.001
Token at
Token ID379
Feature activation-2.41e-01

Pos logit contributions
ert+0.032
oris+0.030
izzy+0.029
ion+0.029
 log+0.028
Neg logit contributions
 Sorry-0.015
Thank-0.015
 sav-0.014
 faces-0.014
ulating-0.013
Token the
Token ID262
Feature activation-8.70e-02

Pos logit contributions
ert+0.010
oris+0.009
ion+0.009
izzy+0.008
 log+0.008
Neg logit contributions
 Sorry-0.006
ulating-0.006
Thank-0.006
 faces-0.006
 sav-0.005
Token animals
Token ID4695
Feature activation-6.87e-02

Pos logit contributions
ert+0.003
 log+0.003
oris+0.003
ion+0.003
 stool+0.003
Neg logit contributions
ulating-0.003
 Sorry-0.003
Thank-0.002
thank-0.002
 sav-0.002
Token that
Token ID326
Feature activation-4.72e-01

Pos logit contributions
ert+0.003
ion+0.003
oris+0.003
 log+0.002
izzy+0.002
Neg logit contributions
 Sorry-0.002
Thank-0.002
 faces-0.002
ulating-0.002
 sav-0.001
Token lived
Token ID5615
Feature activation-5.55e-01

Pos logit contributions
izzy+0.017
ert+0.017
ion+0.016
oris+0.015
 log+0.015
Neg logit contributions
ulating-0.013
 sudden-0.012
 faces-0.011
 sav-0.011
clock-0.011
Token put
Token ID1234
Feature activation+2.96e-01

Pos logit contributions
ert+0.012
oris+0.011
ion+0.011
izzy+0.011
 log+0.010
Neg logit contributions
Thank-0.005
 faces-0.004
 thanked-0.004
 sav-0.004
 Sorry-0.004
Token the
Token ID262
Feature activation-6.39e-01

Pos logit contributions
 faces+0.007
Thank+0.006
 Sorry+0.006
 sav+0.006
ulating+0.006
Neg logit contributions
ert-0.012
oris-0.011
izzy-0.011
ion-0.010
 log-0.010
Token salt
Token ID8268
Feature activation-6.41e-01

Pos logit contributions
ert+0.024
oris+0.023
izzy+0.021
igsaw+0.020
ion+0.020
Neg logit contributions
 faces-0.018
 sound-0.017
 people-0.017
 sudden-0.016
Thank-0.016
Token on
Token ID319
Feature activation-1.60e+00

Pos logit contributions
ert+0.031
oris+0.029
izzy+0.028
ion+0.027
igsaw+0.026
Neg logit contributions
 faces-0.009
Thank-0.009
ulating-0.008
 sav-0.008
 Sorry-0.008
Token the
Token ID262
Feature activation-3.67e-01

Pos logit contributions
ert+0.060
oris+0.055
ion+0.054
izzy+0.052
 log+0.051
Neg logit contributions
 Sorry-0.040
 faces-0.040
ulating-0.040
Thank-0.039
 sav-0.037
Token broken
Token ID5445
Feature activation-2.55e+00

Pos logit contributions
ert+0.012
oris+0.011
izzy+0.010
ion+0.010
 log+0.009
Neg logit contributions
 faces-0.011
 Sorry-0.011
Thank-0.011
ulating-0.011
 sav-0.011
Token wheel
Token ID7825
Feature activation-1.55e+00

Pos logit contributions
ert+0.087
oris+0.082
izzy+0.080
ion+0.072
igsaw+0.069
Neg logit contributions
 faces-0.076
 sav-0.071
 sound-0.070
aw-0.068
Thank-0.068
Token,
Token ID11
Feature activation-8.84e-01

Pos logit contributions
ert+0.062
oris+0.056
ion+0.055
izzy+0.055
 log+0.052
Neg logit contributions
 Sorry-0.037
 faces-0.036
Thank-0.036
ulating-0.036
 sav-0.034
Token and
Token ID290
Feature activation-2.99e-01

Pos logit contributions
ert+0.030
 stool+0.028
ion+0.028
 log+0.027
oris+0.027
Neg logit contributions
 Sorry-0.026
ulating-0.023
 faces-0.022
aw-0.022
clock-0.021
Token it
Token ID340
Feature activation-4.92e-01

Pos logit contributions
ert+0.010
ion+0.010
 stool+0.009
 log+0.009
izzy+0.009
Neg logit contributions
 Sorry-0.008
ulating-0.008
 faces-0.007
Thank-0.007
 habitat-0.007
Token stuck
Token ID7819
Feature activation+2.68e-01

Pos logit contributions
ert+0.020
oris+0.019
izzy+0.018
ion+0.018
 log+0.017
Neg logit contributions
 Sorry-0.011
 faces-0.011
ulating-0.011
Thank-0.010
 sav-0.010
Token want
Token ID765
Feature activation-9.40e-01

Pos logit contributions
ert+0.007
izzy+0.007
oris+0.006
 opener+0.006
 log+0.006
Neg logit contributions
 Sorry-0.005
 sudden-0.005
aw-0.005
umm-0.005
ulating-0.005
Token to
Token ID284
Feature activation+1.17e-01

Pos logit contributions
ert+0.030
ion+0.028
oris+0.027
izzy+0.026
 log+0.025
Neg logit contributions
ulating-0.029
 Sorry-0.028
Thank-0.028
 faces-0.027
 sav-0.026
Token play
Token ID711
Feature activation-5.69e-01

Pos logit contributions
aw+0.004
 sudden+0.004
umm+0.004
ulating+0.004
Thank+0.004
Neg logit contributions
ert-0.003
izzy-0.002
 opener-0.002
 outage-0.002
zag-0.002
Token with
Token ID351
Feature activation-1.40e+00

Pos logit contributions
ert+0.024
ion+0.023
izzy+0.023
 stool+0.022
oris+0.022
Neg logit contributions
 sav-0.010
 Sorry-0.010
Thank-0.010
ulating-0.010
 sudden-0.010
Token his
Token ID465
Feature activation-1.52e+00

Pos logit contributions
ert+0.056
oris+0.052
izzy+0.049
ion+0.048
 log+0.047
Neg logit contributions
 faces-0.033
 Sorry-0.032
Thank-0.031
ulating-0.031
 sav-0.030
Token toy
Token ID13373
Feature activation-2.89e+00

Pos logit contributions
ert+0.059
oris+0.056
ion+0.053
izzy+0.052
ions+0.049
Neg logit contributions
 faces-0.039
 Sorry-0.036
Thank-0.034
 people-0.034
 sav-0.034
Token car
Token ID1097
Feature activation-7.44e-01

Pos logit contributions
ert+0.110
oris+0.103
ion+0.098
izzy+0.097
 log+0.090
Neg logit contributions
 faces-0.073
 Sorry-0.071
Thank-0.067
ulating-0.066
 sav-0.065
Token alone
Token ID3436
Feature activation-8.87e-01

Pos logit contributions
ert+0.023
ion+0.021
driver+0.021
izzy+0.021
 stool+0.021
Neg logit contributions
 Sorry-0.022
 sudden-0.022
Thank-0.021
ulating-0.021
 sav-0.021
Token.
Token ID13
Feature activation+5.02e-01

Pos logit contributions
ert+0.024
ion+0.021
izzy+0.021
 spider+0.020
ions+0.020
Neg logit contributions
 Sorry-0.032
ulating-0.029
Thank-0.029
 sav-0.029
 faces-0.028
Token He
Token ID679
Feature activation-5.73e-01

Pos logit contributions
 faces+0.015
ulating+0.014
thank+0.014
Thank+0.014
 Sorry+0.014
Neg logit contributions
ert-0.015
izzy-0.015
oris-0.015
ion-0.014
 log-0.013
Token wished
Token ID16555
Feature activation-8.68e-01

Pos logit contributions
ert+0.022
izzy+0.021
oris+0.021
 log+0.020
ion+0.020
Neg logit contributions
 Sorry-0.013
ulating-0.013
 faces-0.012
Thank-0.012
thank-0.011
Token smiled
Token ID13541
Feature activation+2.68e-01

Pos logit contributions
ert+0.005
oris+0.004
ion+0.004
izzy+0.004
 log+0.004
Neg logit contributions
 faces-0.002
 thanked-0.002
Thank-0.002
 sav-0.002
ulating-0.002
Token and
Token ID290
Feature activation-6.05e-01

Pos logit contributions
 Sorry+0.008
ulating+0.007
 sav+0.007
Thank+0.007
 faces+0.007
Neg logit contributions
ert-0.009
oris-0.008
izzy-0.008
 log-0.008
ion-0.008
Token said
Token ID531
Feature activation-9.57e-04

Pos logit contributions
ert+0.029
oris+0.027
izzy+0.026
ion+0.026
 log+0.025
Neg logit contributions
 faces-0.009
 Sorry-0.009
Thank-0.009
ulating-0.009
 sav-0.009
Token,
Token ID11
Feature activation+3.27e-01

Pos logit contributions
ert+0.000
oris+0.000
izzy+0.000
ion+0.000
 log+0.000
Neg logit contributions
 faces-0.000
 Sorry-0.000
Thank-0.000
ulating-0.000
 sav-0.000
Token "
Token ID366
Feature activation-3.55e-01

Pos logit contributions
 faces+0.011
 Sorry+0.011
ulating+0.010
Thank+0.010
 sav+0.010
Neg logit contributions
ert-0.010
oris-0.009
ion-0.008
izzy-0.008
 stool-0.008
TokenThat
Token ID2504
Feature activation-2.46e+00

Pos logit contributions
ert+0.015
oris+0.014
izzy+0.014
ion+0.014
 log+0.013
Neg logit contributions
Thank-0.007
 Sorry-0.007
 faces-0.007
ulating-0.007
 sav-0.007
Token's
Token ID338
Feature activation+8.58e-01

Pos logit contributions
ert+0.120
oris+0.115
izzy+0.112
ion+0.112
ions+0.108
Neg logit contributions
 faces-0.036
 sound-0.032
 sav-0.031
 Sorry-0.029
ulating-0.029
Token a
Token ID257
Feature activation+7.21e-01

Pos logit contributions
Thank+0.014
 Sorry+0.013
ulating+0.013
 faces+0.013
 sav+0.013
Neg logit contributions
ert-0.041
oris-0.038
izzy-0.037
ion-0.037
 log-0.036
Token great
Token ID1049
Feature activation-8.16e-01

Pos logit contributions
 faces+0.010
 Sorry+0.010
Thank+0.010
ulating+0.010
 sav+0.009
Neg logit contributions
ert-0.036
oris-0.034
ion-0.033
izzy-0.032
ions-0.031
Token idea
Token ID2126
Feature activation-5.67e-01

Pos logit contributions
ert+0.037
oris+0.036
ion+0.035
izzy+0.034
ions+0.033
Neg logit contributions
 faces-0.015
 sound-0.014
 Sorry-0.013
ulating-0.013
 sav-0.013
Token,
Token ID11
Feature activation+2.30e-01

Pos logit contributions
ert+0.025
oris+0.024
izzy+0.023
ion+0.022
 log+0.022
Neg logit contributions
 faces-0.011
 Sorry-0.010
ulating-0.010
Thank-0.010
 sav-0.010
Token friends
Token ID2460
Feature activation+4.55e-01

Pos logit contributions
 Sorry+0.011
Thank+0.010
ulating+0.010
thank+0.009
 sav+0.009
Neg logit contributions
ert-0.017
oris-0.015
 log-0.015
izzy-0.015
ion-0.015
Token.
Token ID13
Feature activation-8.63e-01

Pos logit contributions
 faces+0.008
ulating+0.008
Thank+0.008
 Sorry+0.008
 sav+0.007
Neg logit contributions
ert-0.021
oris-0.019
ion-0.019
izzy-0.018
 stool-0.018
Token One
Token ID1881
Feature activation-9.27e-01

Pos logit contributions
ert+0.037
oris+0.035
izzy+0.034
ion+0.034
 log+0.032
Neg logit contributions
 faces-0.016
ulating-0.016
 Sorry-0.016
Thank-0.016
 sav-0.015
Token was
Token ID373
Feature activation-1.58e+00

Pos logit contributions
ert+0.040
oris+0.038
ion+0.036
izzy+0.035
 stool+0.034
Neg logit contributions
 faces-0.018
Thank-0.018
 Sorry-0.018
ulating-0.018
 sav-0.017
Token an
Token ID281
Feature activation-1.39e+00

Pos logit contributions
ert+0.065
oris+0.060
ion+0.058
izzy+0.057
 log+0.055
Neg logit contributions
 faces-0.034
 Sorry-0.034
Thank-0.034
ulating-0.033
 sav-0.032
Token old
Token ID1468
Feature activation-3.28e+00

Pos logit contributions
ert+0.057
oris+0.053
izzy+0.049
ion+0.049
 stool+0.046
Neg logit contributions
 faces-0.031
ulating-0.030
 sav-0.030
Thank-0.029
 Sorry-0.029
Token house
Token ID2156
Feature activation-9.88e-02

Pos logit contributions
ert+0.134
oris+0.126
ion+0.123
izzy+0.122
ions+0.115
Neg logit contributions
 faces-0.074
Thank-0.070
 sound-0.070
 sav-0.069
 Sorry-0.068
Token and
Token ID290
Feature activation-8.64e-01

Pos logit contributions
ert+0.004
oris+0.004
ion+0.004
izzy+0.004
 log+0.004
Neg logit contributions
 faces-0.002
 Sorry-0.002
ulating-0.002
Thank-0.002
 sav-0.002
Token one
Token ID530
Feature activation+1.92e-01

Pos logit contributions
ert+0.044
oris+0.042
ion+0.040
izzy+0.039
ig+0.038
Neg logit contributions
Thank-0.013
 people-0.013
 so-0.013
 faces-0.012
 sound-0.012
Token was
Token ID373
Feature activation-1.05e+00

Pos logit contributions
 faces+0.004
ulating+0.004
 Sorry+0.004
Thank+0.004
 sav+0.004
Neg logit contributions
ert-0.008
oris-0.007
izzy-0.007
ion-0.007
 log-0.007
Token a
Token ID257
Feature activation-6.34e-01

Pos logit contributions
ert+0.033
 log+0.032
izzy+0.031
 stool+0.031
oris+0.030
Neg logit contributions
 faces-0.030
 Sorry-0.030
 endings-0.029
ulating-0.028
aw-0.028
Token was
Token ID373
Feature activation-7.05e-01

Pos logit contributions
ert+0.002
oris+0.002
ion+0.002
izzy+0.002
 log+0.002
Neg logit contributions
 Sorry-0.001
ulating-0.001
 faces-0.001
Thank-0.001
clock-0.001
Token excited
Token ID6568
Feature activation+3.76e-01

Pos logit contributions
ert+0.028
ion+0.027
izzy+0.027
oris+0.026
 log+0.025
Neg logit contributions
 Sorry-0.015
 faces-0.014
 endings-0.013
ulating-0.012
Thank-0.012
Token and
Token ID290
Feature activation-5.60e-01

Pos logit contributions
 Sorry+0.014
Thank+0.012
 sudden+0.012
ulating+0.011
 sav+0.011
Neg logit contributions
ert-0.011
oris-0.009
ion-0.009
 log-0.008
izzy-0.008
Token thanked
Token ID26280
Feature activation-1.37e-01

Pos logit contributions
ert+0.018
oris+0.016
 stool+0.016
ela+0.016
 log+0.016
Neg logit contributions
 endings-0.015
view-0.014
umm-0.014
clock-0.014
aw-0.014
Token the
Token ID262
Feature activation-3.48e-01

Pos logit contributions
ert+0.005
oris+0.004
ion+0.004
izzy+0.004
 log+0.004
Neg logit contributions
ulating-0.004
 Sorry-0.004
 faces-0.004
Thank-0.004
 sav-0.003
Token old
Token ID1468
Feature activation-2.68e+00

Pos logit contributions
ert+0.013
oris+0.012
izzy+0.012
ion+0.012
 log+0.011
Neg logit contributions
 faces-0.009
Thank-0.009
ulating-0.009
 Sorry-0.009
 sav-0.008
Token woman
Token ID2415
Feature activation-1.08e+00

Pos logit contributions
ert+0.115
izzy+0.107
oris+0.102
ion+0.102
ions+0.102
Neg logit contributions
Thank-0.061
 faces-0.059
 sound-0.056
thank-0.055
 sav-0.055
Token.
Token ID13
Feature activation-3.46e-01

Pos logit contributions
ert+0.042
oris+0.038
izzy+0.037
ion+0.036
 log+0.035
Neg logit contributions
 faces-0.026
 Sorry-0.026
ulating-0.025
Thank-0.025
 sav-0.025
Token As
Token ID1081
Feature activation-5.11e-01

Pos logit contributions
ert+0.015
oris+0.014
izzy+0.013
ion+0.013
 log+0.012
Neg logit contributions
 faces-0.007
 Sorry-0.007
ulating-0.007
Thank-0.007
 sav-0.007
Token the
Token ID262
Feature activation-5.00e-02

Pos logit contributions
ert+0.017
oris+0.015
ion+0.014
izzy+0.014
 log+0.014
Neg logit contributions
ulating-0.016
 faces-0.016
 Sorry-0.016
clock-0.015
aw-0.015
Token little
Token ID1310
Feature activation-6.11e-01

Pos logit contributions
ert+0.002
oris+0.002
izzy+0.002
ion+0.002
 log+0.002
Neg logit contributions
 Sorry-0.001
 faces-0.001
ulating-0.001
Thank-0.001
 sav-0.001