TOP ACTIVATIONS
MAX = 0.047

and stretched , but it wasn 't long enough .
rub bers look so j olly !" The elephant
and looked , but it wasn 't there . She asked
worked quickly , but it wasn 't easy . Lily watched
onto the railing but it wasn 't enough . The stairs
the store , but it wasn 't there .
all of the boys and girls quickly ." Santa
chance and go up the stairs ." M illy said ,
a chair , but it wasn 't tall enough either .
tight and we 'll bounce together ." So the cross
, my generous and helpful girls ." He hugged them and
extingu isher , but there wasn 't one . He yelled
was too heavy and she wasn 't strong enough .
was too high and he wasn 't sure how he could
all her strength but it wasn 't enough .
a vine , but it wasn 't strong enough . He
aisle with the other flower girls ," her mom said .
" Let 's carry it together !" The obedient
seen it , but they hadn 't . Lily felt uncomfortable
move faster , but it wasn 't fast enough . A

MIN ACTIVATIONS
MIN = -0.042

him , " Did the hat and shield help you today
Mom said , " What car ? I only see a
Just when Tim my was about to give up
" What is it ?" Ben asks .
never seen such a wonderful suit before . She quickly put
went home and enjoyed his freshly caught fish . <|endoftext|> Once
crown and playing in the freshly cut grass . It looked
" What is it , Tom ?" she asked
" What is it ?" Lily asked .
. When Su zy and her mum stepped outside
the sweet aroma of a freshly baked cake .
" What is it , Mom ?" Sara and
" What is it , Grand ma ?" Sara
" What is it , Max ?" Lily asked
They have never heard a bird talk before . They think
" What is it ?" Lily asked .
But what caught Emily 's eye was the strange little trees
" What is it ?" he asked .
her clock what time it was , and the clock said
said , " What is it ?" Ben said

INTERVAL 0.024 - 0.047
CONTAINS 0.173%

noticed something . The bird wasn 't alone . There was
Let 's pretend we are climbing it !" They
was a little girl named Daisy . Daisy was only three
the ingredients were all mixed together . She was so proud
and said , " This mine is mine , and no

INTERVAL 0.002 - 0.024
CONTAINS 39.545%

beautiful !" J aya smiled and said , "
He says the meat is raw and not good to eat
bit . And you have to permit me to play after
could not find the butter in the kitchen . They looked
, I can tell you a story ," the k ang

INTERVAL -0.020 - 0.002
CONTAINS 59.549%

write stories with her cr ay ons . She wrote about
people are on the sees aw , the heavier one will
parade , there was a clumsy man named Tim . He
me a good lunch today ." L ily felt
One day he wanted to talk to the birds . He

INTERVAL -0.042 - -0.020
CONTAINS 0.732%

feel sorry for what they did . They want to say
all the animals . When they got there , Tim my
you . It is fun . Don 't be scared ."
car . As they were driving , they saw
Lily and Tom went to the park with Mom . They
Token and
Token ID290
Feature activation+7.20e-04

Pos logit contributions
 Grace+0.013
Els+0.013
 importantly+0.012
reetings+0.012
ified+0.012
Neg logit contributions
aster-0.014
 peel-0.011
 toast-0.011
dog-0.011
 slur-0.010
Token stretched
Token ID19110
Feature activation+1.32e-02

Pos logit contributions
 Grace+0.001
 Grac+0.001
Els+0.001
 Fiona+0.001
 Beth+0.001
Neg logit contributions
aster-0.001
five-0.001
 toast-0.001
dog-0.001
ops-0.001
Token,
Token ID11
Feature activation+8.76e-03

Pos logit contributions
 importantly+0.021
Els+0.021
 Grace+0.021
reetings+0.021
ified+0.020
Neg logit contributions
aster-0.021
 peel-0.017
 toast-0.016
dog-0.016
 slur-0.016
Token but
Token ID475
Feature activation+1.06e-03

Pos logit contributions
 Grace+0.009
 importantly+0.008
Els+0.008
ified+0.008
 ago+0.008
Neg logit contributions
aster-0.020
 peel-0.016
 toast-0.016
dog-0.016
ops-0.016
Token it
Token ID340
Feature activation+1.67e-03

Pos logit contributions
 Grace+0.002
 importantly+0.002
Els+0.002
reetings+0.002
ified+0.002
Neg logit contributions
aster-0.002
 peel-0.001
 toast-0.001
dog-0.001
 slur-0.001
Token wasn
Token ID2492
Feature activation+4.65e-02

Pos logit contributions
Els+0.002
 Grace+0.002
ified+0.002
 importantly+0.002
spe+0.002
Neg logit contributions
aster-0.003
five-0.003
 toast-0.003
 peel-0.003
dog-0.003
Token't
Token ID470
Feature activation+8.70e-03

Pos logit contributions
 importantly+0.078
Els+0.076
reetings+0.076
 Grace+0.075
 galaxies+0.073
Neg logit contributions
aster-0.076
 peel-0.063
 toast-0.055
on-0.054
dog-0.054
Token long
Token ID890
Feature activation-3.48e-03

Pos logit contributions
 Grace+0.012
 importantly+0.012
Els+0.012
reetings+0.012
ified+0.012
Neg logit contributions
aster-0.016
 peel-0.013
 toast-0.012
dog-0.012
 slur-0.012
Token enough
Token ID1576
Feature activation-5.22e-03

Pos logit contributions
aster+0.004
 peel+0.003
 toast+0.003
dog+0.003
 cut+0.003
Neg logit contributions
Els-0.007
 Grace-0.007
reetings-0.007
 importantly-0.007
 Amanda-0.007
Token.
Token ID13
Feature activation-1.21e-03

Pos logit contributions
aster+0.007
 peel+0.005
 toast+0.005
dog+0.005
 slur+0.004
Neg logit contributions
Els-0.010
 Grace-0.010
reetings-0.010
 importantly-0.010
ified-0.010
Token
Token ID220
Feature activation+1.84e-03

Pos logit contributions
aster+0.002
 peel+0.002
 toast+0.002
dog+0.002
 slur+0.001
Neg logit contributions
Els-0.002
reetings-0.002
 Grace-0.002
 importantly-0.002
ified-0.002
Token rub
Token ID6437
Feature activation+4.81e-03

Pos logit contributions
aster+0.020
 peel+0.017
 toast+0.016
dog+0.015
ops+0.014
Neg logit contributions
 importantly-0.025
reetings-0.025
Els-0.025
 Grace-0.025
ified-0.024
Tokenbers
Token ID1213
Feature activation+2.99e-03

Pos logit contributions
 importantly+0.009
Els+0.009
 Grace+0.009
reetings+0.009
ified+0.009
Neg logit contributions
aster-0.006
 peel-0.005
 toast-0.005
dog-0.005
ops-0.005
Token look
Token ID804
Feature activation+6.48e-03

Pos logit contributions
 importantly+0.005
 Grace+0.005
Els+0.005
reetings+0.005
ified+0.005
Neg logit contributions
aster-0.005
 peel-0.004
 toast-0.004
dog-0.004
 slur-0.003
Token so
Token ID523
Feature activation-2.34e-03

Pos logit contributions
Els+0.011
reetings+0.011
 ago+0.011
ified+0.011
 importantly+0.011
Neg logit contributions
aster-0.008
 toast-0.008
 peel-0.007
 slur-0.007
 cut-0.007
Token j
Token ID474
Feature activation+1.16e-02

Pos logit contributions
aster+0.004
 peel+0.003
 toast+0.003
 slur+0.003
dog+0.003
Neg logit contributions
reetings-0.004
Els-0.004
ified-0.004
 ago-0.004
 Grace-0.004
Tokenolly
Token ID5098
Feature activation+4.56e-02

Pos logit contributions
Els+0.021
 Grace+0.021
 importantly+0.021
reetings+0.021
ified+0.020
Neg logit contributions
aster-0.016
 peel-0.012
 toast-0.012
dog-0.011
ops-0.011
Token!"
Token ID2474
Feature activation+7.54e-04

Pos logit contributions
 importantly+0.070
Els+0.068
reetings+0.067
 Beat+0.065
ified+0.064
Neg logit contributions
aster-0.071
 toast-0.065
 peel-0.063
dog-0.061
 cone-0.060
Token
Token ID198
Feature activation+5.19e-03

Pos logit contributions
 importantly+0.001
Els+0.001
 Grace+0.001
reetings+0.001
ified+0.001
Neg logit contributions
aster-0.001
 peel-0.001
 toast-0.001
dog-0.001
 slur-0.001
Token
Token ID198
Feature activation+1.09e-02

Pos logit contributions
 importantly+0.008
 Grace+0.008
reetings+0.008
ified+0.008
Els+0.008
Neg logit contributions
aster-0.009
 peel-0.007
 toast-0.007
dog-0.007
 slur-0.006
TokenThe
Token ID464
Feature activation+9.17e-03

Pos logit contributions
 Grace+0.018
 ago+0.017
 importantly+0.017
reetings+0.017
ified+0.017
Neg logit contributions
aster-0.018
 peel-0.015
 toast-0.014
dog-0.014
 slur-0.013
Token elephant
Token ID20950
Feature activation-5.92e-03

Pos logit contributions
reetings+0.015
Els+0.015
 importantly+0.014
 Grace+0.014
ified+0.014
Neg logit contributions
aster-0.015
 peel-0.013
 toast-0.012
dog-0.012
 cut-0.011
Token and
Token ID290
Feature activation+5.02e-03

Pos logit contributions
Els+0.012
 importantly+0.012
 Grace+0.012
reetings+0.012
ified+0.011
Neg logit contributions
aster-0.015
 peel-0.012
 toast-0.012
dog-0.011
 slur-0.011
Token looked
Token ID3114
Feature activation+9.47e-03

Pos logit contributions
 Grace+0.009
Els+0.009
 importantly+0.008
 Grac+0.008
 ago+0.008
Neg logit contributions
aster-0.008
 toast-0.006
 peel-0.006
five-0.006
dog-0.006
Token,
Token ID11
Feature activation+1.32e-03

Pos logit contributions
Els+0.014
reetings+0.014
 importantly+0.014
 Grace+0.014
ified+0.014
Neg logit contributions
aster-0.016
 peel-0.013
 toast-0.012
dog-0.012
 slur-0.012
Token but
Token ID475
Feature activation-1.07e-03

Pos logit contributions
 Grace+0.001
 importantly+0.001
Els+0.001
reetings+0.001
ified+0.001
Neg logit contributions
aster-0.003
 peel-0.003
 toast-0.002
dog-0.002
 slur-0.002
Token it
Token ID340
Feature activation+1.80e-03

Pos logit contributions
aster+0.002
 peel+0.002
dog+0.001
 toast+0.001
 slur+0.001
Neg logit contributions
 Grace-0.002
Els-0.002
 importantly-0.002
ified-0.002
reetings-0.002
Token wasn
Token ID2492
Feature activation+4.53e-02

Pos logit contributions
 Grace+0.003
Els+0.003
 importantly+0.002
ified+0.002
reetings+0.002
Neg logit contributions
aster-0.003
 peel-0.003
 toast-0.003
five-0.003
dog-0.003
Token't
Token ID470
Feature activation+4.62e-03

Pos logit contributions
 importantly+0.077
Els+0.074
reetings+0.074
 Grace+0.072
 sizes+0.071
Neg logit contributions
aster-0.074
 peel-0.061
on-0.053
 toast-0.052
dog-0.050
Token there
Token ID612
Feature activation+3.80e-03

Pos logit contributions
 Grace+0.006
Els+0.006
 importantly+0.006
reetings+0.005
ified+0.005
Neg logit contributions
aster-0.009
 peel-0.008
 toast-0.008
dog-0.007
 slur-0.007
Token.
Token ID13
Feature activation-4.11e-03

Pos logit contributions
Els+0.007
 importantly+0.007
reetings+0.007
 Grace+0.007
ified+0.007
Neg logit contributions
aster-0.005
 peel-0.004
 toast-0.004
dog-0.003
ops-0.003
Token She
Token ID1375
Feature activation+4.89e-03

Pos logit contributions
aster+0.007
 peel+0.005
 toast+0.005
dog+0.005
 slur+0.005
Neg logit contributions
reetings-0.007
 importantly-0.007
Els-0.006
ified-0.006
 insepar-0.006
Token asked
Token ID1965
Feature activation-7.85e-03

Pos logit contributions
 Grace+0.008
Els+0.007
 importantly+0.007
ified+0.007
 Grac+0.006
Neg logit contributions
aster-0.009
five-0.007
 peel-0.007
 toast-0.007
dog-0.006
Token worked
Token ID3111
Feature activation+1.66e-02

Pos logit contributions
 Grace+0.011
 importantly+0.011
Els+0.011
ified+0.011
spe+0.010
Neg logit contributions
aster-0.010
 peel-0.008
 toast-0.007
dog-0.007
five-0.007
Token quickly
Token ID2952
Feature activation+1.54e-03

Pos logit contributions
Els+0.027
reetings+0.026
 importantly+0.026
 Grace+0.026
 ago+0.025
Neg logit contributions
aster-0.027
 peel-0.023
 toast-0.021
dog-0.020
 slur-0.019
Token,
Token ID11
Feature activation+2.25e-03

Pos logit contributions
Els+0.003
 importantly+0.003
 Grace+0.003
reetings+0.003
 aliens+0.003
Neg logit contributions
aster-0.002
 peel-0.002
 toast-0.001
dog-0.001
 cut-0.001
Token but
Token ID475
Feature activation+1.36e-03

Pos logit contributions
Els+0.003
reetings+0.003
 importantly+0.003
 Grace+0.003
ified+0.003
Neg logit contributions
aster-0.004
 peel-0.003
 toast-0.003
dog-0.003
 slur-0.003
Token it
Token ID340
Feature activation+6.15e-03

Pos logit contributions
Els+0.002
 importantly+0.002
 Grace+0.002
reetings+0.002
ified+0.002
Neg logit contributions
aster-0.002
 peel-0.002
 toast-0.002
dog-0.002
 slur-0.002
Token wasn
Token ID2492
Feature activation+4.50e-02

Pos logit contributions
 Grace+0.010
Els+0.009
 importantly+0.009
reetings+0.009
ified+0.009
Neg logit contributions
aster-0.010
 peel-0.008
 toast-0.008
dog-0.008
 slur-0.008
Token't
Token ID470
Feature activation+2.81e-03

Pos logit contributions
 importantly+0.085
 galaxies+0.084
Els+0.084
 sizes+0.083
reetings+0.082
Neg logit contributions
aster-0.069
 peel-0.054
on-0.047
 toast-0.045
dog-0.044
Token easy
Token ID2562
Feature activation+1.23e-03

Pos logit contributions
Els+0.005
reetings+0.005
 Grace+0.005
 importantly+0.005
ified+0.004
Neg logit contributions
aster-0.004
 peel-0.004
 toast-0.004
 cut-0.003
dog-0.003
Token.
Token ID13
Feature activation+2.15e-03

Pos logit contributions
Els+0.003
 Grace+0.003
 importantly+0.003
reetings+0.003
ifying+0.002
Neg logit contributions
aster-0.001
 peel-0.001
 toast-0.001
dog-0.001
ops-0.001
Token Lily
Token ID20037
Feature activation+2.76e-03

Pos logit contributions
reetings+0.004
Els+0.004
 aliens+0.004
 insepar+0.004
 importantly+0.004
Neg logit contributions
aster-0.003
 peel-0.003
 toast-0.002
 slur-0.002
dog-0.002
Token watched
Token ID7342
Feature activation-8.59e-03

Pos logit contributions
 Grace+0.004
Els+0.004
ified+0.004
 importantly+0.004
 insepar+0.004
Neg logit contributions
aster-0.005
 peel-0.004
 toast-0.004
dog-0.003
five-0.003
Token onto
Token ID4291
Feature activation-2.00e-03

Pos logit contributions
aster+0.002
 peel+0.002
 toast+0.002
dog+0.002
ops+0.002
Neg logit contributions
 Grace-0.002
 importantly-0.002
Els-0.002
ified-0.002
 Amanda-0.002
Token the
Token ID262
Feature activation-4.31e-03

Pos logit contributions
aster+0.004
 peel+0.003
 toast+0.003
dog+0.003
ops+0.003
Neg logit contributions
 Grace-0.003
Els-0.003
ified-0.003
 importantly-0.003
 Amanda-0.003
Token railing
Token ID47590
Feature activation+4.75e-03

Pos logit contributions
aster+0.007
 peel+0.006
 toast+0.006
dog+0.005
 slur+0.005
Neg logit contributions
 Grace-0.007
 importantly-0.007
Els-0.007
reetings-0.007
ified-0.006
Token but
Token ID475
Feature activation+9.04e-05

Pos logit contributions
Els+0.009
reetings+0.009
 importantly+0.008
 Grace+0.008
ified+0.008
Neg logit contributions
aster-0.007
 peel-0.005
 toast-0.005
dog-0.005
 slur-0.005
Token it
Token ID340
Feature activation+4.80e-03

Pos logit contributions
Els+0.000
 importantly+0.000
 Grace+0.000
reetings+0.000
ified+0.000
Neg logit contributions
aster-0.000
 peel-0.000
 toast-0.000
dog-0.000
 slur-0.000
Token wasn
Token ID2492
Feature activation+4.49e-02

Pos logit contributions
 Grace+0.007
Els+0.007
 importantly+0.007
ified+0.007
reetings+0.007
Neg logit contributions
aster-0.008
 peel-0.007
 toast-0.007
dog-0.006
five-0.006
Token't
Token ID470
Feature activation+6.19e-03

Pos logit contributions
 importantly+0.087
Els+0.083
reetings+0.083
 galaxies+0.081
 sizes+0.080
Neg logit contributions
aster-0.065
 peel-0.052
on-0.045
 cut-0.042
dog-0.042
Token enough
Token ID1576
Feature activation-1.74e-02

Pos logit contributions
Els+0.010
 importantly+0.010
reetings+0.010
 Grace+0.009
ifying+0.009
Neg logit contributions
aster-0.010
 peel-0.008
 toast-0.008
 cut-0.008
dog-0.007
Token.
Token ID13
Feature activation+2.66e-03

Pos logit contributions
aster+0.020
 peel+0.016
 toast+0.015
 cut+0.013
 slur+0.013
Neg logit contributions
Els-0.038
 Grace-0.037
reetings-0.036
ified-0.035
 importantly-0.035
Token The
Token ID383
Feature activation+4.01e-03

Pos logit contributions
reetings+0.005
 importantly+0.004
Els+0.004
 insepar+0.004
 aliens+0.004
Neg logit contributions
aster-0.004
 peel-0.003
dog-0.003
 toast-0.003
on-0.003
Token stairs
Token ID16046
Feature activation+1.87e-02

Pos logit contributions
reetings+0.006
Els+0.006
 Grace+0.006
 importantly+0.006
ified+0.006
Neg logit contributions
aster-0.007
 peel-0.006
 toast-0.006
dog-0.005
 slur-0.005
Token the
Token ID262
Feature activation-2.27e-02

Pos logit contributions
aster+0.020
 peel+0.016
 toast+0.015
dog+0.015
ops+0.015
Neg logit contributions
 Grace-0.021
Els-0.021
 importantly-0.020
reetings-0.020
ified-0.019
Token store
Token ID3650
Feature activation-3.19e-03

Pos logit contributions
aster+0.034
 peel+0.027
 toast+0.025
 slur+0.025
dog+0.025
Neg logit contributions
 Grace-0.038
Els-0.038
 importantly-0.037
reetings-0.037
ified-0.036
Token,
Token ID11
Feature activation-1.17e-03

Pos logit contributions
aster+0.007
 slur+0.006
Ble+0.006
 peel+0.006
ops+0.006
Neg logit contributions
 ago-0.003
 Grace-0.003
 so-0.003
 since-0.003
 importantly-0.003
Token but
Token ID475
Feature activation-3.53e-03

Pos logit contributions
aster+0.002
 peel+0.002
 toast+0.002
dog+0.002
ops+0.002
Neg logit contributions
 Grace-0.001
 importantly-0.001
Els-0.001
reetings-0.001
 ago-0.001
Token it
Token ID340
Feature activation-1.04e-03

Pos logit contributions
aster+0.006
 peel+0.005
dog+0.004
 slur+0.004
five+0.004
Neg logit contributions
 Grace-0.006
Els-0.005
 importantly-0.005
 ago-0.005
reetings-0.005
Token wasn
Token ID2492
Feature activation+4.47e-02

Pos logit contributions
aster+0.002
five+0.002
 peel+0.002
 toast+0.002
 slur+0.002
Neg logit contributions
 Grace-0.001
Els-0.001
ified-0.001
spe-0.001
 importantly-0.001
Token't
Token ID470
Feature activation+6.08e-03

Pos logit contributions
 importantly+0.079
Els+0.077
reetings+0.076
 Grace+0.075
 galaxies+0.074
Neg logit contributions
aster-0.071
 peel-0.058
 toast-0.050
on-0.049
dog-0.048
Token there
Token ID612
Feature activation+5.07e-03

Pos logit contributions
Els+0.009
 importantly+0.009
 Grace+0.009
reetings+0.009
ified+0.009
Neg logit contributions
aster-0.010
 peel-0.009
 toast-0.008
dog-0.008
 slur-0.008
Token.
Token ID13
Feature activation-1.53e-03

Pos logit contributions
Els+0.009
 importantly+0.009
 Grace+0.009
reetings+0.009
ified+0.009
Neg logit contributions
aster-0.007
 peel-0.006
 toast-0.005
dog-0.005
ops-0.005
Token
Token ID220
Feature activation+1.21e-03

Pos logit contributions
aster+0.003
 peel+0.002
 toast+0.002
dog+0.002
 slur+0.002
Neg logit contributions
 Grace-0.002
Els-0.002
 importantly-0.002
reetings-0.002
ified-0.002
Token
Token ID198
Feature activation-9.90e-03

Pos logit contributions
ified+0.002
 importantly+0.002
Els+0.002
 Grace+0.002
reetings+0.002
Neg logit contributions
aster-0.002
 peel-0.001
 toast-0.001
on-0.001
dog-0.001
Token all
Token ID477
Feature activation+1.08e-02

Pos logit contributions
aster+0.000
 toast+0.000
 peel+0.000
dog+0.000
 slur+0.000
Neg logit contributions
ified-0.000
 importantly-0.000
Els-0.000
reetings-0.000
spe-0.000
Token of
Token ID286
Feature activation+2.32e-03

Pos logit contributions
ified+0.021
Els+0.020
 importantly+0.019
reetings+0.018
 rooft+0.018
Neg logit contributions
aster-0.017
 toast-0.012
on-0.011
 peel-0.010
dog-0.010
Token the
Token ID262
Feature activation-5.46e-03

Pos logit contributions
spe+0.004
ified+0.004
reetings+0.004
Els+0.004
 importantly+0.004
Neg logit contributions
aster-0.004
 toast-0.003
 peel-0.003
on-0.002
dog-0.002
Token boys
Token ID6510
Feature activation+1.25e-02

Pos logit contributions
aster+0.007
 toast+0.007
 peel+0.006
 cut+0.006
 man+0.006
Neg logit contributions
ified-0.010
reetings-0.009
Els-0.009
ifying-0.009
eway-0.009
Token and
Token ID290
Feature activation+3.62e-03

Pos logit contributions
Els+0.021
 importantly+0.021
reetings+0.020
ified+0.020
Mill+0.019
Neg logit contributions
aster-0.019
 peel-0.016
dog-0.015
 toast-0.015
 cone-0.015
Token girls
Token ID4813
Feature activation+4.44e-02

Pos logit contributions
reetings+0.007
ified+0.007
Els+0.007
 importantly+0.007
eway+0.006
Neg logit contributions
aster-0.005
 toast-0.004
 say-0.004
 cut-0.004
 peel-0.004
Token quickly
Token ID2952
Feature activation+1.57e-02

Pos logit contributions
ified+0.076
Els+0.075
 importantly+0.075
reetings+0.074
 Beat+0.072
Neg logit contributions
aster-0.060
 peel-0.051
 toast-0.049
 cut-0.049
five-0.048
Token."
Token ID526
Feature activation+5.57e-05

Pos logit contributions
Els+0.030
 importantly+0.029
ified+0.027
 Grace+0.027
Mill+0.027
Neg logit contributions
aster-0.022
 peel-0.018
 toast-0.017
 cut-0.016
five-0.016
Token
Token ID198
Feature activation+1.74e-03

Pos logit contributions
reetings+0.000
spe+0.000
ified+0.000
 importantly+0.000
 insepar+0.000
Neg logit contributions
aster-0.000
 peel-0.000
 toast-0.000
dog-0.000
 slur-0.000
Token
Token ID198
Feature activation+1.02e-02

Pos logit contributions
 sizes+0.003
ified+0.003
 ago+0.003
reetings+0.003
 importantly+0.003
Neg logit contributions
aster-0.002
 toast-0.002
 peel-0.002
on-0.002
dog-0.002
TokenSanta
Token ID42694
Feature activation-3.62e-03

Pos logit contributions
 ago+0.020
 importantly+0.019
 aliens+0.019
 sizes+0.018
reetings+0.018
Neg logit contributions
aster-0.014
 peel-0.010
 toast-0.010
dog-0.009
 slur-0.009
Token chance
Token ID2863
Feature activation+1.22e-02

Pos logit contributions
 Grace+0.005
Els+0.005
 importantly+0.005
 ago+0.005
ified+0.005
Neg logit contributions
aster-0.005
 peel-0.004
 toast-0.004
 slur-0.004
dog-0.004
Token and
Token ID290
Feature activation+1.16e-02

Pos logit contributions
Els+0.025
 importantly+0.024
reetings+0.023
ified+0.022
 Grace+0.022
Neg logit contributions
aster-0.016
 toast-0.013
 peel-0.013
dog-0.012
 cut-0.011
Token go
Token ID467
Feature activation+1.47e-02

Pos logit contributions
 importantly+0.026
Els+0.026
reetings+0.026
ified+0.025
 insepar+0.024
Neg logit contributions
aster-0.013
 peel-0.010
 toast-0.008
 cut-0.008
 cone-0.007
Token up
Token ID510
Feature activation+3.05e-03

Pos logit contributions
 importantly+0.030
ified+0.028
Els+0.028
spe+0.027
reetings+0.027
Neg logit contributions
aster-0.021
 peel-0.017
dog-0.016
 cone-0.015
five-0.015
Token the
Token ID262
Feature activation+2.27e-03

Pos logit contributions
 importantly+0.007
Els+0.007
reetings+0.007
ified+0.006
ifying+0.006
Neg logit contributions
aster-0.004
 toast-0.003
 peel-0.003
dog-0.002
 cone-0.002
Token stairs
Token ID16046
Feature activation+4.41e-02

Pos logit contributions
 importantly+0.004
Els+0.004
 Grace+0.004
reetings+0.004
ified+0.004
Neg logit contributions
aster-0.004
 peel-0.003
 toast-0.003
dog-0.003
 cone-0.003
Token."
Token ID526
Feature activation+4.31e-03

Pos logit contributions
Els+0.086
 importantly+0.086
reetings+0.082
 Grace+0.078
ified+0.077
Neg logit contributions
aster-0.060
 peel-0.049
 cone-0.047
five-0.045
dog-0.044
Token M
Token ID337
Feature activation+2.31e-02

Pos logit contributions
reetings+0.006
 importantly+0.006
Els+0.006
ified+0.006
 Grace+0.005
Neg logit contributions
aster-0.008
 peel-0.007
 slur-0.007
 toast-0.007
dog-0.006
Tokenilly
Token ID6548
Feature activation-6.40e-04

Pos logit contributions
 Grace+0.038
Els+0.038
 importantly+0.038
reetings+0.037
ified+0.037
Neg logit contributions
aster-0.036
 peel-0.029
 toast-0.027
dog-0.026
 slur-0.025
Token said
Token ID531
Feature activation+5.89e-03

Pos logit contributions
aster+0.001
 peel+0.001
 toast+0.001
dog+0.001
five+0.001
Neg logit contributions
 Grace-0.001
Els-0.001
ified-0.001
spe-0.001
 importantly-0.001
Token,
Token ID11
Feature activation+3.84e-03

Pos logit contributions
Els+0.010
reetings+0.010
 importantly+0.010
 Grace+0.010
 ago+0.009
Neg logit contributions
aster-0.009
 peel-0.007
 toast-0.007
dog-0.007
 slur-0.007
Token a
Token ID257
Feature activation-1.17e-03

Pos logit contributions
 Grace+0.007
Els+0.006
 importantly+0.006
ified+0.006
 Grac+0.006
Neg logit contributions
aster-0.008
 peel-0.006
 toast-0.006
dog-0.006
ops-0.006
Token chair
Token ID5118
Feature activation+4.95e-03

Pos logit contributions
aster+0.002
 peel+0.002
 toast+0.002
dog+0.002
 slur+0.002
Neg logit contributions
Els-0.002
 Grace-0.002
 importantly-0.002
reetings-0.002
ified-0.002
Token,
Token ID11
Feature activation-2.02e-03

Pos logit contributions
 Grace+0.007
ified+0.007
 importantly+0.007
 ago+0.007
Els+0.007
Neg logit contributions
aster-0.008
 peel-0.006
 toast-0.006
 slur-0.006
dog-0.006
Token but
Token ID475
Feature activation-7.66e-04

Pos logit contributions
aster+0.003
 peel+0.003
dog+0.003
ops+0.003
 toast+0.003
Neg logit contributions
 Grace-0.003
 importantly-0.003
Els-0.003
ified-0.003
 ago-0.003
Token it
Token ID340
Feature activation+5.50e-03

Pos logit contributions
aster+0.001
 peel+0.001
dog+0.001
five+0.001
 toast+0.001
Neg logit contributions
 Grace-0.001
Els-0.001
 importantly-0.001
ified-0.001
 insepar-0.001
Token wasn
Token ID2492
Feature activation+4.38e-02

Pos logit contributions
 Grace+0.008
Els+0.008
ified+0.007
 importantly+0.007
spe+0.007
Neg logit contributions
aster-0.010
five-0.008
 toast-0.008
 peel-0.008
dog-0.008
Token't
Token ID470
Feature activation+5.44e-03

Pos logit contributions
 importantly+0.077
Els+0.074
reetings+0.074
 Grace+0.073
 galaxies+0.073
Neg logit contributions
aster-0.069
 peel-0.058
 toast-0.049
on-0.049
dog-0.048
Token tall
Token ID7331
Feature activation-5.42e-03

Pos logit contributions
Els+0.008
 Grace+0.008
 importantly+0.008
reetings+0.008
ified+0.008
Neg logit contributions
aster-0.010
 peel-0.008
 toast-0.008
dog-0.007
 slur-0.007
Token enough
Token ID1576
Feature activation-9.42e-03

Pos logit contributions
aster+0.009
 peel+0.008
 toast+0.008
dog+0.008
 cone+0.007
Neg logit contributions
Els-0.009
reetings-0.009
 Grace-0.009
Mill-0.008
 importantly-0.008
Token either
Token ID2035
Feature activation-1.03e-02

Pos logit contributions
aster+0.011
 peel+0.009
 toast+0.009
dog+0.007
 cone+0.007
Neg logit contributions
Els-0.020
 Grace-0.019
reetings-0.019
 importantly-0.019
ified-0.019
Token.
Token ID13
Feature activation-1.72e-03

Pos logit contributions
aster+0.014
 peel+0.011
 toast+0.010
dog+0.010
 slur+0.009
Neg logit contributions
Els-0.019
 importantly-0.019
 Grace-0.019
reetings-0.019
ified-0.018
Token tight
Token ID5381
Feature activation+1.51e-02

Pos logit contributions
Els+0.019
reetings+0.019
 importantly+0.019
 Grace+0.018
ified+0.018
Neg logit contributions
aster-0.012
 peel-0.010
 slur-0.008
 toast-0.008
dog-0.008
Token and
Token ID290
Feature activation+1.83e-02

Pos logit contributions
Els+0.034
 importantly+0.033
reetings+0.033
spe+0.032
 reality+0.030
Neg logit contributions
aster-0.014
 peel-0.014
dog-0.012
 toast-0.011
 cut-0.011
Token we
Token ID356
Feature activation+1.79e-02

Pos logit contributions
Els+0.037
reetings+0.036
 importantly+0.036
ified+0.035
 Grace+0.034
Neg logit contributions
aster-0.025
 peel-0.020
 toast-0.017
 slur-0.017
 cut-0.016
Token'll
Token ID1183
Feature activation+1.88e-02

Pos logit contributions
 importantly+0.026
reetings+0.025
Els+0.025
 Grace+0.024
ified+0.024
Neg logit contributions
aster-0.031
 peel-0.029
 cut-0.025
 slur-0.025
dog-0.024
Token bounce
Token ID20110
Feature activation+2.17e-02

Pos logit contributions
 importantly+0.038
Els+0.037
reetings+0.037
 Grace+0.036
ified+0.036
Neg logit contributions
aster-0.024
 peel-0.019
 toast-0.016
dog-0.016
 slur-0.016
Token together
Token ID1978
Feature activation+4.32e-02

Pos logit contributions
 importantly+0.047
reetings+0.044
Els+0.042
ified+0.041
spe+0.041
Neg logit contributions
aster-0.025
 peel-0.022
 toast-0.020
dog-0.020
 band-0.019
Token."
Token ID526
Feature activation-3.70e-03

Pos logit contributions
 importantly+0.082
Els+0.078
reetings+0.077
 Lay+0.075
 Beat+0.074
Neg logit contributions
aster-0.057
 peel-0.052
 toast-0.049
 cone-0.046
 cut-0.045
Token
Token ID198
Feature activation-4.63e-03

Pos logit contributions
aster+0.007
 peel+0.006
 toast+0.005
dog+0.005
 slur+0.005
Neg logit contributions
 Grace-0.005
Els-0.005
 importantly-0.005
reetings-0.005
ified-0.005
TokenSo
Token ID2396
Feature activation+1.20e-02

Pos logit contributions
aster+0.008
 peel+0.006
 toast+0.006
dog+0.006
 slur+0.006
Neg logit contributions
 importantly-0.007
 Grace-0.007
reetings-0.007
ified-0.007
 ago-0.007
Token the
Token ID262
Feature activation+4.18e-03

Pos logit contributions
reetings+0.027
Els+0.025
 importantly+0.024
 ago+0.024
ified+0.024
Neg logit contributions
aster-0.014
 peel-0.010
 toast-0.010
 slur-0.009
dog-0.009
Token cross
Token ID3272
Feature activation-6.93e-03

Pos logit contributions
 importantly+0.006
 Grace+0.006
Els+0.006
reetings+0.006
ified+0.006
Neg logit contributions
aster-0.007
 peel-0.006
 toast-0.005
dog-0.005
 slur-0.005
Token,
Token ID11
Feature activation+1.18e-02

Pos logit contributions
Els+0.027
 importantly+0.025
reetings+0.025
ified+0.025
spe+0.025
Neg logit contributions
aster-0.013
 toast-0.011
 peel-0.010
dog-0.009
on-0.009
Token my
Token ID616
Feature activation+1.84e-02

Pos logit contributions
reetings+0.023
Els+0.023
ified+0.023
 valleys+0.022
ifying+0.022
Neg logit contributions
aster-0.016
 toast-0.014
 slur-0.013
 peel-0.013
 sandwich-0.012
Token generous
Token ID14431
Feature activation+1.97e-02

Pos logit contributions
Els+0.026
 Grace+0.026
 importantly+0.026
reetings+0.026
ified+0.025
Neg logit contributions
aster-0.033
 peel-0.028
 toast-0.027
dog-0.026
 slur-0.026
Token and
Token ID290
Feature activation+1.29e-03

Pos logit contributions
 Grace+0.034
Els+0.034
 importantly+0.034
reetings+0.033
ified+0.033
Neg logit contributions
aster-0.030
 peel-0.023
 toast-0.022
dog-0.021
 slur-0.021
Token helpful
Token ID7613
Feature activation+1.69e-02

Pos logit contributions
Els+0.002
ified+0.002
reetings+0.002
 importantly+0.002
ifying+0.002
Neg logit contributions
aster-0.002
 toast-0.002
 peel-0.002
dog-0.001
 cone-0.001
Token girls
Token ID4813
Feature activation+4.29e-02

Pos logit contributions
 importantly+0.030
Els+0.030
reetings+0.030
 Grace+0.029
ified+0.029
Neg logit contributions
aster-0.025
 peel-0.020
 toast-0.019
dog-0.018
 slur-0.017
Token."
Token ID526
Feature activation-7.60e-04

Pos logit contributions
Els+0.071
ified+0.070
reetings+0.070
 importantly+0.069
 Grace+0.067
Neg logit contributions
aster-0.064
 toast-0.054
 peel-0.054
five-0.052
dog-0.051
Token He
Token ID679
Feature activation-5.98e-03

Pos logit contributions
aster+0.001
 peel+0.001
 toast+0.001
 slur+0.001
dog+0.001
Neg logit contributions
ified-0.001
reetings-0.001
Els-0.001
 insepar-0.001
spe-0.001
Token hugged
Token ID41130
Feature activation+1.04e-03

Pos logit contributions
aster+0.010
 peel+0.007
 toast+0.007
ops+0.006
five+0.006
Neg logit contributions
 Grace-0.010
Els-0.010
 importantly-0.009
ified-0.009
 insepar-0.009
Token them
Token ID606
Feature activation-8.72e-03

Pos logit contributions
 Grace+0.002
Els+0.001
 importantly+0.001
reetings+0.001
ified+0.001
Neg logit contributions
aster-0.002
 peel-0.002
 toast-0.001
dog-0.001
 slur-0.001
Token and
Token ID290
Feature activation-1.50e-02

Pos logit contributions
aster+0.020
Ble+0.016
uch+0.015
ede+0.015
 peel+0.015
Neg logit contributions
 Grace-0.010
 forever-0.009
 so-0.009
 asleep-0.008
 Amanda-0.008
Token extingu
Token ID24995
Feature activation-4.39e-03

Pos logit contributions
Els+0.007
 importantly+0.006
 Grace+0.006
reetings+0.006
ified+0.006
Neg logit contributions
aster-0.004
 peel-0.003
dog-0.003
 toast-0.003
 cone-0.003
Tokenisher
Token ID4828
Feature activation+7.31e-03

Pos logit contributions
aster+0.009
 peel+0.007
 toast+0.007
ops+0.007
 slur+0.007
Neg logit contributions
ifying-0.005
ified-0.005
 importantly-0.005
spe-0.005
 Grace-0.005
Token,
Token ID11
Feature activation+8.46e-03

Pos logit contributions
Els+0.015
 importantly+0.014
 Grace+0.014
reetings+0.014
ified+0.013
Neg logit contributions
aster-0.009
 peel-0.007
 toast-0.007
dog-0.007
 cone-0.006
Token but
Token ID475
Feature activation+1.13e-05

Pos logit contributions
Els+0.011
reetings+0.011
 Grace+0.011
 importantly+0.011
ified+0.011
Neg logit contributions
aster-0.016
 peel-0.014
 toast-0.013
dog-0.013
 slur-0.013
Token there
Token ID612
Feature activation+7.22e-03

Pos logit contributions
 importantly+0.000
 Grace+0.000
Els+0.000
reetings+0.000
ified+0.000
Neg logit contributions
aster-0.000
 peel-0.000
 toast-0.000
dog-0.000
ops-0.000
Token wasn
Token ID2492
Feature activation+4.27e-02

Pos logit contributions
 Grace+0.008
spe+0.008
 importantly+0.008
 Amanda+0.007
ified+0.007
Neg logit contributions
aster-0.015
 peel-0.013
five-0.013
 toast-0.012
 slur-0.012
Token't
Token ID470
Feature activation+1.87e-03

Pos logit contributions
 importantly+0.074
Els+0.073
reetings+0.071
 Grace+0.070
 galaxies+0.066
Neg logit contributions
aster-0.071
 peel-0.058
on-0.051
 toast-0.050
dog-0.050
Token one
Token ID530
Feature activation-9.92e-03

Pos logit contributions
Els+0.003
 Grace+0.003
reetings+0.003
 importantly+0.003
ified+0.003
Neg logit contributions
aster-0.003
 peel-0.002
 toast-0.002
dog-0.002
 slur-0.002
Token.
Token ID13
Feature activation-3.13e-03

Pos logit contributions
aster+0.016
 peel+0.012
 toast+0.012
dog+0.011
 slur+0.011
Neg logit contributions
Els-0.017
 Grace-0.017
 importantly-0.017
reetings-0.016
ified-0.016
Token He
Token ID679
Feature activation+1.01e-03

Pos logit contributions
aster+0.006
 peel+0.005
 toast+0.005
dog+0.005
 slur+0.005
Neg logit contributions
 Grace-0.004
 importantly-0.004
ified-0.004
Els-0.004
 ago-0.004
Token yelled
Token ID22760
Feature activation+1.20e-04

Pos logit contributions
 Grace+0.002
Els+0.002
 importantly+0.002
ified+0.002
reetings+0.001
Neg logit contributions
aster-0.002
 toast-0.001
 peel-0.001
five-0.001
dog-0.001
Token was
Token ID373
Feature activation+2.03e-02

Pos logit contributions
aster+0.001
 toast+0.001
 peel+0.001
five+0.001
dog+0.001
Neg logit contributions
 Grace-0.001
Els-0.001
 importantly-0.001
ified-0.001
reetings-0.000
Token too
Token ID1165
Feature activation-1.44e-02

Pos logit contributions
Els+0.036
reetings+0.035
 Grace+0.034
 importantly+0.034
ified+0.033
Neg logit contributions
aster-0.030
 peel-0.026
 toast-0.023
 cut-0.023
dog-0.021
Token heavy
Token ID4334
Feature activation-4.97e-03

Pos logit contributions
aster+0.030
 peel+0.025
 toast+0.025
dog+0.024
 slur+0.024
Neg logit contributions
 Grace-0.015
 importantly-0.015
Els-0.015
reetings-0.015
ified-0.014
Token and
Token ID290
Feature activation+2.23e-02

Pos logit contributions
aster+0.005
 peel+0.004
 toast+0.004
dog+0.004
ops+0.003
Neg logit contributions
Els-0.011
 importantly-0.010
reetings-0.010
 Grace-0.010
spe-0.010
Token she
Token ID673
Feature activation+1.89e-02

Pos logit contributions
reetings+0.043
Els+0.043
 importantly+0.041
 Grace+0.040
ified+0.040
Neg logit contributions
aster-0.030
 peel-0.024
 toast-0.023
 cut-0.020
dog-0.020
Token wasn
Token ID2492
Feature activation+4.26e-02

Pos logit contributions
 Grace+0.023
Els+0.023
 importantly+0.023
ified+0.022
reetings+0.021
Neg logit contributions
aster-0.038
 peel-0.031
 toast-0.030
dog-0.029
five-0.029
Token't
Token ID470
Feature activation+1.50e-02

Pos logit contributions
 importantly+0.069
Els+0.068
 Grace+0.067
reetings+0.067
 galaxies+0.065
Neg logit contributions
aster-0.072
 peel-0.059
 toast-0.053
on-0.053
dog-0.052
Token strong
Token ID1913
Feature activation-4.84e-03

Pos logit contributions
Els+0.019
 Grace+0.019
 importantly+0.019
reetings+0.018
ified+0.018
Neg logit contributions
aster-0.030
 peel-0.025
 toast-0.024
dog-0.023
 slur-0.023
Token enough
Token ID1576
Feature activation-1.15e-02

Pos logit contributions
aster+0.009
 peel+0.008
dog+0.008
 toast+0.007
 cut+0.007
Neg logit contributions
Els-0.007
reetings-0.006
 Grace-0.006
 importantly-0.006
 galaxies-0.006
Token.
Token ID13
Feature activation-7.40e-03

Pos logit contributions
aster+0.014
 peel+0.012
 toast+0.011
 cut+0.009
dog+0.009
Neg logit contributions
Els-0.024
 Grace-0.023
reetings-0.023
 importantly-0.022
ified-0.022
Token
Token ID220
Feature activation+2.88e-04

Pos logit contributions
aster+0.013
 peel+0.011
 toast+0.010
dog+0.010
 slur+0.009
Neg logit contributions
reetings-0.011
 importantly-0.011
Els-0.011
 Grace-0.010
ified-0.010
Token was
Token ID373
Feature activation+1.90e-02

Pos logit contributions
 Grace+0.011
 importantly+0.010
ified+0.010
Els+0.010
reetings+0.010
Neg logit contributions
aster-0.014
 peel-0.011
 toast-0.011
dog-0.010
five-0.010
Token too
Token ID1165
Feature activation-8.90e-03

Pos logit contributions
Els+0.032
reetings+0.032
 Grace+0.032
 importantly+0.031
ified+0.030
Neg logit contributions
aster-0.029
 peel-0.025
 toast-0.023
dog-0.021
 cut-0.021
Token high
Token ID1029
Feature activation-8.93e-03

Pos logit contributions
aster+0.016
 peel+0.012
 toast+0.012
dog+0.012
five+0.012
Neg logit contributions
Els-0.013
 Grace-0.013
 importantly-0.013
reetings-0.013
ified-0.012
Token and
Token ID290
Feature activation+2.44e-02

Pos logit contributions
aster+0.009
 peel+0.007
dog+0.007
 toast+0.006
ops+0.006
Neg logit contributions
Els-0.020
 importantly-0.020
 Grace-0.019
reetings-0.019
 Amanda-0.018
Token he
Token ID339
Feature activation+1.87e-02

Pos logit contributions
Els+0.046
reetings+0.046
 importantly+0.046
 Grace+0.045
ified+0.044
Neg logit contributions
aster-0.034
 peel-0.027
 toast-0.025
dog-0.024
 slur-0.023
Token wasn
Token ID2492
Feature activation+4.26e-02

Pos logit contributions
 Grace+0.021
Els+0.021
 importantly+0.020
ified+0.020
reetings+0.019
Neg logit contributions
aster-0.039
 peel-0.032
 toast-0.032
five-0.031
dog-0.030
Token't
Token ID470
Feature activation+1.49e-02

Pos logit contributions
 importantly+0.065
 Grace+0.064
Els+0.063
reetings+0.062
 galaxies+0.059
Neg logit contributions
aster-0.076
 peel-0.063
 toast-0.057
dog-0.056
on-0.055
Token sure
Token ID1654
Feature activation+4.48e-03

Pos logit contributions
 Grace+0.022
Els+0.021
 importantly+0.021
reetings+0.021
ified+0.020
Neg logit contributions
aster-0.027
 peel-0.022
 toast-0.021
dog-0.020
 slur-0.020
Token how
Token ID703
Feature activation-5.19e-04

Pos logit contributions
 Grace+0.008
Els+0.008
ified+0.008
 Grac+0.007
 Beat+0.007
Neg logit contributions
aster-0.007
 peel-0.006
 toast-0.005
on-0.005
 cone-0.005
Token he
Token ID339
Feature activation-3.24e-03

Pos logit contributions
aster+0.001
 peel+0.001
 toast+0.001
 slur+0.001
dog+0.000
Neg logit contributions
Els-0.001
 Grace-0.001
ified-0.001
 importantly-0.001
reetings-0.001
Token could
Token ID714
Feature activation+7.50e-03

Pos logit contributions
aster+0.006
 toast+0.005
 peel+0.005
dog+0.004
 slur+0.004
Neg logit contributions
 Grace-0.005
Els-0.005
 importantly-0.004
ified-0.004
reetings-0.004
Token all
Token ID477
Feature activation+6.15e-03

Pos logit contributions
 importantly+0.026
reetings+0.026
Els+0.025
 Grace+0.025
ified+0.024
Neg logit contributions
aster-0.020
 peel-0.016
 toast-0.015
dog-0.014
 slur-0.014
Token her
Token ID607
Feature activation+9.33e-03

Pos logit contributions
 Grace+0.010
Els+0.010
 importantly+0.010
reetings+0.010
ified+0.010
Neg logit contributions
aster-0.010
 peel-0.008
 toast-0.007
dog-0.007
 slur-0.007
Token strength
Token ID4202
Feature activation+1.18e-02

Pos logit contributions
 Grace+0.013
Els+0.013
 importantly+0.013
reetings+0.013
ified+0.012
Neg logit contributions
aster-0.017
 peel-0.014
 toast-0.014
dog-0.013
 slur-0.013
Token but
Token ID475
Feature activation-8.31e-04

Pos logit contributions
Els+0.020
 Grace+0.020
 importantly+0.020
reetings+0.019
ified+0.019
Neg logit contributions
aster-0.018
 peel-0.015
 toast-0.014
dog-0.013
 slur-0.013
Token it
Token ID340
Feature activation+2.33e-03

Pos logit contributions
aster+0.002
 peel+0.001
dog+0.001
 toast+0.001
 slur+0.001
Neg logit contributions
 Grace-0.001
 importantly-0.001
Els-0.001
 ago-0.001
ified-0.001
Token wasn
Token ID2492
Feature activation+4.25e-02

Pos logit contributions
 Grace+0.003
Els+0.003
 importantly+0.003
ified+0.003
reetings+0.003
Neg logit contributions
aster-0.004
 peel-0.003
 toast-0.003
dog-0.003
five-0.003
Token't
Token ID470
Feature activation+1.03e-02

Pos logit contributions
 importantly+0.078
Els+0.076
 Grace+0.074
reetings+0.074
 galaxies+0.074
Neg logit contributions
aster-0.065
 peel-0.052
on-0.045
 toast-0.043
dog-0.043
Token enough
Token ID1576
Feature activation-1.45e-02

Pos logit contributions
Els+0.016
 importantly+0.016
 Grace+0.016
reetings+0.016
ifying+0.015
Neg logit contributions
aster-0.017
 peel-0.014
 toast-0.014
dog-0.013
 cut-0.013
Token.
Token ID13
Feature activation-1.72e-03

Pos logit contributions
aster+0.018
 peel+0.015
 toast+0.014
 cut+0.012
 slur+0.011
Neg logit contributions
Els-0.031
 Grace-0.030
reetings-0.029
ified-0.028
 importantly-0.028
Token
Token ID220
Feature activation+2.04e-03

Pos logit contributions
aster+0.002
 peel+0.002
on+0.002
 toast+0.002
dog+0.002
Neg logit contributions
Els-0.003
reetings-0.003
ified-0.003
 importantly-0.003
 insepar-0.003
Token
Token ID198
Feature activation-6.63e-03

Pos logit contributions
 importantly+0.004
Els+0.004
ified+0.004
 aliens+0.004
ifying+0.004
Neg logit contributions
aster-0.003
 peel-0.002
on-0.002
ops-0.002
 toast-0.002
Token a
Token ID257
Feature activation-1.33e-03

Pos logit contributions
 Grace+0.005
Els+0.005
 importantly+0.005
reetings+0.005
ified+0.005
Neg logit contributions
aster-0.006
 peel-0.005
 toast-0.005
dog-0.005
 slur-0.005
Token vine
Token ID17793
Feature activation+2.45e-03

Pos logit contributions
aster+0.002
 peel+0.002
 toast+0.002
 cone+0.002
 cut+0.002
Neg logit contributions
ified-0.002
 importantly-0.002
reetings-0.002
 Grace-0.002
Els-0.002
Token,
Token ID11
Feature activation+3.43e-03

Pos logit contributions
 Grace+0.004
 importantly+0.004
Els+0.004
reetings+0.004
ified+0.004
Neg logit contributions
aster-0.004
 peel-0.003
 toast-0.003
dog-0.003
 slur-0.003
Token but
Token ID475
Feature activation+7.53e-04

Pos logit contributions
reetings+0.006
Els+0.006
 Grace+0.006
 importantly+0.006
ified+0.005
Neg logit contributions
aster-0.006
 peel-0.005
 toast-0.004
dog-0.004
 slur-0.004
Token it
Token ID340
Feature activation+1.51e-03

Pos logit contributions
 Grace+0.001
 importantly+0.001
Els+0.001
reetings+0.001
ified+0.001
Neg logit contributions
aster-0.001
 peel-0.001
 toast-0.001
dog-0.001
 slur-0.001
Token wasn
Token ID2492
Feature activation+4.24e-02

Pos logit contributions
 Grace+0.002
Els+0.002
ified+0.002
 importantly+0.002
spe+0.002
Neg logit contributions
aster-0.003
 toast-0.002
five-0.002
 peel-0.002
dog-0.002
Token't
Token ID470
Feature activation+1.32e-03

Pos logit contributions
 importantly+0.077
Els+0.074
 Grace+0.073
reetings+0.073
 galaxies+0.071
Neg logit contributions
aster-0.066
 peel-0.055
on-0.047
 toast-0.045
dog-0.044
Token strong
Token ID1913
Feature activation-6.31e-03

Pos logit contributions
Els+0.002
 Grace+0.002
 importantly+0.002
reetings+0.002
ified+0.002
Neg logit contributions
aster-0.002
 peel-0.002
 toast-0.002
dog-0.002
 cut-0.002
Token enough
Token ID1576
Feature activation-8.20e-03

Pos logit contributions
aster+0.011
 peel+0.010
dog+0.009
 toast+0.009
 cut+0.009
Neg logit contributions
Els-0.010
 Grace-0.010
reetings-0.009
 importantly-0.009
Mill-0.009
Token.
Token ID13
Feature activation-7.11e-04

Pos logit contributions
aster+0.011
 peel+0.009
 toast+0.008
dog+0.008
 slur+0.007
Neg logit contributions
Els-0.016
 Grace-0.016
reetings-0.015
 importantly-0.015
ified-0.015
Token He
Token ID679
Feature activation+2.97e-03

Pos logit contributions
aster+0.001
 peel+0.001
 toast+0.001
dog+0.001
 slur+0.001
Neg logit contributions
Els-0.001
reetings-0.001
 importantly-0.001
 Grace-0.001
ified-0.001
Token aisle
Token ID28960
Feature activation+3.21e-02

Pos logit contributions
 importantly+0.011
reetings+0.011
ified+0.010
Els+0.010
 Grace+0.010
Neg logit contributions
aster-0.007
 peel-0.006
 toast-0.006
 cone-0.005
 man-0.005
Token with
Token ID351
Feature activation-3.32e-04

Pos logit contributions
Els+0.070
reetings+0.068
 importantly+0.068
 Grace+0.064
 Isn+0.062
Neg logit contributions
aster-0.034
 peel-0.028
 toast-0.027
 cone-0.024
 cut-0.023
Token the
Token ID262
Feature activation+1.01e-02

Pos logit contributions
aster+0.001
 peel+0.000
 toast+0.000
dog+0.000
 slur+0.000
Neg logit contributions
 importantly-0.001
Els-0.001
reetings-0.001
 Grace-0.001
ified-0.001
Token other
Token ID584
Feature activation+1.17e-02

Pos logit contributions
reetings+0.019
 importantly+0.018
Els+0.018
ified+0.018
 Grace+0.017
Neg logit contributions
aster-0.014
 toast-0.012
 peel-0.012
dog-0.010
 slur-0.010
Token flower
Token ID15061
Feature activation+1.79e-02

Pos logit contributions
reetings+0.019
 importantly+0.019
Els+0.019
ified+0.019
 Grace+0.018
Neg logit contributions
aster-0.018
 toast-0.016
 peel-0.014
dog-0.013
five-0.013
Token girls
Token ID4813
Feature activation+4.23e-02

Pos logit contributions
reetings+0.033
 importantly+0.033
Els+0.032
ified+0.031
 Beat+0.030
Neg logit contributions
aster-0.025
 toast-0.023
 peel-0.021
 cut-0.019
 cone-0.019
Token,"
Token ID553
Feature activation-1.08e-02

Pos logit contributions
Els+0.083
reetings+0.082
 importantly+0.080
ified+0.076
 Beat+0.075
Neg logit contributions
aster-0.051
 toast-0.045
 peel-0.043
 cut-0.041
 cone-0.038
Token her
Token ID607
Feature activation+7.68e-03

Pos logit contributions
aster+0.019
at+0.017
 peel+0.017
ops+0.017
atmeal+0.017
Neg logit contributions
 Grace-0.016
 Amanda-0.014
 Beth-0.014
 Grac-0.014
 May-0.013
Token mom
Token ID1995
Feature activation-7.12e-03

Pos logit contributions
 Grace+0.015
 importantly+0.015
Els+0.015
reetings+0.014
ified+0.014
Neg logit contributions
aster-0.010
 peel-0.008
 toast-0.007
dog-0.007
 slur-0.007
Token said
Token ID531
Feature activation-9.37e-03

Pos logit contributions
aster+0.010
 peel+0.008
 toast+0.007
Ble+0.007
five+0.007
Neg logit contributions
 Grace-0.013
 importantly-0.012
ified-0.012
Els-0.012
 Amanda-0.012
Token.
Token ID13
Feature activation+7.17e-03

Pos logit contributions
aster+0.015
 peel+0.012
 toast+0.011
dog+0.011
ops+0.011
Neg logit contributions
 Grace-0.015
 importantly-0.015
ified-0.015
Els-0.014
 ago-0.014
Token "
Token ID366
Feature activation-3.93e-03

Pos logit contributions
reetings+0.013
Els+0.013
 importantly+0.013
ified+0.013
 Grace+0.012
Neg logit contributions
aster-0.014
 peel-0.012
 toast-0.011
dog-0.011
 slur-0.011
TokenLet
Token ID5756
Feature activation+8.73e-04

Pos logit contributions
aster+0.008
 peel+0.007
 toast+0.007
 slur+0.007
dog+0.006
Neg logit contributions
Els-0.004
 Grace-0.004
reetings-0.004
ified-0.004
 importantly-0.004
Token's
Token ID338
Feature activation+4.10e-03

Pos logit contributions
 Grace+0.002
Els+0.001
ified+0.001
reetings+0.001
 importantly+0.001
Neg logit contributions
aster-0.001
 peel-0.001
 toast-0.001
dog-0.001
 slur-0.001
Token carry
Token ID3283
Feature activation+1.34e-02

Pos logit contributions
 importantly+0.010
reetings+0.009
Els+0.009
ified+0.009
 Grace+0.009
Neg logit contributions
aster-0.004
 peel-0.003
 cut-0.003
 toast-0.002
dog-0.002
Token it
Token ID340
Feature activation+2.16e-02

Pos logit contributions
 importantly+0.027
reetings+0.027
Els+0.026
ified+0.025
spe+0.025
Neg logit contributions
aster-0.018
 peel-0.013
 toast-0.013
dog-0.012
on-0.012
Token together
Token ID1978
Feature activation+4.22e-02

Pos logit contributions
 importantly+0.041
reetings+0.039
Els+0.039
 Grace+0.037
ified+0.036
Neg logit contributions
aster-0.030
 peel-0.027
 toast-0.025
 cone-0.023
 cut-0.023
Token!"
Token ID2474
Feature activation-7.18e-03

Pos logit contributions
 importantly+0.074
Els+0.072
 Lay+0.070
reetings+0.069
 Beat+0.068
Neg logit contributions
aster-0.063
 toast-0.057
 peel-0.056
 cone-0.052
 cut-0.051
Token
Token ID198
Feature activation-1.01e-02

Pos logit contributions
aster+0.017
 peel+0.014
 toast+0.014
atmeal+0.013
 slur+0.013
Neg logit contributions
 Grace-0.007
Els-0.006
 importantly-0.005
 Amanda-0.005
 Grac-0.005
Token
Token ID198
Feature activation-1.64e-03

Pos logit contributions
aster+0.021
 peel+0.018
 toast+0.017
dog+0.016
 slur+0.016
Neg logit contributions
 Grace-0.012
Els-0.012
 importantly-0.011
reetings-0.011
 Amanda-0.010
TokenThe
Token ID464
Feature activation-1.84e-03

Pos logit contributions
aster+0.003
 peel+0.003
 toast+0.003
dog+0.003
 slur+0.003
Neg logit contributions
Els-0.002
 Grace-0.002
 importantly-0.002
reetings-0.002
ified-0.002
Token obedient
Token ID41652
Feature activation-3.41e-04

Pos logit contributions
aster+0.003
 peel+0.002
dog+0.002
 cut+0.002
 man+0.002
Neg logit contributions
ifying-0.003
Els-0.003
ified-0.003
 importantly-0.003
reetings-0.003
Token seen
Token ID1775
Feature activation-1.33e-02

Pos logit contributions
 Grace+0.002
 importantly+0.002
Els+0.002
reetings+0.002
ified+0.002
Neg logit contributions
aster-0.002
 peel-0.002
dog-0.002
 toast-0.002
 slur-0.002
Token it
Token ID340
Feature activation-1.57e-02

Pos logit contributions
aster+0.020
 peel+0.016
 toast+0.015
dog+0.014
 slur+0.014
Neg logit contributions
reetings-0.023
 importantly-0.023
Els-0.023
 Grace-0.022
ified-0.022
Token,
Token ID11
Feature activation+3.04e-03

Pos logit contributions
aster+0.023
 peel+0.019
 toast+0.018
dog+0.017
ops+0.016
Neg logit contributions
Els-0.027
 importantly-0.027
 Grace-0.026
reetings-0.026
ified-0.025
Token but
Token ID475
Feature activation-1.87e-03

Pos logit contributions
reetings+0.005
Els+0.005
ified+0.004
 importantly+0.004
spe+0.004
Neg logit contributions
aster-0.005
 peel-0.004
 toast-0.004
dog-0.004
 slur-0.004
Token they
Token ID484
Feature activation+1.06e-02

Pos logit contributions
aster+0.002
 peel+0.002
 toast+0.002
dog+0.001
 cut+0.001
Neg logit contributions
reetings-0.004
Els-0.004
 importantly-0.004
ified-0.004
 rooft-0.004
Token hadn
Token ID8020
Feature activation+4.22e-02

Pos logit contributions
 Grace+0.016
 importantly+0.016
Els+0.015
ified+0.015
 insepar+0.014
Neg logit contributions
aster-0.020
 peel-0.016
 toast-0.015
five-0.015
dog-0.015
Token't
Token ID470
Feature activation-6.65e-04

Pos logit contributions
 importantly+0.068
Els+0.065
 Grace+0.064
reetings+0.062
 galaxies+0.061
Neg logit contributions
aster-0.070
 peel-0.058
on-0.053
dog-0.053
ops-0.052
Token.
Token ID13
Feature activation+7.62e-04

Pos logit contributions
aster+0.001
 peel+0.001
 toast+0.001
dog+0.001
ops+0.001
Neg logit contributions
 Grace-0.001
reetings-0.001
 importantly-0.001
Els-0.001
 ago-0.001
Token Lily
Token ID20037
Feature activation+3.30e-03

Pos logit contributions
reetings+0.001
 importantly+0.001
Els+0.001
 insepar+0.001
ified+0.001
Neg logit contributions
aster-0.001
 peel-0.001
 toast-0.001
dog-0.001
 slur-0.001
Token felt
Token ID2936
Feature activation+4.48e-03

Pos logit contributions
 Grace+0.005
ified+0.004
Els+0.004
 importantly+0.004
 insepar+0.004
Neg logit contributions
aster-0.007
 toast-0.005
 peel-0.005
five-0.005
dog-0.005
Token uncomfortable
Token ID12916
Feature activation-7.86e-03

Pos logit contributions
 importantly+0.007
 Grace+0.007
Els+0.006
ified+0.006
reetings+0.006
Neg logit contributions
aster-0.008
 peel-0.006
 toast-0.006
dog-0.006
ops-0.006
Token move
Token ID1445
Feature activation+4.98e-04

Pos logit contributions
 Grace+0.011
Els+0.011
 importantly+0.010
reetings+0.010
ified+0.010
Neg logit contributions
aster-0.008
 peel-0.006
 toast-0.006
dog-0.005
ops-0.005
Token faster
Token ID5443
Feature activation+3.13e-03

Pos logit contributions
 importantly+0.001
Els+0.001
reetings+0.001
 Grace+0.001
 ago+0.001
Neg logit contributions
aster-0.001
 peel-0.001
dog-0.001
 toast-0.000
five-0.000
Token,
Token ID11
Feature activation+9.31e-04

Pos logit contributions
Els+0.007
reetings+0.007
 Grace+0.007
 importantly+0.007
ifying+0.007
Neg logit contributions
aster-0.003
 peel-0.003
dog-0.002
 man-0.002
five-0.002
Token but
Token ID475
Feature activation+2.61e-03

Pos logit contributions
Els+0.001
reetings+0.001
 importantly+0.001
 Grace+0.001
ified+0.001
Neg logit contributions
aster-0.002
 peel-0.001
 toast-0.001
dog-0.001
 slur-0.001
Token it
Token ID340
Feature activation-1.93e-04

Pos logit contributions
 Grace+0.004
 importantly+0.004
Els+0.004
reetings+0.004
ified+0.004
Neg logit contributions
aster-0.005
 peel-0.004
 toast-0.004
dog-0.004
 slur-0.003
Token wasn
Token ID2492
Feature activation+4.22e-02

Pos logit contributions
aster+0.000
 toast+0.000
 peel+0.000
five+0.000
dog+0.000
Neg logit contributions
 Grace-0.000
Els-0.000
 importantly-0.000
ified-0.000
reetings-0.000
Token't
Token ID470
Feature activation+4.51e-03

Pos logit contributions
 importantly+0.073
Els+0.071
 Grace+0.070
reetings+0.070
spe+0.068
Neg logit contributions
aster-0.067
 peel-0.055
on-0.048
dog-0.048
 toast-0.047
Token fast
Token ID3049
Feature activation-3.00e-03

Pos logit contributions
Els+0.007
 Grace+0.007
 importantly+0.007
reetings+0.007
ifying+0.007
Neg logit contributions
aster-0.008
 peel-0.006
 toast-0.006
dog-0.006
 slur-0.006
Token enough
Token ID1576
Feature activation-1.20e-02

Pos logit contributions
aster+0.004
 peel+0.003
dog+0.003
 cut+0.003
 slur+0.002
Neg logit contributions
Els-0.006
 Grace-0.006
reetings-0.006
 importantly-0.006
Mill-0.006
Token.
Token ID13
Feature activation+4.58e-04

Pos logit contributions
aster+0.013
 peel+0.011
 toast+0.010
dog+0.009
 cut+0.009
Neg logit contributions
Els-0.027
 Grace-0.025
reetings-0.025
spe-0.024
 Grac-0.024
Token A
Token ID317
Feature activation+4.24e-03

Pos logit contributions
reetings+0.001
 importantly+0.001
Els+0.001
 Grace+0.001
 insepar+0.001
Neg logit contributions
aster-0.001
 peel-0.001
 toast-0.000
dog-0.000
 slur-0.000
Token him
Token ID683
Feature activation+1.80e-03

Pos logit contributions
aster+0.002
 peel+0.002
 toast+0.001
dog+0.001
 slur+0.001
Neg logit contributions
 Grace-0.005
 importantly-0.005
Els-0.005
reetings-0.005
ified-0.005
Token,
Token ID11
Feature activation+9.46e-03

Pos logit contributions
 importantly+0.003
ified+0.003
Els+0.003
ifying+0.003
 rooft+0.003
Neg logit contributions
aster-0.003
 peel-0.002
 toast-0.002
dog-0.002
on-0.002
Token "
Token ID366
Feature activation+2.35e-04

Pos logit contributions
Els+0.017
ified+0.017
 galaxies+0.016
reetings+0.016
spe+0.016
Neg logit contributions
aster-0.014
 peel-0.011
dog-0.010
 toast-0.010
on-0.009
TokenDid
Token ID11633
Feature activation-7.34e-03

Pos logit contributions
 importantly+0.000
 Grace+0.000
Els+0.000
reetings+0.000
ified+0.000
Neg logit contributions
aster-0.000
 peel-0.000
 toast-0.000
dog-0.000
 slur-0.000
Token the
Token ID262
Feature activation-6.14e-03

Pos logit contributions
aster+0.012
 peel+0.009
 toast+0.009
dog+0.008
 slur+0.008
Neg logit contributions
Els-0.012
reetings-0.012
 importantly-0.012
ified-0.012
 Grace-0.012
Token hat
Token ID6877
Feature activation-4.23e-02

Pos logit contributions
aster+0.010
 peel+0.008
 toast+0.008
dog+0.008
 slur+0.008
Neg logit contributions
Els-0.010
 Grace-0.010
 importantly-0.010
ified-0.009
reetings-0.009
Token and
Token ID290
Feature activation+3.19e-03

Pos logit contributions
aster+0.061
 peel+0.047
 toast+0.044
dog+0.043
 slur+0.041
Neg logit contributions
 Grace-0.075
 importantly-0.075
Els-0.075
reetings-0.074
ified-0.072
Token shield
Token ID7614
Feature activation-1.57e-02

Pos logit contributions
ified+0.007
Els+0.007
reetings+0.007
 importantly+0.006
 insepar+0.006
Neg logit contributions
aster-0.004
 peel-0.003
 cut-0.003
 toast-0.003
 slur-0.003
Token help
Token ID1037
Feature activation-6.88e-03

Pos logit contributions
aster+0.023
 peel+0.018
 toast+0.017
dog+0.016
 slur+0.015
Neg logit contributions
Els-0.029
 importantly-0.028
 Grace-0.028
ified-0.027
reetings-0.027
Token you
Token ID345
Feature activation+3.95e-03

Pos logit contributions
aster+0.012
 peel+0.010
 toast+0.009
dog+0.009
 slur+0.009
Neg logit contributions
 Grace-0.010
Els-0.010
 importantly-0.010
reetings-0.010
ified-0.009
Token today
Token ID1909
Feature activation+7.05e-03

Pos logit contributions
 importantly+0.008
Els+0.008
ified+0.008
reetings+0.008
 Grace+0.008
Neg logit contributions
aster-0.005
 peel-0.004
 toast-0.003
 cut-0.003
dog-0.003
Token Mom
Token ID11254
Feature activation-7.94e-03

Pos logit contributions
aster+0.013
 peel+0.011
 toast+0.010
dog+0.010
 slur+0.010
Neg logit contributions
 importantly-0.013
Els-0.012
reetings-0.012
 Grace-0.012
ified-0.012
Token said
Token ID531
Feature activation+5.72e-04

Pos logit contributions
aster+0.011
 toast+0.008
 peel+0.008
ops+0.007
five+0.007
Neg logit contributions
 Grace-0.014
Els-0.014
 importantly-0.014
ified-0.014
reetings-0.013
Token,
Token ID11
Feature activation+1.95e-03

Pos logit contributions
 Grace+0.001
 importantly+0.001
Els+0.001
reetings+0.001
ified+0.001
Neg logit contributions
aster-0.001
 peel-0.001
 toast-0.001
dog-0.001
 slur-0.001
Token "
Token ID366
Feature activation-1.61e-03

Pos logit contributions
Els+0.003
 importantly+0.003
 Grace+0.003
reetings+0.003
ified+0.002
Neg logit contributions
aster-0.004
 peel-0.003
 toast-0.003
dog-0.003
 slur-0.003
TokenWhat
Token ID2061
Feature activation-8.97e-03

Pos logit contributions
aster+0.003
 peel+0.002
 toast+0.002
 slur+0.002
dog+0.002
Neg logit contributions
Els-0.002
 importantly-0.002
 Grace-0.002
reetings-0.002
ified-0.002
Token car
Token ID1097
Feature activation-4.11e-02

Pos logit contributions
aster+0.014
 peel+0.011
 toast+0.011
dog+0.010
 slur+0.010
Neg logit contributions
 Grace-0.015
Els-0.015
 importantly-0.015
reetings-0.015
ified-0.014
Token?
Token ID30
Feature activation-7.68e-03

Pos logit contributions
aster+0.068
 peel+0.054
 toast+0.053
 slur+0.051
dog+0.050
Neg logit contributions
reetings-0.063
 importantly-0.063
 Grace-0.063
Els-0.060
ified-0.059
Token I
Token ID314
Feature activation+1.25e-03

Pos logit contributions
aster+0.014
 peel+0.011
dog+0.011
on+0.011
 cone+0.010
Neg logit contributions
 insepar-0.011
Els-0.011
 importantly-0.011
 perspective-0.011
ified-0.011
Token only
Token ID691
Feature activation-6.95e-03

Pos logit contributions
 Grace+0.002
 importantly+0.002
Els+0.002
reetings+0.002
ified+0.002
Neg logit contributions
aster-0.002
 peel-0.002
 toast-0.002
dog-0.002
 slur-0.002
Token see
Token ID766
Feature activation-2.03e-03

Pos logit contributions
aster+0.010
 peel+0.008
dog+0.007
 toast+0.007
 cut+0.007
Neg logit contributions
Els-0.013
 Grace-0.013
 importantly-0.013
ified-0.012
reetings-0.012
Token a
Token ID257
Feature activation-5.96e-03

Pos logit contributions
aster+0.003
 toast+0.002
 peel+0.002
dog+0.002
on+0.002
Neg logit contributions
Els-0.004
reetings-0.004
 importantly-0.004
 Grace-0.004
spe-0.004
Token
Token ID198
Feature activation-7.35e-03

Pos logit contributions
aster+0.006
 peel+0.005
 toast+0.005
dog+0.005
 slur+0.005
Neg logit contributions
 Grace-0.004
 importantly-0.004
Els-0.004
reetings-0.004
ified-0.004
Token
Token ID198
Feature activation-1.54e-03

Pos logit contributions
aster+0.015
 peel+0.013
 toast+0.012
dog+0.012
 slur+0.011
Neg logit contributions
 importantly-0.009
 Grace-0.009
Els-0.009
reetings-0.009
ified-0.008
TokenJust
Token ID5703
Feature activation+3.57e-03

Pos logit contributions
aster+0.003
 peel+0.003
 toast+0.003
dog+0.002
 slur+0.002
Neg logit contributions
 importantly-0.002
Els-0.002
reetings-0.002
 Grace-0.002
ified-0.002
Token when
Token ID618
Feature activation-5.72e-04

Pos logit contributions
 Grace+0.004
 importantly+0.004
Els+0.004
reetings+0.004
ified+0.004
Neg logit contributions
aster-0.007
 peel-0.006
 toast-0.006
dog-0.006
 slur-0.006
Token Tim
Token ID5045
Feature activation-2.32e-02

Pos logit contributions
aster+0.001
 peel+0.001
 toast+0.001
dog+0.001
 slur+0.001
Neg logit contributions
 Grace-0.001
 importantly-0.001
Els-0.001
reetings-0.001
ified-0.001
Tokenmy
Token ID1820
Feature activation-4.04e-02

Pos logit contributions
aster+0.044
 toast+0.035
 peel+0.034
 slur+0.033
five+0.033
Neg logit contributions
 Grace-0.030
reetings-0.029
Els-0.029
ified-0.029
 importantly-0.028
Token was
Token ID373
Feature activation-2.38e-02

Pos logit contributions
aster+0.075
 toast+0.063
five+0.061
awk+0.057
 peel+0.055
Neg logit contributions
ified-0.048
 Grace-0.046
 importantly-0.043
 million-0.043
Els-0.042
Token about
Token ID546
Feature activation-1.75e-02

Pos logit contributions
aster+0.048
 peel+0.040
dog+0.040
ops+0.040
 toast+0.039
Neg logit contributions
 importantly-0.027
 Grace-0.026
ified-0.025
Els-0.025
 insepar-0.025
Token to
Token ID284
Feature activation-1.44e-02

Pos logit contributions
aster+0.032
 peel+0.026
 toast+0.025
dog+0.025
 slur+0.024
Neg logit contributions
 Grace-0.024
 importantly-0.024
Els-0.023
reetings-0.023
 ago-0.022
Token give
Token ID1577
Feature activation-1.73e-02

Pos logit contributions
aster+0.020
 toast+0.016
ops+0.016
 peel+0.016
dog+0.015
Neg logit contributions
 Grace-0.025
Els-0.024
 importantly-0.024
reetings-0.023
ified-0.023
Token up
Token ID510
Feature activation-1.84e-02

Pos logit contributions
aster+0.037
ops+0.031
 peel+0.031
Ble+0.031
five+0.030
Neg logit contributions
 Grace-0.017
 Bobby-0.014
 forever-0.014
 so-0.014
 Grac-0.014
Token
Token ID198
Feature activation-6.12e-03

Pos logit contributions
aster+0.007
 peel+0.005
dog+0.005
 toast+0.005
five+0.005
Neg logit contributions
 insepar-0.010
ified-0.010
spe-0.010
iful-0.010
reetings-0.010
Token
Token ID198
Feature activation+1.49e-03

Pos logit contributions
aster+0.011
 peel+0.009
 toast+0.009
dog+0.008
 slur+0.008
Neg logit contributions
 Grace-0.009
 importantly-0.009
Els-0.009
reetings-0.009
ified-0.008
Token"
Token ID1
Feature activation-1.67e-02

Pos logit contributions
 Grace+0.002
 importantly+0.002
Els+0.002
reetings+0.002
ified+0.002
Neg logit contributions
aster-0.003
 peel-0.002
 toast-0.002
dog-0.002
 slur-0.002
TokenWhat
Token ID2061
Feature activation-1.07e-02

Pos logit contributions
aster+0.031
 peel+0.025
 toast+0.025
 slur+0.023
ops+0.023
Neg logit contributions
Els-0.024
 importantly-0.023
 Grace-0.023
reetings-0.022
ified-0.021
Token is
Token ID318
Feature activation-1.34e-02

Pos logit contributions
aster+0.017
 peel+0.014
 toast+0.013
dog+0.013
 slur+0.012
Neg logit contributions
 Grace-0.017
Els-0.017
 importantly-0.017
reetings-0.017
ified-0.017
Token it
Token ID340
Feature activation-4.02e-02

Pos logit contributions
aster+0.024
 peel+0.020
 toast+0.019
dog+0.019
 slur+0.018
Neg logit contributions
 Grace-0.019
 importantly-0.019
Els-0.019
reetings-0.018
ified-0.018
Token?"
Token ID1701
Feature activation-1.43e-02

Pos logit contributions
aster+0.069
 peel+0.058
 toast+0.056
dog+0.053
 slur+0.050
Neg logit contributions
Els-0.061
 importantly-0.060
 Grace-0.059
ified-0.059
reetings-0.057
Token Ben
Token ID3932
Feature activation-1.61e-02

Pos logit contributions
aster+0.019
 peel+0.014
 toast+0.013
dog+0.012
 slur+0.012
Neg logit contributions
reetings-0.028
Els-0.028
 importantly-0.028
 Grace-0.027
ified-0.027
Token asks
Token ID7893
Feature activation-3.07e-03

Pos logit contributions
aster+0.023
 peel+0.018
 toast+0.017
dog+0.017
 slur+0.016
Neg logit contributions
 importantly-0.029
Els-0.029
reetings-0.029
 Grace-0.029
ified-0.028
Token.
Token ID13
Feature activation+4.37e-03

Pos logit contributions
aster+0.004
 peel+0.003
 toast+0.003
dog+0.003
 slur+0.003
Neg logit contributions
 importantly-0.006
Els-0.006
 Grace-0.006
reetings-0.006
ified-0.005
Token
Token ID198
Feature activation-4.50e-03

Pos logit contributions
Els+0.006
 importantly+0.006
 Grace+0.006
reetings+0.006
ified+0.006
Neg logit contributions
aster-0.008
 peel-0.007
 toast-0.006
dog-0.006
 slur-0.006
Token never
Token ID1239
Feature activation-5.07e-03

Pos logit contributions
 importantly+0.020
reetings+0.019
 Grace+0.019
ified+0.018
Els+0.018
Neg logit contributions
aster-0.017
 peel-0.013
 toast-0.013
 cut-0.013
 slur-0.012
Token seen
Token ID1775
Feature activation-1.33e-02

Pos logit contributions
aster+0.008
 peel+0.006
 toast+0.006
dog+0.006
 slur+0.005
Neg logit contributions
Els-0.009
 Grace-0.009
 importantly-0.009
reetings-0.008
ified-0.008
Token such
Token ID884
Feature activation+2.52e-03

Pos logit contributions
aster+0.019
 toast+0.015
 peel+0.014
on+0.014
 cut+0.013
Neg logit contributions
reetings-0.024
spe-0.024
 importantly-0.023
Els-0.023
ified-0.022
Token a
Token ID257
Feature activation+4.01e-03

Pos logit contributions
reetings+0.006
 Shall+0.005
ifying+0.005
 million+0.005
Els+0.005
Neg logit contributions
 toast-0.002
aster-0.002
 cut-0.002
 peel-0.002
 l-0.002
Token wonderful
Token ID7932
Feature activation-1.08e-02

Pos logit contributions
 importantly+0.006
Els+0.006
 Grace+0.006
reetings+0.006
ified+0.006
Neg logit contributions
aster-0.007
 peel-0.006
 toast-0.005
dog-0.005
 slur-0.005
Token suit
Token ID6050
Feature activation-3.98e-02

Pos logit contributions
aster+0.017
 peel+0.013
 toast+0.013
dog+0.012
 slur+0.012
Neg logit contributions
 Grace-0.018
Els-0.018
 importantly-0.018
reetings-0.018
ified-0.017
Token before
Token ID878
Feature activation-7.50e-03

Pos logit contributions
aster+0.076
 peel+0.063
 toast+0.061
dog+0.059
 slur+0.058
Neg logit contributions
 Grace-0.052
Els-0.052
 importantly-0.052
reetings-0.051
ified-0.049
Token.
Token ID13
Feature activation+7.36e-03

Pos logit contributions
aster+0.010
 peel+0.008
 toast+0.007
dog+0.007
on+0.006
Neg logit contributions
Els-0.015
reetings-0.014
 importantly-0.014
 Grace-0.014
ified-0.014
Token She
Token ID1375
Feature activation+6.00e-03

Pos logit contributions
reetings+0.012
 importantly+0.011
Els+0.011
 aliens+0.011
spe+0.011
Neg logit contributions
aster-0.012
 peel-0.010
 toast-0.009
dog-0.009
five-0.009
Token quickly
Token ID2952
Feature activation-1.65e-03

Pos logit contributions
reetings+0.011
 importantly+0.011
Els+0.011
 Grace+0.011
ified+0.010
Neg logit contributions
aster-0.008
 peel-0.007
 toast-0.006
 slur-0.006
dog-0.006
Token put
Token ID1234
Feature activation-9.12e-04

Pos logit contributions
aster+0.002
 peel+0.002
 toast+0.002
dog+0.002
 slur+0.001
Neg logit contributions
 Grace-0.003
Els-0.003
 importantly-0.003
reetings-0.003
ified-0.003
Token went
Token ID1816
Feature activation-1.77e-03

Pos logit contributions
aster+0.016
 peel+0.012
 toast+0.011
dog+0.011
 slur+0.010
Neg logit contributions
 Grace-0.024
Els-0.024
 importantly-0.024
reetings-0.023
ified-0.023
Token home
Token ID1363
Feature activation-9.24e-03

Pos logit contributions
aster+0.003
 peel+0.003
 toast+0.002
dog+0.002
 slur+0.002
Neg logit contributions
 Grace-0.003
Els-0.003
 importantly-0.003
reetings-0.003
ified-0.002
Token and
Token ID290
Feature activation-6.39e-03

Pos logit contributions
aster+0.014
 peel+0.011
 toast+0.011
dog+0.010
 slur+0.010
Neg logit contributions
 Grace-0.016
 importantly-0.016
Els-0.016
reetings-0.015
ified-0.015
Token enjoyed
Token ID8359
Feature activation+9.98e-03

Pos logit contributions
aster+0.008
 peel+0.006
 toast+0.006
dog+0.006
 slur+0.006
Neg logit contributions
Els-0.012
 importantly-0.012
reetings-0.012
 Grace-0.012
ified-0.012
Token his
Token ID465
Feature activation-6.31e-03

Pos logit contributions
reetings+0.019
Els+0.019
ified+0.019
 Grace+0.019
 importantly+0.018
Neg logit contributions
aster-0.014
 peel-0.011
 toast-0.010
 slur-0.010
dog-0.010
Token freshly
Token ID29026
Feature activation-3.97e-02

Pos logit contributions
aster+0.010
 peel+0.008
 toast+0.007
dog+0.007
ops+0.007
Neg logit contributions
 importantly-0.010
 Grace-0.010
Els-0.010
reetings-0.010
 ago-0.009
Token caught
Token ID4978
Feature activation-3.10e-03

Pos logit contributions
aster+0.050
inches+0.043
 peel+0.041
ingle+0.041
ens+0.038
Neg logit contributions
reetings-0.058
 insepar-0.057
 importantly-0.057
Els-0.056
 Grace-0.052
Token fish
Token ID5916
Feature activation-5.69e-03

Pos logit contributions
aster+0.005
 peel+0.004
 toast+0.004
dog+0.004
 slur+0.003
Neg logit contributions
 Grace-0.005
Els-0.005
 importantly-0.005
reetings-0.005
ified-0.005
Token.
Token ID13
Feature activation+7.70e-03

Pos logit contributions
aster+0.009
 peel+0.007
 toast+0.007
dog+0.006
 slur+0.006
Neg logit contributions
 Grace-0.010
Els-0.010
 importantly-0.009
reetings-0.009
ified-0.009
Token<|endoftext|>
Token ID50256
Feature activation+6.62e-03

Pos logit contributions
reetings+0.013
Els+0.013
ified+0.012
 importantly+0.012
 Grace+0.012
Neg logit contributions
aster-0.012
 peel-0.010
 toast-0.009
dog-0.009
 slur-0.009
TokenOnce
Token ID7454
Feature activation+7.53e-04

Pos logit contributions
 importantly+0.013
Els+0.012
 insepar+0.012
spe+0.012
ified+0.012
Neg logit contributions
aster-0.008
on-0.007
dog-0.007
 peel-0.007
ops-0.006
Token crown
Token ID12389
Feature activation-1.95e-02

Pos logit contributions
aster+0.004
 peel+0.003
 toast+0.003
 slur+0.003
dog+0.003
Neg logit contributions
 importantly-0.004
reetings-0.004
Els-0.004
 Grace-0.004
ified-0.004
Token and
Token ID290
Feature activation+5.95e-03

Pos logit contributions
aster+0.035
 peel+0.027
dog+0.026
 toast+0.026
 slur+0.025
Neg logit contributions
 Grace-0.028
Els-0.027
 importantly-0.027
ified-0.026
reetings-0.026
Token playing
Token ID2712
Feature activation+5.05e-04

Pos logit contributions
reetings+0.011
ified+0.011
Els+0.011
ifying+0.011
 importantly+0.010
Neg logit contributions
aster-0.008
 toast-0.007
 peel-0.007
 slur-0.006
 cut-0.006
Token in
Token ID287
Feature activation+4.99e-03

Pos logit contributions
 Grace+0.001
Els+0.001
 importantly+0.001
reetings+0.001
ified+0.001
Neg logit contributions
aster-0.001
 peel-0.001
 toast-0.001
dog-0.001
 slur-0.001
Token the
Token ID262
Feature activation-8.09e-03

Pos logit contributions
reetings+0.011
 importantly+0.011
ified+0.011
Els+0.010
spe+0.010
Neg logit contributions
aster-0.006
 peel-0.004
 toast-0.004
 cut-0.004
on-0.004
Token freshly
Token ID29026
Feature activation-3.95e-02

Pos logit contributions
aster+0.015
 peel+0.012
ops+0.011
dog+0.011
 slur+0.011
Neg logit contributions
 Grace-0.012
Els-0.011
 importantly-0.011
 rooft-0.010
reetings-0.010
Token cut
Token ID2005
Feature activation-1.05e-02

Pos logit contributions
aster+0.060
 peel+0.047
 toast+0.046
inches+0.046
ops+0.045
Neg logit contributions
Els-0.059
 Grace-0.055
 rooft-0.053
reetings-0.053
 importantly-0.052
Token grass
Token ID8701
Feature activation-1.35e-03

Pos logit contributions
aster+0.016
 peel+0.012
 toast+0.012
dog+0.011
 slur+0.011
Neg logit contributions
 importantly-0.019
reetings-0.018
Els-0.018
 Grace-0.018
ified-0.018
Token.
Token ID13
Feature activation-8.34e-03

Pos logit contributions
aster+0.002
 peel+0.001
 toast+0.001
dog+0.001
 cone+0.001
Neg logit contributions
 importantly-0.003
reetings-0.003
Els-0.003
 Grace-0.003
ified-0.002
Token It
Token ID632
Feature activation+3.33e-04

Pos logit contributions
aster+0.016
 peel+0.014
 toast+0.013
dog+0.013
 slur+0.013
Neg logit contributions
Els-0.011
 importantly-0.011
 Grace-0.010
reetings-0.010
ified-0.010
Token looked
Token ID3114
Feature activation+3.86e-03

Pos logit contributions
 Grace+0.001
Els+0.001
 importantly+0.000
reetings+0.000
ified+0.000
Neg logit contributions
aster-0.001
 peel-0.000
 toast-0.000
dog-0.000
five-0.000
Token\n
Token ID198
Feature activation-4.48e-03

Pos logit contributions
ified+0.005
reetings+0.005
 importantly+0.005
Els+0.005
 insepar+0.005
Neg logit contributions
aster-0.004
 peel-0.003
dog-0.003
 toast-0.003
 man-0.003
Token\n
Token ID198
Feature activation+1.69e-03

Pos logit contributions
aster+0.008
 peel+0.006
 toast+0.006
dog+0.006
 slur+0.006
Neg logit contributions
 importantly-0.007
 Grace-0.007
Els-0.007
reetings-0.006
ified-0.006
Token"
Token ID1
Feature activation-5.75e-03

Pos logit contributions
 importantly+0.003
 Grace+0.003
reetings+0.003
ified+0.003
Els+0.003
Neg logit contributions
aster-0.003
 peel-0.002
 toast-0.002
dog-0.002
 slur-0.002
TokenWhat
Token ID2061
Feature activation-1.19e-02

Pos logit contributions
aster+0.009
 peel+0.007
 toast+0.007
dog+0.007
 slur+0.006
Neg logit contributions
 Grace-0.010
 importantly-0.009
Els-0.009
reetings-0.009
ified-0.009
Token is
Token ID318
Feature activation-1.36e-02

Pos logit contributions
aster+0.018
 peel+0.014
 toast+0.014
dog+0.013
on+0.012
Neg logit contributions
Els-0.021
 Grace-0.021
ified-0.020
 importantly-0.020
reetings-0.020
Token it
Token ID340
Feature activation-3.93e-02

Pos logit contributions
aster+0.023
 peel+0.019
 toast+0.018
dog+0.017
 slur+0.016
Neg logit contributions
Els-0.021
 importantly-0.021
reetings-0.021
 Grace-0.021
ified-0.021
Token,
Token ID11
Feature activation-1.09e-02

Pos logit contributions
aster+0.067
 peel+0.055
 toast+0.054
dog+0.051
 slur+0.048
Neg logit contributions
Els-0.060
 importantly-0.060
ified-0.059
 Grace-0.058
reetings-0.056
Token Tom
Token ID4186
Feature activation-6.60e-03

Pos logit contributions
aster+0.015
 toast+0.012
 peel+0.012
 slur+0.010
dog+0.010
Neg logit contributions
reetings-0.021
Els-0.021
ified-0.020
 Grace-0.020
 importantly-0.020
Token?"
Token ID1701
Feature activation-1.40e-02

Pos logit contributions
aster+0.012
 peel+0.009
 toast+0.009
dog+0.009
 cone+0.008
Neg logit contributions
Els-0.010
 importantly-0.010
 Grace-0.010
ified-0.010
ifying-0.009
Token she
Token ID673
Feature activation-1.97e-03

Pos logit contributions
aster+0.021
 peel+0.016
 toast+0.016
dog+0.015
 slur+0.014
Neg logit contributions
 Grace-0.024
Els-0.024
 importantly-0.024
reetings-0.024
ified-0.024
Token asked
Token ID1965
Feature activation-1.02e-04

Pos logit contributions
aster+0.003
 peel+0.002
 toast+0.002
dog+0.002
 slur+0.002
Neg logit contributions
reetings-0.003
Els-0.003
 importantly-0.003
ifying-0.003
 Grace-0.003
Token\n
Token ID198
Feature activation+4.26e-04

Pos logit contributions
aster+0.006
 peel+0.005
 toast+0.005
dog+0.005
five+0.004
Neg logit contributions
 insepar-0.007
reetings-0.007
ified-0.007
 importantly-0.007
Els-0.007
Token\n
Token ID198
Feature activation+4.68e-03

Pos logit contributions
 importantly+0.001
 Grace+0.001
ified+0.001
Els+0.001
reetings+0.001
Neg logit contributions
aster-0.001
 peel-0.001
 toast-0.001
dog-0.001
 slur-0.001
Token"
Token ID1
Feature activation-1.30e-02

Pos logit contributions
 Grace+0.007
 importantly+0.007
ified+0.007
reetings+0.007
Els+0.007
Neg logit contributions
aster-0.008
 peel-0.007
 toast-0.006
dog-0.006
 slur-0.006
TokenWhat
Token ID2061
Feature activation-8.29e-03

Pos logit contributions
aster+0.027
 peel+0.023
 toast+0.022
 slur+0.022
dog+0.021
Neg logit contributions
Els-0.015
 Grace-0.015
 importantly-0.014
reetings-0.014
ified-0.014
Token is
Token ID318
Feature activation-1.49e-02

Pos logit contributions
aster+0.013
 peel+0.011
 toast+0.010
dog+0.010
 slur+0.009
Neg logit contributions
Els-0.014
 Grace-0.014
 importantly-0.014
reetings-0.013
ified-0.013
Token it
Token ID340
Feature activation-3.93e-02

Pos logit contributions
aster+0.024
 peel+0.019
 toast+0.018
dog+0.017
 slur+0.017
Neg logit contributions
Els-0.024
 importantly-0.024
 Grace-0.024
reetings-0.024
ified-0.023
Token?"
Token ID1701
Feature activation-9.81e-03

Pos logit contributions
aster+0.068
 peel+0.057
 toast+0.055
dog+0.052
on+0.049
Neg logit contributions
Els-0.060
 importantly-0.059
ified-0.058
 Grace-0.057
spe-0.056
Token Lily
Token ID20037
Feature activation-2.14e-02

Pos logit contributions
aster+0.012
 peel+0.009
 toast+0.008
dog+0.008
 slur+0.008
Neg logit contributions
reetings-0.020
Els-0.019
 importantly-0.019
ified-0.019
 Grace-0.019
Token asked
Token ID1965
Feature activation-4.02e-03

Pos logit contributions
aster+0.037
 peel+0.029
 toast+0.028
dog+0.027
 slur+0.027
Neg logit contributions
 Grace-0.032
Els-0.032
 importantly-0.032
reetings-0.031
ified-0.031
Token.
Token ID13
Feature activation-1.23e-03

Pos logit contributions
aster+0.007
 peel+0.005
 toast+0.005
 slur+0.005
dog+0.005
Neg logit contributions
 Grace-0.007
Els-0.006
 importantly-0.006
ified-0.006
reetings-0.006
Token\n
Token ID198
Feature activation-7.48e-03

Pos logit contributions
aster+0.002
 peel+0.002
 toast+0.002
dog+0.002
 slur+0.002
Neg logit contributions
 importantly-0.002
Els-0.002
 Grace-0.002
reetings-0.002
ified-0.002
Token.
Token ID13
Feature activation-9.39e-04

Pos logit contributions
aster+0.005
 toast+0.004
 peel+0.004
dog+0.003
ops+0.003
Neg logit contributions
Els-0.008
reetings-0.008
 importantly-0.008
spe-0.007
ified-0.007
Token\n
Token ID198
Feature activation-5.62e-03

Pos logit contributions
aster+0.002
 peel+0.001
 toast+0.001
dog+0.001
 slur+0.001
Neg logit contributions
 importantly-0.001
reetings-0.001
ified-0.001
Els-0.001
 Grace-0.001
Token\n
Token ID198
Feature activation+1.77e-03

Pos logit contributions
aster+0.010
 peel+0.008
 toast+0.007
dog+0.007
 slur+0.007
Neg logit contributions
 importantly-0.008
 Grace-0.008
Els-0.008
reetings-0.008
ified-0.008
TokenWhen
Token ID2215
Feature activation+8.61e-03

Pos logit contributions
 Grace+0.002
 importantly+0.002
Els+0.002
reetings+0.002
ified+0.002
Neg logit contributions
aster-0.004
 peel-0.003
 toast-0.003
dog-0.003
 slur-0.003
Token Su
Token ID1778
Feature activation-7.80e-03

Pos logit contributions
 Grace+0.012
 importantly+0.012
Els+0.012
reetings+0.011
ified+0.011
Neg logit contributions
aster-0.016
 peel-0.013
 toast-0.013
dog-0.012
 slur-0.012
Tokenzy
Token ID7357
Feature activation-3.92e-02

Pos logit contributions
aster+0.014
 toast+0.012
 peel+0.011
 slur+0.011
 bread+0.011
Neg logit contributions
 Grace-0.009
ified-0.009
 Once-0.009
reetings-0.009
 Fiona-0.008
Token and
Token ID290
Feature activation-6.10e-03

Pos logit contributions
aster+0.059
 toast+0.047
 peel+0.046
dog+0.042
 slur+0.042
Neg logit contributions
 Grace-0.066
Els-0.064
ified-0.063
reetings-0.062
 importantly-0.061
Token her
Token ID607
Feature activation+4.26e-03

Pos logit contributions
aster+0.009
 peel+0.007
 toast+0.006
dog+0.006
ops+0.006
Neg logit contributions
 Grace-0.011
Els-0.011
 importantly-0.010
 Fiona-0.010
reetings-0.010
Token mum
Token ID25682
Feature activation-9.14e-03

Pos logit contributions
reetings+0.008
 importantly+0.008
Els+0.008
ified+0.008
 Grace+0.008
Neg logit contributions
aster-0.006
 peel-0.005
 toast-0.005
 slur-0.004
dog-0.004
Token stepped
Token ID10764
Feature activation+5.17e-03

Pos logit contributions
aster+0.014
 peel+0.011
 toast+0.011
dog+0.010
 slur+0.010
Neg logit contributions
 Grace-0.016
Els-0.015
ified-0.015
 importantly-0.015
reetings-0.014
Token outside
Token ID2354
Feature activation-1.05e-02

Pos logit contributions
 importantly+0.010
reetings+0.010
Els+0.009
spe+0.009
 sizes+0.009
Neg logit contributions
aster-0.007
 peel-0.006
dog-0.005
 toast-0.005
 slur-0.005
Token the
Token ID262
Feature activation-3.65e-03

Pos logit contributions
aster+0.003
 peel+0.003
ops+0.003
dog+0.003
five+0.003
Neg logit contributions
 Grace-0.003
 importantly-0.002
Els-0.002
ified-0.002
 Grac-0.002
Token sweet
Token ID6029
Feature activation-1.56e-02

Pos logit contributions
aster+0.006
 peel+0.004
dog+0.004
ops+0.004
five+0.004
Neg logit contributions
Els-0.006
 Grace-0.006
 importantly-0.006
ified-0.006
reetings-0.006
Token aroma
Token ID31242
Feature activation-4.83e-03

Pos logit contributions
aster+0.026
 peel+0.019
inches+0.018
five+0.018
ops+0.018
Neg logit contributions
 Grace-0.023
 importantly-0.022
 sizes-0.021
Els-0.021
ified-0.021
Token of
Token ID286
Feature activation-2.93e-03

Pos logit contributions
aster+0.010
inches+0.008
five+0.008
 peel+0.008
dog+0.008
Neg logit contributions
 importantly-0.005
ified-0.005
 Grace-0.005
ifying-0.005
 million-0.005
Token a
Token ID257
Feature activation-9.56e-04

Pos logit contributions
aster+0.005
 peel+0.004
dog+0.003
 toast+0.003
ops+0.003
Neg logit contributions
 Grace-0.005
 importantly-0.005
Els-0.005
ified-0.004
reetings-0.004
Token freshly
Token ID29026
Feature activation-3.91e-02

Pos logit contributions
aster+0.002
 peel+0.001
 toast+0.001
dog+0.001
ops+0.001
Neg logit contributions
 importantly-0.001
 Grace-0.001
Els-0.001
reetings-0.001
ified-0.001
Token baked
Token ID22979
Feature activation+7.73e-03

Pos logit contributions
aster+0.052
ens+0.044
inches+0.044
ingle+0.043
 itch+0.043
Neg logit contributions
Els-0.049
 importantly-0.047
reetings-0.047
 million-0.044
 few-0.044
Token cake
Token ID12187
Feature activation-9.46e-04

Pos logit contributions
 importantly+0.012
 Grace+0.012
Els+0.012
 million+0.012
 ago+0.012
Neg logit contributions
aster-0.010
inches-0.008
 peel-0.008
dog-0.008
five-0.008
Token.
Token ID13
Feature activation-1.82e-03

Pos logit contributions
aster+0.002
 peel+0.001
dog+0.001
five+0.001
 slur+0.001
Neg logit contributions
 Grace-0.001
 importantly-0.001
Els-0.001
ified-0.001
ifying-0.001
Token\n
Token ID198
Feature activation-6.35e-03

Pos logit contributions
aster+0.003
 peel+0.003
 toast+0.003
dog+0.003
 slur+0.002
Neg logit contributions
 importantly-0.003
 Grace-0.003
Els-0.003
reetings-0.003
ified-0.002
Token\n
Token ID198
Feature activation+2.65e-03

Pos logit contributions
aster+0.010
 peel+0.007
on+0.007
 toast+0.007
dog+0.007
Neg logit contributions
 importantly-0.011
ified-0.011
ifying-0.011
reetings-0.011
spe-0.011
Token\n
Token ID198
Feature activation+2.60e-03

Pos logit contributions
aster+0.014
 peel+0.011
 toast+0.010
dog+0.010
 slur+0.009
Neg logit contributions
reetings-0.017
ified-0.017
 insepar-0.016
 importantly-0.016
Els-0.016
Token\n
Token ID198
Feature activation+5.18e-03

Pos logit contributions
 importantly+0.004
ified+0.004
reetings+0.004
 Grace+0.004
Els+0.004
Neg logit contributions
aster-0.004
 peel-0.003
 toast-0.003
dog-0.003
 slur-0.003
Token"
Token ID1
Feature activation-1.16e-02

Pos logit contributions
 importantly+0.010
 Grace+0.010
ified+0.010
reetings+0.010
Els+0.010
Neg logit contributions
aster-0.007
 peel-0.005
 toast-0.005
dog-0.004
 slur-0.004
TokenWhat
Token ID2061
Feature activation-1.18e-02

Pos logit contributions
aster+0.021
 peel+0.017
 toast+0.017
 slur+0.016
dog+0.016
Neg logit contributions
Els-0.017
 importantly-0.016
 Grace-0.016
reetings-0.016
ified-0.015
Token is
Token ID318
Feature activation-1.42e-02

Pos logit contributions
aster+0.019
 peel+0.015
 toast+0.015
dog+0.014
 slur+0.013
Neg logit contributions
 Grace-0.019
Els-0.019
 importantly-0.019
reetings-0.019
ified-0.019
Token it
Token ID340
Feature activation-3.90e-02

Pos logit contributions
aster+0.022
 peel+0.017
 toast+0.017
dog+0.015
 slur+0.015
Neg logit contributions
reetings-0.025
Els-0.024
 importantly-0.024
 Grace-0.024
ified-0.024
Token,
Token ID11
Feature activation-5.42e-03

Pos logit contributions
aster+0.067
 peel+0.056
 toast+0.055
dog+0.051
 slur+0.048
Neg logit contributions
Els-0.060
 importantly-0.059
ified-0.058
 Grace-0.057
spe-0.056
Token Mom
Token ID11254
Feature activation+5.93e-04

Pos logit contributions
aster+0.006
 toast+0.005
 peel+0.005
 cone+0.004
 slur+0.004
Neg logit contributions
reetings-0.012
ified-0.011
 importantly-0.011
Els-0.011
 Grace-0.011
Token?"
Token ID1701
Feature activation-1.22e-02

Pos logit contributions
Els+0.001
ified+0.001
 importantly+0.001
 Grace+0.001
ifying+0.001
Neg logit contributions
aster-0.001
 peel-0.001
 toast-0.001
dog-0.001
 cut-0.001
Token Sara
Token ID24799
Feature activation-1.67e-02

Pos logit contributions
aster+0.016
 peel+0.011
 toast+0.011
dog+0.010
 slur+0.010
Neg logit contributions
reetings-0.025
ified-0.025
Els-0.024
 importantly-0.024
ifying-0.024
Token and
Token ID290
Feature activation-2.31e-02

Pos logit contributions
aster+0.027
 peel+0.021
 toast+0.020
dog+0.020
 slur+0.019
Neg logit contributions
 Grace-0.027
Els-0.027
 importantly-0.027
reetings-0.026
ified-0.026
Token\n
Token ID198
Feature activation+5.41e-03

Pos logit contributions
aster+0.016
 peel+0.012
 toast+0.012
dog+0.012
 cone+0.011
Neg logit contributions
ified-0.019
reetings-0.019
 insepar-0.018
 importantly-0.018
Els-0.018
Token\n
Token ID198
Feature activation+8.04e-03

Pos logit contributions
 Grace+0.008
 importantly+0.008
Els+0.008
reetings+0.008
ified+0.008
Neg logit contributions
aster-0.010
 peel-0.008
 toast-0.007
dog-0.007
 slur-0.007
Token"
Token ID1
Feature activation-1.25e-02

Pos logit contributions
 Grace+0.015
 importantly+0.015
Els+0.015
reetings+0.015
ified+0.015
Neg logit contributions
aster-0.011
 peel-0.008
 toast-0.008
dog-0.007
 slur-0.007
TokenWhat
Token ID2061
Feature activation-1.49e-02

Pos logit contributions
aster+0.023
 peel+0.018
 toast+0.018
 slur+0.017
ops+0.017
Neg logit contributions
 importantly-0.019
Els-0.019
 Grace-0.018
reetings-0.017
ified-0.017
Token is
Token ID318
Feature activation-1.45e-02

Pos logit contributions
aster+0.025
 peel+0.021
 toast+0.020
dog+0.019
 slur+0.018
Neg logit contributions
 Grace-0.023
Els-0.023
 importantly-0.023
reetings-0.022
ified-0.022
Token it
Token ID340
Feature activation-3.90e-02

Pos logit contributions
aster+0.025
 peel+0.020
 toast+0.019
dog+0.018
 slur+0.017
Neg logit contributions
Els-0.023
 Grace-0.023
reetings-0.023
 importantly-0.022
ified-0.022
Token,
Token ID11
Feature activation-7.99e-03

Pos logit contributions
aster+0.067
 peel+0.055
 toast+0.054
dog+0.051
 slur+0.048
Neg logit contributions
Els-0.060
 importantly-0.058
ified-0.058
 Grace-0.058
reetings-0.055
Token Grand
Token ID5675
Feature activation-3.90e-03

Pos logit contributions
aster+0.008
 toast+0.007
 peel+0.006
 cone+0.006
dog+0.005
Neg logit contributions
reetings-0.018
ified-0.018
Els-0.017
 importantly-0.017
 Grace-0.017
Tokenma
Token ID2611
Feature activation+9.27e-04

Pos logit contributions
aster+0.005
 peel+0.005
 slur+0.004
five+0.004
 itch+0.004
Neg logit contributions
reetings-0.007
Els-0.006
 importantly-0.006
 ago-0.006
 Grace-0.006
Token?"
Token ID1701
Feature activation-1.35e-02

Pos logit contributions
Els+0.002
 Grace+0.002
ified+0.002
 importantly+0.002
 aliens+0.001
Neg logit contributions
aster-0.002
 peel-0.001
 toast-0.001
dog-0.001
on-0.001
Token Sara
Token ID24799
Feature activation-2.35e-02

Pos logit contributions
aster+0.017
 peel+0.012
 toast+0.011
dog+0.011
 slur+0.010
Neg logit contributions
reetings-0.029
ified-0.028
 importantly-0.027
Els-0.027
ifying-0.027
Token\n
Token ID198
Feature activation-6.04e-03

Pos logit contributions
aster+0.001
 peel+0.000
dog+0.000
 toast+0.000
on+0.000
Neg logit contributions
Els-0.001
ified-0.001
reetings-0.001
 insepar-0.001
 importantly-0.001
Token\n
Token ID198
Feature activation+3.08e-03

Pos logit contributions
aster+0.010
 peel+0.007
 toast+0.007
dog+0.007
 slur+0.007
Neg logit contributions
ified-0.010
Els-0.010
 importantly-0.010
reetings-0.010
 Grace-0.010
Token"
Token ID1
Feature activation-8.24e-03

Pos logit contributions
 Grace+0.006
ified+0.006
 insepar+0.006
 importantly+0.006
Els+0.006
Neg logit contributions
aster-0.004
 peel-0.003
dog-0.002
 toast-0.002
five-0.002
TokenWhat
Token ID2061
Feature activation-1.02e-02

Pos logit contributions
aster+0.016
 peel+0.013
 toast+0.013
ops+0.012
 slur+0.012
Neg logit contributions
Els-0.011
 importantly-0.011
 Grace-0.011
reetings-0.010
 Shall-0.010
Token is
Token ID318
Feature activation-1.36e-02

Pos logit contributions
aster+0.017
 peel+0.014
 toast+0.013
dog+0.012
 slur+0.012
Neg logit contributions
Els-0.016
 Grace-0.016
 importantly-0.016
reetings-0.016
ified-0.015
Token it
Token ID340
Feature activation-3.89e-02

Pos logit contributions
aster+0.023
 peel+0.018
 toast+0.018
dog+0.017
 slur+0.016
Neg logit contributions
Els-0.021
reetings-0.021
 Grace-0.021
 importantly-0.021
ified-0.021
Token,
Token ID11
Feature activation-8.99e-03

Pos logit contributions
aster+0.065
 peel+0.054
 toast+0.053
dog+0.049
 cut+0.046
Neg logit contributions
Els-0.062
ified-0.060
 importantly-0.060
 Grace-0.059
spe-0.058
Token Max
Token ID5436
Feature activation-2.14e-04

Pos logit contributions
aster+0.011
 peel+0.009
 toast+0.009
dog+0.007
 cone+0.007
Neg logit contributions
Els-0.019
reetings-0.018
ified-0.018
 importantly-0.018
 Grace-0.017
Token?"
Token ID1701
Feature activation-1.53e-02

Pos logit contributions
aster+0.000
 peel+0.000
 toast+0.000
dog+0.000
 cut+0.000
Neg logit contributions
Els-0.000
 importantly-0.000
ified-0.000
 Grace-0.000
ifying-0.000
Token Lily
Token ID20037
Feature activation-1.51e-02

Pos logit contributions
aster+0.023
 peel+0.017
 toast+0.015
dog+0.015
 cut+0.015
Neg logit contributions
ified-0.030
Els-0.029
reetings-0.029
 insepar-0.029
ifying-0.029
Token asked
Token ID1965
Feature activation-5.71e-03

Pos logit contributions
aster+0.025
 peel+0.020
 toast+0.019
dog+0.018
 slur+0.018
Neg logit contributions
 Grace-0.024
 importantly-0.024
Els-0.024
reetings-0.023
ified-0.023
Token They
Token ID1119
Feature activation-2.93e-03

Pos logit contributions
reetings+0.005
Els+0.004
 insepar+0.004
 importantly+0.004
ified+0.004
Neg logit contributions
aster-0.004
 peel-0.003
dog-0.003
five-0.003
 slur-0.003
Token have
Token ID423
Feature activation+1.12e-02

Pos logit contributions
aster+0.004
 peel+0.003
 toast+0.003
 slur+0.003
dog+0.003
Neg logit contributions
 importantly-0.006
reetings-0.006
Els-0.006
 Grace-0.006
ified-0.005
Token never
Token ID1239
Feature activation-1.79e-02

Pos logit contributions
 importantly+0.019
reetings+0.019
Els+0.019
 Grace+0.019
ified+0.019
Neg logit contributions
aster-0.016
 peel-0.013
 toast-0.013
 cut-0.012
 slur-0.012
Token heard
Token ID2982
Feature activation-6.91e-04

Pos logit contributions
aster+0.028
 peel+0.022
 toast+0.021
dog+0.020
 slur+0.020
Neg logit contributions
 Grace-0.030
 importantly-0.030
Els-0.030
reetings-0.030
ified-0.028
Token a
Token ID257
Feature activation-9.09e-03

Pos logit contributions
aster+0.001
 peel+0.001
 toast+0.001
dog+0.001
 slur+0.001
Neg logit contributions
Els-0.001
reetings-0.001
 importantly-0.001
 Grace-0.001
ified-0.001
Token bird
Token ID6512
Feature activation-3.89e-02

Pos logit contributions
aster+0.014
 peel+0.011
 toast+0.010
dog+0.010
 slur+0.010
Neg logit contributions
 importantly-0.016
Els-0.016
 Grace-0.015
reetings-0.015
ified-0.015
Token talk
Token ID1561
Feature activation-1.28e-02

Pos logit contributions
aster+0.060
 peel+0.046
 toast+0.046
ops+0.044
five+0.044
Neg logit contributions
 Grace-0.062
 importantly-0.060
Els-0.059
ified-0.058
reetings-0.057
Token before
Token ID878
Feature activation-8.83e-03

Pos logit contributions
aster+0.020
 peel+0.016
 toast+0.015
dog+0.015
on+0.014
Neg logit contributions
Els-0.022
 importantly-0.022
 Grace-0.021
reetings-0.021
 insepar-0.020
Token.
Token ID13
Feature activation-6.62e-04

Pos logit contributions
aster+0.010
 toast+0.007
 peel+0.007
dog+0.006
 slur+0.006
Neg logit contributions
Els-0.020
reetings-0.019
 importantly-0.019
ified-0.018
spe-0.018
Token They
Token ID1119
Feature activation-8.70e-04

Pos logit contributions
aster+0.001
 peel+0.001
dog+0.001
 toast+0.001
 slur+0.001
Neg logit contributions
reetings-0.001
Els-0.001
 insepar-0.001
ified-0.001
 importantly-0.001
Token think
Token ID892
Feature activation-3.67e-03

Pos logit contributions
aster+0.001
 peel+0.001
 slur+0.001
 toast+0.001
dog+0.001
Neg logit contributions
 importantly-0.002
reetings-0.002
Els-0.002
 Grace-0.002
ified-0.002
Token\n
Token ID198
Feature activation+1.35e-03

Pos logit contributions
aster+0.007
 peel+0.005
 toast+0.005
dog+0.005
 slur+0.005
Neg logit contributions
reetings-0.007
Els-0.007
ified-0.007
 Grace-0.007
 importantly-0.007
Token\n
Token ID198
Feature activation+6.38e-03

Pos logit contributions
 Grace+0.002
 importantly+0.002
Els+0.002
ified+0.002
reetings+0.002
Neg logit contributions
aster-0.002
 peel-0.002
 toast-0.002
dog-0.002
 slur-0.002
Token"
Token ID1
Feature activation-1.43e-02

Pos logit contributions
 Grace+0.010
 importantly+0.009
Els+0.009
reetings+0.009
ified+0.009
Neg logit contributions
aster-0.011
 peel-0.009
 toast-0.009
dog-0.009
 slur-0.008
TokenWhat
Token ID2061
Feature activation-8.21e-03

Pos logit contributions
aster+0.031
 peel+0.026
 toast+0.026
 slur+0.025
dog+0.025
Neg logit contributions
Els-0.016
 Grace-0.015
 importantly-0.015
reetings-0.015
ified-0.014
Token is
Token ID318
Feature activation-1.38e-02

Pos logit contributions
aster+0.013
 peel+0.011
 toast+0.010
dog+0.010
 slur+0.010
Neg logit contributions
 importantly-0.013
 Grace-0.013
Els-0.013
reetings-0.013
ified-0.013
Token it
Token ID340
Feature activation-3.89e-02

Pos logit contributions
aster+0.023
 peel+0.019
 toast+0.018
dog+0.018
 slur+0.018
Neg logit contributions
 Grace-0.021
 importantly-0.021
Els-0.021
reetings-0.020
ified-0.020
Token?"
Token ID1701
Feature activation-1.14e-02

Pos logit contributions
aster+0.071
 peel+0.059
 toast+0.057
dog+0.055
 slur+0.052
Neg logit contributions
Els-0.055
 importantly-0.055
 Grace-0.054
ified-0.053
reetings-0.052
Token Lily
Token ID20037
Feature activation-1.87e-02

Pos logit contributions
aster+0.014
 peel+0.010
 toast+0.010
dog+0.009
 slur+0.009
Neg logit contributions
Els-0.023
 importantly-0.023
 Grace-0.023
reetings-0.023
ified-0.022
Token asked
Token ID1965
Feature activation-6.43e-03

Pos logit contributions
aster+0.035
 peel+0.027
 toast+0.027
five+0.026
dog+0.026
Neg logit contributions
 Grace-0.026
Els-0.026
 importantly-0.026
ified-0.025
 ago-0.024
Token.
Token ID13
Feature activation-2.28e-03

Pos logit contributions
aster+0.011
 peel+0.009
 toast+0.009
 slur+0.009
dog+0.008
Neg logit contributions
 Grace-0.010
 importantly-0.009
ified-0.009
Els-0.009
 ago-0.009
Token\n
Token ID198
Feature activation-8.26e-03

Pos logit contributions
aster+0.004
 peel+0.004
 toast+0.003
dog+0.003
 slur+0.003
Neg logit contributions
 Grace-0.003
Els-0.003
 importantly-0.003
reetings-0.003
ified-0.003
TokenBut
Token ID1537
Feature activation+4.21e-03

Pos logit contributions
Els+0.007
 importantly+0.007
 Grace+0.007
reetings+0.007
ified+0.007
Neg logit contributions
aster-0.012
 peel-0.010
 toast-0.009
dog-0.009
 slur-0.009
Token what
Token ID644
Feature activation-6.56e-03

Pos logit contributions
Els+0.007
reetings+0.007
 importantly+0.006
spe+0.006
ified+0.006
Neg logit contributions
aster-0.007
 peel-0.006
 toast-0.006
 slur-0.005
dog-0.005
Token caught
Token ID4978
Feature activation-1.46e-02

Pos logit contributions
aster+0.011
 peel+0.009
 toast+0.009
dog+0.009
 slur+0.008
Neg logit contributions
 Grace-0.010
 importantly-0.010
Els-0.010
reetings-0.010
ified-0.009
Token Emily
Token ID17608
Feature activation+3.82e-03

Pos logit contributions
aster+0.025
 peel+0.020
 toast+0.020
dog+0.018
 slur+0.018
Neg logit contributions
Els-0.022
 importantly-0.022
reetings-0.022
 Grace-0.022
ified-0.021
Token's
Token ID338
Feature activation-1.47e-02

Pos logit contributions
Els+0.007
 importantly+0.007
reetings+0.007
 Grace+0.007
ified+0.007
Neg logit contributions
aster-0.005
 peel-0.004
 toast-0.004
dog-0.004
 cut-0.004
Token eye
Token ID4151
Feature activation-3.88e-02

Pos logit contributions
aster+0.025
 peel+0.020
 toast+0.019
dog+0.018
 slur+0.018
Neg logit contributions
 Grace-0.023
 importantly-0.022
Els-0.022
reetings-0.022
ified-0.021
Token was
Token ID373
Feature activation-2.65e-02

Pos logit contributions
aster+0.064
 peel+0.052
 toast+0.050
dog+0.047
 slur+0.044
Neg logit contributions
Els-0.063
 importantly-0.062
 Grace-0.061
reetings-0.060
ified-0.059
Token the
Token ID262
Feature activation-3.42e-03

Pos logit contributions
aster+0.047
 peel+0.039
 toast+0.037
dog+0.035
 cut+0.034
Neg logit contributions
Els-0.039
 importantly-0.039
 Grace-0.038
ified-0.037
reetings-0.037
Token strange
Token ID6283
Feature activation-1.16e-02

Pos logit contributions
aster+0.005
 peel+0.005
 toast+0.005
dog+0.004
 cut+0.004
Neg logit contributions
 importantly-0.005
reetings-0.005
ified-0.005
Els-0.005
 Grace-0.005
Token little
Token ID1310
Feature activation-3.65e-03

Pos logit contributions
aster+0.016
 toast+0.014
 peel+0.013
dog+0.012
 cut+0.011
Neg logit contributions
Els-0.021
 importantly-0.021
ified-0.020
reetings-0.020
ifying-0.020
Token trees
Token ID7150
Feature activation-1.11e-02

Pos logit contributions
aster+0.005
 peel+0.004
 toast+0.004
dog+0.004
 slur+0.004
Neg logit contributions
 importantly-0.006
reetings-0.006
Els-0.006
ified-0.006
 Grace-0.006
Token\n
Token ID198
Feature activation-8.07e-03

Pos logit contributions
aster+0.013
 peel+0.010
dog+0.009
five+0.008
 man+0.008
Neg logit contributions
 insepar-0.026
ified-0.025
iful-0.025
reetings-0.024
ifying-0.024
Token\n
Token ID198
Feature activation-1.58e-03

Pos logit contributions
aster+0.014
 peel+0.012
 toast+0.011
dog+0.011
 slur+0.011
Neg logit contributions
 Grace-0.012
 importantly-0.012
Els-0.012
reetings-0.011
ified-0.011
Token"
Token ID1
Feature activation-1.53e-02

Pos logit contributions
aster+0.003
 peel+0.002
 toast+0.002
dog+0.002
five+0.002
Neg logit contributions
 Grace-0.003
ified-0.003
 importantly-0.003
 insepar-0.002
Els-0.002
TokenWhat
Token ID2061
Feature activation-9.44e-03

Pos logit contributions
aster+0.028
 peel+0.022
 toast+0.022
 slur+0.021
dog+0.020
Neg logit contributions
Els-0.023
 importantly-0.022
 Grace-0.022
reetings-0.022
ified-0.021
Token is
Token ID318
Feature activation-1.29e-02

Pos logit contributions
aster+0.015
 peel+0.012
 toast+0.012
dog+0.011
 slur+0.011
Neg logit contributions
Els-0.015
 Grace-0.015
 importantly-0.015
reetings-0.015
ified-0.015
Token it
Token ID340
Feature activation-3.85e-02

Pos logit contributions
aster+0.021
 peel+0.017
 toast+0.017
dog+0.016
 slur+0.015
Neg logit contributions
Els-0.021
reetings-0.021
 importantly-0.021
ified-0.020
 Grace-0.020
Token?"
Token ID1701
Feature activation-1.57e-02

Pos logit contributions
aster+0.069
 peel+0.057
 toast+0.056
dog+0.053
 slur+0.051
Neg logit contributions
Els-0.056
 importantly-0.055
 Grace-0.054
ified-0.054
reetings-0.052
Token he
Token ID339
Feature activation-1.56e-02

Pos logit contributions
aster+0.019
 peel+0.014
 toast+0.013
dog+0.012
 slur+0.012
Neg logit contributions
reetings-0.032
Els-0.032
 importantly-0.032
ified-0.031
 Grace-0.031
Token asked
Token ID1965
Feature activation+1.10e-03

Pos logit contributions
aster+0.024
 peel+0.019
 toast+0.018
dog+0.017
 slur+0.017
Neg logit contributions
reetings-0.026
Els-0.026
 importantly-0.026
 Grace-0.026
ified-0.025
Token.
Token ID13
Feature activation-5.94e-04

Pos logit contributions
 importantly+0.002
Els+0.002
 Grace+0.002
reetings+0.002
ified+0.002
Neg logit contributions
aster-0.002
 peel-0.001
 toast-0.001
dog-0.001
 slur-0.001
Token
Token ID198
Feature activation-4.94e-03

Pos logit contributions
aster+0.001
 peel+0.001
dog+0.001
 toast+0.001
 slur+0.001
Neg logit contributions
Els-0.001
 importantly-0.001
reetings-0.001
 Grace-0.001
ified-0.001
Token her
Token ID607
Feature activation-2.43e-03

Pos logit contributions
aster+0.014
 peel+0.011
dog+0.010
 toast+0.010
 slur+0.010
Neg logit contributions
 Grace-0.018
Els-0.017
 importantly-0.017
reetings-0.017
ified-0.016
Token clock
Token ID8801
Feature activation-1.60e-02

Pos logit contributions
aster+0.004
ops+0.003
 peel+0.003
dog+0.003
 slur+0.003
Neg logit contributions
 Grace-0.004
 importantly-0.004
Els-0.004
 aliens-0.004
 Amanda-0.004
Token what
Token ID644
Feature activation-1.23e-02

Pos logit contributions
aster+0.026
 peel+0.021
 toast+0.021
dog+0.020
 slur+0.019
Neg logit contributions
Els-0.025
 importantly-0.025
 Grace-0.025
reetings-0.024
ified-0.024
Token time
Token ID640
Feature activation-1.36e-02

Pos logit contributions
aster+0.022
 peel+0.019
 slur+0.018
ops+0.017
five+0.017
Neg logit contributions
 Grace-0.018
 importantly-0.017
Els-0.016
 Amanda-0.015
 insepar-0.015
Token it
Token ID340
Feature activation-3.23e-02

Pos logit contributions
aster+0.017
 toast+0.013
 peel+0.012
on+0.010
dog+0.010
Neg logit contributions
Els-0.029
reetings-0.028
ified-0.028
 importantly-0.027
ifying-0.027
Token was
Token ID373
Feature activation-3.85e-02

Pos logit contributions
aster+0.068
 peel+0.053
 slur+0.053
 toast+0.053
five+0.053
Neg logit contributions
 Grace-0.039
reetings-0.037
 importantly-0.035
Els-0.033
ified-0.033
Token,
Token ID11
Feature activation+5.25e-03

Pos logit contributions
aster+0.059
 peel+0.046
 toast+0.044
dog+0.042
 slur+0.041
Neg logit contributions
Els-0.066
 Grace-0.066
 importantly-0.065
reetings-0.064
ified-0.063
Token and
Token ID290
Feature activation-4.71e-03

Pos logit contributions
reetings+0.008
Els+0.008
ified+0.008
 importantly+0.008
 Grace+0.008
Neg logit contributions
aster-0.009
 peel-0.008
 toast-0.007
dog-0.007
 slur-0.007
Token the
Token ID262
Feature activation+1.29e-03

Pos logit contributions
aster+0.006
 peel+0.004
 toast+0.004
dog+0.004
 slur+0.004
Neg logit contributions
reetings-0.010
Els-0.010
 importantly-0.010
ified-0.009
ifying-0.009
Token clock
Token ID8801
Feature activation-7.75e-03

Pos logit contributions
 importantly+0.002
 Grace+0.002
 million+0.002
 aliens+0.002
Am+0.002
Neg logit contributions
aster-0.002
ops-0.001
 peel-0.001
dog-0.001
 slur-0.001
Token said
Token ID531
Feature activation+6.99e-03

Pos logit contributions
aster+0.012
 peel+0.009
 toast+0.009
five+0.008
dog+0.008
Neg logit contributions
 Grace-0.013
 importantly-0.013
spe-0.012
Els-0.012
ified-0.012
Token said
Token ID531
Feature activation+4.40e-03

Pos logit contributions
aster+0.015
 peel+0.011
dog+0.011
ops+0.010
 toast+0.010
Neg logit contributions
 Grace-0.017
 importantly-0.016
Els-0.016
 Grac-0.015
 galaxies-0.015
Token,
Token ID11
Feature activation+5.16e-03

Pos logit contributions
Els+0.007
 importantly+0.007
 Grace+0.007
reetings+0.007
ified+0.007
Neg logit contributions
aster-0.007
 peel-0.006
 toast-0.005
dog-0.005
 slur-0.005
Token "
Token ID366
Feature activation-8.93e-03

Pos logit contributions
Els+0.008
 importantly+0.008
reetings+0.008
 Grace+0.007
ified+0.007
Neg logit contributions
aster-0.009
 peel-0.008
 toast-0.007
dog-0.007
 slur-0.007
TokenWhat
Token ID2061
Feature activation-9.12e-03

Pos logit contributions
aster+0.021
 toast+0.018
 peel+0.018
 slur+0.018
ats+0.018
Neg logit contributions
Els-0.008
 Grace-0.007
 Isn-0.007
 importantly-0.007
ified-0.007
Token is
Token ID318
Feature activation-1.50e-02

Pos logit contributions
aster+0.015
 peel+0.012
 toast+0.011
dog+0.011
 slur+0.011
Neg logit contributions
 importantly-0.014
 Grace-0.014
Els-0.014
reetings-0.014
ified-0.014
Token it
Token ID340
Feature activation-3.84e-02

Pos logit contributions
aster+0.025
 peel+0.021
 toast+0.020
dog+0.019
 slur+0.019
Neg logit contributions
 Grace-0.023
Els-0.023
 importantly-0.023
reetings-0.022
ified-0.022
Token?"
Token ID1701
Feature activation-1.62e-02

Pos logit contributions
aster+0.064
 peel+0.054
 toast+0.051
dog+0.048
on+0.046
Neg logit contributions
Els-0.061
 importantly-0.059
 Grace-0.058
ified-0.058
spe-0.057
Token
Token ID198
Feature activation-1.11e-02

Pos logit contributions
aster+0.027
 peel+0.022
 toast+0.021
dog+0.020
 slur+0.019
Neg logit contributions
Els-0.026
reetings-0.025
 importantly-0.025
spe-0.025
ified-0.025
Token
Token ID198
Feature activation-7.66e-03

Pos logit contributions
aster+0.024
 peel+0.020
 toast+0.020
dog+0.019
 slur+0.019
Neg logit contributions
 Grace-0.012
Els-0.012
 importantly-0.012
reetings-0.012
 ago-0.010
TokenBen
Token ID11696
Feature activation-9.86e-03

Pos logit contributions
aster+0.014
 toast+0.011
 slur+0.011
 peel+0.011
dog+0.011
Neg logit contributions
Els-0.011
 Grace-0.010
reetings-0.010
 importantly-0.010
 ago-0.009
Token said
Token ID531
Feature activation+6.25e-04

Pos logit contributions
aster+0.017
 toast+0.012
five+0.011
 peel+0.011
atmeal+0.011
Neg logit contributions
 Grace-0.017
ified-0.016
Els-0.015
 importantly-0.015
 ago-0.015
Token noticed
Token ID6810
Feature activation+2.70e-03

Pos logit contributions
aster+0.014
 peel+0.011
 toast+0.010
dog+0.010
 slur+0.009
Neg logit contributions
Els-0.022
 importantly-0.022
 Grace-0.022
reetings-0.022
ified-0.021
Token something
Token ID1223
Feature activation-6.82e-03

Pos logit contributions
Els+0.005
reetings+0.005
 importantly+0.005
 Grace+0.005
ified+0.005
Neg logit contributions
aster-0.004
 toast-0.003
 peel-0.003
dog-0.003
on-0.003
Token.
Token ID13
Feature activation+7.24e-03

Pos logit contributions
aster+0.013
 peel+0.010
 toast+0.010
dog+0.010
 slur+0.009
Neg logit contributions
 importantly-0.009
 Grace-0.009
Els-0.009
reetings-0.009
ified-0.009
Token The
Token ID383
Feature activation+1.39e-03

Pos logit contributions
Els+0.012
 importantly+0.012
reetings+0.012
 Grace+0.012
 insepar+0.011
Neg logit contributions
aster-0.012
 peel-0.009
 toast-0.009
dog-0.009
 slur-0.008
Token bird
Token ID6512
Feature activation-8.55e-03

Pos logit contributions
 Grace+0.003
 importantly+0.003
Els+0.003
reetings+0.003
ified+0.003
Neg logit contributions
aster-0.002
 peel-0.001
 toast-0.001
dog-0.001
 cone-0.001
Token wasn
Token ID2492
Feature activation+2.80e-02

Pos logit contributions
aster+0.013
 peel+0.010
 toast+0.010
dog+0.009
 slur+0.009
Neg logit contributions
Els-0.014
 Grace-0.014
 importantly-0.014
reetings-0.014
ified-0.014
Token't
Token ID470
Feature activation-5.12e-03

Pos logit contributions
 importantly+0.043
 Grace+0.042
reetings+0.041
Els+0.041
 ago+0.040
Neg logit contributions
aster-0.049
 peel-0.041
 toast-0.037
dog-0.037
on-0.036
Token alone
Token ID3436
Feature activation-1.38e-02

Pos logit contributions
aster+0.009
 peel+0.007
 toast+0.007
 cut+0.007
on+0.006
Neg logit contributions
ifying-0.008
 importantly-0.008
ified-0.008
reetings-0.008
Els-0.008
Token.
Token ID13
Feature activation+1.10e-02

Pos logit contributions
aster+0.010
 peel+0.009
dog+0.007
 cut+0.007
 toast+0.007
Neg logit contributions
Els-0.034
 importantly-0.031
reetings-0.031
 Grace-0.030
 Shall-0.030
Token There
Token ID1318
Feature activation+3.83e-03

Pos logit contributions
reetings+0.019
Els+0.019
 importantly+0.019
 Grace+0.018
 insepar+0.018
Neg logit contributions
aster-0.017
 peel-0.013
 toast-0.012
dog-0.012
 slur-0.012
Token was
Token ID373
Feature activation+6.39e-03

Pos logit contributions
 Grace+0.006
 importantly+0.005
Els+0.005
reetings+0.005
ified+0.005
Neg logit contributions
aster-0.007
 peel-0.006
 toast-0.005
dog-0.005
 slur-0.005
TokenLet
Token ID5756
Feature activation-2.05e-03

Pos logit contributions
aster+0.020
 peel+0.017
 toast+0.016
 slur+0.015
dog+0.015
Neg logit contributions
Els-0.012
 Grace-0.012
 importantly-0.012
reetings-0.011
 ago-0.011
Token's
Token ID338
Feature activation+5.77e-03

Pos logit contributions
aster+0.003
 peel+0.002
 toast+0.002
dog+0.002
 slur+0.002
Neg logit contributions
 Grace-0.003
Els-0.003
ified-0.003
reetings-0.003
 importantly-0.003
Token pretend
Token ID16614
Feature activation+2.50e-03

Pos logit contributions
 importantly+0.014
reetings+0.014
Els+0.013
 Grace+0.013
ified+0.013
Neg logit contributions
aster-0.006
 peel-0.005
 cut-0.004
 mark-0.004
 toast-0.004
Token we
Token ID356
Feature activation+3.62e-03

Pos logit contributions
reetings+0.005
Els+0.004
 importantly+0.004
ified+0.004
 Grace+0.004
Neg logit contributions
aster-0.004
 toast-0.003
 peel-0.003
 cone-0.003
dog-0.003
Token are
Token ID389
Feature activation-1.16e-02

Pos logit contributions
 importantly+0.007
 Grace+0.006
Els+0.006
reetings+0.006
ified+0.006
Neg logit contributions
aster-0.005
 peel-0.004
dog-0.004
 cut-0.004
 toast-0.004
Token climbing
Token ID14281
Feature activation+2.97e-02

Pos logit contributions
aster+0.019
 toast+0.016
 peel+0.015
 cut+0.014
 man+0.014
Neg logit contributions
Els-0.019
 importantly-0.019
ified-0.018
 Grace-0.018
reetings-0.018
Token it
Token ID340
Feature activation+8.18e-03

Pos logit contributions
 importantly+0.070
reetings+0.063
Els+0.059
 Grac+0.058
 Grace+0.058
Neg logit contributions
aster-0.033
dog-0.028
 peel-0.028
 toast-0.027
 cone-0.027
Token!"
Token ID2474
Feature activation-1.28e-02

Pos logit contributions
 importantly+0.015
reetings+0.014
 Grace+0.013
Els+0.013
 Grac+0.013
Neg logit contributions
aster-0.012
 peel-0.012
 toast-0.011
dog-0.010
 cone-0.010
Token
Token ID198
Feature activation-6.24e-03

Pos logit contributions
aster+0.021
 peel+0.017
 toast+0.016
dog+0.015
 slur+0.015
Neg logit contributions
reetings-0.021
Els-0.021
ified-0.021
 importantly-0.021
spe-0.020
Token
Token ID198
Feature activation-1.73e-04

Pos logit contributions
aster+0.012
 peel+0.010
 toast+0.009
dog+0.009
 slur+0.009
Neg logit contributions
 Grace-0.008
Els-0.008
 importantly-0.008
reetings-0.008
ified-0.008
TokenThey
Token ID2990
Feature activation-5.06e-03

Pos logit contributions
aster+0.000
 peel+0.000
 toast+0.000
dog+0.000
 slur+0.000
Neg logit contributions
 Grace-0.000
 importantly-0.000
Els-0.000
reetings-0.000
ified-0.000
Token was
Token ID373
Feature activation-1.03e-02

Pos logit contributions
ified+0.007
 importantly+0.007
ifying+0.007
Els+0.007
 rooft+0.007
Neg logit contributions
aster-0.005
 peel-0.004
 toast-0.003
dog-0.003
on-0.003
Token a
Token ID257
Feature activation+1.14e-03

Pos logit contributions
aster+0.017
 toast+0.013
 peel+0.013
dog+0.012
on+0.012
Neg logit contributions
Els-0.017
reetings-0.017
 importantly-0.017
 Grace-0.017
ified-0.016
Token little
Token ID1310
Feature activation+6.56e-04

Pos logit contributions
Els+0.001
 importantly+0.001
 Grace+0.001
reetings+0.001
ified+0.001
Neg logit contributions
aster-0.002
 peel-0.002
 toast-0.002
dog-0.002
 slur-0.002
Token girl
Token ID2576
Feature activation-2.69e-03

Pos logit contributions
reetings+0.001
Els+0.001
 importantly+0.001
 Grace+0.001
ified+0.001
Neg logit contributions
aster-0.001
 peel-0.001
 toast-0.001
dog-0.001
 slur-0.001
Token named
Token ID3706
Feature activation+1.07e-02

Pos logit contributions
aster+0.003
 peel+0.003
dog+0.002
 toast+0.002
 cut+0.002
Neg logit contributions
reetings-0.006
Els-0.005
ified-0.005
 importantly-0.005
 Grace-0.005
Token Daisy
Token ID40355
Feature activation+2.56e-02

Pos logit contributions
 importantly+0.017
reetings+0.016
 aliens+0.016
 galaxies+0.016
 insepar+0.016
Neg logit contributions
aster-0.018
 peel-0.014
 toast-0.013
dog-0.013
 slur-0.012
Token.
Token ID13
Feature activation+1.71e-02

Pos logit contributions
reetings+0.067
 importantly+0.063
Els+0.063
 Shall+0.061
eway+0.060
Neg logit contributions
aster-0.016
 peel-0.011
dog-0.010
 cut-0.009
 toast-0.008
Token Daisy
Token ID40355
Feature activation+2.00e-02

Pos logit contributions
spe+0.026
reetings+0.025
 importantly+0.025
ifying+0.025
Els+0.024
Neg logit contributions
aster-0.027
 peel-0.024
 toast-0.023
dog-0.023
 slur-0.022
Token was
Token ID373
Feature activation-1.87e-03

Pos logit contributions
 importantly+0.036
reetings+0.036
Els+0.036
ified+0.035
 Grace+0.035
Neg logit contributions
aster-0.028
 peel-0.022
 toast-0.020
dog-0.020
 slur-0.019
Token only
Token ID691
Feature activation-9.96e-03

Pos logit contributions
aster+0.003
 peel+0.002
 toast+0.002
 cut+0.002
dog+0.002
Neg logit contributions
reetings-0.003
Els-0.003
ified-0.003
 importantly-0.003
ifying-0.003
Token three
Token ID1115
Feature activation+2.05e-02

Pos logit contributions
aster+0.019
 peel+0.017
 toast+0.017
 cut+0.016
dog+0.016
Neg logit contributions
reetings-0.013
Els-0.011
 importantly-0.011
ified-0.011
 million-0.011
Token the
Token ID262
Feature activation-3.16e-03

Pos logit contributions
 Grace+0.011
Els+0.011
 importantly+0.010
 Grac+0.010
ified+0.010
Neg logit contributions
aster-0.012
 peel-0.011
dog-0.010
five-0.010
 slur-0.010
Token ingredients
Token ID9391
Feature activation+9.27e-03

Pos logit contributions
aster+0.005
 peel+0.004
 toast+0.003
dog+0.003
 slur+0.003
Neg logit contributions
 Grace-0.006
 importantly-0.006
Els-0.006
reetings-0.005
ified-0.005
Token were
Token ID547
Feature activation+1.27e-02

Pos logit contributions
 Grace+0.011
Els+0.011
 importantly+0.011
ified+0.011
 Grac+0.010
Neg logit contributions
aster-0.018
dog-0.014
 peel-0.014
 toast-0.014
five-0.014
Token all
Token ID477
Feature activation+7.13e-03

Pos logit contributions
 Grace+0.021
Els+0.021
reetings+0.021
 importantly+0.021
ified+0.020
Neg logit contributions
aster-0.020
 peel-0.016
 toast-0.016
 slur-0.015
dog-0.015
Token mixed
Token ID7668
Feature activation+1.84e-02

Pos logit contributions
 Grace+0.012
 importantly+0.012
Els+0.012
reetings+0.012
ified+0.011
Neg logit contributions
aster-0.011
 peel-0.009
 toast-0.008
dog-0.008
 slur-0.008
Token together
Token ID1978
Feature activation+3.07e-02

Pos logit contributions
reetings+0.034
 importantly+0.034
 Grace+0.032
Els+0.032
ified+0.031
Neg logit contributions
aster-0.026
 peel-0.022
 toast-0.021
dog-0.019
 slur-0.018
Token.
Token ID13
Feature activation-4.34e-03

Pos logit contributions
reetings+0.069
Els+0.066
 importantly+0.066
 Grace+0.065
ified+0.064
Neg logit contributions
aster-0.033
 peel-0.026
 toast-0.025
dog-0.020
 slur-0.020
Token She
Token ID1375
Feature activation-1.09e-03

Pos logit contributions
aster+0.008
 peel+0.007
 toast+0.006
dog+0.006
 slur+0.006
Neg logit contributions
 importantly-0.006
reetings-0.006
Els-0.006
 Grace-0.006
ified-0.006
Token was
Token ID373
Feature activation+4.52e-03

Pos logit contributions
aster+0.002
 peel+0.001
dog+0.001
five+0.001
 toast+0.001
Neg logit contributions
Els-0.002
 Grace-0.002
 importantly-0.002
 insepar-0.002
 Grac-0.002
Token so
Token ID523
Feature activation-9.30e-03

Pos logit contributions
reetings+0.005
Els+0.005
 Grace+0.005
 importantly+0.005
ified+0.005
Neg logit contributions
aster-0.010
 peel-0.008
 toast-0.008
 slur-0.008
dog-0.008
Token proud
Token ID6613
Feature activation-1.01e-03

Pos logit contributions
aster+0.016
dog+0.013
 peel+0.013
ops+0.012
 toast+0.012
Neg logit contributions
 importantly-0.014
Els-0.013
 insepar-0.013
 Grace-0.012
spe-0.012
Token and
Token ID290
Feature activation-8.56e-03

Pos logit contributions
 importantly+0.037
reetings+0.036
 Grace+0.036
Els+0.036
ified+0.034
Neg logit contributions
aster-0.016
 peel-0.015
 toast-0.013
 cone-0.011
ops-0.011
Token said
Token ID531
Feature activation+1.35e-02

Pos logit contributions
aster+0.012
 peel+0.010
 toast+0.009
dog+0.009
 slur+0.008
Neg logit contributions
 Grace-0.015
Els-0.015
 importantly-0.015
reetings-0.014
ified-0.014
Token,
Token ID11
Feature activation+9.99e-03

Pos logit contributions
 importantly+0.025
Els+0.025
reetings+0.025
 sizes+0.025
ifying+0.024
Neg logit contributions
aster-0.018
 toast-0.014
 peel-0.013
five-0.013
on-0.012
Token "
Token ID366
Feature activation+2.13e-03

Pos logit contributions
Els+0.013
 importantly+0.013
 Grace+0.013
reetings+0.013
ified+0.012
Neg logit contributions
aster-0.019
 peel-0.016
 toast-0.015
dog-0.015
 slur-0.015
TokenThis
Token ID1212
Feature activation+1.54e-03

Pos logit contributions
 Grace+0.003
 importantly+0.003
Els+0.003
reetings+0.003
ified+0.003
Neg logit contributions
aster-0.004
 peel-0.003
 toast-0.003
dog-0.003
 slur-0.003
Token mine
Token ID6164
Feature activation+2.56e-02

Pos logit contributions
 importantly+0.003
Els+0.003
ifying+0.003
 Grace+0.003
ified+0.003
Neg logit contributions
aster-0.002
 peel-0.002
 sandwich-0.001
 toast-0.001
 cut-0.001
Token is
Token ID318
Feature activation+5.81e-03

Pos logit contributions
 importantly+0.052
Els+0.051
 Grace+0.050
reetings+0.049
ified+0.048
Neg logit contributions
aster-0.027
 peel-0.025
 toast-0.022
 cut-0.021
ops-0.021
Token mine
Token ID6164
Feature activation+2.13e-02

Pos logit contributions
Els+0.011
reetings+0.010
ified+0.010
Fre+0.010
 importantly+0.010
Neg logit contributions
aster-0.008
 toast-0.007
 cut-0.006
 peel-0.006
 cast-0.006
Token,
Token ID11
Feature activation-4.49e-05

Pos logit contributions
Els+0.047
 importantly+0.046
reetings+0.045
 Grace+0.043
ified+0.042
Neg logit contributions
aster-0.017
 peel-0.017
 toast-0.016
 cone-0.014
--0.014
Token and
Token ID290
Feature activation+1.13e-02

Pos logit contributions
aster+0.000
 peel+0.000
 toast+0.000
dog+0.000
 cone+0.000
Neg logit contributions
reetings-0.000
Els-0.000
ified-0.000
 importantly-0.000
 Grace-0.000
Token no
Token ID645
Feature activation-2.25e-03

Pos logit contributions
reetings+0.021
Els+0.020
ified+0.020
iful+0.019
 insepar+0.019
Neg logit contributions
aster-0.016
 peel-0.014
 cut-0.013
 toast-0.013
 say-0.012
Token beautiful
Token ID4950
Feature activation+1.16e-02

Pos logit contributions
aster+0.009
 peel+0.007
 slur+0.007
 toast+0.007
 cut+0.007
Neg logit contributions
ified-0.011
 importantly-0.011
reetings-0.011
Els-0.010
spe-0.010
Token!"
Token ID2474
Feature activation-2.85e-03

Pos logit contributions
Els+0.021
spe+0.020
reetings+0.020
ifying+0.020
 importantly+0.020
Neg logit contributions
 toast-0.014
dog-0.013
aster-0.012
--0.012
 cream-0.012
Token
Token ID198
Feature activation+5.30e-03

Pos logit contributions
aster+0.005
 peel+0.004
 toast+0.004
dog+0.004
 slur+0.004
Neg logit contributions
 importantly-0.004
reetings-0.004
 Grace-0.004
Els-0.004
ified-0.004
Token
Token ID198
Feature activation+1.36e-02

Pos logit contributions
 importantly+0.008
 Grace+0.008
ified+0.008
reetings+0.008
Els+0.008
Neg logit contributions
aster-0.009
 peel-0.007
 toast-0.007
dog-0.007
 slur-0.007
TokenJ
Token ID41
Feature activation+2.51e-02

Pos logit contributions
 importantly+0.021
 Grace+0.021
 ago+0.020
reetings+0.020
ified+0.019
Neg logit contributions
aster-0.024
 peel-0.019
dog-0.018
 slur-0.018
 toast-0.018
Tokenaya
Token ID11729
Feature activation+3.94e-03

Pos logit contributions
 Grace+0.047
 importantly+0.047
Els+0.047
reetings+0.046
ified+0.045
Neg logit contributions
aster-0.034
 peel-0.026
 toast-0.024
dog-0.023
 slur-0.023
Token smiled
Token ID13541
Feature activation-4.27e-03

Pos logit contributions
 importantly+0.008
reetings+0.008
Els+0.008
 Grace+0.008
ified+0.007
Neg logit contributions
aster-0.005
 peel-0.004
 toast-0.003
 slur-0.003
dog-0.003
Token and
Token ID290
Feature activation-1.14e-02

Pos logit contributions
aster+0.005
 peel+0.004
 toast+0.004
 slur+0.003
dog+0.003
Neg logit contributions
reetings-0.009
Els-0.009
 importantly-0.008
 Grace-0.008
 ago-0.008
Token said
Token ID531
Feature activation+1.05e-02

Pos logit contributions
aster+0.016
 peel+0.012
 toast+0.011
dog+0.010
ops+0.010
Neg logit contributions
 Grace-0.022
Els-0.022
 importantly-0.022
ified-0.021
reetings-0.020
Token,
Token ID11
Feature activation+7.72e-03

Pos logit contributions
 importantly+0.019
Els+0.019
reetings+0.019
ifying+0.019
 rooft+0.018
Neg logit contributions
aster-0.013
 peel-0.010
 toast-0.010
five-0.010
on-0.010
Token "
Token ID366
Feature activation+9.10e-03

Pos logit contributions
 importantly+0.011
Els+0.011
reetings+0.011
 Grace+0.010
ified+0.010
Neg logit contributions
aster-0.014
 peel-0.012
 toast-0.011
dog-0.011
 slur-0.011
Token He
Token ID679
Feature activation+3.12e-03

Pos logit contributions
aster+0.021
 peel+0.017
 toast+0.017
dog+0.016
 slur+0.016
Neg logit contributions
Els-0.018
 Grace-0.018
 importantly-0.017
reetings-0.017
ified-0.017
Token says
Token ID1139
Feature activation+4.82e-03

Pos logit contributions
Els+0.005
 Grace+0.005
 importantly+0.005
reetings+0.005
ified+0.005
Neg logit contributions
aster-0.005
 peel-0.004
 toast-0.004
dog-0.003
five-0.003
Token the
Token ID262
Feature activation+7.69e-03

Pos logit contributions
Els+0.009
 Grace+0.009
reetings+0.009
ified+0.009
 importantly+0.009
Neg logit contributions
aster-0.007
 peel-0.005
 toast-0.005
dog-0.004
 slur-0.004
Token meat
Token ID6174
Feature activation+9.08e-04

Pos logit contributions
 importantly+0.015
 Grace+0.015
Els+0.015
reetings+0.015
ified+0.014
Neg logit contributions
aster-0.010
 peel-0.007
 toast-0.007
dog-0.007
 slur-0.006
Token is
Token ID318
Feature activation-3.79e-03

Pos logit contributions
 Grace+0.001
Els+0.001
 importantly+0.001
reetings+0.001
ified+0.001
Neg logit contributions
aster-0.002
 peel-0.001
 toast-0.001
 slur-0.001
dog-0.001
Token raw
Token ID8246
Feature activation+3.93e-03

Pos logit contributions
aster+0.006
 peel+0.005
 toast+0.005
dog+0.004
 slur+0.004
Neg logit contributions
Els-0.006
 Grace-0.006
reetings-0.006
 importantly-0.006
ified-0.006
Token and
Token ID290
Feature activation+6.49e-03

Pos logit contributions
Els+0.009
 Grace+0.009
reetings+0.008
 importantly+0.008
ified+0.008
Neg logit contributions
aster-0.005
 peel-0.004
 toast-0.004
dog-0.003
 cone-0.003
Token not
Token ID407
Feature activation+2.38e-03

Pos logit contributions
Els+0.012
reetings+0.011
 Grace+0.011
ified+0.011
 importantly+0.011
Neg logit contributions
aster-0.010
 peel-0.008
 toast-0.007
 slur-0.007
dog-0.007
Token good
Token ID922
Feature activation-7.48e-03

Pos logit contributions
Els+0.005
 Grace+0.004
reetings+0.004
 importantly+0.004
ified+0.004
Neg logit contributions
aster-0.003
 peel-0.003
 toast-0.003
dog-0.002
 slur-0.002
Token to
Token ID284
Feature activation-3.25e-04

Pos logit contributions
aster+0.010
 peel+0.008
 toast+0.008
dog+0.007
 slur+0.007
Neg logit contributions
Els-0.015
 Grace-0.015
reetings-0.015
 importantly-0.014
ified-0.014
Token eat
Token ID4483
Feature activation-8.51e-04

Pos logit contributions
aster+0.000
 peel+0.000
 toast+0.000
dog+0.000
 slur+0.000
Neg logit contributions
 Grace-0.001
Els-0.001
 importantly-0.001
reetings-0.001
ified-0.001
Token bit
Token ID1643
Feature activation-6.12e-03

Pos logit contributions
reetings+0.009
Els+0.009
 importantly+0.009
 Grace+0.008
ified+0.008
Neg logit contributions
aster-0.005
 peel-0.005
 toast-0.004
dog-0.004
 cone-0.004
Token.
Token ID13
Feature activation-2.17e-02

Pos logit contributions
aster+0.008
 peel+0.006
 toast+0.006
dog+0.005
 cut+0.005
Neg logit contributions
Els-0.013
 Lay-0.012
reetings-0.012
 insepar-0.012
 Grace-0.012
Token And
Token ID843
Feature activation-3.91e-03

Pos logit contributions
aster+0.041
 peel+0.034
 toast+0.032
dog+0.031
 slur+0.029
Neg logit contributions
 insepar-0.032
Els-0.032
spe-0.032
reetings-0.031
 importantly-0.031
Token you
Token ID345
Feature activation+9.36e-03

Pos logit contributions
aster+0.006
 toast+0.005
 peel+0.005
 cut+0.005
 slur+0.004
Neg logit contributions
reetings-0.007
ified-0.007
Els-0.007
spe-0.006
Mill-0.006
Token have
Token ID423
Feature activation+3.16e-03

Pos logit contributions
 importantly+0.015
 Grace+0.015
Els+0.015
reetings+0.015
ified+0.014
Neg logit contributions
aster-0.016
 peel-0.013
 toast-0.012
dog-0.012
 slur-0.012
Token to
Token ID284
Feature activation+2.99e-03

Pos logit contributions
reetings+0.006
Els+0.006
 importantly+0.006
 Grace+0.006
ified+0.006
Neg logit contributions
aster-0.005
 toast-0.004
 peel-0.004
dog-0.003
 cut-0.003
Token permit
Token ID8749
Feature activation-1.15e-03

Pos logit contributions
 Grace+0.005
Els+0.005
 importantly+0.005
reetings+0.005
ified+0.005
Neg logit contributions
aster-0.004
 peel-0.003
 toast-0.003
dog-0.003
 slur-0.003
Token me
Token ID502
Feature activation+3.06e-03

Pos logit contributions
aster+0.002
 peel+0.002
 toast+0.001
dog+0.001
 slur+0.001
Neg logit contributions
 importantly-0.002
Els-0.002
 Grace-0.002
reetings-0.002
ified-0.002
Token to
Token ID284
Feature activation-1.87e-03

Pos logit contributions
 importantly+0.005
Els+0.005
 Grace+0.005
ified+0.005
reetings+0.005
Neg logit contributions
aster-0.005
 peel-0.004
 toast-0.004
dog-0.004
 slur-0.003
Token play
Token ID711
Feature activation-6.51e-03

Pos logit contributions
aster+0.002
 peel+0.002
 toast+0.002
dog+0.001
 slur+0.001
Neg logit contributions
 importantly-0.004
 Grace-0.004
Els-0.004
reetings-0.004
ified-0.004
Token after
Token ID706
Feature activation-7.10e-03

Pos logit contributions
aster+0.010
 peel+0.008
 toast+0.008
dog+0.008
 slur+0.007
Neg logit contributions
Els-0.011
 Grace-0.011
 importantly-0.011
reetings-0.010
ified-0.010
Token could
Token ID714
Feature activation+8.27e-03

Pos logit contributions
 Grace+0.001
 importantly+0.001
Els+0.001
 ago+0.001
ified+0.000
Neg logit contributions
aster-0.001
five-0.001
dog-0.001
 peel-0.001
 toast-0.001
Token not
Token ID407
Feature activation+9.00e-04

Pos logit contributions
 Grace+0.011
Els+0.011
 importantly+0.011
ified+0.010
 ago+0.010
Neg logit contributions
aster-0.015
five-0.011
dog-0.011
 peel-0.011
 toast-0.011
Token find
Token ID1064
Feature activation-2.13e-03

Pos logit contributions
Els+0.001
 ago+0.001
 Grace+0.001
 forever+0.001
".+0.001
Neg logit contributions
aster-0.001
awed-0.001
ops-0.001
inches-0.001
uddled-0.001
Token the
Token ID262
Feature activation+8.95e-03

Pos logit contributions
aster+0.004
 peel+0.003
ops+0.003
dog+0.003
 slur+0.003
Neg logit contributions
 Grace-0.003
Els-0.003
 importantly-0.003
ified-0.003
 aliens-0.003
Token butter
Token ID9215
Feature activation+1.02e-03

Pos logit contributions
 importantly+0.017
 Grace+0.017
Els+0.017
reetings+0.017
ified+0.016
Neg logit contributions
aster-0.012
 peel-0.010
 toast-0.009
dog-0.009
 slur-0.008
Token in
Token ID287
Feature activation+8.44e-03

Pos logit contributions
 importantly+0.002
Els+0.002
 Grace+0.002
reetings+0.002
ified+0.002
Neg logit contributions
aster-0.002
 peel-0.001
 toast-0.001
dog-0.001
 slur-0.001
Token the
Token ID262
Feature activation-6.42e-03

Pos logit contributions
reetings+0.018
 importantly+0.018
ified+0.017
Els+0.017
 Grace+0.017
Neg logit contributions
aster-0.011
 peel-0.008
 toast-0.008
 cut-0.007
dog-0.007
Token kitchen
Token ID9592
Feature activation+2.35e-03

Pos logit contributions
aster+0.010
 peel+0.007
 toast+0.007
dog+0.007
 slur+0.007
Neg logit contributions
 Grace-0.011
Els-0.011
 importantly-0.011
reetings-0.011
ified-0.011
Token.
Token ID13
Feature activation+1.12e-03

Pos logit contributions
Els+0.005
 importantly+0.005
reetings+0.005
 Grace+0.005
ified+0.004
Neg logit contributions
aster-0.003
 peel-0.002
 toast-0.002
dog-0.002
 cone-0.002
Token They
Token ID1119
Feature activation+3.77e-03

Pos logit contributions
reetings+0.002
 importantly+0.002
Els+0.002
 Grace+0.002
ified+0.002
Neg logit contributions
aster-0.002
 peel-0.001
dog-0.001
 toast-0.001
 cone-0.001
Token looked
Token ID3114
Feature activation+9.02e-03

Pos logit contributions
 Grace+0.005
Els+0.005
ified+0.005
 Amanda+0.005
 ago+0.005
Neg logit contributions
aster-0.007
five-0.006
 toast-0.005
 peel-0.005
dog-0.005
Token,
Token ID11
Feature activation+3.91e-03

Pos logit contributions
aster+0.004
 toast+0.003
 peel+0.003
dog+0.003
 cone+0.003
Neg logit contributions
ifying-0.006
Els-0.006
 galaxies-0.006
 importantly-0.006
 former-0.006
Token I
Token ID314
Feature activation+6.90e-03

Pos logit contributions
reetings+0.007
ifying+0.007
spe+0.007
Els+0.007
ified+0.007
Neg logit contributions
aster-0.005
 peel-0.005
 cone-0.004
 toast-0.004
 man-0.004
Token can
Token ID460
Feature activation-2.87e-03

Pos logit contributions
reetings+0.008
 Grace+0.007
Els+0.007
 importantly+0.007
 ago+0.007
Neg logit contributions
aster-0.015
 peel-0.012
 toast-0.012
dog-0.012
 slur-0.012
Token tell
Token ID1560
Feature activation-6.38e-04

Pos logit contributions
aster+0.004
 peel+0.003
 toast+0.003
dog+0.003
 slur+0.003
Neg logit contributions
 Grace-0.005
Els-0.005
 importantly-0.005
reetings-0.005
ified-0.005
Token you
Token ID345
Feature activation+1.20e-02

Pos logit contributions
aster+0.001
 toast+0.001
 peel+0.001
dog+0.001
 slur+0.001
Neg logit contributions
reetings-0.001
 importantly-0.001
Els-0.001
spe-0.001
ified-0.001
Token a
Token ID257
Feature activation+1.18e-02

Pos logit contributions
Els+0.023
 importantly+0.023
ified+0.023
reetings+0.022
 Grace+0.022
Neg logit contributions
aster-0.016
 toast-0.014
 peel-0.012
 sandwich-0.011
dog-0.011
Token story
Token ID1621
Feature activation-2.66e-04

Pos logit contributions
ified+0.020
 importantly+0.020
Els+0.019
ifying+0.019
 ago+0.019
Neg logit contributions
aster-0.018
 peel-0.015
 toast-0.014
dog-0.014
 cone-0.013
Token,"
Token ID553
Feature activation-7.57e-03

Pos logit contributions
aster+0.000
 toast+0.000
 peel+0.000
dog+0.000
five+0.000
Neg logit contributions
Els-0.000
 importantly-0.000
 Grace-0.000
ifying-0.000
 insepar-0.000
Token the
Token ID262
Feature activation+9.04e-03

Pos logit contributions
aster+0.011
 peel+0.008
 toast+0.008
dog+0.007
 slur+0.007
Neg logit contributions
 Grace-0.014
 importantly-0.014
Els-0.014
reetings-0.013
ified-0.013
Token k
Token ID479
Feature activation+8.17e-03

Pos logit contributions
reetings+0.014
 importantly+0.014
ified+0.014
 Grace+0.013
Els+0.013
Neg logit contributions
aster-0.015
 peel-0.013
dog-0.012
 toast-0.012
 cone-0.012
Tokenang
Token ID648
Feature activation-7.66e-03

Pos logit contributions
Els+0.013
 importantly+0.012
 Grace+0.012
reetings+0.012
ified+0.012
Neg logit contributions
aster-0.014
 toast-0.012
 peel-0.011
 slur-0.011
dog-0.010
Token write
Token ID3551
Feature activation-1.40e-02

Pos logit contributions
aster+0.015
 peel+0.011
 toast+0.011
dog+0.010
 slur+0.010
Neg logit contributions
 Grace-0.024
Els-0.024
 importantly-0.024
reetings-0.024
ified-0.023
Token stories
Token ID3923
Feature activation+3.92e-03

Pos logit contributions
aster+0.024
 peel+0.019
 slur+0.017
dog+0.017
five+0.017
Neg logit contributions
 Grace-0.022
 importantly-0.021
Els-0.020
reetings-0.020
 aliens-0.020
Token with
Token ID351
Feature activation-1.04e-02

Pos logit contributions
Els+0.008
 importantly+0.007
reetings+0.007
 Grace+0.007
ified+0.007
Neg logit contributions
aster-0.005
 peel-0.004
 toast-0.004
dog-0.004
on-0.003
Token her
Token ID607
Feature activation+2.42e-03

Pos logit contributions
aster+0.016
 peel+0.012
dog+0.012
five+0.011
ops+0.011
Neg logit contributions
 Grace-0.018
 importantly-0.018
Els-0.017
 aliens-0.017
 Grac-0.017
Token cr
Token ID1067
Feature activation-1.17e-03

Pos logit contributions
 Grace+0.005
 importantly+0.004
Els+0.004
reetings+0.004
 ago+0.004
Neg logit contributions
aster-0.003
 peel-0.003
 toast-0.002
dog-0.002
 slur-0.002
Tokenay
Token ID323
Feature activation-6.86e-03

Pos logit contributions
aster+0.002
 peel+0.002
 toast+0.002
dog+0.002
 slur+0.002
Neg logit contributions
 importantly-0.002
 Grace-0.002
Els-0.002
reetings-0.001
ified-0.001
Tokenons
Token ID684
Feature activation+9.99e-04

Pos logit contributions
aster+0.008
 peel+0.006
 slur+0.005
five+0.005
 toast+0.005
Neg logit contributions
 Grace-0.014
reetings-0.014
 importantly-0.014
 ago-0.014
Els-0.013
Token.
Token ID13
Feature activation+2.55e-03

Pos logit contributions
Els+0.002
reetings+0.002
 importantly+0.002
 Grace+0.002
ified+0.002
Neg logit contributions
aster-0.001
 peel-0.001
 toast-0.001
dog-0.001
on-0.001
Token She
Token ID1375
Feature activation+3.66e-03

Pos logit contributions
reetings+0.004
 importantly+0.004
Els+0.004
ified+0.004
 insepar+0.004
Neg logit contributions
aster-0.004
 peel-0.004
 toast-0.003
dog-0.003
 slur-0.003
Token wrote
Token ID2630
Feature activation+9.95e-05

Pos logit contributions
 Grace+0.006
 importantly+0.006
Els+0.006
 ago+0.005
reetings+0.005
Neg logit contributions
aster-0.006
 peel-0.005
 toast-0.005
dog-0.004
five-0.004
Token about
Token ID546
Feature activation-3.70e-03

Pos logit contributions
Els+0.000
 importantly+0.000
 Grace+0.000
reetings+0.000
ified+0.000
Neg logit contributions
aster-0.000
 peel-0.000
 toast-0.000
dog-0.000
 slur-0.000
Token people
Token ID661
Feature activation-1.24e-03

Pos logit contributions
Els+0.010
 Grace+0.010
reetings+0.010
 importantly+0.010
ifying+0.009
Neg logit contributions
aster-0.008
 peel-0.007
 toast-0.006
dog-0.006
 cone-0.006
Token are
Token ID389
Feature activation-7.91e-03

Pos logit contributions
aster+0.002
 peel+0.001
 toast+0.001
dog+0.001
 slur+0.001
Neg logit contributions
Els-0.002
 importantly-0.002
 Grace-0.002
reetings-0.002
ified-0.002
Token on
Token ID319
Feature activation+1.16e-02

Pos logit contributions
aster+0.012
 peel+0.010
 toast+0.009
 cut+0.009
dog+0.009
Neg logit contributions
Els-0.014
reetings-0.013
 importantly-0.013
 Grace-0.013
ified-0.013
Token the
Token ID262
Feature activation-1.70e-03

Pos logit contributions
reetings+0.025
 importantly+0.024
Els+0.024
ified+0.023
 Shall+0.022
Neg logit contributions
aster-0.015
 toast-0.012
 peel-0.010
dog-0.009
 cone-0.008
Token sees
Token ID7224
Feature activation-6.46e-04

Pos logit contributions
aster+0.003
 peel+0.002
 slur+0.002
dog+0.002
on+0.002
Neg logit contributions
reetings-0.002
 ago-0.002
Els-0.002
 importantly-0.002
 Grace-0.002
Tokenaw
Token ID707
Feature activation-1.39e-02

Pos logit contributions
aster+0.001
 toast+0.001
 peel+0.001
dog+0.001
five+0.001
Neg logit contributions
Els-0.001
 importantly-0.001
 Grace-0.001
reetings-0.001
ified-0.001
Token,
Token ID11
Feature activation+3.34e-04

Pos logit contributions
aster+0.021
 peel+0.017
dog+0.016
 toast+0.016
ops+0.016
Neg logit contributions
 importantly-0.025
Els-0.025
 Grace-0.024
reetings-0.023
 Grac-0.023
Token the
Token ID262
Feature activation+8.04e-03

Pos logit contributions
Els+0.001
ified+0.001
reetings+0.001
 importantly+0.001
ifying+0.001
Neg logit contributions
aster-0.000
 toast-0.000
 peel-0.000
 say-0.000
 slur-0.000
Token heavier
Token ID20140
Feature activation+7.28e-03

Pos logit contributions
 Grace+0.015
Els+0.014
 importantly+0.014
reetings+0.014
ified+0.014
Neg logit contributions
aster-0.012
 peel-0.009
 toast-0.009
dog-0.009
 cone-0.008
Token one
Token ID530
Feature activation+4.73e-03

Pos logit contributions
 Grace+0.013
Els+0.013
 importantly+0.013
reetings+0.013
ified+0.013
Neg logit contributions
aster-0.011
 peel-0.008
 toast-0.008
dog-0.008
 cone-0.008
Token will
Token ID481
Feature activation+7.91e-03

Pos logit contributions
Els+0.008
 Grace+0.008
 importantly+0.008
ified+0.008
reetings+0.008
Neg logit contributions
aster-0.007
 peel-0.006
 toast-0.006
dog-0.006
 cone-0.005
Token parade
Token ID16134
Feature activation-1.88e-02

Pos logit contributions
aster+0.025
 peel+0.021
 toast+0.020
 slur+0.020
ops+0.020
Neg logit contributions
 Grace-0.016
 importantly-0.016
 aliens-0.015
 insepar-0.015
Els-0.015
Token,
Token ID11
Feature activation+9.31e-03

Pos logit contributions
aster+0.038
 peel+0.032
 toast+0.031
 slur+0.031
dog+0.030
Neg logit contributions
 Grace-0.023
reetings-0.022
 importantly-0.022
 ago-0.021
Els-0.021
Token there
Token ID612
Feature activation-5.45e-03

Pos logit contributions
reetings+0.019
ified+0.018
Els+0.018
 importantly+0.018
 former+0.017
Neg logit contributions
aster-0.011
 toast-0.008
 peel-0.008
dog-0.007
 cut-0.007
Token was
Token ID373
Feature activation+4.47e-03

Pos logit contributions
aster+0.012
 peel+0.010
 toast+0.010
five+0.009
 slur+0.009
Neg logit contributions
 Grace-0.006
 importantly-0.006
 aliens-0.006
Els-0.005
 Amanda-0.005
Token a
Token ID257
Feature activation-2.85e-03

Pos logit contributions
reetings+0.010
Els+0.009
ified+0.009
 Grace+0.009
 importantly+0.009
Neg logit contributions
aster-0.006
 toast-0.005
 peel-0.004
dog-0.004
 cut-0.004
Token clumsy
Token ID39120
Feature activation-1.75e-03

Pos logit contributions
aster+0.005
 peel+0.004
 toast+0.004
dog+0.004
 slur+0.003
Neg logit contributions
reetings-0.005
Els-0.005
ified-0.004
 importantly-0.004
 Grace-0.004
Token man
Token ID582
Feature activation-1.42e-02

Pos logit contributions
aster+0.002
 peel+0.002
 toast+0.002
dog+0.002
 slur+0.002
Neg logit contributions
Els-0.003
 importantly-0.003
 Grace-0.003
reetings-0.003
ified-0.003
Token named
Token ID3706
Feature activation+1.29e-02

Pos logit contributions
aster+0.024
 peel+0.019
 toast+0.019
 slur+0.018
dog+0.018
Neg logit contributions
 importantly-0.022
 Grace-0.022
Els-0.021
reetings-0.021
ified-0.021
Token Tim
Token ID5045
Feature activation+1.30e-02

Pos logit contributions
 importantly+0.028
Els+0.028
ifying+0.028
reetings+0.027
ified+0.027
Neg logit contributions
aster-0.014
 peel-0.010
dog-0.010
 toast-0.009
 cone-0.008
Token.
Token ID13
Feature activation+8.34e-03

Pos logit contributions
 Grace+0.023
Els+0.023
 importantly+0.023
reetings+0.023
ified+0.022
Neg logit contributions
aster-0.019
 peel-0.015
 toast-0.014
dog-0.013
 slur-0.013
Token He
Token ID679
Feature activation-2.98e-04

Pos logit contributions
 Grace+0.012
 importantly+0.012
Els+0.012
reetings+0.012
ified+0.012
Neg logit contributions
aster-0.015
 peel-0.012
 toast-0.012
dog-0.011
 slur-0.011
Token me
Token ID502
Feature activation-3.64e-03

Pos logit contributions
 importantly+0.003
Els+0.003
 Grace+0.003
reetings+0.003
ified+0.003
Neg logit contributions
aster-0.003
 peel-0.003
 toast-0.003
dog-0.002
 slur-0.002
Token a
Token ID257
Feature activation+2.50e-05

Pos logit contributions
aster+0.005
 peel+0.004
ops+0.003
dog+0.003
 toast+0.003
Neg logit contributions
 Grace-0.006
 importantly-0.006
reetings-0.006
Els-0.006
 ago-0.006
Token good
Token ID922
Feature activation-6.16e-03

Pos logit contributions
 Grace+0.000
Els+0.000
 importantly+0.000
reetings+0.000
 million+0.000
Neg logit contributions
aster-0.000
 peel-0.000
dog-0.000
ops-0.000
 toast-0.000
Token lunch
Token ID9965
Feature activation+1.88e-02

Pos logit contributions
aster+0.010
 peel+0.007
 toast+0.007
dog+0.007
ops+0.007
Neg logit contributions
 Grace-0.010
Els-0.010
 importantly-0.010
reetings-0.009
 ago-0.009
Token today
Token ID1909
Feature activation+2.12e-02

Pos logit contributions
Els+0.033
 Grace+0.032
 importantly+0.032
reetings+0.032
 Lay+0.030
Neg logit contributions
aster-0.029
 peel-0.025
 toast-0.025
dog-0.023
 cone-0.022
Token."
Token ID526
Feature activation-3.42e-03

Pos logit contributions
Els+0.042
 importantly+0.041
reetings+0.040
 Lay+0.039
 galaxies+0.039
Neg logit contributions
aster-0.028
 toast-0.024
 peel-0.024
dog-0.021
 cone-0.020
Token\n
Token ID198
Feature activation-5.53e-03

Pos logit contributions
aster+0.006
 peel+0.005
 toast+0.004
dog+0.004
 slur+0.004
Neg logit contributions
reetings-0.005
 importantly-0.005
Els-0.005
ified-0.005
 Grace-0.005
Token\n
Token ID198
Feature activation+2.51e-03

Pos logit contributions
aster+0.009
 peel+0.008
 toast+0.007
dog+0.007
 slur+0.007
Neg logit contributions
 importantly-0.008
Els-0.008
 Grace-0.008
ified-0.008
reetings-0.008
TokenL
Token ID43
Feature activation+1.24e-02

Pos logit contributions
 ago+0.005
 Grace+0.005
 importantly+0.005
ified+0.005
reetings+0.005
Neg logit contributions
aster-0.003
 peel-0.002
dog-0.002
 toast-0.002
 slur-0.002
Tokenily
Token ID813
Feature activation-4.59e-03

Pos logit contributions
 Grace+0.024
 importantly+0.023
Els+0.023
ifying+0.023
reetings+0.023
Neg logit contributions
aster-0.017
 peel-0.013
 toast-0.012
dog-0.011
five-0.011
Token felt
Token ID2936
Feature activation+2.08e-03

Pos logit contributions
aster+0.008
 peel+0.006
 toast+0.006
five+0.006
atmeal+0.005
Neg logit contributions
 Grace-0.007
ified-0.007
Els-0.007
spe-0.006
 importantly-0.006
Token One
Token ID1881
Feature activation-5.56e-03

Pos logit contributions
ified+0.016
reetings+0.016
Els+0.015
 importantly+0.015
 insepar+0.015
Neg logit contributions
aster-0.013
dog-0.010
 peel-0.010
 cone-0.009
on-0.009
Token day
Token ID1110
Feature activation-4.77e-03

Pos logit contributions
aster+0.011
 peel+0.009
 toast+0.009
 slur+0.009
dog+0.008
Neg logit contributions
 Grace-0.007
 importantly-0.007
Els-0.007
reetings-0.007
 ago-0.006
Token he
Token ID339
Feature activation-4.64e-03

Pos logit contributions
aster+0.007
 peel+0.005
 toast+0.005
dog+0.005
on+0.004
Neg logit contributions
Els-0.009
 importantly-0.009
reetings-0.009
ified-0.009
spe-0.009
Token wanted
Token ID2227
Feature activation-1.22e-02

Pos logit contributions
aster+0.004
 peel+0.003
 cut+0.003
dog+0.003
 cast+0.002
Neg logit contributions
 importantly-0.011
ifying-0.010
reetings-0.010
ified-0.010
Els-0.010
Token to
Token ID284
Feature activation-2.34e-04

Pos logit contributions
aster+0.018
 toast+0.014
 peel+0.014
dog+0.013
 cone+0.012
Neg logit contributions
 importantly-0.022
reetings-0.022
ified-0.022
Els-0.022
 Grace-0.021
Token talk
Token ID1561
Feature activation-5.25e-03

Pos logit contributions
aster+0.000
 peel+0.000
 toast+0.000
dog+0.000
 slur+0.000
Neg logit contributions
 importantly-0.000
Els-0.000
reetings-0.000
 Grace-0.000
ified-0.000
Token to
Token ID284
Feature activation-2.13e-03

Pos logit contributions
aster+0.006
 toast+0.005
 peel+0.005
dog+0.005
 cone+0.004
Neg logit contributions
Els-0.011
ified-0.010
 importantly-0.010
reetings-0.010
 Grace-0.010
Token the
Token ID262
Feature activation+3.65e-05

Pos logit contributions
aster+0.003
 peel+0.002
 toast+0.002
dog+0.002
 slur+0.002
Neg logit contributions
reetings-0.004
 importantly-0.004
Els-0.004
ified-0.004
 ago-0.004
Token birds
Token ID10087
Feature activation-1.05e-02

Pos logit contributions
reetings+0.000
 importantly+0.000
ified+0.000
Els+0.000
 ago+0.000
Neg logit contributions
aster-0.000
 toast-0.000
 peel-0.000
 man-0.000
dog-0.000
Token.
Token ID13
Feature activation+3.76e-03

Pos logit contributions
aster+0.014
 peel+0.011
 toast+0.010
dog+0.010
 slur+0.009
Neg logit contributions
Els-0.021
 importantly-0.020
reetings-0.020
ified-0.020
 Grace-0.020
Token He
Token ID679
Feature activation-2.86e-03

Pos logit contributions
reetings+0.006
 importantly+0.006
Els+0.006
ified+0.006
 insepar+0.006
Neg logit contributions
aster-0.006
 peel-0.005
dog-0.005
 toast-0.005
 cone-0.004
Token feel
Token ID1254
Feature activation-5.53e-03

Pos logit contributions
aster+0.005
 toast+0.004
 peel+0.004
dog+0.004
on+0.004
Neg logit contributions
 Grace-0.006
Els-0.006
 importantly-0.005
reetings-0.005
ified-0.005
Token sorry
Token ID7926
Feature activation-1.25e-02

Pos logit contributions
aster+0.010
 peel+0.008
 toast+0.007
dog+0.007
ops+0.007
Neg logit contributions
 importantly-0.008
 Grace-0.008
Els-0.008
reetings-0.008
ified-0.008
Token for
Token ID329
Feature activation-2.72e-03

Pos logit contributions
aster+0.015
 peel+0.011
 toast+0.011
dog+0.010
 slur+0.009
Neg logit contributions
Els-0.026
reetings-0.025
 Grace-0.025
ified-0.025
 importantly-0.025
Token what
Token ID644
Feature activation-3.67e-03

Pos logit contributions
aster+0.005
 peel+0.004
 toast+0.004
dog+0.003
 slur+0.003
Neg logit contributions
 Grace-0.004
 importantly-0.004
Els-0.004
reetings-0.004
ified-0.004
Token they
Token ID484
Feature activation-2.47e-02

Pos logit contributions
aster+0.006
 peel+0.005
 toast+0.005
dog+0.005
 slur+0.004
Neg logit contributions
 Grace-0.006
 importantly-0.006
Els-0.006
reetings-0.006
ified-0.005
Token did
Token ID750
Feature activation-2.49e-02

Pos logit contributions
aster+0.048
 toast+0.038
 slur+0.034
 peel+0.034
 cone+0.034
Neg logit contributions
reetings-0.034
 ago-0.031
 importantly-0.029
 should-0.028
 Isn-0.027
Token.
Token ID13
Feature activation-4.76e-03

Pos logit contributions
aster+0.037
 peel+0.029
 toast+0.027
dog+0.026
 slur+0.025
Neg logit contributions
 importantly-0.043
 Grace-0.043
Els-0.043
reetings-0.043
ified-0.041
Token They
Token ID1119
Feature activation-3.61e-03

Pos logit contributions
aster+0.006
 toast+0.005
 peel+0.005
dog+0.005
five+0.004
Neg logit contributions
reetings-0.010
iful-0.009
 insepar-0.009
ified-0.009
chie-0.009
Token want
Token ID765
Feature activation-1.82e-02

Pos logit contributions
aster+0.005
 toast+0.004
dog+0.004
 peel+0.004
on+0.004
Neg logit contributions
 Grace-0.006
Els-0.006
 importantly-0.006
ified-0.006
reetings-0.006
Token to
Token ID284
Feature activation+2.37e-03

Pos logit contributions
aster+0.036
 peel+0.030
ops+0.029
dog+0.028
 slur+0.028
Neg logit contributions
 Grace-0.022
 importantly-0.021
Els-0.020
 aliens-0.019
 ago-0.019
Token say
Token ID910
Feature activation-6.12e-03

Pos logit contributions
 Grace+0.004
Els+0.004
 importantly+0.004
reetings+0.004
ified+0.004
Neg logit contributions
aster-0.003
 peel-0.003
 toast-0.002
dog-0.002
 slur-0.002
Token all
Token ID477
Feature activation+2.21e-03

Pos logit contributions
reetings+0.010
Els+0.010
ified+0.010
 importantly+0.010
 ago+0.009
Neg logit contributions
aster-0.009
 toast-0.007
 peel-0.007
dog-0.006
 slur-0.006
Token the
Token ID262
Feature activation+1.46e-03

Pos logit contributions
 Grace+0.003
 importantly+0.003
Els+0.003
reetings+0.003
ified+0.003
Neg logit contributions
aster-0.004
 peel-0.003
 toast-0.003
dog-0.003
 slur-0.003
Token animals
Token ID4695
Feature activation+5.51e-05

Pos logit contributions
ified+0.002
ifying+0.002
reetings+0.002
 Grace+0.002
Els+0.002
Neg logit contributions
aster-0.002
 peel-0.002
dog-0.002
 toast-0.002
 man-0.002
Token.
Token ID13
Feature activation-4.38e-03

Pos logit contributions
Els+0.000
 Grace+0.000
 importantly+0.000
reetings+0.000
ified+0.000
Neg logit contributions
aster-0.000
 peel-0.000
 toast-0.000
dog-0.000
 slur-0.000
Token When
Token ID1649
Feature activation+4.52e-03

Pos logit contributions
aster+0.007
 peel+0.006
dog+0.006
 toast+0.006
 slur+0.006
Neg logit contributions
 importantly-0.007
Els-0.007
reetings-0.007
ified-0.007
 Grace-0.006
Token they
Token ID484
Feature activation-2.11e-02

Pos logit contributions
Els+0.008
reetings+0.008
 importantly+0.008
ified+0.008
 Grace+0.008
Neg logit contributions
aster-0.007
 peel-0.005
 toast-0.005
dog-0.005
 slur-0.005
Token got
Token ID1392
Feature activation-2.00e-03

Pos logit contributions
aster+0.036
 toast+0.031
 peel+0.029
dog+0.028
 slur+0.027
Neg logit contributions
 Grace-0.031
Els-0.030
ified-0.029
reetings-0.029
 ago-0.029
Token there
Token ID612
Feature activation-6.42e-03

Pos logit contributions
aster+0.004
 peel+0.003
 toast+0.003
dog+0.003
 slur+0.003
Neg logit contributions
 importantly-0.003
 Grace-0.003
Els-0.003
reetings-0.003
ified-0.002
Token,
Token ID11
Feature activation+6.95e-03

Pos logit contributions
aster+0.009
 peel+0.007
 toast+0.007
dog+0.007
ops+0.007
Neg logit contributions
 importantly-0.011
Els-0.011
reetings-0.011
ified-0.011
 Grace-0.011
Token Tim
Token ID5045
Feature activation-6.58e-04

Pos logit contributions
reetings+0.013
Els+0.013
 importantly+0.013
ified+0.012
 Grace+0.012
Neg logit contributions
aster-0.010
 peel-0.008
 toast-0.007
dog-0.007
 slur-0.007
Tokenmy
Token ID1820
Feature activation-1.44e-02

Pos logit contributions
aster+0.001
 peel+0.001
 toast+0.001
dog+0.001
 slur+0.001
Neg logit contributions
 importantly-0.001
Els-0.001
 Grace-0.001
ified-0.001
reetings-0.001
Token you
Token ID345
Feature activation+1.29e-04

Pos logit contributions
aster+0.015
 peel+0.012
 toast+0.011
dog+0.011
five+0.010
Neg logit contributions
 importantly-0.020
Els-0.019
reetings-0.019
spe-0.018
 Grace-0.018
Token.
Token ID13
Feature activation-2.03e-02

Pos logit contributions
 importantly+0.000
Els+0.000
 Grace+0.000
reetings+0.000
ified+0.000
Neg logit contributions
aster-0.000
 peel-0.000
 toast-0.000
dog-0.000
 slur-0.000
Token It
Token ID632
Feature activation-3.37e-03

Pos logit contributions
aster+0.038
 peel+0.031
dog+0.030
 toast+0.030
 slur+0.029
Neg logit contributions
Els-0.028
 importantly-0.027
 Grace-0.027
reetings-0.027
 insepar-0.026
Token is
Token ID318
Feature activation-2.51e-03

Pos logit contributions
aster+0.006
 peel+0.005
 toast+0.005
dog+0.005
 slur+0.004
Neg logit contributions
 Grace-0.005
Els-0.005
 importantly-0.005
reetings-0.005
ified-0.005
Token fun
Token ID1257
Feature activation+5.95e-03

Pos logit contributions
aster+0.004
 peel+0.003
 toast+0.003
dog+0.003
 slur+0.003
Neg logit contributions
Els-0.005
reetings-0.004
ified-0.004
spe-0.004
 Grace-0.004
Token.
Token ID13
Feature activation-2.26e-02

Pos logit contributions
Els+0.014
reetings+0.013
 importantly+0.013
ified+0.012
ifying+0.012
Neg logit contributions
aster-0.006
 toast-0.005
 peel-0.005
dog-0.005
five-0.004
Token Don
Token ID2094
Feature activation-1.14e-02

Pos logit contributions
aster+0.044
 peel+0.036
dog+0.035
 toast+0.034
five+0.033
Neg logit contributions
Els-0.030
 Grace-0.028
 insepar-0.028
spe-0.028
 importantly-0.028
Token't
Token ID470
Feature activation-2.41e-03

Pos logit contributions
aster+0.022
 peel+0.018
 toast+0.018
dog+0.017
 slur+0.017
Neg logit contributions
 Grace-0.015
Els-0.015
 importantly-0.015
reetings-0.014
ified-0.014
Token be
Token ID307
Feature activation-3.99e-03

Pos logit contributions
aster+0.003
 peel+0.002
 toast+0.002
dog+0.002
 slur+0.002
Neg logit contributions
 importantly-0.005
Els-0.005
reetings-0.005
 Grace-0.005
ified-0.005
Token scared
Token ID12008
Feature activation-1.67e-02

Pos logit contributions
aster+0.006
 peel+0.005
 toast+0.004
dog+0.004
 cut+0.004
Neg logit contributions
Els-0.007
reetings-0.007
 Grace-0.007
 importantly-0.007
ified-0.007
Token."
Token ID526
Feature activation-1.55e-02

Pos logit contributions
aster+0.020
 peel+0.016
dog+0.016
 toast+0.015
 cut+0.013
Neg logit contributions
Els-0.036
 former-0.033
 Lay-0.032
spe-0.032
 importantly-0.032
Token car
Token ID1097
Feature activation-4.83e-03

Pos logit contributions
aster+0.008
 peel+0.006
 toast+0.006
dog+0.006
 cone+0.006
Neg logit contributions
 importantly-0.008
reetings-0.008
Els-0.008
ified-0.008
 Grace-0.008
Token.
Token ID13
Feature activation-7.94e-04

Pos logit contributions
aster+0.007
 peel+0.005
 toast+0.005
dog+0.005
 slur+0.005
Neg logit contributions
Els-0.009
 importantly-0.009
 Grace-0.009
reetings-0.009
ified-0.008
Token\n
Token ID198
Feature activation-5.66e-03

Pos logit contributions
aster+0.001
 peel+0.001
dog+0.001
 slur+0.001
 toast+0.001
Neg logit contributions
reetings-0.002
 importantly-0.001
ified-0.001
Els-0.001
 insepar-0.001
Token\n
Token ID198
Feature activation+1.78e-03

Pos logit contributions
aster+0.007
 peel+0.005
on+0.005
dog+0.005
 toast+0.005
Neg logit contributions
ified-0.011
 importantly-0.011
ifying-0.011
reetings-0.011
 aliens-0.011
TokenAs
Token ID1722
Feature activation+1.57e-02

Pos logit contributions
 importantly+0.003
ified+0.003
reetings+0.003
 aliens+0.003
 ago+0.003
Neg logit contributions
aster-0.003
 peel-0.002
dog-0.002
 toast-0.002
 slur-0.002
Token they
Token ID484
Feature activation-2.06e-02

Pos logit contributions
reetings+0.041
ified+0.040
Els+0.038
 Isn+0.037
 Shall+0.036
Neg logit contributions
aster-0.015
 toast-0.007
 man-0.006
 peel-0.006
 cut-0.005
Token were
Token ID547
Feature activation-1.34e-02

Pos logit contributions
aster+0.032
 peel+0.025
 toast+0.024
dog+0.023
 slur+0.022
Neg logit contributions
 Grace-0.035
Els-0.034
 importantly-0.034
ified-0.033
reetings-0.033
Token driving
Token ID5059
Feature activation+5.29e-03

Pos logit contributions
aster+0.025
 peel+0.020
 toast+0.019
dog+0.019
 slur+0.018
Neg logit contributions
Els-0.019
reetings-0.019
 importantly-0.019
 Grace-0.019
ified-0.018
Token,
Token ID11
Feature activation+1.15e-02

Pos logit contributions
 importantly+0.011
Els+0.011
reetings+0.011
ified+0.011
ifying+0.010
Neg logit contributions
aster-0.006
dog-0.004
 man-0.004
 peel-0.004
 cone-0.004
Token they
Token ID484
Feature activation-1.10e-02

Pos logit contributions
reetings+0.026
 importantly+0.023
Els+0.022
ified+0.022
 ago+0.021
Neg logit contributions
aster-0.015
 toast-0.011
 peel-0.010
dog-0.010
 slur-0.009
Token saw
Token ID2497
Feature activation+4.83e-03

Pos logit contributions
aster+0.015
 peel+0.011
 toast+0.011
dog+0.010
 slur+0.010
Neg logit contributions
 Grace-0.021
Els-0.021
 importantly-0.020
ified-0.020
 ago-0.020
Token Lily
Token ID20037
Feature activation-4.96e-03

Pos logit contributions
 importantly+0.013
Els+0.013
reetings+0.013
 Grace+0.013
ified+0.012
Neg logit contributions
aster-0.015
 peel-0.012
 toast-0.011
dog-0.011
 slur-0.011
Token and
Token ID290
Feature activation-1.34e-02

Pos logit contributions
aster+0.009
 toast+0.008
 peel+0.007
ops+0.007
atmeal+0.007
Neg logit contributions
 ago-0.006
 Grace-0.006
 importantly-0.005
reetings-0.005
Els-0.005
Token Tom
Token ID4186
Feature activation-4.83e-03

Pos logit contributions
aster+0.021
 peel+0.017
dog+0.015
 toast+0.015
 slur+0.015
Neg logit contributions
 Grace-0.023
 importantly-0.022
Els-0.021
ified-0.021
reetings-0.020
Token went
Token ID1816
Feature activation-1.70e-03

Pos logit contributions
aster+0.008
 toast+0.006
 peel+0.006
ops+0.006
 slur+0.006
Neg logit contributions
 Grace-0.008
Els-0.007
reetings-0.007
 ago-0.007
 importantly-0.007
Token to
Token ID284
Feature activation-9.18e-03

Pos logit contributions
aster+0.003
 peel+0.003
 toast+0.002
dog+0.002
 slur+0.002
Neg logit contributions
 Grace-0.002
 importantly-0.002
Els-0.002
reetings-0.002
 ago-0.002
Token the
Token ID262
Feature activation-2.04e-02

Pos logit contributions
aster+0.013
 peel+0.010
 toast+0.010
dog+0.009
 slur+0.009
Neg logit contributions
 importantly-0.017
reetings-0.017
Els-0.017
ified-0.017
 Grace-0.016
Token park
Token ID3952
Feature activation-1.54e-02

Pos logit contributions
aster+0.031
 peel+0.024
 toast+0.023
dog+0.022
 slur+0.022
Neg logit contributions
 Grace-0.035
Els-0.035
 importantly-0.035
reetings-0.034
ified-0.033
Token with
Token ID351
Feature activation-7.53e-03

Pos logit contributions
aster+0.034
 toast+0.029
atmeal+0.029
ops+0.029
 slur+0.029
Neg logit contributions
 ago-0.013
 Grace-0.012
spe-0.012
 so-0.011
 since-0.011
Token Mom
Token ID11254
Feature activation+1.16e-02

Pos logit contributions
aster+0.014
ops+0.012
 peel+0.012
dog+0.011
 slur+0.011
Neg logit contributions
 Grace-0.011
Els-0.010
 importantly-0.010
 Amanda-0.009
 aliens-0.009
Token.
Token ID13
Feature activation+1.01e-04

Pos logit contributions
 Grace+0.021
Els+0.021
 importantly+0.021
reetings+0.021
ified+0.020
Neg logit contributions
aster-0.016
 peel-0.013
 toast-0.012
dog-0.011
 slur-0.011
Token They
Token ID1119
Feature activation-6.50e-03

Pos logit contributions
reetings+0.000
Els+0.000
 importantly+0.000
ified+0.000
 insepar+0.000
Neg logit contributions
aster-0.000
 peel-0.000
 toast-0.000
 slur-0.000
dog-0.000