TOP ACTIVATIONS
MAX = 3.932

Linda had taken a small pe b ble and was trying
was no bigger than a pe a !
, Lily always kept a pe b ble in her pocket
I wish I had a pe b ble like that ,"
soon he heard a big slur p and a splash coming
mom m ies and d add ies . Lily and Ben
. But she felt a pe a under them . She
, Peter stepped on a pe b ble and it hurt
big pink bear called P add ington . He loved to
cute and made a soft pe ep sound .
the kids had their t amb our ines and drums ,
, she accidentally threw a pe b ble at her friend
, then she grabbed a pe b ble . She looked
day and she found a pe b ble . She lly
for finding such a beautiful pe b ble .
, she found a small pe b ble . It was
fluffy bunny , a w add ling duck , and a
of wind blew a fr is bee and it almost hit
the n ooks and cr ann ies . After
drum and Mia has a fl ute . They make different

MIN ACTIVATIONS
MIN = -3.458

When Daisy put the magn ifying glass down , the cater
flashlight , and a magn ifying glass . As
. He was wearing his brand new icy blue jacket as
. They smell popcorn and cotton candy . They hear music
. They smell popcorn and cotton candy . They hear music
the park , a three year old saw a huge furniture
the necklace and the magn ifying glass and started to examine
special card to a 3 year old child . The card
the warm sunshine and having pic n ics with her family
. Bob , a three year - old boy , was
the necklace and the magn ifying glass and told him that
was a lovely , sunny year and all of the trees
Inside , a little three year old boy was standing .
One day , a three year old child opened the refrigerator
two friends ; a 3 year old named Anna and a
the h anger for her brand new dress . The cats
friendship which will last a lifetime . <|endoftext|> Once upon a
stairs , a little three year old girl stopped him .
Everyone laughed and said how y ummy the pasta was !
She takes out a magn ifying glass . She looks at

INTERVAL 2.084 - 3.932
CONTAINS 0.036%

She did not set the timer . She did not care
the idea to bake a pe b ble . She gathered
more careful with him from now on . Lily and Fl
, so he snapped the ash tr ay every day .
He wanted to find a pe b ble like Lily 's

INTERVAL 0.237 - 2.084
CONTAINS 30.969%

happened . He saw that Lily was sad and asked her
can help you ," replied the bear . They walked and
. He did not like to share . He said ,
and it fell to the ground . He ran to it
Inside , the waiter greeted them and showed them to their

INTERVAL -1.611 - 0.237
CONTAINS 68.625%

When she was done , she presented it to Jake .
bought him a warm coat so his little body would not
very proud of her little girl and smiled .
checking out all the tasty treats . The bear
, a bike , and a t eddy bear . They

INTERVAL -3.458 - -1.611
CONTAINS 0.370%

. It likes Tom and Ben . Tom and
's plan a big art day !" M imi said .
As she climbed higher and higher , she heard a loud
with the watch , looking forward to many more happy days
who loved to kiss . Everywhere she went , she always

Token Linda
Token ID21776
Feature activation+3.34e-01

Pos logit contributions
ucker+0.006
ifying+0.005
angles+0.005
ours+0.005
 mouths+0.005
Neg logit contributions
 gard-0.005
 gn-0.004
Els-0.004
 guardian-0.004
 wizard-0.004
Token had
Token ID550
Feature activation-1.46e-01

Pos logit contributions
 gard+0.012
 gn+0.009
Els+0.009
Ar+0.009
 guardian+0.008
Neg logit contributions
ifying-0.013
ucker-0.012
ours-0.012
 mouths-0.012
 anymore-0.011
Token taken
Token ID2077
Feature activation-5.68e-01

Pos logit contributions
ifying+0.006
 mouths+0.005
ucker+0.005
 anymore+0.005
ours+0.005
Neg logit contributions
 gard-0.005
Els-0.004
 gn-0.004
 wizard-0.004
Ar-0.004
Token a
Token ID257
Feature activation+4.89e-01

Pos logit contributions
ifying+0.023
ucker+0.023
ours+0.022
 mouths+0.022
angles+0.022
Neg logit contributions
 gard-0.018
 gn-0.014
Els-0.014
 guardian-0.013
Ar-0.013
Token small
Token ID1402
Feature activation+9.61e-01

Pos logit contributions
 gard+0.016
 gn+0.013
 wizard+0.012
Els+0.012
 guardian+0.011
Neg logit contributions
ucker-0.020
ours-0.020
angles-0.020
ifying-0.019
 mouths-0.019
Token pe
Token ID613
Feature activation+3.93e+00

Pos logit contributions
 gard+0.033
 gn+0.027
Els+0.025
 guardian+0.024
 wizard+0.024
Neg logit contributions
ifying-0.037
ucker-0.036
ours-0.035
angles-0.034
 mouths-0.034
Tokenb
Token ID65
Feature activation+2.98e+00

Pos logit contributions
 gard+0.135
 gn+0.107
Els+0.104
Ar+0.100
 guardian+0.096
Neg logit contributions
ifying-0.151
ucker-0.147
ours-0.143
angles-0.139
 mouths-0.138
Tokenble
Token ID903
Feature activation+1.34e+00

Pos logit contributions
 gard+0.111
Ar+0.095
 gn+0.092
Els+0.089
 seek+0.088
Neg logit contributions
ifying-0.106
ucker-0.098
ours-0.096
angles-0.094
outing-0.092
Token and
Token ID290
Feature activation-1.92e-01

Pos logit contributions
 gard+0.044
 gn+0.035
Els+0.033
 guardian+0.032
 wizard+0.031
Neg logit contributions
ucker-0.053
ifying-0.053
ours-0.052
 mouths-0.051
angles-0.050
Token was
Token ID373
Feature activation-2.96e-02

Pos logit contributions
ucker+0.008
ifying+0.008
ours+0.008
angles+0.008
 anymore+0.008
Neg logit contributions
 gard-0.006
 gn-0.005
Els-0.004
 seek-0.004
 wizard-0.004
Token trying
Token ID2111
Feature activation-9.20e-02

Pos logit contributions
ifying+0.001
ucker+0.001
ours+0.001
 anymore+0.001
 mouths+0.001
Neg logit contributions
 gard-0.001
 gn-0.001
Els-0.001
Ar-0.001
 guardian-0.001
Token was
Token ID373
Feature activation+1.83e-01

Pos logit contributions
 gard+0.002
 gn+0.002
Els+0.002
 guardian+0.002
 elf+0.001
Neg logit contributions
ifying-0.002
 anymore-0.002
ucker-0.002
 mouths-0.002
ours-0.002
Token no
Token ID645
Feature activation-7.77e-01

Pos logit contributions
 gard+0.007
 gn+0.005
Els+0.005
Ar+0.005
 guardian+0.005
Neg logit contributions
ifying-0.007
 anymore-0.006
ucker-0.006
 mouths-0.006
ours-0.006
Token bigger
Token ID5749
Feature activation+1.58e-01

Pos logit contributions
ifying+0.024
ucker+0.024
 anymore+0.023
ours+0.023
 mouths+0.023
Neg logit contributions
 gard-0.031
 gn-0.026
Els-0.026
Ar-0.025
 guardian-0.025
Token than
Token ID621
Feature activation+7.34e-02

Pos logit contributions
 gard+0.007
Els+0.005
 gn+0.005
Ar+0.005
 guardian+0.005
Neg logit contributions
ifying-0.005
ucker-0.005
 anymore-0.005
 mouths-0.005
ours-0.005
Token a
Token ID257
Feature activation+1.18e-01

Pos logit contributions
 gard+0.002
 gn+0.002
 wizard+0.002
 himself+0.002
 mer+0.002
Neg logit contributions
ucker-0.003
ours-0.003
ifying-0.003
inches-0.003
angles-0.003
Token pe
Token ID613
Feature activation+3.80e+00

Pos logit contributions
 gard+0.004
 gn+0.003
Els+0.003
 guardian+0.002
Ar+0.002
Neg logit contributions
ifying-0.005
ucker-0.005
ours-0.005
 mouths-0.005
 anymore-0.005
Tokena
Token ID64
Feature activation+5.24e-01

Pos logit contributions
 gard+0.132
 gn+0.108
Els+0.102
 guardian+0.098
 wizard+0.098
Neg logit contributions
ifying-0.145
ucker-0.141
ours-0.136
 mouths-0.136
 anymore-0.134
Token!
Token ID0
Feature activation-1.44e-02

Pos logit contributions
 gard+0.017
 gn+0.014
Els+0.012
 wizard+0.012
 guardian+0.012
Neg logit contributions
ucker-0.021
ifying-0.021
ours-0.021
angles-0.020
 mouths-0.020
Token
Token ID220
Feature activation-5.42e-02

Pos logit contributions
ifying+0.001
ucker+0.001
ours+0.001
 mouths+0.001
 anymore+0.001
Neg logit contributions
 gard-0.000
 gn-0.000
Els-0.000
 guardian-0.000
Ar-0.000
Token
Token ID198
Feature activation+2.33e-02

Pos logit contributions
ifying+0.002
ucker+0.002
ours+0.002
 anymore+0.002
 mouths+0.002
Neg logit contributions
 gard-0.001
 gn-0.001
Els-0.001
 guardian-0.001
Ar-0.001
Token
Token ID198
Feature activation+8.88e-02

Pos logit contributions
 gard+0.001
 gn+0.001
 guardian+0.001
 elf+0.001
 mer+0.001
Neg logit contributions
ucker-0.001
ifying-0.001
ours-0.001
 anymore-0.001
angles-0.001
Token,
Token ID11
Feature activation-1.08e-02

Pos logit contributions
 gard+0.006
 gn+0.004
Els+0.004
 wizard+0.004
 guardian+0.004
Neg logit contributions
ifying-0.007
ucker-0.006
ours-0.006
angles-0.006
 mouths-0.006
Token Lily
Token ID20037
Feature activation+3.29e-01

Pos logit contributions
ifying+0.000
ucker+0.000
ours+0.000
inches+0.000
outing+0.000
Neg logit contributions
 gard-0.000
 gn-0.000
 fairy-0.000
 seek-0.000
 wizard-0.000
Token always
Token ID1464
Feature activation+3.42e-01

Pos logit contributions
 gard+0.012
Els+0.010
 gn+0.010
Ar+0.010
 guardian+0.009
Neg logit contributions
ucker-0.011
 anymore-0.010
 mouths-0.010
ifying-0.010
angles-0.010
Token kept
Token ID4030
Feature activation+2.15e-02

Pos logit contributions
 gard+0.012
Els+0.010
 gn+0.009
 guardian+0.009
Ar+0.009
Neg logit contributions
ucker-0.012
ifying-0.012
 mouths-0.012
ours-0.012
inches-0.011
Token a
Token ID257
Feature activation-1.63e-01

Pos logit contributions
 gard+0.001
Els+0.001
 gn+0.001
Ar+0.001
 guardian+0.001
Neg logit contributions
ucker-0.001
ifying-0.001
ours-0.001
 mouths-0.001
 afar-0.001
Token pe
Token ID613
Feature activation+3.71e+00

Pos logit contributions
ifying+0.006
ucker+0.006
 anymore+0.005
ours+0.005
 mouths+0.005
Neg logit contributions
 gard-0.006
 gn-0.005
Els-0.005
Ar-0.004
 guardian-0.004
Tokenb
Token ID65
Feature activation+2.54e+00

Pos logit contributions
 gard+0.128
 gn+0.102
Els+0.099
Ar+0.093
 guardian+0.092
Neg logit contributions
ifying-0.142
ucker-0.140
ours-0.135
 anymore-0.131
 mouths-0.131
Tokenble
Token ID903
Feature activation+9.97e-01

Pos logit contributions
 gard+0.093
 gn+0.081
Ar+0.080
Els+0.076
 seek+0.075
Neg logit contributions
ifying-0.088
ucker-0.086
outing-0.081
ours-0.080
angles-0.080
Token in
Token ID287
Feature activation+5.04e-01

Pos logit contributions
 gard+0.033
 gn+0.026
 wizard+0.024
 guardian+0.024
Els+0.023
Neg logit contributions
ucker-0.041
ifying-0.040
ours-0.040
 mouths-0.039
angles-0.038
Token her
Token ID607
Feature activation+1.64e-01

Pos logit contributions
 gard+0.016
 gn+0.013
Els+0.012
 guardian+0.012
Ar+0.011
Neg logit contributions
ifying-0.021
ucker-0.021
ours-0.020
angles-0.019
 mouths-0.019
Token pocket
Token ID10000
Feature activation-1.14e+00

Pos logit contributions
 gard+0.006
 gn+0.005
Els+0.005
 guardian+0.005
Ar+0.005
Neg logit contributions
ifying-0.006
ucker-0.006
ours-0.005
 anymore-0.005
 mouths-0.005
TokenI
Token ID40
Feature activation+2.08e-01

Pos logit contributions
 gard+0.020
 gn+0.016
Els+0.015
 guardian+0.014
Ar+0.014
Neg logit contributions
ifying-0.027
ucker-0.026
ours-0.025
 anymore-0.025
 mouths-0.025
Token wish
Token ID4601
Feature activation+1.54e-01

Pos logit contributions
 gard+0.007
 gn+0.006
Els+0.006
 guardian+0.005
 wizard+0.005
Neg logit contributions
ifying-0.008
ucker-0.008
 mouths-0.008
ours-0.007
angles-0.007
Token I
Token ID314
Feature activation+8.74e-02

Pos logit contributions
 gard+0.005
 gn+0.004
Els+0.004
 guardian+0.004
 wizard+0.004
Neg logit contributions
ifying-0.006
ucker-0.006
ours-0.006
 mouths-0.005
angles-0.005
Token had
Token ID550
Feature activation-7.94e-01

Pos logit contributions
 gard+0.003
 gn+0.002
Els+0.002
 guardian+0.002
 seek+0.002
Neg logit contributions
ucker-0.004
ifying-0.004
ours-0.004
 mouths-0.004
angles-0.003
Token a
Token ID257
Feature activation+4.44e-02

Pos logit contributions
ifying+0.033
ucker+0.033
ours+0.032
angles+0.031
 mouths+0.030
Neg logit contributions
 gard-0.025
 gn-0.020
Els-0.018
 guardian-0.017
Ar-0.017
Token pe
Token ID613
Feature activation+3.64e+00

Pos logit contributions
 gard+0.001
 gn+0.001
Els+0.001
Ar+0.001
 guardian+0.001
Neg logit contributions
ucker-0.002
ifying-0.002
 anymore-0.002
 mouths-0.002
ours-0.002
Tokenb
Token ID65
Feature activation+2.51e+00

Pos logit contributions
 gard+0.120
 gn+0.095
Els+0.092
Ar+0.087
 guardian+0.085
Neg logit contributions
ifying-0.143
ucker-0.142
ours-0.137
 anymore-0.133
angles-0.133
Tokenble
Token ID903
Feature activation+1.08e+00

Pos logit contributions
 gard+0.094
Ar+0.082
 gn+0.078
Els+0.076
 seek+0.075
Neg logit contributions
ucker-0.083
ifying-0.080
ours-0.078
angles-0.076
 anymore-0.075
Token like
Token ID588
Feature activation-8.18e-01

Pos logit contributions
 gard+0.036
 gn+0.029
Els+0.027
 guardian+0.026
 wizard+0.026
Neg logit contributions
ifying-0.042
ucker-0.042
ours-0.041
 mouths-0.040
angles-0.040
Token that
Token ID326
Feature activation-5.20e-01

Pos logit contributions
ucker+0.032
ifying+0.030
ours+0.030
angles+0.029
 mouths+0.028
Neg logit contributions
 gard-0.029
 gn-0.024
Els-0.022
 wizard-0.021
 elf-0.021
Token,"
Token ID553
Feature activation+3.99e-01

Pos logit contributions
ifying+0.021
ucker+0.020
ours+0.020
 mouths+0.019
angles+0.019
Neg logit contributions
 gard-0.018
 gn-0.014
Els-0.013
 wizard-0.013
 guardian-0.013
Token soon
Token ID2582
Feature activation+6.79e-01

Pos logit contributions
 gard+0.018
 gn+0.015
 guardian+0.014
 wizard+0.013
 himself+0.013
Neg logit contributions
ucker-0.023
ifying-0.022
ours-0.021
angles-0.021
inches-0.020
Token he
Token ID339
Feature activation-3.98e-01

Pos logit contributions
 gard+0.024
 gn+0.020
Els+0.019
 guardian+0.018
Ar+0.018
Neg logit contributions
ifying-0.025
ucker-0.025
ours-0.024
 mouths-0.023
 anymore-0.023
Token heard
Token ID2982
Feature activation+3.52e-01

Pos logit contributions
ifying+0.017
ucker+0.017
ours+0.016
angles+0.016
 mouths+0.016
Neg logit contributions
 gard-0.012
 gn-0.009
Els-0.009
 guardian-0.008
Ar-0.008
Token a
Token ID257
Feature activation-1.45e-01

Pos logit contributions
 gard+0.009
 gn+0.007
Els+0.006
 guardian+0.006
 wizard+0.006
Neg logit contributions
ucker-0.017
ifying-0.016
ours-0.016
angles-0.016
inches-0.015
Token big
Token ID1263
Feature activation+2.38e-01

Pos logit contributions
ifying+0.006
ucker+0.005
ours+0.005
 anymore+0.005
 mouths+0.005
Neg logit contributions
 gard-0.005
 gn-0.004
Els-0.004
 guardian-0.004
Ar-0.004
Token slur
Token ID40066
Feature activation+3.61e+00

Pos logit contributions
 gard+0.007
 gn+0.006
Els+0.005
 guardian+0.005
 wizard+0.005
Neg logit contributions
ifying-0.010
ucker-0.010
ours-0.009
angles-0.009
 mouths-0.009
Tokenp
Token ID79
Feature activation+1.88e+00

Pos logit contributions
 gard+0.125
 gn+0.103
Els+0.100
Ar+0.096
 seek+0.092
Neg logit contributions
ifying-0.132
ucker-0.131
 anymore-0.130
angles-0.124
ours-0.123
Token and
Token ID290
Feature activation+6.23e-02

Pos logit contributions
 gard+0.057
 gn+0.047
Els+0.043
 mer+0.042
 guardian+0.042
Neg logit contributions
ucker-0.075
ifying-0.075
ours-0.073
angles-0.072
 mouths-0.072
Token a
Token ID257
Feature activation-3.57e-01

Pos logit contributions
 gard+0.002
 gn+0.002
Els+0.002
Ar+0.001
 guardian+0.001
Neg logit contributions
ifying-0.002
ucker-0.002
ours-0.002
 mouths-0.002
 anymore-0.002
Token splash
Token ID22870
Feature activation+5.70e-01

Pos logit contributions
ifying+0.013
ucker+0.013
ours+0.013
angles+0.013
 anymore+0.012
Neg logit contributions
 gard-0.013
 gn-0.010
Els-0.010
 guardian-0.009
 wizard-0.009
Token coming
Token ID2406
Feature activation+6.73e-01

Pos logit contributions
 gard+0.017
 gn+0.013
 mer+0.012
Els+0.012
 wizard+0.011
Neg logit contributions
ucker-0.025
ifying-0.024
ours-0.024
angles-0.023
 mouths-0.023
Token mom
Token ID1995
Feature activation-3.33e-01

Pos logit contributions
ifying+0.004
ucker+0.004
ours+0.004
angles+0.003
 mouths+0.003
Neg logit contributions
 gard-0.004
 gn-0.003
Els-0.003
Ar-0.003
 guardian-0.003
Tokenm
Token ID76
Feature activation+1.73e+00

Pos logit contributions
ifying+0.016
ours+0.015
angles+0.015
ucker+0.015
izing+0.015
Neg logit contributions
 gard-0.007
 gn-0.006
 been-0.005
 members-0.005
 next-0.005
Tokenies
Token ID444
Feature activation+6.46e-01

Pos logit contributions
 gard+0.061
 gn+0.050
 servants+0.048
 guardian+0.048
 himself+0.048
Neg logit contributions
ifying-0.062
ucker-0.055
angles-0.054
 mouths-0.054
ours-0.052
Token and
Token ID290
Feature activation+5.35e-01

Pos logit contributions
 gard+0.021
 gn+0.017
 guardian+0.015
 wizard+0.015
 mer+0.015
Neg logit contributions
ucker-0.026
angles-0.026
ours-0.026
ifying-0.024
 mouths-0.024
Token d
Token ID288
Feature activation-1.62e-01

Pos logit contributions
 gard+0.019
 gn+0.016
 mer+0.014
 wizard+0.014
 guardian+0.014
Neg logit contributions
ucker-0.021
ifying-0.020
angles-0.019
ours-0.019
 anymore-0.019
Tokenadd
Token ID2860
Feature activation+3.58e+00

Pos logit contributions
 mouths+0.005
ifying+0.005
ucker+0.005
ours+0.005
 anymore+0.005
Neg logit contributions
 gard-0.006
Els-0.005
 gn-0.005
Ar-0.005
 guardian-0.005
Tokenies
Token ID444
Feature activation+1.93e-01

Pos logit contributions
 gard+0.138
 gn+0.117
Ar+0.112
Els+0.111
 servants+0.107
Neg logit contributions
ifying-0.119
ucker-0.106
ours-0.103
 mouths-0.100
 anymore-0.096
Token.
Token ID13
Feature activation-2.26e-01

Pos logit contributions
 gard+0.006
 gn+0.005
 guardian+0.004
Els+0.004
 wizard+0.004
Neg logit contributions
ucker-0.008
ifying-0.008
ours-0.008
angles-0.008
 mouths-0.008
Token Lily
Token ID20037
Feature activation-2.90e-01

Pos logit contributions
ifying+0.011
ucker+0.011
angles+0.010
ours+0.010
izing+0.010
Neg logit contributions
 gard-0.006
 gn-0.004
Els-0.004
Ar-0.004
 guardian-0.003
Token and
Token ID290
Feature activation-4.32e-01

Pos logit contributions
ifying+0.012
ucker+0.012
ours+0.011
angles+0.011
 anymore+0.011
Neg logit contributions
 gard-0.009
 gn-0.007
Els-0.007
 guardian-0.006
Ar-0.006
Token Ben
Token ID3932
Feature activation-1.35e+00

Pos logit contributions
ifying+0.017
ucker+0.016
ours+0.015
 anymore+0.015
angles+0.014
Neg logit contributions
 gard-0.016
 gn-0.013
 wizard-0.012
 guardian-0.011
Els-0.011
Token.
Token ID13
Feature activation+1.13e-01

Pos logit contributions
 gard+0.023
 gn+0.018
 wizard+0.016
 guardian+0.016
Els+0.016
Neg logit contributions
ucker-0.037
ifying-0.036
ours-0.035
 mouths-0.034
angles-0.034
Token But
Token ID887
Feature activation+4.52e-01

Pos logit contributions
 gard+0.003
 gn+0.002
Els+0.002
Ar+0.002
 guardian+0.002
Neg logit contributions
ifying-0.005
ucker-0.005
angles-0.005
ours-0.005
 mouths-0.005
Token she
Token ID673
Feature activation-1.20e-01

Pos logit contributions
 gard+0.014
 gn+0.011
Els+0.010
 seek+0.010
 wizard+0.010
Neg logit contributions
ifying-0.020
ucker-0.020
ours-0.018
angles-0.018
inches-0.017
Token felt
Token ID2936
Feature activation-3.68e-01

Pos logit contributions
ifying+0.005
ucker+0.005
ours+0.004
angles+0.004
 anymore+0.004
Neg logit contributions
 gard-0.004
 gn-0.003
Els-0.003
 guardian-0.003
 wizard-0.003
Token a
Token ID257
Feature activation-4.31e-01

Pos logit contributions
angles+0.015
ucker+0.014
ifying+0.014
ours+0.014
ishment+0.013
Neg logit contributions
 gard-0.013
 gn-0.011
 himself-0.009
 wizard-0.009
 mer-0.009
Token pe
Token ID613
Feature activation+3.56e+00

Pos logit contributions
ifying+0.016
ucker+0.015
ours+0.014
 anymore+0.014
 mouths+0.014
Neg logit contributions
 gard-0.016
 gn-0.013
Els-0.012
 guardian-0.012
Ar-0.012
Tokena
Token ID64
Feature activation+1.08e-01

Pos logit contributions
 gard+0.117
 gn+0.095
Els+0.090
 guardian+0.086
 wizard+0.085
Neg logit contributions
ifying-0.142
ucker-0.138
 mouths-0.133
ours-0.133
 anymore-0.132
Token under
Token ID739
Feature activation-1.11e-01

Pos logit contributions
 gard+0.003
 gn+0.002
 wizard+0.002
 guardian+0.002
 mer+0.002
Neg logit contributions
ucker-0.005
angles-0.005
ours-0.005
ifying-0.005
 mouths-0.005
Token them
Token ID606
Feature activation+1.52e+00

Pos logit contributions
ucker+0.005
ifying+0.005
ours+0.005
angles+0.005
outing+0.005
Neg logit contributions
 gard-0.003
 gn-0.002
Els-0.002
 wizard-0.002
 mer-0.002
Token.
Token ID13
Feature activation+2.75e-01

Pos logit contributions
 gard+0.039
 gn+0.032
 wizard+0.027
 guardian+0.027
 mer+0.026
Neg logit contributions
ucker-0.071
ours-0.069
ifying-0.069
angles-0.068
inches-0.064
Token She
Token ID1375
Feature activation+2.07e-01

Pos logit contributions
 gard+0.007
 gn+0.005
Els+0.005
Ar+0.004
 guardian+0.004
Neg logit contributions
ifying-0.013
angles-0.012
ucker-0.012
izing-0.012
ours-0.012
Token,
Token ID11
Feature activation+4.56e-01

Pos logit contributions
ifying+0.012
ucker+0.012
ours+0.011
 mouths+0.011
 anymore+0.011
Neg logit contributions
 gard-0.013
 gn-0.011
Els-0.010
 guardian-0.010
Ar-0.010
Token Peter
Token ID5613
Feature activation+1.36e-02

Pos logit contributions
 gard+0.016
 gn+0.013
Els+0.012
Ar+0.012
 mer+0.012
Neg logit contributions
ucker-0.018
ifying-0.017
ours-0.016
inches-0.016
angles-0.016
Token stepped
Token ID10764
Feature activation+6.78e-01

Pos logit contributions
 gard+0.000
 gn+0.000
Els+0.000
 guardian+0.000
Ar+0.000
Neg logit contributions
ifying-0.001
ucker-0.001
ours-0.001
 anymore-0.001
 mouths-0.001
Token on
Token ID319
Feature activation-1.75e-01

Pos logit contributions
 gard+0.026
Els+0.022
 gn+0.022
Ar+0.022
 guardian+0.021
Neg logit contributions
ifying-0.024
 anymore-0.020
ours-0.020
ucker-0.019
outing-0.019
Token a
Token ID257
Feature activation+2.28e-01

Pos logit contributions
ucker+0.007
ifying+0.007
ours+0.007
angles+0.007
 mouths+0.007
Neg logit contributions
 gard-0.006
 gn-0.005
Els-0.004
Ar-0.004
 guardian-0.004
Token pe
Token ID613
Feature activation+3.56e+00

Pos logit contributions
 gard+0.006
 gn+0.005
 wizard+0.004
Els+0.004
 seek+0.004
Neg logit contributions
ucker-0.011
ours-0.011
ifying-0.011
angles-0.010
 mouths-0.010
Tokenb
Token ID65
Feature activation+2.53e+00

Pos logit contributions
 gard+0.135
 gn+0.110
Els+0.107
Ar+0.103
 guardian+0.099
Neg logit contributions
ifying-0.123
ucker-0.121
ours-0.118
angles-0.115
 mouths-0.113
Tokenble
Token ID903
Feature activation+9.78e-01

Pos logit contributions
 gard+0.093
 gn+0.078
Ar+0.077
Els+0.076
 seek+0.074
Neg logit contributions
ifying-0.089
ucker-0.086
ours-0.085
angles-0.084
outing-0.082
Token and
Token ID290
Feature activation-1.02e-01

Pos logit contributions
 gard+0.032
 gn+0.025
Els+0.023
 wizard+0.023
 guardian+0.022
Neg logit contributions
ucker-0.040
ifying-0.040
ours-0.039
angles-0.038
 mouths-0.038
Token it
Token ID340
Feature activation-2.66e-02

Pos logit contributions
ifying+0.004
ucker+0.004
ours+0.004
 anymore+0.004
angles+0.004
Neg logit contributions
 gard-0.003
 gn-0.003
Els-0.002
 guardian-0.002
Ar-0.002
Token hurt
Token ID5938
Feature activation+2.88e-01

Pos logit contributions
ifying+0.001
ucker+0.001
 mouths+0.001
 anymore+0.001
ours+0.001
Neg logit contributions
 gard-0.001
Els-0.001
 gn-0.001
 guardian-0.001
Ar-0.001
Token big
Token ID1263
Feature activation+1.18e-01

Pos logit contributions
ifying+0.008
ucker+0.007
ours+0.007
angles+0.007
 mouths+0.007
Neg logit contributions
 gard-0.005
 gn-0.004
Els-0.003
 wizard-0.003
Ar-0.003
Token pink
Token ID11398
Feature activation+3.92e-01

Pos logit contributions
 gard+0.003
 gn+0.002
 wizard+0.002
 bear+0.002
 elf+0.002
Neg logit contributions
ifying-0.006
ucker-0.006
ours-0.005
angles-0.005
outing-0.005
Token bear
Token ID6842
Feature activation+1.11e-01

Pos logit contributions
 gard+0.011
 gn+0.008
 wizard+0.007
 guardian+0.007
 mer+0.007
Neg logit contributions
ucker-0.019
ifying-0.018
angles-0.018
ours-0.018
inches-0.017
Token called
Token ID1444
Feature activation-6.21e-01

Pos logit contributions
 gard+0.004
 gn+0.003
Els+0.003
 wizard+0.003
 guardian+0.003
Neg logit contributions
ifying-0.004
ucker-0.004
angles-0.004
ours-0.004
 mouths-0.004
Token P
Token ID350
Feature activation+7.15e-01

Pos logit contributions
ifying+0.022
ucker+0.022
ours+0.021
 mouths+0.021
 anymore+0.020
Neg logit contributions
 gard-0.023
 gn-0.019
Els-0.018
 guardian-0.017
Ar-0.017
Tokenadd
Token ID2860
Feature activation+3.56e+00

Pos logit contributions
 gard+0.030
 gn+0.025
Els+0.024
 guardian+0.023
Ar+0.023
Neg logit contributions
ifying-0.022
ucker-0.022
ours-0.021
 anymore-0.020
 mouths-0.020
Tokenington
Token ID9557
Feature activation+3.87e-01

Pos logit contributions
 gard+0.166
 gn+0.142
Els+0.138
 guardian+0.133
Ar+0.132
Neg logit contributions
ifying-0.092
ucker-0.090
ours-0.085
 anymore-0.083
 mouths-0.083
Token.
Token ID13
Feature activation+4.97e-02

Pos logit contributions
 gard+0.014
 gn+0.011
Els+0.011
Ar+0.010
 guardian+0.010
Neg logit contributions
ifying-0.014
ucker-0.014
ours-0.013
 mouths-0.013
 anymore-0.013
Token He
Token ID679
Feature activation+1.38e-01

Pos logit contributions
 gard+0.002
Els+0.001
 gn+0.001
 guardian+0.001
Ar+0.001
Neg logit contributions
ifying-0.002
ucker-0.002
ours-0.002
 mouths-0.002
 anymore-0.002
Token loved
Token ID6151
Feature activation+8.38e-02

Pos logit contributions
 gard+0.006
 guardian+0.005
Els+0.005
 gn+0.005
 wizard+0.005
Neg logit contributions
ifying-0.004
ucker-0.004
 mouths-0.004
ours-0.003
 anymore-0.003
Token to
Token ID284
Feature activation-1.32e-01

Pos logit contributions
 gard+0.003
Els+0.002
 guardian+0.002
 gn+0.002
Ar+0.002
Neg logit contributions
 mouths-0.003
ucker-0.003
ours-0.003
ifying-0.003
 anymore-0.003
Token cute
Token ID13779
Feature activation-6.93e-01

Pos logit contributions
ifying+0.003
 anymore+0.003
angles+0.003
ucker+0.003
ours+0.003
Neg logit contributions
 gard-0.002
 gn-0.002
Els-0.002
Ar-0.002
 guardian-0.002
Token and
Token ID290
Feature activation+7.69e-02

Pos logit contributions
 anymore+0.022
ifying+0.021
ucker+0.020
ours+0.019
 mouths+0.019
Neg logit contributions
 gard-0.028
Els-0.024
Ar-0.024
 gn-0.023
 guardian-0.022
Token made
Token ID925
Feature activation-4.15e-01

Pos logit contributions
 gard+0.002
 gn+0.002
 seek+0.002
Els+0.002
 mer+0.002
Neg logit contributions
ifying-0.003
ucker-0.003
ours-0.003
 anymore-0.003
angles-0.003
Token a
Token ID257
Feature activation+2.06e-01

Pos logit contributions
ifying+0.019
ucker+0.018
ours+0.018
angles+0.017
 mouths+0.017
Neg logit contributions
 gard-0.011
 gn-0.009
Els-0.008
 guardian-0.008
 mer-0.008
Token soft
Token ID2705
Feature activation+3.90e-01

Pos logit contributions
 gard+0.007
 gn+0.006
Els+0.005
 mer+0.005
Ar+0.005
Neg logit contributions
ifying-0.008
ours-0.008
ucker-0.008
angles-0.008
 anymore-0.008
Token pe
Token ID613
Feature activation+3.53e+00

Pos logit contributions
 gard+0.013
 gn+0.010
Els+0.009
 guardian+0.009
Ar+0.009
Neg logit contributions
ifying-0.016
ucker-0.015
ours-0.015
angles-0.015
 anymore-0.015
Tokenep
Token ID538
Feature activation+1.72e+00

Pos logit contributions
 gard+0.126
 gn+0.102
Els+0.098
 guardian+0.093
Ar+0.092
Neg logit contributions
ifying-0.131
ucker-0.128
ours-0.123
 anymore-0.121
 mouths-0.121
Token sound
Token ID2128
Feature activation+6.65e-01

Pos logit contributions
 gard+0.055
 gn+0.045
Els+0.041
 guardian+0.040
 wizard+0.039
Neg logit contributions
ifying-0.069
ucker-0.068
ours-0.066
 mouths-0.064
angles-0.064
Token.
Token ID13
Feature activation+2.57e-01

Pos logit contributions
 gard+0.022
 gn+0.018
Els+0.017
 guardian+0.016
 wizard+0.016
Neg logit contributions
ifying-0.026
ucker-0.026
ours-0.025
angles-0.024
 mouths-0.024
Token
Token ID198
Feature activation+9.36e-02

Pos logit contributions
 gard+0.006
 gn+0.004
Els+0.004
Ar+0.004
 wizard+0.004
Neg logit contributions
ucker-0.012
ifying-0.012
angles-0.012
 mouths-0.012
ours-0.012
Token
Token ID198
Feature activation+1.35e-01

Pos logit contributions
 gard+0.002
 gn+0.002
Els+0.002
 guardian+0.002
Ar+0.002
Neg logit contributions
ifying-0.004
ucker-0.004
ours-0.004
 anymore-0.004
 mouths-0.004
Token the
Token ID262
Feature activation+8.05e-02

Pos logit contributions
ucker+0.070
ifying+0.067
ours+0.067
outing+0.065
angles+0.064
Neg logit contributions
 gard-0.049
 gn-0.042
Els-0.037
 mer-0.037
 himself-0.037
Token kids
Token ID3988
Feature activation+4.33e-01

Pos logit contributions
 gard+0.003
 gn+0.002
Els+0.002
Ar+0.002
 guardian+0.002
Neg logit contributions
ifying-0.003
ucker-0.003
ours-0.003
angles-0.003
 mouths-0.003
Token had
Token ID550
Feature activation-1.83e-01

Pos logit contributions
 gard+0.013
 gn+0.010
Els+0.009
 guardian+0.009
 wizard+0.009
Neg logit contributions
ucker-0.018
ifying-0.018
ours-0.018
angles-0.017
 mouths-0.017
Token their
Token ID511
Feature activation+2.54e-01

Pos logit contributions
ifying+0.008
ucker+0.007
ours+0.007
 mouths+0.007
 anymore+0.007
Neg logit contributions
 gard-0.006
 gn-0.004
Els-0.004
 guardian-0.004
Ar-0.004
Token t
Token ID256
Feature activation+1.76e+00

Pos logit contributions
 gard+0.008
 gn+0.006
Ar+0.006
 guardian+0.006
 mer+0.006
Neg logit contributions
ucker-0.011
ours-0.010
ifying-0.010
angles-0.010
outing-0.010
Tokenamb
Token ID4131
Feature activation+3.52e+00

Pos logit contributions
 gard+0.068
 gn+0.056
Els+0.056
 guardian+0.053
Ar+0.053
Neg logit contributions
ifying-0.058
ucker-0.056
ours-0.055
 mouths-0.054
 anymore-0.054
Tokenour
Token ID454
Feature activation+1.47e+00

Pos logit contributions
 gard+0.128
 guardian+0.109
Els+0.107
 wizard+0.105
 gn+0.104
Neg logit contributions
ifying-0.115
ucker-0.114
 anymore-0.109
ours-0.106
 mouths-0.103
Tokenines
Token ID1127
Feature activation+6.85e-01

Pos logit contributions
 gard+0.049
 gn+0.041
Els+0.038
 guardian+0.038
 wizard+0.036
Neg logit contributions
ucker-0.055
ifying-0.054
ours-0.054
angles-0.052
 mouths-0.051
Token and
Token ID290
Feature activation+1.54e-01

Pos logit contributions
 gard+0.019
 gn+0.015
 mer+0.013
 guardian+0.013
Els+0.013
Neg logit contributions
ucker-0.031
ours-0.030
ifying-0.029
angles-0.029
inches-0.028
Token drums
Token ID24730
Feature activation-2.77e-01

Pos logit contributions
 gard+0.005
 gn+0.004
 seek+0.003
Els+0.003
 guardian+0.003
Neg logit contributions
ifying-0.006
ucker-0.006
ours-0.006
angles-0.006
 anymore-0.006
Token,
Token ID11
Feature activation-4.78e-01

Pos logit contributions
ucker+0.011
ifying+0.011
ours+0.011
 mouths+0.011
angles+0.011
Neg logit contributions
 gard-0.009
 gn-0.007
Els-0.006
 guardian-0.006
 mer-0.006
Token,
Token ID11
Feature activation+6.18e-01

Pos logit contributions
 gard+0.015
 gn+0.012
Els+0.012
 guardian+0.012
Ar+0.012
Neg logit contributions
 anymore-0.010
ours-0.010
ifying-0.010
ucker-0.010
 mouths-0.009
Token she
Token ID673
Feature activation+3.21e-01

Pos logit contributions
 gard+0.019
 gn+0.015
Els+0.013
 seek+0.013
 mer+0.013
Neg logit contributions
ucker-0.028
ifying-0.027
angles-0.026
ours-0.026
ishment-0.025
Token accidentally
Token ID14716
Feature activation+4.10e-01

Pos logit contributions
 gard+0.010
 gn+0.008
Els+0.008
 guardian+0.007
Ar+0.007
Neg logit contributions
ifying-0.013
ucker-0.013
ours-0.012
 anymore-0.012
 mouths-0.012
Token threw
Token ID9617
Feature activation-3.93e-01

Pos logit contributions
 gard+0.016
Els+0.014
 gn+0.013
Ar+0.013
 guardian+0.013
Neg logit contributions
ifying-0.013
ucker-0.012
ours-0.012
 mouths-0.012
 anymore-0.012
Token a
Token ID257
Feature activation+4.00e-01

Pos logit contributions
ucker+0.020
ifying+0.019
ours+0.019
angles+0.018
outing+0.018
Neg logit contributions
 gard-0.009
 gn-0.007
Els-0.006
 himself-0.006
 h-0.006
Token pe
Token ID613
Feature activation+3.51e+00

Pos logit contributions
 gard+0.012
 gn+0.009
 mer+0.009
 wizard+0.009
Ar+0.008
Neg logit contributions
ishment-0.017
ucker-0.017
ours-0.017
angles-0.017
ifying-0.016
Tokenb
Token ID65
Feature activation+2.77e+00

Pos logit contributions
 gard+0.117
 gn+0.093
Els+0.090
Ar+0.085
 guardian+0.084
Neg logit contributions
ifying-0.138
ucker-0.136
ours-0.131
angles-0.128
 anymore-0.128
Tokenble
Token ID903
Feature activation+1.10e+00

Pos logit contributions
 gard+0.110
Ar+0.094
 seek+0.093
 gn+0.090
Els+0.090
Neg logit contributions
ifying-0.086
ucker-0.085
ours-0.085
angles-0.080
inches-0.078
Token at
Token ID379
Feature activation-1.40e-01

Pos logit contributions
 gard+0.035
 gn+0.027
Els+0.024
 guardian+0.024
 wizard+0.024
Neg logit contributions
ifying-0.045
ucker-0.045
ours-0.043
 mouths-0.043
angles-0.043
Token her
Token ID607
Feature activation+3.73e-01

Pos logit contributions
ucker+0.007
ifying+0.007
ours+0.007
ience+0.007
outing+0.007
Neg logit contributions
 gard-0.003
 himself-0.002
 gn-0.002
 two-0.002
Els-0.002
Token friend
Token ID1545
Feature activation-9.03e-02

Pos logit contributions
 gard+0.011
 gn+0.009
Els+0.008
Ar+0.008
 guardian+0.008
Neg logit contributions
ifying-0.016
ucker-0.016
ours-0.015
 anymore-0.015
 mouths-0.015
Token,
Token ID11
Feature activation-2.20e-01

Pos logit contributions
ifying+0.019
ucker+0.019
ours+0.018
angles+0.018
 mouths+0.018
Neg logit contributions
 gard-0.016
 gn-0.013
Els-0.011
 guardian-0.011
 wizard-0.011
Token then
Token ID788
Feature activation-3.42e-01

Pos logit contributions
ifying+0.009
ucker+0.009
ours+0.008
 anymore+0.008
angles+0.008
Neg logit contributions
 gard-0.007
 gn-0.006
Els-0.005
 guardian-0.005
 wizard-0.005
Token she
Token ID673
Feature activation-4.72e-03

Pos logit contributions
ifying+0.015
ucker+0.015
 anymore+0.014
angles+0.014
ours+0.014
Neg logit contributions
 gard-0.011
 gn-0.008
Els-0.007
 seek-0.007
 mer-0.007
Token grabbed
Token ID13176
Feature activation+1.46e-01

Pos logit contributions
ifying+0.000
ucker+0.000
ours+0.000
 mouths+0.000
 anymore+0.000
Neg logit contributions
 gard-0.000
 gn-0.000
Els-0.000
 guardian-0.000
Ar-0.000
Token a
Token ID257
Feature activation+3.00e-01

Pos logit contributions
 gard+0.004
 gn+0.003
Els+0.003
 himself+0.002
 two+0.002
Neg logit contributions
ucker-0.007
ifying-0.007
angles-0.006
ours-0.006
 anymore-0.006
Token pe
Token ID613
Feature activation+3.49e+00

Pos logit contributions
 gard+0.010
 gn+0.008
Els+0.007
 mer+0.007
Ar+0.007
Neg logit contributions
ucker-0.012
angles-0.012
ifying-0.012
ours-0.012
 anymore-0.012
Tokenb
Token ID65
Feature activation+2.64e+00

Pos logit contributions
 gard+0.125
 gn+0.101
Els+0.097
Ar+0.093
 guardian+0.091
Neg logit contributions
ifying-0.129
ucker-0.128
ours-0.125
angles-0.120
 mouths-0.118
Tokenble
Token ID903
Feature activation+9.26e-01

Pos logit contributions
 gard+0.103
 gn+0.089
Ar+0.088
 seek+0.085
Els+0.084
Neg logit contributions
ifying-0.088
ours-0.086
ucker-0.084
angles-0.080
outing-0.078
Token.
Token ID13
Feature activation+1.30e-01

Pos logit contributions
 gard+0.032
 gn+0.025
Els+0.024
Ar+0.023
 guardian+0.023
Neg logit contributions
ifying-0.036
ucker-0.035
ours-0.034
 anymore-0.033
 mouths-0.033
Token She
Token ID1375
Feature activation+2.98e-01

Pos logit contributions
 gard+0.004
 gn+0.003
Els+0.003
 guardian+0.003
 wizard+0.003
Neg logit contributions
ucker-0.006
ifying-0.006
ours-0.005
angles-0.005
 mouths-0.005
Token looked
Token ID3114
Feature activation-1.16e+00

Pos logit contributions
 gard+0.010
 gn+0.008
Els+0.007
 guardian+0.007
Ar+0.007
Neg logit contributions
ifying-0.012
ucker-0.012
ours-0.011
 mouths-0.011
angles-0.011
Token day
Token ID1110
Feature activation-1.36e+00

Pos logit contributions
 gard+0.007
 gn+0.005
 seek+0.005
 guardian+0.004
 wizard+0.004
Neg logit contributions
ucker-0.014
ifying-0.014
ours-0.014
 mouths-0.014
angles-0.013
Token and
Token ID290
Feature activation+2.55e-01

Pos logit contributions
ifying+0.050
ucker+0.049
ours+0.048
 mouths+0.046
angles+0.046
Neg logit contributions
 gard-0.049
 gn-0.039
Els-0.038
 guardian-0.036
Ar-0.036
Token she
Token ID673
Feature activation+1.18e-01

Pos logit contributions
 gard+0.008
 gn+0.007
Els+0.006
 seek+0.006
 wizard+0.006
Neg logit contributions
ifying-0.011
ucker-0.010
ours-0.010
angles-0.010
 mouths-0.010
Token found
Token ID1043
Feature activation+9.36e-02

Pos logit contributions
 gard+0.004
 gn+0.003
Els+0.003
 guardian+0.003
Ar+0.003
Neg logit contributions
ifying-0.005
ucker-0.005
ours-0.004
 mouths-0.004
 anymore-0.004
Token a
Token ID257
Feature activation+2.96e-01

Pos logit contributions
 gard+0.003
 gn+0.002
Els+0.002
 guardian+0.002
 wizard+0.002
Neg logit contributions
ifying-0.004
ucker-0.004
ours-0.004
 mouths-0.004
angles-0.004
Token pe
Token ID613
Feature activation+3.49e+00

Pos logit contributions
 gard+0.009
 gn+0.007
 wizard+0.007
 guardian+0.006
Els+0.006
Neg logit contributions
ucker-0.013
ifying-0.013
ours-0.013
angles-0.012
 mouths-0.012
Tokenb
Token ID65
Feature activation+2.57e+00

Pos logit contributions
 gard+0.110
 gn+0.087
Els+0.083
 guardian+0.078
 wizard+0.077
Neg logit contributions
ifying-0.144
ucker-0.140
ours-0.136
 mouths-0.135
 anymore-0.134
Tokenble
Token ID903
Feature activation+1.07e+00

Pos logit contributions
 gard+0.093
Ar+0.077
 gn+0.075
Els+0.072
 seek+0.072
Neg logit contributions
ifying-0.093
ucker-0.089
ours-0.088
angles-0.086
 anymore-0.084
Token.
Token ID13
Feature activation-4.22e-04

Pos logit contributions
 gard+0.032
 gn+0.026
 wizard+0.023
 guardian+0.022
Els+0.022
Neg logit contributions
ucker-0.047
ifying-0.045
angles-0.044
ours-0.044
 mouths-0.044
Token She
Token ID1375
Feature activation+1.82e-01

Pos logit contributions
ifying+0.000
ucker+0.000
ours+0.000
angles+0.000
 mouths+0.000
Neg logit contributions
 gard-0.000
 gn-0.000
Els-0.000
 guardian-0.000
Ar-0.000
Tokenlly
Token ID12810
Feature activation-1.37e+00

Pos logit contributions
 gard+0.005
 gn+0.004
Els+0.004
Ar+0.004
 guardian+0.004
Neg logit contributions
ifying-0.008
ucker-0.008
ours-0.007
 anymore-0.007
 mouths-0.007
Token for
Token ID329
Feature activation+1.46e-01

Pos logit contributions
 gard+0.010
Ar+0.009
Els+0.009
 gn+0.009
 guardian+0.008
Neg logit contributions
ifying-0.008
 anymore-0.008
ucker-0.008
ours-0.007
 mouths-0.007
Token finding
Token ID4917
Feature activation+2.24e-01

Pos logit contributions
 gard+0.005
 gn+0.004
Els+0.004
Ar+0.004
 guardian+0.004
Neg logit contributions
ifying-0.005
ucker-0.005
 anymore-0.005
ours-0.005
 mouths-0.005
Token such
Token ID884
Feature activation-6.16e-01

Pos logit contributions
 gard+0.009
 gn+0.008
Ar+0.007
Els+0.007
 seek+0.007
Neg logit contributions
ifying-0.007
ucker-0.007
 anymore-0.007
ours-0.007
 mouths-0.006
Token a
Token ID257
Feature activation+1.24e-01

Pos logit contributions
ifying+0.026
ucker+0.025
ours+0.024
 anymore+0.024
 mouths+0.024
Neg logit contributions
 gard-0.019
 gn-0.015
Els-0.014
Ar-0.013
 guardian-0.013
Token beautiful
Token ID4950
Feature activation+3.34e-01

Pos logit contributions
 gard+0.005
Els+0.004
 gn+0.004
Ar+0.004
py+0.003
Neg logit contributions
ucker-0.004
ifying-0.004
 anymore-0.003
 badly-0.003
 mouths-0.003
Token pe
Token ID613
Feature activation+3.48e+00

Pos logit contributions
 gard+0.010
Els+0.008
Ar+0.008
 gn+0.008
 guardian+0.007
Neg logit contributions
ucker-0.013
 anymore-0.012
ifying-0.012
 mouths-0.012
ours-0.012
Tokenb
Token ID65
Feature activation+2.49e+00

Pos logit contributions
 gard+0.126
 gn+0.099
Els+0.097
Ar+0.097
 mer+0.091
Neg logit contributions
ucker-0.125
ifying-0.123
ours-0.119
angles-0.115
 anymore-0.114
Tokenble
Token ID903
Feature activation+1.47e+00

Pos logit contributions
 gard+0.095
Ar+0.084
 gn+0.077
 seek+0.076
Els+0.075
Neg logit contributions
ucker-0.080
ifying-0.080
ours-0.075
angles-0.074
outing-0.073
Token.
Token ID13
Feature activation+6.38e-02

Pos logit contributions
 gard+0.049
 gn+0.040
Els+0.036
 guardian+0.036
 wizard+0.036
Neg logit contributions
ucker-0.058
ifying-0.058
ours-0.057
angles-0.055
 mouths-0.055
Token\n
Token ID198
Feature activation+3.89e-02

Pos logit contributions
 gard+0.002
 gn+0.002
 seek+0.002
Ar+0.002
Els+0.002
Neg logit contributions
ifying-0.002
ucker-0.002
ours-0.002
 mouths-0.002
angles-0.002
Token\n
Token ID198
Feature activation+1.06e-01

Pos logit contributions
 gard+0.002
 gn+0.001
 seek+0.001
 guardian+0.001
 mer+0.001
Neg logit contributions
ucker-0.001
ours-0.001
ifying-0.001
bs-0.001
angles-0.001
Token,
Token ID11
Feature activation+2.26e-01

Pos logit contributions
 gard+0.021
 gn+0.017
Els+0.016
Ar+0.015
 wizard+0.015
Neg logit contributions
ucker-0.028
ifying-0.028
ours-0.027
 mouths-0.026
angles-0.026
Token she
Token ID673
Feature activation+9.28e-02

Pos logit contributions
 gard+0.007
 gn+0.006
Els+0.005
 guardian+0.005
 wizard+0.005
Neg logit contributions
ucker-0.010
ifying-0.010
ours-0.009
angles-0.009
 mouths-0.009
Token found
Token ID1043
Feature activation+6.69e-02

Pos logit contributions
 gard+0.003
 gn+0.002
Els+0.002
 guardian+0.002
 wizard+0.002
Neg logit contributions
ifying-0.004
ucker-0.004
ours-0.004
angles-0.004
 mouths-0.004
Token a
Token ID257
Feature activation+1.80e-01

Pos logit contributions
 gard+0.002
 gn+0.002
Els+0.001
 guardian+0.001
Ar+0.001
Neg logit contributions
ifying-0.003
ucker-0.003
ours-0.003
 anymore-0.003
 mouths-0.003
Token small
Token ID1402
Feature activation+4.59e-01

Pos logit contributions
 gard+0.006
 gn+0.005
 wizard+0.004
 guardian+0.004
Els+0.004
Neg logit contributions
ucker-0.008
ours-0.008
ifying-0.008
angles-0.008
 mouths-0.007
Token pe
Token ID613
Feature activation+3.42e+00

Pos logit contributions
 gard+0.014
 gn+0.011
 wizard+0.010
 guardian+0.010
 elf+0.010
Neg logit contributions
ucker-0.020
ifying-0.020
angles-0.020
ours-0.020
inches-0.019
Tokenb
Token ID65
Feature activation+2.50e+00

Pos logit contributions
 gard+0.109
 gn+0.087
Els+0.082
 guardian+0.078
 wizard+0.076
Neg logit contributions
ifying-0.140
ucker-0.137
ours-0.133
 mouths-0.131
 anymore-0.131
Tokenble
Token ID903
Feature activation+9.71e-01

Pos logit contributions
 gard+0.091
Ar+0.076
 gn+0.073
Els+0.071
 seek+0.071
Neg logit contributions
ifying-0.091
ucker-0.086
ours-0.085
angles-0.082
 anymore-0.081
Token.
Token ID13
Feature activation-2.52e-02

Pos logit contributions
 gard+0.029
 gn+0.024
 wizard+0.020
 guardian+0.020
Els+0.020
Neg logit contributions
ucker-0.042
ifying-0.041
angles-0.040
ours-0.040
 mouths-0.040
Token It
Token ID632
Feature activation+1.79e-02

Pos logit contributions
ifying+0.001
ucker+0.001
ours+0.001
angles+0.001
 mouths+0.001
Neg logit contributions
 gard-0.001
 gn-0.001
Els-0.001
 guardian-0.001
Ar-0.001
Token was
Token ID373
Feature activation-2.20e-01

Pos logit contributions
 gard+0.001
 gn+0.000
Els+0.000
Ar+0.000
 guardian+0.000
Neg logit contributions
ifying-0.001
ucker-0.001
ours-0.001
 mouths-0.001
 anymore-0.001
Token fluffy
Token ID39145
Feature activation+1.90e-03

Pos logit contributions
 gard+0.005
 gn+0.004
Els+0.004
 guardian+0.003
 wizard+0.003
Neg logit contributions
ifying-0.009
ucker-0.008
ours-0.008
angles-0.008
 mouths-0.008
Token bunny
Token ID44915
Feature activation-6.04e-01

Pos logit contributions
 gard+0.000
 gn+0.000
Els+0.000
 guardian+0.000
 wizard+0.000
Neg logit contributions
ifying-0.000
ucker-0.000
ours-0.000
angles-0.000
 mouths-0.000
Token,
Token ID11
Feature activation-3.97e-01

Pos logit contributions
ifying+0.027
ucker+0.026
ours+0.026
angles+0.026
 mouths+0.025
Neg logit contributions
 gard-0.018
 gn-0.014
 wizard-0.012
Els-0.012
 guardian-0.012
Token a
Token ID257
Feature activation-1.95e-01

Pos logit contributions
ours+0.019
angles+0.019
ifying+0.019
ucker+0.019
inches+0.018
Neg logit contributions
 gard-0.010
 gn-0.008
 mer-0.007
 seek-0.006
 wizard-0.006
Token w
Token ID266
Feature activation+9.57e-01

Pos logit contributions
ifying+0.009
ours+0.009
ucker+0.009
angles+0.009
 anymore+0.008
Neg logit contributions
 gard-0.006
 gn-0.004
Els-0.004
 wizard-0.004
 guardian-0.004
Tokenadd
Token ID2860
Feature activation+3.42e+00

Pos logit contributions
 gard+0.043
 gn+0.037
Els+0.036
Ar+0.034
 guardian+0.034
Neg logit contributions
ifying-0.027
ucker-0.026
ours-0.025
 anymore-0.024
 mouths-0.024
Tokenling
Token ID1359
Feature activation+6.60e-01

Pos logit contributions
 gard+0.142
 gn+0.119
Els+0.112
Ar+0.110
 wizard+0.110
Neg logit contributions
ifying-0.112
ucker-0.104
ours-0.102
 mouths-0.097
angles-0.095
Token duck
Token ID22045
Feature activation-1.66e-01

Pos logit contributions
 gard+0.017
 gn+0.013
 mer+0.011
Els+0.011
 wizard+0.011
Neg logit contributions
ifying-0.032
ucker-0.032
ours-0.031
angles-0.030
inches-0.030
Token,
Token ID11
Feature activation-6.19e-01

Pos logit contributions
ifying+0.007
ours+0.007
angles+0.007
ucker+0.006
 wasteful+0.006
Neg logit contributions
 gard-0.005
 gn-0.004
 mer-0.004
 guardian-0.004
 bear-0.004
Token and
Token ID290
Feature activation-3.31e-01

Pos logit contributions
ifying+0.027
ucker+0.026
ours+0.026
angles+0.026
 anymore+0.025
Neg logit contributions
 gard-0.018
 gn-0.015
Els-0.013
 mer-0.013
 guardian-0.012
Token a
Token ID257
Feature activation-5.73e-01

Pos logit contributions
ucker+0.014
ifying+0.014
ours+0.013
angles+0.013
inches+0.013
Neg logit contributions
 gard-0.010
 gn-0.009
 mer-0.008
Els-0.008
 wizard-0.007
Token of
Token ID286
Feature activation+1.17e+00

Pos logit contributions
ifying+0.006
ours+0.005
 anymore+0.005
ucker+0.005
bs+0.005
Neg logit contributions
 gard-0.007
 gn-0.005
Els-0.005
 guardian-0.005
 wizard-0.005
Token wind
Token ID2344
Feature activation+4.06e-01

Pos logit contributions
 gard+0.034
 gn+0.026
Els+0.025
 guardian+0.023
Ar+0.023
Neg logit contributions
ifying-0.051
ucker-0.050
ours-0.049
 anymore-0.048
 mouths-0.048
Token blew
Token ID17948
Feature activation+5.39e-01

Pos logit contributions
 gard+0.016
Els+0.013
 gn+0.013
 guardian+0.013
Ar+0.012
Neg logit contributions
ifying-0.013
ucker-0.013
ours-0.012
 anymore-0.012
inches-0.012
Token a
Token ID257
Feature activation+4.65e-01

Pos logit contributions
 gard+0.014
 gn+0.011
Ar+0.011
 himself+0.011
 mer+0.010
Neg logit contributions
ucker-0.025
ifying-0.024
ours-0.023
angles-0.023
inches-0.022
Token fr
Token ID1216
Feature activation-5.73e-01

Pos logit contributions
 gard+0.015
 gn+0.012
 wizard+0.010
 guardian+0.010
 elf+0.010
Neg logit contributions
ucker-0.020
angles-0.020
ours-0.019
ifying-0.019
ishment-0.019
Tokenis
Token ID271
Feature activation+3.42e+00

Pos logit contributions
ucker+0.017
ifying+0.016
ours+0.016
angles+0.016
 anymore+0.015
Neg logit contributions
 gard-0.025
 gn-0.021
Els-0.021
Ar-0.020
 guardian-0.020
Tokenbee
Token ID20963
Feature activation+1.81e+00

Pos logit contributions
 gard+0.120
 gn+0.096
Els+0.092
 guardian+0.087
Ar+0.086
Neg logit contributions
ifying-0.130
ucker-0.127
ours-0.123
 mouths-0.120
 anymore-0.119
Token and
Token ID290
Feature activation-2.03e-01

Pos logit contributions
 gard+0.052
 gn+0.041
 guardian+0.034
Els+0.034
 wizard+0.034
Neg logit contributions
ucker-0.081
ifying-0.078
ours-0.077
angles-0.075
 mouths-0.075
Token it
Token ID340
Feature activation-5.24e-02

Pos logit contributions
ucker+0.008
ifying+0.008
ours+0.008
 anymore+0.008
angles+0.008
Neg logit contributions
 gard-0.007
 gn-0.005
Els-0.005
Ar-0.005
 mer-0.005
Token almost
Token ID2048
Feature activation-3.35e-02

Pos logit contributions
ifying+0.002
ucker+0.002
 mouths+0.002
 anymore+0.002
ours+0.002
Neg logit contributions
 gard-0.002
 gn-0.002
Els-0.002
 guardian-0.001
 wizard-0.001
Token hit
Token ID2277
Feature activation-3.62e-02

Pos logit contributions
ifying+0.001
ucker+0.001
ours+0.001
 mouths+0.001
 anymore+0.001
Neg logit contributions
 gard-0.001
 gn-0.001
Els-0.001
 guardian-0.001
 wizard-0.001
Token the
Token ID262
Feature activation+3.52e-01

Pos logit contributions
ifying+0.000
ucker+0.000
ours+0.000
 mouths+0.000
 anymore+0.000
Neg logit contributions
 gard-0.000
 gn-0.000
Els-0.000
 guardian-0.000
Ar-0.000
Token n
Token ID299
Feature activation+5.11e-01

Pos logit contributions
 gard+0.012
 gn+0.010
Els+0.010
 guardian+0.009
Ar+0.009
Neg logit contributions
ifying-0.013
ucker-0.013
ours-0.013
 anymore-0.012
 mouths-0.012
Tokenooks
Token ID31085
Feature activation+4.42e-01

Pos logit contributions
 gard+0.021
 gn+0.017
Els+0.016
Ar+0.016
 wizard+0.016
Neg logit contributions
ifying-0.017
ucker-0.017
ours-0.016
angles-0.015
 anymore-0.015
Token and
Token ID290
Feature activation+6.17e-01

Pos logit contributions
 gard+0.015
 gn+0.012
Els+0.011
 guardian+0.010
 wizard+0.010
Neg logit contributions
ifying-0.018
ucker-0.017
ours-0.017
angles-0.016
 mouths-0.016
Token cr
Token ID1067
Feature activation+1.61e+00

Pos logit contributions
 gard+0.021
 gn+0.017
Els+0.016
Ar+0.015
 guardian+0.015
Neg logit contributions
ucker-0.024
ifying-0.024
 anymore-0.023
ours-0.023
angles-0.022
Tokenann
Token ID1236
Feature activation+3.40e+00

Pos logit contributions
 gard+0.069
 gn+0.059
Els+0.056
 wizard+0.055
 guardian+0.055
Neg logit contributions
ifying-0.052
ucker-0.050
ours-0.047
 mouths-0.046
angles-0.046
Tokenies
Token ID444
Feature activation+5.77e-01

Pos logit contributions
 gard+0.147
 gn+0.124
Els+0.121
 guardian+0.115
 wizard+0.113
Neg logit contributions
ifying-0.099
ucker-0.093
ours-0.090
 anymore-0.089
angles-0.088
Token.
Token ID13
Feature activation-6.68e-02

Pos logit contributions
 gard+0.020
 gn+0.016
Els+0.016
Ar+0.015
 guardian+0.015
Neg logit contributions
ifying-0.022
ucker-0.021
ours-0.020
 anymore-0.020
 mouths-0.020
Token\n
Token ID198
Feature activation+3.31e-02

Pos logit contributions
ifying+0.003
ucker+0.003
ours+0.003
 mouths+0.003
 anymore+0.002
Neg logit contributions
 gard-0.002
 gn-0.002
 guardian-0.002
Els-0.002
 wizard-0.002
Token\n
Token ID198
Feature activation+9.43e-02

Pos logit contributions
 gard+0.001
 gn+0.001
 guardian+0.001
 elf+0.001
Els+0.001
Neg logit contributions
ucker-0.001
ifying-0.001
ours-0.001
 anymore-0.001
angles-0.001
TokenAfter
Token ID3260
Feature activation+9.30e-01

Pos logit contributions
 gard+0.004
 gn+0.003
 guardian+0.003
 elf+0.003
 wizard+0.003
Neg logit contributions
ifying-0.003
ucker-0.003
ours-0.003
angles-0.003
 anymore-0.003
Token drum
Token ID13026
Feature activation+8.47e-01

Pos logit contributions
ifying+0.010
ucker+0.010
angles+0.010
ours+0.010
 anymore+0.010
Neg logit contributions
 gard-0.007
 gn-0.005
 wizard-0.005
Ar-0.005
 mer-0.005
Token and
Token ID290
Feature activation-1.47e-01

Pos logit contributions
 gard+0.030
 gn+0.025
 guardian+0.022
 wizard+0.022
Els+0.022
Neg logit contributions
ifying-0.032
ucker-0.032
ours-0.030
angles-0.029
 mouths-0.029
Token Mia
Token ID32189
Feature activation-3.20e-02

Pos logit contributions
ifying+0.006
ucker+0.006
 anymore+0.006
ours+0.005
ishment+0.005
Neg logit contributions
 gard-0.005
 gn-0.004
Els-0.004
Ar-0.004
 wizard-0.003
Token has
Token ID468
Feature activation+4.19e-02

Pos logit contributions
ifying+0.001
ucker+0.001
ours+0.001
 anymore+0.001
 mouths+0.001
Neg logit contributions
 gard-0.001
 gn-0.001
Els-0.001
Ar-0.001
 guardian-0.001
Token a
Token ID257
Feature activation-6.28e-02

Pos logit contributions
 gard+0.001
 gn+0.001
Els+0.001
Ar+0.001
 guardian+0.001
Neg logit contributions
ifying-0.002
ucker-0.002
ours-0.002
 anymore-0.002
 mouths-0.002
Token fl
Token ID781
Feature activation+3.39e+00

Pos logit contributions
ifying+0.002
ucker+0.002
ours+0.002
angles+0.002
 anymore+0.002
Neg logit contributions
 gard-0.002
 gn-0.002
 wizard-0.002
Ar-0.002
 mer-0.002
Tokenute
Token ID1133
Feature activation+9.03e-01

Pos logit contributions
 gard+0.169
 gn+0.146
Els+0.143
 guardian+0.139
Ar+0.137
Neg logit contributions
ifying-0.076
ucker-0.073
ours-0.070
 mouths-0.068
 anymore-0.068
Token.
Token ID13
Feature activation-2.05e-02

Pos logit contributions
 gard+0.028
 gn+0.022
 guardian+0.020
 wizard+0.020
 mer+0.019
Neg logit contributions
ucker-0.038
ifying-0.037
ours-0.036
angles-0.036
 mouths-0.035
Token They
Token ID1119
Feature activation+3.22e-01

Pos logit contributions
ifying+0.001
ucker+0.001
 anymore+0.001
angles+0.001
izing+0.001
Neg logit contributions
 gard-0.001
 gn-0.001
Els-0.001
Ar-0.001
 wizard-0.000
Token make
Token ID787
Feature activation-1.46e-01

Pos logit contributions
 gard+0.011
 gn+0.009
Els+0.008
 mer+0.008
Ar+0.008
Neg logit contributions
ucker-0.013
ifying-0.012
ours-0.012
 anymore-0.011
inches-0.011
Token different
Token ID1180
Feature activation+4.54e-01

Pos logit contributions
ifying+0.007
ucker+0.006
ours+0.006
 mouths+0.006
 anymore+0.006
Neg logit contributions
 gard-0.004
 gn-0.003
 mer-0.003
 h-0.003
Els-0.003
Token When
Token ID1649
Feature activation+1.33e-02

Pos logit contributions
ifying+0.009
ucker+0.009
angles+0.009
ours+0.009
 anymore+0.008
Neg logit contributions
 gard-0.007
 gn-0.005
Els-0.005
Ar-0.005
 guardian-0.005
Token Daisy
Token ID40355
Feature activation-7.80e-01

Pos logit contributions
 gard+0.000
 gn+0.000
Els+0.000
 wizard+0.000
 guardian+0.000
Neg logit contributions
ifying-0.001
ucker-0.000
ours-0.000
angles-0.000
 anymore-0.000
Token put
Token ID1234
Feature activation-4.36e-02

Pos logit contributions
ifying+0.034
ucker+0.033
ours+0.032
 mouths+0.032
 anymore+0.032
Neg logit contributions
 gard-0.023
 gn-0.018
Els-0.017
 guardian-0.015
Ar-0.015
Token the
Token ID262
Feature activation-6.93e-01

Pos logit contributions
ucker+0.002
ifying+0.002
outing+0.002
ours+0.002
 mouths+0.002
Neg logit contributions
 gard-0.001
 gn-0.001
Els-0.001
 elf-0.001
 wizard-0.001
Token magn
Token ID7842
Feature activation+1.05e+00

Pos logit contributions
ucker+0.028
ifying+0.027
ours+0.026
angles+0.026
 anymore+0.026
Neg logit contributions
 gard-0.024
 gn-0.019
 wizard-0.017
 guardian-0.017
Els-0.017
Tokenifying
Token ID4035
Feature activation-3.46e+00

Pos logit contributions
 gard+0.065
Els+0.059
 gn+0.058
 guardian+0.057
 mer+0.056
Neg logit contributions
ucker-0.010
ours-0.008
 mouths-0.007
 anymore-0.006
angles-0.006
Token glass
Token ID5405
Feature activation-5.02e-01

Pos logit contributions
ucker+0.116
ifying+0.113
ours+0.111
angles+0.110
 anymore+0.108
Neg logit contributions
 gard-0.132
 gn-0.110
Els-0.105
Ar-0.100
 guardian-0.099
Token down
Token ID866
Feature activation-1.30e+00

Pos logit contributions
ifying+0.019
ucker+0.018
ours+0.018
angles+0.017
 mouths+0.017
Neg logit contributions
 gard-0.018
 gn-0.015
Els-0.014
 guardian-0.013
Ar-0.013
Token,
Token ID11
Feature activation+4.86e-01

Pos logit contributions
ucker+0.051
ifying+0.050
ours+0.048
angles+0.047
 mouths+0.047
Neg logit contributions
 gard-0.045
 gn-0.036
Els-0.035
 guardian-0.033
 wizard-0.032
Token the
Token ID262
Feature activation-2.98e-01

Pos logit contributions
 gard+0.015
 gn+0.012
Els+0.011
 wizard+0.010
 seek+0.010
Neg logit contributions
ucker-0.021
ifying-0.021
ours-0.020
angles-0.020
 mouths-0.020
Token cater
Token ID20825
Feature activation-4.85e-01

Pos logit contributions
ucker+0.013
ifying+0.013
ours+0.012
angles+0.012
 anymore+0.012
Neg logit contributions
 gard-0.009
 gn-0.007
Els-0.007
 wizard-0.006
 guardian-0.006
Token flashlight
Token ID38371
Feature activation+4.07e-01

Pos logit contributions
ucker+0.011
ifying+0.010
ours+0.010
angles+0.010
 anymore+0.010
Neg logit contributions
 gard-0.009
 gn-0.007
Els-0.006
 wizard-0.006
 guardian-0.006
Token,
Token ID11
Feature activation-7.60e-01

Pos logit contributions
 gard+0.014
 gn+0.011
Els+0.010
 guardian+0.010
 wizard+0.010
Neg logit contributions
ifying-0.016
ucker-0.016
ours-0.015
 mouths-0.015
angles-0.015
Token and
Token ID290
Feature activation-3.06e-01

Pos logit contributions
ucker+0.033
ifying+0.032
ours+0.031
 anymore+0.030
angles+0.030
Neg logit contributions
 gard-0.023
 gn-0.018
Els-0.017
 guardian-0.016
 wizard-0.016
Token a
Token ID257
Feature activation-6.28e-01

Pos logit contributions
ucker+0.014
ifying+0.014
ours+0.013
 anymore+0.013
angles+0.013
Neg logit contributions
 gard-0.009
 gn-0.007
Els-0.006
 guardian-0.006
 wizard-0.006
Token magn
Token ID7842
Feature activation+9.18e-01

Pos logit contributions
ucker+0.026
ifying+0.025
ours+0.025
angles+0.024
 anymore+0.024
Neg logit contributions
 gard-0.021
 gn-0.017
Els-0.015
 wizard-0.015
 guardian-0.015
Tokenifying
Token ID4035
Feature activation-3.41e+00

Pos logit contributions
 gard+0.057
 gn+0.051
Els+0.051
 guardian+0.049
 wizard+0.049
Neg logit contributions
ucker-0.009
 mouths-0.007
ours-0.007
ifying-0.006
angles-0.006
Token glass
Token ID5405
Feature activation-7.81e-01

Pos logit contributions
ifying+0.110
ucker+0.109
ours+0.104
angles+0.101
 anymore+0.101
Neg logit contributions
 gard-0.137
 gn-0.114
Els-0.110
 guardian-0.105
Ar-0.104
Token.
Token ID13
Feature activation-1.08e-01

Pos logit contributions
ifying+0.031
ucker+0.031
ours+0.030
angles+0.029
 mouths+0.029
Neg logit contributions
 gard-0.026
 gn-0.021
Els-0.019
 guardian-0.018
 wizard-0.018
Token\n
Token ID198
Feature activation-8.65e-02

Pos logit contributions
ifying+0.004
ucker+0.004
ours+0.004
angles+0.004
 mouths+0.004
Neg logit contributions
 gard-0.004
 gn-0.003
Els-0.003
Ar-0.003
 guardian-0.003
Token\n
Token ID198
Feature activation-1.12e-01

Pos logit contributions
ifying+0.004
ucker+0.004
ours+0.004
 anymore+0.004
angles+0.004
Neg logit contributions
 gard-0.002
 gn-0.002
Els-0.002
 guardian-0.002
Ar-0.001
TokenAs
Token ID1722
Feature activation-2.79e-01

Pos logit contributions
ifying+0.004
ucker+0.004
ours+0.004
angles+0.004
 anymore+0.004
Neg logit contributions
 gard-0.004
 gn-0.003
Els-0.003
 guardian-0.003
 wizard-0.003
Token.
Token ID13
Feature activation+1.40e-01

Pos logit contributions
 gard+0.005
 gn+0.004
Els+0.004
Ar+0.004
 guardian+0.004
Neg logit contributions
ifying-0.005
ucker-0.005
 anymore-0.005
ours-0.005
 mouths-0.005
Token He
Token ID679
Feature activation+3.76e-01

Pos logit contributions
 gard+0.004
 gn+0.003
Els+0.003
Ar+0.003
 guardian+0.003
Neg logit contributions
ifying-0.006
ucker-0.006
ours-0.005
angles-0.005
 mouths-0.005
Token was
Token ID373
Feature activation-5.52e-01

Pos logit contributions
 gard+0.012
 gn+0.010
Els+0.009
 guardian+0.009
Ar+0.009
Neg logit contributions
ifying-0.015
ucker-0.015
ours-0.014
 mouths-0.014
 anymore-0.014
Token wearing
Token ID5762
Feature activation+7.53e-01

Pos logit contributions
ifying+0.020
ucker+0.020
ours+0.020
 anymore+0.019
 mouths+0.019
Neg logit contributions
 gard-0.019
Els-0.015
 gn-0.015
Ar-0.014
 guardian-0.014
Token his
Token ID465
Feature activation-8.30e-01

Pos logit contributions
 gard+0.017
 gn+0.013
 mer+0.011
Els+0.011
Ar+0.011
Neg logit contributions
ucker-0.039
ours-0.037
ifying-0.037
inches-0.035
angles-0.035
Token brand
Token ID4508
Feature activation-3.38e+00

Pos logit contributions
ucker+0.031
ifying+0.030
ours+0.030
angles+0.029
 anymore+0.028
Neg logit contributions
 gard-0.030
 gn-0.024
 guardian-0.022
 seek-0.022
Els-0.022
Token new
Token ID649
Feature activation-1.06e-01

Pos logit contributions
ifying+0.131
ucker+0.118
 mouths+0.114
ours+0.114
 anymore+0.112
Neg logit contributions
 gard-0.118
 gn-0.097
Els-0.092
 wizard-0.088
 mer-0.088
Token icy
Token ID30284
Feature activation+6.20e-01

Pos logit contributions
ifying+0.004
ucker+0.003
 mouths+0.003
ours+0.003
 anymore+0.003
Neg logit contributions
 gard-0.004
 gn-0.003
Els-0.003
Ar-0.003
 guardian-0.003
Token blue
Token ID4171
Feature activation+3.71e-01

Pos logit contributions
 gard+0.023
Els+0.019
 gn+0.018
Ar+0.018
 guardian+0.017
Neg logit contributions
ifying-0.021
 mouths-0.021
ucker-0.020
 anymore-0.020
ours-0.020
Token jacket
Token ID15224
Feature activation-4.23e-01

Pos logit contributions
 gard+0.014
 gn+0.011
Els+0.011
Ar+0.010
 guardian+0.010
Neg logit contributions
ifying-0.014
ucker-0.013
ours-0.013
 mouths-0.013
 anymore-0.012
Token as
Token ID355
Feature activation+9.32e-02

Pos logit contributions
ucker+0.019
ours+0.019
ifying+0.018
 mouths+0.018
angles+0.018
Neg logit contributions
 gard-0.011
 gn-0.009
 wizard-0.008
 guardian-0.008
 himself-0.008
Token.
Token ID13
Feature activation-9.91e-02

Pos logit contributions
ifying+0.040
ucker+0.039
angles+0.037
ours+0.037
inches+0.036
Neg logit contributions
 gard-0.024
 gn-0.019
 guardian-0.016
 wizard-0.016
 mer-0.016
Token They
Token ID1119
Feature activation-2.41e-01

Pos logit contributions
ucker+0.004
angles+0.004
ifying+0.004
izing+0.003
outing+0.003
Neg logit contributions
 gard-0.004
 gn-0.003
Els-0.003
Ar-0.003
 mer-0.002
Token smell
Token ID8508
Feature activation-1.15e+00

Pos logit contributions
ifying+0.009
ucker+0.009
ours+0.009
 mouths+0.009
 anymore+0.009
Neg logit contributions
 gard-0.008
 gn-0.007
Els-0.006
 guardian-0.006
Ar-0.006
Token popcorn
Token ID38914
Feature activation-6.16e-01

Pos logit contributions
ifying+0.054
ucker+0.054
inches+0.052
outing+0.050
angles+0.050
Neg logit contributions
 gard-0.031
 mer-0.024
 gn-0.024
 himself-0.023
 two-0.021
Token and
Token ID290
Feature activation-3.43e-03

Pos logit contributions
ifying+0.028
ucker+0.028
ours+0.026
angles+0.026
inches+0.026
Neg logit contributions
 gard-0.017
 gn-0.013
 mer-0.011
 wizard-0.011
Els-0.010
Token cotton
Token ID15985
Feature activation-3.10e+00

Pos logit contributions
ucker+0.000
ifying+0.000
ours+0.000
angles+0.000
 anymore+0.000
Neg logit contributions
 gard-0.000
 gn-0.000
Els-0.000
 seek-0.000
Ar-0.000
Token candy
Token ID18550
Feature activation-1.47e+00

Pos logit contributions
ifying+0.119
ucker+0.117
ours+0.111
outing+0.110
angles+0.107
Neg logit contributions
 gard-0.107
 gn-0.087
Ar-0.080
 mer-0.076
 wizard-0.074
Token.
Token ID13
Feature activation+2.60e-04

Pos logit contributions
ifying+0.064
ucker+0.064
ours+0.061
angles+0.059
ishment+0.059
Neg logit contributions
 gard-0.042
 gn-0.033
Ar-0.028
Els-0.028
 wizard-0.028
Token They
Token ID1119
Feature activation-2.84e-01

Pos logit contributions
 gard+0.000
 gn+0.000
Els+0.000
Ar+0.000
 mer+0.000
Neg logit contributions
ifying-0.000
ucker-0.000
angles-0.000
 anymore-0.000
izing-0.000
Token hear
Token ID3285
Feature activation+3.03e-01

Pos logit contributions
ifying+0.010
ucker+0.010
ours+0.010
 anymore+0.010
 mouths+0.010
Neg logit contributions
 gard-0.010
 gn-0.008
Els-0.008
 guardian-0.008
Ar-0.008
Token music
Token ID2647
Feature activation-9.28e-01

Pos logit contributions
 gard+0.009
 gn+0.007
 mer+0.007
Els+0.006
 himself+0.006
Neg logit contributions
ucker-0.014
ifying-0.014
ours-0.013
angles-0.013
inches-0.013
Token.
Token ID13
Feature activation+1.33e-01

Pos logit contributions
ifying+0.052
ucker+0.051
ours+0.049
 anymore+0.049
 mouths+0.048
Neg logit contributions
 gard-0.049
 gn-0.039
Els-0.038
 guardian-0.036
Ar-0.036
Token They
Token ID1119
Feature activation-2.49e-01

Pos logit contributions
 gard+0.005
 gn+0.004
Els+0.004
Ar+0.004
 mer+0.003
Neg logit contributions
ifying-0.005
ucker-0.005
angles-0.005
ours-0.005
izing-0.005
Token smell
Token ID8508
Feature activation-1.18e+00

Pos logit contributions
ifying+0.010
ucker+0.009
ours+0.009
 mouths+0.009
 anymore+0.009
Neg logit contributions
 gard-0.009
 gn-0.007
Els-0.007
 guardian-0.006
 wizard-0.006
Token popcorn
Token ID38914
Feature activation-6.07e-01

Pos logit contributions
ifying+0.059
ucker+0.059
inches+0.056
ours+0.055
outing+0.054
Neg logit contributions
 gard-0.028
 mer-0.025
 himself-0.022
 two-0.020
 gn-0.020
Token and
Token ID290
Feature activation+5.11e-02

Pos logit contributions
ifying+0.029
ucker+0.028
ours+0.027
angles+0.026
ishment+0.026
Neg logit contributions
 gard-0.016
 gn-0.012
 mer-0.011
 wizard-0.010
 next-0.010
Token cotton
Token ID15985
Feature activation-3.08e+00

Pos logit contributions
 gard+0.002
 gn+0.001
 seek+0.001
 mer+0.001
Els+0.001
Neg logit contributions
ucker-0.002
ifying-0.002
ours-0.002
angles-0.002
outing-0.002
Token candy
Token ID18550
Feature activation-1.47e+00

Pos logit contributions
ifying+0.125
ucker+0.123
ours+0.118
outing+0.117
angles+0.114
Neg logit contributions
 gard-0.099
 gn-0.078
Ar-0.072
 mer-0.070
 wizard-0.065
Token.
Token ID13
Feature activation+1.27e-01

Pos logit contributions
ifying+0.065
ucker+0.065
ours+0.063
ishment+0.061
angles+0.061
Neg logit contributions
 gard-0.040
 gn-0.031
 mer-0.027
Ar-0.027
 wizard-0.026
Token They
Token ID1119
Feature activation-2.73e-01

Pos logit contributions
 gard+0.005
 gn+0.004
Els+0.003
Ar+0.003
 mer+0.003
Neg logit contributions
ifying-0.005
ucker-0.005
angles-0.005
ours-0.004
 anymore-0.004
Token hear
Token ID3285
Feature activation+3.48e-01

Pos logit contributions
ifying+0.010
ucker+0.010
ours+0.010
 mouths+0.009
 anymore+0.009
Neg logit contributions
 gard-0.010
 gn-0.008
Els-0.008
 guardian-0.007
 wizard-0.007
Token music
Token ID2647
Feature activation-9.73e-01

Pos logit contributions
 gard+0.011
 gn+0.009
 mer+0.008
Els+0.008
 wizard+0.007
Neg logit contributions
ucker-0.016
ifying-0.015
ours-0.014
angles-0.014
inches-0.014
Token the
Token ID262
Feature activation-1.44e-02

Pos logit contributions
 gard+0.031
 gn+0.025
Els+0.025
Ar+0.023
 guardian+0.023
Neg logit contributions
ucker-0.035
ifying-0.035
ours-0.033
angles-0.033
 anymore-0.033
Token park
Token ID3952
Feature activation-7.48e-01

Pos logit contributions
ucker+0.001
ifying+0.001
ours+0.001
angles+0.001
inches+0.001
Neg logit contributions
 gard-0.000
 gn-0.000
Els-0.000
 guardian-0.000
 wizard-0.000
Token,
Token ID11
Feature activation-2.92e-01

Pos logit contributions
ifying+0.030
ucker+0.029
ours+0.027
angles+0.027
 mouths+0.027
Neg logit contributions
 gard-0.026
 gn-0.021
Els-0.019
 guardian-0.018
 wizard-0.018
Token a
Token ID257
Feature activation-6.62e-01

Pos logit contributions
ucker+0.014
ifying+0.014
angles+0.014
ours+0.013
ishment+0.013
Neg logit contributions
 gard-0.007
Els-0.005
 gn-0.005
Ar-0.005
 seek-0.005
Token three
Token ID1115
Feature activation+4.95e-01

Pos logit contributions
ucker+0.033
angles+0.033
ifying+0.033
ours+0.033
inches+0.032
Neg logit contributions
 gard-0.018
 gn-0.013
Ar-0.012
 mer-0.011
 wizard-0.011
Token year
Token ID614
Feature activation-3.07e+00

Pos logit contributions
 gard+0.017
 gn+0.013
Els+0.013
 guardian+0.012
Ar+0.012
Neg logit contributions
ifying-0.020
ucker-0.019
ours-0.018
 anymore-0.018
angles-0.018
Token old
Token ID1468
Feature activation-4.28e-01

Pos logit contributions
ifying+0.116
ucker+0.113
ours+0.109
 mouths+0.107
 anymore+0.107
Neg logit contributions
 gard-0.109
 gn-0.088
Els-0.084
 guardian-0.080
Ar-0.079
Token saw
Token ID2497
Feature activation-2.08e-01

Pos logit contributions
ifying+0.017
ucker+0.017
ours+0.016
 anymore+0.016
 mouths+0.016
Neg logit contributions
 gard-0.013
 gn-0.010
Els-0.010
Ar-0.010
 guardian-0.009
Token a
Token ID257
Feature activation-1.65e-01

Pos logit contributions
ifying+0.010
ucker+0.010
angles+0.009
ours+0.009
 mouths+0.009
Neg logit contributions
 gard-0.005
 gn-0.004
Els-0.004
 guardian-0.003
 wizard-0.003
Token huge
Token ID3236
Feature activation-2.47e-01

Pos logit contributions
ucker+0.007
ifying+0.007
angles+0.007
ours+0.007
inches+0.007
Neg logit contributions
 gard-0.005
 gn-0.004
 wizard-0.003
 guardian-0.003
Els-0.003
Token furniture
Token ID13091
Feature activation+4.65e-01

Pos logit contributions
ifying+0.011
ucker+0.011
angles+0.010
ours+0.010
outing+0.010
Neg logit contributions
 gard-0.008
 gn-0.006
 wizard-0.005
 guardian-0.005
 elf-0.005
Token the
Token ID262
Feature activation+4.32e-01

Pos logit contributions
 gard+0.023
 gn+0.019
Els+0.018
 guardian+0.017
Ar+0.016
Neg logit contributions
ifying-0.029
ucker-0.028
ours-0.027
 anymore-0.027
 mouths-0.026
Token necklace
Token ID38987
Feature activation+2.03e-02

Pos logit contributions
 gard+0.015
 gn+0.013
Els+0.012
 guardian+0.011
Ar+0.011
Neg logit contributions
ifying-0.016
ucker-0.016
ours-0.015
 anymore-0.015
angles-0.015
Token and
Token ID290
Feature activation+1.04e-01

Pos logit contributions
 gard+0.001
 gn+0.001
Els+0.001
 guardian+0.000
Ar+0.000
Neg logit contributions
ifying-0.001
ucker-0.001
ours-0.001
 mouths-0.001
 anymore-0.001
Token the
Token ID262
Feature activation+1.08e-01

Pos logit contributions
 gard+0.003
 gn+0.002
Els+0.002
 guardian+0.002
Ar+0.002
Neg logit contributions
ifying-0.004
ucker-0.004
ours-0.004
 anymore-0.004
 mouths-0.004
Token magn
Token ID7842
Feature activation+4.56e-01

Pos logit contributions
 gard+0.004
 gn+0.003
Els+0.003
 guardian+0.003
 wizard+0.003
Neg logit contributions
ifying-0.004
ucker-0.004
ours-0.004
angles-0.004
 anymore-0.004
Tokenifying
Token ID4035
Feature activation-3.05e+00

Pos logit contributions
 gard+0.028
 gn+0.025
Els+0.025
 guardian+0.024
Ar+0.024
Neg logit contributions
ucker-0.005
ours-0.004
ifying-0.004
 mouths-0.004
 anymore-0.004
Token glass
Token ID5405
Feature activation-4.28e-01

Pos logit contributions
ifying+0.089
ucker+0.089
ours+0.085
angles+0.084
 anymore+0.082
Neg logit contributions
 gard-0.131
 gn-0.112
Els-0.106
Ar-0.102
 guardian-0.101
Token and
Token ID290
Feature activation+6.40e-02

Pos logit contributions
ifying+0.016
ucker+0.016
ours+0.015
 anymore+0.015
 mouths+0.015
Neg logit contributions
 gard-0.015
 gn-0.012
Els-0.012
Ar-0.011
 guardian-0.011
Token started
Token ID2067
Feature activation+1.21e+00

Pos logit contributions
 gard+0.002
 gn+0.001
Els+0.001
 guardian+0.001
Ar+0.001
Neg logit contributions
ifying-0.003
ucker-0.003
ours-0.003
 anymore-0.003
angles-0.003
Token to
Token ID284
Feature activation+4.96e-01

Pos logit contributions
 gard+0.036
Els+0.027
 gn+0.027
Ar+0.026
 guardian+0.025
Neg logit contributions
ifying-0.051
ucker-0.051
ours-0.049
 mouths-0.049
 anymore-0.048
Token examine
Token ID10716
Feature activation+4.14e-01

Pos logit contributions
 gard+0.017
 gn+0.014
Els+0.013
 guardian+0.013
Ar+0.012
Neg logit contributions
ifying-0.019
ucker-0.019
ours-0.018
 anymore-0.018
 mouths-0.018
Token special
Token ID2041
Feature activation+8.37e-01

Pos logit contributions
 gard+0.004
 gn+0.003
 wizard+0.003
 guardian+0.003
 gen+0.003
Neg logit contributions
ucker-0.005
ours-0.005
ifying-0.005
angles-0.005
ishment-0.005
Token card
Token ID2657
Feature activation+1.71e-01

Pos logit contributions
 gard+0.030
 gn+0.024
 wizard+0.023
Els+0.023
 guardian+0.023
Neg logit contributions
ifying-0.032
ucker-0.031
ours-0.030
 mouths-0.029
angles-0.029
Token to
Token ID284
Feature activation-5.56e-01

Pos logit contributions
 gard+0.007
Els+0.006
 gn+0.005
Ar+0.005
 guardian+0.005
Neg logit contributions
ifying-0.006
 anymore-0.005
ucker-0.005
ours-0.005
angles-0.005
Token a
Token ID257
Feature activation-3.27e-03

Pos logit contributions
ucker+0.025
ifying+0.025
ours+0.024
 mouths+0.023
inches+0.023
Neg logit contributions
 gard-0.016
 gn-0.012
 guardian-0.011
 wizard-0.011
Els-0.011
Token 3
Token ID513
Feature activation+4.84e-02

Pos logit contributions
ucker+0.000
ours+0.000
angles+0.000
ifying+0.000
inches+0.000
Neg logit contributions
 gard-0.000
 gn-0.000
 wizard-0.000
 guardian-0.000
 mer-0.000
Token year
Token ID614
Feature activation-2.96e+00

Pos logit contributions
 gard+0.002
 gn+0.001
Els+0.001
Ar+0.001
 guardian+0.001
Neg logit contributions
ifying-0.002
ucker-0.002
ours-0.002
 anymore-0.002
 mouths-0.002
Token old
Token ID1468
Feature activation-8.10e-02

Pos logit contributions
ifying+0.129
ucker+0.127
 mouths+0.122
angles+0.122
ours+0.122
Neg logit contributions
 gard-0.091
 gn-0.072
 guardian-0.066
 wizard-0.064
Els-0.062
Token child
Token ID1200
Feature activation-7.08e-01

Pos logit contributions
ifying+0.003
ucker+0.003
ours+0.003
 mouths+0.003
 anymore+0.003
Neg logit contributions
 gard-0.003
 gn-0.002
Els-0.002
Ar-0.002
 guardian-0.002
Token.
Token ID13
Feature activation-2.13e-01

Pos logit contributions
ucker+0.027
ifying+0.027
angles+0.026
ours+0.026
 mouths+0.026
Neg logit contributions
 gard-0.024
 gn-0.020
Els-0.018
 wizard-0.018
 guardian-0.018
Token The
Token ID383
Feature activation-3.10e-01

Pos logit contributions
ifying+0.007
ucker+0.007
ours+0.006
 mouths+0.006
 anymore+0.006
Neg logit contributions
 gard-0.009
 gn-0.007
Els-0.007
 guardian-0.007
Ar-0.007
Token card
Token ID2657
Feature activation-1.08e-01

Pos logit contributions
ucker+0.015
ifying+0.015
ours+0.014
angles+0.014
 anymore+0.014
Neg logit contributions
 gard-0.008
 gn-0.006
 guardian-0.006
 wizard-0.006
Els-0.006
Token the
Token ID262
Feature activation+3.72e-02

Pos logit contributions
ucker+0.014
ifying+0.014
ours+0.013
outing+0.012
angles+0.012
Neg logit contributions
 gard-0.010
 gn-0.008
Els-0.008
 seek-0.007
 wizard-0.007
Token warm
Token ID5814
Feature activation-8.98e-01

Pos logit contributions
 gard+0.001
 gn+0.001
Els+0.001
Ar+0.001
 guardian+0.001
Neg logit contributions
ifying-0.001
ours-0.001
 anymore-0.001
 mouths-0.001
ucker-0.001
Token sunshine
Token ID34488
Feature activation-1.58e+00

Pos logit contributions
ifying+0.037
ucker+0.035
 anymore+0.035
 mouths+0.035
ours+0.035
Neg logit contributions
 gard-0.028
 gn-0.022
Els-0.022
Ar-0.021
 guardian-0.020
Token and
Token ID290
Feature activation+2.19e-01

Pos logit contributions
ifying+0.067
ucker+0.067
ours+0.065
angles+0.062
 mouths+0.061
Neg logit contributions
 gard-0.049
 gn-0.038
Els-0.034
 wizard-0.034
 guardian-0.033
Token having
Token ID1719
Feature activation+3.81e-01

Pos logit contributions
 gard+0.007
 gn+0.005
Els+0.005
 seek+0.005
 mer+0.005
Neg logit contributions
ifying-0.010
ucker-0.009
ours-0.009
 anymore-0.009
angles-0.009
Token pic
Token ID8301
Feature activation-2.96e+00

Pos logit contributions
 gard+0.014
 gn+0.012
Els+0.011
 guardian+0.011
Ar+0.011
Neg logit contributions
ifying-0.013
ucker-0.013
ours-0.013
 anymore-0.012
 mouths-0.012
Tokenn
Token ID77
Feature activation+2.10e+00

Pos logit contributions
ifying+0.136
ucker+0.134
ours+0.130
 anymore+0.128
 mouths+0.128
Neg logit contributions
 gard-0.079
 gn-0.059
Els-0.056
 guardian-0.052
Ar-0.051
Tokenics
Token ID873
Feature activation-1.23e-01

Pos logit contributions
 gard+0.077
 gn+0.063
Els+0.060
 guardian+0.057
Ar+0.057
Neg logit contributions
ifying-0.076
ucker-0.074
ours-0.071
 anymore-0.070
 mouths-0.070
Token with
Token ID351
Feature activation+2.58e-01

Pos logit contributions
ucker+0.005
ifying+0.005
ours+0.005
 mouths+0.005
angles+0.005
Neg logit contributions
 gard-0.004
 gn-0.003
Els-0.003
 seek-0.003
 guardian-0.003
Token her
Token ID607
Feature activation+3.42e-01

Pos logit contributions
 gard+0.007
Els+0.005
 gn+0.005
 himself+0.004
 Bear+0.004
Neg logit contributions
ucker-0.012
ours-0.012
ifying-0.012
outing-0.012
angles-0.012
Token family
Token ID1641
Feature activation-5.88e-01

Pos logit contributions
 gard+0.007
 gn+0.005
Els+0.004
 guardian+0.004
 wizard+0.004
Neg logit contributions
ifying-0.018
ucker-0.018
ours-0.017
angles-0.017
inches-0.016
Token.
Token ID13
Feature activation+7.84e-02

Pos logit contributions
ucker+0.024
ifying+0.024
ours+0.024
angles+0.023
inches+0.023
Neg logit contributions
 gard-0.018
 gn-0.014
Els-0.013
 mer-0.013
 guardian-0.013
Token Bob
Token ID5811
Feature activation-2.84e-01

Pos logit contributions
 gard+0.003
 gn+0.002
Els+0.002
Ar+0.002
 wizard+0.002
Neg logit contributions
ucker-0.003
ifying-0.003
angles-0.003
ours-0.003
inches-0.003
Token,
Token ID11
Feature activation-3.84e-01

Pos logit contributions
ifying+0.012
ucker+0.011
 anymore+0.011
ours+0.011
angles+0.011
Neg logit contributions
 gard-0.009
 gn-0.007
Els-0.007
 guardian-0.006
Ar-0.006
Token a
Token ID257
Feature activation-6.70e-01

Pos logit contributions
ifying+0.015
ucker+0.015
ours+0.014
angles+0.014
 mouths+0.014
Neg logit contributions
 gard-0.013
 gn-0.011
Els-0.010
 guardian-0.010
 wizard-0.010
Token three
Token ID1115
Feature activation+5.00e-01

Pos logit contributions
angles+0.033
ucker+0.032
ours+0.032
ifying+0.031
inches+0.031
Neg logit contributions
 gard-0.019
 gn-0.014
 wizard-0.012
 mer-0.012
 guardian-0.012
Token year
Token ID614
Feature activation-2.96e+00

Pos logit contributions
 gard+0.017
 gn+0.013
Els+0.013
 guardian+0.012
Ar+0.012
Neg logit contributions
ifying-0.020
ucker-0.020
ours-0.019
angles-0.018
 anymore-0.018
Token-
Token ID12
Feature activation+7.92e-02

Pos logit contributions
ifying+0.130
ucker+0.125
ours+0.119
angles+0.119
 mouths+0.118
Neg logit contributions
 gard-0.092
 gn-0.073
 guardian-0.066
 wizard-0.066
Els-0.064
Tokenold
Token ID727
Feature activation-9.37e-01

Pos logit contributions
 gard+0.003
 gn+0.003
Els+0.003
 seek+0.003
 guardian+0.003
Neg logit contributions
ifying-0.002
ucker-0.002
ours-0.002
angles-0.002
bs-0.002
Token boy
Token ID2933
Feature activation-5.69e-01

Pos logit contributions
ifying+0.037
ucker+0.036
ours+0.035
 anymore+0.035
 mouths+0.035
Neg logit contributions
 gard-0.030
 gn-0.024
Els-0.023
Ar-0.022
 guardian-0.022
Token,
Token ID11
Feature activation-5.84e-01

Pos logit contributions
ifying+0.023
ucker+0.022
angles+0.022
ours+0.021
 anymore+0.021
Neg logit contributions
 gard-0.019
 gn-0.016
Els-0.014
 guardian-0.014
 wizard-0.014
Token was
Token ID373
Feature activation-4.21e-01

Pos logit contributions
ifying+0.024
ucker+0.024
ours+0.023
 mouths+0.023
 anymore+0.023
Neg logit contributions
 gard-0.018
 gn-0.015
Els-0.014
 guardian-0.013
Ar-0.013
Token the
Token ID262
Feature activation-2.09e-01

Pos logit contributions
ifying+0.031
ucker+0.030
ours+0.027
 anymore+0.027
 mouths+0.027
Neg logit contributions
 gard-0.028
 gn-0.024
Els-0.022
 guardian-0.021
 wizard-0.020
Token necklace
Token ID38987
Feature activation-2.93e-01

Pos logit contributions
ifying+0.007
ucker+0.007
ours+0.006
 anymore+0.006
 mouths+0.006
Neg logit contributions
 gard-0.008
 gn-0.007
Els-0.007
Ar-0.006
 guardian-0.006
Token and
Token ID290
Feature activation-8.90e-02

Pos logit contributions
ifying+0.012
ucker+0.012
ours+0.011
 mouths+0.011
angles+0.011
Neg logit contributions
 gard-0.009
 gn-0.008
Els-0.007
 guardian-0.007
 wizard-0.007
Token the
Token ID262
Feature activation+4.57e-03

Pos logit contributions
ifying+0.003
ucker+0.003
ours+0.003
 mouths+0.003
angles+0.003
Neg logit contributions
 gard-0.003
 gn-0.002
Els-0.002
Ar-0.002
 guardian-0.002
Token magn
Token ID7842
Feature activation+2.29e-01

Pos logit contributions
 gard+0.000
 gn+0.000
Els+0.000
 guardian+0.000
 wizard+0.000
Neg logit contributions
ifying-0.000
ucker-0.000
ours-0.000
angles-0.000
 anymore-0.000
Tokenifying
Token ID4035
Feature activation-2.95e+00

Pos logit contributions
 gard+0.013
Els+0.012
 gn+0.012
 guardian+0.011
 mer+0.011
Neg logit contributions
ucker-0.003
 mouths-0.003
ours-0.003
 wasteful-0.002
ishment-0.002
Token glass
Token ID5405
Feature activation-4.72e-01

Pos logit contributions
ifying+0.084
ucker+0.083
ours+0.079
angles+0.078
 anymore+0.076
Neg logit contributions
 gard-0.130
 gn-0.111
Els-0.106
 guardian-0.102
Ar-0.102
Token and
Token ID290
Feature activation+6.80e-02

Pos logit contributions
ifying+0.019
ucker+0.019
ours+0.018
angles+0.017
 mouths+0.017
Neg logit contributions
 gard-0.015
 gn-0.012
Els-0.012
 guardian-0.011
 wizard-0.011
Token told
Token ID1297
Feature activation+6.61e-01

Pos logit contributions
 gard+0.002
 gn+0.002
Els+0.002
 guardian+0.002
 wizard+0.002
Neg logit contributions
ifying-0.003
ucker-0.003
ours-0.003
 anymore-0.003
angles-0.003
Token him
Token ID683
Feature activation-5.35e-01

Pos logit contributions
 gard+0.023
 gn+0.019
Els+0.018
 wizard+0.017
 himself+0.017
Neg logit contributions
ifying-0.027
ucker-0.026
ours-0.024
outing-0.023
inches-0.023
Token that
Token ID326
Feature activation-3.73e-01

Pos logit contributions
ifying+0.022
ucker+0.021
ishment+0.020
ours+0.019
angles+0.019
Neg logit contributions
 gard-0.017
 gn-0.015
Els-0.013
 himself-0.013
 wizard-0.013
Token was
Token ID373
Feature activation-5.79e-01

Pos logit contributions
 gard+0.016
Els+0.012
 gn+0.012
Ar+0.012
 guardian+0.011
Neg logit contributions
ucker-0.013
ifying-0.013
 anymore-0.012
 mouths-0.012
angles-0.012
Token a
Token ID257
Feature activation-1.80e-01

Pos logit contributions
ifying+0.026
ucker+0.026
ours+0.025
angles+0.024
 mouths+0.024
Neg logit contributions
 gard-0.017
 gn-0.013
Els-0.012
 guardian-0.011
 wizard-0.011
Token lovely
Token ID14081
Feature activation-1.21e-01

Pos logit contributions
ifying+0.007
ucker+0.007
ours+0.007
angles+0.007
 mouths+0.007
Neg logit contributions
 gard-0.006
 gn-0.005
Els-0.004
 guardian-0.004
Ar-0.004
Token,
Token ID11
Feature activation-7.46e-01

Pos logit contributions
ifying+0.005
ucker+0.005
ours+0.005
 mouths+0.005
angles+0.005
Neg logit contributions
 gard-0.004
 gn-0.003
Els-0.003
 guardian-0.003
 wizard-0.003
Token sunny
Token ID27737
Feature activation-1.28e+00

Pos logit contributions
ours+0.033
ucker+0.033
ifying+0.032
angles+0.031
inches+0.030
Neg logit contributions
 gard-0.023
 gn-0.020
 seek-0.018
 mer-0.017
 guardian-0.017
Token year
Token ID614
Feature activation-2.93e+00

Pos logit contributions
ifying+0.050
ucker+0.048
 anymore+0.047
 mouths+0.046
ours+0.046
Neg logit contributions
 gard-0.042
 gn-0.033
Els-0.033
Ar-0.030
 guardian-0.030
Token and
Token ID290
Feature activation-2.51e-01

Pos logit contributions
ifying+0.121
ucker+0.120
ours+0.117
 mouths+0.115
angles+0.113
Neg logit contributions
 gard-0.094
 gn-0.076
 guardian-0.069
Els-0.069
 wizard-0.066
Token all
Token ID477
Feature activation+3.44e-01

Pos logit contributions
ucker+0.011
ifying+0.011
ours+0.010
angles+0.010
inches+0.010
Neg logit contributions
 gard-0.008
 gn-0.006
Els-0.006
 seek-0.006
 guardian-0.006
Token of
Token ID286
Feature activation-1.44e+00

Pos logit contributions
 gard+0.013
 gn+0.010
Els+0.010
 guardian+0.009
Ar+0.009
Neg logit contributions
ifying-0.012
ucker-0.012
ours-0.012
 mouths-0.011
angles-0.011
Token the
Token ID262
Feature activation-2.43e-01

Pos logit contributions
ifying+0.063
ucker+0.063
ours+0.061
angles+0.058
 mouths+0.058
Neg logit contributions
 gard-0.044
 gn-0.034
Els-0.033
 mer-0.030
 wizard-0.030
Token trees
Token ID7150
Feature activation-2.44e-02

Pos logit contributions
ifying+0.011
ucker+0.011
 anymore+0.010
 mouths+0.010
angles+0.010
Neg logit contributions
 gard-0.007
 gn-0.005
Els-0.005
 guardian-0.004
Ar-0.004
Token Inside
Token ID14384
Feature activation-2.47e-02

Pos logit contributions
 gard+0.011
 gn+0.009
Els+0.008
Ar+0.008
 guardian+0.008
Neg logit contributions
ucker-0.010
ifying-0.010
angles-0.010
ours-0.010
 anymore-0.010
Token,
Token ID11
Feature activation+2.40e-02

Pos logit contributions
ifying+0.001
ucker+0.001
ours+0.001
inches+0.001
 mouths+0.001
Neg logit contributions
 gard-0.001
 gn-0.001
Els-0.001
 guardian-0.001
 wizard-0.001
Token a
Token ID257
Feature activation-1.05e-01

Pos logit contributions
 gard+0.001
 gn+0.001
 two+0.000
Els+0.000
 mer+0.000
Neg logit contributions
ucker-0.001
ifying-0.001
inches-0.001
ours-0.001
 mouths-0.001
Token little
Token ID1310
Feature activation-1.25e+00

Pos logit contributions
ucker+0.005
ifying+0.004
ours+0.004
angles+0.004
 anymore+0.004
Neg logit contributions
 gard-0.003
 gn-0.002
Els-0.002
 guardian-0.002
 wizard-0.002
Token three
Token ID1115
Feature activation+8.24e-01

Pos logit contributions
ifying+0.056
ucker+0.055
ours+0.053
angles+0.052
 mouths+0.052
Neg logit contributions
 gard-0.036
 gn-0.027
Els-0.026
 guardian-0.024
 wizard-0.024
Token year
Token ID614
Feature activation-2.92e+00

Pos logit contributions
 gard+0.029
 gn+0.024
Els+0.023
 guardian+0.021
Ar+0.021
Neg logit contributions
ifying-0.031
ucker-0.030
ours-0.029
 mouths-0.029
 anymore-0.029
Token old
Token ID1468
Feature activation-1.18e-01

Pos logit contributions
ifying+0.116
ucker+0.113
ours+0.109
angles+0.107
 mouths+0.107
Neg logit contributions
 gard-0.099
 gn-0.080
Els-0.075
 guardian-0.072
 wizard-0.071
Token boy
Token ID2933
Feature activation-3.28e-01

Pos logit contributions
ifying+0.005
ucker+0.005
ours+0.005
 mouths+0.004
 anymore+0.004
Neg logit contributions
 gard-0.004
 gn-0.003
Els-0.003
Ar-0.003
 guardian-0.003
Token was
Token ID373
Feature activation-1.51e-01

Pos logit contributions
ifying+0.015
ucker+0.014
ours+0.014
angles+0.014
 anymore+0.014
Neg logit contributions
 gard-0.009
 gn-0.007
Els-0.007
 guardian-0.006
 wizard-0.006
Token standing
Token ID5055
Feature activation+4.77e-01

Pos logit contributions
ifying+0.006
ucker+0.006
ours+0.006
angles+0.006
 mouths+0.006
Neg logit contributions
 gard-0.005
 gn-0.004
Els-0.003
 guardian-0.003
 wizard-0.003
Token.
Token ID13
Feature activation+6.67e-02

Pos logit contributions
 gard+0.015
 gn+0.012
Els+0.012
Ar+0.011
 guardian+0.011
Neg logit contributions
ifying-0.020
ucker-0.019
ours-0.018
 anymore-0.018
 mouths-0.018
Token One
Token ID1881
Feature activation+6.19e-01

Pos logit contributions
 gard+0.000
 gn+0.000
Els+0.000
Ar+0.000
 guardian+0.000
Neg logit contributions
ifying-0.000
ucker-0.000
ours-0.000
angles-0.000
 anymore-0.000
Token day
Token ID1110
Feature activation-1.03e+00

Pos logit contributions
 gard+0.015
 gn+0.011
 guardian+0.010
 wizard+0.009
 seek+0.009
Neg logit contributions
ucker-0.031
ifying-0.031
ours-0.030
angles-0.030
inches-0.029
Token,
Token ID11
Feature activation+6.02e-01

Pos logit contributions
ifying+0.041
ucker+0.041
ours+0.038
angles+0.038
ishment+0.037
Neg logit contributions
 gard-0.035
 gn-0.028
Els-0.026
 guardian-0.024
 wizard-0.024
Token a
Token ID257
Feature activation-1.77e-01

Pos logit contributions
 gard+0.014
 gn+0.010
 seek+0.009
 wizard+0.009
Els+0.008
Neg logit contributions
ucker-0.032
ifying-0.031
ours-0.030
angles-0.030
inches-0.029
Token three
Token ID1115
Feature activation+9.31e-01

Pos logit contributions
ifying+0.007
ucker+0.007
ours+0.007
 anymore+0.007
 mouths+0.007
Neg logit contributions
 gard-0.006
 gn-0.005
Els-0.004
 guardian-0.004
Ar-0.004
Token year
Token ID614
Feature activation-2.91e+00

Pos logit contributions
 gard+0.035
 gn+0.029
Els+0.029
Ar+0.027
 guardian+0.027
Neg logit contributions
ifying-0.031
ucker-0.030
ours-0.030
 mouths-0.029
 anymore-0.029
Token old
Token ID1468
Feature activation-2.30e-01

Pos logit contributions
ifying+0.107
ucker+0.105
ours+0.101
 anymore+0.099
 mouths+0.099
Neg logit contributions
 gard-0.105
 gn-0.085
Els-0.082
 guardian-0.078
Ar-0.078
Token child
Token ID1200
Feature activation-5.53e-01

Pos logit contributions
ours+0.007
 mouths+0.007
ucker+0.007
 anymore+0.006
ifying+0.006
Neg logit contributions
 gard-0.008
Els-0.007
Ar-0.007
 gn-0.007
 guardian-0.006
Token opened
Token ID4721
Feature activation+1.16e+00

Pos logit contributions
ifying+0.024
ucker+0.023
ours+0.023
 anymore+0.023
 mouths+0.022
Neg logit contributions
 gard-0.016
 gn-0.012
Els-0.012
Ar-0.011
 guardian-0.011
Token the
Token ID262
Feature activation+1.39e-01

Pos logit contributions
 gard+0.040
 gn+0.032
Els+0.031
 guardian+0.030
 wizard+0.029
Neg logit contributions
ucker-0.044
ifying-0.044
ours-0.042
 mouths-0.041
angles-0.040
Token refrigerator
Token ID30500
Feature activation-1.13e-01

Pos logit contributions
 gard+0.006
Els+0.006
Ar+0.006
 gn+0.005
 seek+0.005
Neg logit contributions
ifying-0.004
 mouths-0.003
 anymore-0.003
ours-0.003
ucker-0.003
Token two
Token ID734
Feature activation+1.53e+00

Pos logit contributions
 gard+0.020
 gn+0.016
 wizard+0.015
 guardian+0.015
Els+0.014
Neg logit contributions
ifying-0.029
ucker-0.028
 anymore-0.028
 mouths-0.028
ours-0.028
Token friends
Token ID2460
Feature activation+1.63e+00

Pos logit contributions
 gard+0.031
 gn+0.019
 guardian+0.015
Els+0.014
 servants+0.014
Neg logit contributions
ifying-0.086
ucker-0.081
ours-0.080
angles-0.079
inches-0.076
Token;
Token ID26
Feature activation-5.77e-01

Pos logit contributions
 gard+0.060
 gn+0.049
Els+0.046
 guardian+0.045
 wizard+0.044
Neg logit contributions
ifying-0.060
ucker-0.058
ours-0.056
angles-0.055
 mouths-0.054
Token a
Token ID257
Feature activation-5.63e-01

Pos logit contributions
ifying+0.023
ucker+0.023
ours+0.022
angles+0.022
 anymore+0.022
Neg logit contributions
 gard-0.019
 gn-0.015
Els-0.014
 guardian-0.013
Ar-0.013
Token 3
Token ID513
Feature activation-5.51e-02

Pos logit contributions
ifying+0.025
ucker+0.025
ours+0.024
angles+0.024
 anymore+0.024
Neg logit contributions
 gard-0.017
 gn-0.013
Els-0.012
 wizard-0.012
 guardian-0.011
Token year
Token ID614
Feature activation-2.91e+00

Pos logit contributions
ucker+0.002
ifying+0.002
ours+0.002
 anymore+0.002
 mouths+0.002
Neg logit contributions
 gard-0.002
 gn-0.002
Els-0.002
Ar-0.002
 guardian-0.002
Token old
Token ID1468
Feature activation-5.27e-01

Pos logit contributions
ifying+0.133
ucker+0.127
ours+0.122
angles+0.122
 mouths+0.119
Neg logit contributions
 gard-0.087
 gn-0.066
 guardian-0.062
 wizard-0.060
 elf-0.058
Token named
Token ID3706
Feature activation-2.17e+00

Pos logit contributions
ucker+0.020
ifying+0.020
ours+0.020
 mouths+0.020
 anymore+0.019
Neg logit contributions
 gard-0.017
 gn-0.014
Els-0.013
Ar-0.013
 guardian-0.012
Token Anna
Token ID11735
Feature activation+3.44e-01

Pos logit contributions
ifying+0.070
ucker+0.069
ours+0.065
 mouths+0.064
 anymore+0.064
Neg logit contributions
 gard-0.089
 gn-0.075
Els-0.072
 guardian-0.068
Ar-0.068
Token and
Token ID290
Feature activation-7.21e-01

Pos logit contributions
 gard+0.012
 gn+0.010
Els+0.010
 guardian+0.009
Ar+0.009
Neg logit contributions
ifying-0.013
ucker-0.012
ours-0.012
 anymore-0.012
 mouths-0.012
Token a
Token ID257
Feature activation-5.97e-01

Pos logit contributions
ifying+0.022
ucker+0.022
ours+0.021
 mouths+0.021
 anymore+0.021
Neg logit contributions
 gard-0.029
 gn-0.024
Els-0.024
 guardian-0.023
Ar-0.023
Token the
Token ID262
Feature activation+4.78e-01

Pos logit contributions
 gard+0.013
 gn+0.011
 himself+0.011
 h+0.010
Els+0.010
Neg logit contributions
ucker-0.021
ifying-0.020
ours-0.019
outing-0.019
angles-0.019
Token h
Token ID289
Feature activation+1.06e+00

Pos logit contributions
 gard+0.018
Els+0.014
 gn+0.014
Ar+0.013
 guardian+0.013
Neg logit contributions
ifying-0.017
 anymore-0.016
 mouths-0.016
ucker-0.015
ours-0.015
Tokenanger
Token ID2564
Feature activation+1.70e+00

Pos logit contributions
 gard+0.043
Els+0.036
 gn+0.035
Ar+0.035
 mer+0.033
Neg logit contributions
ifying-0.035
ucker-0.034
ours-0.033
 anymore-0.032
inches-0.032
Token for
Token ID329
Feature activation+1.65e-01

Pos logit contributions
 gard+0.044
 gn+0.034
 guardian+0.032
 wizard+0.030
Els+0.029
Neg logit contributions
ucker-0.080
ifying-0.077
ours-0.075
 mouths-0.074
angles-0.072
Token her
Token ID607
Feature activation+7.66e-01

Pos logit contributions
 gard+0.005
 gn+0.004
 himself+0.003
Els+0.003
 seek+0.003
Neg logit contributions
ifying-0.008
ucker-0.008
ours-0.007
angles-0.007
outing-0.007
Token brand
Token ID4508
Feature activation-2.90e+00

Pos logit contributions
 gard+0.025
 gn+0.020
Els+0.019
 guardian+0.018
Ar+0.018
Neg logit contributions
ifying-0.031
ucker-0.030
ours-0.029
angles-0.028
 mouths-0.028
Token new
Token ID649
Feature activation+7.09e-01

Pos logit contributions
ifying+0.115
ucker+0.102
 mouths+0.101
ours+0.099
ishment+0.097
Neg logit contributions
 gard-0.100
 gn-0.082
Els-0.079
 guardian-0.074
 wizard-0.074
Token dress
Token ID6576
Feature activation+1.30e-01

Pos logit contributions
 gard+0.026
Els+0.021
Ar+0.019
 gn+0.019
G+0.018
Neg logit contributions
 anymore-0.025
 mouths-0.024
 tant-0.022
inches-0.022
ours-0.022
Token.
Token ID13
Feature activation+9.79e-02

Pos logit contributions
 gard+0.003
 gn+0.003
 guardian+0.002
 seek+0.002
 wizard+0.002
Neg logit contributions
ucker-0.006
ifying-0.006
ours-0.006
 mouths-0.006
angles-0.006
Token The
Token ID383
Feature activation+1.34e-02

Pos logit contributions
 gard+0.004
 gn+0.003
Els+0.003
Ar+0.003
 guardian+0.003
Neg logit contributions
ifying-0.003
ucker-0.003
ours-0.003
angles-0.003
 anymore-0.003
Token cats
Token ID11875
Feature activation-1.68e-01

Pos logit contributions
 gard+0.000
Els+0.000
 gn+0.000
Ar+0.000
 guardian+0.000
Neg logit contributions
ifying-0.001
ucker-0.001
 anymore-0.001
 mouths-0.001
ours-0.001
Token friendship
Token ID14738
Feature activation-5.36e-01

Pos logit contributions
ucker+0.019
ifying+0.019
ours+0.018
outing+0.017
angles+0.017
Neg logit contributions
 gard-0.014
 gn-0.011
 mer-0.010
 wizard-0.010
 himself-0.010
Token which
Token ID543
Feature activation-1.54e-01

Pos logit contributions
ucker+0.025
ifying+0.025
ours+0.024
 mouths+0.023
inches+0.023
Neg logit contributions
 gard-0.014
 gn-0.011
 seek-0.010
 himself-0.010
Els-0.010
Token will
Token ID481
Feature activation-2.01e-02

Pos logit contributions
ifying+0.006
ucker+0.006
ours+0.005
angles+0.005
 mouths+0.005
Neg logit contributions
 gard-0.006
 gn-0.005
Els-0.004
 guardian-0.004
 wizard-0.004
Token last
Token ID938
Feature activation-6.64e-01

Pos logit contributions
ifying+0.001
ucker+0.001
 anymore+0.001
ours+0.001
angles+0.001
Neg logit contributions
 gard-0.001
 gn-0.001
Els-0.001
Ar-0.000
 guardian-0.000
Token a
Token ID257
Feature activation+5.48e-02

Pos logit contributions
ucker+0.028
ifying+0.027
ours+0.027
outing+0.025
angles+0.025
Neg logit contributions
 gard-0.021
 gn-0.016
 seek-0.016
 wizard-0.016
 guardian-0.015
Token lifetime
Token ID10869
Feature activation-2.90e+00

Pos logit contributions
 gard+0.002
 gn+0.002
Els+0.001
 guardian+0.001
Ar+0.001
Neg logit contributions
ucker-0.002
ifying-0.002
ours-0.002
angles-0.002
 mouths-0.002
Token.
Token ID13
Feature activation-5.26e-01

Pos logit contributions
ucker+0.120
ifying+0.116
ours+0.114
 mouths+0.111
angles+0.110
Neg logit contributions
 gard-0.092
 gn-0.073
 guardian-0.067
Els-0.067
 wizard-0.066
Token<|endoftext|>
Token ID50256
Feature activation+1.71e-01

Pos logit contributions
ifying+0.023
ucker+0.023
ours+0.022
 anymore+0.022
 mouths+0.022
Neg logit contributions
 gard-0.015
 gn-0.011
Els-0.011
 guardian-0.010
Ar-0.010
TokenOnce
Token ID7454
Feature activation+1.92e-01

Pos logit contributions
 gard+0.005
 gn+0.004
Els+0.004
 guardian+0.004
 wizard+0.003
Neg logit contributions
ifying-0.007
ucker-0.007
 anymore-0.007
ours-0.007
 mouths-0.007
Token upon
Token ID2402
Feature activation-5.94e-01

Pos logit contributions
 gard+0.005
Els+0.004
 gn+0.004
 guardian+0.003
 wizard+0.003
Neg logit contributions
ifying-0.009
ishment-0.008
 mouths-0.008
ucker-0.008
 afar-0.008
Token a
Token ID257
Feature activation+3.01e-01

Pos logit contributions
ucker+0.029
ifying+0.029
ours+0.028
angles+0.027
 mouths+0.027
Neg logit contributions
 gard-0.014
 gn-0.011
Els-0.010
Ar-0.009
 seek-0.009
Token stairs
Token ID16046
Feature activation+5.39e-01

Pos logit contributions
 gard+0.016
 gn+0.013
Els+0.012
 wizard+0.012
 guardian+0.012
Neg logit contributions
ifying-0.019
ucker-0.019
ours-0.018
angles-0.017
 mouths-0.017
Token,
Token ID11
Feature activation-1.19e-01

Pos logit contributions
 gard+0.016
 gn+0.013
 wizard+0.011
 guardian+0.011
 seek+0.011
Neg logit contributions
ucker-0.023
ifying-0.023
ours-0.022
angles-0.022
 mouths-0.021
Token a
Token ID257
Feature activation-3.16e-01

Pos logit contributions
ucker+0.005
ifying+0.005
ours+0.005
angles+0.005
 mouths+0.005
Neg logit contributions
 gard-0.003
 gn-0.003
Els-0.002
 wizard-0.002
Ar-0.002
Token little
Token ID1310
Feature activation-1.37e+00

Pos logit contributions
ucker+0.013
angles+0.013
ifying+0.013
ours+0.013
inches+0.012
Neg logit contributions
 gard-0.010
 gn-0.008
 wizard-0.007
 guardian-0.007
Els-0.007
Token three
Token ID1115
Feature activation+4.01e-01

Pos logit contributions
ucker+0.062
ifying+0.062
angles+0.060
ours+0.059
 mouths+0.057
Neg logit contributions
 gard-0.040
 gn-0.030
 wizard-0.026
 guardian-0.026
Els-0.026
Token year
Token ID614
Feature activation-2.89e+00

Pos logit contributions
 gard+0.014
 gn+0.012
Els+0.011
 guardian+0.011
Ar+0.010
Neg logit contributions
ifying-0.015
ucker-0.015
ours-0.014
 anymore-0.014
 mouths-0.014
Token old
Token ID1468
Feature activation-5.25e-01

Pos logit contributions
ifying+0.124
ucker+0.122
angles+0.117
ours+0.116
 mouths+0.115
Neg logit contributions
 gard-0.090
 gn-0.072
 guardian-0.064
 wizard-0.064
Els-0.063
Token girl
Token ID2576
Feature activation-8.46e-01

Pos logit contributions
ifying+0.020
ucker+0.020
 mouths+0.020
 anymore+0.020
ours+0.020
Neg logit contributions
 gard-0.016
Els-0.013
 gn-0.013
Ar-0.012
 guardian-0.012
Token stopped
Token ID5025
Feature activation+1.49e+00

Pos logit contributions
ifying+0.035
ucker+0.034
ours+0.033
 anymore+0.033
 mouths+0.033
Neg logit contributions
 gard-0.027
 gn-0.021
Els-0.020
 guardian-0.019
Ar-0.019
Token him
Token ID683
Feature activation-2.89e-02

Pos logit contributions
 gard+0.044
 gn+0.035
 wizard+0.031
Els+0.031
 guardian+0.030
Neg logit contributions
ifying-0.065
ucker-0.064
angles-0.063
ours-0.061
 mouths-0.060
Token.
Token ID13
Feature activation-2.26e-01

Pos logit contributions
ucker+0.001
ours+0.001
ifying+0.001
angles+0.001
 mouths+0.001
Neg logit contributions
 gard-0.001
 gn-0.001
 guardian-0.001
 wizard-0.001
 elf-0.001
Token Everyone
Token ID11075
Feature activation+5.25e-01

Pos logit contributions
 gard+0.002
 gn+0.001
Ar+0.001
Els+0.001
 Bear+0.001
Neg logit contributions
ucker-0.002
ifying-0.002
inches-0.002
ours-0.002
angles-0.002
Token laughed
Token ID13818
Feature activation+3.90e-01

Pos logit contributions
 gard+0.017
 gn+0.014
Els+0.013
 guardian+0.013
Ar+0.013
Neg logit contributions
ifying-0.021
ucker-0.020
ours-0.020
 anymore-0.019
 mouths-0.019
Token and
Token ID290
Feature activation+2.95e-01

Pos logit contributions
 gard+0.011
 gn+0.008
 mer+0.008
Els+0.007
 wizard+0.007
Neg logit contributions
ucker-0.018
ifying-0.018
ours-0.017
angles-0.016
inches-0.016
Token said
Token ID531
Feature activation+5.15e-01

Pos logit contributions
 gard+0.010
 gn+0.008
Els+0.008
 guardian+0.007
Ar+0.007
Neg logit contributions
ifying-0.012
ucker-0.011
ours-0.011
 mouths-0.011
 anymore-0.011
Token how
Token ID703
Feature activation-7.06e-01

Pos logit contributions
 gard+0.017
 mer+0.013
 gn+0.013
Els+0.012
 wizard+0.011
Neg logit contributions
ifying-0.022
ucker-0.022
ishment-0.021
angles-0.021
inches-0.020
Token y
Token ID331
Feature activation-2.88e+00

Pos logit contributions
ifying+0.030
ucker+0.030
ishment+0.029
angles+0.029
ours+0.028
Neg logit contributions
 gard-0.021
 gn-0.015
 mer-0.015
Els-0.014
 seek-0.014
Tokenummy
Token ID13513
Feature activation-7.92e-01

Pos logit contributions
ifying+0.117
rible+0.104
nam+0.103
 afar+0.103
pieces+0.101
Neg logit contributions
 gard-0.078
 himself-0.067
 mer-0.059
Els-0.056
 next-0.054
Token the
Token ID262
Feature activation-3.88e-01

Pos logit contributions
ifying+0.032
ucker+0.030
ours+0.029
 anymore+0.028
 mouths+0.028
Neg logit contributions
 gard-0.027
 gn-0.022
Els-0.020
 wizard-0.019
 mer-0.019
Token pasta
Token ID26296
Feature activation+4.00e-01

Pos logit contributions
ifying+0.011
ucker+0.011
 mouths+0.011
ours+0.011
 anymore+0.010
Neg logit contributions
 gard-0.016
Els-0.014
 gn-0.014
 guardian-0.013
Ar-0.013
Token was
Token ID373
Feature activation-7.58e-01

Pos logit contributions
 gard+0.014
 gn+0.011
Els+0.011
 guardian+0.010
Ar+0.010
Neg logit contributions
ifying-0.015
ucker-0.015
ours-0.014
 mouths-0.014
 anymore-0.014
Token!
Token ID0
Feature activation-1.03e-01

Pos logit contributions
ifying+0.027
ucker+0.027
 anymore+0.026
ours+0.026
 mouths+0.025
Neg logit contributions
 gard-0.027
 gn-0.023
Els-0.022
Ar-0.021
 guardian-0.021
Token She
Token ID1375
Feature activation-1.84e-01

Pos logit contributions
ifying+0.004
ucker+0.004
ours+0.004
 mouths+0.003
 anymore+0.003
Neg logit contributions
 gard-0.003
 gn-0.002
Els-0.002
 guardian-0.002
Ar-0.002
Token takes
Token ID2753
Feature activation+3.59e-01

Pos logit contributions
ours+0.005
ucker+0.004
angles+0.004
 mouths+0.004
ifying+0.004
Neg logit contributions
 gard-0.008
Els-0.007
 gn-0.007
 seek-0.007
Ar-0.007
Token out
Token ID503
Feature activation+4.29e-01

Pos logit contributions
 gard+0.012
 gn+0.009
Els+0.009
 guardian+0.008
Ar+0.008
Neg logit contributions
ifying-0.015
ucker-0.014
ours-0.014
 anymore-0.013
 mouths-0.013
Token a
Token ID257
Feature activation-5.70e-02

Pos logit contributions
 gard+0.014
 gn+0.011
Els+0.011
Ar+0.010
 guardian+0.010
Neg logit contributions
ifying-0.017
ucker-0.016
ours-0.016
 mouths-0.016
 anymore-0.016
Token magn
Token ID7842
Feature activation+1.32e+00

Pos logit contributions
ucker+0.002
ifying+0.002
ours+0.002
angles+0.002
 mouths+0.002
Neg logit contributions
 gard-0.002
 gn-0.001
Els-0.001
 wizard-0.001
 guardian-0.001
Tokenifying
Token ID4035
Feature activation-2.88e+00

Pos logit contributions
 gard+0.090
 gn+0.081
Els+0.080
 guardian+0.078
Ar+0.078
Neg logit contributions
ucker-0.005
ifying-0.005
ours-0.003
 mouths-0.003
 anymore-0.002
Token glass
Token ID5405
Feature activation-2.57e-01

Pos logit contributions
ifying+0.085
ucker+0.083
ours+0.080
 anymore+0.077
 mouths+0.077
Neg logit contributions
 gard-0.123
 gn-0.104
Els-0.101
 guardian-0.097
Ar-0.096
Token.
Token ID13
Feature activation+2.30e-02

Pos logit contributions
ifying+0.009
ucker+0.009
ours+0.009
 anymore+0.009
 mouths+0.009
Neg logit contributions
 gard-0.009
Els-0.007
 gn-0.007
Ar-0.007
 guardian-0.007
Token She
Token ID1375
Feature activation-2.42e-01

Pos logit contributions
 gard+0.001
 gn+0.001
Els+0.001
Ar+0.000
 wizard+0.000
Neg logit contributions
ifying-0.001
ucker-0.001
 anymore-0.001
angles-0.001
 mouths-0.001
Token looks
Token ID3073
Feature activation-1.25e+00

Pos logit contributions
ours+0.007
ucker+0.006
ifying+0.006
 mouths+0.006
angles+0.006
Neg logit contributions
 gard-0.011
Els-0.009
 gn-0.009
Ar-0.009
 guardian-0.009
Token at
Token ID379
Feature activation-5.91e-01

Pos logit contributions
ifying+0.047
ucker+0.046
ours+0.043
 mouths+0.042
angles+0.042
Neg logit contributions
 gard-0.045
 gn-0.037
Els-0.034
 wizard-0.034
 mer-0.033
Token She
Token ID1375
Feature activation+3.48e-01

Pos logit contributions
 gard+0.007
 gn+0.005
Els+0.005
 guardian+0.004
 wizard+0.004
Neg logit contributions
ifying-0.012
ucker-0.012
angles-0.012
ours-0.011
 mouths-0.011
Token did
Token ID750
Feature activation+1.83e-01

Pos logit contributions
 gard+0.012
 gn+0.009
Els+0.009
 guardian+0.008
Ar+0.008
Neg logit contributions
ifying-0.014
ucker-0.013
ours-0.013
 mouths-0.013
 anymore-0.013
Token not
Token ID407
Feature activation-9.56e-01

Pos logit contributions
 gard+0.003
 gn+0.002
 himself+0.002
 next+0.002
 two+0.002
Neg logit contributions
ifying-0.011
ucker-0.010
ishment-0.010
angles-0.009
ays-0.009
Token set
Token ID900
Feature activation+1.07e+00

Pos logit contributions
ifying+0.033
 anymore+0.032
ucker+0.032
ours+0.031
 mouths+0.031
Neg logit contributions
 gard-0.035
 gn-0.029
Els-0.029
Ar-0.027
 guardian-0.027
Token the
Token ID262
Feature activation-3.51e-01

Pos logit contributions
 gard+0.031
 gn+0.025
 himself+0.024
 two+0.023
 wizard+0.023
Neg logit contributions
ifying-0.047
ucker-0.045
 mouths-0.041
inches-0.041
fy-0.041
Token timer
Token ID19781
Feature activation+2.13e+00

Pos logit contributions
ifying+0.011
ucker+0.011
ours+0.010
angles+0.010
 anymore+0.010
Neg logit contributions
 gard-0.015
 gn-0.012
Els-0.011
 wizard-0.011
 guardian-0.011
Token.
Token ID13
Feature activation+4.33e-01

Pos logit contributions
 gard+0.066
 gn+0.053
 guardian+0.047
 wizard+0.046
Els+0.044
Neg logit contributions
ifying-0.089
ucker-0.089
ours-0.084
angles-0.080
inches-0.079
Token She
Token ID1375
Feature activation+4.23e-01

Pos logit contributions
 gard+0.011
 gn+0.008
Els+0.007
 guardian+0.006
 wizard+0.006
Neg logit contributions
ifying-0.021
ucker-0.020
angles-0.020
ours-0.019
 mouths-0.019
Token did
Token ID750
Feature activation+1.75e-01

Pos logit contributions
 gard+0.013
 gn+0.011
Els+0.010
Ar+0.009
 guardian+0.009
Neg logit contributions
ifying-0.018
ucker-0.017
ours-0.017
 anymore-0.016
angles-0.016
Token not
Token ID407
Feature activation-9.29e-01

Pos logit contributions
 gard+0.003
 gn+0.002
 himself+0.002
 next+0.002
 two+0.002
Neg logit contributions
ifying-0.010
ucker-0.009
ishment-0.009
ours-0.009
angles-0.009
Token care
Token ID1337
Feature activation+3.31e-01

Pos logit contributions
ifying+0.033
 anymore+0.032
ucker+0.031
ours+0.030
 mouths+0.030
Neg logit contributions
 gard-0.034
 gn-0.028
Els-0.028
Ar-0.026
 guardian-0.026
Token the
Token ID262
Feature activation-3.63e-01

Pos logit contributions
ifying+0.004
ucker+0.004
ours+0.004
 anymore+0.004
 mouths+0.004
Neg logit contributions
 gard-0.003
 gn-0.002
Els-0.002
 guardian-0.002
Ar-0.002
Token idea
Token ID2126
Feature activation+3.47e-01

Pos logit contributions
ifying+0.016
ucker+0.015
ours+0.015
angles+0.014
 anymore+0.014
Neg logit contributions
 gard-0.011
 gn-0.009
Els-0.008
 guardian-0.008
 wizard-0.007
Token to
Token ID284
Feature activation-4.86e-02

Pos logit contributions
 gard+0.010
 gn+0.008
Els+0.007
 guardian+0.007
 wizard+0.007
Neg logit contributions
ifying-0.015
ucker-0.015
ours-0.014
angles-0.014
 mouths-0.014
Token bake
Token ID28450
Feature activation+6.45e-02

Pos logit contributions
ifying+0.002
ucker+0.002
 anymore+0.002
ours+0.002
angles+0.001
Neg logit contributions
 gard-0.002
 gn-0.002
Els-0.002
Ar-0.001
 guardian-0.001
Token a
Token ID257
Feature activation-1.80e-01

Pos logit contributions
 gard+0.002
 gn+0.001
Els+0.001
 seek+0.001
 himself+0.001
Neg logit contributions
ifying-0.003
ucker-0.003
 mouths-0.003
inches-0.003
ours-0.003
Token pe
Token ID613
Feature activation+2.72e+00

Pos logit contributions
ifying+0.007
ucker+0.007
ours+0.006
angles+0.006
 anymore+0.006
Neg logit contributions
 gard-0.007
 gn-0.005
Els-0.005
 wizard-0.005
 guardian-0.005
Tokenb
Token ID65
Feature activation+1.85e+00

Pos logit contributions
 gard+0.105
 gn+0.086
Els+0.084
Ar+0.083
 mer+0.079
Neg logit contributions
ucker-0.090
ours-0.089
ifying-0.089
angles-0.085
 anymore-0.082
Tokenble
Token ID903
Feature activation+6.63e-01

Pos logit contributions
 gard+0.069
Ar+0.062
Els+0.057
 gn+0.056
 seek+0.055
Neg logit contributions
ours-0.063
ucker-0.060
ifying-0.060
angles-0.059
outing-0.055
Token.
Token ID13
Feature activation-4.92e-02

Pos logit contributions
 gard+0.022
 gn+0.018
Els+0.016
 guardian+0.016
 wizard+0.016
Neg logit contributions
ifying-0.026
ucker-0.026
ours-0.025
 mouths-0.025
angles-0.024
Token She
Token ID1375
Feature activation+4.69e-01

Pos logit contributions
ifying+0.002
ucker+0.002
ours+0.002
 mouths+0.002
 anymore+0.002
Neg logit contributions
 gard-0.002
 gn-0.001
Els-0.001
Ar-0.001
 guardian-0.001
Token gathered
Token ID9272
Feature activation+1.24e+00

Pos logit contributions
 gard+0.015
 gn+0.012
Els+0.012
 guardian+0.011
Ar+0.011
Neg logit contributions
ucker-0.019
ifying-0.018
ours-0.018
 mouths-0.018
angles-0.017
Token more
Token ID517
Feature activation+1.26e+00

Pos logit contributions
 badly+0.007
 anymore+0.006
ifying+0.006
ucker+0.006
 wasteful+0.006
Neg logit contributions
 gard-0.012
Ar-0.011
 gn-0.011
Els-0.010
 wizard-0.010
Token careful
Token ID8161
Feature activation+8.26e-01

Pos logit contributions
 gard+0.038
 gn+0.032
Els+0.030
Ar+0.029
 gen+0.027
Neg logit contributions
ucker-0.051
ifying-0.049
 anymore-0.047
ours-0.046
izing-0.046
Token with
Token ID351
Feature activation+8.00e-01

Pos logit contributions
 gard+0.026
 gn+0.021
Els+0.020
 guardian+0.019
 wizard+0.019
Neg logit contributions
ifying-0.034
ucker-0.033
ours-0.032
 mouths-0.032
angles-0.031
Token him
Token ID683
Feature activation+8.71e-02

Pos logit contributions
 gard+0.023
 gn+0.019
Els+0.018
 himself+0.016
 elf+0.016
Neg logit contributions
ifying-0.036
ucker-0.035
ours-0.034
angles-0.033
inches-0.032
Token from
Token ID422
Feature activation+1.29e+00

Pos logit contributions
 gard+0.002
 gn+0.002
 himself+0.002
 next+0.002
 guardian+0.002
Neg logit contributions
ucker-0.004
ifying-0.004
ours-0.004
angles-0.004
inches-0.004
Token now
Token ID783
Feature activation+2.16e+00

Pos logit contributions
 gard+0.037
 gn+0.032
 himself+0.030
 elf+0.029
 wizard+0.028
Neg logit contributions
ucker-0.060
ifying-0.058
ours-0.055
angles-0.055
 mouths-0.051
Token on
Token ID319
Feature activation+4.69e-01

Pos logit contributions
 gard+0.070
 gn+0.055
Els+0.054
Ar+0.050
 guardian+0.050
Neg logit contributions
ifying-0.088
ucker-0.085
ours-0.083
 anymore-0.082
 mouths-0.081
Token.
Token ID13
Feature activation-2.40e-01

Pos logit contributions
 gard+0.013
 gn+0.011
 seek+0.009
Els+0.009
 himself+0.009
Neg logit contributions
ucker-0.021
ours-0.020
ifying-0.020
angles-0.020
outing-0.019
Token Lily
Token ID20037
Feature activation-2.71e-03

Pos logit contributions
ifying+0.009
ucker+0.009
ours+0.009
 mouths+0.008
 anymore+0.008
Neg logit contributions
 gard-0.008
 gn-0.007
Els-0.007
 guardian-0.006
Ar-0.006
Token and
Token ID290
Feature activation-4.33e-01

Pos logit contributions
ucker+0.000
 mouths+0.000
ours+0.000
angles+0.000
ifying+0.000
Neg logit contributions
 gard-0.000
Els-0.000
 gn-0.000
Ar-0.000
 guardian-0.000
Token Fl
Token ID1610
Feature activation-3.29e-01

Pos logit contributions
ifying+0.015
ucker+0.015
ours+0.014
 mouths+0.014
 anymore+0.014
Neg logit contributions
 gard-0.016
 gn-0.013
Els-0.013
 guardian-0.012
Ar-0.012
Token,
Token ID11
Feature activation+7.41e-02

Pos logit contributions
 gard+0.044
 gn+0.035
 wizard+0.031
 guardian+0.031
 mer+0.030
Neg logit contributions
ucker-0.067
ifying-0.065
ours-0.065
angles-0.063
 mouths-0.061
Token so
Token ID523
Feature activation-4.02e-01

Pos logit contributions
 gard+0.002
 gn+0.002
 wizard+0.002
Els+0.002
 guardian+0.002
Neg logit contributions
ifying-0.003
ucker-0.003
ours-0.003
angles-0.003
 mouths-0.003
Token he
Token ID339
Feature activation+3.31e-01

Pos logit contributions
ifying+0.021
ucker+0.021
ours+0.020
angles+0.020
ishment+0.019
Neg logit contributions
 gard-0.009
 gn-0.007
 wizard-0.006
 guardian-0.005
 himself-0.005
Token snapped
Token ID20821
Feature activation-5.88e-01

Pos logit contributions
 gard+0.011
Els+0.009
 guardian+0.009
 gn+0.009
Ar+0.008
Neg logit contributions
ucker-0.012
ifying-0.012
 mouths-0.011
ours-0.011
angles-0.011
Token the
Token ID262
Feature activation+1.26e-01

Pos logit contributions
ucker+0.023
ifying+0.023
ours+0.022
angles+0.021
 mouths+0.021
Neg logit contributions
 gard-0.020
 gn-0.016
Els-0.015
 guardian-0.015
 wizard-0.014
Token ash
Token ID12530
Feature activation+2.09e+00

Pos logit contributions
 gard+0.004
 gn+0.004
Els+0.003
 guardian+0.003
Ar+0.003
Neg logit contributions
ifying-0.005
ucker-0.005
ours-0.004
 anymore-0.004
 mouths-0.004
Tokentr
Token ID2213
Feature activation+1.66e+00

Pos logit contributions
 gard+0.079
 gn+0.065
Els+0.063
 guardian+0.060
Ar+0.059
Neg logit contributions
ifying-0.073
ucker-0.071
ours-0.069
 mouths-0.067
 anymore-0.067
Tokenay
Token ID323
Feature activation-7.24e-01

Pos logit contributions
 gard+0.067
 gn+0.057
Els+0.056
 guardian+0.055
 gen+0.054
Neg logit contributions
ucker-0.045
 afar-0.044
ifying-0.043
ours-0.043
 mouths-0.043
Token every
Token ID790
Feature activation-9.61e-01

Pos logit contributions
ucker+0.037
ours+0.037
 mouths+0.034
ifying+0.034
ishment+0.033
Neg logit contributions
 gard-0.014
 himself-0.012
 gn-0.011
 next-0.010
 h-0.010
Token day
Token ID1110
Feature activation-2.02e+00

Pos logit contributions
ifying+0.041
ucker+0.041
ours+0.040
angles+0.039
 mouths+0.039
Neg logit contributions
 gard-0.029
 gn-0.023
Els-0.021
 wizard-0.020
 guardian-0.020
Token.
Token ID13
Feature activation-7.01e-02

Pos logit contributions
ucker+0.085
ifying+0.084
ours+0.083
angles+0.080
 mouths+0.078
Neg logit contributions
 gard-0.062
 gn-0.050
Els-0.044
 wizard-0.044
 guardian-0.044
Token He
Token ID679
Feature activation+3.59e-02

Pos logit contributions
 gard+0.011
 gn+0.008
Els+0.008
Ar+0.008
 himself+0.007
Neg logit contributions
ifying-0.016
ucker-0.016
angles-0.016
izing-0.015
ours-0.015
Token wanted
Token ID2227
Feature activation+2.17e-01

Pos logit contributions
 gard+0.001
 gn+0.001
Els+0.001
Ar+0.001
 wizard+0.001
Neg logit contributions
ifying-0.002
ucker-0.002
ours-0.002
 anymore-0.002
angles-0.002
Token to
Token ID284
Feature activation+3.00e-02

Pos logit contributions
 gard+0.006
 gn+0.005
Els+0.005
 guardian+0.004
Ar+0.004
Neg logit contributions
ifying-0.009
ucker-0.009
ours-0.009
 mouths-0.009
 anymore-0.009
Token find
Token ID1064
Feature activation+6.15e-01

Pos logit contributions
 gard+0.001
 gn+0.001
Els+0.001
Ar+0.001
 guardian+0.001
Neg logit contributions
ifying-0.001
ucker-0.001
 anymore-0.001
 mouths-0.001
ours-0.001
Token a
Token ID257
Feature activation+3.02e-01

Pos logit contributions
 gard+0.021
 gn+0.017
Els+0.016
 guardian+0.015
Ar+0.015
Neg logit contributions
ifying-0.024
ucker-0.023
ours-0.023
 mouths-0.022
 anymore-0.022
Token pe
Token ID613
Feature activation+3.17e+00

Pos logit contributions
 gard+0.009
Els+0.007
 gn+0.007
Ar+0.007
 guardian+0.006
Neg logit contributions
ucker-0.012
ifying-0.012
 anymore-0.011
ours-0.011
 mouths-0.011
Tokenb
Token ID65
Feature activation+2.29e+00

Pos logit contributions
 gard+0.111
 gn+0.087
Els+0.085
Ar+0.083
 guardian+0.079
Neg logit contributions
ucker-0.118
ifying-0.116
ours-0.113
angles-0.110
 anymore-0.110
Tokenble
Token ID903
Feature activation+7.71e-01

Pos logit contributions
 gard+0.092
Ar+0.078
 seek+0.076
 gn+0.074
Els+0.071
Neg logit contributions
ucker-0.071
ours-0.070
angles-0.070
ifying-0.068
 anymore-0.066
Token like
Token ID588
Feature activation-6.86e-01

Pos logit contributions
 gard+0.023
 gn+0.019
 wizard+0.017
 guardian+0.016
 mer+0.016
Neg logit contributions
ucker-0.034
ifying-0.033
ours-0.032
angles-0.032
 mouths-0.031
Token Lily
Token ID20037
Feature activation-1.18e-01

Pos logit contributions
ifying+0.027
ucker+0.027
ours+0.025
angles+0.025
 mouths+0.025
Neg logit contributions
 gard-0.023
 gn-0.019
Els-0.018
 guardian-0.017
 wizard-0.017
Token's
Token ID338
Feature activation+1.69e-01

Pos logit contributions
ifying+0.006
ours+0.006
angles+0.006
ucker+0.006
inches+0.005
Neg logit contributions
 gard-0.003
 gn-0.002
 himself-0.002
 bear-0.002
 h-0.002
Token happened
Token ID3022
Feature activation-6.16e-01

Pos logit contributions
ifying+0.007
ucker+0.007
ishment+0.007
ours+0.007
 mouths+0.007
Neg logit contributions
 gard-0.006
 gn-0.005
Els-0.004
Ar-0.004
 seek-0.004
Token.
Token ID13
Feature activation+8.09e-02

Pos logit contributions
ifying+0.024
ucker+0.024
ours+0.023
 mouths+0.022
angles+0.022
Neg logit contributions
 gard-0.021
 gn-0.017
Els-0.016
 guardian-0.015
Ar-0.015
Token He
Token ID679
Feature activation+8.75e-02

Pos logit contributions
 gard+0.002
 gn+0.002
Ar+0.001
Els+0.001
 wizard+0.001
Neg logit contributions
ifying-0.004
angles-0.004
ucker-0.004
ours-0.004
 anymore-0.004
Token saw
Token ID2497
Feature activation-3.68e-01

Pos logit contributions
 gard+0.003
 gn+0.002
Els+0.002
 guardian+0.002
Ar+0.002
Neg logit contributions
ifying-0.004
ucker-0.004
ours-0.004
 anymore-0.003
 mouths-0.003
Token that
Token ID326
Feature activation-8.94e-01

Pos logit contributions
ifying+0.016
ucker+0.016
angles+0.015
 mouths+0.015
ishment+0.014
Neg logit contributions
 gard-0.011
 gn-0.010
 himself-0.009
 two-0.008
 mer-0.008
Token Lily
Token ID20037
Feature activation+2.49e-01

Pos logit contributions
ifying+0.036
ucker+0.034
ours+0.033
 mouths+0.032
angles+0.032
Neg logit contributions
 gard-0.032
 gn-0.027
 seek-0.024
Els-0.024
 mer-0.023
Token was
Token ID373
Feature activation-5.67e-01

Pos logit contributions
 gard+0.004
 gn+0.003
 bear+0.003
 following+0.003
 two+0.003
Neg logit contributions
ifying-0.014
angles-0.013
ours-0.013
ucker-0.013
 ours-0.013
Token sad
Token ID6507
Feature activation+8.24e-01

Pos logit contributions
ifying+0.013
ucker+0.013
ours+0.012
 anymore+0.012
 mouths+0.011
Neg logit contributions
 gard-0.028
 gn-0.024
Els-0.024
Ar-0.023
 guardian-0.023
Token and
Token ID290
Feature activation+5.80e-01

Pos logit contributions
 gard+0.026
 gn+0.020
Els+0.019
 wizard+0.018
 guardian+0.018
Neg logit contributions
ifying-0.035
ucker-0.034
ours-0.033
angles-0.033
 mouths-0.032
Token asked
Token ID1965
Feature activation+9.18e-01

Pos logit contributions
 gard+0.016
 gn+0.013
 wizard+0.011
 elf+0.010
Els+0.010
Neg logit contributions
ifying-0.028
ucker-0.026
angles-0.025
ours-0.025
 anymore-0.025
Token her
Token ID607
Feature activation+4.77e-01

Pos logit contributions
 gard+0.030
 gn+0.025
 himself+0.023
Els+0.021
 wizard+0.021
Neg logit contributions
ifying-0.040
ucker-0.038
 mouths-0.035
angles-0.035
ishment-0.035
Token can
Token ID460
Feature activation-2.45e-01

Pos logit contributions
ifying+0.001
ucker+0.001
 anymore+0.001
ours+0.001
angles+0.001
Neg logit contributions
 gard-0.002
Els-0.002
 gn-0.002
Ar-0.002
 elf-0.002
Token help
Token ID1037
Feature activation-2.99e-01

Pos logit contributions
 anymore+0.009
 badly+0.009
idents+0.008
 ours+0.008
ucker+0.008
Neg logit contributions
 gard-0.008
Els-0.006
Ar-0.006
 gn-0.006
 mer-0.006
Token you
Token ID345
Feature activation-5.00e-01

Pos logit contributions
ucker+0.008
 anymore+0.008
ifying+0.008
ours+0.007
angles+0.007
Neg logit contributions
 gard-0.013
 gn-0.011
Els-0.011
Ar-0.011
 guardian-0.011
Token,"
Token ID553
Feature activation+3.41e-01

Pos logit contributions
 anymore+0.015
ifying+0.013
ucker+0.013
 badly+0.013
 ours+0.012
Neg logit contributions
 gard-0.021
Els-0.018
 gn-0.018
Ar-0.018
Luc-0.016
Token replied
Token ID8712
Feature activation-3.75e-01

Pos logit contributions
 gard+0.012
 gn+0.009
Els+0.009
 guardian+0.008
Ar+0.008
Neg logit contributions
ifying-0.013
ucker-0.013
ours-0.013
 mouths-0.012
 anymore-0.012
Token the
Token ID262
Feature activation+2.60e-01

Pos logit contributions
ifying+0.016
ucker+0.016
ours+0.015
angles+0.015
outing+0.014
Neg logit contributions
 gard-0.012
 gn-0.010
 wizard-0.009
Els-0.009
 elf-0.008
Token bear
Token ID6842
Feature activation+2.28e-01

Pos logit contributions
 gard+0.008
Els+0.007
Ar+0.006
 gn+0.006
 seek+0.006
Neg logit contributions
 anymore-0.009
ifying-0.009
ucker-0.009
 mouths-0.009
 afar-0.008
Token.
Token ID13
Feature activation-2.35e-01

Pos logit contributions
 gard+0.009
 gn+0.007
Els+0.007
Ar+0.007
 guardian+0.006
Neg logit contributions
ifying-0.008
ucker-0.008
 anymore-0.008
ours-0.007
 mouths-0.007
Token They
Token ID1119
Feature activation+2.80e-01

Pos logit contributions
ucker+0.008
ours+0.008
ifying+0.008
 mouths+0.008
 anymore+0.007
Neg logit contributions
 gard-0.009
 guardian-0.007
 gn-0.007
Els-0.007
 elf-0.007
Token walked
Token ID6807
Feature activation+1.77e-01

Pos logit contributions
 gard+0.010
 gn+0.009
Els+0.008
 guardian+0.008
 wizard+0.008
Neg logit contributions
ifying-0.009
 mouths-0.009
angles-0.009
bs-0.009
ucker-0.009
Token and
Token ID290
Feature activation-4.49e-03

Pos logit contributions
 gard+0.008
 gn+0.007
Els+0.007
 gen+0.007
 guardian+0.007
Neg logit contributions
 anymore-0.004
angles-0.004
ifying-0.004
 mouths-0.004
ucker-0.004
Token.
Token ID13
Feature activation+1.86e-02

Pos logit contributions
ifying+0.016
ucker+0.016
angles+0.016
ours+0.015
 mouths+0.014
Neg logit contributions
 gard-0.007
 gn-0.006
 himself-0.005
 mer-0.005
 next-0.005
Token He
Token ID679
Feature activation+1.69e-01

Pos logit contributions
 gard+0.001
 gn+0.000
Els+0.000
Ar+0.000
 guardian+0.000
Neg logit contributions
ifying-0.001
ucker-0.001
ours-0.001
angles-0.001
 mouths-0.001
Token did
Token ID750
Feature activation-3.34e-03

Pos logit contributions
 gard+0.004
 gn+0.004
Els+0.003
Ar+0.003
 wizard+0.003
Neg logit contributions
ifying-0.008
ucker-0.008
ours-0.007
 anymore-0.007
angles-0.007
Token not
Token ID407
Feature activation-1.14e+00

Pos logit contributions
ifying+0.000
ucker+0.000
ours+0.000
ays+0.000
angles+0.000
Neg logit contributions
 gard-0.000
 gn-0.000
 himself-0.000
 seek-0.000
 next-0.000
Token like
Token ID588
Feature activation-7.50e-01

Pos logit contributions
 anymore+0.032
ucker+0.029
bs+0.028
ifying+0.027
angles+0.027
Neg logit contributions
 gard-0.048
Els-0.042
 gn-0.041
Ar-0.040
 guardian-0.038
Token to
Token ID284
Feature activation+4.02e-01

Pos logit contributions
ifying+0.032
ucker+0.031
ours+0.030
angles+0.029
ishment+0.028
Neg logit contributions
 gard-0.024
 gn-0.021
 wizard-0.018
 mer-0.018
 himself-0.017
Token share
Token ID2648
Feature activation+4.09e-01

Pos logit contributions
 gard+0.014
 gn+0.012
Els+0.011
 guardian+0.011
Ar+0.010
Neg logit contributions
ifying-0.015
ucker-0.015
ours-0.015
 mouths-0.014
angles-0.014
Token.
Token ID13
Feature activation+2.11e-01

Pos logit contributions
 gard+0.008
 himself+0.007
 gn+0.007
 seek+0.006
 next+0.005
Neg logit contributions
ucker-0.021
ifying-0.021
ours-0.021
outing-0.019
outed-0.019
Token He
Token ID679
Feature activation-8.68e-04

Pos logit contributions
 gard+0.007
 gn+0.006
Els+0.005
Ar+0.005
 guardian+0.005
Neg logit contributions
ifying-0.008
ucker-0.008
ours-0.008
angles-0.008
 mouths-0.008
Token said
Token ID531
Feature activation+6.57e-02

Pos logit contributions
ifying+0.000
ucker+0.000
ours+0.000
 anymore+0.000
angles+0.000
Neg logit contributions
 gard-0.000
 gn-0.000
Els-0.000
Ar-0.000
 wizard-0.000
Token,
Token ID11
Feature activation+6.01e-02

Pos logit contributions
 gard+0.002
 gn+0.002
Els+0.002
 wizard+0.002
 guardian+0.002
Neg logit contributions
ifying-0.003
ucker-0.002
ours-0.002
angles-0.002
 mouths-0.002
Token and
Token ID290
Feature activation+1.38e-02

Pos logit contributions
ifying+0.015
ucker+0.015
ours+0.015
 mouths+0.014
angles+0.014
Neg logit contributions
 gard-0.012
 gn-0.009
Els-0.008
 wizard-0.008
 guardian-0.008
Token it
Token ID340
Feature activation-8.26e-02

Pos logit contributions
 gard+0.000
 gn+0.000
Els+0.000
 mer+0.000
 wizard+0.000
Neg logit contributions
ifying-0.001
ucker-0.001
angles-0.001
 anymore-0.001
ours-0.001
Token fell
Token ID3214
Feature activation-8.14e-01

Pos logit contributions
ifying+0.003
ucker+0.003
ours+0.003
 mouths+0.003
 anymore+0.003
Neg logit contributions
 gard-0.003
 gn-0.002
Els-0.002
 guardian-0.002
Ar-0.002
Token to
Token ID284
Feature activation+5.95e-02

Pos logit contributions
ifying+0.026
 anymore+0.025
ucker+0.024
ours+0.023
 mouths+0.023
Neg logit contributions
 gard-0.034
Els-0.029
 gn-0.028
 guardian-0.028
 wizard-0.027
Token the
Token ID262
Feature activation+2.32e-01

Pos logit contributions
 gard+0.002
 gn+0.002
Els+0.002
 wizard+0.001
Ar+0.001
Neg logit contributions
ifying-0.002
ucker-0.002
ours-0.002
 mouths-0.002
angles-0.002
Token ground
Token ID2323
Feature activation+3.05e-01

Pos logit contributions
 gard+0.007
 gn+0.005
Els+0.004
 guardian+0.004
Ar+0.004
Neg logit contributions
ucker-0.011
ifying-0.010
ours-0.010
angles-0.010
 mouths-0.010
Token.
Token ID13
Feature activation+1.73e-01

Pos logit contributions
 gard+0.010
 gn+0.008
Els+0.007
 guardian+0.007
 wizard+0.007
Neg logit contributions
ifying-0.013
ucker-0.012
ours-0.012
angles-0.012
 mouths-0.012
Token He
Token ID679
Feature activation-6.20e-02

Pos logit contributions
 gard+0.006
 gn+0.005
Els+0.004
Ar+0.004
 wizard+0.004
Neg logit contributions
ifying-0.007
ucker-0.007
angles-0.007
ours-0.006
 anymore-0.006
Token ran
Token ID4966
Feature activation-3.45e-01

Pos logit contributions
ifying+0.003
ucker+0.003
ours+0.003
angles+0.003
 anymore+0.003
Neg logit contributions
 gard-0.002
 gn-0.001
Els-0.001
Ar-0.001
 mer-0.001
Token to
Token ID284
Feature activation-1.35e-01

Pos logit contributions
ifying+0.013
 anymore+0.013
ucker+0.013
ours+0.012
 mouths+0.012
Neg logit contributions
 gard-0.012
 gn-0.010
Els-0.009
 guardian-0.009
 wizard-0.009
Token it
Token ID340
Feature activation-6.36e-01

Pos logit contributions
ifying+0.004
ucker+0.004
ours+0.004
 anymore+0.004
 mouths+0.004
Neg logit contributions
 gard-0.005
 gn-0.004
Els-0.004
 guardian-0.004
Ar-0.004
TokenInside
Token ID24441
Feature activation+1.62e-01

Pos logit contributions
 gard+0.011
 gn+0.010
 wizard+0.009
 guardian+0.009
 elf+0.009
Neg logit contributions
ifying-0.008
ucker-0.007
ours-0.007
outing-0.006
izing-0.006
Token,
Token ID11
Feature activation+2.06e-01

Pos logit contributions
 gard+0.005
 gn+0.004
Els+0.004
 wizard+0.004
 guardian+0.004
Neg logit contributions
ifying-0.006
ucker-0.006
ours-0.006
 mouths-0.006
inches-0.006
Token the
Token ID262
Feature activation-7.53e-02

Pos logit contributions
 gard+0.007
Els+0.006
 gn+0.006
 mer+0.005
 guardian+0.005
Neg logit contributions
ucker-0.008
ifying-0.008
inches-0.007
ours-0.007
 anymore-0.007
Token waiter
Token ID46612
Feature activation-7.39e-01

Pos logit contributions
ifying+0.003
 mouths+0.003
ucker+0.003
ours+0.003
 anymore+0.003
Neg logit contributions
 gard-0.002
 gn-0.002
Els-0.002
Ar-0.002
 guardian-0.002
Token greeted
Token ID21272
Feature activation-8.41e-02

Pos logit contributions
ifying+0.031
ucker+0.030
ours+0.029
 mouths+0.029
 anymore+0.028
Neg logit contributions
 gard-0.023
 gn-0.018
Els-0.017
Ar-0.016
 guardian-0.016
Token them
Token ID606
Feature activation+7.89e-01

Pos logit contributions
ucker+0.003
ifying+0.003
inches+0.003
ours+0.003
 anymore+0.003
Neg logit contributions
 gard-0.003
Els-0.002
 gn-0.002
 himself-0.002
 mer-0.002
Token and
Token ID290
Feature activation-1.53e-01

Pos logit contributions
 gard+0.024
 gn+0.019
Els+0.017
 guardian+0.017
 mer+0.016
Neg logit contributions
ifying-0.033
ucker-0.033
ours-0.031
angles-0.031
 mouths-0.031
Token showed
Token ID3751
Feature activation+3.40e-01

Pos logit contributions
ifying+0.006
ucker+0.006
ours+0.006
 anymore+0.006
angles+0.006
Neg logit contributions
 gard-0.005
 gn-0.004
Els-0.004
 guardian-0.004
Ar-0.004
Token them
Token ID606
Feature activation+2.58e-01

Pos logit contributions
 gard+0.012
Els+0.010
 gn+0.010
 mer+0.009
 elf+0.009
Neg logit contributions
ucker-0.013
ifying-0.013
ours-0.012
inches-0.012
 anymore-0.011
Token to
Token ID284
Feature activation-2.44e-01

Pos logit contributions
 gard+0.009
 gn+0.007
Els+0.007
 guardian+0.007
 wizard+0.006
Neg logit contributions
ifying-0.010
ucker-0.010
ours-0.009
angles-0.009
 mouths-0.009
Token their
Token ID511
Feature activation+4.14e-01

Pos logit contributions
ucker+0.009
ifying+0.009
ours+0.009
inches+0.009
angles+0.009
Neg logit contributions
 gard-0.009
 gn-0.007
Els-0.007
 guardian-0.006
 wizard-0.006
Token When
Token ID1649
Feature activation-2.45e-01

Pos logit contributions
ifying+0.014
ucker+0.014
 anymore+0.013
ours+0.013
angles+0.013
Neg logit contributions
 gard-0.013
 gn-0.010
Els-0.010
 guardian-0.009
Ar-0.009
Token she
Token ID673
Feature activation-5.30e-01

Pos logit contributions
ifying+0.010
ucker+0.010
ours+0.009
angles+0.009
 mouths+0.009
Neg logit contributions
 gard-0.008
 gn-0.007
Els-0.006
 guardian-0.006
 wizard-0.006
Token was
Token ID373
Feature activation-5.13e-01

Pos logit contributions
ifying+0.022
ucker+0.021
ours+0.021
 mouths+0.020
 anymore+0.020
Neg logit contributions
 gard-0.017
 gn-0.013
Els-0.013
 guardian-0.012
Ar-0.012
Token done
Token ID1760
Feature activation+5.87e-01

Pos logit contributions
ifying+0.017
ucker+0.017
 anymore+0.016
ours+0.016
inches+0.015
Neg logit contributions
 gard-0.020
 gn-0.017
Els-0.016
Ar-0.016
 guardian-0.015
Token,
Token ID11
Feature activation+6.69e-01

Pos logit contributions
 gard+0.023
 gn+0.019
Els+0.018
Ar+0.018
 guardian+0.017
Neg logit contributions
ifying-0.020
ucker-0.019
ours-0.019
 anymore-0.018
 mouths-0.018
Token she
Token ID673
Feature activation-6.45e-03

Pos logit contributions
 gard+0.020
 gn+0.015
Els+0.014
 guardian+0.013
 wizard+0.013
Neg logit contributions
ucker-0.030
ifying-0.030
ours-0.029
angles-0.028
 mouths-0.028
Token presented
Token ID5545
Feature activation-4.21e-01

Pos logit contributions
ifying+0.000
ucker+0.000
ours+0.000
 mouths+0.000
angles+0.000
Neg logit contributions
 gard-0.000
 gn-0.000
Els-0.000
 guardian-0.000
Ar-0.000
Token it
Token ID340
Feature activation-3.15e-01

Pos logit contributions
ifying+0.017
ucker+0.017
ours+0.016
 mouths+0.015
 anymore+0.015
Neg logit contributions
 gard-0.014
 gn-0.011
Els-0.011
 guardian-0.010
 elf-0.010
Token to
Token ID284
Feature activation-3.91e-01

Pos logit contributions
ifying+0.014
ucker+0.014
ours+0.013
 mouths+0.013
angles+0.013
Neg logit contributions
 gard-0.009
 gn-0.007
Els-0.006
 guardian-0.006
 wizard-0.006
Token Jake
Token ID14757
Feature activation+4.22e-01

Pos logit contributions
ifying+0.018
ucker+0.017
ours+0.016
 mouths+0.016
angles+0.016
Neg logit contributions
 gard-0.012
 gn-0.009
Els-0.008
 wizard-0.008
 guardian-0.008
Token.
Token ID13
Feature activation-3.68e-01

Pos logit contributions
 gard+0.010
 gn+0.009
 wizard+0.008
 himself+0.008
 guardian+0.008
Neg logit contributions
ifying-0.019
ucker-0.019
 mouths-0.018
angles-0.018
ours-0.018
Token bought
Token ID5839
Feature activation+2.47e-01

Pos logit contributions
ifying+0.003
ucker+0.003
ours+0.002
 anymore+0.002
 mouths+0.002
Neg logit contributions
 gard-0.002
 gn-0.002
Els-0.001
 guardian-0.001
 wizard-0.001
Token him
Token ID683
Feature activation-5.09e-01

Pos logit contributions
 gard+0.008
 gn+0.007
Els+0.006
 wizard+0.006
 elf+0.006
Neg logit contributions
ifying-0.010
ucker-0.010
ours-0.009
inches-0.009
 mouths-0.009
Token a
Token ID257
Feature activation-9.14e-02

Pos logit contributions
ucker+0.022
ifying+0.022
ours+0.021
angles+0.021
 mouths+0.021
Neg logit contributions
 gard-0.015
 gn-0.012
Els-0.011
 guardian-0.010
 wizard-0.010
Token warm
Token ID5814
Feature activation-3.31e-01

Pos logit contributions
ifying+0.004
ucker+0.004
ours+0.003
angles+0.003
 mouths+0.003
Neg logit contributions
 gard-0.003
 gn-0.003
Els-0.002
 guardian-0.002
Ar-0.002
Token coat
Token ID13209
Feature activation-1.13e-02

Pos logit contributions
ifying+0.013
ucker+0.013
ours+0.013
angles+0.012
 mouths+0.012
Neg logit contributions
 gard-0.011
 gn-0.009
Els-0.008
 guardian-0.008
 wizard-0.008
Token so
Token ID523
Feature activation-1.05e+00

Pos logit contributions
ucker+0.000
ifying+0.000
ours+0.000
angles+0.000
 mouths+0.000
Neg logit contributions
 gard-0.000
 gn-0.000
 guardian-0.000
Els-0.000
 wizard-0.000
Token his
Token ID465
Feature activation-3.39e-01

Pos logit contributions
ucker+0.049
ifying+0.048
ours+0.047
angles+0.047
inches+0.046
Neg logit contributions
 gard-0.027
 gn-0.020
Els-0.018
 himself-0.018
 guardian-0.017
Token little
Token ID1310
Feature activation-2.69e-01

Pos logit contributions
ifying+0.013
ucker+0.013
angles+0.013
ours+0.013
inches+0.012
Neg logit contributions
 gard-0.011
 gn-0.009
 guardian-0.008
Els-0.008
 elf-0.008
Token body
Token ID1767
Feature activation-1.28e-02

Pos logit contributions
ifying+0.011
ucker+0.011
ours+0.010
angles+0.010
 anymore+0.010
Neg logit contributions
 gard-0.009
 gn-0.007
Els-0.007
 guardian-0.007
 wizard-0.006
Token would
Token ID561
Feature activation+9.84e-02

Pos logit contributions
ifying+0.001
ucker+0.001
ours+0.000
angles+0.000
 mouths+0.000
Neg logit contributions
 gard-0.000
 gn-0.000
Els-0.000
 guardian-0.000
Ar-0.000
Token not
Token ID407
Feature activation-6.90e-01

Pos logit contributions
 gard+0.003
 gn+0.003
Els+0.003
Ar+0.002
 guardian+0.002
Neg logit contributions
ifying-0.004
ucker-0.004
 anymore-0.004
 mouths-0.004
ours-0.003
Token very
Token ID845
Feature activation-1.07e-01

Pos logit contributions
ifying+0.015
ours+0.014
ucker+0.014
 anymore+0.014
outing+0.013
Neg logit contributions
 gard-0.015
 gn-0.013
Els-0.013
Ar-0.013
 elf-0.012
Token proud
Token ID6613
Feature activation+4.97e-02

Pos logit contributions
ucker+0.004
ifying+0.004
ours+0.004
outing+0.003
 anymore+0.003
Neg logit contributions
 gard-0.004
 gn-0.003
Els-0.003
Ar-0.003
 wizard-0.003
Token of
Token ID286
Feature activation-3.38e-01

Pos logit contributions
 gard+0.002
Els+0.002
 gn+0.002
Ar+0.002
 guardian+0.001
Neg logit contributions
 anymore-0.002
ifying-0.002
ucker-0.002
ours-0.002
 mouths-0.002
Token her
Token ID607
Feature activation+1.35e-01

Pos logit contributions
ucker+0.014
ifying+0.014
 mouths+0.013
angles+0.013
ours+0.013
Neg logit contributions
 gard-0.011
 gn-0.009
Els-0.009
 elf-0.008
 mer-0.008
Token little
Token ID1310
Feature activation+1.98e-01

Pos logit contributions
 gard+0.004
 gn+0.003
 guardian+0.003
 wizard+0.003
 elf+0.003
Neg logit contributions
ucker-0.006
ifying-0.006
angles-0.006
ours-0.005
 mouths-0.005
Token girl
Token ID2576
Feature activation-4.73e-01

Pos logit contributions
 gard+0.006
 gn+0.005
Els+0.005
Ar+0.004
 guardian+0.004
Neg logit contributions
ifying-0.008
ucker-0.008
ours-0.008
 anymore-0.008
 mouths-0.008
Token and
Token ID290
Feature activation+9.10e-02

Pos logit contributions
ucker+0.021
ifying+0.021
angles+0.020
ours+0.020
 mouths+0.019
Neg logit contributions
 gard-0.014
 gn-0.011
 guardian-0.010
 wizard-0.009
 elf-0.009
Token smiled
Token ID13541
Feature activation+1.12e+00

Pos logit contributions
 gard+0.003
 gn+0.002
Els+0.002
 guardian+0.002
 wizard+0.002
Neg logit contributions
ifying-0.004
ucker-0.004
ours-0.003
 anymore-0.003
 mouths-0.003
Token.
Token ID13
Feature activation-2.30e-01

Pos logit contributions
 gard+0.042
 gn+0.035
Els+0.034
Ar+0.032
 guardian+0.032
Neg logit contributions
ifying-0.039
ucker-0.038
ours-0.036
 anymore-0.036
 mouths-0.036
Token
Token ID198
Feature activation+7.89e-02

Pos logit contributions
ifying+0.009
ucker+0.009
ours+0.009
 mouths+0.009
 anymore+0.009
Neg logit contributions
 gard-0.008
 gn-0.006
Els-0.006
 guardian-0.005
Ar-0.005
Token
Token ID198
Feature activation+1.81e-01

Pos logit contributions
 gard+0.003
 gn+0.002
 guardian+0.002
 wizard+0.002
 elf+0.002
Neg logit contributions
ucker-0.003
ifying-0.003
ours-0.003
 anymore-0.003
angles-0.003
Token checking
Token ID10627
Feature activation+2.40e-01

Pos logit contributions
 gard+0.002
Els+0.002
 gn+0.002
Ar+0.002
 guardian+0.002
Neg logit contributions
ifying-0.003
ucker-0.003
 mouths-0.003
ours-0.003
 anymore-0.003
Token out
Token ID503
Feature activation+9.10e-01

Pos logit contributions
 gard+0.008
 gn+0.006
Els+0.006
 guardian+0.006
 wizard+0.006
Neg logit contributions
ucker-0.010
ifying-0.010
ours-0.009
inches-0.009
 mouths-0.009
Token all
Token ID477
Feature activation-1.66e-01

Pos logit contributions
 gard+0.028
 gn+0.024
 guardian+0.022
 himself+0.022
 mer+0.021
Neg logit contributions
ucker-0.039
inches-0.036
ours-0.036
ifying-0.035
angles-0.034
Token the
Token ID262
Feature activation+2.14e-01

Pos logit contributions
ifying+0.007
ucker+0.006
ours+0.006
 mouths+0.006
inches+0.006
Neg logit contributions
 gard-0.005
 gn-0.004
Els-0.004
 guardian-0.004
 wizard-0.004
Token tasty
Token ID25103
Feature activation-8.46e-01

Pos logit contributions
 gard+0.008
 gn+0.007
Els+0.007
 guardian+0.006
Ar+0.006
Neg logit contributions
ifying-0.007
ucker-0.007
 anymore-0.007
 mouths-0.007
ours-0.007
Token treats
Token ID18432
Feature activation-5.27e-01

Pos logit contributions
ifying+0.034
ucker+0.033
ours+0.032
angles+0.031
 anymore+0.031
Neg logit contributions
 gard-0.028
 gn-0.022
Els-0.021
 guardian-0.020
 wizard-0.019
Token.
Token ID13
Feature activation+1.13e-01

Pos logit contributions
ucker+0.023
ifying+0.022
ours+0.022
inches+0.021
 mouths+0.021
Neg logit contributions
 gard-0.016
 gn-0.012
 wizard-0.011
 guardian-0.011
Els-0.011
Token
Token ID198
Feature activation+2.55e-01

Pos logit contributions
 gard+0.005
 gn+0.004
 seek+0.004
Els+0.004
 guardian+0.004
Neg logit contributions
ifying-0.003
ours-0.003
ucker-0.003
 mouths-0.003
 anymore-0.003
Token
Token ID198
Feature activation+3.20e-01

Pos logit contributions
 gard+0.011
 gn+0.009
 guardian+0.008
 servant+0.008
 elf+0.008
Neg logit contributions
ucker-0.008
ifying-0.007
ours-0.007
 anymore-0.007
bs-0.006
TokenThe
Token ID464
Feature activation-1.05e-02

Pos logit contributions
 gard+0.014
 servant+0.012
 gn+0.012
 gen+0.012
 wizard+0.012
Neg logit contributions
ifying-0.009
izing-0.008
 badly-0.007
ized-0.007
ised-0.007
Token bear
Token ID6842
Feature activation+8.97e-02

Pos logit contributions
 mouths+0.000
 anymore+0.000
ifying+0.000
 cheesy+0.000
idents+0.000
Neg logit contributions
Els-0.000
 gard-0.000
Ar-0.000
ba-0.000
Jo-0.000
Token,
Token ID11
Feature activation-1.31e-01

Pos logit contributions
ifying+0.030
ucker+0.029
ours+0.028
 anymore+0.028
 mouths+0.027
Neg logit contributions
 gard-0.025
 gn-0.020
Els-0.019
 guardian-0.018
Ar-0.018
Token a
Token ID257
Feature activation-4.16e-01

Pos logit contributions
ucker+0.007
ifying+0.007
ours+0.007
angles+0.007
 anymore+0.006
Neg logit contributions
 gard-0.003
 gn-0.002
 seek-0.002
 mer-0.002
Els-0.002
Token bike
Token ID7161
Feature activation-1.26e+00

Pos logit contributions
ucker+0.017
angles+0.017
ours+0.017
ifying+0.016
ishment+0.016
Neg logit contributions
 gard-0.014
 gn-0.012
 wizard-0.010
 seek-0.010
 mer-0.010
Token,
Token ID11
Feature activation-7.22e-01

Pos logit contributions
ucker+0.056
ifying+0.055
 mouths+0.053
ours+0.053
ishment+0.051
Neg logit contributions
 gard-0.035
 gn-0.030
 wizard-0.025
 guardian-0.025
 seek-0.024
Token and
Token ID290
Feature activation-4.54e-01

Pos logit contributions
ucker+0.037
ifying+0.036
ours+0.035
angles+0.034
 anymore+0.034
Neg logit contributions
 gard-0.017
 gn-0.013
Els-0.010
 guardian-0.010
 mer-0.010
Token a
Token ID257
Feature activation-7.67e-01

Pos logit contributions
ucker+0.024
ifying+0.022
ours+0.022
angles+0.022
 anymore+0.022
Neg logit contributions
 gard-0.011
 gn-0.008
Els-0.007
 wizard-0.006
 seek-0.006
Token t
Token ID256
Feature activation+8.14e-01

Pos logit contributions
ucker+0.030
ours+0.029
angles+0.029
ifying+0.029
ishment+0.028
Neg logit contributions
 gard-0.027
 gn-0.022
 wizard-0.020
 mer-0.019
 guardian-0.019
Tokeneddy
Token ID21874
Feature activation+3.60e-01

Pos logit contributions
 gard+0.032
 gn+0.027
Els+0.026
 guardian+0.025
Ar+0.025
Neg logit contributions
ifying-0.026
ucker-0.026
ours-0.025
 mouths-0.025
 anymore-0.024
Token bear
Token ID6842
Feature activation-1.04e+00

Pos logit contributions
 gard+0.005
 gn+0.003
Els+0.002
 wizard+0.002
 guardian+0.002
Neg logit contributions
ifying-0.021
ucker-0.021
ours-0.021
angles-0.020
 mouths-0.020
Token.
Token ID13
Feature activation-5.05e-02

Pos logit contributions
ucker+0.043
ifying+0.043
ours+0.041
angles+0.041
 mouths+0.041
Neg logit contributions
 gard-0.032
 gn-0.027
Els-0.023
 guardian-0.023
 wizard-0.023
Token They
Token ID1119
Feature activation-1.43e-01

Pos logit contributions
ifying+0.002
ucker+0.002
angles+0.002
inches+0.002
ience+0.002
Neg logit contributions
 gard-0.002
 gn-0.001
Els-0.001
Ar-0.001
 mer-0.001
Token.
Token ID13
Feature activation+2.10e-01

Pos logit contributions
 gard+0.008
 gn+0.007
Els+0.006
 guardian+0.006
 wizard+0.006
Neg logit contributions
ifying-0.011
ucker-0.011
ours-0.010
angles-0.010
 mouths-0.010
Token It
Token ID632
Feature activation+1.34e-01

Pos logit contributions
 gard+0.005
 gn+0.004
Ar+0.004
 himself+0.003
Els+0.003
Neg logit contributions
ifying-0.010
angles-0.010
ucker-0.010
izing-0.010
ours-0.010
Token likes
Token ID7832
Feature activation-6.70e-01

Pos logit contributions
 gard+0.006
 gn+0.005
Els+0.005
 guardian+0.005
Ar+0.004
Neg logit contributions
ucker-0.004
ifying-0.004
ours-0.004
 mouths-0.004
 anymore-0.004
Token Tom
Token ID4186
Feature activation-1.45e+00

Pos logit contributions
ifying+0.031
ucker+0.029
ours+0.028
angles+0.027
inkle+0.027
Neg logit contributions
 gard-0.020
 gn-0.016
 mer-0.015
Els-0.014
Ar-0.014
Token and
Token ID290
Feature activation-2.19e-01

Pos logit contributions
ifying+0.075
ours+0.068
angles+0.067
ucker+0.067
izing+0.064
Neg logit contributions
 gard-0.031
 gn-0.024
 himself-0.022
 bear-0.021
 guardian-0.020
Token Ben
Token ID3932
Feature activation-1.65e+00

Pos logit contributions
ifying+0.009
ucker+0.009
angles+0.008
izing+0.008
outing+0.008
Neg logit contributions
 gard-0.008
 gn-0.007
 seek-0.006
 wizard-0.006
 mer-0.006
Token.
Token ID13
Feature activation-3.90e-02

Pos logit contributions
ifying+0.076
ucker+0.075
ours+0.074
angles+0.073
izing+0.068
Neg logit contributions
 gard-0.044
 gn-0.033
 mer-0.030
 himself-0.030
 guardian-0.029
Token
Token ID198
Feature activation+5.46e-02

Pos logit contributions
ifying+0.002
ucker+0.002
ours+0.002
angles+0.002
ience+0.002
Neg logit contributions
 gard-0.001
Ar-0.001
 gn-0.000
Els-0.000
 himself-0.000
Token
Token ID198
Feature activation+9.19e-02

Pos logit contributions
 gard+0.001
 gn+0.001
Ar+0.001
Els+0.001
 wizard+0.001
Neg logit contributions
ifying-0.003
ucker-0.003
 mouths-0.003
ours-0.003
 anymore-0.003
TokenTom
Token ID13787
Feature activation-1.70e-01

Pos logit contributions
 gard+0.003
Ar+0.002
 gn+0.002
Els+0.002
 seek+0.002
Neg logit contributions
ifying-0.004
ucker-0.004
 anymore-0.004
 mouths-0.004
ours-0.004
Token and
Token ID290
Feature activation-4.13e-01

Pos logit contributions
ifying+0.006
ucker+0.006
ours+0.006
 anymore+0.006
 mouths+0.006
Neg logit contributions
 gard-0.006
 gn-0.005
Els-0.005
 guardian-0.004
Ar-0.004
Token's
Token ID338
Feature activation-2.73e-03

Pos logit contributions
ifying+0.007
ucker+0.007
ours+0.007
 anymore+0.007
 mouths+0.007
Neg logit contributions
 gard-0.009
 gn-0.007
Els-0.007
 guardian-0.007
Ar-0.007
Token plan
Token ID1410
Feature activation-4.32e-01

Pos logit contributions
ifying+0.000
ucker+0.000
ours+0.000
 mouths+0.000
 anymore+0.000
Neg logit contributions
 gard-0.000
 gn-0.000
Els-0.000
 guardian-0.000
Ar-0.000
Token a
Token ID257
Feature activation+1.57e-02

Pos logit contributions
ifying+0.018
ucker+0.017
ours+0.017
 anymore+0.017
 mouths+0.016
Neg logit contributions
 gard-0.014
 gn-0.011
Els-0.011
Ar-0.010
 guardian-0.010
Token big
Token ID1263
Feature activation+3.79e-01

Pos logit contributions
 gard+0.001
 gn+0.000
Els+0.000
 guardian+0.000
Ar+0.000
Neg logit contributions
ifying-0.001
ucker-0.001
ours-0.001
 mouths-0.001
 anymore-0.001
Token art
Token ID1242
Feature activation+7.27e-01

Pos logit contributions
 gard+0.013
Els+0.011
 gn+0.011
Ar+0.010
 guardian+0.010
Neg logit contributions
ucker-0.013
 anymore-0.013
ifying-0.013
 mouths-0.013
ours-0.013
Token day
Token ID1110
Feature activation-2.17e+00

Pos logit contributions
 gard+0.029
Els+0.027
 gn+0.025
Ar+0.024
 guardian+0.023
Neg logit contributions
 anymore-0.022
angles-0.020
ifying-0.020
ours-0.020
inches-0.019
Token!"
Token ID2474
Feature activation+4.47e-01

Pos logit contributions
ifying+0.075
 anymore+0.074
ucker+0.072
ours+0.070
 mouths+0.069
Neg logit contributions
 gard-0.082
Els-0.068
 gn-0.067
 guardian-0.062
Ar-0.062
Token M
Token ID337
Feature activation-3.02e-01

Pos logit contributions
 gard+0.019
 guardian+0.017
Els+0.016
 gn+0.016
Ar+0.015
Neg logit contributions
 mouths-0.012
ucker-0.012
ours-0.011
bs-0.011
inches-0.010
Tokenimi
Token ID25236
Feature activation+4.19e-01

Pos logit contributions
ifying+0.009
ucker+0.009
ours+0.008
 anymore+0.008
 mouths+0.008
Neg logit contributions
 gard-0.013
 gn-0.011
Els-0.011
Ar-0.011
 guardian-0.010
Token said
Token ID531
Feature activation-4.46e-01

Pos logit contributions
 gard+0.017
Els+0.014
 gn+0.014
Ar+0.014
 guardian+0.013
Neg logit contributions
ucker-0.012
 mouths-0.012
ours-0.012
inches-0.012
bs-0.012
Token.
Token ID13
Feature activation-2.88e-01

Pos logit contributions
ifying+0.018
ucker+0.017
ours+0.017
angles+0.016
 mouths+0.016
Neg logit contributions
 gard-0.015
 gn-0.012
Els-0.011
 guardian-0.011
 wizard-0.011
TokenAs
Token ID1722
Feature activation-9.76e-02

Pos logit contributions
 gard+0.001
 gn+0.001
Els+0.001
 guardian+0.001
 wizard+0.001
Neg logit contributions
ifying-0.001
ucker-0.001
ours-0.001
 anymore-0.001
angles-0.001
Token she
Token ID673
Feature activation-3.49e-01

Pos logit contributions
ifying+0.004
ucker+0.004
ours+0.004
angles+0.004
inches+0.004
Neg logit contributions
 gard-0.003
 gn-0.002
Els-0.002
 seek-0.002
 wizard-0.002
Token climbed
Token ID19952
Feature activation+2.31e-01

Pos logit contributions
ifying+0.014
ucker+0.014
ours+0.013
 mouths+0.013
 anymore+0.013
Neg logit contributions
 gard-0.011
 gn-0.009
Els-0.008
 guardian-0.008
 wizard-0.008
Token higher
Token ID2440
Feature activation-1.19e+00

Pos logit contributions
 gard+0.010
 gn+0.008
Els+0.008
Ar+0.008
 guardian+0.008
Neg logit contributions
ifying-0.006
 anymore-0.006
ucker-0.006
ours-0.006
angles-0.006
Token and
Token ID290
Feature activation-1.29e-01

Pos logit contributions
ifying+0.046
ucker+0.044
ours+0.043
 mouths+0.041
angles+0.041
Neg logit contributions
 gard-0.042
 gn-0.034
Els-0.033
 guardian-0.031
 wizard-0.030
Token higher
Token ID2440
Feature activation-1.75e+00

Pos logit contributions
ifying+0.004
ucker+0.004
ours+0.004
 mouths+0.004
angles+0.004
Neg logit contributions
 gard-0.006
 gn-0.005
Els-0.005
 wizard-0.004
 guardian-0.004
Token,
Token ID11
Feature activation+1.43e-01

Pos logit contributions
ifying+0.068
ucker+0.065
ours+0.063
angles+0.061
 mouths+0.061
Neg logit contributions
 gard-0.063
 gn-0.051
Els-0.049
 wizard-0.045
 guardian-0.045
Token she
Token ID673
Feature activation-3.24e-01

Pos logit contributions
 gard+0.004
 gn+0.003
Els+0.003
 wizard+0.003
 seek+0.003
Neg logit contributions
ucker-0.007
ifying-0.006
ours-0.006
angles-0.006
inches-0.006
Token heard
Token ID2982
Feature activation+2.51e-01

Pos logit contributions
ifying+0.013
ucker+0.012
ours+0.012
 mouths+0.012
 anymore+0.012
Neg logit contributions
 gard-0.011
 gn-0.008
Els-0.008
Ar-0.008
 guardian-0.008
Token a
Token ID257
Feature activation-3.73e-01

Pos logit contributions
 gard+0.007
 gn+0.005
Els+0.005
 guardian+0.004
 wizard+0.004
Neg logit contributions
ucker-0.012
ifying-0.012
ours-0.012
angles-0.011
inches-0.011
Token loud
Token ID7812
Feature activation+6.22e-01

Pos logit contributions
ifying+0.013
ucker+0.013
ours+0.013
angles+0.012
 anymore+0.012
Neg logit contributions
 gard-0.014
 gn-0.012
Els-0.011
 guardian-0.011
 wizard-0.011
Token with
Token ID351
Feature activation-6.54e-02

Pos logit contributions
 gard+0.013
 gn+0.011
Els+0.010
 seek+0.010
 mer+0.010
Neg logit contributions
ucker-0.017
ifying-0.016
ours-0.015
inches-0.015
outing-0.014
Token the
Token ID262
Feature activation-8.77e-02

Pos logit contributions
ucker+0.003
ifying+0.003
ours+0.003
inches+0.003
outing+0.003
Neg logit contributions
 gard-0.002
 two-0.001
 himself-0.001
Els-0.001
 gn-0.001
Token watch
Token ID2342
Feature activation+6.54e-01

Pos logit contributions
ucker+0.004
ours+0.004
ifying+0.004
inches+0.004
angles+0.003
Neg logit contributions
 gard-0.003
 gn-0.002
 wizard-0.002
 guardian-0.002
 elf-0.002
Token,
Token ID11
Feature activation-1.16e-01

Pos logit contributions
 gard+0.017
 gn+0.013
 wizard+0.012
 guardian+0.012
 seek+0.012
Neg logit contributions
ucker-0.031
ours-0.028
ifying-0.028
inches-0.028
ishment-0.027
Token looking
Token ID2045
Feature activation-5.23e-01

Pos logit contributions
ifying+0.005
ucker+0.005
ours+0.005
inches+0.005
angles+0.005
Neg logit contributions
 gard-0.004
 gn-0.003
 mer-0.003
 guardian-0.002
 seek-0.002
Token forward
Token ID2651
Feature activation-1.85e+00

Pos logit contributions
ucker+0.022
ifying+0.020
ours+0.020
inches+0.020
 mouths+0.019
Neg logit contributions
 gard-0.017
 mer-0.013
 gn-0.013
 guardian-0.012
Els-0.012
Token to
Token ID284
Feature activation+7.64e-02

Pos logit contributions
ifying+0.088
ucker+0.088
ours+0.085
 mouths+0.084
angles+0.083
Neg logit contributions
 gard-0.046
 gn-0.033
Els-0.031
 guardian-0.029
 wizard-0.028
Token many
Token ID867
Feature activation+4.28e-01

Pos logit contributions
 gard+0.002
 gn+0.002
Els+0.002
 wizard+0.002
 seek+0.002
Neg logit contributions
ucker-0.003
ifying-0.003
ours-0.003
outing-0.003
inches-0.003
Token more
Token ID517
Feature activation+7.10e-01

Pos logit contributions
 gard+0.012
 gn+0.010
 wizard+0.009
Els+0.009
 elf+0.009
Neg logit contributions
ucker-0.020
ifying-0.020
ours-0.019
inches-0.018
outing-0.017
Token happy
Token ID3772
Feature activation+5.46e-01

Pos logit contributions
 gard+0.022
 gn+0.018
 wizard+0.017
 guardian+0.016
Els+0.016
Neg logit contributions
ifying-0.030
ucker-0.030
ours-0.029
angles-0.028
inches-0.027
Token days
Token ID1528
Feature activation-1.30e+00

Pos logit contributions
 gard+0.018
 gn+0.014
Els+0.013
 guardian+0.013
Ar+0.013
Neg logit contributions
ifying-0.022
ucker-0.022
ours-0.021
 mouths-0.020
angles-0.020
Token who
Token ID508
Feature activation-1.24e-01

Pos logit contributions
ifying+0.008
ucker+0.007
ours+0.007
 mouths+0.007
angles+0.007
Neg logit contributions
 gard-0.008
 gn-0.006
Els-0.006
 guardian-0.006
 wizard-0.006
Token loved
Token ID6151
Feature activation+4.56e-01

Pos logit contributions
ifying+0.005
ucker+0.005
ours+0.005
angles+0.004
 mouths+0.004
Neg logit contributions
 gard-0.004
 gn-0.004
Els-0.003
 guardian-0.003
Ar-0.003
Token to
Token ID284
Feature activation+3.46e-01

Pos logit contributions
 gard+0.013
 gn+0.010
Els+0.010
 guardian+0.009
Ar+0.009
Neg logit contributions
ifying-0.020
ucker-0.020
ours-0.019
 mouths-0.019
 anymore-0.019
Token kiss
Token ID9245
Feature activation-6.48e-01

Pos logit contributions
 gard+0.013
 gn+0.011
Els+0.010
 guardian+0.010
Ar+0.010
Neg logit contributions
ifying-0.012
ucker-0.012
ours-0.012
 anymore-0.011
 mouths-0.011
Token.
Token ID13
Feature activation+6.89e-02

Pos logit contributions
ucker+0.032
ours+0.031
ifying+0.031
angles+0.030
inches+0.029
Neg logit contributions
 gard-0.017
 gn-0.011
Els-0.010
 mer-0.010
 wizard-0.010
Token Everywhere
Token ID48201
Feature activation-1.94e+00

Pos logit contributions
 gard+0.002
 gn+0.002
Els+0.002
 guardian+0.002
Ar+0.002
Neg logit contributions
ifying-0.003
ucker-0.003
ours-0.003
angles-0.003
 mouths-0.003
Token she
Token ID673
Feature activation-2.75e-01

Pos logit contributions
ifying+0.108
ucker+0.106
ishment+0.100
outing+0.100
angles+0.099
Neg logit contributions
 gard-0.039
 gn-0.024
Els-0.021
 fairy-0.021
 wizard-0.021
Token went
Token ID1816
Feature activation-4.59e-02

Pos logit contributions
ifying+0.011
ucker+0.011
ours+0.011
 mouths+0.010
 anymore+0.010
Neg logit contributions
 gard-0.008
 gn-0.007
Els-0.007
 guardian-0.006
Ar-0.006
Token,
Token ID11
Feature activation+8.35e-02

Pos logit contributions
ifying+0.002
ucker+0.002
ours+0.002
angles+0.002
 mouths+0.002
Neg logit contributions
 gard-0.002
 gn-0.001
Els-0.001
 guardian-0.001
 wizard-0.001
Token she
Token ID673
Feature activation-8.34e-02

Pos logit contributions
 gard+0.002
 gn+0.002
Els+0.002
 wizard+0.001
 guardian+0.001
Neg logit contributions
ucker-0.004
ifying-0.004
ours-0.004
inches-0.004
angles-0.004
Token always
Token ID1464
Feature activation+1.08e-02

Pos logit contributions
ifying+0.003
ucker+0.003
ours+0.003
angles+0.003
 mouths+0.003
Neg logit contributions
 gard-0.003
 gn-0.002
Els-0.002
 wizard-0.002
Ar-0.002