TOP ACTIVATIONS
MAX = 3.531

lived a nice bear named Che wy . Che wy had
big , generous dog named Che wy . Che wy loved
small and brave girl named Che y enne . She always
day , an explorer named Che e wanted to study a
was a z ebra named Stri pe . Stri pe lived
a sweet little girl named St acey . She was three
was a little girl named St acey . She was very
were two friends , N ee N ee and Bob o
was a z ebra named Stri pe y . Stri pe
was a z ebra named Stri pe y . He was
Can we pretend to be dent ists next time ?" they
was a little cube named Cub ey . Cub ey loved
She grabbed oven m itt s and carefully pulled out
there were two twins named Marcus and Maggie . They lived
was a little girl named St itch . She was very
Sam thinks it is too high and dangerous .
was a creative girl named St itch . She loved to
was a bunny called B oun cy . B oun cy
want to replace each other ever again . <|endoftext|> Anna and
two friends , Lisa and St u . Lisa loved to

MIN ACTIVATIONS
MIN = -3.436

Bad dog ! Leave the girls alone !" The
didn 't know what " sm oother " meant , but
soft copper coin . E sm ie wanted to pick up
ow , ow ! The rain is bad . It is
er ful " and " sm iley ". But it was
grandma 's house for safe keeping . When
. One day , E sm ie decided to travel to
The naughty cat had out sm arted the bees and learned
reliable that it had out sm arted the big scary shark
and his friend , Sue , went to the park .
happy that they had out sm arted the mean man .
a small elephant called E sm ie who loved to travel
and his dog , Max , went to the park .
and his friend , Tom , went to the park .
also added some spicy - sm elling flowers to her dress
friends , Lily and Ben , went to play in the
away noticed some sour - sm elling bushes . She had
But it 's mine ! Nobody else can have it ".
and his dog , Max , went to the park .
" But why not ? Everybody is welcome to come join

INTERVAL 1.789 - 3.531
CONTAINS 0.140%

. " Be ep be ep !" Tom said
in " Fun Land " ever after . <|endoftext|> One day
not to trust everything that looks like a nut . They
were two best friends , Mike and Sam . Every day
a modern train . It can make sounds and move .

INTERVAL 0.047 - 1.789
CONTAINS 46.365%

him to the hospital , but the bite was too deep
that basket ?" The lady smiled and said ,
, Anna . I will help you . But you have
" Of course you can ," mom said .
too sm elly and scary ." Uncle Bob took off his

INTERVAL -1.694 - 0.047
CONTAINS 53.267%

is our ball . We were playing with it . Give
shouting with joy . He had never seen anything like it
" Hello , Lily . I am your magic wand .
around in it . She looked in the mirror and smiled
a girl called Mary . Mary loved to play in her

INTERVAL -3.436 - -1.694
CONTAINS 0.228%

yours , this is for someone else . You have to
perform for her family and friends . She has a big
inside , Amy saw snow fl akes falling from the sky
stayed obedient and kept the girl calm .
hands and laughing loudly . Everyone around him smiled and applauded
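
Note on the interval edges: the four INTERVAL ranges above are consistent with splitting the range from MIN (-3.436) to MAX (3.531) into four equal-width bins, each (3.531 - (-3.436)) / 4 ≈ 1.742 wide. A minimal Python sketch of that binning follows. It assumes equal-width bins, which is inferred from the numbers and not stated in the source. The helper names `equal_width_edges` and `bin_shares` are hypothetical, and the sample values are only the first eight per-token activations listed in the records below, not the full dataset behind the CONTAINS percentages.

```python
def equal_width_edges(lo: float, hi: float, n_bins: int) -> list[float]:
    """Return n_bins + 1 edges from lo to hi; the last edge is exactly hi."""
    step = (hi - lo) / n_bins
    return [lo + i * step for i in range(n_bins)] + [hi]


def bin_shares(values: list[float], lo: float, hi: float, n_bins: int) -> list[float]:
    """Percentage of values per bin, in ascending bin order.

    Assumes every value lies in [lo, hi]; the index is clamped so that
    a value equal to hi is counted in the last bin.
    """
    step = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        i = max(0, min(int((v - lo) / step), n_bins - 1))
        counts[i] += 1
    return [100.0 * c / len(values) for c in counts]


# MIN and MAX as reported in the dump.
LO, HI = -3.436, 3.531
edges = equal_width_edges(LO, HI, 4)

# Illustrative sample only: the first eight "Feature activation" values
# from the per-token records (lived, a, nice, bear, named, Che, wy, .).
sample = [-0.451, -0.554, -0.733, 0.785, -0.473, 3.53, 1.92, -0.883]
shares = bin_shares(sample, LO, HI, 4)

print([f"{e:.3f}" for e in edges])
print(shares)
```

The dump lists its intervals from highest to lowest, while this sketch returns bins in ascending order, so the lists would need reversing to line up with the INTERVAL blocks.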
Token lived
Token ID5615
Feature activation-4.51e-01

Pos logit contributions
 endings+0.014
emic+0.014
und+0.014
Em+0.013
 ow+0.013
Neg logit contributions
 one-0.011
 moral-0.010
 closer-0.010
 lesson-0.009
ossible-0.009
Token a
Token ID257
Feature activation-5.54e-01

Pos logit contributions
emic+0.014
 unnecessary+0.014
 endings+0.014
Em+0.013
osity+0.013
Neg logit contributions
 one-0.007
 moral-0.005
 end-0.005
 lesson-0.005
 closer-0.005
Token nice
Token ID3621
Feature activation-7.33e-01

Pos logit contributions
emic+0.014
 endings+0.014
 unnecessary+0.013
osity+0.013
 situations+0.013
Neg logit contributions
 one-0.013
 moral-0.012
 end-0.011
 closer-0.011
 first-0.010
Token bear
Token ID6842
Feature activation+7.85e-01

Pos logit contributions
emic+0.017
 endings+0.016
 unnecessary+0.016
osity+0.016
Em+0.015
Neg logit contributions
 one-0.018
 moral-0.015
 end-0.015
 closer-0.014
 lesson-0.014
Token named
Token ID3706
Feature activation-4.73e-01

Pos logit contributions
 one+0.017
 moral+0.015
 end+0.014
 closer+0.014
 lesson+0.014
Neg logit contributions
emic-0.019
 endings-0.018
osity-0.018
 unnecessary-0.018
Em-0.017
Token Che
Token ID2580
Feature activation+3.53e+00

Pos logit contributions
emic+0.012
 endings+0.011
 unnecessary+0.011
osity+0.011
 situations+0.010
Neg logit contributions
 one-0.012
 moral-0.009
 end-0.009
 first-0.009
 closer-0.009
Tokenwy
Token ID21768
Feature activation+1.92e+00

Pos logit contributions
 one+0.087
 closer+0.075
 moral+0.072
 lesson+0.070
 some+0.067
Neg logit contributions
emic-0.083
Em-0.077
 endings-0.077
 unnecessary-0.076
und-0.076
Token.
Token ID13
Feature activation-8.83e-01

Pos logit contributions
 one+0.033
 moral+0.026
 end+0.024
 closer+0.023
 lesson+0.023
Neg logit contributions
emic-0.059
 endings-0.057
 unnecessary-0.057
osity-0.056
Em-0.054
Token Che
Token ID2580
Feature activation+2.78e+00

Pos logit contributions
 unnecessary+0.017
emic+0.016
 endings+0.016
und+0.015
osity+0.015
Neg logit contributions
 one-0.023
 moral-0.021
 closer-0.021
 end-0.020
 lesson-0.019
Tokenwy
Token ID21768
Feature activation+1.54e+00

Pos logit contributions
 one+0.070
 closer+0.063
 moral+0.057
 lesson+0.055
 particular+0.054
Neg logit contributions
emic-0.063
 endings-0.059
und-0.059
Em-0.059
 unnecessary-0.058
Token had
Token ID550
Feature activation+3.31e-01

Pos logit contributions
 one+0.030
 moral+0.025
 closer+0.024
 end+0.024
 lesson+0.023
Neg logit contributions
emic-0.043
 endings-0.041
 unnecessary-0.041
osity-0.040
Em-0.039
Token big
Token ID1263
Feature activation-2.46e-01

Pos logit contributions
emic+0.012
 endings+0.012
 unnecessary+0.012
osity+0.011
 situations+0.011
Neg logit contributions
 one-0.012
 moral-0.010
 end-0.010
 closer-0.009
 first-0.009
Token,
Token ID11
Feature activation-4.23e-01

Pos logit contributions
emic+0.006
 unnecessary+0.006
 endings+0.006
osity+0.006
 pian+0.006
Neg logit contributions
 one-0.005
 moral-0.004
 lesson-0.004
 closer-0.004
 end-0.004
Token generous
Token ID14431
Feature activation+9.59e-02

Pos logit contributions
emic+0.012
 unnecessary+0.012
 endings+0.011
osity+0.011
Em+0.011
Neg logit contributions
 one-0.008
 moral-0.007
 end-0.006
 closer-0.006
 lesson-0.006
Token dog
Token ID3290
Feature activation+1.30e+00

Pos logit contributions
 one+0.002
 moral+0.002
 closer+0.002
 end+0.002
 lesson+0.002
Neg logit contributions
emic-0.002
 endings-0.002
osity-0.002
 unnecessary-0.002
Em-0.002
Token named
Token ID3706
Feature activation-6.32e-01

Pos logit contributions
 one+0.032
 moral+0.030
 lesson+0.028
 closer+0.028
 end+0.028
Neg logit contributions
emic-0.027
 endings-0.025
Em-0.025
osity-0.025
 pian-0.024
Token Che
Token ID2580
Feature activation+3.37e+00

Pos logit contributions
emic+0.017
 endings+0.016
osity+0.016
 unnecessary+0.016
 hey+0.015
Neg logit contributions
 one-0.015
 moral-0.011
 first-0.011
 end-0.011
 closer-0.011
Tokenwy
Token ID21768
Feature activation+1.23e+00

Pos logit contributions
 one+0.083
 closer+0.074
 moral+0.072
 lesson+0.070
 particular+0.066
Neg logit contributions
emic-0.078
Em-0.070
und-0.070
 endings-0.069
 unnecessary-0.068
Token.
Token ID13
Feature activation-5.26e-01

Pos logit contributions
 one+0.022
 closer+0.018
 moral+0.018
 lesson+0.017
 end+0.016
Neg logit contributions
emic-0.034
 endings-0.034
osity-0.032
 unnecessary-0.032
 pian-0.031
Token Che
Token ID2580
Feature activation+2.80e+00

Pos logit contributions
emic+0.010
 unnecessary+0.010
osity+0.009
 endings+0.009
und+0.009
Neg logit contributions
 one-0.014
 moral-0.013
 closer-0.012
 end-0.012
 lesson-0.012
Tokenwy
Token ID21768
Feature activation+1.22e+00

Pos logit contributions
 one+0.074
 closer+0.069
 moral+0.062
 particular+0.060
 lesson+0.059
Neg logit contributions
emic-0.059
und-0.057
Em-0.054
 endings-0.053
 unnecessary-0.052
Token loved
Token ID6151
Feature activation-6.31e-01

Pos logit contributions
 one+0.026
 closer+0.023
 moral+0.023
 end+0.021
 lesson+0.021
Neg logit contributions
emic-0.030
 endings-0.030
osity-0.028
Em-0.028
 unnecessary-0.027
Token small
Token ID1402
Feature activation-2.38e-01

Pos logit contributions
 endings+0.008
emic+0.008
 unnecessary+0.008
osity+0.008
 situations+0.008
Neg logit contributions
 one-0.009
 moral-0.008
 end-0.008
 closer-0.007
 first-0.007
Token and
Token ID290
Feature activation-9.85e-01

Pos logit contributions
emic+0.007
 unnecessary+0.006
 endings+0.006
osity+0.006
Em+0.006
Neg logit contributions
 one-0.005
 moral-0.004
 end-0.004
 closer-0.004
 first-0.004
Token brave
Token ID14802
Feature activation-2.34e-01

Pos logit contributions
 ow+0.019
Em+0.019
 cob+0.019
 importantly+0.019
 pian+0.018
Neg logit contributions
ort-0.023
oral-0.021
 one-0.021
 moral-0.020
ossible-0.020
Token girl
Token ID2576
Feature activation-9.35e-01

Pos logit contributions
emic+0.007
 endings+0.007
 unnecessary+0.007
osity+0.006
und+0.006
Neg logit contributions
 one-0.005
 end-0.004
 moral-0.004
 first-0.004
 closer-0.004
Token named
Token ID3706
Feature activation-1.20e+00

Pos logit contributions
emic+0.031
 unnecessary+0.031
osity+0.030
 endings+0.030
mic+0.028
Neg logit contributions
 one-0.017
 end-0.012
 moral-0.011
 first-0.010
ossible-0.010
Token Che
Token ID2580
Feature activation+3.37e+00

Pos logit contributions
emic+0.031
 endings+0.030
 unnecessary+0.030
osity+0.029
 importantly+0.029
Neg logit contributions
 one-0.030
 first-0.024
 closer-0.023
 end-0.023
 moral-0.023
Tokeny
Token ID88
Feature activation-1.19e-01

Pos logit contributions
 one+0.078
 closer+0.067
 moral+0.066
 lesson+0.063
 end+0.059
Neg logit contributions
emic-0.084
Em-0.078
 endings-0.076
und-0.076
 unnecessary-0.075
Tokenenne
Token ID29727
Feature activation-2.06e-01

Pos logit contributions
emic+0.004
 unnecessary+0.004
 endings+0.004
osity+0.004
 situations+0.004
Neg logit contributions
 one-0.002
 end-0.001
 moral-0.001
 first-0.001
 closer-0.001
Token.
Token ID13
Feature activation-9.67e-01

Pos logit contributions
emic+0.006
 endings+0.006
 unnecessary+0.006
osity+0.006
Em+0.006
Neg logit contributions
 one-0.003
 moral-0.003
 end-0.003
 closer-0.003
 lesson-0.003
Token She
Token ID1375
Feature activation-6.08e-01

Pos logit contributions
emic+0.024
 unnecessary+0.023
 endings+0.022
osity+0.022
Em+0.021
Neg logit contributions
 one-0.023
 moral-0.019
 end-0.019
 closer-0.018
 lesson-0.018
Token always
Token ID1464
Feature activation+1.57e-02

Pos logit contributions
emic+0.017
 unnecessary+0.017
osity+0.016
 endings+0.016
 situations+0.015
Neg logit contributions
 one-0.012
 moral-0.010
 end-0.010
 lesson-0.009
 first-0.009
Token day
Token ID1110
Feature activation+3.54e-01

Pos logit contributions
ort+0.001
 one+0.001
 moral+0.001
 closer+0.001
 seller+0.001
Neg logit contributions
emic-0.002
 importantly-0.002
 unnecessary-0.002
 endings-0.002
Em-0.002
Token,
Token ID11
Feature activation-2.09e-01

Pos logit contributions
 one+0.006
 moral+0.006
 end+0.005
 lesson+0.005
 closer+0.005
Neg logit contributions
 endings-0.009
Em-0.009
emic-0.009
 importantly-0.009
 unnecessary-0.009
Token an
Token ID281
Feature activation-1.16e+00

Pos logit contributions
emic+0.006
 endings+0.006
Em+0.006
 unnecessary+0.005
 importantly+0.005
Neg logit contributions
 moral-0.004
 one-0.003
 end-0.003
 lesson-0.003
ort-0.003
Token explorer
Token ID39349
Feature activation-7.14e-01

Pos logit contributions
emic+0.028
 endings+0.027
 unnecessary+0.027
 pian+0.026
Em+0.026
Neg logit contributions
 one-0.025
 closer-0.021
 moral-0.020
ossible-0.020
 lesson-0.019
Token named
Token ID3706
Feature activation-7.02e-01

Pos logit contributions
emic+0.020
 unnecessary+0.020
 endings+0.019
osity+0.019
Em+0.018
Neg logit contributions
 one-0.014
 moral-0.012
 end-0.011
 closer-0.011
 lesson-0.010
Token Che
Token ID2580
Feature activation+3.20e+00

Pos logit contributions
 unnecessary+0.021
emic+0.020
 endings+0.019
 situations+0.018
osity+0.018
Neg logit contributions
 one-0.016
 first-0.012
 some-0.011
 end-0.011
 closer-0.011
Tokene
Token ID68
Feature activation+1.74e+00

Pos logit contributions
 one+0.081
 closer+0.071
 moral+0.067
 lesson+0.067
 bite+0.064
Neg logit contributions
emic-0.069
Em-0.067
und-0.064
 endings-0.064
 hey-0.061
Token wanted
Token ID2227
Feature activation+5.93e-01

Pos logit contributions
 one+0.029
 moral+0.021
 end+0.021
 first+0.018
 lesson+0.018
Neg logit contributions
 unnecessary-0.056
emic-0.056
 endings-0.053
osity-0.053
 importantly-0.051
Token to
Token ID284
Feature activation-7.62e-02

Pos logit contributions
 one+0.012
 moral+0.010
 end+0.009
 closer+0.009
 lesson+0.009
Neg logit contributions
emic-0.017
 unnecessary-0.016
 endings-0.016
osity-0.016
Em-0.015
Token study
Token ID2050
Feature activation+2.90e-02

Pos logit contributions
emic+0.002
 unnecessary+0.002
 endings+0.002
osity+0.002
Em+0.002
Neg logit contributions
 one-0.002
 moral-0.001
 end-0.001
 closer-0.001
 first-0.001
Token a
Token ID257
Feature activation-2.36e-01

Pos logit contributions
 one+0.000
 moral+0.000
 end+0.000
 closer+0.000
 lesson+0.000
Neg logit contributions
emic-0.001
 unnecessary-0.001
 endings-0.001
osity-0.001
Em-0.001
Token was
Token ID373
Feature activation+4.07e-01

Pos logit contributions
 endings+0.019
emic+0.019
und+0.018
 unnecessary+0.018
 ow+0.018
Neg logit contributions
 one-0.014
 moral-0.012
 closer-0.011
 lesson-0.011
 end-0.010
Token a
Token ID257
Feature activation-4.11e-01

Pos logit contributions
 one+0.006
 moral+0.006
 end+0.005
 closer+0.005
ort+0.005
Neg logit contributions
emic-0.012
 endings-0.012
 unnecessary-0.011
Em-0.011
 importantly-0.011
Token z
Token ID1976
Feature activation+1.63e+00

Pos logit contributions
 endings+0.010
emic+0.010
 unnecessary+0.010
osity+0.010
 situations+0.009
Neg logit contributions
 one-0.010
 moral-0.009
 end-0.008
 closer-0.008
 first-0.008
Tokenebra
Token ID37052
Feature activation+1.33e+00

Pos logit contributions
 one+0.043
 moral+0.038
 lesson+0.037
 end+0.036
 closer+0.035
Neg logit contributions
emic-0.034
 unnecessary-0.032
 endings-0.031
osity-0.031
 hey-0.030
Token named
Token ID3706
Feature activation-5.07e-01

Pos logit contributions
 one+0.032
 moral+0.029
 lesson+0.027
 end+0.027
 closer+0.027
Neg logit contributions
emic-0.030
 endings-0.028
osity-0.027
 unnecessary-0.026
 pian-0.026
Token Stri
Token ID26137
Feature activation+3.16e+00

Pos logit contributions
emic+0.012
 unnecessary+0.011
 endings+0.011
osity+0.011
Em+0.010
Neg logit contributions
 one-0.013
 moral-0.011
 end-0.011
 closer-0.010
 lesson-0.010
Tokenpe
Token ID431
Feature activation+1.76e+00

Pos logit contributions
 one+0.062
 moral+0.051
 closer+0.049
 lesson+0.047
 end+0.047
Neg logit contributions
emic-0.089
 endings-0.082
und-0.081
osity-0.081
Em-0.081
Token.
Token ID13
Feature activation-6.30e-01

Pos logit contributions
 one+0.033
 moral+0.029
 closer+0.028
 lesson+0.027
 end+0.026
Neg logit contributions
emic-0.047
 endings-0.045
osity-0.044
und-0.043
 unnecessary-0.043
Token Stri
Token ID26137
Feature activation+2.76e+00

Pos logit contributions
emic+0.012
 unnecessary+0.012
 endings+0.011
osity+0.011
Em+0.010
Neg logit contributions
 one-0.018
 moral-0.016
 end-0.015
 closer-0.015
 lesson-0.015
Tokenpe
Token ID431
Feature activation+1.14e+00

Pos logit contributions
 one+0.051
 moral+0.040
 closer+0.040
 end+0.038
 lesson+0.037
Neg logit contributions
emic-0.082
und-0.076
 endings-0.075
osity-0.073
Em-0.073
Token lived
Token ID5615
Feature activation-6.24e-01

Pos logit contributions
 one+0.026
 moral+0.024
 closer+0.023
 lesson+0.022
 end+0.022
Neg logit contributions
emic-0.026
und-0.024
 endings-0.024
osity-0.023
 unnecessary-0.022
Token a
Token ID257
Feature activation-4.89e-01

Pos logit contributions
 one+0.005
 moral+0.005
 end+0.004
 closer+0.004
 lesson+0.004
Neg logit contributions
emic-0.010
 endings-0.010
 unnecessary-0.010
Em-0.010
 importantly-0.010
Token sweet
Token ID6029
Feature activation+8.54e-02

Pos logit contributions
 endings+0.012
emic+0.011
 unnecessary+0.011
osity+0.011
 situations+0.011
Neg logit contributions
 one-0.013
 moral-0.011
 end-0.011
 closer-0.010
 first-0.010
Token little
Token ID1310
Feature activation-3.62e-01

Pos logit contributions
 one+0.002
 moral+0.002
 end+0.002
 closer+0.002
 lesson+0.002
Neg logit contributions
emic-0.002
 unnecessary-0.002
 endings-0.002
osity-0.002
Em-0.002
Token girl
Token ID2576
Feature activation-7.77e-02

Pos logit contributions
emic+0.009
 unnecessary+0.009
 endings+0.009
osity+0.008
Em+0.008
Neg logit contributions
 one-0.009
 moral-0.007
 end-0.007
 closer-0.007
 lesson-0.007
Token named
Token ID3706
Feature activation-5.08e-01

Pos logit contributions
 unnecessary+0.002
emic+0.002
 endings+0.002
osity+0.002
 situations+0.002
Neg logit contributions
 one-0.002
 end-0.001
 moral-0.001
 closer-0.001
 first-0.001
Token St
Token ID520
Feature activation+3.03e+00

Pos logit contributions
 unnecessary+0.015
 endings+0.015
emic+0.015
 situations+0.014
osity+0.014
Neg logit contributions
 one-0.011
 first-0.008
 some-0.008
 end-0.008
 moral-0.008
Tokenacey
Token ID25415
Feature activation+7.51e-01

Pos logit contributions
 one+0.081
 moral+0.070
 closer+0.068
 end+0.066
 first+0.065
Neg logit contributions
emic-0.067
 endings-0.063
 unnecessary-0.061
osity-0.059
Em-0.058
Token.
Token ID13
Feature activation-5.68e-01

Pos logit contributions
 one+0.012
 moral+0.009
 end+0.009
 closer+0.008
 lesson+0.008
Neg logit contributions
emic-0.024
 unnecessary-0.024
 endings-0.023
osity-0.023
Em-0.022
Token She
Token ID1375
Feature activation-8.64e-01

Pos logit contributions
emic+0.012
 unnecessary+0.012
 endings+0.012
osity+0.011
Em+0.011
Neg logit contributions
 one-0.015
 moral-0.013
 end-0.013
 closer-0.012
 lesson-0.012
Token was
Token ID373
Feature activation+2.01e-02

Pos logit contributions
emic+0.024
 unnecessary+0.023
 endings+0.023
osity+0.023
 situations+0.021
Neg logit contributions
 one-0.018
 moral-0.015
 end-0.014
 lesson-0.013
 closer-0.013
Token three
Token ID1115
Feature activation-4.15e-01

Pos logit contributions
 one+0.000
 first+0.000
 moral+0.000
 end+0.000
 closer+0.000
Neg logit contributions
 endings-0.001
emic-0.001
 unnecessary-0.001
osity-0.001
 situations-0.001
Token was
Token ID373
Feature activation+4.13e-01

Pos logit contributions
 endings+0.013
emic+0.013
und+0.012
 ow+0.012
Em+0.012
Neg logit contributions
 one-0.010
 moral-0.009
 closer-0.008
 lesson-0.008
ossible-0.008
Token a
Token ID257
Feature activation-4.42e-01

Pos logit contributions
 one+0.006
 moral+0.005
 end+0.005
 closer+0.005
 lesson+0.005
Neg logit contributions
emic-0.012
 endings-0.012
 unnecessary-0.012
Em-0.012
 importantly-0.012
Token little
Token ID1310
Feature activation-4.49e-01

Pos logit contributions
 endings+0.011
emic+0.010
 unnecessary+0.010
osity+0.010
 situations+0.010
Neg logit contributions
 one-0.011
 moral-0.010
 end-0.009
 closer-0.009
 first-0.009
Token girl
Token ID2576
Feature activation+2.64e-02

Pos logit contributions
emic+0.011
 endings+0.011
 unnecessary+0.011
osity+0.011
Em+0.010
Neg logit contributions
 one-0.010
 moral-0.009
 end-0.008
 closer-0.008
 lesson-0.008
Token named
Token ID3706
Feature activation-6.63e-01

Pos logit contributions
 one+0.001
 end+0.000
 moral+0.000
 closer+0.000
 first+0.000
Neg logit contributions
 unnecessary-0.001
emic-0.001
 endings-0.001
osity-0.001
 situations-0.001
Token St
Token ID520
Feature activation+3.03e+00

Pos logit contributions
 unnecessary+0.020
 endings+0.020
 situations+0.019
emic+0.019
osity+0.019
Neg logit contributions
 one-0.015
 first-0.011
 end-0.010
 some-0.010
 closer-0.010
Tokenacey
Token ID25415
Feature activation+8.76e-01

Pos logit contributions
 one+0.082
 moral+0.070
 closer+0.068
 end+0.066
 lesson+0.065
Neg logit contributions
emic-0.066
 endings-0.063
 unnecessary-0.061
osity-0.059
Em-0.058
Token.
Token ID13
Feature activation-5.69e-01

Pos logit contributions
 one+0.015
 moral+0.011
 end+0.011
 closer+0.010
 lesson+0.010
Neg logit contributions
emic-0.028
 unnecessary-0.027
 endings-0.027
osity-0.026
Em-0.025
Token She
Token ID1375
Feature activation-8.03e-01

Pos logit contributions
emic+0.012
 unnecessary+0.012
 endings+0.012
osity+0.011
 situations+0.011
Neg logit contributions
 one-0.015
 moral-0.013
 end-0.013
 first-0.012
 closer-0.012
Token was
Token ID373
Feature activation-1.40e-02

Pos logit contributions
 unnecessary+0.025
emic+0.025
osity+0.025
 endings+0.024
 situations+0.024
Neg logit contributions
 one-0.015
 moral-0.011
 end-0.011
 lesson-0.010
 first-0.010
Token very
Token ID845
Feature activation+6.78e-01

Pos logit contributions
 endings+0.000
emic+0.000
 unnecessary+0.000
 situations+0.000
osity+0.000
Neg logit contributions
 one-0.000
 first-0.000
 moral-0.000
 end-0.000
 closer-0.000
Token were
Token ID547
Feature activation-1.74e-01

Pos logit contributions
emic+0.016
 endings+0.016
und+0.015
 unnecessary+0.015
osity+0.015
Neg logit contributions
 one-0.010
 moral-0.009
 closer-0.008
 end-0.008
 lesson-0.008
Token two
Token ID734
Feature activation+7.82e-01

Pos logit contributions
emic+0.006
 unnecessary+0.006
osity+0.006
 endings+0.006
 situations+0.006
Neg logit contributions
 one-0.003
 moral-0.002
 end-0.002
 first-0.002
 closer-0.001
Token friends
Token ID2460
Feature activation-6.01e-01

Pos logit contributions
 one+0.018
 moral+0.016
 end+0.015
 closer+0.015
 lesson+0.015
Neg logit contributions
emic-0.019
 unnecessary-0.018
 endings-0.018
osity-0.017
 importantly-0.017
Token,
Token ID11
Feature activation-1.59e+00

Pos logit contributions
emic+0.014
 endings+0.013
 unnecessary+0.013
osity+0.013
Em+0.013
Neg logit contributions
 one-0.014
 moral-0.012
 lesson-0.012
 end-0.012
 closer-0.012
Token N
Token ID399
Feature activation+1.24e+00

Pos logit contributions
emic+0.041
 unnecessary+0.039
 endings+0.039
osity+0.038
Em+0.037
Neg logit contributions
 one-0.035
 moral-0.030
 end-0.029
 closer-0.027
 lesson-0.027
Tokenee
Token ID1453
Feature activation+3.03e+00

Pos logit contributions
 one+0.033
 moral+0.029
 end+0.028
 closer+0.027
 lesson+0.027
Neg logit contributions
emic-0.027
 endings-0.025
 unnecessary-0.025
osity-0.024
Em-0.023
TokenN
Token ID45
Feature activation+7.97e-01

Pos logit contributions
 one+0.054
 moral+0.049
 closer+0.046
 lesson+0.044
 end+0.044
Neg logit contributions
emic-0.085
Em-0.080
 endings-0.078
osity-0.078
und-0.077
Tokenee
Token ID1453
Feature activation+2.77e+00

Pos logit contributions
 one+0.023
 moral+0.020
 closer+0.020
 end+0.019
 lesson+0.019
Neg logit contributions
emic-0.015
 endings-0.014
osity-0.014
 unnecessary-0.014
Em-0.013
Token and
Token ID290
Feature activation-1.38e+00

Pos logit contributions
 one+0.051
 moral+0.046
 closer+0.044
 lesson+0.042
 end+0.041
Neg logit contributions
emic-0.077
Em-0.072
 endings-0.071
osity-0.071
und-0.070
Token Bob
Token ID5811
Feature activation+8.48e-01

Pos logit contributions
emic+0.037
 unnecessary+0.036
 endings+0.036
osity+0.035
 situations+0.033
Neg logit contributions
 one-0.031
 moral-0.024
 end-0.024
 first-0.023
 closer-0.023
Tokeno
Token ID78
Feature activation+1.64e+00

Pos logit contributions
 one+0.016
 moral+0.016
 lesson+0.015
 end+0.014
 closer+0.014
Neg logit contributions
emic-0.022
 endings-0.020
osity-0.020
 unnecessary-0.020
und-0.020
Token was
Token ID373
Feature activation+3.87e-01

Pos logit contributions
 endings+0.015
emic+0.015
und+0.015
Em+0.014
 ow+0.014
Neg logit contributions
 one-0.012
 moral-0.010
 closer-0.010
 lesson-0.009
ossible-0.009
Token a
Token ID257
Feature activation-4.19e-01

Pos logit contributions
 one+0.006
 moral+0.005
 end+0.004
 closer+0.004
 lesson+0.004
Neg logit contributions
emic-0.012
 endings-0.012
 unnecessary-0.012
Em-0.011
 importantly-0.011
Token z
Token ID1976
Feature activation+1.47e+00

Pos logit contributions
 endings+0.010
emic+0.010
 unnecessary+0.010
osity+0.010
 situations+0.009
Neg logit contributions
 one-0.011
 moral-0.009
 end-0.009
 closer-0.008
 first-0.008
Tokenebra
Token ID37052
Feature activation+1.38e+00

Pos logit contributions
 one+0.039
 moral+0.035
 lesson+0.033
 end+0.032
 closer+0.031
Neg logit contributions
emic-0.032
 unnecessary-0.028
 hey-0.028
 endings-0.028
osity-0.028
Token named
Token ID3706
Feature activation-5.02e-01

Pos logit contributions
 one+0.034
 moral+0.031
 end+0.029
 lesson+0.029
 closer+0.029
Neg logit contributions
emic-0.030
 endings-0.028
osity-0.027
 unnecessary-0.027
 pian-0.026
Token Stri
Token ID26137
Feature activation+3.01e+00

Pos logit contributions
emic+0.011
 unnecessary+0.010
 endings+0.010
osity+0.010
Em+0.009
Neg logit contributions
 one-0.013
 moral-0.012
 end-0.011
 lesson-0.011
 closer-0.011
Tokenpe
Token ID431
Feature activation+1.82e+00

Pos logit contributions
 one+0.055
 moral+0.045
 closer+0.043
 lesson+0.042
 end+0.041
Neg logit contributions
emic-0.089
 endings-0.082
und-0.082
Em-0.081
osity-0.080
Tokeny
Token ID88
Feature activation+1.51e-01

Pos logit contributions
 one+0.033
 moral+0.029
 closer+0.027
 lesson+0.027
 end+0.026
Neg logit contributions
emic-0.050
 endings-0.048
osity-0.047
 unnecessary-0.046
und-0.045
Token.
Token ID13
Feature activation-6.91e-01

Pos logit contributions
 one+0.003
 moral+0.002
 end+0.002
 closer+0.002
 lesson+0.002
Neg logit contributions
emic-0.005
 unnecessary-0.005
 endings-0.004
osity-0.004
Em-0.004
Token Stri
Token ID26137
Feature activation+2.58e+00

Pos logit contributions
emic+0.015
 unnecessary+0.014
 endings+0.014
osity+0.013
Em+0.013
Neg logit contributions
 one-0.019
 moral-0.016
 end-0.016
 closer-0.015
 lesson-0.015
Tokenpe
Token ID431
Feature activation+1.01e+00

Pos logit contributions
 one+0.050
 closer+0.040
 moral+0.038
 end+0.037
 lesson+0.036
Neg logit contributions
emic-0.075
und-0.071
 endings-0.069
Em-0.066
ode-0.066
Token was
Token ID373
Feature activation+4.99e-01

Pos logit contributions
 endings+0.018
emic+0.017
und+0.017
Em+0.017
 ow+0.017
Neg logit contributions
 one-0.014
 moral-0.013
 closer-0.012
 lesson-0.011
ossible-0.011
Token a
Token ID257
Feature activation-3.73e-01

Pos logit contributions
 one+0.007
 moral+0.006
 end+0.006
 closer+0.005
 lesson+0.005
Neg logit contributions
emic-0.015
 endings-0.015
 unnecessary-0.015
Em-0.015
 importantly-0.014
Token z
Token ID1976
Feature activation+1.45e+00

Pos logit contributions
 endings+0.009
emic+0.009
 unnecessary+0.009
osity+0.008
 situations+0.008
Neg logit contributions
 one-0.010
 moral-0.008
 end-0.008
 closer-0.008
 first-0.008
Tokenebra
Token ID37052
Feature activation+1.34e+00

Pos logit contributions
 one+0.037
 moral+0.033
 lesson+0.031
 end+0.031
 closer+0.030
Neg logit contributions
emic-0.032
 unnecessary-0.029
 endings-0.029
osity-0.029
 hey-0.029
Token named
Token ID3706
Feature activation-5.71e-01

Pos logit contributions
 one+0.033
 moral+0.030
 end+0.028
 lesson+0.028
 closer+0.028
Neg logit contributions
emic-0.029
 endings-0.027
osity-0.026
 unnecessary-0.026
 pian-0.026
Token Stri
Token ID26137
Feature activation+2.96e+00

Pos logit contributions
emic+0.012
 unnecessary+0.012
 endings+0.012
osity+0.011
Em+0.011
Neg logit contributions
 one-0.015
 moral-0.013
 end-0.013
 closer-0.012
 first-0.012
Tokenpe
Token ID431
Feature activation+1.77e+00

Pos logit contributions
 one+0.056
 moral+0.046
 closer+0.044
 lesson+0.043
 end+0.043
Neg logit contributions
emic-0.086
 endings-0.078
und-0.078
Em-0.078
osity-0.076
Tokeny
Token ID88
Feature activation+1.29e-01

Pos logit contributions
 one+0.033
 moral+0.029
 closer+0.027
 lesson+0.027
 end+0.026
Neg logit contributions
emic-0.048
 endings-0.046
osity-0.045
 unnecessary-0.045
und-0.044
Token.
Token ID13
Feature activation-7.70e-01

Pos logit contributions
 one+0.002
 moral+0.002
 end+0.002
 closer+0.002
 lesson+0.002
Neg logit contributions
emic-0.004
 unnecessary-0.004
 endings-0.004
osity-0.004
Em-0.003
Token He
Token ID679
Feature activation-7.91e-02

Pos logit contributions
emic+0.015
 unnecessary+0.015
 endings+0.015
osity+0.014
Em+0.013
Neg logit contributions
 one-0.022
 moral-0.019
 end-0.018
 closer-0.018
 lesson-0.017
Token was
Token ID373
Feature activation-2.79e-01

Pos logit contributions
emic+0.002
 endings+0.002
Em+0.002
 unnecessary+0.002
 pian+0.002
Neg logit contributions
 one-0.002
 closer-0.002
 moral-0.002
 end-0.002
 lesson-0.001
TokenCan
Token ID6090
Feature activation+1.05e+00

Pos logit contributions
 one+0.008
 moral+0.006
 end+0.006
 closer+0.006
 lesson+0.006
Neg logit contributions
emic-0.007
 endings-0.007
 unnecessary-0.007
osity-0.006
Em-0.006
Token we
Token ID356
Feature activation+6.55e-02

Pos logit contributions
 one+0.028
 moral+0.024
 end+0.023
 closer+0.022
 lesson+0.022
Neg logit contributions
emic-0.023
 endings-0.021
 unnecessary-0.021
osity-0.021
Em-0.020
Token pretend
Token ID16614
Feature activation+3.12e-01

Pos logit contributions
 one+0.001
 moral+0.001
 end+0.001
 closer+0.001
 first+0.001
Neg logit contributions
emic-0.002
 unnecessary-0.002
 endings-0.002
osity-0.002
Em-0.002
Token to
Token ID284
Feature activation+2.77e-01

Pos logit contributions
 one+0.007
 moral+0.006
 end+0.005
 lesson+0.005
 closer+0.005
Neg logit contributions
emic-0.008
 endings-0.008
 unnecessary-0.008
osity-0.007
Em-0.007
Token be
Token ID307
Feature activation+1.39e+00

Pos logit contributions
 one+0.006
 moral+0.005
 lesson+0.005
 end+0.005
 closer+0.005
Neg logit contributions
emic-0.007
 endings-0.007
 unnecessary-0.007
 pian-0.006
osity-0.006
Token dent
Token ID18794
Feature activation+2.95e+00

Pos logit contributions
 one+0.023
 moral+0.016
 some+0.015
 end+0.015
 closer+0.014
Neg logit contributions
emic-0.047
osity-0.045
 endings-0.044
 unnecessary-0.044
Em-0.043
Tokenists
Token ID1023
Feature activation+8.03e-01

Pos logit contributions
 one+0.052
 moral+0.045
 end+0.041
 lesson+0.041
ossible+0.041
Neg logit contributions
emic-0.083
 unnecessary-0.081
 endings-0.080
osity-0.077
 pian-0.076
Token next
Token ID1306
Feature activation+6.61e-01

Pos logit contributions
 one+0.014
 moral+0.012
 lesson+0.011
 end+0.010
 closer+0.010
Neg logit contributions
emic-0.023
 endings-0.022
 unnecessary-0.022
osity-0.021
 pian-0.021
Token time
Token ID640
Feature activation+9.80e-01

Pos logit contributions
 one+0.012
 moral+0.010
 end+0.009
 closer+0.009
 lesson+0.008
Neg logit contributions
emic-0.020
 unnecessary-0.019
 endings-0.019
osity-0.019
Em-0.018
Token?"
Token ID1701
Feature activation-5.37e-02

Pos logit contributions
 one+0.016
 moral+0.013
 end+0.012
 closer+0.011
 lesson+0.011
Neg logit contributions
emic-0.031
 unnecessary-0.029
 endings-0.029
osity-0.029
Em-0.028
Token they
Token ID484
Feature activation+4.60e-01

Pos logit contributions
Em+0.001
 importantly+0.001
emic+0.001
 endings+0.001
 pian+0.001
Neg logit contributions
 one-0.001
 end-0.001
 lesson-0.001
 moral-0.001
 last-0.001
Token was
Token ID373
Feature activation+2.85e-01

Pos logit contributions
 endings+0.018
emic+0.018
und+0.017
 ow+0.017
Em+0.017
Neg logit contributions
 one-0.014
 moral-0.012
 closer-0.011
 lesson-0.011
ossible-0.010
Token a
Token ID257
Feature activation-4.14e-01

Pos logit contributions
 one+0.004
 moral+0.004
 end+0.003
 closer+0.003
 lesson+0.003
Neg logit contributions
emic-0.008
 unnecessary-0.008
 endings-0.008
Em-0.008
 importantly-0.008
Token little
Token ID1310
Feature activation-4.13e-01

Pos logit contributions
emic+0.010
 endings+0.010
 unnecessary+0.010
osity+0.010
 situations+0.009
Neg logit contributions
 one-0.010
 moral-0.008
 end-0.008
 closer-0.008
 first-0.007
Token cube
Token ID23441
Feature activation+6.09e-01

Pos logit contributions
emic+0.010
 unnecessary+0.010
 endings+0.010
osity+0.010
Em+0.009
Neg logit contributions
 one-0.009
 moral-0.008
 end-0.008
 closer-0.007
 lesson-0.007
Token named
Token ID3706
Feature activation-4.33e-01

Pos logit contributions
 one+0.012
 moral+0.010
 end+0.009
 first+0.009
 closer+0.009
Neg logit contributions
emic-0.018
 unnecessary-0.017
 endings-0.017
osity-0.017
Em-0.016
Token Cub
Token ID7070
Feature activation+2.92e+00

Pos logit contributions
emic+0.009
osity+0.009
 unnecessary+0.009
 endings+0.009
Em+0.008
Neg logit contributions
 one-0.011
 moral-0.010
 end-0.010
 lesson-0.009
 closer-0.009
Tokeney
Token ID2959
Feature activation+6.02e-01

Pos logit contributions
 one+0.060
 closer+0.056
 particular+0.055
 moral+0.055
 lesson+0.052
Neg logit contributions
emic-0.070
 endings-0.068
 unnecessary-0.064
cess-0.060
osity-0.059
Token.
Token ID13
Feature activation-1.04e+00

Pos logit contributions
 one+0.011
 moral+0.011
 closer+0.010
 lesson+0.010
 end+0.010
Neg logit contributions
emic-0.014
 endings-0.014
osity-0.014
 importantly-0.014
cess-0.013
Token Cub
Token ID7070
Feature activation+2.37e+00

Pos logit contributions
 unnecessary+0.020
 endings+0.020
und+0.019
emic+0.018
osity+0.018
Neg logit contributions
 one-0.025
 moral-0.023
 closer-0.023
 end-0.022
 lesson-0.022
Tokeney
Token ID2959
Feature activation+8.63e-01

Pos logit contributions
 one+0.054
 closer+0.053
 particular+0.050
 moral+0.048
 lesson+0.045
Neg logit contributions
 endings-0.051
emic-0.050
und-0.047
 unnecessary-0.046
ode-0.044
Token loved
Token ID6151
Feature activation-7.56e-01

Pos logit contributions
 one+0.021
 closer+0.020
 moral+0.019
 lesson+0.018
 end+0.018
Neg logit contributions
emic-0.017
 endings-0.017
und-0.015
osity-0.015
Em-0.015
Token
Token ID220
Feature activation-1.00e+00

Pos logit contributions
 unnecessary+0.011
emic+0.010
 endings+0.010
 pian+0.009
 ow+0.009
Neg logit contributions
 one-0.008
 moral-0.006
ossible-0.006
 lesson-0.006
 some-0.006
Token She
Token ID1375
Feature activation-2.17e-01

Pos logit contributions
 unnecessary+0.038
 endings+0.038
emic+0.037
 pian+0.036
 situations+0.036
Neg logit contributions
 one-0.010
 moral-0.006
 first-0.005
ossible-0.005
 some-0.005
Token grabbed
Token ID13176
Feature activation-1.68e-01

Pos logit contributions
emic+0.005
 endings+0.005
Em+0.005
osity+0.005
 unnecessary+0.005
Neg logit contributions
 one-0.005
 moral-0.004
 end-0.004
 closer-0.004
 lesson-0.004
Token oven
Token ID14361
Feature activation+8.36e-01

Pos logit contributions
emic+0.005
 unnecessary+0.005
osity+0.004
 endings+0.004
 hey+0.004
Neg logit contributions
 one-0.003
 moral-0.003
 end-0.002
ossible-0.002
 lesson-0.002
Token m
Token ID285
Feature activation+3.89e-01

Pos logit contributions
 one+0.013
 moral+0.012
 end+0.011
ossible+0.010
 lesson+0.010
Neg logit contributions
emic-0.025
osity-0.024
 endings-0.024
 unnecessary-0.024
 pian-0.023
Tokenitt
Token ID715
Feature activation+2.91e+00

Pos logit contributions
 one+0.011
 moral+0.009
 end+0.009
 closer+0.009
 lesson+0.009
Neg logit contributions
emic-0.008
 unnecessary-0.008
 endings-0.008
osity-0.007
Em-0.007
Tokens
Token ID82
Feature activation-2.34e-01

Pos logit contributions
 one+0.064
 moral+0.061
 end+0.057
 lesson+0.054
 particular+0.051
Neg logit contributions
osity-0.068
emic-0.068
 unnecessary-0.068
 endings-0.063
und-0.063
Token and
Token ID290
Feature activation+3.62e-01

Pos logit contributions
emic+0.007
 endings+0.007
 unnecessary+0.007
osity+0.007
und+0.007
Neg logit contributions
 one-0.004
 moral-0.003
 end-0.003
 lesson-0.003
 closer-0.003
Token carefully
Token ID7773
Feature activation+4.66e-01

Pos logit contributions
 one+0.007
 moral+0.006
 end+0.005
 closer+0.005
 first+0.005
Neg logit contributions
emic-0.010
 unnecessary-0.010
 endings-0.010
osity-0.010
Em-0.009
Token pulled
Token ID5954
Feature activation-2.86e-01

Pos logit contributions
 one+0.010
 moral+0.008
 end+0.008
 lesson+0.008
 closer+0.008
Neg logit contributions
emic-0.012
 endings-0.012
 unnecessary-0.012
osity-0.011
Em-0.011
Token out
Token ID503
Feature activation+2.28e-01

Pos logit contributions
emic+0.008
 unnecessary+0.008
 endings+0.008
osity+0.008
Em+0.007
Neg logit contributions
 one-0.005
 moral-0.004
 end-0.004
 lesson-0.004
ossible-0.004
Token there
Token ID612
Feature activation-5.98e-01

Pos logit contributions
 one+0.017
 moral+0.014
 end+0.013
 lesson+0.012
 closer+0.012
Neg logit contributions
emic-0.034
 endings-0.032
 unnecessary-0.032
osity-0.032
Em-0.031
Token were
Token ID547
Feature activation-1.82e-02

Pos logit contributions
emic+0.017
 endings+0.016
und+0.015
osity+0.015
Em+0.015
Neg logit contributions
 one-0.011
 moral-0.009
 closer-0.008
 lesson-0.008
 end-0.008
Token two
Token ID734
Feature activation+7.32e-01

Pos logit contributions
emic+0.001
osity+0.001
 unnecessary+0.001
 endings+0.001
 situations+0.001
Neg logit contributions
 one-0.000
 moral-0.000
 end-0.000
 first-0.000
 closer-0.000
Token twins
Token ID20345
Feature activation+3.01e-01

Pos logit contributions
 one+0.017
 moral+0.015
 end+0.014
 closer+0.014
 lesson+0.014
Neg logit contributions
emic-0.017
 unnecessary-0.016
 endings-0.016
osity-0.016
 importantly-0.015
Token named
Token ID3706
Feature activation-4.06e-01

Pos logit contributions
 one+0.007
 moral+0.005
 end+0.005
 first+0.005
 closer+0.005
Neg logit contributions
emic-0.008
 unnecessary-0.008
 endings-0.008
osity-0.008
 situations-0.007
Token Marcus
Token ID17068
Feature activation+2.91e+00

Pos logit contributions
 endings+0.011
 unnecessary+0.011
emic+0.011
osity+0.011
 importantly+0.011
Neg logit contributions
 one-0.009
 first-0.007
 end-0.007
 moral-0.007
 closer-0.007
Token and
Token ID290
Feature activation-1.08e+00

Pos logit contributions
 one+0.048
 moral+0.045
 closer+0.042
 lesson+0.040
 end+0.040
Neg logit contributions
emic-0.082
Em-0.077
osity-0.075
 endings-0.073
 hey-0.072
Token Maggie
Token ID33014
Feature activation+2.44e-01

Pos logit contributions
emic+0.035
 unnecessary+0.034
 endings+0.034
osity+0.033
 situations+0.032
Neg logit contributions
 one-0.022
 first-0.014
ossible-0.014
 some-0.014
 end-0.014
Token.
Token ID13
Feature activation-7.49e-01

Pos logit contributions
 one+0.004
 moral+0.003
 end+0.003
 closer+0.003
 lesson+0.003
Neg logit contributions
emic-0.007
 endings-0.007
 unnecessary-0.007
osity-0.007
Em-0.007
Token They
Token ID1119
Feature activation+1.26e-01

Pos logit contributions
emic+0.016
 endings+0.016
 unnecessary+0.015
osity+0.015
und+0.015
Neg logit contributions
 one-0.019
 moral-0.017
 end-0.016
 closer-0.016
 lesson-0.015
Token lived
Token ID5615
Feature activation-5.74e-01

Pos logit contributions
 one+0.003
 moral+0.002
 closer+0.002
 end+0.002
 lesson+0.002
Neg logit contributions
emic-0.003
 endings-0.003
 unnecessary-0.003
osity-0.003
Em-0.003
Token was
Token ID373
Feature activation+1.67e-01

Pos logit contributions
emic+0.020
 endings+0.020
und+0.019
Em+0.019
 unnecessary+0.019
Neg logit contributions
 one-0.014
 moral-0.012
 closer-0.011
 lesson-0.010
 end-0.010
Token a
Token ID257
Feature activation-5.26e-01

Pos logit contributions
 one+0.002
 moral+0.002
 end+0.002
 closer+0.002
 lesson+0.002
Neg logit contributions
emic-0.005
 unnecessary-0.005
 endings-0.005
Em-0.005
 importantly-0.005
Token little
Token ID1310
Feature activation-5.37e-01

Pos logit contributions
 endings+0.012
emic+0.012
 unnecessary+0.012
osity+0.012
 situations+0.011
Neg logit contributions
 one-0.014
 moral-0.012
 end-0.011
 closer-0.011
 first-0.011
Token girl
Token ID2576
Feature activation-1.03e-01

Pos logit contributions
emic+0.014
 endings+0.013
 unnecessary+0.013
osity+0.013
Em+0.012
Neg logit contributions
 one-0.012
 moral-0.010
 end-0.010
 closer-0.010
 lesson-0.009
Token named
Token ID3706
Feature activation-5.78e-01

Pos logit contributions
 unnecessary+0.003
emic+0.003
 endings+0.003
osity+0.003
 situations+0.003
Neg logit contributions
 one-0.002
 end-0.001
 moral-0.001
 closer-0.001
 first-0.001
Token St
Token ID520
Feature activation+2.90e+00

Pos logit contributions
 endings+0.016
 unnecessary+0.016
emic+0.016
 situations+0.016
osity+0.015
Neg logit contributions
 one-0.014
 first-0.011
 end-0.010
 moral-0.010
 closer-0.010
Tokenitch
Token ID2007
Feature activation+1.97e+00

Pos logit contributions
 one+0.077
 moral+0.065
 closer+0.064
 end+0.062
 first+0.062
Neg logit contributions
emic-0.065
 endings-0.061
 unnecessary-0.059
osity-0.057
Em-0.056
Token.
Token ID13
Feature activation-7.22e-01

Pos logit contributions
 one+0.036
 moral+0.030
 closer+0.028
 lesson+0.027
 end+0.027
Neg logit contributions
emic-0.057
 endings-0.055
 unnecessary-0.053
osity-0.052
Em-0.051
Token She
Token ID1375
Feature activation-8.76e-01

Pos logit contributions
emic+0.015
 unnecessary+0.014
 endings+0.014
osity+0.014
Em+0.013
Neg logit contributions
 one-0.020
 moral-0.017
 end-0.016
 closer-0.016
 lesson-0.016
Token was
Token ID373
Feature activation-1.56e-01

Pos logit contributions
emic+0.026
 unnecessary+0.026
osity+0.025
 endings+0.025
 situations+0.024
Neg logit contributions
 one-0.017
 moral-0.014
 end-0.013
 lesson-0.012
 first-0.012
Token very
Token ID845
Feature activation+5.36e-01

Pos logit contributions
 endings+0.005
emic+0.005
 unnecessary+0.005
osity+0.005
 situations+0.005
Neg logit contributions
 one-0.003
 moral-0.002
 first-0.002
 end-0.002
 closer-0.002
Token Sam
Token ID3409
Feature activation+1.60e-01

Pos logit contributions
 one+0.002
 moral+0.002
 closer+0.001
 end+0.001
 first+0.001
Neg logit contributions
emic-0.003
 unnecessary-0.003
 endings-0.003
osity-0.003
 importantly-0.003
Token thinks
Token ID6834
Feature activation+5.04e-01

Pos logit contributions
 one+0.004
 end+0.003
 moral+0.003
 lesson+0.003
ort+0.003
Neg logit contributions
emic-0.003
 unnecessary-0.003
 pian-0.003
und-0.003
 endings-0.003
Token it
Token ID340
Feature activation-2.32e-02

Pos logit contributions
 one+0.007
 moral+0.006
 end+0.005
ossible+0.005
 closer+0.005
Neg logit contributions
emic-0.016
 unnecessary-0.016
 endings-0.016
osity-0.015
 importantly-0.015
Token is
Token ID318
Feature activation+9.53e-01

Pos logit contributions
emic+0.001
 unnecessary+0.001
 endings+0.001
osity+0.001
und+0.001
Neg logit contributions
 one-0.000
 moral-0.000
 end-0.000
ossible-0.000
 closer-0.000
Token too
Token ID1165
Feature activation+2.09e+00

Pos logit contributions
 one+0.015
 moral+0.011
 end+0.010
 closer+0.010
 first+0.010
Neg logit contributions
emic-0.032
 endings-0.031
osity-0.031
 unnecessary-0.031
Em-0.030
Token high
Token ID1029
Feature activation+2.87e+00

Pos logit contributions
 one+0.034
 moral+0.029
 end+0.027
ossible+0.025
 closer+0.024
Neg logit contributions
emic-0.064
 unnecessary-0.061
 endings-0.059
osity-0.059
und-0.057
Token and
Token ID290
Feature activation-4.74e-01

Pos logit contributions
 one+0.042
 moral+0.035
 end+0.031
ossible+0.029
 closer+0.029
Neg logit contributions
emic-0.091
 endings-0.090
 unnecessary-0.089
osity-0.087
 situations-0.084
Token dangerous
Token ID4923
Feature activation+4.05e-01

Pos logit contributions
emic+0.014
 unnecessary+0.013
 endings+0.013
osity+0.013
Em+0.013
Neg logit contributions
 one-0.009
 moral-0.007
 end-0.007
 closer-0.007
 lesson-0.006
Token.
Token ID13
Feature activation+4.39e-01

Pos logit contributions
 one+0.006
 end+0.004
 moral+0.004
 closer+0.004
 first+0.004
Neg logit contributions
emic-0.014
 unnecessary-0.014
osity-0.013
 endings-0.013
Em-0.013
Token
Token ID198
Feature activation+7.44e-01

Pos logit contributions
 one+0.006
 moral+0.004
 closer+0.004
 first+0.004
 end+0.004
Neg logit contributions
emic-0.016
 unnecessary-0.015
osity-0.015
 endings-0.015
 situations-0.014
Token
Token ID198
Feature activation-2.68e-02

Pos logit contributions
 one+0.014
 moral+0.011
 end+0.011
 lesson+0.010
 closer+0.010
Neg logit contributions
emic-0.021
Em-0.020
 endings-0.019
und-0.019
 unnecessary-0.019
Token was
Token ID373
Feature activation+3.04e-01

Pos logit contributions
emic+0.014
 endings+0.014
und+0.014
Em+0.013
osity+0.013
Neg logit contributions
 one-0.010
 moral-0.009
 closer-0.008
 lesson-0.008
 end-0.007
Token a
Token ID257
Feature activation-5.36e-01

Pos logit contributions
 one+0.004
 moral+0.004
 end+0.003
 closer+0.003
 lesson+0.003
Neg logit contributions
emic-0.010
 endings-0.009
 unnecessary-0.009
Em-0.009
osity-0.009
Token creative
Token ID7325
Feature activation-1.02e-01

Pos logit contributions
 endings+0.013
emic+0.013
 unnecessary+0.012
osity+0.012
 situations+0.012
Neg logit contributions
 one-0.014
 moral-0.012
 end-0.012
 closer-0.011
 first-0.011
Token girl
Token ID2576
Feature activation-2.37e-01

Pos logit contributions
emic+0.003
 unnecessary+0.003
 endings+0.003
osity+0.002
 situations+0.002
Neg logit contributions
 one-0.002
 moral-0.002
 end-0.002
 first-0.002
 lesson-0.002
Token named
Token ID3706
Feature activation-5.53e-01

Pos logit contributions
 unnecessary+0.007
emic+0.007
 endings+0.007
osity+0.007
 situations+0.007
Neg logit contributions
 one-0.005
 end-0.003
 moral-0.003
 first-0.003
 closer-0.003
Token St
Token ID520
Feature activation+2.86e+00

Pos logit contributions
 unnecessary+0.017
 endings+0.017
 situations+0.016
emic+0.016
osity+0.016
Neg logit contributions
 one-0.013
 first-0.010
 some-0.009
 closer-0.009
 end-0.008
Tokenitch
Token ID2007
Feature activation+1.92e+00

Pos logit contributions
 one+0.074
 moral+0.063
 closer+0.061
 end+0.060
 lesson+0.059
Neg logit contributions
emic-0.065
 endings-0.062
 unnecessary-0.060
osity-0.059
Em-0.057
Token.
Token ID13
Feature activation-6.03e-01

Pos logit contributions
 one+0.035
 moral+0.029
 closer+0.028
 lesson+0.027
 end+0.027
Neg logit contributions
emic-0.054
 endings-0.052
 unnecessary-0.050
osity-0.050
Em-0.050
Token She
Token ID1375
Feature activation-7.64e-01

Pos logit contributions
emic+0.013
 unnecessary+0.012
 endings+0.012
osity+0.012
Em+0.011
Neg logit contributions
 one-0.016
 moral-0.014
 end-0.013
 closer-0.013
 lesson-0.013
Token loved
Token ID6151
Feature activation-7.74e-01

Pos logit contributions
emic+0.023
 unnecessary+0.023
osity+0.022
 endings+0.022
 situations+0.021
Neg logit contributions
 one-0.015
 moral-0.012
 end-0.012
 lesson-0.011
 first-0.011
Token to
Token ID284
Feature activation+1.75e-01

Pos logit contributions
emic+0.029
 unnecessary+0.027
igor+0.027
osity+0.027
 endings+0.026
Neg logit contributions
 one-0.012
 some-0.008
 first-0.007
 moral-0.006
 lesson-0.006
Token was
Token ID373
Feature activation+5.06e-02

Pos logit contributions
emic+0.017
 endings+0.017
und+0.017
 unnecessary+0.016
Em+0.016
Neg logit contributions
 one-0.012
 moral-0.010
 closer-0.009
 lesson-0.009
 end-0.009
Token a
Token ID257
Feature activation-5.23e-01

Pos logit contributions
 one+0.001
 moral+0.001
 end+0.001
 lesson+0.001
 closer+0.001
Neg logit contributions
emic-0.002
 unnecessary-0.002
 endings-0.002
Em-0.001
 importantly-0.001
Token bunny
Token ID44915
Feature activation+7.36e-01

Pos logit contributions
 endings+0.012
emic+0.012
 unnecessary+0.012
osity+0.012
 situations+0.011
Neg logit contributions
 one-0.013
 moral-0.012
 end-0.011
 closer-0.011
 first-0.011
Token called
Token ID1444
Feature activation-6.24e-01

Pos logit contributions
 one+0.018
 moral+0.016
 lesson+0.015
 end+0.015
 closer+0.015
Neg logit contributions
emic-0.016
 endings-0.015
 unnecessary-0.015
osity-0.015
 importantly-0.015
Token B
Token ID347
Feature activation+1.07e+00

Pos logit contributions
emic+0.016
 endings+0.015
 unnecessary+0.015
osity+0.015
 importantly+0.014
Neg logit contributions
 one-0.015
 moral-0.011
 first-0.011
 end-0.011
 closer-0.011
Tokenoun
Token ID977
Feature activation+2.85e+00

Pos logit contributions
 one+0.029
 moral+0.025
 end+0.024
 closer+0.024
 lesson+0.023
Neg logit contributions
emic-0.023
 unnecessary-0.021
 endings-0.021
osity-0.021
Em-0.020
Tokency
Token ID948
Feature activation-1.27e-02

Pos logit contributions
 one+0.064
 moral+0.062
 lesson+0.058
 end+0.054
 closer+0.052
Neg logit contributions
emic-0.069
Em-0.061
osity-0.061
und-0.061
 hey-0.061
Token.
Token ID13
Feature activation-2.74e-01

Pos logit contributions
emic+0.000
 unnecessary+0.000
 endings+0.000
osity+0.000
 situations+0.000
Neg logit contributions
 one-0.000
 moral-0.000
 end-0.000
 first-0.000
 closer-0.000
Token B
Token ID347
Feature activation+5.78e-01

Pos logit contributions
emic+0.006
 unnecessary+0.005
 endings+0.005
osity+0.005
Em+0.005
Neg logit contributions
 one-0.008
 moral-0.007
 end-0.006
 closer-0.006
 first-0.006
Tokenoun
Token ID977
Feature activation+2.11e+00

Pos logit contributions
 one+0.014
 moral+0.012
 end+0.012
 closer+0.011
 lesson+0.011
Neg logit contributions
emic-0.014
 unnecessary-0.013
osity-0.013
 endings-0.013
Em-0.012
Tokency
Token ID948
Feature activation-4.71e-01

Pos logit contributions
 one+0.047
 moral+0.044
 lesson+0.040
 end+0.039
 closer+0.038
Neg logit contributions
emic-0.053
und-0.049
Em-0.047
 endings-0.047
osity-0.047
Token want
Token ID765
Feature activation+6.39e-01

Pos logit contributions
emic+0.016
osity+0.015
 unnecessary+0.015
 endings+0.015
 importantly+0.015
Neg logit contributions
 one-0.009
 end-0.007
 moral-0.006
 bite-0.006
 last-0.006
Token to
Token ID284
Feature activation-1.08e+00

Pos logit contributions
 one+0.013
 moral+0.010
 end+0.010
 closer+0.009
 lesson+0.009
Neg logit contributions
emic-0.019
 unnecessary-0.018
osity-0.018
 endings-0.018
Em-0.017
Token replace
Token ID6330
Feature activation+9.76e-01

Pos logit contributions
emic+0.038
osity+0.037
 unnecessary+0.035
 importantly+0.035
ibr+0.035
Neg logit contributions
 one-0.021
 end-0.017
 bite-0.015
 first-0.015
 last-0.014
Token each
Token ID1123
Feature activation+1.56e+00

Pos logit contributions
 one+0.016
 moral+0.014
 end+0.013
ossible+0.012
 lesson+0.012
Neg logit contributions
 unnecessary-0.028
emic-0.028
 endings-0.028
 situations-0.027
osity-0.026
Token other
Token ID584
Feature activation+9.45e-02

Pos logit contributions
 one+0.033
 moral+0.032
ort+0.031
ossible+0.030
 closer+0.030
Neg logit contributions
 unnecessary-0.032
emic-0.031
 situations-0.030
 endings-0.030
 pian-0.029
Token ever
Token ID1683
Feature activation+2.84e+00

Pos logit contributions
 one+0.002
 moral+0.002
 end+0.002
 closer+0.002
 lesson+0.002
Neg logit contributions
emic-0.003
 unnecessary-0.002
 endings-0.002
osity-0.002
Em-0.002
Token again
Token ID757
Feature activation+6.81e-01

Pos logit contributions
 one+0.051
 moral+0.041
 end+0.039
 closer+0.037
 lesson+0.036
Neg logit contributions
emic-0.086
 unnecessary-0.083
 endings-0.082
osity-0.081
Em-0.078
Token.
Token ID13
Feature activation+9.18e-01

Pos logit contributions
 one+0.010
 moral+0.007
 end+0.007
 first+0.006
 closer+0.006
Neg logit contributions
emic-0.023
osity-0.023
 unnecessary-0.022
 endings-0.022
Em-0.021
Token<|endoftext|>
Token ID50256
Feature activation+1.83e-01

Pos logit contributions
 one+0.019
 moral+0.016
 end+0.014
 lesson+0.014
ossible+0.014
Neg logit contributions
emic-0.026
 unnecessary-0.026
osity-0.025
 endings-0.025
 situations-0.024
TokenAnna
Token ID31160
Feature activation+6.30e-01

Pos logit contributions
 one+0.003
 moral+0.002
 end+0.002
 closer+0.002
 lesson+0.002
Neg logit contributions
emic-0.006
 unnecessary-0.005
 endings-0.005
osity-0.005
Em-0.005
Token and
Token ID290
Feature activation-4.98e-01

Pos logit contributions
 one+0.013
 moral+0.011
 closer+0.011
 end+0.010
 lesson+0.010
Neg logit contributions
emic-0.016
 endings-0.016
Em-0.015
und-0.015
osity-0.015
Token two
Token ID734
Feature activation+8.42e-01

Pos logit contributions
emic+0.000
 unnecessary+0.000
 endings+0.000
osity+0.000
 situations+0.000
Neg logit contributions
 one-0.000
 moral-0.000
 end-0.000
 closer-0.000
 first-0.000
Token friends
Token ID2460
Feature activation-5.84e-01

Pos logit contributions
 one+0.020
 moral+0.017
 end+0.016
 closer+0.016
 lesson+0.016
Neg logit contributions
emic-0.021
 unnecessary-0.020
 endings-0.019
osity-0.019
 importantly-0.018
Token,
Token ID11
Feature activation-1.00e+00

Pos logit contributions
emic+0.015
 endings+0.014
 unnecessary+0.014
osity+0.014
Em+0.014
Neg logit contributions
 one-0.012
 moral-0.011
 lesson-0.010
 end-0.010
 closer-0.010
Token Lisa
Token ID15378
Feature activation+5.10e-01

Pos logit contributions
emic+0.024
 unnecessary+0.022
 endings+0.022
osity+0.022
Em+0.022
Neg logit contributions
 one-0.023
 moral-0.020
 end-0.020
 lesson-0.019
 closer-0.018
Token and
Token ID290
Feature activation-1.23e+00

Pos logit contributions
 one+0.009
 moral+0.008
 closer+0.008
 lesson+0.008
 end+0.007
Neg logit contributions
emic-0.013
Em-0.013
 endings-0.012
 importantly-0.012
und-0.012
Token St
Token ID520
Feature activation+2.83e+00

Pos logit contributions
emic+0.032
 unnecessary+0.031
 endings+0.031
osity+0.030
 situations+0.029
Neg logit contributions
 one-0.029
 moral-0.022
 end-0.022
 first-0.022
 closer-0.021
Tokenu
Token ID84
Feature activation+1.40e+00

Pos logit contributions
 one+0.074
 moral+0.063
 closer+0.061
 end+0.060
 lesson+0.059
Neg logit contributions
emic-0.065
 endings-0.061
 unnecessary-0.060
osity-0.058
Em-0.055
Token.
Token ID13
Feature activation-4.54e-01

Pos logit contributions
 one+0.025
 moral+0.023
 closer+0.021
 end+0.021
 lesson+0.020
Neg logit contributions
emic-0.040
 endings-0.037
osity-0.036
 unnecessary-0.035
Em-0.035
Token Lisa
Token ID15378
Feature activation+1.80e-01

Pos logit contributions
emic+0.011
 unnecessary+0.011
 endings+0.011
osity+0.011
Em+0.010
Neg logit contributions
 one-0.011
 moral-0.009
 end-0.009
 closer-0.008
 lesson-0.008
Token loved
Token ID6151
Feature activation-2.20e-01

Pos logit contributions
 one+0.004
 closer+0.004
 moral+0.004
 end+0.003
 lesson+0.003
Neg logit contributions
emic-0.004
Em-0.004
 endings-0.004
 pian-0.004
und-0.004
Token to
Token ID284
Feature activation+4.19e-01

Pos logit contributions
emic+0.007
osity+0.007
 unnecessary+0.007
 endings+0.007
igor+0.007
Neg logit contributions
 one-0.004
 some-0.003
 first-0.003
 moral-0.003
 lesson-0.002
Token Bad
Token ID7772
Feature activation+1.72e+00

Pos logit contributions
emic+0.012
 unnecessary+0.011
 endings+0.011
osity+0.011
 importantly+0.010
Neg logit contributions
 one-0.014
 moral-0.011
 end-0.011
 closer-0.011
ossible-0.011
Token dog
Token ID3290
Feature activation+1.58e-01

Pos logit contributions
 one+0.037
 moral+0.037
 closer+0.034
 particular+0.033
ossible+0.033
Neg logit contributions
 endings-0.040
 unnecessary-0.039
 situations-0.035
 hey-0.035
 ow-0.035
Token!
Token ID0
Feature activation-5.33e-01

Pos logit contributions
 one+0.003
 moral+0.003
 end+0.003
 lesson+0.003
 closer+0.003
Neg logit contributions
emic-0.004
 unnecessary-0.004
 endings-0.004
osity-0.004
und-0.004
Token Leave
Token ID17446
Feature activation+8.84e-01

Pos logit contributions
emic+0.012
 unnecessary+0.012
 endings+0.011
osity+0.011
 canvas+0.011
Neg logit contributions
 one-0.014
 moral-0.012
 end-0.011
ossible-0.011
 closer-0.011
Token the
Token ID262
Feature activation-1.80e-01

Pos logit contributions
 one+0.018
 moral+0.016
 end+0.014
 closer+0.014
 lesson+0.014
Neg logit contributions
emic-0.023
 unnecessary-0.023
 endings-0.023
osity-0.021
Em-0.021
Token girls
Token ID4813
Feature activation-3.44e+00

Pos logit contributions
emic+0.005
 unnecessary+0.004
 endings+0.004
 pian+0.004
osity+0.004
Neg logit contributions
 one-0.004
 moral-0.003
 closer-0.003
ossible-0.003
 end-0.003
Token alone
Token ID3436
Feature activation-7.56e-01

Pos logit contributions
emic+0.086
 endings+0.083
 unnecessary+0.083
osity+0.082
Em+0.077
Neg logit contributions
 one-0.080
 moral-0.066
 end-0.065
 closer-0.062
 first-0.061
Token!"
Token ID2474
Feature activation-4.09e-02

Pos logit contributions
emic+0.022
 unnecessary+0.021
osity+0.021
 endings+0.021
Em+0.020
Neg logit contributions
 one-0.015
 moral-0.012
 end-0.011
 closer-0.011
 first-0.011
Token
Token ID198
Feature activation-1.95e-01

Pos logit contributions
emic+0.001
 unnecessary+0.001
 endings+0.001
osity+0.001
Em+0.001
Neg logit contributions
 one-0.001
 moral-0.001
 end-0.001
 closer-0.001
 lesson-0.001
Token
Token ID198
Feature activation-4.86e-01

Pos logit contributions
emic+0.007
 unnecessary+0.007
 endings+0.007
osity+0.007
 situations+0.006
Neg logit contributions
 one-0.003
 moral-0.002
 end-0.002
 closer-0.002
 lesson-0.002
TokenThe
Token ID464
Feature activation+4.01e-01

Pos logit contributions
emic+0.014
osity+0.014
 endings+0.014
 unnecessary+0.013
 situations+0.012
Neg logit contributions
 one-0.010
 moral-0.008
 end-0.008
 closer-0.007
 first-0.007
Token didn
Token ID1422
Feature activation-7.23e-01

Pos logit contributions
emic+0.000
Em+0.000
 endings+0.000
osity+0.000
und+0.000
Neg logit contributions
 one-0.000
 moral-0.000
 closer-0.000
 end-0.000
 lesson-0.000
Token't
Token ID470
Feature activation-1.91e-01

Pos logit contributions
emic+0.017
Em+0.017
 endings+0.016
osity+0.016
 unnecessary+0.016
Neg logit contributions
 one-0.015
 closer-0.012
ossible-0.012
 moral-0.012
 end-0.012
Token know
Token ID760
Feature activation-6.89e-01

Pos logit contributions
emic+0.005
 unnecessary+0.005
 endings+0.005
osity+0.005
Em+0.005
Neg logit contributions
 one-0.004
 moral-0.003
 end-0.003
 closer-0.003
 lesson-0.003
Token what
Token ID644
Feature activation-1.65e-01

Pos logit contributions
emic+0.023
 endings+0.021
osity+0.021
 unnecessary+0.021
igor+0.020
Neg logit contributions
 one-0.014
 end-0.010
 some-0.009
 closer-0.009
 first-0.009
Token "
Token ID366
Feature activation-2.34e-01

Pos logit contributions
emic+0.005
 unnecessary+0.005
 endings+0.005
osity+0.005
 situations+0.004
Neg logit contributions
 one-0.003
 moral-0.002
 end-0.002
 lesson-0.002
 first-0.002
Tokensm
Token ID5796
Feature activation-3.41e+00

Pos logit contributions
emic+0.005
 endings+0.005
 unnecessary+0.005
osity+0.005
Em+0.005
Neg logit contributions
 one-0.006
 moral-0.005
 end-0.005
 closer-0.005
 lesson-0.005
Tokenoother
Token ID31724
Feature activation+5.24e-01

Pos logit contributions
emic+0.078
 unnecessary+0.076
 endings+0.075
 importantly+0.069
igor+0.068
Neg logit contributions
 one-0.087
 moral-0.073
 end-0.071
 last-0.067
 first-0.067
Token"
Token ID1
Feature activation+3.80e-01

Pos logit contributions
 one+0.008
 moral+0.006
 end+0.006
 closer+0.006
 lesson+0.006
Neg logit contributions
emic-0.017
 endings-0.016
 unnecessary-0.016
osity-0.016
Em-0.015
Token meant
Token ID4001
Feature activation+4.41e-01

Pos logit contributions
 one+0.007
 moral+0.005
 end+0.005
 closer+0.005
 lesson+0.005
Neg logit contributions
emic-0.012
 unnecessary-0.011
 endings-0.011
osity-0.011
 importantly-0.011
Token,
Token ID11
Feature activation-2.64e-01

Pos logit contributions
 one+0.005
 first+0.003
 end+0.002
 moral+0.002
 lesson+0.002
Neg logit contributions
emic-0.018
 unnecessary-0.017
 endings-0.017
osity-0.016
und-0.016
Token but
Token ID475
Feature activation+4.72e-01

Pos logit contributions
emic+0.008
 unnecessary+0.008
 endings+0.008
osity+0.008
 situations+0.007
Neg logit contributions
 one-0.006
 moral-0.004
 first-0.004
 some-0.004
 end-0.004
Token soft
Token ID2705
Feature activation-3.31e-01

Pos logit contributions
 one+0.001
 moral+0.001
 end+0.000
 closer+0.000
 lesson+0.000
Neg logit contributions
emic-0.001
 endings-0.001
osity-0.001
 unnecessary-0.001
und-0.001
Token copper
Token ID15317
Feature activation+1.13e+00

Pos logit contributions
emic+0.009
 endings+0.009
 unnecessary+0.009
osity+0.008
Em+0.008
Neg logit contributions
 one-0.007
 moral-0.006
 end-0.005
 closer-0.005
 lesson-0.005
Token coin
Token ID10752
Feature activation+1.07e+00

Pos logit contributions
 one+0.020
 moral+0.016
 end+0.015
 closer+0.015
 lesson+0.014
Neg logit contributions
emic-0.034
 endings-0.033
 unnecessary-0.033
osity-0.032
Em-0.031
Token.
Token ID13
Feature activation+5.08e-02

Pos logit contributions
 one+0.022
 moral+0.021
 lesson+0.019
ossible+0.019
 end+0.018
Neg logit contributions
emic-0.027
 endings-0.025
osity-0.025
 unnecessary-0.025
 importantly-0.024
Token E
Token ID412
Feature activation+1.12e+00

Pos logit contributions
 one+0.001
 moral+0.001
 some+0.001
 end+0.001
 first+0.001
Neg logit contributions
emic-0.002
 unnecessary-0.002
 endings-0.002
osity-0.001
 situations-0.001
Tokensm
Token ID5796
Feature activation-3.40e+00

Pos logit contributions
 one+0.028
 moral+0.024
 end+0.023
 closer+0.022
 lesson+0.022
Neg logit contributions
emic-0.026
 endings-0.025
 unnecessary-0.024
osity-0.024
Em-0.024
Tokenie
Token ID494
Feature activation+5.65e-02

Pos logit contributions
emic+0.093
 endings+0.089
 unnecessary+0.088
osity+0.083
 hey+0.081
Neg logit contributions
 one-0.071
 end-0.059
 moral-0.059
 closer-0.054
 lesson-0.053
Token wanted
Token ID2227
Feature activation+8.14e-01

Pos logit contributions
 one+0.001
 end+0.001
 moral+0.001
 lesson+0.001
 closer+0.001
Neg logit contributions
emic-0.001
 endings-0.001
Em-0.001
und-0.001
osity-0.001
Token to
Token ID284
Feature activation-3.55e-01

Pos logit contributions
 one+0.018
 moral+0.017
 end+0.016
ossible+0.016
 lesson+0.015
Neg logit contributions
 unnecessary-0.019
 endings-0.018
emic-0.018
osity-0.017
 importantly-0.016
Token pick
Token ID2298
Feature activation+3.51e-01

Pos logit contributions
emic+0.008
 endings+0.008
 unnecessary+0.008
osity+0.008
 situations+0.007
Neg logit contributions
 one-0.008
 moral-0.007
 end-0.006
 closer-0.006
 lesson-0.006
Token up
Token ID510
Feature activation+1.06e+00

Pos logit contributions
 moral+0.007
ossible+0.007
oral+0.007
 particular+0.006
 End+0.006
Neg logit contributions
 unnecessary-0.009
 endings-0.008
 uh-0.007
und-0.007
 cob-0.007
Token ow
Token ID12334
Feature activation+4.85e-01

Pos logit contributions
emic+0.008
osity+0.007
 endings+0.007
 unnecessary+0.007
Em+0.006
Neg logit contributions
 one-0.028
 end-0.025
 moral-0.024
 closer-0.024
 bite-0.023
Token,
Token ID11
Feature activation-8.39e-01

Pos logit contributions
 one+0.007
 moral+0.005
 end+0.005
 closer+0.005
 lesson+0.005
Neg logit contributions
emic-0.016
 unnecessary-0.016
 endings-0.016
osity-0.015
Em-0.015
Token ow
Token ID12334
Feature activation+4.09e-03

Pos logit contributions
emic+0.009
osity+0.008
 endings+0.008
 unnecessary+0.008
Em+0.007
Neg logit contributions
 one-0.033
 end-0.030
 moral-0.029
 closer-0.029
 first-0.028
Token!
Token ID0
Feature activation-9.52e-01

Pos logit contributions
 one+0.000
 moral+0.000
 end+0.000
 closer+0.000
 first+0.000
Neg logit contributions
emic-0.000
 endings-0.000
 unnecessary-0.000
osity-0.000
Em-0.000
Token The
Token ID383
Feature activation+8.80e-02

Pos logit contributions
 unnecessary+0.029
emic+0.029
 endings+0.028
Em+0.027
 achieving+0.027
Neg logit contributions
 one-0.019
 some-0.014
 moral-0.013
ossible-0.013
 end-0.013
Token rain
Token ID6290
Feature activation-3.36e+00

Pos logit contributions
 one+0.002
 moral+0.002
 closer+0.002
ossible+0.002
 end+0.002
Neg logit contributions
emic-0.002
 endings-0.002
 unnecessary-0.002
osity-0.002
 situations-0.002
Token is
Token ID318
Feature activation-7.28e-01

Pos logit contributions
emic+0.101
 endings+0.098
 unnecessary+0.097
osity+0.096
Em+0.094
Neg logit contributions
 one-0.062
 moral-0.048
 end-0.048
 closer-0.045
 first-0.044
Token bad
Token ID2089
Feature activation+1.07e-01

Pos logit contributions
emic+0.021
 endings+0.021
 unnecessary+0.020
osity+0.020
Em+0.019
Neg logit contributions
 one-0.014
 moral-0.011
 end-0.011
 closer-0.010
 first-0.010
Token.
Token ID13
Feature activation-3.88e-01

Pos logit contributions
 one+0.002
 end+0.001
 moral+0.001
 closer+0.001
 first+0.001
Neg logit contributions
emic-0.003
osity-0.003
 unnecessary-0.003
 endings-0.003
Em-0.003
Token It
Token ID632
Feature activation-7.42e-01

Pos logit contributions
emic+0.011
 unnecessary+0.011
 endings+0.011
Em+0.011
 achieving+0.010
Neg logit contributions
 one-0.008
 some-0.006
 moral-0.006
ossible-0.006
 end-0.005
Token is
Token ID318
Feature activation-6.18e-01

Pos logit contributions
emic+0.021
 endings+0.021
osity+0.020
 unnecessary+0.020
Em+0.020
Neg logit contributions
 one-0.015
 moral-0.012
 end-0.012
 closer-0.012
 first-0.011
Tokener
Token ID263
Feature activation-9.18e-01

Pos logit contributions
 one+0.038
 moral+0.032
 closer+0.031
 end+0.031
 lesson+0.030
Neg logit contributions
emic-0.029
 endings-0.028
 unnecessary-0.028
osity-0.026
 hey-0.025
Tokenful
Token ID913
Feature activation+9.17e-01

Pos logit contributions
emic+0.025
 endings+0.024
osity+0.024
 unnecessary+0.024
Em+0.023
Neg logit contributions
 one-0.018
 moral-0.015
 end-0.014
 closer-0.014
 lesson-0.014
Token"
Token ID1
Feature activation+5.74e-01

Pos logit contributions
 one+0.014
 moral+0.011
 closer+0.010
 end+0.010
 first+0.009
Neg logit contributions
emic-0.029
 endings-0.028
osity-0.027
 unnecessary-0.027
Em-0.026
Token and
Token ID290
Feature activation+1.32e-01

Pos logit contributions
 one+0.010
 moral+0.008
 end+0.007
 closer+0.007
 lesson+0.007
Neg logit contributions
emic-0.018
 endings-0.017
 unnecessary-0.017
osity-0.017
Em-0.016
Token "
Token ID366
Feature activation-1.25e-02

Pos logit contributions
 one+0.003
 moral+0.002
 first+0.002
 end+0.002
 closer+0.002
Neg logit contributions
emic-0.004
 unnecessary-0.004
 endings-0.004
Em-0.004
osity-0.004
Tokensm
Token ID5796
Feature activation-3.32e+00

Pos logit contributions
emic+0.000
 endings+0.000
 unnecessary+0.000
osity+0.000
Em+0.000
Neg logit contributions
 one-0.000
 moral-0.000
 end-0.000
 lesson-0.000
 closer-0.000
Tokeniley
Token ID9618
Feature activation+4.56e-01

Pos logit contributions
emic+0.086
 unnecessary+0.079
 endings+0.078
 importantly+0.076
igor+0.075
Neg logit contributions
 one-0.077
 end-0.067
 moral-0.063
ossible-0.059
 closer-0.059
Token".
Token ID1911
Feature activation-2.71e-01

Pos logit contributions
 one+0.008
 moral+0.006
 end+0.006
 closer+0.006
 lesson+0.005
Neg logit contributions
emic-0.014
 endings-0.013
 unnecessary-0.013
osity-0.013
Em-0.013
Token But
Token ID887
Feature activation+7.73e-01

Pos logit contributions
emic+0.008
 unnecessary+0.008
 endings+0.007
osity+0.007
Em+0.007
Neg logit contributions
 one-0.005
 moral-0.004
 end-0.004
 closer-0.004
 lesson-0.004
Token it
Token ID340
Feature activation+3.26e-01

Pos logit contributions
 one+0.012
 moral+0.010
 end+0.009
 lesson+0.009
ossible+0.009
Neg logit contributions
emic-0.023
 endings-0.023
 unnecessary-0.023
osity-0.022
 importantly-0.021
Token was
Token ID373
Feature activation-6.20e-01

Pos logit contributions
 one+0.006
 moral+0.005
 end+0.005
 closer+0.005
ossible+0.005
Neg logit contributions
emic-0.009
 endings-0.009
 unnecessary-0.009
osity-0.008
Em-0.008
Token grandma
Token ID49890
Feature activation+3.07e-01

Pos logit contributions
 one+0.017
 moral+0.014
 end+0.013
 first+0.012
 lesson+0.012
Neg logit contributions
emic-0.027
 unnecessary-0.026
 endings-0.026
osity-0.026
Em-0.025
Token's
Token ID338
Feature activation+2.24e-01

Pos logit contributions
 one+0.006
 moral+0.004
 end+0.004
 closer+0.004
 first+0.004
Neg logit contributions
emic-0.010
 unnecessary-0.009
 endings-0.009
osity-0.009
 importantly-0.009
Token house
Token ID2156
Feature activation-7.43e-01

Pos logit contributions
 one+0.005
 moral+0.004
 end+0.004
 lesson+0.004
 closer+0.004
Neg logit contributions
emic-0.006
 unnecessary-0.005
 endings-0.005
osity-0.005
Em-0.005
Token for
Token ID329
Feature activation+5.22e-01

Pos logit contributions
emic+0.023
 endings+0.022
 unnecessary+0.022
osity+0.022
Em+0.021
Neg logit contributions
 one-0.012
 moral-0.010
 end-0.009
 closer-0.008
 lesson-0.008
Token safe
Token ID3338
Feature activation+7.40e-02

Pos logit contributions
 one+0.008
 moral+0.006
 end+0.005
 first+0.005
 closer+0.005
Neg logit contributions
emic-0.018
osity-0.017
 endings-0.017
 unnecessary-0.017
Em-0.016
Tokenkeeping
Token ID19934
Feature activation-3.27e+00

Pos logit contributions
 one+0.002
 first+0.001
 end+0.001
 some+0.001
 moral+0.001
Neg logit contributions
emic-0.002
osity-0.002
 unnecessary-0.002
 endings-0.002
Em-0.002
Token.
Token ID13
Feature activation-2.05e-02

Pos logit contributions
emic+0.102
 endings+0.098
 unnecessary+0.098
osity+0.096
Em+0.093
Neg logit contributions
 one-0.054
 moral-0.043
 end-0.040
 closer-0.038
 lesson-0.037
Token
Token ID220
Feature activation-9.65e-01

Pos logit contributions
emic+0.001
 unnecessary+0.001
 endings+0.001
osity+0.001
 pian+0.001
Neg logit contributions
 one-0.000
 moral-0.000
 end-0.000
 closer-0.000
ossible-0.000
Token
Token ID198
Feature activation+4.65e-01

Pos logit contributions
emic+0.032
 unnecessary+0.031
 endings+0.031
osity+0.030
Em+0.029
Neg logit contributions
 one-0.014
 moral-0.011
 end-0.010
 closer-0.009
 lesson-0.009
Token
Token ID198
Feature activation-2.69e-01

Pos logit contributions
 one+0.008
 moral+0.006
 end+0.006
 closer+0.006
 lesson+0.006
Neg logit contributions
emic-0.014
Em-0.013
 unnecessary-0.013
 endings-0.013
osity-0.013
TokenWhen
Token ID2215
Feature activation-5.68e-01

Pos logit contributions
emic+0.006
 endings+0.006
 unnecessary+0.006
osity+0.006
Em+0.006
Neg logit contributions
 one-0.007
 moral-0.006
 end-0.006
 closer-0.005
 lesson-0.005
Token.
Token ID13
Feature activation-2.80e-01

Pos logit contributions
 moral+0.001
 one+0.000
 end+0.000
oral+0.000
 End+0.000
Neg logit contributions
emic-0.001
 endings-0.001
 unnecessary-0.001
und-0.001
Em-0.001
Token One
Token ID1881
Feature activation-4.08e-03

Pos logit contributions
emic+0.009
 endings+0.009
 unnecessary+0.009
osity+0.009
Em+0.008
Neg logit contributions
 one-0.004
 moral-0.003
 end-0.003
 closer-0.003
 lesson-0.003
Token day
Token ID1110
Feature activation-3.64e-01

Pos logit contributions
emic+0.000
 cob+0.000
Em+0.000
 ow+0.000
 pian+0.000
Neg logit contributions
 moral-0.000
 one-0.000
ort-0.000
 closer-0.000
 seller-0.000
Token,
Token ID11
Feature activation-7.40e-02

Pos logit contributions
emic+0.012
 unnecessary+0.012
 endings+0.012
osity+0.011
Em+0.011
Neg logit contributions
 one-0.005
 moral-0.004
 end-0.004
 closer-0.004
 lesson-0.004
Token E
Token ID412
Feature activation+1.48e+00

Pos logit contributions
emic+0.002
 unnecessary+0.002
 endings+0.002
osity+0.002
Em+0.002
Neg logit contributions
 one-0.001
 moral-0.001
 end-0.001
 lesson-0.001
 closer-0.001
Tokensm
Token ID5796
Feature activation-3.23e+00

Pos logit contributions
 one+0.037
 moral+0.034
 closer+0.032
 end+0.032
 lesson+0.030
Neg logit contributions
emic-0.034
 endings-0.031
Em-0.031
osity-0.030
und-0.029
Tokenie
Token ID494
Feature activation-2.17e-01

Pos logit contributions
emic+0.101
 unnecessary+0.098
 endings+0.098
osity+0.093
 importantly+0.091
Neg logit contributions
 one-0.055
 moral-0.043
 end-0.043
 closer-0.037
 lesson-0.037
Token decided
Token ID3066
Feature activation-2.56e-01

Pos logit contributions
emic+0.007
 unnecessary+0.006
 endings+0.006
osity+0.006
Em+0.006
Neg logit contributions
 one-0.004
 moral-0.003
 end-0.003
 closer-0.003
 lesson-0.003
Token to
Token ID284
Feature activation-9.50e-01

Pos logit contributions
emic+0.009
 unnecessary+0.008
osity+0.008
 endings+0.008
 importantly+0.008
Neg logit contributions
 one-0.005
 closer-0.003
 moral-0.003
 first-0.003
 some-0.003
Token travel
Token ID3067
Feature activation+1.42e-01

Pos logit contributions
emic+0.027
 unnecessary+0.026
 endings+0.026
osity+0.026
Em+0.025
Neg logit contributions
 one-0.019
 moral-0.015
 end-0.015
 closer-0.014
 lesson-0.014
Token to
Token ID284
Feature activation+5.70e-01

Pos logit contributions
 moral+0.004
 End+0.003
oral+0.003
ossible+0.003
 end+0.003
Neg logit contributions
 unnecessary-0.002
 endings-0.002
emic-0.002
Em-0.002
 achieving-0.002
TokenThe
Token ID464
Feature activation+1.21e-01

Pos logit contributions
emic+0.016
 unnecessary+0.015
Em+0.015
 endings+0.015
osity+0.014
Neg logit contributions
 one-0.028
 first-0.023
 closer-0.023
 moral-0.023
 end-0.023
Token naughty
Token ID45128
Feature activation+8.64e-01

Pos logit contributions
 one+0.002
boat+0.001
ort+0.001
ossible+0.001
 closer+0.001
Neg logit contributions
 ow-0.004
 unnecessary-0.004
 endings-0.004
emic-0.004
 cob-0.003
Token cat
Token ID3797
Feature activation-2.27e-01

Pos logit contributions
 one+0.023
ossible+0.022
 closer+0.021
 moral+0.020
boat+0.019
Neg logit contributions
 ow-0.016
 endings-0.016
 unnecessary-0.014
 cob-0.014
emic-0.014
Token had
Token ID550
Feature activation+6.20e-02

Pos logit contributions
emic+0.007
 unnecessary+0.006
 endings+0.006
osity+0.006
Em+0.006
Neg logit contributions
 one-0.004
 moral-0.003
 end-0.003
 closer-0.003
 lesson-0.003
Token out
Token ID503
Feature activation-7.49e-01

Pos logit contributions
 one+0.001
 moral+0.001
 end+0.001
 closer+0.001
 first+0.001
Neg logit contributions
emic-0.002
 unnecessary-0.002
 endings-0.002
osity-0.001
Em-0.001
Tokensm
Token ID5796
Feature activation-3.23e+00

Pos logit contributions
emic+0.023
 endings+0.021
 unnecessary+0.021
osity+0.021
Em+0.020
Neg logit contributions
 one-0.015
 moral-0.011
 end-0.011
 closer-0.010
 lesson-0.010
Tokenarted
Token ID19112
Feature activation-1.62e-01

Pos logit contributions
emic+0.066
 endings+0.061
 unnecessary+0.058
osity+0.056
igor+0.055
Neg logit contributions
 one-0.090
 moral-0.080
 end-0.079
 closer-0.076
 last-0.075
Token the
Token ID262
Feature activation+4.18e-01

Pos logit contributions
emic+0.005
 endings+0.005
 unnecessary+0.005
osity+0.005
Em+0.005
Neg logit contributions
 one-0.003
 moral-0.002
 end-0.002
 closer-0.002
 lesson-0.002
Token bees
Token ID17002
Feature activation-6.87e-01

Pos logit contributions
 one+0.010
 moral+0.009
 end+0.008
 first+0.008
 closer+0.008
Neg logit contributions
emic-0.011
osity-0.010
 endings-0.010
 unnecessary-0.010
Em-0.010
Token and
Token ID290
Feature activation+3.70e-02

Pos logit contributions
emic+0.022
 unnecessary+0.022
 endings+0.022
osity+0.021
Em+0.021
Neg logit contributions
 one-0.011
 moral-0.008
 end-0.008
 closer-0.007
 first-0.007
Token learned
Token ID4499
Feature activation-8.99e-01

Pos logit contributions
 one+0.001
 moral+0.001
 end+0.001
 closer+0.001
 first+0.001
Neg logit contributions
emic-0.001
 unnecessary-0.001
 endings-0.001
osity-0.001
Em-0.001
Token reliable
Token ID9314
Feature activation-9.85e-02

Pos logit contributions
 one+0.008
 end+0.006
 moral+0.006
ossible+0.006
dog+0.006
Neg logit contributions
emic-0.009
 unnecessary-0.009
 importantly-0.009
osity-0.008
 endings-0.008
Token that
Token ID326
Feature activation+1.40e-01

Pos logit contributions
emic+0.003
 unnecessary+0.003
 endings+0.003
osity+0.003
Em+0.003
Neg logit contributions
 one-0.002
 moral-0.001
 end-0.001
 closer-0.001
 first-0.001
Token it
Token ID340
Feature activation+2.01e-01

Pos logit contributions
 one+0.002
 moral+0.002
ossible+0.002
 end+0.001
 closer+0.001
Neg logit contributions
emic-0.004
 endings-0.004
 unnecessary-0.004
osity-0.004
 importantly-0.004
Token had
Token ID550
Feature activation+2.62e-01

Pos logit contributions
 one+0.005
 moral+0.004
 end+0.004
 closer+0.003
 lesson+0.003
Neg logit contributions
emic-0.005
 unnecessary-0.005
 endings-0.005
osity-0.005
Em-0.005
Token out
Token ID503
Feature activation-4.04e-01

Pos logit contributions
 one+0.006
 moral+0.005
 end+0.005
 closer+0.004
 lesson+0.004
Neg logit contributions
emic-0.007
 unnecessary-0.006
 endings-0.006
osity-0.006
Em-0.006
Tokensm
Token ID5796
Feature activation-3.20e+00

Pos logit contributions
emic+0.012
 endings+0.011
 unnecessary+0.011
Em+0.011
osity+0.011
Neg logit contributions
 one-0.008
 moral-0.006
 end-0.006
 closer-0.006
 lesson-0.006
Tokenarted
Token ID19112
Feature activation-1.40e-01

Pos logit contributions
emic+0.065
 endings+0.061
 unnecessary+0.059
igor+0.055
 pian+0.055
Neg logit contributions
 one-0.089
 moral-0.078
 end-0.077
 closer-0.075
 last-0.075
Token the
Token ID262
Feature activation+2.81e-01

Pos logit contributions
emic+0.005
 endings+0.004
 unnecessary+0.004
osity+0.004
Em+0.004
Neg logit contributions
 one-0.002
 moral-0.002
 end-0.002
 closer-0.001
 lesson-0.001
Token big
Token ID1263
Feature activation-3.59e-01

Pos logit contributions
 one+0.006
 moral+0.005
 end+0.004
 closer+0.004
 lesson+0.004
Neg logit contributions
emic-0.008
 unnecessary-0.008
 endings-0.008
osity-0.007
Em-0.007
Token scary
Token ID14343
Feature activation+2.21e-01

Pos logit contributions
 unnecessary+0.009
 endings+0.009
emic+0.009
 situations+0.009
osity+0.008
Neg logit contributions
 one-0.007
 moral-0.006
ossible-0.006
 lesson-0.006
 end-0.006
Token shark
Token ID21027
Feature activation+6.91e-01

Pos logit contributions
 one+0.004
 moral+0.004
 end+0.003
 closer+0.003
 lesson+0.003
Neg logit contributions
emic-0.006
 unnecessary-0.006
 endings-0.006
osity-0.006
Em-0.006
Token and
Token ID290
Feature activation-1.36e+00

Pos logit contributions
 one+0.000
 moral+0.000
 end+0.000
 closer+0.000
 lesson+0.000
Neg logit contributions
emic-0.000
 endings-0.000
 unnecessary-0.000
osity-0.000
Em-0.000
Token his
Token ID465
Feature activation-2.98e-02

Pos logit contributions
emic+0.049
 unnecessary+0.047
 endings+0.047
osity+0.046
Em+0.045
Neg logit contributions
 one-0.017
 moral-0.012
 end-0.011
 closer-0.010
 lesson-0.010
Token friend
Token ID1545
Feature activation-1.16e+00

Pos logit contributions
 endings+0.001
 unnecessary+0.001
emic+0.001
osity+0.001
und+0.001
Neg logit contributions
 one-0.001
 moral-0.000
 end-0.000
 first-0.000
 lesson-0.000
Token,
Token ID11
Feature activation-2.83e+00

Pos logit contributions
emic+0.033
 endings+0.032
 unnecessary+0.032
osity+0.031
Em+0.030
Neg logit contributions
 one-0.021
 moral-0.018
 end-0.016
 closer-0.016
 lesson-0.016
Token Sue
Token ID26089
Feature activation-8.42e-01

Pos logit contributions
emic+0.082
 unnecessary+0.080
 endings+0.079
osity+0.078
 situations+0.074
Neg logit contributions
 one-0.056
 moral-0.044
 end-0.042
 closer-0.041
 first-0.040
Token,
Token ID11
Feature activation-3.18e+00

Pos logit contributions
emic+0.024
 endings+0.024
 unnecessary+0.023
osity+0.022
 hey+0.022
Neg logit contributions
 one-0.014
 moral-0.013
 lesson-0.012
 closer-0.011
 first-0.011
Token went
Token ID1816
Feature activation-4.44e-01

Pos logit contributions
emic+0.092
 unnecessary+0.090
 endings+0.089
osity+0.088
Em+0.083
Neg logit contributions
 one-0.062
 moral-0.049
 end-0.047
 closer-0.045
 lesson-0.044
Token to
Token ID284
Feature activation+5.32e-01

Pos logit contributions
emic+0.012
 unnecessary+0.011
 endings+0.011
osity+0.011
 pian+0.010
Neg logit contributions
 one-0.009
 moral-0.008
 end-0.008
 lesson-0.007
 closer-0.007
Token the
Token ID262
Feature activation+8.68e-01

Pos logit contributions
 one+0.009
 moral+0.007
 end+0.006
 closer+0.006
 lesson+0.006
Neg logit contributions
emic-0.017
 unnecessary-0.016
 endings-0.016
osity-0.016
Em-0.015
Token park
Token ID3952
Feature activation+3.15e-01

Pos logit contributions
 one+0.018
 moral+0.014
 end+0.014
 first+0.013
 closer+0.012
Neg logit contributions
emic-0.026
 unnecessary-0.026
osity-0.025
 endings-0.025
 importantly-0.024
Token.
Token ID13
Feature activation-2.84e-03

Pos logit contributions
 moral+0.008
 End+0.007
 end+0.007
oral+0.007
 lesson+0.007
Neg logit contributions
emic-0.006
and-0.005
 endings-0.005
und-0.005
 pian-0.005
Token happy
Token ID3772
Feature activation-4.49e-01

Pos logit contributions
 one+0.004
 moral+0.003
 end+0.003
 lesson+0.003
ossible+0.003
Neg logit contributions
emic-0.005
 unnecessary-0.005
 endings-0.005
osity-0.005
 importantly-0.005
Token that
Token ID326
Feature activation+1.86e-01

Pos logit contributions
emic+0.016
 unnecessary+0.015
osity+0.015
 endings+0.015
Em+0.014
Neg logit contributions
 one-0.007
 moral-0.004
 end-0.004
 closer-0.004
 first-0.004
Token they
Token ID484
Feature activation-1.72e-01

Pos logit contributions
 one+0.004
 moral+0.003
ossible+0.003
 end+0.003
 closer+0.003
Neg logit contributions
 endings-0.005
emic-0.005
 unnecessary-0.005
 importantly-0.004
Em-0.004
Token had
Token ID550
Feature activation-1.32e-01

Pos logit contributions
emic+0.005
 endings+0.004
 unnecessary+0.004
osity+0.004
 situations+0.004
Neg logit contributions
 one-0.004
 moral-0.003
 end-0.003
 closer-0.003
 first-0.003
Token out
Token ID503
Feature activation-5.45e-01

Pos logit contributions
emic+0.004
 unnecessary+0.004
 endings+0.004
osity+0.004
Em+0.003
Neg logit contributions
 one-0.003
 moral-0.002
 end-0.002
 closer-0.002
 lesson-0.002
Tokensm
Token ID5796
Feature activation-3.16e+00

Pos logit contributions
emic+0.017
 endings+0.016
 unnecessary+0.016
osity+0.016
Em+0.015
Neg logit contributions
 one-0.010
 moral-0.008
 end-0.008
 closer-0.007
 lesson-0.007
Tokenarted
Token ID19112
Feature activation-2.30e-01

Pos logit contributions
emic+0.064
 endings+0.061
 unnecessary+0.060
mic+0.056
igor+0.056
Neg logit contributions
 one-0.086
 moral-0.075
 end-0.075
 last-0.071
 closer-0.071
Token the
Token ID262
Feature activation+2.89e-03

Pos logit contributions
emic+0.008
osity+0.007
 endings+0.007
 unnecessary+0.007
 importantly+0.007
Neg logit contributions
 one-0.004
 end-0.003
 moral-0.003
 closer-0.002
 first-0.002
Token mean
Token ID1612
Feature activation+4.03e-01

Pos logit contributions
 one+0.000
 moral+0.000
 end+0.000
 closer+0.000
 first+0.000
Neg logit contributions
emic-0.000
 unnecessary-0.000
 endings-0.000
osity-0.000
Em-0.000
Token man
Token ID582
Feature activation-2.88e-01

Pos logit contributions
 one+0.005
 moral+0.004
ossible+0.004
 closer+0.004
 end+0.004
Neg logit contributions
 endings-0.013
emic-0.013
 unnecessary-0.012
osity-0.012
 situations-0.012
Token.
Token ID13
Feature activation-3.32e-01

Pos logit contributions
emic+0.009
 unnecessary+0.009
 endings+0.009
osity+0.009
Em+0.008
Neg logit contributions
 one-0.005
 moral-0.004
 end-0.003
 closer-0.003
 first-0.003
Token a
Token ID257
Feature activation-3.14e-01

Pos logit contributions
 one+0.006
 moral+0.005
 end+0.005
 closer+0.005
 lesson+0.005
Neg logit contributions
emic-0.011
 endings-0.011
 unnecessary-0.011
Em-0.010
 importantly-0.010
Token small
Token ID1402
Feature activation-3.13e-01

Pos logit contributions
emic+0.007
 endings+0.007
 unnecessary+0.007
osity+0.007
 situations+0.006
Neg logit contributions
 one-0.008
 moral-0.007
 end-0.007
 closer-0.007
 first-0.007
Token elephant
Token ID20950
Feature activation-2.35e-01

Pos logit contributions
emic+0.009
 unnecessary+0.009
 endings+0.009
osity+0.008
Em+0.008
Neg logit contributions
 one-0.007
 moral-0.005
 end-0.005
 closer-0.005
 first-0.005
Token called
Token ID1444
Feature activation-9.37e-01

Pos logit contributions
 unnecessary+0.007
emic+0.007
 endings+0.007
osity+0.007
 situations+0.006
Neg logit contributions
 one-0.005
 end-0.004
 moral-0.004
 first-0.003
 closer-0.003
Token E
Token ID412
Feature activation+2.02e+00

Pos logit contributions
emic+0.023
 unnecessary+0.022
 endings+0.022
osity+0.021
Em+0.020
Neg logit contributions
 one-0.022
 moral-0.019
 end-0.018
 closer-0.017
 lesson-0.017
Tokensm
Token ID5796
Feature activation-3.14e+00

Pos logit contributions
 one+0.047
 moral+0.042
 closer+0.040
 first+0.038
 end+0.038
Neg logit contributions
emic-0.049
Em-0.045
 endings-0.043
osity-0.042
und-0.042
Tokenie
Token ID494
Feature activation-5.16e-01

Pos logit contributions
emic+0.087
 endings+0.084
 unnecessary+0.083
osity+0.081
Em+0.078
Neg logit contributions
 one-0.064
 moral-0.052
 end-0.051
 closer-0.048
 first-0.047
Token who
Token ID508
Feature activation-4.56e-01

Pos logit contributions
emic+0.017
 unnecessary+0.016
 endings+0.016
osity+0.016
Em+0.015
Neg logit contributions
 one-0.009
 moral-0.006
 end-0.006
 closer-0.006
 first-0.006
Token loved
Token ID6151
Feature activation-7.17e-01

Pos logit contributions
emic+0.010
 endings+0.010
osity+0.010
Em+0.009
 unnecessary+0.009
Neg logit contributions
 one-0.011
 moral-0.010
 end-0.009
 closer-0.009
 first-0.009
Token to
Token ID284
Feature activation+3.91e-01

Pos logit contributions
emic+0.025
osity+0.024
 unnecessary+0.023
 endings+0.023
 hey+0.023
Neg logit contributions
 one-0.012
 some-0.008
 closer-0.007
 lesson-0.007
 moral-0.007
Token travel
Token ID3067
Feature activation+2.53e-02

Pos logit contributions
 one+0.008
 moral+0.006
 end+0.006
 closer+0.006
 lesson+0.006
Neg logit contributions
emic-0.011
 unnecessary-0.011
 endings-0.011
osity-0.011
Em-0.010
Token and
Token ID290
Feature activation-1.51e+00

Pos logit contributions
 one+0.020
 moral+0.018
 closer+0.017
 lesson+0.016
 end+0.015
Neg logit contributions
 endings-0.031
emic-0.030
und-0.030
Em-0.029
 situations-0.029
Token his
Token ID465
Feature activation-3.66e-01

Pos logit contributions
emic+0.041
 unnecessary+0.041
 endings+0.041
Em+0.039
osity+0.038
Neg logit contributions
 one-0.026
 moral-0.024
 end-0.022
 closer-0.021
 first-0.020
Token dog
Token ID3290
Feature activation+4.14e-01

Pos logit contributions
emic+0.011
 endings+0.011
 unnecessary+0.011
osity+0.011
 situations+0.010
Neg logit contributions
 one-0.007
 moral-0.005
 end-0.005
 first-0.005
 lesson-0.005
Token,
Token ID11
Feature activation-2.79e+00

Pos logit contributions
 one+0.008
 moral+0.007
 closer+0.007
 lesson+0.007
 first+0.006
Neg logit contributions
emic-0.011
 endings-0.011
 unnecessary-0.010
Em-0.010
 hey-0.010
Token Max
Token ID5436
Feature activation+5.39e-01

Pos logit contributions
emic+0.070
 unnecessary+0.065
 endings+0.064
Em+0.063
osity+0.062
Neg logit contributions
 one-0.060
 moral-0.055
 end-0.051
 lesson-0.049
 closer-0.048
Token,
Token ID11
Feature activation-3.12e+00

Pos logit contributions
 one+0.009
 moral+0.009
 lesson+0.008
 closer+0.008
 first+0.007
Neg logit contributions
 endings-0.015
emic-0.014
 unnecessary-0.014
 hey-0.014
Em-0.013
Token went
Token ID1816
Feature activation-9.24e-01

Pos logit contributions
emic+0.094
 unnecessary+0.094
osity+0.093
 endings+0.093
 situations+0.088
Neg logit contributions
 one-0.059
 moral-0.042
 end-0.042
 first-0.040
 closer-0.040
Token to
Token ID284
Feature activation+4.12e-01

Pos logit contributions
emic+0.023
 unnecessary+0.022
osity+0.021
 endings+0.021
Em+0.020
Neg logit contributions
 one-0.019
 moral-0.018
 end-0.017
 lesson-0.015
 particular-0.015
Token the
Token ID262
Feature activation+5.22e-01

Pos logit contributions
 one+0.007
 moral+0.005
 end+0.005
 lesson+0.005
 closer+0.005
Neg logit contributions
emic-0.012
 unnecessary-0.012
 endings-0.012
Em-0.012
osity-0.012
Token park
Token ID3952
Feature activation+8.74e-02

Pos logit contributions
 one+0.011
 moral+0.009
 end+0.008
 first+0.008
 closer+0.008
Neg logit contributions
emic-0.015
 unnecessary-0.015
 endings-0.014
osity-0.014
 importantly-0.014
Token.
Token ID13
Feature activation-3.74e-01

Pos logit contributions
 moral+0.002
 End+0.002
oral+0.002
inkle+0.002
 end+0.002
Neg logit contributions
and-0.001
emic-0.001
 endings-0.001
und-0.001
cess-0.001
Token and
Token ID290
Feature activation-1.16e+00

Pos logit contributions
 one+0.001
 moral+0.001
 end+0.001
 lesson+0.001
 closer+0.001
Neg logit contributions
emic-0.002
 endings-0.002
 unnecessary-0.002
osity-0.002
Em-0.002
Token his
Token ID465
Feature activation-1.31e-01

Pos logit contributions
emic+0.043
 unnecessary+0.041
 endings+0.041
osity+0.041
Em+0.039
Neg logit contributions
 one-0.014
 moral-0.009
 end-0.009
 lesson-0.008
 closer-0.008
Token friend
Token ID1545
Feature activation-1.19e+00

Pos logit contributions
 endings+0.004
 unnecessary+0.004
emic+0.004
osity+0.004
 situations+0.004
Neg logit contributions
 one-0.002
 moral-0.002
 end-0.002
 first-0.002
 lesson-0.002
Token,
Token ID11
Feature activation-2.73e+00

Pos logit contributions
emic+0.034
 endings+0.033
 unnecessary+0.033
osity+0.032
Em+0.031
Neg logit contributions
 one-0.022
 moral-0.018
 end-0.017
 closer-0.017
 lesson-0.016
Token Tom
Token ID4186
Feature activation+1.04e+00

Pos logit contributions
emic+0.080
 unnecessary+0.078
 endings+0.077
osity+0.077
 situations+0.073
Neg logit contributions
 one-0.053
 moral-0.041
 end-0.039
 closer-0.038
 first-0.038
Token,
Token ID11
Feature activation-3.09e+00

Pos logit contributions
 moral+0.019
 one+0.019
 lesson+0.018
ort+0.017
 closer+0.017
Neg logit contributions
 endings-0.026
emic-0.025
 hey-0.024
und-0.024
 situations-0.023
Token went
Token ID1816
Feature activation-4.96e-01

Pos logit contributions
emic+0.092
 unnecessary+0.092
osity+0.090
 endings+0.090
 situations+0.085
Neg logit contributions
 one-0.058
 moral-0.044
 end-0.043
 closer-0.041
 first-0.041
Token to
Token ID284
Feature activation+5.97e-01

Pos logit contributions
emic+0.013
 unnecessary+0.013
 endings+0.012
osity+0.012
Em+0.012
Neg logit contributions
 one-0.010
 moral-0.009
 end-0.008
 lesson-0.008
 closer-0.007
Token the
Token ID262
Feature activation+8.44e-01

Pos logit contributions
 one+0.010
 moral+0.008
 end+0.007
 closer+0.007
 lesson+0.007
Neg logit contributions
emic-0.019
 unnecessary-0.018
 endings-0.018
osity-0.018
Em-0.017
Token park
Token ID3952
Feature activation+3.09e-01

Pos logit contributions
 one+0.016
 moral+0.012
 end+0.012
 first+0.012
 last+0.011
Neg logit contributions
emic-0.027
 unnecessary-0.027
osity-0.027
 importantly-0.026
 endings-0.026
Token.
Token ID13
Feature activation+9.97e-02

Pos logit contributions
 moral+0.008
 End+0.007
oral+0.007
 end+0.007
 lesson+0.007
Neg logit contributions
emic-0.006
and-0.005
 endings-0.005
und-0.005
 pian-0.005
Token also
Token ID635
Feature activation+7.47e-01

Pos logit contributions
emic+0.008
Em+0.008
 endings+0.008
osity+0.008
 pian+0.007
Neg logit contributions
 one-0.007
 moral-0.006
 end-0.006
 closer-0.006
 lesson-0.006
Token added
Token ID2087
Feature activation+3.06e-01

Pos logit contributions
 one+0.016
 moral+0.014
 end+0.013
 lesson+0.013
 closer+0.013
Neg logit contributions
emic-0.019
Em-0.018
 endings-0.018
 unnecessary-0.017
osity-0.017
Token some
Token ID617
Feature activation+4.08e-01

Pos logit contributions
 one+0.004
 moral+0.003
 end+0.003
ossible+0.003
 lesson+0.003
Neg logit contributions
emic-0.010
 unnecessary-0.010
 endings-0.010
osity-0.009
Em-0.009
Token spicy
Token ID26880
Feature activation+6.91e-01

Pos logit contributions
 one+0.010
 moral+0.009
 end+0.008
 closer+0.008
 lesson+0.008
Neg logit contributions
emic-0.010
osity-0.010
 endings-0.009
 unnecessary-0.009
 hey-0.009
Token-
Token ID12
Feature activation-6.52e-02

Pos logit contributions
 one+0.018
 end+0.014
 lesson+0.014
 moral+0.014
 some+0.014
Neg logit contributions
emic-0.018
osity-0.016
 unnecessary-0.016
 hey-0.015
Em-0.015
Tokensm
Token ID5796
Feature activation-3.08e+00

Pos logit contributions
emic+0.001
 unnecessary+0.001
 endings+0.001
osity+0.001
Em+0.001
Neg logit contributions
 one-0.002
 moral-0.001
 end-0.001
 first-0.001
 closer-0.001
Tokenelling
Token ID9417
Feature activation-5.47e-02

Pos logit contributions
emic+0.073
 endings+0.071
 unnecessary+0.069
osity+0.065
 Majesty+0.065
Neg logit contributions
 one-0.074
 moral-0.063
 end-0.063
 last-0.059
 lesson-0.058
Token flowers
Token ID12734
Feature activation-7.56e-01

Pos logit contributions
emic+0.001
 endings+0.001
 unnecessary+0.001
osity+0.001
Em+0.001
Neg logit contributions
 one-0.001
 moral-0.001
 end-0.001
 closer-0.001
 lesson-0.001
Token to
Token ID284
Feature activation+6.00e-01

Pos logit contributions
emic+0.024
 unnecessary+0.023
 endings+0.023
osity+0.022
Em+0.022
Neg logit contributions
 one-0.013
 moral-0.010
 end-0.009
 closer-0.009
 first-0.009
Token her
Token ID607
Feature activation-2.67e-01

Pos logit contributions
 one+0.012
 end+0.009
 moral+0.009
 first+0.008
 closer+0.008
Neg logit contributions
emic-0.018
 endings-0.018
 unnecessary-0.017
osity-0.017
Em-0.016
Token dress
Token ID6576
Feature activation-4.61e-01

Pos logit contributions
emic+0.009
osity+0.009
 unnecessary+0.008
 endings+0.008
und+0.008
Neg logit contributions
 one-0.005
 end-0.004
 moral-0.003
 first-0.003
 last-0.003
Token friends
Token ID2460
Feature activation-1.05e+00

Pos logit contributions
 one+0.001
 moral+0.001
 end+0.001
 closer+0.001
 lesson+0.001
Neg logit contributions
emic-0.002
 unnecessary-0.002
 endings-0.002
osity-0.002
Em-0.002
Token,
Token ID11
Feature activation-2.36e+00

Pos logit contributions
emic+0.034
 endings+0.033
 unnecessary+0.033
osity+0.033
Em+0.032
Neg logit contributions
 one-0.016
 moral-0.012
 end-0.011
 closer-0.011
 lesson-0.010
Token Lily
Token ID20037
Feature activation+1.31e-01

Pos logit contributions
emic+0.067
 unnecessary+0.064
 endings+0.064
osity+0.062
Em+0.062
Neg logit contributions
 one-0.044
 moral-0.038
 end-0.036
 lesson-0.033
 closer-0.033
Token and
Token ID290
Feature activation-1.07e+00

Pos logit contributions
 one+0.002
 moral+0.002
 closer+0.002
 lesson+0.002
 end+0.001
Neg logit contributions
emic-0.004
 endings-0.004
und-0.004
Em-0.004
osity-0.004
Token Ben
Token ID3932
Feature activation+1.19e+00

Pos logit contributions
emic+0.031
 unnecessary+0.030
 endings+0.029
osity+0.029
 importantly+0.028
Neg logit contributions
 one-0.024
 end-0.017
 first-0.017
 closer-0.017
 moral-0.017
Token,
Token ID11
Feature activation-3.07e+00

Pos logit contributions
 one+0.024
 moral+0.023
 lesson+0.022
ort+0.021
 closer+0.021
Neg logit contributions
 endings-0.028
emic-0.028
 hey-0.026
Em-0.026
und-0.026
Token went
Token ID1816
Feature activation-3.14e-01

Pos logit contributions
emic+0.093
 unnecessary+0.092
 endings+0.090
osity+0.090
 situations+0.085
Neg logit contributions
 one-0.056
 moral-0.042
 end-0.041
 closer-0.039
 first-0.038
Token to
Token ID284
Feature activation+7.43e-01

Pos logit contributions
emic+0.009
 unnecessary+0.008
osity+0.008
 endings+0.008
 pian+0.008
Neg logit contributions
 one-0.006
 moral-0.005
 end-0.005
 lesson-0.005
 closer-0.005
Token play
Token ID711
Feature activation+3.74e-01

Pos logit contributions
 one+0.012
 moral+0.010
 end+0.009
 lesson+0.008
 closer+0.008
Neg logit contributions
emic-0.023
 unnecessary-0.022
 endings-0.022
osity-0.022
Em-0.021
Token in
Token ID287
Feature activation+6.80e-01

Pos logit contributions
 one+0.007
 moral+0.006
 end+0.005
 lesson+0.005
 closer+0.005
Neg logit contributions
emic-0.011
 endings-0.011
 unnecessary-0.011
osity-0.010
Em-0.010
Token the
Token ID262
Feature activation+1.93e-01

Pos logit contributions
 one+0.011
 moral+0.010
 end+0.009
 lesson+0.009
 closer+0.009
Neg logit contributions
emic-0.019
Em-0.019
 endings-0.018
 unnecessary-0.018
 pian-0.018
Token away
Token ID1497
Feature activation-3.40e-01

Pos logit contributions
emic+0.003
 unnecessary+0.002
 endings+0.002
osity+0.002
Em+0.002
Neg logit contributions
 one-0.001
 closer-0.001
 moral-0.001
 end-0.001
 first-0.001
Token noticed
Token ID6810
Feature activation-1.29e+00

Pos logit contributions
emic+0.012
 unnecessary+0.012
 endings+0.012
osity+0.011
Em+0.011
Neg logit contributions
 one-0.005
 first-0.003
 end-0.003
 closer-0.003
 moral-0.003
Token some
Token ID617
Feature activation+2.05e-02

Pos logit contributions
emic+0.046
 unnecessary+0.044
osity+0.044
 endings+0.043
Em+0.042
Neg logit contributions
 one-0.021
 some-0.013
 first-0.012
 end-0.012
 closer-0.012
Token sour
Token ID11348
Feature activation+1.56e-01

Pos logit contributions
 one+0.000
 moral+0.000
 end+0.000
 closer+0.000
 lesson+0.000
Neg logit contributions
emic-0.001
osity-0.001
 unnecessary-0.001
 endings-0.001
Em-0.001
Token-
Token ID12
Feature activation+2.20e-01

Pos logit contributions
 one+0.003
 moral+0.003
 end+0.003
 closer+0.002
 lesson+0.002
Neg logit contributions
emic-0.004
 unnecessary-0.004
 endings-0.004
osity-0.004
Em-0.004
Tokensm
Token ID5796
Feature activation-3.07e+00

Pos logit contributions
 one+0.005
 moral+0.004
 end+0.004
ossible+0.004
 closer+0.004
Neg logit contributions
emic-0.005
 unnecessary-0.005
 endings-0.005
und-0.005
Em-0.005
Tokenelling
Token ID9417
Feature activation-3.72e-01

Pos logit contributions
emic+0.077
 endings+0.074
 unnecessary+0.073
osity+0.068
 importantly+0.068
Neg logit contributions
 one-0.072
 end-0.059
 moral-0.058
 last-0.056
 first-0.055
Token bushes
Token ID37413
Feature activation-1.09e+00

Pos logit contributions
emic+0.010
 unnecessary+0.010
 endings+0.009
osity+0.009
Em+0.009
Neg logit contributions
 one-0.008
 moral-0.007
 end-0.006
 closer-0.006
 lesson-0.006
Token.
Token ID13
Feature activation-6.25e-02

Pos logit contributions
emic+0.034
 endings+0.033
 unnecessary+0.033
osity+0.032
Em+0.031
Neg logit contributions
 one-0.018
 moral-0.014
 end-0.013
 closer-0.013
 lesson-0.013
Token She
Token ID1375
Feature activation-3.58e-01

Pos logit contributions
emic+0.002
 unnecessary+0.002
 endings+0.002
osity+0.002
 situations+0.001
Neg logit contributions
 one-0.001
 moral-0.001
 some-0.001
 first-0.001
ossible-0.001
Token had
Token ID550
Feature activation-2.39e-01

Pos logit contributions
emic+0.010
 endings+0.009
 unnecessary+0.009
osity+0.009
Em+0.009
Neg logit contributions
 one-0.007
 moral-0.006
 end-0.006
 lesson-0.005
 closer-0.005
TokenBut
Token ID1537
Feature activation+6.26e-02

Pos logit contributions
emic+0.016
 endings+0.015
 unnecessary+0.015
osity+0.015
Em+0.014
Neg logit contributions
 one-0.016
 moral-0.014
 end-0.014
 closer-0.013
 lesson-0.013
Token it
Token ID340
Feature activation+1.07e-01

Pos logit contributions
 one+0.002
 first+0.001
 end+0.001
 moral+0.001
 closer+0.001
Neg logit contributions
emic-0.002
 unnecessary-0.002
 endings-0.002
Em-0.002
osity-0.002
Token's
Token ID338
Feature activation-9.35e-01

Pos logit contributions
 one+0.002
 moral+0.002
 end+0.002
 first+0.002
 closer+0.002
Neg logit contributions
emic-0.003
 endings-0.003
 unnecessary-0.003
osity-0.003
Em-0.003
Token mine
Token ID6164
Feature activation-3.53e-01

Pos logit contributions
emic+0.025
 unnecessary+0.025
 endings+0.024
osity+0.024
 importantly+0.023
Neg logit contributions
 one-0.018
 moral-0.015
 end-0.014
 closer-0.014
ossible-0.014
Token!
Token ID0
Feature activation-6.90e-01

Pos logit contributions
emic+0.011
 unnecessary+0.011
 endings+0.011
Em+0.011
osity+0.010
Neg logit contributions
 one-0.006
 first-0.005
 end-0.004
 closer-0.004
 moral-0.004
Token Nobody
Token ID15658
Feature activation-3.06e+00

Pos logit contributions
 unnecessary+0.016
emic+0.016
 endings+0.016
 situations+0.015
osity+0.014
Neg logit contributions
 one-0.018
 moral-0.015
 first-0.014
ossible-0.014
 some-0.014
Token else
Token ID2073
Feature activation-4.41e-01

Pos logit contributions
emic+0.086
 endings+0.083
 unnecessary+0.082
osity+0.082
Em+0.079
Neg logit contributions
 one-0.062
 moral-0.049
 end-0.048
 closer-0.047
 lesson-0.046
Token can
Token ID460
Feature activation+4.02e-01

Pos logit contributions
emic+0.012
 endings+0.012
 unnecessary+0.012
osity+0.012
Em+0.011
Neg logit contributions
 one-0.009
 moral-0.007
 end-0.007
 closer-0.007
 first-0.007
Token have
Token ID423
Feature activation-1.25e-01

Pos logit contributions
 one+0.008
 moral+0.008
ossible+0.007
 lesson+0.007
 End+0.007
Neg logit contributions
emic-0.009
 endings-0.009
und-0.009
 hey-0.009
osity-0.008
Token it
Token ID340
Feature activation+2.43e-01

Pos logit contributions
emic+0.004
 endings+0.004
 unnecessary+0.004
osity+0.003
Em+0.003
Neg logit contributions
 one-0.002
 moral-0.002
 end-0.002
 closer-0.002
 lesson-0.002
Token".
Token ID1911
Feature activation-3.25e-01

Pos logit contributions
 one+0.005
 moral+0.004
 end+0.004
 closer+0.004
 lesson+0.004
Neg logit contributions
emic-0.007
 unnecessary-0.006
 endings-0.006
osity-0.006
Em-0.006
Token and
Token ID290
Feature activation-1.23e+00

Pos logit contributions
 one+0.001
 moral+0.001
 end+0.001
 lesson+0.001
 closer+0.001
Neg logit contributions
emic-0.001
 endings-0.001
 unnecessary-0.001
osity-0.001
Em-0.001
Token his
Token ID465
Feature activation-1.28e-01

Pos logit contributions
emic+0.044
 unnecessary+0.042
 endings+0.042
osity+0.042
Em+0.040
Neg logit contributions
 one-0.016
 moral-0.011
 end-0.010
 closer-0.009
 lesson-0.009
Token dog
Token ID3290
Feature activation+4.78e-01

Pos logit contributions
 endings+0.004
 unnecessary+0.004
osity+0.004
emic+0.004
 situations+0.004
Neg logit contributions
 one-0.002
 moral-0.002
 end-0.001
 first-0.001
 lesson-0.001
Token,
Token ID11
Feature activation-2.80e+00

Pos logit contributions
 one+0.009
 moral+0.007
 closer+0.007
 lesson+0.007
 end+0.006
Neg logit contributions
emic-0.014
 endings-0.013
 unnecessary-0.013
osity-0.013
Em-0.012
Token Max
Token ID5436
Feature activation+5.59e-01

Pos logit contributions
emic+0.069
 endings+0.065
 unnecessary+0.065
osity+0.063
Em+0.062
Neg logit contributions
 one-0.064
 moral-0.056
 end-0.053
 closer-0.051
 lesson-0.051
Token,
Token ID11
Feature activation-3.06e+00

Pos logit contributions
 one+0.009
 moral+0.009
 lesson+0.008
 closer+0.007
 first+0.007
Neg logit contributions
 endings-0.016
emic-0.016
 unnecessary-0.015
 hey-0.015
Em-0.015
Token went
Token ID1816
Feature activation-7.29e-01

Pos logit contributions
emic+0.091
 unnecessary+0.089
 endings+0.087
osity+0.087
 situations+0.082
Neg logit contributions
 one-0.058
 moral-0.045
 end-0.043
 closer-0.041
 first-0.041
Token to
Token ID284
Feature activation+5.72e-01

Pos logit contributions
emic+0.018
 unnecessary+0.018
 endings+0.017
osity+0.017
Em+0.017
Neg logit contributions
 one-0.015
 moral-0.013
 end-0.013
 lesson-0.012
 closer-0.011
Token the
Token ID262
Feature activation+8.21e-01

Pos logit contributions
 one+0.010
 moral+0.008
 end+0.008
 lesson+0.007
 closer+0.007
Neg logit contributions
emic-0.017
 unnecessary-0.016
 endings-0.016
osity-0.016
Em-0.016
Token park
Token ID3952
Feature activation+2.81e-01

Pos logit contributions
 one+0.016
 end+0.013
 moral+0.013
 first+0.012
 closer+0.011
Neg logit contributions
emic-0.025
 unnecessary-0.024
osity-0.024
 endings-0.024
 importantly-0.023
Token.
Token ID13
Feature activation-1.31e-02

Pos logit contributions
 moral+0.007
 End+0.007
oral+0.007
inkle+0.006
 end+0.006
Neg logit contributions
and-0.005
emic-0.005
 endings-0.004
und-0.004
 pian-0.004
Token "
Token ID366
Feature activation-5.41e-01

Pos logit contributions
emic+0.009
 unnecessary+0.008
 endings+0.008
osity+0.008
Em+0.008
Neg logit contributions
 one-0.007
 moral-0.006
 end-0.006
 closer-0.006
 lesson-0.006
TokenBut
Token ID1537
Feature activation+2.46e-01

Pos logit contributions
emic+0.011
 endings+0.011
 unnecessary+0.011
Em+0.010
osity+0.010
Neg logit contributions
 one-0.014
 moral-0.012
 end-0.012
 lesson-0.012
 closer-0.012
Token why
Token ID1521
Feature activation+8.63e-01

Pos logit contributions
 one+0.006
 first+0.005
 end+0.005
 moral+0.005
 closer+0.005
Neg logit contributions
emic-0.007
 unnecessary-0.006
 endings-0.006
osity-0.006
Em-0.006
Token not
Token ID407
Feature activation+3.69e-02

Pos logit contributions
 one+0.017
 moral+0.014
 end+0.014
 first+0.013
 lesson+0.013
Neg logit contributions
emic-0.025
 unnecessary-0.024
 endings-0.024
osity-0.023
Em-0.023
Token?
Token ID30
Feature activation-3.16e-01

Pos logit contributions
 one+0.001
 end+0.001
 moral+0.001
 first+0.001
 closer+0.001
Neg logit contributions
emic-0.001
 unnecessary-0.001
 endings-0.001
osity-0.001
mic-0.001
Token Everybody
Token ID23080
Feature activation-3.04e+00

Pos logit contributions
 unnecessary+0.007
emic+0.007
 endings+0.007
osity+0.007
 situations+0.006
Neg logit contributions
 one-0.008
 moral-0.007
 first-0.006
 closer-0.006
 end-0.006
Token is
Token ID318
Feature activation-2.90e-01

Pos logit contributions
emic+0.085
 unnecessary+0.083
 endings+0.082
osity+0.081
Em+0.077
Neg logit contributions
 one-0.061
 moral-0.049
 end-0.048
 closer-0.046
 first-0.045
Token welcome
Token ID7062
Feature activation-1.56e-01

Pos logit contributions
emic+0.007
 unnecessary+0.007
 endings+0.007
osity+0.007
Em+0.007
Neg logit contributions
 one-0.006
 moral-0.005
 end-0.005
 closer-0.005
 lesson-0.005
Token to
Token ID284
Feature activation-2.61e-01

Pos logit contributions
emic+0.005
 unnecessary+0.004
 endings+0.004
osity+0.004
Em+0.004
Neg logit contributions
 one-0.003
 moral-0.002
 end-0.002
 closer-0.002
 first-0.002
Token come
Token ID1282
Feature activation-1.09e-01

Pos logit contributions
emic+0.008
osity+0.008
 endings+0.008
 unnecessary+0.008
 importantly+0.007
Neg logit contributions
 one-0.005
 end-0.004
 moral-0.004
 first-0.004
 closer-0.004
Token join
Token ID4654
Feature activation-6.07e-01

Pos logit contributions
emic+0.003
 unnecessary+0.003
 endings+0.003
osity+0.003
Em+0.003
Neg logit contributions
 one-0.002
 moral-0.001
 end-0.001
 lesson-0.001
 closer-0.001
Token.
Token ID13
Feature activation-1.66e-02

Pos logit contributions
 one+0.013
 moral+0.010
 end+0.009
 closer+0.009
 first+0.009
Neg logit contributions
emic-0.028
 unnecessary-0.027
 endings-0.027
osity-0.026
Em-0.026
Token\n
Token ID198
Feature activation+2.95e-01

Pos logit contributions
emic+0.000
 unnecessary+0.000
 endings+0.000
osity+0.000
Em+0.000
Neg logit contributions
 one-0.000
 moral-0.000
 end-0.000
 closer-0.000
 lesson-0.000
Token\n
Token ID198
Feature activation-3.67e-01

Pos logit contributions
 one+0.006
 end+0.005
 moral+0.005
 closer+0.005
 last+0.005
Neg logit contributions
Em-0.007
emic-0.007
und-0.007
 endings-0.006
and-0.006
Token"
Token ID1
Feature activation+2.53e-01

Pos logit contributions
Em+0.008
emic+0.008
 endings+0.008
und+0.008
 unnecessary+0.008
Neg logit contributions
 one-0.009
 last-0.007
 moral-0.007
 closer-0.007
 lesson-0.007
TokenBe
Token ID3856
Feature activation+6.16e-01

Pos logit contributions
 one+0.006
 moral+0.005
 end+0.005
 closer+0.005
 lesson+0.005
Neg logit contributions
emic-0.006
 endings-0.006
 unnecessary-0.005
osity-0.005
Em-0.005
Tokenep
Token ID538
Feature activation+1.93e+00

Pos logit contributions
 one+0.017
 moral+0.015
 end+0.014
 lesson+0.014
 closer+0.014
Neg logit contributions
 unnecessary-0.012
emic-0.012
 endings-0.011
 hey-0.010
 importantly-0.010
Token be
Token ID307
Feature activation+4.63e-01

Pos logit contributions
 one+0.035
 moral+0.028
 end+0.027
 closer+0.025
 lesson+0.025
Neg logit contributions
emic-0.058
 unnecessary-0.055
 endings-0.055
osity-0.054
Em-0.052
Tokenep
Token ID538
Feature activation+1.85e+00

Pos logit contributions
 one+0.010
 moral+0.008
 end+0.008
 closer+0.008
 first+0.007
Neg logit contributions
emic-0.013
 endings-0.012
 unnecessary-0.012
osity-0.012
Em-0.011
Token!"
Token ID2474
Feature activation-8.93e-02

Pos logit contributions
 one+0.036
 moral+0.030
 end+0.028
 lesson+0.027
 closer+0.027
Neg logit contributions
emic-0.052
 unnecessary-0.049
 endings-0.049
osity-0.048
 hey-0.046
Token Tom
Token ID4186
Feature activation+1.40e+00

Pos logit contributions
emic+0.002
 endings+0.002
 unnecessary+0.002
osity+0.002
Em+0.002
Neg logit contributions
 one-0.002
 moral-0.002
 end-0.002
 lesson-0.002
 closer-0.002
Token said
Token ID531
Feature activation+4.48e-01

Pos logit contributions
 one+0.023
 moral+0.019
 end+0.018
 lesson+0.016
 closer+0.016
Neg logit contributions
emic-0.043
 endings-0.041
 unnecessary-0.041
osity-0.041
Em-0.040
Token in
Token ID287
Feature activation-2.73e-01

Pos logit contributions
 one+0.009
 end+0.006
 moral+0.006
 first+0.006
 closer+0.006
Neg logit contributions
emic-0.019
 unnecessary-0.019
osity-0.019
 endings-0.019
Em-0.018
Token "
Token ID366
Feature activation+2.04e-01

Pos logit contributions
emic+0.009
 unnecessary+0.008
 endings+0.008
osity+0.008
Em+0.008
Neg logit contributions
 one-0.004
 moral-0.003
 end-0.003
 closer-0.003
 lesson-0.003
TokenFun
Token ID24629
Feature activation+2.55e-01

Pos logit contributions
 one+0.005
 moral+0.004
 end+0.004
 closer+0.004
 first+0.004
Neg logit contributions
emic-0.005
 unnecessary-0.005
 endings-0.005
osity-0.005
 situations-0.004
Token Land
Token ID6379
Feature activation+2.01e-01

Pos logit contributions
 one+0.005
 moral+0.005
 end+0.004
 closer+0.004
 lesson+0.004
Neg logit contributions
emic-0.007
 endings-0.006
 unnecessary-0.006
osity-0.006
Em-0.006
Token"
Token ID1
Feature activation+5.92e-02

Pos logit contributions
 one+0.004
 moral+0.004
 end+0.004
 closer+0.003
 lesson+0.003
Neg logit contributions
emic-0.005
 endings-0.005
 unnecessary-0.005
osity-0.005
Em-0.005
Token ever
Token ID1683
Feature activation+2.14e+00

Pos logit contributions
 one+0.001
 moral+0.001
 end+0.001
 closer+0.001
 lesson+0.001
Neg logit contributions
emic-0.002
 unnecessary-0.002
 endings-0.002
osity-0.002
 hey-0.002
Token after
Token ID706
Feature activation+6.71e-01

Pos logit contributions
 one+0.036
 end+0.026
 moral+0.024
 lesson+0.021
 first+0.021
Neg logit contributions
emic-0.076
 unnecessary-0.071
osity-0.071
 importantly-0.068
und-0.068
Token.
Token ID13
Feature activation-3.22e-01

Pos logit contributions
 one+0.011
 moral+0.008
 end+0.008
 first+0.007
 closer+0.007
Neg logit contributions
emic-0.022
 unnecessary-0.021
 endings-0.021
osity-0.020
Em-0.020
Token<|endoftext|>
Token ID50256
Feature activation+4.10e-01

Pos logit contributions
 unnecessary+0.013
 endings+0.012
emic+0.012
 situations+0.012
osity+0.012
Neg logit contributions
 one-0.003
ossible-0.002
 moral-0.002
 end-0.001
 lesson-0.001
TokenOne
Token ID3198
Feature activation+3.82e-01

Pos logit contributions
 one+0.007
 moral+0.006
 end+0.005
 lesson+0.005
 closer+0.005
Neg logit contributions
emic-0.012
 endings-0.012
Em-0.012
 unnecessary-0.011
und-0.011
Token day
Token ID1110
Feature activation+3.91e-01

Pos logit contributions
 one+0.006
ort+0.006
 closer+0.006
 seller+0.005
orted+0.005
Neg logit contributions
emic-0.009
 importantly-0.009
 ow-0.009
 endings-0.009
 cob-0.009
Token not
Token ID407
Feature activation-6.58e-01

Pos logit contributions
emic+0.032
Em+0.031
 endings+0.031
 unnecessary+0.030
 pian+0.030
Neg logit contributions
 one-0.018
 first-0.013
 closer-0.013
 moral-0.013
 lesson-0.013
Token to
Token ID284
Feature activation-1.28e+00

Pos logit contributions
Em+0.023
emic+0.023
ibr+0.022
icol+0.022
 endings+0.021
Neg logit contributions
 one-0.011
 bite-0.008
 closer-0.008
 first-0.007
 end-0.007
Token trust
Token ID3774
Feature activation-7.33e-02

Pos logit contributions
emic+0.041
osity+0.039
Em+0.039
 importantly+0.038
ibr+0.038
Neg logit contributions
 one-0.026
 end-0.020
 bite-0.019
 first-0.018
 last-0.018
Token everything
Token ID2279
Feature activation-1.42e+00

Pos logit contributions
emic+0.002
osity+0.002
Em+0.002
 endings+0.002
 importantly+0.002
Neg logit contributions
 one-0.001
 end-0.001
 first-0.001
 closer-0.001
 moral-0.001
Token that
Token ID326
Feature activation+3.92e-01

Pos logit contributions
emic+0.051
osity+0.049
 endings+0.049
 unnecessary+0.049
Em+0.048
Neg logit contributions
 one-0.020
 end-0.013
 closer-0.013
 first-0.012
 moral-0.012
Token looks
Token ID3073
Feature activation+1.86e+00

Pos logit contributions
 one+0.008
 moral+0.007
 closer+0.007
 end+0.007
 lesson+0.006
Neg logit contributions
emic-0.010
 endings-0.010
 unnecessary-0.010
osity-0.009
 situations-0.009
Token like
Token ID588
Feature activation-4.37e-01

Pos logit contributions
 one+0.039
 moral+0.032
 end+0.031
 closer+0.029
 lesson+0.029
Neg logit contributions
emic-0.050
 unnecessary-0.049
 endings-0.048
osity-0.047
Em-0.045
Token a
Token ID257
Feature activation-3.08e-01

Pos logit contributions
emic+0.013
osity+0.013
 unnecessary+0.013
Em+0.013
 endings+0.012
Neg logit contributions
 one-0.008
 moral-0.006
 end-0.006
 first-0.006
 lesson-0.006
Token nut
Token ID6701
Feature activation+1.05e+00

Pos logit contributions
emic+0.009
osity+0.009
 endings+0.009
 unnecessary+0.009
Em+0.008
Neg logit contributions
 one-0.006
 moral-0.005
 end-0.005
 lesson-0.005
 closer-0.004
Token.
Token ID13
Feature activation+7.67e-01

Pos logit contributions
 one+0.017
 moral+0.013
 end+0.012
 closer+0.012
 first+0.011
Neg logit contributions
emic-0.034
 unnecessary-0.032
 endings-0.032
osity-0.032
Em-0.031
Token They
Token ID1119
Feature activation+3.74e-01

Pos logit contributions
 one+0.012
 moral+0.009
 end+0.008
 lesson+0.008
ossible+0.008
Neg logit contributions
emic-0.025
 unnecessary-0.024
 endings-0.024
osity-0.024
 situations-0.023
Token were
Token ID547
Feature activation-5.36e-02

Pos logit contributions
emic+0.003
 endings+0.003
 unnecessary+0.003
osity+0.003
Em+0.003
Neg logit contributions
 one-0.002
 moral-0.001
 end-0.001
 closer-0.001
 lesson-0.001
Token two
Token ID734
Feature activation+6.29e-01

Pos logit contributions
osity+0.002
emic+0.002
 unnecessary+0.002
 endings+0.002
 situations+0.002
Neg logit contributions
 one-0.001
 moral-0.001
 some-0.001
ossible-0.001
 first-0.000
Token best
Token ID1266
Feature activation+3.00e-01

Pos logit contributions
 one+0.015
 moral+0.013
 closer+0.013
 lesson+0.012
 end+0.012
Neg logit contributions
emic-0.015
 unnecessary-0.014
 endings-0.014
 importantly-0.013
osity-0.013
Token friends
Token ID2460
Feature activation-5.53e-01

Pos logit contributions
 one+0.007
 moral+0.006
 end+0.006
 closer+0.005
 lesson+0.005
Neg logit contributions
emic-0.008
 unnecessary-0.007
 endings-0.007
osity-0.007
 importantly-0.007
Token,
Token ID11
Feature activation-1.37e+00

Pos logit contributions
emic+0.015
 endings+0.014
 unnecessary+0.014
osity+0.014
Em+0.013
Neg logit contributions
 one-0.012
 moral-0.010
 end-0.009
 lesson-0.009
 closer-0.009
Token Mike
Token ID4995
Feature activation+1.87e+00

Pos logit contributions
emic+0.034
 unnecessary+0.032
 endings+0.031
osity+0.031
Em+0.030
Neg logit contributions
 one-0.031
 moral-0.027
 end-0.026
 lesson-0.025
 closer-0.025
Token and
Token ID290
Feature activation-1.24e+00

Pos logit contributions
 one+0.034
 moral+0.033
 closer+0.031
 lesson+0.030
 end+0.028
Neg logit contributions
emic-0.050
Em-0.045
und-0.044
osity-0.043
ime-0.042
Token Sam
Token ID3409
Feature activation+6.27e-01

Pos logit contributions
emic+0.035
 unnecessary+0.035
 endings+0.034
osity+0.033
 situations+0.032
Neg logit contributions
 one-0.028
 moral-0.020
 first-0.020
 end-0.019
 closer-0.019
Token.
Token ID13
Feature activation-4.96e-01

Pos logit contributions
 moral+0.012
 one+0.012
 lesson+0.011
 closer+0.011
ort+0.011
Neg logit contributions
emic-0.014
 endings-0.014
Em-0.013
 hey-0.013
 unnecessary-0.013
Token Every
Token ID3887
Feature activation-5.88e-02

Pos logit contributions
emic+0.011
 endings+0.011
osity+0.011
 unnecessary+0.011
und+0.010
Neg logit contributions
 one-0.012
 moral-0.011
 end-0.010
 closer-0.010
 lesson-0.010
Token day
Token ID1110
Feature activation-3.64e-01

Pos logit contributions
 endings+0.002
emic+0.002
 unnecessary+0.001
 pian+0.001
 ow+0.001
Neg logit contributions
 one-0.001
 moral-0.001
 closer-0.001
 first-0.001
 Mysteries-0.001
Token a
Token ID257
Feature activation+1.18e-02

Pos logit contributions
 one+0.015
 moral+0.012
 end+0.011
 first+0.011
 lesson+0.011
Neg logit contributions
emic-0.025
osity-0.024
 unnecessary-0.024
 endings-0.024
Em-0.023
Token modern
Token ID3660
Feature activation+6.99e-02

Pos logit contributions
 one+0.000
 moral+0.000
 end+0.000
 closer+0.000
 lesson+0.000
Neg logit contributions
emic-0.000
 unnecessary-0.000
 endings-0.000
osity-0.000
Em-0.000
Token train
Token ID4512
Feature activation+4.65e-01

Pos logit contributions
 one+0.002
ort+0.001
 moral+0.001
 closer+0.001
ossible+0.001
Neg logit contributions
 endings-0.001
 canvas-0.001
emic-0.001
 pian-0.001
 cob-0.001
Token.
Token ID13
Feature activation+1.35e+00

Pos logit contributions
 one+0.007
 moral+0.005
 end+0.005
 first+0.005
 closer+0.004
Neg logit contributions
 unnecessary-0.016
emic-0.015
 endings-0.015
osity-0.015
Em-0.014
Token It
Token ID632
Feature activation+6.48e-01

Pos logit contributions
 one+0.029
 moral+0.024
 end+0.023
 closer+0.022
 lesson+0.022
Neg logit contributions
emic-0.037
 unnecessary-0.035
 endings-0.035
osity-0.034
Em-0.033
Token can
Token ID460
Feature activation+1.89e+00

Pos logit contributions
 one+0.013
 moral+0.011
 end+0.011
ossible+0.011
 lesson+0.011
Neg logit contributions
emic-0.016
 pian-0.015
 endings-0.014
 unnecessary-0.014
osity-0.014
Token make
Token ID787
Feature activation+3.72e-01

Pos logit contributions
 one+0.043
 lesson+0.042
ossible+0.040
 moral+0.040
 moment+0.038
Neg logit contributions
emic-0.036
 pian-0.034
 ow-0.034
mic-0.033
 hey-0.032
Token sounds
Token ID5238
Feature activation+5.23e-02

Pos logit contributions
 one+0.007
 moral+0.005
 end+0.005
 some+0.005
 closer+0.005
Neg logit contributions
emic-0.012
osity-0.011
 unnecessary-0.011
Em-0.011
 endings-0.011
Token and
Token ID290
Feature activation+4.73e-01

Pos logit contributions
 one+0.001
 end+0.000
 moral+0.000
 closer+0.000
 some+0.000
Neg logit contributions
emic-0.002
 unnecessary-0.002
osity-0.002
 endings-0.002
und-0.002
Token move
Token ID1445
Feature activation+6.54e-01

Pos logit contributions
 one+0.011
 moral+0.009
 end+0.009
 some+0.009
 closer+0.009
Neg logit contributions
emic-0.013
osity-0.012
Em-0.012
 unnecessary-0.012
und-0.012
Token.
Token ID13
Feature activation+1.03e+00

Pos logit contributions
 one+0.010
 moral+0.008
 end+0.007
 closer+0.007
 first+0.007
Neg logit contributions
emic-0.022
 endings-0.021
 unnecessary-0.021
osity-0.021
Em-0.020
Token him
Token ID683
Feature activation+2.35e-01

Pos logit contributions
 unnecessary+0.005
emic+0.005
 endings+0.005
 hey+0.005
 situations+0.005
Neg logit contributions
 one-0.003
 moral-0.002
 end-0.002
 lesson-0.002
ossible-0.002
Token to
Token ID284
Feature activation+2.45e-01

Pos logit contributions
 one+0.005
 moral+0.004
 end+0.004
 particular+0.004
 lesson+0.004
Neg logit contributions
 unnecessary-0.006
emic-0.006
 endings-0.006
osity-0.006
 situations-0.005
Token the
Token ID262
Feature activation+4.32e-02

Pos logit contributions
 one+0.004
 moral+0.003
 end+0.003
 closer+0.002
 lesson+0.002
Neg logit contributions
emic-0.008
 unnecessary-0.008
 endings-0.008
osity-0.008
Em-0.007
Token hospital
Token ID4436
Feature activation+7.14e-01

Pos logit contributions
 one+0.001
 end+0.001
 moral+0.001
 first+0.001
 closer+0.001
Neg logit contributions
emic-0.001
osity-0.001
 endings-0.001
 unnecessary-0.001
 importantly-0.001
Token,
Token ID11
Feature activation+4.79e-02

Pos logit contributions
 one+0.011
 moral+0.008
 end+0.008
 closer+0.007
 lesson+0.007
Neg logit contributions
emic-0.024
 unnecessary-0.023
 endings-0.023
osity-0.022
Em-0.022
Token but
Token ID475
Feature activation+1.07e+00

Pos logit contributions
 one+0.001
 moral+0.001
 end+0.001
 lesson+0.001
 closer+0.001
Neg logit contributions
emic-0.001
 unnecessary-0.001
 endings-0.001
osity-0.001
Em-0.001
Token the
Token ID262
Feature activation+3.76e-02

Pos logit contributions
 one+0.018
 moral+0.016
ossible+0.016
ort+0.014
 lesson+0.014
Neg logit contributions
emic-0.029
 unnecessary-0.029
 importantly-0.028
 endings-0.028
 hey-0.028
Token bite
Token ID13197
Feature activation-2.06e-01

Pos logit contributions
 one+0.001
 moral+0.001
 end+0.001
ossible+0.001
 closer+0.001
Neg logit contributions
emic-0.001
 unnecessary-0.001
 endings-0.001
osity-0.001
Em-0.001
Token was
Token ID373
Feature activation-6.10e-01

Pos logit contributions
emic+0.006
 endings+0.006
 unnecessary+0.006
osity+0.006
und+0.005
Neg logit contributions
 one-0.004
 moral-0.003
 end-0.003
 closer-0.003
ossible-0.003
Token too
Token ID1165
Feature activation+1.47e+00

Pos logit contributions
emic+0.016
 unnecessary+0.016
osity+0.015
 endings+0.015
 importantly+0.015
Neg logit contributions
 one-0.012
 moral-0.010
ossible-0.010
 end-0.009
 lesson-0.009
Token deep
Token ID2769
Feature activation+1.43e+00

Pos logit contributions
 one+0.034
 end+0.031
ossible+0.030
 moral+0.029
 End+0.028
Neg logit contributions
 unnecessary-0.032
emic-0.031
 ow-0.028
 importantly-0.027
Em-0.026
Token that
Token ID326
Feature activation+5.18e-01

Pos logit contributions
emic+0.002
 unnecessary+0.002
 endings+0.002
osity+0.002
Em+0.001
Neg logit contributions
 one-0.001
 moral-0.001
 end-0.001
 closer-0.001
 lesson-0.001
Token basket
Token ID7988
Feature activation+4.71e-01

Pos logit contributions
 one+0.009
 moral+0.008
ossible+0.008
 closer+0.008
ort+0.008
Neg logit contributions
emic-0.013
 endings-0.013
 unnecessary-0.013
osity-0.012
 situations-0.012
Token?"
Token ID1701
Feature activation-1.60e-01

Pos logit contributions
 one+0.009
 moral+0.008
 lesson+0.007
 end+0.007
 closer+0.006
Neg logit contributions
emic-0.013
osity-0.012
 endings-0.012
 unnecessary-0.012
und-0.012
Token\n
Token ID198
Feature activation+9.00e-02

Pos logit contributions
emic+0.004
 endings+0.004
 Acc+0.004
Em+0.004
 importantly+0.004
Neg logit contributions
 one-0.003
 moral-0.003
 lesson-0.003
 end-0.003
 closer-0.002
Token\n
Token ID198
Feature activation-4.05e-01

Pos logit contributions
 one+0.002
 moral+0.002
 end+0.002
 lesson+0.002
gers+0.002
Neg logit contributions
Em-0.002
emic-0.002
und-0.002
 Acc-0.002
and-0.002
TokenThe
Token ID464
Feature activation+5.14e-01

Pos logit contributions
emic+0.010
Em+0.010
 endings+0.009
osity+0.009
und+0.009
Neg logit contributions
 one-0.009
 moral-0.007
 end-0.007
 lesson-0.007
 last-0.007
Token lady
Token ID10846
Feature activation-3.00e-01

Pos logit contributions
 one+0.008
ort+0.007
ossible+0.007
 closer+0.006
 moral+0.006
Neg logit contributions
emic-0.015
 endings-0.014
osity-0.014
 hey-0.014
Em-0.014
Token smiled
Token ID13541
Feature activation+8.09e-01

Pos logit contributions
emic+0.008
Em+0.008
 importantly+0.007
 hey+0.007
osity+0.007
Neg logit contributions
 one-0.005
 lesson-0.005
 moral-0.005
 end-0.005
ossible-0.004
Token and
Token ID290
Feature activation+7.60e-01

Pos logit contributions
 one+0.012
 moral+0.010
 lesson+0.009
 end+0.009
ossible+0.009
Neg logit contributions
emic-0.025
 endings-0.024
osity-0.024
 unnecessary-0.024
 importantly-0.023
Token said
Token ID531
Feature activation-6.00e-01

Pos logit contributions
 one+0.010
 moral+0.008
 end+0.008
 lesson+0.007
 closer+0.007
Neg logit contributions
emic-0.026
 endings-0.025
 unnecessary-0.025
osity-0.024
Em-0.024
Token,
Token ID11
Feature activation+1.64e-01

Pos logit contributions
emic+0.020
 unnecessary+0.019
 endings+0.019
osity+0.019
Em+0.018
Neg logit contributions
 one-0.009
 moral-0.007
 end-0.006
 closer-0.006
 lesson-0.006
Token,
Token ID11
Feature activation-1.14e-01

Pos logit contributions
 one+0.014
 moral+0.011
 end+0.010
 closer+0.010
 lesson+0.010
Neg logit contributions
emic-0.029
 unnecessary-0.028
 endings-0.028
osity-0.028
Em-0.027
Token Anna
Token ID11735
Feature activation-4.56e-01

Pos logit contributions
emic+0.003
 unnecessary+0.003
 endings+0.003
osity+0.003
und+0.003
Neg logit contributions
 one-0.003
 moral-0.002
 first-0.002
 end-0.002
 closer-0.002
Token.
Token ID13
Feature activation-8.87e-01

Pos logit contributions
emic+0.014
 endings+0.014
 unnecessary+0.014
osity+0.013
Em+0.013
Neg logit contributions
 one-0.008
 moral-0.006
 end-0.006
 closer-0.005
 lesson-0.005
Token I
Token ID314
Feature activation+3.50e-01

Pos logit contributions
emic+0.019
 unnecessary+0.019
 endings+0.018
osity+0.017
 situations+0.017
Neg logit contributions
 one-0.025
 moral-0.020
 end-0.020
 first-0.020
 closer-0.019
Token will
Token ID481
Feature activation-9.34e-02

Pos logit contributions
 one+0.009
 moral+0.008
 kernels+0.007
 front+0.007
 particular+0.007
Neg logit contributions
emic-0.006
 hey-0.006
 unnecessary-0.006
ird-0.005
igor-0.005
Token help
Token ID1037
Feature activation+1.63e-01

Pos logit contributions
emic+0.003
 unnecessary+0.003
 endings+0.003
osity+0.003
Em+0.002
Neg logit contributions
 one-0.002
 moral-0.001
 end-0.001
 first-0.001
 closer-0.001
Token you
Token ID345
Feature activation+2.81e-01

Pos logit contributions
 one+0.004
 moral+0.004
 end+0.004
 first+0.004
 closer+0.004
Neg logit contributions
emic-0.004
 unnecessary-0.003
 endings-0.003
osity-0.003
 importantly-0.003
Token.
Token ID13
Feature activation-5.15e-01

Pos logit contributions
 one+0.005
 end+0.003
 moral+0.003
 closer+0.003
 first+0.003
Neg logit contributions
emic-0.009
 unnecessary-0.009
 endings-0.009
osity-0.009
Em-0.008
Token But
Token ID887
Feature activation+2.72e-01

Pos logit contributions
emic+0.012
 unnecessary+0.012
 endings+0.011
osity+0.011
 situations+0.011
Neg logit contributions
 one-0.014
 moral-0.011
 end-0.011
ossible-0.010
 first-0.010
Token you
Token ID345
Feature activation+4.85e-02

Pos logit contributions
 one+0.006
 end+0.004
 moral+0.004
 first+0.004
 closer+0.004
Neg logit contributions
emic-0.008
 unnecessary-0.008
 endings-0.008
osity-0.007
Em-0.007
Token have
Token ID423
Feature activation-4.54e-01

Pos logit contributions
 one+0.001
 moral+0.001
 lesson+0.001
 closer+0.001
 end+0.001
Neg logit contributions
emic-0.001
 unnecessary-0.001
 endings-0.001
osity-0.001
 hey-0.001
Token\n
Token ID198
Feature activation+2.71e-01

Pos logit contributions
 one+0.005
 end+0.004
 moral+0.004
 bite+0.004
 closer+0.004
Neg logit contributions
emic-0.007
Em-0.006
 importantly-0.006
 endings-0.006
 unnecessary-0.006
Token\n
Token ID198
Feature activation-2.07e-01

Pos logit contributions
 one+0.005
 moral+0.004
 end+0.004
 closer+0.004
 first+0.004
Neg logit contributions
emic-0.008
Em-0.007
 endings-0.007
 unnecessary-0.007
 importantly-0.007
Token"
Token ID1
Feature activation-1.39e-02

Pos logit contributions
emic+0.006
 endings+0.006
 unnecessary+0.006
osity+0.006
Em+0.006
Neg logit contributions
 one-0.004
 moral-0.003
 end-0.003
 closer-0.003
 lesson-0.003
TokenOf
Token ID5189
Feature activation+9.11e-01

Pos logit contributions
emic+0.000
Em+0.000
 endings+0.000
 unnecessary+0.000
 importantly+0.000
Neg logit contributions
 one-0.000
 end-0.000
 moral-0.000
ossible-0.000
 lesson-0.000
Token course
Token ID1781
Feature activation+8.31e-01

Pos logit contributions
 one+0.020
 moral+0.019
 lesson+0.017
 closer+0.017
ort+0.016
Neg logit contributions
 endings-0.019
 hey-0.019
 ow-0.019
 unnecessary-0.019
emic-0.018
Token you
Token ID345
Feature activation+3.86e-01

Pos logit contributions
 one+0.016
 moral+0.015
 end+0.013
 lesson+0.013
 closer+0.012
Neg logit contributions
emic-0.022
 unnecessary-0.022
 endings-0.021
 hey-0.020
osity-0.020
Token can
Token ID460
Feature activation+7.88e-01

Pos logit contributions
 one+0.012
 particular+0.010
 moral+0.010
 lesson+0.010
 moment+0.010
Neg logit contributions
emic-0.007
 endings-0.005
 hey-0.005
 unnecessary-0.005
osity-0.005
Token,"
Token ID553
Feature activation-2.10e-01

Pos logit contributions
 moral+0.017
 one+0.017
 lesson+0.016
 closer+0.015
 particular+0.015
Neg logit contributions
emic-0.016
und-0.015
 endings-0.015
 hey-0.015
ird-0.014
Token mom
Token ID1995
Feature activation-3.19e-01

Pos logit contributions
Em+0.004
 importantly+0.003
emic+0.003
 endings+0.003
 pian+0.003
Neg logit contributions
 one-0.005
 closer-0.005
ort-0.005
 end-0.005
 moral-0.005
Token said
Token ID531
Feature activation+1.33e-01

Pos logit contributions
Em+0.009
emic+0.009
 importantly+0.008
 pian+0.008
ird+0.008
Neg logit contributions
 one-0.006
 lesson-0.005
 moral-0.005
 end-0.005
 single-0.004
Token.
Token ID13
Feature activation+6.39e-01

Pos logit contributions
 one+0.002
 moral+0.002
 end+0.002
 closer+0.001
 first+0.001
Neg logit contributions
emic-0.004
 unnecessary-0.004
 endings-0.004
osity-0.004
Em-0.004
Token too
Token ID1165
Feature activation+1.06e+00

Pos logit contributions
emic+0.014
 unnecessary+0.014
 endings+0.014
osity+0.013
 importantly+0.013
Neg logit contributions
 one-0.010
 moral-0.008
 end-0.008
 closer-0.008
ossible-0.008
Token sm
Token ID895
Feature activation-4.30e-01

Pos logit contributions
 one+0.026
 moral+0.023
 end+0.023
 closer+0.021
ossible+0.021
Neg logit contributions
emic-0.023
 unnecessary-0.022
 endings-0.020
 ow-0.020
 importantly-0.020
Tokenelly
Token ID6148
Feature activation+1.93e-01

Pos logit contributions
emic+0.009
 unnecessary+0.009
 endings+0.008
osity+0.008
 Majesty+0.008
Neg logit contributions
 one-0.011
 end-0.009
 closer-0.009
 moral-0.009
ossible-0.009
Token and
Token ID290
Feature activation-5.83e-01

Pos logit contributions
 one+0.003
 moral+0.002
 end+0.002
 closer+0.002
 first+0.002
Neg logit contributions
emic-0.006
 unnecessary-0.006
 endings-0.006
osity-0.006
Em-0.006
Token scary
Token ID14343
Feature activation+1.19e-01

Pos logit contributions
emic+0.016
 endings+0.015
 unnecessary+0.015
osity+0.015
Em+0.014
Neg logit contributions
 one-0.013
 moral-0.010
 end-0.010
 closer-0.010
 first-0.010
Token."
Token ID526
Feature activation+6.48e-01

Pos logit contributions
 one+0.002
 end+0.001
 some+0.001
 first+0.001
 moral+0.001
Neg logit contributions
emic-0.004
Em-0.004
osity-0.004
 unnecessary-0.004
igor-0.004
Token Uncle
Token ID23169
Feature activation+1.07e+00

Pos logit contributions
 one+0.011
 moral+0.008
 end+0.008
 closer+0.007
 lesson+0.007
Neg logit contributions
emic-0.021
 unnecessary-0.020
 endings-0.020
osity-0.020
Em-0.019
Token Bob
Token ID5811
Feature activation+9.25e-01

Pos logit contributions
 one+0.021
 moral+0.017
 end+0.016
 closer+0.016
 lesson+0.015
Neg logit contributions
emic-0.030
 unnecessary-0.029
 endings-0.029
osity-0.028
Em-0.027
Token took
Token ID1718
Feature activation+2.71e-01

Pos logit contributions
 one+0.018
 moral+0.016
 end+0.015
 lesson+0.015
ossible+0.014
Neg logit contributions
emic-0.025
Em-0.023
 endings-0.022
 hey-0.022
und-0.022
Token off
Token ID572
Feature activation+5.09e-01

Pos logit contributions
 moral+0.005
 one+0.005
 end+0.004
 lesson+0.004
ossible+0.004
Neg logit contributions
 unnecessary-0.007
emic-0.007
 endings-0.006
 hey-0.006
Em-0.006
Token his
Token ID465
Feature activation+4.96e-01

Pos logit contributions
 moral+0.009
 one+0.009
 end+0.009
 End+0.008
 lesson+0.008
Neg logit contributions
Em-0.012
 pian-0.011
emic-0.011
 Sad-0.011
 hey-0.011
Token is
Token ID318
Feature activation-2.10e-01

Pos logit contributions
 one+0.008
 moral+0.006
 closer+0.006
 end+0.006
ossible+0.006
Neg logit contributions
emic-0.012
 endings-0.012
 unnecessary-0.012
osity-0.011
 situations-0.011
Token our
Token ID674
Feature activation+1.55e-01

Pos logit contributions
emic+0.006
 unnecessary+0.006
 endings+0.006
osity+0.006
Em+0.005
Neg logit contributions
 one-0.004
 moral-0.003
 end-0.003
 closer-0.003
 lesson-0.003
Token ball
Token ID2613
Feature activation+5.00e-02

Pos logit contributions
 one+0.003
 moral+0.003
ort+0.003
 closer+0.003
ossible+0.003
Neg logit contributions
emic-0.004
 endings-0.004
 unnecessary-0.004
osity-0.003
 situations-0.003
Token.
Token ID13
Feature activation+1.12e-01

Pos logit contributions
 one+0.001
 moral+0.001
 end+0.001
 closer+0.001
 lesson+0.001
Neg logit contributions
emic-0.001
 endings-0.001
 unnecessary-0.001
osity-0.001
 situations-0.001
Token We
Token ID775
Feature activation-1.61e-01

Pos logit contributions
 one+0.003
 moral+0.003
 end+0.002
 closer+0.002
 lesson+0.002
Neg logit contributions
emic-0.002
 unnecessary-0.002
 endings-0.002
osity-0.002
Em-0.002
Token were
Token ID547
Feature activation-3.10e-01

Pos logit contributions
emic+0.004
 unnecessary+0.004
 endings+0.004
osity+0.004
 hey+0.004
Neg logit contributions
 one-0.004
 moral-0.003
 end-0.003
 closer-0.003
 lesson-0.003
Token playing
Token ID2712
Feature activation+1.39e-02

Pos logit contributions
emic+0.008
 endings+0.008
 unnecessary+0.007
osity+0.007
Em+0.007
Neg logit contributions
 one-0.007
 moral-0.006
 end-0.006
 closer-0.006
 first-0.006
Token with
Token ID351
Feature activation+1.09e-01

Pos logit contributions
 one+0.000
 moral+0.000
 end+0.000
 lesson+0.000
 closer+0.000
Neg logit contributions
emic-0.000
 endings-0.000
 unnecessary-0.000
osity-0.000
 situations-0.000
Token it
Token ID340
Feature activation+3.39e-01

Pos logit contributions
 one+0.002
 moral+0.002
 end+0.001
ossible+0.001
 lesson+0.001
Neg logit contributions
 unnecessary-0.003
 endings-0.003
emic-0.003
 situations-0.003
osity-0.003
Token.
Token ID13
Feature activation+4.74e-01

Pos logit contributions
 one+0.006
 moral+0.005
 end+0.005
ossible+0.004
 closer+0.004
Neg logit contributions
emic-0.010
 endings-0.009
 unnecessary-0.009
osity-0.009
Em-0.009
Token Give
Token ID13786
Feature activation+5.34e-01

Pos logit contributions
 one+0.013
 moral+0.011
 end+0.010
 closer+0.010
ossible+0.010
Neg logit contributions
emic-0.010
 unnecessary-0.010
 endings-0.010
osity-0.009
Em-0.009
Token shouting
Token ID19642
Feature activation-1.70e-01

Pos logit contributions
emic+0.007
 unnecessary+0.007
 endings+0.007
osity+0.007
Em+0.007
Neg logit contributions
 one-0.005
 moral-0.004
 end-0.004
 closer-0.004
 lesson-0.004
Token with
Token ID351
Feature activation-2.41e-01

Pos logit contributions
emic+0.006
 unnecessary+0.005
 endings+0.005
osity+0.005
Em+0.005
Neg logit contributions
 one-0.003
 closer-0.002
 end-0.002
 moral-0.002
 some-0.002
Token joy
Token ID8716
Feature activation-7.79e-01

Pos logit contributions
emic+0.006
 unnecessary+0.006
 endings+0.006
osity+0.006
Em+0.005
Neg logit contributions
 one-0.005
 moral-0.005
 end-0.004
 closer-0.004
 lesson-0.004
Token.
Token ID13
Feature activation-5.10e-01

Pos logit contributions
 unnecessary+0.031
emic+0.031
 endings+0.030
Em+0.029
 Acc+0.029
Neg logit contributions
 one-0.010
 first-0.004
 for-0.004
 some-0.004
 closer-0.004
Token He
Token ID679
Feature activation+2.92e-01

Pos logit contributions
emic+0.016
 unnecessary+0.016
 endings+0.015
 situations+0.015
 hey+0.014
Neg logit contributions
 one-0.010
 moral-0.007
 lesson-0.007
 end-0.006
 first-0.006
Token had
Token ID550
Feature activation-4.14e-01

Pos logit contributions
 one+0.006
 moral+0.005
 end+0.005
 closer+0.004
 first+0.004
Neg logit contributions
emic-0.008
 unnecessary-0.008
 endings-0.008
osity-0.008
 situations-0.008
Token never
Token ID1239
Feature activation+4.41e-01

Pos logit contributions
emic+0.013
 endings+0.012
osity+0.012
 unnecessary+0.012
 situations+0.012
Neg logit contributions
 one-0.008
 some-0.006
 first-0.006
 moral-0.006
 closer-0.006
Token seen
Token ID1775
Feature activation-4.81e-01

Pos logit contributions
 one+0.009
 last+0.006
 end+0.006
 first+0.006
 closer+0.005
Neg logit contributions
emic-0.014
osity-0.014
und-0.014
 endings-0.014
umm-0.013
Token anything
Token ID1997
Feature activation+6.73e-03

Pos logit contributions
emic+0.016
 endings+0.015
osity+0.015
 unnecessary+0.015
und+0.015
Neg logit contributions
 one-0.009
 some-0.006
 end-0.006
 closer-0.006
 first-0.006
Token like
Token ID588
Feature activation-5.03e-01

Pos logit contributions
 one+0.000
 closer+0.000
 end+0.000
 first+0.000
 moral+0.000
Neg logit contributions
emic-0.000
 endings-0.000
osity-0.000
 unnecessary-0.000
und-0.000
Token it
Token ID340
Feature activation+8.07e-01

Pos logit contributions
emic+0.018
 unnecessary+0.017
 endings+0.017
osity+0.017
Em+0.016
Neg logit contributions
 one-0.007
 moral-0.005
 end-0.005
 closer-0.004
 first-0.004
Token"
Token ID1
Feature activation-3.51e-02

Pos logit contributions
emic+0.010
 unnecessary+0.010
 endings+0.010
osity+0.010
 Majesty+0.009
Neg logit contributions
 one-0.002
 moral-0.001
 end-0.001
 closer-0.001
"-0.001
TokenHello
Token ID15496
Feature activation-3.29e-01

Pos logit contributions
emic+0.001
 unnecessary+0.001
 endings+0.001
osity+0.001
 importantly+0.001
Neg logit contributions
 one-0.001
 moral-0.001
 end-0.001
 closer-0.001
 first-0.001
Token,
Token ID11
Feature activation-4.86e-01

Pos logit contributions
emic+0.012
 unnecessary+0.012
 endings+0.012
osity+0.011
 situations+0.011
Neg logit contributions
 one-0.005
 end-0.003
 closer-0.003
 moral-0.003
 first-0.003
Token Lily
Token ID20037
Feature activation-7.80e-01

Pos logit contributions
emic+0.013
 endings+0.013
 unnecessary+0.013
 situations+0.012
und+0.012
Neg logit contributions
 one-0.012
 end-0.010
 moral-0.010
 first-0.009
 closer-0.009
Token.
Token ID13
Feature activation-5.56e-01

Pos logit contributions
emic+0.025
 unnecessary+0.024
 endings+0.024
osity+0.024
 importantly+0.022
Neg logit contributions
 one-0.014
 moral-0.010
 end-0.010
 closer-0.009
 first-0.009
Token I
Token ID314
Feature activation-3.24e-01

Pos logit contributions
 unnecessary+0.013
emic+0.013
 endings+0.012
 situations+0.011
 importantly+0.011
Neg logit contributions
 one-0.015
 closer-0.012
 moral-0.012
 end-0.011
 first-0.011
Token am
Token ID716
Feature activation+1.44e-01

Pos logit contributions
emic+0.009
 unnecessary+0.008
 endings+0.008
osity+0.008
und+0.008
Neg logit contributions
 one-0.007
 moral-0.006
 end-0.005
 lesson-0.005
 closer-0.005
Token your
Token ID534
Feature activation+9.11e-01

Pos logit contributions
 one+0.003
 moral+0.002
 end+0.002
 closer+0.002
 first+0.002
Neg logit contributions
emic-0.004
 endings-0.004
 unnecessary-0.004
osity-0.004
Em-0.004
Token magic
Token ID5536
Feature activation+2.25e-01

Pos logit contributions
 one+0.018
 moral+0.013
 first+0.013
 end+0.013
 closer+0.012
Neg logit contributions
emic-0.029
 unnecessary-0.028
 endings-0.027
und-0.027
osity-0.027
Token wand
Token ID11569
Feature activation+5.06e-01

Pos logit contributions
 one+0.005
 moral+0.005
 end+0.004
 lesson+0.004
 closer+0.004
Neg logit contributions
emic-0.006
 unnecessary-0.005
 endings-0.005
osity-0.005
Em-0.005
Token.
Token ID13
Feature activation-3.99e-01

Pos logit contributions
 one+0.009
 moral+0.007
 end+0.007
 lesson+0.006
 closer+0.006
Neg logit contributions
emic-0.015
 endings-0.015
 unnecessary-0.014
osity-0.014
Em-0.014
Token around
Token ID1088
Feature activation-1.99e-01

Pos logit contributions
emic+0.019
 unnecessary+0.018
 endings+0.018
osity+0.018
Em+0.017
Neg logit contributions
 one-0.011
 moral-0.009
 end-0.008
 closer-0.008
 lesson-0.008
Token in
Token ID287
Feature activation-1.23e-01

Pos logit contributions
emic+0.007
 unnecessary+0.006
 endings+0.006
osity+0.006
 importantly+0.006
Neg logit contributions
 one-0.004
 end-0.002
 moral-0.002
 first-0.002
 closer-0.002
Token it
Token ID340
Feature activation+2.79e-01

Pos logit contributions
emic+0.005
 unnecessary+0.004
 endings+0.004
osity+0.004
Em+0.004
Neg logit contributions
 one-0.002
 end-0.001
 moral-0.001
 closer-0.001
 first-0.001
Token.
Token ID13
Feature activation-7.52e-01

Pos logit contributions
 one+0.004
 end+0.002
 first+0.002
 closer+0.002
 moral+0.002
Neg logit contributions
emic-0.010
 unnecessary-0.010
 endings-0.010
osity-0.010
 hey-0.009
Token She
Token ID1375
Feature activation-3.65e-01

Pos logit contributions
 unnecessary+0.021
emic+0.021
 endings+0.019
 situations+0.018
 hey+0.018
Neg logit contributions
 one-0.018
 moral-0.014
 some-0.013
 first-0.013
 end-0.013
Token looked
Token ID3114
Feature activation-5.55e-01

Pos logit contributions
emic+0.010
 unnecessary+0.010
 endings+0.010
osity+0.010
 situations+0.010
Neg logit contributions
 one-0.007
 moral-0.006
 end-0.006
 closer-0.005
 first-0.005
Token in
Token ID287
Feature activation+5.54e-01

Pos logit contributions
emic+0.016
 unnecessary+0.015
 endings+0.015
osity+0.015
Em+0.014
Neg logit contributions
 one-0.011
 moral-0.009
 end-0.008
 lesson-0.008
 closer-0.008
Token the
Token ID262
Feature activation-1.00e-01

Pos logit contributions
 one+0.008
 moral+0.006
 end+0.006
 closer+0.006
 lesson+0.006
Neg logit contributions
emic-0.018
 unnecessary-0.018
 endings-0.018
osity-0.017
Em-0.017
Token mirror
Token ID10162
Feature activation-4.31e-01

Pos logit contributions
emic+0.003
 endings+0.003
osity+0.002
 unnecessary+0.002
cess+0.002
Neg logit contributions
 one-0.003
 end-0.002
 moral-0.002
 last-0.002
 closer-0.002
Token and
Token ID290
Feature activation+3.54e-01

Pos logit contributions
 unnecessary+0.018
emic+0.017
 endings+0.017
osity+0.017
 situations+0.016
Neg logit contributions
 one-0.005
 first-0.002
 end-0.002
 some-0.002
 last-0.002
Token smiled
Token ID13541
Feature activation-1.41e-01

Pos logit contributions
 one+0.006
 closer+0.004
 first+0.003
 end+0.003
 some+0.003
Neg logit contributions
 unnecessary-0.012
emic-0.012
 endings-0.012
osity-0.012
 situations-0.012
Token a
Token ID257
Feature activation-3.56e-01

Pos logit contributions
 one+0.005
 moral+0.004
 end+0.004
 closer+0.004
 lesson+0.004
Neg logit contributions
emic-0.010
 endings-0.010
 unnecessary-0.010
Em-0.010
 importantly-0.010
Token girl
Token ID2576
Feature activation+1.30e-02

Pos logit contributions
emic+0.008
 endings+0.008
 unnecessary+0.008
osity+0.008
 situations+0.007
Neg logit contributions
 one-0.009
 moral-0.008
 end-0.008
 closer-0.007
 first-0.007
Token called
Token ID1444
Feature activation-7.13e-01

Pos logit contributions
 one+0.000
 end+0.000
 moral+0.000
 closer+0.000
 first+0.000
Neg logit contributions
 unnecessary-0.000
emic-0.000
 endings-0.000
osity-0.000
 situations-0.000
Token Mary
Token ID5335
Feature activation-1.68e-01

Pos logit contributions
 unnecessary+0.023
emic+0.022
 endings+0.022
osity+0.021
ode+0.021
Neg logit contributions
 one-0.015
 first-0.011
 some-0.010
 last-0.009
 end-0.009
Token.
Token ID13
Feature activation-6.51e-01

Pos logit contributions
emic+0.005
 unnecessary+0.005
 endings+0.005
osity+0.005
Em+0.005
Neg logit contributions
 one-0.003
 moral-0.002
 end-0.002
 closer-0.002
 lesson-0.002
Token Mary
Token ID5335
Feature activation-3.65e-01

Pos logit contributions
emic+0.014
 endings+0.014
osity+0.013
 unnecessary+0.013
und+0.012
Neg logit contributions
 one-0.016
 moral-0.015
 end-0.014
 closer-0.014
 lesson-0.013
Token loved
Token ID6151
Feature activation-4.19e-01

Pos logit contributions
emic+0.010
 endings+0.010
 unnecessary+0.009
osity+0.009
Em+0.009
Neg logit contributions
 one-0.007
 moral-0.006
 end-0.006
 closer-0.006
 lesson-0.006
Token to
Token ID284
Feature activation+2.26e-01

Pos logit contributions
emic+0.015
 unnecessary+0.014
osity+0.014
igor+0.014
 endings+0.014
Neg logit contributions
 one-0.007
 some-0.005
 first-0.004
 moral-0.004
 lesson-0.004
Token play
Token ID711
Feature activation+3.64e-01

Pos logit contributions
 one+0.005
 moral+0.004
 end+0.004
 first+0.003
 closer+0.003
Neg logit contributions
emic-0.006
 unnecessary-0.006
 endings-0.006
osity-0.006
 importantly-0.006
Token in
Token ID287
Feature activation+8.14e-01

Pos logit contributions
 one+0.007
 some+0.005
 end+0.005
 moral+0.004
 first+0.004
Neg logit contributions
emic-0.012
osity-0.012
 unnecessary-0.012
 endings-0.011
 importantly-0.011
Token her
Token ID607
Feature activation-8.27e-01

Pos logit contributions
 one+0.013
 moral+0.010
 end+0.010
 closer+0.009
 first+0.009
Neg logit contributions
emic-0.026
 unnecessary-0.025
 endings-0.025
osity-0.025
Em-0.024
Token yours
Token ID12431
Feature activation-5.27e-01

Pos logit contributions
emic+0.011
 endings+0.010
 unnecessary+0.010
osity+0.010
Em+0.010
Neg logit contributions
 one-0.008
 moral-0.006
 end-0.006
 lesson-0.006
 closer-0.006
Token,
Token ID11
Feature activation-5.65e-01

Pos logit contributions
emic+0.016
 unnecessary+0.016
 endings+0.016
osity+0.015
Em+0.015
Neg logit contributions
 one-0.009
 moral-0.007
 end-0.007
 closer-0.006
 lesson-0.006
Token this
Token ID428
Feature activation-6.31e-01

Pos logit contributions
emic+0.020
Em+0.018
 endings+0.018
osity+0.018
 unnecessary+0.018
Neg logit contributions
 one-0.010
 moral-0.007
 end-0.006
 lesson-0.006
 first-0.006
Token is
Token ID318
Feature activation-7.23e-01

Pos logit contributions
 endings+0.016
emic+0.016
 unnecessary+0.016
 situations+0.015
 pian+0.015
Neg logit contributions
 one-0.012
 moral-0.011
 closer-0.010
 end-0.010
ossible-0.010
Token for
Token ID329
Feature activation-2.57e-01

Pos logit contributions
 unnecessary+0.019
emic+0.018
osity+0.017
 endings+0.017
 importantly+0.017
Neg logit contributions
 one-0.014
 moral-0.012
ossible-0.012
 end-0.012
ort-0.011
Token someone
Token ID2130
Feature activation-2.10e+00

Pos logit contributions
emic+0.007
osity+0.006
 endings+0.006
 unnecessary+0.006
Em+0.006
Neg logit contributions
 one-0.006
 moral-0.005
 end-0.005
 first-0.005
 some-0.005
Token else
Token ID2073
Feature activation-5.47e-01

Pos logit contributions
emic+0.055
osity+0.054
 endings+0.054
 unnecessary+0.051
und+0.051
Neg logit contributions
 one-0.048
 closer-0.037
 first-0.036
 moral-0.036
 end-0.035
Token.
Token ID13
Feature activation-5.05e-01

Pos logit contributions
emic+0.016
 unnecessary+0.015
 endings+0.015
osity+0.015
Em+0.014
Neg logit contributions
 one-0.011
 moral-0.009
 end-0.008
 closer-0.008
 lesson-0.008
Token You
Token ID921
Feature activation-7.97e-01

Pos logit contributions
emic+0.013
 unnecessary+0.012
 endings+0.012
 ow+0.011
 canvas+0.011
Neg logit contributions
 one-0.012
 moral-0.010
 some-0.009
 first-0.009
ossible-0.009
Token have
Token ID423
Feature activation-7.35e-01

Pos logit contributions
emic+0.020
 unnecessary+0.019
 endings+0.019
osity+0.019
Em+0.018
Neg logit contributions
 one-0.018
 moral-0.015
 end-0.015
 closer-0.014
 lesson-0.014
Token to
Token ID284
Feature activation-9.15e-01

Pos logit contributions
emic+0.022
 endings+0.021
osity+0.021
 unnecessary+0.021
Em+0.020
Neg logit contributions
 one-0.014
 moral-0.011
 end-0.010
 lesson-0.010
 first-0.010
Token perform
Token ID1620
Feature activation+8.19e-01

Pos logit contributions
 one+0.005
 moral+0.004
 end+0.004
 closer+0.004
 lesson+0.004
Neg logit contributions
emic-0.005
 unnecessary-0.004
 endings-0.004
osity-0.004
Em-0.004
Token for
Token ID329
Feature activation+2.23e-01

Pos logit contributions
 one+0.014
 moral+0.011
 end+0.011
 closer+0.010
 lesson+0.010
Neg logit contributions
emic-0.025
 unnecessary-0.024
 endings-0.024
osity-0.024
Em-0.023
Token her
Token ID607
Feature activation-3.86e-01

Pos logit contributions
 one+0.004
 end+0.002
 some+0.002
 first+0.002
 moral+0.002
Neg logit contributions
emic-0.008
osity-0.008
und-0.007
igor-0.007
ode-0.007
Token family
Token ID1641
Feature activation-1.31e+00

Pos logit contributions
emic+0.011
 unnecessary+0.010
osity+0.010
 endings+0.010
und+0.010
Neg logit contributions
 one-0.008
 moral-0.007
 end-0.007
 first-0.006
 lesson-0.006
Token and
Token ID290
Feature activation-4.14e-01

Pos logit contributions
emic+0.044
 unnecessary+0.044
 endings+0.043
osity+0.042
 hey+0.041
Neg logit contributions
 one-0.020
 moral-0.014
 end-0.014
 first-0.013
 closer-0.012
Token friends
Token ID2460
Feature activation-1.85e+00

Pos logit contributions
emic+0.013
 unnecessary+0.012
und+0.012
igor+0.012
osity+0.012
Neg logit contributions
 one-0.009
 moral-0.007
 end-0.007
 first-0.006
 some-0.006
Token.
Token ID13
Feature activation+6.75e-01

Pos logit contributions
emic+0.058
 endings+0.056
 unnecessary+0.056
osity+0.055
Em+0.053
Neg logit contributions
 one-0.030
 moral-0.024
 end-0.022
 closer-0.021
 lesson-0.021
Token She
Token ID1375
Feature activation-5.74e-01

Pos logit contributions
 one+0.014
 moral+0.011
 end+0.011
 closer+0.010
 lesson+0.010
Neg logit contributions
emic-0.019
 unnecessary-0.018
 endings-0.018
osity-0.018
Em-0.017
Token has
Token ID468
Feature activation+7.35e-01

Pos logit contributions
emic+0.014
 endings+0.013
 unnecessary+0.013
 pian+0.013
osity+0.013
Neg logit contributions
 one-0.013
 moral-0.011
 closer-0.011
 end-0.010
 lesson-0.010
Token a
Token ID257
Feature activation-2.46e-02

Pos logit contributions
 one+0.009
 some+0.004
 first+0.003
 moral+0.003
 lesson+0.002
Neg logit contributions
osity-0.031
emic-0.030
 unnecessary-0.030
 hey-0.029
igor-0.029
Token big
Token ID1263
Feature activation-4.16e-01

Pos logit contributions
emic+0.001
 unnecessary+0.001
 endings+0.001
osity+0.001
Em+0.001
Neg logit contributions
 one-0.000
 moral-0.000
 end-0.000
 closer-0.000
 lesson-0.000
Token inside
Token ID2641
Feature activation+5.91e-02

Pos logit contributions
 moral+0.001
 one+0.001
 lesson+0.001
 end+0.001
oral+0.001
Neg logit contributions
emic-0.001
 unnecessary-0.001
 hey-0.001
 endings-0.001
Em-0.001
Token,
Token ID11
Feature activation-4.16e-01

Pos logit contributions
 one+0.001
 moral+0.001
 end+0.001
 lesson+0.001
 closer+0.001
Neg logit contributions
emic-0.002
 endings-0.002
 unnecessary-0.002
osity-0.002
Em-0.002
Token Amy
Token ID14235
Feature activation-6.24e-01

Pos logit contributions
emic+0.012
 unnecessary+0.011
 endings+0.011
osity+0.011
Em+0.011
Neg logit contributions
 one-0.008
 moral-0.007
 end-0.006
 closer-0.006
 lesson-0.006
Token saw
Token ID2497
Feature activation-1.27e+00

Pos logit contributions
emic+0.017
 endings+0.016
 unnecessary+0.016
osity+0.016
Em+0.015
Neg logit contributions
 one-0.012
 moral-0.011
 end-0.010
 lesson-0.010
 closer-0.009
Token snow
Token ID6729
Feature activation+8.70e-01

Pos logit contributions
emic+0.043
 unnecessary+0.041
osity+0.040
 endings+0.040
Em+0.038
Neg logit contributions
 one-0.021
 end-0.015
 moral-0.015
 closer-0.014
 some-0.013
Tokenfl
Token ID2704
Feature activation-1.83e+00

Pos logit contributions
 one+0.018
 end+0.014
 first+0.014
 closer+0.013
 moral+0.013
Neg logit contributions
emic-0.025
 unnecessary-0.025
 endings-0.024
Em-0.023
osity-0.023
Tokenakes
Token ID1124
Feature activation+7.37e-01

Pos logit contributions
 unnecessary+0.048
 endings+0.044
osity+0.043
 Acc+0.043
 situations+0.043
Neg logit contributions
 one-0.042
ossible-0.032
 last-0.032
 once-0.032
 some-0.032
Token falling
Token ID7463
Feature activation-7.59e-01

Pos logit contributions
 one+0.014
 moral+0.011
 end+0.011
 closer+0.010
 first+0.010
Neg logit contributions
emic-0.022
 unnecessary-0.021
 endings-0.021
osity-0.021
Em-0.020
Token from
Token ID422
Feature activation+4.53e-01

Pos logit contributions
emic+0.024
 endings+0.023
 unnecessary+0.023
osity+0.022
Em+0.022
Neg logit contributions
 one-0.014
 closer-0.011
 end-0.010
 first-0.010
 last-0.010
Token the
Token ID262
Feature activation-3.98e-01

Pos logit contributions
 one+0.008
 moral+0.006
 end+0.006
 closer+0.005
 first+0.005
Neg logit contributions
emic-0.014
 unnecessary-0.014
 endings-0.014
osity-0.013
Em-0.013
Token sky
Token ID6766
Feature activation+2.36e-01

Pos logit contributions
emic+0.011
osity+0.011
 endings+0.011
 unnecessary+0.011
 importantly+0.010
Neg logit contributions
 one-0.009
 end-0.007
 moral-0.007
 first-0.007
 last-0.007
Token stayed
Token ID9658
Feature activation-3.62e-01

Pos logit contributions
emic+0.015
 unnecessary+0.014
 endings+0.014
osity+0.014
Em+0.014
Neg logit contributions
 one-0.010
 moral-0.008
 end-0.007
 closer-0.007
 lesson-0.007
Token obedient
Token ID41652
Feature activation-3.62e-01

Pos logit contributions
emic+0.009
 unnecessary+0.009
 endings+0.009
osity+0.009
 importantly+0.009
Neg logit contributions
 one-0.007
 moral-0.006
 lesson-0.006
 end-0.006
ossible-0.006
Token and
Token ID290
Feature activation-6.91e-02

Pos logit contributions
emic+0.014
 unnecessary+0.014
osity+0.013
 endings+0.013
Em+0.013
Neg logit contributions
 one-0.004
 first-0.002
 end-0.002
 closer-0.002
 moral-0.002
Token kept
Token ID4030
Feature activation-2.22e-01

Pos logit contributions
emic+0.002
 endings+0.002
osity+0.002
 unnecessary+0.002
Em+0.002
Neg logit contributions
 one-0.001
 moral-0.001
 closer-0.001
 end-0.001
 lesson-0.001
Token the
Token ID262
Feature activation-1.74e-01

Pos logit contributions
emic+0.006
 unnecessary+0.006
 endings+0.006
osity+0.005
Em+0.005
Neg logit contributions
 one-0.005
 moral-0.004
 end-0.004
 lesson-0.004
ossible-0.004
Token girl
Token ID2576
Feature activation-2.46e+00

Pos logit contributions
emic+0.003
 unnecessary+0.003
 endings+0.003
osity+0.003
Em+0.003
Neg logit contributions
 one-0.005
 moral-0.004
ossible-0.004
 end-0.004
 closer-0.004
Token calm
Token ID9480
Feature activation-3.15e-01

Pos logit contributions
emic+0.066
 unnecessary+0.063
 endings+0.063
osity+0.062
Em+0.059
Neg logit contributions
 one-0.052
 moral-0.043
 end-0.041
 closer-0.040
 lesson-0.039
Token.
Token ID13
Feature activation-3.76e-01

Pos logit contributions
emic+0.012
 unnecessary+0.011
osity+0.011
Em+0.011
 endings+0.011
Neg logit contributions
 one-0.005
 end-0.003
 closer-0.003
 first-0.003
 last-0.002
Token
Token ID220
Feature activation-9.90e-01

Pos logit contributions
emic+0.013
 unnecessary+0.012
 endings+0.012
osity+0.012
 pian+0.011
Neg logit contributions
 one-0.006
 moral-0.004
 end-0.004
 closer-0.004
 lesson-0.004
Token
Token ID198
Feature activation+2.85e-01

Pos logit contributions
 endings+0.035
 unnecessary+0.035
emic+0.035
osity+0.034
 pian+0.034
Neg logit contributions
 one-0.012
 moral-0.008
 first-0.007
 end-0.007
ossible-0.006
Token
Token ID198
Feature activation-3.64e-01

Pos logit contributions
 one+0.005
 end+0.004
 moral+0.004
 closer+0.004
 lesson+0.004
Neg logit contributions
emic-0.008
Em-0.008
 endings-0.008
 unnecessary-0.008
und-0.008
Token hands
Token ID2832
Feature activation+6.69e-01

Pos logit contributions
emic+0.010
osity+0.010
 endings+0.010
 unnecessary+0.010
Em+0.009
Neg logit contributions
 one-0.010
 moral-0.009
 end-0.008
 closer-0.008
 first-0.008
Token and
Token ID290
Feature activation+4.29e-01

Pos logit contributions
 one+0.011
 moral+0.009
 end+0.008
 lesson+0.008
 closer+0.007
Neg logit contributions
emic-0.021
 endings-0.020
 unnecessary-0.020
osity-0.020
Em-0.019
Token laughing
Token ID14376
Feature activation+4.06e-02

Pos logit contributions
 one+0.009
 moral+0.007
 closer+0.007
 end+0.006
 first+0.006
Neg logit contributions
emic-0.013
 unnecessary-0.012
 endings-0.012
osity-0.012
Em-0.011
Token loudly
Token ID23112
Feature activation+5.62e-01

Pos logit contributions
 one+0.001
 moral+0.000
 end+0.000
 closer+0.000
 lesson+0.000
Neg logit contributions
emic-0.001
 unnecessary-0.001
 endings-0.001
osity-0.001
Em-0.001
Token.
Token ID13
Feature activation+9.14e-02

Pos logit contributions
 one+0.008
 end+0.005
 moral+0.005
 closer+0.005
 first+0.005
Neg logit contributions
emic-0.021
 unnecessary-0.020
osity-0.019
 endings-0.019
Em-0.019
Token Everyone
Token ID11075
Feature activation-1.74e+00

Pos logit contributions
 one+0.002
 moral+0.001
ossible+0.001
 end+0.001
 first+0.001
Neg logit contributions
emic-0.003
 unnecessary-0.003
 endings-0.003
osity-0.003
 hey-0.003
Token around
Token ID1088
Feature activation+8.01e-01

Pos logit contributions
emic+0.053
 unnecessary+0.053
 endings+0.052
osity+0.050
 situations+0.048
Neg logit contributions
 one-0.033
 moral-0.025
 end-0.025
 closer-0.024
 first-0.023
Token him
Token ID683
Feature activation+4.07e-01

Pos logit contributions
 one+0.013
 moral+0.009
 end+0.009
 closer+0.008
 first+0.008
Neg logit contributions
emic-0.026
 unnecessary-0.025
 endings-0.025
osity-0.025
Em-0.024
Token smiled
Token ID13541
Feature activation+3.12e-01

Pos logit contributions
 one+0.008
 moral+0.006
 end+0.006
 closer+0.006
 first+0.006
Neg logit contributions
emic-0.012
 unnecessary-0.011
 endings-0.011
osity-0.011
Em-0.011
Token and
Token ID290
Feature activation+3.91e-01

Pos logit contributions
 one+0.004
 moral+0.003
 end+0.003
 closer+0.002
 first+0.002
Neg logit contributions
emic-0.011
 unnecessary-0.011
 endings-0.011
osity-0.011
Em-0.010
Token applauded
Token ID38911
Feature activation+4.24e-01

Pos logit contributions
 one+0.007
 closer+0.005
 moral+0.005
 end+0.005
 first+0.005
Neg logit contributions
emic-0.012
 unnecessary-0.012
 endings-0.012
osity-0.011
 situations-0.011