TOP ACTIVATIONS
MAX = 3.537

funny and said , " Those are crazy pants , Billy
you , mom my ! These w affles are delicious !"
ll ie explained that when two ot ters love each other
and her name was Ab ig ail . Ab ig ail
's mom said , " Those are nice pants !" Tim
" Because ," replied the polar bear , " people
and humble girl named Ab ig ail . She loved to
child 's name was Ab ig ail and she was three
, they made lots of purple v ests together . Emma
bed . As he was do zing off , he heard
are not tr unks . Those are my pants . Please
said : " Hello Ab ig ail ! I 'm glad
cub said yes . The two of them went for a
youngest girl was named Ab ig ail and she was 3
year old girl named Ab ig ail . She loved to
" We don 't have any w affles left . We
not go in there ," the bear warned Wil bur .
wrong ?" The two bears stopped , looked at
shots and then told Ab ig ail to take some .
!" So L izz ie and the motor got

MIN ACTIVATIONS
MIN = -2.978

her mom came in and handed her the video . Amy
big sign that said " Sold Out ". Lily 's mom
let out a big sn arl and the mouse knew he
a p uddle . Her fancy dress was all wet and
found a dusty key and handed it to Jane . Jane
, he accidentally knocked over Mr . Smith 's drink that
a big arm . It grabs the dump ster . It
fill it with your own adventures ". Jane looked around ,
The bug found a nice tree . The tree was
butterfly back in its natural habitat . He waved goodbye to
soft t eddy bear named Mr . Bear . She loved
, a soft bear called Mr . Bear . One day
friend , a potato named Mr . Sp ud . They
wrong . She sees the broken puzzle and the upset children
bird was very happy and thanked Amy and Ben . It
to the closet . She sees the mess and the tears
Wow , that is an impressive puzzle . Can I help
He smiled and showed everybody his puzzle . Everyone praised John
to jump , but they still cannot get it .
Finally they found the best vacation ever ! It was a

INTERVAL 1.908 - 3.537
CONTAINS 0.098%

chew on books and papers . He can ruin my story
when she che wed her nails , she felt something hard
ir ping loudly ! El isa smiled and started talking to
forest . It saw lots of trees , flowers , and
careful , little girl . The forest can be dangerous ."

INTERVAL 0.279 - 1.908
CONTAINS 31.840%

. L ily feels happy . She likes to
See , I promised ice cream ." Amy was
, " Yes , you do , sweet heart . Exercise
? You get to use a real telephone today !"
happy friends , and no matter where they went , they

INTERVAL -1.350 - 0.279
CONTAINS 66.788%

" O st ric hes do not like people
cookies and playing . He was a happy dog . <|endoftext|>
She was happy that Max listened to her .
It was very gray . They looked at each other .
. They felt so happy and excited to take it out

INTERVAL -2.978 - -1.350
CONTAINS 1.274%

didn 't want to because he was too tired .
stay still , so it held tight to the grass .
. The ball popped and def lated . The dog ran
sad . Suddenly , a kind butterfly flew by and said
marriage . You be the husband and I be the wife
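
NOTE ON THE INTERVALS
The four intervals above look like equal-width bins spanning MIN to MAX: each is about 1.629 wide, and the four CONTAINS values sum to 100%. The sketch below reconstructs the edges under that assumption. The equal-width binning is inferred from the numbers, not confirmed by the source, and the helper names equal_width_edges and bin_shares are hypothetical.

```python
def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 evenly spaced edges from lo to hi (hi kept exact)."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins)] + [hi]


def bin_shares(values, edges):
    """Percentage of values in each bin; the last bin includes its upper edge."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v <= edges[-1]):
                counts[i] += 1
                break
    return [100.0 * c / len(values) for c in counts]


# MIN and MAX as printed above (already rounded to three decimals).
edges = equal_width_edges(-2.978, 3.537, 4)

# CONTAINS values in the order listed above (highest interval first).
reported = [0.098, 31.840, 66.788, 1.274]

print([round(e, 3) for e in edges])
print(round(sum(reported), 3))
```

The reconstructed inner edges agree with the printed ones to within about 0.001. That gap is what you would expect if the dashboard computes edges from unrounded MIN and MAX values. bin_shares would reproduce the CONTAINS column only if given the full list of activations, which this export does not include.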
Token funny
Token ID8258
Feature activation+5.35e-01

Pos logit contributions
 mil+0.023
 marsh+0.023
 particular+0.022
 pl+0.020
 was+0.019
Neg logit contributions
 unnecessary-0.043
 fault-0.042
 weaknesses-0.042
ately-0.041
arl-0.041
Token and
Token ID290
Feature activation+4.18e-01

Pos logit contributions
 mil+0.007
 was+0.006
 marsh+0.006
 particular+0.006
 pl+0.006
Neg logit contributions
 unnecessary-0.024
arl-0.024
ately-0.023
 weaknesses-0.023
 fault-0.023
Token said
Token ID531
Feature activation+7.09e-01

Pos logit contributions
 mil+0.010
 marsh+0.010
 particular+0.009
 pl+0.009
 pasture+0.008
Neg logit contributions
 unnecessary-0.013
 weaknesses-0.013
ately-0.013
 fault-0.013
arl-0.013
Token,
Token ID11
Feature activation+2.36e-01

Pos logit contributions
 mil+0.003
 marsh+0.003
 particular+0.002
 was+0.002
 pl+0.002
Neg logit contributions
 unnecessary-0.037
 fault-0.037
ately-0.037
arl-0.037
 weaknesses-0.036
Token "
Token ID366
Feature activation+8.68e-01

Pos logit contributions
 mil+0.004
 marsh+0.004
 particular+0.003
 was+0.003
 pl+0.003
Neg logit contributions
 unnecessary-0.010
 fault-0.010
 weaknesses-0.009
ately-0.009
arl-0.009
TokenThose
Token ID9627
Feature activation+3.54e+00

Pos logit contributions
 mil+0.027
 marsh+0.027
 particular+0.026
 pl+0.025
 was+0.025
Neg logit contributions
 unnecessary-0.023
 weaknesses-0.022
 fault-0.022
ately-0.022
arl-0.021
Token are
Token ID389
Feature activation+1.47e+00

Pos logit contributions
 mil+0.088
 marsh+0.086
 particular+0.086
 pl+0.079
 was+0.078
Neg logit contributions
 unnecessary-0.114
 weaknesses-0.110
 fault-0.109
arl-0.107
ately-0.107
Token crazy
Token ID7165
Feature activation+1.24e+00

Pos logit contributions
 mil+0.038
 particular+0.037
 marsh+0.037
 pl+0.035
 was+0.034
Neg logit contributions
 unnecessary-0.044
 ours-0.043
 weaknesses-0.043
 fault-0.043
arl-0.042
Token pants
Token ID12581
Feature activation+7.14e-01

Pos logit contributions
 mil+0.023
 marsh+0.022
 particular+0.022
 pl+0.020
 was+0.020
Neg logit contributions
 unnecessary-0.047
 fault-0.046
 weaknesses-0.046
ately-0.046
arl-0.046
Token,
Token ID11
Feature activation+6.85e-01

Pos logit contributions
 mil+0.013
 particular+0.013
 marsh+0.013
 pl+0.012
M+0.011
Neg logit contributions
 fault-0.027
 unnecessary-0.027
arl-0.026
ately-0.026
 weaknesses-0.026
Token Billy
Token ID15890
Feature activation-3.16e-01

Pos logit contributions
 mil+0.018
 marsh+0.017
 particular+0.017
 pl+0.016
 was+0.016
Neg logit contributions
 unnecessary-0.021
 fault-0.021
ately-0.021
 weaknesses-0.021
arl-0.021
Token you
Token ID345
Feature activation+9.34e-01

Pos logit contributions
arl+0.006
ately+0.006
 unnecessary+0.006
 weaknesses+0.006
asing+0.005
Neg logit contributions
 marsh-0.016
 mil-0.016
 particular-0.016
 pl-0.015
 was-0.015
Token,
Token ID11
Feature activation-3.26e-01

Pos logit contributions
 particular+0.013
 mil+0.013
 marsh+0.013
 pl+0.012
 gra+0.012
Neg logit contributions
arl-0.040
ately-0.038
 fault-0.038
der-0.037
 unnecessary-0.036
Token mom
Token ID1995
Feature activation+1.73e-01

Pos logit contributions
arl+0.010
 fault+0.010
der+0.010
ately+0.010
 ours+0.010
Neg logit contributions
 mil-0.008
 marsh-0.008
 particular-0.007
 pl-0.007
 was-0.007
Tokenmy
Token ID1820
Feature activation+2.36e-01

Pos logit contributions
 marsh+0.005
 mil+0.005
 particular+0.004
 pl+0.004
 gra+0.004
Neg logit contributions
 fault-0.005
arl-0.005
icated-0.005
ately-0.005
 weaknesses-0.005
Token!
Token ID0
Feature activation-1.00e-01

Pos logit contributions
 particular+0.004
 mil+0.004
 marsh+0.004
 pl+0.004
 was+0.003
Neg logit contributions
 fault-0.009
arl-0.009
 unnecessary-0.009
ately-0.009
 weaknesses-0.009
Token These
Token ID2312
Feature activation+3.17e+00

Pos logit contributions
arl+0.002
 unnecessary+0.002
 fault+0.002
ately+0.002
 weaknesses+0.002
Neg logit contributions
 mil-0.004
 marsh-0.004
 particular-0.004
 pl-0.003
 was-0.003
Token w
Token ID266
Feature activation+2.22e-01

Pos logit contributions
 particular+0.069
 tow+0.067
 mil+0.064
 named+0.060
 marsh+0.058
Neg logit contributions
 unnecessary-0.102
arl-0.097
 weaknesses-0.097
der-0.095
icated-0.094
Tokenaffles
Token ID48501
Feature activation-8.03e-02

Pos logit contributions
 particular+0.007
 mil+0.007
 was+0.007
 marsh+0.007
 pl+0.007
Neg logit contributions
arl-0.005
 unnecessary-0.005
 fault-0.005
 weaknesses-0.005
ately-0.005
Token are
Token ID389
Feature activation+4.31e-01

Pos logit contributions
 fault+0.002
 ours+0.002
arl+0.002
ately+0.002
 unnecessary+0.002
Neg logit contributions
 particular-0.002
 mil-0.002
 marsh-0.002
 tow-0.002
 pl-0.002
Token delicious
Token ID12625
Feature activation-3.86e-01

Pos logit contributions
 pasture+0.011
bird+0.011
 particular+0.011
mun+0.011
c+0.011
Neg logit contributions
 ours-0.013
 yours-0.011
icated-0.010
 unnecessary-0.010
 fault-0.009
Token!"
Token ID2474
Feature activation+3.37e-01

Pos logit contributions
 fault+0.012
 unnecessary+0.012
ately+0.011
arl+0.011
der+0.011
Neg logit contributions
 particular-0.010
 mil-0.010
 marsh-0.009
 pl-0.009
 was-0.009
Tokenll
Token ID297
Feature activation-1.67e+00

Pos logit contributions
 unnecessary+0.002
ately+0.002
 fault+0.002
arl+0.002
 weaknesses+0.002
Neg logit contributions
 mil-0.001
 marsh-0.001
 particular-0.001
 pl-0.001
 was-0.001
Tokenie
Token ID494
Feature activation+1.19e-01

Pos logit contributions
 unnecessary+0.066
 fault+0.064
 weaknesses+0.064
ately+0.064
arl+0.064
Neg logit contributions
 mil-0.029
 marsh-0.028
 particular-0.028
 was-0.027
 pl-0.026
Token explained
Token ID4893
Feature activation+5.00e-01

Pos logit contributions
 marsh+0.003
 mil+0.003
 particular+0.003
 pl+0.002
 gra+0.002
Neg logit contributions
ately-0.004
der-0.004
 weaknesses-0.004
 fault-0.004
 unnecessary-0.003
Token that
Token ID326
Feature activation+7.93e-01

Pos logit contributions
 mil+0.007
 marsh+0.007
 particular+0.007
 was+0.006
 pl+0.006
Neg logit contributions
 unnecessary-0.021
 fault-0.021
ately-0.021
arl-0.021
 weaknesses-0.021
Token when
Token ID618
Feature activation+1.25e+00

Pos logit contributions
 particular+0.020
 mil+0.019
 marsh+0.019
 pl+0.019
M+0.018
Neg logit contributions
 weaknesses-0.022
 fault-0.022
der-0.022
 unnecessary-0.021
icated-0.021
Token two
Token ID734
Feature activation+3.16e+00

Pos logit contributions
 mil+0.027
 marsh+0.026
 particular+0.026
 was+0.025
 pl+0.024
Neg logit contributions
 unnecessary-0.044
 fault-0.043
arl-0.043
ately-0.043
 weaknesses-0.043
Token ot
Token ID30972
Feature activation+2.27e-01

Pos logit contributions
 mil+0.067
 marsh+0.066
 particular+0.066
 pl+0.060
M+0.060
Neg logit contributions
 unnecessary-0.110
 weaknesses-0.108
ately-0.108
arl-0.103
 fault-0.103
Tokenters
Token ID1010
Feature activation+1.91e+00

Pos logit contributions
 mil+0.007
 was+0.007
 pl+0.007
 marsh+0.006
 b+0.006
Neg logit contributions
 fault-0.006
 unnecessary-0.006
 weaknesses-0.006
 ours-0.005
arl-0.005
Token love
Token ID1842
Feature activation+3.39e-01

Pos logit contributions
 mil+0.040
 marsh+0.039
 particular+0.038
 pl+0.034
M+0.034
Neg logit contributions
 unnecessary-0.067
ately-0.066
 weaknesses-0.066
 fault-0.066
arl-0.064
Token each
Token ID1123
Feature activation+1.96e+00

Pos logit contributions
 mil+0.007
 marsh+0.007
 particular+0.007
 pl+0.007
 was+0.007
Neg logit contributions
 unnecessary-0.012
 fault-0.012
ately-0.011
 weaknesses-0.011
arl-0.011
Token other
Token ID584
Feature activation+2.15e-01

Pos logit contributions
 marsh+0.040
 mil+0.040
 particular+0.038
 pl+0.036
 was+0.035
Neg logit contributions
 unnecessary-0.071
 fault-0.069
ately-0.068
 weaknesses-0.068
arl-0.066
Token and
Token ID290
Feature activation+4.29e-01

Pos logit contributions
 unnecessary+0.006
 fault+0.006
 weaknesses+0.006
arl+0.006
ately+0.006
Neg logit contributions
 mil-0.002
 marsh-0.002
 particular-0.002
 was-0.002
 pl-0.001
Token her
Token ID607
Feature activation-1.74e-01

Pos logit contributions
 mil+0.010
 marsh+0.010
 pasture+0.009
 particular+0.008
 pl+0.008
Neg logit contributions
 weaknesses-0.013
nam-0.013
ately-0.013
 unnecessary-0.012
arl-0.012
Token name
Token ID1438
Feature activation-3.42e-01

Pos logit contributions
 unnecessary+0.006
ately+0.006
 fault+0.006
 weaknesses+0.006
arl+0.006
Neg logit contributions
 mil-0.004
 marsh-0.004
 particular-0.004
 was-0.003
 pl-0.003
Token was
Token ID373
Feature activation-3.42e-01

Pos logit contributions
der+0.012
ately+0.011
ities+0.010
 fault+0.010
uable+0.010
Neg logit contributions
 marsh-0.007
 particular-0.006
 mil-0.006
 herd-0.006
rim-0.006
Token Ab
Token ID2275
Feature activation-2.80e-01

Pos logit contributions
 unnecessary+0.011
 fault+0.011
ately+0.011
arl+0.010
 weaknesses+0.010
Neg logit contributions
 mil-0.008
 marsh-0.008
 particular-0.008
 pl-0.008
 was-0.007
Tokenig
Token ID328
Feature activation+3.06e+00

Pos logit contributions
 unnecessary+0.008
 fault+0.008
ately+0.008
 weaknesses+0.008
arl+0.008
Neg logit contributions
 mil-0.008
 marsh-0.008
 particular-0.008
 was-0.007
 pl-0.007
Tokenail
Token ID603
Feature activation+2.36e-01

Pos logit contributions
 particular+0.094
 marsh+0.092
 mil+0.091
 Ms+0.085
 pl+0.082
Neg logit contributions
ately-0.080
 fault-0.077
arl-0.072
der-0.070
emic-0.070
Token.
Token ID13
Feature activation-1.26e-01

Pos logit contributions
 marsh+0.005
 particular+0.005
 mil+0.005
 pl+0.005
M+0.004
Neg logit contributions
 fault-0.008
ately-0.008
der-0.008
 unnecessary-0.007
 weaknesses-0.007
Token Ab
Token ID2275
Feature activation-1.15e-01

Pos logit contributions
 fault+0.004
ately+0.004
arl+0.004
 weaknesses+0.004
 unnecessary+0.004
Neg logit contributions
 mil-0.003
 marsh-0.003
 pl-0.002
 particular-0.002
 sn-0.002
Tokenig
Token ID328
Feature activation+2.26e+00

Pos logit contributions
 unnecessary+0.003
 fault+0.003
 weaknesses+0.003
arl+0.003
ately+0.003
Neg logit contributions
 mil-0.004
 marsh-0.004
 particular-0.004
 was-0.003
 pl-0.003
Tokenail
Token ID603
Feature activation+5.27e-01

Pos logit contributions
 mil+0.067
 marsh+0.067
 particular+0.064
 Ms+0.060
 sn+0.058
Neg logit contributions
ately-0.063
 fault-0.058
arl-0.056
 weaknesses-0.056
 unnecessary-0.054
Token's
Token ID338
Feature activation+1.05e+00

Pos logit contributions
 marsh+0.014
 mil+0.014
 particular+0.014
 gra+0.012
 reception+0.012
Neg logit contributions
ately-0.015
 fault-0.014
asing-0.014
 weaknesses-0.014
emic-0.013
Token mom
Token ID1995
Feature activation+2.86e-01

Pos logit contributions
 mil+0.033
 marsh+0.032
 particular+0.030
 pl+0.029
 was+0.029
Neg logit contributions
 fault-0.027
 unnecessary-0.027
 weaknesses-0.027
ately-0.026
arl-0.026
Token said
Token ID531
Feature activation+3.43e-01

Pos logit contributions
 marsh+0.008
 mil+0.007
 particular+0.007
 gra+0.006
 pasture+0.006
Neg logit contributions
ately-0.008
asing-0.008
emic-0.007
der-0.007
endant-0.007
Token,
Token ID11
Feature activation-8.38e-02

Pos logit contributions
 mil+0.006
 marsh+0.006
 particular+0.006
 was+0.006
 pl+0.006
Neg logit contributions
 unnecessary-0.013
 fault-0.013
ately-0.013
 weaknesses-0.013
arl-0.013
Token "
Token ID366
Feature activation+2.11e-01

Pos logit contributions
 unnecessary+0.003
 fault+0.003
ately+0.003
 weaknesses+0.003
arl+0.003
Neg logit contributions
 mil-0.001
 marsh-0.001
 particular-0.001
 was-0.001
 pl-0.001
TokenThose
Token ID9627
Feature activation+3.05e+00

Pos logit contributions
 mil+0.006
 particular+0.005
 marsh+0.005
 was+0.005
 pl+0.005
Neg logit contributions
 unnecessary-0.007
 fault-0.007
ately-0.006
arl-0.006
 weaknesses-0.006
Token are
Token ID389
Feature activation+1.61e+00

Pos logit contributions
 mil+0.085
 marsh+0.084
 particular+0.083
 pl+0.077
M+0.075
Neg logit contributions
 unnecessary-0.089
 weaknesses-0.086
 fault-0.085
arl-0.083
ately-0.083
Token nice
Token ID3621
Feature activation+1.47e+00

Pos logit contributions
 mil+0.040
 particular+0.039
 marsh+0.038
 pl+0.036
 gra+0.035
Neg logit contributions
arl-0.049
 ours-0.048
 unnecessary-0.047
 weaknesses-0.047
 fault-0.047
Token pants
Token ID12581
Feature activation+8.98e-01

Pos logit contributions
 mil+0.040
 marsh+0.040
 particular+0.039
 pl+0.036
 gra+0.035
Neg logit contributions
arl-0.042
 weaknesses-0.042
 unnecessary-0.042
 fault-0.041
ately-0.041
Token!"
Token ID2474
Feature activation+6.66e-01

Pos logit contributions
 particular+0.017
 marsh+0.017
 mil+0.017
 pl+0.015
M+0.014
Neg logit contributions
 fault-0.033
arl-0.032
ately-0.032
 ours-0.032
 weaknesses-0.032
Token Tim
Token ID5045
Feature activation+9.32e-01

Pos logit contributions
 mil+0.011
 marsh+0.011
 particular+0.010
 was+0.010
 pl+0.010
Neg logit contributions
 unnecessary-0.028
 fault-0.027
 weaknesses-0.027
ately-0.026
arl-0.026
Token
Token ID198
Feature activation+1.05e-01

Pos logit contributions
 mil+0.003
 marsh+0.003
 particular+0.003
 pl+0.002
 was+0.002
Neg logit contributions
 unnecessary-0.007
ately-0.007
 fault-0.007
arl-0.007
 weaknesses-0.006
Token"
Token ID1
Feature activation+9.44e-01

Pos logit contributions
 mil+0.002
 marsh+0.002
 particular+0.002
 sn+0.002
 pl+0.002
Neg logit contributions
arl-0.004
 unnecessary-0.004
ately-0.004
 weaknesses-0.004
 fault-0.004
TokenBecause
Token ID8128
Feature activation+1.02e-01

Pos logit contributions
 mil+0.023
 marsh+0.023
 particular+0.022
 was+0.021
 pl+0.021
Neg logit contributions
 unnecessary-0.030
 fault-0.030
ately-0.030
 weaknesses-0.029
arl-0.029
Token,"
Token ID553
Feature activation+9.90e-01

Pos logit contributions
 mil+0.002
 marsh+0.002
 particular+0.002
 was+0.002
 pl+0.002
Neg logit contributions
 unnecessary-0.003
 fault-0.003
ately-0.003
arl-0.003
 weaknesses-0.003
Token replied
Token ID8712
Feature activation+2.95e-01

Pos logit contributions
 mil+0.021
 marsh+0.021
 particular+0.020
 pl+0.018
 sn+0.018
Neg logit contributions
 unnecessary-0.034
ately-0.034
 weaknesses-0.033
 fault-0.032
der-0.032
Token the
Token ID262
Feature activation+3.03e+00

Pos logit contributions
 mil+0.004
 marsh+0.004
 particular+0.004
 pl+0.003
 was+0.003
Neg logit contributions
ately-0.012
 unnecessary-0.012
 fault-0.012
der-0.011
 weaknesses-0.011
Token polar
Token ID13559
Feature activation+3.95e-01

Pos logit contributions
 mil+0.061
 particular+0.058
 marsh+0.057
 pl+0.051
 was+0.051
Neg logit contributions
 unnecessary-0.112
arl-0.109
 fault-0.108
 weaknesses-0.107
ately-0.106
Token bear
Token ID6842
Feature activation-1.32e-01

Pos logit contributions
 mil+0.007
 particular+0.007
 marsh+0.007
 pl+0.006
 was+0.005
Neg logit contributions
 unnecessary-0.015
ately-0.015
 fault-0.015
arl-0.015
 weaknesses-0.015
Token,
Token ID11
Feature activation-1.62e-01

Pos logit contributions
ately+0.005
der+0.005
 fault+0.005
 unnecessary+0.004
 weaknesses+0.004
Neg logit contributions
 marsh-0.003
 mil-0.003
 particular-0.003
 pasture-0.002
 pl-0.002
Token "
Token ID366
Feature activation+3.35e-01

Pos logit contributions
der+0.005
ately+0.005
 unnecessary+0.005
 fault+0.005
arl+0.005
Neg logit contributions
 particular-0.003
 marsh-0.003
 mil-0.003
 pl-0.003
 was-0.003
Tokenpeople
Token ID15332
Feature activation+4.35e-02

Pos logit contributions
 mil+0.009
 marsh+0.009
 particular+0.008
 was+0.008
 pl+0.008
Neg logit contributions
 unnecessary-0.010
 fault-0.010
ately-0.010
 weaknesses-0.010
arl-0.010
Token and
Token ID290
Feature activation-3.82e-01

Pos logit contributions
arl+0.041
der+0.038
ately+0.038
 fault+0.038
 unnecessary+0.036
Neg logit contributions
 marsh-0.033
 mil-0.033
 particular-0.032
 pl-0.030
 was-0.030
Token humble
Token ID19695
Feature activation-7.49e-01

Pos logit contributions
 ours+0.011
arl+0.011
 unnecessary+0.010
 weaknesses+0.010
nam+0.010
Neg logit contributions
 marsh-0.010
 mil-0.010
 pasture-0.009
ink-0.009
 was-0.009
Token girl
Token ID2576
Feature activation+6.76e-01

Pos logit contributions
arl+0.023
 fault+0.023
 unnecessary+0.022
ately+0.022
der+0.022
Neg logit contributions
 mil-0.018
 marsh-0.018
 particular-0.017
 was-0.017
 pl-0.016
Token named
Token ID3706
Feature activation-9.65e-01

Pos logit contributions
 marsh+0.014
 mil+0.013
 particular+0.011
Ms+0.010
rell+0.010
Neg logit contributions
ately-0.023
 privacy-0.022
 weaknesses-0.022
 ago-0.022
 fault-0.022
Token Ab
Token ID2275
Feature activation+1.47e-01

Pos logit contributions
 unnecessary+0.031
 fault+0.030
ately+0.030
 weaknesses+0.030
arl+0.029
Neg logit contributions
 mil-0.025
 marsh-0.024
 particular-0.024
 was-0.023
 pl-0.023
Tokenig
Token ID328
Feature activation+2.96e+00

Pos logit contributions
 mil+0.004
 marsh+0.004
 particular+0.004
 was+0.004
 pl+0.004
Neg logit contributions
 unnecessary-0.004
 fault-0.004
ately-0.004
arl-0.004
 weaknesses-0.004
Tokenail
Token ID603
Feature activation+7.85e-01

Pos logit contributions
 marsh+0.091
 mil+0.090
 particular+0.090
 Ms+0.084
 sn+0.079
Neg logit contributions
ately-0.079
 fault-0.074
arl-0.073
emic-0.068
der-0.068
Token.
Token ID13
Feature activation-6.84e-01

Pos logit contributions
 marsh+0.017
 mil+0.017
 particular+0.016
 Ms+0.014
 pl+0.014
Neg logit contributions
ately-0.027
 fault-0.026
der-0.026
 weaknesses-0.025
arl-0.025
Token She
Token ID1375
Feature activation+1.94e-01

Pos logit contributions
 fault+0.022
ately+0.022
arl+0.022
 weaknesses+0.021
 unnecessary+0.021
Neg logit contributions
 mil-0.017
 marsh-0.016
 particular-0.015
 pl-0.015
 was-0.014
Token loved
Token ID6151
Feature activation-1.65e-02

Pos logit contributions
 marsh+0.005
 particular+0.005
 mil+0.005
 pasture+0.004
 Ms+0.004
Neg logit contributions
 weaknesses-0.005
ately-0.005
 villains-0.005
der-0.005
 unnecessary-0.005
Token to
Token ID284
Feature activation+4.74e-01

Pos logit contributions
 unnecessary+0.001
 fault+0.001
 weaknesses+0.001
ately+0.001
arl+0.001
Neg logit contributions
 mil-0.000
 particular-0.000
 marsh-0.000
 was-0.000
 pl-0.000
Token child
Token ID1200
Feature activation+8.28e-02

Pos logit contributions
 mil+0.009
 marsh+0.008
 particular+0.008
 was+0.008
 pl+0.008
Neg logit contributions
 unnecessary-0.010
 fault-0.010
ately-0.010
 weaknesses-0.010
arl-0.010
Token's
Token ID338
Feature activation+7.12e-02

Pos logit contributions
 marsh+0.002
 particular+0.002
 mil+0.002
Ms+0.002
 Ms+0.001
Neg logit contributions
 fault-0.003
ately-0.003
endant-0.003
 weaknesses-0.003
der-0.003
Token name
Token ID1438
Feature activation-5.15e-01

Pos logit contributions
 mil+0.002
 marsh+0.002
 particular+0.002
 pl+0.002
 was+0.002
Neg logit contributions
 fault-0.002
arl-0.002
 unnecessary-0.002
ately-0.002
 weaknesses-0.002
Token was
Token ID373
Feature activation-3.28e-01

Pos logit contributions
der+0.015
ately+0.014
uable+0.014
endant+0.013
icated+0.013
Neg logit contributions
 marsh-0.011
 particular-0.011
 pasture-0.011
 mil-0.011
 gra-0.010
Token Ab
Token ID2275
Feature activation-3.03e-01

Pos logit contributions
 unnecessary+0.010
 fault+0.010
ately+0.010
arl+0.010
 weaknesses+0.010
Neg logit contributions
 mil-0.009
 marsh-0.008
 particular-0.008
 pl-0.008
 was-0.008
Tokenig
Token ID328
Feature activation+2.95e+00

Pos logit contributions
 unnecessary+0.009
 weaknesses+0.009
 fault+0.009
arl+0.009
der+0.009
Neg logit contributions
 mil-0.008
 particular-0.008
 marsh-0.008
 pl-0.008
 was-0.008
Tokenail
Token ID603
Feature activation+1.30e-02

Pos logit contributions
 particular+0.090
 marsh+0.089
 mil+0.087
 Ms+0.081
 pl+0.078
Neg logit contributions
ately-0.078
 fault-0.074
arl-0.069
der-0.068
emic-0.067
Token and
Token ID290
Feature activation-1.94e-01

Pos logit contributions
 marsh+0.000
 particular+0.000
 mil+0.000
 pl+0.000
M+0.000
Neg logit contributions
der-0.000
ately-0.000
 fault-0.000
 weaknesses-0.000
emic-0.000
Token she
Token ID673
Feature activation-5.02e-01

Pos logit contributions
 unnecessary+0.007
 fault+0.007
ately+0.007
 weaknesses+0.007
arl+0.007
Neg logit contributions
 mil-0.004
 marsh-0.004
 particular-0.004
 pl-0.003
 was-0.003
Token was
Token ID373
Feature activation-7.86e-02

Pos logit contributions
 villains+0.013
 fault+0.013
 weaknesses+0.013
emic+0.012
icated+0.012
Neg logit contributions
 marsh-0.014
 particular-0.013
 mil-0.013
Ms-0.012
 pasture-0.012
Token three
Token ID1115
Feature activation+1.40e+00

Pos logit contributions
 fault+0.003
 unnecessary+0.003
 ours+0.003
nam+0.003
 spies+0.003
Neg logit contributions
 marsh-0.002
 particular-0.001
 mil-0.001
oc-0.001
M-0.001
Token,
Token ID11
Feature activation-3.75e-01

Pos logit contributions
 pl+0.018
 mil+0.018
 gra+0.016
 particular+0.015
 tow+0.014
Neg logit contributions
ately-0.055
 unnecessary-0.054
 fault-0.053
der-0.053
 weaknesses-0.051
Token they
Token ID484
Feature activation+3.09e-01

Pos logit contributions
ately+0.010
 unnecessary+0.010
 fault+0.010
 weaknesses+0.009
der+0.009
Neg logit contributions
 mil-0.010
 pl-0.010
 marsh-0.010
 particular-0.010
 sn-0.009
Token made
Token ID925
Feature activation+5.59e-01

Pos logit contributions
 mil+0.007
 marsh+0.006
 particular+0.006
 reception+0.005
 Ms+0.005
Neg logit contributions
ately-0.011
 weaknesses-0.011
 fault-0.010
emic-0.010
arl-0.010
Token lots
Token ID6041
Feature activation+1.08e+00

Pos logit contributions
 particular+0.016
 mil+0.016
 marsh+0.015
 pl+0.014
 was+0.014
Neg logit contributions
 unnecessary-0.016
 fault-0.016
ately-0.015
 weaknesses-0.015
der-0.014
Token of
Token ID286
Feature activation+2.29e+00

Pos logit contributions
 particular+0.022
 marsh+0.021
 mil+0.021
M+0.021
 pl+0.020
Neg logit contributions
 unnecessary-0.037
der-0.036
ately-0.035
 fault-0.035
 weaknesses-0.034
Token purple
Token ID14032
Feature activation+2.95e+00

Pos logit contributions
 mil+0.052
 marsh+0.051
 particular+0.050
 was+0.047
 pl+0.047
Neg logit contributions
 unnecessary-0.079
 fault-0.077
ately-0.076
 weaknesses-0.076
arl-0.076
Token v
Token ID410
Feature activation+1.16e+00

Pos logit contributions
 particular+0.089
 mil+0.088
M+0.083
 Ms+0.082
Ms+0.082
Neg logit contributions
 weaknesses-0.076
 unnecessary-0.075
 fault-0.073
ately-0.073
der-0.070
Tokenests
Token ID3558
Feature activation+4.86e-01

Pos logit contributions
 particular+0.045
 marsh+0.045
 mil+0.043
 sn+0.042
 b+0.042
Neg logit contributions
arl-0.024
 fault-0.023
asing-0.023
ately-0.023
emic-0.022
Token together
Token ID1978
Feature activation-7.30e-01

Pos logit contributions
 mil+0.011
 marsh+0.011
 particular+0.011
 pl+0.010
M+0.010
Neg logit contributions
 fault-0.016
ately-0.016
 unnecessary-0.016
der-0.015
 weaknesses-0.015
Token.
Token ID13
Feature activation-8.64e-01

Pos logit contributions
ately+0.025
der+0.025
 fault+0.025
 unnecessary+0.024
 weaknesses+0.024
Neg logit contributions
 mil-0.015
 marsh-0.015
 particular-0.014
 reception-0.013
 pl-0.013
Token Emma
Token ID18966
Feature activation+1.15e-01

Pos logit contributions
 unnecessary+0.031
 fault+0.031
arl+0.030
ately+0.030
 weaknesses+0.030
Neg logit contributions
 mil-0.018
 marsh-0.017
 particular-0.017
 pl-0.017
 was-0.016
Token bed
Token ID3996
Feature activation-1.29e-01

Pos logit contributions
arl+0.020
 fault+0.019
 spine+0.019
ately+0.018
 unnecessary+0.018
Neg logit contributions
 mil-0.017
 marsh-0.016
 particular-0.016
 pl-0.016
-0.015
Token.
Token ID13
Feature activation+1.03e-01

Pos logit contributions
 unnecessary+0.005
ately+0.005
 fault+0.005
arl+0.005
 weaknesses+0.005
Neg logit contributions
 mil-0.002
 marsh-0.002
 particular-0.002
 pl-0.002
 was-0.002
Token As
Token ID1081
Feature activation+1.53e-01

Pos logit contributions
 mil+0.002
 was+0.002
 marsh+0.002
 particular+0.002
 pl+0.002
Neg logit contributions
 unnecessary-0.004
 fault-0.004
ately-0.004
arl-0.004
 weaknesses-0.004
Token he
Token ID339
Feature activation+9.20e-01

Pos logit contributions
 mil+0.004
 marsh+0.004
 particular+0.004
 pl+0.003
 was+0.003
Neg logit contributions
 unnecessary-0.005
 fault-0.005
ately-0.005
 weaknesses-0.005
arl-0.005
Token was
Token ID373
Feature activation+2.29e+00

Pos logit contributions
 mil+0.017
 particular+0.016
 marsh+0.016
 Ms+0.014
 pl+0.014
Neg logit contributions
ately-0.035
 unnecessary-0.033
der-0.032
arl-0.032
 weaknesses-0.032
Token do
Token ID466
Feature activation+2.94e+00

Pos logit contributions
 particular+0.059
 mil+0.057
 marsh+0.053
 pl+0.053
M+0.052
Neg logit contributions
 unnecessary-0.068
arl-0.067
 fault-0.066
der-0.065
ately-0.064
Tokenzing
Token ID9510
Feature activation+1.42e+00

Pos logit contributions
 mil+0.044
 marsh+0.042
 particular+0.042
 pl+0.039
M+0.036
Neg logit contributions
 unnecessary-0.121
ately-0.116
der-0.114
 fault-0.114
arl-0.113
Token off
Token ID572
Feature activation+1.11e+00

Pos logit contributions
 mil+0.022
 particular+0.021
 marsh+0.020
 pl+0.019
 was+0.018
Neg logit contributions
 unnecessary-0.060
 fault-0.057
arl-0.057
ately-0.057
 weaknesses-0.056
Token,
Token ID11
Feature activation+1.98e-01

Pos logit contributions
 mil+0.013
 particular+0.012
 pl+0.011
 marsh+0.011
M+0.011
Neg logit contributions
ately-0.049
 unnecessary-0.048
 fault-0.047
der-0.047
arl-0.046
Token he
Token ID339
Feature activation+1.01e-01

Pos logit contributions
 mil+0.004
 marsh+0.004
 particular+0.004
 was+0.004
 pl+0.004
Neg logit contributions
 unnecessary-0.007
 fault-0.007
arl-0.007
ately-0.007
 weaknesses-0.007
Token heard
Token ID2982
Feature activation+6.83e-01

Pos logit contributions
 mil+0.002
 particular+0.002
 marsh+0.002
 Ms+0.002
 pl+0.002
Neg logit contributions
ately-0.003
 unnecessary-0.003
 weaknesses-0.003
der-0.003
 fault-0.003
Token are
Token ID389
Feature activation+6.53e-01

Pos logit contributions
 mil+0.043
 marsh+0.041
 particular+0.040
 was+0.039
 pl+0.038
Neg logit contributions
 unnecessary-0.071
 fault-0.069
ately-0.069
 weaknesses-0.068
arl-0.068
Token not
Token ID407
Feature activation-5.05e-02

Pos logit contributions
 mil+0.015
 marsh+0.014
 particular+0.014
 was+0.014
 pl+0.014
Neg logit contributions
 unnecessary-0.022
 fault-0.022
arl-0.022
 weaknesses-0.021
ately-0.021
Token tr
Token ID491
Feature activation+1.68e-01

Pos logit contributions
ately+0.002
 unnecessary+0.002
 weaknesses+0.002
arl+0.002
 fault+0.002
Neg logit contributions
 mil-0.001
 marsh-0.001
 was-0.001
 pl-0.001
 particular-0.001
Tokenunks
Token ID14125
Feature activation+4.41e-01

Pos logit contributions
 mil+0.005
 marsh+0.005
 particular+0.005
 was+0.005
 pl+0.004
Neg logit contributions
 fault-0.005
ately-0.005
arl-0.005
 unnecessary-0.005
 weaknesses-0.004
Token.
Token ID13
Feature activation+1.72e+00

Pos logit contributions
 mil+0.008
 marsh+0.008
 particular+0.008
 was+0.007
 pl+0.007
Neg logit contributions
 unnecessary-0.017
 fault-0.017
ately-0.017
arl-0.017
 weaknesses-0.017
Token Those
Token ID5845
Feature activation+2.94e+00

Pos logit contributions
 mil+0.045
 marsh+0.044
 particular+0.042
 was+0.041
 pl+0.039
Neg logit contributions
 unnecessary-0.055
ately-0.053
 fault-0.052
 weaknesses-0.051
arl-0.049
Token are
Token ID389
Feature activation+1.30e+00

Pos logit contributions
 mil+0.074
 particular+0.073
 marsh+0.071
M+0.067
 pl+0.066
Neg logit contributions
 unnecessary-0.093
 weaknesses-0.091
 fault-0.089
ately-0.087
arl-0.086
Token my
Token ID616
Feature activation+6.82e-01

Pos logit contributions
 mil+0.029
 marsh+0.028
 particular+0.028
 pl+0.026
 was+0.026
Neg logit contributions
 unnecessary-0.045
 fault-0.044
arl-0.043
 weaknesses-0.043
ately-0.043
Token pants
Token ID12581
Feature activation+3.02e-01

Pos logit contributions
 mil+0.018
 marsh+0.018
 particular+0.018
 was+0.017
 pl+0.017
Neg logit contributions
 unnecessary-0.020
 fault-0.020
ately-0.020
 weaknesses-0.020
arl-0.019
Token.
Token ID13
Feature activation+1.51e+00

Pos logit contributions
 mil+0.005
 particular+0.005
 marsh+0.005
 pl+0.005
 was+0.005
Neg logit contributions
 fault-0.012
 unnecessary-0.012
ately-0.011
arl-0.011
 weaknesses-0.011
Token Please
Token ID4222
Feature activation+5.87e-01

Pos logit contributions
 mil+0.042
 marsh+0.042
 particular+0.040
 was+0.039
 sn+0.038
Neg logit contributions
 unnecessary-0.045
ately-0.043
 fault-0.043
 weaknesses-0.042
arl-0.040
Token said
Token ID531
Feature activation+3.96e-03

Pos logit contributions
 mil+0.010
 particular+0.010
 marsh+0.010
 end+0.009
 pasture+0.009
Neg logit contributions
ately-0.014
der-0.013
 unnecessary-0.013
 weaknesses-0.013
 fault-0.012
Token:
Token ID25
Feature activation-1.46e-01

Pos logit contributions
 marsh+0.000
 particular+0.000
 mil+0.000
 pasture+0.000
 pl+0.000
Neg logit contributions
der-0.000
ately-0.000
arl-0.000
 unnecessary-0.000
 weaknesses-0.000
Token "
Token ID366
Feature activation+4.39e-01

Pos logit contributions
arl+0.005
der+0.005
ately+0.005
 fault+0.005
 unnecessary+0.004
Neg logit contributions
 mil-0.003
 marsh-0.003
 particular-0.003
 pl-0.003
 was-0.003
TokenHello
Token ID15496
Feature activation-3.98e-01

Pos logit contributions
 mil+0.012
 marsh+0.012
 particular+0.011
 pl+0.011
 was+0.011
Neg logit contributions
 unnecessary-0.012
 weaknesses-0.012
ately-0.012
arl-0.012
 fault-0.012
Token Ab
Token ID2275
Feature activation+7.08e-01

Pos logit contributions
 unnecessary+0.017
ately+0.017
 fault+0.017
arl+0.017
 weaknesses+0.016
Neg logit contributions
 mil-0.006
 marsh-0.005
 particular-0.005
 pl-0.005
 was-0.004
Tokenig
Token ID328
Feature activation+2.94e+00

Pos logit contributions
 mil+0.022
 marsh+0.022
 particular+0.022
 was+0.021
 pl+0.021
Neg logit contributions
 unnecessary-0.018
 fault-0.017
 weaknesses-0.017
arl-0.017
ately-0.017
Tokenail
Token ID603
Feature activation+9.94e-01

Pos logit contributions
 marsh+0.061
 mil+0.059
 particular+0.058
 Ms+0.051
 pl+0.050
Neg logit contributions
ately-0.105
 fault-0.099
arl-0.096
aser-0.092
 unnecessary-0.092
Token!
Token ID0
Feature activation-7.40e-02

Pos logit contributions
 marsh+0.014
 mil+0.014
 particular+0.014
 pl+0.012
M+0.011
Neg logit contributions
ately-0.041
 fault-0.041
 unnecessary-0.039
arl-0.039
der-0.039
Token I
Token ID314
Feature activation-3.86e-01

Pos logit contributions
 unnecessary+0.002
 fault+0.002
 weaknesses+0.002
ately+0.002
arl+0.002
Neg logit contributions
 mil-0.002
 particular-0.002
 was-0.002
 marsh-0.002
-0.002
Token'm
Token ID1101
Feature activation-9.78e-02

Pos logit contributions
 fault+0.012
 unnecessary+0.012
ately+0.012
 weaknesses+0.012
arl+0.011
Neg logit contributions
 mil-0.010
 marsh-0.010
 particular-0.009
 Ms-0.009
Ms-0.008
Token glad
Token ID9675
Feature activation-1.61e-01

Pos logit contributions
 fault+0.003
 unnecessary+0.003
arl+0.003
 ours+0.003
ately+0.003
Neg logit contributions
 mil-0.002
 particular-0.002
 marsh-0.002
 pl-0.002
M-0.002
Token cub
Token ID13617
Feature activation-1.02e+00

Pos logit contributions
 mil+0.041
 marsh+0.039
 particular+0.039
 pl+0.037
 was+0.036
Neg logit contributions
 unnecessary-0.067
ately-0.065
arl-0.065
 fault-0.064
 weaknesses-0.064
Token said
Token ID531
Feature activation-2.92e-01

Pos logit contributions
 unnecessary+0.037
ately+0.037
 fault+0.037
 weaknesses+0.036
arl+0.036
Neg logit contributions
 mil-0.020
 marsh-0.020
 particular-0.019
 pl-0.018
 was-0.017
Token yes
Token ID3763
Feature activation-3.75e-01

Pos logit contributions
 unnecessary+0.009
 fault+0.008
ately+0.008
 weaknesses+0.008
arl+0.008
Neg logit contributions
 mil-0.008
 marsh-0.008
 particular-0.008
 pl-0.007
 was-0.007
Token.
Token ID13
Feature activation-5.37e-04

Pos logit contributions
 unnecessary+0.016
 fault+0.016
ately+0.016
 weaknesses+0.016
arl+0.016
Neg logit contributions
 mil-0.005
 marsh-0.005
 particular-0.005
 was-0.005
 pl-0.005
Token The
Token ID383
Feature activation+1.99e+00

Pos logit contributions
 unnecessary+0.000
 fault+0.000
ately+0.000
 weaknesses+0.000
arl+0.000
Neg logit contributions
 mil-0.000
 marsh-0.000
 particular-0.000
 pl-0.000
 was-0.000
Token two
Token ID734
Feature activation+2.94e+00

Pos logit contributions
 mil+0.046
 marsh+0.043
 particular+0.043
 pl+0.041
 was+0.039
Neg logit contributions
 unnecessary-0.065
arl-0.063
ately-0.062
 weaknesses-0.062
 fault-0.061
Token of
Token ID286
Feature activation+1.31e+00

Pos logit contributions
 mil+0.070
 marsh+0.064
 particular+0.063
 pl+0.061
M+0.058
Neg logit contributions
 unnecessary-0.097
ately-0.095
 weaknesses-0.091
der-0.087
arl-0.087
Token them
Token ID606
Feature activation+7.87e-01

Pos logit contributions
 mil+0.029
 marsh+0.027
 particular+0.026
 pl+0.025
 was+0.024
Neg logit contributions
 unnecessary-0.046
arl-0.045
ately-0.045
 weaknesses-0.044
 fault-0.043
Token went
Token ID1816
Feature activation+1.83e-01

Pos logit contributions
 mil+0.017
 marsh+0.016
 particular+0.015
 pl+0.013
 Ms+0.013
Neg logit contributions
ately-0.029
 unnecessary-0.027
 weaknesses-0.027
der-0.026
 fault-0.025
Token for
Token ID329
Feature activation+2.58e-01

Pos logit contributions
 mil+0.004
 marsh+0.004
 particular+0.004
 pl+0.004
 was+0.004
Neg logit contributions
arl-0.006
 unnecessary-0.006
ately-0.006
 fault-0.006
 weaknesses-0.006
Token a
Token ID257
Feature activation+3.42e-01

Pos logit contributions
 mil+0.003
 marsh+0.003
 particular+0.003
 was+0.003
 pl+0.003
Neg logit contributions
 fault-0.012
 unnecessary-0.012
arl-0.012
ately-0.011
 weaknesses-0.011
Token youngest
Token ID18887
Feature activation-2.86e-01

Pos logit contributions
 mil+0.029
 marsh+0.028
 particular+0.028
 was+0.027
 pl+0.027
Neg logit contributions
 unnecessary-0.035
ately-0.034
 fault-0.033
 weaknesses-0.033
arl-0.033
Token girl
Token ID2576
Feature activation+3.06e-01

Pos logit contributions
 fault+0.009
ately+0.009
 unnecessary+0.009
 weaknesses+0.009
arl+0.009
Neg logit contributions
 mil-0.007
 marsh-0.007
 particular-0.006
 pl-0.006
 sn-0.006
Token was
Token ID373
Feature activation+4.88e-01

Pos logit contributions
 particular+0.006
 marsh+0.006
 mil+0.006
Ms+0.005
 gra+0.005
Neg logit contributions
 ago-0.010
ately-0.010
 fault-0.010
 weaknesses-0.010
endant-0.009
Token named
Token ID3706
Feature activation-6.46e-01

Pos logit contributions
 marsh+0.012
 particular+0.011
oc+0.011
 mil+0.011
M+0.010
Neg logit contributions
 fault-0.015
 spies-0.014
 unnecessary-0.014
emic-0.014
arl-0.014
Token Ab
Token ID2275
Feature activation+1.01e-01

Pos logit contributions
 unnecessary+0.022
ately+0.021
der+0.020
 fault+0.020
 weaknesses+0.020
Neg logit contributions
 particular-0.016
 mil-0.016
 was-0.015
 marsh-0.015
 pl-0.015
Tokenig
Token ID328
Feature activation+2.93e+00

Pos logit contributions
 mil+0.003
 marsh+0.003
 particular+0.003
 was+0.002
 pl+0.002
Neg logit contributions
 unnecessary-0.003
 fault-0.003
ately-0.003
 weaknesses-0.003
arl-0.003
Tokenail
Token ID603
Feature activation+1.40e-01

Pos logit contributions
 particular+0.087
 marsh+0.086
 mil+0.086
 Ms+0.079
 sn+0.076
Neg logit contributions
ately-0.081
 fault-0.076
arl-0.073
emic-0.072
der-0.070
Token and
Token ID290
Feature activation+2.83e-02

Pos logit contributions
 particular+0.003
 marsh+0.003
 mil+0.003
 pl+0.002
M+0.002
Neg logit contributions
ately-0.005
 fault-0.005
der-0.005
 weaknesses-0.005
 unnecessary-0.005
Token she
Token ID673
Feature activation-2.63e-01

Pos logit contributions
 mil+0.001
 marsh+0.001
 particular+0.000
 pl+0.000
 sn+0.000
Neg logit contributions
 fault-0.001
ately-0.001
 unnecessary-0.001
 weaknesses-0.001
arl-0.001
Token was
Token ID373
Feature activation+1.51e-01

Pos logit contributions
 villains+0.006
 fault+0.006
emic+0.006
 weaknesses+0.006
 enemies+0.005
Neg logit contributions
 marsh-0.008
 particular-0.008
 mil-0.008
 pasture-0.007
 front-0.007
Token 3
Token ID513
Feature activation+1.21e+00

Pos logit contributions
 particular+0.003
 marsh+0.003
 mil+0.003
oc+0.003
M+0.003
Neg logit contributions
 fault-0.005
 unnecessary-0.005
 spies-0.005
 ours-0.005
emic-0.005
Token year
Token ID614
Feature activation-4.88e-01

Pos logit contributions
 mil+0.013
 marsh+0.012
 particular+0.012
 Ms+0.011
 pl+0.011
Neg logit contributions
ately-0.027
der-0.027
arl-0.026
 unnecessary-0.025
 weaknesses-0.025
Token old
Token ID1468
Feature activation-4.06e-01

Pos logit contributions
 unnecessary+0.016
 fault+0.015
ately+0.015
arl+0.015
 weaknesses+0.015
Neg logit contributions
 mil-0.012
 marsh-0.012
 particular-0.012
 was-0.011
 pl-0.011
Token girl
Token ID2576
Feature activation+8.40e-01

Pos logit contributions
 unnecessary+0.015
 weaknesses+0.014
 fault+0.014
arl+0.014
ately+0.014
Neg logit contributions
 mil-0.010
 was-0.009
 marsh-0.009
 pl-0.009
,-0.009
Token named
Token ID3706
Feature activation-1.15e+00

Pos logit contributions
 marsh+0.016
 particular+0.016
 mil+0.015
 Ms+0.013
Ms+0.013
Neg logit contributions
ately-0.030
 ago-0.029
ever-0.028
emic-0.027
 unnecessary-0.027
Token Ab
Token ID2275
Feature activation-1.99e-01

Pos logit contributions
 unnecessary+0.038
 fault+0.037
 weaknesses+0.037
ately+0.037
arl+0.036
Neg logit contributions
 mil-0.028
 particular-0.028
 marsh-0.028
 was-0.027
 pl-0.027
Tokenig
Token ID328
Feature activation+2.93e+00

Pos logit contributions
 unnecessary+0.005
ately+0.005
 fault+0.005
arl+0.005
 weaknesses+0.005
Neg logit contributions
 mil-0.006
 marsh-0.006
 particular-0.006
 was-0.006
 pl-0.006
Tokenail
Token ID603
Feature activation+6.98e-01

Pos logit contributions
 marsh+0.088
 particular+0.087
 mil+0.087
 Ms+0.081
 pl+0.075
Neg logit contributions
ately-0.081
 fault-0.075
arl-0.073
emic-0.070
der-0.069
Token.
Token ID13
Feature activation-7.54e-01

Pos logit contributions
 marsh+0.014
 particular+0.014
 mil+0.014
 Ms+0.012
 pl+0.012
Neg logit contributions
ately-0.025
 fault-0.024
 unnecessary-0.023
 weaknesses-0.023
der-0.023
Token She
Token ID1375
Feature activation-1.59e-01

Pos logit contributions
 unnecessary+0.027
 fault+0.026
ately+0.026
 weaknesses+0.026
arl+0.026
Neg logit contributions
 mil-0.016
 marsh-0.016
 particular-0.015
 was-0.015
 pl-0.014
Token loved
Token ID6151
Feature activation-1.32e-01

Pos logit contributions
 weaknesses+0.005
ately+0.005
 unnecessary+0.005
 fault+0.005
der+0.005
Neg logit contributions
 particular-0.004
 mil-0.004
 marsh-0.004
 Ms-0.003
Ms-0.003
Token to
Token ID284
Feature activation+3.84e-01

Pos logit contributions
 unnecessary+0.005
 fault+0.005
ately+0.005
arl+0.005
 weaknesses+0.005
Neg logit contributions
 mil-0.003
 marsh-0.003
 particular-0.003
 was-0.003
 pl-0.002
Token "
Token ID366
Feature activation+1.74e-01

Pos logit contributions
 marsh+0.003
 particular+0.003
 mil+0.003
 pl+0.003
 was+0.003
Neg logit contributions
ately-0.005
arl-0.005
der-0.005
 fault-0.005
 unnecessary-0.005
TokenWe
Token ID1135
Feature activation+7.51e-01

Pos logit contributions
 mil+0.004
 particular+0.004
 marsh+0.004
 was+0.004
 pl+0.004
Neg logit contributions
 unnecessary-0.006
 fault-0.006
ately-0.005
arl-0.005
 weaknesses-0.005
Token don
Token ID836
Feature activation-2.12e-01

Pos logit contributions
 mil+0.018
 marsh+0.018
 particular+0.018
 was+0.017
 pl+0.017
Neg logit contributions
 unnecessary-0.024
 fault-0.024
ately-0.024
 weaknesses-0.024
arl-0.023
Token't
Token ID470
Feature activation-7.32e-01

Pos logit contributions
arl+0.006
der+0.006
ately+0.006
 weaknesses+0.006
 fault+0.006
Neg logit contributions
 marsh-0.005
 particular-0.005
 mil-0.005
 pl-0.005
 l-0.005
Token have
Token ID423
Feature activation+7.76e-01

Pos logit contributions
 unnecessary+0.025
 fault+0.024
ately+0.024
arl+0.024
 weaknesses+0.024
Neg logit contributions
 mil-0.017
 marsh-0.017
 particular-0.016
 was-0.016
 pl-0.016
Token any
Token ID597
Feature activation+2.92e+00

Pos logit contributions
 mil+0.020
 marsh+0.019
 particular+0.018
 was+0.018
 pl+0.018
Neg logit contributions
 unnecessary-0.024
arl-0.024
ately-0.024
 fault-0.024
 weaknesses-0.024
Token w
Token ID266
Feature activation+6.38e-01

Pos logit contributions
 particular+0.059
 mil+0.057
M+0.056
 marsh+0.054
ink+0.052
Neg logit contributions
 unnecessary-0.105
 fault-0.102
 weaknesses-0.097
der-0.095
arl-0.094
Tokenaffles
Token ID48501
Feature activation+9.12e-02

Pos logit contributions
 particular+0.021
 mil+0.021
 was+0.021
 marsh+0.021
 pl+0.020
Neg logit contributions
 unnecessary-0.014
arl-0.014
 fault-0.014
ately-0.014
 weaknesses-0.013
Token left
Token ID1364
Feature activation+1.01e-02

Pos logit contributions
 mil+0.002
 particular+0.002
 marsh+0.002
 pl+0.002
 was+0.001
Neg logit contributions
 fault-0.003
 unnecessary-0.003
ately-0.003
arl-0.003
der-0.003
Token.
Token ID13
Feature activation-1.52e-01

Pos logit contributions
 mil+0.000
 particular+0.000
 marsh+0.000
 pl+0.000
 was+0.000
Neg logit contributions
 fault-0.000
ately-0.000
 unnecessary-0.000
arl-0.000
 ours-0.000
Token We
Token ID775
Feature activation+6.95e-02

Pos logit contributions
 unnecessary+0.004
ately+0.004
 fault+0.004
arl+0.004
 weaknesses+0.004
Neg logit contributions
 mil-0.005
 marsh-0.004
 was-0.004
 particular-0.004
-0.004
Token not
Token ID407
Feature activation-4.35e-01

Pos logit contributions
arl+0.006
 fault+0.006
 weaknesses+0.006
 unnecessary+0.006
 property+0.006
Neg logit contributions
 marsh-0.005
 mil-0.005
 particular-0.005
M-0.005
 Ms-0.005
Token go
Token ID467
Feature activation+2.47e-01

Pos logit contributions
 unnecessary+0.016
 weaknesses+0.015
 fault+0.015
arl+0.015
ately+0.015
Neg logit contributions
 marsh-0.010
 mil-0.009
 particular-0.009
 pl-0.008
 Ms-0.008
Token in
Token ID287
Feature activation+1.32e+00

Pos logit contributions
 mil+0.006
 marsh+0.006
 particular+0.005
 was+0.005
 pl+0.005
Neg logit contributions
 unnecessary-0.008
 fault-0.008
ately-0.008
arl-0.008
 weaknesses-0.008
Token there
Token ID612
Feature activation+1.75e-01

Pos logit contributions
 mil+0.032
 particular+0.032
 marsh+0.032
 pl+0.030
 was+0.029
Neg logit contributions
 unnecessary-0.042
 fault-0.041
ately-0.040
arl-0.040
 ours-0.039
Token,"
Token ID553
Feature activation+8.01e-01

Pos logit contributions
 mil+0.003
 marsh+0.003
 particular+0.003
 pl+0.003
 was+0.003
Neg logit contributions
 fault-0.006
 unnecessary-0.006
ately-0.006
arl-0.006
 weaknesses-0.006
Token the
Token ID262
Feature activation+2.92e+00

Pos logit contributions
 marsh+0.016
 mil+0.016
 particular+0.015
 pl+0.013
 was+0.013
Neg logit contributions
ately-0.029
 unnecessary-0.029
 weaknesses-0.027
 fault-0.027
der-0.027
Token bear
Token ID6842
Feature activation-7.85e-02

Pos logit contributions
 mil+0.070
 marsh+0.069
 particular+0.065
 Ms+0.057
 pl+0.057
Neg logit contributions
arl-0.094
 unnecessary-0.092
ately-0.088
 weaknesses-0.088
der-0.083
Token warned
Token ID7728
Feature activation-7.72e-01

Pos logit contributions
ately+0.003
der+0.002
 unnecessary+0.002
 weaknesses+0.002
 fault+0.002
Neg logit contributions
 marsh-0.002
 mil-0.002
 particular-0.002
 pasture-0.002
 spring-0.002
Token Wil
Token ID5187
Feature activation-8.26e-01

Pos logit contributions
ately+0.028
 unnecessary+0.027
 fault+0.026
arl+0.026
der+0.025
Neg logit contributions
 marsh-0.017
 mil-0.016
 particular-0.016
 pl-0.014
 was-0.014
Tokenbur
Token ID6236
Feature activation+7.57e-03

Pos logit contributions
 unnecessary+0.029
 fault+0.029
 weaknesses+0.029
arl+0.028
ately+0.028
Neg logit contributions
 mil-0.018
 marsh-0.017
 particular-0.017
 was-0.016
 pl-0.016
Token.
Token ID13
Feature activation-7.20e-03

Pos logit contributions
 mil+0.000
 marsh+0.000
 particular+0.000
 was+0.000
 pl+0.000
Neg logit contributions
 unnecessary-0.000
arl-0.000
 weaknesses-0.000
 fault-0.000
ately-0.000
Token wrong
Token ID2642
Feature activation-6.41e-01

Pos logit contributions
 mil+0.006
 particular+0.006
 marsh+0.006
 pl+0.006
 was+0.005
Neg logit contributions
 unnecessary-0.008
 fault-0.008
 weaknesses-0.007
arl-0.007
ately-0.007
Token?"
Token ID1701
Feature activation+1.09e+00

Pos logit contributions
 unnecessary+0.022
arl+0.021
ately+0.021
 weaknesses+0.021
 fault+0.021
Neg logit contributions
 mil-0.014
 marsh-0.014
 was-0.014
 particular-0.014
 pl-0.013
Token\n
Token ID198
Feature activation+7.05e-01

Pos logit contributions
 mil+0.019
 marsh+0.018
 particular+0.018
 pl+0.016
 was+0.016
Neg logit contributions
 unnecessary-0.043
ately-0.042
 fault-0.041
arl-0.041
 weaknesses-0.041
Token\n
Token ID198
Feature activation+4.37e-02

Pos logit contributions
 mil+0.012
 marsh+0.011
 particular+0.011
 pl+0.010
 was+0.009
Neg logit contributions
 unnecessary-0.028
arl-0.027
ately-0.027
 fault-0.027
 weaknesses-0.026
TokenThe
Token ID464
Feature activation+2.18e+00

Pos logit contributions
 mil+0.001
 marsh+0.001
 particular+0.001
 pl+0.001
 was+0.001
Neg logit contributions
 unnecessary-0.002
ately-0.001
arl-0.001
 weaknesses-0.001
der-0.001
Token two
Token ID734
Feature activation+2.89e+00

Pos logit contributions
 mil+0.056
 marsh+0.051
 particular+0.051
 pl+0.050
 was+0.048
Neg logit contributions
 unnecessary-0.066
 weaknesses-0.064
arl-0.063
 fault-0.062
der-0.061
Token bears
Token ID13062
Feature activation+7.09e-01

Pos logit contributions
 mil+0.082
 particular+0.076
 marsh+0.076
 pl+0.073
 gra+0.071
Neg logit contributions
ately-0.081
 unnecessary-0.081
 weaknesses-0.078
der-0.075
 intentions-0.072
Token stopped
Token ID5025
Feature activation+7.52e-01

Pos logit contributions
 mil+0.015
 marsh+0.015
 particular+0.014
 pl+0.013
 was+0.013
Neg logit contributions
 unnecessary-0.025
ately-0.025
 fault-0.024
 weaknesses-0.024
arl-0.024
Token,
Token ID11
Feature activation-1.48e-01

Pos logit contributions
 mil+0.015
 was+0.015
,+0.014
 pl+0.014
 w+0.014
Neg logit contributions
 unnecessary-0.028
 weaknesses-0.028
arl-0.028
 fault-0.028
 ours-0.027
Token looked
Token ID3114
Feature activation+3.22e-01

Pos logit contributions
 unnecessary+0.006
 fault+0.006
arl+0.006
 weaknesses+0.006
ately+0.006
Neg logit contributions
 mil-0.002
 marsh-0.002
 was-0.002
 particular-0.002
 pl-0.002
Token at
Token ID379
Feature activation+1.29e-01

Pos logit contributions
 mil+0.007
 marsh+0.007
 particular+0.006
 was+0.006
 pl+0.006
Neg logit contributions
 unnecessary-0.012
 fault-0.011
ately-0.011
arl-0.011
 weaknesses-0.011
Token shots
Token ID6934
Feature activation+1.72e-01

Pos logit contributions
 mil+0.014
 marsh+0.014
 particular+0.012
 pl+0.011
M+0.011
Neg logit contributions
 unnecessary-0.025
 weaknesses-0.025
 fault-0.025
ately-0.025
der-0.024
Token and
Token ID290
Feature activation+4.79e-01

Pos logit contributions
 mil+0.003
 marsh+0.003
 particular+0.003
 pl+0.003
M+0.002
Neg logit contributions
ately-0.007
der-0.006
 fault-0.006
 unnecessary-0.006
arl-0.006
Token then
Token ID788
Feature activation+4.91e-01

Pos logit contributions
 mil+0.009
 marsh+0.009
 particular+0.009
 pl+0.008
 was+0.008
Neg logit contributions
 unnecessary-0.018
ately-0.018
 weaknesses-0.018
 fault-0.017
arl-0.017
Token told
Token ID1297
Feature activation-2.54e-01

Pos logit contributions
 mil+0.009
 marsh+0.009
 particular+0.009
 pl+0.008
 was+0.008
Neg logit contributions
 unnecessary-0.018
ately-0.018
 fault-0.018
 weaknesses-0.018
arl-0.017
Token Ab
Token ID2275
Feature activation+6.77e-01

Pos logit contributions
 unnecessary+0.010
 fault+0.010
ately+0.009
der+0.009
 weaknesses+0.009
Neg logit contributions
 mil-0.005
 particular-0.004
 marsh-0.004
M-0.004
 was-0.004
Tokenig
Token ID328
Feature activation+2.88e+00

Pos logit contributions
 mil+0.018
 particular+0.018
 pl+0.018
 was+0.018
 marsh+0.017
Neg logit contributions
 unnecessary-0.020
 weaknesses-0.019
arl-0.019
 fault-0.019
der-0.019
Tokenail
Token ID603
Feature activation+1.01e+00

Pos logit contributions
 marsh+0.080
 mil+0.080
 particular+0.078
 pl+0.070
 sn+0.070
Neg logit contributions
ately-0.085
 fault-0.084
arl-0.079
 unnecessary-0.079
 weaknesses-0.078
Token to
Token ID284
Feature activation+2.77e-01

Pos logit contributions
 mil+0.019
 marsh+0.019
 particular+0.018
 pl+0.017
 was+0.016
Neg logit contributions
 fault-0.037
 unnecessary-0.037
ately-0.037
 weaknesses-0.037
arl-0.036
Token take
Token ID1011
Feature activation+1.46e-01

Pos logit contributions
 mil+0.006
 marsh+0.006
 particular+0.005
 was+0.005
 pl+0.005
Neg logit contributions
 fault-0.010
 unnecessary-0.010
 weaknesses-0.010
arl-0.010
ately-0.010
Token some
Token ID617
Feature activation+1.52e+00

Pos logit contributions
 mil+0.002
 particular+0.002
 marsh+0.002
 pl+0.002
 was+0.002
Neg logit contributions
 unnecessary-0.006
 fault-0.006
 weaknesses-0.006
ately-0.006
arl-0.005
Token.
Token ID13
Feature activation+1.96e-01

Pos logit contributions
 mil+0.033
 marsh+0.031
 particular+0.031
 was+0.029
 pl+0.029
Neg logit contributions
 unnecessary-0.055
 weaknesses-0.052
 fault-0.052
ately-0.051
arl-0.051
Token!"
Token ID2474
Feature activation+1.17e+00

Pos logit contributions
 mil+0.007
 marsh+0.006
 particular+0.006
 pl+0.005
 was+0.005
Neg logit contributions
 unnecessary-0.037
ately-0.037
 fault-0.037
arl-0.037
 weaknesses-0.036
Token\n
Token ID198
Feature activation+1.43e+00

Pos logit contributions
 mil+0.017
 marsh+0.017
 particular+0.016
 pl+0.014
 was+0.014
Neg logit contributions
ately-0.049
 unnecessary-0.049
arl-0.048
 fault-0.048
 weaknesses-0.048
Token\n
Token ID198
Feature activation+1.13e-01

Pos logit contributions
 mil+0.026
 marsh+0.024
 particular+0.022
 pl+0.021
 sn+0.020
Neg logit contributions
arl-0.055
ately-0.055
 unnecessary-0.054
 fault-0.054
 weaknesses-0.053
TokenSo
Token ID2396
Feature activation+6.08e-01

Pos logit contributions
 mil+0.003
 marsh+0.003
 particular+0.003
 pl+0.002
 Ms+0.002
Neg logit contributions
 unnecessary-0.003
arl-0.003
 weaknesses-0.003
ately-0.003
der-0.003
Token L
Token ID406
Feature activation-1.31e-02

Pos logit contributions
 mil+0.010
 marsh+0.010
 particular+0.009
 pl+0.008
 pasture+0.008
Neg logit contributions
ately-0.025
 unnecessary-0.024
arl-0.024
 weaknesses-0.023
 fault-0.023
Tokenizz
Token ID6457
Feature activation+2.87e+00

Pos logit contributions
 unnecessary+0.000
ately+0.000
 fault+0.000
arl+0.000
 weaknesses+0.000
Neg logit contributions
 mil-0.000
 marsh-0.000
 particular-0.000
 was-0.000
 pl-0.000
Tokenie
Token ID494
Feature activation+1.70e+00

Pos logit contributions
 mil+0.059
 marsh+0.058
 particular+0.056
 pl+0.050
 was+0.048
Neg logit contributions
ately-0.104
 unnecessary-0.101
 fault-0.099
 weaknesses-0.099
arl-0.099
Token and
Token ID290
Feature activation+5.87e-01

Pos logit contributions
 particular+0.036
 marsh+0.035
 mil+0.033
 Ms+0.029
Ms+0.029
Neg logit contributions
ately-0.061
der-0.057
 weaknesses-0.055
 fault-0.055
 unnecessary-0.053
Token the
Token ID262
Feature activation+8.78e-01

Pos logit contributions
 mil+0.008
 marsh+0.007
 particular+0.007
 pl+0.005
ink+0.005
Neg logit contributions
ately-0.025
 unnecessary-0.024
 weaknesses-0.024
der-0.024
 fault-0.023
Token motor
Token ID5584
Feature activation+1.72e+00

Pos logit contributions
 mil+0.021
 marsh+0.021
 particular+0.021
 was+0.020
 pl+0.020
Neg logit contributions
 unnecessary-0.029
 fault-0.028
ately-0.028
 weaknesses-0.028
arl-0.028
Token got
Token ID1392
Feature activation+3.72e-01

Pos logit contributions
 particular+0.031
 mil+0.030
 marsh+0.029
 Ms+0.025
M+0.025
Neg logit contributions
ately-0.064
der-0.062
 weaknesses-0.061
 fault-0.060
 unnecessary-0.060
Token her
Token ID607
Feature activation+1.38e-02

Pos logit contributions
 unnecessary+0.027
ately+0.026
 fault+0.026
 weaknesses+0.026
arl+0.025
Neg logit contributions
 mil-0.015
 marsh-0.015
 particular-0.014
 pl-0.014
 was-0.013
Token mom
Token ID1995
Feature activation-1.30e-01

Pos logit contributions
 mil+0.000
 marsh+0.000
 particular+0.000
 pl+0.000
 was+0.000
Neg logit contributions
 fault-0.000
 unnecessary-0.000
ately-0.000
 weaknesses-0.000
arl-0.000
Token came
Token ID1625
Feature activation-1.21e+00

Pos logit contributions
ately+0.005
arl+0.005
 unnecessary+0.005
 weaknesses+0.005
emic+0.005
Neg logit contributions
 marsh-0.002
 mil-0.002
 particular-0.002
 pl-0.002
 spring-0.002
Token in
Token ID287
Feature activation+6.41e-01

Pos logit contributions
 unnecessary+0.043
ately+0.042
 fault+0.041
arl+0.041
 weaknesses+0.040
Neg logit contributions
 mil-0.026
 marsh-0.026
 particular-0.024
 pl-0.022
 was-0.022
Token and
Token ID290
Feature activation-7.05e-01

Pos logit contributions
 marsh+0.012
 mil+0.011
 particular+0.010
 pl+0.010
me+0.009
Neg logit contributions
ately-0.024
arl-0.024
 unnecessary-0.023
 fault-0.023
keeping-0.023
Token handed
Token ID10158
Feature activation-2.98e+00

Pos logit contributions
ately+0.023
 unnecessary+0.021
 weaknesses+0.020
emic+0.020
arl+0.020
Neg logit contributions
 marsh-0.018
 mil-0.017
 particular-0.017
 pasture-0.015
 spring-0.014
Token her
Token ID607
Feature activation-8.05e-01

Pos logit contributions
 unnecessary+0.125
ately+0.123
 fault+0.123
arl+0.121
 weaknesses+0.121
Neg logit contributions
 mil-0.044
 marsh-0.042
 particular-0.039
 pl-0.037
 was-0.034
Token the
Token ID262
Feature activation+4.75e-01

Pos logit contributions
 fault+0.032
 unnecessary+0.031
ately+0.030
 weaknesses+0.029
der+0.029
Neg logit contributions
 mil-0.014
 marsh-0.014
 particular-0.013
 pl-0.012
M-0.011
Token video
Token ID2008
Feature activation-6.65e-01

Pos logit contributions
 mil+0.011
 marsh+0.011
 particular+0.011
 pl+0.010
M+0.010
Neg logit contributions
 fault-0.015
 unnecessary-0.014
arl-0.014
 weaknesses-0.014
 ours-0.014
Token.
Token ID13
Feature activation-1.31e+00

Pos logit contributions
 fault+0.025
ately+0.024
der+0.024
arl+0.023
 unnecessary+0.023
Neg logit contributions
 marsh-0.013
 mil-0.013
 particular-0.013
 pl-0.011
 sn-0.010
Token Amy
Token ID14235
Feature activation+4.26e-01

Pos logit contributions
ately+0.036
 fault+0.036
 weaknesses+0.035
 unnecessary+0.035
der+0.035
Neg logit contributions
 mil-0.037
 marsh-0.036
 particular-0.035
 pl-0.034
 sn-0.032
Token big
Token ID1263
Feature activation-2.81e-01

Pos logit contributions
 unnecessary+0.017
 fault+0.017
arl+0.017
 weaknesses+0.016
ately+0.016
Neg logit contributions
 mil-0.009
 marsh-0.009
 particular-0.009
 pl-0.008
 was-0.008
Token sign
Token ID1051
Feature activation+2.96e-01

Pos logit contributions
 fault+0.009
 unnecessary+0.009
arl+0.009
 ours+0.009
ately+0.009
Neg logit contributions
 mil-0.006
 marsh-0.006
M-0.006
 particular-0.006
 was-0.005
Token that
Token ID326
Feature activation-3.13e-01

Pos logit contributions
 marsh+0.008
 mil+0.007
 particular+0.007
 pl+0.007
M+0.006
Neg logit contributions
ately-0.009
der-0.009
arl-0.009
 fault-0.008
 unnecessary-0.008
Token said
Token ID531
Feature activation-4.79e-01

Pos logit contributions
ately+0.009
 weaknesses+0.009
der+0.009
 unnecessary+0.008
 fault+0.008
Neg logit contributions
 mil-0.008
 marsh-0.008
 particular-0.008
 pasture-0.007
 pl-0.007
Token "
Token ID366
Feature activation+5.17e-01

Pos logit contributions
ately+0.017
 fault+0.017
 unnecessary+0.017
der+0.017
arl+0.017
Neg logit contributions
 mil-0.009
 marsh-0.009
 particular-0.009
 pl-0.008
 sn-0.008
TokenSold
Token ID33873
Feature activation-2.97e+00

Pos logit contributions
 mil+0.013
 marsh+0.013
 particular+0.012
 pl+0.012
 was+0.012
Neg logit contributions
 unnecessary-0.016
ately-0.016
 fault-0.016
arl-0.016
 weaknesses-0.016
Token Out
Token ID3806
Feature activation-9.05e-02

Pos logit contributions
 unnecessary+0.078
 fault+0.077
 weaknesses+0.076
ately+0.070
arl+0.068
Neg logit contributions
 mil-0.091
 marsh-0.089
 particular-0.087
 was-0.086
 pl-0.085
Token".
Token ID1911
Feature activation+1.07e-02

Pos logit contributions
ately+0.003
arl+0.003
 fault+0.002
 ours+0.002
 unnecessary+0.002
Neg logit contributions
 marsh-0.003
 mil-0.003
 particular-0.003
 pur-0.002
 l-0.002
Token Lily
Token ID20037
Feature activation+6.16e-01

Pos logit contributions
 was+0.000
 mil+0.000
 marsh+0.000
 particular+0.000
,+0.000
Neg logit contributions
 unnecessary-0.000
 fault-0.000
 weaknesses-0.000
arl-0.000
ately-0.000
Token's
Token ID338
Feature activation-2.88e-01

Pos logit contributions
 marsh+0.013
 mil+0.013
 particular+0.012
 gra+0.010
 Ms+0.010
Neg logit contributions
ately-0.023
 weaknesses-0.022
der-0.022
 unnecessary-0.021
 fault-0.021
Token mom
Token ID1995
Feature activation+2.89e-02

Pos logit contributions
 fault+0.010
 unnecessary+0.010
ately+0.010
 weaknesses+0.010
arl+0.009
Neg logit contributions
 mil-0.007
 marsh-0.006
 particular-0.006
 pl-0.006
 was-0.006
Token let
Token ID1309
Feature activation-3.55e-01

Pos logit contributions
 marsh+0.001
 particular+0.001
 mil+0.001
 pasture+0.000
 gra+0.000
Neg logit contributions
ately-0.001
der-0.001
 weaknesses-0.001
 unnecessary-0.001
asing-0.001
Token out
Token ID503
Feature activation+6.87e-01

Pos logit contributions
 unnecessary+0.010
ately+0.009
 fault+0.008
 ours+0.008
arl+0.008
Neg logit contributions
 mil-0.010
 marsh-0.009
 particular-0.009
 pasture-0.009
mary-0.009
Token a
Token ID257
Feature activation+3.66e-01

Pos logit contributions
 mil+0.010
 marsh+0.009
 particular+0.009
 pl+0.008
 was+0.008
Neg logit contributions
 unnecessary-0.029
 fault-0.029
ately-0.029
 weaknesses-0.029
arl-0.029
Token big
Token ID1263
Feature activation-1.48e-01

Pos logit contributions
 mil+0.007
 marsh+0.007
 particular+0.007
 was+0.007
 pl+0.007
Neg logit contributions
 unnecessary-0.013
ately-0.013
 fault-0.013
arl-0.013
 weaknesses-0.013
Token sn
Token ID3013
Feature activation-9.51e-01

Pos logit contributions
 unnecessary+0.005
arl+0.005
 fault+0.005
ately+0.005
 weaknesses+0.005
Neg logit contributions
 mil-0.003
 marsh-0.003
 particular-0.003
 was-0.002
 pl-0.002
Tokenarl
Token ID7063
Feature activation-2.92e+00

Pos logit contributions
arl+0.028
ately+0.025
 fault+0.025
 unnecessary+0.024
 weaknesses+0.023
Neg logit contributions
 marsh-0.029
 mil-0.029
 particular-0.027
 was-0.027
 pl-0.026
Token and
Token ID290
Feature activation-1.17e-01

Pos logit contributions
arl+0.125
ately+0.124
 fault+0.122
der+0.120
 unnecessary+0.119
Neg logit contributions
 mil-0.042
 marsh-0.040
 particular-0.038
 pl-0.035
 sn-0.033
Token the
Token ID262
Feature activation+1.07e+00

Pos logit contributions
 unnecessary+0.004
ately+0.004
 weaknesses+0.004
der+0.004
 ours+0.004
Neg logit contributions
 particular-0.002
 mil-0.002
 marsh-0.002
ink-0.002
 end-0.002
Token mouse
Token ID10211
Feature activation-4.22e-01

Pos logit contributions
 mil+0.021
 marsh+0.020
 particular+0.020
 was+0.020
 pl+0.019
Neg logit contributions
 unnecessary-0.041
 fault-0.040
arl-0.040
ately-0.040
 weaknesses-0.040
Token knew
Token ID2993
Feature activation-5.47e-01

Pos logit contributions
ately+0.014
 weaknesses+0.014
der+0.014
 unnecessary+0.013
 fault+0.013
Neg logit contributions
 marsh-0.009
 particular-0.009
 mil-0.009
Ms-0.008
 Ms-0.008
Token he
Token ID339
Feature activation-1.26e+00

Pos logit contributions
arl+0.019
 unnecessary+0.018
 weaknesses+0.018
 fault+0.018
ately+0.017
Neg logit contributions
 was-0.013
 mil-0.013
 marsh-0.012
 pl-0.012
 particular-0.012
Token a
Token ID257
Feature activation-1.25e+00

Pos logit contributions
 mil+0.003
 marsh+0.003
 particular+0.003
 was+0.002
 pl+0.002
Neg logit contributions
 unnecessary-0.015
 fault-0.015
ately-0.015
 weaknesses-0.015
arl-0.015
Token p
Token ID279
Feature activation-2.02e+00

Pos logit contributions
 unnecessary+0.042
 fault+0.041
arl+0.041
ately+0.041
 weaknesses+0.041
Neg logit contributions
 mil-0.028
 marsh-0.028
 particular-0.027
 was-0.026
 pl-0.026
Tokenuddle
Token ID24500
Feature activation-1.33e+00

Pos logit contributions
 unnecessary+0.052
 fault+0.050
ately+0.050
 weaknesses+0.050
arl+0.048
Neg logit contributions
 mil-0.063
 marsh-0.061
 particular-0.061
 pl-0.060
 was-0.059
Token.
Token ID13
Feature activation-4.77e-01

Pos logit contributions
arl+0.047
der+0.047
ately+0.046
 fault+0.045
 unnecessary+0.043
Neg logit contributions
 marsh-0.028
 mil-0.028
 particular-0.027
 Ms-0.024
Ms-0.023
Token Her
Token ID2332
Feature activation-1.51e+00

Pos logit contributions
ately+0.015
 fault+0.015
 unnecessary+0.015
arl+0.015
 weaknesses+0.014
Neg logit contributions
 mil-0.011
 marsh-0.011
 particular-0.011
 pl-0.010
 sn-0.010
Token fancy
Token ID14996
Feature activation-2.83e+00

Pos logit contributions
 unnecessary+0.052
arl+0.051
ately+0.051
 weaknesses+0.051
 fault+0.050
Neg logit contributions
 mil-0.032
 marsh-0.032
 was-0.032
 particular-0.031
 pl-0.031
Token dress
Token ID6576
Feature activation-9.62e-01

Pos logit contributions
 unnecessary+0.081
 fault+0.079
ately+0.079
arl+0.078
 weaknesses+0.078
Neg logit contributions
 mil-0.080
 marsh-0.078
 particular-0.077
 was-0.075
 pl-0.074
Token was
Token ID373
Feature activation+3.80e-01

Pos logit contributions
ately+0.041
 fault+0.040
der+0.039
 unnecessary+0.038
 weaknesses+0.037
Neg logit contributions
 marsh-0.015
 mil-0.014
 particular-0.014
 Ms-0.013
Ms-0.012
Token all
Token ID477
Feature activation+2.63e-01

Pos logit contributions
 mil+0.010
oc+0.010
 particular+0.010
 marsh+0.010
 Ms+0.010
Neg logit contributions
 fault-0.011
arl-0.010
 unnecessary-0.010
der-0.010
 ours-0.010
Token wet
Token ID9583
Feature activation-4.95e-01

Pos logit contributions
 mil+0.007
 marsh+0.006
 particular+0.006
 pl+0.006
 pasture+0.006
Neg logit contributions
 fault-0.008
arl-0.008
 unnecessary-0.008
 ours-0.008
der-0.008
Token and
Token ID290
Feature activation-6.15e-01

Pos logit contributions
 unnecessary+0.024
 weaknesses+0.023
 fault+0.023
ately+0.023
arl+0.023
Neg logit contributions
 mil-0.005
 was-0.005
 marsh-0.005
,-0.004
 sn-0.004
Token found
Token ID1043
Feature activation-1.70e+00

Pos logit contributions
 marsh+0.013
 particular+0.013
 mil+0.012
opl+0.011
Ms+0.011
Neg logit contributions
ately-0.018
der-0.017
 weaknesses-0.016
ously-0.015
 intentions-0.015
Token a
Token ID257
Feature activation-3.96e-01

Pos logit contributions
ately+0.063
 fault+0.061
 unnecessary+0.061
der+0.059
 ours+0.058
Neg logit contributions
 mil-0.033
 particular-0.032
 marsh-0.031
 pasture-0.028
me-0.027
Token dusty
Token ID36972
Feature activation-1.24e+00

Pos logit contributions
 unnecessary+0.013
 fault+0.013
 weaknesses+0.013
arl+0.013
ately+0.012
Neg logit contributions
 mil-0.009
 marsh-0.009
 particular-0.008
 pl-0.008
 was-0.008
Token key
Token ID1994
Feature activation-2.99e-01

Pos logit contributions
 fault+0.032
 unnecessary+0.030
arl+0.029
 weaknesses+0.029
ately+0.029
Neg logit contributions
 mil-0.036
 marsh-0.035
 particular-0.034
me-0.033
 Ms-0.033
Token and
Token ID290
Feature activation-8.85e-01

Pos logit contributions
ately+0.012
 fault+0.012
 unnecessary+0.012
arl+0.011
 weaknesses+0.011
Neg logit contributions
 mil-0.005
 marsh-0.005
 particular-0.005
 pl-0.004
 was-0.004
Token handed
Token ID10158
Feature activation-2.81e+00

Pos logit contributions
ately+0.032
 unnecessary+0.031
 weaknesses+0.030
der+0.030
 fault+0.029
Neg logit contributions
 mil-0.018
 marsh-0.018
 particular-0.018
 pl-0.015
 pasture-0.015
Token it
Token ID340
Feature activation+6.06e-01

Pos logit contributions
ately+0.094
 unnecessary+0.092
 fault+0.092
 weaknesses+0.089
arl+0.088
Neg logit contributions
 mil-0.065
 marsh-0.064
 particular-0.061
 pl-0.057
 sn-0.055
Token to
Token ID284
Feature activation-1.97e+00

Pos logit contributions
 mil+0.017
 marsh+0.017
 particular+0.017
me+0.015
 Ms+0.015
Neg logit contributions
 unnecessary-0.017
 fault-0.017
ately-0.017
 weaknesses-0.016
 ours-0.016
Token Jane
Token ID12091
Feature activation-4.72e-01

Pos logit contributions
 unnecessary+0.066
 fault+0.064
ately+0.064
 weaknesses+0.063
arl+0.063
Neg logit contributions
 mil-0.047
 marsh-0.045
 particular-0.044
 was-0.043
 pl-0.042
Token.
Token ID13
Feature activation-1.49e+00

Pos logit contributions
 fault+0.017
ately+0.017
 unnecessary+0.017
der+0.016
arl+0.016
Neg logit contributions
 mil-0.009
 marsh-0.009
 particular-0.009
 pl-0.008
 lead-0.007
Token Jane
Token ID12091
Feature activation-7.42e-02

Pos logit contributions
 unnecessary+0.051
 fault+0.049
ately+0.049
 weaknesses+0.049
arl+0.049
Neg logit contributions
 mil-0.034
 marsh-0.033
 particular-0.033
 was-0.032
 pl-0.031
Token,
Token ID11
Feature activation+1.90e-02

Pos logit contributions
 mil+0.027
 particular+0.025
 marsh+0.025
 pl+0.023
M+0.023
Neg logit contributions
ately-0.049
 fault-0.048
der-0.048
 unnecessary-0.048
 weaknesses-0.047
Token he
Token ID339
Feature activation-3.10e-01

Pos logit contributions
 mil+0.000
 particular+0.000
 marsh+0.000
 pl+0.000
 was+0.000
Neg logit contributions
 unnecessary-0.001
ately-0.001
 fault-0.001
 weaknesses-0.001
der-0.001
Token accidentally
Token ID14716
Feature activation-5.87e-01

Pos logit contributions
ately+0.009
 unnecessary+0.008
der+0.008
 weaknesses+0.008
 spies+0.008
Neg logit contributions
 particular-0.009
 marsh-0.008
 mil-0.008
 tow-0.007
 gra-0.007
Token knocked
Token ID13642
Feature activation-4.79e-01

Pos logit contributions
 fault+0.014
 unnecessary+0.014
ately+0.014
der+0.014
 spies+0.013
Neg logit contributions
 particular-0.019
 marsh-0.017
 mil-0.017
Ms-0.017
 Ms-0.017
Token over
Token ID625
Feature activation+4.39e-01

Pos logit contributions
 unnecessary+0.024
 weaknesses+0.023
 fault+0.023
ately+0.023
arl+0.023
Neg logit contributions
 was-0.004
 mil-0.004
 marsh-0.004
 particular-0.003
 pl-0.003
Token Mr
Token ID1770
Feature activation-2.81e+00

Pos logit contributions
 mil+0.005
 marsh+0.005
 particular+0.005
 was+0.004
 pl+0.004
Neg logit contributions
 unnecessary-0.020
 fault-0.019
ately-0.019
 weaknesses-0.019
arl-0.019
Token.
Token ID13
Feature activation-1.08e+00

Pos logit contributions
 unnecessary+0.126
 weaknesses+0.122
ately+0.121
arl+0.120
 fault+0.119
Neg logit contributions
 mil-0.037
 was-0.034
 marsh-0.033
 particular-0.032
 pl-0.030
Token Smith
Token ID4176
Feature activation-1.51e+00

Pos logit contributions
 unnecessary+0.036
 weaknesses+0.034
ately+0.034
 fault+0.034
arl+0.033
Neg logit contributions
 mil-0.027
 marsh-0.027
 particular-0.026
 pl-0.025
 was-0.025
Token's
Token ID338
Feature activation-6.34e-01

Pos logit contributions
 unnecessary+0.063
 fault+0.062
ately+0.062
arl+0.061
 weaknesses+0.061
Neg logit contributions
 mil-0.022
 marsh-0.022
 particular-0.021
 pl-0.019
 was-0.019
Token drink
Token ID4144
Feature activation-5.70e-01

Pos logit contributions
 fault+0.021
 unnecessary+0.021
ately+0.020
arl+0.020
 weaknesses+0.020
Neg logit contributions
 mil-0.015
 marsh-0.014
 particular-0.014
 was-0.013
 pl-0.013
Token that
Token ID326
Feature activation-3.99e-01

Pos logit contributions
der+0.019
arl+0.017
 fault+0.017
ately+0.016
 unnecessary+0.016
Neg logit contributions
 particular-0.014
 herd-0.013
 mil-0.013
 pasture-0.013
 marsh-0.013
Token a
Token ID257
Feature activation-7.69e-01

Pos logit contributions
 fault+0.005
 unnecessary+0.005
 weaknesses+0.005
ately+0.005
der+0.005
Neg logit contributions
 particular-0.002
 mil-0.002
 marsh-0.002
 pl-0.002
 was-0.002
Token big
Token ID1263
Feature activation-1.15e+00

Pos logit contributions
 unnecessary+0.028
 fault+0.028
ately+0.028
 weaknesses+0.027
arl+0.027
Neg logit contributions
 mil-0.016
 marsh-0.015
 particular-0.015
 was-0.014
 pl-0.014
Token arm
Token ID3211
Feature activation+2.15e-01

Pos logit contributions
 unnecessary+0.036
 fault+0.036
arl+0.035
ately+0.035
 weaknesses+0.035
Neg logit contributions
 mil-0.028
 marsh-0.028
 particular-0.027
 was-0.026
 pl-0.026
Token.
Token ID13
Feature activation-8.73e-01

Pos logit contributions
 mil+0.003
 marsh+0.003
 was+0.003
 particular+0.003
 pl+0.002
Neg logit contributions
 unnecessary-0.010
 fault-0.009
 weaknesses-0.009
ately-0.009
arl-0.009
Token It
Token ID632
Feature activation-3.62e-01

Pos logit contributions
 unnecessary+0.032
 weaknesses+0.031
ately+0.031
 fault+0.030
arl+0.029
Neg logit contributions
 was-0.019
 mil-0.019
 marsh-0.019
 particular-0.018
,-0.018
Token grabs
Token ID22378
Feature activation-2.80e+00

Pos logit contributions
 fault+0.010
der+0.010
 unnecessary+0.009
 weaknesses+0.009
 ours+0.009
Neg logit contributions
 mil-0.010
 particular-0.010
Ms-0.009
M-0.009
 marsh-0.009
Token the
Token ID262
Feature activation-1.39e-01

Pos logit contributions
 unnecessary+0.116
 fault+0.114
 weaknesses+0.113
ately+0.113
arl+0.113
Neg logit contributions
 mil-0.044
 marsh-0.043
 particular-0.041
 was-0.040
 pl-0.039
Token dump
Token ID10285
Feature activation-9.46e-02

Pos logit contributions
 weaknesses+0.005
 unnecessary+0.005
ately+0.005
 fault+0.005
der+0.005
Neg logit contributions
 mil-0.003
 pl-0.003
 marsh-0.003
 was-0.003
 l-0.003
Tokenster
Token ID1706
Feature activation-3.36e-01

Pos logit contributions
ately+0.003
 fault+0.003
 unnecessary+0.003
arl+0.003
der+0.003
Neg logit contributions
 mil-0.002
 marsh-0.002
 particular-0.002
 pl-0.002
 was-0.002
Token.
Token ID13
Feature activation-9.99e-01

Pos logit contributions
 fault+0.013
ately+0.013
 unnecessary+0.013
arl+0.013
der+0.012
Neg logit contributions
 marsh-0.006
 mil-0.006
 particular-0.006
 Ms-0.005
 pl-0.005
Token It
Token ID632
Feature activation-4.93e-01

Pos logit contributions
 unnecessary+0.037
 weaknesses+0.036
ately+0.035
 fault+0.035
arl+0.033
Neg logit contributions
 was-0.023
 mil-0.022
 marsh-0.022
 particular-0.021
,-0.020
Token fill
Token ID6070
Feature activation-9.62e-01

Pos logit contributions
 unnecessary+0.025
 fault+0.025
ately+0.024
arl+0.024
 weaknesses+0.024
Neg logit contributions
 mil-0.017
 marsh-0.017
 particular-0.016
 was-0.016
 pl-0.016
Token it
Token ID340
Feature activation-2.90e-01

Pos logit contributions
 unnecessary+0.029
 fault+0.028
ately+0.028
 ours+0.028
 weaknesses+0.027
Neg logit contributions
 mil-0.024
 particular-0.024
 marsh-0.024
 pl-0.022
 sn-0.022
Token with
Token ID351
Feature activation+4.80e-01

Pos logit contributions
 ours+0.011
arl+0.011
ately+0.011
der+0.011
 fault+0.011
Neg logit contributions
 particular-0.005
 mil-0.005
 marsh-0.005
 pl-0.004
 tow-0.004
Token your
Token ID534
Feature activation-1.59e+00

Pos logit contributions
 particular+0.012
 mil+0.011
 marsh+0.010
M+0.010
 Ms+0.010
Neg logit contributions
 unnecessary-0.016
 fault-0.015
 ours-0.015
 weaknesses-0.015
ately-0.015
Token own
Token ID898
Feature activation-2.02e+00

Pos logit contributions
 unnecessary+0.051
 fault+0.049
ately+0.049
 weaknesses+0.048
arl+0.048
Neg logit contributions
 mil-0.040
 marsh-0.039
 particular-0.038
 was-0.037
 pl-0.036
Token adventures
Token ID17545
Feature activation-2.79e+00

Pos logit contributions
 unnecessary+0.064
 fault+0.063
ately+0.062
 weaknesses+0.061
arl+0.061
Neg logit contributions
 mil-0.050
 marsh-0.050
 particular-0.049
 pl-0.045
 was-0.045
Token".
Token ID1911
Feature activation-1.01e+00

Pos logit contributions
 unnecessary+0.083
 fault+0.082
ately+0.081
 weaknesses+0.081
arl+0.080
Neg logit contributions
 mil-0.074
 marsh-0.072
 particular-0.072
 pl-0.069
 was-0.068
Token Jane
Token ID12091
Feature activation+1.05e+00

Pos logit contributions
 unnecessary+0.039
arl+0.037
 fault+0.037
 weaknesses+0.037
ately+0.035
Neg logit contributions
 was-0.021
 mil-0.021
 particular-0.020
 marsh-0.019
 pl-0.019
Token looked
Token ID3114
Feature activation+6.48e-01

Pos logit contributions
 marsh+0.022
 mil+0.020
 particular+0.019
 Ms+0.017
 tow+0.016
Neg logit contributions
ately-0.041
der-0.037
 weaknesses-0.035
ously-0.033
perate-0.033
Token around
Token ID1088
Feature activation+7.41e-01

Pos logit contributions
 mil+0.014
 marsh+0.014
 particular+0.014
 pl+0.012
M+0.012
Neg logit contributions
 unnecessary-0.023
ately-0.021
 fault-0.021
arl-0.021
der-0.021
Token,
Token ID11
Feature activation-7.08e-02

Pos logit contributions
 mil+0.012
 marsh+0.012
 particular+0.010
 pl+0.010
me+0.009
Neg logit contributions
ately-0.030
 unnecessary-0.029
der-0.029
 fault-0.029
arl-0.028
Token
Token ID198
Feature activation-4.36e-01

Pos logit contributions
 mil+0.005
 marsh+0.005
 particular+0.005
 was+0.005
 pl+0.005
Neg logit contributions
 unnecessary-0.020
 fault-0.020
ately-0.019
 weaknesses-0.019
arl-0.019
TokenThe
Token ID464
Feature activation-6.91e-02

Pos logit contributions
 unnecessary+0.015
 weaknesses+0.015
arl+0.014
ately+0.014
 fault+0.014
Neg logit contributions
 mil-0.010
 marsh-0.010
 particular-0.009
 pl-0.009
 sn-0.008
Token bug
Token ID5434
Feature activation+4.32e-01

Pos logit contributions
ately+0.002
 unnecessary+0.002
 fault+0.002
arl+0.002
keeping+0.002
Neg logit contributions
 was-0.002
 spring-0.002
 end-0.002
 particular-0.002
 pl-0.002
Token found
Token ID1043
Feature activation-1.39e+00

Pos logit contributions
 marsh+0.010
 mil+0.010
 particular+0.010
 Ms+0.009
Ms+0.008
Neg logit contributions
ately-0.014
 unnecessary-0.014
 weaknesses-0.014
der-0.014
 fault-0.014
Token a
Token ID257
Feature activation-1.52e+00

Pos logit contributions
 unnecessary+0.058
 fault+0.057
ately+0.056
 weaknesses+0.056
arl+0.054
Neg logit contributions
 mil-0.023
 particular-0.022
 marsh-0.022
 pl-0.018
 Ms-0.017
Token nice
Token ID3621
Feature activation-2.78e+00

Pos logit contributions
ately+0.060
 unnecessary+0.060
 weaknesses+0.058
 fault+0.058
arl+0.058
Neg logit contributions
 mil-0.027
 marsh-0.026
 particular-0.025
 was-0.025
 pl-0.025
Token tree
Token ID5509
Feature activation-7.92e-01

Pos logit contributions
 unnecessary+0.102
 fault+0.101
arl+0.099
 weaknesses+0.099
ately+0.098
Neg logit contributions
 mil-0.056
 marsh-0.055
 particular-0.052
 pl-0.049
 was-0.049
Token.
Token ID13
Feature activation-1.24e+00

Pos logit contributions
 fault+0.031
 unnecessary+0.031
ately+0.030
arl+0.030
 weaknesses+0.030
Neg logit contributions
 marsh-0.014
 mil-0.014
 particular-0.013
 pl-0.012
 Ms-0.011
Token The
Token ID383
Feature activation-5.23e-01

Pos logit contributions
 unnecessary+0.049
 fault+0.049
 weaknesses+0.047
ately+0.047
arl+0.046
Neg logit contributions
 mil-0.023
 was-0.022
 marsh-0.022
 sn-0.021
 particular-0.021
Token tree
Token ID5509
Feature activation-4.94e-01

Pos logit contributions
ately+0.019
 unnecessary+0.018
arl+0.018
 fault+0.018
der+0.017
Neg logit contributions
 spring-0.012
 was-0.012
 end-0.011
 pl-0.011
 b-0.011
Token was
Token ID373
Feature activation-8.47e-01

Pos logit contributions
ately+0.018
der+0.017
 weaknesses+0.017
endant+0.017
 fault+0.017
Neg logit contributions
 marsh-0.010
 mil-0.010
 particular-0.009
 Ms-0.008
Ms-0.008
Token butterfly
Token ID35113
Feature activation+3.73e-01

Pos logit contributions
 mil+0.008
 particular+0.008
 marsh+0.007
 was+0.007
 pl+0.007
Neg logit contributions
 unnecessary-0.014
 fault-0.013
arl-0.013
ately-0.013
 weaknesses-0.013
Token back
Token ID736
Feature activation+2.74e-02

Pos logit contributions
 mil+0.008
 marsh+0.008
 particular+0.008
 Ms+0.008
Ms+0.007
Neg logit contributions
 unnecessary-0.012
 fault-0.012
ately-0.012
 weaknesses-0.012
der-0.011
Token in
Token ID287
Feature activation+6.38e-01

Pos logit contributions
 mil+0.000
 marsh+0.000
 particular+0.000
 was+0.000
 pl+0.000
Neg logit contributions
 unnecessary-0.001
 fault-0.001
ately-0.001
 weaknesses-0.001
arl-0.001
Token its
Token ID663
Feature activation-1.07e-01

Pos logit contributions
 mil+0.011
 marsh+0.010
me+0.009
 pl+0.009
 particular+0.009
Neg logit contributions
 fault-0.024
arl-0.024
 unnecessary-0.024
ately-0.024
 weaknesses-0.023
Token natural
Token ID3288
Feature activation-1.78e-01

Pos logit contributions
 unnecessary+0.003
 fault+0.003
 weaknesses+0.003
 spine+0.003
 ours+0.003
Neg logit contributions
 mil-0.003
 marsh-0.002
me-0.002
 particular-0.002
 Ms-0.002
Token habitat
Token ID20018
Feature activation-2.77e+00

Pos logit contributions
 fault+0.006
 unnecessary+0.005
ately+0.005
 weaknesses+0.005
arl+0.005
Neg logit contributions
 mil-0.005
 marsh-0.004
 particular-0.004
 Ms-0.004
M-0.004
Token.
Token ID13
Feature activation-8.30e-01

Pos logit contributions
 unnecessary+0.124
 weaknesses+0.120
 fault+0.120
ately+0.119
arl+0.119
Neg logit contributions
 mil-0.034
 marsh-0.033
 particular-0.031
 was-0.031
 pl-0.029
Token He
Token ID679
Feature activation-1.70e-02

Pos logit contributions
 unnecessary+0.026
 fault+0.026
ately+0.026
arl+0.025
 weaknesses+0.025
Neg logit contributions
 mil-0.021
 marsh-0.020
 particular-0.020
 pl-0.019
 was-0.019
Token waved
Token ID26834
Feature activation-2.16e-02

Pos logit contributions
ately+0.001
der+0.001
 weaknesses+0.001
 unnecessary+0.001
endant+0.001
Neg logit contributions
 mil-0.000
 marsh-0.000
 particular-0.000
Ms-0.000
 Ms-0.000
Token goodbye
Token ID24829
Feature activation-3.63e-01

Pos logit contributions
 unnecessary+0.001
ately+0.001
 fault+0.001
arl+0.001
 weaknesses+0.001
Neg logit contributions
 mil-0.001
 marsh-0.001
 particular-0.001
 pl-0.000
 was-0.000
Token to
Token ID284
Feature activation-3.20e-01

Pos logit contributions
 unnecessary+0.014
 fault+0.014
 weaknesses+0.014
arl+0.014
ately+0.014
Neg logit contributions
 mil-0.006
 marsh-0.006
 particular-0.006
 was-0.006
 pl-0.006
Token soft
Token ID2705
Feature activation-6.03e-01

Pos logit contributions
 fault+0.012
 unnecessary+0.012
arl+0.012
 weaknesses+0.011
der+0.011
Neg logit contributions
 mil-0.007
 marsh-0.006
 particular-0.006
 pl-0.006
 was-0.006
Token t
Token ID256
Feature activation-6.80e-01

Pos logit contributions
 fault+0.020
arl+0.020
 spine+0.020
der+0.020
ately+0.019
Neg logit contributions
 mil-0.012
 particular-0.011
 marsh-0.011
Ms-0.011
 pl-0.011
Tokeneddy
Token ID21874
Feature activation+1.70e-01

Pos logit contributions
 unnecessary+0.017
 fault+0.016
ately+0.016
arl+0.016
 weaknesses+0.016
Neg logit contributions
 mil-0.022
 marsh-0.022
 particular-0.021
 was-0.021
 pl-0.021
Token bear
Token ID6842
Feature activation+1.00e-01

Pos logit contributions
 mil+0.003
 particular+0.003
 marsh+0.003
 pl+0.003
 was+0.002
Neg logit contributions
 unnecessary-0.006
 fault-0.006
ately-0.006
der-0.006
arl-0.006
Token named
Token ID3706
Feature activation-4.80e-02

Pos logit contributions
 particular+0.002
 marsh+0.002
 mil+0.002
 pl+0.002
M+0.002
Neg logit contributions
 fault-0.003
der-0.003
ately-0.003
 spies-0.003
 spine-0.003
Token Mr
Token ID1770
Feature activation-2.77e+00

Pos logit contributions
arl+0.001
 unnecessary+0.001
 fault+0.001
ately+0.001
der+0.001
Neg logit contributions
 mil-0.001
 particular-0.001
 pl-0.001
 marsh-0.001
 was-0.001
Token.
Token ID13
Feature activation-5.72e-01

Pos logit contributions
 unnecessary+0.131
 weaknesses+0.128
ately+0.126
arl+0.125
 fault+0.124
Neg logit contributions
 mil-0.028
 was-0.026
 marsh-0.025
,-0.024
 particular-0.024
Token Bear
Token ID14732
Feature activation-9.61e-01

Pos logit contributions
arl+0.020
 fault+0.020
ately+0.019
 unnecessary+0.019
der+0.019
Neg logit contributions
 pl-0.011
 mil-0.011
 particular-0.010
 was-0.009
 marsh-0.009
Token.
Token ID13
Feature activation+1.83e-01

Pos logit contributions
 fault+0.032
der+0.030
ately+0.030
 weaknesses+0.029
 unnecessary+0.028
Neg logit contributions
 marsh-0.021
 particular-0.021
 mil-0.021
 pl-0.019
Ms-0.017
Token She
Token ID1375
Feature activation+2.08e-01

Pos logit contributions
 pl+0.004
 mil+0.004
 particular+0.004
 marsh+0.004
g+0.003
Neg logit contributions
arl-0.006
 fault-0.005
der-0.005
 weaknesses-0.005
ately-0.005
Token loved
Token ID6151
Feature activation-2.01e-01

Pos logit contributions
 particular+0.006
 mil+0.006
 marsh+0.006
Ms+0.006
 front+0.005
Neg logit contributions
der-0.005
 villains-0.005
ately-0.004
 weaknesses-0.004
endant-0.004
Token,
Token ID11
Feature activation-6.17e-01

Pos logit contributions
 mil+0.012
 marsh+0.012
 particular+0.012
 pl+0.011
 was+0.011
Neg logit contributions
 unnecessary-0.021
 fault-0.021
ately-0.021
arl-0.020
 weaknesses-0.020
Token a
Token ID257
Feature activation-2.34e-01

Pos logit contributions
 unnecessary+0.025
 fault+0.024
ately+0.024
arl+0.024
 weaknesses+0.024
Neg logit contributions
 mil-0.011
 marsh-0.010
 particular-0.010
 was-0.010
 pl-0.009
Token soft
Token ID2705
Feature activation-1.27e+00

Pos logit contributions
 unnecessary+0.010
ately+0.009
 weaknesses+0.009
 fault+0.009
arl+0.009
Neg logit contributions
 mil-0.004
 marsh-0.004
 particular-0.004
 was-0.004
 pl-0.004
Token bear
Token ID6842
Feature activation-2.48e-01

Pos logit contributions
 fault+0.041
arl+0.040
 unnecessary+0.039
ately+0.039
 weaknesses+0.039
Neg logit contributions
 mil-0.030
 marsh-0.029
 particular-0.028
 pl-0.027
 Ms-0.026
Token called
Token ID1444
Feature activation-6.16e-02

Pos logit contributions
 fault+0.009
der+0.009
ately+0.009
arl+0.008
endant+0.008
Neg logit contributions
 marsh-0.005
 particular-0.005
 mil-0.005
 pl-0.004
Ms-0.004
Token Mr
Token ID1770
Feature activation-2.77e+00

Pos logit contributions
arl+0.002
 unnecessary+0.002
 fault+0.002
ately+0.002
 weaknesses+0.002
Neg logit contributions
 mil-0.002
 particular-0.002
 marsh-0.002
 pl-0.001
 was-0.001
Token.
Token ID13
Feature activation-5.58e-01

Pos logit contributions
 unnecessary+0.136
 weaknesses+0.133
ately+0.132
arl+0.128
 fault+0.127
Neg logit contributions
 was-0.023
 mil-0.022
,-0.021
 marsh-0.019
 particular-0.018
Token Bear
Token ID14732
Feature activation-1.07e+00

Pos logit contributions
 fault+0.020
arl+0.020
 unnecessary+0.019
ately+0.019
der+0.018
Neg logit contributions
 mil-0.011
 pl-0.011
 particular-0.011
 marsh-0.010
 was-0.010
Token.
Token ID13
Feature activation-1.72e-01

Pos logit contributions
 fault+0.037
der+0.034
ately+0.034
 weaknesses+0.033
arl+0.032
Neg logit contributions
 mil-0.023
 marsh-0.023
 particular-0.022
 pl-0.021
 Ms-0.020
Token One
Token ID1881
Feature activation+3.84e-01

Pos logit contributions
arl+0.005
 fault+0.005
 ours+0.005
der+0.005
 weaknesses+0.005
Neg logit contributions
 pl-0.004
 mil-0.004
 particular-0.004
 marsh-0.004
me-0.004
Token day
Token ID1110
Feature activation-7.76e-02

Pos logit contributions
 marsh+0.011
 mil+0.011
Ms+0.011
 pl+0.010
 Ms+0.010
Neg logit contributions
 fault-0.009
icated-0.009
 spine-0.009
ately-0.009
 privacy-0.009
Token friend
Token ID1545
Feature activation-8.58e-01

Pos logit contributions
 mil+0.001
 marsh+0.001
 particular+0.001
 pl+0.001
 Ms+0.001
Neg logit contributions
 fault-0.001
 unnecessary-0.001
arl-0.001
ately-0.001
 weaknesses-0.001
Token,
Token ID11
Feature activation-3.52e-02

Pos logit contributions
 fault+0.033
 unnecessary+0.031
ately+0.031
 spies+0.031
 weaknesses+0.030
Neg logit contributions
 mil-0.015
 particular-0.013
 marsh-0.013
 pl-0.013
 gra-0.012
Token a
Token ID257
Feature activation-3.52e-01

Pos logit contributions
 unnecessary+0.001
 fault+0.001
ately+0.001
 weaknesses+0.001
arl+0.001
Neg logit contributions
 mil-0.001
 marsh-0.001
 particular-0.001
 pl-0.001
 was-0.001
Token potato
Token ID21219
Feature activation+2.47e-01

Pos logit contributions
 unnecessary+0.011
 fault+0.011
arl+0.011
 weaknesses+0.011
ately+0.011
Neg logit contributions
 mil-0.008
 marsh-0.008
 particular-0.008
 pl-0.007
 was-0.007
Token named
Token ID3706
Feature activation-4.91e-01

Pos logit contributions
 particular+0.005
 mil+0.005
 marsh+0.004
 pl+0.004
 Ms+0.004
Neg logit contributions
 fault-0.009
der-0.009
ately-0.009
arl-0.009
 weaknesses-0.008
Token Mr
Token ID1770
Feature activation-2.76e+00

Pos logit contributions
 unnecessary+0.015
 fault+0.015
ately+0.015
arl+0.015
 weaknesses+0.015
Neg logit contributions
 mil-0.012
 particular-0.012
 marsh-0.012
 pl-0.011
 was-0.011
Token.
Token ID13
Feature activation-7.77e-01

Pos logit contributions
 unnecessary+0.136
 weaknesses+0.134
ately+0.132
arl+0.130
 fault+0.128
Neg logit contributions
 mil-0.021
 was-0.020
,-0.020
 marsh-0.017
 sn-0.015
Token Sp
Token ID1338
Feature activation-5.85e-01

Pos logit contributions
arl+0.028
 unnecessary+0.028
 fault+0.028
ately+0.027
 weaknesses+0.027
Neg logit contributions
 mil-0.016
 particular-0.015
 marsh-0.015
 pl-0.014
 was-0.014
Tokenud
Token ID463
Feature activation+1.69e+00

Pos logit contributions
 unnecessary+0.018
 weaknesses+0.017
 fault+0.017
arl+0.017
ately+0.017
Neg logit contributions
 mil-0.016
 marsh-0.015
 particular-0.015
 was-0.015
 pl-0.014
Token.
Token ID13
Feature activation+3.05e-01

Pos logit contributions
 mil+0.027
 marsh+0.026
 particular+0.026
 pl+0.024
 was+0.023
Neg logit contributions
 unnecessary-0.068
 fault-0.068
ately-0.067
 weaknesses-0.066
arl-0.066
Token They
Token ID1119
Feature activation+1.04e+00

Pos logit contributions
 mil+0.007
 particular+0.006
 marsh+0.006
 pl+0.006
 was+0.006
Neg logit contributions
arl-0.010
 fault-0.010
 unnecessary-0.010
 weaknesses-0.010
ately-0.010
Token wrong
Token ID2642
Feature activation-3.47e-01

Pos logit contributions
 weaknesses+0.014
ately+0.014
 unnecessary+0.013
 fault+0.013
der+0.012
Neg logit contributions
 was-0.014
 mil-0.014
,-0.014
 pl-0.013
 w-0.013
Token.
Token ID13
Feature activation-2.12e-01

Pos logit contributions
 unnecessary+0.016
 weaknesses+0.016
 fault+0.016
ately+0.016
arl+0.016
Neg logit contributions
 was-0.004
 mil-0.004
 particular-0.003
,-0.003
 marsh-0.003
Token She
Token ID1375
Feature activation-1.11e+00

Pos logit contributions
 unnecessary+0.010
 fault+0.009
ately+0.009
 weaknesses+0.009
ust+0.009
Neg logit contributions
 was-0.003
,-0.003
 mil-0.003
 Ms-0.002
 Mitt-0.002
Token sees
Token ID7224
Feature activation-1.60e+00

Pos logit contributions
 unnecessary+0.027
arl+0.027
ately+0.026
 fault+0.026
 weaknesses+0.025
Neg logit contributions
 marsh-0.035
 mil-0.035
 particular-0.035
M-0.031
me-0.030
Token the
Token ID262
Feature activation-1.37e+00

Pos logit contributions
 unnecessary+0.072
 weaknesses+0.072
ately+0.070
 fault+0.070
 spine+0.070
Neg logit contributions
 was-0.022
 marsh-0.021
,-0.021
 mil-0.020
 the-0.020
Token broken
Token ID5445
Feature activation-2.73e+00

Pos logit contributions
ately+0.041
 unnecessary+0.041
 weaknesses+0.039
arl+0.038
 fault+0.037
Neg logit contributions
 mil-0.038
 marsh-0.037
 was-0.037
 pl-0.036
 particular-0.035
Token puzzle
Token ID15027
Feature activation-5.11e-01

Pos logit contributions
 unnecessary+0.078
 fault+0.077
ately+0.076
 weaknesses+0.075
arl+0.075
Neg logit contributions
 mil-0.077
 marsh-0.075
 particular-0.074
 was-0.071
 pl-0.071
Token and
Token ID290
Feature activation-1.39e+00

Pos logit contributions
 unnecessary+0.022
 fault+0.021
ately+0.021
 weaknesses+0.021
arl+0.021
Neg logit contributions
 mil-0.007
 marsh-0.007
 particular-0.007
 was-0.006
 pl-0.006
Token the
Token ID262
Feature activation-6.55e-01

Pos logit contributions
 unnecessary+0.065
 fault+0.064
ately+0.064
arl+0.063
 weaknesses+0.063
Neg logit contributions
 mil-0.015
 marsh-0.014
 was-0.013
 particular-0.013
 pl-0.012
Token upset
Token ID9247
Feature activation-1.60e+00

Pos logit contributions
ately+0.022
 unnecessary+0.021
 weaknesses+0.020
arl+0.020
ust+0.020
Neg logit contributions
 mil-0.017
 marsh-0.016
 was-0.016
 pl-0.016
 l-0.015
Token children
Token ID1751
Feature activation-5.57e-01

Pos logit contributions
 unnecessary+0.052
ately+0.049
 fault+0.047
 weaknesses+0.047
arl+0.047
Neg logit contributions
 was-0.043
 mil-0.042
 marsh-0.041
,-0.040
 particular-0.039
Token bird
Token ID6512
Feature activation+5.76e-02

Pos logit contributions
 marsh+0.020
 mil+0.019
 particular+0.019
 Ms+0.017
 was+0.017
Neg logit contributions
 unnecessary-0.031
arl-0.031
 fault-0.031
 weaknesses-0.030
ately-0.029
Token was
Token ID373
Feature activation-6.68e-01

Pos logit contributions
 particular+0.001
 marsh+0.001
 mil+0.001
Ms+0.001
 pasture+0.001
Neg logit contributions
ately-0.002
der-0.002
endant-0.002
 weaknesses-0.002
 fault-0.002
Token very
Token ID845
Feature activation+7.51e-02

Pos logit contributions
 fault+0.018
 ours+0.018
 unnecessary+0.017
endant+0.016
 spies+0.015
Neg logit contributions
 particular-0.020
 marsh-0.019
 mil-0.018
me-0.017
 reception-0.016
Token happy
Token ID3772
Feature activation-8.71e-01

Pos logit contributions
 marsh+0.002
 mil+0.002
 particular+0.002
 pl+0.002
 was+0.002
Neg logit contributions
 unnecessary-0.002
arl-0.002
 fault-0.002
ately-0.002
 weaknesses-0.002
Token and
Token ID290
Feature activation-3.43e-01

Pos logit contributions
 unnecessary+0.042
arl+0.041
 weaknesses+0.040
 fault+0.039
ately+0.039
Neg logit contributions
 was-0.010
 mil-0.009
,-0.009
 marsh-0.008
 w-0.008
Token thanked
Token ID26280
Feature activation-2.70e+00

Pos logit contributions
ately+0.010
 unnecessary+0.010
 weaknesses+0.010
der+0.010
 ours+0.010
Neg logit contributions
 particular-0.009
 marsh-0.008
 mil-0.008
 pasture-0.008
 end-0.008
Token Amy
Token ID14235
Feature activation+6.17e-01

Pos logit contributions
 unnecessary+0.099
 fault+0.096
arl+0.096
 weaknesses+0.096
ately+0.094
Neg logit contributions
 mil-0.058
 particular-0.056
 marsh-0.056
 was-0.055
 Ms-0.052
Token and
Token ID290
Feature activation-9.42e-01

Pos logit contributions
 mil+0.013
 marsh+0.013
 particular+0.013
 pl+0.012
 was+0.011
Neg logit contributions
 fault-0.022
 unnecessary-0.021
ately-0.021
der-0.021
arl-0.021
Token Ben
Token ID3932
Feature activation-8.55e-02

Pos logit contributions
ately+0.030
 unnecessary+0.030
der+0.029
 fault+0.029
 weaknesses+0.029
Neg logit contributions
 particular-0.022
 mil-0.022
 marsh-0.021
 pl-0.019
 pasture-0.018
Token.
Token ID13
Feature activation-1.18e+00

Pos logit contributions
 unnecessary+0.003
 fault+0.003
ately+0.003
 weaknesses+0.003
arl+0.003
Neg logit contributions
 mil-0.002
 particular-0.001
 marsh-0.001
 pl-0.001
 was-0.001
Token It
Token ID632
Feature activation+4.27e-01

Pos logit contributions
 unnecessary+0.042
ately+0.040
 fault+0.039
 weaknesses+0.039
arl+0.039
Neg logit contributions
 mil-0.027
 marsh-0.027
 was-0.026
 particular-0.026
 pl-0.025
Token to
Token ID284
Feature activation-1.42e+00

Pos logit contributions
 unnecessary+0.045
 fault+0.044
ately+0.044
 weaknesses+0.044
arl+0.044
Neg logit contributions
 mil-0.026
 marsh-0.025
 particular-0.024
 was-0.024
 pl-0.023
Token the
Token ID262
Feature activation-2.10e-01

Pos logit contributions
 unnecessary+0.066
 fault+0.065
 weaknesses+0.065
ately+0.064
arl+0.064
Neg logit contributions
 mil-0.016
 particular-0.015
 marsh-0.014
 was-0.014
 pl-0.013
Token closet
Token ID21615
Feature activation-1.01e+00

Pos logit contributions
ately+0.009
 unnecessary+0.008
 weaknesses+0.008
 fault+0.008
arl+0.008
Neg logit contributions
 mil-0.004
 was-0.004
 pl-0.004
 marsh-0.003
 l-0.003
Token.
Token ID13
Feature activation-4.32e-01

Pos logit contributions
 unnecessary+0.045
 weaknesses+0.044
 fault+0.043
ately+0.043
arl+0.043
Neg logit contributions
 mil-0.013
 marsh-0.012
 particular-0.012
 was-0.012
 pl-0.011
Token She
Token ID1375
Feature activation-1.62e+00

Pos logit contributions
 unnecessary+0.018
 fault+0.017
ately+0.017
 weaknesses+0.017
arl+0.016
Neg logit contributions
 was-0.007
 mil-0.007
 marsh-0.007
 particular-0.007
 pl-0.007
Token sees
Token ID7224
Feature activation-2.69e+00

Pos logit contributions
arl+0.036
 unnecessary+0.035
ately+0.033
 fault+0.032
 ours+0.031
Neg logit contributions
 marsh-0.055
 mil-0.053
 particular-0.052
M-0.048
me-0.047
Token the
Token ID262
Feature activation-3.23e-01

Pos logit contributions
 unnecessary+0.134
 weaknesses+0.133
 fault+0.131
ately+0.130
arl+0.128
Neg logit contributions
 marsh-0.023
 mil-0.023
 was-0.021
 particular-0.019
 Ms-0.016
Token mess
Token ID2085
Feature activation-1.32e+00

Pos logit contributions
 unnecessary+0.010
ately+0.010
 weaknesses+0.009
arl+0.009
 fault+0.009
Neg logit contributions
 mil-0.009
 was-0.009
 marsh-0.008
 pl-0.008
 l-0.008
Token and
Token ID290
Feature activation-1.70e+00

Pos logit contributions
 unnecessary+0.064
 weaknesses+0.062
 fault+0.061
ately+0.060
arl+0.059
Neg logit contributions
 was-0.014
 mil-0.014
,-0.012
 marsh-0.012
 sn-0.011
Token the
Token ID262
Feature activation-2.25e-02

Pos logit contributions
 unnecessary+0.071
arl+0.070
ately+0.070
 fault+0.069
 weaknesses+0.068
Neg logit contributions
 particular-0.024
 marsh-0.024
 mil-0.023
M-0.019
me-0.017
Token tears
Token ID10953
Feature activation+1.37e-01

Pos logit contributions
ately+0.001
 unnecessary+0.001
arl+0.001
 weaknesses+0.001
 fault+0.001
Neg logit contributions
 mil-0.001
 was-0.001
 pl-0.001
 marsh-0.001
 l-0.001
TokenWow
Token ID22017
Feature activation+4.05e-01

Pos logit contributions
 was+0.003
 mil+0.003
M+0.003
 particular+0.002
 pl+0.002
Neg logit contributions
 fault-0.004
 unnecessary-0.004
ately-0.004
arl-0.004
 weaknesses-0.004
Token,
Token ID11
Feature activation+2.57e-02

Pos logit contributions
 mil+0.002
 marsh+0.002
 particular+0.001
 was+0.001
 pl+0.001
Neg logit contributions
 unnecessary-0.021
 fault-0.021
ately-0.021
 weaknesses-0.021
arl-0.021
Token that
Token ID326
Feature activation-1.10e-02

Pos logit contributions
 was+0.001
 mil+0.001
,+0.001
 l+0.001
 w+0.001
Neg logit contributions
ately-0.001
 unnecessary-0.001
 weaknesses-0.001
 fault-0.001
arl-0.001
Token is
Token ID318
Feature activation-5.39e-01

Pos logit contributions
 fault+0.000
 unnecessary+0.000
 weaknesses+0.000
der+0.000
ately+0.000
Neg logit contributions
 mil-0.000
 marsh-0.000
 particular-0.000
 pl-0.000
 sn-0.000
Token an
Token ID281
Feature activation-1.62e+00

Pos logit contributions
 unnecessary+0.019
 fault+0.019
ately+0.019
 weaknesses+0.019
arl+0.019
Neg logit contributions
 mil-0.011
 marsh-0.011
 particular-0.011
 pl-0.010
 was-0.010
Token impressive
Token ID8036
Feature activation-2.69e+00

Pos logit contributions
 unnecessary+0.057
ately+0.055
arl+0.054
 weaknesses+0.054
 fault+0.053
Neg logit contributions
 mil-0.036
 particular-0.036
 marsh-0.035
 was-0.034
 pl-0.033
Token puzzle
Token ID15027
Feature activation-4.94e-01

Pos logit contributions
 unnecessary+0.081
ately+0.078
arl+0.077
 weaknesses+0.075
 fault+0.074
Neg logit contributions
 mil-0.074
 particular-0.073
 was-0.073
 marsh-0.073
 pl-0.070
Token.
Token ID13
Feature activation-5.77e-02

Pos logit contributions
 fault+0.019
 unnecessary+0.019
ately+0.019
 weaknesses+0.019
arl+0.019
Neg logit contributions
 mil-0.009
 marsh-0.008
 particular-0.008
 pl-0.007
 was-0.007
Token Can
Token ID1680
Feature activation-4.51e-01

Pos logit contributions
 unnecessary+0.002
ately+0.002
 fault+0.002
 weaknesses+0.002
arl+0.002
Neg logit contributions
 mil-0.002
 marsh-0.002
 was-0.002
-0.002
 particular-0.002
Token I
Token ID314
Feature activation-1.05e+00

Pos logit contributions
 unnecessary+0.013
 fault+0.013
arl+0.013
ately+0.012
 weaknesses+0.012
Neg logit contributions
 was-0.013
 mil-0.013
 particular-0.013
,-0.012
 marsh-0.012
Token help
Token ID1037
Feature activation-1.18e+00

Pos logit contributions
 fault+0.034
 unnecessary+0.034
arl+0.034
 weaknesses+0.034
ately+0.033
Neg logit contributions
 mil-0.025
 marsh-0.025
 particular-0.022
 pl-0.021
 was-0.021
Token He
Token ID679
Feature activation-1.20e+00

Pos logit contributions
 unnecessary+0.040
ately+0.037
arl+0.037
 fault+0.037
 weaknesses+0.037
Neg logit contributions
 mil-0.024
 was-0.024
 marsh-0.023
 particular-0.023
 pl-0.022
Token smiled
Token ID13541
Feature activation+5.34e-01

Pos logit contributions
ately+0.039
der+0.038
 weaknesses+0.038
endant+0.036
 fault+0.035
Neg logit contributions
 marsh-0.028
 mil-0.027
 particular-0.027
 pasture-0.023
 Ms-0.023
Token and
Token ID290
Feature activation-2.54e-01

Pos logit contributions
 mil+0.009
 marsh+0.009
 particular+0.008
 pl+0.008
 was+0.007
Neg logit contributions
 unnecessary-0.021
 fault-0.021
ately-0.021
der-0.021
arl-0.021
Token showed
Token ID3751
Feature activation-1.63e+00

Pos logit contributions
ately+0.009
 unnecessary+0.009
der+0.009
 weaknesses+0.008
 fault+0.008
Neg logit contributions
 particular-0.005
 marsh-0.005
 mil-0.005
 pl-0.004
 end-0.004
Token everybody
Token ID7288
Feature activation-6.57e-01

Pos logit contributions
 unnecessary+0.062
 fault+0.061
ately+0.061
arl+0.060
 weaknesses+0.060
Neg logit contributions
 mil-0.031
 marsh-0.030
 particular-0.029
 was-0.028
 pl-0.028
Token his
Token ID465
Feature activation-2.69e+00

Pos logit contributions
 unnecessary+0.022
 fault+0.022
 weaknesses+0.021
ately+0.021
der+0.021
Neg logit contributions
 mil-0.014
 marsh-0.014
 particular-0.014
 pl-0.013
 sn-0.012
Token puzzle
Token ID15027
Feature activation-6.74e-01

Pos logit contributions
arl+0.091
 unnecessary+0.088
ately+0.085
keeping+0.081
 fault+0.080
Neg logit contributions
 was-0.067
 particular-0.064
 mil-0.062
 marsh-0.062
,-0.061
Token.
Token ID13
Feature activation-1.06e+00

Pos logit contributions
 unnecessary+0.027
 fault+0.027
ately+0.027
 weaknesses+0.027
arl+0.026
Neg logit contributions
 mil-0.011
 marsh-0.011
 particular-0.010
 pl-0.009
 was-0.009
Token Everyone
Token ID11075
Feature activation-9.01e-01

Pos logit contributions
 unnecessary+0.040
ately+0.038
 fault+0.038
arl+0.038
 weaknesses+0.038
Neg logit contributions
 mil-0.022
 marsh-0.021
 was-0.021
 particular-0.021
 pl-0.019
Token praised
Token ID15342
Feature activation-1.70e+00

Pos logit contributions
 unnecessary+0.042
ately+0.041
 fault+0.041
 weaknesses+0.041
arl+0.040
Neg logit contributions
 mil-0.009
 marsh-0.009
 particular-0.008
 pl-0.007
 was-0.007
Token John
Token ID1757
Feature activation-3.59e-01

Pos logit contributions
 unnecessary+0.075
 weaknesses+0.074
 fault+0.073
arl+0.072
endant+0.070
Neg logit contributions
,-0.027
 was-0.026
 mil-0.024
 particular-0.023
 the-0.023
Token to
Token ID284
Feature activation-2.94e-01

Pos logit contributions
 unnecessary+0.010
 fault+0.009
ately+0.009
 weaknesses+0.009
arl+0.009
Neg logit contributions
 mil-0.005
 marsh-0.005
 particular-0.005
 was-0.005
 pl-0.005
Token jump
Token ID4391
Feature activation-1.35e-01

Pos logit contributions
 unnecessary+0.011
ately+0.011
 fault+0.010
arl+0.010
 weaknesses+0.010
Neg logit contributions
 mil-0.006
 marsh-0.006
 particular-0.005
 was-0.005
 pl-0.005
Token,
Token ID11
Feature activation-2.79e-01

Pos logit contributions
 fault+0.005
 unnecessary+0.005
arl+0.005
ately+0.005
der+0.005
Neg logit contributions
 mil-0.003
 particular-0.002
 marsh-0.002
 pl-0.002
 Ms-0.002
Token but
Token ID475
Feature activation-3.01e-01

Pos logit contributions
 unnecessary+0.009
 fault+0.009
ately+0.009
 weaknesses+0.009
arl+0.009
Neg logit contributions
 mil-0.007
 marsh-0.007
 particular-0.007
 was-0.007
 pl-0.007
Token they
Token ID484
Feature activation-9.96e-01

Pos logit contributions
 unnecessary+0.010
 fault+0.010
ately+0.010
 weaknesses+0.010
arl+0.010
Neg logit contributions
 mil-0.007
 marsh-0.007
 particular-0.006
 pl-0.006
 was-0.006
Token still
Token ID991
Feature activation-2.68e+00

Pos logit contributions
 unnecessary+0.033
 fault+0.032
ately+0.032
 weaknesses+0.032
arl+0.032
Neg logit contributions
 mil-0.024
 marsh-0.023
 particular-0.023
 was-0.022
 pl-0.022
Token cannot
Token ID2314
Feature activation-1.11e+00

Pos logit contributions
 unnecessary+0.082
 fault+0.081
ately+0.079
arl+0.079
 weaknesses+0.079
Neg logit contributions
 mil-0.070
 marsh-0.069
 particular-0.067
 pl-0.064
 was-0.064
Token get
Token ID651
Feature activation-6.32e-01

Pos logit contributions
 unnecessary+0.039
 fault+0.039
ately+0.039
arl+0.038
 weaknesses+0.038
Neg logit contributions
 mil-0.023
 marsh-0.022
 particular-0.020
 Ms-0.019
 was-0.019
Token it
Token ID340
Feature activation+6.94e-01

Pos logit contributions
 unnecessary+0.023
ately+0.023
 fault+0.023
arl+0.023
 weaknesses+0.023
Neg logit contributions
 mil-0.013
 marsh-0.012
 particular-0.012
 pl-0.012
 was-0.012
Token.
Token ID13
Feature activation+2.90e-01

Pos logit contributions
 mil+0.011
 marsh+0.011
 particular+0.010
 pl+0.010
 was+0.010
Neg logit contributions
 unnecessary-0.028
 fault-0.028
ately-0.028
arl-0.028
 weaknesses-0.027
Token
Token ID198
Feature activation+1.60e-01

Pos logit contributions
 was+0.004
 marsh+0.004
 particular+0.004
 Mitt+0.003
 mil+0.003
Neg logit contributions
 unnecessary-0.013
 weaknesses-0.013
ately-0.013
 fault-0.012
 privacy-0.012
TokenFinally
Token ID11158
Feature activation+9.15e-01

Pos logit contributions
 weaknesses+0.005
ately+0.005
 unnecessary+0.005
ust+0.005
 fault+0.005
Neg logit contributions
 mil-0.005
 marsh-0.004
 sn-0.004
 pl-0.004
 l-0.004
Token they
Token ID484
Feature activation+3.30e-01

Pos logit contributions
 mil+0.010
 marsh+0.009
 pl+0.008
 particular+0.008
 gra+0.007
Neg logit contributions
ately-0.041
 unnecessary-0.040
 weaknesses-0.039
 fault-0.039
der-0.039
Token found
Token ID1043
Feature activation-5.67e-01

Pos logit contributions
 mil+0.006
 marsh+0.006
 particular+0.006
 pl+0.005
 was+0.005
Neg logit contributions
 unnecessary-0.012
ately-0.012
 weaknesses-0.012
 fault-0.012
arl-0.012
Token the
Token ID262
Feature activation-3.78e-01

Pos logit contributions
 fault+0.023
 unnecessary+0.023
ately+0.022
 weaknesses+0.021
 ours+0.021
Neg logit contributions
 mil-0.010
 particular-0.009
 marsh-0.009
 pl-0.008
me-0.008
Token best
Token ID1266
Feature activation-3.35e-01

Pos logit contributions
 unnecessary+0.013
 fault+0.013
arl+0.013
ately+0.013
 weaknesses+0.013
Neg logit contributions
 mil-0.008
 marsh-0.008
 particular-0.008
 pl-0.007
 was-0.007
Token vacation
Token ID14600
Feature activation-2.68e+00

Pos logit contributions
 unnecessary+0.011
 fault+0.011
ately+0.011
 weaknesses+0.011
arl+0.011
Neg logit contributions
 mil-0.008
 marsh-0.007
 particular-0.007
 pl-0.007
 was-0.007
Token ever
Token ID1683
Feature activation-6.43e-01

Pos logit contributions
 fault+0.099
ately+0.099
 unnecessary+0.097
 weaknesses+0.095
der+0.093
Neg logit contributions
 marsh-0.051
 mil-0.050
 particular-0.048
 pl-0.044
 sn-0.042
Token!
Token ID0
Feature activation-7.48e-01

Pos logit contributions
 unnecessary+0.030
arl+0.030
 weaknesses+0.030
 fault+0.030
ately+0.029
Neg logit contributions
 was-0.007
 mil-0.007
 marsh-0.006
 pl-0.006
 particular-0.006
Token It
Token ID632
Feature activation+1.08e-01

Pos logit contributions
 unnecessary+0.024
 fault+0.024
ately+0.023
 weaknesses+0.023
arl+0.023
Neg logit contributions
 mil-0.018
 marsh-0.018
 particular-0.018
 pl-0.017
 was-0.017
Token was
Token ID373
Feature activation-2.18e-01

Pos logit contributions
 marsh+0.002
 mil+0.002
 pasture+0.002
Ms+0.002
 particular+0.002
Neg logit contributions
ately-0.004
 fault-0.004
ust-0.004
der-0.004
ounded-0.003
Token a
Token ID257
Feature activation-7.51e-01

Pos logit contributions
 fault+0.007
 ours+0.007
 ago+0.006
 spies+0.006
 theirs+0.006
Neg logit contributions
me-0.006
mun-0.006
oc-0.005
 mil-0.005
ink-0.005
Token chew
Token ID34722
Feature activation+1.16e-01

Pos logit contributions
 unnecessary+0.005
 fault+0.005
 weaknesses+0.005
arl+0.005
ately+0.005
Neg logit contributions
 mil-0.003
 marsh-0.003
 particular-0.003
 was-0.003
M-0.003
Token on
Token ID319
Feature activation+1.24e+00

Pos logit contributions
 mil+0.002
 particular+0.002
 marsh+0.002
 pl+0.002
 was+0.002
Neg logit contributions
 unnecessary-0.004
ately-0.004
 fault-0.004
arl-0.004
 weaknesses-0.004
Token books
Token ID3835
Feature activation+4.02e-01

Pos logit contributions
 mil+0.024
 particular+0.024
 marsh+0.022
 pl+0.022
 was+0.021
Neg logit contributions
 unnecessary-0.047
 fault-0.045
 weaknesses-0.044
ately-0.044
arl-0.043
Token and
Token ID290
Feature activation+3.09e-01

Pos logit contributions
 mil+0.006
 marsh+0.006
 particular+0.006
 was+0.006
 pl+0.005
Neg logit contributions
 unnecessary-0.017
 fault-0.017
 weaknesses-0.016
ately-0.016
arl-0.016
Token papers
Token ID9473
Feature activation+8.64e-01

Pos logit contributions
 mil+0.006
 marsh+0.006
 was+0.006
 sn+0.006
 pl+0.006
Neg logit contributions
arl-0.011
 fault-0.011
ately-0.011
 unnecessary-0.011
 weaknesses-0.011
Token.
Token ID13
Feature activation+2.08e+00

Pos logit contributions
 mil+0.012
 marsh+0.012
 was+0.011
 particular+0.011
 pl+0.010
Neg logit contributions
 unnecessary-0.038
ately-0.036
 fault-0.036
 weaknesses-0.036
arl-0.036
Token He
Token ID679
Feature activation+1.56e-01

Pos logit contributions
 mil+0.054
 marsh+0.054
 particular+0.051
 was+0.051
 sn+0.049
Neg logit contributions
 unnecessary-0.064
 fault-0.064
ately-0.063
 weaknesses-0.062
arl-0.060
Token can
Token ID460
Feature activation+8.63e-01

Pos logit contributions
 particular+0.005
 mil+0.005
 marsh+0.004
 pl+0.004
M+0.004
Neg logit contributions
 fault-0.004
 unnecessary-0.004
 weaknesses-0.004
der-0.004
ately-0.004
Token ruin
Token ID16866
Feature activation+9.68e-01

Pos logit contributions
 mil+0.020
 marsh+0.019
 particular+0.018
M+0.017
 was+0.017
Neg logit contributions
 weaknesses-0.026
 unnecessary-0.026
arl-0.026
 fault-0.026
ately-0.025
Token my
Token ID616
Feature activation-3.47e-01

Pos logit contributions
 mil+0.020
 particular+0.019
 marsh+0.019
 pl+0.018
 was+0.018
Neg logit contributions
 unnecessary-0.035
 fault-0.034
ately-0.034
 weaknesses-0.034
arl-0.033
Token story
Token ID1621
Feature activation-4.57e-01

Pos logit contributions
 unnecessary+0.012
ately+0.012
arl+0.012
 fault+0.012
 weaknesses+0.012
Neg logit contributions
 mil-0.008
 marsh-0.008
 particular-0.008
 was-0.007
 pl-0.007
Token when
Token ID618
Feature activation+6.59e-01

Pos logit contributions
 mil+0.009
 marsh+0.008
 particular+0.008
 pl+0.008
 was+0.007
Neg logit contributions
 unnecessary-0.017
ately-0.017
 fault-0.017
 weaknesses-0.016
arl-0.016
Token she
Token ID673
Feature activation+8.75e-01

Pos logit contributions
 mil+0.016
 marsh+0.015
 pl+0.015
 particular+0.015
 pasture+0.014
Neg logit contributions
 unnecessary-0.021
 fault-0.020
ately-0.020
 weaknesses-0.019
arl-0.019
Token che
Token ID1125
Feature activation+1.77e-01

Pos logit contributions
 marsh+0.017
 mil+0.016
 particular+0.014
 Ms+0.013
 pasture+0.013
Neg logit contributions
ately-0.033
 unnecessary-0.031
der-0.030
 weaknesses-0.030
 fault-0.030
Tokenwed
Token ID19103
Feature activation+5.69e-01

Pos logit contributions
 mil+0.006
 marsh+0.006
 particular+0.006
 was+0.006
 pl+0.006
Neg logit contributions
 unnecessary-0.004
 fault-0.004
arl-0.004
 weaknesses-0.004
ately-0.003
Token her
Token ID607
Feature activation-4.91e-01

Pos logit contributions
 mil+0.009
 marsh+0.008
 particular+0.008
 pl+0.008
 gra+0.007
Neg logit contributions
 unnecessary-0.023
ately-0.022
 fault-0.022
arl-0.022
der-0.021
Token nails
Token ID23361
Feature activation+2.14e+00

Pos logit contributions
 unnecessary+0.012
 fault+0.012
ately+0.012
 weaknesses+0.012
arl+0.012
Neg logit contributions
 mil-0.016
 marsh-0.016
 particular-0.015
 was-0.015
 pl-0.015
Token,
Token ID11
Feature activation-2.08e-01

Pos logit contributions
 mil+0.029
 marsh+0.028
 particular+0.026
 Ms+0.024
 pasture+0.023
Neg logit contributions
ately-0.093
 fault-0.093
 unnecessary-0.090
arl-0.089
der-0.087
Token she
Token ID673
Feature activation-4.92e-01

Pos logit contributions
 unnecessary+0.008
 fault+0.008
arl+0.008
ately+0.008
 weaknesses+0.008
Neg logit contributions
 mil-0.004
 marsh-0.004
 particular-0.003
 was-0.003
 pl-0.003
Token felt
Token ID2936
Feature activation+8.44e-01

Pos logit contributions
ately+0.015
 unnecessary+0.015
 fault+0.014
der+0.014
 weaknesses+0.014
Neg logit contributions
 mil-0.013
 particular-0.013
 marsh-0.013
 Ms-0.011
 pl-0.011
Token something
Token ID1223
Feature activation+2.12e-01

Pos logit contributions
 particular+0.020
 mil+0.019
 marsh+0.018
me+0.018
 reception+0.017
Neg logit contributions
 unnecessary-0.030
der-0.027
 fault-0.027
 weaknesses-0.025
ately-0.025
Token hard
Token ID1327
Feature activation-1.74e-01

Pos logit contributions
 mil+0.005
 particular+0.005
 marsh+0.005
 Ms+0.005
 pasture+0.005
Neg logit contributions
 unnecessary-0.007
ately-0.006
der-0.006
 fault-0.006
arl-0.006
Tokenir
Token ID343
Feature activation+1.18e+00

Pos logit contributions
 unnecessary+0.003
 fault+0.003
 weaknesses+0.003
ately+0.003
arl+0.003
Neg logit contributions
 mil-0.003
 marsh-0.003
 particular-0.003
 pl-0.003
 was-0.003
Tokenping
Token ID13886
Feature activation+4.33e-01

Pos logit contributions
 particular+0.043
 mil+0.043
 marsh+0.041
 was+0.039
 pl+0.039
Neg logit contributions
ately-0.024
der-0.024
 fault-0.023
 unnecessary-0.023
arl-0.023
Token loudly
Token ID23112
Feature activation+3.51e-01

Pos logit contributions
 particular+0.009
 mil+0.008
 marsh+0.007
 pl+0.007
 Ms+0.007
Neg logit contributions
 unnecessary-0.015
ately-0.015
der-0.015
 fault-0.014
 weaknesses-0.014
Token!
Token ID0
Feature activation+2.80e-01

Pos logit contributions
 particular+0.006
 mil+0.006
 marsh+0.005
 pl+0.005
 was+0.005
Neg logit contributions
 unnecessary-0.014
ately-0.014
 fault-0.014
der-0.014
arl-0.013
Token El
Token ID2574
Feature activation+7.17e-01

Pos logit contributions
 mil+0.006
 marsh+0.005
 particular+0.005
 pl+0.005
 l+0.005
Neg logit contributions
ately-0.010
 unnecessary-0.010
 fault-0.010
 weaknesses-0.010
arl-0.010
Tokenisa
Token ID9160
Feature activation+2.34e+00

Pos logit contributions
 mil+0.018
 marsh+0.017
 particular+0.017
 was+0.016
 pl+0.016
Neg logit contributions
 unnecessary-0.023
ately-0.023
 fault-0.022
 weaknesses-0.022
arl-0.022
Token smiled
Token ID13541
Feature activation+1.03e+00

Pos logit contributions
 particular+0.043
 marsh+0.042
 mil+0.042
 Ms+0.037
 Mitt+0.036
Neg logit contributions
ately-0.090
der-0.083
 weaknesses-0.083
 fault-0.080
 ago-0.079
Token and
Token ID290
Feature activation+6.05e-01

Pos logit contributions
 mil+0.021
 marsh+0.021
 particular+0.020
me+0.017
Ms+0.017
Neg logit contributions
der-0.039
ately-0.037
 fault-0.035
osity-0.034
 unnecessary-0.034
Token started
Token ID2067
Feature activation+1.07e+00

Pos logit contributions
 particular+0.014
 mil+0.014
 marsh+0.013
 end+0.012
me+0.012
Neg logit contributions
ately-0.019
der-0.019
 weaknesses-0.018
 unnecessary-0.018
 fault-0.017
Token talking
Token ID3375
Feature activation-6.19e-02

Pos logit contributions
 particular+0.030
 marsh+0.028
me+0.028
 end+0.027
 mil+0.027
Neg logit contributions
 fault-0.028
 unnecessary-0.028
ately-0.027
 weaknesses-0.027
keeping-0.026
Token to
Token ID284
Feature activation+1.41e-01

Pos logit contributions
 unnecessary+0.002
 fault+0.002
ately+0.002
der+0.002
 weaknesses+0.002
Neg logit contributions
 mil-0.001
 marsh-0.001
 particular-0.001
 pl-0.001
 was-0.001
Token forest
Token ID8222
Feature activation-1.06e+00

Pos logit contributions
 mil+0.035
 marsh+0.034
 particular+0.031
me+0.030
 Ms+0.030
Neg logit contributions
 unnecessary-0.051
arl-0.048
 weaknesses-0.046
 fault-0.045
 spine-0.043
Token.
Token ID13
Feature activation-7.34e-01

Pos logit contributions
ately+0.036
 fault+0.035
 spine+0.034
 unnecessary+0.034
der+0.034
Neg logit contributions
 marsh-0.025
 mil-0.024
 particular-0.022
 Ms-0.020
Ms-0.020
Token It
Token ID632
Feature activation+9.05e-01

Pos logit contributions
 unnecessary+0.024
 fault+0.023
ately+0.023
 weaknesses+0.023
arl+0.023
Neg logit contributions
 mil-0.018
 marsh-0.018
 particular-0.017
 was-0.017
 pl-0.017
Token saw
Token ID2497
Feature activation-2.72e-01

Pos logit contributions
 mil+0.027
 marsh+0.025
Ms+0.024
um+0.022
 chip+0.022
Neg logit contributions
ately-0.021
ust-0.020
 inspection-0.020
 villains-0.018
der-0.018
Token lots
Token ID6041
Feature activation+4.65e-01

Pos logit contributions
 unnecessary+0.010
 weaknesses+0.010
 fault+0.010
ately+0.010
der+0.010
Neg logit contributions
 mil-0.005
 particular-0.005
 marsh-0.005
 pl-0.004
me-0.004
Token of
Token ID286
Feature activation+2.21e+00

Pos logit contributions
 particular+0.010
 mil+0.010
M+0.010
 marsh+0.009
Ms+0.009
Neg logit contributions
 unnecessary-0.015
der-0.014
arl-0.014
ately-0.014
icated-0.014
Token trees
Token ID7150
Feature activation+4.10e-01

Pos logit contributions
 mil+0.048
 particular+0.047
 marsh+0.045
 pl+0.042
 was+0.042
Neg logit contributions
 unnecessary-0.079
 weaknesses-0.074
 fault-0.073
ately-0.073
arl-0.072
Token,
Token ID11
Feature activation+4.00e-01

Pos logit contributions
 marsh+0.006
 mil+0.006
 particular+0.006
M+0.006
Ms+0.006
Neg logit contributions
ately-0.016
der-0.015
 fault-0.015
arl-0.015
 weaknesses-0.015
Token flowers
Token ID12734
Feature activation+3.46e-01

Pos logit contributions
 particular+0.009
 mil+0.009
 marsh+0.009
 pl+0.008
M+0.008
Neg logit contributions
 unnecessary-0.013
 weaknesses-0.013
ately-0.013
 fault-0.012
der-0.012
Token,
Token ID11
Feature activation-3.93e-03

Pos logit contributions
 mil+0.005
 particular+0.005
 marsh+0.005
 Ms+0.004
Ms+0.004
Neg logit contributions
ately-0.014
der-0.014
 fault-0.014
arl-0.014
 ago-0.013
Token and
Token ID290
Feature activation+2.23e-01

Pos logit contributions
 unnecessary+0.000
ately+0.000
 weaknesses+0.000
 fault+0.000
der+0.000
Neg logit contributions
 particular-0.000
 mil-0.000
 marsh-0.000
 pl-0.000
 Ms-0.000
Token careful
Token ID8161
Feature activation-3.70e-01

Pos logit contributions
 unnecessary+0.009
 fault+0.009
 weaknesses+0.009
arl+0.009
ately+0.009
Neg logit contributions
 mil-0.013
 marsh-0.013
 particular-0.013
 was-0.012
 pl-0.012
Token,
Token ID11
Feature activation+1.83e-01

Pos logit contributions
 unnecessary+0.014
 fault+0.014
ately+0.014
arl+0.014
 weaknesses+0.014
Neg logit contributions
 mil-0.007
 marsh-0.007
 particular-0.007
 pl-0.006
 was-0.006
Token little
Token ID1310
Feature activation+7.76e-01

Pos logit contributions
 mil+0.005
 marsh+0.005
 particular+0.005
 was+0.005
 pl+0.005
Neg logit contributions
 unnecessary-0.005
 fault-0.005
arl-0.005
ately-0.005
 weaknesses-0.005
Token girl
Token ID2576
Feature activation-7.63e-01

Pos logit contributions
 mil+0.012
 marsh+0.012
 particular+0.011
 was+0.011
 pl+0.011
Neg logit contributions
 unnecessary-0.032
arl-0.032
 fault-0.032
ately-0.032
 weaknesses-0.031
Token.
Token ID13
Feature activation-1.06e-01

Pos logit contributions
 unnecessary+0.027
 fault+0.026
ately+0.026
 weaknesses+0.025
arl+0.025
Neg logit contributions
 mil-0.017
 marsh-0.016
 particular-0.016
 pl-0.015
 was-0.015
Token The
Token ID383
Feature activation+1.93e+00

Pos logit contributions
 unnecessary+0.003
 fault+0.003
ately+0.003
 weaknesses+0.003
arl+0.003
Neg logit contributions
 mil-0.003
 marsh-0.003
 particular-0.003
 was-0.003
 Ms-0.003
Token forest
Token ID8222
Feature activation-1.16e+00

Pos logit contributions
 marsh+0.023
 mil+0.023
 was+0.022
 particular+0.022
 w+0.021
Neg logit contributions
ately-0.088
 unnecessary-0.088
 fault-0.088
 weaknesses-0.085
arl-0.084
Token can
Token ID460
Feature activation+6.47e-01

Pos logit contributions
 unnecessary+0.044
 fault+0.043
ately+0.043
 weaknesses+0.043
arl+0.042
Neg logit contributions
 mil-0.022
 marsh-0.022
 particular-0.021
 pl-0.020
 was-0.019
Token be
Token ID307
Feature activation+1.28e-01

Pos logit contributions
 mil+0.016
 marsh+0.015
 pasture+0.014
M+0.013
Ms+0.013
Neg logit contributions
ately-0.020
 ours-0.020
 weaknesses-0.019
 unnecessary-0.019
arl-0.018
Token dangerous
Token ID4923
Feature activation+3.21e-01

Pos logit contributions
 mil+0.004
 particular+0.004
 marsh+0.004
 pl+0.003
M+0.003
Neg logit contributions
 unnecessary-0.004
 fault-0.004
 ours-0.003
 weaknesses-0.003
ately-0.003
Token."
Token ID526
Feature activation+1.62e-01

Pos logit contributions
,+0.006
 was+0.006
 marsh+0.006
 mil+0.006
 sn+0.005
Neg logit contributions
arl-0.012
ately-0.012
 fault-0.012
 unnecessary-0.012
 weaknesses-0.012
Token.
Token ID13
Feature activation+6.47e-01

Pos logit contributions
 was+0.003
 mil+0.003
 marsh+0.003
,+0.003
 particular+0.003
Neg logit contributions
 weaknesses-0.009
ately-0.009
 unnecessary-0.009
 fault-0.009
arl-0.009
Token
Token ID198
Feature activation+3.83e-01

Pos logit contributions
 was+0.011
 marsh+0.011
 mil+0.011
 particular+0.010
M+0.010
Neg logit contributions
 unnecessary-0.027
ately-0.026
 weaknesses-0.026
 fault-0.025
 spine-0.025
Token
Token ID198
Feature activation-4.40e-01

Pos logit contributions
 mil+0.006
 marsh+0.006
 particular+0.005
 pl+0.005
 was+0.005
Neg logit contributions
 unnecessary-0.016
 fault-0.016
arl-0.016
ately-0.015
 weaknesses-0.015
TokenL
Token ID43
Feature activation-7.43e-01

Pos logit contributions
 unnecessary+0.015
arl+0.015
ately+0.015
 weaknesses+0.015
 fault+0.015
Neg logit contributions
 mil-0.010
 marsh-0.009
 particular-0.009
 pl-0.009
 was-0.008
Tokenily
Token ID813
Feature activation-2.52e-01

Pos logit contributions
 unnecessary+0.018
 weaknesses+0.018
 fault+0.017
ately+0.016
arl+0.016
Neg logit contributions
 mil-0.025
 particular-0.024
 was-0.024
 marsh-0.024
 pl-0.023
Token feels
Token ID5300
Feature activation+1.18e+00

Pos logit contributions
 fault+0.008
der+0.007
 weaknesses+0.007
 unnecessary+0.007
arl+0.007
Neg logit contributions
 particular-0.007
 marsh-0.007
 mil-0.007
M-0.006
 pl-0.006
Token happy
Token ID3772
Feature activation-1.74e-01

Pos logit contributions
 mil+0.035
 particular+0.035
 marsh+0.034
 pl+0.031
M+0.031
Neg logit contributions
 unnecessary-0.034
 fault-0.031
arl-0.030
 weaknesses-0.029
der-0.029
Token.
Token ID13
Feature activation+6.45e-02

Pos logit contributions
 unnecessary+0.009
 weaknesses+0.009
 fault+0.009
arl+0.009
ately+0.009
Neg logit contributions
 was-0.002
,-0.002
 w-0.001
 sn-0.001
 mil-0.001
Token She
Token ID1375
Feature activation-6.67e-01

Pos logit contributions
 was+0.001
 mil+0.001
 particular+0.001
 marsh+0.001
,+0.001
Neg logit contributions
 unnecessary-0.003
ately-0.003
 weaknesses-0.003
 fault-0.003
 ours-0.003
Token likes
Token ID7832
Feature activation-8.93e-01

Pos logit contributions
 fault+0.017
 unnecessary+0.017
arl+0.016
 weaknesses+0.016
der+0.016
Neg logit contributions
 particular-0.021
 mil-0.020
 marsh-0.020
 Ms-0.018
M-0.018
Token to
Token ID284
Feature activation-5.79e-01

Pos logit contributions
 unnecessary+0.037
ately+0.036
arl+0.036
 fault+0.035
 weaknesses+0.035
Neg logit contributions
 mil-0.015
 marsh-0.014
 particular-0.013
 was-0.013
 pl-0.013
TokenSee
Token ID6214
Feature activation+4.25e-01

Pos logit contributions
 unnecessary+0.029
 fault+0.029
ately+0.028
arl+0.028
 weaknesses+0.028
Neg logit contributions
 mil-0.025
 marsh-0.025
 particular-0.024
 was-0.023
 pl-0.023
Token,
Token ID11
Feature activation-3.63e-01

Pos logit contributions
 mil+0.005
 marsh+0.004
 particular+0.004
 was+0.004
 pl+0.004
Neg logit contributions
 unnecessary-0.020
 fault-0.020
 weaknesses-0.019
arl-0.019
ately-0.019
Token I
Token ID314
Feature activation+1.32e-01

Pos logit contributions
 unnecessary+0.012
 fault+0.012
arl+0.012
ately+0.012
 weaknesses+0.011
Neg logit contributions
 mil-0.009
 was-0.009
 marsh-0.009
 particular-0.008
 pl-0.008
Token promised
Token ID8072
Feature activation-9.51e-01

Pos logit contributions
 marsh+0.003
 mil+0.003
 particular+0.003
 Ms+0.003
 pasture+0.003
Neg logit contributions
ately-0.004
 fault-0.004
arl-0.004
 weaknesses-0.004
 unnecessary-0.004
Token ice
Token ID4771
Feature activation-3.97e-01

Pos logit contributions
 unnecessary+0.030
arl+0.029
 fault+0.029
 weaknesses+0.029
ately+0.028
Neg logit contributions
 mil-0.025
 was-0.024
 particular-0.023
 marsh-0.023
,-0.023
Token cream
Token ID8566
Feature activation+4.75e-01

Pos logit contributions
 fault+0.011
ately+0.011
 unnecessary+0.011
arl+0.010
der+0.010
Neg logit contributions
 particular-0.011
 marsh-0.011
 mil-0.011
 pl-0.010
 sn-0.010
Token."
Token ID526
Feature activation-5.35e-01

Pos logit contributions
 marsh+0.010
 mil+0.010
 particular+0.010
 pl+0.009
 sn+0.009
Neg logit contributions
ately-0.016
 fault-0.016
der-0.015
arl-0.015
 unnecessary-0.015
Token
Token ID198
Feature activation-4.21e-01

Pos logit contributions
 unnecessary+0.020
 fault+0.020
ately+0.020
arl+0.019
 weaknesses+0.019
Neg logit contributions
 mil-0.011
 marsh-0.010
 particular-0.010
 was-0.010
 pl-0.009
Token
Token ID198
Feature activation-1.09e+00

Pos logit contributions
arl+0.016
 unnecessary+0.015
 fault+0.015
ately+0.015
 weaknesses+0.015
Neg logit contributions
 mil-0.008
 marsh-0.007
 particular-0.007
 pl-0.007
 sn-0.007
TokenAmy
Token ID40797
Feature activation+1.03e+00

Pos logit contributions
 unnecessary+0.030
 weaknesses+0.030
arl+0.030
ately+0.029
der+0.028
Neg logit contributions
 mil-0.031
 marsh-0.030
 pl-0.028
 particular-0.028
 sn-0.028
Token was
Token ID373
Feature activation-4.36e-01

Pos logit contributions
 particular+0.028
 marsh+0.027
 mil+0.024
 pasture+0.024
 tow+0.023
Neg logit contributions
der-0.028
ately-0.028
 fault-0.025
endant-0.024
 weaknesses-0.023
Token,
Token ID11
Feature activation-4.42e-01

Pos logit contributions
ately+0.003
der+0.003
arl+0.003
 fault+0.003
 unnecessary+0.003
Neg logit contributions
 particular-0.001
 marsh-0.001
 mil-0.001
 pl-0.001
 pasture-0.001
Token "
Token ID366
Feature activation-4.17e-01

Pos logit contributions
ately+0.016
arl+0.016
der+0.015
 unnecessary+0.015
 fault+0.015
Neg logit contributions
 marsh-0.009
 mil-0.009
 particular-0.009
 pl-0.008
 was-0.008
TokenYes
Token ID5297
Feature activation+7.72e-01

Pos logit contributions
 unnecessary+0.014
 fault+0.014
arl+0.013
ately+0.013
 weaknesses+0.013
Neg logit contributions
 mil-0.010
 particular-0.010
 was-0.010
 pl-0.010
 marsh-0.010
Token,
Token ID11
Feature activation+1.38e-01

Pos logit contributions
 mil+0.005
 marsh+0.005
 particular+0.004
 pl+0.004
 was+0.004
Neg logit contributions
 unnecessary-0.039
ately-0.038
 fault-0.038
arl-0.038
 weaknesses-0.037
Token you
Token ID345
Feature activation-3.64e-01

Pos logit contributions
 mil+0.003
 was+0.003
 marsh+0.003
 particular+0.003
,+0.003
Neg logit contributions
 unnecessary-0.005
 fault-0.005
arl-0.005
 weaknesses-0.005
ately-0.004
Token do
Token ID466
Feature activation+2.91e-01

Pos logit contributions
 unnecessary+0.012
ately+0.012
arl+0.012
 fault+0.012
keeping+0.011
Neg logit contributions
 was-0.009
 marsh-0.008
 mil-0.008
 pl-0.008
,-0.008
Token,
Token ID11
Feature activation-2.24e-01

Pos logit contributions
 mil+0.004
 marsh+0.004
 particular+0.004
 pl+0.004
 was+0.004
Neg logit contributions
 unnecessary-0.012
 fault-0.012
ately-0.012
 weaknesses-0.012
arl-0.012
Token sweet
Token ID6029
Feature activation-1.57e-02

Pos logit contributions
 unnecessary+0.008
arl+0.007
 fault+0.007
 weaknesses+0.007
ately+0.007
Neg logit contributions
 mil-0.005
 was-0.005
 marsh-0.005
 particular-0.005
,-0.005
Tokenheart
Token ID11499
Feature activation-5.72e-01

Pos logit contributions
 unnecessary+0.001
 fault+0.001
ately+0.001
 weaknesses+0.001
arl+0.001
Neg logit contributions
 mil-0.000
 marsh-0.000
 particular-0.000
 was-0.000
 pl-0.000
Token.
Token ID13
Feature activation-8.65e-01

Pos logit contributions
 unnecessary+0.023
 fault+0.023
ately+0.023
arl+0.022
 weaknesses+0.022
Neg logit contributions
 mil-0.009
 marsh-0.009
 particular-0.009
 pl-0.008
 was-0.008
Token Exercise
Token ID32900
Feature activation-8.14e-01

Pos logit contributions
 unnecessary+0.024
 fault+0.023
ately+0.022
 weaknesses+0.022
arl+0.021
Neg logit contributions
 mil-0.026
 marsh-0.025
 was-0.025
 particular-0.025
 pl-0.024
Token?
Token ID30
Feature activation-1.77e-01

Pos logit contributions
 mil+0.035
 marsh+0.034
 particular+0.033
 was+0.032
 pl+0.032
Neg logit contributions
 unnecessary-0.031
 fault-0.030
ately-0.030
arl-0.029
 weaknesses-0.029
Token You
Token ID921
Feature activation-1.06e-01

Pos logit contributions
 unnecessary+0.006
 fault+0.005
 weaknesses+0.005
ately+0.005
arl+0.005
Neg logit contributions
 mil-0.005
 was-0.005
 particular-0.005
 marsh-0.005
-0.005
Token get
Token ID651
Feature activation+3.91e-01

Pos logit contributions
 unnecessary+0.004
 fault+0.003
ately+0.003
arl+0.003
 weaknesses+0.003
Neg logit contributions
 mil-0.002
 marsh-0.002
 particular-0.002
 was-0.002
 pl-0.002
Token to
Token ID284
Feature activation+8.83e-01

Pos logit contributions
 mil+0.007
 marsh+0.007
 particular+0.007
 was+0.007
 pl+0.007
Neg logit contributions
 unnecessary-0.015
 fault-0.015
ately-0.015
 weaknesses-0.015
arl-0.015
Token use
Token ID779
Feature activation+4.62e-01

Pos logit contributions
 mil+0.019
 marsh+0.019
 particular+0.018
 was+0.017
M+0.017
Neg logit contributions
 unnecessary-0.029
 fault-0.029
 ours-0.029
arl-0.029
 weaknesses-0.028
Token a
Token ID257
Feature activation+7.03e-01

Pos logit contributions
 mil+0.011
 marsh+0.010
 particular+0.010
 pl+0.009
 was+0.009
Neg logit contributions
 unnecessary-0.016
 fault-0.015
ately-0.015
 weaknesses-0.015
arl-0.015
Token real
Token ID1103
Feature activation+1.43e+00

Pos logit contributions
 mil+0.017
 marsh+0.017
 particular+0.017
 was+0.016
 pl+0.016
Neg logit contributions
 unnecessary-0.023
 fault-0.022
arl-0.022
 weaknesses-0.022
ately-0.022
Token telephone
Token ID11426
Feature activation+3.43e-01

Pos logit contributions
 mil+0.043
 marsh+0.042
 particular+0.040
 Ms+0.039
M+0.039
Neg logit contributions
 fault-0.037
 unnecessary-0.037
 weaknesses-0.036
ately-0.035
arl-0.035
Token today
Token ID1909
Feature activation-1.29e+00

Pos logit contributions
 mil+0.007
 marsh+0.007
 particular+0.006
 was+0.006
 pl+0.006
Neg logit contributions
 unnecessary-0.013
 fault-0.013
ately-0.012
arl-0.012
 weaknesses-0.012
Token!"
Token ID2474
Feature activation+2.84e-02

Pos logit contributions
 fault+0.042
ately+0.041
 unnecessary+0.041
arl+0.040
 weaknesses+0.040
Neg logit contributions
 mil-0.030
 marsh-0.029
 particular-0.029
 pl-0.028
 sn-0.026
Token
Token ID220
Feature activation+3.84e-01

Pos logit contributions
 mil+0.000
 marsh+0.000
 particular+0.000
 pl+0.000
 was+0.000
Neg logit contributions
ately-0.001
 unnecessary-0.001
 fault-0.001
arl-0.001
der-0.001
Token happy
Token ID3772
Feature activation-2.45e-01

Pos logit contributions
 particular+0.004
 mil+0.004
oc+0.004
 marsh+0.003
me+0.003
Neg logit contributions
 fault-0.004
 unnecessary-0.004
 ours-0.003
emic-0.003
 ago-0.003
Token friends
Token ID2460
Feature activation-7.12e-01

Pos logit contributions
 unnecessary+0.011
arl+0.010
 weaknesses+0.010
ately+0.010
 fault+0.010
Neg logit contributions
 was-0.004
 mil-0.004
,-0.004
 particular-0.004
 marsh-0.004
Token,
Token ID11
Feature activation-2.97e-01

Pos logit contributions
 unnecessary+0.029
 fault+0.029
ately+0.029
 weaknesses+0.028
arl+0.028
Neg logit contributions
 mil-0.011
 marsh-0.011
 particular-0.011
 pl-0.010
 was-0.010
Token and
Token ID290
Feature activation-1.75e-02

Pos logit contributions
 fault+0.011
 unnecessary+0.011
ately+0.011
 weaknesses+0.011
der+0.011
Neg logit contributions
 marsh-0.005
 mil-0.005
 particular-0.005
 pl-0.005
M-0.004
Token no
Token ID645
Feature activation+1.02e+00

Pos logit contributions
 unnecessary+0.001
ately+0.001
 fault+0.001
 weaknesses+0.001
arl+0.001
Neg logit contributions
 mil-0.000
 marsh-0.000
 particular-0.000
 pl-0.000
 was-0.000
Token matter
Token ID2300
Feature activation+6.42e-01

Pos logit contributions
 mil+0.017
 particular+0.017
 marsh+0.017
 was+0.016
 pl+0.016
Neg logit contributions
 unnecessary-0.041
arl-0.040
 fault-0.040
ately-0.040
 weaknesses-0.040
Token where
Token ID810
Feature activation+1.53e+00

Pos logit contributions
 was+0.016
 mil+0.016
 particular+0.016
 marsh+0.016
,+0.015
Neg logit contributions
 unnecessary-0.021
arl-0.020
 fault-0.020
 weaknesses-0.020
ately-0.019
Token they
Token ID484
Feature activation+9.78e-01

Pos logit contributions
 mil+0.033
 marsh+0.032
 was+0.031
 particular+0.031
 pl+0.030
Neg logit contributions
 unnecessary-0.055
arl-0.054
 fault-0.053
ately-0.053
 weaknesses-0.052
Token went
Token ID1816
Feature activation+1.05e+00

Pos logit contributions
 mil+0.018
 marsh+0.018
 particular+0.017
 pl+0.015
 was+0.015
Neg logit contributions
 unnecessary-0.037
ately-0.037
 weaknesses-0.036
 fault-0.036
arl-0.036
Token,
Token ID11
Feature activation-2.45e-02

Pos logit contributions
 mil+0.010
 particular+0.009
 marsh+0.009
 was+0.009
 pl+0.008
Neg logit contributions
 unnecessary-0.050
 fault-0.050
 weaknesses-0.049
ately-0.049
arl-0.049
Token they
Token ID484
Feature activation-2.85e-01

Pos logit contributions
 unnecessary+0.001
 fault+0.001
ately+0.001
 weaknesses+0.001
arl+0.001
Neg logit contributions
 mil-0.001
 marsh-0.001
 particular-0.001
 pl-0.001
 was-0.001
Token
Token ID198
Feature activation+1.75e-01

Pos logit contributions
 mil+0.010
 was+0.010
 particular+0.010
 marsh+0.010
 Ms+0.009
Neg logit contributions
 unnecessary-0.034
 weaknesses-0.033
 fault-0.033
ately-0.033
arl-0.032
Token
Token ID198
Feature activation-3.70e-01

Pos logit contributions
 mil+0.002
 marsh+0.002
 particular+0.002
 was+0.002
 pl+0.002
Neg logit contributions
 unnecessary-0.008
 fault-0.008
ately-0.008
 weaknesses-0.008
arl-0.008
Token"
Token ID1
Feature activation+6.62e-01

Pos logit contributions
 unnecessary+0.013
arl+0.013
ately+0.013
 weaknesses+0.012
 fault+0.012
Neg logit contributions
 mil-0.008
 marsh-0.008
 particular-0.008
 pl-0.007
 sn-0.007
TokenO
Token ID46
Feature activation+1.34e-01

Pos logit contributions
 particular+0.014
 mil+0.014
M+0.014
 was+0.014
 pl+0.013
Neg logit contributions
 fault-0.025
 unnecessary-0.024
ately-0.023
 weaknesses-0.023
 spine-0.023
Tokenst
Token ID301
Feature activation+7.75e-01

Pos logit contributions
 mil+0.004
 marsh+0.004
 particular+0.004
 was+0.004
 pl+0.004
Neg logit contributions
 unnecessary-0.004
 fault-0.004
ately-0.004
 weaknesses-0.004
arl-0.004
Tokenric
Token ID1173
Feature activation-6.23e-01

Pos logit contributions
 mil+0.017
 marsh+0.016
 particular+0.016
 pl+0.015
 was+0.015
Neg logit contributions
 fault-0.028
 unnecessary-0.028
ately-0.027
arl-0.027
 weaknesses-0.026
Tokenhes
Token ID956
Feature activation+1.15e+00

Pos logit contributions
 unnecessary+0.025
ately+0.024
 weaknesses+0.024
 fault+0.024
arl+0.023
Neg logit contributions
 mil-0.011
 marsh-0.011
 was-0.011
 pl-0.010
 particular-0.010
Token do
Token ID466
Feature activation+1.06e+00

Pos logit contributions
 mil+0.024
 marsh+0.024
 particular+0.023
 pl+0.021
 Ms+0.020
Neg logit contributions
 fault-0.041
ately-0.040
arl-0.040
 unnecessary-0.040
 weaknesses-0.039
Token not
Token ID407
Feature activation-7.29e-01

Pos logit contributions
 marsh+0.028
 particular+0.028
 mil+0.028
 pl+0.026
 pasture+0.024
Neg logit contributions
 unnecessary-0.033
 fault-0.031
 weaknesses-0.031
ately-0.030
der-0.029
Token like
Token ID588
Feature activation-1.88e-01

Pos logit contributions
ately+0.024
 fault+0.024
 unnecessary+0.024
 weaknesses+0.023
arl+0.023
Neg logit contributions
 mil-0.017
 marsh-0.017
 particular-0.016
 pl-0.016
 was-0.016
Token people
Token ID661
Feature activation-8.37e-01

Pos logit contributions
 unnecessary+0.007
 fault+0.007
ately+0.006
arl+0.006
 weaknesses+0.006
Neg logit contributions
 mil-0.004
 particular-0.004
 marsh-0.004
 pl-0.004
 was-0.004
Token cookies
Token ID14746
Feature activation+4.62e-01

Pos logit contributions
 mil+0.014
 marsh+0.014
 particular+0.013
 pl+0.013
 was+0.012
Neg logit contributions
 unnecessary-0.026
 fault-0.026
ately-0.025
 weaknesses-0.025
arl-0.025
Token and
Token ID290
Feature activation+9.87e-01

Pos logit contributions
 mil+0.009
 particular+0.009
 marsh+0.008
 pl+0.008
 Ms+0.007
Neg logit contributions
 fault-0.017
ately-0.017
arl-0.017
der-0.017
 unnecessary-0.017
Token playing
Token ID2712
Feature activation+7.61e-01

Pos logit contributions
 mil+0.017
 marsh+0.017
 particular+0.016
 was+0.015
 pl+0.015
Neg logit contributions
 unnecessary-0.039
 fault-0.038
ately-0.038
arl-0.038
 weaknesses-0.038
Token.
Token ID13
Feature activation+6.05e-01

Pos logit contributions
 mil+0.006
 marsh+0.006
 was+0.006
 particular+0.005
 pl+0.005
Neg logit contributions
 unnecessary-0.037
 fault-0.037
 weaknesses-0.037
ately-0.037
arl-0.036
Token He
Token ID679
Feature activation-9.02e-02

Pos logit contributions
 was+0.011
 marsh+0.011
 particular+0.011
 Mitt+0.011
 w+0.010
Neg logit contributions
ately-0.024
 unnecessary-0.024
 weaknesses-0.023
 fault-0.022
asing-0.022
Token was
Token ID373
Feature activation-3.99e-01

Pos logit contributions
der+0.003
ately+0.003
 fault+0.003
 weaknesses+0.003
 unnecessary+0.003
Neg logit contributions
 mil-0.002
 particular-0.002
 marsh-0.002
Ms-0.002
 Ms-0.002
Token a
Token ID257
Feature activation+8.57e-02

Pos logit contributions
 unnecessary+0.013
 fault+0.013
 ours+0.012
arl+0.012
 spies+0.012
Neg logit contributions
 particular-0.010
 mil-0.010
 marsh-0.009
 pl-0.008
 was-0.008
Token happy
Token ID3772
Feature activation-5.12e-01

Pos logit contributions
 mil+0.002
 marsh+0.002
 particular+0.002
 pl+0.002
 was+0.002
Neg logit contributions
 unnecessary-0.003
 fault-0.003
arl-0.003
 weaknesses-0.003
ately-0.003
Token dog
Token ID3290
Feature activation+1.06e+00

Pos logit contributions
 unnecessary+0.020
 weaknesses+0.020
ately+0.019
 fault+0.019
arl+0.019
Neg logit contributions
 mil-0.009
 marsh-0.009
 particular-0.009
 was-0.009
 pl-0.008
Token.
Token ID13
Feature activation+8.59e-01

Pos logit contributions
 marsh+0.022
 particular+0.022
 mil+0.021
 pl+0.019
M+0.017
Neg logit contributions
ately-0.037
 unnecessary-0.036
der-0.036
 fault-0.036
 spies-0.035
Token<|endoftext|>
Token ID50256
Feature activation-8.37e-01

Pos logit contributions
 was+0.013
 particular+0.013
 marsh+0.013
 mil+0.012
 w+0.011
Neg logit contributions
 unnecessary-0.036
ately-0.036
 weaknesses-0.035
 fault-0.035
arl-0.033
Token She
Token ID1375
Feature activation-3.72e-01

Pos logit contributions
 was+0.002
,+0.002
 particular+0.002
 marsh+0.002
 mil+0.002
Neg logit contributions
 unnecessary-0.008
 fault-0.007
ately-0.007
 weaknesses-0.007
 ours-0.007
Token was
Token ID373
Feature activation-3.55e-01

Pos logit contributions
ately+0.011
der+0.010
endant+0.010
 weaknesses+0.009
emic+0.009
Neg logit contributions
 marsh-0.011
 particular-0.010
 mil-0.010
 gra-0.009
 pasture-0.009
Token happy
Token ID3772
Feature activation-2.09e-01

Pos logit contributions
 fault+0.010
 unnecessary+0.010
 ours+0.010
arl+0.010
 spies+0.009
Neg logit contributions
 marsh-0.010
 mil-0.010
 particular-0.010
oc-0.009
 was-0.009
Token that
Token ID326
Feature activation-1.05e-01

Pos logit contributions
 unnecessary+0.010
 weaknesses+0.010
arl+0.010
 fault+0.009
ately+0.009
Neg logit contributions
 was-0.003
,-0.003
 mil-0.002
 w-0.002
 sn-0.002
Token Max
Token ID5436
Feature activation-4.70e-01

Pos logit contributions
 fault+0.003
 weaknesses+0.003
 unnecessary+0.003
ately+0.003
 spies+0.002
Neg logit contributions
 mil-0.003
 marsh-0.003
 pl-0.003
 pasture-0.003
 particular-0.003
Token listened
Token ID16399
Feature activation+1.27e-01

Pos logit contributions
 fault+0.017
ately+0.017
 weaknesses+0.016
 unnecessary+0.016
der+0.016
Neg logit contributions
 marsh-0.010
 mil-0.010
 particular-0.009
 pl-0.008
 Ms-0.007
Token to
Token ID284
Feature activation-7.21e-01

Pos logit contributions
 mil+0.003
 marsh+0.003
 particular+0.003
 pl+0.002
 was+0.002
Neg logit contributions
 unnecessary-0.005
 fault-0.004
ately-0.004
 weaknesses-0.004
arl-0.004
Token her
Token ID607
Feature activation+1.06e-04

Pos logit contributions
 unnecessary+0.026
 fault+0.026
ately+0.026
 weaknesses+0.025
arl+0.025
Neg logit contributions
 particular-0.015
 mil-0.015
 marsh-0.014
 was-0.014
 pl-0.013
Token.
Token ID13
Feature activation+5.38e-01

Pos logit contributions
 mil+0.000
 particular+0.000
 was+0.000
 marsh+0.000
 pl+0.000
Neg logit contributions
 unnecessary-0.000
 weaknesses-0.000
arl-0.000
ately-0.000
 fault-0.000
Token
Token ID198
Feature activation+7.61e-01

Pos logit contributions
 was+0.007
,+0.007
 particular+0.007
 marsh+0.006
 Ms+0.006
Neg logit contributions
 unnecessary-0.025
ately-0.024
 fault-0.024
 weaknesses-0.023
 ours-0.022
Token
Token ID198
Feature activation+3.10e-03

Pos logit contributions
 mil+0.011
 marsh+0.010
 particular+0.010
 pl+0.009
 was+0.009
Neg logit contributions
 unnecessary-0.032
 fault-0.032
arl-0.032
ately-0.032
 weaknesses-0.032
Token It
Token ID632
Feature activation+6.29e-01

Pos logit contributions
 unnecessary+0.001
 fault+0.001
 weaknesses+0.001
ately+0.001
arl+0.001
Neg logit contributions
 mil-0.001
 marsh-0.001
 particular-0.001
 was-0.001
 pl-0.001
Token was
Token ID373
Feature activation-2.41e-01

Pos logit contributions
 mil+0.013
 pasture+0.013
Ms+0.012
 marsh+0.011
um+0.011
Neg logit contributions
der-0.020
ately-0.019
 weaknesses-0.019
 fault-0.018
 villains-0.018
Token very
Token ID845
Feature activation+1.22e-01

Pos logit contributions
 fault+0.008
 unnecessary+0.007
 ours+0.007
arl+0.007
der+0.007
Neg logit contributions
 mil-0.006
oc-0.006
me-0.005
 particular-0.005
 marsh-0.005
Token gray
Token ID12768
Feature activation-5.00e-01

Pos logit contributions
 mil+0.003
 marsh+0.003
 particular+0.003
 was+0.002
 pl+0.002
Neg logit contributions
 unnecessary-0.004
arl-0.004
 fault-0.004
ately-0.004
 weaknesses-0.004
Token.
Token ID13
Feature activation-2.18e-01

Pos logit contributions
 unnecessary+0.024
 weaknesses+0.023
ately+0.023
arl+0.023
 fault+0.022
Neg logit contributions
 mil-0.005
 was-0.005
 marsh-0.005
 particular-0.005
,-0.005
Token They
Token ID1119
Feature activation+6.72e-02

Pos logit contributions
 unnecessary+0.007
 fault+0.007
ately+0.007
 weaknesses+0.007
arl+0.007
Neg logit contributions
 mil-0.005
 marsh-0.005
 particular-0.005
 was-0.005
 pl-0.005
Token looked
Token ID3114
Feature activation+5.79e-01

Pos logit contributions
 mil+0.001
 marsh+0.001
 was+0.001
 particular+0.001
 pl+0.001
Neg logit contributions
 unnecessary-0.003
 fault-0.003
arl-0.002
 weaknesses-0.002
ately-0.002
Token at
Token ID379
Feature activation-1.31e+00

Pos logit contributions
 mil+0.012
 marsh+0.012
 particular+0.011
 was+0.011
 pl+0.011
Neg logit contributions
 unnecessary-0.021
ately-0.021
 fault-0.021
 weaknesses-0.021
arl-0.021
Token each
Token ID1123
Feature activation+1.06e+00

Pos logit contributions
 unnecessary+0.051
 fault+0.050
ately+0.049
 weaknesses+0.049
arl+0.049
Neg logit contributions
 mil-0.024
 marsh-0.023
 particular-0.023
 pl-0.022
 was-0.021
Token other
Token ID584
Feature activation-7.53e-01

Pos logit contributions
 mil+0.027
 marsh+0.027
 pl+0.023
ink+0.023
 particular+0.022
Neg logit contributions
 unnecessary-0.033
ately-0.033
 fault-0.032
 weaknesses-0.031
arl-0.031
Token.
Token ID13
Feature activation+3.73e-02

Pos logit contributions
 unnecessary+0.034
arl+0.033
 fault+0.033
 weaknesses+0.033
ately+0.033
Neg logit contributions
 mil-0.009
 marsh-0.008
 was-0.008
 particular-0.008
 pl-0.007
Token.
Token ID13
Feature activation-6.13e-01

Pos logit contributions
ately+0.032
 fault+0.032
 unnecessary+0.031
arl+0.030
der+0.030
Neg logit contributions
 marsh-0.013
 particular-0.013
 mil-0.013
 pl-0.011
 Ms-0.010
Token They
Token ID1119
Feature activation+3.56e-01

Pos logit contributions
 unnecessary+0.021
ately+0.020
 fault+0.019
 weaknesses+0.019
arl+0.019
Neg logit contributions
 mil-0.015
 marsh-0.015
 particular-0.015
 was-0.014
 pl-0.013
Token felt
Token ID2936
Feature activation+9.56e-01

Pos logit contributions
 mil+0.009
 particular+0.008
 marsh+0.008
Ms+0.007
M+0.007
Neg logit contributions
ately-0.011
 weaknesses-0.010
der-0.010
emic-0.010
 fault-0.010
Token so
Token ID523
Feature activation+6.11e-01

Pos logit contributions
 particular+0.028
me+0.026
M+0.025
 mil+0.025
ink+0.024
Neg logit contributions
 unnecessary-0.028
 ours-0.025
 fault-0.023
lessness-0.022
 privacy-0.021
Token happy
Token ID3772
Feature activation-1.42e-01

Pos logit contributions
 mil+0.015
 marsh+0.014
 particular+0.014
 pl+0.013
 was+0.013
Neg logit contributions
 unnecessary-0.020
arl-0.020
 fault-0.020
ately-0.019
 weaknesses-0.019
Token and
Token ID290
Feature activation-2.52e-01

Pos logit contributions
 unnecessary+0.006
arl+0.006
 weaknesses+0.006
 fault+0.006
ately+0.006
Neg logit contributions
 mil-0.002
 was-0.002
 marsh-0.002
,-0.002
 particular-0.002
Token excited
Token ID6568
Feature activation-3.93e-02

Pos logit contributions
 unnecessary+0.008
ately+0.008
 weaknesses+0.007
 fault+0.007
 ours+0.007
Neg logit contributions
 marsh-0.006
 particular-0.006
 mil-0.006
 pl-0.006
M-0.006
Token to
Token ID284
Feature activation-5.87e-01

Pos logit contributions
 unnecessary+0.002
arl+0.002
 weaknesses+0.002
emic+0.002
 fault+0.002
Neg logit contributions
 was-0.000
,-0.000
 mil-0.000
 w-0.000
 particular-0.000
Token take
Token ID1011
Feature activation-9.34e-01

Pos logit contributions
arl+0.018
 weaknesses+0.018
 unnecessary+0.018
 fault+0.018
ately+0.018
Neg logit contributions
 mil-0.014
 marsh-0.014
 particular-0.012
 Ms-0.012
M-0.012
Token it
Token ID340
Feature activation+1.35e+00

Pos logit contributions
 unnecessary+0.038
 fault+0.037
ately+0.037
 weaknesses+0.036
arl+0.036
Neg logit contributions
 mil-0.015
 particular-0.015
 marsh-0.014
 pl-0.013
 was-0.013
Token out
Token ID503
Feature activation+1.19e+00

Pos logit contributions
 mil+0.036
 marsh+0.036
 particular+0.035
 pl+0.033
 was+0.032
Neg logit contributions
 fault-0.041
 unnecessary-0.040
arl-0.039
ately-0.039
 weaknesses-0.038
Token didn
Token ID1422
Feature activation-7.39e-01

Pos logit contributions
ately+0.017
 fault+0.016
 unnecessary+0.016
 weaknesses+0.016
der+0.016
Neg logit contributions
 marsh-0.011
 mil-0.011
 particular-0.010
 pl-0.009
 Ms-0.009
Token't
Token ID470
Feature activation-1.19e+00

Pos logit contributions
arl+0.020
 fault+0.020
der+0.019
ately+0.019
 unnecessary+0.018
Neg logit contributions
 mil-0.020
 marsh-0.020
 pasture-0.019
 pl-0.019
 l-0.019
Token want
Token ID765
Feature activation-3.18e-01

Pos logit contributions
ately+0.038
 unnecessary+0.038
 weaknesses+0.037
 fault+0.037
endant+0.036
Neg logit contributions
 was-0.029
,-0.029
 particular-0.029
 mil-0.029
 sn-0.028
Token to
Token ID284
Feature activation-2.31e-01

Pos logit contributions
 unnecessary+0.010
 fault+0.010
ately+0.009
 weaknesses+0.009
der+0.009
Neg logit contributions
 marsh-0.008
 mil-0.008
 particular-0.008
 pl-0.008
me-0.007
Token because
Token ID780
Feature activation+1.43e-01

Pos logit contributions
 unnecessary+0.009
 fault+0.008
ately+0.008
 weaknesses+0.008
arl+0.008
Neg logit contributions
 mil-0.005
 marsh-0.005
 particular-0.004
 was-0.004
 pl-0.004
Token he
Token ID339
Feature activation-1.37e+00

Pos logit contributions
 mil+0.003
 marsh+0.003
 particular+0.003
 pl+0.003
 was+0.003
Neg logit contributions
 unnecessary-0.005
 fault-0.005
ately-0.005
 weaknesses-0.005
arl-0.005
Token was
Token ID373
Feature activation-3.02e-01

Pos logit contributions
ately+0.052
 unnecessary+0.052
 fault+0.051
 weaknesses+0.050
der+0.048
Neg logit contributions
 marsh-0.026
 mil-0.026
 particular-0.022
 Ms-0.020
 reception-0.020
Token too
Token ID1165
Feature activation-2.06e-01

Pos logit contributions
 unnecessary+0.010
 fault+0.010
ately+0.010
 weaknesses+0.010
arl+0.010
Neg logit contributions
 mil-0.007
 marsh-0.007
 particular-0.007
 pl-0.006
 was-0.006
Token tired
Token ID10032
Feature activation-5.91e-01

Pos logit contributions
 unnecessary+0.007
 fault+0.007
arl+0.007
ately+0.007
 weaknesses+0.007
Neg logit contributions
 mil-0.005
 marsh-0.005
 particular-0.004
 pl-0.004
 was-0.004
Token.
Token ID13
Feature activation+2.98e-01

Pos logit contributions
 unnecessary+0.026
 weaknesses+0.025
 fault+0.025
ately+0.025
arl+0.025
Neg logit contributions
 mil-0.008
 was-0.008
 particular-0.008
 marsh-0.007
,-0.007
Token
Token ID220
Feature activation+4.24e-01

Pos logit contributions
 mil+0.005
 marsh+0.005
 particular+0.005
 was+0.005
 pl+0.005
Neg logit contributions
 unnecessary-0.012
 fault-0.011
ately-0.011
arl-0.011
 weaknesses-0.011
Token stay
Token ID2652
Feature activation-6.08e-01

Pos logit contributions
 unnecessary+0.004
arl+0.004
ately+0.004
 fault+0.004
 weaknesses+0.004
Neg logit contributions
 mil-0.002
 marsh-0.002
 particular-0.002
 was-0.002
 Ms-0.002
Token still
Token ID991
Feature activation+2.19e-01

Pos logit contributions
 unnecessary+0.022
 fault+0.022
ately+0.022
arl+0.021
 weaknesses+0.021
Neg logit contributions
 mil-0.013
 marsh-0.012
 particular-0.012
 was-0.011
 pl-0.011
Token,
Token ID11
Feature activation-6.54e-01

Pos logit contributions
 mil+0.002
 marsh+0.002
 particular+0.002
 was+0.002
 pl+0.002
Neg logit contributions
 unnecessary-0.010
 fault-0.010
ately-0.010
arl-0.010
 weaknesses-0.010
Token so
Token ID523
Feature activation-1.66e-01

Pos logit contributions
 unnecessary+0.021
arl+0.021
 weaknesses+0.021
ately+0.021
 fault+0.021
Neg logit contributions
 mil-0.016
 marsh-0.015
 was-0.015
 particular-0.015
 pl-0.015
Token it
Token ID340
Feature activation+2.90e-01

Pos logit contributions
 unnecessary+0.006
 fault+0.006
ately+0.006
 weaknesses+0.006
arl+0.006
Neg logit contributions
 mil-0.003
 marsh-0.003
 particular-0.003
 pl-0.003
 was-0.003
Token held
Token ID2714
Feature activation-1.36e+00

Pos logit contributions
 particular+0.007
 mil+0.007
Ms+0.006
 Ms+0.006
 marsh+0.006
Neg logit contributions
ately-0.009
der-0.009
 unnecessary-0.009
 weaknesses-0.009
 fault-0.008
Token tight
Token ID5381
Feature activation+2.27e-01

Pos logit contributions
 unnecessary+0.054
 fault+0.053
ately+0.052
 weaknesses+0.052
arl+0.052
Neg logit contributions
 mil-0.024
 marsh-0.023
 particular-0.023
 was-0.021
 pl-0.021
Token to
Token ID284
Feature activation-1.15e+00

Pos logit contributions
 mil+0.003
 marsh+0.003
 was+0.003
 particular+0.003
 pl+0.003
Neg logit contributions
 unnecessary-0.010
 fault-0.009
 weaknesses-0.009
arl-0.009
ately-0.009
Token the
Token ID262
Feature activation+6.27e-01

Pos logit contributions
 unnecessary+0.048
 fault+0.048
arl+0.047
ately+0.047
 weaknesses+0.047
Neg logit contributions
 mil-0.017
 marsh-0.017
 particular-0.016
 was-0.015
 pl-0.015
Token grass
Token ID8701
Feature activation-5.25e-01

Pos logit contributions
 mil+0.013
 marsh+0.013
 was+0.012
 pl+0.012
 particular+0.012
Neg logit contributions
 unnecessary-0.023
 fault-0.023
ately-0.023
arl-0.022
 weaknesses-0.022
Token.
Token ID13
Feature activation-8.47e-01

Pos logit contributions
 unnecessary+0.021
 fault+0.021
ately+0.020
arl+0.020
 weaknesses+0.020
Neg logit contributions
 mil-0.009
 marsh-0.009
 particular-0.008
 was-0.008
 pl-0.008
Token.
Token ID13
Feature activation+1.46e-02

Pos logit contributions
 fault+0.018
der+0.017
arl+0.017
ately+0.017
 unnecessary+0.017
Neg logit contributions
 mil-0.010
 marsh-0.010
 particular-0.010
 Ms-0.009
 pl-0.009
Token The
Token ID383
Feature activation-1.89e-01

Pos logit contributions
 marsh+0.000
 mil+0.000
 was+0.000
 particular+0.000
 pl+0.000
Neg logit contributions
 unnecessary-0.001
 weaknesses-0.001
 fault-0.001
ately-0.001
arl-0.000
Token ball
Token ID2613
Feature activation+5.67e-02

Pos logit contributions
ately+0.007
 unnecessary+0.007
 fault+0.007
 weaknesses+0.007
arl+0.007
Neg logit contributions
 mil-0.004
 was-0.004
 marsh-0.003
 particular-0.003
 l-0.003
Token popped
Token ID22928
Feature activation-7.79e-01

Pos logit contributions
Ms+0.001
 particular+0.001
 marsh+0.001
 mil+0.001
 pasture+0.001
Neg logit contributions
der-0.002
ately-0.002
 spies-0.001
 loses-0.001
 villains-0.001
Token and
Token ID290
Feature activation-5.66e-02

Pos logit contributions
der+0.026
ately+0.026
arl+0.026
 unnecessary+0.026
 fault+0.025
Neg logit contributions
 particular-0.018
 mil-0.018
 marsh-0.017
 pl-0.015
 Ms-0.015
Token def
Token ID825
Feature activation-1.43e+00

Pos logit contributions
der+0.002
arl+0.002
 unnecessary+0.002
ately+0.002
 spies+0.002
Neg logit contributions
 particular-0.001
 pasture-0.001
me-0.001
ink-0.001
 mil-0.001
Tokenlated
Token ID17249
Feature activation-3.67e-03

Pos logit contributions
 unnecessary+0.034
 fault+0.033
 weaknesses+0.032
ately+0.032
arl+0.031
Neg logit contributions
 mil-0.048
 marsh-0.047
 particular-0.046
 was-0.045
 pl-0.045
Token.
Token ID13
Feature activation-4.11e-01

Pos logit contributions
 unnecessary+0.000
 weaknesses+0.000
 fault+0.000
arl+0.000
ately+0.000
Neg logit contributions
 mil-0.000
 marsh-0.000
 was-0.000
 particular-0.000
 pl-0.000
Token The
Token ID383
Feature activation-3.96e-01

Pos logit contributions
 unnecessary+0.016
 weaknesses+0.016
 fault+0.015
ately+0.015
 privacy+0.015
Neg logit contributions
 marsh-0.008
 was-0.008
 mil-0.008
 particular-0.007
 pl-0.007
Token dog
Token ID3290
Feature activation-5.83e-02

Pos logit contributions
ately+0.016
 unnecessary+0.015
 fault+0.015
 weaknesses+0.015
arl+0.015
Neg logit contributions
 was-0.007
 mil-0.007
 marsh-0.007
 l-0.007
 end-0.007
Token ran
Token ID4966
Feature activation-5.66e-01

Pos logit contributions
der+0.002
ately+0.002
 loses+0.002
 unnecessary+0.002
 villains+0.002
Neg logit contributions
 particular-0.002
 marsh-0.001
 mil-0.001
 pasture-0.001
opl-0.001
Token sad
Token ID6507
Feature activation-5.78e-01

Pos logit contributions
 particular+0.029
 mil+0.028
me+0.027
 marsh+0.027
 pasture+0.026
Neg logit contributions
 unnecessary-0.034
der-0.030
 fault-0.029
 ours-0.028
 weaknesses-0.028
Token.
Token ID13
Feature activation-5.60e-01

Pos logit contributions
 unnecessary+0.029
 weaknesses+0.029
arl+0.028
emic+0.028
 ours+0.028
Neg logit contributions
,-0.006
 was-0.005
 w-0.004
 sn-0.004
 marsh-0.003
Token Suddenly
Token ID24975
Feature activation+4.39e-01

Pos logit contributions
 unnecessary+0.022
arl+0.022
 fault+0.022
 weaknesses+0.021
 ours+0.021
Neg logit contributions
 was-0.010
 mil-0.010
 particular-0.009
 marsh-0.009
 pl-0.009
Token,
Token ID11
Feature activation-2.75e-01

Pos logit contributions
 mil+0.005
 particular+0.004
 marsh+0.004
 pl+0.003
 pasture+0.003
Neg logit contributions
ately-0.020
 unnecessary-0.020
 weaknesses-0.019
 fault-0.019
der-0.019
Token a
Token ID257
Feature activation-8.62e-01

Pos logit contributions
 unnecessary+0.012
ately+0.012
 fault+0.012
 weaknesses+0.012
arl+0.012
Neg logit contributions
 mil-0.004
 marsh-0.003
 particular-0.003
 pl-0.003
 was-0.003
Token kind
Token ID1611
Feature activation-1.89e+00

Pos logit contributions
 unnecessary+0.031
arl+0.030
 weaknesses+0.030
 fault+0.030
ately+0.030
Neg logit contributions
 mil-0.017
 marsh-0.017
 particular-0.016
 was-0.015
 pl-0.015
Token butterfly
Token ID35113
Feature activation+2.81e-01

Pos logit contributions
arl+0.062
 fault+0.060
ately+0.060
 unnecessary+0.060
der+0.059
Neg logit contributions
 mil-0.044
 marsh-0.044
 particular-0.041
 was-0.039
 pl-0.039
Token flew
Token ID13112
Feature activation-9.01e-01

Pos logit contributions
 particular+0.006
 marsh+0.006
 mil+0.006
 Ms+0.005
Ms+0.005
Neg logit contributions
ately-0.009
der-0.009
 unnecessary-0.009
endant-0.008
arl-0.008
Token by
Token ID416
Feature activation+1.19e-02

Pos logit contributions
arl+0.027
 unnecessary+0.026
ately+0.025
 fault+0.025
 weaknesses+0.024
Neg logit contributions
 marsh-0.026
 mil-0.025
 particular-0.024
me-0.021
 pl-0.021
Token and
Token ID290
Feature activation-1.79e-01

Pos logit contributions
 mil+0.000
 marsh+0.000
 particular+0.000
 was+0.000
 pl+0.000
Neg logit contributions
 unnecessary-0.001
 fault-0.001
ately-0.001
 weaknesses-0.001
arl-0.001
Token said
Token ID531
Feature activation-3.59e-01

Pos logit contributions
 unnecessary+0.006
ately+0.006
der+0.006
 weaknesses+0.006
 ours+0.006
Neg logit contributions
 particular-0.004
 mil-0.004
 marsh-0.004
 pasture-0.003
 end-0.003
Token marriage
Token ID4845
Feature activation-3.08e-01

Pos logit contributions
 mil+0.002
 marsh+0.002
 particular+0.002
 pl+0.001
 was+0.001
Neg logit contributions
 unnecessary-0.005
 fault-0.005
ately-0.005
 weaknesses-0.005
arl-0.004
Token.
Token ID13
Feature activation-5.50e-01

Pos logit contributions
 unnecessary+0.011
 fault+0.010
ately+0.010
arl+0.010
 weaknesses+0.010
Neg logit contributions
 mil-0.007
 marsh-0.007
 particular-0.006
 pl-0.006
 was-0.006
Token You
Token ID921
Feature activation-5.88e-01

Pos logit contributions
 unnecessary+0.015
 fault+0.014
ately+0.014
 weaknesses+0.014
arl+0.013
Neg logit contributions
 mil-0.017
 marsh-0.017
 particular-0.016
 was-0.016
-0.016
Token be
Token ID307
Feature activation-8.62e-02

Pos logit contributions
 unnecessary+0.019
ately+0.019
arl+0.019
 fault+0.019
 weaknesses+0.018
Neg logit contributions
 mil-0.014
 marsh-0.014
 was-0.014
 particular-0.013
 pl-0.013
Token the
Token ID262
Feature activation+6.51e-01

Pos logit contributions
 unnecessary+0.003
 fault+0.003
arl+0.003
 ours+0.003
 weaknesses+0.003
Neg logit contributions
 mil-0.002
 marsh-0.002
 particular-0.002
 was-0.002
 pl-0.002
Token husband
Token ID5229
Feature activation-1.56e+00

Pos logit contributions
 mil+0.015
 marsh+0.015
 particular+0.015
 was+0.015
 pl+0.014
Neg logit contributions
 fault-0.021
 unnecessary-0.021
arl-0.021
 ours-0.020
 weaknesses-0.020
Token and
Token ID290
Feature activation-7.14e-02

Pos logit contributions
 unnecessary+0.048
 fault+0.048
ately+0.047
arl+0.047
 weaknesses+0.046
Neg logit contributions
 mil-0.040
 marsh-0.039
 particular-0.039
 pl-0.037
 was-0.035
Token I
Token ID314
Feature activation-1.34e+00

Pos logit contributions
 unnecessary+0.002
 fault+0.002
ately+0.002
 weaknesses+0.002
arl+0.002
Neg logit contributions
 mil-0.002
 marsh-0.002
 particular-0.002
 was-0.002
 pl-0.002
Token be
Token ID307
Feature activation-2.31e-01

Pos logit contributions
 fault+0.035
 unnecessary+0.033
 weaknesses+0.033
arl+0.032
 ours+0.030
Neg logit contributions
 mil-0.040
 marsh-0.040
 particular-0.038
 Ms-0.037
M-0.036
Token the
Token ID262
Feature activation+9.15e-01

Pos logit contributions
 unnecessary+0.007
 fault+0.007
 ours+0.007
arl+0.007
 weaknesses+0.007
Neg logit contributions
 marsh-0.006
 mil-0.006
 particular-0.006
 was-0.005
 pl-0.005
Token wife
Token ID3656
Feature activation-1.90e+00

Pos logit contributions
 mil+0.022
 marsh+0.022
 particular+0.021
 was+0.021
 pl+0.020
Neg logit contributions
 unnecessary-0.030
 fault-0.029
arl-0.029
 weaknesses-0.028
 ours-0.027