TOP ACTIVATIONS
MAX = 0.073

special ." The grand kid and Grand ma gathered
io was thankful and thanked T ummy before they each continued
. One day T oni found a big ,
her . The grand kids were so happy and
drove back home , the grand kid held her wheel and
. One day T oni asked Ben why he
One day , T essa and her parents went
big bag of papers . J ax got scared and started
careful and sat on the c actus . They became good
girl called T essa . T essa was playing in the
. Finally , T oni was ready . She
Tim didn 't like the seaw eed because it would get
When they arrived , the grand kid was so excited !
done a H ula - H oop Dance before , but
play on . One day J ax had the urge to
After that , J ax said goodbye to the
One day , J ax was playing in the
One day , T essa went to the park
In the nightmare , T essa was running and running
help . Then T essa spotted some salt nearby

MIN ACTIVATIONS
MIN = -0.059

Sue and Joe , were sitting in front of it .
and L ola - were sitting on a cool raft .
he could be and never mis beh aved . <|endoftext|> Once
to fix it . Her mom my was very thoughtful and
If you want to get rid of the hay , you
B un 's friend , a flexible snake named Sly
me ?" she asked . Mom my smiled and took the
came to help her . Mom my found the key and
and hugged it tightly . Mom my told Lily that it
. Luckily , his mom heard him and
Lily not to worry . Mom my gave her a hug
One day , Jonathan was sitting on his per ch in
my came home . Her mom my called the ambulance but
my ac he . Her mom my said , " Don
just go without it ? Mom my smiled and told her
, the baby cried . Mom my heard the cry and
like the ignorant boy who laughed at me when I fell
her his spoiled bucket . Mom my said not to worry
, a little girl was sitting on a yacht . She
mom my tightly . Her mom my told her that sometimes

INTERVAL 0.040 - 0.073
CONTAINS 0.145%

elly ?" The c actus answered , " I
a sandwich . S ara wants to kick the
hero . He could barely contain his pride . From then
said , " Expl oring is great
said no , but the bad person didn 't listen .

INTERVAL 0.007 - 0.040
CONTAINS 27.019%

and he was so happy to see her . They hugged
says . Sam does not go away . He
his front yard . Jack thought the neighbor was so adorable
to melt . She quickly ate it all up before it
the bird , but he was too slow . He fell

INTERVAL -0.026 - 0.007
CONTAINS 71.338%

, you can have me as a reward ! What are
While they were playing , Tim found a big
's your name ?" The pony was shy and didn 't
my friend anymore . You are mean and careless . You
hurting Max 's feelings . She wanted to improve their friendship

INTERVAL -0.059 - -0.026
CONTAINS 1.497%

cloth ." So , Lily leaned back and felt the warm
His friend replied , " Why should I ?
<|endoftext|> Tom and Lily were playing in the garden . They
day , a little girl came to visit her . The
mom saw what was happening and came over to hug her
Token special
Token ID2041
Feature activation+2.39e-03

Pos logit contributions
 beating+0.021
ning+0.021
 aware+0.020
aping+0.020
broken+0.019
Neg logit contributions
 vows-0.017
 throne-0.017
bridge-0.017
 menu-0.016
rost-0.016
Token."
Token ID526
Feature activation-9.84e-03

Pos logit contributions
 throne+0.004
 vows+0.004
bridge+0.004
 menu+0.004
affle+0.004
Neg logit contributions
 beating-0.003
ning-0.003
 aware-0.003
aping-0.003
hr-0.003
Token
Token ID198
Feature activation-1.21e-02

Pos logit contributions
 beating+0.012
 aware+0.012
ning+0.012
 group+0.011
aping+0.011
Neg logit contributions
bridge-0.018
 throne-0.018
 vows-0.018
 menu-0.017
rost-0.017
Token
Token ID198
Feature activation-3.57e-03

Pos logit contributions
ning+0.016
 aware+0.016
 beating+0.016
 group+0.015
aping+0.014
Neg logit contributions
 throne-0.020
 vows-0.020
bridge-0.020
rost-0.020
affle-0.019
TokenThe
Token ID464
Feature activation+1.43e-02

Pos logit contributions
ning+0.005
 beating+0.005
 aware+0.005
run+0.005
 group+0.004
Neg logit contributions
bridge-0.006
 vows-0.006
 throne-0.006
rost-0.006
 menu-0.005
Token grand
Token ID4490
Feature activation+7.26e-02

Pos logit contributions
bridge+0.026
 throne+0.025
 vows+0.025
affle+0.024
oll+0.024
Neg logit contributions
ning-0.018
 aware-0.018
 beating-0.018
aping-0.017
hr-0.017
Tokenkid
Token ID38439
Feature activation+1.44e-02

Pos logit contributions
bridge+0.121
 throne+0.115
 vows+0.113
obb+0.106
rost+0.103
Neg logit contributions
 beating-0.105
ning-0.103
 group-0.101
hr-0.098
 aware-0.098
Token and
Token ID290
Feature activation+1.43e-02

Pos logit contributions
 vows+0.026
 throne+0.026
bridge+0.026
affle+0.025
 menu+0.024
Neg logit contributions
 beating-0.018
ning-0.017
 aware-0.017
 group-0.016
hr-0.016
Token Grand
Token ID5675
Feature activation+2.96e-02

Pos logit contributions
 throne+0.022
bridge+0.022
oll+0.021
 vows+0.021
 menu+0.020
Neg logit contributions
 aware-0.021
aping-0.021
ning-0.021
ing-0.019
 group-0.019
Tokenma
Token ID2611
Feature activation-1.25e-02

Pos logit contributions
bridge+0.055
 throne+0.054
 vows+0.052
affle+0.049
rost+0.049
Neg logit contributions
ning-0.037
 beating-0.036
 aware-0.035
 group-0.035
aping-0.034
Token gathered
Token ID9272
Feature activation-4.09e-03

Pos logit contributions
 beating+0.017
ning+0.017
 aware+0.016
 group+0.016
hr+0.016
Neg logit contributions
bridge-0.021
 throne-0.021
 vows-0.021
affle-0.020
rost-0.020
Tokenio
Token ID952
Feature activation-1.02e-02

Pos logit contributions
 throne+0.062
bridge+0.061
 microwave+0.058
 piping+0.056
 vows+0.055
Neg logit contributions
ning-0.087
ran-0.083
ing-0.083
aping-0.083
hr-0.082
Token was
Token ID373
Feature activation+5.86e-03

Pos logit contributions
ning+0.013
 beating+0.013
 aware+0.012
 group+0.012
hr+0.012
Neg logit contributions
 vows-0.019
bridge-0.018
 throne-0.018
affle-0.017
oll-0.017
Token thankful
Token ID25535
Feature activation+1.54e-02

Pos logit contributions
bridge+0.010
 throne+0.009
 vows+0.009
oll+0.009
 piping+0.009
Neg logit contributions
ning-0.008
aping-0.008
ing-0.008
ounce-0.008
aunted-0.008
Token and
Token ID290
Feature activation-3.84e-03

Pos logit contributions
 throne+0.025
 vows+0.025
bridge+0.025
oll+0.024
rost+0.024
Neg logit contributions
ning-0.021
 beating-0.021
aping-0.020
 group-0.020
hr-0.020
Token thanked
Token ID26280
Feature activation+1.91e-02

Pos logit contributions
ning+0.006
aping+0.005
 beating+0.005
hr+0.005
 group+0.005
Neg logit contributions
 vows-0.006
 throne-0.006
bridge-0.006
oll-0.005
ibrarian-0.005
Token T
Token ID309
Feature activation+7.19e-02

Pos logit contributions
 throne+0.031
 vows+0.031
bridge+0.030
rost+0.029
oll+0.028
Neg logit contributions
 aware-0.027
 beating-0.026
ning-0.026
aping-0.025
 group-0.025
Tokenummy
Token ID13513
Feature activation-3.08e-04

Pos logit contributions
 throne+0.088
bridge+0.087
 piping+0.079
 microwave+0.077
 embassy+0.074
Neg logit contributions
ing-0.137
ning-0.128
ran-0.128
hr-0.122
aping-0.121
Token before
Token ID878
Feature activation-3.42e-03

Pos logit contributions
 beating+0.000
ning+0.000
 aware+0.000
aping+0.000
hr+0.000
Neg logit contributions
 throne-0.001
 vows-0.001
bridge-0.001
 menu-0.001
oll-0.001
Token they
Token ID484
Feature activation-1.04e-02

Pos logit contributions
aping+0.005
aunted+0.004
ning+0.004
 aware+0.004
broken+0.004
Neg logit contributions
 throne-0.005
bridge-0.005
 vows-0.005
 Lib-0.005
oll-0.005
Token each
Token ID1123
Feature activation-1.52e-02

Pos logit contributions
ning+0.017
 beating+0.016
 aware+0.016
hr+0.016
aping+0.016
Neg logit contributions
 vows-0.015
bridge-0.015
 throne-0.015
oll-0.014
rost-0.014
Token continued
Token ID3767
Feature activation+7.31e-03

Pos logit contributions
ning+0.022
 beating+0.022
 aware+0.021
aping+0.021
hr+0.020
Neg logit contributions
bridge-0.025
 throne-0.024
 vows-0.024
affle-0.023
rost-0.022
Token.
Token ID13
Feature activation+7.13e-03

Pos logit contributions
 beating+0.010
ning+0.009
aping+0.009
run+0.009
hr+0.009
Neg logit contributions
 throne-0.010
bridge-0.009
 menu-0.009
 vows-0.009
pter-0.008
Token
Token ID198
Feature activation-7.57e-03

Pos logit contributions
bridge+0.011
 throne+0.011
 vows+0.011
affle+0.010
rost+0.010
Neg logit contributions
ning-0.011
 beating-0.011
 aware-0.011
 group-0.010
aping-0.010
Token
Token ID198
Feature activation+2.12e-03

Pos logit contributions
 beating+0.009
ning+0.009
 aware+0.009
broken+0.008
hr+0.008
Neg logit contributions
bridge-0.014
 throne-0.013
rost-0.013
affle-0.013
oll-0.013
TokenOne
Token ID3198
Feature activation-1.34e-02

Pos logit contributions
bridge+0.003
 throne+0.003
 vows+0.003
affle+0.003
rost+0.003
Neg logit contributions
ning-0.003
 beating-0.003
 aware-0.003
 group-0.003
aping-0.003
Token day
Token ID1110
Feature activation-6.37e-03

Pos logit contributions
ning+0.023
 beating+0.022
 aware+0.022
aping+0.022
hr+0.021
Neg logit contributions
 throne-0.018
bridge-0.017
rost-0.017
affle-0.017
 vows-0.017
Token T
Token ID309
Feature activation+7.12e-02

Pos logit contributions
 aware+0.007
ning+0.007
 beating+0.007
aping+0.007
 group+0.007
Neg logit contributions
 throne-0.012
bridge-0.012
 vows-0.012
affle-0.011
oll-0.011
Tokenoni
Token ID14651
Feature activation+6.87e-03

Pos logit contributions
bridge+0.097
 throne+0.097
 vows+0.089
pter+0.082
 ferry+0.081
Neg logit contributions
ing-0.128
ning-0.119
aping-0.119
 aware-0.114
 group-0.113
Token found
Token ID1043
Feature activation-1.22e-02

Pos logit contributions
bridge+0.012
 throne+0.012
 vows+0.012
affle+0.012
oll+0.012
Neg logit contributions
 beating-0.009
ning-0.008
 aware-0.008
hr-0.008
run-0.008
Token a
Token ID257
Feature activation-1.60e-05

Pos logit contributions
 beating+0.014
 aware+0.013
ning+0.013
hr+0.013
 group+0.013
Neg logit contributions
bridge-0.024
 throne-0.023
oll-0.022
affle-0.022
rost-0.022
Token big
Token ID1263
Feature activation+1.04e-02

Pos logit contributions
 aware+0.000
ning+0.000
aping+0.000
 beating+0.000
aunted+0.000
Neg logit contributions
 throne-0.000
 vows-0.000
 menu-0.000
bridge-0.000
affle-0.000
Token,
Token ID11
Feature activation-9.17e-03

Pos logit contributions
 throne+0.016
bridge+0.015
 vows+0.015
affle+0.015
 menu+0.014
Neg logit contributions
 beating-0.016
 aware-0.016
ning-0.016
aping-0.015
hr-0.015
Token her
Token ID607
Feature activation+9.01e-03

Pos logit contributions
 vows+0.027
 throne+0.027
bridge+0.026
rost+0.025
 menu+0.025
Neg logit contributions
 aware-0.030
 beating-0.030
ning-0.029
aping-0.028
 group-0.028
Token.
Token ID13
Feature activation+6.05e-03

Pos logit contributions
 throne+0.016
 vows+0.016
 menu+0.015
bridge+0.014
affle+0.014
Neg logit contributions
 beating-0.012
 aware-0.012
ning-0.012
run-0.012
aping-0.011
Token
Token ID198
Feature activation-3.08e-03

Pos logit contributions
bridge+0.012
 throne+0.011
 vows+0.011
affle+0.011
rost+0.011
Neg logit contributions
 beating-0.007
ning-0.007
 aware-0.007
 group-0.007
aping-0.006
Token
Token ID198
Feature activation+7.48e-03

Pos logit contributions
ning+0.003
 beating+0.003
 aware+0.003
 group+0.003
hr+0.003
Neg logit contributions
bridge-0.007
 throne-0.006
 vows-0.006
affle-0.006
rost-0.006
TokenThe
Token ID464
Feature activation+1.12e-02

Pos logit contributions
bridge+0.013
 throne+0.013
 vows+0.013
affle+0.012
rost+0.012
Neg logit contributions
 beating-0.010
ning-0.010
 aware-0.009
 group-0.009
aping-0.009
Token grand
Token ID4490
Feature activation+7.06e-02

Pos logit contributions
bridge+0.017
 throne+0.017
 vows+0.016
rost+0.016
affle+0.016
Neg logit contributions
 beating-0.018
 group-0.017
ning-0.017
 aware-0.017
hr-0.016
Tokenkids
Token ID45235
Feature activation+6.29e-03

Pos logit contributions
bridge+0.106
obb+0.099
 throne+0.099
affle+0.097
oll+0.096
Neg logit contributions
 group-0.110
 beating-0.105
ning-0.105
hr-0.105
 aware-0.102
Token were
Token ID547
Feature activation+4.32e-03

Pos logit contributions
bridge+0.011
 vows+0.011
 throne+0.011
affle+0.010
rost+0.010
Neg logit contributions
 beating-0.008
ning-0.008
 aware-0.008
 group-0.008
hr-0.008
Token so
Token ID523
Feature activation+5.78e-03

Pos logit contributions
 throne+0.008
bridge+0.007
 vows+0.007
oll+0.007
 menu+0.007
Neg logit contributions
ning-0.006
aping-0.006
 beating-0.006
 aware-0.005
run-0.005
Token happy
Token ID3772
Feature activation-9.09e-04

Pos logit contributions
 vows+0.011
 throne+0.011
bridge+0.010
oll+0.010
affle+0.010
Neg logit contributions
ning-0.008
 beating-0.007
 aware-0.006
aping-0.006
aunted-0.006
Token and
Token ID290
Feature activation+8.45e-03

Pos logit contributions
 beating+0.001
ning+0.001
 aware+0.001
 group+0.001
aping+0.001
Neg logit contributions
 throne-0.002
bridge-0.002
 vows-0.002
 menu-0.002
oll-0.002
Token drove
Token ID10357
Feature activation+1.00e-03

Pos logit contributions
bridge+0.043
 throne+0.042
 vows+0.040
affle+0.040
rost+0.039
Neg logit contributions
 beating-0.025
ning-0.025
 group-0.024
 aware-0.024
aping-0.023
Token back
Token ID736
Feature activation+7.92e-03

Pos logit contributions
bridge+0.002
 throne+0.002
 vows+0.002
rost+0.002
affle+0.002
Neg logit contributions
 beating-0.001
 aware-0.001
ning-0.001
 group-0.001
aping-0.001
Token home
Token ID1363
Feature activation+2.03e-02

Pos logit contributions
 throne+0.015
bridge+0.015
 vows+0.015
 menu+0.014
affle+0.014
Neg logit contributions
 beating-0.009
 aware-0.009
ning-0.009
 group-0.008
hr-0.008
Token,
Token ID11
Feature activation-5.31e-05

Pos logit contributions
 throne+0.033
 vows+0.031
 menu+0.028
bridge+0.027
 hotel+0.027
Neg logit contributions
ning-0.031
 beating-0.029
 aware-0.029
aunted-0.028
aping-0.028
Token the
Token ID262
Feature activation+7.21e-03

Pos logit contributions
 aware+0.000
ning+0.000
 beating+0.000
aping+0.000
run+0.000
Neg logit contributions
 throne-0.000
 vows-0.000
bridge-0.000
 menu-0.000
 microwave-0.000
Token grand
Token ID4490
Feature activation+6.91e-02

Pos logit contributions
bridge+0.011
 throne+0.011
 vows+0.010
affle+0.010
 menu+0.010
Neg logit contributions
ning-0.011
 beating-0.011
 aware-0.011
aping-0.011
hr-0.011
Tokenkid
Token ID38439
Feature activation+2.45e-02

Pos logit contributions
bridge+0.107
 throne+0.102
 vows+0.100
obb+0.092
affle+0.092
Neg logit contributions
 beating-0.109
ning-0.108
 group-0.103
 aware-0.102
ing-0.100
Token held
Token ID2714
Feature activation+5.35e-03

Pos logit contributions
 vows+0.043
 throne+0.043
bridge+0.042
 menu+0.040
affle+0.038
Neg logit contributions
ning-0.031
 aware-0.031
 beating-0.031
aping-0.030
hr-0.030
Token her
Token ID607
Feature activation+2.87e-03

Pos logit contributions
bridge+0.010
 vows+0.010
 throne+0.010
affle+0.009
oll+0.009
Neg logit contributions
 beating-0.006
 aware-0.006
ning-0.006
 group-0.006
hr-0.006
Token wheel
Token ID7825
Feature activation-6.10e-03

Pos logit contributions
bridge+0.004
 throne+0.004
 vows+0.004
affle+0.004
rost+0.004
Neg logit contributions
 beating-0.005
 aware-0.005
ning-0.005
 group-0.005
aping-0.004
Token and
Token ID290
Feature activation+2.56e-03

Pos logit contributions
 aware+0.007
ning+0.007
 beating+0.007
 group+0.007
hr+0.007
Neg logit contributions
 throne-0.011
bridge-0.011
 vows-0.010
oll-0.010
rost-0.010
Token.
Token ID13
Feature activation+5.13e-03

Pos logit contributions
 beating+0.022
 aware+0.022
ning+0.022
hr+0.021
aping+0.020
Neg logit contributions
 throne-0.040
bridge-0.040
 vows-0.038
affle-0.038
oll-0.038
Token
Token ID198
Feature activation-2.19e-03

Pos logit contributions
bridge+0.010
 throne+0.010
 vows+0.010
affle+0.010
rost+0.009
Neg logit contributions
 beating-0.006
ning-0.006
 aware-0.005
 group-0.005
aping-0.005
Token
Token ID198
Feature activation+1.27e-02

Pos logit contributions
 beating+0.002
ning+0.002
 aware+0.002
 group+0.002
hr+0.002
Neg logit contributions
bridge-0.004
 throne-0.004
 vows-0.004
rost-0.004
affle-0.004
TokenOne
Token ID3198
Feature activation-1.45e-02

Pos logit contributions
bridge+0.023
 throne+0.023
 vows+0.022
oll+0.021
affle+0.021
Neg logit contributions
 beating-0.016
ning-0.016
 group-0.016
 aware-0.015
ing-0.014
Token day
Token ID1110
Feature activation-3.50e-05

Pos logit contributions
 beating+0.022
ning+0.022
 aware+0.022
 group+0.021
aping+0.021
Neg logit contributions
bridge-0.022
 throne-0.022
 vows-0.021
affle-0.021
rost-0.020
Token T
Token ID309
Feature activation+6.84e-02

Pos logit contributions
 beating+0.000
 aware+0.000
ning+0.000
 group+0.000
hr+0.000
Neg logit contributions
bridge-0.000
 throne-0.000
 vows-0.000
affle-0.000
rost-0.000
Tokenoni
Token ID14651
Feature activation+4.34e-03

Pos logit contributions
bridge+0.092
 throne+0.089
 piping+0.079
 vows+0.079
 embassy+0.074
Neg logit contributions
ing-0.124
ning-0.117
aping-0.112
 group-0.112
 aware-0.111
Token asked
Token ID1965
Feature activation-2.53e-03

Pos logit contributions
bridge+0.008
 throne+0.008
 vows+0.008
affle+0.008
rost+0.008
Neg logit contributions
 beating-0.005
ning-0.005
 aware-0.005
 group-0.005
aping-0.004
Token Ben
Token ID3932
Feature activation-9.82e-03

Pos logit contributions
 beating+0.003
ning+0.003
 aware+0.003
broken+0.003
hr+0.003
Neg logit contributions
 vows-0.004
 throne-0.004
rost-0.004
affle-0.004
oll-0.004
Token why
Token ID1521
Feature activation+1.77e-02

Pos logit contributions
 beating+0.013
ning+0.012
hr+0.011
aping+0.011
aunted+0.011
Neg logit contributions
 throne-0.018
 vows-0.017
rost-0.017
 menu-0.017
affle-0.016
Token he
Token ID339
Feature activation-5.79e-04

Pos logit contributions
 throne+0.018
rost+0.018
 vows+0.018
bridge+0.017
affle+0.017
Neg logit contributions
ning-0.036
 aware-0.035
aping-0.035
aunted-0.034
hr-0.034
Token
Token ID198
Feature activation-5.22e-03

Pos logit contributions
 beating+0.037
ning+0.037
 aware+0.036
hr+0.034
 group+0.034
Neg logit contributions
bridge-0.051
rost-0.049
 throne-0.049
oll-0.048
 vows-0.047
Token
Token ID198
Feature activation+1.62e-03

Pos logit contributions
ning+0.006
 beating+0.006
 aware+0.006
broken+0.005
hr+0.005
Neg logit contributions
 throne-0.010
bridge-0.010
 vows-0.009
rost-0.009
 menu-0.009
TokenOne
Token ID3198
Feature activation-9.84e-03

Pos logit contributions
bridge+0.002
 throne+0.002
 vows+0.002
rost+0.002
affle+0.002
Neg logit contributions
ning-0.002
 beating-0.002
 aware-0.002
 group-0.002
aping-0.002
Token day
Token ID1110
Feature activation-6.94e-03

Pos logit contributions
 beating+0.014
ning+0.014
 aware+0.014
 group+0.014
hr+0.013
Neg logit contributions
bridge-0.016
 throne-0.016
 vows-0.015
affle-0.015
rost-0.014
Token,
Token ID11
Feature activation-1.68e-02

Pos logit contributions
 aware+0.009
 beating+0.009
ning+0.009
aping+0.008
run+0.008
Neg logit contributions
 vows-0.012
 throne-0.012
bridge-0.011
oll-0.011
 menu-0.011
Token T
Token ID309
Feature activation+6.81e-02

Pos logit contributions
 aware+0.018
 beating+0.017
ning+0.017
run+0.017
aping+0.017
Neg logit contributions
 vows-0.033
 throne-0.032
bridge-0.032
 menu-0.030
oll-0.030
Tokenessa
Token ID21411
Feature activation+9.93e-03

Pos logit contributions
bridge+0.100
 throne+0.095
affle+0.088
rost+0.087
 microwave+0.086
Neg logit contributions
 beating-0.113
 aware-0.112
ing-0.112
ning-0.109
 group-0.107
Token and
Token ID290
Feature activation+1.80e-02

Pos logit contributions
bridge+0.018
 throne+0.018
 vows+0.018
 menu+0.017
affle+0.017
Neg logit contributions
 beating-0.012
ning-0.012
 aware-0.011
run-0.011
hr-0.011
Token her
Token ID607
Feature activation+5.18e-04

Pos logit contributions
bridge+0.032
 throne+0.031
 vows+0.031
affle+0.029
rost+0.029
Neg logit contributions
 beating-0.024
ning-0.024
 aware-0.023
 group-0.023
aping-0.022
Token parents
Token ID3397
Feature activation-1.59e-03

Pos logit contributions
bridge+0.001
 throne+0.001
 vows+0.001
affle+0.001
rost+0.001
Neg logit contributions
 beating-0.001
ning-0.001
 aware-0.001
 group-0.001
aping-0.001
Token went
Token ID1816
Feature activation+3.91e-03

Pos logit contributions
 beating+0.002
ning+0.002
 aware+0.002
hr+0.002
run+0.002
Neg logit contributions
bridge-0.002
 vows-0.002
 throne-0.002
oll-0.002
rost-0.002
Token big
Token ID1263
Feature activation+1.43e-02

Pos logit contributions
 throne+0.007
bridge+0.007
 vows+0.006
affle+0.006
 menu+0.006
Neg logit contributions
ning-0.007
 aware-0.007
 beating-0.007
aping-0.007
hr-0.007
Token bag
Token ID6131
Feature activation-2.00e-03

Pos logit contributions
 throne+0.018
bridge+0.018
 vows+0.017
affle+0.016
 menu+0.016
Neg logit contributions
ning-0.026
 beating-0.026
 aware-0.026
aping-0.025
hr-0.025
Token of
Token ID286
Feature activation+6.11e-04

Pos logit contributions
ning+0.002
 aware+0.002
hr+0.002
 beating+0.002
aping+0.002
Neg logit contributions
bridge-0.004
 throne-0.004
oll-0.003
 menu-0.003
 vows-0.003
Token papers
Token ID9473
Feature activation+1.07e-02

Pos logit contributions
bridge+0.001
 throne+0.001
 vows+0.001
rost+0.001
affle+0.001
Neg logit contributions
 aware-0.001
ning-0.001
 beating-0.001
 group-0.001
 wise-0.001
Token.
Token ID13
Feature activation+2.19e-02

Pos logit contributions
bridge+0.022
 throne+0.022
 vows+0.022
affle+0.021
rost+0.021
Neg logit contributions
 beating-0.010
ning-0.010
 aware-0.010
 group-0.010
hr-0.009
Token J
Token ID449
Feature activation+6.77e-02

Pos logit contributions
bridge+0.046
 throne+0.045
 vows+0.045
affle+0.044
rost+0.043
Neg logit contributions
ning-0.021
 beating-0.021
 aware-0.020
 group-0.020
aping-0.019
Tokenax
Token ID897
Feature activation-6.76e-03

Pos logit contributions
 throne+0.117
 vows+0.115
bridge+0.109
pter+0.098
 piping+0.098
Neg logit contributions
ing-0.102
hr-0.097
ning-0.096
aping-0.095
run-0.092
Token got
Token ID1392
Feature activation-1.05e-02

Pos logit contributions
ning+0.010
 aware+0.009
 beating+0.009
run+0.009
hr+0.008
Neg logit contributions
 vows-0.010
 throne-0.010
bridge-0.010
ibrarian-0.009
affle-0.009
Token scared
Token ID12008
Feature activation-8.64e-03

Pos logit contributions
 beating+0.013
ning+0.013
 aware+0.012
aping+0.012
hr+0.012
Neg logit contributions
bridge-0.018
 vows-0.018
oll-0.017
 throne-0.017
 menu-0.016
Token and
Token ID290
Feature activation+6.93e-03

Pos logit contributions
ning+0.013
 beating+0.013
aping+0.013
hr+0.012
 aware+0.012
Neg logit contributions
oll-0.012
 throne-0.012
bridge-0.011
rost-0.011
 fourteen-0.011
Token started
Token ID2067
Feature activation+1.01e-03

Pos logit contributions
 throne+0.011
bridge+0.011
 vows+0.011
rost+0.010
affle+0.010
Neg logit contributions
ning-0.010
aping-0.010
 beating-0.010
 aware-0.009
 group-0.009
Token careful
Token ID8161
Feature activation-1.60e-03

Pos logit contributions
 throne+0.008
bridge+0.008
oll+0.007
affle+0.007
 vows+0.007
Neg logit contributions
aping-0.006
ing-0.006
ning-0.006
hr-0.006
 group-0.006
Token and
Token ID290
Feature activation+5.31e-03

Pos logit contributions
ning+0.002
 group+0.002
aping+0.002
hr+0.002
run+0.002
Neg logit contributions
 throne-0.002
bridge-0.002
rost-0.002
 vows-0.002
oll-0.002
Token sat
Token ID3332
Feature activation-1.45e-02

Pos logit contributions
 vows+0.008
 throne+0.008
rost+0.007
bridge+0.007
oll+0.007
Neg logit contributions
ning-0.008
 group-0.008
aping-0.008
hr-0.008
aunted-0.008
Token on
Token ID319
Feature activation+2.47e-02

Pos logit contributions
 beating+0.019
 group+0.018
ning+0.018
 aware+0.018
hr+0.018
Neg logit contributions
bridge-0.025
 throne-0.025
 vows-0.023
oll-0.023
rost-0.023
Token the
Token ID262
Feature activation+9.70e-03

Pos logit contributions
bridge+0.055
 throne+0.054
 vows+0.054
affle+0.053
rost+0.052
Neg logit contributions
 beating-0.021
ning-0.020
 aware-0.020
 group-0.019
hr-0.018
Token c
Token ID269
Feature activation+6.72e-02

Pos logit contributions
bridge+0.017
 vows+0.015
rost+0.015
 throne+0.015
affle+0.015
Neg logit contributions
 beating-0.014
ning-0.014
 group-0.014
 aware-0.013
hr-0.013
Tokenactus
Token ID34144
Feature activation+1.98e-02

Pos logit contributions
bridge+0.068
 throne+0.058
 vows+0.057
rost+0.055
 piping+0.053
Neg logit contributions
ning-0.143
 beating-0.141
ing-0.141
 aware-0.140
 group-0.136
Token.
Token ID13
Feature activation+8.95e-03

Pos logit contributions
 throne+0.037
bridge+0.035
oll+0.034
 vows+0.033
 menu+0.033
Neg logit contributions
 aware-0.024
 beating-0.024
aping-0.024
ning-0.024
hr-0.024
Token They
Token ID1119
Feature activation+5.63e-03

Pos logit contributions
bridge+0.015
 throne+0.015
 vows+0.014
rost+0.014
 menu+0.014
Neg logit contributions
 beating-0.013
ning-0.012
 aware-0.012
 group-0.012
broken-0.012
Token became
Token ID2627
Feature activation-5.04e-03

Pos logit contributions
bridge+0.010
 throne+0.010
 vows+0.010
affle+0.010
rost+0.009
Neg logit contributions
 beating-0.007
ning-0.007
 aware-0.007
 group-0.007
hr-0.006
Token good
Token ID922
Feature activation-6.14e-03

Pos logit contributions
 beating+0.006
ning+0.006
run+0.006
 aware+0.006
aping+0.006
Neg logit contributions
 throne-0.009
bridge-0.009
 vows-0.009
affle-0.008
oll-0.008
Token girl
Token ID2576
Feature activation+1.29e-03

Pos logit contributions
bridge+0.004
 throne+0.004
 vows+0.004
affle+0.004
rost+0.004
Neg logit contributions
 beating-0.002
ning-0.002
 aware-0.002
 group-0.002
aping-0.002
Token called
Token ID1444
Feature activation+1.38e-02

Pos logit contributions
 throne+0.002
bridge+0.002
rost+0.002
 vows+0.002
oll+0.002
Neg logit contributions
ning-0.002
 beating-0.002
 aware-0.002
aping-0.002
 group-0.002
Token T
Token ID309
Feature activation+5.01e-02

Pos logit contributions
 throne+0.019
 vows+0.018
rost+0.018
bridge+0.018
oll+0.017
Neg logit contributions
ning-0.025
 beating-0.024
 aware-0.024
aping-0.023
broken-0.023
Tokenessa
Token ID21411
Feature activation+4.52e-03

Pos logit contributions
bridge+0.078
 throne+0.076
 vows+0.072
rost+0.069
affle+0.069
Neg logit contributions
 beating-0.076
ning-0.076
 aware-0.074
ing-0.074
 group-0.073
Token.
Token ID13
Feature activation+2.22e-02

Pos logit contributions
bridge+0.010
 throne+0.010
 vows+0.010
affle+0.009
rost+0.009
Neg logit contributions
 beating-0.004
ning-0.004
 aware-0.004
 group-0.004
aping-0.003
Token T
Token ID309
Feature activation+6.70e-02

Pos logit contributions
bridge+0.035
 throne+0.035
 vows+0.034
affle+0.032
rost+0.032
Neg logit contributions
 beating-0.032
ning-0.032
 aware-0.032
 group-0.031
hr-0.030
Tokenessa
Token ID21411
Feature activation+1.21e-03

Pos logit contributions
bridge+0.100
 throne+0.096
rost+0.089
 fourteen+0.087
affle+0.086
Neg logit contributions
ing-0.111
 beating-0.110
 aware-0.110
ning-0.108
 group-0.106
Token was
Token ID373
Feature activation-1.18e-02

Pos logit contributions
bridge+0.003
 throne+0.003
 vows+0.003
affle+0.002
rost+0.002
Neg logit contributions
 beating-0.001
 aware-0.001
 group-0.001
ning-0.001
aping-0.001
Token playing
Token ID2712
Feature activation-3.31e-02

Pos logit contributions
ning+0.016
 beating+0.015
aping+0.015
 aware+0.015
aunted+0.015
Neg logit contributions
 throne-0.020
bridge-0.019
oll-0.019
 vows-0.018
affle-0.018
Token in
Token ID287
Feature activation-1.11e-02

Pos logit contributions
 beating+0.031
 aware+0.030
ning+0.029
aping+0.027
run+0.027
Neg logit contributions
bridge-0.072
 throne-0.070
affle-0.068
 vows-0.066
 menu-0.066
Token the
Token ID262
Feature activation-5.55e-03

Pos logit contributions
ning+0.012
 beating+0.012
 aware+0.012
aping+0.011
 group+0.011
Neg logit contributions
bridge-0.022
 throne-0.022
 vows-0.021
rost-0.021
affle-0.020
Token.
Token ID13
Feature activation+9.97e-03

Pos logit contributions
bridge+0.006
 throne+0.006
 vows+0.005
 menu+0.005
affle+0.005
Neg logit contributions
 aware-0.004
aping-0.004
ning-0.004
 beating-0.004
hr-0.004
Token
Token ID198
Feature activation+8.37e-03

Pos logit contributions
bridge+0.017
 throne+0.017
 vows+0.017
affle+0.016
rost+0.016
Neg logit contributions
ning-0.013
 beating-0.013
 aware-0.013
 group-0.013
aping-0.012
Token
Token ID198
Feature activation+2.07e-02

Pos logit contributions
bridge+0.018
 throne+0.017
 vows+0.017
affle+0.017
rost+0.017
Neg logit contributions
ning-0.008
 beating-0.008
 aware-0.008
 group-0.007
hr-0.007
TokenFinally
Token ID11158
Feature activation+8.37e-03

Pos logit contributions
 throne+0.041
bridge+0.041
 vows+0.040
rost+0.038
affle+0.038
Neg logit contributions
 beating-0.023
 aware-0.021
ning-0.021
 group-0.021
hr-0.020
Token,
Token ID11
Feature activation-4.17e-03

Pos logit contributions
bridge+0.021
 throne+0.020
 vows+0.020
affle+0.020
rost+0.019
Neg logit contributions
 beating-0.005
ning-0.005
 group-0.005
 aware-0.005
ing-0.005
Token T
Token ID309
Feature activation+6.68e-02

Pos logit contributions
ning+0.005
 beating+0.005
 group+0.005
 aware+0.005
ing+0.005
Neg logit contributions
bridge-0.008
 throne-0.007
 vows-0.007
affle-0.007
rost-0.007
Tokenoni
Token ID14651
Feature activation+1.23e-02

Pos logit contributions
 throne+0.085
bridge+0.080
 vows+0.073
 piping+0.069
 embassy+0.066
Neg logit contributions
ing-0.126
ning-0.123
 aware-0.118
aping-0.117
 beating-0.116
Token was
Token ID373
Feature activation+1.52e-02

Pos logit contributions
bridge+0.022
 throne+0.021
 vows+0.021
affle+0.021
oll+0.020
Neg logit contributions
ning-0.016
 beating-0.016
 aware-0.015
hr-0.014
aping-0.014
Token ready
Token ID3492
Feature activation+1.16e-02

Pos logit contributions
bridge+0.026
 throne+0.026
oll+0.024
affle+0.024
obb+0.024
Neg logit contributions
ning-0.020
aping-0.019
hr-0.019
 group-0.019
aunted-0.018
Token.
Token ID13
Feature activation+1.55e-02

Pos logit contributions
 throne+0.021
bridge+0.021
 vows+0.021
 menu+0.020
rost+0.019
Neg logit contributions
aping-0.014
 aware-0.014
ning-0.014
 beating-0.013
hr-0.013
Token She
Token ID1375
Feature activation+6.46e-03

Pos logit contributions
bridge+0.024
 throne+0.024
 vows+0.023
affle+0.022
rost+0.022
Neg logit contributions
ning-0.023
 beating-0.023
 aware-0.022
 group-0.022
ing-0.022
Token Tim
Token ID5045
Feature activation-9.74e-03

Pos logit contributions
bridge+0.022
 throne+0.022
 vows+0.021
affle+0.020
rost+0.020
Neg logit contributions
 beating-0.024
ning-0.024
 aware-0.024
 group-0.023
aping-0.023
Token didn
Token ID1422
Feature activation+3.04e-02

Pos logit contributions
 beating+0.013
run+0.013
ning+0.013
 aware+0.013
hr+0.013
Neg logit contributions
affle-0.015
 vows-0.015
bridge-0.014
 throne-0.014
oll-0.014
Token't
Token ID470
Feature activation+1.85e-02

Pos logit contributions
 throne+0.055
bridge+0.054
 vows+0.053
rost+0.050
affle+0.049
Neg logit contributions
 beating-0.041
ning-0.039
 group-0.037
 aware-0.036
ing-0.036
Token like
Token ID588
Feature activation+1.46e-02

Pos logit contributions
bridge+0.035
 vows+0.034
 throne+0.034
affle+0.032
rost+0.031
Neg logit contributions
 aware-0.023
 beating-0.023
 group-0.022
ning-0.021
hr-0.020
Token the
Token ID262
Feature activation+1.13e-02

Pos logit contributions
bridge+0.027
 throne+0.027
 vows+0.026
affle+0.026
rost+0.026
Neg logit contributions
 beating-0.017
ning-0.017
 aware-0.017
 group-0.016
aping-0.016
Token seaw
Token ID32085
Feature activation+6.68e-02

Pos logit contributions
bridge+0.015
 throne+0.015
 vows+0.014
affle+0.014
rost+0.013
Neg logit contributions
 beating-0.020
ning-0.020
 aware-0.020
 group-0.019
aping-0.019
Tokeneed
Token ID2308
Feature activation+2.39e-02

Pos logit contributions
bridge+0.096
 throne+0.093
 vows+0.091
rost+0.083
 piping+0.080
Neg logit contributions
 beating-0.107
ning-0.106
ing-0.104
 aware-0.104
 group-0.103
Token because
Token ID780
Feature activation+1.32e-02

Pos logit contributions
bridge+0.043
 throne+0.042
 vows+0.042
affle+0.040
rost+0.040
Neg logit contributions
 aware-0.030
 beating-0.029
ning-0.029
 group-0.028
aping-0.027
Token it
Token ID340
Feature activation-1.56e-03

Pos logit contributions
 vows+0.021
bridge+0.021
 throne+0.021
affle+0.020
rost+0.019
Neg logit contributions
 beating-0.019
ning-0.019
 group-0.018
 aware-0.018
ing-0.018
Token would
Token ID561
Feature activation+7.69e-03

Pos logit contributions
 aware+0.002
 beating+0.002
ning+0.002
hr+0.002
 group+0.002
Neg logit contributions
bridge-0.003
 throne-0.003
affle-0.003
 vows-0.003
rost-0.003
Token get
Token ID651
Feature activation-1.23e-02

Pos logit contributions
 throne+0.013
bridge+0.013
 vows+0.013
rost+0.012
affle+0.012
Neg logit contributions
ning-0.010
 beating-0.010
 aware-0.010
aping-0.010
 group-0.009
TokenWhen
Token ID2215
Feature activation-1.48e-03

Pos logit contributions
ning+0.007
 beating+0.007
 aware+0.006
run+0.006
 group+0.006
Neg logit contributions
bridge-0.008
 vows-0.008
 throne-0.008
rost-0.007
 menu-0.007
Token they
Token ID484
Feature activation+2.46e-02

Pos logit contributions
 beating+0.002
ning+0.002
 group+0.002
 aware+0.002
hr+0.002
Neg logit contributions
bridge-0.003
 throne-0.002
affle-0.002
 vows-0.002
rost-0.002
Token arrived
Token ID5284
Feature activation+9.81e-03

Pos logit contributions
bridge+0.056
 throne+0.053
affle+0.053
 vows+0.051
obb+0.050
Neg logit contributions
 group-0.021
run-0.020
 reminded-0.018
 beating-0.018
hr-0.018
Token,
Token ID11
Feature activation-1.47e-02

Pos logit contributions
bridge+0.020
 throne+0.020
 vows+0.019
affle+0.019
rost+0.019
Neg logit contributions
ning-0.010
 aware-0.010
 beating-0.010
 group-0.010
ing-0.009
Token the
Token ID262
Feature activation+7.67e-03

Pos logit contributions
 aware+0.022
 beating+0.022
ning+0.021
aping+0.021
run+0.020
Neg logit contributions
 throne-0.023
 vows-0.023
bridge-0.022
 menu-0.021
rost-0.021
Token grand
Token ID4490
Feature activation+6.63e-02

Pos logit contributions
bridge+0.012
 throne+0.012
 menu+0.012
 vows+0.012
affle+0.012
Neg logit contributions
ning-0.011
 beating-0.011
 aware-0.011
aping-0.011
hr-0.011
Tokenkid
Token ID38439
Feature activation+2.28e-02

Pos logit contributions
bridge+0.111
 throne+0.106
 vows+0.104
obb+0.098
rost+0.095
Neg logit contributions
 beating-0.094
ning-0.094
 group-0.091
ing-0.088
 aware-0.088
Token was
Token ID373
Feature activation+6.40e-03

Pos logit contributions
bridge+0.042
 throne+0.042
 vows+0.042
 menu+0.039
affle+0.039
Neg logit contributions
 beating-0.027
ning-0.027
 aware-0.027
hr-0.025
 group-0.025
Token so
Token ID523
Feature activation+1.36e-02

Pos logit contributions
bridge+0.012
 throne+0.012
 vows+0.011
oll+0.011
 menu+0.011
Neg logit contributions
ning-0.008
 aware-0.008
aping-0.007
 beating-0.007
 group-0.007
Token excited
Token ID6568
Feature activation+8.09e-03

Pos logit contributions
bridge+0.028
 throne+0.028
 vows+0.027
affle+0.026
rost+0.026
Neg logit contributions
ning-0.014
 beating-0.014
 aware-0.013
 group-0.013
aping-0.012
Token!
Token ID0
Feature activation+1.03e-02

Pos logit contributions
bridge+0.017
 throne+0.016
 vows+0.016
affle+0.016
 menu+0.016
Neg logit contributions
 beating-0.008
 aware-0.008
ning-0.008
 group-0.007
aping-0.007
Token done
Token ID1760
Feature activation-4.81e-03

Pos logit contributions
ning+0.047
aping+0.046
bing+0.044
aunted+0.042
ing+0.042
Neg logit contributions
 wait-0.028
 throne-0.022
 vows-0.021
bridge-0.020
 menu-0.020
Token a
Token ID257
Feature activation+1.34e-02

Pos logit contributions
aping+0.006
ning+0.006
 aware+0.006
aunted+0.006
 group+0.006
Neg logit contributions
bridge-0.008
 throne-0.008
 vows-0.008
 fourteen-0.008
 menu-0.008
Token H
Token ID367
Feature activation+5.57e-02

Pos logit contributions
 vows+0.019
 throne+0.019
 menu+0.017
bridge+0.017
affle+0.016
Neg logit contributions
aping-0.023
ning-0.023
aunted-0.022
run-0.021
 aware-0.021
Tokenula
Token ID4712
Feature activation+2.94e-02

Pos logit contributions
bridge+0.083
 throne+0.082
 vows+0.079
rost+0.076
affle+0.075
Neg logit contributions
 beating-0.088
ning-0.087
 aware-0.086
 group-0.084
hr-0.083
Token-
Token ID12
Feature activation+5.19e-03

Pos logit contributions
 throne+0.051
bridge+0.051
 vows+0.051
affle+0.049
oll+0.048
Neg logit contributions
 aware-0.039
 beating-0.038
ning-0.037
aping-0.036
 group-0.035
TokenH
Token ID39
Feature activation+6.62e-02

Pos logit contributions
bridge+0.008
 vows+0.007
rost+0.007
 throne+0.007
affle+0.007
Neg logit contributions
ning-0.008
 group-0.008
 beating-0.008
 aware-0.007
aping-0.007
Tokenoop
Token ID11224
Feature activation+1.20e-02

Pos logit contributions
bridge+0.107
 throne+0.104
 vows+0.098
 microwave+0.092
 menu+0.092
Neg logit contributions
 beating-0.100
 aware-0.100
aping-0.097
ing-0.097
hr-0.096
Token Dance
Token ID17349
Feature activation+1.21e-02

Pos logit contributions
bridge+0.021
 throne+0.021
 vows+0.020
oll+0.020
affle+0.020
Neg logit contributions
 aware-0.015
 beating-0.015
aping-0.014
ning-0.014
 group-0.014
Token before
Token ID878
Feature activation+2.69e-03

Pos logit contributions
bridge+0.022
 throne+0.021
 vows+0.021
 menu+0.020
affle+0.019
Neg logit contributions
 aware-0.016
aping-0.015
 beating-0.015
ning-0.015
aunted-0.014
Token,
Token ID11
Feature activation-4.95e-04

Pos logit contributions
 vows+0.004
 throne+0.004
 menu+0.004
bridge+0.004
 fourteen+0.004
Neg logit contributions
aping-0.003
 aware-0.003
aunted-0.003
run-0.003
ning-0.003
Token but
Token ID475
Feature activation-1.53e-02

Pos logit contributions
ning+0.001
aping+0.001
aunted+0.001
 aware+0.001
 beating+0.001
Neg logit contributions
bridge-0.001
 throne-0.001
 vows-0.001
 fourteen-0.001
 menu-0.001
Token play
Token ID711
Feature activation-1.59e-02

Pos logit contributions
 throne+0.019
bridge+0.019
 vows+0.018
affle+0.017
rost+0.016
Neg logit contributions
ning-0.019
 aware-0.019
 beating-0.019
aping-0.018
ing-0.018
Token on
Token ID319
Feature activation+1.76e-02

Pos logit contributions
 aware+0.016
 beating+0.016
ning+0.015
aping+0.015
hr+0.014
Neg logit contributions
bridge-0.034
 throne-0.032
affle-0.031
 vows-0.031
 menu-0.030
Token.
Token ID13
Feature activation+1.72e-02

Pos logit contributions
bridge+0.034
 throne+0.034
 vows+0.032
affle+0.032
 menu+0.031
Neg logit contributions
 aware-0.021
ning-0.020
 beating-0.020
aping-0.019
aunted-0.018
Token One
Token ID1881
Feature activation-1.14e-02

Pos logit contributions
bridge+0.026
 throne+0.026
 vows+0.025
affle+0.024
rost+0.024
Neg logit contributions
 beating-0.026
ning-0.026
 aware-0.026
 group-0.025
aping-0.024
Token day
Token ID1110
Feature activation+8.73e-03

Pos logit contributions
ning+0.021
 aware+0.021
 beating+0.021
aping+0.021
hr+0.020
Neg logit contributions
 throne-0.013
bridge-0.013
affle-0.012
rost-0.012
oll-0.012
Token J
Token ID449
Feature activation+6.60e-02

Pos logit contributions
bridge+0.019
 throne+0.019
 vows+0.019
affle+0.019
rost+0.018
Neg logit contributions
 beating-0.007
ning-0.007
 group-0.007
 aware-0.007
ing-0.007
Tokenax
Token ID897
Feature activation+6.45e-03

Pos logit contributions
 throne+0.111
 vows+0.108
bridge+0.103
rost+0.098
stack+0.096
Neg logit contributions
ing-0.101
hr-0.093
 group-0.090
ning-0.090
 aware-0.089
Token had
Token ID550
Feature activation+6.59e-03

Pos logit contributions
bridge+0.012
 throne+0.011
 vows+0.011
affle+0.011
oll+0.011
Neg logit contributions
 beating-0.008
ning-0.008
 aware-0.008
 group-0.007
hr-0.007
Token the
Token ID262
Feature activation-9.22e-03

Pos logit contributions
bridge+0.013
 throne+0.013
 vows+0.013
 menu+0.012
rost+0.012
Neg logit contributions
ning-0.007
 beating-0.007
 aware-0.007
aping-0.007
hr-0.006
Token urge
Token ID14960
Feature activation+2.12e-02

Pos logit contributions
 beating+0.013
ning+0.013
 aware+0.012
 group+0.012
aping+0.012
Neg logit contributions
bridge-0.016
 throne-0.015
 vows-0.015
affle-0.014
rost-0.014
Token to
Token ID284
Feature activation-7.86e-03

Pos logit contributions
 throne+0.039
 vows+0.039
bridge+0.039
 menu+0.037
rost+0.037
Neg logit contributions
 aware-0.024
ning-0.024
 beating-0.023
aping-0.023
hr-0.023
Token
Token ID198
Feature activation+4.77e-03

Pos logit contributions
 beating+0.036
 aware+0.034
ning+0.033
 group+0.033
broken+0.032
Neg logit contributions
bridge-0.040
oll-0.038
rost-0.037
 throne-0.036
affle-0.036
Token
Token ID198
Feature activation+1.59e-02

Pos logit contributions
bridge+0.009
 throne+0.009
 vows+0.009
rost+0.009
affle+0.009
Neg logit contributions
ning-0.006
 beating-0.005
 aware-0.005
 group-0.005
broken-0.005
TokenAfter
Token ID3260
Feature activation+6.70e-03

Pos logit contributions
bridge+0.028
 throne+0.027
 vows+0.027
affle+0.026
rost+0.026
Neg logit contributions
ning-0.021
 beating-0.021
 aware-0.021
 group-0.020
aping-0.019
Token that
Token ID326
Feature activation-1.20e-03

Pos logit contributions
bridge+0.012
 throne+0.012
 vows+0.011
affle+0.011
oll+0.011
Neg logit contributions
ning-0.009
 beating-0.009
 aware-0.008
 group-0.008
aping-0.008
Token,
Token ID11
Feature activation+1.29e-02

Pos logit contributions
 beating+0.001
 aware+0.001
ning+0.001
aping+0.001
run+0.001
Neg logit contributions
 throne-0.002
 vows-0.002
bridge-0.002
 menu-0.002
rost-0.002
Token J
Token ID449
Feature activation+6.57e-02

Pos logit contributions
bridge+0.023
 throne+0.023
 vows+0.022
affle+0.021
rost+0.021
Neg logit contributions
 beating-0.016
ning-0.016
 aware-0.016
 group-0.015
aping-0.015
Tokenax
Token ID897
Feature activation+3.94e-03

Pos logit contributions
 throne+0.110
bridge+0.102
 vows+0.102
 microwave+0.093
rost+0.092
Neg logit contributions
ing-0.101
hr-0.098
ning-0.095
 aware-0.094
aping-0.093
Token said
Token ID531
Feature activation+1.78e-03

Pos logit contributions
bridge+0.007
 throne+0.007
 vows+0.007
affle+0.007
oll+0.006
Neg logit contributions
ning-0.005
 beating-0.005
 aware-0.005
hr-0.005
 group-0.004
Token goodbye
Token ID24829
Feature activation+3.35e-03

Pos logit contributions
bridge+0.002
 throne+0.002
 vows+0.002
affle+0.002
oll+0.002
Neg logit contributions
ning-0.003
 beating-0.003
 aware-0.003
aping-0.003
 group-0.003
Token to
Token ID284
Feature activation+1.26e-02

Pos logit contributions
 throne+0.006
 vows+0.005
bridge+0.005
rost+0.005
 menu+0.005
Neg logit contributions
ning-0.005
aping-0.004
hr-0.004
 group-0.004
 beating-0.004
Token the
Token ID262
Feature activation-7.09e-03

Pos logit contributions
bridge+0.025
 throne+0.025
 vows+0.024
affle+0.024
rost+0.023
Neg logit contributions
 beating-0.014
ning-0.013
 aware-0.013
 group-0.013
aping-0.012
Token
Token ID198
Feature activation+2.33e-03

Pos logit contributions
 beating+0.044
ning+0.041
 aware+0.041
hr+0.040
 group+0.039
Neg logit contributions
bridge-0.046
rost-0.045
oll-0.043
 throne-0.042
ibrarian-0.042
Token
Token ID198
Feature activation+1.70e-02

Pos logit contributions
bridge+0.005
 throne+0.005
 vows+0.004
rost+0.004
affle+0.004
Neg logit contributions
 beating-0.003
ning-0.002
 aware-0.002
 group-0.002
hr-0.002
TokenOne
Token ID3198
Feature activation-7.85e-03

Pos logit contributions
 throne+0.030
bridge+0.030
 vows+0.029
affle+0.028
oll+0.028
Neg logit contributions
 beating-0.022
 aware-0.022
ning-0.022
 group-0.022
hr-0.021
Token day
Token ID1110
Feature activation+1.22e-02

Pos logit contributions
ning+0.014
 beating+0.014
 aware+0.013
aping+0.013
run+0.013
Neg logit contributions
 throne-0.010
bridge-0.010
 vows-0.010
rost-0.010
affle-0.010
Token,
Token ID11
Feature activation-7.60e-04

Pos logit contributions
bridge+0.029
 throne+0.028
 vows+0.028
affle+0.027
rost+0.027
Neg logit contributions
ning-0.009
 beating-0.009
 group-0.009
 aware-0.009
ing-0.008
Token J
Token ID449
Feature activation+6.57e-02

Pos logit contributions
 beating+0.001
ning+0.001
 aware+0.001
 group+0.001
aping+0.001
Neg logit contributions
bridge-0.002
 throne-0.002
 vows-0.002
affle-0.001
rost-0.001
Tokenax
Token ID897
Feature activation+4.18e-03

Pos logit contributions
 throne+0.106
 vows+0.104
bridge+0.101
pter+0.090
 piping+0.090
Neg logit contributions
ing-0.101
hr-0.095
ning-0.095
 aware-0.094
aping-0.094
Token was
Token ID373
Feature activation+5.10e-03

Pos logit contributions
bridge+0.007
 vows+0.007
 throne+0.007
 menu+0.006
affle+0.006
Neg logit contributions
 beating-0.005
ning-0.005
run-0.005
 aware-0.005
aping-0.005
Token playing
Token ID2712
Feature activation-3.36e-02

Pos logit contributions
 throne+0.007
 vows+0.007
bridge+0.006
 embassy+0.006
 priest+0.006
Neg logit contributions
aping-0.008
aunted-0.007
heart-0.007
run-0.007
ning-0.007
Token in
Token ID287
Feature activation-6.92e-03

Pos logit contributions
aping+0.042
 beating+0.042
run+0.041
ning+0.039
 aware+0.039
Neg logit contributions
bridge-0.063
 throne-0.060
 menu-0.056
affle-0.054
obb-0.053
Token the
Token ID262
Feature activation-1.00e-03

Pos logit contributions
ning+0.008
 beating+0.008
 aware+0.008
ing+0.008
aping+0.008
Neg logit contributions
 throne-0.013
bridge-0.013
 menu-0.012
rost-0.012
 vows-0.011
Token
Token ID198
Feature activation-1.48e-02

Pos logit contributions
 beating+0.019
 aware+0.018
ning+0.017
 group+0.017
hr+0.016
Neg logit contributions
bridge-0.026
 vows-0.026
 throne-0.026
oll-0.025
rost-0.024
Token
Token ID198
Feature activation-7.27e-03

Pos logit contributions
 beating+0.020
 aware+0.020
ning+0.019
hr+0.017
broken+0.017
Neg logit contributions
 vows-0.025
bridge-0.024
rost-0.024
 throne-0.024
affle-0.023
TokenOne
Token ID3198
Feature activation-1.63e-02

Pos logit contributions
 aware+0.012
ning+0.011
 beating+0.011
run+0.011
 group+0.011
Neg logit contributions
bridge-0.010
 vows-0.010
 throne-0.009
affle-0.009
rost-0.009
Token day
Token ID1110
Feature activation-6.30e-03

Pos logit contributions
ning+0.027
 beating+0.027
 aware+0.026
aping+0.025
 group+0.025
Neg logit contributions
bridge-0.023
 throne-0.023
 vows-0.022
affle-0.022
rost-0.022
Token,
Token ID11
Feature activation-1.64e-02

Pos logit contributions
 aware+0.008
ning+0.008
 beating+0.008
aping+0.007
run+0.007
Neg logit contributions
 vows-0.011
 throne-0.011
bridge-0.011
oll-0.011
 menu-0.010
Token T
Token ID309
Feature activation+6.56e-02

Pos logit contributions
 aware+0.018
ning+0.017
 beating+0.017
run+0.017
aping+0.017
Neg logit contributions
 vows-0.031
 throne-0.031
bridge-0.030
oll-0.030
 menu-0.029
Tokenessa
Token ID21411
Feature activation+1.21e-02

Pos logit contributions
bridge+0.105
 throne+0.100
 microwave+0.091
 fourteen+0.090
rost+0.090
Neg logit contributions
ing-0.103
 aware-0.101
 beating-0.099
ning-0.098
hr-0.093
Token went
Token ID1816
Feature activation+8.78e-03

Pos logit contributions
 vows+0.023
bridge+0.022
 throne+0.022
affle+0.021
oll+0.021
Neg logit contributions
 beating-0.014
ning-0.014
 aware-0.014
hr-0.013
aping-0.013
Token to
Token ID284
Feature activation-1.49e-02

Pos logit contributions
 throne+0.019
bridge+0.018
 vows+0.018
affle+0.018
 menu+0.017
Neg logit contributions
 beating-0.008
ning-0.008
 aware-0.008
 group-0.008
aping-0.007
Token the
Token ID262
Feature activation-1.84e-03

Pos logit contributions
 beating+0.019
ning+0.018
ing+0.018
 aware+0.017
aping+0.017
Neg logit contributions
 throne-0.029
 menu-0.027
 vows-0.027
rost-0.026
bridge-0.026
Token park
Token ID3952
Feature activation-7.19e-03

Pos logit contributions
ning+0.003
 group+0.003
 beating+0.003
 aware+0.003
hr+0.003
Neg logit contributions
bridge-0.003
 throne-0.002
 vows-0.002
affle-0.002
rost-0.002
Token
Token ID198
Feature activation+1.23e-03

Pos logit contributions
ning+0.011
 beating+0.011
 aware+0.011
hr+0.010
broken+0.010
Neg logit contributions
bridge-0.014
 throne-0.014
rost-0.013
oll-0.013
 vows-0.013
TokenIn
Token ID818
Feature activation-1.45e-02

Pos logit contributions
bridge+0.002
 throne+0.002
 vows+0.002
oll+0.002
affle+0.002
Neg logit contributions
ning-0.002
 beating-0.002
 aware-0.002
 group-0.002
aping-0.002
Token the
Token ID262
Feature activation-2.05e-02

Pos logit contributions
ning+0.018
aping+0.017
 aware+0.017
 beating+0.017
aunted+0.016
Neg logit contributions
 throne-0.027
bridge-0.026
 vows-0.026
rost-0.025
 menu-0.024
Token nightmare
Token ID17123
Feature activation-4.07e-04

Pos logit contributions
ning+0.037
 beating+0.037
 aware+0.037
aping+0.036
 group+0.036
Neg logit contributions
bridge-0.025
 throne-0.025
 vows-0.024
affle-0.023
rost-0.023
Token,
Token ID11
Feature activation-2.01e-02

Pos logit contributions
aping+0.001
aunted+0.001
 aware+0.001
ning+0.001
hr+0.001
Neg logit contributions
 throne-0.001
bridge-0.001
 vows-0.001
oll-0.001
 menu-0.001
Token T
Token ID309
Feature activation+6.53e-02

Pos logit contributions
aping+0.029
run+0.028
aunted+0.028
ning+0.028
 aware+0.027
Neg logit contributions
 throne-0.031
bridge-0.031
 vows-0.030
oll-0.029
 passport-0.029
Tokenessa
Token ID21411
Feature activation+5.10e-03

Pos logit contributions
bridge+0.098
 throne+0.093
rost+0.085
 fourteen+0.085
 microwave+0.084
Neg logit contributions
 aware-0.106
ing-0.106
 beating-0.105
ning-0.102
hr-0.099
Token was
Token ID373
Feature activation-9.55e-03

Pos logit contributions
 throne+0.008
bridge+0.008
 vows+0.008
oll+0.007
affle+0.007
Neg logit contributions
ning-0.007
aping-0.007
 aware-0.007
 beating-0.007
run-0.007
Token running
Token ID2491
Feature activation-1.44e-02

Pos logit contributions
aping+0.014
aunted+0.013
ning+0.013
hr+0.012
iting+0.012
Neg logit contributions
 throne-0.017
oll-0.015
bridge-0.014
ibrarian-0.013
 palace-0.013
Token and
Token ID290
Feature activation-2.32e-02

Pos logit contributions
aping+0.020
ning+0.020
 aware+0.019
hr+0.019
 beating+0.019
Neg logit contributions
 throne-0.024
bridge-0.024
 vows-0.022
 menu-0.021
rost-0.021
Token running
Token ID2491
Feature activation-7.96e-03

Pos logit contributions
aping+0.038
eter+0.035
ning+0.035
aunted+0.035
hr+0.034
Neg logit contributions
 throne-0.037
bridge-0.029
 bamboo-0.027
 crown-0.027
stack-0.026
Token help
Token ID1037
Feature activation-8.70e-03

Pos logit contributions
ing+0.000
ning+0.000
aunted+0.000
aping+0.000
 beating+0.000
Neg logit contributions
 throne-0.000
bridge-0.000
affle-0.000
 bamboo-0.000
rost-0.000
Token.
Token ID13
Feature activation-7.08e-03

Pos logit contributions
 aware+0.014
aping+0.013
broken+0.013
 group+0.013
hr+0.013
Neg logit contributions
 throne-0.011
rost-0.011
bridge-0.010
 menu-0.010
 bamboo-0.010
Token
Token ID198
Feature activation-1.56e-02

Pos logit contributions
 beating+0.010
ning+0.010
 aware+0.010
broken+0.009
 group+0.009
Neg logit contributions
 throne-0.010
bridge-0.010
oll-0.010
 vows-0.010
rost-0.010
Token
Token ID198
Feature activation-1.37e-02

Pos logit contributions
ning+0.023
 beating+0.022
 aware+0.021
broken+0.021
hr+0.021
Neg logit contributions
bridge-0.023
rost-0.023
 throne-0.023
ibrarian-0.022
oll-0.022
TokenThen
Token ID6423
Feature activation-2.01e-02

Pos logit contributions
ning+0.020
run+0.018
aping+0.017
 beating+0.017
 group+0.017
Neg logit contributions
bridge-0.021
 embassy-0.019
 vows-0.019
 throne-0.019
oll-0.018
Token T
Token ID309
Feature activation+6.53e-02

Pos logit contributions
aping+0.029
run+0.028
ning+0.028
 beating+0.028
 aware+0.028
Neg logit contributions
 throne-0.031
 vows-0.031
bridge-0.030
oll-0.028
 embassy-0.027
Tokenessa
Token ID21411
Feature activation-2.70e-03

Pos logit contributions
bridge+0.094
 throne+0.090
rost+0.081
affle+0.080
 vows+0.079
Neg logit contributions
 beating-0.111
 aware-0.111
ning-0.110
ing-0.109
hr-0.105
Token spotted
Token ID13489
Feature activation-9.22e-03

Pos logit contributions
 beating+0.004
hr+0.004
ning+0.004
 aware+0.003
aping+0.003
Neg logit contributions
bridge-0.005
 throne-0.004
 vows-0.004
oll-0.004
ibrarian-0.004
Token some
Token ID617
Feature activation-2.07e-02

Pos logit contributions
 beating+0.014
aping+0.013
 aware+0.013
aunted+0.013
broken+0.013
Neg logit contributions
bridge-0.014
 throne-0.013
 bamboo-0.013
oll-0.013
 menu-0.013
Token salt
Token ID8268
Feature activation+4.11e-03

Pos logit contributions
hr+0.035
 beating+0.035
aunted+0.035
aping+0.034
ning+0.034
Neg logit contributions
bridge-0.028
affle-0.025
 throne-0.024
oll-0.024
 bamboo-0.024
Token nearby
Token ID6716
Feature activation-1.02e-02

Pos logit contributions
bridge+0.008
 throne+0.007
 vows+0.007
affle+0.007
oll+0.007
Neg logit contributions
 beating-0.005
 aware-0.005
ning-0.005
 group-0.005
hr-0.005
Token Sue
Token ID26089
Feature activation+5.69e-03

Pos logit contributions
 beating+0.043
ning+0.042
 aware+0.041
aping+0.039
 group+0.039
Neg logit contributions
bridge-0.056
 throne-0.056
 vows-0.054
rost-0.052
affle-0.052
Token and
Token ID290
Feature activation+4.98e-03

Pos logit contributions
 throne+0.010
 vows+0.010
bridge+0.010
affle+0.009
rost+0.009
Neg logit contributions
 beating-0.007
ning-0.007
 aware-0.007
hr-0.007
 group-0.007
Token Joe
Token ID5689
Feature activation-3.12e-04

Pos logit contributions
 throne+0.007
bridge+0.007
 vows+0.007
rost+0.007
affle+0.007
Neg logit contributions
 beating-0.008
ning-0.008
 aware-0.008
aping-0.007
broken-0.007
Token,
Token ID11
Feature activation-1.95e-02

Pos logit contributions
 beating+0.000
ning+0.000
 aware+0.000
 group+0.000
hr+0.000
Neg logit contributions
bridge-0.001
 throne-0.001
 vows-0.001
affle-0.001
rost-0.001
Token were
Token ID547
Feature activation-1.45e-02

Pos logit contributions
 beating+0.029
ning+0.029
 aware+0.028
aping+0.027
hr+0.027
Neg logit contributions
bridge-0.031
 throne-0.031
 vows-0.030
rost-0.029
 menu-0.028
Token sitting
Token ID5586
Feature activation-5.88e-02

Pos logit contributions
 beating+0.015
 aware+0.014
ning+0.014
run+0.013
aping+0.013
Neg logit contributions
bridge-0.030
 throne-0.029
 vows-0.028
affle-0.028
oll-0.028
Token in
Token ID287
Feature activation-7.80e-03

Pos logit contributions
 beating+0.098
run+0.094
hr+0.092
ning+0.087
aping+0.085
Neg logit contributions
bridge-0.085
 throne-0.080
oll-0.078
affle-0.076
obb-0.074
Token front
Token ID2166
Feature activation-1.69e-02

Pos logit contributions
 beating+0.005
ning+0.005
 group+0.005
 aware+0.004
hr+0.004
Neg logit contributions
bridge-0.019
affle-0.019
 vows-0.019
 throne-0.019
oll-0.019
Token of
Token ID286
Feature activation-2.12e-03

Pos logit contributions
ning+0.015
 beating+0.015
 aware+0.015
hr+0.014
 group+0.014
Neg logit contributions
bridge-0.037
 throne-0.037
 vows-0.035
rost-0.034
 menu-0.034
Token it
Token ID340
Feature activation-1.32e-02

Pos logit contributions
 beating+0.002
ning+0.002
 aware+0.002
 group+0.002
aping+0.002
Neg logit contributions
bridge-0.004
 throne-0.004
 vows-0.004
rost-0.004
affle-0.004
Token.
Token ID13
Feature activation+3.87e-03

Pos logit contributions
 aware+0.019
hr+0.019
aping+0.019
 beating+0.019
aunted+0.018
Neg logit contributions
 throne-0.020
 menu-0.020
bridge-0.019
 vows-0.018
oll-0.018
Token and
Token ID290
Feature activation+8.90e-03

Pos logit contributions
 beating+0.003
ning+0.003
 aware+0.003
hr+0.003
run+0.003
Neg logit contributions
 throne-0.006
bridge-0.006
affle-0.006
rost-0.006
 vows-0.006
Token L
Token ID406
Feature activation+1.18e-02

Pos logit contributions
 throne+0.014
bridge+0.014
 vows+0.014
rost+0.014
affle+0.013
Neg logit contributions
 beating-0.013
ning-0.013
 aware-0.012
aping-0.012
broken-0.012
Tokenola
Token ID5708
Feature activation+1.67e-03

Pos logit contributions
bridge+0.016
 throne+0.016
 vows+0.016
rost+0.015
affle+0.014
Neg logit contributions
ning-0.020
 group-0.020
 beating-0.020
 aware-0.020
aping-0.019
Token -
Token ID532
Feature activation-2.20e-02

Pos logit contributions
bridge+0.003
 vows+0.003
 throne+0.003
affle+0.003
rost+0.003
Neg logit contributions
 beating-0.002
ning-0.002
 aware-0.002
hr-0.002
aping-0.002
Token were
Token ID547
Feature activation-1.99e-02

Pos logit contributions
 beating+0.035
ning+0.034
 aware+0.034
 group+0.032
aping+0.032
Neg logit contributions
bridge-0.033
 throne-0.032
 vows-0.031
rost-0.030
affle-0.030
Token sitting
Token ID5586
Feature activation-5.71e-02

Pos logit contributions
 beating+0.024
 aware+0.022
ning+0.022
run+0.021
broken+0.021
Neg logit contributions
bridge-0.038
 throne-0.037
 vows-0.036
affle-0.036
oll-0.035
Token on
Token ID319
Feature activation+8.38e-03

Pos logit contributions
 beating+0.087
run+0.081
hr+0.081
ning+0.078
 aware+0.076
Neg logit contributions
bridge-0.092
affle-0.085
 throne-0.085
oll-0.084
obb-0.082
Token a
Token ID257
Feature activation+1.35e-02

Pos logit contributions
bridge+0.021
 throne+0.020
 vows+0.020
oll+0.020
affle+0.020
Neg logit contributions
ning-0.005
 group-0.005
 beating-0.005
 aware-0.005
hr-0.004
Token cool
Token ID3608
Feature activation+2.90e-02

Pos logit contributions
bridge+0.024
 vows+0.024
oll+0.024
mm+0.022
rost+0.022
Neg logit contributions
 group-0.019
 beating-0.017
ning-0.017
 aware-0.016
 wise-0.016
Token raft
Token ID31812
Feature activation+2.46e-03

Pos logit contributions
bridge+0.041
 vows+0.039
 throne+0.039
oll+0.037
affle+0.037
Neg logit contributions
 beating-0.048
ning-0.048
 group-0.048
 aware-0.047
hr-0.045
Token.
Token ID13
Feature activation+3.91e-03

Pos logit contributions
bridge+0.004
 throne+0.004
 menu+0.004
 vows+0.004
affle+0.004
Neg logit contributions
 aware-0.003
 beating-0.003
hr-0.003
aping-0.003
ning-0.003
Token he
Token ID339
Feature activation+1.43e-02

Pos logit contributions
bridge+0.002
 throne+0.002
 vows+0.002
affle+0.002
rost+0.002
Neg logit contributions
 beating-0.002
ning-0.002
 aware-0.002
 group-0.002
aping-0.002
Token could
Token ID714
Feature activation-1.79e-03

Pos logit contributions
bridge+0.020
 vows+0.019
 throne+0.019
affle+0.018
oll+0.018
Neg logit contributions
ning-0.024
 beating-0.023
 aware-0.023
aping-0.022
 group-0.022
Token be
Token ID307
Feature activation-1.74e-02

Pos logit contributions
ning+0.003
ing+0.003
 beating+0.002
hr+0.002
aping+0.002
Neg logit contributions
 throne-0.003
bridge-0.003
 vows-0.003
affle-0.002
 wait-0.002
Token and
Token ID290
Feature activation-2.86e-02

Pos logit contributions
ning+0.023
aunted+0.023
aping+0.022
 Cater+0.021
ing+0.021
Neg logit contributions
bridge-0.031
 throne-0.027
obb-0.025
affle-0.024
 vows-0.023
Token never
Token ID1239
Feature activation-3.36e-02

Pos logit contributions
ning+0.048
 beating+0.045
aping+0.044
hr+0.044
ing+0.043
Neg logit contributions
 throne-0.041
bridge-0.039
 vows-0.039
rost-0.035
affle-0.034
Token mis
Token ID2984
Feature activation-5.68e-02

Pos logit contributions
ning+0.060
aping+0.058
ing+0.054
bing+0.053
aunted+0.052
Neg logit contributions
 wait-0.042
 throne-0.041
 vows-0.037
rost-0.036
 fourteen-0.033
Tokenbeh
Token ID20709
Feature activation-3.58e-02

Pos logit contributions
 Emily+0.092
 Mitt+0.081
 Tina+0.080
 singing+0.079
 Bella+0.077
Neg logit contributions
asses-0.056
round-0.055
 wait-0.054
Whoever-0.054
obb-0.053
Tokenaved
Token ID9586
Feature activation-8.72e-03

Pos logit contributions
ning+0.065
 beating+0.063
 aware+0.063
 wise+0.060
hr+0.059
Neg logit contributions
bridge-0.043
obb-0.041
oll-0.041
affle-0.041
 vows-0.039
Token.
Token ID13
Feature activation-2.50e-03

Pos logit contributions
 aware+0.011
aping+0.011
hr+0.010
 beating+0.010
ning+0.010
Neg logit contributions
bridge-0.016
 throne-0.015
 vows-0.014
 menu-0.013
obb-0.013
Token<|endoftext|>
Token ID50256
Feature activation-1.20e-02

Pos logit contributions
ning+0.003
 beating+0.003
 aware+0.003
hr+0.003
 group+0.003
Neg logit contributions
bridge-0.005
rost-0.004
 throne-0.004
oll-0.004
 vows-0.004
TokenOnce
Token ID7454
Feature activation+1.23e-02

Pos logit contributions
 aware+0.015
 beating+0.015
ning+0.014
hr+0.013
aping+0.013
Neg logit contributions
bridge-0.023
 throne-0.022
 vows-0.021
affle-0.021
rost-0.021
Token to
Token ID284
Feature activation-4.44e-03

Pos logit contributions
ning+0.022
 group+0.019
hr+0.019
 aware+0.018
aunted+0.018
Neg logit contributions
 throne-0.024
 vows-0.024
bridge-0.023
rost-0.023
obb-0.023
Token fix
Token ID4259
Feature activation-3.27e-03

Pos logit contributions
ning+0.006
 aware+0.005
 beating+0.005
run+0.005
ing+0.005
Neg logit contributions
 throne-0.008
rost-0.007
bridge-0.007
 vows-0.007
affle-0.007
Token it
Token ID340
Feature activation+8.22e-04

Pos logit contributions
 aware+0.006
 group+0.006
 beating+0.006
hr+0.006
ning+0.005
Neg logit contributions
bridge-0.004
 throne-0.004
 vows-0.004
affle-0.004
oll-0.004
Token.
Token ID13
Feature activation-5.69e-04

Pos logit contributions
bridge+0.001
 throne+0.001
oll+0.001
obb+0.001
 vows+0.001
Neg logit contributions
 aware-0.001
 group-0.001
hr-0.001
 beating-0.001
ning-0.001
Token Her
Token ID2332
Feature activation+1.29e-03

Pos logit contributions
 beating+0.001
 aware+0.001
ning+0.001
 group+0.001
hr+0.001
Neg logit contributions
 throne-0.001
bridge-0.001
 vows-0.001
affle-0.001
rost-0.001
Token mom
Token ID1995
Feature activation-5.60e-02

Pos logit contributions
bridge+0.002
 throne+0.002
 vows+0.002
affle+0.002
rost+0.002
Neg logit contributions
 beating-0.002
ning-0.002
 aware-0.002
 group-0.002
aping-0.002
Tokenmy
Token ID1820
Feature activation-3.81e-02

Pos logit contributions
 Cater+0.083
hr+0.077
 harmony+0.075
 beating+0.071
ning+0.070
Neg logit contributions
bridge-0.074
ibrarian-0.073
 microwave-0.073
oll-0.072
rost-0.071
Token was
Token ID373
Feature activation-7.40e-03

Pos logit contributions
hr+0.058
ning+0.056
 Cater+0.056
run+0.055
 harmony+0.054
Neg logit contributions
ibrarian-0.051
 vows-0.050
bridge-0.049
oll-0.047
 microwave-0.046
Token very
Token ID845
Feature activation+1.96e-03

Pos logit contributions
ning+0.011
ing+0.010
heart+0.010
ounce+0.010
run+0.010
Neg logit contributions
bridge-0.011
oll-0.010
 throne-0.010
ibrarian-0.010
obb-0.010
Token thoughtful
Token ID22677
Feature activation+5.78e-04

Pos logit contributions
obb+0.003
oll+0.003
 vows+0.003
affle+0.003
ibrarian+0.003
Neg logit contributions
ning-0.003
bat-0.002
 beating-0.002
ing-0.002
run-0.002
Token and
Token ID290
Feature activation-1.50e-02

Pos logit contributions
bridge+0.001
 throne+0.001
 vows+0.001
oll+0.001
affle+0.001
Neg logit contributions
ning-0.001
 beating-0.001
 aware-0.001
hr-0.001
 group-0.001
TokenIf
Token ID1532
Feature activation-8.67e-03

Pos logit contributions
 beating+0.012
 group+0.012
 aware+0.012
ning+0.012
broken+0.011
Neg logit contributions
bridge-0.012
rost-0.011
 vows-0.010
inner-0.010
 fourteen-0.010
Token you
Token ID345
Feature activation+4.26e-03

Pos logit contributions
broken+0.015
 beating+0.015
ning+0.015
aping+0.015
 aware+0.015
Neg logit contributions
 throne-0.011
 vows-0.011
rost-0.010
bridge-0.010
 menu-0.009
Token want
Token ID765
Feature activation+6.87e-03

Pos logit contributions
 vows+0.006
 throne+0.005
bridge+0.005
affle+0.005
rost+0.005
Neg logit contributions
ning-0.008
 beating-0.007
aping-0.007
aunted-0.007
 aware-0.007
Token to
Token ID284
Feature activation-8.07e-03

Pos logit contributions
 throne+0.012
 vows+0.012
 menu+0.012
affle+0.011
rost+0.011
Neg logit contributions
 beating-0.009
broken-0.009
aping-0.009
aunted-0.009
hr-0.009
Token get
Token ID651
Feature activation-1.30e-02

Pos logit contributions
ning+0.013
 beating+0.013
aping+0.013
aunted+0.013
run+0.013
Neg logit contributions
 throne-0.011
 vows-0.011
bridge-0.010
 wait-0.010
 menu-0.010
Token rid
Token ID5755
Feature activation-5.47e-02

Pos logit contributions
 beating+0.018
ning+0.017
 group+0.017
aunted+0.017
hr+0.017
Neg logit contributions
affle-0.020
 throne-0.020
bridge-0.020
 vows-0.020
rost-0.020
Token of
Token ID286
Feature activation-7.62e-03

Pos logit contributions
ounce+0.079
 beating+0.079
ning+0.077
hr+0.076
 feat+0.076
Neg logit contributions
bridge-0.083
affle-0.082
oll-0.082
obb-0.075
inner-0.074
Token the
Token ID262
Feature activation-9.87e-03

Pos logit contributions
 beating+0.010
aunted+0.010
aping+0.010
 group+0.010
broken+0.010
Neg logit contributions
 throne-0.012
bridge-0.012
affle-0.012
 menu-0.012
 vows-0.012
Token hay
Token ID27678
Feature activation+3.52e-03

Pos logit contributions
 beating+0.020
hr+0.019
ning+0.019
 aware+0.019
aping+0.019
Neg logit contributions
bridge-0.010
 throne-0.010
 vows-0.010
affle-0.009
 menu-0.009
Token,
Token ID11
Feature activation-4.03e-02

Pos logit contributions
 throne+0.006
bridge+0.006
 vows+0.006
 menu+0.006
affle+0.006
Neg logit contributions
 beating-0.005
 aware-0.004
aping-0.004
hr-0.004
ning-0.004
Token you
Token ID345
Feature activation-4.94e-03

Pos logit contributions
ning+0.066
 beating+0.065
broken+0.064
aunted+0.062
aping+0.061
Neg logit contributions
 throne-0.059
bridge-0.056
rost-0.052
 vows-0.051
 menu-0.048
Token
Token ID198
Feature activation-1.17e-02

Pos logit contributions
ning+0.015
 aware+0.015
 beating+0.014
hr+0.014
broken+0.014
Neg logit contributions
rost-0.013
bridge-0.012
 throne-0.012
affle-0.012
oll-0.012
TokenB
Token ID33
Feature activation+4.20e-03

Pos logit contributions
ning+0.013
hr+0.012
 group+0.012
 aware+0.012
aping+0.011
Neg logit contributions
bridge-0.022
affle-0.021
 throne-0.020
oll-0.020
rost-0.020
Tokenun
Token ID403
Feature activation-7.86e-03

Pos logit contributions
oll+0.009
rost+0.009
bridge+0.009
affle+0.009
 throne+0.008
Neg logit contributions
ning-0.004
 aware-0.004
hr-0.003
 group-0.003
 beating-0.003
Token's
Token ID338
Feature activation-1.02e-02

Pos logit contributions
ning+0.012
hr+0.011
aping+0.011
run+0.011
 sorts+0.011
Neg logit contributions
bridge-0.011
 throne-0.011
affle-0.011
ibrarian-0.010
oll-0.010
Token friend
Token ID1545
Feature activation-2.91e-02

Pos logit contributions
ning+0.014
 beating+0.014
hr+0.014
 aware+0.013
aping+0.013
Neg logit contributions
 throne-0.017
bridge-0.017
 vows-0.017
oll-0.016
affle-0.016
Token,
Token ID11
Feature activation-5.47e-02

Pos logit contributions
hr+0.053
aping+0.049
ning+0.049
eter+0.048
 beating+0.048
Neg logit contributions
 throne-0.031
oll-0.030
ibrarian-0.029
rost-0.028
ily-0.028
Token a
Token ID257
Feature activation-1.37e-02

Pos logit contributions
aping+0.094
eter+0.090
hr+0.089
 beating+0.086
aunted+0.084
Neg logit contributions
 bamboo-0.064
 wow-0.061
ily-0.059
 Lib-0.058
bridge-0.058
Token flexible
Token ID12846
Feature activation+1.88e-02

Pos logit contributions
aping+0.024
ning+0.024
aunted+0.022
bing+0.020
ing+0.020
Neg logit contributions
 throne-0.022
 firefighter-0.020
 gorilla-0.020
 freezer-0.019
 coin-0.019
Token snake
Token ID17522
Feature activation-6.24e-03

Pos logit contributions
 throne+0.029
affle+0.028
bridge+0.027
 vows+0.027
 menu+0.027
Neg logit contributions
aping-0.030
ning-0.029
hr-0.029
 aware-0.029
 beating-0.028
Token named
Token ID3706
Feature activation-3.75e-03

Pos logit contributions
 beating+0.008
ning+0.008
 aware+0.008
 group+0.007
hr+0.007
Neg logit contributions
bridge-0.011
 throne-0.011
 vows-0.011
affle-0.011
rost-0.010
Token Sly
Token ID31615
Feature activation+7.13e-03

Pos logit contributions
 beating+0.006
broken+0.005
 aware+0.005
aping+0.005
hr+0.005
Neg logit contributions
 throne-0.005
 bamboo-0.005
rost-0.005
bridge-0.005
affle-0.005
Token me
Token ID502
Feature activation-1.94e-02

Pos logit contributions
 beating+0.040
run+0.039
aunted+0.039
apping+0.039
 feat+0.039
Neg logit contributions
affle-0.021
 oven-0.021
 cake-0.021
rost-0.019
 throne-0.018
Token?"
Token ID1701
Feature activation-2.74e-02

Pos logit contributions
aunted+0.035
 beating+0.034
ning+0.032
aping+0.031
 aware+0.031
Neg logit contributions
 throne-0.023
affle-0.023
 freezer-0.021
aches-0.021
oll-0.020
Token she
Token ID673
Feature activation-2.51e-02

Pos logit contributions
ning+0.055
 aware+0.053
 beating+0.053
aping+0.052
hr+0.052
Neg logit contributions
 throne-0.027
bridge-0.026
 vows-0.025
 microwave-0.024
 piping-0.024
Token asked
Token ID1965
Feature activation-1.81e-02

Pos logit contributions
aping+0.047
ounce+0.046
ing+0.045
run+0.045
aught+0.045
Neg logit contributions
ibrarian-0.022
 throne-0.022
 microwave-0.022
bridge-0.022
 crown-0.021
Token.
Token ID13
Feature activation-1.29e-02

Pos logit contributions
ning+0.034
 aware+0.033
 group+0.033
 flock+0.032
ounce+0.032
Neg logit contributions
 throne-0.018
ily-0.017
oll-0.017
rost-0.016
bridge-0.016
Token Mom
Token ID11254
Feature activation-5.46e-02

Pos logit contributions
 aware+0.020
 beating+0.019
ning+0.019
hr+0.019
 group+0.018
Neg logit contributions
 vows-0.019
 throne-0.019
bridge-0.019
 menu-0.018
 microwave-0.018
Tokenmy
Token ID1820
Feature activation-3.07e-02

Pos logit contributions
hr+0.079
 group+0.076
 harmony+0.076
 Cater+0.074
 aware+0.073
Neg logit contributions
oll-0.080
bridge-0.078
ibrarian-0.077
 microwave-0.077
rost-0.077
Token smiled
Token ID13541
Feature activation-2.47e-02

Pos logit contributions
hr+0.043
 harmony+0.043
 beating+0.043
 group+0.042
ning+0.042
Neg logit contributions
ibrarian-0.047
bridge-0.045
oll-0.044
affle-0.043
 vows-0.043
Token and
Token ID290
Feature activation-6.44e-03

Pos logit contributions
hr+0.035
ning+0.034
aping+0.033
 group+0.033
run+0.032
Neg logit contributions
bridge-0.039
 throne-0.038
oll-0.037
 vows-0.036
 menu-0.035
Token took
Token ID1718
Feature activation+4.39e-03

Pos logit contributions
 beating+0.009
ning+0.009
 group+0.009
hr+0.009
 aware+0.008
Neg logit contributions
 vows-0.010
oll-0.010
 throne-0.010
bridge-0.010
 microwave-0.009
Token the
Token ID262
Feature activation-9.36e-03

Pos logit contributions
bridge+0.007
 vows+0.007
oll+0.007
affle+0.007
obb+0.007
Neg logit contributions
 beating-0.007
 aware-0.006
 group-0.006
ning-0.006
hr-0.006
Token came
Token ID1625
Feature activation-3.52e-02

Pos logit contributions
hr+0.044
 aware+0.042
 harmony+0.041
ning+0.040
eter+0.040
Neg logit contributions
bridge-0.028
ibrarian-0.028
oll-0.026
 throne-0.026
 microwave-0.026
Token to
Token ID284
Feature activation-7.02e-03

Pos logit contributions
 consider+0.057
ended+0.056
hr+0.056
 beating+0.056
 aware+0.055
Neg logit contributions
 throne-0.033
 menu-0.030
 menus-0.028
 freezer-0.027
rant-0.026
Token help
Token ID1037
Feature activation-1.50e-03

Pos logit contributions
ning+0.011
 beating+0.011
ing+0.011
 aware+0.010
aping+0.010
Neg logit contributions
 throne-0.011
bridge-0.010
 vows-0.010
rost-0.010
 menu-0.010
Token her
Token ID607
Feature activation-9.93e-03

Pos logit contributions
ning+0.003
 beating+0.003
 aware+0.003
 group+0.003
aping+0.002
Neg logit contributions
rost-0.002
 throne-0.002
 menu-0.002
 vows-0.002
bridge-0.002
Token.
Token ID13
Feature activation+5.74e-03

Pos logit contributions
ning+0.013
 beating+0.013
ing+0.012
 aware+0.012
 group+0.012
Neg logit contributions
 throne-0.017
 vows-0.016
bridge-0.016
 menu-0.016
affle-0.015
Token Mom
Token ID11254
Feature activation-5.46e-02

Pos logit contributions
bridge+0.009
 throne+0.008
 vows+0.008
affle+0.008
rost+0.008
Neg logit contributions
ning-0.009
 beating-0.009
 aware-0.009
 group-0.009
hr-0.008
Tokenmy
Token ID1820
Feature activation-3.19e-02

Pos logit contributions
 Cater+0.072
hr+0.071
 harmony+0.069
 beating+0.068
 group+0.068
Neg logit contributions
 microwave-0.084
bridge-0.084
oll-0.081
ibrarian-0.080
 freezer-0.080
Token found
Token ID1043
Feature activation-1.35e-02

Pos logit contributions
 beating+0.046
ning+0.046
hr+0.045
 harmony+0.044
 aware+0.044
Neg logit contributions
ibrarian-0.046
 throne-0.044
bridge-0.044
 vows-0.043
 microwave-0.042
Token the
Token ID262
Feature activation-1.16e-03

Pos logit contributions
 beating+0.017
 aware+0.016
 group+0.015
ning+0.015
hr+0.015
Neg logit contributions
bridge-0.025
 throne-0.025
 vows-0.024
oll-0.024
rost-0.024
Token key
Token ID1994
Feature activation-5.69e-03

Pos logit contributions
 group+0.002
 kind+0.001
 wise+0.001
 beating+0.001
 struggling+0.001
Neg logit contributions
affle-0.002
bridge-0.002
 vows-0.002
rost-0.002
 fries-0.002
Token and
Token ID290
Feature activation-1.88e-02

Pos logit contributions
 aware+0.007
ning+0.007
 beating+0.007
 group+0.007
hr+0.006
Neg logit contributions
bridge-0.010
 throne-0.010
 vows-0.010
oll-0.010
rost-0.009
Token and
Token ID290
Feature activation-1.09e-02

Pos logit contributions
 throne+0.002
oll+0.002
bridge+0.002
 vows+0.002
 menu+0.002
Neg logit contributions
ning-0.002
hr-0.002
 aware-0.002
 beating-0.002
aping-0.002
Token hugged
Token ID41130
Feature activation+3.89e-03

Pos logit contributions
ning+0.014
 beating+0.014
 aware+0.014
 group+0.014
hr+0.013
Neg logit contributions
 throne-0.019
bridge-0.018
 vows-0.018
affle-0.017
oll-0.017
Token it
Token ID340
Feature activation+1.59e-03

Pos logit contributions
bridge+0.006
 throne+0.006
 vows+0.006
rost+0.005
affle+0.005
Neg logit contributions
 beating-0.006
 aware-0.006
ning-0.006
 group-0.006
hr-0.006
Token tightly
Token ID17707
Feature activation+5.45e-03

Pos logit contributions
 throne+0.002
 vows+0.002
bridge+0.002
oll+0.002
 menu+0.002
Neg logit contributions
ning-0.003
aping-0.003
hr-0.003
 aware-0.003
ran-0.003
Token.
Token ID13
Feature activation+1.03e-02

Pos logit contributions
 throne+0.011
bridge+0.011
 vows+0.010
oll+0.010
affle+0.010
Neg logit contributions
ning-0.006
 aware-0.006
 beating-0.006
 group-0.006
hr-0.006
Token Mom
Token ID11254
Feature activation-5.45e-02

Pos logit contributions
bridge+0.019
 throne+0.018
 vows+0.018
affle+0.018
rost+0.017
Neg logit contributions
 beating-0.013
ning-0.013
 aware-0.013
 group-0.012
aping-0.012
Tokenmy
Token ID1820
Feature activation-2.97e-02

Pos logit contributions
hr+0.075
 beating+0.073
 group+0.072
 aware+0.072
 Cater+0.071
Neg logit contributions
oll-0.080
affle-0.079
 microwave-0.078
bridge-0.076
 throne-0.076
Token told
Token ID1297
Feature activation+6.95e-03

Pos logit contributions
 beating+0.042
ning+0.042
run+0.041
hr+0.040
 aware+0.040
Neg logit contributions
 vows-0.045
bridge-0.043
 throne-0.043
ibrarian-0.043
oll-0.042
Token Lily
Token ID20037
Feature activation-3.22e-03

Pos logit contributions
bridge+0.012
 throne+0.012
 vows+0.012
affle+0.011
rost+0.011
Neg logit contributions
 beating-0.009
ning-0.009
 aware-0.009
 group-0.009
aping-0.009
Token that
Token ID326
Feature activation-1.41e-02

Pos logit contributions
ning+0.005
 beating+0.005
 group+0.005
hr+0.005
 aware+0.005
Neg logit contributions
 vows-0.005
 throne-0.005
bridge-0.004
oll-0.004
rost-0.004
Token it
Token ID340
Feature activation-9.95e-03

Pos logit contributions
run+0.023
 group+0.022
 aware+0.022
 beating+0.022
ning+0.022
Neg logit contributions
 throne-0.018
bridge-0.018
rost-0.018
 vows-0.018
oll-0.018
Token.
Token ID13
Feature activation-1.82e-02

Pos logit contributions
hr+0.028
aping+0.027
ning+0.026
 group+0.026
aunted+0.025
Neg logit contributions
 throne-0.029
oll-0.027
bridge-0.027
 vows-0.026
rost-0.025
Token
Token ID220
Feature activation-3.00e-02

Pos logit contributions
 beating+0.032
ning+0.032
broken+0.031
hr+0.030
 aware+0.030
Neg logit contributions
 throne-0.023
bridge-0.022
rost-0.022
 vows-0.021
affle-0.020
Token
Token ID198
Feature activation-1.13e-02

Pos logit contributions
 beating+0.047
 aware+0.046
ning+0.046
 group+0.045
hr+0.045
Neg logit contributions
bridge-0.043
rost-0.043
oll-0.041
ibrarian-0.040
affle-0.039
Token
Token ID198
Feature activation-1.82e-02

Pos logit contributions
ning+0.018
 beating+0.017
 aware+0.016
broken+0.015
hr+0.015
Neg logit contributions
bridge-0.017
rost-0.017
ibrarian-0.016
 throne-0.016
affle-0.015
TokenLuckily
Token ID42332
Feature activation-2.17e-02

Pos logit contributions
ning+0.031
 beating+0.029
 group+0.028
hr+0.028
 aware+0.027
Neg logit contributions
bridge-0.026
 throne-0.023
rost-0.023
affle-0.022
ibrarian-0.022
Token,
Token ID11
Feature activation-5.45e-02

Pos logit contributions
hr+0.034
aping+0.032
 aware+0.032
 group+0.032
ning+0.031
Neg logit contributions
 throne-0.031
 vows-0.028
bridge-0.028
rost-0.028
oll-0.027
Token his
Token ID465
Feature activation-2.53e-03

Pos logit contributions
aping+0.085
ning+0.082
 beating+0.080
ounce+0.080
broken+0.079
Neg logit contributions
 throne-0.081
bridge-0.077
 bamboo-0.069
rost-0.068
 wow-0.066
Token mom
Token ID1995
Feature activation-3.33e-02

Pos logit contributions
ning+0.004
 aware+0.004
 beating+0.004
aping+0.004
hr+0.004
Neg logit contributions
 throne-0.004
 vows-0.004
affle-0.004
bridge-0.004
 menu-0.004
Token heard
Token ID2982
Feature activation-2.09e-02

Pos logit contributions
 Cater+0.055
hr+0.055
 aware+0.054
ning+0.051
eter+0.051
Neg logit contributions
bridge-0.042
ibrarian-0.040
 throne-0.038
oll-0.036
 microwave-0.035
Token him
Token ID683
Feature activation-7.70e-03

Pos logit contributions
 beating+0.040
ning+0.039
aping+0.039
 group+0.039
 aware+0.039
Neg logit contributions
bridge-0.024
 throne-0.022
 menu-0.020
rost-0.020
affle-0.020
Token and
Token ID290
Feature activation-3.34e-02

Pos logit contributions
ning+0.010
aping+0.010
 group+0.009
 beating+0.009
 aware+0.009
Neg logit contributions
 throne-0.012
 vows-0.012
 menu-0.011
oll-0.011
bridge-0.011
Token Lily
Token ID20037
Feature activation-1.50e-03

Pos logit contributions
bridge+0.015
 throne+0.015
 vows+0.015
rost+0.014
affle+0.014
Neg logit contributions
ning-0.015
 beating-0.014
 aware-0.014
 group-0.014
aping-0.013
Token not
Token ID407
Feature activation-7.33e-03

Pos logit contributions
ning+0.002
 beating+0.002
 group+0.002
hr+0.002
aping+0.002
Neg logit contributions
 throne-0.002
bridge-0.002
 vows-0.002
oll-0.002
rost-0.002
Token to
Token ID284
Feature activation-5.67e-03

Pos logit contributions
ning+0.010
 group+0.010
aping+0.010
 Cater+0.010
ised+0.010
Neg logit contributions
 throne-0.010
rost-0.010
 anniversary-0.010
 menu-0.010
 vows-0.009
Token worry
Token ID5490
Feature activation+5.55e-03

Pos logit contributions
ning+0.008
 beating+0.007
run+0.007
 aware+0.007
aping+0.007
Neg logit contributions
 wait-0.009
rost-0.009
 anniversary-0.008
 throne-0.008
 freezer-0.008
Token.
Token ID13
Feature activation-4.20e-03

Pos logit contributions
 throne+0.009
oll+0.009
rost+0.009
bridge+0.009
 vows+0.009
Neg logit contributions
ning-0.007
 group-0.007
 aware-0.007
aping-0.007
hr-0.007
Token Mom
Token ID11254
Feature activation-5.45e-02

Pos logit contributions
 beating+0.007
ning+0.006
 aware+0.006
 group+0.006
hr+0.006
Neg logit contributions
bridge-0.006
 throne-0.006
 vows-0.006
rost-0.006
affle-0.006
Tokenmy
Token ID1820
Feature activation-3.09e-02

Pos logit contributions
 Cater+0.079
hr+0.079
 harmony+0.079
 group+0.075
 flock+0.072
Neg logit contributions
oll-0.079
 microwave-0.078
bridge-0.077
ibrarian-0.076
affle-0.075
Token gave
Token ID2921
Feature activation-5.54e-03

Pos logit contributions
hr+0.042
 beating+0.042
 group+0.041
ning+0.041
 harmony+0.040
Neg logit contributions
ibrarian-0.048
bridge-0.047
oll-0.046
 vows-0.045
affle-0.044
Token her
Token ID607
Feature activation-3.78e-04

Pos logit contributions
 beating+0.009
 aware+0.009
ning+0.009
 group+0.009
hr+0.009
Neg logit contributions
bridge-0.008
rost-0.007
 throne-0.007
 vows-0.007
oll-0.007
Token a
Token ID257
Feature activation+3.42e-03

Pos logit contributions
 beating+0.000
ning+0.000
 group+0.000
 aware+0.000
aping+0.000
Neg logit contributions
bridge-0.001
 throne-0.001
 vows-0.001
oll-0.001
affle-0.001
Token hug
Token ID16225
Feature activation+1.78e-03

Pos logit contributions
 throne+0.006
bridge+0.005
 menu+0.005
 vows+0.005
affle+0.005
Neg logit contributions
ning-0.005
 beating-0.005
 aware-0.005
aping-0.005
hr-0.005
Token One
Token ID1881
Feature activation-1.57e-02

Pos logit contributions
 throne+0.013
bridge+0.013
 vows+0.012
 menu+0.012
rost+0.012
Neg logit contributions
 beating-0.017
 aware-0.017
ning-0.017
 group-0.017
aping-0.017
Token day
Token ID1110
Feature activation-4.82e-03

Pos logit contributions
ning+0.026
 beating+0.026
 aware+0.025
aping+0.025
 group+0.024
Neg logit contributions
bridge-0.022
 throne-0.021
 vows-0.020
affle-0.020
rost-0.020
Token,
Token ID11
Feature activation-2.03e-02

Pos logit contributions
 aware+0.005
ning+0.005
 beating+0.005
aping+0.005
 group+0.005
Neg logit contributions
 throne-0.010
bridge-0.009
 vows-0.009
 menu-0.009
rost-0.009
Token Jonathan
Token ID11232
Feature activation+1.50e-02

Pos logit contributions
 aware+0.035
ning+0.035
aping+0.035
 beating+0.034
run+0.034
Neg logit contributions
 throne-0.028
bridge-0.026
 vows-0.025
 menu-0.025
rost-0.024
Token was
Token ID373
Feature activation-8.88e-03

Pos logit contributions
bridge+0.027
 throne+0.026
 vows+0.026
affle+0.025
rost+0.025
Neg logit contributions
ning-0.019
 beating-0.019
 aware-0.018
 group-0.017
aping-0.017
Token sitting
Token ID5586
Feature activation-5.43e-02

Pos logit contributions
aunted+0.012
aping+0.012
ran+0.011
run+0.011
hr+0.011
Neg logit contributions
 throne-0.014
oll-0.014
affle-0.014
bridge-0.013
obb-0.013
Token on
Token ID319
Feature activation+8.52e-03

Pos logit contributions
run+0.085
 beating+0.085
hr+0.082
ning+0.079
aping+0.078
Neg logit contributions
bridge-0.081
 throne-0.077
oll-0.075
affle-0.075
obb-0.072
Token his
Token ID465
Feature activation+1.25e-02

Pos logit contributions
bridge+0.020
 vows+0.020
 throne+0.020
affle+0.019
oll+0.019
Neg logit contributions
ning-0.006
 beating-0.006
 aware-0.006
 group-0.005
hr-0.005
Token per
Token ID583
Feature activation+4.36e-03

Pos logit contributions
bridge+0.019
 vows+0.018
 throne+0.017
rost+0.017
affle+0.017
Neg logit contributions
 beating-0.020
ning-0.020
 group-0.019
 aware-0.019
aping-0.018
Tokench
Token ID354
Feature activation-4.19e-03

Pos logit contributions
bridge+0.007
 throne+0.007
obb+0.007
affle+0.007
oll+0.006
Neg logit contributions
 group-0.006
hr-0.006
 aware-0.006
aping-0.006
ning-0.006
Token in
Token ID287
Feature activation-6.61e-03

Pos logit contributions
 aware+0.006
 beating+0.005
ning+0.005
 group+0.005
aping+0.005
Neg logit contributions
bridge-0.007
 throne-0.007
affle-0.007
 vows-0.007
rost-0.007
Tokenmy
Token ID1820
Feature activation-9.66e-03

Pos logit contributions
 aware+0.035
hr+0.035
 Cater+0.033
ning+0.032
 beating+0.031
Neg logit contributions
bridge-0.033
ibrarian-0.031
oll-0.031
 throne-0.030
rost-0.030
Token came
Token ID1625
Feature activation-2.94e-02

Pos logit contributions
hr+0.017
 aware+0.016
ning+0.016
aping+0.015
 group+0.015
Neg logit contributions
bridge-0.012
oll-0.012
ibrarian-0.011
 vows-0.011
 throne-0.011
Token home
Token ID1363
Feature activation+3.56e-03

Pos logit contributions
ning+0.046
hr+0.045
aping+0.043
 beating+0.043
aunted+0.042
Neg logit contributions
 throne-0.036
 freezer-0.030
rost-0.029
 bamboo-0.029
 ear-0.028
Token.
Token ID13
Feature activation-2.88e-03

Pos logit contributions
 throne+0.005
 freezer+0.004
 menu+0.004
rost+0.004
oll+0.004
Neg logit contributions
ning-0.006
aping-0.006
elcome-0.005
ming-0.005
ymes-0.005
Token Her
Token ID2332
Feature activation+1.37e-02

Pos logit contributions
ning+0.005
 beating+0.005
hr+0.005
 aware+0.005
 group+0.005
Neg logit contributions
 throne-0.004
 vows-0.003
 menu-0.003
rost-0.003
bridge-0.003
Token mom
Token ID1995
Feature activation-5.42e-02

Pos logit contributions
 throne+0.020
 vows+0.020
oll+0.020
bridge+0.019
 microwave+0.019
Neg logit contributions
ning-0.020
hr-0.020
aping-0.019
ing-0.019
 aware-0.018
Tokenmy
Token ID1820
Feature activation-3.25e-02

Pos logit contributions
 Cater+0.080
hr+0.075
 harmony+0.074
eter+0.072
ning+0.070
Neg logit contributions
 microwave-0.068
bridge-0.068
 freezer-0.066
ibrarian-0.066
rost-0.066
Token called
Token ID1444
Feature activation-7.28e-03

Pos logit contributions
hr+0.053
ning+0.052
eter+0.051
 harmony+0.050
 Cater+0.049
Neg logit contributions
ibrarian-0.039
bridge-0.037
 microwave-0.037
 throne-0.036
 vows-0.035
Token the
Token ID262
Feature activation+1.06e-02

Pos logit contributions
ning+0.012
aping+0.012
 beating+0.012
 aware+0.012
hr+0.012
Neg logit contributions
 throne-0.010
rost-0.010
 vows-0.009
bridge-0.009
 menu-0.009
Token ambulance
Token ID22536
Feature activation-1.88e-03

Pos logit contributions
bridge+0.019
 throne+0.019
 vows+0.018
affle+0.018
 menu+0.018
Neg logit contributions
ning-0.014
 beating-0.014
 aware-0.013
hr-0.013
aping-0.013
Token but
Token ID475
Feature activation-1.83e-02

Pos logit contributions
hr+0.002
 aware+0.002
 group+0.002
ning+0.002
aping+0.002
Neg logit contributions
 throne-0.003
bridge-0.003
 vows-0.003
rost-0.003
ibrarian-0.003
Tokenmy
Token ID1820
Feature activation-1.37e-02

Pos logit contributions
 group+0.009
 aware+0.009
 beating+0.009
ning+0.008
hr+0.008
Neg logit contributions
bridge-0.028
oll-0.027
rost-0.027
 throne-0.026
mm-0.026
Token ac
Token ID936
Feature activation+1.53e-02

Pos logit contributions
 group+0.018
 aware+0.018
hr+0.018
aping+0.017
ning+0.017
Neg logit contributions
bridge-0.023
oll-0.023
 throne-0.022
rost-0.021
affle-0.021
Tokenhe
Token ID258
Feature activation+1.90e-02

Pos logit contributions
bridge+0.015
affle+0.015
oll+0.014
 throne+0.014
rost+0.013
Neg logit contributions
 beating-0.032
 aware-0.031
 group-0.031
ning-0.031
hr-0.030
Token.
Token ID13
Feature activation-1.18e-03

Pos logit contributions
bridge+0.038
 throne+0.037
 vows+0.037
affle+0.036
rost+0.035
Neg logit contributions
ning-0.020
 beating-0.020
 aware-0.020
 group-0.019
aping-0.018
Token Her
Token ID2332
Feature activation+6.12e-03

Pos logit contributions
 beating+0.002
ning+0.002
 aware+0.002
 group+0.002
hr+0.002
Neg logit contributions
 throne-0.002
 vows-0.002
bridge-0.002
rost-0.002
 menu-0.002
Token mom
Token ID1995
Feature activation-5.40e-02

Pos logit contributions
 throne+0.008
 vows+0.008
oll+0.008
affle+0.007
bridge+0.007
Neg logit contributions
ning-0.011
hr-0.011
 aware-0.011
aping-0.010
ing-0.010
Tokenmy
Token ID1820
Feature activation-3.52e-02

Pos logit contributions
 Cater+0.086
 harmony+0.082
hr+0.082
 group+0.078
 flock+0.074
Neg logit contributions
ibrarian-0.065
oll-0.065
 microwave-0.064
rost-0.063
 frost-0.062
Token said
Token ID531
Feature activation+1.05e-02

Pos logit contributions
hr+0.058
 harmony+0.055
 Cater+0.053
 group+0.053
run+0.052
Neg logit contributions
ibrarian-0.046
oll-0.042
 vows-0.042
affle-0.040
bridge-0.039
Token,
Token ID11
Feature activation-2.00e-02

Pos logit contributions
 vows+0.019
bridge+0.019
 throne+0.018
oll+0.018
rost+0.017
Neg logit contributions
ning-0.013
 beating-0.013
 aware-0.013
 group-0.012
hr-0.012
Token "
Token ID366
Feature activation-8.47e-03

Pos logit contributions
aping+0.030
ning+0.030
 group+0.030
aunted+0.030
hr+0.030
Neg logit contributions
bridge-0.028
 throne-0.027
 vows-0.027
 microwave-0.026
 menu-0.026
TokenDon
Token ID3987
Feature activation+9.38e-03

Pos logit contributions
ning+0.013
 group+0.013
 beating+0.013
 aware+0.012
run+0.012
Neg logit contributions
bridge-0.013
rost-0.012
 vows-0.011
oll-0.011
inner-0.011
Token just
Token ID655
Feature activation-1.65e-02

Pos logit contributions
 group+0.009
ning+0.008
 aware+0.008
aping+0.008
aunted+0.008
Neg logit contributions
oll-0.010
 vows-0.010
 menu-0.010
bridge-0.010
affle-0.010
Token go
Token ID467
Feature activation+6.15e-04

Pos logit contributions
ning+0.023
 beating+0.022
aping+0.021
 aware+0.021
 group+0.021
Neg logit contributions
 throne-0.028
bridge-0.028
 vows-0.027
oll-0.026
 menu-0.026
Token without
Token ID1231
Feature activation-7.75e-03

Pos logit contributions
bridge+0.001
 throne+0.001
 vows+0.001
affle+0.001
rost+0.001
Neg logit contributions
 beating-0.001
ning-0.001
 aware-0.001
 group-0.001
aping-0.001
Token it
Token ID340
Feature activation-3.95e-03

Pos logit contributions
aping+0.010
ning+0.010
ounce+0.009
 group+0.009
ing+0.009
Neg logit contributions
 throne-0.013
 vows-0.012
 fourteen-0.012
 menu-0.012
oll-0.012
Token?
Token ID30
Feature activation-1.88e-02

Pos logit contributions
aping+0.005
 aware+0.005
ning+0.005
hr+0.005
aunted+0.005
Neg logit contributions
 throne-0.006
 vows-0.006
bridge-0.006
oll-0.006
 menu-0.006
Token Mom
Token ID11254
Feature activation-5.40e-02

Pos logit contributions
 beating+0.033
ning+0.033
 aware+0.032
 group+0.031
hr+0.031
Neg logit contributions
bridge-0.024
 vows-0.023
 throne-0.023
oll-0.022
rost-0.021
Tokenmy
Token ID1820
Feature activation-2.87e-02

Pos logit contributions
hr+0.074
 group+0.072
 beating+0.071
 Cater+0.070
 aware+0.069
Neg logit contributions
oll-0.077
bridge-0.074
 throne-0.073
 microwave-0.073
affle-0.073
Token smiled
Token ID13541
Feature activation-2.30e-02

Pos logit contributions
ning+0.038
 beating+0.038
hr+0.038
 group+0.037
 aware+0.036
Neg logit contributions
 vows-0.045
bridge-0.043
ibrarian-0.043
 throne-0.041
oll-0.040
Token and
Token ID290
Feature activation-1.71e-03

Pos logit contributions
ning+0.032
hr+0.032
aping+0.031
 group+0.030
run+0.029
Neg logit contributions
bridge-0.037
 vows-0.036
 throne-0.036
oll-0.034
 menu-0.032
Token told
Token ID1297
Feature activation+4.80e-03

Pos logit contributions
ning+0.002
 beating+0.002
 group+0.002
aping+0.002
hr+0.002
Neg logit contributions
 vows-0.003
 throne-0.003
oll-0.003
bridge-0.003
ibrarian-0.002
Token her
Token ID607
Feature activation-6.45e-04

Pos logit contributions
 vows+0.006
 throne+0.006
bridge+0.006
rost+0.006
oll+0.006
Neg logit contributions
ning-0.009
 beating-0.008
 aware-0.008
 group-0.008
aping-0.008
Token,
Token ID11
Feature activation-1.73e-02

Pos logit contributions
 beating+0.014
ning+0.014
run+0.013
aping+0.013
 aware+0.013
Neg logit contributions
 throne-0.018
 vows-0.018
bridge-0.017
mm-0.016
rost-0.016
Token the
Token ID262
Feature activation-1.09e-02

Pos logit contributions
run+0.027
aping+0.026
ounce+0.026
 aware+0.025
 beating+0.025
Neg logit contributions
 throne-0.026
 vows-0.024
 wait-0.024
 cake-0.023
bridge-0.023
Token baby
Token ID5156
Feature activation-9.38e-03

Pos logit contributions
 beating+0.016
ning+0.015
 aware+0.015
 group+0.015
aping+0.014
Neg logit contributions
bridge-0.018
 throne-0.017
 vows-0.017
affle-0.017
rost-0.016
Token cried
Token ID16896
Feature activation-1.06e-02

Pos logit contributions
run+0.012
 beating+0.012
hr+0.012
ning+0.012
 aware+0.011
Neg logit contributions
bridge-0.016
 throne-0.016
 vows-0.015
oll-0.015
 microwave-0.015
Token.
Token ID13
Feature activation-9.56e-04

Pos logit contributions
hr+0.018
aping+0.017
ning+0.016
 group+0.016
ran+0.016
Neg logit contributions
oll-0.015
bridge-0.015
obb-0.015
 throne-0.014
 vows-0.014
Token Mom
Token ID11254
Feature activation-5.39e-02

Pos logit contributions
 beating+0.002
 aware+0.002
hr+0.002
ning+0.002
 group+0.002
Neg logit contributions
 vows-0.001
 throne-0.001
 microwave-0.001
bridge-0.001
rost-0.001
Tokenmy
Token ID1820
Feature activation-2.68e-02

Pos logit contributions
hr+0.077
 Cater+0.072
 aware+0.072
 beating+0.072
 harmony+0.071
Neg logit contributions
bridge-0.078
oll-0.077
 microwave-0.077
rost-0.076
ibrarian-0.075
Token heard
Token ID2982
Feature activation+1.72e-03

Pos logit contributions
hr+0.040
ning+0.039
 beating+0.038
 aware+0.038
 group+0.037
Neg logit contributions
ibrarian-0.039
bridge-0.039
oll-0.038
 throne-0.037
affle-0.036
Token the
Token ID262
Feature activation-1.79e-02

Pos logit contributions
bridge+0.003
 throne+0.003
 vows+0.003
 menu+0.003
rost+0.003
Neg logit contributions
ning-0.002
 beating-0.002
 aware-0.002
aping-0.002
aunted-0.002
Token cry
Token ID3960
Feature activation-3.93e-03

Pos logit contributions
 aware+0.029
run+0.029
aping+0.029
ning+0.029
 beating+0.028
Neg logit contributions
 throne-0.026
 vows-0.025
 menu-0.024
bridge-0.024
affle-0.023
Token and
Token ID290
Feature activation-1.83e-02

Pos logit contributions
hr+0.006
 aware+0.006
aping+0.006
ning+0.006
aunted+0.006
Neg logit contributions
bridge-0.005
oll-0.005
 throne-0.005
 vows-0.005
rost-0.005
Token like
Token ID588
Feature activation+2.81e-03

Pos logit contributions
aping+0.041
ning+0.036
rolling+0.035
aunted+0.035
ounce+0.035
Neg logit contributions
oll-0.032
 gorilla-0.031
 mine-0.030
 throne-0.029
affle-0.029
Token the
Token ID262
Feature activation+9.89e-03

Pos logit contributions
 throne+0.004
 vows+0.004
affle+0.004
rost+0.004
bridge+0.004
Neg logit contributions
aping-0.004
 aware-0.004
ning-0.004
aunted-0.004
broken-0.004
Token ignorant
Token ID20109
Feature activation+2.45e-02

Pos logit contributions
bridge+0.015
 throne+0.014
 vows+0.014
affle+0.013
 menu+0.013
Neg logit contributions
 beating-0.016
ning-0.016
 aware-0.016
aping-0.016
run-0.015
Token boy
Token ID2933
Feature activation+2.68e-03

Pos logit contributions
bridge+0.034
 vows+0.031
 throne+0.030
 menu+0.029
affle+0.029
Neg logit contributions
 aware-0.041
aping-0.041
ning-0.039
 beating-0.038
shadow-0.037
Token who
Token ID508
Feature activation-1.79e-03

Pos logit contributions
bridge+0.005
 throne+0.005
oll+0.005
 vows+0.005
rost+0.005
Neg logit contributions
 beating-0.003
 aware-0.003
ning-0.003
 group-0.003
aping-0.003
Token laughed
Token ID13818
Feature activation-5.39e-02

Pos logit contributions
 beating+0.003
ning+0.003
 group+0.003
 aware+0.003
aping+0.003
Neg logit contributions
bridge-0.002
affle-0.002
 vows-0.002
 throne-0.002
oll-0.002
Token at
Token ID379
Feature activation-2.59e-03

Pos logit contributions
hr+0.087
aping+0.083
ning+0.083
 aware+0.080
 beating+0.079
Neg logit contributions
bridge-0.083
oll-0.076
 throne-0.071
obb-0.070
 vows-0.068
Token me
Token ID502
Feature activation-8.80e-03

Pos logit contributions
 beating+0.004
 aware+0.004
ning+0.004
broken+0.004
hr+0.004
Neg logit contributions
 throne-0.004
rost-0.004
affle-0.004
 vows-0.003
 menu-0.003
Token when
Token ID618
Feature activation-9.12e-03

Pos logit contributions
 beating+0.013
 aware+0.012
ning+0.012
aping+0.012
 group+0.012
Neg logit contributions
 throne-0.014
bridge-0.014
oll-0.013
rost-0.013
 vows-0.013
Token I
Token ID314
Feature activation-9.81e-03

Pos logit contributions
ning+0.016
 beating+0.015
 aware+0.015
aping+0.015
hr+0.015
Neg logit contributions
bridge-0.012
 throne-0.012
 vows-0.012
affle-0.011
rost-0.011
Token fell
Token ID3214
Feature activation-3.38e-02

Pos logit contributions
ning+0.016
 beating+0.016
aping+0.015
 aware+0.015
ing+0.015
Neg logit contributions
 vows-0.014
affle-0.014
 throne-0.014
bridge-0.013
oll-0.013
Token her
Token ID607
Feature activation+6.02e-03

Pos logit contributions
bridge+0.008
 throne+0.007
 vows+0.007
affle+0.007
rost+0.007
Neg logit contributions
 beating-0.009
 aware-0.009
ning-0.009
 group-0.009
hr-0.008
Token his
Token ID465
Feature activation+1.38e-02

Pos logit contributions
bridge+0.012
 throne+0.012
 vows+0.012
affle+0.011
rost+0.011
Neg logit contributions
 beating-0.007
ning-0.007
 aware-0.006
 group-0.006
aping-0.006
Token spoiled
Token ID34781
Feature activation+2.64e-02

Pos logit contributions
bridge+0.021
 vows+0.019
 throne+0.019
oll+0.018
rost+0.018
Neg logit contributions
 group-0.022
 beating-0.022
 aware-0.022
ning-0.021
run-0.021
Token bucket
Token ID19236
Feature activation+7.57e-03

Pos logit contributions
bridge+0.033
 throne+0.032
 vows+0.031
affle+0.029
rost+0.029
Neg logit contributions
 beating-0.049
ning-0.048
 aware-0.048
 group-0.047
aping-0.046
Token.
Token ID13
Feature activation-4.61e-03

Pos logit contributions
bridge+0.016
 throne+0.015
 vows+0.015
rost+0.014
affle+0.014
Neg logit contributions
 beating-0.008
 aware-0.008
ning-0.008
 group-0.007
hr-0.007
Token Mom
Token ID11254
Feature activation-5.39e-02

Pos logit contributions
 beating+0.007
 aware+0.007
ning+0.007
 group+0.006
hr+0.006
Neg logit contributions
bridge-0.007
 throne-0.007
 vows-0.007
rost-0.007
affle-0.007
Tokenmy
Token ID1820
Feature activation-3.00e-02

Pos logit contributions
 harmony+0.079
 Cater+0.078
hr+0.077
 beating+0.072
 group+0.072
Neg logit contributions
ibrarian-0.076
oll-0.074
 microwave-0.074
bridge-0.073
affle-0.072
Token said
Token ID531
Feature activation+6.63e-03

Pos logit contributions
 beating+0.042
ning+0.041
hr+0.041
run+0.040
 aware+0.039
Neg logit contributions
ibrarian-0.046
 vows-0.045
bridge-0.044
oll-0.042
 throne-0.042
Token not
Token ID407
Feature activation-2.62e-03

Pos logit contributions
bridge+0.010
 throne+0.010
 vows+0.010
affle+0.009
rost+0.009
Neg logit contributions
 beating-0.010
ning-0.010
 aware-0.010
 group-0.010
aping-0.009
Token to
Token ID284
Feature activation+6.15e-04

Pos logit contributions
ning+0.003
 beating+0.003
aping+0.003
 group+0.003
 aware+0.002
Neg logit contributions
 throne-0.005
bridge-0.005
 vows-0.005
rost-0.005
 menu-0.005
Token worry
Token ID5490
Feature activation+1.45e-02

Pos logit contributions
 throne+0.001
rost+0.001
 vows+0.001
bridge+0.001
 anniversary+0.001
Neg logit contributions
ning-0.001
 beating-0.000
ing-0.000
 aware-0.000
aping-0.000
Token,
Token ID11
Feature activation-5.45e-03

Pos logit contributions
bridge+0.013
 throne+0.013
 vows+0.013
affle+0.013
rost+0.013
Neg logit contributions
 beating-0.006
ning-0.006
 aware-0.006
 group-0.005
aping-0.005
Token a
Token ID257
Feature activation+5.67e-03

Pos logit contributions
 beating+0.005
ning+0.005
 aware+0.005
 group+0.005
aping+0.004
Neg logit contributions
bridge-0.012
 throne-0.012
 vows-0.012
affle-0.011
rost-0.011
Token little
Token ID1310
Feature activation-2.14e-03

Pos logit contributions
bridge+0.012
 throne+0.012
 vows+0.012
affle+0.011
rost+0.011
Neg logit contributions
ning-0.005
 beating-0.005
 aware-0.005
 group-0.005
aping-0.005
Token girl
Token ID2576
Feature activation+7.37e-03

Pos logit contributions
 beating+0.003
ning+0.003
 aware+0.003
hr+0.003
aping+0.003
Neg logit contributions
 throne-0.004
bridge-0.004
affle-0.004
 vows-0.004
rost-0.004
Token was
Token ID373
Feature activation-4.18e-03

Pos logit contributions
bridge+0.012
 throne+0.012
 vows+0.012
affle+0.011
rost+0.011
Neg logit contributions
 beating-0.010
ning-0.010
 aware-0.010
 group-0.009
aping-0.009
Token sitting
Token ID5586
Feature activation-5.37e-02

Pos logit contributions
ning+0.004
 beating+0.004
 aware+0.004
aping+0.004
 group+0.004
Neg logit contributions
bridge-0.008
 throne-0.008
oll-0.008
 vows-0.008
affle-0.008
Token on
Token ID319
Feature activation+8.43e-03

Pos logit contributions
 beating+0.084
run+0.080
ning+0.079
hr+0.076
 group+0.074
Neg logit contributions
bridge-0.083
 throne-0.079
oll-0.075
affle-0.074
obb-0.071
Token a
Token ID257
Feature activation+1.98e-02

Pos logit contributions
bridge+0.021
 vows+0.020
affle+0.020
 throne+0.020
rost+0.020
Neg logit contributions
ning-0.005
 beating-0.005
 group-0.005
 aware-0.005
hr-0.004
Token yacht
Token ID43207
Feature activation-4.92e-03

Pos logit contributions
bridge+0.032
 vows+0.030
oll+0.029
 fourteen+0.028
mm+0.028
Neg logit contributions
 group-0.033
 beating-0.031
ning-0.029
 aware-0.029
 wise-0.029
Token.
Token ID13
Feature activation+1.09e-02

Pos logit contributions
 aware+0.005
 beating+0.005
ning+0.005
run+0.005
aping+0.005
Neg logit contributions
 throne-0.010
bridge-0.010
 menu-0.009
 vows-0.009
rost-0.009
Token She
Token ID1375
Feature activation+1.48e-02

Pos logit contributions
 throne+0.014
bridge+0.013
 vows+0.013
 menu+0.013
rost+0.012
Neg logit contributions
ning-0.020
 aware-0.019
 beating-0.019
aping-0.018
hr-0.018
Token mom
Token ID1995
Feature activation-3.19e-02

Pos logit contributions
bridge+0.008
 throne+0.008
 vows+0.007
affle+0.007
rost+0.007
Neg logit contributions
 beating-0.006
ning-0.006
 aware-0.005
 group-0.005
aping-0.005
Tokenmy
Token ID1820
Feature activation-1.61e-02

Pos logit contributions
 aware+0.032
 beating+0.032
hr+0.032
ning+0.030
 group+0.030
Neg logit contributions
 throne-0.060
oll-0.059
rost-0.059
bridge-0.058
 microwave-0.055
Token tightly
Token ID17707
Feature activation-5.12e-03

Pos logit contributions
hr+0.023
ning+0.023
 aware+0.023
 beating+0.023
 group+0.022
Neg logit contributions
 throne-0.025
oll-0.024
 vows-0.024
bridge-0.023
 menu-0.023
Token.
Token ID13
Feature activation+1.03e-02

Pos logit contributions
ning+0.007
hr+0.007
 aware+0.006
aping+0.006
 group+0.006
Neg logit contributions
 throne-0.009
 vows-0.009
bridge-0.009
 menu-0.008
oll-0.008
Token Her
Token ID2332
Feature activation+6.38e-03

Pos logit contributions
bridge+0.017
 throne+0.017
 vows+0.016
affle+0.016
rost+0.016
Neg logit contributions
 beating-0.015
ning-0.014
 aware-0.014
 group-0.014
aping-0.013
Token mom
Token ID1995
Feature activation-5.36e-02

Pos logit contributions
bridge+0.013
 throne+0.012
 vows+0.012
rost+0.011
affle+0.011
Neg logit contributions
 beating-0.008
 group-0.007
ning-0.007
 aware-0.007
aping-0.007
Tokenmy
Token ID1820
Feature activation-3.65e-02

Pos logit contributions
 Cater+0.080
hr+0.075
 beating+0.073
 aware+0.073
 situations+0.072
Neg logit contributions
bridge-0.063
 microwave-0.061
oll-0.060
 frost-0.059
 throne-0.059
Token told
Token ID1297
Feature activation+5.52e-03

Pos logit contributions
run+0.060
hr+0.058
ning+0.058
 beating+0.055
 Cater+0.055
Neg logit contributions
bridge-0.041
 vows-0.041
ibrarian-0.041
 microwave-0.038
 throne-0.038
Token her
Token ID607
Feature activation+9.44e-03

Pos logit contributions
bridge+0.010
 throne+0.010
 vows+0.009
affle+0.009
rost+0.009
Neg logit contributions
 beating-0.007
ning-0.007
 aware-0.007
 group-0.007
aping-0.007
Token that
Token ID326
Feature activation-1.78e-02

Pos logit contributions
bridge+0.017
 throne+0.016
affle+0.016
 vows+0.016
rost+0.016
Neg logit contributions
 beating-0.012
 aware-0.012
ning-0.012
 group-0.012
ing-0.011
Token sometimes
Token ID3360
Feature activation-2.10e-02

Pos logit contributions
run+0.025
 beating+0.025
 aware+0.025
ning+0.025
 group+0.025
Neg logit contributions
bridge-0.026
 throne-0.026
rost-0.025
 vows-0.025
oll-0.025
Tokenelly
Token ID6148
Feature activation+5.33e-03

Pos logit contributions
ning+0.021
 beating+0.021
 aware+0.020
 group+0.020
aping+0.019
Neg logit contributions
bridge-0.024
 throne-0.024
 vows-0.024
affle-0.022
rost-0.022
Token?"
Token ID1701
Feature activation-1.76e-02

Pos logit contributions
bridge+0.008
 throne+0.008
affle+0.008
 vows+0.008
 menu+0.008
Neg logit contributions
 beating-0.008
 aware-0.008
ning-0.007
aping-0.007
 group-0.007
Token
Token ID198
Feature activation-5.52e-03

Pos logit contributions
 beating+0.027
ning+0.026
 aware+0.026
aping+0.025
broken+0.024
Neg logit contributions
 throne-0.026
bridge-0.026
rost-0.026
 microwave-0.025
affle-0.025
Token
Token ID198
Feature activation-4.27e-03

Pos logit contributions
ning+0.009
 beating+0.009
 aware+0.008
hr+0.008
 group+0.008
Neg logit contributions
rost-0.008
bridge-0.008
affle-0.008
ibrarian-0.008
 throne-0.007
TokenThe
Token ID464
Feature activation-9.81e-03

Pos logit contributions
ning+0.007
 beating+0.006
 aware+0.006
ing+0.006
 group+0.006
Neg logit contributions
bridge-0.007
rost-0.006
affle-0.006
 throne-0.006
 vows-0.006
Token c
Token ID269
Feature activation+4.89e-02

Pos logit contributions
 beating+0.014
ning+0.014
 aware+0.014
 group+0.014
hr+0.013
Neg logit contributions
bridge-0.016
 throne-0.015
 vows-0.015
rost-0.014
affle-0.014
Tokenactus
Token ID34144
Feature activation+1.42e-03

Pos logit contributions
bridge+0.047
 throne+0.043
 vows+0.041
rost+0.039
 piping+0.038
Neg logit contributions
 aware-0.104
ning-0.103
ing-0.103
 beating-0.103
 group-0.100
Token answered
Token ID9373
Feature activation-6.73e-03

Pos logit contributions
bridge+0.003
 throne+0.003
 vows+0.003
affle+0.002
oll+0.002
Neg logit contributions
 beating-0.002
ning-0.002
 aware-0.002
aping-0.002
 group-0.002
Token,
Token ID11
Feature activation-1.35e-02

Pos logit contributions
hr+0.009
 group+0.009
 aware+0.009
aping+0.009
 beating+0.009
Neg logit contributions
 vows-0.011
bridge-0.011
 throne-0.011
oll-0.010
rost-0.010
Token "
Token ID366
Feature activation+1.09e-03

Pos logit contributions
aping+0.018
ning+0.018
 group+0.017
hr+0.017
 aware+0.017
Neg logit contributions
 throne-0.022
bridge-0.022
 vows-0.021
 menu-0.021
 fourteen-0.021
TokenI
Token ID40
Feature activation+1.42e-02

Pos logit contributions
bridge+0.002
 throne+0.002
 vows+0.002
affle+0.002
rost+0.002
Neg logit contributions
 beating-0.002
ning-0.002
 aware-0.002
 group-0.001
aping-0.001
Token a
Token ID257
Feature activation+1.16e-02

Pos logit contributions
 beating+0.003
ning+0.003
 aware+0.003
 group+0.003
aping+0.003
Neg logit contributions
bridge-0.007
 throne-0.007
 vows-0.007
affle-0.007
rost-0.007
Token sandwich
Token ID20433
Feature activation+6.44e-03

Pos logit contributions
bridge+0.025
rost+0.024
oll+0.023
 vows+0.023
 throne+0.023
Neg logit contributions
 group-0.012
 beating-0.011
ning-0.010
 wise-0.010
 kind-0.010
Token.
Token ID13
Feature activation+2.39e-02

Pos logit contributions
bridge+0.015
 vows+0.015
 throne+0.015
rost+0.015
affle+0.015
Neg logit contributions
ning-0.004
 beating-0.004
 group-0.004
 aware-0.004
ing-0.004
Token
Token ID198
Feature activation+1.23e-02

Pos logit contributions
bridge+0.039
 vows+0.037
 throne+0.037
affle+0.035
obb+0.034
Neg logit contributions
run-0.034
ning-0.034
 group-0.033
 beating-0.033
 aware-0.033
Token
Token ID198
Feature activation+1.77e-02

Pos logit contributions
bridge+0.029
 throne+0.028
 vows+0.028
affle+0.027
oll+0.027
Neg logit contributions
 beating-0.009
ning-0.009
 aware-0.009
 group-0.008
aping-0.008
TokenS
Token ID50
Feature activation+5.07e-02

Pos logit contributions
 throne+0.038
bridge+0.037
 vows+0.037
affle+0.035
oll+0.035
Neg logit contributions
 aware-0.017
 beating-0.017
 group-0.016
ing-0.016
ning-0.016
Tokenara
Token ID3301
Feature activation+5.24e-03

Pos logit contributions
bridge+0.071
 throne+0.068
 vows+0.062
 piping+0.053
pter+0.052
Neg logit contributions
ing-0.091
ning-0.089
 aware-0.083
aping-0.082
 group-0.082
Token wants
Token ID3382
Feature activation+1.26e-02

Pos logit contributions
bridge+0.010
 throne+0.009
 vows+0.009
affle+0.008
rost+0.008
Neg logit contributions
 aware-0.007
 group-0.006
ning-0.006
 beating-0.006
 wise-0.006
Token to
Token ID284
Feature activation+7.18e-05

Pos logit contributions
bridge+0.029
 throne+0.029
 vows+0.028
affle+0.028
rost+0.027
Neg logit contributions
 beating-0.010
ning-0.009
 aware-0.009
 group-0.009
aping-0.008
Token kick
Token ID4829
Feature activation-2.72e-04

Pos logit contributions
bridge+0.000
 throne+0.000
 vows+0.000
oll+0.000
affle+0.000
Neg logit contributions
 group-0.000
 beating-0.000
ning-0.000
 aware-0.000
hr-0.000
Token the
Token ID262
Feature activation+1.11e-02

Pos logit contributions
 beating+0.000
 aware+0.000
ning+0.000
aping+0.000
hr+0.000
Neg logit contributions
bridge-0.001
 throne-0.001
 vows-0.001
affle-0.001
rost-0.000
Token hero
Token ID4293
Feature activation+2.85e-03

Pos logit contributions
 beating+0.005
ning+0.005
 group+0.005
 aware+0.005
hr+0.005
Neg logit contributions
bridge-0.005
 throne-0.004
 vows-0.004
affle-0.004
rost-0.004
Token.
Token ID13
Feature activation+2.15e-02

Pos logit contributions
bridge+0.005
 throne+0.005
 vows+0.005
oll+0.005
affle+0.004
Neg logit contributions
 aware-0.004
ning-0.004
aping-0.003
 beating-0.003
hr-0.003
Token He
Token ID679
Feature activation+3.67e-03

Pos logit contributions
bridge+0.038
 throne+0.037
affle+0.036
 vows+0.036
oll+0.035
Neg logit contributions
 aware-0.028
ning-0.028
 beating-0.028
 group-0.027
aping-0.026
Token could
Token ID714
Feature activation+7.54e-03

Pos logit contributions
 vows+0.006
 throne+0.006
bridge+0.006
affle+0.005
oll+0.005
Neg logit contributions
ning-0.005
aping-0.005
 aware-0.005
 beating-0.005
ing-0.005
Token barely
Token ID8523
Feature activation-2.33e-03

Pos logit contributions
 throne+0.012
bridge+0.012
 vows+0.012
affle+0.011
rost+0.011
Neg logit contributions
ning-0.011
 beating-0.011
 aware-0.011
aping-0.010
 group-0.010
Token contain
Token ID3994
Feature activation+4.75e-02

Pos logit contributions
ning+0.004
 beating+0.004
 aware+0.003
aping+0.003
 group+0.003
Neg logit contributions
bridge-0.004
 throne-0.004
 vows-0.003
rost-0.003
affle-0.003
Token his
Token ID465
Feature activation-1.13e-03

Pos logit contributions
bridge+0.074
 throne+0.072
 vows+0.070
affle+0.067
rost+0.066
Neg logit contributions
 beating-0.073
ning-0.072
 aware-0.072
 group-0.070
hr-0.068
Token pride
Token ID11293
Feature activation-5.39e-03

Pos logit contributions
ning+0.002
 beating+0.001
 aware+0.001
aping+0.001
ing+0.001
Neg logit contributions
 vows-0.002
 throne-0.002
bridge-0.002
affle-0.002
 menu-0.002
Token.
Token ID13
Feature activation+1.92e-02

Pos logit contributions
aping+0.007
ning+0.007
 aware+0.007
 group+0.006
aunted+0.006
Neg logit contributions
bridge-0.009
 throne-0.009
 vows-0.009
oll-0.008
 menu-0.008
Token From
Token ID3574
Feature activation+1.23e-02

Pos logit contributions
bridge+0.034
 throne+0.034
 vows+0.033
affle+0.032
oll+0.031
Neg logit contributions
 aware-0.024
ning-0.024
 beating-0.024
 group-0.023
aping-0.023
Token then
Token ID788
Feature activation+1.13e-02

Pos logit contributions
bridge+0.018
 throne+0.018
 vows+0.017
affle+0.017
rost+0.016
Neg logit contributions
 beating-0.019
ning-0.019
 aware-0.019
 group-0.019
aping-0.018
Token said
Token ID531
Feature activation+1.43e-02

Pos logit contributions
 vows+0.010
 throne+0.010
bridge+0.010
 fries+0.009
oll+0.009
Neg logit contributions
 beating-0.009
ning-0.008
aping-0.008
run-0.008
 aware-0.008
Token,
Token ID11
Feature activation-7.02e-03

Pos logit contributions
bridge+0.027
 throne+0.027
 vows+0.026
affle+0.025
oll+0.025
Neg logit contributions
ning-0.017
 beating-0.017
 aware-0.016
 group-0.016
aping-0.015
Token �
Token ID6184
Feature activation+4.13e-02

Pos logit contributions
ning+0.009
aping+0.009
 aware+0.009
 group+0.009
aunted+0.009
Neg logit contributions
bridge-0.011
 throne-0.011
 vows-0.011
 menu-0.011
 fourteen-0.010
Token�
Token ID95
Feature activation+3.04e-02

Pos logit contributions
bridge+0.064
 vows+0.061
 piping+0.061
 throne+0.060
oll+0.056
Neg logit contributions
 aware-0.061
hr-0.059
 wise-0.059
 beating-0.058
aping-0.058
Token€
Token ID26391
Feature activation+8.17e-03

Pos logit contributions
 throne+0.047
 vows+0.046
bridge+0.046
oll+0.043
affle+0.042
Neg logit contributions
 aware-0.045
 beating-0.045
ning-0.045
 group-0.043
hr-0.043
Token�
Token ID129
Feature activation+4.19e-02

Pos logit contributions
bridge+0.014
 throne+0.014
 vows+0.013
mm+0.012
pter+0.012
Neg logit contributions
ning-0.011
 beating-0.011
 aware-0.011
run-0.010
hr-0.010
Token�
Token ID241
Feature activation+3.00e-02

Pos logit contributions
bridge+0.062
 throne+0.061
 vows+0.059
oll+0.057
 piping+0.055
Neg logit contributions
 beating-0.069
 aware-0.069
ning-0.067
 group-0.062
aping-0.062
TokenExpl
Token ID18438
Feature activation+8.26e-03

Pos logit contributions
 throne+0.053
bridge+0.051
 vows+0.051
affle+0.049
oll+0.048
Neg logit contributions
 beating-0.040
 aware-0.039
ning-0.039
 group-0.038
hr-0.037
Tokenoring
Token ID3255
Feature activation+7.71e-03

Pos logit contributions
bridge+0.014
 throne+0.014
 vows+0.014
affle+0.013
rost+0.013
Neg logit contributions
 beating-0.011
ning-0.011
 aware-0.011
 group-0.010
aping-0.010
Token is
Token ID318
Feature activation-3.11e-03

Pos logit contributions
 vows+0.012
bridge+0.011
affle+0.011
 throne+0.011
 menu+0.010
Neg logit contributions
ning-0.012
aping-0.012
 beating-0.012
 aware-0.012
aunted-0.012
Token great
Token ID1049
Feature activation-5.80e-03

Pos logit contributions
 beating+0.004
aping+0.004
ning+0.004
 group+0.004
broken+0.004
Neg logit contributions
bridge-0.005
 vows-0.005
oll-0.005
 throne-0.005
 fourteen-0.005
Token said
Token ID531
Feature activation+1.04e-02

Pos logit contributions
 beating+0.020
ning+0.020
 aware+0.019
 group+0.018
hr+0.018
Neg logit contributions
 throne-0.027
bridge-0.026
 vows-0.026
oll-0.025
ibrarian-0.025
Token no
Token ID645
Feature activation-2.10e-03

Pos logit contributions
bridge+0.020
 throne+0.019
 vows+0.019
rost+0.018
oll+0.018
Neg logit contributions
ning-0.012
 beating-0.012
 aware-0.012
 group-0.011
aping-0.011
Token,
Token ID11
Feature activation-1.09e-05

Pos logit contributions
ning+0.003
 beating+0.003
 aware+0.003
aping+0.002
 group+0.002
Neg logit contributions
rost-0.003
 throne-0.003
affle-0.003
oll-0.003
 vows-0.003
Token but
Token ID475
Feature activation-1.61e-02

Pos logit contributions
ning+0.000
 beating+0.000
 aware+0.000
aping+0.000
 group+0.000
Neg logit contributions
bridge-0.000
 throne-0.000
 vows-0.000
rost-0.000
oll-0.000
Token the
Token ID262
Feature activation+1.16e-02

Pos logit contributions
ning+0.026
 beating+0.026
 aware+0.025
aping+0.025
 group+0.024
Neg logit contributions
 throne-0.023
 vows-0.022
bridge-0.022
rost-0.021
oll-0.021
Token bad
Token ID2089
Feature activation+5.02e-02

Pos logit contributions
bridge+0.017
 throne+0.017
 vows+0.016
affle+0.016
rost+0.016
Neg logit contributions
 group-0.018
 beating-0.018
 aware-0.018
ning-0.018
hr-0.017
Token person
Token ID1048
Feature activation+2.90e-03

Pos logit contributions
bridge+0.086
 throne+0.081
affle+0.079
 vows+0.077
oll+0.076
Neg logit contributions
 group-0.070
 aware-0.068
 beating-0.067
ning-0.065
 wise-0.065
Token didn
Token ID1422
Feature activation+2.84e-02

Pos logit contributions
bridge+0.005
 vows+0.005
 throne+0.005
oll+0.005
affle+0.005
Neg logit contributions
 beating-0.004
ning-0.004
 aware-0.004
 group-0.004
hr-0.004
Token't
Token ID470
Feature activation+1.20e-02

Pos logit contributions
 throne+0.056
bridge+0.054
 vows+0.053
affle+0.051
rost+0.051
Neg logit contributions
 beating-0.034
ning-0.033
ing-0.031
 group-0.031
 aware-0.030
Token listen
Token ID6004
Feature activation+9.48e-03

Pos logit contributions
bridge+0.019
 throne+0.019
 vows+0.018
affle+0.018
rost+0.018
Neg logit contributions
ning-0.018
 beating-0.018
 aware-0.017
 group-0.017
aping-0.016
Token.
Token ID13
Feature activation-2.02e-03

Pos logit contributions
 vows+0.016
 throne+0.016
bridge+0.016
rost+0.015
 menu+0.015
Neg logit contributions
ning-0.012
 group-0.012
aping-0.011
 aware-0.011
hr-0.011
Token and
Token ID290
Feature activation-7.03e-03

Pos logit contributions
ning+0.004
aping+0.004
ing+0.004
aunted+0.004
 beating+0.004
Neg logit contributions
 throne-0.005
bridge-0.005
 vows-0.005
 menu-0.005
oll-0.004
Token he
Token ID339
Feature activation-3.99e-04

Pos logit contributions
ning+0.012
 beating+0.012
aping+0.011
 aware+0.011
hr+0.011
Neg logit contributions
 throne-0.009
 vows-0.009
bridge-0.009
oll-0.008
 menu-0.008
Token was
Token ID373
Feature activation-4.51e-05

Pos logit contributions
ning+0.001
aping+0.001
 aware+0.001
 beating+0.001
ing+0.001
Neg logit contributions
 vows-0.001
 throne-0.001
bridge-0.001
affle-0.001
oll-0.001
Token so
Token ID523
Feature activation+6.15e-04

Pos logit contributions
ning+0.000
aping+0.000
ing+0.000
aunted+0.000
 beating+0.000
Neg logit contributions
bridge-0.000
 throne-0.000
 vows-0.000
oll-0.000
affle-0.000
Token happy
Token ID3772
Feature activation-5.35e-03

Pos logit contributions
 throne+0.001
bridge+0.001
 vows+0.001
oll+0.001
 menu+0.001
Neg logit contributions
ning-0.001
 beating-0.001
aping-0.001
ing-0.001
aunted-0.001
Token to
Token ID284
Feature activation+1.39e-02

Pos logit contributions
 beating+0.007
ning+0.007
 aware+0.007
aping+0.006
 group+0.006
Neg logit contributions
 throne-0.009
bridge-0.009
 vows-0.009
 menu-0.009
oll-0.008
Token see
Token ID766
Feature activation+1.10e-02

Pos logit contributions
bridge+0.025
 throne+0.024
 vows+0.024
affle+0.024
rost+0.024
Neg logit contributions
 beating-0.018
 group-0.017
 aware-0.017
ning-0.016
hr-0.016
Token her
Token ID607
Feature activation+1.42e-03

Pos logit contributions
bridge+0.016
 throne+0.016
 vows+0.015
oll+0.014
 menu+0.014
Neg logit contributions
 beating-0.018
ning-0.018
 aware-0.017
broken-0.017
aping-0.017
Token.
Token ID13
Feature activation+6.02e-03

Pos logit contributions
 throne+0.003
 vows+0.003
bridge+0.002
 menu+0.002
affle+0.002
Neg logit contributions
ning-0.002
aping-0.002
 beating-0.002
 aware-0.002
aunted-0.002
Token They
Token ID1119
Feature activation-2.92e-03

Pos logit contributions
bridge+0.010
 throne+0.010
 vows+0.010
affle+0.009
rost+0.009
Neg logit contributions
 beating-0.008
ning-0.008
 aware-0.008
 group-0.008
aping-0.008
Token hugged
Token ID41130
Feature activation+6.46e-03

Pos logit contributions
ning+0.004
 beating+0.004
 aware+0.004
 group+0.004
aping+0.004
Neg logit contributions
bridge-0.005
 throne-0.005
 vows-0.005
affle-0.005
oll-0.005
Token says
Token ID1139
Feature activation-1.04e-02

Pos logit contributions
 beating+0.010
 aware+0.010
ning+0.010
 group+0.010
aping+0.009
Neg logit contributions
bridge-0.008
 throne-0.007
 vows-0.007
affle-0.007
oll-0.007
Token.
Token ID13
Feature activation-8.99e-03

Pos logit contributions
ning+0.015
 group+0.015
hr+0.015
 aware+0.015
aping+0.014
Neg logit contributions
bridge-0.016
 throne-0.016
oll-0.016
rost-0.015
 vows-0.015
Token
Token ID198
Feature activation-8.49e-03

Pos logit contributions
 aware+0.013
ning+0.013
hr+0.013
 beating+0.013
broken+0.012
Neg logit contributions
bridge-0.015
rost-0.013
 throne-0.013
 microwave-0.013
affle-0.013
Token
Token ID198
Feature activation-2.34e-03

Pos logit contributions
ning+0.014
 beating+0.013
hr+0.013
 aware+0.013
broken+0.012
Neg logit contributions
rost-0.012
bridge-0.012
affle-0.011
oll-0.011
ibrarian-0.011
TokenSam
Token ID16305
Feature activation-3.00e-03

Pos logit contributions
ning+0.003
 beating+0.003
hr+0.003
 aware+0.003
 group+0.003
Neg logit contributions
bridge-0.004
rost-0.004
 throne-0.003
affle-0.003
 fourteen-0.003
Token does
Token ID857
Feature activation+3.53e-02

Pos logit contributions
ning+0.004
 beating+0.004
 aware+0.004
 group+0.004
aping+0.004
Neg logit contributions
bridge-0.005
 throne-0.005
 vows-0.005
rost-0.004
affle-0.004
Token not
Token ID407
Feature activation+2.20e-02

Pos logit contributions
bridge+0.067
 throne+0.066
 vows+0.064
affle+0.063
oll+0.062
Neg logit contributions
 beating-0.044
 aware-0.042
hr-0.039
ning-0.039
 group-0.039
Token go
Token ID467
Feature activation-6.53e-04

Pos logit contributions
bridge+0.038
 throne+0.038
 vows+0.037
affle+0.035
oll+0.035
Neg logit contributions
 aware-0.030
 beating-0.029
 group-0.028
ning-0.027
hr-0.027
Token away
Token ID1497
Feature activation+1.67e-03

Pos logit contributions
 beating+0.001
ning+0.001
 aware+0.001
 group+0.001
aping+0.001
Neg logit contributions
bridge-0.001
 throne-0.001
 vows-0.001
affle-0.001
rost-0.001
Token.
Token ID13
Feature activation+1.18e-02

Pos logit contributions
 throne+0.003
rost+0.003
bridge+0.003
 vows+0.003
 menu+0.003
Neg logit contributions
ning-0.002
 beating-0.002
 aware-0.002
aping-0.002
 group-0.002
Token He
Token ID679
Feature activation+2.57e-02

Pos logit contributions
bridge+0.016
 throne+0.015
 vows+0.015
affle+0.014
rost+0.014
Neg logit contributions
 beating-0.021
ning-0.020
 aware-0.020
 group-0.020
aping-0.019
Token his
Token ID465
Feature activation+3.03e-04

Pos logit contributions
ning+0.014
 aware+0.014
 beating+0.013
aping+0.013
ing+0.012
Neg logit contributions
 throne-0.026
bridge-0.026
 menu-0.025
 vows-0.025
rost-0.024
Token front
Token ID2166
Feature activation+2.77e-03

Pos logit contributions
 throne+0.000
bridge+0.000
 vows+0.000
 menu+0.000
affle+0.000
Neg logit contributions
 beating-0.000
ning-0.000
 aware-0.000
aping-0.000
hr-0.000
Token yard
Token ID12699
Feature activation-5.78e-03

Pos logit contributions
 throne+0.004
bridge+0.004
 vows+0.004
 menu+0.004
rost+0.003
Neg logit contributions
ning-0.005
 aware-0.005
 beating-0.005
aping-0.004
hr-0.004
Token.
Token ID13
Feature activation+1.77e-02

Pos logit contributions
 aware+0.007
 beating+0.007
hr+0.007
aping+0.007
ning+0.007
Neg logit contributions
 throne-0.010
bridge-0.010
 menu-0.010
 vows-0.010
 freezer-0.009
Token Jack
Token ID3619
Feature activation-1.24e-02

Pos logit contributions
bridge+0.030
 throne+0.030
 vows+0.029
affle+0.028
rost+0.028
Neg logit contributions
 beating-0.024
ning-0.024
 aware-0.024
 group-0.023
hr-0.022
Token thought
Token ID1807
Feature activation+2.59e-02

Pos logit contributions
ning+0.018
hr+0.018
run+0.018
 aware+0.018
aping+0.017
Neg logit contributions
 vows-0.017
 throne-0.016
 menu-0.016
affle-0.015
ibrarian-0.015
Token the
Token ID262
Feature activation+2.70e-03

Pos logit contributions
bridge+0.043
 throne+0.042
 vows+0.041
affle+0.040
rost+0.039
Neg logit contributions
 beating-0.037
ning-0.037
 aware-0.036
 group-0.035
aping-0.034
Token neighbor
Token ID4780
Feature activation-5.43e-03

Pos logit contributions
bridge+0.005
 throne+0.005
 vows+0.005
affle+0.005
rost+0.005
Neg logit contributions
 group-0.003
 beating-0.003
ning-0.003
 aware-0.003
 wise-0.003
Token was
Token ID373
Feature activation-6.05e-03

Pos logit contributions
 aware+0.007
hr+0.007
 beating+0.007
ning+0.007
aping+0.007
Neg logit contributions
 throne-0.009
 vows-0.009
bridge-0.009
oll-0.009
affle-0.009
Token so
Token ID523
Feature activation+1.45e-02

Pos logit contributions
 aware+0.007
aping+0.007
 beating+0.007
ning+0.007
run+0.007
Neg logit contributions
bridge-0.011
 throne-0.011
 menu-0.010
 vows-0.010
affle-0.010
Token adorable
Token ID29012
Feature activation+9.09e-03

Pos logit contributions
bridge+0.030
 throne+0.029
 vows+0.029
affle+0.028
rost+0.028
Neg logit contributions
 beating-0.015
 aware-0.015
 group-0.015
ning-0.014
 wise-0.013
Token to
Token ID284
Feature activation-1.09e-02

Pos logit contributions
 throne+0.004
 vows+0.004
 menu+0.004
bridge+0.004
rost+0.003
Neg logit contributions
 aware-0.003
aping-0.003
ning-0.003
 group-0.003
ised-0.003
Token melt
Token ID16867
Feature activation-1.15e-02

Pos logit contributions
aping+0.018
ning+0.018
run+0.018
 aware+0.018
aunted+0.018
Neg logit contributions
 throne-0.016
bridge-0.015
rost-0.014
 menu-0.013
 vows-0.013
Token.
Token ID13
Feature activation+1.06e-02

Pos logit contributions
 aware+0.012
 beating+0.012
 group+0.012
ning+0.012
aping+0.011
Neg logit contributions
bridge-0.023
 throne-0.022
 vows-0.022
affle-0.021
rost-0.021
Token She
Token ID1375
Feature activation+3.65e-03

Pos logit contributions
bridge+0.019
 throne+0.019
 vows+0.018
affle+0.018
rost+0.018
Neg logit contributions
 beating-0.013
ning-0.013
 aware-0.013
 group-0.013
hr-0.012
Token quickly
Token ID2952
Feature activation+2.49e-02

Pos logit contributions
 throne+0.006
 vows+0.006
bridge+0.006
affle+0.006
rost+0.006
Neg logit contributions
ning-0.005
 aware-0.005
 beating-0.004
aping-0.004
 group-0.004
Token ate
Token ID15063
Feature activation+2.09e-02

Pos logit contributions
bridge+0.038
 throne+0.038
 vows+0.037
affle+0.035
rost+0.035
Neg logit contributions
ning-0.038
 beating-0.037
 aware-0.036
 group-0.036
aping-0.035
Token it
Token ID340
Feature activation+1.26e-02

Pos logit contributions
bridge+0.043
 throne+0.043
 vows+0.042
affle+0.040
rost+0.039
Neg logit contributions
ning-0.022
 beating-0.021
 aware-0.020
ing-0.019
 group-0.018
Token all
Token ID477
Feature activation+5.90e-03

Pos logit contributions
 throne+0.021
bridge+0.020
oll+0.020
 vows+0.020
 menu+0.019
Neg logit contributions
 aware-0.017
hr-0.017
aping-0.016
 group-0.016
ning-0.016
Token up
Token ID510
Feature activation+1.78e-02

Pos logit contributions
bridge+0.013
 throne+0.013
 vows+0.012
affle+0.012
rost+0.012
Neg logit contributions
ning-0.006
 beating-0.005
 aware-0.005
 group-0.005
ing-0.005
Token before
Token ID878
Feature activation+4.09e-03

Pos logit contributions
bridge+0.037
 throne+0.036
 vows+0.035
affle+0.034
rost+0.034
Neg logit contributions
 beating-0.018
ning-0.018
 aware-0.018
 group-0.017
aping-0.016
Token it
Token ID340
Feature activation+1.12e-02

Pos logit contributions
bridge+0.007
 throne+0.007
 vows+0.007
affle+0.007
rost+0.007
Neg logit contributions
 beating-0.005
ning-0.005
 aware-0.005
 group-0.005
hr-0.005
Token the
Token ID262
Feature activation-2.99e-03

Pos logit contributions
bridge+0.021
rost+0.021
 throne+0.021
affle+0.021
 vows+0.021
Neg logit contributions
 beating-0.015
 aware-0.014
ning-0.013
 group-0.013
hr-0.013
Token bird
Token ID6512
Feature activation-5.76e-03

Pos logit contributions
ning+0.005
 beating+0.005
 group+0.005
 aware+0.005
aping+0.005
Neg logit contributions
bridge-0.004
 throne-0.004
 vows-0.004
affle-0.004
rost-0.003
Token,
Token ID11
Feature activation-1.30e-02

Pos logit contributions
 beating+0.006
 aware+0.006
ning+0.006
 group+0.006
aping+0.006
Neg logit contributions
 throne-0.011
bridge-0.011
rost-0.010
 vows-0.010
oll-0.010
Token but
Token ID475
Feature activation-1.41e-03

Pos logit contributions
 beating+0.013
ning+0.013
 aware+0.013
 group+0.012
aping+0.012
Neg logit contributions
bridge-0.027
 throne-0.027
 vows-0.026
affle-0.025
rost-0.025
Token he
Token ID339
Feature activation+7.33e-03

Pos logit contributions
ning+0.002
 beating+0.002
 aware+0.002
 group+0.002
aping+0.002
Neg logit contributions
bridge-0.002
 throne-0.002
 vows-0.002
rost-0.002
affle-0.002
Token was
Token ID373
Feature activation+7.48e-03

Pos logit contributions
bridge+0.013
 throne+0.012
 vows+0.012
affle+0.012
rost+0.012
Neg logit contributions
 beating-0.010
 aware-0.010
ning-0.010
 group-0.009
hr-0.009
Token too
Token ID1165
Feature activation+8.33e-03

Pos logit contributions
bridge+0.013
 throne+0.013
oll+0.012
 fourteen+0.012
rost+0.012
Neg logit contributions
ning-0.009
aping-0.009
 group-0.009
ing-0.009
hr-0.008
Token slow
Token ID3105
Feature activation-1.19e-02

Pos logit contributions
obb+0.012
rost+0.011
 fourteen+0.011
 menu+0.011
oll+0.010
Neg logit contributions
ning-0.013
aping-0.012
ming-0.012
run-0.011
heart-0.011
Token.
Token ID13
Feature activation+7.44e-03

Pos logit contributions
 group+0.014
 aware+0.014
 beating+0.014
ning+0.014
hr+0.014
Neg logit contributions
bridge-0.021
 throne-0.020
oll-0.020
rost-0.020
 vows-0.020
Token He
Token ID679
Feature activation+1.16e-02

Pos logit contributions
bridge+0.010
 throne+0.010
 vows+0.010
rost+0.010
affle+0.010
Neg logit contributions
 beating-0.012
ning-0.012
 aware-0.012
 group-0.012
hr-0.012
Token fell
Token ID3214
Feature activation-1.23e-02

Pos logit contributions
bridge+0.021
 throne+0.020
 vows+0.020
affle+0.019
rost+0.019
Neg logit contributions
 beating-0.015
ning-0.015
 aware-0.014
 group-0.014
aping-0.013
Token,
Token ID11
Feature activation-4.21e-03

Pos logit contributions
bridge+0.018
 throne+0.018
 vows+0.018
 menu+0.017
affle+0.017
Neg logit contributions
 aware-0.012
 beating-0.012
ning-0.011
aping-0.011
hr-0.011
Token you
Token ID345
Feature activation+6.70e-04

Pos logit contributions
 beating+0.007
ning+0.007
broken+0.007
aping+0.007
 aware+0.007
Neg logit contributions
bridge-0.006
 throne-0.006
 vows-0.005
 menu-0.005
rost-0.005
Token can
Token ID460
Feature activation-6.41e-03

Pos logit contributions
 vows+0.001
bridge+0.001
 throne+0.001
affle+0.001
rost+0.001
Neg logit contributions
ning-0.001
 beating-0.001
 aware-0.001
aping-0.001
hr-0.001
Token have
Token ID423
Feature activation-1.66e-02

Pos logit contributions
ning+0.009
 beating+0.009
 aware+0.009
aping+0.008
 group+0.008
Neg logit contributions
 throne-0.011
bridge-0.011
 vows-0.011
affle-0.010
rost-0.010
Token me
Token ID502
Feature activation-7.06e-03

Pos logit contributions
aping+0.025
ning+0.024
 beating+0.024
aunted+0.023
broken+0.023
Neg logit contributions
 vows-0.027
 throne-0.026
bridge-0.026
 fourteen-0.025
 menu-0.024
Token as
Token ID355
Feature activation+3.28e-03

Pos logit contributions
ning+0.010
aping+0.009
 beating+0.009
aunted+0.009
 group+0.009
Neg logit contributions
 throne-0.012
bridge-0.012
 vows-0.012
 menu-0.011
affle-0.011
Token a
Token ID257
Feature activation+3.35e-03

Pos logit contributions
 throne+0.005
bridge+0.005
 vows+0.005
 menu+0.005
affle+0.005
Neg logit contributions
ning-0.005
aping-0.005
 beating-0.004
aunted-0.004
 aware-0.004
Token reward
Token ID6721
Feature activation-5.11e-03

Pos logit contributions
 throne+0.005
bridge+0.004
 vows+0.004
affle+0.004
 menu+0.004
Neg logit contributions
ning-0.006
 beating-0.006
 aware-0.006
aping-0.006
hr-0.005
Token!
Token ID0
Feature activation-1.36e-02

Pos logit contributions
 beating+0.007
 aware+0.007
ning+0.006
aping+0.006
 group+0.006
Neg logit contributions
bridge-0.009
 throne-0.009
 vows-0.009
rost-0.008
affle-0.008
Token What
Token ID1867
Feature activation+7.95e-03

Pos logit contributions
 beating+0.024
ning+0.023
broken+0.023
 aware+0.022
 group+0.022
Neg logit contributions
bridge-0.018
 vows-0.017
 throne-0.017
oll-0.017
rost-0.017
Token are
Token ID389
Feature activation-2.97e-03

Pos logit contributions
 vows+0.011
oll+0.010
bridge+0.010
ibrarian+0.010
 menu+0.010
Neg logit contributions
broken-0.014
 beating-0.014
aping-0.014
ning-0.014
hr-0.013
Token
Token ID198
Feature activation-2.30e-03

Pos logit contributions
 beating+0.001
 aware+0.001
ning+0.001
broken+0.001
hr+0.001
Neg logit contributions
bridge-0.001
 throne-0.001
rost-0.001
affle-0.001
 menu-0.001
Token\n
Token ID198
Feature activation+9.97e-03

Pos logit contributions
ning+0.002
 beating+0.002
 aware+0.002
 group+0.002
hr+0.002
Neg logit contributions
bridge-0.005
 throne-0.004
rost-0.004
 vows-0.004
affle-0.004
TokenWhile
Token ID3633
Feature activation+1.83e-04

Pos logit contributions
bridge+0.017
 throne+0.017
 vows+0.016
affle+0.016
rost+0.016
Neg logit contributions
 beating-0.014
ning-0.013
 aware-0.013
 group-0.013
aping-0.012
Token they
Token ID484
Feature activation+2.03e-02

Pos logit contributions
bridge+0.000
 throne+0.000
 vows+0.000
rost+0.000
affle+0.000
Neg logit contributions
 aware-0.000
 beating-0.000
ning-0.000
broken-0.000
hr-0.000
Token were
Token ID547
Feature activation+8.45e-03

Pos logit contributions
bridge+0.035
 throne+0.034
 vows+0.034
affle+0.033
rost+0.033
Neg logit contributions
 beating-0.027
ning-0.027
 aware-0.027
 group-0.025
aping-0.025
Token playing
Token ID2712
Feature activation-2.29e-02

Pos logit contributions
 throne+0.016
 vows+0.016
bridge+0.015
oll+0.015
 menu+0.015
Neg logit contributions
ning-0.009
 aware-0.009
aping-0.009
 beating-0.009
run-0.009
Token,
Token ID11
Feature activation-2.06e-02

Pos logit contributions
 beating+0.024
ning+0.023
 aware+0.023
hr+0.021
aping+0.021
Neg logit contributions
bridge-0.047
 throne-0.047
 vows-0.045
affle-0.044
rost-0.043
Token Tim
Token ID5045
Feature activation-2.36e-03

Pos logit contributions
 aware+0.034
 beating+0.033
ning+0.032
aping+0.031
broken+0.031
Neg logit contributions
bridge-0.030
 throne-0.028
 vows-0.028
rost-0.027
 menu-0.027
Token found
Token ID1043
Feature activation-1.10e-02

Pos logit contributions
run+0.004
 aware+0.004
hr+0.003
 beating+0.003
aping+0.003
Neg logit contributions
 throne-0.003
bridge-0.003
 menu-0.003
 microwave-0.003
affle-0.003
Token a
Token ID257
Feature activation+4.26e-03

Pos logit contributions
 beating+0.012
 aware+0.011
ning+0.011
hr+0.010
 group+0.010
Neg logit contributions
bridge-0.023
 throne-0.022
affle-0.021
 vows-0.021
rost-0.021
Token big
Token ID1263
Feature activation+1.64e-02

Pos logit contributions
 throne+0.007
bridge+0.007
 menu+0.007
 vows+0.007
affle+0.007
Neg logit contributions
 aware-0.006
ning-0.006
aping-0.006
 beating-0.006
hr-0.006
Token's
Token ID338
Feature activation-5.33e-03

Pos logit contributions
bridge+0.005
 vows+0.005
 throne+0.005
oll+0.004
 menu+0.004
Neg logit contributions
broken-0.009
ning-0.008
 beating-0.008
aping-0.008
 aware-0.008
Token your
Token ID534
Feature activation+2.07e-03

Pos logit contributions
ning+0.007
 beating+0.007
 aware+0.007
 group+0.006
hr+0.006
Neg logit contributions
bridge-0.010
 throne-0.009
 vows-0.009
affle-0.009
rost-0.009
Token name
Token ID1438
Feature activation-1.67e-03

Pos logit contributions
 throne+0.003
 vows+0.003
bridge+0.003
affle+0.002
 menu+0.002
Neg logit contributions
 beating-0.004
ning-0.004
 aware-0.004
aping-0.004
hr-0.004
Token?"
Token ID1701
Feature activation-1.88e-02

Pos logit contributions
 aware+0.003
aping+0.003
hr+0.003
 beating+0.003
aunted+0.003
Neg logit contributions
bridge-0.002
 throne-0.002
 vows-0.002
 menu-0.002
affle-0.002
Token The
Token ID383
Feature activation-1.34e-02

Pos logit contributions
 beating+0.030
ning+0.030
broken+0.028
 aware+0.028
aping+0.028
Neg logit contributions
bridge-0.027
 throne-0.027
 vows-0.025
 menu-0.025
 microwave-0.025
Token pony
Token ID26902
Feature activation-1.21e-02

Pos logit contributions
 beating+0.018
 aware+0.018
ning+0.017
aping+0.017
hr+0.017
Neg logit contributions
bridge-0.023
 throne-0.023
 vows-0.023
affle-0.022
 menu-0.021
Token was
Token ID373
Feature activation-3.49e-03

Pos logit contributions
aping+0.018
 aware+0.017
 beating+0.017
ning+0.017
hr+0.017
Neg logit contributions
 throne-0.018
 vows-0.018
affle-0.017
bridge-0.017
oll-0.016
Token shy
Token ID15800
Feature activation-6.98e-03

Pos logit contributions
aping+0.005
ing+0.004
hr+0.004
aunted+0.004
iting+0.004
Neg logit contributions
 throne-0.006
oll-0.006
bridge-0.006
affle-0.006
ibrarian-0.005
Token and
Token ID290
Feature activation+4.41e-03

Pos logit contributions
aping+0.010
hr+0.010
 aware+0.010
shadow+0.010
 beating+0.009
Neg logit contributions
oll-0.010
 throne-0.010
bridge-0.010
 menu-0.010
 vows-0.010
Token didn
Token ID1422
Feature activation+2.20e-02

Pos logit contributions
 throne+0.007
 vows+0.007
bridge+0.007
affle+0.007
 menu+0.007
Neg logit contributions
ning-0.006
aping-0.006
 aware-0.006
 beating-0.006
 group-0.006
Token't
Token ID470
Feature activation+1.23e-02

Pos logit contributions
 throne+0.041
bridge+0.041
 vows+0.039
affle+0.038
rost+0.038
Neg logit contributions
 beating-0.028
ning-0.026
 group-0.025
 aware-0.025
 wise-0.024
Token my
Token ID616
Feature activation-7.32e-03

Pos logit contributions
aping+0.031
rolling+0.029
ning+0.028
ing+0.028
ised+0.028
Neg logit contributions
 mine-0.019
 cake-0.016
 ice-0.016
 yours-0.016
 gorilla-0.015
Token friend
Token ID1545
Feature activation+5.18e-03

Pos logit contributions
 beating+0.006
ning+0.006
 aware+0.006
hr+0.006
aping+0.006
Neg logit contributions
 throne-0.016
bridge-0.016
 vows-0.016
affle-0.015
 menu-0.015
Token anymore
Token ID7471
Feature activation+5.09e-03

Pos logit contributions
bridge+0.008
 throne+0.008
 vows+0.008
oll+0.008
 menu+0.008
Neg logit contributions
 aware-0.008
 beating-0.007
aping-0.007
ning-0.007
hr-0.007
Token.
Token ID13
Feature activation-5.10e-03

Pos logit contributions
 throne+0.008
bridge+0.008
 vows+0.008
oll+0.008
affle+0.008
Neg logit contributions
ning-0.007
 aware-0.007
 beating-0.007
aping-0.007
hr-0.007
Token You
Token ID921
Feature activation+1.48e-03

Pos logit contributions
 beating+0.006
 aware+0.006
ning+0.006
 group+0.006
run+0.006
Neg logit contributions
bridge-0.009
 throne-0.009
 vows-0.009
affle-0.009
 menu-0.009
Token are
Token ID389
Feature activation-1.18e-02

Pos logit contributions
bridge+0.002
 throne+0.002
 vows+0.002
affle+0.002
rost+0.002
Neg logit contributions
 beating-0.003
ning-0.003
 aware-0.003
 group-0.002
aping-0.002
Token mean
Token ID1612
Feature activation-1.78e-02

Pos logit contributions
ning+0.015
 beating+0.015
 aware+0.014
 group+0.014
aping+0.014
Neg logit contributions
bridge-0.021
 throne-0.021
 vows-0.020
affle-0.020
oll-0.020
Token and
Token ID290
Feature activation-1.06e-03

Pos logit contributions
 beating+0.024
ning+0.024
hr+0.024
 aware+0.023
aping+0.022
Neg logit contributions
bridge-0.028
 throne-0.027
 vows-0.027
oll-0.027
 menu-0.026
Token careless
Token ID36138
Feature activation-1.69e-02

Pos logit contributions
ning+0.002
 beating+0.001
aping+0.001
 aware+0.001
aunted+0.001
Neg logit contributions
bridge-0.002
 throne-0.002
obb-0.002
 vows-0.002
rost-0.002
Token.
Token ID13
Feature activation-8.72e-03

Pos logit contributions
ning+0.025
 aware+0.025
 beating+0.025
hr+0.025
aping+0.025
Neg logit contributions
bridge-0.026
 fourteen-0.023
 throne-0.023
 vows-0.023
rost-0.023
Token You
Token ID921
Feature activation+3.46e-05

Pos logit contributions
 beating+0.012
ning+0.012
 aware+0.012
 group+0.011
aping+0.011
Neg logit contributions
bridge-0.015
 throne-0.015
 vows-0.014
affle-0.014
rost-0.013
Token hurting
Token ID21530
Feature activation-7.70e-03

Pos logit contributions
 aware+0.007
ning+0.007
aping+0.007
aunted+0.007
 group+0.007
Neg logit contributions
 throne-0.007
rost-0.007
bridge-0.007
 fourteen-0.007
 bamboo-0.006
Token Max
Token ID5436
Feature activation-1.81e-03

Pos logit contributions
 aware+0.012
 beating+0.012
hr+0.011
 group+0.011
ning+0.011
Neg logit contributions
rost-0.011
 throne-0.010
affle-0.010
bridge-0.010
oll-0.010
Token's
Token ID338
Feature activation+5.50e-03

Pos logit contributions
ning+0.002
hr+0.002
aping+0.002
 beating+0.002
 group+0.002
Neg logit contributions
oll-0.003
 throne-0.003
 menu-0.003
rost-0.003
 vows-0.003
Token feelings
Token ID7666
Feature activation+2.14e-04

Pos logit contributions
 vows+0.008
affle+0.008
 throne+0.007
rost+0.007
 menu+0.007
Neg logit contributions
ning-0.009
aping-0.009
hr-0.008
 beating-0.008
 aware-0.008
Token.
Token ID13
Feature activation-5.68e-03

Pos logit contributions
bridge+0.000
 throne+0.000
oll+0.000
 menu+0.000
 vows+0.000
Neg logit contributions
ning-0.000
hr-0.000
aping-0.000
 group-0.000
 aware-0.000
Token She
Token ID1375
Feature activation-9.22e-03

Pos logit contributions
hr+0.010
ning+0.010
 beating+0.010
 aware+0.010
 group+0.009
Neg logit contributions
 throne-0.007
 vows-0.007
bridge-0.007
 menu-0.006
rost-0.006
Token wanted
Token ID2227
Feature activation+9.50e-03

Pos logit contributions
 beating+0.010
 aware+0.010
ning+0.009
 group+0.009
aping+0.009
Neg logit contributions
bridge-0.019
 throne-0.018
 vows-0.018
rost-0.017
affle-0.017
Token to
Token ID284
Feature activation-9.86e-03

Pos logit contributions
 vows+0.018
 throne+0.018
bridge+0.018
rost+0.018
 menu+0.017
Neg logit contributions
 beating-0.011
ning-0.011
 aware-0.010
hr-0.010
aping-0.010
Token improve
Token ID2987
Feature activation-6.50e-04

Pos logit contributions
ning+0.014
 beating+0.013
 aware+0.013
aping+0.012
ing+0.012
Neg logit contributions
 throne-0.017
 vows-0.017
bridge-0.017
affle-0.016
rost-0.016
Token their
Token ID511
Feature activation+8.14e-05

Pos logit contributions
hr+0.001
aping+0.001
 group+0.001
ning+0.001
 beating+0.001
Neg logit contributions
 throne-0.001
 vows-0.001
 menu-0.001
 bamboo-0.001
affle-0.001
Token friendship
Token ID14738
Feature activation+7.54e-03

Pos logit contributions
 vows+0.000
affle+0.000
 throne+0.000
 menu+0.000
rost+0.000
Neg logit contributions
ning-0.000
 beating-0.000
 aware-0.000
aping-0.000
aunted-0.000
Tokencloth
Token ID44905
Feature activation+1.03e-02

Pos logit contributions
bridge+0.007
 vows+0.007
 throne+0.007
affle+0.006
oll+0.006
Neg logit contributions
 beating-0.005
 aware-0.005
ning-0.005
 group-0.005
aping-0.004
Token."
Token ID526
Feature activation-6.32e-03

Pos logit contributions
bridge+0.019
 throne+0.019
 vows+0.019
oll+0.018
affle+0.018
Neg logit contributions
 beating-0.012
 aware-0.012
ning-0.012
 group-0.011
hr-0.011
Token So
Token ID1406
Feature activation-1.54e-02

Pos logit contributions
 beating+0.009
ning+0.009
 aware+0.009
 group+0.009
hr+0.009
Neg logit contributions
bridge-0.010
 throne-0.010
 vows-0.010
affle-0.009
rost-0.009
Token,
Token ID11
Feature activation-2.04e-02

Pos logit contributions
ning+0.017
 beating+0.017
 aware+0.017
aping+0.016
 group+0.015
Neg logit contributions
 throne-0.030
 vows-0.030
bridge-0.029
rost-0.028
oll-0.028
Token Lily
Token ID20037
Feature activation-4.81e-03

Pos logit contributions
 beating+0.034
ning+0.034
 aware+0.034
 group+0.032
aping+0.032
Neg logit contributions
bridge-0.029
 throne-0.028
 vows-0.028
affle-0.026
rost-0.026
Token leaned
Token ID23831
Feature activation-3.55e-02

Pos logit contributions
ning+0.007
hr+0.007
 beating+0.007
 aware+0.007
run+0.007
Neg logit contributions
bridge-0.007
 vows-0.007
 throne-0.007
affle-0.007
oll-0.007
Token back
Token ID736
Feature activation-2.49e-03

Pos logit contributions
 aware+0.048
 group+0.047
hr+0.046
 beating+0.045
ning+0.045
Neg logit contributions
bridge-0.060
 throne-0.056
rost-0.056
 vows-0.054
oll-0.054
Token and
Token ID290
Feature activation+3.23e-03

Pos logit contributions
 beating+0.003
 aware+0.003
ning+0.003
 group+0.003
hr+0.003
Neg logit contributions
 throne-0.005
bridge-0.005
 vows-0.004
affle-0.004
oll-0.004
Token felt
Token ID2936
Feature activation+6.45e-03

Pos logit contributions
 throne+0.005
bridge+0.005
 vows+0.005
 microwave+0.005
rost+0.005
Neg logit contributions
 beating-0.005
 group-0.005
ning-0.005
 aware-0.004
hr-0.004
Token the
Token ID262
Feature activation-8.92e-03

Pos logit contributions
bridge+0.011
oll+0.010
 throne+0.010
obb+0.010
affle+0.010
Neg logit contributions
ning-0.010
 group-0.009
aping-0.009
hr-0.009
 beating-0.009
Token warm
Token ID5814
Feature activation+9.96e-03

Pos logit contributions
 beating+0.014
ning+0.014
 aware+0.014
 group+0.013
aping+0.013
Neg logit contributions
bridge-0.013
 throne-0.013
 vows-0.013
affle-0.012
rost-0.012
Token\n
Token ID198
Feature activation-1.30e-02

Pos logit contributions
 beating+0.043
 aware+0.042
ning+0.041
broken+0.040
ing+0.040
Neg logit contributions
 throne-0.038
bridge-0.038
 vows-0.037
rost-0.036
 microwave-0.035
Token\n
Token ID198
Feature activation-1.55e-02

Pos logit contributions
ning+0.019
 beating+0.019
 aware+0.019
hr+0.017
 group+0.017
Neg logit contributions
bridge-0.020
rost-0.019
 throne-0.019
affle-0.018
 vows-0.018
TokenHis
Token ID6653
Feature activation-4.36e-03

Pos logit contributions
ning+0.025
 beating+0.024
 aware+0.023
 group+0.023
ing+0.023
Neg logit contributions
bridge-0.023
 throne-0.021
 vows-0.021
rost-0.020
affle-0.020
Token friend
Token ID1545
Feature activation-1.96e-02

Pos logit contributions
ning+0.004
 aware+0.004
 beating+0.004
hr+0.004
aping+0.004
Neg logit contributions
 throne-0.009
 vows-0.009
bridge-0.009
affle-0.009
oll-0.008
Token replied
Token ID8712
Feature activation-6.79e-03

Pos logit contributions
aping+0.031
 aware+0.031
hr+0.031
 beating+0.030
ning+0.030
Neg logit contributions
 vows-0.026
ibrarian-0.025
 throne-0.024
oll-0.024
ipop-0.022
Token,
Token ID11
Feature activation-2.89e-02

Pos logit contributions
hr+0.009
 aware+0.009
 group+0.009
ning+0.009
aping+0.009
Neg logit contributions
 vows-0.011
bridge-0.011
 throne-0.011
oll-0.010
affle-0.010
Token "
Token ID366
Feature activation-4.44e-03

Pos logit contributions
aping+0.047
hr+0.045
 aware+0.045
ounce+0.043
aunted+0.043
Neg logit contributions
bridge-0.040
 throne-0.040
 vows-0.039
 menu-0.038
 microwave-0.037
TokenWhy
Token ID5195
Feature activation+1.56e-02

Pos logit contributions
 beating+0.007
ning+0.007
 aware+0.007
 group+0.007
hr+0.006
Neg logit contributions
bridge-0.007
 vows-0.006
rost-0.006
 throne-0.006
oll-0.006
Token should
Token ID815
Feature activation+1.98e-02

Pos logit contributions
 vows+0.019
bridge+0.019
 throne+0.019
affle+0.017
rost+0.017
Neg logit contributions
aping-0.028
ning-0.028
 aware-0.028
 beating-0.027
hr-0.026
Token I
Token ID314
Feature activation-1.13e-02

Pos logit contributions
bridge+0.031
 throne+0.031
 vows+0.030
affle+0.029
rost+0.029
Neg logit contributions
 beating-0.029
ning-0.029
 aware-0.029
 group-0.028
aping-0.027
Token?
Token ID5633
Feature activation-3.57e-02

Pos logit contributions
ning+0.018
 beating+0.018
 aware+0.017
aping+0.017
ing+0.017
Neg logit contributions
 throne-0.017
 vows-0.017
bridge-0.016
affle-0.016
 menu-0.015
Token<|endoftext|>
Token ID50256
Feature activation-1.94e-02

Pos logit contributions
 beating+0.006
ning+0.006
 aware+0.006
 group+0.006
aping+0.005
Neg logit contributions
bridge-0.013
 throne-0.013
 vows-0.013
affle-0.012
rost-0.012
TokenTom
Token ID13787
Feature activation+2.12e-03

Pos logit contributions
 beating+0.027
 aware+0.027
ning+0.026
broken+0.025
hr+0.024
Neg logit contributions
bridge-0.034
 throne-0.032
affle-0.031
rost-0.031
 vows-0.031
Token and
Token ID290
Feature activation+1.37e-02

Pos logit contributions
 throne+0.005
bridge+0.005
 vows+0.005
affle+0.005
rost+0.005
Neg logit contributions
 beating-0.001
ning-0.001
 aware-0.001
hr-0.001
 group-0.001
Token Lily
Token ID20037
Feature activation+1.63e-02

Pos logit contributions
bridge+0.026
 throne+0.025
 vows+0.025
affle+0.025
oll+0.024
Neg logit contributions
 beating-0.016
 aware-0.016
 group-0.016
ning-0.015
hr-0.014
Token were
Token ID547
Feature activation+4.99e-03

Pos logit contributions
bridge+0.029
 throne+0.028
 vows+0.028
affle+0.027
rost+0.027
Neg logit contributions
 beating-0.021
ning-0.021
 aware-0.021
 group-0.020
aping-0.019
Token playing
Token ID2712
Feature activation-3.62e-02

Pos logit contributions
bridge+0.010
 throne+0.010
 vows+0.010
affle+0.010
oll+0.010
Neg logit contributions
 beating-0.005
ning-0.005
 aware-0.005
 group-0.005
hr-0.005
Token in
Token ID287
Feature activation-6.06e-03

Pos logit contributions
 beating+0.049
 aware+0.045
ning+0.043
run+0.042
hr+0.041
Neg logit contributions
bridge-0.066
 throne-0.065
affle-0.061
 vows-0.061
 menu-0.059
Token the
Token ID262
Feature activation-1.40e-02

Pos logit contributions
 beating+0.005
ning+0.005
 aware+0.005
 group+0.004
hr+0.004
Neg logit contributions
bridge-0.014
 throne-0.014
 vows-0.014
affle-0.013
oll-0.013
Token garden
Token ID11376
Feature activation-1.54e-02

Pos logit contributions
 group+0.021
ning+0.021
 aware+0.019
 beating+0.019
 wise+0.019
Neg logit contributions
 vows-0.023
bridge-0.023
oll-0.022
rost-0.022
affle-0.021
Token.
Token ID13
Feature activation+1.80e-02

Pos logit contributions
 beating+0.019
ning+0.018
run+0.018
 aware+0.017
hr+0.017
Neg logit contributions
 throne-0.028
bridge-0.028
 menu-0.026
 vows-0.025
rost-0.025
Token They
Token ID1119
Feature activation+1.31e-02

Pos logit contributions
bridge+0.033
 throne+0.031
 vows+0.031
affle+0.031
obb+0.030
Neg logit contributions
ning-0.023
 beating-0.023
 group-0.022
 aware-0.022
run-0.021
Token day
Token ID1110
Feature activation-6.28e-03

Pos logit contributions
 beating+0.021
ning+0.021
 aware+0.020
 group+0.019
aping+0.019
Neg logit contributions
bridge-0.019
 throne-0.019
 vows-0.018
affle-0.018
rost-0.018
Token,
Token ID11
Feature activation-1.65e-02

Pos logit contributions
 beating+0.006
ning+0.005
 aware+0.005
 group+0.005
aping+0.005
Neg logit contributions
bridge-0.014
 throne-0.014
 vows-0.013
affle-0.013
rost-0.013
Token a
Token ID257
Feature activation-2.04e-03

Pos logit contributions
 beating+0.025
ning+0.025
 aware+0.025
aping+0.024
 group+0.023
Neg logit contributions
bridge-0.026
 throne-0.026
 vows-0.025
rost-0.023
affle-0.023
Token little
Token ID1310
Feature activation-2.00e-02

Pos logit contributions
ning+0.003
aping+0.003
 aware+0.003
aunted+0.003
 beating+0.003
Neg logit contributions
 throne-0.004
 menu-0.004
 vows-0.004
bridge-0.003
affle-0.003
Token girl
Token ID2576
Feature activation-4.79e-03

Pos logit contributions
 beating+0.035
aping+0.034
run+0.034
 aware+0.033
hr+0.032
Neg logit contributions
 throne-0.026
 gorilla-0.025
 menu-0.025
 freezer-0.024
 microwave-0.024
Token came
Token ID1625
Feature activation-3.66e-02

Pos logit contributions
 beating+0.007
ning+0.006
 aware+0.006
 group+0.006
hr+0.006
Neg logit contributions
bridge-0.008
 throne-0.008
 vows-0.008
affle-0.007
rost-0.007
Token to
Token ID284
Feature activation-1.41e-02

Pos logit contributions
 beating+0.064
ming+0.063
ised+0.063
ended+0.063
run+0.061
Neg logit contributions
 menu-0.027
ancy-0.026
 Lib-0.026
adies-0.026
ily-0.025
Token visit
Token ID3187
Feature activation+6.21e-03

Pos logit contributions
ning+0.020
 beating+0.020
ing+0.020
aping+0.019
run+0.019
Neg logit contributions
 throne-0.025
 vows-0.024
 menu-0.023
rost-0.023
bridge-0.022
Token her
Token ID607
Feature activation+8.52e-03

Pos logit contributions
 throne+0.009
 vows+0.009
 menu+0.009
rost+0.008
 Lib+0.008
Neg logit contributions
 beating-0.010
ning-0.010
 aware-0.010
aping-0.010
run-0.010
Token.
Token ID13
Feature activation+1.05e-02

Pos logit contributions
 throne+0.013
 vows+0.013
 menu+0.013
bridge+0.012
 anniversary+0.012
Neg logit contributions
 beating-0.013
run-0.013
 aware-0.013
ning-0.012
aping-0.012
Token The
Token ID383
Feature activation+1.90e-02

Pos logit contributions
 throne+0.014
bridge+0.013
 vows+0.013
 menu+0.013
rost+0.012
Neg logit contributions
 beating-0.018
ning-0.018
 aware-0.018
aping-0.017
hr-0.017
Token mom
Token ID1995
Feature activation-3.65e-02

Pos logit contributions
 throne+0.009
 vows+0.008
affle+0.008
 menu+0.008
oll+0.008
Neg logit contributions
 aware-0.012
ning-0.012
 beating-0.012
aping-0.012
hr-0.012
Token saw
Token ID2497
Feature activation-5.36e-03

Pos logit contributions
 Cater+0.057
hr+0.056
 harmony+0.054
 aware+0.054
 beating+0.054
Neg logit contributions
ibrarian-0.050
bridge-0.049
oll-0.045
 throne-0.044
 microwave-0.043
Token what
Token ID644
Feature activation+5.04e-03

Pos logit contributions
 beating+0.007
ning+0.007
 aware+0.007
 group+0.007
aping+0.007
Neg logit contributions
bridge-0.009
 throne-0.009
 vows-0.009
affle-0.009
rost-0.009
Token was
Token ID373
Feature activation+3.35e-03

Pos logit contributions
bridge+0.006
 vows+0.005
ibrarian+0.005
 menu+0.005
 throne+0.005
Neg logit contributions
ning-0.010
 beating-0.009
hr-0.009
 aware-0.009
aping-0.009
Token happening
Token ID5836
Feature activation-4.92e-03

Pos logit contributions
bridge+0.005
 throne+0.005
 vows+0.005
affle+0.005
oll+0.004
Neg logit contributions
 beating-0.005
ning-0.005
 aware-0.005
 group-0.005
aping-0.005
Token and
Token ID290
Feature activation-2.65e-02

Pos logit contributions
 aware+0.008
aping+0.008
 beating+0.007
ning+0.007
hr+0.007
Neg logit contributions
bridge-0.006
 menu-0.006
 throne-0.006
oll-0.006
 vows-0.006
Token came
Token ID1625
Feature activation-2.57e-02

Pos logit contributions
ning+0.044
 beating+0.043
ounce+0.042
hr+0.041
 Cater+0.041
Neg logit contributions
bridge-0.033
ibrarian-0.032
 vows-0.031
 fourteen-0.030
 throne-0.030
Token over
Token ID625
Feature activation-5.43e-03

Pos logit contributions
 Cater+0.038
ning+0.038
ised+0.038
 beating+0.038
 group+0.037
Neg logit contributions
 throne-0.027
ibrarian-0.026
 menu-0.025
 vows-0.025
 Lib-0.025
Token to
Token ID284
Feature activation-8.13e-03

Pos logit contributions
ning+0.007
 beating+0.007
 aware+0.006
aping+0.006
broken+0.006
Neg logit contributions
 throne-0.009
bridge-0.009
 menu-0.009
 vows-0.009
rost-0.009
Token hug
Token ID16225
Feature activation+2.07e-03

Pos logit contributions
ning+0.012
 beating+0.012
 aware+0.011
aping+0.011
ing+0.011
Neg logit contributions
bridge-0.013
 throne-0.013
 vows-0.013
rost-0.012
 menu-0.012
Token her
Token ID607
Feature activation-3.20e-03

Pos logit contributions
bridge+0.003
 throne+0.003
 vows+0.003
rost+0.003
 menu+0.003
Neg logit contributions
 beating-0.003
 aware-0.003
ning-0.003
 group-0.003
aping-0.003