TOP ACTIVATIONS
MAX = 0.088

with !" That night , Tim my and
the garden ." That made Steve smile . He
another prize ." That made Jim very happy ,
myself !" That 's why Ana was such
hug . That day , Emma learned how
she said . That night , Jane went to
laughed . That night , Lily hugged Ros
find it ." That 's exactly what they did
lose it ." That night , Lily went to
ade together ". That made Wendy cheer up and
smiled big . That day was so much fun
each other ." That night , Mom and Dad
other . That 's when they saw the
said . That night , Daisy 's mom
Wow , Tim my ! That 's a great picture of
ummy !" That night , Tim my went
, understanding . That evening , Emily decided to
you love !" That evening , Emily was determined
way ". That was just the advice Tommy
of joy . That night , Maya was visited

MIN ACTIVATIONS
MIN = -0.087

lately . With work and err ands , they didn 't
asks . " Sur fing is fun , but
room while she ran an err and . The girl was
it . So , he l icked it with his tongue
can . " Sur fing is fun , but
asked the man for two l oll ip ops . The
Her mother said , " Sur render is important so you
lunch . " Sur prise , Mia !" said
and she needed a lot of help .
needed to go do some err ands . Mom
said firmly : " Sur render the cup and let
try it ! He eagerly l icked it , but the
at his salon and running err ands . But every day
her mom were out running err ands . Mary was three
behind a bush . " Sur prise !" she said with
tasted like , so she l icked it .
felt very sorry . He l icked the boy 's hand
Fl uffy just shrugged and l icked his paw . Lily
're welcome , Max . Sur fing is fun and I
The lion said , " Sur render or I will eat

INTERVAL 0.044 - 0.088
CONTAINS 0.275%

. She could feel it warming her face and it made
liked his picture . From that day on , the painter
M um smiled , " That 's a great idea ,"
it . From that day on , Lily always
." The moral of the story is that it

INTERVAL 0.000 - 0.044
CONTAINS 48.597%

away and Mom my smiled . She said " Thank you
and touch the sky . Mia showed the butterfly her colorful
of the kingdom came to celebrate . Everyone was so happy
named Max . Max loved to drive around and explore new
kept flying until it reached a me adow with beautiful flowers

INTERVAL -0.043 - 0.000
CONTAINS 50.755%

charming little town , there lived a boy named Tim .
excited because it was holiday time ! They had been looking
liked the eggs and became her friends . The fat hen
she sprayed . She smiled and felt happy . <|endoftext|> Once
I will never leave you again ." They crawled

INTERVAL -0.087 - -0.043
CONTAINS 0.374%

when they found a nice spot . They walked
time . They had lots of fun in the big land
to play in and lots of grass to eat . One
playing tag and having lots of fun . Suddenly , John
but I had so much fun playing with my new friends
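
NOTE ON THE INTERVALS
The four INTERVAL blocks above appear to split the activation range at half the minimum, at zero, and at half the maximum, and their CONTAINS shares add up to roughly 100%. The sketch below shows one way such shares could be computed. It is an illustration under assumptions, not the dashboard's actual code: the function name interval_shares, the placement of zero in the positive bin, and the sample values are all hypothetical.

```python
from bisect import bisect_right


def interval_shares(activations, lo, hi):
    """Return the percentage of activations in each of four bins.

    Bins run from lowest to highest and are split at lo/2, 0 and hi/2
    (an assumed scheme; an activation of exactly 0 goes to the positive bin).
    """
    edges = [lo / 2, 0.0, hi / 2]
    counts = [0, 0, 0, 0]
    for a in activations:
        # bisect_right gives the index of the bin that contains `a`
        counts[bisect_right(edges, a)] += 1
    total = len(activations)
    return [100.0 * c / total for c in counts]


# Made-up activations for illustration; these are not values from this dump.
sample = [-0.08, -0.01, -0.02, 0.0, 0.01, 0.03, 0.05, 0.088]
shares = interval_shares(sample, lo=-0.087, hi=0.088)
for label, share in zip(["lowest", "low", "high", "highest"], shares):
    print(f"{label:>7}: {share:.3f}%")
```

With real data, lo and hi would be the MIN and MAX values reported at the top of this dump, and the list would hold one activation per token position.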
Token with
Token ID351
Feature activation-1.72e-02

Pos logit contributions
idents+0.008
s+0.007
 argue+0.007
ah+0.007
 Today+0.007
Neg logit contributions
aver-0.010
 doctors-0.009
avers-0.008
eh-0.008
 gard-0.008
Token!"
Token ID2474
Feature activation+1.31e-02

Pos logit contributions
aver+0.038
 doctors+0.034
avers+0.032
P+0.031
eh+0.031
Neg logit contributions
idents-0.022
s-0.020
ric-0.018
 argue-0.017
ah-0.017
Token
Token ID220
Feature activation+7.54e-03

Pos logit contributions
idents+0.019
ah+0.017
ric+0.017
 argue+0.017
 destination+0.016
Neg logit contributions
aver-0.025
 doctors-0.024
avers-0.022
 knights-0.022
ily-0.022
Token
Token ID198
Feature activation+1.12e-02

Pos logit contributions
idents+0.010
ric+0.009
s+0.009
ah+0.009
 argue+0.008
Neg logit contributions
aver-0.015
 doctors-0.015
 knights-0.014
avers-0.013
 pirates-0.013
Token
Token ID198
Feature activation+6.32e-03

Pos logit contributions
idents+0.018
ric+0.016
ah+0.016
 argue+0.015
s+0.014
Neg logit contributions
aver-0.021
 doctors-0.020
avers-0.018
 knights-0.018
ily-0.017
TokenThat
Token ID2504
Feature activation+8.77e-02

Pos logit contributions
idents+0.013
ric+0.012
ah+0.012
s+0.012
 argue+0.011
Neg logit contributions
aver-0.009
 doctors-0.008
 knights-0.007
avers-0.007
 pirates-0.007
Token night
Token ID1755
Feature activation-1.51e-02

Pos logit contributions
idents+0.145
ric+0.143
seven+0.132
 argue+0.127
ishly+0.114
Neg logit contributions
 doctors-0.149
aver-0.144
 pirates-0.125
 knights-0.123
P-0.122
Token,
Token ID11
Feature activation+1.10e-02

Pos logit contributions
aver+0.038
 doctors+0.034
P+0.034
 knights+0.033
avers+0.033
Neg logit contributions
idents-0.015
s-0.014
ah-0.012
 Today-0.011
 destination-0.010
Token Tim
Token ID5045
Feature activation+1.33e-02

Pos logit contributions
idents+0.021
ric+0.020
ah+0.019
s+0.019
seven+0.019
Neg logit contributions
aver-0.016
 doctors-0.016
 knights-0.013
avers-0.013
 pirates-0.013
Tokenmy
Token ID1820
Feature activation-1.17e-02

Pos logit contributions
idents+0.022
ric+0.018
 summit+0.018
seven+0.018
 argue+0.017
Neg logit contributions
aver-0.026
 doctors-0.023
 knights-0.021
ily-0.021
avers-0.021
Token and
Token ID290
Feature activation+4.13e-02

Pos logit contributions
aver+0.025
 doctors+0.024
avers+0.022
 knights+0.022
P+0.021
Neg logit contributions
idents-0.015
s-0.013
ah-0.013
ric-0.013
 argue-0.012
Token the
Token ID262
Feature activation-1.28e-02

Pos logit contributions
idents+0.006
ric+0.005
ah+0.005
s+0.005
 argue+0.005
Neg logit contributions
aver-0.009
 doctors-0.009
 knights-0.008
avers-0.008
 pirates-0.008
Token garden
Token ID11376
Feature activation+3.75e-03

Pos logit contributions
aver+0.024
 doctors+0.022
avers+0.020
 knights+0.019
 spies+0.019
Neg logit contributions
idents-0.019
ric-0.017
 Today-0.017
s-0.017
 summit-0.017
Token."
Token ID526
Feature activation+8.22e-03

Pos logit contributions
idents+0.005
ric+0.004
ah+0.004
s+0.004
 argue+0.004
Neg logit contributions
aver-0.008
 doctors-0.008
 knights-0.007
avers-0.007
 pirates-0.007
Token
Token ID198
Feature activation+1.01e-05

Pos logit contributions
idents+0.014
ric+0.013
ah+0.013
 argue+0.012
 destination+0.012
Neg logit contributions
aver-0.013
 doctors-0.012
 knights-0.012
ily-0.011
avers-0.011
Token
Token ID198
Feature activation-1.22e-03

Pos logit contributions
idents+0.000
ric+0.000
ah+0.000
 argue+0.000
s+0.000
Neg logit contributions
aver-0.000
 doctors-0.000
avers-0.000
 knights-0.000
ily-0.000
TokenThat
Token ID2504
Feature activation+8.77e-02

Pos logit contributions
aver+0.002
 doctors+0.002
P+0.001
avers+0.001
 knights+0.001
Neg logit contributions
idents-0.003
ric-0.002
ah-0.002
 argue-0.002
s-0.002
Token made
Token ID925
Feature activation+4.55e-02

Pos logit contributions
idents+0.160
ric+0.160
 argue+0.153
seven+0.143
 fare+0.124
Neg logit contributions
aver-0.131
 doctors-0.124
P-0.112
ily-0.107
 knights-0.107
Token Steve
Token ID6542
Feature activation+3.76e-02

Pos logit contributions
idents+0.080
ric+0.075
 argue+0.070
seven+0.069
ah+0.069
Neg logit contributions
aver-0.076
 doctors-0.075
 knights-0.067
avers-0.065
 pirates-0.064
Token smile
Token ID8212
Feature activation+3.93e-02

Pos logit contributions
idents+0.067
ric+0.058
ah+0.056
 argue+0.054
seven+0.054
Neg logit contributions
aver-0.061
 doctors-0.057
 knights-0.054
ily-0.050
 pirates-0.049
Token.
Token ID13
Feature activation+9.65e-03

Pos logit contributions
idents+0.075
ric+0.073
 argue+0.066
seven+0.066
ah+0.065
Neg logit contributions
 doctors-0.055
 knights-0.052
aver-0.052
 pirates-0.049
ily-0.043
Token He
Token ID679
Feature activation+1.24e-02

Pos logit contributions
idents+0.015
ric+0.014
ah+0.013
 argue+0.013
s+0.013
Neg logit contributions
aver-0.018
 doctors-0.017
 knights-0.016
avers-0.015
ily-0.015
Token another
Token ID1194
Feature activation-1.78e-03

Pos logit contributions
idents+0.014
ric+0.013
ah+0.013
 argue+0.012
s+0.012
Neg logit contributions
 doctors-0.013
aver-0.012
 knights-0.011
 pirates-0.011
avers-0.010
Token prize
Token ID11596
Feature activation-1.37e-02

Pos logit contributions
aver+0.003
 doctors+0.003
 knights+0.003
avers+0.003
 pirates+0.003
Neg logit contributions
idents-0.003
ric-0.002
s-0.002
ah-0.002
 argue-0.002
Token."
Token ID526
Feature activation-2.92e-03

Pos logit contributions
aver+0.030
 doctors+0.029
 knights+0.027
avers+0.026
 pirates+0.026
Neg logit contributions
idents-0.016
ric-0.014
ah-0.014
s-0.014
 argue-0.013
Token
Token ID198
Feature activation-4.40e-03

Pos logit contributions
aver+0.006
 doctors+0.005
 knights+0.005
avers+0.005
ily+0.005
Neg logit contributions
idents-0.004
ric-0.004
ah-0.004
s-0.004
 argue-0.004
Token
Token ID198
Feature activation-5.53e-03

Pos logit contributions
aver+0.009
 doctors+0.009
 knights+0.008
avers+0.008
 pirates+0.008
Neg logit contributions
idents-0.006
ric-0.005
s-0.005
ah-0.005
 argue-0.005
TokenThat
Token ID2504
Feature activation+8.76e-02

Pos logit contributions
aver+0.008
 doctors+0.008
 knights+0.007
avers+0.007
P+0.006
Neg logit contributions
idents-0.011
ric-0.010
ah-0.010
s-0.010
 argue-0.009
Token made
Token ID925
Feature activation+4.46e-02

Pos logit contributions
idents+0.154
ric+0.153
 argue+0.136
seven+0.136
ishly+0.123
Neg logit contributions
 doctors-0.138
aver-0.130
 knights-0.117
 pirates-0.114
P-0.114
Token Jim
Token ID5395
Feature activation+5.36e-02

Pos logit contributions
idents+0.082
ric+0.079
s+0.073
 argue+0.073
seven+0.073
Neg logit contributions
 doctors-0.067
aver-0.066
 knights-0.059
ily-0.058
 pirates-0.056
Token very
Token ID845
Feature activation+3.79e-02

Pos logit contributions
idents+0.098
ric+0.085
ah+0.081
riage+0.079
seven+0.079
Neg logit contributions
aver-0.078
 doctors-0.077
 knights-0.073
 pirates-0.067
ily-0.065
Token happy
Token ID3772
Feature activation+1.51e-02

Pos logit contributions
idents+0.067
ric+0.062
ah+0.060
s+0.057
seven+0.056
Neg logit contributions
aver-0.063
 doctors-0.060
 knights-0.054
avers-0.052
 pirates-0.051
Token,
Token ID11
Feature activation+1.13e-02

Pos logit contributions
idents+0.026
ric+0.025
ah+0.025
seven+0.022
 argue+0.022
Neg logit contributions
 doctors-0.024
aver-0.023
 knights-0.022
 pirates-0.022
ily-0.020
Token myself
Token ID3589
Feature activation+1.29e-02

Pos logit contributions
aver+0.032
 doctors+0.031
 knights+0.029
avers+0.028
 pirates+0.028
Neg logit contributions
idents-0.022
ah-0.019
s-0.019
ric-0.019
 argue-0.017
Token!"
Token ID2474
Feature activation+4.62e-03

Pos logit contributions
idents+0.020
ric+0.018
ah+0.016
 argue+0.016
s+0.015
Neg logit contributions
 doctors-0.024
aver-0.024
 knights-0.023
 pirates-0.022
ily-0.021
Token
Token ID220
Feature activation-8.04e-04

Pos logit contributions
idents+0.008
ah+0.007
ric+0.007
 argue+0.007
 destination+0.007
Neg logit contributions
ily-0.007
 doctors-0.007
aver-0.007
 knights-0.007
itt-0.006
Token
Token ID198
Feature activation-2.58e-03

Pos logit contributions
aver+0.002
 doctors+0.001
 knights+0.001
avers+0.001
ily+0.001
Neg logit contributions
idents-0.001
ah-0.001
ric-0.001
 argue-0.001
s-0.001
Token
Token ID198
Feature activation-1.20e-03

Pos logit contributions
aver+0.004
 doctors+0.004
ily+0.004
 knights+0.004
P+0.004
Neg logit contributions
idents-0.005
ah-0.004
ric-0.004
 argue-0.004
 destination-0.004
TokenThat
Token ID2504
Feature activation+8.69e-02

Pos logit contributions
aver+0.002
 doctors+0.002
P+0.002
ily+0.002
 knights+0.002
Neg logit contributions
idents-0.002
ric-0.002
ah-0.002
 argue-0.002
 destination-0.002
Token's
Token ID338
Feature activation+1.72e-02

Pos logit contributions
idents+0.166
ric+0.164
 argue+0.152
ishly+0.136
seven+0.135
Neg logit contributions
 doctors-0.122
aver-0.112
ily-0.104
 grown-0.101
 knights-0.099
Token why
Token ID1521
Feature activation+3.21e-02

Pos logit contributions
idents+0.028
s+0.026
ah+0.026
 argue+0.026
ric+0.026
Neg logit contributions
aver-0.030
 doctors-0.030
ily-0.026
avers-0.025
 pirates-0.024
Token Ana
Token ID17639
Feature activation+1.29e-02

Pos logit contributions
idents+0.058
ric+0.054
ah+0.052
 argue+0.049
 destination+0.048
Neg logit contributions
 doctors-0.050
aver-0.046
 knights-0.045
ily-0.044
 pirates-0.042
Token was
Token ID373
Feature activation+3.38e-03

Pos logit contributions
idents+0.021
ric+0.020
ah+0.018
seven+0.018
 argue+0.018
Neg logit contributions
aver-0.023
 doctors-0.022
 knights-0.020
avers-0.019
 pirates-0.019
Token such
Token ID884
Feature activation-2.46e-02

Pos logit contributions
idents+0.004
ric+0.004
ah+0.004
seven+0.004
 argue+0.004
Neg logit contributions
aver-0.007
 doctors-0.007
 knights-0.006
avers-0.006
P-0.006
Token hug
Token ID16225
Feature activation+6.84e-04

Pos logit contributions
idents+0.011
ric+0.010
ah+0.009
s+0.009
 argue+0.008
Neg logit contributions
aver-0.021
 doctors-0.020
 knights-0.018
avers-0.018
 pirates-0.018
Token.
Token ID13
Feature activation+1.16e-02

Pos logit contributions
idents+0.001
ric+0.001
 argue+0.001
ah+0.001
 summit+0.001
Neg logit contributions
 doctors-0.001
aver-0.001
 knights-0.001
ily-0.001
 pirates-0.001
Token
Token ID220
Feature activation+1.08e-02

Pos logit contributions
idents+0.016
ric+0.015
ah+0.015
 argue+0.014
s+0.014
Neg logit contributions
aver-0.022
 doctors-0.022
 knights-0.020
avers-0.019
ily-0.019
Token
Token ID198
Feature activation+5.77e-03

Pos logit contributions
idents+0.014
ric+0.012
s+0.012
ah+0.011
 Today+0.011
Neg logit contributions
aver-0.023
 doctors-0.023
 knights-0.021
avers-0.020
 pirates-0.020
Token
Token ID198
Feature activation+4.24e-03

Pos logit contributions
idents+0.008
ric+0.007
ah+0.007
s+0.007
 argue+0.007
Neg logit contributions
aver-0.011
 doctors-0.011
 knights-0.010
avers-0.010
P-0.010
TokenThat
Token ID2504
Feature activation+8.48e-02

Pos logit contributions
idents+0.006
s+0.005
ric+0.005
ah+0.005
seven+0.005
Neg logit contributions
aver-0.008
 doctors-0.008
 knights-0.007
 pirates-0.007
avers-0.007
Token day
Token ID1110
Feature activation-7.64e-03

Pos logit contributions
ric+0.146
idents+0.141
 argue+0.130
seven+0.129
ah+0.116
Neg logit contributions
 doctors-0.142
aver-0.133
P-0.119
ily-0.116
 knights-0.114
Token,
Token ID11
Feature activation+1.62e-02

Pos logit contributions
aver+0.014
 doctors+0.014
 knights+0.013
avers+0.012
 pirates+0.012
Neg logit contributions
idents-0.012
s-0.010
ric-0.010
ah-0.010
 argue-0.010
Token Emma
Token ID18966
Feature activation+2.16e-03

Pos logit contributions
idents+0.032
ric+0.030
ah+0.030
s+0.029
 argue+0.028
Neg logit contributions
aver-0.022
 doctors-0.022
 knights-0.019
avers-0.018
 pirates-0.017
Token learned
Token ID4499
Feature activation+1.47e-02

Pos logit contributions
idents+0.003
ric+0.003
ah+0.003
s+0.003
 argue+0.003
Neg logit contributions
aver-0.004
 doctors-0.004
 knights-0.003
avers-0.003
 pirates-0.003
Token how
Token ID703
Feature activation+6.60e-03

Pos logit contributions
idents+0.025
ric+0.023
ah+0.023
s+0.022
 argue+0.022
Neg logit contributions
 doctors-0.025
aver-0.024
 knights-0.021
ily-0.020
 pirates-0.020
Token she
Token ID673
Feature activation-8.38e-04

Pos logit contributions
aver+0.002
 doctors+0.002
 knights+0.002
ily+0.002
avers+0.002
Neg logit contributions
idents-0.002
ric-0.002
ah-0.002
 argue-0.002
seven-0.002
Token said
Token ID531
Feature activation-1.64e-02

Pos logit contributions
aver+0.002
 doctors+0.002
 knights+0.001
avers+0.001
 pirates+0.001
Neg logit contributions
idents-0.001
ric-0.001
ah-0.001
s-0.001
 argue-0.001
Token.
Token ID13
Feature activation-1.53e-02

Pos logit contributions
aver+0.035
 doctors+0.034
 knights+0.031
P+0.030
avers+0.030
Neg logit contributions
idents-0.021
s-0.019
 argue-0.017
ah-0.017
ric-0.017
Token
Token ID198
Feature activation-1.05e-02

Pos logit contributions
aver+0.033
 doctors+0.032
 knights+0.029
avers+0.029
 pirates+0.028
Neg logit contributions
idents-0.019
ric-0.017
s-0.016
ah-0.016
 argue-0.015
Token
Token ID198
Feature activation-9.58e-03

Pos logit contributions
aver+0.021
 doctors+0.020
 knights+0.019
avers+0.018
 pirates+0.018
Neg logit contributions
idents-0.015
ric-0.013
s-0.013
ah-0.013
 Today-0.012
TokenThat
Token ID2504
Feature activation+8.46e-02

Pos logit contributions
aver+0.015
 doctors+0.015
 knights+0.013
avers+0.012
 pirates+0.012
Neg logit contributions
idents-0.017
s-0.015
ric-0.015
ah-0.015
 argue-0.015
Token night
Token ID1755
Feature activation-1.57e-02

Pos logit contributions
idents+0.156
ric+0.152
 argue+0.142
seven+0.137
ishly+0.121
Neg logit contributions
 doctors-0.124
aver-0.123
ily-0.106
P-0.102
 pirates-0.099
Token,
Token ID11
Feature activation+4.76e-03

Pos logit contributions
aver+0.032
 doctors+0.031
 knights+0.029
avers+0.028
P+0.028
Neg logit contributions
idents-0.021
s-0.019
ah-0.018
ric-0.017
 Today-0.017
Token Jane
Token ID12091
Feature activation+3.29e-03

Pos logit contributions
idents+0.009
ric+0.008
ah+0.008
 argue+0.008
s+0.007
Neg logit contributions
aver-0.007
 doctors-0.007
 knights-0.006
ily-0.006
avers-0.006
Token went
Token ID1816
Feature activation-5.06e-03

Pos logit contributions
idents+0.006
ric+0.005
 argue+0.005
ah+0.005
 Today+0.004
Neg logit contributions
aver-0.006
 doctors-0.006
 knights-0.005
avers-0.005
eh-0.005
Token to
Token ID284
Feature activation+6.74e-03

Pos logit contributions
aver+0.012
 doctors+0.011
avers+0.010
 knights+0.010
P+0.010
Neg logit contributions
idents-0.005
s-0.005
ah-0.005
ric-0.005
 destination-0.004
Token laughed
Token ID13818
Feature activation+1.46e-02

Pos logit contributions
idents+0.048
ric+0.044
ah+0.043
s+0.042
seven+0.041
Neg logit contributions
aver-0.034
 doctors-0.033
 knights-0.028
avers-0.027
ily-0.027
Token.
Token ID13
Feature activation+1.24e-02

Pos logit contributions
ric+0.025
idents+0.024
ah+0.024
seven+0.022
 argue+0.022
Neg logit contributions
 doctors-0.024
aver-0.023
 knights-0.022
 pirates-0.021
ily-0.020
Token
Token ID220
Feature activation+1.40e-02

Pos logit contributions
idents+0.016
ah+0.016
ric+0.015
 argue+0.014
seven+0.013
Neg logit contributions
aver-0.025
 doctors-0.024
 knights-0.022
avers-0.022
ily-0.022
Token
Token ID198
Feature activation+1.20e-02

Pos logit contributions
idents+0.019
ric+0.017
s+0.016
ah+0.016
 argue+0.015
Neg logit contributions
aver-0.029
 doctors-0.028
 knights-0.026
avers-0.025
 pirates-0.025
Token
Token ID198
Feature activation+9.55e-03

Pos logit contributions
idents+0.018
ric+0.016
ah+0.016
 argue+0.015
s+0.015
Neg logit contributions
aver-0.023
 doctors-0.022
avers-0.020
 knights-0.020
P-0.019
TokenThat
Token ID2504
Feature activation+8.44e-02

Pos logit contributions
idents+0.012
s+0.010
ric+0.010
ah+0.010
 argue+0.010
Neg logit contributions
aver-0.020
 doctors-0.020
 knights-0.018
avers-0.018
 pirates-0.018
Token night
Token ID1755
Feature activation-1.26e-02

Pos logit contributions
idents+0.129
ric+0.126
seven+0.113
 argue+0.112
ah+0.106
Neg logit contributions
 doctors-0.158
aver-0.150
P-0.131
 pirates-0.130
 knights-0.130
Token,
Token ID11
Feature activation+7.12e-03

Pos logit contributions
aver+0.029
 doctors+0.027
 knights+0.026
avers+0.025
P+0.025
Neg logit contributions
idents-0.015
s-0.014
ah-0.012
 Today-0.011
 destination-0.011
Token Lily
Token ID20037
Feature activation+2.72e-03

Pos logit contributions
idents+0.015
ric+0.014
ah+0.013
s+0.013
 argue+0.013
Neg logit contributions
aver-0.009
 doctors-0.009
 knights-0.008
avers-0.008
 pirates-0.007
Token hugged
Token ID41130
Feature activation-1.04e-03

Pos logit contributions
idents+0.003
s+0.003
ah+0.003
ric+0.003
 argue+0.003
Neg logit contributions
aver-0.006
 doctors-0.006
avers-0.005
 knights-0.005
 pirates-0.005
Token Ros
Token ID10018
Feature activation+1.58e-02

Pos logit contributions
 doctors+0.002
aver+0.002
 knights+0.002
avers+0.002
 pirates+0.002
Neg logit contributions
idents-0.002
ric-0.001
ah-0.001
s-0.001
seven-0.001
Token find
Token ID1064
Feature activation-9.46e-03

Pos logit contributions
idents+0.009
ric+0.008
s+0.008
 argue+0.008
ah+0.007
Neg logit contributions
aver-0.016
 doctors-0.015
avers-0.014
P-0.014
eh-0.014
Token it
Token ID340
Feature activation+1.79e-02

Pos logit contributions
aver+0.020
 doctors+0.018
avers+0.017
eh+0.016
 knights+0.016
Neg logit contributions
idents-0.013
s-0.011
ric-0.011
ah-0.010
 destination-0.010
Token."
Token ID526
Feature activation-6.05e-03

Pos logit contributions
idents+0.026
ric+0.023
ah+0.023
 argue+0.022
s+0.021
Neg logit contributions
aver-0.035
 doctors-0.034
 knights-0.031
avers-0.030
 pirates-0.030
Token
Token ID198
Feature activation-6.70e-03

Pos logit contributions
aver+0.012
 doctors+0.012
 knights+0.011
avers+0.010
ily+0.010
Neg logit contributions
idents-0.008
ric-0.008
ah-0.007
 argue-0.007
s-0.007
Token
Token ID198
Feature activation-8.10e-03

Pos logit contributions
aver+0.014
 doctors+0.014
 knights+0.013
avers+0.012
 pirates+0.012
Neg logit contributions
idents-0.008
s-0.007
ric-0.007
ah-0.007
 argue-0.007
TokenThat
Token ID2504
Feature activation+8.34e-02

Pos logit contributions
aver+0.016
 doctors+0.016
 knights+0.014
avers+0.014
 pirates+0.014
Neg logit contributions
idents-0.011
ric-0.010
ah-0.010
s-0.010
 argue-0.009
Token's
Token ID338
Feature activation+1.47e-02

Pos logit contributions
idents+0.144
ric+0.138
seven+0.124
 argue+0.124
that+0.112
Neg logit contributions
 doctors-0.138
aver-0.137
 knights-0.122
 pirates-0.120
P-0.113
Token exactly
Token ID3446
Feature activation-1.10e-03

Pos logit contributions
idents+0.027
ric+0.024
seven+0.024
 argue+0.024
s+0.024
Neg logit contributions
aver-0.024
 doctors-0.023
avers-0.020
 knights-0.019
 pirates-0.019
Token what
Token ID644
Feature activation+1.91e-02

Pos logit contributions
aver+0.002
 doctors+0.002
 knights+0.002
avers+0.002
 pirates+0.002
Neg logit contributions
idents-0.002
ric-0.001
s-0.001
ah-0.001
 argue-0.001
Token they
Token ID484
Feature activation+9.82e-03

Pos logit contributions
idents+0.031
ric+0.027
ah+0.027
s+0.026
 argue+0.026
Neg logit contributions
aver-0.035
 doctors-0.033
 knights-0.030
avers-0.029
 pirates-0.028
Token did
Token ID750
Feature activation+8.25e-03

Pos logit contributions
idents+0.018
ric+0.017
seven+0.016
ah+0.016
s+0.016
Neg logit contributions
aver-0.015
 doctors-0.014
 knights-0.013
ily-0.012
 pirates-0.012
Token lose
Token ID4425
Feature activation-7.30e-03

Pos logit contributions
idents+0.031
ric+0.028
ah+0.027
s+0.025
seven+0.025
Neg logit contributions
aver-0.039
 doctors-0.038
 knights-0.034
 pirates-0.033
avers-0.033
Token it
Token ID340
Feature activation+1.07e-02

Pos logit contributions
aver+0.017
 doctors+0.016
avers+0.015
eh+0.015
P+0.015
Neg logit contributions
idents-0.008
s-0.007
ric-0.007
 argue-0.006
ah-0.006
Token."
Token ID526
Feature activation-4.61e-03

Pos logit contributions
idents+0.013
s+0.011
ric+0.011
ah+0.011
 argue+0.011
Neg logit contributions
aver-0.023
 doctors-0.022
 knights-0.020
avers-0.020
P-0.019
Token
Token ID198
Feature activation+4.10e-03

Pos logit contributions
aver+0.010
 doctors+0.010
 knights+0.009
avers+0.009
 pirates+0.008
Neg logit contributions
idents-0.006
s-0.005
ric-0.005
ah-0.005
 Today-0.005
Token
Token ID198
Feature activation+4.09e-03

Pos logit contributions
idents+0.006
ric+0.006
ah+0.006
 argue+0.005
seven+0.005
Neg logit contributions
aver-0.008
 doctors-0.007
avers-0.006
 knights-0.006
ily-0.006
TokenThat
Token ID2504
Feature activation+8.32e-02

Pos logit contributions
idents+0.009
s+0.008
ric+0.008
ah+0.008
 argue+0.008
Neg logit contributions
aver-0.005
 doctors-0.005
 knights-0.004
avers-0.004
 pirates-0.004
Token night
Token ID1755
Feature activation-1.52e-02

Pos logit contributions
ric+0.129
idents+0.127
seven+0.117
 argue+0.109
ah+0.106
Neg logit contributions
 doctors-0.154
aver-0.144
P-0.129
 pirates-0.129
 knights-0.128
Token,
Token ID11
Feature activation+4.23e-03

Pos logit contributions
aver+0.038
 knights+0.035
 doctors+0.034
 spies+0.034
 pirates+0.034
Neg logit contributions
s-0.015
idents-0.014
ah-0.011
 Today-0.011
 may-0.009
Token Lily
Token ID20037
Feature activation-1.43e-02

Pos logit contributions
idents+0.009
ric+0.008
ah+0.008
s+0.008
 argue+0.008
Neg logit contributions
aver-0.006
 doctors-0.005
 knights-0.005
avers-0.004
 pirates-0.004
Token went
Token ID1816
Feature activation-1.16e-02

Pos logit contributions
aver+0.030
 doctors+0.028
avers+0.027
itt+0.027
 knights+0.026
Neg logit contributions
s-0.018
idents-0.017
ah-0.016
 argue-0.015
 may-0.013
Token to
Token ID284
Feature activation-7.31e-04

Pos logit contributions
aver+0.025
 doctors+0.022
avers+0.021
 knights+0.020
itt+0.020
Neg logit contributions
s-0.014
idents-0.014
 Today-0.012
ah-0.012
 argue-0.012
Tokenade
Token ID671
Feature activation+2.88e-02

Pos logit contributions
idents+0.050
ric+0.050
ah+0.047
 Today+0.047
seven+0.046
Neg logit contributions
 doctors-0.034
aver-0.033
eh-0.028
P-0.028
ily-0.028
Token together
Token ID1978
Feature activation+2.11e-02

Pos logit contributions
idents+0.040
ric+0.038
ah+0.036
 argue+0.033
seven+0.033
Neg logit contributions
 doctors-0.056
aver-0.055
 knights-0.050
 pirates-0.048
avers-0.048
Token".
Token ID1911
Feature activation+3.03e-03

Pos logit contributions
idents+0.031
ric+0.029
ah+0.028
 argue+0.025
 Today+0.024
Neg logit contributions
 doctors-0.038
aver-0.037
 knights-0.035
 pirates-0.034
ily-0.033
Token
Token ID198
Feature activation-5.28e-03

Pos logit contributions
idents+0.004
ric+0.004
ah+0.004
 argue+0.004
seven+0.004
Neg logit contributions
aver-0.006
 doctors-0.005
P-0.005
avers-0.005
ily-0.005
Token
Token ID198
Feature activation-5.20e-03

Pos logit contributions
aver+0.010
 doctors+0.010
 knights+0.009
avers+0.009
 pirates+0.008
Neg logit contributions
idents-0.008
ric-0.007
ah-0.007
s-0.007
 argue-0.007
TokenThat
Token ID2504
Feature activation+8.32e-02

Pos logit contributions
aver+0.009
 doctors+0.008
avers+0.007
 knights+0.007
P+0.007
Neg logit contributions
idents-0.009
ric-0.008
ah-0.008
s-0.008
 argue-0.008
Token made
Token ID925
Feature activation+4.25e-02

Pos logit contributions
ric+0.151
idents+0.149
seven+0.132
 argue+0.131
 Today+0.116
Neg logit contributions
 doctors-0.129
aver-0.127
P-0.111
ily-0.103
eh-0.100
Token Wendy
Token ID28314
Feature activation+4.42e-02

Pos logit contributions
idents+0.069
ric+0.065
ah+0.062
seven+0.061
 argue+0.060
Neg logit contributions
 doctors-0.074
aver-0.073
ily-0.064
avers-0.063
P-0.063
Token cheer
Token ID14042
Feature activation+2.00e-02

Pos logit contributions
idents+0.075
ric+0.073
ah+0.068
 Today+0.064
 destination+0.064
Neg logit contributions
aver-0.069
 doctors-0.068
 knights-0.062
ily-0.060
avers-0.057
Token up
Token ID510
Feature activation+1.24e-02

Pos logit contributions
idents+0.033
ric+0.032
seven+0.030
ah+0.030
 argue+0.030
Neg logit contributions
 doctors-0.033
aver-0.033
ily-0.029
 knights-0.029
 pirates-0.028
Token and
Token ID290
Feature activation+1.90e-02

Pos logit contributions
idents+0.020
ric+0.019
ah+0.018
seven+0.017
 argue+0.017
Neg logit contributions
aver-0.021
 doctors-0.021
 knights-0.019
avers-0.018
 pirates-0.018
Token smiled
Token ID13541
Feature activation+4.11e-03

Pos logit contributions
aver+0.009
 doctors+0.009
avers+0.008
 knights+0.008
ily+0.008
Neg logit contributions
idents-0.010
ric-0.008
 Today-0.008
 summit-0.008
 destination-0.008
Token big
Token ID1263
Feature activation+1.99e-03

Pos logit contributions
idents+0.007
ric+0.007
ah+0.007
 argue+0.006
seven+0.006
Neg logit contributions
 doctors-0.007
aver-0.006
 knights-0.006
 pirates-0.006
 spies-0.005
Token.
Token ID13
Feature activation+8.84e-03

Pos logit contributions
ric+0.003
idents+0.003
ah+0.003
 summit+0.003
s+0.003
Neg logit contributions
 doctors-0.003
 knights-0.003
aver-0.003
 pirates-0.003
 gard-0.003
Token
Token ID198
Feature activation-5.32e-03

Pos logit contributions
idents+0.013
ric+0.012
ah+0.012
s+0.011
 argue+0.011
Neg logit contributions
aver-0.017
 doctors-0.016
 knights-0.015
avers-0.014
ily-0.014
Token
Token ID198
Feature activation-3.87e-03

Pos logit contributions
aver+0.011
 doctors+0.011
 knights+0.010
avers+0.010
 pirates+0.009
Neg logit contributions
idents-0.007
ric-0.006
s-0.006
ah-0.006
 argue-0.006
TokenThat
Token ID2504
Feature activation+8.29e-02

Pos logit contributions
aver+0.006
 doctors+0.006
 knights+0.005
avers+0.005
 pirates+0.005
Neg logit contributions
idents-0.007
ric-0.007
s-0.007
ah-0.006
 argue-0.006
Token day
Token ID1110
Feature activation-9.23e-03

Pos logit contributions
idents+0.157
ric+0.150
 argue+0.140
seven+0.130
ishly+0.126
Neg logit contributions
 doctors-0.125
aver-0.121
P-0.099
ily-0.097
 pirates-0.097
Token was
Token ID373
Feature activation+5.17e-03

Pos logit contributions
 doctors+0.015
aver+0.015
 knights+0.013
ily+0.013
avers+0.013
Neg logit contributions
idents-0.016
ric-0.015
ah-0.014
 argue-0.014
seven-0.013
Token so
Token ID523
Feature activation+8.13e-03

Pos logit contributions
idents+0.009
ric+0.008
ah+0.007
s+0.007
 argue+0.007
Neg logit contributions
aver-0.009
 doctors-0.009
 knights-0.008
avers-0.008
eh-0.007
Token much
Token ID881
Feature activation-2.00e-02

Pos logit contributions
idents+0.013
s+0.011
ric+0.011
ah+0.011
 argue+0.011
Neg logit contributions
aver-0.015
 doctors-0.014
 knights-0.013
avers-0.013
 pirates-0.012
Token fun
Token ID1257
Feature activation-3.66e-02

Pos logit contributions
aver+0.046
avers+0.041
 knights+0.040
 doctors+0.040
eh+0.039
Neg logit contributions
idents-0.025
s-0.021
ah-0.019
 Today-0.018
 argue-0.018
Token each
Token ID1123
Feature activation+1.26e-02

Pos logit contributions
aver+0.026
 doctors+0.025
 knights+0.021
avers+0.021
 pirates+0.020
Neg logit contributions
idents-0.033
ric-0.030
ah-0.029
s-0.029
 argue-0.028
Token other
Token ID584
Feature activation+9.30e-03

Pos logit contributions
idents+0.025
ric+0.023
ah+0.021
s+0.021
 argue+0.020
Neg logit contributions
aver-0.019
 doctors-0.019
 knights-0.016
avers-0.016
ily-0.015
Token."
Token ID526
Feature activation+8.50e-04

Pos logit contributions
idents+0.015
ric+0.013
ah+0.013
seven+0.012
s+0.012
Neg logit contributions
 doctors-0.017
aver-0.016
 knights-0.015
 pirates-0.014
avers-0.014
Token
Token ID198
Feature activation-1.13e-03

Pos logit contributions
idents+0.001
ric+0.001
ah+0.001
s+0.001
 argue+0.001
Neg logit contributions
aver-0.002
 doctors-0.002
 knights-0.002
avers-0.002
 pirates-0.002
Token
Token ID198
Feature activation-4.40e-03

Pos logit contributions
aver+0.002
 doctors+0.002
 knights+0.002
avers+0.002
 pirates+0.002
Neg logit contributions
idents-0.002
ric-0.001
ah-0.001
s-0.001
 argue-0.001
TokenThat
Token ID2504
Feature activation+8.26e-02

Pos logit contributions
aver+0.008
 doctors+0.008
 knights+0.007
 pirates+0.007
avers+0.007
Neg logit contributions
idents-0.006
s-0.006
ric-0.006
ah-0.006
 argue-0.006
Token night
Token ID1755
Feature activation-1.35e-02

Pos logit contributions
idents+0.152
ric+0.147
 argue+0.133
seven+0.133
ishly+0.120
Neg logit contributions
 doctors-0.132
aver-0.123
P-0.105
ily-0.101
eh-0.100
Token,
Token ID11
Feature activation+6.95e-03

Pos logit contributions
aver+0.028
 doctors+0.026
 knights+0.025
avers+0.024
P+0.024
Neg logit contributions
idents-0.018
s-0.016
ah-0.015
ric-0.015
 Today-0.015
Token Mom
Token ID11254
Feature activation-1.12e-02

Pos logit contributions
idents+0.012
ric+0.011
ah+0.011
s+0.010
 argue+0.010
Neg logit contributions
aver-0.012
 doctors-0.011
 knights-0.010
 pirates-0.010
avers-0.010
Token and
Token ID290
Feature activation+4.41e-03

Pos logit contributions
aver+0.021
 doctors+0.020
 knights+0.018
avers+0.018
P+0.017
Neg logit contributions
idents-0.017
ah-0.015
ric-0.015
s-0.015
 argue-0.014
Token Dad
Token ID17415
Feature activation+4.22e-03

Pos logit contributions
idents+0.008
ric+0.007
ah+0.007
s+0.007
 argue+0.007
Neg logit contributions
aver-0.007
 doctors-0.007
 knights-0.006
avers-0.006
 pirates-0.006
Token other
Token ID584
Feature activation+2.18e-02

Pos logit contributions
idents+0.034
ric+0.030
ah+0.029
s+0.029
 argue+0.028
Neg logit contributions
aver-0.034
 doctors-0.032
 knights-0.029
avers-0.028
 pirates-0.027
Token.
Token ID13
Feature activation+3.42e-03

Pos logit contributions
idents+0.033
ric+0.029
ah+0.029
s+0.027
seven+0.027
Neg logit contributions
aver-0.041
 doctors-0.040
 knights-0.036
avers-0.035
 pirates-0.035
Token (space)
Token ID220
Feature activation+1.05e-02

Pos logit contributions
idents+0.004
ric+0.004
ah+0.004
s+0.004
 argue+0.004
Neg logit contributions
aver-0.007
 doctors-0.007
 knights-0.006
avers-0.006
 pirates-0.006
Token (newline)
Token ID198
Feature activation-5.17e-04

Pos logit contributions
idents+0.016
ah+0.014
ric+0.014
 argue+0.014
s+0.013
Neg logit contributions
aver-0.020
 doctors-0.019
avers-0.017
 knights-0.017
ily-0.017
Token (newline)
Token ID198
Feature activation-3.70e-03

Pos logit contributions
aver+0.001
 doctors+0.001
 knights+0.001
avers+0.001
 pirates+0.001
Neg logit contributions
idents-0.001
ric-0.001
ah-0.001
s-0.001
 argue-0.001
TokenThat
Token ID2504
Feature activation+8.20e-02

Pos logit contributions
aver+0.007
 doctors+0.006
 knights+0.006
 pirates+0.005
avers+0.005
Neg logit contributions
idents-0.006
ric-0.005
ah-0.005
s-0.005
 argue-0.005
Token's
Token ID338
Feature activation+2.53e-02

Pos logit contributions
idents+0.139
ric+0.126
 argue+0.114
seven+0.112
s+0.108
Neg logit contributions
aver-0.143
 doctors-0.143
 knights-0.119
 pirates-0.117
avers-0.116
Token when
Token ID618
Feature activation+9.76e-03

Pos logit contributions
idents+0.039
ric+0.035
s+0.034
ah+0.034
 argue+0.032
Neg logit contributions
aver-0.048
 doctors-0.047
 knights-0.041
avers-0.041
 pirates-0.040
Token they
Token ID484
Feature activation+2.36e-02

Pos logit contributions
idents+0.014
ric+0.012
ah+0.012
s+0.012
 argue+0.012
Neg logit contributions
aver-0.019
 doctors-0.018
avers-0.017
 knights-0.016
eh-0.016
Token saw
Token ID2497
Feature activation+1.13e-02

Pos logit contributions
idents+0.041
ric+0.036
ah+0.035
s+0.035
seven+0.034
Neg logit contributions
aver-0.040
 doctors-0.039
 knights-0.034
avers-0.033
 pirates-0.033
Token the
Token ID262
Feature activation+2.49e-04

Pos logit contributions
idents+0.016
ric+0.014
s+0.013
ah+0.013
 argue+0.013
Neg logit contributions
aver-0.023
 doctors-0.022
 knights-0.020
avers-0.020
P-0.020
Token said
Token ID531
Feature activation-1.83e-02

Pos logit contributions
aver+0.017
 doctors+0.017
 knights+0.015
avers+0.015
 pirates+0.014
Neg logit contributions
idents-0.013
ric-0.012
ah-0.011
s-0.011
 argue-0.011
Token.
Token ID13
Feature activation-1.18e-02

Pos logit contributions
aver+0.041
 doctors+0.040
P+0.036
 knights+0.036
avers+0.036
Neg logit contributions
idents-0.022
s-0.019
 argue-0.017
ric-0.017
ah-0.017
Token (space)
Token ID220
Feature activation-3.03e-04

Pos logit contributions
aver+0.027
 doctors+0.027
 knights+0.024
 pirates+0.024
avers+0.023
Neg logit contributions
idents-0.013
s-0.012
 Today-0.011
ric-0.010
ah-0.010
Token (newline)
Token ID198
Feature activation-5.61e-03

Pos logit contributions
aver+0.001
 doctors+0.001
 knights+0.001
avers+0.001
 pirates+0.001
Neg logit contributions
idents-0.000
ric-0.000
s-0.000
ah-0.000
 argue-0.000
Token (newline)
Token ID198
Feature activation-6.61e-03

Pos logit contributions
 doctors+0.014
aver+0.014
 knights+0.013
 pirates+0.013
 spies+0.012
Neg logit contributions
s-0.005
idents-0.004
 Today-0.004
that-0.004
 -0.003
TokenThat
Token ID2504
Feature activation+8.19e-02

Pos logit contributions
 doctors+0.011
aver+0.011
 knights+0.010
 pirates+0.010
 spies+0.009
Neg logit contributions
s-0.011
idents-0.010
that-0.010
ah-0.009
 Today-0.009
Token night
Token ID1755
Feature activation-1.82e-02

Pos logit contributions
idents+0.134
ric+0.132
 argue+0.115
seven+0.115
ah+0.108
Neg logit contributions
 doctors-0.144
aver-0.139
 knights-0.120
ily-0.119
P-0.118
Token,
Token ID11
Feature activation+4.49e-03

Pos logit contributions
aver+0.041
 doctors+0.037
 knights+0.037
P+0.036
avers+0.036
Neg logit contributions
idents-0.021
s-0.019
ah-0.017
 Today-0.017
 destination-0.016
Token Daisy
Token ID40355
Feature activation+2.73e-03

Pos logit contributions
idents+0.008
ric+0.007
s+0.007
ah+0.007
 argue+0.006
Neg logit contributions
aver-0.008
 doctors-0.007
 knights-0.007
avers-0.007
 pirates-0.006
Token's
Token ID338
Feature activation+6.08e-04

Pos logit contributions
idents+0.004
ric+0.004
ah+0.004
s+0.004
 argue+0.004
Neg logit contributions
aver-0.005
 doctors-0.005
 knights-0.004
avers-0.004
 pirates-0.004
Token mom
Token ID1995
Feature activation-2.15e-02

Pos logit contributions
idents+0.001
s+0.001
ric+0.001
 argue+0.001
ah+0.001
Neg logit contributions
aver-0.001
 doctors-0.001
avers-0.001
P-0.001
eh-0.001
TokenWow
Token ID22017
Feature activation-4.08e-03

Pos logit contributions
idents+0.025
ric+0.023
 argue+0.023
 destination+0.021
s+0.021
Neg logit contributions
aver-0.029
 doctors-0.028
P-0.025
avers-0.025
 knights-0.025
Token,
Token ID11
Feature activation+8.41e-03

Pos logit contributions
aver+0.007
 doctors+0.007
 knights+0.006
 pirates+0.006
avers+0.006
Neg logit contributions
idents-0.007
ric-0.006
ah-0.006
 argue-0.006
seven-0.006
Token Tim
Token ID5045
Feature activation+2.55e-02

Pos logit contributions
idents+0.015
ric+0.015
ah+0.014
s+0.014
 argue+0.013
Neg logit contributions
 doctors-0.013
aver-0.013
 knights-0.011
 pirates-0.011
ily-0.010
Tokenmy
Token ID1820
Feature activation+1.74e-02

Pos logit contributions
idents+0.041
ric+0.037
 summit+0.037
 argue+0.035
seven+0.034
Neg logit contributions
aver-0.049
 doctors-0.044
avers-0.040
 knights-0.040
ily-0.039
Token!
Token ID0
Feature activation+1.61e-02

Pos logit contributions
idents+0.024
ric+0.024
 summit+0.021
 argue+0.021
ah+0.021
Neg logit contributions
aver-0.032
 doctors-0.032
 knights-0.030
 pirates-0.029
ily-0.027
Token That
Token ID1320
Feature activation+8.17e-02

Pos logit contributions
idents+0.022
ric+0.019
ah+0.018
s+0.018
 argue+0.017
Neg logit contributions
aver-0.033
 doctors-0.032
 knights-0.029
avers-0.029
eh-0.028
Token's
Token ID338
Feature activation+1.15e-02

Pos logit contributions
idents+0.123
ric+0.117
 argue+0.108
seven+0.101
ah+0.101
Neg logit contributions
 doctors-0.153
aver-0.145
 knights-0.139
 pirates-0.137
P-0.123
Token a
Token ID257
Feature activation+3.15e-02

Pos logit contributions
idents+0.021
s+0.021
ah+0.020
 argue+0.019
ric+0.019
Neg logit contributions
 doctors-0.018
aver-0.018
 pirates-0.015
 knights-0.015
ily-0.014
Token great
Token ID1049
Feature activation+3.20e-02

Pos logit contributions
idents+0.048
ric+0.043
s+0.041
ah+0.041
 argue+0.040
Neg logit contributions
aver-0.059
 doctors-0.058
 knights-0.052
 pirates-0.050
avers-0.050
Token picture
Token ID4286
Feature activation-6.70e-04

Pos logit contributions
idents+0.047
ric+0.042
ah+0.040
s+0.040
 argue+0.038
Neg logit contributions
aver-0.062
 doctors-0.060
 knights-0.055
avers-0.053
 pirates-0.053
Token of
Token ID286
Feature activation+1.34e-03

Pos logit contributions
aver+0.001
 doctors+0.001
avers+0.001
 knights+0.001
P+0.001
Neg logit contributions
idents-0.001
s-0.001
ah-0.001
ric-0.001
seven-0.001
Tokenummy
Token ID13513
Feature activation-1.21e-02

Pos logit contributions
idents+0.010
s+0.009
ah+0.008
ric+0.008
 Today+0.008
Neg logit contributions
aver-0.014
 doctors-0.014
 knights-0.012
avers-0.012
 pirates-0.012
Token!"
Token ID2474
Feature activation+1.88e-02

Pos logit contributions
aver+0.027
 doctors+0.027
 knights+0.025
 pirates+0.024
avers+0.024
Neg logit contributions
idents-0.014
ric-0.013
ah-0.011
 argue-0.011
s-0.011
Token (space)
Token ID220
Feature activation+6.92e-03

Pos logit contributions
idents+0.030
ric+0.028
ah+0.028
 argue+0.027
 summit+0.027
Neg logit contributions
aver-0.032
 doctors-0.031
avers-0.028
 knights-0.028
ily-0.026
Token (newline)
Token ID198
Feature activation+1.41e-02

Pos logit contributions
idents+0.011
ric+0.009
ah+0.009
 argue+0.009
seven+0.009
Neg logit contributions
aver-0.013
 doctors-0.012
avers-0.011
 knights-0.011
ily-0.011
Token (newline)
Token ID198
Feature activation+1.48e-02

Pos logit contributions
idents+0.026
 argue+0.022
ric+0.022
ah+0.022
 destination+0.021
Neg logit contributions
aver-0.024
 doctors-0.021
avers-0.020
P-0.019
ily-0.019
TokenThat
Token ID2504
Feature activation+8.16e-02

Pos logit contributions
idents+0.026
ric+0.023
ah+0.022
s+0.022
 argue+0.022
Neg logit contributions
aver-0.026
 doctors-0.024
avers-0.022
 knights-0.021
P-0.021
Token night
Token ID1755
Feature activation-1.89e-02

Pos logit contributions
ric+0.135
idents+0.133
seven+0.121
 argue+0.116
that+0.107
Neg logit contributions
 doctors-0.141
aver-0.133
 pirates-0.119
 knights-0.118
P-0.116
Token,
Token ID11
Feature activation+9.12e-03

Pos logit contributions
aver+0.044
 doctors+0.042
P+0.040
 knights+0.040
avers+0.039
Neg logit contributions
idents-0.020
s-0.019
ah-0.018
 Today-0.017
 destination-0.014
Token Tim
Token ID5045
Feature activation+5.76e-03

Pos logit contributions
idents+0.017
ric+0.016
ah+0.016
s+0.015
 argue+0.015
Neg logit contributions
aver-0.014
 doctors-0.013
 knights-0.012
 pirates-0.011
avers-0.011
Tokenmy
Token ID1820
Feature activation-1.35e-02

Pos logit contributions
idents+0.006
ric+0.005
s+0.005
ah+0.005
 argue+0.005
Neg logit contributions
aver-0.013
 doctors-0.013
 knights-0.012
avers-0.012
 pirates-0.011
Token went
Token ID1816
Feature activation-6.82e-03

Pos logit contributions
aver+0.030
 doctors+0.028
avers+0.027
 knights+0.026
itt+0.026
Neg logit contributions
idents-0.015
s-0.015
ah-0.014
 argue-0.012
ric-0.012
Token,
Token ID11
Feature activation-1.20e-03

Pos logit contributions
idents+0.022
s+0.019
ric+0.019
ah+0.019
 argue+0.018
Neg logit contributions
aver-0.033
 doctors-0.032
 knights-0.029
avers-0.029
eh-0.028
Token understanding
Token ID4547
Feature activation+1.68e-02

Pos logit contributions
aver+0.002
 doctors+0.002
 knights+0.002
avers+0.002
P+0.002
Neg logit contributions
idents-0.002
ric-0.002
s-0.002
ah-0.002
 argue-0.002
Token.
Token ID13
Feature activation+4.25e-03

Pos logit contributions
idents+0.028
ric+0.026
ah+0.025
 argue+0.024
seven+0.023
Neg logit contributions
 doctors-0.029
aver-0.028
 knights-0.026
ily-0.025
 pirates-0.024
Token (newline)
Token ID198
Feature activation+1.41e-03

Pos logit contributions
idents+0.005
ah+0.005
ric+0.005
 argue+0.004
seven+0.004
Neg logit contributions
aver-0.009
 doctors-0.009
 knights-0.008
ily-0.008
avers-0.008
Token (newline)
Token ID198
Feature activation+4.07e-03

Pos logit contributions
idents+0.002
ric+0.002
ah+0.002
s+0.002
 argue+0.002
Neg logit contributions
aver-0.003
 doctors-0.003
 knights-0.002
avers-0.002
 pirates-0.002
TokenThat
Token ID2504
Feature activation+8.15e-02

Pos logit contributions
idents+0.006
s+0.006
ric+0.006
ah+0.006
 argue+0.005
Neg logit contributions
aver-0.007
 doctors-0.007
 knights-0.006
avers-0.006
 pirates-0.006
Token evening
Token ID6180
Feature activation-1.15e-02

Pos logit contributions
ric+0.133
idents+0.131
seven+0.116
 argue+0.115
ah+0.107
Neg logit contributions
 doctors-0.141
aver-0.135
 knights-0.117
ily-0.117
P-0.116
Token,
Token ID11
Feature activation+8.67e-03

Pos logit contributions
aver+0.023
 doctors+0.022
 knights+0.020
avers+0.020
eh+0.020
Neg logit contributions
idents-0.017
s-0.015
ah-0.014
 Today-0.013
ric-0.013
Token Emily
Token ID17608
Feature activation+1.24e-03

Pos logit contributions
idents+0.017
ric+0.016
ah+0.016
s+0.015
 argue+0.015
Neg logit contributions
aver-0.012
 doctors-0.012
 knights-0.010
avers-0.010
 pirates-0.009
Token decided
Token ID3066
Feature activation+4.39e-03

Pos logit contributions
idents+0.002
s+0.002
ah+0.002
ric+0.002
 argue+0.002
Neg logit contributions
aver-0.002
 doctors-0.002
 knights-0.002
avers-0.002
 pirates-0.002
Token to
Token ID284
Feature activation-1.25e-02

Pos logit contributions
idents+0.006
s+0.005
ric+0.005
ah+0.005
 argue+0.005
Neg logit contributions
aver-0.009
 doctors-0.009
avers-0.008
 knights-0.008
eh-0.008
Token you
Token ID345
Feature activation-9.82e-03

Pos logit contributions
aver+0.007
 doctors+0.006
avers+0.006
eh+0.006
P+0.005
Neg logit contributions
idents-0.005
s-0.005
ric-0.005
 argue-0.004
ah-0.004
Token love
Token ID1842
Feature activation-1.22e-02

Pos logit contributions
aver+0.019
 doctors+0.019
 knights+0.017
avers+0.016
 pirates+0.016
Neg logit contributions
idents-0.014
ric-0.012
ah-0.012
s-0.012
 argue-0.011
Token!"
Token ID2474
Feature activation-2.60e-03

Pos logit contributions
aver+0.025
 doctors+0.024
 knights+0.021
avers+0.021
eh+0.021
Neg logit contributions
idents-0.017
ric-0.015
s-0.015
ah-0.014
 argue-0.014
Token (newline)
Token ID198
Feature activation-5.16e-03

Pos logit contributions
aver+0.005
 doctors+0.005
 knights+0.005
avers+0.005
 pirates+0.005
Neg logit contributions
idents-0.003
ric-0.003
ah-0.003
s-0.003
 argue-0.003
Token (newline)
Token ID198
Feature activation-5.48e-03

Pos logit contributions
aver+0.011
 doctors+0.010
 knights+0.009
avers+0.009
 pirates+0.009
Neg logit contributions
idents-0.006
ric-0.006
s-0.006
ah-0.006
seven-0.005
TokenThat
Token ID2504
Feature activation+8.15e-02

Pos logit contributions
aver+0.010
 doctors+0.010
 knights+0.009
 pirates+0.008
avers+0.008
Neg logit contributions
idents-0.008
ah-0.008
s-0.007
ric-0.007
seven-0.007
Token evening
Token ID6180
Feature activation-8.21e-03

Pos logit contributions
idents+0.149
ric+0.139
seven+0.127
 argue+0.126
that+0.113
Neg logit contributions
 doctors-0.125
aver-0.125
 knights-0.107
P-0.106
 pirates-0.104
Token,
Token ID11
Feature activation+9.03e-03

Pos logit contributions
aver+0.017
 doctors+0.017
avers+0.015
 knights+0.015
eh+0.015
Neg logit contributions
idents-0.010
s-0.009
ah-0.009
 Today-0.008
ric-0.008
Token Emily
Token ID17608
Feature activation+1.54e-02

Pos logit contributions
idents+0.014
ric+0.012
ah+0.012
s+0.012
 argue+0.011
Neg logit contributions
aver-0.017
 doctors-0.017
 knights-0.015
avers-0.015
 pirates-0.014
Token was
Token ID373
Feature activation+1.38e-03

Pos logit contributions
idents+0.024
ric+0.021
ah+0.020
s+0.020
 argue+0.020
Neg logit contributions
aver-0.029
 doctors-0.028
 knights-0.025
avers-0.025
 pirates-0.024
Token determined
Token ID5295
Feature activation+1.16e-02

Pos logit contributions
ric+0.002
idents+0.001
ah+0.001
 argue+0.001
 destination+0.001
Neg logit contributions
aver-0.003
 doctors-0.003
P-0.003
 knights-0.003
avers-0.002
Token way
Token ID835
Feature activation-1.24e-02

Pos logit contributions
aver+0.010
 doctors+0.010
 knights+0.009
avers+0.009
 pirates+0.009
Neg logit contributions
idents-0.010
ric-0.009
ah-0.009
s-0.009
 argue-0.009
Token".
Token ID1911
Feature activation+1.25e-03

Pos logit contributions
aver+0.024
 doctors+0.023
 knights+0.021
avers+0.020
 pirates+0.020
Neg logit contributions
idents-0.019
ric-0.016
ah-0.016
s-0.015
 argue-0.015
Token (space)
Token ID220
Feature activation+3.39e-03

Pos logit contributions
idents+0.002
ric+0.002
ah+0.002
 argue+0.002
s+0.002
Neg logit contributions
aver-0.002
 doctors-0.002
 knights-0.002
avers-0.002
ily-0.002
Token (newline)
Token ID198
Feature activation+9.92e-04

Pos logit contributions
idents+0.005
ric+0.004
ah+0.004
s+0.004
 argue+0.004
Neg logit contributions
aver-0.007
 doctors-0.006
 knights-0.006
avers-0.006
ily-0.006
Token (newline)
Token ID198
Feature activation-4.38e-03

Pos logit contributions
idents+0.001
ric+0.001
ah+0.001
s+0.001
 argue+0.001
Neg logit contributions
aver-0.002
 doctors-0.002
 knights-0.002
avers-0.002
P-0.002
TokenThat
Token ID2504
Feature activation+8.13e-02

Pos logit contributions
aver+0.007
 doctors+0.007
 knights+0.006
avers+0.006
 pirates+0.005
Neg logit contributions
idents-0.008
ric-0.007
ah-0.007
s-0.007
 argue-0.007
Token was
Token ID373
Feature activation+2.02e-02

Pos logit contributions
idents+0.138
ric+0.129
seven+0.116
 argue+0.116
that+0.106
Neg logit contributions
 doctors-0.141
aver-0.138
 pirates-0.116
 knights-0.115
P-0.112
Token just
Token ID655
Feature activation+7.02e-03

Pos logit contributions
idents+0.035
ric+0.032
s+0.031
ah+0.030
 argue+0.029
Neg logit contributions
aver-0.035
 doctors-0.034
avers-0.029
 knights-0.029
ily-0.028
Token the
Token ID262
Feature activation-2.83e-03

Pos logit contributions
idents+0.011
ric+0.010
s+0.010
ah+0.010
 argue+0.009
Neg logit contributions
aver-0.013
 doctors-0.013
 knights-0.011
avers-0.011
 pirates-0.011
Token advice
Token ID5608
Feature activation+1.69e-02

Pos logit contributions
aver+0.006
 doctors+0.005
 knights+0.005
avers+0.005
eh+0.005
Neg logit contributions
idents-0.004
ric-0.004
ah-0.004
s-0.004
 argue-0.003
Token Tommy
Token ID19919
Feature activation+1.07e-02

Pos logit contributions
idents+0.033
ric+0.030
ah+0.029
s+0.028
 argue+0.028
Neg logit contributions
aver-0.025
 doctors-0.024
 knights-0.021
avers-0.020
ily-0.020
Token of
Token ID286
Feature activation-8.42e-03

Pos logit contributions
idents+0.003
ric+0.003
ah+0.003
s+0.003
 argue+0.002
Neg logit contributions
aver-0.004
 doctors-0.004
 knights-0.003
avers-0.003
 pirates-0.003
Token joy
Token ID8716
Feature activation+2.47e-03

Pos logit contributions
aver+0.017
 doctors+0.016
 knights+0.015
avers+0.014
P+0.014
Neg logit contributions
idents-0.012
ric-0.011
ah-0.010
s-0.010
 argue-0.010
Token.
Token ID13
Feature activation+8.90e-03

Pos logit contributions
idents+0.004
ric+0.004
ah+0.003
 argue+0.003
 Today+0.003
Neg logit contributions
 doctors-0.004
aver-0.004
 knights-0.004
 pirates-0.004
ily-0.004
Token (newline)
Token ID198
Feature activation-5.63e-03

Pos logit contributions
idents+0.015
ah+0.014
 argue+0.014
ric+0.014
seven+0.012
Neg logit contributions
 doctors-0.014
aver-0.013
 knights-0.013
ily-0.013
P-0.012
Token (newline)
Token ID198
Feature activation-4.92e-03

Pos logit contributions
aver+0.009
 doctors+0.009
 knights+0.008
avers+0.007
ily+0.007
Neg logit contributions
idents-0.011
ric-0.009
ah-0.009
 argue-0.009
s-0.009
TokenThat
Token ID2504
Feature activation+8.13e-02

Pos logit contributions
aver+0.008
 doctors+0.008
 knights+0.007
avers+0.007
P+0.007
Neg logit contributions
idents-0.009
ric-0.008
ah-0.007
 argue-0.007
s-0.007
Token night
Token ID1755
Feature activation-1.67e-02

Pos logit contributions
idents+0.143
ric+0.134
 argue+0.123
seven+0.118
ah+0.109
Neg logit contributions
 doctors-0.133
aver-0.128
 knights-0.109
 pirates-0.109
P-0.106
Token,
Token ID11
Feature activation+8.83e-03

Pos logit contributions
aver+0.033
 doctors+0.031
 knights+0.028
avers+0.028
eh+0.027
Neg logit contributions
idents-0.024
s-0.021
ric-0.021
ah-0.021
 argue-0.020
Token Maya
Token ID26041
Feature activation+3.39e-04

Pos logit contributions
idents+0.016
ric+0.015
ah+0.014
 argue+0.014
s+0.013
Neg logit contributions
 doctors-0.014
aver-0.014
 knights-0.012
 pirates-0.012
ily-0.012
Token was
Token ID373
Feature activation+3.42e-03

Pos logit contributions
idents+0.001
ric+0.000
ah+0.000
s+0.000
 argue+0.000
Neg logit contributions
aver-0.001
 doctors-0.001
 knights-0.001
avers-0.001
 pirates-0.001
Token visited
Token ID8672
Feature activation+9.43e-03

Pos logit contributions
idents+0.005
ric+0.004
ah+0.004
s+0.004
 argue+0.004
Neg logit contributions
aver-0.007
 doctors-0.006
 knights-0.006
avers-0.006
 pirates-0.006
Token lately
Token ID16537
Feature activation-1.08e-02

Pos logit contributions
aver+0.036
 doctors+0.032
avers+0.030
P+0.029
eh+0.029
Neg logit contributions
idents-0.037
ric-0.034
s-0.033
 argue-0.033
ah-0.031
Token.
Token ID13
Feature activation+5.51e-03

Pos logit contributions
aver+0.024
 doctors+0.022
 knights+0.021
avers+0.021
P+0.021
Neg logit contributions
idents-0.013
s-0.012
ah-0.011
ric-0.011
 argue-0.010
Token With
Token ID2080
Feature activation-2.19e-02

Pos logit contributions
idents+0.007
ric+0.006
ah+0.006
s+0.006
 argue+0.005
Neg logit contributions
aver-0.012
 doctors-0.012
 knights-0.010
avers-0.010
 pirates-0.010
Token work
Token ID670
Feature activation-8.21e-03

Pos logit contributions
aver+0.041
 doctors+0.040
 knights+0.036
avers+0.035
 pirates+0.034
Neg logit contributions
idents-0.033
ric-0.029
ah-0.029
s-0.028
 argue-0.027
Token and
Token ID290
Feature activation-6.61e-03

Pos logit contributions
aver+0.018
 doctors+0.017
eh+0.016
 knights+0.016
avers+0.016
Neg logit contributions
idents-0.010
s-0.009
ah-0.008
 argue-0.008
 destination-0.008
Token err
Token ID11454
Feature activation-8.56e-02

Pos logit contributions
aver+0.013
 doctors+0.011
avers+0.011
eh+0.011
P+0.011
Neg logit contributions
idents-0.010
s-0.009
 argue-0.008
ric-0.008
ah-0.008
Tokenands
Token ID1746
Feature activation-1.56e-02

Pos logit contributions
aver+0.113
 doctors+0.101
eh+0.090
 knights+0.087
P+0.087
Neg logit contributions
idents-0.167
seven-0.160
 argue-0.158
ric-0.154
 correctly-0.147
Token,
Token ID11
Feature activation+1.00e-02

Pos logit contributions
aver+0.031
 doctors+0.029
 knights+0.027
eh+0.026
avers+0.026
Neg logit contributions
idents-0.022
s-0.020
ric-0.019
ah-0.019
 argue-0.018
Token they
Token ID484
Feature activation-1.12e-02

Pos logit contributions
idents+0.016
ric+0.014
ah+0.014
s+0.014
 argue+0.013
Neg logit contributions
aver-0.018
 doctors-0.018
 knights-0.016
avers-0.016
 pirates-0.015
Token didn
Token ID1422
Feature activation+2.33e-02

Pos logit contributions
aver+0.021
 doctors+0.020
 knights+0.018
avers+0.018
eh+0.018
Neg logit contributions
idents-0.017
s-0.015
ric-0.015
ah-0.015
 argue-0.015
Token't
Token ID470
Feature activation+1.54e-02

Pos logit contributions
idents+0.035
ah+0.032
ric+0.031
seven+0.028
s+0.028
Neg logit contributions
 doctors-0.043
aver-0.041
ily-0.038
 knights-0.037
 pirates-0.036
Token asks
Token ID7893
Feature activation+1.87e-03

Pos logit contributions
aver+0.001
 doctors+0.001
avers+0.001
 knights+0.001
eh+0.001
Neg logit contributions
idents-0.001
s-0.001
ric-0.001
ah-0.001
 argue-0.001
Token.
Token ID13
Feature activation-1.15e-02

Pos logit contributions
s+0.002
idents+0.002
 argue+0.001
 Today+0.001
ah+0.001
Neg logit contributions
aver-0.005
 doctors-0.004
eh-0.004
avers-0.004
P-0.004
Token (newline)
Token ID198
Feature activation-2.79e-03

Pos logit contributions
aver+0.026
 doctors+0.025
 knights+0.023
avers+0.022
eh+0.022
Neg logit contributions
idents-0.014
s-0.012
 Today-0.011
ric-0.011
ah-0.011
Token (newline)
Token ID198
Feature activation-3.00e-03

Pos logit contributions
aver+0.006
 doctors+0.006
 knights+0.006
avers+0.005
 pirates+0.005
Neg logit contributions
idents-0.003
s-0.003
ah-0.003
ric-0.003
 Today-0.003
Token"
Token ID1
Feature activation-1.60e-02

Pos logit contributions
aver+0.006
 doctors+0.006
 knights+0.005
avers+0.005
 pirates+0.005
Neg logit contributions
idents-0.004
s-0.004
ah-0.004
 argue-0.004
 Today-0.004
TokenSur
Token ID14214
Feature activation-8.52e-02

Pos logit contributions
aver+0.035
 doctors+0.033
 knights+0.031
avers+0.030
 pirates+0.029
Neg logit contributions
idents-0.020
ric-0.017
s-0.017
ah-0.017
 argue-0.016
Tokenfing
Token ID28825
Feature activation+3.17e-02

Pos logit contributions
 doctors+0.157
aver+0.154
 knights+0.139
eh+0.136
 gard+0.133
Neg logit contributions
idents-0.120
ah-0.117
ric-0.113
s-0.112
 argue-0.109
Token is
Token ID318
Feature activation+1.68e-02

Pos logit contributions
idents+0.044
seven+0.036
ah+0.036
 argue+0.036
ric+0.035
Neg logit contributions
aver-0.061
 doctors-0.058
 pirates-0.052
avers-0.052
 knights-0.052
Token fun
Token ID1257
Feature activation-1.94e-02

Pos logit contributions
idents+0.029
ric+0.026
ah+0.026
s+0.026
 argue+0.025
Neg logit contributions
aver-0.028
 doctors-0.028
avers-0.024
 knights-0.023
 pirates-0.023
Token,
Token ID11
Feature activation-3.59e-03

Pos logit contributions
aver+0.042
 doctors+0.040
 knights+0.037
avers+0.036
 pirates+0.036
Neg logit contributions
idents-0.025
ric-0.021
ah-0.021
s-0.020
 argue-0.019
Token but
Token ID475
Feature activation+7.47e-03

Pos logit contributions
aver+0.007
 doctors+0.007
 knights+0.006
avers+0.006
 pirates+0.006
Neg logit contributions
idents-0.005
ric-0.005
ah-0.005
s-0.005
 argue-0.004
Token room
Token ID2119
Feature activation-1.88e-02

Pos logit contributions
aver+0.056
 doctors+0.047
avers+0.047
ily+0.045
P+0.043
Neg logit contributions
idents-0.059
s-0.052
ric-0.052
 destination-0.051
 Today-0.049
Token while
Token ID981
Feature activation-2.66e-02

Pos logit contributions
aver+0.040
 doctors+0.038
avers+0.035
 knights+0.035
P+0.034
Neg logit contributions
idents-0.024
s-0.020
ric-0.020
ah-0.019
 argue-0.019
Token she
Token ID673
Feature activation-1.75e-02

Pos logit contributions
aver+0.060
avers+0.050
eh+0.048
P+0.048
 doctors+0.046
Neg logit contributions
idents-0.034
 argue-0.031
s-0.031
ric-0.028
 Today-0.028
Token ran
Token ID4966
Feature activation-5.76e-03

Pos logit contributions
aver+0.035
 doctors+0.033
 knights+0.030
avers+0.030
P+0.029
Neg logit contributions
idents-0.025
s-0.022
ah-0.022
ric-0.021
 argue-0.020
Token an
Token ID281
Feature activation-3.09e-02

Pos logit contributions
aver+0.009
 doctors+0.008
avers+0.008
 knights+0.008
P+0.007
Neg logit contributions
idents-0.010
s-0.010
 argue-0.009
ah-0.009
 Today-0.009
Token err
Token ID11454
Feature activation-8.52e-02

Pos logit contributions
aver+0.051
 doctors+0.045
 knights+0.045
P+0.045
avers+0.044
Neg logit contributions
idents-0.048
 Today-0.043
ah-0.043
 argue-0.043
ric-0.041
Tokenand
Token ID392
Feature activation-5.01e-02

Pos logit contributions
aver+0.118
 doctors+0.116
 knights+0.106
 pirates+0.100
avers+0.095
Neg logit contributions
 sorts-0.133
seven-0.132
 argue-0.130
 kinds-0.128
idents-0.127
Token.
Token ID13
Feature activation-4.05e-04

Pos logit contributions
aver+0.090
 doctors+0.080
 knights+0.078
avers+0.076
P+0.074
Neg logit contributions
idents-0.078
s-0.074
ah-0.067
 argue-0.067
ric-0.065
Token The
Token ID383
Feature activation-9.94e-04

Pos logit contributions
aver+0.001
 doctors+0.001
 knights+0.001
avers+0.001
 pirates+0.001
Neg logit contributions
idents-0.000
ric-0.000
s-0.000
ah-0.000
 Today-0.000
Token girl
Token ID2576
Feature activation-2.97e-02

Pos logit contributions
aver+0.002
avers+0.002
 knights+0.002
P+0.002
iers+0.002
Neg logit contributions
idents-0.002
 Today-0.001
s-0.001
 top-0.001
 destination-0.001
Token was
Token ID373
Feature activation-7.49e-03

Pos logit contributions
aver+0.060
 doctors+0.058
 knights+0.054
P+0.054
 spies+0.053
Neg logit contributions
idents-0.038
s-0.037
ah-0.033
 argue-0.031
ric-0.029
Token it
Token ID340
Feature activation+1.93e-02

Pos logit contributions
idents+0.002
ah+0.002
ric+0.002
 argue+0.002
 Today+0.002
Neg logit contributions
 doctors-0.002
aver-0.002
 knights-0.002
 pirates-0.002
eh-0.002
Token.
Token ID13
Feature activation-5.44e-03

Pos logit contributions
ric+0.039
idents+0.037
seven+0.035
ah+0.034
 argue+0.033
Neg logit contributions
 doctors-0.026
aver-0.026
ily-0.024
eh-0.023
 knights-0.023
Token So
Token ID1406
Feature activation-3.70e-03

Pos logit contributions
aver+0.010
 doctors+0.010
 knights+0.009
avers+0.009
ily+0.008
Neg logit contributions
idents-0.008
ric-0.008
ah-0.007
 argue-0.007
seven-0.007
Token,
Token ID11
Feature activation-1.22e-02

Pos logit contributions
aver+0.007
 doctors+0.007
 knights+0.006
avers+0.006
eh+0.006
Neg logit contributions
idents-0.006
ric-0.005
ah-0.005
s-0.005
 argue-0.005
Token he
Token ID339
Feature activation-3.18e-02

Pos logit contributions
aver+0.019
 doctors+0.019
 knights+0.017
avers+0.016
 pirates+0.016
Neg logit contributions
idents-0.022
ric-0.020
ah-0.020
s-0.019
 argue-0.019
Token l
Token ID300
Feature activation-8.26e-02

Pos logit contributions
 doctors+0.056
aver+0.055
 importantly+0.054
iers+0.052
 emergencies+0.051
Neg logit contributions
ah-0.043
s-0.042
idents-0.041
 argue-0.038
 might-0.035
Tokenicked
Token ID9484
Feature activation+3.19e-03

Pos logit contributions
 doctors+0.188
 emergencies+0.157
 pirates+0.156
 knights+0.155
aver+0.154
Neg logit contributions
ah-0.078
s-0.068
're-0.063
seven-0.059
 are-0.059
Token it
Token ID340
Feature activation+1.92e-02

Pos logit contributions
idents+0.006
ric+0.005
ah+0.005
seven+0.005
 argue+0.005
Neg logit contributions
aver-0.005
 doctors-0.005
 knights-0.005
ily-0.005
 pirates-0.004
Token with
Token ID351
Feature activation-2.08e-02

Pos logit contributions
ric+0.042
idents+0.038
ah+0.036
seven+0.035
riage+0.035
Neg logit contributions
 doctors-0.023
aver-0.022
 knights-0.022
ily-0.021
 pirates-0.020
Token his
Token ID465
Feature activation-2.01e-02

Pos logit contributions
aver+0.036
 doctors+0.035
 knights+0.031
avers+0.030
 pirates+0.030
Neg logit contributions
idents-0.035
ric-0.032
ah-0.031
s-0.031
 argue-0.029
Token tongue
Token ID11880
Feature activation+2.74e-02

Pos logit contributions
aver+0.037
 doctors+0.035
 knights+0.033
 pirates+0.031
avers+0.031
Neg logit contributions
idents-0.031
ric-0.030
ah-0.028
seven-0.028
 argue-0.026
Token can
Token ID460
Feature activation-1.20e-02

Pos logit contributions
idents+0.008
ric+0.006
s+0.006
ah+0.006
 argue+0.006
Neg logit contributions
aver-0.016
 doctors-0.015
 knights-0.014
avers-0.014
eh-0.014
Token.
Token ID13
Feature activation-7.43e-03

Pos logit contributions
aver+0.021
 doctors+0.020
 pirates+0.018
 knights+0.017
P+0.017
Neg logit contributions
idents-0.021
ric-0.018
ah-0.017
seven-0.016
fl-0.016
Token (newline)
Token ID198
Feature activation+3.94e-03

Pos logit contributions
aver+0.014
 doctors+0.013
 knights+0.012
avers+0.012
ily+0.012
Neg logit contributions
idents-0.011
ric-0.010
ah-0.010
 argue-0.009
seven-0.009
Token (newline)
Token ID198
Feature activation-1.13e-03

Pos logit contributions
idents+0.005
ric+0.005
s+0.005
ah+0.005
 argue+0.004
Neg logit contributions
aver-0.008
 doctors-0.008
 knights-0.007
avers-0.007
 pirates-0.007
Token"
Token ID1
Feature activation-1.64e-02

Pos logit contributions
aver+0.002
 doctors+0.002
 knights+0.002
avers+0.002
 pirates+0.002
Neg logit contributions
idents-0.002
s-0.002
ah-0.002
ric-0.002
 argue-0.002
TokenSur
Token ID14214
Feature activation-8.05e-02

Pos logit contributions
aver+0.030
 doctors+0.029
 knights+0.026
avers+0.026
eh+0.025
Neg logit contributions
idents-0.026
s-0.023
ah-0.023
ric-0.022
 argue-0.022
Tokenfing
Token ID28825
Feature activation+2.98e-02

Pos logit contributions
 doctors+0.136
aver+0.133
 knights+0.118
eh+0.117
 gard+0.115
Neg logit contributions
idents-0.122
ah-0.121
s-0.116
ric-0.115
 argue-0.115
Token is
Token ID318
Feature activation+1.24e-02

Pos logit contributions
idents+0.039
ah+0.032
seven+0.032
 argue+0.032
 Today+0.031
Neg logit contributions
aver-0.059
 doctors-0.055
 pirates-0.052
 knights-0.051
avers-0.050
Token fun
Token ID1257
Feature activation-2.07e-02

Pos logit contributions
idents+0.021
ric+0.019
ah+0.018
s+0.018
 argue+0.017
Neg logit contributions
aver-0.022
 doctors-0.021
 knights-0.018
avers-0.018
 pirates-0.018
Token,
Token ID11
Feature activation-6.00e-03

Pos logit contributions
aver+0.046
 doctors+0.045
 knights+0.041
avers+0.040
 pirates+0.039
Neg logit contributions
idents-0.025
ric-0.021
ah-0.020
s-0.020
 argue-0.019
Token but
Token ID475
Feature activation+1.84e-03

Pos logit contributions
aver+0.010
 doctors+0.010
 knights+0.009
avers+0.009
 pirates+0.008
Neg logit contributions
idents-0.010
ric-0.009
ah-0.009
s-0.009
 argue-0.009
Token asked
Token ID1965
Feature activation+5.36e-03

Pos logit contributions
aver+0.033
 doctors+0.032
avers+0.028
 knights+0.028
P+0.028
Neg logit contributions
idents-0.023
s-0.021
ah-0.020
seven-0.019
 argue-0.018
Token the
Token ID262
Feature activation-1.40e-02

Pos logit contributions
idents+0.007
s+0.006
ah+0.006
 argue+0.006
ric+0.006
Neg logit contributions
aver-0.011
 doctors-0.010
avers-0.009
eh-0.009
P-0.009
Token man
Token ID582
Feature activation-1.06e-02

Pos logit contributions
aver+0.024
avers+0.021
 doctors+0.021
eh+0.021
 knights+0.020
Neg logit contributions
idents-0.022
s-0.021
ah-0.020
 argue-0.020
ric-0.019
Token for
Token ID329
Feature activation-3.13e-02

Pos logit contributions
aver+0.017
 doctors+0.017
 knights+0.016
 pirates+0.015
ily+0.015
Neg logit contributions
idents-0.020
ric-0.017
ah-0.016
seven-0.015
 argue-0.015
Token two
Token ID734
Feature activation-3.94e-02

Pos logit contributions
aver+0.064
 doctors+0.060
avers+0.056
 knights+0.055
eh+0.055
Neg logit contributions
idents-0.041
ah-0.036
ric-0.036
 argue-0.035
s-0.035
Token l
Token ID300
Feature activation-7.99e-02

Pos logit contributions
aver+0.068
 doctors+0.066
 knights+0.058
avers+0.057
 pirates+0.055
Neg logit contributions
idents-0.066
ric-0.059
ah-0.058
s-0.058
 argue-0.056
Tokenoll
Token ID692
Feature activation-4.01e-02

Pos logit contributions
 doctors+0.129
aver+0.112
 pirates+0.102
avers+0.099
 knights+0.099
Neg logit contributions
ah-0.137
seven-0.127
s-0.124
 correctly-0.118
of-0.113
Tokenip
Token ID541
Feature activation-1.26e-02

Pos logit contributions
aver+0.080
 doctors+0.075
 knights+0.069
avers+0.068
ily+0.067
Neg logit contributions
idents-0.059
ric-0.053
 argue-0.048
 Today-0.047
 summit-0.046
Tokenops
Token ID2840
Feature activation-3.69e-02

Pos logit contributions
aver+0.030
 doctors+0.029
 knights+0.026
P+0.026
ily+0.026
Neg logit contributions
idents-0.013
ric-0.012
 argue-0.010
 Today-0.010
 summit-0.010
Token.
Token ID13
Feature activation-5.75e-03

Pos logit contributions
aver+0.072
 doctors+0.070
 knights+0.063
avers+0.062
 pirates+0.061
Neg logit contributions
idents-0.054
ric-0.047
ah-0.045
s-0.044
 argue-0.043
Token The
Token ID383
Feature activation+3.59e-03

Pos logit contributions
aver+0.011
 doctors+0.011
 knights+0.010
avers+0.009
ily+0.009
Neg logit contributions
idents-0.009
ric-0.008
ah-0.008
s-0.007
 argue-0.007
TokenHer
Token ID9360
Feature activation-1.31e-03

Pos logit contributions
aver+0.012
 doctors+0.011
 knights+0.010
avers+0.010
 pirates+0.010
Neg logit contributions
idents-0.011
ric-0.010
s-0.010
ah-0.010
 argue-0.009
Token mother
Token ID2802
Feature activation-2.01e-02

Pos logit contributions
aver+0.003
avers+0.003
eh+0.002
 doctors+0.002
iers+0.002
Neg logit contributions
idents-0.002
s-0.001
 destination-0.001
 argue-0.001
 Today-0.001
Token said
Token ID531
Feature activation-2.09e-02

Pos logit contributions
aver+0.037
 doctors+0.036
 knights+0.034
P+0.033
 spies+0.033
Neg logit contributions
ah-0.029
s-0.028
idents-0.027
 argue-0.024
ric-0.023
Token,
Token ID11
Feature activation-1.71e-02

Pos logit contributions
aver+0.048
 doctors+0.044
P+0.042
avers+0.041
 knights+0.041
Neg logit contributions
idents-0.025
s-0.023
ah-0.020
 argue-0.020
ric-0.018
Token "
Token ID366
Feature activation+5.60e-04

Pos logit contributions
aver+0.033
 doctors+0.032
 knights+0.029
avers+0.028
 pirates+0.027
Neg logit contributions
idents-0.025
ric-0.022
ah-0.022
s-0.022
 argue-0.021
TokenSur
Token ID14214
Feature activation-7.96e-02

Pos logit contributions
idents+0.001
ric+0.001
ah+0.001
s+0.001
 argue+0.001
Neg logit contributions
aver-0.001
 doctors-0.001
 knights-0.001
avers-0.001
 pirates-0.001
Tokenrender
Token ID13287
Feature activation-3.19e-02

Pos logit contributions
aver+0.116
 doctors+0.112
 knights+0.099
avers+0.094
eh+0.093
Neg logit contributions
idents-0.143
ah-0.138
ric-0.134
 argue-0.133
seven-0.130
Token is
Token ID318
Feature activation-1.31e-03

Pos logit contributions
aver+0.056
 doctors+0.055
 knights+0.049
ily+0.047
 pirates+0.047
Neg logit contributions
idents-0.052
ric-0.045
ah-0.044
 argue-0.044
s-0.042
Token important
Token ID1593
Feature activation-1.99e-02

Pos logit contributions
aver+0.003
 doctors+0.002
 knights+0.002
avers+0.002
P+0.002
Neg logit contributions
idents-0.002
ric-0.002
ah-0.002
s-0.002
 argue-0.002
Token so
Token ID523
Feature activation+4.37e-04

Pos logit contributions
aver+0.041
 doctors+0.039
avers+0.036
 knights+0.035
P+0.035
Neg logit contributions
idents-0.027
ric-0.023
s-0.023
ah-0.023
 argue-0.022
Token you
Token ID345
Feature activation-1.87e-02

Pos logit contributions
idents+0.001
s+0.001
 may+0.001
ah+0.001
ric+0.001
Neg logit contributions
aver-0.001
 doctors-0.001
avers-0.001
eh-0.001
 knights-0.001
Token lunch
Token ID9965
Feature activation-2.86e-03

Pos logit contributions
aver+0.057
 doctors+0.053
avers+0.049
 knights+0.046
P+0.046
Neg logit contributions
idents-0.034
ric-0.028
s-0.028
 argue-0.028
seven-0.027
Token.
Token ID13
Feature activation+2.73e-04

Pos logit contributions
aver+0.006
 doctors+0.006
avers+0.006
 knights+0.006
eh+0.005
Neg logit contributions
idents-0.003
s-0.003
ah-0.003
ric-0.002
 argue-0.002
Token
Token ID198
Feature activation-7.31e-03

Pos logit contributions
idents+0.000
ric+0.000
ah+0.000
s+0.000
 Today+0.000
Neg logit contributions
aver-0.001
 doctors-0.001
 knights-0.001
avers-0.001
 pirates-0.001
Token
Token ID198
Feature activation-8.19e-03

Pos logit contributions
aver+0.015
 doctors+0.014
 knights+0.013
avers+0.013
 pirates+0.012
Neg logit contributions
idents-0.010
ric-0.009
ah-0.009
s-0.009
 argue-0.008
Token"
Token ID1
Feature activation-2.35e-03

Pos logit contributions
 doctors+0.015
aver+0.015
 knights+0.014
 pirates+0.013
 spies+0.013
Neg logit contributions
idents-0.011
ah-0.011
s-0.011
 Today-0.010
ric-0.010
TokenSur
Token ID14214
Feature activation-7.96e-02

Pos logit contributions
aver+0.004
 doctors+0.004
 knights+0.004
avers+0.004
 pirates+0.004
Neg logit contributions
idents-0.003
ah-0.003
seven-0.003
ric-0.003
s-0.003
Tokenprise
Token ID7919
Feature activation-1.75e-02

Pos logit contributions
 doctors+0.134
aver+0.129
avers+0.109
 knights+0.108
 spies+0.106
Neg logit contributions
ah-0.122
idents-0.119
seven-0.113
 argue-0.110
ric-0.109
Token,
Token ID11
Feature activation-7.14e-03

Pos logit contributions
aver+0.043
 doctors+0.042
 knights+0.039
avers+0.038
 pirates+0.037
Neg logit contributions
idents-0.016
ric-0.014
ah-0.013
s-0.013
 argue-0.012
Token Mia
Token ID32189
Feature activation+1.61e-02

Pos logit contributions
aver+0.012
 doctors+0.012
avers+0.011
P+0.010
eh+0.010
Neg logit contributions
idents-0.012
ric-0.010
ah-0.010
s-0.010
 argue-0.010
Token!"
Token ID2474
Feature activation-7.34e-03

Pos logit contributions
idents+0.015
ah+0.013
ric+0.013
s+0.012
 argue+0.011
Neg logit contributions
aver-0.039
 doctors-0.038
 knights-0.035
avers-0.035
 pirates-0.034
Token said
Token ID531
Feature activation-1.66e-02

Pos logit contributions
aver+0.013
 doctors+0.013
avers+0.011
 knights+0.011
eh+0.011
Neg logit contributions
idents-0.012
s-0.010
ah-0.010
ric-0.010
 argue-0.010
Token and
Token ID290
Feature activation-1.11e-02

Pos logit contributions
 doctors+0.005
aver+0.005
 knights+0.004
ily+0.004
 pirates+0.004
Neg logit contributions
idents-0.004
ric-0.004
ah-0.003
 argue-0.003
seven-0.003
Token she
Token ID673
Feature activation-1.88e-02

Pos logit contributions
aver+0.022
 doctors+0.020
avers+0.019
 knights+0.019
P+0.018
Neg logit contributions
idents-0.017
s-0.014
ric-0.014
 argue-0.013
ah-0.013
Token needed
Token ID2622
Feature activation-3.89e-02

Pos logit contributions
aver+0.037
 doctors+0.035
 spies+0.032
 knights+0.032
avers+0.032
Neg logit contributions
idents-0.026
s-0.025
ah-0.022
 argue-0.022
ric-0.021
Token a
Token ID257
Feature activation-3.31e-02

Pos logit contributions
aver+0.091
 doctors+0.079
avers+0.078
 knights+0.077
eh+0.075
Neg logit contributions
idents-0.042
s-0.037
seven-0.035
 argue-0.034
ric-0.034
Token lot
Token ID1256
Feature activation-5.80e-02

Pos logit contributions
aver+0.066
 doctors+0.063
 knights+0.058
avers+0.057
ily+0.056
Neg logit contributions
idents-0.048
s-0.041
ah-0.041
ric-0.041
 argue-0.040
Token of
Token ID286
Feature activation-7.92e-02

Pos logit contributions
aver+0.087
 doctors+0.083
 knights+0.075
avers+0.070
 pirates+0.069
Neg logit contributions
idents-0.106
s-0.103
ah-0.098
ric-0.096
seven-0.094
Token help
Token ID1037
Feature activation-3.81e-03

Pos logit contributions
aver+0.156
 doctors+0.141
 knights+0.135
avers+0.131
 pirates+0.127
Neg logit contributions
idents-0.113
 argue-0.098
s-0.095
ah-0.093
ric-0.092
Token.
Token ID13
Feature activation-7.07e-03

Pos logit contributions
aver+0.008
 doctors+0.008
 knights+0.007
avers+0.007
 pirates+0.007
Neg logit contributions
idents-0.005
ric-0.005
s-0.004
ah-0.004
 argue-0.004
Token
Token ID220
Feature activation-8.91e-03

Pos logit contributions
 doctors+0.012
aver+0.012
 knights+0.011
ily+0.011
P+0.011
Neg logit contributions
idents-0.011
ric-0.010
ah-0.010
 argue-0.010
seven-0.009
Token
Token ID198
Feature activation-1.62e-03

Pos logit contributions
aver+0.015
 doctors+0.014
ily+0.013
avers+0.013
 knights+0.012
Neg logit contributions
idents-0.016
ah-0.014
 argue-0.014
ric-0.014
seven-0.013
Token
Token ID198
Feature activation+1.90e-03

Pos logit contributions
aver+0.003
 doctors+0.002
ily+0.002
avers+0.002
P+0.002
Neg logit contributions
idents-0.003
ric-0.003
ah-0.003
 argue-0.003
 destination-0.002
Token needed
Token ID2622
Feature activation-3.26e-02

Pos logit contributions
idents+0.004
ric+0.004
ah+0.003
s+0.003
seven+0.003
Neg logit contributions
aver-0.006
 doctors-0.006
 knights-0.005
avers-0.005
ily-0.005
Token to
Token ID284
Feature activation-5.37e-03

Pos logit contributions
aver+0.081
eh+0.071
avers+0.070
 doctors+0.069
P+0.068
Neg logit contributions
idents-0.032
 argue-0.028
s-0.028
 destination-0.025
 Today-0.024
Token go
Token ID467
Feature activation-1.62e-02

Pos logit contributions
aver+0.011
 doctors+0.011
 knights+0.010
avers+0.010
eh+0.010
Neg logit contributions
idents-0.007
ric-0.006
ah-0.006
 argue-0.006
s-0.006
Token do
Token ID466
Feature activation+7.20e-03

Pos logit contributions
 doctors+0.028
aver+0.027
 knights+0.024
 pirates+0.024
avers+0.023
Neg logit contributions
idents-0.027
ric-0.025
ah-0.024
s-0.022
 argue-0.022
Token some
Token ID617
Feature activation-3.12e-02

Pos logit contributions
idents+0.012
ric+0.011
ah+0.010
s+0.010
 argue+0.010
Neg logit contributions
aver-0.013
 doctors-0.012
 knights-0.011
avers-0.011
 pirates-0.011
Token err
Token ID11454
Feature activation-7.90e-02

Pos logit contributions
aver+0.066
 doctors+0.058
avers+0.057
 knights+0.055
ily+0.054
Neg logit contributions
idents-0.041
s-0.036
ah-0.035
 argue-0.034
 Today-0.033
Tokenands
Token ID1746
Feature activation-7.58e-03

Pos logit contributions
aver+0.114
 doctors+0.103
 knights+0.098
 gard+0.092
 pirates+0.092
Neg logit contributions
seven-0.129
idents-0.127
 argue-0.126
 sorts-0.123
 Today-0.121
Token.
Token ID13
Feature activation+2.24e-03

Pos logit contributions
aver+0.014
 doctors+0.014
 knights+0.013
avers+0.012
 pirates+0.012
Neg logit contributions
idents-0.011
ric-0.010
ah-0.010
s-0.009
 argue-0.009
Token
Token ID198
Feature activation+6.29e-03

Pos logit contributions
idents+0.003
ah+0.003
ric+0.003
 argue+0.003
seven+0.003
Neg logit contributions
aver-0.004
 doctors-0.004
 knights-0.004
avers-0.004
 pirates-0.003
Token
Token ID198
Feature activation+2.03e-03

Pos logit contributions
idents+0.009
ric+0.008
s+0.008
ah+0.008
 argue+0.008
Neg logit contributions
aver-0.012
 doctors-0.012
 knights-0.011
avers-0.010
 pirates-0.010
TokenMom
Token ID29252
Feature activation-1.85e-02

Pos logit contributions
idents+0.003
s+0.003
ric+0.003
ah+0.003
 argue+0.003
Neg logit contributions
aver-0.004
 doctors-0.003
 knights-0.003
avers-0.003
 pirates-0.003
Token said
Token ID531
Feature activation-8.02e-03

Pos logit contributions
aver+0.003
 doctors+0.003
 knights+0.003
avers+0.003
 pirates+0.003
Neg logit contributions
idents-0.003
ah-0.003
s-0.003
ric-0.003
 argue-0.002
Token firmly
Token ID14245
Feature activation+2.51e-02

Pos logit contributions
aver+0.019
 doctors+0.017
avers+0.016
eh+0.016
P+0.016
Neg logit contributions
s-0.009
idents-0.009
ah-0.007
 argue-0.007
seven-0.007
Token:
Token ID25
Feature activation+2.35e-03

Pos logit contributions
idents+0.040
ric+0.036
ah+0.034
s+0.033
 argue+0.033
Neg logit contributions
aver-0.046
 doctors-0.044
 knights-0.040
avers-0.039
 pirates-0.038
Token
Token ID198
Feature activation-1.17e-02

Pos logit contributions
idents+0.003
ric+0.003
s+0.003
ah+0.003
 argue+0.003
Neg logit contributions
aver-0.005
 doctors-0.005
 knights-0.004
avers-0.004
 pirates-0.004
Token"
Token ID1
Feature activation-3.50e-03

Pos logit contributions
 doctors+0.023
aver+0.022
 knights+0.020
 pirates+0.020
avers+0.019
Neg logit contributions
idents-0.016
s-0.015
ah-0.015
ric-0.014
seven-0.014
TokenSur
Token ID14214
Feature activation-7.84e-02

Pos logit contributions
aver+0.006
 doctors+0.006
 knights+0.005
avers+0.005
 pirates+0.005
Neg logit contributions
idents-0.006
s-0.005
ah-0.005
ric-0.005
 argue-0.005
Tokenrender
Token ID13287
Feature activation-3.04e-02

Pos logit contributions
aver+0.108
 doctors+0.108
 knights+0.094
eh+0.087
avers+0.087
Neg logit contributions
idents-0.147
ah-0.141
 argue-0.137
ric-0.136
seven-0.134
Token the
Token ID262
Feature activation+4.89e-03

Pos logit contributions
aver+0.054
 doctors+0.052
 knights+0.047
ily+0.046
P+0.046
Neg logit contributions
idents-0.049
ric-0.043
 argue-0.041
ah-0.041
 Today-0.039
Token cup
Token ID6508
Feature activation+9.28e-03

Pos logit contributions
idents+0.008
ric+0.008
ah+0.008
s+0.007
 argue+0.007
Neg logit contributions
aver-0.008
 doctors-0.008
avers-0.007
 knights-0.007
ily-0.007
Token and
Token ID290
Feature activation+2.54e-03

Pos logit contributions
idents+0.015
ric+0.014
ah+0.013
 argue+0.013
s+0.013
Neg logit contributions
aver-0.016
 doctors-0.016
 knights-0.014
ily-0.014
avers-0.014
Token let
Token ID1309
Feature activation+3.49e-02

Pos logit contributions
idents+0.003
ric+0.003
s+0.003
 argue+0.003
ah+0.003
Neg logit contributions
aver-0.005
 doctors-0.005
avers-0.005
 knights-0.005
eh-0.005
Token try
Token ID1949
Feature activation+5.38e-03

Pos logit contributions
aver+0.010
 doctors+0.009
 knights+0.009
avers+0.008
 pirates+0.008
Neg logit contributions
idents-0.006
ric-0.005
ah-0.005
s-0.005
 argue-0.005
Token it
Token ID340
Feature activation+1.87e-02

Pos logit contributions
idents+0.010
ric+0.008
 argue+0.008
ah+0.008
 Today+0.008
Neg logit contributions
 doctors-0.009
aver-0.009
 knights-0.008
 pirates-0.008
P-0.007
Token!
Token ID0
Feature activation-7.54e-04

Pos logit contributions
idents+0.037
ric+0.036
 argue+0.031
seven+0.030
ah+0.030
Neg logit contributions
 doctors-0.026
aver-0.025
ily-0.023
 knights-0.022
 pirates-0.022
Token He
Token ID679
Feature activation-1.57e-02

Pos logit contributions
aver+0.002
 doctors+0.001
 knights+0.001
avers+0.001
 pirates+0.001
Neg logit contributions
idents-0.001
ric-0.001
ah-0.001
s-0.001
 argue-0.001
Token eagerly
Token ID30130
Feature activation-2.24e-02

Pos logit contributions
aver+0.028
 doctors+0.028
avers+0.025
 knights+0.025
 spies+0.024
Neg logit contributions
idents-0.023
s-0.023
ah-0.023
ric-0.021
 argue-0.021
Token l
Token ID300
Feature activation-7.82e-02

Pos logit contributions
 doctors+0.036
aver+0.036
 knights+0.032
avers+0.031
 pirates+0.030
Neg logit contributions
idents-0.038
ah-0.037
s-0.036
 argue-0.035
ric-0.035
Tokenicked
Token ID9484
Feature activation+9.12e-03

Pos logit contributions
 doctors+0.168
aver+0.141
eh+0.136
 knights+0.135
 pirates+0.130
Neg logit contributions
ah-0.095
seven-0.090
s-0.085
 correctly-0.084
that-0.078
Token it
Token ID340
Feature activation+3.48e-02

Pos logit contributions
idents+0.013
ric+0.012
ah+0.011
s+0.011
 argue+0.011
Neg logit contributions
aver-0.018
 doctors-0.017
 knights-0.016
avers-0.016
 pirates-0.015
Token,
Token ID11
Feature activation-2.53e-03

Pos logit contributions
ric+0.069
idents+0.067
ah+0.061
seven+0.060
riage+0.058
Neg logit contributions
 doctors-0.045
aver-0.044
 knights-0.041
 pirates-0.038
ily-0.036
Token but
Token ID475
Feature activation-4.05e-04

Pos logit contributions
aver+0.005
 doctors+0.005
 knights+0.004
avers+0.004
P+0.004
Neg logit contributions
idents-0.004
ric-0.003
ah-0.003
s-0.003
 argue-0.003
Token the
Token ID262
Feature activation-8.92e-03

Pos logit contributions
aver+0.001
 doctors+0.001
 knights+0.001
avers+0.001
 pirates+0.001
Neg logit contributions
idents-0.001
ric-0.001
ah-0.001
s-0.001
 argue-0.001
Token at
Token ID379
Feature activation+1.44e-03

Pos logit contributions
idents+0.011
ric+0.010
ah+0.009
s+0.009
 argue+0.009
Neg logit contributions
aver-0.014
 doctors-0.014
 knights-0.012
avers-0.012
 pirates-0.012
Token his
Token ID465
Feature activation-8.16e-03

Pos logit contributions
idents+0.003
s+0.002
ah+0.002
 argue+0.002
ric+0.002
Neg logit contributions
 doctors-0.002
aver-0.002
 knights-0.002
 pirates-0.002
ily-0.002
Token salon
Token ID46608
Feature activation-8.17e-04

Pos logit contributions
 doctors+0.015
aver+0.014
 knights+0.013
 pirates+0.012
avers+0.012
Neg logit contributions
idents-0.013
ah-0.012
ric-0.012
s-0.011
seven-0.011
Token and
Token ID290
Feature activation+8.42e-03

Pos logit contributions
aver+0.001
 doctors+0.001
 knights+0.001
avers+0.001
 pirates+0.001
Neg logit contributions
idents-0.001
ric-0.001
ah-0.001
s-0.001
 argue-0.001
Token running
Token ID2491
Feature activation+4.79e-03

Pos logit contributions
idents+0.014
ric+0.013
ah+0.012
s+0.012
 argue+0.012
Neg logit contributions
aver-0.015
 doctors-0.014
avers-0.013
 knights-0.013
P-0.012
Token err
Token ID11454
Feature activation-7.79e-02

Pos logit contributions
idents+0.008
ric+0.008
ah+0.007
 argue+0.007
seven+0.007
Neg logit contributions
 doctors-0.008
aver-0.008
 knights-0.007
 pirates-0.007
avers-0.007
Tokenands
Token ID1746
Feature activation-9.35e-03

Pos logit contributions
aver+0.108
 doctors+0.091
avers+0.084
 knights+0.084
eh+0.081
Neg logit contributions
seven-0.140
idents-0.134
 argue-0.133
ric-0.128
 sorts-0.123
Token.
Token ID13
Feature activation+5.33e-03

Pos logit contributions
aver+0.018
 doctors+0.018
 knights+0.016
avers+0.016
 pirates+0.015
Neg logit contributions
idents-0.013
ric-0.012
ah-0.012
s-0.011
 argue-0.011
Token But
Token ID887
Feature activation-5.11e-03

Pos logit contributions
idents+0.008
ric+0.007
ah+0.007
 argue+0.007
s+0.007
Neg logit contributions
aver-0.010
 doctors-0.010
 knights-0.009
avers-0.009
ily-0.009
Token every
Token ID790
Feature activation+2.23e-02

Pos logit contributions
aver+0.009
 doctors+0.009
 knights+0.008
 pirates+0.008
avers+0.008
Neg logit contributions
idents-0.009
ric-0.008
s-0.007
ah-0.007
 argue-0.007
Token day
Token ID1110
Feature activation-2.70e-02

Pos logit contributions
idents+0.037
ric+0.033
s+0.032
ah+0.032
 argue+0.031
Neg logit contributions
aver-0.039
 doctors-0.038
 knights-0.034
avers-0.033
 pirates-0.032
Token her
Token ID607
Feature activation-2.75e-03

Pos logit contributions
idents+0.028
ric+0.024
ah+0.023
s+0.023
 argue+0.023
Neg logit contributions
aver-0.048
 doctors-0.045
avers-0.042
 knights-0.042
eh-0.041
Token mom
Token ID1995
Feature activation+1.03e-02

Pos logit contributions
aver+0.004
 doctors+0.004
 knights+0.004
avers+0.003
 pirates+0.003
Neg logit contributions
idents-0.005
ric-0.005
ah-0.005
s-0.005
 argue-0.004
Token were
Token ID547
Feature activation-6.01e-03

Pos logit contributions
idents+0.014
ric+0.012
ah+0.011
s+0.011
 argue+0.011
Neg logit contributions
aver-0.021
 doctors-0.021
 knights-0.019
avers-0.018
 pirates-0.018
Token out
Token ID503
Feature activation-1.21e-02

Pos logit contributions
aver+0.011
 doctors+0.010
 knights+0.009
P+0.009
avers+0.009
Neg logit contributions
idents-0.009
ric-0.009
s-0.009
 argue-0.008
 destination-0.008
Token running
Token ID2491
Feature activation-1.00e-03

Pos logit contributions
aver+0.018
 doctors+0.017
avers+0.015
 knights+0.015
P+0.014
Neg logit contributions
idents-0.023
ric-0.021
s-0.021
ah-0.021
 argue-0.020
Token err
Token ID11454
Feature activation-7.78e-02

Pos logit contributions
aver+0.002
 doctors+0.002
 knights+0.001
avers+0.001
 pirates+0.001
Neg logit contributions
idents-0.002
ric-0.002
ah-0.002
s-0.002
seven-0.001
Tokenands
Token ID1746
Feature activation-1.39e-02

Pos logit contributions
aver+0.115
 doctors+0.098
 knights+0.088
avers+0.087
eh+0.085
Neg logit contributions
 argue-0.134
seven-0.131
idents-0.130
 sorts-0.124
 kinds-0.124
Token.
Token ID13
Feature activation+1.59e-02

Pos logit contributions
aver+0.027
 doctors+0.026
 knights+0.024
avers+0.023
 pirates+0.023
Neg logit contributions
idents-0.020
ric-0.018
ah-0.017
s-0.017
 argue-0.017
Token Mary
Token ID5335
Feature activation-5.26e-03

Pos logit contributions
ah+0.025
idents+0.025
ric+0.024
s+0.021
 argue+0.021
Neg logit contributions
 doctors-0.025
aver-0.024
ily-0.023
avers-0.022
 knights-0.021
Token was
Token ID373
Feature activation-8.83e-03

Pos logit contributions
aver+0.011
 doctors+0.010
 knights+0.009
avers+0.009
P+0.009
Neg logit contributions
s-0.007
idents-0.007
ah-0.006
ric-0.006
 argue-0.006
Token three
Token ID1115
Feature activation-2.71e-02

Pos logit contributions
aver+0.018
P+0.016
 knights+0.016
 doctors+0.015
avers+0.015
Neg logit contributions
idents-0.010
 argue-0.010
ric-0.010
 destination-0.009
s-0.009
Token behind
Token ID2157
Feature activation+8.89e-03

Pos logit contributions
aver+0.009
 doctors+0.008
avers+0.008
P+0.008
 knights+0.007
Neg logit contributions
idents-0.006
ah-0.006
ric-0.006
 argue-0.005
s-0.005
Token a
Token ID257
Feature activation+4.13e-03

Pos logit contributions
idents+0.014
ric+0.013
ah+0.013
s+0.013
 argue+0.012
Neg logit contributions
aver-0.016
 doctors-0.015
 knights-0.014
avers-0.013
 pirates-0.013
Token bush
Token ID24484
Feature activation+5.75e-03

Pos logit contributions
idents+0.008
ric+0.007
ah+0.006
 argue+0.006
s+0.006
Neg logit contributions
aver-0.007
 doctors-0.006
 knights-0.006
eh-0.005
avers-0.005
Token.
Token ID13
Feature activation-1.17e-03

Pos logit contributions
idents+0.008
ric+0.007
s+0.007
ah+0.007
 argue+0.006
Neg logit contributions
aver-0.012
 doctors-0.011
 knights-0.010
avers-0.010
P-0.010
Token "
Token ID366
Feature activation-8.44e-03

Pos logit contributions
aver+0.003
 doctors+0.003
avers+0.002
 knights+0.002
P+0.002
Neg logit contributions
idents-0.001
s-0.001
ric-0.001
 Today-0.001
ah-0.001
TokenSur
Token ID14214
Feature activation-7.75e-02

Pos logit contributions
aver+0.015
 doctors+0.015
 knights+0.014
 pirates+0.013
avers+0.013
Neg logit contributions
idents-0.012
ah-0.012
ric-0.012
that-0.011
seven-0.011
Tokenprise
Token ID7919
Feature activation-2.28e-02

Pos logit contributions
 doctors+0.136
aver+0.129
 knights+0.117
 gard+0.116
 spies+0.113
Neg logit contributions
ah-0.108
idents-0.099
seven-0.097
 argue-0.095
ric-0.094
Token!"
Token ID2474
Feature activation-4.23e-03

Pos logit contributions
aver+0.057
 doctors+0.055
 knights+0.051
avers+0.050
P+0.049
Neg logit contributions
idents-0.021
ric-0.017
ah-0.017
s-0.016
 argue-0.015
Token she
Token ID673
Feature activation-6.27e-03

Pos logit contributions
aver+0.008
 doctors+0.008
 knights+0.007
avers+0.007
P+0.007
Neg logit contributions
idents-0.006
s-0.005
ric-0.005
ah-0.005
 argue-0.005
Token said
Token ID531
Feature activation-8.68e-03

Pos logit contributions
aver+0.012
 doctors+0.012
 knights+0.011
avers+0.010
 pirates+0.010
Neg logit contributions
idents-0.009
s-0.008
ah-0.008
ric-0.008
 argue-0.008
Token with
Token ID351
Feature activation-1.15e-02

Pos logit contributions
aver+0.019
 doctors+0.018
avers+0.016
P+0.016
 knights+0.016
Neg logit contributions
idents-0.011
s-0.010
ah-0.009
 argue-0.009
ric-0.009
Token tasted
Token ID29187
Feature activation+1.99e-02

Pos logit contributions
idents+0.014
ric+0.013
ah+0.012
seven+0.012
 argue+0.012
Neg logit contributions
aver-0.016
 doctors-0.015
 knights-0.013
ily-0.013
avers-0.013
Token like
Token ID588
Feature activation-2.82e-02

Pos logit contributions
idents+0.030
ric+0.029
ah+0.027
 argue+0.025
seven+0.025
Neg logit contributions
aver-0.037
 doctors-0.037
 knights-0.032
P-0.032
ily-0.032
Token,
Token ID11
Feature activation+6.62e-03

Pos logit contributions
aver+0.061
 doctors+0.058
avers+0.053
 knights+0.053
P+0.051
Neg logit contributions
idents-0.036
s-0.031
ric-0.031
ah-0.030
 argue-0.028
Token so
Token ID523
Feature activation-1.46e-02

Pos logit contributions
idents+0.010
ric+0.010
ah+0.010
s+0.009
 destination+0.009
Neg logit contributions
 doctors-0.012
aver-0.011
 knights-0.010
 pirates-0.010
ily-0.010
Token she
Token ID673
Feature activation-4.39e-02

Pos logit contributions
 doctors+0.021
aver+0.021
 knights+0.018
ily+0.017
avers+0.017
Neg logit contributions
idents-0.029
ric-0.027
ah-0.026
 argue-0.025
s-0.024
Token l
Token ID300
Feature activation-7.69e-02

Pos logit contributions
iers+0.078
 importantly+0.077
 doctors+0.077
 emergencies+0.076
aver+0.074
Neg logit contributions
ah-0.060
s-0.057
idents-0.054
 argue-0.052
 thinks-0.048
Tokenicked
Token ID9484
Feature activation+4.31e-04

Pos logit contributions
 doctors+0.164
aver+0.144
 knights+0.137
 pirates+0.133
eh+0.132
Neg logit contributions
ah-0.087
s-0.084
idents-0.081
seven-0.081
're-0.076
Token it
Token ID340
Feature activation+2.36e-02

Pos logit contributions
idents+0.001
ric+0.001
ah+0.001
 destination+0.001
seven+0.001
Neg logit contributions
 doctors-0.001
aver-0.001
ily-0.001
 knights-0.001
P-0.001
Token.
Token ID13
Feature activation-4.96e-03

Pos logit contributions
ric+0.054
idents+0.049
ah+0.047
seven+0.045
riage+0.044
Neg logit contributions
 doctors-0.026
aver-0.025
 knights-0.023
ily-0.023
 pirates-0.021
Token
Token ID198
Feature activation+1.53e-03

Pos logit contributions
aver+0.010
 doctors+0.010
ily+0.009
 knights+0.009
P+0.008
Neg logit contributions
idents-0.007
ric-0.006
ah-0.006
 argue-0.006
seven-0.005
Token
Token ID198
Feature activation+4.52e-03

Pos logit contributions
idents+0.003
ric+0.002
ah+0.002
 argue+0.002
seven+0.002
Neg logit contributions
aver-0.003
 doctors-0.002
ily-0.002
avers-0.002
P-0.002
Token felt
Token ID2936
Feature activation-2.06e-02

Pos logit contributions
idents+0.005
ric+0.004
 Today+0.004
seven+0.004
ah+0.004
Neg logit contributions
aver-0.004
 doctors-0.004
 knights-0.003
avers-0.003
ily-0.003
Token very
Token ID845
Feature activation+5.68e-03

Pos logit contributions
aver+0.036
 doctors+0.033
 knights+0.032
avers+0.031
P+0.030
Neg logit contributions
idents-0.033
s-0.030
ric-0.029
ah-0.028
 argue-0.028
Token sorry
Token ID7926
Feature activation+3.84e-03

Pos logit contributions
idents+0.008
s+0.008
 argue+0.007
 Today+0.006
ah+0.006
Neg logit contributions
 doctors-0.011
aver-0.011
 knights-0.010
iers-0.010
avers-0.010
Token.
Token ID13
Feature activation-5.36e-05

Pos logit contributions
ric+0.008
idents+0.008
 argue+0.007
seven+0.007
 summit+0.007
Neg logit contributions
 doctors-0.005
aver-0.005
 knights-0.004
ily-0.004
 pirates-0.004
Token He
Token ID679
Feature activation-7.46e-03

Pos logit contributions
aver+0.000
 doctors+0.000
 knights+0.000
avers+0.000
 pirates+0.000
Neg logit contributions
idents-0.000
ric-0.000
s-0.000
ah-0.000
 argue-0.000
Token l
Token ID300
Feature activation-7.61e-02

Pos logit contributions
aver+0.012
 doctors+0.012
 knights+0.011
avers+0.010
 pirates+0.010
Neg logit contributions
idents-0.013
ah-0.012
ric-0.012
s-0.012
 argue-0.011
Tokenicked
Token ID9484
Feature activation+1.44e-02

Pos logit contributions
 doctors+0.171
 knights+0.144
aver+0.144
 pirates+0.143
eh+0.142
Neg logit contributions
ah-0.073
s-0.071
ishly-0.062
're-0.059
 are-0.058
Token the
Token ID262
Feature activation+7.21e-03

Pos logit contributions
idents+0.024
ric+0.022
ah+0.021
seven+0.020
 argue+0.020
Neg logit contributions
aver-0.025
 doctors-0.024
 knights-0.022
ily-0.022
avers-0.021
Token boy
Token ID2933
Feature activation+2.38e-02

Pos logit contributions
idents+0.012
s+0.011
ah+0.010
ric+0.010
 argue+0.010
Neg logit contributions
aver-0.013
 doctors-0.012
 knights-0.011
avers-0.011
eh-0.011
Token's
Token ID338
Feature activation-3.43e-03

Pos logit contributions
idents+0.040
ric+0.037
ah+0.035
seven+0.035
riage+0.034
Neg logit contributions
aver-0.039
 doctors-0.038
 knights-0.035
 pirates-0.033
avers-0.033
Token hand
Token ID1021
Feature activation+2.44e-02

Pos logit contributions
aver+0.005
 doctors+0.005
 knights+0.004
avers+0.004
 pirates+0.004
Neg logit contributions
idents-0.006
ric-0.006
ah-0.006
s-0.006
 argue-0.006
TokenFl
Token ID7414
Feature activation+3.84e-03

Pos logit contributions
idents+0.008
ric+0.007
ah+0.007
s+0.007
 argue+0.007
Neg logit contributions
aver-0.009
 doctors-0.008
 knights-0.008
avers-0.007
 pirates-0.007
Tokenuffy
Token ID15352
Feature activation-1.45e-02

Pos logit contributions
idents+0.005
ric+0.004
s+0.004
ah+0.004
 argue+0.004
Neg logit contributions
aver-0.008
 doctors-0.008
 knights-0.007
avers-0.007
 pirates-0.007
Token just
Token ID655
Feature activation+6.39e-03

Pos logit contributions
aver+0.028
 doctors+0.028
 knights+0.025
avers+0.025
 pirates+0.024
Neg logit contributions
idents-0.020
ah-0.018
s-0.018
ric-0.017
 argue-0.017
Token shrugged
Token ID28271
Feature activation+1.75e-02

Pos logit contributions
idents+0.012
ric+0.011
s+0.010
ah+0.010
 argue+0.010
Neg logit contributions
aver-0.010
 doctors-0.010
 knights-0.008
ily-0.008
avers-0.008
Token and
Token ID290
Feature activation-3.32e-04

Pos logit contributions
idents+0.032
ric+0.030
 summit+0.028
ah+0.027
seven+0.027
Neg logit contributions
 doctors-0.028
aver-0.028
ily-0.024
 knights-0.024
 pirates-0.023
Token l
Token ID300
Feature activation-7.56e-02

Pos logit contributions
aver+0.001
 doctors+0.001
 knights+0.001
avers+0.001
itt+0.001
Neg logit contributions
idents-0.000
ah-0.000
s-0.000
 argue-0.000
seven-0.000
Tokenicked
Token ID9484
Feature activation+1.10e-02

Pos logit contributions
 doctors+0.163
aver+0.145
 knights+0.139
 pirates+0.139
eh+0.137
Neg logit contributions
s-0.080
ah-0.077
idents-0.068
seven-0.063
 correctly-0.063
Token his
Token ID465
Feature activation-6.68e-03

Pos logit contributions
idents+0.018
ric+0.016
ah+0.016
 argue+0.015
seven+0.015
Neg logit contributions
aver-0.020
 doctors-0.019
 knights-0.017
ily-0.017
avers-0.017
Token paw
Token ID41827
Feature activation+1.45e-02

Pos logit contributions
aver+0.011
 doctors+0.011
 knights+0.009
avers+0.009
eh+0.009
Neg logit contributions
idents-0.012
ric-0.011
ah-0.010
s-0.010
seven-0.010
Token.
Token ID13
Feature activation+6.39e-03

Pos logit contributions
idents+0.026
ric+0.026
ah+0.023
seven+0.022
 destination+0.022
Neg logit contributions
aver-0.023
 doctors-0.023
ily-0.020
 knights-0.019
avers-0.019
Token Lily
Token ID20037
Feature activation-5.52e-03

Pos logit contributions
idents+0.010
ric+0.009
ah+0.009
s+0.008
 argue+0.008
Neg logit contributions
aver-0.012
 doctors-0.012
 knights-0.010
avers-0.010
ily-0.010
Token're
Token ID821
Feature activation-4.55e-03

Pos logit contributions
idents+0.006
ric+0.005
ah+0.004
s+0.004
 argue+0.003
Neg logit contributions
aver-0.032
 doctors-0.032
 knights-0.030
 pirates-0.029
avers-0.029
Token welcome
Token ID7062
Feature activation-2.97e-02

Pos logit contributions
aver+0.006
 doctors+0.006
 knights+0.005
avers+0.005
 pirates+0.005
Neg logit contributions
idents-0.009
ric-0.009
s-0.008
ah-0.008
 argue-0.008
Token,
Token ID11
Feature activation-2.49e-03

Pos logit contributions
aver+0.062
 doctors+0.058
avers+0.053
 knights+0.053
eh+0.052
Neg logit contributions
idents-0.041
s-0.035
ah-0.035
ric-0.034
 argue-0.033
Token Max
Token ID5436
Feature activation+5.62e-04

Pos logit contributions
aver+0.004
 doctors+0.004
 knights+0.003
avers+0.003
eh+0.003
Neg logit contributions
idents-0.005
ric-0.004
ah-0.004
s-0.004
 argue-0.004
Token.
Token ID13
Feature activation-1.67e-03

Pos logit contributions
idents+0.001
ric+0.001
 summit+0.001
 argue+0.001
ah+0.001
Neg logit contributions
aver-0.001
 doctors-0.001
 knights-0.001
 pirates-0.001
avers-0.001
Token Sur
Token ID4198
Feature activation-7.53e-02

Pos logit contributions
aver+0.003
 doctors+0.003
eh+0.003
avers+0.003
 knights+0.003
Neg logit contributions
idents-0.002
ric-0.002
 Today-0.002
ah-0.002
s-0.002
Tokenfing
Token ID28825
Feature activation+3.13e-02

Pos logit contributions
 doctors+0.156
aver+0.152
 knights+0.139
eh+0.137
 gard+0.134
Neg logit contributions
ah-0.090
ric-0.085
seven-0.085
idents-0.085
s-0.081
Token is
Token ID318
Feature activation+1.83e-02

Pos logit contributions
idents+0.035
ah+0.028
 argue+0.028
seven+0.028
ric+0.027
Neg logit contributions
aver-0.070
 doctors-0.068
 knights-0.064
 pirates-0.063
avers-0.062
Token fun
Token ID1257
Feature activation-1.73e-02

Pos logit contributions
idents+0.034
ric+0.030
s+0.029
ah+0.029
 argue+0.029
Neg logit contributions
aver-0.030
 doctors-0.029
 knights-0.025
avers-0.025
 pirates-0.024
Token and
Token ID290
Feature activation+5.44e-03

Pos logit contributions
aver+0.039
 doctors+0.038
 knights+0.034
avers+0.034
P+0.033
Neg logit contributions
idents-0.020
ric-0.017
s-0.017
ah-0.016
 argue-0.015
Token I
Token ID314
Feature activation-7.24e-03

Pos logit contributions
idents+0.009
ric+0.008
s+0.008
ah+0.007
 argue+0.007
Neg logit contributions
aver-0.010
 doctors-0.010
 knights-0.009
avers-0.009
eh-0.008
Token The
Token ID383
Feature activation+2.26e-03

Pos logit contributions
idents+0.005
s+0.004
ric+0.004
ah+0.004
 Today+0.004
Neg logit contributions
aver-0.008
 doctors-0.007
avers-0.007
 knights-0.007
P-0.006
Token lion
Token ID18744
Feature activation-2.33e-02

Pos logit contributions
idents+0.003
ah+0.003
ric+0.003
s+0.003
 argue+0.003
Neg logit contributions
aver-0.004
 doctors-0.004
avers-0.004
 knights-0.004
P-0.004
Token said
Token ID531
Feature activation-5.60e-04

Pos logit contributions
aver+0.042
 doctors+0.040
 knights+0.036
P+0.036
avers+0.036
Neg logit contributions
idents-0.035
s-0.033
ah-0.033
ric-0.030
 argue-0.030
Token,
Token ID11
Feature activation-2.26e-02

Pos logit contributions
aver+0.001
 doctors+0.001
avers+0.001
P+0.001
eh+0.001
Neg logit contributions
idents-0.001
s-0.001
ah-0.001
 argue-0.001
ric-0.001
Token "
Token ID366
Feature activation-9.37e-04

Pos logit contributions
aver+0.042
 doctors+0.041
 knights+0.037
avers+0.036
 pirates+0.035
Neg logit contributions
idents-0.035
ric-0.031
ah-0.030
s-0.030
 argue-0.029
TokenSur
Token ID14214
Feature activation-7.50e-02

Pos logit contributions
aver+0.002
 doctors+0.002
 knights+0.001
avers+0.001
 pirates+0.001
Neg logit contributions
idents-0.002
s-0.001
ah-0.001
ric-0.001
 argue-0.001
Tokenrender
Token ID13287
Feature activation-3.31e-02

Pos logit contributions
aver+0.112
 doctors+0.111
 knights+0.101
 gard+0.096
 spies+0.095
Neg logit contributions
ah-0.124
idents-0.121
ric-0.117
s-0.114
 argue-0.113
Token or
Token ID393
Feature activation+2.58e-03

Pos logit contributions
aver+0.057
 doctors+0.056
 knights+0.050
 pirates+0.048
ily+0.048
Neg logit contributions
idents-0.055
ric-0.049
 argue-0.047
ah-0.047
 Today-0.045
Token I
Token ID314
Feature activation-6.46e-03

Pos logit contributions
idents+0.003
 argue+0.003
ric+0.003
s+0.003
ah+0.002
Neg logit contributions
aver-0.006
avers-0.005
 doctors-0.005
P-0.005
eh-0.005
Token will
Token ID481
Feature activation-1.17e-02

Pos logit contributions
aver+0.015
 doctors+0.014
 knights+0.013
avers+0.013
 pirates+0.013
Neg logit contributions
idents-0.007
ric-0.006
ah-0.006
seven-0.006
s-0.005
Token eat
Token ID4483
Feature activation-1.04e-03

Pos logit contributions
aver+0.022
 doctors+0.021
 knights+0.019
avers+0.019
 pirates+0.018
Neg logit contributions
idents-0.018
ric-0.016
ah-0.015
s-0.015
 argue-0.014
Token.
Token ID13
Feature activation+1.08e-02

Pos logit contributions
idents+0.013
ric+0.012
seven+0.011
 argue+0.011
ah+0.011
Neg logit contributions
aver-0.012
 doctors-0.012
 knights-0.011
 pirates-0.011
avers-0.010
Token She
Token ID1375
Feature activation+5.41e-03

Pos logit contributions
idents+0.014
ric+0.012
ah+0.012
s+0.012
 argue+0.011
Neg logit contributions
aver-0.022
 doctors-0.022
 knights-0.020
avers-0.020
 pirates-0.019
Token could
Token ID714
Feature activation+1.71e-02

Pos logit contributions
idents+0.009
ric+0.008
ah+0.008
 argue+0.008
 Today+0.008
Neg logit contributions
aver-0.009
 doctors-0.009
 knights-0.008
avers-0.008
 pirates-0.008
Token feel
Token ID1254
Feature activation-1.71e-02

Pos logit contributions
idents+0.028
ric+0.024
ah+0.024
s+0.024
seven+0.023
Neg logit contributions
aver-0.030
 doctors-0.030
 knights-0.026
 pirates-0.025
ily-0.025
Token it
Token ID340
Feature activation+1.93e-02

Pos logit contributions
aver+0.033
 doctors+0.032
 knights+0.029
 pirates+0.028
ily+0.028
Neg logit contributions
idents-0.025
ric-0.022
ah-0.022
 argue-0.021
s-0.021
Token warming
Token ID9917
Feature activation+4.86e-02

Pos logit contributions
idents+0.034
ric+0.031
ah+0.029
seven+0.029
 argue+0.029
Neg logit contributions
 doctors-0.031
aver-0.031
 knights-0.026
 pirates-0.025
ily-0.025
Token her
Token ID607
Feature activation+1.39e-02

Pos logit contributions
idents+0.081
ric+0.076
ah+0.073
seven+0.071
 argue+0.069
Neg logit contributions
 doctors-0.082
aver-0.075
 knights-0.070
ily-0.069
 pirates-0.067
Token face
Token ID1986
Feature activation+4.17e-02

Pos logit contributions
ah+0.028
idents+0.027
seven+0.027
ric+0.026
s+0.025
Neg logit contributions
 doctors-0.018
aver-0.016
 knights-0.015
 pirates-0.015
 spies-0.013
Token and
Token ID290
Feature activation+2.62e-02

Pos logit contributions
idents+0.076
ric+0.071
ah+0.065
seven+0.065
 argue+0.064
Neg logit contributions
 doctors-0.063
aver-0.061
 knights-0.054
 pirates-0.052
ily-0.051
Token it
Token ID340
Feature activation+1.82e-02

Pos logit contributions
idents+0.040
ah+0.036
ric+0.035
s+0.034
seven+0.033
Neg logit contributions
 doctors-0.048
aver-0.048
 knights-0.043
 pirates-0.041
avers-0.041
Token made
Token ID925
Feature activation+4.18e-02

Pos logit contributions
idents+0.032
ric+0.029
ah+0.028
seven+0.027
 Today+0.027
Neg logit contributions
aver-0.030
 doctors-0.030
 knights-0.025
ily-0.024
avers-0.024
Token liked
Token ID8288
Feature activation+1.50e-03

Pos logit contributions
aver+0.038
 doctors+0.036
 knights+0.032
avers+0.031
 pirates+0.030
Neg logit contributions
idents-0.041
ric-0.037
ah-0.036
s-0.035
 argue-0.034
Token his
Token ID465
Feature activation-8.37e-03

Pos logit contributions
idents+0.002
ric+0.002
s+0.002
ah+0.002
 argue+0.002
Neg logit contributions
aver-0.003
 doctors-0.003
avers-0.003
 knights-0.003
P-0.003
Token picture
Token ID4286
Feature activation+1.65e-02

Pos logit contributions
 doctors+0.014
aver+0.014
 knights+0.013
 pirates+0.012
avers+0.012
Neg logit contributions
idents-0.014
ric-0.013
ah-0.012
s-0.012
seven-0.012
Token.
Token ID13
Feature activation+4.45e-04

Pos logit contributions
idents+0.026
ric+0.023
ah+0.022
 argue+0.021
seven+0.021
Neg logit contributions
aver-0.031
 doctors-0.030
 knights-0.027
 pirates-0.026
avers-0.026
Token From
Token ID3574
Feature activation-1.87e-02

Pos logit contributions
idents+0.001
ric+0.001
ah+0.001
 argue+0.001
s+0.001
Neg logit contributions
aver-0.001
 doctors-0.001
 knights-0.001
avers-0.001
ily-0.001
Token that
Token ID326
Feature activation+5.86e-02

Pos logit contributions
aver+0.037
 doctors+0.036
 knights+0.033
avers+0.032
eh+0.031
Neg logit contributions
idents-0.026
ric-0.023
s-0.023
ah-0.023
 argue-0.021
Token day
Token ID1110
Feature activation-1.04e-03

Pos logit contributions
idents+0.130
ric+0.118
seven+0.111
 argue+0.105
 destination+0.105
Neg logit contributions
aver-0.075
 doctors-0.068
P-0.063
eh-0.057
 knights-0.057
Token on
Token ID319
Feature activation+2.65e-02

Pos logit contributions
aver+0.002
 doctors+0.002
 knights+0.002
avers+0.002
 pirates+0.002
Neg logit contributions
idents-0.002
ah-0.002
s-0.002
ric-0.002
 argue-0.001
Token,
Token ID11
Feature activation+2.24e-02

Pos logit contributions
idents+0.036
s+0.032
ric+0.031
ah+0.031
 argue+0.029
Neg logit contributions
aver-0.055
 doctors-0.053
 knights-0.048
avers-0.047
eh-0.046
Token the
Token ID262
Feature activation+1.99e-02

Pos logit contributions
idents+0.037
ric+0.033
ah+0.032
s+0.031
 argue+0.031
Neg logit contributions
aver-0.039
 doctors-0.039
 knights-0.035
 pirates-0.034
avers-0.033
Token painter
Token ID34537
Feature activation-1.12e-02

Pos logit contributions
idents+0.039
ah+0.035
 argue+0.034
 summit+0.034
ric+0.034
Neg logit contributions
aver-0.028
 doctors-0.024
avers-0.022
P-0.021
eh-0.020
TokenM
Token ID44
Feature activation+1.99e-02

Pos logit contributions
 doctors+0.004
aver+0.004
P+0.003
 knights+0.003
 pirates+0.003
Neg logit contributions
idents-0.005
ric-0.004
ah-0.004
 argue-0.004
 Today-0.004
Tokenum
Token ID388
Feature activation-2.10e-02

Pos logit contributions
idents+0.033
ric+0.030
ah+0.028
 argue+0.028
s+0.028
Neg logit contributions
aver-0.035
 doctors-0.034
 knights-0.030
avers-0.029
ily-0.029
Token smiled
Token ID13541
Feature activation-2.69e-03

Pos logit contributions
aver+0.039
 doctors+0.039
 knights+0.034
avers+0.033
 pirates+0.033
Neg logit contributions
idents-0.031
ah-0.029
s-0.028
ric-0.028
 argue-0.026
Token,
Token ID11
Feature activation-9.78e-03

Pos logit contributions
aver+0.005
 doctors+0.005
 knights+0.005
avers+0.004
 pirates+0.004
Neg logit contributions
idents-0.004
ric-0.004
ah-0.004
 argue-0.003
s-0.003
Token "
Token ID366
Feature activation-2.08e-03

Pos logit contributions
aver+0.020
 doctors+0.020
 knights+0.018
avers+0.017
 pirates+0.017
Neg logit contributions
idents-0.013
s-0.011
ric-0.011
ah-0.011
seven-0.011
TokenThat
Token ID2504
Feature activation+5.41e-02

Pos logit contributions
aver+0.004
 doctors+0.004
P+0.003
avers+0.003
ily+0.003
Neg logit contributions
idents-0.003
ric-0.003
 argue-0.003
s-0.003
 destination-0.003
Token's
Token ID338
Feature activation+7.55e-04

Pos logit contributions
ric+0.078
idents+0.077
 argue+0.071
seven+0.068
ah+0.064
Neg logit contributions
 doctors-0.098
aver-0.092
 knights-0.087
 gard-0.081
 pirates-0.081
Token a
Token ID257
Feature activation+7.82e-03

Pos logit contributions
s+0.001
idents+0.001
ric+0.001
 argue+0.001
ah+0.001
Neg logit contributions
aver-0.001
 doctors-0.001
ily-0.001
P-0.001
avers-0.001
Token great
Token ID1049
Feature activation+7.69e-03

Pos logit contributions
idents+0.012
ric+0.011
s+0.011
ah+0.011
 argue+0.010
Neg logit contributions
aver-0.014
 doctors-0.014
 knights-0.012
avers-0.012
P-0.012
Token idea
Token ID2126
Feature activation-2.61e-02

Pos logit contributions
idents+0.010
ric+0.010
s+0.009
ah+0.009
seven+0.008
Neg logit contributions
aver-0.015
 doctors-0.015
 knights-0.014
 pirates-0.013
avers-0.013
Token,"
Token ID553
Feature activation-1.21e-04

Pos logit contributions
aver+0.054
 doctors+0.052
 knights+0.048
avers+0.047
ily+0.046
Neg logit contributions
idents-0.034
ric-0.031
ah-0.029
s-0.028
 argue-0.028
Token it
Token ID340
Feature activation+3.43e-03

Pos logit contributions
aver+0.063
 doctors+0.058
avers+0.054
 knights+0.053
eh+0.052
Neg logit contributions
idents-0.043
s-0.038
ric-0.038
ah-0.037
seven-0.035
Token.
Token ID13
Feature activation-9.28e-03

Pos logit contributions
ric+0.006
idents+0.006
 argue+0.006
ah+0.005
 summit+0.005
Neg logit contributions
 doctors-0.005
aver-0.005
ily-0.005
 knights-0.004
avers-0.004
Token
Token ID198
Feature activation-3.77e-03

Pos logit contributions
aver+0.021
 doctors+0.020
 knights+0.018
avers+0.018
 pirates+0.018
Neg logit contributions
idents-0.011
s-0.009
ric-0.009
ah-0.009
 Today-0.009
Token
Token ID198
Feature activation-2.24e-03

Pos logit contributions
aver+0.007
 doctors+0.007
 knights+0.006
avers+0.006
ily+0.006
Neg logit contributions
idents-0.006
ric-0.005
ah-0.005
s-0.005
 argue-0.005
TokenFrom
Token ID4863
Feature activation-1.55e-02

Pos logit contributions
aver+0.004
 doctors+0.004
 knights+0.004
 pirates+0.004
 spies+0.004
Neg logit contributions
idents-0.003
s-0.003
that-0.003
ah-0.003
ric-0.003
Token that
Token ID326
Feature activation+5.02e-02

Pos logit contributions
aver+0.032
 doctors+0.031
 knights+0.030
avers+0.028
eh+0.028
Neg logit contributions
idents-0.020
s-0.017
ric-0.017
ah-0.017
seven-0.015
Token day
Token ID1110
Feature activation-5.65e-03

Pos logit contributions
idents+0.104
ric+0.092
seven+0.085
 destination+0.083
 argue+0.082
Neg logit contributions
aver-0.078
 doctors-0.068
P-0.063
eh-0.062
avers-0.062
Token on
Token ID319
Feature activation+2.38e-02

Pos logit contributions
aver+0.010
 doctors+0.010
 knights+0.009
avers+0.009
 pirates+0.009
Neg logit contributions
idents-0.009
s-0.008
ah-0.008
ric-0.008
 argue-0.007
Token,
Token ID11
Feature activation+1.88e-02

Pos logit contributions
idents+0.032
s+0.028
ah+0.027
ric+0.027
 argue+0.026
Neg logit contributions
aver-0.050
 doctors-0.047
 knights-0.043
avers-0.043
 pirates-0.042
Token Lily
Token ID20037
Feature activation-2.15e-03

Pos logit contributions
idents+0.039
ric+0.036
ah+0.035
s+0.035
 argue+0.034
Neg logit contributions
aver-0.025
 doctors-0.024
 knights-0.021
avers-0.020
 pirates-0.019
Token always
Token ID1464
Feature activation-1.49e-03

Pos logit contributions
aver+0.004
 doctors+0.004
 knights+0.003
avers+0.003
 pirates+0.003
Neg logit contributions
idents-0.003
ric-0.003
ah-0.003
s-0.003
 argue-0.003
Token."
Token ID526
Feature activation-3.66e-03

Pos logit contributions
aver+0.035
 doctors+0.034
 knights+0.032
 pirates+0.031
avers+0.030
Neg logit contributions
idents-0.023
ric-0.020
 argue-0.019
s-0.018
 summit-0.017
Token
Token ID198
Feature activation-3.96e-03

Pos logit contributions
aver+0.008
 doctors+0.008
 knights+0.007
avers+0.007
 pirates+0.007
Neg logit contributions
idents-0.005
ric-0.004
s-0.004
ah-0.004
 argue-0.004
Token
Token ID198
Feature activation-4.70e-03

Pos logit contributions
aver+0.008
 doctors+0.008
 knights+0.007
avers+0.007
 pirates+0.007
Neg logit contributions
idents-0.005
ric-0.005
s-0.005
ah-0.005
seven-0.004
TokenThe
Token ID464
Feature activation+3.77e-02

Pos logit contributions
aver+0.010
 doctors+0.010
 knights+0.009
avers+0.008
 pirates+0.008
Neg logit contributions
idents-0.006
s-0.005
ah-0.005
ric-0.005
seven-0.005
Token moral
Token ID6573
Feature activation-1.71e-02

Pos logit contributions
idents+0.044
ah+0.041
ric+0.038
 Today+0.037
s+0.037
Neg logit contributions
aver-0.077
 doctors-0.073
avers-0.069
eh-0.068
iers-0.067
Token of
Token ID286
Feature activation+4.48e-02

Pos logit contributions
aver+0.027
 doctors+0.025
eh+0.025
 knights+0.024
avers+0.023
Neg logit contributions
s-0.029
ah-0.027
idents-0.026
 Today-0.026
seven-0.025
Token the
Token ID262
Feature activation+3.64e-02

Pos logit contributions
idents+0.073
ah+0.060
 argue+0.060
ric+0.060
s+0.060
Neg logit contributions
aver-0.083
 doctors-0.081
 knights-0.073
ily-0.072
 pirates-0.071
Token story
Token ID1621
Feature activation+5.05e-03

Pos logit contributions
idents+0.034
ah+0.030
s+0.028
ric+0.028
 Today+0.028
Neg logit contributions
aver-0.086
 doctors-0.082
avers-0.077
 knights-0.076
eh-0.076
Token is
Token ID318
Feature activation-5.76e-03

Pos logit contributions
idents+0.007
ric+0.006
ah+0.006
 argue+0.006
ksh+0.006
Neg logit contributions
 doctors-0.009
aver-0.009
ily-0.009
 knights-0.009
 pirates-0.009
Token that
Token ID326
Feature activation+2.73e-02

Pos logit contributions
aver+0.012
 doctors+0.011
 knights+0.010
avers+0.010
eh+0.010
Neg logit contributions
idents-0.008
ric-0.007
s-0.007
ah-0.007
 argue-0.006
Token it
Token ID340
Feature activation+7.95e-03

Pos logit contributions
idents+0.038
s+0.033
ric+0.032
 argue+0.031
ah+0.030
Neg logit contributions
aver-0.057
 doctors-0.052
avers-0.049
eh-0.049
P-0.048
Token away
Token ID1497
Feature activation+1.71e-02

Pos logit contributions
aver+0.002
 doctors+0.002
 knights+0.002
avers+0.002
ily+0.002
Neg logit contributions
idents-0.002
ric-0.002
ah-0.002
seven-0.002
 argue-0.002
Token and
Token ID290
Feature activation+3.14e-03

Pos logit contributions
idents+0.028
ric+0.025
ah+0.024
s+0.023
 argue+0.023
Neg logit contributions
aver-0.030
 doctors-0.030
 knights-0.027
avers-0.026
 pirates-0.025
Token Mom
Token ID11254
Feature activation-3.53e-03

Pos logit contributions
idents+0.005
s+0.005
 argue+0.004
ah+0.004
ric+0.004
Neg logit contributions
aver-0.006
 doctors-0.005
avers-0.005
P-0.005
 knights-0.005
Tokenmy
Token ID1820
Feature activation-1.13e-02

Pos logit contributions
aver+0.008
 doctors+0.008
 knights+0.007
avers+0.007
 pirates+0.007
Neg logit contributions
idents-0.004
ric-0.003
ah-0.003
s-0.003
 argue-0.003
Token smiled
Token ID13541
Feature activation+1.82e-03

Pos logit contributions
aver+0.019
 doctors+0.018
 knights+0.016
P+0.016
avers+0.016
Neg logit contributions
idents-0.019
s-0.018
ah-0.017
ric-0.016
 argue-0.016
Token.
Token ID13
Feature activation+4.27e-03

Pos logit contributions
idents+0.003
ric+0.003
ah+0.003
 argue+0.003
seven+0.002
Neg logit contributions
 doctors-0.003
aver-0.003
 knights-0.003
 pirates-0.003
ily-0.003
Token She
Token ID1375
Feature activation-7.67e-03

Pos logit contributions
idents+0.006
ric+0.005
ah+0.005
s+0.005
 argue+0.005
Neg logit contributions
aver-0.009
 doctors-0.008
 knights-0.008
avers-0.007
ily-0.007
Token said
Token ID531
Feature activation-8.27e-03

Pos logit contributions
aver+0.014
 doctors+0.014
 knights+0.013
avers+0.012
 pirates+0.012
Neg logit contributions
idents-0.012
ah-0.010
s-0.010
ric-0.010
 argue-0.010
Token "
Token ID366
Feature activation+8.36e-03

Pos logit contributions
aver+0.020
 doctors+0.018
eh+0.018
avers+0.018
P+0.017
Neg logit contributions
idents-0.008
s-0.008
ah-0.007
seven-0.006
 Today-0.006
TokenThank
Token ID10449
Feature activation-3.60e-02

Pos logit contributions
idents+0.015
ric+0.013
ah+0.013
s+0.013
 argue+0.013
Neg logit contributions
aver-0.014
 doctors-0.013
 knights-0.012
avers-0.011
 pirates-0.011
Token you
Token ID345
Feature activation+1.18e-02

Pos logit contributions
aver+0.068
 doctors+0.061
avers+0.057
eh+0.055
 knights+0.055
Neg logit contributions
idents-0.056
s-0.050
ah-0.050
seven-0.048
ric-0.048
Token and
Token ID290
Feature activation+1.40e-02

Pos logit contributions
idents+0.022
ric+0.021
ah+0.020
 argue+0.020
 Today+0.019
Neg logit contributions
 doctors-0.021
aver-0.020
 knights-0.019
 pirates-0.018
ily-0.017
Token touch
Token ID3638
Feature activation+1.10e-02

Pos logit contributions
idents+0.022
ric+0.019
ah+0.019
s+0.018
 argue+0.018
Neg logit contributions
aver-0.026
 doctors-0.025
 knights-0.023
avers-0.022
 pirates-0.022
Token the
Token ID262
Feature activation-6.77e-04

Pos logit contributions
idents+0.018
ric+0.016
 argue+0.015
ah+0.015
 Today+0.014
Neg logit contributions
 doctors-0.019
aver-0.019
 knights-0.018
ily-0.017
 pirates-0.017
Token sky
Token ID6766
Feature activation+1.56e-02

Pos logit contributions
aver+0.001
 doctors+0.001
avers+0.001
 knights+0.001
eh+0.001
Neg logit contributions
idents-0.001
ric-0.001
s-0.001
ah-0.001
 argue-0.001
Token.
Token ID13
Feature activation+8.61e-03

Pos logit contributions
idents+0.031
ric+0.027
ah+0.027
seven+0.026
 argue+0.025
Neg logit contributions
 doctors-0.022
aver-0.020
 knights-0.019
 pirates-0.019
ily-0.018
Token Mia
Token ID32189
Feature activation+9.52e-03

Pos logit contributions
idents+0.015
ric+0.013
ah+0.013
 argue+0.013
s+0.012
Neg logit contributions
aver-0.014
 doctors-0.014
ily-0.013
 knights-0.013
avers-0.012
Token showed
Token ID3751
Feature activation+1.16e-02

Pos logit contributions
idents+0.016
ric+0.014
ah+0.013
 Today+0.013
 argue+0.013
Neg logit contributions
aver-0.017
 doctors-0.016
 knights-0.014
avers-0.014
 pirates-0.014
Token the
Token ID262
Feature activation+8.83e-03

Pos logit contributions
idents+0.019
ric+0.019
ah+0.017
 argue+0.016
 Today+0.016
Neg logit contributions
 doctors-0.020
aver-0.019
 knights-0.018
ily-0.018
 pirates-0.017
Token butterfly
Token ID35113
Feature activation+3.83e-02

Pos logit contributions
idents+0.015
ric+0.013
ah+0.013
s+0.013
 argue+0.013
Neg logit contributions
aver-0.015
 doctors-0.015
 knights-0.013
avers-0.013
eh-0.013
Token her
Token ID607
Feature activation-1.59e-03

Pos logit contributions
idents+0.079
ah+0.073
ric+0.072
seven+0.070
 argue+0.067
Neg logit contributions
 doctors-0.049
 knights-0.047
aver-0.046
 pirates-0.043
ily-0.041
Token colorful
Token ID20239
Feature activation-4.49e-03

Pos logit contributions
aver+0.002
 doctors+0.002
 knights+0.002
 pirates+0.002
 spies+0.002
Neg logit contributions
idents-0.003
ah-0.003
seven-0.003
ric-0.002
ksh-0.002
Token of
Token ID286
Feature activation-8.55e-03

Pos logit contributions
aver+0.061
 doctors+0.059
 knights+0.052
avers+0.051
P+0.051
Neg logit contributions
idents-0.059
s-0.057
ric-0.055
ah-0.054
 argue-0.053
Token the
Token ID262
Feature activation-1.81e-02

Pos logit contributions
aver+0.020
 doctors+0.019
avers+0.018
eh+0.017
P+0.017
Neg logit contributions
idents-0.008
ric-0.008
s-0.008
ah-0.008
 argue-0.007
Token kingdom
Token ID13239
Feature activation-2.18e-02

Pos logit contributions
aver+0.032
avers+0.026
 doctors+0.026
ily+0.026
eh+0.024
Neg logit contributions
idents-0.028
 destination-0.027
s-0.026
ah-0.026
 argue-0.025
Token came
Token ID1625
Feature activation+2.13e-02

Pos logit contributions
aver+0.041
 doctors+0.040
 knights+0.036
avers+0.035
P+0.034
Neg logit contributions
idents-0.032
ric-0.029
s-0.028
ah-0.028
 argue-0.027
Token to
Token ID284
Feature activation+2.35e-02

Pos logit contributions
idents+0.034
ric+0.030
ah+0.030
s+0.028
seven+0.028
Neg logit contributions
aver-0.038
 doctors-0.037
 knights-0.033
avers-0.032
 pirates-0.032
Token celebrate
Token ID10648
Feature activation+2.79e-02

Pos logit contributions
idents+0.037
ric+0.032
s+0.032
ah+0.031
seven+0.029
Neg logit contributions
 doctors-0.043
aver-0.043
 knights-0.039
 pirates-0.038
ily-0.037
Token.
Token ID13
Feature activation+6.62e-03

Pos logit contributions
idents+0.052
ric+0.051
ah+0.048
 argue+0.045
seven+0.044
Neg logit contributions
 doctors-0.042
 knights-0.039
 pirates-0.038
aver-0.038
ily-0.035
Token Everyone
Token ID11075
Feature activation-2.40e-02

Pos logit contributions
idents+0.010
ric+0.008
ah+0.008
s+0.008
 argue+0.008
Neg logit contributions
aver-0.013
 doctors-0.013
 knights-0.011
avers-0.011
ily-0.011
Token was
Token ID373
Feature activation+3.53e-03

Pos logit contributions
aver+0.040
 doctors+0.039
 knights+0.035
 pirates+0.034
avers+0.034
Neg logit contributions
idents-0.042
ric-0.038
ah-0.036
 argue-0.035
 Today-0.034
Token so
Token ID523
Feature activation+7.86e-03

Pos logit contributions
idents+0.006
ric+0.005
ah+0.005
s+0.005
 argue+0.005
Neg logit contributions
aver-0.007
 doctors-0.006
 knights-0.006
avers-0.006
 pirates-0.005
Token happy
Token ID3772
Feature activation-7.74e-03

Pos logit contributions
idents+0.013
ric+0.012
ah+0.012
s+0.011
 argue+0.011
Neg logit contributions
aver-0.014
 doctors-0.013
 knights-0.012
avers-0.011
 pirates-0.011
Token named
Token ID3706
Feature activation+6.45e-03

Pos logit contributions
aver+0.019
 doctors+0.019
 knights+0.016
avers+0.015
ily+0.015
Neg logit contributions
idents-0.034
s-0.033
ah-0.033
ric-0.033
seven-0.031
Token Max
Token ID5436
Feature activation+1.26e-03

Pos logit contributions
idents+0.011
ric+0.010
ah+0.010
s+0.010
 argue+0.009
Neg logit contributions
aver-0.011
 doctors-0.011
 knights-0.009
avers-0.009
 pirates-0.009
Token.
Token ID13
Feature activation+2.80e-02

Pos logit contributions
idents+0.002
ric+0.002
ah+0.002
s+0.002
 argue+0.002
Neg logit contributions
aver-0.002
 doctors-0.002
 knights-0.002
avers-0.002
 pirates-0.002
Token Max
Token ID5436
Feature activation-5.74e-03

Pos logit contributions
idents+0.060
ric+0.059
ah+0.055
s+0.053
 argue+0.053
Neg logit contributions
 doctors-0.033
aver-0.032
 knights-0.030
 pirates-0.026
avers-0.025
Token loved
Token ID6151
Feature activation-3.06e-02

Pos logit contributions
aver+0.010
 doctors+0.010
 knights+0.009
avers+0.009
eh+0.009
Neg logit contributions
idents-0.009
ric-0.008
s-0.008
ah-0.008
 argue-0.008
Token to
Token ID284
Feature activation+3.84e-03

Pos logit contributions
aver+0.063
 doctors+0.058
avers+0.054
ily+0.053
 knights+0.053
Neg logit contributions
idents-0.042
ric-0.038
s-0.037
ah-0.036
 argue-0.035
Token drive
Token ID3708
Feature activation-1.14e-02

Pos logit contributions
idents+0.006
ric+0.005
ah+0.005
s+0.005
seven+0.005
Neg logit contributions
aver-0.007
 doctors-0.007
 knights-0.006
avers-0.006
 pirates-0.006
Token around
Token ID1088
Feature activation-2.30e-04

Pos logit contributions
 doctors+0.019
aver+0.018
 knights+0.017
 pirates+0.016
avers+0.015
Neg logit contributions
idents-0.021
ric-0.020
ah-0.018
 argue-0.017
s-0.017
Token and
Token ID290
Feature activation+2.00e-02

Pos logit contributions
 doctors+0.000
aver+0.000
 knights+0.000
 pirates+0.000
avers+0.000
Neg logit contributions
idents-0.000
ric-0.000
ah-0.000
s-0.000
seven-0.000
Token explore
Token ID7301
Feature activation+7.24e-03

Pos logit contributions
idents+0.033
ric+0.028
ah+0.028
s+0.027
seven+0.026
Neg logit contributions
 doctors-0.036
aver-0.035
 knights-0.031
 pirates-0.030
avers-0.029
Token new
Token ID649
Feature activation-1.87e-02

Pos logit contributions
idents+0.010
ric+0.009
ah+0.009
s+0.008
 argue+0.008
Neg logit contributions
aver-0.015
 doctors-0.014
 knights-0.013
avers-0.013
 pirates-0.012
Token kept
Token ID4030
Feature activation-9.35e-04

Pos logit contributions
aver+0.010
 doctors+0.010
 knights+0.009
avers+0.009
 pirates+0.008
Neg logit contributions
idents-0.011
ric-0.010
ah-0.009
s-0.009
 argue-0.009
Token flying
Token ID7348
Feature activation+1.59e-03

Pos logit contributions
aver+0.002
 doctors+0.002
 knights+0.001
avers+0.001
ily+0.001
Neg logit contributions
idents-0.002
ric-0.001
ah-0.001
s-0.001
 argue-0.001
Token until
Token ID1566
Feature activation-3.97e-03

Pos logit contributions
idents+0.003
ric+0.002
 argue+0.002
ah+0.002
seven+0.002
Neg logit contributions
aver-0.003
 doctors-0.003
 knights-0.002
avers-0.002
 pirates-0.002
Token it
Token ID340
Feature activation-8.18e-03

Pos logit contributions
aver+0.009
 doctors+0.008
avers+0.007
 knights+0.007
eh+0.007
Neg logit contributions
idents-0.005
ah-0.004
s-0.004
ric-0.004
 argue-0.004
Token reached
Token ID4251
Feature activation+1.81e-02

Pos logit contributions
aver+0.014
 doctors+0.013
 knights+0.012
avers+0.012
 pirates+0.011
Neg logit contributions
idents-0.014
ric-0.012
ah-0.012
s-0.012
 argue-0.012
Token a
Token ID257
Feature activation+1.21e-02

Pos logit contributions
idents+0.026
ric+0.023
ah+0.023
s+0.022
 argue+0.022
Neg logit contributions
aver-0.035
 doctors-0.034
 knights-0.031
avers-0.030
 pirates-0.030
Token me
Token ID502
Feature activation-4.14e-02

Pos logit contributions
idents+0.020
ric+0.018
ah+0.018
s+0.018
 argue+0.017
Neg logit contributions
aver-0.021
 doctors-0.020
 knights-0.018
avers-0.018
ily-0.017
Tokenadow
Token ID4584
Feature activation-7.41e-03

Pos logit contributions
aver+0.073
 doctors+0.068
 knights+0.061
avers+0.061
eh+0.059
Neg logit contributions
idents-0.070
ric-0.060
s-0.059
ah-0.059
 argue-0.058
Token with
Token ID351
Feature activation-2.21e-02

Pos logit contributions
aver+0.015
 doctors+0.014
 knights+0.013
avers+0.013
 pirates+0.012
Neg logit contributions
idents-0.011
ric-0.009
ah-0.009
s-0.009
 argue-0.009
Token beautiful
Token ID4950
Feature activation-1.97e-02

Pos logit contributions
aver+0.043
 doctors+0.041
avers+0.037
 knights+0.037
eh+0.036
Neg logit contributions
idents-0.032
ric-0.028
ah-0.028
s-0.028
 argue-0.026
Token flowers
Token ID12734
Feature activation-1.76e-02

Pos logit contributions
aver+0.036
 doctors+0.034
avers+0.031
 knights+0.030
ily+0.030
Neg logit contributions
idents-0.031
ric-0.028
ah-0.028
s-0.028
 argue-0.025
Token charming
Token ID23332
Feature activation+1.73e-03

Pos logit contributions
idents+0.011
s+0.010
ah+0.010
 Today+0.010
ric+0.010
Neg logit contributions
aver-0.013
 doctors-0.012
iers-0.012
avers-0.011
 knights-0.011
Token little
Token ID1310
Feature activation-8.67e-03

Pos logit contributions
s+0.003
idents+0.003
 Today+0.002
ah+0.002
 destination+0.002
Neg logit contributions
aver-0.003
iers-0.003
 doctors-0.003
avers-0.003
itt-0.003
Token town
Token ID3240
Feature activation-2.63e-02

Pos logit contributions
aver+0.016
 doctors+0.015
avers+0.014
ily+0.014
iers+0.014
Neg logit contributions
s-0.014
idents-0.012
ah-0.012
 Today-0.012
ric-0.012
Token,
Token ID11
Feature activation+2.23e-02

Pos logit contributions
 doctors+0.062
aver+0.062
 knights+0.056
P+0.056
 spies+0.055
Neg logit contributions
s-0.026
 Today-0.020
idents-0.019
ah-0.019
ric-0.018
Token there
Token ID612
Feature activation+1.10e-02

Pos logit contributions
s+0.039
idents+0.039
 Today+0.039
ah+0.036
 argue+0.035
Neg logit contributions
 doctors-0.032
aver-0.032
 knights-0.027
avers-0.026
eh-0.026
Token lived
Token ID5615
Feature activation-5.01e-04

Pos logit contributions
idents+0.017
ric+0.015
ah+0.015
s+0.014
 argue+0.014
Neg logit contributions
aver-0.021
 doctors-0.020
 knights-0.018
avers-0.017
 pirates-0.017
Token a
Token ID257
Feature activation+1.72e-02

Pos logit contributions
aver+0.001
 doctors+0.001
 importantly+0.001
avers+0.001
renches+0.001
Neg logit contributions
that-0.000
s-0.000
 total-0.000
 Today-0.000
 -0.000
Token boy
Token ID2933
Feature activation-2.13e-02

Pos logit contributions
idents+0.025
s+0.024
ah+0.023
 total+0.021
 argue+0.021
Neg logit contributions
aver-0.031
 doctors-0.028
iers-0.028
avers-0.027
ily-0.026
Token named
Token ID3706
Feature activation+1.21e-02

Pos logit contributions
 doctors+0.028
aver+0.025
iers+0.024
 gard+0.023
obb+0.023
Neg logit contributions
s-0.041
ah-0.038
ric-0.037
 Today-0.035
 argue-0.034
Token Tim
Token ID5045
Feature activation-9.33e-05

Pos logit contributions
idents+0.018
s+0.016
ah+0.016
 Today+0.015
ric+0.015
Neg logit contributions
aver-0.023
 doctors-0.022
avers-0.020
P-0.019
 knights-0.019
Token.
Token ID13
Feature activation+2.53e-02

Pos logit contributions
aver+0.000
 doctors+0.000
 knights+0.000
avers+0.000
ily+0.000
Neg logit contributions
idents-0.000
ric-0.000
 summit-0.000
 argue-0.000
seven-0.000
Token excited
Token ID6568
Feature activation-1.03e-02

Pos logit contributions
aver+0.007
 doctors+0.007
 knights+0.006
P+0.006
avers+0.006
Neg logit contributions
idents-0.005
s-0.005
 argue-0.005
ric-0.005
ah-0.005
Token because
Token ID780
Feature activation-2.03e-03

Pos logit contributions
aver+0.020
 doctors+0.020
 knights+0.018
avers+0.017
 pirates+0.017
Neg logit contributions
idents-0.015
ric-0.013
ah-0.013
s-0.012
 argue-0.012
Token it
Token ID340
Feature activation+5.05e-03

Pos logit contributions
aver+0.004
 doctors+0.004
 knights+0.003
avers+0.003
 pirates+0.003
Neg logit contributions
idents-0.003
ric-0.003
ah-0.003
s-0.003
 argue-0.003
Token was
Token ID373
Feature activation+1.09e-02

Pos logit contributions
idents+0.007
ric+0.006
ah+0.006
s+0.006
 argue+0.006
Neg logit contributions
aver-0.010
 doctors-0.010
 knights-0.009
avers-0.009
 pirates-0.009
Token holiday
Token ID9912
Feature activation+2.55e-03

Pos logit contributions
idents+0.017
ric+0.015
ah+0.015
s+0.015
 argue+0.014
Neg logit contributions
aver-0.020
 doctors-0.019
 knights-0.017
avers-0.017
P-0.016
Token time
Token ID640
Feature activation-2.42e-02

Pos logit contributions
idents+0.003
ric+0.003
ah+0.003
 argue+0.003
seven+0.003
Neg logit contributions
aver-0.005
 doctors-0.005
 knights-0.005
avers-0.005
 pirates-0.005
Token!
Token ID0
Feature activation+4.88e-03

Pos logit contributions
aver+0.039
 doctors+0.038
 knights+0.034
 pirates+0.033
avers+0.033
Neg logit contributions
idents-0.040
ric-0.038
ah-0.037
 argue-0.034
 summit-0.034
Token They
Token ID1119
Feature activation-9.30e-03

Pos logit contributions
idents+0.008
ric+0.007
ah+0.007
 argue+0.007
s+0.007
Neg logit contributions
aver-0.009
 doctors-0.008
 knights-0.008
ily-0.007
avers-0.007
Token had
Token ID550
Feature activation+2.82e-03

Pos logit contributions
aver+0.015
 doctors+0.015
 knights+0.013
avers+0.013
eh+0.012
Neg logit contributions
idents-0.016
ric-0.015
ah-0.015
s-0.015
 argue-0.014
Token been
Token ID587
Feature activation-2.77e-02

Pos logit contributions
idents+0.005
ric+0.004
ah+0.004
s+0.004
 argue+0.004
Neg logit contributions
aver-0.005
 doctors-0.005
 knights-0.004
avers-0.004
 pirates-0.004
Token looking
Token ID2045
Feature activation+1.36e-03

Pos logit contributions
aver+0.048
 doctors+0.044
ily+0.041
eh+0.040
 knights+0.040
Neg logit contributions
idents-0.042
s-0.040
ric-0.040
 destination-0.040
 argue-0.038
Token liked
Token ID8288
Feature activation+7.01e-03

Pos logit contributions
aver+0.063
 doctors+0.062
 knights+0.055
avers+0.053
P+0.053
Neg logit contributions
idents-0.054
s-0.051
ah-0.049
ric-0.049
 argue-0.047
Token the
Token ID262
Feature activation-2.82e-03

Pos logit contributions
idents+0.010
ric+0.009
ah+0.009
s+0.009
 argue+0.008
Neg logit contributions
aver-0.014
 doctors-0.013
 knights-0.012
avers-0.012
 pirates-0.012
Token eggs
Token ID9653
Feature activation-1.18e-03

Pos logit contributions
 doctors+0.004
aver+0.004
 knights+0.004
avers+0.003
eh+0.003
Neg logit contributions
idents-0.005
ric-0.005
s-0.005
seven-0.005
 argue-0.005
Token and
Token ID290
Feature activation+7.09e-03

Pos logit contributions
aver+0.002
 doctors+0.002
 knights+0.001
 pirates+0.001
avers+0.001
Neg logit contributions
idents-0.002
seven-0.002
ric-0.002
ah-0.002
 Today-0.002
Token became
Token ID2627
Feature activation-8.64e-03

Pos logit contributions
idents+0.011
ric+0.010
s+0.009
ah+0.009
 argue+0.009
Neg logit contributions
aver-0.013
 doctors-0.013
 knights-0.011
avers-0.011
P-0.011
Token her
Token ID607
Feature activation-8.26e-03

Pos logit contributions
aver+0.016
 doctors+0.015
 knights+0.014
avers+0.014
eh+0.013
Neg logit contributions
idents-0.013
ric-0.012
ah-0.011
s-0.011
 argue-0.011
Token friends
Token ID2460
Feature activation+5.25e-03

Pos logit contributions
aver+0.014
 doctors+0.014
 knights+0.012
avers+0.012
 pirates+0.012
Neg logit contributions
idents-0.014
ric-0.013
ah-0.013
s-0.012
seven-0.012
Token.
Token ID13
Feature activation+8.24e-03

Pos logit contributions
idents+0.010
ric+0.009
ah+0.008
 argue+0.008
seven+0.008
Neg logit contributions
aver-0.008
 doctors-0.008
 knights-0.008
 pirates-0.007
ily-0.007
Token The
Token ID383
Feature activation+2.49e-02

Pos logit contributions
idents+0.013
ah+0.011
 argue+0.011
ric+0.011
s+0.010
Neg logit contributions
aver-0.015
 doctors-0.015
ily-0.014
 knights-0.014
avers-0.013
Token fat
Token ID3735
Feature activation+3.08e-02

Pos logit contributions
idents+0.046
ric+0.042
ah+0.041
s+0.041
 argue+0.039
Neg logit contributions
aver-0.039
 doctors-0.038
 knights-0.033
avers-0.033
 pirates-0.031
Token hen
Token ID30963
Feature activation-7.66e-03

Pos logit contributions
idents+0.062
ric+0.056
s+0.054
ah+0.054
 argue+0.054
Neg logit contributions
aver-0.045
 doctors-0.044
 knights-0.037
avers-0.036
eh-0.034
Token she
Token ID673
Feature activation+1.34e-02

Pos logit contributions
idents+0.012
ric+0.010
ah+0.010
s+0.010
 argue+0.010
Neg logit contributions
aver-0.019
 doctors-0.018
 knights-0.016
avers-0.016
P-0.016
Token sprayed
Token ID35729
Feature activation+9.86e-03

Pos logit contributions
idents+0.020
s+0.018
ah+0.018
ric+0.017
 argue+0.016
Neg logit contributions
aver-0.025
 doctors-0.024
 knights-0.022
avers-0.022
P-0.021
Token.
Token ID13
Feature activation+3.75e-03

Pos logit contributions
idents+0.014
s+0.012
ric+0.012
ah+0.012
 argue+0.011
Neg logit contributions
aver-0.020
 doctors-0.019
 knights-0.017
avers-0.017
P-0.017
Token She
Token ID1375
Feature activation+2.63e-03

Pos logit contributions
idents+0.004
s+0.003
 Today+0.003
ric+0.003
seven+0.003
Neg logit contributions
aver-0.009
 doctors-0.009
avers-0.008
 knights-0.008
 pirates-0.008
Token smiled
Token ID13541
Feature activation+2.29e-03

Pos logit contributions
idents+0.004
s+0.003
ah+0.003
ric+0.003
 argue+0.003
Neg logit contributions
aver-0.005
 doctors-0.005
 knights-0.005
avers-0.004
P-0.004
Token and
Token ID290
Feature activation-8.08e-04

Pos logit contributions
idents+0.004
ric+0.004
 argue+0.003
ah+0.003
seven+0.003
Neg logit contributions
aver-0.004
 doctors-0.004
 knights-0.004
 pirates-0.003
avers-0.003
Token felt
Token ID2936
Feature activation-1.98e-02

Pos logit contributions
aver+0.001
 doctors+0.001
 knights+0.001
avers+0.001
eh+0.001
Neg logit contributions
idents-0.001
s-0.001
ric-0.001
ah-0.001
 argue-0.001
Token happy
Token ID3772
Feature activation-1.16e-02

Pos logit contributions
aver+0.039
 doctors+0.037
 knights+0.034
avers+0.034
eh+0.032
Neg logit contributions
idents-0.028
s-0.025
ric-0.024
ah-0.023
 argue-0.022
Token.
Token ID13
Feature activation-7.58e-04

Pos logit contributions
aver+0.025
 doctors+0.024
avers+0.022
 knights+0.022
P+0.021
Neg logit contributions
idents-0.015
ric-0.013
s-0.012
ah-0.012
 argue-0.011
Token<|endoftext|>
Token ID50256
Feature activation-2.35e-02

Pos logit contributions
aver+0.002
 doctors+0.002
 knights+0.001
avers+0.001
 pirates+0.001
Neg logit contributions
idents-0.001
ric-0.001
s-0.001
ah-0.001
 argue-0.001
TokenOnce
Token ID7454
Feature activation+2.23e-02

Pos logit contributions
aver+0.046
 doctors+0.044
 knights+0.040
avers+0.039
 pirates+0.039
Neg logit contributions
idents-0.034
ric-0.030
ah-0.029
s-0.028
 argue-0.028
Token I
Token ID314
Feature activation-6.31e-03

Pos logit contributions
aver+0.008
 doctors+0.007
eh+0.007
avers+0.007
P+0.007
Neg logit contributions
idents-0.004
 Today-0.004
ric-0.003
seven-0.003
s-0.003
Token will
Token ID481
Feature activation-1.15e-02

Pos logit contributions
aver+0.014
 doctors+0.014
 knights+0.012
 pirates+0.012
avers+0.012
Neg logit contributions
idents-0.007
ric-0.007
seven-0.006
ah-0.006
s-0.006
Token never
Token ID1239
Feature activation+1.64e-02

Pos logit contributions
aver+0.024
 doctors+0.023
 knights+0.021
avers+0.021
eh+0.020
Neg logit contributions
idents-0.015
ric-0.014
ah-0.013
 argue-0.013
s-0.013
Token leave
Token ID2666
Feature activation-1.92e-02

Pos logit contributions
 argue+0.015
idents+0.013
ah+0.011
ric+0.011
're+0.011
Neg logit contributions
aver-0.041
 doctors-0.039
itt-0.038
avers-0.038
P-0.037
Token you
Token ID345
Feature activation-1.35e-02

Pos logit contributions
aver+0.044
 doctors+0.040
avers+0.038
P+0.036
eh+0.036
Neg logit contributions
idents-0.022
s-0.020
ric-0.018
ah-0.017
 argue-0.017
Token again
Token ID757
Feature activation-1.39e-02

Pos logit contributions
aver+0.027
 doctors+0.027
 knights+0.025
avers+0.024
 pirates+0.024
Neg logit contributions
idents-0.018
ric-0.016
ah-0.016
s-0.015
 argue-0.015
Token."
Token ID526
Feature activation-2.34e-03

Pos logit contributions
aver+0.028
 doctors+0.028
 knights+0.025
avers+0.025
 pirates+0.024
Neg logit contributions
idents-0.019
ric-0.017
ah-0.016
s-0.016
 argue-0.015
Token
Token ID198
Feature activation-1.70e-03

Pos logit contributions
aver+0.005
 doctors+0.005
avers+0.005
eh+0.005
 knights+0.005
Neg logit contributions
idents-0.003
s-0.002
ric-0.002
 Today-0.002
ah-0.002
Token
Token ID198
Feature activation-3.77e-03

Pos logit contributions
aver+0.004
 doctors+0.004
 knights+0.003
avers+0.003
 pirates+0.003
Neg logit contributions
idents-0.002
s-0.002
ric-0.002
ah-0.002
seven-0.002
TokenThey
Token ID2990
Feature activation+1.23e-02

Pos logit contributions
aver+0.007
 doctors+0.007
 knights+0.006
avers+0.006
 pirates+0.006
Neg logit contributions
idents-0.005
s-0.005
ah-0.005
ric-0.005
seven-0.005
Token crawled
Token ID45668
Feature activation+4.33e-03

Pos logit contributions
idents+0.021
ric+0.019
ah+0.019
s+0.019
 argue+0.019
Neg logit contributions
aver-0.020
 doctors-0.020
 knights-0.017
avers-0.017
eh-0.016
Token when
Token ID618
Feature activation-3.73e-03

Pos logit contributions
idents+0.016
ric+0.015
ah+0.014
 argue+0.014
seven+0.013
Neg logit contributions
aver-0.016
 doctors-0.015
 knights-0.014
avers-0.014
ily-0.013
Token they
Token ID484
Feature activation-1.17e-02

Pos logit contributions
aver+0.009
avers+0.007
 doctors+0.007
eh+0.007
P+0.007
Neg logit contributions
idents-0.005
s-0.004
 argue-0.004
ric-0.004
 Today-0.004
Token found
Token ID1043
Feature activation-1.31e-02

Pos logit contributions
aver+0.020
 doctors+0.019
 knights+0.017
avers+0.017
eh+0.016
Neg logit contributions
idents-0.020
s-0.018
ah-0.018
ric-0.018
 argue-0.017
Token a
Token ID257
Feature activation+4.02e-03

Pos logit contributions
aver+0.025
 doctors+0.024
 knights+0.022
avers+0.021
 pirates+0.021
Neg logit contributions
idents-0.020
ric-0.018
ah-0.017
s-0.017
 argue-0.016
Token nice
Token ID3621
Feature activation+7.73e-03

Pos logit contributions
idents+0.006
s+0.005
ah+0.005
ric+0.005
 argue+0.005
Neg logit contributions
aver-0.008
 doctors-0.007
avers-0.007
 knights-0.007
ily-0.007
Token spot
Token ID4136
Feature activation-4.40e-02

Pos logit contributions
idents+0.010
ah+0.009
s+0.009
ric+0.009
 Today+0.008
Neg logit contributions
aver-0.016
 doctors-0.015
avers-0.014
 knights-0.014
P-0.013
Token.
Token ID13
Feature activation-1.23e-02

Pos logit contributions
aver+0.080
 doctors+0.077
 knights+0.069
avers+0.068
 pirates+0.066
Neg logit contributions
idents-0.070
ric-0.062
ah-0.060
s-0.059
 argue-0.058
Token
Token ID198
Feature activation+3.42e-03

Pos logit contributions
aver+0.025
 doctors+0.024
 knights+0.022
avers+0.021
 pirates+0.021
Neg logit contributions
idents-0.017
ric-0.015
ah-0.015
s-0.014
 argue-0.014
Token
Token ID198
Feature activation-1.51e-03

Pos logit contributions
idents+0.005
ric+0.005
ah+0.005
 argue+0.004
seven+0.004
Neg logit contributions
aver-0.006
 doctors-0.006
avers-0.006
 knights-0.005
P-0.005
TokenThey
Token ID2990
Feature activation-2.59e-02

Pos logit contributions
aver+0.003
 doctors+0.003
 knights+0.003
avers+0.002
 pirates+0.002
Neg logit contributions
idents-0.002
s-0.002
ah-0.002
ric-0.002
 Today-0.002
Token walked
Token ID6807
Feature activation-1.26e-02

Pos logit contributions
 doctors+0.043
aver+0.043
 knights+0.038
avers+0.036
eh+0.036
Neg logit contributions
idents-0.040
s-0.040
ah-0.039
 argue-0.038
ric-0.035
Token time
Token ID640
Feature activation-2.07e-02

Pos logit contributions
aver+0.041
 doctors+0.038
avers+0.035
itt+0.035
iers+0.034
Neg logit contributions
idents-0.020
s-0.018
ah-0.017
ric-0.017
 argue-0.017
Token.
Token ID13
Feature activation+1.78e-02

Pos logit contributions
aver+0.055
 doctors+0.048
 gard+0.047
P+0.046
avers+0.046
Neg logit contributions
s-0.021
idents-0.017
 argue-0.012
 anymore-0.011
 destination-0.011
Token They
Token ID1119
Feature activation+1.47e-02

Pos logit contributions
idents+0.028
 argue+0.026
ric+0.026
ah+0.025
s+0.023
Neg logit contributions
 doctors-0.031
aver-0.031
 knights-0.029
ily-0.029
 pirates-0.027
Token had
Token ID550
Feature activation+2.55e-02

Pos logit contributions
idents+0.028
 Today+0.023
ric+0.023
s+0.022
seven+0.022
Neg logit contributions
aver-0.025
 doctors-0.023
 knights-0.021
 pirates-0.020
avers-0.020
Token lots
Token ID6041
Feature activation-2.40e-02

Pos logit contributions
idents+0.038
ric+0.034
ah+0.033
s+0.033
 argue+0.031
Neg logit contributions
aver-0.049
 doctors-0.047
 knights-0.042
avers-0.042
eh-0.041
Token of
Token ID286
Feature activation-4.93e-02

Pos logit contributions
aver+0.043
 doctors+0.042
avers+0.038
P+0.038
 spies+0.036
Neg logit contributions
s-0.038
ah-0.031
idents-0.030
 are-0.029
 kinds-0.028
Token fun
Token ID1257
Feature activation-3.31e-02

Pos logit contributions
aver+0.108
 doctors+0.101
avers+0.092
eh+0.091
P+0.090
Neg logit contributions
idents-0.057
s-0.049
 argue-0.049
ah-0.046
 Today-0.045
Token in
Token ID287
Feature activation-3.75e-03

Pos logit contributions
aver+0.065
 doctors+0.062
avers+0.055
 knights+0.054
ily+0.054
Neg logit contributions
idents-0.049
s-0.043
ric-0.043
ah-0.041
 argue-0.040
Token the
Token ID262
Feature activation-1.55e-02

Pos logit contributions
aver+0.008
 doctors+0.008
 knights+0.007
avers+0.007
 pirates+0.007
Neg logit contributions
idents-0.005
ric-0.004
ah-0.004
s-0.004
 argue-0.004
Token big
Token ID1263
Feature activation+1.28e-03

Pos logit contributions
aver+0.027
 doctors+0.025
avers+0.024
 knights+0.022
eh+0.022
Neg logit contributions
idents-0.025
ric-0.023
s-0.023
ah-0.022
 destination-0.022
Token land
Token ID1956
Feature activation-1.56e-02

Pos logit contributions
idents+0.002
s+0.002
 destination+0.001
ric+0.001
ah+0.001
Neg logit contributions
aver-0.003
 doctors-0.003
avers-0.002
ily-0.002
eh-0.002
Token to
Token ID284
Feature activation-5.89e-03

Pos logit contributions
 doctors+0.036
aver+0.035
 knights+0.032
avers+0.030
 pirates+0.030
Neg logit contributions
idents-0.030
s-0.030
ric-0.029
ah-0.029
 Today-0.027
Token play
Token ID711
Feature activation-2.95e-03

Pos logit contributions
aver+0.013
 doctors+0.012
 knights+0.011
avers+0.011
 pirates+0.011
Neg logit contributions
idents-0.007
ric-0.006
ah-0.006
 argue-0.006
s-0.005
Token in
Token ID287
Feature activation-4.01e-04

Pos logit contributions
aver+0.005
 doctors+0.005
avers+0.004
 knights+0.004
 pirates+0.004
Neg logit contributions
idents-0.005
s-0.005
ah-0.004
ric-0.004
 argue-0.004
Token and
Token ID290
Feature activation+1.14e-02

Pos logit contributions
 doctors+0.001
aver+0.001
 knights+0.001
 pirates+0.001
P+0.001
Neg logit contributions
idents-0.001
ah-0.001
ric-0.001
s-0.001
 argue-0.001
Token lots
Token ID6041
Feature activation-2.23e-02

Pos logit contributions
idents+0.017
ric+0.015
ah+0.014
s+0.014
 argue+0.013
Neg logit contributions
aver-0.022
 doctors-0.022
 knights-0.020
avers-0.019
 pirates-0.019
Token of
Token ID286
Feature activation-5.19e-02

Pos logit contributions
aver+0.033
 doctors+0.032
 knights+0.028
avers+0.027
 pirates+0.027
Neg logit contributions
s-0.041
idents-0.038
ah-0.037
ric-0.035
seven-0.034
Token grass
Token ID8701
Feature activation-1.71e-02

Pos logit contributions
aver+0.100
 doctors+0.091
itt+0.084
avers+0.084
iers+0.083
Neg logit contributions
s-0.061
ah-0.061
idents-0.060
fl-0.060
 kinds-0.058
Token to
Token ID284
Feature activation+2.52e-03

Pos logit contributions
aver+0.029
 doctors+0.029
 knights+0.026
 pirates+0.025
ily+0.025
Neg logit contributions
idents-0.029
ric-0.025
ah-0.025
 argue-0.024
seven-0.024
Token eat
Token ID4483
Feature activation+6.01e-03

Pos logit contributions
idents+0.003
ric+0.003
ah+0.003
s+0.003
 argue+0.003
Neg logit contributions
aver-0.005
 doctors-0.005
 knights-0.005
avers-0.005
 pirates-0.004
Token.
Token ID13
Feature activation+6.18e-03

Pos logit contributions
idents+0.009
ric+0.008
ah+0.007
s+0.007
 argue+0.007
Neg logit contributions
aver-0.012
 doctors-0.012
 knights-0.010
avers-0.010
 pirates-0.010
Token One
Token ID1881
Feature activation+2.49e-02

Pos logit contributions
idents+0.008
ric+0.006
ah+0.006
 argue+0.006
s+0.006
Neg logit contributions
 doctors-0.013
aver-0.013
 knights-0.012
ily-0.012
avers-0.012
Token playing
Token ID2712
Feature activation-4.03e-03

Pos logit contributions
aver+0.006
 doctors+0.006
 knights+0.005
P+0.005
eh+0.005
Neg logit contributions
idents-0.004
ric-0.003
ah-0.003
s-0.003
 argue-0.003
Token tag
Token ID7621
Feature activation+1.65e-02

Pos logit contributions
aver+0.007
 doctors+0.007
avers+0.006
 gard+0.006
 knights+0.006
Neg logit contributions
idents-0.006
ah-0.005
s-0.005
 argue-0.005
ric-0.005
Token and
Token ID290
Feature activation+7.15e-03

Pos logit contributions
idents+0.024
ric+0.021
s+0.020
ah+0.020
 argue+0.019
Neg logit contributions
aver-0.033
 doctors-0.032
 knights-0.028
avers-0.028
P-0.027
Token having
Token ID1719
Feature activation-4.00e-03

Pos logit contributions
idents+0.012
ric+0.011
ah+0.010
s+0.010
 argue+0.010
Neg logit contributions
aver-0.013
 doctors-0.012
avers-0.011
 knights-0.011
P-0.011
Token lots
Token ID6041
Feature activation-2.91e-02

Pos logit contributions
aver+0.007
 doctors+0.007
 knights+0.006
 pirates+0.006
avers+0.006
Neg logit contributions
idents-0.007
ric-0.006
ah-0.006
s-0.006
 argue-0.006
Token of
Token ID286
Feature activation-5.93e-02

Pos logit contributions
 doctors+0.043
aver+0.043
 knights+0.038
 pirates+0.036
 gard+0.035
Neg logit contributions
s-0.052
idents-0.050
ah-0.050
 Today-0.046
ric-0.046
Token fun
Token ID1257
Feature activation-4.01e-02

Pos logit contributions
aver+0.105
 doctors+0.101
 knights+0.090
avers+0.089
 pirates+0.086
Neg logit contributions
idents-0.096
ric-0.086
ah-0.085
s-0.084
 argue-0.082
Token.
Token ID13
Feature activation+1.53e-02

Pos logit contributions
aver+0.081
 doctors+0.078
 knights+0.071
avers+0.070
 pirates+0.068
Neg logit contributions
idents-0.056
ric-0.049
ah-0.048
s-0.047
 argue-0.045
Token Suddenly
Token ID24975
Feature activation+4.21e-03

Pos logit contributions
idents+0.024
ah+0.023
ric+0.023
 argue+0.021
seven+0.020
Neg logit contributions
aver-0.025
 doctors-0.025
avers-0.023
 knights-0.023
 pirates-0.022
Token,
Token ID11
Feature activation-5.22e-04

Pos logit contributions
idents+0.005
s+0.004
ah+0.004
 argue+0.003
 Today+0.003
Neg logit contributions
aver-0.010
 doctors-0.009
P-0.009
eh-0.009
avers-0.009
Token John
Token ID1757
Feature activation-1.30e-04

Pos logit contributions
aver+0.001
 doctors+0.001
 knights+0.001
avers+0.001
eh+0.001
Neg logit contributions
idents-0.001
ric-0.001
ah-0.001
s-0.001
 argue-0.001
Token but
Token ID475
Feature activation+6.81e-03

Pos logit contributions
idents+0.002
ric+0.002
ah+0.002
s+0.002
 argue+0.002
Neg logit contributions
aver-0.003
 doctors-0.003
avers-0.003
 knights-0.003
P-0.003
Token I
Token ID314
Feature activation-1.43e-02

Pos logit contributions
idents+0.011
ric+0.010
s+0.010
ah+0.009
 argue+0.009
Neg logit contributions
aver-0.012
 doctors-0.011
avers-0.011
eh-0.011
P-0.010
Token had
Token ID550
Feature activation+2.75e-04

Pos logit contributions
aver+0.027
 doctors+0.026
ily+0.022
avers+0.022
 knights+0.022
Neg logit contributions
idents-0.021
ric-0.019
seven-0.017
 summit-0.017
s-0.016
Token so
Token ID523
Feature activation-9.14e-03

Pos logit contributions
idents+0.000
ric+0.000
ah+0.000
s+0.000
 argue+0.000
Neg logit contributions
aver-0.001
 doctors-0.000
 knights-0.000
avers-0.000
P-0.000
Token much
Token ID881
Feature activation-3.94e-02

Pos logit contributions
aver+0.017
 doctors+0.017
 knights+0.015
avers+0.015
P+0.014
Neg logit contributions
idents-0.014
s-0.012
ric-0.012
ah-0.012
 argue-0.012
Token fun
Token ID1257
Feature activation-4.94e-02

Pos logit contributions
aver+0.094
avers+0.085
 doctors+0.079
P+0.079
eh+0.078
Neg logit contributions
idents-0.040
 Today-0.037
s-0.036
seven-0.033
 argue-0.032
Token playing
Token ID2712
Feature activation-1.12e-02

Pos logit contributions
aver+0.111
 doctors+0.106
avers+0.097
 knights+0.096
P+0.095
Neg logit contributions
idents-0.057
ric-0.048
s-0.048
ah-0.047
 argue-0.045
Token with
Token ID351
Feature activation-2.97e-02

Pos logit contributions
aver+0.020
 doctors+0.019
avers+0.017
 knights+0.016
eh+0.016
Neg logit contributions
idents-0.018
ric-0.016
ah-0.016
s-0.016
 argue-0.016
Token my
Token ID616
Feature activation-2.25e-02

Pos logit contributions
aver+0.063
 doctors+0.058
avers+0.054
 knights+0.052
eh+0.052
Neg logit contributions
idents-0.039
ric-0.034
s-0.034
ah-0.032
seven-0.031
Token new
Token ID649
Feature activation-2.59e-02

Pos logit contributions
aver+0.037
 doctors+0.035
 knights+0.031
avers+0.031
P+0.029
Neg logit contributions
idents-0.040
ric-0.036
ah-0.035
s-0.035
 argue-0.034
Token friends
Token ID2460
Feature activation-2.26e-03

Pos logit contributions
aver+0.047
 doctors+0.045
 knights+0.040
avers+0.040
P+0.038
Neg logit contributions
idents-0.041
ric-0.037
ah-0.036
s-0.036
 argue-0.035