TOP ACTIVATIONS
MAX = 0.114

and dad . They hope the big boy will be sorry
and sorry . The old man heard Tom .
!" Ben cried . " The ducks bit me ! They
it goodbye . The cup was happy and relieved
and explain . The fisherman comes to them .
were bad !" The man and the woman smiled
bite you ". The mouse said , " It
with you !" The big dog does not understand
runs away . The dog w ags its tail
, stop , stop ! The big train will hit us
says sorry . The bird is scared and hurt
hurt . They cried . The truck driver stopped and called
Lily cried . The bug was not modest .
and mad . They call the police . They hope the
to Tom and Lily ." The big boy looked down .
say sorry . The gym teacher comes and sees
with you ." The weird animal nods . It
were hungry . The man is surprised . He
. " Mir ror , mirror , make Anna
you again ". The boys thanked the bald <|endoftext|>

MIN ACTIVATIONS
MIN = -0.213

the baby was wearing a very soft blanket . The bird
years old and had a very sweet tooth . She loved
a time there was a very clever fox . He was
Bob and he had a very big frown . He went
Tom my was having a very special day . He was
time , there was a very thoughtful girl . She was
time , there was a very thoughtful girl . She was
time , there was a very silly gir affe . It
called Anna who had a very special passport . Everyone wanted
thought that sounded like a very fun game . So he
years old and had a very important question . She asked
very big bag for a very small girl . Do you
time , there was a very smart boy named Dave .
mall . It was a very successful mall , people from
Jack . He was a very compassionate boy , who always
corner . He had a very kind and thoughtful face .
time , there was a very successful duck . He lived
very nice house and a very nice lion . They said
Anna . She had a very kind aunt who always made
Julie . She was a very polite and kind girl who

INTERVAL 0.033 - 0.114
CONTAINS 17.399%

asked her fluffy bunny . The bunny just shook
it tastes like ." The family put the stew
them on top . They used tape to hold them together
home to tell her mom my all about her wonderful trip
replied , " To catch the sunset !" And with a

INTERVAL -0.049 - 0.033
CONTAINS 74.524%

couldn 't do any harm . Luc as
is glad they are safe and honest . She smiles .
help the fish stay smooth and protect it from other animals
a button . A bright flame comes out of the shiny
earthquakes . <|endoftext|> Once upon a time , there was a

INTERVAL -0.131 - -0.049
CONTAINS 8.020%

R alph smiled and felt very happy . His mom said
. <|endoftext|> Once there was a cozy little bear named Larry
school day . He got dressed in his warm clothes and
't believe he 'd been so adventurous .
way ?" Buzz and Buzz y argued for a

INTERVAL -0.213 - -0.131
CONTAINS 0.057%

Molly . She was a very intelligent girl . Every day
bright colours and made a very colourful design .
around and saw a big p uddle of mud . She
<|endoftext|> Su zie was a cute little girl . She loved
train . It was a very mild train . It ran
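
The four INTERVAL ranges above look like equal-width bins spanning MIN = -0.213 to MAX = 0.114, and the CONTAINS percentages sum to 100%. The sketch below is a guess at that arithmetic, not the dashboard's own code: `equal_width_bins` and `bin_shares` are hypothetical helper names, and the half-open (low, high] bin convention is an assumption. The computed edges agree with the listed ones to within about 0.001, which is consistent with the displayed MIN and MAX being rounded.

```python
def equal_width_bins(lo, hi, n_bins):
    """Split [lo, hi] into n_bins equal-width (low, high) pairs, highest bin first."""
    width = (hi - lo) / n_bins
    edges = [hi - i * width for i in range(n_bins + 1)]
    edges[-1] = lo  # pin the last edge so float error cannot move it
    return [(edges[i + 1], edges[i]) for i in range(n_bins)]


def bin_shares(values, bins):
    """Percentage of values in each bin (assumed convention: low < v <= high)."""
    counts = [0] * len(bins)
    for v in values:
        for i, (low, high) in enumerate(bins):
            is_last = i == len(bins) - 1
            # The lowest bin also includes its lower edge so the minimum is counted.
            if (low < v <= high) or (is_last and v == low):
                counts[i] += 1
                break
    total = len(values)
    if total == 0:
        return [0.0] * len(bins)
    return [100.0 * c / total for c in counts]


# MIN and MAX as displayed in the dump; four bins, as in the INTERVAL listing.
bins = equal_width_bins(-0.213, 0.114, 4)
for low, high in bins:
    print(f"INTERVAL {low:.3f} - {high:.3f}")
```

Given the full list of per-token activations (not included in this dump), `bin_shares(activations, bins)` would give the CONTAINS column under the same assumptions.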

Token and
Token ID290
Feature activation+1.44e-02

Pos logit contributions
Those+0.147
ill+0.131
Warning+0.127
Is+0.123
 fourteen+0.122
Neg logit contributions
 badly-0.102
--0.099
 shield-0.091
 job-0.091
 healthy-0.084
Token dad
Token ID9955
Feature activation+6.08e-02

Pos logit contributions
Those+0.032
ill+0.029
of+0.027
 fourteen+0.026
These+0.026
Neg logit contributions
 shield-0.018
 badly-0.018
 job-0.017
--0.017
 possession-0.017
Token.
Token ID13
Feature activation+2.40e-02

Pos logit contributions
Those+0.120
ill+0.108
Is+0.101
Warning+0.100
Kids+0.099
Neg logit contributions
 badly-0.094
 job-0.088
--0.088
 shield-0.086
 possession-0.080
Token They
Token ID1119
Feature activation+3.28e-02

Pos logit contributions
Those+0.044
ill+0.042
of+0.041
ion+0.038
 fourteen+0.037
Neg logit contributions
 job-0.037
 badly-0.036
 Amen-0.035
 shield-0.035
--0.034
Token hope
Token ID2911
Feature activation+3.23e-02

Pos logit contributions
Those+0.049
ill+0.042
of+0.037
Is+0.036
 fourteen+0.036
Neg logit contributions
 shield-0.066
 job-0.064
 badly-0.064
--0.061
 active-0.060
Token the
Token ID262
Feature activation+1.14e-01

Pos logit contributions
Those+0.073
ill+0.068
ives+0.063
of+0.061
Warning+0.061
Neg logit contributions
--0.040
 job-0.040
 badly-0.039
 healthy-0.036
 shield-0.035
Token big
Token ID1263
Feature activation+3.26e-02

Pos logit contributions
Those+0.253
These+0.215
ill+0.214
 are+0.209
Kids+0.207
Neg logit contributions
 badly-0.130
 possession-0.129
--0.128
 job-0.126
 active-0.123
Token boy
Token ID2933
Feature activation+3.17e-02

Pos logit contributions
Those+0.062
 are+0.054
ill+0.053
 end+0.051
These+0.050
Neg logit contributions
 active-0.046
 job-0.045
 badly-0.045
eh-0.044
 possession-0.044
Token will
Token ID481
Feature activation+2.52e-02

Pos logit contributions
Those+0.052
ill+0.045
of+0.041
These+0.039
 fourteen+0.039
Neg logit contributions
 badly-0.058
 shield-0.057
 job-0.056
--0.055
 possession-0.054
Token be
Token ID307
Feature activation+4.29e-02

Pos logit contributions
Those+0.047
ill+0.040
of+0.038
These+0.037
 fourteen+0.037
Neg logit contributions
 badly-0.042
 shield-0.041
 job-0.041
--0.039
 active-0.038
Token sorry
Token ID7926
Feature activation+1.74e-02

Pos logit contributions
Those+0.071
ill+0.067
 are+0.065
ila+0.062
 end+0.060
Neg logit contributions
 possession-0.067
 job-0.061
 shield-0.060
 Amen-0.059
ouse-0.058
Token and
Token ID290
Feature activation-2.33e-02

Pos logit contributions
Those+0.083
ill+0.075
of+0.071
 fourteen+0.069
These+0.069
Neg logit contributions
 shield-0.042
 badly-0.042
 job-0.041
 possession-0.039
--0.039
Token sorry
Token ID7926
Feature activation+6.94e-03

Pos logit contributions
 possession+0.029
 job+0.029
 shield+0.028
 active+0.028
-+0.027
Neg logit contributions
Those-0.048
 are-0.042
ill-0.040
of-0.040
ila-0.040
Token.
Token ID13
Feature activation+1.17e-02

Pos logit contributions
Those+0.014
ill+0.013
of+0.012
 fourteen+0.012
 are+0.012
Neg logit contributions
 shield-0.009
 job-0.009
 possession-0.009
 badly-0.009
 active-0.009
Token
Token ID198
Feature activation+7.68e-02

Pos logit contributions
Those+0.023
ill+0.022
of+0.021
oms+0.020
ion+0.019
Neg logit contributions
 badly-0.017
 job-0.016
 shield-0.015
--0.015
 possession-0.015
Token
Token ID198
Feature activation+6.98e-02

Pos logit contributions
Those+0.157
ill+0.151
oms+0.135
 fourteen+0.134
ount+0.133
Neg logit contributions
 badly-0.108
--0.102
 job-0.093
 loudly-0.091
 cozy-0.090
TokenThe
Token ID464
Feature activation+1.13e-01

Pos logit contributions
ill+0.107
Those+0.100
oms+0.098
ures+0.090
of+0.090
Neg logit contributions
 badly-0.143
--0.132
 job-0.127
 loudly-0.120
 active-0.119
Token old
Token ID1468
Feature activation+2.18e-02

Pos logit contributions
Those+0.271
ill+0.267
of+0.247
ount+0.236
 fourteen+0.236
Neg logit contributions
 shield-0.127
 job-0.124
 badly-0.119
--0.113
 active-0.110
Token man
Token ID582
Feature activation+3.25e-02

Pos logit contributions
Those+0.043
ill+0.036
of+0.035
Kids+0.035
These+0.034
Neg logit contributions
 badly-0.032
 shield-0.030
 job-0.029
 active-0.029
 possession-0.029
Token heard
Token ID2982
Feature activation+6.75e-02

Pos logit contributions
Those+0.065
ill+0.058
These+0.052
of+0.052
 fourteen+0.052
Neg logit contributions
 badly-0.049
--0.049
 shield-0.048
 job-0.045
 active-0.042
Token Tom
Token ID4186
Feature activation+4.70e-02

Pos logit contributions
Those+0.135
ill+0.119
These+0.110
ives+0.110
oms+0.109
Neg logit contributions
--0.101
 badly-0.100
 shield-0.092
 loudly-0.089
 job-0.089
Token.
Token ID13
Feature activation+4.57e-03

Pos logit contributions
Those+0.103
ill+0.092
of+0.087
 fourteen+0.086
These+0.085
Neg logit contributions
 badly-0.059
 shield-0.058
 job-0.056
--0.056
 possession-0.053
Token!"
Token ID2474
Feature activation+1.83e-02

Pos logit contributions
Those+0.081
of+0.074
ill+0.074
 are+0.072
 fourteen+0.071
Neg logit contributions
 possession-0.045
 shield-0.044
 armour-0.043
 armor-0.042
 active-0.042
Token Ben
Token ID3932
Feature activation+4.55e-02

Pos logit contributions
Those+0.027
ill+0.023
of+0.021
These+0.020
 fourteen+0.020
Neg logit contributions
 badly-0.036
 shield-0.036
 job-0.035
--0.034
 possession-0.034
Token cried
Token ID16896
Feature activation+4.55e-02

Pos logit contributions
Those+0.095
ill+0.085
of+0.081
 fourteen+0.077
Kids+0.077
Neg logit contributions
 shield-0.061
 job-0.061
 badly-0.060
 possession-0.059
 active-0.057
Token.
Token ID13
Feature activation+1.20e-02

Pos logit contributions
Those+0.100
ill+0.091
of+0.085
 fourteen+0.082
These+0.081
Neg logit contributions
 shield-0.055
 badly-0.054
 job-0.054
 possession-0.053
 active-0.051
Token "
Token ID366
Feature activation+6.04e-02

Pos logit contributions
Those+0.025
ill+0.022
of+0.021
These+0.020
 fourteen+0.020
Neg logit contributions
 badly-0.017
 shield-0.017
 job-0.016
--0.016
 possession-0.015
TokenThe
Token ID464
Feature activation+1.13e-01

Pos logit contributions
Those+0.074
ill+0.065
of+0.059
ion+0.055
oms+0.055
Neg logit contributions
 badly-0.129
 shield-0.127
 job-0.124
--0.122
 active-0.121
Token ducks
Token ID39694
Feature activation+7.27e-02

Pos logit contributions
Those+0.233
ill+0.221
of+0.208
Kids+0.196
ion+0.195
Neg logit contributions
 job-0.154
 shield-0.154
 badly-0.152
--0.141
 active-0.139
Token bit
Token ID1643
Feature activation+2.88e-02

Pos logit contributions
Those+0.110
ill+0.094
of+0.086
These+0.083
 fourteen+0.082
Neg logit contributions
 badly-0.139
 shield-0.138
 job-0.136
--0.133
 possession-0.131
Token me
Token ID502
Feature activation+5.95e-02

Pos logit contributions
Those+0.057
ill+0.050
of+0.048
 fourteen+0.046
These+0.045
Neg logit contributions
 shield-0.042
 badly-0.042
 job-0.041
 possession-0.040
--0.040
Token!
Token ID0
Feature activation+1.34e-02

Pos logit contributions
Those+0.142
ill+0.137
of+0.131
 fourteen+0.124
 are+0.124
Neg logit contributions
 shield-0.060
 possession-0.057
 active-0.054
 job-0.054
 badly-0.053
Token They
Token ID1119
Feature activation+2.49e-02

Pos logit contributions
Those+0.024
ill+0.022
of+0.020
 fourteen+0.019
These+0.019
Neg logit contributions
 badly-0.021
 shield-0.021
 job-0.021
--0.020
 possession-0.020
Token it
Token ID340
Feature activation+3.87e-02

Pos logit contributions
Those+0.067
ill+0.060
of+0.057
These+0.056
 fourteen+0.056
Neg logit contributions
 badly-0.030
 shield-0.029
--0.028
 job-0.028
 possession-0.026
Token goodbye
Token ID24829
Feature activation+1.63e-02

Pos logit contributions
Those+0.079
ill+0.073
of+0.071
 are+0.066
 fourteen+0.066
Neg logit contributions
 badly-0.047
 shield-0.047
 possession-0.046
 job-0.045
 Amen-0.044
Token.
Token ID13
Feature activation+8.37e-03

Pos logit contributions
ill+0.024
Those+0.024
of+0.023
 fourteen+0.021
 are+0.020
Neg logit contributions
 shield-0.028
 Amen-0.026
 appetite-0.025
 cowboy-0.025
eh-0.025
Token
Token ID198
Feature activation+7.85e-02

Pos logit contributions
Those+0.015
ill+0.013
of+0.012
 fourteen+0.012
These+0.012
Neg logit contributions
 badly-0.013
 shield-0.013
 possession-0.013
 job-0.013
--0.012
Token
Token ID198
Feature activation+7.10e-02

Pos logit contributions
Those+0.146
ill+0.131
 fourteen+0.127
 thirteen+0.121
oms+0.121
Neg logit contributions
 badly-0.117
 happily-0.111
--0.109
 active-0.106
 shield-0.105
TokenThe
Token ID464
Feature activation+1.11e-01

Pos logit contributions
ill+0.095
Those+0.093
oes+0.091
oms+0.091
ount+0.089
Neg logit contributions
 badly-0.135
 active-0.131
 shield-0.120
 healthy-0.119
--0.119
Token cup
Token ID6508
Feature activation+2.10e-02

Pos logit contributions
Those+0.266
ill+0.260
ures+0.242
of+0.242
 fourteen+0.239
Neg logit contributions
 job-0.121
--0.116
 badly-0.111
 shield-0.109
 cozy-0.104
Token was
Token ID373
Feature activation+2.90e-04

Pos logit contributions
Those+0.039
ill+0.034
of+0.032
These+0.030
 fourteen+0.030
Neg logit contributions
 badly-0.033
 shield-0.033
 job-0.033
--0.032
 possession-0.031
Token happy
Token ID3772
Feature activation-1.73e-02

Pos logit contributions
Those+0.001
ill+0.001
of+0.001
 fourteen+0.001
ion+0.001
Neg logit contributions
 shield-0.000
 job-0.000
 possession-0.000
 badly-0.000
--0.000
Token and
Token ID290
Feature activation-3.10e-02

Pos logit contributions
 shield+0.025
 job+0.024
 badly+0.024
 active+0.023
 possession+0.023
Neg logit contributions
Those-0.034
ill-0.031
of-0.029
 fourteen-0.028
ion-0.027
Token relieved
Token ID22598
Feature activation+5.97e-03

Pos logit contributions
 badly+0.034
 shield+0.034
 job+0.031
 possession+0.031
-+0.030
Neg logit contributions
Those-0.071
ill-0.063
of-0.062
 are-0.059
Kids-0.059
Token and
Token ID290
Feature activation-6.29e-03

Pos logit contributions
Those+0.087
ill+0.078
of+0.074
These+0.072
 fourteen+0.072
Neg logit contributions
 badly-0.044
 shield-0.044
 job-0.042
--0.041
 possession-0.040
Token explain
Token ID4727
Feature activation+1.49e-02

Pos logit contributions
 badly+0.012
 shield+0.012
 job+0.011
-+0.011
 possession+0.011
Neg logit contributions
Those-0.010
ill-0.009
of-0.008
 fourteen-0.008
These-0.008
Token.
Token ID13
Feature activation+1.04e-02

Pos logit contributions
Those+0.034
ill+0.031
of+0.029
These+0.028
 fourteen+0.028
Neg logit contributions
 badly-0.018
 shield-0.017
--0.017
 job-0.017
 possession-0.016
Token
Token ID198
Feature activation+7.54e-02

Pos logit contributions
Those+0.019
ill+0.017
of+0.017
ion+0.016
 fourteen+0.016
Neg logit contributions
 badly-0.015
 shield-0.015
 job-0.014
 hungry-0.014
 active-0.014
Token
Token ID198
Feature activation+6.99e-02

Pos logit contributions
Those+0.158
ill+0.144
 thirteen+0.136
ount+0.135
 fourteen+0.135
Neg logit contributions
 badly-0.093
--0.092
 cozy-0.087
 healthy-0.086
 shield-0.085
TokenThe
Token ID464
Feature activation+1.10e-01

Pos logit contributions
ount+0.095
Those+0.092
ill+0.090
ures+0.086
oms+0.085
Neg logit contributions
 healthy-0.126
 badly-0.125
 shield-0.123
 active-0.121
--0.120
Token fisherman
Token ID48750
Feature activation+3.93e-02

Pos logit contributions
Those+0.242
ill+0.216
of+0.205
These+0.199
 fourteen+0.199
Neg logit contributions
 badly-0.138
 shield-0.135
 job-0.132
 possession-0.129
--0.128
Token comes
Token ID2058
Feature activation+5.94e-02

Pos logit contributions
Those+0.066
ill+0.055
These+0.051
Kids+0.049
of+0.049
Neg logit contributions
 shield-0.071
--0.071
 badly-0.069
 job-0.069
 happily-0.064
Token to
Token ID284
Feature activation+2.23e-02

Pos logit contributions
Those+0.117
ill+0.109
of+0.101
oms+0.094
 are+0.093
Neg logit contributions
 shield-0.084
 job-0.083
 possession-0.083
 badly-0.082
 appetite-0.078
Token them
Token ID606
Feature activation+6.95e-02

Pos logit contributions
Those+0.046
ill+0.039
ives+0.038
ures+0.038
oms+0.038
Neg logit contributions
 shield-0.032
 job-0.030
 healthy-0.029
--0.029
 happily-0.028
Token.
Token ID13
Feature activation+1.22e-03

Pos logit contributions
Those+0.153
ill+0.138
of+0.131
These+0.126
 fourteen+0.126
Neg logit contributions
 shield-0.085
 badly-0.085
 job-0.083
 possession-0.080
--0.078
Token were
Token ID547
Feature activation+2.41e-02

Pos logit contributions
Those+0.062
ill+0.053
of+0.049
 fourteen+0.047
These+0.047
Neg logit contributions
 badly-0.076
 shield-0.076
 job-0.074
--0.073
 possession-0.072
Token bad
Token ID2089
Feature activation+3.28e-03

Pos logit contributions
Those+0.038
 are+0.038
 fourteen+0.036
 thirteen+0.034
ila+0.031
Neg logit contributions
 possession-0.043
ouse-0.039
pron-0.037
 Amen-0.035
 shield-0.035
Token!"
Token ID2474
Feature activation+1.61e-02

Pos logit contributions
Those+0.005
ill+0.005
 are+0.005
oms+0.004
 fourteen+0.004
Neg logit contributions
 possession-0.006
pron-0.006
 badly-0.006
 active-0.005
 shield-0.005
Token
Token ID198
Feature activation+7.28e-02

Pos logit contributions
Those+0.030
ill+0.026
These+0.024
of+0.024
Is+0.024
Neg logit contributions
 badly-0.025
--0.024
 shield-0.024
 loudly-0.023
 happily-0.023
Token
Token ID198
Feature activation+7.01e-02

Pos logit contributions
Those+0.130
ill+0.116
 fourteen+0.106
These+0.105
of+0.102
Neg logit contributions
 badly-0.121
--0.120
 shield-0.112
 job-0.107
 loudly-0.106
TokenThe
Token ID464
Feature activation+1.10e-01

Pos logit contributions
Those+0.129
ill+0.124
ount+0.117
of+0.112
oms+0.110
Neg logit contributions
--0.104
 badly-0.100
 shield-0.092
 loudly-0.091
 job-0.090
Token man
Token ID582
Feature activation+3.94e-02

Pos logit contributions
Those+0.300
ill+0.288
of+0.265
ion+0.260
These+0.259
Neg logit contributions
 shield-0.094
--0.087
 job-0.086
 helmet-0.075
 active-0.070
Token and
Token ID290
Feature activation+4.35e-03

Pos logit contributions
Those+0.082
ill+0.071
These+0.068
 fourteen+0.066
of+0.066
Neg logit contributions
--0.058
 badly-0.055
 shield-0.055
 happily-0.049
 loudly-0.048
Token the
Token ID262
Feature activation+9.16e-02

Pos logit contributions
Those+0.008
ives+0.007
These+0.007
ures+0.007
ill+0.007
Neg logit contributions
 shield-0.007
--0.006
 happily-0.006
 job-0.006
 badly-0.006
Token woman
Token ID2415
Feature activation+3.66e-02

Pos logit contributions
Those+0.217
ill+0.197
of+0.185
ion+0.180
 fourteen+0.180
Neg logit contributions
 shield-0.110
 job-0.100
--0.097
 badly-0.096
 active-0.091
Token smiled
Token ID13541
Feature activation+2.28e-02

Pos logit contributions
Those+0.076
ill+0.065
These+0.062
 fourteen+0.062
Warning+0.062
Neg logit contributions
--0.052
 shield-0.051
 badly-0.050
 happily-0.048
 loudly-0.044
Token bite
Token ID13197
Feature activation+5.06e-02

Pos logit contributions
Those+0.040
ill+0.034
ion+0.034
Kids+0.033
Is+0.033
Neg logit contributions
 badly-0.031
 shield-0.029
 job-0.029
--0.029
 active-0.027
Token you
Token ID345
Feature activation+3.25e-02

Pos logit contributions
Those+0.109
ill+0.098
of+0.092
Kids+0.090
These+0.090
Neg logit contributions
 badly-0.066
 shield-0.064
 job-0.062
--0.062
 possession-0.059
Token".
Token ID1911
Feature activation+8.94e-03

Pos logit contributions
Those+0.078
ill+0.070
of+0.067
Kids+0.066
These+0.066
Neg logit contributions
 badly-0.035
 shield-0.033
 job-0.032
--0.032
 active-0.030
Token
Token ID198
Feature activation+7.31e-02

Pos logit contributions
Those+0.015
ill+0.013
Kids+0.012
of+0.012
 fourteen+0.012
Neg logit contributions
 badly-0.015
 shield-0.015
 job-0.015
--0.015
 possession-0.014
Token
Token ID198
Feature activation+6.89e-02

Pos logit contributions
Those+0.148
ill+0.133
 fourteen+0.126
ount+0.123
 thirteen+0.120
Neg logit contributions
 badly-0.105
--0.101
 shield-0.098
 happily-0.095
 Amen-0.091
TokenThe
Token ID464
Feature activation+1.10e-01

Pos logit contributions
Those+0.098
ill+0.095
ount+0.090
oms+0.088
 fourteen+0.084
Neg logit contributions
 badly-0.138
 shield-0.128
--0.126
 active-0.124
 healthy-0.118
Token mouse
Token ID10211
Feature activation+3.30e-02

Pos logit contributions
Those+0.237
ill+0.206
These+0.195
of+0.194
 fourteen+0.191
Neg logit contributions
 badly-0.143
 possession-0.136
 job-0.134
 shield-0.134
--0.131
Token said
Token ID531
Feature activation+2.23e-02

Pos logit contributions
Those+0.057
ill+0.050
of+0.048
 fourteen+0.044
Kids+0.044
Neg logit contributions
 badly-0.055
 job-0.055
 shield-0.054
 possession-0.054
 active-0.052
Token,
Token ID11
Feature activation+2.21e-02

Pos logit contributions
Those+0.037
ill+0.033
 fourteen+0.032
of+0.032
ion+0.032
Neg logit contributions
 possession-0.036
 active-0.034
 shield-0.034
 job-0.033
 armour-0.032
Token "
Token ID366
Feature activation+7.12e-02

Pos logit contributions
Those+0.044
ill+0.040
of+0.037
These+0.036
 fourteen+0.036
Neg logit contributions
 badly-0.031
 shield-0.031
 job-0.031
--0.030
 possession-0.029
TokenIt
Token ID1026
Feature activation+4.72e-02

Pos logit contributions
Those+0.104
ill+0.097
oms+0.092
ount+0.088
 fourteen+0.087
Neg logit contributions
 badly-0.131
 active-0.128
 possession-0.126
 shield-0.125
--0.123
Token with
Token ID351
Feature activation+6.05e-02

Pos logit contributions
Those+0.044
 are+0.039
ill+0.038
 fourteen+0.036
 end+0.036
Neg logit contributions
 possession-0.037
pron-0.035
 shield-0.035
ouse-0.035
 active-0.034
Token you
Token ID345
Feature activation+3.43e-02

Pos logit contributions
Those+0.116
ill+0.102
of+0.096
 fourteen+0.092
These+0.092
Neg logit contributions
 badly-0.092
 shield-0.092
 job-0.090
--0.087
 possession-0.087
Token!"
Token ID2474
Feature activation+1.50e-02

Pos logit contributions
Those+0.081
ill+0.078
of+0.073
 are+0.072
 fourteen+0.071
Neg logit contributions
 possession-0.035
 shield-0.034
 job-0.033
 active-0.032
 Amen-0.031
Token
Token ID198
Feature activation+7.24e-02

Pos logit contributions
Those+0.023
ill+0.020
of+0.018
These+0.017
 fourteen+0.017
Neg logit contributions
 badly-0.029
 shield-0.028
--0.028
 job-0.028
 possession-0.027
Token
Token ID198
Feature activation+6.64e-02

Pos logit contributions
Those+0.129
ill+0.120
 fourteen+0.108
These+0.103
of+0.102
Neg logit contributions
 badly-0.121
--0.119
 shield-0.113
 job-0.109
 cozy-0.108
TokenThe
Token ID464
Feature activation+1.10e-01

Pos logit contributions
Those+0.092
ill+0.091
of+0.075
 fourteen+0.073
ount+0.073
Neg logit contributions
 badly-0.134
--0.131
 active-0.127
 job-0.125
 shield-0.124
Token big
Token ID1263
Feature activation+9.65e-03

Pos logit contributions
Those+0.272
ill+0.268
of+0.249
ount+0.243
ures+0.242
Neg logit contributions
 shield-0.109
 job-0.103
 badly-0.102
--0.093
 active-0.091
Token dog
Token ID3290
Feature activation+1.97e-02

Pos logit contributions
Those+0.022
ill+0.020
of+0.019
 fourteen+0.019
These+0.019
Neg logit contributions
 badly-0.011
 shield-0.011
 job-0.010
--0.010
 possession-0.010
Token does
Token ID857
Feature activation+8.14e-03

Pos logit contributions
Those+0.033
ill+0.030
Kids+0.027
of+0.026
These+0.026
Neg logit contributions
 shield-0.034
 badly-0.034
--0.033
 job-0.032
 loudly-0.031
Token not
Token ID407
Feature activation+2.45e-02

Pos logit contributions
Those+0.015
ill+0.013
 fourteen+0.013
of+0.013
Is+0.013
Neg logit contributions
 shield-0.012
 possession-0.012
 job-0.011
 active-0.011
 armour-0.011
Token understand
Token ID1833
Feature activation+3.99e-02

Pos logit contributions
Those+0.039
ill+0.034
of+0.032
 fourteen+0.030
These+0.030
Neg logit contributions
 badly-0.045
 shield-0.044
 job-0.044
 possession-0.043
--0.042
Token runs
Token ID4539
Feature activation+4.23e-02

Pos logit contributions
Those+0.013
of+0.010
 are+0.010
ill+0.010
 fourteen+0.010
Neg logit contributions
 job-0.018
 possession-0.018
 badly-0.018
 shield-0.017
 active-0.017
Token away
Token ID1497
Feature activation+2.49e-02

Pos logit contributions
Those+0.085
ill+0.076
of+0.074
ion+0.069
 are+0.069
Neg logit contributions
 shield-0.059
 possession-0.056
 job-0.055
 badly-0.055
 active-0.055
Token.
Token ID13
Feature activation+1.50e-02

Pos logit contributions
Those+0.048
ill+0.044
of+0.040
 fourteen+0.040
ila+0.039
Neg logit contributions
 shield-0.037
 possession-0.034
 job-0.034
 active-0.034
 badly-0.033
Token
Token ID198
Feature activation+8.55e-02

Pos logit contributions
Those+0.027
ill+0.024
of+0.023
These+0.022
 fourteen+0.021
Neg logit contributions
 badly-0.025
 shield-0.024
 job-0.023
--0.023
 active-0.022
Token
Token ID198
Feature activation+7.88e-02

Pos logit contributions
Those+0.176
ill+0.167
 fourteen+0.156
 thirteen+0.151
ount+0.149
Neg logit contributions
 badly-0.115
--0.107
 cozy-0.102
 healthy-0.098
 Twe-0.095
TokenThe
Token ID464
Feature activation+1.10e-01

Pos logit contributions
Those+0.130
ill+0.130
oms+0.111
 fourteen+0.110
ures+0.110
Neg logit contributions
 badly-0.136
--0.132
 shield-0.124
 active-0.123
 job-0.120
Token dog
Token ID3290
Feature activation+2.72e-02

Pos logit contributions
Those+0.270
ill+0.259
of+0.239
 fourteen+0.231
ount+0.228
Neg logit contributions
 shield-0.116
 job-0.113
 badly-0.109
--0.106
 active-0.097
Token w
Token ID266
Feature activation+2.70e-02

Pos logit contributions
Those+0.039
ill+0.033
of+0.031
 fourteen+0.029
These+0.029
Neg logit contributions
 badly-0.054
 shield-0.054
 job-0.053
 possession-0.052
--0.051
Tokenags
Token ID3775
Feature activation+2.32e-02

Pos logit contributions
Those+0.058
ill+0.048
Is+0.048
These+0.047
of+0.044
Neg logit contributions
 badly-0.040
--0.039
 job-0.036
 shield-0.036
 loudly-0.033
Token its
Token ID663
Feature activation+2.46e-02

Pos logit contributions
Those+0.049
ill+0.048
of+0.044
ila+0.043
Is+0.042
Neg logit contributions
 job-0.028
 shield-0.027
 active-0.027
 badly-0.026
 appetite-0.026
Token tail
Token ID7894
Feature activation+3.14e-02

Pos logit contributions
 end+0.048
Those+0.047
ill+0.044
 fourteen+0.043
 are+0.042
Neg logit contributions
 Amen-0.027
 badly-0.024
 possession-0.023
pron-0.023
 active-0.023
Token,
Token ID11
Feature activation+4.91e-02

Pos logit contributions
Those+0.105
ill+0.095
 fourteen+0.094
 are+0.091
of+0.091
Neg logit contributions
 shield-0.068
 possession-0.066
 active-0.064
 armour-0.062
 job-0.061
Token stop
Token ID2245
Feature activation+5.12e-02

Pos logit contributions
Those+0.079
ill+0.068
of+0.063
 are+0.062
 fourteen+0.061
Neg logit contributions
 possession-0.088
 badly-0.088
 shield-0.088
 job-0.087
 active-0.087
Token,
Token ID11
Feature activation+4.85e-02

Pos logit contributions
Those+0.109
ill+0.099
 fourteen+0.098
 are+0.094
of+0.094
Neg logit contributions
 shield-0.065
 possession-0.063
 active-0.060
ouse-0.059
 armour-0.059
Token stop
Token ID2245
Feature activation+5.10e-02

Pos logit contributions
Those+0.075
ill+0.065
of+0.060
 are+0.060
 fourteen+0.058
Neg logit contributions
 possession-0.090
 badly-0.089
 shield-0.089
 active-0.088
 job-0.088
Token!
Token ID0
Feature activation+1.61e-02

Pos logit contributions
Those+0.111
ill+0.102
 fourteen+0.100
 are+0.097
of+0.096
Neg logit contributions
 shield-0.062
 possession-0.060
 active-0.057
 armour-0.055
ouse-0.055
Token The
Token ID383
Feature activation+1.10e-01

Pos logit contributions
Those+0.027
ill+0.024
of+0.023
These+0.022
 fourteen+0.021
Neg logit contributions
 badly-0.027
 job-0.027
 shield-0.027
--0.026
 possession-0.025
Token big
Token ID1263
Feature activation+1.23e-02

Pos logit contributions
Those+0.277
ill+0.274
of+0.252
ion+0.238
These+0.235
Neg logit contributions
 job-0.112
 shield-0.109
--0.102
 badly-0.093
 cozy-0.086
Token train
Token ID4512
Feature activation+3.74e-02

Pos logit contributions
Those+0.025
ill+0.022
of+0.021
 fourteen+0.020
These+0.020
Neg logit contributions
 badly-0.018
 shield-0.017
 job-0.017
--0.017
 possession-0.016
Token will
Token ID481
Feature activation+1.50e-02

Pos logit contributions
Those+0.064
ill+0.056
of+0.055
 are+0.052
 fourteen+0.052
Neg logit contributions
 shield-0.061
 possession-0.061
 badly-0.061
 active-0.059
 job-0.059
Token hit
Token ID2277
Feature activation+7.95e-02

Pos logit contributions
Those+0.027
ill+0.024
of+0.022
These+0.021
Kids+0.021
Neg logit contributions
 badly-0.025
 shield-0.025
 job-0.025
--0.024
 active-0.023
Token us
Token ID514
Feature activation+5.10e-02

Pos logit contributions
Those+0.151
ill+0.134
of+0.126
 fourteen+0.120
These+0.120
Neg logit contributions
 shield-0.121
 badly-0.121
 job-0.118
 possession-0.115
--0.114
Token says
Token ID1139
Feature activation+2.84e-02

Pos logit contributions
 badly+0.027
 shield+0.026
 job+0.026
 possession+0.025
-+0.025
Neg logit contributions
Those-0.021
ill-0.018
of-0.016
 fourteen-0.016
These-0.015
Token sorry
Token ID7926
Feature activation+1.33e-02

Pos logit contributions
Those+0.065
ill+0.058
of+0.056
 fourteen+0.054
These+0.053
Neg logit contributions
 shield-0.033
 badly-0.032
 job-0.032
 possession-0.030
--0.030
Token.
Token ID13
Feature activation+1.41e-02

Pos logit contributions
Those+0.029
ill+0.026
of+0.025
 fourteen+0.024
 are+0.023
Neg logit contributions
 shield-0.017
 badly-0.017
 job-0.016
 possession-0.016
 active-0.016
Token
Token ID198
Feature activation+7.92e-02

Pos logit contributions
Those+0.024
ill+0.021
of+0.020
These+0.019
ion+0.019
Neg logit contributions
 shield-0.024
 badly-0.024
--0.023
 job-0.023
 active-0.022
Token
Token ID198
Feature activation+7.33e-02

Pos logit contributions
Those+0.160
ill+0.141
 fourteen+0.139
 thirteen+0.137
These+0.135
Neg logit contributions
--0.108
 badly-0.096
 happily-0.092
 loudly-0.091
 cozy-0.090
TokenThe
Token ID464
Feature activation+1.10e-01

Pos logit contributions
Those+0.153
ill+0.146
ount+0.142
oms+0.139
ures+0.135
Neg logit contributions
--0.088
 active-0.080
 badly-0.080
 shield-0.077
 job-0.074
Token bird
Token ID6512
Feature activation+5.66e-02

Pos logit contributions
Those+0.243
ill+0.218
of+0.207
 fourteen+0.200
These+0.200
Neg logit contributions
 badly-0.134
 shield-0.134
 job-0.130
--0.125
 possession-0.124
Token is
Token ID318
Feature activation+1.74e-02

Pos logit contributions
Those+0.090
ill+0.078
of+0.071
These+0.068
Kids+0.067
Neg logit contributions
 shield-0.104
 badly-0.104
 job-0.101
--0.101
 possession-0.098
Token scared
Token ID12008
Feature activation+1.43e-02

Pos logit contributions
Those+0.039
ill+0.035
 fourteen+0.032
These+0.032
of+0.032
Neg logit contributions
 job-0.021
 possession-0.020
 shield-0.019
--0.018
 badly-0.018
Token and
Token ID290
Feature activation-2.69e-02

Pos logit contributions
Those+0.035
ill+0.031
These+0.030
Warning+0.030
 fourteen+0.030
Neg logit contributions
--0.016
 badly-0.016
 shield-0.014
 loudly-0.013
 active-0.012
Token hurt
Token ID5938
Feature activation+4.29e-02

Pos logit contributions
 possession+0.044
 job+0.042
ouse+0.039
 badly+0.039
pron+0.038
Neg logit contributions
Those-0.048
 are-0.043
oms-0.040
ila-0.039
of-0.039
Token hurt
Token ID5938
Feature activation+3.60e-02

Pos logit contributions
Those+0.050
 are+0.044
 fourteen+0.042
ill+0.042
oms+0.041
Neg logit contributions
 possession-0.035
 job-0.032
 Amen-0.032
 shield-0.031
ouse-0.030
Token.
Token ID13
Feature activation+4.86e-03

Pos logit contributions
Those+0.087
ill+0.078
of+0.076
These+0.074
 fourteen+0.074
Neg logit contributions
 badly-0.040
--0.036
 shield-0.035
 job-0.035
 possession-0.033
Token They
Token ID1119
Feature activation+2.09e-02

Pos logit contributions
of+0.010
Those+0.010
ill+0.009
oms+0.009
ion+0.009
Neg logit contributions
 badly-0.006
 job-0.006
 Twe-0.005
--0.005
 hungry-0.005
Token cried
Token ID16896
Feature activation+5.03e-02

Pos logit contributions
Those+0.042
ill+0.037
of+0.035
These+0.034
 fourteen+0.034
Neg logit contributions
 shield-0.029
 badly-0.029
 job-0.029
 possession-0.028
--0.028
Token.
Token ID13
Feature activation+1.83e-02

Pos logit contributions
Those+0.104
ill+0.094
of+0.089
 are+0.086
 fourteen+0.084
Neg logit contributions
 possession-0.064
 shield-0.064
 job-0.063
 badly-0.062
 active-0.060
Token The
Token ID383
Feature activation+1.09e-01

Pos logit contributions
of+0.037
ill+0.036
Those+0.036
oms+0.034
ion+0.034
Neg logit contributions
 job-0.023
 badly-0.022
 Twe-0.020
 loudly-0.020
--0.019
Token truck
Token ID7779
Feature activation+3.86e-02

Pos logit contributions
Those+0.233
ill+0.219
of+0.205
Kids+0.190
 fourteen+0.190
Neg logit contributions
 shield-0.149
 job-0.149
 badly-0.146
--0.137
 active-0.133
Token driver
Token ID4639
Feature activation+4.74e-02

Pos logit contributions
Those+0.075
ill+0.067
of+0.062
These+0.060
Kids+0.059
Neg logit contributions
 badly-0.059
 shield-0.058
--0.057
 job-0.056
 possession-0.053
Token stopped
Token ID5025
Feature activation+5.58e-02

Pos logit contributions
Those+0.101
ill+0.088
These+0.084
 fourteen+0.080
ures+0.080
Neg logit contributions
--0.072
 badly-0.068
 shield-0.065
 loudly-0.060
 job-0.059
Token and
Token ID290
Feature activation-1.61e-02

Pos logit contributions
Those+0.124
ill+0.112
of+0.106
These+0.103
 fourteen+0.103
Neg logit contributions
 badly-0.070
 shield-0.068
--0.067
 job-0.066
 possession-0.062
Token called
Token ID1444
Feature activation+5.78e-02

Pos logit contributions
 shield+0.024
 badly+0.024
 job+0.023
-+0.023
 possession+0.022
Neg logit contributions
Those-0.031
ill-0.028
of-0.026
These-0.025
 fourteen-0.025
Token Lily
Token ID20037
Feature activation+5.76e-02

Pos logit contributions
Those+0.034
ill+0.029
of+0.028
These+0.027
 fourteen+0.026
Neg logit contributions
 badly-0.030
 shield-0.030
 job-0.029
--0.029
 possession-0.028
Token cried
Token ID16896
Feature activation+4.39e-02

Pos logit contributions
Those+0.103
ill+0.091
of+0.085
 are+0.080
 fourteen+0.079
Neg logit contributions
 shield-0.096
 job-0.096
 possession-0.094
 badly-0.093
 active-0.090
Token.
Token ID13
Feature activation+1.44e-02

Pos logit contributions
Those+0.096
ill+0.088
of+0.083
 fourteen+0.079
These+0.079
Neg logit contributions
 shield-0.053
 badly-0.053
 job-0.052
 possession-0.051
 active-0.049
Token
Token ID198
Feature activation+6.74e-02

Pos logit contributions
Those+0.028
of+0.026
ill+0.025
These+0.024
ount+0.023
Neg logit contributions
 badly-0.022
 loudly-0.020
 shield-0.019
--0.019
 happily-0.019
Token
Token ID198
Feature activation+6.50e-02

Pos logit contributions
Those+0.124
ill+0.111
 fourteen+0.104
These+0.100
ount+0.099
Neg logit contributions
 badly-0.111
--0.105
 shield-0.102
 cozy-0.095
 loudly-0.095
TokenThe
Token ID464
Feature activation+1.09e-01

Pos logit contributions
Those+0.093
ill+0.085
ount+0.076
of+0.075
 fourteen+0.073
Neg logit contributions
 badly-0.129
 shield-0.126
--0.124
 active-0.121
 job-0.119
Token bug
Token ID5434
Feature activation+3.73e-02

Pos logit contributions
Those+0.240
ill+0.214
of+0.202
These+0.197
Kids+0.195
Neg logit contributions
 badly-0.136
 shield-0.131
 job-0.129
 possession-0.126
--0.126
Token was
Token ID373
Feature activation+1.03e-02

Pos logit contributions
Those+0.061
ill+0.054
of+0.051
Is+0.049
 fourteen+0.048
Neg logit contributions
 job-0.065
 possession-0.064
 badly-0.063
 shield-0.062
 active-0.061
Token not
Token ID407
Feature activation+1.74e-02

Pos logit contributions
Those+0.021
ill+0.019
 fourteen+0.018
ila+0.017
ion+0.017
Neg logit contributions
 job-0.014
 possession-0.014
 shield-0.013
--0.012
 active-0.012
Token modest
Token ID12949
Feature activation+7.72e-04

Pos logit contributions
Those+0.034
ill+0.033
ila+0.031
 are+0.031
m+0.028
Neg logit contributions
 possession-0.023
 Amen-0.022
 job-0.021
 shield-0.020
 badly-0.020
Token.
Token ID13
Feature activation+3.13e-03

Pos logit contributions
Those+0.002
ill+0.001
of+0.001
These+0.001
 fourteen+0.001
Neg logit contributions
 badly-0.001
 shield-0.001
 job-0.001
--0.001
 possession-0.001
Token and
Token ID290
Feature activation-1.59e-02

Pos logit contributions
Those+0.026
ill+0.024
of+0.023
These+0.023
 fourteen+0.023
Neg logit contributions
 badly-0.008
--0.008
 job-0.008
 shield-0.008
 loudly-0.006
Token mad
Token ID8805
Feature activation-1.47e-02

Pos logit contributions
 shield+0.016
 badly+0.016
 job+0.015
 possession+0.015
-+0.014
Neg logit contributions
Those-0.039
ill-0.035
of-0.033
These-0.033
Kids-0.032
Token.
Token ID13
Feature activation+1.20e-02

Pos logit contributions
 badly+0.014
-+0.013
 shield+0.013
 job+0.013
 loudly+0.011
Neg logit contributions
Those-0.037
ill-0.034
of-0.033
These-0.032
Kids-0.031
Token They
Token ID1119
Feature activation+2.62e-02

Pos logit contributions
of+0.024
ill+0.023
Those+0.023
ion+0.022
 fourteen+0.022
Neg logit contributions
 hungry-0.014
 badly-0.014
 job-0.013
 Amen-0.013
--0.013
Token call
Token ID869
Feature activation+8.28e-02

Pos logit contributions
Those+0.040
ill+0.036
of+0.031
These+0.030
 fourteen+0.030
Neg logit contributions
 shield-0.050
 badly-0.050
 job-0.050
--0.048
 active-0.047
Token the
Token ID262
Feature activation+1.09e-01

Pos logit contributions
Those+0.177
ill+0.158
Is+0.147
These+0.146
 fourteen+0.144
Neg logit contributions
--0.119
 shield-0.116
 job-0.115
 badly-0.111
 loudly-0.106
Token police
Token ID1644
Feature activation+4.53e-02

Pos logit contributions
Those+0.227
ill+0.216
of+0.206
ion+0.185
These+0.184
Neg logit contributions
 shield-0.158
 job-0.154
--0.142
 active-0.136
 badly-0.136
Token.
Token ID13
Feature activation+1.64e-02

Pos logit contributions
Those+0.116
ill+0.104
These+0.101
Kids+0.099
Is+0.099
Neg logit contributions
--0.044
 shield-0.043
 job-0.038
 badly-0.036
 helmet-0.032
Token They
Token ID1119
Feature activation+2.97e-02

Pos logit contributions
of+0.034
Those+0.034
ill+0.032
ion+0.031
 fourteen+0.030
Neg logit contributions
 job-0.019
 badly-0.019
 hungry-0.019
 loudly-0.018
 Amen-0.018
Token hope
Token ID2911
Feature activation+3.01e-02

Pos logit contributions
Those+0.050
ill+0.044
of+0.039
These+0.039
Is+0.038
Neg logit contributions
 shield-0.054
 badly-0.053
 job-0.052
--0.050
 active-0.049
Token the
Token ID262
Feature activation+1.11e-01

Pos logit contributions
Those+0.068
ill+0.060
of+0.057
Is+0.057
These+0.055
Neg logit contributions
--0.040
 job-0.038
 shield-0.037
 healthy-0.036
 badly-0.035
Token to
Token ID284
Feature activation+3.95e-02

Pos logit contributions
Those+0.046
ill+0.042
These+0.040
of+0.039
 fourteen+0.039
Neg logit contributions
--0.016
 job-0.016
 badly-0.015
 shield-0.014
 active-0.013
Token Tom
Token ID4186
Feature activation+6.82e-02

Pos logit contributions
Those+0.076
ill+0.070
oms+0.063
ives+0.062
ures+0.061
Neg logit contributions
 job-0.061
 badly-0.061
 shield-0.060
--0.060
 possession-0.056
Token and
Token ID290
Feature activation+4.53e-03

Pos logit contributions
Those+0.158
ill+0.142
of+0.135
These+0.131
Kids+0.131
Neg logit contributions
 badly-0.077
 shield-0.075
 job-0.074
--0.073
 possession-0.069
Token Lily
Token ID20037
Feature activation+7.38e-02

Pos logit contributions
Those+0.010
ill+0.009
of+0.008
ures+0.008
These+0.008
Neg logit contributions
 job-0.006
 shield-0.006
 badly-0.006
 active-0.006
--0.006
Token."
Token ID526
Feature activation+1.92e-02

Pos logit contributions
Those+0.178
ill+0.161
of+0.154
 fourteen+0.149
Kids+0.149
Neg logit contributions
 badly-0.076
 shield-0.074
 job-0.073
--0.071
 possession-0.068
Token The
Token ID383
Feature activation+1.09e-01

Pos logit contributions
Those+0.037
of+0.033
ill+0.032
ion+0.031
ures+0.030
Neg logit contributions
 badly-0.029
 job-0.027
--0.026
 shield-0.025
 hungry-0.025
Token big
Token ID1263
Feature activation+1.62e-02

Pos logit contributions
Those+0.270
ill+0.264
of+0.254
ion+0.252
ures+0.252
Neg logit contributions
 shield-0.102
 job-0.098
 badly-0.094
 active-0.086
--0.086
Token boy
Token ID2933
Feature activation+1.91e-02

Pos logit contributions
Those+0.035
ill+0.032
of+0.030
These+0.029
 fourteen+0.029
Neg logit contributions
 badly-0.020
 shield-0.020
 job-0.020
 possession-0.019
--0.019
Token looked
Token ID3114
Feature activation-4.60e-03

Pos logit contributions
Those+0.039
ill+0.034
These+0.031
of+0.031
 fourteen+0.031
Neg logit contributions
--0.028
 badly-0.028
 shield-0.027
 job-0.025
 active-0.024
Token down
Token ID866
Feature activation+2.03e-02

Pos logit contributions
 job+0.005
 shield+0.005
 possession+0.005
 Amen+0.005
 badly+0.005
Neg logit contributions
Those-0.010
of-0.009
 are-0.009
ill-0.009
oms-0.008
Token.
Token ID13
Feature activation+7.56e-03

Pos logit contributions
Those+0.050
ill+0.046
of+0.044
Kids+0.043
 fourteen+0.042
Neg logit contributions
 badly-0.021
--0.020
 shield-0.019
 job-0.019
 active-0.018
Token say
Token ID910
Feature activation+2.25e-02

Pos logit contributions
Those+0.050
ill+0.044
of+0.041
These+0.038
 fourteen+0.038
Neg logit contributions
 badly-0.057
 shield-0.057
 job-0.056
 possession-0.055
--0.054
Token sorry
Token ID7926
Feature activation+1.19e-02

Pos logit contributions
Those+0.049
ill+0.044
of+0.042
These+0.041
 fourteen+0.040
Neg logit contributions
 badly-0.028
 shield-0.028
 job-0.027
--0.026
 possession-0.026
Token.
Token ID13
Feature activation+7.25e-03

Pos logit contributions
Those+0.030
ill+0.027
These+0.026
of+0.025
 fourteen+0.025
Neg logit contributions
 badly-0.012
--0.012
 job-0.011
 shield-0.011
 possession-0.010
Token
Token ID198
Feature activation+7.07e-02

Pos logit contributions
of+0.014
ill+0.014
Those+0.014
ion+0.013
oms+0.013
Neg logit contributions
 badly-0.010
 job-0.009
 loudly-0.009
 hungry-0.009
--0.008
Token
Token ID198
Feature activation+6.36e-02

Pos logit contributions
ill+0.144
Those+0.144
 thirteen+0.129
 fourteen+0.128
oms+0.128
Neg logit contributions
 badly-0.095
--0.090
 cozy-0.081
 loudly-0.079
 Amen-0.079
TokenThe
Token ID464
Feature activation+1.09e-01

Pos logit contributions
ill+0.101
Those+0.091
oms+0.086
ount+0.084
of+0.082
Neg logit contributions
 badly-0.120
--0.117
 active-0.110
 job-0.108
 healthy-0.105
Token gym
Token ID11550
Feature activation+3.06e-02

Pos logit contributions
ill+0.255
Those+0.254
of+0.247
ures+0.232
ion+0.228
Neg logit contributions
 job-0.118
 shield-0.109
 badly-0.108
 active-0.105
--0.104
Token teacher
Token ID4701
Feature activation+4.03e-02

Pos logit contributions
Those+0.061
ill+0.056
of+0.052
ures+0.050
These+0.050
Neg logit contributions
 badly-0.045
--0.044
 job-0.044
 shield-0.042
 active-0.040
Token comes
Token ID2058
Feature activation+5.93e-02

Pos logit contributions
Those+0.069
ill+0.062
These+0.057
Warning+0.057
oes+0.054
Neg logit contributions
--0.071
 shield-0.064
 badly-0.061
 happily-0.061
 loudly-0.060
Token and
Token ID290
Feature activation-1.66e-02

Pos logit contributions
Those+0.134
ill+0.122
of+0.115
 fourteen+0.111
These+0.111
Neg logit contributions
 shield-0.070
 badly-0.070
 job-0.068
 possession-0.065
--0.064
Token sees
Token ID7224
Feature activation+6.36e-02

Pos logit contributions
 shield+0.029
-+0.029
 job+0.027
 badly+0.027
 healthy+0.026
Neg logit contributions
Those-0.027
ill-0.026
of-0.022
 fourteen-0.022
ures-0.021
Token with
Token ID351
Feature activation+5.02e-02

Pos logit contributions
Those+0.108
ill+0.100
of+0.094
 are+0.090
These+0.088
Neg logit contributions
 shield-0.070
 job-0.069
 possession-0.067
 badly-0.065
 active-0.063
Token you
Token ID345
Feature activation+3.23e-02

Pos logit contributions
Those+0.093
ill+0.084
of+0.075
Kids+0.074
 fourteen+0.074
Neg logit contributions
 badly-0.080
 shield-0.079
 job-0.077
--0.077
 possession-0.075
Token."
Token ID526
Feature activation+1.71e-02

Pos logit contributions
Those+0.079
ill+0.072
of+0.068
Kids+0.067
These+0.067
Neg logit contributions
 badly-0.032
 shield-0.031
 job-0.031
--0.030
 active-0.028
Token
Token ID198
Feature activation+7.58e-02

Pos logit contributions
Those+0.029
ill+0.026
of+0.024
 fourteen+0.024
These+0.023
Neg logit contributions
 badly-0.029
--0.028
 shield-0.027
 job-0.027
 happily-0.027
Token
Token ID198
Feature activation+7.04e-02

Pos logit contributions
Those+0.136
ill+0.126
 fourteen+0.115
of+0.108
ount+0.108
Neg logit contributions
 badly-0.123
--0.122
 shield-0.119
 Amen-0.117
 happily-0.113
TokenThe
Token ID464
Feature activation+1.09e-01

Pos logit contributions
Those+0.102
ill+0.101
ount+0.089
 fourteen+0.084
of+0.083
Neg logit contributions
 badly-0.134
--0.133
 Amen-0.131
 shield-0.129
 active-0.128
Token weird
Token ID7650
Feature activation+6.37e-03

Pos logit contributions
ill+0.264
Those+0.252
of+0.237
ives+0.230
ount+0.228
Neg logit contributions
 shield-0.119
 job-0.115
--0.111
 badly-0.107
 Amen-0.101
Token animal
Token ID5044
Feature activation+3.43e-02

Pos logit contributions
Those+0.015
ill+0.014
of+0.013
 fourteen+0.013
These+0.013
Neg logit contributions
 badly-0.006
 shield-0.006
 job-0.006
 possession-0.006
--0.006
Token nods
Token ID34833
Feature activation+2.13e-02

Pos logit contributions
Those+0.054
ill+0.046
of+0.042
These+0.041
Kids+0.040
Neg logit contributions
 shield-0.065
 badly-0.064
--0.063
 job-0.062
 possession-0.060
Token.
Token ID13
Feature activation+4.80e-03

Pos logit contributions
Those+0.048
ill+0.044
of+0.041
 fourteen+0.040
These+0.040
Neg logit contributions
 badly-0.025
 shield-0.025
 job-0.024
--0.023
 possession-0.023
Token It
Token ID632
Feature activation+4.45e-02

Pos logit contributions
Those+0.010
ill+0.009
of+0.009
These+0.008
 fourteen+0.008
Neg logit contributions
 shield-0.007
 badly-0.006
 job-0.006
 possession-0.006
--0.006
Token were
Token ID547
Feature activation+2.59e-02

Pos logit contributions
Those+0.052
ill+0.043
of+0.038
 fourteen+0.037
Is+0.036
Neg logit contributions
 badly-0.092
 shield-0.091
 job-0.090
--0.088
 active-0.086
Token hungry
Token ID14720
Feature activation+1.64e-03

Pos logit contributions
Those+0.049
ila+0.045
 are+0.044
 fourteen+0.042
ill+0.042
Neg logit contributions
 possession-0.039
ouse-0.034
 badly-0.033
 shield-0.033
 job-0.033
Token.
Token ID13
Feature activation+7.08e-03

Pos logit contributions
Those+0.004
ill+0.003
of+0.003
 fourteen+0.003
These+0.003
Neg logit contributions
 badly-0.002
 shield-0.002
 job-0.002
--0.002
 possession-0.002
Token
Token ID198
Feature activation+7.66e-02

Pos logit contributions
Those+0.013
ill+0.012
of+0.011
 fourteen+0.010
ion+0.010
Neg logit contributions
 badly-0.011
 job-0.011
 shield-0.011
--0.011
 active-0.010
Token
Token ID198
Feature activation+7.11e-02

Pos logit contributions
Those+0.164
ill+0.157
 thirteen+0.145
 fourteen+0.144
These+0.138
Neg logit contributions
 badly-0.090
--0.089
 cozy-0.089
 happily-0.086
 healthy-0.086
TokenThe
Token ID464
Feature activation+1.09e-01

Pos logit contributions
ill+0.126
oes+0.103
ures+0.102
oms+0.101
ility+0.101
Neg logit contributions
 badly-0.123
 healthy-0.120
--0.117
 active-0.111
 hungry-0.109
Token man
Token ID582
Feature activation+4.04e-02

Pos logit contributions
Those+0.247
ill+0.224
of+0.212
 fourteen+0.205
These+0.205
Neg logit contributions
 shield-0.128
 badly-0.128
 job-0.124
--0.120
 possession-0.118
Token is
Token ID318
Feature activation+1.97e-02

Pos logit contributions
Those+0.069
ill+0.060
Warning+0.056
Kids+0.055
These+0.055
Neg logit contributions
--0.073
 shield-0.068
 happily-0.064
 badly-0.063
 job-0.062
Token surprised
Token ID6655
Feature activation+7.37e-03

Pos logit contributions
Those+0.045
ill+0.041
 fourteen+0.038
These+0.038
of+0.038
Neg logit contributions
 shield-0.022
 possession-0.021
 job-0.021
 badly-0.021
--0.019
Token.
Token ID13
Feature activation+3.07e-03

Pos logit contributions
Those+0.020
ill+0.018
These+0.018
 fourteen+0.018
Warning+0.018
Neg logit contributions
--0.006
 badly-0.005
 job-0.004
 loudly-0.004
 happily-0.004
Token He
Token ID679
Feature activation+1.47e-03

Pos logit contributions
Those+0.007
ill+0.006
of+0.006
These+0.006
 fourteen+0.006
Neg logit contributions
 badly-0.003
 shield-0.003
 job-0.003
--0.003
 possession-0.003
Token.
Token ID13
Feature activation+4.96e-03

Pos logit contributions
Those+0.040
ill+0.036
of+0.034
These+0.034
Kids+0.034
Neg logit contributions
 badly-0.018
--0.018
 shield-0.018
 job-0.018
 possession-0.016
Token
Token ID198
Feature activation+7.09e-02

Pos logit contributions
Those+0.010
ill+0.009
of+0.009
ion+0.008
ount+0.008
Neg logit contributions
 badly-0.007
 job-0.006
 loudly-0.006
 happily-0.006
 shield-0.006
Token
Token ID198
Feature activation+6.45e-02

Pos logit contributions
Those+0.158
ill+0.144
 thirteen+0.144
ount+0.140
 fourteen+0.139
Neg logit contributions
 badly-0.079
--0.078
 happily-0.073
 loudly-0.065
 cozy-0.064
Token"
Token ID1
Feature activation+5.90e-02

Pos logit contributions
ill+0.120
Those+0.120
ount+0.113
oms+0.106
 fourteen+0.105
Neg logit contributions
 badly-0.094
--0.090
 job-0.086
 active-0.083
 loudly-0.079
TokenMir
Token ID27453
Feature activation+4.23e-02

Pos logit contributions
ill+0.077
Those+0.074
oms+0.068
ount+0.066
oke+0.066
Neg logit contributions
 badly-0.115
--0.111
 job-0.106
 active-0.106
 shield-0.104
Tokenror
Token ID1472
Feature activation+1.09e-01

Pos logit contributions
Those+0.076
These+0.061
Kids+0.060
of+0.059
Is+0.057
Neg logit contributions
--0.072
 badly-0.072
 job-0.071
 shield-0.070
 possession-0.070
Token,
Token ID11
Feature activation+3.32e-02

Pos logit contributions
Those+0.300
These+0.260
Kids+0.253
Warning+0.247
ount+0.247
Neg logit contributions
--0.093
 job-0.081
 badly-0.080
 shield-0.071
 appetite-0.062
Token mirror
Token ID10162
Feature activation+4.21e-02

Pos logit contributions
Those+0.068
ill+0.060
of+0.057
These+0.056
Kids+0.054
Neg logit contributions
 shield-0.046
 badly-0.045
 job-0.045
--0.044
 active-0.042
Token,
Token ID11
Feature activation+4.37e-02

Pos logit contributions
Those+0.094
ill+0.085
of+0.081
 fourteen+0.078
These+0.077
Neg logit contributions
 shield-0.050
 badly-0.050
 job-0.048
 possession-0.046
--0.046
Token make
Token ID787
Feature activation+6.60e-02

Pos logit contributions
Those+0.085
ill+0.076
of+0.071
These+0.070
Kids+0.068
Neg logit contributions
 shield-0.065
 badly-0.064
 job-0.064
--0.062
 active-0.057
Token Anna
Token ID11735
Feature activation+3.29e-02

Pos logit contributions
Those+0.127
ill+0.111
of+0.109
 are+0.103
 fourteen+0.101
Neg logit contributions
 possession-0.099
 shield-0.099
 badly-0.096
 active-0.095
 job-0.093
Token you
Token ID345
Feature activation+2.36e-02

Pos logit contributions
Those+0.129
ill+0.115
of+0.108
 fourteen+0.105
These+0.104
Neg logit contributions
 badly-0.096
 shield-0.094
 job-0.092
--0.091
 possession-0.088
Token again
Token ID757
Feature activation+3.00e-02

Pos logit contributions
Those+0.058
ill+0.053
of+0.050
 fourteen+0.049
These+0.049
Neg logit contributions
 shield-0.023
 badly-0.023
 job-0.022
 possession-0.021
--0.021
Token".
Token ID1911
Feature activation+8.45e-03

Pos logit contributions
Those+0.068
ill+0.062
of+0.058
 fourteen+0.058
 are+0.056
Neg logit contributions
 shield-0.034
 possession-0.032
 badly-0.031
 job-0.031
--0.029
Token
Token ID198
Feature activation+7.17e-02

Pos logit contributions
Those+0.016
ill+0.014
Kids+0.014
oms+0.014
ila+0.014
Neg logit contributions
 job-0.011
 badly-0.011
 happily-0.011
 shield-0.011
 possession-0.011
Token
Token ID198
Feature activation+6.56e-02

Pos logit contributions
Those+0.157
ill+0.144
 fourteen+0.136
ount+0.131
 thirteen+0.130
Neg logit contributions
 badly-0.090
--0.086
 shield-0.086
 happily-0.081
 Amen-0.078
TokenThe
Token ID464
Feature activation+1.09e-01

Pos logit contributions
ill+0.097
Those+0.097
oms+0.095
ount+0.091
oes+0.090
Neg logit contributions
 badly-0.120
 shield-0.112
 active-0.110
--0.110
 happily-0.107
Token boys
Token ID6510
Feature activation+3.43e-02

Pos logit contributions
Those+0.210
ill+0.207
of+0.195
ount+0.183
ures+0.181
Neg logit contributions
 shield-0.169
 job-0.163
 active-0.157
 badly-0.154
--0.150
Token thanked
Token ID26280
Feature activation+5.27e-02

Pos logit contributions
Those+0.068
ill+0.058
 fourteen+0.054
These+0.054
ures+0.054
Neg logit contributions
 badly-0.052
 shield-0.051
--0.050
 job-0.048
 happily-0.047
Token the
Token ID262
Feature activation+1.05e-01

Pos logit contributions
Those+0.103
ill+0.091
ives+0.083
ount+0.083
of+0.083
Neg logit contributions
 shield-0.080
 job-0.078
 badly-0.078
--0.077
 possession-0.072
Token bald
Token ID28273
Feature activation-2.49e-03

Pos logit contributions
Those+0.254
ill+0.240
of+0.230
 fourteen+0.221
ures+0.218
Neg logit contributions
 shield-0.114
 job-0.107
 active-0.102
 badly-0.101
--0.099
Token<|endoftext|>
Token ID50256
Feature activation+5.04e-02

Pos logit contributions
 badly+0.003
 job+0.003
 shield+0.003
 possession+0.003
 active+0.003
Neg logit contributions
Those-0.005
ill-0.005
of-0.005
 fourteen-0.005
ion-0.005
Token the
Token ID262
Feature activation-1.41e-04

Pos logit contributions
 badly+0.082
 shield+0.081
 job+0.079
 possession+0.077
-+0.077
Neg logit contributions
Those-0.110
ill-0.098
of-0.092
 fourteen-0.089
These-0.089
Token baby
Token ID5156
Feature activation-6.44e-02

Pos logit contributions
 badly+0.000
 shield+0.000
 job+0.000
 possession+0.000
-+0.000
Neg logit contributions
Those-0.000
ill-0.000
of-0.000
 fourteen-0.000
These-0.000
Token was
Token ID373
Feature activation-7.29e-02

Pos logit contributions
 badly+0.099
 shield+0.098
 job+0.097
 possession+0.095
 active+0.094
Neg logit contributions
Those-0.116
ill-0.104
of-0.098
 are-0.093
 fourteen-0.093
Token wearing
Token ID5762
Feature activation-3.64e-02

Pos logit contributions
 shield+0.082
 job+0.079
 badly+0.079
 possession+0.078
-+0.075
Neg logit contributions
Those-0.164
ill-0.149
of-0.139
 fourteen-0.138
These-0.136
Token a
Token ID257
Feature activation-9.51e-02

Pos logit contributions
 badly+0.045
 shield+0.045
 job+0.045
 possession+0.043
 Amen+0.042
Neg logit contributions
Those-0.075
ill-0.067
of-0.065
These-0.063
 are-0.062
Token very
Token ID845
Feature activation-1.90e-01

Pos logit contributions
 possession+0.121
rily+0.116
 badly+0.116
ages+0.114
gy+0.114
Neg logit contributions
Those-0.156
of-0.148
 are-0.143
 end-0.140
ill-0.134
Token soft
Token ID2705
Feature activation-1.31e-01

Pos logit contributions
 shield+0.234
 Amen+0.227
gy+0.220
 badly+0.217
 possession+0.209
Neg logit contributions
Those-0.353
of-0.331
ila-0.327
ill-0.319
 are-0.311
Token blanket
Token ID18447
Feature activation-9.51e-02

Pos logit contributions
 badly+0.175
 active+0.170
 Amen+0.160
ages+0.155
 possession+0.155
Neg logit contributions
of-0.204
 end-0.199
 are-0.198
ill-0.190
Those-0.179
Token.
Token ID13
Feature activation-6.26e-02

Pos logit contributions
 shield+0.118
 badly+0.117
 job+0.117
 possession+0.112
 active+0.111
Neg logit contributions
Those-0.200
ill-0.185
of-0.173
 fourteen-0.167
ion-0.166
Token The
Token ID383
Feature activation+4.45e-02

Pos logit contributions
 badly+0.109
 shield+0.108
 job+0.106
-+0.103
 possession+0.103
Neg logit contributions
Those-0.104
ill-0.090
of-0.084
 fourteen-0.081
These-0.080
Token bird
Token ID6512
Feature activation-7.78e-03

Pos logit contributions
Those+0.100
ill+0.087
of+0.084
These+0.083
ion+0.083
Neg logit contributions
 badly-0.050
 shield-0.048
 possession-0.048
 job-0.047
--0.046
Token years
Token ID812
Feature activation-1.30e-01

Pos logit contributions
 shield+0.151
 badly+0.148
 job+0.144
 possession+0.140
 active+0.133
Neg logit contributions
Those-0.208
ill-0.184
of-0.175
These-0.167
 are-0.164
Token old
Token ID1468
Feature activation-8.76e-02

Pos logit contributions
 badly+0.148
 shield+0.146
 job+0.142
-+0.138
 possession+0.135
Neg logit contributions
Those-0.299
ill-0.270
of-0.257
These-0.249
 fourteen-0.248
Token and
Token ID290
Feature activation-1.05e-01

Pos logit contributions
 badly+0.113
 shield+0.112
 job+0.108
 active+0.105
 possession+0.104
Neg logit contributions
Those-0.184
ill-0.164
of-0.156
These-0.151
 fourteen-0.150
Token had
Token ID550
Feature activation-5.13e-02

Pos logit contributions
 badly+0.130
 shield+0.127
 job+0.124
-+0.120
 possession+0.119
Neg logit contributions
Those-0.230
ill-0.204
of-0.196
These-0.190
 fourteen-0.187
Token a
Token ID257
Feature activation-1.06e-01

Pos logit contributions
 badly+0.065
 shield+0.064
 job+0.063
-+0.061
 possession+0.060
Neg logit contributions
Those-0.111
ill-0.101
of-0.095
 fourteen-0.092
Kids-0.092
Token very
Token ID845
Feature activation-1.88e-01

Pos logit contributions
 badly+0.134
rily+0.127
 possession+0.122
 active+0.122
-+0.122
Neg logit contributions
Those-0.195
 are-0.172
of-0.169
 end-0.168
Is-0.166
Token sweet
Token ID6029
Feature activation-1.29e-01

Pos logit contributions
 badly+0.246
 Amen+0.234
 shield+0.233
gy+0.225
rily+0.221
Neg logit contributions
Those-0.321
ila-0.292
 are-0.279
ill-0.271
 end-0.269
Token tooth
Token ID16162
Feature activation-8.19e-02

Pos logit contributions
 badly+0.187
 active+0.164
ouse+0.159
eh+0.159
 shield+0.157
Neg logit contributions
 end-0.203
 are-0.201
Those-0.199
of-0.194
ill-0.190
Token.
Token ID13
Feature activation-6.28e-02

Pos logit contributions
 job+0.116
 active+0.116
 Amen+0.110
 possession+0.109
 badly+0.107
Neg logit contributions
Those-0.138
ill-0.137
of-0.127
ion-0.127
ives-0.122
Token She
Token ID1375
Feature activation-6.05e-02

Pos logit contributions
 badly+0.108
 shield+0.106
 job+0.103
 possession+0.102
-+0.100
Neg logit contributions
Those-0.105
ill-0.090
of-0.085
 fourteen-0.084
oms-0.082
Token loved
Token ID6151
Feature activation-6.12e-02

Pos logit contributions
 job+0.105
 badly+0.102
 shield+0.100
 possession+0.100
 active+0.096
Neg logit contributions
Those-0.098
ill-0.087
of-0.085
 are-0.078
Is-0.077
Token a
Token ID257
Feature activation-1.04e-01

Pos logit contributions
 shield+0.055
 badly+0.055
 possession+0.054
 job+0.054
-+0.052
Neg logit contributions
Those-0.078
ill-0.072
of-0.066
 fourteen-0.064
 are-0.064
Token time
Token ID640
Feature activation-5.70e-02

Pos logit contributions
 badly+0.204
 possession+0.201
 shield+0.193
rily+0.188
ouse+0.184
Neg logit contributions
Those-0.140
ill-0.131
of-0.127
 fourteen-0.115
ila-0.114
Token there
Token ID612
Feature activation-2.06e-02

Pos logit contributions
 badly+0.070
 shield+0.070
 job+0.068
 possession+0.067
 active+0.064
Neg logit contributions
Those-0.123
ill-0.113
of-0.106
 fourteen-0.103
oms-0.101
Token was
Token ID373
Feature activation-7.78e-02

Pos logit contributions
-+0.028
 shield+0.028
 badly+0.027
 job+0.026
 happily+0.025
Neg logit contributions
Those-0.045
 fourteen-0.037
These-0.036
ion-0.035
ount-0.035
Token a
Token ID257
Feature activation-1.12e-01

Pos logit contributions
 shield+0.113
 badly+0.113
 possession+0.111
 job+0.111
-+0.107
Neg logit contributions
Those-0.148
ill-0.137
 fourteen-0.125
These-0.124
of-0.122
Token very
Token ID845
Feature activation-1.86e-01

Pos logit contributions
 possession+0.169
 badly+0.160
ages+0.153
rily+0.153
 armour+0.144
Neg logit contributions
Those-0.196
 end-0.178
Is-0.171
Kids-0.164
 halftime-0.161
Token clever
Token ID14169
Feature activation-1.26e-01

Pos logit contributions
 possession+0.245
 badly+0.244
 shield+0.231
amas+0.225
 armor+0.221
Neg logit contributions
Those-0.326
ila-0.312
ill-0.288
 end-0.287
oms-0.283
Token fox
Token ID21831
Feature activation-6.35e-02

Pos logit contributions
 badly+0.179
 possession+0.165
 Amen+0.161
ouse+0.159
 shield+0.155
Neg logit contributions
Those-0.222
ill-0.209
of-0.201
 end-0.194
ila-0.192
Token.
Token ID13
Feature activation-6.26e-02

Pos logit contributions
 job+0.086
 badly+0.084
 shield+0.084
 possession+0.079
 active+0.079
Neg logit contributions
Those-0.128
ill-0.117
of-0.111
ion-0.106
These-0.103
Token He
Token ID679
Feature activation-6.55e-02

Pos logit contributions
 shield+0.087
 badly+0.086
 job+0.083
 possession+0.082
 active+0.082
Neg logit contributions
Those-0.126
ill-0.111
of-0.106
 fourteen-0.104
Kids-0.103
Token was
Token ID373
Feature activation-6.41e-02

Pos logit contributions
 badly+0.101
 job+0.100
 shield+0.099
 possession+0.096
-+0.093
Neg logit contributions
Those-0.121
ill-0.105
of-0.102
These-0.096
 fourteen-0.095
Token Bob
Token ID5811
Feature activation-3.18e-02

Pos logit contributions
 badly+0.098
 shield+0.098
 job+0.095
-+0.094
 active+0.092
Neg logit contributions
Those-0.125
ill-0.110
of-0.104
These-0.100
oms-0.100
Token and
Token ID290
Feature activation-7.10e-02

Pos logit contributions
 job+0.042
 shield+0.041
 badly+0.041
 possession+0.039
 active+0.038
Neg logit contributions
Those-0.066
ill-0.060
of-0.057
 fourteen-0.054
These-0.054
Token he
Token ID339
Feature activation-5.89e-02

Pos logit contributions
 badly+0.101
 job+0.100
 shield+0.099
-+0.096
 possession+0.095
Neg logit contributions
Those-0.141
ill-0.125
of-0.118
 fourteen-0.114
Kids-0.113
Token had
Token ID550
Feature activation-5.14e-02

Pos logit contributions
 badly+0.090
 shield+0.090
 job+0.089
 possession+0.087
-+0.085
Neg logit contributions
Those-0.109
ill-0.097
of-0.092
 fourteen-0.087
These-0.087
Token a
Token ID257
Feature activation-9.61e-02

Pos logit contributions
 badly+0.063
 shield+0.063
 job+0.060
-+0.059
 possession+0.058
Neg logit contributions
Those-0.114
ill-0.103
of-0.097
 fourteen-0.094
Kids-0.094
Token very
Token ID845
Feature activation-1.84e-01

Pos logit contributions
 badly+0.125
rily+0.118
 possession+0.117
-+0.111
 active+0.111
Neg logit contributions
Those-0.168
of-0.153
 are-0.153
 end-0.149
Is-0.142
Token big
Token ID1263
Feature activation-1.19e-01

Pos logit contributions
 badly+0.228
 shield+0.221
 Amen+0.216
 possession+0.215
rily+0.209
Neg logit contributions
Those-0.322
ila-0.295
of-0.283
 are-0.280
ill-0.271
Token frown
Token ID22002
Feature activation-8.37e-02

Pos logit contributions
 badly+0.170
rily+0.163
 active+0.152
eh+0.151
ouse+0.150
Neg logit contributions
 are-0.195
 end-0.192
Those-0.182
ill-0.180
of-0.168
Token.
Token ID13
Feature activation-5.15e-02

Pos logit contributions
 shield+0.109
 badly+0.106
 job+0.106
 possession+0.105
 active+0.102
Neg logit contributions
Those-0.173
ill-0.156
of-0.147
ion-0.141
 fourteen-0.140
Token He
Token ID679
Feature activation-5.85e-02

Pos logit contributions
 badly+0.081
 shield+0.080
 job+0.077
-+0.076
 possession+0.076
Neg logit contributions
Those-0.094
ill-0.083
of-0.078
These-0.075
 fourteen-0.075
Token went
Token ID1816
Feature activation-1.90e-02

Pos logit contributions
 badly+0.089
 shield+0.088
 job+0.087
 possession+0.084
-+0.084
Neg logit contributions
Those-0.111
ill-0.098
of-0.093
 fourteen-0.088
These-0.088
TokenTom
Token ID13787
Feature activation-8.71e-03

Pos logit contributions
-+0.038
 badly+0.037
 shield+0.036
 happily+0.036
 job+0.036
Neg logit contributions
Those-0.038
 fourteen-0.031
ill-0.030
These-0.029
of-0.029
Tokenmy
Token ID1820
Feature activation-4.44e-02

Pos logit contributions
-+0.016
 badly+0.015
 shield+0.015
 job+0.015
 active+0.014
Neg logit contributions
Those-0.015
ill-0.011
These-0.011
Is-0.011
 fourteen-0.011
Token was
Token ID373
Feature activation-5.85e-02

Pos logit contributions
 badly+0.065
 shield+0.065
 job+0.063
 possession+0.063
 active+0.060
Neg logit contributions
Those-0.085
ill-0.076
of-0.074
 are-0.070
 fourteen-0.070
Token having
Token ID1719
Feature activation-3.23e-02

Pos logit contributions
 possession+0.088
 shield+0.080
 job+0.076
ouse+0.074
 badly+0.074
Neg logit contributions
Those-0.107
ill-0.101
ila-0.094
 fourteen-0.093
These-0.089
Token a
Token ID257
Feature activation-1.05e-01

Pos logit contributions
 shield+0.044
 badly+0.042
 possession+0.042
 job+0.042
-+0.041
Neg logit contributions
Those-0.067
ill-0.060
of-0.056
These-0.056
 fourteen-0.055
Token very
Token ID845
Feature activation-1.83e-01

Pos logit contributions
rily+0.185
 possession+0.168
 badly+0.160
keeping+0.159
 armour+0.156
Neg logit contributions
Those-0.158
 end-0.156
of-0.155
 are-0.145
 last-0.140
Token special
Token ID2041
Feature activation-1.21e-01

Pos logit contributions
 shield+0.258
rily+0.253
 badly+0.253
 armor+0.247
 armour+0.245
Neg logit contributions
Those-0.318
ila-0.301
ill-0.277
 end-0.275
of-0.270
Token day
Token ID1110
Feature activation-8.55e-02

Pos logit contributions
rily+0.195
 badly+0.188
 Amen+0.171
 possession+0.169
 loudly+0.168
Neg logit contributions
 end-0.170
Those-0.166
ill-0.164
of-0.163
 are-0.159
Token.
Token ID13
Feature activation-5.28e-02

Pos logit contributions
 shield+0.124
 possession+0.117
 badly+0.114
 Amen+0.112
 job+0.111
Neg logit contributions
Those-0.159
ill-0.148
of-0.134
oms-0.133
 fourteen-0.130
Token He
Token ID679
Feature activation-5.90e-02

Pos logit contributions
 badly+0.072
 job+0.071
 shield+0.070
 possession+0.067
-+0.067
Neg logit contributions
Those-0.107
ill-0.092
of-0.091
These-0.087
oms-0.087
Token was
Token ID373
Feature activation-6.05e-02

Pos logit contributions
 badly+0.094
 job+0.093
 shield+0.092
 possession+0.090
-+0.085
Neg logit contributions
Those-0.104
ill-0.093
of-0.090
Kids-0.084
 are-0.083
Token time
Token ID640
Feature activation-3.92e-02

Pos logit contributions
 shield+0.119
 badly+0.119
 possession+0.118
 job+0.112
 Amen+0.110
Neg logit contributions
Those-0.115
ill-0.105
of-0.101
 fourteen-0.093
ila-0.091
Token,
Token ID11
Feature activation-1.68e-02

Pos logit contributions
 shield+0.046
 badly+0.046
 job+0.044
 possession+0.043
-+0.042
Neg logit contributions
Those-0.088
ill-0.080
of-0.076
 fourteen-0.073
These-0.073
Token there
Token ID612
Feature activation-6.36e-03

Pos logit contributions
 shield+0.025
 possession+0.024
 job+0.024
 badly+0.023
 appetite+0.023
Neg logit contributions
Those-0.033
ill-0.030
 fourteen-0.027
ila-0.027
of-0.027
Token was
Token ID373
Feature activation-6.81e-02

Pos logit contributions
-+0.007
 badly+0.007
 happily+0.007
 job+0.006
 shield+0.006
Neg logit contributions
Those-0.015
 fourteen-0.014
oms-0.013
 thirteen-0.013
ives-0.013
Token a
Token ID257
Feature activation-1.11e-01

Pos logit contributions
 possession+0.122
 shield+0.120
 badly+0.119
ouse+0.118
 job+0.117
Neg logit contributions
Those-0.116
ill-0.096
These-0.096
 fourteen-0.089
 are-0.085
Token very
Token ID845
Feature activation-1.83e-01

Pos logit contributions
 possession+0.190
 badly+0.173
ages+0.171
rily+0.167
 armour+0.165
Neg logit contributions
Those-0.181
 end-0.155
Ben-0.152
Is-0.152
Kids-0.150
Token thoughtful
Token ID22677
Feature activation-1.22e-01

Pos logit contributions
 possession+0.274
 badly+0.272
amas+0.249
rily+0.246
 armour+0.244
Neg logit contributions
ila-0.299
Those-0.292
ulu-0.268
oms-0.259
 end-0.254
Token girl
Token ID2576
Feature activation-7.96e-02

Pos logit contributions
 badly+0.184
 Amen+0.168
 possession+0.166
 active+0.164
ages+0.161
Neg logit contributions
Those-0.185
ill-0.180
 end-0.178
of-0.178
 are-0.171
Token.
Token ID13
Feature activation-5.03e-02

Pos logit contributions
 shield+0.126
 job+0.122
 badly+0.120
 armour+0.114
 active+0.113
Neg logit contributions
Those-0.144
ill-0.132
of-0.123
ila-0.118
ion-0.118
Token She
Token ID1375
Feature activation-5.45e-02

Pos logit contributions
 badly+0.069
 possession+0.069
 job+0.069
 shield+0.068
 active+0.068
Neg logit contributions
Those-0.093
 fourteen-0.082
of-0.081
oms-0.081
 halftime-0.079
Token was
Token ID373
Feature activation-6.41e-02

Pos logit contributions
 badly+0.102
 possession+0.100
 shield+0.100
 job+0.099
rily+0.095
Neg logit contributions
Those-0.079
 are-0.069
Kids-0.065
of-0.064
These-0.063
Token time
Token ID640
Feature activation-3.92e-02

Pos logit contributions
 shield+0.119
 badly+0.119
 possession+0.118
 job+0.112
 Amen+0.110
Neg logit contributions
Those-0.115
ill-0.105
of-0.101
 fourteen-0.093
ila-0.091
Token,
Token ID11
Feature activation-1.68e-02

Pos logit contributions
 shield+0.046
 badly+0.046
 job+0.044
 possession+0.043
-+0.042
Neg logit contributions
Those-0.088
ill-0.080
of-0.076
 fourteen-0.073
These-0.073
Token there
Token ID612
Feature activation-6.36e-03

Pos logit contributions
 shield+0.025
 possession+0.024
 job+0.024
 badly+0.023
 appetite+0.023
Neg logit contributions
Those-0.033
ill-0.030
 fourteen-0.027
ila-0.027
of-0.027
Token was
Token ID373
Feature activation-6.81e-02

Pos logit contributions
-+0.007
 badly+0.007
 happily+0.007
 job+0.006
 shield+0.006
Neg logit contributions
Those-0.015
 fourteen-0.014
oms-0.013
 thirteen-0.013
ives-0.013
Token a
Token ID257
Feature activation-1.11e-01

Pos logit contributions
 possession+0.122
 shield+0.120
 badly+0.119
ouse+0.118
 job+0.117
Neg logit contributions
Those-0.116
ill-0.096
These-0.096
 fourteen-0.089
 are-0.085
Token very
Token ID845
Feature activation-1.83e-01

Pos logit contributions
 possession+0.190
 badly+0.173
ages+0.171
rily+0.167
 armour+0.165
Neg logit contributions
Those-0.181
 end-0.155
Ben-0.152
Is-0.152
Kids-0.150
Token thoughtful
Token ID22677
Feature activation-1.22e-01

Pos logit contributions
 possession+0.274
 badly+0.272
amas+0.249
rily+0.246
 armour+0.244
Neg logit contributions
ila-0.299
Those-0.292
ulu-0.268
oms-0.259
 end-0.254
Token girl
Token ID2576
Feature activation-7.96e-02

Pos logit contributions
 badly+0.184
 Amen+0.168
 possession+0.166
 active+0.164
ages+0.161
Neg logit contributions
Those-0.185
ill-0.180
 end-0.178
of-0.178
 are-0.171
Token.
Token ID13
Feature activation-5.03e-02

Pos logit contributions
 shield+0.126
 job+0.122
 badly+0.120
 armour+0.114
 active+0.113
Neg logit contributions
Those-0.144
ill-0.132
of-0.123
ila-0.118
ion-0.118
Token She
Token ID1375
Feature activation-5.45e-02

Pos logit contributions
 badly+0.069
 possession+0.069
 job+0.069
 shield+0.068
 active+0.068
Neg logit contributions
Those-0.093
 fourteen-0.082
of-0.081
oms-0.081
 halftime-0.079
Token was
Token ID373
Feature activation-6.41e-02

Pos logit contributions
 badly+0.102
 possession+0.100
 shield+0.100
 job+0.099
rily+0.095
Neg logit contributions
Those-0.079
 are-0.069
Kids-0.065
of-0.064
These-0.063
Token time
Token ID640
Feature activation-5.70e-02

Pos logit contributions
 badly+0.204
 possession+0.201
 shield+0.193
rily+0.188
ouse+0.184
Neg logit contributions
Those-0.140
ill-0.131
of-0.127
 fourteen-0.115
ila-0.114
Token,
Token ID11
Feature activation-4.37e-02

Pos logit contributions
 badly+0.070
 shield+0.070
 job+0.068
 possession+0.067
 active+0.064
Neg logit contributions
Those-0.123
ill-0.113
of-0.106
 fourteen-0.103
oms-0.101
Token there
Token ID612
Feature activation-1.72e-02

Pos logit contributions
 possession+0.068
 shield+0.068
 badly+0.068
 job+0.066
 appetite+0.066
Neg logit contributions
Those-0.077
ill-0.073
 fourteen-0.064
ila-0.063
of-0.062
Token was
Token ID373
Feature activation-7.61e-02

Pos logit contributions
-+0.022
 badly+0.022
 shield+0.022
 job+0.021
 happily+0.020
Neg logit contributions
Those-0.038
 fourteen-0.033
ount-0.032
oms-0.031
of-0.031
Token a
Token ID257
Feature activation-1.11e-01

Pos logit contributions
 shield+0.111
 badly+0.111
 job+0.109
 possession+0.109
-+0.106
Neg logit contributions
Those-0.146
ill-0.133
These-0.122
 fourteen-0.121
of-0.119
Token very
Token ID845
Feature activation-1.83e-01

Pos logit contributions
 possession+0.170
 badly+0.160
ages+0.158
rily+0.157
 armour+0.146
Neg logit contributions
Those-0.190
 end-0.173
Is-0.166
Kids-0.158
 halftime-0.155
Token silly
Token ID14397
Feature activation-1.28e-01

Pos logit contributions
 badly+0.249
 possession+0.249
amas+0.237
 shield+0.232
rily+0.230
Neg logit contributions
Those-0.310
ila-0.298
 end-0.273
oms-0.267
ulu-0.265
Token gir
Token ID37370
Feature activation-7.80e-02

Pos logit contributions
 badly+0.186
 possession+0.175
 Amen+0.173
rily+0.168
ages+0.167
Neg logit contributions
Those-0.201
ill-0.188
 end-0.187
 are-0.180
of-0.179
Tokenaffe
Token ID21223
Feature activation-5.22e-02

Pos logit contributions
 badly+0.098
 shield+0.098
 job+0.095
-+0.093
 possession+0.091
Neg logit contributions
Those-0.170
ill-0.152
of-0.143
These-0.140
Kids-0.139
Token.
Token ID13
Feature activation-5.96e-02

Pos logit contributions
 job+0.080
 active+0.073
 shield+0.073
 badly+0.073
 possession+0.073
Neg logit contributions
Those-0.096
ill-0.086
of-0.082
ion-0.080
 are-0.078
Token It
Token ID632
Feature activation-2.38e-02

Pos logit contributions
 badly+0.088
 shield+0.088
 job+0.086
 possession+0.083
 active+0.083
Neg logit contributions
Those-0.114
ill-0.100
of-0.096
 fourteen-0.092
Kids-0.092
Token called
Token ID1444
Feature activation-3.66e-02

Pos logit contributions
 shield+0.146
 job+0.144
 badly+0.133
 active+0.132
 armour+0.131
Neg logit contributions
Those-0.165
ill-0.152
of-0.141
ion-0.134
 are-0.134
Token Anna
Token ID11735
Feature activation-3.85e-02

Pos logit contributions
 badly+0.054
 shield+0.053
 job+0.053
-+0.052
 possession+0.051
Neg logit contributions
Those-0.072
ill-0.063
of-0.059
These-0.058
 fourteen-0.058
Token who
Token ID508
Feature activation-2.68e-02

Pos logit contributions
 job+0.051
 shield+0.050
 badly+0.049
 possession+0.047
 active+0.047
Neg logit contributions
Those-0.080
ill-0.073
of-0.069
ion-0.065
 are-0.065
Token had
Token ID550
Feature activation-5.21e-02

Pos logit contributions
 shield+0.040
 badly+0.039
 job+0.039
 possession+0.037
-+0.036
Neg logit contributions
Those-0.052
ill-0.045
of-0.043
These-0.042
Kids-0.041
Token a
Token ID257
Feature activation-9.98e-02

Pos logit contributions
 badly+0.064
 shield+0.064
 job+0.062
-+0.060
 possession+0.060
Neg logit contributions
Those-0.115
ill-0.103
of-0.098
 fourteen-0.095
These-0.094
Token very
Token ID845
Feature activation-1.83e-01

Pos logit contributions
rily+0.122
 active+0.121
 badly+0.121
 possession+0.118
 armour+0.114
Neg logit contributions
Those-0.170
 are-0.153
 end-0.150
of-0.148
Is-0.145
Token special
Token ID2041
Feature activation-1.12e-01

Pos logit contributions
 shield+0.244
 Amen+0.241
 badly+0.234
gy+0.227
rily+0.225
Neg logit contributions
Those-0.284
ila-0.276
 are-0.249
of-0.248
 end-0.245
Token passport
Token ID17981
Feature activation-8.89e-02

Pos logit contributions
 badly+0.138
 active+0.138
 Amen+0.128
 loudly+0.127
rily+0.126
Neg logit contributions
Those-0.184
 are-0.179
ill-0.177
 end-0.175
of-0.174
Token.
Token ID13
Feature activation-5.12e-02

Pos logit contributions
 shield+0.117
 job+0.115
 badly+0.114
 active+0.113
 possession+0.110
Neg logit contributions
Those-0.175
ill-0.162
of-0.151
ion-0.147
 are-0.144
Token Everyone
Token ID11075
Feature activation+2.38e-02

Pos logit contributions
 badly+0.086
 shield+0.084
 job+0.081
 possession+0.080
 Amen+0.080
Neg logit contributions
Those-0.087
ill-0.076
of-0.071
 fourteen-0.070
These-0.068
Token wanted
Token ID2227
Feature activation-2.83e-02

Pos logit contributions
Those+0.048
ill+0.043
of+0.041
 fourteen+0.040
These+0.039
Neg logit contributions
 badly-0.033
 shield-0.033
 job-0.032
--0.031
 possession-0.031
Token thought
Token ID1807
Feature activation-7.02e-02

Pos logit contributions
Those+0.037
ill+0.033
of+0.032
ion+0.030
 are+0.030
Neg logit contributions
 shield-0.024
 badly-0.024
 job-0.024
 Amen-0.024
 possession-0.023
Token that
Token ID326
Feature activation-2.62e-02

Pos logit contributions
 badly+0.090
 shield+0.089
 job+0.087
-+0.084
 possession+0.084
Neg logit contributions
Those-0.151
ill-0.135
of-0.128
 fourteen-0.124
These-0.123
Token sounded
Token ID14846
Feature activation-9.57e-02

Pos logit contributions
 shield+0.033
 badly+0.032
 job+0.031
-+0.030
 possession+0.030
Neg logit contributions
Those-0.058
ill-0.052
of-0.049
 fourteen-0.048
These-0.047
Token like
Token ID588
Feature activation-3.27e-02

Pos logit contributions
 shield+0.131
 badly+0.128
 possession+0.127
 job+0.126
-+0.125
Neg logit contributions
Those-0.190
ill-0.167
 are-0.156
of-0.156
These-0.153
Token a
Token ID257
Feature activation-8.89e-02

Pos logit contributions
 shield+0.038
 badly+0.038
 job+0.037
-+0.036
 possession+0.035
Neg logit contributions
Those-0.074
ill-0.067
of-0.064
 fourteen-0.061
These-0.061
Token very
Token ID845
Feature activation-1.83e-01

Pos logit contributions
 badly+0.114
 possession+0.110
 active+0.104
-+0.104
 Amen+0.102
Neg logit contributions
Those-0.175
ill-0.153
of-0.150
 are-0.148
Is-0.145
Token fun
Token ID1257
Feature activation-1.20e-01

Pos logit contributions
 Amen+0.228
 badly+0.226
 shield+0.223
 possession+0.215
gy+0.215
Neg logit contributions
Those-0.337
ila-0.312
of-0.301
 are-0.289
ill-0.287
Token game
Token ID983
Feature activation-5.88e-02

Pos logit contributions
 badly+0.176
 Amen+0.162
 shield+0.159
 active+0.155
rily+0.155
Neg logit contributions
Those-0.206
ill-0.195
 are-0.187
of-0.184
Kids-0.182
Token.
Token ID13
Feature activation-5.97e-02

Pos logit contributions
 badly+0.059
 shield+0.058
-+0.058
 job+0.056
 possession+0.053
Neg logit contributions
Those-0.145
ill-0.128
of-0.124
These-0.123
Is-0.122
Token So
Token ID1406
Feature activation-4.78e-02

Pos logit contributions
 badly+0.103
 shield+0.103
 job+0.100
-+0.098
 possession+0.097
Neg logit contributions
Those-0.099
ill-0.085
of-0.080
 fourteen-0.077
These-0.076
Token he
Token ID339
Feature activation-4.70e-02

Pos logit contributions
 badly+0.063
 shield+0.063
 job+0.061
 possession+0.059
-+0.059
Neg logit contributions
Those-0.100
ill-0.090
of-0.085
 fourteen-0.082
These-0.082
Token years
Token ID812
Feature activation-1.05e-01

Pos logit contributions
 badly+0.068
-+0.065
 shield+0.063
 job+0.063
 active+0.060
Neg logit contributions
Those-0.142
ill-0.128
of-0.121
These-0.119
Kids-0.119
Token old
Token ID1468
Feature activation-7.41e-02

Pos logit contributions
 badly+0.118
 shield+0.116
 job+0.114
-+0.111
 possession+0.108
Neg logit contributions
Those-0.245
ill-0.221
of-0.210
These-0.205
 fourteen-0.203
Token and
Token ID290
Feature activation-9.72e-02

Pos logit contributions
 shield+0.096
 badly+0.096
 job+0.094
 active+0.090
 possession+0.090
Neg logit contributions
Those-0.154
ill-0.137
of-0.131
These-0.126
ion-0.125
Token had
Token ID550
Feature activation-4.30e-02

Pos logit contributions
 badly+0.116
 shield+0.114
 job+0.110
 possession+0.107
-+0.107
Neg logit contributions
Those-0.213
ill-0.186
of-0.182
These-0.177
 are-0.175
Token a
Token ID257
Feature activation-1.00e-01

Pos logit contributions
 badly+0.054
 shield+0.054
 job+0.052
-+0.051
 possession+0.050
Neg logit contributions
Those-0.093
ill-0.084
of-0.079
 fourteen-0.077
Kids-0.076
Token very
Token ID845
Feature activation-1.82e-01

Pos logit contributions
 badly+0.126
rily+0.123
 possession+0.118
-+0.118
 active+0.116
Neg logit contributions
Those-0.180
 end-0.159
 are-0.159
of-0.156
Is-0.154
Token important
Token ID1593
Feature activation-1.14e-01

Pos logit contributions
 badly+0.240
 shield+0.226
gy+0.225
rily+0.220
 Amen+0.220
Neg logit contributions
Those-0.306
ila-0.278
 are-0.266
 end-0.264
ill-0.259
Token question
Token ID1808
Feature activation-3.46e-02

Pos logit contributions
rily+0.126
 badly+0.125
 Amen+0.123
 active+0.118
 shield+0.116
Neg logit contributions
Those-0.200
of-0.192
 end-0.192
 are-0.192
ion-0.189
Token.
Token ID13
Feature activation-6.08e-02

Pos logit contributions
 badly+0.038
 shield+0.036
-+0.036
 job+0.035
 possession+0.034
Neg logit contributions
Those-0.083
ill-0.074
of-0.071
Kids-0.070
These-0.070
Token She
Token ID1375
Feature activation-5.93e-02

Pos logit contributions
 badly+0.107
 shield+0.104
 job+0.102
 possession+0.100
 active+0.099
Neg logit contributions
Those-0.099
ill-0.086
of-0.081
 fourteen-0.080
oms-0.077
Token asked
Token ID1965
Feature activation+1.06e-02

Pos logit contributions
 badly+0.088
 shield+0.088
 job+0.087
 possession+0.084
-+0.083
Neg logit contributions
Those-0.113
ill-0.100
of-0.095
 fourteen-0.091
These-0.090
Token very
Token ID845
Feature activation-1.65e-01

Pos logit contributions
Those+0.059
ill+0.053
of+0.050
Kids+0.048
 fourteen+0.048
Neg logit contributions
 shield-0.035
 badly-0.034
 job-0.034
--0.033
 possession-0.032
Token big
Token ID1263
Feature activation-1.10e-01

Pos logit contributions
 shield+0.195
 possession+0.184
gy+0.172
 armour+0.167
 armor+0.166
Neg logit contributions
Those-0.357
ila-0.319
ill-0.302
 are-0.299
Kids-0.298
Token bag
Token ID6131
Feature activation-7.47e-02

Pos logit contributions
 badly+0.152
 active+0.147
 possession+0.146
 job+0.140
ouse+0.139
Neg logit contributions
Those-0.203
ill-0.189
 are-0.184
 fourteen-0.175
of-0.174
Token for
Token ID329
Feature activation-2.98e-02

Pos logit contributions
 shield+0.095
 badly+0.095
 job+0.093
 possession+0.089
-+0.089
Neg logit contributions
Those-0.159
ill-0.144
of-0.135
 fourteen-0.131
These-0.130
Token a
Token ID257
Feature activation-1.07e-01

Pos logit contributions
 shield+0.042
 badly+0.042
 job+0.041
 possession+0.040
-+0.039
Neg logit contributions
Those-0.060
ill-0.053
of-0.050
 fourteen-0.048
These-0.048
Token very
Token ID845
Feature activation-1.82e-01

Pos logit contributions
 possession+0.136
 badly+0.129
 active+0.127
-+0.123
 shield+0.122
Neg logit contributions
Those-0.208
of-0.196
ill-0.188
 are-0.176
 end-0.176
Token small
Token ID1402
Feature activation-9.82e-02

Pos logit contributions
 Amen+0.236
 shield+0.229
gy+0.225
 armor+0.222
amas+0.222
Neg logit contributions
ila-0.302
Those-0.299
of-0.286
 end-0.275
ill-0.271
Token girl
Token ID2576
Feature activation-8.75e-02

Pos logit contributions
 active+0.129
 badly+0.127
 possession+0.121
 shield+0.118
 Amen+0.118
Neg logit contributions
Those-0.176
ill-0.165
of-0.162
 are-0.158
 end-0.156
Token.
Token ID13
Feature activation-6.12e-02

Pos logit contributions
 shield+0.125
 job+0.124
 badly+0.122
 possession+0.117
 active+0.116
Neg logit contributions
Those-0.167
ill-0.156
of-0.146
 are-0.138
ion-0.137
Token Do
Token ID2141
Feature activation+4.12e-04

Pos logit contributions
 badly+0.104
 shield+0.103
 job+0.099
 active+0.098
 possession+0.098
Neg logit contributions
Those-0.101
ill-0.087
of-0.083
 fourteen-0.081
These-0.078
Token you
Token ID345
Feature activation-3.32e-02

Pos logit contributions
Those+0.001
ill+0.001
of+0.001
 fourteen+0.001
These+0.001
Neg logit contributions
 badly-0.001
 shield-0.001
 job-0.001
--0.001
 possession-0.000
Token time
Token ID640
Feature activation-7.04e-02

Pos logit contributions
 possession+0.187
 badly+0.184
 active+0.168
-+0.168
rily+0.168
Neg logit contributions
Those-0.251
of-0.225
ill-0.225
 are-0.225
 end-0.222
Token,
Token ID11
Feature activation-6.06e-02

Pos logit contributions
 badly+0.068
-+0.066
 shield+0.063
 job+0.062
 possession+0.057
Neg logit contributions
Those-0.176
ill-0.155
These-0.152
of-0.151
 fourteen-0.149
Token there
Token ID612
Feature activation-1.94e-02

Pos logit contributions
 job+0.093
 possession+0.091
-+0.091
 appetite+0.090
 shield+0.090
Neg logit contributions
Those-0.111
ill-0.100
of-0.089
 are-0.087
ila-0.087
Token was
Token ID373
Feature activation-7.04e-02

Pos logit contributions
 badly+0.026
-+0.026
 shield+0.024
 happily+0.024
 job+0.024
Neg logit contributions
Those-0.042
 fourteen-0.035
ount-0.035
oms-0.034
These-0.034
Token a
Token ID257
Feature activation-1.06e-01

Pos logit contributions
 shield+0.090
 badly+0.089
 job+0.088
 possession+0.086
-+0.085
Neg logit contributions
Those-0.150
ill-0.135
of-0.126
These-0.124
 fourteen-0.124
Token very
Token ID845
Feature activation-1.82e-01

Pos logit contributions
 possession+0.158
 badly+0.145
rily+0.143
 armour+0.137
ages+0.135
Neg logit contributions
Those-0.190
 end-0.166
Is-0.166
Kids-0.154
 halftime-0.152
Token smart
Token ID4451
Feature activation-1.20e-01

Pos logit contributions
 possession+0.251
 badly+0.240
amas+0.238
 shield+0.237
rily+0.230
Neg logit contributions
Those-0.303
ila-0.287
 end-0.269
ill-0.248
oral-0.248
Token boy
Token ID2933
Feature activation-8.23e-02

Pos logit contributions
 badly+0.166
 possession+0.154
 Amen+0.152
 active+0.152
 shield+0.146
Neg logit contributions
Those-0.207
ill-0.196
of-0.187
ion-0.182
 end-0.182
Token named
Token ID3706
Feature activation-1.49e-02

Pos logit contributions
 shield+0.127
 job+0.122
 badly+0.115
 active+0.114
 possession+0.111
Neg logit contributions
Those-0.154
ill-0.141
of-0.130
ion-0.127
 are-0.123
Token Dave
Token ID9935
Feature activation-2.37e-02

Pos logit contributions
 badly+0.020
 shield+0.020
 job+0.020
-+0.019
 possession+0.019
Neg logit contributions
Those-0.030
ill-0.027
of-0.026
These-0.025
 fourteen-0.025
Token.
Token ID13
Feature activation-5.25e-02

Pos logit contributions
 shield+0.029
 badly+0.029
 job+0.029
 possession+0.027
 active+0.027
Neg logit contributions
Those-0.052
ill-0.047
of-0.045
 fourteen-0.043
These-0.043
Token mall
Token ID17374
Feature activation-9.77e-02

Pos logit contributions
 possession+0.169
 badly+0.160
ages+0.153
rily+0.153
 armour+0.144
Neg logit contributions
Those-0.196
 end-0.178
Is-0.171
Kids-0.164
 halftime-0.161
Token.
Token ID13
Feature activation-6.75e-02

Pos logit contributions
 shield+0.125
 badly+0.116
 job+0.116
 possession+0.112
 active+0.111
Neg logit contributions
Those-0.212
ill-0.193
of-0.178
These-0.174
 fourteen-0.171
Token It
Token ID632
Feature activation-3.05e-02

Pos logit contributions
 badly+0.105
 shield+0.103
 job+0.101
 possession+0.099
 active+0.097
Neg logit contributions
Those-0.124
ill-0.105
of-0.104
 fourteen-0.104
These-0.098
Token was
Token ID373
Feature activation-7.50e-02

Pos logit contributions
 badly+0.048
 shield+0.048
-+0.047
 job+0.047
 possession+0.046
Neg logit contributions
Those-0.056
ill-0.050
 fourteen-0.046
of-0.046
These-0.044
Token a
Token ID257
Feature activation-1.01e-01

Pos logit contributions
 shield+0.088
 badly+0.088
 job+0.085
 possession+0.083
-+0.082
Neg logit contributions
Those-0.167
ill-0.151
of-0.141
These-0.139
 fourteen-0.137
Token very
Token ID845
Feature activation-1.80e-01

Pos logit contributions
 badly+0.129
 shield+0.126
 possession+0.126
rily+0.119
ages+0.116
Neg logit contributions
Those-0.198
Is-0.168
 are-0.167
 end-0.166
These-0.165
Token successful
Token ID4388
Feature activation-1.09e-01

Pos logit contributions
 shield+0.252
 badly+0.239
 possession+0.232
gy+0.221
 Amen+0.220
Neg logit contributions
Those-0.326
ila-0.288
ill-0.280
 are-0.279
oms-0.277
Token mall
Token ID17374
Feature activation-8.04e-02

Pos logit contributions
 badly+0.179
 shield+0.175
 Amen+0.162
 active+0.160
 possession+0.159
Neg logit contributions
Those-0.179
ill-0.158
 are-0.148
of-0.148
Kids-0.148
Token,
Token ID11
Feature activation-5.00e-02

Pos logit contributions
 shield+0.108
 badly+0.101
 job+0.097
 Amen+0.095
 possession+0.095
Neg logit contributions
Those-0.169
ill-0.153
of-0.142
These-0.137
ion-0.136
Token people
Token ID661
Feature activation+1.41e-02

Pos logit contributions
 shield+0.066
 badly+0.065
 job+0.064
 possession+0.063
-+0.060
Neg logit contributions
Those-0.104
ill-0.094
of-0.086
These-0.084
 fourteen-0.084
Token from
Token ID422
Feature activation-3.65e-02

Pos logit contributions
Those+0.032
ill+0.029
 fourteen+0.028
of+0.027
Kids+0.027
Neg logit contributions
 badly-0.016
--0.016
 shield-0.015
 job-0.015
 possession-0.015
Token Jack
Token ID3619
Feature activation-3.65e-02

Pos logit contributions
 badly+0.040
 job+0.039
 shield+0.038
 possession+0.038
-+0.037
Neg logit contributions
Those-0.053
ill-0.048
of-0.045
 fourteen-0.044
These-0.042
Token.
Token ID13
Feature activation-4.84e-02

Pos logit contributions
 job+0.048
 badly+0.047
 shield+0.047
 active+0.044
 possession+0.044
Neg logit contributions
Those-0.076
ill-0.070
of-0.066
These-0.062
 are-0.062
Token He
Token ID679
Feature activation-5.95e-02

Pos logit contributions
 shield+0.065
 badly+0.064
 job+0.063
 possession+0.060
-+0.060
Neg logit contributions
Those-0.101
ill-0.090
of-0.086
 fourteen-0.083
These-0.083
Token was
Token ID373
Feature activation-6.81e-02

Pos logit contributions
 job+0.103
 possession+0.103
 shield+0.101
 badly+0.100
rily+0.100
Neg logit contributions
Those-0.092
Kids-0.078
of-0.076
Is-0.076
These-0.075
Token a
Token ID257
Feature activation-9.83e-02

Pos logit contributions
 possession+0.103
 shield+0.099
ouse+0.098
 badly+0.097
 appetite+0.091
Neg logit contributions
Those-0.133
These-0.109
ill-0.103
 are-0.101
ila-0.100
Token very
Token ID845
Feature activation-1.79e-01

Pos logit contributions
rily+0.144
 badly+0.138
 possession+0.138
amas+0.132
ages+0.132
Neg logit contributions
Those-0.170
Is-0.148
 end-0.142
These-0.137
Three-0.135
Token compassionate
Token ID32533
Feature activation-1.20e-01

Pos logit contributions
 badly+0.245
 possession+0.241
 shield+0.237
rily+0.221
 armour+0.220
Neg logit contributions
Those-0.332
ila-0.316
oke-0.285
 are-0.273
Kids-0.270
Token boy
Token ID2933
Feature activation-7.92e-02

Pos logit contributions
 badly+0.175
eh+0.168
 Amen+0.162
 shield+0.159
 active+0.158
Neg logit contributions
Those-0.192
ill-0.188
of-0.175
 end-0.172
 are-0.169
Token,
Token ID11
Feature activation-3.61e-02

Pos logit contributions
 shield+0.139
 job+0.132
 badly+0.126
 active+0.125
 possession+0.124
Neg logit contributions
Those-0.133
ill-0.114
of-0.107
ion-0.104
These-0.104
Token who
Token ID508
Feature activation-2.26e-02

Pos logit contributions
 job+0.055
 badly+0.055
 shield+0.055
-+0.052
 possession+0.052
Neg logit contributions
Those-0.068
ill-0.060
of-0.056
These-0.054
 fourteen-0.054
Token always
Token ID1464
Feature activation-3.65e-02

Pos logit contributions
 shield+0.029
 badly+0.029
 job+0.028
 possession+0.027
-+0.027
Neg logit contributions
Those-0.048
ill-0.042
of-0.041
These-0.040
 are-0.039
Token corner
Token ID5228
Feature activation-4.99e-02

Pos logit contributions
Those+0.024
ill+0.020
of+0.019
 fourteen+0.018
These+0.018
Neg logit contributions
 badly-0.025
 shield-0.025
 job-0.024
 possession-0.024
--0.024
Token.
Token ID13
Feature activation-6.91e-02

Pos logit contributions
 badly+0.069
 shield+0.067
 job+0.066
-+0.065
 possession+0.064
Neg logit contributions
Those-0.104
ill-0.090
of-0.087
 fourteen-0.084
Kids-0.084
Token He
Token ID679
Feature activation-6.51e-02

Pos logit contributions
 badly+0.114
 shield+0.112
 job+0.109
-+0.106
 possession+0.105
Neg logit contributions
Those-0.121
ill-0.105
of-0.100
 fourteen-0.096
These-0.095
Token had
Token ID550
Feature activation-4.18e-02

Pos logit contributions
 badly+0.103
 shield+0.102
 job+0.101
 possession+0.098
-+0.097
Neg logit contributions
Those-0.119
ill-0.105
of-0.099
 fourteen-0.094
These-0.093
Token a
Token ID257
Feature activation-8.75e-02

Pos logit contributions
 badly+0.058
 shield+0.057
 job+0.056
-+0.054
 possession+0.054
Neg logit contributions
Those-0.086
ill-0.077
of-0.073
These-0.070
 fourteen-0.070
Token very
Token ID845
Feature activation-1.79e-01

Pos logit contributions
 badly+0.115
 possession+0.113
-+0.112
 active+0.107
rily+0.106
Neg logit contributions
Those-0.158
 are-0.137
Is-0.136
ill-0.136
 end-0.133
Token kind
Token ID1611
Feature activation-9.95e-02

Pos logit contributions
 shield+0.215
 Amen+0.214
 badly+0.212
 possession+0.210
gy+0.209
Neg logit contributions
Those-0.335
ila-0.306
ill-0.289
 are-0.289
 fourteen-0.279
Token and
Token ID290
Feature activation-9.71e-02

Pos logit contributions
 badly+0.139
 active+0.130
 shield+0.129
 job+0.127
 possession+0.127
Neg logit contributions
Those-0.180
ill-0.177
of-0.164
 are-0.156
oms-0.154
Token thoughtful
Token ID22677
Feature activation-1.02e-01

Pos logit contributions
 badly+0.122
 job+0.115
 possession+0.115
 shield+0.113
 Amen+0.111
Neg logit contributions
Those-0.199
ill-0.172
 are-0.165
of-0.164
These-0.164
Token face
Token ID1986
Feature activation-3.85e-02

Pos logit contributions
 badly+0.150
 active+0.138
 shield+0.133
eh+0.133
 Amen+0.132
Neg logit contributions
Those-0.171
ill-0.160
of-0.154
 are-0.152
ion-0.149
Token.
Token ID13
Feature activation-4.60e-02

Pos logit contributions
 badly+0.045
 shield+0.045
 job+0.043
-+0.043
 possession+0.041
Neg logit contributions
Those-0.088
ill-0.079
of-0.075
These-0.073
Kids-0.072
Token time
Token ID640
Feature activation-2.80e-02

Pos logit contributions
 badly+0.103
 possession+0.103
 shield+0.103
 job+0.099
 Amen+0.095
Neg logit contributions
Those-0.098
ill-0.091
of-0.087
 fourteen-0.079
These-0.077
Token,
Token ID11
Feature activation-9.99e-03

Pos logit contributions
 badly+0.030
 shield+0.029
 job+0.029
-+0.028
 possession+0.027
Neg logit contributions
Those-0.066
ill-0.060
of-0.057
These-0.056
 fourteen-0.055
Token there
Token ID612
Feature activation-1.37e-03

Pos logit contributions
 shield+0.013
 job+0.013
 possession+0.013
 badly+0.013
 active+0.012
Neg logit contributions
Those-0.021
ill-0.019
of-0.017
 fourteen-0.017
These-0.017
Token was
Token ID373
Feature activation-6.40e-02

Pos logit contributions
-+0.001
 badly+0.001
 happily+0.001
 job+0.001
 shield+0.001
Neg logit contributions
Those-0.003
 fourteen-0.003
ives-0.003
oms-0.003
 thirteen-0.003
Token a
Token ID257
Feature activation-1.07e-01

Pos logit contributions
 possession+0.114
 shield+0.112
 badly+0.110
 job+0.109
ouse+0.107
Neg logit contributions
Those-0.110
ill-0.094
These-0.092
 fourteen-0.083
 are-0.082
Token very
Token ID845
Feature activation-1.79e-01

Pos logit contributions
 possession+0.179
 badly+0.166
ages+0.159
 armour+0.159
 armor+0.156
Neg logit contributions
Those-0.181
Ben-0.152
Is-0.152
ila-0.152
Kids-0.150
Token successful
Token ID4388
Feature activation-1.17e-01

Pos logit contributions
 possession+0.269
 badly+0.265
 armour+0.247
 shield+0.246
 armor+0.246
Neg logit contributions
ila-0.293
Those-0.282
ulu-0.260
oms-0.252
 end-0.242
Token duck
Token ID22045
Feature activation-4.38e-02

Pos logit contributions
 badly+0.193
rily+0.176
 possession+0.172
 Amen+0.169
ages+0.162
Neg logit contributions
Those-0.170
 end-0.164
ila-0.160
ill-0.159
of-0.155
Token.
Token ID13
Feature activation-5.16e-02

Pos logit contributions
 shield+0.063
 job+0.060
 badly+0.058
 armour+0.058
 armor+0.057
Neg logit contributions
Those-0.085
ill-0.075
of-0.073
 are-0.070
These-0.069
Token He
Token ID679
Feature activation-5.53e-02

Pos logit contributions
 badly+0.076
 shield+0.074
 job+0.072
 Amen+0.071
 Twe+0.070
Neg logit contributions
Those-0.095
ill-0.086
of-0.083
oms-0.081
 fourteen-0.079
Token lived
Token ID5615
Feature activation-2.06e-02

Pos logit contributions
 job+0.097
 shield+0.096
 possession+0.095
 badly+0.094
 appetite+0.089
Neg logit contributions
Those-0.089
 are-0.074
of-0.073
These-0.071
Kids-0.070
Token very
Token ID845
Feature activation-1.95e-01

Pos logit contributions
 possession+0.187
 badly+0.184
 active+0.168
-+0.168
rily+0.168
Neg logit contributions
Those-0.251
of-0.225
ill-0.225
 are-0.225
 end-0.222
Token nice
Token ID3621
Feature activation-1.18e-01

Pos logit contributions
gy+0.267
 possession+0.266
 shield+0.266
 badly+0.263
amas+0.255
Neg logit contributions
ila-0.333
Those-0.332
of-0.295
ill-0.290
oms-0.285
Token house
Token ID2156
Feature activation-5.25e-02

Pos logit contributions
 badly+0.172
 active+0.159
 Amen+0.154
 shield+0.154
 possession+0.151
Neg logit contributions
Those-0.202
ill-0.194
of-0.193
 are-0.179
Kids-0.178
Token and
Token ID290
Feature activation-9.01e-02

Pos logit contributions
 shield+0.065
 badly+0.064
 job+0.062
 possession+0.060
 active+0.060
Neg logit contributions
Those-0.112
ill-0.103
of-0.096
 fourteen-0.094
ion-0.093
Token a
Token ID257
Feature activation-9.45e-02

Pos logit contributions
 badly+0.143
 shield+0.138
 possession+0.137
 job+0.137
-+0.136
Neg logit contributions
Those-0.163
ill-0.142
of-0.133
 are-0.132
These-0.129
Token very
Token ID845
Feature activation-1.78e-01

Pos logit contributions
 badly+0.116
 possession+0.113
-+0.108
 shield+0.106
 active+0.106
Neg logit contributions
Those-0.195
ill-0.173
of-0.168
Is-0.164
 are-0.163
Token nice
Token ID3621
Feature activation-1.08e-01

Pos logit contributions
 badly+0.235
 shield+0.230
gy+0.225
 possession+0.224
 armor+0.212
Neg logit contributions
Those-0.300
ila-0.295
 are-0.266
oms-0.263
ill-0.263
Token lion
Token ID18744
Feature activation-4.53e-02

Pos logit contributions
 badly+0.153
 active+0.141
 Amen+0.137
 shield+0.136
 possession+0.133
Neg logit contributions
Those-0.186
ill-0.179
of-0.172
ion-0.164
 are-0.163
Token.
Token ID13
Feature activation-6.39e-02

Pos logit contributions
 job+0.064
 badly+0.062
 active+0.061
 possession+0.060
 shield+0.060
Neg logit contributions
Those-0.087
ill-0.079
of-0.076
 are-0.073
ion-0.072
Token They
Token ID1119
Feature activation-4.23e-02

Pos logit contributions
 badly+0.103
 shield+0.103
 job+0.099
 active+0.097
 possession+0.097
Neg logit contributions
Those-0.111
ill-0.098
of-0.092
 fourteen-0.088
oms-0.088
Token said
Token ID531
Feature activation-2.98e-02

Pos logit contributions
 shield+0.071
 badly+0.068
 job+0.067
-+0.065
 active+0.064
Neg logit contributions
Those-0.076
ill-0.067
of-0.061
Kids-0.060
Is-0.060
Token Anna
Token ID11735
Feature activation-3.22e-02

Pos logit contributions
 badly+0.026
 shield+0.026
 job+0.025
-+0.024
 possession+0.024
Neg logit contributions
Those-0.032
ill-0.028
of-0.026
 fourteen-0.025
These-0.025
Token.
Token ID13
Feature activation-4.04e-02

Pos logit contributions
 shield+0.040
 badly+0.040
 job+0.040
 possession+0.037
 active+0.037
Neg logit contributions
Those-0.070
ill-0.063
of-0.060
These-0.057
Kids-0.057
Token She
Token ID1375
Feature activation-5.17e-02

Pos logit contributions
 badly+0.062
 shield+0.061
 job+0.060
 active+0.059
-+0.058
Neg logit contributions
Those-0.071
ill-0.063
ion-0.059
of-0.059
 fourteen-0.059
Token had
Token ID550
Feature activation-5.08e-02

Pos logit contributions
 badly+0.090
 shield+0.089
 job+0.088
 possession+0.088
 appetite+0.085
Neg logit contributions
Those-0.083
 are-0.073
Kids-0.070
These-0.069
of-0.068
Token a
Token ID257
Feature activation-9.90e-02

Pos logit contributions
 shield+0.075
 badly+0.073
 job+0.072
 possession+0.071
 appetite+0.071
Neg logit contributions
Those-0.102
These-0.085
ill-0.085
of-0.080
 fourteen-0.078
Token very
Token ID845
Feature activation-1.78e-01

Pos logit contributions
rily+0.164
 badly+0.145
ages+0.144
 Amen+0.142
 possession+0.142
Neg logit contributions
Those-0.141
 end-0.127
 are-0.123
These-0.117
 last-0.115
Token kind
Token ID1611
Feature activation-9.90e-02

Pos logit contributions
rily+0.271
 Amen+0.266
 badly+0.265
 shield+0.263
 armor+0.257
Neg logit contributions
ila-0.271
Those-0.258
 are-0.221
oms-0.219
of-0.219
Token aunt
Token ID25949
Feature activation-3.32e-02

Pos logit contributions
 badly+0.133
 Amen+0.123
 active+0.122
 shield+0.122
 possession+0.116
Neg logit contributions
Those-0.181
ill-0.177
of-0.169
 are-0.162
ion-0.155
Token who
Token ID508
Feature activation-2.23e-02

Pos logit contributions
 badly+0.040
-+0.040
 shield+0.039
 job+0.038
 possession+0.037
Neg logit contributions
Those-0.074
ill-0.066
of-0.063
 fourteen-0.063
Is-0.061
Token always
Token ID1464
Feature activation-3.62e-02

Pos logit contributions
-+0.020
 badly+0.020
 job+0.019
 shield+0.018
 active+0.018
Neg logit contributions
Those-0.056
ill-0.054
 fourteen-0.050
of-0.049
oms-0.049
Token made
Token ID925
Feature activation+2.82e-03

Pos logit contributions
 badly+0.046
 job+0.046
-+0.045
 shield+0.044
 active+0.044
Neg logit contributions
Those-0.079
ill-0.072
of-0.068
 fourteen-0.067
ion-0.066
Token Julie
Token ID21946
Feature activation-4.27e-02

Pos logit contributions
 badly+0.063
 shield+0.063
 job+0.061
-+0.060
 possession+0.059
Neg logit contributions
Those-0.086
ill-0.075
of-0.071
These-0.069
 fourteen-0.069
Token.
Token ID13
Feature activation-5.74e-02

Pos logit contributions
 badly+0.047
 shield+0.047
-+0.045
 job+0.043
 possession+0.042
Neg logit contributions
Those-0.101
ill-0.089
These-0.085
Kids-0.084
of-0.084
Token She
Token ID1375
Feature activation-6.52e-02

Pos logit contributions
 Amen+0.082
 shield+0.081
 job+0.077
-+0.076
 badly+0.076
Neg logit contributions
Those-0.106
 fourteen-0.094
ion-0.090
ill-0.089
These-0.088
Token was
Token ID373
Feature activation-7.06e-02

Pos logit contributions
 badly+0.101
 shield+0.100
 job+0.097
-+0.097
 possession+0.094
Neg logit contributions
Those-0.124
ill-0.109
of-0.102
 fourteen-0.099
These-0.099
Token a
Token ID257
Feature activation-1.02e-01

Pos logit contributions
 shield+0.081
 badly+0.079
 job+0.077
 possession+0.076
-+0.074
Neg logit contributions
Those-0.161
ill-0.145
of-0.135
 fourteen-0.133
These-0.133
Token very
Token ID845
Feature activation-1.78e-01

Pos logit contributions
 badly+0.132
ages+0.129
rily+0.126
 possession+0.123
amas+0.122
Neg logit contributions
Those-0.187
Is-0.164
 end-0.162
These-0.150
Three-0.149
Token polite
Token ID23507
Feature activation-1.12e-01

Pos logit contributions
 shield+0.236
 badly+0.220
 possession+0.219
 armor+0.214
 armour+0.214
Neg logit contributions
Those-0.349
ila-0.313
ill-0.289
Is-0.278
Kids-0.278
Token and
Token ID290
Feature activation-1.09e-01

Pos logit contributions
 badly+0.161
 shield+0.156
 job+0.148
 possession+0.145
 Amen+0.144
Neg logit contributions
Those-0.208
ill-0.195
of-0.180
Kids-0.172
oms-0.169
Token kind
Token ID1611
Feature activation-8.42e-02

Pos logit contributions
 shield+0.132
 badly+0.129
 job+0.125
 possession+0.122
-+0.119
Neg logit contributions
Those-0.237
ill-0.201
of-0.197
These-0.195
Is-0.192
Token girl
Token ID2576
Feature activation-6.16e-02

Pos logit contributions
 badly+0.112
 shield+0.111
 job+0.107
 active+0.102
 possession+0.102
Neg logit contributions
Those-0.171
ill-0.156
of-0.149
Kids-0.139
ion-0.138
Token who
Token ID508
Feature activation-2.66e-02

Pos logit contributions
 badly+0.073
 shield+0.072
 job+0.070
-+0.068
 possession+0.067
Neg logit contributions
Those-0.138
ill-0.125
of-0.118
 fourteen-0.115
These-0.114
Token asked
Token ID1965
Feature activation+3.70e-02

Pos logit contributions
 possession+0.011
 job+0.011
 badly+0.011
 shield+0.011
-+0.010
Neg logit contributions
Those-0.010
ill-0.009
of-0.008
Kids-0.008
ila-0.008
Token her
Token ID607
Feature activation+1.07e-02

Pos logit contributions
Those+0.071
ill+0.067
of+0.061
ion+0.057
 are+0.057
Neg logit contributions
 shield-0.053
 job-0.052
 possession-0.051
 active-0.049
 badly-0.048
Token fluffy
Token ID39145
Feature activation-4.30e-02

Pos logit contributions
Those+0.023
ill+0.021
of+0.020
These+0.019
 fourteen+0.019
Neg logit contributions
 badly-0.014
 shield-0.013
 job-0.013
 possession-0.013
 active-0.012
Token bunny
Token ID44915
Feature activation+7.56e-03

Pos logit contributions
 badly+0.068
 job+0.065
pron+0.064
 active+0.062
eh+0.062
Neg logit contributions
ill-0.066
Those-0.065
of-0.059
 end-0.056
 are-0.052
Token.
Token ID13
Feature activation-1.03e-02

Pos logit contributions
ill+0.013
Those+0.013
of+0.011
 end+0.010
m+0.010
Neg logit contributions
 job-0.013
 possession-0.012
pron-0.011
 shield-0.011
 active-0.011
Token
Token ID198
Feature activation+5.10e-02

Pos logit contributions
 badly+0.018
 shield+0.018
 job+0.018
-+0.018
 possession+0.017
Neg logit contributions
Those-0.017
ill-0.014
of-0.013
 fourteen-0.013
These-0.013
Token
Token ID198
Feature activation+4.88e-02

Pos logit contributions
Those+0.086
ill+0.076
 fourteen+0.071
oms+0.067
ount+0.067
Neg logit contributions
 badly-0.088
--0.088
 shield-0.084
 job-0.082
 Amen-0.080
TokenThe
Token ID464
Feature activation+9.60e-02

Pos logit contributions
Those+0.082
ill+0.071
of+0.066
 fourteen+0.063
These+0.063
Neg logit contributions
 badly-0.086
 shield-0.086
 job-0.084
--0.082
 possession-0.082
Token bunny
Token ID44915
Feature activation+1.23e-02

Pos logit contributions
Those+0.226
These+0.190
of+0.189
Is+0.188
ill+0.187
Neg logit contributions
 possession-0.101
 badly-0.101
 job-0.097
--0.093
 shield-0.091
Token just
Token ID655
Feature activation+1.26e-02

Pos logit contributions
Those+0.023
ill+0.020
of+0.019
Is+0.018
 fourteen+0.018
Neg logit contributions
 job-0.019
 badly-0.019
 shield-0.019
 possession-0.019
 active-0.018
Token shook
Token ID14682
Feature activation+1.78e-02

Pos logit contributions
Those+0.027
ill+0.023
These+0.022
Kids+0.022
ion+0.022
Neg logit contributions
 shield-0.017
 badly-0.017
--0.016
 loudly-0.015
 happily-0.015
Token it
Token ID340
Feature activation-2.45e-03

Pos logit contributions
Those+0.037
ill+0.032
of+0.030
These+0.029
 fourteen+0.029
Neg logit contributions
 badly-0.030
 shield-0.030
 job-0.029
 possession-0.028
--0.028
Token tastes
Token ID18221
Feature activation-5.01e-03

Pos logit contributions
 shield+0.004
-+0.004
 job+0.004
 badly+0.004
 active+0.004
Neg logit contributions
Those-0.004
ill-0.004
Kids-0.004
of-0.004
These-0.003
Token like
Token ID588
Feature activation+7.02e-03

Pos logit contributions
 badly+0.007
 shield+0.007
 job+0.007
 possession+0.007
-+0.007
Neg logit contributions
Those-0.010
ill-0.009
of-0.008
 fourteen-0.008
These-0.008
Token."
Token ID526
Feature activation-5.65e-03

Pos logit contributions
Those+0.016
ill+0.015
 fourteen+0.014
of+0.014
ion+0.014
Neg logit contributions
 shield-0.008
 badly-0.008
 possession-0.007
 job-0.007
 active-0.007
Token
Token ID198
Feature activation+5.56e-02

Pos logit contributions
 badly+0.010
 shield+0.010
 job+0.010
-+0.010
 possession+0.010
Neg logit contributions
Those-0.009
ill-0.008
of-0.007
 fourteen-0.007
These-0.007
Token
Token ID198
Feature activation+5.08e-02

Pos logit contributions
Those+0.092
ill+0.083
 fourteen+0.077
 thirteen+0.072
ount+0.072
Neg logit contributions
 badly-0.097
--0.095
 shield-0.094
 job-0.092
 active-0.090
TokenThe
Token ID464
Feature activation+9.42e-02

Pos logit contributions
Those+0.065
ill+0.063
 fourteen+0.057
oms+0.056
ount+0.054
Neg logit contributions
 badly-0.104
 active-0.100
--0.098
 shield-0.098
 job-0.097
Token family
Token ID1641
Feature activation+3.48e-02

Pos logit contributions
Those+0.225
ill+0.198
of+0.189
These+0.189
Kids+0.188
Neg logit contributions
 badly-0.105
 shield-0.099
 possession-0.097
 job-0.093
--0.090
Token put
Token ID1234
Feature activation+4.54e-02

Pos logit contributions
Those+0.067
ill+0.059
of+0.056
 fourteen+0.054
These+0.054
Neg logit contributions
 shield-0.052
 badly-0.051
--0.051
 job-0.050
 possession-0.048
Token the
Token ID262
Feature activation+8.22e-02

Pos logit contributions
Those+0.092
ill+0.083
of+0.077
Kids+0.076
Is+0.075
Neg logit contributions
--0.065
 job-0.064
 shield-0.063
 badly-0.061
 possession-0.059
Token stew
Token ID20798
Feature activation-1.76e-02

Pos logit contributions
Those+0.163
ill+0.139
These+0.134
of+0.132
 are+0.132
Neg logit contributions
 badly-0.124
 possession-0.110
 shield-0.110
 Amen-0.104
 active-0.104
Token them
Token ID606
Feature activation+3.47e-02

Pos logit contributions
Those+0.069
ill+0.058
of+0.054
Kids+0.051
These+0.051
Neg logit contributions
 shield-0.088
 badly-0.087
 job-0.085
--0.085
 possession-0.083
Token on
Token ID319
Feature activation+1.37e-02

Pos logit contributions
Those+0.069
ill+0.065
of+0.059
 are+0.059
oms+0.058
Neg logit contributions
 Amen-0.047
 badly-0.047
 possession-0.047
 job-0.046
 appetite-0.045
Token top
Token ID1353
Feature activation+3.33e-02

Pos logit contributions
Those+0.026
ill+0.023
of+0.022
 fourteen+0.021
These+0.021
Neg logit contributions
 badly-0.021
 shield-0.021
 job-0.020
--0.020
 possession-0.020
Token.
Token ID13
Feature activation-2.79e-03

Pos logit contributions
Those+0.067
ill+0.058
of+0.056
Kids+0.055
 fourteen+0.055
Neg logit contributions
 badly-0.047
 shield-0.047
--0.046
 job-0.046
 possession-0.044
Token They
Token ID1119
Feature activation+1.12e-02

Pos logit contributions
 cozy+0.003
 happily+0.003
 badly+0.003
 shield+0.003
 healthy+0.003
Neg logit contributions
Those-0.006
of-0.005
ill-0.005
 fourteen-0.005
Warning-0.005
Token used
Token ID973
Feature activation+3.91e-02

Pos logit contributions
Those+0.022
Kids+0.018
ion+0.018
ill+0.018
of+0.018
Neg logit contributions
 shield-0.017
 happily-0.016
 badly-0.015
--0.015
 job-0.014
Token tape
Token ID9154
Feature activation+2.18e-02

Pos logit contributions
Those+0.082
Kids+0.071
ill+0.071
of+0.069
Is+0.069
Neg logit contributions
 shield-0.055
 badly-0.052
--0.051
 job-0.050
 possession-0.049
Token to
Token ID284
Feature activation+6.58e-03

Pos logit contributions
Those+0.061
Kids+0.055
These+0.054
Is+0.054
Warning+0.053
Neg logit contributions
--0.019
 shield-0.015
 badly-0.013
 job-0.013
 possession-0.013
Token hold
Token ID1745
Feature activation+4.87e-02

Pos logit contributions
Those+0.013
Kids+0.012
oke+0.011
ill+0.011
ount+0.011
Neg logit contributions
 shield-0.010
 job-0.010
--0.009
 badly-0.009
 healthy-0.009
Token them
Token ID606
Feature activation+4.16e-02

Pos logit contributions
Those+0.104
ill+0.090
Kids+0.090
Is+0.089
Warning+0.086
Neg logit contributions
--0.067
 badly-0.064
 shield-0.061
 job-0.061
 possession-0.059
Token together
Token ID1978
Feature activation-8.28e-04

Pos logit contributions
Those+0.089
ill+0.082
of+0.078
These+0.073
 fourteen+0.073
Neg logit contributions
 badly-0.051
 shield-0.050
 job-0.050
 possession-0.049
 Amen-0.048
Token home
Token ID1363
Feature activation+3.35e-02

Pos logit contributions
Those+0.051
ill+0.047
of+0.045
ion+0.043
 fourteen+0.042
Neg logit contributions
 shield-0.038
 badly-0.037
 possession-0.036
 Amen-0.035
 appetite-0.035
Token to
Token ID284
Feature activation+2.62e-02

Pos logit contributions
Those+0.068
ill+0.063
of+0.058
oms+0.055
ion+0.055
Neg logit contributions
 shield-0.046
 job-0.042
 badly-0.042
 possession-0.042
 active-0.040
Token tell
Token ID1560
Feature activation+5.38e-02

Pos logit contributions
Those+0.046
ill+0.041
of+0.038
 fourteen+0.036
These+0.036
Neg logit contributions
 badly-0.043
 shield-0.042
 job-0.042
--0.041
 possession-0.041
Token her
Token ID607
Feature activation+3.85e-02

Pos logit contributions
Those+0.113
ill+0.101
of+0.095
 fourteen+0.092
These+0.092
Neg logit contributions
 badly-0.073
 shield-0.073
 job-0.071
--0.070
 possession-0.068
Token mom
Token ID1995
Feature activation+6.18e-02

Pos logit contributions
Those+0.076
ill+0.067
of+0.063
These+0.060
ion+0.060
Neg logit contributions
 badly-0.056
 shield-0.055
 job-0.054
 possession-0.053
--0.052
Tokenmy
Token ID1820
Feature activation+5.40e-02

Pos logit contributions
Those+0.140
Kids+0.123
Warning+0.121
ill+0.121
Stop+0.121
Neg logit contributions
 badly-0.074
--0.071
 shield-0.068
 job-0.064
 possession-0.061
Token all
Token ID477
Feature activation+4.22e-02

Pos logit contributions
Those+0.119
ill+0.105
Kids+0.102
 fourteen+0.101
Is+0.101
Neg logit contributions
 badly-0.067
 shield-0.066
--0.064
 job-0.062
 possession-0.059
Token about
Token ID546
Feature activation+4.37e-02

Pos logit contributions
Those+0.088
ill+0.078
of+0.074
 fourteen+0.072
These+0.072
Neg logit contributions
 badly-0.057
 shield-0.057
 job-0.055
--0.054
 possession-0.054
Token her
Token ID607
Feature activation+2.44e-02

Pos logit contributions
Those+0.080
ill+0.070
of+0.066
 fourteen+0.063
These+0.063
Neg logit contributions
 badly-0.071
 shield-0.070
 job-0.069
--0.067
 possession-0.067
Token wonderful
Token ID7932
Feature activation-7.59e-03

Pos logit contributions
Those+0.052
ill+0.046
of+0.045
These+0.043
oms+0.043
Neg logit contributions
 badly-0.028
 shield-0.028
 possession-0.026
 job-0.026
 active-0.026
Token trip
Token ID5296
Feature activation+2.52e-02

Pos logit contributions
 badly+0.011
 shield+0.009
 Amen+0.009
 active+0.009
 possession+0.009
Neg logit contributions
Those-0.014
oms-0.012
of-0.012
ill-0.012
Kids-0.012
Token replied
Token ID8712
Feature activation-1.21e-02

Pos logit contributions
Those+0.026
ill+0.023
of+0.022
These+0.021
 fourteen+0.021
Neg logit contributions
 badly-0.019
 shield-0.019
--0.018
 job-0.018
 active-0.018
Token,
Token ID11
Feature activation-1.99e-02

Pos logit contributions
 shield+0.019
 possession+0.019
 job+0.018
 active+0.018
 Amen+0.017
Neg logit contributions
Those-0.021
ill-0.020
of-0.018
 fourteen-0.018
oms-0.017
Token "
Token ID366
Feature activation+2.28e-02

Pos logit contributions
 possession+0.035
 shield+0.035
 job+0.035
 badly+0.035
 active+0.033
Neg logit contributions
Those-0.033
ill-0.028
Is-0.026
 fourteen-0.026
of-0.026
TokenTo
Token ID2514
Feature activation+5.21e-03

Pos logit contributions
oms+0.033
Those+0.031
ill+0.031
ion+0.029
 halftime+0.029
Neg logit contributions
 badly-0.041
 active-0.038
 possession-0.037
 job-0.036
--0.036
Token catch
Token ID4929
Feature activation+2.16e-02

Pos logit contributions
Those+0.011
ill+0.009
These+0.009
Is+0.008
of+0.008
Neg logit contributions
 job-0.008
 shield-0.008
 badly-0.008
--0.007
 active-0.007
Token the
Token ID262
Feature activation+3.76e-02

Pos logit contributions
Those+0.044
ill+0.039
of+0.037
 fourteen+0.035
These+0.035
Neg logit contributions
 badly-0.030
 shield-0.030
 job-0.030
--0.029
 possession-0.029
Token sunset
Token ID26428
Feature activation-1.43e-02

Pos logit contributions
Those+0.080
ill+0.071
of+0.067
 fourteen+0.066
These+0.065
Neg logit contributions
 badly-0.048
 shield-0.046
 job-0.045
 possession-0.045
--0.045
Token!"
Token ID2474
Feature activation-2.26e-02

Pos logit contributions
 badly+0.014
 shield+0.014
 job+0.014
 possession+0.013
-+0.013
Neg logit contributions
Those-0.035
ill-0.032
of-0.030
 fourteen-0.029
These-0.029
Token And
Token ID843
Feature activation-2.58e-02

Pos logit contributions
 badly+0.039
 shield+0.038
-+0.038
 job+0.037
 possession+0.036
Neg logit contributions
Those-0.038
ill-0.033
of-0.031
 fourteen-0.030
oms-0.029
Token with
Token ID351
Feature activation+2.23e-02

Pos logit contributions
 badly+0.034
 shield+0.034
 job+0.033
-+0.032
 possession+0.032
Neg logit contributions
Those-0.054
ill-0.049
of-0.046
 fourteen-0.044
These-0.044
Token a
Token ID257
Feature activation-5.20e-02

Pos logit contributions
Those+0.045
ill+0.040
of+0.038
These+0.036
 fourteen+0.036
Neg logit contributions
 badly-0.032
 shield-0.032
 job-0.031
--0.030
 possession-0.030
Token couldn
Token ID3521
Feature activation-1.39e-02

Pos logit contributions
 badly+0.083
 shield+0.081
 job+0.080
-+0.077
 active+0.076
Neg logit contributions
Those-0.116
ill-0.106
of-0.098
 fourteen-0.095
These-0.094
Token't
Token ID470
Feature activation+1.12e-02

Pos logit contributions
 badly+0.023
-+0.022
 job+0.022
 shield+0.021
 Amen+0.019
Neg logit contributions
Those-0.026
ill-0.023
of-0.021
Kids-0.021
These-0.020
Token do
Token ID466
Feature activation-1.68e-02

Pos logit contributions
Those+0.022
ill+0.019
Kids+0.018
ion+0.018
Is+0.018
Neg logit contributions
 badly-0.018
 job-0.017
 shield-0.017
--0.016
 active-0.016
Token any
Token ID597
Feature activation+1.30e-02

Pos logit contributions
 badly+0.024
 shield+0.024
 job+0.023
-+0.022
 possession+0.022
Neg logit contributions
Those-0.034
ill-0.030
of-0.029
These-0.027
 fourteen-0.027
Token harm
Token ID4419
Feature activation+2.83e-02

Pos logit contributions
Those+0.027
ill+0.024
of+0.023
 fourteen+0.022
These+0.022
Neg logit contributions
 badly-0.018
 shield-0.018
 job-0.017
--0.017
 possession-0.017
Token.
Token ID13
Feature activation-3.45e-02

Pos logit contributions
Those+0.068
ill+0.061
 fourteen+0.058
These+0.057
Is+0.057
Neg logit contributions
 badly-0.032
--0.030
 shield-0.029
 job-0.028
 possession-0.027
Token
Token ID220
Feature activation+1.09e-02

Pos logit contributions
 badly+0.058
 job+0.055
 possession+0.055
 shield+0.055
-+0.054
Neg logit contributions
Those-0.059
ill-0.052
of-0.048
 fourteen-0.048
oms-0.047
Token
Token ID198
Feature activation+3.71e-02

Pos logit contributions
Those+0.020
 fourteen+0.017
ill+0.016
These+0.016
of+0.016
Neg logit contributions
 badly-0.018
--0.016
 job-0.016
 happily-0.015
 shield-0.015
Token
Token ID198
Feature activation+3.08e-02

Pos logit contributions
Those+0.066
ill+0.061
 fourteen+0.060
oms+0.058
 thirteen+0.056
Neg logit contributions
 badly-0.058
--0.053
 happily-0.053
 active-0.052
 Amen-0.051
TokenLuc
Token ID22946
Feature activation+1.39e-02

Pos logit contributions
Those+0.043
ill+0.041
oms+0.039
 fourteen+0.037
oes+0.035
Neg logit contributions
 badly-0.061
 active-0.057
--0.055
 shield-0.054
 job-0.053
Tokenas
Token ID292
Feature activation+4.10e-02

Pos logit contributions
Those+0.025
These+0.020
Is+0.020
Warning+0.019
Kids+0.019
Neg logit contributions
--0.025
 badly-0.024
 shield-0.023
 possession-0.022
 job-0.022
Token is
Token ID318
Feature activation+5.03e-03

Pos logit contributions
-+0.015
 shield+0.014
 badly+0.014
 job+0.014
 active+0.013
Neg logit contributions
Those-0.013
ill-0.011
of-0.010
These-0.010
 fourteen-0.009
Token glad
Token ID9675
Feature activation-7.37e-03

Pos logit contributions
Those+0.012
ill+0.011
These+0.010
of+0.010
 fourteen+0.010
Neg logit contributions
 shield-0.006
 job-0.005
 badly-0.005
 possession-0.005
--0.005
Token they
Token ID484
Feature activation+2.49e-02

Pos logit contributions
 badly+0.010
 shield+0.009
-+0.009
 job+0.009
 possession+0.009
Neg logit contributions
Those-0.016
ill-0.014
of-0.013
These-0.013
 fourteen-0.013
Token are
Token ID389
Feature activation+2.07e-02

Pos logit contributions
Those+0.029
ill+0.026
Is+0.022
 fourteen+0.021
ion+0.020
Neg logit contributions
 badly-0.056
 shield-0.055
 job-0.055
--0.054
 active-0.052
Token safe
Token ID3338
Feature activation-8.89e-03

Pos logit contributions
Those+0.047
ill+0.042
 are+0.040
These+0.039
ila+0.039
Neg logit contributions
 possession-0.022
 shield-0.022
 job-0.021
 armour-0.020
 badly-0.020
Token and
Token ID290
Feature activation-3.18e-02

Pos logit contributions
 shield+0.013
 job+0.012
 possession+0.012
 badly+0.012
 active+0.011
Neg logit contributions
Those-0.018
ill-0.016
of-0.014
 are-0.014
These-0.014
Token honest
Token ID5508
Feature activation-1.48e-02

Pos logit contributions
 shield+0.033
 badly+0.032
 job+0.031
 possession+0.030
-+0.030
Neg logit contributions
Those-0.077
ill-0.068
of-0.066
These-0.064
Kids-0.063
Token.
Token ID13
Feature activation-8.62e-03

Pos logit contributions
 shield+0.018
 badly+0.018
 job+0.017
 possession+0.017
 active+0.017
Neg logit contributions
Those-0.033
ill-0.030
of-0.028
These-0.027
 fourteen-0.027
Token She
Token ID1375
Feature activation-4.18e-03

Pos logit contributions
 badly+0.013
 possession+0.012
 shield+0.012
-+0.012
 job+0.012
Neg logit contributions
Those-0.016
ill-0.014
of-0.014
 fourteen-0.014
oms-0.013
Token smiles
Token ID21845
Feature activation+1.51e-02

Pos logit contributions
 shield+0.008
-+0.008
 badly+0.008
 job+0.008
 active+0.007
Neg logit contributions
Those-0.006
ill-0.006
of-0.005
These-0.005
 fourteen-0.005
Token.
Token ID13
Feature activation-1.61e-03

Pos logit contributions
Those+0.037
ill+0.033
of+0.032
 fourteen+0.031
Kids+0.031
Neg logit contributions
 badly-0.016
 shield-0.015
--0.015
 job-0.015
 active-0.014
Token help
Token ID1037
Feature activation+4.84e-02

Pos logit contributions
Those+0.016
ill+0.014
Kids+0.014
Is+0.013
ion+0.012
Neg logit contributions
 shield-0.025
 job-0.024
 badly-0.023
--0.023
 active-0.023
Token the
Token ID262
Feature activation+9.09e-02

Pos logit contributions
Those+0.096
ill+0.088
 fourteen+0.082
Is+0.082
Kids+0.081
Neg logit contributions
--0.072
 job-0.070
 shield-0.069
 badly-0.068
 active-0.065
Token fish
Token ID5916
Feature activation+5.45e-02

Pos logit contributions
Those+0.187
ill+0.162
of+0.154
These+0.151
 fourteen+0.150
Neg logit contributions
 badly-0.128
 shield-0.121
 job-0.120
 possession-0.120
--0.118
Token stay
Token ID2652
Feature activation+2.48e-02

Pos logit contributions
Those+0.106
ill+0.093
of+0.087
Kids+0.085
These+0.085
Neg logit contributions
 shield-0.082
 badly-0.081
 job-0.080
--0.078
 possession-0.076
Token smooth
Token ID7209
Feature activation-7.29e-03

Pos logit contributions
Those+0.039
 are+0.037
ill+0.036
 end+0.034
ures+0.033
Neg logit contributions
 possession-0.034
 badly-0.033
 shield-0.033
 bicycle-0.032
 Amen-0.032
Token and
Token ID290
Feature activation-2.90e-02

Pos logit contributions
 badly+0.012
 shield+0.012
 possession+0.012
 Amen+0.012
 job+0.012
Neg logit contributions
Those-0.011
ill-0.011
 fourteen-0.010
 are-0.009
of-0.009
Token protect
Token ID1805
Feature activation+5.58e-02

Pos logit contributions
 badly+0.042
 shield+0.041
 job+0.040
 possession+0.040
-+0.039
Neg logit contributions
Those-0.057
ill-0.050
of-0.048
These-0.046
 are-0.046
Token it
Token ID340
Feature activation+3.79e-02

Pos logit contributions
Those+0.104
ill+0.093
Kids+0.086
Is+0.086
 fourteen+0.084
Neg logit contributions
 badly-0.087
--0.087
 job-0.084
 shield-0.084
 possession-0.082
Token from
Token ID422
Feature activation+2.53e-02

Pos logit contributions
Those+0.076
ill+0.069
 are+0.066
of+0.065
 fourteen+0.063
Neg logit contributions
 possession-0.050
 badly-0.049
 job-0.047
 shield-0.047
 Amen-0.046
Token other
Token ID584
Feature activation+5.63e-02

Pos logit contributions
Those+0.046
ill+0.040
of+0.038
 fourteen+0.036
These+0.036
Neg logit contributions
 badly-0.040
 shield-0.040
 job-0.039
 possession-0.038
--0.038
Token animals
Token ID4695
Feature activation+6.76e-02

Pos logit contributions
Those+0.111
ill+0.099
of+0.092
 fourteen+0.089
These+0.089
Neg logit contributions
 shield-0.083
 badly-0.082
 job-0.080
--0.078
 possession-0.077
Token a
Token ID257
Feature activation-2.94e-02

Pos logit contributions
Those+0.038
Kids+0.035
ill+0.034
ount+0.033
These+0.032
Neg logit contributions
--0.015
 badly-0.015
 shield-0.014
 loudly-0.013
 job-0.013
Token button
Token ID4936
Feature activation-1.76e-02

Pos logit contributions
 shield+0.050
 badly+0.050
 job+0.049
-+0.049
 possession+0.047
Neg logit contributions
Those-0.051
ill-0.045
of-0.040
These-0.040
Kids-0.040
Token.
Token ID13
Feature activation-1.38e-02

Pos logit contributions
-+0.017
 badly+0.016
 shield+0.015
 job+0.014
 loudly+0.013
Neg logit contributions
Those-0.045
ill-0.040
Kids-0.040
These-0.039
of-0.039
Token A
Token ID317
Feature activation-1.33e-02

Pos logit contributions
 shield+0.021
 badly+0.021
 job+0.020
-+0.020
 possession+0.019
Neg logit contributions
Those-0.026
ill-0.023
of-0.022
These-0.021
Kids-0.021
Token bright
Token ID6016
Feature activation-6.33e-02

Pos logit contributions
 badly+0.019
 shield+0.019
 job+0.018
 possession+0.018
 active+0.018
Neg logit contributions
Those-0.026
ill-0.023
of-0.022
Is-0.021
 fourteen-0.021
Token flame
Token ID16506
Feature activation-2.90e-02

Pos logit contributions
 badly+0.093
 shield+0.090
 active+0.088
 job+0.087
 possession+0.086
Neg logit contributions
Those-0.122
ill-0.110
of-0.103
 fourteen-0.100
Is-0.098
Token comes
Token ID2058
Feature activation+2.25e-02

Pos logit contributions
-+0.046
 shield+0.045
 badly+0.044
 job+0.043
 possession+0.040
Neg logit contributions
Those-0.055
ill-0.048
of-0.045
These-0.045
Kids-0.045
Token out
Token ID503
Feature activation-8.09e-03

Pos logit contributions
Those+0.051
ill+0.045
of+0.043
Kids+0.043
These+0.043
Neg logit contributions
 badly-0.026
 shield-0.026
--0.026
 job-0.025
 possession-0.023
Token of
Token ID286
Feature activation+3.85e-02

Pos logit contributions
-+0.011
 badly+0.010
 shield+0.010
 loudly+0.010
 job+0.009
Neg logit contributions
Those-0.017
Kids-0.016
These-0.015
 halftime-0.015
of-0.014
Token the
Token ID262
Feature activation+5.32e-02

Pos logit contributions
Those+0.076
ill+0.066
of+0.064
Kids+0.061
These+0.061
Neg logit contributions
 shield-0.058
 badly-0.057
 job-0.056
 possession-0.054
--0.054
Token shiny
Token ID22441
Feature activation-3.28e-02

Pos logit contributions
Those+0.114
ill+0.102
of+0.096
 fourteen+0.094
These+0.094
Neg logit contributions
 badly-0.069
 shield-0.066
 job-0.065
--0.064
 possession-0.063
Token earthquakes
Token ID29781
Feature activation+2.47e-02

Pos logit contributions
Those+0.059
ill+0.051
of+0.048
 fourteen+0.046
These+0.046
Neg logit contributions
 badly-0.058
 shield-0.058
 job-0.057
 possession-0.055
--0.055
Token.
Token ID13
Feature activation-4.55e-03

Pos logit contributions
Those+0.062
ill+0.055
Kids+0.055
Is+0.054
These+0.054
Neg logit contributions
 badly-0.023
--0.023
 job-0.020
 shield-0.020
 loudly-0.018
Token<|endoftext|>
Token ID50256
Feature activation+3.39e-02

Pos logit contributions
 badly+0.009
 shield+0.008
 job+0.008
 possession+0.008
-+0.008
Neg logit contributions
Those-0.007
ill-0.006
of-0.006
 fourteen-0.005
These-0.005
TokenOnce
Token ID7454
Feature activation+5.23e-02

Pos logit contributions
Those+0.061
 fourteen+0.051
ill+0.051
of+0.049
oms+0.048
Neg logit contributions
--0.054
 badly-0.053
 shield-0.050
 happily-0.050
 job-0.050
Token upon
Token ID2402
Feature activation+3.29e-02

Pos logit contributions
Those+0.090
ill+0.076
of+0.071
These+0.071
 fourteen+0.071
Neg logit contributions
 badly-0.090
 shield-0.089
 job-0.087
--0.086
 possession-0.085
Token a
Token ID257
Feature activation-2.74e-02

Pos logit contributions
Those+0.050
ill+0.044
 are+0.043
These+0.040
of+0.040
Neg logit contributions
 job-0.063
 possession-0.063
 shield-0.059
 appetite-0.059
 active-0.059
Token time
Token ID640
Feature activation+7.73e-03

Pos logit contributions
 shield+0.048
 possession+0.046
 badly+0.046
 job+0.045
 armor+0.043
Neg logit contributions
Those-0.045
ill-0.041
of-0.040
 fourteen-0.036
These-0.035
Token,
Token ID11
Feature activation+2.37e-02

Pos logit contributions
Those+0.017
ill+0.016
of+0.015
 fourteen+0.015
These+0.014
Neg logit contributions
 shield-0.009
 badly-0.009
 job-0.009
 possession-0.008
--0.008
Token there
Token ID612
Feature activation+3.64e-02

Pos logit contributions
Those+0.046
ill+0.041
 are+0.038
These+0.037
of+0.037
Neg logit contributions
 shield-0.036
 possession-0.035
 job-0.035
 active-0.033
 badly-0.033
Token was
Token ID373
Feature activation-2.64e-02

Pos logit contributions
Those+0.078
 fourteen+0.067
ill+0.067
oms+0.066
ount+0.065
Neg logit contributions
 badly-0.049
--0.048
 job-0.045
 shield-0.045
 possession-0.043
Token a
Token ID257
Feature activation-6.67e-02

Pos logit contributions
ouse+0.051
 possession+0.050
 shield+0.049
 badly+0.049
uer+0.048
Neg logit contributions
Those-0.041
These-0.035
ill-0.034
 are-0.031
 fourteen-0.030
TokenR
Token ID49
Feature activation+1.97e-02

Pos logit contributions
ill+0.059
Those+0.056
oms+0.054
oes+0.051
ount+0.050
Neg logit contributions
 badly-0.067
 active-0.064
 shield-0.063
 Amen-0.062
--0.061
Tokenalph
Token ID17307
Feature activation-4.14e-03

Pos logit contributions
Those+0.037
ill+0.034
of+0.033
 fourteen+0.030
ures+0.030
Neg logit contributions
 shield-0.030
 job-0.029
 badly-0.029
 possession-0.027
 active-0.027
Token smiled
Token ID13541
Feature activation-1.93e-02

Pos logit contributions
 badly+0.006
 shield+0.006
 job+0.006
-+0.006
 possession+0.006
Neg logit contributions
Those-0.008
ill-0.007
of-0.006
These-0.006
 fourteen-0.006
Token and
Token ID290
Feature activation-5.24e-02

Pos logit contributions
 job+0.029
 possession+0.029
eh+0.028
 Amen+0.028
 appetite+0.028
Neg logit contributions
Those-0.033
ill-0.032
of-0.028
 are-0.026
 fourteen-0.026
Token felt
Token ID2936
Feature activation-2.92e-02

Pos logit contributions
 badly+0.098
 shield+0.094
 job+0.093
 possession+0.092
 Amen+0.089
Neg logit contributions
Those-0.080
ill-0.067
of-0.066
Kids-0.061
 are-0.060
Token very
Token ID845
Feature activation-9.78e-02

Pos logit contributions
 shield+0.031
 badly+0.030
 job+0.030
 possession+0.030
-+0.029
Neg logit contributions
Those-0.067
ill-0.059
 are-0.057
of-0.057
These-0.057
Token happy
Token ID3772
Feature activation-5.67e-02

Pos logit contributions
 shield+0.106
 possession+0.098
 badly+0.094
 armor+0.091
 armour+0.091
Neg logit contributions
Those-0.221
ill-0.199
 are-0.195
of-0.190
oms-0.185
Token.
Token ID13
Feature activation-3.01e-02

Pos logit contributions
 shield+0.086
 active+0.084
 Amen+0.084
eh+0.083
 job+0.082
Neg logit contributions
Those-0.099
ill-0.093
 are-0.081
of-0.080
 fourteen-0.080
Token His
Token ID2399
Feature activation+2.14e-03

Pos logit contributions
 shield+0.038
 badly+0.038
 job+0.037
 possession+0.036
-+0.035
Neg logit contributions
Those-0.065
ill-0.059
of-0.056
These-0.053
 fourteen-0.053
Token mom
Token ID1995
Feature activation+1.70e-02

Pos logit contributions
Those+0.004
ill+0.004
of+0.004
 are+0.004
 fourteen+0.004
Neg logit contributions
 badly-0.003
 possession-0.003
 shield-0.003
 active-0.003
--0.002
Token said
Token ID531
Feature activation-1.32e-05

Pos logit contributions
Those+0.035
ill+0.029
These+0.028
 fourteen+0.028
Warning+0.028
Neg logit contributions
--0.025
 shield-0.024
 badly-0.024
 job-0.022
 happily-0.022
Token.
Token ID13
Feature activation-2.02e-02

Pos logit contributions
Those+0.013
ill+0.012
of+0.011
These+0.011
 fourteen+0.011
Neg logit contributions
 badly-0.006
 shield-0.006
 job-0.006
--0.006
 possession-0.005
Token<|endoftext|>
Token ID50256
Feature activation+2.10e-02

Pos logit contributions
 badly+0.030
 possession+0.030
 job+0.030
-+0.028
 happily+0.028
Neg logit contributions
Those-0.036
 fourteen-0.031
ill-0.030
of-0.030
oms-0.029
TokenOnce
Token ID7454
Feature activation+3.82e-02

Pos logit contributions
Those+0.034
 fourteen+0.029
ill+0.027
 thirteen+0.026
of+0.026
Neg logit contributions
--0.037
 happily-0.035
 badly-0.035
 shield-0.034
 job-0.034
Token there
Token ID612
Feature activation+4.25e-02

Pos logit contributions
Those+0.071
ill+0.058
These+0.057
 fourteen+0.057
Is+0.056
Neg logit contributions
 badly-0.062
--0.060
 shield-0.059
 job-0.059
 possession-0.058
Token was
Token ID373
Feature activation-2.92e-02

Pos logit contributions
Those+0.092
 fourteen+0.080
ives+0.078
oms+0.077
ill+0.077
Neg logit contributions
 badly-0.057
--0.055
 job-0.052
 possession-0.051
 happily-0.050
Token a
Token ID257
Feature activation-7.44e-02

Pos logit contributions
 shield+0.048
 job+0.047
ouse+0.047
 possession+0.046
 badly+0.045
Neg logit contributions
Those-0.053
ill-0.046
These-0.044
 are-0.041
of-0.040
Token cozy
Token ID37438
Feature activation-1.14e-01

Pos logit contributions
 possession+0.110
 badly+0.105
 armor+0.103
 armour+0.103
 shield+0.103
Neg logit contributions
Those-0.139
of-0.119
Is-0.118
Kids-0.117
ill-0.116
Token little
Token ID1310
Feature activation-7.60e-02

Pos logit contributions
 badly+0.159
ages+0.149
 possession+0.147
 active+0.146
 Amen+0.145
Neg logit contributions
Those-0.181
ill-0.180
of-0.178
 end-0.176
Kids-0.159
Token bear
Token ID6842
Feature activation-2.41e-02

Pos logit contributions
 badly+0.123
 armour+0.123
 armor+0.122
 shield+0.118
 possession+0.117
Neg logit contributions
Those-0.121
of-0.101
ill-0.099
Kids-0.098
oms-0.094
Token named
Token ID3706
Feature activation+8.25e-03

Pos logit contributions
 shield+0.037
 job+0.037
 active+0.035
 handsome+0.033
 badly+0.033
Neg logit contributions
Those-0.043
ill-0.039
of-0.038
 are-0.034
ion-0.034
Token Larry
Token ID13633
Feature activation-1.79e-02

Pos logit contributions
Those+0.017
ill+0.015
of+0.015
 fourteen+0.014
ion+0.014
Neg logit contributions
 badly-0.011
 job-0.011
 shield-0.011
 possession-0.011
--0.011
Token school
Token ID1524
Feature activation-5.08e-02

Pos logit contributions
Those+0.024
ill+0.022
of+0.021
 fourteen+0.020
These+0.020
Neg logit contributions
 badly-0.016
 shield-0.016
 job-0.016
--0.015
 possession-0.015
Token day
Token ID1110
Feature activation-4.15e-02

Pos logit contributions
 shield+0.069
 badly+0.068
 job+0.064
 possession+0.063
 active+0.063
Neg logit contributions
Those-0.103
ill-0.096
of-0.090
oms-0.085
 are-0.084
Token.
Token ID13
Feature activation-2.91e-02

Pos logit contributions
 shield+0.063
 possession+0.057
 badly+0.055
 job+0.055
 Amen+0.054
Neg logit contributions
Those-0.079
ill-0.074
of-0.069
 fourteen-0.066
oms-0.064
Token He
Token ID679
Feature activation-3.25e-02

Pos logit contributions
 badly+0.047
 shield+0.047
 job+0.046
-+0.045
 possession+0.045
Neg logit contributions
Those-0.052
ill-0.046
of-0.043
These-0.041
 fourteen-0.041
Token got
Token ID1392
Feature activation-1.49e-02

Pos logit contributions
 shield+0.052
 badly+0.051
 job+0.050
-+0.050
 possession+0.049
Neg logit contributions
Those-0.060
ill-0.052
of-0.048
 fourteen-0.047
These-0.047
Token dressed
Token ID12049
Feature activation-5.53e-02

Pos logit contributions
 shield+0.021
 badly+0.020
 possession+0.020
 appetite+0.019
 job+0.019
Neg logit contributions
Those-0.030
ill-0.027
These-0.025
of-0.024
 are-0.024
Token in
Token ID287
Feature activation+8.34e-03

Pos logit contributions
 shield+0.083
 Amen+0.080
 badly+0.080
eh+0.079
 appetite+0.078
Neg logit contributions
Those-0.092
ill-0.086
 are-0.084
of-0.083
These-0.077
Token his
Token ID465
Feature activation-3.92e-02

Pos logit contributions
Those+0.018
ill+0.016
of+0.015
 fourteen+0.015
These+0.014
Neg logit contributions
 badly-0.011
 shield-0.011
 job-0.011
--0.010
 possession-0.010
Token warm
Token ID5814
Feature activation-8.90e-02

Pos logit contributions
 badly+0.043
 shield+0.040
 job+0.038
 possession+0.037
-+0.037
Neg logit contributions
Those-0.091
ill-0.081
of-0.079
These-0.077
 are-0.076
Token clothes
Token ID8242
Feature activation-3.54e-02

Pos logit contributions
 badly+0.133
 shield+0.114
 Amen+0.111
 active+0.108
 loudly+0.107
Neg logit contributions
Those-0.167
ill-0.159
of-0.156
 are-0.146
oms-0.145
Token and
Token ID290
Feature activation-6.04e-02

Pos logit contributions
 shield+0.046
 badly+0.045
 job+0.044
 possession+0.044
 active+0.043
Neg logit contributions
Those-0.072
ill-0.067
of-0.063
 fourteen-0.060
 are-0.059
Token't
Token ID470
Feature activation-1.46e-02

Pos logit contributions
 badly+0.059
 job+0.057
-+0.056
 shield+0.055
 active+0.050
Neg logit contributions
Those-0.064
ill-0.054
These-0.050
Kids-0.050
of-0.049
Token believe
Token ID1975
Feature activation-5.70e-02

Pos logit contributions
 badly+0.023
 shield+0.023
 job+0.023
-+0.021
 active+0.021
Neg logit contributions
Those-0.028
ill-0.023
Kids-0.022
These-0.022
Is-0.022
Token he
Token ID339
Feature activation-4.64e-02

Pos logit contributions
 job+0.091
 shield+0.087
 active+0.085
 badly+0.085
eh+0.083
Neg logit contributions
Those-0.102
ill-0.093
of-0.087
 fourteen-0.082
 are-0.081
Token'd
Token ID1549
Feature activation-5.62e-02

Pos logit contributions
 badly+0.071
 shield+0.070
-+0.069
 job+0.068
 possession+0.066
Neg logit contributions
Those-0.090
ill-0.078
ion-0.073
 fourteen-0.072
of-0.072
Token been
Token ID587
Feature activation-5.86e-02

Pos logit contributions
 shield+0.085
 badly+0.084
 job+0.083
-+0.080
 possession+0.080
Neg logit contributions
Those-0.112
ill-0.093
These-0.089
of-0.089
ion-0.088
Token so
Token ID523
Feature activation-8.15e-02

Pos logit contributions
 shield+0.072
 badly+0.070
 job+0.068
 possession+0.066
 Amen+0.063
Neg logit contributions
Those-0.124
ill-0.117
of-0.106
 fourteen-0.105
oms-0.102
Token adventurous
Token ID39770
Feature activation-8.62e-02

Pos logit contributions
 shield+0.074
 badly+0.072
 job+0.070
 possession+0.067
-+0.067
Neg logit contributions
Those-0.206
ill-0.188
of-0.179
 fourteen-0.174
These-0.173
Token.
Token ID13
Feature activation-4.51e-02

Pos logit contributions
 shield+0.117
 badly+0.114
 job+0.114
 Amen+0.111
 active+0.108
Neg logit contributions
Those-0.165
ill-0.156
of-0.146
 fourteen-0.139
ion-0.138
Token
Token ID220
Feature activation-4.76e-03

Pos logit contributions
 badly+0.074
 shield+0.073
 job+0.071
 possession+0.070
-+0.069
Neg logit contributions
Those-0.079
ill-0.068
of-0.065
 fourteen-0.063
These-0.062
Token
Token ID198
Feature activation+2.06e-02

Pos logit contributions
 badly+0.008
-+0.007
 shield+0.007
 happily+0.007
 job+0.007
Neg logit contributions
Those-0.008
 fourteen-0.007
ill-0.007
These-0.007
of-0.007
Token
Token ID198
Feature activation+1.55e-02

Pos logit contributions
Those+0.035
ill+0.030
oms+0.030
 fourteen+0.030
ures+0.028
Neg logit contributions
 badly-0.035
 active-0.033
--0.032
 happily-0.032
 shield-0.031
Token way
Token ID835
Feature activation-2.30e-02

Pos logit contributions
 badly+0.027
 Amen+0.025
 possession+0.024
 shield+0.024
-+0.023
Neg logit contributions
Those-0.041
 are-0.037
ill-0.037
of-0.036
 halftime-0.035
Token?"
Token ID1701
Feature activation-2.55e-02

Pos logit contributions
 shield+0.026
 badly+0.026
 job+0.025
 possession+0.025
-+0.024
Neg logit contributions
Those-0.051
ill-0.047
of-0.045
 are-0.042
 fourteen-0.042
Token
Token ID198
Feature activation+1.52e-02

Pos logit contributions
 badly+0.048
 shield+0.046
 job+0.046
-+0.045
 possession+0.044
Neg logit contributions
Those-0.040
ill-0.034
of-0.032
 fourteen-0.031
These-0.030
Token
Token ID198
Feature activation+1.20e-02

Pos logit contributions
Those+0.025
ill+0.021
 fourteen+0.021
These+0.019
oms+0.019
Neg logit contributions
 badly-0.027
--0.026
 shield-0.026
 Amen-0.025
 possession-0.025
TokenBuzz
Token ID48230
Feature activation-1.63e-03

Pos logit contributions
Those+0.017
ill+0.015
 fourteen+0.014
oms+0.013
ount+0.013
Neg logit contributions
 badly-0.024
 shield-0.023
--0.023
 active-0.023
 job-0.023
Token and
Token ID290
Feature activation-5.00e-02

Pos logit contributions
-+0.002
 shield+0.002
 badly+0.002
 active+0.002
 job+0.002
Neg logit contributions
Those-0.003
ill-0.003
These-0.003
 fourteen-0.003
of-0.003
Token Buzz
Token ID15753
Feature activation-2.44e-04

Pos logit contributions
 badly+0.094
 job+0.092
 shield+0.092
 possession+0.090
-+0.090
Neg logit contributions
Those-0.078
ill-0.067
of-0.062
 fourteen-0.059
These-0.058
Tokeny
Token ID88
Feature activation-4.50e-02

Pos logit contributions
 happily+0.000
-+0.000
 shield+0.000
 badly+0.000
 loudly+0.000
Neg logit contributions
Those-0.001
ion-0.000
ill-0.000
 fourteen-0.000
Is-0.000
Token argued
Token ID7189
Feature activation-1.14e-02

Pos logit contributions
 shield+0.063
 badly+0.062
 job+0.059
-+0.058
 loudly+0.056
Neg logit contributions
Those-0.093
ill-0.080
ion-0.077
Kids-0.075
of-0.075
Token for
Token ID329
Feature activation+2.62e-02

Pos logit contributions
 badly+0.012
-+0.011
 shield+0.011
 job+0.010
 loudly+0.010
Neg logit contributions
Those-0.029
ill-0.025
Kids-0.025
of-0.025
These-0.025
Token a
Token ID257
Feature activation-5.55e-02

Pos logit contributions
Those+0.059
ill+0.053
of+0.050
 fourteen+0.049
These+0.049
Neg logit contributions
 badly-0.031
 shield-0.031
 job-0.030
--0.029
 possession-0.029
Token Molly
Token ID30236
Feature activation-4.50e-02

Pos logit contributions
 badly+0.026
 shield+0.026
 job+0.026
-+0.025
 possession+0.025
Neg logit contributions
Those-0.038
ill-0.033
of-0.032
 fourteen-0.031
These-0.030
Token.
Token ID13
Feature activation-5.11e-02

Pos logit contributions
 job+0.060
 shield+0.059
 badly+0.058
 possession+0.055
 active+0.055
Neg logit contributions
Those-0.094
ill-0.087
of-0.082
Kids-0.076
These-0.076
Token She
Token ID1375
Feature activation-6.00e-02

Pos logit contributions
 badly+0.086
 shield+0.085
 active+0.083
 job+0.082
 possession+0.081
Neg logit contributions
Those-0.085
ill-0.072
 fourteen-0.070
of-0.070
ion-0.067
Token was
Token ID373
Feature activation-6.43e-02

Pos logit contributions
 shield+0.103
 job+0.101
 badly+0.099
 possession+0.098
 active+0.093
Neg logit contributions
Those-0.101
ill-0.089
of-0.085
Kids-0.081
These-0.079
Token a
Token ID257
Feature activation-9.57e-02

Pos logit contributions
 shield+0.099
 possession+0.096
 badly+0.094
 job+0.090
ouse+0.089
Neg logit contributions
Those-0.124
ill-0.104
These-0.098
Is-0.094
 fourteen-0.094
Token very
Token ID845
Feature activation-1.75e-01

Pos logit contributions
 badly+0.131
rily+0.130
ages+0.128
 possession+0.127
 armor+0.127
Neg logit contributions
Those-0.170
Is-0.147
 end-0.144
These-0.135
Kids-0.133
Token intelligent
Token ID12661
Feature activation-1.06e-01

Pos logit contributions
 shield+0.241
 armor+0.230
 armour+0.230
 badly+0.226
 possession+0.223
Neg logit contributions
Those-0.332
ila-0.310
ill-0.276
oke-0.275
ulu-0.268
Token girl
Token ID2576
Feature activation-7.79e-02

Pos logit contributions
 badly+0.166
 shield+0.153
 Amen+0.152
 armour+0.152
 armor+0.150
Neg logit contributions
Those-0.174
ill-0.168
of-0.150
ila-0.148
Kids-0.146
Token.
Token ID13
Feature activation-5.00e-02

Pos logit contributions
 shield+0.104
 badly+0.101
 job+0.099
 possession+0.094
 active+0.094
Neg logit contributions
Those-0.164
ill-0.147
of-0.139
These-0.134
ion-0.133
Token Every
Token ID3887
Feature activation-1.39e-03

Pos logit contributions
 shield+0.078
 badly+0.078
 active+0.077
 possession+0.075
 Amen+0.075
Neg logit contributions
Those-0.084
 fourteen-0.072
ill-0.072
of-0.070
ion-0.068
Token day
Token ID1110
Feature activation-4.01e-02

Pos logit contributions
 badly+0.002
 shield+0.002
 possession+0.002
 job+0.002
 active+0.002
Neg logit contributions
Those-0.003
ill-0.002
of-0.002
ion-0.002
Kids-0.002
Token bright
Token ID6016
Feature activation-5.39e-02

Pos logit contributions
 job+0.006
-+0.005
 badly+0.005
 shield+0.005
 possession+0.005
Neg logit contributions
Those-0.006
ill-0.006
of-0.006
Kids-0.005
 fourteen-0.005
Token colours
Token ID18915
Feature activation-8.02e-03

Pos logit contributions
 badly+0.090
 shield+0.089
 job+0.086
 possession+0.084
 active+0.083
Neg logit contributions
Those-0.096
ill-0.084
of-0.078
These-0.075
 fourteen-0.075
Token and
Token ID290
Feature activation-6.25e-02

Pos logit contributions
-+0.009
 badly+0.009
 job+0.008
 shield+0.008
 possession+0.007
Neg logit contributions
Those-0.019
Kids-0.017
of-0.017
ill-0.017
Is-0.016
Token made
Token ID925
Feature activation+1.98e-02

Pos logit contributions
 shield+0.097
 job+0.097
-+0.095
 badly+0.094
 active+0.091
Neg logit contributions
Those-0.114
ill-0.104
of-0.097
Kids-0.093
ion-0.092
Token a
Token ID257
Feature activation-4.91e-02

Pos logit contributions
Those+0.041
ill+0.037
of+0.035
These+0.034
 fourteen+0.034
Neg logit contributions
 badly-0.027
 shield-0.027
 job-0.025
 possession-0.025
 active-0.025
Token very
Token ID845
Feature activation-1.37e-01

Pos logit contributions
 badly+0.071
 possession+0.068
 shield+0.067
 active+0.065
 armour+0.063
Neg logit contributions
Those-0.097
ill-0.082
of-0.079
 are-0.078
Is-0.078
Token colourful
Token ID44799
Feature activation-7.16e-02

Pos logit contributions
 shield+0.202
 armor+0.195
 armour+0.193
 Amen+0.193
 badly+0.187
Neg logit contributions
Those-0.270
ila-0.230
ill-0.225
 are-0.219
oms-0.218
Token design
Token ID1486
Feature activation-5.22e-03

Pos logit contributions
 badly+0.131
 active+0.122
 shield+0.119
 Amen+0.117
 possession+0.117
Neg logit contributions
Those-0.111
ill-0.099
 are-0.094
 end-0.091
oms-0.089
Token.
Token ID13
Feature activation-2.67e-02

Pos logit contributions
 shield+0.007
 badly+0.007
 possession+0.006
 active+0.006
 job+0.006
Neg logit contributions
Those-0.011
ill-0.010
of-0.009
ion-0.009
oms-0.009
Token
Token ID198
Feature activation+4.33e-02

Pos logit contributions
 job+0.044
 shield+0.044
 badly+0.043
-+0.043
 possession+0.042
Neg logit contributions
Those-0.047
ill-0.042
 fourteen-0.039
of-0.039
These-0.037
Token
Token ID198
Feature activation+3.74e-02

Pos logit contributions
Those+0.081
 fourteen+0.074
 thirteen+0.072
ill+0.071
ount+0.067
Neg logit contributions
--0.068
 badly-0.066
 job-0.061
 possession-0.060
 happily-0.060
Token around
Token ID1088
Feature activation-9.90e-04

Pos logit contributions
 job+0.029
 Amen+0.029
 appetite+0.028
 shield+0.028
 possession+0.028
Neg logit contributions
Those-0.034
ill-0.029
of-0.028
 fourteen-0.027
ion-0.027
Token and
Token ID290
Feature activation-6.20e-02

Pos logit contributions
 shield+0.001
 badly+0.001
 job+0.001
 possession+0.001
 active+0.001
Neg logit contributions
Those-0.002
ill-0.002
of-0.002
 fourteen-0.002
These-0.002
Token saw
Token ID2497
Feature activation+6.72e-03

Pos logit contributions
 badly+0.099
 shield+0.099
 job+0.097
 possession+0.093
 Amen+0.091
Neg logit contributions
Those-0.116
ill-0.100
of-0.096
Kids-0.090
These-0.090
Token a
Token ID257
Feature activation-7.11e-02

Pos logit contributions
Those+0.015
ill+0.014
of+0.013
Kids+0.013
 fourteen+0.013
Neg logit contributions
 badly-0.008
 shield-0.008
 job-0.008
--0.007
 possession-0.007
Token big
Token ID1263
Feature activation-9.26e-02

Pos logit contributions
 possession+0.097
 badly+0.092
 Amen+0.091
 shield+0.091
 active+0.091
Neg logit contributions
Those-0.138
ill-0.118
Is-0.117
of-0.115
 are-0.111
Token p
Token ID279
Feature activation-1.32e-01

Pos logit contributions
 badly+0.133
 possession+0.132
 active+0.130
 Amen+0.127
 shield+0.125
Neg logit contributions
Those-0.169
ill-0.150
 are-0.145
of-0.140
oms-0.137
Tokenuddle
Token ID24500
Feature activation-5.74e-02

Pos logit contributions
 shield+0.198
 Amen+0.183
 job+0.179
 badly+0.177
 active+0.177
Neg logit contributions
Those-0.257
ill-0.237
of-0.226
oms-0.224
ila-0.213
Token of
Token ID286
Feature activation-1.68e-04

Pos logit contributions
 badly+0.068
 shield+0.068
 job+0.066
-+0.063
 possession+0.063
Neg logit contributions
Those-0.128
ill-0.116
of-0.110
 fourteen-0.106
These-0.106
Token mud
Token ID17492
Feature activation-4.73e-02

Pos logit contributions
 badly+0.000
 shield+0.000
 job+0.000
-+0.000
 possession+0.000
Neg logit contributions
Those-0.000
ill-0.000
of-0.000
 fourteen-0.000
These-0.000
Token.
Token ID13
Feature activation-3.86e-02

Pos logit contributions
 shield+0.062
 badly+0.061
 job+0.060
 possession+0.059
 active+0.059
Neg logit contributions
Those-0.100
ill-0.090
of-0.083
 fourteen-0.083
ion-0.082
Token She
Token ID1375
Feature activation-4.30e-02

Pos logit contributions
 shield+0.077
 badly+0.077
 job+0.076
-+0.074
 possession+0.074
Neg logit contributions
Those-0.056
ill-0.046
of-0.043
These-0.041
Is-0.040
Token<|endoftext|>
Token ID50256
Feature activation-1.62e-03

Pos logit contributions
 badly+0.074
 job+0.070
 possession+0.069
 shield+0.069
 Amen+0.068
Neg logit contributions
Those-0.076
ill-0.064
 fourteen-0.063
oms-0.062
of-0.061
TokenSu
Token ID5606
Feature activation-1.13e-02

Pos logit contributions
-+0.003
 badly+0.003
 happily+0.003
 shield+0.002
 job+0.002
Neg logit contributions
Those-0.003
 fourteen-0.002
ill-0.002
 thirteen-0.002
of-0.002
Tokenzie
Token ID49746
Feature activation-3.37e-02

Pos logit contributions
-+0.018
 possession+0.017
 job+0.017
 badly+0.017
gy+0.016
Neg logit contributions
Those-0.021
Is-0.017
These-0.017
 halftime-0.016
ures-0.016
Token was
Token ID373
Feature activation-5.07e-02

Pos logit contributions
 badly+0.047
 shield+0.047
 job+0.046
-+0.044
 possession+0.044
Neg logit contributions
Those-0.068
ill-0.061
of-0.057
 fourteen-0.055
These-0.055
Token a
Token ID257
Feature activation-9.21e-02

Pos logit contributions
 possession+0.074
 shield+0.071
 job+0.063
ouse+0.063
 armor+0.061
Neg logit contributions
Those-0.095
ill-0.089
ila-0.080
 fourteen-0.080
These-0.079
Token cute
Token ID13779
Feature activation-1.33e-01

Pos logit contributions
 possession+0.139
ages+0.137
rily+0.128
 badly+0.127
 armor+0.125
Neg logit contributions
Those-0.157
 end-0.146
ill-0.135
of-0.133
Is-0.132
Token little
Token ID1310
Feature activation-9.69e-02

Pos logit contributions
ages+0.200
 badly+0.195
 possession+0.194
 job+0.185
 Amen+0.184
Neg logit contributions
Those-0.218
ill-0.217
 end-0.205
of-0.200
 are-0.191
Token girl
Token ID2576
Feature activation-7.37e-02

Pos logit contributions
 possession+0.149
 shield+0.146
 job+0.146
 armour+0.145
 armor+0.144
Neg logit contributions
Those-0.165
ill-0.142
 are-0.137
of-0.133
m-0.130
Token.
Token ID13
Feature activation-4.38e-02

Pos logit contributions
 shield+0.103
 job+0.100
 badly+0.099
 possession+0.095
 active+0.093
Neg logit contributions
Those-0.149
ill-0.133
of-0.125
These-0.120
ion-0.119
Token She
Token ID1375
Feature activation-5.38e-02

Pos logit contributions
 badly+0.062
 job+0.059
 possession+0.059
 active+0.059
-+0.058
Neg logit contributions
Those-0.082
 halftime-0.074
 fourteen-0.072
of-0.071
ives-0.069
Token loved
Token ID6151
Feature activation-6.10e-02

Pos logit contributions
 job+0.079
 shield+0.079
 badly+0.079
 possession+0.076
-+0.072
Neg logit contributions
Those-0.101
ill-0.090
of-0.088
Kids-0.083
 are-0.083
Token train
Token ID4512
Feature activation-2.04e-02

Pos logit contributions
 possession+0.106
 badly+0.102
 armour+0.097
 shield+0.097
 armor+0.097
Neg logit contributions
Those-0.136
of-0.118
Is-0.115
Kids-0.115
ill-0.112
Token.
Token ID13
Feature activation-2.93e-02

Pos logit contributions
 badly+0.025
 shield+0.025
 job+0.024
 possession+0.023
-+0.023
Neg logit contributions
Those-0.045
ill-0.040
of-0.038
These-0.037
 fourteen-0.037
Token It
Token ID632
Feature activation+6.62e-03

Pos logit contributions
 badly+0.046
 job+0.046
 shield+0.045
-+0.044
 possession+0.043
Neg logit contributions
Those-0.053
ill-0.047
of-0.045
 fourteen-0.043
Kids-0.042
Token was
Token ID373
Feature activation-4.02e-02

Pos logit contributions
Those+0.013
ill+0.011
of+0.010
 fourteen+0.010
These+0.010
Neg logit contributions
--0.010
 shield-0.010
 badly-0.010
 job-0.010
 active-0.010
Token a
Token ID257
Feature activation-7.30e-02

Pos logit contributions
 shield+0.048
 possession+0.047
 badly+0.047
 job+0.043
 appetite+0.042
Neg logit contributions
Those-0.088
ill-0.078
These-0.073
 are-0.073
 fourteen-0.072
Token very
Token ID845
Feature activation-1.47e-01

Pos logit contributions
 possession+0.108
 badly+0.107
rily+0.106
 appetite+0.101
ados+0.101
Neg logit contributions
Those-0.125
 end-0.109
 are-0.107
Kids-0.107
Is-0.106
Token mild
Token ID11607
Feature activation-8.32e-02

Pos logit contributions
 badly+0.211
 shield+0.208
 possession+0.202
 armour+0.185
 armor+0.185
Neg logit contributions
Those-0.261
ila-0.248
oms-0.240
Kids-0.234
 are-0.224
Token train
Token ID4512
Feature activation-2.89e-02

Pos logit contributions
 badly+0.114
 shield+0.114
 job+0.112
-+0.109
 possession+0.107
Neg logit contributions
Those-0.171
ill-0.153
of-0.144
 fourteen-0.139
These-0.139
Token.
Token ID13
Feature activation-3.01e-02

Pos logit contributions
 badly+0.034
 shield+0.034
 job+0.033
-+0.031
 possession+0.031
Neg logit contributions
Those-0.065
ill-0.059
of-0.056
 fourteen-0.054
These-0.054
Token It
Token ID632
Feature activation+5.94e-03

Pos logit contributions
 shield+0.044
 job+0.044
 badly+0.043
 active+0.042
 possession+0.041
Neg logit contributions
Those-0.057
ill-0.052
of-0.049
 fourteen-0.048
Kids-0.046
Token ran
Token ID4966
Feature activation-1.22e-02

Pos logit contributions
Those+0.012
ill+0.011
of+0.010
 fourteen+0.010
These+0.009
Neg logit contributions
 shield-0.009
--0.009
 badly-0.009
 job-0.009
 active-0.008