TOP ACTIVATIONS
MAX = 10.970

to be restored to its former glory ."
time playing with her or nam ents and making her special
The town spe ople had to find a
the village back to its former glory !"
and play with your or nam ents . Maybe you can
back the village to its former state . With the help
the tree to its once former beauty . He was happy
the rest of the or nam ents onto the tree .
, leaving it to its former state .
never played with the or nam ents again . They still
he especially liked the or nam ents he kept around .
All of these lovely or nam ents reminded him of the
a box of shiny or nam ents that spark led in
who liked to collect or nam ents . Everywhere she went
at all the different or nam ents and adm iring the
she had with her or nam ents . <|endoftext|> Once upon
to help sort the or nam ents . She picked up
We have our own or nam ents at home , remember
take one of the or nam ents outside for a walk
Everyday he would take the av oc ados and scatter them

MIN ACTIVATIONS
MIN = -5.933

made a big , pretty web for all its friends to
reached a wide , open field . The duck stopped to
found a nice , clean spot . The
in a nice , healthy spot . The
and car away and c umin and cor i ander and
smiled and thanked the generous spot . She couldn 't believe
, a big , big spider , and it wanted to
were so happy and hugged tightly . They never wanted to
reached a big , wide field . The
each other and then hugged tightly . The warmth of the
. She thanked the generous spot and was excited to play
nice spot in a nearby field . Joy laid
then jumped to a lower branch . Then , the owl
by a big , green field . The man
found a nice , warm spot to hide from the cold
, a ghost , a spider . He runs back to
small house near a big field . Every day he would
'm sorry , too , lizard . I was rude to
it on a nearby tree branch . The squirrel happily chat
it a nice , comfortable spot in his bur row .

INTERVAL 6.745 - 10.970
CONTAINS 0.001%

big tree with many or nam ents on it . The
a letter about the or nam ents and asked for one
the rest of the or nam ents onto the tree .
to me . The or nam ents were very important to
the village back to its former glory !"

INTERVAL 2.519 - 6.745
CONTAINS 1.581%

mess . The bed was un made , toys were everywhere
panic . The finger was so hard he thought it might
a box with small bags of seeds . He didn 't
But when he looked out of his window , he saw
kept on driving . He drove around the house , in

INTERVAL -1.707 - 2.519
CONTAINS 92.905%

Her dad smiled and said , " It sure is
The little girl said , " Okay , let 's go
He falls into the river . He screams .
lost her balance and fell in the creek .
Tom nods and smiles . He says , " Yes ,

INTERVAL -5.933 - -1.707
CONTAINS 5.513%

upon a time , there was a little girl named Lily
toy . She held it tightly and ran outside to play
her face and said , " Let 's go together .
to eat apples from a tree that belonged to someone else
Once upon a time , a cheerful little girl named Lily
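
The four INTERVAL blocks above look like equal-width bins spanning MIN to MAX. The sketch below recomputes the bin edges under that assumption. The function name `interval_edges` and the bin count of 4 are illustrative choices, not something the dump states, and the edges only match to about three decimals because the printed MIN and MAX are themselves rounded.

```python
def interval_edges(lo: float, hi: float, n_bins: int = 4) -> list[float]:
    """Return n_bins + 1 equally spaced edges from lo to hi (assumed binning)."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# Values copied from the MIN / MAX lines of the dump.
MIN_ACT, MAX_ACT = -5.933, 10.970
edges = interval_edges(MIN_ACT, MAX_ACT)

# CONTAINS percentages from the dump, ordered from the lowest interval up.
shares = [5.513, 92.905, 1.581, 0.001]

for (left, right), share in zip(zip(edges, edges[1:]), shares):
    print(f"{left:.3f} .. {right:.3f}: {share:.3f}% of tokens")
```

Read this way, almost all tokens fall in the -1.707 to 2.519 interval, and the "or nam ents" and "former" contexts occupy the top interval that holds about 0.001% of tokens.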
Token to
Token ID284
Feature activation+3.22e-01

Pos logit contributions
handed+0.092
 former+0.082
+0.081
keeping+0.080
urance+0.079
Neg logit contributions
 please-0.054
 me-0.054
 it-0.052
 cho-0.049
una-0.046
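
Each per-token record in this dump fuses its labels and values (`Token ID284`, `Feature activation+3.22e-01`, `handed+0.092`). A minimal sketch of splitting one record back into fields follows. It assumes that the token text starts immediately after the `Token` label, so a leading space belongs to the token, and that every contribution line ends in a signed decimal. `parse_record` and the dictionary layout are hypothetical names chosen for illustration.

```python
import re

# First record of the dump, one list entry per line (leading spaces kept).
RECORD = "\n".join([
    "Token to",
    "Token ID284",
    "Feature activation+3.22e-01",
    "",
    "Pos logit contributions",
    "handed+0.092",
    " former+0.082",
    "+0.081",
    "keeping+0.080",
    "urance+0.079",
    "Neg logit contributions",
    " please-0.054",
    " me-0.054",
    " it-0.052",
    " cho-0.049",
    "una-0.046",
])

# Token text (possibly empty) followed by a signed decimal at end of line.
_CONTRIB = re.compile(r"^(.*?)([+-]\d+\.\d+)$")


def parse_record(text: str) -> dict:
    """Split one fused record into token, id, activation and contributions."""
    lines = text.split("\n")
    rec = {
        "token": lines[0][len("Token"):],
        "token_id": int(lines[1][len("Token ID"):]),
        "activation": float(lines[2][len("Feature activation"):]),
        "pos": [],
        "neg": [],
    }
    bucket = None
    for line in lines[3:]:
        if line == "Pos logit contributions":
            bucket = rec["pos"]
        elif line == "Neg logit contributions":
            bucket = rec["neg"]
        elif bucket is not None and (m := _CONTRIB.match(line)):
            bucket.append((m.group(1), float(m.group(2))))
    return rec


rec = parse_record(RECORD)
print(rec["token_id"], rec["activation"], rec["pos"][:2])
```

An entry such as `+0.081` parses to an empty token string; the original token was probably whitespace that did not survive the text export, so it is left empty rather than guessed.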
Token be
Token ID307
Feature activation+5.41e-01

Pos logit contributions
handed+0.011
+0.010
ongo+0.010
urance+0.010
 former+0.009
Neg logit contributions
 it-0.011
 me-0.011
 please-0.011
 cho-0.009
 us-0.009
Token restored
Token ID15032
Feature activation+2.06e+00

Pos logit contributions
handed+0.021
+0.019
urance+0.018
 former+0.018
 arrest+0.018
Neg logit contributions
 me-0.016
 it-0.016
 please-0.016
 cho-0.014
 closer-0.013
Token to
Token ID284
Feature activation-2.81e-01

Pos logit contributions
handed+0.079
 former+0.072
keeping+0.069
urance+0.068
 arrest+0.067
Neg logit contributions
 please-0.053
 me-0.052
 it-0.050
 cho-0.050
osh-0.046
Token its
Token ID663
Feature activation-1.26e+00

Pos logit contributions
 it+0.008
 me+0.008
 please+0.007
 cho+0.006
 us+0.006
Neg logit contributions
handed-0.011
-0.010
ongo-0.010
urance-0.010
 former-0.010
Token former
Token ID1966
Feature activation+9.11e+00

Pos logit contributions
 me+0.046
 please+0.044
 it+0.044
 cho+0.041
 closer+0.039
Neg logit contributions
handed-0.039
-0.034
urance-0.032
 former-0.032
 arrest-0.032
Token glory
Token ID13476
Feature activation+1.86e+00

Pos logit contributions
handed+0.222
+0.190
 arrest+0.172
ongo+0.169
urance+0.168
Neg logit contributions
 me-0.396
 it-0.377
 please-0.372
 cho-0.352
 us-0.339
Token."
Token ID526
Feature activation+3.22e-01

Pos logit contributions
handed+0.075
 former+0.068
+0.066
urance+0.065
 arrest+0.064
Neg logit contributions
 me-0.047
 please-0.046
 it-0.045
 cho-0.043
una-0.038
Token
Token ID220
Feature activation-5.45e-01

Pos logit contributions
handed+0.011
 former+0.009
+0.009
urance+0.009
 arrest+0.009
Neg logit contributions
 me-0.011
 please-0.011
 it-0.011
 cho-0.010
 closer-0.009
Token
Token ID198
Feature activation-1.99e-01

Pos logit contributions
 it+0.015
 me+0.015
 please+0.014
 cho+0.012
 is+0.012
Neg logit contributions
handed-0.021
 former-0.020
 arrest-0.020
urance-0.019
-0.019
Token
Token ID198
Feature activation+4.46e-01

Pos logit contributions
 me+0.007
 please+0.007
 it+0.007
 cho+0.006
 closer+0.006
Neg logit contributions
handed-0.006
-0.006
 former-0.006
urance-0.005
 arrest-0.005
Token time
Token ID640
Feature activation+5.55e-01

Pos logit contributions
handed+0.033
 former+0.029
+0.029
urance+0.028
 arrest+0.028
Neg logit contributions
 me-0.026
 it-0.026
 please-0.026
 cho-0.024
 closer-0.022
Token playing
Token ID2712
Feature activation+1.21e+00

Pos logit contributions
handed+0.022
 former+0.020
+0.020
urance+0.020
ongo+0.019
Neg logit contributions
 me-0.015
 it-0.015
 please-0.015
 cho-0.013
 closer-0.012
Token with
Token ID351
Feature activation-5.28e-01

Pos logit contributions
handed+0.042
+0.037
 former+0.037
 arrest+0.035
keeping+0.034
Neg logit contributions
 cho-0.034
 please-0.034
 me-0.032
osh-0.031
ours-0.031
Token her
Token ID607
Feature activation+8.64e-02

Pos logit contributions
 me+0.010
 please+0.010
 it+0.010
 cho+0.009
 closer+0.008
Neg logit contributions
handed-0.025
 former-0.023
-0.022
urance-0.022
 arrest-0.022
Token or
Token ID393
Feature activation+9.09e-01

Pos logit contributions
handed+0.004
urance+0.003
 former+0.003
+0.003
ongo+0.003
Neg logit contributions
 me-0.002
 it-0.002
 please-0.002
 cho-0.002
 closer-0.002
Tokennam
Token ID7402
Feature activation+9.07e+00

Pos logit contributions
handed+0.034
 former+0.030
+0.030
urance+0.029
 arrest+0.029
Neg logit contributions
 me-0.026
 please-0.026
 it-0.025
 cho-0.024
una-0.022
Tokenents
Token ID658
Feature activation+2.26e+00

Pos logit contributions
handed+0.364
+0.342
urance+0.338
 former+0.334
 arrest+0.319
Neg logit contributions
 me-0.233
 it-0.227
 please-0.219
 cho-0.198
h-0.193
Token and
Token ID290
Feature activation+7.34e-01

Pos logit contributions
handed+0.096
 former+0.088
+0.087
urance+0.086
 arrest+0.084
Neg logit contributions
 it-0.054
 me-0.053
 please-0.052
 cho-0.047
 closer-0.043
Token making
Token ID1642
Feature activation+1.69e-02

Pos logit contributions
handed+0.028
 former+0.025
+0.024
 arrest+0.024
urance+0.024
Neg logit contributions
 please-0.020
 me-0.020
 it-0.020
 cho-0.018
una-0.017
Token her
Token ID607
Feature activation+2.68e-01

Pos logit contributions
handed+0.001
 former+0.001
+0.001
urance+0.001
 arrest+0.001
Neg logit contributions
 me-0.000
 please-0.000
 it-0.000
 cho-0.000
 closer-0.000
Token special
Token ID2041
Feature activation+1.06e+00

Pos logit contributions
handed+0.010
+0.009
urance+0.009
 arrest+0.009
ongo+0.009
Neg logit contributions
 it-0.008
 me-0.008
 please-0.008
 closer-0.007
 is-0.007
Token
Token ID220
Feature activation+1.32e+00

Pos logit contributions
handed+0.017
 former+0.016
keeping+0.015
urance+0.015
ongo+0.015
Neg logit contributions
 it-0.014
 please-0.014
 me-0.014
 cho-0.012
h-0.011
Token
Token ID198
Feature activation+1.40e+00

Pos logit contributions
handed+0.050
 arrest+0.046
 former+0.045
urance+0.044
+0.043
Neg logit contributions
 me-0.039
 it-0.038
 please-0.037
 cho-0.032
una-0.031
Token
Token ID198
Feature activation+1.49e+00

Pos logit contributions
handed+0.048
 former+0.043
+0.042
urance+0.041
 arrest+0.041
Neg logit contributions
 me-0.044
 please-0.044
 it-0.044
 cho-0.041
 closer-0.038
TokenThe
Token ID464
Feature activation+1.18e+00

Pos logit contributions
handed+0.053
 former+0.048
+0.047
urance+0.046
 arrest+0.045
Neg logit contributions
 me-0.045
 please-0.044
 it-0.044
 cho-0.041
 closer-0.038
Token town
Token ID3240
Feature activation+3.54e+00

Pos logit contributions
handed+0.047
 former+0.043
+0.042
urance+0.041
 arrest+0.041
Neg logit contributions
 it-0.030
 me-0.030
 please-0.030
 cho-0.027
h-0.025
Tokenspe
Token ID4125
Feature activation+8.90e+00

Pos logit contributions
handed+0.126
 former+0.112
+0.110
urance+0.107
 arrest+0.106
Neg logit contributions
 me-0.105
 please-0.105
 it-0.102
 cho-0.096
una-0.092
Tokenople
Token ID643
Feature activation+2.91e+00

Pos logit contributions
handed+0.256
 former+0.247
+0.241
ongo+0.235
urance+0.233
Neg logit contributions
 me-0.323
 it-0.309
 cho-0.294
osh-0.290
 please-0.287
Token had
Token ID550
Feature activation+2.28e+00

Pos logit contributions
handed+0.111
 former+0.103
+0.101
ongo+0.100
urance+0.097
Neg logit contributions
 it-0.088
 me-0.088
 please-0.082
osh-0.075
 cho-0.073
Token to
Token ID284
Feature activation+1.39e+00

Pos logit contributions
handed+0.090
 former+0.081
+0.080
urance+0.078
 arrest+0.077
Neg logit contributions
 me-0.062
 it-0.061
 please-0.060
 cho-0.055
 closer-0.050
Token find
Token ID1064
Feature activation+8.19e-02

Pos logit contributions
handed+0.055
ongo+0.052
+0.052
 former+0.048
urance+0.047
Neg logit contributions
 me-0.040
 it-0.040
 please-0.040
 us-0.032
 cho-0.031
Token a
Token ID257
Feature activation-2.53e-01

Pos logit contributions
handed+0.003
 former+0.003
 arrest+0.003
+0.003
urance+0.003
Neg logit contributions
 please-0.002
 me-0.002
 it-0.002
 cho-0.002
h-0.002
Token the
Token ID262
Feature activation+3.16e-01

Pos logit contributions
handed+0.025
 former+0.024
 arrest+0.022
 ease+0.022
keeping+0.022
Neg logit contributions
 please-0.006
 cho-0.005
 me-0.005
h-0.005
una-0.005
Token village
Token ID7404
Feature activation+1.40e+00

Pos logit contributions
handed+0.012
 former+0.011
+0.011
urance+0.010
 arrest+0.010
Neg logit contributions
 me-0.009
 please-0.009
 it-0.009
 cho-0.008
 closer-0.008
Token back
Token ID736
Feature activation-8.34e-01

Pos logit contributions
handed+0.058
 former+0.053
+0.053
urance+0.052
 arrest+0.051
Neg logit contributions
 me-0.034
 it-0.034
 please-0.033
 cho-0.030
 closer-0.027
Token to
Token ID284
Feature activation-1.02e+00

Pos logit contributions
 me+0.018
 please+0.017
 it+0.017
 cho+0.016
 closer+0.014
Neg logit contributions
handed-0.037
 former-0.034
-0.034
urance-0.033
 arrest-0.033
Token its
Token ID663
Feature activation-1.54e+00

Pos logit contributions
 me+0.023
 it+0.022
 please+0.022
 cho+0.019
 us+0.017
Neg logit contributions
handed-0.045
-0.041
 former-0.041
urance-0.041
ongo-0.040
Token former
Token ID1966
Feature activation+8.79e+00

Pos logit contributions
 me+0.050
 please+0.050
 it+0.050
 cho+0.047
una+0.043
Neg logit contributions
handed-0.051
 former-0.046
-0.044
urance-0.043
 arrest-0.043
Token glory
Token ID13476
Feature activation+1.38e+00

Pos logit contributions
handed+0.189
+0.155
 former+0.141
 arrest+0.141
urance+0.141
Neg logit contributions
 me-0.400
 it-0.389
 please-0.385
 cho-0.367
 us-0.347
Token!"
Token ID2474
Feature activation-5.45e-01

Pos logit contributions
handed+0.056
 former+0.051
+0.050
urance+0.048
 arrest+0.048
Neg logit contributions
 me-0.034
 please-0.034
 it-0.033
 cho-0.032
una-0.028
Token
Token ID220
Feature activation-1.35e+00

Pos logit contributions
 it+0.017
 me+0.016
 please+0.015
 cho+0.013
osh+0.013
Neg logit contributions
handed-0.020
 former-0.020
urance-0.019
 arrest-0.018
ongo-0.018
Token
Token ID198
Feature activation-9.79e-01

Pos logit contributions
 it+0.035
 me+0.033
 please+0.031
 cho+0.027
"+0.027
Neg logit contributions
handed-0.055
 former-0.053
 arrest-0.051
urance-0.050
view-0.050
Token
Token ID198
Feature activation-2.24e-01

Pos logit contributions
 it+0.028
 me+0.028
 please+0.027
 cho+0.025
una+0.023
Neg logit contributions
handed-0.037
 former-0.034
urance-0.034
 arrest-0.033
-0.032
Token and
Token ID290
Feature activation+2.81e-01

Pos logit contributions
handed+0.018
+0.017
urance+0.017
 former+0.016
ongo+0.016
Neg logit contributions
 me-0.009
 it-0.009
 please-0.009
 cho-0.007
 closer-0.007
Token play
Token ID711
Feature activation-7.54e-02

Pos logit contributions
handed+0.011
 former+0.010
+0.010
urance+0.009
 arrest+0.009
Neg logit contributions
 me-0.008
 please-0.008
 it-0.007
 cho-0.007
una-0.006
Token with
Token ID351
Feature activation-9.79e-01

Pos logit contributions
 please+0.002
 me+0.002
 cho+0.002
 it+0.002
osh+0.002
Neg logit contributions
handed-0.003
 former-0.003
-0.003
 arrest-0.002
urance-0.002
Token your
Token ID534
Feature activation+3.74e-01

Pos logit contributions
 please+0.025
 cho+0.022
 me+0.022
h+0.020
 it+0.020
Neg logit contributions
handed-0.040
 arrest-0.036
 former-0.035
-0.034
 ease-0.034
Token or
Token ID393
Feature activation+9.66e-01

Pos logit contributions
handed+0.015
urance+0.013
+0.013
 former+0.013
ongo+0.013
Neg logit contributions
 me-0.011
 it-0.010
 please-0.010
 cho-0.009
 closer-0.008
Tokennam
Token ID7402
Feature activation+8.78e+00

Pos logit contributions
handed+0.042
 former+0.038
+0.037
 arrest+0.037
urance+0.037
Neg logit contributions
 me-0.021
 please-0.021
 it-0.021
 cho-0.019
una-0.017
Tokenents
Token ID658
Feature activation+1.98e+00

Pos logit contributions
handed+0.344
+0.323
urance+0.317
 former+0.317
 arrest+0.301
Neg logit contributions
 me-0.231
 it-0.226
 please-0.222
 cho-0.204
out-0.192
Token.
Token ID13
Feature activation-4.21e-01

Pos logit contributions
handed+0.082
 former+0.074
+0.074
urance+0.072
 arrest+0.071
Neg logit contributions
 me-0.049
 it-0.048
 please-0.048
 cho-0.043
 closer-0.039
Token Maybe
Token ID6674
Feature activation-1.56e+00

Pos logit contributions
 please+0.012
 me+0.012
 it+0.012
 cho+0.010
 I+0.010
Neg logit contributions
handed-0.016
 former-0.015
-0.014
urance-0.014
 arrest-0.014
Token you
Token ID345
Feature activation-9.73e-02

Pos logit contributions
 me+0.029
 please+0.028
 it+0.027
 cho+0.026
una+0.023
Neg logit contributions
handed-0.072
 former-0.066
-0.065
 arrest-0.064
urance-0.063
Token can
Token ID460
Feature activation+8.69e-01

Pos logit contributions
 me+0.001
 please+0.001
 it+0.001
 cho+0.001
una+0.001
Neg logit contributions
handed-0.005
 former-0.005
-0.005
urance-0.005
 arrest-0.005
Token back
Token ID736
Feature activation-9.67e-01

Pos logit contributions
 me+0.020
 it+0.020
 please+0.019
 cho+0.017
 closer+0.016
Neg logit contributions
handed-0.036
-0.033
 former-0.033
urance-0.032
 arrest-0.032
Token the
Token ID262
Feature activation+7.52e-01

Pos logit contributions
 me+0.020
 it+0.020
 please+0.019
 cho+0.017
 closer+0.015
Neg logit contributions
handed-0.044
-0.041
urance-0.041
 former-0.041
ongo-0.040
Token village
Token ID7404
Feature activation+1.54e+00

Pos logit contributions
handed+0.029
 former+0.026
+0.026
urance+0.025
 arrest+0.025
Neg logit contributions
 me-0.021
 please-0.021
 it-0.021
 cho-0.019
 closer-0.017
Token to
Token ID284
Feature activation-9.68e-01

Pos logit contributions
handed+0.065
+0.061
 former+0.060
urance+0.060
 arrest+0.058
Neg logit contributions
 it-0.036
 me-0.035
 please-0.034
 cho-0.031
 closer-0.029
Token its
Token ID663
Feature activation-1.47e+00

Pos logit contributions
 it+0.022
 me+0.021
 please+0.020
 cho+0.017
 us+0.016
Neg logit contributions
handed-0.044
ongo-0.041
-0.041
urance-0.040
 former-0.039
Token former
Token ID1966
Feature activation+8.75e+00

Pos logit contributions
 please+0.047
 it+0.047
 me+0.046
 cho+0.045
h+0.041
Neg logit contributions
handed-0.049
 former-0.046
-0.042
 arrest-0.041
urance-0.041
Token state
Token ID1181
Feature activation+5.91e-01

Pos logit contributions
handed+0.176
+0.139
 former+0.139
urance+0.131
 arrest+0.129
Neg logit contributions
 me-0.404
 please-0.398
 it-0.398
 cho-0.378
 closer-0.358
Token.
Token ID13
Feature activation-7.70e-02

Pos logit contributions
handed+0.024
 former+0.021
+0.021
urance+0.020
 arrest+0.020
Neg logit contributions
 me-0.015
 please-0.015
 it-0.015
 cho-0.014
 closer-0.013
Token With
Token ID2080
Feature activation-1.80e-01

Pos logit contributions
 it+0.002
 me+0.002
 please+0.002
 I+0.002
 cho+0.002
Neg logit contributions
handed-0.003
 former-0.003
 arrest-0.003
urance-0.003
view-0.003
Token the
Token ID262
Feature activation+2.71e-01

Pos logit contributions
 please+0.005
 me+0.005
 cho+0.004
 it+0.004
out+0.004
Neg logit contributions
handed-0.007
 former-0.006
-0.006
 arrest-0.006
 ease-0.006
Token help
Token ID1037
Feature activation-2.54e-01

Pos logit contributions
handed+0.010
 former+0.009
+0.009
urance+0.009
 arrest+0.009
Neg logit contributions
 me-0.008
 please-0.008
 it-0.008
 cho-0.007
 closer-0.007
Token the
Token ID262
Feature activation+1.42e+00

Pos logit contributions
handed+0.048
 former+0.044
+0.044
urance+0.043
 arrest+0.043
Neg logit contributions
 me-0.024
 it-0.024
 please-0.023
 cho-0.021
 closer-0.018
Token tree
Token ID5509
Feature activation-1.42e+00

Pos logit contributions
handed+0.062
 former+0.057
+0.056
 arrest+0.055
urance+0.055
Neg logit contributions
 me-0.032
 please-0.031
 it-0.031
 cho-0.028
una-0.025
Token to
Token ID284
Feature activation-4.00e-01

Pos logit contributions
 it+0.033
 me+0.032
 please+0.031
 cho+0.027
 closer+0.026
Neg logit contributions
handed-0.061
-0.056
urance-0.056
 former-0.056
 arrest-0.055
Token its
Token ID663
Feature activation-1.22e+00

Pos logit contributions
 it+0.011
 me+0.010
 please+0.010
 cho+0.008
 us+0.008
Neg logit contributions
handed-0.017
ongo-0.015
-0.015
urance-0.015
arre-0.015
Token once
Token ID1752
Feature activation-1.78e+00

Pos logit contributions
 please+0.036
 it+0.036
 me+0.035
 cho+0.034
h+0.031
Neg logit contributions
handed-0.044
 former-0.041
-0.038
 arrest-0.037
urance-0.037
Token former
Token ID1966
Feature activation+8.54e+00

Pos logit contributions
 it+0.044
 me+0.043
 please+0.041
 cho+0.036
 closer+0.035
Neg logit contributions
handed-0.076
-0.070
urance-0.070
 arrest-0.069
ongo-0.069
Token beauty
Token ID8737
Feature activation-1.21e+00

Pos logit contributions
handed+0.214
+0.188
ongo+0.168
 arrest+0.165
urance+0.164
Neg logit contributions
 me-0.369
 it-0.346
 please-0.338
 cho-0.325
 us-0.316
Token.
Token ID13
Feature activation+5.71e-01

Pos logit contributions
 me+0.032
 please+0.031
 it+0.031
 cho+0.028
 closer+0.026
Neg logit contributions
handed-0.048
 former-0.043
-0.043
urance-0.042
 arrest-0.042
Token He
Token ID679
Feature activation+5.65e-01

Pos logit contributions
handed+0.019
 former+0.017
 arrest+0.017
urance+0.017
ivery+0.017
Neg logit contributions
 it-0.019
 me-0.018
 please-0.018
 cho-0.016
 I-0.016
Token was
Token ID373
Feature activation-3.88e-01

Pos logit contributions
handed+0.022
 former+0.020
+0.019
 arrest+0.019
urance+0.019
Neg logit contributions
 please-0.015
 me-0.015
 it-0.015
 cho-0.014
una-0.013
Token happy
Token ID3772
Feature activation+7.29e-01

Pos logit contributions
 it+0.011
 me+0.011
 please+0.011
 cho+0.010
 closer+0.009
Neg logit contributions
handed-0.015
-0.013
 former-0.013
urance-0.013
 arrest-0.013
Token the
Token ID262
Feature activation-3.31e-01

Pos logit contributions
 please+0.030
 me+0.029
 it+0.026
 cho+0.026
una+0.024
Neg logit contributions
handed-0.063
 former-0.057
-0.056
 arrest-0.055
urance-0.055
Token rest
Token ID1334
Feature activation-9.05e-01

Pos logit contributions
 me+0.009
 it+0.009
 please+0.008
 cho+0.008
 closer+0.007
Neg logit contributions
handed-0.013
 former-0.012
-0.012
urance-0.012
ongo-0.012
Token of
Token ID286
Feature activation+1.95e+00

Pos logit contributions
 me+0.022
 please+0.022
 it+0.021
 cho+0.020
una+0.018
Neg logit contributions
handed-0.037
-0.033
 former-0.033
urance-0.032
 arrest-0.032
Token the
Token ID262
Feature activation+1.24e+00

Pos logit contributions
handed+0.089
 former+0.081
+0.081
urance+0.079
 arrest+0.079
Neg logit contributions
 me-0.040
 please-0.039
 it-0.038
 cho-0.035
 closer-0.030
Token or
Token ID393
Feature activation+4.24e-01

Pos logit contributions
handed+0.050
 former+0.045
+0.045
urance+0.045
ongo+0.044
Neg logit contributions
 me-0.033
 it-0.032
 please-0.031
 cho-0.029
 closer-0.026
Tokennam
Token ID7402
Feature activation+8.28e+00

Pos logit contributions
handed+0.017
 former+0.015
+0.015
urance+0.015
 arrest+0.015
Neg logit contributions
 me-0.011
 please-0.011
 it-0.011
 cho-0.010
una-0.009
Tokenents
Token ID658
Feature activation+1.76e+00

Pos logit contributions
handed+0.331
+0.308
urance+0.306
 former+0.303
 arrest+0.289
Neg logit contributions
 me-0.209
 it-0.207
 please-0.202
 cho-0.184
out-0.177
Token onto
Token ID4291
Feature activation-1.47e+00

Pos logit contributions
handed+0.075
 former+0.069
+0.068
urance+0.067
 arrest+0.066
Neg logit contributions
 it-0.042
 me-0.041
 please-0.041
 cho-0.037
 closer-0.034
Token the
Token ID262
Feature activation+3.57e-01

Pos logit contributions
 please+0.032
 me+0.032
 it+0.030
 cho+0.028
una+0.025
Neg logit contributions
handed-0.064
-0.058
 former-0.058
urance-0.057
 arrest-0.057
Token tree
Token ID5509
Feature activation-2.41e+00

Pos logit contributions
handed+0.014
+0.013
 former+0.013
urance+0.013
ongo+0.012
Neg logit contributions
 me-0.010
 it-0.009
 please-0.009
 cho-0.008
 closer-0.008
Token.
Token ID13
Feature activation-3.63e-01

Pos logit contributions
 me+0.060
 it+0.060
 please+0.059
 cho+0.054
 closer+0.049
Neg logit contributions
handed-0.099
 former-0.090
-0.089
urance-0.087
 arrest-0.086
Token,
Token ID11
Feature activation-7.98e-01

Pos logit contributions
 me+0.043
 it+0.043
 please+0.042
 cho+0.039
 closer+0.035
Neg logit contributions
handed-0.073
 former-0.066
-0.065
urance-0.064
 arrest-0.064
Token leaving
Token ID4305
Feature activation+4.70e-01

Pos logit contributions
 me+0.023
 please+0.023
 it+0.023
 cho+0.021
 closer+0.019
Neg logit contributions
handed-0.029
 former-0.026
-0.026
urance-0.025
 arrest-0.025
Token it
Token ID340
Feature activation-2.33e+00

Pos logit contributions
handed+0.021
 former+0.019
+0.018
urance+0.018
 arrest+0.018
Neg logit contributions
 please-0.011
 me-0.010
 it-0.010
 cho-0.009
h-0.008
Token to
Token ID284
Feature activation-1.17e+00

Pos logit contributions
 it+0.061
 me+0.061
 please+0.057
 cho+0.053
 closer+0.051
Neg logit contributions
handed-0.094
-0.087
 former-0.086
urance-0.085
ongo-0.085
Token its
Token ID663
Feature activation-1.97e+00

Pos logit contributions
 it+0.029
 me+0.029
 please+0.027
 cho+0.024
 closer+0.022
Neg logit contributions
handed-0.051
-0.047
ongo-0.046
urance-0.046
 former-0.044
Token former
Token ID1966
Feature activation+8.27e+00

Pos logit contributions
 me+0.057
 it+0.056
 please+0.056
 cho+0.051
 closer+0.047
Neg logit contributions
handed-0.074
-0.066
 former-0.065
urance-0.065
 arrest-0.063
Token state
Token ID1181
Feature activation+1.05e-01

Pos logit contributions
handed+0.219
+0.196
ongo+0.175
 arrest+0.169
urance+0.165
Neg logit contributions
 me-0.349
 it-0.331
 please-0.315
 cho-0.309
 us-0.298
Token.
Token ID13
Feature activation-6.89e-01

Pos logit contributions
handed+0.004
 former+0.004
+0.004
urance+0.004
 arrest+0.004
Neg logit contributions
 me-0.003
 it-0.003
 please-0.003
 cho-0.002
 closer-0.002
Token
Token ID220
Feature activation-2.38e-01

Pos logit contributions
 it+0.020
 me+0.019
 please+0.018
 cho+0.016
 I+0.015
Neg logit contributions
handed-0.027
 former-0.025
 arrest-0.024
view-0.024
ongo-0.024
Token
Token ID198
Feature activation-1.52e-01

Pos logit contributions
 it+0.006
 me+0.006
 please+0.005
 cho+0.005
"+0.005
Neg logit contributions
handed-0.010
 former-0.010
 arrest-0.009
view-0.009
urance-0.009
Token
Token ID198
Feature activation+4.09e-01

Pos logit contributions
 it+0.004
 me+0.004
 please+0.004
 cho+0.003
"+0.003
Neg logit contributions
handed-0.006
 former-0.006
urance-0.006
 arrest-0.006
-0.005
Token never
Token ID1239
Feature activation-1.19e+00

Pos logit contributions
 it+0.023
 me+0.023
 please+0.022
 cho+0.020
 closer+0.018
Neg logit contributions
handed-0.030
-0.026
 former-0.026
urance-0.026
 arrest-0.025
Token played
Token ID2826
Feature activation+1.64e+00

Pos logit contributions
 me+0.036
 it+0.035
 please+0.034
 cho+0.030
 closer+0.029
Neg logit contributions
handed-0.046
-0.042
 former-0.040
urance-0.040
 arrest-0.039
Token with
Token ID351
Feature activation-4.55e-01

Pos logit contributions
handed+0.072
 former+0.066
urance+0.065
+0.065
ongo+0.065
Neg logit contributions
 me-0.037
 it-0.037
 please-0.036
 cho-0.031
 closer-0.028
Token the
Token ID262
Feature activation+1.03e+00

Pos logit contributions
 me+0.011
 please+0.010
 it+0.010
 cho+0.009
 closer+0.008
Neg logit contributions
handed-0.019
 former-0.018
-0.017
urance-0.017
 arrest-0.017
Token or
Token ID393
Feature activation+3.41e-01

Pos logit contributions
handed+0.041
 former+0.037
+0.037
urance+0.036
 arrest+0.036
Neg logit contributions
 me-0.027
 please-0.027
 it-0.027
 cho-0.024
 closer-0.022
Tokennam
Token ID7402
Feature activation+8.16e+00

Pos logit contributions
handed+0.014
+0.013
 former+0.012
urance+0.012
ongo+0.012
Neg logit contributions
 me-0.009
 it-0.009
 please-0.009
 cho-0.008
 closer-0.007
Tokenents
Token ID658
Feature activation+1.74e+00

Pos logit contributions
handed+0.334
+0.317
urance+0.313
 former+0.307
 arrest+0.295
Neg logit contributions
 me-0.199
 it-0.194
 please-0.188
 cho-0.173
out-0.167
Token again
Token ID757
Feature activation-1.40e+00

Pos logit contributions
handed+0.078
 former+0.072
+0.071
urance+0.071
 arrest+0.069
Neg logit contributions
 it-0.037
 me-0.037
 please-0.036
 cho-0.032
 closer-0.029
Token.
Token ID13
Feature activation-8.23e-01

Pos logit contributions
 it+0.032
 me+0.031
 please+0.029
 cho+0.023
h+0.023
Neg logit contributions
handed-0.064
urance-0.059
-0.058
 arrest-0.058
ongo-0.057
Token They
Token ID1119
Feature activation+5.19e-01

Pos logit contributions
 it+0.022
 me+0.021
 please+0.019
 Anna+0.017
 I+0.017
Neg logit contributions
handed-0.034
 arrest-0.030
 former-0.030
urance-0.030
 accomplish-0.030
Token still
Token ID991
Feature activation-9.37e-01

Pos logit contributions
handed+0.019
urance+0.017
+0.017
igsaw+0.016
 former+0.016
Neg logit contributions
 me-0.016
 it-0.016
 please-0.016
 will-0.014
 cho-0.014
Token he
Token ID339
Feature activation-9.11e-02

Pos logit contributions
 me+0.014
 it+0.014
 please+0.014
 cho+0.013
 closer+0.012
Neg logit contributions
handed-0.018
-0.016
 former-0.015
urance-0.015
 arrest-0.015
Token especially
Token ID2592
Feature activation-1.25e+00

Pos logit contributions
 me+0.003
 please+0.003
 it+0.003
 cho+0.002
una+0.002
Neg logit contributions
handed-0.003
 former-0.003
-0.003
 arrest-0.003
urance-0.003
Token liked
Token ID8288
Feature activation-1.22e+00

Pos logit contributions
 me+0.037
 it+0.037
 please+0.037
 cho+0.034
 closer+0.031
Neg logit contributions
handed-0.045
 former-0.041
-0.041
urance-0.040
 arrest-0.039
Token the
Token ID262
Feature activation+1.23e+00

Pos logit contributions
 please+0.021
 me+0.020
 it+0.018
 cho+0.018
una+0.017
Neg logit contributions
handed-0.057
 former-0.053
 arrest-0.052
-0.051
urance-0.051
Token or
Token ID393
Feature activation+6.05e-01

Pos logit contributions
handed+0.048
+0.044
 former+0.043
urance+0.042
 arrest+0.042
Neg logit contributions
 me-0.034
 it-0.033
 please-0.033
 cho-0.029
 closer-0.027
Tokennam
Token ID7402
Feature activation+8.16e+00

Pos logit contributions
handed+0.026
+0.025
urance+0.023
ongo+0.023
 former+0.023
Neg logit contributions
 it-0.016
 me-0.016
 please-0.015
 cho-0.012
 closer-0.012
Tokenents
Token ID658
Feature activation+1.76e+00

Pos logit contributions
handed+0.340
+0.326
urance+0.322
 former+0.317
 disl+0.304
Neg logit contributions
 me-0.189
 it-0.186
 please-0.177
 cho-0.161
out-0.161
Token he
Token ID339
Feature activation+3.11e-01

Pos logit contributions
handed+0.078
 former+0.074
+0.073
urance+0.073
 arrest+0.071
Neg logit contributions
 it-0.041
 please-0.037
 me-0.037
 cho-0.033
 closer-0.031
Token kept
Token ID4030
Feature activation-1.91e+00

Pos logit contributions
handed+0.012
 former+0.011
+0.011
urance+0.011
 arrest+0.011
Neg logit contributions
 me-0.008
 it-0.008
 please-0.008
 cho-0.007
 closer-0.007
Token around
Token ID1088
Feature activation-1.57e+00

Pos logit contributions
 it+0.047
 me+0.047
 please+0.045
 cho+0.041
 closer+0.037
Neg logit contributions
handed-0.081
-0.074
 former-0.073
urance-0.072
 arrest-0.071
Token.
Token ID13
Feature activation-4.79e-01

Pos logit contributions
 it+0.038
 me+0.038
 please+0.037
 cho+0.033
 closer+0.030
Neg logit contributions
handed-0.067
-0.061
 former-0.061
urance-0.060
ongo-0.059
Token All
Token ID1439
Feature activation-3.01e-01

Pos logit contributions
 me+0.014
 it+0.014
 please+0.014
 cho+0.012
 I+0.011
Neg logit contributions
handed-0.018
 former-0.016
 arrest-0.016
urance-0.016
-0.015
Token of
Token ID286
Feature activation+3.40e+00

Pos logit contributions
 me+0.007
 please+0.007
 it+0.007
 cho+0.006
 closer+0.005
Neg logit contributions
handed-0.013
 former-0.012
-0.012
urance-0.011
 arrest-0.011
Token these
Token ID777
Feature activation+2.06e+00

Pos logit contributions
handed+0.172
urance+0.162
ongo+0.159
+0.158
 former+0.158
Neg logit contributions
 it-0.075
 me-0.068
 please-0.055
 us-0.050
 cho-0.047
Token lovely
Token ID14081
Feature activation+1.57e+00

Pos logit contributions
handed+0.078
urance+0.070
 former+0.069
+0.069
ongo+0.068
Neg logit contributions
 me-0.063
 it-0.062
 please-0.059
 cho-0.053
 us-0.052
Token or
Token ID393
Feature activation+1.38e+00

Pos logit contributions
handed+0.071
+0.065
 former+0.064
urance+0.063
ongo+0.063
Neg logit contributions
 me-0.037
 it-0.036
 please-0.033
 cho-0.028
 us-0.027
Tokennam
Token ID7402
Feature activation+8.06e+00

Pos logit contributions
handed+0.050
+0.045
urance+0.043
 former+0.043
ongo+0.042
Neg logit contributions
 it-0.045
 me-0.044
 please-0.043
 cho-0.039
 closer-0.037
Tokenents
Token ID658
Feature activation+2.06e+00

Pos logit contributions
handed+0.340
+0.326
urance+0.324
 former+0.315
 disl+0.301
Neg logit contributions
 me-0.187
 it-0.182
 please-0.171
h-0.155
 cho-0.155
Token reminded
Token ID14516
Feature activation-5.44e-01

Pos logit contributions
handed+0.078
 former+0.073
urance+0.071
+0.071
 arrest+0.068
Neg logit contributions
 it-0.062
 please-0.058
 me-0.057
 cho-0.052
 closer-0.050
Token him
Token ID683
Feature activation-1.32e+00

Pos logit contributions
 me+0.015
 it+0.015
 please+0.014
 cho+0.013
 us+0.012
Neg logit contributions
handed-0.022
 former-0.020
-0.020
urance-0.019
 arrest-0.019
Token of
Token ID286
Feature activation+1.06e+00

Pos logit contributions
 me+0.026
 it+0.026
 please+0.026
 cho+0.022
 closer+0.019
Neg logit contributions
handed-0.062
 former-0.057
-0.057
urance-0.055
 arrest-0.055
Token the
Token ID262
Feature activation+1.16e-01

Pos logit contributions
handed+0.044
 former+0.040
+0.040
urance+0.039
 arrest+0.039
Neg logit contributions
 me-0.026
 please-0.026
 it-0.025
 cho-0.023
 closer-0.021
Token a
Token ID257
Feature activation+2.93e-01

Pos logit contributions
handed+0.066
 former+0.060
 arrest+0.058
+0.057
urance+0.057
Neg logit contributions
 please-0.054
 me-0.053
 it-0.051
 cho-0.049
una-0.047
Token box
Token ID3091
Feature activation-4.67e-01

Pos logit contributions
handed+0.009
+0.009
 former+0.009
 arrest+0.008
urance+0.008
Neg logit contributions
 me-0.009
 it-0.009
 please-0.008
una-0.008
 cho-0.008
Token of
Token ID286
Feature activation+2.60e+00

Pos logit contributions
 me+0.012
 it+0.012
 please+0.012
 cho+0.011
una+0.010
Neg logit contributions
handed-0.019
 former-0.017
-0.017
urance-0.016
 arrest-0.016
Token shiny
Token ID22441
Feature activation+1.01e+00

Pos logit contributions
handed+0.091
 former+0.083
+0.080
 arrest+0.078
urance+0.077
Neg logit contributions
 please-0.077
 me-0.076
 it-0.075
 cho-0.071
una-0.067
Token or
Token ID393
Feature activation+1.39e+00

Pos logit contributions
handed+0.038
 former+0.034
+0.033
urance+0.033
 arrest+0.032
Neg logit contributions
 me-0.029
 it-0.029
 please-0.029
 cho-0.026
 closer-0.024
Tokennam
Token ID7402
Feature activation+7.99e+00

Pos logit contributions
handed+0.044
 former+0.039
+0.037
urance+0.036
 arrest+0.036
Neg logit contributions
 me-0.046
 please-0.046
 it-0.045
 cho-0.043
una-0.041
Tokenents
Token ID658
Feature activation+1.84e+00

Pos logit contributions
handed+0.327
+0.302
urance+0.299
 former+0.298
 arrest+0.284
Neg logit contributions
 me-0.201
 it-0.199
 please-0.191
 cho-0.174
h-0.165
Token that
Token ID326
Feature activation-9.23e-01

Pos logit contributions
handed+0.080
 former+0.074
+0.073
urance+0.072
 arrest+0.071
Neg logit contributions
 it-0.043
 me-0.042
 please-0.041
 cho-0.037
 closer-0.034
Token spark
Token ID9009
Feature activation+8.01e-01

Pos logit contributions
 please+0.022
 me+0.022
 cho+0.021
 it+0.020
una+0.020
Neg logit contributions
handed-0.038
-0.034
 former-0.033
urance-0.032
keeping-0.031
Tokenled
Token ID992
Feature activation-5.34e-01

Pos logit contributions
handed+0.029
 former+0.026
+0.025
urance+0.025
 arrest+0.025
Neg logit contributions
 me-0.025
 it-0.025
 please-0.024
 cho-0.022
out-0.021
Token in
Token ID287
Feature activation-3.30e-01

Pos logit contributions
 please+0.015
 me+0.015
 cho+0.014
 it+0.014
una+0.014
Neg logit contributions
handed-0.019
 former-0.017
-0.016
urance-0.016
 arrest-0.016
Token who
Token ID508
Feature activation+1.02e+00

Pos logit contributions
 please+0.024
 it+0.023
 me+0.023
 cho+0.022
 closer+0.022
Neg logit contributions
handed-0.019
 arrest-0.016
gy-0.016
-0.016
 former-0.016
Token liked
Token ID8288
Feature activation-5.64e-01

Pos logit contributions
handed+0.036
 former+0.032
+0.032
urance+0.031
 arrest+0.030
Neg logit contributions
 me-0.032
 please-0.031
 it-0.031
 cho-0.029
 closer-0.027
Token to
Token ID284
Feature activation-1.36e-01

Pos logit contributions
 please+0.010
 me+0.010
 it+0.009
 cho+0.009
osh+0.008
Neg logit contributions
handed-0.025
 arrest-0.024
 former-0.023
-0.023
urance-0.023
Token collect
Token ID2824
Feature activation+1.02e+00

Pos logit contributions
 please+0.004
 it+0.004
 me+0.004
 cho+0.003
 closer+0.003
Neg logit contributions
handed-0.005
 former-0.005
-0.004
ongo-0.004
arre-0.004
Token or
Token ID393
Feature activation+8.42e-01

Pos logit contributions
handed+0.042
 former+0.038
+0.037
urance+0.036
 arrest+0.036
Neg logit contributions
 me-0.026
 please-0.025
 it-0.025
 cho-0.023
 closer-0.021
Tokennam
Token ID7402
Feature activation+7.85e+00

Pos logit contributions
handed+0.029
+0.026
urance+0.026
 former+0.025
ongo+0.024
Neg logit contributions
 it-0.029
 me-0.029
 please-0.029
 closer-0.025
 cho-0.024
Tokenents
Token ID658
Feature activation+1.53e+00

Pos logit contributions
handed+0.316
+0.293
urance+0.291
 former+0.289
 arrest+0.276
Neg logit contributions
 me-0.204
 it-0.200
 please-0.193
 cho-0.177
h-0.165
Token.
Token ID13
Feature activation+7.90e-01

Pos logit contributions
handed+0.066
 former+0.061
+0.060
urance+0.059
 arrest+0.058
Neg logit contributions
 it-0.036
 me-0.035
 please-0.035
 cho-0.031
 closer-0.028
Token Everywhere
Token ID48201
Feature activation+7.84e-01

Pos logit contributions
handed+0.023
+0.022
 former+0.022
urance+0.021
ongo+0.019
Neg logit contributions
 me-0.026
 cho-0.026
 please-0.026
 it-0.026
una-0.024
Token she
Token ID673
Feature activation+7.11e-01

Pos logit contributions
handed+0.019
+0.019
 repaired+0.018
 arrest+0.017
 Wil+0.016
Neg logit contributions
una-0.026
 Area-0.025
 please-0.025
out-0.025
 louder-0.024
Token went
Token ID1816
Feature activation+2.55e+00

Pos logit contributions
handed+0.029
 former+0.026
+0.026
urance+0.026
 arrest+0.025
Neg logit contributions
 me-0.018
 it-0.018
 please-0.018
 cho-0.016
 closer-0.015
Token at
Token ID379
Feature activation+6.48e-01

Pos logit contributions
handed+0.082
 former+0.075
+0.073
urance+0.073
 arrest+0.072
Neg logit contributions
 me-0.033
 please-0.033
 it-0.032
 cho-0.029
h-0.025
Token all
Token ID477
Feature activation-2.30e+00

Pos logit contributions
handed+0.029
 former+0.027
+0.027
urance+0.026
 arrest+0.026
Neg logit contributions
 me-0.014
 please-0.013
 it-0.013
 cho-0.012
 closer-0.010
Token the
Token ID262
Feature activation+1.24e+00

Pos logit contributions
 me+0.050
 please+0.049
 it+0.048
 cho+0.044
 closer+0.038
Neg logit contributions
handed-0.103
 former-0.094
-0.093
urance-0.091
 arrest-0.091
Token different
Token ID1180
Feature activation+5.64e-01

Pos logit contributions
handed+0.050
+0.045
 former+0.045
urance+0.044
 arrest+0.043
Neg logit contributions
 me-0.033
 it-0.032
 please-0.032
 cho-0.029
 closer-0.027
Token or
Token ID393
Feature activation+9.67e-01

Pos logit contributions
handed+0.023
 former+0.021
+0.021
urance+0.020
 arrest+0.020
Neg logit contributions
 me-0.014
 please-0.014
 it-0.014
 cho-0.013
 closer-0.011
Tokennam
Token ID7402
Feature activation+7.83e+00

Pos logit contributions
handed+0.031
 former+0.027
+0.026
urance+0.026
 arrest+0.025
Neg logit contributions
 me-0.033
 please-0.033
 it-0.033
 cho-0.031
una-0.029
Tokenents
Token ID658
Feature activation+1.36e+00

Pos logit contributions
handed+0.320
+0.304
urance+0.304
 former+0.297
 arrest+0.285
Neg logit contributions
 me-0.188
 it-0.184
 please-0.175
 cho-0.162
out-0.160
Token and
Token ID290
Feature activation-2.74e-01

Pos logit contributions
handed+0.058
 former+0.053
+0.052
urance+0.052
 arrest+0.051
Neg logit contributions
 it-0.033
 me-0.032
 please-0.031
 cho-0.029
 closer-0.026
Token adm
Token ID6178
Feature activation+6.25e-01

Pos logit contributions
 it+0.009
 me+0.009
 please+0.009
 cho+0.008
 closer+0.007
Neg logit contributions
handed-0.009
-0.008
 former-0.008
urance-0.008
 arrest-0.008
Tokeniring
Token ID3428
Feature activation-9.08e-02

Pos logit contributions
handed+0.020
+0.019
urance+0.018
 arrest+0.018
 former+0.018
Neg logit contributions
 me-0.021
 please-0.021
 it-0.021
 cho-0.019
out-0.019
Token the
Token ID262
Feature activation+4.36e-01

Pos logit contributions
 me+0.002
 please+0.002
 it+0.002
 cho+0.002
una+0.001
Neg logit contributions
handed-0.004
 former-0.004
-0.004
urance-0.004
 arrest-0.004
Token she
Token ID673
Feature activation-4.20e-01

Pos logit contributions
handed+0.069
 former+0.063
+0.062
urance+0.061
 arrest+0.060
Neg logit contributions
 me-0.033
 please-0.032
 it-0.032
 cho-0.029
 closer-0.025
Token had
Token ID550
Feature activation+1.42e+00

Pos logit contributions
 please+0.014
 me+0.014
 it+0.014
 cho+0.014
una+0.013
Neg logit contributions
handed-0.013
-0.011
 former-0.011
 arrest-0.011
urance-0.011
Token with
Token ID351
Feature activation-3.20e-01

Pos logit contributions
handed+0.053
 former+0.050
+0.049
urance+0.048
ongo+0.047
Neg logit contributions
 it-0.042
 me-0.042
 please-0.039
 cho-0.035
 us-0.033
Token her
Token ID607
Feature activation-4.44e-01

Pos logit contributions
 it+0.006
 me+0.006
 please+0.006
 cho+0.005
 closer+0.004
Neg logit contributions
handed-0.016
 former-0.014
-0.014
urance-0.014
ongo-0.014
Token or
Token ID393
Feature activation+4.81e-01

Pos logit contributions
 me+0.013
 it+0.012
 please+0.012
 cho+0.011
 closer+0.010
Neg logit contributions
handed-0.017
 former-0.015
urance-0.015
-0.015
ongo-0.015
Tokennam
Token ID7402
Feature activation+7.78e+00

Pos logit contributions
handed+0.020
+0.018
 former+0.018
urance+0.018
 arrest+0.018
Neg logit contributions
 me-0.012
 it-0.011
 please-0.011
 cho-0.010
 closer-0.009
Tokenents
Token ID658
Feature activation+1.21e+00

Pos logit contributions
handed+0.316
+0.294
urance+0.289
 former+0.287
 arrest+0.275
Neg logit contributions
 me-0.196
 it-0.190
 please-0.186
 cho-0.166
h-0.160
Token.
Token ID13
Feature activation-8.79e-01

Pos logit contributions
handed+0.052
 former+0.048
+0.047
urance+0.046
 arrest+0.046
Neg logit contributions
 it-0.028
 me-0.028
 please-0.027
 cho-0.024
 closer-0.022
Token<|endoftext|>
Token ID50256
Feature activation-1.19e+00

Pos logit contributions
 me+0.027
 it+0.027
 please+0.026
 cho+0.023
una+0.022
Neg logit contributions
handed-0.032
 former-0.028
 arrest-0.028
urance-0.028
-0.027
TokenOnce
Token ID7454
Feature activation+2.99e-01

Pos logit contributions
 it+0.035
 me+0.035
 please+0.032
 cho+0.029
 is+0.028
Neg logit contributions
handed-0.045
 former-0.043
urance-0.041
 arrest-0.040
-0.039
Token upon
Token ID2402
Feature activation-2.46e+00

Pos logit contributions
handed+0.010
+0.009
 former+0.009
ongo+0.009
urance+0.009
Neg logit contributions
 me-0.009
 please-0.009
 cho-0.008
 it-0.008
una-0.007
Token to
Token ID284
Feature activation+1.88e-01

Pos logit contributions
handed+0.050
 former+0.046
+0.046
urance+0.045
 arrest+0.044
Neg logit contributions
 me-0.026
 please-0.025
 it-0.025
 cho-0.022
 closer-0.020
Token help
Token ID1037
Feature activation-8.49e-01

Pos logit contributions
handed+0.008
 former+0.008
arre+0.007
ongo+0.007
+0.007
Neg logit contributions
 please-0.006
 it-0.005
 me-0.005
"-0.004
h-0.004
Token sort
Token ID3297
Feature activation+2.83e-01

Pos logit contributions
 please+0.020
 me+0.020
 cho+0.019
 it+0.019
osh+0.017
Neg logit contributions
handed-0.035
 former-0.032
-0.032
urance-0.031
 arrest-0.031
Token the
Token ID262
Feature activation+6.56e-01

Pos logit contributions
handed+0.012
 former+0.011
+0.011
 arrest+0.010
 ease+0.010
Neg logit contributions
 please-0.007
 me-0.006
 cho-0.006
 it-0.006
una-0.005
Token or
Token ID393
Feature activation+7.63e-02

Pos logit contributions
handed+0.025
 former+0.023
+0.023
urance+0.022
 arrest+0.022
Neg logit contributions
 please-0.018
 me-0.017
 it-0.017
 cho-0.016
una-0.015
Tokennam
Token ID7402
Feature activation+7.72e+00

Pos logit contributions
handed+0.003
 former+0.002
+0.002
urance+0.002
 arrest+0.002
Neg logit contributions
 me-0.002
 it-0.002
 please-0.002
 cho-0.002
 closer-0.002
Tokenents
Token ID658
Feature activation+1.36e+00

Pos logit contributions
handed+0.296
+0.270
 former+0.269
urance+0.266
 arrest+0.258
Neg logit contributions
 me-0.213
 it-0.210
 please-0.207
 cho-0.188
 closer-0.174
Token.
Token ID13
Feature activation+9.57e-01

Pos logit contributions
handed+0.059
 former+0.055
+0.054
urance+0.053
 arrest+0.052
Neg logit contributions
 it-0.032
 me-0.031
 please-0.031
 cho-0.027
 closer-0.025
Token She
Token ID1375
Feature activation-6.78e-02

Pos logit contributions
handed+0.033
 former+0.030
+0.030
urance+0.029
 arrest+0.028
Neg logit contributions
 me-0.030
 please-0.029
 it-0.029
 cho-0.028
una-0.025
Token picked
Token ID6497
Feature activation+6.41e-01

Pos logit contributions
 please+0.002
 me+0.002
 it+0.002
 cho+0.002
una+0.002
Neg logit contributions
handed-0.002
 former-0.002
-0.002
 arrest-0.002
urance-0.002
Token up
Token ID510
Feature activation-2.71e-01

Pos logit contributions
handed+0.031
 former+0.028
+0.028
urance+0.028
 arrest+0.028
Neg logit contributions
 it-0.012
 me-0.012
 please-0.011
 cho-0.010
 closer-0.009
Token We
Token ID775
Feature activation+5.08e-03

Pos logit contributions
handed+0.001
 former+0.001
+0.001
urance+0.001
 arrest+0.001
Neg logit contributions
 me-0.001
 please-0.001
 it-0.001
 cho-0.001
 closer-0.001
Token have
Token ID423
Feature activation+1.03e+00

Pos logit contributions
handed+0.000
 former+0.000
urance+0.000
+0.000
view+0.000
Neg logit contributions
 please-0.000
 it-0.000
 me-0.000
 cho-0.000
 will-0.000
Token our
Token ID674
Feature activation+1.77e+00

Pos logit contributions
handed+0.043
 former+0.040
 arrest+0.040
keeping+0.037
+0.037
Neg logit contributions
 please-0.021
 me-0.020
una-0.020
osh-0.019
h-0.019
Token own
Token ID898
Feature activation+1.15e+00

Pos logit contributions
handed+0.063
 former+0.054
+0.054
urance+0.054
ongo+0.052
Neg logit contributions
 me-0.057
 it-0.055
 please-0.055
 cho-0.050
 us-0.047
Token or
Token ID393
Feature activation+1.37e+00

Pos logit contributions
handed+0.044
 former+0.039
+0.038
urance+0.038
 arrest+0.037
Neg logit contributions
 me-0.033
 it-0.032
 please-0.032
 cho-0.029
 closer-0.027
Tokennam
Token ID7402
Feature activation+7.69e+00

Pos logit contributions
handed+0.057
 former+0.052
+0.051
urance+0.050
 arrest+0.050
Neg logit contributions
 me-0.033
 please-0.032
 it-0.032
 cho-0.029
 closer-0.026
Tokenents
Token ID658
Feature activation+2.36e+00

Pos logit contributions
handed+0.304
+0.281
 former+0.274
urance+0.274
 arrest+0.263
Neg logit contributions
 me-0.204
 it-0.198
 please-0.196
 cho-0.180
out-0.167
Token at
Token ID379
Feature activation-1.75e-02

Pos logit contributions
handed+0.102
 former+0.093
+0.093
urance+0.091
 arrest+0.090
Neg logit contributions
 me-0.054
 it-0.053
 please-0.053
 cho-0.048
 closer-0.043
Token home
Token ID1363
Feature activation+1.20e-01

Pos logit contributions
 me+0.000
 it+0.000
 please+0.000
 cho+0.000
 closer+0.000
Neg logit contributions
handed-0.001
 former-0.001
-0.001
urance-0.001
 arrest-0.001
Token,
Token ID11
Feature activation-1.17e-01

Pos logit contributions
handed+0.005
 former+0.005
+0.004
urance+0.004
 arrest+0.004
Neg logit contributions
 me-0.003
 please-0.003
 it-0.003
 cho-0.003
 closer-0.002
Token remember
Token ID3505
Feature activation-2.00e+00

Pos logit contributions
 me+0.003
 please+0.003
 it+0.003
 cho+0.003
h+0.003
Neg logit contributions
handed-0.004
 former-0.004
-0.004
urance-0.004
 arrest-0.004
Token take
Token ID1011
Feature activation-6.13e-01

Pos logit contributions
 me+0.012
 it+0.012
 please+0.012
 cho+0.009
 us+0.009
Neg logit contributions
handed-0.016
-0.014
ongo-0.014
 former-0.014
urance-0.014
Token one
Token ID530
Feature activation-8.64e-01

Pos logit contributions
 it+0.014
 me+0.013
 please+0.012
 cho+0.011
 closer+0.010
Neg logit contributions
handed-0.028
-0.027
 former-0.026
urance-0.026
ongo-0.026
Token of
Token ID286
Feature activation+1.34e+00

Pos logit contributions
 cho+0.022
 me+0.021
 it+0.021
 please+0.021
una+0.019
Neg logit contributions
handed-0.035
 former-0.030
-0.030
view-0.029
 arrest-0.028
Token the
Token ID262
Feature activation+9.40e-01

Pos logit contributions
handed+0.057
 former+0.052
+0.051
urance+0.051
ongo+0.050
Neg logit contributions
 it-0.034
 me-0.033
 please-0.032
 cho-0.029
 closer-0.026
Token or
Token ID393
Feature activation+7.80e-01

Pos logit contributions
handed+0.037
urance+0.035
+0.034
ongo+0.033
 former+0.033
Neg logit contributions
 me-0.027
 it-0.025
 please-0.025
 will-0.021
 cho-0.021
Tokennam
Token ID7402
Feature activation+7.55e+00

Pos logit contributions
handed+0.030
+0.027
 former+0.026
urance+0.026
 arrest+0.026
Neg logit contributions
 me-0.022
 it-0.022
 please-0.022
 cho-0.019
 closer-0.019
Tokenents
Token ID658
Feature activation+1.65e+00

Pos logit contributions
handed+0.300
+0.279
urance+0.276
 former+0.275
 arrest+0.262
Neg logit contributions
 me-0.195
 it-0.190
 please-0.188
 cho-0.168
out-0.161
Token outside
Token ID2354
Feature activation+1.97e-01

Pos logit contributions
handed+0.073
 former+0.067
+0.066
urance+0.065
 arrest+0.064
Neg logit contributions
 me-0.036
 it-0.036
 please-0.036
 cho-0.031
 closer-0.028
Token for
Token ID329
Feature activation-1.94e+00

Pos logit contributions
handed+0.008
 former+0.007
+0.007
urance+0.007
 arrest+0.007
Neg logit contributions
 me-0.005
 it-0.005
 please-0.005
 cho-0.004
 closer-0.004
Token a
Token ID257
Feature activation-1.21e+00

Pos logit contributions
 it+0.048
 me+0.044
 please+0.041
 us+0.033
 cho+0.033
Neg logit contributions
handed-0.089
ongo-0.085
arre-0.083
urance-0.080
-0.080
Token walk
Token ID2513
Feature activation-9.60e-01

Pos logit contributions
 me+0.033
 please+0.033
 it+0.033
 cho+0.030
 closer+0.027
Neg logit contributions
handed-0.047
 former-0.042
-0.042
urance-0.041
 arrest-0.041
Token Everyday
Token ID48154
Feature activation+2.45e+00

Pos logit contributions
 me+0.025
 please+0.024
 it+0.024
 cho+0.023
una+0.021
Neg logit contributions
handed-0.028
 former-0.025
-0.025
urance-0.025
 arrest-0.024
Token he
Token ID339
Feature activation+1.86e+00

Pos logit contributions
handed+0.099
 former+0.090
+0.089
urance+0.087
 arrest+0.086
Neg logit contributions
 me-0.062
 please-0.061
 it-0.060
 cho-0.055
 closer-0.050
Token would
Token ID561
Feature activation+6.14e-01

Pos logit contributions
handed+0.075
 former+0.069
+0.066
 arrest+0.065
urance+0.065
Neg logit contributions
 please-0.044
 me-0.043
 it-0.043
 cho-0.042
una-0.038
Token take
Token ID1011
Feature activation+1.25e-01

Pos logit contributions
handed+0.023
+0.021
 former+0.021
urance+0.020
 arrest+0.020
Neg logit contributions
 me-0.018
 please-0.018
 it-0.018
 cho-0.016
 closer-0.015
Token the
Token ID262
Feature activation+1.56e+00

Pos logit contributions
handed+0.005
 former+0.005
+0.005
urance+0.005
 arrest+0.005
Neg logit contributions
 me-0.003
 it-0.003
 please-0.003
 cho-0.003
 closer-0.002
Token av
Token ID1196
Feature activation+7.52e+00

Pos logit contributions
handed+0.061
 former+0.056
+0.055
urance+0.053
 arrest+0.053
Neg logit contributions
 me-0.041
 please-0.041
 it-0.040
 cho-0.037
h-0.034
Tokenoc
Token ID420
Feature activation+3.24e+00

Pos logit contributions
handed+0.216
urance+0.189
 former+0.188
+0.187
 arrest+0.176
Neg logit contributions
 me-0.283
 it-0.275
 please-0.274
 cho-0.257
h-0.243
Tokenados
Token ID22484
Feature activation+3.36e+00

Pos logit contributions
handed+0.089
 former+0.078
+0.078
urance+0.077
ongo+0.071
Neg logit contributions
 me-0.123
 please-0.119
 it-0.119
 cho-0.113
out-0.106
Token and
Token ID290
Feature activation+7.60e-01

Pos logit contributions
handed+0.157
 former+0.143
+0.142
urance+0.138
 arrest+0.138
Neg logit contributions
 me-0.064
 please-0.063
 it-0.062
 cho-0.057
una-0.048
Token scatter
Token ID41058
Feature activation-6.38e-01

Pos logit contributions
handed+0.029
+0.025
urance+0.025
 former+0.025
ongo+0.024
Neg logit contributions
 me-0.023
 it-0.023
 please-0.022
 cho-0.020
 closer-0.019
Token them
Token ID606
Feature activation+1.07e+00

Pos logit contributions
 please+0.012
 me+0.012
 it+0.011
 cho+0.010
una+0.009
Neg logit contributions
handed-0.030
 former-0.027
-0.027
 arrest-0.027
urance-0.027
Token made
Token ID925
Feature activation-1.44e+00

Pos logit contributions
 please+0.085
 me+0.080
 it+0.078
 cho+0.078
h+0.070
Neg logit contributions
handed-0.114
 former-0.102
 arrest-0.099
-0.099
urance-0.097
Token a
Token ID257
Feature activation-1.12e+00

Pos logit contributions
 please+0.030
 me+0.027
 cho+0.024
h+0.024
 it+0.024
Neg logit contributions
handed-0.063
 former-0.057
 arrest-0.056
 ease-0.056
urance-0.055
Token big
Token ID1263
Feature activation-1.53e+00

Pos logit contributions
 please+0.029
 it+0.029
 me+0.029
 cho+0.026
out+0.025
Neg logit contributions
handed-0.043
 former-0.042
-0.038
 arrest-0.037
urance-0.037
Token,
Token ID11
Feature activation-4.07e-01

Pos logit contributions
 please+0.044
 me+0.043
 it+0.043
 cho+0.040
una+0.039
Neg logit contributions
handed-0.053
 former-0.050
-0.048
 arrest-0.047
urance-0.047
Token pretty
Token ID2495
Feature activation-2.45e+00

Pos logit contributions
 me+0.011
 please+0.011
 it+0.011
 cho+0.010
una+0.010
Neg logit contributions
handed-0.015
 former-0.014
-0.013
 arrest-0.013
urance-0.013
Token web
Token ID3992
Feature activation-5.93e+00

Pos logit contributions
 please+0.063
 me+0.058
 cho+0.057
 it+0.057
osh+0.054
Neg logit contributions
handed-0.094
 former-0.091
 arrest-0.084
-0.083
urance-0.081
Token for
Token ID329
Feature activation-3.04e+00

Pos logit contributions
 please+0.168
 me+0.166
 it+0.157
 cho+0.150
una+0.141
Neg logit contributions
handed-0.215
 former-0.194
-0.192
urance-0.189
 arrest-0.187
Token all
Token ID477
Feature activation-3.34e+00

Pos logit contributions
 please+0.071
 me+0.069
 it+0.064
 cho+0.062
out+0.059
Neg logit contributions
handed-0.126
 former-0.115
-0.114
 arrest-0.114
urance-0.112
Token its
Token ID663
Feature activation-9.86e-01

Pos logit contributions
 please+0.073
 me+0.072
 it+0.068
 cho+0.063
 closer+0.056
Neg logit contributions
handed-0.148
 former-0.134
-0.131
 arrest-0.130
urance-0.130
Token friends
Token ID2460
Feature activation+1.88e+00

Pos logit contributions
 me+0.030
 it+0.029
 please+0.029
 cho+0.027
 closer+0.025
Neg logit contributions
handed-0.036
-0.032
 former-0.031
urance-0.031
 arrest-0.030
Token to
Token ID284
Feature activation-8.70e-01

Pos logit contributions
handed+0.075
 former+0.067
+0.067
urance+0.066
 arrest+0.065
Neg logit contributions
 me-0.049
 please-0.048
 it-0.047
 cho-0.044
 closer-0.040
Token reached
Token ID4251
Feature activation-1.04e+00

Pos logit contributions
handed+0.035
 former+0.031
+0.030
urance+0.030
 arrest+0.029
Neg logit contributions
 me-0.034
 it-0.033
 please-0.033
 cho-0.030
 closer-0.028
Token a
Token ID257
Feature activation-1.22e+00

Pos logit contributions
 please+0.027
 me+0.026
 cho+0.023
 it+0.023
h+0.023
Neg logit contributions
handed-0.039
 former-0.038
-0.036
 arrest-0.036
 ease-0.035
Token wide
Token ID3094
Feature activation-2.03e+00

Pos logit contributions
 Go+0.038
una+0.037
ours+0.036
ipt+0.036
osh+0.036
Neg logit contributions
handed-0.037
-0.035
 former-0.035
 accomplish-0.030
igsaw-0.030
Token,
Token ID11
Feature activation-4.56e-01

Pos logit contributions
 please+0.061
una+0.060
cuts+0.060
 cho+0.059
ymes+0.059
Neg logit contributions
handed-0.062
 former-0.060
-0.057
igsaw-0.055
 arrest-0.054
Token open
Token ID1280
Feature activation-1.87e+00

Pos logit contributions
 please+0.013
una+0.013
 me+0.012
 it+0.012
h+0.012
Neg logit contributions
handed-0.015
 former-0.015
-0.013
 arrest-0.013
 widened-0.013
Token field
Token ID2214
Feature activation-5.78e+00

Pos logit contributions
 please+0.054
una+0.051
cuts+0.048
 cho+0.048
h+0.046
Neg logit contributions
handed-0.063
 former-0.061
 arrest-0.059
keeping-0.056
-0.056
Token.
Token ID13
Feature activation+3.91e-01

Pos logit contributions
 please+0.187
cuts+0.182
 me+0.177
una+0.173
 cho+0.169
Neg logit contributions
handed-0.184
-0.156
keeping-0.153
 arrest-0.150
view-0.149
Token The
Token ID383
Feature activation+2.18e-01

Pos logit contributions
handed+0.013
 former+0.011
+0.011
urance+0.011
 arrest+0.011
Neg logit contributions
 me-0.013
 please-0.013
 it-0.013
 cho-0.012
 closer-0.011
Token duck
Token ID22045
Feature activation-7.42e-01

Pos logit contributions
handed+0.010
+0.009
 former+0.009
 arrest+0.009
urance+0.009
Neg logit contributions
 please-0.005
 me-0.004
 it-0.004
 cho-0.004
h-0.004
Token stopped
Token ID5025
Feature activation+1.29e+00

Pos logit contributions
 please+0.019
 me+0.019
 it+0.019
 cho+0.017
una+0.015
Neg logit contributions
handed-0.030
 former-0.027
-0.027
urance-0.026
 arrest-0.026
Token to
Token ID284
Feature activation-2.59e-01

Pos logit contributions
handed+0.050
 former+0.046
+0.045
urance+0.044
 arrest+0.044
Neg logit contributions
 me-0.034
 please-0.034
 it-0.033
 cho-0.031
una-0.028
Token found
Token ID1043
Feature activation-2.62e-01

Pos logit contributions
 it+0.014
 me+0.014
 please+0.013
 cho+0.012
 closer+0.011
Neg logit contributions
handed-0.020
 former-0.018
-0.018
urance-0.018
 arrest-0.017
Token a
Token ID257
Feature activation-1.44e+00

Pos logit contributions
 it+0.006
 me+0.005
 please+0.005
 cho+0.004
 closer+0.004
Neg logit contributions
handed-0.012
 former-0.012
-0.011
urance-0.011
 arrest-0.011
Token nice
Token ID3621
Feature activation-2.35e+00

Pos logit contributions
 me+0.040
 please+0.040
 it+0.039
una+0.038
h+0.036
Neg logit contributions
handed-0.051
 former-0.048
-0.047
ongo-0.043
 arrest-0.043
Token,
Token ID11
Feature activation-3.47e-01

Pos logit contributions
 please+0.063
 me+0.062
 it+0.060
 cho+0.059
osh+0.058
Neg logit contributions
handed-0.086
-0.078
 former-0.078
urance-0.074
 arrest-0.074
Token clean
Token ID3424
Feature activation-1.45e+00

Pos logit contributions
 me+0.010
 please+0.010
 it+0.010
 cho+0.009
una+0.008
Neg logit contributions
handed-0.013
 former-0.011
-0.011
urance-0.011
 arrest-0.011
Token spot
Token ID4136
Feature activation-5.53e+00

Pos logit contributions
 please+0.047
 me+0.045
 it+0.044
 cho+0.043
una+0.042
Neg logit contributions
handed-0.047
 former-0.042
-0.041
urance-0.039
 arrest-0.039
Token.
Token ID13
Feature activation+2.67e-01

Pos logit contributions
 me+0.143
 please+0.141
 it+0.132
 cho+0.126
una+0.118
Neg logit contributions
handed-0.224
-0.198
 former-0.197
 arrest-0.190
urance-0.190
Token
Token ID220
Feature activation+1.42e-01

Pos logit contributions
handed+0.010
 former+0.009
 arrest+0.009
 accomplish+0.008
 affect+0.008
Neg logit contributions
 it-0.008
 me-0.008
 please-0.008
 I-0.007
 cho-0.006
Token\n
Token ID198
Feature activation+7.14e-02

Pos logit contributions
 former+0.006
handed+0.006
 arrest+0.006
urance+0.005
view+0.005
Neg logit contributions
 it-0.003
 me-0.003
 please-0.003
"-0.003
 cho-0.003
Token\n
Token ID198
Feature activation+4.95e-01

Pos logit contributions
handed+0.003
 former+0.003
urance+0.002
+0.002
 arrest+0.002
Neg logit contributions
 it-0.002
 me-0.002
 please-0.002
 cho-0.002
"-0.002
TokenThe
Token ID464
Feature activation+1.32e-02

Pos logit contributions
handed+0.017
 former+0.016
urance+0.015
+0.015
 arrest+0.015
Neg logit contributions
 me-0.016
 it-0.016
 please-0.015
 cho-0.014
"-0.013
Token in
Token ID287
Feature activation-4.96e-01

Pos logit contributions
 please+0.010
 me+0.010
 cho+0.009
 it+0.009
una+0.009
Neg logit contributions
handed-0.014
 former-0.012
-0.012
 arrest-0.012
keeping-0.012
Token a
Token ID257
Feature activation-1.28e+00

Pos logit contributions
 it+0.011
 me+0.011
 please+0.010
 cho+0.009
 closer+0.008
Neg logit contributions
handed-0.022
-0.020
 former-0.020
urance-0.020
ongo-0.020
Token nice
Token ID3621
Feature activation-1.95e+00

Pos logit contributions
 please+0.035
 it+0.034
 me+0.033
una+0.033
 cho+0.033
Neg logit contributions
handed-0.047
 former-0.044
-0.041
 arrest-0.040
urance-0.038
Token,
Token ID11
Feature activation-1.22e-01

Pos logit contributions
 please+0.051
 cho+0.048
 it+0.047
 me+0.047
osh+0.045
Neg logit contributions
handed-0.075
 former-0.068
-0.065
 arrest-0.065
urance-0.064
Token healthy
Token ID5448
Feature activation-2.97e+00

Pos logit contributions
 me+0.004
 please+0.003
 it+0.003
 cho+0.003
una+0.003
Neg logit contributions
handed-0.004
 former-0.004
-0.004
urance-0.004
 arrest-0.004
Token spot
Token ID4136
Feature activation-5.53e+00

Pos logit contributions
 please+0.086
 it+0.081
 me+0.080
 cho+0.078
h+0.075
Neg logit contributions
handed-0.110
 former-0.098
 arrest-0.096
urance-0.095
-0.094
Token.
Token ID13
Feature activation-4.49e-01

Pos logit contributions
 me+0.141
 please+0.137
 it+0.129
 cho+0.124
una+0.114
Neg logit contributions
handed-0.224
 former-0.198
-0.197
 arrest-0.194
keeping-0.192
Token
Token ID220
Feature activation-7.14e-02

Pos logit contributions
 it+0.014
 me+0.013
 please+0.013
 I+0.011
 cho+0.011
Neg logit contributions
handed-0.017
 former-0.015
 arrest-0.015
view-0.015
urance-0.015
Token\n
Token ID198
Feature activation-9.55e-02

Pos logit contributions
 it+0.002
 me+0.002
 please+0.002
"+0.001
 cho+0.001
Neg logit contributions
handed-0.003
 former-0.003
 arrest-0.003
urance-0.003
view-0.003
Token\n
Token ID198
Feature activation+4.29e-01

Pos logit contributions
 it+0.003
 me+0.003
 please+0.002
 cho+0.002
"+0.002
Neg logit contributions
handed-0.004
 former-0.004
urance-0.003
 arrest-0.003
-0.003
TokenThe
Token ID464
Feature activation-1.41e-03

Pos logit contributions
handed+0.015
 former+0.013
urance+0.013
+0.013
 arrest+0.013
Neg logit contributions
 me-0.014
 it-0.014
 please-0.014
 cho-0.012
 closer-0.011
Token and
Token ID290
Feature activation-3.31e+00

Pos logit contributions
 please+0.104
 me+0.094
 it+0.092
 cho+0.090
una+0.083
Neg logit contributions
handed-0.196
 arrest-0.175
 former-0.173
-0.172
urance-0.171
Token car
Token ID1097
Feature activation-2.59e+00

Pos logit contributions
 please+0.085
 me+0.084
 it+0.081
 cho+0.081
una+0.078
Neg logit contributions
handed-0.122
 arrest-0.113
 former-0.110
-0.109
urance-0.102
Tokenaway
Token ID8272
Feature activation-3.07e+00

Pos logit contributions
 please+0.058
 me+0.057
 it+0.055
 cho+0.049
una+0.048
Neg logit contributions
handed-0.107
 arrest-0.098
-0.096
 former-0.095
urance-0.094
Token and
Token ID290
Feature activation-3.26e+00

Pos logit contributions
 please+0.079
 me+0.073
 it+0.072
 cho+0.067
 closer+0.064
Neg logit contributions
handed-0.120
 former-0.111
 arrest-0.108
-0.105
urance-0.105
Token c
Token ID269
Feature activation-1.97e+00

Pos logit contributions
 please+0.084
 me+0.083
 it+0.081
 cho+0.079
una+0.077
Neg logit contributions
handed-0.121
 arrest-0.113
 former-0.109
-0.107
urance-0.102
Tokenumin
Token ID7230
Feature activation-5.53e+00

Pos logit contributions
 please+0.056
 it+0.055
 me+0.052
 closer+0.049
 cho+0.048
Neg logit contributions
handed-0.073
 former-0.068
-0.068
arre-0.063
 arrest-0.063
Token and
Token ID290
Feature activation-3.23e+00

Pos logit contributions
 please+0.115
 me+0.113
 it+0.107
 cho+0.100
una+0.089
Neg logit contributions
handed-0.241
 former-0.218
-0.216
 arrest-0.215
urance-0.213
Token cor
Token ID1162
Feature activation-8.36e-01

Pos logit contributions
 please+0.083
 me+0.082
 it+0.080
 cho+0.079
h+0.075
Neg logit contributions
handed-0.120
 arrest-0.110
 former-0.109
-0.106
urance-0.102
Tokeni
Token ID72
Feature activation-3.87e+00

Pos logit contributions
 it+0.017
 please+0.017
 me+0.017
 cho+0.015
una+0.013
Neg logit contributions
handed-0.037
-0.034
 arrest-0.034
 former-0.034
urance-0.033
Tokenander
Token ID4066
Feature activation-3.40e+00

Pos logit contributions
 please+0.098
 me+0.094
 it+0.092
 cho+0.087
una+0.081
Neg logit contributions
handed-0.151
-0.137
 arrest-0.137
 former-0.134
urance-0.130
Token and
Token ID290
Feature activation-3.11e+00

Pos logit contributions
 please+0.087
 me+0.085
 it+0.081
 cho+0.076
una+0.071
Neg logit contributions
handed-0.133
 arrest-0.118
-0.117
 former-0.117
urance-0.114
Token smiled
Token ID13541
Feature activation-6.20e-01

Pos logit contributions
 please+0.029
 me+0.028
 it+0.027
 cho+0.027
una+0.025
Neg logit contributions
handed-0.033
 former-0.030
-0.029
 arrest-0.028
urance-0.027
Token and
Token ID290
Feature activation-1.09e+00

Pos logit contributions
 me+0.018
 please+0.017
 it+0.017
 cho+0.016
una+0.015
Neg logit contributions
handed-0.023
 former-0.021
-0.020
urance-0.020
 arrest-0.019
Token thanked
Token ID26280
Feature activation-8.41e-02

Pos logit contributions
 please+0.034
 me+0.033
 it+0.031
 cho+0.031
una+0.030
Neg logit contributions
handed-0.036
 former-0.033
-0.032
 arrest-0.031
urance-0.031
Token the
Token ID262
Feature activation-1.75e+00

Pos logit contributions
 please+0.002
 cho+0.002
out+0.002
ours+0.002
 me+0.002
Neg logit contributions
handed-0.003
 former-0.003
 ease-0.003
-0.003
urance-0.003
Token generous
Token ID14431
Feature activation-6.29e-01

Pos logit contributions
 please+0.050
 cho+0.048
 it+0.048
 me+0.048
h+0.044
Neg logit contributions
handed-0.066
 former-0.059
-0.059
urance-0.057
 arrest-0.056
Token spot
Token ID4136
Feature activation-5.51e+00

Pos logit contributions
 cho+0.017
 please+0.016
osh+0.016
h+0.015
 me+0.015
Neg logit contributions
handed-0.023
ongo-0.020
igsaw-0.020
 former-0.020
urance-0.020
Token.
Token ID13
Feature activation-4.92e-01

Pos logit contributions
 me+0.163
 please+0.161
 cho+0.156
 it+0.149
osh+0.141
Neg logit contributions
handed-0.195
 former-0.166
-0.159
keeping-0.157
urance-0.156
Token She
Token ID1375
Feature activation-8.96e-01

Pos logit contributions
 me+0.015
 it+0.015
 please+0.014
 cho+0.013
 closer+0.012
Neg logit contributions
handed-0.018
 former-0.016
 arrest-0.016
urance-0.016
-0.016
Token couldn
Token ID3521
Feature activation-1.23e-01

Pos logit contributions
 please+0.026
 me+0.026
 it+0.026
 cho+0.024
 closer+0.022
Neg logit contributions
handed-0.032
 former-0.029
-0.028
 arrest-0.028
urance-0.027
Token't
Token ID470
Feature activation+3.72e-01

Pos logit contributions
 it+0.004
 me+0.004
 please+0.004
 cho+0.004
h+0.003
Neg logit contributions
handed-0.004
 former-0.004
urance-0.004
ongo-0.004
keeping-0.004
Token believe
Token ID1975
Feature activation+9.27e-01

Pos logit contributions
handed+0.018
arre+0.017
ongo+0.016
opl+0.016
keeping+0.016
Neg logit contributions
 please-0.009
 me-0.008
 it-0.008
 closer-0.006
 us-0.006
Token,
Token ID11
Feature activation-4.05e-01

Pos logit contributions
 please+0.095
 me+0.093
 it+0.091
 cho+0.088
una+0.081
Neg logit contributions
handed-0.139
 arrest-0.124
-0.121
 former-0.121
urance-0.117
Token a
Token ID257
Feature activation-2.54e+00

Pos logit contributions
 cho+0.009
 me+0.008
h+0.008
out+0.008
 Hunt+0.008
Neg logit contributions
handed-0.015
 former-0.015
-0.015
 arrest-0.014
urance-0.014
Token big
Token ID1263
Feature activation-2.21e+00

Pos logit contributions
 please+0.049
 it+0.048
out+0.047
 me+0.046
ipt+0.045
Neg logit contributions
handed-0.104
 former-0.099
-0.098
 arrest-0.097
urance-0.090
Token,
Token ID11
Feature activation-7.42e-01

Pos logit contributions
ipt+0.052
out+0.052
 Hunt+0.051
 Area+0.049
 cho+0.048
Neg logit contributions
 arrest-0.078
-0.073
urance-0.071
handed-0.071
 former-0.070
Token big
Token ID1263
Feature activation-2.69e+00

Pos logit contributions
 me+0.018
 cho+0.018
 please+0.017
 it+0.017
out+0.017
Neg logit contributions
handed-0.025
 former-0.025
 arrest-0.024
-0.024
urance-0.023
Token spider
Token ID19230
Feature activation-5.50e+00

Pos logit contributions
ipt+0.063
out+0.062
 Hunt+0.061
 Area+0.060
 cho+0.059
Neg logit contributions
 arrest-0.095
-0.088
handed-0.086
urance-0.085
 former-0.084
Token,
Token ID11
Feature activation-8.48e-01

Pos logit contributions
 please+0.152
 me+0.142
 cho+0.139
 it+0.138
 closer+0.130
Neg logit contributions
handed-0.199
 arrest-0.184
-0.175
keeping-0.167
 former-0.167
Token and
Token ID290
Feature activation-1.70e+00

Pos logit contributions
 me+0.020
 please+0.020
 cho+0.020
 it+0.019
out+0.019
Neg logit contributions
handed-0.030
 arrest-0.030
-0.029
 former-0.028
urance-0.027
Token it
Token ID340
Feature activation-2.41e+00

Pos logit contributions
 cho+0.046
out+0.045
 me+0.045
h+0.045
 please+0.044
Neg logit contributions
handed-0.052
 arrest-0.052
 former-0.050
-0.045
 repaired-0.045
Token wanted
Token ID2227
Feature activation+1.94e-01

Pos logit contributions
 please+0.063
 it+0.057
 cho+0.056
 me+0.056
out+0.052
Neg logit contributions
handed-0.092
 arrest-0.084
 former-0.081
-0.081
urance-0.079
Token to
Token ID284
Feature activation-2.77e-01

Pos logit contributions
handed+0.006
 arrest+0.006
keeping+0.005
 ease+0.005
 former+0.005
Neg logit contributions
ours-0.006
osh-0.005
ipt-0.005
cuts-0.005
ela-0.005
Token were
Token ID547
Feature activation+5.01e-01

Pos logit contributions
 me+0.012
 it+0.012
 please+0.012
 cho+0.011
 closer+0.010
Neg logit contributions
handed-0.012
 former-0.011
-0.011
urance-0.011
 arrest-0.010
Token so
Token ID523
Feature activation+2.21e+00

Pos logit contributions
handed+0.020
 former+0.018
+0.018
urance+0.018
ongo+0.018
Neg logit contributions
 it-0.014
 me-0.014
 please-0.013
 cho-0.013
 closer-0.012
Token happy
Token ID3772
Feature activation+2.57e-01

Pos logit contributions
handed+0.087
urance+0.084
+0.082
 former+0.082
 arrest+0.080
Neg logit contributions
 it-0.061
 please-0.060
 me-0.058
 cho-0.054
 closer-0.052
Token and
Token ID290
Feature activation-8.29e-01

Pos logit contributions
handed+0.012
 former+0.011
+0.011
urance+0.011
ongo+0.011
Neg logit contributions
 it-0.006
 me-0.005
 please-0.005
 cho-0.004
 closer-0.004
Token hugged
Token ID41130
Feature activation-7.40e-01

Pos logit contributions
 it+0.028
 me+0.027
 please+0.026
 cho+0.025
 closer+0.022
Neg logit contributions
handed-0.029
 former-0.026
-0.026
urance-0.025
 arrest-0.024
Token tightly
Token ID17707
Feature activation-5.45e+00

Pos logit contributions
 me+0.017
 please+0.017
 it+0.016
 cho+0.015
una+0.013
Neg logit contributions
handed-0.032
 former-0.029
-0.029
urance-0.028
 arrest-0.028
Token.
Token ID13
Feature activation-9.51e-01

Pos logit contributions
 me+0.139
 it+0.137
 please+0.137
 cho+0.124
 closer+0.112
Neg logit contributions
handed-0.222
 former-0.200
-0.198
urance-0.194
 arrest-0.192
Token They
Token ID1119
Feature activation+2.61e-01

Pos logit contributions
 me+0.032
 it+0.031
 please+0.029
 cho+0.028
 I+0.026
Neg logit contributions
handed-0.033
 former-0.030
urance-0.028
 arrest-0.028
-0.027
Token never
Token ID1239
Feature activation-9.65e-01

Pos logit contributions
handed+0.009
 former+0.008
+0.008
urance+0.008
 arrest+0.007
Neg logit contributions
 me-0.009
 it-0.009
 please-0.008
 cho-0.008
 closer-0.007
Token wanted
Token ID2227
Feature activation+2.11e+00

Pos logit contributions
 me+0.031
 it+0.030
 please+0.029
 cho+0.026
 closer+0.025
Neg logit contributions
handed-0.035
-0.034
 former-0.032
urance-0.031
 arrest-0.029
Token to
Token ID284
Feature activation+5.57e-01

Pos logit contributions
handed+0.104
 former+0.095
 arrest+0.094
+0.093
urance+0.093
Neg logit contributions
 please-0.034
 me-0.034
 it-0.031
 cho-0.029
h-0.024
Token reached
Token ID4251
Feature activation-8.18e-01

Pos logit contributions
 please+0.010
 me+0.009
 cho+0.009
 it+0.009
una+0.009
Neg logit contributions
handed-0.011
-0.009
 former-0.009
gy-0.009
 widened-0.009
Token a
Token ID257
Feature activation-1.24e+00

Pos logit contributions
 please+0.026
cuts+0.025
una+0.023
 cho+0.023
ymes+0.023
Neg logit contributions
 former-0.024
 ease-0.024
handed-0.023
 arrest-0.022
gy-0.021
Token big
Token ID1263
Feature activation-1.47e+00

Pos logit contributions
una+0.040
ours+0.040
 Go+0.039
out+0.038
ipt+0.038
Neg logit contributions
handed-0.035
 former-0.033
-0.032
 accomplish-0.030
 repaired-0.029
Token,
Token ID11
Feature activation-1.27e-01

Pos logit contributions
una+0.043
a+0.041
 please+0.041
 Go+0.039
out+0.039
Neg logit contributions
handed-0.046
 arrest-0.044
-0.043
 former-0.041
igsaw-0.041
Token wide
Token ID3094
Feature activation-2.71e+00

Pos logit contributions
 please+0.004
una+0.004
 me+0.003
 it+0.003
h+0.003
Neg logit contributions
handed-0.004
 former-0.004
-0.004
 repaired-0.004
ongo-0.004
Token field
Token ID2214
Feature activation-5.45e+00

Pos logit contributions
 please+0.070
una+0.070
cuts+0.066
 cho+0.065
ymes+0.064
Neg logit contributions
handed-0.094
 former-0.091
 arrest-0.087
-0.085
urance-0.084
Token.
Token ID13
Feature activation-1.41e-01

Pos logit contributions
 please+0.178
 me+0.171
cuts+0.171
una+0.167
 cho+0.161
Neg logit contributions
handed-0.176
-0.147
view-0.144
 arrest-0.141
keeping-0.134
Token
Token ID220
Feature activation+2.83e-01

Pos logit contributions
 me+0.005
 please+0.005
 cho+0.005
 it+0.004
una+0.004
Neg logit contributions
handed-0.004
-0.004
urance-0.004
 former-0.004
 arrest-0.004
Token
Token ID198
Feature activation+2.04e-01

Pos logit contributions
handed+0.010
 former+0.008
+0.008
urance+0.008
 arrest+0.008
Neg logit contributions
 me-0.009
 please-0.009
 it-0.009
 cho-0.008
 closer-0.008
Token
Token ID198
Feature activation+7.25e-01

Pos logit contributions
handed+0.006
+0.005
 former+0.005
urance+0.005
 arrest+0.005
Neg logit contributions
 me-0.007
 please-0.007
 it-0.007
 cho-0.007
out-0.006
TokenThe
Token ID464
Feature activation+5.95e-02

Pos logit contributions
handed+0.021
+0.017
 arrest+0.017
 former+0.017
urance+0.016
Neg logit contributions
 it-0.026
 me-0.026
 please-0.026
 cho-0.026
ours-0.024
Token each
Token ID1123
Feature activation-1.57e+00

Pos logit contributions
 it+0.016
 me+0.013
 please+0.012
 closer+0.010
 cho+0.010
Neg logit contributions
handed-0.041
ongo-0.038
-0.038
urance-0.038
 former-0.038
Token other
Token ID584
Feature activation-6.85e-02

Pos logit contributions
 please+0.054
 cho+0.051
 me+0.051
 it+0.050
h+0.044
Neg logit contributions
handed-0.050
 arrest-0.043
 former-0.041
-0.041
urance-0.040
Token and
Token ID290
Feature activation-3.33e-01

Pos logit contributions
 me+0.002
 it+0.002
 please+0.002
 cho+0.001
 closer+0.001
Neg logit contributions
handed-0.003
-0.003
 former-0.003
urance-0.003
 arrest-0.003
Token then
Token ID788
Feature activation-2.79e-01

Pos logit contributions
 it+0.010
 me+0.010
 please+0.009
 cho+0.008
 closer+0.008
Neg logit contributions
handed-0.013
-0.012
 former-0.011
urance-0.011
ongo-0.011
Token hugged
Token ID41130
Feature activation-8.12e-01

Pos logit contributions
 please+0.008
 me+0.007
 cho+0.007
una+0.007
 it+0.007
Neg logit contributions
handed-0.010
 former-0.009
urance-0.009
 arrest-0.009
-0.009
Token tightly
Token ID17707
Feature activation-5.39e+00

Pos logit contributions
 me+0.018
 please+0.018
 it+0.017
 cho+0.016
una+0.015
Neg logit contributions
handed-0.034
 former-0.032
-0.031
urance-0.031
 arrest-0.030
Token.
Token ID13
Feature activation+1.03e-01

Pos logit contributions
 me+0.137
 please+0.135
 it+0.135
 cho+0.122
 closer+0.111
Neg logit contributions
handed-0.220
 former-0.198
-0.197
urance-0.192
 arrest-0.191
Token The
Token ID383
Feature activation+1.40e-01

Pos logit contributions
handed+0.003
 former+0.003
+0.003
 arrest+0.003
urance+0.003
Neg logit contributions
 me-0.004
 it-0.004
 please-0.004
 cho-0.003
 closer-0.003
Token warmth
Token ID23125
Feature activation-9.59e-01

Pos logit contributions
handed+0.006
 former+0.005
+0.005
urance+0.005
 arrest+0.005
Neg logit contributions
 me-0.003
 please-0.003
 it-0.003
 cho-0.003
una-0.003
Token of
Token ID286
Feature activation+2.41e+00

Pos logit contributions
 me+0.025
 please+0.025
 it+0.022
 cho+0.022
una+0.021
Neg logit contributions
handed-0.036
-0.033
 former-0.032
urance-0.032
 arrest-0.031
Token the
Token ID262
Feature activation+1.49e+00

Pos logit contributions
handed+0.113
+0.103
 former+0.103
urance+0.101
ongo+0.100
Neg logit contributions
 me-0.050
 it-0.050
 please-0.046
 cho-0.042
 closer-0.037
Token.
Token ID13
Feature activation-6.13e-01

Pos logit contributions
 me+0.043
 please+0.043
 it+0.041
 cho+0.039
h+0.037
Neg logit contributions
handed-0.063
-0.057
 arrest-0.056
 former-0.055
urance-0.053
Token She
Token ID1375
Feature activation-8.61e-01

Pos logit contributions
 me+0.020
 please+0.019
 it+0.019
 cho+0.018
 closer+0.017
Neg logit contributions
handed-0.021
 former-0.019
-0.018
urance-0.018
 arrest-0.018
Token thanked
Token ID26280
Feature activation+2.27e-01

Pos logit contributions
 please+0.027
 it+0.025
 cho+0.025
 me+0.025
una+0.024
Neg logit contributions
handed-0.027
 former-0.026
 arrest-0.024
-0.024
urance-0.023
Token the
Token ID262
Feature activation-1.81e+00

Pos logit contributions
handed+0.008
 arrest+0.007
 ease+0.007
+0.007
 former+0.007
Neg logit contributions
out-0.006
 cho-0.006
 please-0.006
a-0.006
ours-0.005
Token generous
Token ID14431
Feature activation-6.81e-01

Pos logit contributions
 please+0.053
 me+0.051
 it+0.051
 cho+0.050
h+0.046
Neg logit contributions
handed-0.068
 former-0.060
-0.059
 arrest-0.057
urance-0.057
Token spot
Token ID4136
Feature activation-5.39e+00

Pos logit contributions
 cho+0.018
 please+0.018
h+0.017
osh+0.017
ours+0.016
Neg logit contributions
handed-0.025
ongo-0.021
urance-0.021
 former-0.021
 arrest-0.021
Token and
Token ID290
Feature activation-1.70e+00

Pos logit contributions
 me+0.152
 please+0.149
 cho+0.148
 happens+0.139
 it+0.138
Neg logit contributions
handed-0.189
 former-0.163
-0.160
gy-0.160
keeping-0.157
Token was
Token ID373
Feature activation-2.15e+00

Pos logit contributions
 please+0.052
 me+0.052
 cho+0.049
 it+0.048
una+0.047
Neg logit contributions
handed-0.056
 former-0.052
 arrest-0.049
-0.049
urance-0.047
Token excited
Token ID6568
Feature activation-5.82e-01

Pos logit contributions
 me+0.071
 please+0.070
 cho+0.067
 it+0.067
una+0.065
Neg logit contributions
handed-0.066
 former-0.059
-0.056
urance-0.055
view-0.052
Token to
Token ID284
Feature activation-1.53e+00

Pos logit contributions
 please+0.011
 me+0.011
 cho+0.011
 it+0.010
osh+0.010
Neg logit contributions
handed-0.026
 former-0.024
urance-0.024
-0.023
 arrest-0.023
Token play
Token ID711
Feature activation-1.08e+00

Pos logit contributions
 it+0.049
 me+0.048
 please+0.047
 closer+0.038
 cho+0.038
Neg logit contributions
handed-0.060
-0.053
 former-0.051
ongo-0.050
arre-0.050
Token nice
Token ID3621
Feature activation-1.67e+00

Pos logit contributions
una+0.031
ours+0.031
osh+0.031
 happens+0.030
 Go+0.030
Neg logit contributions
handed-0.039
 former-0.035
-0.031
 arrest-0.029
 repaired-0.029
Token spot
Token ID4136
Feature activation-5.00e+00

Pos logit contributions
 please+0.047
osh+0.046
 cho+0.046
ours+0.045
 it+0.044
Neg logit contributions
handed-0.063
 former-0.057
gy-0.052
 arrest-0.052
keeping-0.052
Token in
Token ID287
Feature activation-1.19e+00

Pos logit contributions
 me+0.152
 please+0.147
 cho+0.144
 it+0.136
una+0.134
Neg logit contributions
handed-0.173
-0.150
keeping-0.149
 former-0.147
urance-0.141
Token a
Token ID257
Feature activation-1.25e+00

Pos logit contributions
 please+0.025
 me+0.025
 it+0.024
 cho+0.023
una+0.020
Neg logit contributions
handed-0.054
 former-0.049
-0.048
 arrest-0.047
urance-0.047
Token nearby
Token ID6716
Feature activation-2.00e+00

Pos logit contributions
una+0.037
ours+0.036
osh+0.035
 it+0.035
 please+0.035
Neg logit contributions
handed-0.046
 former-0.042
-0.037
 arrest-0.035
 repaired-0.035
Token field
Token ID2214
Feature activation-5.37e+00

Pos logit contributions
 please+0.057
 it+0.056
 cho+0.055
una+0.053
ours+0.053
Neg logit contributions
handed-0.073
 former-0.067
 arrest-0.061
-0.061
urance-0.058
Token.
Token ID13
Feature activation-5.25e-01

Pos logit contributions
 please+0.162
 me+0.161
 cho+0.157
 it+0.151
una+0.149
Neg logit contributions
handed-0.188
 former-0.158
-0.158
keeping-0.154
 arrest-0.153
Token
Token ID198
Feature activation+4.97e-04

Pos logit contributions
 please+0.019
 cho+0.019
 me+0.019
 it+0.018
out+0.017
Neg logit contributions
handed-0.015
-0.014
 former-0.013
urance-0.013
 arrest-0.012
Token
Token ID198
Feature activation+5.04e-01

Pos logit contributions
handed+0.000
+0.000
 former+0.000
urance+0.000
 arrest+0.000
Neg logit contributions
 please-0.000
 me-0.000
 it-0.000
 cho-0.000
 closer-0.000
TokenJoy
Token ID41338
Feature activation-1.24e+00

Pos logit contributions
handed+0.015
+0.013
 arrest+0.012
 former+0.012
 accomplish+0.012
Neg logit contributions
 cho-0.018
 me-0.018
 please-0.018
 it-0.018
ours-0.016
Token laid
Token ID8104
Feature activation-1.46e+00

Pos logit contributions
 please+0.036
 it+0.034
 me+0.033
 cho+0.033
una+0.029
Neg logit contributions
handed-0.044
 former-0.040
 arrest-0.039
-0.038
urance-0.037
Token then
Token ID788
Feature activation-4.74e-01

Pos logit contributions
 please+0.031
una+0.029
 me+0.028
 cho+0.027
ymes+0.027
Neg logit contributions
handed-0.037
 former-0.033
 arrest-0.032
-0.032
 ease-0.031
Token jumped
Token ID11687
Feature activation-2.27e+00

Pos logit contributions
 please+0.013
una+0.013
 me+0.012
cuts+0.012
ymes+0.012
Neg logit contributions
handed-0.016
 arrest-0.014
 former-0.014
 ease-0.014
 repaired-0.014
Token to
Token ID284
Feature activation-9.39e-01

Pos logit contributions
 please+0.057
 me+0.056
 it+0.054
 cho+0.052
una+0.049
Neg logit contributions
handed-0.090
-0.081
 former-0.081
urance-0.078
 arrest-0.078
Token a
Token ID257
Feature activation-1.88e+00

Pos logit contributions
 me+0.023
 it+0.022
 please+0.022
 cho+0.020
 closer+0.018
Neg logit contributions
handed-0.040
 former-0.036
-0.036
urance-0.035
 arrest-0.035
Token lower
Token ID2793
Feature activation-3.42e+00

Pos logit contributions
una+0.048
ours+0.047
 please+0.047
cuts+0.046
 it+0.045
Neg logit contributions
handed-0.068
 former-0.062
-0.060
 arrest-0.058
 accomplish-0.056
Token branch
Token ID8478
Feature activation-5.35e+00

Pos logit contributions
 please+0.087
ours+0.079
 me+0.079
 cho+0.076
cuts+0.076
Neg logit contributions
handed-0.129
 arrest-0.120
 former-0.114
gy-0.114
keeping-0.110
Token.
Token ID13
Feature activation+2.07e-02

Pos logit contributions
 me+0.149
 please+0.146
 it+0.142
 cho+0.132
una+0.129
Neg logit contributions
handed-0.196
-0.174
 former-0.172
 arrest-0.170
urance-0.168
Token Then
Token ID3244
Feature activation+2.19e-01

Pos logit contributions
handed+0.001
 former+0.001
urance+0.001
view+0.001
 arrest+0.001
Neg logit contributions
 it-0.001
 me-0.001
 please-0.001
 cho-0.001
 closer-0.001
Token,
Token ID11
Feature activation-6.56e-01

Pos logit contributions
handed+0.009
+0.008
 former+0.008
 arrest+0.008
urance+0.008
Neg logit contributions
 please-0.005
 me-0.005
una-0.004
 cho-0.004
out-0.004
Token the
Token ID262
Feature activation-7.35e-01

Pos logit contributions
 me+0.012
 please+0.012
 it+0.011
 cho+0.010
una+0.009
Neg logit contributions
handed-0.031
 former-0.029
-0.028
 arrest-0.028
urance-0.028
Token owl
Token ID39610
Feature activation-4.91e-01

Pos logit contributions
 me+0.016
 it+0.015
 please+0.015
 cho+0.013
 closer+0.012
Neg logit contributions
handed-0.035
 former-0.032
urance-0.031
-0.031
view-0.030
Token by
Token ID416
Feature activation-1.66e-01

Pos logit contributions
handed+0.068
urance+0.059
+0.058
 former+0.058
ongo+0.058
Neg logit contributions
 it-0.033
 me-0.032
 please-0.031
 cho-0.028
 closer-0.027
Token a
Token ID257
Feature activation-1.13e+00

Pos logit contributions
 it+0.004
 me+0.003
 please+0.003
 closer+0.003
 cho+0.003
Neg logit contributions
handed-0.008
ongo-0.007
arre-0.007
-0.007
urance-0.007
Token big
Token ID1263
Feature activation-1.39e+00

Pos logit contributions
 it+0.032
 please+0.032
 me+0.032
una+0.031
out+0.030
Neg logit contributions
handed-0.039
 former-0.035
-0.035
 arrest-0.034
urance-0.033
Token,
Token ID11
Feature activation+1.55e-01

Pos logit contributions
 please+0.039
 it+0.038
una+0.038
 me+0.037
a+0.036
Neg logit contributions
handed-0.048
 arrest-0.045
-0.044
 former-0.044
urance-0.043
Token green
Token ID4077
Feature activation-2.47e+00

Pos logit contributions
handed+0.006
 former+0.005
+0.005
urance+0.005
 arrest+0.005
Neg logit contributions
 me-0.004
 please-0.004
 it-0.004
 cho-0.004
una-0.004
Token field
Token ID2214
Feature activation-5.34e+00

Pos logit contributions
 please+0.070
 me+0.069
 it+0.069
 cho+0.063
una+0.060
Neg logit contributions
handed-0.091
 former-0.082
-0.082
 arrest-0.079
urance-0.079
Token.
Token ID13
Feature activation-1.03e+00

Pos logit contributions
 me+0.157
 please+0.156
 it+0.146
una+0.140
 cho+0.138
Neg logit contributions
handed-0.194
-0.170
 former-0.165
urance-0.161
 arrest-0.160
Token
Token ID198
Feature activation-2.19e-01

Pos logit contributions
 it+0.033
 me+0.033
 please+0.032
 cho+0.029
 closer+0.027
Neg logit contributions
handed-0.036
 former-0.032
 arrest-0.031
urance-0.031
-0.031
Token
Token ID198
Feature activation+3.38e-01

Pos logit contributions
 me+0.007
 it+0.007
 please+0.007
 cho+0.006
 closer+0.006
Neg logit contributions
handed-0.008
 former-0.007
-0.007
urance-0.006
 arrest-0.006
TokenThe
Token ID464
Feature activation-1.02e-02

Pos logit contributions
handed+0.011
 former+0.009
+0.009
 arrest+0.009
urance+0.009
Neg logit contributions
 me-0.012
 please-0.012
 it-0.012
 cho-0.011
 closer-0.010
Token man
Token ID582
Feature activation+5.79e-01

Pos logit contributions
 me+0.000
 it+0.000
 please+0.000
 cho+0.000
 us+0.000
Neg logit contributions
handed-0.000
-0.000
 former-0.000
ongo-0.000
urance-0.000
Token found
Token ID1043
Feature activation+2.13e-01

Pos logit contributions
handed+0.056
 former+0.050
+0.048
urance+0.048
 arrest+0.048
Neg logit contributions
 please-0.048
 me-0.048
 it-0.047
 cho-0.044
una-0.040
Token a
Token ID257
Feature activation-9.57e-01

Pos logit contributions
handed+0.009
 arrest+0.008
 former+0.008
 ease+0.008
+0.008
Neg logit contributions
 please-0.005
 me-0.005
h-0.004
 cho-0.004
 it-0.004
Token nice
Token ID3621
Feature activation-1.80e+00

Pos logit contributions
 me+0.026
 it+0.026
 please+0.026
 cho+0.025
out+0.024
Neg logit contributions
handed-0.034
 former-0.032
-0.032
urance-0.030
 arrest-0.029
Token,
Token ID11
Feature activation+1.93e-01

Pos logit contributions
 cho+0.052
 please+0.049
osh+0.048
ours+0.047
out+0.047
Neg logit contributions
handed-0.060
 former-0.055
urance-0.055
-0.054
igsaw-0.054
Token warm
Token ID5814
Feature activation-3.14e+00

Pos logit contributions
handed+0.007
 former+0.007
+0.007
urance+0.006
 arrest+0.006
Neg logit contributions
 me-0.005
 please-0.005
 it-0.005
 cho-0.005
 closer-0.004
Token spot
Token ID4136
Feature activation-5.32e+00

Pos logit contributions
 please+0.081
 cho+0.077
 me+0.077
 it+0.075
osh+0.071
Neg logit contributions
handed-0.118
-0.108
 former-0.107
urance-0.107
 arrest-0.105
Token to
Token ID284
Feature activation-1.29e+00

Pos logit contributions
 me+0.134
 please+0.130
 it+0.124
 cho+0.117
una+0.108
Neg logit contributions
handed-0.216
-0.194
 former-0.191
 arrest-0.188
urance-0.187
Token hide
Token ID7808
Feature activation-3.37e+00

Pos logit contributions
 me+0.041
 please+0.040
 it+0.040
 cho+0.037
 closer+0.034
Neg logit contributions
handed-0.045
 former-0.040
-0.039
urance-0.038
 arrest-0.038
Token from
Token ID422
Feature activation+1.03e+00

Pos logit contributions
 please+0.094
 me+0.093
 it+0.087
 cho+0.085
una+0.077
Neg logit contributions
handed-0.129
 former-0.113
 arrest-0.112
-0.111
keeping-0.109
Token the
Token ID262
Feature activation+1.62e-02

Pos logit contributions
handed+0.046
+0.042
 former+0.042
urance+0.041
ongo+0.040
Neg logit contributions
 me-0.023
 it-0.023
 please-0.022
 cho-0.020
 closer-0.018
Token cold
Token ID4692
Feature activation+4.64e-02

Pos logit contributions
handed+0.001
 former+0.001
+0.001
urance+0.001
 arrest+0.001
Neg logit contributions
 me-0.000
 please-0.000
 it-0.000
 cho-0.000
 closer-0.000
Token,
Token ID11
Feature activation-2.42e-01

Pos logit contributions
 me+0.049
 please+0.048
 it+0.047
 cho+0.044
una+0.041
Neg logit contributions
handed-0.064
 arrest-0.057
-0.056
 former-0.056
urance-0.055
Token a
Token ID257
Feature activation-2.28e+00

Pos logit contributions
 me+0.008
 please+0.008
 cho+0.007
 it+0.007
out+0.007
Neg logit contributions
handed-0.007
 former-0.007
 arrest-0.007
-0.006
urance-0.006
Token ghost
Token ID10905
Feature activation-2.60e+00

Pos logit contributions
 me+0.058
 please+0.057
 it+0.056
out+0.053
h+0.050
Neg logit contributions
handed-0.086
 former-0.079
-0.079
 arrest-0.077
urance-0.074
Token,
Token ID11
Feature activation-6.77e-01

Pos logit contributions
 me+0.073
 please+0.071
 it+0.069
 cho+0.065
una+0.062
Neg logit contributions
handed-0.092
 arrest-0.083
-0.082
 former-0.080
urance-0.079
Token a
Token ID257
Feature activation-2.47e+00

Pos logit contributions
 me+0.020
 please+0.019
 it+0.018
 cho+0.018
out+0.017
Neg logit contributions
handed-0.023
 arrest-0.021
 former-0.021
-0.020
urance-0.020
Token spider
Token ID19230
Feature activation-5.32e+00

Pos logit contributions
 me+0.061
 please+0.060
 it+0.059
out+0.055
 cho+0.053
Neg logit contributions
handed-0.095
 former-0.088
-0.087
 arrest-0.085
urance-0.082
Token.
Token ID13
Feature activation-1.16e+00

Pos logit contributions
 please+0.149
 me+0.145
 it+0.142
 cho+0.133
 closer+0.129
Neg logit contributions
handed-0.187
 arrest-0.173
-0.168
 former-0.163
urance-0.162
Token He
Token ID679
Feature activation+6.99e-02

Pos logit contributions
 me+0.046
 please+0.045
 it+0.045
 cho+0.042
 closer+0.040
Neg logit contributions
handed-0.031
 former-0.027
-0.026
urance-0.025
 arrest-0.025
Token runs
Token ID4539
Feature activation-8.11e-01

Pos logit contributions
handed+0.003
 arrest+0.003
 former+0.003
 accomplish+0.003
+0.003
Neg logit contributions
 please-0.002
 it-0.001
 me-0.001
 cho-0.001
h-0.001
Token back
Token ID736
Feature activation-9.99e-01

Pos logit contributions
 please+0.020
 me+0.018
una+0.018
ours+0.018
 cho+0.018
Neg logit contributions
handed-0.032
 arrest-0.029
keeping-0.027
gy-0.026
-0.025
Token to
Token ID284
Feature activation-3.35e-01

Pos logit contributions
 please+0.021
 me+0.019
 cho+0.019
una+0.017
 it+0.017
Neg logit contributions
handed-0.044
 arrest-0.040
keeping-0.040
gy-0.039
-0.038
Token small
Token ID1402
Feature activation+2.91e-01

Pos logit contributions
 please+0.011
 it+0.010
una+0.010
 cho+0.010
 me+0.010
Neg logit contributions
handed-0.014
 former-0.012
 arrest-0.011
-0.011
view-0.010
Token house
Token ID2156
Feature activation-1.54e+00

Pos logit contributions
handed+0.010
 former+0.009
+0.009
urance+0.009
 arrest+0.009
Neg logit contributions
 me-0.009
 please-0.009
 it-0.009
 cho-0.008
una-0.008
Token near
Token ID1474
Feature activation-2.14e+00

Pos logit contributions
 me+0.041
 please+0.041
 it+0.040
 cho+0.037
una+0.034
Neg logit contributions
handed-0.060
 former-0.054
-0.054
urance-0.052
 arrest-0.052
Token a
Token ID257
Feature activation-2.04e+00

Pos logit contributions
 it+0.046
 me+0.044
 please+0.040
 cho+0.035
 us+0.032
Neg logit contributions
handed-0.100
-0.096
 former-0.094
urance-0.093
ongo-0.092
Token big
Token ID1263
Feature activation-2.09e+00

Pos logit contributions
 please+0.063
una+0.062
 it+0.061
 cho+0.060
osh+0.059
Neg logit contributions
handed-0.068
 former-0.058
-0.057
 arrest-0.053
 accomplish-0.052
Token field
Token ID2214
Feature activation-5.31e+00

Pos logit contributions
 please+0.070
una+0.068
 it+0.066
a+0.066
 cho+0.065
Neg logit contributions
handed-0.062
 arrest-0.055
-0.054
 former-0.054
igsaw-0.051
Token.
Token ID13
Feature activation+6.02e-01

Pos logit contributions
 please+0.162
 me+0.162
 it+0.154
 cho+0.152
una+0.146
Neg logit contributions
handed-0.189
-0.161
 former-0.157
view-0.153
 arrest-0.151
Token Every
Token ID3887
Feature activation-6.50e-01

Pos logit contributions
handed+0.018
 former+0.016
+0.016
 arrest+0.015
urance+0.015
Neg logit contributions
 me-0.022
 it-0.022
 please-0.022
 cho-0.020
 closer-0.018
Token day
Token ID1110
Feature activation-5.94e-01

Pos logit contributions
ymes+0.033
ours+0.030
cuts+0.029
ues+0.029
 Go+0.029
Neg logit contributions
-0.008
handed-0.006
 entire-0.006
working-0.005
 whole-0.005
Token he
Token ID339
Feature activation+5.42e-01

Pos logit contributions
 me+0.015
 please+0.015
 it+0.015
 cho+0.014
una+0.013
Neg logit contributions
handed-0.023
-0.021
 former-0.021
 arrest-0.020
urance-0.020
Token would
Token ID561
Feature activation-4.69e-01

Pos logit contributions
handed+0.024
 former+0.022
+0.021
 arrest+0.021
gy+0.021
Neg logit contributions
 please-0.011
 me-0.011
 it-0.010
 cho-0.009
una-0.009
Token'm
Token ID1101
Feature activation+7.05e-01

Pos logit contributions
 please+0.035
 it+0.034
 me+0.033
 cho+0.030
 will+0.030
Neg logit contributions
handed-0.035
-0.030
urance-0.030
 former-0.030
 arrest-0.029
Token sorry
Token ID7926
Feature activation+1.72e+00

Pos logit contributions
handed+0.023
 former+0.021
+0.020
urance+0.020
 arrest+0.020
Neg logit contributions
 me-0.023
 please-0.022
 it-0.022
 cho-0.021
out-0.019
Token,
Token ID11
Feature activation-9.01e-01

Pos logit contributions
 himself+0.025
ongo+0.020
 having+0.018
urance+0.018
 accomplish+0.018
Neg logit contributions
osh-0.071
ymes-0.071
 Area-0.070
mare-0.067
ipt-0.067
Token too
Token ID1165
Feature activation-2.43e+00

Pos logit contributions
out+0.020
 cho+0.018
ours+0.017
 Area+0.017
ymes+0.017
Neg logit contributions
 arrest-0.033
handed-0.032
ongo-0.032
-0.031
urance-0.029
Token,
Token ID11
Feature activation-1.27e+00

Pos logit contributions
 please+0.068
 me+0.068
 it+0.065
 cho+0.063
out+0.059
Neg logit contributions
handed-0.088
 former-0.078
-0.077
 arrest-0.076
urance-0.076
Token lizard
Token ID42406
Feature activation-5.31e+00

Pos logit contributions
out+0.026
ymes+0.025
 Area+0.023
h+0.022
ours+0.022
Neg logit contributions
 arrest-0.048
handed-0.046
-0.044
ongo-0.044
urance-0.043
Token.
Token ID13
Feature activation-1.21e+00

Pos logit contributions
 me+0.178
 please+0.173
 cho+0.170
 it+0.168
ours+0.164
Neg logit contributions
handed-0.152
 arrest-0.130
-0.123
gy-0.120
ongo-0.113
Token I
Token ID314
Feature activation-9.40e-01

Pos logit contributions
 me+0.018
 it+0.017
 please+0.017
 cho+0.016
out+0.014
Neg logit contributions
handed-0.061
-0.056
 former-0.056
 arrest-0.055
urance-0.055
Token was
Token ID373
Feature activation+1.12e-01

Pos logit contributions
 it+0.021
 please+0.021
 me+0.021
 will+0.018
 cho+0.017
Neg logit contributions
handed-0.043
urance-0.039
-0.039
 former-0.038
 arrest-0.038
Token rude
Token ID22066
Feature activation+1.11e+00

Pos logit contributions
handed+0.004
 former+0.004
+0.004
 arrest+0.004
urance+0.004
Neg logit contributions
 please-0.003
 me-0.003
 it-0.003
 cho-0.003
 closer-0.003
Token to
Token ID284
Feature activation-4.41e-01

Pos logit contributions
handed+0.035
 arrest+0.031
keeping+0.031
urance+0.030
arre+0.029
Neg logit contributions
 please-0.031
osh-0.031
 cho-0.030
 me-0.030
 Area-0.029
Token it
Token ID340
Feature activation-1.92e+00

Pos logit contributions
 please+0.023
 me+0.021
 cho+0.017
una+0.017
h+0.016
Neg logit contributions
handed-0.074
-0.067
 ease-0.066
 arrest-0.066
 former-0.065
Token on
Token ID319
Feature activation-7.33e-01

Pos logit contributions
 please+0.050
 me+0.050
 it+0.049
 cho+0.046
una+0.042
Neg logit contributions
handed-0.075
 former-0.067
-0.066
urance-0.065
 arrest-0.065
Token a
Token ID257
Feature activation-1.36e+00

Pos logit contributions
 me+0.017
 please+0.017
 it+0.016
 cho+0.015
 closer+0.013
Neg logit contributions
handed-0.031
 former-0.029
-0.028
urance-0.028
 arrest-0.028
Token nearby
Token ID6716
Feature activation-1.81e+00

Pos logit contributions
 please+0.035
 it+0.034
ours+0.034
 cho+0.034
osh+0.033
Neg logit contributions
handed-0.051
 former-0.045
-0.045
 arrest-0.043
urance-0.041
Token tree
Token ID5509
Feature activation-4.00e+00

Pos logit contributions
 please+0.050
 it+0.048
 cho+0.047
ymes+0.046
osh+0.046
Neg logit contributions
handed-0.067
 former-0.060
 arrest-0.058
-0.058
urance-0.055
Token branch
Token ID8478
Feature activation-5.30e+00

Pos logit contributions
 me+0.103
 please+0.101
 it+0.100
 cho+0.092
una+0.082
Neg logit contributions
handed-0.162
 former-0.146
-0.145
urance-0.141
 arrest-0.141
Token.
Token ID13
Feature activation-1.96e-01

Pos logit contributions
 me+0.140
 please+0.140
 it+0.134
 cho+0.124
una+0.115
Neg logit contributions
handed-0.209
-0.185
 former-0.184
 arrest-0.180
urance-0.179
Token The
Token ID383
Feature activation+1.78e-01

Pos logit contributions
 it+0.007
 me+0.007
 please+0.007
 cho+0.006
 closer+0.006
Neg logit contributions
handed-0.006
 former-0.006
view-0.005
urance-0.005
 arrest-0.005
Token squirrel
Token ID33039
Feature activation+5.25e-01

Pos logit contributions
handed+0.008
+0.007
 former+0.007
urance+0.007
ongo+0.007
Neg logit contributions
 me-0.004
 it-0.004
 please-0.004
 cho-0.003
 closer-0.003
Token happily
Token ID18177
Feature activation-1.30e+00

Pos logit contributions
handed+0.022
 former+0.020
+0.020
urance+0.020
 arrest+0.019
Neg logit contributions
 me-0.013
 it-0.013
 please-0.012
 cho-0.011
 closer-0.010
Token chat
Token ID8537
Feature activation-9.54e-01

Pos logit contributions
 me+0.036
 please+0.036
 it+0.036
 cho+0.033
 closer+0.030
Neg logit contributions
handed-0.050
 former-0.045
-0.044
urance-0.043
 arrest-0.043
Token it
Token ID340
Feature activation-1.36e+00

Pos logit contributions
 please+0.035
 me+0.032
 cho+0.029
h+0.028
 it+0.028
Neg logit contributions
handed-0.069
 former-0.063
 arrest-0.060
 ease-0.059
urance-0.059
Token a
Token ID257
Feature activation-9.71e-01

Pos logit contributions
 it+0.037
 me+0.035
 please+0.035
 closer+0.031
 cho+0.030
Neg logit contributions
handed-0.055
-0.052
urance-0.050
 arrest-0.049
 former-0.048
Token nice
Token ID3621
Feature activation-1.75e+00

Pos logit contributions
 me+0.030
 please+0.029
 it+0.029
 cho+0.027
una+0.025
Neg logit contributions
handed-0.034
 former-0.031
-0.030
urance-0.029
 arrest-0.029
Token,
Token ID11
Feature activation+3.25e-03

Pos logit contributions
 please+0.051
 me+0.051
 it+0.048
 cho+0.048
osh+0.045
Neg logit contributions
handed-0.063
 former-0.058
-0.055
urance-0.054
 arrest-0.053
Token comfortable
Token ID6792
Feature activation-3.18e+00

Pos logit contributions
handed+0.000
 former+0.000
+0.000
 arrest+0.000
urance+0.000
Neg logit contributions
 me-0.000
 please-0.000
 it-0.000
 cho-0.000
una-0.000
Token spot
Token ID4136
Feature activation-5.29e+00

Pos logit contributions
 please+0.087
 me+0.085
 it+0.084
 cho+0.080
una+0.073
Neg logit contributions
handed-0.122
 former-0.111
-0.107
 arrest-0.107
urance-0.105
Token in
Token ID287
Feature activation-8.04e-01

Pos logit contributions
 me+0.134
 please+0.129
 it+0.118
 cho+0.117
una+0.110
Neg logit contributions
handed-0.214
 former-0.188
keeping-0.185
-0.185
 arrest-0.181
Token his
Token ID465
Feature activation+2.36e-01

Pos logit contributions
 me+0.018
 it+0.018
 please+0.018
 cho+0.016
 closer+0.014
Neg logit contributions
handed-0.035
 former-0.032
-0.032
urance-0.031
 arrest-0.031
Token bur
Token ID4356
Feature activation-7.72e-01

Pos logit contributions
handed+0.009
 former+0.008
+0.007
 arrest+0.007
urance+0.007
Neg logit contributions
 please-0.007
 it-0.007
 me-0.007
 cho-0.007
h-0.006
Tokenrow
Token ID808
Feature activation-1.15e+00

Pos logit contributions
 me+0.017
 please+0.016
 it+0.016
 cho+0.014
 closer+0.013
Neg logit contributions
handed-0.035
-0.032
 former-0.032
urance-0.031
 arrest-0.031
Token.
Token ID13
Feature activation-3.53e-01

Pos logit contributions
 me+0.030
 please+0.029
 it+0.029
 cho+0.027
 closer+0.024
Neg logit contributions
handed-0.046
 former-0.041
-0.041
urance-0.040
 arrest-0.040
Token big
Token ID1263
Feature activation-1.03e+00

Pos logit contributions
 please+0.035
 it+0.034
 me+0.034
una+0.033
out+0.032
Neg logit contributions
handed-0.036
-0.034
 former-0.032
 arrest-0.031
ongo-0.031
Token tree
Token ID5509
Feature activation-3.31e+00

Pos logit contributions
 please+0.032
 me+0.030
una+0.030
 it+0.030
out+0.029
Neg logit contributions
handed-0.032
-0.030
 arrest-0.030
 former-0.028
urance-0.027
Token with
Token ID351
Feature activation-1.13e+00

Pos logit contributions
 me+0.077
 please+0.076
 it+0.075
 cho+0.068
una+0.060
Neg logit contributions
handed-0.142
 former-0.128
-0.128
urance-0.125
 arrest-0.124
Token many
Token ID867
Feature activation-3.71e-01

Pos logit contributions
 please+0.031
 me+0.031
 it+0.030
 cho+0.028
h+0.026
Neg logit contributions
handed-0.043
 former-0.038
 arrest-0.037
-0.037
urance-0.036
Token or
Token ID393
Feature activation+9.64e-01

Pos logit contributions
 please+0.011
 me+0.011
 it+0.011
 cho+0.010
una+0.010
Neg logit contributions
handed-0.013
 former-0.012
-0.011
 arrest-0.011
urance-0.011
Tokennam
Token ID7402
Feature activation+7.12e+00

Pos logit contributions
handed+0.041
+0.039
urance+0.037
ongo+0.036
 former+0.035
Neg logit contributions
 it-0.026
 me-0.025
 please-0.023
 cho-0.022
 closer-0.021
Tokenents
Token ID658
Feature activation+1.24e+00

Pos logit contributions
handed+0.283
+0.258
 former+0.257
urance+0.253
 arrest+0.243
Neg logit contributions
 me-0.189
 it-0.185
 please-0.181
 cho-0.169
h-0.154
Token on
Token ID319
Feature activation-1.64e-01

Pos logit contributions
handed+0.057
 former+0.054
+0.053
urance+0.052
ongo+0.050
Neg logit contributions
 it-0.028
 me-0.025
 please-0.024
 cho-0.023
 closer-0.022
Token it
Token ID340
Feature activation-1.44e+00

Pos logit contributions
 please+0.003
 me+0.003
 cho+0.003
 it+0.003
una+0.003
Neg logit contributions
handed-0.008
-0.007
 arrest-0.007
 former-0.007
urance-0.007
Token.
Token ID13
Feature activation-4.76e-01

Pos logit contributions
 me+0.039
 please+0.039
 it+0.038
 cho+0.036
una+0.032
Neg logit contributions
handed-0.056
 former-0.050
-0.050
urance-0.048
 arrest-0.048
Token The
Token ID383
Feature activation+5.35e-01

Pos logit contributions
 me+0.013
 it+0.012
 please+0.012
 cho+0.011
 closer+0.010
Neg logit contributions
handed-0.019
 former-0.018
urance-0.017
-0.017
 arrest-0.017
Token a
Token ID257
Feature activation-1.44e+00

Pos logit contributions
handed+0.041
 former+0.038
urance+0.037
+0.036
 arrest+0.035
Neg logit contributions
 please-0.029
 me-0.029
 it-0.027
 cho-0.026
h-0.024
Token letter
Token ID3850
Feature activation-1.03e+00

Pos logit contributions
 please+0.033
 me+0.033
 it+0.032
 cho+0.031
out+0.029
Neg logit contributions
handed-0.057
 former-0.054
urance-0.052
-0.051
 arrest-0.051
Token about
Token ID546
Feature activation+4.48e-01

Pos logit contributions
 me+0.023
 please+0.022
 it+0.021
 cho+0.020
una+0.018
Neg logit contributions
handed-0.043
-0.039
 former-0.039
urance-0.038
ongo-0.038
Token the
Token ID262
Feature activation+1.56e-01

Pos logit contributions
handed+0.019
 former+0.017
+0.017
urance+0.016
 arrest+0.016
Neg logit contributions
 me-0.011
 it-0.011
 please-0.011
 cho-0.010
 closer-0.009
Token or
Token ID393
Feature activation+7.84e-02

Pos logit contributions
handed+0.006
+0.006
urance+0.006
 former+0.006
ongo+0.006
Neg logit contributions
 me-0.004
 it-0.004
 please-0.004
 cho-0.004
 closer-0.003
Tokennam
Token ID7402
Feature activation+7.03e+00

Pos logit contributions
handed+0.003
+0.003
 former+0.003
urance+0.003
 arrest+0.003
Neg logit contributions
 me-0.002
 it-0.002
 please-0.002
 cho-0.002
 closer-0.002
Tokenents
Token ID658
Feature activation+1.04e+00

Pos logit contributions
handed+0.280
+0.259
urance+0.257
 former+0.256
 arrest+0.244
Neg logit contributions
 me-0.180
 it-0.176
 please-0.175
 cho-0.158
out-0.150
Token and
Token ID290
Feature activation-7.78e-01

Pos logit contributions
handed+0.047
 former+0.044
+0.044
urance+0.044
ongo+0.042
Neg logit contributions
 it-0.024
 please-0.021
 me-0.021
 cho-0.019
 closer-0.018
Token asked
Token ID1965
Feature activation-3.05e-01

Pos logit contributions
 it+0.023
 me+0.022
 please+0.021
 cho+0.020
 closer+0.018
Neg logit contributions
handed-0.031
-0.028
urance-0.027
 former-0.027
ongo-0.026
Token for
Token ID329
Feature activation-2.96e+00

Pos logit contributions
 me+0.008
 cho+0.008
osh+0.008
 please+0.007
h+0.007
Neg logit contributions
handed-0.010
urance-0.010
 former-0.009
 ease-0.009
keeping-0.009
Token one
Token ID530
Feature activation-1.50e+00

Pos logit contributions
 me+0.074
 please+0.072
 cho+0.072
una+0.068
out+0.067
Neg logit contributions
handed-0.113
 former-0.104
 arrest-0.102
 ease-0.097
-0.097
Token the
Token ID262
Feature activation-1.94e-01

Pos logit contributions
 me+0.023
 please+0.022
 it+0.022
 cho+0.020
 closer+0.017
Neg logit contributions
handed-0.050
 former-0.046
-0.045
urance-0.044
 arrest-0.044
Token rest
Token ID1334
Feature activation-9.74e-01

Pos logit contributions
 it+0.005
 me+0.005
 please+0.005
 closer+0.004
 cho+0.004
Neg logit contributions
handed-0.008
ongo-0.007
 former-0.007
urance-0.007
-0.007
Token of
Token ID286
Feature activation+1.95e+00

Pos logit contributions
 me+0.022
 please+0.021
 it+0.021
 cho+0.019
 closer+0.017
Neg logit contributions
handed-0.043
-0.039
 former-0.039
urance-0.038
 arrest-0.037
Token the
Token ID262
Feature activation+1.22e+00

Pos logit contributions
handed+0.103
 former+0.095
urance+0.094
ongo+0.094
arre+0.092
Neg logit contributions
 it-0.036
 me-0.031
 please-0.025
 cho-0.024
 closer-0.020
Token or
Token ID393
Feature activation+3.99e-01

Pos logit contributions
handed+0.052
urance+0.049
ongo+0.049
+0.048
 former+0.048
Neg logit contributions
 me-0.032
 it-0.030
 please-0.025
 is-0.025
 cho-0.023
Tokennam
Token ID7402
Feature activation+6.96e+00

Pos logit contributions
handed+0.014
+0.013
 former+0.012
urance+0.012
 arrest+0.012
Neg logit contributions
 it-0.013
 me-0.013
 please-0.012
 cho-0.011
 closer-0.011
Tokenents
Token ID658
Feature activation+1.21e+00

Pos logit contributions
handed+0.269
+0.245
 former+0.244
urance+0.243
 arrest+0.233
Neg logit contributions
 me-0.187
 it-0.185
 please-0.182
 cho-0.167
out-0.156
Token onto
Token ID4291
Feature activation-1.60e+00

Pos logit contributions
handed+0.054
 former+0.050
urance+0.049
+0.049
ongo+0.048
Neg logit contributions
 it-0.028
 me-0.026
 please-0.026
 cho-0.024
 closer-0.022
Token the
Token ID262
Feature activation+7.71e-01

Pos logit contributions
 it+0.031
 me+0.030
 please+0.027
 cho+0.025
 closer+0.021
Neg logit contributions
handed-0.078
 former-0.070
urance-0.070
-0.070
ongo-0.070
Token tree
Token ID5509
Feature activation-2.07e+00

Pos logit contributions
handed+0.033
 former+0.030
+0.030
urance+0.030
arre+0.030
Neg logit contributions
 me-0.020
 it-0.019
 please-0.017
 cho-0.016
 closer-0.014
Token.
Token ID13
Feature activation-9.56e-01

Pos logit contributions
 it+0.052
 me+0.051
 please+0.050
 cho+0.046
 closer+0.042
Neg logit contributions
handed-0.085
 former-0.077
-0.077
urance-0.076
 arrest-0.075
Token to
Token ID284
Feature activation-8.01e-01

Pos logit contributions
 cho+0.045
 please+0.045
 me+0.044
osh+0.043
una+0.041
Neg logit contributions
handed-0.083
 former-0.078
urance-0.073
view-0.071
arre-0.070
Token me
Token ID502
Feature activation-2.38e+00

Pos logit contributions
 please+0.014
 it+0.013
 cho+0.012
 me+0.012
 closer+0.011
Neg logit contributions
handed-0.037
 former-0.035
 arrest-0.034
urance-0.034
-0.033
Token.
Token ID13
Feature activation-9.68e-01

Pos logit contributions
 me+0.061
 please+0.059
 it+0.059
 cho+0.054
 closer+0.049
Neg logit contributions
handed-0.097
 former-0.087
-0.087
urance-0.085
 arrest-0.084
Token The
Token ID383
Feature activation+5.93e-01

Pos logit contributions
 me+0.026
 please+0.026
 it+0.025
 cho+0.023
una+0.021
Neg logit contributions
handed-0.039
 former-0.035
-0.034
urance-0.033
 arrest-0.033
Token or
Token ID393
Feature activation+1.94e-01

Pos logit contributions
handed+0.021
 former+0.021
+0.019
 arrest+0.018
urance+0.018
Neg logit contributions
 it-0.016
 please-0.016
 me-0.015
h-0.015
 cho-0.014
Tokennam
Token ID7402
Feature activation+7.36e+00

Pos logit contributions
handed+0.006
 former+0.006
 arrest+0.006
 disl+0.005
 repaired+0.005
Neg logit contributions
una-0.005
out-0.005
 cho-0.005
 please-0.005
h-0.005
Tokenents
Token ID658
Feature activation+1.34e+00

Pos logit contributions
handed+0.285
+0.260
 former+0.258
urance+0.255
 arrest+0.245
Neg logit contributions
 me-0.202
 please-0.197
 it-0.197
 cho-0.180
out-0.165
Token were
Token ID547
Feature activation+7.29e-01

Pos logit contributions
handed+0.051
 former+0.046
+0.046
urance+0.045
 arrest+0.044
Neg logit contributions
 me-0.037
 please-0.037
 it-0.037
 cho-0.034
 closer-0.031
Token very
Token ID845
Feature activation+1.20e+00

Pos logit contributions
handed+0.027
 former+0.025
 arrest+0.024
urance+0.024
+0.024
Neg logit contributions
 please-0.019
 me-0.018
 it-0.018
 cho-0.017
osh-0.016
Token important
Token ID1593
Feature activation+3.79e-01

Pos logit contributions
handed+0.041
 former+0.039
+0.034
 arrest+0.034
ongo+0.034
Neg logit contributions
 me-0.034
 it-0.034
 please-0.034
 cho-0.032
una-0.032
Token to
Token ID284
Feature activation-5.87e-01

Pos logit contributions
handed+0.014
 former+0.013
+0.013
urance+0.013
 arrest+0.012
Neg logit contributions
 me-0.010
 please-0.010
 it-0.010
 cho-0.010
 closer-0.009
Token the
Token ID262
Feature activation+3.16e-01

Pos logit contributions
handed+0.025
 former+0.024
 arrest+0.022
 ease+0.022
keeping+0.022
Neg logit contributions
 please-0.006
 cho-0.005
 me-0.005
h-0.005
una-0.005
Token village
Token ID7404
Feature activation+1.40e+00

Pos logit contributions
handed+0.012
 former+0.011
+0.011
urance+0.010
 arrest+0.010
Neg logit contributions
 me-0.009
 please-0.009
 it-0.009
 cho-0.008
 closer-0.008
Token back
Token ID736
Feature activation-8.34e-01

Pos logit contributions
handed+0.058
 former+0.053
+0.053
urance+0.052
 arrest+0.051
Neg logit contributions
 me-0.034
 it-0.034
 please-0.033
 cho-0.030
 closer-0.027
Token to
Token ID284
Feature activation-1.02e+00

Pos logit contributions
 me+0.018
 please+0.017
 it+0.017
 cho+0.016
 closer+0.014
Neg logit contributions
handed-0.037
 former-0.034
-0.034
urance-0.033
 arrest-0.033
Token its
Token ID663
Feature activation-1.54e+00

Pos logit contributions
 me+0.023
 it+0.022
 please+0.022
 cho+0.019
 us+0.017
Neg logit contributions
handed-0.045
-0.041
 former-0.041
urance-0.041
ongo-0.040
Token former
Token ID1966
Feature activation+8.79e+00

Pos logit contributions
 me+0.050
 please+0.050
 it+0.050
 cho+0.047
una+0.043
Neg logit contributions
handed-0.051
 former-0.046
-0.044
urance-0.043
 arrest-0.043
Token glory
Token ID13476
Feature activation+1.38e+00

Pos logit contributions
handed+0.189
+0.155
 former+0.141
 arrest+0.141
urance+0.141
Neg logit contributions
 me-0.400
 it-0.389
 please-0.385
 cho-0.367
 us-0.347
Token!"
Token ID2474
Feature activation-5.45e-01

Pos logit contributions
handed+0.056
 former+0.051
+0.050
urance+0.048
 arrest+0.048
Neg logit contributions
 me-0.034
 please-0.034
 it-0.033
 cho-0.032
una-0.028
Token
Token ID220
Feature activation-1.35e+00

Pos logit contributions
 it+0.017
 me+0.016
 please+0.015
 cho+0.013
osh+0.013
Neg logit contributions
handed-0.020
 former-0.020
urance-0.019
 arrest-0.018
ongo-0.018
Token
Token ID198
Feature activation-9.79e-01

Pos logit contributions
 it+0.035
 me+0.033
 please+0.031
 cho+0.027
"+0.027
Neg logit contributions
handed-0.055
 former-0.053
 arrest-0.051
urance-0.050
view-0.050
Token
Token ID198
Feature activation-2.24e-01

Pos logit contributions
 it+0.028
 me+0.028
 please+0.027
 cho+0.025
una+0.023
Neg logit contributions
handed-0.037
 former-0.034
urance-0.034
 arrest-0.033
-0.032
Token mess
Token ID2085
Feature activation+1.11e-01

Pos logit contributions
una+0.018
 it+0.017
 me+0.017
 please+0.017
out+0.017
Neg logit contributions
handed-0.021
 former-0.020
-0.020
 arrest-0.018
 repaired-0.018
Token.
Token ID13
Feature activation+1.21e+00

Pos logit contributions
handed+0.005
 former+0.004
+0.004
urance+0.004
 arrest+0.004
Neg logit contributions
 it-0.003
 me-0.003
 please-0.003
 cho-0.002
 closer-0.002
Token The
Token ID383
Feature activation+7.57e-01

Pos logit contributions
handed+0.034
 former+0.030
+0.029
 arrest+0.029
urance+0.028
Neg logit contributions
 me-0.046
 it-0.046
 please-0.045
 cho-0.041
 closer-0.039
Token bed
Token ID3996
Feature activation-4.58e-01

Pos logit contributions
handed+0.029
 former+0.026
+0.026
urance+0.025
 arrest+0.025
Neg logit contributions
 please-0.021
 me-0.021
 it-0.020
 cho-0.019
una-0.017
Token was
Token ID373
Feature activation-5.13e-01

Pos logit contributions
 me+0.012
 please+0.012
 it+0.011
 cho+0.011
 closer+0.010
Neg logit contributions
handed-0.018
-0.016
 former-0.016
urance-0.016
 arrest-0.016
Token un
Token ID555
Feature activation+3.51e+00

Pos logit contributions
 me+0.015
 please+0.015
 it+0.015
 cho+0.014
una+0.013
Neg logit contributions
handed-0.018
 former-0.016
-0.016
urance-0.016
 arrest-0.016
Tokenmade
Token ID9727
Feature activation-1.53e+00

Pos logit contributions
handed+0.104
+0.088
 former+0.087
urance+0.083
 arrest+0.083
Neg logit contributions
 me-0.131
 it-0.130
 please-0.129
 cho-0.121
una-0.115
Token,
Token ID11
Feature activation+7.64e-01

Pos logit contributions
 me+0.040
 please+0.040
 it+0.039
 cho+0.036
una+0.033
Neg logit contributions
handed-0.060
 former-0.054
-0.053
urance-0.052
 arrest-0.052
Token toys
Token ID14958
Feature activation+8.84e-01

Pos logit contributions
handed+0.030
 former+0.027
+0.027
urance+0.026
 arrest+0.026
Neg logit contributions
 me-0.021
 it-0.021
 please-0.020
 cho-0.019
 closer-0.017
Token were
Token ID547
Feature activation+5.24e-01

Pos logit contributions
handed+0.036
 former+0.033
+0.032
urance+0.031
 arrest+0.030
Neg logit contributions
 it-0.024
 me-0.023
 please-0.023
 cho-0.021
 closer-0.019
Token everywhere
Token ID8347
Feature activation+1.20e+00

Pos logit contributions
handed+0.019
+0.017
 former+0.017
urance+0.016
 arrest+0.016
Neg logit contributions
 me-0.016
 it-0.016
 please-0.016
 cho-0.015
 closer-0.013
Token panic
Token ID13619
Feature activation+4.50e-01

Pos logit contributions
handed+0.033
 former+0.029
+0.029
urance+0.028
 arrest+0.028
Neg logit contributions
 me-0.038
 please-0.038
 it-0.038
 cho-0.036
una-0.033
Token.
Token ID13
Feature activation+1.13e+00

Pos logit contributions
handed+0.016
 former+0.014
urance+0.014
+0.014
 arrest+0.013
Neg logit contributions
 me-0.013
 please-0.013
 it-0.012
 cho-0.012
osh-0.011
Token The
Token ID383
Feature activation+1.31e+00

Pos logit contributions
handed+0.036
 former+0.032
+0.031
 arrest+0.030
urance+0.030
Neg logit contributions
 me-0.040
 it-0.040
 please-0.039
 cho-0.036
 closer-0.034
Token finger
Token ID7660
Feature activation-1.10e+00

Pos logit contributions
handed+0.046
 arrest+0.039
+0.039
view+0.036
urance+0.036
Neg logit contributions
 cho-0.039
 please-0.037
 me-0.034
 us-0.033
 Area-0.033
Token was
Token ID373
Feature activation+6.25e-01

Pos logit contributions
 please+0.030
 me+0.029
 it+0.029
 cho+0.028
una+0.026
Neg logit contributions
handed-0.040
-0.036
 former-0.035
 arrest-0.035
urance-0.034
Token so
Token ID523
Feature activation+2.71e+00

Pos logit contributions
handed+0.020
 former+0.018
urance+0.018
+0.018
 arrest+0.017
Neg logit contributions
 please-0.019
 me-0.019
 it-0.018
 cho-0.018
osh-0.017
Token hard
Token ID1327
Feature activation-9.09e-01

Pos logit contributions
handed+0.112
+0.100
 former+0.100
 arrest+0.097
urance+0.097
Neg logit contributions
 me-0.073
 please-0.071
 it-0.071
 cho-0.063
 closer-0.059
Token he
Token ID339
Feature activation+4.82e-01

Pos logit contributions
 it+0.020
 me+0.018
 please+0.017
 cho+0.015
 closer+0.013
Neg logit contributions
handed-0.044
 former-0.042
-0.040
urance-0.039
view-0.038
Token thought
Token ID1807
Feature activation+1.54e+00

Pos logit contributions
ongo+0.012
handed+0.012
 repaired+0.012
gy+0.012
 arrest+0.012
Neg logit contributions
 please-0.015
ours-0.015
una-0.014
out-0.014
ymes-0.014
Token it
Token ID340
Feature activation-9.90e-01

Pos logit contributions
handed+0.069
+0.067
urance+0.066
 ease+0.066
 arrest+0.065
Neg logit contributions
 me-0.022
 please-0.021
 cho-0.020
 Area-0.019
a-0.017
Token might
Token ID1244
Feature activation-1.15e+00

Pos logit contributions
 please+0.025
una+0.024
 me+0.023
out+0.023
 cho+0.022
Neg logit contributions
gy-0.034
-0.032
handed-0.032
 repaired-0.031
urance-0.031
Token a
Token ID257
Feature activation-3.20e-01

Pos logit contributions
handed+0.041
+0.038
 former+0.038
urance+0.037
 arrest+0.036
Neg logit contributions
 it-0.021
 me-0.021
 please-0.020
 cho-0.018
 closer-0.017
Token box
Token ID3091
Feature activation-8.89e-01

Pos logit contributions
 please+0.009
 me+0.009
 it+0.009
una+0.008
out+0.008
Neg logit contributions
handed-0.011
-0.010
 former-0.010
 arrest-0.009
ongo-0.009
Token with
Token ID351
Feature activation-4.63e-01

Pos logit contributions
 me+0.023
 please+0.023
 it+0.022
 cho+0.021
una+0.020
Neg logit contributions
handed-0.035
-0.031
 former-0.030
 arrest-0.030
urance-0.030
Token small
Token ID1402
Feature activation+2.24e-01

Pos logit contributions
 me+0.012
 please+0.012
 it+0.012
 cho+0.011
 closer+0.010
Neg logit contributions
handed-0.018
 former-0.016
-0.016
urance-0.016
 arrest-0.016
Token bags
Token ID11668
Feature activation+7.83e-01

Pos logit contributions
handed+0.009
 former+0.008
+0.008
urance+0.008
 arrest+0.007
Neg logit contributions
 me-0.006
 it-0.006
 please-0.006
 cho-0.006
 closer-0.005
Token of
Token ID286
Feature activation+2.73e+00

Pos logit contributions
handed+0.033
 former+0.031
+0.030
urance+0.030
 arrest+0.029
Neg logit contributions
 it-0.019
 me-0.019
 please-0.019
 cho-0.016
 closer-0.015
Token seeds
Token ID11904
Feature activation+1.16e-01

Pos logit contributions
handed+0.101
+0.091
 former+0.090
urance+0.088
 arrest+0.086
Neg logit contributions
 me-0.081
 it-0.080
 please-0.078
 cho-0.072
 closer-0.066
Token.
Token ID13
Feature activation+4.05e-01

Pos logit contributions
handed+0.005
 former+0.005
urance+0.005
+0.005
 arrest+0.005
Neg logit contributions
 it-0.003
 me-0.003
 please-0.002
 cho-0.002
 closer-0.002
Token He
Token ID679
Feature activation+1.04e+00

Pos logit contributions
handed+0.013
 former+0.011
+0.011
urance+0.011
 arrest+0.011
Neg logit contributions
 me-0.014
 it-0.014
 please-0.014
 cho-0.013
 closer-0.012
Token didn
Token ID1422
Feature activation+6.24e-01

Pos logit contributions
handed+0.031
 former+0.029
 arrest+0.028
+0.028
 widened+0.027
Neg logit contributions
 please-0.033
 cho-0.032
 it-0.030
 me-0.030
una-0.030
Token't
Token ID470
Feature activation+7.17e-01

Pos logit contributions
handed+0.022
 former+0.021
urance+0.019
keeping+0.019
ongo+0.019
Neg logit contributions
 it-0.020
 please-0.020
 me-0.019
 cho-0.017
 closer-0.017
TokenBut
Token ID1537
Feature activation-1.00e+00

Pos logit contributions
handed+0.024
 former+0.021
+0.021
urance+0.020
 arrest+0.020
Neg logit contributions
 me-0.025
 please-0.025
 it-0.025
 cho-0.023
 closer-0.022
Token when
Token ID618
Feature activation-4.05e-02

Pos logit contributions
 it+0.027
 me+0.026
 please+0.026
 cho+0.023
 closer+0.021
Neg logit contributions
handed-0.041
 former-0.037
-0.037
urance-0.036
 arrest-0.036
Token he
Token ID339
Feature activation+1.92e+00

Pos logit contributions
 it+0.001
 me+0.001
 please+0.001
 cho+0.001
 closer+0.001
Neg logit contributions
handed-0.002
 former-0.002
urance-0.002
view-0.001
ongo-0.001
Token looked
Token ID3114
Feature activation+3.44e+00

Pos logit contributions
handed+0.079
 former+0.072
+0.070
urance+0.069
 arrest+0.068
Neg logit contributions
 me-0.052
 please-0.050
 it-0.050
 cho-0.043
 closer-0.042
Token out
Token ID503
Feature activation+1.12e+00

Pos logit contributions
handed+0.120
 former+0.105
 widened+0.099
arre+0.099
 repaired+0.096
Neg logit contributions
 cho-0.094
 please-0.090
una-0.089
 me-0.084
 it-0.084
Token of
Token ID286
Feature activation+2.78e+00

Pos logit contributions
handed+0.043
+0.039
 former+0.039
urance+0.038
 arrest+0.038
Neg logit contributions
 me-0.028
 please-0.027
 cho-0.026
 it-0.026
una-0.025
Token his
Token ID465
Feature activation+1.50e+00

Pos logit contributions
handed+0.114
+0.102
 former+0.101
 arrest+0.101
urance+0.099
Neg logit contributions
 please-0.069
 me-0.067
 cho-0.063
 it-0.063
una-0.058
Token window
Token ID4324
Feature activation+1.99e+00

Pos logit contributions
handed+0.067
 former+0.061
+0.059
urance+0.059
keeping+0.059
Neg logit contributions
 me-0.036
 please-0.033
 it-0.032
 cho-0.027
 closer-0.026
Token,
Token ID11
Feature activation+7.22e-01

Pos logit contributions
handed+0.086
 former+0.078
+0.077
urance+0.076
 arrest+0.075
Neg logit contributions
 me-0.046
 it-0.046
 please-0.045
 cho-0.040
 closer-0.036
Token he
Token ID339
Feature activation-2.02e-02

Pos logit contributions
handed+0.027
 former+0.024
+0.024
urance+0.024
 arrest+0.023
Neg logit contributions
 me-0.021
 it-0.020
 please-0.020
 cho-0.019
 closer-0.017
Token saw
Token ID2497
Feature activation+4.39e-01

Pos logit contributions
 please+0.001
 it+0.001
 me+0.001
 cho+0.001
una+0.000
Neg logit contributions
handed-0.001
 former-0.001
-0.001
 arrest-0.001
urance-0.001
Token kept
Token ID4030
Feature activation-7.05e-01

Pos logit contributions
handed+0.015
 former+0.014
+0.014
urance+0.013
 arrest+0.013
Neg logit contributions
 me-0.012
 it-0.011
 please-0.011
 cho-0.011
 closer-0.010
Token on
Token ID319
Feature activation+1.64e-01

Pos logit contributions
 me+0.028
 it+0.028
 please+0.027
 cho+0.026
 closer+0.024
Neg logit contributions
handed-0.019
 former-0.017
-0.016
urance-0.016
 arrest-0.015
Token driving
Token ID5059
Feature activation+2.70e+00

Pos logit contributions
handed+0.005
+0.005
 former+0.004
ongo+0.004
igsaw+0.004
Neg logit contributions
 it-0.007
 me-0.006
 cho-0.006
 closer-0.006
 please-0.006
Token.
Token ID13
Feature activation+8.05e-01

Pos logit contributions
handed+0.098
 former+0.086
+0.085
urance+0.084
 arrest+0.083
Neg logit contributions
 me-0.076
 please-0.076
 it-0.070
 cho-0.070
una-0.067
Token He
Token ID679
Feature activation+1.09e+00

Pos logit contributions
handed+0.029
 arrest+0.027
 former+0.027
 affect+0.026
ivery+0.025
Neg logit contributions
 it-0.027
 me-0.025
 please-0.024
 "-0.021
 I-0.021
Token drove
Token ID10357
Feature activation+2.52e+00

Pos logit contributions
handed+0.040
 former+0.036
+0.035
urance+0.035
 arrest+0.034
Neg logit contributions
 me-0.032
 it-0.031
 please-0.031
 cho-0.029
 closer-0.027
Token around
Token ID1088
Feature activation-9.85e-01

Pos logit contributions
handed+0.085
urance+0.074
 former+0.074
 arrest+0.073
 ease+0.072
Neg logit contributions
 please-0.074
 me-0.071
 cho-0.068
una-0.067
 it-0.066
Token the
Token ID262
Feature activation+9.92e-01

Pos logit contributions
 me+0.023
 please+0.023
 it+0.022
 cho+0.021
una+0.019
Neg logit contributions
handed-0.041
 former-0.037
-0.037
 arrest-0.036
urance-0.036
Token house
Token ID2156
Feature activation-9.74e-02

Pos logit contributions
handed+0.037
+0.035
urance+0.035
 former+0.034
keeping+0.033
Neg logit contributions
 me-0.032
 it-0.029
 cho-0.026
 please-0.026
 is-0.024
Token,
Token ID11
Feature activation+5.84e-01

Pos logit contributions
 it+0.002
 me+0.002
 please+0.002
 cho+0.002
 closer+0.002
Neg logit contributions
handed-0.004
 former-0.004
-0.004
urance-0.004
ongo-0.004
Token in
Token ID287
Feature activation-2.79e-01

Pos logit contributions
handed+0.022
+0.019
 former+0.019
urance+0.019
ongo+0.018
Neg logit contributions
 it-0.018
 me-0.018
 please-0.017
 cho-0.017
 closer-0.016
Token
Token ID198
Feature activation+3.78e-01

Pos logit contributions
 me+0.008
 it+0.007
 please+0.007
 cho+0.007
 closer+0.006
Neg logit contributions
handed-0.008
 former-0.008
-0.007
urance-0.007
 arrest-0.007
TokenHer
Token ID9360
Feature activation-1.30e-01

Pos logit contributions
handed+0.014
 former+0.012
+0.012
urance+0.012
 arrest+0.011
Neg logit contributions
 me-0.012
 please-0.011
 it-0.011
 cho-0.011
 closer-0.010
Token dad
Token ID9955
Feature activation+2.43e+00

Pos logit contributions
 me+0.003
 it+0.003
 please+0.003
 closer+0.002
 cho+0.002
Neg logit contributions
handed-0.006
 arrest-0.005
-0.005
 former-0.005
ongo-0.005
Token smiled
Token ID13541
Feature activation-4.26e-01

Pos logit contributions
handed+0.092
+0.082
 former+0.080
urance+0.080
 arrest+0.079
Neg logit contributions
 please-0.065
 me-0.064
 it-0.063
 cho-0.060
una-0.055
Token and
Token ID290
Feature activation-9.33e-01

Pos logit contributions
 me+0.011
 it+0.011
 please+0.011
 cho+0.010
 closer+0.009
Neg logit contributions
handed-0.017
 former-0.016
-0.016
 arrest-0.015
urance-0.015
Token said
Token ID531
Feature activation-1.02e+00

Pos logit contributions
 please+0.024
 me+0.024
 it+0.023
 cho+0.022
una+0.021
Neg logit contributions
handed-0.036
-0.032
 former-0.032
ongo-0.031
 arrest-0.030
Token,
Token ID11
Feature activation-1.15e+00

Pos logit contributions
 me+0.032
una+0.030
ymes+0.028
 please+0.028
 cho+0.028
Neg logit contributions
handed-0.032
-0.030
urance-0.028
 arrest-0.026
gy-0.026
Token "
Token ID366
Feature activation-1.89e+00

Pos logit contributions
 me+0.019
 please+0.017
 it+0.017
out+0.016
una+0.016
Neg logit contributions
handed-0.052
-0.052
urance-0.049
 former-0.049
ongo-0.047
TokenIt
Token ID1026
Feature activation-1.32e+00

Pos logit contributions
 me+0.056
 please+0.055
 it+0.054
 cho+0.052
una+0.047
Neg logit contributions
handed-0.068
 former-0.060
-0.058
ongo-0.056
view-0.055
Token sure
Token ID1654
Feature activation-5.80e-01

Pos logit contributions
 please+0.035
 cho+0.033
 me+0.033
 it+0.032
una+0.030
Neg logit contributions
handed-0.048
 former-0.043
arre-0.043
gy-0.042
urance-0.042
Token is
Token ID318
Feature activation-1.33e+00

Pos logit contributions
 please+0.012
 cho+0.011
 me+0.011
 Area+0.011
 closer+0.011
Neg logit contributions
handed-0.022
 arrest-0.021
urance-0.020
 accomplish-0.020
 former-0.020
TokenThe
Token ID464
Feature activation+2.68e+00

Pos logit contributions
 me+0.003
 it+0.003
 please+0.003
 cho+0.003
 closer+0.002
Neg logit contributions
handed-0.002
 former-0.002
-0.002
urance-0.002
 arrest-0.002
Token little
Token ID1310
Feature activation+4.47e+00

Pos logit contributions
handed+0.109
 former+0.099
+0.097
urance+0.095
 arrest+0.095
Neg logit contributions
 me-0.067
 please-0.066
 it-0.066
 cho-0.060
 closer-0.054
Token girl
Token ID2576
Feature activation+2.25e+00

Pos logit contributions
handed+0.204
 former+0.187
+0.185
 arrest+0.182
urance+0.181
Neg logit contributions
 please-0.087
 me-0.086
 it-0.085
 cho-0.077
una-0.067
Token said
Token ID531
Feature activation-6.20e-01

Pos logit contributions
handed+0.087
 former+0.079
+0.078
urance+0.076
 arrest+0.076
Neg logit contributions
 please-0.058
 me-0.058
 it-0.057
 cho-0.054
una-0.049
Token,
Token ID11
Feature activation-5.75e-01

Pos logit contributions
 me+0.018
una+0.017
 cho+0.017
 please+0.017
ymes+0.016
Neg logit contributions
handed-0.020
-0.018
urance-0.018
 former-0.017
gy-0.017
Token "
Token ID366
Feature activation-1.32e+00

Pos logit contributions
 me+0.010
 please+0.009
 cho+0.009
 it+0.009
una+0.009
Neg logit contributions
handed-0.026
-0.025
urance-0.024
 former-0.024
 arrest-0.023
TokenOkay
Token ID16454
Feature activation-1.85e+00

Pos logit contributions
 me+0.033
 it+0.032
 please+0.032
 cho+0.028
 closer+0.026
Neg logit contributions
handed-0.055
 former-0.050
-0.050
urance-0.049
 arrest-0.049
Token,
Token ID11
Feature activation-7.51e-01

Pos logit contributions
 cho+0.050
una+0.050
out+0.050
ymes+0.050
osh+0.049
Neg logit contributions
handed-0.059
urance-0.057
-0.056
 former-0.053
gy-0.053
Token let
Token ID1309
Feature activation-1.11e+00

Pos logit contributions
out+0.015
 cho+0.014
ymes+0.014
 me+0.014
h+0.014
Neg logit contributions
handed-0.029
 former-0.028
urance-0.027
 arrest-0.027
-0.027
Token's
Token ID338
Feature activation-1.55e+00

Pos logit contributions
 please+0.028
 it+0.026
 me+0.026
 cho+0.025
 closer+0.023
Neg logit contributions
handed-0.045
 arrest-0.039
-0.039
urance-0.038
 former-0.038
Token go
Token ID467
Feature activation-7.39e-01

Pos logit contributions
 me+0.037
 please+0.036
 it+0.035
 closer+0.026
 cho+0.026
Neg logit contributions
handed-0.073
-0.065
ongo-0.065
urance-0.065
 former-0.064
Token He
Token ID679
Feature activation+6.71e-01

Pos logit contributions
handed+0.016
 former+0.015
 arrest+0.014
view+0.014
urance+0.014
Neg logit contributions
 it-0.014
 please-0.013
 me-0.013
 cho-0.011
 I-0.011
Token falls
Token ID8953
Feature activation-2.39e-01

Pos logit contributions
 arrest+0.025
handed+0.023
+0.022
 accomplish+0.021
 ease+0.021
Neg logit contributions
 please-0.018
h-0.016
 cho-0.015
 me-0.015
 Area-0.015
Token into
Token ID656
Feature activation+1.03e+00

Pos logit contributions
 please+0.007
 me+0.006
 it+0.006
una+0.006
 cho+0.006
Neg logit contributions
handed-0.008
-0.007
 arrest-0.007
 ease-0.007
urance-0.007
Token the
Token ID262
Feature activation+1.29e+00

Pos logit contributions
handed+0.051
 former+0.049
ongo+0.049
urance+0.048
+0.048
Neg logit contributions
 it-0.021
 me-0.019
 please-0.016
 cho-0.013
 closer-0.013
Token river
Token ID7850
Feature activation-4.84e-02

Pos logit contributions
handed+0.047
 former+0.042
+0.041
 arrest+0.041
urance+0.039
Neg logit contributions
 please-0.037
 it-0.036
 me-0.034
 cho-0.032
h-0.032
Token.
Token ID13
Feature activation+2.43e-01

Pos logit contributions
 please+0.001
 me+0.001
 it+0.001
 cho+0.001
una+0.001
Neg logit contributions
handed-0.002
 arrest-0.001
-0.001
 former-0.001
urance-0.001
Token He
Token ID679
Feature activation+5.89e-01

Pos logit contributions
handed+0.009
 former+0.008
urance+0.007
 arrest+0.007
view+0.007
Neg logit contributions
 it-0.008
 please-0.008
 me-0.008
 cho-0.007
una-0.007
Token screams
Token ID26557
Feature activation-1.29e+00

Pos logit contributions
 arrest+0.021
handed+0.021
+0.020
 accomplish+0.019
 former+0.018
Neg logit contributions
 please-0.015
 me-0.013
h-0.013
 it-0.013
 cho-0.013
Token.
Token ID13
Feature activation-1.35e-02

Pos logit contributions
 please+0.047
cuts+0.046
osh+0.046
ymes+0.046
ours+0.045
Neg logit contributions
handed-0.026
 arrest-0.026
urance-0.025
gy-0.023
 widened-0.023
Token\n
Token ID198
Feature activation+9.83e-02

Pos logit contributions
 it+0.000
 me+0.000
 please+0.000
 cho+0.000
una+0.000
Neg logit contributions
handed-0.001
 former-0.000
urance-0.000
 arrest-0.000
view-0.000
Token\n
Token ID198
Feature activation+4.86e-01

Pos logit contributions
handed+0.003
 former+0.003
+0.003
urance+0.003
 arrest+0.003
Neg logit contributions
 me-0.003
 please-0.003
 it-0.003
 cho-0.003
 closer-0.003
Token lost
Token ID2626
Feature activation+1.11e+00

Pos logit contributions
 please+0.015
 me+0.013
 cho+0.013
 it+0.013
h+0.013
Neg logit contributions
handed-0.014
 arrest-0.013
-0.013
 widened-0.012
 former-0.012
Token her
Token ID607
Feature activation+1.01e-01

Pos logit contributions
handed+0.033
 arrest+0.032
 ease+0.031
keeping+0.030
 himself+0.030
Neg logit contributions
 please-0.035
cuts-0.034
 awaits-0.033
osh-0.033
 happens-0.032
Token balance
Token ID5236
Feature activation-9.79e-01

Pos logit contributions
handed+0.004
 former+0.004
urance+0.004
+0.004
ongo+0.004
Neg logit contributions
 it-0.002
 me-0.002
 please-0.002
 cho-0.002
 closer-0.002
Token and
Token ID290
Feature activation-1.28e-01

Pos logit contributions
 please+0.032
 me+0.030
 cho+0.029
 it+0.028
 Area+0.027
Neg logit contributions
handed-0.031
-0.026
ongo-0.026
 arrest-0.025
view-0.025
Token fell
Token ID3214
Feature activation-8.81e-01

Pos logit contributions
 please+0.004
una+0.004
 me+0.004
out+0.004
h+0.004
Neg logit contributions
 arrest-0.004
handed-0.004
 ease-0.003
-0.003
 repaired-0.003
Token in
Token ID287
Feature activation+1.72e-01

Pos logit contributions
 please+0.024
 me+0.024
 it+0.022
una+0.021
 cho+0.021
Neg logit contributions
handed-0.030
-0.028
 arrest-0.028
 former-0.027
urance-0.027
Token the
Token ID262
Feature activation+8.35e-01

Pos logit contributions
handed+0.008
 former+0.007
+0.007
urance+0.007
 arrest+0.007
Neg logit contributions
 me-0.004
 it-0.004
 please-0.004
 cho-0.003
 closer-0.003
Token creek
Token ID39459
Feature activation-1.61e+00

Pos logit contributions
handed+0.033
+0.029
 former+0.029
urance+0.028
 arrest+0.028
Neg logit contributions
 please-0.022
 me-0.022
 it-0.022
 cho-0.020
h-0.019
Token.
Token ID13
Feature activation-5.54e-01

Pos logit contributions
 please+0.055
 me+0.052
una+0.050
 cho+0.050
 it+0.049
Neg logit contributions
handed-0.053
 arrest-0.043
-0.043
 former-0.041
keeping-0.039
Token\n
Token ID198
Feature activation-9.89e-02

Pos logit contributions
 it+0.017
 me+0.016
 please+0.016
 cho+0.014
 closer+0.013
Neg logit contributions
handed-0.021
 former-0.019
urance-0.018
-0.018
 arrest-0.018
Token\n
Token ID198
Feature activation+4.25e-01

Pos logit contributions
 me+0.003
 please+0.003
 it+0.003
 cho+0.003
 closer+0.003
Neg logit contributions
handed-0.003
-0.003
 former-0.003
urance-0.003
 arrest-0.003
TokenTom
Token ID13787
Feature activation+6.66e-01

Pos logit contributions
 me+0.010
 it+0.010
 please+0.010
 cho+0.009
h+0.008
Neg logit contributions
handed-0.013
 former-0.012
-0.012
urance-0.012
 arrest-0.011
Token nods
Token ID34833
Feature activation-1.61e+00

Pos logit contributions
handed+0.030
 former+0.027
+0.027
urance+0.027
 arrest+0.026
Neg logit contributions
 please-0.014
 it-0.014
 me-0.013
 cho-0.013
 closer-0.011
Token and
Token ID290
Feature activation-5.19e-01

Pos logit contributions
 cho+0.049
 me+0.049
 please+0.049
 it+0.047
osh+0.046
Neg logit contributions
handed-0.052
 former-0.045
urance-0.045
-0.044
view-0.043
Token smiles
Token ID21845
Feature activation-1.27e-01

Pos logit contributions
 please+0.010
 me+0.010
 cho+0.010
 it+0.009
h+0.009
Neg logit contributions
handed-0.022
 arrest-0.021
-0.020
 former-0.020
urance-0.020
Token.
Token ID13
Feature activation+4.99e-01

Pos logit contributions
 me+0.003
 it+0.003
 please+0.003
 cho+0.003
 closer+0.003
Neg logit contributions
handed-0.005
 former-0.005
-0.005
 arrest-0.005
urance-0.005
Token He
Token ID679
Feature activation+6.68e-02

Pos logit contributions
handed+0.012
 former+0.010
+0.010
urance+0.010
 arrest+0.009
Neg logit contributions
 me-0.021
 please-0.021
 it-0.021
 cho-0.020
 closer-0.019
Token says
Token ID1139
Feature activation-1.20e+00

Pos logit contributions
handed+0.003
 former+0.003
+0.003
 arrest+0.003
urance+0.003
Neg logit contributions
 please-0.001
 me-0.001
 it-0.001
 cho-0.001
h-0.001
Token,
Token ID11
Feature activation-1.01e+00

Pos logit contributions
 me+0.043
out+0.042
una+0.041
 Area+0.040
 us+0.039
Neg logit contributions
handed-0.025
-0.024
 accomplish-0.023
urance-0.022
 disl-0.022
Token "
Token ID366
Feature activation-1.93e+00

Pos logit contributions
 me+0.018
out+0.016
 please+0.016
 cho+0.015
 it+0.015
Neg logit contributions
-0.045
handed-0.044
urance-0.042
 former-0.041
ongo-0.041
TokenYes
Token ID5297
Feature activation-3.35e+00

Pos logit contributions
 me+0.057
 it+0.055
 please+0.052
 cho+0.052
out+0.050
Neg logit contributions
handed-0.070
-0.063
ongo-0.061
 former-0.060
view-0.059
Token,
Token ID11
Feature activation-1.19e+00

Pos logit contributions
ymes+0.131
 Area+0.122
una+0.118
out+0.117
osh+0.116
Neg logit contributions
-0.073
urance-0.067
handed-0.064
ongo-0.063
 arrest-0.062
Token upon
Token ID2402
Feature activation-2.68e+00

Pos logit contributions
handed+0.012
 former+0.011
+0.011
urance+0.010
 arrest+0.010
Neg logit contributions
 me-0.008
 please-0.008
 it-0.008
 cho-0.008
 closer-0.007
Token a
Token ID257
Feature activation-1.16e+00

Pos logit contributions
 it+0.054
 me+0.047
 please+0.046
 closer+0.034
una+0.032
Neg logit contributions
handed-0.134
 former-0.131
-0.128
urance-0.127
ongo-0.121
Token time
Token ID640
Feature activation-4.11e-01

Pos logit contributions
 me+0.037
 please+0.035
 it+0.034
 cho+0.030
una+0.030
Neg logit contributions
handed-0.043
 former-0.039
keeping-0.038
 arrest-0.037
-0.036
Token,
Token ID11
Feature activation-2.31e-01

Pos logit contributions
 me+0.009
 it+0.009
 please+0.009
 cho+0.008
 closer+0.007
Neg logit contributions
handed-0.018
 former-0.017
-0.016
urance-0.016
 arrest-0.016
Token there
Token ID612
Feature activation-3.42e-01

Pos logit contributions
 it+0.004
 me+0.004
 please+0.004
 cho+0.003
 closer+0.003
Neg logit contributions
handed-0.012
 former-0.011
-0.010
urance-0.010
 arrest-0.010
Token was
Token ID373
Feature activation-1.71e+00

Pos logit contributions
 please+0.008
 me+0.008
 it+0.007
 cho+0.007
 closer+0.006
Neg logit contributions
handed-0.015
-0.013
 former-0.013
urance-0.013
 arrest-0.013
Token a
Token ID257
Feature activation-2.11e+00

Pos logit contributions
 it+0.028
 me+0.024
 please+0.018
 its+0.014
 Anna+0.014
Neg logit contributions
handed-0.096
 former-0.091
urance-0.091
 affect-0.089
ongo-0.089
Token little
Token ID1310
Feature activation+4.65e-01

Pos logit contributions
 me+0.070
 it+0.070
 please+0.069
 cho+0.065
out+0.063
Neg logit contributions
handed-0.064
-0.058
 former-0.057
 arrest-0.054
ongo-0.054
Token girl
Token ID2576
Feature activation-9.37e-01

Pos logit contributions
handed+0.016
+0.015
ongo+0.015
 arrest+0.015
gy+0.014
Neg logit contributions
 it-0.012
 me-0.012
 please-0.012
h-0.011
una-0.010
Token named
Token ID3706
Feature activation+1.21e+00

Pos logit contributions
 please+0.034
 me+0.033
 it+0.033
 cho+0.031
osh+0.031
Neg logit contributions
handed-0.026
 arrest-0.022
gy-0.021
-0.021
 former-0.021
Token Lily
Token ID20037
Feature activation-2.15e+00

Pos logit contributions
handed+0.046
+0.042
urance+0.042
 former+0.041
 arrest+0.040
Neg logit contributions
 cho-0.030
 please-0.030
 me-0.029
 it-0.029
osh-0.028
Token toy
Token ID13373
Feature activation+1.31e+00

Pos logit contributions
handed+0.072
 former+0.065
+0.064
urance+0.063
 arrest+0.062
Neg logit contributions
 me-0.043
 please-0.042
 it-0.042
 cho-0.038
 closer-0.034
Token.
Token ID13
Feature activation-1.22e-01

Pos logit contributions
handed+0.052
+0.046
 arrest+0.046
 former+0.045
urance+0.044
Neg logit contributions
 me-0.033
 please-0.033
 it-0.032
 cho-0.030
una-0.028
Token She
Token ID1375
Feature activation-3.22e-02

Pos logit contributions
 me+0.004
 please+0.004
 it+0.004
 cho+0.004
 closer+0.003
Neg logit contributions
handed-0.004
-0.004
 former-0.004
urance-0.003
 arrest-0.003
Token held
Token ID2714
Feature activation-9.48e-01

Pos logit contributions
 please+0.001
 me+0.001
 it+0.001
 cho+0.001
una+0.001
Neg logit contributions
handed-0.001
 arrest-0.001
-0.001
 former-0.001
ongo-0.001
Token it
Token ID340
Feature activation-1.49e+00

Pos logit contributions
 please+0.012
 cho+0.010
una+0.010
h+0.009
 me+0.009
Neg logit contributions
handed-0.048
 ease-0.046
-0.046
 arrest-0.045
urance-0.042
Token tightly
Token ID17707
Feature activation-2.57e+00

Pos logit contributions
 please+0.038
 me+0.038
 cho+0.037
 it+0.036
una+0.034
Neg logit contributions
handed-0.056
 former-0.051
-0.049
urance-0.049
 arrest-0.048
Token and
Token ID290
Feature activation+4.35e-01

Pos logit contributions
 it+0.064
 me+0.063
 please+0.060
 cho+0.054
 closer+0.052
Neg logit contributions
handed-0.110
 former-0.100
-0.098
urance-0.097
 arrest-0.095
Token ran
Token ID4966
Feature activation-3.49e-01

Pos logit contributions
handed+0.016
+0.014
 former+0.014
 arrest+0.014
urance+0.013
Neg logit contributions
 please-0.013
 me-0.012
 it-0.012
 cho-0.011
una-0.011
Token outside
Token ID2354
Feature activation-4.68e-01

Pos logit contributions
 please+0.008
 me+0.008
 it+0.008
 cho+0.007
una+0.007
Neg logit contributions
handed-0.015
-0.013
 former-0.013
 arrest-0.013
urance-0.013
Token to
Token ID284
Feature activation-6.06e-01

Pos logit contributions
 me+0.012
 please+0.011
 it+0.011
 cho+0.010
una+0.009
Neg logit contributions
handed-0.019
 former-0.017
-0.017
urance-0.017
 arrest-0.017
Token play
Token ID711
Feature activation-6.60e-01

Pos logit contributions
 me+0.018
 it+0.018
 please+0.017
 cho+0.015
 closer+0.015
Neg logit contributions
handed-0.023
 former-0.021
-0.021
urance-0.020
ongo-0.020
Token her
Token ID607
Feature activation-6.95e-01

Pos logit contributions
 me+0.024
 it+0.024
 please+0.024
 cho+0.019
 closer+0.017
Neg logit contributions
handed-0.069
 former-0.063
-0.062
urance-0.061
 arrest-0.061
Token face
Token ID1986
Feature activation-6.73e-01

Pos logit contributions
 me+0.018
 it+0.018
 please+0.017
 closer+0.015
 softly+0.013
Neg logit contributions
handed-0.030
 former-0.028
-0.027
 arrest-0.027
arre-0.025
Token and
Token ID290
Feature activation-1.15e+00

Pos logit contributions
 it+0.016
 me+0.016
 please+0.016
 cho+0.014
 closer+0.013
Neg logit contributions
handed-0.029
 former-0.026
 arrest-0.026
-0.026
urance-0.025
Token said
Token ID531
Feature activation-1.18e+00

Pos logit contributions
 me+0.028
 please+0.028
 it+0.027
 cho+0.025
una+0.023
Neg logit contributions
handed-0.047
 former-0.043
-0.042
urance-0.042
 arrest-0.041
Token,
Token ID11
Feature activation-1.45e+00

Pos logit contributions
 me+0.040
una+0.037
 cho+0.036
 us+0.036
osh+0.035
Neg logit contributions
handed-0.032
-0.031
urance-0.030
 accomplish-0.027
 former-0.026
Token "
Token ID366
Feature activation-1.91e+00

Pos logit contributions
 me+0.021
 please+0.021
 it+0.020
 cho+0.018
out+0.016
Neg logit contributions
handed-0.071
-0.069
urance-0.066
 former-0.066
 arrest-0.064
TokenLet
Token ID5756
Feature activation-3.09e-01

Pos logit contributions
 it+0.049
 me+0.049
 please+0.048
 cho+0.046
out+0.041
Neg logit contributions
handed-0.076
-0.069
 former-0.069
ongo-0.065
urance-0.064
Token's
Token ID338
Feature activation-1.67e+00

Pos logit contributions
 cho+0.008
 please+0.008
osh+0.008
 it+0.008
 Area+0.008
Neg logit contributions
handed-0.011
urance-0.010
 arrest-0.009
arre-0.009
-0.009
Token go
Token ID467
Feature activation-8.93e-01

Pos logit contributions
 me+0.046
 please+0.045
 it+0.044
 us+0.033
 will+0.033
Neg logit contributions
handed-0.074
ongo-0.066
 former-0.064
-0.064
view-0.064
Token together
Token ID1978
Feature activation-2.66e-01

Pos logit contributions
 me+0.023
 it+0.023
 please+0.023
 cho+0.020
 closer+0.019
Neg logit contributions
handed-0.037
 former-0.034
urance-0.033
-0.033
 arrest-0.032
Token.
Token ID13
Feature activation-4.99e-02

Pos logit contributions
 me+0.006
 please+0.006
 it+0.006
 cho+0.006
una+0.005
Neg logit contributions
handed-0.011
 former-0.010
-0.010
urance-0.010
 arrest-0.009
Token to
Token ID284
Feature activation+1.00e-02

Pos logit contributions
 please+0.006
out+0.006
 me+0.006
a+0.006
osh+0.005
Neg logit contributions
handed-0.012
 arrest-0.012
keeping-0.011
 former-0.011
ongo-0.011
Token eat
Token ID4483
Feature activation-5.84e-01

Pos logit contributions
handed+0.000
 former+0.000
+0.000
urance+0.000
 arrest+0.000
Neg logit contributions
 me-0.000
 please-0.000
 it-0.000
 cho-0.000
 closer-0.000
Token apples
Token ID22514
Feature activation+1.00e+00

Pos logit contributions
 please+0.019
 me+0.017
 it+0.017
 cho+0.017
osh+0.017
Neg logit contributions
handed-0.018
 arrest-0.016
 former-0.016
-0.016
urance-0.015
Token from
Token ID422
Feature activation+2.12e+00

Pos logit contributions
handed+0.043
 former+0.040
urance+0.039
+0.039
 arrest+0.038
Neg logit contributions
 it-0.024
 me-0.024
 please-0.024
 cho-0.021
 closer-0.019
Token a
Token ID257
Feature activation-6.53e-01

Pos logit contributions
handed+0.087
 former+0.079
+0.078
 arrest+0.076
urance+0.076
Neg logit contributions
 please-0.053
 me-0.053
 it-0.051
 cho-0.047
 closer-0.043
Token tree
Token ID5509
Feature activation-2.56e+00

Pos logit contributions
 please+0.019
ours+0.019
una+0.018
 it+0.018
 happens+0.018
Neg logit contributions
handed-0.021
-0.019
 arrest-0.019
ongo-0.018
Mic-0.018
Token that
Token ID326
Feature activation-1.74e+00

Pos logit contributions
 me+0.062
 it+0.061
 please+0.060
 cho+0.054
 closer+0.049
Neg logit contributions
handed-0.108
 former-0.098
-0.098
urance-0.096
 arrest-0.094
Token belonged
Token ID19611
Feature activation+8.17e-01

Pos logit contributions
 please+0.049
 me+0.048
 it+0.044
 cho+0.044
una+0.043
Neg logit contributions
handed-0.063
-0.056
gy-0.054
 former-0.054
 arrest-0.053
Token to
Token ID284
Feature activation-8.62e-01

Pos logit contributions
handed+0.034
gy+0.030
keeping+0.030
ongo+0.029
 former+0.029
Neg logit contributions
 please-0.019
 me-0.018
 it-0.018
 cho-0.017
osh-0.017
Token someone
Token ID2130
Feature activation+2.70e-01

Pos logit contributions
 please+0.025
 me+0.024
 it+0.024
 cho+0.023
 closer+0.022
Neg logit contributions
handed-0.031
 former-0.028
 arrest-0.027
-0.027
urance-0.027
Token else
Token ID2073
Feature activation+2.16e-01

Pos logit contributions
handed+0.012
+0.010
 former+0.010
 arrest+0.010
ongo+0.010
Neg logit contributions
 me-0.006
 please-0.006
 it-0.006
 cho-0.005
una-0.005
TokenOnce
Token ID7454
Feature activation+3.02e-01

Pos logit contributions
 it+0.031
 me+0.030
 please+0.028
 cho+0.026
 is+0.025
Neg logit contributions
handed-0.041
 former-0.039
urance-0.037
 arrest-0.036
ongo-0.035
Token upon
Token ID2402
Feature activation-2.52e+00

Pos logit contributions
handed+0.010
+0.009
ongo+0.009
urance+0.009
 former+0.009
Neg logit contributions
 me-0.009
 please-0.009
 it-0.008
 cho-0.008
una-0.007
Token a
Token ID257
Feature activation-1.17e+00

Pos logit contributions
 it+0.051
 me+0.043
 please+0.042
 closer+0.032
 is+0.030
Neg logit contributions
handed-0.126
 former-0.126
-0.120
urance-0.120
ongo-0.115
Token time
Token ID640
Feature activation-4.46e-01

Pos logit contributions
 me+0.039
 please+0.037
 it+0.037
 cho+0.034
una+0.032
Neg logit contributions
handed-0.040
 former-0.036
-0.034
keeping-0.034
 arrest-0.034
Token,
Token ID11
Feature activation-2.18e-01

Pos logit contributions
 me+0.010
 please+0.010
 it+0.010
 cho+0.009
 closer+0.008
Neg logit contributions
handed-0.020
 former-0.018
-0.017
urance-0.017
 arrest-0.017
Token a
Token ID257
Feature activation-1.92e+00

Pos logit contributions
 me+0.005
 please+0.005
 it+0.005
 cho+0.004
 closer+0.004
Neg logit contributions
handed-0.010
 former-0.009
-0.009
urance-0.009
 arrest-0.009
Token cheerful
Token ID37999
Feature activation-1.87e+00

Pos logit contributions
 me+0.055
 please+0.054
 it+0.054
 cho+0.050
 closer+0.046
Neg logit contributions
handed-0.071
 former-0.064
-0.063
urance-0.062
 arrest-0.061
Token little
Token ID1310
Feature activation+3.07e-01

Pos logit contributions
 please+0.051
 it+0.050
 me+0.050
 cho+0.046
 closer+0.044
Neg logit contributions
handed-0.069
-0.062
 former-0.061
 arrest-0.060
urance-0.060
Token girl
Token ID2576
Feature activation-6.62e-01

Pos logit contributions
handed+0.011
+0.011
ongo+0.010
gy+0.010
 arrest+0.010
Neg logit contributions
 please-0.008
 me-0.007
 it-0.007
out-0.006
una-0.006
Token named
Token ID3706
Feature activation+1.53e+00

Pos logit contributions
 please+0.022
 me+0.021
 it+0.021
 cho+0.020
 closer+0.020
Neg logit contributions
handed-0.020
 arrest-0.018
-0.018
urance-0.017
 former-0.017
Token Lily
Token ID20037
Feature activation-1.59e+00

Pos logit contributions
handed+0.056
urance+0.051
+0.051
 arrest+0.049
 former+0.049
Neg logit contributions
 cho-0.039
 please-0.039
 me-0.038
 it-0.037
out-0.034