TOP ACTIVATIONS
MAX = 12.991

friend , Sam , " Which writing place should I go
mom asked him , " Which animal do you want to
The seed asked , " Which way should I grow at
He asked Anna , " Which one is your favorite ?"
go . " Which way , Tom ?" Lily
man for help . " Which cake is good for my
asked his mom , " Which one do I get ?"
Emily asked Max , " Which leash do you want ?"
colors . " Which dress do you like ,
“ Which way should I go ?
asked her mum , " Where are we going today ?"
story . " Which story do you want ,
asked her mom , " Where do we live ?"
asked her Dad : " Where is Daisy ? I miss
He asks Sue , " Where is the red car ?"
asked Mom my , " Where are we going ?" Mom
. Mark asked , " Where should we go ?"
asked her mother : " Where does Daddy go every morning
his mom my , " Where is my computer ?" His
asked his mom , " Where can I go ?". His

MIN ACTIVATIONS
MIN = -12.247

Together they t ipt o ed through , searching for treasures
to the bell . Jack stood and looked at the bell
nailed down , the children stood back and admired their work
on it . The bug stood there for a moment ,
, Jack ?" Jack slowly stood up , rubbing his knees
and happy . The squirrel stood up on its hind legs
tree for treasure !" They dug a small hole and found
some string . N ana stood back and smiled .
was finished , the bunny stood back and admired it .
! The bear stood up , grow led and
with a smile . She stood up and looked at her
could stand still . He stood quietly , with no movement
had an idea . She stood up on the chair ,
open land . He proudly stood tall , holding the spear
, shiny lock . She stood there looking at the lock
the kiss . The farmer stood and watched with a big
Jack and Jill stood in the middle of their
hard and smiled . He stood back and watched the worm
big round rock . He stood on it and tried to
Kat ie be amed with joy . " Look

INTERVAL 6.682 - 12.991
CONTAINS 0.626%

lonely . " Can we be your friends ?"
crash . " Oh no !" Lily says .
happy 3 year old . Most days , he loved to
" What is it ? Can I help ?" Her mom
" What is this ? What do these words mean ?"

INTERVAL 0.372 - 6.682
CONTAINS 40.876%

bad inch . He was mean , and he did not
the end of the night , the three of them thanked
gone . But , John never forgot how wonderful it was
his mother again soon . <|endoftext|> Once upon a time ,
It does , Anna . Are you ready to perform ?"

INTERVAL -5.937 - 0.372
CONTAINS 57.132%

. " To get to the other side !"
?" He jumped off the chair and ran to the door
for hours . He laughed with joy and thanked his mom
low rum ble and the heavy box in the corner started
me how to whistle ." The lady smiled and said ,

INTERVAL -12.247 - -5.937
CONTAINS 1.366%

, mum ?". Her mum paused , then said sadly ,
mom joined them . They twisted until they were dizz y
singing ! The rac ers dashed around the town and the
one was looking . She looked around and saw many things
necklace around its neck and looked at Lily . The seal
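
The four INTERVAL blocks above have bounds that match four equal-width bins between MIN (-12.247) and MAX (12.991), and their CONTAINS percentages sum to 100%. A minimal sketch of how such bins and shares could be computed follows. It assumes equal-width binning and that CONTAINS is the share of token activations falling in each bin. The function names `equal_width_edges` and `interval_shares` and the toy activation list are hypothetical, not taken from the tool that produced this dump.

```python
def equal_width_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 edges splitting [lo, hi] into equal-width bins."""
    step = (hi - lo) / n_bins
    # Pin the last edge to hi exactly so the maximum value always lands in a bin.
    return [lo + i * step for i in range(n_bins)] + [hi]


def interval_shares(activations, edges):
    """Return the percentage of activations in each bin.

    Bins are half-open [edge_i, edge_{i+1}), except the last, which also
    includes its upper edge. Values outside the edges are not counted.
    """
    n = len(edges) - 1
    counts = [0] * n
    for a in activations:
        for i in range(n):
            is_last = i == n - 1
            if edges[i] <= a < edges[i + 1] or (is_last and a == edges[-1]):
                counts[i] += 1
                break
    total = len(activations)
    return [100.0 * c / total for c in counts]


# Edges derived from the MIN and MAX values reported in this dump.
edges = equal_width_edges(-12.247, 12.991)
print([round(e, 3) for e in edges])

# Toy activations (made up for illustration only).
toy = [-12.247, -6.0, 0.0, 0.5, 12.991]
print(interval_shares(toy, edges))
```

The computed edges agree with the listed interval bounds to within rounding of the reported MIN and MAX. Whether the original tool bins activations exactly this way is an assumption.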
Token friend
Token ID1545
Feature activation+4.14e+00

Pos logit contributions
 harm+0.157
 worry+0.151
 happen+0.151
 matter+0.146
 intr+0.146
Neg logit contributions
amed-0.201
lic-0.166
apped-0.162
agged-0.162
acking-0.152
Token,
Token ID11
Feature activation+2.07e+00

Pos logit contributions
 matter+0.123
 harm+0.120
 worry+0.116
 risk+0.111
 test+0.106
Neg logit contributions
amed-0.142
lic-0.120
bled-0.117
agged-0.114
hes-0.114
Token Sam
Token ID3409
Feature activation+2.25e+00

Pos logit contributions
 matter+0.061
 harm+0.059
 worry+0.059
 appearance+0.053
 problems+0.052
Neg logit contributions
amed-0.073
agged-0.062
bled-0.061
lic-0.060
hes-0.059
Token,
Token ID11
Feature activation+1.77e+00

Pos logit contributions
 matter+0.073
 worry+0.068
 harm+0.067
 appearance+0.063
 risk+0.063
Neg logit contributions
amed-0.071
bled-0.059
lic-0.056
hes-0.055
agged-0.055
Token "
Token ID366
Feature activation+8.72e-01

Pos logit contributions
 matter+0.058
 worry+0.057
 harm+0.056
 appearance+0.051
 test+0.051
Neg logit contributions
amed-0.057
agged-0.047
bled-0.047
lic-0.046
hes-0.045
TokenWhich
Token ID13828
Feature activation+1.30e+01

Pos logit contributions
 matter+0.028
 worry+0.028
 harm+0.026
 test+0.024
 reason+0.024
Neg logit contributions
amed-0.030
agged-0.026
bled-0.025
lic-0.024
hes-0.024
Token writing
Token ID3597
Feature activation+2.02e+00

Pos logit contributions
 worry+0.344
 harm+0.311
 intr+0.305
 matter+0.294
 disl+0.275
Neg logit contributions
amed-0.395
hes-0.358
agged-0.328
apped-0.324
 Then-0.316
Token place
Token ID1295
Feature activation+4.30e+00

Pos logit contributions
 worry+0.053
 harm+0.050
 matter+0.048
 intr+0.045
 client+0.045
Neg logit contributions
amed-0.078
hes-0.069
bled-0.067
agged-0.067
lic-0.064
Token should
Token ID815
Feature activation+6.33e+00

Pos logit contributions
 worry+0.108
 matter+0.106
 harm+0.103
 client+0.098
 reason+0.095
Neg logit contributions
amed-0.164
hes-0.142
bled-0.141
agged-0.141
lic-0.139
Token I
Token ID314
Feature activation+5.20e+00

Pos logit contributions
 worry+0.143
 matter+0.142
 harm+0.137
 perspective+0.136
 conflicts+0.136
Neg logit contributions
amed-0.221
bled-0.210
hes-0.199
agged-0.195
He-0.185
Token go
Token ID467
Feature activation-2.98e-01

Pos logit contributions
 harm+0.118
 matter+0.117
 worry+0.114
 angel+0.114
 conflicts+0.110
Neg logit contributions
amed-0.194
lic-0.165
hes-0.165
agged-0.163
bled-0.160
Token mom
Token ID1995
Feature activation+1.27e+00

Pos logit contributions
amed+0.026
agged+0.022
bled+0.022
lic+0.021
apped+0.021
Neg logit contributions
 matter-0.018
 harm-0.018
 worry-0.017
 happen-0.016
 risk-0.016
Token asked
Token ID1965
Feature activation+8.70e-01

Pos logit contributions
 worry+0.033
 matter+0.032
 harm+0.031
 test+0.027
 reason+0.027
Neg logit contributions
amed-0.050
agged-0.045
bled-0.043
lic-0.042
hes-0.041
Token him
Token ID683
Feature activation-1.10e+00

Pos logit contributions
 matter+0.025
 worry+0.024
 harm+0.024
 test+0.022
 client+0.021
Neg logit contributions
amed-0.031
agged-0.027
bled-0.027
lic-0.025
hes-0.025
Token,
Token ID11
Feature activation+1.52e+00

Pos logit contributions
amed+0.038
bled+0.033
lic+0.033
hes+0.032
agged+0.031
Neg logit contributions
 matter-0.034
 harm-0.031
 worry-0.030
 risk-0.028
 reason-0.027
Token "
Token ID366
Feature activation+1.25e+00

Pos logit contributions
 worry+0.055
 matter+0.054
 harm+0.051
 test+0.048
 reason+0.047
Neg logit contributions
amed-0.046
agged-0.039
bled-0.037
lic-0.035
hes-0.034
TokenWhich
Token ID13828
Feature activation+1.30e+01

Pos logit contributions
 matter+0.037
 worry+0.037
 harm+0.035
 reason+0.032
 test+0.031
Neg logit contributions
amed-0.046
agged-0.041
bled-0.039
hes-0.037
lic-0.037
Token animal
Token ID5044
Feature activation+7.41e+00

Pos logit contributions
 worry+0.339
 matter+0.338
 harm+0.309
 supervision+0.300
 exam+0.298
Neg logit contributions
amed-0.368
hes-0.349
agged-0.308
lic-0.303
aked-0.303
Token do
Token ID466
Feature activation+5.80e+00

Pos logit contributions
 matter+0.213
 worry+0.203
 avail+0.194
 perspective+0.192
 harm+0.186
Neg logit contributions
amed-0.244
hes-0.220
agged-0.208
lic-0.204
bled-0.203
Token you
Token ID345
Feature activation+7.06e+00

Pos logit contributions
 matter+0.103
 worry+0.102
 perspective+0.100
 photo+0.094
 appearance+0.093
Neg logit contributions
amed-0.230
hes-0.221
agged-0.218
bled-0.214
lic-0.206
Token want
Token ID765
Feature activation+6.14e+00

Pos logit contributions
 matter+0.168
 client+0.165
 angel+0.162
 worry+0.160
 harm+0.159
Neg logit contributions
amed-0.260
agged-0.223
lic-0.218
hes-0.213
acking-0.211
Token to
Token ID284
Feature activation+6.68e+00

Pos logit contributions
 worry+0.159
 matter+0.155
 harm+0.142
 client+0.135
 appearance+0.128
Neg logit contributions
amed-0.237
agged-0.208
bled-0.199
hes-0.199
lic-0.197
TokenThe
Token ID464
Feature activation-1.07e+00

Pos logit contributions
amed+0.033
agged+0.029
bled+0.029
apped+0.026
hes+0.025
Neg logit contributions
 worry-0.030
 matter-0.029
 harm-0.029
 reason-0.027
 test-0.026
Token seed
Token ID9403
Feature activation-1.52e+00

Pos logit contributions
amed+0.042
bled+0.036
agged+0.035
hes+0.034
aming+0.032
Neg logit contributions
 matter-0.026
 worry-0.025
 harm-0.025
 client-0.024
 appearance-0.023
Token asked
Token ID1965
Feature activation+8.23e-01

Pos logit contributions
amed+0.052
agged+0.043
bled+0.042
hes+0.041
lic+0.041
Neg logit contributions
 matter-0.047
 worry-0.047
 harm-0.045
 client-0.043
 test-0.042
Token,
Token ID11
Feature activation+1.22e+00

Pos logit contributions
 matter+0.022
 worry+0.021
 harm+0.021
 test+0.019
 client+0.019
Neg logit contributions
amed-0.032
agged-0.028
bled-0.027
hes-0.026
lic-0.026
Token "
Token ID366
Feature activation+1.18e+00

Pos logit contributions
 worry+0.039
 matter+0.037
 harm+0.034
 test+0.033
 angel+0.033
Neg logit contributions
amed-0.043
agged-0.038
bled-0.035
acked-0.035
acking-0.033
TokenWhich
Token ID13828
Feature activation+1.28e+01

Pos logit contributions
 worry+0.038
 matter+0.038
 harm+0.037
 client+0.034
 test+0.034
Neg logit contributions
amed-0.038
agged-0.031
bled-0.031
lic-0.030
hes-0.029
Token way
Token ID835
Feature activation+7.49e+00

Pos logit contributions
 matter+0.326
 client+0.309
 worry+0.307
 appearance+0.292
 supervision+0.292
Neg logit contributions
amed-0.381
hes-0.358
apped-0.320
agged-0.317
bled-0.312
Token should
Token ID815
Feature activation+7.28e+00

Pos logit contributions
 matter+0.188
 client+0.181
 harm+0.177
 conflicts+0.171
 problems+0.168
Neg logit contributions
amed-0.253
bled-0.232
apped-0.208
hes-0.206
lic-0.201
Token I
Token ID314
Feature activation+5.96e+00

Pos logit contributions
 matter+0.179
 conflicts+0.168
 problems+0.160
 perspective+0.159
 client+0.152
Neg logit contributions
bled-0.235
amed-0.235
hes-0.221
agged-0.214
apped-0.204
Token grow
Token ID1663
Feature activation-2.95e-01

Pos logit contributions
 matter+0.154
 conflicts+0.152
 client+0.144
 harm+0.142
 angel+0.141
Neg logit contributions
amed-0.193
lic-0.167
hes-0.165
tted-0.161
agged-0.157
Token at
Token ID379
Feature activation+4.08e+00

Pos logit contributions
amed+0.013
agged+0.011
bled+0.011
hes+0.011
lic+0.011
Neg logit contributions
 matter-0.007
 worry-0.007
 harm-0.006
 client-0.005
 reason-0.005
Token He
Token ID679
Feature activation-2.04e+00

Pos logit contributions
amed+0.063
bled+0.049
agged+0.047
lic+0.043
aming+0.042
Neg logit contributions
 worry-0.095
 matter-0.093
 harm-0.090
 client-0.086
 anyone-0.086
Token asked
Token ID1965
Feature activation+3.34e-01

Pos logit contributions
amed+0.086
agged+0.078
bled+0.075
lic+0.072
hes+0.071
Neg logit contributions
 matter-0.049
 worry-0.047
 harm-0.043
 test-0.039
 reason-0.039
Token Anna
Token ID11735
Feature activation-1.64e-01

Pos logit contributions
 matter+0.011
 worry+0.011
 harm+0.010
 risk+0.010
 test+0.009
Neg logit contributions
amed-0.011
agged-0.009
bled-0.009
hes-0.009
lic-0.008
Token,
Token ID11
Feature activation+7.03e-01

Pos logit contributions
amed+0.005
hes+0.004
bled+0.004
apped+0.004
lic+0.004
Neg logit contributions
 matter-0.006
 harm-0.005
 worry-0.005
 risk-0.005
 avail-0.005
Token "
Token ID366
Feature activation+4.56e-01

Pos logit contributions
 worry+0.026
 matter+0.026
 harm+0.025
 test+0.023
 reason+0.023
Neg logit contributions
amed-0.020
agged-0.017
bled-0.016
lic-0.015
hes-0.015
TokenWhich
Token ID13828
Feature activation+1.24e+01

Pos logit contributions
 worry+0.015
 matter+0.015
 harm+0.015
 test+0.013
 risk+0.013
Neg logit contributions
amed-0.015
agged-0.012
bled-0.012
hes-0.011
lic-0.011
Token one
Token ID530
Feature activation+7.66e+00

Pos logit contributions
 worry+0.352
 matter+0.328
 intr+0.326
 harm+0.326
 appearance+0.296
Neg logit contributions
amed-0.358
hes-0.336
agged-0.314
apped-0.302
bled-0.291
Token is
Token ID318
Feature activation+3.56e+00

Pos logit contributions
 client+0.202
 perspective+0.200
 avail+0.199
 appearance+0.194
 appearances+0.189
Neg logit contributions
amed-0.222
hes-0.217
lic-0.195
agged-0.192
acked-0.191
Token your
Token ID534
Feature activation+2.69e+00

Pos logit contributions
 worry+0.096
 matter+0.089
 harm+0.086
 risk+0.080
 reason+0.079
Neg logit contributions
amed-0.132
hes-0.121
agged-0.116
bled-0.116
lic-0.109
Token favorite
Token ID4004
Feature activation+8.18e+00

Pos logit contributions
 matter+0.070
 worry+0.070
 harm+0.068
 intr+0.063
 happen+0.059
Neg logit contributions
amed-0.075
 Then-0.065
apped-0.063
agged-0.062
bled-0.062
Token?"
Token ID1701
Feature activation-1.79e+00

Pos logit contributions
 worry+0.214
 harm+0.198
 matter+0.193
 intr+0.191
 client+0.175
Neg logit contributions
amed-0.276
acked-0.237
bled-0.235
agged-0.234
hes-0.234
Token go
Token ID467
Feature activation-1.11e+00

Pos logit contributions
 matter+0.141
 worry+0.139
 harm+0.132
 client+0.126
 appearance+0.123
Neg logit contributions
amed-0.198
bled-0.174
agged-0.171
hes-0.165
lic-0.155
Token.
Token ID13
Feature activation-2.34e+00

Pos logit contributions
amed+0.037
agged+0.031
bled+0.030
acking+0.029
lic+0.029
Neg logit contributions
 matter-0.034
 worry-0.034
 client-0.031
 harm-0.031
 angel-0.030
Token
Token ID198
Feature activation-1.48e+00

Pos logit contributions
amed+0.067
agged+0.054
bled+0.054
lic+0.050
hes+0.048
Neg logit contributions
 worry-0.088
 matter-0.086
 harm-0.083
 test-0.077
 reason-0.077
Token
Token ID198
Feature activation-9.55e-01

Pos logit contributions
amed+0.054
agged+0.048
bled+0.046
lic+0.042
acked+0.041
Neg logit contributions
 worry-0.042
 harm-0.041
 matter-0.039
 reason-0.037
 test-0.036
Token"
Token ID1
Feature activation+8.93e-01

Pos logit contributions
amed+0.036
agged+0.036
bled+0.034
apped+0.031
lic+0.030
Neg logit contributions
 worry-0.022
 else-0.020
 you-0.020
 matter-0.020
 harm-0.020
TokenWhich
Token ID13828
Feature activation+1.24e+01

Pos logit contributions
 matter+0.030
 worry+0.029
 harm+0.028
 reason+0.026
 test+0.025
Neg logit contributions
amed-0.029
agged-0.025
bled-0.024
lic-0.023
acking-0.023
Token way
Token ID835
Feature activation+6.82e+00

Pos logit contributions
 worry+0.326
 matter+0.313
 appearance+0.301
 exam+0.300
 intr+0.299
Neg logit contributions
amed-0.351
hes-0.313
agged-0.296
apped-0.290
bled-0.263
Token,
Token ID11
Feature activation+4.79e+00

Pos logit contributions
 matter+0.180
 harm+0.174
 client+0.169
 worry+0.157
 conflicts+0.157
Neg logit contributions
amed-0.230
bled-0.210
apped-0.189
lic-0.179
agged-0.176
Token Tom
Token ID4186
Feature activation+2.89e+00

Pos logit contributions
 harm+0.142
 worry+0.131
 matter+0.129
 client+0.118
 encounter+0.118
Neg logit contributions
amed-0.159
bled-0.143
hes-0.141
lic-0.136
agged-0.135
Token?"
Token ID1701
Feature activation-3.52e-01

Pos logit contributions
 matter+0.067
 worry+0.066
 harm+0.061
 reason+0.056
 client+0.055
Neg logit contributions
amed-0.116
hes-0.101
bled-0.101
agged-0.100
lic-0.099
Token Lily
Token ID20037
Feature activation-3.06e+00

Pos logit contributions
amed+0.012
agged+0.010
bled+0.010
hes+0.010
lic+0.010
Neg logit contributions
 worry-0.011
 matter-0.011
 harm-0.011
 test-0.010
 risk-0.010
Token man
Token ID582
Feature activation+6.17e-01

Pos logit contributions
 matter+0.081
 harm+0.077
 worry+0.075
 risk+0.071
 happen+0.069
Neg logit contributions
amed-0.070
lic-0.057
bled-0.056
apped-0.055
agged-0.055
Token for
Token ID329
Feature activation+3.30e+00

Pos logit contributions
 matter+0.018
 harm+0.017
 worry+0.017
 risk+0.015
 reason+0.015
Neg logit contributions
amed-0.022
hes-0.018
agged-0.018
apped-0.018
lic-0.018
Token help
Token ID1037
Feature activation+4.23e+00

Pos logit contributions
 matter+0.098
 worry+0.096
 harm+0.096
 appearance+0.089
 client+0.088
Neg logit contributions
amed-0.112
bled-0.097
agged-0.097
hes-0.094
lic-0.088
Token.
Token ID13
Feature activation-2.02e+00

Pos logit contributions
 matter+0.140
 worry+0.131
 harm+0.128
 comet+0.128
 angel+0.127
Neg logit contributions
amed-0.122
hes-0.102
agged-0.101
bled-0.100
acking-0.097
Token "
Token ID366
Feature activation+2.22e-01

Pos logit contributions
amed+0.058
agged+0.048
bled+0.047
lic+0.044
hes+0.043
Neg logit contributions
 worry-0.075
 matter-0.074
 harm-0.072
 test-0.068
 client-0.066
TokenWhich
Token ID13828
Feature activation+1.24e+01

Pos logit contributions
 worry+0.007
 matter+0.007
 harm+0.007
 test+0.006
 reason+0.006
Neg logit contributions
amed-0.008
agged-0.006
bled-0.006
lic-0.006
hes-0.006
Token cake
Token ID12187
Feature activation+3.94e+00

Pos logit contributions
 harm+0.351
 worry+0.346
 intr+0.335
 appearance+0.326
 matter+0.326
Neg logit contributions
hes-0.363
amed-0.355
agged-0.313
apped-0.281
bled-0.278
Token is
Token ID318
Feature activation+2.17e+00

Pos logit contributions
 harm+0.097
 worry+0.095
 matter+0.093
 client+0.087
 perspective+0.084
Neg logit contributions
amed-0.154
hes-0.138
bled-0.134
lic-0.133
agged-0.132
Token good
Token ID922
Feature activation+5.40e+00

Pos logit contributions
 worry+0.069
 harm+0.065
 matter+0.065
 client+0.061
 appearance+0.060
Neg logit contributions
amed-0.067
hes-0.062
bled-0.057
agged-0.055
lic-0.054
Token for
Token ID329
Feature activation+6.23e+00

Pos logit contributions
 worry+0.158
 matter+0.157
 harm+0.156
 client+0.142
 appearance+0.140
Neg logit contributions
amed-0.188
agged-0.158
bled-0.158
lic-0.154
hes-0.151
Token my
Token ID616
Feature activation+3.91e+00

Pos logit contributions
 worry+0.133
 matter+0.133
 harm+0.122
 appearance+0.112
 risk+0.111
Neg logit contributions
amed-0.261
agged-0.237
hes-0.234
bled-0.232
 Later-0.215
Token asked
Token ID1965
Feature activation+2.81e-01

Pos logit contributions
amed+0.091
agged+0.079
bled+0.077
lic+0.075
hes+0.074
Neg logit contributions
 worry-0.065
 matter-0.065
 harm-0.062
 test-0.056
 reason-0.055
Token his
Token ID465
Feature activation-2.19e-01

Pos logit contributions
 worry+0.009
 matter+0.009
 harm+0.008
 test+0.008
 risk+0.008
Neg logit contributions
amed-0.010
agged-0.008
bled-0.008
lic-0.008
hes-0.008
Token mom
Token ID1995
Feature activation+1.93e+00

Pos logit contributions
amed+0.007
bled+0.005
lic+0.005
apped+0.005
hes+0.005
Neg logit contributions
 matter-0.007
 harm-0.007
 risk-0.007
 happen-0.006
 worry-0.006
Token,
Token ID11
Feature activation+8.56e-01

Pos logit contributions
 matter+0.053
 harm+0.053
 worry+0.051
 risk+0.047
 reason+0.045
Neg logit contributions
amed-0.070
bled-0.060
lic-0.060
hes-0.059
agged-0.058
Token "
Token ID366
Feature activation+3.42e-01

Pos logit contributions
 worry+0.032
 matter+0.031
 harm+0.030
 test+0.028
 reason+0.028
Neg logit contributions
amed-0.025
agged-0.021
bled-0.020
lic-0.019
hes-0.018
TokenWhich
Token ID13828
Feature activation+1.24e+01

Pos logit contributions
 matter+0.011
 worry+0.011
 harm+0.010
 test+0.010
 reason+0.009
Neg logit contributions
amed-0.012
agged-0.010
bled-0.010
lic-0.009
hes-0.009
Token one
Token ID530
Feature activation+7.36e+00

Pos logit contributions
 worry+0.344
 harm+0.342
 matter+0.335
 client+0.319
 intr+0.315
Neg logit contributions
amed-0.366
hes-0.349
agged-0.324
bled-0.299
He-0.297
Token do
Token ID466
Feature activation+6.10e+00

Pos logit contributions
 client+0.194
 perspective+0.185
 appearance+0.185
 appearances+0.181
 avail+0.180
Neg logit contributions
amed-0.222
hes-0.221
lic-0.200
acked-0.191
agged-0.189
Token I
Token ID314
Feature activation+5.75e+00

Pos logit contributions
 matter+0.112
 worry+0.110
 appearance+0.108
 harm+0.106
 perspective+0.102
Neg logit contributions
amed-0.248
hes-0.245
agged-0.233
lic-0.226
bled-0.225
Token get
Token ID651
Feature activation-8.63e-01

Pos logit contributions
 angel+0.149
 matter+0.141
 harm+0.138
 client+0.138
 appearance+0.137
Neg logit contributions
amed-0.195
hes-0.174
lic-0.166
acking-0.161
agged-0.159
Token?"
Token ID1701
Feature activation-1.27e+00

Pos logit contributions
amed+0.033
agged+0.029
bled+0.028
hes+0.028
lic+0.027
Neg logit contributions
 worry-0.023
 matter-0.023
 harm-0.022
 client-0.020
 appearance-0.020
Token Emily
Token ID17608
Feature activation-1.25e-02

Pos logit contributions
amed+0.085
agged+0.073
bled+0.071
lic+0.068
hes+0.067
Neg logit contributions
 worry-0.074
 matter-0.073
 harm-0.070
 test-0.064
 client-0.063
Token asked
Token ID1965
Feature activation+4.24e-01

Pos logit contributions
amed+0.000
agged+0.000
bled+0.000
lic+0.000
hes+0.000
Neg logit contributions
 worry-0.000
 matter-0.000
 harm-0.000
 test-0.000
 reason-0.000
Token Max
Token ID5436
Feature activation+1.20e+00

Pos logit contributions
 matter+0.014
 worry+0.014
 harm+0.014
 risk+0.013
 test+0.012
Neg logit contributions
amed-0.013
bled-0.011
agged-0.011
hes-0.010
lic-0.010
Token,
Token ID11
Feature activation+7.49e-01

Pos logit contributions
 matter+0.038
 harm+0.035
 worry+0.033
 risk+0.032
 reason+0.031
Neg logit contributions
amed-0.040
lic-0.034
bled-0.033
hes-0.032
 Later-0.031
Token "
Token ID366
Feature activation+3.84e-01

Pos logit contributions
 worry+0.026
 matter+0.025
 harm+0.024
 test+0.023
 reason+0.023
Neg logit contributions
amed-0.023
agged-0.020
bled-0.019
lic-0.018
hes-0.018
TokenWhich
Token ID13828
Feature activation+1.24e+01

Pos logit contributions
 matter+0.011
 worry+0.011
 harm+0.010
 test+0.010
 reason+0.010
Neg logit contributions
amed-0.014
agged-0.013
bled-0.012
lic-0.011
hes-0.011
Token leash
Token ID40529
Feature activation+3.27e+00

Pos logit contributions
 worry+0.329
 matter+0.311
 harm+0.295
 intr+0.289
 angel+0.279
Neg logit contributions
amed-0.407
hes-0.367
agged-0.366
bled-0.334
acked-0.327
Token do
Token ID466
Feature activation+5.31e+00

Pos logit contributions
 matter+0.073
 worry+0.071
 harm+0.071
 client+0.062
 appearance+0.061
Neg logit contributions
amed-0.136
agged-0.118
bled-0.117
lic-0.116
hes-0.115
Token you
Token ID345
Feature activation+6.30e+00

Pos logit contributions
 worry+0.099
 perspective+0.096
 matter+0.091
 appearance+0.091
 harm+0.088
Neg logit contributions
amed-0.210
hes-0.201
agged-0.197
bled-0.195
lic-0.181
Token want
Token ID765
Feature activation+5.53e+00

Pos logit contributions
 angel+0.154
 harm+0.154
 client+0.151
 matter+0.150
 worry+0.148
Neg logit contributions
amed-0.232
agged-0.201
lic-0.191
hes-0.189
acking-0.185
Token?"
Token ID1701
Feature activation-1.24e+00

Pos logit contributions
 worry+0.131
 matter+0.126
 harm+0.119
 client+0.106
 angel+0.104
Neg logit contributions
amed-0.229
agged-0.203
bled-0.197
hes-0.193
lic-0.191
Token colors
Token ID7577
Feature activation+9.69e-01

Pos logit contributions
amed+0.003
agged+0.003
lic+0.003
hes+0.003
bled+0.003
Neg logit contributions
 harm-0.003
 worry-0.003
 client-0.003
 matter-0.003
 intr-0.003
Token.
Token ID13
Feature activation-1.51e+00

Pos logit contributions
 worry+0.027
 matter+0.027
 harm+0.026
 test+0.023
 client+0.023
Neg logit contributions
amed-0.036
agged-0.032
bled-0.031
lic-0.030
hes-0.030
Token
Token ID198
Feature activation-1.11e+00

Pos logit contributions
amed+0.050
bled+0.043
agged+0.043
lic+0.039
hes+0.039
Neg logit contributions
 worry-0.047
 matter-0.047
 test-0.043
 harm-0.043
 exam-0.042
Token
Token ID198
Feature activation-4.98e-01

Pos logit contributions
amed+0.035
agged+0.030
bled+0.029
lic+0.027
hes+0.027
Neg logit contributions
 worry-0.037
 matter-0.037
 harm-0.036
 reason-0.033
 test-0.033
Token"
Token ID1
Feature activation+1.55e+00

Pos logit contributions
amed+0.014
agged+0.013
bled+0.012
hes+0.011
lic+0.011
Neg logit contributions
 matter-0.018
 worry-0.018
 test-0.016
 reason-0.016
 harm-0.016
TokenWhich
Token ID13828
Feature activation+1.23e+01

Pos logit contributions
 matter+0.051
 worry+0.048
 reason+0.043
 test+0.043
 harm+0.043
Neg logit contributions
amed-0.050
agged-0.047
bled-0.044
hes-0.041
lic-0.041
Token dress
Token ID6576
Feature activation+1.39e+00

Pos logit contributions
 worry+0.354
 harm+0.343
 intr+0.330
 matter+0.316
 client+0.303
Neg logit contributions
amed-0.408
hes-0.343
agged-0.334
apped-0.316
bled-0.313
Token do
Token ID466
Feature activation+4.83e+00

Pos logit contributions
 harm+0.036
 worry+0.035
 matter+0.035
 client+0.033
 intr+0.032
Neg logit contributions
amed-0.053
lic-0.045
bled-0.045
hes-0.045
agged-0.044
Token you
Token ID345
Feature activation+6.19e+00

Pos logit contributions
 worry+0.088
 harm+0.086
 appearance+0.084
 perspective+0.083
 matter+0.081
Neg logit contributions
amed-0.201
hes-0.191
lic-0.179
agged-0.178
bled-0.178
Token like
Token ID588
Feature activation+4.58e+00

Pos logit contributions
 harm+0.170
 client+0.170
 worry+0.165
 angel+0.163
 matter+0.162
Neg logit contributions
amed-0.210
agged-0.174
hes-0.168
lic-0.167
acking-0.162
Token,
Token ID11
Feature activation+4.36e+00

Pos logit contributions
 worry+0.115
 harm+0.107
 matter+0.105
 client+0.099
 intr+0.097
Neg logit contributions
amed-0.176
lic-0.151
hes-0.150
agged-0.148
bled-0.144
Token
Token ID198
Feature activation-1.91e-01

Pos logit contributions
amed+0.023
agged+0.020
bled+0.020
hes+0.018
lic+0.017
Neg logit contributions
 worry-0.024
 matter-0.023
 harm-0.022
 reason-0.021
 test-0.021
Tokenâ
Token ID22940
Feature activation-2.43e-01

Pos logit contributions
amed+0.006
agged+0.006
bled+0.005
hes+0.005
apped+0.004
Neg logit contributions
 worry-0.007
 matter-0.006
 reason-0.006
 harm-0.006
 test-0.006
Token
Token ID26391
Feature activation+9.55e-01

Pos logit contributions
amed+0.010
bled+0.009
agged+0.009
lic+0.009
hes+0.009
Neg logit contributions
 matter-0.006
 worry-0.005
 harm-0.005
 test-0.005
 exam-0.005
Token
Token ID129
Feature activation-9.13e-01

Pos logit contributions
 worry+0.029
 matter+0.028
 harm+0.027
 test+0.025
 reason+0.024
Neg logit contributions
amed-0.034
agged-0.029
bled-0.029
hes-0.027
lic-0.027
Token
Token ID241
Feature activation+2.10e+00

Pos logit contributions
amed+0.031
agged+0.027
bled+0.026
lic+0.026
hes+0.025
Neg logit contributions
 worry-0.028
 matter-0.028
 harm-0.027
 test-0.026
 exam-0.024
TokenWhich
Token ID13828
Feature activation+1.23e+01

Pos logit contributions
 matter+0.068
 worry+0.068
 harm+0.063
 reason+0.060
 test+0.058
Neg logit contributions
amed-0.071
agged-0.063
bled-0.062
lic-0.058
hes-0.055
Token way
Token ID835
Feature activation+7.20e+00

Pos logit contributions
 worry+0.292
 appearance+0.287
 harm+0.287
 matter+0.284
 intr+0.280
Neg logit contributions
amed-0.396
hes-0.363
agged-0.353
aked-0.324
He-0.322
Token should
Token ID815
Feature activation+6.92e+00

Pos logit contributions
 matter+0.164
 harm+0.164
 worry+0.151
 client+0.146
 reason+0.141
Neg logit contributions
amed-0.277
bled-0.248
agged-0.231
hes-0.226
apped-0.225
Token I
Token ID314
Feature activation+5.65e+00

Pos logit contributions
 matter+0.168
 conflicts+0.157
 harm+0.150
 perspective+0.150
 worry+0.148
Neg logit contributions
amed-0.224
bled-0.214
agged-0.205
hes-0.203
He-0.199
Token go
Token ID467
Feature activation-1.16e-01

Pos logit contributions
 matter+0.155
 harm+0.149
 conflicts+0.148
 angel+0.144
 appearance+0.142
Neg logit contributions
amed-0.184
agged-0.157
lic-0.155
tted-0.153
He-0.152
Token?
Token ID30
Feature activation+2.02e-01

Pos logit contributions
amed+0.005
agged+0.005
bled+0.004
lic+0.004
hes+0.004
Neg logit contributions
 worry-0.003
 matter-0.003
 harm-0.002
 test-0.002
 reason-0.002
Token asked
Token ID1965
Feature activation+1.44e+00

Pos logit contributions
 worry+0.071
 matter+0.071
 harm+0.068
 test+0.062
 reason+0.061
Neg logit contributions
amed-0.079
agged-0.067
bled-0.065
lic-0.063
hes-0.063
Token her
Token ID607
Feature activation+1.25e+00

Pos logit contributions
 matter+0.036
 worry+0.036
 harm+0.035
 test+0.030
 risk+0.030
Neg logit contributions
amed-0.059
agged-0.051
bled-0.050
lic-0.048
hes-0.048
Token mum
Token ID25682
Feature activation+2.27e+00

Pos logit contributions
 matter+0.043
 harm+0.041
 risk+0.038
 appearance+0.038
 worry+0.037
Neg logit contributions
amed-0.036
lic-0.031
hes-0.029
bled-0.029
apped-0.028
Token,
Token ID11
Feature activation+1.81e+00

Pos logit contributions
 matter+0.062
 harm+0.060
 worry+0.059
 risk+0.052
 reason+0.052
Neg logit contributions
amed-0.083
bled-0.072
hes-0.070
lic-0.070
agged-0.068
Token "
Token ID366
Feature activation+1.28e+00

Pos logit contributions
 matter+0.074
 harm+0.072
 worry+0.072
 appearance+0.066
 problems+0.066
Neg logit contributions
amed-0.045
bled-0.035
agged-0.035
lic-0.034
hes-0.032
TokenWhere
Token ID8496
Feature activation+1.22e+01

Pos logit contributions
 matter+0.041
 worry+0.041
 harm+0.038
 reason+0.036
 test+0.035
Neg logit contributions
amed-0.043
agged-0.038
bled-0.036
hes-0.035
lic-0.034
Token are
Token ID389
Feature activation+5.89e+00

Pos logit contributions
 harm+0.365
 matter+0.339
 appearance+0.316
 reason+0.305
 perspective+0.304
Neg logit contributions
amed-0.405
hes-0.379
lic-0.359
agged-0.343
apped-0.340
Token we
Token ID356
Feature activation+6.78e+00

Pos logit contributions
 harm+0.117
 matter+0.113
 worry+0.101
 reason+0.091
 appearance+0.091
Neg logit contributions
amed-0.248
bled-0.232
agged-0.224
hes-0.221
lic-0.220
Token going
Token ID1016
Feature activation+2.18e+00

Pos logit contributions
 harm+0.208
 matter+0.203
 worry+0.186
 conflicts+0.182
 problems+0.181
Neg logit contributions
amed-0.228
bled-0.195
agged-0.185
aming-0.182
lic-0.179
Token today
Token ID1909
Feature activation+5.82e+00

Pos logit contributions
 matter+0.049
 worry+0.049
 harm+0.046
 test+0.040
 reason+0.039
Neg logit contributions
amed-0.094
agged-0.083
bled-0.081
lic-0.079
hes-0.078
Token?"
Token ID1701
Feature activation-3.24e-01

Pos logit contributions
 matter+0.107
 worry+0.106
 harm+0.103
 client+0.085
 problems+0.082
Neg logit contributions
amed-0.268
agged-0.240
bled-0.235
lic-0.232
hes-0.231
Token story
Token ID1621
Feature activation+2.65e+00

Pos logit contributions
 harm+0.004
 worry+0.004
 matter+0.004
 conflicts+0.003
 appearance+0.003
Neg logit contributions
lic-0.005
amed-0.005
hes-0.005
agged-0.004
acking-0.004
Token.
Token ID13
Feature activation-1.82e+00

Pos logit contributions
 worry+0.099
 matter+0.095
 harm+0.094
 risk+0.088
 appearance+0.087
Neg logit contributions
amed-0.077
lic-0.064
hes-0.064
agged-0.063
bled-0.059
Token
Token ID198
Feature activation-1.37e+00

Pos logit contributions
amed+0.054
agged+0.045
bled+0.043
lic+0.041
hes+0.041
Neg logit contributions
 worry-0.066
 matter-0.066
 harm-0.063
 test-0.058
 reason-0.057
Token
Token ID198
Feature activation-8.40e-01

Pos logit contributions
amed+0.040
agged+0.034
bled+0.033
lic+0.031
hes+0.030
Neg logit contributions
 worry-0.049
 matter-0.049
 harm-0.047
 test-0.044
 reason-0.044
Token"
Token ID1
Feature activation+1.29e+00

Pos logit contributions
amed+0.024
agged+0.020
bled+0.019
lic+0.019
hes+0.018
Neg logit contributions
 worry-0.031
 matter-0.031
 harm-0.030
 test-0.028
 reason-0.027
TokenWhich
Token ID13828
Feature activation+1.22e+01

Pos logit contributions
 matter+0.042
 worry+0.042
 harm+0.040
 test+0.037
 reason+0.037
Neg logit contributions
amed-0.043
agged-0.038
bled-0.036
lic-0.034
hes-0.034
Token story
Token ID1621
Feature activation+5.70e+00

Pos logit contributions
 worry+0.352
 harm+0.324
 intr+0.313
 disl+0.300
 appearance+0.293
Neg logit contributions
hes-0.333
amed-0.330
 Then-0.272
agged-0.269
apped-0.267
Token do
Token ID466
Feature activation+5.37e+00

Pos logit contributions
 worry+0.165
 harm+0.154
 matter+0.150
 appearance+0.140
 intr+0.137
Neg logit contributions
amed-0.206
hes-0.184
lic-0.180
agged-0.178
bled-0.167
Token you
Token ID345
Feature activation+6.31e+00

Pos logit contributions
 perspective+0.112
 worry+0.112
 harm+0.104
 appearance+0.101
 matter+0.101
Neg logit contributions
hes-0.198
amed-0.194
agged-0.184
lic-0.179
They-0.179
Token want
Token ID765
Feature activation+5.57e+00

Pos logit contributions
 harm+0.158
 worry+0.154
 matter+0.151
 conflicts+0.150
 client+0.150
Neg logit contributions
amed-0.217
hes-0.184
agged-0.181
lic-0.177
apped-0.166
Token,
Token ID11
Feature activation+4.79e+00

Pos logit contributions
 worry+0.172
 matter+0.158
 harm+0.152
 client+0.146
 intr+0.145
Neg logit contributions
amed-0.182
hes-0.157
agged-0.152
lic-0.149
ocked-0.138
Token asked
Token ID1965
Feature activation+7.01e-01

Pos logit contributions
amed+0.058
agged+0.052
bled+0.050
lic+0.048
hes+0.048
Neg logit contributions
 worry-0.044
 matter-0.043
 harm-0.041
 test-0.037
 reason-0.036
Token her
Token ID607
Feature activation+6.26e-01

Pos logit contributions
 matter+0.020
 harm+0.020
 worry+0.020
 risk+0.018
 test+0.017
Neg logit contributions
amed-0.026
bled-0.022
agged-0.022
apped-0.021
lic-0.020
Token mom
Token ID1995
Feature activation+2.56e+00

Pos logit contributions
 matter+0.022
 harm+0.021
 risk+0.020
 worry+0.019
 appearance+0.018
Neg logit contributions
amed-0.018
bled-0.015
apped-0.015
lic-0.015
 Then-0.014
Token,
Token ID11
Feature activation+1.49e+00

Pos logit contributions
 matter+0.074
 harm+0.072
 worry+0.070
 risk+0.064
 reason+0.063
Neg logit contributions
amed-0.089
bled-0.076
lic-0.074
agged-0.072
hes-0.072
Token "
Token ID366
Feature activation+1.14e+00

Pos logit contributions
 matter+0.063
 harm+0.062
 worry+0.061
 appearance+0.056
 problems+0.056
Neg logit contributions
amed-0.034
bled-0.027
agged-0.026
lic-0.025
hes-0.024
TokenWhere
Token ID8496
Feature activation+1.22e+01

Pos logit contributions
 matter+0.036
 worry+0.036
 harm+0.034
 test+0.032
 reason+0.031
Neg logit contributions
amed-0.039
agged-0.034
bled-0.033
lic-0.031
hes-0.031
Token do
Token ID466
Feature activation+7.21e+00

Pos logit contributions
 harm+0.378
 matter+0.355
 appearance+0.332
 perspective+0.328
 worry+0.321
Neg logit contributions
amed-0.387
hes-0.362
lic-0.342
apped-0.326
agged-0.318
Token we
Token ID356
Feature activation+6.32e+00

Pos logit contributions
 perspective+0.156
 matter+0.154
 worry+0.148
 harm+0.145
 appearance+0.143
Neg logit contributions
amed-0.242
hes-0.241
lic-0.236
bled-0.225
agged-0.223
Token live
Token ID2107
Feature activation-6.44e-01

Pos logit contributions
 harm+0.184
 conflicts+0.182
 matter+0.179
 problems+0.177
 client+0.177
Neg logit contributions
amed-0.176
bled-0.141
hes-0.140
lic-0.137
aming-0.135
Token?"
Token ID1701
Feature activation-4.67e-01

Pos logit contributions
amed+0.027
agged+0.023
bled+0.023
lic+0.022
hes+0.022
Neg logit contributions
 matter-0.016
 worry-0.015
 harm-0.015
 client-0.013
 reason-0.013
Token
Token ID220
Feature activation-8.19e-01

Pos logit contributions
amed+0.015
agged+0.013
bled+0.012
lic+0.012
hes+0.012
Neg logit contributions
 worry-0.016
 matter-0.015
 harm-0.015
 test-0.014
 reason-0.013
Token asked
Token ID1965
Feature activation+7.79e-01

Pos logit contributions
amed+0.046
agged+0.041
bled+0.040
lic+0.038
hes+0.038
Neg logit contributions
 worry-0.031
 matter-0.031
 harm-0.029
 test-0.026
 reason-0.026
Token her
Token ID607
Feature activation+6.14e-01

Pos logit contributions
 matter+0.022
 harm+0.022
 worry+0.021
 test+0.020
 risk+0.020
Neg logit contributions
amed-0.027
agged-0.024
bled-0.023
 Later-0.022
hes-0.022
Token Dad
Token ID17415
Feature activation+1.98e+00

Pos logit contributions
 matter+0.019
 worry+0.018
 harm+0.018
 risk+0.017
 conflicts+0.016
Neg logit contributions
amed-0.020
hes-0.017
lic-0.017
 Later-0.016
agged-0.016
Token:
Token ID25
Feature activation-1.27e+00

Pos logit contributions
 matter+0.063
 harm+0.058
 worry+0.055
 comet+0.053
 risk+0.052
Neg logit contributions
amed-0.060
bled-0.053
lic-0.053
hes-0.050
 Later-0.049
Token "
Token ID366
Feature activation+1.00e+00

Pos logit contributions
amed+0.036
agged+0.030
bled+0.029
lic+0.027
hes+0.027
Neg logit contributions
 worry-0.047
 matter-0.047
 harm-0.045
 test-0.042
 reason-0.041
TokenWhere
Token ID8496
Feature activation+1.22e+01

Pos logit contributions
 matter+0.034
 worry+0.034
 harm+0.033
 test+0.031
 client+0.030
Neg logit contributions
amed-0.031
agged-0.026
bled-0.025
lic-0.024
hes-0.024
Token is
Token ID318
Feature activation+4.89e+00

Pos logit contributions
 matter+0.372
 harm+0.372
 perspective+0.356
 appearance+0.354
 avail+0.345
Neg logit contributions
amed-0.366
hes-0.348
lic-0.329
apped-0.308
agged-0.305
Token Daisy
Token ID40355
Feature activation+6.54e+00

Pos logit contributions
 perspective+0.146
 matter+0.145
 worry+0.138
 encounter+0.137
 harm+0.136
Neg logit contributions
hes-0.136
amed-0.132
bled-0.128
agged-0.117
ocked-0.110
Token?
Token ID30
Feature activation+1.36e+00

Pos logit contributions
 matter+0.158
 harm+0.145
 worry+0.137
 problems+0.129
 reason+0.127
Neg logit contributions
amed-0.261
hes-0.226
bled-0.224
agged-0.224
lic-0.222
Token I
Token ID314
Feature activation+2.51e+00

Pos logit contributions
 matter+0.047
 harm+0.045
 worry+0.044
 problems+0.041
 reason+0.040
Neg logit contributions
amed-0.043
lic-0.036
agged-0.036
bled-0.034
hes-0.034
Token miss
Token ID2051
Feature activation+2.31e+00

Pos logit contributions
 matter+0.065
 appearance+0.059
 angel+0.058
 problems+0.058
 worry+0.058
Neg logit contributions
amed-0.087
lic-0.077
hes-0.074
agged-0.072
tted-0.072
Token He
Token ID679
Feature activation-1.91e+00

Pos logit contributions
amed+0.052
bled+0.041
agged+0.040
hes+0.039
aming+0.037
Neg logit contributions
 worry-0.069
 matter-0.067
 harm-0.063
 test-0.061
 reason-0.061
Token asks
Token ID7893
Feature activation+8.33e-01

Pos logit contributions
amed+0.074
agged+0.066
bled+0.063
hes+0.061
acked+0.060
Neg logit contributions
 matter-0.049
 worry-0.049
 harm-0.047
 reason-0.041
 risk-0.041
Token Sue
Token ID26089
Feature activation+1.63e+00

Pos logit contributions
 worry+0.024
 matter+0.024
 harm+0.023
 test+0.021
 reason+0.021
Neg logit contributions
amed-0.031
agged-0.027
bled-0.026
lic-0.025
hes-0.025
Token,
Token ID11
Feature activation+8.16e-01

Pos logit contributions
 matter+0.052
 harm+0.048
 worry+0.047
 comet+0.046
 avail+0.045
Neg logit contributions
amed-0.046
 Later-0.042
lic-0.041
hes-0.040
bled-0.040
Token "
Token ID366
Feature activation+8.03e-01

Pos logit contributions
 worry+0.031
 matter+0.031
 harm+0.030
 test+0.028
 reason+0.027
Neg logit contributions
amed-0.022
agged-0.019
bled-0.018
lic-0.017
hes-0.017
TokenWhere
Token ID8496
Feature activation+1.22e+01

Pos logit contributions
 worry+0.027
 matter+0.027
 harm+0.026
 test+0.024
 exam+0.023
Neg logit contributions
amed-0.026
agged-0.022
bled-0.021
lic-0.020
hes-0.020
Token is
Token ID318
Feature activation+4.58e+00

Pos logit contributions
 harm+0.374
 matter+0.373
 perspective+0.363
 appearance+0.349
 worry+0.339
Neg logit contributions
amed-0.360
hes-0.350
lic-0.319
agged-0.317
apped-0.311
Token the
Token ID262
Feature activation+3.01e+00

Pos logit contributions
 worry+0.146
 matter+0.145
 perspective+0.141
 reason+0.139
 harm+0.138
Neg logit contributions
bled-0.122
amed-0.117
hes-0.111
agged-0.099
 Then-0.099
Token red
Token ID2266
Feature activation+3.02e+00

Pos logit contributions
 harm+0.108
 expect+0.107
grand+0.102
 worry+0.101
 matter+0.101
Neg logit contributions
amed-0.063
agged-0.048
He-0.046
bled-0.045
hes-0.044
Token car
Token ID1097
Feature activation+1.31e+00

Pos logit contributions
 worry+0.088
 matter+0.087
 harm+0.086
 intr+0.077
 reason+0.075
Neg logit contributions
amed-0.106
agged-0.092
bled-0.089
hes-0.084
acked-0.084
Token?"
Token ID1701
Feature activation-1.21e+00

Pos logit contributions
 matter+0.029
 worry+0.028
 harm+0.028
 reason+0.025
 conflicts+0.024
Neg logit contributions
amed-0.055
agged-0.048
bled-0.047
lic-0.047
hes-0.046
Token asked
Token ID1965
Feature activation+1.25e+00

Pos logit contributions
 worry+0.004
 matter+0.004
 harm+0.004
 test+0.003
 reason+0.003
Neg logit contributions
amed-0.006
agged-0.006
bled-0.005
lic-0.005
hes-0.005
Token Mom
Token ID11254
Feature activation+3.06e+00

Pos logit contributions
 worry+0.032
 matter+0.030
 reason+0.025
 harm+0.025
 test+0.025
Neg logit contributions
amed-0.053
agged-0.049
bled-0.045
lic-0.044
acking-0.044
Tokenmy
Token ID1820
Feature activation+1.47e+00

Pos logit contributions
 worry+0.090
 matter+0.090
 harm+0.086
 test+0.078
 reason+0.076
Neg logit contributions
amed-0.111
agged-0.096
bled-0.093
lic-0.090
hes-0.089
Token,
Token ID11
Feature activation+2.01e+00

Pos logit contributions
 matter+0.040
 harm+0.038
 worry+0.036
 reason+0.032
 risk+0.032
Neg logit contributions
amed-0.055
bled-0.049
lic-0.047
apped-0.046
hes-0.046
Token "
Token ID366
Feature activation+1.51e+00

Pos logit contributions
 worry+0.074
 matter+0.073
 harm+0.070
 test+0.066
 reason+0.064
Neg logit contributions
amed-0.059
agged-0.050
bled-0.047
lic-0.045
hes-0.044
TokenWhere
Token ID8496
Feature activation+1.21e+01

Pos logit contributions
 matter+0.040
 worry+0.037
 reason+0.031
 test+0.030
 lawyer+0.030
Neg logit contributions
amed-0.058
agged-0.058
bled-0.053
hes-0.051
apped-0.050
Token are
Token ID389
Feature activation+5.92e+00

Pos logit contributions
 harm+0.346
 matter+0.323
 appearance+0.303
 worry+0.300
 angel+0.291
Neg logit contributions
amed-0.428
hes-0.388
lic-0.375
agged-0.367
bled-0.352
Token we
Token ID356
Feature activation+6.91e+00

Pos logit contributions
 harm+0.116
 matter+0.111
 worry+0.105
 client+0.093
 reason+0.091
Neg logit contributions
amed-0.258
bled-0.237
agged-0.235
lic-0.227
hes-0.226
Token going
Token ID1016
Feature activation+2.14e+00

Pos logit contributions
 matter+0.192
 harm+0.190
 worry+0.180
 problems+0.166
 client+0.164
Neg logit contributions
amed-0.256
bled-0.222
agged-0.218
lic-0.205
hes-0.204
Token?"
Token ID1701
Feature activation-2.21e-01

Pos logit contributions
 matter+0.041
 worry+0.040
 harm+0.038
 reason+0.031
 test+0.031
Neg logit contributions
amed-0.099
agged-0.089
bled-0.087
lic-0.084
hes-0.084
Token Mom
Token ID11254
Feature activation+8.22e-01

Pos logit contributions
amed+0.010
bled+0.009
agged+0.009
lic+0.008
pered+0.008
Neg logit contributions
 worry-0.006
 matter-0.005
 test-0.004
 reason-0.004
 yet-0.004
Token.
Token ID13
Feature activation-3.99e-01

Pos logit contributions
amed+0.033
agged+0.029
bled+0.028
lic+0.027
hes+0.026
Neg logit contributions
 matter-0.030
 worry-0.029
 harm-0.028
 client-0.026
 appearance-0.026
Token Mark
Token ID2940
Feature activation+1.79e+00

Pos logit contributions
amed+0.011
agged+0.009
bled+0.009
hes+0.008
lic+0.008
Neg logit contributions
 matter-0.016
 worry-0.016
 harm-0.015
 test-0.014
 reason-0.014
Token asked
Token ID1965
Feature activation+1.38e+00

Pos logit contributions
 matter+0.056
 worry+0.056
 harm+0.054
 test+0.049
 risk+0.049
Neg logit contributions
amed-0.061
agged-0.052
bled-0.050
lic-0.049
hes-0.048
Token,
Token ID11
Feature activation+1.40e+00

Pos logit contributions
 matter+0.035
 worry+0.035
 harm+0.034
 risk+0.030
 test+0.030
Neg logit contributions
amed-0.055
agged-0.047
bled-0.047
lic-0.046
hes-0.045
Token "
Token ID366
Feature activation+1.57e+00

Pos logit contributions
 worry+0.053
 matter+0.052
 harm+0.050
 test+0.047
 reason+0.046
Neg logit contributions
amed-0.039
agged-0.033
bled-0.031
lic-0.030
hes-0.029
TokenWhere
Token ID8496
Feature activation+1.21e+01

Pos logit contributions
 matter+0.048
 worry+0.048
 harm+0.045
 test+0.042
 reason+0.041
Neg logit contributions
amed-0.055
agged-0.049
bled-0.047
hes-0.044
lic-0.044
Token should
Token ID815
Feature activation+8.25e+00

Pos logit contributions
 harm+0.348
 matter+0.335
 appearance+0.314
 worry+0.308
 perspective+0.296
Neg logit contributions
amed-0.405
hes-0.377
lic-0.357
agged-0.345
apped-0.340
Token we
Token ID356
Feature activation+7.19e+00

Pos logit contributions
 matter+0.207
 harm+0.193
 conflicts+0.190
 perspective+0.185
 problems+0.185
Neg logit contributions
amed-0.277
bled-0.266
agged-0.243
hes-0.242
He-0.237
Token go
Token ID467
Feature activation+1.51e+00

Pos logit contributions
 matter+0.232
 harm+0.229
 worry+0.220
 conflicts+0.214
 problems+0.212
Neg logit contributions
amed-0.214
agged-0.180
bled-0.174
lic-0.169
hes-0.167
Token?"
Token ID1701
Feature activation+2.79e-02

Pos logit contributions
 worry+0.031
 matter+0.031
 harm+0.029
 test+0.025
 reason+0.024
Neg logit contributions
amed-0.068
agged-0.061
bled-0.060
lic-0.058
hes-0.058
Token\n
Token ID198
Feature activation-1.14e+00

Pos logit contributions
 worry+0.001
 matter+0.001
 harm+0.001
 test+0.001
 risk+0.001
Neg logit contributions
amed-0.001
agged-0.001
bled-0.001
lic-0.001
hes-0.001
Token asked
Token ID1965
Feature activation+9.65e-01

Pos logit contributions
 matter+0.103
 worry+0.101
 harm+0.099
 risk+0.090
 test+0.089
Neg logit contributions
amed-0.110
agged-0.090
lic-0.087
bled-0.087
hes-0.087
Token her
Token ID607
Feature activation+7.04e-01

Pos logit contributions
 matter+0.027
 worry+0.026
 harm+0.025
 risk+0.022
 test+0.022
Neg logit contributions
amed-0.037
agged-0.032
bled-0.032
hes-0.031
lic-0.030
Token mother
Token ID2802
Feature activation+3.31e+00

Pos logit contributions
 matter+0.022
 harm+0.021
 risk+0.019
 appearance+0.019
 conflicts+0.019
Neg logit contributions
amed-0.022
lic-0.020
hes-0.019
bled-0.018
 Later-0.018
Token:
Token ID25
Feature activation-9.85e-01

Pos logit contributions
 matter+0.103
 harm+0.098
 worry+0.094
 reason+0.088
 risk+0.088
Neg logit contributions
amed-0.105
bled-0.090
hes-0.087
lic-0.087
 Later-0.085
Token "
Token ID366
Feature activation+1.08e+00

Pos logit contributions
amed+0.025
agged+0.020
bled+0.019
lic+0.019
hes+0.018
Neg logit contributions
 matter-0.040
 worry-0.039
 harm-0.038
 test-0.035
 reason-0.035
TokenWhere
Token ID8496
Feature activation+1.21e+01

Pos logit contributions
 worry+0.037
 matter+0.036
 harm+0.035
 test+0.032
 reason+0.032
Neg logit contributions
amed-0.035
agged-0.030
bled-0.029
lic-0.028
hes-0.028
Token does
Token ID857
Feature activation+5.81e+00

Pos logit contributions
 harm+0.373
 matter+0.367
 appearance+0.345
 perspective+0.343
 avail+0.336
Neg logit contributions
amed-0.365
hes-0.352
lic-0.318
apped-0.317
agged-0.305
Token Daddy
Token ID26926
Feature activation+6.51e+00

Pos logit contributions
 matter+0.162
 harm+0.159
 perspective+0.157
 worry+0.156
 conflicts+0.151
Neg logit contributions
amed-0.161
hes-0.159
lic-0.149
bled-0.141
agged-0.138
Token go
Token ID467
Feature activation+1.16e+00

Pos logit contributions
 matter+0.163
 harm+0.158
 worry+0.155
 perspective+0.155
 conflicts+0.151
Neg logit contributions
amed-0.235
lic-0.204
hes-0.203
agged-0.200
bled-0.199
Token every
Token ID790
Feature activation+8.20e+00

Pos logit contributions
 matter+0.026
 worry+0.026
 harm+0.024
 client+0.021
 reason+0.021
Neg logit contributions
amed-0.048
agged-0.042
bled-0.042
lic-0.041
hes-0.041
Token morning
Token ID3329
Feature activation+3.73e+00

Pos logit contributions
 worry+0.177
 matter+0.176
 appearance+0.173
 conflicts+0.166
 client+0.161
Neg logit contributions
amed-0.291
hes-0.258
lic-0.249
acking-0.246
 Later-0.245
Token his
Token ID465
Feature activation+1.25e-01

Pos logit contributions
 worry+0.019
 matter+0.019
 harm+0.018
 test+0.016
 reason+0.016
Neg logit contributions
amed-0.019
agged-0.016
bled-0.015
lic-0.015
hes-0.015
Token mom
Token ID1995
Feature activation+2.28e+00

Pos logit contributions
 matter+0.004
 harm+0.003
 worry+0.003
 risk+0.003
 perspective+0.003
Neg logit contributions
amed-0.004
bled-0.004
lic-0.003
agged-0.003
hes-0.003
Tokenmy
Token ID1820
Feature activation+6.64e-01

Pos logit contributions
 matter+0.071
 harm+0.070
 worry+0.066
 risk+0.063
 reason+0.062
Neg logit contributions
amed-0.071
bled-0.060
lic-0.059
 Later-0.057
hes-0.057
Token,
Token ID11
Feature activation+1.41e+00

Pos logit contributions
 matter+0.022
 harm+0.020
 worry+0.019
 reason+0.019
 avail+0.019
Neg logit contributions
amed-0.019
bled-0.017
lic-0.016
 Later-0.016
apped-0.016
Token "
Token ID366
Feature activation+1.04e+00

Pos logit contributions
 matter+0.055
 worry+0.054
 harm+0.053
 test+0.049
 appearance+0.048
Neg logit contributions
amed-0.038
agged-0.031
bled-0.030
lic-0.029
hes-0.028
TokenWhere
Token ID8496
Feature activation+1.21e+01

Pos logit contributions
 worry+0.038
 matter+0.038
 harm+0.036
 test+0.034
 reason+0.033
Neg logit contributions
amed-0.031
agged-0.026
bled-0.025
lic-0.024
hes-0.023
Token is
Token ID318
Feature activation+4.53e+00

Pos logit contributions
 harm+0.359
 matter+0.353
 appearance+0.333
 perspective+0.325
 worry+0.317
Neg logit contributions
amed-0.396
hes-0.354
lic-0.345
agged-0.333
bled-0.333
Token my
Token ID616
Feature activation+4.92e+00

Pos logit contributions
 matter+0.136
 perspective+0.136
 worry+0.135
 encounter+0.130
 harm+0.129
Neg logit contributions
bled-0.127
amed-0.122
hes-0.121
agged-0.099
ocked-0.097
Token computer
Token ID3644
Feature activation+3.54e+00

Pos logit contributions
 worry+0.145
 harm+0.145
 matter+0.143
 intr+0.141
 appearance+0.125
Neg logit contributions
amed-0.157
hes-0.131
bled-0.127
agged-0.123
apped-0.121
Token?"
Token ID1701
Feature activation-6.68e-01

Pos logit contributions
 matter+0.074
 worry+0.072
 harm+0.071
 client+0.062
 reason+0.061
Neg logit contributions
amed-0.152
agged-0.133
bled-0.131
lic-0.129
hes-0.128
Token His
Token ID2399
Feature activation-2.60e+00

Pos logit contributions
amed+0.017
agged+0.014
bled+0.013
lic+0.013
hes+0.012
Neg logit contributions
 worry-0.027
 matter-0.027
 harm-0.026
 test-0.024
 reason-0.024
Token asked
Token ID1965
Feature activation+1.24e+00

Pos logit contributions
amed+0.036
agged+0.032
bled+0.031
hes+0.029
lic+0.029
Neg logit contributions
 worry-0.035
 matter-0.035
 harm-0.032
 test-0.030
 reason-0.030
Token his
Token ID465
Feature activation+7.54e-01

Pos logit contributions
 worry+0.036
 matter+0.035
 harm+0.033
 test+0.031
 reason+0.030
Neg logit contributions
amed-0.047
agged-0.042
bled-0.039
lic-0.039
hes-0.038
Token mom
Token ID1995
Feature activation+3.08e+00

Pos logit contributions
 matter+0.024
 harm+0.023
 risk+0.021
 worry+0.021
 appearance+0.020
Neg logit contributions
amed-0.024
bled-0.020
lic-0.019
apped-0.019
 Then-0.018
Token,
Token ID11
Feature activation+1.80e+00

Pos logit contributions
 matter+0.078
 harm+0.077
 worry+0.076
 risk+0.065
 reason+0.065
Neg logit contributions
amed-0.120
bled-0.103
lic-0.102
agged-0.101
hes-0.100
Token "
Token ID366
Feature activation+1.31e+00

Pos logit contributions
 matter+0.068
 worry+0.067
 harm+0.066
 test+0.060
 appearance+0.060
Neg logit contributions
amed-0.050
agged-0.041
bled-0.040
lic-0.039
hes-0.038
TokenWhere
Token ID8496
Feature activation+1.20e+01

Pos logit contributions
 matter+0.040
 worry+0.039
 harm+0.035
 reason+0.035
 test+0.034
Neg logit contributions
amed-0.046
agged-0.042
bled-0.039
hes-0.038
lic-0.038
Token can
Token ID460
Feature activation+8.11e+00

Pos logit contributions
 harm+0.356
 matter+0.344
 appearance+0.335
 perspective+0.319
 worry+0.310
Neg logit contributions
amed-0.388
hes-0.367
lic-0.341
agged-0.330
bled-0.329
Token I
Token ID314
Feature activation+6.05e+00

Pos logit contributions
 matter+0.191
 perspective+0.174
 harm+0.174
 conflicts+0.169
 worry+0.166
Neg logit contributions
amed-0.291
bled-0.287
hes-0.260
agged-0.260
lic-0.253
Token go
Token ID467
Feature activation+8.05e-01

Pos logit contributions
 matter+0.189
 worry+0.182
 harm+0.179
 appearance+0.169
 conflicts+0.168
Neg logit contributions
amed-0.195
agged-0.166
bled-0.163
hes-0.160
lic-0.158
Token?".
Token ID43634
Feature activation-3.81e-01

Pos logit contributions
 worry+0.018
 matter+0.017
 harm+0.016
 test+0.015
 risk+0.014
Neg logit contributions
amed-0.035
agged-0.032
bled-0.031
lic-0.030
hes-0.030
Token His
Token ID2399
Feature activation-1.91e+00

Pos logit contributions
amed+0.010
agged+0.008
bled+0.008
lic+0.008
hes+0.008
Neg logit contributions
 worry-0.015
 matter-0.015
 harm-0.014
 test-0.014
 reason-0.013
Token Together
Token ID17083
Feature activation-1.11e+00

Pos logit contributions
amed+0.106
agged+0.093
bled+0.091
lic+0.088
hes+0.087
Neg logit contributions
 worry-0.076
 matter-0.075
 harm-0.072
 test-0.065
 reason-0.064
Token they
Token ID484
Feature activation-2.84e+00

Pos logit contributions
amed+0.034
bled+0.028
agged+0.027
hes+0.027
apped+0.027
Neg logit contributions
 harm-0.033
 matter-0.033
 worry-0.032
 client-0.031
 appearance-0.030
Token t
Token ID256
Feature activation-3.81e+00

Pos logit contributions
amed+0.090
hes+0.072
bled+0.071
agged+0.068
lic+0.068
Neg logit contributions
 matter-0.096
 worry-0.095
 harm-0.091
 appearance-0.085
 risk-0.085
Tokenipt
Token ID10257
Feature activation-8.62e+00

Pos logit contributions
amed+0.099
agged+0.082
bled+0.078
hes+0.077
lic+0.075
Neg logit contributions
 worry-0.145
 matter-0.142
 harm-0.138
 client-0.128
 test-0.127
Tokeno
Token ID78
Feature activation-5.45e+00

Pos logit contributions
amed+0.264
agged+0.230
hes+0.224
bled+0.223
lic+0.214
Neg logit contributions
 worry-0.260
 matter-0.252
 harm-0.248
 test-0.225
 risk-0.224
Tokened
Token ID276
Feature activation-1.01e+01

Pos logit contributions
amed+0.104
agged+0.084
bled+0.073
lic+0.071
hes+0.070
Neg logit contributions
 worry-0.244
 matter-0.236
 harm-0.235
 test-0.221
 problems-0.219
Token through
Token ID832
Feature activation-3.34e+00

Pos logit contributions
amed+0.353
agged+0.304
bled+0.294
lic+0.284
hes+0.280
Neg logit contributions
 worry-0.311
 matter-0.310
 harm-0.297
 test-0.270
 reason-0.266
Token,
Token ID11
Feature activation-1.07e+00

Pos logit contributions
amed+0.116
bled+0.099
agged+0.095
lic+0.093
hes+0.093
Neg logit contributions
 matter-0.097
 worry-0.095
 harm-0.092
 exam-0.087
 client-0.086
Token searching
Token ID10342
Feature activation-3.39e+00

Pos logit contributions
amed+0.041
agged+0.039
lic+0.035
bled+0.035
acked+0.034
Neg logit contributions
 worry-0.030
 matter-0.028
 harm-0.026
 test-0.025
 risk-0.024
Token for
Token ID329
Feature activation+1.20e+00

Pos logit contributions
amed+0.114
agged+0.097
bled+0.094
lic+0.090
hes+0.090
Neg logit contributions
 worry-0.108
 matter-0.107
 harm-0.103
 test-0.094
 client-0.094
Token treasures
Token ID29561
Feature activation+1.77e+00

Pos logit contributions
 matter+0.037
 worry+0.036
 harm+0.035
 exam+0.033
 client+0.033
Neg logit contributions
amed-0.040
agged-0.034
bled-0.034
lic-0.032
hes-0.032
Token to
Token ID284
Feature activation-3.61e-01

Pos logit contributions
amed+0.145
agged+0.127
bled+0.122
lic+0.120
hes+0.117
Neg logit contributions
 matter-0.101
 worry-0.100
 harm-0.095
 client-0.085
 appearance-0.084
Token the
Token ID262
Feature activation-2.36e+00

Pos logit contributions
amed+0.013
bled+0.011
agged+0.011
hes+0.010
apped+0.010
Neg logit contributions
 matter-0.010
 worry-0.010
 harm-0.010
 problems-0.009
 client-0.009
Token bell
Token ID8966
Feature activation-2.02e+00

Pos logit contributions
amed+0.099
agged+0.084
bled+0.082
lic+0.081
hes+0.078
Neg logit contributions
 matter-0.055
 worry-0.053
 harm-0.052
 conflicts-0.047
 appearance-0.045
Token.
Token ID13
Feature activation-2.34e+00

Pos logit contributions
amed+0.069
agged+0.060
bled+0.057
lic+0.055
hes+0.054
Neg logit contributions
 worry-0.064
 matter-0.063
 harm-0.060
 test-0.056
 angel-0.054
Token Jack
Token ID3619
Feature activation-9.48e-01

Pos logit contributions
amed+0.093
agged+0.083
bled+0.080
hes+0.077
lic+0.076
Neg logit contributions
 worry-0.057
 matter-0.052
 test-0.048
 anyone-0.048
 harm-0.048
Token stood
Token ID6204
Feature activation-9.79e+00

Pos logit contributions
amed+0.032
agged+0.028
bled+0.027
lic+0.026
hes+0.025
Neg logit contributions
 worry-0.030
 matter-0.030
 harm-0.029
 test-0.026
 reason-0.026
Token and
Token ID290
Feature activation-5.04e+00

Pos logit contributions
lic+0.371
amed+0.371
agged+0.366
ashed+0.343
hes+0.341
Neg logit contributions
 worry-0.237
 angel-0.198
 yet-0.190
 matter-0.185
 comet-0.175
Token looked
Token ID3114
Feature activation-7.87e+00

Pos logit contributions
amed+0.186
agged+0.170
bled+0.156
lic+0.147
hes+0.146
Neg logit contributions
 worry-0.147
 matter-0.143
 harm-0.132
 test-0.128
 angel-0.119
Token at
Token ID379
Feature activation-1.52e+00

Pos logit contributions
amed+0.295
agged+0.266
bled+0.248
hes+0.246
lic+0.242
Neg logit contributions
 worry-0.219
 matter-0.213
 harm-0.199
 test-0.185
 angel-0.183
Token the
Token ID262
Feature activation-2.72e+00

Pos logit contributions
amed+0.053
bled+0.047
agged+0.045
hes+0.042
apped+0.041
Neg logit contributions
 worry-0.045
 matter-0.044
 harm-0.042
 client-0.041
 exam-0.039
Token bell
Token ID8966
Feature activation-1.87e+00

Pos logit contributions
amed+0.121
agged+0.105
bled+0.102
lic+0.101
hes+0.100
Neg logit contributions
 worry-0.056
 matter-0.055
 harm-0.053
 client-0.046
 appearance-0.045
Token nailed
Token ID37708
Feature activation-5.07e+00

Pos logit contributions
amed+0.118
agged+0.108
bled+0.099
lic+0.097
acked+0.094
Neg logit contributions
 worry-0.110
 matter-0.109
 harm-0.105
 test-0.095
 reason-0.093
Token down
Token ID866
Feature activation-3.72e+00

Pos logit contributions
amed+0.183
agged+0.154
bled+0.152
lic+0.145
hes+0.144
Neg logit contributions
 worry-0.149
 matter-0.146
 harm-0.141
 client-0.132
 test-0.130
Token,
Token ID11
Feature activation-5.77e-01

Pos logit contributions
amed+0.136
agged+0.116
bled+0.114
lic+0.109
hes+0.109
Neg logit contributions
 matter-0.106
 worry-0.105
 harm-0.100
 client-0.091
 test-0.089
Token the
Token ID262
Feature activation-3.15e+00

Pos logit contributions
amed+0.021
agged+0.019
bled+0.018
lic+0.018
hes+0.017
Neg logit contributions
 worry-0.017
 matter-0.017
 harm-0.016
 test-0.014
 angel-0.014
Token children
Token ID1751
Feature activation+1.66e+00

Pos logit contributions
amed+0.114
bled+0.093
aming+0.086
agged+0.085
apped+0.085
Neg logit contributions
 matter-0.083
 harm-0.081
 worry-0.081
 intr-0.074
 expect-0.074
Token stood
Token ID6204
Feature activation-9.68e+00

Pos logit contributions
 matter+0.055
 worry+0.054
 harm+0.052
 appearance+0.048
 test+0.048
Neg logit contributions
amed-0.054
agged-0.044
bled-0.044
lic-0.043
hes-0.042
Token back
Token ID736
Feature activation-4.04e+00

Pos logit contributions
amed+0.348
lic+0.332
agged+0.331
hes+0.303
acked+0.298
Neg logit contributions
 worry-0.264
 matter-0.226
 harm-0.221
 angel-0.213
 yet-0.210
Token and
Token ID290
Feature activation-4.47e+00

Pos logit contributions
amed+0.131
agged+0.111
bled+0.107
lic+0.103
hes+0.102
Neg logit contributions
 matter-0.134
 worry-0.133
 harm-0.128
 test-0.117
 reason-0.116
Token admired
Token ID29382
Feature activation-1.63e+00

Pos logit contributions
amed+0.150
agged+0.134
bled+0.125
lic+0.123
hes+0.120
Neg logit contributions
 worry-0.143
 matter-0.141
 harm-0.135
 test-0.124
 reason-0.121
Token their
Token ID511
Feature activation-1.95e+00

Pos logit contributions
amed+0.054
bled+0.049
agged+0.045
hes+0.045
apped+0.044
Neg logit contributions
 matter-0.050
 worry-0.049
 harm-0.046
 exam-0.044
 risk-0.044
Token work
Token ID670
Feature activation-2.62e+00

Pos logit contributions
amed+0.079
bled+0.068
agged+0.066
aming+0.063
acking+0.063
Neg logit contributions
 matter-0.042
 worry-0.040
 harm-0.040
 client-0.037
 angel-0.036
Token on
Token ID319
Feature activation-4.03e-01

Pos logit contributions
amed+0.058
agged+0.049
bled+0.047
lic+0.046
hes+0.045
Neg logit contributions
 worry-0.059
 matter-0.058
 harm-0.056
 test-0.051
 reason-0.051
Token it
Token ID340
Feature activation-4.55e+00

Pos logit contributions
amed+0.014
bled+0.012
agged+0.012
hes+0.012
lic+0.011
Neg logit contributions
 worry-0.012
 matter-0.011
 harm-0.011
 client-0.010
 exam-0.010
Token.
Token ID13
Feature activation-2.59e+00

Pos logit contributions
amed+0.163
agged+0.137
bled+0.136
lic+0.130
hes+0.130
Neg logit contributions
 matter-0.134
 worry-0.133
 harm-0.126
 client-0.118
 reason-0.115
Token The
Token ID383
Feature activation-2.04e+00

Pos logit contributions
amed+0.101
agged+0.089
bled+0.089
lic+0.081
hes+0.080
Neg logit contributions
 worry-0.068
 matter-0.064
 harm-0.061
 angel-0.058
 test-0.057
Token bug
Token ID5434
Feature activation-1.34e+00

Pos logit contributions
amed+0.092
bled+0.078
agged+0.078
lic+0.074
hes+0.074
Neg logit contributions
 matter-0.040
 harm-0.038
 worry-0.037
 client-0.035
grand-0.034
Token stood
Token ID6204
Feature activation-9.55e+00

Pos logit contributions
amed+0.045
agged+0.038
bled+0.037
lic+0.036
hes+0.035
Neg logit contributions
 worry-0.043
 matter-0.043
 harm-0.041
 test-0.038
 risk-0.038
Token there
Token ID612
Feature activation-3.21e+00

Pos logit contributions
amed+0.357
lic+0.356
agged+0.348
hes+0.324
acked+0.322
Neg logit contributions
 worry-0.240
 angel-0.203
 matter-0.197
 yet-0.194
 harm-0.192
Token for
Token ID329
Feature activation+1.81e+00

Pos logit contributions
amed+0.114
agged+0.099
bled+0.096
lic+0.092
hes+0.091
Neg logit contributions
 worry-0.096
 matter-0.096
 harm-0.091
 test-0.083
 reason-0.082
Token a
Token ID257
Feature activation+1.22e+00

Pos logit contributions
 matter+0.060
 worry+0.059
 harm+0.058
 client+0.054
 appearance+0.054
Neg logit contributions
amed-0.057
agged-0.049
bled-0.047
hes-0.045
lic-0.045
Token moment
Token ID2589
Feature activation+1.15e+00

Pos logit contributions
 harm+0.036
 client+0.035
 appearance+0.035
 matter+0.035
 problems+0.034
Neg logit contributions
amed-0.035
lic-0.035
He-0.028
bled-0.028
tted-0.027
Token,
Token ID11
Feature activation+1.27e-01

Pos logit contributions
 matter+0.038
 worry+0.038
 harm+0.036
 client+0.034
 appearance+0.033
Neg logit contributions
amed-0.037
agged-0.030
bled-0.030
lic-0.029
hes-0.029
Token,
Token ID11
Feature activation+5.61e+00

Pos logit contributions
 matter+0.132
 harm+0.126
 problems+0.117
 worry+0.113
 client+0.113
Neg logit contributions
amed-0.206
bled-0.185
lic-0.180
agged-0.178
apped-0.174
Token Jack
Token ID3619
Feature activation+4.68e+00

Pos logit contributions
 harm+0.178
 matter+0.169
 problems+0.167
 client+0.158
 conflicts+0.158
Neg logit contributions
amed-0.162
bled-0.142
lic-0.137
hes-0.134
agged-0.129
Token?"
Token ID1701
Feature activation-1.81e+00

Pos logit contributions
 matter+0.109
 worry+0.105
 harm+0.103
 appearance+0.097
 client+0.096
Neg logit contributions
amed-0.181
lic-0.161
hes-0.155
acking-0.154
bled-0.151
Token Jack
Token ID3619
Feature activation-1.56e+00

Pos logit contributions
amed+0.067
agged+0.059
bled+0.057
lic+0.055
hes+0.055
Neg logit contributions
 worry-0.052
 matter-0.051
 harm-0.049
 test-0.044
 reason-0.043
Token slowly
Token ID6364
Feature activation-4.07e+00

Pos logit contributions
amed+0.052
agged+0.045
bled+0.043
lic+0.042
hes+0.041
Neg logit contributions
 worry-0.050
 matter-0.050
 harm-0.048
 test-0.044
 reason-0.043
Token stood
Token ID6204
Feature activation-9.52e+00

Pos logit contributions
amed+0.144
agged+0.127
bled+0.121
hes+0.116
lic+0.112
Neg logit contributions
 worry-0.123
 matter-0.122
 harm-0.115
 test-0.106
 reason-0.104
Token up
Token ID510
Feature activation-4.54e+00

Pos logit contributions
lic+0.391
amed+0.370
agged+0.350
tted+0.347
ashed+0.339
Neg logit contributions
 worry-0.218
 angel-0.170
 harm-0.163
 yet-0.159
 intr-0.149
Token,
Token ID11
Feature activation-1.96e+00

Pos logit contributions
amed+0.184
agged+0.165
hes+0.156
lic+0.155
bled+0.153
Neg logit contributions
 worry-0.125
 matter-0.111
 harm-0.110
 test-0.099
 yet-0.098
Token rubbing
Token ID27714
Feature activation-6.21e+00

Pos logit contributions
amed+0.081
agged+0.076
lic+0.072
acked+0.070
bled+0.069
Neg logit contributions
 worry-0.051
 matter-0.042
 harm-0.040
 angel-0.037
 quarrel-0.036
Token his
Token ID465
Feature activation-4.85e+00

Pos logit contributions
amed+0.239
agged+0.212
lic+0.208
hes+0.195
bled+0.195
Neg logit contributions
 worry-0.169
 matter-0.161
 harm-0.152
 anyone-0.135
 test-0.134
Token knees
Token ID14475
Feature activation-5.55e+00

Pos logit contributions
amed+0.163
agged+0.140
bled+0.135
lic+0.129
hes+0.128
Neg logit contributions
 matter-0.155
 worry-0.155
 harm-0.149
 test-0.136
 reason-0.134
Token and
Token ID290
Feature activation-3.58e+00

Pos logit contributions
amed+0.146
agged+0.129
bled+0.124
hes+0.122
lic+0.120
Neg logit contributions
 worry-0.112
 matter-0.110
 harm-0.104
 test-0.093
 appearance-0.092
Token happy
Token ID3772
Feature activation+2.52e-01

Pos logit contributions
amed+0.127
agged+0.109
bled+0.107
lic+0.103
hes+0.101
Neg logit contributions
 worry-0.106
 matter-0.106
 harm-0.102
 test-0.093
 reason-0.092
Token.
Token ID13
Feature activation-2.61e+00

Pos logit contributions
 matter+0.007
 worry+0.007
 harm+0.007
 client+0.006
 test+0.006
Neg logit contributions
amed-0.009
agged-0.008
bled-0.008
lic-0.007
apped-0.007
Token The
Token ID383
Feature activation-1.90e+00

Pos logit contributions
amed+0.082
agged+0.068
bled+0.068
lic+0.065
hes+0.063
Neg logit contributions
 worry-0.086
 matter-0.082
 harm-0.080
 anyone-0.077
 client-0.076
Token squirrel
Token ID33039
Feature activation-1.83e+00

Pos logit contributions
amed+0.079
agged+0.070
bled+0.068
lic+0.066
hes+0.065
Neg logit contributions
 matter-0.045
 worry-0.045
 harm-0.042
 test-0.038
 reason-0.037
Token stood
Token ID6204
Feature activation-9.51e+00

Pos logit contributions
amed+0.063
agged+0.054
bled+0.052
lic+0.050
hes+0.050
Neg logit contributions
 worry-0.058
 matter-0.057
 harm-0.055
 test-0.050
 reason-0.049
Token up
Token ID510
Feature activation-4.10e+00

Pos logit contributions
amed+0.349
lic+0.334
agged+0.332
hes+0.319
acked+0.298
Neg logit contributions
 worry-0.259
 matter-0.215
 harm-0.215
 angel-0.214
 intr-0.201
Token on
Token ID319
Feature activation-1.35e+00

Pos logit contributions
amed+0.161
agged+0.147
hes+0.141
lic+0.139
bled+0.136
Neg logit contributions
 worry-0.111
 matter-0.102
 harm-0.098
 test-0.088
 angel-0.086
Token its
Token ID663
Feature activation-5.64e+00

Pos logit contributions
amed+0.048
agged+0.042
bled+0.041
hes+0.038
acked+0.037
Neg logit contributions
 matter-0.038
 worry-0.038
 harm-0.037
 exam-0.034
 client-0.034
Token hind
Token ID16222
Feature activation-1.45e-01

Pos logit contributions
amed+0.189
agged+0.162
bled+0.156
lic+0.150
hes+0.147
Neg logit contributions
 matter-0.180
 worry-0.180
 harm-0.172
 test-0.158
 reason-0.155
Token legs
Token ID7405
Feature activation-4.30e+00

Pos logit contributions
amed+0.005
agged+0.004
bled+0.004
lic+0.003
acking+0.003
Neg logit contributions
 matter-0.005
 worry-0.005
 harm-0.005
 test-0.004
 exam-0.004
Token tree
Token ID5509
Feature activation+4.21e-03

Pos logit contributions
amed+0.014
agged+0.011
bled+0.011
lic+0.011
hes+0.011
Neg logit contributions
 worry-0.011
 matter-0.011
 harm-0.010
 client-0.009
 reason-0.009
Token for
Token ID329
Feature activation+2.95e+00

Pos logit contributions
 matter+0.000
 worry+0.000
 harm+0.000
 client+0.000
 appearance+0.000
Neg logit contributions
amed-0.000
lic-0.000
agged-0.000
acking-0.000
hes-0.000
Token treasure
Token ID14068
Feature activation+1.48e+00

Pos logit contributions
 matter+0.098
 worry+0.096
 harm+0.090
 client+0.088
 angel+0.086
Neg logit contributions
amed-0.093
agged-0.078
bled-0.078
hes-0.073
lic-0.071
Token!"
Token ID2474
Feature activation-9.54e-01

Pos logit contributions
 matter+0.046
 worry+0.044
 harm+0.041
 client+0.039
 appearance+0.037
Neg logit contributions
amed-0.049
lic-0.041
bled-0.040
agged-0.039
hes-0.039
Token They
Token ID1119
Feature activation-3.30e+00

Pos logit contributions
amed+0.031
agged+0.027
bled+0.026
lic+0.024
hes+0.024
Neg logit contributions
 worry-0.032
 matter-0.031
 harm-0.030
 test-0.027
 reason-0.027
Token dug
Token ID18735
Feature activation-9.50e+00

Pos logit contributions
amed+0.107
agged+0.098
bled+0.091
lic+0.087
hes+0.087
Neg logit contributions
 worry-0.108
 matter-0.106
 harm-0.098
 test-0.094
 reason-0.091
Token a
Token ID257
Feature activation-3.54e+00

Pos logit contributions
amed+0.353
agged+0.333
lic+0.311
bled+0.304
hes+0.295
Neg logit contributions
 matter-0.251
 worry-0.242
 harm-0.234
 problems-0.218
 reason-0.214
Token small
Token ID1402
Feature activation-1.70e+00

Pos logit contributions
amed+0.136
agged+0.120
bled+0.116
lic+0.112
hes+0.112
Neg logit contributions
 worry-0.096
 matter-0.096
 harm-0.091
 test-0.082
 reason-0.080
Token hole
Token ID7604
Feature activation-4.44e+00

Pos logit contributions
amed+0.062
agged+0.052
lic+0.051
bled+0.051
acking+0.049
Neg logit contributions
 matter-0.047
 worry-0.047
 harm-0.046
 client-0.041
 exam-0.040
Token and
Token ID290
Feature activation-3.39e+00

Pos logit contributions
amed+0.162
agged+0.152
lic+0.138
bled+0.134
hes+0.134
Neg logit contributions
 worry-0.120
 matter-0.113
 harm-0.113
 happen-0.099
 test-0.099
Token found
Token ID1043
Feature activation-3.83e+00

Pos logit contributions
amed+0.124
agged+0.116
bled+0.104
lic+0.101
acking+0.101
Neg logit contributions
 worry-0.099
 matter-0.098
 harm-0.092
 test-0.084
 reason-0.082
Token some
Token ID617
Feature activation+2.44e+00

Pos logit contributions
amed+0.015
agged+0.013
bled+0.012
lic+0.012
hes+0.012
Neg logit contributions
 worry-0.013
 matter-0.013
 harm-0.013
 test-0.012
 client-0.011
Token string
Token ID4731
Feature activation-4.45e+00

Pos logit contributions
 worry+0.072
 matter+0.071
 harm+0.069
 client+0.062
 test+0.062
Neg logit contributions
amed-0.087
agged-0.075
bled-0.072
lic-0.071
hes-0.070
Token.
Token ID13
Feature activation-2.41e+00

Pos logit contributions
amed+0.167
agged+0.145
bled+0.142
lic+0.137
hes+0.136
Neg logit contributions
 worry-0.124
 matter-0.124
 harm-0.117
 test-0.106
 reason-0.105
Token N
Token ID399
Feature activation+1.11e+00

Pos logit contributions
amed+0.090
agged+0.080
bled+0.074
aming+0.068
lic+0.067
Neg logit contributions
 worry-0.060
 age-0.058
 yet-0.056
 we-0.056
 test-0.055
Tokenana
Token ID2271
Feature activation-2.70e+00

Pos logit contributions
 worry+0.037
 matter+0.037
 harm+0.036
 test+0.034
 client+0.034
Neg logit contributions
amed-0.034
agged-0.028
bled-0.028
lic-0.026
hes-0.026
Token stood
Token ID6204
Feature activation-9.49e+00

Pos logit contributions
amed+0.094
agged+0.082
bled+0.078
lic+0.076
hes+0.075
Neg logit contributions
 worry-0.083
 matter-0.082
 harm-0.079
 test-0.072
 reason-0.071
Token back
Token ID736
Feature activation-4.36e+00

Pos logit contributions
amed+0.355
lic+0.341
agged+0.339
acked+0.315
hes+0.312
Neg logit contributions
 worry-0.232
 angel-0.195
 matter-0.195
 harm-0.186
 test-0.181
Token and
Token ID290
Feature activation-4.80e+00

Pos logit contributions
amed+0.136
agged+0.115
bled+0.111
lic+0.106
hes+0.105
Neg logit contributions
 worry-0.150
 matter-0.150
 harm-0.144
 test-0.132
 reason-0.131
Token smiled
Token ID13541
Feature activation-7.17e+00

Pos logit contributions
amed+0.174
agged+0.159
bled+0.145
lic+0.144
hes+0.139
Neg logit contributions
 worry-0.140
 matter-0.137
 harm-0.128
 test-0.119
 angel-0.115
Token.
Token ID13
Feature activation-3.03e+00

Pos logit contributions
amed+0.304
agged+0.280
hes+0.265
bled+0.261
lic+0.260
Neg logit contributions
 worry-0.172
 matter-0.157
 harm-0.141
 angel-0.135
 test-0.127
Token\n
Token ID198
Feature activation-2.06e+00

Pos logit contributions
amed+0.116
agged+0.105
bled+0.097
aming+0.091
lic+0.089
Neg logit contributions
 worry-0.071
 age-0.069
 yet-0.066
 we-0.065
 you-0.064
Token was
Token ID373
Feature activation-2.78e+00

Pos logit contributions
amed+0.043
bled+0.035
agged+0.035
lic+0.034
hes+0.033
Neg logit contributions
 worry-0.045
 matter-0.045
 harm-0.043
 client-0.040
 appearance-0.040
Token finished
Token ID5201
Feature activation-3.96e+00

Pos logit contributions
agged+0.099
amed+0.099
uddled+0.093
acked+0.092
ucks+0.091
Neg logit contributions
 matter-0.064
 worry-0.060
 test-0.052
 harm-0.051
 yet-0.050
Token,
Token ID11
Feature activation+2.19e-01

Pos logit contributions
amed+0.173
agged+0.154
bled+0.151
lic+0.147
hes+0.145
Neg logit contributions
 worry-0.085
 matter-0.085
 harm-0.080
 test-0.069
 reason-0.068
Token the
Token ID262
Feature activation-2.52e+00

Pos logit contributions
 matter+0.006
 worry+0.006
 harm+0.005
 appearance+0.005
 test+0.005
Neg logit contributions
amed-0.008
bled-0.007
agged-0.007
lic-0.007
hes-0.007
Token bunny
Token ID44915
Feature activation-4.00e-01

Pos logit contributions
amed+0.096
bled+0.082
hes+0.080
agged+0.078
aming+0.075
Neg logit contributions
 matter-0.062
 worry-0.062
 harm-0.061
 appearance-0.058
 expect-0.055
Token stood
Token ID6204
Feature activation-9.48e+00

Pos logit contributions
amed+0.014
agged+0.012
bled+0.012
hes+0.011
lic+0.011
Neg logit contributions
 worry-0.012
 matter-0.012
 harm-0.012
 test-0.011
 angel-0.010
Token back
Token ID736
Feature activation-4.43e+00

Pos logit contributions
lic+0.352
amed+0.345
agged+0.342
tted+0.320
hes+0.317
Neg logit contributions
 worry-0.234
 angel-0.198
 yet-0.193
 matter-0.189
 harm-0.181
Token and
Token ID290
Feature activation-4.49e+00

Pos logit contributions
amed+0.151
agged+0.132
bled+0.126
lic+0.124
hes+0.121
Neg logit contributions
 worry-0.138
 matter-0.134
 harm-0.131
 test-0.123
 exam-0.119
Token admired
Token ID29382
Feature activation-2.28e+00

Pos logit contributions
amed+0.155
agged+0.141
bled+0.128
lic+0.128
hes+0.124
Neg logit contributions
 worry-0.140
 matter-0.138
 harm-0.131
 test-0.121
 angel-0.117
Token it
Token ID340
Feature activation-2.99e+00

Pos logit contributions
amed+0.078
bled+0.070
agged+0.067
hes+0.065
lic+0.062
Neg logit contributions
 matter-0.069
 worry-0.068
 harm-0.064
 reason-0.060
 test-0.060
Token.
Token ID13
Feature activation-3.12e+00

Pos logit contributions
amed+0.104
bled+0.088
agged+0.087
lic+0.083
hes+0.083
Neg logit contributions
 worry-0.090
 matter-0.089
 harm-0.086
 client-0.079
 reason-0.078
Token!
Token ID0
Feature activation-2.55e+00

Pos logit contributions
 matter+0.000
 worry+0.000
 harm+0.000
 reason+0.000
 test+0.000
Neg logit contributions
amed-0.001
agged-0.000
bled-0.000
lic-0.000
hes-0.000
Token\n
Token ID198
Feature activation-2.04e+00

Pos logit contributions
amed+0.092
bled+0.082
agged+0.081
hes+0.075
aming+0.070
Neg logit contributions
 worry-0.075
 matter-0.069
 harm-0.066
 lawyer-0.066
 test-0.064
Token\n
Token ID198
Feature activation-1.50e+00

Pos logit contributions
amed+0.072
agged+0.068
bled+0.065
hes+0.060
acked+0.058
Neg logit contributions
 worry-0.056
 harm-0.054
 matter-0.050
 expect-0.050
 reason-0.049
TokenThe
Token ID464
Feature activation-1.86e+00

Pos logit contributions
amed+0.051
agged+0.051
bled+0.048
hes+0.046
apped+0.043
Neg logit contributions
 worry-0.043
 matter-0.040
 harm-0.039
 expect-0.037
 test-0.037
Token bear
Token ID6842
Feature activation-5.28e-01

Pos logit contributions
amed+0.068
lic+0.056
bled+0.055
agged+0.054
hes+0.050
Neg logit contributions
 matter-0.050
 harm-0.049
 worry-0.048
 risk-0.044
 client-0.044
Token stood
Token ID6204
Feature activation-9.45e+00

Pos logit contributions
amed+0.020
agged+0.018
bled+0.017
hes+0.017
lic+0.016
Neg logit contributions
 worry-0.015
 matter-0.014
 harm-0.014
 test-0.012
 angel-0.012
Token up
Token ID510
Feature activation-4.23e+00

Pos logit contributions
amed+0.361
lic+0.341
agged+0.340
hes+0.325
acked+0.309
Neg logit contributions
 worry-0.247
 angel-0.207
 harm-0.206
 matter-0.204
 intr-0.194
Token,
Token ID11
Feature activation-1.80e+00

Pos logit contributions
amed+0.176
agged+0.163
hes+0.155
bled+0.151
lic+0.149
Neg logit contributions
 worry-0.105
 matter-0.095
 harm-0.092
 angel-0.083
 test-0.083
Token grow
Token ID1663
Feature activation-4.74e+00

Pos logit contributions
amed+0.069
agged+0.064
bled+0.059
lic+0.058
acked+0.056
Neg logit contributions
 worry-0.051
 matter-0.048
 harm-0.045
 angel-0.041
 test-0.040
Tokenled
Token ID992
Feature activation-6.49e+00

Pos logit contributions
amed+0.118
agged+0.114
 Later+0.105
hes+0.103
acking+0.097
Neg logit contributions
 worry-0.154
 harm-0.151
 intr-0.144
 matter-0.143
 happen-0.140
Token and
Token ID290
Feature activation-4.19e+00

Pos logit contributions
amed+0.235
agged+0.211
bled+0.198
hes+0.198
lic+0.198
Neg logit contributions
 worry-0.195
 matter-0.186
 harm-0.178
 angel-0.161
 test-0.159
Token with
Token ID351
Feature activation-1.12e+00

Pos logit contributions
amed+0.120
agged+0.105
bled+0.102
lic+0.099
hes+0.098
Neg logit contributions
 matter-0.083
 worry-0.082
 harm-0.078
 test-0.070
 reason-0.069
Token a
Token ID257
Feature activation-5.83e-01

Pos logit contributions
amed+0.040
agged+0.035
bled+0.034
lic+0.033
hes+0.032
Neg logit contributions
 worry-0.034
 matter-0.033
 harm-0.032
 test-0.029
 reason-0.028
Token smile
Token ID8212
Feature activation-3.77e+00

Pos logit contributions
amed+0.021
lic+0.018
agged+0.018
bled+0.017
hes+0.017
Neg logit contributions
 matter-0.017
 harm-0.016
 worry-0.016
 appearance-0.015
 client-0.014
Token.
Token ID13
Feature activation-3.39e+00

Pos logit contributions
amed+0.151
agged+0.134
bled+0.129
lic+0.125
hes+0.124
Neg logit contributions
 worry-0.097
 matter-0.096
 harm-0.091
 test-0.082
 reason-0.079
Token She
Token ID1375
Feature activation-3.01e+00

Pos logit contributions
amed+0.126
bled+0.107
agged+0.106
aming+0.098
hes+0.098
Neg logit contributions
 worry-0.092
 age-0.083
 lawyer-0.082
 matter-0.081
 harm-0.081
Token stood
Token ID6204
Feature activation-9.42e+00

Pos logit contributions
amed+0.110
agged+0.107
bled+0.097
hes+0.091
lic+0.091
Neg logit contributions
 worry-0.082
 matter-0.081
 harm-0.073
 test-0.071
 angel-0.068
Token up
Token ID510
Feature activation-4.14e+00

Pos logit contributions
amed+0.340
agged+0.332
lic+0.329
acked+0.304
tted+0.302
Neg logit contributions
 worry-0.239
 angel-0.208
 matter-0.203
 test-0.195
 harm-0.192
Token and
Token ID290
Feature activation-4.75e+00

Pos logit contributions
amed+0.162
agged+0.150
lic+0.141
bled+0.136
hes+0.135
Neg logit contributions
 worry-0.107
 matter-0.101
 harm-0.096
 test-0.092
 angel-0.088
Token looked
Token ID3114
Feature activation-7.53e+00

Pos logit contributions
amed+0.171
agged+0.163
bled+0.144
lic+0.141
acking+0.136
Neg logit contributions
 worry-0.135
 matter-0.131
 harm-0.121
 test-0.118
 angel-0.110
Token at
Token ID379
Feature activation-1.10e+00

Pos logit contributions
amed+0.266
agged+0.238
bled+0.226
hes+0.219
lic+0.218
Neg logit contributions
 worry-0.223
 matter-0.218
 harm-0.205
 test-0.193
 angel-0.190
Token her
Token ID607
Feature activation-3.04e+00

Pos logit contributions
amed+0.038
bled+0.033
agged+0.033
hes+0.030
apped+0.029
Neg logit contributions
 matter-0.032
 worry-0.032
 harm-0.030
 client-0.029
 reason-0.028
Token could
Token ID714
Feature activation+1.15e+00

Pos logit contributions
amed+0.073
agged+0.061
bled+0.059
lic+0.056
hes+0.055
Neg logit contributions
 worry-0.090
 matter-0.089
 harm-0.086
 test-0.080
 reason-0.079
Token stand
Token ID1302
Feature activation-6.78e+00

Pos logit contributions
 matter+0.030
 worry+0.027
 harm+0.027
 client+0.026
 appearance+0.026
Neg logit contributions
amed-0.041
hes-0.036
agged-0.035
bled-0.035
lic-0.032
Token still
Token ID991
Feature activation-5.69e-01

Pos logit contributions
amed+0.220
agged+0.188
bled+0.181
lic+0.174
hes+0.172
Neg logit contributions
 worry-0.224
 matter-0.223
 harm-0.214
 test-0.197
 reason-0.194
Token.
Token ID13
Feature activation-1.66e+00

Pos logit contributions
amed+0.020
agged+0.017
bled+0.017
hes+0.016
lic+0.015
Neg logit contributions
 matter-0.016
 worry-0.016
 harm-0.015
 client-0.015
 lawyer-0.014
Token He
Token ID679
Feature activation-2.97e+00

Pos logit contributions
amed+0.046
agged+0.038
bled+0.037
lic+0.035
hes+0.035
Neg logit contributions
 worry-0.062
 matter-0.062
 harm-0.060
 test-0.056
 reason-0.055
Token stood
Token ID6204
Feature activation-9.41e+00

Pos logit contributions
amed+0.102
agged+0.087
bled+0.085
lic+0.082
hes+0.082
Neg logit contributions
 worry-0.092
 matter-0.091
 harm-0.088
 test-0.080
 reason-0.079
Token quietly
Token ID12703
Feature activation-4.49e+00

Pos logit contributions
amed+0.362
lic+0.353
agged+0.346
acked+0.331
tted+0.324
Neg logit contributions
 worry-0.216
 yet-0.187
 angel-0.175
 matter-0.171
 intr-0.167
Token,
Token ID11
Feature activation-1.44e+00

Pos logit contributions
amed+0.154
agged+0.133
bled+0.128
lic+0.124
hes+0.122
Neg logit contributions
 worry-0.141
 matter-0.140
 harm-0.134
 test-0.123
 reason-0.120
Token with
Token ID351
Feature activation-5.39e-01

Pos logit contributions
amed+0.055
agged+0.050
lic+0.047
bled+0.046
acked+0.044
Neg logit contributions
 worry-0.041
 matter-0.038
 harm-0.037
 test-0.033
 appearance-0.032
Token no
Token ID645
Feature activation+3.76e+00

Pos logit contributions
amed+0.018
agged+0.015
bled+0.015
hes+0.014
lic+0.014
Neg logit contributions
 matter-0.017
 worry-0.017
 harm-0.017
 test-0.015
 client-0.015
Token movement
Token ID3356
Feature activation-1.82e+00

Pos logit contributions
 worry+0.090
 matter+0.085
 harm+0.083
 client+0.082
 lawyer+0.077
Neg logit contributions
amed-0.142
agged-0.127
bled-0.120
hes-0.117
lic-0.115
Token had
Token ID550
Feature activation-3.77e+00

Pos logit contributions
amed+0.124
agged+0.113
bled+0.106
lic+0.101
hes+0.100
Neg logit contributions
 matter-0.098
 worry-0.098
 harm-0.092
 test-0.084
 reason-0.082
Token an
Token ID281
Feature activation+6.56e-01

Pos logit contributions
amed+0.130
agged+0.112
bled+0.108
lic+0.104
hes+0.104
Neg logit contributions
 worry-0.117
 matter-0.117
 harm-0.113
 test-0.102
 angel-0.101
Token idea
Token ID2126
Feature activation+3.71e+00

Pos logit contributions
 worry+0.014
 matter+0.013
 harm+0.013
 intr+0.011
 appearance+0.011
Neg logit contributions
amed-0.029
agged-0.026
bled-0.024
hes-0.023
lic-0.023
Token.
Token ID13
Feature activation-2.60e+00

Pos logit contributions
 worry+0.125
 matter+0.124
 client+0.119
 harm+0.118
 appearance+0.116
Neg logit contributions
amed-0.103
lic-0.087
agged-0.084
bled-0.082
hes-0.080
Token She
Token ID1375
Feature activation-2.70e+00

Pos logit contributions
amed+0.078
agged+0.066
bled+0.065
hes+0.059
lic+0.059
Neg logit contributions
 worry-0.093
 matter-0.090
 harm-0.087
 test-0.082
 client-0.081
Token stood
Token ID6204
Feature activation-9.38e+00

Pos logit contributions
amed+0.098
agged+0.091
bled+0.085
hes+0.081
lic+0.080
Neg logit contributions
 matter-0.077
 worry-0.076
 harm-0.071
 test-0.066
 reason-0.064
Token up
Token ID510
Feature activation-4.05e+00

Pos logit contributions
amed+0.332
agged+0.304
lic+0.303
acked+0.278
hes+0.276
Neg logit contributions
 worry-0.262
 matter-0.241
 harm-0.231
 angel-0.228
 test-0.225
Token on
Token ID319
Feature activation-1.09e+00

Pos logit contributions
amed+0.150
agged+0.137
lic+0.130
bled+0.127
hes+0.125
Neg logit contributions
 worry-0.114
 matter-0.109
 harm-0.104
 test-0.099
 exam-0.094
Token the
Token ID262
Feature activation-2.87e+00

Pos logit contributions
amed+0.037
bled+0.031
agged+0.031
hes+0.029
 Then+0.028
Neg logit contributions
 worry-0.032
 matter-0.032
 harm-0.030
 expect-0.028
 risk-0.027
Token chair
Token ID5118
Feature activation-3.27e+00

Pos logit contributions
amed+0.115
bled+0.098
agged+0.097
lic+0.092
hes+0.092
Neg logit contributions
 matter-0.074
 worry-0.073
 harm-0.069
 reason-0.060
 client-0.059
Token,
Token ID11
Feature activation-1.46e+00

Pos logit contributions
amed+0.100
agged+0.084
bled+0.080
lic+0.077
hes+0.076
Neg logit contributions
 matter-0.114
 worry-0.114
 harm-0.109
 test-0.100
 reason-0.100
Token open
Token ID1280
Feature activation-2.54e+00

Pos logit contributions
amed+0.085
agged+0.074
bled+0.072
lic+0.069
hes+0.069
Neg logit contributions
 worry-0.072
 matter-0.071
 harm-0.068
 test-0.062
 reason-0.061
Token land
Token ID1956
Feature activation-2.46e+00

Pos logit contributions
amed+0.092
agged+0.079
bled+0.076
lic+0.074
acking+0.073
Neg logit contributions
 worry-0.072
 matter-0.071
 harm-0.068
 client-0.061
 reason-0.061
Token.
Token ID13
Feature activation-3.05e+00

Pos logit contributions
amed+0.092
agged+0.079
bled+0.078
lic+0.075
hes+0.074
Neg logit contributions
 matter-0.071
 worry-0.069
 harm-0.065
 reason-0.060
 test-0.059
Token He
Token ID679
Feature activation-3.53e+00

Pos logit contributions
amed+0.094
bled+0.080
agged+0.078
lic+0.075
hes+0.075
Neg logit contributions
 client-0.100
 worry-0.100
 matter-0.100
 lawyer-0.097
 harm-0.096
Token proudly
Token ID23017
Feature activation-5.00e+00

Pos logit contributions
amed+0.126
agged+0.112
bled+0.107
lic+0.103
hes+0.102
Neg logit contributions
 worry-0.105
 matter-0.104
 harm-0.100
 test-0.090
 reason-0.089
Token stood
Token ID6204
Feature activation-9.37e+00

Pos logit contributions
amed+0.170
agged+0.152
bled+0.142
lic+0.137
hes+0.136
Neg logit contributions
 worry-0.156
 matter-0.152
 harm-0.147
 test-0.136
 reason-0.131
Token tall
Token ID7331
Feature activation-1.85e+00

Pos logit contributions
amed+0.367
lic+0.359
agged+0.345
hes+0.334
acked+0.326
Neg logit contributions
 worry-0.222
 angel-0.185
 matter-0.184
 harm-0.181
 yet-0.176
Token,
Token ID11
Feature activation-1.22e+00

Pos logit contributions
amed+0.055
agged+0.046
bled+0.044
lic+0.042
hes+0.041
Neg logit contributions
 worry-0.065
 matter-0.065
 harm-0.062
 reason-0.057
 test-0.057
Token holding
Token ID4769
Feature activation-5.33e+00

Pos logit contributions
amed+0.045
agged+0.043
lic+0.040
bled+0.040
acked+0.039
Neg logit contributions
 worry-0.034
 matter-0.033
 harm-0.031
 test-0.028
 angel-0.028
Token the
Token ID262
Feature activation-2.65e+00

Pos logit contributions
amed+0.195
agged+0.171
lic+0.168
bled+0.161
hes+0.159
Neg logit contributions
 worry-0.151
 matter-0.150
 harm-0.143
 test-0.132
 anyone-0.127
Token spear
Token ID17514
Feature activation-1.78e+00

Pos logit contributions
amed+0.097
agged+0.084
bled+0.082
lic+0.076
hes+0.075
Neg logit contributions
 worry-0.075
 matter-0.073
 harm-0.071
 client-0.064
 exam-0.063
Token,
Token ID11
Feature activation+1.52e+00

Pos logit contributions
 matter+0.014
 harm+0.014
 worry+0.013
 client+0.012
grand+0.012
Neg logit contributions
amed-0.020
lic-0.017
agged-0.017
bled-0.016
acking-0.016
Token shiny
Token ID22441
Feature activation-9.98e-01

Pos logit contributions
 worry+0.040
 matter+0.040
 harm+0.038
 test+0.034
 reason+0.034
Neg logit contributions
amed-0.059
agged-0.051
bled-0.050
lic-0.049
hes-0.048
Token lock
Token ID5793
Feature activation-3.00e+00

Pos logit contributions
amed+0.041
agged+0.036
bled+0.036
lic+0.035
hes+0.034
Neg logit contributions
 worry-0.022
 matter-0.022
 harm-0.022
 client-0.018
 reason-0.018
Token.
Token ID13
Feature activation-2.50e+00

Pos logit contributions
amed+0.102
agged+0.089
bled+0.085
lic+0.082
hes+0.081
Neg logit contributions
 worry-0.093
 matter-0.093
 harm-0.089
 test-0.082
 angel-0.080
Token She
Token ID1375
Feature activation-2.21e+00

Pos logit contributions
amed+0.084
agged+0.073
bled+0.072
lic+0.067
hes+0.066
Neg logit contributions
 worry-0.079
 matter-0.073
 anyone-0.070
 lawyer-0.070
 yet-0.070
Token stood
Token ID6204
Feature activation-9.35e+00

Pos logit contributions
agged+0.083
amed+0.082
hes+0.075
bled+0.075
acking+0.072
Neg logit contributions
 worry-0.059
 matter-0.057
 harm-0.050
 wish-0.048
 reason-0.047
Token there
Token ID612
Feature activation-3.31e+00

Pos logit contributions
amed+0.368
agged+0.365
lic+0.352
acked+0.340
hes+0.335
Neg logit contributions
 worry-0.216
 angel-0.180
 matter-0.178
 yet-0.175
 harm-0.165
Token looking
Token ID2045
Feature activation-4.27e+00

Pos logit contributions
amed+0.115
agged+0.102
bled+0.096
lic+0.094
hes+0.093
Neg logit contributions
 worry-0.102
 matter-0.100
 harm-0.095
 test-0.087
 reason-0.086
Token at
Token ID379
Feature activation+1.45e-01

Pos logit contributions
amed+0.147
agged+0.128
bled+0.123
lic+0.119
hes+0.118
Neg logit contributions
 worry-0.134
 matter-0.132
 harm-0.126
 test-0.116
 reason-0.113
Token the
Token ID262
Feature activation-2.02e+00

Pos logit contributions
 worry+0.004
 matter+0.004
 harm+0.004
 client+0.004
 exam+0.004
Neg logit contributions
amed-0.004
bled-0.004
agged-0.004
hes-0.004
 Then-0.003
Token lock
Token ID5793
Feature activation-2.73e+00

Pos logit contributions
amed+0.074
bled+0.059
agged+0.058
lic+0.058
hes+0.056
Neg logit contributions
 worry-0.054
 harm-0.052
 matter-0.052
 client-0.045
 conflicts-0.045
Token the
Token ID262
Feature activation+2.18e-01

Pos logit contributions
 matter+0.043
 worry+0.042
 client+0.039
 harm+0.039
 angel+0.038
Neg logit contributions
amed-0.046
bled-0.041
agged-0.040
hes-0.040
lic-0.036
Token kiss
Token ID9245
Feature activation-7.34e-01

Pos logit contributions
 harm+0.005
 worry+0.005
 conflicts+0.005
 appearance+0.005
 matter+0.005
Neg logit contributions
amed-0.007
agged-0.006
hes-0.006
bled-0.006
lic-0.006
Token.
Token ID13
Feature activation-2.61e+00

Pos logit contributions
amed+0.025
bled+0.020
agged+0.020
hes+0.020
lic+0.020
Neg logit contributions
 worry-0.023
 matter-0.023
 harm-0.022
 appearance-0.020
 risk-0.020
Token The
Token ID383
Feature activation-1.84e+00

Pos logit contributions
amed+0.080
bled+0.067
agged+0.065
lic+0.060
hes+0.057
Neg logit contributions
 worry-0.092
 matter-0.092
 harm-0.087
 client-0.083
 test-0.083
Token farmer
Token ID18739
Feature activation+5.44e-01

Pos logit contributions
amed+0.076
agged+0.066
bled+0.064
hes+0.062
lic+0.062
Neg logit contributions
 worry-0.044
 matter-0.043
 harm-0.043
 test-0.037
 appearance-0.037
Token stood
Token ID6204
Feature activation-9.34e+00

Pos logit contributions
 matter+0.018
 worry+0.018
 harm+0.017
 risk+0.016
 reason+0.016
Neg logit contributions
amed-0.018
bled-0.014
agged-0.014
hes-0.014
lic-0.014
Token and
Token ID290
Feature activation-4.65e+00

Pos logit contributions
amed+0.333
lic+0.308
agged+0.303
hes+0.283
acked+0.275
Neg logit contributions
 worry-0.264
 matter-0.233
 angel-0.228
 harm-0.224
 test-0.213
Token watched
Token ID7342
Feature activation-2.02e+00

Pos logit contributions
amed+0.158
agged+0.137
bled+0.131
lic+0.125
hes+0.123
Neg logit contributions
 worry-0.148
 matter-0.147
 harm-0.141
 test-0.129
 reason-0.126
Token with
Token ID351
Feature activation-5.22e-01

Pos logit contributions
amed+0.074
agged+0.065
bled+0.064
lic+0.061
hes+0.061
Neg logit contributions
 matter-0.057
 worry-0.057
 harm-0.054
 test-0.049
 client-0.049
Token a
Token ID257
Feature activation-4.98e-01

Pos logit contributions
amed+0.018
agged+0.015
bled+0.015
hes+0.015
lic+0.014
Neg logit contributions
 matter-0.016
 worry-0.015
 harm-0.015
 appearance-0.014
 client-0.014
Token big
Token ID1263
Feature activation+7.39e-01

Pos logit contributions
amed+0.019
agged+0.016
lic+0.016
bled+0.015
hes+0.015
Neg logit contributions
 matter-0.014
 worry-0.013
 harm-0.013
 appearance-0.012
 client-0.012
Token\n
Token ID198
Feature activation-2.24e+00

Pos logit contributions
amed+0.074
bled+0.063
agged+0.063
lic+0.059
hes+0.059
Neg logit contributions
 worry-0.063
 matter-0.060
 harm-0.058
 test-0.056
 reason-0.055
Token\n
Token ID198
Feature activation-1.67e+00

Pos logit contributions
amed+0.079
agged+0.072
bled+0.068
lic+0.063
hes+0.063
Neg logit contributions
 worry-0.058
 harm-0.057
 reason-0.054
 matter-0.053
 test-0.052
TokenJack
Token ID14295
Feature activation-1.30e+00

Pos logit contributions
amed+0.056
agged+0.054
bled+0.049
hes+0.047
lic+0.046
Neg logit contributions
 matter-0.047
 worry-0.047
 reason-0.043
 harm-0.042
 test-0.041
Token and
Token ID290
Feature activation-3.11e+00

Pos logit contributions
amed+0.045
agged+0.041
bled+0.039
hes+0.037
lic+0.037
Neg logit contributions
 worry-0.039
 matter-0.038
 harm-0.036
 test-0.033
 reason-0.032
Token Jill
Token ID22772
Feature activation-1.23e+00

Pos logit contributions
amed+0.104
agged+0.086
bled+0.086
lic+0.083
hes+0.079
Neg logit contributions
 worry-0.098
 harm-0.097
 matter-0.096
 risk-0.088
 problems-0.087
Token stood
Token ID6204
Feature activation-9.33e+00

Pos logit contributions
amed+0.043
agged+0.039
bled+0.037
lic+0.036
hes+0.035
Neg logit contributions
 worry-0.037
 matter-0.036
 harm-0.034
 test-0.032
 reason-0.031
Token in
Token ID287
Feature activation-2.74e+00

Pos logit contributions
amed+0.349
lic+0.347
agged+0.323
acked+0.304
hes+0.302
Neg logit contributions
 worry-0.243
 harm-0.198
 matter-0.198
 angel-0.196
 test-0.189
Token the
Token ID262
Feature activation-3.57e+00

Pos logit contributions
amed+0.108
agged+0.094
bled+0.092
hes+0.088
lic+0.086
Neg logit contributions
 matter-0.068
 worry-0.067
 harm-0.065
 client-0.061
 exam-0.060
Token middle
Token ID3504
Feature activation-2.59e-01

Pos logit contributions
amed+0.150
agged+0.136
lic+0.133
bled+0.131
hes+0.123
Neg logit contributions
 worry-0.090
 matter-0.085
 harm-0.082
 test-0.078
 risk-0.076
Token of
Token ID286
Feature activation+2.53e+00

Pos logit contributions
amed+0.008
agged+0.006
bled+0.006
lic+0.006
hes+0.006
Neg logit contributions
 matter-0.009
 worry-0.009
 harm-0.009
 client-0.008
 test-0.008
Token their
Token ID511
Feature activation-2.34e+00

Pos logit contributions
 matter+0.063
 worry+0.063
 harm+0.059
 client+0.054
 appearance+0.053
Neg logit contributions
amed-0.101
bled-0.088
agged-0.088
hes-0.084
lic-0.082
Token hard
Token ID1327
Feature activation+9.37e-01

Pos logit contributions
amed+0.031
bled+0.026
agged+0.026
hes+0.025
lic+0.025
Neg logit contributions
 matter-0.026
 worry-0.026
 harm-0.025
 appearance-0.023
 client-0.022
Token and
Token ID290
Feature activation-1.89e+00

Pos logit contributions
 worry+0.027
 matter+0.027
 harm+0.026
 angel+0.024
 test+0.024
Neg logit contributions
amed-0.033
agged-0.029
bled-0.028
lic-0.028
hes-0.026
Token smiled
Token ID13541
Feature activation-6.79e+00

Pos logit contributions
amed+0.068
agged+0.058
bled+0.057
lic+0.055
hes+0.054
Neg logit contributions
 worry-0.056
 matter-0.056
 harm-0.054
 test-0.049
 risk-0.048
Token.
Token ID13
Feature activation-2.11e+00

Pos logit contributions
amed+0.286
agged+0.262
hes+0.253
bled+0.247
lic+0.246
Neg logit contributions
 worry-0.169
 matter-0.150
 harm-0.140
 angel-0.130
 appearance-0.127
Token He
Token ID679
Feature activation-2.74e+00

Pos logit contributions
amed+0.063
agged+0.053
bled+0.053
hes+0.048
lic+0.046
Neg logit contributions
 worry-0.074
 matter-0.071
 harm-0.067
 lawyer-0.067
 client-0.067
Token stood
Token ID6204
Feature activation-9.32e+00

Pos logit contributions
amed+0.104
agged+0.096
bled+0.091
hes+0.087
lic+0.085
Neg logit contributions
 matter-0.075
 worry-0.074
 harm-0.069
 test-0.062
 reason-0.061
Token back
Token ID736
Feature activation-4.69e+00

Pos logit contributions
amed+0.329
agged+0.325
lic+0.319
acked+0.298
ashed+0.293
Neg logit contributions
 worry-0.239
 angel-0.198
 matter-0.198
 test-0.189
 harm-0.187
Token and
Token ID290
Feature activation-4.53e+00

Pos logit contributions
amed+0.153
agged+0.132
bled+0.126
lic+0.122
hes+0.120
Neg logit contributions
 worry-0.155
 matter-0.153
 harm-0.148
 test-0.137
 reason-0.133
Token watched
Token ID7342
Feature activation-2.32e+00

Pos logit contributions
amed+0.155
agged+0.138
bled+0.128
lic+0.126
hes+0.122
Neg logit contributions
 worry-0.144
 matter-0.141
 harm-0.135
 test-0.124
 appearance-0.119
Token the
Token ID262
Feature activation-1.88e+00

Pos logit contributions
amed+0.091
agged+0.080
bled+0.078
lic+0.075
hes+0.075
Neg logit contributions
 worry-0.061
 matter-0.061
 harm-0.058
 test-0.051
 reason-0.051
Token worm
Token ID23849
Feature activation-1.30e+00

Pos logit contributions
amed+0.073
lic+0.062
bled+0.061
agged+0.060
hes+0.059
Neg logit contributions
 worry-0.046
 matter-0.046
 harm-0.044
 appearance-0.038
 happen-0.038
Token big
Token ID1263
Feature activation+4.90e-01

Pos logit contributions
amed+0.015
lic+0.013
bled+0.012
hes+0.011
agged+0.011
Neg logit contributions
 harm-0.012
 matter-0.012
 worry-0.012
 expect-0.011
 client-0.011
Token round
Token ID2835
Feature activation-1.14e+00

Pos logit contributions
 matter+0.012
 harm+0.012
 worry+0.012
 client+0.011
grand+0.011
Neg logit contributions
amed-0.018
bled-0.015
agged-0.015
lic-0.015
acking-0.014
Token rock
Token ID3881
Feature activation-3.30e+00

Pos logit contributions
amed+0.049
agged+0.043
bled+0.042
lic+0.041
hes+0.041
Neg logit contributions
 worry-0.026
 matter-0.026
 harm-0.024
 test-0.021
 reason-0.021
Token.
Token ID13
Feature activation-2.63e+00

Pos logit contributions
amed+0.114
agged+0.098
bled+0.096
lic+0.092
hes+0.091
Neg logit contributions
 worry-0.101
 matter-0.101
 harm-0.096
 test-0.087
 reason-0.087
Token He
Token ID679
Feature activation-3.11e+00

Pos logit contributions
amed+0.094
agged+0.078
bled+0.078
aming+0.071
hes+0.070
Neg logit contributions
 worry-0.079
 matter-0.076
 anyone-0.073
 test-0.072
 harm-0.070
Token stood
Token ID6204
Feature activation-9.32e+00

Pos logit contributions
amed+0.113
agged+0.104
bled+0.096
hes+0.092
lic+0.092
Neg logit contributions
 matter-0.090
 worry-0.089
 harm-0.085
 test-0.077
 reason-0.075
Token on
Token ID319
Feature activation-1.77e+00

Pos logit contributions
amed+0.359
agged+0.337
lic+0.333
acked+0.310
hes+0.310
Neg logit contributions
 worry-0.239
 matter-0.207
 harm-0.201
 angel-0.197
 yet-0.196
Token it
Token ID340
Feature activation-6.13e+00

Pos logit contributions
amed+0.071
agged+0.062
bled+0.062
hes+0.059
lic+0.058
Neg logit contributions
 worry-0.044
 matter-0.044
 harm-0.041
 client-0.037
 test-0.036
Token and
Token ID290
Feature activation-4.45e+00

Pos logit contributions
amed+0.205
agged+0.178
bled+0.170
lic+0.164
hes+0.163
Neg logit contributions
 worry-0.197
 matter-0.195
 harm-0.187
 test-0.173
 reason-0.168
Token tried
Token ID3088
Feature activation-5.02e+00

Pos logit contributions
amed+0.159
agged+0.143
bled+0.131
lic+0.129
hes+0.126
Neg logit contributions
 worry-0.134
 matter-0.134
 harm-0.127
 test-0.116
 reason-0.112
Token to
Token ID284
Feature activation+7.14e-01

Pos logit contributions
amed+0.200
agged+0.175
bled+0.173
lic+0.166
hes+0.166
Neg logit contributions
 matter-0.126
 worry-0.125
 harm-0.117
 angel-0.104
 test-0.104
Token\n
Token ID198
Feature activation-9.96e-01

Pos logit contributions
amed+0.085
agged+0.073
bled+0.071
hes+0.069
lic+0.069
Neg logit contributions
 worry-0.075
 matter-0.071
 harm-0.069
 anyone-0.065
 reason-0.065
Token\n
Token ID198
Feature activation-6.18e-01

Pos logit contributions
amed+0.031
agged+0.027
bled+0.027
hes+0.025
lic+0.024
Neg logit contributions
 worry-0.033
 matter-0.032
 harm-0.031
 reason-0.029
 test-0.028
TokenKat
Token ID25881
Feature activation+2.85e+00

Pos logit contributions
amed+0.018
agged+0.016
bled+0.015
hes+0.014
lic+0.014
Neg logit contributions
 worry-0.022
 matter-0.022
 harm-0.021
 reason-0.020
 test-0.020
Tokenie
Token ID494
Feature activation-3.03e-01

Pos logit contributions
 worry+0.061
 matter+0.059
 harm+0.058
 risk+0.054
 test+0.053
Neg logit contributions
amed-0.130
agged-0.110
bled-0.109
lic-0.106
acked-0.105
Token be
Token ID307
Feature activation-3.87e+00

Pos logit contributions
amed+0.011
agged+0.009
bled+0.009
lic+0.009
hes+0.008
Neg logit contributions
 worry-0.009
 matter-0.009
 harm-0.009
 test-0.008
 reason-0.008
Tokenamed
Token ID2434
Feature activation-9.32e+00

Pos logit contributions
lic+0.071
agged+0.069
amed+0.066
uddled+0.066
bled+0.065
Neg logit contributions
 worry-0.163
 matter-0.153
 harm-0.152
 angel-0.141
 reason-0.141
Token with
Token ID351
Feature activation-1.20e+00

Pos logit contributions
amed+0.307
agged+0.265
bled+0.254
lic+0.245
hes+0.243
Neg logit contributions
 worry-0.305
 matter-0.301
 harm-0.290
 test-0.266
 reason-0.262
Token joy
Token ID8716
Feature activation+3.03e-01

Pos logit contributions
amed+0.045
agged+0.040
bled+0.038
lic+0.037
hes+0.037
Neg logit contributions
 matter-0.033
 worry-0.033
 harm-0.032
 test-0.029
 client-0.028
Token.
Token ID13
Feature activation-1.89e+00

Pos logit contributions
 matter+0.009
 worry+0.009
 harm+0.009
 test+0.008
 client+0.008
Neg logit contributions
amed-0.011
agged-0.009
bled-0.009
lic-0.009
hes-0.009
Token "
Token ID366
Feature activation-4.76e-01

Pos logit contributions
amed+0.057
agged+0.049
bled+0.046
hes+0.044
lic+0.043
Neg logit contributions
 worry-0.067
 matter-0.065
 harm-0.062
 test-0.060
 client-0.059
TokenLook
Token ID8567
Feature activation+1.27e+00

Pos logit contributions
amed+0.015
agged+0.013
bled+0.012
lic+0.012
hes+0.012
Neg logit contributions
 matter-0.016
 worry-0.016
 harm-0.015
 test-0.014
 reason-0.014
Token lonely
Token ID21757
Feature activation+1.41e+00

Pos logit contributions
amed+0.029
agged+0.026
hes+0.025
bled+0.025
lic+0.024
Neg logit contributions
 matter-0.022
 worry-0.021
 harm-0.021
 reason-0.019
 test-0.019
Token.
Token ID13
Feature activation-2.55e+00

Pos logit contributions
 matter+0.053
 worry+0.050
 harm+0.047
 risk+0.046
 test+0.046
Neg logit contributions
amed-0.034
agged-0.031
hes-0.028
apped-0.027
bled-0.026
Token\n
Token ID198
Feature activation-1.58e+00

Pos logit contributions
amed+0.075
bled+0.061
agged+0.059
lic+0.054
acking+0.052
Neg logit contributions
 worry-0.095
 matter-0.094
 harm-0.091
 test-0.084
 client-0.084
Token\n
Token ID198
Feature activation-1.06e+00

Pos logit contributions
amed+0.059
agged+0.053
bled+0.052
lic+0.045
apped+0.045
Neg logit contributions
 harm-0.044
 worry-0.043
 matter-0.040
 problems-0.038
 test-0.038
Token"
Token ID1
Feature activation+4.43e-01

Pos logit contributions
amed+0.041
agged+0.039
bled+0.038
apped+0.035
lic+0.034
Neg logit contributions
 matter-0.025
 worry-0.025
 harm-0.024
 lawyer-0.022
 problems-0.022
TokenCan
Token ID6090
Feature activation+6.74e+00

Pos logit contributions
 matter+0.014
 worry+0.014
 harm+0.014
 reason+0.012
 test+0.012
Neg logit contributions
amed-0.015
agged-0.013
bled-0.012
lic-0.012
hes-0.012
Token we
Token ID356
Feature activation+3.90e+00

Pos logit contributions
 matter+0.155
 harm+0.145
 perspective+0.144
 worry+0.136
 reason+0.130
Neg logit contributions
amed-0.244
bled-0.241
agged-0.232
He-0.226
hes-0.222
Token be
Token ID307
Feature activation-1.47e+00

Pos logit contributions
 matter+0.109
 harm+0.105
 problems+0.101
 conflicts+0.101
 worry+0.098
Neg logit contributions
amed-0.125
agged-0.111
hes-0.105
lic-0.103
bled-0.101
Token your
Token ID534
Feature activation+1.30e+00

Pos logit contributions
amed+0.059
agged+0.052
hes+0.049
bled+0.049
lic+0.047
Neg logit contributions
 matter-0.040
 worry-0.038
 harm-0.036
 risk-0.034
 test-0.033
Token friends
Token ID2460
Feature activation+4.95e+00

Pos logit contributions
 matter+0.043
 happen+0.038
 harm+0.038
 worry+0.037
 exam+0.036
Neg logit contributions
amed-0.034
agged-0.030
apped-0.029
 Then-0.028
hes-0.028
Token?"
Token ID1701
Feature activation+2.49e-01

Pos logit contributions
 matter+0.122
 harm+0.119
 worry+0.116
 client+0.109
 risk+0.107
Neg logit contributions
amed-0.181
agged-0.162
hes-0.157
lic-0.151
bled-0.148
Token crash
Token ID7014
Feature activation-3.66e+00

Pos logit contributions
 worry+0.007
 matter+0.007
 harm+0.006
 test+0.006
 happen+0.006
Neg logit contributions
amed-0.011
agged-0.010
bled-0.010
hes-0.009
lic-0.009
Token.
Token ID13
Feature activation-2.83e+00

Pos logit contributions
amed+0.147
agged+0.130
bled+0.126
lic+0.123
hes+0.121
Neg logit contributions
 worry-0.092
 matter-0.092
 harm-0.087
 test-0.078
 reason-0.076
Token\n
Token ID198
Feature activation-1.98e+00

Pos logit contributions
amed+0.111
bled+0.093
agged+0.090
aming+0.084
acked+0.083
Neg logit contributions
 worry-0.073
 harm-0.066
 matter-0.066
 anyone-0.065
 you-0.063
Token\n
Token ID198
Feature activation-1.27e+00

Pos logit contributions
amed+0.081
agged+0.071
bled+0.070
aming+0.062
lic+0.062
Neg logit contributions
 harm-0.047
 worry-0.046
 you-0.041
 reason-0.041
 test-0.040
Token"
Token ID1
Feature activation+4.48e-01

Pos logit contributions
amed+0.052
agged+0.049
bled+0.046
lic+0.045
apped+0.043
Neg logit contributions
 you-0.027
 worry-0.026
 matter-0.023
 else-0.023
 we-0.022
TokenOh
Token ID5812
Feature activation+6.82e+00

Pos logit contributions
 matter+0.015
 worry+0.015
 harm+0.015
 reason+0.013
 test+0.013
Neg logit contributions
amed-0.014
agged-0.012
bled-0.012
lic-0.011
hes-0.011
Token no
Token ID645
Feature activation+6.45e+00

Pos logit contributions
 harm+0.214
 matter+0.195
 client+0.187
 conflicts+0.182
 problems+0.179
Neg logit contributions
amed-0.220
bled-0.195
lic-0.188
agged-0.184
apped-0.181
Token!"
Token ID2474
Feature activation+6.82e-01

Pos logit contributions
 matter+0.158
 worry+0.158
 harm+0.158
 client+0.154
 angel+0.146
Neg logit contributions
amed-0.231
agged-0.201
lic-0.200
bled-0.196
acking-0.194
Token Lily
Token ID20037
Feature activation-2.05e+00

Pos logit contributions
 worry+0.019
 matter+0.019
 harm+0.018
 test+0.017
 reason+0.016
Neg logit contributions
amed-0.025
agged-0.022
bled-0.022
lic-0.021
hes-0.021
Token says
Token ID1139
Feature activation-1.87e+00

Pos logit contributions
amed+0.079
agged+0.069
bled+0.066
hes+0.066
lic+0.065
Neg logit contributions
 matter-0.054
 worry-0.052
 harm-0.051
 test-0.045
 reason-0.045
Token.
Token ID13
Feature activation-2.77e+00

Pos logit contributions
amed+0.071
agged+0.063
bled+0.062
hes+0.059
lic+0.058
Neg logit contributions
 matter-0.051
 worry-0.050
 harm-0.048
 test-0.043
 reason-0.042
Token happy
Token ID3772
Feature activation+3.77e+00

Pos logit contributions
 harm+0.056
 matter+0.055
 worry+0.053
 happen+0.051
 problems+0.049
Neg logit contributions
amed-0.063
agged-0.051
lic-0.051
bled-0.051
hes-0.049
Token 3
Token ID513
Feature activation+4.64e+00

Pos logit contributions
 matter+0.093
 harm+0.092
 worry+0.090
 problems+0.081
 risk+0.078
Neg logit contributions
amed-0.146
agged-0.125
bled-0.123
lic-0.122
apped-0.120
Token year
Token ID614
Feature activation+4.15e+00

Pos logit contributions
 matter+0.121
 worry+0.120
 appearance+0.114
 harm+0.112
 problems+0.106
Neg logit contributions
amed-0.172
agged-0.152
lic-0.145
acked-0.145
acking-0.145
Token old
Token ID1468
Feature activation+1.67e+00

Pos logit contributions
 matter+0.118
 worry+0.116
 harm+0.111
 reason+0.099
 test+0.098
Neg logit contributions
amed-0.154
agged-0.133
lic-0.129
bled-0.128
acking-0.122
Token.
Token ID13
Feature activation+1.03e+00

Pos logit contributions
 matter+0.041
 harm+0.040
 worry+0.039
 risk+0.035
 problems+0.034
Neg logit contributions
amed-0.066
agged-0.057
bled-0.056
apped-0.056
lic-0.055
Token Most
Token ID4042
Feature activation+6.89e+00

Pos logit contributions
 matter+0.036
 worry+0.036
 harm+0.034
 test+0.031
 exam+0.031
Neg logit contributions
amed-0.033
bled-0.028
agged-0.027
hes-0.025
lic-0.025
Token days
Token ID1528
Feature activation+5.39e+00

Pos logit contributions
 worry+0.213
 harm+0.208
 matter+0.202
 appearance+0.200
 risk+0.193
Neg logit contributions
amed-0.218
agged-0.196
apped-0.177
hes-0.177
bled-0.176
Token,
Token ID11
Feature activation+2.61e+00

Pos logit contributions
 worry+0.168
 matter+0.166
 harm+0.164
 test+0.149
 risk+0.148
Neg logit contributions
amed-0.176
agged-0.149
bled-0.146
acked-0.140
hes-0.140
Token he
Token ID339
Feature activation-6.54e-01

Pos logit contributions
 harm+0.074
 worry+0.073
 matter+0.073
 test+0.065
 appearance+0.064
Neg logit contributions
amed-0.091
agged-0.078
bled-0.076
lic-0.075
hes-0.074
Token loved
Token ID6151
Feature activation+2.81e+00

Pos logit contributions
amed+0.022
agged+0.018
hes+0.018
bled+0.018
lic+0.017
Neg logit contributions
 worry-0.021
 matter-0.020
 harm-0.019
 risk-0.018
 reason-0.018
Token to
Token ID284
Feature activation+6.26e+00

Pos logit contributions
 matter+0.085
 worry+0.085
 harm+0.079
 risk+0.076
 expect+0.074
Neg logit contributions
amed-0.085
hes-0.073
bled-0.070
agged-0.070
lic-0.069
Token "
Token ID366
Feature activation+3.87e-01

Pos logit contributions
 worry+0.001
 matter+0.001
 test+0.001
 lawyer+0.001
 angel+0.001
Neg logit contributions
amed-0.001
agged-0.001
acked-0.001
bled-0.001
acking-0.001
TokenWhat
Token ID2061
Feature activation+8.58e+00

Pos logit contributions
 matter+0.013
 worry+0.012
 harm+0.012
 reason+0.011
 test+0.011
Neg logit contributions
amed-0.013
agged-0.011
bled-0.011
lic-0.010
hes-0.010
Token is
Token ID318
Feature activation+4.40e+00

Pos logit contributions
 harm+0.258
 worry+0.227
 matter+0.222
 appearance+0.219
 risk+0.214
Neg logit contributions
amed-0.268
hes-0.264
lic-0.233
apped-0.225
agged-0.216
Token it
Token ID340
Feature activation+2.48e+00

Pos logit contributions
 worry+0.126
 matter+0.115
 harm+0.109
 reason+0.103
 appearance+0.102
Neg logit contributions
amed-0.155
bled-0.146
hes-0.144
agged-0.137
ocked-0.125
Token?
Token ID30
Feature activation+3.47e-01

Pos logit contributions
 matter+0.050
 worry+0.049
 harm+0.049
 client+0.041
 problems+0.040
Neg logit contributions
amed-0.110
agged-0.098
bled-0.097
lic-0.096
hes-0.094
Token Can
Token ID1680
Feature activation+7.22e+00

Pos logit contributions
 matter+0.009
 worry+0.009
 harm+0.009
 test+0.008
 reason+0.008
Neg logit contributions
amed-0.013
agged-0.012
bled-0.011
lic-0.011
hes-0.011
Token I
Token ID314
Feature activation+4.08e+00

Pos logit contributions
 matter+0.193
 harm+0.187
 problems+0.176
 worry+0.173
 reason+0.167
Neg logit contributions
amed-0.248
bled-0.240
agged-0.226
apped-0.216
hes-0.215
Token help
Token ID1037
Feature activation+4.82e+00

Pos logit contributions
 matter+0.104
 worry+0.100
 harm+0.097
 problems+0.090
 appearance+0.089
Neg logit contributions
amed-0.155
agged-0.138
lic-0.133
hes-0.132
bled-0.130
Token?"
Token ID1701
Feature activation-6.61e-01

Pos logit contributions
 matter+0.098
 worry+0.096
 harm+0.087
 problems+0.082
 appearance+0.082
Neg logit contributions
amed-0.207
bled-0.186
hes-0.182
agged-0.179
lic-0.176
Token Her
Token ID2332
Feature activation-1.87e+00

Pos logit contributions
amed+0.021
agged+0.017
bled+0.017
lic+0.016
hes+0.016
Neg logit contributions
 matter-0.023
 worry-0.023
 harm-0.022
 test-0.020
 risk-0.020
Token mom
Token ID1995
Feature activation-8.29e-02

Pos logit contributions
amed+0.071
bled+0.061
agged+0.061
hes+0.058
lic+0.057
Neg logit contributions
 matter-0.051
 harm-0.050
 risk-0.046
 worry-0.046
 intr-0.043
Token "
Token ID366
Feature activation+7.99e-01

Pos logit contributions
 worry+0.048
 matter+0.046
 harm+0.044
 test+0.041
 reason+0.041
Neg logit contributions
amed-0.041
agged-0.036
bled-0.033
hes-0.032
lic-0.031
TokenWhat
Token ID2061
Feature activation+8.29e+00

Pos logit contributions
 worry+0.027
 matter+0.027
 harm+0.026
 test+0.024
 risk+0.023
Neg logit contributions
amed-0.026
agged-0.021
bled-0.021
lic-0.020
hes-0.020
Token is
Token ID318
Feature activation+4.58e+00

Pos logit contributions
 harm+0.263
 matter+0.248
 avail+0.245
 worry+0.238
 appearance+0.236
Neg logit contributions
hes-0.226
amed-0.219
lic-0.200
apped-0.189
They-0.189
Token this
Token ID428
Feature activation+7.01e+00

Pos logit contributions
 worry+0.154
 matter+0.142
 reason+0.130
 harm+0.130
 promise+0.126
Neg logit contributions
amed-0.142
bled-0.130
hes-0.130
lic-0.114
agged-0.108
Token?
Token ID30
Feature activation+5.89e-01

Pos logit contributions
 worry+0.175
 harm+0.160
 matter+0.154
 client+0.142
 intr+0.138
Neg logit contributions
amed-0.269
lic-0.242
hes-0.235
bled-0.221
agged-0.212
Token What
Token ID1867
Feature activation+7.49e+00

Pos logit contributions
 matter+0.018
 worry+0.018
 harm+0.017
 reason+0.015
 test+0.015
Neg logit contributions
amed-0.021
agged-0.018
lic-0.017
bled-0.017
hes-0.017
Token do
Token ID466
Feature activation+5.98e+00

Pos logit contributions
 harm+0.230
 matter+0.222
 worry+0.218
 avail+0.215
 appearance+0.215
Neg logit contributions
hes-0.216
amed-0.204
lic-0.184
They-0.180
apped-0.175
Token these
Token ID777
Feature activation+5.58e+00

Pos logit contributions
 worry+0.130
 appearance+0.124
 matter+0.124
 perspective+0.121
 harm+0.114
Neg logit contributions
amed-0.225
hes-0.218
lic-0.210
agged-0.200
bled-0.199
Token words
Token ID2456
Feature activation+4.01e+00

Pos logit contributions
 worry+0.145
 harm+0.131
 matter+0.130
 client+0.119
 appearance+0.115
Neg logit contributions
amed-0.208
hes-0.186
lic-0.180
bled-0.171
agged-0.170
Token mean
Token ID1612
Feature activation+3.55e+00

Pos logit contributions
 matter+0.089
 worry+0.089
 harm+0.087
 client+0.082
 reason+0.078
Neg logit contributions
amed-0.161
hes-0.142
agged-0.138
bled-0.138
lic-0.137
Token?"
Token ID1701
Feature activation-1.39e+00

Pos logit contributions
 worry+0.081
 harm+0.077
 matter+0.076
 intr+0.070
 risk+0.068
Neg logit contributions
amed-0.144
lic-0.124
hes-0.120
agged-0.119
bled-0.116
Token bad
Token ID2089
Feature activation+6.00e+00

Pos logit contributions
 matter+0.065
 worry+0.064
 harm+0.063
 appearance+0.054
 problems+0.054
Neg logit contributions
amed-0.098
agged-0.084
lic-0.083
bled-0.083
hes-0.081
Token inch
Token ID11111
Feature activation+1.47e+00

Pos logit contributions
 matter+0.136
 worry+0.133
 harm+0.128
 problems+0.114
 conflicts+0.110
Neg logit contributions
amed-0.221
agged-0.202
lic-0.201
bled-0.198
hes-0.196
Token.
Token ID13
Feature activation+1.53e+00

Pos logit contributions
 matter+0.042
 worry+0.042
 harm+0.037
 appearance+0.036
 avail+0.034
Neg logit contributions
amed-0.051
lic-0.046
agged-0.044
hes-0.043
bled-0.043
Token He
Token ID679
Feature activation+2.35e-01

Pos logit contributions
 worry+0.057
 matter+0.057
 harm+0.054
 test+0.050
 reason+0.049
Neg logit contributions
amed-0.045
agged-0.037
bled-0.036
lic-0.034
hes-0.033
Token was
Token ID373
Feature activation-1.10e+00

Pos logit contributions
 worry+0.007
 matter+0.007
 harm+0.007
 test+0.006
 reason+0.006
Neg logit contributions
amed-0.008
agged-0.007
bled-0.007
lic-0.007
hes-0.007
Token mean
Token ID1612
Feature activation+2.82e+00

Pos logit contributions
amed+0.041
agged+0.036
bled+0.035
hes+0.034
lic+0.034
Neg logit contributions
 matter-0.031
 worry-0.031
 harm-0.029
 reason-0.027
 test-0.026
Token,
Token ID11
Feature activation+2.80e+00

Pos logit contributions
 matter+0.109
 worry+0.107
 harm+0.105
 test+0.097
 reason+0.096
Neg logit contributions
amed-0.073
agged-0.060
lic-0.057
bled-0.056
hes-0.055
Token and
Token ID290
Feature activation+1.89e-01

Pos logit contributions
 worry+0.084
 matter+0.082
 harm+0.081
 reason+0.073
 client+0.073
Neg logit contributions
amed-0.094
agged-0.082
bled-0.079
hes-0.079
lic-0.077
Token he
Token ID339
Feature activation-8.30e-01

Pos logit contributions
 harm+0.005
 worry+0.005
 matter+0.005
 reason+0.005
 appearance+0.005
Neg logit contributions
amed-0.006
hes-0.005
agged-0.005
bled-0.005
lic-0.005
Token did
Token ID750
Feature activation-4.04e-01

Pos logit contributions
amed+0.026
agged+0.022
hes+0.021
bled+0.021
lic+0.020
Neg logit contributions
 worry-0.027
 matter-0.027
 harm-0.026
 risk-0.025
 reason-0.024
Token not
Token ID407
Feature activation+1.99e+00

Pos logit contributions
amed+0.012
agged+0.010
hes+0.010
bled+0.010
lic+0.009
Neg logit contributions
 worry-0.013
 matter-0.013
 angel-0.012
 harm-0.012
 comet-0.012
Token the
Token ID262
Feature activation-6.92e-01

Pos logit contributions
 matter+0.032
 worry+0.031
 harm+0.029
 client+0.027
 appearance+0.026
Neg logit contributions
amed-0.051
bled-0.045
agged-0.044
hes-0.042
lic-0.041
Token end
Token ID886
Feature activation+1.75e+00

Pos logit contributions
amed+0.024
bled+0.020
hes+0.019
 Later+0.019
acking+0.019
Neg logit contributions
 harm-0.019
 worry-0.018
 matter-0.018
 expect-0.017
 client-0.017
Token of
Token ID286
Feature activation+3.42e+00

Pos logit contributions
 matter+0.059
 worry+0.059
 harm+0.057
 test+0.052
 appearance+0.052
Neg logit contributions
amed-0.056
agged-0.047
bled-0.045
lic-0.045
hes-0.044
Token the
Token ID262
Feature activation-6.19e-01

Pos logit contributions
 matter+0.093
 worry+0.091
 harm+0.086
 appearance+0.082
 client+0.081
Neg logit contributions
amed-0.132
agged-0.115
bled-0.115
hes-0.109
acked-0.104
Token night
Token ID1755
Feature activation+8.64e-01

Pos logit contributions
amed+0.023
bled+0.020
hes+0.020
acking+0.019
aming+0.019
Neg logit contributions
 harm-0.013
 appearance-0.013
 worry-0.013
 matter-0.013
 expect-0.013
Token,
Token ID11
Feature activation+1.29e+00

Pos logit contributions
 matter+0.022
 worry+0.021
 harm+0.021
 client+0.018
 test+0.018
Neg logit contributions
amed-0.035
agged-0.030
bled-0.029
lic-0.029
hes-0.029
Token the
Token ID262
Feature activation-1.53e+00

Pos logit contributions
 matter+0.035
 worry+0.035
 harm+0.034
 test+0.030
 client+0.030
Neg logit contributions
amed-0.049
agged-0.042
bled-0.042
hes-0.040
lic-0.040
Token three
Token ID1115
Feature activation+2.86e+00

Pos logit contributions
amed+0.055
bled+0.048
hes+0.045
aming+0.044
agged+0.043
Neg logit contributions
 harm-0.042
 matter-0.042
 worry-0.039
 appearance-0.038
 conflicts-0.037
Token of
Token ID286
Feature activation+2.94e+00

Pos logit contributions
 matter+0.073
 worry+0.072
 harm+0.072
 appearance+0.065
 risk+0.064
Neg logit contributions
amed-0.113
agged-0.098
lic-0.096
bled-0.095
hes-0.095
Token them
Token ID606
Feature activation-3.49e+00

Pos logit contributions
 worry+0.081
 matter+0.081
 harm+0.077
 test+0.069
 reason+0.068
Neg logit contributions
amed-0.112
agged-0.099
bled-0.096
lic-0.092
hes-0.092
Token thanked
Token ID26280
Feature activation-2.58e-02

Pos logit contributions
amed+0.118
agged+0.101
bled+0.097
lic+0.094
hes+0.093
Neg logit contributions
 worry-0.111
 matter-0.111
 harm-0.106
 test-0.097
 reason-0.095
Token gone
Token ID3750
Feature activation-3.69e+00

Pos logit contributions
amed+0.094
agged+0.084
bled+0.080
lic+0.078
hes+0.076
Neg logit contributions
 worry-0.077
 matter-0.076
 harm-0.073
 test-0.065
 reason-0.064
Token.
Token ID13
Feature activation-2.23e+00

Pos logit contributions
amed+0.136
agged+0.117
bled+0.116
lic+0.111
hes+0.109
Neg logit contributions
 matter-0.102
 worry-0.102
 harm-0.097
 client-0.090
 test-0.089
Token But
Token ID887
Feature activation+2.91e+00

Pos logit contributions
amed+0.063
agged+0.051
bled+0.051
hes+0.047
lic+0.044
Neg logit contributions
 worry-0.085
 matter-0.083
 harm-0.079
 test-0.077
 anyone-0.076
Token,
Token ID11
Feature activation+9.54e-01

Pos logit contributions
 matter+0.084
 harm+0.082
 worry+0.082
 client+0.074
 test+0.074
Neg logit contributions
amed-0.103
bled-0.091
agged-0.085
hes-0.080
apped-0.080
Token John
Token ID1757
Feature activation+8.64e-01

Pos logit contributions
 harm+0.028
 matter+0.027
 worry+0.027
 appearance+0.026
 client+0.025
Neg logit contributions
amed-0.031
bled-0.027
hes-0.025
lic-0.025
agged-0.024
Token never
Token ID1239
Feature activation+9.33e-01

Pos logit contributions
 matter+0.027
 worry+0.026
 harm+0.026
 test+0.024
 exam+0.024
Neg logit contributions
amed-0.030
agged-0.025
bled-0.024
lic-0.023
hes-0.023
Token forgot
Token ID16453
Feature activation+5.64e-01

Pos logit contributions
 matter+0.030
 worry+0.029
 harm+0.028
 client+0.028
 angel+0.027
Neg logit contributions
amed-0.030
lic-0.024
agged-0.023
bled-0.023
hes-0.022
Token how
Token ID703
Feature activation+4.30e+00

Pos logit contributions
 matter+0.016
 worry+0.016
 harm+0.015
 reason+0.014
 client+0.014
Neg logit contributions
amed-0.019
bled-0.017
hes-0.016
agged-0.016
lic-0.016
Token wonderful
Token ID7932
Feature activation+6.55e+00

Pos logit contributions
 matter+0.126
 worry+0.119
 harm+0.114
 client+0.107
 angel+0.107
Neg logit contributions
amed-0.156
agged-0.134
bled-0.133
lic-0.129
hes-0.127
Token it
Token ID340
Feature activation+1.53e+00

Pos logit contributions
 worry+0.168
 client+0.161
 harm+0.160
 matter+0.152
 lawyer+0.145
Neg logit contributions
amed-0.230
bled-0.213
agged-0.207
hes-0.202
lic-0.194
Token was
Token ID373
Feature activation+1.38e-02

Pos logit contributions
 worry+0.054
 matter+0.051
 client+0.051
 harm+0.049
 appearance+0.049
Neg logit contributions
amed-0.043
bled-0.035
hes-0.034
lic-0.034
agged-0.034
Token his
Token ID465
Feature activation-1.35e+00

Pos logit contributions
 matter+0.005
 worry+0.005
 harm+0.005
 client+0.005
 lawyer+0.005
Neg logit contributions
amed-0.005
agged-0.004
bled-0.004
hes-0.004
lic-0.004
Token mother
Token ID2802
Feature activation+2.23e+00

Pos logit contributions
amed+0.042
bled+0.035
agged+0.034
He+0.031
 Then+0.030
Neg logit contributions
 matter-0.039
 expect-0.037
 harm-0.036
 worry-0.036
 risk-0.034
Token again
Token ID757
Feature activation+1.38e+00

Pos logit contributions
 matter+0.076
 harm+0.072
 worry+0.072
 client+0.068
 reason+0.068
Neg logit contributions
amed-0.067
agged-0.052
bled-0.051
lic-0.050
hes-0.050
Token soon
Token ID2582
Feature activation+2.00e+00

Pos logit contributions
 matter+0.041
 worry+0.040
 harm+0.039
 client+0.037
 angel+0.036
Neg logit contributions
amed-0.046
agged-0.039
bled-0.038
hes-0.036
lic-0.036
Token.
Token ID13
Feature activation-2.12e+00

Pos logit contributions
 matter+0.059
 worry+0.058
 harm+0.055
 client+0.051
 test+0.050
Neg logit contributions
amed-0.070
agged-0.060
bled-0.059
lic-0.056
hes-0.056
Token<|endoftext|>
Token ID50256
Feature activation+1.85e+00

Pos logit contributions
amed+0.063
agged+0.053
bled+0.052
lic+0.048
hes+0.047
Neg logit contributions
 worry-0.076
 matter-0.075
 harm-0.072
 client-0.067
 test-0.067
TokenOnce
Token ID7454
Feature activation+4.55e+00

Pos logit contributions
 worry+0.059
 matter+0.059
 harm+0.058
 test+0.053
 risk+0.052
Neg logit contributions
amed-0.060
agged-0.051
bled-0.050
lic-0.046
hes-0.046
Token upon
Token ID2402
Feature activation+8.24e-02

Pos logit contributions
 matter+0.176
 harm+0.171
 worry+0.170
 test+0.159
 client+0.157
Neg logit contributions
amed-0.120
agged-0.101
bled-0.099
lic-0.092
apped-0.089
Token a
Token ID257
Feature activation+4.40e+00

Pos logit contributions
 worry+0.002
 matter+0.002
 harm+0.002
 test+0.002
 reason+0.002
Neg logit contributions
amed-0.003
agged-0.003
bled-0.003
acking-0.003
lic-0.003
Token time
Token ID640
Feature activation+3.90e+00

Pos logit contributions
 worry+0.076
 matter+0.075
 harm+0.068
 test+0.058
 reason+0.056
Neg logit contributions
amed-0.213
agged-0.192
bled-0.189
lic-0.184
hes-0.182
Token,
Token ID11
Feature activation+4.15e+00

Pos logit contributions
 matter+0.105
 harm+0.101
 worry+0.097
 client+0.089
 appearance+0.087
Neg logit contributions
amed-0.150
agged-0.132
bled-0.130
lic-0.127
apped-0.122
TokenIt
Token ID1026
Feature activation-2.06e+00

Pos logit contributions
amed+0.012
agged+0.011
bled+0.011
acking+0.010
lic+0.010
Neg logit contributions
 matter-0.010
 worry-0.009
 harm-0.008
 test-0.008
 reason-0.008
Token does
Token ID857
Feature activation+8.73e-01

Pos logit contributions
amed+0.078
lic+0.067
hes+0.066
bled+0.063
agged+0.062
Neg logit contributions
 worry-0.056
 matter-0.056
 harm-0.055
 reason-0.049
 client-0.049
Token,
Token ID11
Feature activation+2.80e+00

Pos logit contributions
 harm+0.025
 worry+0.025
 matter+0.025
 client+0.023
 lawyer+0.023
Neg logit contributions
amed-0.029
bled-0.025
hes-0.025
lic-0.024
agged-0.024
Token Anna
Token ID11735
Feature activation+1.16e+00

Pos logit contributions
 harm+0.082
 matter+0.077
 worry+0.075
 client+0.072
 problems+0.071
Neg logit contributions
amed-0.097
bled-0.085
agged-0.083
hes-0.082
lic-0.080
Token.
Token ID13
Feature activation-7.15e-01

Pos logit contributions
 matter+0.037
 worry+0.034
 harm+0.034
 risk+0.032
 exam+0.032
Neg logit contributions
amed-0.039
hes-0.032
lic-0.031
agged-0.031
bled-0.031
Token Are
Token ID4231
Feature activation+4.69e+00

Pos logit contributions
amed+0.029
agged+0.026
bled+0.025
lic+0.024
hes+0.024
Neg logit contributions
 worry-0.018
 matter-0.017
 harm-0.016
 test-0.015
 angel-0.015
Token you
Token ID345
Feature activation+4.52e+00

Pos logit contributions
 harm+0.099
 matter+0.094
 worry+0.088
 reason+0.080
 perspective+0.080
Neg logit contributions
bled-0.195
amed-0.191
hes-0.177
agged-0.174
apped-0.169
Token ready
Token ID3492
Feature activation+2.95e+00

Pos logit contributions
 matter+0.136
 worry+0.135
 harm+0.132
 client+0.118
 test+0.118
Neg logit contributions
amed-0.162
agged-0.138
bled-0.135
lic-0.131
hes-0.130
Token to
Token ID284
Feature activation+5.57e+00

Pos logit contributions
 matter+0.074
 harm+0.072
 worry+0.071
 client+0.067
 conflicts+0.063
Neg logit contributions
amed-0.112
bled-0.099
agged-0.098
hes-0.097
lic-0.096
Token perform
Token ID1620
Feature activation-1.05e+00

Pos logit contributions
 matter+0.136
 worry+0.131
 harm+0.128
 conflicts+0.117
 client+0.115
Neg logit contributions
amed-0.213
bled-0.183
agged-0.183
hes-0.182
lic-0.181
Token?"
Token ID1701
Feature activation-6.01e-01

Pos logit contributions
amed+0.045
agged+0.040
bled+0.040
hes+0.039
lic+0.038
Neg logit contributions
 worry-0.024
 matter-0.023
 harm-0.022
 client-0.019
 reason-0.019
Token.
Token ID13
Feature activation-2.84e+00

Pos logit contributions
amed+0.041
agged+0.036
bled+0.036
lic+0.033
hes+0.033
Neg logit contributions
 matter-0.030
 worry-0.029
 harm-0.028
 test-0.026
 risk-0.025
Token\n
Token ID198
Feature activation-1.68e+00

Pos logit contributions
amed+0.094
agged+0.081
bled+0.080
hes+0.073
lic+0.072
Neg logit contributions
 worry-0.093
 matter-0.088
 harm-0.085
 test-0.080
 lawyer-0.080
Token\n
Token ID198
Feature activation-1.03e+00

Pos logit contributions
amed+0.052
agged+0.044
bled+0.043
lic+0.040
hes+0.040
Neg logit contributions
 worry-0.058
 matter-0.056
 harm-0.056
 test-0.051
 reason-0.051
Token"
Token ID1
Feature activation+7.53e-01

Pos logit contributions
amed+0.033
agged+0.029
bled+0.028
lic+0.026
hes+0.026
Neg logit contributions
 worry-0.035
 matter-0.034
 harm-0.033
 test-0.030
 reason-0.030
TokenTo
Token ID2514
Feature activation+3.66e+00

Pos logit contributions
 worry+0.027
 matter+0.027
 harm+0.026
 test+0.024
 appearance+0.024
Neg logit contributions
amed-0.022
agged-0.018
bled-0.018
hes-0.017
lic-0.017
Token get
Token ID651
Feature activation-9.05e-01

Pos logit contributions
 matter+0.087
 worry+0.083
 harm+0.081
 client+0.073
 conflicts+0.073
Neg logit contributions
amed-0.141
bled-0.126
agged-0.123
hes-0.122
lic-0.118
Token to
Token ID284
Feature activation+3.34e+00

Pos logit contributions
amed+0.033
bled+0.028
agged+0.028
hes+0.027
lic+0.026
Neg logit contributions
 matter-0.026
 worry-0.026
 harm-0.024
 client-0.023
 angel-0.022
Token the
Token ID262
Feature activation-3.09e-01

Pos logit contributions
 matter+0.086
 worry+0.082
 client+0.075
 harm+0.074
 angel+0.074
Neg logit contributions
amed-0.109
hes-0.097
bled-0.092
lic-0.090
agged-0.089
Token other
Token ID584
Feature activation+5.64e+00

Pos logit contributions
amed+0.011
bled+0.009
hes+0.009
agged+0.009
lic+0.008
Neg logit contributions
 matter-0.007
 worry-0.007
 harm-0.007
 expect-0.007
 conflicts-0.007
Token side
Token ID1735
Feature activation+2.52e+00

Pos logit contributions
 client+0.129
 harm+0.126
 matter+0.123
 intr+0.121
 disl+0.121
Neg logit contributions
amed-0.175
aming-0.149
apped-0.142
acking-0.135
bled-0.135
Token!"
Token ID2474
Feature activation-1.27e-01

Pos logit contributions
 matter+0.076
 worry+0.072
 harm+0.070
 client+0.069
 appearance+0.068
Neg logit contributions
amed-0.085
hes-0.068
agged-0.068
bled-0.067
lic-0.067
Token?"
Token ID1701
Feature activation-1.85e+00

Pos logit contributions
 worry+0.114
 matter+0.109
 harm+0.103
 client+0.096
 angel+0.096
Neg logit contributions
amed-0.179
bled-0.160
hes-0.158
agged-0.155
apped-0.143
Token He
Token ID679
Feature activation-4.07e+00

Pos logit contributions
amed+0.059
agged+0.049
bled+0.049
lic+0.045
hes+0.044
Neg logit contributions
 worry-0.065
 matter-0.064
 harm-0.061
 test-0.057
 reason-0.056
Token jumped
Token ID11687
Feature activation-7.41e+00

Pos logit contributions
amed+0.179
agged+0.165
acking+0.157
bled+0.153
acked+0.149
Neg logit contributions
 matter-0.088
 worry-0.078
 might-0.067
 harm-0.065
 fight-0.064
Token off
Token ID572
Feature activation-2.82e+00

Pos logit contributions
amed+0.300
agged+0.281
lic+0.260
bled+0.251
hes+0.243
Neg logit contributions
 worry-0.168
 test-0.153
 harm-0.150
 fight-0.144
 yet-0.144
Token the
Token ID262
Feature activation-2.98e+00

Pos logit contributions
amed+0.130
agged+0.117
bled+0.113
lic+0.110
hes+0.107
Neg logit contributions
 worry-0.057
 matter-0.056
 harm-0.052
 test-0.046
 reason-0.044
Token chair
Token ID5118
Feature activation-3.23e+00

Pos logit contributions
amed+0.123
agged+0.111
bled+0.103
lic+0.103
acked+0.099
Neg logit contributions
 worry-0.074
 matter-0.071
 harm-0.071
 test-0.069
 exam-0.063
Token and
Token ID290
Feature activation-4.33e+00

Pos logit contributions
amed+0.102
agged+0.088
bled+0.085
lic+0.081
hes+0.079
Neg logit contributions
 worry-0.110
 matter-0.108
 harm-0.105
 test-0.097
 reason-0.095
Token ran
Token ID4966
Feature activation-6.93e+00

Pos logit contributions
amed+0.157
agged+0.144
bled+0.131
lic+0.127
acking+0.126
Neg logit contributions
 worry-0.126
 matter-0.125
 harm-0.118
 test-0.112
 fight-0.106
Token to
Token ID284
Feature activation-5.84e-01

Pos logit contributions
amed+0.284
agged+0.253
bled+0.245
lic+0.238
hes+0.236
Neg logit contributions
 worry-0.172
 matter-0.170
 harm-0.164
 test-0.147
 reason-0.143
Token the
Token ID262
Feature activation-3.30e+00

Pos logit contributions
amed+0.018
bled+0.016
hes+0.015
agged+0.015
lic+0.014
Neg logit contributions
 matter-0.018
 worry-0.017
 appearance-0.016
 harm-0.016
 client-0.016
Token door
Token ID3420
Feature activation-4.55e+00

Pos logit contributions
amed+0.123
agged+0.108
bled+0.103
lic+0.100
hes+0.099
Neg logit contributions
 worry-0.094
 matter-0.094
 harm-0.090
 test-0.082
 reason-0.080
Token for
Token ID329
Feature activation+2.21e+00

Pos logit contributions
amed+0.130
agged+0.117
acked+0.109
lic+0.109
bled+0.106
Neg logit contributions
 worry-0.078
 test-0.077
 matter-0.076
 harm-0.074
 happen-0.072
Token hours
Token ID2250
Feature activation+3.23e+00

Pos logit contributions
 matter+0.070
 worry+0.070
 harm+0.067
 test+0.061
 reason+0.060
Neg logit contributions
amed-0.076
agged-0.065
bled-0.063
lic-0.061
hes-0.060
Token.
Token ID13
Feature activation-2.52e+00

Pos logit contributions
 matter+0.098
 worry+0.097
 harm+0.091
 client+0.083
 test+0.083
Neg logit contributions
amed-0.115
agged-0.098
bled-0.096
hes-0.092
lic-0.092
Token He
Token ID679
Feature activation-2.92e+00

Pos logit contributions
amed+0.077
agged+0.064
bled+0.063
aming+0.060
lic+0.057
Neg logit contributions
 matter-0.089
 worry-0.085
 harm-0.085
 test-0.080
 anyone-0.079
Token laughed
Token ID13818
Feature activation-6.62e+00

Pos logit contributions
amed+0.105
agged+0.091
bled+0.088
lic+0.085
hes+0.084
Neg logit contributions
 worry-0.087
 matter-0.087
 harm-0.083
 test-0.075
 reason-0.074
Token with
Token ID351
Feature activation-8.09e-01

Pos logit contributions
amed+0.277
agged+0.262
hes+0.244
lic+0.242
aked+0.236
Neg logit contributions
 worry-0.143
 matter-0.134
 angel-0.122
 yet-0.121
 anyone-0.119
Token joy
Token ID8716
Feature activation+3.32e-01

Pos logit contributions
amed+0.033
agged+0.029
bled+0.028
lic+0.027
hes+0.027
Neg logit contributions
 worry-0.020
 matter-0.020
 harm-0.019
 test-0.017
 reason-0.017
Token and
Token ID290
Feature activation-2.68e+00

Pos logit contributions
 worry+0.009
 matter+0.009
 harm+0.009
 test+0.008
 angel+0.008
Neg logit contributions
amed-0.012
agged-0.011
bled-0.010
hes-0.010
lic-0.010
Token thanked
Token ID26280
Feature activation+1.25e+00

Pos logit contributions
amed+0.099
agged+0.088
bled+0.083
lic+0.082
hes+0.080
Neg logit contributions
 worry-0.078
 matter-0.077
 harm-0.072
 test-0.065
 reason-0.064
Token his
Token ID465
Feature activation-3.44e-01

Pos logit contributions
 worry+0.041
 matter+0.041
 harm+0.040
 test+0.036
 reason+0.036
Neg logit contributions
amed-0.041
agged-0.035
bled-0.034
lic-0.032
hes-0.032
Token mom
Token ID1995
Feature activation+1.95e+00

Pos logit contributions
amed+0.015
agged+0.013
bled+0.013
lic+0.013
hes+0.012
Neg logit contributions
 matter-0.008
 worry-0.007
 harm-0.007
 risk-0.006
 test-0.006
Token low
Token ID1877
Feature activation-1.10e+00

Pos logit contributions
amed+0.073
agged+0.063
lic+0.062
bled+0.062
hes+0.059
Neg logit contributions
 matter-0.051
 worry-0.051
 harm-0.049
 test-0.043
 client-0.043
Token rum
Token ID7440
Feature activation-3.98e+00

Pos logit contributions
amed+0.050
agged+0.045
bled+0.045
hes+0.042
lic+0.041
Neg logit contributions
 worry-0.025
 matter-0.024
 harm-0.023
 test-0.021
 angel-0.020
Tokenble
Token ID903
Feature activation-5.24e+00

Pos logit contributions
amed+0.122
agged+0.105
lic+0.101
acked+0.094
bled+0.092
Neg logit contributions
 worry-0.140
 harm-0.134
 matter-0.125
 test-0.120
 risk-0.118
Token and
Token ID290
Feature activation-3.40e+00

Pos logit contributions
amed+0.202
agged+0.180
bled+0.173
lic+0.164
hes+0.163
Neg logit contributions
 worry-0.146
 matter-0.143
 harm-0.140
 test-0.126
 angel-0.121
Token the
Token ID262
Feature activation-2.61e+00

Pos logit contributions
amed+0.140
agged+0.130
bled+0.121
hes+0.117
lic+0.115
Neg logit contributions
 worry-0.087
 matter-0.081
 harm-0.077
 test-0.067
 angel-0.067
Token heavy
Token ID4334
Feature activation+3.03e-01

Pos logit contributions
amed+0.104
agged+0.089
bled+0.089
lic+0.087
hes+0.085
Neg logit contributions
 matter-0.065
 worry-0.063
 harm-0.060
 appearance-0.053
 client-0.052
Token box
Token ID3091
Feature activation-2.20e+00

Pos logit contributions
 worry+0.006
 matter+0.006
 harm+0.006
 client+0.005
 expect+0.005
Neg logit contributions
amed-0.012
agged-0.011
bled-0.011
lic-0.010
acked-0.010
Token in
Token ID287
Feature activation-6.03e-01

Pos logit contributions
amed+0.076
agged+0.069
bled+0.063
hes+0.063
lic+0.061
Neg logit contributions
 worry-0.067
 matter-0.066
 harm-0.063
 angel-0.057
 test-0.057
Token the
Token ID262
Feature activation-1.98e+00

Pos logit contributions
amed+0.022
agged+0.019
bled+0.018
hes+0.018
lic+0.018
Neg logit contributions
 worry-0.016
 matter-0.016
 harm-0.015
 exam-0.015
 client-0.014
Token corner
Token ID5228
Feature activation-1.29e+00

Pos logit contributions
amed+0.072
agged+0.064
bled+0.062
lic+0.059
hes+0.058
Neg logit contributions
 worry-0.059
 matter-0.058
 harm-0.056
 test-0.051
 reason-0.050
Token started
Token ID2067
Feature activation-5.28e+00

Pos logit contributions
amed+0.045
agged+0.038
lic+0.037
bled+0.037
hes+0.036
Neg logit contributions
 matter-0.039
 worry-0.039
 harm-0.037
 test-0.034
 client-0.034
Token me
Token ID502
Feature activation+2.71e+00

Pos logit contributions
 matter+0.083
 worry+0.082
 harm+0.078
 conflicts+0.075
 quarrel+0.071
Neg logit contributions
amed-0.095
bled-0.083
agged-0.080
hes-0.075
lic-0.073
Token how
Token ID703
Feature activation+5.77e+00

Pos logit contributions
 matter+0.072
 worry+0.071
 harm+0.068
 reason+0.062
 client+0.059
Neg logit contributions
amed-0.105
bled-0.092
agged-0.090
hes-0.088
lic-0.086
Token to
Token ID284
Feature activation+5.87e+00

Pos logit contributions
 matter+0.172
 worry+0.164
 harm+0.154
 appearance+0.147
 reason+0.145
Neg logit contributions
amed-0.201
hes-0.171
lic-0.168
agged-0.168
bled-0.165
Token whistle
Token ID16121
Feature activation-3.21e+00

Pos logit contributions
 matter+0.138
 worry+0.137
 harm+0.129
 appearance+0.119
 client+0.119
Neg logit contributions
amed-0.231
bled-0.200
agged-0.198
hes-0.194
lic-0.189
Token."
Token ID526
Feature activation-1.80e+00

Pos logit contributions
amed+0.115
agged+0.098
hes+0.096
bled+0.096
lic+0.094
Neg logit contributions
 worry-0.091
 matter-0.088
 harm-0.084
 client-0.079
 risk-0.076
Token The
Token ID383
Feature activation-2.00e+00

Pos logit contributions
amed+0.052
agged+0.044
bled+0.041
lic+0.040
hes+0.039
Neg logit contributions
 worry-0.066
 matter-0.066
 harm-0.063
 test-0.059
 reason-0.058
Token lady
Token ID10846
Feature activation-1.38e+00

Pos logit contributions
amed+0.073
agged+0.062
bled+0.060
hes+0.059
lic+0.059
Neg logit contributions
 matter-0.058
 worry-0.056
 harm-0.055
 conflicts-0.050
 appearance-0.050
Token smiled
Token ID13541
Feature activation-7.30e+00

Pos logit contributions
amed+0.048
agged+0.041
bled+0.040
lic+0.038
hes+0.038
Neg logit contributions
 worry-0.043
 matter-0.043
 harm-0.041
 test-0.037
 reason-0.037
Token and
Token ID290
Feature activation-3.54e+00

Pos logit contributions
amed+0.293
hes+0.271
agged+0.271
lic+0.266
bled+0.261
Neg logit contributions
 worry-0.187
 angel-0.152
 matter-0.152
 yet-0.139
 harm-0.139
Token said
Token ID531
Feature activation-1.43e+00

Pos logit contributions
amed+0.142
agged+0.127
bled+0.121
lic+0.117
hes+0.115
Neg logit contributions
 worry-0.091
 matter-0.089
 harm-0.084
 test-0.076
 angel-0.074
Token,
Token ID11
Feature activation-2.27e-01

Pos logit contributions
amed+0.060
agged+0.054
bled+0.053
lic+0.051
hes+0.050
Neg logit contributions
 matter-0.034
 worry-0.033
 harm-0.032
 test-0.027
 reason-0.027
Token,
Token ID11
Feature activation+2.44e+00

Pos logit contributions
 worry+0.062
 matter+0.061
 harm+0.057
 client+0.055
 risk+0.054
Neg logit contributions
amed-0.058
agged-0.049
bled-0.048
lic-0.046
hes-0.045
Token mum
Token ID25682
Feature activation+3.01e+00

Pos logit contributions
 matter+0.071
 harm+0.071
 worry+0.069
 test+0.063
 client+0.063
Neg logit contributions
amed-0.080
hes-0.066
bled-0.066
agged-0.066
lic-0.065
Token?".
Token ID43634
Feature activation-1.02e+00

Pos logit contributions
 matter+0.083
 worry+0.080
 harm+0.080
 appearance+0.073
 reason+0.073
Neg logit contributions
amed-0.107
hes-0.090
lic-0.088
bled-0.088
agged-0.088
Token Her
Token ID2332
Feature activation-2.03e+00

Pos logit contributions
amed+0.027
agged+0.023
bled+0.021
hes+0.021
lic+0.021
Neg logit contributions
 matter-0.039
 worry-0.038
 harm-0.038
 test-0.035
 appearance-0.034
Token mum
Token ID25682
Feature activation-6.08e-01

Pos logit contributions
amed+0.069
bled+0.059
agged+0.056
apped+0.054
lic+0.053
Neg logit contributions
 matter-0.062
 harm-0.058
 risk-0.055
 worry-0.054
 conflicts-0.051
Token paused
Token ID24487
Feature activation-6.99e+00

Pos logit contributions
amed+0.020
agged+0.017
bled+0.017
lic+0.016
hes+0.016
Neg logit contributions
 worry-0.019
 matter-0.019
 harm-0.019
 test-0.017
 reason-0.017
Token,
Token ID11
Feature activation-1.85e+00

Pos logit contributions
amed+0.224
agged+0.191
bled+0.184
lic+0.176
hes+0.174
Neg logit contributions
 matter-0.233
 worry-0.232
 harm-0.222
 test-0.205
 reason-0.201
Token then
Token ID788
Feature activation-6.56e-01

Pos logit contributions
amed+0.069
agged+0.062
bled+0.059
lic+0.057
hes+0.055
Neg logit contributions
 worry-0.055
 matter-0.052
 harm-0.049
 test-0.044
 reason-0.043
Token said
Token ID531
Feature activation-1.72e+00

Pos logit contributions
amed+0.023
agged+0.020
bled+0.019
lic+0.019
hes+0.018
Neg logit contributions
 worry-0.020
 matter-0.020
 harm-0.019
 test-0.017
 reason-0.017
Token sadly
Token ID21098
Feature activation-1.38e+00

Pos logit contributions
amed+0.072
agged+0.064
bled+0.063
lic+0.061
hes+0.060
Neg logit contributions
 matter-0.041
 worry-0.040
 harm-0.038
 test-0.034
 reason-0.033
Token,
Token ID11
Feature activation-4.41e-01

Pos logit contributions
amed+0.055
agged+0.048
bled+0.047
lic+0.046
hes+0.045
Neg logit contributions
 matter-0.035
 worry-0.034
 harm-0.033
 test-0.029
 risk-0.028
Token mom
Token ID1995
Feature activation+9.52e-01

Pos logit contributions
amed+0.110
agged+0.097
bled+0.092
lic+0.091
hes+0.087
Neg logit contributions
 worry-0.095
 matter-0.093
 harm-0.089
 test-0.080
 reason-0.079
Token joined
Token ID5399
Feature activation-9.97e-01

Pos logit contributions
 worry+0.027
 matter+0.027
 harm+0.026
 test+0.023
 reason+0.023
Neg logit contributions
amed-0.035
agged-0.031
bled-0.030
lic-0.028
hes-0.028
Token them
Token ID606
Feature activation-3.85e+00

Pos logit contributions
amed+0.034
agged+0.030
bled+0.029
hes+0.027
apped+0.026
Neg logit contributions
 matter-0.032
 worry-0.031
 harm-0.029
 client-0.027
 test-0.027
Token.
Token ID13
Feature activation-2.99e+00

Pos logit contributions
amed+0.117
agged+0.095
bled+0.094
apped+0.092
hes+0.090
Neg logit contributions
 matter-0.133
 worry-0.125
 harm-0.119
 client-0.116
grand-0.112
Token They
Token ID1119
Feature activation-3.08e+00

Pos logit contributions
amed+0.122
bled+0.104
agged+0.099
lic+0.098
aming+0.094
Neg logit contributions
 worry-0.070
 yet-0.065
 marriage-0.065
 harm-0.063
 age-0.063
Token twisted
Token ID19074
Feature activation-6.69e+00

Pos logit contributions
amed+0.112
agged+0.100
bled+0.096
lic+0.093
hes+0.090
Neg logit contributions
 worry-0.089
 matter-0.089
 harm-0.083
 test-0.077
 reason-0.074
Token until
Token ID1566
Feature activation+2.31e+00

Pos logit contributions
amed+0.255
agged+0.227
bled+0.216
lic+0.215
hes+0.212
Neg logit contributions
 worry-0.184
 matter-0.183
 harm-0.171
 test-0.152
 reason-0.152
Token they
Token ID484
Feature activation-2.32e+00

Pos logit contributions
 matter+0.058
 worry+0.058
 harm+0.054
 client+0.049
 reason+0.048
Neg logit contributions
amed-0.092
bled-0.082
agged-0.081
hes-0.078
lic-0.075
Token were
Token ID547
Feature activation-2.92e+00

Pos logit contributions
amed+0.080
agged+0.070
bled+0.067
lic+0.064
hes+0.063
Neg logit contributions
 worry-0.072
 matter-0.072
 harm-0.068
 test-0.063
 reason-0.061
Token dizz
Token ID29527
Feature activation-6.22e-01

Pos logit contributions
amed+0.108
agged+0.097
lic+0.092
acked+0.092
bled+0.090
Neg logit contributions
 worry-0.084
 matter-0.081
 harm-0.077
 test-0.071
 anyone-0.068
Tokeny
Token ID88
Feature activation-1.85e+00

Pos logit contributions
amed+0.029
agged+0.026
bled+0.026
hes+0.025
 Later+0.025
Neg logit contributions
 worry-0.011
 matter-0.009
 harm-0.009
 anymore-0.008
 yet-0.007
Token singing
Token ID13777
Feature activation-4.56e+00

Pos logit contributions
amed+0.071
bled+0.060
agged+0.059
hes+0.057
lic+0.056
Neg logit contributions
 matter-0.060
 worry-0.058
 harm-0.055
 client-0.055
 test-0.053
Token!
Token ID0
Feature activation-1.33e+00

Pos logit contributions
amed+0.165
agged+0.141
bled+0.136
lic+0.134
hes+0.131
Neg logit contributions
 matter-0.135
 worry-0.134
 harm-0.128
 client-0.117
 test-0.117
Token The
Token ID383
Feature activation-1.08e+00

Pos logit contributions
amed+0.039
agged+0.033
bled+0.032
lic+0.030
hes+0.030
Neg logit contributions
 worry-0.048
 matter-0.047
 harm-0.046
 test-0.042
 reason-0.042
Token rac
Token ID3444
Feature activation+1.39e+00

Pos logit contributions
amed+0.041
bled+0.035
agged+0.034
lic+0.033
hes+0.033
Neg logit contributions
 matter-0.026
 worry-0.026
 harm-0.025
 appearance-0.023
 expect-0.023
Tokeners
Token ID364
Feature activation-2.26e+00

Pos logit contributions
 matter+0.039
 harm+0.039
 worry+0.039
 test+0.035
 reason+0.034
Neg logit contributions
amed-0.049
agged-0.043
bled-0.042
hes-0.041
lic-0.040
Token dashed
Token ID37901
Feature activation-5.99e+00

Pos logit contributions
amed+0.077
agged+0.064
bled+0.063
lic+0.060
hes+0.060
Neg logit contributions
 matter-0.072
 worry-0.071
 harm-0.069
 test-0.062
 reason-0.062
Token around
Token ID1088
Feature activation-1.22e+00

Pos logit contributions
amed+0.203
agged+0.180
bled+0.171
hes+0.169
lic+0.164
Neg logit contributions
 worry-0.185
 matter-0.183
 harm-0.176
 test-0.162
 reason-0.160
Token the
Token ID262
Feature activation-2.02e+00

Pos logit contributions
amed+0.040
bled+0.032
agged+0.032
lic+0.031
hes+0.030
Neg logit contributions
 matter-0.039
 worry-0.038
 harm-0.036
 client-0.035
 appearance-0.033
Token town
Token ID3240
Feature activation-1.05e+00

Pos logit contributions
amed+0.074
agged+0.062
bled+0.061
lic+0.060
hes+0.058
Neg logit contributions
 matter-0.058
 worry-0.058
 harm-0.054
 reason-0.048
 test-0.048
Token and
Token ID290
Feature activation-2.87e+00

Pos logit contributions
amed+0.037
agged+0.030
bled+0.030
lic+0.029
acking+0.028
Neg logit contributions
 matter-0.032
 worry-0.031
 harm-0.030
 client-0.027
 exam-0.027
Token the
Token ID262
Feature activation-1.90e+00

Pos logit contributions
amed+0.099
bled+0.082
agged+0.081
lic+0.079
hes+0.078
Neg logit contributions
 matter-0.089
 worry-0.088
 harm-0.084
 reason-0.078
 client-0.077
Token one
Token ID530
Feature activation+4.27e+00

Pos logit contributions
 worry+0.121
 appearance+0.113
grand+0.112
 intr+0.111
 risk+0.110
Neg logit contributions
amed-0.127
hes-0.119
agged-0.118
bled-0.111
ocked-0.107
Token was
Token ID373
Feature activation-1.04e+00

Pos logit contributions
 worry+0.157
 matter+0.156
 appearance+0.156
 client+0.150
 harm+0.147
Neg logit contributions
amed-0.112
lic-0.094
hes-0.092
agged-0.090
acked-0.083
Token looking
Token ID2045
Feature activation-2.40e+00

Pos logit contributions
amed+0.035
agged+0.030
hes+0.030
bled+0.029
lic+0.028
Neg logit contributions
 matter-0.032
 worry-0.031
 harm-0.029
 appearance-0.029
 client-0.028
Token.
Token ID13
Feature activation-2.26e+00

Pos logit contributions
amed+0.074
agged+0.063
bled+0.061
hes+0.058
lic+0.056
Neg logit contributions
 matter-0.081
 worry-0.076
 harm-0.074
 client-0.071
 test-0.069
Token She
Token ID1375
Feature activation-2.11e+00

Pos logit contributions
amed+0.078
bled+0.068
agged+0.067
hes+0.060
lic+0.059
Neg logit contributions
 worry-0.073
 matter-0.071
 harm-0.068
 test-0.063
 client-0.062
Token looked
Token ID3114
Feature activation-6.79e+00

Pos logit contributions
amed+0.073
agged+0.062
bled+0.060
lic+0.059
hes+0.058
Neg logit contributions
 worry-0.065
 matter-0.065
 harm-0.063
 test-0.057
 risk-0.057
Token around
Token ID1088
Feature activation-1.45e+00

Pos logit contributions
amed+0.213
agged+0.180
bled+0.172
lic+0.166
hes+0.164
Neg logit contributions
 matter-0.232
 worry-0.232
 harm-0.224
 test-0.206
 reason-0.203
Token and
Token ID290
Feature activation-3.90e+00

Pos logit contributions
amed+0.041
agged+0.033
bled+0.032
lic+0.030
hes+0.030
Neg logit contributions
 matter-0.052
 worry-0.050
 harm-0.050
 client-0.048
 test-0.046
Token saw
Token ID2497
Feature activation-2.20e+00

Pos logit contributions
amed+0.142
agged+0.120
bled+0.119
lic+0.115
hes+0.113
Neg logit contributions
 matter-0.113
 worry-0.110
 harm-0.109
 client-0.099
 test-0.099
Token many
Token ID867
Feature activation+3.36e+00

Pos logit contributions
amed+0.071
agged+0.059
bled+0.058
hes+0.057
lic+0.057
Neg logit contributions
 matter-0.073
 worry-0.072
 harm-0.069
 test-0.064
 appearance-0.064
Token things
Token ID1243
Feature activation+3.19e+00

Pos logit contributions
 worry+0.082
 matter+0.082
 harm+0.078
 risk+0.072
 appearance+0.072
Neg logit contributions
amed-0.129
agged-0.112
hes-0.111
lic-0.109
acking-0.105
Token necklace
Token ID38987
Feature activation-4.05e+00

Pos logit contributions
amed+0.125
agged+0.106
bled+0.106
hes+0.099
aming+0.094
Neg logit contributions
 matter-0.081
 worry-0.080
 harm-0.079
 client-0.071
 appearance-0.069
Token around
Token ID1088
Feature activation-1.40e+00

Pos logit contributions
amed+0.137
agged+0.121
lic+0.115
bled+0.113
hes+0.109
Neg logit contributions
 worry-0.124
 matter-0.121
 harm-0.118
 test-0.109
 angel-0.108
Token its
Token ID663
Feature activation-6.29e+00

Pos logit contributions
amed+0.050
agged+0.043
bled+0.043
hes+0.040
lic+0.040
Neg logit contributions
 matter-0.041
 worry-0.040
 harm-0.038
 client-0.035
 test-0.035
Token neck
Token ID7393
Feature activation-3.88e+00

Pos logit contributions
amed+0.194
agged+0.162
bled+0.157
hes+0.144
lic+0.144
Neg logit contributions
 matter-0.219
 worry-0.215
 harm-0.209
 exam-0.194
 test-0.193
Token and
Token ID290
Feature activation-3.79e+00

Pos logit contributions
amed+0.133
agged+0.116
bled+0.111
lic+0.109
hes+0.105
Neg logit contributions
 worry-0.119
 matter-0.116
 harm-0.115
 test-0.103
 reason-0.101
Token looked
Token ID3114
Feature activation-7.03e+00

Pos logit contributions
amed+0.134
agged+0.120
lic+0.111
bled+0.110
acking+0.106
Neg logit contributions
 worry-0.115
 matter-0.114
 harm-0.109
 test-0.099
 problems-0.097
Token at
Token ID379
Feature activation-7.22e-01

Pos logit contributions
amed+0.252
agged+0.219
bled+0.211
lic+0.205
hes+0.201
Neg logit contributions
 worry-0.211
 matter-0.209
 harm-0.199
 test-0.181
 reason-0.177
Token Lily
Token ID20037
Feature activation-1.00e+00

Pos logit contributions
amed+0.026
bled+0.023
agged+0.023
hes+0.022
apped+0.021
Neg logit contributions
 worry-0.021
 matter-0.021
 harm-0.019
 test-0.019
 exam-0.019
Token.
Token ID13
Feature activation-2.38e+00

Pos logit contributions
amed+0.034
bled+0.028
agged+0.028
hes+0.028
acking+0.027
Neg logit contributions
 matter-0.032
 worry-0.030
 harm-0.029
 reason-0.027
 test-0.027
Token The
Token ID383
Feature activation-1.93e+00

Pos logit contributions
amed+0.093
bled+0.080
agged+0.077
aming+0.073
lic+0.070
Neg logit contributions
 worry-0.062
 matter-0.057
 harm-0.056
 anyone-0.055
 lawyer-0.054
Token seal
Token ID13810
Feature activation-2.17e+00

Pos logit contributions
amed+0.079
agged+0.069
bled+0.069
hes+0.065
lic+0.063
Neg logit contributions
 matter-0.045
 worry-0.044
 harm-0.043
 exam-0.037
 test-0.037