TOP ACTIVATIONS
MAX = 0.223

learned that it 's okay to ask for help when you
relieved that he was able to escape . He would never
the noise . She comes to the room . She sees
was how John learned not to be so stupid . <|endoftext|>
again . She took it to the kitchen and put some
is that it 's ok to be excited when you win
learned that it 's better to forgive and be kind ,
learned that it 's important to help others when they need
learned that it 's important to be careful with expensive things
learned that it 's important to be kind and helpful to
learned that it 's important to be careful around strangers ,
learned that it 's important to eat healthy food and not
that it 's not good to take things that belong to
. He was never able to speak again , even though
hears them . She comes to the living room . She
that he had been able to climb to the top .
Johnny held on to the bag tight and they
better . He was able to perform on the stage and
falling off . She tried to tie them back together ,
. Ryan tried to run away but the noisy

MIN ACTIVATIONS
MIN = -0.303

saw a big box of Leg os that he really wanted
grandma 's house for safe keeping . When
put the camera on a tripod , set the timer ,
a big camera on a tripod . It belongs to a
Once there was a k idd o who loved to guess
one day , the k idd o ’s
a new tank out of Leg os . Tim
. The k idd o understood the message ,
the camera back on the tripod . " Go
day , she asked her maid for some special makeup .
said . The maid smiled . " Yes ,
hard - and the k idd o soon began to do
said . The maid quickly ran away to find
. The great Zen o hugs Ben and says
wet and the motorcycle sk idd ed . It tipped over
slippery and the car sk idd ed and hit a tree
She smiled and thanked the maid for helping her .
Jenny ," said Mr . Fisher . " We have to
it to me for safe keeping . He told me to
the park with Mr . Fisher . " Look

INTERVAL 0.092 - 0.223
CONTAINS 22.589%

The pony was so pretty and Lily wanted to ride it
Lily 's friend came over to play . Her friend asked
blue ball was curious , so it decided to join them
and her mom were in the kitchen . Riley wanted to
't notice how cold it was getting . Suddenly his feet

INTERVAL -0.040 - 0.092
CONTAINS 44.231%

to hide from the rain , but it was too wet
's okay . You can have some of my fruit .
become comfortable and was feeling very warm . But
" Can we ride it ?" Lily asks .
sun is going down and it is getting dark .

INTERVAL -0.171 - -0.040
CONTAINS 26.462%

train on the tracks and turn it on . The train
! <|endoftext|> Bob was very angry . He wanted to play
Wow ! What ’s it doing ?
boy named Tim my . Tim my loved to explore and
. But Lucy just ignored Jack . Jack started to tease

INTERVAL -0.303 - -0.171
CONTAINS 6.705%

to his friends . The frog was scared , but the
She had a cute little purse . She liked to take
dirt and mum with the rake made it look nice .
: another kid found the knife and got hurt . Lucy
, her mom lit the oven , and the spirit was
Token learned
Token ID4499
Feature activation+3.75e-02

Pos logit contributions
 your+0.127
 same+0.124
 laundry+0.121
 their+0.121
 dad+0.112
Neg logit contributions
hell-0.285
lor-0.283
oped-0.280
umbs-0.279
qu-0.275
Token that
Token ID326
Feature activation+1.15e-01

Pos logit contributions
 port+0.054
 same+0.053
 laundry+0.053
 table+0.053
 stove+0.051
Neg logit contributions
hell-0.083
oot-0.079
es-0.077
qu-0.077
uts-0.076
Token it
Token ID340
Feature activation-2.91e-02

Pos logit contributions
 same+0.164
 port+0.152
 laundry+0.151
 table+0.142
 task+0.139
Neg logit contributions
hell-0.245
es-0.244
qu-0.233
uts-0.230
oot-0.226
Token's
Token ID338
Feature activation+1.29e-01

Pos logit contributions
hell+0.062
es+0.059
lor+0.059
oped+0.059
qu+0.058
Neg logit contributions
 your-0.048
 same-0.047
 laundry-0.047
 their-0.046
 our-0.043
Token okay
Token ID8788
Feature activation-2.95e-02

Pos logit contributions
 laundry+0.143
 port+0.141
 table+0.139
 same+0.133
 computer+0.130
Neg logit contributions
hell-0.332
es-0.315
qu-0.309
oot-0.306
rot-0.305
Token to
Token ID284
Feature activation+2.04e-01

Pos logit contributions
hell+0.071
es+0.069
lor+0.066
oped+0.066
qu+0.066
Neg logit contributions
 laundry-0.040
 same-0.039
 port-0.037
 your-0.037
 their-0.037
Token ask
Token ID1265
Feature activation-4.30e-02

Pos logit contributions
 your+0.149
 same+0.141
 their+0.133
 laundry+0.121
 our+0.114
Neg logit contributions
hell-0.646
es-0.639
lor-0.638
oped-0.636
umbs-0.633
Token for
Token ID329
Feature activation+1.42e-01

Pos logit contributions
es+0.103
hell+0.101
qu+0.095
oot+0.095
r+0.094
Neg logit contributions
 laundry-0.051
 port-0.051
 same-0.051
 task-0.046
 table-0.046
Token help
Token ID1037
Feature activation+3.67e-02

Pos logit contributions
 same+0.143
 port+0.136
 laundry+0.131
 table+0.128
 computer+0.128
Neg logit contributions
hell-0.363
es-0.362
qu-0.355
lor-0.351
oot-0.342
Token when
Token ID618
Feature activation+7.45e-02

Pos logit contributions
 your+0.072
 their+0.069
 same+0.069
 laundry+0.067
 our+0.065
Neg logit contributions
hell-0.072
oped-0.070
lor-0.069
umbs-0.067
rot-0.066
Token you
Token ID345
Feature activation+3.53e-02

Pos logit contributions
 computer+0.103
 same+0.102
 laundry+0.098
 port+0.098
 table+0.095
Neg logit contributions
es-0.156
qu-0.147
uts-0.139
hell-0.138
lor-0.138
Token relieved
Token ID22598
Feature activation-3.66e-03

Pos logit contributions
hell+0.041
lor+0.040
oped+0.040
umbs+0.040
qu+0.039
Neg logit contributions
 your-0.029
 their-0.029
 same-0.029
 laundry-0.028
 our-0.026
Token that
Token ID326
Feature activation+1.09e-01

Pos logit contributions
hell+0.007
es+0.007
lor+0.007
oped+0.007
umbs+0.006
Neg logit contributions
 your-0.007
 laundry-0.007
 same-0.007
 their-0.006
 port-0.006
Token he
Token ID339
Feature activation+2.64e-02

Pos logit contributions
 same+0.124
 laundry+0.118
 port+0.117
 your+0.113
 computer+0.110
Neg logit contributions
hell-0.275
es-0.269
qu-0.256
hed-0.256
lor-0.255
Token was
Token ID373
Feature activation+1.14e-01

Pos logit contributions
 your+0.044
 same+0.043
 laundry+0.043
 their+0.043
 our+0.040
Neg logit contributions
hell-0.056
oped-0.053
es-0.053
lor-0.053
qu-0.052
Token able
Token ID1498
Feature activation+2.83e-02

Pos logit contributions
 your+0.165
 same+0.162
 their+0.160
 laundry+0.159
 our+0.148
Neg logit contributions
hell-0.267
lor-0.257
oped-0.256
es-0.255
umbs-0.253
Token to
Token ID284
Feature activation+2.03e-01

Pos logit contributions
 laundry+0.035
 same+0.034
 port+0.034
 computer+0.032
 table+0.032
Neg logit contributions
hell-0.069
es-0.067
oped-0.065
lor-0.065
umbs-0.065
Token escape
Token ID6654
Feature activation-9.94e-02

Pos logit contributions
 your+0.194
 same+0.194
 laundry+0.188
 their+0.188
 our+0.168
Neg logit contributions
hell-0.561
lor-0.543
qu-0.539
oped-0.539
es-0.538
Token.
Token ID13
Feature activation+3.60e-03

Pos logit contributions
hell+0.190
es+0.181
lor+0.174
qu+0.173
oped+0.171
Neg logit contributions
 laundry-0.178
 same-0.178
 your-0.173
 their-0.169
 dad-0.167
Token He
Token ID679
Feature activation-1.59e-02

Pos logit contributions
 your+0.006
 their+0.005
 same+0.005
 laundry+0.005
 dad+0.005
Neg logit contributions
hell-0.008
es-0.008
oped-0.008
lor-0.008
qu-0.008
Token would
Token ID561
Feature activation+7.29e-02

Pos logit contributions
hell+0.032
es+0.031
oped+0.031
lor+0.030
hed+0.030
Neg logit contributions
 laundry-0.028
 your-0.028
 same-0.028
 their-0.027
 computer-0.026
Token never
Token ID1239
Feature activation+3.22e-02

Pos logit contributions
 laundry+0.088
 same+0.086
 your+0.085
 their+0.082
 computer+0.078
Neg logit contributions
hell-0.183
oped-0.175
lor-0.175
es-0.174
qu-0.173
Token the
Token ID262
Feature activation+1.68e-01

Pos logit contributions
 laundry+0.077
 same+0.076
 your+0.072
 port+0.069
 their+0.066
Neg logit contributions
hell-0.243
es-0.236
qu-0.230
lor-0.227
umbs-0.226
Token noise
Token ID7838
Feature activation-1.53e-01

Pos logit contributions
 your+0.209
 their+0.204
 laundry+0.196
 same+0.183
 port+0.183
Neg logit contributions
hell-0.418
rot-0.391
umbs-0.388
qu-0.386
lor-0.385
Token.
Token ID13
Feature activation+1.51e-02

Pos logit contributions
hell+0.291
lor+0.290
oped+0.285
umbs+0.276
rot+0.274
Neg logit contributions
 your-0.289
 their-0.287
 same-0.281
 laundry-0.273
 our-0.267
Token She
Token ID1375
Feature activation-8.09e-03

Pos logit contributions
 same+0.021
 laundry+0.021
 your+0.021
 their+0.020
 our+0.019
Neg logit contributions
hell-0.035
es-0.034
lor-0.034
oped-0.034
umbs-0.034
Token comes
Token ID2058
Feature activation+3.12e-02

Pos logit contributions
es+0.018
hell+0.018
oped+0.018
hed+0.017
lor+0.017
Neg logit contributions
 laundry-0.012
 same-0.011
 your-0.011
 their-0.011
 port-0.011
Token to
Token ID284
Feature activation+2.03e-01

Pos logit contributions
 your+0.035
 same+0.034
 their+0.034
 laundry+0.033
 our+0.030
Neg logit contributions
hell-0.084
lor-0.081
oped-0.081
es-0.080
umbs-0.080
Token the
Token ID262
Feature activation+1.59e-01

Pos logit contributions
 laundry+0.173
 same+0.166
 task+0.163
 computer+0.150
 port+0.149
Neg logit contributions
hell-0.528
qu-0.526
es-0.508
oc-0.502
r-0.495
Token room
Token ID2119
Feature activation-1.31e-01

Pos logit contributions
 your+0.182
 their+0.180
 laundry+0.159
 same+0.154
 our+0.154
Neg logit contributions
rot-0.381
hell-0.379
oc-0.370
lor-0.368
es-0.361
Token.
Token ID13
Feature activation+1.07e-02

Pos logit contributions
hell+0.249
lor+0.237
oped+0.237
es+0.235
umbs+0.232
Neg logit contributions
 your-0.246
 same-0.243
 their-0.242
 laundry-0.240
 our-0.227
Token She
Token ID1375
Feature activation-1.19e-02

Pos logit contributions
 same+0.017
 laundry+0.017
 your+0.017
 their+0.016
 port+0.016
Neg logit contributions
hell-0.022
es-0.022
oped-0.022
lor-0.021
umbs-0.021
Token sees
Token ID7224
Feature activation+6.39e-02

Pos logit contributions
es+0.026
oped+0.024
hed+0.023
hell+0.023
aw+0.022
Neg logit contributions
 job-0.018
 laundry-0.018
 task-0.018
 sunny-0.018
 port-0.018
Token was
Token ID373
Feature activation+1.19e-01

Pos logit contributions
hell+0.053
lor+0.052
oped+0.052
umbs+0.051
qu+0.050
Neg logit contributions
 your-0.044
 their-0.043
 dad-0.041
 same-0.040
 the-0.040
Token how
Token ID703
Feature activation+8.98e-02

Pos logit contributions
 laundry+0.164
 computer+0.149
 port+0.148
 your+0.146
 task+0.146
Neg logit contributions
hell-0.261
hed-0.254
es-0.253
oped-0.251
rol-0.250
Token John
Token ID1757
Feature activation-6.99e-02

Pos logit contributions
 laundry+0.111
 port+0.110
 table+0.110
 computer+0.108
 garage+0.107
Neg logit contributions
hell-0.209
es-0.204
qu-0.197
uts-0.195
oped-0.191
Token learned
Token ID4499
Feature activation+1.69e-02

Pos logit contributions
hell+0.152
lor+0.147
oped+0.146
es+0.145
umbs+0.144
Neg logit contributions
 your-0.111
 same-0.110
 their-0.109
 laundry-0.108
 our-0.101
Token not
Token ID407
Feature activation+9.19e-02

Pos logit contributions
 same+0.020
 laundry+0.019
 port+0.019
 your+0.019
 table+0.019
Neg logit contributions
hell-0.042
es-0.040
lor-0.039
oot-0.038
qu-0.038
Token to
Token ID284
Feature activation+2.03e-01

Pos logit contributions
 laundry+0.117
 stove+0.116
 port+0.114
 same+0.110
 table+0.108
Neg logit contributions
hell-0.226
oot-0.210
uts-0.209
es-0.207
qu-0.205
Token be
Token ID307
Feature activation+6.12e-02

Pos logit contributions
 your+0.166
 same+0.161
 their+0.157
 laundry+0.155
 our+0.135
Neg logit contributions
hell-0.607
lor-0.594
oped-0.594
es-0.591
umbs-0.586
Token so
Token ID523
Feature activation+1.23e-01

Pos logit contributions
 laundry+0.082
 stove+0.078
 table+0.078
 port+0.078
 same+0.077
Neg logit contributions
hell-0.145
es-0.141
oped-0.137
qu-0.135
hed-0.135
Token stupid
Token ID8531
Feature activation-3.59e-02

Pos logit contributions
 laundry+0.163
 same+0.154
 your+0.153
 table+0.152
 their+0.147
Neg logit contributions
hell-0.299
es-0.284
oped-0.284
qu-0.279
oot-0.278
Token.
Token ID13
Feature activation+1.17e-02

Pos logit contributions
hell+0.065
lor+0.063
oped+0.063
umbs+0.062
es+0.061
Neg logit contributions
 your-0.071
 their-0.070
 same-0.070
 laundry-0.069
 our-0.066
Token<|endoftext|>
Token ID50256
Feature activation-1.55e-01

Pos logit contributions
 your+0.019
 same+0.019
 their+0.019
 laundry+0.019
 our+0.018
Neg logit contributions
hell-0.025
oped-0.024
lor-0.024
es-0.024
umbs-0.023
Token again
Token ID757
Feature activation-1.20e-03

Pos logit contributions
hell+0.132
oped+0.127
lor+0.127
es+0.126
umbs+0.124
Neg logit contributions
 your-0.100
 same-0.099
 their-0.098
 laundry-0.096
 our-0.091
Token.
Token ID13
Feature activation+2.05e-02

Pos logit contributions
hell+0.002
es+0.002
lor+0.002
oped+0.002
qu+0.002
Neg logit contributions
 your-0.002
 same-0.002
 laundry-0.002
 their-0.002
 port-0.002
Token She
Token ID1375
Feature activation-1.57e-02

Pos logit contributions
 your+0.030
 same+0.030
 their+0.030
 laundry+0.029
 our+0.027
Neg logit contributions
hell-0.047
lor-0.045
oped-0.045
es-0.045
umbs-0.044
Token took
Token ID1718
Feature activation+3.82e-02

Pos logit contributions
es+0.032
hell+0.031
oped+0.031
hed+0.030
lor+0.030
Neg logit contributions
 laundry-0.027
 same-0.026
 your-0.026
 their-0.026
 port-0.025
Token it
Token ID340
Feature activation-3.07e-02

Pos logit contributions
 laundry+0.037
 port+0.037
 table+0.036
 same+0.036
 stove+0.034
Neg logit contributions
es-0.102
hell-0.101
hed-0.099
oped-0.096
umbs-0.096
Token to
Token ID284
Feature activation+2.03e-01

Pos logit contributions
hell+0.074
es+0.071
oped+0.070
lor+0.070
umbs+0.070
Neg logit contributions
 your-0.041
 same-0.041
 laundry-0.041
 their-0.040
 our-0.037
Token the
Token ID262
Feature activation+1.55e-01

Pos logit contributions
 laundry+0.186
 port+0.182
 same+0.180
 task+0.178
 computer+0.176
Neg logit contributions
hell-0.484
qu-0.479
es-0.479
lor-0.458
oot-0.453
Token kitchen
Token ID9592
Feature activation-1.16e-01

Pos logit contributions
 their+0.193
 your+0.187
 laundry+0.171
 our+0.169
 organize+0.161
Neg logit contributions
rot-0.363
hell-0.357
lor-0.343
pillar-0.336
oc-0.334
Token and
Token ID290
Feature activation+1.40e-01

Pos logit contributions
es+0.192
hell+0.188
umbs+0.182
lor+0.177
hed+0.176
Neg logit contributions
 your-0.224
 same-0.221
 laundry-0.218
 our-0.212
 their-0.209
Token put
Token ID1234
Feature activation-1.60e-02

Pos logit contributions
 same+0.267
 task+0.254
 port+0.252
 laundry+0.251
 computer+0.243
Neg logit contributions
es-0.241
hed-0.224
oped-0.209
oc-0.201
owed-0.201
Token some
Token ID617
Feature activation+7.07e-02

Pos logit contributions
hell+0.035
es+0.035
hed+0.034
oped+0.033
lor+0.032
Neg logit contributions
 laundry-0.023
 same-0.023
 port-0.023
 computer-0.022
 your-0.022
Token is
Token ID318
Feature activation+1.38e-01

Pos logit contributions
 laundry+0.141
 same+0.135
 your+0.135
 port+0.131
 their+0.130
Neg logit contributions
hell-0.255
es-0.249
oped-0.244
umbs-0.244
qu-0.241
Token that
Token ID326
Feature activation+1.04e-01

Pos logit contributions
 laundry+0.172
 port+0.157
 task+0.150
 computer+0.147
 stove+0.145
Neg logit contributions
hell-0.315
es-0.296
rot-0.294
qu-0.284
ez-0.282
Token it
Token ID340
Feature activation-3.18e-02

Pos logit contributions
 laundry+0.117
 port+0.117
 same+0.111
 table+0.104
 computer+0.104
Neg logit contributions
hell-0.247
es-0.243
qu-0.230
uts-0.225
rot-0.224
Token's
Token ID338
Feature activation+1.20e-01

Pos logit contributions
hell+0.069
es+0.067
qu+0.066
oped+0.065
hed+0.065
Neg logit contributions
 laundry-0.050
 your-0.048
 their-0.048
 same-0.047
 dad-0.046
Token ok
Token ID12876
Feature activation-3.60e-02

Pos logit contributions
 laundry+0.142
 port+0.136
 same+0.130
 computer+0.127
 table+0.126
Neg logit contributions
hell-0.290
es-0.276
rot-0.268
qu-0.267
uts-0.263
Token to
Token ID284
Feature activation+2.03e-01

Pos logit contributions
hell+0.067
es+0.064
lor+0.064
oped+0.063
qu+0.062
Neg logit contributions
 same-0.067
 laundry-0.067
 your-0.067
 their-0.066
 port-0.063
Token be
Token ID307
Feature activation+5.95e-02

Pos logit contributions
 your+0.174
 same+0.169
 their+0.163
 laundry+0.158
 our+0.143
Neg logit contributions
hell-0.602
lor-0.589
oped-0.589
es-0.589
umbs-0.582
Token excited
Token ID6568
Feature activation-4.61e-02

Pos logit contributions
 your+0.077
 same+0.077
 laundry+0.076
 their+0.075
 dad+0.069
Neg logit contributions
hell-0.147
es-0.141
oped-0.141
lor-0.141
umbs-0.139
Token when
Token ID618
Feature activation+7.53e-02

Pos logit contributions
hell+0.093
lor+0.089
oped+0.089
es+0.089
umbs+0.087
Neg logit contributions
 your-0.080
 same-0.080
 laundry-0.079
 their-0.079
 our-0.074
Token you
Token ID345
Feature activation+3.44e-02

Pos logit contributions
 same+0.094
 laundry+0.093
 computer+0.093
 task+0.086
 port+0.085
Neg logit contributions
es-0.168
qu-0.158
lor-0.154
hell-0.152
oped-0.152
Token win
Token ID1592
Feature activation-1.03e-01

Pos logit contributions
 same+0.044
 your+0.042
 their+0.040
 laundry+0.039
 wash+0.039
Neg logit contributions
hell-0.088
lor-0.087
es-0.084
umbs-0.083
oped-0.083
Token learned
Token ID4499
Feature activation+2.17e-02

Pos logit contributions
hell+0.097
es+0.093
oped+0.093
lor+0.093
umbs+0.091
Neg logit contributions
 your-0.078
 same-0.077
 laundry-0.076
 their-0.076
 our-0.071
Token that
Token ID326
Feature activation+1.12e-01

Pos logit contributions
 port+0.035
 same+0.033
 laundry+0.032
 table+0.032
 stove+0.032
Neg logit contributions
hell-0.046
oot-0.044
uts-0.043
es-0.042
qu-0.042
Token it
Token ID340
Feature activation-2.96e-02

Pos logit contributions
 same+0.163
 laundry+0.157
 port+0.154
 their+0.150
 your+0.148
Neg logit contributions
hell-0.244
oot-0.227
es-0.227
qu-0.226
uts-0.225
Token's
Token ID338
Feature activation+1.28e-01

Pos logit contributions
hell+0.063
es+0.060
oped+0.060
lor+0.060
qu+0.059
Neg logit contributions
 your-0.049
 same-0.048
 laundry-0.048
 their-0.048
 dad-0.045
Token better
Token ID1365
Feature activation-1.97e-02

Pos logit contributions
 port+0.156
 laundry+0.155
 table+0.154
 stove+0.148
 computer+0.144
Neg logit contributions
hell-0.318
es-0.293
qu-0.292
oot-0.291
rot-0.285
Token to
Token ID284
Feature activation+2.03e-01

Pos logit contributions
hell+0.048
es+0.048
lor+0.046
oped+0.045
oot+0.045
Neg logit contributions
 laundry-0.026
 same-0.024
 port-0.024
 their-0.024
 table-0.023
Token forgive
Token ID20927
Feature activation-9.01e-02

Pos logit contributions
 your+0.178
 same+0.172
 their+0.166
 laundry+0.162
 our+0.145
Neg logit contributions
hell-0.596
oped-0.586
es-0.585
lor-0.583
umbs-0.580
Token and
Token ID290
Feature activation+1.43e-01

Pos logit contributions
hell+0.218
es+0.210
lor+0.208
oped+0.206
qu+0.204
Neg logit contributions
 same-0.120
 your-0.119
 laundry-0.119
 their-0.117
 port-0.110
Token be
Token ID307
Feature activation+6.14e-02

Pos logit contributions
 port+0.177
 laundry+0.171
 computer+0.168
 stove+0.162
 same+0.160
Neg logit contributions
hell-0.321
qu-0.318
es-0.316
oot-0.311
lor-0.306
Token kind
Token ID1611
Feature activation-9.72e-02

Pos logit contributions
 laundry+0.077
 same+0.076
 your+0.073
 port+0.072
 their+0.071
Neg logit contributions
hell-0.151
es-0.147
lor-0.144
qu-0.144
oped-0.143
Token,
Token ID11
Feature activation+9.11e-02

Pos logit contributions
oped+0.261
hell+0.260
lor+0.255
qu+0.251
rol+0.250
Neg logit contributions
 your-0.122
 their-0.117
 same-0.116
 dad-0.107
 our-0.107
Token learned
Token ID4499
Feature activation+3.75e-02

Pos logit contributions
 same+0.150
 laundry+0.139
 port+0.139
 your+0.134
 table+0.130
Neg logit contributions
es-0.244
hell-0.240
oc-0.229
ez-0.220
hed-0.218
Token that
Token ID326
Feature activation+1.15e-01

Pos logit contributions
 port+0.054
 same+0.053
 laundry+0.053
 table+0.053
 stove+0.051
Neg logit contributions
hell-0.083
oot-0.079
es-0.077
qu-0.077
uts-0.076
Token it
Token ID340
Feature activation-2.91e-02

Pos logit contributions
 same+0.164
 port+0.152
 laundry+0.151
 table+0.142
 task+0.139
Neg logit contributions
hell-0.245
es-0.244
qu-0.233
uts-0.230
oot-0.226
Token's
Token ID338
Feature activation+1.29e-01

Pos logit contributions
hell+0.062
es+0.059
lor+0.059
oped+0.059
qu+0.058
Neg logit contributions
 your-0.048
 same-0.047
 laundry-0.047
 their-0.046
 our-0.043
Token important
Token ID1593
Feature activation+3.48e-03

Pos logit contributions
 laundry+0.143
 port+0.141
 table+0.139
 same+0.133
 computer+0.130
Neg logit contributions
hell-0.332
es-0.315
qu-0.309
oot-0.306
rot-0.305
Token to
Token ID284
Feature activation+2.03e-01

Pos logit contributions
 laundry+0.004
 port+0.004
 same+0.004
 their+0.004
 your+0.004
Neg logit contributions
hell-0.009
es-0.009
hed-0.008
lor-0.008
oped-0.008
Token help
Token ID1037
Feature activation+3.87e-02

Pos logit contributions
 same+0.172
 laundry+0.170
 your+0.168
 their+0.166
 port+0.154
Neg logit contributions
hell-0.572
qu-0.554
lor-0.551
oot-0.543
oped-0.541
Token others
Token ID1854
Feature activation-1.01e-01

Pos logit contributions
 port+0.040
 same+0.040
 laundry+0.040
 table+0.037
 bathroom+0.036
Neg logit contributions
es-0.100
hell-0.099
qu-0.095
oot-0.093
lor-0.092
Token when
Token ID618
Feature activation+7.55e-02

Pos logit contributions
hell+0.195
es+0.188
lor+0.185
oped+0.184
qu+0.181
Neg logit contributions
 same-0.180
 your-0.180
 laundry-0.178
 their-0.177
 dad-0.167
Token they
Token ID484
Feature activation+2.54e-02

Pos logit contributions
 same+0.097
 computer+0.096
 laundry+0.093
 table+0.090
 garage+0.087
Neg logit contributions
es-0.166
qu-0.160
uts-0.154
hell-0.153
oped-0.148
Token need
Token ID761
Feature activation+1.52e-03

Pos logit contributions
 your+0.033
 laundry+0.032
 same+0.032
 their+0.032
 our+0.030
Neg logit contributions
hell-0.062
oped-0.060
es-0.060
qu-0.059
lor-0.059
Token learned
Token ID4499
Feature activation+3.75e-02

Pos logit contributions
hell+0.168
umbs+0.160
es+0.160
oped+0.159
oc+0.159
Neg logit contributions
 laundry-0.058
 your-0.057
 same-0.056
 their-0.053
 math-0.048
Token that
Token ID326
Feature activation+1.15e-01

Pos logit contributions
 port+0.054
 same+0.053
 laundry+0.053
 table+0.053
 stove+0.051
Neg logit contributions
hell-0.083
oot-0.079
es-0.077
qu-0.077
uts-0.076
Token it
Token ID340
Feature activation-2.91e-02

Pos logit contributions
 same+0.164
 port+0.152
 laundry+0.151
 table+0.142
 task+0.139
Neg logit contributions
hell-0.245
es-0.244
qu-0.233
uts-0.230
oot-0.226
Token's
Token ID338
Feature activation+1.29e-01

Pos logit contributions
hell+0.062
es+0.059
lor+0.059
oped+0.059
qu+0.058
Neg logit contributions
 your-0.048
 same-0.047
 laundry-0.047
 their-0.046
 our-0.043
Token important
Token ID1593
Feature activation+3.48e-03

Pos logit contributions
 laundry+0.143
 port+0.141
 table+0.139
 same+0.133
 computer+0.130
Neg logit contributions
hell-0.332
es-0.315
qu-0.309
oot-0.306
rot-0.305
Token to
Token ID284
Feature activation+2.03e-01

Pos logit contributions
 laundry+0.004
 port+0.004
 same+0.004
 their+0.004
 your+0.004
Neg logit contributions
hell-0.009
es-0.009
hed-0.008
lor-0.008
oped-0.008
Token be
Token ID307
Feature activation+6.01e-02

Pos logit contributions
 same+0.172
 laundry+0.170
 your+0.168
 their+0.166
 port+0.154
Neg logit contributions
hell-0.572
qu-0.554
lor-0.551
oot-0.543
oped-0.541
Token careful
Token ID8161
Feature activation-6.15e-02

Pos logit contributions
 laundry+0.077
 same+0.076
 table+0.072
 port+0.072
 your+0.071
Neg logit contributions
hell-0.145
es-0.143
lor-0.138
qu-0.138
oped-0.137
Token with
Token ID351
Feature activation+1.27e-01

Pos logit contributions
hell+0.131
lor+0.126
oped+0.126
umbs+0.124
es+0.124
Neg logit contributions
 your-0.103
 same-0.101
 their-0.100
 laundry-0.099
 our-0.094
Token expensive
Token ID5789
Feature activation-4.49e-02

Pos logit contributions
 port+0.130
 same+0.128
 laundry+0.121
 table+0.119
 computer+0.112
Neg logit contributions
hell-0.334
es-0.329
lor-0.307
qu-0.306
oot-0.302
Token things
Token ID1243
Feature activation-9.71e-02

Pos logit contributions
hell+0.115
es+0.111
hed+0.107
oc+0.106
qu+0.106
Neg logit contributions
 same-0.050
 your-0.048
 their-0.047
 laundry-0.046
 port-0.045
Token learned
Token ID4499
Feature activation+3.75e-02

Pos logit contributions
hell+0.320
lor+0.306
oped+0.304
umbs+0.302
es+0.301
Neg logit contributions
 your-0.285
 same-0.280
 their-0.276
 laundry-0.273
 our-0.262
Token that
Token ID326
Feature activation+1.15e-01

Pos logit contributions
 port+0.054
 same+0.053
 laundry+0.053
 table+0.053
 stove+0.051
Neg logit contributions
hell-0.083
oot-0.079
es-0.077
qu-0.077
uts-0.076
Token it
Token ID340
Feature activation-2.91e-02

Pos logit contributions
 same+0.164
 port+0.152
 laundry+0.151
 table+0.142
 task+0.139
Neg logit contributions
hell-0.245
es-0.244
qu-0.233
uts-0.230
oot-0.226
Token's
Token ID338
Feature activation+1.29e-01

Pos logit contributions
hell+0.062
es+0.059
lor+0.059
oped+0.059
qu+0.058
Neg logit contributions
 your-0.048
 same-0.047
 laundry-0.047
 their-0.046
 our-0.043
Token important
Token ID1593
Feature activation+3.48e-03

Pos logit contributions
 laundry+0.143
 port+0.141
 table+0.139
 same+0.133
 computer+0.130
Neg logit contributions
hell-0.332
es-0.315
qu-0.309
oot-0.306
rot-0.305
Token to
Token ID284
Feature activation+2.03e-01

Pos logit contributions
 laundry+0.004
 port+0.004
 same+0.004
 their+0.004
 your+0.004
Neg logit contributions
hell-0.009
es-0.009
hed-0.008
lor-0.008
oped-0.008
Token be
Token ID307
Feature activation+6.01e-02

Pos logit contributions
 same+0.172
 laundry+0.170
 your+0.168
 their+0.166
 port+0.154
Neg logit contributions
hell-0.572
qu-0.554
lor-0.551
oot-0.543
oped-0.541
Token kind
Token ID1611
Feature activation-9.43e-02

Pos logit contributions
 laundry+0.077
 same+0.076
 table+0.072
 port+0.072
 your+0.071
Neg logit contributions
hell-0.145
es-0.143
lor-0.138
qu-0.138
oped-0.137
Token and
Token ID290
Feature activation+1.41e-01

Pos logit contributions
hell+0.219
oped+0.213
lor+0.212
qu+0.209
umbs+0.207
Neg logit contributions
 your-0.142
 same-0.139
 their-0.138
 laundry-0.134
 our-0.128
Token helpful
Token ID7613
Feature activation-6.94e-02

Pos logit contributions
 port+0.184
 laundry+0.180
 stove+0.176
 table+0.175
 TV+0.173
Neg logit contributions
es-0.300
qu-0.296
hell-0.295
uts-0.293
aring-0.293
Token to
Token ID284
Feature activation+1.98e-01

Pos logit contributions
hell+0.142
es+0.141
lor+0.135
oped+0.134
qu+0.134
Neg logit contributions
 laundry-0.113
 same-0.112
 your-0.112
 their-0.110
 port-0.107
Token learned
Token ID4499
Feature activation+3.75e-02

Pos logit contributions
 your+0.016
 their+0.015
 same+0.015
 laundry+0.015
 our+0.014
Neg logit contributions
hell-0.042
lor-0.041
oped-0.040
umbs-0.040
es-0.040
Token that
Token ID326
Feature activation+1.15e-01

Pos logit contributions
 port+0.054
 same+0.053
 laundry+0.053
 table+0.053
 stove+0.051
Neg logit contributions
hell-0.083
oot-0.079
es-0.077
qu-0.077
uts-0.076
Token it
Token ID340
Feature activation-2.91e-02

Pos logit contributions
 same+0.164
 port+0.152
 laundry+0.151
 table+0.142
 task+0.139
Neg logit contributions
hell-0.245
es-0.244
qu-0.233
uts-0.230
oot-0.226
Token's
Token ID338
Feature activation+1.29e-01

Pos logit contributions
hell+0.062
es+0.059
lor+0.059
oped+0.059
qu+0.058
Neg logit contributions
 your-0.048
 same-0.047
 laundry-0.047
 their-0.046
 our-0.043
Token important
Token ID1593
Feature activation+3.48e-03

Pos logit contributions
 laundry+0.143
 port+0.141
 table+0.139
 same+0.133
 computer+0.130
Neg logit contributions
hell-0.332
es-0.315
qu-0.309
oot-0.306
rot-0.305
Token to
Token ID284
Feature activation+2.03e-01

Pos logit contributions
 laundry+0.004
 port+0.004
 same+0.004
 their+0.004
 your+0.004
Neg logit contributions
hell-0.009
es-0.009
hed-0.008
lor-0.008
oped-0.008
Token be
Token ID307
Feature activation+6.01e-02

Pos logit contributions
 same+0.172
 laundry+0.170
 your+0.168
 their+0.166
 port+0.154
Neg logit contributions
hell-0.572
qu-0.554
lor-0.551
oot-0.543
oped-0.541
Token careful
Token ID8161
Feature activation-6.15e-02

Pos logit contributions
 laundry+0.077
 same+0.076
 table+0.072
 port+0.072
 your+0.071
Neg logit contributions
hell-0.145
es-0.143
lor-0.138
qu-0.138
oped-0.137
Token around
Token ID1088
Feature activation+3.46e-02

Pos logit contributions
hell+0.131
lor+0.126
oped+0.126
umbs+0.124
es+0.124
Neg logit contributions
 your-0.103
 same-0.101
 their-0.100
 laundry-0.099
 our-0.094
Token strangers
Token ID18981
Feature activation-1.12e-01

Pos logit contributions
 same+0.046
 laundry+0.044
 dad+0.043
 table+0.042
 port+0.041
Neg logit contributions
es-0.081
hell-0.078
qu-0.075
oc-0.074
hed-0.073
Token,
Token ID11
Feature activation+8.76e-02

Pos logit contributions
hell+0.234
lor+0.233
umbs+0.231
oped+0.228
qu+0.222
Neg logit contributions
 your-0.200
 their-0.194
 same-0.191
 our-0.182
 the-0.180
Token learned
Token ID4499
Feature activation+3.75e-02

Pos logit contributions
hell+0.266
lor+0.256
oped+0.255
qu+0.246
es+0.246
Neg logit contributions
 your-0.249
 their-0.246
 same-0.242
 laundry-0.237
 dad-0.230
Token that
Token ID326
Feature activation+1.15e-01

Pos logit contributions
 port+0.054
 same+0.053
 laundry+0.053
 table+0.053
 stove+0.051
Neg logit contributions
hell-0.083
oot-0.079
es-0.077
qu-0.077
uts-0.076
Token it
Token ID340
Feature activation-2.91e-02

Pos logit contributions
 same+0.164
 port+0.152
 laundry+0.151
 table+0.142
 task+0.139
Neg logit contributions
hell-0.245
es-0.244
qu-0.233
uts-0.230
oot-0.226
Token's
Token ID338
Feature activation+1.29e-01

Pos logit contributions
hell+0.062
es+0.059
lor+0.059
oped+0.059
qu+0.058
Neg logit contributions
 your-0.048
 same-0.047
 laundry-0.047
 their-0.046
 our-0.043
Token important
Token ID1593
Feature activation+3.48e-03

Pos logit contributions
 laundry+0.143
 port+0.141
 table+0.139
 same+0.133
 computer+0.130
Neg logit contributions
hell-0.332
es-0.315
qu-0.309
oot-0.306
rot-0.305
Token to
Token ID284
Feature activation+2.03e-01

Pos logit contributions
 laundry+0.004
 port+0.004
 same+0.004
 their+0.004
 your+0.004
Neg logit contributions
hell-0.009
es-0.009
hed-0.008
lor-0.008
oped-0.008
Token eat
Token ID4483
Feature activation-3.94e-02

Pos logit contributions
 same+0.172
 laundry+0.170
 your+0.168
 their+0.166
 port+0.154
Neg logit contributions
hell-0.572
qu-0.554
lor-0.551
oot-0.543
oped-0.541
Token healthy
Token ID5448
Feature activation-4.92e-02

Pos logit contributions
hell+0.094
es+0.090
qu+0.085
hed+0.085
oc+0.083
Neg logit contributions
 same-0.053
 laundry-0.051
 port-0.050
 dad-0.048
 computer-0.047
Token food
Token ID2057
Feature activation-1.68e-01

Pos logit contributions
hell+0.123
es+0.119
umbs+0.116
hed+0.116
lor+0.115
Neg logit contributions
 same-0.060
 your-0.060
 laundry-0.058
 their-0.058
 port-0.055
Token and
Token ID290
Feature activation+1.38e-01

Pos logit contributions
hell+0.313
es+0.299
lor+0.294
oped+0.290
qu+0.288
Neg logit contributions
 same-0.318
 your-0.315
 laundry-0.313
 their-0.309
 dad-0.294
Token not
Token ID407
Feature activation+8.28e-02

Pos logit contributions
 port+0.160
 computer+0.158
 same+0.154
 laundry+0.152
 TV+0.137
Neg logit contributions
hell-0.314
es-0.307
oc-0.296
qu-0.296
uts-0.295
Token that
Token ID326
Feature activation+1.13e-01

Pos logit contributions
 port+0.037
 same+0.037
 laundry+0.035
 table+0.035
 task+0.034
Neg logit contributions
hell-0.047
oot-0.044
es-0.044
uts-0.043
lor-0.043
Token it
Token ID340
Feature activation-2.81e-02

Pos logit contributions
 same+0.167
 port+0.157
 laundry+0.151
 your+0.145
 task+0.143
Neg logit contributions
hell-0.231
es-0.229
qu-0.226
uts-0.221
oot-0.217
Token's
Token ID338
Feature activation+1.28e-01

Pos logit contributions
hell+0.058
es+0.056
oped+0.055
lor+0.055
qu+0.054
Neg logit contributions
 your-0.048
 laundry-0.047
 their-0.047
 same-0.047
 dad-0.044
Token not
Token ID407
Feature activation+8.77e-02

Pos logit contributions
 port+0.170
 table+0.169
 laundry+0.165
 stove+0.160
 computer+0.158
Neg logit contributions
hell-0.295
es-0.278
qu-0.275
oot-0.272
rot-0.269
Token good
Token ID922
Feature activation+4.48e-03

Pos logit contributions
 port+0.116
 stove+0.111
 table+0.109
 bathroom+0.108
 laundry+0.107
Neg logit contributions
hell-0.205
qu-0.186
oot-0.185
es-0.184
lier-0.180
Token to
Token ID284
Feature activation+2.02e-01

Pos logit contributions
 port+0.006
 laundry+0.006
 same+0.006
 table+0.006
 your+0.006
Neg logit contributions
hell-0.010
es-0.010
hed-0.010
uts-0.010
umbs-0.009
Token take
Token ID1011
Feature activation+4.01e-02

Pos logit contributions
 your+0.159
 same+0.153
 their+0.149
 laundry+0.144
 our+0.127
Neg logit contributions
hell-0.616
lor-0.609
es-0.606
oped-0.605
umbs-0.598
Token things
Token ID1243
Feature activation-9.83e-02

Pos logit contributions
 laundry+0.045
 same+0.045
 port+0.044
 your+0.043
 their+0.043
Neg logit contributions
hell-0.103
es-0.100
qu-0.096
hed-0.096
oped-0.096
Token that
Token ID326
Feature activation+1.06e-01

Pos logit contributions
hell+0.203
es+0.193
oped+0.192
lor+0.192
qu+0.189
Neg logit contributions
 same-0.166
 your-0.165
 laundry-0.163
 their-0.162
 dad-0.152
Token belong
Token ID5594
Feature activation-2.28e-02

Pos logit contributions
 same+0.171
 laundry+0.171
 port+0.168
 your+0.167
 their+0.166
Neg logit contributions
hell-0.218
es-0.206
hed-0.197
qu-0.196
uts-0.196
Token to
Token ID284
Feature activation+1.97e-01

Pos logit contributions
hell+0.056
es+0.055
lor+0.053
oped+0.052
hed+0.052
Neg logit contributions
 laundry-0.029
 same-0.027
 your-0.027
 table-0.026
 their-0.026
Token.
Token ID13
Feature activation+2.37e-02

Pos logit contributions
 your+0.037
 their+0.037
 same+0.037
 laundry+0.036
 dad+0.034
Neg logit contributions
hell-0.050
lor-0.049
oped-0.048
umbs-0.048
es-0.048
Token He
Token ID679
Feature activation-8.38e-03

Pos logit contributions
 your+0.033
 same+0.033
 their+0.033
 laundry+0.032
 our+0.030
Neg logit contributions
hell-0.056
lor-0.054
oped-0.054
es-0.054
umbs-0.053
Token was
Token ID373
Feature activation+1.10e-01

Pos logit contributions
es+0.017
hell+0.017
oped+0.016
hed+0.016
oc+0.015
Neg logit contributions
 laundry-0.014
 same-0.014
 your-0.014
 their-0.013
 port-0.013
Token never
Token ID1239
Feature activation+3.16e-02

Pos logit contributions
 laundry+0.155
 same+0.147
 your+0.146
 task+0.146
 port+0.144
Neg logit contributions
hell-0.243
es-0.240
oped-0.235
hed-0.234
rol-0.232
Token able
Token ID1498
Feature activation+2.74e-02

Pos logit contributions
 laundry+0.059
 your+0.058
 same+0.057
 their+0.057
 port+0.056
Neg logit contributions
hell-0.059
oped-0.055
es-0.055
qu-0.054
hed-0.054
Token to
Token ID284
Feature activation+2.02e-01

Pos logit contributions
 laundry+0.034
 same+0.032
 port+0.032
 your+0.030
 their+0.030
Neg logit contributions
hell-0.068
es-0.065
qu-0.063
oot-0.063
oped-0.063
Token speak
Token ID2740
Feature activation-9.53e-02

Pos logit contributions
 your+0.185
 same+0.181
 their+0.177
 laundry+0.177
 our+0.156
Neg logit contributions
hell-0.578
lor-0.558
oped-0.558
es-0.556
qu-0.551
Token again
Token ID757
Feature activation-3.20e-02

Pos logit contributions
hell+0.196
es+0.187
lor+0.185
oped+0.184
qu+0.181
Neg logit contributions
 laundry-0.160
 your-0.160
 same-0.160
 their-0.156
 dad-0.147
Token,
Token ID11
Feature activation+8.86e-02

Pos logit contributions
hell+0.055
oped+0.053
lor+0.052
umbs+0.051
es+0.051
Neg logit contributions
 your-0.067
 same-0.065
 their-0.065
 laundry-0.064
 our-0.062
Token even
Token ID772
Feature activation+4.70e-02

Pos logit contributions
 laundry+0.149
 same+0.141
 your+0.138
 port+0.138
 their+0.136
Neg logit contributions
hell-0.180
es-0.172
oped-0.168
lor-0.166
hed-0.165
Token though
Token ID996
Feature activation+2.15e-02

Pos logit contributions
 laundry+0.093
 same+0.087
 port+0.087
 their+0.086
 stove+0.085
Neg logit contributions
es-0.080
hell-0.079
oc-0.074
qu-0.073
oped-0.073
Token hears
Token ID23180
Feature activation+8.59e-02

Pos logit contributions
lor+0.358
oped+0.345
hell+0.338
rol+0.337
qu+0.334
Neg logit contributions
 your-0.243
 their-0.227
 our-0.217
 Ben-0.214
 my-0.211
Token them
Token ID606
Feature activation-1.37e-02

Pos logit contributions
 laundry+0.077
 same+0.076
 your+0.072
 port+0.069
 their+0.066
Neg logit contributions
hell-0.243
es-0.236
qu-0.230
lor-0.227
umbs-0.226
Token.
Token ID13
Feature activation+1.58e-02

Pos logit contributions
lor+0.027
hell+0.027
oped+0.027
umbs+0.026
rot+0.026
Neg logit contributions
 their-0.025
 your-0.025
 same-0.025
 laundry-0.023
 our-0.023
Token She
Token ID1375
Feature activation-1.11e-02

Pos logit contributions
 your+0.021
 same+0.021
 laundry+0.020
 their+0.020
 our+0.018
Neg logit contributions
hell-0.039
oped-0.037
lor-0.037
es-0.037
umbs-0.037
Token comes
Token ID2058
Feature activation+3.17e-02

Pos logit contributions
es+0.026
hell+0.025
oped+0.025
hed+0.024
lor+0.024
Neg logit contributions
 laundry-0.016
 same-0.015
 your-0.015
 their-0.015
 task-0.015
Token to
Token ID284
Feature activation+2.02e-01

Pos logit contributions
 your+0.037
 same+0.036
 their+0.035
 laundry+0.035
 our+0.032
Neg logit contributions
hell-0.083
lor-0.080
oped-0.080
es-0.080
umbs-0.079
Token the
Token ID262
Feature activation+1.60e-01

Pos logit contributions
 laundry+0.195
 same+0.184
 task+0.176
 your+0.171
 computer+0.169
Neg logit contributions
hell-0.521
qu-0.515
es-0.499
rot-0.487
oc-0.484
Token living
Token ID2877
Feature activation+3.18e-02

Pos logit contributions
 your+0.186
 their+0.182
 laundry+0.169
 task+0.162
 same+0.161
Neg logit contributions
rot-0.373
hell-0.372
lor-0.357
es-0.349
oc-0.348
Token room
Token ID2119
Feature activation-1.32e-01

Pos logit contributions
 laundry+0.026
 same+0.025
 task+0.024
 math+0.024
 your+0.024
Neg logit contributions
es-0.088
lor-0.085
hed-0.084
hell-0.083
rot-0.083
Token.
Token ID13
Feature activation+9.21e-03

Pos logit contributions
hell+0.252
lor+0.240
oped+0.240
es+0.238
umbs+0.235
Neg logit contributions
 your-0.246
 same-0.244
 their-0.242
 laundry-0.240
 our-0.227
Token She
Token ID1375
Feature activation-1.50e-02

Pos logit contributions
 same+0.015
 laundry+0.015
 your+0.014
 their+0.014
 our+0.014
Neg logit contributions
hell-0.019
es-0.019
oped-0.019
lor-0.019
qu-0.018
Token that
Token ID326
Feature activation+1.05e-01

Pos logit contributions
hell+0.057
lor+0.054
oped+0.054
umbs+0.053
es+0.053
Neg logit contributions
 your-0.047
 same-0.046
 their-0.046
 laundry-0.045
 our-0.043
Token he
Token ID339
Feature activation+2.64e-02

Pos logit contributions
 laundry+0.131
 same+0.131
 port+0.130
 computer+0.122
 their+0.120
Neg logit contributions
hell-0.241
es-0.234
qu-0.219
hed-0.218
lor-0.214
Token had
Token ID550
Feature activation+7.52e-02

Pos logit contributions
 your+0.044
 laundry+0.044
 same+0.043
 their+0.043
 our+0.041
Neg logit contributions
hell-0.055
es-0.053
oped-0.052
lor-0.052
qu-0.051
Token been
Token ID587
Feature activation+7.48e-02

Pos logit contributions
 laundry+0.104
 port+0.102
 your+0.100
 same+0.099
 their+0.099
Neg logit contributions
hell-0.169
es-0.166
hed-0.160
lor-0.159
oped-0.158
Token able
Token ID1498
Feature activation+2.62e-02

Pos logit contributions
 laundry+0.105
 same+0.103
 your+0.099
 port+0.099
 table+0.098
Neg logit contributions
es-0.167
hell-0.166
oped-0.163
lor-0.160
hed-0.159
Token to
Token ID284
Feature activation+2.02e-01

Pos logit contributions
 laundry+0.036
 same+0.034
 port+0.034
 computer+0.032
 stove+0.032
Neg logit contributions
hell-0.060
es-0.058
uts-0.056
oot-0.056
umbs-0.056
Token climb
Token ID12080
Feature activation-7.81e-02

Pos logit contributions
 your+0.191
 same+0.191
 laundry+0.186
 their+0.185
 our+0.167
Neg logit contributions
hell-0.557
lor-0.538
qu-0.537
es-0.532
oped-0.530
Token to
Token ID284
Feature activation+1.97e-01

Pos logit contributions
hell+0.209
es+0.202
oped+0.198
hed+0.198
lor+0.197
Neg logit contributions
 laundry-0.080
 same-0.077
 your-0.076
 dad-0.073
 their-0.072
Token the
Token ID262
Feature activation+1.53e-01

Pos logit contributions
 laundry+0.148
 computer+0.139
 same+0.137
 stove+0.130
 your+0.128
Neg logit contributions
hell-0.541
oc-0.522
qu-0.519
es-0.516
lor-0.512
Token top
Token ID1353
Feature activation-9.95e-02

Pos logit contributions
 their+0.215
 your+0.198
 our+0.183
 laundry+0.172
 dad+0.168
Neg logit contributions
hell-0.345
rot-0.333
lor-0.331
pillar-0.318
qu-0.318
Token.
Token ID13
Feature activation+1.30e-03

Pos logit contributions
hell+0.153
lor+0.144
oped+0.144
es+0.143
umbs+0.141
Neg logit contributions
 your-0.222
 same-0.220
 their-0.218
 laundry-0.218
 our-0.208
Token "\n"
Token ID198
Feature activation-1.48e-02

Pos logit contributions
 their+0.048
 your+0.048
 laundry+0.047
 dad+0.046
 same+0.045
Neg logit contributions
hell-0.042
es-0.042
lor-0.040
rot-0.040
qu-0.040
Token "\n"
Token ID198
Feature activation-2.34e-02

Pos logit contributions
hell+0.030
es+0.028
oped+0.028
umbs+0.028
lor+0.028
Neg logit contributions
 your-0.027
 their-0.027
 same-0.025
 our-0.025
 laundry-0.025
TokenJohnny
Token ID44960
Feature activation-1.16e-01

Pos logit contributions
hell+0.046
es+0.044
oped+0.043
umbs+0.043
lor+0.043
Neg logit contributions
 your-0.044
 their-0.043
 same-0.041
 laundry-0.041
 our-0.040
Token held
Token ID2714
Feature activation+8.61e-03

Pos logit contributions
hell+0.238
oped+0.230
es+0.229
lor+0.228
qu+0.224
Neg logit contributions
 your-0.196
 same-0.194
 laundry-0.193
 their-0.192
 our-0.180
Token on
Token ID319
Feature activation+1.43e-01

Pos logit contributions
 same+0.009
 laundry+0.009
 task+0.008
 table+0.008
 bathroom+0.008
Neg logit contributions
es-0.022
hed-0.021
oped-0.021
hell-0.021
oot-0.020
Token to
Token ID284
Feature activation+2.02e-01

Pos logit contributions
 laundry+0.169
 same+0.163
 port+0.154
 your+0.151
 task+0.142
Neg logit contributions
hell-0.348
es-0.343
oped-0.336
oc-0.333
oot-0.332
Token the
Token ID262
Feature activation+1.54e-01

Pos logit contributions
 laundry+0.176
 same+0.167
 task+0.164
 stove+0.151
 computer+0.144
Neg logit contributions
hell-0.475
oc-0.474
es-0.469
oot-0.460
r-0.458
Token bag
Token ID6131
Feature activation-2.08e-01

Pos logit contributions
 your+0.192
 their+0.190
 our+0.173
 same+0.167
 organize+0.157
Neg logit contributions
hell-0.363
rot-0.351
lor-0.330
pillar-0.324
oped-0.322
Token tight
Token ID5381
Feature activation-6.03e-02

Pos logit contributions
hell+0.423
lor+0.407
oped+0.405
qu+0.400
es+0.400
Neg logit contributions
 your-0.362
 their-0.359
 same-0.358
 laundry-0.353
 dad-0.335
Token and
Token ID290
Feature activation+1.40e-01

Pos logit contributions
hell+0.121
lor+0.116
oped+0.115
umbs+0.114
qu+0.113
Neg logit contributions
 your-0.108
 their-0.106
 same-0.106
 laundry-0.105
 dad-0.099
Token they
Token ID484
Feature activation+1.98e-02

Pos logit contributions
 same+0.246
 laundry+0.236
 your+0.234
 port+0.231
 math+0.224
Neg logit contributions
es-0.273
oped-0.270
hed-0.261
hell-0.260
oot-0.250
Token better
Token ID1365
Feature activation-6.97e-03

Pos logit contributions
 your+0.036
 same+0.035
 laundry+0.035
 their+0.035
 port+0.032
Neg logit contributions
hell-0.063
qu-0.059
umbs-0.059
oped-0.058
es-0.058
Token.
Token ID13
Feature activation+1.83e-02

Pos logit contributions
hell+0.011
es+0.011
lor+0.010
qu+0.010
oped+0.010
Neg logit contributions
 same-0.014
 your-0.014
 laundry-0.014
 their-0.014
 port-0.014
Token He
Token ID679
Feature activation-7.88e-03

Pos logit contributions
 your+0.026
 same+0.026
 their+0.026
 laundry+0.025
 our+0.024
Neg logit contributions
hell-0.043
lor-0.041
oped-0.041
es-0.041
umbs-0.040
Token was
Token ID373
Feature activation+1.13e-01

Pos logit contributions
hell+0.016
es+0.016
oped+0.015
hed+0.015
lor+0.015
Neg logit contributions
 laundry-0.014
 same-0.013
 your-0.013
 their-0.013
 port-0.013
Token able
Token ID1498
Feature activation+2.85e-02

Pos logit contributions
 laundry+0.163
 same+0.158
 your+0.154
 port+0.152
 their+0.150
Neg logit contributions
hell-0.253
es-0.244
oped-0.241
lor-0.240
hed-0.239
Token to
Token ID284
Feature activation+2.02e-01

Pos logit contributions
 laundry+0.039
 port+0.039
 same+0.037
 computer+0.037
 stove+0.036
Neg logit contributions
hell-0.064
es-0.061
umbs-0.061
uts-0.061
qu-0.059
Token perform
Token ID1620
Feature activation-1.10e-01

Pos logit contributions
 same+0.200
 your+0.196
 their+0.191
 laundry+0.191
 computer+0.173
Neg logit contributions
hell-0.547
qu-0.529
lor-0.525
es-0.519
oped-0.518
Token on
Token ID319
Feature activation+1.37e-01

Pos logit contributions
hell+0.265
es+0.262
lor+0.243
hed+0.240
qu+0.237
Neg logit contributions
 laundry-0.140
 same-0.138
 your-0.130
 port-0.127
 their-0.127
Token the
Token ID262
Feature activation+1.52e-01

Pos logit contributions
 same+0.094
 laundry+0.093
 port+0.084
 task+0.080
 dad+0.080
Neg logit contributions
hell-0.394
es-0.388
hed-0.366
oc-0.366
lor-0.365
Token stage
Token ID3800
Feature activation-1.99e-01

Pos logit contributions
 their+0.177
 your+0.160
 Dad+0.157
 Ben+0.148
 dad+0.147
Neg logit contributions
hell-0.379
qu-0.356
rot-0.349
lor-0.345
pillar-0.331
Token and
Token ID290
Feature activation+1.41e-01

Pos logit contributions
hell+0.380
es+0.368
lor+0.361
oped+0.358
qu+0.354
Neg logit contributions
 your-0.360
 laundry-0.358
 same-0.357
 their-0.351
 our-0.333
Token falling
Token ID7463
Feature activation-2.85e-02

Pos logit contributions
 same+0.131
 port+0.115
 computer+0.114
 task+0.109
 your+0.108
Neg logit contributions
hell-0.373
es-0.361
qu-0.348
oped-0.347
oot-0.343
Token off
Token ID572
Feature activation+5.50e-02

Pos logit contributions
hell+0.058
es+0.055
oped+0.055
umbs+0.055
qu+0.054
Neg logit contributions
 laundry-0.048
 your-0.048
 same-0.047
 their-0.046
 dad-0.044
Token.
Token ID13
Feature activation+1.42e-02

Pos logit contributions
 same+0.094
 laundry+0.094
 your+0.091
 port+0.088
 dad+0.088
Neg logit contributions
hell-0.107
es-0.103
hed-0.100
umbs-0.100
oped-0.100
Token She
Token ID1375
Feature activation-1.44e-02

Pos logit contributions
 your+0.021
 same+0.021
 their+0.020
 laundry+0.020
 our+0.019
Neg logit contributions
hell-0.033
lor-0.031
oped-0.031
es-0.031
umbs-0.031
Token tried
Token ID3088
Feature activation-1.22e-02

Pos logit contributions
es+0.029
hell+0.028
oped+0.028
hed+0.027
qu+0.027
Neg logit contributions
 laundry-0.025
 same-0.025
 your-0.024
 their-0.024
 port-0.023
Token to
Token ID284
Feature activation+2.02e-01

Pos logit contributions
hell+0.028
es+0.027
oot+0.026
oped+0.026
uts+0.026
Neg logit contributions
 port-0.016
 same-0.016
 laundry-0.016
 your-0.015
 our-0.015
Token tie
Token ID9839
Feature activation-1.69e-01

Pos logit contributions
 same+0.214
 your+0.213
 laundry+0.208
 their+0.205
 port+0.199
Neg logit contributions
hell-0.524
qu-0.503
es-0.492
oped-0.488
oot-0.487
Token them
Token ID606
Feature activation-2.17e-02

Pos logit contributions
hell+0.484
es+0.470
oped+0.460
hed+0.455
qu+0.451
Neg logit contributions
 laundry-0.137
 same-0.134
 your-0.132
 port-0.131
 table-0.127
Token back
Token ID736
Feature activation+3.75e-02

Pos logit contributions
hell+0.051
es+0.049
oped+0.049
lor+0.048
qu+0.048
Neg logit contributions
 your-0.030
 laundry-0.030
 same-0.030
 their-0.029
 port-0.027
Token together
Token ID1978
Feature activation+1.38e-03

Pos logit contributions
 same+0.061
 laundry+0.061
 your+0.060
 their+0.059
 port+0.056
Neg logit contributions
hell-0.079
es-0.076
oped-0.075
lor-0.074
qu-0.073
Token,
Token ID11
Feature activation+8.89e-02

Pos logit contributions
 your+0.003
 same+0.003
 laundry+0.003
 their+0.003
 our+0.002
Neg logit contributions
hell-0.003
es-0.002
oped-0.002
lor-0.002
umbs-0.002
Token.
Token ID13
Feature activation+2.37e-02

Pos logit contributions
 laundry+0.140
 same+0.133
 your+0.129
 their+0.122
 task+0.120
Neg logit contributions
hell-0.297
oped-0.290
qu-0.288
es-0.287
lor-0.281
Token "\n"
Token ID198
Feature activation-2.76e-02

Pos logit contributions
 your+0.033
 same+0.033
 their+0.033
 laundry+0.032
 our+0.030
Neg logit contributions
hell-0.056
lor-0.054
oped-0.054
es-0.054
umbs-0.053
Token "\n"
Token ID198
Feature activation-3.39e-02

Pos logit contributions
hell+0.060
oped+0.058
es+0.057
lor+0.057
qu+0.057
Neg logit contributions
 your-0.047
 their-0.046
 same-0.043
 dad-0.042
 our-0.042
TokenRyan
Token ID21868
Feature activation-1.40e-01

Pos logit contributions
hell+0.066
es+0.063
oped+0.063
lor+0.063
umbs+0.062
Neg logit contributions
 your-0.064
 their-0.062
 same-0.060
 laundry-0.060
 our-0.058
Token tried
Token ID3088
Feature activation-1.32e-02

Pos logit contributions
hell+0.280
es+0.275
oped+0.274
hed+0.270
lor+0.267
Neg logit contributions
 your-0.241
 laundry-0.238
 same-0.237
 their-0.233
 our-0.222
Token to
Token ID284
Feature activation+2.02e-01

Pos logit contributions
hell+0.030
es+0.029
oot+0.028
oped+0.028
uts+0.027
Neg logit contributions
 same-0.018
 port-0.018
 laundry-0.017
 stove-0.016
 your-0.016
Token run
Token ID1057
Feature activation-8.88e-02

Pos logit contributions
 same+0.229
 your+0.229
 laundry+0.226
 their+0.220
 port+0.218
Neg logit contributions
hell-0.503
qu-0.484
hed-0.467
es-0.466
oped-0.465
Token away
Token ID1497
Feature activation+1.39e-02

Pos logit contributions
hell+0.191
lor+0.183
oped+0.183
es+0.182
umbs+0.180
Neg logit contributions
 your-0.146
 same-0.144
 their-0.142
 laundry-0.141
 our-0.133
Token but
Token ID475
Feature activation+5.30e-02

Pos logit contributions
 same+0.026
 laundry+0.026
 your+0.026
 their+0.024
 our+0.024
Neg logit contributions
hell-0.025
es-0.025
oot-0.024
oped-0.024
lor-0.023
Token the
Token ID262
Feature activation+1.53e-01

Pos logit contributions
 laundry+0.074
 same+0.074
 stove+0.069
 your+0.067
 table+0.066
Neg logit contributions
oped-0.121
hell-0.120
es-0.119
oot-0.114
hed-0.112
Token noisy
Token ID31210
Feature activation-7.11e-02

Pos logit contributions
 your+0.201
 their+0.195
 our+0.180
 laundry+0.172
 same+0.166
Neg logit contributions
hell-0.378
rot-0.355
umbs-0.347
oped-0.345
lor-0.344
Token saw
Token ID2497
Feature activation+5.38e-02

Pos logit contributions
oped+0.126
es+0.126
hed+0.124
hell+0.122
lor+0.121
Neg logit contributions
 your-0.103
 same-0.100
 laundry-0.100
 their-0.100
 our-0.098
Token a
Token ID257
Feature activation+1.18e-01

Pos logit contributions
 task+0.054
 your+0.053
 same+0.051
 laundry+0.051
 our+0.049
Neg logit contributions
hell-0.137
es-0.134
hed-0.130
aw-0.127
lor-0.126
Token big
Token ID1263
Feature activation+2.35e-02

Pos logit contributions
 your+0.121
 their+0.120
 our+0.106
 same+0.105
 laundry+0.098
Neg logit contributions
hell-0.316
rot-0.301
oc-0.297
oped-0.287
lor-0.282
Token box
Token ID3091
Feature activation-2.15e-01

Pos logit contributions
 their+0.033
 same+0.032
 your+0.032
 usual+0.031
 our+0.030
Neg logit contributions
hell-0.049
oc-0.049
rot-0.046
hed-0.045
oped-0.045
Token of
Token ID286
Feature activation+1.72e-01

Pos logit contributions
hell+0.433
lor+0.416
oped+0.412
es+0.412
umbs+0.406
Neg logit contributions
 your-0.379
 same-0.376
 their-0.373
 laundry-0.370
 dad-0.349
Token Leg
Token ID3564
Feature activation-3.03e-01

Pos logit contributions
 same+0.191
 dad+0.182
 your+0.179
 port+0.179
 their+0.176
Neg logit contributions
hell-0.421
es-0.379
oc-0.375
hed-0.365
ann-0.362
Tokenos
Token ID418
Feature activation-1.44e-01

Pos logit contributions
hell+0.500
lor+0.451
pillar+0.430
rot+0.429
rots+0.424
Neg logit contributions
 table-0.637
 build-0.630
 school-0.625
 their-0.621
 Grand-0.612
Token that
Token ID326
Feature activation+1.04e-01

Pos logit contributions
lor+0.316
hell+0.300
rot+0.296
oped+0.293
umbs+0.288
Neg logit contributions
 their-0.242
 laundry-0.241
 your-0.241
 same-0.228
 table-0.227
Token he
Token ID339
Feature activation+1.71e-02

Pos logit contributions
 your+0.184
 laundry+0.173
 our+0.173
 same+0.172
 math+0.168
Neg logit contributions
es-0.196
hell-0.191
hed-0.187
oped-0.185
qu-0.175
Token really
Token ID1107
Feature activation+9.75e-02

Pos logit contributions
 your+0.028
 laundry+0.027
 same+0.027
 their+0.027
 our+0.025
Neg logit contributions
hell-0.037
oped-0.036
es-0.035
lor-0.035
qu-0.035
Token wanted
Token ID2227
Feature activation-8.01e-03

Pos logit contributions
 your+0.147
 same+0.146
 their+0.144
 laundry+0.143
 our+0.133
Neg logit contributions
hell-0.220
lor-0.213
oped-0.212
es-0.211
umbs-0.209
Token grandma
Token ID49890
Feature activation-1.56e-01

Pos logit contributions
 your+0.148
 laundry+0.139
 same+0.137
 their+0.137
 our+0.125
Neg logit contributions
hell-0.274
rot-0.271
oc-0.270
es-0.264
lor-0.262
Token's
Token ID338
Feature activation+1.17e-01

Pos logit contributions
hell+0.322
lor+0.314
umbs+0.307
oped+0.307
qu+0.299
Neg logit contributions
 your-0.276
 their-0.276
 laundry-0.266
 same-0.266
 dad-0.254
Token house
Token ID2156
Feature activation-1.84e-01

Pos logit contributions
 your+0.138
 their+0.126
 same+0.124
 laundry+0.122
 our+0.119
Neg logit contributions
hell-0.289
oc-0.277
rot-0.277
lor-0.276
es-0.276
Token for
Token ID329
Feature activation+1.36e-01

Pos logit contributions
hell+0.351
es+0.335
oped+0.332
lor+0.331
umbs+0.324
Neg logit contributions
 your-0.339
 same-0.335
 laundry-0.332
 their-0.328
 our-0.314
Token safe
Token ID3338
Feature activation-9.29e-02

Pos logit contributions
 same+0.127
 laundry+0.115
 your+0.114
 port+0.109
 their+0.106
Neg logit contributions
hell-0.374
es-0.366
qu-0.362
lor-0.358
oped-0.358
Tokenkeeping
Token ID19934
Feature activation-3.01e-01

Pos logit contributions
hell+0.140
es+0.134
lor+0.131
oped+0.130
qu+0.127
Neg logit contributions
 your-0.207
 same-0.207
 laundry-0.203
 their-0.202
 our-0.195
Token.
Token ID13
Feature activation-1.34e-02

Pos logit contributions
rots+0.641
oped+0.635
rot+0.634
lor+0.631
qu+0.627
Neg logit contributions
 their-0.501
 your-0.487
 table-0.478
 laundry-0.465
 dad-0.459
Token " "
Token ID220
Feature activation-1.46e-01

Pos logit contributions
hell+0.031
oped+0.029
lor+0.029
es+0.029
umbs+0.029
Neg logit contributions
 your-0.020
 same-0.020
 laundry-0.020
 their-0.020
 our-0.018
Token "\n"
Token ID198
Feature activation-5.65e-02

Pos logit contributions
hell+0.352
rot+0.342
lor+0.339
qu+0.330
oped+0.324
Neg logit contributions
 your-0.215
 their-0.215
 dad-0.197
 Mom-0.196
 Grand-0.194
Token "\n"
Token ID198
Feature activation-5.57e-02

Pos logit contributions
rot+0.138
hell+0.137
lor+0.136
rots+0.135
owing+0.133
Neg logit contributions
 their-0.078
 your-0.077
 the-0.071
 his-0.070
 same-0.067
TokenWhen
Token ID2215
Feature activation+1.88e-02

Pos logit contributions
hell+0.113
lor+0.110
es+0.109
oped+0.108
umbs+0.107
Neg logit contributions
 your-0.099
 their-0.096
 same-0.095
 laundry-0.095
 our-0.090
Token put
Token ID1234
Feature activation-1.65e-02

Pos logit contributions
oped+0.048
es+0.048
hell+0.048
qu+0.046
hed+0.046
Neg logit contributions
 your-0.038
 same-0.038
 laundry-0.037
 their-0.035
 our-0.035
Token the
Token ID262
Feature activation+1.51e-01

Pos logit contributions
hell+0.048
es+0.046
hed+0.046
oped+0.045
umbs+0.045
Neg logit contributions
 laundry-0.013
 same-0.013
 your-0.012
 bathroom-0.012
 their-0.011
Token camera
Token ID4676
Feature activation-2.16e-01

Pos logit contributions
 same+0.189
 your+0.188
 laundry+0.183
 their+0.182
 our+0.167
Neg logit contributions
hell-0.384
lor-0.375
oped-0.374
es-0.372
umbs-0.366
Token on
Token ID319
Feature activation+1.30e-01

Pos logit contributions
hell+0.454
es+0.434
oped+0.431
hed+0.425
umbs+0.421
Neg logit contributions
 your-0.358
 laundry-0.353
 same-0.347
 their-0.344
 our-0.331
Token a
Token ID257
Feature activation+1.16e-01

Pos logit contributions
 laundry+0.132
 same+0.117
 task+0.114
 port+0.109
 your+0.109
Neg logit contributions
hell-0.347
oc-0.328
es-0.327
hed-0.324
lor-0.315
Token tripod
Token ID47439
Feature activation-2.99e-01

Pos logit contributions
 your+0.119
 same+0.119
 their+0.116
 laundry+0.113
 our+0.103
Neg logit contributions
hell-0.321
oc-0.295
lor-0.293
rot-0.293
oped-0.289
Token,
Token ID11
Feature activation+7.42e-02

Pos logit contributions
hell+0.671
qu+0.662
lor+0.658
oped+0.655
rot+0.629
Neg logit contributions
 same-0.446
 their-0.445
 your-0.436
 table-0.434
 laundry-0.411
Token set
Token ID900
Feature activation-2.61e-02

Pos logit contributions
 task+0.138
 laundry+0.133
 homework+0.127
 job+0.126
 math+0.125
Neg logit contributions
hed-0.123
hell-0.119
es-0.119
aw-0.114
oped-0.112
Token the
Token ID262
Feature activation+1.52e-01

Pos logit contributions
hell+0.068
es+0.065
qu+0.064
oot+0.064
oped+0.064
Neg logit contributions
 laundry-0.030
 same-0.029
 your-0.029
 bathroom-0.027
 their-0.026
Token timer
Token ID19781
Feature activation-1.79e-01

Pos logit contributions
 your+0.204
 their+0.196
 same+0.182
 laundry+0.182
 our+0.177
Neg logit contributions
hell-0.366
rot-0.353
qu-0.348
hed-0.344
es-0.344
Token,
Token ID11
Feature activation+7.40e-02

Pos logit contributions
hell+0.400
es+0.389
lor+0.385
oped+0.384
umbs+0.380
Neg logit contributions
 your-0.269
 same-0.265
 laundry-0.263
 their-0.256
 our-0.243
Token a
Token ID257
Feature activation+1.12e-01

Pos logit contributions
 laundry+0.038
 task+0.038
 your+0.038
 same+0.037
 homework+0.035
Neg logit contributions
hell-0.122
es-0.118
aw-0.114
hed-0.114
umbs-0.113
Token big
Token ID1263
Feature activation+2.27e-02

Pos logit contributions
 your+0.086
 their+0.081
 same+0.080
 laundry+0.077
 our+0.068
Neg logit contributions
hell-0.337
rot-0.323
oped-0.317
qu-0.316
lor-0.314
Token camera
Token ID4676
Feature activation-2.20e-01

Pos logit contributions
 same+0.032
 your+0.032
 their+0.032
 laundry+0.031
 homework+0.029
Neg logit contributions
hell-0.049
oc-0.047
rot-0.047
hed-0.045
oped-0.044
Token on
Token ID319
Feature activation+1.28e-01

Pos logit contributions
hell+0.472
lor+0.464
oped+0.454
umbs+0.448
es+0.448
Neg logit contributions
 their-0.368
 same-0.363
 your-0.361
 laundry-0.358
 table-0.349
Token a
Token ID257
Feature activation+1.13e-01

Pos logit contributions
 same+0.084
 laundry+0.082
 your+0.078
 task+0.075
 wash+0.072
Neg logit contributions
hell-0.365
es-0.350
qu-0.349
lor-0.343
hed-0.343
Token tripod
Token ID47439
Feature activation-2.97e-01

Pos logit contributions
 your+0.127
 same+0.122
 their+0.122
 laundry+0.119
 our+0.110
Neg logit contributions
hell-0.303
lor-0.283
rot-0.279
qu-0.279
oped-0.277
Token.
Token ID13
Feature activation-5.96e-03

Pos logit contributions
oped+0.653
lor+0.641
hell+0.628
qu+0.622
fed+0.607
Neg logit contributions
 table-0.501
 their-0.478
 same-0.455
 stove-0.439
 port-0.437
Token It
Token ID632
Feature activation-6.10e-02

Pos logit contributions
umbs+0.014
hell+0.014
oped+0.014
es+0.014
hed+0.013
Neg logit contributions
 same-0.008
 laundry-0.008
 your-0.008
 task-0.007
 math-0.007
Token belongs
Token ID14448
Feature activation+2.37e-02

Pos logit contributions
hell+0.145
es+0.141
oped+0.140
qu+0.139
lor+0.138
Neg logit contributions
 laundry-0.083
 your-0.082
 same-0.081
 their-0.078
 dad-0.075
Token to
Token ID284
Feature activation+1.89e-01

Pos logit contributions
 laundry+0.028
 same+0.027
 your+0.026
 their+0.025
 task+0.025
Neg logit contributions
hell-0.059
es-0.059
lor-0.058
oped-0.057
oot-0.057
Token a
Token ID257
Feature activation+1.12e-01

Pos logit contributions
 same+0.165
 task+0.160
 laundry+0.160
 your+0.146
 computer+0.144
Neg logit contributions
es-0.499
hell-0.495
qu-0.477
lor-0.475
aw-0.459
TokenOnce
Token ID7454
Feature activation-3.22e-02

Pos logit contributions
hell+0.423
oot+0.420
qu+0.402
aring+0.392
ivering+0.392
Neg logit contributions
 the-0.227
 their-0.223
 Ben-0.217
 same-0.212
 your-0.211
Token there
Token ID612
Feature activation-1.90e-02

Pos logit contributions
hell+0.076
lor+0.074
oped+0.073
es+0.073
qu+0.073
Neg logit contributions
 your-0.045
 same-0.045
 their-0.044
 laundry-0.044
 dad-0.041
Token was
Token ID373
Feature activation+1.05e-01

Pos logit contributions
hell+0.043
lor+0.042
umbs+0.042
qu+0.042
oped+0.041
Neg logit contributions
 their-0.029
 your-0.029
 same-0.029
 laundry-0.028
 dad-0.027
Token a
Token ID257
Feature activation+1.15e-01

Pos logit contributions
 your+0.071
 same+0.070
 laundry+0.068
 their+0.065
 dad+0.056
Neg logit contributions
hell-0.324
es-0.312
umbs-0.311
oped-0.311
lor-0.310
Token k
Token ID479
Feature activation-1.59e-01

Pos logit contributions
 your+0.107
 same+0.105
 laundry+0.103
 their+0.102
 our+0.090
Neg logit contributions
hell-0.326
oped-0.314
es-0.311
lor-0.311
qu-0.310
Tokenidd
Token ID1638
Feature activation-2.97e-01

Pos logit contributions
lor+0.219
hell+0.217
umbs+0.197
ives+0.196
qu+0.190
Neg logit contributions
 your-0.407
 their-0.373
 my-0.368
 same-0.366
 our-0.362
Tokeno
Token ID78
Feature activation-1.42e-01

Pos logit contributions
lor+0.437
hell+0.421
qu+0.379
ement+0.370
oped+0.370
Neg logit contributions
 their-0.689
 dad-0.687
 your-0.684
 same-0.681
 sister-0.663
Token who
Token ID508
Feature activation+3.01e-02

Pos logit contributions
umbs+0.313
hell+0.312
lor+0.303
oped+0.303
ives+0.295
Neg logit contributions
 their-0.246
 your-0.246
 dad-0.232
 the-0.225
 same-0.223
Token loved
Token ID6151
Feature activation+4.70e-02

Pos logit contributions
 your+0.043
 same+0.043
 their+0.043
 laundry+0.042
 our+0.039
Neg logit contributions
hell-0.071
lor-0.069
oped-0.068
es-0.068
umbs-0.068
Token to
Token ID284
Feature activation+1.93e-01

Pos logit contributions
 same+0.065
 port+0.061
 your+0.061
 laundry+0.059
 task+0.058
Neg logit contributions
hell-0.095
qu-0.093
uts-0.093
oot-0.091
hed-0.091
Token guess
Token ID4724
Feature activation-1.05e-01

Pos logit contributions
 same+0.224
 your+0.219
 our+0.210
 their+0.210
 laundry+0.204
Neg logit contributions
hell-0.489
qu-0.459
oped-0.454
umbs-0.448
oot-0.444
Token one
Token ID530
Feature activation-1.76e-02

Pos logit contributions
 same+0.029
 laundry+0.029
 computer+0.028
 table+0.027
 stove+0.027
Neg logit contributions
oped-0.055
es-0.054
hell-0.053
hed-0.052
oot-0.052
Token day
Token ID1110
Feature activation-7.09e-02

Pos logit contributions
hell+0.043
oped+0.042
es+0.041
lor+0.041
rot+0.041
Neg logit contributions
 your-0.023
 laundry-0.022
 same-0.022
 their-0.022
 our-0.021
Token,
Token ID11
Feature activation+7.57e-02

Pos logit contributions
hell+0.161
es+0.157
oped+0.156
lor+0.154
qu+0.152
Neg logit contributions
 same-0.104
 your-0.104
 laundry-0.102
 their-0.098
 our-0.094
Token the
Token ID262
Feature activation+1.50e-01

Pos logit contributions
 same+0.081
 laundry+0.076
 task+0.072
 your+0.071
 computer+0.068
Neg logit contributions
es-0.196
hell-0.189
oped-0.186
lor-0.186
hed-0.183
Token k
Token ID479
Feature activation-1.59e-01

Pos logit contributions
 your+0.189
 their+0.183
 same+0.171
 laundry+0.170
 our+0.166
Neg logit contributions
hell-0.371
qu-0.355
oped-0.355
lor-0.354
rot-0.352
Tokenidd
Token ID1638
Feature activation-2.96e-01

Pos logit contributions
hell+0.230
lor+0.185
umbs+0.177
immer+0.167
es+0.160
Neg logit contributions
 your-0.386
 same-0.370
 their-0.363
 our-0.356
 my-0.356
Tokeno
Token ID78
Feature activation-1.45e-01

Pos logit contributions
hell+0.356
lor+0.321
qu+0.305
umbs+0.281
es+0.267
Neg logit contributions
 same-0.756
 dad-0.745
 their-0.731
 your-0.719
 Dad-0.716
Tokenâ
Token ID22940
Feature activation-4.33e-02

Pos logit contributions
hell+0.321
lor+0.305
umbs+0.301
qu+0.298
oped+0.297
Neg logit contributions
 same-0.225
 your-0.223
 their-0.220
 laundry-0.216
 dad-0.208
Token
Token ID26391
Feature activation-1.08e-01

Pos logit contributions
hell+0.093
es+0.089
lor+0.089
oped+0.089
umbs+0.087
Neg logit contributions
 your-0.071
 same-0.071
 laundry-0.069
 their-0.069
 our-0.065
Token
Token ID8151
Feature activation-2.38e-01

Pos logit contributions
hell+0.188
immer+0.168
rol+0.167
le+0.167
umbs+0.166
Neg logit contributions
 dad-0.210
 your-0.208
 their-0.207
 same-0.204
 Grand-0.203
Tokens
Token ID82
Feature activation+7.82e-02

Pos logit contributions
qu+0.359
hell+0.347
lor+0.346
ez+0.333
rots+0.332
Neg logit contributions
 their-0.534
 your-0.505
 dad-0.499
 Mom-0.492
 the-0.491
Token a
Token ID257
Feature activation+1.14e-01

Pos logit contributions
hell+0.199
es+0.188
umbs+0.185
qu+0.182
hed+0.181
Neg logit contributions
 laundry-0.076
 same-0.074
 your-0.073
 port-0.071
 their-0.069
Token new
Token ID649
Feature activation+8.45e-04

Pos logit contributions
 laundry+0.112
 your+0.107
 their+0.096
 same+0.096
 Ben+0.095
Neg logit contributions
hell-0.307
oc-0.291
rot-0.275
qu-0.256
oot-0.255
Token tank
Token ID6873
Feature activation-2.61e-01

Pos logit contributions
 your+0.001
 laundry+0.001
 same+0.001
 their+0.001
 our+0.001
Neg logit contributions
hell-0.002
oc-0.002
rot-0.002
es-0.002
oped-0.002
Token out
Token ID503
Feature activation+6.16e-02

Pos logit contributions
hell+0.450
es+0.419
umbs+0.408
qu+0.405
lor+0.402
Neg logit contributions
 laundry-0.530
 your-0.525
 same-0.522
 their-0.509
 our-0.487
Token of
Token ID286
Feature activation+1.66e-01

Pos logit contributions
 your+0.143
 same+0.143
 laundry+0.143
 their+0.141
 dad+0.135
Neg logit contributions
hell-0.088
es-0.082
oped-0.081
lor-0.080
umbs-0.079
Token Leg
Token ID3564
Feature activation-2.96e-01

Pos logit contributions
 laundry+0.128
 same+0.118
 task+0.106
 homework+0.106
 your+0.104
Neg logit contributions
hell-0.460
hed-0.436
oc-0.420
umbs-0.420
es-0.414
Tokenos
Token ID418
Feature activation-1.49e-01

Pos logit contributions
hell+0.433
lor+0.389
pillar+0.362
rot+0.351
qu+0.344
Neg logit contributions
 their-0.679
 build-0.654
 dad-0.651
 your-0.644
 table-0.640
Token.
Token ID13
Feature activation-1.24e-02

Pos logit contributions
lor+0.270
hell+0.266
oped+0.254
rot+0.249
umbs+0.246
Neg logit contributions
 your-0.303
 their-0.303
 same-0.294
 laundry-0.288
 our-0.283
Token
Token ID198
Feature activation-5.41e-02

Pos logit contributions
hell+0.030
oped+0.029
lor+0.029
es+0.029
umbs+0.029
Neg logit contributions
 your-0.016
 same-0.016
 laundry-0.016
 their-0.015
 our-0.014
Token
Token ID198
Feature activation-5.32e-02

Pos logit contributions
lor+0.125
hell+0.125
es+0.124
rot+0.122
oped+0.122
Neg logit contributions
 their-0.080
 your-0.080
 the-0.071
 his-0.071
 same-0.071
TokenTim
Token ID14967
Feature activation-1.29e-01

Pos logit contributions
hell+0.105
lor+0.101
es+0.101
oped+0.100
umbs+0.099
Neg logit contributions
 your-0.097
 their-0.095
 same-0.095
 laundry-0.094
 our-0.089
Token.
Token ID13
Feature activation-1.08e-02

Pos logit contributions
es+0.201
hell+0.200
oped+0.189
qu+0.187
lor+0.186
Neg logit contributions
 same-0.178
 laundry-0.175
 your-0.168
 table-0.163
 dad-0.161
Token
Token ID198
Feature activation-5.51e-02

Pos logit contributions
hell+0.024
oped+0.023
lor+0.023
es+0.023
umbs+0.023
Neg logit contributions
 your-0.016
 same-0.016
 laundry-0.016
 their-0.016
 our-0.015
Token
Token ID198
Feature activation-5.42e-02

Pos logit contributions
hell+0.129
rot+0.126
pp+0.126
es+0.125
oped+0.124
Neg logit contributions
 their-0.081
 your-0.078
 dad-0.073
 the-0.071
 Mom-0.071
TokenThe
Token ID464
Feature activation+8.00e-02

Pos logit contributions
hell+0.115
es+0.112
oped+0.109
umbs+0.108
lor+0.108
Neg logit contributions
 your-0.090
 their-0.089
 laundry-0.086
 same-0.084
 our-0.082
Token k
Token ID479
Feature activation-1.58e-01

Pos logit contributions
 your+0.099
 their+0.095
 same+0.092
 laundry+0.091
 our+0.087
Neg logit contributions
hell-0.203
oped-0.193
qu-0.193
lor-0.193
rot-0.191
Tokenidd
Token ID1638
Feature activation-2.95e-01

Pos logit contributions
hell+0.216
umbs+0.178
lor+0.176
qu+0.155
immer+0.153
Neg logit contributions
 your-0.402
 their-0.378
 same-0.377
 my-0.372
 dad-0.370
Tokeno
Token ID78
Feature activation-1.43e-01

Pos logit contributions
hell+0.377
lor+0.347
qu+0.323
umbs+0.319
oped+0.286
Neg logit contributions
 same-0.720
 dad-0.719
 their-0.707
 your-0.698
 Dad-0.689
Token understood
Token ID7247
Feature activation+8.95e-03

Pos logit contributions
hell+0.320
lor+0.305
umbs+0.303
oped+0.298
qu+0.296
Neg logit contributions
 same-0.219
 your-0.218
 their-0.213
 laundry-0.210
 dad-0.202
Token the
Token ID262
Feature activation+1.51e-01

Pos logit contributions
 same+0.016
 laundry+0.015
 your+0.015
 port+0.015
 table+0.014
Neg logit contributions
hell-0.018
es-0.017
lor-0.017
oped-0.016
qu-0.016
Token message
Token ID3275
Feature activation-2.30e-01

Pos logit contributions
 your+0.188
 their+0.185
 our+0.170
 laundry+0.155
 port+0.155
Neg logit contributions
hell-0.382
qu-0.359
lor-0.357
rot-0.355
oped-0.344
Token,
Token ID11
Feature activation+7.55e-02

Pos logit contributions
hell+0.452
es+0.439
oped+0.425
lor+0.422
hed+0.418
Neg logit contributions
 your-0.407
 laundry-0.400
 same-0.400
 their-0.386
 our-0.377
Token the
Token ID262
Feature activation+1.56e-01

Pos logit contributions
hell+0.005
es+0.005
hed+0.005
oped+0.005
umbs+0.005
Neg logit contributions
 laundry-0.003
 task-0.003
 same-0.003
 your-0.003
 port-0.003
Token camera
Token ID4676
Feature activation-2.11e-01

Pos logit contributions
 your+0.222
 their+0.213
 laundry+0.211
 same+0.209
 our+0.195
Neg logit contributions
hell-0.373
es-0.347
rot-0.347
lor-0.345
oped-0.343
Token back
Token ID736
Feature activation+4.18e-02

Pos logit contributions
hell+0.444
es+0.423
oped+0.421
lor+0.414
umbs+0.412
Neg logit contributions
 your-0.351
 laundry-0.347
 same-0.346
 their-0.338
 our-0.322
Token on
Token ID319
Feature activation+1.32e-01

Pos logit contributions
 your+0.069
 same+0.068
 their+0.067
 laundry+0.067
 our+0.063
Neg logit contributions
hell-0.088
lor-0.084
oped-0.084
es-0.084
umbs-0.083
Token the
Token ID262
Feature activation+1.57e-01

Pos logit contributions
 same+0.087
 laundry+0.086
 your+0.076
 task+0.072
 port+0.063
Neg logit contributions
hell-0.397
es-0.389
umbs-0.378
lor-0.376
oc-0.373
Token tripod
Token ID47439
Feature activation-2.93e-01

Pos logit contributions
 your+0.168
 their+0.157
 math+0.150
 exercise+0.138
 my+0.138
Neg logit contributions
hell-0.408
hed-0.363
rot-0.357
lor-0.356
oc-0.351
Token.
Token ID13
Feature activation-7.44e-03

Pos logit contributions
oped+0.645
qu+0.627
lor+0.614
hell+0.596
rots+0.593
Neg logit contributions
 their-0.491
 your-0.478
 table-0.472
 same-0.465
 his-0.443
Token
Token ID198
Feature activation-5.23e-02

Pos logit contributions
hell+0.016
es+0.015
lor+0.015
oped+0.015
umbs+0.015
Neg logit contributions
 laundry-0.012
 same-0.012
 your-0.012
 their-0.012
 our-0.011
Token
Token ID198
Feature activation-5.21e-02

Pos logit contributions
lor+0.122
hell+0.121
oped+0.121
rot+0.119
rol+0.117
Neg logit contributions
 their-0.083
 your-0.082
 dad-0.074
 the-0.074
 our-0.073
Token"
Token ID1
Feature activation-5.24e-02

Pos logit contributions
hell+0.095
lor+0.091
oped+0.091
umbs+0.089
es+0.087
Neg logit contributions
 your-0.106
 their-0.104
 same-0.101
 our-0.098
 laundry-0.098
TokenGo
Token ID5247
Feature activation+5.89e-04

Pos logit contributions
hell+0.109
lor+0.104
oped+0.104
es+0.104
umbs+0.102
Neg logit contributions
 your-0.090
 same-0.089
 their-0.088
 laundry-0.088
 our-0.082
Token day
Token ID1110
Feature activation-6.98e-02

Pos logit contributions
hell+0.147
oc+0.139
umbs+0.137
es+0.137
oped+0.136
Neg logit contributions
 laundry-0.068
 your-0.065
 same-0.064
 their-0.060
 our-0.059
Token,
Token ID11
Feature activation+7.58e-02

Pos logit contributions
hell+0.133
es+0.129
oped+0.127
lor+0.125
qu+0.124
Neg logit contributions
 same-0.129
 your-0.128
 laundry-0.127
 their-0.125
 our-0.119
Token she
Token ID673
Feature activation+6.10e-03

Pos logit contributions
 laundry+0.085
 same+0.082
 homework+0.082
 task+0.082
 garage+0.081
Neg logit contributions
es-0.182
hell-0.169
qu-0.167
rot-0.164
oped-0.164
Token asked
Token ID1965
Feature activation-4.68e-02

Pos logit contributions
 your+0.010
 laundry+0.010
 same+0.010
 their+0.009
 math+0.009
Neg logit contributions
hell-0.013
oped-0.013
es-0.013
lor-0.013
hed-0.012
Token her
Token ID607
Feature activation+1.00e-01

Pos logit contributions
es+0.127
hell+0.117
oot+0.115
hed+0.115
owed+0.113
Neg logit contributions
 port-0.046
 laundry-0.046
 same-0.044
 task-0.044
 bathroom-0.041
Token maid
Token ID25920
Feature activation-2.93e-01

Pos logit contributions
 your+0.083
 same+0.080
 laundry+0.076
 their+0.075
 our+0.069
Neg logit contributions
es-0.280
hell-0.278
rot-0.273
lor-0.270
qu-0.265
Token for
Token ID329
Feature activation+1.34e-01

Pos logit contributions
hell+0.723
lor+0.707
qu+0.701
umbs+0.697
oped+0.694
Neg logit contributions
 their-0.412
 your-0.405
 dad-0.383
 same-0.378
 laundry-0.365
Token some
Token ID617
Feature activation+6.88e-02

Pos logit contributions
 same+0.139
 port+0.131
 computer+0.127
 garage+0.125
 task+0.117
Neg logit contributions
qu-0.342
hell-0.331
es-0.328
lor-0.324
oot-0.321
Token special
Token ID2041
Feature activation-2.89e-02

Pos logit contributions
 their+0.078
 your+0.078
 same+0.077
 our+0.075
 port+0.071
Neg logit contributions
hell-0.171
qu-0.158
umbs-0.157
es-0.156
oot-0.153
Token makeup
Token ID16029
Feature activation-2.16e-01

Pos logit contributions
hell+0.070
es+0.068
lor+0.065
qu+0.065
hed+0.064
Neg logit contributions
 same-0.035
 your-0.035
 their-0.035
 our-0.033
 port-0.032
Token.
Token ID13
Feature activation-1.64e-02

Pos logit contributions
hell+0.393
oped+0.377
lor+0.375
umbs+0.367
qu+0.367
Neg logit contributions
 your-0.423
 same-0.422
 laundry-0.421
 their-0.418
 dad-0.399
Token said
Token ID531
Feature activation-6.75e-02

Pos logit contributions
 your+0.008
 same+0.008
 their+0.008
 laundry+0.007
 dad+0.007
Neg logit contributions
hell-0.009
lor-0.009
oped-0.009
umbs-0.009
es-0.009
Token.
Token ID13
Feature activation-1.70e-02

Pos logit contributions
hell+0.118
lor+0.113
oped+0.112
umbs+0.110
es+0.109
Neg logit contributions
 your-0.139
 their-0.138
 same-0.137
 laundry-0.135
 dad-0.130
Token
Token ID198
Feature activation-6.26e-02

Pos logit contributions
hell+0.042
lor+0.041
oped+0.041
es+0.041
umbs+0.040
Neg logit contributions
 your-0.022
 same-0.022
 their-0.021
 laundry-0.021
 our-0.020
Token
Token ID198
Feature activation-6.20e-02

Pos logit contributions
hell+0.161
oped+0.157
umbs+0.153
lor+0.152
rol+0.152
Neg logit contributions
 your-0.085
 their-0.084
 the-0.079
 dad-0.077
 same-0.077
TokenThe
Token ID464
Feature activation+7.64e-02

Pos logit contributions
hell+0.140
oped+0.136
lor+0.134
umbs+0.133
es+0.133
Neg logit contributions
 your-0.098
 same-0.096
 their-0.095
 laundry-0.093
 our-0.088
Token maid
Token ID25920
Feature activation-2.93e-01

Pos logit contributions
 your+0.101
 their+0.100
 laundry+0.098
 same+0.097
 our+0.092
Neg logit contributions
hell-0.186
lor-0.178
es-0.177
qu-0.176
oped-0.175
Token smiled
Token ID13541
Feature activation-6.51e-02

Pos logit contributions
hell+0.657
lor+0.641
umbs+0.628
qu+0.603
oped+0.594
Neg logit contributions
 their-0.462
 your-0.451
 dad-0.448
 same-0.446
 laundry-0.431
Token.
Token ID13
Feature activation-1.49e-02

Pos logit contributions
hell+0.145
lor+0.141
oped+0.139
qu+0.138
umbs+0.137
Neg logit contributions
 their-0.107
 your-0.105
 same-0.105
 dad-0.100
 laundry-0.099
Token "
Token ID366
Feature activation-3.92e-02

Pos logit contributions
hell+0.038
es+0.038
oped+0.038
lor+0.037
qu+0.037
Neg logit contributions
 laundry-0.017
 your-0.017
 same-0.016
 their-0.016
 our-0.015
TokenYes
Token ID5297
Feature activation-8.60e-02

Pos logit contributions
hell+0.091
es+0.089
oped+0.088
lor+0.087
umbs+0.087
Neg logit contributions
 your-0.058
 same-0.058
 their-0.055
 laundry-0.055
 math-0.051
Token,
Token ID11
Feature activation+7.30e-02

Pos logit contributions
hell+0.172
oped+0.166
lor+0.166
es+0.164
umbs+0.163
Neg logit contributions
 your-0.155
 same-0.152
 their-0.150
 laundry-0.149
 our-0.142
Token hard
Token ID1327
Feature activation-1.90e-02

Pos logit contributions
hell+0.161
es+0.155
oped+0.154
qu+0.152
lor+0.151
Neg logit contributions
 your-0.111
 same-0.109
 laundry-0.108
 their-0.105
 our-0.101
Token -
Token ID532
Feature activation+9.90e-02

Pos logit contributions
hell+0.036
es+0.035
oped+0.034
hed+0.034
qu+0.034
Neg logit contributions
 your-0.035
 same-0.034
 laundry-0.034
 their-0.033
 our-0.033
Token and
Token ID290
Feature activation+1.36e-01

Pos logit contributions
 same+0.146
 laundry+0.146
 your+0.142
 port+0.134
 stove+0.134
Neg logit contributions
hell-0.208
es-0.206
oped-0.203
qu-0.200
hed-0.196
Token the
Token ID262
Feature activation+1.51e-01

Pos logit contributions
 laundry+0.190
 same+0.189
 your+0.185
 stove+0.184
 port+0.182
Neg logit contributions
es-0.291
hell-0.281
oped-0.272
hed-0.272
qu-0.269
Token k
Token ID479
Feature activation-1.58e-01

Pos logit contributions
 your+0.183
 their+0.176
 our+0.163
 laundry+0.155
 same+0.153
Neg logit contributions
hell-0.378
qu-0.357
lor-0.357
rot-0.356
oped-0.352
Tokenidd
Token ID1638
Feature activation-2.93e-01

Pos logit contributions
hell+0.208
lor+0.177
umbs+0.168
es+0.151
aron+0.146
Neg logit contributions
 your-0.402
 same-0.392
 their-0.386
 laundry-0.377
 our-0.373
Tokeno
Token ID78
Feature activation-1.42e-01

Pos logit contributions
hell+0.345
lor+0.312
umbs+0.293
qu+0.289
unch+0.262
Neg logit contributions
 same-0.759
 dad-0.731
 their-0.728
 your-0.727
 laundry-0.711
Token soon
Token ID2582
Feature activation-1.65e-02

Pos logit contributions
hell+0.299
lor+0.286
umbs+0.282
oped+0.277
qu+0.276
Neg logit contributions
 same-0.237
 your-0.236
 their-0.230
 laundry-0.228
 dad-0.216
Token began
Token ID2540
Feature activation+1.63e-02

Pos logit contributions
hell+0.035
es+0.034
oped+0.034
lor+0.033
qu+0.033
Neg logit contributions
 your-0.027
 same-0.027
 laundry-0.026
 their-0.026
 our-0.025
Token to
Token ID284
Feature activation+1.92e-01

Pos logit contributions
 laundry+0.021
 your+0.021
 same+0.020
 our+0.019
 port+0.019
Neg logit contributions
es-0.038
hell-0.037
oped-0.037
qu-0.037
hed-0.036
Token do
Token ID466
Feature activation+3.08e-02

Pos logit contributions
 your+0.187
 their+0.181
 laundry+0.176
 our+0.171
 same+0.170
Neg logit contributions
hell-0.520
qu-0.506
es-0.499
oped-0.498
hed-0.492
Token said
Token ID531
Feature activation-6.59e-02

Pos logit contributions
 your+0.010
 their+0.010
 same+0.010
 laundry+0.010
 dad+0.010
Neg logit contributions
hell-0.012
lor-0.012
oped-0.012
umbs-0.011
es-0.011
Token.
Token ID13
Feature activation-1.81e-02

Pos logit contributions
hell+0.126
lor+0.125
oped+0.121
umbs+0.120
rot+0.119
Neg logit contributions
 their-0.128
 your-0.125
 dad-0.121
 same-0.120
 laundry-0.119
Token
Token ID198
Feature activation-6.41e-02

Pos logit contributions
hell+0.039
oped+0.036
lor+0.036
es+0.036
umbs+0.036
Neg logit contributions
 your-0.031
 their-0.030
 same-0.030
 laundry-0.029
 dad-0.028
Token
Token ID198
Feature activation-6.34e-02

Pos logit contributions
oped+0.159
hell+0.159
owing+0.156
lor+0.153
rol+0.152
Neg logit contributions
 their-0.089
 your-0.087
 the-0.082
 same-0.081
 dad-0.078
TokenThe
Token ID464
Feature activation+7.61e-02

Pos logit contributions
hell+0.134
oped+0.130
lor+0.128
es+0.127
umbs+0.127
Neg logit contributions
 your-0.107
 same-0.106
 their-0.105
 laundry-0.103
 our-0.098
Token maid
Token ID25920
Feature activation-2.92e-01

Pos logit contributions
 your+0.102
 their+0.101
 laundry+0.099
 same+0.097
 our+0.093
Neg logit contributions
hell-0.183
es-0.174
lor-0.174
qu-0.173
rot-0.171
Token quickly
Token ID2952
Feature activation-1.18e-02

Pos logit contributions
hell+0.649
lor+0.627
umbs+0.611
qu+0.594
oped+0.586
Neg logit contributions
 their-0.467
 your-0.455
 same-0.449
 dad-0.446
 laundry-0.444
Token ran
Token ID4966
Feature activation-3.50e-02

Pos logit contributions
hell+0.029
oped+0.027
lor+0.027
es+0.027
umbs+0.027
Neg logit contributions
 your-0.016
 same-0.016
 their-0.015
 laundry-0.015
 our-0.014
Token away
Token ID1497
Feature activation+1.06e-02

Pos logit contributions
hell+0.082
oped+0.078
es+0.078
hed+0.077
lor+0.077
Neg logit contributions
 your-0.049
 laundry-0.048
 same-0.048
 their-0.045
 our-0.044
Token to
Token ID284
Feature activation+1.91e-01

Pos logit contributions
 your+0.021
 same+0.020
 laundry+0.020
 their+0.020
 our+0.019
Neg logit contributions
hell-0.019
lor-0.018
oped-0.018
es-0.018
umbs-0.018
Token find
Token ID1064
Feature activation+2.42e-02

Pos logit contributions
 computer+0.197
 same+0.190
 laundry+0.184
 stove+0.179
 port+0.175
Neg logit contributions
hell-0.455
qu-0.435
es-0.433
lor-0.424
oot-0.417
Token.
Token ID13
Feature activation-1.09e-02

Pos logit contributions
lor+0.118
hell+0.114
oped+0.112
umbs+0.110
rot+0.109
Neg logit contributions
 their-0.117
 your-0.112
 dad-0.107
 same-0.106
 the-0.103
Token
Token ID198
Feature activation-5.56e-02

Pos logit contributions
hell+0.023
lor+0.022
oped+0.022
es+0.022
umbs+0.022
Neg logit contributions
 your-0.018
 same-0.018
 laundry-0.018
 their-0.018
 our-0.017
Token
Token ID198
Feature activation-5.53e-02

Pos logit contributions
hell+0.133
oped+0.133
rol+0.131
lor+0.130
umbs+0.129
Neg logit contributions
 their-0.085
 your-0.085
 dad-0.077
 the-0.075
 Ben-0.075
TokenThe
Token ID464
Feature activation+8.15e-02

Pos logit contributions
hell+0.108
oped+0.104
umbs+0.102
lor+0.101
oot+0.100
Neg logit contributions
 your-0.108
 their-0.105
 same-0.099
 our-0.098
 dad-0.097
Token great
Token ID1049
Feature activation+2.62e-02

Pos logit contributions
 your+0.098
 their+0.090
 laundry+0.090
 same+0.087
 our+0.085
Neg logit contributions
hell-0.210
es-0.199
rot-0.198
qu-0.196
hed-0.194
Token Zen
Token ID14760
Feature activation-2.92e-01

Pos logit contributions
 your+0.036
 same+0.033
 laundry+0.033
 our+0.032
 their+0.032
Neg logit contributions
hell-0.062
es-0.059
rot-0.058
hed-0.057
qu-0.056
Tokeno
Token ID78
Feature activation-1.39e-01

Pos logit contributions
hell+0.434
umbs+0.416
lor+0.408
oped+0.397
qu+0.384
Neg logit contributions
 their-0.690
 your-0.679
 same-0.651
 dad-0.649
 our-0.640
Token hugs
Token ID40657
Feature activation-7.09e-02

Pos logit contributions
hell+0.326
oped+0.313
lor+0.313
es+0.313
umbs+0.308
Neg logit contributions
 your-0.198
 same-0.195
 laundry-0.192
 their-0.192
 our-0.178
Token Ben
Token ID3932
Feature activation-9.58e-02

Pos logit contributions
hell+0.225
oped+0.220
lor+0.220
umbs+0.216
es+0.215
Neg logit contributions
 their-0.050
 your-0.048
 same-0.044
 laundry-0.042
 dad-0.039
Token and
Token ID290
Feature activation+1.34e-01

Pos logit contributions
oped+0.208
lor+0.207
qu+0.204
oot+0.194
rol+0.194
Neg logit contributions
 their-0.169
 your-0.166
 same-0.154
 his-0.154
 the-0.154
Token says
Token ID1139
Feature activation-5.75e-02

Pos logit contributions
 port+0.217
 laundry+0.216
 computer+0.215
 job+0.213
 same+0.213
Neg logit contributions
es-0.251
hell-0.238
hed-0.232
qu-0.223
oped-0.223
Token wet
Token ID9583
Feature activation-1.04e-01

Pos logit contributions
 your+0.069
 same+0.067
 laundry+0.067
 their+0.065
 our+0.061
Neg logit contributions
hell-0.127
oped-0.123
es-0.122
umbs-0.120
lor-0.120
Token and
Token ID290
Feature activation+1.38e-01

Pos logit contributions
hell+0.220
lor+0.219
oped+0.214
es+0.210
qu+0.209
Neg logit contributions
 same-0.173
 their-0.172
 your-0.171
 laundry-0.168
 our-0.156
Token the
Token ID262
Feature activation+1.52e-01

Pos logit contributions
 laundry+0.166
 same+0.166
 port+0.161
 your+0.158
 math+0.156
Neg logit contributions
hell-0.313
oped-0.308
hed-0.304
es-0.304
umbs-0.290
Token motorcycle
Token ID18757
Feature activation-2.27e-01

Pos logit contributions
 your+0.212
 their+0.200
 our+0.186
 laundry+0.181
 same+0.174
Neg logit contributions
hell-0.358
hed-0.346
rot-0.334
oped-0.329
umbs-0.328
Token sk
Token ID1341
Feature activation-1.31e-01

Pos logit contributions
hell+0.526
lor+0.514
umbs+0.507
oped+0.499
qu+0.498
Neg logit contributions
 same-0.327
 their-0.322
 your-0.318
 laundry-0.314
 dad-0.297
Tokenidd
Token ID1638
Feature activation-2.92e-01

Pos logit contributions
hell+0.209
qu+0.180
oc+0.177
umbs+0.169
lor+0.163
Neg logit contributions
 their-0.286
 port-0.281
 dad-0.281
 laundry-0.281
 same-0.280
Tokened
Token ID276
Feature activation+2.09e-02

Pos logit contributions
lor+0.374
hell+0.354
qu+0.323
oc+0.303
ement+0.301
Neg logit contributions
 same-0.713
 their-0.701
 dad-0.698
 turn-0.673
 Dad-0.667
Token.
Token ID13
Feature activation-1.33e-02

Pos logit contributions
 your+0.036
 same+0.036
 their+0.035
 laundry+0.035
 our+0.033
Neg logit contributions
hell-0.043
lor-0.041
oped-0.041
es-0.041
umbs-0.040
Token It
Token ID632
Feature activation-5.55e-02

Pos logit contributions
hell+0.026
es+0.025
oped+0.025
umbs+0.025
lor+0.025
Neg logit contributions
 your-0.024
 laundry-0.024
 same-0.023
 their-0.023
 our-0.022
Token tipped
Token ID28395
Feature activation-3.67e-02

Pos logit contributions
hell+0.125
lor+0.121
oped+0.120
es+0.119
umbs+0.118
Neg logit contributions
 your-0.085
 same-0.084
 their-0.083
 laundry-0.082
 our-0.077
Token over
Token ID625
Feature activation+1.05e-01

Pos logit contributions
hell+0.080
oped+0.078
es+0.077
hed+0.076
umbs+0.076
Neg logit contributions
 laundry-0.056
 your-0.056
 same-0.055
 their-0.054
 port-0.052
Token slippery
Token ID32911
Feature activation-6.91e-02

Pos logit contributions
hell+0.521
oped+0.495
es+0.494
umbs+0.489
qu+0.489
Neg logit contributions
 your-0.297
 laundry-0.287
 same-0.286
 their-0.282
 our-0.265
Token and
Token ID290
Feature activation+1.36e-01

Pos logit contributions
hell+0.128
es+0.122
qu+0.119
oped+0.118
hed+0.118
Neg logit contributions
 your-0.131
 same-0.129
 laundry-0.129
 their-0.127
 dad-0.123
Token the
Token ID262
Feature activation+1.52e-01

Pos logit contributions
 laundry+0.183
 port+0.182
 computer+0.179
 school+0.170
 same+0.170
Neg logit contributions
hell-0.307
es-0.286
oped-0.281
qu-0.280
umbs-0.273
Token car
Token ID1097
Feature activation-1.96e-01

Pos logit contributions
 your+0.185
 their+0.176
 our+0.162
 same+0.155
 laundry+0.154
Neg logit contributions
hell-0.380
umbs-0.354
qu-0.349
oped-0.349
hed-0.349
Token sk
Token ID1341
Feature activation-1.27e-01

Pos logit contributions
hell+0.421
es+0.403
oped+0.403
lor+0.394
qu+0.393
Neg logit contributions
 your-0.317
 laundry-0.310
 same-0.309
 their-0.305
 our-0.289
Tokenidd
Token ID1638
Feature activation-2.92e-01

Pos logit contributions
hell+0.202
qu+0.177
lor+0.177
oc+0.173
rot+0.170
Neg logit contributions
 their-0.278
 your-0.276
 laundry-0.275
 same-0.271
 port-0.267
Tokened
Token ID276
Feature activation+1.63e-02

Pos logit contributions
lor+0.348
hell+0.329
qu+0.292
ement+0.288
rot+0.284
Neg logit contributions
 same-0.753
 their-0.744
 your-0.727
 dad-0.714
 our-0.695
Token and
Token ID290
Feature activation+1.35e-01

Pos logit contributions
 your+0.027
 laundry+0.027
 same+0.027
 their+0.026
 dad+0.025
Neg logit contributions
hell-0.034
es-0.032
oped-0.032
lor-0.031
umbs-0.031
Token hit
Token ID2277
Feature activation-6.08e-02

Pos logit contributions
 same+0.183
 laundry+0.178
 port+0.172
 task+0.171
 math+0.170
Neg logit contributions
es-0.302
hell-0.285
oped-0.285
hed-0.283
ez-0.271
Token a
Token ID257
Feature activation+1.14e-01

Pos logit contributions
hell+0.183
es+0.179
oped+0.175
hed+0.174
umbs+0.174
Neg logit contributions
 laundry-0.042
 same-0.041
 your-0.041
 their-0.037
 port-0.037
Token tree
Token ID5509
Feature activation-1.95e-01

Pos logit contributions
 your+0.127
 same+0.125
 their+0.124
 laundry+0.122
 our+0.110
Neg logit contributions
hell-0.302
oped-0.287
lor-0.286
es-0.285
umbs-0.285
Token She
Token ID1375
Feature activation-2.65e-02

Pos logit contributions
hell+0.035
lor+0.034
es+0.033
oped+0.033
umbs+0.033
Neg logit contributions
 your-0.022
 same-0.022
 laundry-0.022
 their-0.022
 our-0.020
Token smiled
Token ID13541
Feature activation-6.43e-02

Pos logit contributions
hell+0.051
es+0.051
oped+0.050
hed+0.049
qu+0.049
Neg logit contributions
 your-0.048
 same-0.047
 laundry-0.047
 their-0.045
 our-0.044
Token and
Token ID290
Feature activation+1.36e-01

Pos logit contributions
hell+0.123
es+0.121
lor+0.118
oped+0.117
rot+0.115
Neg logit contributions
 laundry-0.116
 your-0.115
 same-0.114
 their-0.111
 our-0.107
Token thanked
Token ID26280
Feature activation-1.02e-02

Pos logit contributions
 same+0.264
 laundry+0.260
 your+0.256
 port+0.252
 math+0.247
Neg logit contributions
es-0.243
hed-0.235
oped-0.234
hell-0.233
qu-0.226
Token the
Token ID262
Feature activation+1.50e-01

Pos logit contributions
hell+0.030
es+0.030
qu+0.029
lor+0.029
oot+0.029
Neg logit contributions
 laundry-0.007
 same-0.007
 your-0.006
 port-0.006
 task-0.006
Token maid
Token ID25920
Feature activation-2.91e-01

Pos logit contributions
 your+0.172
 their+0.167
 same+0.166
 laundry+0.165
 our+0.152
Neg logit contributions
hell-0.394
lor-0.380
es-0.378
oped-0.377
qu-0.375
Token for
Token ID329
Feature activation+1.36e-01

Pos logit contributions
hell+0.678
lor+0.670
oped+0.667
qu+0.652
hed+0.648
Neg logit contributions
 their-0.441
 your-0.422
 dad-0.408
 same-0.390
 the-0.384
Token helping
Token ID5742
Feature activation+1.84e-02

Pos logit contributions
 same+0.162
 task+0.146
 computer+0.141
 laundry+0.139
 port+0.139
Neg logit contributions
es-0.332
qu-0.322
lor-0.318
hell-0.314
oot-0.309
Token her
Token ID607
Feature activation+1.03e-01

Pos logit contributions
 same+0.014
 your+0.013
 laundry+0.013
 port+0.012
 their+0.012
Neg logit contributions
hell-0.054
es-0.054
qu-0.052
oped-0.051
lor-0.051
Token.
Token ID13
Feature activation-1.59e-02

Pos logit contributions
 your+0.166
 same+0.164
 laundry+0.162
 their+0.162
 our+0.152
Neg logit contributions
hell-0.224
es-0.214
lor-0.214
oped-0.213
qu-0.210
Token
Token ID198
Feature activation-6.05e-02

Pos logit contributions
hell+0.035
oped+0.033
lor+0.033
es+0.033
qu+0.032
Neg logit contributions
 your-0.026
 their-0.025
 same-0.025
 laundry-0.025
 dad-0.024
Token Jenny
Token ID28437
Feature activation-1.06e-01

Pos logit contributions
 laundry+0.068
 same+0.068
 their+0.066
 task+0.063
 homework+0.060
Neg logit contributions
hell-0.190
es-0.180
oc-0.175
lor-0.174
umbs-0.173
Token,"
Token ID553
Feature activation-7.07e-02

Pos logit contributions
hell+0.178
oped+0.172
umbs+0.172
lor+0.170
qu+0.164
Neg logit contributions
 your-0.233
 their-0.222
 same-0.220
 our-0.216
 dad-0.216
Token said
Token ID531
Feature activation-6.39e-02

Pos logit contributions
hell+0.106
oc+0.103
es+0.101
hed+0.096
qu+0.093
Neg logit contributions
 port-0.148
 same-0.139
 basement-0.135
 task-0.131
 laundry-0.130
Token Mr
Token ID1770
Feature activation-5.85e-02

Pos logit contributions
es+0.165
hell+0.160
oped+0.159
hed+0.158
oot+0.156
Neg logit contributions
 laundry-0.060
 same-0.059
 port-0.053
 task-0.052
 computer-0.052
Token.
Token ID13
Feature activation-1.66e-02

Pos logit contributions
rol+0.123
lor+0.122
oped+0.121
umbs+0.120
hed+0.120
Neg logit contributions
 their-0.103
 dad-0.100
 Dad-0.098
 your-0.097
 same-0.096
Token Fisher
Token ID14388
Feature activation-2.91e-01

Pos logit contributions
hell+0.042
oped+0.041
lor+0.041
umbs+0.041
es+0.041
Neg logit contributions
 your-0.020
 same-0.019
 laundry-0.019
 their-0.019
 our-0.017
Token.
Token ID13
Feature activation-1.71e-02

Pos logit contributions
oped+0.637
lor+0.636
qu+0.631
hell+0.627
umbs+0.621
Neg logit contributions
 dad-0.485
 their-0.460
 same-0.456
 daddy-0.441
 Dad-0.437
Token "
Token ID366
Feature activation-4.19e-02

Pos logit contributions
hell+0.043
oped+0.042
lor+0.042
es+0.042
umbs+0.042
Neg logit contributions
 your-0.020
 laundry-0.020
 same-0.020
 their-0.019
 our-0.018
TokenWe
Token ID1135
Feature activation+2.51e-02

Pos logit contributions
hell+0.101
es+0.099
oped+0.098
lor+0.098
oot+0.097
Neg logit contributions
 same-0.057
 your-0.056
 their-0.054
 laundry-0.053
 table-0.050
Token have
Token ID423
Feature activation+4.56e-02

Pos logit contributions
 same+0.036
 your+0.036
 their+0.035
 laundry+0.034
 port+0.032
Neg logit contributions
hell-0.059
lor-0.058
es-0.057
qu-0.057
oped-0.057
Token to
Token ID284
Feature activation+1.92e-01

Pos logit contributions
 laundry+0.050
 same+0.049
 task+0.048
 port+0.047
 computer+0.047
Neg logit contributions
hell-0.109
lor-0.109
umbs-0.108
es-0.107
qu-0.103
Token it
Token ID340
Feature activation-2.90e-02

Pos logit contributions
hell+0.006
es+0.006
lor+0.006
oped+0.006
hed+0.006
Neg logit contributions
 laundry-0.003
 same-0.003
 their-0.003
 your-0.003
 port-0.003
Token to
Token ID284
Feature activation+1.95e-01

Pos logit contributions
hell+0.073
es+0.069
umbs+0.067
rot+0.067
oot+0.066
Neg logit contributions
 laundry-0.034
 same-0.032
 port-0.032
 bathroom-0.031
 table-0.030
Token me
Token ID502
Feature activation+1.27e-02

Pos logit contributions
 laundry+0.206
 same+0.195
 task+0.193
 port+0.180
 computer+0.169
Neg logit contributions
hell-0.500
es-0.496
qu-0.491
lor-0.477
oc-0.476
Token for
Token ID329
Feature activation+1.37e-01

Pos logit contributions
 your+0.024
 their+0.023
 same+0.023
 laundry+0.022
 our+0.022
Neg logit contributions
hell-0.025
oped-0.024
lor-0.024
umbs-0.024
qu-0.023
Token safe
Token ID3338
Feature activation-8.85e-02

Pos logit contributions
 same+0.135
 laundry+0.128
 port+0.124
 task+0.116
 computer+0.111
Neg logit contributions
hell-0.358
es-0.354
qu-0.353
lor-0.337
oot-0.334
Tokenkeeping
Token ID19934
Feature activation-2.90e-01

Pos logit contributions
hell+0.121
es+0.117
lor+0.112
umbs+0.108
qu+0.107
Neg logit contributions
 same-0.209
 laundry-0.207
 your-0.206
 their-0.202
 port-0.196
Token.
Token ID13
Feature activation-7.78e-03

Pos logit contributions
oped+0.646
lor+0.615
umbs+0.615
pillar+0.610
rol+0.607
Neg logit contributions
 your-0.492
 their-0.490
 table-0.455
 dad-0.447
 same-0.444
Token He
Token ID679
Feature activation-1.71e-02

Pos logit contributions
hell+0.019
lor+0.019
oped+0.019
es+0.019
umbs+0.018
Neg logit contributions
 your-0.010
 same-0.010
 their-0.010
 laundry-0.010
 our-0.009
Token told
Token ID1297
Feature activation-4.16e-02

Pos logit contributions
hell+0.035
es+0.034
oped+0.034
lor+0.034
qu+0.033
Neg logit contributions
 your-0.029
 laundry-0.029
 same-0.029
 their-0.028
 port-0.027
Token me
Token ID502
Feature activation+1.34e-02

Pos logit contributions
hell+0.102
es+0.098
qu+0.094
oot+0.093
lor+0.092
Neg logit contributions
 port-0.050
 laundry-0.049
 task-0.048
 same-0.046
 computer-0.046
Token to
Token ID284
Feature activation+1.96e-01

Pos logit contributions
 same+0.021
 laundry+0.021
 port+0.020
 your+0.019
 their+0.019
Neg logit contributions
hell-0.029
es-0.028
lor-0.026
oot-0.026
umbs-0.026
Token the
Token ID262
Feature activation+1.50e-01

Pos logit contributions
 same+0.213
 Puzzles+0.203
 port+0.203
 task+0.201
 boring+0.199
Neg logit contributions
r-0.335
qu-0.329
unch-0.317
ez-0.315
es-0.315
Token park
Token ID3952
Feature activation-1.57e-01

Pos logit contributions
 their+0.193
 your+0.184
 imagination+0.183
 our+0.178
 sister+0.176
Neg logit contributions
lor-0.321
rot-0.304
aw-0.290
pillar-0.290
qu-0.284
Token with
Token ID351
Feature activation+1.17e-01

Pos logit contributions
hell+0.358
es+0.347
oped+0.346
lor+0.343
umbs+0.341
Neg logit contributions
 your-0.228
 same-0.228
 laundry-0.225
 their-0.218
 our-0.208
Token Mr
Token ID1770
Feature activation-5.90e-02

Pos logit contributions
 port+0.115
 same+0.111
 computer+0.106
 task+0.103
 stove+0.103
Neg logit contributions
hell-0.299
es-0.299
umbs-0.291
hed-0.283
qu-0.277
Token.
Token ID13
Feature activation-1.05e-02

Pos logit contributions
lor+0.114
oped+0.113
hell+0.111
rol+0.110
hed+0.108
Neg logit contributions
 their-0.115
 dad-0.110
 your-0.110
 laundry-0.110
 same-0.109
Token Fisher
Token ID14388
Feature activation-2.90e-01

Pos logit contributions
hell+0.023
oped+0.022
lor+0.022
es+0.022
umbs+0.022
Neg logit contributions
 your-0.016
 same-0.016
 their-0.016
 laundry-0.016
 our-0.015
Token.
Token ID13
Feature activation-1.09e-02

Pos logit contributions
hell+0.565
qu+0.546
oped+0.545
lor+0.536
umbs+0.530
Neg logit contributions
 their-0.547
 dad-0.543
 same-0.543
 your-0.533
 laundry-0.510
Token
Token ID198
Feature activation-5.91e-02

Pos logit contributions
umbs+0.022
oped+0.021
hell+0.021
hed+0.020
oled+0.020
Neg logit contributions
 same-0.018
 your-0.018
 laundry-0.018
 our-0.017
 homework-0.017
Token
Token ID198
Feature activation-5.95e-02

Pos logit contributions
hell+0.136
rot+0.132
oped+0.131
qu+0.131
lor+0.130
Neg logit contributions
 their-0.091
 your-0.090
 dad-0.084
 same-0.084
 the-0.082
Token"
Token ID1
Feature activation-5.57e-02

Pos logit contributions
hell+0.108
lor+0.103
oped+0.103
es+0.102
umbs+0.101
Neg logit contributions
 your-0.117
 same-0.116
 their-0.114
 laundry-0.114
 our-0.108
TokenLook
Token ID8567
Feature activation-9.63e-02

Pos logit contributions
hell+0.124
es+0.119
lor+0.118
oped+0.118
qu+0.115
Neg logit contributions
 your-0.088
 same-0.087
 laundry-0.086
 their-0.086
 dad-0.081
Token The
Token ID383
Feature activation+7.16e-02

Pos logit contributions
hell+0.020
oped+0.020
umbs+0.020
lor+0.019
es+0.019
Neg logit contributions
 same-0.011
 your-0.011
 laundry-0.010
 their-0.010
 our-0.010
Token pony
Token ID26902
Feature activation-2.45e-01

Pos logit contributions
 your+0.134
 their+0.131
 same+0.130
 laundry+0.130
 our+0.124
Neg logit contributions
hell-0.137
lor-0.130
oped-0.128
es-0.128
umbs-0.127
Token was
Token ID373
Feature activation+1.04e-01

Pos logit contributions
umbs+0.654
lor+0.640
hell+0.636
qu+0.606
rol+0.594
Neg logit contributions
 same-0.291
 your-0.290
 their-0.284
 the-0.279
 dad-0.268
Token so
Token ID523
Feature activation+1.12e-01

Pos logit contributions
 your+0.111
 same+0.108
 laundry+0.107
 their+0.107
 our+0.096
Neg logit contributions
hell-0.281
oped-0.271
lor-0.270
es-0.270
umbs-0.268
Token pretty
Token ID2495
Feature activation-7.35e-02

Pos logit contributions
 your+0.090
 same+0.085
 their+0.085
 laundry+0.072
 dad+0.071
Neg logit contributions
lor-0.346
hell-0.337
es-0.336
oped-0.334
umbs-0.329
Token and
Token ID290
Feature activation+1.32e-01

Pos logit contributions
oped+0.173
umbs+0.172
lor+0.170
hell+0.169
qu+0.163
Neg logit contributions
 your-0.111
 their-0.110
 same-0.105
 the-0.105
 laundry-0.101
Token Lily
Token ID20037
Feature activation-9.15e-02

Pos logit contributions
 laundry+0.164
 same+0.161
 your+0.159
 port+0.158
 their+0.155
Neg logit contributions
hell-0.328
es-0.313
oped-0.310
qu-0.306
lor-0.302
Token wanted
Token ID2227
Feature activation-1.38e-02

Pos logit contributions
hell+0.231
umbs+0.229
lor+0.228
qu+0.217
oped+0.216
Neg logit contributions
 your-0.119
 same-0.116
 their-0.115
 dad-0.111
 laundry-0.108
Token to
Token ID284
Feature activation+1.90e-01

Pos logit contributions
hell+0.039
es+0.037
lor+0.037
oped+0.037
qu+0.037
Neg logit contributions
 same-0.013
 your-0.013
 laundry-0.013
 their-0.013
 port-0.012
Token ride
Token ID6594
Feature activation-1.40e-01

Pos logit contributions
 your+0.183
 same+0.178
 their+0.174
 laundry+0.172
 our+0.154
Neg logit contributions
hell-0.540
lor-0.525
oped-0.524
es-0.524
umbs-0.517
Token it
Token ID340
Feature activation-3.52e-02

Pos logit contributions
lor+0.407
hell+0.406
oped+0.401
umbs+0.401
qu+0.393
Neg logit contributions
 your-0.136
 their-0.129
 same-0.122
 laundry-0.112
 our-0.109
Token Lily
Token ID20037
Feature activation-9.10e-02

Pos logit contributions
 laundry+0.079
 same+0.075
 port+0.075
 task+0.072
 workshop+0.070
Neg logit contributions
es-0.174
hell-0.174
oped-0.167
hed-0.166
qu-0.159
Token's
Token ID338
Feature activation+1.18e-01

Pos logit contributions
es+0.152
hed+0.150
hell+0.147
oped+0.146
lor+0.139
Neg logit contributions
 your-0.179
 laundry-0.177
 same-0.176
 their-0.176
 math-0.174
Token friend
Token ID1545
Feature activation-1.70e-01

Pos logit contributions
 your+0.107
 their+0.098
 our+0.092
 same+0.085
 laundry+0.082
Neg logit contributions
hell-0.329
rot-0.319
umbs-0.315
es-0.313
oped-0.311
Token came
Token ID1625
Feature activation+1.00e-02

Pos logit contributions
hell+0.341
es+0.332
oped+0.332
lor+0.324
qu+0.321
Neg logit contributions
 same-0.291
 your-0.290
 laundry-0.289
 their-0.281
 our-0.267
Token over
Token ID625
Feature activation+1.03e-01

Pos logit contributions
 laundry+0.015
 your+0.015
 same+0.015
 their+0.015
 port+0.014
Neg logit contributions
hell-0.022
es-0.021
oped-0.021
umbs-0.021
lor-0.021
Token to
Token ID284
Feature activation+1.92e-01

Pos logit contributions
 same+0.144
 laundry+0.141
 your+0.135
 port+0.135
 table+0.127
Neg logit contributions
es-0.236
oped-0.225
hell-0.222
hed-0.221
umbs-0.219
Token play
Token ID711
Feature activation-4.94e-02

Pos logit contributions
 laundry+0.202
 same+0.200
 port+0.184
 your+0.180
 table+0.179
Neg logit contributions
oped-0.486
hell-0.478
qu-0.477
es-0.477
hed-0.470
Token.
Token ID13
Feature activation-1.27e-02

Pos logit contributions
hell+0.078
es+0.074
oped+0.073
lor+0.072
umbs+0.071
Neg logit contributions
 your-0.109
 same-0.108
 laundry-0.107
 their-0.106
 our-0.102
Token Her
Token ID2332
Feature activation+2.66e-02

Pos logit contributions
hell+0.024
uts+0.024
oled+0.023
oped+0.023
umbs+0.023
Neg logit contributions
 laundry-0.020
 same-0.019
 your-0.019
 our-0.018
 port-0.018
Token friend
Token ID1545
Feature activation-1.69e-01

Pos logit contributions
 your+0.039
 same+0.038
 their+0.038
 laundry+0.038
 our+0.035
Neg logit contributions
hell-0.062
lor-0.060
oped-0.060
es-0.059
umbs-0.059
Token asked
Token ID1965
Feature activation-4.41e-02

Pos logit contributions
hell+0.340
oped+0.331
es+0.331
lor+0.321
qu+0.320
Neg logit contributions
 laundry-0.288
 same-0.288
 your-0.287
 their-0.279
 port-0.267
Token blue
Token ID4171
Feature activation-9.74e-02

Pos logit contributions
 your+0.101
 laundry+0.091
 their+0.091
 our+0.086
 same+0.082
Neg logit contributions
hell-0.205
hed-0.192
umbs-0.191
rot-0.190
qu-0.189
Token ball
Token ID2613
Feature activation-2.36e-01

Pos logit contributions
hell+0.252
lor+0.244
oped+0.243
es+0.242
qu+0.239
Neg logit contributions
 your-0.118
 same-0.116
 their-0.114
 laundry-0.114
 dad-0.104
Token was
Token ID373
Feature activation+1.05e-01

Pos logit contributions
hell+0.502
oped+0.482
es+0.481
lor+0.472
hed+0.471
Neg logit contributions
 your-0.393
 laundry-0.385
 same-0.380
 their-0.378
 our-0.361
Token curious
Token ID11040
Feature activation-8.80e-02

Pos logit contributions
 laundry+0.160
 your+0.153
 same+0.151
 stove+0.146
 port+0.146
Neg logit contributions
hell-0.231
oped-0.221
es-0.219
hed-0.218
umbs-0.215
Token,
Token ID11
Feature activation+7.43e-02

Pos logit contributions
hell+0.166
es+0.157
oped+0.155
lor+0.154
umbs+0.152
Neg logit contributions
 your-0.166
 same-0.164
 laundry-0.163
 their-0.160
 our-0.154
Token so
Token ID523
Feature activation+1.12e-01

Pos logit contributions
 laundry+0.135
 your+0.128
 same+0.128
 task+0.126
 port+0.125
Neg logit contributions
hell-0.142
es-0.134
qu-0.131
hed-0.130
oped-0.129
Token it
Token ID340
Feature activation-3.09e-02

Pos logit contributions
 laundry+0.126
 same+0.124
 your+0.116
 table+0.110
 computer+0.108
Neg logit contributions
hell-0.292
es-0.292
oped-0.284
qu-0.281
rot-0.274
Token decided
Token ID3066
Feature activation-7.13e-03

Pos logit contributions
hell+0.062
oped+0.061
es+0.060
qu+0.059
lor+0.058
Neg logit contributions
 your-0.054
 laundry-0.054
 same-0.053
 their-0.051
 our-0.049
Token to
Token ID284
Feature activation+1.92e-01

Pos logit contributions
hell+0.016
oot+0.015
uts+0.015
es+0.015
lor+0.015
Neg logit contributions
 laundry-0.010
 same-0.010
 port-0.009
 task-0.009
 table-0.009
Token join
Token ID4654
Feature activation-2.43e-02

Pos logit contributions
 laundry+0.195
 their+0.189
 your+0.184
 same+0.180
 stove+0.171
Neg logit contributions
hell-0.511
qu-0.492
oped-0.484
oot-0.481
hed-0.479
Token them
Token ID606
Feature activation-2.09e-02

Pos logit contributions
hell+0.067
es+0.066
oped+0.065
lor+0.064
qu+0.064
Neg logit contributions
 same-0.024
 laundry-0.024
 your-0.024
 their-0.022
 table-0.020
Token and
Token ID290
Feature activation+1.36e-01

Pos logit contributions
oped+0.268
es+0.267
hell+0.266
hed+0.260
lor+0.258
Neg logit contributions
 same-0.239
 your-0.238
 laundry-0.237
 their-0.228
 our-0.221
Token her
Token ID607
Feature activation+1.02e-01

Pos logit contributions
 laundry+0.144
 same+0.140
 table+0.139
 stove+0.138
 task+0.137
Neg logit contributions
hell-0.325
es-0.321
oot-0.319
rot-0.315
oped-0.315
Token mom
Token ID1995
Feature activation-1.36e-01

Pos logit contributions
 your+0.067
 laundry+0.060
 same+0.059
 their+0.056
 our+0.051
Neg logit contributions
rot-0.308
hell-0.307
es-0.305
oped-0.301
lor-0.297
Token were
Token ID547
Feature activation+7.00e-02

Pos logit contributions
hell+0.248
lor+0.235
oped+0.233
es+0.233
umbs+0.230
Neg logit contributions
 your-0.265
 same-0.262
 their-0.260
 laundry-0.259
 our-0.245
Token in
Token ID287
Feature activation+1.79e-01

Pos logit contributions
 laundry+0.099
 same+0.099
 your+0.099
 their+0.096
 task+0.091
Neg logit contributions
hell-0.155
oped-0.155
es-0.154
lor-0.153
hed-0.150
Token the
Token ID262
Feature activation+1.51e-01

Pos logit contributions
 same+0.146
 task+0.144
 stove+0.140
 laundry+0.140
 computer+0.140
Neg logit contributions
oped-0.422
es-0.415
hed-0.408
r-0.403
oc-0.399
Token kitchen
Token ID9592
Feature activation-1.26e-01

Pos logit contributions
 homework+0.164
 your+0.162
 task+0.161
 experiment+0.155
 work+0.155
Neg logit contributions
rot-0.297
lor-0.296
aw-0.293
c-0.289
hed-0.287
Token.
Token ID13
Feature activation-7.63e-03

Pos logit contributions
hell+0.171
es+0.169
lor+0.163
oped+0.162
umbs+0.159
Neg logit contributions
 your-0.289
 same-0.286
 laundry-0.281
 their-0.275
 our-0.274
Token Riley
Token ID22637
Feature activation-1.19e-01

Pos logit contributions
umbs+0.017
oped+0.016
es+0.016
lor+0.016
hed+0.016
Neg logit contributions
 your-0.011
 laundry-0.011
 same-0.011
 math-0.011
 task-0.011
Token wanted
Token ID2227
Feature activation-1.33e-02

Pos logit contributions
es+0.221
oped+0.217
hell+0.215
hed+0.213
oc+0.206
Neg logit contributions
 same-0.223
 laundry-0.222
 your-0.221
 their-0.208
 math-0.208
Token to
Token ID284
Feature activation+1.90e-01

Pos logit contributions
es+0.022
hell+0.021
uts+0.020
umbs+0.020
m+0.019
Neg logit contributions
 port-0.021
 same-0.021
 task-0.020
 our-0.019
 your-0.019
Token't
Token ID470
Feature activation+6.94e-02

Pos logit contributions
hell+0.023
lor+0.022
oped+0.022
es+0.022
umbs+0.021
Neg logit contributions
 your-0.021
 same-0.021
 their-0.021
 laundry-0.021
 our-0.019
Token notice
Token ID4003
Feature activation+4.80e-02

Pos logit contributions
 your+0.079
 same+0.078
 their+0.078
 laundry+0.074
 dad+0.069
Neg logit contributions
hell-0.184
lor-0.183
oped-0.180
es-0.179
umbs-0.176
Token how
Token ID703
Feature activation+8.63e-02

Pos logit contributions
 same+0.053
 your+0.053
 laundry+0.052
 their+0.050
 our+0.047
Neg logit contributions
hell-0.126
es-0.121
lor-0.120
oped-0.119
umbs-0.118
Token cold
Token ID4692
Feature activation-1.14e-01

Pos logit contributions
 laundry+0.098
 your+0.093
 same+0.093
 their+0.089
 table+0.088
Neg logit contributions
hell-0.226
oped-0.215
qu-0.214
es-0.212
umbs-0.210
Token it
Token ID340
Feature activation-2.76e-02

Pos logit contributions
hell+0.270
es+0.261
hed+0.258
umbs+0.247
oped+0.247
Neg logit contributions
 same-0.147
 laundry-0.145
 your-0.142
 port-0.137
 computer-0.137
Token was
Token ID373
Feature activation+1.08e-01

Pos logit contributions
hell+0.057
oped+0.055
es+0.055
lor+0.054
qu+0.054
Neg logit contributions
 your-0.047
 same-0.046
 laundry-0.046
 their-0.046
 our-0.043
Token getting
Token ID1972
Feature activation+4.86e-02

Pos logit contributions
 your+0.238
 laundry+0.235
 same+0.234
 their+0.233
 dad+0.225
Neg logit contributions
hell-0.167
es-0.154
oped-0.153
lor-0.152
umbs-0.151
Token.
Token ID13
Feature activation-1.31e-02

Pos logit contributions
 your+0.083
 same+0.082
 their+0.082
 laundry+0.080
 our+0.076
Neg logit contributions
hell-0.101
lor-0.098
es-0.097
oped-0.097
umbs-0.095
Token Suddenly
Token ID24975
Feature activation-4.59e-02

Pos logit contributions
hell+0.028
oped+0.028
hed+0.027
es+0.027
lor+0.027
Neg logit contributions
 laundry-0.021
 your-0.021
 their-0.020
 our-0.019
 same-0.019
Token his
Token ID465
Feature activation+1.25e-01

Pos logit contributions
es+0.080
hell+0.078
oped+0.076
hed+0.072
oot+0.072
Neg logit contributions
 same-0.091
 laundry-0.090
 your-0.087
 table-0.085
 their-0.084
Token feet
Token ID3625
Feature activation-1.81e-01

Pos logit contributions
 your+0.136
 our+0.117
 their+0.115
 same+0.110
 my+0.108
Neg logit contributions
hell-0.315
lor-0.306
uts-0.306
ann-0.305
es-0.302
Token to
Token ID284
Feature activation+1.91e-01

Pos logit contributions
hell+0.025
es+0.024
qu+0.023
oot+0.023
lor+0.023
Neg logit contributions
 port-0.020
 task-0.020
 computer-0.019
 same-0.019
 laundry-0.019
Token hide
Token ID7808
Feature activation-1.14e-01

Pos logit contributions
 same+0.200
 computer+0.191
 laundry+0.190
 math+0.188
 your+0.186
Neg logit contributions
hell-0.480
qu-0.468
oot-0.453
umbs-0.453
lor-0.444
Token from
Token ID422
Feature activation+1.64e-01

Pos logit contributions
hell+0.264
oped+0.259
lor+0.258
umbs+0.256
qu+0.252
Neg logit contributions
 your-0.174
 their-0.169
 same-0.165
 laundry-0.163
 our-0.156
Token the
Token ID262
Feature activation+1.49e-01

Pos logit contributions
 laundry+0.138
 computer+0.130
 same+0.123
 homework+0.122
 task+0.122
Neg logit contributions
es-0.419
hell-0.417
oot-0.397
oc-0.394
ez-0.393
Token rain
Token ID6290
Feature activation-1.65e-01

Pos logit contributions
 their+0.208
 your+0.203
 laundry+0.190
 our+0.189
 same+0.188
Neg logit contributions
hell-0.344
lor-0.330
qu-0.327
rot-0.326
oped-0.321
Token,
Token ID11
Feature activation+7.18e-02

Pos logit contributions
hell+0.352
lor+0.349
oped+0.345
umbs+0.337
es+0.336
Neg logit contributions
 same-0.272
 your-0.267
 their-0.267
 laundry-0.264
 table-0.249
Token but
Token ID475
Feature activation+4.06e-02

Pos logit contributions
 laundry+0.138
 same+0.137
 math+0.137
 homework+0.136
 port+0.136
Neg logit contributions
hell-0.113
es-0.097
umbs-0.095
rol-0.095
uts-0.095
Token it
Token ID340
Feature activation-3.21e-02

Pos logit contributions
 laundry+0.064
 computer+0.063
 same+0.062
 math+0.061
 table+0.060
Neg logit contributions
es-0.082
hell-0.075
oped-0.075
hed-0.073
oot-0.073
Token was
Token ID373
Feature activation+1.03e-01

Pos logit contributions
hell+0.068
oped+0.065
lor+0.065
es+0.065
umbs+0.064
Neg logit contributions
 your-0.053
 same-0.052
 laundry-0.052
 their-0.052
 our-0.048
Token too
Token ID1165
Feature activation-1.55e-02

Pos logit contributions
 laundry+0.218
 same+0.208
 port+0.208
 your+0.207
 computer+0.205
Neg logit contributions
hell-0.161
es-0.151
oped-0.148
umbs-0.147
ez-0.139
Token wet
Token ID9583
Feature activation-1.05e-01

Pos logit contributions
hell+0.045
oped+0.043
es+0.043
umbs+0.043
lor+0.043
Neg logit contributions
 your-0.013
 same-0.013
 laundry-0.013
 their-0.013
 our-0.011
Token's
Token ID338
Feature activation+1.17e-01

Pos logit contributions
 their+0.050
 your+0.049
 laundry+0.048
 port+0.047
 same+0.047
Neg logit contributions
hell-0.039
es-0.037
lor-0.036
rot-0.036
oc-0.035
Token okay
Token ID8788
Feature activation-4.25e-02

Pos logit contributions
 laundry+0.167
 port+0.166
 table+0.161
 bathroom+0.153
 computer+0.153
Neg logit contributions
hell-0.265
es-0.246
rot-0.239
oc-0.239
qu-0.238
Token.
Token ID13
Feature activation-1.44e-02

Pos logit contributions
hell+0.077
es+0.073
lor+0.072
oped+0.072
qu+0.071
Neg logit contributions
 your-0.083
 same-0.083
 laundry-0.083
 their-0.082
 port-0.078
Token You
Token ID921
Feature activation-3.57e-02

Pos logit contributions
hell+0.037
lor+0.036
es+0.036
oped+0.036
qu+0.036
Neg logit contributions
 laundry-0.017
 same-0.016
 their-0.016
 your-0.016
 port-0.014
Token can
Token ID460
Feature activation+2.20e-02

Pos logit contributions
hell+0.087
lor+0.086
oped+0.085
umbs+0.083
es+0.083
Neg logit contributions
 your-0.049
 same-0.049
 their-0.046
 our-0.045
 laundry-0.045
Token have
Token ID423
Feature activation+4.38e-02

Pos logit contributions
 your+0.025
 same+0.024
 their+0.023
 laundry+0.022
 our+0.021
Neg logit contributions
hell-0.060
oped-0.058
lor-0.058
es-0.058
umbs-0.058
Token some
Token ID617
Feature activation+6.83e-02

Pos logit contributions
 laundry+0.045
 same+0.042
 port+0.042
 computer+0.041
 your+0.040
Neg logit contributions
hell-0.118
qu-0.113
es-0.113
lor-0.111
umbs-0.111
Token of
Token ID286
Feature activation+1.67e-01

Pos logit contributions
 your+0.082
 same+0.082
 their+0.082
 laundry+0.081
 port+0.075
Neg logit contributions
hell-0.175
es-0.165
umbs-0.164
oped-0.163
qu-0.163
Token my
Token ID616
Feature activation+1.19e-01

Pos logit contributions
 port+0.132
 laundry+0.122
 computer+0.121
 bathroom+0.118
 same+0.113
Neg logit contributions
hell-0.467
es-0.436
qu-0.433
oc-0.430
oot-0.425
Token fruit
Token ID8234
Feature activation-1.86e-01

Pos logit contributions
 their+0.152
 your+0.148
 laundry+0.146
 port+0.141
 same+0.138
Neg logit contributions
hell-0.294
oc-0.268
umbs-0.267
es-0.265
rot-0.263
Token.
Token ID13
Feature activation-1.40e-02

Pos logit contributions
hell+0.389
oped+0.389
lor+0.382
qu+0.375
hed+0.368
Neg logit contributions
 your-0.320
 same-0.307
 their-0.305
 laundry-0.298
 our-0.290
Token become
Token ID1716
Feature activation+7.17e-02

Pos logit contributions
 same+0.082
 your+0.080
 laundry+0.080
 port+0.077
 their+0.077
Neg logit contributions
hell-0.138
es-0.133
lor-0.133
qu-0.132
oped-0.132
Token comfortable
Token ID6792
Feature activation-1.04e-01

Pos logit contributions
 your+0.073
 same+0.073
 laundry+0.072
 their+0.070
 port+0.064
Neg logit contributions
hell-0.195
es-0.190
lor-0.188
oped-0.188
umbs-0.186
Token and
Token ID290
Feature activation+1.39e-01

Pos logit contributions
hell+0.199
lor+0.190
oped+0.189
umbs+0.185
qu+0.185
Neg logit contributions
 your-0.195
 same-0.194
 their-0.193
 laundry-0.192
 dad-0.181
Token was
Token ID373
Feature activation+1.05e-01

Pos logit contributions
 port+0.165
 same+0.162
 laundry+0.157
 your+0.156
 computer+0.151
Neg logit contributions
es-0.343
hell-0.336
oped-0.324
hed-0.324
qu-0.321
Token feeling
Token ID4203
Feature activation-3.32e-02

Pos logit contributions
 your+0.142
 same+0.140
 laundry+0.138
 their+0.138
 our+0.127
Neg logit contributions
hell-0.255
lor-0.245
oped-0.245
es-0.245
umbs-0.242
Token very
Token ID845
Feature activation+8.33e-02

Pos logit contributions
hell+0.083
es+0.080
oped+0.079
lor+0.079
umbs+0.078
Neg logit contributions
 laundry-0.041
 same-0.041
 your-0.041
 their-0.039
 port-0.037
Token warm
Token ID5814
Feature activation-9.39e-02

Pos logit contributions
 laundry+0.106
 same+0.106
 your+0.105
 their+0.103
 computer+0.096
Neg logit contributions
hell-0.204
oped-0.196
es-0.196
lor-0.194
qu-0.193
Token.
Token ID13
Feature activation-1.13e-02

Pos logit contributions
hell+0.202
oped+0.195
lor+0.193
qu+0.191
umbs+0.186
Neg logit contributions
 their-0.164
 same-0.163
 your-0.161
 laundry-0.159
 the-0.151
Token
Token ID198
Feature activation-5.32e-02

Pos logit contributions
hell+0.027
lor+0.025
es+0.025
qu+0.025
oped+0.025
Neg logit contributions
 your-0.017
 their-0.016
 same-0.016
 laundry-0.016
 dad-0.015
Token
Token ID198
Feature activation-5.28e-02

Pos logit contributions
hell+0.132
rot+0.128
oped+0.127
es+0.126
lor+0.126
Neg logit contributions
 their-0.071
 your-0.069
 the-0.065
 dad-0.063
 Mom-0.063
TokenBut
Token ID1537
Feature activation+4.06e-03

Pos logit contributions
hell+0.115
es+0.111
umbs+0.109
lor+0.108
ann+0.108
Neg logit contributions
 your-0.084
 their-0.083
 same-0.079
 laundry-0.078
 our-0.074
Token
Token ID198
Feature activation-5.97e-02

Pos logit contributions
hell+0.132
lor+0.126
oped+0.124
umbs+0.124
rot+0.123
Neg logit contributions
 their-0.107
 your-0.106
 same-0.099
 our-0.096
 dad-0.095
Token"
Token ID1
Feature activation-5.53e-02

Pos logit contributions
hell+0.087
es+0.083
oped+0.082
lor+0.081
qu+0.080
Neg logit contributions
 your-0.137
 same-0.137
 laundry-0.137
 their-0.134
 dad-0.129
TokenCan
Token ID6090
Feature activation+2.29e-02

Pos logit contributions
hell+0.112
qu+0.109
oped+0.108
rot+0.105
lor+0.105
Neg logit contributions
 your-0.096
 laundry-0.094
 same-0.092
 their-0.091
 port-0.088
Token we
Token ID356
Feature activation+4.94e-02

Pos logit contributions
 laundry+0.035
 same+0.032
 port+0.032
 stove+0.031
 table+0.031
Neg logit contributions
hell-0.047
es-0.047
qu-0.047
oped-0.046
uts-0.046
Token ride
Token ID6594
Feature activation-1.33e-01

Pos logit contributions
 your+0.037
 same+0.036
 their+0.035
 laundry+0.034
 our+0.030
Neg logit contributions
hell-0.152
lor-0.148
es-0.147
oped-0.147
umbs-0.146
Token it
Token ID340
Feature activation-2.93e-02

Pos logit contributions
hell+0.356
es+0.354
oped+0.343
rot+0.341
qu+0.339
Neg logit contributions
 laundry-0.135
 same-0.130
 table-0.119
 port-0.118
 your-0.115
Token?"
Token ID1701
Feature activation-3.79e-02

Pos logit contributions
hell+0.066
oped+0.064
es+0.064
lor+0.064
qu+0.062
Neg logit contributions
 your-0.043
 laundry-0.043
 same-0.043
 their-0.042
 dad-0.039
Token Lily
Token ID20037
Feature activation-8.78e-02

Pos logit contributions
hell+0.079
umbs+0.076
es+0.076
oped+0.075
oc+0.075
Neg logit contributions
 laundry-0.052
 computer-0.050
 same-0.049
 stove-0.048
 attic-0.046
Token asks
Token ID7893
Feature activation-5.70e-02

Pos logit contributions
hell+0.224
lor+0.216
oped+0.216
es+0.215
umbs+0.213
Neg logit contributions
 your-0.108
 same-0.106
 their-0.105
 laundry-0.104
 our-0.095
Token.
Token ID13
Feature activation-1.09e-02

Pos logit contributions
hell+0.098
lor+0.094
oped+0.093
umbs+0.091
qu+0.091
Neg logit contributions
 your-0.118
 their-0.118
 same-0.117
 laundry-0.116
 dad-0.111
Token
Token ID198
Feature activation-6.03e-02

Pos logit contributions
oped+0.016
es+0.016
hell+0.016
umbs+0.016
hed+0.015
Neg logit contributions
 same-0.024
 laundry-0.024
 your-0.024
 our-0.022
 task-0.022
Token sun
Token ID4252
Feature activation-2.03e-01

Pos logit contributions
 your+0.209
 their+0.207
 our+0.196
 organize+0.175
 laundry+0.171
Neg logit contributions
hell-0.345
rot-0.327
pillar-0.312
hed-0.309
umbs-0.308
Token is
Token ID318
Feature activation+1.10e-01

Pos logit contributions
hell+0.475
es+0.468
oped+0.468
lor+0.458
hed+0.456
Neg logit contributions
 your-0.280
 laundry-0.272
 same-0.272
 their-0.267
 dad-0.252
Token going
Token ID1016
Feature activation+1.67e-02

Pos logit contributions
 your+0.142
 laundry+0.137
 math+0.134
 same+0.131
 port+0.130
Neg logit contributions
hell-0.251
umbs-0.246
qu-0.239
oped-0.239
hed-0.234
Token down
Token ID866
Feature activation+4.88e-02

Pos logit contributions
 laundry+0.027
 your+0.026
 math+0.025
 same+0.025
 port+0.025
Neg logit contributions
oped-0.033
hell-0.033
hed-0.032
oot-0.031
umbs-0.031
Token and
Token ID290
Feature activation+1.33e-01

Pos logit contributions
 laundry+0.103
 same+0.101
 your+0.100
 dad+0.097
 computer+0.096
Neg logit contributions
hell-0.078
es-0.076
oped-0.074
hed-0.073
umbs-0.072
Token it
Token ID340
Feature activation-3.14e-02

Pos logit contributions
 tool+0.175
 computer+0.174
 task+0.172
 math+0.167
 garage+0.167
Neg logit contributions
es-0.281
hed-0.275
oped-0.272
qu-0.268
m-0.262
Token is
Token ID318
Feature activation+1.10e-01

Pos logit contributions
hell+0.072
es+0.070
oped+0.069
hed+0.069
qu+0.068
Neg logit contributions
 your-0.045
 laundry-0.044
 dad-0.042
 their-0.042
 same-0.041
Token getting
Token ID1972
Feature activation+4.68e-02

Pos logit contributions
 your+0.146
 laundry+0.144
 port+0.144
 computer+0.140
 math+0.139
Neg logit contributions
hell-0.235
umbs-0.230
oped-0.213
qu-0.212
m-0.210
Token dark
Token ID3223
Feature activation-9.27e-02

Pos logit contributions
 your+0.043
 laundry+0.042
 same+0.041
 computer+0.039
 their+0.039
Neg logit contributions
hell-0.128
umbs-0.124
oped-0.123
qu-0.121
hed-0.120
Token.
Token ID13
Feature activation-7.67e-03

Pos logit contributions
hell+0.178
oped+0.178
lor+0.177
rot+0.170
es+0.166
Neg logit contributions
 their-0.180
 same-0.177
 your-0.177
 laundry-0.173
 table-0.164
Token
Token ID198
Feature activation-5.47e-02

Pos logit contributions
hell+0.015
umbs+0.015
es+0.015
lor+0.015
oped+0.015
Neg logit contributions
 same-0.013
 laundry-0.013
 your-0.013
 their-0.012
 port-0.012
Token train
Token ID4512
Feature activation-2.20e-01

Pos logit contributions
 your+0.186
 their+0.175
 laundry+0.167
 same+0.166
 math+0.160
Neg logit contributions
hell-0.380
rot-0.362
oped-0.360
es-0.359
qu-0.357
Token on
Token ID319
Feature activation+1.29e-01

Pos logit contributions
es+0.425
hell+0.425
oped+0.409
hed+0.405
qu+0.396
Neg logit contributions
 your-0.393
 laundry-0.386
 same-0.376
 their-0.365
 our-0.356
Token the
Token ID262
Feature activation+1.53e-01

Pos logit contributions
 laundry+0.101
 same+0.098
 task+0.094
 your+0.092
 homework+0.086
Neg logit contributions
es-0.347
hell-0.346
hed-0.332
oc-0.328
qu-0.326
Token tracks
Token ID8339
Feature activation-1.92e-01

Pos logit contributions
 your+0.163
 their+0.147
 math+0.144
 Dad+0.143
 homework+0.135
Neg logit contributions
hell-0.375
qu-0.339
es-0.338
oc-0.335
hed-0.334
Token and
Token ID290
Feature activation+1.32e-01

Pos logit contributions
hell+0.363
es+0.346
oped+0.345
lor+0.344
umbs+0.339
Neg logit contributions
 your-0.361
 same-0.357
 laundry-0.354
 their-0.351
 our-0.334
Token turn
Token ID1210
Feature activation-8.33e-02

Pos logit contributions
 same+0.157
 task+0.140
 laundry+0.136
 math+0.133
 chore+0.133
Neg logit contributions
es-0.301
hed-0.279
oped-0.268
v-0.255
oc-0.247
Token it
Token ID340
Feature activation-2.92e-02

Pos logit contributions
hell+0.184
es+0.183
oped+0.176
hed+0.176
qu+0.171
Neg logit contributions
 same-0.120
 laundry-0.120
 your-0.114
 task-0.109
 table-0.109
Token on
Token ID319
Feature activation+1.30e-01

Pos logit contributions
hell+0.062
es+0.059
oped+0.059
lor+0.058
qu+0.058
Neg logit contributions
 your-0.048
 laundry-0.047
 same-0.047
 their-0.046
 dad-0.044
Token.
Token ID13
Feature activation-1.09e-02

Pos logit contributions
 same+0.236
 your+0.236
 laundry+0.234
 their+0.217
 port+0.216
Neg logit contributions
hell-0.239
es-0.228
qu-0.219
umbs-0.218
lor-0.216
Token The
Token ID383
Feature activation+7.37e-02

Pos logit contributions
umbs+0.021
es+0.020
uts+0.020
oped+0.019
hed+0.019
Neg logit contributions
 laundry-0.018
 same-0.018
 task-0.018
 your-0.017
 homework-0.016
Token train
Token ID4512
Feature activation-2.19e-01

Pos logit contributions
 your+0.099
 their+0.088
 math+0.085
 laundry+0.084
 same+0.084
Neg logit contributions
hell-0.172
rot-0.165
oped-0.162
qu-0.161
es-0.160
Token!
Token ID0
Feature activation+2.41e-02

Pos logit contributions
hell+0.032
qu+0.029
es+0.029
oped+0.028
umbs+0.028
Neg logit contributions
 laundry-0.037
 your-0.036
 their-0.036
 same-0.035
 computer-0.035
Token<|endoftext|>
Token ID50256
Feature activation-1.71e-01

Pos logit contributions
 your+0.041
 their+0.039
 same+0.038
 our+0.037
 dad+0.037
Neg logit contributions
hell-0.053
oped-0.050
qu-0.050
es-0.049
lor-0.048
TokenBob
Token ID18861
Feature activation-1.37e-01

Pos logit contributions
oot+0.440
hell+0.435
qu+0.418
uts+0.411
aring+0.406
Neg logit contributions
 the-0.235
 their-0.228
 Ben-0.222
 same-0.217
 your-0.212
Token was
Token ID373
Feature activation+1.05e-01

Pos logit contributions
oped+0.262
hell+0.261
es+0.259
hed+0.255
lor+0.248
Neg logit contributions
 laundry-0.244
 same-0.240
 your-0.238
 their-0.233
 math-0.226
Token very
Token ID845
Feature activation+8.21e-02

Pos logit contributions
 laundry+0.161
 task+0.158
 table+0.148
 stove+0.147
 computer+0.146
Neg logit contributions
hed-0.214
oped-0.209
rol-0.208
es-0.201
ez-0.201
Token angry
Token ID7954
Feature activation-6.29e-02

Pos logit contributions
 laundry+0.160
 computer+0.153
 table+0.145
 paper+0.144
 math+0.143
Neg logit contributions
oped-0.131
unch-0.124
ang-0.119
m-0.118
owing-0.117
Token.
Token ID13
Feature activation-7.55e-03

Pos logit contributions
hell+0.104
lor+0.098
oped+0.098
es+0.098
umbs+0.096
Neg logit contributions
 your-0.133
 same-0.132
 their-0.131
 laundry-0.131
 our-0.124
Token He
Token ID679
Feature activation-2.19e-02

Pos logit contributions
hell+0.019
lor+0.018
oped+0.018
es+0.018
umbs+0.018
Neg logit contributions
 your-0.009
 same-0.009
 their-0.009
 laundry-0.009
 our-0.008
Token wanted
Token ID2227
Feature activation-9.73e-03

Pos logit contributions
hell+0.038
hed+0.037
oped+0.037
es+0.037
ez+0.034
Neg logit contributions
 laundry-0.043
 your-0.042
 our-0.041
 port-0.041
 their-0.041
Token to
Token ID284
Feature activation+1.92e-01

Pos logit contributions
hell+0.020
uts+0.019
es+0.018
oot+0.018
umbs+0.017
Neg logit contributions
 same-0.013
 task-0.012
 port-0.012
 your-0.012
 our-0.011
Token play
Token ID711
Feature activation-5.14e-02

Pos logit contributions
 same+0.210
 your+0.196
 their+0.188
 port+0.187
 stove+0.185
Neg logit contributions
hell-0.466
qu-0.451
umbs-0.446
uts-0.436
oot-0.424
Token
Token ID241
Feature activation-3.14e-02

Pos logit contributions
hell+0.036
lor+0.034
oped+0.034
es+0.034
umbs+0.034
Neg logit contributions
 your-0.036
 same-0.036
 their-0.035
 laundry-0.035
 our-0.033
TokenWow
Token ID22017
Feature activation-1.36e-01

Pos logit contributions
lor+0.075
hell+0.074
es+0.073
oped+0.072
umbs+0.071
Neg logit contributions
 their-0.046
 same-0.045
 your-0.044
 laundry-0.042
 our-0.040
Token!
Token ID0
Feature activation+2.22e-02

Pos logit contributions
hell+0.278
lor+0.270
umbs+0.267
oped+0.266
qu+0.259
Neg logit contributions
 your-0.245
 their-0.237
 same-0.236
 laundry-0.230
 our-0.226
Token What
Token ID1867
Feature activation+2.40e-02

Pos logit contributions
 your+0.032
 their+0.031
 same+0.030
 laundry+0.029
 our+0.029
Neg logit contributions
hell-0.054
lor-0.051
oped-0.050
es-0.050
umbs-0.050
Tokenâ
Token ID22940
Feature activation-4.43e-02

Pos logit contributions
 same+0.031
 laundry+0.031
 your+0.031
 their+0.030
 table+0.029
Neg logit contributions
hell-0.059
es-0.058
qu-0.054
lor-0.054
oped-0.054
Token
Token ID26391
Feature activation-1.11e-01

Pos logit contributions
hell+0.074
es+0.072
umbs+0.071
oped+0.071
lor+0.070
Neg logit contributions
 your-0.094
 same-0.093
 laundry-0.092
 their-0.091
 our-0.087
Token
Token ID8151
Feature activation-2.38e-01

Pos logit contributions
hell+0.190
oot+0.181
rol+0.181
lor+0.174
qu+0.171
Neg logit contributions
 your-0.225
 their-0.219
 our-0.215
 Ben-0.211
 laundry-0.209
Tokens
Token ID82
Feature activation+7.96e-02

Pos logit contributions
lor+0.370
quet+0.328
rot+0.326
hell+0.320
aron+0.315
Neg logit contributions
 your-0.555
 their-0.534
 the-0.528
 same-0.525
 Ben-0.518
Token it
Token ID340
Feature activation-3.17e-02

Pos logit contributions
 table+0.108
 same+0.107
 laundry+0.104
 bathroom+0.101
 garage+0.095
Neg logit contributions
es-0.174
hell-0.173
rot-0.165
oot-0.165
qu-0.165
Token doing
Token ID1804
Feature activation+2.63e-02

Pos logit contributions
es+0.061
hell+0.061
qu+0.058
uts+0.056
lor+0.055
Neg logit contributions
 laundry-0.051
 their-0.050
 port-0.048
 dad-0.048
 table-0.048
Token?
Token ID30
Feature activation+4.62e-03

Pos logit contributions
 same+0.045
 laundry+0.045
 table+0.043
 their+0.041
 dad+0.041
Neg logit contributions
es-0.051
hell-0.047
rot-0.046
lor-0.045
qu-0.045
Token boy
Token ID2933
Feature activation-1.89e-01

Pos logit contributions
 your+0.132
 laundry+0.129
 same+0.128
 their+0.126
 our+0.115
Neg logit contributions
hell-0.301
rot-0.290
oc-0.289
qu-0.281
oped-0.280
Token named
Token ID3706
Feature activation-7.98e-02

Pos logit contributions
hell+0.431
lor+0.429
umbs+0.421
oped+0.404
rol+0.399
Neg logit contributions
 your-0.326
 their-0.326
 dad-0.292
 our-0.288
 the-0.285
Token Tim
Token ID5045
Feature activation-9.64e-02

Pos logit contributions
hell+0.182
umbs+0.174
oped+0.170
es+0.170
qu+0.169
Neg logit contributions
 laundry-0.113
 same-0.108
 your-0.102
 task-0.100
 port-0.096
Tokenmy
Token ID1820
Feature activation-6.43e-02

Pos logit contributions
lor+0.203
oped+0.197
qu+0.194
hell+0.193
umbs+0.192
Neg logit contributions
 your-0.182
 their-0.182
 same-0.176
 his-0.169
 dad-0.167
Token.
Token ID13
Feature activation-8.00e-03

Pos logit contributions
lor+0.147
rol+0.142
umbs+0.142
hell+0.141
oped+0.139
Neg logit contributions
 their-0.117
 your-0.112
 dad-0.109
 his-0.107
 the-0.106
Token Tim
Token ID5045
Feature activation-9.55e-02

Pos logit contributions
hell+0.022
lor+0.021
oped+0.021
es+0.021
umbs+0.021
Neg logit contributions
 your-0.008
 same-0.008
 their-0.008
 laundry-0.008
 our-0.007
Tokenmy
Token ID1820
Feature activation-6.35e-02

Pos logit contributions
hell+0.168
lor+0.167
oped+0.164
qu+0.162
es+0.160
Neg logit contributions
 your-0.195
 same-0.194
 their-0.191
 laundry-0.187
 dad-0.179
Token loved
Token ID6151
Feature activation+4.80e-02

Pos logit contributions
hell+0.140
es+0.133
oped+0.132
lor+0.131
hed+0.130
Neg logit contributions
 your-0.099
 same-0.098
 laundry-0.098
 their-0.095
 our-0.090
Token to
Token ID284
Feature activation+1.93e-01

Pos logit contributions
 same+0.076
 port+0.070
 task+0.069
 basement+0.065
 particular+0.064
Neg logit contributions
uts-0.083
qu-0.082
ann-0.079
es-0.078
oot-0.078
Token explore
Token ID7301
Feature activation-4.08e-03

Pos logit contributions
 same+0.261
 your+0.242
 our+0.237
 their+0.224
 laundry+0.221
Neg logit contributions
hell-0.443
qu-0.417
oped-0.403
r-0.401
oot-0.399
Token and
Token ID290
Feature activation+1.36e-01

Pos logit contributions
hell+0.011
es+0.011
oc+0.010
qu+0.010
hed+0.010
Neg logit contributions
 same-0.004
 laundry-0.004
 your-0.004
 dad-0.003
 task-0.003
Token.
Token ID13
Feature activation-1.67e-02

Pos logit contributions
hell+0.117
lor+0.113
oped+0.111
es+0.109
umbs+0.109
Neg logit contributions
 your-0.133
 their-0.132
 same-0.131
 laundry-0.130
 dad-0.124
Token But
Token ID887
Feature activation+5.99e-03

Pos logit contributions
umbs+0.038
es+0.038
hell+0.038
uts+0.037
oped+0.037
Neg logit contributions
 same-0.023
 laundry-0.023
 your-0.023
 our-0.021
 port-0.021
Token Lucy
Token ID22162
Feature activation-9.38e-02

Pos logit contributions
 laundry+0.009
 same+0.009
 computer+0.008
 port+0.008
 table+0.008
Neg logit contributions
es-0.012
oped-0.012
hell-0.012
hed-0.012
owed-0.011
Token just
Token ID655
Feature activation+4.21e-02

Pos logit contributions
hell+0.215
lor+0.208
umbs+0.206
qu+0.204
oped+0.203
Neg logit contributions
 your-0.139
 same-0.138
 their-0.136
 laundry-0.135
 dad-0.128
Token ignored
Token ID9514
Feature activation-6.62e-03

Pos logit contributions
 your+0.070
 same+0.069
 laundry+0.068
 their+0.068
 our+0.064
Neg logit contributions
hell-0.090
oped-0.086
es-0.086
lor-0.086
umbs-0.084
Token Jack
Token ID3619
Feature activation-8.46e-02

Pos logit contributions
hell+0.016
es+0.015
hed+0.015
qu+0.015
oped+0.014
Neg logit contributions
 same-0.009
 port-0.008
 laundry-0.008
 your-0.008
 task-0.008
Token.
Token ID13
Feature activation-1.66e-02

Pos logit contributions
lor+0.162
hell+0.159
oped+0.158
umbs+0.155
qu+0.154
Neg logit contributions
 their-0.161
 your-0.160
 same-0.158
 laundry-0.153
 dad-0.148
Token Jack
Token ID3619
Feature activation-8.47e-02

Pos logit contributions
umbs+0.036
es+0.036
hell+0.036
uts+0.035
oped+0.035
Neg logit contributions
 your-0.024
 same-0.023
 laundry-0.023
 our-0.022
 their-0.022
Token started
Token ID2067
Feature activation+2.90e-02

Pos logit contributions
hell+0.195
lor+0.189
umbs+0.187
qu+0.184
oped+0.183
Neg logit contributions
 same-0.124
 your-0.124
 their-0.123
 laundry-0.121
 dad-0.115
Token to
Token ID284
Feature activation+1.91e-01

Pos logit contributions
 same+0.039
 your+0.039
 port+0.038
 laundry+0.037
 our+0.036
Neg logit contributions
es-0.066
oped-0.064
hell-0.064
uts-0.063
hed-0.061
Token tease
Token ID34691
Feature activation-1.16e-01

Pos logit contributions
 your+0.214
 same+0.205
 laundry+0.200
 their+0.200
 our+0.190
Neg logit contributions
hell-0.494
oped-0.476
qu-0.474
es-0.469
hed-0.462
Token to
Token ID284
Feature activation+1.92e-01

Pos logit contributions
hell+0.079
es+0.076
lor+0.075
oped+0.074
qu+0.074
Neg logit contributions
 laundry-0.032
 same-0.031
 your-0.031
 their-0.030
 port-0.028
Token his
Token ID465
Feature activation+1.22e-01

Pos logit contributions
 laundry+0.185
 same+0.162
 task+0.153
 computer+0.151
 bathroom+0.147
Neg logit contributions
es-0.524
hell-0.521
lor-0.488
oc-0.487
hed-0.485
Token friends
Token ID2460
Feature activation-1.86e-01

Pos logit contributions
 your+0.171
 same+0.169
 their+0.166
 laundry+0.166
 our+0.153
Neg logit contributions
hell-0.291
lor-0.279
oped-0.279
es-0.278
umbs-0.275
Token.
Token ID13
Feature activation-1.70e-02

Pos logit contributions
hell+0.321
lor+0.309
oped+0.307
umbs+0.303
qu+0.299
Neg logit contributions
 your-0.388
 their-0.385
 same-0.377
 laundry-0.375
 dad-0.360
Token The
Token ID383
Feature activation+6.77e-02

Pos logit contributions
es+0.037
hell+0.037
oc+0.036
oped+0.036
lor+0.036
Neg logit contributions
 laundry-0.027
 your-0.025
 same-0.024
 task-0.023
 their-0.023
Token frog
Token ID21264
Feature activation-2.32e-01

Pos logit contributions
 same+0.096
 your+0.095
 their+0.093
 laundry+0.092
 dad+0.086
Neg logit contributions
hell-0.163
lor-0.158
oped-0.158
es-0.157
umbs-0.155
Token was
Token ID373
Feature activation+1.05e-01

Pos logit contributions
hell+0.535
lor+0.514
umbs+0.511
oped+0.504
qu+0.497
Neg logit contributions
 their-0.346
 your-0.346
 same-0.343
 laundry-0.323
 dad-0.321
Token scared
Token ID12008
Feature activation-7.20e-02

Pos logit contributions
 laundry+0.149
 your+0.143
 same+0.142
 their+0.135
 task+0.135
Neg logit contributions
hell-0.239
es-0.234
oped-0.230
lor-0.230
umbs-0.228
Token,
Token ID11
Feature activation+7.18e-02

Pos logit contributions
hell+0.145
es+0.138
oped+0.138
lor+0.138
umbs+0.136
Neg logit contributions
 your-0.127
 same-0.126
 laundry-0.125
 their-0.124
 our-0.117
Token but
Token ID475
Feature activation+4.30e-02

Pos logit contributions
 laundry+0.123
 your+0.117
 same+0.116
 port+0.115
 math+0.113
Neg logit contributions
hell-0.147
es-0.136
hed-0.134
qu-0.133
oped-0.133
Token the
Token ID262
Feature activation+1.52e-01

Pos logit contributions
 laundry+0.066
 computer+0.060
 same+0.059
 task+0.058
 table+0.058
Neg logit contributions
es-0.097
hell-0.094
oped-0.092
hed-0.090
rot-0.090
Token She
Token ID1375
Feature activation-2.04e-02

Pos logit contributions
 your+0.001
 same+0.001
 laundry+0.001
 their+0.001
 our+0.001
Neg logit contributions
hell-0.001
oped-0.001
es-0.001
lor-0.001
umbs-0.001
Token had
Token ID550
Feature activation+7.16e-02

Pos logit contributions
hell+0.043
oped+0.042
es+0.042
lor+0.041
qu+0.041
Neg logit contributions
 your-0.033
 same-0.033
 laundry-0.033
 their-0.032
 our-0.031
Token a
Token ID257
Feature activation+1.17e-01

Pos logit contributions
 port+0.068
 your+0.065
 their+0.064
 laundry+0.063
 task+0.063
Neg logit contributions
hell-0.191
lor-0.186
es-0.186
qu-0.183
umbs-0.182
Token cute
Token ID13779
Feature activation-8.17e-02

Pos logit contributions
 their+0.140
 your+0.139
 our+0.134
 Ben+0.123
 Dad+0.117
Neg logit contributions
rot-0.273
hell-0.271
oc-0.255
pillar-0.251
oped-0.246
Token little
Token ID1310
Feature activation-8.27e-03

Pos logit contributions
hell+0.185
rot+0.177
oc+0.171
es+0.169
oped+0.167
Neg logit contributions
 your-0.110
 their-0.109
 laundry-0.108
 same-0.104
 our-0.103
Token purse
Token ID27517
Feature activation-2.08e-01

Pos logit contributions
hell+0.021
oped+0.019
es+0.019
oc+0.019
rot+0.019
Neg logit contributions
 your-0.010
 their-0.010
 same-0.010
 laundry-0.010
 our-0.009
Token.
Token ID13
Feature activation-2.28e-03

Pos logit contributions
hell+0.491
umbs+0.484
oped+0.483
lor+0.480
qu+0.475
Neg logit contributions
 your-0.299
 same-0.291
 their-0.280
 laundry-0.279
 table-0.275
Token She
Token ID1375
Feature activation-2.22e-02

Pos logit contributions
hell+0.005
lor+0.005
oped+0.005
es+0.005
umbs+0.005
Neg logit contributions
 your-0.003
 same-0.003
 laundry-0.003
 their-0.003
 our-0.003
Token liked
Token ID8288
Feature activation+4.54e-02

Pos logit contributions
hell+0.044
oped+0.043
es+0.043
lor+0.041
hed+0.041
Neg logit contributions
 same-0.039
 laundry-0.039
 your-0.039
 their-0.037
 our-0.037
Token to
Token ID284
Feature activation+1.93e-01

Pos logit contributions
 same+0.054
 port+0.053
 task+0.051
 laundry+0.049
 wash+0.047
Neg logit contributions
hell-0.108
es-0.107
qu-0.103
uts-0.101
ann-0.101
Token take
Token ID1011
Feature activation+3.52e-02

Pos logit contributions
 same+0.247
 laundry+0.236
 port+0.236
 your+0.235
 our+0.234
Neg logit contributions
hell-0.460
qu-0.412
oped-0.411
oot-0.409
r-0.409
Token dirt
Token ID13647
Feature activation-1.66e-01

Pos logit contributions
 your+0.139
 homework+0.129
 same+0.127
 our+0.124
 port+0.121
Neg logit contributions
hell-0.279
es-0.263
umbs-0.262
hed-0.262
rol-0.255
Token and
Token ID290
Feature activation+1.35e-01

Pos logit contributions
hell+0.332
lor+0.318
oped+0.317
es+0.315
umbs+0.311
Neg logit contributions
 your-0.292
 same-0.290
 their-0.287
 laundry-0.286
 our-0.268
Token mum
Token ID25682
Feature activation-1.41e-01

Pos logit contributions
 your+0.162
 same+0.159
 port+0.156
 task+0.152
 computer+0.151
Neg logit contributions
hell-0.316
es-0.310
qu-0.303
hed-0.289
umbs-0.287
Token with
Token ID351
Feature activation+1.17e-01

Pos logit contributions
hell+0.323
lor+0.318
umbs+0.305
oped+0.304
rot+0.301
Neg logit contributions
 their-0.215
 your-0.212
 same-0.211
 laundry-0.210
 dad-0.200
Token the
Token ID262
Feature activation+1.49e-01

Pos logit contributions
 your+0.115
 same+0.103
 our+0.098
 my+0.093
 task+0.088
Neg logit contributions
hell-0.307
es-0.299
oped-0.297
hed-0.296
umbs-0.296
Token rake
Token ID47885
Feature activation-1.84e-01

Pos logit contributions
 your+0.183
 their+0.168
 our+0.166
 my+0.147
 same+0.145
Neg logit contributions
hell-0.378
umbs-0.353
es-0.350
hed-0.349
oped-0.347
Token made
Token ID925
Feature activation+6.96e-02

Pos logit contributions
hell+0.295
es+0.287
oped+0.281
hed+0.272
qu+0.270
Neg logit contributions
 your-0.393
 same-0.383
 laundry-0.376
 their-0.372
 our-0.371
Token it
Token ID340
Feature activation-3.27e-02

Pos logit contributions
 your+0.071
 same+0.070
 laundry+0.066
 their+0.064
 port+0.064
Neg logit contributions
hell-0.189
es-0.184
qu-0.180
umbs-0.179
oped-0.178
Token look
Token ID804
Feature activation-4.91e-02

Pos logit contributions
hell+0.084
es+0.078
qu+0.076
uts+0.076
umbs+0.075
Neg logit contributions
 your-0.036
 laundry-0.034
 dad-0.034
 computer-0.033
 same-0.033
Token nice
Token ID3621
Feature activation-7.67e-02

Pos logit contributions
hell+0.126
es+0.115
qu+0.113
oped+0.112
umbs+0.112
Neg logit contributions
 laundry-0.059
 your-0.056
 same-0.055
 dad-0.053
 computer-0.053
Token.
Token ID13
Feature activation-1.32e-02

Pos logit contributions
oped+0.149
hell+0.147
lor+0.147
qu+0.141
umbs+0.141
Neg logit contributions
 their-0.144
 your-0.142
 same-0.141
 laundry-0.139
 table-0.132
Token:
Token ID25
Feature activation+1.20e-01

Pos logit contributions
 their+0.008
 your+0.008
 same+0.007
 laundry+0.007
 our+0.007
Neg logit contributions
hell-0.007
umbs-0.007
lor-0.007
oped-0.006
qu-0.006
Token another
Token ID1194
Feature activation+3.62e-02

Pos logit contributions
 same+0.124
 your+0.122
 laundry+0.120
 their+0.111
 port+0.105
Neg logit contributions
hell-0.325
qu-0.314
es-0.314
oped-0.313
umbs-0.312
Token kid
Token ID5141
Feature activation-2.11e-01

Pos logit contributions
 your+0.049
 their+0.047
 same+0.047
 laundry+0.046
 our+0.044
Neg logit contributions
hell-0.091
oped-0.084
es-0.084
lor-0.083
umbs-0.083
Token found
Token ID1043
Feature activation+2.41e-02

Pos logit contributions
hell+0.414
es+0.397
oped+0.395
lor+0.392
qu+0.385
Neg logit contributions
 your-0.379
 same-0.378
 laundry-0.374
 their-0.370
 our-0.350
Token the
Token ID262
Feature activation+1.54e-01

Pos logit contributions
 same+0.028
 your+0.028
 laundry+0.027
 their+0.026
 port+0.025
Neg logit contributions
hell-0.062
es-0.061
oped-0.060
lor-0.060
hed-0.059
Token knife
Token ID9845
Feature activation-2.21e-01

Pos logit contributions
 their+0.224
 your+0.203
 our+0.196
 Grand+0.175
 port+0.174
Neg logit contributions
hell-0.374
rot-0.345
pillar-0.328
umbs-0.324
hed-0.319
Token and
Token ID290
Feature activation+1.36e-01

Pos logit contributions
hell+0.410
lor+0.390
oped+0.389
es+0.387
umbs+0.381
Neg logit contributions
 your-0.425
 same-0.420
 their-0.416
 laundry-0.415
 our-0.393
Token got
Token ID1392
Feature activation+5.22e-02

Pos logit contributions
 same+0.199
 task+0.188
 laundry+0.185
 port+0.184
 job+0.173
Neg logit contributions
es-0.310
hed-0.281
hell-0.280
oped-0.267
qu-0.263
Token hurt
Token ID5938
Feature activation-4.88e-02

Pos logit contributions
 laundry+0.070
 same+0.068
 computer+0.068
 table+0.067
 port+0.067
Neg logit contributions
hell-0.117
es-0.117
oped-0.115
hed-0.114
umbs-0.111
Token.
Token ID13
Feature activation-9.76e-03

Pos logit contributions
hell+0.087
lor+0.085
oped+0.083
umbs+0.081
qu+0.081
Neg logit contributions
 your-0.101
 their-0.101
 same-0.098
 laundry-0.096
 our-0.094
Token Lucy
Token ID22162
Feature activation-9.13e-02

Pos logit contributions
hell+0.021
oped+0.020
lor+0.020
es+0.020
qu+0.019
Neg logit contributions
 your-0.016
 their-0.016
 same-0.016
 laundry-0.016
 our-0.015
Token,
Token ID11
Feature activation+7.25e-02

Pos logit contributions
hell+0.051
es+0.051
oped+0.050
lor+0.049
qu+0.048
Neg logit contributions
 same-0.054
 your-0.052
 laundry-0.052
 their-0.051
 port-0.051
Token her
Token ID607
Feature activation+1.04e-01

Pos logit contributions
 same+0.104
 port+0.098
 task+0.098
 computer+0.096
 stove+0.095
Neg logit contributions
es-0.152
oped-0.150
oot-0.144
qu-0.144
hell-0.144
Token mom
Token ID1995
Feature activation-1.31e-01

Pos logit contributions
 your+0.104
 their+0.095
 our+0.088
 same+0.083
 laundry+0.078
Neg logit contributions
hell-0.280
rot-0.274
es-0.267
umbs-0.263
oped-0.263
Token lit
Token ID6578
Feature activation+2.90e-02

Pos logit contributions
hell+0.318
lor+0.311
umbs+0.297
oped+0.288
rot+0.285
Neg logit contributions
 their-0.188
 dad-0.187
 your-0.184
 same-0.180
 laundry-0.179
Token the
Token ID262
Feature activation+1.50e-01

Pos logit contributions
 your+0.016
 same+0.015
 laundry+0.015
 their+0.014
 dad+0.012
Neg logit contributions
hell-0.092
es-0.090
lor-0.089
qu-0.089
oped-0.089
Token oven
Token ID14361
Feature activation-1.83e-01

Pos logit contributions
 their+0.191
 your+0.188
 our+0.168
 same+0.160
 laundry+0.158
Neg logit contributions
hell-0.379
lor-0.362
rot-0.358
umbs-0.357
es-0.357
Token,
Token ID11
Feature activation+7.27e-02

Pos logit contributions
hell+0.289
es+0.281
umbs+0.272
lor+0.269
qu+0.268
Neg logit contributions
 your-0.396
 same-0.388
 laundry-0.383
 their-0.379
 our-0.370
Token and
Token ID290
Feature activation+1.35e-01

Pos logit contributions
 task+0.136
 same+0.134
 port+0.133
 laundry+0.131
 homework+0.130
Neg logit contributions
es-0.123
hed-0.120
hell-0.117
qu-0.117
oped-0.116
Token the
Token ID262
Feature activation+1.50e-01

Pos logit contributions
 same+0.182
 your+0.174
 laundry+0.170
 port+0.170
 task+0.169
Neg logit contributions
es-0.317
oped-0.307
hell-0.306
qu-0.305
hed-0.302
Token spirit
Token ID4437
Feature activation-2.38e-01

Pos logit contributions
 your+0.184
 their+0.176
 our+0.163
 same+0.155
 laundry+0.151
Neg logit contributions
hell-0.385
rot-0.364
lor-0.364
qu-0.362
es-0.362
Token was
Token ID373
Feature activation+1.04e-01

Pos logit contributions
hell+0.560
lor+0.546
umbs+0.530
oped+0.523
oot+0.518
Neg logit contributions
 their-0.343
 same-0.336
 your-0.333
 laundry-0.331
 dad-0.317