TOP ACTIVATIONS
MAX = 0.247

. They hadn 't known the little girl they met ,
asked him , " Did the hat and shield help you
said , " Isn 't the sunset beautiful ?"
twins jumped back . Was the rock alive ?
" Oh no , the chocolate is burning !" Jill
" Oh no , the garbage truck is here .
She hadn 't known that the dish she had chosen was
" Don 't worry , the dog won 't hurt you
. They do not know the man . He is unknown
, " Oh no , the hurricane will destroy everything !"
she asked . " Did the ink hurt you ?"
" Isn 't the sunrise beautiful ?".
then on , Meg knew to be careful with the things
. She didn 't know the answer . She closed her
she met . She hoped to see Blink again someday .
" Don 't worry , the dog is on a leash
in her suitcase and hoped to have a lot of fun
was very angry and told the little girl to stop playing
day on , Lily knew to be careful around the warm
happy that they had named the special star . They never

MIN ACTIVATIONS
MIN = -0.349

skills to climb up the mast and jump onto the pirate
slippery and the car sk idd ed and hit a tree
However , when the k idd o got to the university
she stepped closer to the bog . Suddenly , Rose heard
tree and takes the aer opl ane . He does not
to wear his fancy swe aters every day . One day
had become trapped in the bog . Rose thought and thought
at the end of the trail . She was so excited
driver hon ked the car horn and shouted , " Move
<|endoftext|> Ryan watched the aer opl ane fly in the sky
still remembered the big aer opl ane flying high in the
he watched the clean aer opl ane fly away .
led up in the warm blanket and enjoyed his adventure with
And put on your swe aters , or you will catch
wrap his paws around the bow and pull . Spot did
got dressed in her warm coat , hat and gloves .
went to see the aer opl ane in the sky .
closer and watched the aer opl ane fly above them .
and put on their swe aters . Now , Tim was
she z ipped up her coat because it was cold outside

INTERVAL 0.098 - 0.247
CONTAINS 20.526%

stay away from the tower in case they got hurt .
mom my . They saw a group of kids playing on
was always worth exploring things that were a bit scary .
, round thing that grows into a flower ."
!" But John was not so happy , because

INTERVAL -0.051 - 0.098
CONTAINS 44.806%

. She picked it up and showed it to him .
with your sister . She has no candy and she is
to start playing but then she heard her mom 's voice
they brought the movie home . Lily was so happy and
to open the package right away , so she took the

INTERVAL -0.200 - -0.051
CONTAINS 32.659%

and Ben hear the dog . They see the dog .
a loud noise being made . It was the door to
top , he saw a man with a big tray .
cave , he met his friend , a little rabbit named
it ; it was so nice ! Lily 's mom helped

INTERVAL -0.349 - -0.200
CONTAINS 2.009%

and help . The fire truck used its long hose to
asked . She heard the noise and came to the kitchen
very hot day and the sun was bright . Lily felt
it . He blew the horn . It made a loud
They got back in the car and drove home .
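
The four INTERVAL blocks above appear to be equal-width bins spanning MIN to MAX: (0.247 - (-0.349)) / 4 = 0.149, which reproduces the listed boundaries -0.200, -0.051 and 0.098, and the four CONTAINS percentages sum to 100. A minimal Python sketch of that binning follows. It assumes equal-width bins, since the dump does not state how the intervals were chosen, and `interval_edges` is a hypothetical helper, not part of any dashboard tool.

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced boundaries from lo to hi (3 decimals)."""
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]

# MIN and MAX exactly as listed in the dump
edges = interval_edges(-0.349, 0.247)

# Values from the four CONTAINS lines, highest interval first
contains = [20.526, 44.806, 32.659, 2.009]
total = sum(contains)

print(edges)            # boundaries of the four intervals, lowest first
print(round(total, 3))  # share of activations covered by the four intervals
```

If the assumption holds, the printed boundaries match the INTERVAL headers, and the total confirms that the four bins cover every sampled activation.
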
Token.
Token ID13
Feature activation-6.05e-02

Pos logit contributions
 same+0.000
 way+0.000
 task+0.000
 table+0.000
 boring+0.000
Neg logit contributions
unch-0.000
amb-0.000
oped-0.000
umbs-0.000
qu-0.000
Token They
Token ID1119
Feature activation+1.30e-02

Pos logit contributions
amb+0.111
unch+0.104
uts+0.100
aring+0.099
oot+0.097
Neg logit contributions
 same-0.125
 way-0.115
 task-0.113
 boring-0.108
 bad-0.107
Token hadn
Token ID8020
Feature activation+2.50e-02

Pos logit contributions
 same+0.023
 way+0.021
 task+0.020
 boring+0.019
 attic+0.019
Neg logit contributions
amb-0.027
unch-0.027
oot-0.026
oped-0.026
uts-0.025
Token't
Token ID470
Feature activation+1.29e-01

Pos logit contributions
 same+0.042
 way+0.039
 task+0.034
 garage+0.033
 laundry+0.033
Neg logit contributions
amb-0.055
unch-0.053
oing-0.052
aring-0.051
ucking-0.049
Token known
Token ID1900
Feature activation+3.05e-02

Pos logit contributions
 same+0.217
 task+0.194
 attic+0.191
 garage+0.183
 stove+0.182
Neg logit contributions
unch-0.266
oot-0.254
qu-0.250
oped-0.250
uts-0.248
Token the
Token ID262
Feature activation+2.47e-01

Pos logit contributions
 same+0.061
 way+0.056
 task+0.053
 attic+0.052
 boring+0.052
Neg logit contributions
unch-0.057
amb-0.056
oot-0.053
aring-0.052
oped-0.051
Token little
Token ID1310
Feature activation+4.20e-02

Pos logit contributions
 same+0.268
 way+0.232
 proper+0.227
 attic+0.227
 stove+0.223
Neg logit contributions
amb-0.603
unch-0.589
aring-0.587
oot-0.580
qu-0.577
Token girl
Token ID2576
Feature activation-2.49e-02

Pos logit contributions
 same+0.048
 homework+0.044
 stove+0.043
 attic+0.042
 way+0.041
Neg logit contributions
amb-0.097
aring-0.092
oot-0.092
qu-0.091
unch-0.091
Token they
Token ID484
Feature activation+1.23e-01

Pos logit contributions
unch+0.044
amb+0.044
oot+0.041
aring+0.041
oped+0.041
Neg logit contributions
 same-0.052
 way-0.048
 task-0.046
 boring-0.045
 attic-0.044
Token met
Token ID1138
Feature activation-1.40e-03

Pos logit contributions
 same+0.205
 way+0.182
 task+0.179
 attic+0.171
 laundry+0.170
Neg logit contributions
unch-0.262
amb-0.260
oped-0.258
oot-0.251
uts-0.249
Token,
Token ID11
Feature activation+5.34e-02

Pos logit contributions
unch+0.002
amb+0.002
oot+0.002
oped+0.002
aring+0.002
Neg logit contributions
 same-0.003
 way-0.003
 task-0.003
 boring-0.003
 attic-0.003
Token asked
Token ID1965
Feature activation-2.20e-02

Pos logit contributions
unch+0.107
amb+0.105
oped+0.101
oot+0.099
aring+0.098
Neg logit contributions
 same-0.086
 way-0.075
 task-0.074
 attic-0.071
 boring-0.071
Token him
Token ID683
Feature activation+1.33e-01

Pos logit contributions
unch+0.032
amb+0.030
oot+0.029
aring+0.028
ucking+0.027
Neg logit contributions
 same-0.051
 way-0.049
 task-0.046
 attic-0.045
 boring-0.045
Token,
Token ID11
Feature activation+6.98e-02

Pos logit contributions
 same+0.219
 way+0.196
 task+0.181
 boring+0.177
 attic+0.175
Neg logit contributions
unch-0.300
amb-0.297
oot-0.283
aring-0.279
oped-0.278
Token "
Token ID366
Feature activation-5.68e-03

Pos logit contributions
 same+0.107
 way+0.096
 task+0.088
 boring+0.086
 attic+0.084
Neg logit contributions
unch-0.166
amb-0.163
oot-0.155
oped-0.154
aring-0.153
TokenDid
Token ID11633
Feature activation+3.21e-02

Pos logit contributions
unch+0.011
amb+0.011
oot+0.010
oped+0.010
aring+0.010
Neg logit contributions
 same-0.011
 way-0.010
 task-0.010
 boring-0.010
 attic-0.009
Token the
Token ID262
Feature activation+2.46e-01

Pos logit contributions
 same+0.052
 way+0.047
 attic+0.046
 task+0.046
 basement+0.045
Neg logit contributions
unch-0.068
oot-0.067
amb-0.066
aring-0.065
uts-0.065
Token hat
Token ID6877
Feature activation-1.79e-01

Pos logit contributions
 same+0.251
 proper+0.238
 way+0.227
 drawer+0.223
 shelf+0.222
Neg logit contributions
unch-0.634
amb-0.617
oot-0.585
aring-0.571
qu-0.566
Token and
Token ID290
Feature activation+1.36e-01

Pos logit contributions
unch+0.405
oot+0.388
aring+0.381
amb+0.377
uts+0.374
Neg logit contributions
 same-0.258
 attic-0.242
 way-0.232
 drawer-0.224
 proper-0.223
Token shield
Token ID7614
Feature activation-1.57e-01

Pos logit contributions
 same+0.225
 way+0.198
 task+0.189
 attic+0.183
 laundry+0.177
Neg logit contributions
unch-0.275
oot-0.266
aring-0.261
uts-0.259
amb-0.258
Token help
Token ID1037
Feature activation+1.60e-02

Pos logit contributions
unch+0.333
oot+0.328
uts+0.324
aring+0.321
amb+0.319
Neg logit contributions
 same-0.234
 attic-0.219
 drawer-0.205
 way-0.205
 laundry-0.204
Token you
Token ID345
Feature activation+1.26e-01

Pos logit contributions
 same+0.021
 way+0.019
 attic+0.019
 drawer+0.018
 task+0.018
Neg logit contributions
unch-0.037
oot-0.036
qu-0.036
amb-0.036
uts-0.035
Token said
Token ID531
Feature activation-1.01e-02

Pos logit contributions
 same+0.219
 task+0.197
 way+0.194
 attic+0.186
 boring+0.185
Neg logit contributions
amb-0.183
unch-0.174
oot-0.167
oped-0.165
uts-0.161
Token,
Token ID11
Feature activation+7.03e-02

Pos logit contributions
unch+0.020
amb+0.019
oot+0.018
oped+0.018
oing+0.018
Neg logit contributions
 same-0.020
 way-0.018
 boring-0.017
 task-0.017
 bad-0.017
Token "
Token ID366
Feature activation-3.68e-03

Pos logit contributions
 same+0.076
 way+0.064
 boring+0.055
 table+0.052
 bad+0.051
Neg logit contributions
unch-0.197
amb-0.196
oot-0.190
oped-0.187
umbs-0.186
TokenIsn
Token ID41451
Feature activation+4.44e-02

Pos logit contributions
unch+0.008
oot+0.007
amb+0.007
uts+0.007
aring+0.007
Neg logit contributions
 same-0.007
 way-0.006
 boring-0.005
 attic-0.005
 task-0.005
Token't
Token ID470
Feature activation+1.47e-01

Pos logit contributions
 same+0.092
 way+0.084
 task+0.079
 boring+0.078
 attic+0.076
Neg logit contributions
amb-0.082
unch-0.081
oot-0.075
oped-0.074
aring-0.074
Token the
Token ID262
Feature activation+2.46e-01

Pos logit contributions
 same+0.256
 way+0.237
 task+0.227
 homework+0.218
 laundry+0.216
Neg logit contributions
unch-0.299
amb-0.296
oot-0.288
uts-0.280
qu-0.276
Token sunset
Token ID26428
Feature activation-1.27e-01

Pos logit contributions
 same+0.306
 proper+0.291
 homework+0.289
 way+0.289
 task+0.279
Neg logit contributions
unch-0.554
amb-0.552
qu-0.520
uts-0.504
oot-0.498
Token beautiful
Token ID4950
Feature activation+8.15e-03

Pos logit contributions
unch+0.281
amb+0.264
uts+0.258
oot+0.255
qu+0.254
Neg logit contributions
 same-0.207
 way-0.184
 attic-0.172
 task-0.171
 garage-0.168
Token?"
Token ID1701
Feature activation-5.66e-02

Pos logit contributions
 same+0.014
 way+0.012
 task+0.012
 boring+0.011
 attic+0.011
Neg logit contributions
unch-0.017
amb-0.017
oot-0.016
uts-0.016
oing-0.016
Token
Token ID198
Feature activation-5.25e-02

Pos logit contributions
unch+0.110
amb+0.107
oot+0.103
oing+0.102
oped+0.102
Neg logit contributions
 same-0.111
 way-0.103
 boring-0.094
 laundry-0.092
 drawer-0.092
Token
Token ID198
Feature activation-4.81e-02

Pos logit contributions
unch+0.113
amb+0.110
owing+0.107
oot+0.106
umbs+0.106
Neg logit contributions
 same-0.091
 way-0.083
 new-0.081
 big-0.079
 room-0.078
Token twins
Token ID20345
Feature activation-1.27e-01

Pos logit contributions
 same+0.296
 way+0.260
 attic+0.259
 task+0.252
 proper+0.248
Neg logit contributions
unch-0.461
amb-0.446
aring-0.432
qu-0.431
oot-0.426
Token jumped
Token ID11687
Feature activation-7.56e-02

Pos logit contributions
amb+0.302
unch+0.301
oot+0.282
aring+0.279
oing+0.276
Neg logit contributions
 same-0.194
 way-0.183
 bad-0.158
 boring-0.158
 garage-0.155
Token back
Token ID736
Feature activation+3.94e-02

Pos logit contributions
unch+0.181
amb+0.176
oot+0.166
oped+0.165
oing+0.165
Neg logit contributions
 same-0.121
 way-0.108
 task-0.096
 boring-0.096
 bad-0.094
Token.
Token ID13
Feature activation-4.94e-02

Pos logit contributions
 same+0.066
 way+0.063
 task+0.054
 boring+0.053
 attic+0.053
Neg logit contributions
unch-0.090
amb-0.085
qu-0.082
oped-0.081
aring-0.081
Token Was
Token ID8920
Feature activation+3.63e-03

Pos logit contributions
unch+0.111
amb+0.109
oot+0.104
oped+0.102
oing+0.101
Neg logit contributions
 same-0.083
 way-0.075
 boring-0.067
 task-0.066
 attic-0.065
Token the
Token ID262
Feature activation+2.44e-01

Pos logit contributions
 same+0.006
 way+0.005
 task+0.005
 attic+0.005
 boring+0.005
Neg logit contributions
unch-0.008
amb-0.008
oot-0.008
oped-0.008
uts-0.008
Token rock
Token ID3881
Feature activation-1.20e-01

Pos logit contributions
 same+0.269
 way+0.241
 attic+0.240
 proper+0.234
 garage+0.224
Neg logit contributions
unch-0.622
amb-0.598
qu-0.576
oot-0.568
uts-0.567
Token alive
Token ID6776
Feature activation+7.65e-03

Pos logit contributions
unch+0.258
aring+0.248
oped+0.247
oot+0.244
uts+0.242
Neg logit contributions
 attic-0.168
 same-0.165
 basement-0.158
 proper-0.153
 drawer-0.152
Token?
Token ID30
Feature activation-1.91e-02

Pos logit contributions
 same+0.013
 proper+0.012
 attic+0.012
 way+0.012
 drawer+0.012
Neg logit contributions
unch-0.014
oot-0.013
amb-0.013
oped-0.013
uts-0.013
Token
Token ID198
Feature activation-5.14e-02

Pos logit contributions
unch+0.039
amb+0.039
oot+0.037
oing+0.036
oped+0.036
Neg logit contributions
 same-0.035
 way-0.031
 boring-0.028
 garage-0.028
 task-0.028
Token
Token ID198
Feature activation-4.60e-02

Pos logit contributions
amb+0.117
owing+0.113
unch+0.112
oot+0.111
rol+0.106
Neg logit contributions
 same-0.077
 big-0.066
 way-0.065
 bad-0.065
 new-0.064
Token
Token ID198
Feature activation-4.32e-02

Pos logit contributions
amb+0.107
unch+0.099
oot+0.099
oped+0.098
owing+0.097
Neg logit contributions
 same-0.080
 way-0.073
 new-0.067
 big-0.067
 bad-0.066
Token"
Token ID1
Feature activation-2.23e-02

Pos logit contributions
amb+0.082
unch+0.081
oot+0.077
oped+0.074
aring+0.072
Neg logit contributions
 same-0.089
 way-0.080
 boring-0.076
 task-0.076
 attic-0.075
TokenOh
Token ID5812
Feature activation+1.16e-01

Pos logit contributions
unch+0.046
amb+0.046
oot+0.043
oped+0.043
aring+0.042
Neg logit contributions
 same-0.041
 way-0.037
 task-0.035
 boring-0.034
 attic-0.034
Token no
Token ID645
Feature activation+1.38e-01

Pos logit contributions
 same+0.176
 way+0.154
 task+0.147
 attic+0.145
 garage+0.144
Neg logit contributions
unch-0.266
amb-0.256
aring-0.247
oot-0.246
oped-0.244
Token,
Token ID11
Feature activation+8.52e-02

Pos logit contributions
 same+0.310
 way+0.293
 boring+0.273
 task+0.272
 bad+0.268
Neg logit contributions
amb-0.240
unch-0.231
oped-0.226
oot-0.219
oing-0.215
Token the
Token ID262
Feature activation+2.42e-01

Pos logit contributions
 same+0.137
 way+0.125
 attic+0.119
 task+0.118
 garage+0.118
Neg logit contributions
unch-0.175
aring-0.162
uts-0.157
amb-0.155
qu-0.155
Token chocolate
Token ID11311
Feature activation-1.06e-01

Pos logit contributions
 proper+0.307
 attic+0.292
 same+0.288
 basement+0.273
 way+0.265
Neg logit contributions
unch-0.533
umbs-0.500
ew-0.469
qu-0.466
m-0.444
Token is
Token ID318
Feature activation+1.72e-01

Pos logit contributions
unch+0.229
uts+0.219
amb+0.215
qu+0.214
umbs+0.212
Neg logit contributions
 same-0.159
 way-0.140
 attic-0.139
 garage-0.138
 proper-0.136
Token burning
Token ID9482
Feature activation-5.13e-02

Pos logit contributions
 garage+0.244
 same+0.238
 attic+0.238
 basement+0.221
 task+0.213
Neg logit contributions
unch-0.340
uts-0.339
umbs-0.320
qu-0.318
ew-0.316
Token!"
Token ID2474
Feature activation-6.00e-02

Pos logit contributions
unch+0.062
amb+0.060
uts+0.055
oot+0.055
qu+0.055
Neg logit contributions
 same-0.132
 way-0.125
 task-0.119
 garage-0.119
 attic-0.118
Token Jill
Token ID22772
Feature activation+8.66e-03

Pos logit contributions
unch+0.127
amb+0.126
uts+0.119
oot+0.117
oped+0.117
Neg logit contributions
 same-0.099
 way-0.091
 task-0.088
 attic-0.087
 garage-0.083
Token
Token ID198
Feature activation-4.64e-02

Pos logit contributions
amb+0.122
owing+0.116
unch+0.114
rol+0.113
oot+0.112
Neg logit contributions
 same-0.078
 way-0.073
 big-0.070
 bad-0.068
 old-0.067
Token"
Token ID1
Feature activation-2.73e-02

Pos logit contributions
amb+0.089
unch+0.088
oot+0.082
rol+0.080
owing+0.079
Neg logit contributions
 same-0.091
 way-0.083
 bad-0.081
 garage-0.080
 boring-0.080
TokenOh
Token ID5812
Feature activation+1.15e-01

Pos logit contributions
unch+0.060
amb+0.059
oot+0.057
aring+0.056
uts+0.055
Neg logit contributions
 same-0.048
 way-0.043
 laundry-0.039
 task-0.039
 boring-0.039
Token no
Token ID645
Feature activation+1.36e-01

Pos logit contributions
 same+0.151
 way+0.132
 task+0.117
 boring+0.116
 attic+0.111
Neg logit contributions
unch-0.303
amb-0.302
oot-0.287
oped-0.284
uts-0.282
Token,
Token ID11
Feature activation+8.23e-02

Pos logit contributions
 same+0.273
 way+0.264
 boring+0.237
 bad+0.234
 task+0.233
Neg logit contributions
amb-0.275
unch-0.265
oped-0.265
uts-0.255
oing-0.252
Token the
Token ID262
Feature activation+2.42e-01

Pos logit contributions
 same+0.126
 way+0.112
 task+0.106
 attic+0.102
 garage+0.101
Neg logit contributions
unch-0.181
amb-0.171
aring-0.170
oot-0.166
uts-0.163
Token garbage
Token ID15413
Feature activation-1.24e-01

Pos logit contributions
 same+0.277
 way+0.237
 task+0.223
 boring+0.220
 homework+0.218
Neg logit contributions
unch-0.627
amb-0.616
aring-0.585
oped-0.579
oot-0.579
Token truck
Token ID7779
Feature activation-1.28e-01

Pos logit contributions
amb+0.266
unch+0.265
oped+0.253
oot+0.248
uts+0.245
Neg logit contributions
 same-0.201
 way-0.186
 task-0.172
 boring-0.171
 homework-0.166
Token is
Token ID318
Feature activation+1.63e-01

Pos logit contributions
unch+0.165
amb+0.157
oped+0.152
aring+0.151
oot+0.150
Neg logit contributions
 same-0.320
 way-0.297
 attic-0.289
 task-0.287
 laundry-0.287
Token here
Token ID994
Feature activation+5.20e-02

Pos logit contributions
 same+0.254
 attic+0.228
 task+0.226
 garage+0.222
 laundry+0.214
Neg logit contributions
unch-0.324
uts-0.315
aring-0.305
oot-0.303
qu-0.299
Token.
Token ID13
Feature activation-3.52e-02

Pos logit contributions
 same+0.126
 way+0.118
 task+0.111
 boring+0.110
 attic+0.108
Neg logit contributions
unch-0.077
amb-0.076
oot-0.070
oped-0.069
aring-0.067
Token She
Token ID1375
Feature activation+4.49e-02

Pos logit contributions
amb+0.087
unch+0.086
oot+0.079
aring+0.079
uts+0.078
Neg logit contributions
 same-0.080
 way-0.074
 task-0.069
 boring-0.068
 bad-0.066
Token hadn
Token ID8020
Feature activation+3.11e-02

Pos logit contributions
 same+0.086
 way+0.076
 attic+0.075
 task+0.074
 basement+0.073
Neg logit contributions
amb-0.086
unch-0.083
oped-0.082
aring-0.078
oot-0.078
Token't
Token ID470
Feature activation+1.31e-01

Pos logit contributions
 same+0.055
 way+0.051
 task+0.045
 boring+0.045
 bad+0.044
Neg logit contributions
amb-0.067
unch-0.065
oing-0.063
aring-0.062
oot-0.060
Token known
Token ID1900
Feature activation+3.52e-02

Pos logit contributions
 same+0.241
 attic+0.216
 basement+0.206
 task+0.205
 garage+0.205
Neg logit contributions
unch-0.256
uts-0.240
aring-0.235
oped-0.231
qu-0.230
Token that
Token ID326
Feature activation+1.70e-01

Pos logit contributions
 same+0.065
 way+0.059
 task+0.056
 garage+0.055
 attic+0.055
Neg logit contributions
unch-0.070
amb-0.069
oot-0.066
aring-0.064
uts-0.064
Token the
Token ID262
Feature activation+2.42e-01

Pos logit contributions
 same+0.269
 garage+0.245
 attic+0.238
 basement+0.234
 task+0.234
Neg logit contributions
uts-0.312
unch-0.308
aring-0.305
amb-0.300
es-0.298
Token dish
Token ID9433
Feature activation-1.48e-01

Pos logit contributions
 same+0.273
 attic+0.271
 basement+0.265
 garage+0.262
 proper+0.252
Neg logit contributions
amb-0.481
unch-0.479
aring-0.477
umbs-0.447
qu-0.445
Token she
Token ID673
Feature activation+1.22e-01

Pos logit contributions
unch+0.280
uts+0.279
amb+0.279
aring+0.269
oot+0.264
Neg logit contributions
 same-0.280
 way-0.253
 task-0.238
 attic-0.234
 laundry-0.231
Token had
Token ID550
Feature activation+1.11e-01

Pos logit contributions
 same+0.220
 attic+0.194
 task+0.192
 way+0.192
 basement+0.191
Neg logit contributions
oped-0.234
amb-0.233
uts-0.229
unch-0.228
aring-0.227
Token chosen
Token ID7147
Feature activation+1.53e-02

Pos logit contributions
 same+0.192
 attic+0.176
 garage+0.174
 basement+0.171
 task+0.171
Neg logit contributions
uts-0.215
unch-0.212
oot-0.207
aring-0.201
qu-0.201
Token was
Token ID373
Feature activation+1.51e-01

Pos logit contributions
 same+0.029
 way+0.025
 task+0.025
 garage+0.025
 attic+0.025
Neg logit contributions
unch-0.029
uts-0.028
amb-0.028
oot-0.028
aring-0.027
Token "
Token ID366
Feature activation-7.63e-03

Pos logit contributions
 same+0.080
 way+0.070
 boring+0.062
 task+0.060
 bad+0.058
Neg logit contributions
unch-0.156
amb-0.156
oot-0.148
oped-0.147
aring-0.144
TokenDon
Token ID3987
Feature activation+2.61e-02

Pos logit contributions
amb+0.015
unch+0.015
oped+0.014
oot+0.014
aring+0.014
Neg logit contributions
 same-0.015
 way-0.014
 task-0.013
 boring-0.013
 attic-0.012
Token't
Token ID470
Feature activation+1.48e-01

Pos logit contributions
 same+0.055
 way+0.050
 task+0.047
 boring+0.046
 attic+0.045
Neg logit contributions
unch-0.048
amb-0.047
oot-0.044
oped-0.043
oing-0.043
Token worry
Token ID5490
Feature activation-3.05e-02

Pos logit contributions
 same+0.204
 way+0.182
 task+0.161
 boring+0.160
 laundry+0.151
Neg logit contributions
unch-0.380
amb-0.378
oped-0.359
oot-0.356
uts-0.350
Token,
Token ID11
Feature activation+8.21e-02

Pos logit contributions
unch+0.053
amb+0.051
oot+0.049
aring+0.048
uts+0.048
Neg logit contributions
 same-0.066
 way-0.060
 task-0.057
 attic-0.056
 garage-0.056
Token the
Token ID262
Feature activation+2.41e-01

Pos logit contributions
 same+0.128
 basement+0.111
 garage+0.110
 attic+0.110
 way+0.108
Neg logit contributions
unch-0.173
aring-0.167
oot-0.162
uts-0.161
qu-0.159
Token dog
Token ID3290
Feature activation-1.02e-01

Pos logit contributions
 same+0.291
 way+0.253
 task+0.225
 boring+0.222
 attic+0.218
Neg logit contributions
unch-0.647
amb-0.636
oot-0.602
aring-0.599
oped-0.598
Token won
Token ID1839
Feature activation-1.35e-02

Pos logit contributions
unch+0.227
aring+0.226
amb+0.226
oped+0.223
qu+0.217
Neg logit contributions
 same-0.149
 attic-0.133
 laundry-0.130
 task-0.129
 way-0.128
Token't
Token ID470
Feature activation+1.33e-01

Pos logit contributions
unch+0.031
amb+0.031
oped+0.028
aring+0.028
oing+0.028
Neg logit contributions
 same-0.024
 way-0.021
 bad-0.020
 task-0.019
 boring-0.019
Token hurt
Token ID5938
Feature activation-2.97e-02

Pos logit contributions
 same+0.148
 way+0.133
 task+0.108
 boring+0.108
 bad+0.107
Neg logit contributions
amb-0.387
unch-0.386
oped-0.370
oing-0.360
rol-0.359
Token you
Token ID345
Feature activation+1.25e-01

Pos logit contributions
unch+0.078
amb+0.077
oped+0.073
oot+0.073
aring+0.072
Neg logit contributions
 same-0.039
 way-0.034
 task-0.030
 boring-0.030
 attic-0.029
Token.
Token ID13
Feature activation-3.13e-02

Pos logit contributions
oped+0.073
amb+0.071
rol+0.070
unch+0.069
qu+0.068
Neg logit contributions
 same-0.046
 way-0.042
 old-0.040
 bad-0.040
 new-0.039
Token They
Token ID1119
Feature activation+2.65e-02

Pos logit contributions
amb+0.063
unch+0.062
oot+0.058
uts+0.058
aring+0.058
Neg logit contributions
 same-0.059
 way-0.054
 task-0.051
 boring-0.050
 attic-0.049
Token do
Token ID466
Feature activation+5.69e-02

Pos logit contributions
 same+0.041
 way+0.037
 task+0.034
 boring+0.033
 attic+0.033
Neg logit contributions
amb-0.061
unch-0.061
oot-0.058
oped-0.057
aring-0.057
Token not
Token ID407
Feature activation+1.38e-01

Pos logit contributions
 same+0.061
 way+0.052
 attic+0.047
 task+0.046
 laundry+0.044
Neg logit contributions
unch-0.159
amb-0.154
oot-0.151
aring-0.151
uts-0.150
Token know
Token ID760
Feature activation+4.90e-03

Pos logit contributions
 same+0.280
 way+0.258
 boring+0.243
 task+0.238
 bad+0.238
Neg logit contributions
unch-0.273
amb-0.269
oped-0.251
oot-0.245
oing-0.244
Token the
Token ID262
Feature activation+2.41e-01

Pos logit contributions
 same+0.009
 way+0.008
 boring+0.008
 bad+0.007
 attic+0.007
Neg logit contributions
amb-0.011
unch-0.011
oped-0.010
oing-0.010
oot-0.010
Token man
Token ID582
Feature activation-7.69e-02

Pos logit contributions
 same+0.297
 way+0.255
 boring+0.249
 proper+0.244
 basement+0.241
Neg logit contributions
unch-0.564
amb-0.553
aring-0.532
uts-0.526
oot-0.521
Token.
Token ID13
Feature activation-3.50e-02

Pos logit contributions
unch+0.158
aring+0.153
amb+0.153
oot+0.150
oped+0.149
Neg logit contributions
 same-0.135
 way-0.121
 attic-0.117
 task-0.117
 laundry-0.114
Token He
Token ID679
Feature activation+5.79e-02

Pos logit contributions
unch+0.072
amb+0.072
oot+0.067
oped+0.066
aring+0.066
Neg logit contributions
 same-0.064
 way-0.059
 task-0.055
 boring-0.054
 attic-0.053
Token is
Token ID318
Feature activation+1.68e-01

Pos logit contributions
 same+0.088
 way+0.082
 task+0.077
 laundry+0.077
 attic+0.077
Neg logit contributions
amb-0.126
unch-0.120
aring-0.119
oped-0.119
qu-0.113
Token unknown
Token ID6439
Feature activation-2.54e-02

Pos logit contributions
 same+0.254
 task+0.238
 way+0.231
 attic+0.217
 laundry+0.215
Neg logit contributions
unch-0.363
aring-0.354
amb-0.347
oot-0.344
ang-0.337
Token,
Token ID11
Feature activation+7.55e-02

Pos logit contributions
unch+0.020
amb+0.019
aring+0.019
oot+0.019
uts+0.019
Neg logit contributions
 same-0.027
 way-0.025
 attic-0.024
 task-0.023
 garage-0.023
Token "
Token ID366
Feature activation-5.37e-04

Pos logit contributions
 same+0.111
 way+0.099
 task+0.090
 boring+0.088
 attic+0.086
Neg logit contributions
unch-0.184
amb-0.182
oot-0.173
oped-0.171
aring-0.170
TokenOh
Token ID5812
Feature activation+1.25e-01

Pos logit contributions
amb+0.001
unch+0.001
oped+0.001
oot+0.001
oing+0.001
Neg logit contributions
 same-0.001
 way-0.001
 task-0.001
 boring-0.001
 laundry-0.001
Token no
Token ID645
Feature activation+1.46e-01

Pos logit contributions
 same+0.164
 way+0.144
 task+0.134
 attic+0.132
 floor+0.129
Neg logit contributions
unch-0.309
amb-0.303
oped-0.291
oot-0.290
aring-0.289
Token,
Token ID11
Feature activation+8.68e-02

Pos logit contributions
 same+0.302
 way+0.294
 boring+0.269
 bad+0.261
 task+0.261
Neg logit contributions
amb-0.291
unch-0.277
oped-0.276
rol-0.264
umbs-0.262
Token the
Token ID262
Feature activation+2.41e-01

Pos logit contributions
 same+0.156
 way+0.139
 usual+0.135
 attic+0.133
 task+0.128
Neg logit contributions
aring-0.151
unch-0.149
oot-0.139
uts-0.136
qu-0.132
Token hurricane
Token ID23964
Feature activation-1.35e-01

Pos logit contributions
 same+0.239
 proper+0.200
 homework+0.198
 drawer+0.197
 way+0.197
Neg logit contributions
unch-0.631
amb-0.621
uts-0.594
oot-0.592
qu-0.589
Token will
Token ID481
Feature activation+7.10e-02

Pos logit contributions
unch+0.348
amb+0.344
oped+0.330
oot+0.329
aring+0.328
Neg logit contributions
 same-0.172
 way-0.150
 task-0.135
 boring-0.133
 attic-0.132
Token destroy
Token ID4117
Feature activation-7.00e-02

Pos logit contributions
 same+0.099
 drawer+0.098
 attic+0.094
 proper+0.090
 way+0.088
Neg logit contributions
qu-0.153
unch-0.148
oot-0.147
uts-0.144
amb-0.141
Token everything
Token ID2279
Feature activation-3.97e-02

Pos logit contributions
unch+0.155
qu+0.152
amb+0.151
oot+0.151
es+0.151
Neg logit contributions
 same-0.092
 way-0.081
 task-0.079
 garage-0.075
 drawer-0.074
Token!"
Token ID2474
Feature activation-8.02e-02

Pos logit contributions
unch+0.061
amb+0.060
uts+0.057
oot+0.056
aring+0.056
Neg logit contributions
 same-0.090
 way-0.083
 task-0.080
 attic-0.078
 laundry-0.078
Token she
Token ID673
Feature activation+9.92e-02

Pos logit contributions
unch+0.067
oot+0.063
uts+0.063
oped+0.061
amb+0.061
Neg logit contributions
 way-0.064
 same-0.062
 task-0.061
 proper-0.059
 attic-0.054
Token asked
Token ID1965
Feature activation-4.52e-02

Pos logit contributions
 same+0.194
 way+0.177
 task+0.171
 attic+0.166
 boring+0.163
Neg logit contributions
unch-0.186
amb-0.185
oot-0.175
oped-0.173
aring-0.171
Token.
Token ID13
Feature activation-4.80e-02

Pos logit contributions
unch+0.080
amb+0.080
oped+0.073
oot+0.073
aring+0.072
Neg logit contributions
 same-0.098
 way-0.090
 task-0.084
 boring-0.084
 attic-0.082
Token "
Token ID366
Feature activation-1.45e-02

Pos logit contributions
unch+0.097
amb+0.096
oot+0.090
oped+0.090
uts+0.089
Neg logit contributions
 same-0.090
 way-0.082
 task-0.077
 boring-0.076
 attic-0.074
TokenDid
Token ID11633
Feature activation+2.05e-02

Pos logit contributions
unch+0.032
amb+0.032
oot+0.030
oped+0.030
aring+0.030
Neg logit contributions
 same-0.025
 way-0.022
 task-0.020
 boring-0.020
 attic-0.020
Token the
Token ID262
Feature activation+2.40e-01

Pos logit contributions
 same+0.033
 way+0.030
 task+0.029
 attic+0.028
 garage+0.028
Neg logit contributions
unch-0.044
oot-0.043
aring-0.042
uts-0.042
qu-0.042
Token ink
Token ID16882
Feature activation-1.35e-01

Pos logit contributions
 usual+0.275
 proper+0.271
 way+0.254
 same+0.254
 boring+0.232
Neg logit contributions
unch-0.527
amb-0.523
oot-0.503
aring-0.492
qu-0.482
Token hurt
Token ID5938
Feature activation-4.66e-02

Pos logit contributions
uts+0.281
unch+0.274
qu+0.268
oot+0.264
es+0.260
Neg logit contributions
 same-0.192
 attic-0.185
 way-0.184
 usual-0.181
 proper-0.180
Token you
Token ID345
Feature activation+1.23e-01

Pos logit contributions
oot+0.101
unch+0.099
aring+0.096
r+0.096
qu+0.095
Neg logit contributions
 same-0.067
 way-0.064
 task-0.059
 usual-0.057
 proper-0.057
Token?"
Token ID1701
Feature activation-5.01e-02

Pos logit contributions
 same+0.232
 way+0.212
 laundry+0.200
 task+0.198
 attic+0.197
Neg logit contributions
unch-0.233
amb-0.222
oot-0.219
aring-0.217
uts-0.214
Token
Token ID198
Feature activation-5.03e-02

Pos logit contributions
unch+0.078
amb+0.076
oot+0.071
oped+0.070
uts+0.069
Neg logit contributions
 same-0.118
 way-0.110
 task-0.104
 boring-0.103
 attic-0.101
Token
Token ID198
Feature activation-4.77e-02

Pos logit contributions
 same+0.123
 way+0.098
 boring+0.086
 new+0.085
 table+0.084
Neg logit contributions
amb-0.343
unch-0.336
oot-0.334
umbs-0.328
oped-0.325
Token
Token ID198
Feature activation-4.69e-02

Pos logit contributions
unch+0.106
amb+0.104
owing+0.103
oot+0.099
rol+0.099
Neg logit contributions
 same-0.077
 way-0.068
 big-0.064
 your-0.062
 new-0.062
Token"
Token ID1
Feature activation-2.83e-02

Pos logit contributions
unch+0.082
amb+0.079
oot+0.077
oped+0.073
owing+0.071
Neg logit contributions
 same-0.103
 way-0.092
 task-0.089
 boring-0.089
 bad-0.086
TokenIsn
Token ID41451
Feature activation+3.60e-02

Pos logit contributions
unch+0.067
oot+0.066
aring+0.065
amb+0.064
uts+0.063
Neg logit contributions
 same-0.047
 way-0.037
 boring-0.036
 laundry-0.035
 attic-0.035
Token't
Token ID470
Feature activation+1.39e-01

Pos logit contributions
 same+0.072
 way+0.065
 boring+0.060
 task+0.060
 attic+0.058
Neg logit contributions
amb-0.070
unch-0.068
oot-0.064
aring-0.063
oped-0.063
Token the
Token ID262
Feature activation+2.40e-01

Pos logit contributions
 same+0.218
 way+0.199
 task+0.187
 garage+0.179
 laundry+0.175
Neg logit contributions
unch-0.318
amb-0.313
oot-0.302
uts-0.296
qu-0.294
Token sunrise
Token ID40048
Feature activation-1.29e-01

Pos logit contributions
 same+0.356
 way+0.324
 task+0.303
 garage+0.298
 laundry+0.295
Neg logit contributions
unch-0.545
amb-0.539
qu-0.501
oot-0.500
oing-0.497
Token beautiful
Token ID4950
Feature activation+1.66e-03

Pos logit contributions
unch+0.284
amb+0.276
oot+0.262
aring+0.260
uts+0.260
Neg logit contributions
 same-0.215
 way-0.193
 task-0.178
 attic-0.176
 boring-0.174
Token?".
Token ID43634
Feature activation-5.37e-02

Pos logit contributions
 same+0.003
 way+0.002
 task+0.002
 boring+0.002
 attic+0.002
Neg logit contributions
unch-0.004
amb-0.004
oot-0.004
oped-0.004
aring-0.004
Token
Token ID198
Feature activation-6.00e-02

Pos logit contributions
unch+0.119
amb+0.118
oot+0.111
oped+0.111
aring+0.109
Neg logit contributions
 same-0.091
 way-0.082
 task-0.076
 boring-0.074
 attic-0.073
Token
Token ID198
Feature activation-5.60e-02

Pos logit contributions
amb+0.130
unch+0.129
owing+0.128
oot+0.127
aring+0.124
Neg logit contributions
 same-0.097
 way-0.088
 new-0.085
 big-0.083
 bad-0.079
Token then
Token ID788
Feature activation+4.26e-02

Pos logit contributions
 same+0.059
 way+0.055
 task+0.052
 boring+0.050
 garage+0.050
Neg logit contributions
amb-0.047
unch-0.047
oot-0.045
oped-0.042
aring-0.042
Token on
Token ID319
Feature activation+1.25e-01

Pos logit contributions
 same+0.082
 way+0.074
 task+0.069
 boring+0.069
 attic+0.068
Neg logit contributions
unch-0.085
amb-0.084
oot-0.079
oped-0.078
aring-0.077
Token,
Token ID11
Feature activation+4.59e-02

Pos logit contributions
 same+0.329
 proper+0.312
 garage+0.311
 ger+0.310
 stove+0.301
Neg logit contributions
es-0.121
unch-0.110
oped-0.105
oot-0.102
uts-0.098
Token Meg
Token ID8336
Feature activation+1.17e-02

Pos logit contributions
 same+0.081
 way+0.073
 garage+0.072
 laundry+0.072
 task+0.072
Neg logit contributions
unch-0.092
amb-0.087
oot-0.085
uts-0.085
aring-0.084
Token knew
Token ID2993
Feature activation+2.23e-02

Pos logit contributions
 same+0.020
 task+0.018
 laundry+0.017
 attic+0.017
 way+0.017
Neg logit contributions
amb-0.025
unch-0.024
oped-0.023
es-0.023
uts-0.023
Token to
Token ID284
Feature activation+2.39e-01

Pos logit contributions
 same+0.044
 way+0.040
 task+0.039
 boring+0.037
 table+0.037
Neg logit contributions
unch-0.041
oot-0.041
amb-0.040
uts-0.038
aring-0.037
Token be
Token ID307
Feature activation+1.21e-01

Pos logit contributions
 same+0.290
 way+0.247
 task+0.221
 boring+0.219
 attic+0.212
Neg logit contributions
unch-0.641
amb-0.638
oot-0.606
aring-0.599
oped-0.596
Token careful
Token ID8161
Feature activation-4.48e-02

Pos logit contributions
 same+0.209
 task+0.194
 way+0.193
 table+0.191
 shelf+0.188
Neg logit contributions
unch-0.228
aring-0.226
amb-0.224
ang-0.214
ucking-0.213
Token with
Token ID351
Feature activation+1.02e-01

Pos logit contributions
unch+0.091
amb+0.090
oped+0.085
oot+0.084
oing+0.082
Neg logit contributions
 same-0.085
 way-0.078
 task-0.071
 boring-0.071
 attic-0.069
Token the
Token ID262
Feature activation+2.09e-01

Pos logit contributions
 same+0.147
 task+0.128
 way+0.127
 garage+0.121
 table+0.120
Neg logit contributions
oot-0.207
unch-0.206
es-0.200
aring-0.195
owing-0.195
Token things
Token ID1243
Feature activation-5.25e-02

Pos logit contributions
 homework+0.242
 same+0.212
 basement+0.208
 laundry+0.206
 usual+0.196
Neg logit contributions
unch-0.417
aring-0.392
amb-0.386
oing-0.384
ang-0.380
Token.
Token ID13
Feature activation-4.72e-02

Pos logit contributions
 same+0.040
 way+0.035
 attic+0.035
 laundry+0.035
 garage+0.034
Neg logit contributions
unch-0.034
amb-0.033
aring-0.033
oot-0.032
uts-0.031
Token She
Token ID1375
Feature activation+4.47e-02

Pos logit contributions
unch+0.088
amb+0.085
aring+0.082
uts+0.082
oot+0.080
Neg logit contributions
 same-0.088
 way-0.085
 task-0.080
 usual-0.075
 laundry-0.074
Token didn
Token ID1422
Feature activation+2.87e-02

Pos logit contributions
 same+0.084
 way+0.075
 task+0.074
 laundry+0.072
 attic+0.072
Neg logit contributions
amb-0.085
oped-0.082
oot-0.076
unch-0.076
aring-0.073
Token't
Token ID470
Feature activation+1.35e-01

Pos logit contributions
 same+0.054
 way+0.048
 task+0.046
 boring+0.046
 bad+0.045
Neg logit contributions
amb-0.058
unch-0.057
oing-0.054
oped-0.054
aring-0.054
Token know
Token ID760
Feature activation+2.32e-03

Pos logit contributions
 same+0.262
 way+0.242
 boring+0.226
 task+0.222
 bad+0.221
Neg logit contributions
unch-0.279
amb-0.268
oped-0.256
oot-0.248
oing-0.246
Token the
Token ID262
Feature activation+2.39e-01

Pos logit contributions
 same+0.004
 way+0.004
 boring+0.004
 bad+0.004
 attic+0.004
Neg logit contributions
amb-0.005
unch-0.005
oped-0.005
oing-0.004
oot-0.004
Token answer
Token ID3280
Feature activation-9.72e-02

Pos logit contributions
 attic+0.266
 same+0.263
 garage+0.257
 proper+0.256
 laundry+0.244
Neg logit contributions
aring-0.531
unch-0.527
amb-0.519
oing-0.498
qu-0.498
Token.
Token ID13
Feature activation-4.15e-02

Pos logit contributions
unch+0.166
amb+0.163
oot+0.159
uts+0.155
aring+0.155
Neg logit contributions
 same-0.205
 way-0.185
 task-0.179
 attic-0.174
 boring-0.173
Token She
Token ID1375
Feature activation+4.99e-02

Pos logit contributions
amb+0.077
unch+0.077
aring+0.072
oot+0.071
uts+0.071
Neg logit contributions
 same-0.079
 way-0.074
 task-0.071
 boring-0.068
 laundry-0.067
Token closed
Token ID4838
Feature activation-9.76e-02

Pos logit contributions
 same+0.087
 way+0.077
 task+0.074
 laundry+0.072
 attic+0.072
Neg logit contributions
amb-0.106
unch-0.100
oped-0.100
oot-0.098
aring-0.095
Token her
Token ID607
Feature activation+1.55e-01

Pos logit contributions
amb+0.222
oot+0.219
oped+0.207
es+0.206
uts+0.204
Neg logit contributions
 same-0.132
 task-0.128
 way-0.116
 garage-0.114
 boring-0.109
Token she
Token ID673
Feature activation+7.65e-02

Pos logit contributions
unch+0.051
amb+0.050
oot+0.047
aring+0.046
uts+0.046
Neg logit contributions
 same-0.068
 way-0.062
 task-0.059
 boring-0.058
 attic-0.058
Token met
Token ID1138
Feature activation-1.20e-02

Pos logit contributions
 same+0.117
 way+0.105
 task+0.098
 laundry+0.094
 attic+0.094
Neg logit contributions
unch-0.179
amb-0.179
oped-0.170
oot-0.169
aring-0.167
Token.
Token ID13
Feature activation-4.33e-02

Pos logit contributions
unch+0.019
amb+0.019
oot+0.018
uts+0.018
oped+0.018
Neg logit contributions
 same-0.027
 way-0.025
 task-0.024
 boring-0.023
 garage-0.023
Token She
Token ID1375
Feature activation+4.97e-02

Pos logit contributions
unch+0.101
oped+0.097
oing+0.094
amb+0.094
qu+0.094
Neg logit contributions
 same-0.068
 way-0.059
 attic-0.059
 garage-0.055
 boring-0.053
Token hoped
Token ID10719
Feature activation-1.99e-02

Pos logit contributions
 same+0.088
 task+0.077
 way+0.077
 laundry+0.074
 attic+0.073
Neg logit contributions
amb-0.105
unch-0.101
oped-0.097
oot-0.097
uts-0.095
Token to
Token ID284
Feature activation+2.39e-01

Pos logit contributions
amb+0.041
oot+0.041
unch+0.040
uts+0.039
aring+0.038
Neg logit contributions
 same-0.034
 task-0.031
 way-0.031
 garage-0.030
 attic-0.029
Token see
Token ID766
Feature activation+7.97e-02

Pos logit contributions
 same+0.340
 way+0.302
 task+0.277
 boring+0.269
 attic+0.268
Neg logit contributions
unch-0.585
amb-0.580
oot-0.553
aring-0.543
uts-0.541
Token Blink
Token ID41732
Feature activation-2.14e-02

Pos logit contributions
 same+0.112
 way+0.101
 task+0.097
 garage+0.091
 attic+0.091
Neg logit contributions
amb-0.190
unch-0.188
oot-0.182
uts-0.177
oped-0.176
Token again
Token ID757
Feature activation-3.49e-03

Pos logit contributions
unch+0.035
amb+0.034
oot+0.032
oped+0.032
aring+0.031
Neg logit contributions
 same-0.048
 way-0.044
 task-0.042
 boring-0.041
 attic-0.041
Token someday
Token ID25580
Feature activation-4.50e-02

Pos logit contributions
unch+0.006
amb+0.006
oot+0.005
aring+0.005
oped+0.005
Neg logit contributions
 same-0.007
 way-0.007
 task-0.007
 laundry-0.006
 attic-0.006
Token.
Token ID13
Feature activation-4.13e-02

Pos logit contributions
unch+0.066
amb+0.065
oot+0.061
uts+0.061
oped+0.060
Neg logit contributions
 same-0.106
 way-0.098
 task-0.095
 boring-0.093
 laundry-0.093
Token "
Token ID366
Feature activation+3.65e-03

Pos logit contributions
 same+0.102
 way+0.089
 boring+0.080
 task+0.079
 laundry+0.077
Neg logit contributions
unch-0.194
amb-0.194
oot-0.183
oped-0.182
aring-0.179
TokenDon
Token ID3987
Feature activation+2.92e-02

Pos logit contributions
 same+0.007
 way+0.006
 task+0.006
 boring+0.006
 attic+0.006
Neg logit contributions
unch-0.007
amb-0.007
oot-0.007
oped-0.007
aring-0.007
Token't
Token ID470
Feature activation+1.50e-01

Pos logit contributions
 same+0.060
 way+0.055
 task+0.052
 boring+0.051
 bad+0.050
Neg logit contributions
unch-0.054
amb-0.054
oot-0.050
oped-0.050
oing-0.049
Token worry
Token ID5490
Feature activation-3.02e-02

Pos logit contributions
 same+0.202
 way+0.180
 task+0.158
 boring+0.156
 bad+0.148
Neg logit contributions
unch-0.391
amb-0.388
oped-0.371
oot-0.367
uts-0.360
Token,
Token ID11
Feature activation+8.49e-02

Pos logit contributions
unch+0.051
amb+0.049
oot+0.047
aring+0.047
uts+0.046
Neg logit contributions
 same-0.066
 way-0.061
 task-0.058
 attic-0.058
 laundry-0.057
Token the
Token ID262
Feature activation+2.39e-01

Pos logit contributions
 same+0.159
 homework+0.140
 drawer+0.140
 way+0.140
 attic+0.139
Neg logit contributions
unch-0.164
aring-0.151
oot-0.148
amb-0.148
uts-0.144
Token dog
Token ID3290
Feature activation-1.02e-01

Pos logit contributions
 same+0.292
 way+0.254
 task+0.228
 boring+0.224
 attic+0.222
Neg logit contributions
unch-0.629
amb-0.615
aring-0.584
oot-0.582
oped-0.580
Token is
Token ID318
Feature activation+1.74e-01

Pos logit contributions
aring+0.223
unch+0.221
amb+0.220
oped+0.218
qu+0.211
Neg logit contributions
 same-0.156
 attic-0.141
 laundry-0.140
 task-0.135
 drawer-0.134
Token on
Token ID319
Feature activation+1.25e-01

Pos logit contributions
 same+0.278
 task+0.244
 way+0.241
 attic+0.238
 laundry+0.231
Neg logit contributions
unch-0.381
amb-0.371
oot-0.361
aring-0.360
uts-0.351
Token a
Token ID257
Feature activation+1.63e-01

Pos logit contributions
 same+0.129
 way+0.110
 drawer+0.107
 task+0.105
 garage+0.104
Neg logit contributions
unch-0.332
amb-0.331
uts-0.319
oot-0.318
qu-0.315
Token leash
Token ID40529
Feature activation-1.71e-01

Pos logit contributions
 same+0.177
 attic+0.137
 way+0.133
 laundry+0.129
 task+0.126
Neg logit contributions
unch-0.437
amb-0.418
oing-0.412
aring-0.412
oot-0.411
Token in
Token ID287
Feature activation+1.36e-01

Pos logit contributions
unch+0.399
amb+0.380
aring+0.369
oing+0.367
oped+0.362
Neg logit contributions
 same-0.286
 way-0.257
 table-0.231
 laundry-0.228
 old-0.222
Token her
Token ID607
Feature activation+1.42e-01

Pos logit contributions
 same+0.143
 way+0.118
 task+0.108
 boring+0.103
 attic+0.093
Neg logit contributions
unch-0.378
amb-0.370
oot-0.353
uts-0.350
oped-0.350
Token suitcase
Token ID45391
Feature activation-1.78e-01

Pos logit contributions
 same+0.178
 task+0.157
 ger+0.157
 matter+0.144
 boring+0.143
Neg logit contributions
amb-0.328
ang-0.299
oped-0.299
unch-0.295
uts-0.285
Token and
Token ID290
Feature activation+1.19e-01

Pos logit contributions
unch+0.277
amb+0.273
oot+0.254
oped+0.247
uts+0.246
Neg logit contributions
 same-0.414
 way-0.380
 task-0.367
 boring-0.364
 attic-0.354
Token hoped
Token ID10719
Feature activation-1.32e-02

Pos logit contributions
 same+0.213
 task+0.181
 boring+0.176
 bad+0.172
 difficult+0.171
Neg logit contributions
amb-0.243
es-0.209
oped-0.207
uts-0.204
unch-0.202
Token to
Token ID284
Feature activation+2.39e-01

Pos logit contributions
oot+0.027
unch+0.027
uts+0.027
amb+0.026
aring+0.025
Neg logit contributions
 same-0.021
 task-0.019
 way-0.018
 garage-0.018
 attic-0.018
Token have
Token ID423
Feature activation+1.16e-01

Pos logit contributions
 same+0.351
 way+0.311
 task+0.299
 attic+0.290
 boring+0.288
Neg logit contributions
unch-0.544
amb-0.541
oot-0.518
uts-0.503
qu-0.502
Token a
Token ID257
Feature activation+1.87e-01

Pos logit contributions
 same+0.143
 way+0.123
 task+0.117
 attic+0.111
 garage+0.110
Neg logit contributions
unch-0.298
amb-0.291
oot-0.282
qu-0.280
uts-0.279
Token lot
Token ID1256
Feature activation-1.98e-02

Pos logit contributions
 same+0.216
 way+0.165
 attic+0.163
 floor+0.160
 task+0.158
Neg logit contributions
unch-0.480
amb-0.473
oped-0.450
oing-0.446
aring-0.442
Token of
Token ID286
Feature activation+1.65e-01

Pos logit contributions
unch+0.035
amb+0.034
oing+0.033
oped+0.033
qu+0.030
Neg logit contributions
 same-0.044
 way-0.041
 old-0.038
 new-0.038
 boring-0.038
Token fun
Token ID1257
Feature activation-5.33e-02

Pos logit contributions
 same+0.151
 way+0.124
 task+0.103
 boring+0.101
 attic+0.096
Neg logit contributions
unch-0.497
amb-0.494
oot-0.473
oped-0.471
aring-0.467
Token was
Token ID373
Feature activation+1.58e-01

Pos logit contributions
 same+0.014
 way+0.013
 boring+0.011
 bad+0.011
 garage+0.010
Neg logit contributions
amb-0.029
unch-0.028
oot-0.027
uts-0.026
aring-0.026
Token very
Token ID845
Feature activation+1.30e-01

Pos logit contributions
 same+0.201
 way+0.175
 task+0.156
 boring+0.153
 attic+0.149
Neg logit contributions
unch-0.417
amb-0.413
oot-0.394
oped-0.391
aring-0.388
Token angry
Token ID7954
Feature activation-2.63e-02

Pos logit contributions
 same+0.164
 way+0.142
 task+0.126
 boring+0.125
 laundry+0.118
Neg logit contributions
unch-0.348
amb-0.347
oot-0.330
oped-0.328
uts-0.323
Token and
Token ID290
Feature activation+1.37e-01

Pos logit contributions
unch+0.062
oped+0.061
amb+0.060
oing+0.058
qu+0.057
Neg logit contributions
 same-0.043
 way-0.039
 boring-0.036
 old-0.035
 bad-0.035
Token told
Token ID1297
Feature activation-2.39e-02

Pos logit contributions
 same+0.286
 task+0.263
 way+0.263
 attic+0.258
 garage+0.252
Neg logit contributions
unch-0.234
oped-0.227
amb-0.226
oot-0.221
aring-0.221
Token the
Token ID262
Feature activation+2.39e-01

Pos logit contributions
unch+0.059
qu+0.057
amb+0.056
oot+0.056
uts+0.055
Neg logit contributions
 same-0.029
 garage-0.027
 way-0.026
 task-0.025
 attic-0.024
Token little
Token ID1310
Feature activation+3.68e-02

Pos logit contributions
 same+0.256
 way+0.215
 task+0.193
 boring+0.187
 garage+0.183
Neg logit contributions
unch-0.647
amb-0.635
aring-0.606
oot-0.605
uts-0.602
Token girl
Token ID2576
Feature activation-2.69e-02

Pos logit contributions
 same+0.045
 homework+0.041
 way+0.041
 garage+0.040
 task+0.039
Neg logit contributions
amb-0.088
unch-0.083
oot-0.079
oped-0.078
ucking-0.078
Token to
Token ID284
Feature activation+2.21e-01

Pos logit contributions
unch+0.050
amb+0.049
oot+0.048
aring+0.047
uts+0.047
Neg logit contributions
 same-0.053
 way-0.049
 task-0.046
 attic-0.045
 garage-0.045
Token stop
Token ID2245
Feature activation-3.20e-02

Pos logit contributions
 same+0.295
 way+0.254
 boring+0.235
 task+0.234
 attic+0.230
Neg logit contributions
unch-0.553
amb-0.549
oot-0.529
aring-0.515
oped-0.509
Token playing
Token ID2712
Feature activation-5.44e-02

Pos logit contributions
unch+0.070
amb+0.069
oot+0.065
oped+0.064
aring+0.064
Neg logit contributions
 same-0.055
 way-0.050
 task-0.046
 boring-0.045
 attic-0.045
Token day
Token ID1110
Feature activation-2.39e-02

Pos logit contributions
 same+0.149
 boring+0.127
 laundry+0.122
 attic+0.121
 garage+0.121
Neg logit contributions
amb-0.346
unch-0.344
oing-0.318
aring-0.317
oot-0.314
Token on
Token ID319
Feature activation+1.22e-01

Pos logit contributions
unch+0.048
amb+0.048
oot+0.045
oped+0.045
aring+0.044
Neg logit contributions
 same-0.045
 way-0.041
 task-0.038
 boring-0.038
 attic-0.037
Token,
Token ID11
Feature activation+4.72e-02

Pos logit contributions
 same+0.337
 garage+0.304
 proper+0.302
 boring+0.300
 attic+0.298
Neg logit contributions
oot-0.106
unch-0.102
amb-0.102
oped-0.098
es-0.098
Token Lily
Token ID20037
Feature activation+2.29e-02

Pos logit contributions
 same+0.075
 garage+0.073
 stove+0.071
 basement+0.071
 task+0.071
Neg logit contributions
unch-0.082
qu-0.079
amb-0.076
oled-0.075
ub-0.074
Token knew
Token ID2993
Feature activation+2.50e-02

Pos logit contributions
 same+0.038
 task+0.033
 way+0.032
 laundry+0.032
 attic+0.032
Neg logit contributions
amb-0.051
unch-0.049
oped-0.046
uts-0.046
oot-0.046
Token to
Token ID284
Feature activation+2.39e-01

Pos logit contributions
 same+0.053
 way+0.048
 task+0.046
 boring+0.044
 garage+0.044
Neg logit contributions
unch-0.045
amb-0.044
oot-0.043
uts-0.041
aring-0.040
Token be
Token ID307
Feature activation+1.23e-01

Pos logit contributions
 same+0.260
 way+0.223
 task+0.190
 boring+0.181
 laundry+0.175
Neg logit contributions
unch-0.681
amb-0.671
oped-0.654
oot-0.648
uts-0.647
Token careful
Token ID8161
Feature activation-4.00e-02

Pos logit contributions
 same+0.195
 way+0.181
 task+0.175
 table+0.170
 stove+0.167
Neg logit contributions
unch-0.261
amb-0.256
aring-0.254
oot-0.244
ucking-0.241
Token around
Token ID1088
Feature activation+3.61e-02

Pos logit contributions
unch+0.085
amb+0.084
oped+0.080
oot+0.077
qu+0.076
Neg logit contributions
 same-0.074
 way-0.068
 boring-0.061
 task-0.061
 attic-0.059
Token the
Token ID262
Feature activation+1.85e-01

Pos logit contributions
 same+0.051
 way+0.044
 garage+0.043
 usual+0.042
 attic+0.041
Neg logit contributions
es-0.079
qu-0.074
r-0.073
uts-0.069
oing-0.069
Token warm
Token ID5814
Feature activation-1.01e-01

Pos logit contributions
 same+0.185
 way+0.166
 usual+0.164
 task+0.159
 boring+0.157
Neg logit contributions
unch-0.427
oing-0.423
ucking-0.415
aring-0.409
qu-0.393
Token happy
Token ID3772
Feature activation-1.13e-02

Pos logit contributions
 same+0.213
 way+0.183
 boring+0.178
 bad+0.168
 task+0.160
Neg logit contributions
amb-0.363
unch-0.361
oot-0.345
oped-0.342
oing-0.331
Token that
Token ID326
Feature activation+1.52e-01

Pos logit contributions
unch+0.025
amb+0.024
oped+0.023
oot+0.023
oing+0.022
Neg logit contributions
 same-0.019
 way-0.018
 boring-0.016
 task-0.016
 bad-0.015
Token they
Token ID484
Feature activation+1.08e-01

Pos logit contributions
 same+0.260
 task+0.229
 attic+0.227
 way+0.224
 garage+0.218
Neg logit contributions
unch-0.309
uts-0.297
oot-0.296
amb-0.295
oped-0.290
Token had
Token ID550
Feature activation+1.10e-01

Pos logit contributions
 same+0.189
 way+0.171
 task+0.163
 attic+0.157
 laundry+0.156
Neg logit contributions
amb-0.229
unch-0.223
oped-0.217
oot-0.216
uts-0.210
Token named
Token ID3706
Feature activation-7.84e-02

Pos logit contributions
 same+0.179
 task+0.164
 attic+0.161
 shelf+0.154
 way+0.153
Neg logit contributions
unch-0.230
amb-0.223
oot-0.221
uts-0.220
qu-0.219
Token the
Token ID262
Feature activation+2.38e-01

Pos logit contributions
unch+0.185
amb+0.184
oot+0.177
aring+0.173
uts+0.172
Neg logit contributions
 same-0.114
 way-0.100
 task-0.095
 attic-0.093
 boring-0.092
Token special
Token ID2041
Feature activation+2.94e-02

Pos logit contributions
 same+0.310
 way+0.288
 proper+0.278
 task+0.276
 laundry+0.265
Neg logit contributions
unch-0.541
amb-0.520
qu-0.507
aring-0.505
uts-0.498
Token star
Token ID3491
Feature activation-1.34e-01

Pos logit contributions
 same+0.042
 way+0.036
 boring+0.034
 proper+0.034
 attic+0.034
Neg logit contributions
unch-0.068
amb-0.068
oot-0.064
aring-0.064
oing-0.063
Token.
Token ID13
Feature activation-4.44e-02

Pos logit contributions
unch+0.236
amb+0.235
oot+0.221
oped+0.217
aring+0.217
Neg logit contributions
 same-0.281
 way-0.256
 task-0.243
 boring-0.240
 attic-0.239
Token They
Token ID1119
Feature activation+2.70e-02

Pos logit contributions
unch+0.088
amb+0.087
oot+0.081
oped+0.080
aring+0.080
Neg logit contributions
 same-0.086
 way-0.079
 task-0.074
 boring-0.073
 attic-0.072
Token never
Token ID1239
Feature activation+9.68e-02

Pos logit contributions
 same+0.047
 way+0.042
 task+0.039
 boring+0.038
 attic+0.038
Neg logit contributions
unch-0.058
amb-0.058
oot-0.055
oped-0.054
aring-0.053
Token skills
Token ID4678
Feature activation-2.06e-01

Pos logit contributions
unch+0.366
amb+0.361
oot+0.338
oped+0.333
uts+0.329
Neg logit contributions
 same-0.372
 way-0.343
 task-0.316
 boring-0.314
 attic-0.306
Token to
Token ID284
Feature activation+1.20e-01

Pos logit contributions
unch+0.357
amb+0.355
oot+0.333
oped+0.325
uts+0.324
Neg logit contributions
 same-0.437
 way-0.404
 task-0.381
 boring-0.377
 attic-0.374
Token climb
Token ID12080
Feature activation-1.74e-01

Pos logit contributions
 same+0.159
 way+0.142
 task+0.135
 attic+0.135
 boring+0.132
Neg logit contributions
unch-0.286
amb-0.282
oot-0.275
qu-0.271
aring-0.267
Token up
Token ID510
Feature activation-4.55e-02

Pos logit contributions
amb+0.315
unch+0.307
oot+0.302
uts+0.296
oped+0.295
Neg logit contributions
 same-0.327
 way-0.302
 task-0.295
 boring-0.282
 garage-0.281
Token the
Token ID262
Feature activation+9.21e-02

Pos logit contributions
unch+0.077
amb+0.077
uts+0.075
oot+0.073
oped+0.072
Neg logit contributions
 same-0.087
 way-0.079
 task-0.077
 garage-0.076
 proper-0.075
Token mast
Token ID17288
Feature activation-3.30e-01

Pos logit contributions
 same+0.120
 way+0.103
 boring+0.102
 proper+0.102
 laundry+0.102
Neg logit contributions
unch-0.197
amb-0.194
aring-0.189
oot-0.188
qu-0.185
Token and
Token ID290
Feature activation+3.10e-02

Pos logit contributions
unch+0.582
amb+0.572
oot+0.531
oped+0.529
aring+0.519
Neg logit contributions
 same-0.708
 way-0.656
 task-0.609
 boring-0.603
 attic-0.595
Token jump
Token ID4391
Feature activation-2.18e-01

Pos logit contributions
 same+0.047
 task+0.042
 attic+0.041
 way+0.041
 boring+0.040
Neg logit contributions
amb-0.066
oped-0.063
unch-0.061
oot-0.061
aring-0.060
Token onto
Token ID4291
Feature activation+9.07e-03

Pos logit contributions
amb+0.431
unch+0.430
oot+0.405
oped+0.400
uts+0.397
Neg logit contributions
 same-0.409
 way-0.376
 task-0.353
 boring-0.346
 attic-0.345
Token the
Token ID262
Feature activation+1.07e-01

Pos logit contributions
 same+0.010
 way+0.009
 task+0.009
 boring+0.008
 attic+0.008
Neg logit contributions
amb-0.022
unch-0.022
oot-0.022
qu-0.022
oped-0.021
Token pirate
Token ID25868
Feature activation-2.50e-01

Pos logit contributions
 same+0.129
 homework+0.119
 task+0.117
 boring+0.114
 proper+0.112
Neg logit contributions
amb-0.236
unch-0.226
qu-0.226
oot-0.221
oing-0.218
Token slippery
Token ID32911
Feature activation-1.39e-01

Pos logit contributions
unch+0.529
amb+0.510
uts+0.496
oot+0.494
oped+0.491
Neg logit contributions
 same-0.371
 way-0.329
 attic-0.317
 task-0.311
 laundry-0.301
Token and
Token ID290
Feature activation+5.31e-02

Pos logit contributions
amb+0.229
unch+0.229
oot+0.213
uts+0.209
qu+0.207
Neg logit contributions
 same-0.299
 way-0.277
 task-0.264
 attic-0.259
 boring-0.258
Token the
Token ID262
Feature activation+1.47e-01

Pos logit contributions
 same+0.075
 attic+0.075
 basement+0.070
 garage+0.068
 task+0.067
Neg logit contributions
uts-0.107
unch-0.105
oped-0.103
qu-0.102
ew-0.100
Token car
Token ID1097
Feature activation-1.86e-01

Pos logit contributions
 same+0.176
 attic+0.153
 task+0.150
 way+0.149
 proper+0.147
Neg logit contributions
unch-0.328
amb-0.316
qu-0.311
oped-0.311
ew-0.308
Token sk
Token ID1341
Feature activation-2.12e-01

Pos logit contributions
oped+0.374
uts+0.369
unch+0.360
oot+0.357
amb+0.355
Neg logit contributions
 same-0.306
 attic-0.278
 laundry-0.267
 task-0.264
 way-0.263
Tokenidd
Token ID1638
Feature activation-3.23e-01

Pos logit contributions
amb+0.307
unch+0.304
oot+0.282
oing+0.280
aring+0.275
Neg logit contributions
 same-0.491
 way-0.444
 garage-0.434
 laundry-0.427
 attic-0.426
Tokened
Token ID276
Feature activation-4.92e-02

Pos logit contributions
unch+0.391
amb+0.341
qu+0.307
oot+0.304
oing+0.301
Neg logit contributions
 same-0.875
 way-0.798
 boring-0.748
 task-0.746
 garage-0.734
Token and
Token ID290
Feature activation+6.11e-02

Pos logit contributions
unch+0.090
amb+0.089
oped+0.085
aring+0.084
oot+0.084
Neg logit contributions
 same-0.094
 way-0.086
 task-0.083
 attic-0.081
 laundry-0.080
Token hit
Token ID2277
Feature activation-1.31e-01

Pos logit contributions
 same+0.096
 task+0.086
 attic+0.083
 way+0.078
 garage+0.076
Neg logit contributions
uts-0.120
oped-0.120
unch-0.117
es-0.115
amb-0.114
Token a
Token ID257
Feature activation+1.05e-01

Pos logit contributions
unch+0.327
amb+0.324
oot+0.314
oped+0.313
qu+0.310
Neg logit contributions
 same-0.156
 way-0.141
 task-0.137
 attic-0.127
 garage-0.124
Token tree
Token ID5509
Feature activation-1.69e-01

Pos logit contributions
 same+0.128
 way+0.109
 task+0.097
 attic+0.096
 boring+0.095
Neg logit contributions
unch-0.274
amb-0.265
oped-0.256
oing-0.256
oot-0.255
Token However
Token ID2102
Feature activation-7.92e-02

Pos logit contributions
 same+0.086
 way+0.076
 task+0.067
 boring+0.066
 attic+0.065
Neg logit contributions
unch-0.163
amb-0.161
oot-0.153
oped-0.152
aring-0.152
Token,
Token ID11
Feature activation+1.77e-02

Pos logit contributions
unch+0.121
amb+0.120
oot+0.117
oped+0.112
uts+0.110
Neg logit contributions
 same-0.174
 way-0.156
 task-0.155
 proper-0.152
 garage-0.152
Token when
Token ID618
Feature activation+1.72e-02

Pos logit contributions
 same+0.029
 way+0.025
 task+0.024
 proper+0.024
 boring+0.023
Neg logit contributions
unch-0.036
amb-0.035
uts-0.035
oot-0.034
oped-0.034
Token the
Token ID262
Feature activation+1.57e-01

Pos logit contributions
 same+0.028
 proper+0.024
 task+0.024
 way+0.023
 boring+0.023
Neg logit contributions
unch-0.034
amb-0.033
oped-0.033
uts-0.032
oot-0.032
Token k
Token ID479
Feature activation-1.95e-01

Pos logit contributions
 same+0.189
 way+0.168
 proper+0.160
 attic+0.148
 boring+0.148
Neg logit contributions
unch-0.371
amb-0.360
aring-0.346
qu-0.344
oot-0.343
Tokenidd
Token ID1638
Feature activation-3.21e-01

Pos logit contributions
amb+0.219
unch+0.218
uts+0.207
oot+0.193
qu+0.187
Neg logit contributions
 same-0.521
 way-0.460
 attic-0.457
 garage-0.452
 task-0.451
Tokeno
Token ID78
Feature activation-1.33e-01

Pos logit contributions
unch+0.360
amb+0.312
oot+0.290
qu+0.285
oing+0.284
Neg logit contributions
 same-0.895
 way-0.840
 garage-0.785
 task-0.779
 boring-0.770
Token got
Token ID1392
Feature activation+6.23e-03

Pos logit contributions
unch+0.293
amb+0.290
oped+0.285
aring+0.274
oot+0.272
Neg logit contributions
 same-0.219
 way-0.196
 task-0.185
 attic-0.183
 boring-0.179
Token to
Token ID284
Feature activation+1.37e-01

Pos logit contributions
 same+0.010
 way+0.009
 attic+0.009
 task+0.009
 garage+0.008
Neg logit contributions
unch-0.014
amb-0.013
oped-0.013
oot-0.013
aring-0.013
Token the
Token ID262
Feature activation+1.37e-01

Pos logit contributions
 same+0.174
 proper+0.153
 attic+0.152
 drawer+0.148
 task+0.145
Neg logit contributions
unch-0.310
qu-0.293
amb-0.290
oot-0.286
uts-0.277
Token university
Token ID6403
Feature activation-1.81e-01

Pos logit contributions
 same+0.159
 way+0.142
 task+0.141
 homework+0.135
 boring+0.135
Neg logit contributions
amb-0.309
unch-0.305
qu-0.294
aring-0.288
rol-0.287
Token she
Token ID673
Feature activation+5.74e-02

Pos logit contributions
 same+0.127
 way+0.113
 garage+0.110
 attic+0.109
 task+0.108
Neg logit contributions
unch-0.193
amb-0.192
oot-0.181
oped-0.181
uts-0.180
Token stepped
Token ID10764
Feature activation-1.01e-01

Pos logit contributions
 same+0.104
 way+0.093
 task+0.091
 attic+0.089
 laundry+0.089
Neg logit contributions
amb-0.117
unch-0.114
oped-0.111
oot-0.106
aring-0.105
Token closer
Token ID5699
Feature activation-1.05e-01

Pos logit contributions
unch+0.236
uts+0.225
aring+0.221
amb+0.221
oing+0.219
Neg logit contributions
 same-0.169
 way-0.142
 drawer-0.131
 laundry-0.129
 bad-0.128
Token to
Token ID284
Feature activation+1.27e-01

Pos logit contributions
unch+0.229
amb+0.223
oped+0.211
oot+0.209
aring+0.207
Neg logit contributions
 same-0.187
 way-0.171
 boring-0.152
 task-0.152
 laundry-0.147
Token the
Token ID262
Feature activation+1.34e-01

Pos logit contributions
 same+0.157
 way+0.135
 task+0.128
 boring+0.127
 attic+0.125
Neg logit contributions
unch-0.328
amb-0.324
oot-0.311
oped-0.308
qu-0.303
Token bog
Token ID22922
Feature activation-3.18e-01

Pos logit contributions
 same+0.169
 way+0.151
 task+0.141
 boring+0.140
 proper+0.139
Neg logit contributions
unch-0.317
amb-0.308
oot-0.293
aring-0.293
oing-0.293
Token.
Token ID13
Feature activation-8.39e-02

Pos logit contributions
unch+0.672
amb+0.654
oped+0.619
qu+0.603
umbs+0.591
Neg logit contributions
 same-0.557
 way-0.556
 table-0.493
 floor-0.477
 shelf-0.471
Token Suddenly
Token ID24975
Feature activation-8.06e-02

Pos logit contributions
amb+0.172
unch+0.172
oot+0.164
aring+0.159
oped+0.159
Neg logit contributions
 same-0.153
 way-0.139
 attic-0.126
 task-0.126
 boring-0.125
Token,
Token ID11
Feature activation+6.18e-03

Pos logit contributions
unch+0.129
amb+0.127
oot+0.125
oped+0.125
qu+0.120
Neg logit contributions
 same-0.173
 way-0.157
 task-0.154
 attic-0.151
 garage-0.151
Token Rose
Token ID8049
Feature activation-8.94e-02

Pos logit contributions
 same+0.010
 way+0.009
 task+0.008
 proper+0.008
 garage+0.008
Neg logit contributions
unch-0.013
amb-0.013
oped-0.013
oot-0.013
uts-0.012
Token heard
Token ID2982
Feature activation-4.33e-02

Pos logit contributions
unch+0.213
amb+0.212
oot+0.200
aring+0.196
uts+0.193
Neg logit contributions
 same-0.137
 way-0.125
 boring-0.110
 task-0.109
 bad-0.105
Token tree
Token ID5509
Feature activation-2.23e-01

Pos logit contributions
 same+0.147
 boring+0.133
 homework+0.128
 ger+0.127
 proper+0.126
Neg logit contributions
unch-0.223
amb-0.219
oot-0.217
umbs-0.210
uts-0.206
Token and
Token ID290
Feature activation+3.37e-02

Pos logit contributions
unch+0.447
amb+0.439
oot+0.412
oped+0.411
aring+0.405
Neg logit contributions
 same-0.429
 way-0.398
 task-0.363
 boring-0.355
 attic-0.355
Token takes
Token ID2753
Feature activation-4.72e-02

Pos logit contributions
 same+0.060
 task+0.054
 way+0.053
 boring+0.053
 laundry+0.051
Neg logit contributions
amb-0.066
oped-0.065
unch-0.060
oot-0.060
hed-0.059
Token the
Token ID262
Feature activation+1.25e-01

Pos logit contributions
amb+0.119
unch+0.119
oot+0.113
oped+0.112
aring+0.110
Neg logit contributions
 same-0.063
 way-0.055
 task-0.051
 attic-0.050
 boring-0.050
Token aer
Token ID9551
Feature activation-1.97e-01

Pos logit contributions
 same+0.172
 way+0.153
 boring+0.151
 attic+0.149
 basement+0.148
Neg logit contributions
unch-0.277
amb-0.264
oot-0.252
oped-0.252
aring-0.248
Tokenopl
Token ID20106
Feature activation-3.17e-01

Pos logit contributions
amb+0.324
unch+0.299
oped+0.291
oled+0.288
uts+0.284
Neg logit contributions
 same-0.407
 way-0.374
 wall-0.348
 task-0.347
 boring-0.345
Tokenane
Token ID1531
Feature activation-1.94e-01

Pos logit contributions
unch+0.437
qu+0.419
oing+0.418
aring+0.415
ann+0.409
Neg logit contributions
 same-0.643
 attic-0.578
 garage-0.560
 homework-0.555
 wall-0.555
Token.
Token ID13
Feature activation-9.25e-02

Pos logit contributions
unch+0.443
amb+0.433
oped+0.410
oot+0.406
aring+0.401
Neg logit contributions
 same-0.322
 way-0.291
 task-0.265
 boring-0.259
 attic-0.253
Token He
Token ID679
Feature activation-8.24e-03

Pos logit contributions
unch+0.201
amb+0.199
oot+0.187
oped+0.186
aring+0.184
Neg logit contributions
 same-0.161
 way-0.146
 task-0.135
 boring-0.133
 attic-0.130
Token does
Token ID857
Feature activation+6.76e-03

Pos logit contributions
amb+0.018
unch+0.017
oped+0.017
uts+0.016
qu+0.016
Neg logit contributions
 same-0.013
 task-0.011
 way-0.011
 attic-0.011
 laundry-0.011
Token not
Token ID407
Feature activation+8.24e-02

Pos logit contributions
 same+0.007
 way+0.006
 attic+0.005
 task+0.005
 garage+0.005
Neg logit contributions
unch-0.019
amb-0.019
oot-0.018
uts-0.018
aring-0.018
Token to
Token ID284
Feature activation+1.50e-01

Pos logit contributions
unch+0.043
uts+0.043
amb+0.042
oot+0.040
qu+0.040
Neg logit contributions
 same-0.035
 task-0.031
 way-0.030
 attic-0.030
 garage-0.030
Token wear
Token ID5806
Feature activation-1.10e-01

Pos logit contributions
 same+0.210
 way+0.184
 attic+0.166
 boring+0.164
 task+0.163
Neg logit contributions
amb-0.364
unch-0.362
oot-0.343
uts-0.338
qu-0.338
Token his
Token ID465
Feature activation+1.05e-01

Pos logit contributions
amb+0.284
unch+0.274
oot+0.268
oped+0.265
uts+0.264
Neg logit contributions
 same-0.131
 way-0.116
 task-0.107
 attic-0.101
 garage-0.099
Token fancy
Token ID14996
Feature activation-1.15e-01

Pos logit contributions
 same+0.123
 basement+0.115
 way+0.106
 attic+0.103
 task+0.099
Neg logit contributions
amb-0.243
unch-0.226
uts-0.222
oped-0.220
ang-0.219
Token swe
Token ID3490
Feature activation-2.07e-01

Pos logit contributions
amb+0.255
unch+0.231
oot+0.231
oped+0.225
uts+0.223
Neg logit contributions
 same-0.172
 way-0.148
 basement-0.143
 boring-0.142
 attic-0.140
Tokenaters
Token ID8605
Feature activation-3.16e-01

Pos logit contributions
unch+0.244
qu+0.238
aring+0.233
oot+0.225
oing+0.223
Neg logit contributions
 same-0.552
 way-0.522
 garage-0.502
 proper-0.502
 attic-0.499
Token every
Token ID790
Feature activation+1.29e-02

Pos logit contributions
unch+0.615
amb+0.582
aring+0.555
oing+0.554
oped+0.554
Neg logit contributions
 same-0.637
 way-0.580
 task-0.533
 boring-0.531
 attic-0.527
Token day
Token ID1110
Feature activation-9.40e-02

Pos logit contributions
 same+0.014
 boring+0.010
 attic+0.010
 garage+0.010
 way+0.010
Neg logit contributions
unch-0.033
amb-0.033
oped-0.032
oot-0.031
uts-0.030
Token.
Token ID13
Feature activation-8.77e-02

Pos logit contributions
unch+0.154
amb+0.152
oot+0.140
oped+0.139
aring+0.137
Neg logit contributions
 same-0.213
 way-0.197
 task-0.186
 boring-0.184
 attic-0.182
Token One
Token ID1881
Feature activation-9.79e-02

Pos logit contributions
unch+0.186
amb+0.181
oot+0.178
oped+0.174
oing+0.173
Neg logit contributions
 same-0.155
 way-0.142
 attic-0.129
 garage-0.128
 boring-0.127
Token day
Token ID1110
Feature activation-7.56e-02

Pos logit contributions
amb+0.250
unch+0.245
ang+0.232
oot+0.225
aring+0.225
Neg logit contributions
 same-0.115
 basement-0.097
 garage-0.092
 attic-0.088
 way-0.088
Token had
Token ID550
Feature activation+6.84e-02

Pos logit contributions
 same+0.172
 task+0.156
 garage+0.150
 attic+0.149
 way+0.149
Neg logit contributions
oped-0.191
amb-0.188
unch-0.187
uts-0.185
qu-0.184
Token become
Token ID1716
Feature activation+4.21e-02

Pos logit contributions
 same+0.098
 task+0.093
 attic+0.089
 garage+0.088
 way+0.086
Neg logit contributions
unch-0.149
uts-0.148
oot-0.146
amb-0.144
qu-0.144
Token trapped
Token ID13640
Feature activation-1.09e-01

Pos logit contributions
 same+0.055
 task+0.049
 way+0.048
 attic+0.046
 garage+0.045
Neg logit contributions
unch-0.099
amb-0.098
oot-0.096
uts-0.096
oped-0.095
Token in
Token ID287
Feature activation+1.01e-01

Pos logit contributions
amb+0.190
unch+0.190
oot+0.178
oped+0.174
aring+0.173
Neg logit contributions
 same-0.230
 way-0.212
 task-0.201
 boring-0.198
 laundry-0.195
Token the
Token ID262
Feature activation+1.25e-01

Pos logit contributions
 same+0.104
 way+0.084
 task+0.080
 boring+0.075
 garage+0.072
Neg logit contributions
unch-0.270
amb-0.264
oot-0.258
uts-0.257
oped-0.257
Token bog
Token ID22922
Feature activation-3.15e-01

Pos logit contributions
 same+0.143
 homework+0.141
 task+0.131
 way+0.126
 boring+0.123
Neg logit contributions
amb-0.278
unch-0.278
aring-0.269
ang-0.264
oped-0.264
Token.
Token ID13
Feature activation-7.88e-02

Pos logit contributions
unch+0.627
amb+0.602
oped+0.578
oot+0.550
qu+0.546
Neg logit contributions
 same-0.612
 way-0.596
 floor-0.523
 table-0.523
 wall-0.517
Token Rose
Token ID8049
Feature activation-8.79e-02

Pos logit contributions
unch+0.175
amb+0.173
oot+0.164
oped+0.162
aring+0.161
Neg logit contributions
 same-0.132
 way-0.120
 task-0.110
 boring-0.108
 attic-0.106
Token thought
Token ID1807
Feature activation-8.53e-02

Pos logit contributions
unch+0.206
amb+0.204
oot+0.193
oped+0.191
aring+0.189
Neg logit contributions
 same-0.138
 way-0.123
 task-0.112
 boring-0.111
 attic-0.108
Token and
Token ID290
Feature activation+1.04e-01

Pos logit contributions
oot+0.155
unch+0.151
uts+0.146
aring+0.144
amb+0.140
Neg logit contributions
 same-0.168
 garage-0.147
 way-0.144
 basement-0.143
 attic-0.143
Token thought
Token ID1807
Feature activation-7.73e-02

Pos logit contributions
 same+0.182
 way+0.164
 task+0.154
 boring+0.150
 attic+0.148
Neg logit contributions
unch-0.222
amb-0.222
oot-0.210
oped-0.207
aring-0.204
Token at
Token ID379
Feature activation+3.06e-02

Pos logit contributions
unch+0.226
amb+0.220
uts+0.210
oot+0.210
aring+0.202
Neg logit contributions
 same-0.318
 way-0.287
 task-0.283
 boring-0.278
 garage-0.275
Token the
Token ID262
Feature activation+1.19e-01

Pos logit contributions
 same+0.033
 task+0.032
 boring+0.030
 proper+0.029
 way+0.028
Neg logit contributions
amb-0.071
unch-0.071
oot-0.069
qu-0.067
rol-0.067
Token end
Token ID886
Feature activation-1.18e-01

Pos logit contributions
 same+0.126
 proper+0.116
 task+0.113
 way+0.112
 boring+0.112
Neg logit contributions
unch-0.272
amb-0.266
qu-0.259
rol-0.253
aring-0.249
Token of
Token ID286
Feature activation+3.79e-02

Pos logit contributions
amb+0.174
unch+0.171
oot+0.154
oped+0.153
oing+0.151
Neg logit contributions
 same-0.287
 way-0.272
 task-0.255
 boring-0.254
 attic-0.253
Token the
Token ID262
Feature activation+1.11e-01

Pos logit contributions
 same+0.046
 task+0.040
 way+0.038
 boring+0.036
 proper+0.036
Neg logit contributions
unch-0.095
amb-0.091
oot-0.089
oped-0.087
uts-0.087
Token trail
Token ID8025
Feature activation-3.14e-01

Pos logit contributions
 same+0.124
 way+0.103
 proper+0.100
 boring+0.100
 task+0.093
Neg logit contributions
unch-0.268
amb-0.260
qu-0.253
aring-0.252
oot-0.248
Token.
Token ID13
Feature activation-9.03e-02

Pos logit contributions
unch+0.504
amb+0.495
oot+0.480
oped+0.474
uts+0.470
Neg logit contributions
 same-0.685
 way-0.622
 task-0.615
 boring-0.605
 laundry-0.595
Token She
Token ID1375
Feature activation-1.04e-02

Pos logit contributions
unch+0.185
amb+0.182
oot+0.176
oing+0.173
oped+0.172
Neg logit contributions
 same-0.163
 way-0.149
 attic-0.136
 boring-0.134
 garage-0.134
Token was
Token ID373
Feature activation+1.01e-01

Pos logit contributions
amb+0.021
oped+0.020
unch+0.020
oot+0.019
qu+0.019
Neg logit contributions
 same-0.019
 task-0.017
 way-0.016
 laundry-0.016
 boring-0.015
Token so
Token ID523
Feature activation+9.99e-02

Pos logit contributions
 same+0.165
 task+0.153
 way+0.149
 garage+0.142
 attic+0.141
Neg logit contributions
unch-0.216
amb-0.211
oot-0.201
aring-0.199
oped-0.199
Token excited
Token ID6568
Feature activation-9.50e-02

Pos logit contributions
 same+0.159
 way+0.143
 task+0.130
 boring+0.129
 attic+0.126
Neg logit contributions
unch-0.233
amb-0.231
oot-0.218
oped-0.216
aring-0.214
Token driver
Token ID4639
Feature activation-2.17e-01

Pos logit contributions
 same+0.031
 way+0.028
 task+0.025
 boring+0.024
 attic+0.024
Neg logit contributions
unch-0.061
amb-0.060
oot-0.057
oped-0.057
aring-0.056
Token hon
Token ID3032
Feature activation-2.42e-01

Pos logit contributions
unch+0.445
amb+0.439
oot+0.412
oped+0.406
uts+0.403
Neg logit contributions
 same-0.402
 way-0.367
 task-0.336
 boring-0.334
 attic-0.327
Tokenked
Token ID9091
Feature activation-2.24e-01

Pos logit contributions
unch+0.404
quet+0.338
uts+0.333
amb+0.330
oot+0.327
Neg logit contributions
 same-0.583
 way-0.516
 boring-0.489
 bad-0.485
 ugly-0.469
Token the
Token ID262
Feature activation+1.00e-01

Pos logit contributions
amb+0.481
unch+0.478
oot+0.453
oped+0.445
uts+0.443
Neg logit contributions
 same-0.382
 way-0.346
 task-0.327
 attic-0.321
 boring-0.320
Token car
Token ID1097
Feature activation-2.28e-01

Pos logit contributions
 same+0.135
 way+0.122
 task+0.121
 boring+0.118
 attic+0.116
Neg logit contributions
amb-0.216
unch-0.214
aring-0.206
oing-0.205
oped-0.203
Token horn
Token ID12718
Feature activation-3.12e-01

Pos logit contributions
unch+0.438
amb+0.427
oot+0.399
oped+0.396
uts+0.390
Neg logit contributions
 same-0.458
 way-0.424
 task-0.390
 boring-0.386
 attic-0.378
Token and
Token ID290
Feature activation+3.54e-02

Pos logit contributions
unch+0.615
amb+0.586
oot+0.549
oped+0.544
aring+0.539
Neg logit contributions
 same-0.623
 way-0.577
 boring-0.517
 task-0.515
 bad-0.506
Token shouted
Token ID17293
Feature activation-1.39e-01

Pos logit contributions
 same+0.065
 task+0.059
 way+0.058
 attic+0.055
 boring+0.055
Neg logit contributions
amb-0.070
unch-0.068
oped-0.067
oot-0.066
qu-0.064
Token,
Token ID11
Feature activation+1.30e-02

Pos logit contributions
unch+0.292
amb+0.282
oot+0.264
oped+0.263
oing+0.259
Neg logit contributions
 same-0.258
 way-0.230
 boring-0.216
 bad-0.213
 task-0.209
Token "
Token ID366
Feature activation-5.22e-02

Pos logit contributions
 same+0.018
 way+0.016
 boring+0.014
 task+0.014
 bad+0.014
Neg logit contributions
unch-0.033
amb-0.032
oot-0.031
oped-0.030
aring-0.030
TokenMove
Token ID21774
Feature activation-1.02e-01

Pos logit contributions
unch+0.118
oot+0.113
amb+0.112
uts+0.110
aring+0.107
Neg logit contributions
 same-0.088
 way-0.071
 boring-0.065
 task-0.065
 laundry-0.064
Token<|endoftext|>
Token ID50256
Feature activation-2.23e-01

Pos logit contributions
unch+0.197
amb+0.194
oot+0.188
oing+0.186
oped+0.185
Neg logit contributions
 same-0.162
 way-0.145
 boring-0.133
 garage-0.133
 attic-0.132
TokenRyan
Token ID21868
Feature activation-1.56e-01

Pos logit contributions
amb+0.528
aring+0.520
oot+0.518
unch+0.507
oing+0.492
Neg logit contributions
 same-0.359
 way-0.328
 attic-0.287
 garage-0.279
 room-0.279
Token watched
Token ID7342
Feature activation-1.23e-01

Pos logit contributions
amb+0.294
unch+0.285
oped+0.273
oot+0.263
aring+0.261
Neg logit contributions
 same-0.308
 way-0.283
 task-0.277
 boring-0.264
 attic-0.263
Token the
Token ID262
Feature activation+1.39e-01

Pos logit contributions
amb+0.264
unch+0.263
oot+0.244
aring+0.237
uts+0.236
Neg logit contributions
 same-0.214
 way-0.197
 task-0.181
 boring-0.180
 attic-0.178
Token aer
Token ID9551
Feature activation-1.83e-01

Pos logit contributions
 same+0.200
 way+0.181
 task+0.164
 boring+0.163
 proper+0.159
Neg logit contributions
unch-0.327
amb-0.324
oing-0.300
qu-0.299
oot-0.299
Tokenopl
Token ID20106
Feature activation-3.12e-01

Pos logit contributions
amb+0.268
unch+0.261
oped+0.245
uts+0.240
oing+0.233
Neg logit contributions
 same-0.426
 way-0.397
 boring-0.376
 wall-0.368
 task-0.367
Tokenane
Token ID1531
Feature activation-1.83e-01

Pos logit contributions
unch+0.473
umbs+0.463
oing+0.432
aring+0.420
aron+0.418
Neg logit contributions
 same-0.646
 attic-0.565
 homework-0.556
 garage-0.552
 wall-0.548
Token fly
Token ID6129
Feature activation-2.27e-01

Pos logit contributions
unch+0.375
amb+0.374
oot+0.355
aring+0.348
oing+0.347
Neg logit contributions
 same-0.324
 way-0.294
 boring-0.275
 task-0.275
 attic-0.273
Token in
Token ID287
Feature activation+6.79e-02

Pos logit contributions
amb+0.429
unch+0.416
oot+0.402
oped+0.395
aring+0.390
Neg logit contributions
 same-0.434
 way-0.400
 task-0.384
 laundry-0.378
 attic-0.376
Token the
Token ID262
Feature activation+9.67e-02

Pos logit contributions
 same+0.068
 way+0.056
 task+0.054
 boring+0.052
 homework+0.049
Neg logit contributions
unch-0.183
amb-0.182
oot-0.173
qu-0.171
oped-0.170
Token sky
Token ID6766
Feature activation-1.97e-01

Pos logit contributions
 homework+0.142
 task+0.141
 same+0.136
 boring+0.127
 germ+0.127
Neg logit contributions
amb-0.187
unch-0.176
qu-0.174
oped-0.173
rol-0.172
Token still
Token ID991
Feature activation-7.46e-03

Pos logit contributions
unch+0.175
amb+0.174
oped+0.165
oot+0.165
aring+0.163
Neg logit contributions
 same-0.118
 way-0.104
 task-0.097
 boring-0.094
 attic-0.093
Token remembered
Token ID12086
Feature activation-2.60e-02

Pos logit contributions
unch+0.017
amb+0.017
oped+0.016
oot+0.016
aring+0.016
Neg logit contributions
 same-0.011
 way-0.010
 task-0.010
 attic-0.009
 laundry-0.009
Token the
Token ID262
Feature activation+1.49e-01

Pos logit contributions
unch+0.054
amb+0.052
oot+0.051
uts+0.051
aring+0.050
Neg logit contributions
 same-0.045
 way-0.039
 task-0.039
 attic-0.037
 garage-0.037
Token big
Token ID1263
Feature activation-1.90e-02

Pos logit contributions
 same+0.162
 way+0.131
 attic+0.131
 proper+0.129
 task+0.126
Neg logit contributions
unch-0.376
amb-0.362
oot-0.358
aring-0.355
qu-0.350
Token aer
Token ID9551
Feature activation-1.76e-01

Pos logit contributions
unch+0.044
amb+0.043
oot+0.042
aring+0.041
oing+0.041
Neg logit contributions
 same-0.027
 way-0.022
 proper-0.022
 attic-0.022
 task-0.022
Tokenopl
Token ID20106
Feature activation-3.12e-01

Pos logit contributions
amb+0.286
unch+0.268
oped+0.259
ucking+0.244
ub+0.243
Neg logit contributions
 same-0.366
 way-0.341
 boring-0.322
 wall-0.322
 floor-0.318
Tokenane
Token ID1531
Feature activation-1.75e-01

Pos logit contributions
unch+0.446
amb+0.428
umbs+0.419
oing+0.417
aring+0.416
Neg logit contributions
 same-0.618
 attic-0.554
 wall-0.546
 usual-0.544
 homework-0.537
Token flying
Token ID7348
Feature activation-1.86e-01

Pos logit contributions
unch+0.311
amb+0.304
oot+0.298
aring+0.291
uts+0.290
Neg logit contributions
 same-0.359
 way-0.320
 task-0.310
 attic-0.305
 laundry-0.303
Token high
Token ID1029
Feature activation-1.01e-01

Pos logit contributions
amb+0.346
oot+0.337
unch+0.331
oped+0.324
uts+0.320
Neg logit contributions
 same-0.348
 way-0.321
 task-0.314
 laundry-0.311
 attic-0.303
Token in
Token ID287
Feature activation+6.18e-02

Pos logit contributions
amb+0.183
unch+0.179
oot+0.169
oped+0.166
uts+0.165
Neg logit contributions
 same-0.207
 way-0.192
 task-0.180
 boring-0.178
 laundry-0.177
Token the
Token ID262
Feature activation+9.65e-02

Pos logit contributions
 same+0.064
 task+0.055
 way+0.054
 homework+0.052
 boring+0.050
Neg logit contributions
unch-0.156
amb-0.154
qu-0.150
oot-0.149
oped-0.147
Token he
Token ID339
Feature activation+3.78e-02

Pos logit contributions
unch+0.024
amb+0.023
oot+0.023
uts+0.022
oped+0.022
Neg logit contributions
 same-0.018
 way-0.017
 task-0.017
 garage-0.016
 laundry-0.015
Token watched
Token ID7342
Feature activation-1.16e-01

Pos logit contributions
 same+0.063
 way+0.057
 task+0.052
 boring+0.052
 attic+0.051
Neg logit contributions
unch-0.084
amb-0.083
oot-0.079
aring-0.078
oped-0.077
Token the
Token ID262
Feature activation+1.42e-01

Pos logit contributions
unch+0.308
amb+0.307
oot+0.294
oped+0.288
uts+0.288
Neg logit contributions
 same-0.142
 way-0.124
 task-0.111
 attic-0.108
 boring-0.107
Token clean
Token ID3424
Feature activation-1.22e-01

Pos logit contributions
 same+0.221
 way+0.203
 task+0.188
 proper+0.184
 attic+0.183
Neg logit contributions
unch-0.308
amb-0.304
qu-0.291
oot-0.291
oing-0.285
Token aer
Token ID9551
Feature activation-1.84e-01

Pos logit contributions
amb+0.264
unch+0.257
oot+0.254
qu+0.247
oing+0.245
Neg logit contributions
 same-0.187
 way-0.162
 task-0.160
 proper-0.159
 boring-0.157
Tokenopl
Token ID20106
Feature activation-3.12e-01

Pos logit contributions
amb+0.288
unch+0.272
oped+0.265
oled+0.252
uts+0.241
Neg logit contributions
 same-0.402
 way-0.374
 wall-0.357
 floor-0.353
 boring-0.347
Tokenane
Token ID1531
Feature activation-1.88e-01

Pos logit contributions
unch+0.435
amb+0.413
rol+0.402
aron+0.401
aring+0.400
Neg logit contributions
 same-0.656
 wall-0.572
 usual-0.570
 attic-0.568
 homework-0.563
Token fly
Token ID6129
Feature activation-2.27e-01

Pos logit contributions
unch+0.404
amb+0.401
oot+0.377
oped+0.374
aring+0.370
Neg logit contributions
 same-0.328
 way-0.297
 task-0.274
 boring-0.271
 attic-0.265
Token away
Token ID1497
Feature activation-4.33e-02

Pos logit contributions
unch+0.486
amb+0.483
oot+0.455
oped+0.451
aring+0.445
Neg logit contributions
 same-0.397
 way-0.360
 task-0.333
 boring-0.328
 attic-0.323
Token.
Token ID13
Feature activation-8.91e-02

Pos logit contributions
unch+0.072
amb+0.071
oot+0.066
oped+0.065
aring+0.064
Neg logit contributions
 same-0.097
 way-0.090
 task-0.085
 boring-0.084
 attic-0.083
Token
Token ID220
Feature activation-1.66e-01

Pos logit contributions
unch+0.181
amb+0.177
oot+0.172
oing+0.171
oped+0.169
Neg logit contributions
 same-0.163
 way-0.151
 attic-0.136
 garage-0.135
 boring-0.135
Tokenled
Token ID992
Feature activation-9.97e-02

Pos logit contributions
amb+0.348
unch+0.335
oot+0.301
oped+0.295
uts+0.288
Neg logit contributions
 same-0.666
 way-0.627
 task-0.605
 boring-0.599
 garage-0.590
Token up
Token ID510
Feature activation-5.78e-02

Pos logit contributions
amb+0.244
unch+0.229
oot+0.219
oped+0.218
oing+0.217
Neg logit contributions
 same-0.132
 way-0.117
 task-0.115
 garage-0.111
 boring-0.110
Token in
Token ID287
Feature activation+5.22e-02

Pos logit contributions
amb+0.123
unch+0.117
oot+0.112
oped+0.111
uts+0.109
Neg logit contributions
 same-0.098
 way-0.087
 task-0.087
 garage-0.083
 boring-0.083
Token the
Token ID262
Feature activation+8.45e-02

Pos logit contributions
 same+0.054
 way+0.043
 task+0.042
 boring+0.041
 attic+0.037
Neg logit contributions
unch-0.142
amb-0.140
oped-0.135
oot-0.132
uts-0.131
Token warm
Token ID5814
Feature activation-2.05e-01

Pos logit contributions
 same+0.098
 homework+0.089
 task+0.087
 way+0.087
 boring+0.085
Neg logit contributions
amb-0.195
unch-0.190
oing-0.181
oped-0.181
rol-0.176
Token blanket
Token ID18447
Feature activation-3.10e-01

Pos logit contributions
amb+0.509
unch+0.506
oot+0.477
oped+0.471
oing+0.468
Neg logit contributions
 same-0.281
 way-0.242
 boring-0.228
 task-0.227
 attic-0.215
Token and
Token ID290
Feature activation+4.37e-02

Pos logit contributions
unch+0.567
amb+0.559
oped+0.522
oot+0.522
aring+0.516
Neg logit contributions
 same-0.649
 way-0.609
 task-0.547
 boring-0.542
 table-0.542
Token enjoyed
Token ID8359
Feature activation-3.13e-02

Pos logit contributions
 same+0.077
 task+0.067
 attic+0.065
 boring+0.064
 garage+0.064
Neg logit contributions
amb-0.086
unch-0.084
oped-0.081
uts-0.079
oot-0.076
Token his
Token ID465
Feature activation+1.12e-01

Pos logit contributions
unch+0.088
amb+0.087
oot+0.083
oped+0.083
aring+0.082
Neg logit contributions
 same-0.035
 way-0.029
 task-0.025
 boring-0.025
 attic-0.024
Token adventure
Token ID8855
Feature activation-2.08e-01

Pos logit contributions
 same+0.130
 garage+0.106
 way+0.105
 basement+0.104
 boring+0.103
Neg logit contributions
unch-0.276
amb-0.269
oing-0.250
oped-0.245
uts-0.242
Token with
Token ID351
Feature activation+1.70e-02

Pos logit contributions
unch+0.378
amb+0.372
oped+0.344
oot+0.344
aring+0.340
Neg logit contributions
 same-0.437
 way-0.405
 task-0.375
 boring-0.372
 attic-0.365
Token And
Token ID843
Feature activation-1.17e-02

Pos logit contributions
unch+0.183
amb+0.176
oot+0.176
oped+0.172
oing+0.170
Neg logit contributions
 same-0.128
 way-0.113
 garage-0.105
 attic-0.105
 task-0.103
Token put
Token ID1234
Feature activation-7.90e-02

Pos logit contributions
unch+0.026
amb+0.026
aring+0.024
oot+0.024
qu+0.024
Neg logit contributions
 same-0.019
 way-0.017
 task-0.016
 attic-0.015
 boring-0.015
Token on
Token ID319
Feature activation+5.71e-02

Pos logit contributions
amb+0.186
unch+0.181
oped+0.177
oot+0.175
qu+0.174
Neg logit contributions
 same-0.102
 way-0.095
 task-0.089
 garage-0.087
 attic-0.085
Token your
Token ID534
Feature activation+1.19e-01

Pos logit contributions
 same+0.057
 way+0.054
 task+0.049
 garage+0.048
 attic+0.048
Neg logit contributions
amb-0.146
qu-0.142
unch-0.139
uts-0.138
es-0.137
Token swe
Token ID3490
Feature activation-2.02e-01

Pos logit contributions
 same+0.175
 basement+0.165
 ending+0.156
 attic+0.156
 boring+0.148
Neg logit contributions
amb-0.182
unch-0.169
run-0.169
ew-0.168
ement-0.165
Tokenaters
Token ID8605
Feature activation-3.09e-01

Pos logit contributions
unch+0.210
qu+0.196
amb+0.190
oped+0.189
oot+0.188
Neg logit contributions
 same-0.574
 way-0.548
 task-0.523
 garage-0.522
 attic-0.520
Token,
Token ID11
Feature activation-2.15e-02

Pos logit contributions
unch+0.575
amb+0.558
oot+0.522
oped+0.520
aring+0.516
Neg logit contributions
 same-0.643
 way-0.589
 task-0.547
 boring-0.540
 attic-0.536
Token or
Token ID393
Feature activation+7.30e-02

Pos logit contributions
amb+0.045
unch+0.044
uts+0.042
aring+0.041
oped+0.041
Neg logit contributions
 same-0.034
 way-0.031
 task-0.030
 attic-0.030
 basement-0.029
Token you
Token ID345
Feature activation+7.80e-02

Pos logit contributions
 same+0.100
 way+0.089
 task+0.087
 attic+0.085
 basement+0.085
Neg logit contributions
amb-0.157
unch-0.149
aring-0.148
es-0.148
uts-0.147
Token will
Token ID481
Feature activation+1.72e-02

Pos logit contributions
 same+0.113
 way+0.100
 task+0.091
 boring+0.090
 attic+0.088
Neg logit contributions
unch-0.191
amb-0.189
oot-0.180
oped-0.179
aring-0.177
Token catch
Token ID4929
Feature activation-1.24e-01

Pos logit contributions
 same+0.022
 way+0.019
 task+0.017
 boring+0.017
 attic+0.016
Neg logit contributions
unch-0.045
amb-0.045
oot-0.043
oped-0.043
aring-0.042
Token wrap
Token ID14441
Feature activation-2.65e-01

Pos logit contributions
 same+0.182
 attic+0.157
 way+0.156
 task+0.155
 boring+0.155
Neg logit contributions
unch-0.280
amb-0.276
oot-0.261
aring-0.258
qu-0.257
Token his
Token ID465
Feature activation+9.77e-02

Pos logit contributions
amb+0.637
unch+0.617
oot+0.604
uts+0.600
oped+0.592
Neg logit contributions
 same-0.357
 way-0.322
 task-0.305
 attic-0.288
 boring-0.287
Token paws
Token ID49193
Feature activation-2.59e-01

Pos logit contributions
 same+0.129
 basement+0.114
 boring+0.111
 way+0.110
 garage+0.106
Neg logit contributions
amb-0.214
uts-0.191
unch-0.191
oot-0.185
ew-0.182
Token around
Token ID1088
Feature activation-7.40e-02

Pos logit contributions
unch+0.471
amb+0.467
oot+0.435
oped+0.429
uts+0.424
Neg logit contributions
 same-0.535
 way-0.493
 task-0.463
 boring-0.457
 attic-0.451
Token the
Token ID262
Feature activation+9.52e-02

Pos logit contributions
amb+0.129
unch+0.128
oped+0.119
uts+0.118
oot+0.117
Neg logit contributions
 same-0.145
 way-0.128
 task-0.127
 boring-0.123
 attic-0.121
Token bow
Token ID9563
Feature activation-3.09e-01

Pos logit contributions
 same+0.104
 way+0.089
 boring+0.088
 task+0.086
 homework+0.084
Neg logit contributions
amb-0.214
unch-0.213
oing-0.202
qu-0.200
aring-0.198
Token and
Token ID290
Feature activation+3.51e-02

Pos logit contributions
amb+0.504
unch+0.502
oot+0.467
uts+0.459
oped+0.455
Neg logit contributions
 same-0.684
 way-0.630
 task-0.600
 boring-0.594
 attic-0.585
Token pull
Token ID2834
Feature activation-1.67e-01

Pos logit contributions
 same+0.054
 task+0.044
 way+0.044
 boring+0.043
 attic+0.043
Neg logit contributions
amb-0.078
uts-0.073
unch-0.073
oped-0.071
es-0.070
Token.
Token ID13
Feature activation-8.70e-02

Pos logit contributions
amb+0.385
unch+0.384
oot+0.365
oped+0.363
uts+0.358
Neg logit contributions
 same-0.256
 way-0.229
 task-0.214
 boring-0.207
 attic-0.206
Token Spot
Token ID15899
Feature activation-9.74e-02

Pos logit contributions
unch+0.185
amb+0.184
oot+0.178
oing+0.174
qu+0.172
Neg logit contributions
 same-0.150
 way-0.138
 boring-0.122
 attic-0.122
 garage-0.122
Token did
Token ID750
Feature activation-1.04e-02

Pos logit contributions
amb+0.211
unch+0.204
oped+0.203
aring+0.196
qu+0.194
Neg logit contributions
 same-0.161
 way-0.143
 task-0.140
 attic-0.138
 laundry-0.136
Token got
Token ID1392
Feature activation-1.73e-02

Pos logit contributions
 same+0.224
 way+0.196
 task+0.193
 attic+0.191
 laundry+0.186
Neg logit contributions
amb-0.256
oped-0.232
oot-0.232
unch-0.231
qu-0.220
Token dressed
Token ID12049
Feature activation-1.53e-01

Pos logit contributions
unch+0.036
rol+0.035
oped+0.034
uts+0.034
oot+0.034
Neg logit contributions
 same-0.025
 attic-0.025
 way-0.025
 task-0.025
 garage-0.024
Token in
Token ID287
Feature activation+7.67e-02

Pos logit contributions
amb+0.292
unch+0.277
oot+0.265
uts+0.259
oped+0.259
Neg logit contributions
 same-0.291
 way-0.272
 task-0.262
 garage-0.254
 attic-0.254
Token her
Token ID607
Feature activation+6.28e-02

Pos logit contributions
 same+0.091
 way+0.079
 task+0.076
 garage+0.070
 boring+0.069
Neg logit contributions
unch-0.184
amb-0.184
oped-0.173
uts-0.173
qu-0.172
Token warm
Token ID5814
Feature activation-1.91e-01

Pos logit contributions
 same+0.077
 task+0.067
 way+0.065
 basement+0.064
 boring+0.063
Neg logit contributions
amb-0.143
unch-0.130
oped-0.128
uts-0.126
ang-0.125
Token coat
Token ID13209
Feature activation-3.08e-01

Pos logit contributions
amb+0.497
unch+0.479
oot+0.459
oped+0.453
oing+0.448
Neg logit contributions
 same-0.231
 way-0.189
 boring-0.187
 task-0.182
 bad-0.175
Token,
Token ID11
Feature activation-9.86e-03

Pos logit contributions
unch+0.547
amb+0.537
oot+0.499
oped+0.494
aring+0.490
Neg logit contributions
 same-0.656
 way-0.607
 task-0.564
 boring-0.561
 attic-0.553
Token hat
Token ID6877
Feature activation-2.61e-01

Pos logit contributions
amb+0.019
unch+0.019
uts+0.017
oped+0.017
oing+0.017
Neg logit contributions
 same-0.018
 task-0.016
 way-0.016
 basement-0.015
 attic-0.015
Token and
Token ID290
Feature activation+6.45e-02

Pos logit contributions
unch+0.485
amb+0.482
oot+0.452
oped+0.447
uts+0.443
Neg logit contributions
 same-0.525
 way-0.481
 task-0.456
 boring-0.448
 attic-0.441
Token gloves
Token ID18051
Feature activation-2.37e-01

Pos logit contributions
 same+0.097
 task+0.084
 way+0.083
 basement+0.078
 attic+0.076
Neg logit contributions
amb-0.140
uts-0.133
unch-0.133
oped-0.131
es-0.129
Token.
Token ID13
Feature activation-8.67e-02

Pos logit contributions
unch+0.467
amb+0.458
oot+0.426
oped+0.424
aring+0.424
Neg logit contributions
 same-0.462
 way-0.420
 task-0.387
 boring-0.385
 attic-0.378
Token went
Token ID1816
Feature activation-6.07e-02

Pos logit contributions
amb+0.099
oped+0.094
unch+0.093
es+0.087
aring+0.087
Neg logit contributions
 same-0.094
 task-0.088
 attic-0.085
 way-0.083
 laundry-0.083
Token to
Token ID284
Feature activation+1.35e-01

Pos logit contributions
amb+0.140
unch+0.137
oot+0.133
oped+0.129
aring+0.127
Neg logit contributions
 same-0.095
 way-0.084
 task-0.079
 boring-0.079
 attic-0.076
Token see
Token ID766
Feature activation+3.27e-03

Pos logit contributions
 same+0.171
 boring+0.148
 ger+0.142
 task+0.142
 floor+0.140
Neg logit contributions
amb-0.277
unch-0.276
oot-0.265
oped-0.255
r-0.253
Token the
Token ID262
Feature activation+1.58e-01

Pos logit contributions
 same+0.004
 way+0.004
 task+0.004
 boring+0.003
 attic+0.003
Neg logit contributions
unch-0.008
amb-0.008
oot-0.007
uts-0.007
aring-0.007
Token aer
Token ID9551
Feature activation-1.74e-01

Pos logit contributions
 same+0.184
 way+0.172
 proper+0.156
 task+0.152
 boring+0.145
Neg logit contributions
unch-0.384
amb-0.377
oing-0.352
oot-0.352
qu-0.350
Tokenopl
Token ID20106
Feature activation-3.07e-01

Pos logit contributions
amb+0.279
unch+0.256
oped+0.250
uts+0.239
ub+0.237
Neg logit contributions
 same-0.374
 way-0.348
 wall-0.333
 floor-0.330
 attic-0.329
Tokenane
Token ID1531
Feature activation-1.73e-01

Pos logit contributions
unch+0.462
aring+0.441
umbs+0.431
oing+0.429
qu+0.421
Neg logit contributions
 same-0.614
 attic-0.553
 garage-0.537
 wall-0.531
 usual-0.530
Token in
Token ID287
Feature activation+6.94e-02

Pos logit contributions
unch+0.328
amb+0.326
oot+0.301
oped+0.299
aring+0.294
Neg logit contributions
 same-0.351
 way-0.324
 task-0.300
 boring-0.298
 attic-0.291
Token the
Token ID262
Feature activation+1.00e-01

Pos logit contributions
 same+0.071
 way+0.056
 task+0.056
 boring+0.053
 laundry+0.048
Neg logit contributions
unch-0.187
amb-0.182
uts-0.175
oped-0.175
oot-0.174
Token sky
Token ID6766
Feature activation-1.94e-01

Pos logit contributions
 homework+0.148
 task+0.146
 usual+0.131
 boring+0.130
 germ+0.130
Neg logit contributions
amb-0.172
aw-0.167
rol-0.161
ann-0.160
hed-0.160
Token.
Token ID13
Feature activation-8.50e-02

Pos logit contributions
amb+0.372
unch+0.367
oot+0.345
oped+0.345
aring+0.337
Neg logit contributions
 same-0.402
 way-0.375
 boring-0.337
 task-0.329
 table-0.329
Token closer
Token ID5699
Feature activation-1.06e-01

Pos logit contributions
unch+0.215
amb+0.214
oot+0.201
oped+0.199
oing+0.197
Neg logit contributions
 same-0.181
 way-0.164
 boring-0.152
 task-0.148
 bad-0.147
Token and
Token ID290
Feature activation+6.64e-02

Pos logit contributions
unch+0.222
amb+0.219
oped+0.206
oot+0.205
aring+0.202
Neg logit contributions
 same-0.194
 way-0.178
 task-0.162
 boring-0.160
 laundry-0.156
Token watched
Token ID7342
Feature activation-1.15e-01

Pos logit contributions
 same+0.115
 task+0.109
 way+0.108
 attic+0.107
 laundry+0.102
Neg logit contributions
oped-0.129
amb-0.129
unch-0.127
oot-0.123
qu-0.122
Token the
Token ID262
Feature activation+1.48e-01

Pos logit contributions
unch+0.266
amb+0.265
oot+0.251
uts+0.246
oped+0.245
Neg logit contributions
 same-0.180
 way-0.162
 task-0.150
 attic-0.148
 boring-0.147
Token aer
Token ID9551
Feature activation-1.80e-01

Pos logit contributions
 same+0.193
 way+0.176
 attic+0.160
 proper+0.160
 boring+0.158
Neg logit contributions
unch-0.349
amb-0.338
qu-0.325
oing-0.322
aring-0.318
Tokenopl
Token ID20106
Feature activation-3.07e-01

Pos logit contributions
amb+0.291
unch+0.265
oped+0.258
uts+0.250
oled+0.245
Neg logit contributions
 same-0.384
 way-0.353
 wall-0.333
 floor-0.332
 boring-0.328
Tokenane
Token ID1531
Feature activation-1.76e-01

Pos logit contributions
unch+0.447
oing+0.427
amb+0.424
aring+0.422
aron+0.420
Neg logit contributions
 same-0.601
 attic-0.524
 usual-0.521
 homework-0.515
 wall-0.509
Token fly
Token ID6129
Feature activation-2.29e-01

Pos logit contributions
unch+0.402
amb+0.398
oot+0.376
oped+0.373
aring+0.369
Neg logit contributions
 same-0.285
 way-0.257
 task-0.235
 boring-0.231
 attic-0.226
Token above
Token ID2029
Feature activation-9.10e-02

Pos logit contributions
unch+0.497
amb+0.489
oot+0.459
oped+0.456
aring+0.452
Neg logit contributions
 same-0.404
 way-0.366
 task-0.335
 boring-0.333
 attic-0.323
Token them
Token ID606
Feature activation-3.16e-02

Pos logit contributions
amb+0.172
unch+0.171
oot+0.161
oped+0.160
qu+0.157
Neg logit contributions
 same-0.173
 way-0.158
 task-0.152
 boring-0.149
 attic-0.148
Token.
Token ID13
Feature activation-9.27e-02

Pos logit contributions
unch+0.061
amb+0.060
oot+0.055
oped+0.054
aring+0.054
Neg logit contributions
 same-0.064
 way-0.060
 boring-0.054
 task-0.053
 bad-0.052
Token and
Token ID290
Feature activation+6.49e-02

Pos logit contributions
unch+0.240
amb+0.238
oot+0.221
oped+0.220
aring+0.216
Neg logit contributions
 same-0.253
 way-0.235
 task-0.216
 boring-0.215
 attic-0.211
Token put
Token ID1234
Feature activation-8.84e-02

Pos logit contributions
 same+0.118
 task+0.103
 way+0.099
 attic+0.097
 boring+0.094
Neg logit contributions
amb-0.129
oped-0.128
unch-0.127
oot-0.122
uts-0.118
Token on
Token ID319
Feature activation+5.22e-02

Pos logit contributions
amb+0.210
unch+0.208
oped+0.203
oot+0.198
uts+0.192
Neg logit contributions
 same-0.121
 way-0.107
 task-0.103
 boring-0.098
 attic-0.097
Token their
Token ID511
Feature activation+1.01e-01

Pos logit contributions
 same+0.057
 way+0.048
 task+0.047
 attic+0.043
 boring+0.042
Neg logit contributions
amb-0.138
unch-0.134
oped-0.131
uts-0.129
qu-0.128
Token swe
Token ID3490
Feature activation-2.06e-01

Pos logit contributions
 same+0.141
 basement+0.129
 garage+0.122
 attic+0.120
 wall+0.116
Neg logit contributions
amb-0.197
oped-0.192
unch-0.192
qu-0.183
uts-0.181
Tokenaters
Token ID8605
Feature activation-3.07e-01

Pos logit contributions
unch+0.210
qu+0.192
aring+0.190
oot+0.187
amb+0.187
Neg logit contributions
 same-0.588
 way-0.563
 task-0.538
 laundry-0.536
 attic-0.536
Token.
Token ID13
Feature activation-9.09e-02

Pos logit contributions
unch+0.620
amb+0.592
aring+0.562
oing+0.559
oot+0.556
Neg logit contributions
 same-0.594
 way-0.546
 task-0.493
 attic-0.489
 boring-0.488
Token Now
Token ID2735
Feature activation-5.83e-02

Pos logit contributions
unch+0.189
amb+0.186
oot+0.180
oing+0.175
aring+0.174
Neg logit contributions
 same-0.164
 way-0.150
 attic-0.134
 boring-0.134
 garage-0.133
Token,
Token ID11
Feature activation+2.23e-02

Pos logit contributions
amb+0.109
unch+0.105
uts+0.104
oot+0.103
qu+0.102
Neg logit contributions
 same-0.110
 attic-0.097
 task-0.096
 garage-0.094
 basement-0.093
Token Tim
Token ID5045
Feature activation-2.96e-02

Pos logit contributions
 same+0.036
 way+0.031
 task+0.031
 attic+0.030
 boring+0.029
Neg logit contributions
amb-0.048
unch-0.046
oot-0.045
uts-0.044
aring-0.044
Token was
Token ID373
Feature activation+1.25e-01

Pos logit contributions
amb+0.044
unch+0.041
oped+0.041
oot+0.039
aring+0.038
Neg logit contributions
 same-0.066
 task-0.064
 basement-0.063
 attic-0.062
 laundry-0.061
Token she
Token ID673
Feature activation+4.36e-02

Pos logit contributions
unch+0.001
amb+0.001
aring+0.001
uts+0.001
qu+0.001
Neg logit contributions
 same-0.001
 way-0.000
 task-0.000
 proper-0.000
 usual-0.000
Token z
Token ID1976
Feature activation-2.07e-01

Pos logit contributions
 same+0.072
 way+0.065
 task+0.062
 laundry+0.060
 attic+0.059
Neg logit contributions
amb-0.092
unch-0.090
oped-0.090
aring-0.087
uts-0.086
Tokenipped
Token ID3949
Feature activation-1.45e-01

Pos logit contributions
unch+0.226
amb+0.219
oot+0.193
aring+0.188
uts+0.187
Neg logit contributions
 same-0.583
 way-0.544
 task-0.523
 boring-0.518
 attic-0.515
Token up
Token ID510
Feature activation-4.00e-02

Pos logit contributions
amb+0.291
unch+0.282
oot+0.276
oped+0.271
uts+0.271
Neg logit contributions
 same-0.257
 way-0.234
 task-0.225
 boring-0.217
 attic-0.215
Token her
Token ID607
Feature activation+6.39e-02

Pos logit contributions
amb+0.072
unch+0.070
uts+0.069
oot+0.068
oped+0.068
Neg logit contributions
 same-0.072
 way-0.065
 task-0.063
 garage-0.060
 boring-0.060
Token coat
Token ID13209
Feature activation-3.07e-01

Pos logit contributions
 same+0.088
 way+0.072
 boring+0.070
 task+0.069
 bad+0.066
Neg logit contributions
amb-0.151
unch-0.145
oot-0.142
uts-0.141
oped-0.138
Token because
Token ID780
Feature activation+1.07e-02

Pos logit contributions
unch+0.560
amb+0.551
oot+0.512
oped+0.508
aring+0.504
Neg logit contributions
 same-0.639
 way-0.590
 task-0.548
 boring-0.543
 attic-0.536
Token it
Token ID340
Feature activation+1.54e-02

Pos logit contributions
 same+0.019
 way+0.018
 task+0.017
 garage+0.016
 boring+0.015
Neg logit contributions
amb-0.020
uts-0.020
unch-0.020
oot-0.020
oped-0.019
Token was
Token ID373
Feature activation+1.01e-01

Pos logit contributions
 same+0.027
 task+0.023
 attic+0.023
 proper+0.022
 way+0.022
Neg logit contributions
uts-0.029
oped-0.028
qu-0.028
es-0.028
amb-0.027
Token cold
Token ID4692
Feature activation-1.32e-01

Pos logit contributions
 same+0.139
 task+0.127
 garage+0.120
 basement+0.118
 way+0.118
Neg logit contributions
unch-0.201
rol-0.200
uts-0.199
oot-0.192
ann-0.188
Token outside
Token ID2354
Feature activation-8.17e-02

Pos logit contributions
unch+0.240
amb+0.237
oot+0.220
oped+0.218
aring+0.216
Neg logit contributions
 same-0.278
 way-0.256
 task-0.240
 boring-0.238
 attic-0.234
Token stay
Token ID2652
Feature activation-2.45e-02

Pos logit contributions
 same+0.199
 way+0.174
 task+0.163
 boring+0.161
 attic+0.159
Neg logit contributions
unch-0.356
amb-0.349
oot-0.337
qu-0.330
aring-0.329
Token away
Token ID1497
Feature activation+1.03e-02

Pos logit contributions
amb+0.053
unch+0.053
oot+0.050
oing+0.049
oped+0.049
Neg logit contributions
 same-0.041
 way-0.037
 task-0.035
 boring-0.034
 attic-0.033
Token from
Token ID422
Feature activation+9.71e-02

Pos logit contributions
 same+0.023
 way+0.021
 task+0.020
 boring+0.019
 attic+0.019
Neg logit contributions
unch-0.017
amb-0.017
oot-0.016
aring-0.016
uts-0.016
Token the
Token ID262
Feature activation+1.62e-01

Pos logit contributions
 same+0.114
 drawer+0.105
 task+0.105
 way+0.098
 laundry+0.096
Neg logit contributions
unch-0.220
amb-0.216
oot-0.215
owing-0.207
oped-0.207
Token tower
Token ID10580
Feature activation-1.95e-01

Pos logit contributions
 proper+0.191
 homework+0.182
 same+0.182
 task+0.170
 laundry+0.168
Neg logit contributions
unch-0.344
oing-0.322
aring-0.317
uts-0.312
qu-0.307
Token in
Token ID287
Feature activation+9.95e-02

Pos logit contributions
unch+0.321
amb+0.316
oot+0.295
oped+0.290
aring+0.288
Neg logit contributions
 same-0.437
 way-0.401
 task-0.382
 boring-0.377
 laundry-0.373
Token case
Token ID1339
Feature activation-1.93e-01

Pos logit contributions
 same+0.102
 way+0.080
 task+0.079
 boring+0.076
 proper+0.072
Neg logit contributions
unch-0.268
amb-0.262
oped-0.254
oot-0.253
uts-0.251
Token they
Token ID484
Feature activation+3.17e-02

Pos logit contributions
unch+0.291
amb+0.282
oot+0.278
uts+0.271
oped+0.264
Neg logit contributions
 same-0.441
 way-0.408
 task-0.404
 laundry-0.391
 proper-0.387
Token got
Token ID1392
Feature activation+1.95e-02

Pos logit contributions
 same+0.050
 way+0.044
 task+0.041
 laundry+0.040
 attic+0.040
Neg logit contributions
unch-0.072
amb-0.072
oped-0.070
oot-0.069
aring-0.067
Token hurt
Token ID5938
Feature activation-8.63e-02

Pos logit contributions
 same+0.032
 way+0.029
 task+0.028
 attic+0.028
 garage+0.027
Neg logit contributions
unch-0.042
amb-0.041
oot-0.039
oped-0.039
uts-0.039
Token.
Token ID13
Feature activation-7.67e-02

Pos logit contributions
unch+0.155
amb+0.152
oped+0.140
oot+0.140
oing+0.137
Neg logit contributions
 same-0.187
 way-0.173
 boring-0.162
 task-0.159
 attic-0.157
Token mom
Token ID1995
Feature activation-1.43e-01

Pos logit contributions
 same+0.155
 way+0.127
 task+0.126
 ger+0.124
 boring+0.123
Neg logit contributions
unch-0.259
amb-0.256
uts-0.245
oped-0.241
oot-0.240
Tokenmy
Token ID1820
Feature activation-9.18e-02

Pos logit contributions
unch+0.286
amb+0.282
oot+0.267
uts+0.261
qu+0.259
Neg logit contributions
 same-0.282
 way-0.265
 garage-0.234
 boring-0.234
 laundry-0.231
Token.
Token ID13
Feature activation-8.87e-02

Pos logit contributions
unch+0.173
amb+0.169
oot+0.159
oped+0.157
aring+0.154
Neg logit contributions
 same-0.188
 way-0.177
 task-0.160
 boring-0.159
 laundry-0.158
Token They
Token ID1119
Feature activation-3.78e-02

Pos logit contributions
unch+0.192
amb+0.190
oot+0.184
aring+0.179
oped+0.179
Neg logit contributions
 same-0.152
 way-0.137
 attic-0.125
 boring-0.124
 garage-0.124
Token saw
Token ID2497
Feature activation+3.06e-03

Pos logit contributions
amb+0.085
unch+0.084
oped+0.080
oot+0.079
aring+0.077
Neg logit contributions
 same-0.062
 way-0.055
 task-0.052
 boring-0.050
 attic-0.050
Token a
Token ID257
Feature activation+1.15e-01

Pos logit contributions
 same+0.004
 task+0.004
 way+0.004
 drawer+0.003
 shelf+0.003
Neg logit contributions
unch-0.007
amb-0.007
oot-0.007
uts-0.007
oped-0.007
Token group
Token ID1448
Feature activation-1.68e-01

Pos logit contributions
 same+0.121
 way+0.097
 task+0.088
 proper+0.084
 homework+0.082
Neg logit contributions
unch-0.315
oing-0.298
amb-0.294
oped-0.292
qu-0.288
Token of
Token ID286
Feature activation+9.29e-02

Pos logit contributions
amb+0.271
unch+0.256
oped+0.237
oot+0.232
oing+0.231
Neg logit contributions
 same-0.390
 way-0.379
 boring-0.345
 table-0.341
 attic-0.341
Token kids
Token ID3988
Feature activation-1.76e-01

Pos logit contributions
 same+0.145
 way+0.134
 task+0.129
 attic+0.118
 drawer+0.118
Neg logit contributions
unch-0.203
amb-0.198
oot-0.188
qu-0.186
aring-0.186
Token playing
Token ID2712
Feature activation-1.25e-01

Pos logit contributions
amb+0.459
unch+0.447
umbs+0.434
oot+0.428
oing+0.422
Neg logit contributions
 same-0.226
 way-0.216
 boring-0.181
 garage-0.180
 room-0.177
Token on
Token ID319
Feature activation+3.57e-02

Pos logit contributions
unch+0.238
amb+0.235
oot+0.220
oped+0.217
aring+0.215
Neg logit contributions
 same-0.250
 way-0.230
 task-0.214
 boring-0.212
 attic-0.209
Token was
Token ID373
Feature activation+1.54e-01

Pos logit contributions
 same+0.142
 table+0.130
 laundry+0.129
 task+0.128
 attic+0.128
Neg logit contributions
aring-0.140
amb-0.140
unch-0.135
es-0.133
uts-0.132
Token always
Token ID1464
Feature activation+9.65e-02

Pos logit contributions
 table+0.212
 same+0.204
 desk+0.199
 stove+0.193
 floor+0.192
Neg logit contributions
aring-0.310
unch-0.309
oot-0.309
amb-0.295
pr-0.295
Token worth
Token ID2861
Feature activation+4.28e-02

Pos logit contributions
 table+0.134
 same+0.131
 floor+0.126
 desk+0.126
 stove+0.126
Neg logit contributions
aring-0.195
oot-0.184
pr-0.183
amb-0.180
owing-0.179
Token exploring
Token ID13504
Feature activation-3.02e-02

Pos logit contributions
 same+0.060
 way+0.053
 boring+0.047
 bad+0.046
 task+0.045
Neg logit contributions
unch-0.110
amb-0.110
oot-0.103
oped-0.103
oing-0.101
Token things
Token ID1243
Feature activation-7.95e-02

Pos logit contributions
amb+0.057
oot+0.055
unch+0.055
aring+0.054
uts+0.053
Neg logit contributions
 same-0.054
 way-0.047
 task-0.047
 table-0.044
 homework-0.044
Token that
Token ID326
Feature activation+1.15e-01

Pos logit contributions
unch+0.146
amb+0.145
oot+0.138
aring+0.135
uts+0.135
Neg logit contributions
 same-0.160
 way-0.146
 task-0.138
 boring-0.135
 attic-0.135
Token were
Token ID547
Feature activation+1.07e-01

Pos logit contributions
 same+0.194
 attic+0.182
 basement+0.176
 table+0.175
 laundry+0.175
Neg logit contributions
unch-0.208
uts-0.204
amb-0.204
oot-0.203
aring-0.201
Token a
Token ID257
Feature activation+1.75e-01

Pos logit contributions
 same+0.143
 task+0.125
 table+0.122
 garage+0.122
 attic+0.121
Neg logit contributions
oot-0.239
unch-0.237
uts-0.235
aring-0.234
amb-0.227
Token bit
Token ID1643
Feature activation-1.41e-02

Pos logit contributions
 same+0.193
 homework+0.154
 laundry+0.150
 attic+0.144
 way+0.137
Neg logit contributions
unch-0.426
aring-0.421
amb-0.418
oing-0.409
ub-0.406
Token scary
Token ID14343
Feature activation-1.21e-02

Pos logit contributions
unch+0.037
aring+0.037
amb+0.036
oot+0.036
uts+0.035
Neg logit contributions
 same-0.015
 way-0.013
 attic-0.012
 laundry-0.012
 garage-0.012
Token.
Token ID13
Feature activation-4.55e-02

Pos logit contributions
amb+0.021
unch+0.020
aring+0.019
oot+0.019
oped+0.019
Neg logit contributions
 same-0.025
 way-0.023
 task-0.022
 boring-0.022
 laundry-0.021
Token,
Token ID11
Feature activation+1.95e-02

Pos logit contributions
amb+0.066
unch+0.065
oot+0.062
oing+0.061
aring+0.061
Neg logit contributions
 same-0.044
 way-0.039
 homework-0.037
 boring-0.037
 usual-0.036
Token round
Token ID2835
Feature activation-9.42e-02

Pos logit contributions
 same+0.025
 way+0.023
 task+0.020
 homework+0.019
 usual+0.019
Neg logit contributions
unch-0.047
amb-0.046
uts-0.045
qu-0.045
oot-0.045
Token thing
Token ID1517
Feature activation-1.18e-01

Pos logit contributions
amb+0.220
unch+0.218
oot+0.207
oped+0.202
oing+0.202
Neg logit contributions
 same-0.141
 way-0.126
 boring-0.116
 task-0.115
 homework-0.112
Token that
Token ID326
Feature activation+9.35e-02

Pos logit contributions
amb+0.223
unch+0.217
uts+0.214
aring+0.210
oot+0.209
Neg logit contributions
 same-0.224
 way-0.205
 task-0.196
 garage-0.190
 attic-0.188
Token grows
Token ID13676
Feature activation-7.37e-02

Pos logit contributions
 same+0.135
 homework+0.125
 task+0.124
 garage+0.122
 way+0.121
Neg logit contributions
qu-0.185
amb-0.185
uts-0.184
es-0.181
aring-0.180
Token into
Token ID656
Feature activation+1.03e-01

Pos logit contributions
unch+0.156
amb+0.155
oot+0.146
oped+0.144
uts+0.143
Neg logit contributions
 same-0.129
 way-0.118
 task-0.109
 boring-0.107
 attic-0.106
Token a
Token ID257
Feature activation+1.00e-01

Pos logit contributions
 same+0.119
 way+0.105
 task+0.097
 homework+0.092
 garage+0.089
Neg logit contributions
unch-0.259
qu-0.249
amb-0.248
uts-0.246
oot-0.242
Token flower
Token ID15061
Feature activation-1.70e-01

Pos logit contributions
 same+0.125
 way+0.106
 homework+0.098
 task+0.096
 boring+0.095
Neg logit contributions
unch-0.247
amb-0.243
oing-0.235
qu-0.231
oped-0.231
Token."
Token ID526
Feature activation-1.22e-01

Pos logit contributions
unch+0.309
amb+0.305
oot+0.284
oped+0.284
aring+0.275
Neg logit contributions
 same-0.359
 way-0.333
 task-0.309
 boring-0.307
 attic-0.302
Token[space]
Token ID220
Feature activation-1.56e-01

Pos logit contributions
unch+0.258
amb+0.256
oot+0.247
oped+0.241
aring+0.241
Neg logit contributions
 same-0.221
 way-0.204
 table-0.177
 boring-0.176
 attic-0.176
Token[newline]
Token ID198
Feature activation-8.18e-02

Pos logit contributions
amb+0.336
unch+0.319
oped+0.318
oot+0.313
oing+0.313
Neg logit contributions
 same-0.274
 way-0.243
 new-0.220
 attic-0.217
 old-0.215
Token!"
Token ID2474
Feature activation-9.78e-02

Pos logit contributions
unch+0.270
amb+0.268
oped+0.253
oot+0.248
oing+0.247
Neg logit contributions
 same-0.330
 way-0.307
 task-0.285
 boring-0.284
 table-0.277
Token[newline]
Token ID198
Feature activation-6.53e-02

Pos logit contributions
unch+0.186
amb+0.185
oot+0.183
aring+0.181
oped+0.179
Neg logit contributions
 same-0.180
 way-0.169
 attic-0.153
 garage-0.152
 laundry-0.151
Token[newline]
Token ID198
Feature activation-5.98e-02

Pos logit contributions
amb+0.140
aring+0.135
oped+0.134
rol+0.134
ub+0.133
Neg logit contributions
 same-0.101
 way-0.093
 new-0.088
 garage-0.088
 attic-0.087
TokenBut
Token ID1537
Feature activation-3.54e-03

Pos logit contributions
amb+0.109
unch+0.107
oot+0.104
oped+0.104
ub+0.101
Neg logit contributions
 same-0.114
 way-0.104
 task-0.100
 attic-0.100
 garage-0.100
Token John
Token ID1757
Feature activation+2.65e-02

Pos logit contributions
amb+0.007
unch+0.007
oot+0.006
oped+0.006
uts+0.006
Neg logit contributions
 same-0.007
 way-0.006
 task-0.006
 boring-0.006
 attic-0.006
Token was
Token ID373
Feature activation+1.55e-01

Pos logit contributions
 same+0.043
 way+0.038
 task+0.036
 attic+0.036
 laundry+0.035
Neg logit contributions
unch-0.059
amb-0.059
oped-0.056
oot-0.056
aring-0.055
Token not
Token ID407
Feature activation+1.30e-01

Pos logit contributions
 same+0.244
 way+0.221
 task+0.213
 attic+0.208
 garage+0.204
Neg logit contributions
unch-0.353
amb-0.332
aring-0.329
oot-0.323
uts-0.322
Token so
Token ID523
Feature activation+1.46e-01

Pos logit contributions
 same+0.222
 way+0.201
 task+0.187
 attic+0.182
 boring+0.181
Neg logit contributions
unch-0.283
amb-0.279
oot-0.265
oped-0.261
aring-0.260
Token happy
Token ID3772
Feature activation-1.68e-02

Pos logit contributions
 same+0.208
 way+0.189
 attic+0.175
 garage+0.173
 task+0.173
Neg logit contributions
unch-0.358
amb-0.342
aring-0.332
uts-0.331
oot-0.330
Token,
Token ID11
Feature activation+5.37e-02

Pos logit contributions
unch+0.029
amb+0.029
oot+0.027
oped+0.026
aring+0.026
Neg logit contributions
 same-0.036
 way-0.034
 task-0.032
 boring-0.031
 attic-0.031
Token because
Token ID780
Feature activation+6.85e-02

Pos logit contributions
 same+0.098
 task+0.086
 attic+0.082
 basement+0.082
 way+0.082
Neg logit contributions
unch-0.109
amb-0.106
oot-0.102
uts-0.101
aring-0.099
Token.
Token ID13
Feature activation-6.52e-02

Pos logit contributions
unch+0.270
amb+0.252
uts+0.249
oped+0.242
aring+0.242
Neg logit contributions
 same-0.546
 task-0.492
 way-0.491
 boring-0.480
 laundry-0.479
Token She
Token ID1375
Feature activation+2.67e-02

Pos logit contributions
unch+0.126
amb+0.121
uts+0.117
aring+0.115
oot+0.114
Neg logit contributions
 same-0.129
 way-0.120
 task-0.112
 boring-0.107
 attic-0.107
Token picked
Token ID6497
Feature activation-5.17e-02

Pos logit contributions
 same+0.053
 way+0.047
 task+0.046
 attic+0.044
 laundry+0.044
Neg logit contributions
unch-0.051
amb-0.051
oped-0.049
oot-0.048
aring-0.047
Token it
Token ID340
Feature activation+2.85e-02

Pos logit contributions
amb+0.107
unch+0.107
oot+0.101
oped+0.099
uts+0.096
Neg logit contributions
 same-0.088
 way-0.083
 task-0.078
 attic-0.078
 boring-0.077
Token up
Token ID510
Feature activation+2.50e-02

Pos logit contributions
 same+0.053
 attic+0.048
 way+0.048
 boring+0.048
 laundry+0.047
Neg logit contributions
amb-0.053
oot-0.050
unch-0.050
uts-0.047
oped-0.046
Token and
Token ID290
Feature activation+7.69e-02

Pos logit contributions
 same+0.058
 way+0.053
 task+0.051
 attic+0.051
 boring+0.050
Neg logit contributions
amb-0.040
unch-0.039
oot-0.037
uts-0.035
oped-0.035
Token showed
Token ID3751
Feature activation-5.10e-02

Pos logit contributions
 same+0.149
 task+0.131
 way+0.129
 boring+0.123
 attic+0.123
Neg logit contributions
amb-0.144
unch-0.144
oped-0.142
oot-0.137
es-0.133
Token it
Token ID340
Feature activation+4.47e-02

Pos logit contributions
unch+0.112
amb+0.111
oot+0.105
oped+0.104
aring+0.103
Neg logit contributions
 same-0.087
 way-0.079
 task-0.073
 boring-0.072
 attic-0.071
Token to
Token ID284
Feature activation+1.72e-01

Pos logit contributions
 same+0.074
 way+0.067
 attic+0.066
 garage+0.064
 task+0.064
Neg logit contributions
uts-0.092
amb-0.091
oot-0.091
aring-0.090
unch-0.090
Token him
Token ID683
Feature activation+7.96e-02

Pos logit contributions
 same+0.343
 way+0.315
 task+0.313
 boring+0.299
 attic+0.296
Neg logit contributions
unch-0.312
amb-0.294
oot-0.288
oped-0.267
qu-0.261
Token.
Token ID13
Feature activation-5.98e-02

Pos logit contributions
 same+0.183
 way+0.170
 task+0.160
 boring+0.158
 attic+0.156
Neg logit contributions
unch-0.129
amb-0.128
oot-0.118
oped-0.116
aring-0.115
Token with
Token ID351
Feature activation+7.31e-03

Pos logit contributions
unch+0.244
amb+0.242
oot+0.227
oped+0.225
aring+0.223
Neg logit contributions
 same-0.216
 way-0.196
 task-0.182
 boring-0.180
 attic-0.176
Token your
Token ID534
Feature activation+1.18e-01

Pos logit contributions
 same+0.009
 way+0.008
 task+0.008
 garage+0.007
 boring+0.007
Neg logit contributions
amb-0.018
unch-0.018
oot-0.017
uts-0.017
qu-0.017
Token sister
Token ID6621
Feature activation-1.92e-01

Pos logit contributions
 same+0.170
 way+0.142
 basement+0.140
 boring+0.135
 attic+0.131
Neg logit contributions
unch-0.249
amb-0.245
uts-0.236
aring-0.227
oot-0.225
Token.
Token ID13
Feature activation-8.48e-02

Pos logit contributions
unch+0.339
amb+0.333
oot+0.310
oped+0.307
oing+0.300
Neg logit contributions
 same-0.414
 way-0.384
 boring-0.358
 task-0.357
 attic-0.350
Token She
Token ID1375
Feature activation-4.38e-03

Pos logit contributions
unch+0.199
oot+0.190
amb+0.190
oped+0.190
oing+0.186
Neg logit contributions
 same-0.132
 way-0.114
 attic-0.110
 garage-0.107
 boring-0.107
Token has
Token ID468
Feature activation+5.70e-02

Pos logit contributions
amb+0.010
unch+0.010
aring+0.010
oped+0.010
uts+0.010
Neg logit contributions
 same-0.006
 way-0.006
 task-0.005
 attic-0.005
 laundry-0.005
Token no
Token ID645
Feature activation+6.00e-02

Pos logit contributions
 task+0.086
 basement+0.083
 garage+0.082
 port+0.079
 same+0.078
Neg logit contributions
uts-0.101
es-0.095
ents-0.093
unch-0.092
ann-0.092
Token candy
Token ID18550
Feature activation-1.91e-01

Pos logit contributions
 same+0.080
 basement+0.066
 way+0.065
 task+0.064
 attic+0.064
Neg logit contributions
unch-0.134
amb-0.130
oot-0.123
uts-0.122
aring-0.120
Token and
Token ID290
Feature activation+7.01e-02

Pos logit contributions
unch+0.331
amb+0.328
uts+0.315
oot+0.308
aring+0.303
Neg logit contributions
 same-0.400
 way-0.370
 task-0.349
 attic-0.341
 laundry-0.339
Token she
Token ID673
Feature activation+5.47e-02

Pos logit contributions
 same+0.098
 way+0.095
 basement+0.094
 garage+0.092
 task+0.092
Neg logit contributions
uts-0.141
es-0.133
qu-0.128
aring-0.128
unch-0.126
Token is
Token ID318
Feature activation+1.21e-01

Pos logit contributions
 same+0.079
 way+0.075
 laundry+0.069
 attic+0.069
 basement+0.069
Neg logit contributions
amb-0.124
unch-0.122
aring-0.121
uts-0.119
oped-0.118
Token to
Token ID284
Feature activation+1.89e-01

Pos logit contributions
 same+0.161
 task+0.140
 way+0.139
 floor+0.137
 proper+0.136
Neg logit contributions
unch-0.137
oot-0.130
amb-0.130
uts-0.129
qu-0.120
Token start
Token ID923
Feature activation-3.35e-03

Pos logit contributions
 same+0.277
 way+0.235
 task+0.222
 boring+0.221
 attic+0.216
Neg logit contributions
amb-0.441
unch-0.441
oot-0.420
oped-0.408
qu-0.404
Token playing
Token ID2712
Feature activation-9.27e-02

Pos logit contributions
unch+0.007
amb+0.007
oot+0.007
oped+0.007
uts+0.007
Neg logit contributions
 same-0.005
 way-0.005
 task-0.005
 boring-0.004
 proper-0.004
Token but
Token ID475
Feature activation+3.87e-02

Pos logit contributions
unch+0.154
amb+0.151
oot+0.144
oped+0.141
uts+0.138
Neg logit contributions
 same-0.191
 way-0.174
 task-0.165
 boring-0.163
 proper-0.162
Token then
Token ID788
Feature activation+3.29e-02

Pos logit contributions
 same+0.065
 task+0.058
 way+0.056
 garage+0.055
 stove+0.054
Neg logit contributions
unch-0.076
amb-0.074
oot-0.074
oped-0.074
aring-0.072
Token she
Token ID673
Feature activation+9.05e-02

Pos logit contributions
 same+0.058
 way+0.050
 task+0.050
 proper+0.049
 garage+0.048
Neg logit contributions
unch-0.062
es-0.062
oped-0.062
oot-0.061
amb-0.060
Token heard
Token ID2982
Feature activation-1.31e-02

Pos logit contributions
 same+0.154
 way+0.136
 task+0.133
 attic+0.130
 laundry+0.127
Neg logit contributions
unch-0.190
oped-0.187
amb-0.187
oot-0.178
uts-0.176
Token her
Token ID607
Feature activation+1.62e-01

Pos logit contributions
unch+0.038
amb+0.037
oot+0.036
oped+0.036
aring+0.035
Neg logit contributions
 same-0.013
 way-0.011
 task-0.009
 boring-0.009
 attic-0.009
Token mom
Token ID1995
Feature activation-9.15e-02

Pos logit contributions
 same+0.179
 way+0.147
 task+0.132
 boring+0.129
 attic+0.127
Neg logit contributions
amb-0.446
unch-0.444
oot-0.425
uts-0.422
aring-0.418
Token's
Token ID338
Feature activation+1.38e-01

Pos logit contributions
amb+0.225
unch+0.220
uts+0.209
oot+0.207
umbs+0.198
Neg logit contributions
 same-0.136
 way-0.132
 garage-0.112
 room-0.111
 table-0.110
Token voice
Token ID3809
Feature activation-1.72e-01

Pos logit contributions
 same+0.216
 way+0.192
 task+0.174
 boring+0.171
 laundry+0.166
Neg logit contributions
unch-0.334
amb-0.329
oot-0.311
oped-0.311
uts-0.305
Token they
Token ID484
Feature activation+9.15e-02

Pos logit contributions
 same+0.262
 task+0.235
 way+0.234
 attic+0.226
 garage+0.220
Neg logit contributions
amb-0.222
unch-0.219
oot-0.212
oped-0.207
aring-0.204
Token brought
Token ID3181
Feature activation-2.11e-02

Pos logit contributions
 same+0.134
 way+0.119
 task+0.107
 boring+0.107
 attic+0.103
Neg logit contributions
unch-0.226
amb-0.221
oot-0.212
uts-0.210
aring-0.210
Token the
Token ID262
Feature activation+1.96e-01

Pos logit contributions
amb+0.054
unch+0.053
oot+0.050
oped+0.050
uts+0.049
Neg logit contributions
 same-0.025
 way-0.022
 task-0.021
 boring-0.021
 garage-0.020
Token movie
Token ID3807
Feature activation-1.68e-01

Pos logit contributions
 same+0.228
 task+0.205
 attic+0.203
 way+0.201
 garage+0.200
Neg logit contributions
amb-0.458
unch-0.456
oing-0.438
qu-0.435
aring-0.429
Token home
Token ID1363
Feature activation-6.69e-02

Pos logit contributions
unch+0.345
amb+0.342
oot+0.321
oped+0.317
uts+0.315
Neg logit contributions
 same-0.312
 way-0.284
 task-0.264
 boring-0.261
 attic-0.256
Token.
Token ID13
Feature activation-4.98e-02

Pos logit contributions
unch+0.121
amb+0.120
oped+0.110
oot+0.110
aring+0.108
Neg logit contributions
 same-0.142
 way-0.132
 task-0.122
 boring-0.122
 attic-0.119
Token Lily
Token ID20037
Feature activation+1.56e-02

Pos logit contributions
unch+0.105
amb+0.104
oot+0.101
oped+0.098
aring+0.097
Neg logit contributions
 same-0.092
 way-0.084
 boring-0.075
 task-0.074
 attic-0.074
Token was
Token ID373
Feature activation+1.57e-01

Pos logit contributions
 same+0.029
 task+0.026
 way+0.025
 attic+0.024
 basement+0.024
Neg logit contributions
amb-0.032
unch-0.030
oped-0.030
qu-0.028
uts-0.028
Token so
Token ID523
Feature activation+1.55e-01

Pos logit contributions
 same+0.239
 way+0.213
 task+0.198
 attic+0.190
 boring+0.189
Neg logit contributions
unch-0.375
amb-0.370
oot-0.350
oped-0.346
aring-0.345
Token happy
Token ID3772
Feature activation-6.98e-03

Pos logit contributions
 same+0.203
 way+0.176
 boring+0.160
 task+0.155
 bad+0.150
Neg logit contributions
amb-0.403
unch-0.400
oot-0.387
oped-0.383
aring-0.377
Token and
Token ID290
Feature activation+1.46e-01

Pos logit contributions
unch+0.016
amb+0.016
oped+0.015
oot+0.015
oing+0.015
Neg logit contributions
 same-0.011
 way-0.010
 boring-0.009
 task-0.009
 laundry-0.009
Token to
Token ID284
Feature activation+1.99e-01

Pos logit contributions
uts+0.036
unch+0.036
es+0.035
qu+0.035
oot+0.034
Neg logit contributions
 same-0.038
 task-0.037
 attic-0.036
 garage-0.036
 basement-0.035
Token open
Token ID1280
Feature activation-6.22e-02

Pos logit contributions
 same+0.237
 way+0.204
 task+0.180
 boring+0.175
 attic+0.170
Neg logit contributions
unch-0.544
amb-0.539
oot-0.515
oped-0.512
aring-0.508
Token the
Token ID262
Feature activation+1.64e-01

Pos logit contributions
amb+0.156
unch+0.155
oot+0.150
oped+0.148
uts+0.146
Neg logit contributions
 same-0.076
 way-0.068
 task-0.064
 boring-0.059
 attic-0.058
Token package
Token ID5301
Feature activation-2.07e-01

Pos logit contributions
 ending+0.196
 way+0.170
 same+0.165
 rainy+0.158
 boring+0.158
Neg logit contributions
aff-0.318
ann-0.315
ents-0.313
ew-0.313
quet-0.304
Token right
Token ID826
Feature activation+4.93e-03

Pos logit contributions
unch+0.370
amb+0.368
oot+0.350
uts+0.339
oped+0.338
Neg logit contributions
 same-0.421
 way-0.383
 task-0.368
 boring-0.364
 laundry-0.356
Token away
Token ID1497
Feature activation+2.40e-03

Pos logit contributions
 same+0.008
 way+0.008
 task+0.007
 boring+0.007
 laundry+0.007
Neg logit contributions
amb-0.011
unch-0.010
oot-0.010
oped-0.010
aring-0.010
Token,
Token ID11
Feature activation+1.82e-02

Pos logit contributions
 same+0.005
 way+0.005
 task+0.005
 boring+0.005
 attic+0.004
Neg logit contributions
unch-0.004
amb-0.004
oot-0.004
oped-0.004
aring-0.004
Token so
Token ID523
Feature activation+1.16e-01

Pos logit contributions
 same+0.036
 task+0.033
 way+0.033
 attic+0.032
 basement+0.032
Neg logit contributions
unch-0.031
amb-0.031
aring-0.029
uts-0.029
oot-0.028
Token she
Token ID673
Feature activation+9.27e-02

Pos logit contributions
 same+0.176
 way+0.153
 task+0.151
 garage+0.147
 laundry+0.145
Neg logit contributions
unch-0.253
amb-0.251
uts-0.247
oot-0.244
es-0.240
Token took
Token ID1718
Feature activation-1.72e-02

Pos logit contributions
 same+0.175
 way+0.155
 task+0.147
 boring+0.144
 attic+0.143
Neg logit contributions
amb-0.193
unch-0.178
oped-0.177
oot-0.175
qu-0.170
Token the
Token ID262
Feature activation+1.66e-01

Pos logit contributions
amb+0.040
unch+0.040
oot+0.039
oped+0.039
qu+0.038
Neg logit contributions
 same-0.025
 way-0.022
 task-0.021
 boring-0.021
 attic-0.020
Token and
Token ID290
Feature activation+9.28e-02

Pos logit contributions
unch+0.061
amb+0.060
oot+0.057
oped+0.056
oing+0.055
Neg logit contributions
 same-0.051
 way-0.047
 boring-0.043
 task-0.042
 attic-0.041
Token Ben
Token ID3932
Feature activation-4.16e-02

Pos logit contributions
 way+0.136
 same+0.135
 proper+0.131
 task+0.128
 attic+0.120
Neg logit contributions
unch-0.192
amb-0.190
oot-0.183
es-0.179
uts-0.178
Token hear
Token ID3285
Feature activation-1.68e-02

Pos logit contributions
unch+0.107
amb+0.104
oot+0.101
uts+0.099
oped+0.098
Neg logit contributions
 same-0.060
 way-0.054
 bad-0.046
 boring-0.046
 task-0.045
Token the
Token ID262
Feature activation+1.70e-01

Pos logit contributions
unch+0.054
amb+0.054
oped+0.051
oot+0.051
aring+0.051
Neg logit contributions
 same-0.013
 way-0.010
 boring-0.008
 bad-0.008
 task-0.007
Token dog
Token ID3290
Feature activation-1.71e-01

Pos logit contributions
 same+0.203
 way+0.178
 task+0.157
 boring+0.155
 attic+0.152
Neg logit contributions
unch-0.457
amb-0.449
oot-0.430
oing-0.426
aring-0.426
Token.
Token ID13
Feature activation-7.15e-02

Pos logit contributions
amb+0.407
unch+0.390
oped+0.376
oot+0.373
uts+0.373
Neg logit contributions
 same-0.270
 way-0.263
 bad-0.235
 sink-0.219
 new-0.218
Token They
Token ID1119
Feature activation-1.61e-02

Pos logit contributions
amb+0.141
unch+0.141
oing+0.131
oped+0.130
uts+0.130
Neg logit contributions
 same-0.135
 way-0.125
 task-0.118
 attic-0.114
 boring-0.113
Token see
Token ID766
Feature activation+2.74e-02

Pos logit contributions
unch+0.040
amb+0.040
oot+0.038
oped+0.037
uts+0.037
Neg logit contributions
 same-0.023
 way-0.021
 task-0.019
 boring-0.018
 attic-0.018
Token the
Token ID262
Feature activation+1.78e-01

Pos logit contributions
 same+0.026
 way+0.022
 task+0.020
 boring+0.018
 attic+0.018
Neg logit contributions
unch-0.080
amb-0.079
oot-0.076
oped-0.075
aring-0.075
Token dog
Token ID3290
Feature activation-1.62e-01

Pos logit contributions
 same+0.237
 way+0.214
 proper+0.196
 task+0.196
 boring+0.188
Neg logit contributions
unch-0.432
amb-0.411
oing-0.399
uts-0.399
oot-0.397
Token.
Token ID13
Feature activation-7.77e-02

Pos logit contributions
amb+0.376
unch+0.356
oot+0.347
uts+0.334
ann+0.325
Neg logit contributions
 same-0.280
 way-0.264
 bad-0.241
 boring-0.230
 new-0.227
Token a
Token ID257
Feature activation+1.13e-01

Pos logit contributions
unch+0.125
amb+0.123
oot+0.121
uts+0.118
oped+0.118
Neg logit contributions
 same-0.053
 way-0.046
 task-0.043
 boring-0.039
 attic-0.039
Token loud
Token ID7812
Feature activation-8.83e-02

Pos logit contributions
 same+0.152
 way+0.133
 attic+0.121
 task+0.117
 laundry+0.117
Neg logit contributions
unch-0.279
amb-0.267
oing-0.266
aring-0.262
qu-0.260
Token noise
Token ID7838
Feature activation-1.91e-01

Pos logit contributions
unch+0.228
amb+0.226
oot+0.214
oped+0.214
uts+0.211
Neg logit contributions
 same-0.118
 way-0.103
 task-0.093
 boring-0.091
 attic-0.087
Token being
Token ID852
Feature activation+1.32e-02

Pos logit contributions
amb+0.421
unch+0.417
oot+0.378
oped+0.377
aring+0.375
Neg logit contributions
 same-0.330
 way-0.306
 bad-0.283
 boring-0.283
 task-0.274
Token made
Token ID925
Feature activation+6.12e-03

Pos logit contributions
 same+0.024
 task+0.021
 way+0.021
 attic+0.021
 garage+0.020
Neg logit contributions
unch-0.026
oped-0.025
amb-0.025
oing-0.025
uts-0.025
Token.
Token ID13
Feature activation-8.39e-02

Pos logit contributions
 same+0.009
 way+0.007
 attic+0.007
 task+0.007
 boring+0.007
Neg logit contributions
unch-0.014
oot-0.014
amb-0.014
qu-0.014
uts-0.014
Token It
Token ID632
Feature activation-1.83e-02

Pos logit contributions
unch+0.170
amb+0.168
oot+0.158
oped+0.156
aring+0.155
Neg logit contributions
 same-0.158
 way-0.144
 task-0.133
 boring-0.132
 attic-0.130
Token was
Token ID373
Feature activation+1.03e-01

Pos logit contributions
unch+0.043
amb+0.043
oot+0.041
oped+0.041
aring+0.040
Neg logit contributions
 same-0.028
 way-0.025
 task-0.023
 boring-0.023
 attic-0.022
Token the
Token ID262
Feature activation+1.76e-01

Pos logit contributions
 same+0.147
 task+0.135
 way+0.129
 proper+0.124
 homework+0.122
Neg logit contributions
unch-0.220
rol-0.208
uts-0.207
oot-0.206
oped-0.205
Token door
Token ID3420
Feature activation-1.49e-01

Pos logit contributions
 proper+0.173
 same+0.171
 your+0.158
 way+0.154
 homework+0.154
Neg logit contributions
amb-0.394
unch-0.393
aring-0.367
oot-0.364
oing-0.364
Token to
Token ID284
Feature activation+1.46e-01

Pos logit contributions
unch+0.319
amb+0.313
oot+0.298
oped+0.297
aring+0.293
Neg logit contributions
 same-0.260
 way-0.235
 task-0.218
 boring-0.214
 attic-0.212
Token top
Token ID1353
Feature activation-7.70e-02

Pos logit contributions
 ger+0.266
 homework+0.264
 proper+0.259
 germ+0.257
 laundry+0.254
Neg logit contributions
unch-0.412
amb-0.380
ang-0.364
aring-0.359
oot-0.357
Token,
Token ID11
Feature activation+4.30e-02

Pos logit contributions
unch+0.111
amb+0.102
oot+0.100
oped+0.097
uts+0.095
Neg logit contributions
 same-0.187
 way-0.169
 task-0.165
 laundry-0.165
 proper-0.164
Token he
Token ID339
Feature activation+1.03e-01

Pos logit contributions
 same+0.081
 proper+0.074
 task+0.073
 way+0.072
 homework+0.071
Neg logit contributions
unch-0.076
amb-0.068
uts-0.066
oot-0.065
oped-0.065
Token saw
Token ID2497
Feature activation+6.37e-02

Pos logit contributions
 same+0.171
 task+0.156
 laundry+0.150
 attic+0.148
 way+0.144
Neg logit contributions
unch-0.218
oped-0.216
amb-0.213
uts-0.199
oot-0.198
Token a
Token ID257
Feature activation+1.76e-01

Pos logit contributions
 same+0.074
 way+0.066
 task+0.064
 laundry+0.056
 attic+0.055
Neg logit contributions
unch-0.169
amb-0.164
oot-0.159
uts-0.156
aring-0.154
Token man
Token ID582
Feature activation-9.35e-02

Pos logit contributions
 same+0.170
 way+0.129
 homework+0.119
 task+0.117
 laundry+0.116
Neg logit contributions
unch-0.499
oing-0.468
amb-0.466
oped-0.456
oot-0.453
Token with
Token ID351
Feature activation+8.52e-02

Pos logit contributions
amb+0.230
uts+0.216
umbs+0.214
unch+0.212
oot+0.209
Neg logit contributions
 same-0.133
 way-0.126
 room-0.115
 garage-0.115
 kitchen-0.110
Token a
Token ID257
Feature activation+1.66e-01

Pos logit contributions
 same+0.107
 way+0.102
 task+0.099
 attic+0.089
 shelf+0.088
Neg logit contributions
unch-0.206
oped-0.191
amb-0.191
uts-0.190
umbs-0.189
Token big
Token ID1263
Feature activation+6.41e-02

Pos logit contributions
 same+0.181
 way+0.145
 attic+0.144
 ger+0.141
 task+0.135
Neg logit contributions
unch-0.434
oing-0.416
oped-0.406
amb-0.404
ew-0.398
Token tray
Token ID26473
Feature activation-1.51e-01

Pos logit contributions
 same+0.104
 way+0.088
 usual+0.087
 proper+0.087
 attic+0.085
Neg logit contributions
unch-0.139
amb-0.133
oing-0.128
oped-0.127
oot-0.126
Token.
Token ID13
Feature activation-3.94e-02

Pos logit contributions
amb+0.325
unch+0.309
qu+0.309
oot+0.299
ub+0.298
Neg logit contributions
 same-0.253
 way-0.242
 table-0.224
 garage-0.224
 drawer-0.223
Token cave
Token ID11527
Feature activation-2.18e-01

Pos logit contributions
 same+0.237
 boring+0.198
 laundry+0.195
 task+0.190
 ger+0.189
Neg logit contributions
amb-0.308
unch-0.304
oot-0.285
uts-0.275
ang-0.272
Token,
Token ID11
Feature activation+1.86e-02

Pos logit contributions
unch+0.248
oot+0.241
uts+0.222
aring+0.220
owing+0.214
Neg logit contributions
 same-0.554
 proper-0.515
 laundry-0.503
 task-0.500
 ger-0.497
Token he
Token ID339
Feature activation+8.62e-02

Pos logit contributions
 same+0.035
 task+0.033
 way+0.033
 attic+0.033
 proper+0.032
Neg logit contributions
unch-0.034
amb-0.031
oped-0.030
uts-0.029
es-0.029
Token met
Token ID1138
Feature activation-2.21e-02

Pos logit contributions
 same+0.142
 task+0.127
 laundry+0.121
 way+0.120
 attic+0.117
Neg logit contributions
unch-0.183
oped-0.181
amb-0.180
oot-0.172
oing-0.171
Token his
Token ID465
Feature activation+1.79e-01

Pos logit contributions
unch+0.066
amb+0.066
oot+0.063
oped+0.063
uts+0.062
Neg logit contributions
 same-0.021
 way-0.017
 task-0.014
 boring-0.014
 attic-0.014
Token friend
Token ID1545
Feature activation-1.26e-01

Pos logit contributions
 same+0.299
 way+0.264
 task+0.249
 laundry+0.242
 boring+0.242
Neg logit contributions
unch-0.393
amb-0.388
oot-0.368
uts-0.358
aring-0.358
Token,
Token ID11
Feature activation+3.42e-02

Pos logit contributions
unch+0.309
amb+0.309
uts+0.289
oped+0.289
oot+0.289
Neg logit contributions
 same-0.185
 way-0.168
 boring-0.147
 bad-0.143
 task-0.142
Token a
Token ID257
Feature activation+1.62e-01

Pos logit contributions
 same+0.043
 way+0.038
 task+0.033
 boring+0.033
 attic+0.032
Neg logit contributions
unch-0.090
amb-0.090
oot-0.085
oped-0.085
aring-0.084
Token little
Token ID1310
Feature activation+2.35e-02

Pos logit contributions
 same+0.218
 way+0.186
 task+0.170
 attic+0.166
 laundry+0.163
Neg logit contributions
unch-0.408
amb-0.390
oing-0.382
oped-0.376
oot-0.376
Token rabbit
Token ID22746
Feature activation-1.31e-01

Pos logit contributions
 same+0.031
 way+0.028
 task+0.025
 boring+0.024
 attic+0.024
Neg logit contributions
unch-0.060
amb-0.060
oot-0.057
oped-0.056
aring-0.056
Token named
Token ID3706
Feature activation-9.76e-02

Pos logit contributions
uts+0.309
umbs+0.307
amb+0.303
unch+0.302
rol+0.289
Neg logit contributions
 way-0.178
 same-0.174
 garage-0.168
 drawer-0.158
 new-0.154
Token it
Token ID340
Feature activation-4.49e-02

Pos logit contributions
 same+0.159
 way+0.144
 task+0.135
 boring+0.132
 attic+0.131
Neg logit contributions
amb-0.183
unch-0.183
oot-0.171
oped-0.171
aring-0.167
Token;
Token ID26
Feature activation+2.22e-02

Pos logit contributions
unch+0.082
amb+0.080
uts+0.080
oot+0.077
oped+0.077
Neg logit contributions
 same-0.078
 attic-0.074
 basement-0.072
 garage-0.072
 laundry-0.071
Token it
Token ID340
Feature activation-1.14e-02

Pos logit contributions
 same+0.038
 way+0.034
 task+0.033
 attic+0.033
 basement+0.032
Neg logit contributions
unch-0.046
amb-0.044
uts-0.043
aring-0.043
oot-0.042
Token was
Token ID373
Feature activation+8.97e-02

Pos logit contributions
unch+0.022
oped+0.022
amb+0.022
qu+0.021
uts+0.021
Neg logit contributions
 same-0.019
 attic-0.018
 basement-0.017
 laundry-0.017
 way-0.017
Token so
Token ID523
Feature activation+9.24e-02

Pos logit contributions
 same+0.131
 way+0.117
 task+0.115
 attic+0.115
 garage+0.114
Neg logit contributions
unch-0.203
amb-0.192
oot-0.186
aring-0.184
uts-0.183
Token nice
Token ID3621
Feature activation-9.00e-02

Pos logit contributions
 same+0.119
 way+0.110
 attic+0.100
 garage+0.100
 task+0.098
Neg logit contributions
unch-0.235
amb-0.225
uts-0.217
aring-0.216
oped-0.214
Token!
Token ID0
Feature activation-5.24e-02

Pos logit contributions
unch+0.163
amb+0.161
oot+0.150
oped+0.148
aring+0.147
Neg logit contributions
 same-0.189
 way-0.173
 task-0.163
 boring-0.161
 attic-0.159
Token Lily
Token ID20037
Feature activation-3.41e-02

Pos logit contributions
unch+0.105
amb+0.103
oot+0.098
oped+0.097
oing+0.097
Neg logit contributions
 same-0.098
 way-0.090
 task-0.082
 boring-0.082
 attic-0.081
Token's
Token ID338
Feature activation+1.15e-01

Pos logit contributions
amb+0.070
unch+0.068
oped+0.065
oot+0.063
aring+0.063
Neg logit contributions
 same-0.062
 task-0.055
 way-0.054
 attic-0.053
 basement-0.053
Token mom
Token ID1995
Feature activation-1.11e-01

Pos logit contributions
 same+0.140
 way+0.116
 task+0.110
 basement+0.110
 boring+0.108
Neg logit contributions
unch-0.290
amb-0.282
oot-0.266
aring-0.263
uts-0.261
Token helped
Token ID4193
Feature activation-5.14e-02

Pos logit contributions
unch+0.245
amb+0.243
oot+0.229
oped+0.227
aring+0.225
Neg logit contributions
 same-0.189
 way-0.171
 task-0.157
 boring-0.155
 attic-0.152
Token and
Token ID290
Feature activation+6.31e-02

Pos logit contributions
unch+0.158
amb+0.156
oot+0.147
oped+0.146
aring+0.144
Neg logit contributions
 same-0.122
 way-0.110
 task-0.101
 boring-0.100
 attic-0.097
Token help
Token ID1037
Feature activation-5.42e-02

Pos logit contributions
 same+0.095
 way+0.082
 attic+0.080
 boring+0.080
 task+0.078
Neg logit contributions
amb-0.139
unch-0.137
aring-0.130
oot-0.130
uts-0.129
Token.
Token ID13
Feature activation-8.41e-02

Pos logit contributions
unch+0.112
amb+0.111
oot+0.107
uts+0.106
qu+0.106
Neg logit contributions
 same-0.090
 way-0.078
 task-0.075
 attic-0.075
 boring-0.075
Token The
Token ID383
Feature activation+6.82e-02

Pos logit contributions
unch+0.181
amb+0.177
oot+0.174
oped+0.169
oing+0.169
Neg logit contributions
 same-0.143
 way-0.129
 attic-0.119
 garage-0.118
 boring-0.116
Token fire
Token ID2046
Feature activation-1.99e-01

Pos logit contributions
 same+0.089
 way+0.079
 proper+0.076
 boring+0.072
 task+0.071
Neg logit contributions
unch-0.152
amb-0.149
oot-0.140
aring-0.140
oped-0.140
Token truck
Token ID7779
Feature activation-2.08e-01

Pos logit contributions
amb+0.413
unch+0.398
oped+0.395
aring+0.388
oing+0.385
Neg logit contributions
 same-0.334
 way-0.290
 task-0.286
 laundry-0.286
 attic-0.286
Token used
Token ID973
Feature activation-7.88e-02

Pos logit contributions
oped+0.416
amb+0.415
unch+0.407
aring+0.402
uts+0.400
Neg logit contributions
 same-0.350
 laundry-0.310
 task-0.308
 attic-0.305
 way-0.301
Token its
Token ID663
Feature activation+8.61e-02

Pos logit contributions
unch+0.181
amb+0.177
oot+0.176
uts+0.174
oped+0.173
Neg logit contributions
 same-0.111
 way-0.094
 task-0.094
 attic-0.092
 garage-0.090
Token long
Token ID890
Feature activation-7.51e-02

Pos logit contributions
 same+0.120
 attic+0.103
 way+0.101
 laundry+0.098
 basement+0.097
Neg logit contributions
unch-0.196
amb-0.195
aring-0.186
oing-0.185
uts-0.185
Token hose
Token ID31489
Feature activation-2.46e-01

Pos logit contributions
unch+0.174
amb+0.172
oped+0.164
oot+0.163
aring+0.162
Neg logit contributions
 same-0.114
 way-0.097
 attic-0.092
 boring-0.090
 task-0.090
Token to
Token ID284
Feature activation+1.30e-01

Pos logit contributions
amb+0.526
unch+0.526
oot+0.500
uts+0.499
oped+0.494
Neg logit contributions
 same-0.415
 way-0.370
 task-0.350
 attic-0.344
 boring-0.343
Token asked
Token ID1965
Feature activation-9.28e-02

Pos logit contributions
unch+0.311
amb+0.309
oot+0.291
uts+0.290
oing+0.285
Neg logit contributions
 same-0.153
 way-0.142
 boring-0.124
 drawer-0.124
 room-0.124
Token.
Token ID13
Feature activation-8.49e-02

Pos logit contributions
amb+0.189
unch+0.185
oped+0.169
oot+0.167
aring+0.167
Neg logit contributions
 same-0.175
 way-0.161
 bad-0.150
 laundry-0.146
 boring-0.145
Token She
Token ID1375
Feature activation-5.10e-03

Pos logit contributions
amb+0.196
unch+0.194
oot+0.183
oped+0.180
aring+0.179
Neg logit contributions
 same-0.139
 way-0.125
 boring-0.113
 attic-0.111
 task-0.110
Token heard
Token ID2982
Feature activation-4.35e-02

Pos logit contributions
unch+0.011
amb+0.011
oped+0.011
oot+0.011
qu+0.010
Neg logit contributions
 same-0.009
 way-0.008
 task-0.007
 boring-0.007
 attic-0.007
Token the
Token ID262
Feature activation+1.58e-01

Pos logit contributions
unch+0.111
amb+0.108
oot+0.107
uts+0.103
qu+0.102
Neg logit contributions
 same-0.057
 way-0.051
 task-0.047
 attic-0.045
 boring-0.043
Token noise
Token ID7838
Feature activation-2.03e-01

Pos logit contributions
 same+0.190
 proper+0.179
 way+0.176
 attic+0.165
 boring+0.158
Neg logit contributions
unch-0.388
oing-0.356
amb-0.352
qu-0.343
ew-0.343
Token and
Token ID290
Feature activation+6.07e-02

Pos logit contributions
amb+0.413
unch+0.403
oot+0.375
oped+0.372
aring+0.357
Neg logit contributions
 same-0.393
 way-0.359
 boring-0.342
 bad-0.334
 task-0.330
Token came
Token ID1625
Feature activation-3.12e-02

Pos logit contributions
 same+0.114
 task+0.108
 way+0.103
 attic+0.100
 homework+0.099
Neg logit contributions
oped-0.112
unch-0.109
es-0.106
oing-0.106
qu-0.106
Token to
Token ID284
Feature activation+1.53e-01

Pos logit contributions
unch+0.085
amb+0.083
oped+0.079
oot+0.078
uts+0.078
Neg logit contributions
 same-0.041
 way-0.037
 garage-0.029
 boring-0.029
 task-0.029
Token the
Token ID262
Feature activation+1.61e-01

Pos logit contributions
 same+0.215
 way+0.205
 task+0.199
 attic+0.196
 boring+0.191
Neg logit contributions
unch-0.320
amb-0.318
qu-0.318
oing-0.307
oot-0.305
Token kitchen
Token ID9592
Feature activation-1.37e-01

Pos logit contributions
 same+0.153
 way+0.145
 boring+0.142
 usual+0.134
 proper+0.132
Neg logit contributions
unch-0.368
oing-0.345
ucking-0.343
umbs-0.342
aring-0.340
Token very
Token ID845
Feature activation+7.17e-02

Pos logit contributions
 same+0.127
 homework+0.109
 ger+0.101
 proper+0.096
 task+0.096
Neg logit contributions
unch-0.279
ang-0.274
oing-0.269
amb-0.263
ub-0.256
Token hot
Token ID3024
Feature activation-7.82e-02

Pos logit contributions
 same+0.081
 attic+0.073
 laundry+0.072
 basement+0.071
 garage+0.070
Neg logit contributions
unch-0.171
ang-0.168
aring-0.166
amb-0.163
oing-0.158
Token day
Token ID1110
Feature activation-6.30e-02

Pos logit contributions
unch+0.236
amb+0.234
oot+0.225
oped+0.224
aring+0.222
Neg logit contributions
 same-0.072
 way-0.061
 task-0.050
 boring-0.048
 attic-0.046
Token and
Token ID290
Feature activation+6.59e-02

Pos logit contributions
unch+0.112
amb+0.111
oot+0.103
oped+0.102
aring+0.100
Neg logit contributions
 same-0.134
 way-0.124
 task-0.116
 boring-0.115
 attic-0.113
Token the
Token ID262
Feature activation+1.51e-01

Pos logit contributions
 same+0.105
 task+0.105
 way+0.098
 garage+0.097
 tool+0.094
Neg logit contributions
unch-0.123
es-0.122
uts-0.121
oped-0.121
iled-0.114
Token sun
Token ID4252
Feature activation-2.14e-01

Pos logit contributions
 same+0.202
 proper+0.188
 task+0.186
 way+0.182
 homework+0.176
Neg logit contributions
unch-0.324
amb-0.313
qu-0.295
oing-0.292
ang-0.292
Token was
Token ID373
Feature activation+8.39e-02

Pos logit contributions
unch+0.475
amb+0.469
oped+0.453
uts+0.439
oing+0.438
Neg logit contributions
 same-0.348
 way-0.304
 task-0.294
 boring-0.287
 attic-0.286
Token bright
Token ID6016
Feature activation-9.02e-02

Pos logit contributions
 same+0.128
 task+0.115
 way+0.113
 attic+0.107
 garage+0.107
Neg logit contributions
unch-0.187
amb-0.177
oped-0.168
rol-0.167
oot-0.165
Token.
Token ID13
Feature activation-9.01e-02

Pos logit contributions
unch+0.189
amb+0.183
oot+0.179
oped+0.177
aring+0.175
Neg logit contributions
 same-0.176
 way-0.161
 boring-0.150
 attic-0.143
 task-0.142
Token Lily
Token ID20037
Feature activation-3.16e-02

Pos logit contributions
unch+0.179
amb+0.178
oot+0.167
oped+0.165
aring+0.163
Neg logit contributions
 same-0.173
 way-0.159
 task-0.147
 boring-0.146
 attic-0.144
Token felt
Token ID2936
Feature activation-6.55e-04

Pos logit contributions
amb+0.064
unch+0.061
oped+0.059
uts+0.057
aring+0.057
Neg logit contributions
 same-0.060
 task-0.052
 way-0.051
 boring-0.049
 attic-0.049
Token it
Token ID340
Feature activation+4.23e-02

Pos logit contributions
amb+0.116
unch+0.115
oped+0.109
aring+0.108
oot+0.108
Neg logit contributions
 same-0.055
 way-0.050
 boring-0.043
 task-0.042
 laundry-0.042
Token.
Token ID13
Feature activation-6.54e-02

Pos logit contributions
 same+0.082
 way+0.076
 task+0.068
 boring+0.068
 bad+0.066
Neg logit contributions
unch-0.089
amb-0.087
aring-0.082
oped-0.082
oing-0.082
Token He
Token ID679
Feature activation+2.20e-02

Pos logit contributions
amb+0.123
unch+0.119
uts+0.112
aring+0.112
oot+0.108
Neg logit contributions
 same-0.124
 way-0.113
 bad-0.108
 task-0.108
 laundry-0.107
Token blew
Token ID17948
Feature activation-1.12e-01

Pos logit contributions
 same+0.042
 task+0.036
 basement+0.036
 way+0.035
 attic+0.035
Neg logit contributions
amb-0.044
oped-0.043
unch-0.041
oot-0.038
qu-0.038
Token the
Token ID262
Feature activation+1.56e-01

Pos logit contributions
amb+0.301
unch+0.301
oot+0.286
oped+0.284
uts+0.280
Neg logit contributions
 same-0.132
 way-0.114
 task-0.102
 attic-0.099
 boring-0.099
Token horn
Token ID12718
Feature activation-2.52e-01

Pos logit contributions
 same+0.281
 way+0.254
 task+0.225
 boring+0.224
 bad+0.218
Neg logit contributions
unch-0.350
amb-0.341
oot-0.325
oped-0.322
uts-0.321
Token.
Token ID13
Feature activation-7.37e-02

Pos logit contributions
amb+0.489
unch+0.482
oot+0.460
uts+0.454
oped+0.446
Neg logit contributions
 same-0.475
 way-0.433
 attic-0.411
 task-0.406
 boring-0.403
Token It
Token ID632
Feature activation-2.33e-03

Pos logit contributions
amb+0.142
unch+0.142
uts+0.130
oot+0.128
oped+0.127
Neg logit contributions
 same-0.140
 way-0.128
 task-0.122
 boring-0.121
 bad-0.119
Token made
Token ID925
Feature activation+2.62e-02

Pos logit contributions
amb+0.005
oped+0.005
unch+0.005
qu+0.005
uts+0.004
Neg logit contributions
 same-0.004
 attic-0.004
 task-0.003
 laundry-0.003
 way-0.003
Token a
Token ID257
Feature activation+1.40e-01

Pos logit contributions
 same+0.041
 way+0.038
 task+0.036
 attic+0.036
 garage+0.035
Neg logit contributions
unch-0.057
amb-0.056
oot-0.053
qu-0.052
uts-0.052
Token loud
Token ID7812
Feature activation-5.94e-02

Pos logit contributions
 same+0.163
 attic+0.143
 way+0.134
 laundry+0.133
 homework+0.130
Neg logit contributions
unch-0.351
oing-0.337
amb-0.328
aring-0.325
qu-0.323
Token They
Token ID1119
Feature activation-4.14e-02

Pos logit contributions
unch+0.170
amb+0.165
oot+0.161
oing+0.157
oped+0.156
Neg logit contributions
 same-0.154
 way-0.140
 attic-0.129
 garage-0.127
 task-0.126
Token got
Token ID1392
Feature activation-6.19e-03

Pos logit contributions
amb+0.086
oped+0.082
unch+0.080
oot+0.077
qu+0.076
Neg logit contributions
 same-0.072
 way-0.065
 task-0.064
 laundry-0.061
 boring-0.060
Token back
Token ID736
Feature activation-1.12e-02

Pos logit contributions
amb+0.012
unch+0.012
oped+0.011
oot+0.011
rol+0.011
Neg logit contributions
 same-0.011
 way-0.010
 task-0.010
 attic-0.009
 boring-0.009
Token in
Token ID287
Feature activation+6.67e-02

Pos logit contributions
amb+0.025
unch+0.025
oot+0.024
oped+0.023
uts+0.023
Neg logit contributions
 same-0.018
 way-0.016
 task-0.015
 boring-0.015
 attic-0.015
Token the
Token ID262
Feature activation+9.64e-02

Pos logit contributions
 same+0.073
 way+0.054
 task+0.054
 boring+0.052
 attic+0.049
Neg logit contributions
unch-0.176
amb-0.172
oped-0.168
oing-0.164
oot-0.164
Token car
Token ID1097
Feature activation-2.30e-01

Pos logit contributions
 same+0.106
 homework+0.101
 task+0.097
 usual+0.095
 boring+0.095
Neg logit contributions
amb-0.207
oped-0.202
unch-0.199
rol-0.195
ang-0.194
Token and
Token ID290
Feature activation+3.79e-02

Pos logit contributions
unch+0.447
amb+0.438
oot+0.409
oped+0.407
oing+0.401
Neg logit contributions
 same-0.456
 way-0.425
 task-0.387
 boring-0.383
 attic-0.380
Token drove
Token ID10357
Feature activation-1.60e-01

Pos logit contributions
 same+0.073
 task+0.065
 way+0.062
 boring+0.061
 attic+0.060
Neg logit contributions
amb-0.072
oped-0.069
oot-0.067
unch-0.065
es-0.063
Token home
Token ID1363
Feature activation-1.28e-01

Pos logit contributions
unch+0.368
amb+0.366
oot+0.346
oped+0.343
aring+0.340
Neg logit contributions
 same-0.256
 way-0.229
 task-0.210
 boring-0.207
 attic-0.203
Token.
Token ID13
Feature activation-9.20e-02

Pos logit contributions
unch+0.239
amb+0.234
oped+0.215
oot+0.214
aring+0.210
Neg logit contributions
 same-0.268
 way-0.252
 boring-0.229
 task-0.228
 laundry-0.225
Token\n
Token ID198
Feature activation-8.10e-02

Pos logit contributions
unch+0.202
amb+0.198
oot+0.191
oing+0.186
oped+0.186
Neg logit contributions
 same-0.157
 way-0.143
 boring-0.129
 attic-0.129
 garage-0.127