TOP ACTIVATIONS
MAX = 15.375

One day the bear summoned up the courage to join
what to do . He summoned all the animals in the
, they wouldn 't have gotten broken . Tim my realized
so they wouldn 't have gotten in trouble . From then
shiny object and should have kept its appointment . The goose
them can lead to things getting broken and needing to be
kept helping out and kept getting rewarded , until one day
looking for carrots and always kept an eye out for the
, she wouldn 't have gotten so dirty and wouldn 't
money and didn 't delay getting them ". Bob
of her toys and always kept them in the bin .
the jewelry box and had gotten in through a tiny little
of her belt and always kept it clean . <|endoftext|> Once
careful around bees and always kept some medicine for her lip
the magical force just kept getting stronger !
very special and should be kept safe .
longer periods of time without getting tired . The moral of
hands and knees and began getting all the stuff off the
by eating healthy food and getting exercise .
to save her diary from getting messed up . From that

MIN ACTIVATIONS
MIN = -21.434

. The great Zen o hugs Ben and says
saw a big box of Leg os that he really wanted
, dinner time !" exclaimed Nem o . But
is too hard !" said Nem o . "
was a little fish named Nem o . Nem o liked
was a little fish named Nem o . Nem o loved
a new tank out of Leg os . Tim
One day , Nem o saw a big ,
was a little fish named Nem o . Nem o loved
It might be spicy ." Nem o thought for a moment
One day , the prince decided to take the jewel
! â € <|endoftext|> Pa olo was very excited . He
The end . <|endoftext|> D ennis was feeling very lonely .
You 're welcome ," said Nem o . " Now let
a bad ending . <|endoftext|> Le o was a little lion
One day , Nem o 's friend Tim my
" No swimming allowed ." Nem o was sad because he
Yes , please !" said Nem o happily .
." The prince went to the battlefield with
was a little fish named Nem o . Nem o loved

INTERVAL 6.173 - 15.375
CONTAINS 19.384%

." They knock on the window . A man
down to the streets and saw an old man .
complete the ceiling . They found all the pieces and fit
to the bathroom . He put her in the tub and
song for her . Lily was so happy and sang her

INTERVAL -3.029 - 6.173
CONTAINS 45.638%

. The girl was a bit scared but she was obedient
home and a family . You are mean and rude .
whistle . Tom looked at Lily . He copied
, but it was still wet . Her mom my came
lake . But then , something bad happens .

INTERVAL -12.232 - -3.029
CONTAINS 30.409%

offer me some tea ?" Anna nodded . She
, playing its happy tunes . His mom
a new one ." They hugged each other and
He was very relieved . He felt brave again , so
me ? I ’ m lost . â €

INTERVAL -21.434 - -12.232
CONTAINS 4.569%

His father said that the cat already belonged to someone else
!" Her mom my said , " I
there was a very creative bird . He liked to explore
One day , Sally decided it was time to
One day , Twe ety felt tired and needed to
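
The four INTERVAL blocks above look like equal-width bins spanning MIN to MAX, with the CONTAINS figures giving the share of activations in each bin. That reading is inferred from the numbers alone; the dump does not say how the bins were built. A minimal sketch that checks both properties, using the values copied from the listing (the helper name `equal_width_edges` and the variable names are illustrative, not part of any tool):

```python
# Bounds and shares copied from the listing above: (low, high, percent).
MIN_ACT, MAX_ACT = -21.434, 15.375
intervals = [
    (6.173, 15.375, 19.384),
    (-3.029, 6.173, 45.638),
    (-12.232, -3.029, 30.409),
    (-21.434, -12.232, 4.569),
]


def equal_width_edges(lo, hi, n_bins):
    """Return the n_bins + 1 edges of equal-width bins over [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# Edges we would expect if the range were split into four equal bins.
expected_edges = equal_width_edges(MIN_ACT, MAX_ACT, len(intervals))

# Distinct edges actually printed in the listing, in ascending order.
listed_edges = sorted({edge for lo, hi, _ in intervals for edge in (lo, hi)})

# Largest disagreement between expected and listed edges (rounding only).
max_edge_gap = max(abs(a - b) for a, b in zip(expected_edges, listed_edges))

# The four shares should account for all activations.
total_share = sum(share for _, _, share in intervals)

print(f"max edge gap: {max_edge_gap:.4f}")
print(f"total share:  {total_share:.3f}%")
```

If the gap stays within the three-decimal rounding of the listing and the shares total 100%, the equal-width reading is consistent with the data, though it does not prove how the bins were actually computed.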
Token\n
Token ID198
Feature activation-3.58e+00

Pos logit contributions
ist+0.109
aron+0.108
os+0.105
enny+0.105
umping+0.104
Neg logit contributions
 going-0.059
 chores-0.059
 same-0.054
 other-0.053
 her-0.053
TokenOne
Token ID3198
Feature activation-5.91e-01

Pos logit contributions
os+0.105
aron+0.105
enny+0.105
ist+0.104
jo+0.102
Neg logit contributions
 going-0.076
 chores-0.075
 same-0.070
 rain-0.068
 temptation-0.066
Token day
Token ID1110
Feature activation-3.09e+00

Pos logit contributions
aron+0.025
enny+0.025
os+0.025
ist+0.025
o+0.025
Neg logit contributions
 going-0.005
 chores-0.005
 same-0.004
 rain-0.003
 work-0.003
Token the
Token ID262
Feature activation+4.13e-01

Pos logit contributions
enny+0.102
os+0.102
aron+0.102
ist+0.100
o+0.100
Neg logit contributions
 going-0.053
 chores-0.052
 same-0.049
 temptation-0.045
 rain-0.045
Token bear
Token ID6842
Feature activation-1.56e+01

Pos logit contributions
 going+0.009
 chores+0.008
 her+0.008
 same+0.008
 temptation+0.007
Neg logit contributions
o-0.012
jo-0.012
os-0.012
ne-0.012
itty-0.012
Token summoned
Token ID20056
Feature activation+1.54e+01

Pos logit contributions
aron+0.484
enny+0.476
ist+0.470
os+0.465
oing+0.460
Neg logit contributions
 going-0.315
 chores-0.286
 same-0.282
 rain-0.278
 work-0.271
Token up
Token ID510
Feature activation+7.17e+00

Pos logit contributions
 going+0.199
 chores+0.198
 temptation+0.171
 rain+0.164
 urge+0.163
Neg logit contributions
aron-0.549
ist-0.548
o-0.547
os-0.545
jo-0.523
Token the
Token ID262
Feature activation+3.51e+00

Pos logit contributions
 chores+0.097
 going+0.095
 temptation+0.093
 same+0.093
 urge+0.088
Neg logit contributions
os-0.251
o-0.244
ist-0.238
enny-0.238
umping-0.237
Token courage
Token ID11917
Feature activation-1.67e+00

Pos logit contributions
 going+0.050
 chores+0.048
 her+0.045
 temptation+0.043
 same+0.040
Neg logit contributions
os-0.121
ist-0.120
o-0.119
jo-0.119
ne-0.118
Token to
Token ID284
Feature activation+8.46e+00

Pos logit contributions
os+0.047
enny+0.046
aron+0.046
ist+0.046
o+0.046
Neg logit contributions
 chores-0.037
 going-0.036
 same-0.034
 rain-0.033
 temptation-0.033
Token join
Token ID4654
Feature activation+4.80e+00

Pos logit contributions
 chores+0.128
 same+0.120
 going+0.119
 temptation+0.113
 rain+0.102
Neg logit contributions
os-0.280
ist-0.278
ne-0.270
aron-0.268
o-0.263
Token what
Token ID644
Feature activation+6.74e-02

Pos logit contributions
 chores+0.080
 going+0.077
 same+0.076
 urge+0.070
 temptation+0.070
Neg logit contributions
enny-0.174
ist-0.174
aron-0.174
os-0.172
o-0.172
Token to
Token ID284
Feature activation+7.46e+00

Pos logit contributions
 going+0.001
 chores+0.001
 same+0.001
 rain+0.001
 temptation+0.001
Neg logit contributions
enny-0.002
os-0.002
aron-0.002
ist-0.002
o-0.002
Token do
Token ID466
Feature activation+7.47e+00

Pos logit contributions
 going+0.127
 chores+0.127
 same+0.116
 top+0.112
 good+0.111
Neg logit contributions
enny-0.216
r-0.211
ne-0.211
aron-0.211
ist-0.211
Token.
Token ID13
Feature activation-5.45e+00

Pos logit contributions
 going+0.173
 chores+0.171
 same+0.163
 rain+0.157
 temptation+0.153
Neg logit contributions
os-0.201
enny-0.201
aron-0.200
ist-0.198
o-0.196
Token He
Token ID679
Feature activation-4.78e+00

Pos logit contributions
ist+0.183
os+0.179
ne+0.175
oing+0.175
enny+0.174
Neg logit contributions
 going-0.091
 chores-0.086
 her-0.084
 same-0.084
 rain-0.080
Token summoned
Token ID20056
Feature activation+1.48e+01

Pos logit contributions
enny+0.142
aron+0.141
os+0.141
ist+0.139
oing+0.138
Neg logit contributions
 going-0.101
 chores-0.096
 same-0.091
 work-0.089
 rain-0.088
Token all
Token ID477
Feature activation+5.22e+00

Pos logit contributions
 going+0.176
 chores+0.169
 urge+0.153
 plan+0.145
 temptation+0.144
Neg logit contributions
o-0.545
aron-0.545
ist-0.541
os-0.540
enny-0.525
Token the
Token ID262
Feature activation+3.27e+00

Pos logit contributions
 going+0.058
 chores+0.053
 rain+0.045
 urge+0.045
 same+0.044
Neg logit contributions
os-0.202
aron-0.200
o-0.198
enny-0.197
ne-0.194
Token animals
Token ID4695
Feature activation-8.81e+00

Pos logit contributions
 going+0.050
 her+0.042
 chores+0.041
wered+0.040
 rain+0.039
Neg logit contributions
ne-0.107
ist-0.105
aron-0.104
enny-0.102
os-0.101
Token in
Token ID287
Feature activation+9.87e+00

Pos logit contributions
os+0.290
enny+0.286
ist+0.283
o+0.283
aron+0.281
Neg logit contributions
 going-0.152
 chores-0.148
 same-0.142
 rain-0.135
 temptation-0.131
Token the
Token ID262
Feature activation+3.99e+00

Pos logit contributions
 same+0.105
 going+0.099
 urge+0.098
 chores+0.095
 hurricane+0.088
Neg logit contributions
aron-0.360
o-0.341
os-0.340
oc-0.339
jo-0.338
Token,
Token ID11
Feature activation+3.16e+00

Pos logit contributions
enny+0.165
os+0.164
aron+0.163
ist+0.162
o+0.160
Neg logit contributions
 going-0.105
 chores-0.103
 same-0.096
 rain-0.092
 temptation-0.091
Token they
Token ID484
Feature activation-2.45e+00

Pos logit contributions
 chores+0.049
 same+0.049
 going+0.048
 temptation+0.045
 top+0.043
Neg logit contributions
os-0.100
enny-0.099
o-0.099
ist-0.097
aron-0.097
Token wouldn
Token ID3636
Feature activation+6.69e+00

Pos logit contributions
os+0.060
ist+0.058
enny+0.058
o+0.058
ne+0.057
Neg logit contributions
 going-0.062
 chores-0.062
 same-0.059
 temptation-0.058
 rain-0.057
Token't
Token ID470
Feature activation+8.39e+00

Pos logit contributions
 going+0.084
 chores+0.081
 same+0.074
 rain+0.068
 temptation+0.066
Neg logit contributions
os-0.253
enny-0.253
aron-0.252
ist-0.250
o-0.249
Token have
Token ID423
Feature activation+1.06e+01

Pos logit contributions
 chores+0.122
 going+0.117
 same+0.107
 rain+0.098
 temptation+0.096
Neg logit contributions
ist-0.294
aron-0.293
ne-0.292
enny-0.290
o-0.288
Token gotten
Token ID7891
Feature activation+1.48e+01

Pos logit contributions
 chores+0.173
 going+0.168
 urge+0.158
 same+0.157
 temptation+0.156
Neg logit contributions
enny-0.337
o-0.337
os-0.335
ne-0.332
aron-0.331
Token broken
Token ID5445
Feature activation+5.22e+00

Pos logit contributions
 chores+0.246
 going+0.225
wered+0.225
 urge+0.222
 plan+0.220
Neg logit contributions
o-0.474
os-0.460
enny-0.460
ne-0.454
ist-0.454
Token.
Token ID13
Feature activation-5.10e+00

Pos logit contributions
 going+0.127
 chores+0.125
 same+0.119
 rain+0.114
 work+0.112
Neg logit contributions
os-0.136
enny-0.136
aron-0.136
ist-0.134
o-0.132
Token Tim
Token ID5045
Feature activation-1.25e+01

Pos logit contributions
ist+0.173
aron+0.165
oing+0.165
os+0.165
umping+0.163
Neg logit contributions
 going-0.083
 her-0.082
 chores-0.080
 same-0.077
 rain-0.074
Tokenmy
Token ID1820
Feature activation-8.21e+00

Pos logit contributions
oing+0.308
aron+0.293
os+0.293
ist+0.291
ark+0.288
Neg logit contributions
 same-0.328
 going-0.324
 her-0.322
 chores-0.308
 work-0.307
Token realized
Token ID6939
Feature activation+6.27e+00

Pos logit contributions
os+0.238
enny+0.237
aron+0.235
ist+0.234
oing+0.230
Neg logit contributions
 going-0.177
 chores-0.171
 same-0.165
 her-0.162
 work-0.160
Token so
Token ID523
Feature activation+7.31e+00

Pos logit contributions
 going+0.122
 chores+0.120
 same+0.112
 rain+0.108
 temptation+0.107
Neg logit contributions
os-0.202
aron-0.202
enny-0.201
ist-0.201
o-0.198
Token they
Token ID484
Feature activation-2.80e+00

Pos logit contributions
 going+0.106
 chores+0.100
 same+0.100
 urge+0.091
 plan+0.088
Neg logit contributions
os-0.258
enny-0.256
ist-0.254
jo-0.253
aron-0.252
Token wouldn
Token ID3636
Feature activation+6.61e+00

Pos logit contributions
os+0.073
enny+0.073
aron+0.073
ist+0.072
o+0.071
Neg logit contributions
 going-0.068
 chores-0.067
 same-0.064
 rain-0.061
 temptation-0.061
Token't
Token ID470
Feature activation+8.33e+00

Pos logit contributions
 going+0.078
 chores+0.075
 same+0.066
 rain+0.062
 work+0.060
Neg logit contributions
aron-0.257
enny-0.256
os-0.255
ist-0.254
o-0.251
Token have
Token ID423
Feature activation+1.05e+01

Pos logit contributions
 going+0.107
 chores+0.107
 same+0.096
 rain+0.087
 temptation+0.086
Neg logit contributions
os-0.309
aron-0.309
enny-0.307
ist-0.306
o-0.305
Token gotten
Token ID7891
Feature activation+1.46e+01

Pos logit contributions
 chores+0.194
 same+0.185
 urge+0.182
 temptation+0.180
 going+0.179
Neg logit contributions
o-0.308
os-0.302
ne-0.299
enny-0.293
aron-0.293
Token in
Token ID287
Feature activation+1.07e+01

Pos logit contributions
 chores+0.246
wered+0.231
 same+0.225
 urge+0.222
 plan+0.222
Neg logit contributions
o-0.469
os-0.451
ne-0.440
enny-0.433
ist-0.431
Token trouble
Token ID5876
Feature activation+3.64e+00

Pos logit contributions
 chores+0.115
 going+0.112
 same+0.110
 urge+0.109
wered+0.104
Neg logit contributions
aron-0.370
os-0.367
ne-0.359
es-0.359
enn-0.356
Token.
Token ID13
Feature activation-4.73e+00

Pos logit contributions
 chores+0.087
 going+0.087
 same+0.083
 rain+0.079
 temptation+0.078
Neg logit contributions
aron-0.094
os-0.093
ist-0.092
enny-0.092
o-0.091
Token From
Token ID3574
Feature activation+5.52e+00

Pos logit contributions
ist+0.152
aron+0.149
os+0.148
enny+0.146
oing+0.146
Neg logit contributions
 going-0.086
 chores-0.084
 same-0.081
 her-0.080
 rain-0.077
Token then
Token ID788
Feature activation+1.31e+00

Pos logit contributions
 chores+0.105
 going+0.101
 same+0.095
 temptation+0.093
 urge+0.092
Neg logit contributions
ist-0.170
aron-0.169
o-0.169
os-0.169
ne-0.168
Token shiny
Token ID22441
Feature activation-2.93e-01

Pos logit contributions
 chores+0.042
 going+0.041
 her+0.037
 rain+0.034
 start+0.032
Neg logit contributions
ne-0.091
aron-0.089
ist-0.088
pillar-0.086
oc-0.086
Token object
Token ID2134
Feature activation-9.68e+00

Pos logit contributions
aron+0.009
ist+0.009
enny+0.009
os+0.009
ne+0.009
Neg logit contributions
 going-0.006
 chores-0.006
 same-0.005
 rain-0.005
 temptation-0.005
Token and
Token ID290
Feature activation+5.87e+00

Pos logit contributions
enny+0.277
aron+0.277
os+0.277
ist+0.273
o+0.271
Neg logit contributions
 going-0.210
 chores-0.206
 same-0.195
 rain-0.186
 work-0.183
Token should
Token ID815
Feature activation+9.62e+00

Pos logit contributions
 chores+0.092
wered+0.081
 going+0.078
 temptation+0.077
 top+0.077
Neg logit contributions
os-0.184
o-0.180
ist-0.176
enny-0.173
ney-0.172
Token have
Token ID423
Feature activation+1.10e+01

Pos logit contributions
 chores+0.163
 going+0.157
 same+0.153
 temptation+0.144
 rain+0.138
Neg logit contributions
ist-0.278
aron-0.270
o-0.268
ne-0.268
enny-0.261
Token kept
Token ID4030
Feature activation+1.46e+01

Pos logit contributions
 chores+0.259
 same+0.240
 going+0.238
 temptation+0.236
 urge+0.230
Neg logit contributions
o-0.287
aron-0.280
os-0.278
enny-0.274
ne-0.273
Token its
Token ID663
Feature activation+2.74e+00

Pos logit contributions
 chores+0.227
 plan+0.202
 idea+0.201
 urge+0.198
 same+0.197
Neg logit contributions
o-0.453
ist-0.444
os-0.433
ne-0.431
es-0.431
Token appointment
Token ID12557
Feature activation-2.10e+00

Pos logit contributions
 going+0.046
 chores+0.045
 rain+0.039
 her+0.039
wered+0.039
Neg logit contributions
ist-0.087
aron-0.084
oc-0.084
ney-0.084
o-0.083
Token.
Token ID13
Feature activation-5.16e+00

Pos logit contributions
os+0.060
enny+0.060
aron+0.060
o+0.059
ist+0.059
Neg logit contributions
 going-0.047
 chores-0.045
 same-0.042
 her-0.040
 rain-0.040
Token The
Token ID383
Feature activation-1.45e-01

Pos logit contributions
ist+0.172
os+0.165
aron+0.165
enny+0.164
oing+0.162
Neg logit contributions
 going-0.089
 chores-0.086
 same-0.083
 her-0.081
 temptation-0.079
Token goose
Token ID37246
Feature activation-1.40e+01

Pos logit contributions
aron+0.004
oc+0.004
ne+0.004
o+0.004
pillar+0.004
Neg logit contributions
 going-0.003
 chores-0.003
 her-0.002
 same-0.002
 work-0.002
Token them
Token ID606
Feature activation-4.94e+00

Pos logit contributions
 going+0.098
 chores+0.094
 same+0.089
 urge+0.082
 rain+0.081
Neg logit contributions
ist-0.310
os-0.304
aron-0.304
o-0.297
enny-0.296
Token can
Token ID460
Feature activation+4.60e+00

Pos logit contributions
os+0.134
enny+0.132
ist+0.131
o+0.128
aron+0.128
Neg logit contributions
 going-0.111
 chores-0.109
 same-0.106
 rain-0.101
 temptation-0.101
Token lead
Token ID1085
Feature activation+6.95e+00

Pos logit contributions
 going+0.062
 chores+0.062
 same+0.056
 temptation+0.055
 good+0.051
Neg logit contributions
ist-0.161
os-0.161
o-0.159
enny-0.157
ne-0.156
Token to
Token ID284
Feature activation+8.98e+00

Pos logit contributions
 going+0.126
 chores+0.124
 same+0.115
 rain+0.109
 temptation+0.109
Neg logit contributions
os-0.222
enny-0.219
ist-0.217
o-0.217
aron-0.217
Token things
Token ID1243
Feature activation-3.47e+00

Pos logit contributions
 urge+0.093
 going+0.092
wered+0.090
 same+0.089
 top+0.087
Neg logit contributions
ist-0.295
enny-0.290
aron-0.289
os-0.288
es-0.288
Token getting
Token ID1972
Feature activation+1.44e+01

Pos logit contributions
ist+0.101
os+0.101
enny+0.101
aron+0.100
o+0.098
Neg logit contributions
 chores-0.071
 going-0.070
 same-0.067
 rain-0.064
 temptation-0.064
Token broken
Token ID5445
Feature activation+4.84e+00

Pos logit contributions
 chores+0.224
 going+0.217
 urge+0.212
 temptation+0.205
 plan+0.205
Neg logit contributions
o-0.470
os-0.466
ist-0.465
enny-0.464
ne-0.450
Token and
Token ID290
Feature activation+6.62e+00

Pos logit contributions
 going+0.113
 chores+0.111
 same+0.105
 rain+0.101
 work+0.100
Neg logit contributions
enny-0.131
os-0.131
aron-0.131
ist-0.129
o-0.128
Token needing
Token ID18139
Feature activation+1.14e+01

Pos logit contributions
 chores+0.102
 top+0.093
 urge+0.092
 same+0.092
wered+0.090
Neg logit contributions
os-0.211
o-0.209
ist-0.203
enny-0.195
ne-0.195
Token to
Token ID284
Feature activation+9.55e+00

Pos logit contributions
 going+0.176
 urge+0.171
 same+0.167
 chores+0.167
wered+0.161
Neg logit contributions
os-0.345
ne-0.344
enny-0.343
ist-0.342
o-0.337
Token be
Token ID307
Feature activation+1.03e+01

Pos logit contributions
 going+0.149
 chores+0.140
 same+0.138
 temptation+0.128
 urge+0.123
Neg logit contributions
ist-0.307
ne-0.296
aron-0.294
o-0.293
enny-0.289
Token kept
Token ID4030
Feature activation+1.21e+01

Pos logit contributions
enny+0.114
os+0.114
aron+0.114
ist+0.112
jo+0.110
Neg logit contributions
 going-0.086
 chores-0.083
 same-0.079
 work-0.076
 rain-0.076
Token helping
Token ID5742
Feature activation+6.98e+00

Pos logit contributions
 same+0.176
 chores+0.172
 rain+0.165
 temptation+0.164
 urge+0.158
Neg logit contributions
o-0.390
aron-0.383
os-0.376
es-0.369
umping-0.369
Token out
Token ID503
Feature activation+3.53e+00

Pos logit contributions
 going+0.100
 chores+0.097
 same+0.088
 rain+0.084
 temptation+0.081
Neg logit contributions
aron-0.251
enny-0.250
os-0.249
ist-0.248
o-0.245
Token and
Token ID290
Feature activation+5.85e+00

Pos logit contributions
 going+0.060
 chores+0.058
 same+0.056
 rain+0.052
 temptation+0.052
Neg logit contributions
os-0.116
ist-0.115
aron-0.115
enny-0.114
o-0.112
Token kept
Token ID4030
Feature activation+1.36e+01

Pos logit contributions
 chores+0.119
 going+0.111
 same+0.110
 rain+0.106
 temptation+0.104
Neg logit contributions
os-0.171
o-0.170
ist-0.169
enny-0.168
aron-0.167
Token getting
Token ID1972
Feature activation+1.44e+01

Pos logit contributions
 same+0.176
 chores+0.166
 rain+0.161
 urge+0.158
 temptation+0.155
Neg logit contributions
o-0.460
aron-0.458
os-0.450
ist-0.442
es-0.441
Token rewarded
Token ID20945
Feature activation+6.95e+00

Pos logit contributions
 urge+0.210
 chores+0.206
 going+0.200
 same+0.200
 temptation+0.200
Neg logit contributions
os-0.464
o-0.463
enny-0.452
aron-0.445
ist-0.438
Token,
Token ID11
Feature activation+3.13e+00

Pos logit contributions
 going+0.151
 chores+0.148
 same+0.139
 rain+0.133
 her+0.132
Neg logit contributions
os-0.200
enny-0.200
aron-0.199
ist-0.196
o-0.196
Token until
Token ID1566
Feature activation+3.85e+00

Pos logit contributions
 chores+0.069
 going+0.065
 same+0.064
 rain+0.060
 temptation+0.060
Neg logit contributions
aron-0.089
ist-0.085
os-0.084
enny-0.084
o-0.084
Token one
Token ID530
Feature activation-2.68e+00

Pos logit contributions
 same+0.062
 temptation+0.060
 plan+0.059
 urge+0.059
 chores+0.058
Neg logit contributions
enny-0.118
o-0.111
light-0.109
oc-0.109
ne-0.108
Token day
Token ID1110
Feature activation-1.83e+00

Pos logit contributions
os+0.107
enny+0.107
aron+0.107
ist+0.106
o+0.105
Neg logit contributions
 going-0.029
 chores-0.028
 same-0.025
 rain-0.022
 work-0.021
Token looking
Token ID2045
Feature activation+4.47e+00

Pos logit contributions
 chores+0.057
 same+0.055
 urge+0.051
wered+0.050
 temptation+0.049
Neg logit contributions
o-0.112
os-0.112
es-0.109
ne-0.106
enny-0.106
Token for
Token ID329
Feature activation+1.12e+01

Pos logit contributions
 going+0.080
 chores+0.079
 same+0.073
 rain+0.071
 temptation+0.069
Neg logit contributions
os-0.142
aron-0.141
enny-0.141
ist-0.139
o-0.139
Token carrots
Token ID34397
Feature activation-6.91e+00

Pos logit contributions
 going+0.138
 chores+0.133
 same+0.133
 plan+0.114
 urge+0.113
Neg logit contributions
aron-0.391
os-0.375
es-0.375
oc-0.373
o-0.372
Token and
Token ID290
Feature activation+6.09e+00

Pos logit contributions
os+0.195
enny+0.195
aron+0.195
ist+0.192
o+0.190
Neg logit contributions
 going-0.152
 chores-0.150
 same-0.142
 rain-0.136
 temptation-0.133
Token always
Token ID1464
Feature activation+9.31e+00

Pos logit contributions
 chores+0.103
 same+0.094
 going+0.092
wered+0.091
 temptation+0.087
Neg logit contributions
o-0.193
os-0.189
aron-0.183
oc-0.181
es-0.179
Token kept
Token ID4030
Feature activation+1.43e+01

Pos logit contributions
 chores+0.202
 going+0.192
 same+0.186
 rain+0.178
 temptation+0.175
Neg logit contributions
o-0.265
os-0.264
aron-0.263
ist-0.260
enny-0.256
Token an
Token ID281
Feature activation+5.94e+00

Pos logit contributions
 chores+0.187
 same+0.174
 urge+0.173
wered+0.164
 plan+0.162
Neg logit contributions
o-0.507
aron-0.475
os-0.473
es-0.463
ist-0.458
Token eye
Token ID4151
Feature activation-1.27e+00

Pos logit contributions
 going+0.112
 chores+0.112
 same+0.102
 rain+0.099
 her+0.098
Neg logit contributions
aron-0.188
enny-0.181
ist-0.181
oc-0.178
o-0.178
Token out
Token ID503
Feature activation+4.41e+00

Pos logit contributions
enny+0.041
os+0.041
aron+0.041
ist+0.040
o+0.040
Neg logit contributions
 going-0.023
 chores-0.023
 same-0.021
 rain-0.020
 work-0.020
Token for
Token ID329
Feature activation+1.20e+01

Pos logit contributions
 chores+0.088
 going+0.087
 same+0.083
 rain+0.079
 temptation+0.077
Neg logit contributions
os-0.131
ist-0.130
enny-0.130
aron-0.129
o-0.128
Token the
Token ID262
Feature activation+2.80e+00

Pos logit contributions
 same+0.145
 going+0.136
 chores+0.134
 urge+0.126
 plan+0.117
Neg logit contributions
aron-0.421
es-0.414
os-0.410
o-0.410
ne-0.407
Token,
Token ID11
Feature activation+3.06e+00

Pos logit contributions
 going+0.052
 chores+0.051
 same+0.049
 temptation+0.045
 rain+0.045
Neg logit contributions
enny-0.085
os-0.084
aron-0.084
ist-0.083
ne-0.083
Token she
Token ID673
Feature activation-4.64e+00

Pos logit contributions
 same+0.043
 chores+0.042
 going+0.041
 temptation+0.040
 urge+0.040
Neg logit contributions
enny-0.099
os-0.093
ne-0.093
es-0.093
o-0.091
Token wouldn
Token ID3636
Feature activation+5.95e+00

Pos logit contributions
aron+0.131
enny+0.130
os+0.130
ist+0.128
oing+0.126
Neg logit contributions
 going-0.105
 chores-0.102
 same-0.097
 rain-0.093
 work-0.093
Token't
Token ID470
Feature activation+7.65e+00

Pos logit contributions
 going+0.072
 chores+0.069
 same+0.062
 rain+0.057
 work+0.055
Neg logit contributions
aron-0.229
os-0.228
enny-0.228
ist-0.226
o-0.224
Token have
Token ID423
Feature activation+1.00e+01

Pos logit contributions
 chores+0.113
 going+0.109
 same+0.103
 top+0.091
 rain+0.091
Neg logit contributions
aron-0.263
ist-0.261
enny-0.260
ne-0.260
o-0.257
Token gotten
Token ID7891
Feature activation+1.43e+01

Pos logit contributions
 chores+0.166
 going+0.162
 urge+0.155
 same+0.153
 temptation+0.153
Neg logit contributions
enny-0.319
o-0.311
ne-0.308
os-0.307
aron-0.303
Token so
Token ID523
Feature activation+9.00e+00

Pos logit contributions
 chores+0.235
 plan+0.225
wered+0.222
 going+0.217
 urge+0.217
Neg logit contributions
o-0.457
enny-0.453
ne-0.438
os-0.437
aron-0.429
Token dirty
Token ID11841
Feature activation+3.42e+00

Pos logit contributions
 going+0.148
 chores+0.139
 same+0.137
 plan+0.136
 start+0.131
Neg logit contributions
enny-0.289
os-0.282
o-0.279
ne-0.269
as-0.269
Token and
Token ID290
Feature activation+5.71e+00

Pos logit contributions
 chores+0.082
 going+0.081
 same+0.077
 rain+0.073
 temptation+0.073
Neg logit contributions
aron-0.089
enny-0.089
os-0.088
ist-0.087
o-0.086
Token wouldn
Token ID3636
Feature activation+6.87e+00

Pos logit contributions
 chores+0.087
wered+0.084
 top+0.082
 same+0.077
 plan+0.076
Neg logit contributions
o-0.178
os-0.175
enny-0.172
ist-0.166
ppy-0.162
Token't
Token ID470
Feature activation+8.45e+00

Pos logit contributions
 going+0.090
 chores+0.089
 same+0.082
 rain+0.074
 temptation+0.074
Neg logit contributions
enny-0.254
os-0.254
aron-0.250
o-0.249
ist-0.248
Token money
Token ID1637
Feature activation-3.82e+00

Pos logit contributions
 going+0.032
 chores+0.030
 same+0.027
 her+0.026
 rain+0.026
Neg logit contributions
enny-0.081
aron-0.081
os-0.080
ist-0.080
o-0.080
Token and
Token ID290
Feature activation+5.44e+00

Pos logit contributions
enny+0.108
os+0.108
aron+0.107
ist+0.106
jo+0.106
Neg logit contributions
 going-0.086
 chores-0.083
 same-0.079
 work-0.075
 rain-0.074
Token didn
Token ID1422
Feature activation+8.28e+00

Pos logit contributions
 chores+0.088
 temptation+0.082
 going+0.082
 same+0.080
wered+0.078
Neg logit contributions
o-0.172
os-0.171
enny-0.168
aron-0.168
ist-0.167
Token't
Token ID470
Feature activation+8.55e+00

Pos logit contributions
 going+0.140
 same+0.137
 chores+0.137
 temptation+0.133
 plan+0.131
Neg logit contributions
os-0.256
enny-0.254
ne-0.254
o-0.253
ark-0.243
Token delay
Token ID5711
Feature activation+7.74e+00

Pos logit contributions
 chores+0.134
 going+0.130
 same+0.123
 rain+0.115
 temptation+0.109
Neg logit contributions
aron-0.290
ne-0.287
ist-0.284
o-0.283
enny-0.282
Token getting
Token ID1972
Feature activation+1.43e+01

Pos logit contributions
 chores+0.119
 urge+0.117
wered+0.111
 same+0.111
 temptation+0.109
Neg logit contributions
aron-0.248
ist-0.242
o-0.241
os-0.238
ne-0.234
Token them
Token ID606
Feature activation-3.56e+00

Pos logit contributions
 urge+0.201
wered+0.198
 chores+0.197
 plan+0.196
 idea+0.184
Neg logit contributions
o-0.472
os-0.454
enny-0.453
ne-0.448
ppy-0.429
Token".
Token ID1911
Feature activation-7.19e+00

Pos logit contributions
os+0.109
o+0.109
enny+0.107
aron+0.107
ist+0.107
Neg logit contributions
 chores-0.065
 going-0.065
 temptation-0.061
 same-0.060
 rain-0.059
Token\n
Token ID198
Feature activation-4.22e+00

Pos logit contributions
enny+0.257
ist+0.256
aron+0.246
os+0.245
ppy+0.244
Neg logit contributions
 going-0.113
 chores-0.107
 same-0.104
 her-0.097
 rain-0.096
Token\n
Token ID198
Feature activation-4.24e+00

Pos logit contributions
ist+0.149
enny+0.144
aron+0.140
ppy+0.139
os+0.139
Neg logit contributions
 going-0.068
 chores-0.065
 other-0.062
 the-0.062
 her-0.061
TokenBob
Token ID18861
Feature activation-1.38e+01

Pos logit contributions
enny+0.108
ist+0.105
os+0.103
jo+0.101
ouse+0.099
Neg logit contributions
 going-0.111
 chores-0.108
 same-0.105
 good-0.104
 other-0.103
Token of
Token ID286
Feature activation+8.41e+00

Pos logit contributions
 going+0.029
 chores+0.028
 same+0.027
 her+0.026
 good+0.026
Neg logit contributions
aron-0.038
os-0.038
enny-0.037
o-0.037
jo-0.037
Token her
Token ID607
Feature activation-4.65e-01

Pos logit contributions
 going+0.069
 chores+0.064
 same+0.064
 urge+0.063
 plan+0.053
Neg logit contributions
ist-0.335
aron-0.323
os-0.318
es-0.316
enny-0.310
Token toys
Token ID14958
Feature activation-5.88e+00

Pos logit contributions
ist+0.016
enny+0.016
os+0.016
oc+0.016
aron+0.016
Neg logit contributions
 going-0.007
 chores-0.006
 same-0.006
 temptation-0.006
wered-0.006
Token and
Token ID290
Feature activation+5.95e+00

Pos logit contributions
os+0.154
aron+0.154
enny+0.154
ist+0.151
o+0.151
Neg logit contributions
 going-0.144
 chores-0.141
 same-0.133
 rain-0.129
 her-0.128
Token always
Token ID1464
Feature activation+9.28e+00

Pos logit contributions
 chores+0.115
wered+0.112
 same+0.111
 temptation+0.110
 top+0.109
Neg logit contributions
os-0.167
es-0.164
o-0.163
enny-0.163
ist-0.162
Token kept
Token ID4030
Feature activation+1.43e+01

Pos logit contributions
 chores+0.211
 going+0.194
 same+0.193
 top+0.191
 rain+0.189
Neg logit contributions
ist-0.255
o-0.249
enny-0.248
aron-0.246
os-0.243
Token them
Token ID606
Feature activation-2.84e+00

Pos logit contributions
 urge+0.172
 chores+0.171
 plan+0.159
 temptation+0.157
 idea+0.156
Neg logit contributions
o-0.490
ist-0.485
enny-0.470
aron-0.469
os-0.467
Token in
Token ID287
Feature activation+1.05e+01

Pos logit contributions
ist+0.096
enny+0.095
aron+0.094
os+0.094
o+0.093
Neg logit contributions
 going-0.045
 chores-0.044
 same-0.042
 rain-0.040
 temptation-0.039
Token the
Token ID262
Feature activation+4.62e+00

Pos logit contributions
 going+0.121
 urge+0.106
wered+0.100
 chores+0.100
 hurricane+0.097
Neg logit contributions
enny-0.369
ist-0.366
aron-0.362
oc-0.354
es-0.352
Token bin
Token ID9874
Feature activation-8.65e+00

Pos logit contributions
 going+0.076
 chores+0.070
 her+0.063
 work+0.058
 avail+0.055
Neg logit contributions
ney-0.147
o-0.145
oc-0.144
ist-0.141
aron-0.141
Token.
Token ID13
Feature activation-4.66e+00

Pos logit contributions
aron+0.254
enny+0.248
ne+0.246
os+0.245
jo+0.244
Neg logit contributions
 going-0.189
 chores-0.182
 same-0.176
 her-0.167
 rain-0.165
Token the
Token ID262
Feature activation+3.80e+00

Pos logit contributions
 chores+0.118
 going+0.103
 same+0.097
 hurricane+0.096
 rain+0.096
Neg logit contributions
aron-0.345
es-0.336
o-0.331
jo-0.330
os-0.329
Token jewelry
Token ID22634
Feature activation-9.99e+00

Pos logit contributions
 going+0.068
 chores+0.062
 her+0.055
 work+0.055
 temptation+0.051
Neg logit contributions
o-0.119
ney-0.118
oc-0.117
itty-0.117
os-0.114
Token box
Token ID3091
Feature activation-8.99e+00

Pos logit contributions
os+0.301
enny+0.300
aron+0.299
o+0.294
jo+0.293
Neg logit contributions
 going-0.206
 chores-0.200
 same-0.190
 rain-0.179
 her-0.177
Token and
Token ID290
Feature activation+5.68e+00

Pos logit contributions
enny+0.258
aron+0.258
os+0.258
ist+0.257
o+0.252
Neg logit contributions
 going-0.191
 chores-0.190
 same-0.178
 rain-0.172
 temptation-0.169
Token had
Token ID550
Feature activation+1.10e+01

Pos logit contributions
 chores+0.124
 same+0.114
 going+0.113
 temptation+0.110
 urge+0.105
Neg logit contributions
os-0.163
o-0.160
aron-0.156
enny-0.156
es-0.154
Token gotten
Token ID7891
Feature activation+1.43e+01

Pos logit contributions
 chores+0.163
 temptation+0.152
 urge+0.150
 going+0.149
 hurricane+0.140
Neg logit contributions
o-0.372
os-0.367
aron-0.357
enny-0.354
ne-0.352
Token in
Token ID287
Feature activation+1.08e+01

Pos logit contributions
 chores+0.242
 urge+0.216
 going+0.214
 same+0.213
 plan+0.212
Neg logit contributions
o-0.474
os-0.465
enny-0.447
ist-0.439
ne-0.438
Token through
Token ID832
Feature activation+9.64e+00

Pos logit contributions
 chores+0.135
 same+0.118
 urge+0.117
 going+0.111
 hurricane+0.111
Neg logit contributions
aron-0.386
es-0.370
enny-0.368
o-0.366
os-0.365
Token a
Token ID257
Feature activation+4.34e+00

Pos logit contributions
 chores+0.101
 going+0.098
 same+0.086
 urge+0.081
 plan+0.077
Neg logit contributions
os-0.376
es-0.372
enny-0.370
aron-0.370
jo-0.366
Token tiny
Token ID7009
Feature activation-3.90e-01

Pos logit contributions
 going+0.078
 chores+0.075
 same+0.068
 work+0.064
 her+0.064
Neg logit contributions
enny-0.140
aron-0.139
os-0.138
oc-0.138
jo-0.137
Token little
Token ID1310
Feature activation+4.87e-02

Pos logit contributions
os+0.013
enny+0.013
aron+0.013
ist+0.013
o+0.013
Neg logit contributions
 going-0.006
 chores-0.006
 same-0.006
 rain-0.006
 work-0.006
Token of
Token ID286
Feature activation+8.99e+00

Pos logit contributions
 going+0.039
 chores+0.038
 her+0.037
 same+0.036
 good+0.035
Neg logit contributions
aron-0.058
os-0.057
jo-0.056
o-0.056
oing-0.056
Token her
Token ID607
Feature activation-3.30e-01

Pos logit contributions
 chores+0.069
 going+0.065
 same+0.064
 urge+0.061
 plan+0.057
Neg logit contributions
ist-0.364
aron-0.355
os-0.354
enny-0.349
o-0.347
Token belt
Token ID10999
Feature activation-9.48e+00

Pos logit contributions
os+0.012
aron+0.012
enny+0.012
ist+0.012
o+0.011
Neg logit contributions
 going-0.005
 chores-0.004
 same-0.004
 temptation-0.004
 rain-0.004
Token and
Token ID290
Feature activation+5.87e+00

Pos logit contributions
os+0.271
aron+0.270
enny+0.270
o+0.265
ist+0.263
Neg logit contributions
 going-0.209
 chores-0.205
 same-0.194
 rain-0.185
 her-0.184
Token always
Token ID1464
Feature activation+9.17e+00

Pos logit contributions
 chores+0.087
 same+0.084
 going+0.084
 temptation+0.082
wered+0.082
Neg logit contributions
o-0.191
os-0.189
enny-0.188
ist-0.188
aron-0.186
Token kept
Token ID4030
Feature activation+1.43e+01

Pos logit contributions
 chores+0.190
 going+0.184
 same+0.176
 rain+0.171
 temptation+0.168
Neg logit contributions
aron-0.269
ist-0.269
o-0.267
os-0.266
enny-0.266
Token it
Token ID340
Feature activation-4.71e+00

Pos logit contributions
 chores+0.165
 same+0.153
 urge+0.152
 plan+0.146
 temptation+0.146
Neg logit contributions
o-0.522
ist-0.506
os-0.506
aron-0.504
enny-0.501
Token clean
Token ID3424
Feature activation+3.87e+00

Pos logit contributions
ist+0.156
enny+0.153
aron+0.152
os+0.152
o+0.150
Neg logit contributions
 going-0.076
 chores-0.076
 same-0.071
 rain-0.067
 temptation-0.067
Token.
Token ID13
Feature activation-4.66e+00

Pos logit contributions
 going+0.093
 chores+0.092
 same+0.087
 rain+0.084
 temptation+0.083
Neg logit contributions
aron-0.101
enny-0.100
ist-0.100
os-0.100
o-0.097
Token<|endoftext|>
Token ID50256
Feature activation-1.07e+01

Pos logit contributions
ist+0.156
aron+0.151
os+0.150
ne+0.148
oing+0.147
Neg logit contributions
 going-0.084
 her-0.081
 chores-0.080
 same-0.078
 rain-0.074
TokenOnce
Token ID7454
Feature activation-8.84e-01

Pos logit contributions
ist+0.428
oing+0.405
ppy+0.395
lets+0.392
aron+0.388
Neg logit contributions
 the-0.146
 going-0.142
 her-0.135
 big-0.126
 other-0.125
Token careful
Token ID8161
Feature activation+3.96e+00

Pos logit contributions
 chores+0.149
 going+0.148
 same+0.137
 work+0.130
 temptation+0.129
Neg logit contributions
aron-0.287
o-0.284
ist-0.282
enny-0.279
os-0.277
Token around
Token ID1088
Feature activation+3.99e+00

Pos logit contributions
 going+0.078
 chores+0.076
 same+0.072
 rain+0.068
 work+0.067
Neg logit contributions
os-0.122
enny-0.122
aron-0.122
ist-0.120
o-0.119
Token bees
Token ID17002
Feature activation-1.18e+01

Pos logit contributions
 chores+0.033
 going+0.032
 same+0.029
 temptation+0.024
 urge+0.023
Neg logit contributions
os-0.164
ist-0.163
enny-0.162
aron-0.162
ne-0.161
Token and
Token ID290
Feature activation+5.64e+00

Pos logit contributions
enny+0.359
aron+0.353
jo+0.347
os+0.343
ne+0.341
Neg logit contributions
 going-0.253
 chores-0.241
 same-0.226
 her-0.220
 rain-0.218
Token always
Token ID1464
Feature activation+9.08e+00

Pos logit contributions
 chores+0.091
 going+0.086
 same+0.085
 temptation+0.078
 top+0.077
Neg logit contributions
o-0.186
os-0.184
ist-0.180
aron-0.175
enny-0.174
Token kept
Token ID4030
Feature activation+1.42e+01

Pos logit contributions
 chores+0.150
 going+0.142
 same+0.136
 temptation+0.127
 rain+0.126
Neg logit contributions
o-0.304
ist-0.303
aron-0.301
os-0.300
ne-0.294
Token some
Token ID617
Feature activation+7.55e+00

Pos logit contributions
 chores+0.133
 same+0.117
 plan+0.115
 urge+0.114
 going+0.107
Neg logit contributions
o-0.560
ist-0.537
os-0.537
aron-0.534
ne-0.528
Token medicine
Token ID9007
Feature activation-3.80e+00

Pos logit contributions
 going+0.110
 chores+0.101
 same+0.097
 rain+0.086
 urge+0.084
Neg logit contributions
aron-0.265
ist-0.263
enny-0.260
ne-0.259
os-0.259
Token for
Token ID329
Feature activation+1.14e+01

Pos logit contributions
os+0.114
enny+0.114
aron+0.113
jo+0.112
o+0.112
Neg logit contributions
 going-0.079
 chores-0.077
 same-0.072
 rain-0.068
 her-0.068
Token her
Token ID607
Feature activation-5.91e-03

Pos logit contributions
 same+0.103
 urge+0.095
 going+0.087
 chores+0.084
 plan+0.082
Neg logit contributions
ist-0.441
aron-0.436
es-0.435
enny-0.432
os-0.431
Token lip
Token ID10645
Feature activation-7.43e+00

Pos logit contributions
os+0.000
enny+0.000
aron+0.000
ist+0.000
o+0.000
Neg logit contributions
 going-0.000
 chores-0.000
 same-0.000
 rain-0.000
 work-0.000
Token the
Token ID262
Feature activation+1.88e+00

Pos logit contributions
 chores+0.039
 going+0.036
 same+0.034
 temptation+0.032
 rain+0.030
Neg logit contributions
os-0.096
o-0.094
enny-0.094
ist-0.092
ne-0.091
Token magical
Token ID10883
Feature activation-3.12e+00

Pos logit contributions
 going+0.043
 chores+0.042
 their+0.040
 her+0.039
wered+0.037
Neg logit contributions
pillar-0.047
oc-0.045
o-0.044
itty-0.043
ive-0.043
Token force
Token ID2700
Feature activation-1.46e+00

Pos logit contributions
aron+0.094
ist+0.091
o+0.090
ne+0.090
enny+0.090
Neg logit contributions
 chores-0.065
 going-0.064
 same-0.057
 rain-0.054
 temptation-0.054
Token just
Token ID655
Feature activation+4.61e+00

Pos logit contributions
os+0.040
enny+0.040
aron+0.040
ist+0.039
o+0.039
Neg logit contributions
 going-0.033
 chores-0.033
 same-0.031
 rain-0.030
 work-0.029
Token kept
Token ID4030
Feature activation+1.35e+01

Pos logit contributions
 chores+0.103
 going+0.094
 same+0.091
 temptation+0.088
 rain+0.087
Neg logit contributions
ist-0.134
os-0.130
aron-0.128
o-0.128
ne-0.126
Token getting
Token ID1972
Feature activation+1.42e+01

Pos logit contributions
 chores+0.194
 same+0.189
 urge+0.170
 rain+0.166
wered+0.166
Neg logit contributions
o-0.449
ist-0.432
os-0.432
iled-0.426
aron-0.422
Token stronger
Token ID7387
Feature activation+3.04e+00

Pos logit contributions
 chores+0.252
 urge+0.238
 same+0.232
 plan+0.230
wered+0.226
Neg logit contributions
o-0.435
os-0.415
iled-0.391
enny-0.390
ne-0.387
Token!
Token ID0
Feature activation-4.46e+00

Pos logit contributions
 going+0.068
 chores+0.066
 same+0.062
 work+0.060
 rain+0.059
Neg logit contributions
os-0.087
enny-0.087
aron-0.085
o-0.085
ist-0.084
Token
Token ID220
Feature activation-5.45e+00

Pos logit contributions
ist+0.156
aron+0.153
os+0.148
jo+0.148
enny+0.148
Neg logit contributions
 going-0.070
 chores-0.068
 same-0.065
 her-0.062
 top-0.060
Token
Token ID198
Feature activation-3.84e+00

Pos logit contributions
ist+0.175
aron+0.174
jo+0.172
ne+0.170
oing+0.169
Neg logit contributions
 going-0.099
 chores-0.097
 same-0.090
 her-0.086
 good-0.084
Token
Token ID198
Feature activation-4.01e+00

Pos logit contributions
ist+0.131
aron+0.125
os+0.124
jo+0.123
umping+0.122
Neg logit contributions
 going-0.063
 chores-0.062
 other-0.058
 same-0.058
 good-0.057
Token very
Token ID845
Feature activation+6.79e+00

Pos logit contributions
 chores+0.177
 temptation+0.152
 same+0.146
 rain+0.141
 urge+0.138
Neg logit contributions
aron-0.355
o-0.347
ist-0.342
os-0.329
jo-0.328
Token special
Token ID2041
Feature activation+1.60e+00

Pos logit contributions
 chores+0.104
 going+0.095
 rain+0.084
 same+0.083
wered+0.079
Neg logit contributions
aron-0.237
enny-0.236
o-0.232
ist-0.228
ouse-0.226
Token and
Token ID290
Feature activation+5.53e+00

Pos logit contributions
 going+0.036
 chores+0.034
 same+0.033
 her+0.031
 rain+0.031
Neg logit contributions
os-0.046
enny-0.046
aron-0.046
ist-0.045
o-0.045
Token should
Token ID815
Feature activation+9.31e+00

Pos logit contributions
 chores+0.094
 going+0.083
 same+0.080
 temptation+0.080
wered+0.078
Neg logit contributions
o-0.180
os-0.180
enny-0.175
aron-0.174
ist-0.174
Token be
Token ID307
Feature activation+1.04e+01

Pos logit contributions
 going+0.140
 chores+0.137
 same+0.125
 temptation+0.120
 rain+0.115
Neg logit contributions
ist-0.309
aron-0.309
ne-0.302
jo-0.299
enny-0.297
Token kept
Token ID4030
Feature activation+1.42e+01

Pos logit contributions
 chores+0.189
 going+0.180
 same+0.170
 temptation+0.164
 rain+0.163
Neg logit contributions
aron-0.336
ist-0.331
o-0.330
os-0.329
enny-0.329
Token safe
Token ID3338
Feature activation+3.72e+00

Pos logit contributions
 chores+0.197
 plan+0.172
 same+0.171
 urge+0.171
 rain+0.164
Neg logit contributions
o-0.487
ist-0.477
os-0.475
aron-0.468
es-0.462
Token.
Token ID13
Feature activation-5.08e+00

Pos logit contributions
 going+0.087
 chores+0.086
 same+0.081
 rain+0.078
 temptation+0.077
Neg logit contributions
aron-0.100
os-0.100
enny-0.099
ist-0.098
o-0.097
Token
Token ID220
Feature activation-4.68e+00

Pos logit contributions
ist+0.179
aron+0.169
oing+0.169
os+0.166
ne+0.165
Neg logit contributions
 going-0.080
 her-0.079
 chores-0.076
 same-0.072
 the-0.071
Token
Token ID198
Feature activation-3.16e+00

Pos logit contributions
ist+0.151
aron+0.148
ne+0.146
oing+0.144
os+0.143
Neg logit contributions
 going-0.087
 chores-0.084
 her-0.079
 same-0.079
 good-0.073
Token
Token ID198
Feature activation-3.38e+00

Pos logit contributions
ist+0.109
os+0.102
aron+0.102
oing+0.100
umping+0.100
Neg logit contributions
 going-0.053
 chores-0.051
 her-0.051
 other-0.047
 same-0.047
Token longer
Token ID2392
Feature activation+5.03e+00

Pos logit contributions
 same+0.138
 urge+0.132
 going+0.131
 chores+0.129
 hurricane+0.116
Neg logit contributions
aron-0.387
ist-0.376
es-0.371
ne-0.369
oc-0.359
Token periods
Token ID9574
Feature activation-7.51e-01

Pos logit contributions
 going+0.112
 chores+0.110
 same+0.105
 rain+0.100
 temptation+0.099
Neg logit contributions
aron-0.141
ist-0.138
os-0.137
enny-0.136
ne-0.134
Token of
Token ID286
Feature activation+1.13e+01

Pos logit contributions
aron+0.024
oing+0.022
oc+0.022
ist+0.022
osaurus+0.022
Neg logit contributions
 going-0.014
 her-0.013
 same-0.013
 chores-0.013
 the-0.012
Token time
Token ID640
Feature activation+2.24e+00

Pos logit contributions
 going+0.159
 chores+0.148
 urge+0.143
 same+0.141
wered+0.135
Neg logit contributions
aron-0.359
ne-0.340
enny-0.337
ist-0.335
r-0.333
Token without
Token ID1231
Feature activation+1.14e+01

Pos logit contributions
 going+0.051
 chores+0.051
 same+0.048
 rain+0.046
 temptation+0.045
Neg logit contributions
aron-0.061
enny-0.061
os-0.061
ist-0.060
o-0.060
Token getting
Token ID1972
Feature activation+1.42e+01

Pos logit contributions
 chores+0.140
 same+0.133
 top+0.126
 urge+0.123
wered+0.123
Neg logit contributions
aron-0.389
ist-0.377
ppy-0.374
o-0.371
ne-0.369
Token tired
Token ID10032
Feature activation+3.37e+00

Pos logit contributions
 chores+0.227
 urge+0.219
 going+0.216
 plan+0.215
 same+0.210
Neg logit contributions
o-0.455
os-0.440
ist-0.422
enny-0.421
ne-0.420
Token.
Token ID13
Feature activation-4.90e+00

Pos logit contributions
 going+0.080
 chores+0.079
 same+0.075
 rain+0.071
 temptation+0.071
Neg logit contributions
aron-0.089
os-0.088
enny-0.088
ist-0.087
ne-0.086
Token The
Token ID383
Feature activation+1.80e-01

Pos logit contributions
ist+0.166
aron+0.161
oing+0.158
os+0.158
umping+0.156
Neg logit contributions
 going-0.084
 chores-0.080
 her-0.080
 same-0.079
 rain-0.074
Token moral
Token ID6573
Feature activation-9.76e-01

Pos logit contributions
 going+0.003
 chores+0.002
 her+0.002
 same+0.002
 temptation+0.002
Neg logit contributions
ne-0.006
ist-0.006
aron-0.006
enny-0.006
itty-0.006
Token of
Token ID286
Feature activation+6.10e+00

Pos logit contributions
aron+0.028
os+0.028
ist+0.028
enny+0.028
oing+0.027
Neg logit contributions
 going-0.021
 chores-0.021
 same-0.020
 rain-0.019
 her-0.019
Token hands
Token ID2832
Feature activation-3.10e+00

Pos logit contributions
 going+0.014
 chores+0.013
 same+0.012
 temptation+0.012
 rain+0.011
Neg logit contributions
aron-0.026
enny-0.025
ist-0.025
os-0.025
jo-0.025
Token and
Token ID290
Feature activation+5.21e+00

Pos logit contributions
os+0.085
jo+0.085
enny+0.085
aron+0.085
ist+0.084
Neg logit contributions
 going-0.071
 chores-0.068
 her-0.066
 same-0.065
 rain-0.063
Token knees
Token ID14475
Feature activation-6.33e-02

Pos logit contributions
 chores+0.107
 going+0.106
 same+0.103
 temptation+0.101
 urge+0.093
Neg logit contributions
os-0.151
enny-0.151
o-0.151
aron-0.149
ist-0.147
Token and
Token ID290
Feature activation+5.94e+00

Pos logit contributions
os+0.002
ist+0.002
aron+0.002
enny+0.002
o+0.002
Neg logit contributions
 chores-0.001
 going-0.001
 same-0.001
 temptation-0.001
 rain-0.001
Token began
Token ID2540
Feature activation+8.73e+00

Pos logit contributions
 chores+0.120
 going+0.119
 same+0.114
 temptation+0.110
 urge+0.103
Neg logit contributions
os-0.176
o-0.175
aron-0.175
enny-0.175
ist-0.172
Token getting
Token ID1972
Feature activation+1.42e+01

Pos logit contributions
 same+0.188
 chores+0.180
 temptation+0.179
 urge+0.177
wered+0.172
Neg logit contributions
o-0.240
os-0.233
enny-0.228
oing-0.224
umping-0.221
Token all
Token ID477
Feature activation+6.59e+00

Pos logit contributions
 urge+0.224
 temptation+0.221
 same+0.214
 plan+0.211
wered+0.203
Neg logit contributions
o-0.449
os-0.435
enny-0.430
as-0.401
umping-0.399
Token the
Token ID262
Feature activation+3.41e+00

Pos logit contributions
 chores+0.107
 going+0.103
 same+0.096
 temptation+0.095
 urge+0.091
Neg logit contributions
os-0.224
o-0.222
enny-0.216
aron-0.212
umping-0.212
Token stuff
Token ID3404
Feature activation-7.72e+00

Pos logit contributions
 going+0.056
 chores+0.052
 same+0.046
wered+0.046
 temptation+0.045
Neg logit contributions
aron-0.114
ist-0.111
enny-0.109
os-0.109
o-0.109
Token off
Token ID572
Feature activation+5.22e+00

Pos logit contributions
os+0.239
enny+0.235
ist+0.234
aron+0.232
o+0.232
Neg logit contributions
 chores-0.146
 going-0.145
 same-0.139
 temptation-0.132
 rain-0.129
Token the
Token ID262
Feature activation+2.80e+00

Pos logit contributions
 chores+0.128
 going+0.125
 same+0.120
 temptation+0.117
 urge+0.116
Neg logit contributions
os-0.133
aron-0.130
enny-0.129
ist-0.128
o-0.126
Token by
Token ID416
Feature activation+5.19e+00

Pos logit contributions
enny+0.113
aron+0.113
os+0.112
ist+0.110
jo+0.110
Neg logit contributions
 going-0.089
 chores-0.088
 same-0.083
 rain-0.079
 her-0.078
Token eating
Token ID6600
Feature activation+7.84e+00

Pos logit contributions
 going+0.073
 chores+0.072
 same+0.071
 rain+0.063
 urge+0.062
Neg logit contributions
aron-0.177
ist-0.173
o-0.172
os-0.170
oc-0.169
Token healthy
Token ID5448
Feature activation+2.27e+00

Pos logit contributions
 going+0.137
 chores+0.126
 same+0.125
 urge+0.121
 temptation+0.119
Neg logit contributions
aron-0.253
os-0.243
ist-0.243
enny-0.239
o-0.239
Token food
Token ID2057
Feature activation-6.12e+00

Pos logit contributions
 going+0.036
 chores+0.035
 same+0.032
 rain+0.030
 temptation+0.029
Neg logit contributions
aron-0.079
os-0.078
enny-0.078
ist-0.076
o-0.076
Token and
Token ID290
Feature activation+6.32e+00

Pos logit contributions
os+0.167
aron+0.167
enny+0.167
ist+0.165
o+0.163
Neg logit contributions
 going-0.140
 chores-0.138
 same-0.131
 rain-0.126
 temptation-0.124
Token getting
Token ID1972
Feature activation+1.42e+01

Pos logit contributions
 chores+0.087
 same+0.084
wered+0.081
 top+0.080
 going+0.078
Neg logit contributions
o-0.209
os-0.204
aron-0.202
ist-0.200
ne-0.195
Token exercise
Token ID5517
Feature activation+1.32e+00

Pos logit contributions
 urge+0.208
 plan+0.206
wered+0.202
 going+0.199
 idea+0.195
Neg logit contributions
o-0.462
os-0.438
enny-0.426
aron-0.420
ppy-0.419
Token.
Token ID13
Feature activation-5.33e+00

Pos logit contributions
 going+0.030
 chores+0.029
 same+0.028
 temptation+0.027
 urge+0.027
Neg logit contributions
aron-0.035
os-0.035
ist-0.035
enny-0.035
o-0.034
Token
Token ID220
Feature activation-4.97e+00

Pos logit contributions
ist+0.190
os+0.185
aron+0.185
ne+0.182
jo+0.182
Neg logit contributions
 going-0.083
 chores-0.080
 same-0.078
 her-0.077
 rain-0.074
Token
Token ID198
Feature activation-3.69e+00

Pos logit contributions
ist+0.171
aron+0.166
ne+0.164
jo+0.164
oing+0.162
Neg logit contributions
 going-0.081
 chores-0.079
 her-0.074
 same-0.073
 other-0.068
Token
Token ID198
Feature activation-3.91e+00

Pos logit contributions
ist+0.128
os+0.123
aron+0.122
jo+0.120
ne+0.119
Neg logit contributions
 going-0.062
 her-0.059
 chores-0.058
 same-0.057
 good-0.055
Token to
Token ID284
Feature activation+9.14e+00

Pos logit contributions
 chores+0.157
 going+0.147
 same+0.142
 rain+0.139
 urge+0.139
Neg logit contributions
o-0.211
enny-0.209
os-0.208
ne-0.203
jo-0.196
Token save
Token ID3613
Feature activation+1.00e+01

Pos logit contributions
 going+0.154
 chores+0.149
 same+0.141
 temptation+0.131
 rain+0.128
Neg logit contributions
ist-0.301
aron-0.295
ne-0.291
enny-0.289
os-0.286
Token her
Token ID607
Feature activation+5.52e-01

Pos logit contributions
 going+0.086
 chores+0.078
 same+0.068
 urge+0.065
 rain+0.062
Neg logit contributions
enny-0.407
os-0.407
aron-0.405
ist-0.405
ne-0.400
Token diary
Token ID26339
Feature activation-1.00e+01

Pos logit contributions
 going+0.010
 chores+0.009
 same+0.009
 temptation+0.009
 rain+0.009
Neg logit contributions
enny-0.017
ist-0.017
os-0.017
jo-0.017
ne-0.017
Token from
Token ID422
Feature activation+1.21e+01

Pos logit contributions
os+0.293
enny+0.292
aron+0.292
ist+0.288
o+0.286
Neg logit contributions
 going-0.212
 chores-0.208
 same-0.196
 rain-0.188
 work-0.184
Token getting
Token ID1972
Feature activation+1.42e+01

Pos logit contributions
 chores+0.144
 going+0.127
 urge+0.124
 plan+0.121
 same+0.119
Neg logit contributions
ist-0.429
os-0.423
o-0.422
ne-0.420
aron-0.418
Token messed
Token ID32621
Feature activation+4.74e+00

Pos logit contributions
 urge+0.206
 plan+0.206
 chores+0.205
wered+0.196
 idea+0.191
Neg logit contributions
o-0.445
enny-0.441
os-0.437
ne-0.417
es-0.416
Token up
Token ID510
Feature activation+7.81e+00

Pos logit contributions
 going+0.055
 chores+0.053
 same+0.049
 her+0.044
 rain+0.043
Neg logit contributions
os-0.185
enny-0.185
aron-0.184
o-0.182
ist-0.182
Token.
Token ID13
Feature activation-5.16e+00

Pos logit contributions
 going+0.152
 chores+0.149
 same+0.145
 urge+0.136
 temptation+0.134
Neg logit contributions
aron-0.231
ist-0.228
enny-0.227
os-0.227
ne-0.221
Token From
Token ID3574
Feature activation+5.64e+00

Pos logit contributions
ist+0.176
os+0.170
aron+0.169
oing+0.168
ne+0.165
Neg logit contributions
 going-0.088
 her-0.085
 chores-0.083
 same-0.083
 rain-0.077
Token that
Token ID326
Feature activation+1.20e+00

Pos logit contributions
 chores+0.099
 going+0.097
 same+0.090
 rain+0.085
 temptation+0.085
Neg logit contributions
aron-0.184
enny-0.184
os-0.183
ist-0.183
o-0.182
Token.
Token ID13
Feature activation-5.77e+00

Pos logit contributions
enny+0.042
os+0.042
aron+0.042
ist+0.041
jo+0.041
Neg logit contributions
 going-0.031
 chores-0.030
 same-0.028
 her-0.028
 work-0.028
Token
Token ID198
Feature activation-4.04e+00

Pos logit contributions
ist+0.188
os+0.186
enny+0.184
aron+0.183
jo+0.182
Neg logit contributions
 going-0.106
 chores-0.100
 same-0.096
 her-0.093
 rain-0.091
Token
Token ID198
Feature activation-4.11e+00

Pos logit contributions
ist+0.142
os+0.135
aron+0.133
jo+0.133
umping+0.132
Neg logit contributions
 going-0.064
 her-0.064
 chores-0.063
 good-0.061
 big-0.060
TokenThe
Token ID464
Feature activation-6.39e-01

Pos logit contributions
os+0.124
ist+0.124
jo+0.123
umping+0.116
enny+0.115
Neg logit contributions
 good-0.084
 her-0.082
 chores-0.081
 other-0.081
 same-0.080
Token great
Token ID1049
Feature activation-4.39e-01

Pos logit contributions
enny+0.018
ne+0.018
os+0.017
aron+0.017
o+0.017
Neg logit contributions
 going-0.014
 chores-0.014
 same-0.012
 her-0.012
 temptation-0.012
Token Zen
Token ID14760
Feature activation-2.12e+01

Pos logit contributions
os+0.013
enny+0.012
ne+0.012
aron+0.012
o+0.012
Neg logit contributions
 going-0.009
 chores-0.009
 same-0.008
 temptation-0.008
 rain-0.008
Tokeno
Token ID78
Feature activation-1.28e+01

Pos logit contributions
umping+0.340
aron+0.323
enny+0.311
ud+0.298
ppings+0.293
Neg logit contributions
 going-0.706
 the-0.694
 her-0.687
 help-0.682
 go-0.677
Token hugs
Token ID40657
Feature activation+8.08e-01

Pos logit contributions
enny+0.414
jo+0.402
ark+0.394
oing+0.392
aron+0.389
Neg logit contributions
 going-0.240
 work-0.227
 same-0.227
 start-0.221
 rain-0.221
Token Ben
Token ID3932
Feature activation-1.47e+01

Pos logit contributions
 going+0.009
 chores+0.008
 same+0.007
 her+0.007
 rain+0.007
Neg logit contributions
os-0.033
ist-0.032
enny-0.032
aron-0.032
jo-0.032
Token and
Token ID290
Feature activation+3.25e+00

Pos logit contributions
enny+0.495
os+0.482
ist+0.481
jo+0.473
ark+0.468
Neg logit contributions
 her-0.268
 going-0.260
 good-0.249
 chores-0.249
 the-0.247
Token says
Token ID1139
Feature activation+4.53e-02

Pos logit contributions
 chores+0.076
 going+0.074
 same+0.069
 temptation+0.069
 rain+0.065
Neg logit contributions
os-0.083
aron-0.083
o-0.083
ne-0.081
enny-0.080
Token saw
Token ID2497
Feature activation+6.96e+00

Pos logit contributions
enny+0.252
os+0.252
aron+0.250
o+0.246
ist+0.246
Neg logit contributions
 going-0.206
 chores-0.204
 same-0.193
 rain-0.185
 temptation-0.183
Token a
Token ID257
Feature activation+1.62e+00

Pos logit contributions
 temptation+0.089
 urge+0.083
 chores+0.083
 going+0.083
 plan+0.076
Neg logit contributions
o-0.252
os-0.243
aron-0.239
enny-0.239
es-0.232
Token big
Token ID1263
Feature activation-5.56e-01

Pos logit contributions
 going+0.019
 chores+0.019
 temptation+0.016
 same+0.016
 her+0.015
Neg logit contributions
enny-0.060
aron-0.060
o-0.059
ist-0.059
ne-0.059
Token box
Token ID3091
Feature activation-9.93e+00

Pos logit contributions
aron+0.017
oc+0.016
ist+0.016
o+0.016
enny+0.016
Neg logit contributions
 chores-0.011
 going-0.010
 same-0.010
 temptation-0.009
 her-0.009
Token of
Token ID286
Feature activation+8.01e+00

Pos logit contributions
aron+0.270
os+0.270
ist+0.269
enny+0.269
o+0.267
Neg logit contributions
 chores-0.227
 going-0.220
 same-0.212
 rain-0.204
 temptation-0.203
Token Leg
Token ID3564
Feature activation-2.08e+01

Pos logit contributions
 going+0.143
 chores+0.128
 temptation+0.124
 same+0.123
 urge+0.122
Neg logit contributions
aron-0.237
oc-0.231
os-0.230
enny-0.229
itty-0.228
Tokenos
Token ID418
Feature activation-7.15e+00

Pos logit contributions
aron+0.346
pillar+0.304
ist+0.293
oing+0.291
enny+0.288
Neg logit contributions
 going-0.758
 work-0.656
 start-0.642
 her-0.642
 go-0.641
Token that
Token ID326
Feature activation+2.11e+00

Pos logit contributions
enny+0.209
aron+0.205
os+0.200
ist+0.199
jo+0.198
Neg logit contributions
 going-0.164
 chores-0.156
 rain-0.144
 same-0.143
 work-0.142
Token he
Token ID339
Feature activation-4.45e+00

Pos logit contributions
 chores+0.048
 temptation+0.045
 going+0.044
 same+0.042
 urge+0.041
Neg logit contributions
os-0.059
o-0.059
enny-0.055
jo-0.055
aron-0.054
Token really
Token ID1107
Feature activation+4.13e+00

Pos logit contributions
aron+0.137
os+0.136
enny+0.135
ist+0.135
oing+0.133
Neg logit contributions
 going-0.092
 chores-0.086
 same-0.083
 work-0.079
 rain-0.079
Token wanted
Token ID2227
Feature activation+7.45e+00

Pos logit contributions
 going+0.088
 chores+0.085
 same+0.081
 rain+0.078
 work+0.077
Neg logit contributions
enny-0.122
os-0.122
aron-0.120
ist-0.119
o-0.118
Token,
Token ID11
Feature activation+1.08e+00

Pos logit contributions
aron+0.179
enny+0.178
ist+0.175
jo+0.174
os+0.173
Neg logit contributions
 going-0.104
 chores-0.094
 same-0.091
 rain-0.089
 work-0.088
Token dinner
Token ID8073
Feature activation-6.67e+00

Pos logit contributions
 chores+0.022
 going+0.018
 same+0.018
 temptation+0.018
 work+0.018
Neg logit contributions
aron-0.028
oc-0.028
o-0.028
enny-0.028
ne-0.027
Token time
Token ID640
Feature activation+4.00e-01

Pos logit contributions
enny+0.189
os+0.188
aron+0.186
ist+0.184
jo+0.183
Neg logit contributions
 going-0.150
 chores-0.147
 same-0.141
 work-0.134
 rain-0.133
Token!"
Token ID2474
Feature activation-6.59e+00

Pos logit contributions
 chores+0.011
 going+0.011
 same+0.010
 rain+0.010
 temptation+0.010
Neg logit contributions
aron-0.009
os-0.009
enny-0.009
o-0.009
ist-0.009
Token exclaimed
Token ID33552
Feature activation-1.64e+00

Pos logit contributions
enny+0.163
ne+0.163
aron+0.160
o+0.159
oc+0.151
Neg logit contributions
 chores-0.166
 going-0.159
 temptation-0.143
 same-0.140
 work-0.137
Token Nem
Token ID22547
Feature activation-1.92e+01

Pos logit contributions
enny+0.041
o+0.040
aron+0.038
ne+0.038
os+0.038
Neg logit contributions
 chores-0.041
 going-0.038
 same-0.038
 temptation-0.038
 plan-0.036
Tokeno
Token ID78
Feature activation-1.28e+01

Pos logit contributions
ist+0.310
aron+0.263
ne+0.243
lier+0.241
ol+0.240
Neg logit contributions
 big-0.609
 not-0.604
 good-0.578
 long-0.576
 mom-0.576
Token.
Token ID13
Feature activation-6.60e+00

Pos logit contributions
enny+0.413
itty+0.405
ist+0.403
aron+0.402
os+0.402
Neg logit contributions
 going-0.225
 her-0.217
 help-0.204
 good-0.200
 the-0.199
Token
Token ID198
Feature activation-4.51e+00

Pos logit contributions
enny+0.229
aron+0.228
os+0.227
o+0.225
ist+0.221
Neg logit contributions
 chores-0.106
 going-0.106
 same-0.095
 rain-0.089
 work-0.089
Token
Token ID198
Feature activation-4.56e+00

Pos logit contributions
ist+0.155
aron+0.151
os+0.149
umping+0.148
oing+0.146
Neg logit contributions
 going-0.073
 same-0.071
 chores-0.070
 her-0.069
 other-0.068
TokenBut
Token ID1537
Feature activation-4.36e-01

Pos logit contributions
os+0.120
ist+0.117
aron+0.117
enny+0.116
jo+0.116
Neg logit contributions
 going-0.112
 chores-0.110
 same-0.107
 rain-0.103
 good-0.102
Token is
Token ID318
Feature activation+7.79e+00

Pos logit contributions
 going+0.035
 chores+0.035
 temptation+0.032
 same+0.032
 her+0.031
Neg logit contributions
os-0.045
aron-0.044
ne-0.044
o-0.044
ist-0.044
Token too
Token ID1165
Feature activation+1.52e+00

Pos logit contributions
 chores+0.116
 urge+0.109
 temptation+0.102
 going+0.099
 plan+0.099
Neg logit contributions
o-0.257
ist-0.245
ppy-0.241
aron-0.238
ne-0.233
Token hard
Token ID1327
Feature activation+4.01e+00

Pos logit contributions
 going+0.015
 chores+0.014
 same+0.012
 rain+0.011
 her+0.011
Neg logit contributions
os-0.061
aron-0.061
o-0.061
enny-0.061
ist-0.060
Token!"
Token ID2474
Feature activation-6.42e+00

Pos logit contributions
 chores+0.083
 going+0.079
 same+0.076
wered+0.076
 rain+0.074
Neg logit contributions
aron-0.114
enny-0.113
ist-0.113
os-0.113
o-0.112
Token said
Token ID531
Feature activation-5.54e-01

Pos logit contributions
aron+0.206
enny+0.205
os+0.202
o+0.201
ne+0.200
Neg logit contributions
 chores-0.119
 going-0.117
 same-0.109
 rain-0.102
 temptation-0.102
Token Nem
Token ID22547
Feature activation-1.92e+01

Pos logit contributions
enny+0.017
aron+0.017
o+0.017
os+0.017
ne+0.016
Neg logit contributions
 chores-0.011
 going-0.011
 same-0.010
 temptation-0.010
 rain-0.010
Tokeno
Token ID78
Feature activation-1.26e+01

Pos logit contributions
ist+0.299
aron+0.258
ne+0.228
jo+0.218
ol+0.212
Neg logit contributions
 not-0.627
 good-0.622
 big-0.616
 going-0.606
 long-0.602
Token.
Token ID13
Feature activation-6.49e+00

Pos logit contributions
jo+0.407
os+0.407
enny+0.405
itty+0.404
aron+0.402
Neg logit contributions
 going-0.227
 her-0.219
 good-0.207
 help-0.201
 the-0.199
Token
Token ID198
Feature activation-4.38e+00

Pos logit contributions
enny+0.229
aron+0.229
os+0.228
o+0.226
ist+0.224
Neg logit contributions
 going-0.098
 chores-0.097
 same-0.088
 rain-0.083
 work-0.081
Token
Token ID198
Feature activation-4.50e+00

Pos logit contributions
ist+0.153
aron+0.148
os+0.146
umping+0.144
oing+0.143
Neg logit contributions
 going-0.072
 chores-0.068
 other-0.065
 same-0.065
 big-0.064
Token"
Token ID1
Feature activation-4.70e+00

Pos logit contributions
os+0.121
ist+0.118
jo+0.117
aron+0.116
enny+0.115
Neg logit contributions
 going-0.110
 chores-0.108
 same-0.104
 rain-0.101
 good-0.100
Token was
Token ID373
Feature activation+8.16e+00

Pos logit contributions
aron+0.077
enny+0.077
ist+0.076
os+0.076
ne+0.075
Neg logit contributions
 going-0.060
 chores-0.057
 same-0.056
 her-0.053
 rain-0.053
Token a
Token ID257
Feature activation+7.85e-01

Pos logit contributions
 chores+0.077
 going+0.071
 temptation+0.062
 same+0.061
 urge+0.056
Neg logit contributions
o-0.332
enny-0.330
aron-0.327
os-0.327
ne-0.322
Token little
Token ID1310
Feature activation-3.16e+00

Pos logit contributions
 going+0.007
 chores+0.007
 same+0.006
 rain+0.006
 good+0.006
Neg logit contributions
os-0.033
aron-0.033
enny-0.033
ist-0.032
jo-0.032
Token fish
Token ID5916
Feature activation-1.51e+01

Pos logit contributions
os+0.113
aron+0.111
enny+0.111
ist+0.110
jo+0.109
Neg logit contributions
 same-0.048
 rain-0.048
 going-0.047
 good-0.047
 big-0.044
Token named
Token ID3706
Feature activation-2.44e+00

Pos logit contributions
enny+0.474
aron+0.471
ist+0.446
umping+0.440
ives+0.439
Neg logit contributions
 going-0.311
 her-0.290
 big-0.278
 other-0.277
 the-0.276
Token Nem
Token ID22547
Feature activation-1.92e+01

Pos logit contributions
os+0.077
enny+0.076
o+0.076
aron+0.076
ist+0.075
Neg logit contributions
 going-0.047
 chores-0.046
 same-0.043
 rain-0.041
 temptation-0.041
Tokeno
Token ID78
Feature activation-1.35e+01

Pos logit contributions
ist+0.286
ol+0.280
aron+0.271
enny+0.235
pillar+0.234
Neg logit contributions
 going-0.684
 the-0.635
 her-0.627
 big-0.625
 not-0.616
Token.
Token ID13
Feature activation-6.10e+00

Pos logit contributions
enny+0.475
aron+0.464
umping+0.452
ives+0.450
ist+0.446
Neg logit contributions
 going-0.236
 her-0.213
 the-0.211
 other-0.202
 chores-0.199
Token Nem
Token ID22547
Feature activation-1.85e+01

Pos logit contributions
o+0.175
aron+0.172
enny+0.171
os+0.169
ark+0.164
Neg logit contributions
 chores-0.136
 going-0.128
 same-0.120
 temptation-0.120
 work-0.119
Tokeno
Token ID78
Feature activation-1.27e+01

Pos logit contributions
ist+0.324
ol+0.317
enny+0.293
aron+0.288
paste+0.283
Neg logit contributions
 going-0.566
 big-0.564
 the-0.553
 not-0.546
 his-0.540
Token liked
Token ID8288
Feature activation+8.33e+00

Pos logit contributions
enny+0.458
aron+0.440
ist+0.438
oing+0.430
azz+0.429
Neg logit contributions
 going-0.216
 same-0.195
 the-0.193
 big-0.191
 good-0.189
Token was
Token ID373
Feature activation+8.21e+00

Pos logit contributions
aron+0.075
enny+0.074
os+0.074
ist+0.074
jo+0.072
Neg logit contributions
 going-0.061
 chores-0.059
 same-0.057
 rain-0.055
 her-0.054
Token a
Token ID257
Feature activation+8.06e-01

Pos logit contributions
 chores+0.081
 going+0.075
 temptation+0.068
 same+0.065
 urge+0.061
Neg logit contributions
o-0.329
enny-0.326
aron-0.326
os-0.322
ne-0.319
Token little
Token ID1310
Feature activation-3.04e+00

Pos logit contributions
 going+0.008
 chores+0.008
 same+0.007
 rain+0.006
 work+0.006
Neg logit contributions
os-0.033
enny-0.032
aron-0.032
ist-0.032
o-0.032
Token fish
Token ID5916
Feature activation-1.51e+01

Pos logit contributions
os+0.107
enny+0.105
aron+0.105
ist+0.104
jo+0.103
Neg logit contributions
 same-0.048
 going-0.048
 rain-0.047
 good-0.046
 chores-0.045
Token named
Token ID3706
Feature activation-2.42e+00

Pos logit contributions
enny+0.472
aron+0.464
ist+0.441
umping+0.436
azz+0.434
Neg logit contributions
 going-0.317
 her-0.292
 big-0.283
 other-0.281
 same-0.280
Token Nem
Token ID22547
Feature activation-1.92e+01

Pos logit contributions
os+0.075
enny+0.075
o+0.075
aron+0.075
ist+0.074
Neg logit contributions
 going-0.047
 chores-0.046
 same-0.043
 rain-0.041
 temptation-0.041
Tokeno
Token ID78
Feature activation-1.36e+01

Pos logit contributions
ist+0.284
ol+0.275
aron+0.266
enny+0.243
pillar+0.238
Neg logit contributions
 going-0.691
 the-0.644
 big-0.631
 her-0.627
 their-0.624
Token.
Token ID13
Feature activation-6.01e+00

Pos logit contributions
enny+0.481
aron+0.465
umping+0.454
ives+0.451
ist+0.450
Neg logit contributions
 going-0.241
 the-0.216
 her-0.214
 other-0.207
 same-0.204
Token Nem
Token ID22547
Feature activation-1.85e+01

Pos logit contributions
o+0.172
aron+0.167
enny+0.166
os+0.164
ark+0.159
Neg logit contributions
 chores-0.137
 going-0.127
 temptation-0.121
 same-0.120
 work-0.120
Tokeno
Token ID78
Feature activation-1.28e+01

Pos logit contributions
ist+0.322
ol+0.309
enny+0.294
aron+0.279
paste+0.276
Neg logit contributions
 going-0.577
 big-0.572
 the-0.563
 not-0.551
 a-0.548
Token loved
Token ID6151
Feature activation+7.86e+00

Pos logit contributions
enny+0.464
ist+0.442
aron+0.441
oing+0.435
azz+0.433
Neg logit contributions
 going-0.216
 same-0.195
 the-0.193
 big-0.191
 good-0.188
Token a
Token ID257
Feature activation+3.57e+00

Pos logit contributions
os+0.235
aron+0.234
enny+0.234
ist+0.232
o+0.232
Neg logit contributions
 going-0.073
 chores-0.070
 same-0.063
 rain-0.058
 temptation-0.056
Token new
Token ID649
Feature activation+1.59e+00

Pos logit contributions
 going+0.044
 chores+0.038
 her+0.033
 temptation+0.032
 same+0.031
Neg logit contributions
o-0.133
enny-0.133
oc-0.133
ist-0.130
ouse-0.129
Token tank
Token ID6873
Feature activation-1.07e+01

Pos logit contributions
 going+0.032
 chores+0.031
 same+0.028
 temptation+0.028
 her+0.028
Neg logit contributions
o-0.047
enny-0.047
os-0.047
aron-0.047
ist-0.047
Token out
Token ID503
Feature activation+3.36e+00

Pos logit contributions
os+0.298
enny+0.297
aron+0.296
ist+0.294
o+0.292
Neg logit contributions
 going-0.237
 chores-0.234
 same-0.220
 rain-0.212
 temptation-0.209
Token of
Token ID286
Feature activation+9.49e+00

Pos logit contributions
 going+0.083
 chores+0.083
 same+0.078
 rain+0.075
 temptation+0.074
Neg logit contributions
os-0.086
enny-0.085
ist-0.084
aron-0.084
o-0.083
Token Leg
Token ID3564
Feature activation-1.91e+01

Pos logit contributions
 chores+0.117
 going+0.113
 urge+0.090
 same+0.088
 temptation+0.086
Neg logit contributions
os-0.326
ist-0.323
o-0.321
enny-0.320
aron-0.320
Tokenos
Token ID418
Feature activation-5.97e+00

Pos logit contributions
aron+0.184
pillar+0.149
enny+0.146
oing+0.141
oc+0.125
Neg logit contributions
 going-0.801
 work-0.730
 first-0.723
 chores-0.719
 her-0.707
Token.
Token ID13
Feature activation-5.04e+00

Pos logit contributions
aron+0.179
enny+0.178
jo+0.170
ne+0.169
ark+0.168
Neg logit contributions
 going-0.127
 chores-0.125
 rain-0.117
 same-0.116
 work-0.116
Token
Token ID198
Feature activation-3.08e+00

Pos logit contributions
ist+0.172
aron+0.172
os+0.172
enny+0.171
ne+0.169
Neg logit contributions
 going-0.083
 chores-0.079
 same-0.075
 rain-0.072
 her-0.070
Token
Token ID198
Feature activation-3.35e+00

Pos logit contributions
ist+0.112
aron+0.107
os+0.105
ppy+0.105
oing+0.103
Neg logit contributions
 going-0.046
 chores-0.044
 her-0.042
 big-0.041
 the-0.041
TokenTim
Token ID14967
Feature activation-1.36e+01

Pos logit contributions
os+0.089
aron+0.087
ist+0.086
enny+0.084
jo+0.084
Neg logit contributions
 going-0.082
 chores-0.081
 same-0.077
 rain-0.076
 her-0.074
Token
Token ID198
Feature activation-3.87e+00

Pos logit contributions
os+0.173
enny+0.173
aron+0.172
ist+0.170
o+0.169
Neg logit contributions
 going-0.112
 chores-0.110
 same-0.103
 rain-0.099
 work-0.097
Token
Token ID198
Feature activation-4.13e+00

Pos logit contributions
ist+0.131
aron+0.131
os+0.125
umping+0.124
oing+0.124
Neg logit contributions
 going-0.065
 chores-0.063
 same-0.061
 her-0.059
 other-0.059
TokenOne
Token ID3198
Feature activation-1.44e+00

Pos logit contributions
enny+0.121
aron+0.121
o+0.120
os+0.120
ne+0.119
Neg logit contributions
 going-0.087
 chores-0.085
 same-0.079
 temptation-0.077
 work-0.076
Token day
Token ID1110
Feature activation-4.05e+00

Pos logit contributions
aron+0.061
enny+0.061
os+0.061
ist+0.060
jo+0.060
Neg logit contributions
 going-0.012
 chores-0.011
 same-0.010
 rain-0.009
 work-0.008
Token,
Token ID11
Feature activation+5.00e-01

Pos logit contributions
enny+0.120
aron+0.119
o+0.118
os+0.118
ne+0.117
Neg logit contributions
 chores-0.082
 going-0.081
 same-0.076
 temptation-0.073
 work-0.071
Token Nem
Token ID22547
Feature activation-1.91e+01

Pos logit contributions
 chores+0.012
 temptation+0.011
 plan+0.011
 going+0.011
 same+0.010
Neg logit contributions
o-0.013
enny-0.012
ne-0.012
aron-0.012
os-0.011
Tokeno
Token ID78
Feature activation-1.28e+01

Pos logit contributions
ist+0.293
aron+0.279
och+0.209
ol+0.205
owed+0.202
Neg logit contributions
 going-0.690
 big-0.665
 same-0.648
 go-0.635
 the-0.630
Token saw
Token ID2497
Feature activation+6.25e+00

Pos logit contributions
aron+0.421
ist+0.418
enny+0.407
oing+0.400
os+0.400
Neg logit contributions
 going-0.241
 same-0.218
 rain-0.218
 other-0.208
 grown-0.207
Token a
Token ID257
Feature activation+1.12e+00

Pos logit contributions
 chores+0.076
 going+0.071
 temptation+0.068
 urge+0.062
 work+0.060
Neg logit contributions
o-0.243
os-0.233
enny-0.233
ne-0.225
aron-0.225
Token big
Token ID1263
Feature activation-1.02e+00

Pos logit contributions
 chores+0.012
 going+0.012
 her+0.010
 temptation+0.010
 same+0.009
Neg logit contributions
enny-0.043
o-0.043
ne-0.043
oing-0.042
aron-0.042
Token,
Token ID11
Feature activation+7.95e-01

Pos logit contributions
o+0.028
aron+0.028
oc+0.028
ist+0.028
ne+0.028
Neg logit contributions
 chores-0.022
 going-0.020
 same-0.019
 temptation-0.019
 her-0.018
Token was
Token ID373
Feature activation+8.65e+00

Pos logit contributions
aron+0.066
enny+0.066
os+0.065
ist+0.065
o+0.064
Neg logit contributions
 going-0.054
 chores-0.053
 same-0.051
 rain-0.049
 her-0.048
Token a
Token ID257
Feature activation+1.22e+00

Pos logit contributions
 chores+0.100
 temptation+0.088
 going+0.087
 plan+0.081
 work+0.078
Neg logit contributions
o-0.334
enny-0.324
aron-0.324
umping-0.320
os-0.317
Token little
Token ID1310
Feature activation-2.54e+00

Pos logit contributions
 going+0.013
 chores+0.012
 same+0.011
 rain+0.010
 work+0.010
Neg logit contributions
os-0.048
enny-0.048
aron-0.048
ist-0.048
o-0.047
Token fish
Token ID5916
Feature activation-1.47e+01

Pos logit contributions
os+0.087
enny+0.086
aron+0.086
ist+0.085
jo+0.084
Neg logit contributions
 going-0.042
 same-0.041
 rain-0.040
 chores-0.040
 good-0.039
Token named
Token ID3706
Feature activation-2.14e+00

Pos logit contributions
enny+0.452
aron+0.447
ist+0.424
umping+0.417
jo+0.416
Neg logit contributions
 going-0.319
 her-0.291
 same-0.282
 big-0.278
 other-0.278
Token Nem
Token ID22547
Feature activation-1.90e+01

Pos logit contributions
os+0.068
enny+0.067
aron+0.067
o+0.067
ist+0.066
Neg logit contributions
 going-0.040
 chores-0.040
 same-0.037
 rain-0.035
 temptation-0.035
Tokeno
Token ID78
Feature activation-1.31e+01

Pos logit contributions
ist+0.303
ol+0.286
aron+0.285
enny+0.250
paste+0.246
Neg logit contributions
 going-0.669
 the-0.614
 big-0.604
 her-0.600
 a-0.595
Token.
Token ID13
Feature activation-5.80e+00

Pos logit contributions
enny+0.448
aron+0.438
umping+0.424
ist+0.422
os+0.417
Neg logit contributions
 going-0.248
 her-0.218
 chores-0.214
 the-0.213
 same-0.211
Token Nem
Token ID22547
Feature activation-1.83e+01

Pos logit contributions
o+0.166
enny+0.162
aron+0.162
os+0.160
ark+0.154
Neg logit contributions
 chores-0.130
 going-0.122
 temptation-0.115
 same-0.114
 work-0.114
Tokeno
Token ID78
Feature activation-1.24e+01

Pos logit contributions
ist+0.327
ol+0.310
enny+0.292
aron+0.284
paste+0.280
Neg logit contributions
 going-0.568
 big-0.561
 the-0.548
 not-0.539
 his-0.533
Token loved
Token ID6151
Feature activation+7.91e+00

Pos logit contributions
enny+0.438
aron+0.422
ist+0.419
oing+0.411
umping+0.409
Neg logit contributions
 going-0.224
 same-0.202
 the-0.193
 good-0.192
 big-0.192
Token It
Token ID632
Feature activation-5.38e+00

Pos logit contributions
os+0.168
enny+0.168
aron+0.168
ist+0.167
jo+0.164
Neg logit contributions
 going-0.114
 chores-0.111
 same-0.106
 rain-0.101
 her-0.098
Token might
Token ID1244
Feature activation+4.09e+00

Pos logit contributions
os+0.145
aron+0.143
o+0.143
enny+0.142
ist+0.142
Neg logit contributions
 going-0.126
 chores-0.125
 same-0.117
 rain-0.113
 temptation-0.113
Token be
Token ID307
Feature activation+8.77e+00

Pos logit contributions
 chores+0.064
 going+0.059
 same+0.053
 laundry+0.052
 temptation+0.052
Neg logit contributions
o-0.132
ist-0.129
ne-0.127
os-0.127
jo-0.127
Token spicy
Token ID26880
Feature activation-1.40e-01

Pos logit contributions
 chores+0.125
 going+0.113
 same+0.110
 temptation+0.109
 top+0.103
Neg logit contributions
o-0.311
aron-0.309
ist-0.306
os-0.303
ppy-0.297
Token."
Token ID526
Feature activation-7.20e+00

Pos logit contributions
aron+0.004
os+0.004
enny+0.004
ist+0.004
o+0.004
Neg logit contributions
 chores-0.003
 going-0.003
 same-0.003
 rain-0.003
 temptation-0.003
Token Nem
Token ID22547
Feature activation-1.90e+01

Pos logit contributions
enny+0.215
aron+0.214
os+0.213
o+0.211
ne+0.208
Neg logit contributions
 going-0.148
 chores-0.147
 same-0.137
 rain-0.130
 temptation-0.128
Tokeno
Token ID78
Feature activation-1.27e+01

Pos logit contributions
ist+0.327
plets+0.280
ppings+0.271
ol+0.258
aron+0.256
Neg logit contributions
 his-0.558
 going-0.553
 plan-0.553
 first-0.553
 same-0.551
Token thought
Token ID1807
Feature activation+2.35e-01

Pos logit contributions
enny+0.392
oing+0.381
aron+0.378
ist+0.376
jo+0.373
Neg logit contributions
 going-0.252
 chores-0.241
 same-0.240
 her-0.233
 work-0.231
Token for
Token ID329
Feature activation+9.38e+00

Pos logit contributions
 chores+0.005
 going+0.005
 same+0.004
 rain+0.004
 temptation+0.004
Neg logit contributions
enny-0.007
aron-0.007
os-0.007
ist-0.007
ne-0.007
Token a
Token ID257
Feature activation+1.08e+00

Pos logit contributions
 chores+0.093
 going+0.090
 same+0.076
 urge+0.072
 temptation+0.069
Neg logit contributions
aron-0.374
enny-0.369
ist-0.366
o-0.365
ne-0.364
Token moment
Token ID2589
Feature activation-3.56e+00

Pos logit contributions
 going+0.023
 chores+0.022
 same+0.021
 rain+0.020
 her+0.020
Neg logit contributions
enny-0.032
aron-0.032
o-0.031
os-0.031
ist-0.031
Token
Token ID198
Feature activation-4.25e+00

Pos logit contributions
ist+0.143
aron+0.140
ppy+0.137
os+0.135
umping+0.134
Neg logit contributions
 going-0.062
 her-0.060
 chores-0.058
 same-0.057
 big-0.056
TokenOne
Token ID3198
Feature activation-1.48e+00

Pos logit contributions
os+0.137
enny+0.137
aron+0.137
ist+0.135
o+0.134
Neg logit contributions
 going-0.079
 chores-0.077
 same-0.073
 rain-0.069
 her-0.067
Token day
Token ID1110
Feature activation-4.02e+00

Pos logit contributions
aron+0.065
os+0.064
enny+0.064
ist+0.064
jo+0.063
Neg logit contributions
 going-0.011
 chores-0.010
 same-0.009
 rain-0.008
 her-0.008
Token,
Token ID11
Feature activation+3.99e-01

Pos logit contributions
enny+0.123
os+0.123
aron+0.122
o+0.121
ist+0.120
Neg logit contributions
 chores-0.078
 going-0.078
 same-0.073
 temptation-0.069
 rain-0.068
Token the
Token ID262
Feature activation-6.62e-01

Pos logit contributions
 chores+0.006
 going+0.005
 temptation+0.005
 plan+0.004
 same+0.004
Neg logit contributions
o-0.013
ne-0.013
es-0.013
oc-0.013
aron-0.013
Token prince
Token ID19716
Feature activation-1.90e+01

Pos logit contributions
enny+0.016
ne+0.016
ppy+0.015
aron+0.015
ouse+0.015
Neg logit contributions
 chores-0.017
 going-0.016
 temptation-0.015
 work-0.014
wered-0.014
Token decided
Token ID3066
Feature activation+4.08e+00

Pos logit contributions
aron+0.622
enny+0.608
ist+0.606
os+0.603
oing+0.595
Neg logit contributions
 going-0.362
 same-0.321
 chores-0.316
 rain-0.314
 her-0.308
Token to
Token ID284
Feature activation+6.52e+00

Pos logit contributions
 chores+0.105
 temptation+0.093
 urge+0.092
 same+0.091
wered+0.089
Neg logit contributions
o-0.097
enny-0.096
ne-0.095
os-0.090
ark-0.089
Token take
Token ID1011
Feature activation+9.22e+00

Pos logit contributions
 going+0.107
 chores+0.101
 temptation+0.095
 same+0.093
 good+0.090
Neg logit contributions
ist-0.204
ne-0.204
r-0.202
os-0.202
oc-0.196
Token the
Token ID262
Feature activation+2.47e+00

Pos logit contributions
 chores+0.100
 going+0.098
 temptation+0.087
 plan+0.085
 urge+0.083
Neg logit contributions
ne-0.340
os-0.338
enny-0.338
es-0.337
o-0.336
Token jewel
Token ID16927
Feature activation-1.25e+01

Pos logit contributions
 chores+0.074
 going+0.069
wered+0.066
 temptation+0.065
 rain+0.064
Neg logit contributions
pillar-0.046
ne-0.043
rot-0.042
ouse-0.041
oc-0.040
Token!
Token ID0
Feature activation-4.47e+00

Pos logit contributions
 going+0.026
 chores+0.025
 same+0.024
 her+0.022
 rain+0.022
Neg logit contributions
aron-0.030
os-0.030
enny-0.030
ist-0.029
jo-0.029
Tokenâ
Token ID22940
Feature activation-5.43e+00

Pos logit contributions
ist+0.143
os+0.138
aron+0.137
jo+0.136
oing+0.135
Neg logit contributions
 going-0.085
 chores-0.080
 same-0.077
 her-0.074
 rain-0.073
Token€
Token ID26391
Feature activation-1.13e+01

Pos logit contributions
os+0.155
enny+0.154
as+0.147
jo+0.146
oing+0.146
Neg logit contributions
 going-0.123
 same-0.120
 chores-0.117
 temptation-0.116
 work-0.113
Token<|endoftext|>
Token ID50256
Feature activation-1.14e+01

Pos logit contributions
ist+0.366
enny+0.342
aron+0.338
ppy+0.331
ne+0.327
Neg logit contributions
 going-0.246
 chores-0.216
 the-0.206
 good-0.200
 not-0.200
TokenPa
Token ID28875
Feature activation-9.77e+00

Pos logit contributions
ist+0.394
aron+0.374
oing+0.373
jo+0.365
enny+0.363
Neg logit contributions
 going-0.209
 the-0.187
 same-0.184
 her-0.179
 other-0.178
Tokenolo
Token ID14057
Feature activation-1.90e+01

Pos logit contributions
os+0.200
aron+0.200
ist+0.197
jo+0.193
oing+0.188
Neg logit contributions
 going-0.300
 chores-0.289
 same-0.289
 her-0.281
 work-0.275
Token was
Token ID373
Feature activation+9.19e+00

Pos logit contributions
enny+0.568
aron+0.567
oing+0.557
ist+0.554
jo+0.545
Neg logit contributions
 going-0.400
 chores-0.365
 her-0.363
 work-0.362
 same-0.359
Token very
Token ID845
Feature activation+5.31e+00

Pos logit contributions
 chores+0.155
 temptation+0.147
 work+0.132
 rain+0.129
 same+0.128
Neg logit contributions
o-0.319
aron-0.299
os-0.299
umping-0.293
ist-0.289
Token excited
Token ID6568
Feature activation+1.99e+00

Pos logit contributions
 chores+0.108
 temptation+0.096
 going+0.092
 same+0.092
 avail+0.092
Neg logit contributions
o-0.161
os-0.152
enny-0.145
as-0.140
ouse-0.136
Token.
Token ID13
Feature activation-5.09e+00

Pos logit contributions
 going+0.049
 chores+0.049
 same+0.046
 rain+0.045
 temptation+0.044
Neg logit contributions
os-0.051
aron-0.050
enny-0.050
ist-0.050
o-0.049
Token He
Token ID679
Feature activation-4.20e+00

Pos logit contributions
enny+0.146
os+0.146
aron+0.146
o+0.143
ist+0.142
Neg logit contributions
 going-0.111
 chores-0.109
 same-0.103
 rain-0.099
 work-0.097
TokenThe
Token ID464
Feature activation-6.94e-02

Pos logit contributions
ist+0.102
aron+0.101
os+0.101
enny+0.099
jo+0.098
Neg logit contributions
 going-0.088
 chores-0.087
 same-0.081
 good-0.078
 rain-0.078
Token end
Token ID886
Feature activation-2.11e+00

Pos logit contributions
ne+0.002
pillar+0.002
jo+0.002
ist+0.002
aron+0.002
Neg logit contributions
 going-0.001
 chores-0.001
 her-0.001
 same-0.001
 temptation-0.001
Token.
Token ID13
Feature activation-5.87e+00

Pos logit contributions
enny+0.075
pp+0.072
oing+0.072
umping+0.072
ouse+0.071
Neg logit contributions
 going-0.035
 chores-0.032
 good-0.031
 work-0.031
 her-0.031
Token<|endoftext|>
Token ID50256
Feature activation-1.17e+01

Pos logit contributions
ist+0.204
aron+0.192
oing+0.191
ppy+0.191
enny+0.190
Neg logit contributions
 going-0.102
 chores-0.096
 same-0.089
 her-0.086
 good-0.085
TokenD
Token ID35
Feature activation-7.66e+00

Pos logit contributions
ist+0.423
aron+0.399
oing+0.394
ppy+0.387
jo+0.384
Neg logit contributions
 going-0.197
 the-0.179
 same-0.178
 her-0.176
 rain-0.170
Tokenennis
Token ID10679
Feature activation-1.90e+01

Pos logit contributions
aron+0.155
ist+0.140
oc+0.136
oing+0.134
os+0.129
Neg logit contributions
 chores-0.241
 going-0.240
 same-0.239
 work-0.230
 her-0.224
Token was
Token ID373
Feature activation+8.81e+00

Pos logit contributions
enny+0.566
oing+0.566
os+0.553
aron+0.549
ist+0.545
Neg logit contributions
 going-0.399
 same-0.369
 her-0.368
 chores-0.365
 work-0.364
Token feeling
Token ID4203
Feature activation+5.52e+00

Pos logit contributions
 chores+0.140
 temptation+0.135
 work+0.118
 money+0.112
 rain+0.111
Neg logit contributions
o-0.313
aron-0.288
os-0.287
umping-0.287
enny-0.280
Token very
Token ID845
Feature activation+5.96e+00

Pos logit contributions
 chores+0.082
 going+0.079
 temptation+0.074
 plan+0.073
 work+0.071
Neg logit contributions
o-0.179
enny-0.173
aron-0.172
os-0.172
ist-0.168
Token lonely
Token ID21757
Feature activation+1.88e-01

Pos logit contributions
 chores+0.109
 going+0.093
 top+0.093
 laundry+0.093
 same+0.092
Neg logit contributions
enny-0.168
os-0.167
o-0.167
ouse-0.162
ry-0.159
Token.
Token ID13
Feature activation-5.07e+00

Pos logit contributions
 going+0.004
 chores+0.004
 same+0.004
 her+0.004
 rain+0.004
Neg logit contributions
os-0.006
enny-0.005
aron-0.005
o-0.005
jo-0.005
TokenYou
Token ID1639
Feature activation-2.10e+00

Pos logit contributions
enny+0.126
aron+0.125
os+0.124
o+0.124
ne+0.122
Neg logit contributions
 chores-0.110
 going-0.108
 same-0.102
 work-0.098
 rain-0.098
Token're
Token ID821
Feature activation+6.76e+00

Pos logit contributions
os+0.055
enny+0.055
aron+0.054
ist+0.054
o+0.054
Neg logit contributions
 going-0.051
 chores-0.050
 same-0.047
 rain-0.046
 temptation-0.045
Token welcome
Token ID7062
Feature activation+8.21e-02

Pos logit contributions
 chores+0.096
 going+0.078
 same+0.075
 work+0.074
 temptation+0.074
Neg logit contributions
o-0.248
aron-0.239
enny-0.233
os-0.232
ppy-0.231
Token,"
Token ID553
Feature activation-5.65e+00

Pos logit contributions
 going+0.002
 chores+0.002
 same+0.001
 her+0.001
 rain+0.001
Neg logit contributions
os-0.003
enny-0.003
aron-0.003
ist-0.003
jo-0.002
Token said
Token ID531
Feature activation-7.86e-01

Pos logit contributions
aron+0.139
o+0.138
enny+0.137
ne+0.137
os+0.135
Neg logit contributions
 chores-0.145
 going-0.142
 same-0.134
 temptation-0.128
 rain-0.126
Token Nem
Token ID22547
Feature activation-1.90e+01

Pos logit contributions
o+0.022
enny+0.022
ne+0.022
aron+0.022
os+0.021
Neg logit contributions
 chores-0.018
 going-0.016
 same-0.016
 temptation-0.015
 plan-0.015
Tokeno
Token ID78
Feature activation-1.25e+01

Pos logit contributions
ist+0.313
aron+0.269
ud+0.249
ol+0.247
umping+0.234
Neg logit contributions
 big-0.566
 good-0.565
 first-0.564
 help-0.560
 not-0.558
Token.
Token ID13
Feature activation-6.74e+00

Pos logit contributions
enny+0.411
itty+0.411
os+0.408
ist+0.406
umping+0.406
Neg logit contributions
 her-0.199
 going-0.195
 the-0.189
 help-0.185
 good-0.177
Token "
Token ID366
Feature activation-2.38e+00

Pos logit contributions
ist+0.249
os+0.248
enny+0.245
jo+0.245
aron+0.244
Neg logit contributions
 going-0.094
 chores-0.083
 same-0.079
 rain-0.075
 her-0.075
TokenNow
Token ID3844
Feature activation-1.71e-01

Pos logit contributions
os+0.074
jo+0.073
ist+0.073
aron+0.072
enny+0.072
Neg logit contributions
 going-0.048
 same-0.044
 her-0.042
 chores-0.041
 rain-0.040
Token let
Token ID1309
Feature activation+4.48e+00

Pos logit contributions
ist+0.005
ne+0.005
aron+0.005
o+0.005
enny+0.005
Neg logit contributions
 chores-0.003
 going-0.003
 same-0.003
 temptation-0.003
wered-0.003
Token a
Token ID257
Feature activation+2.89e+00

Pos logit contributions
 going+0.122
 temptation+0.121
 urge+0.117
 chores+0.115
 same+0.110
Neg logit contributions
o-0.223
os-0.221
enny-0.216
as-0.211
ne-0.211
Token bad
Token ID2089
Feature activation+2.38e-01

Pos logit contributions
 going+0.035
wered+0.029
 her+0.028
 temptation+0.027
 chores+0.027
Neg logit contributions
o-0.101
enny-0.101
ouse-0.101
itty-0.099
oc-0.099
Token ending
Token ID7464
Feature activation+3.04e-01

Pos logit contributions
 going+0.003
 chores+0.003
 same+0.002
 her+0.002
 temptation+0.002
Neg logit contributions
aron-0.009
jo-0.009
enny-0.009
o-0.009
os-0.009
Token.
Token ID13
Feature activation-4.57e+00

Pos logit contributions
 going+0.006
 chores+0.006
 same+0.006
 her+0.006
 work+0.006
Neg logit contributions
enny-0.009
aron-0.009
os-0.009
jo-0.009
ist-0.009
Token<|endoftext|>
Token ID50256
Feature activation-1.14e+01

Pos logit contributions
ist+0.150
enny+0.145
os+0.145
aron+0.141
jo+0.141
Neg logit contributions
 going-0.083
 chores-0.082
 same-0.078
 good-0.076
 her-0.076
TokenLe
Token ID3123
Feature activation-1.89e+01

Pos logit contributions
ist+0.364
aron+0.354
jo+0.348
enny+0.344
oing+0.338
Neg logit contributions
 going-0.231
 chores-0.219
 same-0.212
 her-0.204
 rain-0.203
Tokeno
Token ID78
Feature activation-1.34e+01

Pos logit contributions
ist+0.276
enny+0.247
lic+0.229
ouse+0.216
plets+0.216
Neg logit contributions
 the-0.682
 her-0.656
 going-0.651
 good-0.645
 big-0.644
Token was
Token ID373
Feature activation+8.86e+00

Pos logit contributions
enny+0.389
aron+0.385
ist+0.383
os+0.383
jo+0.376
Neg logit contributions
 going-0.295
 chores-0.284
 same-0.273
 rain-0.261
 work-0.261
Token a
Token ID257
Feature activation+1.41e+00

Pos logit contributions
 temptation+0.172
 chores+0.163
 help+0.144
 rain+0.143
 avail+0.143
Neg logit contributions
o-0.290
aron-0.258
os-0.252
umping-0.249
ney-0.245
Token little
Token ID1310
Feature activation-1.70e+00

Pos logit contributions
 chores+0.018
 going+0.018
 same+0.015
 temptation+0.015
 her+0.014
Neg logit contributions
o-0.052
enny-0.052
aron-0.052
ne-0.051
os-0.051
Token lion
Token ID18744
Feature activation-1.33e+01

Pos logit contributions
enny+0.053
os+0.053
aron+0.053
ist+0.052
o+0.052
Neg logit contributions
 going-0.033
 chores-0.032
 same-0.030
 rain-0.029
 temptation-0.028
Token
Token ID198
Feature activation-3.39e+00

Pos logit contributions
os+0.159
enny+0.159
aron+0.159
ist+0.157
o+0.155
Neg logit contributions
 going-0.104
 chores-0.102
 same-0.096
 rain-0.092
 work-0.089
Token
Token ID198
Feature activation-3.72e+00

Pos logit contributions
ist+0.114
aron+0.113
os+0.109
umping+0.108
enny+0.107
Neg logit contributions
 going-0.059
 chores-0.057
 same-0.055
 her-0.054
 other-0.052
TokenOne
Token ID3198
Feature activation-1.20e+00

Pos logit contributions
enny+0.110
aron+0.110
o+0.109
os+0.109
ne+0.108
Neg logit contributions
 going-0.077
 chores-0.076
 same-0.070
 temptation-0.068
 work-0.067
Token day
Token ID1110
Feature activation-3.71e+00

Pos logit contributions
aron+0.051
enny+0.050
os+0.050
ist+0.050
jo+0.049
Neg logit contributions
 going-0.010
 chores-0.009
 same-0.008
 rain-0.007
 work-0.007
Token,
Token ID11
Feature activation+8.22e-01

Pos logit contributions
enny+0.109
o+0.108
aron+0.108
os+0.108
ne+0.107
Neg logit contributions
 chores-0.076
 going-0.074
 same-0.070
 temptation-0.067
 work-0.065
Token Nem
Token ID22547
Feature activation-1.89e+01

Pos logit contributions
 chores+0.020
 temptation+0.018
 plan+0.018
 going+0.018
 same+0.017
Neg logit contributions
o-0.021
enny-0.020
ne-0.020
aron-0.020
os-0.018
Tokeno
Token ID78
Feature activation-1.27e+01

Pos logit contributions
ist+0.290
aron+0.272
jo+0.210
ol+0.207
och+0.199
Neg logit contributions
 going-0.690
 big-0.666
 same-0.655
 go-0.639
 good-0.638
Token's
Token ID338
Feature activation+2.80e+00

Pos logit contributions
aron+0.416
ist+0.413
enny+0.404
os+0.398
oing+0.394
Neg logit contributions
 going-0.246
 same-0.225
 rain-0.221
 other-0.211
 chores-0.210
Token friend
Token ID1545
Feature activation-9.94e+00

Pos logit contributions
 going+0.055
 chores+0.053
 temptation+0.050
 same+0.049
 her+0.048
Neg logit contributions
o-0.085
enny-0.085
os-0.085
aron-0.082
ne-0.082
Token Tim
Token ID5045
Feature activation-1.36e+01

Pos logit contributions
aron+0.351
ist+0.349
os+0.342
oing+0.337
jo+0.336
Neg logit contributions
 going-0.159
 chores-0.144
 same-0.141
 her-0.140
 rain-0.138
Tokenmy
Token ID1820
Feature activation-9.41e+00

Pos logit contributions
aron+0.271
enny+0.263
ist+0.262
os+0.260
oing+0.252
Neg logit contributions
 going-0.433
 chores-0.414
 same-0.407
 her-0.394
 rain-0.393
Token "
Token ID366
Feature activation-1.27e+00

Pos logit contributions
 chores+0.027
 going+0.027
 same+0.025
 temptation+0.024
 work+0.024
Neg logit contributions
os-0.055
enny-0.054
aron-0.054
ist-0.054
o-0.053
TokenNo
Token ID2949
Feature activation+5.31e+00

Pos logit contributions
aron+0.032
enny+0.032
os+0.032
ist+0.031
jo+0.031
Neg logit contributions
 going-0.032
 chores-0.032
 same-0.031
 rain-0.029
 work-0.029
Token swimming
Token ID14899
Feature activation-5.22e-01

Pos logit contributions
 going+0.098
 chores+0.095
 same+0.090
 her+0.086
 rain+0.084
Neg logit contributions
aron-0.163
ne-0.161
os-0.161
ist-0.160
o-0.159
Token allowed
Token ID3142
Feature activation+6.18e+00

Pos logit contributions
enny+0.014
aron+0.014
os+0.014
ist+0.014
jo+0.014
Neg logit contributions
 going-0.012
 chores-0.012
 same-0.011
 rain-0.011
 good-0.011
Token."
Token ID526
Feature activation-7.51e+00

Pos logit contributions
 going+0.161
 chores+0.158
 same+0.151
 rain+0.145
 work+0.143
Neg logit contributions
enny-0.151
os-0.151
aron-0.151
ist-0.148
o-0.146
Token Nem
Token ID22547
Feature activation-1.89e+01

Pos logit contributions
aron+0.213
enny+0.205
o+0.200
ne+0.199
os+0.199
Neg logit contributions
 going-0.167
 chores-0.166
 same-0.148
 temptation-0.147
 work-0.144
Tokeno
Token ID78
Feature activation-1.21e+01

Pos logit contributions
ist+0.348
aron+0.300
ol+0.291
lier+0.282
och+0.282
Neg logit contributions
 not-0.532
 big-0.519
 going-0.488
 first-0.486
 a-0.471
Token was
Token ID373
Feature activation+8.51e+00

Pos logit contributions
enny+0.414
oing+0.400
jo+0.395
os+0.394
itty+0.394
Neg logit contributions
 going-0.195
 all-0.182
 not-0.182
 the-0.178
 big-0.175
Token sad
Token ID6507
Feature activation+1.44e+00

Pos logit contributions
 chores+0.162
 temptation+0.147
 same+0.144
 going+0.138
 work+0.136
Neg logit contributions
o-0.272
os-0.259
aron-0.255
ist-0.245
ne-0.244
Token because
Token ID780
Feature activation+9.77e-01

Pos logit contributions
 going+0.033
 chores+0.031
 same+0.030
 rain+0.029
 her+0.029
Neg logit contributions
os-0.041
aron-0.041
enny-0.041
ist-0.040
jo-0.040
Token he
Token ID339
Feature activation-4.64e+00

Pos logit contributions
 chores+0.018
 same+0.018
 going+0.017
 temptation+0.016
 urge+0.016
Neg logit contributions
os-0.030
o-0.029
enny-0.029
ne-0.028
jo-0.028
TokenYes
Token ID5297
Feature activation-2.40e+00

Pos logit contributions
enny+0.110
os+0.109
aron+0.109
ist+0.107
o+0.107
Neg logit contributions
 going-0.111
 chores-0.109
 same-0.104
 rain-0.100
 work-0.099
Token,
Token ID11
Feature activation+1.04e+00

Pos logit contributions
enny+0.076
aron+0.076
os+0.076
ist+0.075
o+0.075
Neg logit contributions
 going-0.045
 chores-0.045
 same-0.042
 rain-0.039
 work-0.038
Token please
Token ID3387
Feature activation+3.33e+00

Pos logit contributions
 chores+0.019
 going+0.018
 same+0.017
 temptation+0.016
 her+0.016
Neg logit contributions
enny-0.034
aron-0.033
ne-0.033
o-0.033
os-0.032
Token!"
Token ID2474
Feature activation-6.16e+00

Pos logit contributions
 going+0.085
 chores+0.081
 same+0.079
 work+0.079
 help+0.077
Neg logit contributions
enny-0.088
os-0.087
aron-0.085
jo-0.084
umping-0.084
Token said
Token ID531
Feature activation-2.33e-01

Pos logit contributions
aron+0.174
enny+0.172
o+0.170
os+0.170
ne+0.169
Neg logit contributions
 chores-0.138
 going-0.136
 same-0.128
 temptation-0.121
 rain-0.120
Token Nem
Token ID22547
Feature activation-1.89e+01

Pos logit contributions
enny+0.007
o+0.007
aron+0.007
os+0.007
ne+0.007
Neg logit contributions
 chores-0.005
 going-0.005
 same-0.005
 temptation-0.004
 rain-0.004
Tokeno
Token ID78
Feature activation-1.25e+01

Pos logit contributions
ist+0.301
aron+0.247
ud+0.236
ne+0.224
umping+0.224
Neg logit contributions
 help-0.609
 not-0.607
 big-0.604
 good-0.603
 going-0.594
Token happily
Token ID18177
Feature activation+1.02e-01

Pos logit contributions
enny+0.418
itty+0.414
os+0.409
jo+0.408
ist+0.406
Neg logit contributions
 going-0.208
 her-0.207
 help-0.188
 the-0.186
 good-0.182
Token.
Token ID13
Feature activation-5.82e+00

Pos logit contributions
 going+0.002
 her+0.002
 work+0.002
 help+0.002
 rain+0.002
Neg logit contributions
enny-0.003
jo-0.003
os-0.003
ist-0.003
aron-0.003
Token
Token ID198
Feature activation-3.88e+00

Pos logit contributions
os+0.212
enny+0.211
aron+0.211
ist+0.210
jo+0.207
Neg logit contributions
 going-0.083
 chores-0.079
 same-0.074
 rain-0.069
 her-0.066
Token
Token ID198
Feature activation-3.99e+00

Pos logit contributions
ist+0.144
aron+0.137
umping+0.135
os+0.134
oing+0.133
Neg logit contributions
 going-0.054
 big-0.053
 other-0.052
 chores-0.051
 good-0.050
Token."
Token ID526
Feature activation-8.30e+00

Pos logit contributions
 going+0.059
 chores+0.053
 same+0.052
 good+0.051
 her+0.050
Neg logit contributions
enny-0.081
aron-0.079
ist-0.079
os-0.078
jo-0.078
Token
Token ID220
Feature activation-6.43e+00

Pos logit contributions
ist+0.298
as+0.287
os+0.284
enny+0.283
ppy+0.282
Neg logit contributions
 her-0.133
 going-0.132
 chores-0.123
 the-0.119
 same-0.117
Token
Token ID198
Feature activation-4.78e+00

Pos logit contributions
ist+0.212
aron+0.208
ne+0.205
oing+0.205
enny+0.204
Neg logit contributions
 going-0.112
 chores-0.110
 her-0.103
 same-0.103
 other-0.097
Token
Token ID198
Feature activation-4.82e+00

Pos logit contributions
ist+0.175
os+0.167
ppy+0.165
enny+0.164
umping+0.163
Neg logit contributions
 going-0.072
 her-0.069
 chores-0.069
 the-0.066
 other-0.066
TokenThe
Token ID464
Feature activation-1.33e+00

Pos logit contributions
os+0.137
enny+0.134
ist+0.132
aron+0.130
o+0.130
Neg logit contributions
 going-0.112
 chores-0.112
 same-0.106
 rain-0.101
 her-0.101
Token prince
Token ID19716
Feature activation-1.89e+01

Pos logit contributions
enny+0.033
ne+0.032
pillar+0.031
ouse+0.031
ari+0.030
Neg logit contributions
 chores-0.027
 going-0.027
 her-0.025
 temptation-0.025
wered-0.024
Token went
Token ID1816
Feature activation+4.95e+00

Pos logit contributions
enny+0.593
oing+0.586
aron+0.579
os+0.577
ist+0.566
Neg logit contributions
 going-0.363
 same-0.334
 work-0.331
 chores-0.331
 her-0.327
Token to
Token ID284
Feature activation+6.88e+00

Pos logit contributions
 chores+0.100
 going+0.094
 temptation+0.094
 same+0.094
 urge+0.089
Neg logit contributions
o-0.147
os-0.144
enny-0.142
aron-0.142
ist-0.141
Token the
Token ID262
Feature activation+1.20e+00

Pos logit contributions
 chores+0.091
 temptation+0.088
 going+0.085
 same+0.077
 hurricane+0.075
Neg logit contributions
os-0.208
o-0.207
ne-0.202
oc-0.201
enny-0.199
Token battlefield
Token ID13480
Feature activation-9.38e+00

Pos logit contributions
 chores+0.020
 going+0.020
 temptation+0.020
 her+0.018
 urge+0.018
Neg logit contributions
ney-0.033
ne-0.033
o-0.033
oc-0.032
ray-0.032
Token with
Token ID351
Feature activation+1.03e+01

Pos logit contributions
enny+0.274
os+0.270
aron+0.268
ist+0.264
jo+0.262
Neg logit contributions
 going-0.204
 chores-0.193
 same-0.186
 her-0.184
 work-0.180
Token was
Token ID373
Feature activation+8.45e+00

Pos logit contributions
aron+0.073
enny+0.072
ist+0.072
os+0.072
ne+0.070
Neg logit contributions
 going-0.056
 chores-0.054
 same-0.053
 rain-0.050
 her-0.050
Token a
Token ID257
Feature activation+1.05e+00

Pos logit contributions
 chores+0.079
 going+0.074
 temptation+0.064
 same+0.063
 urge+0.058
Neg logit contributions
o-0.343
enny-0.342
aron-0.341
os-0.338
ne-0.334
Token little
Token ID1310
Feature activation-2.62e+00

Pos logit contributions
 going+0.010
 chores+0.010
 same+0.009
 rain+0.008
 good+0.008
Neg logit contributions
os-0.043
enny-0.043
aron-0.043
ist-0.043
jo-0.042
Token fish
Token ID5916
Feature activation-1.48e+01

Pos logit contributions
os+0.090
aron+0.089
enny+0.089
ist+0.088
jo+0.088
Neg logit contributions
 going-0.043
 same-0.042
 rain-0.042
 good-0.040
 chores-0.040
Token named
Token ID3706
Feature activation-2.17e+00

Pos logit contributions
enny+0.456
aron+0.453
ist+0.429
umping+0.422
jo+0.419
Neg logit contributions
 going-0.316
 her-0.288
 same-0.279
 big-0.278
 other-0.278
Token Nem
Token ID22547
Feature activation-1.88e+01

Pos logit contributions
os+0.068
enny+0.068
aron+0.068
o+0.067
ist+0.067
Neg logit contributions
 going-0.042
 chores-0.041
 same-0.038
 rain-0.036
 temptation-0.036
Tokeno
Token ID78
Feature activation-1.33e+01

Pos logit contributions
ist+0.296
ol+0.282
aron+0.277
enny+0.246
pillar+0.243
Neg logit contributions
 going-0.670
 the-0.622
 big-0.610
 not-0.602
 their-0.600
Token.
Token ID13
Feature activation-5.65e+00

Pos logit contributions
enny+0.466
aron+0.455
umping+0.441
ist+0.438
ives+0.437
Neg logit contributions
 going-0.237
 the-0.211
 her-0.207
 other-0.203
 same-0.198
Token Nem
Token ID22547
Feature activation-1.83e+01

Pos logit contributions
o+0.164
aron+0.161
enny+0.160
os+0.159
oc+0.153
Neg logit contributions
 chores-0.124
 going-0.117
 same-0.110
 temptation-0.109
 work-0.108
Tokeno
Token ID78
Feature activation-1.27e+01

Pos logit contributions
ist+0.328
ol+0.312
enny+0.296
aron+0.283
paste+0.281
Neg logit contributions
 going-0.563
 big-0.559
 the-0.548
 not-0.543
 good-0.533
Token loved
Token ID6151
Feature activation+8.00e+00

Pos logit contributions
enny+0.468
ist+0.446
aron+0.445
oing+0.439
azz+0.437
Neg logit contributions
 going-0.205
 the-0.186
 same-0.185
 big-0.184
 good-0.180
Token."
Token ID526
Feature activation-6.63e+00

Pos logit contributions
enny+0.080
aron+0.080
os+0.080
ist+0.079
o+0.078
Neg logit contributions
 going-0.026
 chores-0.025
 same-0.022
 rain-0.021
 work-0.020
Token
Token ID198
Feature activation-3.78e+00

Pos logit contributions
ist+0.213
os+0.212
enny+0.208
jo+0.206
umping+0.205
Neg logit contributions
 chores-0.123
 going-0.123
 same-0.117
 her-0.114
 good-0.114
Token
Token ID198
Feature activation-3.75e+00

Pos logit contributions
ist+0.134
umping+0.127
aron+0.127
os+0.127
enny+0.126
Neg logit contributions
 her-0.058
 going-0.058
 chores-0.057
 other-0.055
 same-0.055
TokenThey
Token ID2990
Feature activation-3.00e+00

Pos logit contributions
os+0.111
jo+0.108
enny+0.108
ist+0.106
umping+0.105
Neg logit contributions
 chores-0.079
 same-0.077
 good-0.076
 other-0.076
 going-0.075
Token knock
Token ID10643
Feature activation+1.15e+00

Pos logit contributions
os+0.110
ist+0.109
aron+0.108
enny+0.108
ne+0.107
Neg logit contributions
 going-0.039
 chores-0.038
 same-0.035
 temptation-0.034
 rain-0.032
Token on
Token ID319
Feature activation+1.04e+01

Pos logit contributions
 going+0.019
 chores+0.019
 same+0.017
 her+0.017
 work+0.016
Neg logit contributions
aron-0.039
enny-0.039
ist-0.039
os-0.038
oing-0.038
Token the
Token ID262
Feature activation+2.42e+00

Pos logit contributions
 going+0.046
 chores+0.044
 same+0.032
 rain+0.023
 her+0.022
Neg logit contributions
enny-0.480
aron-0.478
os-0.477
ist-0.474
o-0.471
Token window
Token ID4324
Feature activation-4.01e+00

Pos logit contributions
 going+0.033
 chores+0.031
 temptation+0.026
 same+0.026
 urge+0.025
Neg logit contributions
aron-0.088
os-0.087
enny-0.087
ne-0.086
o-0.086
Token.
Token ID13
Feature activation-4.46e+00

Pos logit contributions
enny+0.118
aron+0.116
ist+0.114
os+0.114
ne+0.113
Neg logit contributions
 going-0.086
 chores-0.082
 same-0.079
 her-0.078
 rain-0.077
Token A
Token ID317
Feature activation+6.19e-01

Pos logit contributions
aron+0.136
os+0.136
enny+0.135
o+0.133
ist+0.133
Neg logit contributions
 going-0.090
 chores-0.088
 same-0.082
 rain-0.078
 temptation-0.077
Token man
Token ID582
Feature activation-1.26e+01

Pos logit contributions
 going+0.009
 chores+0.009
 same+0.008
 temptation+0.008
 rain+0.008
Neg logit contributions
aron-0.022
enny-0.022
os-0.022
o-0.021
ist-0.021
Token down
Token ID866
Feature activation+5.31e+00

Pos logit contributions
 chores+0.178
 going+0.162
 temptation+0.153
 same+0.152
 plan+0.150
Neg logit contributions
aron-0.217
ist-0.216
os-0.215
o-0.212
jo-0.205
Token to
Token ID284
Feature activation+8.13e+00

Pos logit contributions
 chores+0.156
 going+0.140
 temptation+0.140
 same+0.140
 plan+0.136
Neg logit contributions
os-0.116
o-0.114
aron-0.110
ist-0.109
enny-0.108
Token the
Token ID262
Feature activation+2.56e+00

Pos logit contributions
 chores+0.122
 going+0.106
 same+0.096
 temptation+0.090
 plan+0.085
Neg logit contributions
ist-0.289
os-0.288
o-0.286
aron-0.285
ne-0.282
Token streets
Token ID6483
Feature activation-4.44e+00

Pos logit contributions
 chores+0.040
 going+0.036
 her+0.034
 temptation+0.031
 plan+0.029
Neg logit contributions
ne-0.086
ist-0.084
ney-0.083
aron-0.083
oing-0.080
Token and
Token ID290
Feature activation+5.77e+00

Pos logit contributions
os+0.123
aron+0.122
ist+0.121
enny+0.121
o+0.120
Neg logit contributions
 chores-0.100
 going-0.099
 same-0.093
 rain-0.090
 temptation-0.089
Token saw
Token ID2497
Feature activation+8.65e+00

Pos logit contributions
 chores+0.133
 temptation+0.119
 going+0.117
 same+0.116
 urge+0.110
Neg logit contributions
os-0.164
o-0.161
ist-0.155
ne-0.154
aron-0.153
Token an
Token ID281
Feature activation+3.36e+00

Pos logit contributions
 chores+0.116
 going+0.104
 urge+0.095
 temptation+0.094
 work+0.093
Neg logit contributions
os-0.323
o-0.315
es-0.311
ist-0.302
ne-0.301
Token old
Token ID1468
Feature activation-3.76e+00

Pos logit contributions
 chores+0.045
 going+0.040
 same+0.036
 temptation+0.035
 her+0.034
Neg logit contributions
aron-0.123
ist-0.121
ne-0.119
oc-0.118
ney-0.117
Token man
Token ID582
Feature activation-1.30e+01

Pos logit contributions
aron+0.112
ist+0.110
ne+0.110
jo+0.108
os+0.108
Neg logit contributions
 chores-0.075
 going-0.073
 same-0.067
 temptation-0.065
 her-0.062
Token.
Token ID13
Feature activation-5.44e+00

Pos logit contributions
enny+0.399
aron+0.397
os+0.396
ist+0.391
jo+0.386
Neg logit contributions
 going-0.260
 chores-0.251
 same-0.235
 rain-0.225
 work-0.222
Token
Token ID198
Feature activation-3.39e+00

Pos logit contributions
ist+0.194
aron+0.184
enny+0.184
ppy+0.182
oing+0.181
Neg logit contributions
 going-0.089
 chores-0.080
 same-0.079
 the-0.078
 her-0.077
Token complete
Token ID1844
Feature activation+5.78e+00

Pos logit contributions
 chores+0.128
 going+0.125
 temptation+0.117
 same+0.115
 sorry+0.113
Neg logit contributions
ist-0.266
aron-0.251
ne-0.251
os-0.248
r-0.248
Token the
Token ID262
Feature activation+2.81e+00

Pos logit contributions
 going+0.054
 urge+0.053
 temptation+0.050
 same+0.048
 chores+0.047
Neg logit contributions
ist-0.224
os-0.224
es-0.221
aron-0.217
enny-0.217
Token ceiling
Token ID13387
Feature activation-9.66e+00

Pos logit contributions
 going+0.057
 chores+0.050
 her+0.048
 sorry+0.047
 temptation+0.047
Neg logit contributions
ne-0.077
enny-0.076
ist-0.076
os-0.075
ney-0.075
Token.
Token ID13
Feature activation-5.66e+00

Pos logit contributions
enny+0.273
aron+0.273
os+0.268
jo+0.266
o+0.265
Neg logit contributions
 going-0.216
 chores-0.211
 same-0.200
 work-0.194
 rain-0.192
Token They
Token ID1119
Feature activation-1.89e+00

Pos logit contributions
os+0.173
aron+0.173
enny+0.173
ist+0.170
o+0.169
Neg logit contributions
 going-0.113
 chores-0.111
 same-0.104
 rain-0.100
 work-0.097
Token found
Token ID1043
Feature activation+6.45e+00

Pos logit contributions
os+0.054
enny+0.054
aron+0.053
ist+0.053
o+0.053
Neg logit contributions
 going-0.042
 chores-0.041
 same-0.039
 rain-0.037
 temptation-0.037
Token all
Token ID477
Feature activation+3.58e+00

Pos logit contributions
 going+0.069
 chores+0.069
 temptation+0.063
 urge+0.060
 rain+0.057
Neg logit contributions
os-0.252
es-0.245
ist-0.241
enny-0.240
o-0.240
Token the
Token ID262
Feature activation+2.47e+00

Pos logit contributions
 chores+0.078
 going+0.076
 same+0.071
 temptation+0.070
 rain+0.069
Neg logit contributions
os-0.104
o-0.102
enny-0.100
aron-0.099
ist-0.099
Token pieces
Token ID5207
Feature activation-5.87e+00

Pos logit contributions
 going+0.053
 chores+0.050
 temptation+0.046
 her+0.045
 rain+0.045
Neg logit contributions
ist-0.069
enny-0.068
ne-0.068
aron-0.067
os-0.067
Token and
Token ID290
Feature activation+4.72e+00

Pos logit contributions
os+0.161
enny+0.160
aron+0.159
ist+0.158
o+0.156
Neg logit contributions
 going-0.134
 chores-0.133
 same-0.126
 rain-0.121
 temptation-0.119
Token fit
Token ID4197
Feature activation+3.05e+00

Pos logit contributions
 chores+0.101
 going+0.099
 same+0.093
 temptation+0.091
 urge+0.087
Neg logit contributions
os-0.138
enny-0.138
ist-0.135
aron-0.135
o-0.134
Token to
Token ID284
Feature activation+9.04e+00

Pos logit contributions
os+0.000
enny+0.000
aron+0.000
ist+0.000
o+0.000
Neg logit contributions
 going-0.000
 chores-0.000
 same-0.000
 rain-0.000
 temptation-0.000
Token the
Token ID262
Feature activation+2.87e+00

Pos logit contributions
 same+0.116
 hurricane+0.114
 chores+0.111
 going+0.107
 urge+0.103
Neg logit contributions
enny-0.289
o-0.288
os-0.278
and-0.272
aron-0.272
Token bathroom
Token ID12436
Feature activation-7.17e+00

Pos logit contributions
 going+0.052
 chores+0.047
 temptation+0.042
 urge+0.042
wered+0.041
Neg logit contributions
aron-0.087
enny-0.086
itty-0.085
o-0.084
ney-0.083
Token.
Token ID13
Feature activation-4.74e+00

Pos logit contributions
enny+0.208
os+0.205
aron+0.203
jo+0.202
ist+0.200
Neg logit contributions
 going-0.156
 chores-0.154
 same-0.145
 rain-0.139
 work-0.139
Token He
Token ID679
Feature activation-3.38e+00

Pos logit contributions
enny+0.135
aron+0.134
os+0.133
o+0.131
jo+0.129
Neg logit contributions
 going-0.107
 chores-0.106
 same-0.099
 rain-0.094
 temptation-0.093
Token put
Token ID1234
Feature activation+7.92e+00

Pos logit contributions
enny+0.106
aron+0.104
os+0.104
jo+0.103
ist+0.103
Neg logit contributions
 going-0.066
 chores-0.064
 same-0.060
 rain-0.058
 her-0.058
Token her
Token ID607
Feature activation+1.72e-01

Pos logit contributions
 going+0.096
 chores+0.091
 plan+0.079
 urge+0.078
 temptation+0.075
Neg logit contributions
o-0.300
os-0.299
enny-0.296
aron-0.294
ist-0.290
Token in
Token ID287
Feature activation+9.83e+00

Pos logit contributions
 going+0.003
 chores+0.003
 same+0.003
 temptation+0.003
 rain+0.003
Neg logit contributions
os-0.005
ist-0.005
enny-0.005
aron-0.005
o-0.005
Token the
Token ID262
Feature activation+3.93e+00

Pos logit contributions
 going+0.118
 chores+0.107
 same+0.102
 urge+0.100
 temptation+0.098
Neg logit contributions
aron-0.366
o-0.351
enny-0.345
os-0.344
ist-0.341
Token tub
Token ID12202
Feature activation-8.08e+00

Pos logit contributions
 going+0.080
 chores+0.068
wered+0.062
 work+0.062
 avail+0.060
Neg logit contributions
aron-0.113
ist-0.108
enny-0.108
ouse-0.108
itty-0.106
Token and
Token ID290
Feature activation+5.50e+00

Pos logit contributions
jo+0.227
enny+0.227
aron+0.226
ne+0.225
oing+0.223
Neg logit contributions
 going-0.173
 chores-0.168
 same-0.166
 her-0.165
 good-0.157
Token song
Token ID3496
Feature activation-4.66e+00

Pos logit contributions
 going+0.045
 chores+0.043
 temptation+0.039
 same+0.037
 her+0.037
Neg logit contributions
ne-0.110
enny-0.109
ist-0.107
ouse-0.106
os-0.106
Token for
Token ID329
Feature activation+1.07e+01

Pos logit contributions
aron+0.164
enny+0.163
os+0.163
ist+0.160
jo+0.159
Neg logit contributions
 going-0.073
 chores-0.070
 same-0.065
 her-0.062
 rain-0.061
Token her
Token ID607
Feature activation+7.11e-02

Pos logit contributions
 going+0.111
 chores+0.102
 urge+0.100
 same+0.098
 plan+0.091
Neg logit contributions
enny-0.417
os-0.399
aron-0.397
ist-0.397
o-0.396
Token.
Token ID13
Feature activation-5.50e+00

Pos logit contributions
 going+0.001
 chores+0.001
 same+0.001
 rain+0.001
 work+0.001
Neg logit contributions
os-0.002
enny-0.002
aron-0.002
ist-0.002
o-0.002
Token Lily
Token ID20037
Feature activation-1.22e+01

Pos logit contributions
enny+0.186
aron+0.185
os+0.184
o+0.182
ist+0.181
Neg logit contributions
 going-0.092
 chores-0.091
 same-0.082
 rain-0.077
 temptation-0.077
Token was
Token ID373
Feature activation+8.82e+00

Pos logit contributions
aron+0.377
os+0.375
enny+0.373
ist+0.368
oing+0.367
Neg logit contributions
 going-0.243
 chores-0.234
 same-0.225
 work-0.219
 her-0.218
Token so
Token ID523
Feature activation+6.02e+00

Pos logit contributions
 chores+0.175
 temptation+0.160
 hurricane+0.148
 same+0.145
 work+0.144
Neg logit contributions
o-0.269
aron-0.262
umping-0.251
os-0.245
enny-0.245
Token happy
Token ID3772
Feature activation+2.00e+00

Pos logit contributions
 chores+0.108
 going+0.099
 temptation+0.096
 same+0.095
 rain+0.093
Neg logit contributions
enny-0.192
os-0.191
o-0.190
aron-0.189
oing-0.184
Token and
Token ID290
Feature activation+4.17e+00

Pos logit contributions
 going+0.041
 her+0.036
 chores+0.036
 work+0.034
 same+0.034
Neg logit contributions
os-0.064
aron-0.064
ist-0.063
enny-0.063
jo-0.063
Token sang
Token ID25889
Feature activation+3.96e+00

Pos logit contributions
 chores+0.103
 same+0.088
 temptation+0.088
wered+0.086
 going+0.084
Neg logit contributions
enny-0.109
o-0.109
aron-0.106
ouse-0.105
os-0.104
Token her
Token ID607
Feature activation-1.60e-01

Pos logit contributions
 going+0.036
 chores+0.034
 same+0.030
 rain+0.026
 work+0.025
Neg logit contributions
os-0.165
enny-0.164
aron-0.164
ist-0.163
o-0.162
Token.
Token ID13
Feature activation-5.75e+00

Pos logit contributions
enny+0.402
aron+0.401
os+0.401
ist+0.395
o+0.392
Neg logit contributions
 going-0.266
 chores-0.258
 same-0.243
 rain-0.233
 work-0.228
Token The
Token ID383
Feature activation-7.64e-01

Pos logit contributions
ist+0.201
os+0.194
aron+0.192
enny+0.192
oing+0.190
Neg logit contributions
 going-0.093
 chores-0.085
 her-0.085
 same-0.082
 rain-0.081
Token girl
Token ID2576
Feature activation-1.42e+01

Pos logit contributions
aron+0.020
ne+0.020
enny+0.019
jo+0.019
o+0.019
Neg logit contributions
 going-0.017
 chores-0.017
 work-0.015
 temptation-0.015
 same-0.015
Token was
Token ID373
Feature activation+8.72e+00

Pos logit contributions
enny+0.446
aron+0.434
ist+0.431
os+0.429
umping+0.421
Neg logit contributions
 going-0.287
 her-0.258
 same-0.255
 chores-0.255
 rain-0.248
Token a
Token ID257
Feature activation+1.68e+00

Pos logit contributions
 chores+0.153
 temptation+0.138
 going+0.129
 work+0.129
 same+0.126
Neg logit contributions
o-0.298
aron-0.289
os-0.283
jo-0.278
ist-0.278
Token bit
Token ID1643
Feature activation+4.72e+00

Pos logit contributions
 chores+0.026
 going+0.026
 same+0.023
 temptation+0.022
 rain+0.022
Neg logit contributions
aron-0.058
enny-0.057
o-0.057
ist-0.056
os-0.056
Token scared
Token ID12008
Feature activation+2.61e+00

Pos logit contributions
 going+0.063
 chores+0.061
 same+0.056
 rain+0.052
 her+0.050
Neg logit contributions
os-0.176
enny-0.176
aron-0.175
ist-0.173
o-0.172
Token but
Token ID475
Feature activation+2.77e+00

Pos logit contributions
 going+0.054
 chores+0.054
 same+0.051
 rain+0.048
 work+0.048
Neg logit contributions
enny-0.077
aron-0.077
os-0.077
ist-0.076
o-0.075
Token she
Token ID673
Feature activation-4.39e+00

Pos logit contributions
 chores+0.045
 same+0.036
 work+0.035
 plan+0.034
 temptation+0.034
Neg logit contributions
o-0.095
os-0.093
enny-0.093
aron-0.092
ne-0.091
Token was
Token ID373
Feature activation+9.97e+00

Pos logit contributions
os+0.138
enny+0.137
aron+0.135
ist+0.134
oing+0.134
Neg logit contributions
 going-0.086
 chores-0.079
 her-0.077
 same-0.077
 rain-0.075
Token obedient
Token ID41652
Feature activation+2.29e+00

Pos logit contributions
 chores+0.182
 same+0.161
 temptation+0.159
 going+0.154
 work+0.154
Neg logit contributions
aron-0.326
o-0.323
ist-0.312
os-0.312
ne-0.308
Token home
Token ID1363
Feature activation-1.47e+00

Pos logit contributions
 going+0.048
 chores+0.045
wered+0.040
 temptation+0.040
 her+0.038
Neg logit contributions
enny-0.091
o-0.090
ouse-0.090
ist-0.089
itty-0.087
Token and
Token ID290
Feature activation+5.03e+00

Pos logit contributions
enny+0.045
os+0.045
aron+0.045
ist+0.044
jo+0.044
Neg logit contributions
 going-0.030
 chores-0.029
 same-0.027
 rain-0.026
 work-0.025
Token a
Token ID257
Feature activation+2.27e+00

Pos logit contributions
 chores+0.082
 going+0.077
 temptation+0.071
 same+0.068
wered+0.068
Neg logit contributions
o-0.159
enny-0.159
os-0.158
ne-0.156
es-0.155
Token family
Token ID1641
Feature activation-9.61e+00

Pos logit contributions
 going+0.038
 chores+0.036
wered+0.032
 temptation+0.032
 same+0.031
Neg logit contributions
enny-0.072
o-0.072
ouse-0.070
aron-0.070
itty-0.070
Token.
Token ID13
Feature activation-4.66e+00

Pos logit contributions
enny+0.278
aron+0.274
os+0.274
ist+0.270
jo+0.269
Neg logit contributions
 going-0.213
 chores-0.202
 same-0.196
 rain-0.188
 her-0.187
Token You
Token ID921
Feature activation-1.51e+00

Pos logit contributions
os+0.149
ist+0.149
enny+0.149
aron+0.147
jo+0.146
Neg logit contributions
 going-0.085
 chores-0.081
 same-0.078
 her-0.074
 rain-0.074
Token are
Token ID389
Feature activation+8.39e+00

Pos logit contributions
enny+0.050
aron+0.050
os+0.050
ist+0.049
jo+0.049
Neg logit contributions
 going-0.026
 chores-0.026
 same-0.024
 rain-0.023
 work-0.023
Token mean
Token ID1612
Feature activation+5.54e-01

Pos logit contributions
 chores+0.113
 temptation+0.100
 urge+0.095
 same+0.094
 plan+0.093
Neg logit contributions
aron-0.297
o-0.291
ppy-0.286
ist-0.282
ne-0.280
Token and
Token ID290
Feature activation+4.04e+00

Pos logit contributions
 going+0.012
 chores+0.011
 same+0.011
 her+0.010
 rain+0.010
Neg logit contributions
enny-0.017
ist-0.017
os-0.016
aron-0.016
oing-0.016
Token rude
Token ID22066
Feature activation+4.27e-01

Pos logit contributions
 chores+0.069
 going+0.063
 top+0.060
 plan+0.057
 urge+0.056
Neg logit contributions
o-0.129
ppy-0.123
os-0.123
ne-0.122
enny-0.121
Token.
Token ID13
Feature activation-4.55e+00

Pos logit contributions
 going+0.009
 chores+0.009
 same+0.009
 her+0.008
 rain+0.008
Neg logit contributions
enny-0.013
ist-0.013
os-0.013
aron-0.013
jo-0.012
Token whistle
Token ID16121
Feature activation-2.23e+00

Pos logit contributions
 going+0.130
 chores+0.126
 same+0.117
 temptation+0.111
 rain+0.107
Neg logit contributions
ist-0.247
aron-0.244
os-0.242
ne-0.241
o-0.239
Token.
Token ID13
Feature activation-5.01e+00

Pos logit contributions
enny+0.073
aron+0.072
os+0.072
ist+0.071
jo+0.071
Neg logit contributions
 going-0.040
 chores-0.040
 same-0.037
 her-0.036
 rain-0.036
Token
Token ID198
Feature activation-3.15e+00

Pos logit contributions
enny+0.151
aron+0.151
os+0.151
o+0.148
ist+0.147
Neg logit contributions
 going-0.102
 chores-0.100
 same-0.094
 rain-0.090
 work-0.088
Token
Token ID198
Feature activation-3.42e+00

Pos logit contributions
ist+0.104
aron+0.101
os+0.100
enny+0.099
umping+0.098
Neg logit contributions
 going-0.057
 chores-0.056
 her-0.054
 same-0.052
 other-0.050
TokenTom
Token ID13787
Feature activation-1.34e+01

Pos logit contributions
os+0.081
enny+0.079
ist+0.079
aron+0.078
jo+0.078
Neg logit contributions
 going-0.094
 chores-0.093
 same-0.089
 rain-0.086
 her-0.086
Token looked
Token ID3114
Feature activation+5.36e+00

Pos logit contributions
enny+0.474
jo+0.461
oing+0.453
ark+0.451
itty+0.450
Neg logit contributions
 the-0.215
 going-0.203
 go-0.192
 big-0.192
 a-0.192
Token at
Token ID379
Feature activation+9.55e+00

Pos logit contributions
 going+0.102
 chores+0.100
 same+0.094
 rain+0.090
 temptation+0.087
Neg logit contributions
os-0.168
aron-0.168
enny-0.168
ist-0.166
o-0.165
Token Lily
Token ID20037
Feature activation-1.18e+01

Pos logit contributions
 going+0.119
 chores+0.117
 same+0.093
 rain+0.090
 urge+0.090
Neg logit contributions
enny-0.361
aron-0.361
os-0.359
o-0.357
ist-0.352
Token.
Token ID13
Feature activation-5.13e+00

Pos logit contributions
enny+0.319
os+0.316
aron+0.315
ist+0.310
jo+0.308
Neg logit contributions
 going-0.283
 chores-0.277
 same-0.261
 her-0.252
 rain-0.251
Token He
Token ID679
Feature activation-3.52e+00

Pos logit contributions
os+0.152
enny+0.151
aron+0.151
ist+0.149
o+0.148
Neg logit contributions
 going-0.108
 chores-0.106
 same-0.100
 rain-0.096
 work-0.094
Token copied
Token ID18984
Feature activation+3.99e+00

Pos logit contributions
enny+0.122
os+0.118
oing+0.118
aron+0.118
jo+0.118
Neg logit contributions
 going-0.060
 the-0.054
 her-0.053
 other-0.053
 big-0.053
Token,
Token ID11
Feature activation+3.00e+00

Pos logit contributions
os+0.181
enny+0.180
aron+0.180
o+0.177
jo+0.176
Neg logit contributions
 going-0.129
 chores-0.127
 same-0.120
 her-0.115
 rain-0.115
Token but
Token ID475
Feature activation+3.09e+00

Pos logit contributions
 going+0.063
 chores+0.063
 same+0.060
 temptation+0.056
 work+0.056
Neg logit contributions
aron-0.087
enny-0.086
os-0.085
ist-0.085
o-0.084
Token it
Token ID340
Feature activation-4.02e+00

Pos logit contributions
 chores+0.044
 same+0.041
 temptation+0.039
 work+0.037
 going+0.037
Neg logit contributions
enny-0.108
o-0.105
os-0.103
es-0.102
ist-0.100
Token was
Token ID373
Feature activation+1.12e+01

Pos logit contributions
os+0.104
ist+0.103
o+0.102
enny+0.102
aron+0.101
Neg logit contributions
 chores-0.098
 going-0.097
 same-0.091
 temptation-0.088
 work-0.087
Token still
Token ID991
Feature activation+7.04e+00

Pos logit contributions
 chores+0.248
 temptation+0.235
 same+0.229
 urge+0.228
 plan+0.227
Neg logit contributions
o-0.286
ist-0.279
aron-0.271
ppy-0.266
enny-0.260
Token wet
Token ID9583
Feature activation+1.54e+00

Pos logit contributions
 chores+0.110
 temptation+0.103
 help+0.103
 same+0.102
 urge+0.099
Neg logit contributions
o-0.223
ist-0.222
aron-0.218
os-0.210
enny-0.209
Token.
Token ID13
Feature activation-4.85e+00

Pos logit contributions
 chores+0.038
 going+0.037
 same+0.035
 temptation+0.034
 rain+0.033
Neg logit contributions
aron-0.040
enny-0.039
os-0.039
ist-0.039
o-0.038
Token Her
Token ID2332
Feature activation-9.70e-01

Pos logit contributions
enny+0.148
aron+0.146
os+0.146
o+0.143
ist+0.143
Neg logit contributions
 going-0.099
 chores-0.096
 same-0.089
 work-0.085
 rain-0.085
Token mom
Token ID1995
Feature activation-1.09e+01

Pos logit contributions
enny+0.035
ouse+0.034
ne+0.033
jo+0.033
o+0.033
Neg logit contributions
 going-0.013
 chores-0.011
 temptation-0.011
 same-0.011
wered-0.010
Tokenmy
Token ID1820
Feature activation-8.17e+00

Pos logit contributions
aron+0.299
os+0.294
enny+0.292
oing+0.289
ist+0.285
Neg logit contributions
 going-0.261
 chores-0.239
 her-0.236
 same-0.232
 rain-0.226
Token came
Token ID1625
Feature activation+6.11e+00

Pos logit contributions
aron+0.255
os+0.254
enny+0.252
ist+0.249
oing+0.248
Neg logit contributions
 going-0.162
 chores-0.148
 her-0.142
 same-0.140
 work-0.137
Token lake
Token ID13546
Feature activation-9.82e+00

Pos logit contributions
 going+0.025
 chores+0.024
 her+0.023
 temptation+0.021
 urge+0.020
Neg logit contributions
aron-0.055
ist-0.054
os-0.054
o-0.053
ne-0.053
Token.
Token ID13
Feature activation-4.27e+00

Pos logit contributions
os+0.256
aron+0.252
o+0.251
ist+0.250
enny+0.243
Neg logit contributions
 chores-0.235
 going-0.234
 same-0.222
 temptation-0.212
 rain-0.211
Token
Token ID198
Feature activation-2.65e+00

Pos logit contributions
ist+0.137
enny+0.137
os+0.134
aron+0.134
jo+0.132
Neg logit contributions
 going-0.081
 chores-0.079
 same-0.075
 rain-0.072
 her-0.070
Token
Token ID198
Feature activation-3.04e+00

Pos logit contributions
ist+0.084
enny+0.083
aron+0.082
os+0.082
umping+0.080
Neg logit contributions
 going-0.052
 chores-0.050
 same-0.047
 her-0.046
 rain-0.045
TokenBut
Token ID1537
Feature activation+7.05e-01

Pos logit contributions
enny+0.085
ist+0.084
os+0.084
aron+0.083
jo+0.082
Neg logit contributions
 going-0.071
 chores-0.068
 same-0.065
 rain-0.063
 good-0.063
Token then
Token ID788
Feature activation+1.62e+00

Pos logit contributions
 chores+0.013
 temptation+0.012
 same+0.011
 urge+0.011
 work+0.011
Neg logit contributions
o-0.023
os-0.022
ry-0.021
itty-0.021
ne-0.021
Token,
Token ID11
Feature activation+1.93e+00

Pos logit contributions
 chores+0.030
 going+0.030
 same+0.029
 temptation+0.028
 urge+0.027
Neg logit contributions
os-0.050
o-0.050
enny-0.048
ne-0.048
es-0.048
Token something
Token ID1223
Feature activation-2.56e+00

Pos logit contributions
 chores+0.030
 temptation+0.030
 going+0.028
 urge+0.027
 same+0.027
Neg logit contributions
o-0.064
os-0.063
ry-0.059
enny-0.059
ne-0.059
Token bad
Token ID2089
Feature activation+6.75e-02

Pos logit contributions
ne+0.092
os+0.092
o+0.092
aron+0.090
jo+0.090
Neg logit contributions
 going-0.027
 chores-0.026
 same-0.022
 temptation-0.021
 urge-0.020
Token happens
Token ID4325
Feature activation+1.55e+00

Pos logit contributions
 chores+0.002
 going+0.002
 same+0.001
 temptation+0.001
 urge+0.001
Neg logit contributions
os-0.002
ne-0.002
o-0.002
enny-0.002
ist-0.002
Token.
Token ID13
Feature activation-4.94e+00

Pos logit contributions
 going+0.033
 chores+0.031
 same+0.030
 her+0.029
 rain+0.028
Neg logit contributions
enny-0.048
aron-0.046
ist-0.046
jo-0.045
os-0.045
Token offer
Token ID2897
Feature activation+5.68e+00

Pos logit contributions
 going+0.123
 chores+0.121
 same+0.108
 temptation+0.108
 urge+0.098
Neg logit contributions
os-0.285
enny-0.284
o-0.284
ist-0.281
aron-0.280
Token me
Token ID502
Feature activation-1.70e+00

Pos logit contributions
 going+0.069
 chores+0.067
 same+0.060
 rain+0.055
 temptation+0.054
Neg logit contributions
os-0.217
enny-0.217
aron-0.216
ist-0.214
o-0.213
Token some
Token ID617
Feature activation+5.38e+00

Pos logit contributions
aron+0.070
jo+0.069
enny+0.069
os+0.068
ist+0.068
Neg logit contributions
 going-0.017
 chores-0.017
 same-0.015
 her-0.014
 rain-0.014
Token tea
Token ID8887
Feature activation-9.95e+00

Pos logit contributions
 going+0.088
 chores+0.084
 same+0.080
 temptation+0.078
 urge+0.074
Neg logit contributions
aron-0.178
os-0.177
enny-0.176
ist-0.176
ne-0.175
Token?"
Token ID1701
Feature activation-6.05e+00

Pos logit contributions
jo+0.336
enny+0.331
os+0.331
oing+0.328
aron+0.327
Neg logit contributions
 going-0.173
 chores-0.168
 same-0.156
 rain-0.149
 her-0.148
Token
Token ID198
Feature activation-4.34e+00

Pos logit contributions
ist+0.208
aron+0.197
enny+0.196
os+0.196
ark+0.195
Neg logit contributions
 going-0.107
 her-0.106
 chores-0.104
 same-0.098
 the-0.095
Token
Token ID198
Feature activation-4.41e+00

Pos logit contributions
ist+0.151
aron+0.145
os+0.144
umping+0.143
enny+0.143
Neg logit contributions
 her-0.073
 going-0.072
 chores-0.071
 other-0.069
 same-0.069
TokenAnna
Token ID31160
Feature activation-1.27e+01

Pos logit contributions
ist+0.107
os+0.103
aron+0.102
umping+0.102
jo+0.101
Neg logit contributions
 her-0.120
 chores-0.117
 going-0.115
 other-0.115
 same-0.114
Token nodded
Token ID14464
Feature activation+1.59e+00

Pos logit contributions
enny+0.410
jo+0.403
oing+0.400
os+0.395
enn+0.394
Neg logit contributions
 her-0.222
 going-0.216
 the-0.206
 grown-0.204
 start-0.204
Token.
Token ID13
Feature activation-6.15e+00

Pos logit contributions
 going+0.036
 chores+0.035
 same+0.033
 rain+0.032
 her+0.032
Neg logit contributions
os-0.045
enny-0.045
aron-0.044
ist-0.044
o-0.044
Token She
Token ID1375
Feature activation-4.68e+00

Pos logit contributions
enny+0.207
os+0.207
ist+0.206
aron+0.206
o+0.203
Neg logit contributions
 going-0.104
 chores-0.100
 same-0.095
 rain-0.090
 her-0.089
Token,
Token ID11
Feature activation+2.96e+00

Pos logit contributions
enny+0.115
aron+0.114
os+0.114
o+0.112
ist+0.111
Neg logit contributions
 going-0.093
 chores-0.090
 same-0.086
 rain-0.083
 her-0.082
Token playing
Token ID2712
Feature activation+3.64e+00

Pos logit contributions
 chores+0.060
 going+0.050
 same+0.050
 temptation+0.049
 rain+0.047
Neg logit contributions
aron-0.092
ist-0.090
o-0.090
os-0.089
oing-0.088
Token its
Token ID663
Feature activation+1.89e+00

Pos logit contributions
 going+0.071
 chores+0.070
 same+0.066
 rain+0.063
 work+0.061
Neg logit contributions
enny-0.112
os-0.112
aron-0.112
ist-0.110
o-0.110
Token happy
Token ID3772
Feature activation+6.97e-01

Pos logit contributions
 going+0.025
 chores+0.024
 her+0.021
 temptation+0.021
 same+0.020
Neg logit contributions
ist-0.069
aron-0.067
jo-0.067
ne-0.066
o-0.066
Token tunes
Token ID31796
Feature activation-4.16e+00

Pos logit contributions
 going+0.012
 chores+0.012
 same+0.011
 rain+0.010
 temptation+0.010
Neg logit contributions
os-0.023
enny-0.023
aron-0.023
ist-0.023
jo-0.023
Token.
Token ID13
Feature activation-4.97e+00

Pos logit contributions
enny+0.120
aron+0.119
os+0.119
ist+0.116
o+0.116
Neg logit contributions
 going-0.093
 chores-0.089
 same-0.084
 rain-0.081
 her-0.081
Token
Token ID220
Feature activation-4.59e+00

Pos logit contributions
ist+0.164
os+0.158
aron+0.156
enny+0.156
ne+0.156
Neg logit contributions
 going-0.093
 chores-0.089
 same-0.084
 her-0.081
 good-0.080
Token
Token ID198
Feature activation-3.26e+00

Pos logit contributions
ist+0.152
aron+0.148
ne+0.147
jo+0.146
oing+0.144
Neg logit contributions
 going-0.082
 chores-0.078
 same-0.073
 her-0.071
 good-0.068
Token
Token ID198
Feature activation-3.53e+00

Pos logit contributions
ist+0.109
os+0.104
enny+0.102
ne+0.101
aron+0.101
Neg logit contributions
 going-0.059
 chores-0.058
 her-0.055
 good-0.054
 same-0.053
TokenHis
Token ID6653
Feature activation-4.58e-01

Pos logit contributions
os+0.082
ist+0.081
aron+0.080
enny+0.080
jo+0.079
Neg logit contributions
 going-0.097
 chores-0.097
 same-0.092
 her-0.089
 good-0.088
Token mom
Token ID1995
Feature activation-1.12e+01

Pos logit contributions
enny+0.016
o+0.016
ouse+0.016
jo+0.016
os+0.016
Neg logit contributions
 going-0.006
 chores-0.005
 same-0.005
 temptation-0.005
 her-0.005
Token a
Token ID257
Feature activation+3.99e+00

Pos logit contributions
 chores+0.180
 plan+0.172
 going+0.170
wered+0.170
 urge+0.169
Neg logit contributions
o-0.379
os-0.370
ne-0.356
ist-0.347
enny-0.345
Token new
Token ID649
Feature activation+1.50e+00

Pos logit contributions
 going+0.053
 chores+0.051
 same+0.046
 temptation+0.043
 her+0.043
Neg logit contributions
enny-0.147
os-0.145
o-0.145
ist-0.144
aron-0.144
Token one
Token ID530
Feature activation-2.65e+00

Pos logit contributions
 going+0.031
 chores+0.030
 temptation+0.028
 same+0.026
 her+0.026
Neg logit contributions
ist-0.041
o-0.041
enny-0.041
os-0.041
ne-0.041
Token."
Token ID526
Feature activation-7.30e+00

Pos logit contributions
enny+0.076
aron+0.075
os+0.075
ist+0.074
jo+0.074
Neg logit contributions
 going-0.059
 chores-0.057
 same-0.055
 rain-0.053
 work-0.052
Token
Token ID198
Feature activation-4.10e+00

Pos logit contributions
ist+0.246
os+0.236
aron+0.235
jo+0.235
enny+0.233
Neg logit contributions
 going-0.136
 chores-0.130
 her-0.126
 same-0.121
 rain-0.120
Token
Token ID198
Feature activation-4.14e+00

Pos logit contributions
ist+0.142
enny+0.136
aron+0.135
os+0.135
ne+0.133
Neg logit contributions
 going-0.070
 chores-0.067
 her-0.067
 same-0.064
 the-0.063
TokenThey
Token ID2990
Feature activation-3.16e+00

Pos logit contributions
os+0.128
ist+0.125
jo+0.125
enny+0.121
aron+0.119
Neg logit contributions
 going-0.082
 her-0.082
 chores-0.081
 other-0.080
 good-0.079
Token hugged
Token ID41130
Feature activation+5.02e+00

Pos logit contributions
aron+0.089
os+0.089
enny+0.089
ist+0.087
jo+0.087
Neg logit contributions
 going-0.071
 chores-0.069
 same-0.065
 rain-0.063
 her-0.063
Token each
Token ID1123
Feature activation+8.77e-01

Pos logit contributions
 chores+0.098
 going+0.097
 her+0.096
 rain+0.093
 same+0.093
Neg logit contributions
enny-0.162
os-0.160
jo-0.157
aron-0.156
ist-0.156
Token other
Token ID584
Feature activation-3.50e+00

Pos logit contributions
 going+0.002
 chores+0.002
 same+0.001
 her+0.000
 rain+0.000
Neg logit contributions
enny-0.042
os-0.042
aron-0.042
ist-0.042
o-0.041
Token and
Token ID290
Feature activation+4.64e+00

Pos logit contributions
enny+0.114
os+0.109
jo+0.108
ist+0.108
umping+0.108
Neg logit contributions
 chores-0.069
 going-0.068
 good-0.065
 her-0.065
 other-0.063
Token He
Token ID679
Feature activation-3.49e+00

Pos logit contributions
os+0.146
ist+0.146
enny+0.146
aron+0.145
o+0.142
Neg logit contributions
 going-0.091
 chores-0.089
 same-0.085
 rain-0.080
 her-0.078
Token was
Token ID373
Feature activation+9.79e+00

Pos logit contributions
enny+0.105
aron+0.104
os+0.104
ist+0.102
oing+0.101
Neg logit contributions
 going-0.073
 chores-0.069
 same-0.067
 rain-0.065
 her-0.064
Token very
Token ID845
Feature activation+6.50e+00

Pos logit contributions
 chores+0.187
 temptation+0.163
 same+0.161
 going+0.157
 work+0.155
Neg logit contributions
o-0.317
aron-0.312
os-0.305
ist-0.298
jo-0.294
Token relieved
Token ID22598
Feature activation+5.36e+00

Pos logit contributions
 chores+0.131
 going+0.119
 same+0.113
 temptation+0.107
 rain+0.104
Neg logit contributions
o-0.203
aron-0.194
os-0.194
oing-0.188
enny-0.188
Token.
Token ID13
Feature activation-4.73e+00

Pos logit contributions
 going+0.120
 chores+0.117
 same+0.111
 rain+0.107
 work+0.105
Neg logit contributions
os-0.151
enny-0.151
aron-0.151
ist-0.148
o-0.147
Token He
Token ID679
Feature activation-3.68e+00

Pos logit contributions
ist+0.154
os+0.152
enny+0.152
aron+0.151
oing+0.149
Neg logit contributions
 going-0.086
 chores-0.083
 same-0.080
 her-0.074
 rain-0.074
Token felt
Token ID2936
Feature activation+8.51e+00

Pos logit contributions
enny+0.109
os+0.108
aron+0.107
ist+0.106
o+0.104
Neg logit contributions
 going-0.079
 chores-0.074
 same-0.072
 rain-0.070
 her-0.069
Token brave
Token ID14802
Feature activation+1.27e+00

Pos logit contributions
 chores+0.168
 going+0.150
 same+0.146
 work+0.143
 plan+0.143
Neg logit contributions
aron-0.244
o-0.234
enny-0.228
os-0.226
oc-0.226
Token again
Token ID757
Feature activation+5.66e-01

Pos logit contributions
 chores+0.027
 going+0.026
 same+0.025
 rain+0.024
 temptation+0.023
Neg logit contributions
aron-0.037
enny-0.036
ist-0.036
os-0.035
ne-0.035
Token,
Token ID11
Feature activation+3.14e+00

Pos logit contributions
 going+0.014
 chores+0.014
 same+0.013
 rain+0.012
 temptation+0.012
Neg logit contributions
os-0.015
aron-0.015
enny-0.015
ist-0.015
o-0.014
Token so
Token ID523
Feature activation+5.62e+00

Pos logit contributions
 chores+0.073
 same+0.065
 going+0.064
 temptation+0.061
 plan+0.061
Neg logit contributions
aron-0.086
o-0.084
ist-0.083
os-0.082
jo-0.081
Token me
Token ID502
Feature activation-1.87e+00

Pos logit contributions
 going+0.070
 chores+0.067
 same+0.061
 rain+0.056
 her+0.055
Neg logit contributions
os-0.216
aron-0.216
enny-0.216
ist-0.213
o-0.212
Token?
Token ID30
Feature activation-4.32e+00

Pos logit contributions
os+0.059
enny+0.059
aron+0.059
ist+0.058
o+0.058
Neg logit contributions
 going-0.035
 chores-0.034
 same-0.032
 rain-0.031
 work-0.030
Token I
Token ID314
Feature activation+6.78e-01

Pos logit contributions
ist+0.143
aron+0.139
os+0.138
jo+0.137
umping+0.135
Neg logit contributions
 going-0.076
 chores-0.070
 her-0.070
 same-0.068
 help-0.065
Tokenâ
Token ID22940
Feature activation-3.47e+00

Pos logit contributions
 going+0.015
 chores+0.014
 same+0.014
 work+0.013
 rain+0.013
Neg logit contributions
aron-0.020
os-0.019
enny-0.019
ist-0.019
jo-0.019
Token€
Token ID26391
Feature activation-9.94e+00

Pos logit contributions
as+0.088
enny+0.086
os+0.084
es+0.083
jo+0.081
Neg logit contributions
 going-0.088
 temptation-0.085
 urge-0.084
 same-0.083
 help-0.082
Token™
Token ID8151
Feature activation-1.09e+01

Pos logit contributions
ist+0.319
aron+0.297
jo+0.278
ppy+0.278
enny+0.274
Neg logit contributions
 going-0.209
 the-0.178
 chores-0.172
 not-0.170
 big-0.169
Tokenm
Token ID76
Feature activation+4.40e+00

Pos logit contributions
aron+0.253
ist+0.233
oing+0.230
jo+0.223
oc+0.220
Neg logit contributions
 first-0.298
 going-0.295
 her-0.279
 other-0.278
 same-0.276
Token lost
Token ID2626
Feature activation+4.48e+00

Pos logit contributions
 chores+0.071
 going+0.071
 same+0.065
 rain+0.062
 temptation+0.060
Neg logit contributions
os-0.151
aron-0.150
enny-0.150
o-0.149
ist-0.148
Token.
Token ID13
Feature activation-4.76e+00

Pos logit contributions
 going+0.102
 chores+0.100
 same+0.095
 rain+0.091
 her+0.090
Neg logit contributions
os-0.125
enny-0.124
aron-0.124
ist-0.122
o-0.121
Tokenâ
Token ID22940
Feature activation-5.12e+00

Pos logit contributions
ist+0.155
os+0.153
jo+0.150
oing+0.148
aron+0.148
Neg logit contributions
 going-0.087
 chores-0.079
 same-0.077
 her-0.076
 rain-0.074
Token€
Token ID26391
Feature activation-1.11e+01

Pos logit contributions
as+0.151
os+0.151
enny+0.150
es+0.144
oing+0.141
Neg logit contributions
 going-0.108
 same-0.107
 temptation-0.104
 work-0.100
 urge-0.100
Token His
Token ID2399
Feature activation+5.32e-01

Pos logit contributions
os+0.175
enny+0.175
aron+0.175
ist+0.173
o+0.172
Neg logit contributions
 going-0.109
 chores-0.106
 same-0.100
 rain-0.095
 work-0.093
Token father
Token ID2988
Feature activation-1.19e+01

Pos logit contributions
 going+0.009
wered+0.008
 temptation+0.008
 same+0.008
 chores+0.008
Neg logit contributions
jo-0.017
o-0.017
enny-0.017
ouse-0.016
aron-0.016
Token said
Token ID531
Feature activation+1.11e+00

Pos logit contributions
aron+0.352
enny+0.351
os+0.351
jo+0.343
ist+0.342
Neg logit contributions
 going-0.251
 chores-0.242
 same-0.228
 her-0.225
 work-0.220
Token that
Token ID326
Feature activation+1.04e+00

Pos logit contributions
 chores+0.022
 same+0.022
 going+0.021
 rain+0.020
 work+0.020
Neg logit contributions
enny-0.033
aron-0.033
os-0.032
o-0.032
ist-0.032
Token the
Token ID262
Feature activation+5.80e-01

Pos logit contributions
 chores+0.015
 same+0.015
 going+0.015
 temptation+0.014
 rain+0.013
Neg logit contributions
os-0.037
o-0.037
aron-0.036
enny-0.036
ne-0.036
Token cat
Token ID3797
Feature activation-1.47e+01

Pos logit contributions
 going+0.014
 chores+0.014
wered+0.013
 her+0.013
 same+0.012
Neg logit contributions
itty-0.014
ne-0.013
aron-0.013
o-0.013
oc-0.013
Token already
Token ID1541
Feature activation+4.05e+00

Pos logit contributions
enny+0.434
aron+0.433
os+0.424
oing+0.422
ist+0.421
Neg logit contributions
 going-0.316
 chores-0.308
 same-0.291
 her-0.281
 rain-0.278
Token belonged
Token ID19611
Feature activation+8.17e+00

Pos logit contributions
 going+0.091
 chores+0.090
 same+0.085
 rain+0.082
 temptation+0.081
Neg logit contributions
os-0.113
aron-0.112
enny-0.112
ist-0.111
o-0.110
Token to
Token ID284
Feature activation+8.45e+00

Pos logit contributions
 going+0.136
 chores+0.131
 same+0.123
 rain+0.114
 her+0.114
Neg logit contributions
aron-0.279
os-0.278
enny-0.277
ist-0.275
jo-0.272
Token someone
Token ID2130
Feature activation-6.60e+00

Pos logit contributions
 chores+0.090
 going+0.087
 same+0.082
 temptation+0.078
 urge+0.076
Neg logit contributions
os-0.321
o-0.318
aron-0.315
jo-0.314
enny-0.313
Token else
Token ID2073
Feature activation-2.39e+00

Pos logit contributions
aron+0.154
enny+0.153
os+0.150
ist+0.149
oing+0.147
Neg logit contributions
 going-0.187
 chores-0.178
 same-0.170
 her-0.168
 good-0.165
Token!"
Token ID2474
Feature activation-5.73e+00

Pos logit contributions
 going+0.061
 chores+0.060
 same+0.057
 rain+0.055
 work+0.054
Neg logit contributions
os-0.078
enny-0.078
aron-0.078
ist-0.077
o-0.076
Token
Token ID220
Feature activation-5.83e+00

Pos logit contributions
os+0.172
enny+0.172
aron+0.172
ist+0.171
o+0.168
Neg logit contributions
 going-0.116
 chores-0.114
 same-0.107
 rain-0.103
 her-0.102
Token
Token ID198
Feature activation-3.87e+00

Pos logit contributions
ist+0.189
aron+0.183
enny+0.180
os+0.179
ne+0.179
Neg logit contributions
 going-0.108
 chores-0.105
 her-0.099
 same-0.099
 other-0.092
Token
Token ID198
Feature activation-3.71e+00

Pos logit contributions
ist+0.134
os+0.128
umping+0.126
aron+0.125
oing+0.125
Neg logit contributions
 her-0.064
 going-0.059
 chores-0.058
 the-0.056
 other-0.055
TokenHer
Token ID9360
Feature activation-3.24e+00

Pos logit contributions
os+0.096
ist+0.096
aron+0.093
jo+0.091
umping+0.091
Neg logit contributions
 chores-0.087
 her-0.087
 going-0.086
 same-0.082
 rain-0.082
Token mom
Token ID1995
Feature activation-1.23e+01

Pos logit contributions
enny+0.131
aron+0.131
os+0.131
ist+0.129
o+0.129
Neg logit contributions
 going-0.032
 chores-0.031
 same-0.027
 rain-0.024
 temptation-0.024
Tokenmy
Token ID1820
Feature activation-9.79e+00

Pos logit contributions
enny+0.348
ark+0.346
oing+0.346
aron+0.336
ouse+0.336
Neg logit contributions
 going-0.278
 her-0.264
 start-0.246
 the-0.243
 same-0.242
Token said
Token ID531
Feature activation+1.63e-01

Pos logit contributions
enny+0.275
ark+0.270
os+0.267
oing+0.266
aron+0.266
Neg logit contributions
 going-0.225
 her-0.209
 work-0.198
 start-0.198
 chores-0.197
Token,
Token ID11
Feature activation+5.68e-01

Pos logit contributions
 going+0.003
 chores+0.003
 same+0.003
 rain+0.003
 work+0.003
Neg logit contributions
enny-0.005
aron-0.005
os-0.005
ist-0.005
o-0.005
Token "
Token ID366
Feature activation-1.45e+00

Pos logit contributions
 chores+0.009
 going+0.008
 same+0.008
 rain+0.007
 temptation+0.007
Neg logit contributions
aron-0.020
os-0.020
enny-0.020
o-0.020
ist-0.020
TokenI
Token ID40
Feature activation+3.46e-01

Pos logit contributions
os+0.045
ist+0.044
aron+0.044
jo+0.044
enny+0.043
Neg logit contributions
 going-0.027
 same-0.027
 her-0.026
 temptation-0.025
 chores-0.025
Token there
Token ID612
Feature activation-2.35e+00

Pos logit contributions
 going+0.016
 chores+0.015
 same+0.014
 rain+0.013
 work+0.013
Neg logit contributions
os-0.038
enny-0.038
aron-0.038
ist-0.038
o-0.038
Token was
Token ID373
Feature activation+8.63e+00

Pos logit contributions
aron+0.065
enny+0.065
os+0.065
ist+0.064
o+0.063
Neg logit contributions
 going-0.054
 chores-0.053
 same-0.050
 rain-0.048
 her-0.048
Token a
Token ID257
Feature activation+1.44e+00

Pos logit contributions
 chores+0.091
 going+0.085
 temptation+0.079
 same+0.076
 plan+0.072
Neg logit contributions
o-0.337
enny-0.332
aron-0.330
os-0.326
umping-0.325
Token very
Token ID845
Feature activation+2.97e+00

Pos logit contributions
 going+0.015
 chores+0.014
 same+0.013
 rain+0.012
 work+0.011
Neg logit contributions
os-0.058
enny-0.057
aron-0.057
ist-0.057
jo-0.056
Token creative
Token ID7325
Feature activation-1.90e+00

Pos logit contributions
 going+0.044
 chores+0.043
 her+0.036
 same+0.035
 temptation+0.035
Neg logit contributions
o-0.101
enny-0.099
ouse-0.098
oing-0.096
oc-0.095
Token bird
Token ID6512
Feature activation-1.31e+01

Pos logit contributions
os+0.066
enny+0.065
aron+0.065
ist+0.064
o+0.064
Neg logit contributions
 going-0.031
 chores-0.030
 same-0.028
 rain-0.027
 work-0.026
Token.
Token ID13
Feature activation-5.31e+00

Pos logit contributions
enny+0.423
aron+0.403
ives+0.381
os+0.380
ist+0.378
Neg logit contributions
 going-0.289
 her-0.259
 same-0.244
 rain-0.244
 the-0.242
Token He
Token ID679
Feature activation-4.24e+00

Pos logit contributions
os+0.162
aron+0.161
enny+0.161
ist+0.159
o+0.158
Neg logit contributions
 going-0.107
 chores-0.106
 same-0.099
 rain-0.094
 work-0.092
Token liked
Token ID8288
Feature activation+8.61e+00

Pos logit contributions
enny+0.130
aron+0.128
os+0.126
ist+0.126
jo+0.122
Neg logit contributions
 going-0.092
 chores-0.083
 same-0.083
 rain-0.082
 her-0.081
Token to
Token ID284
Feature activation+7.54e+00

Pos logit contributions
 chores+0.210
 temptation+0.201
 urge+0.192
 same+0.192
 avail+0.186
Neg logit contributions
o-0.222
os-0.213
oing-0.192
ne-0.189
and-0.189
Token explore
Token ID7301
Feature activation+6.06e+00

Pos logit contributions
 going+0.137
 chores+0.135
 temptation+0.130
 same+0.129
 good+0.122
Neg logit contributions
ist-0.218
os-0.213
r-0.212
aron-0.211
ne-0.210
Token
Token ID198
Feature activation-3.17e+00

Pos logit contributions
ist+0.172
aron+0.166
oing+0.164
os+0.163
ne+0.161
Neg logit contributions
 going-0.091
 chores-0.087
 her-0.087
 same-0.084
 rain-0.080
Token
Token ID198
Feature activation-3.40e+00

Pos logit contributions
ist+0.110
aron+0.107
os+0.104
enny+0.103
oing+0.103
Neg logit contributions
 going-0.053
 chores-0.052
 her-0.050
 same-0.047
 other-0.046
TokenOne
Token ID3198
Feature activation-6.59e-01

Pos logit contributions
aron+0.099
os+0.099
ist+0.098
enny+0.096
o+0.095
Neg logit contributions
 going-0.073
 chores-0.072
 same-0.067
 her-0.066
 rain-0.065
Token day
Token ID1110
Feature activation-2.88e+00

Pos logit contributions
aron+0.028
enny+0.028
os+0.028
ist+0.028
jo+0.028
Neg logit contributions
 going-0.005
 chores-0.005
 same-0.004
 rain-0.004
 her-0.004
Token,
Token ID11
Feature activation+1.35e+00

Pos logit contributions
enny+0.090
os+0.089
aron+0.089
ist+0.088
ne+0.088
Neg logit contributions
 going-0.055
 chores-0.054
 same-0.051
 temptation-0.048
 rain-0.047
Token Sally
Token ID25737
Feature activation-1.45e+01

Pos logit contributions
 going+0.017
 chores+0.017
 same+0.015
 temptation+0.014
 rain+0.014
Neg logit contributions
enny-0.051
os-0.050
aron-0.050
ist-0.050
o-0.050
Token decided
Token ID3066
Feature activation+4.81e+00

Pos logit contributions
aron+0.442
ist+0.439
enny+0.433
oing+0.431
os+0.431
Neg logit contributions
 going-0.307
 same-0.274
 chores-0.271
 her-0.268
 rain-0.263
Token it
Token ID340
Feature activation-5.56e+00

Pos logit contributions
 chores+0.111
 temptation+0.100
 same+0.099
 urge+0.099
 plan+0.096
Neg logit contributions
enny-0.126
ne-0.125
o-0.122
os-0.120
aron-0.120
Token was
Token ID373
Feature activation+1.00e+01

Pos logit contributions
os+0.152
enny+0.151
aron+0.151
ist+0.149
o+0.148
Neg logit contributions
 going-0.129
 chores-0.127
 same-0.120
 rain-0.116
 temptation-0.114
Token time
Token ID640
Feature activation+1.71e+00

Pos logit contributions
 chores+0.139
 temptation+0.131
 same+0.120
 urge+0.116
wered+0.115
Neg logit contributions
o-0.318
ist-0.307
os-0.304
aron-0.302
jo-0.301
Token to
Token ID284
Feature activation+8.91e+00

Pos logit contributions
 chores+0.034
 same+0.032
 rain+0.032
 temptation+0.031
 going+0.031
Neg logit contributions
enny-0.051
o-0.051
os-0.051
ist-0.049
ne-0.048
Token
Token ID198
Feature activation-3.28e+00

Pos logit contributions
ist+0.103
aron+0.100
os+0.099
oing+0.097
enny+0.096
Neg logit contributions
 going-0.056
 chores-0.055
 her-0.055
 same-0.052
 rain-0.051
TokenOne
Token ID3198
Feature activation-1.10e+00

Pos logit contributions
ist+0.088
os+0.087
aron+0.087
enny+0.084
jo+0.083
Neg logit contributions
 going-0.076
 chores-0.075
 same-0.071
 her-0.071
 rain-0.070
Token day
Token ID1110
Feature activation-3.26e+00

Pos logit contributions
os+0.045
enny+0.044
o+0.044
aron+0.044
ist+0.044
Neg logit contributions
 going-0.011
 chores-0.010
 same-0.009
 rain-0.008
 temptation-0.008
Token,
Token ID11
Feature activation+1.45e+00

Pos logit contributions
enny+0.105
os+0.105
aron+0.104
o+0.103
ist+0.103
Neg logit contributions
 going-0.059
 chores-0.058
 same-0.055
 temptation-0.051
 rain-0.050
Token Twe
Token ID24205
Feature activation-1.26e+01

Pos logit contributions
 chores+0.021
 going+0.021
 same+0.019
 temptation+0.018
 rain+0.017
Neg logit contributions
enny-0.051
os-0.051
aron-0.051
o-0.051
ist-0.050
Tokenety
Token ID2963
Feature activation-1.44e+01

Pos logit contributions
aron+0.226
os+0.224
es+0.212
jo+0.208
oing+0.198
Neg logit contributions
 going-0.395
 same-0.384
 her-0.374
 chores-0.368
 work-0.364
Token felt
Token ID2936
Feature activation+7.82e+00

Pos logit contributions
enny+0.443
aron+0.441
ist+0.437
os+0.430
oing+0.425
Neg logit contributions
 going-0.302
 same-0.271
 chores-0.269
 rain-0.264
 her-0.264
Token tired
Token ID10032
Feature activation+5.62e-01

Pos logit contributions
 chores+0.115
 plan+0.106
 going+0.105
 same+0.104
 work+0.101
Neg logit contributions
o-0.232
enny-0.232
aron-0.231
ppy-0.228
umping-0.223
Token and
Token ID290
Feature activation+4.96e+00

Pos logit contributions
 chores+0.013
 going+0.013
 same+0.013
 temptation+0.012
 rain+0.012
Neg logit contributions
aron-0.015
enny-0.015
os-0.015
ist-0.014
ne-0.014
Token needed
Token ID2622
Feature activation+9.19e+00

Pos logit contributions
 chores+0.119
wered+0.104
 same+0.102
 top+0.100
 going+0.099
Neg logit contributions
o-0.117
owed-0.111
os-0.109
ppy-0.109
ouse-0.108
Token to
Token ID284
Feature activation+8.90e+00

Pos logit contributions
 urge+0.151
 same+0.151
 plan+0.148
 idea+0.146
 chores+0.144
Neg logit contributions
o-0.262
enny-0.259
ne-0.249
light-0.248
aron-0.244