TOP ACTIVATIONS
MAX = 0.179

that maybe he can try to swim with her . He
" Can I pay to play with your dog ?"
made sure was careful not to fall . Suddenly
find ! Be careful not to hurt yourself . Where did
for helping them stay safe during the storm . <|endoftext|> Once
. Would she be able to make the journey , and
, but be careful not to hurt yourself ." Lily carefully
mirror away and vowed never to use it again . But
that it doesn 't pay to go too fast ! Whenever
says . She pret ends to wear the hat on her
They did not know whether to laugh or cry . They
, but be careful not to drain your energy too much
suggested that she could try to make a game out of
we must be careful not to hit anyone ."
Then they decide to try to be nice to Farmer Joe
show . Can you try to be patient and watch a
to . I will try to be gentle and quick .
because she had promised not to use it .
very determined and worked hard to try and make the war
, " I will try to help you ." The bird

MIN ACTIVATIONS
MIN = -0.244

. The great Zen o hugs Ben and says
saw a big box of Leg os that he really wanted
arms . The se ag ull flew away . Tommy laughed
The se ag ull tried to take the oy
park . D ull had an idea . He
coal . The se ag ull grabbed the coal with its
The se ag ull s and the fish agreed
nearby . The se ag ull swoop ed down to take
away all the se ag ull s . Suddenly
, Sally the Se ag ull , heard him and came
saw all the se ag ull s flying in the sky
one of the se ag ull s suddenly became very aggressive
came to town . The clown had a big red nose
happening . The se ag ull replied that it was a
crashing and the se ag ull s were squ aw king
was a clown . The clown was very funny . He
, and the se ag ull s flying in the sky
a new tank out of Leg os . Tim
. The flaming o nodded its head and
friend Sammy the se ag ull flew by and saw that

INTERVAL 0.073 - 0.179
CONTAINS 17.467%

named Lily . She loved to play outside in the park
igg ly . The dragon was very happy , but he
played together until the sun began to rise . <|endoftext|> L
and Mary were very sad about the news .
black and hard . Lily was sad because they couldn 't

INTERVAL -0.032 - 0.073
CONTAINS 50.101%

she should never touch things she finds outside . <|endoftext|> Once
Everyone agreed , so the brave mouse hopped up on the
Do you want to be my friend ? I have a
. The end . <|endoftext|> Once upon a time , there
, he saw many strange and wonderful things . But he

INTERVAL -0.138 - -0.032
CONTAINS 28.545%

each other laugh . One day , the parents wanted to
, that 's beautiful !" Lily said . She took another
mom will be happy too . But her mom is not
. All the fish liked Sam and wanted to play with
ace tried to do the exercises but she didn 't understand

INTERVAL -0.244 - -0.138
CONTAINS 3.887%

it . <|endoftext|> The little girl was very excited . She
someone was talking inside the shell . " Hello
M ia saw a blue bird on her shoulder . She
Now , the rad ish struggled a lot less and
Sometimes they saw a big turtle with a hard shell .
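
The four INTERVAL blocks above look like equal-width bins spanning MIN = -0.244 to MAX = 0.179. The sketch below is a minimal check of that reading, not the dashboard's own code: the helper name `bin_edges` and the equal-width assumption are illustrative. It recomputes the edges from the displayed MIN and MAX and confirms that the CONTAINS percentages add up to 100%.

```python
# Assumption: the dashboard splits [MIN, MAX] into four equal-width bins.
# bin_edges is a hypothetical helper, not part of any dashboard API.

def bin_edges(lo, hi, n_bins=4):
    """Return the n_bins + 1 edges of equal-width bins over [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]

# MIN and MAX as displayed in the dump (already rounded to 3 decimals).
edges = bin_edges(-0.244, 0.179)

# Edges as displayed in the INTERVAL headers, lowest to highest.
shown_edges = [-0.244, -0.138, -0.032, 0.073, 0.179]

# CONTAINS percentages, listed from the lowest interval to the highest.
shares = [3.887, 28.545, 50.101, 17.467]

for computed, shown in zip(edges, shown_edges):
    print(f"computed {computed:+.5f}   shown {shown:+.3f}")
print(f"total share: {sum(shares):.3f}%")
```

The recomputed edges agree with the displayed ones to within about 0.001, which is consistent with MIN and MAX themselves being rounded for display. Roughly half of all activations fall in the interval around zero, and only about 17% reach the top interval from which the TOP ACTIVATIONS examples are drawn.
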
Token that
Token ID326
Feature activation+6.53e-02

Pos logit contributions
 same+0.045
 chores+0.044
 urge+0.042
 plan+0.042
 temptation+0.040
Neg logit contributions
os-0.089
o-0.085
enny-0.085
ne-0.082
aron-0.081
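
Each per-token record in this dump fuses its labels and values (`Token ID326`, `Feature activation+6.53e-02`, `os-0.089`). The sketch below shows one way such a record could be split back into fields. It assumes the layout seen here (a `Token` line, a `Token ID` line, a `Feature activation` line, then five positive and five negative logit contributions); the function name `parse_record` and the dictionary keys are hypothetical.

```python
import re

# A contribution line is a token string followed by a signed decimal,
# e.g. " same+0.045" or "os-0.089". The leading space is part of the token.
LOGIT_RE = re.compile(r"^(.*?)([+-]\d+\.\d+)$")

def parse_record(lines):
    """Parse one per-token record into a dict (hypothetical layout)."""
    record = {"token": None, "token_id": None, "activation": None,
              "pos": [], "neg": []}
    bucket = None  # which contribution list is currently being filled
    for line in lines:
        if line.startswith("Token ID"):
            record["token_id"] = int(line[len("Token ID"):])
        elif line.startswith("Token"):
            record["token"] = line[len("Token"):]
        elif line.startswith("Feature activation"):
            record["activation"] = float(line[len("Feature activation"):])
        elif line.startswith("Pos logit contributions"):
            bucket = "pos"
        elif line.startswith("Neg logit contributions"):
            bucket = "neg"
        elif bucket and (m := LOGIT_RE.match(line)):
            record[bucket].append((m.group(1), float(m.group(2))))
    return record

# The first record of the dump, copied line for line.
sample = [
    "Token that", "Token ID326", "Feature activation+6.53e-02", "",
    "Pos logit contributions",
    " same+0.045", " chores+0.044", " urge+0.042", " plan+0.042",
    " temptation+0.040",
    "Neg logit contributions",
    "os-0.089", "o-0.085", "enny-0.085", "ne-0.082", "aron-0.081",
]
rec = parse_record(sample)
print(rec["token_id"], rec["activation"], rec["pos"][0], rec["neg"][0])
```

Keeping the leading space in the token string matters here: ` same` and `same` are different tokens in a byte-pair vocabulary, so stripping whitespace would merge distinct entries.
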
Token maybe
Token ID3863
Feature activation+5.58e-02

Pos logit contributions
 chores+0.084
 same+0.083
 urge+0.077
 temptation+0.074
 plan+0.074
Neg logit contributions
os-0.198
o-0.196
ne-0.183
enny-0.181
ark-0.178
Token he
Token ID339
Feature activation+1.25e-02

Pos logit contributions
 chores+0.082
 plan+0.077
 same+0.077
 urge+0.077
 temptation+0.071
Neg logit contributions
o-0.149
os-0.147
ne-0.145
enny-0.136
aron-0.135
Token can
Token ID460
Feature activation+1.03e-01

Pos logit contributions
 chores+0.022
 same+0.021
 plan+0.020
 urge+0.020
 start+0.019
Neg logit contributions
enny-0.033
os-0.033
o-0.032
ark-0.032
oing-0.032
Token try
Token ID1949
Feature activation+1.14e-01

Pos logit contributions
 chores+0.115
 same+0.110
 plan+0.096
 temptation+0.095
 urge+0.095
Neg logit contributions
os-0.338
o-0.333
enny-0.328
ark-0.321
ne-0.320
Token to
Token ID284
Feature activation+1.60e-01

Pos logit contributions
 chores+0.164
 same+0.159
 plan+0.145
 urge+0.143
 temptation+0.140
Neg logit contributions
os-0.337
o-0.330
enny-0.328
ark-0.319
umping-0.318
Token swim
Token ID9422
Feature activation-1.93e-02

Pos logit contributions
 chores+0.197
 same+0.186
 temptation+0.166
 urge+0.163
 going+0.160
Neg logit contributions
os-0.480
o-0.475
ne-0.468
enny-0.458
aron-0.457
Token with
Token ID351
Feature activation+1.28e-01

Pos logit contributions
os+0.053
enny+0.052
umping+0.051
o+0.051
oing+0.051
Neg logit contributions
 same-0.032
 chores-0.032
 plan-0.029
 bad-0.029
 start-0.028
Token her
Token ID607
Feature activation+3.01e-02

Pos logit contributions
 chores+0.154
 same+0.145
 plan+0.141
 urge+0.137
 going+0.128
Neg logit contributions
os-0.397
o-0.389
enny-0.383
ne-0.373
aron-0.370
Token.
Token ID13
Feature activation-6.27e-02

Pos logit contributions
 chores+0.055
 same+0.053
 plan+0.050
 urge+0.049
 way+0.048
Neg logit contributions
os-0.078
o-0.076
enny-0.076
umping-0.074
ark-0.074
Token He
Token ID679
Feature activation-1.12e-02

Pos logit contributions
os+0.169
umping+0.164
oing+0.162
ark+0.160
enny+0.160
Neg logit contributions
 same-0.103
 chores-0.101
 plan-0.093
 temptation-0.093
 urge-0.091
Token
Token ID198
Feature activation-6.27e-02

Pos logit contributions
os+0.159
umping+0.154
oing+0.153
enny+0.151
o+0.150
Neg logit contributions
 chores-0.115
 same-0.112
 plan-0.106
 temptation-0.102
 bad-0.102
Token"
Token ID1
Feature activation-3.14e-02

Pos logit contributions
os+0.146
o+0.137
umping+0.136
enny+0.134
oing+0.133
Neg logit contributions
 chores-0.129
 same-0.125
 plan-0.119
 temptation-0.118
 urge-0.114
TokenCan
Token ID6090
Feature activation+7.55e-02

Pos logit contributions
os+0.082
o+0.078
enny+0.078
umping+0.076
oing+0.076
Neg logit contributions
 same-0.057
 chores-0.054
 start-0.051
 rain-0.051
 urge-0.051
Token I
Token ID314
Feature activation+4.55e-02

Pos logit contributions
 chores+0.122
 same+0.117
 plan+0.110
 urge+0.108
 temptation+0.106
Neg logit contributions
os-0.211
o-0.208
enny-0.204
ne-0.199
ark-0.198
Token pay
Token ID1414
Feature activation+8.74e-02

Pos logit contributions
 chores+0.047
 same+0.045
 plan+0.040
 urge+0.039
 start+0.037
Neg logit contributions
os-0.154
o-0.151
enny-0.150
umping-0.147
ark-0.147
Token to
Token ID284
Feature activation+1.60e-01

Pos logit contributions
 chores+0.109
 same+0.105
 urge+0.099
 plan+0.098
 temptation+0.094
Neg logit contributions
os-0.273
o-0.269
enny-0.265
ark-0.258
ne-0.257
Token play
Token ID711
Feature activation+5.59e-02

Pos logit contributions
 chores+0.170
 same+0.169
 temptation+0.159
 urge+0.155
 plan+0.150
Neg logit contributions
os-0.511
o-0.509
enny-0.490
ne-0.490
round-0.477
Token with
Token ID351
Feature activation+1.29e-01

Pos logit contributions
 chores+0.104
 same+0.101
 plan+0.094
 urge+0.093
 start+0.091
Neg logit contributions
os-0.144
o-0.139
enny-0.139
umping-0.135
ark-0.134
Token your
Token ID534
Feature activation+6.97e-02

Pos logit contributions
 chores+0.140
 same+0.134
 plan+0.125
 urge+0.123
 temptation+0.116
Neg logit contributions
os-0.426
o-0.417
enny-0.413
ne-0.402
ark-0.401
Token dog
Token ID3290
Feature activation-1.34e-01

Pos logit contributions
 chores+0.105
 same+0.103
 temptation+0.101
 going+0.098
 urge+0.097
Neg logit contributions
o-0.183
os-0.181
itty-0.170
enny-0.170
ne-0.168
Token?"
Token ID1701
Feature activation-7.43e-02

Pos logit contributions
os+0.354
o+0.344
enny+0.343
ark+0.333
umping+0.332
Neg logit contributions
 chores-0.235
 same-0.229
 plan-0.213
 urge-0.210
 start-0.205
Token made
Token ID925
Feature activation+1.32e-01

Pos logit contributions
 chores+0.080
 same+0.074
 temptation+0.071
 urge+0.070
 plan+0.069
Neg logit contributions
os-0.124
o-0.120
enny-0.114
ne-0.113
umping-0.110
Token sure
Token ID1654
Feature activation+4.30e-02

Pos logit contributions
 chores+0.146
 same+0.136
 urge+0.134
 plan+0.126
 temptation+0.124
Neg logit contributions
os-0.426
o-0.413
enny-0.405
umping-0.393
ne-0.389
Token was
Token ID373
Feature activation+1.22e-01

Pos logit contributions
 chores+0.060
 same+0.059
 urge+0.058
 temptation+0.056
 plan+0.054
Neg logit contributions
os-0.125
o-0.119
enny-0.119
ne-0.113
ark-0.112
Token careful
Token ID8161
Feature activation+7.11e-02

Pos logit contributions
 chores+0.171
 same+0.158
 temptation+0.151
 urge+0.149
 plan+0.147
Neg logit contributions
o-0.350
os-0.345
umping-0.321
enny-0.320
aron-0.319
Token not
Token ID407
Feature activation+1.22e-01

Pos logit contributions
 chores+0.137
 same+0.133
 plan+0.125
 urge+0.124
 temptation+0.121
Neg logit contributions
os-0.177
o-0.172
enny-0.169
ark-0.165
umping-0.164
Token to
Token ID284
Feature activation+1.59e-01

Pos logit contributions
 chores+0.217
 same+0.207
 urge+0.195
 plan+0.195
 temptation+0.192
Neg logit contributions
os-0.318
o-0.316
enny-0.300
ark-0.293
umping-0.293
Token fall
Token ID2121
Feature activation+3.29e-02

Pos logit contributions
 chores+0.196
 same+0.183
 plan+0.165
 urge+0.161
 temptation+0.160
Neg logit contributions
os-0.499
o-0.496
enny-0.477
ne-0.475
ark-0.474
Token.
Token ID13
Feature activation-6.78e-02

Pos logit contributions
 chores+0.064
 same+0.061
 plan+0.059
 urge+0.058
 temptation+0.057
Neg logit contributions
os-0.081
o-0.079
enny-0.077
umping-0.075
ark-0.074
Token
Token ID198
Feature activation-7.50e-02

Pos logit contributions
os+0.179
oing+0.178
ark+0.176
umping+0.175
enny+0.171
Neg logit contributions
 same-0.107
 chores-0.107
 way-0.096
 temptation-0.096
 plan-0.095
Token
Token ID198
Feature activation-7.42e-02

Pos logit contributions
os+0.205
oing+0.200
umping+0.200
aron+0.195
enny+0.195
Neg logit contributions
 chores-0.120
 same-0.111
 plan-0.108
 other-0.102
 bad-0.101
TokenSuddenly
Token ID38582
Feature activation+1.29e-02

Pos logit contributions
os+0.179
o+0.168
umping+0.167
oing+0.166
aron+0.166
Neg logit contributions
 chores-0.149
 same-0.141
 plan-0.136
 temptation-0.131
 start-0.130
Token find
Token ID1064
Feature activation+9.05e-02

Pos logit contributions
 chores+0.114
 same+0.111
 plan+0.104
 urge+0.103
 going+0.101
Neg logit contributions
os-0.138
o-0.135
enny-0.134
ark-0.129
umping-0.129
Token!
Token ID0
Feature activation-4.01e-02

Pos logit contributions
 chores+0.115
 urge+0.107
 plan+0.104
 same+0.104
 temptation+0.100
Neg logit contributions
os-0.268
o-0.265
enny-0.258
ne-0.250
and-0.244
Token Be
Token ID1355
Feature activation+4.96e-02

Pos logit contributions
os+0.101
umping+0.097
oing+0.097
o+0.097
enny+0.097
Neg logit contributions
 chores-0.071
 same-0.071
 plan-0.066
 urge-0.065
 start-0.065
Token careful
Token ID8161
Feature activation+6.32e-02

Pos logit contributions
 chores+0.080
 same+0.072
 urge+0.067
 temptation+0.066
 plan+0.065
Neg logit contributions
o-0.138
os-0.138
enny-0.133
aron-0.132
ne-0.131
Token not
Token ID407
Feature activation+1.13e-01

Pos logit contributions
 chores+0.105
 same+0.102
 plan+0.094
 urge+0.093
 temptation+0.092
Neg logit contributions
os-0.174
enny-0.169
o-0.169
umping-0.164
ark-0.163
Token to
Token ID284
Feature activation+1.59e-01

Pos logit contributions
 chores+0.210
 same+0.195
 urge+0.184
 temptation+0.184
 plan+0.182
Neg logit contributions
o-0.291
os-0.290
enny-0.273
ark-0.271
ne-0.269
Token hurt
Token ID5938
Feature activation+6.94e-02

Pos logit contributions
 chores+0.208
 same+0.186
 plan+0.169
 urge+0.167
 temptation+0.167
Neg logit contributions
o-0.485
os-0.474
ark-0.465
ne-0.462
enny-0.455
Token yourself
Token ID3511
Feature activation+6.16e-03

Pos logit contributions
 chores+0.082
 same+0.077
 plan+0.072
 urge+0.070
 start+0.066
Neg logit contributions
os-0.222
o-0.219
enny-0.215
ark-0.212
ne-0.211
Token.
Token ID13
Feature activation-6.11e-02

Pos logit contributions
 chores+0.012
 same+0.012
 plan+0.011
 urge+0.011
 temptation+0.011
Neg logit contributions
os-0.015
o-0.014
enny-0.014
ark-0.014
umping-0.014
Token Where
Token ID6350
Feature activation-3.30e-03

Pos logit contributions
os+0.163
umping+0.160
oing+0.160
ark+0.157
enny+0.154
Neg logit contributions
 chores-0.096
 same-0.095
 temptation-0.088
 urge-0.086
 plan-0.085
Token did
Token ID750
Feature activation+9.57e-02

Pos logit contributions
os+0.009
o+0.009
enny+0.009
ark+0.009
umping+0.008
Neg logit contributions
 chores-0.005
 same-0.005
 plan-0.005
 urge-0.005
 temptation-0.005
Token for
Token ID329
Feature activation+1.20e-01

Pos logit contributions
os+0.316
enny+0.315
ark+0.307
oing+0.305
umping+0.304
Neg logit contributions
 chores-0.252
 same-0.245
 plan-0.233
 going-0.231
 start-0.230
Token helping
Token ID5742
Feature activation+1.08e-01

Pos logit contributions
 same+0.149
 chores+0.144
 urge+0.135
 plan+0.128
 temptation+0.124
Neg logit contributions
o-0.363
os-0.361
ne-0.342
enny-0.342
aron-0.341
Token them
Token ID606
Feature activation-1.09e-02

Pos logit contributions
 chores+0.150
 same+0.147
 urge+0.137
 plan+0.135
 temptation+0.132
Neg logit contributions
os-0.320
o-0.315
enny-0.306
ark-0.300
ne-0.298
Token stay
Token ID2652
Feature activation+9.77e-02

Pos logit contributions
os+0.028
o+0.028
enny+0.027
ark+0.026
umping+0.026
Neg logit contributions
 chores-0.019
 same-0.019
 urge-0.017
 plan-0.017
 temptation-0.017
Token safe
Token ID3338
Feature activation+3.78e-02

Pos logit contributions
 chores+0.170
 same+0.164
 urge+0.151
 plan+0.151
 temptation+0.151
Neg logit contributions
o-0.245
os-0.240
aron-0.238
oing-0.232
umping-0.232
Token during
Token ID1141
Feature activation+1.59e-01

Pos logit contributions
 chores+0.082
 same+0.080
 plan+0.076
 urge+0.075
 temptation+0.074
Neg logit contributions
os-0.084
o-0.082
enny-0.080
ark-0.078
umping-0.077
Token the
Token ID262
Feature activation+8.03e-02

Pos logit contributions
 same+0.191
 urge+0.183
 plan+0.164
 chores+0.158
 idea+0.158
Neg logit contributions
os-0.442
o-0.440
ne-0.410
enny-0.404
enn-0.402
Token storm
Token ID6388
Feature activation-9.64e-02

Pos logit contributions
 chores+0.081
 same+0.076
 her+0.072
 urge+0.071
 plan+0.069
Neg logit contributions
o-0.228
ne-0.227
os-0.220
round-0.219
oing-0.216
Token.
Token ID13
Feature activation-6.89e-02

Pos logit contributions
enny+0.244
os+0.240
o+0.232
ark+0.231
umping+0.229
Neg logit contributions
 chores-0.180
 same-0.178
 work-0.166
 way-0.165
 going-0.164
Token<|endoftext|>
Token ID50256
Feature activation-1.26e-01

Pos logit contributions
os+0.179
oing+0.177
umping+0.176
ark+0.174
enny+0.173
Neg logit contributions
 chores-0.112
 same-0.110
 way-0.102
 plan-0.101
 going-0.100
TokenOnce
Token ID7454
Feature activation-1.87e-02

Pos logit contributions
oing+0.376
ist+0.364
ark+0.363
round+0.363
enny+0.356
Neg logit contributions
 same-0.172
 other-0.158
 way-0.155
 the-0.154
 going-0.148
Token.
Token ID13
Feature activation-3.74e-02

Pos logit contributions
 chores+0.030
 same+0.028
 plan+0.027
 urge+0.026
 temptation+0.026
Neg logit contributions
os-0.051
o-0.051
enny-0.049
umping-0.048
oing-0.048
Token Would
Token ID10928
Feature activation+4.65e-02

Pos logit contributions
os+0.097
enny+0.095
umping+0.094
ark+0.094
o+0.094
Neg logit contributions
 chores-0.063
 same-0.063
 plan-0.058
 urge-0.057
 going-0.057
Token she
Token ID673
Feature activation+1.29e-03

Pos logit contributions
 chores+0.062
 same+0.060
 plan+0.055
 urge+0.054
 temptation+0.052
Neg logit contributions
os-0.143
o-0.140
enny-0.139
ark-0.135
umping-0.135
Token be
Token ID307
Feature activation+1.04e-01

Pos logit contributions
 chores+0.002
 same+0.002
 plan+0.002
 urge+0.002
 going+0.002
Neg logit contributions
os-0.004
enny-0.004
o-0.004
umping-0.004
oing-0.004
Token able
Token ID1498
Feature activation+1.17e-01

Pos logit contributions
 chores+0.127
 same+0.119
 plan+0.108
 urge+0.108
 temptation+0.105
Neg logit contributions
os-0.330
o-0.328
enny-0.322
ark-0.314
umping-0.312
Token to
Token ID284
Feature activation+1.59e-01

Pos logit contributions
 chores+0.222
 same+0.211
 plan+0.204
 urge+0.202
 temptation+0.200
Neg logit contributions
o-0.293
os-0.291
enny-0.281
ne-0.273
ark-0.267
Token make
Token ID787
Feature activation+1.46e-01

Pos logit contributions
 chores+0.199
 same+0.195
 temptation+0.186
 urge+0.170
 bad+0.165
Neg logit contributions
o-0.467
os-0.464
ne-0.456
enny-0.441
ark-0.437
Token the
Token ID262
Feature activation+7.28e-02

Pos logit contributions
 chores+0.150
 same+0.146
 urge+0.141
 plan+0.139
 temptation+0.132
Neg logit contributions
os-0.476
o-0.467
enny-0.460
ne-0.451
umping-0.443
Token journey
Token ID7002
Feature activation-4.00e-02

Pos logit contributions
 chores+0.096
 same+0.085
 urge+0.084
 temptation+0.084
 going+0.081
Neg logit contributions
o-0.200
os-0.200
ne-0.195
umping-0.189
oing-0.189
Token,
Token ID11
Feature activation+3.73e-02

Pos logit contributions
os+0.110
o+0.107
enny+0.106
ark+0.102
ne+0.102
Neg logit contributions
 chores-0.065
 same-0.063
 urge-0.059
 plan-0.059
 temptation-0.058
Token and
Token ID290
Feature activation+4.15e-02

Pos logit contributions
 chores+0.061
 same+0.059
 temptation+0.054
 plan+0.054
 urge+0.054
Neg logit contributions
os-0.101
o-0.100
enny-0.098
ne-0.095
ark-0.094
Token,
Token ID11
Feature activation+6.53e-02

Pos logit contributions
 chores+0.051
 same+0.051
 plan+0.047
 urge+0.047
 start+0.047
Neg logit contributions
os-0.062
enny-0.060
o-0.059
umping-0.059
ark-0.058
Token but
Token ID475
Feature activation+4.35e-02

Pos logit contributions
 chores+0.106
 same+0.099
 temptation+0.092
 urge+0.092
 plan+0.091
Neg logit contributions
o-0.179
os-0.179
enny-0.169
ne-0.166
aron-0.166
Token be
Token ID307
Feature activation+8.69e-02

Pos logit contributions
 chores+0.067
 same+0.062
 temptation+0.058
 plan+0.057
 urge+0.057
Neg logit contributions
o-0.124
os-0.123
enny-0.117
ne-0.115
ark-0.111
Token careful
Token ID8161
Feature activation+6.71e-02

Pos logit contributions
 chores+0.146
 same+0.136
 urge+0.129
 plan+0.128
 temptation+0.126
Neg logit contributions
o-0.240
os-0.232
aron-0.225
umping-0.224
enny-0.222
Token not
Token ID407
Feature activation+1.12e-01

Pos logit contributions
 chores+0.125
 same+0.122
 plan+0.113
 urge+0.112
 going+0.112
Neg logit contributions
os-0.171
enny-0.168
o-0.165
umping-0.161
ark-0.159
Token to
Token ID284
Feature activation+1.59e-01

Pos logit contributions
 chores+0.188
 same+0.180
 plan+0.168
 urge+0.166
 temptation+0.164
Neg logit contributions
os-0.305
o-0.301
enny-0.292
ark-0.286
umping-0.284
Token hurt
Token ID5938
Feature activation+6.26e-02

Pos logit contributions
 chores+0.197
 same+0.185
 plan+0.169
 urge+0.165
 temptation+0.161
Neg logit contributions
os-0.494
o-0.494
enny-0.474
ark-0.474
ne-0.470
Token yourself
Token ID3511
Feature activation+1.83e-03

Pos logit contributions
 chores+0.075
 same+0.071
 plan+0.065
 urge+0.064
 temptation+0.060
Neg logit contributions
os-0.201
o-0.198
enny-0.194
ark-0.191
ne-0.189
Token."
Token ID526
Feature activation-7.94e-02

Pos logit contributions
 chores+0.004
 same+0.004
 plan+0.003
 urge+0.003
 temptation+0.003
Neg logit contributions
os-0.004
o-0.004
enny-0.004
umping-0.004
ark-0.004
Token Lily
Token ID20037
Feature activation-9.53e-02

Pos logit contributions
os+0.199
umping+0.192
ark+0.191
enny+0.190
oing+0.190
Neg logit contributions
 chores-0.146
 same-0.143
 plan-0.134
 urge-0.133
 temptation-0.131
Token carefully
Token ID7773
Feature activation+5.48e-02

Pos logit contributions
os+0.231
enny+0.224
o+0.223
ark+0.218
oing+0.216
Neg logit contributions
 chores-0.190
 same-0.186
 plan-0.177
 urge-0.174
 start-0.172
Token mirror
Token ID10162
Feature activation-1.59e-01

Pos logit contributions
 chores+0.099
 same+0.089
 urge+0.087
 temptation+0.085
 going+0.084
Neg logit contributions
o-0.206
os-0.204
ne-0.199
enny-0.196
aron-0.193
Token away
Token ID1497
Feature activation+5.55e-02

Pos logit contributions
os+0.386
o+0.376
enny+0.375
ark+0.363
umping+0.361
Neg logit contributions
 chores-0.311
 same-0.305
 plan-0.285
 urge-0.283
 start-0.278
Token and
Token ID290
Feature activation+3.87e-02

Pos logit contributions
 chores+0.121
 same+0.117
 urge+0.110
 plan+0.110
 temptation+0.109
Neg logit contributions
os-0.122
o-0.118
enny-0.115
aron-0.113
ark-0.113
Token vowed
Token ID19982
Feature activation+6.68e-02

Pos logit contributions
 chores+0.063
 same+0.059
 temptation+0.058
 plan+0.054
 urge+0.054
Neg logit contributions
os-0.106
o-0.101
enny-0.094
ne-0.094
aron-0.092
Token never
Token ID1239
Feature activation+1.24e-01

Pos logit contributions
 chores+0.112
 same+0.107
 plan+0.103
 urge+0.102
 temptation+0.099
Neg logit contributions
o-0.179
os-0.178
enny-0.173
ark-0.170
ne-0.167
Token to
Token ID284
Feature activation+1.59e-01

Pos logit contributions
 chores+0.253
 same+0.223
 plan+0.217
 temptation+0.215
 urge+0.210
Neg logit contributions
o-0.299
os-0.289
ne-0.279
enny-0.277
umping-0.266
Token use
Token ID779
Feature activation+1.13e-01

Pos logit contributions
 chores+0.204
 same+0.184
 temptation+0.169
 plan+0.164
 urge+0.160
Neg logit contributions
os-0.477
o-0.470
ne-0.465
enny-0.460
ark-0.446
Token it
Token ID340
Feature activation-2.47e-02

Pos logit contributions
 urge+0.136
 chores+0.136
 plan+0.135
 same+0.132
 temptation+0.122
Neg logit contributions
os-0.338
o-0.326
enny-0.323
ne-0.317
aron-0.309
Token again
Token ID757
Feature activation+2.06e-02

Pos logit contributions
os+0.056
o+0.054
enny+0.054
ark+0.051
umping+0.051
Neg logit contributions
 chores-0.052
 same-0.051
 plan-0.049
 urge-0.048
 temptation-0.048
Token.
Token ID13
Feature activation-7.17e-02

Pos logit contributions
 chores+0.045
 same+0.044
 plan+0.041
 urge+0.041
 temptation+0.041
Neg logit contributions
os-0.046
o-0.044
enny-0.044
umping-0.042
ark-0.042
Token But
Token ID887
Feature activation+1.58e-02

Pos logit contributions
os+0.188
umping+0.185
oing+0.184
ark+0.180
ne+0.175
Neg logit contributions
 same-0.115
 chores-0.114
 way-0.106
 plan-0.106
 urge-0.105
Token that
Token ID326
Feature activation+5.76e-02

Pos logit contributions
 chores+0.067
 same+0.067
 urge+0.062
 plan+0.061
 temptation+0.061
Neg logit contributions
o-0.098
os-0.098
enny-0.096
ne-0.092
ark-0.091
Token it
Token ID340
Feature activation-3.65e-02

Pos logit contributions
 same+0.085
 chores+0.083
 urge+0.076
 start+0.074
 temptation+0.073
Neg logit contributions
os-0.159
o-0.157
enny-0.147
ne-0.146
es-0.143
Token doesn
Token ID1595
Feature activation+5.91e-02

Pos logit contributions
os+0.080
o+0.078
enny+0.074
ne+0.073
umping+0.071
Neg logit contributions
 chores-0.081
 same-0.078
 temptation-0.074
 plan-0.074
 urge-0.073
Token't
Token ID470
Feature activation+1.14e-01

Pos logit contributions
 chores+0.089
 same+0.087
 plan+0.081
 urge+0.080
 way+0.079
Neg logit contributions
os-0.167
enny-0.164
o-0.162
ne-0.158
ark-0.156
Token pay
Token ID1414
Feature activation+1.01e-01

Pos logit contributions
 chores+0.150
 same+0.130
 kitchen+0.120
 laundry+0.119
 way+0.113
Neg logit contributions
o-0.324
ne-0.321
os-0.316
enny-0.308
ark-0.306
Token to
Token ID284
Feature activation+1.58e-01

Pos logit contributions
 chores+0.123
 same+0.117
 urge+0.113
 plan+0.113
 temptation+0.103
Neg logit contributions
os-0.312
o-0.309
enny-0.303
ark-0.295
ne-0.295
Token go
Token ID467
Feature activation+8.24e-02

Pos logit contributions
 chores+0.186
 same+0.179
 urge+0.161
 plan+0.155
 temptation+0.154
Neg logit contributions
os-0.478
o-0.475
ne-0.461
enny-0.452
ark-0.449
Token too
Token ID1165
Feature activation+4.28e-02

Pos logit contributions
 chores+0.146
 plan+0.138
 same+0.137
 urge+0.136
 temptation+0.131
Neg logit contributions
o-0.208
os-0.207
enny-0.196
ne-0.193
ark-0.191
Token fast
Token ID3049
Feature activation+5.31e-02

Pos logit contributions
 chores+0.056
 same+0.055
 plan+0.052
 urge+0.052
 start+0.049
Neg logit contributions
os-0.128
o-0.126
enny-0.124
ark-0.122
ne-0.121
Token!
Token ID0
Feature activation-5.39e-02

Pos logit contributions
 chores+0.117
 same+0.114
 plan+0.109
 urge+0.107
 temptation+0.105
Neg logit contributions
os-0.116
o-0.112
enny-0.110
umping-0.108
ark-0.106
Token Whenever
Token ID21326
Feature activation+6.11e-02

Pos logit contributions
os+0.143
oing+0.140
umping+0.140
ark+0.138
as+0.133
Neg logit contributions
 chores-0.083
 same-0.082
 temptation-0.077
 plan-0.074
 way-0.074
Token says
Token ID1139
Feature activation+1.33e-02

Pos logit contributions
os+0.450
enny+0.441
o+0.440
ark+0.428
umping+0.428
Neg logit contributions
 chores-0.343
 same-0.335
 plan-0.318
 urge-0.312
 start-0.306
Token.
Token ID13
Feature activation-6.35e-02

Pos logit contributions
 chores+0.025
 same+0.025
 plan+0.023
 urge+0.023
 start+0.023
Neg logit contributions
os-0.033
enny-0.032
o-0.032
umping-0.031
ark-0.031
Token She
Token ID1375
Feature activation-5.57e-03

Pos logit contributions
os+0.176
umping+0.169
enny+0.169
ark+0.169
o+0.168
Neg logit contributions
 same-0.096
 chores-0.095
 start-0.086
 plan-0.085
 going-0.085
Token pret
Token ID2181
Feature activation-7.34e-02

Pos logit contributions
os+0.015
o+0.014
enny+0.014
ne+0.014
umping+0.013
Neg logit contributions
 chores-0.010
 same-0.010
 temptation-0.009
 urge-0.009
 plan-0.009
Tokenends
Token ID2412
Feature activation+4.90e-02

Pos logit contributions
enny+0.173
umping+0.166
os+0.164
ne+0.163
pillar+0.162
Neg logit contributions
 chores-0.150
 same-0.137
 start-0.133
 not-0.132
 going-0.132
Token to
Token ID284
Feature activation+1.58e-01

Pos logit contributions
 chores+0.087
 same+0.085
 urge+0.081
 plan+0.079
 temptation+0.079
Neg logit contributions
os-0.129
o-0.126
enny-0.120
ne-0.117
oing-0.117
Token wear
Token ID5806
Feature activation+5.51e-02

Pos logit contributions
 chores+0.197
 same+0.186
 temptation+0.167
 urge+0.165
 going+0.164
Neg logit contributions
os-0.486
o-0.479
ne-0.463
aron-0.461
enny-0.458
Token the
Token ID262
Feature activation+6.15e-02

Pos logit contributions
 chores+0.058
 same+0.056
 plan+0.049
 urge+0.048
 temptation+0.046
Neg logit contributions
os-0.185
o-0.181
enny-0.181
ark-0.176
umping-0.176
Token hat
Token ID6877
Feature activation-1.58e-01

Pos logit contributions
 chores+0.098
 same+0.090
 plan+0.089
 urge+0.088
 temptation+0.088
Neg logit contributions
os-0.160
o-0.158
ne-0.155
enny-0.154
aron-0.151
Token on
Token ID319
Feature activation+1.03e-01

Pos logit contributions
os+0.395
enny+0.390
o+0.384
umping+0.378
ark+0.377
Neg logit contributions
 chores-0.295
 same-0.289
 plan-0.266
 start-0.264
 urge-0.262
Token her
Token ID607
Feature activation+2.81e-02

Pos logit contributions
 chores+0.097
 same+0.090
 urge+0.083
 plan+0.083
 temptation+0.079
Neg logit contributions
os-0.353
o-0.344
enny-0.342
ark-0.335
ne-0.335
Token They
Token ID1119
Feature activation+3.74e-03

Pos logit contributions
os+0.210
enny+0.203
ark+0.203
o+0.202
umping+0.202
Neg logit contributions
 same-0.114
 chores-0.114
 plan-0.100
 bad-0.100
 temptation-0.099
Token did
Token ID750
Feature activation+1.12e-01

Pos logit contributions
 chores+0.007
 same+0.007
 plan+0.007
 urge+0.007
 start+0.006
Neg logit contributions
os-0.009
o-0.009
enny-0.009
ark-0.009
umping-0.009
Token not
Token ID407
Feature activation+1.02e-01

Pos logit contributions
 same+0.110
 urge+0.107
 chores+0.096
 temptation+0.088
 plan+0.088
Neg logit contributions
os-0.348
o-0.347
oing-0.337
aron-0.336
es-0.329
Token know
Token ID760
Feature activation+5.97e-02

Pos logit contributions
 chores+0.142
 same+0.137
 plan+0.124
 urge+0.122
 temptation+0.120
Neg logit contributions
os-0.312
o-0.306
enny-0.303
ark-0.295
umping-0.295
Token whether
Token ID1771
Feature activation+8.12e-02

Pos logit contributions
 chores+0.092
 same+0.092
 urge+0.087
 plan+0.086
 temptation+0.083
Neg logit contributions
os-0.170
o-0.165
enny-0.160
ne-0.156
umping-0.156
Token to
Token ID284
Feature activation+1.58e-01

Pos logit contributions
 chores+0.107
 same+0.107
 urge+0.101
 plan+0.099
 temptation+0.098
Neg logit contributions
os-0.247
o-0.242
enny-0.232
ne-0.229
umping-0.225
Token laugh
Token ID6487
Feature activation+7.13e-04

Pos logit contributions
 chores+0.216
 same+0.208
 plan+0.189
 urge+0.187
 temptation+0.187
Neg logit contributions
os-0.473
o-0.469
enny-0.450
ne-0.447
oing-0.443
Token or
Token ID393
Feature activation+5.68e-02

Pos logit contributions
 chores+0.001
 same+0.001
 plan+0.001
 urge+0.001
 bad+0.001
Neg logit contributions
os-0.002
o-0.002
enny-0.002
ark-0.002
umping-0.002
Token cry
Token ID3960
Feature activation-1.21e-02

Pos logit contributions
 chores+0.081
 same+0.076
 urge+0.069
 plan+0.069
 temptation+0.068
Neg logit contributions
os-0.162
o-0.160
ne-0.152
oing-0.152
aron-0.151
Token.
Token ID13
Feature activation-6.88e-02

Pos logit contributions
os+0.030
o+0.029
enny+0.029
umping+0.028
ark+0.028
Neg logit contributions
 chores-0.023
 same-0.023
 bad-0.021
 start-0.021
 plan-0.021
Token They
Token ID1119
Feature activation+9.73e-03

Pos logit contributions
os+0.178
enny+0.173
umping+0.172
ark+0.169
o+0.169
Neg logit contributions
 same-0.124
 chores-0.124
 plan-0.112
 bad-0.111
 temptation-0.110
Token,
Token ID11
Feature activation+4.58e-02

Pos logit contributions
 chores+0.060
 same+0.058
 plan+0.054
 urge+0.053
 temptation+0.052
Neg logit contributions
os-0.086
o-0.084
enny-0.084
ark-0.081
ne-0.081
Token but
Token ID475
Feature activation+4.49e-02

Pos logit contributions
 chores+0.073
 same+0.070
 plan+0.065
 urge+0.065
 temptation+0.063
Neg logit contributions
os-0.128
o-0.126
enny-0.125
ark-0.121
umping-0.121
Token be
Token ID307
Feature activation+8.89e-02

Pos logit contributions
 chores+0.074
 same+0.063
 temptation+0.058
 urge+0.058
 plan+0.057
Neg logit contributions
o-0.117
ne-0.112
os-0.111
enny-0.105
ist-0.105
Token careful
Token ID8161
Feature activation+6.95e-02

Pos logit contributions
 chores+0.142
 same+0.136
 urge+0.127
 plan+0.127
 start+0.122
Neg logit contributions
o-0.250
os-0.245
enny-0.237
umping-0.235
aron-0.235
Token not
Token ID407
Feature activation+1.22e-01

Pos logit contributions
 chores+0.125
 same+0.122
 plan+0.113
 urge+0.112
 temptation+0.111
Neg logit contributions
os-0.182
enny-0.178
o-0.176
umping-0.171
ark-0.170
Token to
Token ID284
Feature activation+1.58e-01

Pos logit contributions
 chores+0.203
 same+0.195
 plan+0.181
 urge+0.179
 temptation+0.176
Neg logit contributions
os-0.336
o-0.330
enny-0.324
ark-0.316
umping-0.315
Token drain
Token ID14782
Feature activation-5.93e-02

Pos logit contributions
 chores+0.194
 same+0.184
 plan+0.166
 urge+0.162
 temptation+0.159
Neg logit contributions
os-0.498
o-0.493
enny-0.480
ark-0.475
ne-0.471
Token your
Token ID534
Feature activation+7.32e-02

Pos logit contributions
os+0.168
o+0.164
enny+0.160
ark+0.157
ne+0.156
Neg logit contributions
 chores-0.089
 same-0.086
 urge-0.083
 plan-0.083
 temptation-0.080
Token energy
Token ID2568
Feature activation-6.42e-02

Pos logit contributions
 chores+0.102
 same+0.095
 temptation+0.095
 going+0.094
 urge+0.090
Neg logit contributions
o-0.195
os-0.194
enny-0.185
itty-0.183
enn-0.182
Token too
Token ID1165
Feature activation+3.46e-02

Pos logit contributions
os+0.152
o+0.148
enny+0.147
ark+0.142
umping+0.142
Neg logit contributions
 chores-0.130
 same-0.127
 plan-0.119
 urge-0.118
 temptation-0.116
Token much
Token ID881
Feature activation+5.06e-02

Pos logit contributions
 chores+0.066
 same+0.063
 plan+0.061
 urge+0.060
 start+0.058
Neg logit contributions
os-0.085
o-0.084
enny-0.082
ark-0.080
ne-0.080
Token suggested
Token ID5220
Feature activation+9.19e-02

Pos logit contributions
os+0.021
o+0.020
enny+0.020
ne+0.019
umping+0.019
Neg logit contributions
 chores-0.016
 same-0.016
 temptation-0.015
 plan-0.015
 urge-0.015
Token that
Token ID326
Feature activation+5.33e-02

Pos logit contributions
 same+0.128
 chores+0.128
 urge+0.117
 plan+0.116
 temptation+0.112
Neg logit contributions
os-0.273
o-0.268
enny-0.263
ark-0.255
ne-0.255
Token she
Token ID673
Feature activation-1.55e-02

Pos logit contributions
 chores+0.077
 same+0.076
 urge+0.069
 temptation+0.068
 plan+0.067
Neg logit contributions
os-0.158
o-0.155
enny-0.149
ne-0.146
ark-0.144
Token could
Token ID714
Feature activation+9.86e-02

Pos logit contributions
os+0.041
o+0.040
enny+0.040
umping+0.039
ark+0.039
Neg logit contributions
 chores-0.027
 same-0.026
 plan-0.025
 urge-0.024
 start-0.024
Token try
Token ID1949
Feature activation+1.04e-01

Pos logit contributions
 chores+0.113
 same+0.111
 temptation+0.098
 plan+0.093
 urge+0.093
Neg logit contributions
os-0.318
o-0.313
enny-0.304
ne-0.298
ark-0.297
Token to
Token ID284
Feature activation+1.57e-01

Pos logit contributions
 chores+0.162
 same+0.158
 urge+0.148
 plan+0.147
 temptation+0.143
Neg logit contributions
os-0.295
o-0.289
enny-0.284
umping-0.276
ark-0.276
Token make
Token ID787
Feature activation+1.44e-01

Pos logit contributions
 chores+0.202
 same+0.193
 temptation+0.175
 urge+0.163
 bad+0.163
Neg logit contributions
os-0.453
ne-0.448
o-0.445
enny-0.433
aron-0.426
Token a
Token ID257
Feature activation+6.16e-02

Pos logit contributions
 chores+0.157
 same+0.155
 urge+0.147
 plan+0.144
 temptation+0.135
Neg logit contributions
os-0.457
o-0.447
enny-0.439
ne-0.426
umping-0.425
Token game
Token ID983
Feature activation-3.60e-02

Pos logit contributions
 chores+0.082
 same+0.079
 going+0.074
 temptation+0.074
 urge+0.072
Neg logit contributions
o-0.172
oing-0.169
os-0.169
enny-0.165
ne-0.163
Token out
Token ID503
Feature activation+4.23e-02

Pos logit contributions
os+0.083
o+0.081
enny+0.080
ark+0.076
ne+0.076
Neg logit contributions
 chores-0.075
 same-0.073
 urge-0.069
 plan-0.069
 temptation-0.068
Token of
Token ID286
Feature activation+1.20e-01

Pos logit contributions
 chores+0.089
 same+0.088
 plan+0.082
 urge+0.082
 temptation+0.081
Neg logit contributions
os-0.097
o-0.093
enny-0.092
umping-0.089
ark-0.088
Token we
Token ID356
Feature activation+5.16e-02

Pos logit contributions
 chores+0.100
 same+0.088
 temptation+0.086
 urge+0.082
 plan+0.078
Neg logit contributions
o-0.114
os-0.105
ne-0.103
enny-0.095
ark-0.095
Token must
Token ID1276
Feature activation+1.26e-01

Pos logit contributions
 chores+0.081
 same+0.080
 temptation+0.073
 urge+0.071
 plan+0.070
Neg logit contributions
os-0.144
o-0.142
enny-0.137
ne-0.135
ark-0.132
Token be
Token ID307
Feature activation+1.04e-01

Pos logit contributions
 same+0.150
 chores+0.147
 temptation+0.130
 going+0.122
 bad+0.120
Neg logit contributions
o-0.376
os-0.366
aron-0.364
ark-0.359
ne-0.357
Token careful
Token ID8161
Feature activation+7.70e-02

Pos logit contributions
 chores+0.169
 same+0.161
 urge+0.153
 temptation+0.150
 plan+0.150
Neg logit contributions
o-0.293
os-0.281
aron-0.267
umping-0.264
enny-0.262
Token not
Token ID407
Feature activation+1.25e-01

Pos logit contributions
 chores+0.144
 same+0.140
 plan+0.129
 urge+0.129
 going+0.128
Neg logit contributions
os-0.195
enny-0.193
o-0.189
umping-0.184
ark-0.182
Token to
Token ID284
Feature activation+1.57e-01

Pos logit contributions
 chores+0.211
 same+0.203
 plan+0.189
 urge+0.187
 temptation+0.184
Neg logit contributions
os-0.341
o-0.335
enny-0.328
ark-0.320
umping-0.319
Token hit
Token ID2277
Feature activation+6.09e-02

Pos logit contributions
 chores+0.211
 same+0.193
 temptation+0.174
 urge+0.174
 plan+0.174
Neg logit contributions
o-0.477
os-0.470
ne-0.450
ark-0.449
enny-0.446
Token anyone
Token ID2687
Feature activation-7.64e-02

Pos logit contributions
 chores+0.057
 same+0.053
 plan+0.048
 urge+0.048
 temptation+0.044
Neg logit contributions
os-0.211
o-0.207
enny-0.203
ark-0.200
ne-0.199
Token."
Token ID526
Feature activation-9.03e-02

Pos logit contributions
enny+0.190
os+0.188
o+0.185
umping+0.183
oing+0.182
Neg logit contributions
 chores-0.142
 same-0.140
 bad-0.129
 start-0.129
 plan-0.129
Token\n
Token ID198
Feature activation-7.70e-02

Pos logit contributions
ark+0.253
umping+0.252
os+0.250
oing+0.248
enny+0.241
Neg logit contributions
 same-0.124
 chores-0.123
 way-0.123
 her-0.119
 plan-0.115
Token\n
Token ID198
Feature activation-7.69e-02

Pos logit contributions
umping+0.212
os+0.211
oing+0.208
round+0.202
enny+0.201
Neg logit contributions
 chores-0.116
 same-0.111
 plan-0.108
 other-0.106
 bad-0.101
Token Then
Token ID3244
Feature activation+5.31e-02

Pos logit contributions
os+0.037
enny+0.036
ne+0.036
o+0.036
aron+0.035
Neg logit contributions
 chores-0.015
 same-0.015
 urge-0.014
 temptation-0.014
 plan-0.014
Token they
Token ID484
Feature activation+1.96e-02

Pos logit contributions
 chores+0.086
 same+0.082
 temptation+0.076
 urge+0.075
 plan+0.075
Neg logit contributions
os-0.147
o-0.146
enny-0.142
ne-0.137
umping-0.134
Token decide
Token ID5409
Feature activation+7.65e-02

Pos logit contributions
 chores+0.034
 same+0.033
 temptation+0.031
 plan+0.030
 urge+0.030
Neg logit contributions
os-0.053
o-0.051
enny-0.050
ne-0.049
ark-0.048
Token to
Token ID284
Feature activation+1.45e-01

Pos logit contributions
 chores+0.139
 same+0.138
 urge+0.131
 temptation+0.128
 plan+0.127
Neg logit contributions
os-0.193
o-0.187
enny-0.185
ark-0.180
ne-0.180
Token try
Token ID1949
Feature activation+1.05e-01

Pos logit contributions
 same+0.177
 chores+0.175
 temptation+0.168
 bad+0.151
 urge+0.146
Neg logit contributions
os-0.429
o-0.423
ne-0.410
enny-0.398
ark-0.394
Token to
Token ID284
Feature activation+1.57e-01

Pos logit contributions
 chores+0.126
 same+0.125
 urge+0.120
 temptation+0.114
 plan+0.114
Neg logit contributions
os-0.325
o-0.317
enny-0.312
umping-0.305
ark-0.304
Token be
Token ID307
Feature activation+1.06e-01

Pos logit contributions
 chores+0.189
 same+0.189
 temptation+0.169
 urge+0.164
 going+0.159
Neg logit contributions
os-0.453
ne-0.440
o-0.440
enny-0.424
aron-0.423
Token nice
Token ID3621
Feature activation+3.95e-02

Pos logit contributions
 chores+0.135
 same+0.127
 temptation+0.117
 urge+0.115
 plan+0.110
Neg logit contributions
os-0.319
o-0.319
aron-0.301
enny-0.300
umping-0.300
Token to
Token ID284
Feature activation+1.39e-01

Pos logit contributions
 chores+0.078
 same+0.076
 plan+0.072
 urge+0.070
 going+0.069
Neg logit contributions
os-0.096
o-0.094
enny-0.093
ark-0.090
oing-0.090
Token Farmer
Token ID37755
Feature activation-1.34e-01

Pos logit contributions
 chores+0.165
 same+0.159
 urge+0.147
 temptation+0.142
 plan+0.139
Neg logit contributions
os-0.429
o-0.418
enny-0.403
ne-0.401
ark-0.397
Token Joe
Token ID5689
Feature activation-1.11e-01

Pos logit contributions
os+0.335
enny+0.332
o+0.321
ark+0.319
umping+0.317
Neg logit contributions
 chores-0.252
 same-0.247
 plan-0.230
 urge-0.227
 going-0.227
Token show
Token ID905
Feature activation+4.37e-02

Pos logit contributions
 chores+0.072
 plan+0.063
 temptation+0.062
 urge+0.061
 same+0.060
Neg logit contributions
o-0.182
os-0.178
enny-0.171
aron-0.171
umping-0.170
Token.
Token ID13
Feature activation-5.23e-02

Pos logit contributions
 chores+0.065
 same+0.062
 urge+0.058
 plan+0.058
 temptation+0.056
Neg logit contributions
os-0.127
o-0.122
enny-0.122
ne-0.119
ark-0.118
Token Can
Token ID1680
Feature activation+7.74e-02

Pos logit contributions
os+0.135
oing+0.130
umping+0.130
enny+0.130
o+0.129
Neg logit contributions
 chores-0.090
 same-0.090
 plan-0.082
 urge-0.081
 start-0.080
Token you
Token ID345
Feature activation+2.94e-02

Pos logit contributions
 chores+0.116
 same+0.112
 plan+0.104
 urge+0.102
 temptation+0.100
Neg logit contributions
os-0.225
o-0.221
enny-0.219
ark-0.213
ne-0.212
Token try
Token ID1949
Feature activation+1.06e-01

Pos logit contributions
 chores+0.036
 same+0.034
 temptation+0.031
 way+0.029
 plan+0.029
Neg logit contributions
os-0.092
o-0.091
enny-0.091
ne-0.089
ark-0.088
Token to
Token ID284
Feature activation+1.57e-01

Pos logit contributions
 chores+0.145
 same+0.141
 plan+0.127
 urge+0.124
 start+0.122
Neg logit contributions
os-0.322
o-0.315
enny-0.313
umping-0.305
ark-0.305
Token be
Token ID307
Feature activation+1.01e-01

Pos logit contributions
 chores+0.202
 same+0.187
 temptation+0.179
 urge+0.170
 plan+0.164
Neg logit contributions
os-0.466
o-0.459
ne-0.456
enny-0.453
ark-0.445
Token patient
Token ID5827
Feature activation+2.92e-02

Pos logit contributions
 chores+0.114
 same+0.105
 temptation+0.095
 urge+0.094
 plan+0.093
Neg logit contributions
os-0.329
o-0.327
enny-0.319
ark-0.312
umping-0.311
Token and
Token ID290
Feature activation+5.53e-02

Pos logit contributions
 chores+0.061
 same+0.059
 plan+0.056
 urge+0.055
 temptation+0.054
Neg logit contributions
os-0.068
o-0.066
enny-0.065
ark-0.063
umping-0.063
Token watch
Token ID2342
Feature activation+5.31e-02

Pos logit contributions
 chores+0.072
 same+0.066
 way+0.064
 urge+0.061
 bottom+0.060
Neg logit contributions
o-0.142
ne-0.134
os-0.131
oing-0.130
enny-0.129
Token a
Token ID257
Feature activation+6.00e-02

Pos logit contributions
 chores+0.062
 same+0.060
 plan+0.054
 urge+0.054
 temptation+0.052
Neg logit contributions
os-0.171
o-0.167
enny-0.165
ark-0.161
umping-0.161
Token to
Token ID284
Feature activation+1.51e-01

Pos logit contributions
 chores+0.138
 same+0.122
 temptation+0.113
 urge+0.113
 plan+0.107
Neg logit contributions
o-0.311
os-0.305
enny-0.281
ne-0.279
round-0.278
Token.
Token ID13
Feature activation-5.08e-02

Pos logit contributions
 chores+0.175
 same+0.173
 temptation+0.173
 urge+0.150
 bad+0.145
Neg logit contributions
os-0.428
o-0.425
ne-0.404
enny-0.386
round-0.384
Token I
Token ID314
Feature activation+2.29e-02

Pos logit contributions
os+0.135
enny+0.131
umping+0.129
oing+0.129
ark+0.129
Neg logit contributions
 chores-0.084
 same-0.083
 plan-0.076
 urge-0.075
 start-0.074
Token will
Token ID481
Feature activation+1.18e-01

Pos logit contributions
 chores+0.039
 same+0.038
 plan+0.035
 urge+0.035
 temptation+0.034
Neg logit contributions
os-0.062
o-0.061
enny-0.060
ark-0.058
umping-0.058
Token try
Token ID1949
Feature activation+1.05e-01

Pos logit contributions
 same+0.139
 chores+0.138
 temptation+0.120
 way+0.110
 bad+0.110
Neg logit contributions
os-0.348
o-0.347
ne-0.340
aron-0.340
enny-0.335
Token to
Token ID284
Feature activation+1.57e-01

Pos logit contributions
 chores+0.159
 same+0.153
 plan+0.140
 urge+0.138
 temptation+0.135
Neg logit contributions
os-0.305
o-0.298
enny-0.297
ark-0.288
umping-0.288
Token be
Token ID307
Feature activation+1.03e-01

Pos logit contributions
 chores+0.195
 same+0.185
 temptation+0.167
 urge+0.161
 plan+0.158
Neg logit contributions
os-0.464
o-0.459
ne-0.457
enny-0.448
ark-0.444
Token gentle
Token ID10296
Feature activation+4.19e-02

Pos logit contributions
 chores+0.129
 same+0.118
 urge+0.106
 temptation+0.106
 plan+0.106
Neg logit contributions
os-0.318
o-0.318
enny-0.305
aron-0.301
umping-0.301
Token and
Token ID290
Feature activation+5.41e-02

Pos logit contributions
 chores+0.078
 same+0.078
 going+0.071
 urge+0.070
 temptation+0.070
Neg logit contributions
os-0.106
enny-0.104
o-0.101
oing-0.099
ark-0.098
Token quick
Token ID2068
Feature activation+4.36e-02

Pos logit contributions
 chores+0.078
 same+0.069
 way+0.065
 urge+0.065
 temptation+0.064
Neg logit contributions
o-0.147
os-0.137
oing-0.136
ne-0.131
ark-0.130
Token.
Token ID13
Feature activation-5.84e-02

Pos logit contributions
 chores+0.093
 same+0.090
 plan+0.087
 urge+0.086
 temptation+0.084
Neg logit contributions
os-0.097
o-0.095
enny-0.092
umping-0.090
ark-0.089
Token because
Token ID780
Feature activation+1.24e-02

Pos logit contributions
 chores+0.078
 same+0.076
 plan+0.071
 urge+0.071
 temptation+0.069
Neg logit contributions
os-0.108
o-0.105
enny-0.104
ark-0.101
umping-0.101
Token she
Token ID673
Feature activation-1.56e-02

Pos logit contributions
 same+0.020
 chores+0.019
 urge+0.018
 temptation+0.018
 plan+0.018
Neg logit contributions
os-0.034
o-0.034
enny-0.033
ne-0.031
ark-0.031
Token had
Token ID550
Feature activation+8.77e-02

Pos logit contributions
os+0.041
enny+0.039
oing+0.039
o+0.039
umping+0.039
Neg logit contributions
 chores-0.028
 same-0.027
 plan-0.026
 urge-0.026
 bad-0.025
Token promised
Token ID8072
Feature activation+7.98e-02

Pos logit contributions
 chores+0.148
 temptation+0.140
 urge+0.138
 same+0.137
 plan+0.134
Neg logit contributions
o-0.226
os-0.225
enny-0.215
ne-0.209
umping-0.207
Token not
Token ID407
Feature activation+1.18e-01

Pos logit contributions
 same+0.126
 chores+0.126
 plan+0.119
 urge+0.119
 temptation+0.112
Neg logit contributions
os-0.218
o-0.217
enny-0.213
ark-0.206
ne-0.204
Token to
Token ID284
Feature activation+1.57e-01

Pos logit contributions
 chores+0.233
 same+0.225
 urge+0.212
 plan+0.210
 temptation+0.208
Neg logit contributions
o-0.286
os-0.276
enny-0.263
ne-0.259
ark-0.259
Token use
Token ID779
Feature activation+1.11e-01

Pos logit contributions
 chores+0.191
 same+0.180
 plan+0.160
 urge+0.155
 temptation+0.154
Neg logit contributions
os-0.493
o-0.491
enny-0.478
ark-0.473
ne-0.471
Token it
Token ID340
Feature activation-2.59e-02

Pos logit contributions
 plan+0.122
 urge+0.119
 same+0.119
 chores+0.111
 temptation+0.102
Neg logit contributions
os-0.338
o-0.334
enny-0.323
ne-0.316
aron-0.307
Token.
Token ID13
Feature activation-7.22e-02

Pos logit contributions
os+0.061
o+0.060
enny+0.060
ark+0.057
ne+0.057
Neg logit contributions
 chores-0.051
 same-0.050
 plan-0.048
 urge-0.048
 temptation-0.047
Token\n
Token ID198
Feature activation-7.62e-02

Pos logit contributions
os+0.194
oing+0.187
umping+0.186
ark+0.185
ne+0.182
Neg logit contributions
 chores-0.117
 same-0.116
 way-0.107
 plan-0.106
 urge-0.105
Token\n
Token ID198
Feature activation-7.60e-02

Pos logit contributions
os+0.204
umping+0.199
oing+0.197
aron+0.191
ist+0.189
Neg logit contributions
 chores-0.125
 same-0.117
 plan-0.113
 other-0.109
 bad-0.109
Token very
Token ID845
Feature activation+6.63e-02

Pos logit contributions
 chores+0.174
 temptation+0.148
 same+0.144
 plan+0.139
 urge+0.138
Neg logit contributions
o-0.330
os-0.319
umping-0.298
aron-0.292
enny-0.292
Token determined
Token ID5295
Feature activation+1.01e-01

Pos logit contributions
 chores+0.100
 same+0.085
 way+0.082
 start+0.080
 temptation+0.078
Neg logit contributions
o-0.182
os-0.176
as-0.169
enny-0.165
ne-0.165
Token and
Token ID290
Feature activation+5.62e-02

Pos logit contributions
 chores+0.204
 same+0.199
 urge+0.189
 plan+0.186
 temptation+0.186
Neg logit contributions
os-0.229
o-0.226
enny-0.218
ne-0.216
ark-0.214
Token worked
Token ID3111
Feature activation+8.62e-02

Pos logit contributions
 chores+0.101
 same+0.096
 temptation+0.090
 urge+0.087
 plan+0.084
Neg logit contributions
os-0.142
o-0.141
enny-0.131
ne-0.131
ark-0.125
Token hard
Token ID1327
Feature activation+8.06e-02

Pos logit contributions
 chores+0.114
 same+0.113
 urge+0.111
 plan+0.105
 temptation+0.103
Neg logit contributions
os-0.246
enny-0.243
o-0.242
aron-0.237
ne-0.235
Token to
Token ID284
Feature activation+1.57e-01

Pos logit contributions
 same+0.152
 chores+0.151
 urge+0.145
 plan+0.144
 temptation+0.140
Neg logit contributions
os-0.182
enny-0.179
ne-0.174
umping-0.173
o-0.173
Token try
Token ID1949
Feature activation+1.14e-01

Pos logit contributions
 same+0.193
 chores+0.183
 temptation+0.174
 urge+0.168
 sorry+0.158
Neg logit contributions
os-0.431
ne-0.425
enny-0.418
o-0.413
aron-0.401
Token and
Token ID290
Feature activation+7.01e-02

Pos logit contributions
 same+0.177
 chores+0.174
 urge+0.171
 plan+0.162
 temptation+0.161
Neg logit contributions
os-0.309
enny-0.301
o-0.300
umping-0.292
ark-0.289
Token make
Token ID787
Feature activation+1.39e-01

Pos logit contributions
 chores+0.092
 same+0.090
 temptation+0.082
 urge+0.081
 bad+0.074
Neg logit contributions
os-0.190
enny-0.189
o-0.187
ne-0.186
and-0.171
Token the
Token ID262
Feature activation+7.12e-02

Pos logit contributions
 urge+0.166
 same+0.162
 chores+0.153
 temptation+0.146
 plan+0.142
Neg logit contributions
os-0.404
enny-0.392
o-0.390
umping-0.379
aron-0.374
Token war
Token ID1175
Feature activation-8.53e-02

Pos logit contributions
 chores+0.073
 same+0.066
 going+0.066
 urge+0.065
 temptation+0.063
Neg logit contributions
os-0.203
o-0.202
ne-0.200
enny-0.197
umping-0.194
Token,
Token ID11
Feature activation+4.07e-02

Pos logit contributions
 chores+0.070
 same+0.068
 plan+0.063
 urge+0.062
 temptation+0.061
Neg logit contributions
os-0.120
o-0.118
enny-0.116
ark-0.113
umping-0.112
Token "
Token ID366
Feature activation-1.63e-02

Pos logit contributions
 chores+0.060
 same+0.058
 plan+0.053
 urge+0.053
 temptation+0.051
Neg logit contributions
os-0.119
o-0.116
enny-0.116
ark-0.112
umping-0.112
TokenI
Token ID40
Feature activation+3.04e-02

Pos logit contributions
os+0.044
enny+0.044
umping+0.042
aron+0.041
ouse+0.040
Neg logit contributions
 same-0.026
 temptation-0.024
 rain-0.022
 urge-0.022
 chores-0.021
Token will
Token ID481
Feature activation+1.19e-01

Pos logit contributions
 chores+0.054
 same+0.053
 plan+0.051
 urge+0.050
 start+0.048
Neg logit contributions
os-0.079
enny-0.078
o-0.077
ark-0.076
umping-0.076
Token try
Token ID1949
Feature activation+1.04e-01

Pos logit contributions
 chores+0.145
 same+0.136
 temptation+0.119
 plan+0.112
 bad+0.111
Neg logit contributions
os-0.368
o-0.366
ne-0.354
ark-0.351
oing-0.351
Token to
Token ID284
Feature activation+1.56e-01

Pos logit contributions
 chores+0.170
 same+0.165
 plan+0.151
 urge+0.148
 start+0.146
Neg logit contributions
os-0.290
enny-0.282
o-0.282
ark-0.273
umping-0.273
Token help
Token ID1037
Feature activation+1.21e-01

Pos logit contributions
 chores+0.212
 same+0.194
 temptation+0.178
 urge+0.169
 plan+0.169
Neg logit contributions
os-0.461
o-0.453
ne-0.449
enny-0.436
ark-0.436
Token you
Token ID345
Feature activation+1.94e-02

Pos logit contributions
 chores+0.154
 same+0.148
 plan+0.134
 urge+0.132
 temptation+0.128
Neg logit contributions
os-0.383
o-0.375
enny-0.372
ark-0.363
umping-0.362
Token."
Token ID526
Feature activation-8.03e-02

Pos logit contributions
 chores+0.040
 same+0.039
 urge+0.037
 plan+0.036
 temptation+0.036
Neg logit contributions
os-0.044
o-0.043
enny-0.043
ark-0.042
oing-0.041
Token The
Token ID383
Feature activation+3.03e-02

Pos logit contributions
ark+0.223
umping+0.222
os+0.222
oing+0.216
enny+0.216
Neg logit contributions
 chores-0.110
 same-0.108
 her-0.106
 way-0.103
 start-0.100
Token bird
Token ID6512
Feature activation-1.70e-01

Pos logit contributions
 chores+0.063
 same+0.057
 temptation+0.055
 urge+0.055
 going+0.055
Neg logit contributions
o-0.068
os-0.066
ne-0.065
itty-0.064
enny-0.062
Token.
Token ID13
Feature activation-7.53e-02

Pos logit contributions
 chores+0.017
 same+0.016
 urge+0.015
 plan+0.015
 temptation+0.015
Neg logit contributions
os-0.022
o-0.021
enny-0.021
umping-0.021
ark-0.021
Token\n
Token ID198
Feature activation-8.25e-02

Pos logit contributions
os+0.211
ark+0.200
enny+0.200
o+0.199
umping+0.199
Neg logit contributions
 same-0.116
 chores-0.113
 bad-0.106
 plan-0.105
 going-0.104
Token\n
Token ID198
Feature activation-8.25e-02

Pos logit contributions
os+0.231
umping+0.226
round+0.217
enny+0.217
oing+0.216
Neg logit contributions
 chores-0.125
 same-0.124
 bad-0.119
 other-0.118
 good-0.114
TokenThe
Token ID464
Feature activation+2.67e-02

Pos logit contributions
os+0.210
umping+0.194
enny+0.187
o+0.186
jo+0.185
Neg logit contributions
 same-0.156
 chores-0.153
 good-0.148
 bad-0.147
 other-0.147
Token great
Token ID1049
Feature activation+1.42e-02

Pos logit contributions
 chores+0.049
 same+0.046
 going+0.043
 urge+0.043
 temptation+0.043
Neg logit contributions
os-0.067
o-0.066
enny-0.065
ne-0.064
ark-0.063
Token Zen
Token ID14760
Feature activation-2.44e-01

Pos logit contributions
 chores+0.026
 same+0.024
 temptation+0.023
 going+0.023
 urge+0.022
Neg logit contributions
os-0.036
o-0.035
enny-0.035
ne-0.034
umping-0.033
Tokeno
Token ID78
Feature activation-1.23e-01

Pos logit contributions
umping+0.329
enny+0.294
ud+0.291
ark+0.283
umbs+0.277
Neg logit contributions
 same-0.684
 plan-0.681
 start-0.678
 first-0.675
 not-0.672
Token hugs
Token ID40657
Feature activation+4.14e-03

Pos logit contributions
enny+0.339
ark+0.333
os+0.327
oing+0.324
o+0.320
Neg logit contributions
 same-0.198
 start-0.191
 chores-0.188
 work-0.185
 plan-0.185
Token Ben
Token ID3932
Feature activation-1.04e-01

Pos logit contributions
 same+0.004
 chores+0.004
 start+0.003
 temptation+0.003
 her+0.003
Neg logit contributions
os-0.014
o-0.014
enny-0.014
oing-0.014
ark-0.014
Token and
Token ID290
Feature activation+2.41e-02

Pos logit contributions
os+0.288
enny+0.286
ark+0.279
lic+0.276
umping+0.273
Neg logit contributions
 chores-0.166
 same-0.163
 start-0.152
 way-0.152
 her-0.150
Token says
Token ID1139
Feature activation+2.82e-02

Pos logit contributions
 chores+0.048
 same+0.045
 temptation+0.043
 plan+0.043
 urge+0.043
Neg logit contributions
os-0.056
o-0.056
enny-0.053
ne-0.052
aron-0.051
Token saw
Token ID2497
Feature activation+7.09e-02

Pos logit contributions
os+0.185
o+0.184
enny+0.180
umping+0.174
ne+0.169
Neg logit contributions
 chores-0.153
 same-0.146
 temptation-0.142
 plan-0.137
 urge-0.136
Token a
Token ID257
Feature activation+3.29e-02

Pos logit contributions
 temptation+0.086
 chores+0.086
 urge+0.086
 plan+0.084
 same+0.078
Neg logit contributions
o-0.217
os-0.212
enny-0.203
umping-0.199
aron-0.195
Token big
Token ID1263
Feature activation+5.18e-03

Pos logit contributions
 chores+0.037
 same+0.033
 temptation+0.033
 urge+0.031
 going+0.028
Neg logit contributions
o-0.102
enny-0.101
os-0.100
oing-0.099
ne-0.098
Token box
Token ID3091
Feature activation-1.24e-01

Pos logit contributions
 chores+0.009
 same+0.008
 temptation+0.008
 urge+0.007
 plan+0.007
Neg logit contributions
o-0.013
os-0.013
umping-0.013
round-0.013
enny-0.012
Token of
Token ID286
Feature activation+9.60e-02

Pos logit contributions
os+0.280
o+0.275
enny+0.269
umping+0.262
ark+0.260
Neg logit contributions
 chores-0.266
 same-0.257
 plan-0.243
 temptation-0.240
 urge-0.239
Token Leg
Token ID3564
Feature activation-2.43e-01

Pos logit contributions
 chores+0.137
 same+0.136
 urge+0.134
 plan+0.132
 temptation+0.131
Neg logit contributions
os-0.247
o-0.243
enny-0.237
aron-0.234
itty-0.232
Tokenos
Token ID418
Feature activation-1.21e-01

Pos logit contributions
aron+0.300
pillar+0.288
oing+0.278
ark+0.278
enny+0.266
Neg logit contributions
 going-0.726
 start-0.683
 work-0.675
 plan-0.669
 first-0.659
Token that
Token ID326
Feature activation+4.43e-02

Pos logit contributions
enny+0.298
os+0.294
o+0.288
ark+0.285
oing+0.279
Neg logit contributions
 chores-0.239
 same-0.226
 going-0.218
 start-0.214
 work-0.213
Token he
Token ID339
Feature activation-1.66e-02

Pos logit contributions
 chores+0.089
 temptation+0.084
 same+0.083
 plan+0.080
 urge+0.080
Neg logit contributions
os-0.108
o-0.107
enny-0.098
ne-0.094
ark-0.092
Token really
Token ID1107
Feature activation+4.89e-02

Pos logit contributions
os+0.044
enny+0.042
oing+0.042
o+0.042
umping+0.042
Neg logit contributions
 chores-0.029
 same-0.029
 plan-0.027
 urge-0.026
 going-0.026
Token wanted
Token ID2227
Feature activation+1.00e-01

Pos logit contributions
 chores+0.089
 same+0.088
 plan+0.083
 urge+0.083
 start+0.080
Neg logit contributions
os-0.126
enny-0.123
o-0.122
umping-0.119
ark-0.119
Token arms
Token ID5101
Feature activation-7.34e-02

Pos logit contributions
 chores+0.093
 same+0.088
 temptation+0.082
 going+0.081
 plan+0.081
Neg logit contributions
os-0.147
o-0.145
enny-0.140
ne-0.139
oing-0.138
Token.
Token ID13
Feature activation-8.22e-02

Pos logit contributions
os+0.182
o+0.178
enny+0.175
umping+0.170
ark+0.170
Neg logit contributions
 chores-0.142
 same-0.137
 plan-0.129
 urge-0.127
 temptation-0.126
Token The
Token ID383
Feature activation+2.13e-02

Pos logit contributions
os+0.211
o+0.207
enny+0.203
umping+0.198
aron+0.196
Neg logit contributions
 chores-0.152
 same-0.145
 plan-0.138
 urge-0.135
 temptation-0.133
Token se
Token ID384
Feature activation-1.06e-01

Pos logit contributions
 chores+0.036
 same+0.033
 plan+0.031
 urge+0.031
 temptation+0.031
Neg logit contributions
o-0.056
os-0.055
ne-0.054
enny-0.054
oing-0.054
Tokenag
Token ID363
Feature activation-8.14e-02

Pos logit contributions
enny+0.235
os+0.233
o+0.220
umping+0.219
oing+0.219
Neg logit contributions
 chores-0.215
 same-0.214
 plan-0.208
 urge-0.208
 temptation-0.206
Tokenull
Token ID724
Feature activation-2.36e-01

Pos logit contributions
enny+0.149
aron+0.141
round+0.137
os+0.135
ark+0.133
Neg logit contributions
 same-0.197
 plan-0.195
 urge-0.189
 way-0.186
 idea-0.181
Token flew
Token ID13112
Feature activation+4.25e-02

Pos logit contributions
os+0.650
enny+0.649
o+0.632
ark+0.623
umping+0.620
Neg logit contributions
 chores-0.377
 same-0.374
 plan-0.352
 urge-0.352
 start-0.343
Token away
Token ID1497
Feature activation+5.22e-02

Pos logit contributions
 chores+0.074
 same+0.067
 plan+0.065
 urge+0.063
 temptation+0.063
Neg logit contributions
os-0.117
o-0.116
enny-0.111
ark-0.110
umping-0.109
Token.
Token ID13
Feature activation-8.30e-02

Pos logit contributions
 chores+0.116
 same+0.109
 plan+0.103
 temptation+0.103
 urge+0.102
Neg logit contributions
os-0.116
o-0.114
enny-0.109
oing-0.108
ark-0.108
Token Tommy
Token ID19919
Feature activation-1.16e-01

Pos logit contributions
os+0.212
o+0.208
enny+0.205
umping+0.199
aron+0.198
Neg logit contributions
 chores-0.154
 same-0.146
 plan-0.141
 urge-0.137
 temptation-0.134
Token laughed
Token ID13818
Feature activation+1.32e-02

Pos logit contributions
enny+0.328
os+0.328
ark+0.317
oing+0.313
umping+0.309
Neg logit contributions
 same-0.170
 way-0.168
 start-0.167
 plan-0.163
 urge-0.161
Token\n
Token ID198
Feature activation-9.48e-02

Pos logit contributions
os+0.256
ark+0.247
o+0.246
umping+0.244
enny+0.243
Neg logit contributions
 same-0.139
 chores-0.135
 plan-0.123
 bad-0.123
 urge-0.122
Token\n
Token ID198
Feature activation-9.46e-02

Pos logit contributions
os+0.261
umping+0.260
oing+0.253
round+0.251
aron+0.247
Neg logit contributions
 same-0.145
 chores-0.142
 bad-0.136
 plan-0.136
 other-0.131
TokenThe
Token ID464
Feature activation+2.14e-02

Pos logit contributions
os+0.160
o+0.153
enny+0.150
umping+0.144
ark+0.143
Neg logit contributions
 chores-0.261
 same-0.257
 plan-0.245
 urge-0.243
 temptation-0.241
Token se
Token ID384
Feature activation-1.05e-01

Pos logit contributions
 chores+0.048
 same+0.041
 temptation+0.041
 start+0.040
 plan+0.039
Neg logit contributions
o-0.044
ne-0.043
enny-0.042
os-0.041
oing-0.041
Tokenag
Token ID363
Feature activation-7.93e-02

Pos logit contributions
os+0.230
enny+0.220
o+0.218
umping+0.214
oing+0.209
Neg logit contributions
 chores-0.227
 same-0.226
 urge-0.217
 plan-0.217
 temptation-0.214
Tokenull
Token ID724
Feature activation-2.36e-01

Pos logit contributions
os+0.131
aron+0.129
enny+0.129
round+0.122
umping+0.122
Neg logit contributions
 same-0.209
 plan-0.203
 urge-0.200
 way-0.196
 bad-0.192
Token tried
Token ID3088
Feature activation+9.28e-02

Pos logit contributions
os+0.637
enny+0.633
ark+0.609
o+0.607
umping+0.607
Neg logit contributions
 chores-0.389
 same-0.386
 plan-0.367
 urge-0.367
 bad-0.359
Token to
Token ID284
Feature activation+1.25e-01

Pos logit contributions
 chores+0.153
 same+0.148
 plan+0.137
 urge+0.136
 temptation+0.133
Neg logit contributions
os-0.258
o-0.252
enny-0.250
ark-0.243
umping-0.243
Token take
Token ID1011
Feature activation+8.93e-02

Pos logit contributions
 chores+0.211
 same+0.182
 temptation+0.168
 way+0.165
 going+0.165
Neg logit contributions
o-0.332
os-0.329
ne-0.321
ark-0.319
oing-0.312
Token the
Token ID262
Feature activation+6.92e-02

Pos logit contributions
 chores+0.099
 same+0.087
 plan+0.086
 urge+0.082
 temptation+0.078
Neg logit contributions
os-0.292
o-0.289
enny-0.284
ne-0.277
umping-0.273
Token oy
Token ID35104
Feature activation-1.51e-01

Pos logit contributions
 chores+0.161
 temptation+0.135
 plan+0.131
 start+0.128
 going+0.127
Neg logit contributions
o-0.137
ne-0.132
oing-0.126
os-0.125
ark-0.123
Token park
Token ID3952
Feature activation-6.60e-02

Pos logit contributions
 chores+0.090
 temptation+0.080
 urge+0.080
 same+0.079
 plan+0.078
Neg logit contributions
o-0.166
os-0.162
ne-0.161
enny-0.159
itty-0.156
Token.
Token ID13
Feature activation-7.25e-02

Pos logit contributions
os+0.155
o+0.151
enny+0.149
ark+0.144
umping+0.144
Neg logit contributions
 chores-0.135
 same-0.132
 plan-0.124
 urge-0.123
 temptation-0.122
Token\n
Token ID198
Feature activation-7.52e-02

Pos logit contributions
os+0.197
umping+0.192
ark+0.189
enny+0.189
oing+0.189
Neg logit contributions
 same-0.116
 chores-0.115
 plan-0.104
 urge-0.102
 bad-0.101
Token\n
Token ID198
Feature activation-7.36e-02

Pos logit contributions
os+0.200
umping+0.199
oing+0.193
aron+0.192
enny+0.192
Neg logit contributions
 chores-0.123
 same-0.117
 plan-0.114
 bad-0.111
 other-0.110
TokenD
Token ID35
Feature activation-8.22e-02

Pos logit contributions
os+0.174
umping+0.167
aron+0.165
enny+0.163
o+0.159
Neg logit contributions
 chores-0.150
 same-0.142
 plan-0.137
 bad-0.133
 start-0.132
Tokenull
Token ID724
Feature activation-2.36e-01

Pos logit contributions
aron+0.127
oing+0.126
os+0.126
round+0.124
o+0.120
Neg logit contributions
 same-0.226
 chores-0.216
 work-0.212
 start-0.204
 plan-0.204
Token had
Token ID550
Feature activation+8.22e-02

Pos logit contributions
enny+0.589
os+0.562
umping+0.547
ark+0.541
round+0.535
Neg logit contributions
 same-0.426
 chores-0.411
 plan-0.409
 urge-0.408
 way-0.403
Token an
Token ID281
Feature activation+5.59e-02

Pos logit contributions
 chores+0.113
 temptation+0.113
 urge+0.109
 same+0.107
 plan+0.106
Neg logit contributions
o-0.234
os-0.228
umping-0.209
ne-0.208
enny-0.203
Token idea
Token ID2126
Feature activation-2.08e-02

Pos logit contributions
 same+0.051
 chores+0.047
 start+0.043
 temptation+0.042
 way+0.041
Neg logit contributions
aron-0.164
round-0.160
o-0.160
umping-0.159
os-0.154
Token.
Token ID13
Feature activation-7.80e-02

Pos logit contributions
os+0.053
enny+0.053
o+0.052
oing+0.051
umping+0.051
Neg logit contributions
 chores-0.037
 same-0.036
 going-0.035
 plan-0.035
 urge-0.034
Token He
Token ID679
Feature activation-1.70e-02

Pos logit contributions
os+0.198
o+0.192
enny+0.190
ark+0.185
umping+0.185
Neg logit contributions
 chores-0.145
 same-0.143
 plan-0.133
 urge-0.132
 temptation-0.130
Token coal
Token ID5655
Feature activation-1.43e-01

Pos logit contributions
 chores+0.085
 her+0.072
 temptation+0.071
 start+0.069
 going+0.069
Neg logit contributions
o-0.108
umping-0.101
ne-0.101
os-0.100
oing-0.100
Token.
Token ID13
Feature activation-9.79e-02

Pos logit contributions
os+0.363
o+0.357
umping+0.339
enny+0.338
oing+0.338
Neg logit contributions
 chores-0.266
 same-0.251
 temptation-0.234
 plan-0.234
 urge-0.232
Token The
Token ID383
Feature activation+1.26e-02

Pos logit contributions
os+0.265
enny+0.258
ark+0.257
o+0.254
umping+0.252
Neg logit contributions
 same-0.160
 chores-0.152
 urge-0.140
 plan-0.140
 bad-0.140
Token se
Token ID384
Feature activation-1.08e-01

Pos logit contributions
 chores+0.028
 her+0.025
 going+0.024
 temptation+0.024
 start+0.023
Neg logit contributions
o-0.024
ne-0.022
oing-0.022
pillar-0.022
os-0.022
Tokenag
Token ID363
Feature activation-8.15e-02

Pos logit contributions
os+0.222
enny+0.219
o+0.212
umping+0.205
ne+0.203
Neg logit contributions
 chores-0.241
 same-0.240
 plan-0.233
 urge-0.232
 temptation-0.231
Tokenull
Token ID724
Feature activation-2.36e-01

Pos logit contributions
enny+0.134
os+0.128
aron+0.123
round+0.122
umping+0.121
Neg logit contributions
 same-0.216
 plan-0.213
 urge-0.207
 way-0.204
 chores-0.202
Token grabbed
Token ID13176
Feature activation+8.35e-02

Pos logit contributions
enny+0.645
os+0.616
ark+0.603
umping+0.600
round+0.592
Neg logit contributions
 same-0.363
 urge-0.363
 start-0.361
 bad-0.360
 plan-0.358
Token the
Token ID262
Feature activation+6.02e-02

Pos logit contributions
 chores+0.122
 plan+0.104
 same+0.102
 way+0.097
 urge+0.094
Neg logit contributions
os-0.242
o-0.242
ne-0.221
enny-0.221
iled-0.220
Token coal
Token ID5655
Feature activation-1.40e-01

Pos logit contributions
 chores+0.117
 going+0.100
 same+0.099
 temptation+0.099
 her+0.097
Neg logit contributions
o-0.146
os-0.137
oing-0.135
enny-0.133
ne-0.132
Token with
Token ID351
Feature activation+1.02e-01

Pos logit contributions
os+0.303
o+0.297
enny+0.281
oing+0.278
umping+0.278
Neg logit contributions
 chores-0.316
 same-0.303
 temptation-0.287
 plan-0.287
 urge-0.284
Token its
Token ID663
Feature activation+4.33e-02

Pos logit contributions
 chores+0.194
 same+0.169
 task+0.162
 laundry+0.160
 plan+0.157
Neg logit contributions
os-0.249
o-0.242
iled-0.224
oing-0.223
ez-0.223
Token\n
Token ID198
Feature activation-7.47e-02

Pos logit contributions
os+0.191
umping+0.188
ark+0.185
enny+0.184
oing+0.183
Neg logit contributions
 chores-0.139
 same-0.135
 plan-0.128
 urge-0.126
 start-0.124
Token\n
Token ID198
Feature activation-7.33e-02

Pos logit contributions
os+0.199
umping+0.198
oing+0.190
round+0.188
enny+0.187
Neg logit contributions
 chores-0.121
 plan-0.115
 same-0.114
 other-0.109
 her-0.108
TokenThe
Token ID464
Feature activation+3.02e-02

Pos logit contributions
os+0.158
o+0.149
umping+0.149
enny+0.146
oing+0.144
Neg logit contributions
 chores-0.166
 plan-0.157
 same-0.157
 temptation-0.149
 start-0.147
Token se
Token ID384
Feature activation-9.71e-02

Pos logit contributions
 chores+0.047
 same+0.040
 temptation+0.039
 going+0.039
 urge+0.038
Neg logit contributions
ne-0.083
os-0.081
enny-0.081
o-0.081
ark-0.080
Tokenag
Token ID363
Feature activation-7.92e-02

Pos logit contributions
os+0.190
enny+0.185
o+0.180
oing+0.179
umping+0.178
Neg logit contributions
 chores-0.225
 same-0.223
 plan-0.220
 idea-0.216
 urge-0.214
Tokenull
Token ID724
Feature activation-2.35e-01

Pos logit contributions
enny+0.139
round+0.132
os+0.130
aron+0.129
umping+0.128
Neg logit contributions
 same-0.198
 plan-0.195
 urge-0.185
 way-0.185
 idea-0.182
Tokens
Token ID82
Feature activation+5.26e-02

Pos logit contributions
enny+0.616
ark+0.583
os+0.581
umping+0.570
o+0.562
Neg logit contributions
 same-0.403
 chores-0.401
 plan-0.397
 urge-0.389
 start-0.383
Token and
Token ID290
Feature activation+2.80e-02

Pos logit contributions
 chores+0.100
 same+0.095
 temptation+0.089
 plan+0.088
 urge+0.086
Neg logit contributions
os-0.131
o-0.128
aron-0.123
umping-0.121
oing-0.120
Token the
Token ID262
Feature activation+5.39e-02

Pos logit contributions
 chores+0.043
 same+0.041
 temptation+0.040
 urge+0.037
 plan+0.036
Neg logit contributions
os-0.080
o-0.078
ne-0.075
enny-0.073
oing-0.071
Token fish
Token ID5916
Feature activation-1.91e-01

Pos logit contributions
 chores+0.083
 same+0.072
 urge+0.069
 temptation+0.069
 plan+0.069
Neg logit contributions
o-0.148
os-0.148
ne-0.146
enny-0.144
ark-0.141
Token agreed
Token ID4987
Feature activation+6.70e-02

Pos logit contributions
enny+0.463
os+0.462
o+0.452
ark+0.445
umping+0.442
Neg logit contributions
 chores-0.364
 same-0.358
 plan-0.343
 urge-0.336
 start-0.330
Token nearby
Token ID6716
Feature activation+1.92e-02

Pos logit contributions
enny+0.559
os+0.549
o+0.532
ark+0.531
umping+0.523
Neg logit contributions
 same-0.401
 chores-0.395
 start-0.383
 going-0.380
 urge-0.379
Token.
Token ID13
Feature activation-8.69e-02

Pos logit contributions
 chores+0.041
 same+0.040
 plan+0.037
 urge+0.037
 temptation+0.037
Neg logit contributions
os-0.044
o-0.042
enny-0.042
umping-0.041
ark-0.040
Token The
Token ID383
Feature activation+2.18e-02

Pos logit contributions
os+0.221
o+0.215
enny+0.214
umping+0.207
ark+0.206
Neg logit contributions
 chores-0.164
 same-0.158
 plan-0.149
 urge-0.147
 temptation-0.144
Token se
Token ID384
Feature activation-1.02e-01

Pos logit contributions
 chores+0.033
 temptation+0.026
 same+0.026
 going+0.026
 urge+0.025
Neg logit contributions
ne-0.059
o-0.059
os-0.058
enny-0.057
round-0.055
Tokenag
Token ID363
Feature activation-7.92e-02

Pos logit contributions
os+0.208
enny+0.206
o+0.199
oing+0.196
umping+0.192
Neg logit contributions
 same-0.220
 chores-0.218
 idea-0.214
 urge-0.214
 plan-0.213
Tokenull
Token ID724
Feature activation-2.35e-01

Pos logit contributions
enny+0.138
aron+0.130
os+0.129
round+0.127
ark+0.126
Neg logit contributions
 same-0.196
 plan-0.193
 urge-0.187
 way-0.184
 start-0.181
Token swoop
Token ID38527
Feature activation-5.49e-02

Pos logit contributions
enny+0.653
os+0.621
ark+0.618
umping+0.606
aron+0.600
Neg logit contributions
 same-0.367
 start-0.356
 urge-0.355
 plan-0.354
 chores-0.348
Tokened
Token ID276
Feature activation+7.02e-02

Pos logit contributions
enny+0.069
os+0.068
aron+0.067
ne+0.064
o+0.062
Neg logit contributions
 same-0.164
 start-0.160
 going-0.158
 chores-0.155
 urge-0.151
Token down
Token ID866
Feature activation+5.35e-02

Pos logit contributions
 chores+0.129
 same+0.118
 plan+0.115
 temptation+0.112
 urge+0.111
Neg logit contributions
os-0.185
o-0.181
enny-0.172
umping-0.172
oing-0.171
Token to
Token ID284
Feature activation+1.10e-01

Pos logit contributions
 chores+0.133
 same+0.125
 plan+0.122
 temptation+0.121
 urge+0.120
Neg logit contributions
os-0.107
o-0.104
enny-0.098
umping-0.097
oing-0.096
Token take
Token ID1011
Feature activation+9.32e-02

Pos logit contributions
 chores+0.141
 same+0.120
 plan+0.110
 temptation+0.105
 urge+0.104
Neg logit contributions
os-0.348
o-0.341
enny-0.334
ne-0.330
ark-0.327
Token away
Token ID1497
Feature activation+4.68e-02

Pos logit contributions
 chores+0.056
 same+0.056
 start+0.050
 bad+0.050
 urge+0.049
Neg logit contributions
os-0.119
enny-0.118
o-0.117
oing-0.115
ark-0.115
Token all
Token ID477
Feature activation+1.26e-02

Pos logit contributions
 chores+0.091
 same+0.089
 plan+0.083
 urge+0.082
 start+0.081
Neg logit contributions
os-0.116
o-0.113
enny-0.113
umping-0.109
ark-0.109
Token the
Token ID262
Feature activation+5.18e-02

Pos logit contributions
 chores+0.021
 same+0.020
 plan+0.018
 urge+0.018
 temptation+0.018
Neg logit contributions
os-0.035
o-0.034
ne-0.032
enny-0.032
umping-0.032
Token se
Token ID384
Feature activation-9.19e-02

Pos logit contributions
 chores+0.057
 same+0.049
 urge+0.046
 start+0.046
 going+0.046
Neg logit contributions
ne-0.156
o-0.149
os-0.148
enny-0.148
ark-0.145
Tokenag
Token ID363
Feature activation-7.87e-02

Pos logit contributions
os+0.202
enny+0.199
o+0.192
umping+0.190
oing+0.190
Neg logit contributions
 chores-0.192
 same-0.189
 temptation-0.182
 plan-0.181
 urge-0.179
Tokenull
Token ID724
Feature activation-2.34e-01

Pos logit contributions
enny+0.132
aron+0.125
round+0.124
os+0.122
umping+0.121
Neg logit contributions
 same-0.198
 plan-0.195
 chores-0.194
 way-0.190
 other-0.188
Tokens
Token ID82
Feature activation+5.13e-02

Pos logit contributions
enny+0.431
os+0.415
ark+0.401
o+0.399
umping+0.396
Neg logit contributions
 chores-0.582
 same-0.581
 way-0.559
 start-0.555
 urge-0.555
Token.
Token ID13
Feature activation-7.80e-02

Pos logit contributions
 chores+0.105
 same+0.102
 plan+0.096
 urge+0.095
 temptation+0.094
Neg logit contributions
os-0.121
o-0.118
enny-0.117
ark-0.113
umping-0.113
Token\n
Token ID198
Feature activation-7.78e-02

Pos logit contributions
os+0.211
o+0.203
enny+0.203
umping+0.201
ark+0.201
Neg logit contributions
 same-0.131
 chores-0.130
 way-0.116
 plan-0.116
 start-0.115
Token\n
Token ID198
Feature activation-7.73e-02

Pos logit contributions
os+0.203
umping+0.202
oing+0.196
enny+0.194
aron+0.193
Neg logit contributions
 chores-0.132
 same-0.126
 plan-0.122
 other-0.119
 bad-0.118
TokenSuddenly
Token ID38582
Feature activation+1.11e-02

Pos logit contributions
os+0.195
umping+0.184
enny+0.183
o+0.182
aron+0.179
Neg logit contributions
 chores-0.148
 same-0.141
 plan-0.135
 start-0.132
 bad-0.131
Token,
Token ID11
Feature activation+9.74e-03

Pos logit contributions
os+0.323
enny+0.313
ark+0.312
oing+0.309
umping+0.309
Neg logit contributions
 chores-0.153
 same-0.153
 plan-0.143
 urge-0.142
 bad-0.140
Token Sally
Token ID25737
Feature activation-1.27e-01

Pos logit contributions
 chores+0.011
 same+0.010
 plan+0.009
 urge+0.009
 going+0.009
Neg logit contributions
os-0.032
enny-0.031
o-0.031
ark-0.031
umping-0.031
Token the
Token ID262
Feature activation+4.21e-02

Pos logit contributions
os+0.359
o+0.350
enny+0.349
ark+0.340
umping+0.339
Neg logit contributions
 chores-0.201
 same-0.195
 plan-0.181
 urge-0.178
 start-0.174
Token Se
Token ID1001
Feature activation-6.12e-02

Pos logit contributions
 chores+0.055
 her+0.053
 avail+0.052
 temptation+0.051
 urge+0.044
Neg logit contributions
o-0.110
os-0.106
ne-0.103
oing-0.100
as-0.100
Tokenag
Token ID363
Feature activation-7.50e-02

Pos logit contributions
os+0.123
o+0.119
enny+0.117
oing+0.114
umping+0.114
Neg logit contributions
 chores-0.147
 same-0.147
 plan-0.136
 urge-0.136
 start-0.135
Tokenull
Token ID724
Feature activation-2.33e-01

Pos logit contributions
enny+0.116
os+0.114
aron+0.112
o+0.109
oing+0.107
Neg logit contributions
 same-0.206
 plan-0.200
 chores-0.199
 urge-0.196
 way-0.193
Token,
Token ID11
Feature activation+1.85e-02

Pos logit contributions
os+0.619
enny+0.607
o+0.598
ark+0.587
umping+0.585
Neg logit contributions
 chores-0.404
 same-0.396
 plan-0.371
 urge-0.369
 start-0.360
Token heard
Token ID2982
Feature activation+6.24e-02

Pos logit contributions
 chores+0.041
 same+0.039
 plan+0.037
 temptation+0.037
 urge+0.036
Neg logit contributions
os-0.041
o-0.041
enny-0.038
umping-0.037
ne-0.037
Token him
Token ID683
Feature activation-2.06e-02

Pos logit contributions
 chores+0.078
 same+0.075
 plan+0.069
 urge+0.068
 temptation+0.065
Neg logit contributions
os-0.198
o-0.194
enny-0.191
umping-0.187
ark-0.186
Token and
Token ID290
Feature activation+2.19e-02

Pos logit contributions
os+0.060
o+0.059
oing+0.057
enny+0.057
umping+0.056
Neg logit contributions
 chores-0.030
 same-0.029
 plan-0.026
 temptation-0.025
 urge-0.025
Token came
Token ID1625
Feature activation+5.42e-02

Pos logit contributions
 chores+0.048
 same+0.045
 temptation+0.042
 plan+0.041
 urge+0.040
Neg logit contributions
os-0.052
o-0.052
enny-0.048
ne-0.047
oing-0.046
Token saw
Token ID2497
Feature activation+7.31e-02

Pos logit contributions
 chores+0.041
 same+0.038
 temptation+0.037
 plan+0.036
 urge+0.036
Neg logit contributions
os-0.055
o-0.053
enny-0.051
ne-0.050
umping-0.050
Token all
Token ID477
Feature activation+8.45e-03

Pos logit contributions
 chores+0.081
 same+0.073
 urge+0.072
 plan+0.072
 temptation+0.071
Neg logit contributions
os-0.240
o-0.234
enny-0.226
umping-0.223
ne-0.222
Token the
Token ID262
Feature activation+4.54e-02

Pos logit contributions
 chores+0.009
 same+0.008
 plan+0.008
 urge+0.007
 temptation+0.007
Neg logit contributions
os-0.029
o-0.028
enny-0.027
umping-0.027
ne-0.027
Token se
Token ID384
Feature activation-9.50e-02

Pos logit contributions
 chores+0.052
 same+0.047
 urge+0.043
 temptation+0.043
 plan+0.043
Neg logit contributions
os-0.145
o-0.143
ne-0.140
enny-0.140
ark-0.138
Tokenag
Token ID363
Feature activation-8.12e-02

Pos logit contributions
enny+0.215
os+0.214
o+0.206
ne+0.201
umping+0.201
Neg logit contributions
 chores-0.195
 same-0.188
 urge-0.182
 plan-0.181
 temptation-0.180
Tokenull
Token ID724
Feature activation-2.33e-01

Pos logit contributions
aron+0.137
enny+0.135
round+0.132
ne+0.126
oing+0.126
Neg logit contributions
 same-0.203
 chores-0.199
 plan-0.197
 other-0.192
 urge-0.191
Tokens
Token ID82
Feature activation+5.31e-02

Pos logit contributions
enny+0.340
aron+0.308
os+0.307
umping+0.300
ark+0.296
Neg logit contributions
 same-0.684
 chores-0.680
 going-0.670
 start-0.660
 way-0.657
Token flying
Token ID7348
Feature activation-2.80e-03

Pos logit contributions
 chores+0.104
 same+0.102
 plan+0.096
 urge+0.095
 going+0.094
Neg logit contributions
os-0.129
enny-0.128
o-0.126
ark-0.122
umping-0.120
Token in
Token ID287
Feature activation+9.92e-02

Pos logit contributions
os+0.007
enny+0.007
o+0.007
umping+0.007
oing+0.007
Neg logit contributions
 same-0.005
 chores-0.005
 plan-0.005
 urge-0.004
 way-0.004
Token the
Token ID262
Feature activation+5.55e-02

Pos logit contributions
 chores+0.148
 same+0.135
 urge+0.130
 plan+0.126
 temptation+0.122
Neg logit contributions
os-0.281
o-0.273
ne-0.269
umping-0.268
enny-0.267
Token sky
Token ID6766
Feature activation-1.27e-01

Pos logit contributions
 chores+0.059
 same+0.055
 plan+0.050
 urge+0.048
 temptation+0.047
Neg logit contributions
os-0.186
o-0.183
enny-0.181
umping-0.177
ne-0.176
Token one
Token ID530
Feature activation-2.99e-02

Pos logit contributions
 chores+0.058
 same+0.056
 urge+0.052
 plan+0.052
 temptation+0.051
Neg logit contributions
os-0.092
o-0.090
enny-0.089
ne-0.087
ark-0.085
Token of
Token ID286
Feature activation+8.81e-02

Pos logit contributions
os+0.089
o+0.087
enny+0.086
ne+0.084
umping+0.083
Neg logit contributions
 chores-0.044
 same-0.041
 plan-0.038
 urge-0.037
 temptation-0.037
Token the
Token ID262
Feature activation+4.89e-02

Pos logit contributions
 chores+0.102
 same+0.094
 urge+0.083
 temptation+0.083
 plan+0.082
Neg logit contributions
os-0.290
o-0.284
ne-0.277
enny-0.273
aron-0.269
Token se
Token ID384
Feature activation-8.80e-02

Pos logit contributions
 chores+0.060
 same+0.056
 going+0.053
 temptation+0.052
 urge+0.052
Neg logit contributions
os-0.136
o-0.136
ne-0.135
enny-0.132
itty-0.128
Tokenag
Token ID363
Feature activation-7.50e-02

Pos logit contributions
os+0.178
enny+0.178
o+0.172
ark+0.167
umping+0.167
Neg logit contributions
 chores-0.200
 same-0.196
 plan-0.191
 urge-0.189
 temptation-0.186
Tokenull
Token ID724
Feature activation-2.33e-01

Pos logit contributions
enny+0.117
os+0.111
aron+0.109
round+0.109
o+0.108
Neg logit contributions
 same-0.203
 plan-0.199
 chores-0.198
 urge-0.194
 way-0.191
Tokens
Token ID82
Feature activation+6.37e-02

Pos logit contributions
os+0.603
enny+0.593
o+0.586
ark+0.573
umping+0.567
Neg logit contributions
 chores-0.412
 same-0.406
 plan-0.379
 urge-0.377
 start-0.371
Token suddenly
Token ID6451
Feature activation+4.65e-02

Pos logit contributions
 chores+0.123
 same+0.119
 plan+0.111
 urge+0.110
 temptation+0.109
Neg logit contributions
os-0.159
o-0.154
enny-0.151
umping-0.147
ark-0.147
Token became
Token ID2627
Feature activation+1.10e-01

Pos logit contributions
 chores+0.091
 same+0.088
 plan+0.083
 urge+0.082
 temptation+0.082
Neg logit contributions
os-0.114
o-0.111
enny-0.110
umping-0.106
ne-0.105
Token very
Token ID845
Feature activation+5.99e-02

Pos logit contributions
 chores+0.143
 temptation+0.132
 urge+0.128
 same+0.127
 plan+0.125
Neg logit contributions
o-0.324
os-0.313
umping-0.300
enny-0.290
ne-0.290
Token aggressive
Token ID8361
Feature activation+1.82e-02

Pos logit contributions
 chores+0.081
 way+0.069
 same+0.069
 urge+0.068
 temptation+0.068
Neg logit contributions
o-0.172
os-0.171
oing-0.159
enny-0.157
ne-0.156
Token came
Token ID1625
Feature activation+5.22e-02

Pos logit contributions
aron+0.625
enny+0.599
ark+0.580
umbs+0.575
oing+0.568
Neg logit contributions
 going-0.332
 go-0.313
 same-0.302
 urge-0.298
 start-0.298
Token to
Token ID284
Feature activation+1.20e-01

Pos logit contributions
 chores+0.083
 same+0.083
 plan+0.075
 urge+0.074
 going+0.073
Neg logit contributions
os-0.146
enny-0.143
o-0.141
umping-0.138
ark-0.138
Token town
Token ID3240
Feature activation-7.77e-02

Pos logit contributions
 chores+0.134
 temptation+0.117
 same+0.115
 urge+0.109
 plan+0.101
Neg logit contributions
o-0.386
os-0.384
ne-0.362
ark-0.362
enny-0.361
Token.
Token ID13
Feature activation-7.91e-02

Pos logit contributions
os+0.187
enny+0.186
o+0.180
umping+0.177
ark+0.175
Neg logit contributions
 chores-0.152
 same-0.151
 plan-0.141
 going-0.140
 urge-0.140
Token The
Token ID383
Feature activation+2.49e-02

Pos logit contributions
enny+0.209
os+0.208
o+0.200
umping+0.199
ark+0.199
Neg logit contributions
 same-0.138
 chores-0.137
 urge-0.126
 plan-0.125
 start-0.124
Token clown
Token ID25573
Feature activation-2.33e-01

Pos logit contributions
 chores+0.060
 same+0.057
 plan+0.055
 temptation+0.054
 urge+0.054
Neg logit contributions
os-0.049
o-0.049
enny-0.047
ark-0.046
ne-0.045
Token had
Token ID550
Feature activation+8.08e-02

Pos logit contributions
enny+0.709
oing+0.688
aron+0.683
umping+0.681
ark+0.680
Neg logit contributions
 same-0.298
 going-0.298
 way-0.282
 bad-0.278
 plan-0.278
Token a
Token ID257
Feature activation+4.12e-02

Pos logit contributions
 chores+0.099
 temptation+0.093
 plan+0.090
 urge+0.088
 same+0.081
Neg logit contributions
o-0.251
os-0.245
aron-0.231
umping-0.231
ne-0.230
Token big
Token ID1263
Feature activation+1.59e-02

Pos logit contributions
 chores+0.044
 temptation+0.044
 same+0.036
 going+0.036
 start+0.035
Neg logit contributions
o-0.127
ouse-0.118
oing-0.117
os-0.117
round-0.117
Token red
Token ID2266
Feature activation-2.75e-02

Pos logit contributions
 chores+0.026
 temptation+0.024
 same+0.024
 going+0.022
 start+0.021
Neg logit contributions
o-0.041
os-0.039
round-0.038
umping-0.038
aron-0.038
Token nose
Token ID9686
Feature activation-1.17e-01

Pos logit contributions
os+0.072
o+0.072
enny+0.069
round+0.068
umping+0.068
Neg logit contributions
 chores-0.047
 same-0.045
 temptation-0.043
 plan-0.042
 urge-0.041
Token happening
Token ID5836
Feature activation+4.32e-02

Pos logit contributions
 chores+0.149
 urge+0.135
 temptation+0.135
 plan+0.132
 same+0.125
Neg logit contributions
os-0.254
o-0.253
ne-0.236
aron-0.228
ark-0.223
Token.
Token ID13
Feature activation-6.46e-02

Pos logit contributions
 chores+0.093
 same+0.089
 plan+0.086
 urge+0.085
 temptation+0.084
Neg logit contributions
os-0.097
o-0.095
enny-0.092
ark-0.090
umping-0.089
Token The
Token ID383
Feature activation+3.16e-02

Pos logit contributions
os+0.168
oing+0.163
umping+0.162
enny+0.162
o+0.162
Neg logit contributions
 chores-0.109
 same-0.109
 plan-0.099
 urge-0.098
 start-0.097
Token se
Token ID384
Feature activation-9.67e-02

Pos logit contributions
 chores+0.049
 same+0.046
 urge+0.043
 plan+0.043
 going+0.043
Neg logit contributions
os-0.085
o-0.085
ne-0.083
enny-0.083
ark-0.080
Tokenag
Token ID363
Feature activation-7.48e-02

Pos logit contributions
os+0.173
enny+0.169
o+0.166
umping+0.160
ark+0.159
Neg logit contributions
 chores-0.246
 same-0.244
 plan-0.237
 urge-0.234
 temptation-0.232
Tokenull
Token ID724
Feature activation-2.32e-01

Pos logit contributions
enny+0.130
aron+0.120
os+0.119
umping+0.118
round+0.118
Neg logit contributions
 same-0.192
 plan-0.187
 urge-0.181
 way-0.179
 chores-0.177
Token replied
Token ID8712
Feature activation+2.90e-02

Pos logit contributions
enny+0.599
os+0.591
o+0.577
ark+0.572
umping+0.566
Neg logit contributions
 same-0.408
 chores-0.407
 urge-0.385
 plan-0.385
 start-0.378
Token that
Token ID326
Feature activation+4.92e-02

Pos logit contributions
 chores+0.050
 same+0.048
 plan+0.045
 urge+0.045
 temptation+0.044
Neg logit contributions
os-0.078
o-0.076
enny-0.075
ark-0.073
umping-0.073
Token it
Token ID340
Feature activation-3.77e-02

Pos logit contributions
 chores+0.065
 same+0.062
 urge+0.058
 temptation+0.056
 plan+0.056
Neg logit contributions
os-0.153
o-0.148
enny-0.142
ne-0.141
ark-0.138
Token was
Token ID373
Feature activation+1.26e-01

Pos logit contributions
os+0.091
o+0.088
enny+0.088
ark+0.085
umping+0.085
Neg logit contributions
 chores-0.075
 same-0.074
 plan-0.069
 urge-0.069
 temptation-0.067
Token a
Token ID257
Feature activation+5.17e-02

Pos logit contributions
 chores+0.185
 same+0.162
 temptation+0.160
 urge+0.155
 plan+0.154
Neg logit contributions
o-0.351
os-0.322
aron-0.312
ne-0.304
umping-0.300
Token crashing
Token ID21899
Feature activation-2.47e-03

Pos logit contributions
 chores+0.122
 temptation+0.118
 urge+0.109
 plan+0.105
 same+0.103
Neg logit contributions
o-0.236
os-0.221
as-0.207
umping-0.204
lic-0.201
Token and
Token ID290
Feature activation+2.98e-02

Pos logit contributions
os+0.006
o+0.006
enny+0.006
ark+0.005
umping+0.005
Neg logit contributions
 chores-0.005
 same-0.005
 plan-0.005
 urge-0.005
 temptation-0.004
Token the
Token ID262
Feature activation+4.86e-02

Pos logit contributions
 chores+0.041
 plan+0.037
 urge+0.037
 same+0.036
 idea+0.034
Neg logit contributions
os-0.089
o-0.088
enny-0.082
umping-0.082
ne-0.080
Token se
Token ID384
Feature activation-9.76e-02

Pos logit contributions
 chores+0.076
 plan+0.063
 temptation+0.062
 going+0.059
 same+0.059
Neg logit contributions
os-0.130
o-0.129
ne-0.127
round-0.125
enny-0.123
Tokenag
Token ID363
Feature activation-7.83e-02

Pos logit contributions
os+0.223
enny+0.219
o+0.212
umping+0.208
ne+0.204
Neg logit contributions
 same-0.199
 chores-0.198
 urge-0.192
 plan-0.189
 temptation-0.189
Tokenull
Token ID724
Feature activation-2.32e-01

Pos logit contributions
enny+0.133
aron+0.129
os+0.126
oing+0.123
ne+0.123
Neg logit contributions
 same-0.200
 plan-0.194
 urge-0.191
 way-0.189
 chores-0.185
Tokens
Token ID82
Feature activation+5.63e-02

Pos logit contributions
enny+0.589
os+0.585
ark+0.566
o+0.565
aron+0.558
Neg logit contributions
 same-0.420
 chores-0.418
 urge-0.400
 plan-0.395
 start-0.390
Token were
Token ID547
Feature activation+9.51e-02

Pos logit contributions
 chores+0.102
 same+0.100
 urge+0.094
 plan+0.094
 going+0.093
Neg logit contributions
os-0.144
enny-0.143
o-0.140
ark-0.138
ne-0.135
Token squ
Token ID2809
Feature activation-3.26e-03

Pos logit contributions
 chores+0.143
 same+0.133
 temptation+0.128
 plan+0.123
 urge+0.123
Neg logit contributions
o-0.282
os-0.277
umping-0.262
oing-0.256
enny-0.255
Tokenaw
Token ID707
Feature activation-1.29e-01

Pos logit contributions
os+0.006
ark+0.006
o+0.006
enny+0.006
aron+0.006
Neg logit contributions
 chores-0.008
 same-0.008
 urge-0.007
 temptation-0.007
 bad-0.007
Tokenking
Token ID3364
Feature activation-4.06e-03

Pos logit contributions
enny+0.203
os+0.198
itty+0.175
ne+0.173
o+0.173
Neg logit contributions
 same-0.355
 going-0.347
 start-0.339
 chores-0.337
 bad-0.328
Token was
Token ID373
Feature activation+9.10e-02

Pos logit contributions
 chores+0.002
 same+0.002
 plan+0.002
 urge+0.002
 way+0.002
Neg logit contributions
os-0.002
o-0.002
enny-0.002
ark-0.002
oing-0.002
Token a
Token ID257
Feature activation+2.83e-02

Pos logit contributions
 chores+0.091
 same+0.083
 plan+0.078
 urge+0.075
 temptation+0.074
Neg logit contributions
o-0.308
os-0.307
enny-0.302
umping-0.297
ark-0.292
Token clown
Token ID25573
Feature activation-2.17e-01

Pos logit contributions
 chores+0.030
 same+0.028
 plan+0.025
 urge+0.025
 temptation+0.024
Neg logit contributions
os-0.094
o-0.093
enny-0.092
ark-0.090
umping-0.090
Token.
Token ID13
Feature activation-7.96e-02

Pos logit contributions
enny+0.608
aron+0.594
ives+0.584
umbs+0.580
umping+0.580
Neg logit contributions
 going-0.325
 other-0.300
 not-0.285
 her-0.280
 go-0.279
Token The
Token ID383
Feature activation+2.45e-02

Pos logit contributions
enny+0.222
os+0.218
umping+0.211
oing+0.211
ark+0.210
Neg logit contributions
 same-0.125
 chores-0.119
 urge-0.113
 start-0.110
 going-0.110
Token clown
Token ID25573
Feature activation-2.32e-01

Pos logit contributions
 chores+0.058
 same+0.056
 plan+0.053
 urge+0.053
 temptation+0.053
Neg logit contributions
os-0.050
o-0.049
enny-0.047
ark-0.046
ne-0.046
Token was
Token ID373
Feature activation+1.05e-01

Pos logit contributions
enny+0.734
oing+0.703
umping+0.702
umbs+0.701
aron+0.690
Neg logit contributions
 going-0.262
 other-0.262
 same-0.262
 only-0.254
 bad-0.251
Token very
Token ID845
Feature activation+5.16e-02

Pos logit contributions
 chores+0.160
 temptation+0.147
 plan+0.139
 urge+0.131
 same+0.130
Neg logit contributions
o-0.325
os-0.301
ark-0.282
umping-0.280
aron-0.278
Token funny
Token ID8258
Feature activation-6.65e-03

Pos logit contributions
 chores+0.058
 same+0.051
 plan+0.047
 temptation+0.047
 start+0.046
Neg logit contributions
o-0.171
os-0.170
enny-0.163
ark-0.162
oing-0.160
Token.
Token ID13
Feature activation-7.68e-02

Pos logit contributions
enny+0.019
os+0.019
oing+0.018
oped+0.018
itty+0.018
Neg logit contributions
 going-0.010
 same-0.009
 chores-0.009
 way-0.009
 bad-0.009
Token He
Token ID679
Feature activation-1.47e-02

Pos logit contributions
os+0.211
enny+0.210
oing+0.205
o+0.204
umping+0.204
Neg logit contributions
 same-0.126
 chores-0.124
 urge-0.114
 plan-0.112
 start-0.111
Token,
Token ID11
Feature activation+2.21e-02

Pos logit contributions
enny+0.356
os+0.350
o+0.341
ark+0.338
umping+0.333
Neg logit contributions
 chores-0.273
 same-0.271
 start-0.254
 plan-0.254
 way-0.253
Token and
Token ID290
Feature activation+2.05e-02

Pos logit contributions
 chores+0.041
 same+0.036
 plan+0.035
 temptation+0.034
 urge+0.033
Neg logit contributions
o-0.058
os-0.057
oing-0.055
aron-0.053
umping-0.053
Token the
Token ID262
Feature activation+4.72e-02

Pos logit contributions
 chores+0.029
 same+0.026
 temptation+0.024
 plan+0.024
 urge+0.024
Neg logit contributions
os-0.064
o-0.062
enny-0.059
oing-0.058
umping-0.058
Token se
Token ID384
Feature activation-9.44e-02

Pos logit contributions
 chores+0.068
 same+0.059
 urge+0.057
 temptation+0.056
 going+0.056
Neg logit contributions
os-0.132
o-0.132
ne-0.127
oing-0.125
round-0.125
Tokenag
Token ID363
Feature activation-7.32e-02

Pos logit contributions
enny+0.209
os+0.205
o+0.194
umping+0.194
ne+0.192
Neg logit contributions
 same-0.197
 chores-0.197
 plan-0.191
 urge-0.190
 temptation-0.187
Tokenull
Token ID724
Feature activation-2.32e-01

Pos logit contributions
enny+0.132
os+0.119
aron+0.116
ne+0.115
itty+0.115
Neg logit contributions
 same-0.185
 plan-0.180
 way-0.176
 urge-0.175
 chores-0.174
Tokens
Token ID82
Feature activation+4.99e-02

Pos logit contributions
enny+0.406
ark+0.353
os+0.349
umping+0.341
aron+0.339
Neg logit contributions
 same-0.623
 start-0.621
 chores-0.614
 going-0.610
 way-0.608
Token flying
Token ID7348
Feature activation+3.39e-05

Pos logit contributions
 chores+0.097
 same+0.095
 plan+0.089
 urge+0.088
 temptation+0.086
Neg logit contributions
os-0.123
o-0.120
enny-0.120
ark-0.115
umping-0.115
Token in
Token ID287
Feature activation+1.03e-01

Pos logit contributions
 same+0.000
 chores+0.000
 plan+0.000
 urge+0.000
 start+0.000
Neg logit contributions
os-0.000
enny-0.000
o-0.000
umping-0.000
oing-0.000
Token the
Token ID262
Feature activation+5.56e-02

Pos logit contributions
 chores+0.115
 same+0.096
 urge+0.091
 plan+0.088
 temptation+0.084
Neg logit contributions
os-0.334
o-0.326
umping-0.320
aron-0.319
ne-0.318
Token sky
Token ID6766
Feature activation-1.18e-01

Pos logit contributions
 chores+0.067
 same+0.056
 plan+0.055
 urge+0.052
 temptation+0.052
Neg logit contributions
os-0.176
o-0.174
ne-0.169
enny-0.168
umping-0.167
Token a
Token ID257
Feature activation+5.70e-02

Pos logit contributions
os+0.181
o+0.179
enny+0.176
umping+0.172
oing+0.171
Neg logit contributions
 chores-0.061
 same-0.058
 plan-0.051
 urge-0.051
 temptation-0.050
Token new
Token ID649
Feature activation+3.01e-02

Pos logit contributions
 chores+0.060
 going+0.054
 temptation+0.052
 same+0.052
 urge+0.049
Neg logit contributions
o-0.179
enny-0.173
oing-0.170
os-0.169
oc-0.166
Token tank
Token ID6873
Feature activation-1.32e-01

Pos logit contributions
 chores+0.052
 same+0.048
 temptation+0.047
 going+0.045
 urge+0.045
Neg logit contributions
o-0.080
os-0.079
enny-0.077
ark-0.075
oing-0.074
Token out
Token ID503
Feature activation+3.74e-02

Pos logit contributions
os+0.315
o+0.308
enny+0.303
ark+0.292
umping+0.291
Neg logit contributions
 chores-0.267
 same-0.259
 plan-0.243
 urge-0.242
 temptation-0.240
Token of
Token ID286
Feature activation+1.14e-01

Pos logit contributions
 chores+0.092
 same+0.090
 plan+0.085
 urge+0.085
 temptation+0.084
Neg logit contributions
os-0.074
o-0.071
enny-0.070
umping-0.066
ne-0.066
Token Leg
Token ID3564
Feature activation-2.31e-01

Pos logit contributions
 chores+0.141
 urge+0.119
 same+0.119
 plan+0.116
 temptation+0.109
Neg logit contributions
os-0.331
o-0.325
enny-0.317
ne-0.306
aron-0.303
Tokenos
Token ID418
Feature activation-1.12e-01

Pos logit contributions
aron+0.149
pillar+0.140
oing+0.138
enny+0.137
round+0.129
Neg logit contributions
 going-0.790
 plan-0.768
 work-0.768
 chores-0.767
 first-0.766
Token.
Token ID13
Feature activation-7.93e-02

Pos logit contributions
enny+0.277
ark+0.269
os+0.268
o+0.267
aron+0.264
Neg logit contributions
 chores-0.215
 same-0.207
 start-0.194
 work-0.194
 urge-0.191
Token\n
Token ID198
Feature activation-7.99e-02

Pos logit contributions
os+0.227
ark+0.218
enny+0.218
umping+0.217
ne+0.215
Neg logit contributions
 same-0.120
 chores-0.119
 plan-0.108
 urge-0.107
 way-0.107
Token\n
Token ID198
Feature activation-8.02e-02

Pos logit contributions
os+0.226
umping+0.219
oing+0.217
aron+0.215
ist+0.213
Neg logit contributions
 chores-0.113
 same-0.108
 plan-0.105
 bad-0.103
 other-0.101
TokenTim
Token ID14967
Feature activation-1.19e-01

Pos logit contributions
os+0.183
o+0.166
enny+0.164
umping+0.164
oing+0.161
Neg logit contributions
 chores-0.177
 same-0.172
 plan-0.165
 temptation-0.163
 bad-0.162
Token.
Token ID13
Feature activation-7.70e-02

Pos logit contributions
 chores+0.007
 same+0.006
 plan+0.006
 urge+0.006
 temptation+0.006
Neg logit contributions
os-0.011
o-0.011
enny-0.011
umping-0.011
ark-0.011
Token
Token ID220
Feature activation-6.89e-02

Pos logit contributions
os+0.214
oing+0.206
umping+0.203
ark+0.201
enny+0.199
Neg logit contributions
 same-0.114
 chores-0.109
 going-0.104
 start-0.100
 sorry-0.099
Token\n
Token ID198
Feature activation-8.06e-02

Pos logit contributions
oing+0.173
os+0.170
umping+0.169
ne+0.168
round+0.167
Neg logit contributions
 chores-0.122
 same-0.118
 going-0.108
 plan-0.102
 work-0.102
Token\n
Token ID198
Feature activation-8.30e-02

Pos logit contributions
os+0.221
umping+0.211
oing+0.211
ist+0.204
enny+0.203
Neg logit contributions
 chores-0.126
 same-0.120
 other-0.114
 plan-0.112
 bad-0.109
TokenThe
Token ID464
Feature activation+2.50e-02

Pos logit contributions
os+0.202
o+0.179
oing+0.179
umping+0.179
enny+0.177
Neg logit contributions
 chores-0.167
 same-0.159
 temptation-0.149
 plan-0.149
 bad-0.149
Token flaming
Token ID39518
Feature activation-2.31e-01

Pos logit contributions
 chores+0.047
 same+0.041
 plan+0.039
 temptation+0.039
 going+0.039
Neg logit contributions
o-0.063
os-0.063
ne-0.060
ark-0.058
oing-0.058
Tokeno
Token ID78
Feature activation-1.30e-01

Pos logit contributions
enny+0.189
round+0.171
ppings+0.169
umping+0.168
aron+0.168
Neg logit contributions
 plan-0.681
 same-0.673
 other-0.664
 first-0.657
 idea-0.656
Token nodded
Token ID14464
Feature activation+3.81e-02

Pos logit contributions
enny+0.350
os+0.333
ark+0.332
umping+0.329
oing+0.329
Neg logit contributions
 same-0.215
 plan-0.203
 start-0.202
 chores-0.198
 work-0.197
Token its
Token ID663
Feature activation+5.92e-02

Pos logit contributions
 chores+0.072
 same+0.070
 plan+0.065
 urge+0.065
 temptation+0.064
Neg logit contributions
os-0.096
o-0.094
enny-0.093
ark-0.090
umping-0.090
Token head
Token ID1182
Feature activation-3.94e-02

Pos logit contributions
 chores+0.091
 same+0.073
 going+0.073
 temptation+0.072
 laundry+0.070
Neg logit contributions
o-0.165
aron-0.163
oing-0.159
ark-0.157
os-0.155
Token and
Token ID290
Feature activation+3.71e-02

Pos logit contributions
os+0.095
enny+0.093
o+0.092
ark+0.089
umping+0.089
Neg logit contributions
 chores-0.077
 same-0.077
 urge-0.072
 plan-0.071
 start-0.071
Token friend
Token ID1545
Feature activation-1.05e-01

Pos logit contributions
 chores+0.052
 same+0.047
 temptation+0.047
 going+0.044
 urge+0.043
Neg logit contributions
o-0.117
enny-0.117
os-0.115
ne-0.113
ark-0.110
Token Sammy
Token ID44067
Feature activation-1.26e-01

Pos logit contributions
os+0.291
o+0.281
enny+0.281
ark+0.277
oing+0.276
Neg logit contributions
 chores-0.167
 same-0.164
 plan-0.152
 urge-0.150
 start-0.148
Token the
Token ID262
Feature activation+4.88e-02

Pos logit contributions
os+0.308
o+0.305
enny+0.300
ne+0.287
umping+0.286
Neg logit contributions
 chores-0.250
 same-0.241
 plan-0.224
 temptation-0.224
 urge-0.223
Token se
Token ID384
Feature activation-9.16e-02

Pos logit contributions
 chores+0.117
 going+0.098
 temptation+0.095
 start+0.093
 urge+0.092
Neg logit contributions
enny-0.089
o-0.088
ne-0.084
pillar-0.081
ark-0.079
Tokenag
Token ID363
Feature activation-7.22e-02

Pos logit contributions
os+0.197
enny+0.188
o+0.186
oing+0.185
umping+0.183
Neg logit contributions
 chores-0.199
 same-0.196
 plan-0.190
 urge-0.188
 temptation-0.186
Tokenull
Token ID724
Feature activation-2.30e-01

Pos logit contributions
os+0.113
enny+0.109
aron+0.106
o+0.105
oing+0.104
Neg logit contributions
 same-0.195
 plan-0.190
 chores-0.189
 urge-0.188
 way-0.184
Token flew
Token ID13112
Feature activation+4.66e-02

Pos logit contributions
os+0.620
enny+0.614
o+0.595
ark+0.593
umping+0.587
Neg logit contributions
 chores-0.378
 same-0.375
 urge-0.354
 plan-0.353
 start-0.350
Token by
Token ID416
Feature activation+8.77e-02

Pos logit contributions
 chores+0.084
 same+0.074
 plan+0.074
 urge+0.072
 temptation+0.071
Neg logit contributions
o-0.127
os-0.125
enny-0.119
ark-0.118
umping-0.117
Token and
Token ID290
Feature activation+3.22e-02

Pos logit contributions
 chores+0.188
 same+0.182
 plan+0.171
 urge+0.170
 temptation+0.167
Neg logit contributions
os-0.201
o-0.197
enny-0.193
ark-0.187
umping-0.187
Token saw
Token ID2497
Feature activation+7.92e-02

Pos logit contributions
 chores+0.064
 same+0.060
 plan+0.056
 temptation+0.056
 urge+0.056
Neg logit contributions
o-0.081
os-0.081
enny-0.078
ark-0.075
ne-0.075
Token that
Token ID326
Feature activation+4.46e-02

Pos logit contributions
 chores+0.142
 same+0.131
 plan+0.130
 urge+0.129
 way+0.122
Neg logit contributions
os-0.211
o-0.209
enny-0.197
umping-0.192
ne-0.186
Token named
Token ID3706
Feature activation-3.86e-02

Pos logit contributions
os+0.326
enny+0.324
umping+0.314
ark+0.312
oing+0.306
Neg logit contributions
 same-0.230
 chores-0.227
 going-0.224
 urge-0.223
 her-0.217
Token Lily
Token ID20037
Feature activation-1.01e-01

Pos logit contributions
os+0.109
enny+0.108
o+0.108
ne+0.102
ark+0.101
Neg logit contributions
 chores-0.063
 same-0.057
 plan-0.056
 urge-0.054
 temptation-0.053
Token.
Token ID13
Feature activation-8.38e-02

Pos logit contributions
enny+0.298
os+0.292
lic+0.289
oing+0.286
umping+0.282
Neg logit contributions
 going-0.143
 same-0.141
 her-0.140
 urge-0.133
 work-0.130
Token She
Token ID1375
Feature activation-2.12e-02

Pos logit contributions
os+0.244
o+0.237
enny+0.233
ark+0.233
oing+0.232
Neg logit contributions
 chores-0.126
 same-0.126
 urge-0.113
 plan-0.111
 temptation-0.110
Token loved
Token ID6151
Feature activation+8.47e-02

Pos logit contributions
os+0.057
enny+0.054
umping+0.053
oing+0.053
ark+0.053
Neg logit contributions
 same-0.038
 chores-0.037
 urge-0.035
 bad-0.035
 going-0.035
Token to
Token ID284
Feature activation+1.14e-01

Pos logit contributions
 chores+0.159
 same+0.158
 temptation+0.157
 plan+0.156
 urge+0.156
Neg logit contributions
o-0.206
os-0.201
umping-0.181
enny-0.180
round-0.179
Token play
Token ID711
Feature activation+3.08e-02

Pos logit contributions
 same+0.165
 chores+0.158
 temptation+0.151
 bad+0.142
 way+0.140
Neg logit contributions
os-0.326
o-0.320
enny-0.309
ne-0.303
oing-0.301
Token outside
Token ID2354
Feature activation-3.22e-03

Pos logit contributions
 chores+0.056
 same+0.054
 plan+0.051
 urge+0.050
 start+0.049
Neg logit contributions
os-0.081
o-0.079
enny-0.078
ark-0.076
umping-0.076
Token in
Token ID287
Feature activation+9.78e-02

Pos logit contributions
os+0.007
o+0.007
enny+0.007
umping+0.007
ne+0.007
Neg logit contributions
 chores-0.007
 same-0.007
 plan-0.006
 temptation-0.006
 urge-0.006
Token the
Token ID262
Feature activation+5.28e-02

Pos logit contributions
 chores+0.112
 same+0.099
 plan+0.097
 urge+0.095
 temptation+0.092
Neg logit contributions
os-0.309
o-0.308
ne-0.288
enny-0.288
umping-0.287
Token park
Token ID3952
Feature activation-7.96e-02

Pos logit contributions
 chores+0.066
 plan+0.057
 same+0.056
 urge+0.055
 temptation+0.055
Neg logit contributions
o-0.166
os-0.164
enny-0.158
umping-0.158
ne-0.154
Tokenigg
Token ID6950
Feature activation+7.54e-03

Pos logit contributions
o+0.037
os+0.036
enny+0.036
oing+0.035
ark+0.035
Neg logit contributions
 chores-0.021
 same-0.020
 bad-0.019
 urge-0.019
 plan-0.018
Tokenly
Token ID306
Feature activation+6.63e-03

Pos logit contributions
 same+0.020
 chores+0.019
 bad+0.018
 urge+0.018
 temptation+0.018
Neg logit contributions
os-0.014
enny-0.013
o-0.013
itty-0.012
umping-0.012
Token.
Token ID13
Feature activation-8.56e-02

Pos logit contributions
 chores+0.013
 same+0.013
 plan+0.012
 urge+0.012
 temptation+0.012
Neg logit contributions
os-0.016
o-0.015
enny-0.015
ark-0.015
umping-0.015
Token The
Token ID383
Feature activation+1.54e-02

Pos logit contributions
os+0.263
oing+0.259
enny+0.258
ark+0.256
umping+0.256
Neg logit contributions
 same-0.114
 chores-0.109
 way-0.100
 urge-0.096
 plan-0.095
Token dragon
Token ID10441
Feature activation-2.12e-01

Pos logit contributions
 chores+0.029
 same+0.026
 plan+0.025
 temptation+0.025
 urge+0.025
Neg logit contributions
ne-0.035
os-0.035
o-0.034
oing-0.033
aron-0.033
Token was
Token ID373
Feature activation+1.02e-01

Pos logit contributions
enny+0.576
os+0.551
umping+0.548
ark+0.543
oing+0.537
Neg logit contributions
 same-0.354
 plan-0.347
 chores-0.335
 urge-0.333
 way-0.328
Token very
Token ID845
Feature activation+5.05e-02

Pos logit contributions
 chores+0.169
 temptation+0.152
 plan+0.140
 same+0.140
 urge+0.137
Neg logit contributions
o-0.300
os-0.281
aron-0.270
umping-0.267
oing-0.260
Token happy
Token ID3772
Feature activation+1.56e-02

Pos logit contributions
 chores+0.071
 same+0.061
 temptation+0.056
 start+0.056
 plan+0.055
Neg logit contributions
o-0.156
os-0.153
oing-0.147
enny-0.145
ark-0.145
Token,
Token ID11
Feature activation+2.35e-02

Pos logit contributions
 same+0.027
 going+0.027
 chores+0.026
 way+0.025
 urge+0.025
Neg logit contributions
os-0.041
enny-0.040
o-0.039
oing-0.039
ark-0.039
Token but
Token ID475
Feature activation+6.65e-03

Pos logit contributions
 chores+0.054
 same+0.051
 plan+0.050
 temptation+0.048
 urge+0.048
Neg logit contributions
os-0.049
o-0.049
enny-0.047
umping-0.046
ark-0.046
Token he
Token ID339
Feature activation-1.97e-02

Pos logit contributions
 chores+0.012
 same+0.010
 plan+0.010
 temptation+0.010
 urge+0.010
Neg logit contributions
o-0.018
os-0.017
ne-0.017
enny-0.017
ark-0.017
Token played
Token ID2826
Feature activation+4.79e-02

Pos logit contributions
 chores+0.017
 same+0.016
 start+0.015
 plan+0.015
 urge+0.015
Neg logit contributions
os-0.026
enny-0.026
umping-0.025
ark-0.025
o-0.025
Token together
Token ID1978
Feature activation+5.49e-02

Pos logit contributions
 chores+0.106
 same+0.103
 urge+0.097
 plan+0.097
 temptation+0.096
Neg logit contributions
os-0.105
o-0.102
enny-0.100
ark-0.097
umping-0.097
Token until
Token ID1566
Feature activation+3.92e-02

Pos logit contributions
 chores+0.109
 same+0.105
 temptation+0.101
 urge+0.099
 plan+0.098
Neg logit contributions
os-0.131
o-0.127
enny-0.123
ark-0.122
ne-0.121
Token the
Token ID262
Feature activation+6.17e-02

Pos logit contributions
 plan+0.057
 chores+0.057
 urge+0.054
 same+0.053
 temptation+0.052
Neg logit contributions
o-0.102
os-0.101
ne-0.098
enny-0.096
round-0.095
Token sun
Token ID4252
Feature activation-1.27e-01

Pos logit contributions
 chores+0.086
 same+0.074
 plan+0.074
 going+0.070
 urge+0.069
Neg logit contributions
o-0.171
itty-0.169
ne-0.169
os-0.165
oing-0.165
Token began
Token ID2540
Feature activation+9.72e-02

Pos logit contributions
os+0.376
enny+0.372
o+0.368
ark+0.362
umping+0.356
Neg logit contributions
 chores-0.181
 same-0.179
 plan-0.164
 urge-0.162
 start-0.161
Token to
Token ID284
Feature activation+1.32e-01

Pos logit contributions
 chores+0.189
 same+0.178
 plan+0.170
 temptation+0.167
 urge+0.167
Neg logit contributions
os-0.245
o-0.242
oing-0.226
enny-0.224
umping-0.223
Token rise
Token ID4485
Feature activation+3.76e-02

Pos logit contributions
 chores+0.172
 same+0.145
 temptation+0.138
 plan+0.137
 urge+0.129
Neg logit contributions
os-0.397
o-0.393
oing-0.383
ne-0.378
aron-0.376
Token.
Token ID13
Feature activation-7.68e-02

Pos logit contributions
 chores+0.075
 same+0.075
 plan+0.069
 urge+0.068
 start+0.068
Neg logit contributions
os-0.091
enny-0.089
o-0.088
ark-0.085
umping-0.084
Token<|endoftext|>
Token ID50256
Feature activation-1.42e-01

Pos logit contributions
os+0.208
ark+0.200
o+0.196
umping+0.195
ne+0.195
Neg logit contributions
 same-0.133
 chores-0.132
 bad-0.120
 start-0.119
 good-0.118
TokenL
Token ID43
Feature activation-1.07e-01

Pos logit contributions
oing+0.406
round+0.400
ark+0.398
ist+0.397
os+0.391
Neg logit contributions
 same-0.212
 chores-0.196
 other-0.196
 going-0.191
 the-0.186
Token and
Token ID290
Feature activation+1.37e-02

Pos logit contributions
os+0.307
enny+0.300
oing+0.295
ark+0.295
o+0.293
Neg logit contributions
 same-0.199
 plan-0.193
 chores-0.191
 way-0.187
 urge-0.186
Token Mary
Token ID5335
Feature activation-1.05e-01

Pos logit contributions
 chores+0.030
 same+0.029
 temptation+0.027
 urge+0.027
 plan+0.026
Neg logit contributions
os-0.031
o-0.030
enny-0.029
ark-0.029
ne-0.028
Token were
Token ID547
Feature activation+8.65e-02

Pos logit contributions
os+0.264
enny+0.259
o+0.259
oing+0.255
ark+0.255
Neg logit contributions
 chores-0.191
 same-0.190
 plan-0.183
 start-0.180
 urge-0.179
Token very
Token ID845
Feature activation+5.22e-02

Pos logit contributions
 chores+0.146
 temptation+0.139
 same+0.137
 urge+0.122
 plan+0.121
Neg logit contributions
o-0.228
os-0.226
umping-0.213
aron-0.212
enny-0.206
Token sad
Token ID6507
Feature activation+2.01e-02

Pos logit contributions
 chores+0.094
 same+0.090
 temptation+0.086
 way+0.084
 avail+0.083
Neg logit contributions
os-0.123
o-0.123
enny-0.118
as-0.116
oing-0.115
Token about
Token ID546
Feature activation+8.33e-02

Pos logit contributions
 chores+0.042
 same+0.041
 plan+0.038
 urge+0.038
 temptation+0.037
Neg logit contributions
os-0.047
o-0.046
enny-0.046
ark-0.044
umping-0.044
Token the
Token ID262
Feature activation+4.96e-02

Pos logit contributions
 same+0.111
 urge+0.102
 chores+0.102
 temptation+0.091
 plan+0.091
Neg logit contributions
os-0.223
o-0.217
enny-0.215
ark-0.207
ne-0.207
Token news
Token ID1705
Feature activation-7.91e-02

Pos logit contributions
 chores+0.088
 going+0.076
 temptation+0.076
 same+0.075
 start+0.074
Neg logit contributions
ne-0.110
enny-0.108
os-0.104
o-0.104
ark-0.103
Token.
Token ID13
Feature activation-9.28e-02

Pos logit contributions
os+0.187
o+0.180
enny+0.179
ne+0.173
umping+0.172
Neg logit contributions
 chores-0.157
 same-0.155
 urge-0.144
 plan-0.144
 temptation-0.143
Token\n
Token ID198
Feature activation-9.65e-02

Pos logit contributions
os+0.249
enny+0.246
o+0.245
umping+0.240
ark+0.239
Neg logit contributions
 same-0.155
 chores-0.153
 plan-0.141
 urge-0.139
 way-0.138
Token\n
Token ID198
Feature activation-9.58e-02

Pos logit contributions
umping+0.254
os+0.251
enny+0.249
aron+0.244
oing+0.243
Neg logit contributions
 chores-0.159
 same-0.154
 plan-0.151
 other-0.145
 work-0.142
Token black
Token ID2042
Feature activation-4.85e-02

Pos logit contributions
 chores+0.053
 same+0.048
 temptation+0.046
 plan+0.045
 urge+0.045
Neg logit contributions
o-0.097
os-0.096
umping-0.086
aron-0.086
oing-0.084
Token and
Token ID290
Feature activation+2.79e-02

Pos logit contributions
enny+0.132
ark+0.131
os+0.124
oing+0.124
o+0.123
Neg logit contributions
 chores-0.082
 same-0.077
 work-0.075
 going-0.075
 rain-0.075
Token hard
Token ID1327
Feature activation+3.27e-02

Pos logit contributions
 chores+0.041
 same+0.039
 urge+0.037
 plan+0.036
 temptation+0.035
Neg logit contributions
o-0.081
os-0.080
oing-0.072
aron-0.071
lic-0.070
Token.
Token ID13
Feature activation-7.65e-02

Pos logit contributions
 chores+0.066
 same+0.066
 plan+0.062
 urge+0.062
 temptation+0.060
Neg logit contributions
os-0.077
o-0.075
umping-0.071
enny-0.071
aron-0.070
Token Lily
Token ID20037
Feature activation-9.51e-02

Pos logit contributions
os+0.207
ark+0.198
o+0.198
umping+0.196
enny+0.196
Neg logit contributions
 same-0.129
 chores-0.129
 urge-0.115
 plan-0.115
 way-0.114
Token was
Token ID373
Feature activation+1.07e-01

Pos logit contributions
os+0.254
enny+0.253
ark+0.253
oing+0.248
round+0.244
Neg logit contributions
 same-0.152
 plan-0.149
 chores-0.148
 way-0.147
 urge-0.145
Token sad
Token ID6507
Feature activation+2.42e-02

Pos logit contributions
 chores+0.182
 same+0.167
 temptation+0.162
 plan+0.157
 urge+0.150
Neg logit contributions
o-0.295
os-0.284
aron-0.276
umping-0.273
oing-0.266
Token because
Token ID780
Feature activation+3.44e-03

Pos logit contributions
 chores+0.049
 same+0.047
 plan+0.045
 urge+0.044
 temptation+0.043
Neg logit contributions
os-0.058
o-0.057
enny-0.057
umping-0.054
ark-0.054
Token they
Token ID484
Feature activation+5.27e-04

Pos logit contributions
 same+0.006
 plan+0.006
 chores+0.005
 urge+0.005
 temptation+0.005
Neg logit contributions
os-0.009
o-0.009
enny-0.009
itty-0.008
ne-0.008
Token couldn
Token ID3521
Feature activation+8.66e-02

Pos logit contributions
 chores+0.001
 same+0.001
 urge+0.001
 plan+0.001
 start+0.001
Neg logit contributions
os-0.001
enny-0.001
o-0.001
ark-0.001
umping-0.001
Token't
Token ID470
Feature activation+9.90e-02

Pos logit contributions
 chores+0.121
 same+0.117
 plan+0.108
 urge+0.107
 temptation+0.104
Neg logit contributions
os-0.261
o-0.256
enny-0.251
ark-0.244
umping-0.244
Token she
Token ID673
Feature activation-1.69e-02

Pos logit contributions
 same+0.083
 chores+0.079
 plan+0.077
 start+0.076
 urge+0.075
Neg logit contributions
o-0.136
os-0.135
and-0.125
enny-0.124
oing-0.123
Token should
Token ID815
Feature activation+1.04e-01

Pos logit contributions
os+0.042
ark+0.041
aron+0.041
umping+0.041
oing+0.041
Neg logit contributions
 chores-0.032
 same-0.031
 bad-0.029
 urge-0.029
 plan-0.029
Token never
Token ID1239
Feature activation+1.06e-01

Pos logit contributions
 same+0.136
 chores+0.134
 plan+0.114
 temptation+0.114
 going+0.113
Neg logit contributions
o-0.290
os-0.275
enny-0.272
ne-0.272
aron-0.268
Token touch
Token ID3638
Feature activation+5.71e-02

Pos logit contributions
 chores+0.140
 same+0.129
 temptation+0.115
 plan+0.114
 urge+0.111
Neg logit contributions
o-0.327
os-0.326
enny-0.315
ne-0.311
ark-0.306
Token things
Token ID1243
Feature activation-6.91e-02

Pos logit contributions
 chores+0.060
 same+0.057
 plan+0.054
 urge+0.053
 temptation+0.051
Neg logit contributions
os-0.190
o-0.185
enny-0.181
umping-0.179
ne-0.177
Token she
Token ID673
Feature activation-1.66e-02

Pos logit contributions
os+0.175
enny+0.170
o+0.169
ne+0.164
ark+0.163
Neg logit contributions
 chores-0.126
 same-0.120
 plan-0.118
 urge-0.114
 temptation-0.111
Token finds
Token ID7228
Feature activation+6.54e-02

Pos logit contributions
os+0.046
enny+0.045
o+0.044
ark+0.044
umping+0.043
Neg logit contributions
 chores-0.028
 same-0.027
 urge-0.025
 plan-0.025
 going-0.024
Token outside
Token ID2354
Feature activation-1.90e-03

Pos logit contributions
 chores+0.126
 same+0.123
 plan+0.115
 urge+0.114
 temptation+0.112
Neg logit contributions
os-0.162
o-0.158
enny-0.156
ark-0.151
umping-0.151
Token.
Token ID13
Feature activation-8.56e-02

Pos logit contributions
os+0.005
o+0.004
enny+0.004
umping+0.004
ark+0.004
Neg logit contributions
 chores-0.004
 same-0.004
 plan-0.004
 urge-0.003
 temptation-0.003
Token<|endoftext|>
Token ID50256
Feature activation-1.46e-01

Pos logit contributions
os+0.225
ark+0.218
ne+0.216
umping+0.216
oing+0.215
Neg logit contributions
 same-0.148
 chores-0.144
 temptation-0.132
 way-0.132
 going-0.130
TokenOnce
Token ID7454
Feature activation-3.28e-02

Pos logit contributions
oing+0.433
os+0.425
round+0.423
ist+0.420
ark+0.418
Neg logit contributions
 same-0.203
 way-0.190
 other-0.188
 chores-0.187
 going-0.184
Token Everyone
Token ID11075
Feature activation-1.21e-01

Pos logit contributions
os+0.228
ark+0.225
oing+0.222
enny+0.221
o+0.221
Neg logit contributions
 same-0.137
 chores-0.134
 start-0.123
 plan-0.121
 urge-0.121
Token agreed
Token ID4987
Feature activation+6.11e-02

Pos logit contributions
enny+0.306
oing+0.304
os+0.302
umping+0.301
ark+0.298
Neg logit contributions
 plan-0.211
 same-0.211
 chores-0.209
 start-0.201
 urge-0.200
Token,
Token ID11
Feature activation+2.73e-02

Pos logit contributions
 chores+0.132
 same+0.129
 urge+0.120
 plan+0.120
 temptation+0.119
Neg logit contributions
os-0.136
o-0.133
enny-0.132
ark-0.127
umping-0.126
Token so
Token ID523
Feature activation+5.38e-02

Pos logit contributions
 chores+0.060
 same+0.058
 plan+0.054
 urge+0.053
 temptation+0.053
Neg logit contributions
os-0.060
o-0.059
enny-0.057
umping-0.056
ne-0.055
Token the
Token ID262
Feature activation+4.62e-02

Pos logit contributions
 chores+0.059
 same+0.054
 plan+0.047
 urge+0.047
 temptation+0.046
Neg logit contributions
os-0.180
o-0.177
enny-0.174
ne-0.170
ark-0.170
Token brave
Token ID14802
Feature activation-8.87e-03

Pos logit contributions
 chores+0.094
 same+0.086
 going+0.083
 temptation+0.082
 urge+0.081
Neg logit contributions
o-0.100
enny-0.095
os-0.095
itty-0.094
ne-0.094
Token mouse
Token ID10211
Feature activation-1.58e-01

Pos logit contributions
os+0.025
o+0.024
enny+0.024
ark+0.023
oing+0.023
Neg logit contributions
 chores-0.014
 same-0.014
 plan-0.013
 urge-0.013
 bad-0.013
Token hopped
Token ID45230
Feature activation+4.69e-02

Pos logit contributions
os+0.406
o+0.396
enny+0.394
ark+0.382
umping+0.382
Neg logit contributions
 chores-0.285
 same-0.279
 plan-0.263
 urge-0.259
 start-0.254
Token up
Token ID510
Feature activation+8.24e-02

Pos logit contributions
 chores+0.086
 same+0.081
 plan+0.078
 urge+0.077
 temptation+0.077
Neg logit contributions
os-0.121
o-0.119
enny-0.117
ark-0.114
umping-0.113
Token on
Token ID319
Feature activation+1.04e-01

Pos logit contributions
 chores+0.152
 same+0.148
 temptation+0.142
 urge+0.141
 plan+0.139
Neg logit contributions
os-0.210
o-0.204
enny-0.196
umping-0.192
ne-0.191
Token the
Token ID262
Feature activation+6.56e-02

Pos logit contributions
 same+0.123
 chores+0.120
 urge+0.117
 temptation+0.106
 plan+0.106
Neg logit contributions
os-0.310
o-0.301
enny-0.298
ark-0.295
umping-0.293
Token Do
Token ID2141
Feature activation+7.75e-02

Pos logit contributions
os+0.168
enny+0.166
umping+0.164
o+0.161
ark+0.160
Neg logit contributions
 same-0.106
 chores-0.103
 urge-0.097
 start-0.095
 going-0.094
Token you
Token ID345
Feature activation+3.28e-02

Pos logit contributions
 chores+0.117
 same+0.111
 plan+0.107
 urge+0.106
 way+0.101
Neg logit contributions
o-0.218
os-0.217
enny-0.210
ne-0.207
ark-0.206
Token want
Token ID765
Feature activation+8.92e-02

Pos logit contributions
 chores+0.051
 same+0.050
 plan+0.047
 urge+0.046
 start+0.045
Neg logit contributions
os-0.094
o-0.092
enny-0.091
umping-0.089
ark-0.089
Token to
Token ID284
Feature activation+1.37e-01

Pos logit contributions
 chores+0.155
 same+0.147
 plan+0.141
 urge+0.141
 temptation+0.138
Neg logit contributions
os-0.239
o-0.236
enny-0.229
ne-0.223
ark-0.222
Token be
Token ID307
Feature activation+9.49e-02

Pos logit contributions
 chores+0.187
 same+0.174
 temptation+0.169
 urge+0.158
 plan+0.155
Neg logit contributions
o-0.410
os-0.408
ne-0.385
enny-0.385
ark-0.383
Token my
Token ID616
Feature activation+6.87e-02

Pos logit contributions
 chores+0.081
 same+0.078
 plan+0.066
 urge+0.064
 temptation+0.060
Neg logit contributions
os-0.338
enny-0.331
o-0.331
umping-0.323
ark-0.323
Token friend
Token ID1545
Feature activation-7.18e-02

Pos logit contributions
 chores+0.122
 same+0.117
 plan+0.109
 urge+0.108
 temptation+0.107
Neg logit contributions
os-0.182
o-0.178
enny-0.176
ark-0.171
umping-0.170
Token?
Token ID30
Feature activation-7.63e-02

Pos logit contributions
os+0.211
enny+0.208
oing+0.205
umping+0.203
ark+0.203
Neg logit contributions
 same-0.104
 chores-0.104
 bad-0.093
 going-0.093
 plan-0.092
Token I
Token ID314
Feature activation+1.64e-02

Pos logit contributions
os+0.207
umping+0.206
enny+0.205
ark+0.199
o+0.198
Neg logit contributions
 same-0.117
 chores-0.117
 start-0.107
 urge-0.106
 going-0.105
Token have
Token ID423
Feature activation+7.88e-02

Pos logit contributions
 chores+0.026
 same+0.026
 plan+0.024
 urge+0.024
 start+0.023
Neg logit contributions
os-0.046
enny-0.044
o-0.044
umping-0.043
ark-0.043
Token a
Token ID257
Feature activation+4.97e-02

Pos logit contributions
 chores+0.093
 urge+0.084
 plan+0.083
 temptation+0.083
 same+0.080
Neg logit contributions
o-0.248
os-0.248
ne-0.235
enny-0.234
umping-0.229
Token.
Token ID13
Feature activation-7.48e-02

Pos logit contributions
os+0.355
enny+0.350
o+0.345
umping+0.335
ark+0.335
Neg logit contributions
 chores-0.279
 same-0.270
 plan-0.253
 urge-0.249
 way-0.248
Token The
Token ID383
Feature activation+2.68e-02

Pos logit contributions
os+0.197
enny+0.189
o+0.188
umping+0.187
ark+0.187
Neg logit contributions
 chores-0.133
 same-0.131
 plan-0.119
 urge-0.118
 way-0.117
Token end
Token ID886
Feature activation-8.74e-03

Pos logit contributions
 chores+0.030
 same+0.029
 going+0.027
 plan+0.027
 temptation+0.026
Neg logit contributions
o-0.075
itty-0.074
os-0.073
enny-0.073
pillar-0.072
Token.
Token ID13
Feature activation-7.81e-02

Pos logit contributions
enny+0.028
pp+0.028
umping+0.028
ark+0.027
os+0.027
Neg logit contributions
 bad-0.011
 chores-0.010
 new-0.010
 big-0.010
 work-0.009
Token<|endoftext|>
Token ID50256
Feature activation-1.32e-01

Pos logit contributions
os+0.207
umping+0.202
ark+0.200
aron+0.197
ne+0.196
Neg logit contributions
 chores-0.130
 same-0.129
 going-0.119
 bad-0.118
 way-0.118
TokenOnce
Token ID7454
Feature activation-2.26e-02

Pos logit contributions
oing+0.392
ist+0.385
round+0.383
ark+0.379
enny+0.377
Neg logit contributions
 same-0.186
 other-0.171
 chores-0.169
 way-0.167
 the-0.164
Token upon
Token ID2402
Feature activation+4.70e-02

Pos logit contributions
os+0.054
o+0.052
enny+0.052
ark+0.050
umping+0.050
Neg logit contributions
 chores-0.046
 same-0.045
 plan-0.042
 urge-0.042
 temptation-0.041
Token a
Token ID257
Feature activation+2.53e-02

Pos logit contributions
 chores+0.036
 same+0.034
 plan+0.028
 urge+0.027
 going+0.026
Neg logit contributions
os-0.172
enny-0.168
o-0.168
ark-0.164
umping-0.164
Token time
Token ID640
Feature activation-1.62e-02

Pos logit contributions
 chores+0.014
 same+0.013
 plan+0.010
 urge+0.010
 way+0.009
Neg logit contributions
os-0.099
enny-0.096
o-0.095
ark-0.094
oing-0.094
Token,
Token ID11
Feature activation+1.70e-02

Pos logit contributions
os+0.048
enny+0.047
o+0.046
aron+0.046
umping+0.046
Neg logit contributions
 chores-0.022
 same-0.022
 plan-0.020
 urge-0.020
 going-0.019
Token there
Token ID612
Feature activation-2.34e-03

Pos logit contributions
 chores+0.025
 same+0.025
 plan+0.022
 urge+0.022
 temptation+0.022
Neg logit contributions
os-0.050
o-0.049
enny-0.048
ark-0.047
umping-0.047
Token,
Token ID11
Feature activation+1.38e-02

Pos logit contributions
 chores+0.032
 same+0.031
 temptation+0.029
 plan+0.027
 urge+0.027
Neg logit contributions
os-0.049
o-0.046
enny-0.046
ne-0.045
es-0.043
Token he
Token ID339
Feature activation-3.15e-02

Pos logit contributions
 chores+0.024
 same+0.023
 plan+0.022
 urge+0.021
 temptation+0.021
Neg logit contributions
os-0.037
o-0.036
enny-0.035
ne-0.034
umping-0.034
Token saw
Token ID2497
Feature activation+6.87e-02

Pos logit contributions
os+0.075
o+0.074
enny+0.073
umping+0.070
ne+0.070
Neg logit contributions
 chores-0.065
 same-0.062
 plan-0.058
 temptation-0.058
 urge-0.058
Token many
Token ID867
Feature activation+3.22e-02

Pos logit contributions
 chores+0.081
 same+0.071
 plan+0.071
 urge+0.070
 temptation+0.069
Neg logit contributions
os-0.217
o-0.215
enny-0.209
umping-0.205
ne-0.203
Token strange
Token ID6283
Feature activation+5.76e-04

Pos logit contributions
 chores+0.035
 same+0.034
 urge+0.030
 temptation+0.029
 plan+0.029
Neg logit contributions
os-0.104
enny-0.100
o-0.100
umping-0.099
ne-0.099
Token and
Token ID290
Feature activation+1.57e-02

Pos logit contributions
 chores+0.001
 same+0.001
 plan+0.001
 urge+0.001
 temptation+0.000
Neg logit contributions
os-0.002
o-0.002
enny-0.002
umping-0.002
ark-0.002
Token wonderful
Token ID7932
Feature activation+3.55e-03

Pos logit contributions
 chores+0.020
 same+0.017
 urge+0.016
 plan+0.016
 temptation+0.015
Neg logit contributions
os-0.047
o-0.047
ne-0.045
enny-0.045
oing-0.044
Token things
Token ID1243
Feature activation-9.12e-02

Pos logit contributions
 chores+0.004
 same+0.003
 plan+0.003
 urge+0.003
 temptation+0.003
Neg logit contributions
os-0.012
o-0.012
enny-0.011
umping-0.011
ne-0.011
Token.
Token ID13
Feature activation-9.29e-02

Pos logit contributions
os+0.221
o+0.214
enny+0.210
umping+0.204
ark+0.203
Neg logit contributions
 chores-0.182
 same-0.178
 plan-0.167
 urge-0.165
 temptation-0.163
Token But
Token ID887
Feature activation-2.39e-03

Pos logit contributions
os+0.233
o+0.226
enny+0.224
ark+0.219
umping+0.217
Neg logit contributions
 chores-0.178
 same-0.174
 plan-0.162
 urge-0.161
 temptation-0.158
Token he
Token ID339
Feature activation-2.18e-02

Pos logit contributions
o+0.006
os+0.006
enny+0.006
ne+0.005
ark+0.005
Neg logit contributions
 chores-0.004
 same-0.004
 temptation-0.004
 plan-0.004
 urge-0.004
Token each
Token ID1123
Feature activation+3.63e-02

Pos logit contributions
 same+0.158
 chores+0.158
 urge+0.153
 temptation+0.146
 plan+0.142
Neg logit contributions
os-0.410
o-0.397
umping-0.387
enny-0.384
oing-0.382
Token other
Token ID584
Feature activation+1.10e-02

Pos logit contributions
 chores+0.023
 same+0.020
 temptation+0.018
 plan+0.017
 going+0.017
Neg logit contributions
os-0.130
o-0.127
ne-0.123
enny-0.122
umping-0.122
Token laugh
Token ID6487
Feature activation-2.12e-02

Pos logit contributions
 chores+0.017
 same+0.016
 plan+0.015
 urge+0.015
 work+0.015
Neg logit contributions
enny-0.033
os-0.032
ark-0.032
o-0.032
enn-0.031
Token.
Token ID13
Feature activation-8.16e-02

Pos logit contributions
os+0.056
enny+0.056
o+0.056
itty+0.054
enn+0.053
Neg logit contributions
 same-0.038
 chores-0.038
 going-0.034
 plan-0.034
 start-0.034
Token One
Token ID1881
Feature activation+9.90e-03

Pos logit contributions
os+0.237
enny+0.235
umping+0.234
oing+0.232
ark+0.232
Neg logit contributions
 same-0.120
 chores-0.118
 way-0.108
 plan-0.105
 start-0.103
Token day
Token ID1110
Feature activation-4.04e-02

Pos logit contributions
 same+0.006
 chores+0.006
 plan+0.005
 urge+0.005
 way+0.005
Neg logit contributions
os-0.038
enny-0.037
ark-0.036
aron-0.036
umping-0.036
Token,
Token ID11
Feature activation+9.09e-03

Pos logit contributions
os+0.119
enny+0.116
o+0.116
ark+0.114
umping+0.113
Neg logit contributions
 chores-0.060
 same-0.057
 plan-0.052
 urge-0.052
 start-0.052
Token the
Token ID262
Feature activation+4.04e-02

Pos logit contributions
 chores+0.012
 same+0.011
 temptation+0.011
 plan+0.010
 urge+0.010
Neg logit contributions
o-0.028
os-0.028
enny-0.026
ne-0.026
oing-0.026
Token parents
Token ID3397
Feature activation-1.48e-01

Pos logit contributions
 chores+0.060
 temptation+0.056
 same+0.053
 plan+0.052
 urge+0.050
Neg logit contributions
o-0.105
os-0.102
enny-0.101
ne-0.100
eddy-0.097
Token wanted
Token ID2227
Feature activation+9.06e-02

Pos logit contributions
os+0.390
o+0.380
enny+0.377
umping+0.364
ark+0.363
Neg logit contributions
 chores-0.266
 same-0.258
 plan-0.239
 temptation-0.237
 urge-0.236
Token to
Token ID284
Feature activation+1.25e-01

Pos logit contributions
 chores+0.175
 urge+0.174
 temptation+0.172
 same+0.171
 plan+0.167
Neg logit contributions
os-0.217
o-0.215
enny-0.203
ne-0.198
umping-0.192
Token,
Token ID11
Feature activation+3.26e-02

Pos logit contributions
os+0.158
o+0.154
enny+0.153
ark+0.148
umping+0.148
Neg logit contributions
 chores-0.093
 same-0.090
 plan-0.083
 urge-0.082
 temptation-0.081
Token that
Token ID326
Feature activation+6.45e-02

Pos logit contributions
 chores+0.056
 same+0.051
 way+0.051
 urge+0.051
 plan+0.050
Neg logit contributions
enny-0.080
o-0.075
ne-0.073
os-0.073
azz-0.070
Token's
Token ID338
Feature activation+8.97e-02

Pos logit contributions
 chores+0.127
 same+0.123
 plan+0.115
 urge+0.115
 temptation+0.113
Neg logit contributions
os-0.159
o-0.155
enny-0.153
umping-0.148
ark-0.148
Token beautiful
Token ID4950
Feature activation-1.80e-03

Pos logit contributions
 chores+0.108
 urge+0.093
 temptation+0.091
 same+0.090
 plan+0.087
Neg logit contributions
o-0.278
os-0.267
aron-0.260
ne-0.258
ark-0.257
Token!"
Token ID2474
Feature activation-7.76e-02

Pos logit contributions
os+0.005
enny+0.005
o+0.004
ark+0.004
umping+0.004
Neg logit contributions
 same-0.003
 chores-0.003
 plan-0.003
 work-0.003
 start-0.003
Token Lily
Token ID20037
Feature activation-9.20e-02

Pos logit contributions
enny+0.198
o+0.193
os+0.192
ne+0.186
azz+0.180
Neg logit contributions
 chores-0.139
 same-0.130
 plan-0.126
 going-0.124
 urge-0.123
Token said
Token ID531
Feature activation+1.61e-02

Pos logit contributions
ark+0.281
os+0.274
enny+0.273
itty+0.268
oing+0.267
Neg logit contributions
 same-0.113
 start-0.110
 way-0.108
 plan-0.107
 work-0.106
Token.
Token ID13
Feature activation-7.28e-02

Pos logit contributions
 chores+0.032
 same+0.031
 plan+0.029
 urge+0.029
 start+0.028
Neg logit contributions
os-0.039
o-0.038
enny-0.038
ark-0.037
umping-0.037
Token She
Token ID1375
Feature activation-7.16e-03

Pos logit contributions
os+0.206
o+0.200
enny+0.198
ark+0.194
umping+0.193
Neg logit contributions
 chores-0.116
 same-0.114
 plan-0.104
 urge-0.103
 temptation-0.101
Token took
Token ID1718
Feature activation+7.74e-02

Pos logit contributions
os+0.019
ark+0.019
enny+0.019
o+0.019
oing+0.018
Neg logit contributions
 same-0.012
 start-0.011
 plan-0.011
 urge-0.011
 going-0.011
Token another
Token ID1194
Feature activation+3.00e-02

Pos logit contributions
 chores+0.089
 plan+0.082
 urge+0.078
 temptation+0.077
 same+0.075
Neg logit contributions
os-0.247
enny-0.242
o-0.239
umping-0.229
ne-0.227
Token mom
Token ID1995
Feature activation-1.07e-01

Pos logit contributions
 chores+0.036
 same+0.033
 temptation+0.033
 urge+0.032
 going+0.030
Neg logit contributions
os-0.085
o-0.084
enny-0.081
oing-0.080
ne-0.079
Token will
Token ID481
Feature activation+1.10e-01

Pos logit contributions
enny+0.308
os+0.301
oing+0.296
ark+0.295
umping+0.295
Neg logit contributions
 chores-0.164
 same-0.162
 plan-0.149
 start-0.146
 urge-0.144
Token be
Token ID307
Feature activation+9.17e-02

Pos logit contributions
 chores+0.157
 same+0.147
 temptation+0.141
 urge+0.133
 way+0.128
Neg logit contributions
o-0.295
os-0.295
ark-0.295
aron-0.292
ne-0.291
Token happy
Token ID3772
Feature activation+3.79e-02

Pos logit contributions
 chores+0.139
 temptation+0.124
 same+0.123
 urge+0.122
 plan+0.121
Neg logit contributions
o-0.249
aron-0.248
os-0.230
umping-0.230
as-0.224
Token too
Token ID1165
Feature activation+3.12e-02

Pos logit contributions
 chores+0.080
 same+0.077
 plan+0.073
 urge+0.073
 temptation+0.072
Neg logit contributions
os-0.088
o-0.086
enny-0.084
ark-0.081
umping-0.081
Token.
Token ID13
Feature activation-7.04e-02

Pos logit contributions
 chores+0.067
 same+0.066
 plan+0.062
 urge+0.062
 temptation+0.061
Neg logit contributions
os-0.070
o-0.068
enny-0.068
ark-0.065
umping-0.065
Token But
Token ID887
Feature activation+1.15e-02

Pos logit contributions
os+0.178
o+0.175
enny+0.174
ark+0.166
aron+0.166
Neg logit contributions
 chores-0.133
 same-0.126
 plan-0.122
 urge-0.118
 temptation-0.116
Token her
Token ID607
Feature activation+1.93e-02

Pos logit contributions
 chores+0.020
 same+0.019
 temptation+0.019
 plan+0.019
 urge+0.018
Neg logit contributions
o-0.029
os-0.028
enny-0.027
es-0.027
ne-0.026
Token mom
Token ID1995
Feature activation-1.15e-01

Pos logit contributions
 chores+0.024
 temptation+0.022
 same+0.021
 going+0.020
 urge+0.020
Neg logit contributions
o-0.056
os-0.056
enny-0.055
ne-0.054
aron-0.053
Token is
Token ID318
Feature activation+1.07e-01

Pos logit contributions
os+0.306
o+0.301
enny+0.293
ne+0.284
umping+0.284
Neg logit contributions
 chores-0.201
 same-0.196
 temptation-0.181
 urge-0.180
 plan-0.180
Token not
Token ID407
Feature activation+1.06e-01

Pos logit contributions
 chores+0.116
 same+0.107
 plan+0.101
 urge+0.098
 temptation+0.098
Neg logit contributions
o-0.344
aron-0.332
os-0.328
ne-0.327
enny-0.325
Token.
Token ID13
Feature activation-3.74e-02

Pos logit contributions
umping+0.230
os+0.226
aron+0.226
oing+0.226
round+0.222
Neg logit contributions
 same-0.129
 chores-0.129
 plan-0.122
 other-0.120
 her-0.118
Token All
Token ID1439
Feature activation-1.14e-02

Pos logit contributions
os+0.097
enny+0.095
umping+0.094
ark+0.094
o+0.094
Neg logit contributions
 chores-0.063
 same-0.063
 plan-0.058
 urge-0.057
 going-0.057
Token the
Token ID262
Feature activation+5.26e-02

Pos logit contributions
os+0.036
o+0.034
enny+0.033
ne+0.033
umping+0.032
Neg logit contributions
 chores-0.016
 same-0.014
 plan-0.013
 urge-0.013
 temptation-0.013
Token fish
Token ID5916
Feature activation-1.83e-01

Pos logit contributions
 chores+0.067
 same+0.063
 urge+0.059
 plan+0.058
 start+0.057
Neg logit contributions
os-0.159
o-0.157
enny-0.153
ne-0.153
ark-0.149
Token liked
Token ID8288
Feature activation+1.01e-01

Pos logit contributions
os+0.446
enny+0.439
o+0.434
ark+0.423
umping+0.421
Neg logit contributions
 chores-0.351
 same-0.346
 plan-0.325
 urge-0.322
 start-0.314
Token Sam
Token ID3409
Feature activation-9.44e-02

Pos logit contributions
 chores+0.151
 same+0.142
 urge+0.134
 plan+0.133
 temptation+0.130
Neg logit contributions
os-0.297
o-0.293
enny-0.277
ark-0.274
umping-0.274
Token and
Token ID290
Feature activation+2.88e-02

Pos logit contributions
enny+0.242
os+0.236
ark+0.233
oing+0.226
o+0.225
Neg logit contributions
 chores-0.173
 same-0.171
 plan-0.159
 urge-0.157
 going-0.157
Token wanted
Token ID2227
Feature activation+1.01e-01

Pos logit contributions
 chores+0.046
 same+0.043
 plan+0.040
 urge+0.040
 temptation+0.040
Neg logit contributions
os-0.082
o-0.081
enny-0.078
umping-0.075
ne-0.075
Token to
Token ID284
Feature activation+1.32e-01

Pos logit contributions
 chores+0.185
 same+0.176
 urge+0.173
 plan+0.170
 temptation+0.168
Neg logit contributions
os-0.259
o-0.257
enny-0.242
ne-0.235
umping-0.231
Token play
Token ID711
Feature activation+4.01e-02

Pos logit contributions
 chores+0.173
 same+0.164
 temptation+0.149
 going+0.142
 urge+0.139
Neg logit contributions
os-0.392
o-0.388
ne-0.373
enny-0.368
ark-0.366
Token with
Token ID351
Feature activation+1.15e-01

Pos logit contributions
 chores+0.075
 same+0.073
 plan+0.069
 urge+0.067
 start+0.067
Neg logit contributions
os-0.103
enny-0.100
o-0.099
umping-0.097
oing-0.096
Tokenace
Token ID558
Feature activation-1.32e-01

Pos logit contributions
os+0.272
o+0.269
oing+0.258
round+0.250
umping+0.245
Neg logit contributions
 same-0.239
 chores-0.232
 plan-0.223
 urge-0.222
 idea-0.215
Token tried
Token ID3088
Feature activation+8.97e-02

Pos logit contributions
os+0.358
enny+0.354
oing+0.350
ark+0.349
o+0.347
Neg logit contributions
 chores-0.210
 same-0.208
 plan-0.202
 urge-0.195
 work-0.193
Token to
Token ID284
Feature activation+1.25e-01

Pos logit contributions
 chores+0.150
 same+0.149
 urge+0.146
 temptation+0.143
 plan+0.139
Neg logit contributions
os-0.236
o-0.233
enny-0.227
umping-0.217
ark-0.217
Token do
Token ID466
Feature activation+1.05e-01

Pos logit contributions
 chores+0.167
 same+0.164
 temptation+0.149
 bad+0.147
 way+0.141
Neg logit contributions
os-0.344
o-0.340
ne-0.336
ark-0.327
aron-0.327
Token the
Token ID262
Feature activation+5.95e-02

Pos logit contributions
 same+0.134
 urge+0.132
 plan+0.119
 chores+0.118
 temptation+0.118
Neg logit contributions
os-0.294
es-0.279
enny-0.279
o-0.276
ne-0.274
Token exercises
Token ID13565
Feature activation-7.12e-02

Pos logit contributions
 sorry+0.070
 chores+0.069
 storm+0.065
 rain+0.064
 temptation+0.064
Neg logit contributions
os-0.149
o-0.149
umping-0.146
ark-0.145
ne-0.145
Token but
Token ID475
Feature activation+9.62e-03

Pos logit contributions
os+0.185
o+0.179
enny+0.177
ne+0.171
ark+0.171
Neg logit contributions
 chores-0.127
 same-0.124
 plan-0.116
 urge-0.115
 temptation-0.114
Token she
Token ID673
Feature activation-2.21e-02

Pos logit contributions
 same+0.013
 chores+0.013
 temptation+0.013
 urge+0.012
 plan+0.012
Neg logit contributions
os-0.028
enny-0.027
o-0.027
ne-0.025
itty-0.025
Token didn
Token ID1422
Feature activation+7.56e-02

Pos logit contributions
os+0.061
oing+0.061
ark+0.060
aron+0.059
round+0.059
Neg logit contributions
 chores-0.034
 same-0.033
 plan-0.033
 urge-0.032
 way-0.031
Token't
Token ID470
Feature activation+1.04e-01

Pos logit contributions
 same+0.111
 chores+0.111
 plan+0.105
 urge+0.103
 way+0.100
Neg logit contributions
os-0.215
enny-0.210
o-0.209
ark-0.203
ne-0.201
Token understand
Token ID1833
Feature activation+5.85e-02

Pos logit contributions
 chores+0.147
 same+0.141
 plan+0.132
 urge+0.130
 start+0.127
Neg logit contributions
os-0.318
o-0.310
enny-0.308
umping-0.303
ark-0.301
Token it
Token ID340
Feature activation-4.45e-02

Pos logit contributions
 chores+0.102
 same+0.094
 plan+0.090
 urge+0.090
 temptation+0.086
Neg logit contributions
os-0.223
o-0.217
enny-0.212
ne-0.209
aron-0.207
Token.
Token ID13
Feature activation-8.54e-02

Pos logit contributions
os+0.103
o+0.100
enny+0.099
umping+0.095
ark+0.095
Neg logit contributions
 chores-0.093
 same-0.090
 plan-0.085
 urge-0.085
 temptation-0.084
Token<|endoftext|>
Token ID50256
Feature activation-1.45e-01

Pos logit contributions
os+0.251
ark+0.246
o+0.244
oing+0.243
umping+0.239
Neg logit contributions
 same-0.120
 chores-0.117
 way-0.107
 temptation-0.107
 urge-0.106
TokenThe
Token ID464
Feature activation+2.46e-02

Pos logit contributions
oing+0.413
ist+0.397
ark+0.395
round+0.395
enny+0.387
Neg logit contributions
 same-0.220
 way-0.201
 chores-0.195
 going-0.195
 other-0.194
Token little
Token ID1310
Feature activation-3.75e-02

Pos logit contributions
 chores+0.029
 same+0.027
 plan+0.025
 urge+0.024
 temptation+0.024
Neg logit contributions
os-0.078
o-0.077
enny-0.076
ne-0.075
umping-0.074
Token girl
Token ID2576
Feature activation-1.42e-01

Pos logit contributions
os+0.106
enny+0.103
o+0.103
ark+0.101
oing+0.100
Neg logit contributions
 same-0.059
 chores-0.058
 plan-0.053
 urge-0.053
 way-0.051
Token was
Token ID373
Feature activation+9.99e-02

Pos logit contributions
os+0.347
o+0.339
enny+0.332
umping+0.322
ne+0.319
Neg logit contributions
 chores-0.281
 same-0.270
 temptation-0.255
 plan-0.254
 urge-0.251
Token very
Token ID845
Feature activation+4.25e-02

Pos logit contributions
 chores+0.155
 temptation+0.141
 plan+0.135
 same+0.131
 start+0.129
Neg logit contributions
o-0.293
os-0.274
umping-0.273
aron-0.263
itty-0.259
Token excited
Token ID6568
Feature activation+2.38e-02

Pos logit contributions
 chores+0.070
 same+0.062
 way+0.062
 temptation+0.061
 start+0.059
Neg logit contributions
o-0.116
os-0.113
enny-0.107
ang-0.103
oing-0.102
Token.
Token ID13
Feature activation-7.64e-02

Pos logit contributions
 chores+0.050
 same+0.048
 plan+0.046
 urge+0.045
 temptation+0.045
Neg logit contributions
os-0.055
o-0.054
enny-0.053
umping-0.051
ark-0.051
Token She
Token ID1375
Feature activation-1.98e-02

Pos logit contributions
os+0.203
o+0.198
enny+0.198
umping+0.191
ark+0.191
Neg logit contributions
 chores-0.134
 same-0.129
 plan-0.121
 urge-0.119
 temptation-0.117
Token someone
Token ID2130
Feature activation-8.58e-02

Pos logit contributions
 chores+0.130
 urge+0.125
 plan+0.122
 same+0.116
 temptation+0.107
Neg logit contributions
os-0.284
o-0.281
ne-0.270
enny-0.268
umping-0.263
Token was
Token ID373
Feature activation+1.09e-01

Pos logit contributions
os+0.207
enny+0.203
o+0.200
ark+0.196
umping+0.195
Neg logit contributions
 chores-0.169
 same-0.166
 plan-0.156
 urge-0.155
 going-0.153
Token talking
Token ID3375
Feature activation+9.22e-03

Pos logit contributions
 chores+0.172
 same+0.155
 temptation+0.147
 plan+0.146
 urge+0.143
Neg logit contributions
o-0.309
os-0.307
aron-0.294
umping-0.291
ne-0.291
Token inside
Token ID2641
Feature activation-2.25e-03

Pos logit contributions
 chores+0.017
 same+0.017
 plan+0.016
 going+0.015
 urge+0.015
Neg logit contributions
os-0.023
enny-0.023
o-0.023
umping-0.022
oing-0.022
Token the
Token ID262
Feature activation+4.70e-02

Pos logit contributions
os+0.005
o+0.005
enny+0.005
umping+0.005
ark+0.005
Neg logit contributions
 chores-0.005
 same-0.005
 urge-0.004
 plan-0.004
 temptation-0.004
Token shell
Token ID7582
Feature activation-1.74e-01

Pos logit contributions
 chores+0.068
 same+0.060
 plan+0.058
 urge+0.058
 temptation+0.057
Neg logit contributions
os-0.132
o-0.130
ne-0.129
enny-0.127
aron-0.126
Token.
Token ID13
Feature activation-7.44e-02

Pos logit contributions
os+0.409
o+0.398
enny+0.395
ark+0.382
umping+0.381
Neg logit contributions
 chores-0.356
 same-0.348
 plan-0.327
 urge-0.324
 temptation-0.319
Token\n
Token ID198
Feature activation-7.94e-02

Pos logit contributions
os+0.205
o+0.199
ark+0.197
oing+0.196
umping+0.196
Neg logit contributions
 same-0.117
 chores-0.117
 temptation-0.104
 plan-0.104
 urge-0.103
Token\n
Token ID198
Feature activation-7.81e-02

Pos logit contributions
os+0.209
umping+0.205
oing+0.202
round+0.198
o+0.197
Neg logit contributions
 chores-0.130
 same-0.124
 plan-0.119
 bad-0.117
 other-0.117
Token"
Token ID1
Feature activation-4.40e-02

Pos logit contributions
os+0.186
umping+0.174
o+0.173
enny+0.168
aron+0.167
Neg logit contributions
 chores-0.155
 same-0.148
 temptation-0.144
 plan-0.142
 bad-0.141
TokenHello
Token ID15496
Feature activation+7.63e-04

Pos logit contributions
os+0.119
o+0.116
enny+0.115
umping+0.111
oing+0.110
Neg logit contributions
 same-0.077
 chores-0.074
 rain-0.069
 plan-0.069
 start-0.068
TokenM
Token ID44
Feature activation-6.57e-02

Pos logit contributions
oing+0.411
ist+0.404
ark+0.403
round+0.402
enny+0.389
Neg logit contributions
 same-0.201
 way-0.176
 other-0.175
 chores-0.172
 going-0.172
Tokenia
Token ID544
Feature activation-1.24e-01

Pos logit contributions
os+0.137
o+0.132
enny+0.130
ark+0.127
umping+0.127
Neg logit contributions
 chores-0.155
 same-0.150
 plan-0.144
 urge-0.142
 temptation-0.139
Token saw
Token ID2497
Feature activation+7.18e-02

Pos logit contributions
os+0.314
enny+0.310
oing+0.307
o+0.306
ark+0.306
Neg logit contributions
 same-0.221
 chores-0.220
 plan-0.206
 urge-0.206
 work-0.201
Token a
Token ID257
Feature activation+3.22e-02

Pos logit contributions
 chores+0.082
 plan+0.075
 temptation+0.074
 urge+0.074
 same+0.072
Neg logit contributions
os-0.229
o-0.227
enny-0.221
umping-0.216
ne-0.213
Token blue
Token ID4171
Feature activation-4.72e-02

Pos logit contributions
 chores+0.041
 same+0.038
 urge+0.034
 plan+0.034
 temptation+0.034
Neg logit contributions
os-0.100
o-0.099
enny-0.098
ark-0.095
oing-0.095
Token bird
Token ID6512
Feature activation-1.66e-01

Pos logit contributions
os+0.125
enny+0.120
o+0.120
ark+0.116
umping+0.116
Neg logit contributions
 chores-0.084
 same-0.083
 plan-0.077
 urge-0.076
 start-0.075
Token on
Token ID319
Feature activation+8.54e-02

Pos logit contributions
os+0.424
enny+0.419
o+0.408
ark+0.401
umping+0.397
Neg logit contributions
 chores-0.303
 same-0.299
 urge-0.279
 plan-0.279
 going-0.276
Token her
Token ID607
Feature activation+2.67e-02

Pos logit contributions
 chores+0.098
 plan+0.090
 temptation+0.084
 urge+0.083
 same+0.081
Neg logit contributions
os-0.260
o-0.260
ne-0.257
aron-0.248
umping-0.248
Token shoulder
Token ID8163
Feature activation-8.53e-02

Pos logit contributions
 chores+0.043
 same+0.040
 temptation+0.038
 plan+0.037
 bad+0.036
Neg logit contributions
o-0.072
os-0.072
enny-0.069
ne-0.069
oing-0.069
Token.
Token ID13
Feature activation-8.08e-02

Pos logit contributions
os+0.218
enny+0.218
o+0.209
ark+0.207
umping+0.203
Neg logit contributions
 chores-0.156
 same-0.155
 start-0.145
 urge-0.143
 going-0.143
Token She
Token ID1375
Feature activation-2.43e-02

Pos logit contributions
os+0.219
enny+0.216
o+0.215
ark+0.213
oing+0.210
Neg logit contributions
 same-0.132
 chores-0.129
 urge-0.121
 plan-0.120
 start-0.118
Token\n
Token ID198
Feature activation-8.94e-02

Pos logit contributions
os+0.239
umping+0.231
oing+0.225
round+0.222
o+0.222
Neg logit contributions
 chores-0.143
 same-0.141
 plan-0.133
 other-0.133
 her-0.128
TokenNow
Token ID3844
Feature activation-4.96e-03

Pos logit contributions
os+0.223
o+0.212
umping+0.206
enny+0.206
oing+0.202
Neg logit contributions
 chores-0.173
 same-0.167
 plan-0.160
 temptation-0.154
 start-0.154
Token,
Token ID11
Feature activation+1.44e-02

Pos logit contributions
os+0.014
enny+0.014
o+0.014
ne+0.014
ark+0.013
Neg logit contributions
 chores-0.007
 same-0.007
 plan-0.006
 temptation-0.006
 urge-0.006
Token the
Token ID262
Feature activation+3.91e-02

Pos logit contributions
 chores+0.019
 same+0.017
 plan+0.016
 urge+0.016
 temptation+0.015
Neg logit contributions
os-0.045
o-0.044
enny-0.043
ne-0.042
ark-0.042
Token rad
Token ID2511
Feature activation-1.37e-01

Pos logit contributions
 chores+0.062
 going+0.050
 same+0.050
 start+0.049
 temptation+0.049
Neg logit contributions
aron-0.099
o-0.099
enny-0.098
ne-0.098
pillar-0.095
Tokenish
Token ID680
Feature activation-1.49e-01

Pos logit contributions
os+0.215
enny+0.207
ark+0.199
o+0.198
umping+0.195
Neg logit contributions
 same-0.388
 chores-0.383
 plan-0.367
 urge-0.366
 temptation-0.359
Token struggled
Token ID11615
Feature activation+5.34e-02

Pos logit contributions
os+0.365
o+0.355
enny+0.347
umping+0.336
ne+0.335
Neg logit contributions
 chores-0.295
 same-0.285
 temptation-0.265
 plan-0.262
 urge-0.261
Token a
Token ID257
Feature activation+5.73e-02

Pos logit contributions
 chores+0.101
 same+0.095
 urge+0.092
 plan+0.091
 temptation+0.091
Neg logit contributions
os-0.132
o-0.129
enny-0.128
ark-0.123
umping-0.122
Token lot
Token ID1256
Feature activation+1.21e-03

Pos logit contributions
 chores+0.074
 same+0.069
 going+0.061
 urge+0.061
 temptation+0.060
Neg logit contributions
enny-0.172
os-0.170
o-0.169
ne-0.167
oing-0.166
Token less
Token ID1342
Feature activation+6.72e-02

Pos logit contributions
 chores+0.002
 same+0.002
 plan+0.002
 way+0.002
 work+0.002
Neg logit contributions
os-0.003
o-0.003
enny-0.003
oing-0.003
umping-0.003
Token and
Token ID290
Feature activation+5.08e-02

Pos logit contributions
 chores+0.105
 same+0.099
 plan+0.089
 temptation+0.088
 start+0.088
Neg logit contributions
os-0.181
aron-0.178
umping-0.177
enny-0.176
ne-0.176
Token Sometimes
Token ID8975
Feature activation+5.22e-02

Pos logit contributions
os+0.186
enny+0.180
oing+0.178
ark+0.177
umping+0.177
Neg logit contributions
 same-0.146
 chores-0.146
 plan-0.133
 way-0.132
 start-0.131
Token they
Token ID484
Feature activation+1.42e-02

Pos logit contributions
 chores+0.089
 same+0.088
 temptation+0.083
 urge+0.083
 plan+0.080
Neg logit contributions
os-0.139
o-0.134
ne-0.128
enny-0.125
umping-0.124
Token saw
Token ID2497
Feature activation+7.61e-02

Pos logit contributions
 chores+0.027
 same+0.027
 temptation+0.025
 plan+0.025
 urge+0.025
Neg logit contributions
os-0.036
o-0.035
enny-0.034
ark-0.033
ne-0.033
Token a
Token ID257
Feature activation+3.69e-02

Pos logit contributions
 chores+0.106
 urge+0.098
 temptation+0.096
 plan+0.092
 same+0.088
Neg logit contributions
os-0.227
o-0.223
umping-0.206
ne-0.203
enny-0.203
Token big
Token ID1263
Feature activation+8.34e-03

Pos logit contributions
 chores+0.043
 same+0.038
 temptation+0.035
 urge+0.035
 going+0.033
Neg logit contributions
o-0.117
os-0.115
enny-0.114
oing-0.113
ark-0.112
Token turtle
Token ID28699
Feature activation-1.62e-01

Pos logit contributions
 chores+0.016
 same+0.014
 temptation+0.014
 urge+0.013
 going+0.013
Neg logit contributions
o-0.020
os-0.019
ne-0.019
oing-0.018
umping-0.018
Token with
Token ID351
Feature activation+9.15e-02

Pos logit contributions
os+0.393
o+0.383
enny+0.378
ark+0.366
umping+0.366
Neg logit contributions
 chores-0.320
 same-0.312
 plan-0.292
 urge-0.290
 temptation-0.286
Token a
Token ID257
Feature activation+3.31e-02

Pos logit contributions
 chores+0.129
 plan+0.113
 urge+0.107
 temptation+0.106
 work+0.105
Neg logit contributions
o-0.263
os-0.262
umping-0.247
oing-0.238
enny-0.238
Token hard
Token ID1327
Feature activation+1.23e-02

Pos logit contributions
 chores+0.044
 same+0.039
 temptation+0.036
 urge+0.035
 going+0.035
Neg logit contributions
o-0.101
os-0.100
enny-0.098
oing-0.097
ark-0.096
Token shell
Token ID7582
Feature activation-1.62e-01

Pos logit contributions
 chores+0.022
 same+0.019
 temptation+0.018
 going+0.017
 plan+0.016
Neg logit contributions
o-0.031
oing-0.030
ark-0.030
os-0.030
umping-0.030
Token.
Token ID13
Feature activation-7.79e-02

Pos logit contributions
os+0.370
o+0.361
enny+0.352
umping+0.344
ne+0.342
Neg logit contributions
 chores-0.343
 same-0.333
 plan-0.315
 temptation-0.311
 urge-0.311