TOP ACTIVATIONS
MAX = 80.000

't know the rh in oc eros was wild . Can
reward . It was ch ocol ates and sweets !
am taking the rh in oc eros to a new home
he would take the av oc ados and scatter them around
children put on their bath ro bes and ran around in
giving him too many ch ocol ates ." The king learned
as he opened the ch ocol ates and started eating them
give to the rh in oc eros . Lily walked to
quiet compared to the noisy se ag ull s outside .
harder to push the wheel bar row . But her backyard
his head out of his bur row and saw something new
it to the rh in oc eros . When he tasted
Everyday he would take the av oc ados and scatter them
you go , my d arl ings . Enjoy your lunch
that the bad rh in oc eros was now nice .
there was a rh in oc eros who was always naughty
When Daisy put the magn ifying glass down , the cater
oll ip ops , ch ocol ates , g ummies ,
's wonderful , my d arl ings !" Mom my says
her mom my watched in amaz ement as the machine flew

MIN ACTIVATIONS
MIN = -25.043

ily and Tom are friends . They like to play with
She cried and chased Tom . She wanted her doll back
. Can I see it ?" L ily nods
her mom went back home . L ily
It has candles on top . Anna and Ben like chocolate
He liked to destroy things . He saw L ila and
the toy car all day . He drove it around the
," Lily begged . Tom shook his head .
," Tom says . He opens the door .
ila and Tom are friends . They like to play with
Ben and Lily are twins . They share a room and
hear their moms calling them . They run back to their
I see it ?" L ily nods . She
the doll . Ben smiles . He holds the doll gently
the mechanic who helped him . <|endoftext|> Once upon a time
. They lived happily together . <|endoftext|> One day , a
elephant again and kissed it . She said thank you and
train to Lily and Sam . He says they can play
, jumped , and laughed . When the sun went down
Tom and Sue are twins . They like to play in

INTERVAL 53.739 - 80.000
CONTAINS 0.005%

's wonderful , my d arl ings !" Mom my says
children put on their bath ro bes and ran around in
he would take the av oc ados and scatter them around
who lived in his cozy bur row . He loved to
it to the rh in oc eros . When he tasted

INTERVAL 27.479 - 53.739
CONTAINS 1.794%

Her best friend watched in amaz ement as Amelia gigg led
trees and it was very loud . The rh in oc
writing my name on this paper ." Her
said , holding up a long , brown thing . "
and jumped around in the spot of sunshine . Suddenly ,

INTERVAL 1.218 - 27.479
CONTAINS 40.121%

a big ball in the yard . He wanted to play
and went to the living room to test it out .
atmeal to the mug . The mug took a bite and
. Lily didn 't want to do it at first because
Sure , but this one needs to be regular .

INTERVAL -25.043 - 1.218
CONTAINS 58.080%

It can fall apart easily ." she said .
, she was grateful that she had got to watch the
mouth . He then gave the toast back to Max and
closet for days ! John y felt sorry for
to the car and hopped into her seat . Mam ma
Token't
Token ID470
Feature activation+2.86e+01

Pos logit contributions
Um+0.161
Kids+0.160
Sorry+0.160
Hmm+0.157
Don+0.156
Neg logit contributions
 up-0.096
ish-0.093
lers-0.084
ans-0.082
ishing-0.081
Token know
Token ID760
Feature activation+2.79e+01

Pos logit contributions
ish+0.255
 up+0.251
lers+0.246
ishes+0.241
igs+0.239
Neg logit contributions
Um-0.157
Sorry-0.152
Kids-0.151
Don-0.150
By-0.149
Token the
Token ID262
Feature activation+2.73e+01

Pos logit contributions
ish+0.190
an+0.172
ishing+0.170
ans+0.167
 up+0.165
Neg logit contributions
Um-0.209
Exc-0.206
Kids-0.204
Don-0.201
Listen-0.193
Token rh
Token ID9529
Feature activation+5.44e+01

Pos logit contributions
ish+0.171
 up+0.171
ishing+0.170
cks+0.166
igs+0.164
Neg logit contributions
Uh-0.214
Hmm-0.213
Um-0.212
Don-0.206
Wait-0.200
Tokenin
Token ID259
Feature activation+4.08e+01

Pos logit contributions
 up+0.426
igs+0.333
ans+0.329
ish+0.329
fl+0.319
Neg logit contributions
Sorry-0.395
Uh-0.385
Um-0.378
By-0.375
Well-0.371
Tokenoc
Token ID420
Feature activation+6.47e+01

Pos logit contributions
 up+0.303
igs+0.285
ish+0.272
ishing+0.268
ans+0.263
Neg logit contributions
Um-0.300
Hmm-0.295
Well-0.290
Don-0.283
Uh-0.283
Tokeneros
Token ID27498
Feature activation+3.86e+01

Pos logit contributions
 up+0.371
igs+0.311
 out+0.296
ies+0.296
 flying+0.294
Neg logit contributions
Sorry-0.557
Um-0.546
Well-0.526
Hmm-0.522
Okay-0.511
Token was
Token ID373
Feature activation+2.56e+01

Pos logit contributions
 up+0.294
ish+0.284
igs+0.272
ishing+0.262
lers+0.261
Neg logit contributions
Um-0.264
Kids-0.258
Don-0.255
Sorry-0.255
Hmm-0.251
Token wild
Token ID4295
Feature activation+3.05e+01

Pos logit contributions
ish+0.197
ishes+0.178
ans+0.176
an+0.170
igs+0.169
Neg logit contributions
Don-0.181
Um-0.181
Uh-0.174
By-0.168
Listen-0.165
Token.
Token ID13
Feature activation+1.02e+01

Pos logit contributions
igs+0.187
cks+0.172
ans+0.166
ates+0.166
ish+0.163
Neg logit contributions
Don-0.225
Hi-0.223
Kids-0.218
Exc-0.217
Well-0.210
Token Can
Token ID1680
Feature activation+1.58e+01

Pos logit contributions
 up+0.085
ish+0.075
 matches+0.072
lers+0.071
-+0.070
Neg logit contributions
Sorry-0.065
Uh-0.064
Kids-0.064
Hmm-0.064
Um-0.063
Token reward
Token ID6721
Feature activation+3.90e+01

Pos logit contributions
 up+0.270
cks+0.268
ans+0.264
igs+0.264
ishing+0.243
Neg logit contributions
Uh-0.392
Hi-0.366
Exc-0.364
Um-0.361
Hmm-0.357
Token.
Token ID13
Feature activation+1.35e+01

Pos logit contributions
ish+0.194
 up+0.186
ric+0.174
ans+0.168
igs+0.168
Neg logit contributions
Kids-0.349
Um-0.346
Don-0.344
Well-0.343
Sorry-0.342
Token It
Token ID632
Feature activation+2.24e+01

Pos logit contributions
 up+0.120
ish+0.110
lers+0.103
 matches+0.100
-+0.100
Neg logit contributions
Kids-0.082
Uh-0.081
Sorry-0.081
Hmm-0.081
Um-0.081
Token was
Token ID373
Feature activation+2.05e+01

Pos logit contributions
 up+0.169
ish+0.162
lers+0.150
ishing+0.147
ans+0.147
Neg logit contributions
Um-0.161
Hmm-0.157
Uh-0.155
Kids-0.155
Sorry-0.154
Token ch
Token ID442
Feature activation+3.28e+01

Pos logit contributions
ish+0.130
lers+0.120
 up+0.114
ishes+0.112
ans+0.112
Neg logit contributions
Um-0.170
Don-0.167
Uh-0.166
Hmm-0.164
By-0.161
Tokenocol
Token ID4668
Feature activation+6.39e+01

Pos logit contributions
 up+0.208
ish+0.175
 matches+0.167
lers+0.163
ans+0.160
Neg logit contributions
Uh-0.281
Um-0.277
Hmm-0.277
Sorry-0.273
Exc-0.272
Tokenates
Token ID689
Feature activation+3.65e+01

Pos logit contributions
 up+0.344
 matches+0.254
 where+0.245
ish+0.240
 new+0.230
Neg logit contributions
Um-0.596
Sorry-0.584
By-0.579
Hmm-0.572
Dad-0.571
Token and
Token ID290
Feature activation+1.84e+01

Pos logit contributions
igs+0.186
ish+0.186
ans+0.184
 up+0.181
ishes+0.181
Neg logit contributions
Don-0.283
Sorry-0.283
By-0.283
Hmm-0.281
Well-0.276
Token sweets
Token ID42402
Feature activation+3.50e+01

Pos logit contributions
ish+0.130
 up+0.121
ans+0.119
an+0.116
ishing+0.114
Neg logit contributions
Um-0.139
Don-0.138
Sorry-0.135
Well-0.134
Hmm-0.134
Token!
Token ID0
Feature activation+9.15e+00

Pos logit contributions
ish+0.183
 up+0.174
ans+0.173
igs+0.168
ishes+0.164
Neg logit contributions
Don-0.301
Hmm-0.299
Sorry-0.299
Well-0.298
Exc-0.295
Token\n
Token ID198
Feature activation+8.57e+00

Pos logit contributions
 up+0.079
ish+0.073
lers+0.068
 matches+0.066
-+0.066
Neg logit contributions
Kids-0.057
Sorry-0.057
Um-0.057
Uh-0.057
Hmm-0.057
Token am
Token ID716
Feature activation+2.24e+01

Pos logit contributions
 up+0.184
ish+0.171
lers+0.160
-+0.154
 matches+0.153
Neg logit contributions
Kids-0.157
Um-0.157
Hmm-0.156
Sorry-0.155
Uh-0.155
Token taking
Token ID2263
Feature activation+3.01e+01

Pos logit contributions
ish+0.171
lers+0.159
ishing+0.157
an+0.157
ishes+0.156
Neg logit contributions
Um-0.159
Kids-0.145
Sorry-0.143
Don-0.142
By-0.141
Token the
Token ID262
Feature activation+2.87e+01

Pos logit contributions
ish+0.193
ric+0.175
 up+0.172
ishing+0.166
an+0.165
Neg logit contributions
Um-0.236
Kids-0.235
Hmm-0.232
Don-0.231
Sorry-0.231
Token rh
Token ID9529
Feature activation+5.09e+01

Pos logit contributions
cks+0.169
 up+0.168
ish+0.165
ishing+0.161
ishes+0.158
Neg logit contributions
Hmm-0.219
Uh-0.217
Kids-0.217
Um-0.216
Exc-0.212
Tokenin
Token ID259
Feature activation+4.18e+01

Pos logit contributions
 up+0.406
 where+0.315
fl+0.313
igs+0.312
ish+0.310
Neg logit contributions
Sorry-0.355
By-0.343
Uh-0.341
Um-0.324
Dad-0.321
Tokenoc
Token ID420
Feature activation+6.33e+01

Pos logit contributions
 up+0.336
igs+0.334
ishing+0.316
ishes+0.310
cks+0.308
Neg logit contributions
By-0.239
Um-0.239
Sorry-0.238
Hmm-0.233
Exc-0.232
Tokeneros
Token ID27498
Feature activation+3.94e+01

Pos logit contributions
 up+0.384
 out+0.312
 where+0.307
igs+0.305
 matches+0.301
Neg logit contributions
Sorry-0.521
Um-0.499
Well-0.479
Okay-0.471
Hmm-0.467
Token to
Token ID284
Feature activation+1.88e+01

Pos logit contributions
igs+0.194
ish+0.191
 up+0.187
ishing+0.180
ric+0.175
Neg logit contributions
Sorry-0.327
Um-0.324
By-0.320
Don-0.319
Exc-0.314
Token a
Token ID257
Feature activation+2.53e+01

Pos logit contributions
ish+0.121
 up+0.117
lers+0.112
ishing+0.107
ishes+0.105
Neg logit contributions
Um-0.156
Hmm-0.153
Kids-0.150
By-0.150
Uh-0.149
Token new
Token ID649
Feature activation+3.02e+01

Pos logit contributions
 up+0.160
ish+0.154
lers+0.144
ishes+0.143
ishing+0.138
Neg logit contributions
Uh-0.210
Um-0.208
Hmm-0.207
Exc-0.206
Kids-0.201
Token home
Token ID1363
Feature activation+2.88e+01

Pos logit contributions
 up+0.183
cks+0.181
ish+0.181
ishes+0.180
ries+0.179
Neg logit contributions
Exc-0.223
Um-0.223
Dad-0.219
Okay-0.217
By-0.217
Token he
Token ID339
Feature activation+2.72e+01

Pos logit contributions
ish+0.286
 up+0.285
ans+0.258
ishing+0.257
ishes+0.256
Neg logit contributions
Um-0.230
Kids-0.229
Sorry-0.221
Hmm-0.220
Well-0.220
Token would
Token ID561
Feature activation+2.32e+01

Pos logit contributions
 up+0.243
ish+0.233
lers+0.217
ishing+0.215
igs+0.214
Neg logit contributions
Um-0.158
Kids-0.151
Hmm-0.150
Sorry-0.149
Don-0.149
Token take
Token ID1011
Feature activation+2.77e+01

Pos logit contributions
 up+0.186
ish+0.186
lers+0.176
ishes+0.175
ishing+0.175
Neg logit contributions
Um-0.141
Exc-0.136
Hmm-0.135
By-0.135
Don-0.134
Token the
Token ID262
Feature activation+2.77e+01

Pos logit contributions
ish+0.191
ric+0.177
an+0.171
 up+0.166
ans+0.166
Neg logit contributions
Kids-0.201
Hmm-0.201
Exc-0.199
Um-0.199
By-0.198
Token av
Token ID1196
Feature activation+5.95e+01

Pos logit contributions
 up+0.160
cks+0.153
ish+0.152
igs+0.144
ishing+0.144
Neg logit contributions
Hmm-0.229
Um-0.223
Uh-0.221
Exc-0.219
Don-0.216
Tokenoc
Token ID420
Feature activation+6.33e+01

Pos logit contributions
 up+0.448
 matches+0.394
ric+0.371
 where+0.366
ies+0.355
Neg logit contributions
Hmm-0.428
Exc-0.425
Uh-0.407
Um-0.401
By-0.394
Tokenados
Token ID22484
Feature activation+3.91e+01

Pos logit contributions
 up+0.395
 flying+0.317
 matches+0.313
 where+0.313
 hiding+0.304
Neg logit contributions
Um-0.505
Sorry-0.503
Hmm-0.496
Well-0.486
Exc-0.484
Token and
Token ID290
Feature activation+1.83e+01

Pos logit contributions
ish+0.198
 up+0.197
lers+0.187
igs+0.185
ric+0.185
Neg logit contributions
Hmm-0.329
Don-0.329
Exc-0.327
Well-0.325
Um-0.323
Token scatter
Token ID41058
Feature activation+2.91e+01

Pos logit contributions
 up+0.122
ish+0.121
an+0.113
ishing+0.112
ans+0.111
Neg logit contributions
Um-0.140
Exc-0.139
Hmm-0.139
Sorry-0.137
Don-0.136
Token them
Token ID606
Feature activation+2.74e+01

Pos logit contributions
ric+0.173
ish+0.168
an+0.163
ans+0.156
ig+0.156
Neg logit contributions
Exc-0.239
Hmm-0.222
Don-0.221
Uh-0.218
Well-0.215
Token around
Token ID1088
Feature activation+2.48e+01

Pos logit contributions
ric+0.120
ish+0.116
ans+0.110
an+0.110
igs+0.104
Neg logit contributions
Exc-0.231
Don-0.230
By-0.221
Hmm-0.221
Sorry-0.221
Token children
Token ID1751
Feature activation+3.18e+01

Pos logit contributions
Hmm+0.127
Uh+0.124
Um+0.124
Sorry+0.123
Kids+0.122
Neg logit contributions
 up-0.150
ish-0.135
--0.128
lers-0.127
 matches-0.122
Token put
Token ID1234
Feature activation+3.15e+01

Pos logit contributions
 up+0.207
ish+0.201
igs+0.196
ishing+0.184
lers+0.182
Neg logit contributions
Kids-0.249
Sorry-0.244
Um-0.244
Well-0.239
Don-0.236
Token on
Token ID319
Feature activation+2.43e+01

Pos logit contributions
ric+0.160
ish+0.157
igs+0.148
an+0.146
ans+0.140
Neg logit contributions
Kids-0.263
Don-0.252
Hmm-0.249
Sorry-0.247
Exc-0.246
Token their
Token ID511
Feature activation+3.15e+01

Pos logit contributions
ish+0.165
 up+0.141
ishing+0.141
an+0.139
ishes+0.136
Neg logit contributions
Kids-0.191
Exc-0.187
Hmm-0.182
Um-0.180
Don-0.179
Token bath
Token ID7837
Feature activation+3.92e+01

Pos logit contributions
 up+0.171
ish+0.170
ning+0.154
ishing+0.151
igs+0.148
Neg logit contributions
Kids-0.263
Um-0.260
Sorry-0.255
Exc-0.255
Hmm-0.254
Tokenro
Token ID305
Feature activation+6.29e+01

Pos logit contributions
 up+0.243
ish+0.243
cks+0.231
lers+0.216
ric+0.216
Neg logit contributions
Kids-0.318
Sorry-0.287
Okay-0.283
Exc-0.276
Well-0.275
Tokenbes
Token ID12636
Feature activation+4.07e+01

Pos logit contributions
ish+0.503
ishes+0.472
 up+0.439
 matches+0.435
ries+0.416
Neg logit contributions
Kids-0.379
Ben-0.360
Exc-0.359
Don-0.349
By-0.347
Token and
Token ID290
Feature activation+1.86e+01

Pos logit contributions
ish+0.184
 up+0.173
igs+0.173
ric+0.168
 matches+0.159
Neg logit contributions
Exc-0.378
Kids-0.377
Sorry-0.377
Don-0.374
Well-0.369
Token ran
Token ID4966
Feature activation+2.53e+01

Pos logit contributions
 up+0.143
ish+0.142
lers+0.128
ric+0.127
an+0.126
Neg logit contributions
Kids-0.129
Sorry-0.124
Um-0.124
Don-0.122
Hmm-0.121
Token around
Token ID1088
Feature activation+2.32e+01

Pos logit contributions
ish+0.132
ishes+0.114
 up+0.113
igs+0.112
ric+0.111
Neg logit contributions
Kids-0.245
Sorry-0.231
Exc-0.231
Don-0.230
Listen-0.228
Token in
Token ID287
Feature activation+2.10e+01

Pos logit contributions
ish+0.129
ishes+0.112
lers+0.108
ans+0.108
 up+0.108
Neg logit contributions
Kids-0.216
Don-0.210
Exc-0.205
Um-0.204
Well-0.203
Token giving
Token ID3501
Feature activation+2.75e+01

Pos logit contributions
ish+0.148
 up+0.137
lers+0.134
ishes+0.131
ishing+0.121
Neg logit contributions
Um-0.168
Kids-0.162
Don-0.162
Hmm-0.158
Dad-0.157
Token him
Token ID683
Feature activation+2.51e+01

Pos logit contributions
igs+0.187
ric+0.187
ish+0.184
ig+0.177
cks+0.175
Neg logit contributions
Don-0.182
Exc-0.179
Hi-0.178
Um-0.178
Hmm-0.178
Token too
Token ID1165
Feature activation+2.37e+01

Pos logit contributions
ning+0.134
ric+0.131
ish+0.131
cks+0.129
lers+0.127
Neg logit contributions
Sorry-0.182
Um-0.182
Exc-0.179
Hmm-0.179
Don-0.178
Token many
Token ID867
Feature activation+2.47e+01

Pos logit contributions
ish+0.188
 up+0.187
ishes+0.169
lers+0.169
ishing+0.167
Neg logit contributions
Um-0.157
Sorry-0.156
Hmm-0.154
Don-0.152
Kids-0.151
Token ch
Token ID442
Feature activation+3.53e+01

Pos logit contributions
ish+0.141
 up+0.138
ric+0.128
ning+0.125
gy+0.121
Neg logit contributions
Exc-0.208
Um-0.202
Sorry-0.194
Don-0.192
Kids-0.191
Tokenocol
Token ID4668
Feature activation+6.21e+01

Pos logit contributions
 up+0.269
ning+0.219
 matches+0.212
fl+0.206
 out+0.195
Neg logit contributions
Um-0.240
Kids-0.237
Dad-0.235
Exc-0.235
Sorry-0.231
Tokenates
Token ID689
Feature activation+3.49e+01

Pos logit contributions
 up+0.357
 matches+0.271
 where+0.257
 v+0.249
 new+0.236
Neg logit contributions
Dad-0.546
Sorry-0.544
Um-0.542
By-0.540
Warning-0.514
Token."
Token ID526
Feature activation+7.50e+00

Pos logit contributions
ish+0.157
 up+0.153
ric+0.147
igs+0.145
ans+0.143
Neg logit contributions
Sorry-0.296
Don-0.290
Um-0.288
Dad-0.288
Hmm-0.288
Token The
Token ID383
Feature activation+1.38e+01

Pos logit contributions
 up+0.060
ish+0.054
lers+0.051
-+0.050
 matches+0.048
Neg logit contributions
Uh-0.053
Hmm-0.052
Um-0.052
Sorry-0.051
Kids-0.051
Token king
Token ID5822
Feature activation+2.34e+01

Pos logit contributions
 up+0.102
ish+0.096
lers+0.093
ishing+0.088
ishes+0.087
Neg logit contributions
Um-0.105
Kids-0.102
Sorry-0.102
Hmm-0.101
Exc-0.100
Token learned
Token ID4499
Feature activation+1.40e+01

Pos logit contributions
 up+0.202
ish+0.189
igs+0.178
lers+0.177
ishes+0.176
Neg logit contributions
Um-0.141
Kids-0.136
Well-0.135
Exc-0.134
Hmm-0.134
Token as
Token ID355
Feature activation+1.63e+01

Pos logit contributions
ish+0.137
 up+0.128
ishes+0.117
igs+0.114
ric+0.113
Neg logit contributions
Kids-0.287
Exc-0.284
Um-0.284
Don-0.283
Hmm-0.282
Token he
Token ID339
Feature activation+1.86e+01

Pos logit contributions
ish+0.145
 up+0.134
lers+0.130
ishes+0.130
ishing+0.128
Neg logit contributions
Don-0.096
Kids-0.095
Sorry-0.095
Um-0.094
By-0.094
Token opened
Token ID4721
Feature activation+2.56e+01

Pos logit contributions
 up+0.160
ish+0.149
lers+0.140
-+0.137
 matches+0.134
Neg logit contributions
Kids-0.119
Hmm-0.118
Um-0.118
Uh-0.117
Sorry-0.117
Token the
Token ID262
Feature activation+2.17e+01

Pos logit contributions
ish+0.170
an+0.152
ans+0.149
ishes+0.146
ric+0.144
Neg logit contributions
Exc-0.177
Don-0.172
Um-0.170
Hmm-0.170
Kids-0.168
Token ch
Token ID442
Feature activation+3.51e+01

Pos logit contributions
 up+0.138
ish+0.137
ishing+0.127
igs+0.127
ishes+0.125
Neg logit contributions
Um-0.169
Uh-0.169
Hmm-0.169
Exc-0.163
Kids-0.163
Tokenocol
Token ID4668
Feature activation+6.17e+01

Pos logit contributions
 up+0.246
fl+0.192
ning+0.184
 matches+0.181
 where+0.180
Neg logit contributions
Uh-0.270
Kids-0.259
Sorry-0.256
Exc-0.256
Um-0.255
Tokenates
Token ID689
Feature activation+3.30e+01

Pos logit contributions
 up+0.364
 matches+0.264
 where+0.259
 things+0.246
 v+0.246
Neg logit contributions
Dad-0.542
Um-0.540
By-0.540
Sorry-0.527
Warning-0.514
Token and
Token ID290
Feature activation+1.34e+01

Pos logit contributions
ish+0.150
igs+0.147
 up+0.146
ric+0.142
 matches+0.141
Neg logit contributions
Hmm-0.288
Sorry-0.285
Don-0.283
Exc-0.282
Well-0.282
Token started
Token ID2067
Feature activation+2.13e+01

Pos logit contributions
ish+0.103
 up+0.100
igs+0.095
ishing+0.093
lers+0.093
Neg logit contributions
Um-0.093
Sorry-0.089
Hmm-0.089
Don-0.088
Exc-0.087
Token eating
Token ID6600
Feature activation+2.39e+01

Pos logit contributions
ish+0.142
an+0.127
ishes+0.119
ans+0.118
 up+0.117
Neg logit contributions
Don-0.167
Um-0.164
Kids-0.163
Listen-0.161
Sorry-0.159
Token them
Token ID606
Feature activation+2.21e+01

Pos logit contributions
ish+0.147
an+0.139
ric+0.134
ishing+0.131
ans+0.128
Neg logit contributions
Exc-0.189
Hmm-0.180
Don-0.179
Dad-0.173
Sorry-0.170
Token give
Token ID1577
Feature activation+2.85e+01

Pos logit contributions
ish+0.170
 up+0.161
ishing+0.160
lers+0.159
ishes+0.155
Neg logit contributions
Um-0.143
Hmm-0.141
Exc-0.139
Don-0.139
Uh-0.135
Token to
Token ID284
Feature activation+1.94e+01

Pos logit contributions
ric+0.178
ish+0.175
ig+0.170
an+0.165
ates+0.164
Neg logit contributions
Don-0.209
Kids-0.201
Exc-0.196
Sorry-0.195
Um-0.191
Token the
Token ID262
Feature activation+2.55e+01

Pos logit contributions
ish+0.139
 up+0.134
ishing+0.130
lers+0.124
an+0.123
Neg logit contributions
Kids-0.134
Don-0.127
Um-0.126
Exc-0.126
Hmm-0.125
Token rh
Token ID9529
Feature activation+4.90e+01

Pos logit contributions
 up+0.150
ish+0.139
ishing+0.137
cks+0.133
lers+0.132
Neg logit contributions
Um-0.219
Hmm-0.218
Uh-0.216
Don-0.215
Sorry-0.214
Tokenin
Token ID259
Feature activation+3.84e+01

Pos logit contributions
 up+0.379
ish+0.302
 out+0.289
fl+0.287
igs+0.280
Neg logit contributions
Sorry-0.365
Exc-0.347
Um-0.347
Uh-0.344
By-0.342
Tokenoc
Token ID420
Feature activation+6.11e+01

Pos logit contributions
 up+0.298
ishing+0.285
cks+0.274
igs+0.272
ish+0.265
Neg logit contributions
Hmm-0.248
Um-0.245
Exc-0.241
Sure-0.233
Well-0.232
Tokeneros
Token ID27498
Feature activation+3.68e+01

Pos logit contributions
 up+0.350
 out+0.277
ies+0.268
igs+0.265
 flying+0.263
Neg logit contributions
Sorry-0.540
Um-0.532
Hmm-0.510
Well-0.501
Exc-0.500
Token.
Token ID13
Feature activation+1.02e+01

Pos logit contributions
ish+0.178
 up+0.174
ishing+0.173
igs+0.168
ric+0.165
Neg logit contributions
Um-0.312
Exc-0.308
Don-0.306
Sorry-0.299
Kids-0.299
Token Lily
Token ID20037
Feature activation+1.83e+01

Pos logit contributions
 up+0.088
ish+0.080
lers+0.075
 matches+0.074
-+0.073
Neg logit contributions
Uh-0.064
Sorry-0.063
Kids-0.063
Hmm-0.063
Um-0.063
Token walked
Token ID6807
Feature activation+2.28e+01

Pos logit contributions
 up+0.129
ish+0.121
lers+0.113
ishes+0.106
 matches+0.106
Neg logit contributions
Kids-0.141
Um-0.139
Hmm-0.137
Sorry-0.136
Don-0.136
Token to
Token ID284
Feature activation+1.39e+01

Pos logit contributions
 up+0.103
ish+0.094
lers+0.083
 matches+0.076
ishes+0.075
Neg logit contributions
Kids-0.237
Um-0.236
Hmm-0.236
Sorry-0.234
Uh-0.234
Token quiet
Token ID5897
Feature activation+4.59e+01

Pos logit contributions
Uh+0.075
Exc+0.069
Hmm+0.069
Um+0.068
Hi+0.067
Neg logit contributions
 up-0.051
ans-0.049
ates-0.049
ishes-0.048
lers-0.048
Token compared
Token ID3688
Feature activation+3.73e+01

Pos logit contributions
ish+0.246
igs+0.233
 up+0.223
ishes+0.221
lers+0.204
Neg logit contributions
Kids-0.405
Sorry-0.385
Don-0.385
Um-0.375
Exc-0.374
Token to
Token ID284
Feature activation+2.00e+01

Pos logit contributions
ish+0.235
 up+0.210
lers+0.198
an+0.196
igs+0.195
Neg logit contributions
Kids-0.304
Don-0.304
Exc-0.294
Well-0.290
Um-0.289
Token the
Token ID262
Feature activation+2.69e+01

Pos logit contributions
ish+0.136
 up+0.130
lers+0.120
ishes+0.119
ishing+0.116
Neg logit contributions
Kids-0.158
Um-0.156
Don-0.155
Hmm-0.151
Sorry-0.150
Token noisy
Token ID31210
Feature activation+3.51e+01

Pos logit contributions
ish+0.165
 up+0.163
cks+0.152
ishing+0.152
ishes+0.152
Neg logit contributions
Uh-0.216
Hmm-0.216
Um-0.213
Don-0.212
Kids-0.211
Token se
Token ID384
Feature activation+6.00e+01

Pos logit contributions
 up+0.216
ish+0.208
ishes+0.201
cks+0.199
igs+0.198
Neg logit contributions
Kids-0.292
Sorry-0.276
Uh-0.275
Exc-0.271
Okay-0.268
Tokenag
Token ID363
Feature activation+4.39e+01

Pos logit contributions
 up+0.393
 matches+0.326
 eggs+0.306
 flying+0.303
 toys+0.301
Neg logit contributions
Okay-0.480
Sorry-0.480
Kids-0.477
Hmm-0.467
Uh-0.466
Tokenull
Token ID724
Feature activation+3.57e+01

Pos logit contributions
ish+0.296
ishes+0.291
 up+0.289
lers+0.262
 eggs+0.261
Neg logit contributions
Kids-0.321
Well-0.308
Dad-0.306
J-0.304
Don-0.301
Tokens
Token ID82
Feature activation+2.55e+01

Pos logit contributions
igs+0.208
 up+0.204
ates+0.198
 matches+0.195
ish+0.195
Neg logit contributions
Don-0.267
Sorry-0.259
By-0.258
Well-0.254
Exc-0.249
Token outside
Token ID2354
Feature activation+2.89e+01

Pos logit contributions
ish+0.123
 up+0.120
igs+0.110
ishes+0.110
ans+0.108
Neg logit contributions
Don-0.228
Sorry-0.228
By-0.226
Okay-0.221
Um-0.219
Token.
Token ID13
Feature activation+9.27e+00

Pos logit contributions
ish+0.129
 up+0.123
lers+0.112
ishes+0.110
igs+0.109
Neg logit contributions
Don-0.283
Sorry-0.283
Kids-0.282
Um-0.274
Uh-0.273
Token harder
Token ID7069
Feature activation+2.81e+01

Pos logit contributions
ish+0.200
 up+0.189
ishing+0.181
an+0.178
igs+0.178
Neg logit contributions
Don-0.175
Um-0.173
Kids-0.169
Sorry-0.169
Exc-0.165
Token to
Token ID284
Feature activation+2.02e+01

Pos logit contributions
ish+0.149
 up+0.137
ishes+0.127
an+0.127
lers+0.120
Neg logit contributions
Don-0.247
Sorry-0.244
Um-0.243
Exc-0.242
Well-0.242
Token push
Token ID4574
Feature activation+2.93e+01

Pos logit contributions
ish+0.163
lers+0.158
ishes+0.155
 up+0.154
ishing+0.150
Neg logit contributions
Um-0.121
Hmm-0.117
Sorry-0.116
Uh-0.116
Exc-0.115
Token the
Token ID262
Feature activation+2.62e+01

Pos logit contributions
ish+0.151
 up+0.148
ans+0.132
lers+0.132
ric+0.132
Neg logit contributions
Hmm-0.263
Kids-0.260
Don-0.259
Exc-0.258
Sorry-0.256
Token wheel
Token ID7825
Feature activation+3.60e+01

Pos logit contributions
 up+0.149
ish+0.142
ishing+0.141
ishes+0.138
ans+0.135
Neg logit contributions
Uh-0.222
Hmm-0.215
Um-0.211
Exc-0.209
Sorry-0.207
Tokenbar
Token ID5657
Feature activation+6.00e+01

Pos logit contributions
ish+0.165
ishes+0.159
 up+0.156
lers+0.153
igs+0.147
Neg logit contributions
Sorry-0.336
Don-0.328
Exc-0.328
Hmm-0.327
Uh-0.326
Tokenrow
Token ID808
Feature activation+3.97e+01

Pos logit contributions
 up+0.372
 matches+0.343
lers+0.323
 clay+0.318
ish+0.309
Neg logit contributions
Sorry-0.450
Hmm-0.441
Well-0.440
Dad-0.435
Um-0.427
Token.
Token ID13
Feature activation+8.94e+00

Pos logit contributions
ish+0.183
ishes+0.170
 up+0.163
lers+0.162
ric+0.158
Neg logit contributions
Sorry-0.375
Don-0.371
Exc-0.365
Hmm-0.365
Well-0.360
Token But
Token ID887
Feature activation+1.14e+01

Pos logit contributions
 up+0.073
ish+0.065
lers+0.060
 matches+0.060
-+0.059
Neg logit contributions
Sorry-0.060
Uh-0.060
Kids-0.060
Hmm-0.059
Um-0.059
Token her
Token ID607
Feature activation+1.99e+01

Pos logit contributions
 up+0.077
ish+0.076
lers+0.072
ans+0.071
ishes+0.069
Neg logit contributions
Um-0.087
Hmm-0.087
Don-0.087
Sorry-0.087
Uh-0.085
Token backyard
Token ID24296
Feature activation+2.64e+01

Pos logit contributions
 up+0.145
cks+0.134
ish+0.133
ishes+0.133
ans+0.132
Neg logit contributions
Um-0.142
Sorry-0.135
By-0.132
Well-0.132
Uh-0.131
Token his
Token ID465
Feature activation+3.72e+01

Pos logit contributions
Don+0.086
Um+0.082
Hmm+0.082
Listen+0.081
Exc+0.077
Neg logit contributions
ric-0.029
eh-0.026
ates-0.026
igs-0.025
 ig-0.024
Token head
Token ID1182
Feature activation+3.93e+01

Pos logit contributions
ish+0.212
 up+0.210
ishing+0.200
igs+0.199
ning+0.195
Neg logit contributions
Um-0.294
Hmm-0.275
Dad-0.272
Don-0.271
Okay-0.271
Token out
Token ID503
Feature activation+3.43e+01

Pos logit contributions
ish+0.183
 up+0.180
igs+0.176
ishes+0.167
ans+0.164
Neg logit contributions
Sorry-0.367
Don-0.362
Um-0.359
Kids-0.353
Exc-0.352
Token of
Token ID286
Feature activation+2.96e+01

Pos logit contributions
ish+0.208
 up+0.186
ishes+0.179
ans+0.178
ric+0.171
Neg logit contributions
Um-0.285
Exc-0.285
Uh-0.280
Don-0.278
Sorry-0.275
Token his
Token ID465
Feature activation+3.18e+01

Pos logit contributions
ish+0.213
an+0.182
ans+0.178
 up+0.177
ishes+0.172
Neg logit contributions
Uh-0.226
Kids-0.225
Um-0.223
Don-0.220
Exc-0.219
Token bur
Token ID4356
Feature activation+5.97e+01

Pos logit contributions
 up+0.172
ish+0.170
ans+0.161
igs+0.159
ning+0.157
Neg logit contributions
Um-0.252
Hmm-0.240
Uh-0.237
Sorry-0.236
Dad-0.235
Tokenrow
Token ID808
Feature activation+4.32e+01

Pos logit contributions
 matches+0.452
 up+0.447
 clay+0.440
 dough+0.428
 model+0.428
Neg logit contributions
Exc-0.267
Don-0.248
Hmm-0.246
He-0.241
Um-0.240
Token and
Token ID290
Feature activation+1.86e+01

Pos logit contributions
ish+0.218
cks+0.206
ric+0.202
ishes+0.201
ans+0.192
Neg logit contributions
Don-0.351
Exc-0.350
Sorry-0.345
Dad-0.337
By-0.335
Token saw
Token ID2497
Feature activation+2.34e+01

Pos logit contributions
ish+0.151
 up+0.144
ans+0.134
lers+0.134
igs+0.132
Neg logit contributions
Don-0.121
Um-0.120
Sorry-0.118
Kids-0.117
Hmm-0.116
Token something
Token ID1223
Feature activation+2.42e+01

Pos logit contributions
ish+0.151
ishing+0.130
 up+0.130
ishes+0.129
ans+0.128
Neg logit contributions
Don-0.190
Um-0.187
Hmm-0.187
Exc-0.183
Well-0.182
Token new
Token ID649
Feature activation+2.60e+01

Pos logit contributions
ish+0.159
ishes+0.150
 up+0.149
ans+0.142
ishing+0.141
Neg logit contributions
Don-0.190
Um-0.190
Exc-0.187
Hmm-0.187
Kids-0.186
Token it
Token ID340
Feature activation+2.93e+01

Pos logit contributions
ish+0.205
ric+0.199
an+0.191
igs+0.190
ans+0.187
Neg logit contributions
Kids-0.197
Don-0.186
Um-0.183
Well-0.181
Hmm-0.178
Token to
Token ID284
Feature activation+1.67e+01

Pos logit contributions
ric+0.180
ish+0.176
cks+0.168
igs+0.164
ishing+0.163
Neg logit contributions
Sorry-0.209
By-0.209
Don-0.209
Um-0.208
Listen-0.204
Token the
Token ID262
Feature activation+2.41e+01

Pos logit contributions
ish+0.124
 up+0.121
lers+0.116
ishing+0.111
ishes+0.108
Neg logit contributions
Kids-0.122
Um-0.119
Don-0.114
Sorry-0.114
Hmm-0.114
Token rh
Token ID9529
Feature activation+4.69e+01

Pos logit contributions
 up+0.149
ish+0.146
lers+0.140
ishing+0.137
igs+0.136
Neg logit contributions
Um-0.199
Sorry-0.196
Uh-0.195
Hmm-0.195
Kids-0.193
Tokenin
Token ID259
Feature activation+3.64e+01

Pos logit contributions
 up+0.357
ish+0.284
fl+0.278
 out+0.271
ishes+0.269
Neg logit contributions
Sorry-0.354
Uh-0.339
By-0.334
Um-0.332
Exc-0.328
Tokenoc
Token ID420
Feature activation+5.95e+01

Pos logit contributions
 up+0.280
igs+0.254
ish+0.250
ishing+0.248
ishes+0.244
Neg logit contributions
Um-0.238
Hmm-0.235
Exc-0.231
Sorry-0.231
By-0.228
Tokeneros
Token ID27498
Feature activation+3.53e+01

Pos logit contributions
 up+0.337
 out+0.272
igs+0.262
ies+0.260
 flying+0.258
Neg logit contributions
Sorry-0.521
Um-0.511
Hmm-0.490
Well-0.489
Okay-0.481
Token.
Token ID13
Feature activation+9.29e+00

Pos logit contributions
 up+0.156
ish+0.154
igs+0.151
ishing+0.146
lers+0.138
Neg logit contributions
Um-0.317
Sorry-0.311
Exc-0.309
Don-0.308
Kids-0.305
Token When
Token ID1649
Feature activation+1.45e+01

Pos logit contributions
 up+0.076
ish+0.069
lers+0.064
 matches+0.063
-+0.062
Neg logit contributions
Uh-0.062
Sorry-0.062
Hmm-0.062
Um-0.061
Kids-0.061
Token he
Token ID339
Feature activation+1.85e+01

Pos logit contributions
 up+0.097
ish+0.093
lers+0.084
ishes+0.080
ishing+0.079
Neg logit contributions
Kids-0.119
Um-0.119
Sorry-0.118
Hmm-0.117
Uh-0.117
Token tasted
Token ID29187
Feature activation+2.24e+01

Pos logit contributions
 up+0.142
ish+0.129
-+0.121
lers+0.119
 matches+0.117
Neg logit contributions
Hmm-0.140
Kids-0.138
Um-0.138
Sorry-0.137
Uh-0.136
Token Everyday
Token ID48154
Feature activation+3.54e+01

Pos logit contributions
Kids+0.129
Um+0.128
Hmm+0.128
Sorry+0.127
Uh+0.127
Neg logit contributions
 up-0.121
ish-0.112
lers-0.103
--0.099
 matches-0.098
Token he
Token ID339
Feature activation+2.72e+01

Pos logit contributions
ish+0.286
 up+0.285
ans+0.258
ishing+0.257
ishes+0.256
Neg logit contributions
Um-0.230
Kids-0.229
Sorry-0.221
Hmm-0.220
Well-0.220
Token would
Token ID561
Feature activation+2.32e+01

Pos logit contributions
 up+0.243
ish+0.233
lers+0.217
ishing+0.215
igs+0.214
Neg logit contributions
Um-0.158
Kids-0.151
Hmm-0.150
Sorry-0.149
Don-0.149
Token take
Token ID1011
Feature activation+2.77e+01

Pos logit contributions
 up+0.186
ish+0.186
lers+0.176
ishes+0.175
ishing+0.175
Neg logit contributions
Um-0.141
Exc-0.136
Hmm-0.135
By-0.135
Don-0.134
Token the
Token ID262
Feature activation+2.77e+01

Pos logit contributions
ish+0.191
ric+0.177
an+0.171
 up+0.166
ans+0.166
Neg logit contributions
Kids-0.201
Hmm-0.201
Exc-0.199
Um-0.199
By-0.198
Token av
Token ID1196
Feature activation+5.95e+01

Pos logit contributions
 up+0.160
cks+0.153
ish+0.152
igs+0.144
ishing+0.144
Neg logit contributions
Hmm-0.229
Um-0.223
Uh-0.221
Exc-0.219
Don-0.216
Tokenoc
Token ID420
Feature activation+6.33e+01

Pos logit contributions
 up+0.448
 matches+0.394
ric+0.371
 where+0.366
ies+0.355
Neg logit contributions
Hmm-0.428
Exc-0.425
Uh-0.407
Um-0.401
By-0.394
Tokenados
Token ID22484
Feature activation+3.91e+01

Pos logit contributions
 up+0.395
 flying+0.317
 matches+0.313
 where+0.313
 hiding+0.304
Neg logit contributions
Um-0.505
Sorry-0.503
Hmm-0.496
Well-0.486
Exc-0.484
Token and
Token ID290
Feature activation+1.83e+01

Pos logit contributions
ish+0.198
 up+0.197
lers+0.187
igs+0.185
ric+0.185
Neg logit contributions
Hmm-0.329
Don-0.329
Exc-0.327
Well-0.325
Um-0.323
Token scatter
Token ID41058
Feature activation+2.91e+01

Pos logit contributions
 up+0.122
ish+0.121
an+0.113
ishing+0.112
ans+0.111
Neg logit contributions
Um-0.140
Exc-0.139
Hmm-0.139
Sorry-0.137
Don-0.136
Token them
Token ID606
Feature activation+2.74e+01

Pos logit contributions
ric+0.173
ish+0.168
an+0.163
ans+0.156
ig+0.156
Neg logit contributions
Exc-0.239
Hmm-0.222
Don-0.221
Uh-0.218
Well-0.215
Token you
Token ID345
Feature activation+3.39e+01

Pos logit contributions
Um+0.191
Hmm+0.191
Kids+0.190
Uh+0.188
Sorry+0.188
Neg logit contributions
 up-0.091
ish-0.071
 matches-0.065
lers-0.064
--0.062
Token go
Token ID467
Feature activation+3.22e+01

Pos logit contributions
 up+0.246
ish+0.240
lers+0.227
ishing+0.222
igs+0.221
Neg logit contributions
Um-0.245
Kids-0.239
Sorry-0.238
Don-0.234
By-0.231
Token,
Token ID11
Feature activation+1.51e+01

Pos logit contributions
ish+0.190
ishes+0.170
lers+0.165
 matches+0.164
ery+0.162
Neg logit contributions
Kids-0.257
Sorry-0.251
Um-0.250
Listen-0.247
Hi-0.245
Token my
Token ID616
Feature activation+2.63e+01

Pos logit contributions
lers+0.115
ish+0.115
 up+0.115
ishing+0.110
ishes+0.107
Neg logit contributions
Kids-0.103
Um-0.102
Don-0.100
Hmm-0.099
Exc-0.099
Token d
Token ID288
Feature activation+3.85e+01

Pos logit contributions
 up+0.170
lers+0.155
ish+0.153
cks+0.151
ishes+0.148
Neg logit contributions
Um-0.210
Sorry-0.202
Okay-0.201
Hmm-0.201
Kids-0.199
Tokenarl
Token ID7063
Feature activation+5.94e+01

Pos logit contributions
 up+0.354
ish+0.316
lers+0.315
 matches+0.314
ishes+0.299
Neg logit contributions
Kids-0.222
Um-0.219
Sorry-0.217
Uh-0.212
Exc-0.211
Tokenings
Token ID654
Feature activation+3.19e+01

Pos logit contributions
 up+0.342
 matches+0.312
ish+0.303
lers+0.290
 clay+0.268
Neg logit contributions
Sorry-0.505
Kids-0.496
Well-0.494
Um-0.489
By-0.488
Token.
Token ID13
Feature activation+1.25e+01

Pos logit contributions
 up+0.147
ish+0.134
lers+0.118
ishes+0.109
 matches+0.107
Neg logit contributions
Kids-0.326
Um-0.323
Sorry-0.323
Hmm-0.322
Uh-0.319
Token Enjoy
Token ID18179
Feature activation+2.56e+01

Pos logit contributions
 up+0.100
ish+0.093
lers+0.086
 matches+0.084
-+0.083
Neg logit contributions
Kids-0.085
Uh-0.085
Sorry-0.085
Um-0.085
Hmm-0.085
Token your
Token ID534
Feature activation+2.42e+01

Pos logit contributions
ish+0.149
ric+0.138
lers+0.137
gy+0.134
 matches+0.127
Neg logit contributions
Dad-0.208
Kids-0.206
Exc-0.203
Ben-0.199
Sorry-0.194
Token lunch
Token ID9965
Feature activation+3.53e+01

Pos logit contributions
 up+0.151
ish+0.142
ning+0.136
ric+0.135
lers+0.129
Neg logit contributions
Sorry-0.189
Um-0.187
Kids-0.181
Hmm-0.178
Exc-0.177
Token that
Token ID326
Feature activation+1.83e+01

Pos logit contributions
ish+0.164
ishes+0.149
igs+0.146
lers+0.144
ans+0.142
Neg logit contributions
Well-0.211
Kids-0.211
Sorry-0.209
Don-0.208
By-0.201
Token the
Token ID262
Feature activation+2.17e+01

Pos logit contributions
 up+0.143
ish+0.143
lers+0.132
ishes+0.129
ans+0.127
Neg logit contributions
Kids-0.126
Um-0.124
Sorry-0.123
Don-0.122
Hmm-0.122
Token bad
Token ID2089
Feature activation+2.78e+01

Pos logit contributions
 up+0.142
ish+0.140
lers+0.133
ishes+0.133
cks+0.132
Neg logit contributions
Kids-0.171
Sorry-0.168
Um-0.167
Uh-0.166
Hmm-0.165
Token rh
Token ID9529
Feature activation+4.65e+01

Pos logit contributions
 up+0.200
ish+0.190
lers+0.178
ishes+0.176
cks+0.173
Neg logit contributions
Kids-0.211
Sorry-0.203
Um-0.202
Hmm-0.200
Uh-0.200
Tokenin
Token ID259
Feature activation+3.52e+01

Pos logit contributions
 up+0.336
ish+0.284
igs+0.275
ans+0.265
ishes+0.264
Neg logit contributions
Sorry-0.362
Well-0.347
By-0.344
Uh-0.343
Um-0.341
Tokenoc
Token ID420
Feature activation+5.81e+01

Pos logit contributions
 up+0.255
ish+0.229
igs+0.218
ishes+0.214
lers+0.209
Neg logit contributions
Um-0.272
Well-0.269
Hmm-0.267
Sorry-0.267
Kids-0.265
Tokeneros
Token ID27498
Feature activation+3.35e+01

Pos logit contributions
 up+0.320
igs+0.259
 out+0.246
 flying+0.240
ies+0.235
Neg logit contributions
Sorry-0.535
Um-0.514
Well-0.506
Okay-0.495
Hmm-0.492
Token was
Token ID373
Feature activation+2.13e+01

Pos logit contributions
 up+0.279
ish+0.264
igs+0.249
lers+0.245
ishes+0.243
Neg logit contributions
Um-0.208
Kids-0.208
Sorry-0.206
Don-0.205
Well-0.203
Token now
Token ID783
Feature activation+2.09e+01

Pos logit contributions
ish+0.167
lers+0.157
ishes+0.156
ans+0.147
 up+0.146
Neg logit contributions
By-0.148
Don-0.147
Sorry-0.142
Um-0.139
Well-0.138
Token nice
Token ID3621
Feature activation+2.32e+01

Pos logit contributions
ish+0.131
 up+0.124
ans+0.124
ishes+0.123
lers+0.121
Neg logit contributions
Don-0.166
Sorry-0.165
By-0.163
Well-0.159
Listen-0.157
Token.
Token ID13
Feature activation+8.16e+00

Pos logit contributions
 up+0.118
ish+0.112
igs+0.105
lers+0.103
ans+0.103
Neg logit contributions
Kids-0.216
Don-0.210
Well-0.209
Um-0.206
Sorry-0.206
Token there
Token ID612
Feature activation+2.33e+01

Pos logit contributions
ish+0.123
 up+0.119
lers+0.107
ishes+0.106
ans+0.101
Neg logit contributions
Kids-0.226
Don-0.222
Um-0.221
Well-0.221
Exc-0.220
Token was
Token ID373
Feature activation+1.98e+01

Pos logit contributions
 up+0.161
ish+0.148
lers+0.136
-+0.130
 matches+0.129
Neg logit contributions
Kids-0.189
Um-0.188
Hmm-0.187
Sorry-0.186
Uh-0.186
Token a
Token ID257
Feature activation+2.17e+01

Pos logit contributions
 up+0.094
ish+0.090
lers+0.078
ishes+0.074
ishing+0.070
Neg logit contributions
Um-0.201
Kids-0.200
Hmm-0.199
Uh-0.199
Sorry-0.198
Token rh
Token ID9529
Feature activation+4.57e+01

Pos logit contributions
 up+0.136
ish+0.124
lers+0.115
ishes+0.109
ans+0.106
Neg logit contributions
Uh-0.189
Kids-0.189
Um-0.189
Hmm-0.188
Sorry-0.187
Tokenin
Token ID259
Feature activation+3.52e+01

Pos logit contributions
 up+0.340
ish+0.282
ric+0.277
ishes+0.276
ans+0.273
Neg logit contributions
Sorry-0.345
Uh-0.341
By-0.330
Hmm-0.326
Um-0.322
Tokenoc
Token ID420
Feature activation+5.80e+01

Pos logit contributions
 up+0.308
igs+0.275
ishes+0.271
ishing+0.271
cks+0.263
Neg logit contributions
Um-0.211
Hmm-0.206
Exc-0.205
Sorry-0.203
Uh-0.203
Tokeneros
Token ID27498
Feature activation+3.51e+01

Pos logit contributions
 up+0.332
igs+0.271
ies+0.256
 matches+0.249
 out+0.248
Neg logit contributions
Sorry-0.490
Um-0.486
Hmm-0.457
Well-0.453
Exc-0.451
Token who
Token ID508
Feature activation+2.19e+01

Pos logit contributions
 up+0.208
igs+0.204
ish+0.201
ishes+0.193
ishing+0.192
Neg logit contributions
Um-0.272
Exc-0.269
Don-0.267
Well-0.266
Sorry-0.265
Token was
Token ID373
Feature activation+2.28e+01

Pos logit contributions
 up+0.180
ish+0.170
lers+0.156
igs+0.153
ishes+0.153
Neg logit contributions
Um-0.146
Hmm-0.144
Kids-0.144
Sorry-0.143
Don-0.142
Token always
Token ID1464
Feature activation+2.63e+01

Pos logit contributions
ish+0.173
ishing+0.164
ishes+0.162
igs+0.156
ies+0.153
Neg logit contributions
Um-0.158
Don-0.146
Uh-0.144
By-0.140
Once-0.138
Token naughty
Token ID45128
Feature activation+2.35e+01

Pos logit contributions
ish+0.178
igs+0.175
ans+0.169
ishes+0.163
ig+0.156
Neg logit contributions
Don-0.210
Hi-0.189
Uh-0.189
Um-0.188
Wait-0.183
Token When
Token ID1649
Feature activation+1.77e+01

Pos logit contributions
 up+0.126
ish+0.115
lers+0.109
-+0.106
 matches+0.106
Neg logit contributions
Uh-0.076
Hmm-0.076
Kids-0.076
Sorry-0.075
Um-0.075
Token Daisy
Token ID40355
Feature activation+2.48e+01

Pos logit contributions
 up+0.141
ish+0.137
lers+0.126
ishes+0.122
ishing+0.122
Neg logit contributions
Kids-0.121
Um-0.120
Sorry-0.119
Hmm-0.118
Uh-0.117
Token put
Token ID1234
Feature activation+2.40e+01

Pos logit contributions
 up+0.197
ish+0.180
lers+0.166
-+0.165
 matches+0.160
Neg logit contributions
Hmm-0.182
Uh-0.181
Sorry-0.181
Um-0.180
Kids-0.180
Token the
Token ID262
Feature activation+2.13e+01

Pos logit contributions
ans+0.159
an+0.150
ric+0.150
ish+0.149
ies+0.133
Neg logit contributions
Hmm-0.180
Don-0.170
Well-0.169
No-0.162
Exc-0.162
Token magn
Token ID7842
Feature activation+3.68e+01

Pos logit contributions
 up+0.118
cks+0.117
igs+0.112
ans+0.111
ishes+0.110
Neg logit contributions
Hmm-0.175
Uh-0.169
Sorry-0.164
Um-0.164
Wait-0.160
Tokenifying
Token ID4035
Feature activation+5.70e+01

Pos logit contributions
 up+0.303
ish+0.286
ans+0.268
lers+0.266
 matches+0.264
Neg logit contributions
Hmm-0.233
Uh-0.232
Um-0.231
Sorry-0.229
Kids-0.224
Token glass
Token ID5405
Feature activation+3.82e+01

Pos logit contributions
 up+0.360
ish+0.360
ishing+0.340
ies+0.339
ishes+0.334
Neg logit contributions
Uh-0.465
Hmm-0.453
Um-0.443
Sorry-0.439
By-0.425
Token down
Token ID866
Feature activation+2.60e+01

Pos logit contributions
 up+0.183
ish+0.175
ric+0.172
ans+0.163
ishing+0.152
Neg logit contributions
Hmm-0.350
Exc-0.350
Uh-0.349
Kids-0.348
Sorry-0.345
Token,
Token ID11
Feature activation+1.09e+01

Pos logit contributions
 up+0.136
ish+0.133
lers+0.118
ans+0.115
ishes+0.113
Neg logit contributions
Hmm-0.241
Kids-0.239
Uh-0.239
Sorry-0.239
Um-0.238
Token the
Token ID262
Feature activation+1.79e+01

Pos logit contributions
ish+0.114
ishing+0.107
ans+0.106
ric+0.106
 up+0.106
Neg logit contributions
Don-0.050
Hmm-0.048
Um-0.046
Well-0.046
Kids-0.046
Token cater
Token ID20825
Feature activation+3.60e+01

Pos logit contributions
 up+0.110
ish+0.102
ans+0.094
ishing+0.093
ishes+0.093
Neg logit contributions
Hmm-0.152
Uh-0.151
Sorry-0.150
Kids-0.150
Um-0.150
Tokenoll
Token ID692
Feature activation+4.16e+01

Pos logit contributions
 up+0.223
ish+0.207
lers+0.193
-+0.186
 matches+0.182
Neg logit contributions
Kids-0.217
Um-0.213
Hmm-0.213
Uh-0.213
Sorry-0.212
Tokenip
Token ID541
Feature activation+2.90e+01

Pos logit contributions
 up+0.314
ric+0.303
 clay+0.294
 matches+0.281
ans+0.279
Neg logit contributions
Sorry-0.253
Um-0.243
Well-0.240
Hmm-0.237
Kids-0.226
Tokenops
Token ID2840
Feature activation+2.91e+01

Pos logit contributions
 up+0.260
ish+0.245
ric+0.237
 matches+0.236
ishes+0.228
Neg logit contributions
Sorry-0.145
Hmm-0.143
Well-0.143
Don-0.141
Um-0.139
Token,
Token ID11
Feature activation+1.04e+01

Pos logit contributions
ish+0.142
ric+0.139
 up+0.136
ishes+0.133
ans+0.132
Neg logit contributions
Exc-0.239
By-0.234
Sorry-0.232
Don-0.231
Well-0.226
Token ch
Token ID442
Feature activation+2.77e+01

Pos logit contributions
ish+0.068
 up+0.068
lers+0.061
ishing+0.060
ishes+0.058
Neg logit contributions
Um-0.084
Hmm-0.083
Don-0.083
Kids-0.082
Exc-0.082
Tokenocol
Token ID4668
Feature activation+5.65e+01

Pos logit contributions
 up+0.171
 matches+0.136
ish+0.131
lers+0.121
ning+0.120
Neg logit contributions
Um-0.241
Exc-0.240
Hmm-0.236
Uh-0.234
Sorry-0.233
Tokenates
Token ID689
Feature activation+3.02e+01

Pos logit contributions
 up+0.298
 matches+0.220
 where+0.205
ish+0.200
 clay+0.192
Neg logit contributions
Um-0.538
Sorry-0.526
By-0.520
Hmm-0.518
Dad-0.515
Token,
Token ID11
Feature activation+1.22e+01

Pos logit contributions
 up+0.151
ish+0.145
igs+0.133
ishing+0.132
ans+0.131
Neg logit contributions
Sorry-0.253
Don-0.253
Um-0.250
Hmm-0.250
Well-0.248
Token g
Token ID308
Feature activation+2.80e+01

Pos logit contributions
ish+0.070
 up+0.067
ishing+0.062
igs+0.060
ans+0.059
Neg logit contributions
Don-0.105
Exc-0.105
Hmm-0.105
Um-0.104
Uh-0.102
Tokenummies
Token ID39578
Feature activation+2.69e+01

Pos logit contributions
 up+0.209
 matches+0.192
ish+0.183
lers+0.176
 where+0.169
Neg logit contributions
Um-0.189
Exc-0.189
Sorry-0.185
Kids-0.185
Dad-0.180
Token,
Token ID11
Feature activation+1.19e+01

Pos logit contributions
 up+0.159
ish+0.141
ishing+0.138
ans+0.138
igs+0.136
Neg logit contributions
Exc-0.203
Kids-0.202
Well-0.201
By-0.196
Um-0.194
Token's
Token ID338
Feature activation+2.12e+01

Pos logit contributions
cks+0.152
ishing+0.151
igs+0.147
ans+0.146
ates+0.142
Neg logit contributions
Sorry-0.183
Hmm-0.182
Um-0.181
Uh-0.176
Okay-0.176
Token wonderful
Token ID7932
Feature activation+2.99e+01

Pos logit contributions
 up+0.135
ish+0.135
lers+0.133
ishes+0.123
ans+0.123
Neg logit contributions
Um-0.175
Hmm-0.172
Sorry-0.169
Uh-0.168
Don-0.168
Token,
Token ID11
Feature activation+1.28e+01

Pos logit contributions
ric+0.191
cks+0.177
ans+0.171
ates+0.167
ep+0.166
Neg logit contributions
Um-0.218
Okay-0.217
Exc-0.215
No-0.212
Kids-0.211
Token my
Token ID616
Feature activation+2.42e+01

Pos logit contributions
 up+0.105
ish+0.102
lers+0.099
ishing+0.092
ric+0.092
Neg logit contributions
Kids-0.086
Um-0.085
Hmm-0.084
Uh-0.083
Sorry-0.082
Token d
Token ID288
Feature activation+3.66e+01

Pos logit contributions
 up+0.160
lers+0.148
cks+0.146
ning+0.146
ans+0.144
Neg logit contributions
Um-0.185
Sorry-0.180
Okay-0.178
Hmm-0.174
Well-0.170
Tokenarl
Token ID7063
Feature activation+5.61e+01

Pos logit contributions
 up+0.328
lers+0.290
ish+0.289
 matches+0.284
ishes+0.271
Neg logit contributions
Kids-0.221
Um-0.216
Sorry-0.214
Hmm-0.210
Uh-0.208
Tokenings
Token ID654
Feature activation+3.04e+01

Pos logit contributions
 up+0.284
ish+0.245
 matches+0.243
lers+0.238
igs+0.207
Neg logit contributions
Sorry-0.526
Kids-0.525
Well-0.518
Okay-0.509
Um-0.509
Token!"
Token ID2474
Feature activation+1.00e+01

Pos logit contributions
 up+0.139
ish+0.118
lers+0.103
-+0.098
 matches+0.095
Neg logit contributions
Um-0.319
Kids-0.319
Hmm-0.318
Uh-0.317
Sorry-0.315
Token Mom
Token ID11254
Feature activation+1.86e+01

Pos logit contributions
 up+0.084
ish+0.076
lers+0.072
-+0.070
 matches+0.069
Neg logit contributions
Hmm-0.067
Kids-0.067
Uh-0.067
Um-0.067
Sorry-0.066
Tokenmy
Token ID1820
Feature activation+2.20e+01

Pos logit contributions
 up+0.155
ish+0.152
igs+0.147
lers+0.144
ans+0.140
Neg logit contributions
Kids-0.116
Um-0.111
Exc-0.106
Sorry-0.106
Well-0.106
Token says
Token ID1139
Feature activation+1.52e+01

Pos logit contributions
 up+0.206
ish+0.204
igs+0.191
lers+0.190
ishing+0.188
Neg logit contributions
Kids-0.116
Um-0.112
Exc-0.107
Sorry-0.106
Well-0.105
Token her
Token ID607
Feature activation+3.65e+01

Pos logit contributions
Um+0.098
Don+0.095
Kids+0.095
Uh+0.094
Hmm+0.094
Neg logit contributions
 up-0.099
ish-0.098
ans-0.090
lers-0.089
an-0.089
Token mom
Token ID1995
Feature activation+2.86e+01

Pos logit contributions
ish+0.220
 up+0.208
igs+0.206
ishing+0.203
ning+0.202
Neg logit contributions
Um-0.284
Don-0.265
Kids-0.264
Okay-0.264
Sorry-0.260
Tokenmy
Token ID1820
Feature activation+3.11e+01

Pos logit contributions
 up+0.188
igs+0.182
ish+0.181
lers+0.174
ans+0.171
Neg logit contributions
Kids-0.217
Sorry-0.216
Um-0.214
Don-0.212
Well-0.208
Token watched
Token ID7342
Feature activation+2.73e+01

Pos logit contributions
 up+0.201
ish+0.193
igs+0.192
lers+0.185
ans+0.184
Neg logit contributions
Sorry-0.235
Kids-0.235
Don-0.232
Um-0.229
Well-0.227
Token in
Token ID287
Feature activation+2.19e+01

Pos logit contributions
ish+0.170
lers+0.154
an+0.153
ishes+0.152
ans+0.151
Neg logit contributions
Kids-0.218
Exc-0.218
Sorry-0.213
Don-0.212
Dad-0.209
Token amaz
Token ID40642
Feature activation+5.53e+01

Pos logit contributions
ish+0.126
 up+0.117
lers+0.111
ans+0.110
an+0.108
Neg logit contributions
Don-0.193
Kids-0.192
Uh-0.192
Um-0.192
Hmm-0.192
Tokenement
Token ID972
Feature activation+3.39e+01

Pos logit contributions
ish+0.464
 up+0.457
ans+0.449
ric+0.449
gy+0.437
Neg logit contributions
Sorry-0.311
Exc-0.307
Well-0.297
Um-0.294
Kids-0.290
Token as
Token ID355
Feature activation+2.16e+01

Pos logit contributions
ish+0.185
igs+0.168
ishes+0.165
 matches+0.164
an+0.163
Neg logit contributions
Sorry-0.270
Kids-0.266
Exc-0.264
Well-0.264
Um-0.261
Token the
Token ID262
Feature activation+2.40e+01

Pos logit contributions
ish+0.173
 up+0.166
lers+0.156
ans+0.155
ishes+0.153
Neg logit contributions
Kids-0.147
Um-0.145
Don-0.143
Hmm-0.143
Sorry-0.143
Token machine
Token ID4572
Feature activation+3.23e+01

Pos logit contributions
 up+0.157
ish+0.155
ishes+0.139
lers+0.137
ans+0.135
Neg logit contributions
Kids-0.200
Um-0.197
Uh-0.196
Sorry-0.195
Hmm-0.194
Token flew
Token ID13112
Feature activation+2.98e+01

Pos logit contributions
 up+0.236
ish+0.228
lers+0.210
ishes+0.204
igs+0.203
Neg logit contributions
Kids-0.238
Sorry-0.233
Um-0.231
Hmm-0.228
Uh-0.228
Tokenily
Token ID813
Feature activation-1.60e+01

Pos logit contributions
Um+0.092
Kids+0.092
Hmm+0.092
Sorry+0.091
Uh+0.091
Neg logit contributions
 up-0.116
ish-0.108
lers-0.100
--0.097
 matches-0.096
Token and
Token ID290
Feature activation-1.84e+01

Pos logit contributions
Kids+0.152
Um+0.149
Sorry+0.148
Hmm+0.148
Don+0.147
Neg logit contributions
 up-0.086
ish-0.078
lers-0.071
ishes-0.065
 matches-0.065
Token Tom
Token ID4186
Feature activation-1.44e+01

Pos logit contributions
Kids+0.153
Uh+0.153
Hmm+0.152
Um+0.152
Sorry+0.151
Neg logit contributions
 up-0.128
ish-0.114
lers-0.103
--0.100
 matches-0.100
Token are
Token ID389
Feature activation-1.62e+01

Pos logit contributions
Hmm+0.104
Um+0.104
Uh+0.104
Kids+0.102
Sorry+0.101
Neg logit contributions
 up-0.117
ish-0.107
lers-0.098
--0.097
 matches-0.096
Token friends
Token ID2460
Feature activation-1.75e+01

Pos logit contributions
Kids+0.154
Um+0.153
Sorry+0.151
By+0.150
Dad+0.149
Neg logit contributions
ish-0.074
 up-0.070
lers-0.069
an-0.065
ric-0.063
Token.
Token ID13
Feature activation-2.43e+01

Pos logit contributions
Exc+0.162
Sorry+0.160
Don+0.159
Kids+0.158
Um+0.158
Neg logit contributions
ish-0.091
 up-0.083
ishes-0.080
igs-0.080
ans-0.078
Token They
Token ID1119
Feature activation-1.97e+01

Pos logit contributions
Sorry+0.195
Hmm+0.190
Uh+0.185
Um+0.185
Don+0.183
Neg logit contributions
 up-0.155
 clay-0.140
 matches-0.133
ish-0.132
 where-0.131
Token like
Token ID588
Feature activation-1.84e+01

Pos logit contributions
Kids+0.166
Um+0.166
Hmm+0.165
Uh+0.164
Sorry+0.164
Neg logit contributions
 up-0.130
ish-0.118
lers-0.108
--0.103
 matches-0.102
Token to
Token ID284
Feature activation-1.84e+01

Pos logit contributions
Kids+0.166
Don+0.160
Um+0.160
Listen+0.158
Sorry+0.157
Neg logit contributions
 up-0.107
ish-0.106
lers-0.097
ishing-0.090
igs-0.089
Token play
Token ID711
Feature activation-1.58e+01

Pos logit contributions
Kids+0.120
Um+0.118
Sorry+0.117
Hmm+0.116
Exc+0.113
Neg logit contributions
 up-0.153
ish-0.144
lers-0.137
ishing-0.131
ishes-0.131
Token with
Token ID351
Feature activation-1.58e+01

Pos logit contributions
Exc+0.150
Sorry+0.150
Kids+0.149
By+0.146
Sure+0.146
Neg logit contributions
ric-0.073
ish-0.064
 up-0.062
ishes-0.057
lers-0.057
Token She
Token ID1375
Feature activation-1.73e+01

Pos logit contributions
Uh+0.135
Sorry+0.131
By+0.131
Hmm+0.129
Kids+0.128
Neg logit contributions
 up-0.221
ish-0.174
 where-0.172
 things-0.170
 in-0.168
Token cried
Token ID16896
Feature activation-1.80e+01

Pos logit contributions
Hmm+0.127
Kids+0.126
Sorry+0.126
Uh+0.126
By+0.120
Neg logit contributions
 up-0.140
lers-0.120
ish-0.119
--0.114
 matches-0.112
Token and
Token ID290
Feature activation-1.82e+01

Pos logit contributions
Kids+0.206
Um+0.205
Hmm+0.204
Uh+0.204
Sorry+0.203
Neg logit contributions
 up-0.071
ish-0.056
lers-0.048
--0.047
 matches-0.042
Token chased
Token ID26172
Feature activation-1.51e+01

Pos logit contributions
Hmm+0.118
Kids+0.118
Uh+0.116
Um+0.115
Sorry+0.114
Neg logit contributions
 up-0.158
ish-0.141
lers-0.139
--0.134
 matches-0.131
Token Tom
Token ID4186
Feature activation-1.32e+01

Pos logit contributions
Uh+0.132
Sorry+0.130
Um+0.130
Hmm+0.130
Kids+0.128
Neg logit contributions
 up-0.104
ish-0.086
--0.082
lers-0.081
 things-0.077
Token.
Token ID13
Feature activation-2.41e+01

Pos logit contributions
Don+0.126
Kids+0.122
Um+0.122
Hmm+0.121
Exc+0.120
Neg logit contributions
ish-0.058
igs-0.052
 up-0.051
ishes-0.051
ric-0.051
Token She
Token ID1375
Feature activation-1.88e+01

Pos logit contributions
Uh+0.228
Hmm+0.220
Kids+0.214
Sorry+0.209
By+0.207
Neg logit contributions
 up-0.166
 where-0.111
 in-0.109
 out-0.108
 things-0.107
Token wanted
Token ID2227
Feature activation-1.97e+01

Pos logit contributions
Hmm+0.135
Uh+0.134
Sorry+0.134
Kids+0.133
Listen+0.129
Neg logit contributions
 up-0.153
lers-0.135
ish-0.133
--0.128
 matches-0.128
Token her
Token ID607
Feature activation-1.51e+01

Pos logit contributions
Uh+0.202
Hmm+0.201
Sorry+0.200
Um+0.199
Kids+0.198
Neg logit contributions
 up-0.100
ish-0.078
 matches-0.074
lers-0.073
--0.071
Token doll
Token ID3654
Feature activation-1.01e+01

Pos logit contributions
Kids+0.140
Um+0.138
Sorry+0.135
Hmm+0.135
Don+0.134
Neg logit contributions
 up-0.077
ish-0.076
lers-0.068
ans-0.065
igs-0.064
Token back
Token ID736
Feature activation-1.31e+01

Pos logit contributions
Um+0.092
Hmm+0.092
Sorry+0.092
Don+0.091
Kids+0.090
Neg logit contributions
 up-0.049
ish-0.048
lers-0.044
igs-0.043
ans-0.042
Token.
Token ID13
Feature activation-2.13e+01

Pos logit contributions
Kids+0.094
Sorry+0.092
Um+0.091
Dad+0.090
Hi+0.089
Neg logit contributions
igs-0.048
ish-0.047
 up-0.045
ric-0.045
ans-0.045
Token Can
Token ID1680
Feature activation-1.93e+01

Pos logit contributions
Uh+0.155
Hmm+0.154
Sorry+0.154
Kids+0.152
Um+0.151
Neg logit contributions
 up-0.161
ish-0.144
 matches-0.132
lers-0.131
--0.129
Token I
Token ID314
Feature activation-1.80e+01

Pos logit contributions
Kids+0.148
Um+0.147
Hmm+0.146
Uh+0.146
Sorry+0.145
Neg logit contributions
 up-0.143
ish-0.131
lers-0.122
--0.117
 matches-0.116
Token see
Token ID766
Feature activation-1.55e+01

Pos logit contributions
Kids+0.136
Hmm+0.134
Uh+0.134
Um+0.134
Sorry+0.132
Neg logit contributions
 up-0.136
ish-0.126
lers-0.115
--0.113
 matches-0.111
Token it
Token ID340
Feature activation-1.08e+01

Pos logit contributions
Um+0.107
Exc+0.105
Sorry+0.103
Listen+0.103
Hi+0.102
Neg logit contributions
ric-0.109
ish-0.108
an-0.104
gy-0.103
ans-0.102
Token?"
Token ID1701
Feature activation-2.40e+01

Pos logit contributions
Sorry+0.106
Um+0.106
Kids+0.103
Uh+0.103
Hmm+0.102
Neg logit contributions
ish-0.050
 up-0.049
lers-0.047
ishes-0.042
ans-0.042
Token\n
Token ID198
Feature activation-2.33e+01

Pos logit contributions
Uh+0.208
Hmm+0.204
Sorry+0.203
Um+0.199
Listen+0.193
Neg logit contributions
 up-0.163
--0.132
ish-0.127
 in-0.124
 things-0.122
Token\n
Token ID198
Feature activation-2.38e+01

Pos logit contributions
Uh+0.199
Hmm+0.196
Um+0.191
Sorry+0.188
Kids+0.180
Neg logit contributions
 up-0.166
--0.138
ish-0.128
 in-0.123
 matches-0.120
TokenL
Token ID43
Feature activation-1.44e+01

Pos logit contributions
Um+0.142
Warning+0.141
Uh+0.139
Hmm+0.138
Listen+0.138
Neg logit contributions
 up-0.187
--0.185
 in-0.173
ish-0.163
 with-0.162
Tokenily
Token ID813
Feature activation-1.76e+01

Pos logit contributions
Kids+0.076
Um+0.072
Hmm+0.071
Sorry+0.071
Uh+0.070
Neg logit contributions
 up-0.143
ish-0.135
lers-0.130
ishes-0.124
 matches-0.124
Token nods
Token ID34833
Feature activation-1.60e+01

Pos logit contributions
Kids+0.134
Exc+0.133
Um+0.132
Hmm+0.128
Well+0.128
Neg logit contributions
 up-0.114
ish-0.112
lers-0.107
igs-0.105
ishing-0.102
Token her
Token ID607
Feature activation-1.45e+01

Pos logit contributions
Kids+0.165
Um+0.164
Hmm+0.163
Uh+0.163
Sorry+0.162
Neg logit contributions
 up-0.106
ish-0.094
lers-0.085
--0.081
 matches-0.080
Token mom
Token ID1995
Feature activation-1.44e+01

Pos logit contributions
Um+0.094
Sorry+0.093
Kids+0.090
Uh+0.089
By+0.089
Neg logit contributions
 up-0.115
ish-0.110
lers-0.105
ric-0.104
ans-0.103
Token went
Token ID1816
Feature activation-1.74e+01

Pos logit contributions
Um+0.112
Hmm+0.110
Uh+0.109
Kids+0.107
Sorry+0.105
Neg logit contributions
 up-0.112
ish-0.105
--0.095
lers-0.092
ishes-0.091
Token back
Token ID736
Feature activation-1.49e+01

Pos logit contributions
Kids+0.165
Sorry+0.159
Um+0.157
Listen+0.156
Exc+0.156
Neg logit contributions
ish-0.085
 up-0.077
lers-0.075
ric-0.072
 matches-0.070
Token home
Token ID1363
Feature activation-1.41e+01

Pos logit contributions
Kids+0.114
Don+0.107
Listen+0.107
Exc+0.107
Um+0.106
Neg logit contributions
ish-0.090
lers-0.088
an-0.086
ric-0.085
igs-0.084
Token.
Token ID13
Feature activation-2.40e+01

Pos logit contributions
Sorry+0.133
Kids+0.132
Don+0.131
Um+0.130
By+0.130
Neg logit contributions
ish-0.071
lers-0.064
 up-0.064
ishes-0.059
ans-0.057
Token
Token ID220
Feature activation-1.63e+01

Pos logit contributions
Uh+0.202
Sorry+0.201
Um+0.197
Hmm+0.197
Okay+0.186
Neg logit contributions
 up-0.177
 where-0.143
 out-0.135
ish-0.131
 clay-0.129
Token\n
Token ID198
Feature activation-2.35e+01

Pos logit contributions
Kids+0.099
Um+0.098
Hmm+0.097
Sorry+0.097
Uh+0.096
Neg logit contributions
 up-0.145
ish-0.137
lers-0.129
--0.124
 matches-0.123
Token\n
Token ID198
Feature activation-2.37e+01

Pos logit contributions
Uh+0.200
Um+0.200
Hmm+0.199
Sorry+0.196
Okay+0.187
Neg logit contributions
 up-0.166
--0.127
 in-0.126
 where-0.122
 out-0.122
TokenL
Token ID43
Feature activation-1.36e+01

Pos logit contributions
Um+0.147
Hmm+0.139
Uh+0.133
Warning+0.126
 hesitate+0.125
Neg logit contributions
 up-0.201
 in-0.180
--0.176
 with-0.168
ep-0.167
Tokenily
Token ID813
Feature activation-1.74e+01

Pos logit contributions
Um+0.081
Hmm+0.080
Sorry+0.079
Kids+0.079
Uh+0.079
Neg logit contributions
 up-0.125
ish-0.117
lers-0.109
--0.108
 matches-0.106
Token It
Token ID632
Feature activation-1.49e+01

Pos logit contributions
Hmm+0.167
Um+0.165
Sorry+0.165
Kids+0.164
Uh+0.164
Neg logit contributions
 up-0.192
ish-0.170
 matches-0.161
lers-0.160
--0.154
Token has
Token ID468
Feature activation-1.69e+01

Pos logit contributions
Um+0.121
Uh+0.118
Hmm+0.118
Well+0.117
Kids+0.115
Neg logit contributions
ish-0.095
 up-0.095
ishing-0.089
ans-0.086
ishes-0.083
Token candles
Token ID32268
Feature activation-6.97e+00

Pos logit contributions
Um+0.151
Hmm+0.146
Uh+0.144
Exc+0.142
Well+0.141
Neg logit contributions
ish-0.107
 up-0.100
ishing-0.093
an-0.086
lers-0.086
Token on
Token ID319
Feature activation-1.19e+01

Pos logit contributions
Sorry+0.063
By+0.061
Kids+0.061
Well+0.060
Uh+0.060
Neg logit contributions
ish-0.033
igs-0.031
 up-0.030
ans-0.030
lers-0.030
Token top
Token ID1353
Feature activation-6.76e+00

Pos logit contributions
Dad+0.085
Hmm+0.080
Exc+0.079
That+0.078
Um+0.077
Neg logit contributions
ish-0.093
ishing-0.084
ishes-0.082
lers-0.080
 up-0.077
Token.
Token ID13
Feature activation-2.40e+01

Pos logit contributions
Exc+0.058
Dad+0.057
Well+0.057
Um+0.057
Uh+0.057
Neg logit contributions
ish-0.036
 up-0.033
ishes-0.032
ans-0.030
ishing-0.030
Token Anna
Token ID11735
Feature activation-1.66e+01

Pos logit contributions
Hmm+0.179
Sorry+0.176
Uh+0.175
Um+0.175
Kids+0.174
Neg logit contributions
 up-0.186
ish-0.161
 matches-0.156
lers-0.153
 clay-0.147
Token and
Token ID290
Feature activation-1.77e+01

Pos logit contributions
Kids+0.145
Um+0.143
Well+0.142
Don+0.140
Exc+0.140
Neg logit contributions
 up-0.089
ish-0.087
igs-0.082
lers-0.081
ric-0.079
Token Ben
Token ID3932
Feature activation-1.62e+01

Pos logit contributions
Kids+0.108
Um+0.107
Hmm+0.106
Sorry+0.105
Uh+0.104
Neg logit contributions
 up-0.158
ish-0.149
lers-0.141
--0.136
 matches-0.135
Token like
Token ID588
Feature activation-1.86e+01

Pos logit contributions
Hmm+0.102
Um+0.102
Kids+0.101
Sorry+0.101
Uh+0.100
Neg logit contributions
 up-0.131
ish-0.125
lers-0.121
--0.118
igs-0.118
Token chocolate
Token ID11311
Feature activation-1.05e+01

Pos logit contributions
Kids+0.147
Um+0.145
Don+0.144
Dad+0.144
Exc+0.144
Neg logit contributions
ish-0.127
an-0.111
ishing-0.110
 up-0.108
ans-0.108
Token He
Token ID679
Feature activation-1.66e+01

Pos logit contributions
Sorry+0.157
Uh+0.155
By+0.150
Kids+0.150
Listen+0.150
Neg logit contributions
 up-0.196
ish-0.170
 things-0.167
--0.163
lers-0.156
Token liked
Token ID8288
Feature activation-1.67e+01

Pos logit contributions
Kids+0.123
Hmm+0.122
Sorry+0.121
Uh+0.119
Listen+0.117
Neg logit contributions
 up-0.128
ish-0.116
lers-0.111
--0.109
 matches-0.108
Token to
Token ID284
Feature activation-1.77e+01

Pos logit contributions
Kids+0.135
Um+0.132
Exc+0.131
Don+0.128
Listen+0.126
Neg logit contributions
ish-0.110
 up-0.100
an-0.098
lers-0.094
ates-0.091
Token destroy
Token ID4117
Feature activation-1.11e+01

Pos logit contributions
Um+0.119
Sorry+0.117
Kids+0.116
Hmm+0.116
Exc+0.114
Neg logit contributions
 up-0.138
ish-0.135
lers-0.127
ishes-0.125
ishing-0.122
Token things
Token ID1243
Feature activation-1.14e+01

Pos logit contributions
Hmm+0.095
Exc+0.095
Kids+0.095
Um+0.092
Dad+0.090
Neg logit contributions
ish-0.058
 up-0.054
an-0.051
ishing-0.050
igs-0.049
Token.
Token ID13
Feature activation-2.39e+01

Pos logit contributions
Don+0.107
Sorry+0.107
Hmm+0.106
Exc+0.106
Well+0.106
Neg logit contributions
ish-0.051
 up-0.047
ishes-0.046
ric-0.046
ans-0.045
Token He
Token ID679
Feature activation-1.74e+01

Pos logit contributions
Sorry+0.171
Uh+0.169
Listen+0.164
By+0.164
Kids+0.163
Neg logit contributions
 up-0.189
ish-0.166
 things-0.160
--0.154
lers-0.150
Token saw
Token ID2497
Feature activation-1.72e+01

Pos logit contributions
Kids+0.125
Hmm+0.125
Sorry+0.124
Uh+0.122
Um+0.120
Neg logit contributions
 up-0.137
ish-0.126
lers-0.120
 matches-0.119
--0.116
Token L
Token ID406
Feature activation-1.31e+01

Pos logit contributions
Um+0.155
Kids+0.155
Hmm+0.153
Sorry+0.151
Uh+0.151
Neg logit contributions
 up-0.098
ish-0.097
lers-0.084
ishes-0.081
ishing-0.081
Tokenila
Token ID10102
Feature activation-1.45e+01

Pos logit contributions
Kids+0.078
Um+0.075
Sorry+0.074
Hmm+0.074
Uh+0.074
Neg logit contributions
 up-0.121
ish-0.114
lers-0.108
 matches-0.104
ishes-0.104
Token and
Token ID290
Feature activation-1.81e+01

Pos logit contributions
Don+0.112
Exc+0.107
Sorry+0.103
Uh+0.102
Um+0.102
Neg logit contributions
igs-0.081
ish-0.080
ishes-0.080
ans-0.076
ates-0.075
Token the
Token ID262
Feature activation-1.48e+01

Pos logit contributions
Hmm+0.095
Exc+0.095
Don+0.095
Dad+0.091
Um+0.091
Neg logit contributions
ish-0.116
ric-0.106
ishing-0.103
ishes-0.102
ans-0.101
Token toy
Token ID13373
Feature activation-8.64e+00

Pos logit contributions
Uh+0.142
Sorry+0.140
Hmm+0.139
By+0.137
Um+0.137
Neg logit contributions
ish-0.068
ishes-0.063
 up-0.063
ans-0.062
ric-0.061
Token car
Token ID1097
Feature activation-8.09e+00

Pos logit contributions
Hmm+0.087
Um+0.087
Kids+0.087
Sorry+0.086
Uh+0.086
Neg logit contributions
 up-0.038
ish-0.034
lers-0.029
ishes-0.029
ans-0.028
Token all
Token ID477
Feature activation-1.08e+01

Pos logit contributions
By+0.067
Hmm+0.066
Don+0.065
Well+0.064
Okay+0.064
Neg logit contributions
ish-0.040
ric-0.040
ishes-0.037
igs-0.037
ishing-0.035
Token day
Token ID1110
Feature activation-1.28e+01

Pos logit contributions
Well+0.059
That+0.058
Don+0.058
Listen+0.057
Kids+0.057
Neg logit contributions
ishes-0.089
ish-0.088
igs-0.087
lers-0.087
ans-0.084
Token.
Token ID13
Feature activation-2.39e+01

Pos logit contributions
By+0.106
Dad+0.103
Don+0.099
Sorry+0.098
Kids+0.098
Neg logit contributions
ish-0.077
ric-0.074
ishes-0.073
 up-0.066
ishing-0.064
Token He
Token ID679
Feature activation-1.74e+01

Pos logit contributions
Kids+0.143
Um+0.140
Uh+0.140
Sorry+0.139
Hmm+0.138
Neg logit contributions
 up-0.226
ish-0.205
 matches-0.194
lers-0.190
--0.190
Token drove
Token ID10357
Feature activation-1.47e+01

Pos logit contributions
Hmm+0.131
Kids+0.131
Um+0.130
Uh+0.130
Sorry+0.129
Neg logit contributions
 up-0.133
ish-0.121
lers-0.113
--0.111
 matches-0.110
Token it
Token ID340
Feature activation-1.34e+01

Pos logit contributions
Hmm+0.129
Don+0.129
Kids+0.127
By+0.127
Sorry+0.127
Neg logit contributions
ish-0.081
lers-0.073
ric-0.072
ishes-0.071
 up-0.070
Token around
Token ID1088
Feature activation-1.50e+01

Pos logit contributions
By+0.127
Exc+0.126
Don+0.126
Hmm+0.125
Um+0.125
Neg logit contributions
ish-0.061
ishes-0.054
ric-0.054
lers-0.051
 up-0.050
Token the
Token ID262
Feature activation-1.44e+01

Pos logit contributions
Don+0.132
Kids+0.130
Um+0.130
Hmm+0.130
By+0.129
Neg logit contributions
ish-0.086
lers-0.077
ishes-0.076
 up-0.073
ans-0.072
Token,"
Token ID553
Feature activation-2.17e+01

Pos logit contributions
Sorry+0.127
Um+0.124
Okay+0.120
Hmm+0.117
Exc+0.115
Neg logit contributions
lers-0.103
ning-0.097
ish-0.095
ates-0.094
ishes-0.093
Token Lily
Token ID20037
Feature activation-1.52e+01

Pos logit contributions
Hmm+0.196
Um+0.195
Sorry+0.194
Kids+0.194
Uh+0.194
Neg logit contributions
 up-0.127
 things-0.099
 toys-0.098
--0.097
 in-0.094
Token begged
Token ID32946
Feature activation-1.72e+01

Pos logit contributions
Kids+0.107
Uh+0.107
Hmm+0.104
Sorry+0.104
Um+0.102
Neg logit contributions
 up-0.127
ish-0.115
--0.110
lers-0.105
 matches-0.102
Token.
Token ID13
Feature activation-2.29e+01

Pos logit contributions
Kids+0.207
Um+0.207
Hmm+0.206
Uh+0.206
Sorry+0.205
Neg logit contributions
 up-0.054
ish-0.042
lers-0.033
--0.030
 matches-0.028
Token\n
Token ID198
Feature activation-2.34e+01

Pos logit contributions
Uh+0.205
Hmm+0.202
Um+0.196
Sorry+0.193
Kids+0.192
Neg logit contributions
 up-0.153
 out-0.110
 where-0.110
 in-0.108
.-0.107
Token\n
Token ID198
Feature activation-2.39e+01

Pos logit contributions
Uh+0.209
Hmm+0.209
Um+0.205
Sorry+0.195
Kids+0.187
Neg logit contributions
 up-0.153
--0.123
 in-0.107
 out-0.106
ish-0.104
TokenTom
Token ID13787
Feature activation-1.61e+01

Pos logit contributions
Um+0.136
Hmm+0.127
Uh+0.125
Well+0.124
Kids+0.122
Neg logit contributions
 up-0.214
--0.201
 things-0.181
ish-0.178
 toys-0.178
Token shook
Token ID14682
Feature activation-1.72e+01

Pos logit contributions
Kids+0.112
Hmm+0.111
Uh+0.110
Um+0.110
Sorry+0.109
Neg logit contributions
 up-0.133
ish-0.123
lers-0.114
--0.113
 matches-0.109
Token his
Token ID465
Feature activation-1.44e+01

Pos logit contributions
Kids+0.139
Don+0.138
Um+0.137
Hmm+0.137
Sorry+0.136
Neg logit contributions
ish-0.111
 up-0.107
lers-0.102
ishes-0.099
 matches-0.097
Token head
Token ID1182
Feature activation-1.01e+01

Pos logit contributions
Um+0.118
Hmm+0.116
Uh+0.114
Dad+0.113
Exc+0.112
Neg logit contributions
 up-0.072
lers-0.072
 matches-0.071
ishes-0.070
ric-0.069
Token.
Token ID13
Feature activation-2.39e+01

Pos logit contributions
Don+0.103
Sorry+0.103
Um+0.102
Hmm+0.102
Exc+0.102
Neg logit contributions
ish-0.040
 up-0.039
igs-0.036
ishes-0.036
ric-0.035
Token,"
Token ID553
Feature activation-2.23e+01

Pos logit contributions
Um+0.076
Exc+0.075
Dad+0.075
Hmm+0.074
Sorry+0.074
Neg logit contributions
ric-0.044
ish-0.043
ning-0.039
lers-0.038
 up-0.038
Token Tom
Token ID4186
Feature activation-1.45e+01

Pos logit contributions
Uh+0.167
Kids+0.167
Hmm+0.165
Sorry+0.164
Um+0.164
Neg logit contributions
 up-0.175
ish-0.150
lers-0.139
--0.137
 matches-0.136
Token says
Token ID1139
Feature activation-1.82e+01

Pos logit contributions
Um+0.105
Kids+0.105
Hmm+0.104
Sorry+0.103
Uh+0.103
Neg logit contributions
 up-0.112
ish-0.105
lers-0.098
--0.094
 matches-0.093
Token.
Token ID13
Feature activation-2.28e+01

Pos logit contributions
Um+0.171
Sorry+0.170
Don+0.166
Kids+0.164
Exc+0.164
Neg logit contributions
ish-0.089
ric-0.087
ishes-0.082
igs-0.078
ans-0.078
Token\n
Token ID198
Feature activation-2.34e+01

Pos logit contributions
Hmm+0.186
Uh+0.184
Sorry+0.182
Kids+0.180
Um+0.178
Neg logit contributions
 up-0.161
--0.128
 in-0.122
 out-0.122
 matches-0.122
Token\n
Token ID198
Feature activation-2.38e+01

Pos logit contributions
Uh+0.165
Hmm+0.165
Um+0.164
Sorry+0.161
Kids+0.160
Neg logit contributions
 up-0.192
ish-0.167
--0.162
lers-0.157
 matches-0.153
TokenHe
Token ID1544
Feature activation-1.86e+01

Pos logit contributions
Um+0.139
Hmm+0.138
Uh+0.137
Kids+0.135
Well+0.128
Neg logit contributions
 up-0.220
--0.197
ish-0.193
ans-0.184
an-0.181
Token opens
Token ID9808
Feature activation-1.46e+01

Pos logit contributions
Um+0.141
Exc+0.138
Kids+0.135
Hmm+0.133
Uh+0.133
Neg logit contributions
 up-0.129
ish-0.129
lers-0.116
ishing-0.115
ishes-0.113
Token the
Token ID262
Feature activation-1.45e+01

Pos logit contributions
Exc+0.100
Um+0.100
Hmm+0.099
Don+0.099
Dad+0.098
Neg logit contributions
ish-0.108
lers-0.096
ishes-0.094
ric-0.094
ans-0.093
Token door
Token ID3420
Feature activation-1.06e+01

Pos logit contributions
Uh+0.110
Hmm+0.107
Um+0.106
Exc+0.102
Don+0.101
Neg logit contributions
ish-0.099
 up-0.098
ishing-0.095
igs-0.094
lers-0.094
Token.
Token ID13
Feature activation-2.37e+01

Pos logit contributions
Kids+0.120
Um+0.119
Hmm+0.119
Uh+0.119
Sorry+0.118
Neg logit contributions
 up-0.041
ish-0.034
lers-0.028
--0.027
 matches-0.025
Tokenila
Token ID10102
Feature activation-1.52e+01

Pos logit contributions
Um+0.088
Kids+0.087
Hmm+0.087
Sorry+0.087
Uh+0.087
Neg logit contributions
 up-0.110
ish-0.102
lers-0.095
--0.093
 matches-0.092
Token and
Token ID290
Feature activation-1.78e+01

Pos logit contributions
Kids+0.149
Um+0.148
Hmm+0.147
Sorry+0.147
Uh+0.146
Neg logit contributions
 up-0.079
ish-0.070
lers-0.063
--0.058
 matches-0.058
Token Tom
Token ID4186
Feature activation-1.42e+01

Pos logit contributions
Uh+0.151
Kids+0.151
Um+0.150
Hmm+0.150
Sorry+0.149
Neg logit contributions
 up-0.120
ish-0.106
lers-0.095
--0.094
 matches-0.092
Token are
Token ID389
Feature activation-1.60e+01

Pos logit contributions
Hmm+0.106
Um+0.106
Uh+0.105
Sorry+0.102
Kids+0.102
Neg logit contributions
 up-0.112
ish-0.103
--0.092
lers-0.092
ishes-0.091
Token friends
Token ID2460
Feature activation-1.76e+01

Pos logit contributions
Kids+0.151
Sorry+0.151
By+0.149
Um+0.149
Dad+0.146
Neg logit contributions
ish-0.076
lers-0.073
 up-0.069
an-0.069
ric-0.063
Token.
Token ID13
Feature activation-2.38e+01

Pos logit contributions
Sorry+0.170
Don+0.168
Exc+0.168
Kids+0.167
Um+0.166
Neg logit contributions
ish-0.086
 up-0.080
ishes-0.074
igs-0.074
ans-0.072
Token They
Token ID1119
Feature activation-1.94e+01

Pos logit contributions
Sorry+0.197
Uh+0.190
Hmm+0.190
Don+0.188
Um+0.184
Neg logit contributions
 up-0.147
ish-0.133
.-0.125
 clay-0.124
 where-0.124
Token like
Token ID588
Feature activation-1.81e+01

Pos logit contributions
Uh+0.170
Hmm+0.169
Um+0.169
Kids+0.169
Sorry+0.167
Neg logit contributions
 up-0.124
ish-0.112
lers-0.099
--0.097
 matches-0.097
Token to
Token ID284
Feature activation-1.81e+01

Pos logit contributions
Kids+0.162
Um+0.157
Don+0.157
Listen+0.155
Sorry+0.154
Neg logit contributions
 up-0.106
ish-0.105
lers-0.096
ishing-0.089
an-0.088
Token play
Token ID711
Feature activation-1.56e+01

Pos logit contributions
Kids+0.118
Um+0.115
Sorry+0.114
Hmm+0.113
Exc+0.110
Neg logit contributions
 up-0.151
ish-0.142
lers-0.135
ishing-0.130
ishes-0.130
Token with
Token ID351
Feature activation-1.55e+01

Pos logit contributions
Exc+0.150
Sorry+0.149
Kids+0.147
By+0.145
Ben+0.145
Neg logit contributions
ric-0.073
ish-0.062
 up-0.059
lers-0.057
ishes-0.056
TokenBen
Token ID11696
Feature activation-1.48e+01

Pos logit contributions
Kids+0.108
Uh+0.102
Um+0.102
Hmm+0.101
Sorry+0.101
Neg logit contributions
 up-0.137
ish-0.132
lers-0.122
ans-0.119
ishes-0.119
Token and
Token ID290
Feature activation-1.74e+01

Pos logit contributions
Kids+0.132
Um+0.131
Hmm+0.131
Uh+0.130
Sorry+0.130
Neg logit contributions
 up-0.090
ish-0.082
lers-0.074
--0.071
 matches-0.070
Token Lily
Token ID20037
Feature activation-1.57e+01

Pos logit contributions
Kids+0.150
Um+0.150
Hmm+0.150
Uh+0.150
Sorry+0.149
Neg logit contributions
 up-0.113
ish-0.102
lers-0.092
--0.089
 matches-0.088
Token are
Token ID389
Feature activation-1.57e+01

Pos logit contributions
Um+0.105
Kids+0.105
Hmm+0.104
Uh+0.103
Sorry+0.103
Neg logit contributions
 up-0.132
ish-0.123
lers-0.114
--0.111
ishes-0.109
Token twins
Token ID20345
Feature activation-1.68e+01

Pos logit contributions
Kids+0.149
Um+0.145
Sorry+0.143
Well+0.143
Dad+0.143
Neg logit contributions
ish-0.074
lers-0.071
an-0.070
 up-0.068
igs-0.065
Token.
Token ID13
Feature activation-2.38e+01

Pos logit contributions
Sorry+0.156
Kids+0.156
Um+0.154
Well+0.153
Don+0.152
Neg logit contributions
ish-0.083
 up-0.082
igs-0.077
lers-0.074
ans-0.072
Token They
Token ID1119
Feature activation-1.91e+01

Pos logit contributions
Sorry+0.205
Uh+0.203
Hmm+0.203
Um+0.191
Don+0.190
Neg logit contributions
 up-0.155
ish-0.124
 in-0.124
 where-0.124
 out-0.124
Token share
Token ID2648
Feature activation-1.62e+01

Pos logit contributions
Um+0.166
Kids+0.166
Uh+0.166
Hmm+0.165
Sorry+0.163
Neg logit contributions
 up-0.123
ish-0.111
lers-0.099
--0.097
 matches-0.096
Token a
Token ID257
Feature activation-1.22e+01

Pos logit contributions
Exc+0.126
Dad+0.123
Don+0.121
Kids+0.121
Listen+0.120
Neg logit contributions
ish-0.114
igs-0.106
ig-0.105
lers-0.102
ric-0.101
Token room
Token ID2119
Feature activation-1.03e+01

Pos logit contributions
Kids+0.112
Um+0.111
Hmm+0.111
Sorry+0.110
Uh+0.110
Neg logit contributions
 up-0.068
ish-0.064
lers-0.057
ishes-0.056
ishing-0.054
Token and
Token ID290
Feature activation-1.88e+01

Pos logit contributions
Exc+0.084
By+0.084
Sorry+0.083
Um+0.083
Okay+0.083
Neg logit contributions
ish-0.061
ishes-0.056
 up-0.053
ric-0.053
igs-0.052
Token hear
Token ID3285
Feature activation-1.76e+01

Pos logit contributions
Kids+0.132
Hmm+0.131
Sorry+0.130
Um+0.129
Uh+0.129
Neg logit contributions
 up-0.131
ish-0.113
 out-0.106
 in-0.103
 matches-0.101
Token their
Token ID511
Feature activation-1.32e+01

Pos logit contributions
Kids+0.156
Um+0.153
Exc+0.151
Sorry+0.150
Hmm+0.150
Neg logit contributions
ish-0.103
 up-0.094
ishes-0.089
lers-0.089
ishing-0.088
Token moms
Token ID34015
Feature activation-1.30e+01

Pos logit contributions
Kids+0.111
Um+0.111
Hmm+0.110
Sorry+0.110
Uh+0.109
Neg logit contributions
 up-0.087
ish-0.080
lers-0.073
--0.070
 matches-0.069
Token calling
Token ID4585
Feature activation-1.42e+01

Pos logit contributions
Kids+0.108
Um+0.102
Sorry+0.101
Exc+0.101
Don+0.101
Neg logit contributions
ish-0.080
 up-0.079
igs-0.077
lers-0.077
 matches-0.072
Token them
Token ID606
Feature activation-1.15e+01

Pos logit contributions
Um+0.145
Kids+0.145
Hmm+0.145
Uh+0.144
Sorry+0.144
Neg logit contributions
 up-0.070
ish-0.060
lers-0.053
--0.050
 matches-0.049
Token.
Token ID13
Feature activation-2.38e+01

Pos logit contributions
Kids+0.112
Listen+0.107
Don+0.106
Exc+0.105
Um+0.105
Neg logit contributions
ish-0.054
igs-0.049
lers-0.047
ishing-0.046
ishes-0.044
Token They
Token ID1119
Feature activation-1.97e+01

Pos logit contributions
Hmm+0.187
Uh+0.185
Um+0.178
Sorry+0.177
Kids+0.169
Neg logit contributions
 up-0.185
ish-0.151
 out-0.142
 where-0.140
--0.139
Token run
Token ID1057
Feature activation-1.52e+01

Pos logit contributions
Kids+0.149
Um+0.148
Hmm+0.148
Sorry+0.147
Uh+0.146
Neg logit contributions
 up-0.147
ish-0.136
lers-0.126
--0.121
 matches-0.120
Token back
Token ID736
Feature activation-1.49e+01

Pos logit contributions
Kids+0.152
Um+0.150
Sorry+0.149
Hmm+0.149
Uh+0.149
Neg logit contributions
 up-0.073
ish-0.069
lers-0.061
ishes-0.057
 matches-0.056
Token to
Token ID284
Feature activation-1.83e+01

Pos logit contributions
Kids+0.104
Exc+0.099
Don+0.098
Well+0.096
Listen+0.094
Neg logit contributions
ish-0.103
ery-0.099
ric-0.098
lers-0.097
an-0.095
Token their
Token ID511
Feature activation-1.37e+01

Pos logit contributions
Kids+0.161
Um+0.153
Don+0.152
Listen+0.150
Hmm+0.150
Neg logit contributions
ish-0.112
 up-0.108
lers-0.107
ishes-0.098
igs-0.095
Token I
Token ID314
Feature activation-1.80e+01

Pos logit contributions
Kids+0.148
Um+0.147
Hmm+0.146
Uh+0.146
Sorry+0.145
Neg logit contributions
 up-0.143
ish-0.131
lers-0.122
--0.117
 matches-0.116
Token see
Token ID766
Feature activation-1.55e+01

Pos logit contributions
Kids+0.136
Hmm+0.134
Uh+0.134
Um+0.134
Sorry+0.132
Neg logit contributions
 up-0.136
ish-0.126
lers-0.115
--0.113
 matches-0.111
Token it
Token ID340
Feature activation-1.08e+01

Pos logit contributions
Um+0.107
Exc+0.105
Sorry+0.103
Listen+0.103
Hi+0.102
Neg logit contributions
ric-0.109
ish-0.108
an-0.104
gy-0.103
ans-0.102
Token?"
Token ID1701
Feature activation-2.40e+01

Pos logit contributions
Sorry+0.106
Um+0.106
Kids+0.103
Uh+0.103
Hmm+0.102
Neg logit contributions
ish-0.050
 up-0.049
lers-0.047
ishes-0.042
ans-0.042
Token\n
Token ID198
Feature activation-2.33e+01

Pos logit contributions
Uh+0.208
Hmm+0.204
Sorry+0.203
Um+0.199
Listen+0.193
Neg logit contributions
 up-0.163
--0.132
ish-0.127
 in-0.124
 things-0.122
Token\n
Token ID198
Feature activation-2.38e+01

Pos logit contributions
Uh+0.199
Hmm+0.196
Um+0.191
Sorry+0.188
Kids+0.180
Neg logit contributions
 up-0.166
--0.138
ish-0.128
 in-0.123
 matches-0.120
TokenL
Token ID43
Feature activation-1.44e+01

Pos logit contributions
Um+0.142
Warning+0.141
Uh+0.139
Hmm+0.138
Listen+0.138
Neg logit contributions
 up-0.187
--0.185
 in-0.173
ish-0.163
 with-0.162
Tokenily
Token ID813
Feature activation-1.76e+01

Pos logit contributions
Kids+0.076
Um+0.072
Hmm+0.071
Sorry+0.071
Uh+0.070
Neg logit contributions
 up-0.143
ish-0.135
lers-0.130
ishes-0.124
 matches-0.124
Token nods
Token ID34833
Feature activation-1.60e+01

Pos logit contributions
Kids+0.134
Exc+0.133
Um+0.132
Hmm+0.128
Well+0.128
Neg logit contributions
 up-0.114
ish-0.112
lers-0.107
igs-0.105
ishing-0.102
Token.
Token ID13
Feature activation-2.40e+01

Pos logit contributions
Kids+0.186
Um+0.185
Sorry+0.184
Hmm+0.184
Uh+0.182
Neg logit contributions
 up-0.053
ish-0.046
lers-0.039
 matches-0.034
ishes-0.033
Token She
Token ID1375
Feature activation-1.94e+01

Pos logit contributions
Uh+0.144
Hmm+0.140
Sorry+0.137
Um+0.137
Kids+0.131
Neg logit contributions
 up-0.238
ish-0.206
 out-0.195
 clay-0.191
 in-0.188
Token the
Token ID262
Feature activation-1.30e+01

Pos logit contributions
Don+0.094
Exc+0.094
Kids+0.093
Hmm+0.092
Dad+0.091
Neg logit contributions
ish-0.073
ric-0.071
 up-0.069
ning-0.069
igs-0.068
Token doll
Token ID3654
Feature activation-9.73e+00

Pos logit contributions
Um+0.119
Hmm+0.118
Uh+0.116
Hi+0.116
Kids+0.115
Neg logit contributions
ish-0.067
 up-0.063
ans-0.063
lers-0.062
ishing-0.060
Token.
Token ID13
Feature activation-2.33e+01

Pos logit contributions
Hmm+0.081
Exc+0.078
Um+0.078
Don+0.077
Sure+0.076
Neg logit contributions
igs-0.051
ish-0.051
ishing-0.050
 up-0.050
cks-0.049
Token Ben
Token ID3932
Feature activation-1.60e+01

Pos logit contributions
Uh+0.177
Hmm+0.174
Sorry+0.172
Um+0.170
Kids+0.166
Neg logit contributions
 up-0.186
ish-0.159
 out-0.144
lers-0.144
 matches-0.144
Token smiles
Token ID21845
Feature activation-1.33e+01

Pos logit contributions
Kids+0.122
Um+0.120
Exc+0.118
Well+0.117
Hmm+0.117
Neg logit contributions
 up-0.106
ish-0.104
lers-0.096
ishing-0.094
ric-0.093
Token.
Token ID13
Feature activation-2.38e+01

Pos logit contributions
Kids+0.148
Um+0.148
Sorry+0.147
Hmm+0.147
Uh+0.147
Neg logit contributions
 up-0.051
ish-0.044
lers-0.038
 matches-0.034
ishes-0.034
Token He
Token ID679
Feature activation-1.84e+01

Pos logit contributions
Hmm+0.165
Uh+0.165
Sorry+0.162
Um+0.160
Kids+0.157
Neg logit contributions
 up-0.201
ish-0.171
 matches-0.161
--0.161
lers-0.160
Token holds
Token ID6622
Feature activation-1.67e+01

Pos logit contributions
Um+0.142
Kids+0.141
Exc+0.138
Hmm+0.137
Well+0.136
Neg logit contributions
 up-0.124
ish-0.119
lers-0.111
ishing-0.107
--0.106
Token the
Token ID262
Feature activation-1.59e+01

Pos logit contributions
Kids+0.128
Hmm+0.127
Um+0.126
Don+0.123
Exc+0.122
Neg logit contributions
ish-0.107
ric-0.103
 up-0.100
ans-0.099
igs-0.097
Token doll
Token ID3654
Feature activation-1.12e+01

Pos logit contributions
Hmm+0.148
Uh+0.147
Um+0.146
Sorry+0.143
Don+0.143
Neg logit contributions
ish-0.080
 up-0.077
ans-0.075
ishes-0.073
lers-0.072
Token gently
Token ID15165
Feature activation-1.20e+01

Pos logit contributions
Hmm+0.085
Don+0.082
Exc+0.082
Sure+0.080
Okay+0.079
Neg logit contributions
ric-0.066
igs-0.064
ishes-0.062
ans-0.062
ish-0.060
Token the
Token ID262
Feature activation-1.53e+01

Pos logit contributions
Um+0.190
Kids+0.189
Uh+0.189
Hmm+0.188
Sorry+0.188
Neg logit contributions
 up-0.101
ish-0.086
lers-0.077
--0.075
 matches-0.075
Token mechanic
Token ID21600
Feature activation-9.68e+00

Pos logit contributions
Kids+0.114
Um+0.112
Hmm+0.111
Uh+0.111
Sorry+0.110
Neg logit contributions
 up-0.120
ish-0.108
--0.100
lers-0.099
 matches-0.099
Token who
Token ID508
Feature activation-1.80e+01

Pos logit contributions
Hmm+0.082
Don+0.082
Um+0.080
Well+0.080
Kids+0.079
Neg logit contributions
igs-0.053
 up-0.049
ish-0.049
cks-0.047
ans-0.046
Token helped
Token ID4193
Feature activation-1.64e+01

Pos logit contributions
Kids+0.126
Uh+0.126
Um+0.126
Hmm+0.125
Sorry+0.123
Neg logit contributions
 up-0.148
ish-0.135
lers-0.126
--0.125
 matches-0.124
Token him
Token ID683
Feature activation-1.39e+01

Pos logit contributions
Kids+0.129
Um+0.128
Hmm+0.127
Sorry+0.126
Uh+0.125
Neg logit contributions
 up-0.115
ish-0.109
lers-0.100
ishes-0.095
--0.093
Token.
Token ID13
Feature activation-2.38e+01

Pos logit contributions
Um+0.092
Hmm+0.092
Don+0.091
Exc+0.091
Sorry+0.090
Neg logit contributions
lers-0.089
ish-0.089
ishes-0.085
ans-0.083
igs-0.082
Token<|endoftext|>
Token ID50256
Feature activation-1.82e+01

Pos logit contributions
Uh+0.198
Sorry+0.189
Hmm+0.186
Um+0.184
Kids+0.177
Neg logit contributions
 up-0.188
ish-0.147
 where-0.141
--0.140
 for-0.139
TokenOnce
Token ID7454
Feature activation-1.86e+01

Pos logit contributions
Kids+0.112
Um+0.111
Hmm+0.110
Sorry+0.110
Uh+0.110
Neg logit contributions
 up-0.160
ish-0.150
lers-0.141
--0.136
 matches-0.136
Token upon
Token ID2402
Feature activation-1.55e+01

Pos logit contributions
Kids+0.142
Um+0.140
Well+0.137
Hmm+0.137
Uh+0.136
Neg logit contributions
 up-0.129
ish-0.123
lers-0.116
ans-0.110
ishes-0.110
Token a
Token ID257
Feature activation-1.65e+01

Pos logit contributions
Don+0.145
Um+0.143
Hmm+0.143
By+0.142
Sorry+0.142
Neg logit contributions
ish-0.087
 up-0.082
ishes-0.076
lers-0.076
ishing-0.072
Token time
Token ID640
Feature activation-1.79e+01

Pos logit contributions
Kids+0.142
Um+0.141
Uh+0.140
Hmm+0.140
Sorry+0.139
Neg logit contributions
 up-0.105
ish-0.097
lers-0.088
ishes-0.085
 matches-0.084
Token.
Token ID13
Feature activation-2.28e+01

Pos logit contributions
Kids+0.151
Um+0.150
Sorry+0.149
Don+0.148
Hmm+0.148
Neg logit contributions
 up-0.064
ish-0.059
lers-0.052
igs-0.052
ans-0.049
Token They
Token ID1119
Feature activation-1.85e+01

Pos logit contributions
Uh+0.192
Hmm+0.191
Um+0.186
Sorry+0.182
Kids+0.177
Neg logit contributions
 up-0.163
 where-0.128
ish-0.125
--0.122
ric-0.122
Token lived
Token ID5615
Feature activation-1.58e+01

Pos logit contributions
Kids+0.138
Um+0.137
Hmm+0.137
Uh+0.136
Sorry+0.135
Neg logit contributions
 up-0.142
ish-0.132
lers-0.122
--0.117
 matches-0.117
Token happily
Token ID18177
Feature activation-1.30e+01

Pos logit contributions
By+0.117
Don+0.105
Sorry+0.105
Okay+0.101
Exc+0.096
Neg logit contributions
ric-0.117
ish-0.116
lers-0.114
ishes-0.104
ning-0.102
Token together
Token ID1978
Feature activation-1.51e+01

Pos logit contributions
By+0.075
Sorry+0.072
Um+0.067
Kids+0.065
Don+0.065
Neg logit contributions
 up-0.120
ish-0.114
igs-0.106
 matches-0.105
ishes-0.105
Token.
Token ID13
Feature activation-2.38e+01

Pos logit contributions
By+0.140
Well+0.121
Once+0.121
Don+0.120
Sorry+0.120
Neg logit contributions
ish-0.075
an-0.071
ans-0.069
ric-0.068
ishes-0.067
Token<|endoftext|>
Token ID50256
Feature activation-1.81e+01

Pos logit contributions
Uh+0.202
Hmm+0.200
Um+0.198
Sorry+0.198
Kids+0.189
Neg logit contributions
 up-0.168
 where-0.131
ish-0.130
--0.128
ric-0.128
TokenOne
Token ID3198
Feature activation-1.89e+01

Pos logit contributions
Kids+0.105
Um+0.102
Hmm+0.101
Sorry+0.100
Uh+0.100
Neg logit contributions
 up-0.164
ish-0.159
lers-0.149
ishes-0.144
--0.142
Token day
Token ID1110
Feature activation-1.74e+01

Pos logit contributions
Kids+0.141
By+0.137
Um+0.137
Listen+0.136
Sorry+0.136
Neg logit contributions
 up-0.129
ish-0.127
lers-0.123
ishes-0.122
igs-0.116
Token,
Token ID11
Feature activation-2.21e+01

Pos logit contributions
Sorry+0.162
By+0.157
Kids+0.156
Um+0.155
Uh+0.153
Neg logit contributions
ish-0.092
 up-0.089
ishes-0.086
ric-0.084
ans-0.082
Token a
Token ID257
Feature activation-1.65e+01

Pos logit contributions
Sorry+0.211
Um+0.210
Uh+0.210
Don+0.210
Kids+0.209
Neg logit contributions
ish-0.111
 up-0.108
lers-0.101
ishes-0.099
ans-0.095
Token elephant
Token ID20950
Feature activation-9.06e+00

Pos logit contributions
Um+0.153
Hmm+0.153
Uh+0.152
Sorry+0.150
Kids+0.149
Neg logit contributions
 up-0.068
ish-0.061
lers-0.056
ishes-0.051
igs-0.051
Token again
Token ID757
Feature activation-1.30e+01

Pos logit contributions
Exc+0.058
Sure+0.058
By+0.058
From+0.056
So+0.056
Neg logit contributions
igs-0.060
ric-0.060
cks-0.060
ans-0.055
ishing-0.055
Token and
Token ID290
Feature activation-1.87e+01

Pos logit contributions
Don+0.109
By+0.108
Sorry+0.108
Well+0.108
Um+0.107
Neg logit contributions
ish-0.074
 up-0.072
 matches-0.071
ishes-0.070
lers-0.069
Token kissed
Token ID27946
Feature activation-1.59e+01

Pos logit contributions
Kids+0.107
Sorry+0.105
Um+0.105
Don+0.103
Hmm+0.101
Neg logit contributions
 up-0.166
ish-0.164
lers-0.154
ric-0.151
ans-0.150
Token it
Token ID340
Feature activation-1.17e+01

Pos logit contributions
Don+0.135
Sorry+0.134
Kids+0.132
Hmm+0.132
Um+0.131
Neg logit contributions
ish-0.098
 up-0.095
ans-0.090
lers-0.089
igs-0.088
Token.
Token ID13
Feature activation-2.37e+01

Pos logit contributions
By+0.110
Sorry+0.107
Don+0.106
Exc+0.106
Um+0.105
Neg logit contributions
ish-0.056
ric-0.055
ans-0.054
ishes-0.053
igs-0.050
Token She
Token ID1375
Feature activation-1.83e+01

Pos logit contributions
Kids+0.113
Um+0.112
Hmm+0.112
Uh+0.111
Sorry+0.110
Neg logit contributions
 up-0.245
ish-0.229
lers-0.217
--0.212
 matches-0.212
Token said
Token ID531
Feature activation-1.94e+01

Pos logit contributions
Kids+0.117
Um+0.117
Hmm+0.116
Uh+0.115
Sorry+0.115
Neg logit contributions
 up-0.158
ish-0.147
lers-0.138
--0.134
 matches-0.132
Token thank
Token ID5875
Feature activation-1.55e+01

Pos logit contributions
Sorry+0.129
Don+0.126
By+0.126
Um+0.124
Kids+0.123
Neg logit contributions
ish-0.156
 up-0.152
lers-0.145
ric-0.144
ans-0.144
Token you
Token ID345
Feature activation-1.71e+01

Pos logit contributions
Sorry+0.130
Kids+0.130
Um+0.129
Hmm+0.128
Don+0.127
Neg logit contributions
 up-0.102
ish-0.096
lers-0.088
ishes-0.084
 matches-0.084
Token and
Token ID290
Feature activation-1.95e+01

Pos logit contributions
Don+0.143
Sorry+0.141
By+0.141
Um+0.140
Exc+0.140
Neg logit contributions
ish-0.096
cks-0.090
ishes-0.090
ishing-0.089
lers-0.089
Token train
Token ID4512
Feature activation-6.33e+00

Pos logit contributions
Kids+0.068
Um+0.068
Sorry+0.068
Hmm+0.067
Don+0.066
Neg logit contributions
 up-0.036
ish-0.033
lers-0.029
ishing-0.029
igs-0.029
Token to
Token ID284
Feature activation-1.72e+01

Pos logit contributions
Um+0.061
Kids+0.061
Sorry+0.060
Don+0.059
Hmm+0.059
Neg logit contributions
ish-0.029
 up-0.028
igs-0.026
lers-0.026
ishing-0.025
Token Lily
Token ID20037
Feature activation-1.33e+01

Pos logit contributions
Um+0.148
Hmm+0.148
Kids+0.147
Uh+0.147
Sorry+0.146
Neg logit contributions
 up-0.111
ish-0.099
lers-0.090
--0.088
 matches-0.087
Token and
Token ID290
Feature activation-1.71e+01

Pos logit contributions
Exc+0.098
Kids+0.096
Don+0.094
Um+0.093
Hmm+0.092
Neg logit contributions
igs-0.077
ric-0.069
lers-0.069
ishing-0.068
ish-0.068
Token Sam
Token ID3409
Feature activation-1.34e+01

Pos logit contributions
Um+0.094
Kids+0.092
Hmm+0.086
Exc+0.085
Don+0.085
Neg logit contributions
ish-0.153
 up-0.150
ishing-0.146
lers-0.145
igs-0.144
Token.
Token ID13
Feature activation-2.37e+01

Pos logit contributions
Kids+0.107
Exc+0.107
Don+0.106
Um+0.104
Sure+0.102
Neg logit contributions
igs-0.071
ric-0.066
ish-0.066
ning-0.064
lers-0.062
Token He
Token ID679
Feature activation-1.80e+01

Pos logit contributions
Uh+0.189
Hmm+0.189
Um+0.184
Sorry+0.182
Listen+0.177
Neg logit contributions
 up-0.174
--0.140
ish-0.139
 matches-0.135
 things-0.134
Token says
Token ID1139
Feature activation-1.78e+01

Pos logit contributions
Um+0.118
Kids+0.117
Exc+0.115
Well+0.110
Don+0.106
Neg logit contributions
ish-0.137
 up-0.132
ishing-0.129
lers-0.124
--0.123
Token they
Token ID484
Feature activation-1.88e+01

Pos logit contributions
Kids+0.170
Um+0.169
Sorry+0.168
Hmm+0.166
Uh+0.164
Neg logit contributions
 up-0.096
ish-0.090
lers-0.080
ishes-0.075
ishing-0.075
Token can
Token ID460
Feature activation-1.75e+01

Pos logit contributions
Kids+0.131
Um+0.131
Hmm+0.129
Sorry+0.128
Uh+0.126
Neg logit contributions
 up-0.149
ish-0.139
lers-0.130
--0.126
ishing-0.124
Token play
Token ID711
Feature activation-1.47e+01

Pos logit contributions
Kids+0.130
Um+0.129
Hmm+0.128
Sorry+0.128
Uh+0.127
Neg logit contributions
 up-0.133
ish-0.123
lers-0.114
--0.109
ishes-0.109
Token,
Token ID11
Feature activation-1.80e+01

Pos logit contributions
By+0.112
Exc+0.112
Kids+0.110
Don+0.109
Hi+0.105
Neg logit contributions
ish-0.091
ric-0.086
ishes-0.084
an-0.076
ries-0.076
Token jumped
Token ID11687
Feature activation-1.29e+01

Pos logit contributions
Kids+0.124
Don+0.123
Exc+0.119
Uh+0.119
Well+0.117
Neg logit contributions
ish-0.138
 up-0.138
lers-0.127
ishes-0.123
ishing-0.122
Token,
Token ID11
Feature activation-1.72e+01

Pos logit contributions
Exc+0.104
By+0.102
Together+0.098
Don+0.094
Uh+0.092
Neg logit contributions
ish-0.077
ric-0.077
ishes-0.074
ishing-0.069
lers-0.068
Token and
Token ID290
Feature activation-1.84e+01

Pos logit contributions
Don+0.142
Exc+0.140
By+0.139
Well+0.133
Uh+0.132
Neg logit contributions
ish-0.112
ric-0.101
ishes-0.100
lers-0.098
ishing-0.097
Token laughed
Token ID13818
Feature activation-1.40e+01

Pos logit contributions
Kids+0.119
Don+0.118
Sorry+0.118
By+0.118
Well+0.114
Neg logit contributions
ish-0.150
 up-0.139
ric-0.137
ishing-0.135
ans-0.135
Token.
Token ID13
Feature activation-2.37e+01

Pos logit contributions
By+0.129
Don+0.123
Sorry+0.123
Exc+0.120
Okay+0.118
Neg logit contributions
ish-0.076
ric-0.073
lers-0.070
ishes-0.069
ans-0.065
Token When
Token ID1649
Feature activation-1.88e+01

Pos logit contributions
Uh+0.178
Hmm+0.175
Kids+0.173
Um+0.171
Sorry+0.169
Neg logit contributions
 up-0.196
ish-0.158
--0.152
 matches-0.151
 things-0.151
Token the
Token ID262
Feature activation-1.58e+01

Pos logit contributions
Hmm+0.168
Um+0.167
Uh+0.165
Kids+0.164
Exc+0.160
Neg logit contributions
 up-0.124
ish-0.099
--0.095
 things-0.094
 matches-0.092
Token sun
Token ID4252
Feature activation-1.20e+01

Pos logit contributions
Um+0.161
Kids+0.160
Hmm+0.160
Uh+0.158
Sorry+0.158
Neg logit contributions
 up-0.081
ish-0.070
lers-0.062
--0.059
 matches-0.058
Token went
Token ID1816
Feature activation-1.36e+01

Pos logit contributions
Kids+0.077
Sorry+0.076
Uh+0.076
Um+0.075
Hmm+0.074
Neg logit contributions
 up-0.102
ish-0.097
lers-0.092
ishes-0.088
ans-0.087
Token down
Token ID866
Feature activation-1.33e+01

Pos logit contributions
Kids+0.163
Sorry+0.163
Uh+0.162
Um+0.162
Hmm+0.162
Neg logit contributions
 up-0.038
ish-0.033
lers-0.026
 matches-0.022
ishes-0.022
TokenTom
Token ID13787
Feature activation-1.44e+01

Pos logit contributions
Kids+0.101
Uh+0.095
Well+0.095
Um+0.095
Sorry+0.094
Neg logit contributions
 up-0.135
ish-0.132
lers-0.121
ans-0.120
ishes-0.119
Token and
Token ID290
Feature activation-1.76e+01

Pos logit contributions
Kids+0.128
Um+0.125
Sorry+0.124
Hmm+0.124
Don+0.123
Neg logit contributions
 up-0.085
ish-0.077
lers-0.073
igs-0.069
 matches-0.068
Token Sue
Token ID26089
Feature activation-1.40e+01

Pos logit contributions
Kids+0.163
Um+0.163
Hmm+0.163
Uh+0.163
Sorry+0.161
Neg logit contributions
 up-0.103
ish-0.091
lers-0.082
--0.079
 matches-0.078
Token are
Token ID389
Feature activation-1.55e+01

Pos logit contributions
Kids+0.089
Um+0.086
Sorry+0.086
Hmm+0.085
Don+0.084
Neg logit contributions
 up-0.119
ish-0.112
lers-0.105
igs-0.103
 matches-0.101
Token twins
Token ID20345
Feature activation-1.62e+01

Pos logit contributions
Kids+0.144
Dad+0.142
Um+0.142
Sorry+0.141
By+0.141
Neg logit contributions
ish-0.073
lers-0.071
an-0.067
 up-0.067
 matches-0.065
Token.
Token ID13
Feature activation-2.37e+01

Pos logit contributions
Sorry+0.149
Kids+0.147
Um+0.147
Well+0.146
By+0.146
Neg logit contributions
ish-0.080
 up-0.079
igs-0.076
lers-0.072
ishes-0.071
Token They
Token ID1119
Feature activation-1.90e+01

Pos logit contributions
Hmm+0.173
Um+0.168
Uh+0.168
Sorry+0.166
Kids+0.164
Neg logit contributions
 up-0.185
ish-0.154
--0.148
 where-0.148
 in-0.146
Token like
Token ID588
Feature activation-1.77e+01

Pos logit contributions
Kids+0.159
Um+0.159
Hmm+0.158
Uh+0.158
Sorry+0.157
Neg logit contributions
 up-0.128
ish-0.116
lers-0.105
--0.102
 matches-0.101
Token to
Token ID284
Feature activation-1.75e+01

Pos logit contributions
Kids+0.164
Um+0.159
Don+0.158
Listen+0.157
Sorry+0.156
Neg logit contributions
 up-0.101
ish-0.100
lers-0.090
ishing-0.083
an-0.082
Token play
Token ID711
Feature activation-1.51e+01

Pos logit contributions
Kids+0.123
Um+0.122
Sorry+0.120
Hmm+0.120
Uh+0.118
Neg logit contributions
 up-0.139
ish-0.130
lers-0.122
ishes-0.117
ishing-0.115
Token in
Token ID287
Feature activation-1.47e+01

Pos logit contributions
Exc+0.141
Sorry+0.140
Ben+0.138
Sure+0.137
Kids+0.137
Neg logit contributions
ric-0.078
ish-0.065
ishes-0.060
lers-0.058
 up-0.058
Token's
Token ID338
Feature activation+2.12e+01

Pos logit contributions
cks+0.152
ishing+0.151
igs+0.147
ans+0.146
ates+0.142
Neg logit contributions
Sorry-0.183
Hmm-0.182
Um-0.181
Uh-0.176
Okay-0.176
Token wonderful
Token ID7932
Feature activation+2.99e+01

Pos logit contributions
 up+0.135
ish+0.135
lers+0.133
ishes+0.123
ans+0.123
Neg logit contributions
Um-0.175
Hmm-0.172
Sorry-0.169
Uh-0.168
Don-0.168
Token,
Token ID11
Feature activation+1.28e+01

Pos logit contributions
ric+0.191
cks+0.177
ans+0.171
ates+0.167
ep+0.166
Neg logit contributions
Um-0.218
Okay-0.217
Exc-0.215
No-0.212
Kids-0.211
Token my
Token ID616
Feature activation+2.42e+01

Pos logit contributions
 up+0.105
ish+0.102
lers+0.099
ishing+0.092
ric+0.092
Neg logit contributions
Kids-0.086
Um-0.085
Hmm-0.084
Uh-0.083
Sorry-0.082
Token d
Token ID288
Feature activation+3.66e+01

Pos logit contributions
 up+0.160
lers+0.148
cks+0.146
ning+0.146
ans+0.144
Neg logit contributions
Um-0.185
Sorry-0.180
Okay-0.178
Hmm-0.174
Well-0.170
Tokenarl
Token ID7063
Feature activation+5.61e+01

Pos logit contributions
 up+0.328
lers+0.290
ish+0.289
 matches+0.284
ishes+0.271
Neg logit contributions
Kids-0.221
Um-0.216
Sorry-0.214
Hmm-0.210
Uh-0.208
Tokenings
Token ID654
Feature activation+3.04e+01

Pos logit contributions
 up+0.284
ish+0.245
 matches+0.243
lers+0.238
igs+0.207
Neg logit contributions
Sorry-0.526
Kids-0.525
Well-0.518
Okay-0.509
Um-0.509
Token!"
Token ID2474
Feature activation+1.00e+01

Pos logit contributions
 up+0.139
ish+0.118
lers+0.103
-+0.098
 matches+0.095
Neg logit contributions
Um-0.319
Kids-0.319
Hmm-0.318
Uh-0.317
Sorry-0.315
Token Mom
Token ID11254
Feature activation+1.86e+01

Pos logit contributions
 up+0.084
ish+0.076
lers+0.072
-+0.070
 matches+0.069
Neg logit contributions
Hmm-0.067
Kids-0.067
Uh-0.067
Um-0.067
Sorry-0.066
Tokenmy
Token ID1820
Feature activation+2.20e+01

Pos logit contributions
 up+0.155
ish+0.152
igs+0.147
lers+0.144
ans+0.140
Neg logit contributions
Kids-0.116
Um-0.111
Exc-0.106
Sorry-0.106
Well-0.106
Token says
Token ID1139
Feature activation+1.52e+01

Pos logit contributions
 up+0.206
ish+0.204
igs+0.191
lers+0.190
ishing+0.188
Neg logit contributions
Kids-0.116
Um-0.112
Exc-0.107
Sorry-0.106
Well-0.105
Token children
Token ID1751
Feature activation+3.18e+01

Pos logit contributions
Hmm+0.127
Uh+0.124
Um+0.124
Sorry+0.123
Kids+0.122
Neg logit contributions
 up-0.150
ish-0.135
--0.128
lers-0.127
 matches-0.122
Token put
Token ID1234
Feature activation+3.15e+01

Pos logit contributions
 up+0.207
ish+0.201
igs+0.196
ishing+0.184
lers+0.182
Neg logit contributions
Kids-0.249
Sorry-0.244
Um-0.244
Well-0.239
Don-0.236
Token on
Token ID319
Feature activation+2.43e+01

Pos logit contributions
ric+0.160
ish+0.157
igs+0.148
an+0.146
ans+0.140
Neg logit contributions
Kids-0.263
Don-0.252
Hmm-0.249
Sorry-0.247
Exc-0.246
Token their
Token ID511
Feature activation+3.15e+01

Pos logit contributions
ish+0.165
 up+0.141
ishing+0.141
an+0.139
ishes+0.136
Neg logit contributions
Kids-0.191
Exc-0.187
Hmm-0.182
Um-0.180
Don-0.179
Token bath
Token ID7837
Feature activation+3.92e+01

Pos logit contributions
 up+0.171
ish+0.170
ning+0.154
ishing+0.151
igs+0.148
Neg logit contributions
Kids-0.263
Um-0.260
Sorry-0.255
Exc-0.255
Hmm-0.254
Tokenro
Token ID305
Feature activation+6.29e+01

Pos logit contributions
 up+0.243
ish+0.243
cks+0.231
lers+0.216
ric+0.216
Neg logit contributions
Kids-0.318
Sorry-0.287
Okay-0.283
Exc-0.276
Well-0.275
Tokenbes
Token ID12636
Feature activation+4.07e+01

Pos logit contributions
ish+0.503
ishes+0.472
 up+0.439
 matches+0.435
ries+0.416
Neg logit contributions
Kids-0.379
Ben-0.360
Exc-0.359
Don-0.349
By-0.347
Token and
Token ID290
Feature activation+1.86e+01

Pos logit contributions
ish+0.184
 up+0.173
igs+0.173
ric+0.168
 matches+0.159
Neg logit contributions
Exc-0.378
Kids-0.377
Sorry-0.377
Don-0.374
Well-0.369
Token ran
Token ID4966
Feature activation+2.53e+01

Pos logit contributions
 up+0.143
ish+0.142
lers+0.128
ric+0.127
an+0.126
Neg logit contributions
Kids-0.129
Sorry-0.124
Um-0.124
Don-0.122
Hmm-0.121
Token around
Token ID1088
Feature activation+2.32e+01

Pos logit contributions
ish+0.132
ishes+0.114
 up+0.113
igs+0.112
ric+0.111
Neg logit contributions
Kids-0.245
Sorry-0.231
Exc-0.231
Don-0.230
Listen-0.228
Token in
Token ID287
Feature activation+2.10e+01

Pos logit contributions
ish+0.129
ishes+0.112
lers+0.108
ans+0.108
 up+0.108
Neg logit contributions
Kids-0.216
Don-0.210
Exc-0.205
Um-0.204
Well-0.203
Token he
Token ID339
Feature activation+2.72e+01

Pos logit contributions
ish+0.286
 up+0.285
ans+0.258
ishing+0.257
ishes+0.256
Neg logit contributions
Um-0.230
Kids-0.229
Sorry-0.221
Hmm-0.220
Well-0.220
Token would
Token ID561
Feature activation+2.32e+01

Pos logit contributions
 up+0.243
ish+0.233
lers+0.217
ishing+0.215
igs+0.214
Neg logit contributions
Um-0.158
Kids-0.151
Hmm-0.150
Sorry-0.149
Don-0.149
Token take
Token ID1011
Feature activation+2.77e+01

Pos logit contributions
 up+0.186
ish+0.186
lers+0.176
ishes+0.175
ishing+0.175
Neg logit contributions
Um-0.141
Exc-0.136
Hmm-0.135
By-0.135
Don-0.134
Token the
Token ID262
Feature activation+2.77e+01

Pos logit contributions
ish+0.191
ric+0.177
an+0.171
 up+0.166
ans+0.166
Neg logit contributions
Kids-0.201
Hmm-0.201
Exc-0.199
Um-0.199
By-0.198
Token av
Token ID1196
Feature activation+5.95e+01

Pos logit contributions
 up+0.160
cks+0.153
ish+0.152
igs+0.144
ishing+0.144
Neg logit contributions
Hmm-0.229
Um-0.223
Uh-0.221
Exc-0.219
Don-0.216
Tokenoc
Token ID420
Feature activation+6.33e+01

Pos logit contributions
 up+0.448
 matches+0.394
ric+0.371
 where+0.366
ies+0.355
Neg logit contributions
Hmm-0.428
Exc-0.425
Uh-0.407
Um-0.401
By-0.394
Tokenados
Token ID22484
Feature activation+3.91e+01

Pos logit contributions
 up+0.395
 flying+0.317
 matches+0.313
 where+0.313
 hiding+0.304
Neg logit contributions
Um-0.505
Sorry-0.503
Hmm-0.496
Well-0.486
Exc-0.484
Token and
Token ID290
Feature activation+1.83e+01

Pos logit contributions
ish+0.198
 up+0.197
lers+0.187
igs+0.185
ric+0.185
Neg logit contributions
Hmm-0.329
Don-0.329
Exc-0.327
Well-0.325
Um-0.323
Token scatter
Token ID41058
Feature activation+2.91e+01

Pos logit contributions
 up+0.122
ish+0.121
an+0.113
ishing+0.112
ans+0.111
Neg logit contributions
Um-0.140
Exc-0.139
Hmm-0.139
Sorry-0.137
Don-0.136
Token them
Token ID606
Feature activation+2.74e+01

Pos logit contributions
ric+0.173
ish+0.168
an+0.163
ans+0.156
ig+0.156
Neg logit contributions
Exc-0.239
Hmm-0.222
Don-0.221
Uh-0.218
Well-0.215
Token around
Token ID1088
Feature activation+2.48e+01

Pos logit contributions
ric+0.120
ish+0.116
ans+0.110
an+0.110
igs+0.104
Neg logit contributions
Exc-0.231
Don-0.230
By-0.221
Hmm-0.221
Sorry-0.221
Token who
Token ID508
Feature activation+1.91e+01

Pos logit contributions
 up+0.225
ishes+0.209
ish+0.209
igs+0.207
ishing+0.205
Neg logit contributions
Don-0.191
Exc-0.187
Um-0.185
Well-0.185
Kids-0.185
Token lived
Token ID5615
Feature activation+2.42e+01

Pos logit contributions
 up+0.148
ish+0.140
lers+0.128
ishes+0.126
 matches+0.125
Neg logit contributions
Um-0.135
Kids-0.134
Hmm-0.133
Don-0.132
Sorry-0.131
Token in
Token ID287
Feature activation+2.26e+01

Pos logit contributions
ish+0.117
ans+0.115
ric+0.110
ishes+0.109
lers+0.106
Neg logit contributions
By-0.227
Don-0.226
Sorry-0.216
Well-0.208
Exc-0.205
Token his
Token ID465
Feature activation+2.75e+01

Pos logit contributions
ish+0.139
 up+0.124
ishes+0.121
ans+0.120
ies+0.117
Neg logit contributions
Don-0.200
Um-0.197
Hmm-0.197
Uh-0.195
By-0.193
Token cozy
Token ID37438
Feature activation+3.84e+01

Pos logit contributions
ish+0.160
 up+0.158
ans+0.153
ishing+0.149
ishes+0.149
Neg logit contributions
Um-0.215
By-0.209
Hmm-0.205
Hi-0.205
Uh-0.202
Token bur
Token ID4356
Feature activation+5.44e+01

Pos logit contributions
cks+0.291
 up+0.270
ish+0.261
ric+0.260
ans+0.259
Neg logit contributions
Hi-0.223
Exc-0.222
Dad-0.207
Kids-0.207
Um-0.202
Tokenrow
Token ID808
Feature activation+3.92e+01

Pos logit contributions
 matches+0.405
 up+0.397
 dough+0.384
 painting+0.359
 cards+0.359
Neg logit contributions
Exc-0.251
Don-0.242
He-0.238
She-0.231
By-0.226
Token.
Token ID13
Feature activation+7.63e+00

Pos logit contributions
ric+0.213
 matches+0.208
ish+0.205
cks+0.204
ishes+0.202
Neg logit contributions
Don-0.286
By-0.283
Exc-0.282
He-0.281
She-0.277
Token He
Token ID679
Feature activation+1.59e+01

Pos logit contributions
 up+0.070
ish+0.064
lers+0.060
 matches+0.059
-+0.058
Neg logit contributions
Sorry-0.046
Uh-0.044
Kids-0.043
Um-0.043
Hmm-0.043
Token loved
Token ID6151
Feature activation+1.78e+01

Pos logit contributions
 up+0.118
ish+0.109
lers+0.101
-+0.097
 matches+0.096
Neg logit contributions
Kids-0.121
Um-0.120
Hmm-0.119
Sorry-0.119
Uh-0.119
Token to
Token ID284
Feature activation+1.26e+01

Pos logit contributions
ish+0.136
an+0.125
ans+0.117
ig+0.115
cks+0.112
Neg logit contributions
Don-0.129
Exc-0.122
Listen-0.121
Kids-0.120
Hi-0.116
Token it
Token ID340
Feature activation+2.93e+01

Pos logit contributions
ish+0.205
ric+0.199
an+0.191
igs+0.190
ans+0.187
Neg logit contributions
Kids-0.197
Don-0.186
Um-0.183
Well-0.181
Hmm-0.178
Token to
Token ID284
Feature activation+1.67e+01

Pos logit contributions
ric+0.180
ish+0.176
cks+0.168
igs+0.164
ishing+0.163
Neg logit contributions
Sorry-0.209
By-0.209
Don-0.209
Um-0.208
Listen-0.204
Token the
Token ID262
Feature activation+2.41e+01

Pos logit contributions
ish+0.124
 up+0.121
lers+0.116
ishing+0.111
ishes+0.108
Neg logit contributions
Kids-0.122
Um-0.119
Don-0.114
Sorry-0.114
Hmm-0.114
Token rh
Token ID9529
Feature activation+4.69e+01

Pos logit contributions
 up+0.149
ish+0.146
lers+0.140
ishing+0.137
igs+0.136
Neg logit contributions
Um-0.199
Sorry-0.196
Uh-0.195
Hmm-0.195
Kids-0.193
Tokenin
Token ID259
Feature activation+3.64e+01

Pos logit contributions
 up+0.357
ish+0.284
fl+0.278
 out+0.271
ishes+0.269
Neg logit contributions
Sorry-0.354
Uh-0.339
By-0.334
Um-0.332
Exc-0.328
Tokenoc
Token ID420
Feature activation+5.95e+01

Pos logit contributions
 up+0.280
igs+0.254
ish+0.250
ishing+0.248
ishes+0.244
Neg logit contributions
Um-0.238
Hmm-0.235
Exc-0.231
Sorry-0.231
By-0.228
Tokeneros
Token ID27498
Feature activation+3.53e+01

Pos logit contributions
 up+0.337
 out+0.272
igs+0.262
ies+0.260
 flying+0.258
Neg logit contributions
Sorry-0.521
Um-0.511
Hmm-0.490
Well-0.489
Okay-0.481
Token.
Token ID13
Feature activation+9.29e+00

Pos logit contributions
 up+0.156
ish+0.154
igs+0.151
ishing+0.146
lers+0.138
Neg logit contributions
Um-0.317
Sorry-0.311
Exc-0.309
Don-0.308
Kids-0.305
Token When
Token ID1649
Feature activation+1.45e+01

Pos logit contributions
 up+0.076
ish+0.069
lers+0.064
 matches+0.063
-+0.062
Neg logit contributions
Uh-0.062
Sorry-0.062
Hmm-0.062
Um-0.061
Kids-0.061
Token he
Token ID339
Feature activation+1.85e+01

Pos logit contributions
 up+0.097
ish+0.093
lers+0.084
ishes+0.080
ishing+0.079
Neg logit contributions
Kids-0.119
Um-0.119
Sorry-0.118
Hmm-0.117
Uh-0.117
Token tasted
Token ID29187
Feature activation+2.24e+01

Pos logit contributions
 up+0.142
ish+0.129
-+0.121
lers+0.119
 matches+0.117
Neg logit contributions
Hmm-0.140
Kids-0.138
Um-0.138
Sorry-0.137
Uh-0.136
Token Her
Token ID2332
Feature activation+9.59e+00

Pos logit contributions
 up+0.004
ish+0.004
lers+0.004
-+0.003
 matches+0.003
Neg logit contributions
Sorry-0.002
Kids-0.002
Uh-0.002
Hmm-0.002
Um-0.002
Token best
Token ID1266
Feature activation+1.25e+01

Pos logit contributions
 up+0.062
ish+0.059
ans+0.055
cks+0.054
ishes+0.054
Neg logit contributions
Um-0.073
Sorry-0.072
Kids-0.072
By-0.071
Okay-0.071
Token friend
Token ID1545
Feature activation+9.76e+00

Pos logit contributions
 up+0.067
ish+0.059
ans+0.057
lers+0.055
ishes+0.055
Neg logit contributions
Kids-0.116
Um-0.114
By-0.113
Well-0.112
Sorry-0.111
Token watched
Token ID7342
Feature activation+1.08e+01

Pos logit contributions
 up+0.068
ish+0.064
lers+0.059
 matches+0.056
igs+0.056
Neg logit contributions
Kids-0.077
Um-0.076
Hmm-0.075
Sorry-0.075
Uh-0.074
Token in
Token ID287
Feature activation+8.14e+00

Pos logit contributions
ish+0.070
 up+0.065
lers+0.064
ishes+0.063
ans+0.062
Neg logit contributions
Exc-0.086
Kids-0.085
Don-0.085
Um-0.083
Sorry-0.083
Token amaz
Token ID40642
Feature activation+3.40e+01

Pos logit contributions
 up+0.052
ish+0.047
lers+0.043
-+0.042
 matches+0.041
Neg logit contributions
Kids-0.071
Um-0.070
Hmm-0.070
Sorry-0.070
Uh-0.070
Tokenement
Token ID972
Feature activation+1.84e+01

Pos logit contributions
 up+0.262
ish+0.256
ric+0.233
ans+0.231
ishes+0.231
Neg logit contributions
Sorry-0.229
Kids-0.228
Um-0.227
Exc-0.226
Well-0.222
Token as
Token ID355
Feature activation+5.83e+00

Pos logit contributions
ish+0.094
 matches+0.092
ishes+0.087
 up+0.086
igs+0.086
Neg logit contributions
Sorry-0.151
Um-0.149
By-0.149
Exc-0.148
Well-0.148
Token Amelia
Token ID49769
Feature activation+1.42e+01

Pos logit contributions
ish+0.046
 up+0.044
lers+0.042
ishes+0.040
ans+0.040
Neg logit contributions
Kids-0.041
Um-0.041
Don-0.040
Well-0.040
Hmm-0.040
Token gigg
Token ID30442
Feature activation+1.96e+01

Pos logit contributions
 up+0.121
ish+0.112
lers+0.104
 matches+0.102
ans+0.101
Neg logit contributions
Kids-0.091
Um-0.090
Sorry-0.089
Hmm-0.088
Uh-0.088
Tokenled
Token ID992
Feature activation+1.57e+01

Pos logit contributions
 up+0.101
 matches+0.089
ric+0.089
ish+0.087
igs+0.082
Neg logit contributions
Sorry-0.179
Kids-0.178
Exc-0.178
Um-0.174
Well-0.174
Token trees
Token ID7150
Feature activation+4.15e+01

Pos logit contributions
Um+0.135
Uh+0.134
Don+0.132
Kids+0.131
Hmm+0.129
Neg logit contributions
ish-0.094
 up-0.088
ishes-0.084
lers-0.082
ishing-0.080
Token and
Token ID290
Feature activation+2.17e+01

Pos logit contributions
ish+0.232
igs+0.222
ishes+0.211
ishing+0.210
 matches+0.207
Neg logit contributions
Don-0.336
Hi-0.326
Exc-0.321
By-0.317
Well-0.316
Token it
Token ID340
Feature activation+2.57e+01

Pos logit contributions
ish+0.150
ishing+0.137
 up+0.136
ates+0.133
an+0.132
Neg logit contributions
Don-0.169
Exc-0.161
Um-0.158
Uh-0.157
Hi-0.157
Token was
Token ID373
Feature activation+2.37e+01

Pos logit contributions
ish+0.208
 up+0.205
ishing+0.196
ans+0.192
igs+0.191
Neg logit contributions
Don-0.151
Um-0.150
Sorry-0.149
Uh-0.146
Well-0.146
Token very
Token ID845
Feature activation+2.42e+01

Pos logit contributions
ish+0.178
lers+0.158
ishes+0.157
an+0.154
ans+0.153
Neg logit contributions
Don-0.169
Uh-0.165
Um-0.159
By-0.159
Well-0.153
Token loud
Token ID7812
Feature activation+2.88e+01

Pos logit contributions
ish+0.162
 up+0.157
lers+0.152
ishes+0.148
ates+0.146
Neg logit contributions
Sorry-0.189
Uh-0.187
Um-0.186
Don-0.185
Exc-0.181
Token.
Token ID13
Feature activation+9.24e+00

Pos logit contributions
ish+0.176
ishes+0.156
igs+0.153
ans+0.147
ric+0.147
Neg logit contributions
Don-0.243
Exc-0.230
Kids-0.228
Sorry-0.226
Well-0.226
Token The
Token ID383
Feature activation+1.47e+01

Pos logit contributions
 up+0.080
ish+0.072
lers+0.068
 matches+0.067
-+0.066
Neg logit contributions
Sorry-0.058
Kids-0.057
Uh-0.057
Hmm-0.057
Um-0.057
Token rh
Token ID9529
Feature activation+3.86e+01

Pos logit contributions
 up+0.093
ish+0.088
ishing+0.087
cks+0.087
lers+0.083
Neg logit contributions
Uh-0.118
Exc-0.116
Don-0.115
Hmm-0.114
Sorry-0.114
Tokenin
Token ID259
Feature activation+2.84e+01

Pos logit contributions
 up+0.285
ish+0.230
 out+0.215
ans+0.214
ishes+0.212
Neg logit contributions
Uh-0.304
Sorry-0.303
Well-0.293
Um-0.289
Hmm-0.287
Tokenoc
Token ID420
Feature activation+5.11e+01

Pos logit contributions
 up+0.219
ish+0.190
igs+0.182
ishes+0.180
ishing+0.177
Neg logit contributions
Um-0.209
Uh-0.208
Hmm-0.204
Well-0.203
Sorry-0.203
Token writing
Token ID3597
Feature activation+1.90e+01

Pos logit contributions
ish+0.066
an+0.065
ishes+0.063
ans+0.062
lers+0.061
Neg logit contributions
Um-0.076
Kids-0.069
Uh-0.068
Don-0.067
Well-0.066
Token my
Token ID616
Feature activation+1.51e+01

Pos logit contributions
ish+0.109
ric+0.104
ans+0.102
ishes+0.098
lers+0.095
Neg logit contributions
Sorry-0.157
Um-0.155
Exc-0.152
Don-0.152
Hmm-0.151
Token name
Token ID1438
Feature activation+2.17e+01

Pos logit contributions
igs+0.102
ish+0.096
ishes+0.095
lers+0.092
ans+0.091
Neg logit contributions
Um-0.110
Hmm-0.105
Listen-0.097
Don-0.096
Hi-0.095
Token on
Token ID319
Feature activation+1.21e+01

Pos logit contributions
ish+0.090
 up+0.090
igs+0.084
ans+0.082
lers+0.080
Neg logit contributions
Sorry-0.216
Um-0.214
Kids-0.214
Hmm-0.210
Don-0.209
Token this
Token ID428
Feature activation+1.88e+01

Pos logit contributions
ish+0.076
 up+0.069
ishing+0.068
ishes+0.065
ric+0.065
Neg logit contributions
Kids-0.099
Exc-0.098
Um-0.096
Dad-0.095
Hmm-0.094
Token paper
Token ID3348
Feature activation+2.85e+01

Pos logit contributions
igs+0.118
ish+0.116
ishes+0.115
lers+0.111
 up+0.110
Neg logit contributions
Um-0.138
Hmm-0.135
Hi-0.134
Exc-0.134
Kids-0.132
Token."
Token ID526
Feature activation-2.96e-01

Pos logit contributions
 up+0.131
ish+0.122
igs+0.120
ishing+0.116
lers+0.109
Neg logit contributions
Um-0.256
Hmm-0.256
Kids-0.251
Sorry-0.250
Exc-0.247
Token
Token ID220
Feature activation+1.04e+01

Pos logit contributions
Uh+0.002
Um+0.002
Hmm+0.002
Sorry+0.002
Kids+0.002
Neg logit contributions
 up-0.003
ish-0.002
lers-0.002
--0.002
ishes-0.002
Token
Token ID198
Feature activation+1.53e-02

Pos logit contributions
 up+0.094
ish+0.089
lers+0.084
-+0.081
 matches+0.080
Neg logit contributions
Kids-0.062
Um-0.061
Hmm-0.060
Sorry-0.060
Uh-0.059
Token
Token ID198
Feature activation-6.32e-01

Pos logit contributions
 up+0.000
ish+0.000
-+0.000
lers+0.000
 matches+0.000
Neg logit contributions
Uh-0.000
Hmm-0.000
Um-0.000
Sorry-0.000
Kids-0.000
TokenHer
Token ID9360
Feature activation+7.60e+00

Pos logit contributions
Uh+0.003
Um+0.003
Hmm+0.003
Well+0.003
Kids+0.003
Neg logit contributions
 up-0.006
--0.005
ish-0.005
ep-0.005
ans-0.005
Token said
Token ID531
Feature activation+1.59e+01

Pos logit contributions
 up+0.174
ish+0.167
lers+0.156
igs+0.155
ishes+0.152
Neg logit contributions
Kids-0.131
Um-0.130
Sorry-0.127
Uh-0.125
Hmm-0.125
Token,
Token ID11
Feature activation+1.34e+01

Pos logit contributions
ishes+0.088
ish+0.085
ans+0.084
igs+0.082
lers+0.080
Neg logit contributions
Sorry-0.137
Don-0.136
By-0.133
Well-0.128
Um-0.127
Token holding
Token ID4769
Feature activation+2.30e+01

Pos logit contributions
ish+0.098
lers+0.095
ishes+0.092
igs+0.092
ans+0.090
Neg logit contributions
Don-0.100
Sorry-0.097
Um-0.096
Uh-0.095
Kids-0.095
Token up
Token ID510
Feature activation+2.21e+01

Pos logit contributions
ric+0.150
ish+0.142
igs+0.141
ans+0.136
ishes+0.133
Neg logit contributions
No-0.159
Kids-0.158
Hmm-0.157
Dad-0.155
Um-0.155
Token a
Token ID257
Feature activation+2.37e+01

Pos logit contributions
ish+0.133
 up+0.122
ishes+0.115
lers+0.113
ric+0.110
Neg logit contributions
Um-0.189
Kids-0.187
Sorry-0.185
Exc-0.185
Don-0.184
Token long
Token ID890
Feature activation+3.35e+01

Pos logit contributions
 up+0.145
ishes+0.140
ish+0.139
ans+0.139
igs+0.131
Neg logit contributions
Uh-0.203
Exc-0.190
Hmm-0.188
Sorry-0.183
Um-0.183
Token,
Token ID11
Feature activation+1.51e+01

Pos logit contributions
 up+0.212
ishes+0.203
ish+0.200
ric+0.190
ans+0.181
Neg logit contributions
Uh-0.276
Exc-0.274
Dad-0.256
Hmm-0.252
Ben-0.251
Token brown
Token ID7586
Feature activation+3.08e+01

Pos logit contributions
ish+0.102
ishes+0.097
ans+0.090
igs+0.090
lers+0.089
Neg logit contributions
Don-0.126
Exc-0.118
Uh-0.117
Sorry-0.115
Um-0.112
Token thing
Token ID1517
Feature activation+2.56e+01

Pos logit contributions
ans+0.194
 up+0.192
ishes+0.189
cks+0.184
ins+0.180
Neg logit contributions
Uh-0.234
No-0.225
Exc-0.224
Hi-0.221
Wait-0.218
Token.
Token ID13
Feature activation+5.53e+00

Pos logit contributions
ish+0.107
 up+0.101
ishes+0.095
ans+0.094
igs+0.093
Neg logit contributions
Um-0.253
Sorry-0.252
Don-0.249
Kids-0.249
Uh-0.245
Token "
Token ID366
Feature activation+9.21e+00

Pos logit contributions
 up+0.042
ish+0.036
 matches+0.034
-+0.033
lers+0.033
Neg logit contributions
Hmm-0.041
Sorry-0.040
Uh-0.040
Kids-0.040
Um-0.040
Token and
Token ID290
Feature activation+2.02e+01

Pos logit contributions
ish+0.209
 up+0.200
igs+0.198
ishes+0.195
ishing+0.193
Neg logit contributions
Sorry-0.275
Don-0.274
Um-0.272
Well-0.265
Hmm-0.264
Token jumped
Token ID11687
Feature activation+3.07e+01

Pos logit contributions
ish+0.146
 up+0.142
lers+0.131
ishing+0.131
ans+0.131
Neg logit contributions
Um-0.144
Sorry-0.141
Don-0.140
Kids-0.140
Hmm-0.138
Token around
Token ID1088
Feature activation+2.69e+01

Pos logit contributions
ish+0.140
ishes+0.136
ric+0.135
lers+0.121
igs+0.120
Neg logit contributions
Don-0.288
By-0.286
Exc-0.286
Sorry-0.283
Kids-0.278
Token in
Token ID287
Feature activation+2.39e+01

Pos logit contributions
ish+0.153
ishes+0.140
ans+0.134
an+0.128
lers+0.124
Neg logit contributions
Don-0.239
Kids-0.232
Exc-0.231
By-0.230
Well-0.229
Token the
Token ID262
Feature activation+2.68e+01

Pos logit contributions
ish+0.144
 up+0.134
lers+0.126
ishes+0.126
ans+0.122
Neg logit contributions
Kids-0.208
Hmm-0.207
Uh-0.206
Um-0.206
Don-0.205
Token spot
Token ID4136
Feature activation+3.35e+01

Pos logit contributions
 up+0.152
ish+0.152
ishes+0.143
ishing+0.139
igs+0.139
Neg logit contributions
Uh-0.228
Hmm-0.227
Kids-0.225
Exc-0.218
Sorry-0.218
Token of
Token ID286
Feature activation+2.42e+01

Pos logit contributions
ish+0.180
ishes+0.167
lers+0.154
 up+0.153
ans+0.153
Neg logit contributions
Don-0.292
Sorry-0.286
Kids-0.281
By-0.281
Well-0.278
Token sunshine
Token ID34488
Feature activation+3.23e+01

Pos logit contributions
ish+0.174
lers+0.149
ishes+0.149
 up+0.144
an+0.143
Neg logit contributions
Kids-0.191
Don-0.188
Uh-0.185
Exc-0.184
Hmm-0.182
Token.
Token ID13
Feature activation+9.35e+00

Pos logit contributions
ish+0.187
igs+0.184
ric+0.182
ans+0.180
cks+0.174
Neg logit contributions
By-0.219
Don-0.215
Sorry-0.207
Well-0.206
Exc-0.206
Token Suddenly
Token ID24975
Feature activation+1.59e+01

Pos logit contributions
 up+0.083
ish+0.076
lers+0.071
 matches+0.069
-+0.069
Neg logit contributions
Sorry-0.057
Kids-0.057
Um-0.057
Hmm-0.057
Uh-0.056
Token,
Token ID11
Feature activation+1.13e+01

Pos logit contributions
 up+0.079
ish+0.078
ric+0.067
lers+0.066
ans+0.065
Neg logit contributions
Um-0.151
Kids-0.151
Hmm-0.150
Sorry-0.150
Don-0.148
Token a
Token ID257
Feature activation-2.57e-01

Pos logit contributions
Kids+0.029
Um+0.029
Hmm+0.029
Sorry+0.028
Don+0.028
Neg logit contributions
 up-0.015
ish-0.015
lers-0.013
ishes-0.013
ishing-0.012
Token big
Token ID1263
Feature activation+5.14e+00

Pos logit contributions
Uh+0.002
Kids+0.002
Um+0.002
Hmm+0.002
Sorry+0.002
Neg logit contributions
 up-0.002
ish-0.001
lers-0.001
ishes-0.001
ans-0.001
Token ball
Token ID2613
Feature activation+8.03e+00

Pos logit contributions
 up+0.030
ish+0.027
ishes+0.026
lers+0.026
ans+0.025
Neg logit contributions
Uh-0.045
Kids-0.043
Hmm-0.043
Exc-0.042
Sorry-0.042
Token in
Token ID287
Feature activation+4.51e-01

Pos logit contributions
 up+0.027
ish+0.025
ishes+0.023
lers+0.021
ans+0.020
Neg logit contributions
Uh-0.088
Kids-0.088
Sorry-0.088
Don-0.087
Hmm-0.087
Token the
Token ID262
Feature activation+1.83e+00

Pos logit contributions
 up+0.002
ish+0.002
lers+0.002
-+0.002
ishes+0.002
Neg logit contributions
Kids-0.005
Hmm-0.005
Um-0.005
Uh-0.005
Sorry-0.005
Token yard
Token ID12699
Feature activation+8.20e+00

Pos logit contributions
 up+0.011
ish+0.009
lers+0.008
-+0.008
 matches+0.008
Neg logit contributions
Um-0.017
Kids-0.017
Sorry-0.017
Hmm-0.017
Uh-0.017
Token.
Token ID13
Feature activation-1.16e+01

Pos logit contributions
ish+0.042
ishes+0.040
igs+0.038
lers+0.037
 up+0.036
Neg logit contributions
Don-0.076
Sorry-0.075
Well-0.074
Kids-0.073
Um-0.072
Token He
Token ID679
Feature activation-4.36e+00

Pos logit contributions
Sorry+0.074
Um+0.073
Hmm+0.073
Kids+0.073
Uh+0.072
Neg logit contributions
 up-0.105
ish-0.089
--0.086
lers-0.086
 matches-0.084
Token wanted
Token ID2227
Feature activation-5.48e+00

Pos logit contributions
Hmm+0.034
Kids+0.034
Sorry+0.034
Uh+0.033
Um+0.032
Neg logit contributions
 up-0.032
--0.028
ish-0.028
lers-0.027
 in-0.025
Token to
Token ID284
Feature activation-5.80e+00

Pos logit contributions
Kids+0.049
Um+0.049
Don+0.048
Listen+0.048
Hmm+0.048
Neg logit contributions
ish-0.031
 up-0.030
lers-0.027
ishes-0.026
ishing-0.026
Token play
Token ID711
Feature activation-5.71e-01

Pos logit contributions
Kids+0.051
Hmm+0.051
Um+0.051
Uh+0.051
Sorry+0.050
Neg logit contributions
 up-0.039
ish-0.033
 matches-0.030
--0.029
lers-0.029
Token and
Token ID290
Feature activation+3.56e+00

Pos logit contributions
 up+0.099
ish+0.092
lers+0.082
ishes+0.078
igs+0.076
Neg logit contributions
Hmm-0.208
Um-0.208
Sorry-0.207
Kids-0.207
Don-0.205
Token went
Token ID1816
Feature activation+7.96e+00

Pos logit contributions
 up+0.029
ish+0.028
lers+0.026
ans+0.026
ric+0.025
Neg logit contributions
Kids-0.023
Um-0.023
Hmm-0.023
Sorry-0.023
Don-0.022
Token to
Token ID284
Feature activation+3.86e+00

Pos logit contributions
ish+0.043
ishes+0.037
lers+0.036
ric+0.036
ans+0.035
Neg logit contributions
Kids-0.073
Exc-0.067
Hmm-0.067
Um-0.067
Listen-0.066
Token the
Token ID262
Feature activation+8.84e+00

Pos logit contributions
ish+0.026
 up+0.025
lers+0.025
ishes+0.024
ans+0.023
Neg logit contributions
Kids-0.030
Um-0.029
Hmm-0.029
Don-0.028
Exc-0.028
Token living
Token ID2877
Feature activation+2.12e+01

Pos logit contributions
 up+0.050
ish+0.041
 matches+0.037
-+0.037
lers+0.035
Neg logit contributions
Kids-0.086
Uh-0.085
Sorry-0.085
Um-0.085
Well-0.084
Token room
Token ID2119
Feature activation+1.36e+01

Pos logit contributions
ish+0.107
 up+0.105
lers+0.097
ans+0.095
ric+0.091
Neg logit contributions
Kids-0.198
Hmm-0.191
Um-0.188
Uh-0.187
Don-0.185
Token to
Token ID284
Feature activation+3.46e+00

Pos logit contributions
ish+0.066
 up+0.062
lers+0.057
ishes+0.057
ans+0.055
Neg logit contributions
Sorry-0.128
Um-0.128
Kids-0.127
Don-0.126
Hmm-0.125
Token test
Token ID1332
Feature activation+1.24e+01

Pos logit contributions
 up+0.026
ish+0.025
lers+0.024
ishes+0.023
ans+0.022
Neg logit contributions
Kids-0.025
Hmm-0.025
Um-0.025
Sorry-0.024
Uh-0.024
Token it
Token ID340
Feature activation+1.37e+01

Pos logit contributions
ish+0.074
an+0.068
ans+0.066
ric+0.065
ishes+0.064
Neg logit contributions
Kids-0.098
Hmm-0.096
Exc-0.094
Um-0.093
Don-0.093
Token out
Token ID503
Feature activation+1.32e+01

Pos logit contributions
ish+0.053
 up+0.049
lers+0.044
ishes+0.044
ric+0.043
Neg logit contributions
Um-0.139
Hmm-0.139
Don-0.137
Exc-0.136
Kids-0.136
Token.
Token ID13
Feature activation-3.30e+00

Pos logit contributions
 up+0.057
ish+0.054
lers+0.045
ishes+0.044
ans+0.043
Neg logit contributions
Um-0.136
Hmm-0.134
Kids-0.134
Sorry-0.133
Don-0.132
Tokenatmeal
Token ID45280
Feature activation+2.39e+01

Pos logit contributions
 up+0.166
lers+0.133
ish+0.130
cks+0.129
 flying+0.128
Neg logit contributions
Sorry-0.176
Hmm-0.176
Kids-0.173
Well-0.168
Dad-0.168
Token to
Token ID284
Feature activation+5.09e+00

Pos logit contributions
ish+0.118
ric+0.117
lers+0.112
ishing+0.109
ishes+0.107
Neg logit contributions
Well-0.190
Sorry-0.188
By-0.188
Hmm-0.188
Okay-0.187
Token the
Token ID262
Feature activation+1.07e+01

Pos logit contributions
ish+0.032
 up+0.032
lers+0.031
ishing+0.028
ishes+0.028
Neg logit contributions
Kids-0.042
Um-0.041
Hmm-0.041
Sorry-0.041
By-0.040
Token mug
Token ID25152
Feature activation+2.16e+01

Pos logit contributions
 up+0.067
ish+0.064
lers+0.061
igs+0.059
cks+0.058
Neg logit contributions
Kids-0.090
Sorry-0.089
Um-0.089
Hmm-0.088
Uh-0.088
Token.
Token ID13
Feature activation-1.59e+00

Pos logit contributions
 up+0.095
ish+0.090
lers+0.083
igs+0.083
ishes+0.081
Neg logit contributions
Sorry-0.203
Kids-0.202
Hmm-0.201
Um-0.200
Well-0.200
Token The
Token ID383
Feature activation+4.71e+00

Pos logit contributions
Um+0.011
Sorry+0.011
Uh+0.011
Kids+0.011
Hmm+0.011
Neg logit contributions
 up-0.013
ish-0.012
lers-0.011
 matches-0.011
--0.011
Token mug
Token ID25152
Feature activation+1.68e+01

Pos logit contributions
 up+0.029
ish+0.026
ric+0.025
cks+0.025
lers+0.024
Neg logit contributions
Kids-0.041
Uh-0.041
Hmm-0.041
Sorry-0.041
Um-0.040
Token took
Token ID1718
Feature activation+8.82e+00

Pos logit contributions
 up+0.132
ish+0.124
lers+0.115
ishes+0.112
ishing+0.110
Neg logit contributions
Kids-0.117
Um-0.117
Sorry-0.116
Hmm-0.115
Uh-0.115
Token a
Token ID257
Feature activation+1.02e+01

Pos logit contributions
ish+0.047
ric+0.047
 up+0.043
ishes+0.043
lers+0.042
Neg logit contributions
By-0.078
Hmm-0.076
Uh-0.075
Don-0.074
Kids-0.074
Token bite
Token ID13197
Feature activation+1.42e+01

Pos logit contributions
 up+0.071
ish+0.067
ishes+0.062
lers+0.061
ans+0.060
Neg logit contributions
Uh-0.080
Hmm-0.079
Kids-0.078
Um-0.078
Sorry-0.077
Token and
Token ID290
Feature activation+3.67e+00

Pos logit contributions
 up+0.069
ish+0.064
lers+0.059
ishes+0.057
ric+0.057
Neg logit contributions
Kids-0.136
Hmm-0.135
Sorry-0.135
Uh-0.134
Um-0.134
Token.
Token ID13
Feature activation+1.11e+01

Pos logit contributions
 up+0.148
ish+0.140
ans+0.131
ishes+0.129
lers+0.126
Neg logit contributions
Um-0.207
Hmm-0.205
Uh-0.205
Kids-0.203
Sorry-0.202
Token Lily
Token ID20037
Feature activation+1.75e+01

Pos logit contributions
 up+0.099
ish+0.087
 matches+0.083
lers+0.082
-+0.081
Neg logit contributions
Sorry-0.066
Uh-0.066
Hmm-0.065
Kids-0.065
Um-0.063
Token didn
Token ID1422
Feature activation+1.94e+01

Pos logit contributions
 up+0.127
ish+0.118
lers+0.109
-+0.105
 matches+0.104
Neg logit contributions
Kids-0.135
Um-0.134
Hmm-0.133
Sorry-0.132
Uh-0.132
Token't
Token ID470
Feature activation+1.52e+01

Pos logit contributions
 up+0.175
-+0.141
ish+0.138
 out+0.137
 matches+0.137
Neg logit contributions
Uh-0.132
Kids-0.131
Hmm-0.131
Sorry-0.130
Um-0.128
Token want
Token ID765
Feature activation+1.97e+01

Pos logit contributions
 up+0.125
ish+0.115
lers+0.106
-+0.105
 matches+0.102
Neg logit contributions
Kids-0.106
Um-0.106
Sorry-0.105
Hmm-0.105
Uh-0.105
Token to
Token ID284
Feature activation+1.81e+01

Pos logit contributions
 up+0.099
ish+0.093
lers+0.081
ishes+0.076
-+0.075
Neg logit contributions
Kids-0.194
Um-0.193
Hmm-0.191
Sorry-0.189
Uh-0.189
Token do
Token ID466
Feature activation+2.13e+01

Pos logit contributions
 up+0.131
ish+0.119
lers+0.108
-+0.106
 matches+0.104
Neg logit contributions
Uh-0.144
Hmm-0.144
Kids-0.144
Um-0.143
Sorry-0.141
Token it
Token ID340
Feature activation+2.52e+01

Pos logit contributions
ish+0.127
 up+0.112
ishes+0.111
an+0.108
ishing+0.106
Neg logit contributions
Kids-0.186
Um-0.183
Don-0.182
Hmm-0.181
Listen-0.180
Token at
Token ID379
Feature activation+1.92e+01

Pos logit contributions
ish+0.119
 up+0.109
lers+0.108
ric+0.108
ishes+0.108
Neg logit contributions
Kids-0.229
Um-0.229
Sorry-0.229
Don-0.227
Hmm-0.223
Token first
Token ID717
Feature activation+1.98e+01

Pos logit contributions
 up+0.121
ish+0.121
lers+0.112
ishes+0.105
-+0.104
Neg logit contributions
Kids-0.160
Um-0.158
Hmm-0.157
Uh-0.155
Sorry-0.154
Token because
Token ID780
Feature activation+1.43e+01

Pos logit contributions
 up+0.082
ish+0.078
lers+0.067
ishes+0.064
ans+0.062
Neg logit contributions
Kids-0.206
Um-0.205
Hmm-0.203
Sorry-0.202
Uh-0.201
Token�
Token ID241
Feature activation+8.18e+00

Pos logit contributions
 up+0.101
ish+0.094
lers+0.088
 matches+0.084
-+0.084
Neg logit contributions
Kids-0.094
Um-0.093
Hmm-0.093
Sorry-0.092
Uh-0.092
TokenSure
Token ID19457
Feature activation+9.47e+00

Pos logit contributions
 up+0.085
-+0.079
ish+0.076
 things+0.074
 in+0.073
Neg logit contributions
Uh-0.031
Um-0.030
Hmm-0.030
Kids-0.029
Exc-0.029
Token,
Token ID11
Feature activation+2.16e+00

Pos logit contributions
 up+0.047
ish+0.042
ishes+0.038
lers+0.037
ans+0.037
Neg logit contributions
Kids-0.095
Um-0.094
Sorry-0.093
Hmm-0.092
Well-0.091
Token but
Token ID475
Feature activation+2.61e+00

Pos logit contributions
 up+0.017
ish+0.016
lers+0.015
ishes+0.014
 matches+0.014
Neg logit contributions
Kids-0.016
Um-0.016
Hmm-0.015
Sorry-0.015
Uh-0.015
Token this
Token ID428
Feature activation+1.19e+01

Pos logit contributions
 up+0.020
ish+0.019
lers+0.019
ans+0.018
ishes+0.017
Neg logit contributions
Um-0.019
Kids-0.019
Hmm-0.019
Sorry-0.018
Well-0.018
Token one
Token ID530
Feature activation+1.26e+01

Pos logit contributions
 up+0.072
ish+0.069
lers+0.065
ishes+0.064
ishing+0.063
Neg logit contributions
Kids-0.103
Hmm-0.100
Um-0.099
Uh-0.097
Well-0.096
Token needs
Token ID2476
Feature activation+9.85e+00

Pos logit contributions
 up+0.086
ish+0.079
ans+0.075
ishes+0.074
lers+0.074
Neg logit contributions
Kids-0.098
Um-0.097
Hmm-0.096
Well-0.095
Uh-0.095
Token to
Token ID284
Feature activation+5.59e+00

Pos logit contributions
 up+0.056
ish+0.052
lers+0.046
ishes+0.044
ans+0.043
Neg logit contributions
Kids-0.091
Um-0.090
Hmm-0.090
Sorry-0.089
Uh-0.088
Token be
Token ID307
Feature activation+1.02e+01

Pos logit contributions
 up+0.042
ish+0.040
lers+0.037
ishes+0.035
-+0.035
Neg logit contributions
Kids-0.041
Um-0.041
Hmm-0.041
Sorry-0.040
Uh-0.040
Token regular
Token ID3218
Feature activation+1.41e+01

Pos logit contributions
ish+0.067
 up+0.065
lers+0.061
an+0.058
ric+0.057
Neg logit contributions
Um-0.081
Kids-0.079
Don-0.079
Sorry-0.078
Hmm-0.077
Token.
Token ID13
Feature activation-1.54e+00

Pos logit contributions
 up+0.066
ish+0.065
ric+0.062
cks+0.061
an+0.058
Neg logit contributions
Kids-0.130
Sorry-0.123
Um-0.122
Exc-0.120
Hmm-0.119
Token It
Token ID632
Feature activation-4.04e+00

Pos logit contributions
Uh+0.080
Kids+0.079
Um+0.079
Hmm+0.079
Sorry+0.079
Neg logit contributions
 up-0.090
ish-0.084
 matches-0.077
lers-0.077
--0.076
Token can
Token ID460
Feature activation-6.51e+00

Pos logit contributions
Don+0.028
Um+0.027
Uh+0.027
Well+0.027
Kids+0.026
Neg logit contributions
ates-0.028
ishing-0.028
ish-0.027
igs-0.027
ans-0.026
Token fall
Token ID2121
Feature activation+5.66e-01

Pos logit contributions
Um+0.049
Kids+0.049
Hmm+0.048
Sorry+0.048
Uh+0.048
Neg logit contributions
 up-0.049
ish-0.046
lers-0.042
--0.040
ishes-0.040
Token apart
Token ID5475
Feature activation+3.29e+00

Pos logit contributions
ish+0.003
ans+0.003
ates+0.003
lers+0.002
ric+0.002
Neg logit contributions
Exc-0.005
Sorry-0.005
Uh-0.005
Don-0.005
Kids-0.005
Token easily
Token ID3538
Feature activation+3.35e+00

Pos logit contributions
ish+0.017
ric+0.016
ans+0.016
lers+0.015
 up+0.015
Neg logit contributions
Exc-0.027
Sorry-0.026
Kids-0.026
Um-0.026
Don-0.026
Token."
Token ID526
Feature activation-1.44e+01

Pos logit contributions
ish+0.017
ric+0.015
 up+0.015
lers+0.015
ishes+0.015
Neg logit contributions
Exc-0.030
Um-0.030
Sorry-0.030
Hmm-0.029
Kids-0.029
Token she
Token ID673
Feature activation-7.70e+00

Pos logit contributions
Uh+0.133
Hmm+0.130
Okay+0.127
Kids+0.126
Sorry+0.125
Neg logit contributions
 up-0.086
 out-0.069
.-0.068
 things-0.068
 in-0.067
Token said
Token ID531
Feature activation-9.14e+00

Pos logit contributions
Hmm+0.062
Sorry+0.060
Uh+0.059
Kids+0.059
Listen+0.059
Neg logit contributions
 up-0.054
--0.047
ish-0.044
ies-0.043
 matches-0.043
Token.
Token ID13
Feature activation-1.38e+01

Pos logit contributions
Sorry+0.085
Don+0.083
Kids+0.082
By+0.082
Um+0.081
Neg logit contributions
ish-0.046
ric-0.044
ishes-0.042
ans-0.041
lers-0.041
Token\n
Token ID198
Feature activation-1.49e+01

Pos logit contributions
Uh+0.122
Hmm+0.122
Sorry+0.119
Um+0.117
Kids+0.115
Neg logit contributions
 up-0.084
 out-0.067
 in-0.065
--0.062
 where-0.062
Token\n
Token ID198
Feature activation-1.53e+01

Pos logit contributions
Uh+0.114
Hmm+0.113
Um+0.113
Sorry+0.110
Kids+0.108
Neg logit contributions
 up-0.114
ish-0.095
--0.094
lers-0.090
 matches-0.089
Token,
Token ID11
Feature activation-1.76e+01

Pos logit contributions
Hmm+0.188
Um+0.187
Uh+0.186
Kids+0.183
Don+0.183
Neg logit contributions
 up-0.025
--0.015
 in-0.009
 things-0.008
ish-0.005
Token she
Token ID673
Feature activation-1.49e+01

Pos logit contributions
Kids+0.144
Um+0.143
Hmm+0.142
Uh+0.142
Sorry+0.141
Neg logit contributions
 up-0.123
ish-0.109
lers-0.102
--0.098
 matches-0.098
Token was
Token ID373
Feature activation-1.49e+01

Pos logit contributions
Hmm+0.103
Kids+0.101
Uh+0.101
Um+0.099
Sorry+0.099
Neg logit contributions
 up-0.125
ish-0.116
lers-0.108
--0.107
 matches-0.103
Token grateful
Token ID14066
Feature activation-1.20e+01

Pos logit contributions
Um+0.105
Once+0.102
Sorry+0.101
Listen+0.100
By+0.099
Neg logit contributions
ish-0.105
ans-0.102
ishes-0.101
an-0.100
lers-0.099
Token that
Token ID326
Feature activation-1.62e+01

Pos logit contributions
Don+0.115
Um+0.115
Kids+0.114
Sorry+0.113
Well+0.113
Neg logit contributions
ish-0.058
 up-0.057
lers-0.053
ishes-0.052
ans-0.050
Token she
Token ID673
Feature activation-1.50e+01

Pos logit contributions
Kids+0.127
Uh+0.127
Hmm+0.127
Um+0.127
Sorry+0.125
Neg logit contributions
 up-0.119
ish-0.103
lers-0.096
 matches-0.096
--0.095
Token had
Token ID550
Feature activation-1.56e+01

Pos logit contributions
Hmm+0.098
Kids+0.097
Um+0.097
Uh+0.097
Sorry+0.095
Neg logit contributions
 up-0.131
ish-0.122
lers-0.114
--0.112
 matches-0.109
Token got
Token ID1392
Feature activation-1.38e+01

Pos logit contributions
Kids+0.125
Um+0.125
Sorry+0.124
Hmm+0.123
Uh+0.121
Neg logit contributions
 up-0.108
ish-0.102
lers-0.094
ishes-0.089
 matches-0.088
Token to
Token ID284
Feature activation-1.64e+01

Pos logit contributions
Um+0.110
Sorry+0.109
Don+0.107
Kids+0.106
Listen+0.106
Neg logit contributions
ish-0.085
an-0.083
ric-0.079
ans-0.079
igs-0.079
Token watch
Token ID2342
Feature activation-1.15e+01

Pos logit contributions
Kids+0.104
Um+0.103
Sorry+0.101
Hmm+0.098
Listen+0.096
Neg logit contributions
 up-0.134
ish-0.134
lers-0.128
ishes-0.122
ishing-0.120
Token the
Token ID262
Feature activation-1.22e+01

Pos logit contributions
Um+0.090
Don+0.089
Kids+0.089
Sorry+0.088
Listen+0.088
Neg logit contributions
ish-0.076
ishes-0.069
igs-0.068
ans-0.067
an-0.067
Token mouth
Token ID5422
Feature activation-2.41e+00

Pos logit contributions
Um+0.066
Hmm+0.062
By+0.061
Don+0.061
Sorry+0.060
Neg logit contributions
 up-0.047
ish-0.041
ans-0.041
ishing-0.040
ishes-0.039
Token.
Token ID13
Feature activation-1.97e+01

Pos logit contributions
Don+0.023
Sorry+0.022
Exc+0.022
Um+0.022
Hmm+0.022
Neg logit contributions
ish-0.011
ishes-0.010
ans-0.010
 up-0.010
ric-0.010
Token He
Token ID679
Feature activation-1.31e+01

Pos logit contributions
Kids+0.134
Uh+0.133
Hmm+0.130
Sorry+0.130
Um+0.130
Neg logit contributions
 up-0.173
ish-0.149
lers-0.142
 matches-0.140
--0.139
Token then
Token ID788
Feature activation-1.54e+01

Pos logit contributions
Kids+0.091
Um+0.091
Hmm+0.091
Sorry+0.090
Uh+0.090
Neg logit contributions
 up-0.106
ish-0.099
lers-0.092
--0.088
ishes-0.088
Token gave
Token ID2921
Feature activation-1.33e+01

Pos logit contributions
Kids+0.106
Um+0.105
Hmm+0.104
Uh+0.104
Sorry+0.103
Neg logit contributions
 up-0.129
ish-0.119
lers-0.112
--0.108
 matches-0.107
Token the
Token ID262
Feature activation-1.22e+01

Pos logit contributions
Don+0.092
Hmm+0.091
Listen+0.089
Sorry+0.087
Exc+0.087
Neg logit contributions
ric-0.097
ish-0.094
ans-0.090
ies-0.089
lers-0.089
Token toast
Token ID27805
Feature activation-3.72e+00

Pos logit contributions
Um+0.109
Hmm+0.109
Uh+0.108
Don+0.108
Sorry+0.107
Neg logit contributions
 up-0.073
ish-0.068
lers-0.063
ishes-0.060
ishing-0.059
Token back
Token ID736
Feature activation-1.01e+01

Pos logit contributions
Don+0.030
By+0.029
Hmm+0.029
Sorry+0.029
Exc+0.029
Neg logit contributions
ish-0.021
ric-0.020
lers-0.020
ishes-0.019
igs-0.019
Token to
Token ID284
Feature activation-1.49e+01

Pos logit contributions
Don+0.053
Not+0.050
Sorry+0.050
Exc+0.050
Well+0.050
Neg logit contributions
ans-0.074
ric-0.072
cks-0.071
ish-0.071
igs-0.070
Token Max
Token ID5436
Feature activation-9.17e+00

Pos logit contributions
Kids+0.088
Um+0.086
Sorry+0.086
Don+0.086
Hmm+0.085
Neg logit contributions
ish-0.129
 up-0.127
lers-0.122
ishes-0.117
igs-0.117
Token and
Token ID290
Feature activation-1.51e+01

Pos logit contributions
Don+0.069
Exc+0.068
Hmm+0.066
Um+0.065
Once+0.065
Neg logit contributions
igs-0.049
ans-0.047
ish-0.046
ric-0.046
ishes-0.046
Token closet
Token ID21615
Feature activation-1.03e+00

Pos logit contributions
Kids+0.089
Um+0.089
Sorry+0.088
Hmm+0.088
Uh+0.088
Neg logit contributions
 up-0.051
ish-0.043
lers-0.039
--0.038
 matches-0.037
Token for
Token ID329
Feature activation-1.38e+01

Pos logit contributions
Hmm+0.010
Um+0.010
Sorry+0.010
Don+0.010
Kids+0.010
Neg logit contributions
 up-0.005
ish-0.004
lers-0.004
ishes-0.004
igs-0.004
Token days
Token ID1528
Feature activation-8.87e+00

Pos logit contributions
Um+0.128
Hmm+0.128
Kids+0.128
Sorry+0.127
Uh+0.126
Neg logit contributions
 up-0.077
ish-0.076
lers-0.066
ishes-0.062
ishing-0.059
Token!
Token ID0
Feature activation-2.01e+01

Pos logit contributions
Don+0.075
By+0.074
Kids+0.072
Well+0.071
Hmm+0.071
Neg logit contributions
ish-0.047
ishes-0.043
igs-0.042
ric-0.042
 up-0.042
Token\n
Token ID198
Feature activation-1.94e+01

Pos logit contributions
Kids+0.154
Um+0.152
Hmm+0.151
Sorry+0.150
Uh+0.149
Neg logit contributions
 up-0.156
ish-0.124
 matches-0.119
--0.118
lers-0.116
Token\n
Token ID198
Feature activation-1.99e+01

Pos logit contributions
Hmm+0.164
Uh+0.164
Um+0.163
Sorry+0.158
Kids+0.155
Neg logit contributions
 up-0.134
--0.104
 in-0.099
 matches-0.096
 out-0.096
TokenJohn
Token ID7554
Feature activation-1.17e+01

Pos logit contributions
Kids+0.110
Um+0.108
Hmm+0.108
Uh+0.100
Well+0.094
Neg logit contributions
 up-0.186
--0.164
 in-0.154
 things-0.150
 matches-0.150
Tokeny
Token ID88
Feature activation-1.09e+01

Pos logit contributions
Sorry+0.098
Kids+0.098
Hmm+0.098
Um+0.098
Uh+0.097
Neg logit contributions
 up-0.080
ish-0.073
lers-0.067
--0.066
ishes-0.062
Token felt
Token ID2936
Feature activation-1.49e+01

Pos logit contributions
Sorry+0.087
Hmm+0.087
Kids+0.087
Uh+0.086
Um+0.085
Neg logit contributions
 up-0.080
ish-0.070
--0.067
lers-0.065
 matches-0.062
Token sorry
Token ID7926
Feature activation-1.47e+01

Pos logit contributions
Kids+0.134
Um+0.131
Hmm+0.130
Sorry+0.130
Uh+0.130
Neg logit contributions
 up-0.096
ish-0.083
lers-0.077
--0.074
 matches-0.072
Token for
Token ID329
Feature activation-1.74e+01

Pos logit contributions
Um+0.168
Hmm+0.168
Kids+0.168
Don+0.167
Sorry+0.166
Neg logit contributions
 up-0.043
ish-0.040
lers-0.034
ishes-0.031
ans-0.031
Token to
Token ID284
Feature activation-1.56e+01

Pos logit contributions
Kids+0.089
Don+0.088
Well+0.087
Sorry+0.085
Um+0.085
Neg logit contributions
ish-0.068
ishes-0.063
ans-0.062
lers-0.062
an-0.059
Token the
Token ID262
Feature activation-1.15e+01

Pos logit contributions
Kids+0.091
Hi+0.083
Hmm+0.083
Um+0.082
Exc+0.082
Neg logit contributions
ish-0.139
lers-0.137
ishes-0.132
 up-0.131
ishing-0.128
Token car
Token ID1097
Feature activation-7.13e+00

Pos logit contributions
Kids+0.105
Um+0.104
Hmm+0.104
Sorry+0.104
Uh+0.103
Neg logit contributions
 up-0.066
ish-0.062
lers-0.057
ishes-0.053
--0.052
Token and
Token ID290
Feature activation-1.56e+01

Pos logit contributions
Exc+0.062
Kids+0.061
Hmm+0.061
Don+0.060
Sorry+0.060
Neg logit contributions
ishes-0.035
igs-0.035
ish-0.034
lers-0.032
 matches-0.031
Token hopped
Token ID45230
Feature activation-1.25e+01

Pos logit contributions
Kids+0.094
Don+0.093
Um+0.092
Sorry+0.091
Hmm+0.090
Neg logit contributions
ish-0.132
 up-0.129
lers-0.123
igs-0.121
ans-0.121
Token into
Token ID656
Feature activation-1.30e+01

Pos logit contributions
Kids+0.133
Don+0.132
Sorry+0.130
Hmm+0.130
Exc+0.129
Neg logit contributions
ish-0.050
ishes-0.043
lers-0.042
 up-0.041
ric-0.041
Token her
Token ID607
Feature activation-1.11e+01

Pos logit contributions
Kids+0.129
Hmm+0.128
Um+0.127
Uh+0.126
Don+0.126
Neg logit contributions
 up-0.063
ish-0.061
lers-0.053
ishes-0.050
ans-0.049
Token seat
Token ID5852
Feature activation-6.54e+00

Pos logit contributions
Um+0.079
Sorry+0.079
Kids+0.078
Dad+0.078
Uh+0.078
Neg logit contributions
ans-0.064
 up-0.063
cks-0.063
ish-0.062
ishes-0.061
Token.
Token ID13
Feature activation-2.18e+01

Pos logit contributions
Sorry+0.054
Don+0.053
Okay+0.052
Kids+0.052
Dad+0.052
Neg logit contributions
ish-0.036
ishes-0.034
 matches-0.033
igs-0.032
lers-0.030
Token Mam
Token ID29926
Feature activation-1.34e+01

Pos logit contributions
Uh+0.143
Um+0.142
Hmm+0.141
Sorry+0.141
Kids+0.140
Neg logit contributions
 up-0.191
ish-0.167
--0.158
 out-0.152
lers-0.151
Tokenma
Token ID2611
Feature activation-1.32e+01

Pos logit contributions
Kids+0.117
Hmm+0.114
Um+0.114
Sorry+0.114
Uh+0.113
Neg logit contributions
 up-0.085
ish-0.076
lers-0.070
 matches-0.067
--0.066