TOP ACTIVATIONS
MAX = 0.311

helped the ambulance driver until the baby was safe again with
elephant couldn 't remember what the tree had told him .
John and Sarah helped the ambulance driver until the baby
So the elephant and the mouse picked a new game
him that they would make the most of their journey together
and round one won in the end . They both laughed
lost . The elephant helped the mouse find his way home
Tom was very sad that the ranch was not his and
From then on , the w affle was known as
his magic to help solve the problem . <|endoftext|> Once there
had won ! He was the fastest in the zoo .
little girl came back to the river and saw the all
John could never return from the tunnel . <|endoftext|> Once upon
He raced around the track until he crossed the
Sally was so happy and the two girls hugged each other
a happy ending , and the helpers promised Daddy that they
The moral of the story is that it 's
There in the corner of the room was an old trunk
journey . After she exited the bus , she couldn 't
the story is that even the smallest of creatures can help

MIN ACTIVATIONS
MIN = -0.381

They think of what they wish for . They hold their
favorite song is " Happy Birthday ". She likes to sing
. Blue had an enormous appetite and lived in a swamp
Everyone started singing ' Happy Birthday ' to John , who
, everybody sang ' Happy Birthday ' for her and gave
said , " Because I wish I was as happy as
in the sand . They wish they could go home .
no more fun . They wish they had shared . <|endoftext|>
. " I wish we could fly ," she
It said , " Happy Birthday Lily !" L
it was a very special wish , because the star wished
such a kind and wild wish . They smiled and felt
brave and sweet . I wish I were real too .
are very sad . They wish they had listened to Mom
Everyone sang the Happy Birthday song . Molly blew out
Lily . I hope your wish comes true ."
had collected and make a wish for a lovely surprise .
happy and give us a wish ." Ben liked
so much TV . They wish they had been good .
she needed and make her wish come true .

INTERVAL 0.138 - 0.311
CONTAINS 16.741%

with a big smile on his face . <|endoftext|> One day
She looked at the cut . It was not
to the zoo and saw a big rh in oc eros
the forest . <|endoftext|> Once upon a time , there was
Lily had long hair , a pink dress , and shiny

INTERVAL -0.035 - 0.138
CONTAINS 37.944%

!" The fire was burning on a big strip
. It might be locked or have something scary in it
the tray carefully . It was a funny mirror tray ,
And I will be nice and playful ." Anna
happy ." " I love them too ," her

INTERVAL -0.208 - -0.035
CONTAINS 43.293%

sat on the bench and talked and laughed . Sara was
house and knocked it down . The sticks and leaves fell
walk down the hallway and noticed a door . She opened
stepped in , Ann al ise started to trem ble .
monster called Zip . Zip hated everything and everyone around him

INTERVAL -0.381 - -0.208
CONTAINS 2.022%

he walked out of the cave , much happier than when
with a smile on her face . <|endoftext|> Once upon a
that she got in the mail . After that
he found a tiny , spark ly button . He pressed
quickly rushed back to her seat and the teacher started the
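
The four INTERVAL blocks above appear to split the range from MIN (-0.381) to MAX (0.311) into equal-width bins, with the CONTAINS figures giving the share of activations in each bin. The sketch below is a minimal check of that reading, not the dashboard's own code: the function name bin_edges and the assumption of four equal-width bins are illustrative, and only the numbers are copied from the dump.

```python
# Values copied from the dump above.
ACT_MIN = -0.381
ACT_MAX = 0.311
# CONTAINS shares, listed from the highest interval to the lowest.
SHARES = [16.741, 37.944, 43.293, 2.022]


def bin_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced edges from lo to hi (hypothetical helper).

    Edges are rounded to 3 decimals to match the precision shown in the dump.
    """
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]


# Assumption: one equal-width bin per INTERVAL block.
edges = bin_edges(ACT_MIN, ACT_MAX, len(SHARES))
# The shares should account for every activation, i.e. sum to 100%.
total = round(sum(SHARES), 3)

if __name__ == "__main__":
    print(edges)
    print(total)
```

Under these assumptions the computed edges match the INTERVAL boundaries listed above (-0.381, -0.208, -0.035, 0.138, 0.311), and the four shares add up to 100%.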
Token helped
Token ID4193
Feature activation-8.72e-02

Pos logit contributions
Mid+0.170
 your+0.170
 Piper+0.167
 her+0.167
 Blink+0.165
Neg logit contributions
apping-0.414
unch-0.392
amb-0.378
apped-0.376
owing-0.372
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
apping+0.229
unch+0.215
owing+0.214
apped+0.213
amb+0.211
Neg logit contributions
Mid-0.054
Scar-0.051
 Blink-0.049
 Piper-0.048
 Midnight-0.047
Token ambulance
Token ID22536
Feature activation-1.69e-01

Pos logit contributions
Mid+0.433
 Piper+0.430
 your+0.430
Scar+0.427
 Blink+0.427
Neg logit contributions
apping-0.576
unch-0.539
amb-0.515
apped-0.511
owing-0.504
Token driver
Token ID4639
Feature activation-1.49e-01

Pos logit contributions
apping+0.345
unch+0.336
amb+0.321
ement+0.317
apped+0.310
Neg logit contributions
 her-0.211
 she-0.203
 your-0.198
 Blink-0.191
 their-0.188
Token until
Token ID1566
Feature activation+8.24e-03

Pos logit contributions
apping+0.285
unch+0.264
owing+0.259
apped+0.253
aring+0.252
Neg logit contributions
Mid-0.205
Scar-0.203
 Piper-0.201
 Midnight-0.198
 Blink-0.197
Token the
Token ID262
Feature activation+3.11e-01

Pos logit contributions
Mid+0.007
Scar+0.006
 Blink+0.006
 Piper+0.006
chy+0.006
Neg logit contributions
apping-0.020
apped-0.019
unch-0.019
ement-0.018
owing-0.018
Token baby
Token ID5156
Feature activation-2.00e-02

Pos logit contributions
 your+0.434
Mid+0.434
 her+0.433
 Midnight+0.422
 Piper+0.422
Neg logit contributions
apping-0.582
unch-0.546
amb-0.522
apped-0.515
owing-0.511
Token was
Token ID373
Feature activation+3.88e-02

Pos logit contributions
apping+0.040
apped+0.038
owing+0.037
unch+0.037
amb+0.036
Neg logit contributions
 Piper-0.025
Mid-0.024
 your-0.024
Scar-0.024
 newly-0.023
Token safe
Token ID3338
Feature activation-8.86e-02

Pos logit contributions
 her+0.033
 your+0.033
Mid+0.031
 Midnight+0.031
 Piper+0.030
Neg logit contributions
apping-0.093
amb-0.090
unch-0.090
aring-0.087
ement-0.087
Token again
Token ID757
Feature activation-2.55e-02

Pos logit contributions
apping+0.147
unch+0.139
amb+0.132
apped+0.129
aring+0.128
Neg logit contributions
 her-0.140
 your-0.140
Mid-0.136
 Piper-0.136
 newly-0.135
Token with
Token ID351
Feature activation-1.63e-03

Pos logit contributions
apping+0.044
unch+0.042
amb+0.040
apped+0.040
ement+0.038
Neg logit contributions
 her-0.040
 your-0.040
 she-0.038
 newly-0.038
 Piper-0.038
Token elephant
Token ID20950
Feature activation-9.79e-02

Pos logit contributions
 her+0.312
 your+0.310
 Blink+0.310
 newly+0.306
Mid+0.301
Neg logit contributions
apping-0.709
unch-0.665
amb-0.649
apped-0.639
owing-0.628
Token couldn
Token ID3521
Feature activation-4.18e-02

Pos logit contributions
apping+0.223
unch+0.212
amb+0.204
apped+0.203
owing+0.201
Neg logit contributions
Mid-0.094
 your-0.093
 Piper-0.091
Scar-0.090
 her-0.090
Token't
Token ID470
Feature activation+7.36e-02

Pos logit contributions
apping+0.074
unch+0.073
ement+0.066
apped+0.065
amb+0.065
Neg logit contributions
Mid-0.062
 trash-0.061
 Piper-0.061
Scar-0.059
 newly-0.058
Token remember
Token ID3505
Feature activation-6.11e-02

Pos logit contributions
 her+0.084
 your+0.084
Mid+0.083
 Blink+0.083
 Piper+0.082
Neg logit contributions
apping-0.155
unch-0.147
apped-0.141
amb-0.140
ement-0.137
Token what
Token ID644
Feature activation+1.52e-01

Pos logit contributions
apping+0.116
unch+0.107
owing+0.106
aring+0.104
ement+0.103
Neg logit contributions
Mid-0.086
Scar-0.079
Max-0.078
 Piper-0.077
 Midnight-0.077
Token the
Token ID262
Feature activation+3.11e-01

Pos logit contributions
Mid+0.162
Scar+0.138
 Midnight+0.136
Elsa+0.135
 Piper+0.134
Neg logit contributions
apping-0.337
unch-0.308
apped-0.307
ement-0.302
aring-0.301
Token tree
Token ID5509
Feature activation-9.21e-02

Pos logit contributions
 same+0.313
 her+0.306
 newly+0.296
 Blink+0.296
 your+0.296
Neg logit contributions
apping-0.753
unch-0.701
amb-0.690
apped-0.684
owing-0.654
Token had
Token ID550
Feature activation+7.35e-02

Pos logit contributions
apping+0.215
unch+0.208
owing+0.202
ement+0.202
apped+0.199
Neg logit contributions
Mid-0.086
 your-0.079
Scar-0.077
 Piper-0.075
 Midnight-0.074
Token told
Token ID1297
Feature activation-1.23e-01

Pos logit contributions
Mid+0.088
 Piper+0.085
 Blink+0.085
 Midnight+0.084
 your+0.084
Neg logit contributions
apping-0.152
unch-0.143
apped-0.138
amb-0.137
owing-0.136
Token him
Token ID683
Feature activation+1.84e-01

Pos logit contributions
apping+0.266
unch+0.251
apped+0.241
owing+0.241
amb+0.238
Neg logit contributions
Mid-0.135
Scar-0.129
 Piper-0.129
 your-0.128
 Midnight-0.127
Token.
Token ID13
Feature activation-4.15e-02

Pos logit contributions
Mid+0.310
 your+0.306
 Piper+0.305
Scar+0.304
 Blink+0.302
Neg logit contributions
apping-0.287
unch-0.265
amb-0.249
apped-0.248
owing-0.247
Token
Token ID198
Feature activation+1.67e-02

Pos logit contributions
 your+0.019
 her+0.019
Mid+0.018
Scar+0.018
 Piper+0.018
Neg logit contributions
apping-0.035
unch-0.033
owing-0.032
amb-0.032
apped-0.032
TokenJohn
Token ID7554
Feature activation+1.33e-01

Pos logit contributions
 Blink+0.021
 Midnight+0.020
 Piper+0.020
chy+0.020
 newly+0.020
Neg logit contributions
apping-0.034
aring-0.031
unch-0.031
amb-0.030
apped-0.030
Token and
Token ID290
Feature activation+2.32e-02

Pos logit contributions
 your+0.157
Mid+0.157
 Piper+0.155
 her+0.154
Scar+0.154
Neg logit contributions
apping-0.273
unch-0.257
amb-0.247
apped-0.246
owing-0.243
Token Sarah
Token ID10490
Feature activation+1.80e-01

Pos logit contributions
Mid+0.030
Scar+0.029
 Midnight+0.028
 Blink+0.027
chy+0.027
Neg logit contributions
apping-0.045
apped-0.041
oot-0.040
aring-0.039
owing-0.039
Token helped
Token ID4193
Feature activation-8.72e-02

Pos logit contributions
Mid+0.170
 your+0.170
 Piper+0.167
 her+0.167
 Blink+0.165
Neg logit contributions
apping-0.414
unch-0.392
amb-0.378
apped-0.376
owing-0.372
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
apping+0.229
unch+0.215
owing+0.214
apped+0.213
amb+0.211
Neg logit contributions
Mid-0.054
Scar-0.051
 Blink-0.049
 Piper-0.048
 Midnight-0.047
Token ambulance
Token ID22536
Feature activation-1.69e-01

Pos logit contributions
Mid+0.433
 Piper+0.430
 your+0.430
Scar+0.427
 Blink+0.427
Neg logit contributions
apping-0.576
unch-0.539
amb-0.515
apped-0.511
owing-0.504
Token driver
Token ID4639
Feature activation-1.49e-01

Pos logit contributions
apping+0.345
unch+0.336
amb+0.321
ement+0.317
apped+0.310
Neg logit contributions
 her-0.211
 she-0.203
 your-0.198
 Blink-0.191
 their-0.188
Token until
Token ID1566
Feature activation+8.24e-03

Pos logit contributions
apping+0.285
unch+0.264
owing+0.259
apped+0.253
aring+0.252
Neg logit contributions
Mid-0.205
Scar-0.203
 Piper-0.201
 Midnight-0.198
 Blink-0.197
Token the
Token ID262
Feature activation+3.11e-01

Pos logit contributions
Mid+0.007
Scar+0.006
 Blink+0.006
 Piper+0.006
chy+0.006
Neg logit contributions
apping-0.020
apped-0.019
unch-0.019
ement-0.018
owing-0.018
Token baby
Token ID5156
Feature activation-2.00e-02

Pos logit contributions
 your+0.434
Mid+0.434
 her+0.433
 Midnight+0.422
 Piper+0.422
Neg logit contributions
apping-0.582
unch-0.546
amb-0.522
apped-0.515
owing-0.511
Token
Token ID198
Feature activation+1.66e-02

Pos logit contributions
 your+0.018
 her+0.017
Mid+0.017
 Blink+0.017
Max+0.017
Neg logit contributions
apping-0.036
owing-0.034
unch-0.034
ement-0.034
amb-0.033
TokenSo
Token ID2396
Feature activation-5.95e-02

Pos logit contributions
 Piper+0.017
 Midnight+0.016
 your+0.016
 trash+0.016
 her+0.016
Neg logit contributions
apping-0.037
unch-0.035
aring-0.034
amb-0.034
apped-0.033
Token the
Token ID262
Feature activation+3.09e-01

Pos logit contributions
apping+0.106
apped+0.097
amb+0.094
unch+0.093
ement+0.091
Neg logit contributions
Mid-0.091
Scar-0.089
 Midnight-0.085
 newly-0.084
 Piper-0.084
Token elephant
Token ID20950
Feature activation-9.82e-02

Pos logit contributions
 Blink+0.305
Mid+0.302
 your+0.300
 her+0.299
 newly+0.297
Neg logit contributions
apping-0.720
unch-0.678
amb-0.666
owing-0.650
apped-0.645
Token and
Token ID290
Feature activation+2.46e-02

Pos logit contributions
apping+0.197
unch+0.185
amb+0.179
apped+0.178
owing+0.175
Neg logit contributions
 your-0.121
Mid-0.120
 Piper-0.118
Scar-0.117
 Blink-0.117
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
Mid+0.019
Scar+0.018
chy+0.018
 newly+0.016
 Midnight+0.016
Neg logit contributions
apping-0.059
apped-0.058
amb-0.054
unch-0.053
owing-0.053
Token mouse
Token ID10211
Feature activation+5.85e-03

Pos logit contributions
Mid+0.299
 your+0.297
 her+0.297
 Blink+0.297
 Piper+0.293
Neg logit contributions
apping-0.720
unch-0.682
amb-0.664
owing-0.648
apped-0.648
Token picked
Token ID6497
Feature activation-7.70e-02

Pos logit contributions
 your+0.006
Mid+0.006
 Piper+0.006
Scar+0.006
 Blink+0.006
Neg logit contributions
apping-0.013
unch-0.012
apped-0.012
amb-0.012
owing-0.012
Token a
Token ID257
Feature activation+1.96e-01

Pos logit contributions
apping+0.137
apped+0.133
ement+0.124
amb+0.123
unch+0.121
Neg logit contributions
Mid-0.111
Scar-0.106
 Midnight-0.104
 Blink-0.103
 Piper-0.102
Token new
Token ID649
Feature activation+4.32e-03

Pos logit contributions
 Blink+0.160
Mid+0.158
Scar+0.152
 Piper+0.150
 Midnight+0.149
Neg logit contributions
apping-0.470
unch-0.438
apped-0.423
ement-0.416
aps-0.413
Token game
Token ID983
Feature activation-1.69e-01

Pos logit contributions
Mid+0.004
 Blink+0.004
Scar+0.004
 Piper+0.004
 your+0.004
Neg logit contributions
apping-0.010
unch-0.009
apped-0.009
amb-0.009
owing-0.009
Token him
Token ID683
Feature activation+1.82e-01

Pos logit contributions
apping+0.186
amb+0.175
unch+0.174
owing+0.171
ement+0.169
Neg logit contributions
Mid-0.148
Scar-0.147
 Midnight-0.139
 newly-0.138
 Blink-0.138
Token that
Token ID326
Feature activation+1.17e-01

Pos logit contributions
Mid+0.283
Scar+0.282
 Blink+0.275
 Piper+0.275
 newly+0.273
Neg logit contributions
apping-0.310
unch-0.284
aring-0.274
amb-0.270
owing-0.270
Token they
Token ID484
Feature activation+2.69e-01

Pos logit contributions
Mid+0.094
Scar+0.089
 Midnight+0.080
 Piper+0.079
 Blink+0.078
Neg logit contributions
apping-0.292
unch-0.281
owing-0.266
apped-0.265
amb-0.263
Token would
Token ID561
Feature activation+1.36e-02

Pos logit contributions
Scar+0.327
Mid+0.311
 Blink+0.309
 your+0.308
 Piper+0.303
Neg logit contributions
apping-0.551
owing-0.525
apped-0.523
aring-0.516
unch-0.511
Token make
Token ID787
Feature activation-4.36e-03

Pos logit contributions
Mid+0.012
 your+0.012
 Piper+0.012
Scar+0.012
 Blink+0.012
Neg logit contributions
apping-0.032
unch-0.030
amb-0.029
apped-0.029
owing-0.029
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
apping+0.009
unch+0.008
ement+0.008
apped+0.008
amb+0.008
Neg logit contributions
Mid-0.005
Scar-0.005
 Midnight-0.005
 Piper-0.005
chy-0.005
Token most
Token ID749
Feature activation+5.77e-02

Pos logit contributions
Mid+0.393
 Midnight+0.391
 Piper+0.385
 Blink+0.384
Scar+0.379
Neg logit contributions
apping-0.607
amb-0.566
unch-0.562
aring-0.546
owing-0.543
Token of
Token ID286
Feature activation+7.02e-02

Pos logit contributions
Mid+0.052
Scar+0.050
 Piper+0.050
 Blink+0.050
 your+0.049
Neg logit contributions
apping-0.138
unch-0.130
amb-0.125
apped-0.124
aring-0.124
Token their
Token ID511
Feature activation+2.75e-01

Pos logit contributions
Mid+0.083
Scar+0.080
Elsa+0.074
Max+0.074
 Piper+0.073
Neg logit contributions
apping-0.143
unch-0.138
ement-0.131
owing-0.130
apped-0.127
Token journey
Token ID7002
Feature activation-2.01e-01

Pos logit contributions
 your+0.326
Mid+0.324
 her+0.320
 Piper+0.319
 Blink+0.317
Neg logit contributions
apping-0.568
unch-0.537
amb-0.514
apped-0.510
owing-0.503
Token together
Token ID1978
Feature activation+3.84e-02

Pos logit contributions
apping+0.332
owing+0.299
apped+0.288
unch+0.288
aring+0.287
Neg logit contributions
Mid-0.337
 Piper-0.322
Scar-0.322
 Midnight-0.319
 Blink-0.316
Token and
Token ID290
Feature activation+2.39e-02

Pos logit contributions
apping+0.153
unch+0.145
amb+0.139
owing+0.135
aring+0.135
Neg logit contributions
 your-0.119
 her-0.119
Mid-0.118
 Piper-0.117
 newly-0.117
Token round
Token ID2835
Feature activation-7.65e-02

Pos logit contributions
Mid+0.031
Scar+0.030
 Piper+0.028
 Midnight+0.028
 Blink+0.028
Neg logit contributions
apping-0.044
apped-0.041
unch-0.040
ement-0.039
amb-0.038
Token one
Token ID530
Feature activation+5.70e-02

Pos logit contributions
apping+0.152
apped+0.139
unch+0.138
ement+0.132
amb+0.132
Neg logit contributions
Mid-0.103
Scar-0.101
 Piper-0.101
 Blink-0.100
 Midnight-0.099
Token won
Token ID1839
Feature activation-6.88e-02

Pos logit contributions
 your+0.042
 her+0.041
Mid+0.040
 Blink+0.040
 Piper+0.040
Neg logit contributions
apping-0.143
unch-0.138
amb-0.134
owing-0.131
aring-0.130
Token in
Token ID287
Feature activation+1.25e-01

Pos logit contributions
apping+0.166
unch+0.155
apped+0.153
owing+0.150
aring+0.149
Neg logit contributions
Mid-0.062
 Piper-0.055
Scar-0.054
 Blink-0.053
 Midnight-0.052
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
 her+0.106
 your+0.105
 the+0.099
 their+0.098
 newly+0.098
Neg logit contributions
apping-0.314
amb-0.307
unch-0.303
aring-0.288
apped-0.288
Token end
Token ID886
Feature activation-1.12e-01

Pos logit contributions
 Blink+0.375
 Piper+0.375
Mid+0.372
Scar+0.364
 Midnight+0.363
Neg logit contributions
apping-0.630
apped-0.574
ement-0.563
unch-0.560
aring-0.548
Token.
Token ID13
Feature activation-4.41e-02

Pos logit contributions
apping+0.205
unch+0.195
amb+0.187
apped+0.182
owing+0.181
Neg logit contributions
 your-0.158
 her-0.155
 Piper-0.153
Mid-0.152
 Blink-0.152
Token They
Token ID1119
Feature activation+1.81e-01

Pos logit contributions
apping+0.105
unch+0.101
owing+0.099
amb+0.097
aring+0.095
Neg logit contributions
 your-0.039
 Blink-0.039
 her-0.038
 Midnight-0.038
 Piper-0.038
Token both
Token ID1111
Feature activation+1.11e-01

Pos logit contributions
Mid+0.172
 your+0.172
 Piper+0.169
 her+0.168
 Blink+0.168
Neg logit contributions
apping-0.416
unch-0.394
amb-0.381
apped-0.379
owing-0.374
Token laughed
Token ID13818
Feature activation-1.14e-01

Pos logit contributions
 your+0.134
 her+0.134
Mid+0.131
 Piper+0.130
 Blink+0.129
Neg logit contributions
apping-0.226
unch-0.217
amb-0.205
ement-0.202
owing-0.200
Token lost
Token ID2626
Feature activation-8.10e-02

Pos logit contributions
 your+0.045
Mid+0.045
 her+0.044
 Piper+0.044
 Blink+0.044
Neg logit contributions
apping-0.076
unch-0.072
amb-0.069
apped-0.069
owing-0.068
Token.
Token ID13
Feature activation-4.44e-02

Pos logit contributions
apping+0.132
unch+0.122
amb+0.116
apped+0.115
owing+0.114
Neg logit contributions
Mid-0.131
 Piper-0.129
Scar-0.129
 your-0.129
 Blink-0.129
Token The
Token ID383
Feature activation+1.90e-01

Pos logit contributions
apping+0.095
unch+0.090
amb+0.086
apped+0.086
aring+0.084
Neg logit contributions
Mid-0.049
 your-0.049
Scar-0.048
 Piper-0.048
 her-0.048
Token elephant
Token ID20950
Feature activation-9.87e-02

Pos logit contributions
 Blink+0.215
Mid+0.214
 your+0.214
 her+0.212
 Midnight+0.211
Neg logit contributions
apping-0.412
unch-0.387
amb-0.374
apped-0.368
owing-0.367
Token helped
Token ID4193
Feature activation-8.56e-02

Pos logit contributions
apping+0.203
unch+0.191
apped+0.190
amb+0.186
owing+0.185
Neg logit contributions
Scar-0.113
Mid-0.113
 your-0.111
 Piper-0.108
 newly-0.107
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
apping+0.148
unch+0.144
owing+0.143
amb+0.135
apped+0.133
Neg logit contributions
Scar-0.123
Mid-0.121
 newly-0.114
 Piper-0.110
Elsa-0.108
Token mouse
Token ID10211
Feature activation+5.65e-03

Pos logit contributions
 your+0.282
Mid+0.278
 Blink+0.278
 her+0.277
 Midnight+0.272
Neg logit contributions
apping-0.736
unch-0.695
amb-0.673
apped-0.667
owing-0.663
Token find
Token ID1064
Feature activation-3.06e-02

Pos logit contributions
Scar+0.006
Mid+0.006
 Piper+0.006
 Blink+0.006
 Midnight+0.005
Neg logit contributions
apping-0.012
unch-0.012
owing-0.012
apped-0.011
amb-0.011
Token his
Token ID465
Feature activation+2.83e-01

Pos logit contributions
apped+0.059
apping+0.058
unch+0.058
owing+0.057
ement+0.056
Neg logit contributions
Scar-0.037
Mid-0.034
Elsa-0.032
Max-0.030
Anna-0.030
Token way
Token ID835
Feature activation-1.12e-01

Pos logit contributions
 your+0.331
 her+0.327
 Blink+0.320
 newly+0.319
 Midnight+0.315
Neg logit contributions
apping-0.618
unch-0.572
apped-0.547
amb-0.547
aring-0.537
Token home
Token ID1363
Feature activation-1.28e-01

Pos logit contributions
apping+0.192
unch+0.174
amb+0.171
owing+0.169
apped+0.168
Neg logit contributions
Mid-0.174
Scar-0.173
 Blink-0.168
 Piper-0.167
 Midnight-0.166
Token Tom
Token ID4186
Feature activation+1.50e-01

Pos logit contributions
apping+0.114
owing+0.108
unch+0.108
amb+0.105
ement+0.102
Neg logit contributions
 your-0.036
 her-0.035
 Max-0.034
 Blink-0.034
Max-0.034
Token was
Token ID373
Feature activation+3.81e-02

Pos logit contributions
 your+0.144
 Piper+0.144
Mid+0.143
Scar+0.142
 newly+0.140
Neg logit contributions
apping-0.338
unch-0.315
apped-0.314
amb-0.310
aring-0.308
Token very
Token ID845
Feature activation-3.74e-02

Pos logit contributions
 her+0.041
 your+0.041
 Midnight+0.037
Mid+0.037
 Blink+0.036
Neg logit contributions
apping-0.085
unch-0.083
amb-0.081
ement-0.078
owing-0.077
Token sad
Token ID6507
Feature activation-1.11e-01

Pos logit contributions
apping+0.082
unch+0.081
ement+0.078
amb+0.078
owing+0.075
Neg logit contributions
 your-0.041
 her-0.039
Mid-0.037
 our-0.036
 she-0.036
Token that
Token ID326
Feature activation+1.16e-01

Pos logit contributions
apping+0.233
unch+0.227
apped+0.218
amb+0.217
ement+0.212
Neg logit contributions
 her-0.129
 she-0.128
 your-0.128
 newly-0.116
 Midnight-0.114
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
Mid+0.098
Scar+0.092
 Piper+0.087
 Blink+0.080
Elsa+0.079
Neg logit contributions
apping-0.281
unch-0.264
apped-0.259
owing-0.255
ement-0.253
Token ranch
Token ID28752
Feature activation-1.17e-01

Pos logit contributions
 your+0.345
Mid+0.342
Ol+0.342
 Blink+0.340
Scar+0.339
Neg logit contributions
apping-0.618
amb-0.584
unch-0.578
aring-0.575
apped-0.568
Token was
Token ID373
Feature activation+3.86e-02

Pos logit contributions
apping+0.261
apped+0.245
unch+0.242
ement+0.241
owing+0.239
Neg logit contributions
Mid-0.120
 your-0.116
Scar-0.116
 Piper-0.115
 newly-0.109
Token not
Token ID407
Feature activation+7.17e-02

Pos logit contributions
Mid+0.045
 Piper+0.045
Scar+0.045
 Blink+0.045
 newly+0.042
Neg logit contributions
apping-0.079
unch-0.073
owing-0.071
apped-0.070
aring-0.070
Token his
Token ID465
Feature activation+2.82e-01

Pos logit contributions
Mid+0.091
 Piper+0.090
Ol+0.087
 Blink+0.087
Ruby+0.086
Neg logit contributions
apping-0.140
owing-0.128
unch-0.126
aring-0.123
apped-0.122
Token and
Token ID290
Feature activation+2.46e-02

Pos logit contributions
Mid+0.417
 Piper+0.410
Scar+0.409
 Blink+0.407
 your+0.404
Neg logit contributions
apping-0.498
unch-0.458
owing-0.446
amb-0.445
aring-0.439
Token
Token ID220
Feature activation-4.82e-02

Pos logit contributions
apping+0.072
unch+0.067
amb+0.065
owing+0.064
apped+0.064
Neg logit contributions
 your-0.041
 her-0.040
Mid-0.040
 Blink-0.040
 Piper-0.040
Token From
Token ID3574
Feature activation+1.48e-02

Pos logit contributions
apping+0.093
unch+0.088
amb+0.084
apped+0.083
aring+0.082
Neg logit contributions
Mid-0.063
 your-0.062
 Piper-0.062
 her-0.062
Scar-0.061
Token then
Token ID788
Feature activation+2.73e-03

Pos logit contributions
Mid+0.024
Scar+0.023
 Piper+0.023
 Blink+0.023
 Midnight+0.023
Neg logit contributions
apping-0.024
unch-0.022
amb-0.022
apped-0.021
ement-0.021
Token on
Token ID319
Feature activation+1.03e-01

Pos logit contributions
Mid+0.003
 your+0.003
 Piper+0.003
Scar+0.003
 Blink+0.003
Neg logit contributions
apping-0.005
unch-0.005
amb-0.005
apped-0.005
owing-0.005
Token,
Token ID11
Feature activation-8.53e-04

Pos logit contributions
Mid+0.239
Scar+0.234
 trash+0.225
Elsa+0.220
Beck+0.213
Neg logit contributions
apping-0.098
ement-0.088
apped-0.081
amb-0.067
atted-0.066
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
apping+0.002
unch+0.002
apped+0.002
amb+0.002
owing+0.002
Neg logit contributions
Mid-0.001
Scar-0.001
 Piper-0.001
 Midnight-0.001
 trash-0.001
Token w
Token ID266
Feature activation-1.52e-01

Pos logit contributions
 same+0.224
Mid+0.222
 newly+0.218
 Midnight+0.215
 trash+0.214
Neg logit contributions
apping-0.828
amb-0.775
unch-0.771
owing-0.763
apped-0.760
Tokenaffle
Token ID30697
Feature activation-1.76e-01

Pos logit contributions
apping+0.236
unch+0.216
apped+0.209
amb+0.208
owing+0.203
Neg logit contributions
Mid-0.251
 your-0.250
 newly-0.245
 Piper-0.245
Scar-0.245
Token was
Token ID373
Feature activation+3.74e-02

Pos logit contributions
apping+0.367
apped+0.346
unch+0.336
amb+0.335
owing+0.324
Neg logit contributions
Mid-0.204
 Piper-0.199
Scar-0.197
 your-0.192
 Midnight-0.191
Token known
Token ID1900
Feature activation-7.26e-02

Pos logit contributions
Mid+0.044
 Piper+0.042
Scar+0.042
 Blink+0.042
 trash+0.041
Neg logit contributions
apping-0.076
unch-0.071
apped-0.068
owing-0.068
aring-0.067
Token as
Token ID355
Feature activation+1.44e-02

Pos logit contributions
apping+0.124
unch+0.114
amb+0.110
apped+0.108
aring+0.108
Neg logit contributions
Mid-0.112
 Piper-0.110
Scar-0.110
 Blink-0.109
 Midnight-0.109
Token his
Token ID465
Feature activation+2.80e-01

Pos logit contributions
apping+0.145
unch+0.137
apped+0.134
amb+0.132
owing+0.132
Neg logit contributions
Mid-0.046
Scar-0.044
 Piper-0.043
 Blink-0.043
 Midnight-0.043
Token magic
Token ID5536
Feature activation-1.36e-01

Pos logit contributions
 your+0.275
 her+0.267
 newly+0.265
Mid+0.264
 Piper+0.264
Neg logit contributions
apping-0.643
unch-0.616
amb-0.594
apped-0.585
aring-0.574
Token to
Token ID284
Feature activation+1.17e-01

Pos logit contributions
apping+0.267
unch+0.256
amb+0.243
apped+0.240
owing+0.234
Neg logit contributions
 your-0.176
 her-0.174
Mid-0.171
 newly-0.170
 Piper-0.170
Token help
Token ID1037
Feature activation-1.12e-01

Pos logit contributions
Mid+0.124
Scar+0.119
 Piper+0.118
 Midnight+0.118
 Blink+0.116
Neg logit contributions
apping-0.256
unch-0.243
owing-0.233
apped-0.230
amb-0.229
Token solve
Token ID8494
Feature activation-1.49e-02

Pos logit contributions
apping+0.230
unch+0.215
owing+0.207
apped+0.204
amb+0.204
Neg logit contributions
Mid-0.137
Scar-0.132
 Piper-0.129
 Midnight-0.129
 Blink-0.127
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
apping+0.035
ement+0.033
owing+0.032
unch+0.031
aps+0.031
Neg logit contributions
Mid-0.013
 Midnight-0.012
Scar-0.012
 Piper-0.012
 Blink-0.011
Token problem
Token ID1917
Feature activation-1.36e-01

Pos logit contributions
 your+0.339
Mid+0.329
 her+0.328
 Piper+0.327
Scar+0.326
Neg logit contributions
apping-0.675
unch-0.644
amb-0.620
apped-0.613
owing-0.603
Token.
Token ID13
Feature activation-4.45e-02

Pos logit contributions
apping+0.231
unch+0.218
amb+0.208
apped+0.204
owing+0.200
Neg logit contributions
 your-0.209
 her-0.207
 Piper-0.205
Mid-0.204
 Blink-0.204
Token<|endoftext|>
Token ID50256
Feature activation-9.68e-02

Pos logit contributions
apping+0.096
unch+0.090
owing+0.090
amb+0.086
ement+0.086
Neg logit contributions
 your-0.050
 Blink-0.049
 her-0.048
 Piper-0.048
 Midnight-0.047
TokenOnce
Token ID7454
Feature activation-4.54e-02

Pos logit contributions
apping+0.216
unch+0.205
aring+0.201
amb+0.201
owing+0.199
Neg logit contributions
 her-0.100
 Blink-0.097
 your-0.097
Max-0.095
 Midnight-0.094
Token there
Token ID612
Feature activation+1.43e-01

Pos logit contributions
apping+0.109
unch+0.102
apped+0.099
ement+0.098
owing+0.096
Neg logit contributions
Mid-0.045
 Midnight-0.041
Scar-0.041
 Piper-0.040
 Blink-0.039
Token had
Token ID550
Feature activation+7.10e-02

Pos logit contributions
apping+0.027
unch+0.026
amb+0.025
apped+0.025
owing+0.024
Neg logit contributions
Mid-0.011
 your-0.011
 Piper-0.010
 her-0.010
 Blink-0.010
Token won
Token ID1839
Feature activation-6.98e-02

Pos logit contributions
Mid+0.087
 Piper+0.085
 Midnight+0.082
 Blink+0.082
Scar+0.079
Neg logit contributions
apping-0.143
unch-0.136
apped-0.132
aring-0.127
owing-0.124
Token!
Token ID0
Feature activation-3.51e-02

Pos logit contributions
apping+0.126
unch+0.120
owing+0.111
apped+0.110
ement+0.110
Neg logit contributions
Mid-0.101
 Piper-0.098
Scar-0.096
 Midnight-0.095
 Blink-0.094
Token He
Token ID679
Feature activation+2.08e-01

Pos logit contributions
apping+0.080
unch+0.076
amb+0.073
apped+0.073
aring+0.071
Neg logit contributions
Mid-0.035
 your-0.034
 Piper-0.034
 her-0.034
Scar-0.034
Token was
Token ID373
Feature activation+3.79e-02

Pos logit contributions
Mid+0.175
 your+0.175
 Piper+0.172
 her+0.170
Scar+0.169
Neg logit contributions
apping-0.500
unch-0.475
amb-0.460
apped-0.457
owing-0.452
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
 your+0.047
 her+0.047
Mid+0.045
 Piper+0.045
Scar+0.045
Neg logit contributions
apping-0.076
unch-0.073
amb-0.071
apped-0.069
ement-0.068
Token fastest
Token ID14162
Feature activation-6.12e-02

Pos logit contributions
Mid+0.324
 your+0.319
 her+0.316
 Piper+0.316
Scar+0.312
Neg logit contributions
apping-0.686
unch-0.645
amb-0.622
apped-0.617
owing-0.612
Token in
Token ID287
Feature activation+1.27e-01

Pos logit contributions
apping+0.109
unch+0.102
amb+0.097
apped+0.096
owing+0.095
Neg logit contributions
Mid-0.090
 your-0.089
 Piper-0.089
Scar-0.089
 Blink-0.088
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
Mid+0.051
 Piper+0.043
Scar+0.043
 Midnight+0.042
 Blink+0.040
Neg logit contributions
apping-0.364
unch-0.347
apped-0.337
amb-0.334
owing-0.331
Token zoo
Token ID26626
Feature activation-1.83e-01

Pos logit contributions
Mid+0.388
 your+0.384
 her+0.382
 Piper+0.382
Scar+0.379
Neg logit contributions
apping-0.622
unch-0.582
apped-0.557
amb-0.556
aring-0.548
Token.
Token ID13
Feature activation-4.43e-02

Pos logit contributions
apping+0.324
unch+0.313
amb+0.305
apped+0.289
aring+0.284
Neg logit contributions
 her-0.266
 your-0.263
 she-0.259
 Blink-0.257
 Piper-0.252
Token little
Token ID1310
Feature activation-1.46e-02

Pos logit contributions
 your+0.275
Mid+0.274
 her+0.273
 Piper+0.267
 Midnight+0.267
Neg logit contributions
apping-0.731
unch-0.695
amb-0.669
owing-0.665
apped-0.663
Token girl
Token ID2576
Feature activation+6.27e-02

Pos logit contributions
apping+0.030
apped+0.029
amb+0.029
ement+0.028
unch+0.027
Neg logit contributions
Scar-0.017
Mid-0.017
 Blink-0.016
Ol-0.016
 Piper-0.015
Token came
Token ID1625
Feature activation-3.33e-02

Pos logit contributions
Mid+0.058
 your+0.057
 newly+0.055
Scar+0.055
 Piper+0.055
Neg logit contributions
apping-0.145
unch-0.137
apped-0.133
amb-0.132
owing-0.132
Token back
Token ID736
Feature activation+3.06e-02

Pos logit contributions
apping+0.071
unch+0.070
ement+0.067
amb+0.067
apped+0.065
Neg logit contributions
 her-0.036
 your-0.036
Mid-0.034
 Piper-0.034
 newly-0.033
Token to
Token ID284
Feature activation+1.18e-01

Pos logit contributions
 her+0.041
 your+0.039
 she+0.038
 Piper+0.037
 newly+0.037
Neg logit contributions
apping-0.061
unch-0.059
ement-0.056
amb-0.056
apped-0.055
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
Mid+0.084
Scar+0.082
Max+0.077
 Piper+0.076
 Blink+0.074
Neg logit contributions
apping-0.284
unch-0.277
owing-0.266
apped-0.262
amb-0.261
Token river
Token ID7850
Feature activation-1.36e-01

Pos logit contributions
Mid+0.414
Scar+0.409
 Piper+0.407
 Blink+0.406
 your+0.403
Neg logit contributions
apping-0.595
unch-0.558
amb-0.537
apped-0.534
ement-0.529
Token and
Token ID290
Feature activation+2.40e-02

Pos logit contributions
amb+0.263
apping+0.259
unch+0.255
ement+0.247
apped+0.240
Neg logit contributions
 her-0.175
 she-0.174
 the-0.173
 your-0.165
 their-0.162
Token saw
Token ID2497
Feature activation-3.28e-02

Pos logit contributions
Mid+0.034
Scar+0.034
 newly+0.033
Elsa+0.032
Max+0.032
Neg logit contributions
apped-0.037
apping-0.037
 sur-0.036
atted-0.035
owing-0.034
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
apping+0.062
unch+0.060
apped+0.059
ement+0.058
amb+0.056
Neg logit contributions
Scar-0.041
Mid-0.040
 newly-0.038
 Blink-0.037
Max-0.037
Token all
Token ID477
Feature activation+4.64e-02

Pos logit contributions
Mid+0.360
 your+0.357
 Piper+0.355
Scar+0.353
 Blink+0.353
Neg logit contributions
apping-0.646
unch-0.612
amb-0.587
apped-0.583
ement-0.579
TokenJohn
Token ID7554
Feature activation+1.32e-01

Pos logit contributions
 Piper+0.022
Mid+0.022
 your+0.021
 Midnight+0.021
 Blink+0.021
Neg logit contributions
apping-0.035
unch-0.034
amb-0.032
aring-0.032
apped-0.031
Token could
Token ID714
Feature activation+7.26e-04

Pos logit contributions
Mid+0.112
 your+0.111
 Piper+0.109
Scar+0.109
 Blink+0.108
Neg logit contributions
apping-0.317
unch-0.301
amb-0.292
apped-0.291
owing-0.288
Token never
Token ID1239
Feature activation+6.60e-03

Pos logit contributions
Mid+0.001
 trash+0.001
 Piper+0.001
Scar+0.001
 Blink+0.001
Neg logit contributions
apping-0.002
unch-0.002
amb-0.001
apped-0.001
oot-0.001
Token return
Token ID1441
Feature activation-1.05e-01

Pos logit contributions
 her+0.006
 your+0.006
Mid+0.005
 Piper+0.005
 newly+0.005
Neg logit contributions
apping-0.016
unch-0.015
apped-0.015
amb-0.015
ement-0.014
Token from
Token ID422
Feature activation+6.19e-02

Pos logit contributions
apping+0.209
unch+0.196
owing+0.191
amb+0.189
apped+0.189
Neg logit contributions
Mid-0.134
Scar-0.129
 Piper-0.127
 Midnight-0.125
 Blink-0.124
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
Mid+0.031
Scar+0.029
 Piper+0.028
 Midnight+0.027
 Blink+0.027
Neg logit contributions
apping-0.170
unch-0.163
amb-0.158
apped-0.158
owing-0.157
Token tunnel
Token ID13275
Feature activation-1.96e-01

Pos logit contributions
 your+0.344
Mid+0.343
 Piper+0.338
 her+0.337
 Blink+0.335
Neg logit contributions
apping-0.665
unch-0.629
amb-0.604
apped-0.599
owing-0.593
Token.
Token ID13
Feature activation-4.19e-02

Pos logit contributions
apping+0.318
unch+0.298
amb+0.286
apped+0.280
owing+0.272
Neg logit contributions
 your-0.314
 her-0.313
Mid-0.307
 Piper-0.305
 Blink-0.305
Token<|endoftext|>
Token ID50256
Feature activation-9.72e-02

Pos logit contributions
apping+0.101
unch+0.097
amb+0.093
owing+0.092
apped+0.092
Neg logit contributions
 your-0.035
 her-0.034
 Blink-0.034
 newly-0.033
 Piper-0.033
TokenOnce
Token ID7454
Feature activation-4.66e-02

Pos logit contributions
apping+0.226
unch+0.215
amb+0.208
aring+0.206
owing+0.206
Neg logit contributions
 your-0.087
 her-0.087
Mid-0.087
 Piper-0.086
 Blink-0.086
Token upon
Token ID2402
Feature activation+2.08e-01

Pos logit contributions
apping+0.112
unch+0.106
apped+0.102
ement+0.101
owing+0.101
Neg logit contributions
Mid-0.045
 Midnight-0.041
Scar-0.041
 Piper-0.040
 Blink-0.039
Token
Token ID198
Feature activation+1.63e-02

Pos logit contributions
apping+0.108
unch+0.104
owing+0.101
amb+0.100
ement+0.098
Neg logit contributions
 your-0.042
 Blink-0.042
 her-0.040
 Piper-0.040
 Midnight-0.040
Token
Token ID198
Feature activation+1.66e-02

Pos logit contributions
 your+0.019
Mid+0.018
 her+0.018
 Blink+0.018
Scar+0.018
Neg logit contributions
apping-0.035
unch-0.032
owing-0.032
ement-0.032
amb-0.032
TokenHe
Token ID1544
Feature activation+1.91e-01

Pos logit contributions
 Piper+0.017
 Midnight+0.016
 Blink+0.016
 newly+0.016
 trash+0.016
Neg logit contributions
apping-0.038
unch-0.036
aring-0.035
amb-0.034
apped-0.033
Token raced
Token ID32897
Feature activation-1.19e-01

Pos logit contributions
 your+0.152
 her+0.147
 newly+0.146
Scar+0.144
 Blink+0.144
Neg logit contributions
apping-0.468
unch-0.454
owing-0.432
amb-0.431
ement-0.424
Token around
Token ID1088
Feature activation+4.03e-02

Pos logit contributions
apping+0.251
unch+0.234
amb+0.228
apped+0.227
owing+0.223
Neg logit contributions
Mid-0.136
 Piper-0.134
 Blink-0.132
 your-0.132
 Midnight-0.132
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
Mid+0.021
 Piper+0.020
Scar+0.019
 your+0.019
 Midnight+0.019
Neg logit contributions
apping-0.110
unch-0.104
apped-0.102
amb-0.101
owing-0.101
Token track
Token ID2610
Feature activation-1.33e-01

Pos logit contributions
Mid+0.331
 your+0.329
 Piper+0.326
 her+0.324
 Blink+0.323
Neg logit contributions
apping-0.677
unch-0.638
amb-0.614
apped-0.611
owing-0.603
Token until
Token ID1566
Feature activation+9.64e-03

Pos logit contributions
unch+0.268
apping+0.266
amb+0.256
owing+0.244
aring+0.243
Neg logit contributions
 her-0.160
 she-0.157
 your-0.153
 same-0.146
 the-0.145
Token he
Token ID339
Feature activation+2.85e-01

Pos logit contributions
Mid+0.009
Scar+0.008
 Piper+0.008
 Midnight+0.008
 Blink+0.008
Neg logit contributions
apping-0.023
apped-0.022
unch-0.021
ement-0.021
oot-0.020
Token crossed
Token ID12606
Feature activation-1.10e-02

Pos logit contributions
 your+0.196
 her+0.194
Scar+0.191
 Blink+0.189
 Piper+0.187
Neg logit contributions
apping-0.722
unch-0.692
amb-0.677
owing-0.666
aring-0.660
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
apping+0.030
apped+0.028
unch+0.027
ement+0.027
owing+0.027
Neg logit contributions
Mid-0.007
 Piper-0.006
 Midnight-0.006
 Blink-0.006
Scar-0.005
Token Sally
Token ID25737
Feature activation+1.39e-01

Pos logit contributions
apping+0.106
unch+0.100
owing+0.098
amb+0.097
aring+0.096
Neg logit contributions
 her-0.045
 Piper-0.044
 your-0.044
 she-0.042
Mid-0.042
Token was
Token ID373
Feature activation+3.48e-02

Pos logit contributions
Mid+0.164
 your+0.159
 Blink+0.157
 newly+0.157
 Piper+0.157
Neg logit contributions
apping-0.280
amb-0.264
apped-0.261
unch-0.257
aring-0.252
Token so
Token ID523
Feature activation-9.05e-03

Pos logit contributions
 her+0.044
 your+0.043
 she+0.040
 Midnight+0.040
Mid+0.039
Neg logit contributions
apping-0.071
unch-0.069
amb-0.068
owing-0.065
aring-0.064
Token happy
Token ID3772
Feature activation-1.04e-01

Pos logit contributions
apping+0.020
amb+0.020
unch+0.020
ement+0.020
owing+0.020
Neg logit contributions
 her-0.009
 your-0.008
 she-0.008
 newly-0.008
 their-0.007
Token and
Token ID290
Feature activation+2.25e-02

Pos logit contributions
apping+0.203
unch+0.192
amb+0.184
apped+0.183
owing+0.180
Neg logit contributions
 her-0.136
 your-0.134
 she-0.131
 Piper-0.130
Mid-0.130
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
Mid+0.032
Scar+0.031
 Blink+0.030
 Piper+0.030
 newly+0.030
Neg logit contributions
apping-0.041
unch-0.038
apped-0.038
amb-0.037
aring-0.036
Token two
Token ID734
Feature activation+9.01e-02

Pos logit contributions
 your+0.407
Mid+0.406
 her+0.402
 Piper+0.401
Scar+0.398
Neg logit contributions
apping-0.602
unch-0.567
amb-0.543
apped-0.535
owing-0.532
Token girls
Token ID4813
Feature activation+2.15e-02

Pos logit contributions
Mid+0.160
 Blink+0.160
Scar+0.157
 your+0.156
 Piper+0.155
Neg logit contributions
apping-0.140
apped-0.125
unch-0.125
ement-0.120
owing-0.115
Token hugged
Token ID41130
Feature activation-1.66e-01

Pos logit contributions
 her+0.021
 your+0.020
Mid+0.020
 Piper+0.019
 she+0.019
Neg logit contributions
apping-0.049
unch-0.047
amb-0.045
aring-0.044
ement-0.044
Token each
Token ID1123
Feature activation+1.10e-01

Pos logit contributions
apping+0.396
unch+0.389
aring+0.365
amb+0.361
ement+0.354
Neg logit contributions
 her-0.167
 your-0.148
 Piper-0.146
 she-0.143
Mid-0.142
Token other
Token ID584
Feature activation+7.98e-04

Pos logit contributions
 Blink+0.076
Mid+0.076
Scar+0.076
 Midnight+0.074
 Piper+0.074
Neg logit contributions
apping-0.275
unch-0.260
amb-0.256
apped-0.255
ement-0.251
Token a
Token ID257
Feature activation+1.94e-01

Pos logit contributions
Mid+0.077
 Piper+0.075
 Midnight+0.073
 Blink+0.071
 newly+0.067
Neg logit contributions
apping-0.147
apped-0.140
unch-0.140
amb-0.129
ement-0.128
Token happy
Token ID3772
Feature activation-1.03e-01

Pos logit contributions
 Blink+0.130
Mid+0.129
 Piper+0.129
 your+0.127
 her+0.124
Neg logit contributions
apping-0.490
unch-0.467
apped-0.449
ement-0.448
amb-0.441
Token ending
Token ID7464
Feature activation-9.94e-02

Pos logit contributions
apping+0.209
unch+0.197
amb+0.188
apped+0.187
owing+0.185
Neg logit contributions
Mid-0.127
 your-0.127
 Piper-0.125
 her-0.125
 Blink-0.124
Token,
Token ID11
Feature activation-2.29e-03

Pos logit contributions
apping+0.180
unch+0.167
amb+0.159
apped+0.159
owing+0.157
Neg logit contributions
Mid-0.144
 your-0.142
 Piper-0.141
Scar-0.141
 Blink-0.141
Token and
Token ID290
Feature activation+2.32e-02

Pos logit contributions
apping+0.004
unch+0.004
apped+0.004
amb+0.004
owing+0.004
Neg logit contributions
Mid-0.003
Scar-0.003
 Piper-0.003
 Blink-0.003
 Midnight-0.003
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
Mid+0.020
Scar+0.019
 Piper+0.017
 Blink+0.017
Max+0.017
Neg logit contributions
apping-0.056
apped-0.055
amb-0.051
ement-0.050
unch-0.050
Token helpers
Token ID49385
Feature activation-1.06e-01

Pos logit contributions
Mid+0.434
 Piper+0.416
 Blink+0.411
 your+0.409
 Midnight+0.407
Neg logit contributions
apping-0.578
ement-0.533
apped-0.520
unch-0.506
owing-0.499
Token promised
Token ID8072
Feature activation-9.33e-02

Pos logit contributions
apping+0.234
unch+0.222
amb+0.213
apped+0.211
owing+0.210
Neg logit contributions
 your-0.108
Mid-0.107
 her-0.107
 Piper-0.105
Scar-0.105
Token Daddy
Token ID26926
Feature activation+5.59e-02

Pos logit contributions
apping+0.177
unch+0.164
amb+0.158
owing+0.156
apped+0.156
Neg logit contributions
Mid-0.129
 Piper-0.123
Scar-0.123
 Blink-0.122
 Midnight-0.121
Token that
Token ID326
Feature activation+1.17e-01

Pos logit contributions
Mid+0.093
Scar+0.092
 Piper+0.091
 trash+0.089
 Blink+0.088
Neg logit contributions
apping-0.087
aring-0.079
owing-0.078
apped-0.077
unch-0.075
Token they
Token ID484
Feature activation+2.69e-01

Pos logit contributions
Mid+0.101
Scar+0.096
 Piper+0.092
 Blink+0.088
 Midnight+0.086
Neg logit contributions
apping-0.281
unch-0.265
apped-0.257
owing-0.253
amb-0.250
Token
Token ID198
Feature activation+1.76e-02

Pos logit contributions
apping+0.094
unch+0.089
amb+0.085
apped+0.085
aring+0.083
Neg logit contributions
Mid-0.057
 your-0.057
 Piper-0.056
 her-0.056
Scar-0.056
Token
Token ID198
Feature activation+1.79e-02

Pos logit contributions
 her+0.020
 your+0.020
Mid+0.020
 Blink+0.019
 Midnight+0.019
Neg logit contributions
apping-0.038
owing-0.035
unch-0.035
amb-0.034
ement-0.034
TokenThe
Token ID464
Feature activation+2.12e-01

Pos logit contributions
 Blink+0.022
 Piper+0.021
 trash+0.020
 your+0.020
 newly+0.020
Neg logit contributions
apping-0.036
unch-0.034
aring-0.033
amb-0.032
apped-0.032
Token moral
Token ID6573
Feature activation-1.07e-01

Pos logit contributions
 newly+0.184
 her+0.184
Mid+0.180
 your+0.179
 Midnight+0.176
Neg logit contributions
apping-0.528
owing-0.487
unch-0.484
amb-0.479
apped-0.478
Token of
Token ID286
Feature activation+6.93e-02

Pos logit contributions
apping+0.154
unch+0.144
owing+0.137
amb+0.135
ement+0.134
Neg logit contributions
 your-0.198
 her-0.193
 she-0.191
 Blink-0.188
 Piper-0.186
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
Scar+0.039
Mid+0.034
 Piper+0.032
 newly+0.030
 Blink+0.029
Neg logit contributions
apping-0.181
unch-0.176
amb-0.170
apped-0.169
owing-0.165
Token story
Token ID1621
Feature activation-1.51e-01

Pos logit contributions
 your+0.422
 her+0.417
Mid+0.416
 Blink+0.416
 Midnight+0.415
Neg logit contributions
apping-0.602
unch-0.547
apped-0.540
amb-0.535
owing-0.534
Token is
Token ID318
Feature activation+5.20e-02

Pos logit contributions
apping+0.431
unch+0.421
amb+0.405
owing+0.403
apped+0.394
Neg logit contributions
 your-0.066
 her-0.059
 has-0.057
 Blink-0.053
 she-0.050
Token that
Token ID326
Feature activation+1.16e-01

Pos logit contributions
 your+0.071
 her+0.068
 Piper+0.067
 she+0.066
 Blink+0.065
Neg logit contributions
apping-0.100
unch-0.096
ement-0.092
amb-0.092
apped-0.090
Token it
Token ID340
Feature activation+1.97e-01

Pos logit contributions
 your+0.120
 her+0.116
Mid+0.113
 Piper+0.112
 Blink+0.112
Neg logit contributions
apping-0.265
unch-0.250
amb-0.243
apped-0.241
ement-0.239
Token's
Token ID338
Feature activation+1.35e-01

Pos logit contributions
Mid+0.196
Scar+0.192
 newly+0.177
Max+0.175
 Blink+0.174
Neg logit contributions
apping-0.442
aring-0.425
apped-0.413
amb-0.398
owing-0.395
Token There
Token ID1318
Feature activation+9.96e-02

Pos logit contributions
apping+0.106
unch+0.102
amb+0.098
apped+0.097
aring+0.096
Neg logit contributions
Mid-0.035
 your-0.035
Scar-0.035
 her-0.035
 Piper-0.035
Token in
Token ID287
Feature activation+1.26e-01

Pos logit contributions
Mid+0.165
Scar+0.156
 newly+0.148
Elsa+0.147
 Piper+0.147
Neg logit contributions
apping-0.171
apped-0.164
unch-0.164
ement-0.158
owing-0.150
Token the
Token ID262
Feature activation+3.09e-01

Pos logit contributions
Mid+0.083
Scar+0.073
 Piper+0.067
 Midnight+0.067
chy+0.063
Neg logit contributions
apping-0.325
unch-0.311
ement-0.307
apped-0.304
aps-0.299
Token corner
Token ID5228
Feature activation-8.99e-02

Pos logit contributions
Mid+0.362
 Piper+0.359
 your+0.358
 Blink+0.356
Scar+0.355
Neg logit contributions
apping-0.644
unch-0.604
amb-0.580
apped-0.579
owing-0.571
Token of
Token ID286
Feature activation+6.94e-02

Pos logit contributions
apping+0.134
unch+0.120
apped+0.115
ement+0.112
owing+0.111
Neg logit contributions
Mid-0.167
Scar-0.162
 Midnight-0.161
 Piper-0.160
 Blink-0.158
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
Mid+0.034
Scar+0.033
 Piper+0.028
 newly+0.026
 Blink+0.026
Neg logit contributions
apping-0.187
unch-0.180
apped-0.177
ement-0.173
amb-0.172
Token room
Token ID2119
Feature activation-1.38e-01

Pos logit contributions
 her+0.310
 your+0.310
 same+0.297
 newly+0.293
 she+0.290
Neg logit contributions
apping-0.723
unch-0.715
amb-0.693
apped-0.666
owing-0.666
Token was
Token ID373
Feature activation+3.98e-02

Pos logit contributions
apping+0.245
unch+0.222
apped+0.219
ement+0.209
owing+0.208
Neg logit contributions
Mid-0.222
Scar-0.212
 Midnight-0.211
 Piper-0.209
 Blink-0.208
Token an
Token ID281
Feature activation+1.65e-01

Pos logit contributions
Mid+0.030
Scar+0.028
 Piper+0.028
 Blink+0.027
 your+0.026
Neg logit contributions
apping-0.099
unch-0.094
apped-0.092
ement-0.090
owing-0.089
Token old
Token ID1468
Feature activation-3.14e-02

Pos logit contributions
 her+0.116
 your+0.108
 same+0.103
 bad+0.100
 dangerous+0.098
Neg logit contributions
apping-0.443
unch-0.431
apped-0.421
aps-0.412
aring-0.404
Token trunk
Token ID21427
Feature activation-1.68e-01

Pos logit contributions
apping+0.066
unch+0.063
amb+0.060
apped+0.060
owing+0.059
Neg logit contributions
Mid-0.035
 your-0.035
 Piper-0.035
Scar-0.034
 her-0.034
Token journey
Token ID7002
Feature activation-2.02e-01

Pos logit contributions
 your+0.297
 newly+0.292
 same+0.291
 her+0.285
Mid+0.283
Neg logit contributions
apping-0.725
unch-0.691
apped-0.669
amb-0.655
owing-0.645
Token.
Token ID13
Feature activation-4.35e-02

Pos logit contributions
apping+0.295
unch+0.261
owing+0.250
apped+0.249
amb+0.246
Neg logit contributions
Mid-0.370
 Blink-0.363
 Piper-0.363
 Midnight-0.361
Scar-0.359
Token After
Token ID2293
Feature activation+4.92e-02

Pos logit contributions
apping+0.092
unch+0.085
apped+0.084
amb+0.081
aring+0.080
Neg logit contributions
 your-0.050
Mid-0.049
 newly-0.048
Scar-0.048
 Blink-0.048
Token she
Token ID673
Feature activation+2.84e-01

Pos logit contributions
Mid+0.037
Scar+0.036
 Piper+0.033
 trash+0.032
 your+0.032
Neg logit contributions
apping-0.120
unch-0.117
apped-0.112
ement-0.109
owing-0.109
Token exited
Token ID34710
Feature activation-9.33e-02

Pos logit contributions
 her+0.229
 your+0.229
Mid+0.224
Scar+0.220
 Blink+0.220
Neg logit contributions
apping-0.701
unch-0.663
amb-0.637
apped-0.637
aring-0.631
Token the
Token ID262
Feature activation+3.10e-01

Pos logit contributions
apping+0.228
ement+0.212
apped+0.211
owing+0.209
amb+0.206
Neg logit contributions
Mid-0.078
Scar-0.067
chy-0.067
 trash-0.066
 Midnight-0.064
Token bus
Token ID1323
Feature activation-1.79e-01

Pos logit contributions
 your+0.337
Mid+0.330
 her+0.330
Scar+0.324
 newly+0.324
Neg logit contributions
apping-0.679
unch-0.645
apped-0.616
amb-0.615
owing-0.607
Token,
Token ID11
Feature activation-3.33e-04

Pos logit contributions
apping+0.323
unch+0.298
apped+0.285
owing+0.284
amb+0.284
Neg logit contributions
Mid-0.264
 Piper-0.258
Scar-0.257
 your-0.256
 Midnight-0.255
Token she
Token ID673
Feature activation+2.85e-01

Pos logit contributions
apping+0.001
apped+0.001
unch+0.001
aps+0.001
overed+0.001
Neg logit contributions
Mid-0.000
Max-0.000
Scar-0.000
 Blink-0.000
 trash-0.000
Token couldn
Token ID3521
Feature activation-4.25e-02

Pos logit contributions
Mid+0.284
 your+0.278
 Piper+0.274
 Blink+0.271
 Midnight+0.270
Neg logit contributions
apping-0.639
unch-0.603
apped-0.588
amb-0.584
owing-0.578
Token't
Token ID470
Feature activation+7.25e-02

Pos logit contributions
apping+0.077
unch+0.075
amb+0.068
apped+0.067
ement+0.067
Neg logit contributions
 trash-0.062
Mid-0.061
 Piper-0.060
 Blink-0.059
 newly-0.058
Token the
Token ID262
Feature activation+3.08e-01

Pos logit contributions
Scar+0.042
Mid+0.040
 trash+0.036
 Piper+0.036
 Blink+0.034
Neg logit contributions
apping-0.175
unch-0.168
amb-0.163
apped-0.162
owing-0.159
Token story
Token ID1621
Feature activation-1.52e-01

Pos logit contributions
 your+0.442
Mid+0.436
 Blink+0.436
 her+0.434
 Piper+0.433
Neg logit contributions
apping-0.571
unch-0.528
apped-0.508
amb-0.507
owing-0.497
Token is
Token ID318
Feature activation+5.06e-02

Pos logit contributions
apping+0.407
unch+0.390
amb+0.378
apped+0.375
owing+0.373
Neg logit contributions
 your-0.086
Mid-0.085
 her-0.083
 Piper-0.082
 Blink-0.081
Token that
Token ID326
Feature activation+1.14e-01

Pos logit contributions
 your+0.074
 Piper+0.070
 her+0.070
 Blink+0.069
 newly+0.068
Neg logit contributions
apping-0.092
unch-0.088
amb-0.084
ement-0.084
apped-0.083
Token even
Token ID772
Feature activation-5.60e-03

Pos logit contributions
 your+0.113
 her+0.108
Mid+0.106
 Piper+0.106
 Blink+0.106
Neg logit contributions
apping-0.260
unch-0.248
amb-0.241
apped-0.238
owing-0.235
Token the
Token ID262
Feature activation+3.09e-01

Pos logit contributions
apping+0.010
aring+0.009
unch+0.009
owing+0.009
apped+0.009
Neg logit contributions
Mid-0.008
Scar-0.008
 Midnight-0.008
 Piper-0.008
 Blink-0.008
Token smallest
Token ID18197
Feature activation-1.10e-01

Pos logit contributions
 your+0.284
 her+0.275
 Blink+0.274
Mid+0.273
 newly+0.273
Neg logit contributions
apping-0.735
unch-0.698
amb-0.674
apped-0.669
owing-0.659
Token of
Token ID286
Feature activation+6.91e-02

Pos logit contributions
apping+0.212
owing+0.196
aring+0.194
unch+0.190
apped+0.187
Neg logit contributions
Mid-0.160
Scar-0.156
 Midnight-0.149
 Blink-0.146
 Piper-0.145
Token creatures
Token ID8109
Feature activation+4.99e-03

Pos logit contributions
Mid+0.084
Scar+0.081
 Piper+0.079
 Midnight+0.078
 Blink+0.077
Neg logit contributions
apping-0.142
unch-0.135
aring-0.128
owing-0.128
apped-0.127
Token can
Token ID460
Feature activation-2.07e-02

Pos logit contributions
Mid+0.005
Scar+0.005
 Blink+0.005
 Midnight+0.005
 Piper+0.005
Neg logit contributions
apping-0.011
aring-0.010
owing-0.010
apped-0.010
unch-0.010
Token help
Token ID1037
Feature activation-1.11e-01

Pos logit contributions
apping+0.044
unch+0.044
owing+0.041
aring+0.041
amb+0.041
Neg logit contributions
Mid-0.023
Scar-0.022
 Piper-0.022
 Midnight-0.021
 trash-0.021
Token They
Token ID1119
Feature activation+8.89e-02

Pos logit contributions
 your+0.141
 newly+0.128
Mid+0.128
Scar+0.127
 her+0.126
Neg logit contributions
apping-0.490
unch-0.474
amb-0.462
owing-0.455
apped-0.455
Token think
Token ID892
Feature activation-1.95e-01

Pos logit contributions
 Piper+0.101
Mid+0.101
Scar+0.098
 Midnight+0.097
 newly+0.096
Neg logit contributions
apping-0.178
apped-0.170
amb-0.168
owing-0.162
aring-0.161
Token of
Token ID286
Feature activation-2.12e-02

Pos logit contributions
apping+0.436
unch+0.401
ement+0.400
apped+0.399
aring+0.392
Neg logit contributions
Mid-0.200
Scar-0.186
 Piper-0.172
Elsa-0.169
Max-0.169
Token what
Token ID644
Feature activation+8.62e-02

Pos logit contributions
apping+0.046
unch+0.043
ement+0.043
apped+0.042
owing+0.041
Neg logit contributions
Mid-0.021
Scar-0.021
 Piper-0.020
 Midnight-0.019
 Blink-0.018
Token they
Token ID484
Feature activation+1.96e-01

Pos logit contributions
Mid+0.087
Scar+0.080
 Piper+0.078
 newly+0.077
Elsa+0.074
Neg logit contributions
apping-0.195
unch-0.179
apped-0.175
aring-0.174
aps-0.170
Token wish
Token ID4601
Feature activation-3.71e-01

Pos logit contributions
Mid+0.205
 Midnight+0.187
 Piper+0.184
Scar+0.180
 Blink+0.178
Neg logit contributions
apping-0.436
apped-0.411
unch-0.406
owing-0.403
amb-0.395
Token for
Token ID329
Feature activation-1.56e-03

Pos logit contributions
apping+0.764
apped+0.692
aring+0.671
unch+0.669
owing+0.669
Neg logit contributions
Mid-0.475
Scar-0.451
 Piper-0.435
 Midnight-0.424
 Blink-0.418
Token.
Token ID13
Feature activation-1.15e-01

Pos logit contributions
apping+0.002
unch+0.002
amb+0.002
apped+0.002
owing+0.002
Neg logit contributions
Mid-0.003
Scar-0.003
 Piper-0.003
 your-0.003
 Blink-0.003
Token They
Token ID1119
Feature activation+1.15e-01

Pos logit contributions
apping+0.268
unch+0.255
amb+0.246
owing+0.244
apped+0.244
Neg logit contributions
 your-0.105
Mid-0.103
 Piper-0.102
 her-0.102
 Blink-0.102
Token hold
Token ID1745
Feature activation-1.33e-01

Pos logit contributions
Mid+0.166
 Midnight+0.161
 Piper+0.160
 Blink+0.159
 your+0.157
Neg logit contributions
apping-0.197
amb-0.183
apped-0.181
unch-0.177
owing-0.177
Token their
Token ID511
Feature activation+2.07e-01

Pos logit contributions
apping+0.208
apped+0.186
unch+0.184
amb+0.180
owing+0.178
Neg logit contributions
Mid-0.225
Scar-0.224
 Blink-0.221
 Piper-0.219
 Midnight-0.219
Token favorite
Token ID4004
Feature activation-1.40e-01

Pos logit contributions
 your+0.212
 her+0.211
Mid+0.211
 Piper+0.208
 Blink+0.206
Neg logit contributions
apping-0.440
unch-0.419
amb-0.403
apped-0.397
owing-0.394
Token song
Token ID3496
Feature activation-2.80e-01

Pos logit contributions
apping+0.302
unch+0.285
amb+0.274
apped+0.273
owing+0.270
Neg logit contributions
Mid-0.151
Scar-0.149
 Piper-0.148
 your-0.148
 Blink-0.147
Token is
Token ID318
Feature activation-1.94e-02

Pos logit contributions
apping+0.579
apped+0.534
unch+0.531
aring+0.521
ement+0.520
Neg logit contributions
Scar-0.324
Mid-0.312
 Piper-0.310
 newly-0.306
 Blink-0.299
Token "
Token ID366
Feature activation+2.33e-02

Pos logit contributions
apping+0.039
unch+0.036
aring+0.035
ocked+0.035
apped+0.034
Neg logit contributions
Mid-0.021
Scar-0.021
 newly-0.021
 Piper-0.020
 Blink-0.020
TokenHappy
Token ID25082
Feature activation-5.27e-02

Pos logit contributions
Mid+0.035
 her+0.035
 your+0.035
Scar+0.035
 Piper+0.034
Neg logit contributions
apping-0.040
unch-0.037
amb-0.035
apped-0.035
ement-0.035
Token Birthday
Token ID33511
Feature activation-3.70e-01

Pos logit contributions
unch+0.059
apping+0.056
owing+0.056
anged+0.051
ement+0.050
Neg logit contributions
 your-0.113
 her-0.113
Mid-0.109
 our-0.108
 Midnight-0.108
Token".
Token ID1911
Feature activation-7.95e-02

Pos logit contributions
apping+0.689
unch+0.630
amb+0.617
apped+0.610
aring+0.602
Neg logit contributions
 newly-0.507
 Blink-0.498
Scar-0.489
Mid-0.488
 trash-0.483
Token She
Token ID1375
Feature activation+1.60e-01

Pos logit contributions
apping+0.168
unch+0.158
amb+0.152
apped+0.151
ement+0.147
Neg logit contributions
Mid-0.089
 your-0.088
 Blink-0.087
 newly-0.087
Scar-0.087
Token likes
Token ID7832
Feature activation-1.09e-01

Pos logit contributions
Mid+0.208
 Blink+0.207
 Piper+0.203
 Midnight+0.199
 newly+0.198
Neg logit contributions
apping-0.306
apped-0.284
amb-0.276
unch-0.270
ement-0.264
Token to
Token ID284
Feature activation+6.37e-02

Pos logit contributions
apping+0.250
unch+0.236
amb+0.228
apped+0.226
aring+0.223
Neg logit contributions
Mid-0.101
 Piper-0.098
Scar-0.098
 Blink-0.097
 newly-0.096
Token sing
Token ID1702
Feature activation-2.28e-01

Pos logit contributions
Mid+0.070
 Piper+0.069
 Blink+0.069
Scar+0.069
 newly+0.068
Neg logit contributions
apping-0.135
unch-0.126
amb-0.122
apped-0.121
owing-0.119
Token.
Token ID13
Feature activation-9.05e-02

Pos logit contributions
 her+0.109
 your+0.105
 Blink+0.099
Mid+0.099
 Midnight+0.097
Neg logit contributions
apping-0.139
unch-0.134
ement-0.133
apped-0.125
aring-0.122
Token Blue
Token ID4518
Feature activation+7.56e-02

Pos logit contributions
apping+0.215
unch+0.209
amb+0.201
apped+0.201
ement+0.197
Neg logit contributions
Scar-0.072
 Piper-0.070
Mid-0.068
 newly-0.068
 your-0.066
Token had
Token ID550
Feature activation+3.39e-02

Pos logit contributions
 Piper+0.076
Scar+0.075
Mid+0.074
 newly+0.072
 your+0.071
Neg logit contributions
apping-0.168
unch-0.158
amb-0.157
apped-0.154
aring-0.152
Token an
Token ID281
Feature activation+1.24e-01

Pos logit contributions
 Piper+0.027
Scar+0.026
Mid+0.025
 Midnight+0.024
 newly+0.024
Neg logit contributions
apping-0.079
unch-0.075
apped-0.075
amb-0.074
aps-0.072
Token enormous
Token ID9812
Feature activation-9.29e-02

Pos logit contributions
 your+0.119
 her+0.118
 same+0.107
 Blink+0.106
 Piper+0.105
Neg logit contributions
apping-0.296
unch-0.277
apped-0.268
aring-0.266
owing-0.262
Token appetite
Token ID20788
Feature activation-3.67e-01

Pos logit contributions
apping+0.208
unch+0.198
amb+0.190
apped+0.189
aring+0.187
Neg logit contributions
 your-0.096
 her-0.093
Mid-0.092
 Piper-0.092
 Blink-0.091
Token and
Token ID290
Feature activation-1.67e-02

Pos logit contributions
apping+0.716
unch+0.687
amb+0.663
ement+0.658
apped+0.645
Neg logit contributions
 your-0.476
 her-0.472
 she-0.464
 Blink-0.445
 Piper-0.445
Token lived
Token ID5615
Feature activation-4.79e-02

Pos logit contributions
apping+0.028
apped+0.028
unch+0.026
oot+0.026
aring+0.026
Neg logit contributions
Scar-0.023
Mid-0.021
Elsa-0.021
 Piper-0.020
 newly-0.020
Token in
Token ID287
Feature activation+8.67e-02

Pos logit contributions
apping+0.109
unch+0.108
ement+0.106
amb+0.104
aring+0.103
Neg logit contributions
 her-0.052
 your-0.049
 the-0.044
 newly-0.044
 she-0.044
Token a
Token ID257
Feature activation+1.57e-01

Pos logit contributions
Scar+0.072
 Piper+0.066
Mid+0.065
 would+0.059
 Midnight+0.057
Neg logit contributions
apping-0.191
unch-0.186
aps-0.184
apped-0.178
ement-0.175
Token swamp
Token ID26837
Feature activation-1.80e-01

Pos logit contributions
 Piper+0.109
Scar+0.108
Mid+0.104
 Blink+0.100
 newly+0.099
Neg logit contributions
apping-0.386
unch-0.363
aring-0.349
apped-0.345
aps-0.344
Token Everyone
Token ID11075
Feature activation-1.28e-02

Pos logit contributions
apping+0.241
unch+0.227
owing+0.221
amb+0.218
apped+0.217
Neg logit contributions
 your-0.112
Mid-0.111
 Blink-0.111
 Piper-0.111
 her-0.111
Token started
Token ID2067
Feature activation-1.18e-01

Pos logit contributions
apping+0.030
unch+0.028
amb+0.027
apped+0.027
owing+0.027
Neg logit contributions
Mid-0.012
 your-0.012
 Piper-0.012
 Blink-0.012
Scar-0.012
Token singing
Token ID13777
Feature activation-2.68e-01

Pos logit contributions
apping+0.221
unch+0.206
apped+0.196
amb+0.196
owing+0.194
Neg logit contributions
Mid-0.162
Scar-0.159
 Piper-0.158
 your-0.158
 Blink-0.158
Token '
Token ID705
Feature activation+3.21e-02

Pos logit contributions
apping+0.472
apped+0.422
unch+0.411
amb+0.404
ement+0.399
Neg logit contributions
Mid-0.400
 Piper-0.399
Scar-0.399
 newly-0.391
 Midnight-0.385
TokenHappy
Token ID25082
Feature activation-4.33e-02

Pos logit contributions
Mid+0.052
 your+0.051
 her+0.051
Scar+0.051
Max+0.051
Neg logit contributions
apping-0.051
unch-0.047
amb-0.045
apped-0.045
ement-0.044
Token Birthday
Token ID33511
Feature activation-3.63e-01

Pos logit contributions
apping+0.061
unch+0.059
owing+0.054
amb+0.053
aring+0.052
Neg logit contributions
 your-0.079
 her-0.078
Mid-0.078
 Midnight-0.076
Scar-0.076
Token'
Token ID6
Feature activation-3.02e-02

Pos logit contributions
apping+0.546
unch+0.472
amb+0.472
apped+0.463
aring+0.456
Neg logit contributions
 Blink-0.648
 newly-0.634
Mid-0.631
Scar-0.626
 Piper-0.621
Token to
Token ID284
Feature activation+6.83e-02

Pos logit contributions
apping+0.050
unch+0.045
amb+0.044
apped+0.043
aring+0.042
Neg logit contributions
Mid-0.048
 Piper-0.048
 Blink-0.048
Scar-0.047
 your-0.047
Token John
Token ID1757
Feature activation+1.33e-01

Pos logit contributions
Mid+0.069
Scar+0.069
 trash+0.067
chy+0.066
 Piper+0.066
Neg logit contributions
apping-0.144
amb-0.135
unch-0.134
aring-0.129
apped-0.128
Token,
Token ID11
Feature activation-4.84e-02

Pos logit contributions
 Blink+0.220
Scar+0.218
 Piper+0.218
Mid+0.218
 newly+0.217
Neg logit contributions
apping-0.210
unch-0.189
amb-0.183
owing-0.182
aring-0.180
Token who
Token ID508
Feature activation+5.64e-03

Pos logit contributions
apping+0.099
unch+0.089
amb+0.088
aring+0.087
apped+0.087
Neg logit contributions
 Blink-0.057
Mid-0.057
Scar-0.054
 Piper-0.053
 newly-0.053
Token,
Token ID11
Feature activation-6.09e-02

Pos logit contributions
apping+0.396
apped+0.365
ement+0.332
unch+0.331
aps+0.303
Neg logit contributions
Mid-0.537
Ol-0.507
Scar-0.503
 newly-0.491
 Piper-0.485
Token everybody
Token ID7288
Feature activation+3.61e-02

Pos logit contributions
apping+0.127
apped+0.121
unch+0.112
amb+0.108
aps+0.107
Neg logit contributions
Mid-0.070
chy-0.068
Scar-0.065
Ol-0.060
 Blink-0.059
Token sang
Token ID25889
Feature activation-2.26e-01

Pos logit contributions
Mid+0.041
 Blink+0.040
 Piper+0.039
Scar+0.038
 Midnight+0.038
Neg logit contributions
apping-0.078
apped-0.072
unch-0.071
amb-0.071
aring-0.069
Token '
Token ID705
Feature activation+2.90e-02

Pos logit contributions
apping+0.482
unch+0.440
apped+0.439
amb+0.431
aring+0.419
Neg logit contributions
 Piper-0.252
Mid-0.251
 Blink-0.248
Scar-0.247
 newly-0.247
TokenHappy
Token ID25082
Feature activation-4.48e-02

Pos logit contributions
 her+0.045
 your+0.044
Mid+0.044
Scar+0.043
 she+0.043
Neg logit contributions
apping-0.049
unch-0.045
amb-0.043
apped-0.042
aring-0.042
Token Birthday
Token ID33511
Feature activation-3.62e-01

Pos logit contributions
apping+0.063
unch+0.061
owing+0.057
amb+0.054
aring+0.053
Neg logit contributions
 your-0.083
 her-0.082
Mid-0.079
 our-0.079
 newly-0.078
Token'
Token ID6
Feature activation-2.92e-02

Pos logit contributions
apping+0.531
unch+0.467
amb+0.454
apped+0.451
aring+0.438
Neg logit contributions
 Blink-0.652
Mid-0.642
 newly-0.638
Scar-0.632
 Piper-0.630
Token for
Token ID329
Feature activation+1.78e-02

Pos logit contributions
apping+0.048
unch+0.045
amb+0.043
apped+0.042
owing+0.042
Neg logit contributions
 your-0.047
Mid-0.046
 her-0.046
 Piper-0.045
Scar-0.045
Token her
Token ID607
Feature activation+2.18e-01

Pos logit contributions
Mid+0.015
 Blink+0.014
Scar+0.014
 Piper+0.014
 newly+0.014
Neg logit contributions
apping-0.042
unch-0.040
apped-0.039
amb-0.038
ement-0.038
Token and
Token ID290
Feature activation-2.48e-02

Pos logit contributions
 your+0.319
 her+0.315
Mid+0.309
 Piper+0.305
 newly+0.305
Neg logit contributions
apping-0.391
unch-0.370
amb-0.351
apped-0.346
owing-0.343
Token gave
Token ID2921
Feature activation-1.82e-01

Pos logit contributions
apping+0.051
apped+0.047
unch+0.046
amb+0.045
aring+0.044
Neg logit contributions
Mid-0.028
Scar-0.027
 Blink-0.027
 newly-0.026
Max-0.026
Token said
Token ID531
Feature activation-1.43e-01

Pos logit contributions
 same+0.280
 your+0.262
 her+0.249
 newly+0.244
 big+0.242
Neg logit contributions
apping-0.797
apped-0.745
unch-0.744
owing-0.719
amb-0.714
Token,
Token ID11
Feature activation-8.08e-02

Pos logit contributions
apping+0.264
unch+0.241
apped+0.235
amb+0.232
owing+0.228
Neg logit contributions
Mid-0.203
Scar-0.200
 Piper-0.199
 newly-0.198
 Blink-0.198
Token "
Token ID366
Feature activation+9.67e-03

Pos logit contributions
apping+0.161
unch+0.149
apped+0.142
amb+0.142
aring+0.141
Neg logit contributions
Mid-0.100
Scar-0.098
 newly-0.098
 your-0.096
 Blink-0.096
TokenBecause
Token ID8128
Feature activation-1.05e-01

Pos logit contributions
 your+0.010
 Blink+0.010
 Piper+0.010
 her+0.010
Mid+0.010
Neg logit contributions
apping-0.021
unch-0.020
amb-0.020
apped-0.019
aring-0.019
Token I
Token ID314
Feature activation+7.38e-02

Pos logit contributions
apping+0.222
unch+0.204
apped+0.198
aring+0.196
amb+0.192
Neg logit contributions
Mid-0.123
Scar-0.120
Beck-0.107
Elsa-0.105
Ol-0.104
Token wish
Token ID4601
Feature activation-3.61e-01

Pos logit contributions
Mid+0.080
 Blink+0.079
 Piper+0.078
 Midnight+0.076
Scar+0.076
Neg logit contributions
apping-0.159
unch-0.150
amb-0.146
aring-0.145
owing-0.145
Token I
Token ID314
Feature activation+7.55e-02

Pos logit contributions
apping+0.747
aring+0.680
unch+0.665
apped+0.658
oot+0.648
Neg logit contributions
Mid-0.428
Scar-0.426
 Piper-0.396
Max-0.395
 trash-0.386
Token was
Token ID373
Feature activation-3.15e-02

Pos logit contributions
Mid+0.080
 Piper+0.079
 Midnight+0.077
Scar+0.075
 Blink+0.075
Neg logit contributions
apping-0.163
unch-0.153
amb-0.151
apped-0.150
owing-0.150
Token as
Token ID355
Feature activation-4.24e-02

Pos logit contributions
apping+0.067
unch+0.061
owing+0.061
aring+0.060
apped+0.060
Neg logit contributions
Mid-0.035
Scar-0.033
 Piper-0.033
 newly-0.032
 Midnight-0.032
Token happy
Token ID3772
Feature activation-1.62e-01

Pos logit contributions
apping+0.097
unch+0.090
apped+0.087
aring+0.087
owing+0.086
Neg logit contributions
Mid-0.040
Scar-0.038
 Midnight-0.036
 Piper-0.036
Max-0.035
Token as
Token ID355
Feature activation-3.85e-02

Pos logit contributions
apping+0.298
unch+0.279
amb+0.266
apped+0.264
owing+0.260
Neg logit contributions
Mid-0.228
 your-0.227
 Piper-0.225
Scar-0.224
 her-0.224
Token in
Token ID287
Feature activation+4.48e-02

Pos logit contributions
chy+0.117
 Blink+0.112
Mid+0.108
 Piper+0.104
 Midnight+0.101
Neg logit contributions
apping-0.243
unch-0.234
 sur-0.227
owing-0.226
amb-0.223
Token the
Token ID262
Feature activation+2.24e-01

Pos logit contributions
Mid+0.039
 Midnight+0.036
Scar+0.036
 Piper+0.034
 Blink+0.034
Neg logit contributions
apping-0.106
aps-0.102
apped-0.098
ement-0.097
unch-0.093
Token sand
Token ID6450
Feature activation-1.89e-01

Pos logit contributions
Mid+0.297
 Blink+0.291
Scar+0.290
 Piper+0.287
 Midnight+0.282
Neg logit contributions
apping-0.409
apped-0.370
ement-0.369
unch-0.361
aps-0.358
Token.
Token ID13
Feature activation-1.18e-01

Pos logit contributions
apping+0.333
unch+0.311
amb+0.297
apped+0.293
owing+0.290
Neg logit contributions
 your-0.279
Mid-0.276
 her-0.275
 Piper-0.273
 Blink-0.272
Token They
Token ID1119
Feature activation+1.02e-01

Pos logit contributions
apping+0.253
unch+0.239
amb+0.231
apped+0.230
aring+0.224
Neg logit contributions
Mid-0.127
 your-0.126
Scar-0.124
 newly-0.124
 Piper-0.123
Token wish
Token ID4601
Feature activation-3.61e-01

Pos logit contributions
Mid+0.126
 Blink+0.122
 Piper+0.121
 Midnight+0.120
Scar+0.119
Neg logit contributions
apping-0.198
apped-0.189
amb-0.188
unch-0.181
owing-0.177
Token they
Token ID484
Feature activation+2.03e-01

Pos logit contributions
apping+0.830
apped+0.765
unch+0.733
ement+0.722
owing+0.722
Neg logit contributions
Mid-0.375
Scar-0.364
 Piper-0.335
Max-0.334
Elsa-0.331
Token could
Token ID714
Feature activation-5.83e-02

Pos logit contributions
 Piper+0.293
Mid+0.287
 Midnight+0.281
 Blink+0.262
Scar+0.257
Neg logit contributions
apping-0.369
apped-0.346
owing-0.335
aring-0.318
and-0.311
Token go
Token ID467
Feature activation-1.36e-01

Pos logit contributions
apping+0.147
unch+0.141
amb+0.136
apped+0.135
ement+0.134
Neg logit contributions
 your-0.044
 her-0.042
Mid-0.041
 Blink-0.040
 newly-0.040
Token home
Token ID1363
Feature activation-1.95e-01

Pos logit contributions
apping+0.265
unch+0.248
amb+0.238
apped+0.236
owing+0.234
Neg logit contributions
Mid-0.178
 your-0.176
 Piper-0.175
Scar-0.174
 Blink-0.174
Token.
Token ID13
Feature activation-1.15e-01

Pos logit contributions
apping+0.321
unch+0.293
amb+0.280
owing+0.278
apped+0.276
Neg logit contributions
Mid-0.317
 Piper-0.311
Scar-0.309
 Blink-0.308
 Midnight-0.307
Token no
Token ID645
Feature activation-1.19e-02

Pos logit contributions
Mid+0.135
Scar+0.134
 Piper+0.123
 Midnight+0.122
chy+0.117
Neg logit contributions
apping-0.219
unch-0.209
ement-0.207
apped-0.204
owing-0.200
Token more
Token ID517
Feature activation-1.12e-01

Pos logit contributions
apping+0.024
unch+0.023
amb+0.022
apped+0.022
owing+0.022
Neg logit contributions
Mid-0.014
 Piper-0.014
 Blink-0.014
Scar-0.014
 your-0.014
Token fun
Token ID1257
Feature activation-2.06e-01

Pos logit contributions
apping+0.231
unch+0.216
amb+0.206
apped+0.205
owing+0.205
Neg logit contributions
Mid-0.132
 Piper-0.128
 Blink-0.128
 Midnight-0.127
Scar-0.126
Token.
Token ID13
Feature activation-1.09e-01

Pos logit contributions
apping+0.348
unch+0.322
amb+0.306
apped+0.304
owing+0.300
Neg logit contributions
Mid-0.319
 Piper-0.315
 Blink-0.314
 your-0.314
Scar-0.312
Token They
Token ID1119
Feature activation+1.09e-01

Pos logit contributions
apping+0.239
unch+0.227
owing+0.218
amb+0.217
apped+0.216
Neg logit contributions
 your-0.115
Mid-0.112
 her-0.112
 Blink-0.111
 Piper-0.111
Token wish
Token ID4601
Feature activation-3.61e-01

Pos logit contributions
Mid+0.131
 Piper+0.129
 Blink+0.125
 Midnight+0.124
 your+0.121
Neg logit contributions
apping-0.211
amb-0.202
apped-0.199
unch-0.194
aring-0.192
Token they
Token ID484
Feature activation+2.05e-01

Pos logit contributions
apping+0.754
apped+0.688
aring+0.663
unch+0.658
oot+0.653
Neg logit contributions
Mid-0.445
Scar-0.429
 Piper-0.416
 Midnight-0.404
Max-0.396
Token had
Token ID550
Feature activation+8.37e-03

Pos logit contributions
 Piper+0.288
Mid+0.277
 Midnight+0.269
 Blink+0.250
Scar+0.244
Neg logit contributions
apping-0.384
apped-0.355
owing-0.354
aring-0.347
and-0.322
Token shared
Token ID4888
Feature activation-1.03e-01

Pos logit contributions
Mid+0.009
 Piper+0.008
Scar+0.008
 Blink+0.008
 Midnight+0.008
Neg logit contributions
apping-0.019
unch-0.017
apped-0.017
owing-0.017
aring-0.017
Token.
Token ID13
Feature activation-1.03e-01

Pos logit contributions
apping+0.209
unch+0.193
apped+0.188
ement+0.187
owing+0.185
Neg logit contributions
Mid-0.123
 Piper-0.121
Scar-0.117
 Blink-0.117
 Midnight-0.114
Token<|endoftext|>
Token ID50256
Feature activation-1.52e-01

Pos logit contributions
apping+0.225
unch+0.213
amb+0.205
apped+0.204
aring+0.200
Neg logit contributions
Mid-0.109
 your-0.108
 Piper-0.107
 her-0.106
Scar-0.106
Token.
Token ID13
Feature activation-1.29e-01

Pos logit contributions
Mid+0.018
 Piper+0.018
Scar+0.017
 Midnight+0.017
 Blink+0.017
Neg logit contributions
apping-0.026
unch-0.024
apped-0.023
owing-0.023
aring-0.023
Token\n
Token ID198
Feature activation-6.45e-02

Pos logit contributions
apping+0.280
unch+0.264
amb+0.255
apped+0.253
aring+0.249
Neg logit contributions
Mid-0.137
 your-0.135
 Piper-0.134
Scar-0.134
 her-0.133
Token\n
Token ID198
Feature activation-6.00e-02

Pos logit contributions
apping+0.130
unch+0.122
amb+0.118
owing+0.118
apped+0.118
Neg logit contributions
 your-0.080
 her-0.079
Mid-0.079
 Piper-0.077
 Blink-0.077
Token"
Token ID1
Feature activation+2.73e-02

Pos logit contributions
apping+0.138
unch+0.130
amb+0.125
aring+0.123
apped+0.121
Neg logit contributions
 Blink-0.056
 Piper-0.056
 newly-0.056
 Midnight-0.056
 your-0.055
TokenI
Token ID40
Feature activation+1.75e-02

Pos logit contributions
 your+0.030
 Blink+0.029
 Piper+0.029
 her+0.029
 newly+0.029
Neg logit contributions
apping-0.058
unch-0.055
amb-0.054
apped-0.053
aring-0.051
Token wish
Token ID4601
Feature activation-3.61e-01

Pos logit contributions
 Blink+0.020
Mid+0.020
 Piper+0.020
 Midnight+0.020
Scar+0.020
Neg logit contributions
apping-0.037
unch-0.035
amb-0.033
aring-0.033
owing-0.033
Token we
Token ID356
Feature activation+1.17e-01

Pos logit contributions
apping+0.721
aring+0.660
unch+0.649
apped+0.634
aps+0.630
Neg logit contributions
Mid-0.446
Scar-0.445
 Piper-0.413
Max-0.411
Ol-0.405
Token could
Token ID714
Feature activation-6.07e-02

Pos logit contributions
 Piper+0.153
Mid+0.153
 Midnight+0.151
Scar+0.143
 Blink+0.138
Neg logit contributions
apping-0.221
apped-0.209
owing-0.208
amb-0.198
aring-0.197
Token fly
Token ID6129
Feature activation-1.92e-01

Pos logit contributions
apping+0.149
unch+0.142
amb+0.137
apped+0.136
ement+0.135
Neg logit contributions
 your-0.049
 her-0.048
Mid-0.047
 Blink-0.047
 Piper-0.046
Token,"
Token ID553
Feature activation-1.00e-01

Pos logit contributions
apping+0.307
amb+0.272
oot+0.264
owing+0.264
aring+0.264
Neg logit contributions
Mid-0.305
 Piper-0.299
Scar-0.295
 newly-0.294
 Midnight-0.292
Token she
Token ID673
Feature activation+2.26e-01

Pos logit contributions
apping+0.217
apped+0.207
amb+0.199
unch+0.197
ement+0.194
Neg logit contributions
Mid-0.095
Scar-0.095
chy-0.093
 newly-0.091
 trash-0.084
Token It
Token ID632
Feature activation+1.04e-01

Pos logit contributions
apping+0.233
unch+0.221
apped+0.212
amb+0.212
aring+0.206
Neg logit contributions
Mid-0.112
 your-0.109
 newly-0.108
Scar-0.108
 Piper-0.107
Token said
Token ID531
Feature activation-1.46e-01

Pos logit contributions
 Piper+0.132
Scar+0.131
Mid+0.129
 your+0.125
 Midnight+0.122
Neg logit contributions
apped-0.183
apping-0.182
ement-0.171
amb-0.168
aps-0.165
Token,
Token ID11
Feature activation-6.16e-02

Pos logit contributions
apping+0.288
unch+0.266
apped+0.259
ement+0.250
aring+0.250
Neg logit contributions
Mid-0.191
 Piper-0.187
Scar-0.185
 newly-0.183
 Midnight-0.179
Token "
Token ID366
Feature activation+2.57e-02

Pos logit contributions
apping+0.137
unch+0.131
amb+0.126
apped+0.125
owing+0.125
Neg logit contributions
 your-0.062
 her-0.062
Mid-0.061
 Piper-0.061
 Blink-0.060
TokenHappy
Token ID25082
Feature activation-4.56e-02

Pos logit contributions
 Piper+0.026
 your+0.026
 Midnight+0.025
 Blink+0.025
 newly+0.025
Neg logit contributions
apping-0.057
unch-0.054
amb-0.053
apped-0.052
aring-0.051
Token Birthday
Token ID33511
Feature activation-3.60e-01

Pos logit contributions
unch+0.072
apping+0.071
owing+0.070
ement+0.064
amb+0.064
Neg logit contributions
 your-0.079
 our-0.077
 her-0.076
 my-0.075
 has-0.075
Token Lily
Token ID20037
Feature activation+1.22e-01

Pos logit contributions
apping+0.637
apped+0.567
unch+0.566
aring+0.548
amb+0.546
Neg logit contributions
 newly-0.530
Scar-0.528
Mid-0.525
 Blink-0.520
 Piper-0.501
Token!"
Token ID2474
Feature activation-1.22e-01

Pos logit contributions
 newly+0.206
Mid+0.205
 Piper+0.203
 trash+0.201
 Blink+0.200
Neg logit contributions
apping-0.194
unch-0.170
apped-0.167
amb-0.164
owing-0.162
Token\n
Token ID198
Feature activation-3.08e-02

Pos logit contributions
apping+0.309
unch+0.294
amb+0.286
apped+0.285
aring+0.279
Neg logit contributions
Mid-0.086
 your-0.085
 Piper-0.083
Scar-0.083
 newly-0.083
Token\n
Token ID198
Feature activation-3.06e-02

Pos logit contributions
apping+0.059
unch+0.056
amb+0.053
apped+0.053
aring+0.052
Neg logit contributions
Mid-0.040
 your-0.040
 Piper-0.040
 Blink-0.040
Scar-0.040
TokenL
Token ID43
Feature activation+2.87e-03

Pos logit contributions
apping+0.072
unch+0.066
aring+0.064
aps+0.063
amb+0.062
Neg logit contributions
 Blink-0.028
chy-0.028
 same-0.028
 Piper-0.027
 newly-0.027
Token it
Token ID340
Feature activation+1.05e-01

Pos logit contributions
apping+0.161
apped+0.147
owing+0.147
unch+0.147
amb+0.147
Neg logit contributions
Mid-0.106
Scar-0.105
 newly-0.093
 trash-0.091
Ol-0.090
Token was
Token ID373
Feature activation-3.63e-02

Pos logit contributions
Mid+0.116
Scar+0.113
 Piper+0.108
 your+0.105
 newly+0.104
Neg logit contributions
apping-0.221
ement-0.208
apped-0.207
unch-0.207
owing-0.204
Token a
Token ID257
Feature activation+1.26e-01

Pos logit contributions
apping+0.064
unch+0.059
apped+0.056
owing+0.055
ement+0.054
Neg logit contributions
Mid-0.053
Scar-0.051
Ol-0.049
 Piper-0.048
 Blink-0.048
Token very
Token ID845
Feature activation-1.10e-01

Pos logit contributions
Mid+0.105
 Blink+0.102
 her+0.102
 Piper+0.102
Scar+0.101
Neg logit contributions
apping-0.299
unch-0.283
apped-0.268
aring-0.267
amb-0.267
Token special
Token ID2041
Feature activation-1.24e-01

Pos logit contributions
apping+0.264
unch+0.247
aring+0.239
apped+0.235
amb+0.233
Neg logit contributions
Mid-0.095
 Midnight-0.092
 Blink-0.092
 Piper-0.092
Scar-0.090
Token wish
Token ID4601
Feature activation-3.60e-01

Pos logit contributions
apping+0.287
unch+0.275
amb+0.266
apped+0.266
aring+0.259
Neg logit contributions
 your-0.127
 her-0.122
 same-0.118
 she-0.118
 trash-0.117
Token,
Token ID11
Feature activation-6.89e-02

Pos logit contributions
apping+0.582
unch+0.535
apped+0.505
amb+0.504
owing+0.502
Neg logit contributions
Mid-0.588
 Piper-0.578
Scar-0.576
 Blink-0.574
 your-0.574
Token because
Token ID780
Feature activation-7.68e-02

Pos logit contributions
apping+0.124
unch+0.119
amb+0.110
aring+0.106
apped+0.106
Neg logit contributions
Mid-0.095
Scar-0.090
Elsa-0.087
Ol-0.086
 trash-0.085
Token the
Token ID262
Feature activation+2.46e-01

Pos logit contributions
apping+0.160
unch+0.149
apped+0.146
amb+0.143
ement+0.142
Neg logit contributions
Mid-0.090
Scar-0.087
 Piper-0.077
Ol-0.077
 Blink-0.077
Token star
Token ID3491
Feature activation-1.80e-01

Pos logit contributions
 your+0.247
 her+0.243
 newly+0.239
Mid+0.238
 Blink+0.236
Neg logit contributions
apping-0.562
unch-0.534
amb-0.515
apped-0.509
owing-0.506
Token wished
Token ID16555
Feature activation-2.66e-01

Pos logit contributions
apping+0.413
unch+0.386
apped+0.382
owing+0.376
amb+0.376
Neg logit contributions
Mid-0.173
Scar-0.168
 Piper-0.166
 Blink-0.160
 your-0.159
Token such
Token ID884
Feature activation+8.87e-02

Pos logit contributions
apping+0.005
unch+0.005
apped+0.005
owing+0.005
amb+0.005
Neg logit contributions
Mid-0.005
Scar-0.004
 Midnight-0.004
 Piper-0.004
 Blink-0.004
Token a
Token ID257
Feature activation+1.20e-01

Pos logit contributions
Mid+0.094
 Piper+0.091
 Blink+0.090
Scar+0.090
 Midnight+0.088
Neg logit contributions
apping-0.196
unch-0.183
apped-0.174
amb-0.174
aring-0.172
Token kind
Token ID1611
Feature activation-2.11e-01

Pos logit contributions
Mid+0.096
 Blink+0.096
 Piper+0.095
Scar+0.093
 her+0.093
Neg logit contributions
apping-0.290
unch-0.276
aring-0.263
amb-0.263
ement-0.261
Token and
Token ID290
Feature activation-3.66e-02

Pos logit contributions
apping+0.401
unch+0.376
amb+0.360
apped+0.358
owing+0.353
Neg logit contributions
 your-0.282
Mid-0.281
 her-0.279
 Piper-0.279
 Blink-0.276
Token wild
Token ID4295
Feature activation-1.37e-01

Pos logit contributions
apping+0.070
unch+0.067
aring+0.063
owing+0.063
amb+0.062
Neg logit contributions
Scar-0.044
Mid-0.043
Max-0.042
 Blink-0.041
Ol-0.040
Token wish
Token ID4601
Feature activation-3.60e-01

Pos logit contributions
apping+0.250
unch+0.234
amb+0.222
apped+0.221
owing+0.218
Neg logit contributions
 your-0.195
Mid-0.193
 her-0.192
 Piper-0.191
 Blink-0.189
Token.
Token ID13
Feature activation-1.03e-01

Pos logit contributions
apping+0.605
unch+0.541
owing+0.519
aring+0.517
apped+0.517
Neg logit contributions
Mid-0.571
Scar-0.559
 Piper-0.556
 Blink-0.545
Max-0.543
Token They
Token ID1119
Feature activation+1.18e-01

Pos logit contributions
apping+0.211
unch+0.200
amb+0.193
apped+0.190
aring+0.187
Neg logit contributions
Mid-0.121
 your-0.119
Scar-0.119
 Piper-0.118
 newly-0.118
Token smiled
Token ID13541
Feature activation-1.61e-01

Pos logit contributions
Mid+0.148
 Piper+0.141
 newly+0.141
 Midnight+0.140
 your+0.139
Neg logit contributions
apping-0.219
amb-0.214
apped-0.209
unch-0.199
owing-0.199
Token and
Token ID290
Feature activation-3.95e-02

Pos logit contributions
apping+0.299
unch+0.276
amb+0.267
apped+0.266
owing+0.260
Neg logit contributions
Mid-0.226
 Piper-0.223
 your-0.223
Scar-0.222
 Blink-0.221
Token felt
Token ID2936
Feature activation-5.45e-02

Pos logit contributions
apping+0.072
apped+0.068
amb+0.066
unch+0.066
owing+0.063
Neg logit contributions
Mid-0.053
Scar-0.053
 your-0.052
 newly-0.052
 Piper-0.051
Token brave
Token ID14802
Feature activation-1.46e-01

Pos logit contributions
apping+0.068
unch+0.067
amb+0.061
ement+0.061
apped+0.061
Neg logit contributions
 her-0.058
 she-0.057
 your-0.057
 Blink-0.055
Mid-0.054
Token and
Token ID290
Feature activation-5.41e-02

Pos logit contributions
apping+0.259
unch+0.239
amb+0.230
apped+0.227
aring+0.226
Neg logit contributions
Mid-0.213
 your-0.210
Scar-0.210
 Piper-0.209
 newly-0.209
Token sweet
Token ID6029
Feature activation-1.15e-01

Pos logit contributions
apping+0.103
apped+0.098
aring+0.094
amb+0.091
owing+0.091
Neg logit contributions
Mid-0.066
Scar-0.064
 newly-0.063
 Midnight-0.062
Max-0.061
Token.
Token ID13
Feature activation-1.16e-01

Pos logit contributions
apping+0.207
unch+0.193
amb+0.184
apped+0.183
owing+0.180
Neg logit contributions
Mid-0.167
 your-0.166
 Piper-0.165
Scar-0.164
 Blink-0.164
Token I
Token ID314
Feature activation+7.86e-02

Pos logit contributions
apping+0.252
unch+0.238
amb+0.229
apped+0.228
owing+0.224
Neg logit contributions
Mid-0.123
 your-0.122
 Piper-0.121
 her-0.120
Scar-0.120
Token wish
Token ID4601
Feature activation-3.59e-01

Pos logit contributions
 Blink+0.081
 Piper+0.080
Mid+0.080
 your+0.079
 Midnight+0.078
Neg logit contributions
apping-0.172
unch-0.163
amb-0.159
apped-0.157
aring-0.156
Token I
Token ID314
Feature activation+8.02e-02

Pos logit contributions
apping+0.763
unch+0.693
aring+0.687
apped+0.685
aps+0.669
Neg logit contributions
Scar-0.406
Mid-0.401
Max-0.384
 trash-0.372
 Piper-0.370
Token were
Token ID547
Feature activation-3.02e-02

Pos logit contributions
 Piper+0.102
Mid+0.102
 Midnight+0.100
Scar+0.096
 Blink+0.094
Neg logit contributions
apping-0.151
unch-0.142
apped-0.142
owing-0.142
amb-0.142
Token real
Token ID1103
Feature activation-1.04e-01

Pos logit contributions
apping+0.061
unch+0.056
owing+0.056
aring+0.054
apped+0.053
Neg logit contributions
Mid-0.036
Scar-0.033
Max-0.033
 Piper-0.032
Ol-0.032
Token too
Token ID1165
Feature activation-1.20e-01

Pos logit contributions
apping+0.205
unch+0.199
apped+0.189
amb+0.185
owing+0.184
Neg logit contributions
 your-0.141
 her-0.138
 she-0.135
 Blink-0.128
 Piper-0.126
Token.
Token ID13
Feature activation-1.08e-01

Pos logit contributions
apping+0.225
unch+0.214
amb+0.209
apped+0.208
owing+0.200
Neg logit contributions
 your-0.174
 her-0.167
 she-0.162
 Blink-0.161
 Piper-0.160
Token are
Token ID389
Feature activation-7.09e-04

Pos logit contributions
Mid+0.118
 Blink+0.116
 Piper+0.114
 Midnight+0.110
Scar+0.109
Neg logit contributions
apping-0.182
apped-0.176
amb-0.171
unch-0.165
owing-0.165
Token very
Token ID845
Feature activation-1.07e-01

Pos logit contributions
apping+0.001
unch+0.001
amb+0.001
apped+0.001
owing+0.001
Neg logit contributions
 your-0.001
 her-0.001
Mid-0.001
 Piper-0.001
 Midnight-0.001
Token sad
Token ID6507
Feature activation-1.75e-01

Pos logit contributions
apping+0.232
unch+0.223
amb+0.217
owing+0.212
apped+0.211
Neg logit contributions
 your-0.117
 her-0.113
Mid-0.111
 newly-0.111
Scar-0.109
Token.
Token ID13
Feature activation-1.18e-01

Pos logit contributions
apping+0.315
unch+0.298
amb+0.285
apped+0.282
owing+0.276
Neg logit contributions
 your-0.254
 her-0.252
Mid-0.246
 Piper-0.244
 Midnight-0.243
Token They
Token ID1119
Feature activation+1.08e-01

Pos logit contributions
apping+0.272
unch+0.261
owing+0.254
amb+0.250
apped+0.247
Neg logit contributions
 your-0.115
 her-0.111
 Piper-0.109
 Blink-0.109
Mid-0.108
Token wish
Token ID4601
Feature activation-3.59e-01

Pos logit contributions
Mid+0.135
 Piper+0.133
 Blink+0.133
Scar+0.131
 your+0.129
Neg logit contributions
apping-0.208
unch-0.191
amb-0.190
apped-0.190
aring-0.189
Token they
Token ID484
Feature activation+2.07e-01

Pos logit contributions
apping+0.902
apped+0.840
unch+0.811
ement+0.794
aring+0.794
Neg logit contributions
Mid-0.284
Scar-0.277
 Piper-0.254
Elsa-0.239
Max-0.238
Token had
Token ID550
Feature activation+8.99e-03

Pos logit contributions
 Piper+0.293
Mid+0.268
 Blink+0.255
 Midnight+0.254
Scar+0.249
Neg logit contributions
apping-0.383
apped-0.359
owing-0.352
aring-0.351
aps-0.323
Token listened
Token ID16399
Feature activation-1.36e-01

Pos logit contributions
Mid+0.008
 Piper+0.008
 Blink+0.008
Scar+0.008
 Midnight+0.008
Neg logit contributions
apping-0.021
unch-0.019
apped-0.019
owing-0.018
amb-0.018
Token to
Token ID284
Feature activation+5.99e-02

Pos logit contributions
apping+0.242
unch+0.216
apped+0.213
aring+0.210
owing+0.209
Neg logit contributions
Mid-0.199
 Piper-0.198
Scar-0.195
 Blink-0.195
 Midnight-0.194
Token Mom
Token ID11254
Feature activation-1.00e-01

Pos logit contributions
Scar+0.063
Mid+0.062
 Piper+0.056
 Blink+0.055
chy+0.054
Neg logit contributions
apping-0.127
unch-0.115
apped-0.113
owing-0.112
amb-0.111
Token\n
Token ID198
Feature activation-3.04e-02

Pos logit contributions
apping+0.059
unch+0.056
amb+0.053
apped+0.052
aring+0.052
Neg logit contributions
 your-0.040
Mid-0.040
 Piper-0.040
 Blink-0.040
Scar-0.039
TokenEveryone
Token ID16190
Feature activation+1.59e-02

Pos logit contributions
apping+0.064
unch+0.060
amb+0.057
aring+0.056
apped+0.054
Neg logit contributions
 trash-0.036
 Blink-0.035
chy-0.034
 Piper-0.034
 your-0.034
Token sang
Token ID25889
Feature activation-2.21e-01

Pos logit contributions
 your+0.015
Mid+0.015
 her+0.015
 Piper+0.015
Scar+0.015
Neg logit contributions
apping-0.036
unch-0.034
amb-0.033
apped-0.033
owing-0.032
Token the
Token ID262
Feature activation+2.57e-01

Pos logit contributions
apping+0.428
apped+0.390
amb+0.363
unch+0.359
ement+0.358
Neg logit contributions
 trash-0.296
 Piper-0.294
Mid-0.293
 Blink-0.290
 newly-0.289
Token Happy
Token ID14628
Feature activation+1.72e-02

Pos logit contributions
 Piper+0.292
 Blink+0.289
 your+0.281
Scar+0.279
Mid+0.278
Neg logit contributions
apping-0.532
unch-0.485
ement-0.485
apped-0.483
amb-0.477
Token Birthday
Token ID33511
Feature activation-3.58e-01

Pos logit contributions
Mid+0.026
 your+0.026
 her+0.026
 newly+0.026
Scar+0.026
Neg logit contributions
apping-0.029
unch-0.028
amb-0.026
owing-0.026
apped-0.026
Token song
Token ID3496
Feature activation-2.61e-01

Pos logit contributions
apping+0.650
apped+0.585
amb+0.559
owing+0.547
unch+0.543
Neg logit contributions
Mid-0.528
Scar-0.522
 Blink-0.522
 newly-0.512
 trash-0.504
Token.
Token ID13
Feature activation-9.08e-02

Pos logit contributions
apping+0.455
unch+0.416
apped+0.403
amb+0.401
owing+0.393
Neg logit contributions
Mid-0.390
 your-0.388
Scar-0.385
 Piper-0.385
 Blink-0.383
Token Molly
Token ID30236
Feature activation+1.33e-01

Pos logit contributions
apping+0.168
apped+0.152
amb+0.148
unch+0.148
ement+0.142
Neg logit contributions
Mid-0.125
 your-0.124
 newly-0.123
Scar-0.123
 Blink-0.123
Token blew
Token ID17948
Feature activation-1.33e-01

Pos logit contributions
Mid+0.141
 Blink+0.140
 your+0.138
 Piper+0.138
 newly+0.136
Neg logit contributions
apping-0.285
apped-0.262
amb-0.262
unch-0.261
aring-0.253
Token out
Token ID503
Feature activation+3.60e-02

Pos logit contributions
apping+0.278
apped+0.259
ement+0.255
amb+0.250
unch+0.245
Neg logit contributions
Mid-0.144
 Blink-0.143
Scar-0.140
 Piper-0.139
 newly-0.138
Token Lily
Token ID20037
Feature activation+9.09e-02

Pos logit contributions
apping+0.171
unch+0.157
apped+0.150
amb+0.147
aring+0.147
Neg logit contributions
Mid-0.100
Scar-0.096
 newly-0.091
 Blink-0.089
Elsa-0.089
Token.
Token ID13
Feature activation-1.29e-01

Pos logit contributions
 newly+0.131
Mid+0.130
 Blink+0.128
Scar+0.125
 Piper+0.125
Neg logit contributions
apping-0.164
unch-0.149
apped-0.147
amb-0.147
owing-0.144
Token I
Token ID314
Feature activation+7.54e-02

Pos logit contributions
apping+0.307
unch+0.293
owing+0.285
amb+0.282
ement+0.282
Neg logit contributions
 your-0.114
 Piper-0.110
 her-0.109
 Midnight-0.109
Mid-0.109
Token hope
Token ID2911
Feature activation-2.05e-01

Pos logit contributions
Mid+0.072
 Blink+0.072
 Piper+0.072
 your+0.072
Scar+0.071
Neg logit contributions
apping-0.171
unch-0.162
amb-0.157
apped-0.156
aring-0.155
Token your
Token ID534
Feature activation+1.83e-01

Pos logit contributions
apping+0.423
unch+0.403
aring+0.388
oot+0.385
owing+0.377
Neg logit contributions
Scar-0.223
Mid-0.218
Max-0.200
 Piper-0.196
Beck-0.195
Token wish
Token ID4601
Feature activation-3.57e-01

Pos logit contributions
 your+0.209
 her+0.201
Mid+0.200
 newly+0.197
 Midnight+0.196
Neg logit contributions
apping-0.397
unch-0.378
amb-0.359
apped-0.358
owing-0.352
Token comes
Token ID2058
Feature activation-2.86e-02

Pos logit contributions
apping+0.773
apped+0.707
owing+0.707
aring+0.701
unch+0.695
Neg logit contributions
 Piper-0.390
Mid-0.384
Scar-0.370
 Midnight-0.361
Max-0.351
Token true
Token ID2081
Feature activation-1.62e-02

Pos logit contributions
apping+0.056
owing+0.050
aring+0.049
amb+0.049
unch+0.048
Neg logit contributions
Mid-0.037
 Piper-0.037
 Midnight-0.036
Scar-0.035
 newly-0.035
Token."
Token ID526
Feature activation-1.30e-01

Pos logit contributions
apping+0.031
unch+0.031
apped+0.028
amb+0.028
owing+0.027
Neg logit contributions
 your-0.024
 her-0.023
 Blink-0.022
 would-0.021
 she-0.021
Token\n
Token ID198
Feature activation-3.87e-02

Pos logit contributions
apping+0.276
unch+0.261
amb+0.251
apped+0.249
owing+0.247
Neg logit contributions
 your-0.148
Mid-0.148
 her-0.146
 Piper-0.146
 Blink-0.144
Token\n
Token ID198
Feature activation-3.79e-02

Pos logit contributions
apping+0.079
unch+0.074
owing+0.072
amb+0.071
apped+0.071
Neg logit contributions
 your-0.048
 her-0.048
Mid-0.048
 Blink-0.047
 Piper-0.046
Token had
Token ID550
Feature activation+4.30e-03

Pos logit contributions
apping+0.220
apped+0.190
unch+0.187
oot+0.186
ement+0.184
Neg logit contributions
Scar-0.270
Mid-0.268
 Piper-0.259
 Blink-0.257
 Midnight-0.252
Token collected
Token ID7723
Feature activation-1.24e-01

Pos logit contributions
Mid+0.005
 Blink+0.005
 Piper+0.005
 Midnight+0.005
Max+0.004
Neg logit contributions
apping-0.009
apped-0.008
unch-0.008
owing-0.008
aring-0.008
Token and
Token ID290
Feature activation-3.68e-02

Pos logit contributions
apping+0.250
ement+0.250
apped+0.239
unch+0.236
atted+0.234
Neg logit contributions
Mid-0.140
 Blink-0.139
Scar-0.134
Max-0.131
 Midnight-0.127
Token make
Token ID787
Feature activation-6.51e-02

Pos logit contributions
apping+0.074
apped+0.070
unch+0.069
amb+0.065
idd+0.064
Neg logit contributions
Mid-0.045
Scar-0.045
 Blink-0.044
 Midnight-0.042
 Piper-0.042
Token a
Token ID257
Feature activation+1.31e-01

Pos logit contributions
apping+0.145
unch+0.136
aps+0.133
apped+0.133
ement+0.132
Neg logit contributions
Mid-0.065
 Blink-0.061
 Midnight-0.060
Max-0.060
 Piper-0.059
Token wish
Token ID4601
Feature activation-3.57e-01

Pos logit contributions
Mid+0.105
 Piper+0.103
 Blink+0.102
 your+0.101
Scar+0.101
Neg logit contributions
apping-0.320
unch-0.304
amb-0.291
apped-0.290
ement-0.287
Token for
Token ID329
Feature activation+5.85e-03

Pos logit contributions
apping+0.605
unch+0.549
apped+0.528
owing+0.524
amb+0.515
Neg logit contributions
Mid-0.567
 Piper-0.551
Scar-0.548
 Midnight-0.545
 Blink-0.544
Token a
Token ID257
Feature activation+1.35e-01

Pos logit contributions
Mid+0.005
Scar+0.005
 Piper+0.005
 Blink+0.005
 Midnight+0.005
Neg logit contributions
apping-0.014
unch-0.013
ement-0.013
apped-0.013
amb-0.013
Token lovely
Token ID14081
Feature activation-1.04e-01

Pos logit contributions
 your+0.111
Mid+0.110
 her+0.107
 Piper+0.107
 Blink+0.106
Neg logit contributions
apping-0.329
unch-0.313
amb-0.303
apped-0.302
owing-0.299
Token surprise
Token ID5975
Feature activation-2.47e-01

Pos logit contributions
apping+0.233
unch+0.220
apped+0.213
amb+0.211
owing+0.209
Neg logit contributions
 your-0.111
 her-0.110
Mid-0.107
 newly-0.105
 Piper-0.104
Token.
Token ID13
Feature activation-9.63e-02

Pos logit contributions
apping+0.436
unch+0.411
amb+0.389
apped+0.388
aring+0.380
Neg logit contributions
 her-0.383
 your-0.377
 she-0.368
 Piper-0.361
Mid-0.359
Token happy
Token ID3772
Feature activation-1.68e-01

Pos logit contributions
apping+0.131
apped+0.114
owing+0.113
unch+0.113
aps+0.110
Neg logit contributions
 Blink-0.082
Mid-0.082
Max-0.080
 Piper-0.079
Ol-0.078
Token and
Token ID290
Feature activation-4.46e-02

Pos logit contributions
apping+0.291
unch+0.270
amb+0.257
apped+0.255
owing+0.251
Neg logit contributions
Mid-0.254
 your-0.251
 Piper-0.251
Scar-0.250
 Blink-0.249
Token give
Token ID1577
Feature activation-1.99e-01

Pos logit contributions
apping+0.088
aring+0.081
apped+0.080
unch+0.079
amb+0.078
Neg logit contributions
Scar-0.054
Mid-0.054
 Blink-0.052
Max-0.052
 Piper-0.052
Token us
Token ID514
Feature activation+2.46e-02

Pos logit contributions
apping+0.453
amb+0.427
apped+0.424
unch+0.422
ement+0.419
Neg logit contributions
 Blink-0.162
Scar-0.162
 newly-0.156
Mid-0.155
 Piper-0.147
Token a
Token ID257
Feature activation+1.32e-01

Pos logit contributions
 Blink+0.024
 Piper+0.023
Scar+0.022
Max+0.021
 Midnight+0.021
Neg logit contributions
apping-0.054
apped-0.050
amb-0.050
aps-0.050
aring-0.049
Token wish
Token ID4601
Feature activation-3.57e-01

Pos logit contributions
 your+0.161
Mid+0.160
 her+0.157
 Piper+0.156
Scar+0.155
Neg logit contributions
apping-0.270
unch-0.254
amb-0.245
apped-0.242
owing-0.241
Token."
Token ID526
Feature activation-1.32e-01

Pos logit contributions
apping+0.583
unch+0.530
apped+0.507
amb+0.506
owing+0.500
Neg logit contributions
Mid-0.578
 Piper-0.570
Scar-0.566
 Blink-0.565
 Midnight-0.561
Token\n
Token ID198
Feature activation-3.79e-02

Pos logit contributions
apping+0.263
unch+0.247
amb+0.237
apped+0.235
owing+0.231
Neg logit contributions
Mid-0.165
 your-0.164
 Piper-0.162
Scar-0.162
 Blink-0.161
Token\n
Token ID198
Feature activation-3.73e-02

Pos logit contributions
apping+0.076
unch+0.071
amb+0.069
owing+0.069
apped+0.069
Neg logit contributions
 your-0.047
 her-0.047
Mid-0.047
 Piper-0.046
Scar-0.046
TokenBen
Token ID11696
Feature activation+2.63e-02

Pos logit contributions
apping+0.086
unch+0.081
amb+0.078
aring+0.077
apped+0.075
Neg logit contributions
chy-0.035
 Blink-0.035
 newly-0.035
 same-0.035
 Piper-0.035
Token liked
Token ID8288
Feature activation-1.42e-01

Pos logit contributions
 newly+0.039
Mid+0.039
 Piper+0.038
 Midnight+0.038
 Blink+0.038
Neg logit contributions
apping-0.044
apped-0.042
amb-0.041
ement-0.040
aring-0.039
Token so
Token ID523
Feature activation-7.45e-02

Pos logit contributions
apping+0.446
unch+0.409
ement+0.400
owing+0.397
apped+0.395
Neg logit contributions
Mid-0.220
Scar-0.212
 Piper-0.207
 Blink-0.197
 Midnight-0.196
Token much
Token ID881
Feature activation-4.70e-02

Pos logit contributions
apping+0.175
unch+0.161
apped+0.160
amb+0.157
ement+0.155
Neg logit contributions
Mid-0.071
 Piper-0.068
Scar-0.067
 Midnight-0.065
 Blink-0.062
Token TV
Token ID3195
Feature activation-1.38e-01

Pos logit contributions
apping+0.096
unch+0.091
amb+0.088
apped+0.087
owing+0.086
Neg logit contributions
 your-0.058
 her-0.057
 Blink-0.055
 Piper-0.055
Mid-0.055
Token.
Token ID13
Feature activation-9.97e-02

Pos logit contributions
apping+0.224
unch+0.204
apped+0.196
owing+0.195
ement+0.193
Neg logit contributions
Mid-0.224
 Piper-0.222
Scar-0.219
 Blink-0.217
 your-0.215
Token They
Token ID1119
Feature activation+1.18e-01

Pos logit contributions
apping+0.211
unch+0.199
amb+0.191
apped+0.190
owing+0.189
Neg logit contributions
 your-0.112
Mid-0.111
 Piper-0.110
 her-0.110
 Blink-0.109
Token wish
Token ID4601
Feature activation-3.57e-01

Pos logit contributions
Mid+0.140
 your+0.136
 Piper+0.136
 Blink+0.135
Scar+0.134
Neg logit contributions
apping-0.235
apped-0.222
amb-0.216
owing-0.214
unch-0.213
Token they
Token ID484
Feature activation+2.09e-01

Pos logit contributions
apping+0.773
apped+0.711
ement+0.678
aring+0.674
oot+0.674
Neg logit contributions
Mid-0.412
Scar-0.393
 Piper-0.371
Elsa-0.367
Max-0.360
Token had
Token ID550
Feature activation+9.96e-03

Pos logit contributions
 Piper+0.289
Mid+0.282
 Midnight+0.274
 Blink+0.260
Scar+0.258
Neg logit contributions
apping-0.382
owing-0.364
apped-0.357
aring-0.348
ement-0.335
Token been
Token ID587
Feature activation-3.31e-02

Pos logit contributions
Mid+0.011
 Piper+0.011
 Blink+0.011
Scar+0.011
 Midnight+0.011
Neg logit contributions
apping-0.021
unch-0.019
apped-0.019
owing-0.019
aring-0.018
Token good
Token ID922
Feature activation-7.40e-02

Pos logit contributions
apping+0.068
unch+0.061
apped+0.061
owing+0.060
aring+0.060
Neg logit contributions
Mid-0.041
 Piper-0.040
 Blink-0.040
Scar-0.039
 Midnight-0.039
Token.
Token ID13
Feature activation-9.81e-02

Pos logit contributions
unch+0.148
apping+0.142
ement+0.133
amb+0.131
oot+0.130
Neg logit contributions
 her-0.096
 your-0.094
 she-0.091
 same-0.087
 their-0.085
Token she
Token ID673
Feature activation+2.13e-01

Pos logit contributions
apping+0.041
owing+0.039
unch+0.037
oot+0.037
ement+0.036
Neg logit contributions
 Blink-0.030
Mid-0.030
Scar-0.028
Ol-0.028
 Piper-0.027
Token needed
Token ID2622
Feature activation-1.18e-01

Pos logit contributions
Mid+0.239
 Blink+0.231
 Piper+0.228
 your+0.227
Scar+0.227
Neg logit contributions
apping-0.423
owing-0.403
apped-0.395
unch-0.390
aring-0.385
Token and
Token ID290
Feature activation-3.19e-02

Pos logit contributions
apping+0.180
unch+0.166
owing+0.156
apped+0.155
amb+0.155
Neg logit contributions
Mid-0.201
 Blink-0.198
Scar-0.197
 Piper-0.197
 Midnight-0.194
Token make
Token ID787
Feature activation-6.75e-02

Pos logit contributions
apping+0.069
apped+0.065
unch+0.065
owing+0.064
aring+0.063
Neg logit contributions
Mid-0.031
Scar-0.031
Max-0.028
 Blink-0.028
 Piper-0.027
Token her
Token ID607
Feature activation+2.12e-01

Pos logit contributions
apping+0.144
unch+0.134
ement+0.134
aps+0.131
apped+0.130
Neg logit contributions
Mid-0.066
 Blink-0.063
Scar-0.062
Max-0.061
 Piper-0.060
Token wish
Token ID4601
Feature activation-3.57e-01

Pos logit contributions
Mid+0.239
Scar+0.233
 Piper+0.233
 Blink+0.232
 your+0.230
Neg logit contributions
apping-0.448
unch-0.420
apped-0.404
amb-0.404
owing-0.404
Token come
Token ID1282
Feature activation-1.30e-01

Pos logit contributions
apping+0.642
owing+0.571
apped+0.567
unch+0.550
ement+0.549
Neg logit contributions
Mid-0.531
 Piper-0.507
Scar-0.505
 Blink-0.503
 Midnight-0.493
Token true
Token ID2081
Feature activation-1.47e-02

Pos logit contributions
apping+0.228
owing+0.202
aring+0.196
oot+0.195
aps+0.195
Neg logit contributions
Mid-0.188
 Blink-0.182
 Midnight-0.182
 Piper-0.181
 trash-0.176
Token.
Token ID13
Feature activation-9.48e-02

Pos logit contributions
apping+0.027
unch+0.026
amb+0.024
apped+0.024
aring+0.023
Neg logit contributions
 her-0.022
 your-0.022
 she-0.021
 Blink-0.021
 Piper-0.020
Token
Token ID198
Feature activation-3.02e-02

Pos logit contributions
apping+0.187
unch+0.177
amb+0.170
apped+0.168
aring+0.164
Neg logit contributions
Mid-0.119
 your-0.118
Scar-0.116
 Piper-0.116
 Blink-0.115
Token
Token ID198
Feature activation-3.04e-02

Pos logit contributions
apping+0.060
unch+0.056
amb+0.054
apped+0.054
owing+0.054
Neg logit contributions
Mid-0.038
 your-0.038
 her-0.038
 Piper-0.038
Scar-0.038
Token with
Token ID351
Feature activation-2.12e-02

Pos logit contributions
apping+0.375
unch+0.367
amb+0.351
aring+0.341
ement+0.340
Neg logit contributions
 her-0.285
 your-0.284
 she-0.269
 the-0.264
 their-0.260
Token a
Token ID257
Feature activation+1.74e-01

Pos logit contributions
apping+0.057
unch+0.055
apped+0.053
ement+0.053
amb+0.052
Neg logit contributions
Mid-0.012
Scar-0.011
 Piper-0.011
 Blink-0.010
 Midnight-0.010
Token big
Token ID1263
Feature activation+3.34e-03

Pos logit contributions
 your+0.119
 newly+0.116
Scar+0.112
 Piper+0.112
Mid+0.111
Neg logit contributions
apping-0.446
unch-0.425
amb-0.423
apped-0.413
owing-0.413
Token smile
Token ID8212
Feature activation-1.86e-01

Pos logit contributions
 your+0.004
 newly+0.004
 our+0.004
 her+0.004
 Piper+0.004
Neg logit contributions
apping-0.007
amb-0.006
unch-0.006
aring-0.006
owing-0.006
Token on
Token ID319
Feature activation+8.46e-02

Pos logit contributions
apping+0.395
unch+0.382
aring+0.369
ement+0.365
amb+0.358
Neg logit contributions
 her-0.217
 she-0.204
 your-0.200
 the-0.190
 their-0.185
Token his
Token ID465
Feature activation+2.59e-01

Pos logit contributions
 her+0.025
 your+0.021
 the+0.012
 their+0.011
 our+0.007
Neg logit contributions
apping-0.274
unch-0.259
owing-0.257
aring-0.254
apped-0.252
Token face
Token ID1986
Feature activation-2.20e-01

Pos logit contributions
 newly+0.382
 your+0.378
 her+0.362
 our+0.360
Mid+0.358
Neg logit contributions
apping-0.463
unch-0.445
amb-0.424
apped-0.410
aring-0.410
Token.
Token ID13
Feature activation-6.40e-02

Pos logit contributions
apping+0.450
unch+0.438
amb+0.432
idd+0.421
apped+0.419
Neg logit contributions
 her-0.286
 she-0.278
 the-0.262
 your-0.241
 their-0.238
Token<|endoftext|>
Token ID50256
Feature activation-1.18e-01

Pos logit contributions
apping+0.153
owing+0.144
ement+0.142
oot+0.139
unch+0.138
Neg logit contributions
 Blink-0.057
 your-0.053
 her-0.052
 she-0.052
 Piper-0.051
TokenOne
Token ID3198
Feature activation-6.07e-02

Pos logit contributions
apping+0.270
unch+0.255
aring+0.250
owing+0.249
amb+0.248
Neg logit contributions
 her-0.112
 your-0.110
 Blink-0.109
 Piper-0.107
 Midnight-0.106
Token day
Token ID1110
Feature activation-1.21e-01

Pos logit contributions
apping+0.118
unch+0.111
apped+0.104
owing+0.103
amb+0.103
Neg logit contributions
Mid-0.082
Scar-0.082
 Piper-0.081
 Midnight-0.081
 Blink-0.080
Token
Token ID198
Feature activation-3.96e-03

Pos logit contributions
apping+0.161
unch+0.153
amb+0.148
owing+0.146
apped+0.145
Neg logit contributions
 her-0.078
Mid-0.077
 your-0.077
 Piper-0.077
 Blink-0.076
Token
Token ID198
Feature activation-3.74e-03

Pos logit contributions
apping+0.008
unch+0.008
amb+0.008
apped+0.007
owing+0.007
Neg logit contributions
 your-0.005
 her-0.005
Mid-0.005
 Piper-0.005
 Blink-0.005
TokenShe
Token ID3347
Feature activation+1.59e-01

Pos logit contributions
apping+0.008
aps+0.008
unch+0.007
aring+0.007
apped+0.007
Neg logit contributions
 Blink-0.004
chy-0.004
 Piper-0.004
 same-0.004
 newly-0.004
Token looked
Token ID3114
Feature activation-9.22e-02

Pos logit contributions
 your+0.157
 Blink+0.152
Mid+0.152
 Piper+0.151
Scar+0.148
Neg logit contributions
apping-0.357
unch-0.338
apped-0.330
amb-0.325
aring-0.318
Token at
Token ID379
Feature activation-3.44e-03

Pos logit contributions
apping+0.129
apped+0.114
unch+0.112
amb+0.112
ement+0.109
Neg logit contributions
 Blink-0.169
Mid-0.168
Scar-0.167
 Piper-0.167
 Midnight-0.165
Token the
Token ID262
Feature activation+2.87e-01

Pos logit contributions
apping+0.008
apped+0.008
ement+0.008
aps+0.007
unch+0.007
Neg logit contributions
Scar-0.003
Mid-0.003
 Blink-0.002
 Piper-0.002
chy-0.002
Token cut
Token ID2005
Feature activation-1.76e-01

Pos logit contributions
 Blink+0.364
Mid+0.360
 Piper+0.352
 your+0.341
 Midnight+0.337
Neg logit contributions
apping-0.576
unch-0.553
ement-0.516
apped-0.509
owing-0.504
Token.
Token ID13
Feature activation-7.17e-02

Pos logit contributions
apping+0.266
unch+0.240
apped+0.232
ement+0.228
owing+0.223
Neg logit contributions
Mid-0.313
Scar-0.309
 Piper-0.308
 Blink-0.307
 your-0.302
Token It
Token ID632
Feature activation+1.41e-01

Pos logit contributions
apping+0.171
unch+0.160
apped+0.156
aps+0.152
amb+0.152
Neg logit contributions
Mid-0.063
 your-0.061
 Piper-0.060
Scar-0.060
 newly-0.059
Token was
Token ID373
Feature activation+1.74e-02

Pos logit contributions
 your+0.126
Scar+0.122
Mid+0.120
 Piper+0.120
 Blink+0.116
Neg logit contributions
apping-0.327
unch-0.313
apped-0.306
ement-0.300
amb-0.299
Token not
Token ID407
Feature activation+5.47e-02

Pos logit contributions
Mid+0.016
Scar+0.015
 Blink+0.015
 Piper+0.015
 your+0.014
Neg logit contributions
apping-0.040
unch-0.038
apped-0.036
ement-0.036
owing-0.036
Token to
Token ID284
Feature activation+1.02e-01

Pos logit contributions
apping+0.150
unch+0.140
amb+0.135
apped+0.133
owing+0.132
Neg logit contributions
Mid-0.109
Scar-0.108
 Piper-0.107
 your-0.107
 Blink-0.106
Token the
Token ID262
Feature activation+2.89e-01

Pos logit contributions
Mid+0.068
chy+0.066
Scar+0.062
 has+0.058
Max+0.058
Neg logit contributions
apping-0.241
unch-0.239
amb-0.227
aps-0.226
apped-0.223
Token zoo
Token ID26626
Feature activation-2.00e-01

Pos logit contributions
Mid+0.321
 Piper+0.319
 your+0.316
Scar+0.316
 Blink+0.315
Neg logit contributions
apping-0.612
unch-0.579
amb-0.557
apped-0.552
aring-0.543
Token and
Token ID290
Feature activation+6.76e-03

Pos logit contributions
apping+0.368
unch+0.355
amb+0.342
ement+0.336
apped+0.331
Neg logit contributions
 her-0.287
 your-0.286
 she-0.277
 Blink-0.270
 Piper-0.263
Token saw
Token ID2497
Feature activation-4.73e-02

Pos logit contributions
Mid+0.009
Scar+0.009
 newly+0.009
chy+0.009
Max+0.008
Neg logit contributions
apped-0.011
amb-0.010
apping-0.010
 sur-0.010
owing-0.010
Token a
Token ID257
Feature activation+1.80e-01

Pos logit contributions
apping+0.118
unch+0.114
apped+0.111
amb+0.108
ement+0.107
Neg logit contributions
Mid-0.034
Scar-0.034
 Piper-0.032
 newly-0.031
 Blink-0.030
Token big
Token ID1263
Feature activation+8.68e-03

Pos logit contributions
 your+0.103
Mid+0.099
 Piper+0.097
 her+0.097
 Blink+0.097
Neg logit contributions
apping-0.482
unch-0.460
amb-0.450
apped-0.446
owing-0.445
Token rh
Token ID9529
Feature activation-1.46e-01

Pos logit contributions
 your+0.010
 her+0.010
 Piper+0.010
 Blink+0.010
Mid+0.010
Neg logit contributions
apping-0.018
unch-0.017
amb-0.017
apped-0.017
owing-0.016
Tokenin
Token ID259
Feature activation+4.26e-02

Pos logit contributions
apping+0.250
amb+0.244
unch+0.240
aring+0.237
owing+0.235
Neg logit contributions
 your-0.200
Max-0.192
Anna-0.190
 newly-0.190
 Blink-0.187
Tokenoc
Token ID420
Feature activation-2.52e-01

Pos logit contributions
 she+0.060
 has+0.060
 your+0.059
 the+0.059
 they+0.058
Neg logit contributions
apping-0.081
atted-0.075
owing-0.075
ement-0.072
apped-0.071
Tokeneros
Token ID27498
Feature activation-5.46e-02

Pos logit contributions
apping+0.439
unch+0.375
run+0.358
year+0.356
and+0.351
Neg logit contributions
 your-0.350
 our-0.348
 my-0.333
 Blink-0.332
 same-0.314
Token the
Token ID262
Feature activation+2.26e-01

Pos logit contributions
Mid+0.013
Scar+0.012
 Midnight+0.011
Ol+0.011
Elsa+0.011
Neg logit contributions
apping-0.026
ement-0.025
aps-0.024
apped-0.024
owing-0.023
Token forest
Token ID8222
Feature activation-2.19e-01

Pos logit contributions
Mid+0.321
 Blink+0.318
Scar+0.314
 Piper+0.311
 Midnight+0.304
Neg logit contributions
apping-0.388
ement-0.349
unch-0.345
amb-0.341
aps-0.337
Token.
Token ID13
Feature activation-1.10e-01

Pos logit contributions
apping+0.381
unch+0.350
apped+0.333
owing+0.331
amb+0.325
Neg logit contributions
Mid-0.343
Scar-0.337
 Piper-0.334
 Midnight-0.334
 Blink-0.331
Token<|endoftext|>
Token ID50256
Feature activation-1.54e-01

Pos logit contributions
apping+0.223
unch+0.209
amb+0.201
apped+0.201
aring+0.194
Neg logit contributions
Mid-0.132
 your-0.131
Scar-0.129
 newly-0.129
 Piper-0.127
TokenOnce
Token ID7454
Feature activation-1.01e-01

Pos logit contributions
apping+0.339
unch+0.319
amb+0.309
aring+0.309
owing+0.307
Neg logit contributions
 her-0.164
 your-0.163
 Blink-0.162
 Piper-0.159
Mid-0.157
Token upon
Token ID2402
Feature activation+1.50e-01

Pos logit contributions
apping+0.248
unch+0.231
apped+0.226
ement+0.220
owing+0.217
Neg logit contributions
Mid-0.092
Scar-0.084
 Midnight-0.084
 Piper-0.084
 Blink-0.082
Token a
Token ID257
Feature activation+1.40e-01

Pos logit contributions
 Piper+0.134
Scar+0.121
Mid+0.120
 Midnight+0.115
 Blink+0.113
Neg logit contributions
apping-0.352
unch-0.346
apped-0.324
aps-0.319
amb-0.314
Token time
Token ID640
Feature activation-1.48e-01

Pos logit contributions
 Piper+0.140
 your+0.136
 Midnight+0.133
 Blink+0.132
 her+0.132
Neg logit contributions
apping-0.316
unch-0.303
amb-0.282
aring-0.279
owing-0.278
Token,
Token ID11
Feature activation-5.44e-02

Pos logit contributions
apping+0.269
unch+0.250
apped+0.236
amb+0.235
owing+0.232
Neg logit contributions
Mid-0.215
 Piper-0.212
Scar-0.211
 your-0.210
 Blink-0.209
Token there
Token ID612
Feature activation+8.76e-02

Pos logit contributions
apping+0.113
apped+0.102
unch+0.097
amb+0.093
owing+0.092
Neg logit contributions
Scar-0.069
 Piper-0.069
Mid-0.069
 Blink-0.067
 Midnight-0.066
Token was
Token ID373
Feature activation-9.94e-03

Pos logit contributions
Mid+0.097
Scar+0.096
 Piper+0.094
 Blink+0.089
 Midnight+0.089
Neg logit contributions
apping-0.191
unch-0.184
apped-0.179
ement-0.170
amb-0.170
Token Lily
Token ID20037
Feature activation+1.57e-01

Pos logit contributions
apping+0.165
unch+0.157
amb+0.152
apped+0.151
owing+0.149
Neg logit contributions
Mid-0.050
 your-0.049
 Piper-0.048
 Blink-0.048
Scar-0.048
Token had
Token ID550
Feature activation+5.65e-02

Pos logit contributions
 Piper+0.195
 newly+0.193
 your+0.191
 our+0.189
Mid+0.189
Neg logit contributions
apping-0.282
apped-0.264
aring-0.252
amb-0.249
owing-0.243
Token long
Token ID890
Feature activation-1.05e-02

Pos logit contributions
 Blink+0.055
 Piper+0.054
Mid+0.050
 Mickey+0.048
 newly+0.047
Neg logit contributions
apping-0.117
apped-0.109
aps-0.107
unch-0.100
ocked-0.100
Token hair
Token ID4190
Feature activation-2.16e-01

Pos logit contributions
apping+0.013
apped+0.012
aps+0.011
atted+0.011
ement+0.011
Neg logit contributions
 Piper-0.020
 Blink-0.019
Scar-0.019
Mid-0.019
Max-0.019
Token,
Token ID11
Feature activation-1.94e-02

Pos logit contributions
apping+0.374
unch+0.339
apped+0.332
amb+0.324
owing+0.321
Neg logit contributions
Mid-0.332
 Piper-0.332
 Blink-0.326
Scar-0.324
 your-0.319
Token a
Token ID257
Feature activation+1.79e-01

Pos logit contributions
apping+0.034
unch+0.033
ement+0.031
apped+0.031
amb+0.030
Neg logit contributions
Scar-0.025
Beck-0.024
Mid-0.023
 Blink-0.022
 Piper-0.022
Token pink
Token ID11398
Feature activation-9.44e-02

Pos logit contributions
 Blink+0.192
 Piper+0.185
Mid+0.183
 your+0.181
Scar+0.179
Neg logit contributions
apping-0.374
unch-0.340
apped-0.334
aps-0.326
ement-0.320
Token dress
Token ID6576
Feature activation-2.43e-01

Pos logit contributions
apping+0.200
unch+0.187
apped+0.184
amb+0.183
ement+0.178
Neg logit contributions
 Piper-0.102
Scar-0.101
 Blink-0.101
Mid-0.100
 your-0.099
Token,
Token ID11
Feature activation-1.84e-02

Pos logit contributions
apping+0.506
unch+0.491
amb+0.467
owing+0.458
apped+0.449
Neg logit contributions
 her-0.301
 she-0.294
 your-0.278
 Midnight-0.273
Mid-0.265
Token and
Token ID290
Feature activation+6.24e-03

Pos logit contributions
apping+0.027
unch+0.025
apped+0.024
ement+0.023
amb+0.023
Neg logit contributions
Mid-0.029
Scar-0.029
Beck-0.029
 Piper-0.029
Ol-0.028
Token shiny
Token ID22441
Feature activation-4.64e-02

Pos logit contributions
Scar+0.005
Mid+0.005
 newly+0.005
 Blink+0.005
 Piper+0.005
Neg logit contributions
apping-0.013
apped-0.013
unch-0.012
amb-0.012
ement-0.011
Token!"
Token ID2474
Feature activation-7.37e-02

Pos logit contributions
apping+0.072
unch+0.068
amb+0.065
apped+0.064
owing+0.064
Neg logit contributions
Mid-0.054
 your-0.054
 Piper-0.053
Scar-0.053
 her-0.053
Token
Token ID198
Feature activation+1.57e-02

Pos logit contributions
apping+0.199
unch+0.189
amb+0.185
apped+0.185
ement+0.181
Neg logit contributions
Mid-0.041
 Midnight-0.038
 your-0.037
Scar-0.037
 newly-0.037
Token
Token ID198
Feature activation+1.60e-02

Pos logit contributions
 her+0.015
 your+0.015
Mid+0.014
Scar+0.014
 she+0.014
Neg logit contributions
apping-0.036
unch-0.035
owing-0.034
ement-0.034
apped-0.034
TokenThe
Token ID464
Feature activation+2.10e-01

Pos logit contributions
 your+0.016
 her+0.016
Mid+0.016
Scar+0.015
 Piper+0.015
Neg logit contributions
apping-0.036
unch-0.035
owing-0.033
apped-0.033
amb-0.033
Token fire
Token ID2046
Feature activation-7.13e-02

Pos logit contributions
Mid+0.161
 your+0.160
 Piper+0.157
 her+0.156
Scar+0.155
Neg logit contributions
apping-0.523
unch-0.498
amb-0.482
apped-0.479
owing-0.475
Token was
Token ID373
Feature activation+3.57e-02

Pos logit contributions
apping+0.157
unch+0.153
amb+0.146
aring+0.142
owing+0.140
Neg logit contributions
 her-0.071
 Blink-0.071
 newly-0.069
Mid-0.069
 Piper-0.069
Token burning
Token ID9482
Feature activation-9.06e-02

Pos logit contributions
 her+0.037
 your+0.037
 Piper+0.036
Mid+0.036
 newly+0.035
Neg logit contributions
apping-0.080
unch-0.076
amb-0.074
aring-0.072
apped-0.072
Token on
Token ID319
Feature activation+1.02e-01

Pos logit contributions
apping+0.138
ement+0.124
owing+0.122
unch+0.122
apped+0.120
Neg logit contributions
Mid-0.159
Scar-0.158
 Blink-0.154
 Piper-0.152
 Midnight-0.151
Token a
Token ID257
Feature activation+1.93e-01

Pos logit contributions
Mid+0.086
Scar+0.077
 Midnight+0.073
Ol+0.072
 Blink+0.070
Neg logit contributions
apping-0.239
ement-0.230
apped-0.223
unch-0.222
owing-0.216
Token big
Token ID1263
Feature activation+2.23e-02

Pos logit contributions
Mid+0.165
 your+0.163
 Piper+0.162
Scar+0.162
 her+0.160
Neg logit contributions
apping-0.458
unch-0.436
amb-0.417
apped-0.414
owing-0.413
Token strip
Token ID10283
Feature activation-1.81e-01

Pos logit contributions
 your+0.024
Mid+0.024
 her+0.024
 Piper+0.023
 Blink+0.023
Neg logit contributions
apping-0.048
unch-0.046
amb-0.044
apped-0.044
owing-0.043
Token.
Token ID13
Feature activation-6.10e-02

Pos logit contributions
apping+0.228
unch+0.218
amb+0.209
apped+0.209
ement+0.204
Neg logit contributions
 your-0.165
 her-0.162
 she-0.155
Mid-0.155
 Blink-0.155
Token It
Token ID632
Feature activation+1.46e-01

Pos logit contributions
apping+0.133
unch+0.129
ement+0.125
owing+0.123
amb+0.121
Neg logit contributions
 your-0.066
 Blink-0.065
 Piper-0.063
 Midnight-0.061
Mid-0.060
Token might
Token ID1244
Feature activation-4.16e-02

Pos logit contributions
Mid+0.162
Scar+0.160
 Piper+0.150
 Midnight+0.150
Ol+0.143
Neg logit contributions
owing-0.293
aring-0.290
apping-0.285
apped-0.274
amb-0.274
Token be
Token ID307
Feature activation-8.01e-03

Pos logit contributions
oot+0.055
owing+0.054
unch+0.052
aring+0.052
ocular+0.051
Neg logit contributions
Scar-0.066
Mid-0.066
Beck-0.062
 Piper-0.062
Jason-0.061
Token locked
Token ID8970
Feature activation-1.52e-01

Pos logit contributions
owing+0.014
apping+0.014
oot+0.014
aring+0.013
atted+0.013
Neg logit contributions
Mid-0.010
Scar-0.009
Ol-0.009
 Piper-0.009
Max-0.009
Token or
Token ID393
Feature activation+2.76e-02

Pos logit contributions
apping+0.230
amb+0.205
owing+0.201
ement+0.199
apped+0.198
Neg logit contributions
Mid-0.258
Scar-0.253
 Piper-0.251
 newly-0.246
 Midnight-0.246
Token have
Token ID423
Feature activation+6.84e-02

Pos logit contributions
Mid+0.032
Scar+0.032
 Piper+0.031
 Midnight+0.030
 Blink+0.029
Neg logit contributions
apping-0.052
owing-0.048
aring-0.048
apped-0.046
aps-0.046
Token something
Token ID1223
Feature activation+8.18e-02

Pos logit contributions
 Piper+0.076
Mid+0.072
Scar+0.071
 Midnight+0.069
 Blink+0.066
Neg logit contributions
apping-0.136
apped-0.130
aring-0.130
oot-0.129
unch-0.128
Token scary
Token ID14343
Feature activation-4.46e-02

Pos logit contributions
Mid+0.094
 Piper+0.093
Scar+0.092
 Blink+0.091
 Midnight+0.089
Neg logit contributions
apping-0.170
unch-0.160
aring-0.154
amb-0.154
owing-0.153
Token in
Token ID287
Feature activation+1.15e-01

Pos logit contributions
apping+0.089
unch+0.086
apped+0.080
amb+0.080
ement+0.079
Neg logit contributions
 your-0.060
 her-0.058
 she-0.055
 Blink-0.055
 same-0.054
Token it
Token ID340
Feature activation+1.82e-01

Pos logit contributions
Mid+0.118
Scar+0.113
 Piper+0.111
 Blink+0.110
 Midnight+0.109
Neg logit contributions
apping-0.245
unch-0.230
ement-0.224
apped-0.223
amb-0.223
Token the
Token ID262
Feature activation+2.97e-01

Pos logit contributions
Mid+0.003
Scar+0.003
 Piper+0.003
 Blink+0.003
 Midnight+0.003
Neg logit contributions
apping-0.013
unch-0.012
apped-0.012
ement-0.012
amb-0.012
Token tray
Token ID26473
Feature activation-9.85e-02

Pos logit contributions
 your+0.360
 Blink+0.357
Mid+0.348
Scar+0.347
 Piper+0.336
Neg logit contributions
apping-0.578
ement-0.545
unch-0.543
aps-0.510
apped-0.508
Token carefully
Token ID7773
Feature activation-9.38e-04

Pos logit contributions
amb+0.190
apping+0.184
unch+0.183
aring+0.176
idd+0.174
Neg logit contributions
 she-0.136
 her-0.134
 the-0.119
 they-0.116
 their-0.116
Token.
Token ID13
Feature activation-5.31e-02

Pos logit contributions
apping+0.002
unch+0.002
amb+0.002
apped+0.002
aring+0.002
Neg logit contributions
 her-0.001
 she-0.001
 the-0.001
 your-0.001
 their-0.001
Token It
Token ID632
Feature activation+1.51e-01

Pos logit contributions
apping+0.108
unch+0.101
apped+0.099
amb+0.096
ement+0.095
Neg logit contributions
Mid-0.065
 your-0.064
Scar-0.063
 her-0.062
 Piper-0.062
Token was
Token ID373
Feature activation+2.86e-02

Pos logit contributions
 your+0.212
Scar+0.202
Mid+0.197
 our+0.195
 Midnight+0.186
Neg logit contributions
apping-0.264
ement-0.258
apped-0.255
unch-0.253
idd-0.237
Token a
Token ID257
Feature activation+1.87e-01

Pos logit contributions
Mid+0.032
Scar+0.029
 Blink+0.028
 Piper+0.028
 Midnight+0.028
Neg logit contributions
apping-0.061
unch-0.058
apped-0.056
ement-0.055
owing-0.053
Token funny
Token ID8258
Feature activation-8.63e-02

Pos logit contributions
Mid+0.142
 your+0.141
 her+0.139
 Piper+0.138
 Blink+0.138
Neg logit contributions
apping-0.462
unch-0.441
amb-0.422
apped-0.421
aring-0.417
Token mirror
Token ID10162
Feature activation-1.69e-01

Pos logit contributions
apping+0.178
unch+0.168
amb+0.160
apped+0.160
owing+0.158
Neg logit contributions
Mid-0.101
Scar-0.100
 your-0.100
 Blink-0.099
 Piper-0.099
Token tray
Token ID26473
Feature activation-9.66e-02

Pos logit contributions
apping+0.306
unch+0.289
amb+0.278
apped+0.277
aring+0.269
Neg logit contributions
 her-0.241
 she-0.239
 your-0.238
 Piper-0.230
 Midnight-0.230
Token,
Token ID11
Feature activation-8.70e-03

Pos logit contributions
apping+0.179
unch+0.169
amb+0.166
apped+0.162
aring+0.160
Neg logit contributions
 her-0.134
 she-0.134
 your-0.128
 Midnight-0.126
 Piper-0.126
Token And
Token ID843
Feature activation-5.39e-02

Pos logit contributions
apping+0.154
unch+0.148
owing+0.143
ement+0.141
amb+0.138
Neg logit contributions
 your-0.079
 Blink-0.076
Mid-0.075
 Midnight-0.075
 Piper-0.074
Token I
Token ID314
Feature activation+1.24e-01

Pos logit contributions
apping+0.135
unch+0.129
amb+0.125
apped+0.124
owing+0.123
Neg logit contributions
 your-0.039
Mid-0.039
 Piper-0.038
 her-0.038
 Blink-0.038
Token will
Token ID481
Feature activation-3.43e-03

Pos logit contributions
 your+0.077
Mid+0.075
 her+0.075
Scar+0.073
 Piper+0.071
Neg logit contributions
apping-0.328
unch-0.314
amb-0.302
ement-0.301
apped-0.301
Token be
Token ID307
Feature activation-8.81e-03

Pos logit contributions
apping+0.008
unch+0.008
ement+0.008
apped+0.008
owing+0.008
Neg logit contributions
 your-0.003
 her-0.003
 Midnight-0.003
Mid-0.003
 Piper-0.003
Token nice
Token ID3621
Feature activation-1.11e-01

Pos logit contributions
apping+0.019
apped+0.017
unch+0.017
aps+0.017
aring+0.017
Neg logit contributions
Mid-0.010
 Blink-0.010
 Piper-0.010
 Midnight-0.009
Scar-0.009
Token and
Token ID290
Feature activation+9.37e-04

Pos logit contributions
unch+0.255
ement+0.250
apping+0.240
amb+0.237
oot+0.232
Neg logit contributions
 your-0.122
 her-0.115
 she-0.108
 our-0.106
 my-0.105
Token playful
Token ID34264
Feature activation-8.71e-02

Pos logit contributions
Mid+0.001
 Piper+0.001
Scar+0.001
 Blink+0.001
 Midnight+0.001
Neg logit contributions
apping-0.002
unch-0.002
apped-0.002
amb-0.002
aring-0.002
Token."
Token ID526
Feature activation-8.78e-02

Pos logit contributions
unch+0.182
apping+0.168
amb+0.163
ement+0.163
oot+0.162
Neg logit contributions
 your-0.119
 her-0.116
 the-0.107
 my-0.107
 our-0.107
Token
Token ID198
Feature activation+1.07e-03

Pos logit contributions
apping+0.165
unch+0.154
apped+0.147
amb+0.146
aps+0.143
Neg logit contributions
Mid-0.121
 Piper-0.119
Scar-0.117
 your-0.117
 newly-0.117
Token
Token ID198
Feature activation+1.28e-03

Pos logit contributions
 your+0.001
 her+0.001
Mid+0.001
 Blink+0.001
Scar+0.001
Neg logit contributions
apping-0.002
unch-0.002
amb-0.002
owing-0.002
apped-0.002
TokenAnna
Token ID31160
Feature activation+1.27e-01

Pos logit contributions
 Piper+0.001
chy+0.001
 Midnight+0.001
 Mickey+0.001
 same+0.001
Neg logit contributions
apping-0.003
aps-0.003
unch-0.003
apped-0.003
ucking-0.003
Token happy
Token ID3772
Feature activation-1.17e-01

Pos logit contributions
 your+0.009
 her+0.008
Mid+0.008
 Piper+0.008
 Blink+0.008
Neg logit contributions
apping-0.025
unch-0.024
amb-0.023
apped-0.023
owing-0.023
Token."
Token ID526
Feature activation-9.02e-02

Pos logit contributions
apping+0.217
unch+0.211
owing+0.198
amb+0.197
ement+0.193
Neg logit contributions
 your-0.175
 her-0.173
 she-0.168
 our-0.162
 my-0.157
Token
Token ID198
Feature activation+1.77e-03

Pos logit contributions
apping+0.205
owing+0.198
unch+0.193
aring+0.192
ement+0.190
Neg logit contributions
 her-0.098
 she-0.096
 your-0.089
 Midnight-0.088
 Lily-0.086
Token
Token ID198
Feature activation+1.71e-03

Pos logit contributions
 her+0.002
 your+0.002
Mid+0.002
 Blink+0.002
 Midnight+0.002
Neg logit contributions
apping-0.004
owing-0.004
unch-0.004
ement-0.004
amb-0.004
Token"
Token ID1
Feature activation+7.30e-02

Pos logit contributions
 Piper+0.002
 Blink+0.002
 your+0.002
Scar+0.002
 Midnight+0.002
Neg logit contributions
apping-0.004
unch-0.004
amb-0.003
apped-0.003
aring-0.003
TokenI
Token ID40
Feature activation+6.62e-02

Pos logit contributions
Mid+0.072
Scar+0.070
 her+0.070
 your+0.069
 newly+0.068
Neg logit contributions
apping-0.167
unch-0.158
owing-0.155
aring-0.152
amb-0.152
Token love
Token ID1842
Feature activation-1.54e-01

Pos logit contributions
 your+0.076
Mid+0.075
 her+0.075
 Piper+0.074
Scar+0.074
Neg logit contributions
apping-0.140
unch-0.132
amb-0.126
apped-0.126
owing-0.124
Token them
Token ID606
Feature activation+1.59e-01

Pos logit contributions
apping+0.387
unch+0.367
amb+0.358
apped+0.355
owing+0.352
Neg logit contributions
 your-0.132
 her-0.129
 Piper-0.118
 she-0.117
 Midnight-0.116
Token too
Token ID1165
Feature activation-7.56e-02

Pos logit contributions
Mid+0.233
 Piper+0.231
Scar+0.230
 your+0.229
 Blink+0.229
Neg logit contributions
apping-0.283
unch-0.263
amb-0.251
apped-0.249
owing-0.246
Token,"
Token ID553
Feature activation-5.57e-02

Pos logit contributions
apping+0.165
unch+0.157
apped+0.154
amb+0.154
owing+0.149
Neg logit contributions
 your-0.089
 her-0.084
 our-0.080
 she-0.080
 my-0.080
Token her
Token ID607
Feature activation+2.58e-01

Pos logit contributions
apping+0.139
apped+0.136
amb+0.133
unch+0.133
aps+0.127
Neg logit contributions
Scar-0.034
 Blink-0.030
Mid-0.028
chy-0.028
 Piper-0.027
Token sat
Token ID3332
Feature activation-1.73e-01

Pos logit contributions
 your+0.145
 Piper+0.143
Mid+0.142
 Blink+0.142
 Midnight+0.139
Neg logit contributions
apping-0.277
apped-0.265
unch-0.255
amb-0.255
owing-0.243
Token on
Token ID319
Feature activation+6.02e-02

Pos logit contributions
unch+0.372
apping+0.368
aring+0.358
amb+0.344
oot+0.340
Neg logit contributions
 her-0.199
 your-0.191
 she-0.180
 newly-0.180
 their-0.176
Token the
Token ID262
Feature activation+2.60e-01

Pos logit contributions
Mid+0.029
 Piper+0.026
Scar+0.026
 Blink+0.024
 Midnight+0.024
Neg logit contributions
apping-0.163
unch-0.154
apped-0.153
ement-0.151
amb-0.151
Token bench
Token ID7624
Feature activation-1.68e-01

Pos logit contributions
 Piper+0.331
 Blink+0.322
Scar+0.318
Mid+0.313
 Midnight+0.312
Neg logit contributions
apping-0.467
amb-0.425
ement-0.420
apped-0.416
oot-0.411
Token and
Token ID290
Feature activation-2.49e-02

Pos logit contributions
unch+0.416
amb+0.398
aring+0.394
apping+0.394
idd+0.389
Neg logit contributions
 her-0.155
 she-0.150
 their-0.137
 the-0.134
 they-0.132
Token talked
Token ID6619
Feature activation-1.98e-01

Pos logit contributions
apping+0.049
apped+0.047
amb+0.045
unch+0.044
owing+0.043
Neg logit contributions
Mid-0.031
 your-0.029
Scar-0.029
 Piper-0.029
 Blink-0.029
Token and
Token ID290
Feature activation-2.29e-02

Pos logit contributions
unch+0.407
apping+0.399
idd+0.394
amb+0.392
owing+0.375
Neg logit contributions
 her-0.270
 their-0.255
 your-0.249
 she-0.243
 the-0.242
Token laughed
Token ID13818
Feature activation-1.59e-01

Pos logit contributions
apping+0.050
unch+0.049
ement+0.046
aring+0.046
amb+0.045
Neg logit contributions
 her-0.028
 she-0.026
 Midnight-0.026
 Blink-0.026
 Piper-0.026
Token.
Token ID13
Feature activation-9.29e-02

Pos logit contributions
unch+0.294
apping+0.290
aring+0.275
owing+0.274
idd+0.273
Neg logit contributions
 her-0.240
 their-0.221
 your-0.221
 she-0.218
 the-0.213
Token Sara
Token ID24799
Feature activation+9.42e-02

Pos logit contributions
apping+0.219
unch+0.208
amb+0.201
apped+0.199
owing+0.197
Neg logit contributions
Mid-0.083
 your-0.083
 Piper-0.082
 her-0.082
Scar-0.081
Token was
Token ID373
Feature activation-6.43e-03

Pos logit contributions
 your+0.097
Mid+0.097
 Piper+0.096
 Blink+0.095
 her+0.094
Neg logit contributions
apping-0.207
unch-0.196
amb-0.189
apped-0.189
owing-0.186
Token house
Token ID2156
Feature activation-1.78e-01

Pos logit contributions
 your+0.295
 trash+0.288
 her+0.285
 our+0.278
 same+0.276
Neg logit contributions
apping-0.581
unch-0.574
amb-0.542
ement-0.527
oot-0.526
Token and
Token ID290
Feature activation+9.56e-03

Pos logit contributions
apping+0.363
unch+0.354
amb+0.344
apped+0.329
ement+0.319
Neg logit contributions
 her-0.242
 your-0.228
 she-0.225
 their-0.216
 the-0.214
Token knocked
Token ID13642
Feature activation-1.73e-01

Pos logit contributions
Scar+0.014
 newly+0.013
 Piper+0.013
 same+0.013
Mid+0.012
Neg logit contributions
apped-0.015
apping-0.015
owing-0.014
atted-0.014
amb-0.014
Token it
Token ID340
Feature activation+1.87e-01

Pos logit contributions
apping+0.381
unch+0.365
amb+0.348
aring+0.345
ement+0.343
Neg logit contributions
 her-0.201
 your-0.200
Mid-0.187
 Piper-0.185
 Midnight-0.182
Token down
Token ID866
Feature activation+1.15e-01

Pos logit contributions
Scar+0.283
 Piper+0.277
Mid+0.271
 Midnight+0.267
 newly+0.263
Neg logit contributions
apping-0.327
apped-0.301
unch-0.297
owing-0.293
amb-0.291
Token.
Token ID13
Feature activation-6.13e-02

Pos logit contributions
Mid+0.198
 your+0.196
 Piper+0.196
Scar+0.195
 Blink+0.195
Neg logit contributions
apping-0.175
unch-0.161
amb-0.152
apped-0.152
owing-0.148
Token The
Token ID383
Feature activation+1.80e-01

Pos logit contributions
apping+0.124
unch+0.116
apped+0.115
aps+0.109
ement+0.107
Neg logit contributions
Mid-0.074
Scar-0.073
 Piper-0.073
 newly-0.070
 your-0.069
Token sticks
Token ID16461
Feature activation-7.09e-02

Pos logit contributions
Scar+0.210
 Piper+0.207
 your+0.206
Mid+0.205
 her+0.202
Neg logit contributions
apping-0.370
unch-0.351
apped-0.335
aring-0.325
owing-0.323
Token and
Token ID290
Feature activation+1.15e-02

Pos logit contributions
apping+0.142
unch+0.135
apped+0.132
owing+0.127
amb+0.126
Neg logit contributions
Mid-0.088
 Piper-0.086
Scar-0.086
 Midnight-0.084
 your-0.083
Token leaves
Token ID5667
Feature activation-8.65e-02

Pos logit contributions
Scar+0.017
Elsa+0.014
Mid+0.014
Beck+0.014
Hope+0.014
Neg logit contributions
atted-0.017
apped-0.017
anged-0.016
ocked-0.015
apping-0.015
Token fell
Token ID3214
Feature activation-1.04e-01

Pos logit contributions
apping+0.177
unch+0.168
apped+0.163
owing+0.159
amb+0.159
Neg logit contributions
Mid-0.103
 Piper-0.101
Scar-0.101
 your-0.100
 Blink-0.098
Token walk
Token ID2513
Feature activation-1.31e-01

Pos logit contributions
Mid+0.129
Scar+0.127
 Midnight+0.125
 Blink+0.124
 Piper+0.122
Neg logit contributions
apping-0.183
owing-0.175
unch-0.167
amb-0.163
aring-0.163
Token down
Token ID866
Feature activation+1.01e-01

Pos logit contributions
apping+0.202
apped+0.177
owing+0.176
amb+0.172
ement+0.171
Neg logit contributions
Mid-0.229
 Midnight-0.225
 Blink-0.222
Scar-0.217
 Piper-0.214
Token the
Token ID262
Feature activation+2.88e-01

Pos logit contributions
Mid+0.075
 Midnight+0.069
 Blink+0.065
 Piper+0.065
Scar+0.065
Neg logit contributions
apping-0.261
apped-0.243
unch-0.240
ement-0.237
owing-0.237
Token hallway
Token ID23959
Feature activation-1.34e-01

Pos logit contributions
Mid+0.316
 Piper+0.314
 Midnight+0.311
 Blink+0.310
 your+0.307
Neg logit contributions
apping-0.626
unch-0.580
amb-0.564
apped-0.563
owing-0.559
Token and
Token ID290
Feature activation+8.92e-03

Pos logit contributions
apping+0.263
unch+0.257
amb+0.248
apped+0.238
owing+0.235
Neg logit contributions
 her-0.171
 she-0.166
 your-0.165
 newly-0.159
 Piper-0.157
Token noticed
Token ID6810
Feature activation-5.29e-02

Pos logit contributions
Mid+0.011
Scar+0.010
 Midnight+0.010
 Blink+0.010
 newly+0.010
Neg logit contributions
apping-0.018
apped-0.017
owing-0.017
aring-0.016
amb-0.016
Token a
Token ID257
Feature activation+1.77e-01

Pos logit contributions
apping+0.128
unch+0.122
apped+0.118
amb+0.117
owing+0.116
Neg logit contributions
Mid-0.043
 your-0.042
Scar-0.042
 Piper-0.041
 Blink-0.041
Token door
Token ID3420
Feature activation-1.55e-01

Pos logit contributions
 your+0.131
Mid+0.130
 her+0.128
 Piper+0.128
 newly+0.127
Neg logit contributions
apping-0.442
unch-0.423
amb-0.413
apped-0.407
owing-0.404
Token.
Token ID13
Feature activation-6.00e-02

Pos logit contributions
apping+0.307
unch+0.299
amb+0.296
apped+0.278
aring+0.272
Neg logit contributions
 her-0.205
 she-0.203
 newly-0.191
 your-0.190
 has-0.185
Token She
Token ID1375
Feature activation+2.00e-01

Pos logit contributions
apping+0.134
unch+0.126
apped+0.122
amb+0.122
owing+0.120
Neg logit contributions
Mid-0.061
 your-0.060
 Piper-0.059
Scar-0.059
 Blink-0.059
Token opened
Token ID4721
Feature activation-6.09e-02

Pos logit contributions
 her+0.177
 your+0.172
Mid+0.171
 newly+0.170
 Piper+0.170
Neg logit contributions
apping-0.474
unch-0.455
amb-0.437
owing-0.429
apped-0.429
Token stepped
Token ID10764
Feature activation-1.27e-01

Pos logit contributions
 her+0.132
 had+0.122
 would+0.117
 she+0.112
 newly+0.112
Neg logit contributions
apping-0.710
unch-0.675
aring-0.667
owing-0.641
ement-0.636
Token in
Token ID287
Feature activation+8.95e-02

Pos logit contributions
apping+0.261
unch+0.252
aring+0.238
amb+0.236
owing+0.231
Neg logit contributions
 her-0.160
 your-0.152
 she-0.151
Mid-0.149
 Piper-0.149
Token,
Token ID11
Feature activation-3.91e-02

Pos logit contributions
 your+0.083
Mid+0.083
 her+0.082
 Piper+0.082
 Blink+0.081
Neg logit contributions
apping-0.208
unch-0.197
amb-0.190
apped-0.189
owing-0.187
Token Ann
Token ID5506
Feature activation+8.49e-02

Pos logit contributions
apping+0.088
apped+0.086
ement+0.082
owing+0.082
overed+0.081
Neg logit contributions
Mid-0.036
Scar-0.031
 Piper-0.028
 Midnight-0.028
Elsa-0.027
Tokenal
Token ID282
Feature activation-8.79e-02

Pos logit contributions
 your+0.076
Mid+0.075
 Piper+0.074
Scar+0.073
 Blink+0.073
Neg logit contributions
apping-0.200
unch-0.189
amb-0.183
apped-0.182
owing-0.182
Tokenise
Token ID786
Feature activation-1.71e-01

Pos logit contributions
apping+0.197
unch+0.185
apped+0.180
amb+0.178
owing+0.177
Neg logit contributions
 your-0.089
Mid-0.088
 Piper-0.087
 Midnight-0.086
Scar-0.086
Token started
Token ID2067
Feature activation-9.54e-02

Pos logit contributions
apping+0.353
apped+0.328
owing+0.320
unch+0.320
amb+0.312
Neg logit contributions
 your-0.201
Mid-0.200
 Midnight-0.196
Scar-0.195
 Blink-0.194
Token to
Token ID284
Feature activation+8.30e-02

Pos logit contributions
apping+0.197
unch+0.188
amb+0.182
apped+0.179
ement+0.177
Neg logit contributions
 her-0.118
 your-0.114
 she-0.110
 Piper-0.110
 newly-0.110
Token trem
Token ID10023
Feature activation-2.02e-01

Pos logit contributions
Mid+0.119
Scar+0.113
 Midnight+0.112
 Piper+0.106
Max+0.106
Neg logit contributions
owing-0.141
apping-0.139
unch-0.132
 sur-0.131
amb-0.123
Tokenble
Token ID903
Feature activation-1.19e-01

Pos logit contributions
unch+0.311
apping+0.299
aring+0.291
amb+0.288
apped+0.286
Neg logit contributions
 her-0.334
 your-0.312
 she-0.305
chy-0.299
 Piper-0.298
Token.
Token ID13
Feature activation-8.17e-02

Pos logit contributions
apping+0.201
unch+0.191
amb+0.185
apped+0.182
aring+0.175
Neg logit contributions
 her-0.187
 your-0.183
 she-0.179
Mid-0.177
 Piper-0.177
Token monster
Token ID9234
Feature activation-6.42e-02

Pos logit contributions
apping+0.137
unch+0.130
apped+0.125
amb+0.125
owing+0.124
Neg logit contributions
 your-0.063
 her-0.060
 Midnight-0.060
 Piper-0.060
 Blink-0.060
Token called
Token ID1444
Feature activation-2.34e-01

Pos logit contributions
apping+0.118
unch+0.115
ement+0.109
apped+0.106
amb+0.105
Neg logit contributions
 she-0.093
 her-0.093
 your-0.089
 Blink-0.089
 Midnight-0.088
Token Zip
Token ID38636
Feature activation+9.71e-02

Pos logit contributions
apping+0.495
unch+0.467
aring+0.448
owing+0.443
apped+0.443
Neg logit contributions
Scar-0.246
Mid-0.244
 newly-0.243
 Piper-0.221
 trash-0.221
Token.
Token ID13
Feature activation-5.72e-02

Pos logit contributions
 her+0.139
 your+0.131
 Blink+0.127
 the+0.125
 she+0.124
Neg logit contributions
apping-0.187
unch-0.184
apped-0.168
ement-0.166
amb-0.164
Token Zip
Token ID38636
Feature activation+9.74e-02

Pos logit contributions
apping+0.132
unch+0.126
owing+0.121
amb+0.120
apped+0.117
Neg logit contributions
 Blink-0.059
 your-0.057
 her-0.057
 has-0.055
 same-0.053
Token hated
Token ID16563
Feature activation-8.56e-02

Pos logit contributions
 Piper+0.123
Scar+0.119
Mid+0.117
 Midnight+0.115
 newly+0.112
Neg logit contributions
apping-0.191
amb-0.180
aring-0.176
apped-0.176
owing-0.174
Token everything
Token ID2279
Feature activation+1.09e-01

Pos logit contributions
apping+0.207
unch+0.196
amb+0.190
apped+0.190
owing+0.186
Neg logit contributions
 your-0.076
 her-0.074
 Blink-0.071
Mid-0.071
 Piper-0.071
Token and
Token ID290
Feature activation+1.13e-02

Pos logit contributions
Scar+0.187
Mid+0.183
 Piper+0.176
Elsa+0.172
 Blink+0.170
Neg logit contributions
apping-0.175
owing-0.164
ement-0.156
aring-0.154
unch-0.151
Token everyone
Token ID2506
Feature activation+7.67e-02

Pos logit contributions
Mid+0.014
Scar+0.014
 Piper+0.013
 newly+0.013
Elsa+0.013
Neg logit contributions
apping-0.022
unch-0.020
apped-0.020
owing-0.020
amb-0.020
Token around
Token ID1088
Feature activation+2.73e-02

Pos logit contributions
Mid+0.083
 your+0.082
 Piper+0.082
Scar+0.081
 Blink+0.081
Neg logit contributions
apping-0.166
unch-0.156
amb-0.151
apped-0.150
owing-0.148
Token him
Token ID683
Feature activation+1.71e-01

Pos logit contributions
Mid+0.031
 your+0.031
 Piper+0.030
Scar+0.030
 Blink+0.030
Neg logit contributions
apping-0.058
unch-0.054
apped-0.052
amb-0.052
owing-0.051
Token he
Token ID339
Feature activation+2.50e-01

Pos logit contributions
apping+0.042
unch+0.040
 sur+0.039
apped+0.039
ement+0.038
Neg logit contributions
Mid-0.017
Scar-0.015
 Piper-0.013
 trash-0.012
 Midnight-0.012
Token walked
Token ID6807
Feature activation-8.76e-02

Pos logit contributions
Mid+0.248
 Blink+0.245
 her+0.245
 your+0.244
 Piper+0.244
Neg logit contributions
apping-0.566
unch-0.532
amb-0.510
apped-0.507
aring-0.506
Token out
Token ID503
Feature activation+4.56e-02

Pos logit contributions
apping+0.202
unch+0.198
aring+0.184
owing+0.182
apped+0.182
Neg logit contributions
 her-0.089
 your-0.087
 Piper-0.082
Mid-0.081
 she-0.080
Token of
Token ID286
Feature activation+3.66e-02

Pos logit contributions
 her+0.074
 she+0.070
 the+0.069
 your+0.069
 their+0.068
Neg logit contributions
unch-0.080
apping-0.079
amb-0.077
aring-0.072
apped-0.071
Token the
Token ID262
Feature activation+2.72e-01

Pos logit contributions
 your+0.010
Mid+0.009
 her+0.009
 Piper+0.009
 Blink+0.008
Neg logit contributions
apping-0.110
unch-0.105
amb-0.103
apped-0.102
owing-0.101
Token cave
Token ID11527
Feature activation-2.14e-01

Pos logit contributions
 your+0.299
Mid+0.297
 Piper+0.293
 Blink+0.293
 her+0.293
Neg logit contributions
apping-0.597
unch-0.562
amb-0.544
apped-0.541
owing-0.532
Token,
Token ID11
Feature activation-3.64e-02

Pos logit contributions
apping+0.374
unch+0.358
amb+0.344
apped+0.330
aring+0.327
Neg logit contributions
 her-0.320
 your-0.316
 she-0.312
 Blink-0.308
 newly-0.307
Token much
Token ID881
Feature activation-1.51e-02

Pos logit contributions
apping+0.066
unch+0.062
apped+0.058
amb+0.058
owing+0.058
Neg logit contributions
Mid-0.052
Scar-0.051
 Piper-0.051
 Blink-0.051
 your-0.050
Token happier
Token ID23030
Feature activation-1.43e-01

Pos logit contributions
apping+0.028
unch+0.026
amb+0.025
apped+0.025
owing+0.025
Neg logit contributions
 your-0.021
Mid-0.021
 Piper-0.021
 her-0.021
 Blink-0.021
Token than
Token ID621
Feature activation-5.98e-02

Pos logit contributions
apping+0.263
unch+0.246
amb+0.235
apped+0.233
owing+0.230
Neg logit contributions
Mid-0.200
 your-0.199
 Piper-0.197
 her-0.196
Scar-0.196
Token when
Token ID618
Feature activation+1.29e-02

Pos logit contributions
apping+0.098
unch+0.091
apped+0.085
owing+0.085
amb+0.085
Neg logit contributions
Mid-0.097
 Piper-0.095
Scar-0.095
 your-0.094
 Blink-0.094
Token with
Token ID351
Feature activation-3.35e-02

Pos logit contributions
apping+0.305
unch+0.301
amb+0.294
ement+0.287
apped+0.282
Neg logit contributions
 her-0.245
 she-0.234
 your-0.230
 newly-0.219
 the-0.217
Token a
Token ID257
Feature activation+1.60e-01

Pos logit contributions
apping+0.096
unch+0.092
apped+0.090
ement+0.090
amb+0.088
Neg logit contributions
Mid-0.012
Scar-0.011
 Blink-0.011
 Piper-0.010
 your-0.010
Token smile
Token ID8212
Feature activation-2.00e-01

Pos logit contributions
Mid+0.158
 your+0.157
 Blink+0.156
 Piper+0.154
 her+0.153
Neg logit contributions
apping-0.358
unch-0.341
amb-0.322
apped-0.322
owing-0.320
Token on
Token ID319
Feature activation+7.03e-02

Pos logit contributions
apping+0.395
unch+0.377
amb+0.357
aring+0.354
apped+0.353
Neg logit contributions
 her-0.259
 your-0.249
Mid-0.246
 she-0.246
 Piper-0.245
Token her
Token ID607
Feature activation+2.38e-01

Pos logit contributions
Mid+0.011
 Blink+0.009
 Piper+0.009
Scar+0.008
 your+0.007
Neg logit contributions
apping-0.216
unch-0.208
amb-0.203
apped-0.202
ement-0.200
Token face
Token ID1986
Feature activation-2.32e-01

Pos logit contributions
 her+0.362
 newly+0.362
 your+0.355
Mid+0.344
Scar+0.341
Neg logit contributions
apping-0.423
unch-0.399
amb-0.385
apped-0.379
aring-0.373
Token.
Token ID13
Feature activation-7.82e-02

Pos logit contributions
apping+0.433
unch+0.420
amb+0.410
aring+0.394
apped+0.392
Neg logit contributions
 her-0.350
 she-0.342
 your-0.302
 the-0.301
 newly-0.295
Token<|endoftext|>
Token ID50256
Feature activation-1.28e-01

Pos logit contributions
apping+0.168
unch+0.158
amb+0.153
owing+0.152
apped+0.151
Neg logit contributions
 her-0.089
 your-0.086
 Piper-0.085
 Blink-0.085
Scar-0.084
TokenOnce
Token ID7454
Feature activation-7.76e-02

Pos logit contributions
apping+0.286
unch+0.270
amb+0.261
aring+0.258
apped+0.258
Neg logit contributions
 your-0.129
 her-0.128
Mid-0.128
 Piper-0.127
 Blink-0.127
Token upon
Token ID2402
Feature activation+1.79e-01

Pos logit contributions
apping+0.185
unch+0.170
apped+0.168
ement+0.165
owing+0.162
Neg logit contributions
Mid-0.078
 Midnight-0.070
 Blink-0.070
Scar-0.069
 Piper-0.068
Token a
Token ID257
Feature activation+1.65e-01

Pos logit contributions
 Piper+0.177
Scar+0.159
Mid+0.157
 Midnight+0.154
 Blink+0.150
Neg logit contributions
apping-0.401
unch-0.397
aps-0.363
apped-0.359
owing-0.350
Token that
Token ID326
Feature activation+9.77e-02

Pos logit contributions
apping+0.144
unch+0.133
apped+0.125
ement+0.124
owing+0.124
Neg logit contributions
Mid-0.137
 Piper-0.134
 Blink-0.134
Scar-0.133
 your-0.133
Token she
Token ID673
Feature activation+2.65e-01

Pos logit contributions
Mid+0.078
Scar+0.071
 your+0.067
 Blink+0.066
 our+0.066
Neg logit contributions
apping-0.224
unch-0.215
apped-0.214
ement-0.211
owing-0.206
Token got
Token ID1392
Feature activation-4.63e-02

Pos logit contributions
 Blink+0.252
Mid+0.248
 Piper+0.244
 your+0.235
Scar+0.233
Neg logit contributions
apping-0.593
unch-0.560
apped-0.551
amb-0.548
owing-0.536
Token in
Token ID287
Feature activation+1.06e-01

Pos logit contributions
apping+0.091
unch+0.085
apped+0.081
amb+0.081
owing+0.080
Neg logit contributions
Mid-0.060
 Blink-0.059
 Piper-0.059
Scar-0.058
 Midnight-0.058
Token the
Token ID262
Feature activation+2.89e-01

Pos logit contributions
 her+0.083
 your+0.081
Mid+0.077
 Piper+0.077
Scar+0.076
Neg logit contributions
apping-0.269
unch-0.255
amb-0.250
apped-0.247
aring-0.246
Token mail
Token ID6920
Feature activation-2.23e-01

Pos logit contributions
 Piper+0.352
 Blink+0.347
Mid+0.342
Scar+0.340
 Midnight+0.329
Neg logit contributions
apping-0.576
unch-0.534
apped-0.534
ement-0.526
aring-0.513
Token.
Token ID13
Feature activation-6.40e-02

Pos logit contributions
apping+0.379
unch+0.357
amb+0.338
apped+0.331
aring+0.330
Neg logit contributions
 her-0.349
 your-0.343
 Piper-0.339
 she-0.337
Mid-0.337
Token\n
Token ID198
Feature activation+2.01e-04

Pos logit contributions
apping+0.135
unch+0.126
owing+0.123
amb+0.122
aring+0.120
Neg logit contributions
 her-0.075
 Piper-0.074
 your-0.074
Mid-0.073
 Midnight-0.072
Token\n
Token ID198
Feature activation+6.78e-04

Pos logit contributions
 her+0.000
 your+0.000
Mid+0.000
 Blink+0.000
 Piper+0.000
Neg logit contributions
apping-0.000
owing-0.000
unch-0.000
amb-0.000
apped-0.000
TokenAfter
Token ID3260
Feature activation+2.15e-02

Pos logit contributions
 trash+0.001
 Blink+0.001
 your+0.001
 Piper+0.001
 same+0.001
Neg logit contributions
apping-0.002
unch-0.002
amb-0.001
aring-0.001
apped-0.001
Token that
Token ID326
Feature activation+1.00e-01

Pos logit contributions
Mid+0.019
 your+0.019
 Piper+0.018
 her+0.018
Scar+0.018
Neg logit contributions
apping-0.051
unch-0.048
amb-0.047
apped-0.047
owing-0.046
Token he
Token ID339
Feature activation+2.43e-01

Pos logit contributions
Mid+0.058
Scar+0.052
 Midnight+0.049
 trash+0.049
Elsa+0.047
Neg logit contributions
ement-0.067
apped-0.064
apping-0.064
unch-0.062
umbled-0.058
Token found
Token ID1043
Feature activation-2.99e-02

Pos logit contributions
Mid+0.277
 Midnight+0.265
Scar+0.253
 your+0.251
 Piper+0.243
Neg logit contributions
apping-0.523
apped-0.504
unch-0.468
owing-0.462
amb-0.458
Token a
Token ID257
Feature activation+1.54e-01

Pos logit contributions
apping+0.069
apped+0.069
unch+0.066
ement+0.065
aps+0.063
Neg logit contributions
Mid-0.028
Scar-0.027
 Midnight-0.024
Elsa-0.023
 Piper-0.023
Token tiny
Token ID7009
Feature activation-7.45e-02

Pos logit contributions
Mid+0.123
 your+0.118
Scar+0.118
 Blink+0.118
 Piper+0.118
Neg logit contributions
apping-0.373
unch-0.354
apped-0.339
amb-0.337
ement-0.335
Token,
Token ID11
Feature activation-4.38e-02

Pos logit contributions
apping+0.168
unch+0.160
amb+0.154
apped+0.153
owing+0.152
Neg logit contributions
 your-0.073
Mid-0.072
 Piper-0.072
 her-0.072
 Blink-0.070
Token spark
Token ID9009
Feature activation-2.17e-01

Pos logit contributions
apping+0.095
unch+0.089
ement+0.085
apped+0.084
amb+0.082
Neg logit contributions
Mid-0.043
 Blink-0.042
Scar-0.042
 Piper-0.040
 Midnight-0.040
Tokenly
Token ID306
Feature activation-1.47e-02

Pos logit contributions
apping+0.366
unch+0.343
amb+0.324
apped+0.320
owing+0.315
Neg logit contributions
 your-0.337
 her-0.336
Mid-0.334
 Piper-0.332
 Blink-0.329
Token button
Token ID4936
Feature activation-1.67e-01

Pos logit contributions
apping+0.032
unch+0.030
amb+0.029
apped+0.029
owing+0.028
Neg logit contributions
Mid-0.016
 your-0.016
 Piper-0.015
Scar-0.015
 Blink-0.015
Token.
Token ID13
Feature activation-8.66e-02

Pos logit contributions
apping+0.305
unch+0.289
amb+0.275
apped+0.269
owing+0.266
Neg logit contributions
 her-0.237
 your-0.237
Mid-0.233
 Piper-0.232
 Blink-0.231
Token He
Token ID679
Feature activation+1.68e-01

Pos logit contributions
apping+0.207
unch+0.197
amb+0.191
apped+0.190
owing+0.187
Neg logit contributions
 your-0.073
Mid-0.073
 Piper-0.072
 her-0.072
Scar-0.071
Token pressed
Token ID12070
Feature activation-1.69e-01

Pos logit contributions
 her+0.141
 your+0.140
Mid+0.140
 Piper+0.137
 newly+0.136
Neg logit contributions
apping-0.404
unch-0.387
amb-0.370
owing-0.369
aring-0.367
Token quickly
Token ID2952
Feature activation-6.30e-02

Pos logit contributions
Mid+0.212
 your+0.212
 Piper+0.209
 her+0.208
 Blink+0.207
Neg logit contributions
apping-0.384
unch-0.363
amb-0.349
apped-0.346
owing-0.342
Token rushed
Token ID13999
Feature activation-1.22e-01

Pos logit contributions
apping+0.136
unch+0.129
amb+0.124
apped+0.123
owing+0.122
Neg logit contributions
Mid-0.068
 your-0.068
 Piper-0.067
 her-0.066
 Blink-0.066
Token back
Token ID736
Feature activation-1.46e-03

Pos logit contributions
apping+0.259
unch+0.253
aring+0.233
amb+0.232
ement+0.232
Neg logit contributions
 her-0.154
 your-0.145
Mid-0.139
 she-0.139
 Piper-0.139
Token to
Token ID284
Feature activation+8.81e-02

Pos logit contributions
apping+0.003
unch+0.003
apped+0.003
amb+0.003
ement+0.003
Neg logit contributions
 her-0.002
 your-0.002
 Piper-0.002
Mid-0.002
 Blink-0.002
Token her
Token ID607
Feature activation+2.43e-01

Pos logit contributions
 her+0.031
 your+0.029
 Piper+0.024
Mid+0.022
 Midnight+0.022
Neg logit contributions
apping-0.264
unch-0.255
amb-0.250
apped-0.247
aring-0.246
Token seat
Token ID5852
Feature activation-2.45e-01

Pos logit contributions
 her+0.253
 your+0.248
 newly+0.247
 she+0.233
 Midnight+0.230
Neg logit contributions
apping-0.555
unch-0.547
amb-0.525
apped-0.511
aring-0.506
Token and
Token ID290
Feature activation-4.82e-03

Pos logit contributions
apping+0.478
unch+0.467
amb+0.440
aring+0.436
apped+0.424
Neg logit contributions
 her-0.329
 she-0.313
 your-0.310
 newly-0.299
Mid-0.295
Token the
Token ID262
Feature activation+2.78e-01

Pos logit contributions
apping+0.009
apped+0.009
unch+0.009
amb+0.009
owing+0.008
Neg logit contributions
Mid-0.006
Scar-0.006
 Blink-0.006
 newly-0.006
 Piper-0.006
Token teacher
Token ID4701
Feature activation-1.54e-01

Pos logit contributions
 her+0.334
 your+0.332
Mid+0.328
 newly+0.328
Scar+0.323
Neg logit contributions
apping-0.575
unch-0.553
amb-0.535
apped-0.519
owing-0.518
Token started
Token ID2067
Feature activation-9.10e-02

Pos logit contributions
apping+0.332
unch+0.317
amb+0.303
apped+0.298
aring+0.296
Neg logit contributions
 her-0.169
Mid-0.165
 your-0.164
 Piper-0.162
Scar-0.162
Token the
Token ID262
Feature activation+2.79e-01

Pos logit contributions
apping+0.205
unch+0.203
amb+0.199
ement+0.194
aring+0.193
Neg logit contributions
 her-0.115
 she-0.100
 your-0.099
 the-0.097
 their-0.093