TOP ACTIVATIONS
MAX = 0.357

As everyone sang ' Happy Birthday ', the little girl be
love our country . Happy Birthday , America !" <|endoftext|> S
and family shouted " Happy Birthday !" and gave him wonderful
him ! After singing Happy Birthday , Derek blew out the
favorite song is " Happy Birthday ". She likes to sing
, they sang " Happy Birthday " to Lily 's friend
still hear us sing Happy Birthday !" They finished mixing the
. It said " Happy Birthday ". S ue
Everyone started singing ' Happy Birthday ' to John , who
they all sang " Happy Birthday " to Lily 's little
. Everyone sang " Happy Birthday " to Jimmy . He
. Blue had an enormous appetite and lived in a swamp
Everyone sang the Happy Birthday song . Molly blew out
This message said " Happy Birthday ". Such a special message
as everyone sang " Happy Birthday ". As he was about
The sign said ' Happy Birthday Molly '. Mud ly was
Happy Birthday Tim my !
other and said ' Happy Birthday !' <|endoftext|> Once upon a
. Everyone sang the Happy Birthday song and blew out the
and started singing " Happy Birthday to Jack !"

MIN ACTIVATIONS
MIN = -0.288

She saw the mess and the tears and said , "
them . The moral of the story is that true friendship
them . The moral of the story is to always be
cook ." He spat out the cake . Ben
other . The moral of the story is that sharing with
accidentally dropped his gum on the ground . Sarah saw it
. The rock had destroyed the park and all their toys
girl named Sophie who had the biggest hope in her heart
surprise . They had wasted the toys . They had hurt
so mean and rude and the other animals were very happy
zo oke ep ers shot the tiger and took it away
made a big mistake and the boss was very angry .
disgusting . The dog and the cat did not want to
The puzzle pieces fall on the floor . Mom
operation was a success and the little boy was so happy
vous elf . George asked the elf why he was laughing
. He runs back to the fence and climbs over it
him down . He dropped the shell and felt the water
broken window . Everything in the house felt bright and shiny
long time until he reached the lake . He stopped and

INTERVAL 0.196 - 0.357
CONTAINS 3.484%

The wind whipped through her hair . She felt so alive
don 't need the new t eddy bear . I want
One day , a boy named Tim and his friend ,
there will be a big reward for you !" she said
ila cried and pulled her hand out . She was angry

INTERVAL 0.035 - 0.196
CONTAINS 38.579%

and the box in their room . They felt proud of
they got home , Lily wanted to play with her doll
we 're at the zoo !" Lily said . She let
to play in the garden too . Spot sees the spider
Tweet smiled and flew away , happy to have

INTERVAL -0.127 - 0.035
CONTAINS 40.159%

to play with the fish and touch the cor als .
in the morning . But don 't worry , he will
. He was very happy and ran around looking for things
. Jack saw how troubled Tom was and said
and making new friends . One sunny day ,

INTERVAL -0.288 - -0.127
CONTAINS 17.777%

ora was mesmer ized . She wanted a ride in the
his pillow , but all he saw was the dark night
hair and a dress . It can talk and sing .
the park , Jerry saw a big tree made of wood
a bit naughty . So he hopped up onto the bed
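
The four INTERVAL ranges above appear to be equal-width bins spanning MIN to MAX. A minimal Python sketch, assuming that binning (the function name `interval_edges` is hypothetical and not taken from the tool that produced this dump), recomputes the bin edges from the reported MIN and MAX and checks that the CONTAINS percentages add up to roughly 100%:

```python
def interval_edges(lo, hi, n_bins):
    """Return the n_bins + 1 edges of equal-width bins covering [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# Values copied from the MIN, MAX, INTERVAL and CONTAINS lines above.
MIN_ACT = -0.288
MAX_ACT = 0.357
reported_edges = [-0.288, -0.127, 0.035, 0.196, 0.357]
reported_percent = [17.777, 40.159, 38.579, 3.484]  # lowest bin first

# Assumption: four equal-width bins, as the reported ranges suggest.
edges = interval_edges(MIN_ACT, MAX_ACT, 4)

# Largest gap between a recomputed edge and the edge printed in the dump.
max_edge_gap = max(abs(e - r) for e, r in zip(edges, reported_edges))
total_percent = sum(reported_percent)

print("recomputed edges:", [f"{e:.4f}" for e in edges])
print("largest gap to reported edges:", f"{max_edge_gap:.4f}")
print("sum of CONTAINS percentages:", f"{total_percent:.3f}")
```

The comparison uses a tolerance rather than exact equality, because the dump prints edges rounded to three decimals.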
Token As
Token ID1081
Feature activation-1.57e-03

Pos logit contributions
apping+0.166
unch+0.166
owing+0.163
oot+0.161
amb+0.161
Neg logit contributions
 your-0.053
 newly-0.048
 same-0.045
 Piper-0.045
 our-0.045
Token everyone
Token ID2506
Feature activation-2.64e-02

Pos logit contributions
Mid+0.002
Scar+0.001
 same+0.001
chy+0.001
Ol+0.001
Neg logit contributions
apping-0.003
unch-0.003
oot-0.003
apped-0.003
ester-0.003
Token sang
Token ID25889
Feature activation+2.07e-01

Pos logit contributions
 your+0.024
 newly+0.023
 her+0.023
 same+0.021
 our+0.021
Neg logit contributions
unch-0.064
apping-0.063
owing-0.062
oot-0.062
amb-0.061
Token '
Token ID705
Feature activation-4.89e-02

Pos logit contributions
apping+0.372
owing+0.351
apped+0.348
unch+0.348
amb+0.347
Neg logit contributions
 newly-0.302
 your-0.302
 Piper-0.290
Scar-0.286
 same-0.285
TokenHappy
Token ID25082
Feature activation+1.73e-02

Pos logit contributions
 your+0.073
 newly+0.070
 her+0.069
 same+0.068
Scar+0.066
Neg logit contributions
apping-0.088
unch-0.087
oot-0.085
owing-0.083
ement-0.083
Token Birthday
Token ID33511
Feature activation+3.57e-01

Pos logit contributions
unch+0.018
oot+0.016
owing+0.016
apping+0.016
ement+0.015
Neg logit contributions
 your-0.041
 her-0.040
 our-0.040
 same-0.040
 my-0.039
Token',
Token ID3256
Feature activation+5.14e-02

Pos logit contributions
apping+0.532
amb+0.501
owing+0.495
unch+0.494
aring+0.478
Neg logit contributions
 newly-0.627
 Blink-0.597
 same-0.588
 your-0.587
Scar-0.586
Token the
Token ID262
Feature activation-2.75e-01

Pos logit contributions
apping+0.125
unch+0.122
amb+0.122
apped+0.120
owing+0.120
Neg logit contributions
 your-0.039
 newly-0.039
Scar-0.037
 same-0.036
Mid-0.035
Token little
Token ID1310
Feature activation+2.59e-02

Pos logit contributions
 your+0.284
 newly+0.259
 Piper+0.250
 same+0.249
 our+0.248
Neg logit contributions
apping-0.631
unch-0.628
owing-0.617
oot-0.609
amb-0.608
Token girl
Token ID2576
Feature activation-2.12e-02

Pos logit contributions
apping+0.058
owing+0.056
apped+0.056
ement+0.056
amb+0.055
Neg logit contributions
 your-0.026
 newly-0.025
Scar-0.024
 Piper-0.024
 our-0.023
Token be
Token ID307
Feature activation+7.02e-03

Pos logit contributions
 your+0.024
 her+0.023
 newly+0.022
 she+0.022
 same+0.022
Neg logit contributions
unch-0.048
apping-0.046
oot-0.046
owing-0.045
ement-0.045
Token love
Token ID1842
Feature activation+1.54e-01

Pos logit contributions
 your+0.146
 newly+0.136
 same+0.131
 Piper+0.131
 her+0.130
Neg logit contributions
apping-0.272
unch-0.271
owing-0.267
amb-0.263
oot-0.263
Token our
Token ID674
Feature activation-2.43e-01

Pos logit contributions
apping+0.437
unch+0.436
owing+0.432
oot+0.429
amb+0.425
Neg logit contributions
 your-0.097
 our-0.079
 her-0.073
 newly-0.072
 my-0.069
Token country
Token ID1499
Feature activation+1.74e-01

Pos logit contributions
 your+0.283
 our+0.273
 same+0.264
 newly+0.263
 my+0.241
Neg logit contributions
unch-0.580
oot-0.560
apping-0.560
owing-0.556
amb-0.549
Token.
Token ID13
Feature activation+9.16e-02

Pos logit contributions
unch+0.320
apping+0.311
owing+0.305
amb+0.304
oot+0.303
Neg logit contributions
 your-0.278
 our-0.260
 same-0.254
 her-0.254
 newly-0.252
Token Happy
Token ID14628
Feature activation-4.54e-02

Pos logit contributions
unch+0.195
owing+0.195
oot+0.194
apping+0.192
ement+0.186
Neg logit contributions
 your-0.117
 our-0.109
 newly-0.103
 my-0.102
 same-0.101
Token Birthday
Token ID33511
Feature activation+3.57e-01

Pos logit contributions
 our+0.065
 your+0.063
 has+0.058
 their+0.055
 same+0.054
Neg logit contributions
unch-0.101
oot-0.099
umbled-0.097
owing-0.096
ooting-0.095
Token,
Token ID11
Feature activation+3.08e-02

Pos logit contributions
unch+0.698
apping+0.695
owing+0.684
oot+0.676
amb+0.668
Neg logit contributions
 your-0.498
 newly-0.457
 our-0.453
 same-0.450
 her-0.447
Token America
Token ID2253
Feature activation+3.46e-02

Pos logit contributions
apping+0.059
unch+0.059
amb+0.056
owing+0.056
oot+0.055
Neg logit contributions
 your-0.040
 Piper-0.040
 newly-0.040
Scar-0.039
 same-0.039
Token!"
Token ID2474
Feature activation+9.65e-02

Pos logit contributions
apping+0.066
unch+0.065
owing+0.065
amb+0.064
oot+0.064
Neg logit contributions
 your-0.046
 newly-0.045
 Piper-0.045
Scar-0.043
 same-0.043
Token<|endoftext|>
Token ID50256
Feature activation+1.20e-01

Pos logit contributions
owing+0.235
oot+0.224
ement+0.223
unch+0.223
apping+0.223
Neg logit contributions
 your-0.103
 her-0.097
 their-0.091
 she-0.091
 our-0.090
TokenS
Token ID50
Feature activation+2.53e-02

Pos logit contributions
owing+0.304
oot+0.303
unch+0.301
apping+0.300
amb+0.297
Neg logit contributions
 your-0.098
 newly-0.085
 her-0.084
 has-0.083
 same-0.081
Token and
Token ID290
Feature activation+7.24e-03

Pos logit contributions
unch+0.381
apping+0.376
owing+0.366
oot+0.366
ement+0.364
Neg logit contributions
 your-0.135
 her-0.130
 same-0.122
 newly-0.122
 Piper-0.120
Token family
Token ID1641
Feature activation+1.80e-01

Pos logit contributions
apping+0.016
apped+0.016
unch+0.016
owing+0.015
oot+0.015
Neg logit contributions
 your-0.007
 newly-0.007
Scar-0.007
Mid-0.007
 same-0.007
Token shouted
Token ID17293
Feature activation+1.78e-01

Pos logit contributions
unch+0.474
apping+0.462
owing+0.458
oot+0.456
amb+0.454
Neg logit contributions
 her-0.132
 your-0.131
 would-0.119
 she-0.118
 has-0.117
Token "
Token ID366
Feature activation-5.79e-02

Pos logit contributions
unch+0.363
oot+0.346
owing+0.344
amb+0.344
apping+0.339
Neg logit contributions
 your-0.251
 her-0.235
 our-0.229
 newly-0.223
 their-0.222
TokenHappy
Token ID25082
Feature activation+1.87e-02

Pos logit contributions
 your+0.084
 newly+0.079
 same+0.079
 her+0.077
Scar+0.077
Neg logit contributions
apping-0.107
unch-0.107
owing-0.105
oot-0.104
ement-0.103
Token Birthday
Token ID33511
Feature activation+3.57e-01

Pos logit contributions
unch+0.025
ement+0.023
owing+0.023
oot+0.022
atted+0.021
Neg logit contributions
 our-0.041
 your-0.041
 my-0.039
 same-0.038
 has-0.038
Token!"
Token ID2474
Feature activation+1.09e-01

Pos logit contributions
apping+0.738
unch+0.722
owing+0.707
amb+0.703
oot+0.697
Neg logit contributions
 newly-0.430
 your-0.420
Scar-0.403
 same-0.395
 Piper-0.392
Token and
Token ID290
Feature activation+7.17e-03

Pos logit contributions
apping+0.250
unch+0.249
owing+0.245
amb+0.241
oot+0.241
Neg logit contributions
 your-0.113
 newly-0.104
 same-0.101
 Piper-0.099
 our-0.099
Token gave
Token ID2921
Feature activation+1.52e-01

Pos logit contributions
apping+0.016
unch+0.016
owing+0.015
oot+0.015
amb+0.015
Neg logit contributions
 your-0.008
 newly-0.007
 same-0.007
 Piper-0.007
 our-0.007
Token him
Token ID683
Feature activation-1.40e-01

Pos logit contributions
ement+0.290
apping+0.287
apped+0.284
unch+0.279
owing+0.278
Neg logit contributions
 newly-0.191
Scar-0.191
 your-0.188
Mid-0.182
 has-0.179
Token wonderful
Token ID7932
Feature activation+2.81e-02

Pos logit contributions
 Piper+0.111
Scar+0.106
 newly+0.104
 your+0.101
 has+0.095
Neg logit contributions
apping-0.325
apped-0.323
amb-0.322
owing-0.321
unch-0.318
Token him
Token ID683
Feature activation-1.36e-01

Pos logit contributions
 your+0.043
 newly+0.039
 same+0.038
 Piper+0.038
 our+0.037
Neg logit contributions
apping-0.101
unch-0.101
owing-0.099
oot-0.098
amb-0.097
Token!
Token ID0
Feature activation+6.56e-02

Pos logit contributions
 your+0.214
 her+0.201
 newly+0.199
 same+0.195
 our+0.195
Neg logit contributions
unch-0.241
apping-0.239
amb-0.234
ement-0.231
oot-0.230
Token After
Token ID2293
Feature activation-4.37e-02

Pos logit contributions
owing+0.143
apping+0.143
unch+0.142
oot+0.139
ement+0.138
Neg logit contributions
 your-0.074
 newly-0.068
 Piper-0.067
 her-0.067
 same-0.066
Token singing
Token ID13777
Feature activation+2.46e-01

Pos logit contributions
 your+0.036
 newly+0.032
 her+0.032
 our+0.030
 same+0.030
Neg logit contributions
apping-0.110
unch-0.109
owing-0.109
amb-0.108
oot-0.107
Token Happy
Token ID14628
Feature activation-4.28e-02

Pos logit contributions
apping+0.444
unch+0.432
owing+0.430
oot+0.425
amb+0.421
Neg logit contributions
 your-0.360
 newly-0.352
 same-0.339
Scar-0.337
 Piper-0.335
Token Birthday
Token ID33511
Feature activation+3.57e-01

Pos logit contributions
 your+0.070
 newly+0.067
 her+0.067
 our+0.066
 same+0.066
Neg logit contributions
unch-0.070
apping-0.069
ement-0.067
oot-0.067
owing-0.067
Token,
Token ID11
Feature activation+3.17e-02

Pos logit contributions
apping+0.628
unch+0.617
amb+0.606
oot+0.604
owing+0.603
Neg logit contributions
 your-0.537
 newly-0.531
 same-0.515
Scar-0.510
Mid-0.502
Token Derek
Token ID20893
Feature activation-5.69e-02

Pos logit contributions
apping+0.060
unch+0.060
amb+0.059
oot+0.058
apped+0.057
Neg logit contributions
 your-0.043
Mid-0.041
 same-0.041
 newly-0.040
 Blink-0.040
Token blew
Token ID17948
Feature activation+1.24e-01

Pos logit contributions
 your+0.073
 newly+0.070
 same+0.069
 our+0.067
 Piper+0.067
Neg logit contributions
apping-0.110
amb-0.108
unch-0.108
owing-0.107
atted-0.107
Token out
Token ID503
Feature activation-4.76e-02

Pos logit contributions
apping+0.267
unch+0.267
owing+0.262
oot+0.258
amb+0.257
Neg logit contributions
 your-0.144
 newly-0.133
 same-0.129
 her-0.128
 Piper-0.128
Token the
Token ID262
Feature activation-2.75e-01

Pos logit contributions
 your+0.032
Scar+0.030
 Piper+0.029
 newly+0.029
Mid+0.028
Neg logit contributions
apping-0.121
owing-0.117
ement-0.117
apped-0.117
unch-0.117
Token favorite
Token ID4004
Feature activation+9.80e-02

Pos logit contributions
 your+0.259
 newly+0.243
 same+0.242
 her+0.238
 our+0.228
Neg logit contributions
unch-0.516
apping-0.509
owing-0.500
oot-0.498
amb-0.497
Token song
Token ID3496
Feature activation+2.65e-01

Pos logit contributions
apping+0.216
unch+0.215
owing+0.212
oot+0.209
amb+0.208
Neg logit contributions
 your-0.115
 her-0.104
 newly-0.104
 same-0.104
 our-0.101
Token is
Token ID318
Feature activation-3.99e-02

Pos logit contributions
apping+0.577
unch+0.570
oot+0.563
owing+0.563
ement+0.555
Neg logit contributions
 your-0.279
 newly-0.272
Scar-0.262
 Piper-0.259
 same-0.252
Token "
Token ID366
Feature activation-6.00e-02

Pos logit contributions
 newly+0.042
 your+0.040
Scar+0.040
 Piper+0.040
 same+0.039
Neg logit contributions
apping-0.083
unch-0.082
oot-0.081
owing-0.080
ement-0.078
TokenHappy
Token ID25082
Feature activation+1.80e-02

Pos logit contributions
 your+0.097
 her+0.093
 same+0.092
 newly+0.091
Scar+0.089
Neg logit contributions
apping-0.099
unch-0.099
owing-0.097
ement-0.097
oot-0.095
Token Birthday
Token ID33511
Feature activation+3.56e-01

Pos logit contributions
owing+0.022
unch+0.022
anged+0.021
oot+0.020
atted+0.020
Neg logit contributions
 your-0.041
 our-0.039
 my-0.039
 her-0.039
 same-0.038
Token".
Token ID1911
Feature activation+5.99e-02

Pos logit contributions
apping+0.665
unch+0.657
owing+0.642
amb+0.637
oot+0.635
Neg logit contributions
 your-0.494
 newly-0.491
 same-0.466
 Piper-0.459
Scar-0.459
Token She
Token ID1375
Feature activation-1.87e-01

Pos logit contributions
apping+0.130
unch+0.129
owing+0.126
amb+0.125
oot+0.125
Neg logit contributions
 your-0.069
 newly-0.064
 same-0.062
 our-0.061
 Piper-0.061
Token likes
Token ID7832
Feature activation+5.96e-02

Pos logit contributions
 your+0.250
 newly+0.242
 Piper+0.234
 same+0.232
 our+0.229
Neg logit contributions
apping-0.357
unch-0.344
amb-0.344
apped-0.341
owing-0.338
Token to
Token ID284
Feature activation-9.46e-02

Pos logit contributions
apping+0.138
unch+0.138
owing+0.135
oot+0.134
amb+0.133
Neg logit contributions
 your-0.059
 newly-0.054
 same-0.052
 Piper-0.052
 our-0.051
Token sing
Token ID1702
Feature activation+2.07e-01

Pos logit contributions
 your+0.107
 newly+0.099
 same+0.096
 Piper+0.095
 her+0.095
Neg logit contributions
apping-0.206
unch-0.206
owing-0.202
oot-0.199
amb-0.199
Token,
Token ID11
Feature activation+2.88e-02

Pos logit contributions
Scar+0.068
Mid+0.067
Ol+0.063
 Blink+0.063
Max+0.062
Neg logit contributions
apping-0.102
owing-0.102
ement-0.100
atted-0.098
oot-0.097
Token they
Token ID484
Feature activation-2.35e-01

Pos logit contributions
apped+0.064
apping+0.061
ement+0.056
overed+0.055
atted+0.055
Neg logit contributions
Scar-0.028
Ol-0.024
Mid-0.024
chy-0.024
 your-0.023
Token sang
Token ID25889
Feature activation+2.05e-01

Pos logit contributions
 your+0.295
 newly+0.273
 same+0.265
 our+0.264
 Piper+0.262
Neg logit contributions
apping-0.480
unch-0.473
owing-0.468
apped-0.465
amb-0.464
Token "
Token ID366
Feature activation-5.61e-02

Pos logit contributions
unch+0.504
apping+0.503
owing+0.496
oot+0.490
amb+0.489
Neg logit contributions
 your-0.178
 newly-0.159
 same-0.154
 her-0.152
 our-0.151
TokenHappy
Token ID25082
Feature activation+1.88e-02

Pos logit contributions
 your+0.069
 her+0.066
 newly+0.066
 same+0.065
Mid+0.063
Neg logit contributions
apping-0.114
unch-0.113
owing-0.112
ement-0.111
oot-0.111
Token Birthday
Token ID33511
Feature activation+3.56e-01

Pos logit contributions
owing+0.020
unch+0.019
ement+0.019
oot+0.019
anged+0.017
Neg logit contributions
 your-0.044
 our-0.042
 my-0.042
 her-0.042
 same-0.040
Token"
Token ID1
Feature activation-5.75e-02

Pos logit contributions
apping+0.662
unch+0.651
owing+0.637
amb+0.632
oot+0.629
Neg logit contributions
 your-0.497
 newly-0.496
 same-0.471
Scar-0.468
 Piper-0.463
Token to
Token ID284
Feature activation-9.06e-02

Pos logit contributions
 your+0.083
 newly+0.078
 her+0.077
 same+0.077
 Piper+0.076
Neg logit contributions
unch-0.108
apping-0.107
owing-0.105
amb-0.102
oot-0.102
Token Lily
Token ID20037
Feature activation-1.29e-01

Pos logit contributions
Scar+0.064
 newly+0.055
Mid+0.055
 has+0.055
Ol+0.054
Neg logit contributions
oot-0.215
apping-0.209
unch-0.209
ement-0.206
apped-0.203
Token's
Token ID338
Feature activation-1.18e-01

Pos logit contributions
 your+0.202
 her+0.198
 Piper+0.184
 newly+0.182
 our+0.182
Neg logit contributions
unch-0.227
ement-0.227
apping-0.221
amb-0.221
oot-0.218
Token friend
Token ID1545
Feature activation+1.90e-01

Pos logit contributions
 her+0.120
 your+0.112
 same+0.108
 Lily+0.104
 big+0.102
Neg logit contributions
amb-0.302
unch-0.300
oot-0.296
owing-0.293
apping-0.291
Token still
Token ID991
Feature activation+2.59e-02

Pos logit contributions
apping+0.043
unch+0.042
owing+0.042
ement+0.042
atted+0.041
Neg logit contributions
 your-0.013
 her-0.011
 same-0.011
 our-0.011
 my-0.010
Token hear
Token ID3285
Feature activation+8.95e-02

Pos logit contributions
apping+0.064
unch+0.063
ement+0.062
oot+0.062
amb+0.062
Neg logit contributions
 your-0.024
 her-0.021
 same-0.020
 our-0.020
 newly-0.020
Token us
Token ID514
Feature activation-5.46e-02

Pos logit contributions
apping+0.215
unch+0.213
owing+0.210
oot+0.208
amb+0.207
Neg logit contributions
 your-0.076
 newly-0.073
 Piper-0.069
Scar-0.068
 same-0.068
Token sing
Token ID1702
Feature activation+2.08e-01

Pos logit contributions
 your+0.081
 our+0.074
 her+0.073
 same+0.072
 my+0.071
Neg logit contributions
unch-0.106
ement-0.105
apping-0.103
oot-0.103
amb-0.102
Token Happy
Token ID14628
Feature activation-4.48e-02

Pos logit contributions
apping+0.347
unch+0.334
owing+0.332
amb+0.328
oot+0.326
Neg logit contributions
 newly-0.328
 your-0.326
 Piper-0.316
Scar-0.312
 same-0.309
Token Birthday
Token ID33511
Feature activation+3.56e-01

Pos logit contributions
 your+0.078
 our+0.076
 her+0.074
 same+0.074
 my+0.073
Neg logit contributions
unch-0.076
ement-0.072
oot-0.071
atted-0.070
umbled-0.070
Token!"
Token ID2474
Feature activation+1.09e-01

Pos logit contributions
apping+0.716
unch+0.697
amb+0.696
oot+0.696
owing+0.693
Neg logit contributions
 newly-0.437
Scar-0.407
 your-0.399
 same-0.388
 Piper-0.380
Token They
Token ID1119
Feature activation-1.58e-01

Pos logit contributions
amb+0.199
apping+0.197
unch+0.192
apped+0.191
oot+0.186
Neg logit contributions
 your-0.156
Scar-0.155
 newly-0.153
 same-0.151
 our-0.148
Token finished
Token ID5201
Feature activation+1.46e-01

Pos logit contributions
 her+0.162
 your+0.159
 newly+0.151
 same+0.147
 Piper+0.142
Neg logit contributions
unch-0.381
apping-0.367
owing-0.359
ement-0.357
oot-0.356
Token mixing
Token ID17090
Feature activation+1.41e-01

Pos logit contributions
apping+0.357
unch+0.357
owing+0.351
oot+0.348
amb+0.345
Neg logit contributions
 your-0.135
 newly-0.121
 her-0.119
 our-0.117
 same-0.114
Token the
Token ID262
Feature activation-2.74e-01

Pos logit contributions
ement+0.323
owing+0.322
apping+0.322
unch+0.319
amb+0.318
Neg logit contributions
 your-0.120
Scar-0.119
 newly-0.116
 Piper-0.113
Mid-0.111
Token.
Token ID13
Feature activation+7.55e-02

Pos logit contributions
apping+0.056
unch+0.056
owing+0.055
oot+0.054
amb+0.054
Neg logit contributions
 your-0.049
 newly-0.046
 same-0.045
 Piper-0.045
 our-0.045
Token It
Token ID632
Feature activation-1.28e-01

Pos logit contributions
owing+0.195
apping+0.192
amb+0.190
oot+0.187
unch+0.186
Neg logit contributions
 your-0.060
 her-0.058
 she-0.054
 Lily-0.053
 Piper-0.050
Token said
Token ID531
Feature activation+1.02e-01

Pos logit contributions
 your+0.150
 newly+0.139
 same+0.136
 Piper+0.134
Scar+0.133
Neg logit contributions
apping-0.269
unch-0.267
owing-0.263
oot-0.262
amb-0.259
Token "
Token ID366
Feature activation-5.83e-02

Pos logit contributions
unch+0.201
apping+0.199
owing+0.197
amb+0.195
oot+0.192
Neg logit contributions
 your-0.143
 her-0.132
 newly-0.130
 our-0.128
 same-0.127
TokenHappy
Token ID25082
Feature activation+1.70e-02

Pos logit contributions
 your+0.067
 newly+0.063
 her+0.062
 same+0.061
Scar+0.060
Neg logit contributions
apping-0.127
unch-0.125
owing-0.124
ement-0.122
oot-0.121
Token Birthday
Token ID33511
Feature activation+3.55e-01

Pos logit contributions
owing+0.023
unch+0.023
anged+0.022
ement+0.020
ooting+0.020
Neg logit contributions
 your-0.036
 my-0.036
 our-0.035
 has-0.034
 her-0.033
Token".
Token ID1911
Feature activation+6.82e-02

Pos logit contributions
unch+0.710
apping+0.708
owing+0.692
amb+0.684
ement+0.679
Neg logit contributions
 your-0.492
 our-0.441
 her-0.439
 same-0.437
 my-0.434
Token\n
Token ID198
Feature activation-1.76e-02

Pos logit contributions
owing+0.174
apping+0.173
amb+0.171
unch+0.170
oot+0.168
Neg logit contributions
 her-0.053
 your-0.053
 she-0.049
 Lily-0.047
 the-0.045
Token\n
Token ID198
Feature activation-1.74e-02

Pos logit contributions
 your+0.019
 her+0.017
 newly+0.016
 Piper+0.016
 our+0.016
Neg logit contributions
owing-0.040
apping-0.040
amb-0.039
ement-0.039
unch-0.039
TokenS
Token ID50
Feature activation+2.56e-02

Pos logit contributions
 your+0.021
 newly+0.020
 same+0.019
 her+0.019
 Piper+0.019
Neg logit contributions
apping-0.037
unch-0.036
owing-0.036
oot-0.035
amb-0.035
Tokenue
Token ID518
Feature activation-6.64e-02

Pos logit contributions
apping+0.053
unch+0.053
owing+0.052
amb+0.051
oot+0.051
Neg logit contributions
 your-0.032
 newly-0.029
 same-0.029
 her-0.028
 our-0.028
Token Everyone
Token ID11075
Feature activation+7.15e-03

Pos logit contributions
owing+0.174
apping+0.172
unch+0.171
oot+0.169
ement+0.166
Neg logit contributions
 your-0.084
 newly-0.075
 her-0.074
 Piper-0.074
 our-0.072
Token started
Token ID2067
Feature activation+8.20e-02

Pos logit contributions
unch+0.017
apping+0.017
owing+0.017
oot+0.016
amb+0.016
Neg logit contributions
 your-0.007
 her-0.006
 newly-0.006
 our-0.006
 same-0.006
Token singing
Token ID13777
Feature activation+2.49e-01

Pos logit contributions
unch+0.156
apping+0.156
owing+0.153
oot+0.151
amb+0.151
Neg logit contributions
 your-0.116
 newly-0.108
 same-0.106
 her-0.106
 our-0.105
Token '
Token ID705
Feature activation-5.52e-02

Pos logit contributions
apping+0.444
unch+0.436
owing+0.431
oot+0.424
amb+0.421
Neg logit contributions
 your-0.380
 newly-0.363
 same-0.352
 Piper-0.352
Scar-0.349
TokenHappy
Token ID25082
Feature activation+1.27e-02

Pos logit contributions
 your+0.095
 newly+0.090
 her+0.090
 same+0.089
Max+0.088
Neg logit contributions
apping-0.085
unch-0.084
oot-0.083
ement-0.081
amb-0.081
Token Birthday
Token ID33511
Feature activation+3.55e-01

Pos logit contributions
unch+0.019
owing+0.018
oot+0.018
apping+0.017
ement+0.017
Neg logit contributions
 your-0.025
 our-0.023
 same-0.023
 her-0.023
 my-0.023
Token'
Token ID6
Feature activation-1.80e-03

Pos logit contributions
apping+0.525
unch+0.505
owing+0.498
amb+0.495
oot+0.485
Neg logit contributions
 newly-0.632
 your-0.626
 Piper-0.603
Scar-0.603
 same-0.602
Token to
Token ID284
Feature activation-9.43e-02

Pos logit contributions
 your+0.003
 newly+0.003
 same+0.003
 her+0.003
 our+0.003
Neg logit contributions
unch-0.003
apping-0.003
owing-0.003
oot-0.003
amb-0.003
Token John
Token ID1757
Feature activation-1.25e-01

Pos logit contributions
 newly+0.092
Scar+0.088
 Piper+0.086
 same+0.084
Mid+0.084
Neg logit contributions
apping-0.203
unch-0.200
amb-0.200
oot-0.199
owing-0.197
Token,
Token ID11
Feature activation+2.51e-02

Pos logit contributions
 your+0.214
 newly+0.210
 same+0.203
 Piper+0.203
Scar+0.202
Neg logit contributions
apping-0.196
unch-0.193
owing-0.192
amb-0.186
oot-0.183
Token who
Token ID508
Feature activation-2.92e-02

Pos logit contributions
apping+0.052
unch+0.050
amb+0.049
owing+0.049
aring+0.048
Neg logit contributions
 newly-0.028
 your-0.027
Scar-0.027
 Piper-0.027
Mid-0.027
Token they
Token ID484
Feature activation-2.35e-01

Pos logit contributions
apped+0.014
atted+0.013
apping+0.013
ocked+0.012
unch+0.012
Neg logit contributions
Scar-0.007
Ol-0.006
Beck-0.006
Mid-0.006
Elsa-0.006
Token all
Token ID477
Feature activation-9.76e-03

Pos logit contributions
 your+0.274
 newly+0.254
 same+0.246
 Piper+0.246
 our+0.241
Neg logit contributions
apping-0.502
unch-0.500
owing-0.490
amb-0.489
oot-0.485
Token sang
Token ID25889
Feature activation+2.02e-01

Pos logit contributions
 your+0.013
 newly+0.012
 Piper+0.012
 same+0.012
Scar+0.011
Neg logit contributions
apping-0.019
unch-0.019
owing-0.019
amb-0.019
oot-0.019
Token "
Token ID366
Feature activation-5.64e-02

Pos logit contributions
unch+0.440
apping+0.434
owing+0.431
oot+0.425
amb+0.423
Neg logit contributions
 your-0.238
 newly-0.218
 her-0.217
 same-0.214
 our-0.213
TokenHappy
Token ID25082
Feature activation+1.65e-02

Pos logit contributions
 your+0.078
 her+0.074
 newly+0.073
 same+0.073
Scar+0.070
Neg logit contributions
apping-0.108
unch-0.106
owing-0.106
oot-0.104
ement-0.103
Token Birthday
Token ID33511
Feature activation+3.54e-01

Pos logit contributions
owing+0.019
anged+0.018
unch+0.018
oot+0.017
ement+0.017
Neg logit contributions
 your-0.037
 our-0.036
 my-0.036
 her-0.035
 has-0.034
Token"
Token ID1
Feature activation-5.78e-02

Pos logit contributions
apping+0.675
unch+0.668
amb+0.647
owing+0.644
oot+0.637
Neg logit contributions
 newly-0.476
 your-0.471
 same-0.450
Scar-0.449
 Piper-0.443
Token to
Token ID284
Feature activation-9.08e-02

Pos logit contributions
 your+0.082
 her+0.077
 newly+0.076
 same+0.076
 Piper+0.075
Neg logit contributions
unch-0.110
apping-0.109
owing-0.107
ement-0.105
amb-0.105
Token Lily
Token ID20037
Feature activation-1.31e-01

Pos logit contributions
Scar+0.058
 same+0.053
Mid+0.052
 newly+0.052
 your+0.050
Neg logit contributions
unch-0.219
oot-0.218
apping-0.217
apped-0.212
ement-0.211
Token's
Token ID338
Feature activation-1.20e-01

Pos logit contributions
 her+0.188
 your+0.185
 she+0.171
 the+0.166
 our+0.164
Neg logit contributions
ement-0.259
unch-0.250
amb-0.243
apping-0.241
oot-0.240
Token little
Token ID1310
Feature activation+2.72e-02

Pos logit contributions
 her+0.104
 your+0.097
 same+0.091
 newly+0.090
 Lily+0.088
Neg logit contributions
amb-0.329
oot-0.327
unch-0.324
owing-0.318
apping-0.314
Token.
Token ID13
Feature activation+7.56e-02

Pos logit contributions
unch+0.293
apping+0.291
owing+0.285
oot+0.283
amb+0.280
Neg logit contributions
 your-0.308
 newly-0.293
 her-0.292
 same-0.289
 our-0.284
Token Everyone
Token ID11075
Feature activation+1.35e-02

Pos logit contributions
apping+0.145
unch+0.145
amb+0.138
apped+0.138
ement+0.136
Neg logit contributions
 your-0.102
 newly-0.098
 same-0.097
 our-0.095
 Piper-0.094
Token sang
Token ID25889
Feature activation+2.02e-01

Pos logit contributions
unch+0.032
apping+0.032
owing+0.032
oot+0.031
amb+0.031
Neg logit contributions
 your-0.013
 her-0.012
 newly-0.012
 our-0.011
 has-0.011
Token "
Token ID366
Feature activation-5.66e-02

Pos logit contributions
unch+0.435
apping+0.435
owing+0.427
oot+0.421
amb+0.420
Neg logit contributions
 your-0.235
 newly-0.217
 same-0.211
 her-0.210
 our-0.209
TokenHappy
Token ID25082
Feature activation+1.51e-02

Pos logit contributions
 your+0.070
 newly+0.066
 her+0.065
 same+0.064
 our+0.063
Neg logit contributions
apping-0.116
unch-0.114
owing-0.114
oot-0.113
ement-0.111
Token Birthday
Token ID33511
Feature activation+3.54e-01

Pos logit contributions
owing+0.015
unch+0.015
oot+0.015
anged+0.015
ement+0.014
Neg logit contributions
 your-0.035
 our-0.034
 my-0.034
 her-0.033
 has-0.033
Token"
Token ID1
Feature activation-6.02e-02

Pos logit contributions
apping+0.572
unch+0.563
amb+0.547
owing+0.546
oot+0.533
Neg logit contributions
 newly-0.578
 your-0.576
 Piper-0.554
 same-0.553
Scar-0.552
Token to
Token ID284
Feature activation-9.14e-02

Pos logit contributions
 your+0.091
 newly+0.087
 same+0.084
 our+0.084
 Piper+0.083
Neg logit contributions
apping-0.108
unch-0.107
owing-0.106
oot-0.104
amb-0.104
Token Jimmy
Token ID12963
Feature activation-1.09e-01

Pos logit contributions
Scar+0.096
Mid+0.095
 your+0.095
 same+0.094
 newly+0.093
Neg logit contributions
unch-0.187
ement-0.183
oot-0.180
apping-0.179
owing-0.179
Token.
Token ID13
Feature activation+8.62e-02

Pos logit contributions
 your+0.168
 her+0.157
 our+0.152
 newly+0.152
 Piper+0.149
Neg logit contributions
unch-0.197
apping-0.191
ement-0.191
oot-0.189
owing-0.186
Token He
Token ID679
Feature activation-1.77e-01

Pos logit contributions
apping+0.180
unch+0.179
owing+0.177
oot+0.175
amb+0.173
Neg logit contributions
 your-0.106
 newly-0.098
 her-0.095
 same-0.095
 Piper-0.095
Token.
Token ID13
Feature activation+7.54e-02

Pos logit contributions
 your+0.101
 her+0.097
 our+0.091
 their+0.089
 same+0.089
Neg logit contributions
ement-0.116
unch-0.115
apping-0.112
oot-0.107
owing-0.106
Token Blue
Token ID4518
Feature activation-6.59e-02

Pos logit contributions
unch+0.177
apping+0.174
amb+0.171
oot+0.171
owing+0.170
Neg logit contributions
 your-0.068
 newly-0.066
Scar-0.063
 Piper-0.062
 same-0.060
Token had
Token ID550
Feature activation-6.78e-02

Pos logit contributions
 your+0.069
 newly+0.065
 Piper+0.063
Scar+0.062
 same+0.061
Neg logit contributions
apping-0.147
unch-0.146
owing-0.145
amb-0.144
oot-0.143
Token an
Token ID281
Feature activation-1.42e-01

Pos logit contributions
 Piper+0.053
Scar+0.052
 newly+0.051
 your+0.047
Mid+0.045
Neg logit contributions
unch-0.157
apping-0.157
oot-0.156
amb-0.153
apped-0.153
Token enormous
Token ID9812
Feature activation+7.55e-02

Pos logit contributions
 your+0.152
 same+0.137
 her+0.135
 my+0.131
 newly+0.125
Neg logit contributions
apping-0.337
unch-0.332
owing-0.325
aring-0.318
apped-0.316
Token appetite
Token ID20788
Feature activation+3.54e-01

Pos logit contributions
unch+0.168
apping+0.168
owing+0.164
oot+0.161
amb+0.161
Neg logit contributions
 your-0.089
 same-0.082
 newly-0.078
 her-0.078
 my-0.076
Token and
Token ID290
Feature activation+6.35e-03

Pos logit contributions
unch+0.697
ement+0.685
apping+0.680
amb+0.677
owing+0.667
Neg logit contributions
 your-0.497
 her-0.455
 same-0.443
 she-0.440
 our-0.438
Token lived
Token ID5615
Feature activation+2.65e-02

Pos logit contributions
oot+0.011
apped+0.011
atted+0.011
unch+0.011
apping+0.011
Neg logit contributions
Scar-0.008
 newly-0.008
Elsa-0.008
 Piper-0.007
 same-0.007
Token in
Token ID287
Feature activation-1.11e-01

Pos logit contributions
unch+0.060
ement+0.060
owing+0.058
amb+0.058
oot+0.058
Neg logit contributions
 your-0.031
 her-0.030
 newly-0.028
 same-0.027
 the-0.026
Token a
Token ID257
Feature activation-1.70e-01

Pos logit contributions
Scar+0.094
 Piper+0.087
 newly+0.079
Mid+0.078
 would+0.077
Neg logit contributions
unch-0.241
owing-0.235
apping-0.235
oot-0.233
ement-0.230
Token swamp
Token ID26837
Feature activation+1.75e-01

Pos logit contributions
 Piper+0.119
 newly+0.119
Scar+0.118
 your+0.115
 her+0.106
Neg logit contributions
apping-0.408
unch-0.401
aring-0.384
owing-0.382
ement-0.381
Token\n
Token ID198
Feature activation-2.02e-02

Pos logit contributions
 your+0.029
 newly+0.027
 same+0.027
 our+0.027
 Piper+0.026
Neg logit contributions
unch-0.042
apping-0.042
oot-0.040
amb-0.040
owing-0.040
TokenEveryone
Token ID16190
Feature activation-9.31e-03

Pos logit contributions
 your+0.024
 same+0.023
 newly+0.022
 has+0.022
 our+0.022
Neg logit contributions
apping-0.043
unch-0.043
amb-0.041
aring-0.040
oot-0.040
Token sang
Token ID25889
Feature activation+2.09e-01

Pos logit contributions
 your+0.010
 her+0.009
 newly+0.009
 our+0.008
 same+0.008
Neg logit contributions
unch-0.022
apping-0.021
owing-0.021
oot-0.021
amb-0.021
Token the
Token ID262
Feature activation-2.74e-01

Pos logit contributions
apping+0.406
owing+0.384
unch+0.382
apped+0.382
oot+0.381
Neg logit contributions
 your-0.284
 newly-0.275
 Piper-0.266
 same-0.261
Scar-0.260
Token Happy
Token ID14628
Feature activation-4.55e-02

Pos logit contributions
 your+0.322
 Piper+0.300
 newly+0.287
Scar+0.287
 her+0.283
Neg logit contributions
apping-0.568
ement-0.554
unch-0.554
owing-0.553
amb-0.544
Token Birthday
Token ID33511
Feature activation+3.54e-01

Pos logit contributions
 your+0.074
 newly+0.072
 same+0.071
 our+0.070
 her+0.069
Neg logit contributions
unch-0.079
owing-0.075
apping-0.075
oot-0.074
amb-0.072
Token song
Token ID3496
Feature activation+2.60e-01

Pos logit contributions
apping+0.644
owing+0.614
apped+0.602
amb+0.602
unch+0.598
Neg logit contributions
 newly-0.512
 your-0.499
Scar-0.494
 Piper-0.484
Mid-0.479
Token.
Token ID13
Feature activation+7.60e-02

Pos logit contributions
unch+0.470
apping+0.457
owing+0.450
oot+0.442
amb+0.442
Neg logit contributions
 your-0.404
 her-0.382
 newly-0.376
 same-0.373
 our-0.369
Token Molly
Token ID30236
Feature activation-1.29e-01

Pos logit contributions
apping+0.139
unch+0.132
amb+0.131
apped+0.131
ement+0.129
Neg logit contributions
 your-0.112
 newly-0.107
 same-0.104
 our-0.103
 Piper-0.102
Token blew
Token ID17948
Feature activation+1.23e-01

Pos logit contributions
 your+0.142
 newly+0.132
 same+0.127
 Piper+0.126
 our+0.125
Neg logit contributions
apping-0.284
unch-0.283
owing-0.279
amb-0.275
oot-0.275
Token out
Token ID503
Feature activation-5.10e-02

Pos logit contributions
apping+0.265
ement+0.257
owing+0.256
unch+0.256
amb+0.254
Neg logit contributions
 your-0.135
 newly-0.129
 Piper-0.123
 same-0.122
Scar-0.122
Token This
Token ID770
Feature activation-1.25e-01

Pos logit contributions
owing+0.168
unch+0.166
apping+0.165
oot+0.163
amb+0.162
Neg logit contributions
 your-0.078
 Piper-0.069
 newly-0.069
 her-0.068
 same-0.067
Token message
Token ID3275
Feature activation+2.45e-01

Pos logit contributions
 your+0.148
 newly+0.141
 Piper+0.136
Scar+0.133
 her+0.132
Neg logit contributions
apping-0.260
unch-0.257
ement-0.252
owing-0.251
oot-0.248
Token said
Token ID531
Feature activation+9.99e-02

Pos logit contributions
apping+0.579
unch+0.576
owing+0.568
oot+0.562
amb+0.558
Neg logit contributions
 your-0.229
 newly-0.208
 same-0.200
 Piper-0.198
 our-0.197
Token "
Token ID366
Feature activation-5.89e-02

Pos logit contributions
apping+0.194
unch+0.193
owing+0.189
oot+0.186
amb+0.186
Neg logit contributions
 your-0.138
 newly-0.130
 same-0.126
 Piper-0.125
 her-0.125
TokenHappy
Token ID25082
Feature activation+1.52e-02

Pos logit contributions
 your+0.073
 newly+0.069
 same+0.067
 her+0.066
 Piper+0.066
Neg logit contributions
apping-0.122
unch-0.122
owing-0.119
oot-0.117
ement-0.117
Token Birthday
Token ID33511
Feature activation+3.53e-01

Pos logit contributions
owing+0.020
unch+0.020
anged+0.019
oot+0.018
ement+0.018
Neg logit contributions
 your-0.033
 our-0.032
 my-0.032
 has-0.031
 same-0.030
Token".
Token ID1911
Feature activation+6.70e-02

Pos logit contributions
apping+0.657
unch+0.657
owing+0.643
oot+0.633
amb+0.631
Neg logit contributions
 your-0.514
 newly-0.482
 same-0.472
 Piper-0.467
 our-0.467
Token Such
Token ID8013
Feature activation-6.91e-02

Pos logit contributions
owing+0.159
unch+0.159
apping+0.157
oot+0.155
amb+0.153
Neg logit contributions
 your-0.066
 her-0.060
 newly-0.058
 Piper-0.058
 our-0.056
Token a
Token ID257
Feature activation-1.68e-01

Pos logit contributions
 newly+0.050
 your+0.049
Scar+0.048
 Piper+0.047
Mid+0.046
Neg logit contributions
apping-0.166
apped-0.161
ement-0.161
unch-0.160
oot-0.159
Token special
Token ID2041
Feature activation+7.45e-02

Pos logit contributions
 your+0.132
 same+0.120
 newly+0.119
 our+0.115
 great+0.109
Neg logit contributions
owing-0.449
unch-0.442
oot-0.441
amb-0.438
apping-0.432
Token message
Token ID3275
Feature activation+2.45e-01

Pos logit contributions
unch+0.154
owing+0.151
apping+0.150
oot+0.146
amb+0.146
Neg logit contributions
 your-0.102
 same-0.095
 newly-0.092
 our-0.091
 her-0.091
Token as
Token ID355
Feature activation+6.54e-03

Pos logit contributions
unch+0.395
apping+0.372
amb+0.365
oot+0.364
aring+0.359
Neg logit contributions
 her-0.350
 your-0.348
 their-0.330
 newly-0.328
 she-0.324
Token everyone
Token ID2506
Feature activation-2.91e-02

Pos logit contributions
apping+0.016
oot+0.016
amb+0.015
unch+0.015
owing+0.015
Neg logit contributions
Mid-0.005
Scar-0.005
 same-0.004
 newly-0.004
 Piper-0.004
Token sang
Token ID25889
Feature activation+2.06e-01

Pos logit contributions
 your+0.030
 her+0.028
 newly+0.028
 has+0.026
 our+0.026
Neg logit contributions
unch-0.067
apping-0.066
oot-0.065
owing-0.065
amb-0.065
Token "
Token ID366
Feature activation-5.83e-02

Pos logit contributions
unch+0.374
owing+0.352
apping+0.351
amb+0.349
oot+0.348
Neg logit contributions
 your-0.336
 her-0.325
 our-0.310
 newly-0.307
 their-0.306
TokenHappy
Token ID25082
Feature activation+1.47e-02

Pos logit contributions
 your+0.095
 her+0.091
 same+0.090
 newly+0.089
 my+0.087
Neg logit contributions
apping-0.097
unch-0.096
owing-0.092
ement-0.091
oot-0.091
Token Birthday
Token ID33511
Feature activation+3.53e-01

Pos logit contributions
owing+0.017
unch+0.017
anged+0.017
oot+0.016
ement+0.015
Neg logit contributions
 your-0.034
 our-0.032
 my-0.032
 her-0.031
 has-0.030
Token".
Token ID1911
Feature activation+6.78e-02

Pos logit contributions
apping+0.666
unch+0.663
owing+0.650
oot+0.641
amb+0.640
Neg logit contributions
 your-0.500
 newly-0.475
 same-0.462
 Piper-0.457
 her-0.456
Token As
Token ID1081
Feature activation-1.29e-03

Pos logit contributions
unch+0.150
owing+0.150
oot+0.148
apping+0.146
aring+0.141
Neg logit contributions
 your-0.078
 her-0.072
 Piper-0.071
 our-0.070
 Blink-0.070
Token he
Token ID339
Feature activation-2.49e-01

Pos logit contributions
Mid+0.001
Scar+0.001
 newly+0.001
 same+0.001
 your+0.001
Neg logit contributions
apping-0.003
unch-0.003
ement-0.003
oot-0.003
amb-0.003
Token was
Token ID373
Feature activation-2.56e-02

Pos logit contributions
 her+0.214
 newly+0.206
 your+0.206
 has+0.199
 would+0.192
Neg logit contributions
apping-0.616
unch-0.609
owing-0.600
oot-0.597
aring-0.597
Token about
Token ID546
Feature activation+4.62e-03

Pos logit contributions
 your+0.038
 her+0.038
 their+0.034
 the+0.034
 our+0.034
Neg logit contributions
amb-0.051
unch-0.050
owing-0.048
oot-0.048
apping-0.048
Token The
Token ID383
Feature activation-1.64e-01

Pos logit contributions
apping+0.162
unch+0.162
owing+0.157
oot+0.156
amb+0.156
Neg logit contributions
 your-0.082
 newly-0.077
 same-0.075
 our-0.074
 Piper-0.072
Token sign
Token ID1051
Feature activation+2.12e-01

Pos logit contributions
 your+0.203
 same+0.193
 newly+0.192
 her+0.182
 our+0.182
Neg logit contributions
unch-0.349
amb-0.344
apping-0.344
owing-0.342
oot-0.340
Token said
Token ID531
Feature activation+1.02e-01

Pos logit contributions
unch+0.486
owing+0.478
amb+0.474
apping+0.472
oot+0.454
Neg logit contributions
 your-0.227
 her-0.226
 she-0.221
 newly-0.219
 has-0.216
Token '
Token ID705
Feature activation-5.33e-02

Pos logit contributions
apping+0.189
unch+0.185
owing+0.182
oot+0.181
ement+0.178
Neg logit contributions
 your-0.149
 newly-0.144
 same-0.141
 Piper-0.139
Scar-0.138
TokenHappy
Token ID25082
Feature activation+1.44e-02

Pos logit contributions
 your+0.065
 newly+0.061
 her+0.059
 same+0.059
Scar+0.058
Neg logit contributions
apping-0.112
unch-0.111
owing-0.108
oot-0.108
amb-0.107
Token Birthday
Token ID33511
Feature activation+3.53e-01

Pos logit contributions
unch+0.019
owing+0.019
oot+0.018
apping+0.017
anged+0.017
Neg logit contributions
 your-0.030
 our-0.029
 my-0.028
 same-0.028
 her-0.028
Token Molly
Token ID30236
Feature activation-1.30e-01

Pos logit contributions
apping+0.530
unch+0.510
owing+0.507
amb+0.497
oot+0.495
Neg logit contributions
 newly-0.620
 your-0.615
 same-0.591
Scar-0.585
 Piper-0.582
Token'.
Token ID4458
Feature activation+3.95e-02

Pos logit contributions
 your+0.224
 newly+0.222
 same+0.215
 Piper+0.211
Scar+0.210
Neg logit contributions
apping-0.200
unch-0.192
owing-0.191
amb-0.186
oot-0.185
Token Mud
Token ID32878
Feature activation+4.11e-02

Pos logit contributions
apping+0.089
unch+0.088
amb+0.085
owing+0.085
oot+0.085
Neg logit contributions
 your-0.042
 newly-0.039
 same-0.038
 our-0.037
Scar-0.036
Tokenly
Token ID306
Feature activation+1.60e-02

Pos logit contributions
owing+0.086
apping+0.085
ement+0.083
apped+0.083
unch+0.083
Neg logit contributions
 your-0.049
Scar-0.045
 newly-0.044
 Piper-0.044
 our-0.043
Token was
Token ID373
Feature activation-2.91e-02

Pos logit contributions
apping+0.037
unch+0.037
owing+0.037
amb+0.036
oot+0.036
Neg logit contributions
 your-0.015
 newly-0.014
 her-0.014
 same-0.014
 Piper-0.013
Token
Token ID95
Feature activation+1.96e-02

Pos logit contributions
unch+0.089
apping+0.087
owing+0.086
apped+0.082
amb+0.081
Neg logit contributions
 your-0.093
 newly-0.089
Scar-0.087
 our-0.086
 same-0.085
Token
Token ID26391
Feature activation+1.25e-02

Pos logit contributions
unch+0.035
apping+0.035
owing+0.034
oot+0.034
amb+0.034
Neg logit contributions
 your-0.030
 newly-0.028
 same-0.028
 our-0.027
Scar-0.027
Token
Token ID129
Feature activation+1.85e-02

Pos logit contributions
apping+0.025
unch+0.025
owing+0.025
oot+0.024
amb+0.024
Neg logit contributions
 your-0.017
 newly-0.015
 same-0.015
 Piper-0.015
 her-0.015
Token
Token ID241
Feature activation+1.58e-02

Pos logit contributions
apping+0.036
unch+0.036
owing+0.035
apped+0.034
ement+0.033
Neg logit contributions
 your-0.027
 same-0.025
 our-0.025
 newly-0.025
 her-0.024
TokenHappy
Token ID25082
Feature activation+1.57e-02

Pos logit contributions
apping+0.033
unch+0.033
owing+0.032
oot+0.031
amb+0.031
Neg logit contributions
 your-0.020
 newly-0.018
 same-0.018
 Piper-0.018
 our-0.018
Token Birthday
Token ID33511
Feature activation+3.53e-01

Pos logit contributions
owing+0.028
unch+0.027
anged+0.026
oot+0.026
atted+0.025
Neg logit contributions
 your-0.027
 our-0.026
 my-0.026
 has-0.025
 her-0.024
Token Tim
Token ID5045
Feature activation-1.06e-01

Pos logit contributions
apping+0.620
unch+0.617
oot+0.611
aring+0.604
amb+0.601
Neg logit contributions
Scar-0.523
 newly-0.512
 Piper-0.503
Mid-0.492
 Blink-0.486
Tokenmy
Token ID1820
Feature activation-3.09e-02

Pos logit contributions
 newly+0.131
 your+0.130
Scar+0.128
 Piper+0.127
Mid+0.123
Neg logit contributions
owing-0.214
apping-0.208
unch-0.207
oot-0.207
amb-0.203
Token!
Token ID0
Feature activation+6.54e-02

Pos logit contributions
 your+0.054
 newly+0.054
 Piper+0.053
 same+0.052
Scar+0.052
Neg logit contributions
apping-0.046
unch-0.045
owing-0.045
oot-0.044
amb-0.044
Tokenâ
Token ID22940
Feature activation-1.08e-02

Pos logit contributions
apping+0.135
unch+0.135
owing+0.133
oot+0.131
amb+0.129
Neg logit contributions
 your-0.082
 newly-0.076
 our-0.073
 Piper-0.073
 same-0.073
Token
Token ID26391
Feature activation+1.37e-02

Pos logit contributions
 your+0.017
 newly+0.016
 same+0.016
 our+0.016
 Piper+0.016
Neg logit contributions
apping-0.019
unch-0.019
owing-0.018
oot-0.018
amb-0.018
Token other
Token ID584
Feature activation+1.75e-02

Pos logit contributions
 your+0.052
 newly+0.048
 Piper+0.046
Scar+0.046
 same+0.045
Neg logit contributions
apping-0.160
unch-0.160
owing-0.157
amb-0.156
ement-0.154
Token and
Token ID290
Feature activation+2.07e-03

Pos logit contributions
unch+0.029
apping+0.029
oot+0.028
owing+0.028
amb+0.027
Neg logit contributions
 your-0.030
 newly-0.028
 her-0.028
 our-0.027
 same-0.027
Token said
Token ID531
Feature activation+1.02e-01

Pos logit contributions
apping+0.004
unch+0.004
owing+0.004
oot+0.004
amb+0.004
Neg logit contributions
 your-0.003
 newly-0.003
 same-0.003
Scar-0.003
 Piper-0.003
Token '
Token ID705
Feature activation-5.12e-02

Pos logit contributions
unch+0.204
apping+0.202
owing+0.201
oot+0.197
amb+0.197
Neg logit contributions
 your-0.139
 her-0.127
 newly-0.127
 our-0.125
 same-0.124
TokenHappy
Token ID25082
Feature activation+1.60e-02

Pos logit contributions
 your+0.064
 newly+0.061
 her+0.059
 same+0.058
Scar+0.057
Neg logit contributions
apping-0.106
unch-0.104
oot-0.102
owing-0.102
ement-0.100
Token Birthday
Token ID33511
Feature activation+3.53e-01

Pos logit contributions
unch+0.022
owing+0.022
oot+0.021
apping+0.020
anged+0.020
Neg logit contributions
 your-0.033
 our-0.032
 my-0.031
 same-0.031
 her-0.031
Token!'
Token ID13679
Feature activation+7.46e-02

Pos logit contributions
apping+0.536
unch+0.519
owing+0.511
amb+0.504
oot+0.493
Neg logit contributions
 your-0.616
 newly-0.609
Scar-0.590
 same-0.588
 Piper-0.585
Token<|endoftext|>
Token ID50256
Feature activation+1.17e-01

Pos logit contributions
unch+0.157
apping+0.156
owing+0.154
oot+0.152
amb+0.150
Neg logit contributions
 your-0.092
 newly-0.084
 her-0.083
 same-0.082
 Piper-0.082
TokenOnce
Token ID7454
Feature activation+4.61e-02

Pos logit contributions
apping+0.267
unch+0.267
owing+0.264
oot+0.261
amb+0.259
Neg logit contributions
 your-0.120
 newly-0.110
 same-0.106
 her-0.106
 Piper-0.105
Token upon
Token ID2402
Feature activation-2.13e-01

Pos logit contributions
apping+0.113
unch+0.106
apped+0.106
ement+0.105
owing+0.104
Neg logit contributions
Mid-0.041
Scar-0.041
 Piper-0.038
 your-0.038
 Blink-0.038
Token a
Token ID257
Feature activation-1.68e-01

Pos logit contributions
 Piper+0.230
Scar+0.216
 your+0.204
 newly+0.198
 has+0.190
Neg logit contributions
unch-0.461
apping-0.454
apped-0.429
aps-0.415
owing-0.411
Token.
Token ID13
Feature activation+7.50e-02

Pos logit contributions
apping+0.241
unch+0.239
owing+0.235
oot+0.229
amb+0.227
Neg logit contributions
 your-0.325
 newly-0.313
 same-0.307
 Piper-0.306
Scar-0.305
Token Everyone
Token ID11075
Feature activation+1.15e-02

Pos logit contributions
apping+0.159
unch+0.159
owing+0.158
oot+0.155
amb+0.153
Neg logit contributions
 your-0.090
 newly-0.083
 same-0.081
 her-0.080
 our-0.080
Token sang
Token ID25889
Feature activation+2.03e-01

Pos logit contributions
unch+0.025
apping+0.025
owing+0.025
oot+0.025
amb+0.024
Neg logit contributions
 your-0.013
 newly-0.012
 her-0.012
 our-0.012
 same-0.011
Token the
Token ID262
Feature activation-2.77e-01

Pos logit contributions
apping+0.393
unch+0.388
owing+0.382
oot+0.377
amb+0.376
Neg logit contributions
 your-0.278
 newly-0.263
 same-0.256
 Piper-0.255
Scar-0.251
Token Happy
Token ID14628
Feature activation-4.50e-02

Pos logit contributions
 your+0.292
 Piper+0.271
 newly+0.260
Scar+0.256
 her+0.255
Neg logit contributions
apping-0.618
unch-0.614
owing-0.602
ement-0.601
amb-0.589
Token Birthday
Token ID33511
Feature activation+3.53e-01

Pos logit contributions
 your+0.072
 newly+0.068
 same+0.068
 our+0.067
 her+0.066
Neg logit contributions
unch-0.080
owing-0.076
apping-0.076
oot-0.076
amb-0.073
Token song
Token ID3496
Feature activation+2.57e-01

Pos logit contributions
apping+0.637
owing+0.621
unch+0.620
amb+0.612
oot+0.605
Neg logit contributions
 newly-0.506
 your-0.499
Scar-0.491
 Piper-0.489
Mid-0.485
Token and
Token ID290
Feature activation+4.08e-03

Pos logit contributions
unch+0.460
apping+0.454
owing+0.445
oot+0.437
amb+0.436
Neg logit contributions
 your-0.397
 newly-0.370
 her-0.369
 same-0.364
 our-0.363
Token blew
Token ID17948
Feature activation+1.21e-01

Pos logit contributions
apping+0.008
unch+0.008
owing+0.008
amb+0.008
oot+0.008
Neg logit contributions
 your-0.005
 newly-0.005
Scar-0.005
 Piper-0.005
Mid-0.005
Token out
Token ID503
Feature activation-5.01e-02

Pos logit contributions
unch+0.260
apping+0.260
owing+0.255
oot+0.252
amb+0.250
Neg logit contributions
 your-0.143
 newly-0.131
 same-0.128
 her-0.127
 our-0.127
Token the
Token ID262
Feature activation-2.78e-01

Pos logit contributions
 Piper+0.033
 your+0.033
Scar+0.032
 newly+0.031
Mid+0.031
Neg logit contributions
apping-0.127
owing-0.124
unch-0.124
ement-0.124
apped-0.122
Token and
Token ID290
Feature activation+3.89e-03

Pos logit contributions
unch+0.269
owing+0.260
oot+0.257
apping+0.255
amb+0.249
Neg logit contributions
 your-0.203
 her-0.196
 newly-0.185
 our-0.184
 same-0.184
Token started
Token ID2067
Feature activation+8.03e-02

Pos logit contributions
unch+0.007
apping+0.007
owing+0.007
oot+0.007
ement+0.007
Neg logit contributions
 your-0.006
 her-0.005
 newly-0.005
 Piper-0.005
 our-0.005
Token singing
Token ID13777
Feature activation+2.41e-01

Pos logit contributions
unch+0.174
oot+0.169
apping+0.169
owing+0.168
amb+0.168
Neg logit contributions
 your-0.104
 her-0.099
 our-0.092
 newly-0.092
 their-0.092
Token "
Token ID366
Feature activation-5.73e-02

Pos logit contributions
unch+0.438
apping+0.432
owing+0.425
oot+0.418
amb+0.417
Neg logit contributions
 your-0.368
 newly-0.345
 her-0.343
 same-0.340
 our-0.337
TokenHappy
Token ID25082
Feature activation+1.42e-02

Pos logit contributions
 your+0.073
 newly+0.068
 her+0.068
 same+0.067
Scar+0.065
Neg logit contributions
unch-0.115
apping-0.115
owing-0.115
oot-0.112
ement-0.111
Token Birthday
Token ID33511
Feature activation+3.53e-01

Pos logit contributions
owing+0.017
unch+0.017
anged+0.017
oot+0.017
ement+0.016
Neg logit contributions
 your-0.031
 our-0.030
 my-0.029
 her-0.029
 same-0.029
Token to
Token ID284
Feature activation-9.11e-02

Pos logit contributions
apping+0.619
unch+0.597
amb+0.589
owing+0.584
oot+0.574
Neg logit contributions
 newly-0.535
 your-0.514
Scar-0.510
Mid-0.501
 Piper-0.499
Token Jack
Token ID3619
Feature activation-1.35e-01

Pos logit contributions
Scar+0.097
 newly+0.096
Mid+0.096
chy+0.093
 Piper+0.091
Neg logit contributions
apping-0.187
unch-0.186
ement-0.183
oot-0.183
amb-0.181
Token!"
Token ID2474
Feature activation+1.05e-01

Pos logit contributions
 your+0.239
 newly+0.235
 same+0.227
 Piper+0.227
Scar+0.225
Neg logit contributions
apping-0.202
unch-0.198
owing-0.195
amb-0.191
oot-0.189
Token
Token ID220
Feature activation+4.39e-02

Pos logit contributions
apping+0.207
unch+0.205
owing+0.198
amb+0.198
oot+0.196
Neg logit contributions
 your-0.144
 newly-0.135
 same-0.133
 our-0.131
 Piper-0.131
Token
Token ID198
Feature activation-1.78e-02

Pos logit contributions
apping+0.084
unch+0.083
amb+0.079
oot+0.079
owing+0.079
Neg logit contributions
 your-0.062
 newly-0.060
 same-0.058
 our-0.057
 Piper-0.057
Token She
Token ID1375
Feature activation-1.92e-01

Pos logit contributions
unch+0.153
apping+0.153
owing+0.152
oot+0.149
amb+0.148
Neg logit contributions
 your-0.073
 newly-0.066
 her-0.064
 Piper-0.064
 our-0.063
Token saw
Token ID2497
Feature activation+4.24e-02

Pos logit contributions
 your+0.251
 newly+0.241
 our+0.231
 same+0.231
 Piper+0.230
Neg logit contributions
apping-0.363
unch-0.359
owing-0.356
amb-0.355
atted-0.354
Token the
Token ID262
Feature activation-2.83e-01

Pos logit contributions
apping+0.097
unch+0.097
owing+0.096
oot+0.096
ement+0.095
Neg logit contributions
Scar-0.036
 your-0.036
 newly-0.036
Mid-0.034
 Piper-0.033
Token mess
Token ID2085
Feature activation+1.57e-01

Pos logit contributions
 your+0.331
 newly+0.307
 same+0.298
 Piper+0.295
 her+0.295
Neg logit contributions
apping-0.605
unch-0.604
owing-0.593
oot-0.585
amb-0.583
Token and
Token ID290
Feature activation-1.81e-03

Pos logit contributions
apping+0.258
unch+0.251
owing+0.249
oot+0.243
ement+0.240
Neg logit contributions
 your-0.260
 newly-0.252
Scar-0.250
 same-0.248
 Piper-0.248
Token the
Token ID262
Feature activation-2.88e-01

Pos logit contributions
Scar+0.002
 newly+0.002
Mid+0.002
 same+0.002
 Piper+0.002
Neg logit contributions
atted-0.004
apping-0.004
anged-0.003
apped-0.003
owing-0.003
Token tears
Token ID10953
Feature activation+1.21e-01

Pos logit contributions
 your+0.338
 newly+0.311
 Piper+0.305
Scar+0.301
 our+0.301
Neg logit contributions
apping-0.607
unch-0.606
owing-0.595
oot-0.584
amb-0.581
Token and
Token ID290
Feature activation-4.58e-03

Pos logit contributions
apping+0.225
unch+0.224
owing+0.221
oot+0.217
ement+0.213
Neg logit contributions
 your-0.175
 newly-0.165
 same-0.162
 Piper-0.161
Scar-0.161
Token said
Token ID531
Feature activation+9.65e-02

Pos logit contributions
Scar+0.004
 newly+0.004
Mid+0.004
 your+0.004
 same+0.004
Neg logit contributions
atted-0.010
apped-0.010
anged-0.010
apping-0.010
owing-0.009
Token,
Token ID11
Feature activation+1.85e-02

Pos logit contributions
apping+0.178
unch+0.176
owing+0.172
oot+0.171
amb+0.167
Neg logit contributions
 your-0.139
 newly-0.136
 same-0.132
 Piper-0.131
Scar-0.129
Token "
Token ID366
Feature activation-6.55e-02

Pos logit contributions
apping+0.040
unch+0.040
owing+0.039
oot+0.039
amb+0.038
Neg logit contributions
 your-0.022
 newly-0.020
 same-0.019
 Piper-0.019
 our-0.019
Token them
Token ID606
Feature activation-1.48e-01

Pos logit contributions
apping+0.217
unch+0.216
owing+0.213
oot+0.210
amb+0.207
Neg logit contributions
 your-0.128
 newly-0.121
 same-0.117
 our-0.115
Scar-0.113
Token.
Token ID13
Feature activation+6.99e-02

Pos logit contributions
 your+0.224
 newly+0.207
 same+0.207
 our+0.205
 her+0.205
Neg logit contributions
unch-0.270
apping-0.267
oot-0.260
owing-0.260
amb-0.257
Token The
Token ID383
Feature activation-1.75e-01

Pos logit contributions
unch+0.156
apping+0.156
owing+0.155
oot+0.152
amb+0.150
Neg logit contributions
 your-0.076
 newly-0.068
 her-0.067
 our-0.066
 same-0.066
Token moral
Token ID6573
Feature activation+1.23e-01

Pos logit contributions
 your+0.213
 newly+0.198
 same+0.196
 our+0.190
 her+0.189
Neg logit contributions
unch-0.369
apping-0.369
owing-0.364
oot-0.358
amb-0.358
Token of
Token ID286
Feature activation-7.34e-02

Pos logit contributions
owing+0.181
unch+0.180
apping+0.177
amb+0.175
oot+0.168
Neg logit contributions
 your-0.240
 her-0.224
 same-0.223
 our-0.218
 newly-0.218
Token the
Token ID262
Feature activation-2.87e-01

Pos logit contributions
Scar+0.057
Mid+0.050
 same+0.048
 newly+0.047
chy+0.044
Neg logit contributions
unch-0.179
apping-0.175
oot-0.173
owing-0.170
apped-0.167
Token story
Token ID1621
Feature activation+1.79e-01

Pos logit contributions
 your+0.417
 newly+0.394
 same+0.387
 our+0.380
 Piper+0.378
Neg logit contributions
apping-0.541
unch-0.539
owing-0.532
amb-0.523
oot-0.519
Token is
Token ID318
Feature activation-5.42e-02

Pos logit contributions
apping+0.470
unch+0.468
owing+0.462
oot+0.457
amb+0.456
Neg logit contributions
 your-0.122
 newly-0.107
 same-0.101
 Piper-0.100
 her-0.098
Token that
Token ID326
Feature activation-9.77e-02

Pos logit contributions
 your+0.094
 our+0.086
 her+0.084
 my+0.083
 same+0.082
Neg logit contributions
unch-0.094
ement-0.092
owing-0.091
amb-0.091
apping-0.090
Token true
Token ID2081
Feature activation-1.35e-02

Pos logit contributions
 your+0.092
 our+0.080
 newly+0.079
 her+0.078
 same+0.077
Neg logit contributions
unch-0.240
apping-0.237
owing-0.235
amb-0.234
oot-0.233
Token friendship
Token ID14738
Feature activation+1.69e-01

Pos logit contributions
 your+0.018
 our+0.016
 same+0.016
 has+0.015
 my+0.015
Neg logit contributions
unch-0.029
amb-0.028
apping-0.028
owing-0.028
oot-0.028
Token them
Token ID606
Feature activation-1.48e-01

Pos logit contributions
unch+0.311
oot+0.294
amb+0.293
apping+0.288
ement+0.280
Neg logit contributions
 your-0.242
 her-0.239
 same-0.226
 has-0.219
 the-0.217
Token.
Token ID13
Feature activation+6.99e-02

Pos logit contributions
 your+0.224
 newly+0.207
 same+0.207
 our+0.205
 her+0.205
Neg logit contributions
unch-0.270
apping-0.267
oot-0.260
owing-0.260
amb-0.257
Token The
Token ID383
Feature activation-1.75e-01

Pos logit contributions
unch+0.156
apping+0.156
owing+0.155
oot+0.152
amb+0.150
Neg logit contributions
 your-0.076
 newly-0.068
 her-0.067
 our-0.066
 same-0.066
Token moral
Token ID6573
Feature activation+1.23e-01

Pos logit contributions
 your+0.213
 newly+0.198
 same+0.196
 our+0.190
 her+0.189
Neg logit contributions
unch-0.369
apping-0.369
owing-0.364
oot-0.358
amb-0.358
Token of
Token ID286
Feature activation-7.34e-02

Pos logit contributions
owing+0.181
unch+0.180
apping+0.177
amb+0.175
oot+0.168
Neg logit contributions
 your-0.240
 her-0.224
 same-0.223
 our-0.218
 newly-0.218
Token the
Token ID262
Feature activation-2.87e-01

Pos logit contributions
Scar+0.057
Mid+0.050
 same+0.048
 newly+0.047
chy+0.044
Neg logit contributions
unch-0.179
apping-0.175
oot-0.173
owing-0.170
apped-0.167
Token story
Token ID1621
Feature activation+1.79e-01

Pos logit contributions
 your+0.417
 newly+0.394
 same+0.387
 our+0.380
 Piper+0.378
Neg logit contributions
apping-0.541
unch-0.539
owing-0.532
amb-0.523
oot-0.519
Token is
Token ID318
Feature activation-5.42e-02

Pos logit contributions
apping+0.470
unch+0.468
owing+0.462
oot+0.457
amb+0.456
Neg logit contributions
 your-0.122
 newly-0.107
 same-0.101
 Piper-0.100
 her-0.098
Token to
Token ID284
Feature activation-1.01e-01

Pos logit contributions
 your+0.094
 our+0.086
 her+0.084
 my+0.083
 same+0.082
Neg logit contributions
unch-0.094
ement-0.092
owing-0.091
amb-0.091
apping-0.090
Token always
Token ID1464
Feature activation+2.05e-02

Pos logit contributions
 your+0.096
 newly+0.091
 same+0.085
 Piper+0.085
Scar+0.084
Neg logit contributions
apping-0.235
unch-0.234
owing-0.230
oot-0.228
amb-0.228
Token be
Token ID307
Feature activation+4.10e-03

Pos logit contributions
apping+0.044
owing+0.043
unch+0.043
aring+0.042
oot+0.042
Neg logit contributions
 newly-0.022
 Piper-0.021
Scar-0.021
 your-0.021
Mid-0.020
Token cook
Token ID4255
Feature activation+1.81e-01

Pos logit contributions
apping+0.063
unch+0.063
owing+0.062
oot+0.061
amb+0.061
Neg logit contributions
 your-0.037
 same-0.034
 newly-0.033
 her-0.033
 our-0.033
Token."
Token ID526
Feature activation+9.50e-02

Pos logit contributions
apping+0.302
unch+0.301
owing+0.294
oot+0.288
amb+0.287
Neg logit contributions
 your-0.294
 newly-0.282
 same-0.273
 Piper-0.273
 her-0.271
Token He
Token ID679
Feature activation-1.90e-01

Pos logit contributions
apping+0.194
unch+0.191
amb+0.187
owing+0.183
oot+0.182
Neg logit contributions
 newly-0.114
 your-0.114
 same-0.112
Scar-0.107
 Piper-0.105
Token spat
Token ID15246
Feature activation+2.42e-01

Pos logit contributions
 newly+0.286
 your+0.276
 Piper+0.272
 same+0.264
Max+0.262
Neg logit contributions
atted-0.322
apped-0.300
ement-0.300
amb-0.300
apping-0.299
Token out
Token ID503
Feature activation-5.99e-02

Pos logit contributions
unch+0.492
apping+0.488
oot+0.480
amb+0.473
owing+0.467
Neg logit contributions
 your-0.286
 newly-0.286
 same-0.276
Scar-0.267
 our-0.264
Token the
Token ID262
Feature activation-2.87e-01

Pos logit contributions
 newly+0.043
Scar+0.040
chy+0.040
Mid+0.039
 same+0.038
Neg logit contributions
apping-0.140
atted-0.139
unch-0.137
apped-0.136
oot-0.136
Token cake
Token ID12187
Feature activation+1.70e-01

Pos logit contributions
 your+0.348
 newly+0.324
Scar+0.319
 Piper+0.319
 our+0.312
Neg logit contributions
unch-0.588
apping-0.587
owing-0.577
ement-0.564
oot-0.560
Token.
Token ID13
Feature activation+6.91e-02

Pos logit contributions
apping+0.281
unch+0.281
owing+0.274
oot+0.269
amb+0.268
Neg logit contributions
 your-0.283
 newly-0.268
 same-0.263
 Piper-0.261
 her-0.261
Token
Token ID198
Feature activation-3.14e-02

Pos logit contributions
apping+0.143
unch+0.143
amb+0.137
owing+0.137
oot+0.137
Neg logit contributions
 your-0.084
 newly-0.080
 same-0.079
 our-0.076
 Piper-0.076
Token
Token ID198
Feature activation-2.90e-02

Pos logit contributions
 your+0.042
 newly+0.040
 same+0.039
 Piper+0.038
 our+0.038
Neg logit contributions
unch-0.062
apping-0.062
owing-0.059
oot-0.059
amb-0.059
TokenBen
Token ID11696
Feature activation-5.41e-02

Pos logit contributions
 same+0.038
 newly+0.037
chy+0.035
 your+0.035
 Piper+0.034
Neg logit contributions
apping-0.058
unch-0.055
aring-0.055
amb-0.054
oot-0.052
Token other
Token ID584
Feature activation-4.01e-03

Pos logit contributions
 your+0.047
 newly+0.047
 Piper+0.046
Scar+0.045
Mid+0.045
Neg logit contributions
apping-0.051
owing-0.050
unch-0.050
oot-0.050
amb-0.048
Token.
Token ID13
Feature activation+7.06e-02

Pos logit contributions
 your+0.006
 newly+0.006
 same+0.006
 Piper+0.006
 our+0.006
Neg logit contributions
apping-0.007
unch-0.007
owing-0.007
oot-0.007
amb-0.007
Token The
Token ID383
Feature activation-1.77e-01

Pos logit contributions
apping+0.151
unch+0.151
owing+0.148
oot+0.146
amb+0.145
Neg logit contributions
 your-0.083
 newly-0.077
 same-0.074
 Piper-0.074
 her-0.073
Token moral
Token ID6573
Feature activation+1.22e-01

Pos logit contributions
 your+0.211
 newly+0.195
 Piper+0.189
Scar+0.188
 her+0.188
Neg logit contributions
apping-0.372
unch-0.370
owing-0.363
oot-0.358
amb-0.357
Token of
Token ID286
Feature activation-7.43e-02

Pos logit contributions
unch+0.169
apping+0.169
owing+0.169
amb+0.163
oot+0.158
Neg logit contributions
 your-0.244
 same-0.229
 her-0.228
 newly-0.226
 our-0.223
Token the
Token ID262
Feature activation-2.87e-01

Pos logit contributions
Scar+0.064
Mid+0.057
 newly+0.054
 same+0.054
chy+0.053
Neg logit contributions
unch-0.173
apping-0.167
oot-0.167
owing-0.162
apped-0.158
Token story
Token ID1621
Feature activation+1.78e-01

Pos logit contributions
 your+0.448
 newly+0.422
 Piper+0.412
 her+0.412
 same+0.412
Neg logit contributions
unch-0.499
apping-0.499
owing-0.484
oot-0.480
amb-0.474
Token is
Token ID318
Feature activation-5.46e-02

Pos logit contributions
unch+0.481
apping+0.477
owing+0.472
amb+0.467
oot+0.466
Neg logit contributions
 your-0.115
 newly-0.095
 same-0.092
 our-0.091
 her-0.091
Token that
Token ID326
Feature activation-9.81e-02

Pos logit contributions
 your+0.091
 our+0.083
 her+0.082
 my+0.081
 newly+0.080
Neg logit contributions
unch-0.097
ement-0.095
owing-0.094
amb-0.094
apping-0.093
Token sharing
Token ID7373
Feature activation+1.22e-01

Pos logit contributions
 your+0.102
 newly+0.096
 same+0.093
 Piper+0.091
Scar+0.091
Neg logit contributions
apping-0.222
unch-0.220
owing-0.217
oot-0.214
amb-0.213
Token with
Token ID351
Feature activation+8.69e-03

Pos logit contributions
apping+0.247
aring+0.235
apped+0.232
ement+0.231
owing+0.229
Neg logit contributions
Scar-0.137
 newly-0.136
Mid-0.134
 Piper-0.132
Ol-0.126
Token accidentally
Token ID14716
Feature activation-2.50e-02

Pos logit contributions
 your+0.057
 newly+0.054
 same+0.052
 Piper+0.052
 our+0.052
Neg logit contributions
apping-0.093
unch-0.093
owing-0.092
amb-0.090
oot-0.090
Token dropped
Token ID5710
Feature activation+7.25e-02

Pos logit contributions
 your+0.030
 newly+0.030
 Piper+0.029
Scar+0.028
 same+0.028
Neg logit contributions
unch-0.050
apping-0.050
owing-0.050
oot-0.049
amb-0.048
Token his
Token ID465
Feature activation-2.57e-01

Pos logit contributions
oot+0.163
amb+0.162
owing+0.161
atted+0.161
ement+0.161
Neg logit contributions
Scar-0.062
 newly-0.059
Elsa-0.056
Mid-0.055
Beck-0.055
Token gum
Token ID27142
Feature activation+1.27e-01

Pos logit contributions
 your+0.374
Scar+0.362
 Piper+0.358
 newly+0.353
Mid+0.344
Neg logit contributions
apping-0.452
unch-0.449
owing-0.448
oot-0.435
amb-0.431
Token on
Token ID319
Feature activation-9.31e-02

Pos logit contributions
apping+0.221
unch+0.220
owing+0.216
oot+0.213
amb+0.211
Neg logit contributions
 your-0.196
 newly-0.189
 Piper-0.184
Scar-0.182
 our-0.181
Token the
Token ID262
Feature activation-2.86e-01

Pos logit contributions
Scar+0.071
 newly+0.068
 has+0.063
Mid+0.062
 Piper+0.061
Neg logit contributions
unch-0.218
ement-0.218
apping-0.214
amb-0.211
oot-0.211
Token ground
Token ID2323
Feature activation+1.45e-01

Pos logit contributions
Scar+0.367
 newly+0.363
 your+0.361
 Piper+0.355
Mid+0.344
Neg logit contributions
apping-0.508
unch-0.507
amb-0.500
oot-0.499
owing-0.497
Token.
Token ID13
Feature activation+6.44e-02

Pos logit contributions
unch+0.243
apping+0.241
amb+0.237
owing+0.234
oot+0.232
Neg logit contributions
 your-0.243
 her-0.228
 same-0.226
 newly-0.222
 our-0.221
Token Sarah
Token ID10490
Feature activation-1.34e-01

Pos logit contributions
unch+0.128
apping+0.124
amb+0.120
oot+0.120
owing+0.117
Neg logit contributions
 newly-0.080
 same-0.080
Scar-0.079
 your-0.079
 our-0.076
Token saw
Token ID2497
Feature activation+4.08e-02

Pos logit contributions
 your+0.150
 newly+0.141
 our+0.138
 same+0.136
Scar+0.133
Neg logit contributions
unch-0.289
apping-0.288
owing-0.285
oot-0.281
amb-0.277
Token it
Token ID340
Feature activation-1.54e-01

Pos logit contributions
unch+0.076
oot+0.075
ement+0.074
apping+0.073
atted+0.073
Neg logit contributions
Scar-0.054
 newly-0.052
Elsa-0.052
Mid-0.051
 your-0.051
Token.
Token ID13
Feature activation+6.64e-02

Pos logit contributions
 your+0.147
 her+0.136
 newly+0.135
 our+0.134
 same+0.134
Neg logit contributions
unch-0.146
apping-0.145
owing-0.140
amb-0.139
oot-0.138
Token The
Token ID383
Feature activation-1.73e-01

Pos logit contributions
apping+0.144
unch+0.144
owing+0.139
amb+0.139
oot+0.138
Neg logit contributions
 your-0.074
 newly-0.071
 same-0.069
 Piper-0.066
 our-0.066
Token rock
Token ID3881
Feature activation+1.70e-01

Pos logit contributions
 your+0.199
 newly+0.185
 same+0.181
 her+0.176
 Piper+0.176
Neg logit contributions
apping-0.374
unch-0.374
owing-0.368
oot-0.363
amb-0.363
Token had
Token ID550
Feature activation-6.83e-02

Pos logit contributions
apping+0.370
unch+0.368
owing+0.366
oot+0.361
ement+0.359
Neg logit contributions
 your-0.184
 Piper-0.170
 newly-0.169
Scar-0.164
 our-0.163
Token destroyed
Token ID6572
Feature activation+7.71e-02

Pos logit contributions
 Piper+0.075
 your+0.073
Mid+0.071
 newly+0.071
Scar+0.069
Neg logit contributions
unch-0.141
apping-0.140
oot-0.140
owing-0.138
apped-0.137
Token the
Token ID262
Feature activation-2.86e-01

Pos logit contributions
apping+0.196
unch+0.195
owing+0.194
oot+0.191
ement+0.191
Neg logit contributions
 your-0.051
 newly-0.048
Scar-0.047
 Piper-0.046
Mid-0.046
Token park
Token ID3952
Feature activation+2.04e-01

Pos logit contributions
 your+0.342
 newly+0.319
 same+0.316
 our+0.304
 her+0.299
Neg logit contributions
unch-0.615
apping-0.615
owing-0.602
oot-0.596
amb-0.595
Token and
Token ID290
Feature activation-5.65e-03

Pos logit contributions
unch+0.342
apping+0.341
owing+0.331
amb+0.329
oot+0.326
Neg logit contributions
 your-0.335
 newly-0.316
 same-0.312
 her-0.310
 our-0.308
Token all
Token ID477
Feature activation-9.97e-03

Pos logit contributions
Scar+0.006
Mid+0.006
 Piper+0.006
Elsa+0.006
 newly+0.006
Neg logit contributions
oot-0.012
apping-0.012
apped-0.011
owing-0.011
unch-0.011
Token their
Token ID511
Feature activation-2.57e-01

Pos logit contributions
Scar+0.010
 Piper+0.010
Mid+0.010
 newly+0.010
 your+0.010
Neg logit contributions
apping-0.022
owing-0.022
unch-0.022
oot-0.021
atted-0.021
Token toys
Token ID14958
Feature activation+1.09e-01

Pos logit contributions
 your+0.365
 newly+0.354
 same+0.344
 our+0.336
 her+0.336
Neg logit contributions
apping-0.499
unch-0.497
owing-0.483
amb-0.482
oot-0.481
Token girl
Token ID2576
Feature activation-6.10e-02

Pos logit contributions
 your+0.075
 newly+0.072
 Piper+0.071
 same+0.069
Mid+0.068
Neg logit contributions
unch-0.126
apping-0.126
owing-0.123
oot-0.121
amb-0.120
Token named
Token ID3706
Feature activation+2.20e-01

Pos logit contributions
 your+0.083
 her+0.077
 newly+0.075
 our+0.074
 same+0.074
Neg logit contributions
unch-0.122
apping-0.119
oot-0.118
owing-0.116
amb-0.116
Token Sophie
Token ID35331
Feature activation-1.20e-01

Pos logit contributions
oot+0.468
apping+0.465
owing+0.459
amb+0.455
atted+0.448
Neg logit contributions
 newly-0.213
Scar-0.199
 your-0.194
 same-0.189
 our-0.185
Token who
Token ID508
Feature activation-5.14e-02

Pos logit contributions
 your+0.197
 newly+0.186
 her+0.183
 same+0.182
 Piper+0.181
Neg logit contributions
unch-0.202
apping-0.202
owing-0.197
oot-0.194
amb-0.192
Token had
Token ID550
Feature activation-8.61e-02

Pos logit contributions
 your+0.058
 our+0.054
 newly+0.054
 Piper+0.053
Mid+0.052
Neg logit contributions
unch-0.109
apping-0.109
amb-0.106
owing-0.105
oot-0.105
Token the
Token ID262
Feature activation-2.86e-01

Pos logit contributions
 Piper+0.070
 Blink+0.066
 your+0.066
Mid+0.065
 newly+0.062
Neg logit contributions
apping-0.197
unch-0.192
oot-0.192
apped-0.191
owing-0.191
Token biggest
Token ID4094
Feature activation+5.41e-02

Pos logit contributions
 your+0.286
 newly+0.260
 Piper+0.260
 our+0.259
Scar+0.254
Neg logit contributions
apping-0.636
owing-0.632
unch-0.632
amb-0.621
ement-0.619
Token hope
Token ID2911
Feature activation+1.47e-01

Pos logit contributions
apping+0.118
unch+0.118
owing+0.117
oot+0.117
amb+0.115
Neg logit contributions
 your-0.064
 same-0.061
 newly-0.060
 her-0.060
 has-0.055
Token in
Token ID287
Feature activation-1.14e-01

Pos logit contributions
owing+0.233
apping+0.233
oot+0.230
unch+0.227
ement+0.226
Neg logit contributions
 your-0.234
 newly-0.228
 Piper-0.226
Scar-0.225
Mid-0.225
Token her
Token ID607
Feature activation-2.40e-01

Pos logit contributions
Mid+0.059
 Piper+0.057
Scar+0.057
 newly+0.055
 your+0.054
Neg logit contributions
apping-0.293
unch-0.290
ement-0.287
owing-0.286
oot-0.285
Token heart
Token ID2612
Feature activation+2.62e-01

Pos logit contributions
 your+0.320
 newly+0.301
 same+0.296
 her+0.292
 our+0.285
Neg logit contributions
unch-0.492
apping-0.483
oot-0.472
owing-0.471
amb-0.468
Token surprise
Token ID5975
Feature activation+2.13e-01

Pos logit contributions
 your+0.341
 Piper+0.317
 newly+0.314
Scar+0.313
 her+0.310
Neg logit contributions
unch-0.586
apping-0.585
owing-0.575
ement-0.565
oot-0.561
Token.
Token ID13
Feature activation+6.70e-02

Pos logit contributions
apping+0.335
unch+0.334
owing+0.326
oot+0.320
amb+0.319
Neg logit contributions
 your-0.369
 newly-0.351
 same-0.344
 Piper-0.341
 her-0.341
Token They
Token ID1119
Feature activation-1.64e-01

Pos logit contributions
owing+0.160
unch+0.159
apping+0.158
oot+0.156
amb+0.154
Neg logit contributions
 your-0.066
 our-0.058
 newly-0.058
 her-0.056
 Piper-0.056
Token had
Token ID550
Feature activation-7.15e-02

Pos logit contributions
 your+0.187
 newly+0.173
 same+0.168
 Piper+0.168
 our+0.166
Neg logit contributions
apping-0.352
unch-0.350
owing-0.345
oot-0.340
amb-0.339
Token wasted
Token ID18359
Feature activation+1.36e-03

Pos logit contributions
 your+0.072
 Piper+0.069
 newly+0.067
Scar+0.065
Mid+0.064
Neg logit contributions
apping-0.159
unch-0.158
owing-0.156
apped-0.153
oot-0.153
Token the
Token ID262
Feature activation-2.86e-01

Pos logit contributions
apping+0.004
unch+0.004
owing+0.004
ement+0.004
oot+0.003
Neg logit contributions
 your-0.001
 newly-0.001
 Piper-0.001
Scar-0.001
Mid-0.001
Token toys
Token ID14958
Feature activation+1.08e-01

Pos logit contributions
 your+0.335
 Piper+0.310
 newly+0.307
Scar+0.302
 her+0.300
Neg logit contributions
unch-0.603
apping-0.601
owing-0.594
ement-0.585
amb-0.579
Token.
Token ID13
Feature activation+6.55e-02

Pos logit contributions
apping+0.165
unch+0.163
owing+0.162
oot+0.157
ement+0.154
Neg logit contributions
 your-0.190
 newly-0.183
 Piper-0.181
 same-0.179
Scar-0.179
Token They
Token ID1119
Feature activation-1.68e-01

Pos logit contributions
unch+0.156
owing+0.155
apping+0.155
oot+0.153
amb+0.151
Neg logit contributions
 your-0.064
 newly-0.057
 our-0.055
 her-0.054
 Piper-0.054
Token had
Token ID550
Feature activation-7.36e-02

Pos logit contributions
 your+0.187
 newly+0.171
 Piper+0.168
 same+0.167
 our+0.165
Neg logit contributions
apping-0.363
unch-0.362
owing-0.357
oot-0.352
amb-0.350
Token hurt
Token ID5938
Feature activation+1.28e-01

Pos logit contributions
 your+0.076
 Piper+0.075
 newly+0.072
Scar+0.070
 Blink+0.069
Neg logit contributions
apping-0.159
unch-0.158
owing-0.156
apped-0.154
oot-0.152
Token so
Token ID523
Feature activation-2.10e-03

Pos logit contributions
 your+0.023
 newly+0.022
 Piper+0.021
 same+0.021
 our+0.021
Neg logit contributions
apping-0.031
unch-0.031
owing-0.030
oot-0.030
amb-0.030
Token mean
Token ID1612
Feature activation+9.58e-02

Pos logit contributions
 your+0.003
 newly+0.002
 Piper+0.002
 same+0.002
 our+0.002
Neg logit contributions
apping-0.004
unch-0.004
owing-0.004
oot-0.004
amb-0.004
Token and
Token ID290
Feature activation-8.01e-03

Pos logit contributions
unch+0.182
apping+0.180
owing+0.176
amb+0.171
ement+0.171
Neg logit contributions
 your-0.143
 her-0.132
 same-0.128
 our-0.128
 she-0.126
Token rude
Token ID22066
Feature activation+1.20e-01

Pos logit contributions
 newly+0.009
 your+0.009
Scar+0.009
 Piper+0.009
Mid+0.009
Neg logit contributions
apping-0.016
owing-0.016
atted-0.016
oot-0.016
unch-0.016
Token and
Token ID290
Feature activation-8.27e-03

Pos logit contributions
unch+0.225
apping+0.213
owing+0.206
amb+0.205
ement+0.203
Neg logit contributions
 your-0.193
 her-0.185
 same-0.178
 she-0.177
 our-0.173
Token the
Token ID262
Feature activation-2.86e-01

Pos logit contributions
 newly+0.009
Scar+0.009
 Piper+0.009
Mid+0.009
 your+0.008
Neg logit contributions
apping-0.017
owing-0.017
oot-0.017
aring-0.017
unch-0.017
Token other
Token ID584
Feature activation+1.44e-02

Pos logit contributions
 your+0.342
 newly+0.318
 same+0.311
 her+0.305
 Piper+0.304
Neg logit contributions
apping-0.606
unch-0.606
owing-0.594
oot-0.586
amb-0.585
Token animals
Token ID4695
Feature activation+7.97e-02

Pos logit contributions
unch+0.030
apping+0.030
owing+0.029
oot+0.029
amb+0.029
Neg logit contributions
 your-0.018
 newly-0.017
 same-0.017
 her-0.016
 our-0.016
Token were
Token ID547
Feature activation-4.01e-02

Pos logit contributions
unch+0.182
apping+0.180
amb+0.177
owing+0.175
oot+0.175
Neg logit contributions
 your-0.083
 newly-0.075
 her-0.075
 same-0.073
 our-0.072
Token very
Token ID845
Feature activation+2.53e-02

Pos logit contributions
 your+0.053
 her+0.048
 newly+0.048
 same+0.048
 our+0.048
Neg logit contributions
unch-0.083
apping-0.082
amb-0.082
owing-0.080
oot-0.080
Token happy
Token ID3772
Feature activation+1.07e-01

Pos logit contributions
unch+0.056
amb+0.055
apping+0.054
owing+0.054
oot+0.054
Neg logit contributions
 your-0.030
 same-0.027
 newly-0.027
 our-0.027
 her-0.026
Token zo
Token ID40565
Feature activation+2.53e-01

Pos logit contributions
 same+0.196
 your+0.196
 newly+0.187
 our+0.181
 has+0.176
Neg logit contributions
unch-0.407
apping-0.406
owing-0.404
oot-0.401
amb-0.397
Tokenoke
Token ID2088
Feature activation+3.42e-02

Pos logit contributions
unch+0.306
amb+0.280
apping+0.270
owing+0.269
oot+0.258
Neg logit contributions
 your-0.534
 has-0.525
 our-0.508
 she-0.498
 my-0.492
Tokenep
Token ID538
Feature activation+1.23e-01

Pos logit contributions
unch+0.046
apping+0.044
owing+0.042
oot+0.042
amb+0.042
Neg logit contributions
 your-0.071
 our-0.067
 same-0.066
 her-0.066
 has-0.065
Tokeners
Token ID364
Feature activation-5.46e-03

Pos logit contributions
unch+0.276
apping+0.271
owing+0.264
amb+0.260
ement+0.258
Neg logit contributions
 your-0.135
 same-0.126
 has-0.123
 her-0.121
 our-0.121
Token shot
Token ID2823
Feature activation+1.90e-01

Pos logit contributions
 your+0.005
 newly+0.005
 her+0.005
 same+0.005
 our+0.004
Neg logit contributions
apping-0.013
unch-0.013
owing-0.013
amb-0.013
oot-0.013
Token the
Token ID262
Feature activation-2.86e-01

Pos logit contributions
apping+0.398
owing+0.390
apped+0.383
oot+0.382
amb+0.379
Neg logit contributions
 newly-0.211
 your-0.207
Scar-0.201
 Piper-0.199
Mid-0.191
Token tiger
Token ID26241
Feature activation+1.02e-01

Pos logit contributions
 your+0.367
 same+0.354
 newly+0.347
 our+0.336
 my+0.324
Neg logit contributions
apping-0.596
unch-0.595
oot-0.579
owing-0.576
amb-0.574
Token and
Token ID290
Feature activation-4.90e-03

Pos logit contributions
unch+0.204
apping+0.200
oot+0.195
amb+0.195
ement+0.194
Neg logit contributions
 your-0.143
 our-0.133
 her-0.131
 same-0.131
 has-0.126
Token took
Token ID1718
Feature activation+3.54e-02

Pos logit contributions
 newly+0.006
Scar+0.006
 your+0.006
 Piper+0.006
 same+0.006
Neg logit contributions
apped-0.009
oot-0.009
atted-0.009
anged-0.009
owing-0.009
Token it
Token ID340
Feature activation-1.57e-01

Pos logit contributions
apping+0.079
owing+0.078
apped+0.078
oot+0.078
amb+0.077
Neg logit contributions
 newly-0.033
Scar-0.031
 your-0.031
 Piper-0.031
Mid-0.029
Token away
Token ID1497
Feature activation-2.98e-02

Pos logit contributions
 your+0.192
 her+0.177
 our+0.171
 same+0.169
 my+0.162
Neg logit contributions
unch-0.357
ement-0.351
apping-0.351
amb-0.337
apped-0.331
Token made
Token ID925
Feature activation+4.16e-02

Pos logit contributions
 your+0.070
 Piper+0.070
 newly+0.069
Scar+0.065
Mid+0.065
Neg logit contributions
apping-0.137
unch-0.136
owing-0.135
oot-0.132
atted-0.130
Token a
Token ID257
Feature activation-1.73e-01

Pos logit contributions
apping+0.109
unch+0.109
owing+0.107
oot+0.106
amb+0.105
Neg logit contributions
 your-0.027
 newly-0.025
 Piper-0.023
Scar-0.023
 same-0.022
Token big
Token ID1263
Feature activation-1.70e-02

Pos logit contributions
 your+0.155
 newly+0.140
 Piper+0.139
 her+0.137
Scar+0.135
Neg logit contributions
apping-0.413
unch-0.410
owing-0.401
oot-0.395
amb-0.393
Token mistake
Token ID7457
Feature activation+2.12e-01

Pos logit contributions
 your+0.020
 same+0.018
 our+0.018
 newly+0.018
 my+0.017
Neg logit contributions
amb-0.038
oot-0.038
unch-0.038
apping-0.038
owing-0.038
Token and
Token ID290
Feature activation-5.84e-03

Pos logit contributions
unch+0.362
apping+0.354
amb+0.346
owing+0.346
oot+0.338
Neg logit contributions
 your-0.350
 her-0.328
 our-0.319
 newly-0.317
 she-0.313
Token the
Token ID262
Feature activation-2.86e-01

Pos logit contributions
Scar+0.006
 newly+0.006
Mid+0.006
 Piper+0.006
Elsa+0.006
Neg logit contributions
apping-0.012
atted-0.012
apped-0.012
unch-0.012
oot-0.012
Token boss
Token ID6478
Feature activation+2.22e-01

Pos logit contributions
 your+0.348
 newly+0.324
 same+0.314
 Piper+0.312
 her+0.311
Neg logit contributions
apping-0.597
unch-0.596
owing-0.585
oot-0.577
amb-0.575
Token was
Token ID373
Feature activation-3.54e-02

Pos logit contributions
unch+0.379
apping+0.372
owing+0.365
oot+0.356
amb+0.356
Neg logit contributions
 your-0.360
 her-0.337
 same-0.334
 newly-0.331
 our-0.328
Token very
Token ID845
Feature activation+2.50e-02

Pos logit contributions
 your+0.039
 newly+0.037
 same+0.035
 Piper+0.035
 our+0.035
Neg logit contributions
apping-0.078
unch-0.077
owing-0.076
oot-0.075
amb-0.075
Token angry
Token ID7954
Feature activation+1.08e-01

Pos logit contributions
apping+0.058
unch+0.058
owing+0.057
oot+0.056
amb+0.055
Neg logit contributions
 your-0.024
 newly-0.023
 Piper-0.022
Scar-0.022
 her-0.022
Token.
Token ID13
Feature activation+6.62e-02

Pos logit contributions
unch+0.210
owing+0.197
amb+0.195
apping+0.194
ement+0.191
Neg logit contributions
 your-0.170
 her-0.159
 she-0.153
 my-0.151
 our-0.151
Token disgusting
Token ID23374
Feature activation+4.91e-02

Pos logit contributions
 your+0.034
Scar+0.034
 Piper+0.034
 newly+0.033
Mid+0.032
Neg logit contributions
apping-0.089
unch-0.087
owing-0.085
oot-0.085
apped-0.084
Token.
Token ID13
Feature activation+6.72e-02

Pos logit contributions
apping+0.076
unch+0.075
owing+0.074
oot+0.072
amb+0.072
Neg logit contributions
 your-0.086
 newly-0.083
 Piper-0.081
 same-0.081
Scar-0.081
Token The
Token ID383
Feature activation-1.79e-01

Pos logit contributions
apping+0.145
unch+0.144
owing+0.140
amb+0.139
oot+0.139
Neg logit contributions
 your-0.077
 newly-0.073
 same-0.071
 Piper-0.069
 our-0.069
Token dog
Token ID3290
Feature activation+1.10e-01

Pos logit contributions
 your+0.201
 newly+0.186
 same+0.183
 her+0.178
 our+0.177
Neg logit contributions
unch-0.394
apping-0.394
owing-0.388
oot-0.383
amb-0.383
Token and
Token ID290
Feature activation-8.97e-04

Pos logit contributions
apping+0.219
unch+0.218
owing+0.218
aring+0.213
atted+0.213
Neg logit contributions
 your-0.131
 newly-0.128
Scar-0.124
 Piper-0.122
 our-0.119
Token the
Token ID262
Feature activation-2.86e-01

Pos logit contributions
Scar+0.001
Beck+0.001
In+0.001
 newly+0.001
Elsa+0.001
Neg logit contributions
apped-0.002
atted-0.002
oot-0.002
anged-0.002
apping-0.002
Token cat
Token ID3797
Feature activation+1.25e-01

Pos logit contributions
 your+0.353
 newly+0.329
 same+0.319
 Piper+0.317
 her+0.316
Neg logit contributions
apping-0.591
unch-0.590
owing-0.579
oot-0.571
amb-0.569
Token did
Token ID750
Feature activation+1.35e-02

Pos logit contributions
apping+0.289
unch+0.289
owing+0.281
oot+0.281
amb+0.280
Neg logit contributions
 your-0.128
 her-0.115
 newly-0.115
 same-0.113
 our-0.111
Token not
Token ID407
Feature activation-8.05e-02

Pos logit contributions
apping+0.028
unch+0.028
aring+0.028
owing+0.027
oot+0.027
Neg logit contributions
 Piper-0.015
Scar-0.014
Elsa-0.014
Mid-0.014
Beck-0.013
Token want
Token ID765
Feature activation+1.09e-01

Pos logit contributions
 newly+0.107
 your+0.107
 Piper+0.102
Scar+0.102
 same+0.098
Neg logit contributions
unch-0.154
apping-0.154
owing-0.153
oot-0.150
aring-0.147
Token to
Token ID284
Feature activation-1.00e-01

Pos logit contributions
apping+0.189
unch+0.188
owing+0.186
oot+0.184
aring+0.178
Neg logit contributions
 newly-0.156
Scar-0.155
 your-0.155
 Piper-0.151
Mid-0.147
Token The
Token ID383
Feature activation-1.67e-01

Pos logit contributions
owing+0.198
unch+0.197
apping+0.195
amb+0.192
oot+0.192
Neg logit contributions
 your-0.064
 newly-0.058
 her-0.056
 same-0.055
 our-0.055
Token puzzle
Token ID15027
Feature activation+8.56e-02

Pos logit contributions
 your+0.198
 Piper+0.187
 newly+0.184
Scar+0.183
 her+0.178
Neg logit contributions
apping-0.349
unch-0.347
owing-0.337
ement-0.329
oot-0.328
Token pieces
Token ID5207
Feature activation+7.15e-02

Pos logit contributions
apping+0.203
unch+0.197
ement+0.196
owing+0.194
apped+0.194
Neg logit contributions
 your-0.074
 Piper-0.073
Scar-0.070
 newly-0.070
Mid-0.069
Token fall
Token ID2121
Feature activation+1.66e-01

Pos logit contributions
apping+0.167
unch+0.165
owing+0.163
oot+0.162
ement+0.159
Neg logit contributions
 your-0.065
 Piper-0.063
 newly-0.061
Scar-0.061
Mid-0.060
Token on
Token ID319
Feature activation-8.93e-02

Pos logit contributions
apping+0.309
unch+0.304
owing+0.302
oot+0.298
amb+0.296
Neg logit contributions
 your-0.239
 newly-0.228
 Piper-0.223
Scar-0.220
 same-0.217
Token the
Token ID262
Feature activation-2.86e-01

Pos logit contributions
Scar+0.060
 your+0.058
Mid+0.057
 Piper+0.056
 newly+0.055
Neg logit contributions
ement-0.217
apping-0.217
unch-0.217
oot-0.215
owing-0.214
Token floor
Token ID4314
Feature activation+1.44e-01

Pos logit contributions
 your+0.324
 Piper+0.311
 newly+0.309
Scar+0.305
 her+0.296
Neg logit contributions
apping-0.589
unch-0.575
oot-0.565
owing-0.564
amb-0.560
Token.
Token ID13
Feature activation+7.86e-02

Pos logit contributions
amb+0.302
unch+0.294
oot+0.283
owing+0.275
apping+0.271
Neg logit contributions
 your-0.200
 her-0.193
 has-0.192
 same-0.189
 their-0.188
Token\n
Token ID198
Feature activation-2.27e-02

Pos logit contributions
owing+0.190
unch+0.188
apping+0.188
amb+0.185
oot+0.184
Neg logit contributions
 your-0.074
 newly-0.067
 same-0.065
 our-0.064
 her-0.064
Token\n
Token ID198
Feature activation-2.16e-02

Pos logit contributions
 your+0.024
 newly+0.022
 our+0.021
 her+0.021
 their+0.020
Neg logit contributions
owing-0.054
apping-0.053
amb-0.052
ement-0.052
unch-0.051
TokenMom
Token ID29252
Feature activation+7.72e-02

Pos logit contributions
 same+0.023
 your+0.023
 newly+0.022
 Piper+0.022
Scar+0.021
Neg logit contributions
unch-0.047
apping-0.047
oot-0.046
aring-0.045
amb-0.045
Token operation
Token ID4905
Feature activation+2.02e-01

Pos logit contributions
 your+0.262
 Piper+0.245
Scar+0.243
 newly+0.242
 her+0.237
Neg logit contributions
apping-0.443
unch-0.435
owing-0.424
ement-0.423
amb-0.418
Token was
Token ID373
Feature activation-4.23e-02

Pos logit contributions
apping+0.438
owing+0.427
unch+0.422
oot+0.419
apped+0.418
Neg logit contributions
 your-0.215
 newly-0.210
 Piper-0.201
Scar-0.200
 our-0.194
Token a
Token ID257
Feature activation-1.76e-01

Pos logit contributions
 your+0.049
 newly+0.046
 same+0.044
 Piper+0.044
 our+0.044
Neg logit contributions
apping-0.091
unch-0.091
owing-0.089
oot-0.088
amb-0.087
Token success
Token ID1943
Feature activation+1.89e-01

Pos logit contributions
 your+0.181
 her+0.165
 Piper+0.165
 newly+0.163
Scar+0.160
Neg logit contributions
apping-0.388
unch-0.383
owing-0.374
ement-0.367
aring-0.366
Token and
Token ID290
Feature activation-4.30e-03

Pos logit contributions
apping+0.293
unch+0.286
owing+0.284
oot+0.276
ement+0.273
Neg logit contributions
 your-0.330
 newly-0.316
 Piper-0.309
Scar-0.308
 same-0.308
Token the
Token ID262
Feature activation-2.86e-01

Pos logit contributions
Mid+0.004
Scar+0.004
 newly+0.004
 Piper+0.004
Elsa+0.004
Neg logit contributions
apping-0.009
apped-0.009
oot-0.009
owing-0.009
unch-0.009
Token little
Token ID1310
Feature activation+1.02e-02

Pos logit contributions
 your+0.341
 newly+0.317
 Piper+0.305
 our+0.305
 her+0.304
Neg logit contributions
apping-0.600
unch-0.596
owing-0.587
oot-0.578
amb-0.576
Token boy
Token ID2933
Feature activation+9.77e-04

Pos logit contributions
apping+0.021
owing+0.020
ement+0.020
apped+0.020
amb+0.020
Neg logit contributions
 your-0.011
 newly-0.011
Scar-0.011
Mid-0.010
 our-0.010
Token was
Token ID373
Feature activation-3.75e-02

Pos logit contributions
unch+0.002
apping+0.002
owing+0.002
oot+0.002
amb+0.002
Neg logit contributions
 your-0.001
 newly-0.001
 her-0.001
 same-0.001
 our-0.001
Token so
Token ID523
Feature activation-2.71e-03

Pos logit contributions
 your+0.047
 her+0.043
 our+0.041
 same+0.041
 newly+0.041
Neg logit contributions
unch-0.081
apping-0.079
amb-0.079
owing-0.078
oot-0.078
Token happy
Token ID3772
Feature activation+1.08e-01

Pos logit contributions
 your+0.004
 newly+0.004
 Piper+0.004
 same+0.004
 our+0.004
Neg logit contributions
apping-0.005
unch-0.005
owing-0.005
oot-0.005
amb-0.005
Tokenvous
Token ID31222
Feature activation+1.20e-01

Pos logit contributions
 your+0.355
 same+0.342
 newly+0.338
 our+0.319
 her+0.317
Neg logit contributions
apping-0.578
oot-0.567
unch-0.567
owing-0.559
amb-0.555
Token elf
Token ID23878
Feature activation+1.48e-01

Pos logit contributions
unch+0.223
apping+0.222
owing+0.217
amb+0.214
oot+0.214
Neg logit contributions
 your-0.176
 same-0.164
 her-0.162
 newly-0.160
 our-0.159
Token.
Token ID13
Feature activation+5.38e-02

Pos logit contributions
apping+0.276
unch+0.276
owing+0.270
oot+0.266
amb+0.265
Neg logit contributions
 your-0.213
 newly-0.201
 same-0.196
 Piper-0.194
 our-0.194
Token George
Token ID4502
Feature activation-1.49e-01

Pos logit contributions
apping+0.115
unch+0.115
owing+0.113
oot+0.112
amb+0.111
Neg logit contributions
 your-0.063
 newly-0.058
 same-0.056
 her-0.056
 Piper-0.056
Token asked
Token ID1965
Feature activation+1.16e-01

Pos logit contributions
 your+0.177
 newly+0.168
 same+0.161
 our+0.159
Scar+0.159
Neg logit contributions
apping-0.312
owing-0.308
unch-0.308
amb-0.305
oot-0.299
Token the
Token ID262
Feature activation-2.86e-01

Pos logit contributions
apping+0.246
owing+0.244
unch+0.243
oot+0.239
amb+0.236
Neg logit contributions
 newly-0.128
 your-0.128
Scar-0.121
 same-0.121
 Piper-0.117
Token elf
Token ID23878
Feature activation+1.48e-01

Pos logit contributions
 same+0.421
 your+0.419
 newly+0.401
 her+0.392
 Piper+0.388
Neg logit contributions
apping-0.555
unch-0.552
owing-0.531
oot-0.519
apped-0.517
Token why
Token ID1521
Feature activation-4.18e-02

Pos logit contributions
apping+0.284
unch+0.281
owing+0.280
oot+0.274
amb+0.271
Neg logit contributions
 your-0.202
 newly-0.193
 same-0.187
 Piper-0.186
Scar-0.185
Token he
Token ID339
Feature activation-2.58e-01

Pos logit contributions
Mid+0.044
 same+0.041
Scar+0.040
 newly+0.040
 Piper+0.040
Neg logit contributions
owing-0.088
unch-0.087
ement-0.085
oot-0.085
apping-0.085
Token was
Token ID373
Feature activation-3.86e-02

Pos logit contributions
 your+0.269
 newly+0.248
 our+0.245
Scar+0.239
Mid+0.238
Neg logit contributions
owing-0.577
unch-0.575
apping-0.563
oot-0.556
ement-0.552
Token laughing
Token ID14376
Feature activation+1.65e-01

Pos logit contributions
 your+0.052
 newly+0.048
 same+0.047
 her+0.047
 Piper+0.047
Neg logit contributions
unch-0.077
apping-0.077
owing-0.075
amb-0.075
oot-0.074
Token.
Token ID13
Feature activation+6.17e-02

Pos logit contributions
unch+0.344
apping+0.336
amb+0.329
owing+0.328
oot+0.320
Neg logit contributions
 your-0.366
 her-0.343
 our-0.335
 newly-0.334
 same-0.331
Token He
Token ID679
Feature activation-1.88e-01

Pos logit contributions
apping+0.141
unch+0.141
owing+0.140
oot+0.137
amb+0.136
Neg logit contributions
 your-0.065
 newly-0.058
 her-0.057
 Piper-0.056
 our-0.056
Token runs
Token ID4539
Feature activation+5.08e-02

Pos logit contributions
 your+0.247
 Piper+0.238
 newly+0.237
Mid+0.231
 our+0.226
Neg logit contributions
atted-0.348
apping-0.347
amb-0.344
ement-0.343
unch-0.343
Token back
Token ID736
Feature activation-6.57e-03

Pos logit contributions
apping+0.100
unch+0.100
owing+0.098
oot+0.097
amb+0.097
Neg logit contributions
 your-0.068
 newly-0.064
 same-0.062
 Piper-0.062
 our-0.062
Token to
Token ID284
Feature activation-1.05e-01

Pos logit contributions
 your+0.008
 her+0.008
 our+0.007
 the+0.007
 their+0.007
Neg logit contributions
unch-0.016
ement-0.015
apping-0.015
owing-0.014
amb-0.014
Token the
Token ID262
Feature activation-2.86e-01

Pos logit contributions
 your+0.057
 newly+0.051
 same+0.047
 Piper+0.047
Scar+0.046
Neg logit contributions
apping-0.288
unch-0.288
owing-0.283
oot-0.281
amb-0.279
Token fence
Token ID13990
Feature activation+1.66e-01

Pos logit contributions
 your+0.352
 newly+0.329
 Piper+0.320
 her+0.318
Scar+0.316
Neg logit contributions
apping-0.586
unch-0.583
owing-0.571
oot-0.564
amb-0.562
Token and
Token ID290
Feature activation-3.17e-03

Pos logit contributions
unch+0.373
amb+0.371
oot+0.359
apping+0.351
owing+0.347
Neg logit contributions
 her-0.193
 your-0.192
 she-0.182
 the-0.181
 their-0.177
Token climbs
Token ID34332
Feature activation+1.17e-02

Pos logit contributions
Scar+0.005
Mid+0.004
Elsa+0.004
 newly+0.004
Beck+0.004
Neg logit contributions
atted-0.006
anged-0.005
apped-0.005
owing-0.005
amb-0.005
Token over
Token ID625
Feature activation-2.21e-02

Pos logit contributions
apping+0.024
unch+0.024
owing+0.023
oot+0.023
amb+0.023
Neg logit contributions
 your-0.014
 newly-0.014
 Piper-0.013
 same-0.013
Scar-0.013
Token it
Token ID340
Feature activation-1.52e-01

Pos logit contributions
 your+0.021
 newly+0.018
 her+0.018
 same+0.018
 our+0.018
Neg logit contributions
apping-0.054
unch-0.054
owing-0.053
amb-0.052
oot-0.052
Token him
Token ID683
Feature activation-1.45e-01

Pos logit contributions
unch+0.411
apping+0.410
owing+0.400
oot+0.395
amb+0.394
Neg logit contributions
 your-0.254
 her-0.230
 newly-0.228
 our-0.226
 same-0.225
Token down
Token ID866
Feature activation-8.42e-02

Pos logit contributions
 your+0.196
 newly+0.182
 same+0.178
 her+0.177
 our+0.176
Neg logit contributions
unch-0.287
apping-0.287
owing-0.280
oot-0.277
amb-0.276
Token.
Token ID13
Feature activation+6.96e-02

Pos logit contributions
 your+0.146
 her+0.138
 our+0.135
 same+0.133
 newly+0.132
Neg logit contributions
unch-0.144
apping-0.139
amb-0.137
oot-0.137
owing-0.136
Token He
Token ID679
Feature activation-1.85e-01

Pos logit contributions
apping+0.157
unch+0.157
owing+0.153
oot+0.151
amb+0.151
Neg logit contributions
 your-0.073
 newly-0.068
 same-0.066
 Piper-0.064
 our-0.064
Token dropped
Token ID5710
Feature activation+8.48e-02

Pos logit contributions
 your+0.240
 newly+0.226
 Piper+0.222
 same+0.221
Scar+0.219
Neg logit contributions
apping-0.362
unch-0.361
owing-0.355
oot-0.350
amb-0.348
Token the
Token ID262
Feature activation-2.86e-01

Pos logit contributions
apping+0.233
unch+0.233
owing+0.230
oot+0.228
amb+0.227
Neg logit contributions
 your-0.045
 newly-0.040
 Piper-0.036
 same-0.036
Scar-0.036
Token shell
Token ID7582
Feature activation+1.90e-01

Pos logit contributions
 your+0.364
 newly+0.340
 Piper+0.331
 her+0.329
 same+0.329
Neg logit contributions
apping-0.578
unch-0.576
owing-0.566
oot-0.557
amb-0.556
Token and
Token ID290
Feature activation-2.74e-03

Pos logit contributions
apping+0.340
unch+0.340
owing+0.332
amb+0.327
oot+0.327
Neg logit contributions
 your-0.289
 newly-0.271
 same-0.267
 her-0.266
 our-0.263
Token felt
Token ID2936
Feature activation+1.41e-02

Pos logit contributions
Scar+0.003
 newly+0.003
 same+0.003
 Piper+0.003
Mid+0.003
Neg logit contributions
atted-0.005
apped-0.005
anged-0.005
oot-0.005
apping-0.005
Token the
Token ID262
Feature activation-2.83e-01

Pos logit contributions
apping+0.032
unch+0.032
owing+0.031
oot+0.031
amb+0.031
Neg logit contributions
 your-0.015
 newly-0.014
 same-0.014
 Piper-0.013
 her-0.013
Token water
Token ID1660
Feature activation+1.54e-01

Pos logit contributions
 your+0.349
 Piper+0.330
 newly+0.323
Scar+0.323
 her+0.317
Neg logit contributions
apping-0.571
unch-0.566
owing-0.557
amb-0.547
oot-0.543
Token broken
Token ID5445
Feature activation+8.63e-02

Pos logit contributions
 your+0.335
 newly+0.312
 Piper+0.308
Scar+0.304
 her+0.302
Neg logit contributions
apping-0.579
unch-0.575
owing-0.568
oot-0.557
ement-0.555
Token window
Token ID4324
Feature activation+2.19e-01

Pos logit contributions
apping+0.167
unch+0.165
owing+0.163
oot+0.161
ement+0.159
Neg logit contributions
 your-0.112
 newly-0.108
 Piper-0.105
Scar-0.105
Mid-0.102
Token.
Token ID13
Feature activation+5.93e-02

Pos logit contributions
apping+0.338
unch+0.337
owing+0.329
oot+0.323
amb+0.321
Neg logit contributions
 your-0.383
 newly-0.365
 same-0.358
 Piper-0.356
 our-0.355
Token Everything
Token ID11391
Feature activation-5.53e-02

Pos logit contributions
apping+0.129
unch+0.129
owing+0.127
oot+0.125
amb+0.124
Neg logit contributions
 your-0.068
 newly-0.062
 same-0.060
 her-0.060
 Piper-0.060
Token in
Token ID287
Feature activation-1.24e-01

Pos logit contributions
Scar+0.062
 Piper+0.061
Mid+0.061
 your+0.058
Elsa+0.058
Neg logit contributions
apping-0.113
ement-0.112
owing-0.111
apped-0.110
atted-0.109
Token the
Token ID262
Feature activation-2.86e-01

Pos logit contributions
Mid+0.078
Scar+0.075
 Piper+0.068
 newly+0.066
 your+0.065
Neg logit contributions
apping-0.309
ement-0.307
apped-0.300
unch-0.298
owing-0.297
Token house
Token ID2156
Feature activation+2.08e-01

Pos logit contributions
 your+0.366
 Piper+0.351
 newly+0.343
Scar+0.341
 our+0.334
Neg logit contributions
apping-0.554
owing-0.544
ement-0.536
unch-0.533
amb-0.519
Token felt
Token ID2936
Feature activation+8.41e-03

Pos logit contributions
apping+0.469
owing+0.463
ement+0.457
unch+0.453
apped+0.452
Neg logit contributions
 Piper-0.200
 your-0.200
Mid-0.199
Scar-0.198
 newly-0.192
Token bright
Token ID6016
Feature activation+9.94e-02

Pos logit contributions
apping+0.017
unch+0.017
ement+0.017
owing+0.017
oot+0.016
Neg logit contributions
 Piper-0.009
Scar-0.009
 your-0.009
 newly-0.009
Mid-0.009
Token and
Token ID290
Feature activation-1.04e-03

Pos logit contributions
unch+0.190
apping+0.187
oot+0.183
amb+0.182
owing+0.181
Neg logit contributions
 your-0.145
 same-0.137
 newly-0.137
 her-0.136
 our-0.131
Token shiny
Token ID22441
Feature activation+5.03e-02

Pos logit contributions
Scar+0.001
Mid+0.001
 Piper+0.001
Elsa+0.001
 Midnight+0.001
Neg logit contributions
ement-0.002
apping-0.002
owing-0.002
apped-0.002
atted-0.002
Token long
Token ID890
Feature activation+3.48e-03

Pos logit contributions
 your+0.207
 her+0.190
 Piper+0.189
 newly+0.181
Scar+0.179
Neg logit contributions
apping-0.371
unch-0.361
owing-0.359
aring-0.354
oot-0.353
Token time
Token ID640
Feature activation+9.33e-02

Pos logit contributions
apping+0.008
unch+0.008
owing+0.008
oot+0.008
amb+0.008
Neg logit contributions
 your-0.003
 newly-0.003
 same-0.003
 Piper-0.003
 our-0.003
Token until
Token ID1566
Feature activation+1.08e-02

Pos logit contributions
unch+0.179
amb+0.170
oot+0.169
apping+0.168
ement+0.168
Neg logit contributions
 your-0.136
 her-0.128
 newly-0.126
 same-0.123
 she-0.122
Token he
Token ID339
Feature activation-2.53e-01

Pos logit contributions
apping+0.024
apped+0.023
ement+0.023
oot+0.023
owing+0.023
Neg logit contributions
Mid-0.011
Scar-0.010
 newly-0.010
 your-0.010
 Piper-0.010
Token reached
Token ID4251
Feature activation+3.38e-02

Pos logit contributions
 your+0.257
 newly+0.236
 same+0.229
 her+0.228
 Piper+0.225
Neg logit contributions
unch-0.582
apping-0.581
owing-0.572
amb-0.565
oot-0.564
Token the
Token ID262
Feature activation-2.86e-01

Pos logit contributions
apping+0.074
owing+0.073
apped+0.073
ement+0.071
oot+0.069
Neg logit contributions
Mid-0.035
 Midnight-0.030
Ruby-0.029
Elsa-0.029
Ol-0.029
Token lake
Token ID13546
Feature activation+2.23e-01

Pos logit contributions
 your+0.370
 newly+0.345
 Piper+0.337
 her+0.335
 our+0.331
Neg logit contributions
apping-0.571
unch-0.568
owing-0.557
oot-0.549
amb-0.547
Token.
Token ID13
Feature activation+5.82e-02

Pos logit contributions
unch+0.405
oot+0.395
amb+0.394
apping+0.387
owing+0.376
Neg logit contributions
 your-0.345
 her-0.325
 same-0.325
 our-0.316
 their-0.315
Token He
Token ID679
Feature activation-1.88e-01

Pos logit contributions
apping+0.127
unch+0.127
owing+0.126
oot+0.124
amb+0.123
Neg logit contributions
 your-0.066
 newly-0.061
 same-0.059
 her-0.059
 Piper-0.058
Token stopped
Token ID5025
Feature activation+1.28e-01

Pos logit contributions
 your+0.211
 newly+0.195
 same+0.189
 her+0.186
 Piper+0.186
Neg logit contributions
apping-0.411
unch-0.410
owing-0.403
oot-0.398
amb-0.396
Token and
Token ID290
Feature activation-7.04e-03

Pos logit contributions
unch+0.246
apping+0.242
oot+0.239
amb+0.236
owing+0.235
Neg logit contributions
 your-0.184
 newly-0.172
 her-0.171
 our-0.167
 same-0.166
Token The
Token ID383
Feature activation-1.61e-01

Pos logit contributions
unch+0.181
apping+0.180
owing+0.178
oot+0.175
ement+0.174
Neg logit contributions
 your-0.069
 her-0.063
 same-0.063
 newly-0.062
 Piper-0.060
Token wind
Token ID2344
Feature activation+1.79e-01

Pos logit contributions
 your+0.190
 our+0.174
 Piper+0.172
Mid+0.171
Scar+0.171
Neg logit contributions
apping-0.338
owing-0.324
unch-0.322
ement-0.320
apped-0.314
Token whipped
Token ID30712
Feature activation+1.63e-01

Pos logit contributions
unch+0.435
apping+0.428
owing+0.423
amb+0.415
ement+0.413
Neg logit contributions
 your-0.164
 same-0.148
 newly-0.148
 her-0.146
 our-0.141
Token through
Token ID832
Feature activation-7.56e-02

Pos logit contributions
oot+0.263
apping+0.263
amb+0.258
ement+0.258
apped+0.256
Neg logit contributions
 your-0.261
 newly-0.260
Scar-0.256
 Piper-0.256
 our-0.247
Token her
Token ID607
Feature activation-2.36e-01

Pos logit contributions
 your+0.030
Scar+0.028
 Piper+0.027
 newly+0.027
Mid+0.025
Neg logit contributions
apping-0.212
unch-0.208
ement-0.206
oot-0.204
apped-0.204
Token hair
Token ID4190
Feature activation+2.50e-01

Pos logit contributions
 your+0.418
 newly+0.401
 same+0.395
 her+0.390
 our+0.387
Neg logit contributions
unch-0.376
apping-0.361
owing-0.358
amb-0.357
oot-0.356
Token.
Token ID13
Feature activation+7.50e-02

Pos logit contributions
unch+0.465
apping+0.462
owing+0.454
amb+0.448
oot+0.443
Neg logit contributions
 your-0.365
 newly-0.342
 same-0.340
 her-0.340
 our-0.327
Token She
Token ID1375
Feature activation-1.87e-01

Pos logit contributions
unch+0.184
apping+0.184
owing+0.182
oot+0.179
amb+0.178
Neg logit contributions
 your-0.066
 newly-0.059
 her-0.059
 same-0.058
 Piper-0.057
Token felt
Token ID2936
Feature activation+1.68e-02

Pos logit contributions
 your+0.180
 newly+0.165
 same+0.162
 her+0.160
 our+0.155
Neg logit contributions
unch-0.448
apping-0.443
owing-0.441
oot-0.427
amb-0.426
Token so
Token ID523
Feature activation+1.05e-03

Pos logit contributions
apping+0.034
unch+0.034
owing+0.032
oot+0.032
ement+0.032
Neg logit contributions
 newly-0.020
 Piper-0.020
 your-0.020
Mid-0.019
Scar-0.019
Token alive
Token ID6776
Feature activation+2.92e-02

Pos logit contributions
apping+0.002
unch+0.002
owing+0.002
oot+0.002
amb+0.002
Neg logit contributions
 your-0.001
 Piper-0.001
 newly-0.001
Scar-0.001
Mid-0.001
Token don
Token ID836
Feature activation-1.77e-02

Pos logit contributions
 your+0.151
 newly+0.142
 Piper+0.140
 our+0.136
 same+0.135
Neg logit contributions
apping-0.235
unch-0.234
owing-0.233
amb-0.230
oot-0.226
Token't
Token ID470
Feature activation-7.15e-02

Pos logit contributions
 your+0.016
 her+0.014
 newly+0.013
 our+0.013
 my+0.013
Neg logit contributions
owing-0.044
apping-0.043
unch-0.043
amb-0.042
atted-0.042
Token need
Token ID761
Feature activation+7.39e-02

Pos logit contributions
 your+0.085
 her+0.077
 same+0.077
 our+0.073
 newly+0.073
Neg logit contributions
unch-0.160
apping-0.158
ement-0.155
owing-0.152
oot-0.152
Token the
Token ID262
Feature activation-2.76e-01

Pos logit contributions
apping+0.197
unch+0.194
owing+0.190
amb+0.188
ement+0.187
Neg logit contributions
 your-0.064
 her-0.054
 our-0.052
 my-0.052
 same-0.050
Token new
Token ID649
Feature activation+1.47e-02

Pos logit contributions
 your+0.274
 same+0.254
 newly+0.252
 her+0.238
 our+0.235
Neg logit contributions
apping-0.668
unch-0.662
owing-0.649
oot-0.643
amb-0.636
Token t
Token ID256
Feature activation+2.40e-01

Pos logit contributions
apping+0.040
unch+0.040
owing+0.039
oot+0.038
aring+0.038
Neg logit contributions
 your-0.013
 same-0.012
 her-0.011
 my-0.011
 newly-0.010
Tokeneddy
Token ID21874
Feature activation+1.17e-01

Pos logit contributions
apping+0.312
unch+0.309
ement+0.296
owing+0.290
overed+0.282
Neg logit contributions
 your-0.474
 my-0.444
 newly-0.441
 her-0.439
 our-0.434
Token bear
Token ID6842
Feature activation+9.80e-02

Pos logit contributions
owing+0.240
amb+0.239
apping+0.235
unch+0.235
oot+0.233
Neg logit contributions
 your-0.136
 newly-0.132
Scar-0.131
 Piper-0.127
 same-0.124
Token.
Token ID13
Feature activation+9.08e-02

Pos logit contributions
apping+0.151
unch+0.151
owing+0.147
oot+0.144
amb+0.144
Neg logit contributions
 your-0.174
 newly-0.165
 same-0.162
 her-0.161
 our-0.161
Token I
Token ID314
Feature activation-1.18e-01

Pos logit contributions
amb+0.172
apping+0.167
aring+0.164
unch+0.162
oot+0.158
Neg logit contributions
 newly-0.123
Scar-0.117
 your-0.116
 same-0.115
 Piper-0.115
Token want
Token ID765
Feature activation+1.15e-01

Pos logit contributions
 your+0.137
 newly+0.127
 Piper+0.124
 same+0.122
 our+0.122
Neg logit contributions
apping-0.252
unch-0.252
owing-0.249
amb-0.245
oot-0.244
TokenOne
Token ID3198
Feature activation+4.79e-02

Pos logit contributions
apping+0.250
owing+0.249
unch+0.249
oot+0.246
amb+0.244
Neg logit contributions
 your-0.132
 newly-0.122
 her-0.120
 same-0.119
 our-0.116
Token day
Token ID1110
Feature activation+1.30e-01

Pos logit contributions
apping+0.087
unch+0.086
owing+0.082
apped+0.078
oot+0.076
Neg logit contributions
Scar-0.073
 Piper-0.072
 Midnight-0.068
Elsa-0.067
 your-0.067
Token,
Token ID11
Feature activation+2.50e-02

Pos logit contributions
apping+0.189
unch+0.181
apped+0.176
owing+0.169
amb+0.165
Neg logit contributions
Scar-0.241
 newly-0.238
 Piper-0.238
 same-0.233
 your-0.232
Token a
Token ID257
Feature activation-1.72e-01

Pos logit contributions
apped+0.059
apping+0.058
unch+0.056
owing+0.054
amb+0.053
Neg logit contributions
Scar-0.026
 Piper-0.024
Elsa-0.022
Mid-0.022
 Scarlett-0.021
Token boy
Token ID2933
Feature activation+1.29e-02

Pos logit contributions
 your+0.164
 Piper+0.147
 her+0.146
 our+0.142
Scar+0.142
Neg logit contributions
apping-0.402
unch-0.396
owing-0.376
amb-0.371
aring-0.370
Token named
Token ID3706
Feature activation+2.45e-01

Pos logit contributions
unch+0.020
apping+0.020
owing+0.019
ement+0.019
oot+0.019
Neg logit contributions
 your-0.024
 her-0.022
 newly-0.022
 our-0.022
 same-0.021
Token Tim
Token ID5045
Feature activation-9.94e-02

Pos logit contributions
unch+0.453
apping+0.451
apped+0.444
amb+0.434
owing+0.434
Neg logit contributions
Scar-0.350
 newly-0.328
 Piper-0.326
Elsa-0.310
Mid-0.305
Token and
Token ID290
Feature activation+6.59e-03

Pos logit contributions
 Piper+0.125
 newly+0.123
Scar+0.122
Mid+0.119
 your+0.118
Neg logit contributions
owing-0.194
amb-0.189
unch-0.187
apped-0.186
apping-0.186
Token his
Token ID465
Feature activation-2.51e-01

Pos logit contributions
oot+0.011
apped+0.011
amb+0.011
unch+0.010
apping+0.010
Neg logit contributions
Scar-0.009
Elsa-0.008
 newly-0.008
Beck-0.008
chy-0.008
Token friend
Token ID1545
Feature activation+1.89e-01

Pos logit contributions
 your+0.306
 newly+0.279
 Piper+0.275
 our+0.271
Scar+0.270
Neg logit contributions
apping-0.529
unch-0.525
owing-0.516
amb-0.507
oot-0.507
Token,
Token ID11
Feature activation+2.60e-02

Pos logit contributions
apped+0.280
apping+0.277
owing+0.272
unch+0.267
anged+0.261
Neg logit contributions
 newly-0.328
Scar-0.326
 Piper-0.306
 your-0.305
Elsa-0.303
Token there
Token ID612
Feature activation-1.12e-01

Pos logit contributions
apping+0.062
amb+0.061
oot+0.060
unch+0.060
owing+0.060
Neg logit contributions
Scar-0.032
Mid-0.030
 newly-0.030
chy-0.028
Beck-0.027
Token will
Token ID481
Feature activation-9.18e-03

Pos logit contributions
Scar+0.110
Mid+0.106
 newly+0.103
 Piper+0.102
Ol+0.092
Neg logit contributions
owing-0.250
apping-0.250
aring-0.242
unch-0.241
ement-0.241
Token be
Token ID307
Feature activation+7.71e-03

Pos logit contributions
Scar+0.009
 newly+0.009
 Piper+0.009
Mid+0.008
 your+0.008
Neg logit contributions
unch-0.020
owing-0.019
apping-0.019
oot-0.019
aring-0.019
Token a
Token ID257
Feature activation-1.64e-01

Pos logit contributions
owing+0.015
apping+0.015
unch+0.015
atted+0.015
apped+0.014
Neg logit contributions
Scar-0.009
 newly-0.008
 Piper-0.008
Mid-0.008
Elsa-0.008
Token big
Token ID1263
Feature activation-8.31e-03

Pos logit contributions
 your+0.151
 newly+0.139
 her+0.134
 Piper+0.133
Scar+0.131
Neg logit contributions
apping-0.384
unch-0.383
owing-0.374
oot-0.366
amb-0.366
Token reward
Token ID6721
Feature activation+2.35e-01

Pos logit contributions
 your+0.010
 newly+0.010
 same+0.009
 our+0.009
 her+0.009
Neg logit contributions
unch-0.017
apping-0.017
owing-0.017
oot-0.017
amb-0.017
Token for
Token ID329
Feature activation-4.60e-02

Pos logit contributions
apping+0.431
unch+0.427
owing+0.423
oot+0.415
amb+0.411
Neg logit contributions
 your-0.340
 newly-0.331
 same-0.319
 Piper-0.318
Scar-0.316
Token you
Token ID345
Feature activation-1.32e-01

Pos logit contributions
 your+0.031
 newly+0.030
Scar+0.029
 same+0.028
 Piper+0.028
Neg logit contributions
unch-0.117
apping-0.117
owing-0.115
oot-0.114
amb-0.112
Token!"
Token ID2474
Feature activation+1.03e-01

Pos logit contributions
 your+0.197
 same+0.179
 our+0.179
 her+0.177
 newly+0.175
Neg logit contributions
unch-0.250
apping-0.247
amb-0.242
oot-0.241
owing-0.239
Token she
Token ID673
Feature activation-2.49e-01

Pos logit contributions
amb+0.221
apping+0.217
unch+0.215
oot+0.214
apped+0.207
Neg logit contributions
 same-0.111
 newly-0.108
 your-0.108
Scar-0.101
 our-0.101
Token said
Token ID531
Feature activation+1.08e-01

Pos logit contributions
 your+0.362
 newly+0.341
 same+0.333
 Piper+0.330
 her+0.330
Neg logit contributions
apping-0.465
unch-0.464
owing-0.455
oot-0.447
amb-0.446
Tokenila
Token ID10102
Feature activation-6.97e-02

Pos logit contributions
apping+0.032
unch+0.032
owing+0.032
amb+0.032
oot+0.031
Neg logit contributions
 your-0.019
 newly-0.017
 her-0.017
 same-0.017
Scar-0.016
Token cried
Token ID16896
Feature activation+1.97e-01

Pos logit contributions
 your+0.065
 her+0.059
 newly+0.058
 same+0.057
 Piper+0.055
Neg logit contributions
apping-0.167
unch-0.166
owing-0.163
oot-0.162
amb-0.162
Token and
Token ID290
Feature activation+9.16e-03

Pos logit contributions
unch+0.379
apping+0.375
owing+0.366
amb+0.363
oot+0.362
Neg logit contributions
 your-0.281
 her-0.264
 newly-0.260
 same-0.258
 Piper-0.254
Token pulled
Token ID5954
Feature activation+1.08e-01

Pos logit contributions
atted+0.015
apped+0.015
apping+0.015
unch+0.015
oot+0.015
Neg logit contributions
 newly-0.014
Scar-0.013
 your-0.013
 same-0.013
 our-0.013
Token her
Token ID607
Feature activation-2.35e-01

Pos logit contributions
unch+0.328
apping+0.326
owing+0.318
amb+0.318
oot+0.315
Neg logit contributions
 your-0.045
 her-0.038
 newly-0.029
 same-0.029
 our-0.027
Token hand
Token ID1021
Feature activation+2.36e-01

Pos logit contributions
 your+0.398
 her+0.375
 same+0.371
 newly+0.367
 our+0.362
Neg logit contributions
unch-0.415
apping-0.410
amb-0.399
oot-0.395
ement-0.395
Token out
Token ID503
Feature activation-5.04e-02

Pos logit contributions
unch+0.499
apping+0.487
amb+0.470
ement+0.465
oot+0.462
Neg logit contributions
 your-0.312
 her-0.304
 same-0.284
 newly-0.273
 our-0.273
Token.
Token ID13
Feature activation+8.44e-02

Pos logit contributions
 your+0.093
 her+0.092
 newly+0.086
 our+0.085
 she+0.085
Neg logit contributions
unch-0.080
amb-0.078
apping-0.076
oot-0.075
owing-0.075
Token She
Token ID1375
Feature activation-1.84e-01

Pos logit contributions
unch+0.210
owing+0.210
apping+0.209
oot+0.204
ement+0.204
Neg logit contributions
 your-0.073
 her-0.069
 same-0.064
 newly-0.064
 Piper-0.063
Token was
Token ID373
Feature activation-2.66e-02

Pos logit contributions
 your+0.250
 newly+0.236
 same+0.231
Scar+0.231
 our+0.230
Neg logit contributions
unch-0.346
apping-0.342
amb-0.339
owing-0.339
oot-0.336
Token angry
Token ID7954
Feature activation+1.12e-01

Pos logit contributions
 your+0.033
 her+0.031
 newly+0.030
 same+0.029
 my+0.029
Neg logit contributions
amb-0.057
unch-0.057
apping-0.056
owing-0.056
oot-0.056
Token and
Token ID290
Feature activation+9.62e-03

Pos logit contributions
apping+0.249
owing+0.240
unch+0.236
apped+0.236
oot+0.231
Neg logit contributions
 your-0.217
 Piper-0.213
Scar-0.212
Mid-0.204
 newly-0.203
Token the
Token ID262
Feature activation-2.77e-01

Pos logit contributions
anged+0.020
apped+0.020
atted+0.020
apping+0.018
owing+0.018
Neg logit contributions
 same-0.010
Scar-0.010
 your-0.010
Mid-0.010
 Piper-0.009
Token box
Token ID3091
Feature activation+1.56e-01

Pos logit contributions
 your+0.344
 Piper+0.324
Scar+0.314
 Blink+0.306
 our+0.304
Neg logit contributions
apping-0.531
owing-0.524
unch-0.523
ement-0.522
apped-0.511
Token in
Token ID287
Feature activation-1.12e-01

Pos logit contributions
unch+0.324
apping+0.318
amb+0.314
owing+0.312
oot+0.307
Neg logit contributions
 your-0.197
 newly-0.189
 her-0.187
 same-0.183
 our-0.178
Token their
Token ID511
Feature activation-2.46e-01

Pos logit contributions
Mid+0.067
Scar+0.066
 your+0.065
 Piper+0.064
 newly+0.060
Neg logit contributions
apping-0.281
apped-0.276
ement-0.275
unch-0.270
oot-0.264
Token room
Token ID2119
Feature activation+1.68e-01

Pos logit contributions
 Piper+0.431
Scar+0.426
 your+0.418
 Blink+0.405
Mid+0.402
Neg logit contributions
apping-0.337
apped-0.332
atted-0.330
owing-0.327
ement-0.326
Token.
Token ID13
Feature activation+8.32e-02

Pos logit contributions
amb+0.349
unch+0.336
oot+0.324
apping+0.314
owing+0.310
Neg logit contributions
 her-0.242
 their-0.231
 she-0.230
 your-0.229
 the-0.224
Token They
Token ID1119
Feature activation-1.55e-01

Pos logit contributions
apping+0.182
unch+0.178
apped+0.173
amb+0.170
oot+0.167
Neg logit contributions
 your-0.094
 newly-0.088
 same-0.087
Scar-0.084
 Piper-0.082
Token felt
Token ID2936
Feature activation+1.73e-02

Pos logit contributions
 your+0.172
 newly+0.154
 same+0.152
 Piper+0.152
 our+0.152
Neg logit contributions
apping-0.339
unch-0.337
owing-0.332
oot-0.328
amb-0.327
Token proud
Token ID6613
Feature activation+1.14e-01

Pos logit contributions
apping+0.037
unch+0.036
owing+0.036
ement+0.035
apped+0.035
Neg logit contributions
 your-0.019
 newly-0.018
 Piper-0.018
Scar-0.018
 same-0.018
Token of
Token ID286
Feature activation-6.96e-02

Pos logit contributions
amb+0.285
owing+0.285
unch+0.276
oot+0.275
umbled+0.270
Neg logit contributions
 she-0.129
 her-0.126
 the-0.123
 your-0.117
 their-0.113
Token they
Token ID484
Feature activation-2.38e-01

Pos logit contributions
Scar+0.078
Mid+0.075
Ol+0.073
Gary+0.073
chy+0.072
Neg logit contributions
apped-0.124
unch-0.121
apping-0.119
ement-0.116
oot-0.116
Token got
Token ID1392
Feature activation+4.37e-02

Pos logit contributions
 your+0.272
 newly+0.251
 same+0.245
 Piper+0.242
 our+0.240
Neg logit contributions
unch-0.518
apping-0.517
owing-0.509
oot-0.502
amb-0.500
Token home
Token ID1363
Feature activation+1.81e-01

Pos logit contributions
apping+0.070
unch+0.070
owing+0.068
oot+0.067
amb+0.067
Neg logit contributions
 your-0.076
 newly-0.072
 same-0.071
 her-0.071
 our-0.070
Token,
Token ID11
Feature activation+3.05e-02

Pos logit contributions
apping+0.261
owing+0.256
unch+0.238
oot+0.238
apped+0.232
Neg logit contributions
Scar-0.342
Mid-0.331
 Piper-0.330
 Blink-0.320
Beck-0.318
Token Lily
Token ID20037
Feature activation-1.33e-01

Pos logit contributions
apped+0.054
apping+0.048
ement+0.046
atted+0.043
unch+0.043
Neg logit contributions
Scar-0.041
Ol-0.037
Gary-0.037
chy-0.036
Mid-0.036
Token wanted
Token ID2227
Feature activation+1.30e-01

Pos logit contributions
 your+0.149
 newly+0.139
 our+0.136
 same+0.135
 Piper+0.130
Neg logit contributions
unch-0.279
apping-0.279
owing-0.277
amb-0.274
oot-0.270
Token to
Token ID284
Feature activation-9.31e-02

Pos logit contributions
unch+0.225
owing+0.225
apping+0.224
oot+0.222
amb+0.214
Neg logit contributions
 newly-0.184
 your-0.184
Scar-0.180
 Piper-0.177
 same-0.175
Token play
Token ID711
Feature activation+1.67e-01

Pos logit contributions
 your+0.098
 newly+0.096
Scar+0.093
 Piper+0.093
 same+0.091
Neg logit contributions
unch-0.202
apping-0.200
owing-0.198
oot-0.197
amb-0.196
Token with
Token ID351
Feature activation+1.58e-02

Pos logit contributions
apping+0.276
unch+0.272
owing+0.268
oot+0.263
ement+0.258
Neg logit contributions
 your-0.270
 newly-0.263
 Piper-0.257
 same-0.255
Scar-0.254
Token her
Token ID607
Feature activation-2.34e-01

Pos logit contributions
ement+0.037
owing+0.036
atted+0.035
oot+0.035
unch+0.035
Neg logit contributions
 has-0.012
Scar-0.012
In-0.011
Beck-0.010
 Piper-0.009
Token doll
Token ID3654
Feature activation+1.30e-01

Pos logit contributions
 your+0.260
 newly+0.244
 her+0.234
 same+0.233
 our+0.227
Neg logit contributions
unch-0.529
apping-0.529
owing-0.517
amb-0.512
oot-0.511
Token we
Token ID356
Feature activation-1.52e-01

Pos logit contributions
unch+0.066
apping+0.066
oot+0.064
amb+0.064
ement+0.064
Neg logit contributions
 your-0.027
 newly-0.026
 same-0.026
Scar-0.026
 Piper-0.025
Token're
Token ID821
Feature activation-2.61e-02

Pos logit contributions
 your+0.139
 newly+0.127
 Piper+0.121
 same+0.119
 our+0.119
Neg logit contributions
apping-0.363
unch-0.362
owing-0.358
amb-0.353
oot-0.352
Token at
Token ID379
Feature activation+2.45e-03

Pos logit contributions
 your+0.027
 newly+0.025
 Piper+0.024
 same+0.024
Scar+0.024
Neg logit contributions
apping-0.059
unch-0.059
owing-0.058
oot-0.057
amb-0.057
Token the
Token ID262
Feature activation-2.71e-01

Pos logit contributions
unch+0.007
apping+0.007
owing+0.007
amb+0.007
ement+0.007
Neg logit contributions
 your-0.002
 our-0.002
 my-0.002
 her-0.002
 same-0.002
Token zoo
Token ID26626
Feature activation+2.26e-01

Pos logit contributions
 your+0.327
 same+0.304
 newly+0.300
 our+0.291
 her+0.291
Neg logit contributions
unch-0.584
apping-0.581
owing-0.570
oot-0.565
amb-0.565
Token!"
Token ID2474
Feature activation+9.24e-02

Pos logit contributions
unch+0.421
amb+0.408
apping+0.408
owing+0.398
oot+0.394
Neg logit contributions
 your-0.353
 our-0.331
 same-0.320
 my-0.317
 her-0.314
Token Lily
Token ID20037
Feature activation-1.29e-01

Pos logit contributions
apped+0.200
apping+0.199
amb+0.197
oot+0.192
unch+0.191
Neg logit contributions
 newly-0.092
 same-0.088
Scar-0.087
chy-0.084
 your-0.083
Token said
Token ID531
Feature activation+1.13e-01

Pos logit contributions
 her+0.133
 your+0.131
 she+0.122
 the+0.118
 Piper+0.115
Neg logit contributions
unch-0.305
ement-0.304
owing-0.304
apping-0.296
oot-0.288
Token.
Token ID13
Feature activation+8.94e-02

Pos logit contributions
unch+0.240
ement+0.228
idd+0.224
amb+0.224
oot+0.221
Neg logit contributions
 her-0.173
 their-0.156
 she-0.155
 your-0.150
 the-0.146
Token She
Token ID1375
Feature activation-1.83e-01

Pos logit contributions
owing+0.226
unch+0.221
oot+0.219
apping+0.218
ement+0.215
Neg logit contributions
 your-0.078
 her-0.073
 Piper-0.068
 newly-0.068
 same-0.065
Token let
Token ID1309
Feature activation+3.63e-02

Pos logit contributions
 your+0.215
 newly+0.202
 her+0.197
 same+0.193
Scar+0.192
Neg logit contributions
unch-0.395
apping-0.393
owing-0.391
oot-0.384
amb-0.378
Token to
Token ID284
Feature activation-9.60e-02

Pos logit contributions
unch+0.118
apping+0.115
oot+0.115
owing+0.114
aring+0.112
Neg logit contributions
 newly-0.066
Scar-0.065
 your-0.062
 Piper-0.062
 same-0.060
Token play
Token ID711
Feature activation+1.61e-01

Pos logit contributions
 your+0.118
 newly+0.116
Scar+0.114
 Piper+0.112
 same+0.108
Neg logit contributions
unch-0.191
owing-0.189
apping-0.187
oot-0.186
amb-0.184
Token in
Token ID287
Feature activation-1.15e-01

Pos logit contributions
unch+0.254
apping+0.253
owing+0.249
oot+0.244
amb+0.243
Neg logit contributions
 your-0.284
 newly-0.266
 her-0.264
 same-0.263
 our-0.262
Token the
Token ID262
Feature activation-2.79e-01

Pos logit contributions
Scar+0.073
 newly+0.066
 Piper+0.065
Mid+0.060
chy+0.058
Neg logit contributions
unch-0.283
apping-0.278
owing-0.272
ement-0.272
oot-0.271
Token garden
Token ID11376
Feature activation+1.42e-01

Pos logit contributions
Scar+0.399
 Piper+0.379
 newly+0.378
 Scarlett+0.362
 your+0.356
Neg logit contributions
apping-0.470
unch-0.468
amb-0.432
aps-0.430
owing-0.428
Token too
Token ID1165
Feature activation+7.19e-02

Pos logit contributions
unch+0.277
oot+0.273
amb+0.270
apping+0.260
owing+0.258
Neg logit contributions
 your-0.217
 her-0.210
 same-0.201
 the-0.200
 their-0.196
Token.
Token ID13
Feature activation+8.91e-02

Pos logit contributions
amb+0.151
unch+0.150
atted+0.149
apping+0.148
anged+0.145
Neg logit contributions
 your-0.101
 her-0.098
 the-0.095
 their-0.092
 our-0.092
Token Spot
Token ID15899
Feature activation-6.29e-02

Pos logit contributions
unch+0.181
apping+0.179
aring+0.170
amb+0.169
oot+0.168
Neg logit contributions
 your-0.111
 newly-0.110
 same-0.107
Scar-0.106
 Piper-0.102
Token sees
Token ID7224
Feature activation+2.75e-02

Pos logit contributions
 newly+0.071
 Piper+0.070
 your+0.070
Scar+0.067
Beck+0.064
Neg logit contributions
apping-0.126
unch-0.126
aring-0.126
amb-0.125
owing-0.125
Token the
Token ID262
Feature activation-2.79e-01

Pos logit contributions
unch+0.064
atted+0.063
apping+0.061
ocked+0.061
oot+0.061
Neg logit contributions
Scar-0.024
 newly-0.022
 Piper-0.020
Elsa-0.019
Beck-0.019
Token spider
Token ID19230
Feature activation+6.05e-02

Pos logit contributions
 your+0.349
 our+0.317
Scar+0.317
 newly+0.313
 Piper+0.313
Neg logit contributions
apping-0.570
unch-0.565
owing-0.546
oot-0.529
ement-0.524
Token\n
Token ID198
Feature activation-1.82e-02

Pos logit contributions
apping+0.204
unch+0.203
amb+0.198
oot+0.198
owing+0.197
Neg logit contributions
 your-0.124
 newly-0.119
 same-0.118
 our-0.113
Scar-0.112
Token\n
Token ID198
Feature activation-1.74e-02

Pos logit contributions
 your+0.021
 her+0.019
 newly+0.019
Scar+0.018
 our+0.018
Neg logit contributions
owing-0.042
apping-0.040
unch-0.039
amb-0.039
oot-0.039
TokenTweet
Token ID47845
Feature activation+5.66e-02

Pos logit contributions
 your+0.017
 same+0.017
 newly+0.016
 our+0.015
 Piper+0.015
Neg logit contributions
apping-0.041
unch-0.040
oot-0.039
amb-0.039
aring-0.038
Token smiled
Token ID13541
Feature activation+1.30e-01

Pos logit contributions
apping+0.117
unch+0.117
owing+0.115
oot+0.114
amb+0.113
Neg logit contributions
 your-0.070
 newly-0.065
 same-0.063
 our-0.063
 Piper-0.063
Token and
Token ID290
Feature activation+9.10e-03

Pos logit contributions
unch+0.257
owing+0.242
oot+0.239
apping+0.238
idd+0.234
Neg logit contributions
 her-0.190
 your-0.189
 same-0.173
 newly-0.172
 she-0.171
Token flew
Token ID13112
Feature activation+1.01e-01

Pos logit contributions
apping+0.016
unch+0.016
amb+0.016
oot+0.016
owing+0.016
Neg logit contributions
 your-0.014
 newly-0.013
 same-0.013
 our-0.012
Scar-0.012
Token away
Token ID1497
Feature activation-1.66e-02

Pos logit contributions
apping+0.192
unch+0.187
owing+0.187
oot+0.186
amb+0.185
Neg logit contributions
 your-0.140
 newly-0.134
 Piper-0.129
 same-0.128
Scar-0.128
Token,
Token ID11
Feature activation+2.83e-02

Pos logit contributions
 your+0.026
 her+0.026
 newly+0.024
 same+0.024
 she+0.024
Neg logit contributions
unch-0.031
amb-0.029
apping-0.029
owing-0.029
ement-0.028
Token happy
Token ID3772
Feature activation+1.12e-01

Pos logit contributions
apping+0.051
unch+0.050
owing+0.049
oot+0.048
amb+0.048
Neg logit contributions
 your-0.042
 newly-0.040
 same-0.039
Scar-0.039
 Piper-0.039
Token to
Token ID284
Feature activation-9.10e-02

Pos logit contributions
unch+0.230
owing+0.220
apping+0.219
ement+0.216
amb+0.216
Neg logit contributions
 your-0.158
 her-0.155
 she-0.151
 same-0.142
 our-0.142
Token have
Token ID423
Feature activation-6.75e-02

Pos logit contributions
 your+0.059
 newly+0.052
 same+0.049
 Piper+0.049
Scar+0.047
Neg logit contributions
apping-0.242
unch-0.241
owing-0.238
oot-0.236
amb-0.235
Token to
Token ID284
Feature activation-9.34e-02

Pos logit contributions
apping+0.189
unch+0.189
owing+0.186
oot+0.184
amb+0.183
Neg logit contributions
 your-0.124
 newly-0.113
 our-0.113
 her-0.112
 same-0.112
Token play
Token ID711
Feature activation+1.62e-01

Pos logit contributions
 your+0.120
 newly+0.112
 same+0.109
 Piper+0.109
 her+0.107
Neg logit contributions
apping-0.189
unch-0.189
owing-0.185
oot-0.182
amb-0.182
Token with
Token ID351
Feature activation+1.39e-02

Pos logit contributions
owing+0.349
unch+0.346
oot+0.343
umbled+0.333
apping+0.332
Neg logit contributions
 your-0.221
 our-0.206
 their-0.198
 the-0.195
 her-0.192
Token the
Token ID262
Feature activation-2.78e-01

Pos logit contributions
apping+0.037
unch+0.037
owing+0.036
oot+0.036
amb+0.036
Neg logit contributions
 your-0.010
 newly-0.008
 same-0.008
 her-0.008
 our-0.008
Token fish
Token ID5916
Feature activation+1.12e-01

Pos logit contributions
 your+0.406
 newly+0.382
 same+0.372
 Piper+0.372
 her+0.371
Neg logit contributions
apping-0.515
unch-0.514
owing-0.503
oot-0.495
amb-0.494
Token and
Token ID290
Feature activation+6.68e-03

Pos logit contributions
unch+0.213
ement+0.212
oot+0.209
amb+0.204
apping+0.204
Neg logit contributions
 your-0.170
 our-0.160
 her-0.156
 their-0.155
 has-0.153
Token touch
Token ID3638
Feature activation+1.37e-01

Pos logit contributions
apping+0.013
unch+0.013
owing+0.012
amb+0.012
apped+0.012
Neg logit contributions
Scar-0.008
Mid-0.008
 your-0.008
 newly-0.008
 Piper-0.008
Token the
Token ID262
Feature activation-2.78e-01

Pos logit contributions
oot+0.392
unch+0.392
apping+0.391
owing+0.385
aring+0.380
Neg logit contributions
 your-0.101
 her-0.087
 our-0.086
 their-0.086
 the-0.085
Token cor
Token ID1162
Feature activation+5.76e-02

Pos logit contributions
 your+0.354
 same+0.333
 newly+0.329
 our+0.323
 her+0.307
Neg logit contributions
unch-0.581
apping-0.578
owing-0.574
oot-0.570
amb-0.556
Tokenals
Token ID874
Feature activation+1.08e-01

Pos logit contributions
apping+0.108
oot+0.107
unch+0.106
ement+0.104
owing+0.103
Neg logit contributions
 your-0.082
 has-0.077
 newly-0.077
 our-0.076
 Blink-0.076
Token.
Token ID13
Feature activation+7.36e-02

Pos logit contributions
unch+0.177
apping+0.176
owing+0.174
oot+0.173
amb+0.173
Neg logit contributions
 your-0.185
 her-0.177
 our-0.171
 newly-0.171
 same-0.170
Token in
Token ID287
Feature activation-1.10e-01

Pos logit contributions
 your+0.218
 newly+0.217
 Piper+0.215
Scar+0.212
 same+0.208
Neg logit contributions
apping-0.222
owing-0.220
unch-0.218
amb-0.212
oot-0.210
Token the
Token ID262
Feature activation-2.74e-01

Pos logit contributions
Scar+0.087
 Piper+0.081
Mid+0.081
 newly+0.076
Ol+0.073
Neg logit contributions
owing-0.257
apping-0.256
unch-0.252
ement-0.252
aring-0.239
Token morning
Token ID3329
Feature activation+8.90e-02

Pos logit contributions
 Piper+0.353
Scar+0.338
 your+0.335
 newly+0.333
 our+0.319
Neg logit contributions
ement-0.508
apping-0.503
owing-0.483
unch-0.477
amb-0.471
Token.
Token ID13
Feature activation+8.80e-02

Pos logit contributions
unch+0.153
apping+0.152
owing+0.148
oot+0.148
amb+0.147
Neg logit contributions
 your-0.146
 same-0.135
 our-0.134
 newly-0.134
 her-0.134
Token But
Token ID887
Feature activation+2.51e-02

Pos logit contributions
apping+0.172
unch+0.171
amb+0.169
owing+0.167
oot+0.164
Neg logit contributions
 your-0.113
 newly-0.112
 same-0.107
 Piper-0.107
Scar-0.106
Token don
Token ID836
Feature activation-1.56e-02

Pos logit contributions
apping+0.050
unch+0.049
owing+0.049
amb+0.049
oot+0.048
Neg logit contributions
Scar-0.030
Mid-0.028
 Piper-0.028
 newly-0.028
Beck-0.026
Token't
Token ID470
Feature activation-6.92e-02

Pos logit contributions
 newly+0.022
 your+0.022
 same+0.021
 Piper+0.021
Mid+0.021
Neg logit contributions
unch-0.029
oot-0.028
apping-0.028
ement-0.027
amb-0.026
Token worry
Token ID5490
Feature activation+1.12e-01

Pos logit contributions
 your+0.067
 newly+0.065
Scar+0.064
 Piper+0.062
 same+0.061
Neg logit contributions
unch-0.158
apping-0.156
owing-0.156
oot-0.152
amb-0.152
Token,
Token ID11
Feature activation+3.01e-02

Pos logit contributions
apping+0.145
aring+0.144
owing+0.142
unch+0.140
oot+0.137
Neg logit contributions
 newly-0.220
Scar-0.218
 Piper-0.211
Beck-0.208
Ol-0.208
Token he
Token ID339
Feature activation-2.50e-01

Pos logit contributions
unch+0.061
apping+0.061
aring+0.060
owing+0.060
amb+0.060
Neg logit contributions
Scar-0.032
 newly-0.031
Mid-0.030
 Piper-0.030
Beck-0.029
Token will
Token ID481
Feature activation-1.18e-02

Pos logit contributions
 your+0.205
 newly+0.200
 Piper+0.197
 her+0.184
Scar+0.183
Neg logit contributions
owing-0.616
unch-0.595
aring-0.592
amb-0.589
apping-0.589
Token.
Token ID13
Feature activation+7.30e-02

Pos logit contributions
amb+0.335
unch+0.332
oot+0.331
ement+0.328
apping+0.310
Neg logit contributions
 your-0.283
 her-0.275
 she-0.259
 same-0.259
 their-0.258
Token He
Token ID679
Feature activation-1.80e-01

Pos logit contributions
unch+0.162
apping+0.161
owing+0.157
amb+0.155
oot+0.155
Neg logit contributions
 your-0.080
 newly-0.076
 same-0.072
 our-0.072
Scar-0.072
Token was
Token ID373
Feature activation-3.13e-02

Pos logit contributions
 your+0.235
 Piper+0.228
Scar+0.222
 our+0.221
 newly+0.218
Neg logit contributions
unch-0.338
amb-0.336
apping-0.332
oot-0.329
apped-0.326
Token very
Token ID845
Feature activation+2.80e-02

Pos logit contributions
Scar+0.041
 newly+0.039
 your+0.039
 Piper+0.039
Mid+0.038
Neg logit contributions
apping-0.061
unch-0.060
owing-0.059
oot-0.057
apped-0.057
Token happy
Token ID3772
Feature activation+1.11e-01

Pos logit contributions
apping+0.061
unch+0.060
owing+0.059
oot+0.058
amb+0.058
Neg logit contributions
 your-0.032
 newly-0.030
 Piper-0.029
Scar-0.029
 same-0.029
Token and
Token ID290
Feature activation+7.07e-03

Pos logit contributions
ement+0.272
unch+0.263
oot+0.253
owing+0.251
atted+0.249
Neg logit contributions
 your-0.131
 her-0.124
 she-0.119
 the-0.118
 my-0.109
Token ran
Token ID4966
Feature activation+9.27e-02

Pos logit contributions
apping+0.015
unch+0.015
owing+0.015
oot+0.015
amb+0.015
Neg logit contributions
 your-0.008
 newly-0.008
 same-0.007
 Piper-0.007
 our-0.007
Token around
Token ID1088
Feature activation-4.59e-03

Pos logit contributions
apping+0.161
unch+0.161
owing+0.157
oot+0.155
amb+0.154
Neg logit contributions
 your-0.147
 newly-0.139
 same-0.136
 Piper-0.135
 her-0.134
Token looking
Token ID2045
Feature activation+1.15e-01

Pos logit contributions
 your+0.006
 her+0.005
 newly+0.005
 same+0.005
 our+0.005
Neg logit contributions
unch-0.010
apping-0.010
oot-0.010
owing-0.010
amb-0.010
Token for
Token ID329
Feature activation-4.54e-02

Pos logit contributions
apping+0.225
unch+0.221
owing+0.220
amb+0.215
oot+0.214
Neg logit contributions
 your-0.155
 newly-0.147
 Piper-0.146
Scar-0.145
 same-0.142
Token things
Token ID1243
Feature activation-1.37e-03

Pos logit contributions
 your+0.035
 her+0.030
 newly+0.029
 our+0.028
 same+0.028
Neg logit contributions
apping-0.122
unch-0.120
amb-0.119
owing-0.118
oot-0.117
Token.
Token ID13
Feature activation+7.06e-02

Pos logit contributions
unch+0.190
apping+0.190
owing+0.187
oot+0.183
amb+0.182
Neg logit contributions
 your-0.199
 newly-0.197
Scar-0.192
 Piper-0.191
 same-0.189
Token\n
Token ID198
Feature activation-2.04e-02

Pos logit contributions
apping+0.149
unch+0.149
amb+0.144
owing+0.144
oot+0.143
Neg logit contributions
 your-0.083
 newly-0.079
 same-0.076
 our-0.075
Scar-0.074
Token\n
Token ID198
Feature activation-2.02e-02

Pos logit contributions
 your+0.026
 newly+0.024
 her+0.024
 Piper+0.023
 our+0.023
Neg logit contributions
apping-0.042
owing-0.042
unch-0.041
oot-0.040
amb-0.040
TokenJack
Token ID14295
Feature activation-1.01e-01

Pos logit contributions
 your+0.021
 same+0.020
 newly+0.020
 Piper+0.019
 our+0.019
Neg logit contributions
unch-0.046
apping-0.046
amb-0.044
oot-0.044
owing-0.043
Token saw
Token ID2497
Feature activation+4.90e-02

Pos logit contributions
 your+0.123
 newly+0.120
Scar+0.115
 our+0.113
 Piper+0.111
Neg logit contributions
owing-0.204
apping-0.204
amb-0.201
atted-0.200
unch-0.200
Token how
Token ID703
Feature activation-5.17e-02

Pos logit contributions
apping+0.119
unch+0.119
owing+0.117
oot+0.116
ement+0.116
Neg logit contributions
 your-0.040
 newly-0.038
Scar-0.036
 Piper-0.035
 same-0.035
Token troubled
Token ID17840
Feature activation+1.37e-01

Pos logit contributions
Scar+0.055
Mid+0.053
 Piper+0.053
Elsa+0.052
 newly+0.051
Neg logit contributions
unch-0.113
apping-0.113
atted-0.109
owing-0.108
apped-0.107
Token Tom
Token ID4186
Feature activation-1.14e-01

Pos logit contributions
ement+0.292
owing+0.287
apping+0.287
oot+0.286
unch+0.284
Neg logit contributions
Scar-0.143
Mid-0.138
 newly-0.135
Elsa-0.133
 same-0.122
Token was
Token ID373
Feature activation-2.97e-02

Pos logit contributions
 your+0.122
Scar+0.112
 our+0.111
 Piper+0.109
 newly+0.106
Neg logit contributions
owing-0.259
unch-0.255
ement-0.250
apping-0.248
oot-0.241
Token and
Token ID290
Feature activation-2.86e-03

Pos logit contributions
 your+0.048
 newly+0.046
 same+0.045
 Piper+0.045
 our+0.044
Neg logit contributions
apping-0.050
unch-0.050
owing-0.049
oot-0.048
amb-0.048
Token said
Token ID531
Feature activation+9.73e-02

Pos logit contributions
 your+0.004
 newly+0.004
Scar+0.004
 Piper+0.004
 same+0.004
Neg logit contributions
apping-0.005
unch-0.005
owing-0.005
oot-0.005
atted-0.005
Token and
Token ID290
Feature activation+8.59e-03

Pos logit contributions
unch+0.201
apping+0.199
owing+0.195
ement+0.194
oot+0.194
Neg logit contributions
 your-0.157
 her-0.146
 newly-0.143
 same-0.141
 our-0.140
Token making
Token ID1642
Feature activation+5.47e-02

Pos logit contributions
apping+0.016
unch+0.016
owing+0.015
amb+0.015
oot+0.015
Neg logit contributions
 your-0.012
Scar-0.011
 newly-0.011
 Piper-0.011
Mid-0.011
Token new
Token ID649
Feature activation+1.63e-02

Pos logit contributions
unch+0.115
apping+0.115
owing+0.112
oot+0.111
amb+0.110
Neg logit contributions
 your-0.061
 newly-0.060
 Piper-0.059
Scar-0.059
Mid-0.056
Token friends
Token ID2460
Feature activation+1.56e-01

Pos logit contributions
unch+0.034
apping+0.032
oot+0.032
owing+0.032
ement+0.031
Neg logit contributions
 your-0.023
 her-0.021
 same-0.021
 our-0.021
 newly-0.020
Token.
Token ID13
Feature activation+7.79e-02

Pos logit contributions
unch+0.288
ement+0.275
amb+0.270
anged+0.269
oot+0.269
Neg logit contributions
 your-0.253
 her-0.246
 our-0.230
 she-0.227
 their-0.226
Token\n
Token ID198
Feature activation-2.06e-02

Pos logit contributions
owing+0.184
apping+0.182
oot+0.181
unch+0.179
ement+0.175
Neg logit contributions
 your-0.078
 newly-0.072
 same-0.071
 her-0.070
 Blink-0.067
Token\n
Token ID198
Feature activation-1.98e-02

Pos logit contributions
 your+0.020
 her+0.018
 newly+0.018
 our+0.017
 same+0.016
Neg logit contributions
owing-0.049
apping-0.048
amb-0.048
oot-0.048
atted-0.047
TokenOne
Token ID3198
Feature activation+5.31e-02

Pos logit contributions
 your+0.019
 newly+0.017
 Piper+0.017
 our+0.016
 same+0.016
Neg logit contributions
unch-0.047
apping-0.047
owing-0.046
amb-0.046
oot-0.046
Token sunny
Token ID27737
Feature activation+7.83e-02

Pos logit contributions
apping+0.113
unch+0.111
owing+0.109
oot+0.106
amb+0.106
Neg logit contributions
 your-0.061
Scar-0.061
 newly-0.059
 Piper-0.059
 Midnight-0.056
Token day
Token ID1110
Feature activation+1.34e-01

Pos logit contributions
unch+0.188
owing+0.185
apping+0.185
ement+0.183
oot+0.181
Neg logit contributions
 your-0.079
 her-0.070
 same-0.069
 our-0.066
 newly-0.065
Token,
Token ID11
Feature activation+2.91e-02

Pos logit contributions
unch+0.255
apping+0.253
owing+0.251
oot+0.247
amb+0.246
Neg logit contributions
 your-0.196
 her-0.179
 our-0.177
 newly-0.176
 same-0.173
Tokenora
Token ID5799
Feature activation+4.64e-03

Pos logit contributions
unch+0.054
apping+0.053
owing+0.053
ement+0.051
oot+0.051
Neg logit contributions
 your-0.052
 her-0.049
 newly-0.049
 same-0.049
Scar-0.048
Token was
Token ID373
Feature activation-3.04e-02

Pos logit contributions
unch+0.011
apping+0.011
owing+0.011
oot+0.011
ement+0.010
Neg logit contributions
 your-0.004
 her-0.004
 newly-0.004
 has-0.004
 same-0.004
Token mesmer
Token ID46814
Feature activation+1.35e-01

Pos logit contributions
 your+0.033
 her+0.031
 the+0.030
 same+0.030
 my+0.029
Neg logit contributions
unch-0.073
amb-0.072
oot-0.071
ement-0.070
owing-0.068
Tokenized
Token ID1143
Feature activation+5.37e-02

Pos logit contributions
unch+0.188
apping+0.178
owing+0.175
aring+0.169
oot+0.167
Neg logit contributions
 your-0.274
 her-0.263
 newly-0.255
 same-0.255
 my-0.251
Token.
Token ID13
Feature activation+7.44e-02

Pos logit contributions
unch+0.106
apping+0.103
oot+0.098
owing+0.098
amb+0.097
Neg logit contributions
 your-0.079
 her-0.077
 same-0.073
 newly-0.071
 my-0.070
Token She
Token ID1375
Feature activation-1.86e-01

Pos logit contributions
unch+0.192
apping+0.191
owing+0.189
oot+0.187
amb+0.185
Neg logit contributions
 your-0.055
 newly-0.048
 her-0.047
 same-0.047
 Piper-0.047
Token wanted
Token ID2227
Feature activation+1.28e-01

Pos logit contributions
 your+0.187
 newly+0.176
 her+0.171
 has+0.167
 same+0.163
Neg logit contributions
unch-0.443
apping-0.439
owing-0.432
ement-0.420
oot-0.419
Token a
Token ID257
Feature activation-1.67e-01

Pos logit contributions
oot+0.213
apping+0.213
owing+0.212
unch+0.210
amb+0.203
Neg logit contributions
 newly-0.194
 your-0.193
Scar-0.188
 Piper-0.186
Mid-0.186
Token ride
Token ID6594
Feature activation+2.43e-01

Pos logit contributions
 your+0.156
 same+0.141
 our+0.136
 newly+0.135
 my+0.129
Neg logit contributions
unch-0.422
apping-0.415
amb-0.410
owing-0.409
oot-0.405
Token in
Token ID287
Feature activation-1.11e-01

Pos logit contributions
apping+0.427
owing+0.423
oot+0.415
unch+0.413
amb+0.404
Neg logit contributions
 your-0.366
 newly-0.362
 Piper-0.350
Scar-0.349
 same-0.343
Token the
Token ID262
Feature activation-2.76e-01

Pos logit contributions
 your+0.050
 newly+0.046
 Piper+0.045
Scar+0.044
Mid+0.042
Neg logit contributions
apping-0.304
owing-0.302
unch-0.302
oot-0.299
ement-0.299
Token his
Token ID465
Feature activation-2.42e-01

Pos logit contributions
 your+0.007
Scar+0.007
 newly+0.007
Mid+0.006
 same+0.006
Neg logit contributions
unch-0.032
owing-0.032
apping-0.032
ement-0.032
oot-0.032
Token pillow
Token ID28774
Feature activation+1.62e-01

Pos logit contributions
 your+0.286
Scar+0.273
 newly+0.262
 Piper+0.260
 same+0.256
Neg logit contributions
apping-0.500
owing-0.498
unch-0.487
oot-0.483
ement-0.477
Token,
Token ID11
Feature activation+2.36e-02

Pos logit contributions
unch+0.304
apping+0.299
amb+0.292
owing+0.291
ement+0.290
Neg logit contributions
 your-0.229
 her-0.224
 Piper-0.217
 she-0.213
 same-0.209
Token but
Token ID475
Feature activation-4.63e-03

Pos logit contributions
apping+0.040
owing+0.040
unch+0.039
amb+0.038
oot+0.037
Neg logit contributions
 your-0.036
 same-0.034
Scar-0.034
 newly-0.034
 our-0.034
Token all
Token ID477
Feature activation-3.48e-03

Pos logit contributions
Scar+0.005
Mid+0.005
 newly+0.005
 same+0.005
 your+0.005
Neg logit contributions
apping-0.010
apped-0.010
owing-0.010
oot-0.009
ement-0.009
Token he
Token ID339
Feature activation-2.50e-01

Pos logit contributions
 your+0.004
Scar+0.004
 newly+0.003
Mid+0.003
 same+0.003
Neg logit contributions
apping-0.008
owing-0.008
unch-0.008
ement-0.007
oot-0.007
Token saw
Token ID2497
Feature activation+5.49e-02

Pos logit contributions
 your+0.286
 newly+0.264
 same+0.257
 our+0.254
 Piper+0.253
Neg logit contributions
apping-0.543
unch-0.542
owing-0.534
oot-0.526
amb-0.523
Token was
Token ID373
Feature activation-2.75e-02

Pos logit contributions
apping+0.142
unch+0.142
owing+0.140
oot+0.139
ement+0.138
Neg logit contributions
 your-0.038
 newly-0.034
Scar-0.032
 Piper-0.032
 same-0.032
Token the
Token ID262
Feature activation-2.76e-01

Pos logit contributions
 your+0.026
 newly+0.023
 her+0.023
 same+0.022
 our+0.022
Neg logit contributions
unch-0.066
apping-0.066
amb-0.065
owing-0.065
oot-0.064
Token dark
Token ID3223
Feature activation+1.22e-01

Pos logit contributions
 your+0.324
 newly+0.300
 same+0.291
 Piper+0.288
 her+0.288
Neg logit contributions
apping-0.589
unch-0.588
owing-0.577
oot-0.569
amb-0.568
Token night
Token ID1755
Feature activation+1.30e-01

Pos logit contributions
unch+0.230
apping+0.226
amb+0.221
owing+0.220
oot+0.220
Neg logit contributions
 your-0.182
 same-0.171
 newly-0.170
 her-0.169
 our-0.165
Token hair
Token ID4190
Feature activation+2.44e-01

Pos logit contributions
unch+0.014
apping+0.014
owing+0.014
amb+0.013
oot+0.013
Neg logit contributions
 your-0.015
 newly-0.014
 her-0.014
 same-0.014
 our-0.013
Token and
Token ID290
Feature activation+6.40e-03

Pos logit contributions
unch+0.491
apping+0.483
owing+0.480
amb+0.472
oot+0.461
Neg logit contributions
 your-0.336
 her-0.322
 has-0.313
 same-0.309
 newly-0.308
Token a
Token ID257
Feature activation-1.70e-01

Pos logit contributions
apping+0.013
apped+0.013
oot+0.013
unch+0.013
ement+0.013
Neg logit contributions
 Piper-0.006
 your-0.006
Mid-0.006
Scar-0.006
 newly-0.006
Token dress
Token ID6576
Feature activation+2.70e-01

Pos logit contributions
 your+0.185
 newly+0.171
 same+0.167
 her+0.164
 our+0.161
Neg logit contributions
unch-0.383
owing-0.382
apping-0.381
oot-0.373
amb-0.371
Token.
Token ID13
Feature activation+8.12e-02

Pos logit contributions
unch+0.524
owing+0.496
apping+0.493
amb+0.491
aring+0.470
Neg logit contributions
 her-0.412
 your-0.402
 she-0.394
 has-0.388
 same-0.382
Token It
Token ID632
Feature activation-1.28e-01

Pos logit contributions
apping+0.183
unch+0.183
oot+0.176
amb+0.175
apped+0.174
Neg logit contributions
 your-0.083
 newly-0.080
 same-0.076
 our-0.076
 Piper-0.074
Token can
Token ID460
Feature activation+1.14e-02

Pos logit contributions
 Piper+0.123
Mid+0.113
Scar+0.108
 your+0.108
 Midnight+0.107
Neg logit contributions
ement-0.273
apping-0.271
apped-0.269
amb-0.268
aps-0.268
Token talk
Token ID1561
Feature activation+1.89e-01

Pos logit contributions
unch+0.024
apping+0.024
oot+0.024
amb+0.023
owing+0.023
Neg logit contributions
 your-0.012
 Piper-0.012
Scar-0.012
 newly-0.012
 our-0.011
Token and
Token ID290
Feature activation+6.83e-03

Pos logit contributions
unch+0.328
apping+0.324
owing+0.317
amb+0.312
oot+0.311
Neg logit contributions
 your-0.309
 newly-0.286
 her-0.286
 same-0.284
 our-0.282
Token sing
Token ID1702
Feature activation+2.01e-01

Pos logit contributions
apping+0.014
unch+0.014
owing+0.014
oot+0.014
amb+0.014
Neg logit contributions
 your-0.008
 newly-0.007
 Piper-0.007
Scar-0.007
 same-0.007
Token.
Token ID13
Feature activation+8.22e-02

Pos logit contributions
unch+0.345
apping+0.335
owing+0.331
amb+0.325
oot+0.323
Neg logit contributions
 your-0.337
 her-0.317
 newly-0.311
 same-0.309
 our-0.309
Token the
Token ID262
Feature activation-2.77e-01

Pos logit contributions
chy+0.109
Mid+0.099
Beck+0.089
Ruby+0.089
 has+0.086
Neg logit contributions
unch-0.182
 sur-0.180
oot-0.177
amb-0.175
apping-0.174
Token park
Token ID3952
Feature activation+2.08e-01

Pos logit contributions
 your+0.314
 Piper+0.293
 newly+0.292
Scar+0.286
 her+0.281
Neg logit contributions
unch-0.592
apping-0.590
ement-0.568
owing-0.567
oot-0.565
Token,
Token ID11
Feature activation+2.91e-02

Pos logit contributions
apping+0.336
owing+0.325
unch+0.323
oot+0.316
apped+0.312
Neg logit contributions
 your-0.348
 newly-0.343
Scar-0.339
Mid-0.339
 Piper-0.334
Token Jerry
Token ID13075
Feature activation-1.04e-01

Pos logit contributions
apped+0.054
apping+0.052
unch+0.052
overed+0.050
ement+0.050
Neg logit contributions
Elsa-0.038
Mid-0.037
chy-0.037
 newly-0.037
Scar-0.036
Token saw
Token ID2497
Feature activation+5.26e-02

Pos logit contributions
 your+0.111
 newly+0.102
 our+0.099
 same+0.099
 Piper+0.096
Neg logit contributions
unch-0.231
apping-0.231
owing-0.227
amb-0.224
oot-0.223
Token a
Token ID257
Feature activation-1.69e-01

Pos logit contributions
unch+0.128
apping+0.127
ement+0.125
oot+0.124
apped+0.124
Neg logit contributions
 newly-0.039
 your-0.038
Scar-0.038
Mid-0.037
 Piper-0.036
Token big
Token ID1263
Feature activation-9.84e-03

Pos logit contributions
 your+0.153
 newly+0.138
 same+0.133
 Piper+0.131
 her+0.130
Neg logit contributions
apping-0.406
unch-0.406
owing-0.399
oot-0.395
amb-0.394
Token tree
Token ID5509
Feature activation+1.49e-01

Pos logit contributions
 your+0.013
 newly+0.012
 same+0.012
 our+0.011
 her+0.011
Neg logit contributions
apping-0.020
unch-0.020
owing-0.020
amb-0.020
oot-0.020
Token made
Token ID925
Feature activation+4.56e-02

Pos logit contributions
amb+0.292
unch+0.280
oot+0.279
atted+0.276
ement+0.274
Neg logit contributions
 she-0.207
 your-0.207
 her-0.204
 has-0.201
 the-0.195
Token of
Token ID286
Feature activation-7.00e-02

Pos logit contributions
amb+0.068
apping+0.066
owing+0.065
unch+0.064
oot+0.064
Neg logit contributions
 your-0.084
 her-0.084
 same-0.079
 newly-0.078
 their-0.078
Token wood
Token ID4898
Feature activation+9.86e-02

Pos logit contributions
Scar+0.075
 your+0.074
Mid+0.072
 newly+0.071
Elsa+0.069
Neg logit contributions
unch-0.148
apping-0.141
ement-0.138
oot-0.138
owing-0.137
Token a
Token ID257
Feature activation-1.69e-01

Pos logit contributions
apping+0.201
unch+0.190
owing+0.190
apped+0.179
atted+0.175
Neg logit contributions
Scar-0.158
 newly-0.149
 our-0.149
 your-0.148
 Piper-0.147
Token bit
Token ID1643
Feature activation+1.98e-01

Pos logit contributions
 your+0.167
 newly+0.155
 same+0.150
 Piper+0.146
 her+0.146
Neg logit contributions
apping-0.393
unch-0.392
owing-0.386
oot-0.383
amb-0.381
Token naughty
Token ID45128
Feature activation+7.36e-02

Pos logit contributions
amb+0.453
unch+0.453
oot+0.453
apping+0.444
owing+0.441
Neg logit contributions
 your-0.213
 same-0.197
 her-0.192
 newly-0.183
 Piper-0.183
Token.
Token ID13
Feature activation+7.11e-02

Pos logit contributions
unch+0.144
apping+0.142
apped+0.139
amb+0.137
oot+0.136
Neg logit contributions
 your-0.106
 her-0.103
 she-0.098
 same-0.097
 the-0.095
Token So
Token ID1406
Feature activation+7.50e-02

Pos logit contributions
unch+0.172
apping+0.170
owing+0.168
oot+0.167
amb+0.166
Neg logit contributions
 your-0.067
 same-0.061
 newly-0.061
 her-0.060
 Piper-0.059
Token he
Token ID339
Feature activation-2.53e-01

Pos logit contributions
apping+0.147
ement+0.145
owing+0.145
oot+0.143
apped+0.142
Neg logit contributions
Scar-0.093
Mid-0.088
 newly-0.087
 your-0.087
 same-0.083
Token hopped
Token ID45230
Feature activation+1.02e-01

Pos logit contributions
 your+0.346
 newly+0.332
 our+0.317
 same+0.316
Scar+0.309
Neg logit contributions
apping-0.483
unch-0.482
amb-0.478
owing-0.475
oot-0.472
Token up
Token ID510
Feature activation-5.59e-02

Pos logit contributions
amb+0.163
apping+0.163
oot+0.159
owing+0.158
unch+0.157
Neg logit contributions
 your-0.167
 newly-0.166
Scar-0.158
 our-0.158
 Piper-0.156
Token onto
Token ID4291
Feature activation-2.83e-02

Pos logit contributions
 your+0.075
 newly+0.071
 same+0.069
 Piper+0.069
 our+0.068
Neg logit contributions
apping-0.109
unch-0.109
owing-0.107
oot-0.106
amb-0.105
Token the
Token ID262
Feature activation-2.75e-01

Pos logit contributions
Scar+0.016
 has+0.016
 your+0.016
 newly+0.016
Mid+0.014
Neg logit contributions
unch-0.071
oot-0.070
apping-0.069
owing-0.069
ement-0.069
Token bed
Token ID3996
Feature activation+2.10e-01

Pos logit contributions
 your+0.333
Scar+0.316
 Piper+0.314
 our+0.313
 newly+0.297
Neg logit contributions
owing-0.547
apping-0.542
unch-0.536
amb-0.527
ement-0.521