TOP ACTIVATIONS
MAX = 3.765

very happy . From that day on , Sue knew she
. From that day on , Sue and Bob
fun together . From that day on , Lily practiced singing
much . The two friends were happy to enjoy
remaining jam . From that day on , Lily made sure
with Fred . From that day on , Tim felt strong
helped him . From that day on , Billy and Harry
And from that day on , every time Tim
gray cloth . From that day onwards , Sandra always enjoyed
both laughed . From that day on , Lily searched for
surprise the farmer . The two were so happy that they
having fun . From that day on , they made sure
of fun . From that day on , they became best
be a good girl from now on . She even asked
cher ries . From that day on , they knew they
!" And from then on , every day Steve
and family . From that day on , Lily always made
clean too . From that day on , Sally and her
pretty kitten . From that day on , Lily made sure
. And from then on , Carl enjoyed playing

MIN ACTIVATIONS
MIN = -3.206

" That is a nice star , Tom . But
. The smart oct opus was very happy
it and decided to share the toy . They took turns
They wish they had shared the sock and played nicely .
and gave it a good tidy . The nap
Everyone loved the little boy 's stories about the
ara , I give you the power to choose any special
. I should have shared the shell with you ," Lily
it . The little girl was very excited .
didn 't want to give the sheet to Nem o and
little boy carrying a box filled with empty boxes , wood
They were happy to share the bath robe and be best
and decided not to take the toy . L
Max was happy to share the bath tub with Lily .
" The old bear was so shocked ,
owl and decided to share the metal toy . They played
mom smiled and explained that the jelly fish was so special
. They decide to share the cr ay ons . They
inside . The little girl was excited to see
that Jimmy didn 't share the toy with him , but

INTERVAL 2.023 - 3.765
CONTAINS 0.212%

the barbecue . From that day on , Lily always helped
From that day on , Lily always made
. <|endoftext|> Ben and Lily were playing in the park .
From that day on , Tim my wore
things he could observe in the field . He walked around

INTERVAL 0.280 - 2.023
CONTAINS 30.814%

always careful not to hurt each other . In the end
Mom said , " It
was definitely coming from the chim ney ! G ently she
to be animals , or pirates , or superheroes . They
. One day , Sue saw a beautiful rainbow

INTERVAL -1.463 - 0.280
CONTAINS 68.162%

toast , but only a little bit . She prefers to
carefully , not wanting it to break or get lost .
fur . While walking , he saw a big red ball
amazed . She told Mia that the tooth must belong to
she cared . <|endoftext|> M ummy and Daddy were very excited

INTERVAL -3.206 - -1.463
CONTAINS 0.812%

. Suddenly , it began to move ! He was so
what 's up there ," Sara said . "
big oct opus ! The little fish was very scared but
hug , picked up all the mar bles and put them
a lesson . She learned that sharing is better than hiding

Token very
Token ID845
Feature activation-3.97e-01

Pos logit contributions
ro+0.074
onna+0.061
emo+0.061
uzz+0.060
 cheek+0.058
Neg logit contributions
aw-0.050
 Other-0.043
 forward-0.041
ifying-0.040
 Leave-0.040
Token happy
Token ID3772
Feature activation-2.17e-01

Pos logit contributions
ro+0.739
onna+0.588
emo+0.572
uzz+0.572
 cheek+0.552
Neg logit contributions
aw-0.681
 Other-0.601
 forward-0.588
 middle-0.565
 Leave-0.563
Token.
Token ID13
Feature activation+1.79e+00

Pos logit contributions
ro+0.414
onna+0.337
emo+0.331
uzz+0.328
 cheek+0.319
Neg logit contributions
aw-0.362
 Other-0.315
 forward-0.300
 Leave-0.296
ifying-0.293
Token From
Token ID3574
Feature activation+3.57e-02

Pos logit contributions
aw+3.380
ifying+3.085
 forward+3.031
 particularly+2.956
 Other+2.951
Neg logit contributions
ro-2.897
uzz-2.327
onna-2.223
emo-2.194
water-2.031
Token that
Token ID326
Feature activation+1.18e-01

Pos logit contributions
aw+0.045
 Other+0.036
 forward+0.034
ifying+0.034
 Leave+0.034
Neg logit contributions
ro-0.083
onna-0.070
uzz-0.069
emo-0.069
 cheek-0.067
Token day
Token ID1110
Feature activation+3.77e+00

Pos logit contributions
aw+0.232
 Other+0.212
 forward+0.203
 Leave+0.201
 middle+0.201
Neg logit contributions
ro-0.184
onna-0.142
uzz-0.139
emo-0.139
 cheek-0.132
Token on
Token ID319
Feature activation+2.05e+00

Pos logit contributions
aw+6.231
 Other+5.424
opl+5.415
pieces+5.405
 pasture+5.178
Neg logit contributions
ro-6.898
uzz-5.977
 seek-5.892
 moment-5.630
 cheek-5.603
Token,
Token ID11
Feature activation+1.99e+00

Pos logit contributions
aw+3.952
ifying+3.464
 Other+3.461
 forward+3.315
 Leave+3.310
Neg logit contributions
ro-3.452
onna-2.738
 seek-2.664
uzz-2.595
emo-2.591
Token Sue
Token ID26089
Feature activation+4.64e-01

Pos logit contributions
aw+4.670
ifying+4.398
 Other+4.197
 Leave+4.180
opl+4.157
Neg logit contributions
ro-2.354
uzz-1.686
 seek-1.610
emo-1.607
onna-1.603
Token knew
Token ID2993
Feature activation+2.24e-01

Pos logit contributions
aw+0.729
 Other+0.640
ifying+0.606
 Leave+0.602
 forward+0.601
Neg logit contributions
ro-0.906
onna-0.740
uzz-0.727
emo-0.725
 cheek-0.705
Token she
Token ID673
Feature activation+6.14e-01

Pos logit contributions
aw+0.302
 Other+0.257
 Leave+0.239
 forward+0.239
ifying+0.238
Neg logit contributions
ro-0.495
onna-0.413
uzz-0.407
emo-0.406
 cheek-0.398
Token.
Token ID13
Feature activation+5.42e-01

Pos logit contributions
ro+0.971
onna+0.790
emo+0.787
uzz+0.785
 cheek+0.743
Neg logit contributions
aw-0.796
 Other-0.694
 forward-0.663
 Leave-0.650
 middle-0.644
Token \n
Token ID198
Feature activation-1.04e-01

Pos logit contributions
aw+0.872
 Other+0.763
ifying+0.731
 forward+0.726
 Leave+0.722
Neg logit contributions
ro-1.040
onna-0.849
uzz-0.838
emo-0.831
 cheek-0.806
Token \n
Token ID198
Feature activation+8.45e-01

Pos logit contributions
ro+0.214
onna+0.175
uzz+0.171
emo+0.171
 cheek+0.170
Neg logit contributions
aw-0.154
 Other-0.134
 Leave-0.124
 middle-0.123
 forward-0.123
TokenFrom
Token ID4863
Feature activation-7.50e-01

Pos logit contributions
aw+1.273
 Other+1.123
ifying+1.083
 forward+1.061
 Leave+1.055
Neg logit contributions
ro-1.700
onna-1.394
uzz-1.385
emo-1.378
 seek-1.340
Token that
Token ID326
Feature activation-3.67e-01

Pos logit contributions
ro+1.864
onna+1.621
uzz+1.570
emo+1.553
 cheek+1.511
Neg logit contributions
aw-0.880
 Other-0.632
 forward-0.626
 middle-0.617
 Leave-0.584
Token day
Token ID1110
Feature activation+3.62e+00

Pos logit contributions
ro+0.616
onna+0.488
uzz+0.484
emo+0.475
 cheek+0.448
Neg logit contributions
aw-0.676
 Other-0.628
 forward-0.608
 middle-0.596
 Leave-0.583
Token on
Token ID319
Feature activation+2.03e+00

Pos logit contributions
aw+5.747
pieces+5.384
 Other+5.138
ifying+5.037
opl+4.898
Neg logit contributions
ro-6.965
uzz-5.826
 seek-5.694
 cheek-5.653
emo-5.407
Token,
Token ID11
Feature activation+1.29e+00

Pos logit contributions
aw+3.955
ifying+3.509
 Other+3.444
pieces+3.379
 Leave+3.273
Neg logit contributions
ro-3.424
onna-2.693
 seek-2.601
uzz-2.558
emo-2.518
Token Sue
Token ID26089
Feature activation+1.82e-01

Pos logit contributions
aw+2.875
ifying+2.623
 Other+2.594
 Leave+2.560
 forward+2.521
Neg logit contributions
ro-1.689
onna-1.199
uzz-1.194
emo-1.185
 seek-1.153
Token and
Token ID290
Feature activation+9.78e-01

Pos logit contributions
aw+0.288
 Other+0.251
 Leave+0.238
 forward+0.236
ifying+0.234
Neg logit contributions
ro-0.355
onna-0.291
uzz-0.287
emo-0.284
 cheek-0.280
Token Bob
Token ID5811
Feature activation+4.19e-01

Pos logit contributions
aw+1.969
 Other+1.782
ifying+1.768
 Leave+1.714
 forward+1.702
Neg logit contributions
ro-1.477
onna-1.132
uzz-1.128
emo-1.111
 cheek-1.067
Token fun
Token ID1257
Feature activation+6.38e-01

Pos logit contributions
ro+0.573
onna+0.471
uzz+0.461
emo+0.457
 cheek+0.452
Neg logit contributions
aw-0.476
 Other-0.415
 Leave-0.386
 middle-0.381
 forward-0.379
Token together
Token ID1978
Feature activation+7.03e-01

Pos logit contributions
aw+1.132
 Other+1.012
ifying+0.971
 Leave+0.965
 forward+0.960
Neg logit contributions
ro-1.124
onna-0.886
uzz-0.872
emo-0.867
 cheek-0.847
Token.
Token ID13
Feature activation+7.28e-01

Pos logit contributions
aw+1.246
 Other+1.120
 Leave+1.067
ifying+1.063
 forward+1.048
Neg logit contributions
ro-1.240
onna-0.985
uzz-0.967
emo-0.955
 cheek-0.933
Token From
Token ID3574
Feature activation-3.02e-01

Pos logit contributions
aw+1.233
 Other+1.068
ifying+1.031
 Leave+1.022
 forward+1.022
Neg logit contributions
ro-1.352
onna-1.091
uzz-1.076
emo-1.072
 cheek-1.029
Token that
Token ID326
Feature activation-3.57e-01

Pos logit contributions
ro+0.723
onna+0.621
uzz+0.605
emo+0.602
 cheek+0.590
Neg logit contributions
aw-0.368
 Other-0.285
 forward-0.271
 middle-0.267
 Leave-0.264
Token day
Token ID1110
Feature activation+3.60e+00

Pos logit contributions
ro+0.596
onna+0.473
uzz+0.469
emo+0.460
 cheek+0.433
Neg logit contributions
aw-0.663
 Other-0.616
 forward-0.591
 middle-0.582
 Leave-0.574
Token on
Token ID319
Feature activation+1.68e+00

Pos logit contributions
aw+5.760
 Other+5.083
 pasture+5.065
pieces+5.063
opl+4.890
Neg logit contributions
ro-6.811
uzz-5.562
 seek-5.514
 cheek-5.450
 L-5.363
Token,
Token ID11
Feature activation+4.07e-01

Pos logit contributions
aw+3.066
 Other+2.676
ifying+2.646
 forward+2.572
 Leave+2.559
Neg logit contributions
ro-2.982
onna-2.345
emo-2.281
 seek-2.262
uzz-2.199
Token Lily
Token ID20037
Feature activation+2.10e-01

Pos logit contributions
aw+1.025
 Other+0.944
 Leave+0.913
ifying+0.912
 forward+0.910
Neg logit contributions
ro-0.418
onna-0.272
uzz-0.261
emo-0.259
 cheek-0.243
Token practiced
Token ID19893
Feature activation+1.33e-01

Pos logit contributions
aw+0.331
 Other+0.289
 Leave+0.273
 forward+0.272
ifying+0.270
Neg logit contributions
ro-0.408
onna-0.335
uzz-0.330
emo-0.327
 cheek-0.321
Token singing
Token ID13777
Feature activation+1.62e-01

Pos logit contributions
aw+0.213
 Other+0.188
 forward+0.178
 Leave+0.176
ifying+0.174
Neg logit contributions
ro-0.254
onna-0.208
uzz-0.205
emo-0.204
 cheek-0.199
Token much
Token ID881
Feature activation+2.11e-01

Pos logit contributions
ro+0.261
onna+0.213
uzz+0.210
emo+0.207
 cheek+0.204
Neg logit contributions
aw-0.201
 Other-0.176
 forward-0.165
 middle-0.164
 Leave-0.163
Token.
Token ID13
Feature activation+3.97e-01

Pos logit contributions
aw+0.359
 Other+0.316
 Leave+0.300
 forward+0.299
ifying+0.297
Neg logit contributions
ro-0.390
onna-0.315
uzz-0.312
emo-0.310
 cheek-0.300
Token \n
Token ID198
Feature activation-5.13e-03

Pos logit contributions
aw+0.630
 Other+0.552
ifying+0.522
 Leave+0.521
 forward+0.520
Neg logit contributions
ro-0.774
onna-0.633
uzz-0.622
emo-0.619
 cheek-0.604
Token \n
Token ID198
Feature activation+1.09e+00

Pos logit contributions
ro+0.010
onna+0.008
uzz+0.008
emo+0.008
 cheek+0.008
Neg logit contributions
aw-0.008
 Other-0.007
 Leave-0.007
 forward-0.007
 middle-0.007
TokenThe
Token ID464
Feature activation+1.87e+00

Pos logit contributions
aw+1.931
 Other+1.720
ifying+1.704
 Leave+1.661
 forward+1.660
Neg logit contributions
ro-1.923
onna-1.506
 seek-1.492
uzz-1.468
emo-1.468
Token two
Token ID734
Feature activation+3.58e+00

Pos logit contributions
aw+3.249
ifying+2.871
 Leave+2.829
 Other+2.792
pieces+2.665
Neg logit contributions
ro-3.294
 cheek-2.836
emo-2.728
 seek-2.657
 moment-2.582
Token friends
Token ID2460
Feature activation+4.41e-01

Pos logit contributions
ifying+7.293
aw+6.596
opl+6.575
 Leave+6.513
 Other+6.387
Neg logit contributions
ro-4.868
 textures-4.205
guard-4.156
emo-4.084
 cheek-3.973
Token were
Token ID547
Feature activation+1.05e+00

Pos logit contributions
aw+0.665
 Other+0.581
ifying+0.547
 Leave+0.545
 forward+0.543
Neg logit contributions
ro-0.890
onna-0.732
uzz-0.723
emo-0.718
 cheek-0.701
Token happy
Token ID3772
Feature activation-7.06e-01

Pos logit contributions
aw+1.856
 Other+1.653
ifying+1.634
 Leave+1.580
 middle+1.542
Neg logit contributions
ro-1.813
uzz-1.467
onna-1.452
emo-1.449
 cheek-1.406
Token to
Token ID284
Feature activation+1.22e+00

Pos logit contributions
ro+1.361
onna+1.103
emo+1.066
uzz+1.048
 cheek+1.048
Neg logit contributions
aw-1.213
 Other-1.036
 forward-0.985
 Leave-0.971
 middle-0.944
Token enjoy
Token ID2883
Feature activation+1.29e+00

Pos logit contributions
aw+2.170
 Other+1.851
ifying+1.811
 forward+1.775
 Leave+1.719
Neg logit contributions
ro-2.262
onna-1.826
uzz-1.784
emo-1.762
 seek-1.753
Token remaining
Token ID5637
Feature activation+5.57e-01

Pos logit contributions
aw+0.354
 Other+0.315
ifying+0.300
 forward+0.299
 Leave+0.299
Neg logit contributions
ro-0.349
onna-0.279
uzz-0.275
emo-0.273
 cheek-0.263
Token jam
Token ID16853
Feature activation+3.86e-01

Pos logit contributions
aw+0.950
 Other+0.849
 Leave+0.807
ifying+0.802
 forward+0.799
Neg logit contributions
ro-1.013
onna-0.816
emo-0.796
uzz-0.796
 cheek-0.778
Token.
Token ID13
Feature activation+4.87e-01

Pos logit contributions
aw+0.659
 Other+0.585
 Leave+0.554
ifying+0.553
 forward+0.551
Neg logit contributions
ro-0.699
onna-0.562
uzz-0.552
emo-0.549
 cheek-0.535
Token From
Token ID3574
Feature activation+1.36e-01

Pos logit contributions
aw+0.794
 Other+0.697
ifying+0.662
 Leave+0.660
 forward+0.660
Neg logit contributions
ro-0.925
onna-0.753
uzz-0.740
emo-0.737
 cheek-0.717
Token that
Token ID326
Feature activation-1.56e-01

Pos logit contributions
aw+0.170
 Other+0.140
 forward+0.130
ifying+0.130
 Leave+0.130
Neg logit contributions
ro-0.314
onna-0.265
uzz-0.261
emo-0.260
 cheek-0.254
Token day
Token ID1110
Feature activation+3.55e+00

Pos logit contributions
ro+0.253
onna+0.196
uzz+0.195
emo+0.192
 cheek+0.183
Neg logit contributions
aw-0.298
 Other-0.273
 forward-0.263
 middle-0.259
 Leave-0.257
Token on
Token ID319
Feature activation+1.35e+00

Pos logit contributions
aw+5.591
 Other+4.872
pieces+4.751
 pasture+4.634
ifying+4.573
Neg logit contributions
ro-6.999
 cheek-5.797
 seek-5.731
uzz-5.601
onna-5.577
Token,
Token ID11
Feature activation+9.31e-01

Pos logit contributions
aw+2.370
 Other+2.062
ifying+1.995
 forward+1.953
 Leave+1.950
Neg logit contributions
ro-2.491
onna-2.017
emo-1.932
uzz-1.913
 seek-1.909
Token Lily
Token ID20037
Feature activation+7.17e-01

Pos logit contributions
aw+2.289
 Other+2.086
ifying+2.043
 Leave+2.035
 forward+2.012
Neg logit contributions
ro-1.008
onna-0.679
emo-0.649
uzz-0.638
 cheek-0.601
Token made
Token ID925
Feature activation+4.34e-01

Pos logit contributions
aw+1.135
 Other+1.005
ifying+0.966
 Leave+0.943
 forward+0.942
Neg logit contributions
ro-1.392
onna-1.133
emo-1.114
uzz-1.103
 cheek-1.077
Token sure
Token ID1654
Feature activation+5.13e-02

Pos logit contributions
aw+0.802
 Other+0.719
ifying+0.688
 Leave+0.686
 forward+0.683
Neg logit contributions
ro-0.727
onna-0.573
uzz-0.562
emo-0.560
 cheek-0.543
Token with
Token ID351
Feature activation+5.95e-01

Pos logit contributions
ro+0.652
emo+0.538
onna+0.536
uzz+0.533
 cheek+0.512
Neg logit contributions
aw-0.555
 Other-0.480
 forward-0.472
ifying-0.460
 middle-0.452
Token Fred
Token ID8559
Feature activation-1.63e-01

Pos logit contributions
aw+1.000
 Other+0.883
 Leave+0.840
ifying+0.838
 forward+0.833
Neg logit contributions
ro-1.100
onna-0.887
uzz-0.876
emo-0.867
 cheek-0.844
Token.
Token ID13
Feature activation+7.94e-01

Pos logit contributions
ro+0.312
onna+0.256
emo+0.251
uzz+0.250
 cheek+0.243
Neg logit contributions
aw-0.261
 Other-0.227
 forward-0.219
ifying-0.216
 Leave-0.215
Token From
Token ID3574
Feature activation+2.22e-01

Pos logit contributions
aw+1.293
 Other+1.120
ifying+1.088
 forward+1.076
 Leave+1.066
Neg logit contributions
ro-1.511
onna-1.229
uzz-1.214
emo-1.210
 cheek-1.156
Token that
Token ID326
Feature activation-2.78e-01

Pos logit contributions
aw+0.282
 Other+0.235
 forward+0.218
 Leave+0.218
ifying+0.218
Neg logit contributions
ro-0.507
onna-0.428
uzz-0.421
emo-0.420
 cheek-0.411
Token day
Token ID1110
Feature activation+3.51e+00

Pos logit contributions
ro+0.454
onna+0.352
uzz+0.351
emo+0.346
 cheek+0.329
Neg logit contributions
aw-0.524
 Other-0.485
 forward-0.468
 middle-0.458
 Leave-0.453
Token on
Token ID319
Feature activation+2.07e+00

Pos logit contributions
aw+5.408
ifying+4.798
pieces+4.699
 Other+4.687
 pasture+4.581
Neg logit contributions
ro-7.055
 seek-6.172
emo-5.874
uzz-5.691
 cheek-5.659
Token,
Token ID11
Feature activation+1.36e+00

Pos logit contributions
aw+3.925
ifying+3.405
 Other+3.389
pieces+3.265
 Leave+3.259
Neg logit contributions
ro-3.572
onna-2.840
emo-2.811
 seek-2.799
uzz-2.699
Token Tim
Token ID5045
Feature activation+4.23e-01

Pos logit contributions
aw+2.709
ifying+2.405
 Other+2.393
 Leave+2.331
 middle+2.283
Neg logit contributions
ro-2.129
emo-1.665
onna-1.634
uzz-1.602
 seek-1.589
Token felt
Token ID2936
Feature activation+1.40e+00

Pos logit contributions
aw+0.755
 Other+0.674
ifying+0.641
 Leave+0.640
 forward+0.637
Neg logit contributions
ro-0.738
onna-0.586
uzz-0.575
emo-0.573
 cheek-0.556
Token strong
Token ID1913
Feature activation+8.81e-01

Pos logit contributions
aw+2.382
 Other+2.114
ifying+2.067
 Leave+2.002
opl+1.968
Neg logit contributions
ro-2.560
emo-2.066
onna-2.065
uzz-2.006
 seek-1.963
Token helped
Token ID4193
Feature activation+8.31e-01

Pos logit contributions
aw+0.288
 Other+0.251
 Leave+0.238
 forward+0.237
ifying+0.237
Neg logit contributions
ro-0.347
onna-0.282
uzz-0.278
emo-0.276
 cheek-0.270
Token him
Token ID683
Feature activation+3.63e-02

Pos logit contributions
aw+1.485
 Other+1.310
ifying+1.258
 Leave+1.250
 forward+1.225
Neg logit contributions
ro-1.467
onna-1.167
emo-1.150
uzz-1.145
 cheek-1.112
Token.
Token ID13
Feature activation+4.68e-01

Pos logit contributions
aw+0.067
 Other+0.060
 forward+0.057
 Leave+0.057
ifying+0.057
Neg logit contributions
ro-0.060
onna-0.047
uzz-0.047
emo-0.046
 cheek-0.045
Token From
Token ID3574
Feature activation+1.84e-01

Pos logit contributions
aw+0.714
 Other+0.619
ifying+0.585
 Leave+0.584
 forward+0.583
Neg logit contributions
ro-0.937
onna-0.770
uzz-0.759
emo-0.757
 cheek-0.735
Token that
Token ID326
Feature activation-1.82e-01

Pos logit contributions
aw+0.232
 Other+0.193
 forward+0.179
ifying+0.179
 Leave+0.179
Neg logit contributions
ro-0.422
onna-0.357
uzz-0.351
emo-0.350
 cheek-0.343
Token day
Token ID1110
Feature activation+3.49e+00

Pos logit contributions
ro+0.297
onna+0.229
uzz+0.228
emo+0.223
 cheek+0.213
Neg logit contributions
aw-0.347
 Other-0.320
 forward-0.310
 middle-0.303
 Leave-0.300
Token on
Token ID319
Feature activation+1.97e+00

Pos logit contributions
aw+5.279
 Other+4.532
pieces+4.373
 pasture+4.279
ifying+4.175
Neg logit contributions
ro-7.162
emo-6.092
 seek-6.069
 cheek-5.887
 moment-5.714
Token,
Token ID11
Feature activation+1.48e+00

Pos logit contributions
aw+3.716
 Other+3.210
ifying+3.130
 Leave+3.063
pieces+3.051
Neg logit contributions
ro-3.405
emo-2.691
onna-2.685
 seek-2.650
 common-2.567
Token Billy
Token ID15890
Feature activation-2.47e-04

Pos logit contributions
aw+3.141
 Other+2.786
ifying+2.746
 Leave+2.723
 middle+2.664
Neg logit contributions
ro-2.093
emo-1.642
onna-1.548
 seek-1.523
uzz-1.501
Token and
Token ID290
Feature activation+8.73e-02

Pos logit contributions
ro+0.001
onna+0.000
uzz+0.000
emo+0.000
 cheek+0.000
Neg logit contributions
aw-0.000
 Other-0.000
 Leave-0.000
 forward-0.000
ifying-0.000
Token Harry
Token ID5850
Feature activation+4.63e-01

Pos logit contributions
aw+0.159
 Other+0.143
 forward+0.135
 Leave+0.135
ifying+0.134
Neg logit contributions
ro-0.150
onna-0.119
uzz-0.116
emo-0.115
 cheek-0.113
Token \n
Token ID198
Feature activation+2.76e-01

Pos logit contributions
aw+1.517
 Other+1.308
ifying+1.292
 forward+1.277
 Leave+1.259
Neg logit contributions
ro-1.673
onna-1.349
uzz-1.347
emo-1.343
 cheek-1.262
Token \n
Token ID198
Feature activation+7.98e-01

Pos logit contributions
aw+0.419
 Other+0.365
 Leave+0.343
 forward+0.341
ifying+0.341
Neg logit contributions
ro-0.554
onna-0.455
uzz-0.448
emo-0.445
 cheek-0.437
TokenAnd
Token ID1870
Feature activation+6.96e-02

Pos logit contributions
aw+1.144
 Other+0.993
ifying+0.946
 forward+0.936
 Leave+0.932
Neg logit contributions
ro-1.664
onna-1.377
emo-1.369
uzz-1.369
 seek-1.324
Token from
Token ID422
Feature activation+7.72e-01

Pos logit contributions
aw+0.117
 Other+0.103
 forward+0.098
 Leave+0.098
ifying+0.097
Neg logit contributions
ro-0.130
onna-0.105
uzz-0.102
emo-0.102
 cheek-0.100
Token that
Token ID326
Feature activation+1.69e-01

Pos logit contributions
aw+1.052
 Other+0.918
ifying+0.861
 Leave+0.857
 forward+0.847
Neg logit contributions
ro-1.657
onna-1.378
uzz-1.378
emo-1.369
 cheek-1.336
Token day
Token ID1110
Feature activation+3.47e+00

Pos logit contributions
aw+0.338
 Other+0.307
 forward+0.293
 Leave+0.291
 middle+0.290
Neg logit contributions
ro-0.262
onna-0.202
uzz-0.197
emo-0.195
 cheek-0.188
Token on
Token ID319
Feature activation+1.61e+00

Pos logit contributions
aw+5.017
pieces+4.688
 Other+4.547
opl+4.461
 pasture+4.302
Neg logit contributions
ro-6.717
uzz-5.923
emo-5.651
 seek-5.490
 cheek-5.378
Token,
Token ID11
Feature activation+1.48e+00

Pos logit contributions
aw+2.897
 Other+2.555
ifying+2.519
 forward+2.456
 Leave+2.427
Neg logit contributions
ro-2.862
onna-2.320
emo-2.276
uzz-2.210
 seek-2.168
Token every
Token ID790
Feature activation+8.71e-01

Pos logit contributions
aw+3.372
ifying+3.056
 Other+3.001
 Leave+2.981
 forward+2.922
Neg logit contributions
ro-1.869
emo-1.381
uzz-1.340
onna-1.301
 seek-1.276
Token time
Token ID640
Feature activation+8.53e-01

Pos logit contributions
aw+1.685
 Other+1.511
 Leave+1.463
ifying+1.454
 forward+1.431
Neg logit contributions
ro-1.397
onna-1.081
uzz-1.068
emo-1.057
 cheek-1.020
Token Tim
Token ID5045
Feature activation+3.47e-01

Pos logit contributions
aw+1.706
 Other+1.546
ifying+1.504
 forward+1.481
 Leave+1.477
Neg logit contributions
ro-1.282
onna-0.972
uzz-0.966
emo-0.966
 seek-0.908
Token gray
Token ID12768
Feature activation-7.75e-01

Pos logit contributions
ro+2.248
emo+1.887
onna+1.869
uzz+1.851
water+1.634
Neg logit contributions
aw-1.860
 Other-1.791
 middle-1.739
 forward-1.712
pieces-1.661
Token cloth
Token ID16270
Feature activation-5.98e-01

Pos logit contributions
ro+1.269
emo+1.069
onna+1.055
uzz+1.035
 seek+0.947
Neg logit contributions
aw-1.392
 Other-1.295
 middle-1.239
ifying-1.207
 forward-1.206
Token.
Token ID13
Feature activation+1.04e+00

Pos logit contributions
ro+1.133
emo+0.939
onna+0.924
uzz+0.915
 seek+0.876
Neg logit contributions
aw-0.956
 Other-0.826
 forward-0.816
ifying-0.795
 middle-0.786
Token From
Token ID3574
Feature activation-2.52e-01

Pos logit contributions
aw+1.464
 Other+1.232
ifying+1.208
 forward+1.179
 Leave+1.171
Neg logit contributions
ro-2.229
onna-1.857
uzz-1.851
emo-1.826
 cheek-1.764
Token that
Token ID326
Feature activation-8.77e-01

Pos logit contributions
ro+0.613
onna+0.525
emo+0.513
uzz+0.512
 cheek+0.499
Neg logit contributions
aw-0.295
 Other-0.230
 forward-0.219
 middle-0.215
ifying-0.214
Token day
Token ID1110
Feature activation+3.46e+00

Pos logit contributions
ro+1.558
emo+1.236
uzz+1.231
onna+1.221
water+1.150
Neg logit contributions
aw-1.484
 Other-1.431
 forward-1.403
 middle-1.367
 particularly-1.338
Token onwards
Token ID31719
Feature activation+1.33e+00

Pos logit contributions
aw+5.240
 Other+4.865
pieces+4.413
opl+4.320
 Leave+4.265
Neg logit contributions
ro-7.033
 cheek-5.846
uzz-5.842
 seek-5.688
onna-5.616
Token,
Token ID11
Feature activation+2.67e+00

Pos logit contributions
aw+2.245
 Other+2.053
ifying+1.974
 Leave+1.922
opl+1.884
Neg logit contributions
ro-2.430
onna-1.932
emo-1.902
uzz-1.889
 seek-1.863
Token Sandra
Token ID32464
Feature activation+2.85e-01

Pos logit contributions
aw+5.145
ifying+4.878
 Other+4.852
 Leave+4.701
opl+4.478
Neg logit contributions
ro-4.140
emo-3.279
onna-3.255
 common-3.174
 seek-3.160
Token always
Token ID1464
Feature activation-1.81e-01

Pos logit contributions
aw+0.440
 Other+0.384
 Leave+0.362
 forward+0.361
ifying+0.360
Neg logit contributions
ro-0.562
onna-0.461
uzz-0.454
emo-0.452
 cheek-0.442
Token enjoyed
Token ID8359
Feature activation+7.41e-02

Pos logit contributions
ro+0.358
onna+0.294
uzz+0.289
emo+0.288
 cheek+0.280
Neg logit contributions
aw-0.280
 Other-0.245
 forward-0.233
 Leave-0.232
ifying-0.228
Token both
Token ID1111
Feature activation+1.44e-01

Pos logit contributions
aw+0.611
 Other+0.521
 Leave+0.482
ifying+0.478
 forward+0.476
Neg logit contributions
ro-1.102
onna-0.927
uzz-0.918
emo-0.911
 cheek-0.894
Token laughed
Token ID13818
Feature activation+8.11e-02

Pos logit contributions
aw+0.233
 Other+0.203
ifying+0.192
 forward+0.192
 Leave+0.192
Neg logit contributions
ro-0.275
onna-0.224
emo-0.220
uzz-0.220
 cheek-0.214
Token.
Token ID13
Feature activation+4.00e-01

Pos logit contributions
aw+0.135
 Other+0.119
 forward+0.113
ifying+0.111
 Leave+0.111
Neg logit contributions
ro-0.153
onna-0.125
emo-0.122
uzz-0.122
 cheek-0.117
Token From
Token ID3574
Feature activation-1.14e-01

Pos logit contributions
aw+0.648
 Other+0.569
 Leave+0.537
ifying+0.537
 forward+0.536
Neg logit contributions
ro-0.767
onna-0.624
uzz-0.614
emo-0.611
 cheek-0.596
Token that
Token ID326
Feature activation+3.06e-01

Pos logit contributions
ro+0.270
onna+0.230
uzz+0.225
emo+0.223
 cheek+0.218
Neg logit contributions
aw-0.142
 Other-0.113
 forward-0.107
ifying-0.105
 Leave-0.105
Token day
Token ID1110
Feature activation+3.44e+00

Pos logit contributions
aw+0.621
 Other+0.562
 Leave+0.537
 forward+0.537
ifying+0.536
Neg logit contributions
ro-0.464
onna-0.355
uzz-0.347
emo-0.345
 cheek-0.332
Token on
Token ID319
Feature activation+1.39e+00

Pos logit contributions
aw+5.456
 Other+4.767
opl+4.708
pieces+4.614
 pasture+4.524
Neg logit contributions
ro-6.509
 cheek-5.451
uzz-5.417
 seek-5.348
emo-5.186
Token,
Token ID11
Feature activation+1.52e-01

Pos logit contributions
aw+2.510
 Other+2.161
ifying+2.113
 forward+2.057
 Leave+2.055
Neg logit contributions
ro-2.523
onna-2.026
emo-1.957
uzz-1.926
 seek-1.925
Token Lily
Token ID20037
Feature activation+5.67e-01

Pos logit contributions
aw+0.364
 Other+0.337
 forward+0.323
 Leave+0.323
ifying+0.322
Neg logit contributions
ro-0.172
onna-0.119
uzz-0.115
emo-0.113
 cheek-0.109
Token searched
Token ID16499
Feature activation+1.10e+00

Pos logit contributions
aw+0.897
 Other+0.791
ifying+0.754
 Leave+0.743
 forward+0.743
Neg logit contributions
ro-1.104
onna-0.900
emo-0.883
uzz-0.880
 cheek-0.854
Token for
Token ID329
Feature activation+3.54e-02

Pos logit contributions
aw+1.979
 Other+1.744
 Leave+1.722
ifying+1.716
 middle+1.634
Neg logit contributions
ro-1.926
onna-1.534
uzz-1.510
emo-1.495
 cheek-1.450
Token surprise
Token ID5975
Feature activation+6.51e-01

Pos logit contributions
aw+1.219
 Other+1.091
ifying+1.043
 forward+1.038
 Leave+1.033
Neg logit contributions
ro-1.043
onna-0.819
emo-0.796
uzz-0.795
 cheek-0.772
Token the
Token ID262
Feature activation+9.05e-01

Pos logit contributions
aw+1.122
 Other+1.000
ifying+0.958
 Leave+0.955
 forward+0.932
Neg logit contributions
ro-1.169
onna-0.941
uzz-0.926
emo-0.924
 cheek-0.902
Token farmer
Token ID18739
Feature activation+6.64e-01

Pos logit contributions
aw+1.433
 Other+1.256
 Leave+1.208
ifying+1.198
 forward+1.170
Neg logit contributions
ro-1.786
onna-1.457
emo-1.431
 cheek-1.421
uzz-1.415
Token.
Token ID13
Feature activation+1.17e+00

Pos logit contributions
aw+1.137
 Other+1.013
ifying+0.958
 Leave+0.958
 forward+0.946
Neg logit contributions
ro-1.202
onna-0.967
uzz-0.946
emo-0.939
 cheek-0.925
Token The
Token ID383
Feature activation+2.27e+00

Pos logit contributions
aw+1.713
ifying+1.418
 Other+1.412
 forward+1.391
 Leave+1.375
Neg logit contributions
ro-2.460
onna-2.019
uzz-1.982
emo-1.975
 cheek-1.926
Token two
Token ID734
Feature activation+3.44e+00

Pos logit contributions
aw+3.746
 Leave+3.333
ifying+3.311
 Other+3.184
opl+2.985
Neg logit contributions
ro-4.420
 cheek-3.751
emo-3.664
 seek-3.580
onna-3.468
Token were
Token ID547
Feature activation+1.22e+00

Pos logit contributions
ifying+6.794
aw+6.730
opl+6.566
 Leave+6.342
 Other+6.329
Neg logit contributions
ro-5.014
emo-4.076
 cheek-3.924
guard-3.757
 textures-3.733
Token so
Token ID523
Feature activation+6.49e-01

Pos logit contributions
aw+1.954
 Other+1.742
ifying+1.676
 Leave+1.642
 middle+1.613
Neg logit contributions
ro-2.302
uzz-1.909
onna-1.889
emo-1.869
 cheek-1.816
Token happy
Token ID3772
Feature activation-4.31e-01

Pos logit contributions
aw+1.191
 Other+1.065
ifying+1.019
 Leave+1.010
 forward+1.006
Neg logit contributions
ro-1.110
onna-0.881
uzz-0.870
emo-0.866
 cheek-0.835
Token that
Token ID326
Feature activation+8.19e-01

Pos logit contributions
ro+0.863
onna+0.705
emo+0.691
uzz+0.691
 cheek+0.677
Neg logit contributions
aw-0.685
 Other-0.580
 Leave-0.552
 forward-0.551
ifying-0.531
Token they
Token ID484
Feature activation+2.65e-01

Pos logit contributions
aw+1.533
 Other+1.362
ifying+1.319
 Leave+1.305
 forward+1.273
Neg logit contributions
ro-1.355
onna-1.073
emo-1.063
uzz-1.061
 cheek-1.023
Token having
Token ID1719
Feature activation+4.22e-01

Pos logit contributions
ro+0.891
onna+0.720
emo+0.715
uzz+0.710
 cheek+0.691
Neg logit contributions
aw-0.741
 Other-0.653
 forward-0.626
 middle-0.618
ifying-0.614
Token fun
Token ID1257
Feature activation+1.30e+00

Pos logit contributions
aw+0.742
 Other+0.661
 Leave+0.629
ifying+0.628
 forward+0.626
Neg logit contributions
ro-0.743
onna-0.593
uzz-0.582
emo-0.579
 cheek-0.563
Token.
Token ID13
Feature activation+1.08e-01

Pos logit contributions
aw+2.455
 Other+2.221
 Leave+2.165
ifying+2.155
 forward+2.107
Neg logit contributions
ro-2.152
onna-1.650
uzz-1.627
 seek-1.599
emo-1.584
Token From
Token ID3574
Feature activation-4.13e-02

Pos logit contributions
aw+0.158
 Other+0.139
 Leave+0.129
ifying+0.128
 forward+0.128
Neg logit contributions
ro-0.223
onna-0.185
uzz-0.181
emo-0.181
 cheek-0.179
Token that
Token ID326
Feature activation-5.72e-02

Pos logit contributions
ro+0.096
onna+0.082
uzz+0.080
emo+0.080
 cheek+0.078
Neg logit contributions
aw-0.052
 Other-0.041
 forward-0.039
ifying-0.038
 Leave-0.038
Token day
Token ID1110
Feature activation+3.44e+00

Pos logit contributions
ro+0.092
onna+0.072
uzz+0.071
emo+0.070
 cheek+0.067
Neg logit contributions
aw-0.111
 Other-0.101
 forward-0.097
 middle-0.096
 Leave-0.095
Token on
Token ID319
Feature activation+1.25e+00

Pos logit contributions
aw+5.435
 Other+4.951
pieces+4.828
 pasture+4.627
ifying+4.623
Neg logit contributions
ro-6.554
 seek-5.424
uzz-5.359
emo-5.258
 cheek-5.243
Token,
Token ID11
Feature activation+7.49e-02

Pos logit contributions
aw+2.177
 Other+1.897
ifying+1.845
 Leave+1.792
 forward+1.788
Neg logit contributions
ro-2.317
onna-1.865
emo-1.809
uzz-1.791
 seek-1.782
Token they
Token ID484
Feature activation+8.26e-01

Pos logit contributions
aw+0.171
 Other+0.158
 forward+0.151
 Leave+0.151
ifying+0.150
Neg logit contributions
ro-0.093
onna-0.067
uzz-0.065
emo-0.064
 cheek-0.063
Token made
Token ID925
Feature activation+3.82e-01

Pos logit contributions
aw+1.214
 Other+1.072
 Leave+1.003
ifying+0.995
 forward+0.977
Neg logit contributions
ro-1.702
onna-1.401
uzz-1.384
emo-1.380
 seek-1.330
Token sure
Token ID1654
Feature activation-4.86e-01

Pos logit contributions
aw+0.672
 Other+0.598
 Leave+0.568
ifying+0.568
 forward+0.566
Neg logit contributions
ro-0.676
onna-0.540
uzz-0.530
emo-0.528
 cheek-0.513
Token of
Token ID286
Feature activation-1.10e-01

Pos logit contributions
ro+0.048
onna+0.041
emo+0.041
uzz+0.041
 cheek+0.040
Neg logit contributions
aw-0.019
 Other-0.015
 forward-0.014
 Leave-0.014
ifying-0.013
Token fun
Token ID1257
Feature activation+9.79e-01

Pos logit contributions
ro+0.170
onna+0.131
uzz+0.127
emo+0.126
 cheek+0.121
Neg logit contributions
aw-0.224
 Other-0.203
 Leave-0.193
 forward-0.193
 middle-0.192
Token.
Token ID13
Feature activation+7.97e-01

Pos logit contributions
aw+1.784
 Other+1.610
ifying+1.564
 Leave+1.544
 forward+1.531
Neg logit contributions
ro-1.678
onna-1.298
uzz-1.291
emo-1.280
 cheek-1.246
Token From
Token ID3574
Feature activation-5.03e-01

Pos logit contributions
aw+1.354
 Other+1.175
 forward+1.143
ifying+1.135
 Leave+1.120
Neg logit contributions
ro-1.470
onna-1.186
uzz-1.181
emo-1.166
 cheek-1.116
Token that
Token ID326
Feature activation-2.90e-01

Pos logit contributions
ro+1.214
onna+1.042
uzz+1.014
emo+1.008
 cheek+0.985
Neg logit contributions
aw-0.610
 Other-0.464
 forward-0.447
 middle-0.444
ifying-0.434
Token day
Token ID1110
Feature activation+3.42e+00

Pos logit contributions
ro+0.481
onna+0.380
uzz+0.374
emo+0.371
 cheek+0.351
Neg logit contributions
aw-0.545
 Other-0.504
 forward-0.481
 middle-0.477
 Leave-0.469
Token on
Token ID319
Feature activation+1.63e+00

Pos logit contributions
aw+5.223
pieces+4.760
 Other+4.685
opl+4.566
ifying+4.513
Neg logit contributions
ro-6.555
uzz-5.701
 seek-5.494
 cheek-5.413
emo-5.385
Token,
Token ID11
Feature activation+8.23e-01

Pos logit contributions
aw+3.004
ifying+2.619
 Other+2.618
 forward+2.533
pieces+2.506
Neg logit contributions
ro-2.855
onna-2.271
emo-2.182
 seek-2.177
uzz-2.171
Token they
Token ID484
Feature activation+6.12e-01

Pos logit contributions
aw+1.824
 Other+1.642
ifying+1.612
 Leave+1.595
 forward+1.589
Neg logit contributions
ro-1.091
onna-0.792
uzz-0.785
emo-0.779
 seek-0.735
Token became
Token ID2627
Feature activation+1.16e+00

Pos logit contributions
aw+0.890
 Other+0.776
ifying+0.726
 Leave+0.724
 forward+0.718
Neg logit contributions
ro-1.273
onna-1.052
uzz-1.039
emo-1.036
 cheek-1.000
Token best
Token ID1266
Feature activation+1.84e+00

Pos logit contributions
aw+1.993
 Other+1.750
ifying+1.730
 Leave+1.698
opl+1.647
Neg logit contributions
ro-2.093
emo-1.704
uzz-1.668
onna-1.667
 seek-1.624
Token be
Token ID307
Feature activation+1.50e+00

Pos logit contributions
aw+0.056
 Other+0.049
 Leave+0.047
ifying+0.047
 forward+0.047
Neg logit contributions
ro-0.058
onna-0.046
uzz-0.046
emo-0.045
 cheek-0.044
Token a
Token ID257
Feature activation+1.30e+00

Pos logit contributions
aw+2.563
 Other+2.340
ifying+2.296
opl+2.238
 Leave+2.235
Neg logit contributions
ro-2.639
emo-2.198
 seek-2.195
uzz-2.182
onna-2.165
Token good
Token ID922
Feature activation+1.09e+00

Pos logit contributions
aw+2.512
 Other+2.201
ifying+2.149
 Leave+2.148
opl+2.083
Neg logit contributions
ro-2.121
onna-1.698
 cheek-1.644
 seek-1.623
emo-1.622
Token girl
Token ID2576
Feature activation+1.43e+00

Pos logit contributions
aw+2.220
 Other+1.991
ifying+1.958
 Leave+1.936
opl+1.889
Neg logit contributions
ro-1.675
onna-1.262
emo-1.217
 cheek-1.204
uzz-1.203
Token from
Token ID422
Feature activation+9.17e-01

Pos logit contributions
aw+2.649
 Other+2.382
 Leave+2.261
ifying+2.256
opl+2.247
Neg logit contributions
ro-2.421
uzz-1.904
onna-1.890
 cheek-1.859
 seek-1.840
Token now
Token ID783
Feature activation+3.42e+00

Pos logit contributions
aw+1.590
 Other+1.408
ifying+1.349
 Leave+1.348
 forward+1.308
Neg logit contributions
ro-1.651
onna-1.337
uzz-1.323
emo-1.301
 cheek-1.285
Token on
Token ID319
Feature activation+1.95e+00

Pos logit contributions
aw+5.583
 Other+4.701
ifying+4.591
pieces+4.565
ement+4.404
Neg logit contributions
ro-7.286
 seek-6.092
 cheek-5.978
onna-5.715
emo-5.566
Token.
Token ID13
Feature activation+9.73e-02

Pos logit contributions
aw+3.966
 Leave+3.461
 Other+3.454
ifying+3.400
 forward+3.319
Neg logit contributions
ro-3.093
onna-2.493
 seek-2.272
emo-2.231
 cheek-2.227
Token She
Token ID1375
Feature activation+6.68e-01

Pos logit contributions
aw+0.120
 Other+0.103
 Leave+0.094
 forward+0.092
ifying+0.092
Neg logit contributions
ro-0.225
onna-0.189
uzz-0.188
emo-0.187
 cheek-0.184
Token even
Token ID772
Feature activation+2.10e-01

Pos logit contributions
aw+1.122
 Other+1.002
ifying+0.962
 forward+0.951
 Leave+0.950
Neg logit contributions
ro-1.229
onna-1.000
emo-0.976
uzz-0.973
 cheek-0.941
Token asked
Token ID1965
Feature activation+1.39e-01

Pos logit contributions
aw+0.369
 Other+0.327
 forward+0.309
 Leave+0.309
ifying+0.308
Neg logit contributions
ro-0.372
onna-0.298
uzz-0.294
emo-0.291
 cheek-0.284
Token cher
Token ID21382
Feature activation+1.25e+00

Pos logit contributions
aw+1.018
 Other+0.912
ifying+0.873
 Leave+0.869
 forward+0.868
Neg logit contributions
ro-0.934
onna-0.734
uzz-0.719
emo-0.714
 cheek-0.703
Tokenries
Token ID1678
Feature activation+2.78e-01

Pos logit contributions
aw+1.414
 Other+1.272
 Leave+1.174
 forward+1.147
ifying+1.127
Neg logit contributions
ro-2.921
onna-2.510
uzz-2.475
emo-2.453
 cheek-2.441
Token.
Token ID13
Feature activation+5.08e-01

Pos logit contributions
aw+0.474
 Other+0.418
 forward+0.397
ifying+0.397
 Leave+0.396
Neg logit contributions
ro-0.505
onna-0.406
uzz-0.400
emo-0.399
 cheek-0.386
Token From
Token ID3574
Feature activation+2.82e-01

Pos logit contributions
aw+0.842
 Other+0.736
ifying+0.700
 Leave+0.700
 forward+0.698
Neg logit contributions
ro-0.956
onna-0.775
uzz-0.764
emo-0.760
 cheek-0.737
Token that
Token ID326
Feature activation-2.60e-02

Pos logit contributions
aw+0.362
 Other+0.304
 Leave+0.282
 forward+0.282
ifying+0.282
Neg logit contributions
ro-0.641
onna-0.541
uzz-0.532
emo-0.530
 cheek-0.520
Token day
Token ID1110
Feature activation+3.41e+00

Pos logit contributions
ro+0.041
onna+0.032
uzz+0.032
emo+0.031
 cheek+0.030
Neg logit contributions
aw-0.050
 Other-0.046
 forward-0.044
 middle-0.044
 Leave-0.044
Token on
Token ID319
Feature activation+1.36e+00

Pos logit contributions
aw+5.281
 Other+4.561
pieces+4.528
ifying+4.391
opl+4.370
Neg logit contributions
ro-6.560
 seek-5.594
uzz-5.562
 cheek-5.400
emo-5.328
Token,
Token ID11
Feature activation+4.23e-01

Pos logit contributions
aw+2.406
 Other+2.092
ifying+2.034
 forward+1.976
 Leave+1.973
Neg logit contributions
ro-2.501
onna-1.997
emo-1.941
uzz-1.918
 seek-1.913
Token they
Token ID484
Feature activation+1.29e+00

Pos logit contributions
aw+1.033
 Other+0.948
ifying+0.915
 Leave+0.915
 forward+0.912
Neg logit contributions
ro-0.467
onna-0.316
uzz-0.304
emo-0.302
 cheek-0.285
Token knew
Token ID2993
Feature activation+2.03e-01

Pos logit contributions
aw+1.969
 Other+1.765
 Leave+1.653
ifying+1.653
 middle+1.611
Neg logit contributions
ro-2.593
onna-2.129
emo-2.107
uzz-2.104
 seek-2.032
Token they
Token ID484
Feature activation+8.69e-01

Pos logit contributions
aw+0.289
 Other+0.249
 forward+0.232
 Leave+0.232
ifying+0.232
Neg logit contributions
ro-0.432
onna-0.358
uzz-0.352
emo-0.352
 cheek-0.345
Token!"
Token ID2474
Feature activation+3.69e-01

Pos logit contributions
aw+0.052
 Other+0.046
 forward+0.044
 Leave+0.044
ifying+0.044
Neg logit contributions
ro-0.050
onna-0.040
uzz-0.039
emo-0.039
 cheek-0.037
Token\n
Token ID198
Feature activation+1.08e-01

Pos logit contributions
aw+0.574
 Other+0.502
 Leave+0.473
ifying+0.472
 forward+0.471
Neg logit contributions
ro-0.730
onna-0.598
uzz-0.589
emo-0.586
 cheek-0.572
Token\n
Token ID198
Feature activation+8.19e-01

Pos logit contributions
aw+0.173
 Other+0.152
 Leave+0.143
 forward+0.142
 middle+0.141
Neg logit contributions
ro-0.208
onna-0.169
uzz-0.165
emo-0.165
 cheek-0.162
TokenAnd
Token ID1870
Feature activation+5.72e-01

Pos logit contributions
aw+1.208
 Other+1.055
ifying+1.013
 forward+0.997
 Leave+0.989
Neg logit contributions
ro-1.674
onna-1.380
uzz-1.376
emo-1.367
 seek-1.330
Token from
Token ID422
Feature activation+5.13e-01

Pos logit contributions
aw+0.977
 Other+0.868
ifying+0.822
 Leave+0.820
 forward+0.809
Neg logit contributions
ro-1.039
onna-0.836
uzz-0.826
emo-0.817
 cheek-0.798
Token then
Token ID788
Feature activation+3.41e+00

Pos logit contributions
aw+0.676
 Other+0.581
 Leave+0.540
ifying+0.539
 forward+0.535
Neg logit contributions
ro-1.131
onna-0.947
uzz-0.939
emo-0.933
 cheek-0.914
Token on
Token ID319
Feature activation+2.33e+00

Pos logit contributions
aw+5.048
 Other+4.450
ifying+4.389
pieces+4.162
 Leave+4.035
Neg logit contributions
ro-7.118
 seek-5.794
 moment-5.718
uzz-5.577
 cheek-5.490
Token,
Token ID11
Feature activation+2.25e+00

Pos logit contributions
aw+4.499
 Other+3.897
ifying+3.840
 Leave+3.774
opl+3.645
Neg logit contributions
ro-3.902
onna-3.084
 seek-3.081
uzz-2.937
emo-2.886
Token every
Token ID790
Feature activation+9.97e-01

Pos logit contributions
aw+4.318
ifying+3.805
 Other+3.726
 Leave+3.671
opl+3.531
Neg logit contributions
ro-3.713
 seek-2.997
uzz-2.870
emo-2.817
 common-2.774
Token day
Token ID1110
Feature activation+2.54e+00

Pos logit contributions
aw+1.876
 Other+1.667
 Leave+1.614
ifying+1.609
 forward+1.566
Neg logit contributions
ro-1.671
onna-1.316
uzz-1.283
emo-1.269
 seek-1.246
Token Steve
Token ID6542
Feature activation+7.14e-01

Pos logit contributions
aw+4.870
ifying+4.521
 Other+4.521
opl+4.347
 Leave+4.316
Neg logit contributions
ro-4.018
 seek-3.071
uzz-2.915
emo-2.838
 common-2.833
Token and
Token ID290
Feature activation+3.99e-01

Pos logit contributions
aw+1.109
 Other+0.993
ifying+0.947
 Leave+0.939
 forward+0.934
Neg logit contributions
ro-1.138
onna-0.908
uzz-0.898
emo-0.891
 cheek-0.866
Token family
Token ID1641
Feature activation+3.99e-01

Pos logit contributions
aw+0.687
 Other+0.609
 Leave+0.577
ifying+0.577
 forward+0.575
Neg logit contributions
ro-0.718
onna-0.577
uzz-0.567
emo-0.564
 cheek-0.549
Token.
Token ID13
Feature activation+6.04e-01

Pos logit contributions
aw+0.664
 Other+0.588
 Leave+0.556
ifying+0.555
 forward+0.553
Neg logit contributions
ro-0.740
onna-0.598
uzz-0.588
emo-0.585
 cheek-0.570
Token From
Token ID3574
Feature activation-2.75e-01

Pos logit contributions
aw+0.906
 Other+0.779
 forward+0.738
ifying+0.737
 Leave+0.734
Neg logit contributions
ro-1.222
onna-1.005
uzz-0.996
emo-0.988
 cheek-0.955
Token that
Token ID326
Feature activation+2.68e-01

Pos logit contributions
ro+0.644
onna+0.550
emo+0.532
uzz+0.532
 cheek+0.521
Neg logit contributions
aw-0.350
 Other-0.279
 forward-0.264
ifying-0.261
 middle-0.260
Token day
Token ID1110
Feature activation+3.40e+00

Pos logit contributions
aw+0.541
 Other+0.491
 forward+0.469
 Leave+0.468
ifying+0.467
Neg logit contributions
ro-0.408
onna-0.312
uzz-0.306
emo-0.303
 cheek-0.292
Token on
Token ID319
Feature activation+1.34e+00

Pos logit contributions
aw+5.358
 Other+4.443
pieces+4.381
opl+4.350
 pasture+4.331
Neg logit contributions
ro-6.716
emo-5.627
 seek-5.610
uzz-5.459
 cheek-5.429
Token,
Token ID11
Feature activation+4.62e-01

Pos logit contributions
aw+2.331
 Other+2.011
ifying+1.947
 forward+1.912
 Leave+1.895
Neg logit contributions
ro-2.493
onna-2.001
emo-1.942
uzz-1.930
 seek-1.925
Token Lily
Token ID20037
Feature activation+2.98e-01

Pos logit contributions
aw+0.981
 Other+0.889
ifying+0.854
 Leave+0.854
 forward+0.850
Neg logit contributions
ro-0.651
onna-0.486
uzz-0.474
emo-0.473
 cheek-0.451
Token always
Token ID1464
Feature activation-8.93e-02

Pos logit contributions
aw+0.446
 Other+0.387
 Leave+0.364
 forward+0.362
ifying+0.361
Neg logit contributions
ro-0.605
onna-0.500
uzz-0.493
emo-0.490
 cheek-0.480
Token made
Token ID925
Feature activation+8.32e-02

Pos logit contributions
ro+0.184
onna+0.153
uzz+0.150
emo+0.149
 cheek+0.146
Neg logit contributions
aw-0.131
 Other-0.114
 Leave-0.108
 forward-0.107
ifying-0.105
Token clean
Token ID3424
Feature activation+5.32e-02

Pos logit contributions
ro+1.496
onna+1.134
uzz+1.126
emo+1.092
 cheek+1.073
Neg logit contributions
aw-1.692
 Other-1.551
 forward-1.540
 middle-1.488
 particularly-1.483
Token too
Token ID1165
Feature activation+1.64e-01

Pos logit contributions
aw+0.089
 Other+0.079
 forward+0.075
ifying+0.074
 Leave+0.074
Neg logit contributions
ro-0.098
onna-0.080
uzz-0.078
emo-0.078
 cheek-0.075
Token.
Token ID13
Feature activation+4.85e-01

Pos logit contributions
aw+0.269
 Other+0.238
 forward+0.225
ifying+0.224
 Leave+0.223
Neg logit contributions
ro-0.310
onna-0.251
uzz-0.248
emo-0.247
 cheek-0.239
Token From
Token ID3574
Feature activation-7.15e-02

Pos logit contributions
aw+0.776
 Other+0.679
ifying+0.646
 forward+0.643
 Leave+0.643
Neg logit contributions
ro-0.936
onna-0.765
uzz-0.754
emo-0.750
 cheek-0.727
Token that
Token ID326
Feature activation-5.15e-01

Pos logit contributions
ro+0.168
onna+0.143
uzz+0.139
emo+0.139
 cheek+0.136
Neg logit contributions
aw-0.089
 Other-0.071
 forward-0.067
 Leave-0.066
 middle-0.066
Token day
Token ID1110
Feature activation+3.40e+00

Pos logit contributions
ro+0.886
onna+0.695
uzz+0.688
emo+0.683
 cheek+0.643
Neg logit contributions
aw-0.925
 Other-0.875
 forward-0.849
 middle-0.829
 Leave-0.808
Token on
Token ID319
Feature activation+1.30e+00

Pos logit contributions
aw+5.264
 Other+4.652
pieces+4.525
ifying+4.340
 pasture+4.282
Neg logit contributions
ro-6.694
uzz-5.640
 seek-5.596
 cheek-5.580
emo-5.536
Token,
Token ID11
Feature activation+1.09e+00

Pos logit contributions
aw+2.264
 Other+1.975
ifying+1.918
 Leave+1.856
 forward+1.855
Neg logit contributions
ro-2.418
onna-1.948
emo-1.887
uzz-1.879
 seek-1.860
Token Sally
Token ID25737
Feature activation+2.50e-01

Pos logit contributions
aw+2.440
 Other+2.214
ifying+2.187
 Leave+2.154
 forward+2.116
Neg logit contributions
ro-1.405
emo-1.021
onna-1.015
uzz-1.000
 seek-0.941
Token and
Token ID290
Feature activation+4.72e-01

Pos logit contributions
aw+0.379
 Other+0.330
 Leave+0.311
 forward+0.309
ifying+0.307
Neg logit contributions
ro-0.501
onna-0.413
uzz-0.407
emo-0.404
 cheek-0.396
Token her
Token ID607
Feature activation+3.94e-01

Pos logit contributions
aw+0.915
 Other+0.823
ifying+0.789
 Leave+0.785
 forward+0.782
Neg logit contributions
ro-0.759
onna-0.589
uzz-0.579
emo-0.577
 cheek-0.556
Token pretty
Token ID2495
Feature activation-8.32e-01

Pos logit contributions
ro+1.308
onna+1.067
emo+1.055
uzz+1.046
 seek+0.973
Neg logit contributions
aw-1.061
 Other-0.971
 forward-0.915
 middle-0.905
ifying-0.892
Token kitten
Token ID42143
Feature activation+4.27e-01

Pos logit contributions
ro+1.376
onna+1.135
uzz+1.117
emo+1.112
 cheek+1.034
Neg logit contributions
aw-1.493
 Other-1.350
 middle-1.279
 Leave-1.271
ifying-1.267
Token.
Token ID13
Feature activation+7.49e-01

Pos logit contributions
aw+0.693
 Other+0.610
 Leave+0.576
ifying+0.576
 forward+0.573
Neg logit contributions
ro-0.811
onna-0.659
uzz-0.648
emo-0.645
 cheek-0.630
Token From
Token ID3574
Feature activation+3.55e-01

Pos logit contributions
aw+1.210
 Other+1.045
ifying+1.016
 forward+1.012
 Leave+0.992
Neg logit contributions
ro-1.436
onna-1.176
uzz-1.151
emo-1.149
 cheek-1.109
Token that
Token ID326
Feature activation+1.36e-01

Pos logit contributions
aw+0.462
 Other+0.393
 Leave+0.364
ifying+0.364
 forward+0.363
Neg logit contributions
ro-0.795
onna-0.668
uzz-0.659
emo-0.656
 cheek-0.643
Token day
Token ID1110
Feature activation+3.39e+00

Pos logit contributions
aw+0.269
 Other+0.245
 forward+0.234
 Leave+0.233
 middle+0.232
Neg logit contributions
ro-0.211
onna-0.162
uzz-0.159
emo-0.157
 cheek-0.151
Token on
Token ID319
Feature activation+1.52e+00

Pos logit contributions
aw+5.356
pieces+4.644
 Other+4.629
ifying+4.530
ement+4.452
Neg logit contributions
ro-6.641
 cheek-5.631
 seek-5.502
emo-5.408
uzz-5.350
Token,
Token ID11
Feature activation+1.34e+00

Pos logit contributions
aw+2.785
 Other+2.369
ifying+2.369
pieces+2.258
 forward+2.248
Neg logit contributions
ro-2.738
onna-2.191
emo-2.100
uzz-2.068
 seek-2.062
Token Lily
Token ID20037
Feature activation+4.36e-01

Pos logit contributions
aw+2.412
ifying+2.061
 Other+2.049
 Leave+1.989
 forward+1.978
Neg logit contributions
ro-2.412
onna-1.941
emo-1.917
uzz-1.832
 seek-1.820
Token made
Token ID925
Feature activation+3.91e-02

Pos logit contributions
aw+0.653
 Other+0.569
ifying+0.536
 Leave+0.533
 forward+0.532
Neg logit contributions
ro-0.886
onna-0.731
uzz-0.718
emo-0.717
 cheek-0.699
Token sure
Token ID1654
Feature activation-5.25e-01

Pos logit contributions
aw+0.070
 Other+0.062
 Leave+0.059
 forward+0.058
 middle+0.058
Neg logit contributions
ro-0.070
onna-0.056
uzz-0.054
emo-0.053
 cheek-0.053
Token.
Token ID13
Feature activation+5.34e-01

Pos logit contributions
ro+0.187
onna+0.152
emo+0.151
uzz+0.149
 cheek+0.145
Neg logit contributions
aw-0.167
 Other-0.145
 forward-0.141
ifying-0.138
 Leave-0.138
Token\n
Token ID198
Feature activation+2.32e-01

Pos logit contributions
aw+0.739
 Other+0.627
ifying+0.593
 forward+0.589
 Leave+0.588
Neg logit contributions
ro-1.156
onna-0.966
uzz-0.951
emo-0.949
 cheek-0.921
Token\n
Token ID198
Feature activation+9.88e-01

Pos logit contributions
aw+0.365
 Other+0.321
 Leave+0.302
 forward+0.301
ifying+0.299
Neg logit contributions
ro-0.455
onna-0.371
uzz-0.365
emo-0.363
 cheek-0.357
TokenAnd
Token ID1870
Feature activation-1.95e-02

Pos logit contributions
aw+1.479
 Other+1.286
ifying+1.219
 Leave+1.199
 forward+1.191
Neg logit contributions
ro-2.033
onna-1.679
emo-1.667
uzz-1.658
 seek-1.634
Token from
Token ID422
Feature activation+4.24e-01

Pos logit contributions
ro+0.038
onna+0.031
uzz+0.030
emo+0.030
 cheek+0.030
Neg logit contributions
aw-0.031
 Other-0.027
 forward-0.026
 Leave-0.026
ifying-0.025
Token then
Token ID788
Feature activation+3.39e+00

Pos logit contributions
aw+0.558
 Other+0.476
 Leave+0.443
ifying+0.442
 forward+0.439
Neg logit contributions
ro-0.938
onna-0.787
uzz-0.777
emo-0.774
 cheek-0.757
Token on
Token ID319
Feature activation+2.08e+00

Pos logit contributions
aw+5.074
ifying+4.457
 Other+4.455
pieces+4.373
 Leave+4.116
Neg logit contributions
ro-7.061
 seek-5.638
 moment-5.597
uzz-5.478
emo-5.463
Token,
Token ID11
Feature activation+9.15e-01

Pos logit contributions
aw+3.999
 Other+3.426
ifying+3.370
 Leave+3.361
pieces+3.317
Neg logit contributions
ro-3.586
onna-2.870
 seek-2.734
emo-2.713
uzz-2.686
Token Carl
Token ID8124
Feature activation+8.39e-01

Pos logit contributions
aw+1.984
 Other+1.772
ifying+1.727
 Leave+1.715
opl+1.683
Neg logit contributions
ro-1.284
onna-0.950
emo-0.942
uzz-0.929
 seek-0.895
Token enjoyed
Token ID8359
Feature activation+4.68e-01

Pos logit contributions
aw+1.303
 Other+1.155
ifying+1.106
 Leave+1.084
 forward+1.080
Neg logit contributions
ro-1.650
onna-1.345
emo-1.315
uzz-1.308
 cheek-1.263
Token playing
Token ID2712
Feature activation+4.85e-01

Pos logit contributions
aw+0.786
 Other+0.696
ifying+0.660
 Leave+0.660
 forward+0.655
Neg logit contributions
ro-0.868
onna-0.700
uzz-0.688
emo-0.685
 cheek-0.667
Token\n
Token ID198
Feature activation+1.46e-02

Pos logit contributions
ro+1.135
 cheek+0.921
onna+0.913
uzz+0.910
emo+0.909
Neg logit contributions
aw-0.807
 Other-0.730
 middle-0.660
 Leave-0.659
 forward-0.652
Token"
Token ID1
Feature activation-7.70e-01

Pos logit contributions
aw+0.023
 Other+0.020
 forward+0.019
 middle+0.019
 Leave+0.019
Neg logit contributions
ro-0.028
onna-0.023
uzz-0.023
emo-0.023
 cheek-0.022
TokenThat
Token ID2504
Feature activation-6.71e-01

Pos logit contributions
ro+1.476
onna+1.243
uzz+1.211
emo+1.187
 cheek+1.165
Neg logit contributions
aw-1.281
 Other-1.092
 middle-1.054
 Leave-1.048
ifying-1.035
Token is
Token ID318
Feature activation-6.26e-01

Pos logit contributions
ro+1.508
uzz+1.303
emo+1.269
onna+1.263
 cheek+1.195
Neg logit contributions
aw-0.886
 Other-0.745
 middle-0.714
 Leave-0.681
 forward-0.668
Token a
Token ID257
Feature activation-5.77e-01

Pos logit contributions
ro+1.330
onna+1.072
uzz+1.061
emo+1.037
 cheek+1.025
Neg logit contributions
aw-0.933
 Other-0.800
 forward-0.783
 middle-0.758
ifying-0.750
Token nice
Token ID3621
Feature activation-3.21e+00

Pos logit contributions
ro+1.106
uzz+0.901
onna+0.889
emo+0.873
water+0.833
Neg logit contributions
aw-0.922
 Other-0.822
 middle-0.798
 forward-0.792
ifying-0.769
Token star
Token ID3491
Feature activation-3.67e-01

Pos logit contributions
ro+6.290
emo+5.821
uzz+5.683
nes+5.472
ge+5.456
Neg logit contributions
 Other-4.579
 middle-4.539
aw-4.334
 forward-3.871
 helicopter-3.748
Token,
Token ID11
Feature activation-1.35e+00

Pos logit contributions
ro+0.639
uzz+0.510
emo+0.506
onna+0.503
 cheek+0.473
Neg logit contributions
aw-0.653
 Other-0.570
 forward-0.548
ifying-0.543
 middle-0.541
Token Tom
Token ID4186
Feature activation-8.76e-01

Pos logit contributions
ro+2.153
emo+1.722
uzz+1.683
onna+1.674
water+1.563
Neg logit contributions
aw-2.738
 Other-2.376
 Leave-2.278
 forward-2.262
 middle-2.243
Token.
Token ID13
Feature activation-6.54e-01

Pos logit contributions
ro+1.629
uzz+1.301
onna+1.281
emo+1.266
 cheek+1.213
Neg logit contributions
aw-1.537
 Other-1.290
 Leave-1.227
 forward-1.220
 middle-1.194
Token But
Token ID887
Feature activation-1.25e+00

Pos logit contributions
ro+1.410
onna+1.185
emo+1.178
uzz+1.160
 cheek+1.147
Neg logit contributions
aw-0.889
 Other-0.801
 Leave-0.745
ifying-0.726
 forward-0.711
Token.
Token ID13
Feature activation-6.00e-01

Pos logit contributions
ro+2.566
onna+2.048
 cheek+2.017
uzz+2.005
emo+2.002
Neg logit contributions
aw-1.932
 forward-1.644
 Other-1.597
 middle-1.546
 particularly-1.533
Token
Token ID220
Feature activation-5.27e-01

Pos logit contributions
ro+1.300
onna+1.090
 cheek+1.078
uzz+1.070
 seek+1.058
Neg logit contributions
aw-0.817
 Other-0.748
 Leave-0.664
 middle-0.654
 forward-0.645
Token\n
Token ID198
Feature activation-8.88e-01

Pos logit contributions
ro+1.068
onna+0.867
 cheek+0.863
uzz+0.848
 seek+0.846
Neg logit contributions
aw-0.796
 Other-0.710
 Leave-0.655
 forward-0.649
 middle-0.640
Token\n
Token ID198
Feature activation-7.87e-02

Pos logit contributions
ro+1.848
 cheek+1.535
uzz+1.530
onna+1.507
emo+1.475
Neg logit contributions
aw-1.256
 Other-1.136
 middle-1.033
 Leave-1.033
 forward-1.009
TokenThe
Token ID464
Feature activation-2.35e+00

Pos logit contributions
ro+0.181
onna+0.154
uzz+0.153
emo+0.150
 cheek+0.149
Neg logit contributions
aw-0.097
 Other-0.082
 Leave-0.076
 forward-0.075
 middle-0.075
Token smart
Token ID4451
Feature activation-3.05e+00

Pos logit contributions
ro+4.541
onna+4.021
uzz+3.746
water+3.469
nes+3.425
Neg logit contributions
aw-4.044
 middle-3.915
 particularly-3.473
 Other-3.473
 forward-3.459
Token oct
Token ID19318
Feature activation-6.15e-01

Pos logit contributions
ro+4.668
onna+4.467
nes+4.147
uzz+3.986
water+3.897
Neg logit contributions
aw-6.151
 middle-5.507
 Other-5.324
 forward-5.088
 particularly-5.066
Tokenopus
Token ID25790
Feature activation-7.27e-01

Pos logit contributions
ro+1.292
onna+1.084
uzz+1.067
 cheek+1.039
emo+1.035
Neg logit contributions
aw-0.907
 Other-0.732
 Leave-0.705
ifying-0.687
 forward-0.682
Token was
Token ID373
Feature activation+2.23e-01

Pos logit contributions
ro+1.399
onna+1.153
uzz+1.134
 cheek+1.113
emo+1.107
Neg logit contributions
aw-1.186
 Other-0.990
 middle-0.951
 forward-0.936
 particularly-0.933
Token very
Token ID845
Feature activation-1.36e+00

Pos logit contributions
aw+0.300
 Other+0.255
 forward+0.239
 Leave+0.237
ifying+0.237
Neg logit contributions
ro-0.492
onna-0.410
uzz-0.404
emo-0.403
 cheek-0.395
Token happy
Token ID3772
Feature activation-5.60e-01

Pos logit contributions
ro+2.820
onna+2.205
uzz+2.183
water+2.180
icated+2.150
Neg logit contributions
aw-2.014
 forward-1.831
 middle-1.748
 Other-1.737
 particularly-1.692
Token it
Token ID340
Feature activation+5.67e-01

Pos logit contributions
ro+0.951
onna+0.769
emo+0.745
uzz+0.733
 cheek+0.721
Neg logit contributions
aw-0.837
 Other-0.730
 Leave-0.691
 forward-0.685
ifying-0.671
Token and
Token ID290
Feature activation-8.67e-02

Pos logit contributions
aw+0.987
 Other+0.876
ifying+0.838
 Leave+0.835
 forward+0.821
Neg logit contributions
ro-1.014
onna-0.809
uzz-0.796
emo-0.791
 cheek-0.776
Token decided
Token ID3066
Feature activation-6.99e-01

Pos logit contributions
ro+0.153
onna+0.123
uzz+0.119
emo+0.119
 cheek+0.116
Neg logit contributions
aw-0.157
 Other-0.136
 Leave-0.130
 forward-0.130
 middle-0.128
Token to
Token ID284
Feature activation-7.76e-01

Pos logit contributions
ro+1.250
onna+0.984
emo+0.976
uzz+0.971
 cheek+0.930
Neg logit contributions
aw-1.237
 Other-1.079
 forward-1.066
 Leave-1.048
 middle-1.008
Token share
Token ID2648
Feature activation-9.60e-01

Pos logit contributions
ro+1.362
onna+1.117
emo+1.111
uzz+1.099
 cheek+1.059
Neg logit contributions
aw-1.317
 Other-1.197
 Leave-1.148
 middle-1.123
 forward-1.118
Token the
Token ID262
Feature activation-3.04e+00

Pos logit contributions
ro+1.834
onna+1.511
emo+1.453
uzz+1.436
 cheek+1.399
Neg logit contributions
aw-1.659
 Other-1.410
 forward-1.361
 middle-1.346
pieces-1.312
Token toy
Token ID13373
Feature activation-4.87e-01

Pos logit contributions
ro+5.744
onna+5.502
uzz+5.325
emo+4.970
nes+4.793
Neg logit contributions
 middle-4.954
aw-4.716
 Other-4.670
 Leave-4.380
 forward-4.268
Token.
Token ID13
Feature activation+5.15e-01

Pos logit contributions
ro+0.895
onna+0.737
emo+0.725
uzz+0.719
 cheek+0.680
Neg logit contributions
aw-0.822
 Other-0.727
 forward-0.704
 middle-0.694
 Leave-0.679
Token They
Token ID1119
Feature activation-3.60e-01

Pos logit contributions
aw+0.837
 Other+0.735
ifying+0.702
 forward+0.697
 Leave+0.696
Neg logit contributions
ro-0.983
onna-0.798
uzz-0.785
emo-0.780
 cheek-0.757
Token took
Token ID1718
Feature activation+4.11e-01

Pos logit contributions
ro+0.740
onna+0.620
emo+0.608
uzz+0.602
 cheek+0.596
Neg logit contributions
aw-0.540
 Other-0.456
 forward-0.434
 middle-0.429
ifying-0.428
Token turns
Token ID4962
Feature activation-9.93e-01

Pos logit contributions
aw+0.704
 Other+0.624
 Leave+0.592
ifying+0.592
 forward+0.588
Neg logit contributions
ro-0.747
onna-0.599
uzz-0.590
emo-0.587
 cheek-0.571
Token They
Token ID1119
Feature activation+2.36e-01

Pos logit contributions
ro+1.709
onna+1.449
emo+1.413
 seek+1.409
 cheek+1.406
Neg logit contributions
aw-1.126
 Other-1.010
 Leave-0.916
ifying-0.888
 middle-0.880
Token wish
Token ID4601
Feature activation-6.19e-01

Pos logit contributions
aw+0.444
 Other+0.397
 Leave+0.380
 forward+0.378
ifying+0.378
Neg logit contributions
ro-0.389
onna-0.304
uzz-0.299
emo-0.297
 cheek-0.287
Token they
Token ID484
Feature activation-3.28e-01

Pos logit contributions
ro+1.128
emo+0.891
uzz+0.878
onna+0.878
 seek+0.864
Neg logit contributions
aw-1.066
 Other-0.930
 forward-0.894
ifying-0.872
 Leave-0.870
Token had
Token ID550
Feature activation-6.69e-01

Pos logit contributions
ro+0.652
onna+0.535
uzz+0.526
emo+0.522
 cheek+0.510
Neg logit contributions
aw-0.511
 Other-0.434
ifying-0.428
 Leave-0.422
 forward-0.422
Token shared
Token ID4888
Feature activation-9.03e-01

Pos logit contributions
ro+1.315
onna+1.057
uzz+1.036
emo+1.024
 cheek+1.001
Neg logit contributions
aw-1.099
 Other-0.910
 Leave-0.881
 forward-0.867
 middle-0.865
Token the
Token ID262
Feature activation-3.01e+00

Pos logit contributions
ro+1.737
onna+1.419
emo+1.417
uzz+1.356
 cheek+1.351
Neg logit contributions
aw-1.501
 Other-1.265
 forward-1.240
ifying-1.219
 middle-1.210
Token sock
Token ID32263
Feature activation-4.61e-01

Pos logit contributions
ro+5.134
onna+4.677
nes+4.560
uzz+4.505
emo+4.246
Neg logit contributions
 middle-5.334
aw-5.267
 Other-4.909
 particularly-4.831
 forward-4.662
Token and
Token ID290
Feature activation-6.00e-01

Pos logit contributions
ro+0.823
emo+0.662
onna+0.660
uzz+0.648
 cheek+0.615
Neg logit contributions
aw-0.800
 Other-0.704
 forward-0.687
 middle-0.670
 Leave-0.659
Token played
Token ID2826
Feature activation-8.14e-01

Pos logit contributions
ro+1.091
onna+0.881
emo+0.871
uzz+0.871
 cheek+0.833
Neg logit contributions
aw-1.015
 Other-0.888
 Leave-0.859
 middle-0.854
ifying-0.839
Token nicely
Token ID16576
Feature activation-4.83e-01

Pos logit contributions
ro+1.355
onna+1.105
uzz+1.067
emo+1.065
 cheek+1.007
Neg logit contributions
aw-1.530
 Other-1.359
 forward-1.318
 middle-1.309
 Leave-1.266
Token.
Token ID13
Feature activation-9.06e-01

Pos logit contributions
ro+0.918
onna+0.762
emo+0.760
uzz+0.748
 cheek+0.728
Neg logit contributions
aw-0.782
 Other-0.673
 forward-0.654
ifying-0.637
 middle-0.630
Token and
Token ID290
Feature activation-6.62e-01

Pos logit contributions
ro+1.261
uzz+1.023
emo+1.004
onna+1.002
 cheek+0.935
Neg logit contributions
aw-1.252
 Other-1.092
 forward-1.072
 Leave-1.061
 middle-1.056
Token gave
Token ID2921
Feature activation-6.74e-01

Pos logit contributions
ro+1.226
uzz+0.995
onna+0.993
emo+0.977
 cheek+0.954
Neg logit contributions
aw-1.112
 Other-0.955
 forward-0.934
 middle-0.928
ifying-0.914
Token it
Token ID340
Feature activation-1.46e+00

Pos logit contributions
ro+1.172
onna+0.949
uzz+0.895
emo+0.877
 cheek+0.865
Neg logit contributions
aw-1.268
 Other-1.120
 forward-1.093
 middle-1.084
 Leave-1.023
Token a
Token ID257
Feature activation-1.27e+00

Pos logit contributions
ro+3.109
onna+2.588
emo+2.505
 cheek+2.478
uzz+2.464
Neg logit contributions
aw-2.127
 Other-1.850
 forward-1.839
 middle-1.795
ifying-1.735
Token good
Token ID922
Feature activation-2.29e+00

Pos logit contributions
ro+2.349
uzz+1.986
emo+1.918
onna+1.802
keeping+1.801
Neg logit contributions
aw-1.985
 Other-1.871
ifying-1.855
 middle-1.825
 forward-1.819
Token tidy
Token ID43044
Feature activation-3.00e+00

Pos logit contributions
ro+4.008
uzz+3.408
nes+3.327
emo+3.315
onna+3.202
Neg logit contributions
aw-3.709
 Other-3.564
 middle-3.467
 forward-3.419
 particularly-3.264
Token.
Token ID13
Feature activation-5.97e-01

Pos logit contributions
ro+5.070
uzz+4.735
emo+4.661
icated+4.422
nes+4.350
Neg logit contributions
aw-4.958
 middle-4.910
 Other-4.651
 forward-4.398
 particularly-4.255
Token\n
Token ID198
Feature activation-6.53e-01

Pos logit contributions
ro+1.371
onna+1.164
 cheek+1.153
uzz+1.153
emo+1.142
Neg logit contributions
aw-0.719
 Other-0.645
 Leave-0.569
 middle-0.554
ifying-0.554
Token\n
Token ID198
Feature activation-2.66e-01

Pos logit contributions
ro+1.346
 cheek+1.097
onna+1.086
uzz+1.077
emo+1.058
Neg logit contributions
aw-0.959
 Other-0.862
 Leave-0.785
 middle-0.783
 forward-0.772
TokenThe
Token ID464
Feature activation-1.12e+00

Pos logit contributions
ro+0.600
onna+0.509
uzz+0.502
 cheek+0.500
emo+0.496
Neg logit contributions
aw-0.340
 Other-0.287
 Leave-0.266
 forward-0.264
 middle-0.264
Token nap
Token ID25422
Feature activation-8.33e-01

Pos logit contributions
ro+2.070
uzz+1.799
onna+1.763
emo+1.706
water+1.617
Neg logit contributions
aw-1.820
 middle-1.683
 Other-1.626
 forward-1.584
ifying-1.539
Token\n
Token ID198
Feature activation-6.35e-01

Pos logit contributions
ro+0.924
onna+0.757
 cheek+0.749
uzz+0.745
 seek+0.739
Neg logit contributions
aw-0.706
 Other-0.631
 Leave-0.582
 forward-0.575
ifying-0.564
Token\n
Token ID198
Feature activation-5.33e-02

Pos logit contributions
ro+1.303
onna+1.062
 cheek+1.062
uzz+1.052
emo+1.030
Neg logit contributions
aw-0.955
 Other-0.844
 Leave-0.777
 middle-0.768
 forward-0.754
TokenEveryone
Token ID16190
Feature activation+1.29e-01

Pos logit contributions
ro+0.113
onna+0.095
uzz+0.093
emo+0.092
 cheek+0.092
Neg logit contributions
aw-0.077
 Other-0.065
 Leave-0.061
 forward-0.061
ifying-0.060
Token loved
Token ID6151
Feature activation-8.08e-01

Pos logit contributions
aw+0.226
 Other+0.199
 forward+0.190
 Leave+0.189
ifying+0.188
Neg logit contributions
ro-0.230
onna-0.185
uzz-0.180
emo-0.180
 cheek-0.175
Token the
Token ID262
Feature activation-2.74e+00

Pos logit contributions
ro+1.580
onna+1.298
 cheek+1.266
uzz+1.260
emo+1.259
Neg logit contributions
aw-1.311
 Other-1.149
 forward-1.082
 Leave-1.078
 particularly-1.041
Token little
Token ID1310
Feature activation-2.99e+00

Pos logit contributions
ro+5.780
onna+5.307
uzz+5.302
emo+4.844
nes+4.842
Neg logit contributions
aw-4.276
 Other-3.942
 middle-3.766
opl-3.581
 forward-3.469
Token boy
Token ID2933
Feature activation+2.85e-01

Pos logit contributions
ro+4.736
uzz+4.612
onna+4.534
nes+4.335
emo+4.097
Neg logit contributions
aw-5.724
 middle-5.213
 Other-5.089
 Leave-4.874
 forward-4.774
Token's
Token ID338
Feature activation-3.30e-01

Pos logit contributions
aw+0.459
 Other+0.403
 Leave+0.380
 forward+0.379
ifying+0.379
Neg logit contributions
ro-0.545
onna-0.444
uzz-0.436
emo-0.435
 cheek-0.423
Token stories
Token ID3923
Feature activation+1.90e-01

Pos logit contributions
ro+0.628
onna+0.519
emo+0.503
uzz+0.502
 seek+0.474
Neg logit contributions
aw-0.543
 Other-0.485
 forward-0.460
 Leave-0.454
 middle-0.449
Token about
Token ID546
Feature activation-2.26e-01

Pos logit contributions
aw+0.309
 Other+0.271
 forward+0.257
ifying+0.256
 Leave+0.255
Neg logit contributions
ro-0.359
onna-0.292
uzz-0.288
emo-0.288
 cheek-0.279
Token the
Token ID262
Feature activation-1.60e+00

Pos logit contributions
ro+0.432
onna+0.353
emo+0.346
uzz+0.345
 cheek+0.339
Neg logit contributions
aw-0.368
 Other-0.322
 forward-0.304
 Leave-0.302
ifying-0.302
Tokenara
Token ID3301
Feature activation-6.82e-01

Pos logit contributions
ro+1.524
onna+1.220
emo+1.212
uzz+1.185
 seek+1.179
Neg logit contributions
aw-1.920
 Other-1.582
 forward-1.530
 Leave-1.528
ifying-1.518
Token,
Token ID11
Feature activation-1.13e+00

Pos logit contributions
ro+1.500
onna+1.251
emo+1.246
uzz+1.226
 cheek+1.199
Neg logit contributions
aw-0.937
 Other-0.786
 Leave-0.741
 forward-0.719
ifying-0.709
Token I
Token ID314
Feature activation-3.90e-01

Pos logit contributions
ro+2.417
emo+2.002
onna+1.995
uzz+1.964
water+1.868
Neg logit contributions
aw-1.604
 Leave-1.373
 forward-1.354
 Other-1.348
ifying-1.336
Token give
Token ID1577
Feature activation-1.15e+00

Pos logit contributions
ro+0.762
onna+0.618
 cheek+0.608
emo+0.607
uzz+0.606
Neg logit contributions
aw-0.608
 Other-0.533
 Leave-0.520
 forward-0.514
ifying-0.508
Token you
Token ID345
Feature activation-5.60e-01

Pos logit contributions
ro+2.118
onna+1.723
emo+1.615
uzz+1.598
 seek+1.517
Neg logit contributions
aw-2.134
 forward-1.848
 Other-1.823
 middle-1.808
 Leave-1.774
Token the
Token ID262
Feature activation-2.96e+00

Pos logit contributions
ro+1.132
onna+0.933
emo+0.906
uzz+0.895
 cheek+0.872
Neg logit contributions
aw-0.887
 Other-0.760
ifying-0.746
 forward-0.727
 middle-0.726
Token power
Token ID1176
Feature activation-9.82e-01

Pos logit contributions
ro+4.674
onna+3.966
uzz+3.959
nes+3.790
emo+3.575
Neg logit contributions
 middle-5.902
aw-5.688
 Other-5.440
 forward-5.440
 Leave-5.267
Token to
Token ID284
Feature activation-8.75e-01

Pos logit contributions
ro+2.087
emo+1.776
onna+1.754
uzz+1.750
 cheek+1.646
Neg logit contributions
aw-1.336
 forward-1.148
 Other-1.143
ifying-1.131
 middle-1.111
Token choose
Token ID3853
Feature activation-7.86e-01

Pos logit contributions
ro+1.628
uzz+1.298
onna+1.294
emo+1.270
 cheek+1.218
Neg logit contributions
aw-1.470
 Other-1.328
 middle-1.286
 Leave-1.231
ifying-1.227
Token any
Token ID597
Feature activation-2.69e-01

Pos logit contributions
ro+1.413
emo+1.128
onna+1.116
uzz+1.086
 cheek+1.063
Neg logit contributions
aw-1.386
 Other-1.219
 forward-1.190
ifying-1.179
 middle-1.157
Token special
Token ID2041
Feature activation-1.06e+00

Pos logit contributions
ro+0.493
onna+0.397
uzz+0.397
emo+0.390
 cheek+0.376
Neg logit contributions
aw-0.456
 Other-0.405
 forward-0.387
 middle-0.383
ifying-0.381
Token.
Token ID13
Feature activation-5.17e-01

Pos logit contributions
ro+1.237
uzz+1.006
onna+0.992
emo+0.969
 seek+0.921
Neg logit contributions
aw-1.100
 Other-0.921
 Leave-0.893
 forward-0.880
 middle-0.866
Token I
Token ID314
Feature activation-5.73e-01

Pos logit contributions
ro+0.996
onna+0.822
uzz+0.802
emo+0.794
 seek+0.780
Neg logit contributions
aw-0.818
 Other-0.742
 Leave-0.730
ifying-0.705
 middle-0.675
Token should
Token ID815
Feature activation+1.27e-01

Pos logit contributions
ro+1.104
onna+0.906
 cheek+0.892
uzz+0.883
emo+0.871
Neg logit contributions
aw-0.923
 Other-0.793
 Leave-0.778
ifying-0.746
 forward-0.737
Token have
Token ID423
Feature activation-8.17e-01

Pos logit contributions
aw+0.210
 Other+0.187
 Leave+0.179
ifying+0.175
 forward+0.175
Neg logit contributions
ro-0.240
onna-0.193
uzz-0.190
emo-0.188
 cheek-0.183
Token shared
Token ID4888
Feature activation-1.02e+00

Pos logit contributions
ro+1.649
onna+1.322
uzz+1.293
 cheek+1.254
emo+1.248
Neg logit contributions
aw-1.313
 Other-1.095
 Leave-1.075
 middle-1.041
ifying-1.002
Token the
Token ID262
Feature activation-2.96e+00

Pos logit contributions
ro+1.887
onna+1.520
emo+1.498
uzz+1.467
 cheek+1.423
Neg logit contributions
aw-1.827
 Other-1.539
 forward-1.484
 middle-1.473
 Leave-1.465
Token shell
Token ID7582
Feature activation-9.36e-01

Pos logit contributions
ro+5.447
uzz+4.694
onna+4.493
nes+4.354
ge+3.889
Neg logit contributions
aw-5.630
 middle-5.149
 particularly-4.547
 Other-4.372
 forward-4.341
Token with
Token ID351
Feature activation-1.97e-01

Pos logit contributions
ro+1.741
emo+1.413
uzz+1.412
onna+1.387
Tal+1.316
Neg logit contributions
aw-1.563
 Other-1.344
 forward-1.314
 middle-1.283
 Leave-1.236
Token you
Token ID345
Feature activation-1.30e-01

Pos logit contributions
ro+0.320
onna+0.249
uzz+0.242
emo+0.238
 cheek+0.232
Neg logit contributions
aw-0.382
 Other-0.340
 Leave-0.324
 forward-0.324
ifying-0.324
Token,"
Token ID553
Feature activation-9.61e-02

Pos logit contributions
ro+0.228
onna+0.182
emo+0.179
uzz+0.179
 cheek+0.167
Neg logit contributions
aw-0.236
 Other-0.210
 forward-0.201
 Leave-0.199
ifying-0.198
Token Lily
Token ID20037
Feature activation-1.21e+00

Pos logit contributions
ro+0.136
onna+0.100
uzz+0.100
emo+0.097
 cheek+0.095
Neg logit contributions
aw-0.206
 Other-0.186
 Leave-0.178
 middle-0.178
 forward-0.177
Token it
Token ID340
Feature activation-3.46e-01

Pos logit contributions
ro+0.761
onna+0.603
emo+0.596
uzz+0.595
 cheek+0.583
Neg logit contributions
aw-0.889
 Other-0.803
ifying-0.773
 middle-0.771
 forward-0.758
Token.
Token ID13
Feature activation-4.67e-01

Pos logit contributions
ro+0.661
onna+0.542
emo+0.539
uzz+0.536
 cheek+0.518
Neg logit contributions
aw-0.562
 Other-0.487
 forward-0.465
ifying-0.461
 Leave-0.459
Token
Token ID198
Feature activation-2.52e-01

Pos logit contributions
ro+1.019
onna+0.863
uzz+0.846
emo+0.835
 cheek+0.830
Neg logit contributions
aw-0.658
 Other-0.588
 Leave-0.523
 middle-0.520
 forward-0.496
Token
Token ID198
Feature activation-1.11e-01

Pos logit contributions
ro+0.503
onna+0.409
uzz+0.403
 cheek+0.402
emo+0.400
Neg logit contributions
aw-0.388
 Other-0.346
 Leave-0.318
 middle-0.316
 forward-0.312
TokenThe
Token ID464
Feature activation-2.25e+00

Pos logit contributions
ro+0.252
onna+0.216
uzz+0.213
emo+0.211
 cheek+0.209
Neg logit contributions
aw-0.143
 Other-0.120
 Leave-0.110
 middle-0.109
ifying-0.107
Token little
Token ID1310
Feature activation-2.96e+00

Pos logit contributions
ro+4.493
onna+4.180
uzz+3.897
ert+3.654
emo+3.596
Neg logit contributions
aw-3.512
 middle-3.342
 Other-3.031
 forward-2.962
 Leave-2.941
Token girl
Token ID2576
Feature activation-5.68e-01

Pos logit contributions
onna+5.926
ro+5.750
uzz+5.291
emo+5.196
nes+5.131
Neg logit contributions
aw-4.892
 middle-4.575
 Leave-4.008
 Other-4.006
 forward-3.906
Token was
Token ID373
Feature activation+3.54e-01

Pos logit contributions
ro+1.082
onna+0.908
emo+0.881
uzz+0.880
 cheek+0.871
Neg logit contributions
aw-0.946
 Other-0.811
 Leave-0.767
 middle-0.755
 forward-0.741
Token very
Token ID845
Feature activation-9.40e-01

Pos logit contributions
aw+0.513
 Other+0.444
 Leave+0.415
ifying+0.414
 forward+0.414
Neg logit contributions
ro-0.741
onna-0.615
uzz-0.606
emo-0.603
 cheek-0.590
Token excited
Token ID6568
Feature activation-7.51e-01

Pos logit contributions
ro+1.766
onna+1.416
uzz+1.360
water+1.330
icated+1.325
Neg logit contributions
aw-1.571
 Other-1.422
 forward-1.401
 middle-1.369
 Leave-1.333
Token.
Token ID13
Feature activation-1.78e-01

Pos logit contributions
ro+1.484
onna+1.230
emo+1.207
uzz+1.200
 cheek+1.141
Neg logit contributions
aw-1.225
 Other-1.034
 forward-0.985
 Leave-0.952
 middle-0.948
Token didn
Token ID1422
Feature activation+8.95e-01

Pos logit contributions
ro+0.146
onna+0.116
uzz+0.115
emo+0.113
 cheek+0.110
Neg logit contributions
aw-0.156
 Other-0.137
 Leave-0.131
ifying-0.131
 forward-0.130
Token't
Token ID470
Feature activation-1.15e+00

Pos logit contributions
aw+1.415
 Other+1.211
ifying+1.150
 Leave+1.134
 forward+1.130
Neg logit contributions
ro-1.769
onna-1.477
emo-1.427
uzz-1.425
 cheek-1.381
Token want
Token ID765
Feature activation-3.97e-01

Pos logit contributions
ro+2.266
uzz+1.858
onna+1.829
 cheek+1.806
emo+1.797
Neg logit contributions
aw-1.690
 Other-1.522
 forward-1.510
 particularly-1.468
 Leave-1.460
Token to
Token ID284
Feature activation-1.24e+00

Pos logit contributions
ro+0.689
onna+0.540
uzz+0.533
 cheek+0.529
emo+0.522
Neg logit contributions
aw-0.722
 Other-0.647
 Leave-0.621
 forward-0.614
ifying-0.599
Token give
Token ID1577
Feature activation-7.88e-01

Pos logit contributions
ro+2.223
uzz+1.789
onna+1.744
emo+1.742
 cheek+1.718
Neg logit contributions
aw-2.068
 Other-1.887
 Leave-1.826
 middle-1.817
 forward-1.802
Token the
Token ID262
Feature activation-2.96e+00

Pos logit contributions
ro+1.476
onna+1.183
uzz+1.123
 seek+1.120
 cheek+1.119
Neg logit contributions
aw-1.346
 forward-1.202
 Other-1.183
 Leave-1.131
 middle-1.129
Token sheet
Token ID9629
Feature activation-5.33e-01

Pos logit contributions
ro+6.089
uzz+5.352
onna+5.306
nes+4.813
emo+4.653
Neg logit contributions
aw-4.812
 middle-4.613
 Other-4.246
 forward-3.995
 particularly-3.986
Token to
Token ID284
Feature activation+2.26e-02

Pos logit contributions
ro+1.154
emo+0.963
onna+0.956
uzz+0.943
 cheek+0.910
Neg logit contributions
aw-0.711
 forward-0.619
 Other-0.610
ifying-0.590
 Leave-0.581
Token Nem
Token ID22547
Feature activation-8.36e-01

Pos logit contributions
aw+0.048
 Other+0.044
 Leave+0.042
 forward+0.042
ifying+0.041
Neg logit contributions
ro-0.032
onna-0.024
uzz-0.023
emo-0.023
 cheek-0.022
Tokeno
Token ID78
Feature activation-8.97e-01

Pos logit contributions
ro+1.418
onna+1.126
uzz+1.094
 cheek+1.088
emo+1.060
Neg logit contributions
aw-1.497
 Other-1.308
 forward-1.288
 Leave-1.272
ifying-1.262
Token and
Token ID290
Feature activation-7.30e-01

Pos logit contributions
ro+1.840
emo+1.531
onna+1.527
uzz+1.495
 seek+1.466
Neg logit contributions
aw-1.324
 Other-1.152
 forward-1.127
 Leave-1.090
ifying-1.069
Token little
Token ID1310
Feature activation-5.75e-01

Pos logit contributions
ro+0.645
uzz+0.539
emo+0.527
onna+0.527
 cheek+0.493
Neg logit contributions
aw-0.547
 Other-0.481
 middle-0.463
ifying-0.457
 forward-0.453
Token boy
Token ID2933
Feature activation+9.24e-01

Pos logit contributions
ro+1.008
onna+0.846
uzz+0.845
emo+0.840
 cheek+0.765
Neg logit contributions
aw-1.029
 Other-0.907
 forward-0.869
 middle-0.865
ifying-0.853
Token carrying
Token ID6872
Feature activation-4.07e-01

Pos logit contributions
aw+1.338
 Other+1.184
ifying+1.139
 forward+1.099
opl+1.085
Neg logit contributions
ro-1.906
onna-1.558
uzz-1.545
emo-1.526
 seek-1.508
Token a
Token ID257
Feature activation-2.05e-01

Pos logit contributions
ro+0.812
onna+0.679
emo+0.649
uzz+0.646
 cheek+0.635
Neg logit contributions
aw-0.647
 Other-0.568
 middle-0.541
 forward-0.532
ifying-0.523
Token box
Token ID3091
Feature activation-5.19e-01

Pos logit contributions
ro+0.361
uzz+0.287
onna+0.286
emo+0.284
 cheek+0.265
Neg logit contributions
aw-0.358
 Other-0.319
ifying-0.310
 forward-0.307
 middle-0.306
Token filled
Token ID5901
Feature activation-2.95e+00

Pos logit contributions
ro+1.063
emo+0.891
onna+0.881
uzz+0.863
 cheek+0.838
Neg logit contributions
aw-0.761
 Other-0.640
 forward-0.631
ifying-0.619
 middle-0.617
Token with
Token ID351
Feature activation-7.70e-01

Pos logit contributions
ro+6.362
onna+5.669
 cheek+5.257
ert+5.184
 outage+5.072
Neg logit contributions
aw-4.590
 middle-4.043
 Other-3.658
 forward-3.573
ifying-3.118
Token empty
Token ID6565
Feature activation-5.71e-01

Pos logit contributions
ro+1.380
onna+1.127
emo+1.091
uzz+1.076
 cheek+1.032
Neg logit contributions
aw-1.381
 Other-1.224
 middle-1.193
ifying-1.171
 forward-1.166
Token boxes
Token ID10559
Feature activation-6.56e-01

Pos logit contributions
ro+1.024
emo+0.832
onna+0.830
uzz+0.821
 cheek+0.772
Neg logit contributions
aw-1.005
 Other-0.890
 middle-0.853
ifying-0.842
 Leave-0.823
Token,
Token ID11
Feature activation-6.54e-01

Pos logit contributions
ro+1.300
emo+1.081
onna+1.060
uzz+1.040
 seek+1.008
Neg logit contributions
aw-1.018
 forward-0.870
 Other-0.863
ifying-0.859
 middle-0.835
Token wood
Token ID4898
Feature activation-1.12e+00

Pos logit contributions
ro+1.141
onna+0.921
emo+0.921
uzz+0.916
 cheek+0.845
Neg logit contributions
aw-1.164
 Other-1.025
 middle-1.001
ifying-0.991
 Leave-0.984
Token They
Token ID1119
Feature activation-9.74e-01

Pos logit contributions
aw+0.588
 Other+0.518
 Leave+0.489
ifying+0.488
 forward+0.487
Neg logit contributions
ro-0.685
onna-0.557
uzz-0.547
emo-0.545
 cheek-0.531
Token were
Token ID547
Feature activation+3.65e-02

Pos logit contributions
ro+2.244
onna+1.892
emo+1.843
uzz+1.828
 cheek+1.828
Neg logit contributions
aw-1.259
 Other-1.006
 middle-0.999
 particularly-0.997
 forward-0.979
Token happy
Token ID3772
Feature activation-9.14e-01

Pos logit contributions
aw+0.054
 Other+0.046
 forward+0.044
 Leave+0.043
ifying+0.043
Neg logit contributions
ro-0.076
onna-0.063
emo-0.061
uzz-0.061
 cheek-0.060
Token to
Token ID284
Feature activation+9.97e-01

Pos logit contributions
ro+1.904
onna+1.544
uzz+1.504
emo+1.501
 cheek+1.500
Neg logit contributions
aw-1.423
 Other-1.205
 forward-1.143
 Leave-1.123
 particularly-1.077
Token share
Token ID2648
Feature activation-4.27e-01

Pos logit contributions
aw+1.806
 Other+1.550
ifying+1.483
 forward+1.460
 Leave+1.456
Neg logit contributions
ro-1.795
onna-1.447
uzz-1.416
emo-1.397
 seek-1.371
Token the
Token ID262
Feature activation-2.94e+00

Pos logit contributions
ro+0.799
onna+0.648
uzz+0.631
emo+0.622
 cheek+0.610
Neg logit contributions
aw-0.740
 Other-0.637
 forward-0.611
 Leave-0.604
 middle-0.597
Token bath
Token ID7837
Feature activation-1.49e+00

Pos logit contributions
ro+5.799
onna+5.317
uzz+5.099
emo+4.566
nes+4.524
Neg logit contributions
 middle-4.718
 Other-4.429
aw-4.305
 particularly-4.280
 forward-4.254
Tokenrobe
Token ID25481
Feature activation-2.05e-01

Pos logit contributions
ro+2.278
onna+1.951
uzz+1.869
emo+1.829
icated+1.747
Neg logit contributions
aw-2.819
 forward-2.578
 Other-2.548
 middle-2.494
 particularly-2.423
Token and
Token ID290
Feature activation-6.14e-01

Pos logit contributions
ro+0.357
onna+0.285
emo+0.282
uzz+0.280
 cheek+0.268
Neg logit contributions
aw-0.367
 Other-0.323
 forward-0.317
 Leave-0.306
 middle-0.306
Token be
Token ID307
Feature activation-8.10e-02

Pos logit contributions
ro+1.180
onna+0.962
uzz+0.945
emo+0.937
 cheek+0.905
Neg logit contributions
aw-0.972
 Other-0.857
 Leave-0.830
 middle-0.825
 forward-0.821
Token best
Token ID1266
Feature activation+9.30e-01

Pos logit contributions
ro+0.149
onna+0.120
uzz+0.115
emo+0.114
 cheek+0.113
Neg logit contributions
aw-0.140
 Other-0.124
 forward-0.119
 Leave-0.117
 middle-0.117
Token and
Token ID290
Feature activation+6.48e-01

Pos logit contributions
aw+0.718
 Other+0.636
 Leave+0.603
ifying+0.603
 forward+0.599
Neg logit contributions
ro-0.764
onna-0.614
uzz-0.604
emo-0.600
 cheek-0.586
Token decided
Token ID3066
Feature activation-7.51e-02

Pos logit contributions
aw+1.120
 Other+1.004
ifying+0.959
 Leave+0.943
 forward+0.936
Neg logit contributions
ro-1.159
onna-0.929
uzz-0.918
emo-0.918
 cheek-0.901
Token not
Token ID407
Feature activation-5.83e-01

Pos logit contributions
ro+0.124
onna+0.096
uzz+0.095
emo+0.094
 cheek+0.090
Neg logit contributions
aw-0.143
 Other-0.127
 Leave-0.123
 forward-0.123
 middle-0.120
Token to
Token ID284
Feature activation-1.64e-01

Pos logit contributions
ro+1.012
emo+0.779
onna+0.774
uzz+0.774
 cheek+0.749
Neg logit contributions
aw-1.033
 Other-0.933
 Leave-0.919
 forward-0.913
ifying-0.879
Token take
Token ID1011
Feature activation+5.63e-01

Pos logit contributions
ro+0.268
onna+0.209
uzz+0.207
emo+0.206
 cheek+0.196
Neg logit contributions
aw-0.307
 Other-0.277
 Leave-0.267
 forward-0.263
ifying-0.262
Token the
Token ID262
Feature activation-2.93e+00

Pos logit contributions
aw+1.049
 Other+0.938
ifying+0.898
 Leave+0.894
 forward+0.884
Neg logit contributions
ro-0.940
onna-0.735
uzz-0.726
emo-0.720
 cheek-0.700
Token toy
Token ID13373
Feature activation+2.87e-01

Pos logit contributions
ro+6.158
onna+5.463
uzz+5.409
emo+5.215
nes+5.016
Neg logit contributions
 middle-4.413
aw-4.342
 forward-4.138
 Other-3.971
opl-3.886
Token.
Token ID13
Feature activation-4.18e-01

Pos logit contributions
aw+0.515
 Other+0.460
 forward+0.437
 Leave+0.436
ifying+0.436
Neg logit contributions
ro-0.498
onna-0.396
uzz-0.389
emo-0.388
 cheek-0.375
Token
Token ID198
Feature activation-3.99e-01

Pos logit contributions
ro+0.907
onna+0.758
uzz+0.747
emo+0.741
 seek+0.738
Neg logit contributions
aw-0.572
 Other-0.514
 Leave-0.470
 middle-0.454
 forward-0.454
Token
Token ID198
Feature activation-1.28e-01

Pos logit contributions
ro+0.811
onna+0.656
uzz+0.649
emo+0.643
 cheek+0.643
Neg logit contributions
aw-0.602
 Other-0.538
 Leave-0.498
 middle-0.490
 forward-0.487
TokenL
Token ID43
Feature activation-4.00e-01

Pos logit contributions
ro+0.262
onna+0.219
uzz+0.215
emo+0.213
 cheek+0.208
Neg logit contributions
aw-0.193
 Other-0.168
 Leave-0.159
 middle-0.157
 forward-0.157
TokenMax
Token ID11518
Feature activation-1.55e+00

Pos logit contributions
aw+0.379
 Other+0.342
 Leave+0.327
 forward+0.325
ifying+0.325
Neg logit contributions
ro-0.294
onna-0.229
uzz-0.224
emo-0.222
 cheek-0.214
Token was
Token ID373
Feature activation+1.39e-01

Pos logit contributions
ro+3.125
onna+2.697
uzz+2.628
emo+2.618
 cheek+2.584
Neg logit contributions
aw-2.398
 Other-2.017
 Leave-1.914
 particularly-1.855
 forward-1.840
Token happy
Token ID3772
Feature activation-9.13e-01

Pos logit contributions
aw+0.227
 Other+0.201
 forward+0.190
 Leave+0.189
 middle+0.187
Neg logit contributions
ro-0.267
onna-0.216
uzz-0.212
emo-0.211
 cheek-0.206
Token to
Token ID284
Feature activation+6.80e-01

Pos logit contributions
ro+1.754
emo+1.414
onna+1.406
uzz+1.402
 cheek+1.350
Neg logit contributions
aw-1.492
 Other-1.286
 forward-1.265
 Leave-1.223
 particularly-1.202
Token share
Token ID2648
Feature activation-8.86e-01

Pos logit contributions
aw+1.205
 Other+1.046
ifying+1.005
 forward+0.994
 Leave+0.989
Neg logit contributions
ro-1.237
onna-0.994
emo-0.970
uzz-0.969
 cheek-0.940
Token the
Token ID262
Feature activation-2.91e+00

Pos logit contributions
ro+1.876
onna+1.565
uzz+1.537
emo+1.518
 cheek+1.488
Neg logit contributions
aw-1.258
 Other-1.099
 forward-1.038
 middle-1.009
 Leave-1.006
Token bath
Token ID7837
Feature activation-1.46e+00

Pos logit contributions
ro+5.482
onna+4.850
uzz+4.736
icated+4.282
ge+4.246
Neg logit contributions
 middle-4.841
 Other-4.546
 particularly-4.273
aw-4.236
 forward-4.231
Tokentub
Token ID37995
Feature activation-1.22e+00

Pos logit contributions
ro+1.731
onna+1.383
uzz+1.318
emo+1.261
icated+1.244
Neg logit contributions
aw-3.188
 Other-3.031
 middle-2.990
 forward-2.986
 particularly-2.856
Token with
Token ID351
Feature activation+1.22e+00

Pos logit contributions
ro+2.173
emo+1.780
onna+1.772
uzz+1.724
 cheek+1.629
Neg logit contributions
aw-2.114
 forward-1.871
 Other-1.866
 middle-1.809
 particularly-1.732
Token Lily
Token ID20037
Feature activation-5.99e-01

Pos logit contributions
aw+3.326
 Other+3.019
ifying+3.019
 Leave+2.975
 forward+2.960
Neg logit contributions
ro-1.032
emo-0.595
onna-0.584
uzz-0.568
 cheek-0.490
Token.
Token ID13
Feature activation-1.70e-01

Pos logit contributions
ro+1.198
onna+1.000
emo+0.987
uzz+0.966
 seek+0.921
Neg logit contributions
aw-0.921
 Other-0.809
 forward-0.789
 middle-0.755
 Leave-0.750
Tokenâ
Token ID22940
Feature activation-1.19e+00

Pos logit contributions
aw+0.092
 Other+0.081
 Leave+0.075
ifying+0.074
 forward+0.074
Neg logit contributions
ro-0.139
onna-0.116
uzz-0.115
emo-0.115
 cheek-0.112
Token€
Token ID26391
Feature activation+3.40e-01

Pos logit contributions
ro+2.206
onna+1.896
uzz+1.865
emo+1.837
 seek+1.705
Neg logit contributions
aw-2.067
 Other-1.705
 Leave-1.676
 forward-1.668
opl-1.654
Token
Token ID198
Feature activation-2.36e-01

Pos logit contributions
aw+0.516
 Other+0.450
 Leave+0.423
ifying+0.422
 forward+0.421
Neg logit contributions
ro-0.684
onna-0.563
uzz-0.554
emo-0.552
 cheek-0.539
Token
Token ID198
Feature activation+2.84e-01

Pos logit contributions
ro+0.467
onna+0.378
uzz+0.375
emo+0.372
 cheek+0.371
Neg logit contributions
aw-0.369
 Other-0.324
 Leave-0.307
 forward-0.300
 middle-0.299
TokenThe
Token ID464
Feature activation-1.57e+00

Pos logit contributions
aw+0.403
 Other+0.347
 Leave+0.325
ifying+0.323
 forward+0.323
Neg logit contributions
ro-0.602
onna-0.501
uzz-0.494
emo-0.491
 cheek-0.481
Token old
Token ID1468
Feature activation-2.90e+00

Pos logit contributions
ro+3.034
onna+2.573
emo+2.477
uzz+2.450
icated+2.277
Neg logit contributions
aw-2.501
 Other-2.455
 middle-2.381
 particularly-2.177
 forward-2.148
Token bear
Token ID6842
Feature activation-2.68e-04

Pos logit contributions
ro+6.587
emo+5.833
icated+5.820
onna+5.800
uzz+5.498
Neg logit contributions
 middle-3.425
 Other-3.316
aw-3.277
 particularly-2.933
opl-2.853
Token was
Token ID373
Feature activation+5.58e-01

Pos logit contributions
ro+0.000
onna+0.000
uzz+0.000
emo+0.000
 cheek+0.000
Neg logit contributions
aw-0.001
 Other-0.000
 Leave-0.000
 forward-0.000
ifying-0.000
Token so
Token ID523
Feature activation-8.78e-01

Pos logit contributions
aw+0.885
 Other+0.780
ifying+0.736
 Leave+0.734
 forward+0.727
Neg logit contributions
ro-1.079
onna-0.886
uzz-0.872
emo-0.865
 cheek-0.846
Token shocked
Token ID11472
Feature activation-4.70e-01

Pos logit contributions
ro+1.650
onna+1.284
uzz+1.250
emo+1.238
 cheek+1.232
Neg logit contributions
aw-1.525
 Other-1.322
 forward-1.285
 Leave-1.257
 middle-1.241
Token,
Token ID11
Feature activation-5.04e-01

Pos logit contributions
ro+0.950
onna+0.791
uzz+0.775
emo+0.773
 cheek+0.762
Neg logit contributions
aw-0.737
 Other-0.606
 forward-0.584
 Leave-0.581
ifying-0.562
Token owl
Token ID39610
Feature activation+2.44e-01

Pos logit contributions
ro+1.924
onna+1.602
emo+1.580
uzz+1.573
 seek+1.535
Neg logit contributions
aw-1.476
 Other-1.331
opl-1.210
 Leave-1.209
 forward-1.204
Token and
Token ID290
Feature activation+3.66e-01

Pos logit contributions
aw+0.408
 Other+0.361
 forward+0.341
 Leave+0.341
ifying+0.339
Neg logit contributions
ro-0.451
onna-0.365
uzz-0.358
emo-0.358
 cheek-0.346
Token decided
Token ID3066
Feature activation-4.35e-01

Pos logit contributions
aw+0.653
 Other+0.582
 Leave+0.552
ifying+0.552
 forward+0.550
Neg logit contributions
ro-0.640
onna-0.510
uzz-0.501
emo-0.498
 cheek-0.484
Token to
Token ID284
Feature activation-4.97e-02

Pos logit contributions
ro+0.762
onna+0.597
uzz+0.587
emo+0.583
 cheek+0.565
Neg logit contributions
aw-0.783
 Other-0.692
 forward-0.679
 Leave-0.669
 middle-0.646
Token share
Token ID2648
Feature activation-4.20e-01

Pos logit contributions
ro+0.082
onna+0.065
uzz+0.064
emo+0.063
 cheek+0.061
Neg logit contributions
aw-0.092
 Other-0.083
 Leave-0.079
 forward-0.078
ifying-0.078
Token the
Token ID262
Feature activation-2.89e+00

Pos logit contributions
ro+0.786
onna+0.638
uzz+0.611
emo+0.610
 cheek+0.596
Neg logit contributions
aw-0.729
 Other-0.627
 forward-0.601
 middle-0.592
 Leave-0.590
Token metal
Token ID6147
Feature activation-1.11e+00

Pos logit contributions
ro+6.330
onna+5.777
nes+5.378
uzz+5.260
emo+4.962
Neg logit contributions
 middle-3.941
aw-3.931
 Other-3.906
 Leave-3.758
opl-3.631
Token toy
Token ID13373
Feature activation-1.65e-02

Pos logit contributions
ro+1.969
onna+1.656
emo+1.590
uzz+1.571
 seek+1.516
Neg logit contributions
aw-1.886
 Other-1.704
 forward-1.649
 middle-1.647
 Leave-1.549
Token.
Token ID13
Feature activation-2.97e-01

Pos logit contributions
ro+0.030
onna+0.024
uzz+0.024
emo+0.024
 cheek+0.023
Neg logit contributions
aw-0.028
 Other-0.025
 forward-0.024
 Leave-0.024
 middle-0.024
Token They
Token ID1119
Feature activation-2.16e-01

Pos logit contributions
ro+0.616
onna+0.511
uzz+0.505
 cheek+0.502
 seek+0.498
Neg logit contributions
aw-0.437
 Other-0.387
 Leave-0.354
 forward-0.347
 middle-0.345
Token played
Token ID2826
Feature activation+1.19e-01

Pos logit contributions
ro+0.457
onna+0.383
uzz+0.374
emo+0.373
 cheek+0.369
Neg logit contributions
aw-0.310
 Other-0.261
 forward-0.245
 Leave-0.245
 middle-0.244
Token mom
Token ID1995
Feature activation-2.71e-01

Pos logit contributions
ro+0.812
onna+0.673
emo+0.648
uzz+0.639
 seek+0.623
Neg logit contributions
aw-0.705
 Other-0.621
 middle-0.596
 forward-0.591
 Leave-0.588
Token smiled
Token ID13541
Feature activation+4.56e-01

Pos logit contributions
ro+0.473
onna+0.376
emo+0.367
uzz+0.362
 cheek+0.354
Neg logit contributions
aw-0.495
 Other-0.432
 Leave-0.415
 forward-0.413
ifying-0.407
Token and
Token ID290
Feature activation-9.78e-01

Pos logit contributions
aw+0.791
 Other+0.704
 Leave+0.669
ifying+0.666
 forward+0.661
Neg logit contributions
ro-0.824
onna-0.659
uzz-0.650
emo-0.646
 cheek-0.630
Token explained
Token ID4893
Feature activation-1.11e-01

Pos logit contributions
ro+1.627
onna+1.250
emo+1.195
uzz+1.193
 cheek+1.162
Neg logit contributions
aw-1.942
 Other-1.663
 forward-1.606
 Leave-1.595
 particularly-1.583
Token that
Token ID326
Feature activation-1.65e+00

Pos logit contributions
ro+0.226
onna+0.184
emo+0.183
uzz+0.181
 cheek+0.176
Neg logit contributions
aw-0.171
 Other-0.147
 forward-0.139
 Leave-0.137
ifying-0.134
Token the
Token ID262
Feature activation-2.89e+00

Pos logit contributions
ro+3.370
onna+2.698
uzz+2.655
emo+2.646
 cheek+2.639
Neg logit contributions
aw-2.635
 Other-2.316
 forward-2.234
 middle-2.182
 Leave-2.169
Token jelly
Token ID27644
Feature activation-1.26e+00

Pos logit contributions
ro+4.846
uzz+4.319
onna+4.041
nes+3.727
ge+3.643
Neg logit contributions
aw-5.858
 middle-5.605
 particularly-4.995
 forward-4.825
 Other-4.816
Tokenfish
Token ID11084
Feature activation-2.66e-01

Pos logit contributions
ro+1.413
onna+0.974
uzz+0.966
 cheek+0.912
Tal+0.853
Neg logit contributions
aw-3.074
 middle-2.664
 particularly-2.658
 forward-2.657
 Other-2.647
Token was
Token ID373
Feature activation-6.86e-01

Pos logit contributions
ro+0.528
onna+0.434
uzz+0.425
emo+0.420
 cheek+0.412
Neg logit contributions
aw-0.414
 Other-0.355
 Leave-0.341
 forward-0.340
ifying-0.336
Token so
Token ID523
Feature activation-1.79e+00

Pos logit contributions
ro+1.437
onna+1.155
uzz+1.144
 cheek+1.125
emo+1.121
Neg logit contributions
aw-1.011
 forward-0.869
 Other-0.861
 particularly-0.841
ifying-0.825
Token special
Token ID2041
Feature activation-5.59e-01

Pos logit contributions
ro+3.628
onna+2.772
uzz+2.752
 cheek+2.710
water+2.648
Neg logit contributions
aw-3.008
 forward-2.519
 Other-2.487
 particularly-2.457
ifying-2.414
Token.
Token ID13
Feature activation-7.39e-02

Pos logit contributions
ro+0.075
onna+0.062
uzz+0.060
emo+0.060
 cheek+0.058
Neg logit contributions
aw-0.068
 Other-0.059
 forward-0.056
 Leave-0.056
 middle-0.055
Token They
Token ID1119
Feature activation+3.93e-03

Pos logit contributions
ro+0.148
onna+0.123
uzz+0.120
emo+0.119
 cheek+0.117
Neg logit contributions
aw-0.114
 Other-0.101
 Leave-0.094
 middle-0.093
ifying-0.093
Token decide
Token ID5409
Feature activation-1.33e+00

Pos logit contributions
aw+0.007
 Other+0.007
 Leave+0.006
 forward+0.006
ifying+0.006
Neg logit contributions
ro-0.007
onna-0.005
emo-0.005
uzz-0.005
 cheek-0.005
Token to
Token ID284
Feature activation-4.14e-01

Pos logit contributions
ro+2.635
onna+2.051
emo+2.044
uzz+1.986
water+1.964
Neg logit contributions
aw-2.107
 Other-1.875
 forward-1.869
 Leave-1.797
 middle-1.745
Token share
Token ID2648
Feature activation-1.38e+00

Pos logit contributions
ro+0.686
onna+0.541
uzz+0.533
emo+0.532
 cheek+0.510
Neg logit contributions
aw-0.756
 Other-0.686
 Leave-0.659
 middle-0.654
ifying-0.642
Token the
Token ID262
Feature activation-2.89e+00

Pos logit contributions
ro+2.946
onna+2.389
uzz+2.343
emo+2.335
 seek+2.298
Neg logit contributions
aw-2.168
 Other-1.758
 middle-1.737
 forward-1.709
ifying-1.582
Token cr
Token ID1067
Feature activation-2.78e-01

Pos logit contributions
ro+5.757
uzz+5.285
onna+4.984
nes+4.916
emo+4.899
Neg logit contributions
 middle-4.658
aw-4.631
 Other-3.985
opl-3.829
pieces-3.811
Tokenay
Token ID323
Feature activation-2.27e-01

Pos logit contributions
ro+0.770
onna+0.669
emo+0.667
uzz+0.666
 cheek+0.657
Neg logit contributions
aw-0.209
 Other-0.152
ifying-0.131
 middle-0.125
 forward-0.125
Tokenons
Token ID684
Feature activation-1.28e+00

Pos logit contributions
ro+0.585
onna+0.503
uzz+0.496
emo+0.492
 cheek+0.483
Neg logit contributions
aw-0.225
 Other-0.174
 middle-0.156
 Leave-0.154
 forward-0.154
Token.
Token ID13
Feature activation-6.53e-01

Pos logit contributions
ro+2.669
onna+2.195
emo+2.195
uzz+2.130
 seek+2.059
Neg logit contributions
aw-1.959
 forward-1.670
 Other-1.556
 middle-1.553
pieces-1.486
Token They
Token ID1119
Feature activation-2.34e-01

Pos logit contributions
ro+1.370
onna+1.158
uzz+1.131
emo+1.123
 cheek+1.121
Neg logit contributions
aw-0.952
 Other-0.848
 Leave-0.788
 middle-0.770
 forward-0.757
Token inside
Token ID2641
Feature activation-4.28e-01

Pos logit contributions
ro+1.003
onna+0.845
emo+0.835
uzz+0.824
 cheek+0.804
Neg logit contributions
aw-0.703
 Other-0.603
 forward-0.592
 middle-0.570
ifying-0.563
Token.
Token ID13
Feature activation-8.41e-01

Pos logit contributions
ro+0.806
onna+0.648
emo+0.646
uzz+0.640
 cheek+0.618
Neg logit contributions
aw-0.721
 Other-0.639
 forward-0.608
 middle-0.604
ifying-0.596
Token
Token ID198
Feature activation-4.43e-01

Pos logit contributions
ro+1.919
onna+1.648
emo+1.629
uzz+1.615
 cheek+1.601
Neg logit contributions
aw-1.044
 Other-0.964
 middle-0.862
 Leave-0.829
ifying-0.811
Token
Token ID198
Feature activation-1.88e-01

Pos logit contributions
ro+0.907
onna+0.741
 cheek+0.731
uzz+0.731
emo+0.728
Neg logit contributions
aw-0.659
 Other-0.591
 Leave-0.545
 middle-0.542
 forward-0.534
TokenThe
Token ID464
Feature activation-2.05e+00

Pos logit contributions
ro+0.389
onna+0.327
uzz+0.322
emo+0.321
 cheek+0.315
Neg logit contributions
aw-0.278
 Other-0.242
 Leave-0.228
 middle-0.226
ifying-0.223
Token little
Token ID1310
Feature activation-2.87e+00

Pos logit contributions
ro+4.008
onna+3.649
uzz+3.572
emo+3.483
ert+3.119
Neg logit contributions
 middle-3.157
aw-3.076
 Other-2.963
 forward-2.919
 particularly-2.800
Token girl
Token ID2576
Feature activation-5.10e-01

Pos logit contributions
ro+5.473
onna+5.279
emo+5.212
uzz+4.963
nes+4.602
Neg logit contributions
aw-4.567
 middle-4.429
 Other-4.136
 forward-4.130
 particularly-3.843
Token was
Token ID373
Feature activation+5.37e-02

Pos logit contributions
ro+0.997
onna+0.827
emo+0.816
uzz+0.808
 cheek+0.803
Neg logit contributions
aw-0.822
 Other-0.705
 Leave-0.668
 middle-0.660
 forward-0.647
Token excited
Token ID6568
Feature activation-8.20e-01

Pos logit contributions
aw+0.073
 Other+0.063
 forward+0.059
 Leave+0.058
 middle+0.058
Neg logit contributions
ro-0.117
onna-0.098
emo-0.096
uzz-0.096
 cheek-0.093
Token to
Token ID284
Feature activation+6.84e-01

Pos logit contributions
ro+1.546
onna+1.270
emo+1.254
uzz+1.246
 cheek+1.172
Neg logit contributions
aw-1.413
 Other-1.197
 forward-1.159
 Leave-1.109
 middle-1.108
Token see
Token ID766
Feature activation+5.22e-01

Pos logit contributions
aw+1.224
 Other+1.073
ifying+1.032
 Leave+1.016
 forward+1.016
Neg logit contributions
ro-1.207
onna-0.964
uzz-0.946
emo-0.946
 cheek-0.918
Token that
Token ID326
Feature activation-6.00e-01

Pos logit contributions
aw+0.674
 Other+0.593
 Leave+0.560
ifying+0.560
 forward+0.557
Neg logit contributions
ro-0.792
onna-0.645
uzz-0.633
emo-0.631
 cheek-0.615
Token Jimmy
Token ID12963
Feature activation+2.51e-01

Pos logit contributions
ro+1.189
onna+0.965
uzz+0.950
emo+0.938
 cheek+0.921
Neg logit contributions
aw-0.912
 Other-0.833
 forward-0.813
 middle-0.788
 Leave-0.777
Token didn
Token ID1422
Feature activation+8.95e-01

Pos logit contributions
aw+0.412
 Other+0.363
 Leave+0.344
 forward+0.343
ifying+0.343
Neg logit contributions
ro-0.473
onna-0.384
uzz-0.378
emo-0.376
 cheek-0.366
Token't
Token ID470
Feature activation-3.82e-01

Pos logit contributions
aw+1.420
 Other+1.208
ifying+1.148
 forward+1.136
 Leave+1.131
Neg logit contributions
ro-1.768
onna-1.466
uzz-1.421
emo-1.418
 cheek-1.374
Token share
Token ID2648
Feature activation-5.74e-01

Pos logit contributions
ro+0.721
onna+0.580
uzz+0.574
emo+0.572
 cheek+0.558
Neg logit contributions
aw-0.615
 Other-0.554
 forward-0.534
 Leave-0.533
 middle-0.513
Token the
Token ID262
Feature activation-2.86e+00

Pos logit contributions
ro+1.137
onna+0.941
uzz+0.923
emo+0.915
 cheek+0.902
Neg logit contributions
aw-0.883
 Other-0.772
 forward-0.751
ifying-0.724
 Leave-0.721
Token toy
Token ID13373
Feature activation-1.43e-01

Pos logit contributions
ro+5.180
uzz+4.938
onna+4.796
nes+4.653
emo+4.502
Neg logit contributions
 middle-4.491
 Other-4.481
aw-4.394
 particularly-4.151
 forward-4.096
Token with
Token ID351
Feature activation+3.46e-01

Pos logit contributions
ro+0.258
onna+0.210
emo+0.206
uzz+0.206
 cheek+0.197
Neg logit contributions
aw-0.244
 Other-0.218
 forward-0.210
 middle-0.205
ifying-0.205
Token him
Token ID683
Feature activation+1.21e-01

Pos logit contributions
aw+0.579
 Other+0.512
 Leave+0.484
ifying+0.484
 forward+0.483
Neg logit contributions
ro-0.643
onna-0.520
uzz-0.510
emo-0.508
 cheek-0.495
Token,
Token ID11
Feature activation+9.09e-02

Pos logit contributions
aw+0.182
 Other+0.159
 forward+0.150
 Leave+0.149
ifying+0.149
Neg logit contributions
ro-0.244
onna-0.202
emo-0.198
uzz-0.198
 cheek-0.192
Token but
Token ID475
Feature activation-3.16e-01

Pos logit contributions
aw+0.169
 Other+0.151
 forward+0.145
 Leave+0.145
ifying+0.143
Neg logit contributions
ro-0.152
onna-0.119
uzz-0.118
emo-0.117
 cheek-0.112
Token the
Token ID262
Feature activation+9.07e-01

Pos logit contributions
aw+2.066
 Other+1.841
 Leave+1.785
ifying+1.769
 forward+1.727
Neg logit contributions
ro-1.808
onna-1.424
uzz-1.397
emo-1.379
 cheek-1.374
Token barbecue
Token ID36646
Feature activation-5.57e-02

Pos logit contributions
aw+1.617
 Other+1.423
ifying+1.373
 Leave+1.365
 forward+1.344
Neg logit contributions
ro-1.601
onna-1.271
emo-1.262
 cheek-1.255
uzz-1.234
Token.
Token ID13
Feature activation-5.58e-03

Pos logit contributions
ro+0.103
onna+0.084
uzz+0.083
emo+0.083
 cheek+0.080
Neg logit contributions
aw-0.093
 Other-0.081
 forward-0.078
 middle-0.076
ifying-0.076
Token From
Token ID3574
Feature activation+2.99e-01

Pos logit contributions
ro+0.012
onna+0.010
uzz+0.010
emo+0.010
 cheek+0.010
Neg logit contributions
aw-0.007
 Other-0.006
 Leave-0.006
 forward-0.006
ifying-0.006
Token that
Token ID326
Feature activation+1.67e-01

Pos logit contributions
aw+0.382
 Other+0.322
 Leave+0.298
ifying+0.298
 forward+0.298
Neg logit contributions
ro-0.679
onna-0.572
uzz-0.563
emo-0.561
 cheek-0.549
Token day
Token ID1110
Feature activation+3.06e+00

Pos logit contributions
aw+0.331
 Other+0.302
 forward+0.289
 Leave+0.287
ifying+0.286
Neg logit contributions
ro-0.259
onna-0.199
uzz-0.195
emo-0.193
 cheek-0.185
Token on
Token ID319
Feature activation+1.13e+00

Pos logit contributions
aw+4.657
 Other+3.934
pieces+3.898
opl+3.746
ifying+3.657
Neg logit contributions
ro-6.148
 cheek-5.238
 seek-5.195
uzz-5.174
onna-4.990
Token,
Token ID11
Feature activation+3.49e-01

Pos logit contributions
aw+1.951
 Other+1.699
ifying+1.649
 Leave+1.606
 forward+1.600
Neg logit contributions
ro-2.105
onna-1.704
emo-1.635
uzz-1.627
 seek-1.618
Token Lily
Token ID20037
Feature activation+9.85e-01

Pos logit contributions
aw+0.784
 Other+0.717
 Leave+0.689
ifying+0.688
 forward+0.687
Neg logit contributions
ro-0.453
onna-0.328
uzz-0.319
emo-0.316
 cheek-0.303
Token always
Token ID1464
Feature activation+1.55e-02

Pos logit contributions
aw+1.546
 Other+1.365
ifying+1.338
 Leave+1.288
 forward+1.280
Neg logit contributions
ro-1.928
onna-1.580
emo-1.560
uzz-1.523
 cheek-1.486
Token helped
Token ID4193
Feature activation+1.41e+00

Pos logit contributions
aw+0.024
 Other+0.021
 forward+0.020
 Leave+0.020
ifying+0.019
Neg logit contributions
ro-0.031
onna-0.025
uzz-0.025
emo-0.025
 cheek-0.024
Token␣
Token ID220
Feature activation-1.44e-01

Pos logit contributions
aw+0.133
 Other+0.118
 Leave+0.110
 forward+0.108
ifying+0.108
Neg logit contributions
ro-0.184
onna-0.152
emo-0.149
uzz-0.149
 cheek-0.146
Token↵
Token ID198
Feature activation-6.07e-01

Pos logit contributions
ro+0.281
onna+0.228
uzz+0.224
emo+0.223
 cheek+0.222
Neg logit contributions
aw-0.227
 Other-0.201
 Leave-0.189
 forward-0.185
ifying-0.184
Token↵
Token ID198
Feature activation-2.25e-01

Pos logit contributions
ro+1.238
onna+1.015
uzz+1.004
 cheek+0.998
emo+0.996
Neg logit contributions
aw-0.890
 Other-0.805
 Leave-0.749
 middle-0.737
 forward-0.723
TokenFrom
Token ID4863
Feature activation-3.86e-01

Pos logit contributions
ro+0.464
onna+0.389
uzz+0.380
emo+0.379
 cheek+0.373
Neg logit contributions
aw-0.331
 Other-0.288
 Leave-0.272
 middle-0.269
 forward-0.268
Token that
Token ID326
Feature activation-6.89e-01

Pos logit contributions
ro+0.936
onna+0.805
uzz+0.782
emo+0.779
 cheek+0.762
Neg logit contributions
aw-0.461
 Other-0.352
 forward-0.340
 middle-0.333
 Leave-0.325
Token day
Token ID1110
Feature activation+2.42e+00

Pos logit contributions
ro+1.196
uzz+0.951
emo+0.940
onna+0.939
water+0.879
Neg logit contributions
aw-1.206
 Other-1.144
 forward-1.123
 middle-1.092
 particularly-1.070
Token on
Token ID319
Feature activation+9.29e-01

Pos logit contributions
aw+3.206
 Other+2.604
pieces+2.465
opl+2.447
ifying+2.430
Neg logit contributions
ro-5.444
uzz-4.621
 seek-4.483
 cheek-4.472
onna-4.449
Token,
Token ID11
Feature activation+1.14e-01

Pos logit contributions
aw+1.552
 Other+1.350
ifying+1.297
 Leave+1.279
 forward+1.266
Neg logit contributions
ro-1.778
onna-1.445
uzz-1.400
emo-1.390
 seek-1.366
Token Lily
Token ID20037
Feature activation+4.63e-01

Pos logit contributions
aw+0.283
 Other+0.263
 forward+0.254
 Leave+0.252
ifying+0.252
Neg logit contributions
ro-0.117
onna-0.078
uzz-0.075
emo-0.074
 cheek-0.070
Token always
Token ID1464
Feature activation-3.42e-01

Pos logit contributions
aw+0.718
 Other+0.630
ifying+0.595
 Leave+0.592
 forward+0.589
Neg logit contributions
ro-0.917
onna-0.750
uzz-0.737
emo-0.736
 cheek-0.716
Token made
Token ID925
Feature activation+4.44e-01

Pos logit contributions
ro+0.723
onna+0.604
uzz+0.596
emo+0.592
 cheek+0.580
Neg logit contributions
aw-0.481
 Other-0.413
 forward-0.393
 Leave-0.389
 middle-0.382
Token.
Token ID13
Feature activation-3.22e-01

Pos logit contributions
aw+0.509
 Other+0.449
 Leave+0.425
ifying+0.425
 forward+0.424
Neg logit contributions
ro-0.564
onna-0.456
uzz-0.448
emo-0.447
 cheek-0.434
Token<|endoftext|>
Token ID50256
Feature activation+5.42e-02

Pos logit contributions
ro+0.647
onna+0.528
uzz+0.517
 cheek+0.514
emo+0.513
Neg logit contributions
aw-0.487
 Other-0.443
 Leave-0.414
ifying-0.411
 forward-0.403
TokenBen
Token ID11696
Feature activation-1.48e+00

Pos logit contributions
aw+0.064
 Other+0.053
 Leave+0.048
ifying+0.048
 forward+0.048
Neg logit contributions
ro-0.127
onna-0.109
uzz-0.107
emo-0.107
 cheek-0.106
Token and
Token ID290
Feature activation+7.83e-01

Pos logit contributions
ro+3.112
onna+2.692
 cheek+2.680
uzz+2.657
emo+2.584
Neg logit contributions
aw-2.162
 Leave-1.719
 Other-1.713
 forward-1.588
 middle-1.587
Token Lily
Token ID20037
Feature activation-5.49e-01

Pos logit contributions
aw+1.460
 Other+1.313
ifying+1.254
 forward+1.245
 Leave+1.227
Neg logit contributions
ro-1.310
onna-1.030
uzz-1.026
emo-1.024
 cheek-0.974
Token were
Token ID547
Feature activation+2.04e+00

Pos logit contributions
ro+1.220
onna+1.047
emo+1.017
uzz+1.015
 cheek+1.009
Neg logit contributions
aw-0.749
 Other-0.601
 Leave-0.584
 forward-0.561
 middle-0.560
Token playing
Token ID2712
Feature activation+3.89e-01

Pos logit contributions
aw+3.264
ifying+2.803
 Other+2.795
opl+2.714
 Leave+2.691
Neg logit contributions
ro-3.680
uzz-3.291
emo-3.147
 cheek-3.142
onna-3.106
Token in
Token ID287
Feature activation-1.16e+00

Pos logit contributions
aw+0.652
 Other+0.576
 Leave+0.546
ifying+0.545
 forward+0.543
Neg logit contributions
ro-0.722
onna-0.582
uzz-0.573
emo-0.570
 cheek-0.555
Token the
Token ID262
Feature activation+2.13e-01

Pos logit contributions
ro+2.361
onna+1.968
emo+1.925
 cheek+1.917
uzz+1.850
Neg logit contributions
aw-1.708
 Other-1.592
 middle-1.547
 Leave-1.446
ifying-1.424
Token park
Token ID3952
Feature activation+2.72e-01

Pos logit contributions
aw+0.322
 Other+0.281
 Leave+0.263
ifying+0.262
 forward+0.262
Neg logit contributions
ro-0.431
onna-0.357
uzz-0.351
emo-0.349
 cheek-0.341
Token.
Token ID13
Feature activation+1.84e-01

Pos logit contributions
aw+0.486
 Other+0.433
 forward+0.410
 Leave+0.410
ifying+0.409
Neg logit contributions
ro-0.476
onna-0.381
uzz-0.372
emo-0.372
 cheek-0.360
Token␣
Token ID220
Feature activation-2.40e-01

Pos logit contributions
aw+0.511
 Other+0.449
 Leave+0.422
ifying+0.421
 forward+0.421
Neg logit contributions
ro-0.636
onna-0.520
uzz-0.512
emo-0.510
 cheek-0.497
Token↵
Token ID198
Feature activation-2.48e-01

Pos logit contributions
ro+0.474
onna+0.384
uzz+0.377
 cheek+0.372
emo+0.371
Neg logit contributions
aw-0.375
 Other-0.335
 Leave-0.311
 forward-0.307
ifying-0.306
Token↵
Token ID198
Feature activation+6.95e-02

Pos logit contributions
ro+0.497
onna+0.403
uzz+0.397
emo+0.390
 cheek+0.390
Neg logit contributions
aw-0.382
 Other-0.340
 Leave-0.315
 middle-0.313
 forward-0.310
TokenFrom
Token ID4863
Feature activation-3.49e-01

Pos logit contributions
aw+0.102
 Other+0.088
 Leave+0.082
 forward+0.082
ifying+0.081
Neg logit contributions
ro-0.145
onna-0.120
uzz-0.118
emo-0.117
 cheek-0.115
Token that
Token ID326
Feature activation-9.38e-01

Pos logit contributions
ro+0.842
onna+0.721
uzz+0.703
emo+0.694
 cheek+0.679
Neg logit contributions
aw-0.423
 Other-0.327
 forward-0.309
 middle-0.305
 Leave-0.302
Token day
Token ID1110
Feature activation+2.24e+00

Pos logit contributions
ro+1.704
onna+1.349
uzz+1.346
emo+1.323
water+1.230
Neg logit contributions
aw-1.598
 Other-1.539
 forward-1.485
 middle-1.439
opl-1.415
Token on
Token ID319
Feature activation+8.68e-01

Pos logit contributions
aw+3.019
 Other+2.456
pieces+2.309
opl+2.232
ifying+2.230
Neg logit contributions
ro-4.933
uzz-4.203
 cheek-4.185
onna-4.138
emo-4.116
Token,
Token ID11
Feature activation+2.53e-01

Pos logit contributions
aw+1.451
 Other+1.252
ifying+1.203
 Leave+1.183
 forward+1.179
Neg logit contributions
ro-1.657
onna-1.353
emo-1.325
uzz-1.312
 cheek-1.283
Token Tim
Token ID5045
Feature activation-2.94e-01

Pos logit contributions
aw+0.521
 Other+0.474
 forward+0.453
 Leave+0.452
ifying+0.452
Neg logit contributions
ro-0.374
onna-0.285
uzz-0.278
emo-0.274
 cheek-0.267
Tokenmy
Token ID1820
Feature activation+4.31e-01

Pos logit contributions
ro+0.530
onna+0.429
uzz+0.422
emo+0.412
 cheek+0.406
Neg logit contributions
aw-0.512
 Other-0.450
 forward-0.428
 Leave-0.426
opl-0.420
Token wore
Token ID12408
Feature activation+6.83e-01

Pos logit contributions
aw+0.693
 Other+0.610
ifying+0.578
 Leave+0.576
 forward+0.574
Neg logit contributions
ro-0.825
onna-0.672
uzz-0.659
emo-0.658
 cheek-0.640
Token things
Token ID1243
Feature activation-8.56e-01

Pos logit contributions
aw+0.288
 Other+0.254
 Leave+0.242
 forward+0.241
ifying+0.241
Neg logit contributions
ro-0.284
onna-0.228
uzz-0.225
emo-0.222
 cheek-0.216
Token he
Token ID339
Feature activation+7.89e-01

Pos logit contributions
ro+1.760
uzz+1.439
onna+1.435
emo+1.431
 cheek+1.408
Neg logit contributions
aw-1.277
 forward-1.049
 Other-1.046
 Leave-1.021
ifying-1.019
Token could
Token ID714
Feature activation+3.96e-01

Pos logit contributions
aw+1.326
 Other+1.185
ifying+1.124
 Leave+1.113
 forward+1.111
Neg logit contributions
ro-1.460
onna-1.177
emo-1.156
uzz-1.152
 cheek-1.118
Token observe
Token ID12414
Feature activation+1.01e+00

Pos logit contributions
aw+0.719
 Other+0.642
ifying+0.610
 Leave+0.610
 forward+0.608
Neg logit contributions
ro-0.681
onna-0.540
uzz-0.529
emo-0.527
 cheek-0.512
Token in
Token ID287
Feature activation+8.51e-01

Pos logit contributions
aw+1.810
 Other+1.639
 Leave+1.574
ifying+1.568
 forward+1.533
Neg logit contributions
ro-1.723
onna-1.357
uzz-1.330
emo-1.320
 cheek-1.280
Token the
Token ID262
Feature activation+2.47e+00

Pos logit contributions
aw+1.608
 Other+1.440
 Leave+1.376
ifying+1.368
 forward+1.359
Neg logit contributions
ro-1.397
onna-1.085
uzz-1.071
emo-1.052
 cheek-1.022
Token field
Token ID2214
Feature activation+1.19e-01

Pos logit contributions
aw+4.045
 Leave+3.754
 Other+3.673
pieces+3.658
ifying+3.649
Neg logit contributions
ro-4.526
 cheek-3.656
 seek-3.611
water-3.554
 moment-3.491
Token.
Token ID13
Feature activation+3.74e-02

Pos logit contributions
aw+0.200
 Other+0.176
 forward+0.167
ifying+0.166
 Leave+0.166
Neg logit contributions
ro-0.218
onna-0.176
emo-0.173
uzz-0.173
 cheek-0.168
Token He
Token ID679
Feature activation+3.16e-01

Pos logit contributions
aw+0.046
 Other+0.040
 Leave+0.036
 forward+0.036
ifying+0.036
Neg logit contributions
ro-0.086
onna-0.073
uzz-0.072
emo-0.072
 cheek-0.071
Token walked
Token ID6807
Feature activation-5.99e-01

Pos logit contributions
aw+0.509
 Other+0.447
 Leave+0.421
ifying+0.421
 forward+0.420
Neg logit contributions
ro-0.606
onna-0.494
uzz-0.486
emo-0.484
 cheek-0.472
Token around
Token ID1088
Feature activation-8.46e-01

Pos logit contributions
ro+1.135
onna+0.951
emo+0.921
 cheek+0.900
uzz+0.895
Neg logit contributions
aw-0.976
 forward-0.875
 Other-0.857
 middle-0.830
 particularly-0.806
Token always
Token ID1464
Feature activation-6.83e-01

Pos logit contributions
aw+0.748
 Other+0.659
 Leave+0.622
ifying+0.621
 forward+0.616
Neg logit contributions
ro-0.888
onna-0.725
uzz-0.714
emo-0.710
 cheek-0.692
Token careful
Token ID8161
Feature activation-8.15e-01

Pos logit contributions
ro+1.323
onna+1.078
uzz+1.025
emo+1.013
 cheek+0.980
Neg logit contributions
aw-1.134
 forward-1.001
 Other-0.971
 middle-0.933
ifying-0.932
Token not
Token ID407
Feature activation-9.26e-01

Pos logit contributions
ro+1.518
onna+1.223
emo+1.209
uzz+1.175
 seek+1.162
Neg logit contributions
aw-1.378
 forward-1.211
 Other-1.198
 Leave-1.141
ifying-1.137
Token to
Token ID284
Feature activation+1.60e-01

Pos logit contributions
ro+1.609
uzz+1.283
emo+1.244
onna+1.240
 cheek+1.235
Neg logit contributions
aw-1.633
 Other-1.424
 forward-1.414
 Leave-1.407
ifying-1.383
Token hurt
Token ID5938
Feature activation+1.87e-01

Pos logit contributions
aw+0.308
 Other+0.277
 Leave+0.266
ifying+0.264
 forward+0.264
Neg logit contributions
ro-0.257
onna-0.200
uzz-0.198
emo-0.195
 cheek-0.189
Token each
Token ID1123
Feature activation+1.84e+00

Pos logit contributions
aw+0.347
 Other+0.309
 forward+0.295
 Leave+0.294
ifying+0.294
Neg logit contributions
ro-0.315
onna-0.249
uzz-0.243
emo-0.242
 cheek-0.235
Token other
Token ID584
Feature activation-9.35e-01

Pos logit contributions
aw+2.964
ifying+2.557
 Other+2.554
 Leave+2.500
opl+2.340
Neg logit contributions
ro-3.575
emo-2.878
 cheek-2.797
 seek-2.787
uzz-2.781
Token.
Token ID13
Feature activation-5.82e-01

Pos logit contributions
ro+1.872
onna+1.603
emo+1.585
uzz+1.566
 seek+1.466
Neg logit contributions
aw-1.414
 forward-1.228
 Other-1.203
 middle-1.166
ifying-1.155
Token In
Token ID554
Feature activation-2.28e-01

Pos logit contributions
ro+1.219
onna+1.005
 seek+0.997
emo+0.993
 cheek+0.990
Neg logit contributions
aw-0.836
 Other-0.770
 Leave-0.693
ifying-0.685
 middle-0.675
Token the
Token ID262
Feature activation+1.18e+00

Pos logit contributions
ro+0.435
onna+0.354
emo+0.349
uzz+0.344
 cheek+0.339
Neg logit contributions
aw-0.371
 Other-0.332
 forward-0.314
 middle-0.313
ifying-0.312
Token end
Token ID886
Feature activation-2.94e-01

Pos logit contributions
aw+1.374
 Other+1.130
 Leave+1.077
ifying+1.031
 forward+0.969
Neg logit contributions
ro-2.841
onna-2.376
 cheek-2.351
 seek-2.328
uzz-2.324
Token Mom
Token ID11254
Feature activation+6.97e-01

Pos logit contributions
aw+0.495
 Other+0.434
 Leave+0.408
ifying+0.407
 forward+0.407
Neg logit contributions
ro-0.635
onna-0.521
uzz-0.513
emo-0.510
 cheek-0.498
Token said
Token ID531
Feature activation+2.97e-01

Pos logit contributions
aw+1.313
 Other+1.188
ifying+1.141
 Leave+1.125
 forward+1.120
Neg logit contributions
ro-1.142
onna-0.892
uzz-0.878
emo-0.861
 cheek-0.851
Token,
Token ID11
Feature activation+6.75e-01

Pos logit contributions
aw+0.453
 Other+0.395
 Leave+0.372
 forward+0.371
ifying+0.370
Neg logit contributions
ro-0.597
onna-0.491
uzz-0.483
emo-0.482
 cheek-0.469
Token �
Token ID6184
Feature activation-5.63e-02

Pos logit contributions
aw+1.103
 Other+0.964
ifying+0.917
 Leave+0.912
 forward+0.901
Neg logit contributions
ro-1.284
onna-1.045
uzz-1.025
emo-1.014
 cheek-1.010
Token�
Token ID95
Feature activation+2.41e-01

Pos logit contributions
ro+0.113
onna+0.094
emo+0.093
uzz+0.093
 cheek+0.089
Neg logit contributions
aw-0.087
 Other-0.074
 Leave-0.071
ifying-0.070
 forward-0.069
Token€
Token ID26391
Feature activation+8.56e-01

Pos logit contributions
aw+0.387
 Other+0.339
 Leave+0.321
ifying+0.318
 forward+0.317
Neg logit contributions
ro-0.465
onna-0.381
uzz-0.375
emo-0.374
 cheek-0.362
Token�
Token ID129
Feature activation-2.60e-01

Pos logit contributions
aw+1.265
 Other+1.122
 forward+1.057
ifying+1.050
 middle+1.038
Neg logit contributions
ro-1.738
onna-1.426
uzz-1.400
emo-1.389
 cheek-1.375
Token�
Token ID241
Feature activation-2.77e-01

Pos logit contributions
ro+0.597
onna+0.510
emo+0.501
uzz+0.501
 seek+0.485
Neg logit contributions
aw-0.328
 Other-0.270
 Leave-0.255
 forward-0.248
opl-0.246
TokenIt
Token ID1026
Feature activation+2.86e-01

Pos logit contributions
ro+0.537
onna+0.442
emo+0.433
uzz+0.431
 cheek+0.425
Neg logit contributions
aw-0.440
 Other-0.380
 Leave-0.363
ifying-0.360
 forward-0.358
Tokenâ
Token ID22940
Feature activation-3.77e-01

Pos logit contributions
aw+0.467
 Other+0.411
 Leave+0.389
 forward+0.387
ifying+0.387
Neg logit contributions
ro-0.543
onna-0.441
uzz-0.434
emo-0.432
 cheek-0.420
Token€
Token ID26391
Feature activation+7.49e-01

Pos logit contributions
ro+0.688
onna+0.579
uzz+0.567
emo+0.561
 seek+0.530
Neg logit contributions
aw-0.645
 Other-0.557
 Leave-0.540
 forward-0.534
 middle-0.523
Token was
Token ID373
Feature activation-3.72e-01

Pos logit contributions
ro+1.009
uzz+0.830
onna+0.830
emo+0.823
 cheek+0.776
Neg logit contributions
aw-0.756
 Other-0.655
 Leave-0.619
 forward-0.617
 middle-0.612
Token definitely
Token ID4753
Feature activation-6.69e-01

Pos logit contributions
ro+0.737
onna+0.595
uzz+0.589
emo+0.586
 cheek+0.571
Neg logit contributions
aw-0.592
 Other-0.517
 forward-0.497
ifying-0.484
 middle-0.482
Token coming
Token ID2406
Feature activation-8.26e-01

Pos logit contributions
ro+1.350
onna+1.096
uzz+1.076
emo+1.072
 cheek+1.036
Neg logit contributions
aw-1.058
 Other-0.903
 forward-0.888
ifying-0.849
 middle-0.848
Token from
Token ID422
Feature activation-2.95e-01

Pos logit contributions
ro+1.516
emo+1.221
onna+1.195
uzz+1.166
 cheek+1.137
Neg logit contributions
aw-1.380
 forward-1.273
 Other-1.266
ifying-1.210
 middle-1.202
Token the
Token ID262
Feature activation+1.80e-01

Pos logit contributions
ro+0.549
onna+0.444
emo+0.439
uzz+0.438
 cheek+0.425
Neg logit contributions
aw-0.492
 Other-0.440
 forward-0.417
 middle-0.415
ifying-0.411
Token chim
Token ID18205
Feature activation+6.87e-01

Pos logit contributions
aw+0.273
 Other+0.240
ifying+0.224
 forward+0.223
 middle+0.223
Neg logit contributions
ro-0.365
onna-0.300
uzz-0.297
emo-0.297
 cheek-0.286
Tokenney
Token ID1681
Feature activation+1.61e-02

Pos logit contributions
aw+1.237
 Other+1.104
 Leave+1.069
 forward+1.057
ifying+1.051
Neg logit contributions
ro-1.190
onna-0.947
uzz-0.920
emo-0.906
 cheek-0.888
Token!
Token ID0
Feature activation+1.93e-01

Pos logit contributions
aw+0.027
 Other+0.024
 forward+0.023
ifying+0.023
 Leave+0.022
Neg logit contributions
ro-0.030
onna-0.024
emo-0.024
uzz-0.024
 cheek-0.023
Token G
Token ID402
Feature activation-8.41e-01

Pos logit contributions
aw+0.254
 Other+0.219
 Leave+0.202
ifying+0.200
 forward+0.200
Neg logit contributions
ro-0.429
onna-0.361
emo-0.356
uzz-0.356
 cheek-0.348
Tokenently
Token ID1473
Feature activation-7.35e-01

Pos logit contributions
ro+1.792
 seek+1.477
onna+1.461
emo+1.438
 cheek+1.436
Neg logit contributions
aw-1.203
 Other-1.008
opl-0.932
ifying-0.912
 Leave-0.908
Token she
Token ID673
Feature activation+4.34e-01

Pos logit contributions
ro+1.632
onna+1.368
emo+1.367
uzz+1.343
 cheek+1.287
Neg logit contributions
aw-1.024
 Other-0.830
 forward-0.808
 Leave-0.807
 middle-0.775
Token to
Token ID284
Feature activation-9.41e-01

Pos logit contributions
ro+0.090
onna+0.070
uzz+0.070
emo+0.069
 cheek+0.067
Neg logit contributions
aw-0.102
 Other-0.090
 Leave-0.086
 forward-0.085
ifying-0.085
Token be
Token ID307
Feature activation+2.19e-02

Pos logit contributions
ro+1.567
onna+1.259
emo+1.256
uzz+1.251
 cheek+1.239
Neg logit contributions
aw-1.645
 Other-1.541
 middle-1.507
 Leave-1.468
opl-1.387
Token animals
Token ID4695
Feature activation+1.41e-01

Pos logit contributions
aw+0.037
 Other+0.033
 forward+0.031
ifying+0.031
 Leave+0.031
Neg logit contributions
ro-0.040
onna-0.033
uzz-0.032
emo-0.032
 cheek-0.032
Token,
Token ID11
Feature activation-4.54e-01

Pos logit contributions
aw+0.235
 Other+0.206
 forward+0.195
 Leave+0.195
ifying+0.194
Neg logit contributions
ro-0.264
onna-0.214
uzz-0.211
emo-0.211
 cheek-0.205
Token or
Token ID393
Feature activation+3.88e-01

Pos logit contributions
ro+0.839
onna+0.677
uzz+0.672
emo+0.670
 cheek+0.663
Neg logit contributions
aw-0.748
 Other-0.662
 Leave-0.633
 middle-0.615
ifying-0.615
Token pirates
Token ID27516
Feature activation+3.53e-01

Pos logit contributions
aw+0.677
 Other+0.602
 Leave+0.571
ifying+0.571
 forward+0.569
Neg logit contributions
ro-0.688
onna-0.551
uzz-0.540
emo-0.538
 cheek-0.523
Token,
Token ID11
Feature activation-1.89e-01

Pos logit contributions
aw+0.594
 Other+0.525
 Leave+0.497
ifying+0.496
 forward+0.495
Neg logit contributions
ro-0.651
onna-0.525
uzz-0.516
emo-0.514
 cheek-0.500
Token or
Token ID393
Feature activation+2.30e-01

Pos logit contributions
ro+0.356
onna+0.292
emo+0.287
uzz+0.287
 cheek+0.281
Neg logit contributions
aw-0.308
 Other-0.269
 Leave-0.257
 forward-0.255
ifying-0.254
Token superheroes
Token ID40896
Feature activation+6.60e-03

Pos logit contributions
aw+0.397
 Other+0.353
 Leave+0.335
ifying+0.334
 forward+0.334
Neg logit contributions
ro-0.411
onna-0.329
uzz-0.324
emo-0.322
 cheek-0.314
Token.
Token ID13
Feature activation+7.69e-01

Pos logit contributions
aw+0.011
 Other+0.010
 Leave+0.009
 forward+0.009
ifying+0.009
Neg logit contributions
ro-0.012
onna-0.010
emo-0.010
uzz-0.010
 cheek-0.010
Token They
Token ID1119
Feature activation-4.92e-01

Pos logit contributions
aw+1.539
 Other+1.364
 forward+1.330
ifying+1.329
 Leave+1.320
Neg logit contributions
ro-1.193
onna-0.922
uzz-0.901
emo-0.884
 cheek-0.843
Token.
Token ID13
Feature activation+5.64e-01

Pos logit contributions
ro+0.360
onna+0.297
uzz+0.291
emo+0.291
 cheek+0.277
Neg logit contributions
aw-0.310
 Other-0.272
 forward-0.264
ifying-0.260
 middle-0.257
Token↵
Token ID198
Feature activation+4.18e-01

Pos logit contributions
aw+0.950
 Other+0.836
ifying+0.810
 forward+0.800
 Leave+0.796
Neg logit contributions
ro-1.040
onna-0.838
uzz-0.826
emo-0.819
 cheek-0.795
Token↵
Token ID198
Feature activation+1.84e+00

Pos logit contributions
aw+0.695
 Other+0.614
ifying+0.583
 Leave+0.581
 forward+0.580
Neg logit contributions
ro-0.774
onna-0.626
uzz-0.616
emo-0.613
 cheek-0.596
TokenOne
Token ID3198
Feature activation-6.02e-01

Pos logit contributions
aw+2.505
ifying+2.400
 forward+2.203
 Other+2.202
pieces+2.150
Neg logit contributions
ro-3.899
uzz-3.212
onna-3.160
 seek-3.153
emo-3.131
Token day
Token ID1110
Feature activation+1.38e+00

Pos logit contributions
ro+1.030
onna+0.831
uzz+0.798
emo+0.780
 cheek+0.765
Neg logit contributions
aw-1.103
 Other-0.997
 middle-0.961
 forward-0.960
 particularly-0.923
Token,
Token ID11
Feature activation+1.50e+00

Pos logit contributions
aw+2.347
 Other+2.099
ifying+2.076
opl+2.041
pieces+2.026
Neg logit contributions
ro-2.474
uzz-1.969
onna-1.942
emo-1.927
 seek-1.898
Token Sue
Token ID26089
Feature activation-1.14e-02

Pos logit contributions
aw+3.296
opl+3.044
ifying+3.029
 Other+3.022
 Leave+2.980
Neg logit contributions
ro-1.940
uzz-1.387
 seek-1.354
onna-1.347
emo-1.337
Token saw
Token ID2497
Feature activation+3.28e-01

Pos logit contributions
ro+0.023
onna+0.019
uzz+0.019
emo+0.018
 cheek+0.018
Neg logit contributions
aw-0.018
 Other-0.015
 Leave-0.014
 forward-0.014
 middle-0.014
Token a
Token ID257
Feature activation-4.47e-01

Pos logit contributions
aw+0.565
 Other+0.500
 Leave+0.473
ifying+0.472
 forward+0.471
Neg logit contributions
ro-0.602
onna-0.485
uzz-0.475
emo-0.473
 cheek-0.461
Token beautiful
Token ID4950
Feature activation-1.34e+00

Pos logit contributions
ro+0.842
uzz+0.693
onna+0.678
emo+0.670
 cheek+0.632
Neg logit contributions
aw-0.727
 Other-0.639
 forward-0.610
ifying-0.607
 middle-0.607
Token rainbow
Token ID27223
Feature activation+1.04e+00

Pos logit contributions
ro+2.301
uzz+1.978
onna+1.915
emo+1.899
nes+1.825
Neg logit contributions
aw-2.391
 Other-2.044
 middle-2.036
 particularly-1.967
 forward-1.939
Token toast
Token ID27805
Feature activation+2.35e-01

Pos logit contributions
ro+0.081
onna+0.065
emo+0.065
uzz+0.064
 cheek+0.062
Neg logit contributions
aw-0.078
 Other-0.068
 forward-0.064
 Leave-0.064
 middle-0.063
Token,
Token ID11
Feature activation-6.64e-02

Pos logit contributions
aw+0.400
 Other+0.353
 forward+0.334
 Leave+0.333
ifying+0.332
Neg logit contributions
ro-0.429
onna-0.345
uzz-0.340
emo-0.339
 cheek-0.329
Token but
Token ID475
Feature activation-8.44e-01

Pos logit contributions
ro+0.105
onna+0.081
emo+0.080
uzz+0.080
 cheek+0.076
Neg logit contributions
aw-0.129
 Other-0.116
 Leave-0.111
ifying-0.110
 forward-0.110
Token only
Token ID691
Feature activation-1.61e-01

Pos logit contributions
ro+1.560
emo+1.251
onna+1.235
uzz+1.205
 cheek+1.197
Neg logit contributions
aw-1.427
 Other-1.246
 middle-1.210
 forward-1.203
 Leave-1.178
Token a
Token ID257
Feature activation+5.11e-01

Pos logit contributions
ro+0.316
emo+0.257
onna+0.257
uzz+0.255
 cheek+0.245
Neg logit contributions
aw-0.251
 Other-0.221
 forward-0.213
ifying-0.210
 middle-0.206
Token little
Token ID1310
Feature activation-7.85e-02

Pos logit contributions
aw+0.932
 Other+0.833
 Leave+0.794
ifying+0.792
 forward+0.790
Neg logit contributions
ro-0.874
onna-0.693
uzz-0.673
emo-0.672
 cheek-0.658
Token bit
Token ID1643
Feature activation+3.25e-01

Pos logit contributions
ro+0.139
uzz+0.112
emo+0.112
onna+0.112
 cheek+0.105
Neg logit contributions
aw-0.137
 Other-0.122
 forward-0.116
 Leave-0.115
ifying-0.114
Token.
Token ID13
Feature activation+7.37e-01

Pos logit contributions
aw+0.518
 Other+0.454
 Leave+0.428
ifying+0.427
 forward+0.427
Neg logit contributions
ro-0.634
onna-0.518
uzz-0.510
emo-0.508
 cheek-0.495
Token She
Token ID1375
Feature activation+1.19e+00

Pos logit contributions
aw+1.546
 Other+1.383
ifying+1.361
 forward+1.353
 Leave+1.340
Neg logit contributions
ro-1.064
onna-0.804
uzz-0.785
emo-0.768
 cheek-0.731
Token prefers
Token ID26237
Feature activation+1.12e-01

Pos logit contributions
aw+2.178
 Other+1.998
ifying+1.931
 forward+1.928
 middle+1.879
Neg logit contributions
ro-1.993
onna-1.588
uzz-1.500
emo-1.493
 cheek-1.458
Token to
Token ID284
Feature activation+7.14e-02

Pos logit contributions
aw+0.215
 Other+0.192
 Leave+0.183
 forward+0.183
ifying+0.183
Neg logit contributions
ro-0.182
onna-0.141
uzz-0.139
emo-0.139
 cheek-0.134
Token carefully
Token ID7773
Feature activation-8.14e-01

Pos logit contributions
ro+1.276
onna+1.075
emo+1.059
uzz+1.047
 cheek+1.038
Neg logit contributions
aw-0.955
 Other-0.819
 forward-0.810
 Leave-0.766
 middle-0.763
Token,
Token ID11
Feature activation-4.61e-02

Pos logit contributions
ro+1.572
emo+1.322
onna+1.296
uzz+1.283
 cheek+1.267
Neg logit contributions
aw-1.259
 Other-1.089
 forward-1.082
 Leave-1.040
ifying-1.033
Token not
Token ID407
Feature activation-8.23e-01

Pos logit contributions
ro+0.083
onna+0.068
uzz+0.067
emo+0.067
 cheek+0.064
Neg logit contributions
aw-0.079
 Other-0.069
 forward-0.067
 Leave-0.066
ifying-0.066
Token wanting
Token ID10291
Feature activation-3.90e-01

Pos logit contributions
ro+1.560
uzz+1.252
emo+1.241
onna+1.236
 cheek+1.182
Neg logit contributions
aw-1.329
 forward-1.169
 Other-1.165
ifying-1.132
 Leave-1.119
Token it
Token ID340
Feature activation+1.15e-01

Pos logit contributions
ro+0.679
uzz+0.534
onna+0.533
emo+0.528
 cheek+0.513
Neg logit contributions
aw-0.696
 Other-0.625
 forward-0.603
 Leave-0.589
ifying-0.584
Token to
Token ID284
Feature activation-1.32e+00

Pos logit contributions
aw+0.193
 Other+0.169
 forward+0.162
 Leave+0.160
ifying+0.160
Neg logit contributions
ro-0.213
onna-0.172
uzz-0.169
emo-0.168
 cheek-0.164
Token break
Token ID2270
Feature activation-4.67e-01

Pos logit contributions
ro+2.425
uzz+1.969
emo+1.928
onna+1.909
 cheek+1.854
Neg logit contributions
aw-2.172
 middle-1.928
 Other-1.927
 Leave-1.876
 forward-1.817
Token or
Token ID393
Feature activation-4.70e-02

Pos logit contributions
ro+0.851
onna+0.697
uzz+0.677
emo+0.673
 cheek+0.659
Neg logit contributions
aw-0.802
 Other-0.703
 forward-0.685
ifying-0.668
 middle-0.663
Token get
Token ID651
Feature activation-6.00e-01

Pos logit contributions
ro+0.079
onna+0.062
uzz+0.062
emo+0.061
 cheek+0.058
Neg logit contributions
aw-0.086
 Other-0.077
ifying-0.074
 forward-0.074
 Leave-0.074
Token lost
Token ID2626
Feature activation-4.77e-01

Pos logit contributions
ro+1.181
onna+0.949
uzz+0.929
emo+0.918
 seek+0.902
Neg logit contributions
aw-0.960
 forward-0.834
 Other-0.830
ifying-0.808
 Leave-0.799
Token.
Token ID13
Feature activation-9.57e-02

Pos logit contributions
ro+0.920
onna+0.764
emo+0.752
uzz+0.739
 cheek+0.722
Neg logit contributions
aw-0.758
 Other-0.652
 forward-0.646
ifying-0.629
 middle-0.619
Token fur
Token ID9230
Feature activation+6.44e-01

Pos logit contributions
aw+0.912
 Other+0.808
 forward+0.765
ifying+0.764
 Leave+0.763
Neg logit contributions
ro-1.090
onna-0.876
uzz-0.869
emo-0.865
 cheek-0.845
Token.
Token ID13
Feature activation+1.31e-01

Pos logit contributions
aw+1.129
 Other+1.020
 Leave+0.968
ifying+0.964
 forward+0.959
Neg logit contributions
ro-1.140
onna-0.900
uzz-0.891
emo-0.877
 cheek-0.862
Token While
Token ID2893
Feature activation+6.80e-01

Pos logit contributions
aw+0.198
 Other+0.175
 Leave+0.163
 forward+0.163
 middle+0.161
Neg logit contributions
ro-0.262
onna-0.217
uzz-0.213
emo-0.211
 cheek-0.208
Token walking
Token ID6155
Feature activation+3.27e-01

Pos logit contributions
aw+1.183
 Other+1.046
ifying+1.002
 Leave+0.993
 forward+0.983
Neg logit contributions
ro-1.224
onna-0.976
emo-0.963
uzz-0.961
 seek-0.923
Token,
Token ID11
Feature activation+1.72e-01

Pos logit contributions
aw+0.491
 Other+0.428
 Leave+0.402
 forward+0.401
ifying+0.401
Neg logit contributions
ro-0.666
onna-0.550
uzz-0.541
emo-0.539
 cheek-0.527
Token he
Token ID339
Feature activation-1.39e-01

Pos logit contributions
aw+0.318
 Other+0.285
 forward+0.271
 Leave+0.270
ifying+0.269
Neg logit contributions
ro-0.288
onna-0.229
uzz-0.224
emo-0.221
 cheek-0.217
Token saw
Token ID2497
Feature activation+1.97e-01

Pos logit contributions
ro+0.295
onna+0.250
uzz+0.247
 cheek+0.242
emo+0.242
Neg logit contributions
aw-0.198
 Other-0.165
 middle-0.156
 Leave-0.156
 forward-0.155
Token a
Token ID257
Feature activation-3.77e-01

Pos logit contributions
aw+0.331
 Other+0.291
 Leave+0.274
 forward+0.272
ifying+0.272
Neg logit contributions
ro-0.369
onna-0.299
uzz-0.292
emo-0.290
 cheek-0.285
Token big
Token ID1263
Feature activation-4.78e-03

Pos logit contributions
ro+0.708
uzz+0.584
onna+0.576
emo+0.564
 cheek+0.536
Neg logit contributions
aw-0.619
 Other-0.536
ifying-0.514
 middle-0.514
 forward-0.510
Token red
Token ID2266
Feature activation+5.40e-01

Pos logit contributions
ro+0.009
onna+0.007
uzz+0.007
emo+0.007
 cheek+0.007
Neg logit contributions
aw-0.008
 Other-0.007
 Leave-0.007
 forward-0.006
 middle-0.006
Token ball
Token ID2613
Feature activation+6.77e-01

Pos logit contributions
aw+0.962
 Other+0.861
 Leave+0.819
 forward+0.816
ifying+0.815
Neg logit contributions
ro-0.944
onna-0.745
uzz-0.731
emo-0.727
 cheek-0.713
Token amazed
Token ID24562
Feature activation+4.82e-01

Pos logit contributions
aw+1.372
 Other+1.207
 Leave+1.138
ifying+1.135
 forward+1.108
Neg logit contributions
ro-1.739
onna-1.440
uzz-1.410
emo-1.391
 cheek-1.377
Token.
Token ID13
Feature activation-1.42e-01

Pos logit contributions
aw+0.860
 Other+0.771
 Leave+0.734
ifying+0.731
 forward+0.727
Neg logit contributions
ro-0.840
onna-0.666
uzz-0.655
emo-0.651
 cheek-0.634
Token She
Token ID1375
Feature activation+2.23e-01

Pos logit contributions
ro+0.315
onna+0.266
emo+0.262
uzz+0.261
 cheek+0.259
Neg logit contributions
aw-0.185
 Other-0.160
 Leave-0.147
ifying-0.145
 forward-0.144
Token told
Token ID1297
Feature activation-6.65e-01

Pos logit contributions
aw+0.403
 Other+0.358
 Leave+0.340
ifying+0.340
 forward+0.339
Neg logit contributions
ro-0.384
onna-0.305
uzz-0.299
emo-0.299
 cheek-0.290
Token Mia
Token ID32189
Feature activation-6.63e-02

Pos logit contributions
ro+1.258
onna+1.034
emo+1.020
 seek+0.999
uzz+0.995
Neg logit contributions
aw-1.114
 Other-0.977
 forward-0.914
 Leave-0.911
 middle-0.892
Token that
Token ID326
Feature activation-1.21e+00

Pos logit contributions
ro+0.128
onna+0.104
emo+0.103
uzz+0.102
 cheek+0.100
Neg logit contributions
aw-0.106
 Other-0.093
 Leave-0.088
 forward-0.088
ifying-0.087
Token the
Token ID262
Feature activation-6.00e-01

Pos logit contributions
ro+2.415
emo+1.967
onna+1.965
uzz+1.957
 cheek+1.934
Neg logit contributions
aw-1.921
 Other-1.633
 Leave-1.546
 particularly-1.534
 middle-1.533
Token tooth
Token ID16162
Feature activation-3.09e-01

Pos logit contributions
ro+1.051
onna+0.852
uzz+0.847
emo+0.843
 cheek+0.778
Neg logit contributions
aw-1.050
 Other-0.930
 middle-0.901
ifying-0.893
 forward-0.889
Token must
Token ID1276
Feature activation-1.06e+00

Pos logit contributions
ro+0.578
uzz+0.470
onna+0.470
emo+0.465
 cheek+0.445
Neg logit contributions
aw-0.506
 Other-0.431
 middle-0.415
ifying-0.415
 Leave-0.415
Token belong
Token ID5594
Feature activation-5.89e-01

Pos logit contributions
ro+1.930
uzz+1.581
onna+1.550
emo+1.548
 cheek+1.519
Neg logit contributions
aw-1.687
 Other-1.518
 middle-1.513
 Leave-1.458
opl-1.432
Token to
Token ID284
Feature activation+1.18e-01

Pos logit contributions
ro+1.025
onna+0.814
emo+0.794
uzz+0.792
 cheek+0.783
Neg logit contributions
aw-1.054
 Other-0.912
 forward-0.910
ifying-0.883
 Leave-0.881
Token she
Token ID673
Feature activation+2.84e-01

Pos logit contributions
ro+3.001
onna+2.456
uzz+2.422
emo+2.406
 cheek+2.346
Neg logit contributions
aw-2.179
 forward-1.899
 Other-1.893
 particularly-1.833
 middle-1.831
Token cared
Token ID19951
Feature activation-2.04e-01

Pos logit contributions
aw+0.455
 Other+0.399
 Leave+0.377
 forward+0.376
ifying+0.375
Neg logit contributions
ro-0.549
onna-0.448
uzz-0.440
emo-0.437
 cheek-0.427
Token.
Token ID13
Feature activation-1.06e-01

Pos logit contributions
ro+0.441
onna+0.366
emo+0.364
uzz+0.362
 cheek+0.349
Neg logit contributions
aw-0.280
 Other-0.242
 forward-0.233
 Leave-0.223
ifying-0.223
Token<|endoftext|>
Token ID50256
Feature activation-1.51e-01

Pos logit contributions
ro+0.237
onna+0.199
emo+0.197
uzz+0.196
 cheek+0.194
Neg logit contributions
aw-0.137
 Other-0.119
 Leave-0.109
 forward-0.107
ifying-0.106
TokenM
Token ID44
Feature activation+7.43e-01

Pos logit contributions
ro+0.300
onna+0.248
uzz+0.243
emo+0.242
 cheek+0.240
Neg logit contributions
aw-0.233
 Other-0.204
 forward-0.191
 Leave-0.191
 middle-0.190
Tokenummy
Token ID13513
Feature activation-6.90e-01

Pos logit contributions
aw+1.187
 Other+1.047
 Leave+0.995
ifying+0.992
 forward+0.988
Neg logit contributions
ro-1.429
onna-1.181
uzz-1.157
emo-1.149
 cheek-1.117
Token and
Token ID290
Feature activation+6.51e-01

Pos logit contributions
ro+1.358
uzz+1.110
onna+1.109
emo+1.091
 cheek+1.090
Neg logit contributions
aw-1.094
 Other-0.915
 Leave-0.889
 forward-0.855
ifying-0.846
Token Daddy
Token ID26926
Feature activation+2.28e-01

Pos logit contributions
aw+1.269
 Other+1.143
 Leave+1.090
ifying+1.088
 forward+1.082
Neg logit contributions
ro-1.026
onna-0.792
uzz-0.784
emo-0.782
 cheek-0.752
Token were
Token ID547
Feature activation+1.70e+00

Pos logit contributions
aw+0.376
 Other+0.331
 Leave+0.314
 forward+0.312
ifying+0.312
Neg logit contributions
ro-0.428
onna-0.347
uzz-0.341
emo-0.340
 cheek-0.332
Token very
Token ID845
Feature activation+2.06e-01

Pos logit contributions
aw+2.702
 Other+2.365
ifying+2.255
 Leave+2.226
opl+2.220
Neg logit contributions
ro-3.157
uzz-2.701
onna-2.622
 cheek-2.594
emo-2.586
Token excited
Token ID6568
Feature activation-3.18e-01

Pos logit contributions
aw+0.377
 Other+0.338
 forward+0.323
ifying+0.321
 Leave+0.321
Neg logit contributions
ro-0.351
onna-0.275
uzz-0.270
emo-0.269
 cheek-0.260
Token.
Token ID13
Feature activation+1.07e+00

Pos logit contributions
ro+0.056
onna+0.045
emo+0.045
uzz+0.045
 cheek+0.043
Neg logit contributions
aw-0.049
 Other-0.043
 forward-0.042
ifying-0.041
 middle-0.041
Token Suddenly
Token ID24975
Feature activation+2.21e-01

Pos logit contributions
aw+1.612
 Other+1.335
ifying+1.316
 Leave+1.294
 forward+1.292
Neg logit contributions
ro-2.194
onna-1.814
uzz-1.806
emo-1.778
 cheek-1.723
Token,
Token ID11
Feature activation+9.86e-01

Pos logit contributions
aw+0.340
 Other+0.297
 forward+0.279
 Leave+0.278
ifying+0.278
Neg logit contributions
ro-0.440
onna-0.361
uzz-0.356
emo-0.354
 cheek-0.346
Token it
Token ID340
Feature activation+3.75e-01

Pos logit contributions
aw+1.809
 Other+1.617
 Leave+1.568
ifying+1.531
opl+1.519
Neg logit contributions
ro-1.655
onna-1.302
uzz-1.290
emo-1.283
 cheek-1.253
Token began
Token ID2540
Feature activation-1.04e+00

Pos logit contributions
aw+0.544
 Other+0.471
 Leave+0.441
ifying+0.440
 forward+0.439
Neg logit contributions
ro-0.783
onna-0.649
uzz-0.639
emo-0.637
 cheek-0.623
Token to
Token ID284
Feature activation-1.50e+00

Pos logit contributions
ro+1.796
emo+1.483
 cheek+1.433
uzz+1.422
onna+1.395
Neg logit contributions
aw-1.801
 Other-1.632
 forward-1.568
ifying-1.554
 particularly-1.541
Token move
Token ID1445
Feature activation-4.48e-01

Pos logit contributions
ro+2.803
uzz+2.371
emo+2.352
onna+2.246
 cheek+2.135
Neg logit contributions
aw-2.379
 middle-2.237
 Other-2.211
 particularly-2.080
ifying-2.046
Token!
Token ID0
Feature activation+5.33e-01

Pos logit contributions
ro+0.864
onna+0.727
emo+0.720
uzz+0.703
 cheek+0.697
Neg logit contributions
aw-0.721
 Other-0.621
 forward-0.611
ifying-0.594
 middle-0.582
Token He
Token ID679
Feature activation-1.52e-01

Pos logit contributions
aw+0.879
 Other+0.769
ifying+0.731
 Leave+0.730
 forward+0.729
Neg logit contributions
ro-1.002
onna-0.814
uzz-0.801
emo-0.795
 cheek-0.775
Token was
Token ID373
Feature activation+1.78e-01

Pos logit contributions
ro+0.284
onna+0.232
uzz+0.228
emo+0.228
 cheek+0.222
Neg logit contributions
aw-0.253
 Other-0.219
ifying-0.208
 forward-0.208
 Leave-0.208
Token so
Token ID523
Feature activation-7.27e-01

Pos logit contributions
aw+0.248
 Other+0.213
 forward+0.200
 Leave+0.199
ifying+0.199
Neg logit contributions
ro-0.382
onna-0.318
emo-0.314
uzz-0.313
 cheek-0.305
Token what
Token ID644
Feature activation+9.16e-01

Pos logit contributions
aw+1.290
 Other+1.164
ifying+1.121
 Leave+1.119
 forward+1.107
Neg logit contributions
ro-1.136
onna-0.899
uzz-0.881
emo-0.872
 cheek-0.856
Token's
Token ID338
Feature activation+4.30e-01

Pos logit contributions
aw+1.671
 Other+1.517
ifying+1.473
 forward+1.458
 Leave+1.443
Neg logit contributions
ro-1.497
onna-1.196
emo-1.161
uzz-1.149
 cheek-1.122
Token up
Token ID510
Feature activation-1.19e+00

Pos logit contributions
aw+0.737
 Other+0.655
 Leave+0.621
ifying+0.620
 forward+0.617
Neg logit contributions
ro-0.779
onna-0.627
uzz-0.615
emo-0.612
 cheek-0.597
Token there
Token ID612
Feature activation-3.36e-01

Pos logit contributions
ro+2.058
uzz+1.596
emo+1.590
 cheek+1.561
onna+1.559
Neg logit contributions
aw-2.197
 forward-1.932
 Other-1.898
ifying-1.848
 middle-1.828
Token,"
Token ID553
Feature activation-1.08e+00

Pos logit contributions
ro+0.512
uzz+0.388
emo+0.384
onna+0.384
 cheek+0.361
Neg logit contributions
aw-0.682
 Other-0.614
 forward-0.588
 middle-0.586
ifying-0.581
Token Sara
Token ID24799
Feature activation-1.62e+00

Pos logit contributions
ro+1.622
onna+1.267
uzz+1.192
emo+1.171
 cheek+1.153
Neg logit contributions
aw-2.319
 Other-2.020
 middle-1.997
 Leave-1.961
 forward-1.897
Token said
Token ID531
Feature activation-6.22e-01

Pos logit contributions
ro+2.564
emo+2.074
onna+2.050
uzz+1.980
 cheek+1.977
Neg logit contributions
aw-3.342
 Other-2.821
 Leave-2.643
 forward-2.640
 particularly-2.625
Token.
Token ID13
Feature activation-1.71e-01

Pos logit contributions
ro+1.242
onna+1.019
emo+1.015
uzz+0.981
 seek+0.957
Neg logit contributions
aw-0.959
 Other-0.845
 forward-0.838
 middle-0.787
 Leave-0.787
Token\n
Token ID198
Feature activation-7.95e-01

Pos logit contributions
ro+0.350
onna+0.292
uzz+0.288
emo+0.286
 cheek+0.283
Neg logit contributions
aw-0.252
 Other-0.221
 Leave-0.208
 forward-0.204
 middle-0.201
Token\n
Token ID198
Feature activation-7.61e-01

Pos logit contributions
ro+1.665
onna+1.377
 cheek+1.370
uzz+1.358
 seek+1.337
Neg logit contributions
aw-1.196
 Other-1.017
 Leave-0.953
 middle-0.941
 forward-0.914
Token"
Token ID1
Feature activation-3.55e-01

Pos logit contributions
ro+1.592
onna+1.380
uzz+1.347
 cheek+1.325
emo+1.321
Neg logit contributions
aw-1.154
 Other-0.924
 middle-0.904
 Leave-0.895
 forward-0.875
Token big
Token ID1263
Feature activation-3.28e-01

Pos logit contributions
ro+0.606
uzz+0.483
onna+0.476
emo+0.472
 cheek+0.447
Neg logit contributions
aw-0.610
 Other-0.539
ifying-0.534
 forward-0.523
 middle-0.522
Token oct
Token ID19318
Feature activation-2.50e-03

Pos logit contributions
ro+0.586
uzz+0.471
onna+0.467
emo+0.464
 cheek+0.435
Neg logit contributions
aw-0.568
 Other-0.501
 forward-0.483
ifying-0.480
 middle-0.478
Tokenopus
Token ID25790
Feature activation+1.24e-01

Pos logit contributions
ro+0.005
onna+0.004
uzz+0.004
emo+0.004
 cheek+0.004
Neg logit contributions
aw-0.004
 Other-0.003
ifying-0.003
 Leave-0.003
 forward-0.003
Token!
Token ID0
Feature activation-3.65e-01

Pos logit contributions
aw+0.183
 Other+0.157
ifying+0.149
 Leave+0.149
 forward+0.148
Neg logit contributions
ro-0.252
onna-0.208
uzz-0.206
emo-0.206
 cheek-0.200
Token The
Token ID383
Feature activation-1.64e+00

Pos logit contributions
ro+0.788
onna+0.658
uzz+0.651
emo+0.648
 cheek+0.644
Neg logit contributions
aw-0.502
 Other-0.441
 Leave-0.406
ifying-0.402
 forward-0.393
Token little
Token ID1310
Feature activation-2.26e+00

Pos logit contributions
ro+2.214
onna+1.824
uzz+1.679
emo+1.506
nes+1.504
Neg logit contributions
aw-3.639
 middle-3.405
 Other-3.275
ifying-3.215
 forward-3.196
Token fish
Token ID5916
Feature activation-2.27e-01

Pos logit contributions
ro+3.773
onna+3.465
nes+3.166
uzz+3.165
emo+2.973
Neg logit contributions
aw-4.362
 middle-3.823
 Other-3.686
 forward-3.563
ifying-3.502
Token was
Token ID373
Feature activation-1.66e-01

Pos logit contributions
ro+0.411
onna+0.333
uzz+0.322
emo+0.321
 cheek+0.316
Neg logit contributions
aw-0.392
 Other-0.340
ifying-0.324
 Leave-0.323
 forward-0.323
Token very
Token ID845
Feature activation-1.44e+00

Pos logit contributions
ro+0.351
onna+0.288
emo+0.285
uzz+0.283
 cheek+0.279
Neg logit contributions
aw-0.237
 Other-0.203
 forward-0.193
 Leave-0.191
ifying-0.191
Token scared
Token ID12008
Feature activation+4.54e-01

Pos logit contributions
ro+3.111
onna+2.451
uzz+2.446
water+2.436
 cheek+2.418
Neg logit contributions
aw-2.089
 Other-1.750
 forward-1.743
 particularly-1.688
 middle-1.661
Token but
Token ID475
Feature activation-8.34e-01

Pos logit contributions
aw+0.763
 Other+0.676
 Leave+0.640
ifying+0.640
 forward+0.636
Neg logit contributions
ro-0.842
onna-0.680
uzz-0.669
emo-0.664
 cheek-0.648
Token hug
Token ID16225
Feature activation+5.32e-01

Pos logit contributions
ro+1.651
emo+1.324
onna+1.317
uzz+1.305
water+1.229
Neg logit contributions
aw-1.590
 Other-1.440
 forward-1.363
 middle-1.355
 Leave-1.345
Token,
Token ID11
Feature activation-1.58e-01

Pos logit contributions
aw+0.913
 Other+0.816
 Leave+0.773
ifying+0.769
 forward+0.763
Neg logit contributions
ro-0.961
onna-0.766
uzz-0.756
emo-0.746
 cheek-0.733
Token picked
Token ID6497
Feature activation-1.74e-01

Pos logit contributions
ro+0.285
onna+0.228
uzz+0.227
emo+0.226
 cheek+0.216
Neg logit contributions
aw-0.275
 Other-0.241
 forward-0.230
 Leave-0.229
ifying-0.228
Token up
Token ID510
Feature activation+4.14e-01

Pos logit contributions
ro+0.311
onna+0.253
emo+0.242
uzz+0.242
 cheek+0.237
Neg logit contributions
aw-0.313
 Other-0.275
 forward-0.264
 Leave-0.257
ifying-0.255
Token all
Token ID477
Feature activation-4.82e-01

Pos logit contributions
aw+0.739
 Other+0.660
 Leave+0.628
ifying+0.626
 forward+0.624
Neg logit contributions
ro-0.721
onna-0.573
uzz-0.564
emo-0.560
 cheek-0.544
Token the
Token ID262
Feature activation-1.77e+00

Pos logit contributions
ro+0.953
onna+0.782
emo+0.758
uzz+0.750
 cheek+0.749
Neg logit contributions
aw-0.782
 Other-0.678
 forward-0.639
 middle-0.633
 Leave-0.626
Token mar
Token ID1667
Feature activation-1.09e+00

Pos logit contributions
ro+2.805
onna+2.413
uzz+2.353
emo+2.179
 cheek+2.106
Neg logit contributions
aw-3.611
 Other-3.192
 middle-3.108
 particularly-3.040
 Leave-3.032
Tokenbles
Token ID7689
Feature activation-9.59e-01

Pos logit contributions
ro+1.570
onna+1.244
uzz+1.167
 cheek+1.148
emo+1.146
Neg logit contributions
aw-2.345
 forward-2.007
ifying-1.973
 Other-1.963
pieces-1.960
Token and
Token ID290
Feature activation-9.71e-02

Pos logit contributions
ro+1.915
emo+1.580
onna+1.575
uzz+1.564
 cheek+1.496
Neg logit contributions
aw-1.520
 forward-1.250
 Other-1.230
 middle-1.193
pieces-1.182
Token put
Token ID1234
Feature activation+2.63e-02

Pos logit contributions
ro+0.176
onna+0.143
uzz+0.140
emo+0.139
 cheek+0.136
Neg logit contributions
aw-0.167
 Other-0.146
 Leave-0.139
 forward-0.138
ifying-0.138
Token them
Token ID606
Feature activation-1.04e+00

Pos logit contributions
aw+0.038
 Other+0.033
 forward+0.031
ifying+0.031
 Leave+0.031
Neg logit contributions
ro-0.055
onna-0.046
emo-0.045
uzz-0.045
 cheek-0.044
Token a
Token ID257
Feature activation-5.92e-01

Pos logit contributions
ro+1.211
onna+0.987
emo+0.983
uzz+0.963
 cheek+0.956
Neg logit contributions
aw-0.765
 Other-0.644
 Leave-0.589
 forward-0.585
 middle-0.579
Token lesson
Token ID11483
Feature activation-5.36e-02

Pos logit contributions
ro+1.202
uzz+0.980
onna+0.969
emo+0.960
 seek+0.924
Neg logit contributions
aw-0.870
 Other-0.794
 forward-0.746
 middle-0.720
ifying-0.717
Token.
Token ID13
Feature activation-2.35e-01

Pos logit contributions
ro+0.103
onna+0.085
emo+0.084
uzz+0.082
 cheek+0.081
Neg logit contributions
aw-0.087
 Other-0.075
 forward-0.072
 Leave-0.071
 middle-0.070
Token She
Token ID1375
Feature activation+4.26e-02

Pos logit contributions
ro+0.566
onna+0.484
emo+0.478
uzz+0.474
 cheek+0.473
Neg logit contributions
aw-0.266
 Other-0.228
 Leave-0.207
 forward-0.198
 middle-0.196
Token learned
Token ID4499
Feature activation-4.13e-01

Pos logit contributions
aw+0.072
 Other+0.062
 Leave+0.060
ifying+0.059
 forward+0.059
Neg logit contributions
ro-0.079
onna-0.064
uzz-0.063
emo-0.063
 cheek-0.061
Token that
Token ID326
Feature activation-1.80e+00

Pos logit contributions
ro+0.867
onna+0.707
emo+0.701
uzz+0.692
 seek+0.684
Neg logit contributions
aw-0.615
 Other-0.522
 forward-0.487
 Leave-0.486
ifying-0.482
Token sharing
Token ID7373
Feature activation-4.82e-01

Pos logit contributions
ro+3.685
uzz+2.986
emo+2.986
onna+2.940
 cheek+2.845
Neg logit contributions
aw-2.751
 Other-2.494
 forward-2.382
 particularly-2.362
 Leave-2.354
Token is
Token ID318
Feature activation+2.16e-01

Pos logit contributions
ro+0.984
onna+0.821
uzz+0.818
emo+0.806
 cheek+0.778
Neg logit contributions
aw-0.719
 Other-0.614
 Leave-0.584
 forward-0.583
 middle-0.573
Token better
Token ID1365
Feature activation+9.74e-02

Pos logit contributions
aw+0.371
 Other+0.328
 forward+0.312
 Leave+0.311
ifying+0.309
Neg logit contributions
ro-0.396
onna-0.318
uzz-0.312
emo-0.309
 cheek-0.302
Token than
Token ID621
Feature activation-1.57e+00

Pos logit contributions
aw+0.191
 Other+0.172
 forward+0.165
 Leave+0.164
ifying+0.163
Neg logit contributions
ro-0.154
onna-0.120
uzz-0.117
emo-0.116
 cheek-0.112
Token hiding
Token ID11816
Feature activation-2.47e-01

Pos logit contributions
ro+3.167
uzz+2.600
emo+2.584
onna+2.559
 cheek+2.544
Neg logit contributions
aw-2.302
 Other-1.998
ifying-1.926
 particularly-1.895
 forward-1.891