TOP ACTIVATIONS
MAX = 7.789

go to MLS because Ben f ica still owns my contract
SS ID : off / any Mode : Man aged Access
. parse _ args ()) # initialize a rectangular and square
J aaaa asper Profile Blog Joined April 2012 United States 5
Americans , Mer k ley , the fourth and
heart l xp Profile Joined September 2010 United States 10
)) = 0 . 1 // metres _ P ulse Per
empire [ under Hugh He f ner , whose confid ant
ies are finally tallied ! Man , I got a lot
out in the Kh y ber area . Peace
. parse _ args ()) # initialize a rectangular and square
President Obama acknowledged no foreign nation , no hacker
. K hy ber raid Military sources
( 21 , 21 )) # loop over the input image
pur ak ushi Profile Joined August 2012 United States 3
attempts have been for the any % and I 'm still
]." Without going into specifics , she said the
far from over ." <|endoftext|> Rare Photo of Helen Keller as
/ pop / pop / ret , which we 'll call
. In bed . Oh god ." She chuck les ,

MIN ACTIVATIONS
MIN = -8.443

OF EXT REM ELY EXP ENS IVE BE ER WITHOUT THE
. Invalid email address . Please re - enter . You
. Invalid email address . Please re - enter . You
was the seed of res sent iment , and for Nietzsche
jpg Email Dis ruption is perhaps one
. Invalid email address . Please re - enter . You
. Invalid email address . Please re - enter . You
. jpg Email Dis ruption is perhaps
AC OU ST IC RES ON ANCE Before getting
xx brand ? f ref = ts <|endoftext|> A month before
signed the algorithm and North rop Gr um man , the
{ // Time component const float TA U = 6 .
ID : 5 23 22 62 13 ( Photo 11 :
s ult . When Re ­ pub ­ lic ­ ans
ing Jimmy Carter . Al ­ though he left the White
( Photo 11 : SA UL LO EB AFP / Getty
di ­ um and al ­ l owed only a brief
In recent years , however , conservative politicians and pundits
­ ized re ­ la ­ tions with Cuba . Mitt
M EM B ERS H IP IS FREE | NO

INTERVAL 3.731 to 7.789
CONTAINS 0.577%

were sent to the An z io beach head where they
out of cash . ( Techn ically , no Alberta budget
easily be pulled out After that , you will need to
in the sand the whole time . Maybe the true voice
speech - and - priv acy https :// www

INTERVAL -0.327 to 3.731
CONTAINS 59.576%

can t remember you at all .
have shown no desire to do them . But
economy . Mac ri blames un fore see able
down with Gohan ? <|endoftext|> R ents still have the potential
org . cl ojure / tools . logging | 15 79

INTERVAL -4.385 to -0.327
CONTAINS 39.645%

s no limit . When there s people motivated
)} Q c 7 { ( 17 s )} 27 .
coverage to Trump and the concoct ed Benghazi and email scandals
So , Dallas has to decide almost immediately into the offseason
, on Bay view at the Brick Works Ren

INTERVAL -8.443 to -4.385
CONTAINS 0.202%

incorporated all available data on body mass index ( B MI
oxy resin is " res il ient to mechanical stresses ."
All of this seemed very ironic , of
The Aut on av system required occasional manual corrections
As we dated we shared our vision for
Token go
Token ID467
Feature activation-1.82e-01

Pos logit contributions
 HUN+0.363
 MARK+0.359
 Mond+0.354
 sen+0.351
 COP+0.351
Neg logit contributions
bians-0.337
align-0.336
isel-0.330
handler-0.326
male-0.325
Token to
Token ID284
Feature activation+1.77e-02

Pos logit contributions
 HUN+0.113
 MARK+0.112
 Mond+0.110
 COP+0.109
 sen+0.109
Neg logit contributions
bians-0.133
align-0.132
isel-0.131
male-0.130
imeters-0.129
Token MLS
Token ID13981
Feature activation-5.25e-01

Pos logit contributions
bians+0.012
align+0.012
isel+0.012
handler+0.012
male+0.012
Neg logit contributions
 HUN-0.012
 MARK-0.011
 Mond-0.011
 sen-0.011
 COP-0.011
Token because
Token ID780
Feature activation+2.23e-01

Pos logit contributions
 HUN+0.332
 MARK+0.328
 Mond+0.321
 sen+0.320
 COP+0.320
Neg logit contributions
bians-0.382
align-0.379
isel-0.375
male-0.372
handler-0.371
Token Ben
Token ID3932
Feature activation+6.26e-01

Pos logit contributions
bians+0.150
align+0.149
isel+0.147
male+0.147
handler+0.146
Neg logit contributions
 HUN-0.152
 MARK-0.151
 Mond-0.148
 sen-0.148
 COP-0.147
Tokenf
Token ID69
Feature activation+7.79e+00

Pos logit contributions
bians+0.431
align+0.428
isel+0.425
male+0.424
handler+0.421
Neg logit contributions
 HUN-0.416
 MARK-0.414
 sen-0.406
 COP-0.406
 Coca-0.405
Tokenica
Token ID3970
Feature activation+1.90e+00

Pos logit contributions
bians+3.290
raid+3.215
isel+3.145
olit+3.137
imeters+3.066
Neg logit contributions
 HUN-7.195
 MARK-7.153
 surv-6.979
 Coca-6.924
 COP-6.913
Token still
Token ID991
Feature activation+7.11e-01

Pos logit contributions
bians+1.302
align+1.296
isel+1.281
male+1.270
handler+1.258
Neg logit contributions
 HUN-1.275
 MARK-1.259
 Mond-1.232
 COP-1.231
 sen-1.227
Token owns
Token ID12216
Feature activation-1.31e+00

Pos logit contributions
bians+0.501
align+0.499
isel+0.492
male+0.491
handler+0.486
Neg logit contributions
 HUN-0.466
 MARK-0.459
 Mond-0.451
 COP-0.450
 sen-0.448
Token my
Token ID616
Feature activation-1.03e+00

Pos logit contributions
 HUN+0.853
 MARK+0.840
 sen+0.825
 COP+0.824
 Mond+0.824
Neg logit contributions
bians-0.925
align-0.920
isel-0.907
male-0.903
imeters-0.898
Token contract
Token ID2775
Feature activation-1.76e+00

Pos logit contributions
 HUN+0.655
 MARK+0.644
 Mond+0.631
 sen+0.627
 COP+0.626
Neg logit contributions
bians-0.740
align-0.735
isel-0.726
male-0.724
handler-0.722
TokenSS
Token ID5432
Feature activation+1.94e+00

Pos logit contributions
align+3.079
handler+3.024
male+3.024
isel+3.014
slot+3.005
Neg logit contributions
 HUN-4.573
 Mond-4.555
 surv-4.515
 Coca-4.486
 Kaplan-4.441
TokenID
Token ID2389
Feature activation+2.88e+00

Pos logit contributions
align+1.152
bians+1.136
handler+1.135
male+1.134
isel+1.120
Neg logit contributions
 HUN-1.476
 surv-1.457
 sen-1.442
 MARK-1.437
 COP-1.436
Token:
Token ID25
Feature activation+3.54e+00

Pos logit contributions
bians+1.413
align+1.395
isel+1.369
male+1.357
imeters+1.349
Neg logit contributions
 HUN-2.520
 MARK-2.489
 COP-2.459
 sen-2.459
 Mond-2.457
Tokenoff
Token ID2364
Feature activation+1.48e+00

Pos logit contributions
align+1.748
bians+1.729
male+1.712
isel+1.696
handler+1.696
Neg logit contributions
 HUN-3.074
 MARK-3.028
 Mond-2.995
 Coca-2.993
 COP-2.988
Token/
Token ID14
Feature activation+2.40e+00

Pos logit contributions
bians+1.116
align+1.105
isel+1.096
male+1.087
imeters+1.083
Neg logit contributions
 HUN-0.905
 MARK-0.891
 COP-0.877
 sen-0.875
 Mond-0.871
Tokenany
Token ID1092
Feature activation+7.19e+00

Pos logit contributions
bians+1.400
align+1.391
isel+1.366
male+1.358
imeters+1.350
Neg logit contributions
 HUN-1.871
 MARK-1.844
 COP-1.821
 sen-1.820
 Mond-1.818
Token Mode
Token ID10363
Feature activation+3.25e+00

Pos logit contributions
align+3.502
volt+3.456
handler+3.418
male+3.417
bians+3.413
Neg logit contributions
 HUN-6.005
 Coca-5.878
 Mond-5.859
 sen-5.850
 MARK-5.832
Token:
Token ID25
Feature activation+1.68e+00

Pos logit contributions
align+1.339
male+1.300
handler+1.298
bians+1.292
volt+1.290
Neg logit contributions
 HUN-3.032
 surv-2.994
 MARK-2.987
 sen-2.973
 COP-2.971
TokenMan
Token ID5124
Feature activation+2.28e+00

Pos logit contributions
bians+0.704
align+0.697
isel+0.681
male+0.672
imeters+0.669
Neg logit contributions
 HUN-1.583
 MARK-1.564
 COP-1.548
 sen-1.547
 Mond-1.546
Tokenaged
Token ID1886
Feature activation+1.37e+00

Pos logit contributions
male+1.254
align+1.252
handler+1.240
bians+1.238
isel+1.235
Neg logit contributions
 HUN-1.851
 MARK-1.835
 COP-1.817
 Coca-1.804
 Mond-1.796
Token Access
Token ID8798
Feature activation+2.35e-01

Pos logit contributions
align+0.900
bians+0.895
male+0.879
handler+0.873
isel+0.869
Neg logit contributions
 HUN-0.954
 MARK-0.938
 sen-0.930
 surv-0.927
 Mond-0.923
Token.
Token ID13
Feature activation-3.76e-01

Pos logit contributions
align+1.897
bians+1.889
isel+1.862
imeters+1.858
handler+1.856
Neg logit contributions
 HUN-1.535
 MARK-1.505
 Mond-1.480
 Coca-1.472
 sen-1.468
Tokenparse
Token ID29572
Feature activation+3.52e-01

Pos logit contributions
 HUN+0.310
 MARK+0.307
 COP+0.303
 sen+0.302
 Mond+0.301
Neg logit contributions
bians-0.201
align-0.198
isel-0.196
male-0.194
imeters-0.193
Token_
Token ID62
Feature activation+9.10e-01

Pos logit contributions
bians+0.235
align+0.232
isel+0.230
male+0.229
imeters+0.226
Neg logit contributions
 HUN-0.245
 MARK-0.241
 COP-0.238
 sen-0.238
 Mond-0.236
Tokenargs
Token ID22046
Feature activation-4.92e-02

Pos logit contributions
align+0.481
bians+0.480
isel+0.469
male+0.467
handler+0.465
Neg logit contributions
 HUN-0.762
 MARK-0.750
 Mond-0.741
 COP-0.741
 sen-0.740
Token())
Token ID28955
Feature activation+2.45e+00

Pos logit contributions
 HUN+0.031
 MARK+0.030
 COP+0.030
 sen+0.030
 Mond+0.030
Neg logit contributions
bians-0.036
align-0.036
isel-0.036
male-0.035
imeters-0.035
Token #
Token ID1303
Feature activation+7.03e+00

Pos logit contributions
align+1.727
bians+1.723
isel+1.693
handler+1.687
imeters+1.683
Neg logit contributions
 HUN-1.590
 MARK-1.549
 Mond-1.536
 Coca-1.524
 COP-1.518
Token initialize
Token ID41216
Feature activation+2.63e+00

Pos logit contributions
align+4.675
handler+4.505
isel+4.493
bians+4.471
imeters+4.431
Neg logit contributions
 HUN-4.648
 Mond-4.585
 Premiership-4.541
 Coca-4.517
 sen-4.491
Token a
Token ID257
Feature activation+4.92e-01

Pos logit contributions
align+1.776
bians+1.765
isel+1.746
handler+1.732
imeters+1.731
Neg logit contributions
 HUN-1.770
 MARK-1.715
 Mond-1.712
 Coca-1.700
 sen-1.697
Token rectangular
Token ID36954
Feature activation+7.58e-01

Pos logit contributions
bians+0.319
align+0.319
isel+0.314
male+0.312
handler+0.312
Neg logit contributions
 HUN-0.348
 MARK-0.341
 Mond-0.338
 sen-0.337
 COP-0.336
Token and
Token ID290
Feature activation+1.81e-01

Pos logit contributions
align+0.427
bians+0.426
isel+0.420
male+0.416
handler+0.416
Neg logit contributions
 HUN-0.598
 MARK-0.586
 sen-0.580
 Mond-0.580
 COP-0.580
Token square
Token ID6616
Feature activation-1.29e+00

Pos logit contributions
align+0.120
bians+0.119
isel+0.117
male+0.117
imeters+0.116
Neg logit contributions
 HUN-0.127
 MARK-0.124
 Mond-0.123
 sen-0.123
 COP-0.123
TokenJ
Token ID41
Feature activation+2.03e+00

Pos logit contributions
align+0.846
bians+0.844
male+0.830
isel+0.830
handler+0.821
Neg logit contributions
 HUN-0.949
 MARK-0.936
 COP-0.925
 Mond-0.920
 Coca-0.916
Tokenaaaa
Token ID24794
Feature activation-1.55e+00

Pos logit contributions
bians+1.274
align+1.267
isel+1.254
male+1.247
handler+1.231
Neg logit contributions
 HUN-1.497
 MARK-1.473
 COP-1.456
 Mond-1.452
 sen-1.450
Tokenasper
Token ID32981
Feature activation-6.58e-01

Pos logit contributions
 HUN+1.221
 MARK+1.206
 sen+1.193
 Mond+1.191
 COP+1.188
Neg logit contributions
bians-0.889
align-0.881
isel-0.863
handler-0.859
male-0.859
Token Profile
Token ID13118
Feature activation+4.53e+00

Pos logit contributions
 HUN+0.499
 MARK+0.491
 COP+0.485
 sen+0.485
 Mond+0.484
Neg logit contributions
bians-0.397
align-0.395
isel-0.388
male-0.386
imeters-0.384
Token Blog
Token ID14001
Feature activation+4.85e+00

Pos logit contributions
bians+2.863
align+2.817
isel+2.784
male+2.779
female+2.729
Neg logit contributions
 HUN-3.242
 MARK-3.190
 COP-3.154
 Mond-3.133
 sen-3.128
Token Joined
Token ID16709
Feature activation+6.99e+00

Pos logit contributions
bians+2.819
align+2.759
male+2.745
isel+2.720
handler+2.708
Neg logit contributions
 HUN-3.682
 MARK-3.670
 surv-3.631
 COP-3.561
 Coca-3.530
Token April
Token ID3035
Feature activation+1.31e+00

Pos logit contributions
bians+3.759
 Male+3.691
male+3.656
align+3.639
 lesbians+3.634
Neg logit contributions
 surv-5.238
 HUN-5.214
 MARK-5.182
 sen-5.109
 COP-5.088
Token 2012
Token ID2321
Feature activation+1.21e-01

Pos logit contributions
bians+0.843
align+0.815
male+0.815
isel+0.808
 lesbians+0.804
Neg logit contributions
 HUN-0.926
 MARK-0.919
 surv-0.914
 sen-0.902
 COP-0.894
Token United
Token ID1578
Feature activation-2.67e+00

Pos logit contributions
bians+0.073
align+0.072
isel+0.071
male+0.071
handler+0.070
Neg logit contributions
 HUN-0.091
 MARK-0.090
 sen-0.089
 COP-0.089
 Mond-0.089
Token States
Token ID1829
Feature activation-4.08e+00

Pos logit contributions
 HUN+1.665
 Mond+1.614
 MARK+1.611
 sen+1.607
 Coca+1.591
Neg logit contributions
align-1.970
isel-1.947
handler-1.947
bians-1.947
imeters-1.943
Token 5
Token ID642
Feature activation-2.27e+00

Pos logit contributions
 HUN+2.833
 MARK+2.777
 COP+2.745
 Mond+2.744
 Coca+2.734
Neg logit contributions
align-2.734
bians-2.730
isel-2.679
imeters-2.666
handler-2.664
Token Americans
Token ID3399
Feature activation-5.91e-01

Pos logit contributions
bians+0.602
align+0.592
isel+0.588
male+0.584
imeters+0.583
Neg logit contributions
 HUN-0.606
 MARK-0.600
 sen-0.586
 Mond-0.586
 COP-0.581
Token,
Token ID11
Feature activation-5.88e-01

Pos logit contributions
 HUN+0.361
 MARK+0.356
 Mond+0.349
 COP+0.348
 sen+0.347
Neg logit contributions
bians-0.444
align-0.442
isel-0.436
male-0.435
imeters-0.432
Token
Token ID447
Feature activation-4.52e-01

Pos logit contributions
 HUN+0.378
 MARK+0.373
 Mond+0.365
 sen+0.365
 COP+0.365
Neg logit contributions
bians-0.422
align-0.421
isel-0.414
male-0.413
handler-0.410
Token
Token ID251
Feature activation-8.77e-01

Pos logit contributions
 HUN+0.296
 MARK+0.289
 sen+0.285
 COP+0.285
 Mond+0.284
Neg logit contributions
bians-0.321
align-0.319
isel-0.316
male-0.313
imeters-0.312
Token Mer
Token ID4638
Feature activation-9.55e-01

Pos logit contributions
 HUN+0.541
 MARK+0.532
 COP+0.522
 sen+0.522
 Mond+0.521
Neg logit contributions
bians-0.653
align-0.651
isel-0.642
male-0.639
handler-0.636
Tokenk
Token ID74
Feature activation+6.94e+00

Pos logit contributions
 HUN+0.690
 MARK+0.681
 COP+0.672
 Mond+0.672
 sen+0.671
Neg logit contributions
bians-0.609
align-0.606
isel-0.592
male-0.591
handler-0.587
Tokenley
Token ID1636
Feature activation+2.69e-01

Pos logit contributions
isel+2.943
handler+2.937
male+2.902
raid+2.898
align+2.864
Neg logit contributions
 HUN-6.413
 MARK-6.359
 surv-6.270
 COP-6.213
 Mond-6.175
Token,
Token ID11
Feature activation-4.54e-01

Pos logit contributions
bians+0.181
align+0.180
isel+0.178
male+0.177
handler+0.176
Neg logit contributions
 HUN-0.184
 MARK-0.182
 COP-0.178
 sen-0.177
 Mond-0.177
Token the
Token ID262
Feature activation-4.46e-01

Pos logit contributions
 HUN+0.292
 MARK+0.287
 COP+0.281
 sen+0.280
 Mond+0.280
Neg logit contributions
bians-0.326
align-0.324
isel-0.320
male-0.319
handler-0.318
Token fourth
Token ID5544
Feature activation-2.82e-01

Pos logit contributions
 HUN+0.294
 MARK+0.290
 COP+0.283
 Mond+0.283
 sen+0.282
Neg logit contributions
bians-0.311
align-0.309
isel-0.305
male-0.304
handler-0.303
Token and
Token ID290
Feature activation+1.97e-01

Pos logit contributions
 HUN+0.178
 MARK+0.176
 Mond+0.171
 COP+0.171
 Coca+0.170
Neg logit contributions
bians-0.203
align-0.202
male-0.201
isel-0.200
handler-0.199
Token
Token ID198
Feature activation+8.25e-01

Pos logit contributions
align+1.051
bians+1.046
male+1.041
handler+1.028
female+1.024
Neg logit contributions
 HUN-1.142
 surv-1.133
 MARK-1.124
 sen-1.107
 COP-1.105
Tokenheart
Token ID11499
Feature activation+2.09e+00

Pos logit contributions
align+0.533
bians+0.532
isel+0.523
male+0.523
handler+0.518
Neg logit contributions
 HUN-0.592
 MARK-0.584
 COP-0.577
 Mond-0.573
 sen-0.571
Tokenl
Token ID75
Feature activation-1.29e+00

Pos logit contributions
align+1.306
bians+1.301
male+1.283
isel+1.283
handler+1.272
Neg logit contributions
 HUN-1.531
 MARK-1.510
 COP-1.490
 Mond-1.489
 sen-1.484
Tokenxp
Token ID42372
Feature activation-1.94e+00

Pos logit contributions
 HUN+0.992
 MARK+0.977
 COP+0.964
 sen+0.963
 Mond+0.963
Neg logit contributions
bians-0.770
align-0.766
isel-0.752
male-0.748
imeters-0.743
Token Profile
Token ID13118
Feature activation+3.51e+00

Pos logit contributions
 HUN+1.448
 MARK+1.427
 sen+1.413
 COP+1.411
 Mond+1.410
Neg logit contributions
bians-1.190
align-1.184
isel-1.159
male-1.155
handler-1.149
Token Joined
Token ID16709
Feature activation+6.87e+00

Pos logit contributions
bians+2.252
align+2.226
isel+2.205
male+2.197
female+2.162
Neg logit contributions
 HUN-2.490
 MARK-2.451
 COP-2.425
 Mond-2.405
 sen-2.403
Token September
Token ID2693
Feature activation+1.45e+00

Pos logit contributions
bians+3.723
align+3.654
 Male+3.652
male+3.627
 lesbians+3.612
Neg logit contributions
 surv-5.185
 HUN-5.154
 MARK-5.098
 sen-5.051
 COP-5.031
Token 2010
Token ID3050
Feature activation+1.42e-01

Pos logit contributions
bians+0.941
align+0.915
male+0.910
isel+0.900
 lesbians+0.899
Neg logit contributions
 HUN-1.022
 MARK-1.018
 surv-1.007
 sen-1.001
 COP-0.986
Token United
Token ID1578
Feature activation-2.02e+00

Pos logit contributions
bians+0.084
align+0.083
isel+0.082
male+0.082
handler+0.081
Neg logit contributions
 HUN-0.108
 MARK-0.107
 sen-0.105
 COP-0.105
 Mond-0.105
Token States
Token ID1829
Feature activation-3.11e+00

Pos logit contributions
 HUN+1.270
 Mond+1.230
 MARK+1.229
 sen+1.218
 COP+1.212
Neg logit contributions
align-1.483
bians-1.470
isel-1.468
imeters-1.465
handler-1.465
Token 10
Token ID838
Feature activation-9.52e-01

Pos logit contributions
 HUN+2.186
 MARK+2.147
 COP+2.120
 Mond+2.118
 sen+2.114
Neg logit contributions
bians-2.058
align-2.055
isel-2.017
male-2.005
imeters-2.001
Token))
Token ID4008
Feature activation+2.32e+00

Pos logit contributions
 HUN+0.809
 MARK+0.794
 COP+0.783
 sen+0.782
 Mond+0.781
Neg logit contributions
bians-0.879
align-0.873
isel-0.862
male-0.858
handler-0.852
Token =
Token ID796
Feature activation+3.43e-01

Pos logit contributions
align+1.575
bians+1.556
isel+1.537
male+1.532
volt+1.527
Neg logit contributions
 HUN-1.537
 MARK-1.502
 Mond-1.500
 sen-1.493
 Coca-1.492
Token 0
Token ID657
Feature activation+1.59e+00

Pos logit contributions
align+0.233
bians+0.231
isel+0.228
male+0.227
imeters+0.226
Neg logit contributions
 HUN-0.231
 MARK-0.228
 Mond-0.226
 sen-0.225
 Coca-0.224
Token.
Token ID13
Feature activation+3.60e-01

Pos logit contributions
align+1.151
bians+1.146
isel+1.126
volt+1.124
handler+1.123
Neg logit contributions
 HUN-0.993
 MARK-0.971
 Coca-0.968
 Mond-0.968
 sen-0.966
Token1
Token ID16
Feature activation+1.29e-01

Pos logit contributions
align+0.249
bians+0.247
male+0.243
isel+0.242
handler+0.241
Neg logit contributions
 HUN-0.241
 MARK-0.236
 sen-0.235
 Mond-0.234
 COP-0.233
Token //
Token ID3373
Feature activation+6.86e+00

Pos logit contributions
align+0.089
bians+0.088
isel+0.087
male+0.086
handler+0.086
Neg logit contributions
 HUN-0.087
 MARK-0.085
 Mond-0.084
 sen-0.084
 COP-0.084
Token metres
Token ID18985
Feature activation-1.31e+00

Pos logit contributions
align+4.828
volt+4.678
male+4.647
isel+4.632
handler+4.600
Neg logit contributions
 HUN-4.289
 Mond-4.182
 sen-4.149
 Coca-4.140
 Kaplan-4.115
Token _
Token ID4808
Feature activation+2.69e+00

Pos logit contributions
 HUN+0.894
 MARK+0.880
 Mond+0.866
 COP+0.866
 sen+0.865
Neg logit contributions
bians-0.890
align-0.889
isel-0.874
male-0.870
imeters-0.866
TokenP
Token ID47
Feature activation+7.29e-01

Pos logit contributions
align+1.993
bians+1.974
male+1.950
isel+1.945
handler+1.940
Neg logit contributions
 HUN-1.674
 MARK-1.641
 COP-1.621
 sen-1.620
 Mond-1.617
Tokenulse
Token ID9615
Feature activation+1.98e-01

Pos logit contributions
align+0.332
bians+0.331
isel+0.327
male+0.322
imeters+0.321
Neg logit contributions
 HUN-0.664
 MARK-0.655
 sen-0.648
 COP-0.648
 Mond-0.647
TokenPer
Token ID5990
Feature activation+3.92e-01

Pos logit contributions
bians+0.173
align+0.173
isel+0.170
male+0.170
imeters+0.170
Neg logit contributions
 HUN-0.097
 MARK-0.095
 sen-0.093
 COP-0.093
 Mond-0.092
Token empire
Token ID13735
Feature activation+2.26e-01

Pos logit contributions
 HUN+0.029
 MARK+0.028
 COP+0.028
 sen+0.028
 Mond+0.028
Neg logit contributions
bians-0.035
align-0.035
isel-0.034
male-0.034
handler-0.034
Token [
Token ID685
Feature activation+5.55e-01

Pos logit contributions
bians+0.172
align+0.171
isel+0.169
male+0.168
handler+0.167
Neg logit contributions
 HUN-0.136
 MARK-0.134
 COP-0.132
 sen-0.131
 Mond-0.131
Tokenunder
Token ID4625
Feature activation-1.92e-01

Pos logit contributions
bians+0.315
align+0.314
isel+0.307
male+0.307
handler+0.304
Neg logit contributions
 HUN-0.441
 MARK-0.435
 COP-0.429
 sen-0.429
 Mond-0.428
Token Hugh
Token ID25464
Feature activation-6.35e-01

Pos logit contributions
 HUN+0.118
 MARK+0.116
 COP+0.114
 sen+0.114
 Mond+0.113
Neg logit contributions
bians-0.144
align-0.143
male-0.141
isel-0.141
handler-0.140
Token He
Token ID679
Feature activation-5.62e-02

Pos logit contributions
 HUN+0.462
 MARK+0.453
 Mond+0.453
 Coca+0.446
 COP+0.444
Neg logit contributions
align-0.408
bians-0.405
isel-0.399
imeters-0.397
male-0.393
Tokenf
Token ID69
Feature activation+6.83e+00

Pos logit contributions
 HUN+0.039
 COP+0.039
 MARK+0.038
 sen+0.038
 Coca+0.037
Neg logit contributions
isel-0.038
bians-0.038
male-0.037
align-0.037
handler-0.037
Tokenner
Token ID1008
Feature activation+1.90e+00

Pos logit contributions
handler+4.671
bians+4.657
male+4.579
isel+4.546
raid+4.528
Neg logit contributions
 HUN-4.653
 surv-4.536
 MARK-4.474
 COP-4.417
 Premiership-4.416
Token,
Token ID11
Feature activation+1.43e-01

Pos logit contributions
bians+0.965
align+0.963
isel+0.950
male+0.945
handler+0.939
Neg logit contributions
 HUN-1.614
 MARK-1.593
 COP-1.573
 sen-1.567
 Mond-1.561
Token whose
Token ID3025
Feature activation-5.59e-01

Pos logit contributions
bians+0.103
align+0.102
male+0.101
isel+0.101
handler+0.101
Neg logit contributions
 HUN-0.091
 MARK-0.089
 COP-0.087
 sen-0.087
 Mond-0.086
Token confid
Token ID47830
Feature activation+1.34e+00

Pos logit contributions
 HUN+0.377
 MARK+0.371
 COP+0.365
 sen+0.364
 Mond+0.364
Neg logit contributions
bians-0.382
align-0.381
isel-0.375
male-0.373
handler-0.372
Tokenant
Token ID415
Feature activation-5.28e-01

Pos logit contributions
bians+0.599
align+0.597
isel+0.586
male+0.577
imeters+0.574
Neg logit contributions
 HUN-1.236
 MARK-1.220
 COP-1.202
 sen-1.199
 Coca-1.199
Tokenies
Token ID444
Feature activation+6.64e-01

Pos logit contributions
bians+0.075
align+0.075
male+0.073
isel+0.073
imeters+0.073
Neg logit contributions
 HUN-0.084
 MARK-0.082
 COP-0.081
 Mond-0.081
 sen-0.081
Token are
Token ID389
Feature activation+5.01e-01

Pos logit contributions
bians+0.429
align+0.429
isel+0.419
male+0.419
imeters+0.416
Neg logit contributions
 HUN-0.474
 MARK-0.465
 COP-0.458
 Mond-0.458
 sen-0.457
Token finally
Token ID3443
Feature activation+2.02e+00

Pos logit contributions
bians+0.338
align+0.338
male+0.332
isel+0.331
handler+0.327
Neg logit contributions
 HUN-0.341
 MARK-0.335
 COP-0.330
 sen-0.330
 Mond-0.330
Token tallied
Token ID48729
Feature activation-1.88e+00

Pos logit contributions
align+1.315
bians+1.312
isel+1.287
male+1.280
 lesbians+1.269
Neg logit contributions
 HUN-1.400
 MARK-1.390
 COP-1.366
 Mond-1.364
 Coca-1.364
Token!
Token ID0
Feature activation+3.63e-01

Pos logit contributions
 HUN+1.202
 MARK+1.182
 COP+1.163
 sen+1.163
 Mond+1.161
Neg logit contributions
bians-1.358
align-1.351
isel-1.332
male-1.324
handler-1.318
TokenMan
Token ID5124
Feature activation+6.82e+00

Pos logit contributions
bians+0.240
align+0.240
male+0.236
isel+0.235
imeters+0.233
Neg logit contributions
 HUN-0.251
 MARK-0.247
 COP-0.244
 Mond-0.244
 sen-0.243
Token,
Token ID11
Feature activation+3.10e+00

Pos logit contributions
bians+4.369
male+4.367
align+4.360
isel+4.312
volt+4.310
Neg logit contributions
 HUN-4.617
 MARK-4.566
 Coca-4.503
 Mond-4.501
 COP-4.483
Token I
Token ID314
Feature activation-5.11e-03

Pos logit contributions
bians+2.014
align+2.003
male+1.981
 lesbians+1.961
isel+1.959
Neg logit contributions
 HUN-2.137
 MARK-2.110
 Mond-2.078
 Coca-2.069
 COP-2.069
Token got
Token ID1392
Feature activation+3.89e-01

Pos logit contributions
 HUN+0.004
 MARK+0.004
 Mond+0.004
 Coca+0.004
 sen+0.004
Neg logit contributions
bians-0.003
align-0.003
male-0.003
isel-0.003
imeters-0.003
Token a
Token ID257
Feature activation-9.10e-02

Pos logit contributions
bians+0.266
align+0.264
isel+0.262
male+0.260
imeters+0.258
Neg logit contributions
 HUN-0.259
 MARK-0.257
 COP-0.252
 Mond-0.252
 Coca-0.251
Token lot
Token ID1256
Feature activation-2.43e-01

Pos logit contributions
 HUN+0.059
 MARK+0.058
 Mond+0.057
 COP+0.057
 sen+0.057
Neg logit contributions
bians-0.064
align-0.064
isel-0.063
male-0.063
handler-0.063
Token out
Token ID503
Feature activation+8.92e-01

Pos logit contributions
bians+0.357
align+0.356
isel+0.350
male+0.348
handler+0.346
Neg logit contributions
 HUN-0.318
 MARK-0.312
 sen-0.308
 COP-0.307
 Mond-0.307
Token in
Token ID287
Feature activation+6.31e-01

Pos logit contributions
bians+0.609
align+0.605
isel+0.597
imeters+0.595
male+0.595
Neg logit contributions
 HUN-0.600
 MARK-0.594
 COP-0.582
 Mond-0.581
 sen-0.580
Token the
Token ID262
Feature activation+1.88e+00

Pos logit contributions
bians+0.452
align+0.449
isel+0.443
male+0.443
imeters+0.442
Neg logit contributions
 HUN-0.403
 MARK-0.398
 COP-0.389
 sen-0.388
 Mond-0.385
Token Kh
Token ID5311
Feature activation+3.70e+00

Pos logit contributions
bians+1.218
align+1.207
imeters+1.191
male+1.187
isel+1.183
Neg logit contributions
 HUN-1.308
 MARK-1.303
 COP-1.269
 sen-1.266
 Coca-1.255
Tokeny
Token ID88
Feature activation+2.95e+00

Pos logit contributions
align+2.011
bians+1.982
isel+1.982
male+1.949
handler+1.932
Neg logit contributions
 HUN-3.051
 MARK-3.016
 COP-2.975
 sen-2.964
 surv-2.954
Tokenber
Token ID527
Feature activation+6.81e+00

Pos logit contributions
align+1.926
bians+1.912
isel+1.912
raid+1.871
male+1.865
Neg logit contributions
 HUN-2.075
 MARK-2.055
 COP-2.049
 surv-2.011
 Mond-2.006
Token area
Token ID1989
Feature activation+2.11e+00

Pos logit contributions
bians+4.308
align+4.297
male+4.231
volt+4.204
female+4.197
Neg logit contributions
 HUN-4.774
 MARK-4.698
 COP-4.639
 surv-4.626
 sen-4.615
Token.
Token ID13
Feature activation-9.65e-01

Pos logit contributions
bians+1.459
align+1.457
isel+1.439
male+1.429
imeters+1.425
Neg logit contributions
 HUN-1.383
 MARK-1.373
 COP-1.343
 sen-1.339
 Coca-1.339
Token
Token ID198
Feature activation-3.39e-01

Pos logit contributions
 HUN+0.596
 MARK+0.581
 Mond+0.575
 COP+0.573
 Coca+0.572
Neg logit contributions
align-0.726
bians-0.726
isel-0.717
handler-0.711
male-0.707
Token
Token ID198
Feature activation-4.86e-01

Pos logit contributions
 HUN+0.227
 MARK+0.223
 sen+0.220
 COP+0.219
 Mond+0.219
Neg logit contributions
bians-0.233
align-0.233
male-0.230
isel-0.228
handler-0.227
TokenPeace
Token ID43445
Feature activation-2.32e+00

Pos logit contributions
 HUN+0.386
 MARK+0.381
 COP+0.376
 sen+0.376
 Mond+0.375
Neg logit contributions
bians-0.276
align-0.276
male-0.270
isel-0.270
handler-0.266
Token.
Token ID13
Feature activation-9.15e-01

Pos logit contributions
align+0.788
bians+0.782
male+0.778
handler+0.773
isel+0.771
Neg logit contributions
 HUN-0.614
 MARK-0.600
 surv-0.596
 sen-0.593
 Mond-0.590
Tokenparse
Token ID29572
Feature activation-5.39e-01

Pos logit contributions
 HUN+0.732
 MARK+0.721
 sen+0.713
 Mond+0.711
 COP+0.711
Neg logit contributions
align-0.517
bians-0.516
isel-0.505
male-0.502
handler-0.501
Token_
Token ID62
Feature activation+8.05e-01

Pos logit contributions
 HUN+0.439
 MARK+0.432
 Mond+0.427
 sen+0.427
 COP+0.426
Neg logit contributions
align-0.296
bians-0.296
isel-0.289
male-0.288
handler-0.287
Tokenargs
Token ID22046
Feature activation+2.74e-01

Pos logit contributions
bians+0.424
align+0.423
male+0.412
isel+0.412
handler+0.410
Neg logit contributions
 HUN-0.677
 MARK-0.667
 COP-0.659
 Mond-0.657
 sen-0.657
Token())
Token ID28955
Feature activation+2.96e+00

Pos logit contributions
bians+0.193
align+0.192
male+0.188
handler+0.188
isel+0.188
Neg logit contributions
 HUN-0.180
 MARK-0.178
 Mond-0.176
 COP-0.175
 sen-0.175
Token #
Token ID1303
Feature activation+6.80e+00

Pos logit contributions
bians+1.692
align+1.672
isel+1.650
male+1.641
handler+1.624
Neg logit contributions
 HUN-2.358
 MARK-2.325
 COP-2.301
 sen-2.299
 Mond-2.293
Token initialize
Token ID41216
Feature activation+1.85e+00

Pos logit contributions
align+2.053
bians+1.926
handler+1.922
isel+1.873
imeters+1.819
Neg logit contributions
 HUN-7.090
 Mond-7.015
 MARK-6.980
 COP-6.946
 sen-6.936
Token a
Token ID257
Feature activation+7.18e-01

Pos logit contributions
align+1.197
bians+1.195
male+1.170
handler+1.169
isel+1.168
Neg logit contributions
 HUN-1.296
 MARK-1.273
 Mond-1.266
 surv-1.256
 COP-1.248
Token rectangular
Token ID36954
Feature activation+1.35e+00

Pos logit contributions
align+0.470
bians+0.469
isel+0.460
imeters+0.458
male+0.456
Neg logit contributions
 HUN-0.500
 MARK-0.494
 Mond-0.485
 sen-0.480
 COP-0.477
Token and
Token ID290
Feature activation+2.92e-01

Pos logit contributions
align+0.936
bians+0.929
handler+0.918
isel+0.917
male+0.915
Neg logit contributions
 HUN-0.888
 MARK-0.869
 Mond-0.859
 COP-0.857
 sen-0.852
Token square
Token ID6616
Feature activation-1.36e+00

Pos logit contributions
align+0.160
bians+0.157
male+0.155
isel+0.155
imeters+0.155
Neg logit contributions
 HUN-0.236
 MARK-0.233
 Mond-0.231
 COP-0.230
 Coca-0.229
Token President
Token ID1992
Feature activation-1.50e+00

Pos logit contributions
 HUN+0.271
 MARK+0.266
 COP+0.262
 sen+0.262
 Mond+0.262
Neg logit contributions
bians-0.293
align-0.292
isel-0.287
male-0.286
imeters-0.284
Token Obama
Token ID2486
Feature activation+6.07e-01

Pos logit contributions
 HUN+0.921
 MARK+0.904
 COP+0.893
 Mond+0.892
 sen+0.889
Neg logit contributions
bians-1.126
align-1.120
isel-1.105
male-1.099
imeters-1.096
Token acknowledged
Token ID10810
Feature activation+6.88e-01

Pos logit contributions
align+0.415
bians+0.415
isel+0.408
male+0.406
handler+0.404
Neg logit contributions
 HUN-0.411
 MARK-0.403
 Mond-0.397
 COP-0.396
 sen-0.396
Token
Token ID564
Feature activation+1.30e+00

Pos logit contributions
bians+0.473
align+0.472
isel+0.465
male+0.463
handler+0.461
Neg logit contributions
 HUN-0.459
 MARK-0.451
 Mond-0.443
 sen-0.443
 COP-0.441
Token
Token ID250
Feature activation+9.54e-01

Pos logit contributions
bians+0.856
align+0.856
male+0.843
isel+0.841
imeters+0.833
Neg logit contributions
 HUN-0.899
 MARK-0.889
 Mond-0.881
 COP-0.877
 sen-0.874
Tokenno
Token ID3919
Feature activation+6.79e+00

Pos logit contributions
align+0.643
bians+0.637
male+0.628
isel+0.627
handler+0.622
Neg logit contributions
 HUN-0.656
 MARK-0.643
 Mond-0.635
 sen-0.634
 COP-0.631
Token foreign
Token ID3215
Feature activation+2.02e+00

Pos logit contributions
isel+4.451
bians+4.410
imeters+4.399
align+4.381
handler+4.364
Neg logit contributions
 HUN-4.406
 MARK-4.338
 Mond-4.316
 Kaplan-4.293
 Coca-4.238
Token nation
Token ID3277
Feature activation+7.57e-01

Pos logit contributions
bians+1.232
isel+1.225
align+1.216
male+1.209
imeters+1.205
Neg logit contributions
 HUN-1.494
 MARK-1.473
 Mond-1.471
 Kaplan-1.433
 Coca-1.432
Token,
Token ID11
Feature activation-7.51e-01

Pos logit contributions
bians+0.519
align+0.518
isel+0.513
male+0.512
handler+0.508
Neg logit contributions
 HUN-0.507
 MARK-0.497
 Mond-0.495
 Coca-0.485
 COP-0.485
Token no
Token ID645
Feature activation-7.59e-02

Pos logit contributions
 HUN+0.503
 MARK+0.497
 Mond+0.490
 COP+0.485
 sen+0.485
Neg logit contributions
bians-0.515
align-0.513
male-0.506
isel-0.506
handler-0.500
Token hacker
Token ID23385
Feature activation+3.20e-01

Pos logit contributions
 HUN+0.051
 MARK+0.050
 Mond+0.050
 COP+0.049
 sen+0.049
Neg logit contributions
bians-0.052
align-0.052
isel-0.051
male-0.051
handler-0.050
Token.
Token ID13
Feature activation-9.75e-01

Pos logit contributions
 HUN+0.744
 MARK+0.726
 COP+0.716
 sen+0.714
 Mond+0.714
Neg logit contributions
bians-1.094
align-1.091
isel-1.075
male-1.071
handler-1.068
Token
Token ID198
Feature activation-9.36e-01

Pos logit contributions
 HUN+0.614
 MARK+0.600
 COP+0.593
 Mond+0.593
 Coca+0.590
Neg logit contributions
bians-0.721
align-0.720
isel-0.711
handler-0.704
male-0.701
Token
Token ID198
Feature activation-5.42e-01

Pos logit contributions
 HUN+0.616
 MARK+0.605
 sen+0.597
 COP+0.596
 Mond+0.594
Neg logit contributions
bians-0.657
align-0.655
male-0.645
isel-0.643
handler-0.640
TokenK
Token ID42
Feature activation+1.26e+00

Pos logit contributions
 HUN+0.417
 MARK+0.411
 COP+0.406
 sen+0.405
 Mond+0.405
Neg logit contributions
bians-0.322
align-0.321
isel-0.315
male-0.315
handler-0.311
Tokenhy
Token ID12114
Feature activation+4.16e+00

Pos logit contributions
bians+0.750
align+0.746
isel+0.738
male+0.732
handler+0.728
Neg logit contributions
 HUN-0.968
 MARK-0.955
 COP-0.944
 Mond-0.942
 sen-0.941
Tokenber
Token ID527
Feature activation+6.77e+00

Pos logit contributions
bians+2.691
align+2.626
male+2.614
isel+2.596
female+2.550
Neg logit contributions
 HUN-3.012
 MARK-2.971
 COP-2.911
 Coca-2.896
 Mond-2.883
Token raid
Token ID9513
Feature activation+2.08e+00

Pos logit contributions
align+4.492
bians+4.488
male+4.450
isel+4.433
 lesbians+4.427
Neg logit contributions
 HUN-4.474
 MARK-4.417
 sen-4.326
 COP-4.301
 Mond-4.297
Token
Token ID198
Feature activation-9.02e-02

Pos logit contributions
align+1.429
bians+1.426
isel+1.409
imeters+1.402
raid+1.397
Neg logit contributions
 HUN-1.386
 MARK-1.361
 sen-1.339
 COP-1.337
 Mond-1.336
Token
Token ID198
Feature activation-1.20e+00

Pos logit contributions
 HUN+0.061
 MARK+0.060
 sen+0.059
 surv+0.059
 COP+0.059
Neg logit contributions
bians-0.061
align-0.061
male-0.060
isel-0.060
handler-0.060
TokenMilitary
Token ID37837
Feature activation-2.07e+00

Pos logit contributions
 HUN+0.930
 MARK+0.916
 COP+0.904
 sen+0.903
 Mond+0.903
Neg logit contributions
bians-0.704
align-0.700
isel-0.687
male-0.684
imeters-0.679
Token sources
Token ID4237
Feature activation-3.32e-01

Pos logit contributions
 HUN+1.824
 MARK+1.801
 sen+1.790
 COP+1.789
 Mond+1.786
Neg logit contributions
bians-0.992
align-0.986
isel-0.957
handler-0.953
male-0.951
Token (
Token ID357
Feature activation-1.94e+00

Pos logit contributions
 HUN+0.081
 MARK+0.080
 COP+0.079
 sen+0.078
 Mond+0.078
Neg logit contributions
bians-0.095
align-0.094
isel-0.093
male-0.093
imeters-0.092
Token21
Token ID2481
Feature activation-2.83e+00

Pos logit contributions
 HUN+1.162
 MARK+1.142
 COP+1.123
 sen+1.120
 Mond+1.119
Neg logit contributions
bians-1.481
align-1.470
isel-1.454
male-1.445
imeters-1.440
Token,
Token ID11
Feature activation-1.37e+00

Pos logit contributions
 HUN+1.737
 MARK+1.705
 COP+1.693
 sen+1.680
 Mond+1.669
Neg logit contributions
bians-2.107
isel-2.071
align-2.042
BuyableInstoreAndOnline-2.041
male-2.038
Token 21
Token ID2310
Feature activation-1.39e+00

Pos logit contributions
 HUN+0.914
 MARK+0.898
 sen+0.885
 COP+0.884
 Mond+0.883
Neg logit contributions
bians-0.955
align-0.954
isel-0.936
male-0.933
imeters-0.928
Token))
Token ID4008
Feature activation+1.89e+00

Pos logit contributions
 HUN+0.801
 MARK+0.784
 sen+0.773
 COP+0.771
 Mond+0.768
Neg logit contributions
bians-1.092
isel-1.073
align-1.072
male-1.066
handler-1.060
Token #
Token ID1303
Feature activation+6.77e+00

Pos logit contributions
align+1.289
bians+1.273
isel+1.251
imeters+1.244
male+1.240
Neg logit contributions
 HUN-1.284
 MARK-1.251
 Mond-1.235
 Coca-1.233
 sen-1.228
Token loop
Token ID9052
Feature activation+2.46e+00

Pos logit contributions
align+4.437
isel+4.230
bians+4.225
handler+4.216
imeters+4.200
Neg logit contributions
 HUN-4.615
 Premiership-4.500
 Mond-4.495
 Coca-4.491
 sen-4.457
Token over
Token ID625
Feature activation+3.14e-01

Pos logit contributions
align+1.778
bians+1.761
isel+1.737
imeters+1.736
handler+1.734
Neg logit contributions
 HUN-1.564
 MARK-1.530
 Mond-1.504
 Coca-1.503
 COP-1.496
Token the
Token ID262
Feature activation+4.83e-01

Pos logit contributions
align+0.231
bians+0.230
isel+0.227
imeters+0.225
male+0.225
Neg logit contributions
 HUN-0.196
 MARK-0.192
 sen-0.189
 Mond-0.189
 COP-0.189
Token input
Token ID5128
Feature activation-1.80e+00

Pos logit contributions
align+0.316
bians+0.315
isel+0.310
imeters+0.308
handler+0.307
Neg logit contributions
 HUN-0.340
 MARK-0.333
 sen-0.329
 Mond-0.328
 Coca-0.328
Token image
Token ID2939
Feature activation+2.03e-01

Pos logit contributions
 HUN+1.277
 MARK+1.260
 COP+1.242
 sen+1.241
 Mond+1.238
Neg logit contributions
bians-1.172
align-1.160
isel-1.144
male-1.142
handler-1.131
Token
Token ID198
Feature activation+1.92e+00

Pos logit contributions
male+1.416
align+1.412
bians+1.406
female+1.393
Female+1.384
Neg logit contributions
 surv-1.543
 HUN-1.541
 MARK-1.519
 Mond-1.508
 sen-1.504
Tokenpur
Token ID14225
Feature activation-7.38e-01

Pos logit contributions
align+1.208
bians+1.194
male+1.181
isel+1.176
volt+1.164
Neg logit contributions
 HUN-1.424
 MARK-1.406
 COP-1.390
 Mond-1.385
 Coca-1.380
Tokenak
Token ID461
Feature activation+2.28e+00

Pos logit contributions
 HUN+0.533
 MARK+0.524
 COP+0.517
 sen+0.517
 Mond+0.516
Neg logit contributions
bians-0.473
align-0.471
isel-0.463
male-0.461
handler-0.458
Tokenushi
Token ID17731
Feature activation-1.54e+00

Pos logit contributions
bians+1.478
align+1.469
isel+1.454
male+1.453
handler+1.430
Neg logit contributions
 HUN-1.623
 MARK-1.603
 COP-1.575
 Mond-1.571
 Coca-1.568
Token Profile
Token ID13118
Feature activation+4.68e+00

Pos logit contributions
 HUN+1.040
 MARK+1.022
 sen+1.007
 COP+1.007
 Mond+1.005
Neg logit contributions
bians-1.055
align-1.050
isel-1.033
male-1.028
imeters-1.023
Token Joined
Token ID16709
Feature activation+6.70e+00

Pos logit contributions
bians+2.903
align+2.877
isel+2.863
male+2.833
volt+2.782
Neg logit contributions
 HUN-3.369
 MARK-3.318
 COP-3.284
 Mond-3.263
 sen-3.238
Token August
Token ID2932
Feature activation+1.89e+00

Pos logit contributions
bians+3.714
align+3.661
male+3.622
 Male+3.607
isel+3.575
Neg logit contributions
 surv-4.980
 HUN-4.973
 MARK-4.913
 sen-4.868
 COP-4.859
Token 2012
Token ID2321
Feature activation+1.27e+00

Pos logit contributions
bians+1.236
align+1.207
male+1.192
isel+1.182
female+1.172
Neg logit contributions
 HUN-1.330
 MARK-1.322
 sen-1.303
 surv-1.301
 COP-1.286
Token United
Token ID1578
Feature activation-9.13e-01

Pos logit contributions
bians+0.746
align+0.731
male+0.723
isel+0.722
volt+0.709
Neg logit contributions
 HUN-0.964
 MARK-0.956
 sen-0.948
 COP-0.941
 surv-0.939
Token States
Token ID1829
Feature activation-2.04e+00

Pos logit contributions
 HUN+0.603
 MARK+0.591
 Mond+0.584
 sen+0.584
 COP+0.582
Neg logit contributions
bians-0.640
align-0.640
isel-0.631
handler-0.626
imeters-0.626
Token 3
Token ID513
Feature activation+7.38e-01

Pos logit contributions
 HUN+1.430
 MARK+1.408
 COP+1.388
 sen+1.386
 Mond+1.385
Neg logit contributions
bians-1.351
align-1.344
isel-1.323
male-1.317
imeters-1.308
Token attempts
Token ID6370
Feature activation+1.10e-02

Pos logit contributions
 HUN+1.083
 MARK+1.065
 COP+1.050
 sen+1.049
 Mond+1.047
Neg logit contributions
bians-1.100
align-1.095
isel-1.078
male-1.073
imeters-1.067
Token have
Token ID423
Feature activation+1.46e-01

Pos logit contributions
bians+0.007
align+0.007
isel+0.007
male+0.007
imeters+0.007
Neg logit contributions
 HUN-0.008
 MARK-0.008
 Mond-0.007
 COP-0.007
 sen-0.007
Token been
Token ID587
Feature activation-8.87e-02

Pos logit contributions
bians+0.103
align+0.102
isel+0.100
male+0.100
imeters+0.100
Neg logit contributions
 HUN-0.095
 MARK-0.094
 Mond-0.093
 COP-0.093
 Coca-0.092
Token for
Token ID329
Feature activation-3.71e-01

Pos logit contributions
 HUN+0.060
 MARK+0.060
 Mond+0.059
 COP+0.059
 Coca+0.058
Neg logit contributions
bians-0.060
align-0.060
isel-0.059
male-0.059
handler-0.058
Token the
Token ID262
Feature activation+1.25e+00

Pos logit contributions
 HUN+0.244
 MARK+0.240
 COP+0.237
 Mond+0.236
 sen+0.236
Neg logit contributions
bians-0.261
align-0.260
isel-0.256
male-0.255
imeters-0.254
Token any
Token ID597
Feature activation+6.65e+00

Pos logit contributions
bians+0.831
align+0.822
isel+0.811
male+0.808
imeters+0.805
Neg logit contributions
 HUN-0.858
 MARK-0.852
 COP-0.836
 Mond-0.834
 sen-0.833
Token%
Token ID4
Feature activation+3.28e+00

Pos logit contributions
bians+3.665
align+3.660
male+3.632
imeters+3.581
handler+3.576
Neg logit contributions
 HUN-5.216
 MARK-5.174
 COP-5.075
 Mond-5.073
 sen-5.073
Token and
Token ID290
Feature activation+2.55e+00

Pos logit contributions
bians+2.233
align+2.200
male+2.180
isel+2.175
imeters+2.169
Neg logit contributions
 HUN-2.179
 MARK-2.156
 COP-2.128
 Mond-2.116
 Coca-2.103
Token I
Token ID314
Feature activation+8.28e-01

Pos logit contributions
bians+1.741
align+1.726
isel+1.701
male+1.698
handler+1.688
Neg logit contributions
 HUN-1.689
 MARK-1.678
 COP-1.643
 Mond-1.639
 sen-1.636
Token'm
Token ID1101
Feature activation+5.09e-01

Pos logit contributions
bians+0.501
align+0.496
male+0.491
isel+0.489
imeters+0.482
Neg logit contributions
 HUN-0.622
 MARK-0.617
 Mond-0.611
 Coca-0.608
 COP-0.604
Token still
Token ID991
Feature activation+2.37e+00

Pos logit contributions
bians+0.340
align+0.339
isel+0.333
male+0.333
handler+0.331
Neg logit contributions
 HUN-0.349
 MARK-0.346
 Mond-0.341
 COP-0.339
 Coca-0.339
Token]."
Token ID29225
Feature activation-1.58e+00

Pos logit contributions
bians+0.054
align+0.054
isel+0.052
male+0.052
handler+0.052
Neg logit contributions
 HUN-0.114
 MARK-0.113
 COP-0.111
 sen-0.111
 Mond-0.111
Token
Token ID198
Feature activation+1.42e+00

Pos logit contributions
 HUN+0.973
 MARK+0.960
 COP+0.944
 Mond+0.941
 Coca+0.937
Neg logit contributions
align-1.173
bians-1.172
isel-1.156
male-1.147
handler-1.144
Token
Token ID198
Feature activation+1.18e+00

Pos logit contributions
align+0.920
bians+0.917
male+0.913
handler+0.900
female+0.896
Neg logit contributions
 HUN-1.005
 surv-0.996
 MARK-0.983
 sen-0.975
 Mond-0.969
TokenWithout
Token ID16249
Feature activation+1.57e+00

Pos logit contributions
align+0.716
bians+0.710
male+0.703
isel+0.698
female+0.693
Neg logit contributions
 HUN-0.898
 MARK-0.876
 COP-0.873
 sen-0.871
 Mond-0.865
Token going
Token ID1016
Feature activation+4.68e+00

Pos logit contributions
align+1.004
bians+0.997
isel+0.992
imeters+0.982
male+0.979
Neg logit contributions
 HUN-1.117
 MARK-1.095
 Mond-1.078
 sen-1.077
 COP-1.072
Token into
Token ID656
Feature activation+6.62e+00

Pos logit contributions
isel+2.974
 lesbians+2.968
bians+2.967
align+2.957
male+2.951
Neg logit contributions
 HUN-3.192
 MARK-3.167
 COP-3.101
 Mond-3.094
 Coca-3.079
Token specifics
Token ID23514
Feature activation+4.87e+00

Pos logit contributions
align+3.398
isel+3.386
 lesbians+3.343
volt+3.323
imeters+3.323
Neg logit contributions
 HUN-5.286
 MARK-5.138
 surv-5.086
 COP-5.075
 sen-5.063
Token,
Token ID11
Feature activation+4.94e-01

Pos logit contributions
align+2.964
bians+2.961
isel+2.947
imeters+2.923
handler+2.909
Neg logit contributions
 HUN-3.518
 MARK-3.475
 COP-3.416
 Mond-3.415
 surv-3.408
Token she
Token ID673
Feature activation+7.13e-01

Pos logit contributions
bians+0.369
align+0.367
isel+0.364
male+0.362
imeters+0.360
Neg logit contributions
 HUN-0.298
 MARK-0.291
 sen-0.287
 COP-0.286
 Mond-0.286
Token said
Token ID531
Feature activation+1.04e-01

Pos logit contributions
bians+0.470
align+0.465
isel+0.461
male+0.460
imeters+0.454
Neg logit contributions
 HUN-0.502
 MARK-0.492
 Mond-0.484
 COP-0.483
 sen-0.482
Token the
Token ID262
Feature activation-5.10e-01

Pos logit contributions
bians+0.073
align+0.072
isel+0.072
male+0.071
imeters+0.071
Neg logit contributions
 HUN-0.068
 MARK-0.067
 sen-0.066
 COP-0.066
 Mond-0.066
Token far
Token ID1290
Feature activation-2.13e-01

Pos logit contributions
bians+0.622
align+0.620
isel+0.614
handler+0.607
male+0.605
Neg logit contributions
 HUN-0.638
 MARK-0.628
 sen-0.618
 Mond-0.616
 COP-0.616
Token from
Token ID422
Feature activation-1.38e+00

Pos logit contributions
 HUN+0.146
 MARK+0.143
 COP+0.141
 sen+0.141
 Mond+0.141
Neg logit contributions
bians-0.144
align-0.144
isel-0.141
male-0.141
imeters-0.140
Token over
Token ID625
Feature activation-1.24e+00

Pos logit contributions
 HUN+0.916
 MARK+0.900
 COP+0.887
 sen+0.886
 Mond+0.885
Neg logit contributions
bians-0.967
align-0.963
isel-0.948
male-0.944
imeters-0.938
Token."
Token ID526
Feature activation-1.26e+00

Pos logit contributions
 HUN+0.733
 MARK+0.720
 COP+0.708
 sen+0.707
 Mond+0.707
Neg logit contributions
bians-0.951
align-0.947
isel-0.935
male-0.930
handler-0.925
Token<|endoftext|>
Token ID50256
Feature activation-5.36e-01

Pos logit contributions
 HUN+0.784
 MARK+0.778
 Mond+0.763
 COP+0.761
 Coca+0.760
Neg logit contributions
bians-0.932
align-0.927
isel-0.916
male-0.905
imeters-0.904
TokenRare
Token ID26737
Feature activation+6.51e+00

Pos logit contributions
 HUN+0.401
 MARK+0.395
 COP+0.389
 sen+0.389
 Mond+0.389
Neg logit contributions
bians-0.330
align-0.328
isel-0.323
male-0.321
handler-0.319
Token Photo
Token ID5555
Feature activation+1.33e+00

Pos logit contributions
bians+4.471
align+4.415
isel+4.397
male+4.386
handler+4.343
Neg logit contributions
 HUN-4.128
 MARK-4.073
 sen-3.968
 Mond-3.958
 COP-3.939
Token of
Token ID286
Feature activation+9.35e-01

Pos logit contributions
bians+0.875
align+0.874
isel+0.858
male+0.858
handler+0.854
Neg logit contributions
 HUN-0.925
 MARK-0.909
 sen-0.898
 Mond-0.895
 COP-0.894
Token Helen
Token ID22695
Feature activation+1.54e+00

Pos logit contributions
bians+0.681
align+0.674
isel+0.667
male+0.667
handler+0.660
Neg logit contributions
 HUN-0.588
 MARK-0.579
 sen-0.570
 COP-0.567
 Mond-0.566
Token Keller
Token ID35999
Feature activation-5.80e-01

Pos logit contributions
bians+1.206
align+1.188
male+1.185
isel+1.179
handler+1.174
Neg logit contributions
 HUN-0.882
 MARK-0.870
 sen-0.860
 COP-0.851
 Coca-0.842
Token as
Token ID355
Feature activation+1.21e+00

Pos logit contributions
 HUN+0.385
 MARK+0.379
 sen+0.373
 COP+0.373
 Mond+0.372
Neg logit contributions
bians-0.404
align-0.402
isel-0.396
male-0.396
handler-0.393
Token/
Token ID14
Feature activation+1.36e-01

Pos logit contributions
align+1.230
bians+1.221
handler+1.201
isel+1.199
male+1.198
Neg logit contributions
 HUN-1.145
 MARK-1.114
 Mond-1.101
 Coca-1.098
 COP-1.096
Tokenpop
Token ID12924
Feature activation+1.18e+00

Pos logit contributions
bians+0.084
align+0.083
isel+0.082
male+0.081
imeters+0.081
Neg logit contributions
 HUN-0.101
 MARK-0.100
 COP-0.098
 sen-0.098
 Mond-0.098
Token/
Token ID14
Feature activation+7.71e-01

Pos logit contributions
align+0.857
bians+0.854
isel+0.840
male+0.837
handler+0.835
Neg logit contributions
 HUN-0.752
 MARK-0.734
 Mond-0.724
 COP-0.722
 sen-0.721
Tokenpop
Token ID12924
Feature activation+1.72e+00

Pos logit contributions
bians+0.528
align+0.528
isel+0.518
male+0.517
handler+0.514
Neg logit contributions
 HUN-0.522
 MARK-0.512
 COP-0.505
 Mond-0.505
 sen-0.505
Token/
Token ID14
Feature activation+1.11e+00

Pos logit contributions
align+1.276
bians+1.262
isel+1.244
handler+1.244
male+1.241
Neg logit contributions
 HUN-1.069
 MARK-1.034
 Coca-1.024
 Mond-1.023
 COP-1.020
Tokenret
Token ID1186
Feature activation+6.50e+00

Pos logit contributions
align+0.749
bians+0.746
isel+0.732
male+0.730
handler+0.727
Neg logit contributions
 HUN-0.768
 MARK-0.751
 Mond-0.741
 sen-0.740
 COP-0.740
Token,
Token ID837
Feature activation+2.35e+00

Pos logit contributions
align+4.948
bians+4.788
handler+4.769
isel+4.747
volt+4.745
Neg logit contributions
 HUN-3.789
 Coca-3.628
 Premiership-3.599
 Mond-3.560
 MARK-3.559
Token which
Token ID543
Feature activation+2.75e+00

Pos logit contributions
align+1.702
bians+1.678
handler+1.663
isel+1.661
male+1.660
Neg logit contributions
 HUN-1.473
 MARK-1.435
 Mond-1.418
 sen-1.416
 Coca-1.414
Token we
Token ID356
Feature activation-4.98e-01

Pos logit contributions
align+1.731
bians+1.721
isel+1.694
handler+1.690
male+1.660
Neg logit contributions
 HUN-1.966
 MARK-1.923
 sen-1.903
 Mond-1.895
 Premiership-1.888
Token'll
Token ID1183
Feature activation-1.16e+00

Pos logit contributions
 HUN+0.338
 MARK+0.332
 Mond+0.328
 sen+0.327
 COP+0.327
Neg logit contributions
bians-0.340
align-0.339
isel-0.333
male-0.331
handler-0.328
Token call
Token ID869
Feature activation-7.98e-01

Pos logit contributions
 HUN+0.815
 MARK+0.800
 Mond+0.790
 sen+0.788
 COP+0.788
Neg logit contributions
bians-0.770
align-0.769
isel-0.754
male-0.750
handler-0.747
Token.
Token ID13
Feature activation+2.51e+00

Pos logit contributions
 HUN+0.226
 MARK+0.222
 COP+0.218
 Mond+0.217
 sen+0.217
Neg logit contributions
bians-0.262
align-0.261
isel-0.258
male-0.258
imeters-0.256
Token In
Token ID554
Feature activation-7.53e-02

Pos logit contributions
bians+1.445
align+1.407
male+1.399
isel+1.377
female+1.376
Neg logit contributions
 surv-1.915
 HUN-1.898
 MARK-1.881
 sen-1.878
 Mond-1.840
Token bed
Token ID3996
Feature activation+1.45e+00

Pos logit contributions
 HUN+0.046
 MARK+0.045
 COP+0.044
 Mond+0.044
 sen+0.044
Neg logit contributions
bians-0.057
align-0.057
isel-0.056
male-0.056
handler-0.056
Token.
Token ID13
Feature activation+2.56e+00

Pos logit contributions
bians+1.041
align+1.036
male+1.028
isel+1.026
handler+1.015
Neg logit contributions
 HUN-0.928
 MARK-0.912
 COP-0.897
 Mond-0.892
 Coca-0.888
Token Oh
Token ID3966
Feature activation+8.61e-01

Pos logit contributions
bians+1.474
align+1.438
male+1.425
isel+1.410
female+1.405
Neg logit contributions
 HUN-1.937
 surv-1.935
 MARK-1.922
 sen-1.912
 COP-1.882
Token god
Token ID5770
Feature activation+6.50e+00

Pos logit contributions
bians+0.600
align+0.597
male+0.588
isel+0.588
handler+0.582
Neg logit contributions
 HUN-0.571
 MARK-0.562
 COP-0.552
 Mond-0.552
 sen-0.551
Token."
Token ID526
Feature activation+5.10e-01

Pos logit contributions
bians+4.220
align+4.184
male+4.158
 lesbians+4.117
isel+4.099
Neg logit contributions
 HUN-4.404
 MARK-4.312
 Mond-4.259
 Premiership-4.242
 COP-4.235
Token She
Token ID1375
Feature activation-7.23e-01

Pos logit contributions
bians+0.343
align+0.341
isel+0.336
male+0.335
imeters+0.332
Neg logit contributions
 HUN-0.351
 MARK-0.346
 sen-0.341
 COP-0.340
 Mond-0.340
Token chuck
Token ID20539
Feature activation-2.45e+00

Pos logit contributions
 HUN+0.520
 MARK+0.512
 COP+0.504
 sen+0.504
 Mond+0.504
Neg logit contributions
bians-0.465
align-0.463
isel-0.455
male-0.453
imeters-0.450
Tokenles
Token ID829
Feature activation-2.84e+00

Pos logit contributions
 HUN+2.082
 MARK+2.053
 sen+2.031
 COP+2.030
 Mond+2.025
Neg logit contributions
bians-1.252
align-1.247
isel-1.220
male-1.213
handler-1.203
Token,
Token ID11
Feature activation-2.06e+00

Pos logit contributions
 HUN+1.733
 MARK+1.702
 sen+1.685
 COP+1.676
 Mond+1.675
Neg logit contributions
bians-2.138
align-2.133
isel-2.096
male-2.089
handler-2.078
Token OF
Token ID3963
Feature activation-2.24e+00

Pos logit contributions
 HUN+1.068
 MARK+1.046
 sen+1.031
 Mond+1.029
 COP+1.029
Neg logit contributions
bians-1.464
align-1.458
isel-1.438
male-1.434
handler-1.427
Token EXT
Token ID27489
Feature activation-1.83e+00

Pos logit contributions
 HUN+1.349
 MARK+1.324
 COP+1.302
 sen+1.300
 Mond+1.300
Neg logit contributions
bians-1.706
align-1.699
isel-1.675
male-1.668
imeters-1.659
TokenREM
Token ID40726
Feature activation-2.39e+00

Pos logit contributions
 HUN+1.150
 MARK+1.128
 sen+1.112
 Mond+1.110
 COP+1.110
Neg logit contributions
bians-1.340
align-1.334
isel-1.313
male-1.309
handler-1.300
TokenELY
Token ID30943
Feature activation-3.70e+00

Pos logit contributions
 HUN+2.273
 MARK+2.248
 COP+2.230
 sen+2.215
 Mond+2.209
Neg logit contributions
bians-0.991
align-0.985
isel-0.962
male-0.959
imeters-0.953
Token EXP
Token ID25703
Feature activation-1.39e+00

Pos logit contributions
 HUN+2.055
 MARK+2.015
 Mond+1.989
 sen+1.988
 COP+1.978
Neg logit contributions
bians-2.962
align-2.947
isel-2.907
male-2.903
handler-2.885
TokenENS
Token ID16938
Feature activation-8.44e+00

Pos logit contributions
 HUN+0.965
 MARK+0.954
 Mond+0.950
 Coca+0.947
 sen+0.944
Neg logit contributions
bians-0.908
align-0.905
isel-0.892
male-0.888
volt-0.883
TokenIVE
Token ID9306
Feature activation-1.06e+00

Pos logit contributions
 HUN+3.437
 MARK+3.338
 COP+3.056
 Mond+2.985
 Coca+2.951
Neg logit contributions
bians-8.240
isel-8.143
rogens-8.135
align-8.120
raid-8.108
Token BE
Token ID9348
Feature activation+3.15e-01

Pos logit contributions
 HUN+0.577
 sen+0.565
 Mond+0.563
 MARK+0.563
 COP+0.555
Neg logit contributions
bians-0.852
align-0.843
male-0.833
isel-0.833
handler-0.825
TokenER
Token ID1137
Feature activation+8.92e-01

Pos logit contributions
bians+0.223
align+0.220
isel+0.220
male+0.218
handler+0.216
Neg logit contributions
 HUN-0.201
 Mond-0.199
 sen-0.198
 MARK-0.198
 Coca-0.196
Token WITHOUT
Token ID42881
Feature activation+6.50e-01

Pos logit contributions
bians+0.694
align+0.684
male+0.681
handler+0.677
isel+0.675
Neg logit contributions
 Mond-0.493
 HUN-0.493
 sen-0.492
 surv-0.486
 Kaplan-0.485
Token THE
Token ID3336
Feature activation-1.69e+00

Pos logit contributions
bians+0.524
align+0.518
isel+0.513
male+0.511
handler+0.508
Neg logit contributions
 HUN-0.347
 sen-0.344
 Mond-0.343
 MARK-0.339
 surv-0.335
Token.
Token ID13
Feature activation-1.73e+00

Pos logit contributions
 HUN+1.416
 MARK+1.386
 sen+1.362
 COP+1.361
 Mond+1.358
Neg logit contributions
bians-2.062
align-2.057
isel-2.028
male-2.024
handler-2.015
Token Invalid
Token ID17665
Feature activation-7.79e-01

Pos logit contributions
 HUN+1.001
 MARK+0.979
 sen+0.968
 Mond+0.962
 COP+0.959
Neg logit contributions
bians-1.345
align-1.341
isel-1.322
male-1.319
handler-1.312
Token email
Token ID3053
Feature activation+2.25e+00

Pos logit contributions
 HUN+0.374
 MARK+0.364
 Mond+0.356
 sen+0.356
 COP+0.356
Neg logit contributions
align-0.686
bians-0.685
isel-0.677
male-0.673
imeters-0.672
Token address
Token ID2209
Feature activation-6.56e-01

Pos logit contributions
align+1.384
isel+1.352
handler+1.344
bians+1.339
imeters+1.332
Neg logit contributions
 HUN-1.683
 MARK-1.639
 Mond-1.633
 sen-1.614
 surv-1.607
Token.
Token ID13
Feature activation-3.80e+00

Pos logit contributions
 HUN+0.387
 MARK+0.378
 surv+0.375
 sen+0.373
 Mond+0.372
Neg logit contributions
bians-0.502
align-0.501
handler-0.496
isel-0.496
male-0.495
Token Please
Token ID4222
Feature activation-8.12e+00

Pos logit contributions
 HUN+2.954
 MARK+2.907
 COP+2.891
 Coca+2.869
 sen+2.864
Neg logit contributions
bians-2.243
align-2.215
isel-2.199
male-2.165
handler-2.158
Token re
Token ID302
Feature activation-1.20e+00

Pos logit contributions
 HUN+2.708
 sen+2.654
 MARK+2.631
 COP+2.585
 Coca+2.571
Neg logit contributions
bians-8.333
BuyableInstoreAndOnline-8.274
align-8.151
isel-8.132
imeters-8.128
Token-
Token ID12
Feature activation-5.04e+00

Pos logit contributions
 HUN+0.716
 MARK+0.702
 COP+0.690
 sen+0.689
 Mond+0.689
Neg logit contributions
bians-0.924
align-0.921
isel-0.908
male-0.904
handler-0.899
Tokenenter
Token ID9255
Feature activation-5.07e+00

Pos logit contributions
 COP+4.074
 MARK+4.071
 HUN+4.055
 sen+4.052
 Mond+4.042
Neg logit contributions
bians-2.714
BuyableInstoreAndOnline-2.647
imeters-2.605
 lesbians-2.567
align-2.559
Token.
Token ID13
Feature activation-5.79e+00

Pos logit contributions
 HUN+2.459
 MARK+2.409
 COP+2.372
 Coca+2.371
 Mond+2.363
Neg logit contributions
BuyableInstoreAndOnline-4.421
bians-4.418
align-4.368
isel-4.334
imeters-4.312
Token You
Token ID921
Feature activation-6.23e+00

Pos logit contributions
 HUN+3.963
 COP+3.909
 MARK+3.908
 Coca+3.885
 sen+3.808
Neg logit contributions
bians-3.875
align-3.855
isel-3.813
autions-3.794
male-3.783
Token.
Token ID13
Feature activation-1.77e+00

Pos logit contributions
 HUN+1.132
 MARK+1.108
 sen+1.090
 COP+1.088
 Mond+1.085
Neg logit contributions
bians-1.622
align-1.620
isel-1.595
male-1.594
handler-1.587
Token Invalid
Token ID17665
Feature activation-2.45e-01

Pos logit contributions
 HUN+1.014
 MARK+0.993
 sen+0.978
 Mond+0.974
 COP+0.973
Neg logit contributions
bians-1.391
align-1.386
isel-1.366
male-1.363
handler-1.356
Token email
Token ID3053
Feature activation+2.60e+00

Pos logit contributions
 HUN+0.118
 MARK+0.115
 Mond+0.112
 sen+0.112
 COP+0.112
Neg logit contributions
align-0.215
bians-0.215
isel-0.212
male-0.211
imeters-0.211
Token address
Token ID2209
Feature activation-4.94e-01

Pos logit contributions
align+1.604
isel+1.563
handler+1.554
bians+1.551
male+1.544
Neg logit contributions
 HUN-1.954
 MARK-1.899
 Mond-1.890
 sen-1.873
 surv-1.866
Token.
Token ID13
Feature activation-4.14e+00

Pos logit contributions
 HUN+0.291
 MARK+0.284
 surv+0.283
 sen+0.281
 Mond+0.279
Neg logit contributions
bians-0.378
align-0.378
handler-0.374
male-0.373
isel-0.373
Token Please
Token ID4222
Feature activation-8.03e+00

Pos logit contributions
 HUN+3.157
 MARK+3.107
 COP+3.104
 Coca+3.080
 Mond+3.062
Neg logit contributions
bians-2.506
align-2.466
isel-2.463
male-2.416
handler-2.414
Token re
Token ID302
Feature activation-7.70e-01

Pos logit contributions
 HUN+2.604
 sen+2.564
 MARK+2.555
 COP+2.526
 Coca+2.490
Neg logit contributions
bians-8.304
BuyableInstoreAndOnline-8.272
isel-8.109
align-8.097
imeters-8.096
Token-
Token ID12
Feature activation-4.77e+00

Pos logit contributions
 HUN+0.465
 MARK+0.455
 Mond+0.447
 COP+0.447
 sen+0.447
Neg logit contributions
bians-0.585
align-0.585
isel-0.576
male-0.575
handler-0.572
Tokenenter
Token ID9255
Feature activation-4.51e+00

Pos logit contributions
 MARK+3.874
 COP+3.869
 HUN+3.852
 sen+3.849
 Mond+3.838
Neg logit contributions
bians-2.559
BuyableInstoreAndOnline-2.493
imeters-2.467
align-2.416
isel-2.415
Token.
Token ID13
Feature activation-5.56e+00

Pos logit contributions
 HUN+2.228
 MARK+2.185
 COP+2.149
 Coca+2.146
 Mond+2.142
Neg logit contributions
bians-3.902
BuyableInstoreAndOnline-3.881
align-3.859
isel-3.827
imeters-3.808
Token You
Token ID921
Feature activation-5.69e+00

Pos logit contributions
 HUN+3.802
 COP+3.757
 MARK+3.751
 Coca+3.737
 Mond+3.659
Neg logit contributions
bians-3.730
align-3.701
isel-3.680
autions-3.670
aline-3.644
Token was
Token ID373
Feature activation+1.21e+00

Pos logit contributions
bians+1.108
align+1.102
isel+1.094
male+1.079
imeters+1.077
Neg logit contributions
 HUN-1.576
 MARK-1.549
 COP-1.533
 Mond-1.523
 sen-1.516
Token the
Token ID262
Feature activation+1.35e+00

Pos logit contributions
align+0.846
bians+0.841
isel+0.831
male+0.830
imeters+0.818
Neg logit contributions
 HUN-0.803
 MARK-0.785
 COP-0.774
 Mond-0.770
 sen-0.769
Token seed
Token ID9403
Feature activation+3.29e-01

Pos logit contributions
align+0.949
bians+0.944
isel+0.934
male+0.929
imeters+0.921
Neg logit contributions
 HUN-0.887
 MARK-0.870
 COP-0.856
 Mond-0.854
 sen-0.846
Token of
Token ID286
Feature activation-1.10e+00

Pos logit contributions
bians+0.225
align+0.224
isel+0.220
male+0.219
imeters+0.218
Neg logit contributions
 HUN-0.224
 MARK-0.220
 COP-0.217
 sen-0.217
 Mond-0.217
Token res
Token ID581
Feature activation+1.15e+00

Pos logit contributions
 HUN+0.709
 MARK+0.697
 COP+0.685
 Mond+0.683
 sen+0.682
Neg logit contributions
bians-0.786
align-0.784
isel-0.773
male-0.770
imeters-0.764
Tokensent
Token ID34086
Feature activation-8.01e+00

Pos logit contributions
bians+0.679
align+0.678
isel+0.666
male+0.662
imeters+0.653
Neg logit contributions
 HUN-0.884
 MARK-0.872
 COP-0.863
 Mond-0.859
 sen-0.854
Tokeniment
Token ID3681
Feature activation-2.30e+00

Pos logit contributions
 sen+4.968
 Coca+4.921
 HUN+4.921
 Mond+4.895
 MARK+4.828
Neg logit contributions
bians-5.646
BuyableInstoreAndOnline-5.586
isel-5.481
handler-5.468
 lesbians-5.424
Token,
Token ID11
Feature activation-4.40e-01

Pos logit contributions
 HUN+1.430
 MARK+1.400
 sen+1.387
 COP+1.384
 Mond+1.380
Neg logit contributions
bians-1.709
align-1.687
isel-1.666
male-1.656
handler-1.652
Token and
Token ID290
Feature activation-5.88e-01

Pos logit contributions
 HUN+0.283
 MARK+0.278
 COP+0.274
 sen+0.274
 Mond+0.274
Neg logit contributions
bians-0.316
align-0.315
isel-0.310
male-0.309
handler-0.307
Token for
Token ID329
Feature activation-3.75e-02

Pos logit contributions
 HUN+0.386
 MARK+0.379
 COP+0.374
 Mond+0.372
 sen+0.372
Neg logit contributions
bians-0.413
align-0.412
isel-0.406
male-0.404
handler-0.401
Token Nietzsche
Token ID40258
Feature activation-1.39e+00

Pos logit contributions
 HUN+0.023
 MARK+0.023
 COP+0.022
 sen+0.022
 Mond+0.022
Neg logit contributions
bians-0.028
align-0.028
isel-0.027
male-0.027
imeters-0.027
Tokenjpg
Token ID9479
Feature activation+1.31e-01

Pos logit contributions
 HUN+1.281
 MARK+1.252
 COP+1.224
 Mond+1.222
 sen+1.220
Neg logit contributions
bians-2.163
 lesbians-2.139
align-2.136
imeters-2.132
isel-2.124
Token
Token ID198
Feature activation-3.62e+00

Pos logit contributions
bians+0.089
align+0.088
male+0.087
isel+0.086
female+0.086
Neg logit contributions
 HUN-0.088
 surv-0.087
 sen-0.086
 MARK-0.086
 Mond-0.086
Token
Token ID198
Feature activation-4.84e+00

Pos logit contributions
 HUN+2.246
 MARK+2.209
 COP+2.174
 Mond+2.166
 sen+2.164
Neg logit contributions
bians-2.692
align-2.673
isel-2.647
imeters-2.625
male-2.614
TokenEmail
Token ID15333
Feature activation+2.86e+00

Pos logit contributions
 MARK+2.016
 HUN+2.015
 sen+1.976
 COP+1.964
 Coca+1.943
Neg logit contributions
bians-4.553
imeters-4.425
autions-4.420
align-4.411
male-4.398
Token
Token ID198
Feature activation-7.68e+00

Pos logit contributions
align+1.831
bians+1.818
male+1.808
isel+1.787
volt+1.784
Neg logit contributions
 HUN-2.003
 surv-1.993
 sen-1.965
 MARK-1.951
 Mond-1.951
Token
Token ID198
Feature activation-7.91e+00

Pos logit contributions
 HUN+3.960
 MARK+3.934
 Mond+3.901
 SEO+3.856
 COP+3.855
Neg logit contributions
BuyableInstoreAndOnline-6.284
bians-6.267
autions-6.235
interstitial-6.222
imeters-6.183
TokenDis
Token ID7279
Feature activation-1.83e+00

Pos logit contributions
Du+5.017
 Mond+4.977
 MARK+4.871
 HUN+4.843
 SEO+4.830
Neg logit contributions
bians-5.459
imeters-5.408
 lesbians-5.363
autions-5.346
isel-5.275
Tokenruption
Token ID6417
Feature activation-4.54e+00

Pos logit contributions
 HUN+1.219
 MARK+1.194
 COP+1.175
 sen+1.174
 Mond+1.172
Neg logit contributions
bians-1.276
align-1.275
isel-1.252
male-1.246
handler-1.239
Token is
Token ID318
Feature activation-1.94e+00

Pos logit contributions
 HUN+2.813
 MARK+2.794
 sen+2.772
 COP+2.768
 Coca+2.758
Neg logit contributions
bians-3.347
BuyableInstoreAndOnline-3.327
align-3.262
autions-3.213
isel-3.210
Token perhaps
Token ID3737
Feature activation-2.17e+00

Pos logit contributions
 HUN+1.269
 MARK+1.238
 COP+1.215
 sen+1.213
 Mond+1.213
Neg logit contributions
bians-1.381
align-1.378
isel-1.365
male-1.361
handler-1.347
Token one
Token ID530
Feature activation-3.86e+00

Pos logit contributions
 HUN+1.260
 MARK+1.233
 Mond+1.209
 COP+1.208
 sen+1.207
Neg logit contributions
bians-1.691
align-1.687
isel-1.671
male-1.662
handler-1.651
Token.
Token ID13
Feature activation-1.72e+00

Pos logit contributions
 HUN+1.354
 MARK+1.325
 sen+1.302
 COP+1.301
 Mond+1.298
Neg logit contributions
bians-1.966
align-1.962
isel-1.934
male-1.930
handler-1.922
Token Invalid
Token ID17665
Feature activation-5.16e-01

Pos logit contributions
 HUN+1.008
 MARK+0.986
 sen+0.975
 Mond+0.968
 COP+0.965
Neg logit contributions
bians-1.333
align-1.332
isel-1.311
male-1.308
handler-1.302
Token email
Token ID3053
Feature activation+2.11e+00

Pos logit contributions
 HUN+0.249
 MARK+0.242
 Mond+0.237
 sen+0.237
 COP+0.237
Neg logit contributions
align-0.453
bians-0.452
isel-0.447
male-0.445
imeters-0.444
Token address
Token ID2209
Feature activation-6.50e-01

Pos logit contributions
align+1.298
isel+1.270
handler+1.261
bians+1.257
male+1.254
Neg logit contributions
 HUN-1.577
 MARK-1.536
 Mond-1.528
 sen-1.511
 surv-1.507
Token.
Token ID13
Feature activation-3.45e+00

Pos logit contributions
 HUN+0.383
 MARK+0.375
 surv+0.372
 sen+0.369
 Mond+0.368
Neg logit contributions
align-0.497
bians-0.497
handler-0.492
male-0.491
isel-0.491
Token Please
Token ID4222
Feature activation-7.90e+00

Pos logit contributions
 HUN+2.697
 MARK+2.653
 COP+2.636
 sen+2.619
 Coca+2.617
Neg logit contributions
bians-2.018
align-1.988
isel-1.975
male-1.948
handler-1.940
Token re
Token ID302
Feature activation-9.29e-01

Pos logit contributions
 HUN+2.650
 sen+2.580
 MARK+2.564
 COP+2.530
 Coca+2.515
Neg logit contributions
bians-8.122
BuyableInstoreAndOnline-8.004
align-7.926
isel-7.907
imeters-7.902
Token-
Token ID12
Feature activation-4.62e+00

Pos logit contributions
 HUN+0.559
 MARK+0.547
 COP+0.538
 Mond+0.537
 sen+0.537
Neg logit contributions
bians-0.708
align-0.707
isel-0.697
male-0.694
handler-0.691
Tokenenter
Token ID9255
Feature activation-5.03e+00

Pos logit contributions
 MARK+3.777
 HUN+3.772
 COP+3.770
 sen+3.756
 Mond+3.748
Neg logit contributions
bians-2.458
BuyableInstoreAndOnline-2.372
imeters-2.358
align-2.333
 lesbians-2.320
Token.
Token ID13
Feature activation-5.51e+00

Pos logit contributions
 HUN+2.461
 MARK+2.409
 COP+2.374
 Coca+2.371
 Mond+2.367
Neg logit contributions
bians-4.376
BuyableInstoreAndOnline-4.366
align-4.329
isel-4.288
imeters-4.261
Token You
Token ID921
Feature activation-6.11e+00

Pos logit contributions
 HUN+3.807
 MARK+3.752
 COP+3.752
 Coca+3.726
 sen+3.676
Neg logit contributions
bians-3.665
align-3.637
isel-3.595
autions-3.576
male-3.569
Token.
Token ID13
Feature activation-1.13e+00

Pos logit contributions
 HUN+1.128
 MARK+1.104
 sen+1.086
 COP+1.084
 Mond+1.082
Neg logit contributions
bians-1.613
align-1.611
isel-1.587
male-1.585
handler-1.579
Token Invalid
Token ID17665
Feature activation-7.36e-03

Pos logit contributions
 HUN+0.671
 MARK+0.656
 sen+0.653
 Mond+0.645
 surv+0.644
Neg logit contributions
bians-0.858
align-0.857
male-0.843
isel-0.843
handler-0.839
Token email
Token ID3053
Feature activation+2.77e+00

Pos logit contributions
 HUN+0.004
 MARK+0.004
 Mond+0.003
 sen+0.003
 COP+0.003
Neg logit contributions
align-0.006
bians-0.006
isel-0.006
male-0.006
imeters-0.006
Token address
Token ID2209
Feature activation-1.40e-01

Pos logit contributions
align+1.680
isel+1.639
handler+1.635
imeters+1.620
bians+1.620
Neg logit contributions
 HUN-2.101
 MARK-2.042
 Mond-2.035
 sen-2.011
 surv-2.006
Token.
Token ID13
Feature activation-3.28e+00

Pos logit contributions
 HUN+0.083
 MARK+0.081
 surv+0.081
 sen+0.080
 Mond+0.080
Neg logit contributions
bians-0.106
align-0.106
handler-0.105
male-0.105
isel-0.105
Token Please
Token ID4222
Feature activation-7.85e+00

Pos logit contributions
 HUN+2.605
 MARK+2.565
 COP+2.542
 sen+2.528
 Mond+2.528
Neg logit contributions
bians-1.879
align-1.861
isel-1.837
male-1.817
handler-1.808
Token re
Token ID302
Feature activation-8.86e-01

Pos logit contributions
 HUN+2.697
 sen+2.603
 MARK+2.601
 COP+2.566
 Coca+2.544
Neg logit contributions
bians-7.996
BuyableInstoreAndOnline-7.898
align-7.828
isel-7.807
imeters-7.784
Token-
Token ID12
Feature activation-4.68e+00

Pos logit contributions
 HUN+0.533
 MARK+0.522
 COP+0.513
 Mond+0.513
 sen+0.513
Neg logit contributions
bians-0.675
align-0.674
isel-0.664
male-0.662
handler-0.658
Tokenenter
Token ID9255
Feature activation-4.63e+00

Pos logit contributions
 MARK+3.811
 COP+3.808
 HUN+3.805
 sen+3.787
 Mond+3.782
Neg logit contributions
bians-2.493
BuyableInstoreAndOnline-2.420
imeters-2.391
align-2.365
isel-2.357
Token.
Token ID13
Feature activation-5.36e+00

Pos logit contributions
 HUN+2.298
 MARK+2.251
 COP+2.215
 Coca+2.213
 Mond+2.209
Neg logit contributions
bians-3.990
BuyableInstoreAndOnline-3.961
align-3.952
isel-3.917
imeters-3.889
Token You
Token ID921
Feature activation-5.76e+00

Pos logit contributions
 HUN+3.724
 MARK+3.673
 COP+3.666
 Coca+3.642
 sen+3.588
Neg logit contributions
bians-3.543
align-3.521
isel-3.480
male-3.452
autions-3.446
Token.
Token ID13
Feature activation-2.55e+00

Pos logit contributions
 HUN+1.032
 MARK+1.015
 COP+0.999
 sen+0.998
 Mond+0.997
Neg logit contributions
bians-1.078
align-1.071
isel-1.057
male-1.051
imeters-1.047
Tokenjpg
Token ID9479
Feature activation+1.31e-01

Pos logit contributions
 HUN+1.281
 MARK+1.252
 COP+1.224
 Mond+1.222
 sen+1.220
Neg logit contributions
bians-2.163
 lesbians-2.139
align-2.136
imeters-2.132
isel-2.124
Token\n
Token ID198
Feature activation-3.62e+00

Pos logit contributions
bians+0.089
align+0.088
male+0.087
isel+0.086
female+0.086
Neg logit contributions
 HUN-0.088
 surv-0.087
 sen-0.086
 MARK-0.086
 Mond-0.086
Token\n
Token ID198
Feature activation-4.84e+00

Pos logit contributions
 HUN+2.246
 MARK+2.209
 COP+2.174
 Mond+2.166
 sen+2.164
Neg logit contributions
bians-2.692
align-2.673
isel-2.647
imeters-2.625
male-2.614
TokenEmail
Token ID15333
Feature activation+2.86e+00

Pos logit contributions
 MARK+2.016
 HUN+2.015
 sen+1.976
 COP+1.964
 Coca+1.943
Neg logit contributions
bians-4.553
imeters-4.425
autions-4.420
align-4.411
male-4.398
Token\n
Token ID198
Feature activation-7.68e+00

Pos logit contributions
align+1.831
bians+1.818
male+1.808
isel+1.787
volt+1.784
Neg logit contributions
 HUN-2.003
 surv-1.993
 sen-1.965
 MARK-1.951
 Mond-1.951
Token\n
Token ID198
Feature activation-7.91e+00

Pos logit contributions
 HUN+3.960
 MARK+3.934
 Mond+3.901
 SEO+3.856
 COP+3.855
Neg logit contributions
BuyableInstoreAndOnline-6.284
bians-6.267
autions-6.235
interstitial-6.222
imeters-6.183
TokenDis
Token ID7279
Feature activation-1.83e+00

Pos logit contributions
Du+5.017
 Mond+4.977
 MARK+4.871
 HUN+4.843
 SEO+4.830
Neg logit contributions
bians-5.459
imeters-5.408
 lesbians-5.363
autions-5.346
isel-5.275
Tokenruption
Token ID6417
Feature activation-4.54e+00

Pos logit contributions
 HUN+1.219
 MARK+1.194
 COP+1.175
 sen+1.174
 Mond+1.172
Neg logit contributions
bians-1.276
align-1.275
isel-1.252
male-1.246
handler-1.239
Token is
Token ID318
Feature activation-1.94e+00

Pos logit contributions
 HUN+2.813
 MARK+2.794
 sen+2.772
 COP+2.768
 Coca+2.758
Neg logit contributions
bians-3.347
BuyableInstoreAndOnline-3.327
align-3.262
autions-3.213
isel-3.210
Token perhaps
Token ID3737
Feature activation-2.17e+00

Pos logit contributions
 HUN+1.269
 MARK+1.238
 COP+1.215
 sen+1.213
 Mond+1.213
Neg logit contributions
bians-1.381
align-1.378
isel-1.365
male-1.361
handler-1.347
Token AC
Token ID7125
Feature activation-1.01e+00

Pos logit contributions
 HUN+0.643
 MARK+0.629
 sen+0.622
 Mond+0.621
 COP+0.617
Neg logit contributions
bians-0.996
align-0.993
isel-0.979
male-0.977
handler-0.972
TokenOU
Token ID2606
Feature activation-5.51e-01

Pos logit contributions
 HUN+0.695
 MARK+0.683
 Mond+0.676
 sen+0.675
 COP+0.673
Neg logit contributions
bians-0.677
align-0.676
isel-0.664
male-0.662
imeters-0.658
TokenST
Token ID2257
Feature activation+6.58e-01

Pos logit contributions
 HUN+0.350
 Mond+0.344
 MARK+0.344
 sen+0.343
 COP+0.342
Neg logit contributions
bians-0.394
align-0.392
male-0.386
isel-0.385
volt-0.384
TokenIC
Token ID2149
Feature activation-1.06e+00

Pos logit contributions
bians+0.390
align+0.390
male+0.382
isel+0.381
imeters+0.381
Neg logit contributions
 HUN-0.491
 sen-0.487
 Mond-0.486
 MARK-0.483
 Coca-0.480
Token RES
Token ID15731
Feature activation+1.50e+00

Pos logit contributions
 HUN+0.557
 Mond+0.544
 sen+0.543
 MARK+0.543
 Coca+0.538
Neg logit contributions
bians-0.879
align-0.876
isel-0.863
male-0.861
volt-0.858
TokenON
Token ID1340
Feature activation-7.53e+00

Pos logit contributions
align+1.044
bians+1.043
volt+1.021
male+1.015
isel+1.014
Neg logit contributions
 HUN-0.973
 Mond-0.966
 MARK-0.962
 Coca-0.958
 sen-0.954
TokenANCE
Token ID19240
Feature activation-4.82e-01

Pos logit contributions
 MARK+3.600
 HUN+3.567
 COP+3.334
 sen+3.320
 Coca+3.243
Neg logit contributions
isel-6.667
bians-6.653
male-6.572
olit-6.557
interstitial-6.513
Token\n
Token ID198
Feature activation+1.33e+00

Pos logit contributions
 HUN+0.306
 MARK+0.300
 sen+0.298
 Mond+0.298
 Coca+0.296
Neg logit contributions
bians-0.346
align-0.345
isel-0.339
male-0.338
handler-0.336
Token\n
Token ID198
Feature activation+2.54e+00

Pos logit contributions
align+0.858
bians+0.852
male+0.849
handler+0.839
female+0.835
Neg logit contributions
 HUN-0.937
 surv-0.931
 MARK-0.918
 sen-0.915
 Mond-0.911
TokenBefore
Token ID8421
Feature activation-9.65e-01

Pos logit contributions
align+1.614
bians+1.583
male+1.582
isel+1.567
volt+1.566
Neg logit contributions
 HUN-1.853
 MARK-1.809
 Mond-1.801
 sen-1.796
 Coca-1.795
Token getting
Token ID1972
Feature activation+5.04e+00

Pos logit contributions
 HUN+0.629
 MARK+0.616
 Mond+0.607
 sen+0.607
 COP+0.606
Neg logit contributions
align-0.683
bians-0.682
isel-0.671
male-0.668
imeters-0.664
Tokenxx
Token ID5324
Feature activation+1.43e+00

Pos logit contributions
bians+0.291
align+0.289
isel+0.285
male+0.282
imeters+0.281
Neg logit contributions
 HUN-0.429
 MARK-0.423
 COP-0.418
 sen-0.418
 Mond-0.418
Tokenbrand
Token ID17938
Feature activation+9.87e-02

Pos logit contributions
bians+0.968
align+0.962
isel+0.943
male+0.943
handler+0.936
Neg logit contributions
 HUN-0.985
 MARK-0.968
 COP-0.953
 Mond-0.952
 sen-0.952
Token?
Token ID30
Feature activation-2.23e+00

Pos logit contributions
bians+0.069
align+0.069
isel+0.068
male+0.068
imeters+0.067
Neg logit contributions
 HUN-0.065
 MARK-0.064
 COP-0.063
 sen-0.063
 Mond-0.063
Tokenf
Token ID69
Feature activation+1.69e-02

Pos logit contributions
 HUN+1.613
 MARK+1.589
 COP+1.574
 sen+1.573
 Mond+1.568
Neg logit contributions
bians-1.410
align-1.386
isel-1.384
male-1.363
imeters-1.363
Tokenref
Token ID5420
Feature activation-7.29e-01

Pos logit contributions
bians+0.010
align+0.010
isel+0.009
male+0.009
handler+0.009
Neg logit contributions
 HUN-0.013
 MARK-0.013
 COP-0.013
 sen-0.013
 Mond-0.013
Token=
Token ID28
Feature activation-7.50e+00

Pos logit contributions
 HUN+0.643
 MARK+0.637
 Mond+0.628
 COP+0.627
 sen+0.626
Neg logit contributions
bians-0.351
align-0.343
isel-0.340
male-0.335
imeters-0.333
Tokents
Token ID912
Feature activation+5.18e-01

Pos logit contributions
 HUN+5.875
 MARK+5.821
 Mond+5.769
 Coca+5.735
 COP+5.734
Neg logit contributions
BuyableInstoreAndOnline-3.906
bians-3.885
rogens-3.788
 lesbians-3.785
isel-3.768
Token<|endoftext|>
Token ID50256
Feature activation+2.62e-01

Pos logit contributions
bians+0.357
align+0.355
isel+0.348
male+0.348
handler+0.345
Neg logit contributions
 HUN-0.349
 MARK-0.343
 sen-0.338
 COP-0.338
 Mond-0.337
TokenA
Token ID32
Feature activation+1.79e+00

Pos logit contributions
bians+0.162
align+0.161
isel+0.158
male+0.158
handler+0.156
Neg logit contributions
 HUN-0.196
 MARK-0.193
 COP-0.191
 Mond-0.190
 sen-0.190
Token month
Token ID1227
Feature activation+5.57e-01

Pos logit contributions
bians+1.244
align+1.233
isel+1.222
male+1.222
handler+1.214
Neg logit contributions
 HUN-1.174
 MARK-1.157
 Mond-1.141
 sen-1.138
 COP-1.136
Token before
Token ID878
Feature activation+6.80e-01

Pos logit contributions
bians+0.414
align+0.413
isel+0.407
male+0.407
handler+0.403
Neg logit contributions
 HUN-0.343
 MARK-0.336
 Mond-0.332
 COP-0.332
 sen-0.330
Token signed
Token ID4488
Feature activation-1.76e+00

Pos logit contributions
 HUN+2.128
 MARK+2.099
 COP+2.071
 sen+2.070
 Mond+2.069
Neg logit contributions
bians-1.832
align-1.817
isel-1.788
male-1.777
imeters-1.769
Token the
Token ID262
Feature activation-2.57e+00

Pos logit contributions
 HUN+1.178
 MARK+1.159
 COP+1.139
 sen+1.139
 Mond+1.137
Neg logit contributions
bians-1.217
align-1.213
isel-1.195
male-1.190
imeters-1.182
Token algorithm
Token ID11862
Feature activation-2.28e+00

Pos logit contributions
 HUN+1.770
 MARK+1.740
 COP+1.721
 sen+1.714
 Mond+1.714
Neg logit contributions
bians-1.731
align-1.719
isel-1.695
male-1.686
handler-1.675
Token and
Token ID290
Feature activation-2.46e+00

Pos logit contributions
 HUN+1.476
 MARK+1.450
 COP+1.427
 sen+1.426
 Mond+1.425
Neg logit contributions
bians-1.626
align-1.619
isel-1.595
male-1.588
handler-1.579
Token North
Token ID2258
Feature activation-2.26e+00

Pos logit contributions
 HUN+1.648
 MARK+1.620
 COP+1.596
 sen+1.594
 Mond+1.593
Neg logit contributions
bians-1.701
align-1.693
isel-1.667
male-1.660
imeters-1.650
Tokenrop
Token ID1773
Feature activation-7.30e+00

Pos logit contributions
 HUN+2.068
 MARK+2.042
 COP+2.019
 sen+2.018
 Mond+2.017
Neg logit contributions
bians-1.007
align-1.001
isel-0.978
male-0.970
handler-0.962
Token Gr
Token ID1902
Feature activation-2.19e+00

Pos logit contributions
 HUN+4.430
 MARK+4.363
 Mond+4.348
 COP+4.345
 sen+4.307
Neg logit contributions
bians-5.382
align-5.357
interstitial-5.263
imeters-5.212
male-5.192
Tokenum
Token ID388
Feature activation-2.22e+00

Pos logit contributions
 HUN+2.058
 MARK+2.030
 Mond+2.016
 sen+2.015
 COP+2.009
Neg logit contributions
bians-0.915
align-0.902
isel-0.871
male-0.868
imeters-0.863
Tokenman
Token ID805
Feature activation-1.30e+00

Pos logit contributions
 HUN+1.748
 MARK+1.710
 sen+1.691
 COP+1.689
 Coca+1.684
Neg logit contributions
bians-1.281
align-1.275
male-1.268
isel-1.260
handler-1.258
Token,
Token ID11
Feature activation-1.18e+00

Pos logit contributions
 HUN+0.857
 MARK+0.842
 sen+0.827
 COP+0.827
 Mond+0.825
Neg logit contributions
bians-0.913
align-0.909
isel-0.897
male-0.894
handler-0.887
Token the
Token ID262
Feature activation+5.61e-02

Pos logit contributions
 HUN+0.759
 MARK+0.747
 COP+0.734
 sen+0.733
 Mond+0.730
Neg logit contributions
bians-0.840
align-0.838
isel-0.828
male-0.824
handler-0.821
Token {
Token ID1391
Feature activation+1.34e+00

Pos logit contributions
 HUN+3.104
 MARK+3.069
 COP+3.023
 sen+3.003
 Mond+3.000
Neg logit contributions
bians-3.179
align-3.141
isel-3.108
male-3.086
imeters-3.072
Token //
Token ID3373
Feature activation+8.86e-01

Pos logit contributions
bians+0.551
align+0.543
isel+0.530
male+0.527
imeters+0.522
Neg logit contributions
 HUN-1.274
 MARK-1.261
 COP-1.249
 sen-1.246
 Mond-1.244
Token Time
Token ID3862
Feature activation-2.62e+00

Pos logit contributions
bians+0.889
align+0.887
isel+0.877
male+0.875
imeters+0.873
Neg logit contributions
 HUN-0.316
 MARK-0.306
 COP-0.297
 sen-0.296
 Mond-0.296
Token component
Token ID7515
Feature activation+5.92e-02

Pos logit contributions
 HUN+2.293
 MARK+2.236
 Mond+2.215
 sen+2.201
 Coca+2.194
Neg logit contributions
align-1.297
bians-1.284
isel-1.280
imeters-1.268
handler-1.267
Token const
Token ID1500
Feature activation-1.46e+00

Pos logit contributions
align+0.024
bians+0.023
volt+0.023
isel+0.023
male+0.023
Neg logit contributions
 HUN-0.057
 MARK-0.056
 Mond-0.056
 sen-0.055
 Coca-0.055
Token float
Token ID12178
Feature activation-7.09e+00

Pos logit contributions
 HUN+1.126
 MARK+1.099
 Mond+1.098
 sen+1.095
 COP+1.087
Neg logit contributions
align-0.856
bians-0.849
isel-0.844
volt-0.835
male-0.833
Token TA
Token ID21664
Feature activation-4.87e+00

Pos logit contributions
 HUN+3.660
 MARK+3.586
 COP+3.464
 sen+3.413
 Mond+3.366
Neg logit contributions
bians-6.121
align-6.070
imeters-5.991
male-5.982
isel-5.956
TokenU
Token ID52
Feature activation-5.73e+00

Pos logit contributions
 HUN+3.201
 MARK+3.135
 COP+3.080
 sen+3.068
 Mond+3.068
Neg logit contributions
bians-3.463
align-3.436
isel-3.385
male-3.362
imeters-3.358
Token =
Token ID796
Feature activation-5.92e+00

Pos logit contributions
 HUN+4.137
 MARK+4.067
 COP+4.010
 Coca+3.965
 Mond+3.963
Neg logit contributions
bians-3.705
align-3.681
isel-3.612
male-3.589
handler-3.578
Token 6
Token ID718
Feature activation-6.83e+00

Pos logit contributions
 HUN+3.627
 MARK+3.591
 COP+3.573
 Coca+3.533
 sen+3.480
Neg logit contributions
bians-4.426
align-4.363
male-4.343
imeters-4.287
female-4.260
Token.
Token ID13
Feature activation+7.92e-02

Pos logit contributions
 HUN+3.247
 MARK+3.192
 COP+3.108
 Coca+3.057
 Mond+3.053
Neg logit contributions
bians-6.033
align-5.982
BuyableInstoreAndOnline-5.978
isel-5.913
male-5.891
Token ID
Token ID4522
Feature activation+1.90e+00

Pos logit contributions
align+1.497
bians+1.469
male+1.458
isel+1.457
volt+1.456
Neg logit contributions
 HUN-1.069
 sen-1.052
 surv-1.047
 Mond-1.036
 MARK-1.031
Token:
Token ID25
Feature activation-2.15e+00

Pos logit contributions
align+0.875
bians+0.868
male+0.850
isel+0.849
handler+0.837
Neg logit contributions
 HUN-1.698
 MARK-1.673
 sen-1.670
 Mond-1.669
 COP-1.661
Token 5
Token ID642
Feature activation-2.48e+00

Pos logit contributions
 HUN+1.450
 MARK+1.426
 COP+1.405
 sen+1.404
 Mond+1.403
Neg logit contributions
bians-1.486
align-1.479
isel-1.456
male-1.450
imeters-1.441
Token23
Token ID1954
Feature activation-8.56e-01

Pos logit contributions
 HUN+1.676
 MARK+1.648
 COP+1.624
 sen+1.621
 Mond+1.621
Neg logit contributions
bians-1.711
align-1.703
isel-1.677
male-1.669
imeters-1.660
Token22
Token ID1828
Feature activation-3.44e+00

Pos logit contributions
 HUN+0.566
 MARK+0.555
 sen+0.551
 Mond+0.548
 COP+0.548
Neg logit contributions
bians-0.598
align-0.594
male-0.585
isel-0.584
handler-0.579
Token62
Token ID5237
Feature activation-7.05e+00

Pos logit contributions
 HUN+2.284
 MARK+2.245
 sen+2.213
 COP+2.212
 Mond+2.210
Neg logit contributions
bians-2.405
align-2.394
isel-2.357
male-2.348
handler-2.333
Token13
Token ID1485
Feature activation-4.51e+00

Pos logit contributions
 HUN+4.561
 MARK+4.535
 COP+4.422
 Mond+4.349
 Coca+4.343
Neg logit contributions
bians-5.084
align-5.053
isel-4.987
imeters-4.975
male-4.952
Token (
Token ID357
Feature activation-3.88e+00

Pos logit contributions
 HUN+2.907
 MARK+2.860
 COP+2.814
 Mond+2.792
 sen+2.785
Neg logit contributions
bians-3.248
align-3.236
isel-3.188
male-3.173
imeters-3.165
TokenPhoto
Token ID6191
Feature activation+1.71e+00

Pos logit contributions
 HUN+1.740
 MARK+1.703
 COP+1.672
 Coca+1.631
 sen+1.618
Neg logit contributions
bians-3.618
isel-3.501
imeters-3.479
autions-3.478
raid-3.454
Token11
Token ID1157
Feature activation+4.18e-01

Pos logit contributions
handler+1.071
volt+1.056
male+1.048
align+1.048
alone+1.043
Neg logit contributions
 Mond-1.202
 COP-1.198
 HUN-1.194
 surv-1.188
 MARK-1.176
Token:
Token ID25
Feature activation+1.80e-01

Pos logit contributions
align+0.294
bians+0.292
male+0.288
handler+0.287
isel+0.286
Neg logit contributions
 HUN-0.272
 MARK-0.268
 Mond-0.267
 sen-0.264
 COP-0.264
Tokens
Token ID82
Feature activation-2.38e+00

Pos logit contributions
 HUN+3.066
 MARK+3.034
 sen+3.006
 Coca+2.985
 Mond+2.985
Neg logit contributions
bians-2.442
isel-2.392
align-2.389
imeters-2.347
BuyableInstoreAndOnline-2.347
Tokenult
Token ID586
Feature activation-3.12e+00

Pos logit contributions
 HUN+1.848
 MARK+1.820
 COP+1.796
 sen+1.794
 Mond+1.794
Neg logit contributions
bians-1.405
align-1.399
isel-1.370
male-1.363
imeters-1.354
Token.
Token ID13
Feature activation-1.29e+00

Pos logit contributions
 HUN+1.954
 MARK+1.918
 sen+1.893
 COP+1.892
 Mond+1.886
Neg logit contributions
bians-2.301
align-2.281
isel-2.250
male-2.240
handler-2.227
Token When
Token ID1649
Feature activation+1.43e+00

Pos logit contributions
 HUN+0.928
 MARK+0.914
 sen+0.901
 Mond+0.898
 COP+0.898
Neg logit contributions
bians-0.812
align-0.809
male-0.793
isel-0.791
handler-0.784
Token Re
Token ID797
Feature activation+7.05e-01

Pos logit contributions
align+1.045
bians+1.040
isel+1.029
male+1.025
handler+1.012
Neg logit contributions
 HUN-0.881
 MARK-0.867
 COP-0.845
 Mond-0.840
 SEO-0.837
Token­
Token ID3907
Feature activation-7.03e+00

Pos logit contributions
bians+0.668
align+0.666
isel+0.663
handler+0.655
male+0.654
Neg logit contributions
 HUN-0.293
 MARK-0.285
 COP-0.280
 sen-0.274
 Mond-0.274
Tokenpub
Token ID12984
Feature activation-5.50e+00

Pos logit contributions
 MARK+5.110
 Mond+5.095
 sen+5.048
 HUN+5.043
 Coca+4.960
Neg logit contributions
BuyableInstoreAndOnline-4.227
bians-4.223
 lesbians-4.119
isel-4.064
imeters-4.052
Token­
Token ID3907
Feature activation-4.40e+00

Pos logit contributions
 HUN+1.831
 sen+1.768
 Mond+1.746
 MARK+1.729
 Coca+1.721
Neg logit contributions
bians-5.579
align-5.558
BuyableInstoreAndOnline-5.525
imeters-5.508
interstitial-5.485
Tokenlic
Token ID677
Feature activation-1.10e+00

Pos logit contributions
 HUN+3.570
 sen+3.545
 Mond+3.536
 MARK+3.534
 COP+3.505
Neg logit contributions
bians-2.298
align-2.262
BuyableInstoreAndOnline-2.238
 lesbians-2.220
imeters-2.213
Token­
Token ID3907
Feature activation-3.39e+00

Pos logit contributions
 HUN+0.447
 MARK+0.436
 sen+0.428
 COP+0.427
 Mond+0.425
Neg logit contributions
bians-1.054
align-1.052
isel-1.037
male-1.035
imeters-1.034
Tokenans
Token ID504
Feature activation-2.11e+00

Pos logit contributions
 HUN+3.004
 MARK+2.984
 sen+2.977
 Mond+2.950
 COP+2.939
Neg logit contributions
bians-1.536
align-1.504
BuyableInstoreAndOnline-1.502
imeters-1.487
 lesbians-1.464
Tokening
Token ID278
Feature activation-8.32e-01

Pos logit contributions
 HUN+2.861
 MARK+2.817
 sen+2.787
 Coca+2.770
 COP+2.768
Neg logit contributions
bians-2.404
align-2.389
isel-2.352
imeters-2.323
handler-2.318
Token Jimmy
Token ID12963
Feature activation+3.30e-01

Pos logit contributions
 HUN+0.562
 MARK+0.553
 COP+0.545
 Mond+0.544
 sen+0.543
Neg logit contributions
bians-0.571
align-0.568
isel-0.560
male-0.557
handler-0.553
Token Carter
Token ID10831
Feature activation+2.41e+00

Pos logit contributions
bians+0.227
align+0.225
male+0.222
isel+0.222
handler+0.220
Neg logit contributions
 HUN-0.222
 MARK-0.219
 COP-0.215
 sen-0.215
 Mond-0.214
Token.
Token ID13
Feature activation-1.84e+00

Pos logit contributions
align+1.692
bians+1.683
male+1.666
isel+1.666
handler+1.648
Neg logit contributions
 HUN-1.565
 MARK-1.551
 Mond-1.519
 COP-1.507
 Coca-1.505
Token Al
Token ID978
Feature activation-1.45e+00

Pos logit contributions
 HUN+1.306
 MARK+1.285
 sen+1.268
 COP+1.265
 Mond+1.265
Neg logit contributions
bians-1.204
align-1.198
isel-1.177
male-1.174
handler-1.164
Token­
Token ID3907
Feature activation-6.93e+00

Pos logit contributions
 HUN+0.751
 MARK+0.735
 COP+0.721
 sen+0.720
 Mond+0.719
Neg logit contributions
bians-1.227
align-1.223
isel-1.208
male-1.203
imeters-1.197
Tokenthough
Token ID2016
Feature activation-4.40e+00

Pos logit contributions
 HUN+5.200
 MARK+5.134
 sen+5.083
 COP+5.056
 Coca+5.042
Neg logit contributions
isel-3.948
align-3.942
BuyableInstoreAndOnline-3.938
 lesbians-3.929
bians-3.912
Token he
Token ID339
Feature activation-6.19e-01

Pos logit contributions
 HUN+2.725
 sen+2.674
 MARK+2.670
 COP+2.648
 Mond+2.644
Neg logit contributions
bians-3.259
align-3.233
isel-3.188
imeters-3.180
male-3.171
Token left
Token ID1364
Feature activation-3.77e+00

Pos logit contributions
 HUN+0.408
 MARK+0.402
 Mond+0.395
 COP+0.395
 Coca+0.393
Neg logit contributions
bians-0.434
align-0.432
isel-0.427
male-0.426
imeters-0.421
Token the
Token ID262
Feature activation-2.24e+00

Pos logit contributions
 HUN+2.371
 MARK+2.328
 sen+2.318
 COP+2.306
 Mond+2.305
Neg logit contributions
bians-2.781
align-2.752
male-2.703
imeters-2.700
isel-2.698
Token White
Token ID2635
Feature activation-1.13e+00

Pos logit contributions
 HUN+1.498
 MARK+1.472
 COP+1.451
 sen+1.450
 Mond+1.448
Neg logit contributions
bians-1.551
align-1.544
isel-1.521
male-1.514
imeters-1.505
Token (
Token ID357
Feature activation-3.88e+00

Pos logit contributions
 HUN+2.907
 MARK+2.860
 COP+2.814
 Mond+2.792
 sen+2.785
Neg logit contributions
bians-3.248
align-3.236
isel-3.188
male-3.173
imeters-3.165
TokenPhoto
Token ID6191
Feature activation+1.71e+00

Pos logit contributions
 HUN+1.740
 MARK+1.703
 COP+1.672
 Coca+1.631
 sen+1.618
Neg logit contributions
bians-3.618
isel-3.501
imeters-3.479
autions-3.478
raid-3.454
Token11
Token ID1157
Feature activation+4.18e-01

Pos logit contributions
handler+1.071
volt+1.056
male+1.048
align+1.048
alone+1.043
Neg logit contributions
 Mond-1.202
 COP-1.198
 HUN-1.194
 surv-1.188
 MARK-1.176
Token:
Token ID25
Feature activation+1.80e-01

Pos logit contributions
align+0.294
bians+0.292
male+0.288
handler+0.287
isel+0.286
Neg logit contributions
 HUN-0.272
 MARK-0.268
 Mond-0.267
 sen-0.264
 COP-0.264
Token SA
Token ID14719
Feature activation-8.28e-01

Pos logit contributions
align+0.126
bians+0.125
isel+0.123
male+0.123
handler+0.122
Neg logit contributions
 HUN-0.119
 MARK-0.116
 sen-0.115
 Mond-0.115
 COP-0.115
TokenUL
Token ID6239
Feature activation-6.93e+00

Pos logit contributions
 HUN+0.638
 MARK+0.628
 sen+0.623
 Mond+0.623
 COP+0.621
Neg logit contributions
bians-0.485
align-0.482
isel-0.478
male-0.473
handler-0.469
Token LO
Token ID17579
Feature activation-1.94e-01

Pos logit contributions
 HUN+4.467
 MARK+4.385
 COP+4.275
 Mond+4.235
 sen+4.211
Neg logit contributions
bians-5.062
align-5.059
isel-4.978
imeters-4.959
handler-4.928
TokenEB
Token ID30195
Feature activation-1.39e+00

Pos logit contributions
 sen+0.141
 HUN+0.141
 surv+0.141
 Mond+0.140
 MARK+0.139
Neg logit contributions
bians-0.121
isel-0.117
align-0.117
male-0.116
handler-0.115
Token AFP
Token ID8916
Feature activation+1.11e+00

Pos logit contributions
 HUN+0.888
 MARK+0.873
 COP+0.856
 sen+0.854
 Mond+0.849
Neg logit contributions
bians-1.008
align-1.005
isel-0.987
imeters-0.983
male-0.983
Token/
Token ID14
Feature activation+6.62e-02

Pos logit contributions
align+0.804
bians+0.801
isel+0.794
male+0.790
handler+0.790
Neg logit contributions
 HUN-0.692
 MARK-0.677
 sen-0.674
 Mond-0.673
 Coca-0.671
TokenGetty
Token ID6633
Feature activation+5.25e+00

Pos logit contributions
align+0.047
volt+0.046
handler+0.046
male+0.046
bians+0.046
Neg logit contributions
 surv-0.042
 HUN-0.042
 sen-0.042
 Mond-0.041
 Coca-0.041
Tokendi
Token ID10989
Feature activation-2.24e+00

Pos logit contributions
 HUN+3.281
 MARK+3.220
 sen+3.219
 Mond+3.214
 Coca+3.204
Neg logit contributions
bians-2.853
align-2.831
BuyableInstoreAndOnline-2.818
isel-2.792
 lesbians-2.791
Token­
Token ID3907
Feature activation-5.81e+00

Pos logit contributions
 HUN+0.757
 sen+0.730
 MARK+0.720
 COP+0.715
 Mond+0.708
Neg logit contributions
bians-2.303
align-2.275
male-2.258
isel-2.252
female-2.244
Tokenum
Token ID388
Feature activation-4.61e+00

Pos logit contributions
 HUN+3.789
 MARK+3.743
 sen+3.738
 COP+3.725
 Coca+3.705
Neg logit contributions
BuyableInstoreAndOnline-3.919
bians-3.855
 lesbians-3.833
isel-3.808
align-3.784
Token and
Token ID290
Feature activation-3.76e-01

Pos logit contributions
 HUN+2.722
 sen+2.685
 COP+2.657
 MARK+2.655
 Mond+2.644
Neg logit contributions
bians-3.522
align-3.479
isel-3.434
BuyableInstoreAndOnline-3.433
imeters-3.427
Token al
Token ID435
Feature activation-3.56e+00

Pos logit contributions
 HUN+0.237
 MARK+0.233
 Mond+0.228
 COP+0.228
 sen+0.226
Neg logit contributions
bians-0.275
align-0.273
isel-0.270
male-0.269
handler-0.267
Token­
Token ID3907
Feature activation-6.91e+00

Pos logit contributions
 HUN+1.203
 sen+1.180
 COP+1.167
 MARK+1.159
 Mond+1.150
Neg logit contributions
bians-3.591
align-3.590
isel-3.544
BuyableInstoreAndOnline-3.530
 lesbians-3.516
Tokenl
Token ID75
Feature activation-5.11e+00

Pos logit contributions
 HUN+5.176
 MARK+5.118
 Mond+5.070
 COP+5.061
 sen+5.056
Neg logit contributions
BuyableInstoreAndOnline-3.875
isel-3.829
 lesbians-3.817
align-3.789
bians-3.754
Tokenowed
Token ID6972
Feature activation-3.74e-01

Pos logit contributions
 HUN+3.988
 sen+3.929
 MARK+3.918
 COP+3.897
 Mond+3.894
Neg logit contributions
bians-2.895
align-2.863
isel-2.802
volt-2.781
 lesbians-2.780
Token only
Token ID691
Feature activation-6.65e-02

Pos logit contributions
 HUN+0.227
 MARK+0.224
 COP+0.219
 Mond+0.219
 sen+0.218
Neg logit contributions
bians-0.281
align-0.280
isel-0.276
male-0.275
handler-0.273
Token a
Token ID257
Feature activation-1.16e+00

Pos logit contributions
 HUN+0.041
 MARK+0.040
 COP+0.039
 Mond+0.039
 Coca+0.039
Neg logit contributions
bians-0.050
align-0.049
isel-0.049
male-0.049
handler-0.048
Token brief
Token ID4506
Feature activation-4.74e-01

Pos logit contributions
 HUN+0.706
 MARK+0.694
 Mond+0.680
 COP+0.680
 sen+0.676
Neg logit contributions
bians-0.872
align-0.867
isel-0.856
male-0.853
handler-0.848
Token\n
Token ID198
Feature activation-1.25e+00

Pos logit contributions
 HUN+1.324
 MARK+1.301
 COP+1.280
 sen+1.278
 Mond+1.278
Neg logit contributions
bians-1.503
align-1.496
isel-1.475
male-1.468
imeters-1.460
TokenIn
Token ID818
Feature activation+5.36e-01

Pos logit contributions
 HUN+0.952
 MARK+0.938
 COP+0.926
 sen+0.924
 Mond+0.924
Neg logit contributions
bians-0.755
align-0.752
isel-0.738
male-0.735
handler-0.729
Token recent
Token ID2274
Feature activation-3.49e+00

Pos logit contributions
bians+0.366
align+0.363
isel+0.358
male+0.355
imeters+0.355
Neg logit contributions
 HUN-0.363
 MARK-0.359
 sen-0.351
 Mond-0.349
 COP-0.349
Token years
Token ID812
Feature activation-2.52e+00

Pos logit contributions
 HUN+2.504
 MARK+2.459
 COP+2.433
 sen+2.433
 Mond+2.423
Neg logit contributions
bians-2.265
align-2.254
isel-2.213
male-2.208
imeters-2.193
Token,
Token ID11
Feature activation-2.20e+00

Pos logit contributions
 HUN+1.600
 MARK+1.570
 Mond+1.561
 COP+1.551
 sen+1.545
Neg logit contributions
align-1.829
bians-1.824
male-1.790
imeters-1.785
isel-1.782
Token however
Token ID2158
Feature activation-6.87e+00

Pos logit contributions
 HUN+1.413
 MARK+1.388
 COP+1.368
 Mond+1.367
 sen+1.366
Neg logit contributions
bians-1.582
align-1.576
isel-1.552
male-1.546
imeters-1.538
Token,
Token ID11
Feature activation-1.87e+00

Pos logit contributions
 HUN+3.974
 MARK+3.916
 Coca+3.908
 COP+3.896
 Mond+3.885
Neg logit contributions
BuyableInstoreAndOnline-5.327
bians-5.269
align-5.202
autions-5.196
imeters-5.147
Token conservative
Token ID5940
Feature activation-1.39e+00

Pos logit contributions
 HUN+1.238
 MARK+1.217
 COP+1.199
 sen+1.197
 Mond+1.197
Neg logit contributions
bians-1.306
align-1.300
isel-1.281
male-1.275
handler-1.267
Token politicians
Token ID7602
Feature activation+9.77e-01

Pos logit contributions
 HUN+1.028
 MARK+1.010
 COP+0.994
 Mond+0.993
 sen+0.992
Neg logit contributions
bians-0.866
align-0.862
isel-0.850
male-0.843
handler-0.836
Token and
Token ID290
Feature activation+8.90e-01

Pos logit contributions
bians+0.633
align+0.630
isel+0.625
male+0.617
imeters+0.614
Neg logit contributions
 HUN-0.698
 MARK-0.684
 Mond-0.673
 COP-0.672
 sen-0.669
Token pundits
Token ID30855
Feature activation-2.27e+00

Pos logit contributions
bians+0.549
align+0.540
isel+0.539
handler+0.532
 lesbians+0.531
Neg logit contributions
 HUN-0.661
 MARK-0.651
 COP-0.639
 sen-0.635
 Mond-0.633
Token­
Token ID3907
Feature activation-3.76e+00

Pos logit contributions
 HUN+0.215
 MARK+0.210
 COP+0.204
 Mond+0.203
 Coca+0.202
Neg logit contributions
bians-0.378
align-0.377
isel-0.373
male-0.372
handler-0.370
Tokenized
Token ID1143
Feature activation-1.24e+00

Pos logit contributions
 HUN+3.110
 MARK+3.084
 sen+3.064
 Mond+3.057
 COP+3.050
Neg logit contributions
bians-1.919
align-1.909
isel-1.865
 lesbians-1.863
BuyableInstoreAndOnline-1.859
Token re
Token ID302
Feature activation+1.38e+00

Pos logit contributions
 HUN+0.815
 MARK+0.803
 COP+0.788
 Mond+0.787
 sen+0.786
Neg logit contributions
bians-0.869
align-0.865
isel-0.852
male-0.849
imeters-0.843
Token­
Token ID3907
Feature activation-6.81e+00

Pos logit contributions
bians+1.371
align+1.368
isel+1.353
male+1.349
handler+1.343
Neg logit contributions
 HUN-0.515
 MARK-0.500
 COP-0.486
 Mond-0.484
 sen-0.483
Tokenla
Token ID5031
Feature activation-4.05e+00

Pos logit contributions
 Mond+4.861
 MARK+4.858
 sen+4.844
 HUN+4.825
 COP+4.763
Neg logit contributions
BuyableInstoreAndOnline-4.087
bians-4.085
 lesbians-4.039
isel-3.979
handler-3.951
Token­
Token ID3907
Feature activation-6.82e+00

Pos logit contributions
 HUN+1.450
 MARK+1.397
 sen+1.392
 Mond+1.381
 COP+1.380
Neg logit contributions
bians-4.049
align-4.032
isel-3.988
male-3.983
handler-3.972
Tokentions
Token ID45240
Feature activation-1.79e+00

Pos logit contributions
 HUN+4.351
 sen+4.297
 MARK+4.277
 Coca+4.270
 Mond+4.262
Neg logit contributions
BuyableInstoreAndOnline-4.633
 lesbians-4.586
align-4.573
isel-4.536
bians-4.528
Token with
Token ID351
Feature activation-2.76e+00

Pos logit contributions
 HUN+1.141
 MARK+1.120
 COP+1.105
 sen+1.105
 Mond+1.102
Neg logit contributions
bians-1.301
align-1.295
isel-1.277
male-1.271
imeters-1.264
Token Cuba
Token ID14159
Feature activation+2.46e+00

Pos logit contributions
 HUN+1.669
 sen+1.626
 MARK+1.626
 COP+1.618
 Mond+1.613
Neg logit contributions
bians-2.105
align-2.097
isel-2.066
male-2.059
handler-2.054
Token.
Token ID13
Feature activation-1.21e+00

Pos logit contributions
bians+1.783
align+1.781
isel+1.755
male+1.751
imeters+1.743
Neg logit contributions
 HUN-1.521
 MARK-1.514
 Mond-1.476
 COP-1.462
 Coca-1.460
Token Mitt
Token ID16627
Feature activation-1.11e+00

Pos logit contributions
 HUN+0.818
 MARK+0.808
 sen+0.794
 Mond+0.792
 COP+0.792
Neg logit contributions
bians-0.819
align-0.815
male-0.801
isel-0.799
handler-0.792
Token
Token ID198
Feature activation-1.56e+00

Pos logit contributions
 HUN+0.786
 MARK+0.772
 sen+0.765
 COP+0.762
 Mond+0.762
Neg logit contributions
bians-0.828
align-0.826
male-0.816
isel-0.812
handler-0.806
TokenM
Token ID44
Feature activation-1.11e-02

Pos logit contributions
 HUN+1.119
 MARK+1.102
 Mond+1.086
 COP+1.086
 sen+1.083
Neg logit contributions
bians-1.010
align-1.001
isel-0.987
imeters-0.981
male-0.980
TokenEM
Token ID3620
Feature activation-6.75e-01

Pos logit contributions
 HUN+0.008
 MARK+0.008
 COP+0.008
 sen+0.008
 Mond+0.008
Neg logit contributions
bians-0.007
align-0.007
isel-0.007
male-0.007
imeters-0.007
TokenB
Token ID33
Feature activation-1.88e+00

Pos logit contributions
 HUN+0.440
 MARK+0.433
 COP+0.426
 sen+0.426
 Mond+0.425
Neg logit contributions
bians-0.479
align-0.477
isel-0.470
male-0.468
imeters-0.465
TokenERS
Token ID4877
Feature activation-5.80e-01

Pos logit contributions
 HUN+1.224
 MARK+1.209
 sen+1.196
 Mond+1.194
 COP+1.192
Neg logit contributions
bians-1.314
align-1.314
male-1.290
isel-1.288
handler-1.286
TokenH
Token ID39
Feature activation-6.81e+00

Pos logit contributions
 HUN+0.360
 MARK+0.354
 sen+0.351
 Mond+0.351
 COP+0.349
Neg logit contributions
bians-0.426
align-0.425
isel-0.419
male-0.417
imeters-0.414
TokenIP
Token ID4061
Feature activation-3.08e+00

Pos logit contributions
 HUN+3.346
 MARK+3.327
 COP+3.255
 Coca+3.113
 sen+3.078
Neg logit contributions
bians-5.941
align-5.935
isel-5.916
imeters-5.870
handler-5.859
Token IS
Token ID3180
Feature activation-2.02e+00

Pos logit contributions
 HUN+1.867
 MARK+1.836
 COP+1.800
 Mond+1.773
 Coca+1.770
Neg logit contributions
bians-2.363
align-2.348
isel-2.318
male-2.309
imeters-2.307
Token FREE
Token ID17189
Feature activation-1.55e+00

Pos logit contributions
 HUN+1.200
 MARK+1.178
 COP+1.157
 sen+1.153
 Mond+1.152
Neg logit contributions
bians-1.560
align-1.554
isel-1.533
male-1.526
imeters-1.520
Token |
Token ID930
Feature activation+4.31e-01

Pos logit contributions
 HUN+0.910
 sen+0.888
 MARK+0.888
 Mond+0.888
 Coca+0.879
Neg logit contributions
bians-1.195
align-1.187
male-1.169
isel-1.168
handler-1.167
Token NO
Token ID8005
Feature activation+1.20e+00

Pos logit contributions
bians+0.339
align+0.338
male+0.333
isel+0.331
handler+0.329
Neg logit contributions
 HUN-0.238
 sen-0.237
 Mond-0.234
 MARK-0.233
 Coca-0.231
Token were
Token ID547
Feature activation+1.29e+00

Pos logit contributions
bians+0.898
align+0.890
isel+0.881
male+0.875
handler+0.869
Neg logit contributions
 HUN-0.843
 MARK-0.829
 COP-0.814
 sen-0.812
 Mond-0.812
Token sent
Token ID1908
Feature activation-1.10e+00

Pos logit contributions
bians+0.831
align+0.826
isel+0.814
male+0.812
handler+0.804
Neg logit contributions
 HUN-0.929
 MARK-0.917
 COP-0.900
 Mond-0.899
 sen-0.898
Token to
Token ID284
Feature activation+1.73e+00

Pos logit contributions
 HUN+0.658
 MARK+0.644
 sen+0.635
 COP+0.634
 Mond+0.632
Neg logit contributions
bians-0.851
align-0.847
isel-0.833
male-0.829
handler-0.825
Token the
Token ID262
Feature activation+1.80e+00

Pos logit contributions
bians+1.141
align+1.136
isel+1.121
male+1.111
imeters+1.107
Neg logit contributions
 HUN-1.210
 MARK-1.191
 sen-1.170
 COP-1.170
 Coca-1.163
Token An
Token ID1052
Feature activation+1.35e+00

Pos logit contributions
bians+1.247
align+1.232
isel+1.220
male+1.208
imeters+1.206
Neg logit contributions
 HUN-1.187
 MARK-1.169
 sen-1.146
 COP-1.143
 Coca-1.138
Tokenz
Token ID89
Feature activation+6.00e+00

Pos logit contributions
align+0.916
bians+0.916
isel+0.913
male+0.893
volt+0.891
Neg logit contributions
 HUN-0.931
 MARK-0.914
 COP-0.901
 sen-0.896
 Coca-0.887
Tokenio
Token ID952
Feature activation+1.63e+00

Pos logit contributions
alone+4.255
bians+4.230
isel+4.211
aline+4.190
align+4.173
Neg logit contributions
 HUN-3.882
 surv-3.761
 MARK-3.738
 sen-3.704
 COP-3.677
Token beach
Token ID10481
Feature activation+3.01e-01

Pos logit contributions
bians+1.099
align+1.081
isel+1.072
imeters+1.058
male+1.054
Neg logit contributions
 HUN-1.123
 MARK-1.108
 COP-1.085
 sen-1.083
 Coca-1.077
Tokenhead
Token ID2256
Feature activation-3.92e-01

Pos logit contributions
bians+0.175
align+0.174
isel+0.171
male+0.170
imeters+0.170
Neg logit contributions
 HUN-0.235
 MARK-0.232
 COP-0.229
 sen-0.228
 Mond-0.227
Token where
Token ID810
Feature activation+2.44e+00

Pos logit contributions
 HUN+0.247
 MARK+0.243
 COP+0.239
 sen+0.238
 Mond+0.238
Neg logit contributions
bians-0.287
align-0.286
isel-0.281
male-0.280
imeters-0.279
Token they
Token ID484
Feature activation+2.86e+00

Pos logit contributions
bians+1.486
align+1.470
isel+1.458
male+1.452
imeters+1.437
Neg logit contributions
 HUN-1.820
 MARK-1.799
 COP-1.766
 sen-1.763
 Mond-1.754
Token out
Token ID503
Feature activation+1.26e+00

Pos logit contributions
bians+0.729
align+0.725
isel+0.719
male+0.714
imeters+0.709
Neg logit contributions
 HUN-0.732
 MARK-0.720
 Mond-0.711
 COP-0.709
 sen-0.707
Token of
Token ID286
Feature activation+1.20e+00

Pos logit contributions
bians+0.872
align+0.869
isel+0.857
male+0.853
imeters+0.848
Neg logit contributions
 HUN-0.840
 MARK-0.827
 Mond-0.814
 COP-0.814
 sen-0.812
Token cash
Token ID5003
Feature activation-1.65e+00

Pos logit contributions
bians+0.920
align+0.916
isel+0.909
male+0.903
imeters+0.902
Neg logit contributions
 HUN-0.699
 MARK-0.689
 Mond-0.677
 COP-0.667
 sen-0.665
Token.
Token ID13
Feature activation+2.11e+00

Pos logit contributions
 HUN+0.958
 MARK+0.937
 COP+0.924
 sen+0.924
 Mond+0.917
Neg logit contributions
bians-1.292
align-1.281
isel-1.263
male-1.256
handler-1.250
Token (
Token ID357
Feature activation+1.97e+00

Pos logit contributions
bians+1.282
align+1.261
male+1.252
isel+1.248
handler+1.239
Neg logit contributions
 HUN-1.538
 surv-1.534
 sen-1.531
 MARK-1.509
 COP-1.484
TokenTechn
Token ID25574
Feature activation+3.80e+00

Pos logit contributions
align+1.141
bians+1.139
male+1.132
isel+1.114
Female+1.112
Neg logit contributions
 HUN-1.508
 sen-1.480
 MARK-1.470
 surv-1.468
 Coca-1.463
Tokenically
Token ID1146
Feature activation+1.24e+00

Pos logit contributions
align+1.079
bians+1.053
isel+1.036
male+1.017
olit+1.005
Neg logit contributions
 HUN-4.092
 MARK-4.043
 COP-3.971
 sen-3.971
 Coca-3.956
Token,
Token ID11
Feature activation+9.65e-02

Pos logit contributions
bians+0.848
align+0.845
isel+0.832
male+0.828
imeters+0.822
Neg logit contributions
 HUN-0.844
 MARK-0.830
 COP-0.817
 sen-0.817
 Mond-0.816
Token no
Token ID645
Feature activation+1.78e+00

Pos logit contributions
bians+0.068
align+0.068
isel+0.067
male+0.066
imeters+0.066
Neg logit contributions
 HUN-0.064
 MARK-0.062
 sen-0.061
 COP-0.061
 Mond-0.061
Token Alberta
Token ID15555
Feature activation+2.24e-01

Pos logit contributions
bians+1.265
align+1.263
isel+1.254
male+1.241
imeters+1.238
Neg logit contributions
 HUN-1.135
 MARK-1.117
 Mond-1.100
 COP-1.092
 sen-1.090
Token budget
Token ID4466
Feature activation-8.07e-01

Pos logit contributions
bians+0.162
align+0.161
isel+0.159
male+0.159
imeters+0.157
Neg logit contributions
 HUN-0.143
 MARK-0.140
 COP-0.138
 Mond-0.138
 sen-0.138
Token easily
Token ID3538
Feature activation+5.78e-01

Pos logit contributions
bians+1.100
align+1.100
isel+1.078
male+1.070
handler+1.065
Neg logit contributions
 HUN-1.086
 MARK-1.067
 Mond-1.053
 Coca-1.045
 COP-1.043
Token be
Token ID307
Feature activation-7.00e-01

Pos logit contributions
bians+0.395
align+0.393
isel+0.387
male+0.384
imeters+0.382
Neg logit contributions
 HUN-0.393
 MARK-0.386
 Mond-0.380
 COP-0.380
 sen-0.379
Token pulled
Token ID5954
Feature activation-4.35e-01

Pos logit contributions
 HUN+0.537
 MARK+0.529
 COP+0.521
 Mond+0.521
 sen+0.521
Neg logit contributions
bians-0.418
align-0.416
isel-0.409
male-0.406
handler-0.403
Token out
Token ID503
Feature activation-1.24e+00

Pos logit contributions
 HUN+0.281
 MARK+0.276
 COP+0.271
 Mond+0.271
 sen+0.271
Neg logit contributions
bians-0.312
align-0.311
isel-0.306
male-0.305
imeters-0.303
TokenAfter
Token ID3260
Feature activation+4.11e+00

Pos logit contributions
 HUN+0.765
 MARK+0.751
 COP+0.739
 sen+0.739
 Mond+0.737
Neg logit contributions
bians-0.924
align-0.920
isel-0.907
male-0.903
handler-0.898
Token that
Token ID326
Feature activation+3.90e+00

Pos logit contributions
align+2.544
bians+2.526
isel+2.504
assembly+2.467
male+2.462
Neg logit contributions
 HUN-3.013
 MARK-2.945
 Mond-2.906
 COP-2.884
 sen-2.884
Token,
Token ID11
Feature activation+2.91e+00

Pos logit contributions
align+2.549
bians+2.545
isel+2.516
male+2.490
handler+2.474
Neg logit contributions
 HUN-2.730
 MARK-2.683
 Mond-2.643
 sen-2.624
 Coca-2.619
Token you
Token ID345
Feature activation+1.24e+00

Pos logit contributions
align+1.894
bians+1.887
isel+1.854
male+1.837
handler+1.818
Neg logit contributions
 HUN-2.049
 MARK-2.003
 Mond-1.986
 sen-1.980
 Coca-1.975
Token will
Token ID481
Feature activation+1.93e+00

Pos logit contributions
bians+0.843
align+0.841
isel+0.830
male+0.825
imeters+0.815
Neg logit contributions
 HUN-0.841
 MARK-0.826
 Mond-0.822
 sen-0.815
 Coca-0.813
Token need
Token ID761
Feature activation+6.62e-01

Pos logit contributions
align+1.246
bians+1.243
isel+1.222
male+1.208
volt+1.201
Neg logit contributions
 HUN-1.368
 MARK-1.348
 Mond-1.341
 Coca-1.335
 sen-1.329
Token to
Token ID284
Feature activation+1.43e+00

Pos logit contributions
bians+0.514
align+0.510
isel+0.503
male+0.501
handler+0.500
Neg logit contributions
 HUN-0.390
 MARK-0.383
 COP-0.377
 sen-0.376
 Mond-0.374
Token in
Token ID287
Feature activation+1.15e-01

Pos logit contributions
 HUN+0.504
 MARK+0.496
 COP+0.489
 Mond+0.489
 sen+0.489
Neg logit contributions
bians-0.482
align-0.480
isel-0.473
male-0.470
imeters-0.468
Token the
Token ID262
Feature activation-1.77e+00

Pos logit contributions
bians+0.081
align+0.080
isel+0.079
male+0.079
imeters+0.078
Neg logit contributions
 HUN-0.076
 MARK-0.074
 COP-0.073
 sen-0.073
 Mond-0.073
Token sand
Token ID6450
Feature activation-3.20e+00

Pos logit contributions
 HUN+1.103
 MARK+1.083
 COP+1.069
 sen+1.068
 Mond+1.066
Neg logit contributions
bians-1.305
align-1.295
isel-1.279
male-1.273
imeters-1.267
Token the
Token ID262
Feature activation-1.06e+00

Pos logit contributions
 HUN+2.026
 MARK+1.962
 sen+1.956
 COP+1.955
 Mond+1.940
Neg logit contributions
bians-2.344
align-2.342
isel-2.303
BuyableInstoreAndOnline-2.285
male-2.279
Token whole
Token ID2187
Feature activation-2.13e+00

Pos logit contributions
 HUN+0.693
 MARK+0.680
 COP+0.671
 sen+0.670
 Mond+0.669
Neg logit contributions
bians-0.756
align-0.752
isel-0.741
male-0.738
handler-0.733
Token time
Token ID640
Feature activation+4.27e+00

Pos logit contributions
 HUN+1.061
 MARK+1.033
 COP+1.016
 Mond+1.014
 sen+1.014
Neg logit contributions
bians-1.849
align-1.841
isel-1.818
male-1.812
imeters-1.804
Token.
Token ID13
Feature activation-2.21e+00

Pos logit contributions
bians+3.096
align+3.059
imeters+3.035
male+3.035
handler+3.021
Neg logit contributions
 MARK-2.596
 HUN-2.589
 Mond-2.531
 Coca-2.511
 COP-2.503
Token Maybe
Token ID6674
Feature activation-7.37e-01

Pos logit contributions
 HUN+1.513
 MARK+1.488
 COP+1.467
 sen+1.465
 Mond+1.464
Neg logit contributions
bians-1.499
align-1.492
isel-1.468
male-1.462
imeters-1.453
Token the
Token ID262
Feature activation-2.28e+00

Pos logit contributions
 HUN+0.492
 MARK+0.485
 COP+0.476
 sen+0.475
 Mond+0.474
Neg logit contributions
bians-0.509
align-0.507
isel-0.499
male-0.497
handler-0.493
Token true
Token ID2081
Feature activation-6.23e-01

Pos logit contributions
 HUN+1.446
 MARK+1.419
 COP+1.398
 sen+1.397
 Mond+1.396
Neg logit contributions
bians-1.658
align-1.651
isel-1.627
male-1.620
imeters-1.611
Token voice
Token ID3809
Feature activation+1.30e-01

Pos logit contributions
 HUN+0.413
 MARK+0.407
 COP+0.401
 Mond+0.399
 Coca+0.398
Neg logit contributions
bians-0.431
align-0.429
isel-0.426
male-0.423
handler-0.420
Tokenspeech
Token ID45862
Feature activation+1.90e+00

Pos logit contributions
 HUN+1.268
 MARK+1.253
 COP+1.241
 sen+1.236
 Mond+1.235
Neg logit contributions
bians-0.892
align-0.887
isel-0.873
male-0.863
imeters-0.861
Token-
Token ID12
Feature activation-9.25e-01

Pos logit contributions
bians+1.408
align+1.400
male+1.377
isel+1.377
imeters+1.374
Neg logit contributions
 HUN-1.187
 MARK-1.164
 Mond-1.148
 sen-1.146
 Coca-1.139
Tokenand
Token ID392
Feature activation-6.14e-01

Pos logit contributions
 HUN+0.755
 MARK+0.745
 COP+0.739
 sen+0.736
 Mond+0.736
Neg logit contributions
bians-0.501
align-0.497
isel-0.489
male-0.484
imeters-0.482
Token-
Token ID12
Feature activation-8.66e-01

Pos logit contributions
 HUN+0.349
 MARK+0.342
 COP+0.341
 sen+0.337
 Mond+0.335
Neg logit contributions
bians-0.483
align-0.480
isel-0.476
male-0.472
handler-0.470
Tokenpriv
Token ID13776
Feature activation+4.67e-01

Pos logit contributions
 HUN+0.701
 MARK+0.692
 COP+0.686
 sen+0.684
 Mond+0.682
Neg logit contributions
bians-0.475
align-0.470
isel-0.463
male-0.458
imeters-0.457
Tokenacy
Token ID1590
Feature activation+3.80e+00

Pos logit contributions
bians+0.264
align+0.263
isel+0.258
male+0.256
imeters+0.254
Neg logit contributions
 HUN-0.372
 MARK-0.367
 COP-0.362
 sen-0.362
 Mond-0.362
Token
Token ID198
Feature activation+2.81e-01

Pos logit contributions
bians+2.726
align+2.715
male+2.668
isel+2.664
imeters+2.657
Neg logit contributions
 HUN-2.434
 MARK-2.384
 Mond-2.353
 sen-2.348
 Coca-2.341
Token
Token ID198
Feature activation+6.22e-01

Pos logit contributions
align+0.191
bians+0.191
male+0.188
isel+0.187
handler+0.187
Neg logit contributions
 HUN-0.191
 MARK-0.187
 sen-0.185
 surv-0.185
 Mond-0.185
Tokenhttps
Token ID5450
Feature activation+1.97e+00

Pos logit contributions
bians+0.402
align+0.400
isel+0.393
male+0.391
imeters+0.389
Neg logit contributions
 HUN-0.446
 MARK-0.439
 COP-0.433
 sen-0.433
 Mond-0.432
Token://
Token ID1378
Feature activation+1.56e+00

Pos logit contributions
align+1.164
bians+1.151
male+1.145
isel+1.129
handler+1.128
Neg logit contributions
 HUN-1.503
 surv-1.491
 MARK-1.487
 Mond-1.467
 sen-1.467
Tokenwww
Token ID2503
Feature activation+7.40e-01

Pos logit contributions
bians+0.847
align+0.840
isel+0.827
male+0.820
imeters+0.816
Neg logit contributions
 HUN-1.276
 MARK-1.259
 COP-1.246
 sen-1.244
 Mond-1.241
Token can
Token ID460
Feature activation-6.85e-01

Pos logit contributions
 HUN+0.338
 MARK+0.333
 Mond+0.328
 COP+0.327
 sen+0.326
Neg logit contributions
bians-0.355
align-0.354
isel-0.348
male-0.348
imeters-0.345
Token
Token ID447
Feature activation+2.32e+00

Pos logit contributions
 HUN+0.421
 MARK+0.414
 COP+0.407
 Mond+0.406
 sen+0.406
Neg logit contributions
bians-0.512
align-0.510
isel-0.503
male-0.501
imeters-0.498
Token
Token ID247
Feature activation+1.66e+00

Pos logit contributions
align+1.149
bians+1.133
isel+1.122
male+1.118
imeters+1.102
Neg logit contributions
 MARK-1.989
 HUN-1.988
 COP-1.968
 Mond-1.963
 Coca-1.945
Tokent
Token ID83
Feature activation-3.91e-01

Pos logit contributions
align+1.266
bians+1.254
isel+1.247
male+1.240
handler+1.233
Neg logit contributions
 HUN-1.002
 MARK-0.986
 COP-0.969
 Mond-0.959
 Coca-0.954
Token remember
Token ID3505
Feature activation-2.94e-01

Pos logit contributions
 HUN+0.260
 MARK+0.256
 Mond+0.252
 COP+0.250
 Coca+0.250
Neg logit contributions
bians-0.271
align-0.271
isel-0.264
male-0.264
imeters-0.264
Token you
Token ID345
Feature activation+1.52e+00

Pos logit contributions
 HUN+0.193
 MARK+0.191
 Mond+0.188
 COP+0.187
 Coca+0.186
Neg logit contributions
bians-0.204
align-0.203
male-0.200
isel-0.200
imeters-0.199
Token at
Token ID379
Feature activation-1.20e+00

Pos logit contributions
bians+1.041
align+1.036
male+1.028
isel+1.024
imeters+1.018
Neg logit contributions
 HUN-1.004
 MARK-1.000
 Mond-0.985
 COP-0.972
 Coca-0.972
Token all
Token ID477
Feature activation+1.24e+00

Pos logit contributions
 HUN+0.759
 MARK+0.746
 COP+0.733
 Mond+0.733
 sen+0.732
Neg logit contributions
bians-0.869
align-0.865
isel-0.853
male-0.849
imeters-0.844
Token.
Token ID13
Feature activation+8.86e-01

Pos logit contributions
bians+0.880
align+0.876
male+0.869
imeters+0.863
handler+0.861
Neg logit contributions
 HUN-0.794
 MARK-0.787
 Mond-0.770
 Coca-0.769
 COP-0.769
Token
Token ID447
Feature activation+1.85e+00

Pos logit contributions
bians+0.584
align+0.583
male+0.575
imeters+0.568
isel+0.567
Neg logit contributions
 HUN-0.604
 MARK-0.599
 Mond-0.589
 sen-0.586
 COP-0.586
Token
Token ID251
Feature activation+1.65e+00

Pos logit contributions
bians+1.223
align+1.217
isel+1.197
male+1.192
imeters+1.184
Neg logit contributions
 HUN-1.302
 MARK-1.282
 COP-1.264
 Mond-1.262
 sen-1.262
Token have
Token ID423
Feature activation+6.03e-01

Pos logit contributions
bians+0.919
align+0.916
isel+0.902
male+0.890
handler+0.888
Neg logit contributions
 HUN-1.033
 MARK-1.019
 COP-0.999
 Mond-0.999
 sen-0.995
Token shown
Token ID3402
Feature activation-1.06e-01

Pos logit contributions
bians+0.404
align+0.401
isel+0.398
male+0.396
handler+0.393
Neg logit contributions
 HUN-0.411
 MARK-0.405
 Mond-0.397
 COP-0.397
 Coca-0.396
Token no
Token ID645
Feature activation+2.70e+00

Pos logit contributions
 HUN+0.072
 MARK+0.071
 Mond+0.069
 COP+0.069
 sen+0.069
Neg logit contributions
bians-0.073
align-0.072
isel-0.071
male-0.071
handler-0.070
Token desire
Token ID6227
Feature activation+2.31e+00

Pos logit contributions
bians+1.824
align+1.808
isel+1.795
handler+1.786
 lesbians+1.780
Neg logit contributions
 HUN-1.786
 MARK-1.752
 Mond-1.730
 COP-1.718
 Coca-1.709
Token to
Token ID284
Feature activation+2.75e-01

Pos logit contributions
bians+1.680
align+1.677
isel+1.654
male+1.650
handler+1.637
Neg logit contributions
 HUN-1.460
 MARK-1.440
 Mond-1.411
 COP-1.411
 sen-1.408
Token do
Token ID466
Feature activation+1.06e+00

Pos logit contributions
align+0.188
bians+0.187
isel+0.185
male+0.183
handler+0.182
Neg logit contributions
 HUN-0.185
 MARK-0.181
 COP-0.179
 Coca-0.178
 sen-0.178
Token them
Token ID606
Feature activation+4.15e-01

Pos logit contributions
bians+0.726
align+0.723
isel+0.715
male+0.709
handler+0.707
Neg logit contributions
 HUN-0.714
 MARK-0.705
 sen-0.692
 COP-0.691
 Mond-0.686
Token.
Token ID13
Feature activation+1.68e-02

Pos logit contributions
bians+0.313
align+0.312
isel+0.308
male+0.307
handler+0.305
Neg logit contributions
 HUN-0.252
 MARK-0.248
 COP-0.243
 sen-0.243
 Mond-0.243
Token
Token ID198
Feature activation-8.32e-01

Pos logit contributions
bians+0.011
align+0.011
male+0.011
isel+0.011
female+0.011
Neg logit contributions
 HUN-0.012
 sen-0.012
 MARK-0.012
 surv-0.011
 COP-0.011
Token
Token ID198
Feature activation-1.05e+00

Pos logit contributions
 HUN+0.553
 MARK+0.543
 sen+0.536
 Mond+0.535
 COP+0.535
Neg logit contributions
bians-0.578
align-0.577
male-0.569
isel-0.566
handler-0.564
TokenBut
Token ID1537
Feature activation-1.12e+00

Pos logit contributions
 HUN+0.833
 MARK+0.820
 sen+0.811
 COP+0.811
 Mond+0.808
Neg logit contributions
bians-0.599
align-0.598
isel-0.584
male-0.584
handler-0.578
Token economy
Token ID3773
Feature activation+3.08e+00

Pos logit contributions
bians+1.396
align+1.395
isel+1.383
male+1.373
handler+1.356
Neg logit contributions
 HUN-1.345
 MARK-1.321
 sen-1.307
 COP-1.301
 Coca-1.291
Token.
Token ID13
Feature activation-8.23e-01

Pos logit contributions
align+2.334
bians+2.327
isel+2.306
male+2.299
handler+2.294
Neg logit contributions
 HUN-1.818
 MARK-1.786
 Mond-1.754
 COP-1.745
 sen-1.745
Token
Token ID198
Feature activation-1.74e+00

Pos logit contributions
 HUN+0.593
 MARK+0.583
 sen+0.581
 COP+0.573
 surv+0.573
Neg logit contributions
bians-0.523
align-0.517
male-0.509
isel-0.507
handler-0.503
Token
Token ID198
Feature activation-1.38e+00

Pos logit contributions
 HUN+1.119
 MARK+1.099
 COP+1.082
 sen+1.081
 Mond+1.081
Neg logit contributions
bians-1.246
align-1.241
isel-1.222
male-1.218
handler-1.211
TokenMac
Token ID14155
Feature activation+1.02e+00

Pos logit contributions
 HUN+1.059
 MARK+1.043
 COP+1.030
 sen+1.029
 Mond+1.027
Neg logit contributions
bians-0.816
align-0.814
isel-0.797
male-0.796
handler-0.789
Tokenri
Token ID380
Feature activation-1.00e-01

Pos logit contributions
align+0.644
bians+0.644
isel+0.637
male+0.632
imeters+0.628
Neg logit contributions
 HUN-0.745
 MARK-0.733
 COP-0.724
 sen-0.720
 Coca-0.718
Token blames
Token ID34061
Feature activation-2.62e-01

Pos logit contributions
 HUN+0.067
 MARK+0.066
 COP+0.065
 sen+0.065
 Mond+0.065
Neg logit contributions
bians-0.069
align-0.069
isel-0.068
male-0.068
handler-0.067
Token un
Token ID555
Feature activation+8.05e-01

Pos logit contributions
 HUN+0.167
 MARK+0.164
 COP+0.161
 sen+0.161
 Mond+0.161
Neg logit contributions
bians-0.189
align-0.188
isel-0.186
male-0.185
imeters-0.184
Tokenfore
Token ID754
Feature activation+8.53e-01

Pos logit contributions
align+0.473
bians+0.465
isel+0.459
male+0.459
handler+0.458
Neg logit contributions
 HUN-0.628
 MARK-0.619
 COP-0.611
 Mond-0.610
 sen-0.609
Tokensee
Token ID3826
Feature activation+4.17e-01

Pos logit contributions
align+0.439
bians+0.434
male+0.429
isel+0.427
volt+0.423
Neg logit contributions
 HUN-0.726
 MARK-0.712
 COP-0.704
 sen-0.704
 Mond-0.701
Tokenable
Token ID540
Feature activation+6.20e-01

Pos logit contributions
align+0.194
bians+0.193
isel+0.188
male+0.187
handler+0.186
Neg logit contributions
 HUN-0.375
 MARK-0.369
 Mond-0.364
 COP-0.364
 sen-0.364
Token down
Token ID866
Feature activation+2.98e-01

Pos logit contributions
bians+1.202
align+1.193
 lesbians+1.178
isel+1.171
male+1.169
Neg logit contributions
 HUN-1.411
 Mond-1.410
 MARK-1.407
 COP-1.384
 Coca-1.370
Token with
Token ID351
Feature activation+3.85e-01

Pos logit contributions
bians+0.191
align+0.190
isel+0.187
male+0.186
handler+0.185
Neg logit contributions
 HUN-0.215
 MARK-0.212
 COP-0.209
 Mond-0.208
 sen-0.208
Token Gohan
Token ID45281
Feature activation+5.43e-02

Pos logit contributions
bians+0.280
align+0.279
isel+0.274
male+0.274
handler+0.271
Neg logit contributions
 HUN-0.242
 MARK-0.238
 COP-0.234
 Mond-0.234
 sen-0.233
Token?
Token ID30
Feature activation-2.08e+00

Pos logit contributions
bians+0.039
align+0.039
isel+0.038
male+0.038
handler+0.038
Neg logit contributions
 HUN-0.034
 MARK-0.034
 COP-0.033
 Mond-0.033
 sen-0.033
Token<|endoftext|>
Token ID50256
Feature activation-5.26e-01

Pos logit contributions
 HUN+1.500
 MARK+1.477
 COP+1.457
 sen+1.455
 Mond+1.454
Neg logit contributions
bians-1.327
align-1.320
isel-1.298
male-1.293
handler-1.283
TokenR
Token ID49
Feature activation+2.75e+00

Pos logit contributions
 HUN+0.393
 MARK+0.387
 COP+0.382
 sen+0.382
 Mond+0.382
Neg logit contributions
bians-0.324
align-0.322
isel-0.317
male-0.315
handler-0.313
Tokenents
Token ID658
Feature activation-7.95e-01

Pos logit contributions
bians+1.739
isel+1.726
align+1.721
male+1.699
imeters+1.678
Neg logit contributions
 HUN-2.015
 MARK-1.974
 sen-1.950
 Mond-1.947
 COP-1.945
Token still
Token ID991
Feature activation+2.89e+00

Pos logit contributions
 HUN+0.546
 MARK+0.538
 COP+0.530
 Mond+0.529
 sen+0.529
Neg logit contributions
bians-0.537
align-0.534
isel-0.526
male-0.525
handler-0.521
Token have
Token ID423
Feature activation+5.75e-01

Pos logit contributions
bians+1.849
align+1.838
male+1.809
isel+1.806
 lesbians+1.784
Neg logit contributions
 HUN-2.050
 MARK-2.014
 Mond-2.001
 COP-1.995
 sen-1.994
Token the
Token ID262
Feature activation+1.82e+00

Pos logit contributions
bians+0.392
align+0.389
isel+0.386
male+0.385
imeters+0.383
Neg logit contributions
 HUN-0.384
 MARK-0.379
 Mond-0.374
 COP-0.373
 sen-0.373
Token potential
Token ID2785
Feature activation+5.14e-01

Pos logit contributions
bians+1.266
align+1.237
male+1.237
isel+1.231
imeters+1.231
Neg logit contributions
 HUN-1.189
 MARK-1.165
 Mond-1.148
 sen-1.146
 COP-1.143
Token org
Token ID8745
Feature activation-1.23e+00

Pos logit contributions
 HUN+0.610
 MARK+0.600
 sen+0.593
 COP+0.593
 Mond+0.587
Neg logit contributions
bians-0.664
align-0.657
isel-0.651
male-0.648
imeters-0.647
Token.
Token ID764
Feature activation-1.60e+00

Pos logit contributions
 HUN+0.563
 MARK+0.547
 Mond+0.542
 sen+0.537
 COP+0.534
Neg logit contributions
align-1.116
bians-1.115
male-1.103
isel-1.102
handler-1.096
Token cl
Token ID537
Feature activation-1.30e+00

Pos logit contributions
 HUN+1.193
 MARK+1.174
 COP+1.157
 Mond+1.157
 sen+1.154
Neg logit contributions
bians-0.987
align-0.985
isel-0.967
male-0.962
handler-0.956
Tokenojure
Token ID32511
Feature activation+5.62e-01

Pos logit contributions
 HUN+0.847
 MARK+0.832
 Mond+0.821
 COP+0.820
 sen+0.817
Neg logit contributions
bians-0.927
align-0.923
isel-0.914
male-0.908
handler-0.902
Token /
Token ID1220
Feature activation+1.06e+00

Pos logit contributions
align+0.297
bians+0.296
isel+0.292
male+0.290
handler+0.290
Neg logit contributions
 HUN-0.464
 MARK-0.455
 COP-0.452
 Mond-0.451
 sen-0.450
Token tools
Token ID4899
Feature activation+7.07e-01

Pos logit contributions
align+0.677
bians+0.673
isel+0.663
male+0.657
handler+0.657
Neg logit contributions
 HUN-0.760
 MARK-0.746
 COP-0.734
 Mond-0.734
 Coca-0.729
Token.
Token ID764
Feature activation-1.18e+00

Pos logit contributions
bians+0.489
align+0.488
isel+0.479
male+0.477
handler+0.476
Neg logit contributions
 HUN-0.474
 MARK-0.464
 Mond-0.458
 COP-0.457
 sen-0.456
Token logging
Token ID18931
Feature activation-7.54e-02

Pos logit contributions
 HUN+0.830
 MARK+0.816
 COP+0.805
 sen+0.804
 Mond+0.804
Neg logit contributions
bians-0.778
align-0.774
isel-0.762
male-0.759
imeters-0.754
Token |
Token ID930
Feature activation-6.90e-01

Pos logit contributions
 HUN+0.060
 MARK+0.059
 Mond+0.058
 COP+0.058
 sen+0.058
Neg logit contributions
bians-0.043
align-0.043
isel-0.042
male-0.042
handler-0.042
Token 15
Token ID1315
Feature activation-1.21e+00

Pos logit contributions
 HUN+0.522
 MARK+0.515
 sen+0.509
 COP+0.507
 Mond+0.507
Neg logit contributions
bians-0.416
align-0.413
isel-0.406
male-0.405
handler-0.401
Token79
Token ID3720
Feature activation+1.08e-01

Pos logit contributions
 HUN+0.794
 MARK+0.780
 sen+0.769
 COP+0.768
 Mond+0.767
Neg logit contributions
bians-0.854
align-0.850
isel-0.835
male-0.835
handler-0.828
Tokens
Token ID82
Feature activation+2.29e-01

Pos logit contributions
 HUN+0.842
 MARK+0.829
 COP+0.817
 Mond+0.815
 sen+0.815
Neg logit contributions
bians-0.887
align-0.883
isel-0.871
male-0.866
imeters-0.861
Token no
Token ID645
Feature activation+2.35e+00

Pos logit contributions
bians+0.159
align+0.158
isel+0.156
male+0.155
handler+0.154
Neg logit contributions
 HUN-0.154
 MARK-0.151
 COP-0.149
 Mond-0.149
 sen-0.148
Token limit
Token ID4179
Feature activation+1.31e+00

Pos logit contributions
bians+1.538
align+1.523
isel+1.507
male+1.493
handler+1.488
Neg logit contributions
 HUN-1.663
 MARK-1.630
 COP-1.609
 Mond-1.599
 sen-1.594
Token.
Token ID13
Feature activation-8.84e-01

Pos logit contributions
align+0.931
bians+0.930
isel+0.918
male+0.915
imeters+0.910
Neg logit contributions
 HUN-0.845
 MARK-0.828
 Mond-0.815
 COP-0.815
 sen-0.812
Token When
Token ID1649
Feature activation+8.57e-01

Pos logit contributions
 HUN+0.629
 MARK+0.619
 sen+0.611
 COP+0.611
 Mond+0.610
Neg logit contributions
bians-0.574
align-0.571
isel-0.562
male-0.560
handler-0.556
Token there
Token ID612
Feature activation-4.76e-01

Pos logit contributions
bians+0.600
align+0.598
isel+0.590
male+0.588
handler+0.583
Neg logit contributions
 HUN-0.562
 MARK-0.553
 COP-0.544
 Mond-0.543
 sen-0.542
Token
Token ID447
Feature activation+6.90e-02

Pos logit contributions
 HUN+0.335
 MARK+0.329
 COP+0.325
 sen+0.324
 Mond+0.324
Neg logit contributions
bians-0.315
align-0.313
isel-0.308
male-0.307
imeters-0.305
Token
Token ID247
Feature activation-5.92e-01

Pos logit contributions
bians+0.047
align+0.047
isel+0.046
male+0.046
handler+0.046
Neg logit contributions
 HUN-0.047
 MARK-0.046
 COP-0.045
 Mond-0.045
 sen-0.045
Tokens
Token ID82
Feature activation+9.44e-01

Pos logit contributions
 HUN+0.427
 MARK+0.420
 COP+0.414
 sen+0.414
 Mond+0.414
Neg logit contributions
bians-0.380
align-0.378
isel-0.372
male-0.370
imeters-0.367
Token people
Token ID661
Feature activation+2.90e-01

Pos logit contributions
bians+0.661
align+0.659
isel+0.650
male+0.649
imeters+0.645
Neg logit contributions
 HUN-0.618
 MARK-0.609
 Mond-0.601
 COP-0.598
 sen-0.597
Token motivated
Token ID13338
Feature activation+9.87e-01

Pos logit contributions
bians+0.198
align+0.198
isel+0.194
male+0.194
imeters+0.192
Neg logit contributions
 HUN-0.195
 MARK-0.193
 Mond-0.190
 COP-0.189
 sen-0.188
Token)}
Token ID38165
Feature activation-8.55e-01

Pos logit contributions
 HUN+0.089
 MARK+0.088
 Mond+0.086
 sen+0.086
 COP+0.086
Neg logit contributions
bians-0.069
align-0.069
isel-0.068
male-0.067
handler-0.067
Token Q
Token ID1195
Feature activation+1.20e+00

Pos logit contributions
 HUN+0.565
 MARK+0.555
 Mond+0.547
 COP+0.547
 sen+0.546
Neg logit contributions
bians-0.601
align-0.598
isel-0.589
male-0.587
imeters-0.584
Tokenc
Token ID66
Feature activation-2.12e+00

Pos logit contributions
bians+0.714
align+0.713
isel+0.697
male+0.695
handler+0.690
Neg logit contributions
 HUN-0.931
 MARK-0.918
 COP-0.905
 sen-0.901
 Coca-0.898
Token7
Token ID22
Feature activation+2.59e-01

Pos logit contributions
 HUN+1.357
 MARK+1.331
 COP+1.310
 sen+1.306
 Mond+1.305
Neg logit contributions
bians-1.536
align-1.529
isel-1.507
male-1.500
handler-1.490
Token {
Token ID1391
Feature activation+1.13e+00

Pos logit contributions
align+0.159
bians+0.159
male+0.156
handler+0.155
isel+0.155
Neg logit contributions
 HUN-0.192
 MARK-0.189
 sen-0.187
 COP-0.187
 surv-0.186
Token(
Token ID7
Feature activation-5.12e-01

Pos logit contributions
align+0.892
bians+0.885
male+0.881
handler+0.875
female+0.871
Neg logit contributions
 HUN-0.645
 MARK-0.633
 surv-0.624
 sen-0.621
 COP-0.619
Token17
Token ID1558
Feature activation-2.44e+00

Pos logit contributions
 HUN+0.355
 MARK+0.349
 sen+0.345
 COP+0.344
 Mond+0.344
Neg logit contributions
bians-0.343
align-0.342
isel-0.336
male-0.336
handler-0.332
Tokens
Token ID82
Feature activation-7.96e-01

Pos logit contributions
 HUN+1.781
 MARK+1.748
 sen+1.740
 COP+1.734
 Mond+1.726
Neg logit contributions
bians-1.533
align-1.527
isel-1.525
imeters-1.497
handler-1.496
Token)}
Token ID38165
Feature activation-5.81e-01

Pos logit contributions
 HUN+0.603
 MARK+0.595
 Mond+0.586
 COP+0.585
 sen+0.585
Neg logit contributions
bians-0.484
align-0.481
isel-0.472
male-0.470
handler-0.468
Token 27
Token ID2681
Feature activation+4.05e-01

Pos logit contributions
 HUN+0.443
 MARK+0.438
 COP+0.433
 sen+0.431
 Mond+0.430
Neg logit contributions
align-0.351
bians-0.350
isel-0.345
imeters-0.342
male-0.340
Token.
Token ID13
Feature activation-1.87e+00

Pos logit contributions
bians+0.313
align+0.312
male+0.309
isel+0.307
handler+0.306
Neg logit contributions
 HUN-0.236
 MARK-0.231
 sen-0.229
 surv-0.228
 COP-0.227
Token coverage
Token ID5197
Feature activation-1.62e+00

Pos logit contributions
align+0.754
bians+0.750
isel+0.744
male+0.739
handler+0.737
Neg logit contributions
 HUN-0.777
 MARK-0.763
 Mond-0.750
 sen-0.747
 COP-0.747
Token to
Token ID284
Feature activation-6.51e-01

Pos logit contributions
 HUN+0.989
 MARK+0.969
 COP+0.956
 Mond+0.953
 sen+0.953
Neg logit contributions
bians-1.226
align-1.217
isel-1.201
male-1.195
handler-1.188
Token Trump
Token ID1301
Feature activation+2.72e+00

Pos logit contributions
 HUN+0.428
 MARK+0.421
 COP+0.414
 sen+0.414
 Mond+0.413
Neg logit contributions
bians-0.458
align-0.456
isel-0.449
male-0.447
imeters-0.444
Token and
Token ID290
Feature activation-1.11e+00

Pos logit contributions
bians+1.876
align+1.868
male+1.842
isel+1.839
handler+1.823
Neg logit contributions
 HUN-1.803
 MARK-1.776
 COP-1.740
 Coca-1.738
 Mond-1.737
Token the
Token ID262
Feature activation-7.51e-01

Pos logit contributions
 HUN+0.731
 MARK+0.718
 COP+0.706
 sen+0.705
 Mond+0.705
Neg logit contributions
bians-0.786
align-0.784
isel-0.772
male-0.768
handler-0.764
Token concoct
Token ID46789
Feature activation-1.18e+00

Pos logit contributions
 HUN+0.477
 MARK+0.469
 COP+0.460
 sen+0.459
 Mond+0.459
Neg logit contributions
bians-0.545
align-0.543
isel-0.535
male-0.533
handler-0.531
Tokened
Token ID276
Feature activation-1.93e+00

Pos logit contributions
 HUN+0.826
 MARK+0.813
 COP+0.802
 sen+0.801
 Mond+0.801
Neg logit contributions
bians-0.787
align-0.783
isel-0.770
male-0.766
imeters-0.762
Token Benghazi
Token ID22497
Feature activation-2.69e-02

Pos logit contributions
 HUN+1.310
 MARK+1.288
 COP+1.270
 sen+1.268
 Mond+1.268
Neg logit contributions
bians-1.316
align-1.310
isel-1.289
male-1.283
imeters-1.276
Token and
Token ID290
Feature activation-1.50e+00

Pos logit contributions
 HUN+0.019
 MARK+0.018
 Mond+0.018
 Coca+0.018
 sen+0.018
Neg logit contributions
bians-0.018
align-0.018
isel-0.018
male-0.018
handler-0.018
Token email
Token ID3053
Feature activation+1.76e+00

Pos logit contributions
 HUN+0.987
 MARK+0.969
 COP+0.956
 sen+0.954
 Mond+0.954
Neg logit contributions
bians-1.066
align-1.061
isel-1.044
male-1.040
imeters-1.034
Token scandals
Token ID28449
Feature activation-2.54e+00

Pos logit contributions
bians+1.054
align+1.039
isel+1.034
male+1.033
handler+1.032
Neg logit contributions
 HUN-1.312
 MARK-1.294
 COP-1.276
 Mond-1.276
 Coca-1.275
TokenSo
Token ID2396
Feature activation-7.73e-01

Pos logit contributions
align+0.736
bians+0.728
male+0.714
handler+0.711
isel+0.710
Neg logit contributions
 HUN-1.063
 MARK-1.044
 sen-1.037
 COP-1.034
 Coca-1.033
Token,
Token ID11
Feature activation-1.14e+00

Pos logit contributions
 HUN+0.514
 MARK+0.505
 COP+0.497
 sen+0.497
 Mond+0.496
Neg logit contributions
bians-0.541
align-0.538
isel-0.530
male-0.527
handler-0.524
Token Dallas
Token ID8533
Feature activation+7.05e-01

Pos logit contributions
 HUN+0.752
 MARK+0.739
 COP+0.727
 sen+0.727
 Mond+0.726
Neg logit contributions
bians-0.807
align-0.803
isel-0.792
male-0.787
handler-0.783
Token has
Token ID468
Feature activation-4.83e-01

Pos logit contributions
bians+0.480
align+0.479
isel+0.473
male+0.471
handler+0.469
Neg logit contributions
 HUN-0.470
 MARK-0.465
 Mond-0.461
 sen-0.458
 Coca-0.458
Token to
Token ID284
Feature activation-1.70e+00

Pos logit contributions
 HUN+0.306
 MARK+0.301
 sen+0.296
 Mond+0.296
 COP+0.295
Neg logit contributions
bians-0.348
align-0.347
isel-0.344
male-0.341
handler-0.340
Token decide
Token ID5409
Feature activation-2.27e+00

Pos logit contributions
 HUN+1.159
 MARK+1.142
 Mond+1.128
 COP+1.125
 sen+1.124
Neg logit contributions
bians-1.153
align-1.152
isel-1.130
male-1.121
handler-1.119
Token almost
Token ID2048
Feature activation+1.96e+00

Pos logit contributions
 HUN+1.388
 MARK+1.363
 COP+1.341
 sen+1.339
 Mond+1.338
Neg logit contributions
bians-1.708
align-1.701
isel-1.677
male-1.670
handler-1.661
Token immediately
Token ID3393
Feature activation+3.49e+00

Pos logit contributions
bians+1.216
align+1.206
isel+1.200
 lesbians+1.175
handler+1.175
Neg logit contributions
 HUN-1.405
 MARK-1.396
 sen-1.375
 COP-1.370
 Mond-1.366
Token into
Token ID656
Feature activation-7.43e-01

Pos logit contributions
bians+2.397
align+2.390
isel+2.378
male+2.346
handler+2.338
Neg logit contributions
 HUN-2.242
 MARK-2.192
 COP-2.175
 Coca-2.159
 Mond-2.157
Token the
Token ID262
Feature activation+1.17e+00

Pos logit contributions
 HUN+0.493
 MARK+0.485
 sen+0.477
 COP+0.476
 Mond+0.475
Neg logit contributions
bians-0.517
align-0.516
isel-0.508
handler-0.504
male-0.503
Token offseason
Token ID16349
Feature activation-2.11e-01

Pos logit contributions
align+0.784
bians+0.781
isel+0.773
handler+0.765
male+0.762
Neg logit contributions
 HUN-0.795
 MARK-0.778
 sen-0.769
 Mond-0.769
 Coca-0.765
Token,
Token ID11
Feature activation-4.26e-01

Pos logit contributions
align+0.066
bians+0.066
isel+0.065
male+0.065
handler+0.064
Neg logit contributions
 HUN-0.066
 MARK-0.064
 COP-0.064
 sen-0.064
 Mond-0.063
Token on
Token ID319
Feature activation-1.63e+00

Pos logit contributions
 HUN+0.276
 MARK+0.271
 sen+0.267
 COP+0.267
 Mond+0.266
Neg logit contributions
bians-0.303
align-0.302
isel-0.298
male-0.296
handler-0.294
Token Bay
Token ID4696
Feature activation+9.50e-02

Pos logit contributions
 HUN+1.043
 MARK+1.026
 COP+1.010
 Mond+1.009
 sen+1.007
Neg logit contributions
bians-1.182
align-1.177
isel-1.158
male-1.155
imeters-1.149
Tokenview
Token ID1177
Feature activation-3.36e-01

Pos logit contributions
bians+0.063
align+0.063
isel+0.062
male+0.062
handler+0.061
Neg logit contributions
 HUN-0.066
 MARK-0.065
 COP-0.064
 Mond-0.064
 sen-0.064
Token at
Token ID379
Feature activation-8.03e-01

Pos logit contributions
 HUN+0.255
 MARK+0.252
 COP+0.248
 sen+0.248
 Mond+0.248
Neg logit contributions
bians-0.202
align-0.201
isel-0.197
male-0.196
imeters-0.195
Token the
Token ID262
Feature activation-1.60e+00

Pos logit contributions
 HUN+0.510
 MARK+0.500
 COP+0.493
 sen+0.492
 Mond+0.492
Neg logit contributions
bians-0.585
align-0.582
isel-0.574
male-0.571
imeters-0.568
Token Brick
Token ID28347
Feature activation-6.11e-01

Pos logit contributions
 HUN+1.041
 MARK+1.023
 COP+1.008
 Mond+1.006
 sen+1.006
Neg logit contributions
bians-1.133
align-1.128
isel-1.111
male-1.107
imeters-1.100
Token Works
Token ID10933
Feature activation-7.07e-01

Pos logit contributions
 HUN+0.418
 MARK+0.411
 COP+0.405
 sen+0.404
 Mond+0.404
Neg logit contributions
bians-0.415
align-0.413
isel-0.407
male-0.405
imeters-0.402
Token\n
Token ID198
Feature activation-9.20e-01

Pos logit contributions
 HUN+0.455
 MARK+0.448
 COP+0.442
 Mond+0.441
 sen+0.440
Neg logit contributions
bians-0.508
align-0.506
isel-0.497
male-0.497
imeters-0.493
Token\n
Token ID198
Feature activation+5.58e-01

Pos logit contributions
 HUN+0.609
 MARK+0.598
 sen+0.589
 Mond+0.588
 COP+0.588
Neg logit contributions
bians-0.643
align-0.642
male-0.632
isel-0.630
handler-0.626
TokenRen
Token ID26764
Feature activation+5.18e-01

Pos logit contributions
align+0.320
bians+0.319
isel+0.314
male+0.313
handler+0.309
Neg logit contributions
 HUN-0.442
 MARK-0.433
 sen-0.430
 COP-0.429
 Mond-0.427
Token incorporated
Token ID16560
Feature activation+4.07e-02

Pos logit contributions
 HUN+0.138
 MARK+0.136
 sen+0.133
 Mond+0.133
 COP+0.132
Neg logit contributions
align-0.143
bians-0.142
isel-0.140
male-0.140
handler-0.139
Token all
Token ID477
Feature activation-5.61e-01

Pos logit contributions
bians+0.028
align+0.028
isel+0.027
male+0.027
imeters+0.027
Neg logit contributions
 HUN-0.027
 MARK-0.027
 Mond-0.026
 COP-0.026
 sen-0.026
Token available
Token ID1695
Feature activation-1.52e+00

Pos logit contributions
 HUN+0.383
 MARK+0.377
 Mond+0.371
 COP+0.371
 sen+0.371
Neg logit contributions
bians-0.380
align-0.378
male-0.373
isel-0.372
imeters-0.369
Token data
Token ID1366
Feature activation-2.89e+00

Pos logit contributions
 HUN+1.080
 MARK+1.064
 Mond+1.048
 COP+1.047
 sen+1.047
Neg logit contributions
bians-0.991
align-0.984
isel-0.968
male-0.966
imeters-0.959
Token on
Token ID319
Feature activation-1.92e+00

Pos logit contributions
 HUN+1.858
 MARK+1.827
 COP+1.804
 sen+1.803
 Mond+1.790
Neg logit contributions
bians-2.080
align-2.070
isel-2.038
male-2.024
handler-2.016
Token body
Token ID1767
Feature activation-4.62e+00

Pos logit contributions
 HUN+1.311
 MARK+1.289
 COP+1.272
 sen+1.270
 Mond+1.268
Neg logit contributions
bians-1.308
align-1.303
isel-1.283
male-1.275
handler-1.270
Token mass
Token ID2347
Feature activation-6.30e+00

Pos logit contributions
 HUN+2.693
 MARK+2.674
 sen+2.635
 COP+2.614
 Mond+2.598
Neg logit contributions
bians-3.596
align-3.536
isel-3.517
handler-3.488
male-3.472
Token index
Token ID6376
Feature activation-1.46e+00

Pos logit contributions
 HUN+3.314
 MARK+3.300
 sen+3.244
 COP+3.210
 Mond+3.198
Neg logit contributions
bians-5.152
isel-5.125
align-5.103
handler-5.087
raid-5.066
Token (
Token ID357
Feature activation-1.02e+00

Pos logit contributions
 HUN+0.926
 MARK+0.909
 COP+0.897
 sen+0.895
 Mond+0.893
Neg logit contributions
bians-1.060
align-1.055
isel-1.040
male-1.033
handler-1.031
TokenB
Token ID33
Feature activation-1.07e-01

Pos logit contributions
 HUN+0.748
 MARK+0.736
 sen+0.727
 Mond+0.726
 COP+0.725
Neg logit contributions
bians-0.642
align-0.638
male-0.628
isel-0.625
handler-0.620
TokenMI
Token ID8895
Feature activation-1.58e+00

Pos logit contributions
 HUN+0.090
 MARK+0.089
 COP+0.088
 sen+0.088
 Coca+0.088
Neg logit contributions
align-0.055
bians-0.055
isel-0.054
male-0.054
imeters-0.054
Tokenoxy
Token ID23536
Feature activation-3.07e-01

Pos logit contributions
bians+0.218
align+0.217
isel+0.211
male+0.209
imeters+0.207
Neg logit contributions
 HUN-0.596
 MARK-0.590
 sen-0.583
 COP-0.581
 Mond-0.581
Token resin
Token ID36536
Feature activation+9.49e-01

Pos logit contributions
 HUN+0.237
 MARK+0.233
 COP+0.230
 Mond+0.230
 sen+0.230
Neg logit contributions
bians-0.182
align-0.181
isel-0.178
male-0.177
imeters-0.176
Token is
Token ID318
Feature activation-1.55e+00

Pos logit contributions
align+0.630
bians+0.630
isel+0.620
male+0.619
imeters+0.615
Neg logit contributions
 HUN-0.659
 MARK-0.645
 Mond-0.635
 COP-0.632
 Coca-0.630
Token "
Token ID366
Feature activation+2.69e-01

Pos logit contributions
 HUN+1.066
 MARK+1.049
 COP+1.032
 Mond+1.031
 sen+1.030
Neg logit contributions
bians-1.050
align-1.046
isel-1.030
male-1.026
handler-1.019
Tokenres
Token ID411
Feature activation+2.20e+00

Pos logit contributions
bians+0.144
align+0.144
isel+0.141
male+0.141
handler+0.139
Neg logit contributions
 HUN-0.223
 MARK-0.220
 COP-0.217
 Mond-0.217
 sen-0.217
Tokenil
Token ID346
Feature activation-5.88e+00

Pos logit contributions
bians+1.057
align+1.046
isel+1.023
male+1.018
handler+1.012
Neg logit contributions
 HUN-1.959
 MARK-1.934
 Mond-1.906
 COP-1.899
 Coca-1.893
Tokenient
Token ID1153
Feature activation+5.56e-01

Pos logit contributions
 HUN+3.278
 Coca+3.273
 sen+3.247
 MARK+3.209
 Mond+3.193
Neg logit contributions
BuyableInstoreAndOnline-4.474
align-4.470
bians-4.432
handler-4.432
isel-4.419
Token to
Token ID284
Feature activation-3.17e-01

Pos logit contributions
bians+0.387
align+0.386
isel+0.380
male+0.380
imeters+0.377
Neg logit contributions
 HUN-0.369
 MARK-0.362
 Mond-0.356
 COP-0.355
 sen-0.355
Token mechanical
Token ID12370
Feature activation-3.55e-01

Pos logit contributions
 HUN+0.221
 MARK+0.217
 Mond+0.214
 COP+0.213
 sen+0.213
Neg logit contributions
bians-0.211
align-0.210
isel-0.207
male-0.206
imeters-0.205
Token stresses
Token ID29787
Feature activation-4.81e-01

Pos logit contributions
 HUN+0.254
 MARK+0.250
 Mond+0.245
 COP+0.243
 Coca+0.243
Neg logit contributions
align-0.229
bians-0.229
isel-0.226
male-0.225
imeters-0.223
Token."
Token ID526
Feature activation+8.41e-01

Pos logit contributions
 HUN+0.300
 MARK+0.295
 COP+0.290
 sen+0.290
 Mond+0.290
Neg logit contributions
bians-0.355
align-0.353
isel-0.348
male-0.347
imeters-0.345
Token
Token ID251
Feature activation-2.56e+00

Pos logit contributions
 HUN+0.476
 MARK+0.469
 COP+0.462
 Mond+0.462
 sen+0.462
Neg logit contributions
bians-0.502
align-0.500
isel-0.491
male-0.489
handler-0.486
Token\n
Token ID198
Feature activation-3.27e+00

Pos logit contributions
 HUN+1.801
 MARK+1.772
 COP+1.747
 sen+1.747
 Mond+1.744
Neg logit contributions
bians-1.688
align-1.679
isel-1.651
male-1.645
handler-1.634
Token\n
Token ID198
Feature activation-1.99e+00

Pos logit contributions
 HUN+2.061
 MARK+2.026
 COP+1.993
 Mond+1.988
 sen+1.987
Neg logit contributions
bians-2.389
align-2.377
isel-2.346
male-2.326
imeters-2.325
TokenAll
Token ID3237
Feature activation+2.93e-01

Pos logit contributions
 HUN+1.495
 MARK+1.473
 COP+1.454
 Mond+1.452
 sen+1.452
Neg logit contributions
bians-1.212
align-1.204
isel-1.186
male-1.176
imeters-1.173
Token of
Token ID286
Feature activation-4.26e+00

Pos logit contributions
bians+0.197
align+0.196
isel+0.192
male+0.192
imeters+0.190
Neg logit contributions
 HUN-0.203
 MARK-0.199
 sen-0.196
 COP-0.196
 Mond-0.196
Token this
Token ID428
Feature activation-4.40e+00

Pos logit contributions
 HUN+2.877
 MARK+2.828
 Mond+2.826
 COP+2.826
 Coca+2.798
Neg logit contributions
bians-2.899
align-2.877
isel-2.867
imeters-2.829
handler-2.821
Token seemed
Token ID3947
Feature activation-3.85e+00

Pos logit contributions
 HUN+3.017
 MARK+2.966
 sen+2.952
 COP+2.949
 Mond+2.940
Neg logit contributions
bians-2.969
align-2.942
imeters-2.899
isel-2.888
male-2.872
Token very
Token ID845
Feature activation-1.50e+00

Pos logit contributions
 HUN+2.425
 MARK+2.365
 sen+2.355
 COP+2.334
 Mond+2.332
Neg logit contributions
bians-2.814
align-2.801
isel-2.761
imeters-2.759
male-2.748
Token ironic
Token ID25304
Feature activation+1.80e-01

Pos logit contributions
 HUN+0.986
 MARK+0.970
 COP+0.955
 Mond+0.952
 sen+0.952
Neg logit contributions
bians-1.061
align-1.057
isel-1.039
male-1.036
handler-1.029
Token,
Token ID11
Feature activation-1.20e+00

Pos logit contributions
bians+0.130
align+0.129
isel+0.127
male+0.127
imeters+0.126
Neg logit contributions
 HUN-0.116
 MARK-0.114
 COP-0.112
 Mond-0.111
 sen-0.111
Token of
Token ID286
Feature activation-9.85e-01

Pos logit contributions
 HUN+0.775
 MARK+0.761
 COP+0.749
 sen+0.747
 Mond+0.747
Neg logit contributions
bians-0.855
align-0.851
isel-0.838
male-0.836
handler-0.829
Token\n
Token ID198
Feature activation-1.54e+00

Pos logit contributions
bians+0.036
align+0.036
isel+0.035
male+0.035
handler+0.035
Neg logit contributions
 HUN-0.044
 MARK-0.043
 sen-0.043
 COP-0.043
 surv-0.043
Token\n
Token ID198
Feature activation-1.28e-01

Pos logit contributions
 HUN+0.991
 MARK+0.973
 COP+0.958
 sen+0.957
 Mond+0.956
Neg logit contributions
bians-1.108
align-1.104
isel-1.087
male-1.083
handler-1.077
TokenThe
Token ID464
Feature activation-4.36e-01

Pos logit contributions
 HUN+0.099
 MARK+0.098
 COP+0.096
 sen+0.096
 Mond+0.096
Neg logit contributions
align-0.076
bians-0.076
male-0.074
isel-0.074
handler-0.073
Token Aut
Token ID5231
Feature activation-1.12e+00

Pos logit contributions
 HUN+0.293
 MARK+0.289
 sen+0.283
 COP+0.283
 Mond+0.283
Neg logit contributions
bians-0.298
align-0.297
isel-0.293
male-0.291
imeters-0.289
Tokenon
Token ID261
Feature activation-3.21e-03

Pos logit contributions
 HUN+0.994
 MARK+0.981
 COP+0.970
 sen+0.970
 Mond+0.969
Neg logit contributions
bians-0.529
align-0.528
isel-0.517
male-0.512
handler-0.508
Tokenav
Token ID615
Feature activation-4.42e+00

Pos logit contributions
 HUN+0.002
 MARK+0.002
 sen+0.002
 COP+0.002
 Mond+0.002
Neg logit contributions
bians-0.002
align-0.002
isel-0.002
male-0.002
imeters-0.002
Token system
Token ID1080
Feature activation-4.31e-01

Pos logit contributions
 HUN+2.810
 MARK+2.790
 COP+2.737
 Mond+2.724
 sen+2.722
Neg logit contributions
bians-3.177
align-3.149
isel-3.105
male-3.091
handler-3.080
Token required
Token ID2672
Feature activation+5.18e-01

Pos logit contributions
 HUN+0.287
 MARK+0.282
 COP+0.277
 sen+0.277
 Mond+0.276
Neg logit contributions
bians-0.301
align-0.300
isel-0.296
male-0.293
handler-0.292
Token occasional
Token ID12209
Feature activation+1.62e-01

Pos logit contributions
bians+0.362
align+0.361
isel+0.356
male+0.353
imeters+0.350
Neg logit contributions
 HUN-0.341
 MARK-0.336
 Mond-0.329
 COP-0.329
 sen-0.329
Token manual
Token ID10107
Feature activation-1.04e+00

Pos logit contributions
bians+0.111
align+0.111
isel+0.109
male+0.108
imeters+0.107
Neg logit contributions
 HUN-0.109
 MARK-0.108
 Mond-0.106
 COP-0.106
 Coca-0.105
Token corrections
Token ID26251
Feature activation-1.27e+00

Pos logit contributions
 HUN+0.731
 MARK+0.720
 COP+0.708
 Mond+0.707
 sen+0.706
Neg logit contributions
bians-0.690
align-0.689
isel-0.678
male-0.673
imeters-0.669
Token\n
Token ID198
Feature activation+1.39e+00

Pos logit contributions
 HUN+0.106
 surv+0.104
 MARK+0.103
 sen+0.103
 Mond+0.102
Neg logit contributions
align-0.102
bians-0.101
male-0.101
handler-0.100
female-0.099
Token
Token ID447
Feature activation-1.04e+00

Pos logit contributions
align+0.862
bians+0.853
male+0.851
handler+0.833
female+0.832
Neg logit contributions
 HUN-1.032
 MARK-1.020
 sen-1.009
 Mond-1.005
 COP-1.002
Token
Token ID246
Feature activation-1.63e+00

Pos logit contributions
 HUN+0.504
 MARK+0.492
 sen+0.485
 COP+0.483
 Mond+0.479
Neg logit contributions
bians-0.918
align-0.913
isel-0.904
male-0.898
handler-0.897
TokenAs
Token ID1722
Feature activation-3.64e-01

Pos logit contributions
 HUN+1.296
 MARK+1.278
 COP+1.263
 sen+1.262
 Mond+1.261
Neg logit contributions
bians-0.931
align-0.929
isel-0.908
male-0.905
handler-0.897
Token we
Token ID356
Feature activation-1.99e+00

Pos logit contributions
 HUN+0.235
 MARK+0.232
 COP+0.227
 Mond+0.227
 sen+0.226
Neg logit contributions
bians-0.261
align-0.260
isel-0.256
male-0.255
 lesbians-0.254
Token dated
Token ID14567
Feature activation-5.13e+00

Pos logit contributions
 HUN+1.306
 MARK+1.285
 COP+1.268
 Mond+1.265
 Coca+1.259
Neg logit contributions
bians-1.406
align-1.398
isel-1.379
male-1.373
imeters-1.362
Token we
Token ID356
Feature activation-8.67e-01

Pos logit contributions
 HUN+3.178
 sen+3.117
 MARK+3.113
 Mond+3.073
 COP+3.069
Neg logit contributions
align-3.760
bians-3.752
isel-3.707
volt-3.688
handler-3.661
Token shared
Token ID4888
Feature activation-2.33e+00

Pos logit contributions
 HUN+0.563
 MARK+0.553
 COP+0.545
 Mond+0.544
 Coca+0.541
Neg logit contributions
bians-0.623
align-0.618
isel-0.609
male-0.607
imeters-0.602
Token our
Token ID674
Feature activation-2.22e+00

Pos logit contributions
 HUN+1.505
 MARK+1.477
 COP+1.457
 sen+1.457
 Mond+1.454
Neg logit contributions
bians-1.665
align-1.660
isel-1.634
male-1.627
handler-1.618
Token vision
Token ID5761
Feature activation-2.90e+00

Pos logit contributions
 HUN+1.402
 MARK+1.377
 COP+1.355
 sen+1.354
 Mond+1.353
Neg logit contributions
bians-1.627
align-1.620
isel-1.597
male-1.590
imeters-1.581
Token for
Token ID329
Feature activation-2.60e+00

Pos logit contributions
 HUN+1.888
 MARK+1.855
 COP+1.838
 sen+1.837
 Mond+1.823
Neg logit contributions
bians-2.061
align-2.035
isel-2.007
male-2.003
volt-1.988