TOP ACTIVATIONS
MAX = 18.866

right was a true rock n roll pioneer .
org . Award nominations are accepted year - round ; the
th birthday , the Mary Mc Leod Beth une Memorial ,
R OB ERT P OR ATH B oulder
, to A ils a Craig , St K ilda and
high - level of intellectual rig or . But they surrounded
the command , sl s invoke - f test . If
. However , gender non con forming ( G NC )
Reagan shouldn t run for re - election .
2 3 4 5 6 7 8 9 10 11 12
2 3 4 5 6 7 | library | version |
. EC ON OM ICS AND TR ADE
( Photo : KY LE GR AN TH AM / THE
months from the M auna Lo a series of [ CO
already occurred , Gol ight ly - How ell
First Strike within days . Fin ishing The Flood perhaps will
was designated as the Mary Mc Leod Beth une Council House
Music Magazine , Cal um MacDonald , who is the author
enjoyed by all ," Ian Fin nery wrote on the group
2 3 4 5 6 7 8 void vert ( in

MIN ACTIVATIONS
MIN = -16.658

they 've found a ' p reg nant ' T .
, your page may show unp ars ed expressions to the
Bour dain in public p iss ing off one of
made modeling of p uls ating combustion more realistic
( also known as " P or oker at osis punct
a lot of something ( p oth oles in New Jersey
, unf il tered , unp aste ur ised , and
et plant aris ," " P unct ate ker at oder
that she is p roud of our work to
uese ) and Henry ( P yr g os ) on
et plant aris ," " P unct ate ker at oder
ache and a bit of Paul Newman about him . Coll
vot re f unt p ère , cr é ateur
we have seen when we unp acked the norm als ,
pedd led lofty promises and p atted himself on the back
addiction and abortion on " P enny roy al Tea ",
membership would be a " p unt ". The
John T . Anderson . Vice Chairman : Richard K .
pr ide fully and unp rompt ed he said
mask ed . com and V odka O ' Clock Podcast

INTERVAL 9.985 - 18.866
CONTAINS 0.050%

. However , gender non con forming ( G NC )
mark recently , according to Mac St ories . net .
. ( Photo By Bill Clark / C Q Roll Call
questions was just coincidence , natural occurrence or mis ident ification
that Hillary Clinton s aide , Huma Abedin ,

INTERVAL 1.104 - 9.985
CONTAINS 33.250%

to draw something that was more interesting than anything you could
of all income classes Federal Communications Commission chairman Aj it
soccer for good when he was about 13 .
now seems somehow cumbersome and archaic -- or , worse ,
into Street Fighter V . Which fighting style do you think

INTERVAL -7.777 - 1.104
CONTAINS 66.286%

Apparently she was originally inspired by her love of Japanese anime
trying to support , chances are you know whether they prefer
Do you want Catalonia to become a state ?
the music . Bernard Her r mann 's score for Psych
photographs .) So I might as well make them useful .

INTERVAL -16.658 - -7.777
CONTAINS 0.414%

ik ki M ä en p ä ä . It was
and other recording accidents such as Casey Kas em ruining
heads , but that only postp ones the day of reckoning
confident fighter . Thank you as well to Lawrence
ason [ 10 : 47 ] < l az eed ave
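
The four interval boundaries listed above are consistent with splitting the range from MIN = -16.658 to MAX = 18.866 into four equal-width bins. The dump does not state how the intervals were chosen, so this is an assumption. A minimal sketch of that binning follows; the helper names `equal_width_edges` and `bin_percentages` are hypothetical and do not come from the tool that produced this output.

```python
from bisect import bisect_right


def equal_width_edges(lo, hi, n_bins):
    """Return n_bins + 1 evenly spaced edges from lo to hi (assumed scheme)."""
    width = (hi - lo) / n_bins
    # Use hi itself as the final edge to avoid floating-point drift.
    return [lo + i * width for i in range(n_bins)] + [hi]


def bin_percentages(values, edges):
    """Percentage of in-range values per bin.

    Bins are half-open [edge_i, edge_{i+1}); the last bin also
    includes its upper edge so the maximum value is counted.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v < edges[0] or v > edges[-1]:
            continue  # skip values outside the overall range
        idx = min(bisect_right(edges, v) - 1, len(counts) - 1)
        counts[idx] += 1
    total = sum(counts)
    return [100.0 * c / total if total else 0.0 for c in counts]


# Edges implied by the MIN and MAX values reported in this dump.
edges = equal_width_edges(-16.658, 18.866, 4)
print([round(e, 3) for e in edges])
# prints [-16.658, -7.777, 1.104, 9.985, 18.866]
```

The CONTAINS percentages in the dump (0.050%, 33.250%, 66.286%, 0.414%) sum to 100%, which is what `bin_percentages` would return if given the full set of activations. Those activations are not included here, so the sketch only reproduces the edges.
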
Token right
Token ID826
Feature activation+2.22e+00

Pos logit contributions
inery+2.277
alogue+2.206
+2.040
ISE+2.040
arte+2.011
Neg logit contributions
oppable-2.229
 Discrimination-2.217
 Pog-2.143
onder-2.115
sov-2.111
Token was
Token ID373
Feature activation-6.49e-01

Pos logit contributions
inery+1.707
alogue+1.652
+1.551
ISE+1.546
arte+1.523
Neg logit contributions
oppable-1.396
 Discrimination-1.393
 Pog-1.338
sov-1.315
onder-1.314
Token a
Token ID257
Feature activation-1.64e+00

Pos logit contributions
oppable+0.394
 Discrimination+0.391
 Pog+0.382
onder+0.371
sov+0.370
Neg logit contributions
inery-0.517
alogue-0.500
ISE-0.468
-0.466
arte-0.460
Token true
Token ID2081
Feature activation-2.38e+00

Pos logit contributions
oppable+0.909
 Discrimination+0.902
 Pog+0.877
sov+0.853
onder+0.853
Neg logit contributions
inery-1.397
alogue-1.358
ISE-1.271
-1.270
nces-1.253
Token rock
Token ID3881
Feature activation+5.82e+00

Pos logit contributions
oppable+1.214
 Discrimination+1.213
 Pog+1.178
sov+1.140
onder+1.138
Neg logit contributions
inery-2.119
alogue-2.065
-1.941
ISE-1.940
nces-1.917
Token n
Token ID299
Feature activation+1.89e+01

Pos logit contributions
inery+5.038
alogue+4.925
arte+4.597
+4.564
ISE+4.557
Neg logit contributions
 Discrimination-3.013
oppable-2.960
 Depot-2.807
 Pog-2.803
sov-2.750
Token
Token ID447
Feature activation+4.25e+00

Pos logit contributions
inery+9.456
alogue+9.067
ISE+8.434
onna+8.276
emark+8.188
Neg logit contributions
 Discrimination-13.465
oppable-13.210
Dialog-13.060
sov-12.965
Percent-12.937
Token
Token ID247
Feature activation+2.46e+00

Pos logit contributions
inery+2.149
alogue+2.027
+1.845
ISE+1.816
arte+1.777
Neg logit contributions
 Discrimination-3.755
oppable-3.737
 Pog-3.696
sov-3.654
onder-3.618
Token roll
Token ID4836
Feature activation+2.32e+00

Pos logit contributions
inery+1.765
alogue+1.705
+1.575
ISE+1.572
arte+1.556
Neg logit contributions
 Discrimination-1.677
oppable-1.676
 Pog-1.625
sov-1.595
onder-1.590
Token pioneer
Token ID29570
Feature activation+4.72e-01

Pos logit contributions
inery+2.162
alogue+2.102
+1.990
ISE+1.984
arte+1.967
Neg logit contributions
 Discrimination-1.076
oppable-1.076
 Pog-1.024
sov-1.000
onder-0.994
Token.
Token ID13
Feature activation+5.29e-01

Pos logit contributions
inery+0.390
alogue+0.378
ISE+0.355
+0.355
arte+0.349
Neg logit contributions
oppable-0.272
 Discrimination-0.271
 Pog-0.262
onder-0.256
sov-0.255
Tokenorg
Token ID2398
Feature activation+1.68e+00

Pos logit contributions
oppable+1.812
 Discrimination+1.776
onder+1.770
sov+1.724
 Pog+1.711
Neg logit contributions
inery-4.743
alogue-4.654
-4.434
ISE-4.408
nces-4.396
Token.
Token ID13
Feature activation+3.23e+00

Pos logit contributions
inery+1.361
alogue+1.316
ISE+1.246
+1.241
arte+1.224
Neg logit contributions
oppable-0.988
 Discrimination-0.977
 Pog-0.949
sov-0.932
onder-0.926
Token Award
Token ID11289
Feature activation+2.89e+00

Pos logit contributions
inery+2.294
alogue+2.203
ISE+2.063
+2.059
arte+2.026
Neg logit contributions
oppable-2.220
 Discrimination-2.175
 Pog-2.128
sov-2.101
onder-2.100
Token nominations
Token ID30994
Feature activation+9.69e+00

Pos logit contributions
inery+1.591
alogue+1.498
+1.386
ISE+1.368
arte+1.345
Neg logit contributions
oppable-2.446
 Discrimination-2.416
 Pog-2.361
sov-2.336
onder-2.330
Token are
Token ID389
Feature activation+2.29e+00

Pos logit contributions
inery+5.461
alogue+5.184
arte+4.828
ISE+4.783
+4.668
Neg logit contributions
 Discrimination-7.477
oppable-7.457
sov-7.277
Dialog-7.197
-7.170
Token accepted
Token ID6292
Feature activation+1.43e+01

Pos logit contributions
inery+1.959
alogue+1.899
ISE+1.799
+1.791
arte+1.773
Neg logit contributions
oppable-1.246
 Discrimination-1.221
 Pog-1.165
sov-1.152
onder-1.146
Token year
Token ID614
Feature activation+1.13e+01

Pos logit contributions
inery+5.917
alogue+5.124
+4.834
emark+4.809
ISE+4.761
Neg logit contributions
 Discrimination-12.387
oppable-12.332
Dialog-12.195
-11.930
sov-11.744
Token-
Token ID12
Feature activation+6.53e+00

Pos logit contributions
inery+6.323
alogue+6.048
sle+5.400
arte+5.338
ISE+5.296
Neg logit contributions
oppable-8.901
 Discrimination-8.713
-8.470
 Pog-8.214
 Citiz-8.204
Tokenround
Token ID744
Feature activation+6.61e+00

Pos logit contributions
inery+3.725
alogue+3.644
arte+3.294
sle+3.218
emark+3.170
Neg logit contributions
 Discrimination-5.408
oppable-5.192
 Pog-5.162
-5.087
 Depot-5.081
Token;
Token ID26
Feature activation+3.81e+00

Pos logit contributions
inery+3.907
alogue+3.730
ISE+3.493
+3.455
nces+3.371
Neg logit contributions
oppable-5.099
 Discrimination-5.072
 Pog-4.915
sov-4.895
 Depot-4.802
Token the
Token ID262
Feature activation+6.91e+00

Pos logit contributions
inery+2.678
alogue+2.582
+2.408
ISE+2.392
arte+2.360
Neg logit contributions
oppable-2.611
 Discrimination-2.594
 Pog-2.511
sov-2.502
onder-2.467
Tokenth
Token ID400
Feature activation+3.40e+00

Pos logit contributions
inery+4.319
alogue+4.235
ISE+4.007
+3.986
arte+3.935
Neg logit contributions
oppable-2.449
 Discrimination-2.380
 Pog-2.282
sov-2.231
onder-2.214
Token birthday
Token ID10955
Feature activation+2.01e+00

Pos logit contributions
inery+2.684
alogue+2.599
+2.443
ISE+2.442
arte+2.398
Neg logit contributions
oppable-2.073
 Discrimination-2.044
 Pog-1.978
onder-1.955
sov-1.951
Token,
Token ID11
Feature activation+1.23e+00

Pos logit contributions
inery+1.569
alogue+1.515
+1.423
ISE+1.421
arte+1.397
Neg logit contributions
oppable-1.246
 Discrimination-1.233
 Pog-1.191
sov-1.178
onder-1.173
Token the
Token ID262
Feature activation+6.26e-01

Pos logit contributions
inery+0.864
alogue+0.834
+0.774
ISE+0.774
arte+0.760
Neg logit contributions
oppable-0.862
 Discrimination-0.860
 Pog-0.836
onder-0.821
sov-0.818
Token Mary
Token ID5335
Feature activation+3.86e-01

Pos logit contributions
inery+0.490
alogue+0.475
+0.445
ISE+0.444
nces+0.436
Neg logit contributions
 Discrimination-0.387
oppable-0.385
 Pog-0.376
onder-0.366
 Depot-0.365
Token Mc
Token ID1982
Feature activation+1.40e+01

Pos logit contributions
inery+0.286
alogue+0.278
+0.260
ISE+0.259
arte+0.253
Neg logit contributions
 Discrimination-0.255
oppable-0.252
 Pog-0.247
onder-0.243
 Depot-0.241
TokenLeod
Token ID44559
Feature activation+8.72e+00

Pos logit contributions
inery+8.716
alogue+8.598
arte+8.045
ISE+7.918
ENA+7.892
Neg logit contributions
oppable-9.331
-9.147
 Discrimination-9.133
 misunderstood-8.914
sov-8.865
Token Beth
Token ID19223
Feature activation+8.89e+00

Pos logit contributions
inery+6.297
alogue+6.006
ISE+5.676
arte+5.640
+5.584
Neg logit contributions
oppable-5.788
 Discrimination-5.429
sov-5.334
-5.310
onder-5.309
Tokenune
Token ID1726
Feature activation+9.54e+00

Pos logit contributions
inery+6.918
alogue+6.626
arte+6.239
ISE+6.031
emark+6.003
Neg logit contributions
oppable-5.324
 Discrimination-5.268
-5.181
 Pog-5.139
Dialog-4.967
Token Memorial
Token ID14861
Feature activation+5.19e+00

Pos logit contributions
inery+6.780
alogue+6.450
arte+6.079
ISE+6.049
+6.040
Neg logit contributions
oppable-6.314
 Discrimination-5.845
sov-5.844
-5.814
onder-5.766
Token,
Token ID11
Feature activation+2.80e+00

Pos logit contributions
inery+3.791
alogue+3.632
+3.406
ISE+3.405
arte+3.388
Neg logit contributions
oppable-3.429
 Discrimination-3.330
sov-3.258
-3.213
 Pog-3.209
Token
Token ID198
Feature activation+1.08e-01

Pos logit contributions
inery+1.254
alogue+1.207
ISE+1.131
+1.128
arte+1.110
Neg logit contributions
oppable-1.336
 Discrimination-1.312
 Pog-1.273
onder-1.251
sov-1.245
TokenR
Token ID49
Feature activation+3.02e+00

Pos logit contributions
inery+0.084
alogue+0.081
+0.076
ISE+0.075
nces+0.075
Neg logit contributions
 Discrimination-0.067
oppable-0.067
 Pog-0.065
onder-0.064
sov-0.064
TokenOB
Token ID9864
Feature activation+8.49e+00

Pos logit contributions
inery+2.209
alogue+2.128
ISE+2.002
+1.973
arte+1.960
Neg logit contributions
 Discrimination-2.021
oppable-2.017
 Pog-1.954
sov-1.919
onder-1.904
TokenERT
Token ID17395
Feature activation-4.52e-01

Pos logit contributions
inery+4.175
ISE+3.860
alogue+3.850
arte+3.607
ENA+3.579
Neg logit contributions
oppable-7.612
 Discrimination-7.369
-7.188
onder-7.126
sov-7.093
Token P
Token ID350
Feature activation-5.70e+00

Pos logit contributions
oppable+0.328
 Discrimination+0.328
 Pog+0.320
onder+0.313
sov+0.313
Neg logit contributions
inery-0.304
alogue-0.293
-0.270
ISE-0.270
arte-0.266
TokenOR
Token ID1581
Feature activation+1.39e+01

Pos logit contributions
 Pog+4.272
 Discrimination+4.141
onder+4.123
 Rib+4.059
Dialog+4.059
Neg logit contributions
inery-3.602
alogue-3.536
nces-3.263
-3.066
emark-3.063
TokenATH
Token ID12599
Feature activation+1.72e+00

Pos logit contributions
ISE+7.535
ENA+7.184
inery+7.004
+6.585
USE+6.487
Neg logit contributions
oppable-11.240
 Discrimination-11.041
-10.213
bleacher-10.151
 Rebell-10.054
Token
Token ID198
Feature activation-2.40e+00

Pos logit contributions
inery+1.297
alogue+1.255
+1.170
ISE+1.166
arte+1.151
Neg logit contributions
oppable-1.117
 Discrimination-1.115
 Pog-1.084
onder-1.061
sov-1.057
Token
Token ID198
Feature activation-2.18e+00

Pos logit contributions
 Discrimination+1.481
oppable+1.450
 Pog+1.438
onder+1.412
sov+1.411
Neg logit contributions
inery-1.866
alogue-1.803
nces-1.677
-1.670
ISE-1.666
TokenB
Token ID33
Feature activation-2.92e+00

Pos logit contributions
 Discrimination+1.326
oppable+1.315
 Pog+1.289
Dialog+1.286
onder+1.277
Neg logit contributions
inery-1.701
alogue-1.641
nces-1.524
-1.511
arte-1.504
Tokenoulder
Token ID17601
Feature activation+2.94e+00

Pos logit contributions
oppable+2.109
 Discrimination+2.060
onder+2.048
 Pog+2.018
sov+1.994
Neg logit contributions
inery-1.926
alogue-1.881
-1.755
nces-1.707
ISE-1.697
Token,
Token ID11
Feature activation-5.18e-01

Pos logit contributions
inery+2.046
alogue+1.977
+1.851
ISE+1.846
arte+1.827
Neg logit contributions
oppable-1.740
 Discrimination-1.721
 Pog-1.655
sov-1.636
onder-1.634
Token to
Token ID284
Feature activation+1.38e+00

Pos logit contributions
 Discrimination+0.315
oppable+0.315
 Pog+0.308
onder+0.298
sov+0.297
Neg logit contributions
inery-0.411
alogue-0.398
ISE-0.373
-0.371
arte-0.366
Token A
Token ID317
Feature activation-2.09e+00

Pos logit contributions
inery+1.079
alogue+1.044
ISE+0.980
+0.979
arte+0.965
Neg logit contributions
oppable-0.850
 Discrimination-0.843
 Pog-0.813
onder-0.802
sov-0.800
Tokenils
Token ID4487
Feature activation-2.55e+00

Pos logit contributions
oppable+1.340
 Discrimination+1.323
 Pog+1.295
onder+1.278
sov+1.277
Neg logit contributions
inery-1.576
alogue-1.524
-1.445
ISE-1.428
arte-1.396
Tokena
Token ID64
Feature activation+2.57e+00

Pos logit contributions
 Discrimination+1.623
oppable+1.617
 Pog+1.587
onder+1.575
sov+1.551
Neg logit contributions
inery-1.918
alogue-1.862
-1.768
ISE-1.763
arte-1.696
Token Craig
Token ID13854
Feature activation+1.38e+01

Pos logit contributions
inery+1.980
alogue+1.915
ISE+1.789
+1.780
arte+1.774
Neg logit contributions
oppable-1.606
 Discrimination-1.598
 Pog-1.535
sov-1.511
onder-1.506
Token,
Token ID11
Feature activation+2.92e+00

Pos logit contributions
inery+8.828
alogue+8.400
arte+7.984
emark+7.796
+7.705
Neg logit contributions
oppable-9.014
 Discrimination-8.956
-8.739
Dialog-8.655
Percent-8.546
Token St
Token ID520
Feature activation-1.83e+00

Pos logit contributions
inery+2.256
alogue+2.183
+2.052
ISE+2.045
arte+2.024
Neg logit contributions
oppable-1.807
 Discrimination-1.782
 Pog-1.712
onder-1.704
sov-1.702
Token K
Token ID509
Feature activation+8.90e+00

Pos logit contributions
oppable+1.241
 Discrimination+1.222
 Pog+1.210
onder+1.185
 Depot+1.172
Neg logit contributions
inery-1.318
alogue-1.271
-1.207
ISE-1.197
nces-1.166
Tokenilda
Token ID27281
Feature activation+4.30e-01

Pos logit contributions
inery+2.752
alogue+2.450
arte+1.975
nces+1.954
ISE+1.913
Neg logit contributions
 Discrimination-9.793
oppable-9.543
 Pog-9.446
-9.439
Dialog-9.374
Token and
Token ID290
Feature activation+8.15e-01

Pos logit contributions
inery+0.341
alogue+0.331
+0.310
ISE+0.310
arte+0.305
Neg logit contributions
oppable-0.261
 Discrimination-0.260
 Pog-0.252
onder-0.246
sov-0.246
Token high
Token ID1029
Feature activation+3.79e+00

Pos logit contributions
inery+2.942
alogue+2.811
+2.660
ISE+2.649
arte+2.594
Neg logit contributions
oppable-2.333
 Discrimination-2.302
sov-2.180
 Pog-2.173
onder-2.167
Token-
Token ID12
Feature activation+2.59e+00

Pos logit contributions
inery+2.860
alogue+2.762
ISE+2.564
+2.563
arte+2.560
Neg logit contributions
oppable-2.441
 Discrimination-2.409
 Pog-2.321
sov-2.307
onder-2.275
Tokenlevel
Token ID5715
Feature activation+1.09e+00

Pos logit contributions
inery+1.708
alogue+1.649
ISE+1.513
+1.511
arte+1.504
Neg logit contributions
oppable-1.906
 Discrimination-1.901
 Pog-1.846
 Depot-1.819
sov-1.815
Token of
Token ID286
Feature activation+2.67e+00

Pos logit contributions
inery+0.817
alogue+0.790
+0.737
ISE+0.737
arte+0.725
Neg logit contributions
oppable-0.710
 Discrimination-0.707
 Pog-0.686
onder-0.673
sov-0.671
Token intellectual
Token ID9028
Feature activation+6.44e+00

Pos logit contributions
inery+1.962
alogue+1.889
ISE+1.767
+1.759
arte+1.735
Neg logit contributions
oppable-1.779
 Discrimination-1.750
 Pog-1.695
sov-1.681
onder-1.678
Token rig
Token ID7805
Feature activation+1.37e+01

Pos logit contributions
inery+4.496
alogue+4.247
ISE+3.993
+3.945
nces+3.878
Neg logit contributions
oppable-4.524
 Discrimination-4.389
 Pog-4.249
sov-4.247
onder-4.205
Tokenor
Token ID273
Feature activation+6.07e+00

Pos logit contributions
inery+11.518
alogue+11.175
arte+10.799
nces+10.596
+10.575
Neg logit contributions
oppable-6.613
 Discrimination-6.434
sov-6.175
 Depot-6.066
 Pog-5.937
Token.
Token ID13
Feature activation-1.95e+00

Pos logit contributions
inery+4.550
alogue+4.350
ISE+4.082
+4.020
nces+3.996
Neg logit contributions
oppable-3.867
 Discrimination-3.859
 Pog-3.715
sov-3.683
 Depot-3.630
Token But
Token ID887
Feature activation-2.59e-01

Pos logit contributions
 Discrimination+1.259
 Pog+1.217
oppable+1.201
 Depot+1.181
 Rib+1.164
Neg logit contributions
inery-1.504
alogue-1.467
-1.356
ISE-1.350
nces-1.340
Token they
Token ID484
Feature activation+2.29e+00

Pos logit contributions
 Discrimination+0.163
oppable+0.163
 Pog+0.159
onder+0.155
 Depot+0.154
Neg logit contributions
inery-0.199
alogue-0.193
ISE-0.180
-0.180
arte-0.177
Token surrounded
Token ID11191
Feature activation-1.06e+00

Pos logit contributions
inery+1.719
alogue+1.662
+1.553
ISE+1.552
arte+1.532
Neg logit contributions
oppable-1.484
 Discrimination-1.481
 Pog-1.434
sov-1.403
onder-1.401
Token the
Token ID262
Feature activation+6.74e+00

Pos logit contributions
inery+3.812
alogue+3.676
ISE+3.446
+3.437
nces+3.363
Neg logit contributions
oppable-3.537
 Discrimination-3.510
 Pog-3.376
onder-3.357
sov-3.339
Token command
Token ID3141
Feature activation+7.84e+00

Pos logit contributions
inery+4.834
alogue+4.675
ISE+4.388
+4.343
arte+4.261
Neg logit contributions
oppable-4.477
 Discrimination-4.423
 Pog-4.258
onder-4.216
sov-4.205
Token,
Token ID11
Feature activation+4.65e+00

Pos logit contributions
inery+5.566
alogue+5.384
ISE+5.043
+5.020
nces+4.914
Neg logit contributions
oppable-5.248
 Discrimination-5.130
 Pog-4.932
onder-4.856
sov-4.820
Token sl
Token ID1017
Feature activation+2.94e+00

Pos logit contributions
inery+3.461
alogue+3.342
ISE+3.131
+3.125
arte+3.060
Neg logit contributions
oppable-3.001
 Discrimination-2.987
 Pog-2.876
onder-2.839
sov-2.816
Tokens
Token ID82
Feature activation+2.18e+00

Pos logit contributions
inery+2.042
alogue+1.963
+1.820
ISE+1.815
nces+1.788
Neg logit contributions
 Discrimination-2.077
oppable-2.051
 Pog-1.998
onder-1.956
sov-1.955
Token invoke
Token ID26342
Feature activation+1.36e+01

Pos logit contributions
inery+1.645
alogue+1.590
+1.485
ISE+1.483
arte+1.461
Neg logit contributions
oppable-1.409
 Discrimination-1.403
 Pog-1.363
onder-1.336
sov-1.332
Token -
Token ID532
Feature activation+1.53e+00

Pos logit contributions
inery+9.097
alogue+8.888
ISE+8.405
emark+8.281
arte+8.186
Neg logit contributions
oppable-8.518
 Discrimination-8.334
-8.030
sov-7.940
Dialog-7.911
Tokenf
Token ID69
Feature activation+6.20e+00

Pos logit contributions
inery+1.158
alogue+1.120
+1.048
ISE+1.043
arte+1.028
Neg logit contributions
oppable-0.992
 Discrimination-0.986
 Pog-0.959
onder-0.941
sov-0.939
Token test
Token ID1332
Feature activation+6.07e+00

Pos logit contributions
inery+4.431
alogue+4.295
ISE+3.992
+3.953
arte+3.904
Neg logit contributions
oppable-4.156
 Discrimination-4.129
 Pog-3.990
sov-3.919
onder-3.893
Token.
Token ID764
Feature activation+1.98e+00

Pos logit contributions
inery+4.232
alogue+4.084
ISE+3.808
+3.801
arte+3.755
Neg logit contributions
oppable-4.141
 Discrimination-4.138
 Pog-4.008
sov-3.930
onder-3.898
Token If
Token ID1002
Feature activation+1.34e-02

Pos logit contributions
inery+1.416
alogue+1.366
+1.271
ISE+1.271
arte+1.249
Neg logit contributions
oppable-1.355
 Discrimination-1.349
 Pog-1.311
onder-1.287
sov-1.284
Token.
Token ID13
Feature activation-4.15e-01

Pos logit contributions
inery+1.665
alogue+1.614
+1.504
ISE+1.501
arte+1.473
Neg logit contributions
oppable-1.527
 Discrimination-1.512
 Pog-1.468
onder-1.444
sov-1.441
Token However
Token ID2102
Feature activation+3.84e+00

Pos logit contributions
 Discrimination+0.278
oppable+0.270
 Pog+0.266
 Depot+0.260
onder+0.259
Neg logit contributions
inery-0.309
alogue-0.300
-0.277
ISE-0.277
nces-0.274
Token,
Token ID11
Feature activation+1.75e+00

Pos logit contributions
inery+2.861
alogue+2.755
ISE+2.597
+2.584
arte+2.538
Neg logit contributions
oppable-2.484
 Discrimination-2.448
 Pog-2.405
sov-2.367
onder-2.344
Token gender
Token ID5279
Feature activation+3.49e-01

Pos logit contributions
inery+1.329
alogue+1.286
+1.202
ISE+1.202
arte+1.183
Neg logit contributions
oppable-1.119
 Discrimination-1.112
 Pog-1.079
onder-1.059
sov-1.056
Token non
Token ID1729
Feature activation+3.95e+00

Pos logit contributions
inery+0.337
alogue+0.328
+0.311
ISE+0.311
arte+0.308
Neg logit contributions
 Discrimination-0.151
oppable-0.150
 Pog-0.143
onder-0.139
sov-0.138
Tokencon
Token ID1102
Feature activation+1.33e+01

Pos logit contributions
inery+3.878
alogue+3.780
+3.578
ISE+3.572
nces+3.541
Neg logit contributions
oppable-1.668
 Discrimination-1.597
 Pog-1.566
 Depot-1.507
sov-1.470
Tokenforming
Token ID15464
Feature activation+2.24e+00

Pos logit contributions
inery+8.445
alogue+8.035
+7.465
ISE+7.321
nces+7.300
Neg logit contributions
oppable-9.256
 Pog-9.034
 Discrimination-8.955
-8.832
 Ü-8.717
Token (
Token ID357
Feature activation-1.43e+00

Pos logit contributions
inery+1.513
alogue+1.458
+1.355
ISE+1.354
arte+1.327
Neg logit contributions
oppable-1.617
 Discrimination-1.601
 Pog-1.559
onder-1.534
sov-1.534
TokenG
Token ID38
Feature activation-6.54e+00

Pos logit contributions
 Discrimination+0.862
oppable+0.857
 Pog+0.833
onder+0.815
sov+0.814
Neg logit contributions
inery-1.141
alogue-1.105
-1.026
ISE-1.021
arte-1.018
TokenNC
Token ID7792
Feature activation+2.19e+00

Pos logit contributions
oppable+2.560
 Discrimination+2.547
 Pog+2.481
onder+2.467
Dialog+2.398
Neg logit contributions
inery-6.292
alogue-6.099
-5.801
emark-5.741
nces-5.715
Token)
Token ID8
Feature activation-8.20e-01

Pos logit contributions
inery+1.734
alogue+1.678
ISE+1.584
+1.579
arte+1.542
Neg logit contributions
oppable-1.325
 Discrimination-1.311
 Pog-1.270
onder-1.246
sov-1.245
Token Reagan
Token ID12823
Feature activation+7.45e+00

Pos logit contributions
inery+2.312
alogue+2.226
+2.085
ISE+2.084
arte+2.048
Neg logit contributions
oppable-2.035
 Discrimination-2.010
 Pog-1.948
onder-1.926
sov-1.917
Token shouldn
Token ID6584
Feature activation+4.03e+00

Pos logit contributions
inery+5.255
alogue+4.993
ISE+4.747
+4.736
arte+4.664
Neg logit contributions
oppable-4.924
 Discrimination-4.905
 Pog-4.740
Dialog-4.626
sov-4.619
Token
Token ID447
Feature activation+2.17e+00

Pos logit contributions
inery+2.682
alogue+2.568
+2.435
ISE+2.406
arte+2.381
Neg logit contributions
 Discrimination-2.925
oppable-2.919
 Pog-2.853
 Depot-2.761
onder-2.759
Token
Token ID247
Feature activation+2.21e+00

Pos logit contributions
inery+1.090
alogue+1.035
+0.934
ISE+0.930
arte+0.908
Neg logit contributions
oppable-1.944
 Discrimination-1.942
 Pog-1.900
onder-1.869
sov-1.868
Tokent
Token ID83
Feature activation+2.41e+00

Pos logit contributions
inery+1.834
alogue+1.778
ISE+1.672
+1.671
arte+1.650
Neg logit contributions
oppable-1.258
 Discrimination-1.254
 Pog-1.208
onder-1.180
sov-1.177
Token run
Token ID1057
Feature activation+1.31e+01

Pos logit contributions
inery+1.723
alogue+1.664
+1.549
ISE+1.546
arte+1.521
Neg logit contributions
oppable-1.656
 Discrimination-1.651
 Pog-1.602
onder-1.571
sov-1.568
Token for
Token ID329
Feature activation+3.89e+00

Pos logit contributions
inery+8.890
alogue+8.685
+8.347
ISE+8.070
nces+7.997
Neg logit contributions
oppable-8.066
 Discrimination-7.862
Dialog-7.525
 Pog-7.514
onder-7.479
Token re
Token ID302
Feature activation+5.89e-01

Pos logit contributions
inery+3.028
alogue+2.934
+2.742
ISE+2.738
arte+2.705
Neg logit contributions
oppable-2.410
 Discrimination-2.371
 Pog-2.295
onder-2.259
sov-2.244
Token-
Token ID12
Feature activation+1.52e+00

Pos logit contributions
inery+0.462
alogue+0.447
ISE+0.419
+0.417
arte+0.412
Neg logit contributions
 Discrimination-0.363
oppable-0.362
 Pog-0.350
onder-0.343
sov-0.342
Tokenelection
Token ID14300
Feature activation+5.30e+00

Pos logit contributions
inery+0.409
alogue+0.370
ISE+0.295
+0.293
arte+0.279
Neg logit contributions
oppable-1.724
 Discrimination-1.722
 Pog-1.690
onder-1.676
sov-1.672
Token.
Token ID13
Feature activation-5.60e-01

Pos logit contributions
inery+4.002
alogue+3.876
+3.674
ISE+3.623
arte+3.553
Neg logit contributions
oppable-3.321
 Discrimination-3.285
 Pog-3.155
onder-3.115
sov-3.110
Token 2
Token ID362
Feature activation-9.73e-01

Pos logit contributions
oppable+0.288
 Discrimination+0.288
 Pog+0.279
onder+0.273
sov+0.272
Neg logit contributions
inery-0.338
alogue-0.327
-0.305
ISE-0.305
arte-0.300
Token 3
Token ID513
Feature activation+4.27e+00

Pos logit contributions
oppable+0.578
 Discrimination+0.577
 Pog+0.559
onder+0.544
 Depot+0.542
Neg logit contributions
inery-0.783
alogue-0.759
-0.715
ISE-0.709
nces-0.699
Token 4
Token ID604
Feature activation+2.08e+00

Pos logit contributions
inery+2.799
alogue+2.705
ISE+2.489
arte+2.460
+2.450
Neg logit contributions
oppable-3.160
 Discrimination-3.093
sov-3.051
onder-3.023
 Pog-3.018
Token 5
Token ID642
Feature activation+2.75e+00

Pos logit contributions
inery+1.434
alogue+1.379
ISE+1.289
+1.276
arte+1.258
Neg logit contributions
oppable-1.472
 Discrimination-1.461
 Pog-1.421
onder-1.403
sov-1.402
Token 6
Token ID718
Feature activation+5.83e+00

Pos logit contributions
inery+1.820
alogue+1.748
ISE+1.621
+1.606
arte+1.590
Neg logit contributions
oppable-2.026
 Discrimination-2.012
 Pog-1.955
onder-1.935
sov-1.931
Token 7
Token ID767
Feature activation+1.31e+01

Pos logit contributions
inery+3.462
alogue+3.249
ISE+3.065
+2.999
arte+2.955
Neg logit contributions
oppable-4.666
 Discrimination-4.584
 Pog-4.475
onder-4.445
sov-4.388
Token 8
Token ID807
Feature activation+8.17e+00

Pos logit contributions
inery+5.804
alogue+5.533
ISE+5.087
arte+4.874
emark+4.690
Neg logit contributions
oppable-11.866
onder-11.486
sov-11.305
 Pog-11.169
 Discrimination-11.101
Token 9
Token ID860
Feature activation+5.41e+00

Pos logit contributions
inery+4.112
alogue+3.783
ISE+3.696
arte+3.474
emark+3.402
Neg logit contributions
oppable-7.048
sov-6.862
 Discrimination-6.856
onder-6.799
 Pog-6.759
Token 10
Token ID838
Feature activation+3.52e+00

Pos logit contributions
inery+3.300
alogue+3.124
ISE+2.910
arte+2.855
+2.813
Neg logit contributions
oppable-4.218
 Discrimination-4.137
sov-4.084
onder-4.047
 Pog-4.046
Token 11
Token ID1367
Feature activation+4.35e+00

Pos logit contributions
inery+2.247
alogue+2.134
ISE+1.986
arte+1.958
+1.950
Neg logit contributions
oppable-2.667
 Discrimination-2.657
 Pog-2.576
sov-2.565
onder-2.538
Token 12
Token ID1105
Feature activation+4.06e+00

Pos logit contributions
inery+2.528
alogue+2.451
ISE+2.262
+2.177
arte+2.177
Neg logit contributions
oppable-3.507
 Discrimination-3.456
sov-3.379
 Pog-3.362
onder-3.330
Token 2
Token ID362
Feature activation-1.24e+00

Pos logit contributions
 Discrimination+0.856
oppable+0.855
 Pog+0.829
onder+0.811
sov+0.807
Neg logit contributions
inery-1.032
alogue-0.995
-0.933
ISE-0.930
arte-0.915
Token 3
Token ID513
Feature activation+4.57e+00

Pos logit contributions
 Discrimination+0.735
oppable+0.735
 Pog+0.711
onder+0.692
 Depot+0.689
Neg logit contributions
inery-1.006
alogue-0.975
-0.918
ISE-0.910
nces-0.900
Token 4
Token ID604
Feature activation+2.32e+00

Pos logit contributions
inery+2.962
alogue+2.865
ISE+2.636
arte+2.601
+2.587
Neg logit contributions
oppable-3.419
 Discrimination-3.336
sov-3.303
onder-3.272
 Pog-3.255
Token 5
Token ID642
Feature activation+2.48e+00

Pos logit contributions
inery+1.582
alogue+1.520
ISE+1.425
+1.405
arte+1.386
Neg logit contributions
oppable-1.659
 Discrimination-1.643
 Pog-1.598
onder-1.583
sov-1.582
Token 6
Token ID718
Feature activation+5.79e+00

Pos logit contributions
inery+1.643
alogue+1.580
ISE+1.466
+1.453
arte+1.437
Neg logit contributions
oppable-1.827
 Discrimination-1.812
 Pog-1.761
onder-1.743
sov-1.739
Token 7
Token ID767
Feature activation+1.31e+01

Pos logit contributions
inery+3.383
alogue+3.182
ISE+3.006
+2.945
arte+2.893
Neg logit contributions
oppable-4.686
 Discrimination-4.577
 Pog-4.478
onder-4.456
sov-4.391
Token |
Token ID930
Feature activation+2.85e+00

Pos logit contributions
inery+5.703
alogue+5.429
ISE+5.084
arte+4.794
emark+4.575
Neg logit contributions
oppable-11.981
onder-11.544
sov-11.336
 Pog-11.200
 Discrimination-11.110
Token library
Token ID5888
Feature activation+3.57e+00

Pos logit contributions
inery+2.197
alogue+2.125
ISE+1.996
+1.981
arte+1.954
Neg logit contributions
oppable-1.797
 Discrimination-1.765
 Pog-1.717
onder-1.702
sov-1.690
Token |
Token ID930
Feature activation-2.63e+00

Pos logit contributions
inery+2.489
alogue+2.406
ISE+2.270
+2.232
arte+2.223
Neg logit contributions
oppable-2.498
 Discrimination-2.451
 Pog-2.375
sov-2.364
onder-2.353
Token version
Token ID2196
Feature activation-7.61e+00

Pos logit contributions
 Discrimination+1.624
oppable+1.604
 Pog+1.578
 Depot+1.556
onder+1.529
Neg logit contributions
inery-2.037
alogue-1.968
-1.844
nces-1.813
ISE-1.808
Token |
Token ID930
Feature activation-2.72e+00

Pos logit contributions
 Discrimination+4.917
 Pog+4.665
oppable+4.621
 Depot+4.601
 Slug+4.544
Neg logit contributions
inery-5.511
alogue-5.369
-5.096
nces-4.938
ISE-4.784
Token.
Token ID13
Feature activation-2.82e+00

Pos logit contributions
inery+0.232
alogue+0.225
ISE+0.211
+0.211
arte+0.208
Neg logit contributions
oppable-0.165
 Discrimination-0.165
 Pog-0.160
sov-0.156
onder-0.155
Token
Token ID198
Feature activation-7.24e-01

Pos logit contributions
 Discrimination+1.808
 Pog+1.739
oppable+1.710
 Depot+1.684
sov+1.681
Neg logit contributions
inery-2.172
alogue-2.137
ISE-1.958
-1.954
nces-1.942
Token
Token ID198
Feature activation-1.43e+00

Pos logit contributions
 Discrimination+0.478
oppable+0.476
 Pog+0.463
onder+0.455
sov+0.454
Neg logit contributions
inery-0.535
alogue-0.518
-0.481
ISE-0.481
nces-0.474
TokenEC
Token ID2943
Feature activation+5.96e+00

Pos logit contributions
 Discrimination+0.882
oppable+0.876
 Pog+0.859
Dialog+0.854
sov+0.852
Neg logit contributions
inery-1.109
alogue-1.081
nces-0.993
-0.993
ISE-0.986
TokenON
Token ID1340
Feature activation+8.07e+00

Pos logit contributions
inery+3.417
alogue+3.263
ISE+3.157
+3.029
arte+2.975
Neg logit contributions
oppable-4.878
 Discrimination-4.826
 Pog-4.642
sov-4.601
 Depot-4.573
TokenOM
Token ID2662
Feature activation+1.30e+01

Pos logit contributions
inery+5.928
alogue+5.895
ISE+5.807
+5.652
ENA+5.533
Neg logit contributions
oppable-5.223
 Discrimination-4.870
sov-4.670
 Pog-4.639
-4.599
TokenICS
Token ID19505
Feature activation+8.20e+00

Pos logit contributions
ISE+6.403
ENA+6.033
inery+6.025
+5.591
alogue+5.366
Neg logit contributions
oppable-11.400
onder-10.797
bleacher-10.748
 Discrimination-10.694
sov-10.577
Token AND
Token ID5357
Feature activation+5.68e+00

Pos logit contributions
inery+4.967
ISE+4.820
alogue+4.787
+4.640
ENA+4.538
Neg logit contributions
oppable-6.241
 Discrimination-5.815
onder-5.759
sov-5.717
Dialog-5.589
Token TR
Token ID7579
Feature activation+2.29e+00

Pos logit contributions
inery+3.566
alogue+3.470
ISE+3.331
+3.270
arte+3.158
Neg logit contributions
oppable-4.337
 Discrimination-4.130
onder-4.065
sov-4.005
Dialog-3.952
TokenADE
Token ID19266
Feature activation+1.45e+00

Pos logit contributions
inery+1.522
alogue+1.462
ISE+1.361
+1.356
arte+1.329
Neg logit contributions
oppable-1.687
 Discrimination-1.678
 Pog-1.628
onder-1.603
sov-1.603
Token
Token ID198
Feature activation-2.21e-01

Pos logit contributions
inery+1.034
alogue+0.997
ISE+0.930
+0.929
arte+0.912
Neg logit contributions
oppable-0.998
 Discrimination-0.992
 Pog-0.963
onder-0.948
sov-0.944
Token (
Token ID357
Feature activation-2.12e+00

Pos logit contributions
 Discrimination+2.198
 Pog+2.183
oppable+2.134
 Depot+2.060
 Slug+2.027
Neg logit contributions
inery-3.637
alogue-3.544
-3.327
ISE-3.314
nces-3.269
TokenPhoto
Token ID6191
Feature activation-5.81e+00

Pos logit contributions
 Discrimination+0.858
oppable+0.852
 Pog+0.822
sov+0.790
onder+0.787
Neg logit contributions
inery-2.115
alogue-2.050
-1.940
ISE-1.937
nces-1.927
Token:
Token ID25
Feature activation-6.79e-01

Pos logit contributions
 Discrimination+3.190
 Pog+3.087
 Depot+3.070
oppable+2.975
sov+2.948
Neg logit contributions
inery-4.929
alogue-4.795
ISE-4.478
nces-4.427
-4.389
Token KY
Token ID33668
Feature activation+4.86e+00

Pos logit contributions
 Discrimination+0.445
oppable+0.443
 Pog+0.433
 Depot+0.421
onder+0.420
Neg logit contributions
inery-0.506
alogue-0.488
-0.455
ISE-0.455
nces-0.447
TokenLE
Token ID2538
Feature activation+1.96e+00

Pos logit contributions
inery+2.873
ISE+2.622
alogue+2.618
+2.493
arte+2.478
Neg logit contributions
oppable-3.918
 Discrimination-3.831
sov-3.691
 Pog-3.688
-3.662
Token GR
Token ID10863
Feature activation+1.28e+01

Pos logit contributions
inery+1.326
alogue+1.277
ISE+1.184
+1.184
arte+1.161
Neg logit contributions
oppable-1.420
 Discrimination-1.414
 Pog-1.374
onder-1.352
sov-1.349
TokenAN
Token ID1565
Feature activation+4.26e+00

Pos logit contributions
ISE+7.624
inery+7.459
ENA+7.279
alogue+7.126
+6.784
Neg logit contributions
 Discrimination-9.711
oppable-9.499
-8.941
 misunderstood-8.865
sov-8.800
TokenTH
Token ID4221
Feature activation+6.50e+00

Pos logit contributions
inery+3.210
alogue+3.142
ISE+3.037
+2.945
arte+2.899
Neg logit contributions
oppable-2.772
 Discrimination-2.689
sov-2.557
 Pog-2.549
onder-2.546
TokenAM
Token ID2390
Feature activation+2.74e+00

Pos logit contributions
inery+4.308
alogue+4.191
ISE+4.187
+3.976
ENA+3.954
Neg logit contributions
oppable-4.702
 Discrimination-4.604
 Pog-4.389
sov-4.355
onder-4.350
Token/
Token ID14
Feature activation+1.56e+00

Pos logit contributions
inery+2.059
alogue+1.995
ISE+1.874
+1.867
arte+1.838
Neg logit contributions
oppable-1.775
 Discrimination-1.759
 Pog-1.695
onder-1.673
sov-1.669
TokenTHE
Token ID10970
Feature activation+1.65e+00

Pos logit contributions
inery+0.985
alogue+0.946
+0.891
ISE+0.887
arte+0.875
Neg logit contributions
oppable-1.188
 Discrimination-1.179
 Pog-1.145
sov-1.132
onder-1.132
Token months
Token ID1933
Feature activation-3.57e+00

Pos logit contributions
 Discrimination+1.508
oppable+1.502
 Pog+1.467
onder+1.439
sov+1.427
Neg logit contributions
inery-2.092
alogue-2.024
-1.890
ISE-1.886
arte-1.866
Token from
Token ID422
Feature activation-4.52e-01

Pos logit contributions
 Discrimination+2.043
oppable+2.026
 Pog+2.009
onder+1.960
 Depot+1.958
Neg logit contributions
inery-2.914
alogue-2.815
-2.659
ISE-2.624
arte-2.597
Token the
Token ID262
Feature activation+7.73e-01

Pos logit contributions
 Discrimination+0.289
oppable+0.287
 Pog+0.282
 Depot+0.276
onder+0.274
Neg logit contributions
inery-0.343
alogue-0.332
-0.312
ISE-0.309
arte-0.304
Token M
Token ID337
Feature activation+2.10e+00

Pos logit contributions
inery+0.583
alogue+0.563
+0.528
ISE+0.525
arte+0.516
Neg logit contributions
 Discrimination-0.499
oppable-0.498
 Pog-0.484
onder-0.474
sov-0.473
Tokenauna
Token ID32837
Feature activation+7.40e-01

Pos logit contributions
inery+1.423
alogue+1.367
ISE+1.277
+1.264
arte+1.247
Neg logit contributions
oppable-1.521
 Discrimination-1.520
 Pog-1.472
onder-1.442
sov-1.442
Token Lo
Token ID6706
Feature activation+1.28e+01

Pos logit contributions
inery+0.638
alogue+0.623
ISE+0.589
+0.586
arte+0.581
Neg logit contributions
oppable-0.398
 Discrimination-0.391
 Pog-0.379
onder-0.372
sov-0.368
Tokena
Token ID64
Feature activation+1.45e+00

Pos logit contributions
inery+6.105
ISE+5.960
arte+5.675
alogue+5.631
emark+5.582
Neg logit contributions
oppable-10.489
-10.291
sov-10.200
Dialog-10.186
bleacher-10.134
Token series
Token ID2168
Feature activation-2.20e-01

Pos logit contributions
inery+1.131
alogue+1.096
ISE+1.026
+1.025
arte+1.010
Neg logit contributions
oppable-0.894
 Discrimination-0.888
 Pog-0.860
onder-0.843
sov-0.841
Token of
Token ID286
Feature activation+1.95e+00

Pos logit contributions
 Discrimination+0.133
oppable+0.132
 Pog+0.129
onder+0.126
 Depot+0.126
Neg logit contributions
inery-0.176
alogue-0.170
-0.159
ISE-0.159
arte-0.156
Token [
Token ID685
Feature activation-8.15e-01

Pos logit contributions
inery+1.517
alogue+1.470
ISE+1.376
+1.374
arte+1.354
Neg logit contributions
oppable-1.210
 Discrimination-1.203
 Pog-1.164
onder-1.142
sov-1.138
TokenCO
Token ID8220
Feature activation+4.32e+00

Pos logit contributions
oppable+0.507
 Discrimination+0.507
 Pog+0.491
onder+0.484
sov+0.484
Neg logit contributions
inery-0.630
alogue-0.610
-0.572
ISE-0.565
nces-0.558
Token already
Token ID1541
Feature activation-1.52e-01

Pos logit contributions
inery+1.008
alogue+0.967
+0.900
ISE+0.896
arte+0.883
Neg logit contributions
oppable-1.039
 Discrimination-1.039
 Pog-1.006
onder-0.987
sov-0.985
Token occurred
Token ID5091
Feature activation+2.45e+00

Pos logit contributions
oppable+0.098
 Discrimination+0.097
 Pog+0.095
onder+0.093
sov+0.093
Neg logit contributions
inery-0.114
alogue-0.111
ISE-0.104
-0.103
arte-0.102
Token,
Token ID11
Feature activation+1.68e+00

Pos logit contributions
inery+1.934
alogue+1.865
ISE+1.750
+1.748
arte+1.726
Neg logit contributions
 Discrimination-1.489
oppable-1.485
 Pog-1.436
sov-1.405
onder-1.403
Token
Token ID447
Feature activation+7.99e-01

Pos logit contributions
inery+1.270
alogue+1.227
ISE+1.147
+1.146
arte+1.129
Neg logit contributions
oppable-1.083
 Discrimination-1.080
 Pog-1.046
onder-1.026
sov-1.024
Token
Token ID251
Feature activation+3.58e+00

Pos logit contributions
inery+0.431
alogue+0.411
ISE+0.372
+0.372
arte+0.363
Neg logit contributions
oppable-0.687
 Discrimination-0.686
 Pog-0.670
onder-0.660
sov-0.659
Token Gol
Token ID26849
Feature activation+1.28e+01

Pos logit contributions
inery+2.624
alogue+2.512
ISE+2.363
+2.362
arte+2.327
Neg logit contributions
oppable-2.361
 Discrimination-2.337
 Pog-2.262
sov-2.231
 Depot-2.221
Tokenight
Token ID432
Feature activation+1.38e+00

Pos logit contributions
inery+5.484
alogue+4.920
arte+4.811
ISE+4.721
emark+4.616
Neg logit contributions
 Discrimination-11.883
 Pog-11.663
 Citiz-11.322
 misunderstood-11.286
 Depot-11.220
Tokenly
Token ID306
Feature activation+2.90e+00

Pos logit contributions
inery+1.309
alogue+1.269
ISE+1.207
+1.203
arte+1.192
Neg logit contributions
oppable-0.633
 Discrimination-0.627
 Pog-0.611
 Depot-0.577
onder-0.571
Token-
Token ID12
Feature activation-9.59e-01

Pos logit contributions
inery+2.101
alogue+2.011
+1.905
ISE+1.889
arte+1.873
Neg logit contributions
oppable-1.947
 Discrimination-1.922
 Pog-1.879
sov-1.831
onder-1.830
TokenHow
Token ID2437
Feature activation+2.46e+00

Pos logit contributions
 Discrimination+0.663
oppable+0.662
 Pog+0.644
sov+0.633
onder+0.632
Neg logit contributions
inery-0.678
alogue-0.655
ISE-0.606
-0.605
arte-0.596
Tokenell
Token ID695
Feature activation-2.64e+00

Pos logit contributions
inery+1.163
alogue+1.096
arte+0.971
ISE+0.970
+0.961
Neg logit contributions
 Discrimination-2.303
oppable-2.293
 Pog-2.258
 Depot-2.200
sov-2.196
Token First
Token ID3274
Feature activation-1.66e-02

Pos logit contributions
 Discrimination+0.940
oppable+0.931
 Pog+0.916
 Depot+0.891
onder+0.888
Neg logit contributions
inery-1.198
alogue-1.152
-1.080
ISE-1.074
nces-1.062
Token Strike
Token ID12282
Feature activation+1.69e+00

Pos logit contributions
 Discrimination+0.011
oppable+0.011
 Pog+0.010
onder+0.010
sov+0.010
Neg logit contributions
inery-0.012
alogue-0.012
-0.011
ISE-0.011
arte-0.011
Token within
Token ID1626
Feature activation+1.73e+00

Pos logit contributions
inery+1.320
alogue+1.278
+1.197
ISE+1.196
arte+1.178
Neg logit contributions
oppable-1.041
 Discrimination-1.037
 Pog-1.004
onder-0.983
sov-0.981
Token days
Token ID1528
Feature activation+1.68e+00

Pos logit contributions
inery+1.299
alogue+1.256
ISE+1.175
+1.171
arte+1.151
Neg logit contributions
oppable-1.120
 Discrimination-1.113
 Pog-1.076
onder-1.060
sov-1.056
Token.
Token ID13
Feature activation+1.10e+00

Pos logit contributions
inery+1.343
alogue+1.302
+1.221
ISE+1.220
arte+1.202
Neg logit contributions
oppable-1.010
 Discrimination-1.006
 Pog-0.973
onder-0.953
sov-0.950
Token Fin
Token ID4463
Feature activation+1.27e+01

Pos logit contributions
inery+0.813
alogue+0.785
+0.732
ISE+0.730
arte+0.719
Neg logit contributions
 Discrimination-0.731
oppable-0.731
 Pog-0.710
onder-0.695
sov-0.693
Tokenishing
Token ID3929
Feature activation+4.36e+00

Pos logit contributions
inery+7.780
alogue+7.516
emark+6.952
nces+6.805
+6.752
Neg logit contributions
 Discrimination-9.151
oppable-9.117
 Pog-8.882
-8.734
bleacher-8.676
Token The
Token ID383
Feature activation-2.03e+00

Pos logit contributions
inery+3.306
alogue+3.233
ISE+3.040
+3.010
arte+2.986
Neg logit contributions
oppable-2.733
 Discrimination-2.687
 Pog-2.584
sov-2.579
onder-2.575
Token Flood
Token ID25588
Feature activation-3.18e+00

Pos logit contributions
 Discrimination+1.372
oppable+1.360
 Pog+1.342
 Depot+1.305
sov+1.298
Neg logit contributions
inery-1.479
alogue-1.414
-1.344
ISE-1.329
arte-1.297
Token perhaps
Token ID3737
Feature activation-3.83e+00

Pos logit contributions
 Discrimination+1.831
oppable+1.827
 Pog+1.788
 Depot+1.744
onder+1.740
Neg logit contributions
inery-2.578
alogue-2.486
-2.362
ISE-2.337
nces-2.306
Token will
Token ID481
Feature activation-2.24e+00

Pos logit contributions
 Discrimination+2.091
oppable+2.088
 Pog+2.071
onder+1.995
 Depot+1.977
Neg logit contributions
inery-3.211
alogue-3.097
ISE-2.909
-2.895
arte-2.866
Token was
Token ID373
Feature activation-8.49e-01

Pos logit contributions
 Discrimination+1.214
oppable+1.199
 Pog+1.176
 Depot+1.171
onder+1.163
Neg logit contributions
inery-1.689
alogue-1.638
ISE-1.546
-1.536
nces-1.512
Token designated
Token ID11032
Feature activation-1.99e+00

Pos logit contributions
 Discrimination+0.516
oppable+0.510
 Pog+0.506
 Depot+0.497
onder+0.493
Neg logit contributions
inery-0.667
alogue-0.648
ISE-0.615
-0.606
nces-0.601
Token as
Token ID355
Feature activation-6.93e-01

Pos logit contributions
 Discrimination+1.170
oppable+1.140
 Pog+1.134
 Depot+1.118
onder+1.105
Neg logit contributions
inery-1.612
alogue-1.567
ISE-1.472
-1.472
nces-1.442
Token the
Token ID262
Feature activation-2.57e+00

Pos logit contributions
 Discrimination+0.465
oppable+0.457
 Pog+0.451
 Depot+0.444
onder+0.441
Neg logit contributions
inery-0.506
alogue-0.492
-0.458
ISE-0.458
arte-0.449
Token Mary
Token ID5335
Feature activation+2.07e+00

Pos logit contributions
 Discrimination+1.641
 Pog+1.603
 Depot+1.600
oppable+1.584
onder+1.550
Neg logit contributions
inery-1.948
alogue-1.910
-1.804
ISE-1.775
nces-1.742
Token Mc
Token ID1982
Feature activation+1.26e+01

Pos logit contributions
inery+1.389
alogue+1.335
ISE+1.234
+1.233
arte+1.213
Neg logit contributions
oppable-1.509
 Discrimination-1.498
 Pog-1.456
onder-1.433
sov-1.432
TokenLeod
Token ID44559
Feature activation+4.81e+00

Pos logit contributions
inery+6.975
alogue+6.805
arte+6.426
ISE+6.262
ENA+6.099
Neg logit contributions
 Discrimination-9.876
oppable-9.789
 Pog-9.631
 misunderstood-9.444
-9.349
Token Beth
Token ID19223
Feature activation+8.37e+00

Pos logit contributions
inery+2.750
alogue+2.601
ISE+2.358
+2.356
arte+2.319
Neg logit contributions
oppable-3.997
 Discrimination-3.913
sov-3.825
 Pog-3.818
onder-3.783
Tokenune
Token ID1726
Feature activation+8.96e+00

Pos logit contributions
inery+6.396
alogue+6.059
arte+5.750
ISE+5.530
emark+5.486
Neg logit contributions
 Discrimination-5.219
oppable-5.209
 Pog-5.147
-5.042
 Slug-4.914
Token Council
Token ID4281
Feature activation+2.08e+00

Pos logit contributions
inery+5.669
alogue+5.353
ISE+4.938
arte+4.903
+4.903
Neg logit contributions
oppable-6.739
 Discrimination-6.285
sov-6.284
 Pog-6.178
-6.164
Token House
Token ID2097
Feature activation+1.67e+00

Pos logit contributions
inery+1.312
alogue+1.262
ISE+1.161
+1.161
arte+1.137
Neg logit contributions
oppable-1.600
 Discrimination-1.599
 Pog-1.558
onder-1.532
sov-1.527
Token Music
Token ID7849
Feature activation+1.90e+00

Pos logit contributions
inery+1.110
alogue+1.075
ISE+1.007
+1.004
arte+0.993
Neg logit contributions
oppable-0.849
 Discrimination-0.841
 Pog-0.813
onder-0.797
sov-0.795
Token Magazine
Token ID11175
Feature activation-1.90e+00

Pos logit contributions
inery+1.621
alogue+1.574
ISE+1.476
+1.475
arte+1.459
Neg logit contributions
oppable-1.042
 Discrimination-1.028
 Pog-0.991
onder-0.971
sov-0.968
Token,
Token ID11
Feature activation-5.61e-02

Pos logit contributions
 Discrimination+1.116
oppable+1.112
 Pog+1.084
onder+1.055
sov+1.048
Neg logit contributions
inery-1.536
alogue-1.493
-1.407
ISE-1.398
arte-1.378
Token Cal
Token ID2199
Feature activation+5.39e+00

Pos logit contributions
 Discrimination+0.031
oppable+0.031
 Pog+0.030
onder+0.029
sov+0.029
Neg logit contributions
inery-0.048
alogue-0.046
-0.044
ISE-0.044
arte-0.043
Tokenum
Token ID388
Feature activation-1.69e+00

Pos logit contributions
inery+3.446
alogue+3.336
nces+2.963
ISE+2.961
arte+2.958
Neg logit contributions
oppable-4.059
 Discrimination-4.030
 Pog-3.876
-3.870
sov-3.838
Token MacDonald
Token ID37131
Feature activation+1.26e+01

Pos logit contributions
 Discrimination+1.017
oppable+1.016
 Pog+0.987
onder+0.964
sov+0.958
Neg logit contributions
inery-1.341
alogue-1.296
ISE-1.222
-1.217
arte-1.197
Token,
Token ID11
Feature activation+2.35e+00

Pos logit contributions
inery+8.473
alogue+8.100
ISE+7.656
arte+7.639
sle+7.569
Neg logit contributions
oppable-8.258
 Discrimination-7.716
-7.669
Dialog-7.454
onder-7.399
Token who
Token ID508
Feature activation+2.39e+00

Pos logit contributions
inery+1.808
alogue+1.753
+1.642
ISE+1.637
arte+1.611
Neg logit contributions
oppable-1.473
 Discrimination-1.459
 Pog-1.406
sov-1.387
onder-1.383
Token is
Token ID318
Feature activation+4.95e+00

Pos logit contributions
inery+1.886
alogue+1.830
+1.715
ISE+1.707
arte+1.694
Neg logit contributions
oppable-1.463
 Discrimination-1.451
 Pog-1.387
sov-1.367
onder-1.365
Token the
Token ID262
Feature activation+2.70e+00

Pos logit contributions
inery+3.662
alogue+3.557
+3.338
ISE+3.302
arte+3.260
Neg logit contributions
oppable-3.237
 Discrimination-3.168
sov-3.037
 Pog-3.017
 Depot-3.010
Token author
Token ID1772
Feature activation+2.69e+00

Pos logit contributions
inery+2.151
alogue+2.086
+1.949
ISE+1.949
arte+1.923
Neg logit contributions
oppable-1.630
 Discrimination-1.608
 Pog-1.544
sov-1.526
onder-1.522
Token enjoyed
Token ID8359
Feature activation+2.43e+00

Pos logit contributions
 Discrimination+0.188
oppable+0.187
 Pog+0.182
onder+0.178
sov+0.177
Neg logit contributions
inery-0.240
alogue-0.232
-0.219
ISE-0.218
arte-0.215
Token by
Token ID416
Feature activation-7.60e-01

Pos logit contributions
inery+1.892
alogue+1.831
+1.713
ISE+1.708
arte+1.681
Neg logit contributions
oppable-1.515
 Discrimination-1.503
 Pog-1.454
sov-1.423
onder-1.422
Token all
Token ID477
Feature activation-1.06e-01

Pos logit contributions
oppable+0.451
 Discrimination+0.451
 Pog+0.438
onder+0.426
sov+0.426
Neg logit contributions
inery-0.613
alogue-0.592
ISE-0.557
-0.557
arte-0.548
Token,"
Token ID553
Feature activation+2.26e+00

Pos logit contributions
 Discrimination+0.064
oppable+0.064
 Pog+0.062
onder+0.060
sov+0.060
Neg logit contributions
inery-0.084
alogue-0.082
-0.077
ISE-0.077
arte-0.075
Token Ian
Token ID12930
Feature activation+2.98e+00

Pos logit contributions
inery+1.681
alogue+1.618
ISE+1.513
+1.509
arte+1.489
Neg logit contributions
oppable-1.490
 Discrimination-1.472
 Pog-1.420
sov-1.403
onder-1.403
Token Fin
Token ID4463
Feature activation+1.26e+01

Pos logit contributions
inery+2.352
alogue+2.276
ISE+2.127
+2.125
arte+2.103
Neg logit contributions
oppable-1.823
 Discrimination-1.795
 Pog-1.720
onder-1.701
sov-1.696
Tokennery
Token ID35865
Feature activation+4.47e+00

Pos logit contributions
inery+8.743
alogue+8.456
emark+7.977
nces+7.951
arte+7.710
Neg logit contributions
 Discrimination-8.197
oppable-8.133
-7.924
 Pog-7.747
 Depot-7.607
Token wrote
Token ID2630
Feature activation-1.64e+00

Pos logit contributions
inery+3.323
alogue+3.220
ISE+3.012
+2.990
arte+2.979
Neg logit contributions
oppable-2.916
 Discrimination-2.836
 Pog-2.743
onder-2.698
-2.682
Token on
Token ID319
Feature activation-3.95e-01

Pos logit contributions
 Discrimination+0.920
oppable+0.918
 Pog+0.896
onder+0.868
 Depot+0.864
Neg logit contributions
inery-1.372
alogue-1.337
-1.262
ISE-1.256
arte-1.237
Token the
Token ID262
Feature activation-2.47e+00

Pos logit contributions
 Discrimination+0.233
oppable+0.231
 Pog+0.227
 Depot+0.219
onder+0.219
Neg logit contributions
inery-0.320
alogue-0.311
-0.293
ISE-0.291
nces-0.287
Token group
Token ID1448
Feature activation-2.53e+00

Pos logit contributions
 Discrimination+1.457
 Pog+1.422
oppable+1.417
 Depot+1.371
 Rib+1.356
Neg logit contributions
inery-1.995
alogue-1.944
-1.839
nces-1.825
ISE-1.816
Token 2
Token ID362
Feature activation-1.93e+00

Pos logit contributions
 Discrimination+1.704
oppable+1.680
 Pog+1.642
onder+1.619
sov+1.612
Neg logit contributions
inery-2.327
alogue-2.245
-2.133
ISE-2.114
nces-2.080
Token 3
Token ID513
Feature activation+3.75e+00

Pos logit contributions
 Discrimination+0.837
oppable+0.826
 Pog+0.788
sov+0.770
onder+0.769
Neg logit contributions
inery-1.862
alogue-1.822
-1.748
ISE-1.722
nces-1.712
Token 4
Token ID604
Feature activation+3.50e+00

Pos logit contributions
inery+3.168
alogue+3.084
ISE+2.864
arte+2.858
+2.852
Neg logit contributions
oppable-2.075
 Discrimination-2.032
 Pog-1.990
onder-1.963
sov-1.959
Token 5
Token ID642
Feature activation+4.75e+00

Pos logit contributions
inery+2.471
alogue+2.370
ISE+2.204
+2.172
arte+2.164
Neg logit contributions
oppable-2.412
 Discrimination-2.381
 Pog-2.344
onder-2.306
sov-2.300
Token 6
Token ID718
Feature activation+6.35e+00

Pos logit contributions
inery+3.186
alogue+3.050
ISE+2.796
arte+2.770
+2.754
Neg logit contributions
oppable-3.460
 Discrimination-3.410
 Pog-3.364
onder-3.306
sov-3.294
Token 7
Token ID767
Feature activation+1.26e+01

Pos logit contributions
inery+4.262
alogue+4.008
ISE+3.735
arte+3.669
+3.665
Neg logit contributions
oppable-4.610
 Pog-4.503
 Discrimination-4.501
onder-4.363
 Depot-4.319
Token 8
Token ID807
Feature activation+1.09e+01

Pos logit contributions
inery+9.171
alogue+8.949
arte+8.074
ISE+8.017
emark+7.960
Neg logit contributions
oppable-7.809
 Pog-7.639
onder-7.543
sov-7.219
 Discrimination-7.174
Token void
Token ID7951
Feature activation+9.08e-01

Pos logit contributions
inery+7.896
alogue+7.469
ISE+7.044
emark+6.992
arte+6.955
Neg logit contributions
oppable-6.689
 Pog-6.648
sov-6.476
onder-6.439
 Discrimination-6.412
Token vert
Token ID9421
Feature activation-4.26e+00

Pos logit contributions
inery+0.573
alogue+0.551
+0.507
ISE+0.507
arte+0.496
Neg logit contributions
oppable-0.698
 Discrimination-0.696
 Pog-0.678
onder-0.667
sov-0.666
Token (
Token ID357
Feature activation-9.35e-01

Pos logit contributions
 Discrimination+2.089
oppable+2.042
 Pog+2.032
sov+1.953
Dialog+1.940
Neg logit contributions
inery-3.839
alogue-3.763
-3.548
ISE-3.521
nces-3.506
Token in
Token ID287
Feature activation-4.02e+00

Pos logit contributions
oppable+0.539
 Discrimination+0.536
 Pog+0.519
onder+0.508
sov+0.506
Neg logit contributions
inery-0.766
alogue-0.748
-0.704
ISE-0.702
nces-0.690
Token they
Token ID484
Feature activation-3.17e-01

Pos logit contributions
inery+0.164
alogue+0.159
+0.149
ISE+0.149
arte+0.147
Neg logit contributions
oppable-0.122
 Discrimination-0.121
 Pog-0.117
onder-0.115
sov-0.114
Token've
Token ID1053
Feature activation-1.73e+00

Pos logit contributions
 Discrimination+0.189
oppable+0.189
 Pog+0.184
onder+0.181
sov+0.179
Neg logit contributions
inery-0.253
alogue-0.244
ISE-0.230
-0.230
arte-0.225
Token found
Token ID1043
Feature activation-4.01e+00

Pos logit contributions
 Discrimination+0.901
oppable+0.890
 Pog+0.879
sov+0.847
onder+0.845
Neg logit contributions
inery-1.512
alogue-1.466
ISE-1.385
-1.384
arte-1.362
Token a
Token ID257
Feature activation-6.28e+00

Pos logit contributions
 Discrimination+2.357
 Pog+2.323
oppable+2.303
 Depot+2.218
 Rib+2.203
Neg logit contributions
inery-3.223
alogue-3.133
-2.925
ISE-2.883
arte-2.854
Token '
Token ID705
Feature activation-1.44e+00

Pos logit contributions
 Pog+3.650
 Discrimination+3.646
oppable+3.544
 misunderstood+3.492
onder+3.482
Neg logit contributions
inery-5.031
alogue-4.866
-4.576
ISE-4.496
nces-4.474
Tokenp
Token ID79
Feature activation-1.67e+01

Pos logit contributions
oppable+0.922
 Discrimination+0.921
 Pog+0.896
sov+0.884
onder+0.882
Neg logit contributions
inery-1.086
alogue-1.049
-0.976
ISE-0.976
nces-0.958
Tokenreg
Token ID2301
Feature activation+8.47e+00

Pos logit contributions
onder+10.247
oppable+9.366
sov+8.795
 Pog+8.717
orf+8.453
Neg logit contributions
-10.967
inery-10.815
alogue-10.736
nces-10.660
��-10.259
Tokennant
Token ID22057
Feature activation-1.48e+00

Pos logit contributions
inery+4.759
alogue+4.593
emark+4.121
nces+4.059
ISE+4.051
Neg logit contributions
 Discrimination-6.924
oppable-6.680
-6.607
Dialog-6.560
 Pog-6.516
Token'
Token ID6
Feature activation-3.28e+00

Pos logit contributions
 Discrimination+0.879
oppable+0.878
 Pog+0.858
onder+0.836
 Depot+0.829
Neg logit contributions
inery-1.190
alogue-1.153
-1.087
ISE-1.082
nces-1.064
Token T
Token ID309
Feature activation-3.25e+00

Pos logit contributions
 Discrimination+1.865
 Pog+1.849
oppable+1.845
onder+1.759
 Depot+1.751
Neg logit contributions
inery-2.710
alogue-2.641
-2.481
ISE-2.458
nces-2.437
Token.
Token ID13
Feature activation-8.31e+00

Pos logit contributions
oppable+1.734
onder+1.695
 Discrimination+1.690
 Pog+1.681
 Depot+1.634
Neg logit contributions
inery-2.742
alogue-2.683
-2.526
nces-2.490
ISE-2.462
Token,
Token ID11
Feature activation+1.32e+00

Pos logit contributions
inery+2.929
alogue+2.848
ISE+2.653
+2.646
arte+2.628
Neg logit contributions
oppable-2.523
 Discrimination-2.506
 Pog-2.413
onder-2.401
sov-2.401
Token your
Token ID534
Feature activation+1.81e+00

Pos logit contributions
inery+1.003
alogue+0.970
ISE+0.906
+0.904
arte+0.895
Neg logit contributions
oppable-0.844
 Discrimination-0.838
 Pog-0.811
onder-0.799
sov-0.798
Token page
Token ID2443
Feature activation-2.62e+00

Pos logit contributions
inery+1.416
alogue+1.369
ISE+1.289
+1.282
arte+1.266
Neg logit contributions
oppable-1.112
 Discrimination-1.097
 Pog-1.059
sov-1.051
onder-1.045
Token may
Token ID743
Feature activation+1.11e+00

Pos logit contributions
 Discrimination+1.489
oppable+1.485
 Pog+1.444
onder+1.422
sov+1.400
Neg logit contributions
inery-2.148
alogue-2.090
ISE-1.962
-1.955
nces-1.931
Token show
Token ID905
Feature activation+1.26e+00

Pos logit contributions
inery+0.817
alogue+0.788
+0.735
ISE+0.735
arte+0.723
Neg logit contributions
oppable-0.732
 Discrimination-0.729
 Pog-0.709
onder-0.697
sov-0.693
Token unp
Token ID8593
Feature activation-1.60e+01

Pos logit contributions
inery+0.965
alogue+0.935
+0.875
ISE+0.874
arte+0.860
Neg logit contributions
oppable-0.791
 Discrimination-0.787
 Pog-0.761
onder-0.748
sov-0.747
Tokenars
Token ID945
Feature activation+2.48e+00

Pos logit contributions
oppable+9.705
onder+8.985
ected+8.339
fried+8.088
sov+7.961
Neg logit contributions
-11.484
ISE-10.781
inery-10.722
alogue-10.674
nces-10.641
Tokened
Token ID276
Feature activation+1.45e+00

Pos logit contributions
inery+2.295
alogue+2.231
+2.104
ISE+2.103
arte+2.085
Neg logit contributions
 Discrimination-1.182
oppable-1.179
 Pog-1.129
onder-1.096
 Depot-1.092
Token expressions
Token ID14700
Feature activation+2.52e+00

Pos logit contributions
inery+1.183
alogue+1.147
ISE+1.080
+1.076
arte+1.064
Neg logit contributions
oppable-0.841
 Discrimination-0.829
 Pog-0.800
onder-0.788
sov-0.787
Token to
Token ID284
Feature activation+2.03e+00

Pos logit contributions
inery+1.981
alogue+1.921
+1.800
ISE+1.795
arte+1.770
Neg logit contributions
oppable-1.543
 Discrimination-1.531
 Pog-1.473
onder-1.449
sov-1.445
Token the
Token ID262
Feature activation+3.63e+00

Pos logit contributions
inery+1.536
alogue+1.489
ISE+1.396
+1.393
arte+1.371
Neg logit contributions
oppable-1.299
 Discrimination-1.285
 Pog-1.247
sov-1.227
onder-1.226
Token Bour
Token ID20576
Feature activation-1.26e+00

Pos logit contributions
inery+1.982
alogue+1.907
ISE+1.792
+1.792
arte+1.762
Neg logit contributions
oppable-1.664
 Discrimination-1.645
 Pog-1.587
onder-1.575
sov-1.565
Tokendain
Token ID27162
Feature activation-3.14e+00

Pos logit contributions
oppable+0.772
 Discrimination+0.769
 Pog+0.749
onder+0.740
sov+0.738
Neg logit contributions
inery-0.988
alogue-0.956
ISE-0.897
-0.896
nces-0.880
Token in
Token ID287
Feature activation+1.70e+00

Pos logit contributions
oppable+1.871
 Discrimination+1.854
 Pog+1.804
sov+1.765
 Depot+1.752
Neg logit contributions
inery-2.494
alogue-2.447
-2.271
ISE-2.259
arte-2.237
Token public
Token ID1171
Feature activation+1.50e+00

Pos logit contributions
inery+1.345
alogue+1.298
ISE+1.219
+1.219
arte+1.203
Neg logit contributions
oppable-1.035
 Discrimination-1.028
 Pog-0.994
onder-0.976
sov-0.971
Token
Token ID960
Feature activation-1.87e+00

Pos logit contributions
inery+1.198
alogue+1.157
ISE+1.089
+1.086
arte+1.068
Neg logit contributions
oppable-0.908
 Discrimination-0.904
 Pog-0.872
onder-0.858
sov-0.856
Tokenp
Token ID79
Feature activation-1.59e+01

Pos logit contributions
 Discrimination+1.192
oppable+1.184
 Pog+1.155
sov+1.143
onder+1.137
Neg logit contributions
inery-1.420
alogue-1.381
ISE-1.285
-1.282
arte-1.260
Tokeniss
Token ID747
Feature activation+5.71e+00

Pos logit contributions
onder+10.431
oppable+9.925
orf+9.170
sov+9.119
 Pog+9.098
Neg logit contributions
-9.321
alogue-9.228
inery-9.064
nces-9.029
ISE-8.851
Tokening
Token ID278
Feature activation+2.03e+00

Pos logit contributions
inery+4.595
alogue+4.397
+4.137
arte+4.103
ISE+4.100
Neg logit contributions
oppable-3.386
 Discrimination-3.327
sov-3.171
 Pog-3.158
 Depot-3.095
Token off
Token ID572
Feature activation+1.22e+00

Pos logit contributions
inery+1.571
alogue+1.520
+1.422
ISE+1.421
arte+1.400
Neg logit contributions
oppable-1.269
 Discrimination-1.263
 Pog-1.221
onder-1.198
sov-1.197
Token one
Token ID530
Feature activation-1.29e+00

Pos logit contributions
inery+0.954
alogue+0.924
+0.864
ISE+0.864
arte+0.851
Neg logit contributions
oppable-0.753
 Discrimination-0.751
 Pog-0.728
onder-0.712
sov-0.710
Token of
Token ID286
Feature activation-6.12e+00

Pos logit contributions
 Discrimination+0.765
oppable+0.762
 Pog+0.748
onder+0.727
sov+0.720
Neg logit contributions
inery-1.034
alogue-1.004
-0.937
ISE-0.932
nces-0.924
Token made
Token ID925
Feature activation-3.58e+00

Pos logit contributions
 Discrimination+0.488
oppable+0.487
 Pog+0.475
onder+0.465
sov+0.463
Neg logit contributions
inery-0.629
alogue-0.611
ISE-0.572
-0.572
arte-0.563
Token modeling
Token ID21128
Feature activation+2.66e+00

Pos logit contributions
oppable+2.246
 Discrimination+2.228
 Pog+2.211
onder+2.150
sov+2.136
Neg logit contributions
inery-2.716
alogue-2.605
-2.471
ISE-2.432
arte-2.414
Token of
Token ID286
Feature activation-2.35e-03

Pos logit contributions
inery+2.036
alogue+1.969
ISE+1.846
+1.836
arte+1.805
Neg logit contributions
oppable-1.689
 Discrimination-1.681
 Pog-1.621
sov-1.591
onder-1.590
Token
Token ID198
Feature activation-6.94e-01

Pos logit contributions
oppable+0.001
 Discrimination+0.001
 Pog+0.001
onder+0.001
sov+0.001
Neg logit contributions
inery-0.002
alogue-0.002
-0.002
ISE-0.002
arte-0.002
Token
Token ID198
Feature activation-1.18e+00

Pos logit contributions
 Discrimination+0.454
oppable+0.452
 Pog+0.441
onder+0.433
sov+0.432
Neg logit contributions
inery-0.517
alogue-0.500
-0.465
ISE-0.464
nces-0.458
Tokenp
Token ID79
Feature activation-1.58e+01

Pos logit contributions
oppable+0.785
 Discrimination+0.777
 Pog+0.757
sov+0.756
onder+0.753
Neg logit contributions
inery-0.864
alogue-0.835
-0.780
ISE-0.775
arte-0.768
Tokenuls
Token ID5753
Feature activation+5.24e-01

Pos logit contributions
onder+6.482
oppable+6.005
 Pog+5.367
sov+5.336
Dialog+5.293
Neg logit contributions
-13.678
alogue-13.370
inery-13.198
nces-12.975
arte-12.705
Tokenating
Token ID803
Feature activation+2.50e+00

Pos logit contributions
inery+0.398
alogue+0.386
ISE+0.362
+0.362
arte+0.355
Neg logit contributions
oppable-0.333
 Discrimination-0.331
 Pog-0.322
onder-0.317
sov-0.316
Token combustion
Token ID33607
Feature activation+3.26e-01

Pos logit contributions
inery+1.933
alogue+1.863
ISE+1.753
+1.751
nces+1.718
Neg logit contributions
oppable-1.555
 Discrimination-1.545
 Pog-1.482
sov-1.459
onder-1.459
Token more
Token ID517
Feature activation+2.60e+00

Pos logit contributions
inery+0.248
alogue+0.241
ISE+0.225
+0.225
arte+0.221
Neg logit contributions
oppable-0.208
 Discrimination-0.207
 Pog-0.201
onder-0.197
sov-0.197
Token realistic
Token ID12653
Feature activation+1.07e+00

Pos logit contributions
inery+1.983
alogue+1.921
ISE+1.793
+1.790
arte+1.762
Neg logit contributions
oppable-1.655
 Discrimination-1.644
 Pog-1.592
sov-1.563
onder-1.556
Token(
Token ID7
Feature activation+1.99e+00

Pos logit contributions
inery+0.961
alogue+0.929
+0.871
ISE+0.870
arte+0.857
Neg logit contributions
oppable-0.783
 Discrimination-0.779
 Pog-0.754
onder-0.738
sov-0.736
Tokenalso
Token ID14508
Feature activation+1.68e+00

Pos logit contributions
inery+1.601
alogue+1.554
ISE+1.462
+1.460
arte+1.438
Neg logit contributions
oppable-1.179
 Discrimination-1.170
 Pog-1.128
onder-1.106
sov-1.099
Token known
Token ID1900
Feature activation-2.90e+00

Pos logit contributions
inery+1.070
alogue+1.038
ISE+0.956
+0.954
arte+0.938
Neg logit contributions
oppable-1.280
 Discrimination-1.270
 Pog-1.229
sov-1.217
onder-1.210
Token as
Token ID355
Feature activation-3.71e+00

Pos logit contributions
 Discrimination+1.482
oppable+1.460
 Pog+1.432
sov+1.396
onder+1.392
Neg logit contributions
inery-2.575
alogue-2.493
ISE-2.343
-2.342
nces-2.316
Token "
Token ID366
Feature activation-1.04e+00

Pos logit contributions
 Discrimination+2.109
oppable+2.074
 Pog+2.052
 Depot+1.971
onder+1.965
Neg logit contributions
inery-3.077
alogue-2.994
ISE-2.808
-2.794
arte-2.766
TokenP
Token ID47
Feature activation-1.58e+01

Pos logit contributions
oppable+0.677
 Discrimination+0.677
 Pog+0.661
onder+0.644
sov+0.643
Neg logit contributions
inery-0.770
alogue-0.745
ISE-0.693
-0.691
arte-0.680
Tokenor
Token ID273
Feature activation-5.45e-01

Pos logit contributions
onder+8.910
oppable+8.299
orf+8.060
sov+7.602
rotein+7.551
Neg logit contributions
���-11.543
-11.513
��-11.388
alogue-11.189
inery-11.152
Tokenoker
Token ID11020
Feature activation+6.25e+00

Pos logit contributions
 Discrimination+0.137
oppable+0.135
 Pog+0.123
 Depot+0.116
onder+0.115
Neg logit contributions
inery-0.628
alogue-0.614
ISE-0.585
-0.585
arte-0.582
Tokenat
Token ID265
Feature activation+1.31e+00

Pos logit contributions
inery+4.063
alogue+3.876
arte+3.697
emark+3.459
onna+3.444
Neg logit contributions
 Discrimination-4.684
oppable-4.627
 Pog-4.542
-4.446
 misunderstood-4.386
Tokenosis
Token ID5958
Feature activation-3.48e+00

Pos logit contributions
inery+0.662
alogue+0.622
ISE+0.564
arte+0.556
+0.551
Neg logit contributions
 Discrimination-1.178
oppable-1.170
 Pog-1.144
-1.123
onder-1.121
Token punct
Token ID21025
Feature activation+1.98e+00

Pos logit contributions
 Discrimination+1.891
oppable+1.855
 Pog+1.850
sov+1.770
onder+1.766
Neg logit contributions
inery-2.971
alogue-2.902
ISE-2.742
-2.706
arte-2.660
Token a
Token ID257
Feature activation-1.57e+00

Pos logit contributions
inery+0.567
alogue+0.549
+0.513
ISE+0.512
arte+0.505
Neg logit contributions
 Discrimination-0.473
oppable-0.473
 Pog-0.458
onder-0.449
sov-0.447
Token lot
Token ID1256
Feature activation-2.34e-01

Pos logit contributions
 Discrimination+0.959
oppable+0.950
 Pog+0.936
onder+0.911
 Depot+0.905
Neg logit contributions
inery-1.234
alogue-1.204
-1.125
ISE-1.120
arte-1.107
Token of
Token ID286
Feature activation-1.38e+00

Pos logit contributions
oppable+0.148
 Discrimination+0.147
 Pog+0.143
onder+0.140
sov+0.139
Neg logit contributions
inery-0.180
alogue-0.174
-0.163
ISE-0.163
arte-0.160
Token something
Token ID1223
Feature activation-3.86e+00

Pos logit contributions
 Discrimination+0.857
oppable+0.841
 Pog+0.830
onder+0.803
 Depot+0.801
Neg logit contributions
inery-1.077
alogue-1.049
-0.982
ISE-0.980
arte-0.971
Token (
Token ID357
Feature activation-2.96e+00

Pos logit contributions
 Discrimination+2.162
oppable+2.154
 Pog+2.107
sov+2.045
onder+2.040
Neg logit contributions
inery-3.165
alogue-3.068
ISE-2.918
-2.918
arte-2.876
Tokenp
Token ID79
Feature activation-1.56e+01

Pos logit contributions
 Discrimination+1.976
oppable+1.971
sov+1.927
 Pog+1.911
onder+1.900
Neg logit contributions
inery-2.111
alogue-2.053
-1.895
ISE-1.885
nces-1.869
Tokenoth
Token ID849
Feature activation+4.67e+00

Pos logit contributions
onder+9.939
oppable+9.381
 Pog+8.710
sov+8.604
Percent+8.398
Neg logit contributions
-9.931
nces-9.509
alogue-9.445
inery-9.405
ISE-9.035
Tokenoles
Token ID4316
Feature activation-1.05e+00

Pos logit contributions
inery+4.265
alogue+4.087
arte+3.831
+3.825
ISE+3.805
Neg logit contributions
 Discrimination-2.331
oppable-2.234
 Pog-2.180
sov-2.121
 Depot-2.109
Token in
Token ID287
Feature activation-8.02e-01

Pos logit contributions
oppable+0.611
 Discrimination+0.606
 Pog+0.591
onder+0.576
sov+0.573
Neg logit contributions
inery-0.863
alogue-0.838
-0.787
ISE-0.786
arte-0.774
Token New
Token ID968
Feature activation+1.04e+00

Pos logit contributions
 Discrimination+0.524
oppable+0.522
 Pog+0.510
sov+0.498
onder+0.497
Neg logit contributions
inery-0.598
alogue-0.580
-0.540
ISE-0.539
arte-0.529
Token Jersey
Token ID8221
Feature activation-1.91e+00

Pos logit contributions
inery+0.833
alogue+0.807
+0.758
ISE+0.757
arte+0.745
Neg logit contributions
oppable-0.620
 Discrimination-0.618
 Pog-0.598
onder-0.585
sov-0.585
Token,
Token ID11
Feature activation+5.44e+00

Pos logit contributions
inery+1.379
alogue+1.335
+1.247
ISE+1.244
arte+1.224
Neg logit contributions
oppable-1.205
 Discrimination-1.200
 Pog-1.159
onder-1.140
sov-1.136
Token unf
Token ID3684
Feature activation-6.93e+00

Pos logit contributions
inery+3.942
alogue+3.820
+3.575
ISE+3.544
arte+3.497
Neg logit contributions
oppable-3.613
 Discrimination-3.582
 Pog-3.404
 Depot-3.386
onder-3.383
Tokenil
Token ID346
Feature activation-9.20e-01

Pos logit contributions
onder+4.791
oppable+4.755
sov+4.709
 Pog+4.540
 Discrimination+4.429
Neg logit contributions
inery-4.583
alogue-4.527
-4.374
nces-4.311
ISE-4.286
Tokentered
Token ID4400
Feature activation-1.31e+00

Pos logit contributions
oppable+0.411
 Discrimination+0.409
 Pog+0.390
onder+0.379
sov+0.378
Neg logit contributions
inery-0.877
alogue-0.854
-0.809
ISE-0.809
arte-0.800
Token,
Token ID11
Feature activation+1.96e+00

Pos logit contributions
oppable+0.778
 Discrimination+0.777
 Pog+0.763
sov+0.739
onder+0.735
Neg logit contributions
inery-1.044
alogue-1.011
ISE-0.950
-0.945
nces-0.937
Token unp
Token ID8593
Feature activation-1.55e+01

Pos logit contributions
inery+1.437
alogue+1.388
+1.295
ISE+1.293
arte+1.272
Neg logit contributions
oppable-1.308
 Discrimination-1.303
 Pog-1.262
onder-1.240
sov-1.236
Tokenaste
Token ID4594
Feature activation-1.56e+00

Pos logit contributions
oppable+8.858
onder+8.594
 Pog+8.124
sov+8.062
ected+8.003
Neg logit contributions
nces-11.147
-10.490
ISE-10.480
inery-10.417
alogue-10.067
Tokenur
Token ID333
Feature activation+8.29e-01

Pos logit contributions
oppable+0.522
 Discrimination+0.510
 Pog+0.491
onder+0.487
sov+0.463
Neg logit contributions
inery-1.649
alogue-1.613
ISE-1.547
-1.541
nces-1.531
Tokenised
Token ID1417
Feature activation+3.05e-01

Pos logit contributions
inery+0.514
alogue+0.493
+0.453
ISE+0.453
arte+0.444
Neg logit contributions
oppable-0.647
 Discrimination-0.645
 Pog-0.629
onder-0.619
sov-0.618
Token,
Token ID11
Feature activation+2.73e+00

Pos logit contributions
inery+0.241
alogue+0.233
ISE+0.219
+0.218
arte+0.215
Neg logit contributions
oppable-0.187
 Discrimination-0.186
 Pog-0.181
onder-0.176
sov-0.176
Token and
Token ID290
Feature activation+2.11e+00

Pos logit contributions
inery+2.033
alogue+1.966
+1.839
ISE+1.831
arte+1.804
Neg logit contributions
oppable-1.784
 Discrimination-1.773
 Pog-1.707
onder-1.684
sov-1.676
Token et
Token ID2123
Feature activation-1.33e+00

Pos logit contributions
inery+0.979
alogue+0.946
+0.884
ISE+0.883
arte+0.869
Neg logit contributions
oppable-0.839
 Discrimination-0.837
 Pog-0.811
onder-0.795
sov-0.793
Token plant
Token ID4618
Feature activation+2.63e+00

Pos logit contributions
 Discrimination+0.787
oppable+0.776
 Pog+0.768
onder+0.744
sov+0.741
Neg logit contributions
inery-1.068
alogue-1.036
ISE-0.975
-0.974
nces-0.955
Tokenaris
Token ID20066
Feature activation-7.44e-01

Pos logit contributions
inery+1.708
alogue+1.644
ISE+1.507
+1.507
arte+1.495
Neg logit contributions
oppable-1.972
 Discrimination-1.962
 Pog-1.906
onder-1.872
sov-1.865
Token,"
Token ID553
Feature activation-2.99e+00

Pos logit contributions
 Discrimination+0.467
oppable+0.462
 Pog+0.454
sov+0.442
onder+0.441
Neg logit contributions
inery-0.574
alogue-0.558
-0.522
ISE-0.520
arte-0.509
Token "
Token ID366
Feature activation-4.28e-01

Pos logit contributions
 Discrimination+1.751
 Pog+1.742
oppable+1.705
onder+1.648
 Depot+1.647
Neg logit contributions
inery-2.435
alogue-2.361
ISE-2.190
-2.177
nces-2.169
TokenP
Token ID47
Feature activation-1.55e+01

Pos logit contributions
oppable+0.269
 Discrimination+0.269
 Pog+0.262
sov+0.256
onder+0.256
Neg logit contributions
inery-0.330
alogue-0.319
ISE-0.297
-0.297
nces-0.292
Tokenunct
Token ID16260
Feature activation+5.30e-01

Pos logit contributions
onder+9.244
oppable+8.750
orf+8.561
sov+8.351
 Pog+8.295
Neg logit contributions
alogue-10.263
inery-10.114
-10.073
nces-10.008
ISE-9.464
Tokenate
Token ID378
Feature activation+1.56e-01

Pos logit contributions
inery+0.387
alogue+0.374
+0.349
ISE+0.348
arte+0.342
Neg logit contributions
oppable-0.355
 Discrimination-0.353
 Pog-0.343
onder-0.337
sov-0.336
Token ker
Token ID41927
Feature activation+1.05e+01

Pos logit contributions
inery+0.118
alogue+0.114
+0.107
ISE+0.107
arte+0.105
Neg logit contributions
 Discrimination-0.101
oppable-0.100
 Pog-0.098
onder-0.095
sov-0.095
Tokenat
Token ID265
Feature activation-8.52e-01

Pos logit contributions
inery+6.605
alogue+6.583
arte+5.918
emark+5.613
nces+5.611
Neg logit contributions
 Discrimination-7.539
oppable-7.457
 Pog-7.265
 misunderstood-7.210
-7.209
Tokenoder
Token ID12342
Feature activation+2.02e+00

Pos logit contributions
oppable+0.538
 Discrimination+0.534
 Pog+0.518
onder+0.511
sov+0.508
Neg logit contributions
inery-0.652
alogue-0.631
-0.594
ISE-0.592
arte-0.580
Token that
Token ID326
Feature activation-1.24e-01

Pos logit contributions
 Discrimination+1.945
oppable+1.904
 Pog+1.895
 Depot+1.826
onder+1.812
Neg logit contributions
inery-2.840
alogue-2.803
ISE-2.620
-2.596
arte-2.590
Token she
Token ID673
Feature activation-1.68e+00

Pos logit contributions
 Discrimination+0.081
oppable+0.080
 Pog+0.079
 Depot+0.077
sov+0.076
Neg logit contributions
inery-0.093
alogue-0.090
-0.084
ISE-0.084
arte-0.082
Token is
Token ID318
Feature activation-1.53e+00

Pos logit contributions
 Discrimination+1.002
oppable+1.000
 Pog+0.976
onder+0.955
sov+0.945
Neg logit contributions
inery-1.343
alogue-1.302
ISE-1.224
-1.213
arte-1.193
Token
Token ID564
Feature activation+1.63e+00

Pos logit contributions
 Discrimination+0.901
oppable+0.890
 Pog+0.878
onder+0.852
sov+0.839
Neg logit contributions
inery-1.237
alogue-1.195
ISE-1.128
-1.115
arte-1.106
Token
Token ID250
Feature activation-8.09e-01

Pos logit contributions
inery+0.786
alogue+0.743
+0.669
ISE+0.661
arte+0.651
Neg logit contributions
oppable-1.492
 Discrimination-1.489
 Pog-1.455
onder-1.433
sov-1.431
Tokenp
Token ID79
Feature activation-1.53e+01

Pos logit contributions
oppable+0.534
 Discrimination+0.531
 Pog+0.515
onder+0.512
sov+0.507
Neg logit contributions
inery-0.598
alogue-0.577
ISE-0.541
-0.536
arte-0.529
Tokenroud
Token ID5493
Feature activation-2.15e+00

Pos logit contributions
onder+6.811
oppable+6.218
 Pog+5.476
acious+5.351
outer+5.136
Neg logit contributions
inery-12.933
alogue-12.753
-12.549
nces-12.484
ISE-12.417
Token of
Token ID286
Feature activation-3.76e+00

Pos logit contributions
 Discrimination+1.282
oppable+1.278
 Pog+1.235
onder+1.211
sov+1.203
Neg logit contributions
inery-1.716
alogue-1.664
ISE-1.560
-1.557
arte-1.532
Token our
Token ID674
Feature activation-2.89e+00

Pos logit contributions
 Discrimination+2.345
oppable+2.314
 Pog+2.271
 Depot+2.212
+2.195
Neg logit contributions
inery-2.896
alogue-2.829
ISE-2.627
-2.624
arte-2.571
Token work
Token ID670
Feature activation-2.86e+00

Pos logit contributions
 Discrimination+1.786
oppable+1.769
 Pog+1.729
onder+1.703
 Depot+1.688
Neg logit contributions
inery-2.248
alogue-2.173
-2.050
ISE-2.036
arte-2.026
Token to
Token ID284
Feature activation-1.96e+00

Pos logit contributions
oppable+1.683
 Discrimination+1.665
 Pog+1.618
onder+1.592
sov+1.583
Neg logit contributions
inery-2.301
alogue-2.240
ISE-2.107
-2.087
arte-2.080
Tokenuese
Token ID20506
Feature activation-5.19e-01

Pos logit contributions
oppable+2.710
 Discrimination+2.568
 Pog+2.537
onder+2.525
sov+2.512
Neg logit contributions
inery-3.941
alogue-3.836
-3.624
ISE-3.623
nces-3.566
Token)
Token ID8
Feature activation-2.56e+00

Pos logit contributions
oppable+0.295
 Discrimination+0.293
 Pog+0.284
sov+0.278
onder+0.278
Neg logit contributions
inery-0.432
alogue-0.419
ISE-0.393
-0.393
arte-0.387
Token and
Token ID290
Feature activation-4.31e+00

Pos logit contributions
oppable+1.485
 Discrimination+1.462
 Pog+1.445
sov+1.401
onder+1.391
Neg logit contributions
inery-2.119
alogue-2.051
ISE-1.921
-1.901
nces-1.896
Token Henry
Token ID8616
Feature activation-1.01e+01

Pos logit contributions
 Discrimination+2.467
oppable+2.459
 Pog+2.431
onder+2.367
 Rib+2.354
Neg logit contributions
inery-3.499
alogue-3.390
arte-3.199
nces-3.156
-3.140
Token (
Token ID357
Feature activation-7.14e+00

Pos logit contributions
 Pog+4.736
oppable+4.725
 Discrimination+4.514
 Rib+4.482
onder+4.353
Neg logit contributions
inery-8.785
alogue-8.712
-8.066
arte-8.061
ISE-8.038
TokenP
Token ID47
Feature activation-1.51e+01

Pos logit contributions
sov+4.197
oppable+4.168
 Pog+4.131
Dialog+4.116
 Discrimination+4.069
Neg logit contributions
inery-5.543
alogue-5.456
nces-5.035
arte-4.985
emark-4.914
Tokenyr
Token ID2417
Feature activation-5.45e-01

Pos logit contributions
oppable+3.984
onder+3.764
sov+3.751
acious+3.608
ier+3.512
Neg logit contributions
-15.332
nces-14.761
alogue-14.573
ISE-14.234
���-14.142
Tokeng
Token ID70
Feature activation-3.50e+00

Pos logit contributions
oppable+0.380
 Discrimination+0.377
 Pog+0.367
onder+0.361
sov+0.361
Neg logit contributions
inery-0.383
alogue-0.370
-0.344
ISE-0.344
arte-0.336
Tokenos
Token ID418
Feature activation-4.95e-01

Pos logit contributions
oppable+2.146
sov+2.092
onder+2.078
 Pog+2.071
 Discrimination+2.062
Neg logit contributions
inery-2.706
alogue-2.642
-2.512
ISE-2.493
nces-2.429
Token)
Token ID8
Feature activation-2.93e+00

Pos logit contributions
oppable+0.288
 Discrimination+0.287
 Pog+0.277
onder+0.272
sov+0.271
Neg logit contributions
inery-0.405
alogue-0.393
-0.369
ISE-0.368
arte-0.363
Token on
Token ID319
Feature activation-2.82e-01

Pos logit contributions
oppable+1.714
 Discrimination+1.679
 Pog+1.665
sov+1.618
onder+1.600
Neg logit contributions
inery-2.401
alogue-2.326
ISE-2.171
-2.151
nces-2.135
Token et
Token ID2123
Feature activation+1.09e+00

Pos logit contributions
 Discrimination+1.126
 Pog+1.097
oppable+1.096
sov+1.047
onder+1.046
Neg logit contributions
inery-2.586
alogue-2.543
-2.411
ISE-2.409
nces-2.373
Token plant
Token ID4618
Feature activation+3.71e+00

Pos logit contributions
inery+0.742
alogue+0.715
ISE+0.660
+0.659
arte+0.650
Neg logit contributions
oppable-0.784
 Discrimination-0.780
 Pog-0.757
onder-0.743
sov-0.743
Tokenaris
Token ID20066
Feature activation-1.61e+00

Pos logit contributions
inery+2.635
alogue+2.542
arte+2.423
ISE+2.357
+2.292
Neg logit contributions
 Discrimination-2.594
oppable-2.567
 Pog-2.502
 Depot-2.409
-2.402
Token,"
Token ID553
Feature activation-4.67e+00

Pos logit contributions
 Discrimination+0.981
oppable+0.976
 Pog+0.949
sov+0.929
onder+0.926
Neg logit contributions
inery-1.270
alogue-1.237
-1.158
ISE-1.156
nces-1.132
Token "
Token ID366
Feature activation-2.36e+00

Pos logit contributions
 Discrimination+2.463
 Pog+2.391
 Depot+2.306
oppable+2.290
onder+2.259
Neg logit contributions
inery-4.075
alogue-3.970
ISE-3.745
-3.722
nces-3.717
TokenP
Token ID47
Feature activation-1.50e+01

Pos logit contributions
 Discrimination+1.473
oppable+1.453
 Pog+1.441
Dialog+1.427
sov+1.403
Neg logit contributions
inery-1.827
alogue-1.773
nces-1.638
-1.624
ISE-1.621
Tokenunct
Token ID16260
Feature activation-9.79e-01

Pos logit contributions
onder+2.949
oppable+2.699
sov+2.102
Percent+2.003
orf+1.877
Neg logit contributions
���-16.302
-16.090
inery-16.033
alogue-15.925
ISE-15.921
Tokenate
Token ID378
Feature activation-6.48e-01

Pos logit contributions
oppable+0.530
 Discrimination+0.526
 Pog+0.508
onder+0.498
sov+0.496
Neg logit contributions
inery-0.839
alogue-0.813
-0.771
ISE-0.768
nces-0.756
Token ker
Token ID41927
Feature activation+1.06e+01

Pos logit contributions
oppable+0.432
 Discrimination+0.430
 Pog+0.420
onder+0.411
sov+0.410
Neg logit contributions
inery-0.476
alogue-0.460
ISE-0.428
-0.427
arte-0.420
Tokenat
Token ID265
Feature activation-6.97e+00

Pos logit contributions
inery+5.791
alogue+5.667
arte+5.392
onna+4.931
sle+4.900
Neg logit contributions
 Discrimination-8.617
 Citiz-8.597
oppable-8.366
 Pog-8.253
 misunderstood-8.231
Tokenoder
Token ID12342
Feature activation+2.73e+00

Pos logit contributions
oppable+0.552
onder+0.541
sov+0.388
 Discrimination+0.354
 Pog+0.300
Neg logit contributions
inery-8.931
alogue-8.803
-8.698
nces-8.617
ISE-8.561
Tokenache
Token ID4891
Feature activation-1.47e+00

Pos logit contributions
inery+0.652
alogue+0.620
ISE+0.550
+0.546
arte+0.540
Neg logit contributions
 Discrimination-1.251
oppable-1.249
 Pog-1.226
onder-1.203
sov-1.202
Token and
Token ID290
Feature activation+6.12e-01

Pos logit contributions
oppable+0.883
 Discrimination+0.878
 Pog+0.851
onder+0.832
sov+0.828
Neg logit contributions
inery-1.176
alogue-1.138
ISE-1.069
-1.068
arte-1.052
Token a
Token ID257
Feature activation-4.05e+00

Pos logit contributions
inery+0.463
alogue+0.448
+0.418
ISE+0.418
arte+0.412
Neg logit contributions
oppable-0.393
 Discrimination-0.392
 Pog-0.380
onder-0.372
sov-0.371
Token bit
Token ID1643
Feature activation+1.42e-01

Pos logit contributions
 Discrimination+2.410
 Pog+2.405
oppable+2.367
onder+2.308
sov+2.299
Neg logit contributions
inery-3.159
alogue-3.125
ISE-2.928
-2.897
arte-2.894
Token of
Token ID286
Feature activation-7.24e+00

Pos logit contributions
inery+0.110
alogue+0.106
+0.099
ISE+0.099
arte+0.098
Neg logit contributions
oppable-0.090
 Discrimination-0.089
 Pog-0.087
onder-0.085
sov-0.085
Token Paul
Token ID3362
Feature activation-1.45e+01

Pos logit contributions
 Pog+4.248
 Discrimination+4.203
oppable+4.151
 Slug+3.931
 Rib+3.914
Neg logit contributions
inery-5.800
alogue-5.568
ISE-5.203
nces-5.141
-5.118
Token Newman
Token ID27698
Feature activation-3.21e+00

Pos logit contributions
 Pog+9.329
 Discrimination+9.308
 Rib+9.233
 Ü+9.146
 Slug+9.025
Neg logit contributions
inery-9.056
alogue-8.836
-8.581
nces-8.519
ISE-8.449
Token about
Token ID546
Feature activation-3.13e+00

Pos logit contributions
 Discrimination+2.089
oppable+2.088
 Pog+2.034
sov+1.963
 Depot+1.959
Neg logit contributions
inery-2.384
alogue-2.313
-2.166
arte-2.139
ISE-2.135
Token him
Token ID683
Feature activation-4.56e+00

Pos logit contributions
 Discrimination+1.827
oppable+1.808
 Pog+1.805
onder+1.740
 Depot+1.719
Neg logit contributions
inery-2.561
alogue-2.468
ISE-2.303
-2.301
nces-2.280
Token.
Token ID13
Feature activation-3.37e+00

Pos logit contributions
oppable+2.386
 Discrimination+2.383
 Pog+2.337
onder+2.252
+2.232
Neg logit contributions
inery-3.937
alogue-3.783
-3.572
ISE-3.569
arte-3.535
Token Coll
Token ID7778
Feature activation-1.22e+00

Pos logit contributions
 Discrimination+2.113
 Pog+2.063
oppable+2.012
 Depot+1.978
 Slug+1.958
Neg logit contributions
inery-2.648
alogue-2.548
ISE-2.361
-2.355
nces-2.347
Token vot
Token ID1993
Feature activation-4.54e+00

Pos logit contributions
 Discrimination+3.204
 Pog+3.190
oppable+3.133
sov+3.099
onder+3.093
Neg logit contributions
inery-4.113
alogue-3.992
-3.738
ISE-3.725
arte-3.654
Tokenre
Token ID260
Feature activation-8.14e-01

Pos logit contributions
oppable+1.442
 Pog+1.424
onder+1.423
 Discrimination+1.416
sov+1.412
Neg logit contributions
inery-4.755
alogue-4.710
-4.480
ISE-4.456
nces-4.394
Token
Token ID39073
Feature activation-1.04e+00

Pos logit contributions
 Discrimination+0.486
oppable+0.483
 Pog+0.471
sov+0.461
onder+0.460
Neg logit contributions
inery-0.651
alogue-0.632
-0.593
ISE-0.591
arte-0.583
Tokenf
Token ID69
Feature activation+1.87e+00

Pos logit contributions
oppable+0.630
 Discrimination+0.628
 Pog+0.608
sov+0.599
onder+0.599
Neg logit contributions
inery-0.825
alogue-0.799
-0.752
ISE-0.751
arte-0.737
Tokenunt
Token ID2797
Feature activation+3.58e-01

Pos logit contributions
inery+1.294
alogue+1.238
ISE+1.145
+1.140
arte+1.132
Neg logit contributions
 Discrimination-1.338
oppable-1.334
 Pog-1.301
 Depot-1.266
sov-1.262
Token p
Token ID279
Feature activation-1.45e+01

Pos logit contributions
inery+0.288
alogue+0.279
+0.262
ISE+0.262
arte+0.258
Neg logit contributions
oppable-0.213
 Discrimination-0.213
 Pog-0.205
onder-0.201
sov-0.200
Tokenère
Token ID35979
Feature activation+3.97e+00

Pos logit contributions
onder+9.126
orf+8.519
outer+8.471
sov+8.449
oppable+8.365
Neg logit contributions
-8.674
inery-8.603
alogue-8.596
nces-8.335
ISE-7.923
Token,
Token ID11
Feature activation-5.05e+00

Pos logit contributions
inery+3.127
alogue+3.029
ISE+2.848
+2.840
nces+2.813
Neg logit contributions
oppable-2.430
 Discrimination-2.364
 Pog-2.261
onder-2.237
 Depot-2.218
Token cr
Token ID1067
Feature activation-5.29e+00

Pos logit contributions
 Discrimination+2.974
 Pog+2.953
oppable+2.929
onder+2.906
sov+2.874
Neg logit contributions
inery-3.969
alogue-3.878
-3.599
ISE-3.558
nces-3.540
Tokené
Token ID2634
Feature activation+2.07e+00

Pos logit contributions
oppable+2.819
onder+2.815
 Pog+2.674
sov+2.632
 Discrimination+2.625
Neg logit contributions
inery-4.345
alogue-4.270
-4.126
nces-4.016
ISE-3.927
Tokenateur
Token ID15093
Feature activation-3.58e+00

Pos logit contributions
inery+1.528
alogue+1.479
ISE+1.377
+1.375
arte+1.357
Neg logit contributions
oppable-1.370
 Discrimination-1.367
 Pog-1.323
onder-1.296
sov-1.294
Token we
Token ID356
Feature activation+3.56e+00

Pos logit contributions
inery+0.401
alogue+0.390
+0.365
ISE+0.365
arte+0.359
Neg logit contributions
oppable-0.291
 Discrimination-0.290
 Pog-0.282
onder-0.274
sov-0.274
Token have
Token ID423
Feature activation+2.04e+00

Pos logit contributions
inery+2.505
alogue+2.431
ISE+2.263
+2.263
arte+2.230
Neg logit contributions
oppable-2.454
 Discrimination-2.454
 Pog-2.367
onder-2.324
sov-2.314
Token seen
Token ID1775
Feature activation+4.08e+00

Pos logit contributions
inery+1.622
alogue+1.575
+1.476
ISE+1.474
arte+1.453
Neg logit contributions
oppable-1.231
 Discrimination-1.225
 Pog-1.181
onder-1.161
sov-1.154
Token when
Token ID618
Feature activation-7.72e-01

Pos logit contributions
inery+3.119
alogue+3.032
ISE+2.840
+2.833
arte+2.775
Neg logit contributions
 Discrimination-2.555
oppable-2.552
 Pog-2.452
onder-2.410
 Depot-2.403
Token we
Token ID356
Feature activation+6.82e-02

Pos logit contributions
oppable+0.484
 Discrimination+0.483
 Pog+0.469
sov+0.459
onder+0.456
Neg logit contributions
inery-0.596
alogue-0.577
ISE-0.540
-0.538
nces-0.531
Token unp
Token ID8593
Feature activation-1.43e+01

Pos logit contributions
inery+0.051
alogue+0.049
+0.046
ISE+0.046
arte+0.045
Neg logit contributions
oppable-0.045
 Discrimination-0.044
 Pog-0.043
sov-0.042
onder-0.042
Tokenacked
Token ID6021
Feature activation+6.16e-01

Pos logit contributions
oppable+8.055
onder+7.589
 Pog+7.033
sov+6.963
orf+6.784
Neg logit contributions
inery-10.419
alogue-10.417
-10.342
nces-10.299
ISE-9.831
Token the
Token ID262
Feature activation+1.64e+00

Pos logit contributions
inery+0.475
alogue+0.459
+0.430
ISE+0.429
arte+0.423
Neg logit contributions
oppable-0.388
 Discrimination-0.387
 Pog-0.375
onder-0.367
sov-0.366
Token norm
Token ID2593
Feature activation+9.25e-01

Pos logit contributions
inery+1.245
alogue+1.204
+1.125
ISE+1.124
arte+1.106
Neg logit contributions
oppable-1.050
 Discrimination-1.048
 Pog-1.016
onder-0.995
sov-0.993
Tokenals
Token ID874
Feature activation+6.19e-01

Pos logit contributions
inery+0.698
alogue+0.676
+0.632
ISE+0.631
arte+0.621
Neg logit contributions
oppable-0.596
 Discrimination-0.595
 Pog-0.576
sov-0.566
onder-0.565
Token,
Token ID11
Feature activation-5.72e-01

Pos logit contributions
inery+0.501
alogue+0.486
+0.456
ISE+0.456
arte+0.449
Neg logit contributions
oppable-0.365
 Discrimination-0.364
 Pog-0.352
onder-0.345
sov-0.344
Token pedd
Token ID36145
Feature activation-5.90e+00

Pos logit contributions
 Discrimination+0.644
oppable+0.640
 Pog+0.628
onder+0.613
sov+0.610
Neg logit contributions
inery-0.796
alogue-0.772
ISE-0.724
-0.719
arte-0.706
Tokenled
Token ID992
Feature activation-2.44e+00

Pos logit contributions
oppable+3.066
onder+2.946
sov+2.934
 Discrimination+2.893
 Pog+2.883
Neg logit contributions
alogue-4.956
inery-4.928
ISE-4.746
-4.701
nces-4.577
Token lofty
Token ID37248
Feature activation-5.86e+00

Pos logit contributions
 Discrimination+1.462
oppable+1.440
 Pog+1.413
 Depot+1.387
onder+1.368
Neg logit contributions
inery-1.950
alogue-1.901
ISE-1.780
-1.779
arte-1.744
Token promises
Token ID10497
Feature activation-3.77e+00

Pos logit contributions
 Discrimination+3.067
oppable+2.999
 Pog+2.966
 Depot+2.942
Dialog+2.832
Neg logit contributions
inery-4.937
alogue-4.834
ISE-4.565
-4.564
emark-4.505
Token and
Token ID290
Feature activation-3.24e+00

Pos logit contributions
oppable+2.138
 Discrimination+2.127
 Pog+2.085
onder+2.040
 Depot+2.019
Neg logit contributions
inery-3.042
alogue-2.991
ISE-2.826
-2.800
arte-2.779
Token p
Token ID279
Feature activation-1.42e+01

Pos logit contributions
 Discrimination+1.864
oppable+1.822
 Pog+1.811
 Depot+1.755
onder+1.744
Neg logit contributions
inery-2.654
alogue-2.589
ISE-2.439
-2.426
arte-2.383
Tokenatted
Token ID16898
Feature activation-6.30e-01

Pos logit contributions
onder+8.392
oppable+7.973
outer+7.414
sov+7.404
orf+7.366
Neg logit contributions
-9.572
ISE-9.568
alogue-9.426
inery-9.206
nces-8.822
Token himself
Token ID2241
Feature activation+8.36e-01

Pos logit contributions
 Discrimination+0.333
oppable+0.330
 Pog+0.322
onder+0.312
sov+0.311
Neg logit contributions
inery-0.549
alogue-0.533
ISE-0.503
-0.501
arte-0.497
Token on
Token ID319
Feature activation-2.31e-02

Pos logit contributions
inery+0.678
alogue+0.657
+0.619
ISE+0.616
arte+0.607
Neg logit contributions
oppable-0.492
 Discrimination-0.490
 Pog-0.473
onder-0.463
sov-0.462
Token the
Token ID262
Feature activation+2.73e+00

Pos logit contributions
oppable+0.015
 Discrimination+0.015
 Pog+0.015
onder+0.015
sov+0.015
Neg logit contributions
inery-0.017
alogue-0.016
-0.015
ISE-0.015
arte-0.015
Token back
Token ID736
Feature activation+2.16e-01

Pos logit contributions
inery+2.128
alogue+2.053
+1.943
ISE+1.926
arte+1.887
Neg logit contributions
oppable-1.703
 Discrimination-1.679
 Pog-1.605
sov-1.590
onder-1.588
Token addiction
Token ID13230
Feature activation+1.41e+00

Pos logit contributions
inery+1.826
alogue+1.749
+1.611
ISE+1.603
arte+1.576
Neg logit contributions
oppable-2.285
 Discrimination-2.263
 Pog-2.208
onder-2.179
sov-2.174
Token and
Token ID290
Feature activation-4.50e-01

Pos logit contributions
inery+1.097
alogue+1.060
+0.993
ISE+0.993
arte+0.976
Neg logit contributions
oppable-0.870
 Discrimination-0.865
 Pog-0.836
onder-0.822
sov-0.822
Token abortion
Token ID7950
Feature activation-1.12e+00

Pos logit contributions
 Discrimination+0.279
oppable+0.276
 Pog+0.271
onder+0.262
 Depot+0.261
Neg logit contributions
inery-0.353
alogue-0.342
ISE-0.320
-0.320
arte-0.315
Token on
Token ID319
Feature activation+5.94e-01

Pos logit contributions
 Discrimination+0.681
oppable+0.674
 Pog+0.659
 Depot+0.638
onder+0.637
Neg logit contributions
inery-0.892
alogue-0.866
-0.810
ISE-0.809
arte-0.802
Token "
Token ID366
Feature activation+1.50e+00

Pos logit contributions
inery+0.464
alogue+0.450
+0.421
ISE+0.421
arte+0.414
Neg logit contributions
oppable-0.367
 Discrimination-0.366
 Pog-0.355
onder-0.347
sov-0.345
TokenP
Token ID47
Feature activation-1.41e+01

Pos logit contributions
inery+1.184
alogue+1.146
+1.076
ISE+1.075
arte+1.059
Neg logit contributions
oppable-0.920
 Discrimination-0.915
 Pog-0.885
onder-0.867
sov-0.865
Tokenenny
Token ID11870
Feature activation+1.44e+00

Pos logit contributions
onder+9.486
oppable+9.315
 Pog+8.558
rotein+8.383
sov+8.362
Neg logit contributions
nces-8.286
-8.248
inery-8.240
alogue-8.235
emark-7.624
Tokenroy
Token ID3287
Feature activation+5.75e+00

Pos logit contributions
inery+1.213
alogue+1.175
+1.111
ISE+1.110
arte+1.091
Neg logit contributions
 Discrimination-0.802
oppable-0.800
 Pog-0.773
onder-0.751
sov-0.750
Tokenal
Token ID282
Feature activation+3.44e+00

Pos logit contributions
inery+4.445
alogue+4.277
arte+3.992
nces+3.950
+3.926
Neg logit contributions
oppable-3.554
 Discrimination-3.508
 Pog-3.389
sov-3.357
onder-3.312
Token Tea
Token ID15777
Feature activation+1.23e+00

Pos logit contributions
inery+2.666
alogue+2.582
ISE+2.411
+2.408
arte+2.379
Neg logit contributions
oppable-2.134
 Discrimination-2.113
 Pog-2.043
sov-2.021
onder-2.014
Token",
Token ID1600
Feature activation+1.99e+00

Pos logit contributions
inery+0.863
alogue+0.832
ISE+0.778
+0.775
arte+0.759
Neg logit contributions
 Discrimination-0.860
oppable-0.858
 Pog-0.836
 Depot-0.818
onder-0.818
Token membership
Token ID9931
Feature activation+6.01e-01

Pos logit contributions
 Discrimination+0.985
oppable+0.974
 Pog+0.960
sov+0.925
onder+0.923
Neg logit contributions
inery-1.236
alogue-1.206
-1.115
nces-1.110
ISE-1.106
Token would
Token ID561
Feature activation+1.52e+00

Pos logit contributions
inery+0.434
alogue+0.420
+0.390
ISE+0.389
arte+0.383
Neg logit contributions
 Discrimination-0.408
oppable-0.408
 Pog-0.397
onder-0.388
sov-0.387
Token be
Token ID307
Feature activation-3.66e-01

Pos logit contributions
inery+1.087
alogue+1.050
ISE+0.976
+0.976
arte+0.960
Neg logit contributions
oppable-1.036
 Discrimination-1.032
 Pog-1.002
onder-0.984
sov-0.982
Token a
Token ID257
Feature activation-7.87e-01

Pos logit contributions
 Discrimination+0.215
oppable+0.212
 Pog+0.209
sov+0.200
 Depot+0.200
Neg logit contributions
inery-0.300
alogue-0.289
-0.272
ISE-0.271
arte-0.269
Token "
Token ID366
Feature activation+3.26e-01

Pos logit contributions
 Discrimination+0.460
oppable+0.455
 Pog+0.447
onder+0.432
 Depot+0.431
Neg logit contributions
inery-0.646
alogue-0.624
-0.586
ISE-0.585
arte-0.579
Tokenp
Token ID79
Feature activation-1.39e+01

Pos logit contributions
inery+0.242
alogue+0.233
+0.218
ISE+0.218
arte+0.214
Neg logit contributions
oppable-0.214
 Discrimination-0.214
 Pog-0.208
onder-0.204
sov-0.204
Tokenunt
Token ID2797
Feature activation+5.70e+00

Pos logit contributions
onder+8.806
oppable+8.077
 Pog+7.793
acious+7.715
sov+7.671
Neg logit contributions
nces-9.229
inery-9.102
-9.025
alogue-8.771
ISE-8.707
Token".
Token ID1911
Feature activation-1.80e+00

Pos logit contributions
inery+4.364
alogue+4.199
+3.918
ISE+3.870
arte+3.840
Neg logit contributions
oppable-3.641
 Discrimination-3.588
 Pog-3.449
 Depot-3.414
-3.409
Token\n
Token ID198
Feature activation+2.41e+00

Pos logit contributions
 Discrimination+1.142
 Pog+1.101
oppable+1.091
sov+1.068
 Depot+1.067
Neg logit contributions
inery-1.397
alogue-1.355
-1.256
ISE-1.248
nces-1.247
Token\n
Token ID198
Feature activation+1.16e+00

Pos logit contributions
inery+1.626
alogue+1.568
+1.466
ISE+1.465
arte+1.437
Neg logit contributions
oppable-1.754
 Discrimination-1.715
 Pog-1.665
onder-1.640
sov-1.630
TokenThe
Token ID464
Feature activation+1.70e+00

Pos logit contributions
inery+0.936
alogue+0.907
ISE+0.855
+0.854
arte+0.840
Neg logit contributions
oppable-0.684
 Discrimination-0.679
 Pog-0.656
onder-0.642
sov-0.638
Token John
Token ID1757
Feature activation-1.06e-01

Pos logit contributions
 Discrimination+0.477
oppable+0.472
 Pog+0.464
onder+0.451
 Depot+0.449
Neg logit contributions
inery-0.607
alogue-0.588
-0.552
ISE-0.551
nces-0.541
Token T
Token ID309
Feature activation+5.63e-02

Pos logit contributions
oppable+0.056
 Discrimination+0.056
 Pog+0.054
onder+0.052
sov+0.052
Neg logit contributions
inery-0.093
alogue-0.090
-0.085
ISE-0.085
arte-0.084
Token.
Token ID13
Feature activation-3.68e+00

Pos logit contributions
inery+0.047
alogue+0.046
ISE+0.043
+0.043
arte+0.042
Neg logit contributions
oppable-0.032
 Discrimination-0.032
 Pog-0.030
onder-0.030
sov-0.030
Token Anderson
Token ID9918
Feature activation-4.10e-01

Pos logit contributions
 Discrimination+2.427
oppable+2.393
 Pog+2.380
 Depot+2.295
onder+2.293
Neg logit contributions
inery-2.725
alogue-2.630
ISE-2.455
-2.448
nces-2.434
Token.
Token ID13
Feature activation-1.84e+00

Pos logit contributions
 Discrimination+0.248
oppable+0.248
 Pog+0.240
onder+0.235
sov+0.234
Neg logit contributions
inery-0.325
alogue-0.315
-0.295
ISE-0.295
arte-0.290
Token Vice
Token ID11079
Feature activation-1.39e+01

Pos logit contributions
 Discrimination+1.105
oppable+1.070
 Pog+1.067
 Depot+1.033
onder+1.030
Neg logit contributions
inery-1.488
alogue-1.449
-1.358
ISE-1.355
nces-1.338
Token Chairman
Token ID12787
Feature activation+3.09e+00

Pos logit contributions
 Discrimination+4.180
onder+3.759
 Chairman+3.744
 Pog+3.713
oppable+3.639
Neg logit contributions
alogue-14.154
inery-13.979
ISE-13.340
-13.128
arte-12.835
Token:
Token ID25
Feature activation-1.95e+00

Pos logit contributions
inery+2.221
alogue+2.136
+1.987
ISE+1.985
arte+1.967
Neg logit contributions
oppable-2.102
 Discrimination-2.074
 Pog-2.003
sov-1.998
-1.967
Token Richard
Token ID6219
Feature activation-9.80e+00

Pos logit contributions
 Discrimination+1.147
oppable+1.127
 Pog+1.114
 Depot+1.078
onder+1.077
Neg logit contributions
inery-1.581
alogue-1.533
ISE-1.446
-1.445
nces-1.417
Token K
Token ID509
Feature activation+1.58e+00

Pos logit contributions
 Pog+4.576
 Discrimination+4.364
 Rib+4.158
oppable+4.156
 Depot+4.096
Neg logit contributions
inery-8.836
alogue-8.549
ISE-8.278
-8.189
nces-8.107
Token.
Token ID13
Feature activation-9.07e+00

Pos logit contributions
inery+1.282
alogue+1.243
ISE+1.166
+1.164
arte+1.159
Neg logit contributions
oppable-0.936
 Discrimination-0.922
 Pog-0.885
sov-0.868
onder-0.859
Token
Token ID960
Feature activation-1.39e+00

Pos logit contributions
 Discrimination+0.389
oppable+0.384
 Pog+0.380
onder+0.367
 Depot+0.367
Neg logit contributions
inery-0.496
alogue-0.483
ISE-0.451
-0.449
arte-0.444
Tokenpr
Token ID1050
Feature activation-6.26e+00

Pos logit contributions
 Discrimination+0.900
oppable+0.894
 Pog+0.876
onder+0.859
sov+0.858
Neg logit contributions
inery-1.042
alogue-1.013
ISE-0.942
-0.940
arte-0.922
Tokenide
Token ID485
Feature activation+4.70e+00

Pos logit contributions
oppable+3.717
 Discrimination+3.656
 Pog+3.645
onder+3.637
sov+3.540
Neg logit contributions
inery-4.763
alogue-4.637
-4.364
ISE-4.345
nces-4.299
Tokenfully
Token ID2759
Feature activation-3.06e+00

Pos logit contributions
inery+3.773
alogue+3.652
ISE+3.428
+3.421
arte+3.396
Neg logit contributions
oppable-2.772
 Discrimination-2.743
 Pog-2.630
sov-2.613
onder-2.582
Token and
Token ID290
Feature activation-1.31e+00

Pos logit contributions
 Discrimination+1.558
oppable+1.554
 Pog+1.512
onder+1.470
 Depot+1.456
Neg logit contributions
inery-2.702
alogue-2.628
ISE-2.481
-2.471
arte-2.429
Token unp
Token ID8593
Feature activation-1.39e+01

Pos logit contributions
 Discrimination+0.821
oppable+0.813
 Pog+0.804
onder+0.784
sov+0.778
Neg logit contributions
inery-1.006
alogue-0.973
ISE-0.914
-0.902
arte-0.893
Tokenrompt
Token ID45700
Feature activation+1.52e-01

Pos logit contributions
onder+6.076
oppable+5.766
orf+5.037
sov+5.026
acious+5.008
Neg logit contributions
-11.793
alogue-11.756
nces-11.514
ISE-11.324
inery-11.261
Tokened
Token ID276
Feature activation+7.98e-01

Pos logit contributions
inery+0.144
alogue+0.141
+0.134
ISE+0.134
arte+0.131
Neg logit contributions
oppable-0.068
 Discrimination-0.067
 Pog-0.064
onder-0.063
sov-0.062
Token
Token ID960
Feature activation-4.70e+00

Pos logit contributions
inery+0.699
alogue+0.679
+0.641
ISE+0.641
arte+0.632
Neg logit contributions
oppable-0.418
 Discrimination-0.416
 Pog-0.400
onder-0.390
sov-0.389
Token he
Token ID339
Feature activation-2.22e-01

Pos logit contributions
 Discrimination+3.232
oppable+3.160
 Pog+3.152
sov+3.141
Dialog+3.106
Neg logit contributions
inery-3.210
alogue-3.193
ISE-2.873
-2.861
arte-2.861
Token said
Token ID531
Feature activation+1.24e+00

Pos logit contributions
oppable+0.140
 Discrimination+0.140
 Pog+0.136
onder+0.134
sov+0.133
Neg logit contributions
inery-0.170
alogue-0.164
ISE-0.153
-0.153
nces-0.150
Tokenmask
Token ID27932
Feature activation-4.18e+00

Pos logit contributions
oppable+0.493
 Discrimination+0.489
 Pog+0.475
sov+0.471
onder+0.471
Neg logit contributions
inery-0.413
alogue-0.397
-0.368
ISE-0.366
arte-0.359
Tokened
Token ID276
Feature activation-6.18e+00

Pos logit contributions
oppable+1.648
 Discrimination+1.587
onder+1.510
sov+1.498
 Pog+1.479
Neg logit contributions
inery-4.164
alogue-4.094
-3.916
ISE-3.895
nces-3.854
Token.
Token ID13
Feature activation-9.38e+00

Pos logit contributions
 Discrimination+3.007
oppable+2.917
 Pog+2.834
onder+2.797
 Depot+2.762
Neg logit contributions
inery-5.495
alogue-5.384
ISE-5.043
-5.010
nces-4.997
Tokencom
Token ID785
Feature activation+1.87e+00

Pos logit contributions
 Discrimination+2.746
oppable+2.477
 Pog+2.475
onder+2.418
+2.317
Neg logit contributions
inery-9.819
alogue-9.684
ISE-9.277
-9.127
arte-9.042
Token and
Token ID290
Feature activation-3.01e+00

Pos logit contributions
inery+1.487
alogue+1.440
+1.351
ISE+1.350
arte+1.331
Neg logit contributions
oppable-1.132
 Discrimination-1.125
 Pog-1.089
sov-1.068
onder-1.066
Token V
Token ID569
Feature activation-1.39e+01

Pos logit contributions
 Discrimination+1.598
oppable+1.562
 Pog+1.534
onder+1.498
 Depot+1.490
Neg logit contributions
inery-2.600
alogue-2.525
ISE-2.388
-2.373
nces-2.351
Tokenodka
Token ID28601
Feature activation-1.09e+01

Pos logit contributions
oppable+5.365
onder+5.365
sov+4.997
 Pog+4.927
Percent+4.702
Neg logit contributions
nces-12.185
inery-11.865
alogue-11.843
[undecodable token]-11.671
-11.658
Token O
Token ID440
Feature activation+3.45e+00

Pos logit contributions
 Pog+3.500
 Discrimination+3.455
oppable+3.340
 Ü+3.240
onder+3.176
Neg logit contributions
inery-11.032
alogue-10.612
nces-10.417
-10.284
ISE-10.202
Token'
Token ID6
Feature activation-1.13e+00

Pos logit contributions
inery+2.454
alogue+2.395
ISE+2.241
+2.234
arte+2.187
Neg logit contributions
oppable-2.346
 Discrimination-2.325
 Pog-2.258
sov-2.196
onder-2.177
TokenClock
Token ID44758
Feature activation-8.70e+00

Pos logit contributions
 Discrimination+0.501
oppable+0.501
 Pog+0.477
onder+0.466
sov+0.462
Neg logit contributions
inery-1.078
alogue-1.049
ISE-0.994
-0.994
arte-0.982
Token Podcast
Token ID16036
Feature activation+1.12e+00

Pos logit contributions
 Discrimination+2.718
 Pog+2.643
oppable+2.504
 Depot+2.394
onder+2.345
Neg logit contributions
inery-8.995
alogue-8.764
nces-8.459
arte-8.445
ISE-8.426
Token.
Token ID13
Feature activation-4.15e-01

Pos logit contributions
inery+1.665
alogue+1.614
+1.504
ISE+1.501
arte+1.473
Neg logit contributions
oppable-1.527
 Discrimination-1.512
 Pog-1.468
onder-1.444
sov-1.441
Token However
Token ID2102
Feature activation+3.84e+00

Pos logit contributions
 Discrimination+0.278
oppable+0.270
 Pog+0.266
 Depot+0.260
onder+0.259
Neg logit contributions
inery-0.309
alogue-0.300
-0.277
ISE-0.277
nces-0.274
Token,
Token ID11
Feature activation+1.75e+00

Pos logit contributions
inery+2.861
alogue+2.755
ISE+2.597
+2.584
arte+2.538
Neg logit contributions
oppable-2.484
 Discrimination-2.448
 Pog-2.405
sov-2.367
onder-2.344
Token gender
Token ID5279
Feature activation+3.49e-01

Pos logit contributions
inery+1.329
alogue+1.286
+1.202
ISE+1.202
arte+1.183
Neg logit contributions
oppable-1.119
 Discrimination-1.112
 Pog-1.079
onder-1.059
sov-1.056
Token non
Token ID1729
Feature activation+3.95e+00

Pos logit contributions
inery+0.337
alogue+0.328
+0.311
ISE+0.311
arte+0.308
Neg logit contributions
 Discrimination-0.151
oppable-0.150
 Pog-0.143
onder-0.139
sov-0.138
Tokencon
Token ID1102
Feature activation+1.33e+01

Pos logit contributions
inery+3.878
alogue+3.780
+3.578
ISE+3.572
nces+3.541
Neg logit contributions
oppable-1.668
 Discrimination-1.597
 Pog-1.566
 Depot-1.507
sov-1.470
Tokenforming
Token ID15464
Feature activation+2.24e+00

Pos logit contributions
inery+8.445
alogue+8.035
+7.465
ISE+7.321
nces+7.300
Neg logit contributions
oppable-9.256
 Pog-9.034
 Discrimination-8.955
-8.832
 Ü-8.717
Token (
Token ID357
Feature activation-1.43e+00

Pos logit contributions
inery+1.513
alogue+1.458
+1.355
ISE+1.354
arte+1.327
Neg logit contributions
oppable-1.617
 Discrimination-1.601
 Pog-1.559
onder-1.534
sov-1.534
TokenG
Token ID38
Feature activation-6.54e+00

Pos logit contributions
 Discrimination+0.862
oppable+0.857
 Pog+0.833
onder+0.815
sov+0.814
Neg logit contributions
inery-1.141
alogue-1.105
-1.026
ISE-1.021
arte-1.018
TokenNC
Token ID7792
Feature activation+2.19e+00

Pos logit contributions
oppable+2.560
 Discrimination+2.547
 Pog+2.481
onder+2.467
Dialog+2.398
Neg logit contributions
inery-6.292
alogue-6.099
-5.801
emark-5.741
nces-5.715
Token)
Token ID8
Feature activation-8.20e-01

Pos logit contributions
inery+1.734
alogue+1.678
ISE+1.584
+1.579
arte+1.542
Neg logit contributions
oppable-1.325
 Discrimination-1.311
 Pog-1.270
onder-1.246
sov-1.245
Token mark
Token ID1317
Feature activation+3.03e+00

Pos logit contributions
inery+2.317
alogue+2.237
ISE+2.098
+2.094
arte+2.071
Neg logit contributions
oppable-2.246
 Discrimination-2.229
 Pog-2.149
sov-2.138
onder-2.137
Token recently
Token ID2904
Feature activation+1.02e+00

Pos logit contributions
inery+2.379
alogue+2.296
+2.150
ISE+2.147
arte+2.119
Neg logit contributions
oppable-1.858
 Discrimination-1.850
 Pog-1.776
sov-1.755
onder-1.749
Token,
Token ID11
Feature activation+1.63e+00

Pos logit contributions
inery+0.811
alogue+0.786
+0.737
ISE+0.737
arte+0.725
Neg logit contributions
oppable-0.617
 Discrimination-0.614
 Pog-0.595
onder-0.582
sov-0.580
Token according
Token ID1864
Feature activation+1.77e+00

Pos logit contributions
inery+1.260
alogue+1.219
+1.140
ISE+1.140
arte+1.122
Neg logit contributions
oppable-1.026
 Discrimination-1.023
 Pog-0.992
onder-0.971
sov-0.967
Token to
Token ID284
Feature activation+3.78e+00

Pos logit contributions
inery+1.321
alogue+1.276
+1.194
ISE+1.191
arte+1.171
Neg logit contributions
oppable-1.161
 Discrimination-1.143
 Pog-1.109
onder-1.089
sov-1.085
Token Mac
Token ID4100
Feature activation+1.12e+01

Pos logit contributions
inery+2.850
alogue+2.734
ISE+2.554
+2.548
nces+2.525
Neg logit contributions
oppable-2.453
 Discrimination-2.406
 Pog-2.340
sov-2.319
onder-2.309
TokenSt
Token ID1273
Feature activation+3.43e+00

Pos logit contributions
inery+6.608
alogue+6.211
arte+5.814
emark+5.763
nces+5.578
Neg logit contributions
oppable-8.545
 Discrimination-8.396
 Pog-8.353
-8.291
sov-8.045
Tokenories
Token ID1749
Feature activation+1.11e+00

Pos logit contributions
inery+3.087
alogue+3.007
ISE+2.835
+2.818
arte+2.810
Neg logit contributions
 Discrimination-1.715
oppable-1.695
 Pog-1.644
sov-1.589
onder-1.588
Token.
Token ID13
Feature activation-2.57e+00

Pos logit contributions
inery+0.905
alogue+0.877
ISE+0.823
+0.823
arte+0.811
Neg logit contributions
oppable-0.657
 Discrimination-0.653
 Pog-0.631
onder-0.617
sov-0.616
Tokennet
Token ID3262
Feature activation+3.99e+00

Pos logit contributions
 Discrimination+1.509
oppable+1.443
 Pog+1.433
 Depot+1.407
onder+1.392
Neg logit contributions
inery-2.073
alogue-2.029
-1.915
ISE-1.914
arte-1.869
Token.
Token ID13
Feature activation+7.31e-01

Pos logit contributions
inery+3.120
alogue+2.977
+2.806
ISE+2.800
arte+2.764
Neg logit contributions
oppable-2.484
 Discrimination-2.444
 Pog-2.374
sov-2.319
onder-2.311
Token.
Token ID13
Feature activation+1.41e+00

Pos logit contributions
 Discrimination+1.386
oppable+1.363
 Pog+1.331
 Depot+1.308
onder+1.289
Neg logit contributions
inery-1.944
alogue-1.885
-1.790
ISE-1.776
nces-1.746
Token (
Token ID357
Feature activation+2.67e+00

Pos logit contributions
inery+1.147
alogue+1.112
+1.044
ISE+1.043
arte+1.028
Neg logit contributions
oppable-0.822
 Discrimination-0.820
 Pog-0.792
onder-0.775
sov-0.772
TokenPhoto
Token ID6191
Feature activation-4.87e+00

Pos logit contributions
inery+2.477
alogue+2.417
ISE+2.299
+2.289
arte+2.266
Neg logit contributions
oppable-1.244
 Discrimination-1.242
 Pog-1.178
onder-1.152
 Depot-1.138
Token By
Token ID2750
Feature activation+7.47e+00

Pos logit contributions
 Discrimination+2.543
oppable+2.531
 Pog+2.470
 Depot+2.382
onder+2.376
Neg logit contributions
inery-4.206
alogue-4.110
ISE-3.902
-3.857
nces-3.826
Token Bill
Token ID3941
Feature activation-1.75e+00

Pos logit contributions
inery+5.631
alogue+5.432
+5.068
ISE+5.023
arte+4.996
Neg logit contributions
oppable-4.672
 Discrimination-4.575
onder-4.399
sov-4.390
-4.353
Token Clark
Token ID11264
Feature activation+1.05e+01

Pos logit contributions
 Discrimination+1.003
oppable+0.998
 Pog+0.978
 Depot+0.946
onder+0.943
Neg logit contributions
inery-1.444
alogue-1.405
ISE-1.317
-1.313
nces-1.298
Token/
Token ID14
Feature activation+2.82e+00

Pos logit contributions
inery+6.656
alogue+6.515
arte+6.004
+5.911
ISE+5.865
Neg logit contributions
oppable-7.313
 Discrimination-7.110
 Pog-6.886
onder-6.872
Dialog-6.869
TokenC
Token ID34
Feature activation+4.12e+00

Pos logit contributions
inery+2.067
alogue+2.007
+1.890
ISE+1.887
arte+1.855
Neg logit contributions
oppable-1.860
 Discrimination-1.844
 Pog-1.789
onder-1.759
sov-1.750
TokenQ
Token ID48
Feature activation+2.75e+00

Pos logit contributions
inery+3.224
alogue+3.099
ISE+2.893
arte+2.867
+2.857
Neg logit contributions
 Discrimination-2.604
oppable-2.572
 Pog-2.503
sov-2.414
-2.389
Token Roll
Token ID8299
Feature activation+1.23e+00

Pos logit contributions
inery+1.609
alogue+1.562
ISE+1.437
+1.420
nces+1.384
Neg logit contributions
oppable-2.232
 Discrimination-2.162
sov-2.161
 Pog-2.123
onder-2.102
Token Call
Token ID4889
Feature activation+4.25e+00

Pos logit contributions
inery+0.880
alogue+0.852
ISE+0.787
+0.786
arte+0.785
Neg logit contributions
oppable-0.840
 Discrimination-0.831
 Pog-0.809
sov-0.805
onder-0.791
Token questions
Token ID2683
Feature activation+3.46e+00

Pos logit contributions
 Discrimination+1.717
oppable+1.689
 Pog+1.670
onder+1.628
sov+1.621
Neg logit contributions
inery-2.254
alogue-2.148
-2.020
arte-2.017
ISE-2.003
Token was
Token ID373
Feature activation-1.07e-01

Pos logit contributions
inery+2.657
alogue+2.562
ISE+2.403
+2.387
arte+2.359
Neg logit contributions
oppable-2.194
 Discrimination-2.183
 Pog-2.104
sov-2.066
onder-2.056
Token just
Token ID655
Feature activation+3.32e+00

Pos logit contributions
 Discrimination+0.063
oppable+0.063
 Pog+0.061
onder+0.059
sov+0.059
Neg logit contributions
inery-0.086
alogue-0.083
-0.079
ISE-0.078
arte-0.077
Token coincidence
Token ID21083
Feature activation+6.58e+00

Pos logit contributions
inery+2.525
alogue+2.439
+2.283
ISE+2.278
arte+2.242
Neg logit contributions
oppable-2.118
 Discrimination-2.101
 Pog-2.035
onder-2.003
sov-2.001
Token,
Token ID11
Feature activation+4.46e+00

Pos logit contributions
inery+4.884
alogue+4.756
ISE+4.427
+4.382
nces+4.380
Neg logit contributions
oppable-4.171
 Discrimination-4.119
 Pog-3.979
onder-3.953
sov-3.947
Token natural
Token ID3288
Feature activation+1.00e+01

Pos logit contributions
inery+3.399
alogue+3.295
ISE+3.077
+3.072
arte+3.027
Neg logit contributions
oppable-2.812
 Discrimination-2.776
 Pog-2.692
sov-2.659
onder-2.655
Token occurrence
Token ID19810
Feature activation+4.27e+00

Pos logit contributions
inery+7.227
alogue+6.862
ISE+6.519
+6.443
arte+6.331
Neg logit contributions
oppable-6.398
 Discrimination-6.084
onder-5.972
 Depot-5.935
 Pog-5.909
Token or
Token ID393
Feature activation-3.56e-01

Pos logit contributions
inery+3.341
alogue+3.243
ISE+3.039
+3.027
nces+2.998
Neg logit contributions
oppable-2.600
 Discrimination-2.562
 Pog-2.495
onder-2.455
sov-2.442
Token mis
Token ID2984
Feature activation-2.85e+00

Pos logit contributions
 Discrimination+0.223
oppable+0.221
 Pog+0.216
onder+0.209
sov+0.209
Neg logit contributions
inery-0.277
alogue-0.268
-0.251
ISE-0.250
arte-0.247
Tokenident
Token ID738
Feature activation+4.22e+00

Pos logit contributions
oppable+1.641
 Discrimination+1.634
onder+1.569
 Pog+1.566
sov+1.554
Neg logit contributions
inery-2.300
alogue-2.236
-2.111
ISE-2.103
arte-2.062
Tokenification
Token ID2649
Feature activation-1.53e+00

Pos logit contributions
inery+2.659
alogue+2.542
+2.335
ISE+2.319
arte+2.270
Neg logit contributions
 Discrimination-3.243
oppable-3.225
 Pog-3.149
onder-3.102
 Depot-3.091
Token that
Token ID326
Feature activation+2.66e+00

Pos logit contributions
inery+1.058
alogue+1.017
+0.956
ISE+0.952
arte+0.936
Neg logit contributions
oppable-0.850
 Discrimination-0.844
 Pog-0.812
onder-0.801
sov-0.797
Token Hillary
Token ID5073
Feature activation+3.60e+00

Pos logit contributions
inery+2.097
alogue+2.021
+1.899
ISE+1.895
nces+1.868
Neg logit contributions
oppable-1.625
 Discrimination-1.606
 Pog-1.550
onder-1.526
sov-1.524
Token Clinton
Token ID2605
Feature activation+1.09e+01

Pos logit contributions
inery+2.080
alogue+1.964
+1.812
ISE+1.805
arte+1.754
Neg logit contributions
oppable-2.963
 Discrimination-2.927
 Pog-2.849
sov-2.822
onder-2.821
Token
Token ID447
Feature activation-1.57e+00

Pos logit contributions
inery+7.298
alogue+6.869
+6.728
sle+6.651
ISE+6.516
Neg logit contributions
oppable-7.224
 Discrimination-6.991
sov-6.801
Dialog-6.774
-6.745
Token
Token ID247
Feature activation+4.06e+00

Pos logit contributions
oppable+1.314
 Discrimination+1.312
 Pog+1.277
 Depot+1.264
onder+1.262
Neg logit contributions
inery-0.877
alogue-0.835
ISE-0.762
-0.750
arte-0.740
Tokens
Token ID82
Feature activation+1.25e+01

Pos logit contributions
inery+2.836
alogue+2.718
+2.538
ISE+2.528
arte+2.488
Neg logit contributions
oppable-2.848
 Discrimination-2.837
 Pog-2.730
sov-2.710
onder-2.697
Token aide
Token ID18727
Feature activation+5.18e+00

Pos logit contributions
inery+8.211
alogue+7.695
+7.468
sle+7.361
emark+7.346
Neg logit contributions
oppable-8.358
 Discrimination-7.943
sov-7.801
-7.721
Dialog-7.625
Token,
Token ID11
Feature activation+4.79e+00

Pos logit contributions
inery+3.585
alogue+3.361
+3.167
ISE+3.133
arte+3.071
Neg logit contributions
oppable-3.723
 Discrimination-3.691
 Pog-3.565
sov-3.541
onder-3.476
Token Huma
Token ID49387
Feature activation+2.33e-01

Pos logit contributions
inery+3.168
alogue+2.997
+2.793
ISE+2.787
arte+2.727
Neg logit contributions
oppable-3.568
 Discrimination-3.541
 Pog-3.433
sov-3.412
onder-3.374
Token Abedin
Token ID39562
Feature activation+2.98e+00

Pos logit contributions
inery+0.153
alogue+0.149
+0.137
ISE+0.136
nces+0.135
Neg logit contributions
 Discrimination-0.173
oppable-0.171
 Pog-0.169
onder-0.166
 Depot-0.165
Token,
Token ID11
Feature activation+4.49e+00

Pos logit contributions
inery+2.257
alogue+2.171
+2.047
ISE+2.039
arte+2.005
Neg logit contributions
oppable-1.915
 Discrimination-1.904
 Pog-1.844
sov-1.810
onder-1.810
Token to
Token ID284
Feature activation+2.84e+00

Pos logit contributions
 Discrimination+0.659
oppable+0.654
 Pog+0.637
sov+0.623
onder+0.621
Neg logit contributions
inery-0.856
alogue-0.830
ISE-0.774
-0.774
nces-0.767
Token draw
Token ID3197
Feature activation+2.95e+00

Pos logit contributions
inery+2.070
alogue+2.012
+1.873
ISE+1.870
arte+1.847
Neg logit contributions
oppable-1.899
 Discrimination-1.885
 Pog-1.820
onder-1.785
sov-1.781
Token something
Token ID1223
Feature activation+9.35e-01

Pos logit contributions
inery+2.263
alogue+2.198
+2.058
ISE+2.054
arte+2.019
Neg logit contributions
oppable-1.861
 Discrimination-1.834
 Pog-1.755
onder-1.746
sov-1.734
Token that
Token ID326
Feature activation+3.00e+00

Pos logit contributions
inery+0.736
alogue+0.713
ISE+0.668
+0.667
arte+0.657
Neg logit contributions
oppable-0.573
 Discrimination-0.572
 Pog-0.554
onder-0.542
sov-0.540
Token was
Token ID373
Feature activation+2.96e+00

Pos logit contributions
inery+2.243
alogue+2.174
+2.034
ISE+2.024
arte+1.998
Neg logit contributions
oppable-1.957
 Discrimination-1.938
 Pog-1.861
onder-1.844
sov-1.834
Token more
Token ID517
Feature activation+6.00e+00

Pos logit contributions
inery+2.279
alogue+2.218
+2.076
ISE+2.065
arte+2.042
Neg logit contributions
oppable-1.855
 Discrimination-1.842
 Pog-1.776
onder-1.754
 Depot-1.743
Token interesting
Token ID3499
Feature activation+5.75e+00

Pos logit contributions
inery+4.385
alogue+4.285
+3.973
ISE+3.950
arte+3.905
Neg logit contributions
oppable-3.911
 Discrimination-3.800
onder-3.667
 Depot-3.634
 Pog-3.627
Token than
Token ID621
Feature activation+3.48e+00

Pos logit contributions
inery+4.190
alogue+4.053
+3.796
ISE+3.740
nces+3.710
Neg logit contributions
oppable-3.782
 Discrimination-3.734
 Depot-3.586
 Pog-3.569
onder-3.542
Token anything
Token ID1997
Feature activation+3.60e+00

Pos logit contributions
inery+2.648
alogue+2.554
+2.395
ISE+2.382
arte+2.350
Neg logit contributions
oppable-2.235
 Discrimination-2.203
 Pog-2.124
onder-2.114
sov-2.093
Token you
Token ID345
Feature activation+2.30e+00

Pos logit contributions
inery+2.745
alogue+2.655
+2.487
ISE+2.474
arte+2.442
Neg logit contributions
oppable-2.301
 Discrimination-2.277
 Pog-2.188
onder-2.171
sov-2.154
Token could
Token ID714
Feature activation+6.03e+00

Pos logit contributions
inery+1.704
alogue+1.648
+1.537
ISE+1.536
arte+1.512
Neg logit contributions
oppable-1.513
 Discrimination-1.508
 Pog-1.463
onder-1.434
sov-1.430
Token of
Token ID286
Feature activation-2.81e+00

Pos logit contributions
oppable+1.196
 Discrimination+1.183
 Pog+1.142
onder+1.116
sov+1.096
Neg logit contributions
inery-2.192
alogue-2.120
ISE-2.015
-2.002
arte-1.980
Token all
Token ID477
Feature activation-3.95e+00

Pos logit contributions
 Discrimination+1.828
oppable+1.807
 Pog+1.750
onder+1.721
 Depot+1.712
Neg logit contributions
inery-2.110
alogue-2.021
-1.896
ISE-1.896
arte-1.887
Token income
Token ID3739
Feature activation-6.68e+00

Pos logit contributions
 Discrimination+2.030
oppable+1.969
 Pog+1.940
onder+1.860
+1.850
Neg logit contributions
inery-3.518
alogue-3.403
-3.208
ISE-3.204
arte-3.198
Token classes
Token ID6097
Feature activation-1.06e+00

Pos logit contributions
 Discrimination+2.867
oppable+2.826
 Pog+2.716
onder+2.617
sov+2.616
Neg logit contributions
inery-6.321
alogue-6.226
-5.868
arte-5.867
ISE-5.849
Token
Token ID960
Feature activation+6.40e-01

Pos logit contributions
oppable+0.531
 Discrimination+0.529
 Pog+0.506
onder+0.495
sov+0.490
Neg logit contributions
inery-0.951
alogue-0.926
ISE-0.875
-0.873
arte-0.863
TokenFederal
Token ID24099
Feature activation+2.19e+00

Pos logit contributions
inery+0.448
alogue+0.432
+0.401
ISE+0.400
arte+0.393
Neg logit contributions
oppable-0.448
 Discrimination-0.448
 Pog-0.435
onder-0.428
sov-0.427
Token Communications
Token ID14620
Feature activation-1.47e+00

Pos logit contributions
inery+2.020
alogue+1.961
ISE+1.860
+1.855
nces+1.827
Neg logit contributions
oppable-1.054
 Discrimination-1.036
 Pog-1.001
onder-0.973
sov-0.971
Token Commission
Token ID4513
Feature activation-1.22e+00

Pos logit contributions
 Discrimination+0.694
oppable+0.694
 Pog+0.663
onder+0.645
sov+0.642
Neg logit contributions
inery-1.369
alogue-1.333
-1.261
ISE-1.260
arte-1.245
Token chairman
Token ID8900
Feature activation+2.98e+00

Pos logit contributions
 Discrimination+0.478
oppable+0.470
 Pog+0.450
onder+0.436
 Depot+0.430
Neg logit contributions
inery-1.224
alogue-1.202
ISE-1.137
-1.137
arte-1.124
Token Aj
Token ID22028
Feature activation-2.94e+00

Pos logit contributions
inery+2.726
alogue+2.633
+2.505
ISE+2.490
arte+2.450
Neg logit contributions
oppable-1.464
 Discrimination-1.414
sov-1.349
 Pog-1.348
-1.336
Tokenit
Token ID270
Feature activation-7.66e+00

Pos logit contributions
 Discrimination+1.986
onder+1.949
 Rib+1.904
sov+1.875
 Pog+1.866
Neg logit contributions
alogue-2.131
inery-2.044
-1.968
nces-1.890
ISE-1.885
Token soccer
Token ID11783
Feature activation-4.10e-02

Pos logit contributions
oppable+0.145
 Discrimination+0.144
 Pog+0.140
onder+0.137
sov+0.137
Neg logit contributions
inery-0.174
alogue-0.169
ISE-0.158
-0.157
arte-0.155
Token for
Token ID329
Feature activation+1.99e+00

Pos logit contributions
oppable+0.024
 Discrimination+0.024
 Pog+0.024
onder+0.023
sov+0.023
Neg logit contributions
inery-0.033
alogue-0.032
ISE-0.030
-0.030
arte-0.030
Token good
Token ID922
Feature activation-1.44e+00

Pos logit contributions
inery+1.520
alogue+1.469
ISE+1.375
+1.375
arte+1.355
Neg logit contributions
oppable-1.268
 Discrimination-1.262
 Pog-1.220
onder-1.200
sov-1.196
Token when
Token ID618
Feature activation+2.36e+00

Pos logit contributions
oppable+0.836
 Discrimination+0.836
 Pog+0.816
onder+0.792
sov+0.785
Neg logit contributions
inery-1.185
alogue-1.151
-1.076
ISE-1.067
arte-1.059
Token he
Token ID339
Feature activation+3.16e+00

Pos logit contributions
inery+1.953
alogue+1.885
ISE+1.777
+1.776
arte+1.756
Neg logit contributions
oppable-1.349
 Discrimination-1.343
 Pog-1.274
onder-1.262
sov-1.256
Token was
Token ID373
Feature activation+5.25e+00

Pos logit contributions
inery+2.278
alogue+2.214
+2.062
arte+2.050
ISE+2.049
Neg logit contributions
 Discrimination-2.103
oppable-2.099
 Pog-1.995
-1.988
 Depot-1.969
Token about
Token ID546
Feature activation+7.71e+00

Pos logit contributions
inery+3.413
alogue+3.289
ISE+3.048
+3.042
arte+3.019
Neg logit contributions
 Discrimination-3.817
oppable-3.817
 Pog-3.648
sov-3.625
-3.622
Token 13
Token ID1511
Feature activation+4.32e+00

Pos logit contributions
inery+4.225
alogue+4.033
ISE+3.774
+3.761
arte+3.761
Neg logit contributions
oppable-6.208
 Discrimination-6.091
-5.905
onder-5.857
 Pog-5.826
Token.
Token ID13
Feature activation+2.20e+00

Pos logit contributions
inery+3.237
alogue+3.119
ISE+2.940
+2.935
arte+2.906
Neg logit contributions
oppable-2.755
 Discrimination-2.749
 Pog-2.629
sov-2.592
Dialog-2.585
Token
Token ID198
Feature activation+1.53e+00

Pos logit contributions
inery+1.586
alogue+1.528
ISE+1.429
+1.427
arte+1.409
Neg logit contributions
oppable-1.491
 Discrimination-1.463
 Pog-1.416
onder-1.408
sov-1.397
Token
Token ID198
Feature activation+2.52e+00

Pos logit contributions
inery+1.056
alogue+1.016
ISE+0.950
+0.948
arte+0.934
Neg logit contributions
oppable-1.095
 Discrimination-1.080
 Pog-1.047
onder-1.029
sov-1.024
Token now
Token ID783
Feature activation+1.19e+00

Pos logit contributions
 Discrimination+1.055
oppable+1.049
 Pog+1.026
onder+0.997
 Depot+0.991
Neg logit contributions
inery-1.505
alogue-1.458
-1.375
ISE-1.363
nces-1.356
Token seems
Token ID2331
Feature activation-2.12e-01

Pos logit contributions
inery+0.988
alogue+0.958
+0.901
ISE+0.901
arte+0.888
Neg logit contributions
oppable-0.675
 Discrimination-0.673
 Pog-0.648
onder-0.634
sov-0.632
Token somehow
Token ID7599
Feature activation+3.15e+00

Pos logit contributions
oppable+0.134
 Discrimination+0.134
 Pog+0.130
onder+0.127
sov+0.127
Neg logit contributions
inery-0.163
alogue-0.157
ISE-0.147
-0.147
arte-0.145
Token cumbersome
Token ID44491
Feature activation+2.95e+00

Pos logit contributions
inery+2.405
alogue+2.344
+2.188
ISE+2.180
arte+2.147
Neg logit contributions
oppable-1.987
 Discrimination-1.972
 Pog-1.909
onder-1.868
sov-1.859
Token and
Token ID290
Feature activation+5.20e-01

Pos logit contributions
inery+2.320
alogue+2.250
+2.105
ISE+2.104
arte+2.064
Neg logit contributions
oppable-1.803
 Discrimination-1.794
 Pog-1.731
onder-1.696
sov-1.696
Token archaic
Token ID40435
Feature activation+2.69e+00

Pos logit contributions
inery+0.404
alogue+0.390
ISE+0.366
+0.365
arte+0.360
Neg logit contributions
 Discrimination-0.325
oppable-0.325
 Pog-0.314
onder-0.308
sov-0.306
Token --
Token ID1377
Feature activation+3.10e+00

Pos logit contributions
inery+2.168
alogue+2.103
+1.970
ISE+1.970
arte+1.937
Neg logit contributions
oppable-1.599
 Discrimination-1.591
 Pog-1.536
onder-1.504
sov-1.503
Token or
Token ID393
Feature activation+1.05e+00

Pos logit contributions
inery+2.374
alogue+2.307
+2.148
ISE+2.142
arte+2.116
Neg logit contributions
oppable-1.960
 Discrimination-1.947
 Pog-1.887
onder-1.851
sov-1.848
Token,
Token ID11
Feature activation+9.75e-01

Pos logit contributions
inery+0.834
alogue+0.808
ISE+0.757
+0.757
arte+0.745
Neg logit contributions
oppable-0.641
 Discrimination-0.639
 Pog-0.618
onder-0.606
sov-0.604
Token worse
Token ID4785
Feature activation+6.21e+00

Pos logit contributions
inery+0.754
alogue+0.729
ISE+0.682
+0.682
arte+0.671
Neg logit contributions
oppable-0.611
 Discrimination-0.610
 Pog-0.591
onder-0.579
sov-0.576
Token,
Token ID11
Feature activation+3.79e+00

Pos logit contributions
inery+4.362
alogue+4.248
ISE+4.057
+3.974
emark+3.900
Neg logit contributions
oppable-4.138
 Discrimination-3.973
onder-3.942
 Pog-3.917
sov-3.906
Token into
Token ID656
Feature activation-2.84e+00

Pos logit contributions
inery+0.013
alogue+0.013
ISE+0.012
+0.012
arte+0.012
Neg logit contributions
oppable-0.009
 Discrimination-0.009
 Pog-0.009
sov-0.009
onder-0.009
Token Street
Token ID3530
Feature activation-3.45e+00

Pos logit contributions
oppable+1.716
 Discrimination+1.705
 Pog+1.636
 Depot+1.603
+1.598
Neg logit contributions
inery-2.272
alogue-2.190
-2.039
ISE-2.036
nces-2.013
Token Fighter
Token ID15411
Feature activation+4.09e+00

Pos logit contributions
oppable+1.283
 Discrimination+1.273
 Pog+1.223
 Slug+1.183
 Depot+1.148
Neg logit contributions
inery-3.515
alogue-3.447
ISE-3.317
-3.277
nces-3.257
Token V
Token ID569
Feature activation-2.78e+00

Pos logit contributions
inery+3.050
alogue+2.961
+2.779
ISE+2.764
arte+2.735
Neg logit contributions
oppable-2.640
 Discrimination-2.609
 Pog-2.539
onder-2.504
sov-2.483
Token.
Token ID13
Feature activation-7.35e-01

Pos logit contributions
oppable+1.535
 Discrimination+1.511
 Pog+1.449
onder+1.425
+1.410
Neg logit contributions
inery-2.370
alogue-2.285
-2.141
ISE-2.134
nces-2.118
Token Which
Token ID9022
Feature activation+1.53e+00

Pos logit contributions
 Discrimination+0.496
oppable+0.484
 Pog+0.478
 Depot+0.466
onder+0.463
Neg logit contributions
inery-0.542
alogue-0.522
nces-0.484
-0.481
ISE-0.481
Token fighting
Token ID4330
Feature activation-9.18e-01

Pos logit contributions
inery+1.223
alogue+1.184
+1.110
ISE+1.109
arte+1.092
Neg logit contributions
oppable-0.915
 Discrimination-0.913
 Pog-0.883
onder-0.863
sov-0.860
Token style
Token ID3918
Feature activation-1.42e+00

Pos logit contributions
oppable+0.512
 Discrimination+0.510
 Pog+0.498
onder+0.482
sov+0.480
Neg logit contributions
inery-0.772
alogue-0.745
ISE-0.706
-0.698
arte-0.688
Token do
Token ID466
Feature activation-3.69e-01

Pos logit contributions
 Discrimination+0.787
oppable+0.785
 Pog+0.761
onder+0.743
 Depot+0.736
Neg logit contributions
inery-1.197
alogue-1.159
ISE-1.086
-1.084
nces-1.076
Token you
Token ID345
Feature activation+1.68e+00

Pos logit contributions
oppable+0.185
 Discrimination+0.183
 Pog+0.177
onder+0.172
 Depot+0.170
Neg logit contributions
inery-0.332
alogue-0.323
ISE-0.305
-0.304
nces-0.300
Token think
Token ID892
Feature activation-8.68e-01

Pos logit contributions
inery+1.247
alogue+1.207
+1.126
ISE+1.125
arte+1.107
Neg logit contributions
oppable-1.099
 Discrimination-1.094
 Pog-1.061
onder-1.041
sov-1.039
TokenApparently
Token ID30402
Feature activation-2.84e+00

Pos logit contributions
inery+1.702
alogue+1.649
ISE+1.547
+1.544
arte+1.523
Neg logit contributions
oppable-1.427
 Discrimination-1.418
 Pog-1.371
onder-1.341
sov-1.339
Token she
Token ID673
Feature activation-6.89e-01

Pos logit contributions
 Discrimination+1.637
oppable+1.610
 Pog+1.580
onder+1.537
 Depot+1.525
Neg logit contributions
inery-2.320
alogue-2.249
-2.137
ISE-2.117
nces-2.093
Token was
Token ID373
Feature activation-6.90e-01

Pos logit contributions
oppable+0.394
 Discrimination+0.392
 Pog+0.382
onder+0.374
sov+0.370
Neg logit contributions
inery-0.570
alogue-0.553
ISE-0.521
-0.519
nces-0.508
Token originally
Token ID6198
Feature activation-1.85e+00

Pos logit contributions
 Discrimination+0.423
oppable+0.420
 Pog+0.412
onder+0.400
 misunderstood+0.397
Neg logit contributions
inery-0.545
alogue-0.525
ISE-0.496
-0.493
nces-0.485
Token inspired
Token ID7867
Feature activation-2.22e+00

Pos logit contributions
 Discrimination+1.107
oppable+1.096
 Pog+1.082
onder+1.043
 Depot+1.039
Neg logit contributions
inery-1.472
alogue-1.428
ISE-1.339
-1.334
nces-1.327
Token by
Token ID416
Feature activation-5.18e-02

Pos logit contributions
oppable+1.187
 Discrimination+1.185
 Pog+1.155
onder+1.122
sov+1.116
Neg logit contributions
inery-1.898
alogue-1.841
ISE-1.744
-1.740
nces-1.716
Token her
Token ID607
Feature activation+6.80e-01

Pos logit contributions
 Discrimination+0.033
oppable+0.033
 Pog+0.032
onder+0.031
 Depot+0.031
Neg logit contributions
inery-0.039
alogue-0.038
-0.036
ISE-0.036
nces-0.035
Token love
Token ID1842
Feature activation+1.41e+00

Pos logit contributions
inery+0.520
alogue+0.502
+0.470
ISE+0.470
nces+0.462
Neg logit contributions
 Discrimination-0.433
oppable-0.433
 Pog-0.421
onder-0.412
 Depot-0.410
Token of
Token ID286
Feature activation+8.96e-01

Pos logit contributions
inery+1.097
alogue+1.065
ISE+0.994
+0.992
arte+0.983
Neg logit contributions
 Discrimination-0.871
oppable-0.871
 Pog-0.840
sov-0.825
onder-0.819
Token Japanese
Token ID4960
Feature activation-1.54e+00

Pos logit contributions
inery+0.661
alogue+0.639
ISE+0.596
+0.596
arte+0.585
Neg logit contributions
oppable-0.593
 Discrimination-0.592
 Pog-0.575
onder-0.563
sov-0.561
Token anime
Token ID11984
Feature activation-1.33e+00

Pos logit contributions
 Discrimination+1.078
oppable+1.062
 Pog+1.052
 Depot+1.025
onder+1.024
Neg logit contributions
inery-1.083
alogue-1.040
ISE-0.969
-0.965
nces-0.955
Token trying
Token ID2111
Feature activation-5.21e-02

Pos logit contributions
inery+2.508
alogue+2.434
+2.285
ISE+2.275
arte+2.243
Neg logit contributions
oppable-1.949
 Discrimination-1.941
 Pog-1.875
onder-1.843
sov-1.836
Token to
Token ID284
Feature activation+9.20e-01

Pos logit contributions
oppable+0.033
 Discrimination+0.033
 Pog+0.032
onder+0.031
sov+0.031
Neg logit contributions
inery-0.040
alogue-0.039
-0.036
ISE-0.036
arte-0.036
Token support
Token ID1104
Feature activation+1.74e+00

Pos logit contributions
inery+0.601
alogue+0.574
ISE+0.530
+0.529
arte+0.520
Neg logit contributions
oppable-0.689
 Discrimination-0.688
 Pog-0.670
onder-0.659
sov-0.657
Token,
Token ID11
Feature activation+9.16e-01

Pos logit contributions
inery+1.378
alogue+1.335
+1.251
ISE+1.251
arte+1.231
Neg logit contributions
oppable-1.053
 Discrimination-1.049
 Pog-1.015
onder-0.994
sov-0.991
Token chances
Token ID8395
Feature activation+1.81e+00

Pos logit contributions
inery+0.724
alogue+0.700
ISE+0.655
+0.654
arte+0.644
Neg logit contributions
 Discrimination-0.560
oppable-0.560
 Pog-0.541
onder-0.529
sov-0.528
Token are
Token ID389
Feature activation-9.71e-01

Pos logit contributions
inery+1.333
alogue+1.289
ISE+1.202
+1.201
arte+1.181
Neg logit contributions
oppable-1.199
 Discrimination-1.195
 Pog-1.159
onder-1.137
sov-1.134
Token you
Token ID345
Feature activation+7.65e-01

Pos logit contributions
 Discrimination+0.555
oppable+0.550
 Pog+0.534
+0.517
 Depot+0.517
Neg logit contributions
inery-0.807
alogue-0.780
ISE-0.729
-0.729
arte-0.724
Token know
Token ID760
Feature activation+1.57e+00

Pos logit contributions
inery+0.593
alogue+0.574
+0.537
ISE+0.536
arte+0.528
Neg logit contributions
oppable-0.478
 Discrimination-0.476
 Pog-0.461
onder-0.453
sov-0.451
Token whether
Token ID1771
Feature activation+2.27e+00

Pos logit contributions
inery+1.159
alogue+1.119
+1.044
ISE+1.044
arte+1.026
Neg logit contributions
oppable-1.044
 Discrimination-1.039
 Pog-1.008
onder-0.990
sov-0.988
Token they
Token ID484
Feature activation+3.50e+00

Pos logit contributions
inery+1.786
alogue+1.730
+1.627
ISE+1.623
arte+1.599
Neg logit contributions
oppable-1.384
 Discrimination-1.375
 Pog-1.330
onder-1.305
sov-1.300
Token prefer
Token ID4702
Feature activation+3.91e+00

Pos logit contributions
inery+2.542
alogue+2.456
+2.305
ISE+2.301
arte+2.265
Neg logit contributions
oppable-2.334
 Discrimination-2.331
 Pog-2.256
sov-2.218
onder-2.208
Token
Token ID564
Feature activation+6.10e+00

Pos logit contributions
inery+1.956
alogue+1.891
ISE+1.771
+1.768
arte+1.742
Neg logit contributions
oppable-1.601
 Discrimination-1.586
 Pog-1.538
onder-1.512
sov-1.510
Token
Token ID250
Feature activation-7.71e-01

Pos logit contributions
inery+3.173
alogue+3.056
+2.729
ISE+2.719
arte+2.661
Neg logit contributions
oppable-5.382
 Discrimination-5.249
 Pog-5.190
sov-5.130
onder-5.121
TokenDo
Token ID5211
Feature activation-6.00e-01

Pos logit contributions
 Discrimination+0.468
oppable+0.462
 Pog+0.447
onder+0.440
sov+0.439
Neg logit contributions
inery-0.613
alogue-0.594
-0.557
ISE-0.555
arte-0.549
Token you
Token ID345
Feature activation+5.81e+00

Pos logit contributions
 Discrimination+0.313
oppable+0.312
 Pog+0.300
onder+0.292
 Depot+0.290
Neg logit contributions
inery-0.527
alogue-0.514
-0.485
ISE-0.484
arte-0.478
Token want
Token ID765
Feature activation+4.21e+00

Pos logit contributions
inery+4.018
alogue+3.958
ISE+3.676
+3.665
arte+3.617
Neg logit contributions
oppable-3.958
 Discrimination-3.841
sov-3.725
 Pog-3.725
-3.675
Token Catalonia
Token ID33859
Feature activation-6.75e-01

Pos logit contributions
inery+3.099
alogue+2.997
ISE+2.801
+2.790
nces+2.738
Neg logit contributions
oppable-2.788
 Discrimination-2.707
 Pog-2.625
onder-2.602
sov-2.597
Token to
Token ID284
Feature activation+1.64e+00

Pos logit contributions
 Discrimination+0.408
oppable+0.406
 Pog+0.396
sov+0.385
onder+0.384
Neg logit contributions
inery-0.539
alogue-0.520
-0.486
ISE-0.485
arte-0.479
Token become
Token ID1716
Feature activation+3.34e+00

Pos logit contributions
inery+1.248
alogue+1.205
+1.124
ISE+1.124
arte+1.106
Neg logit contributions
oppable-1.046
 Discrimination-1.044
 Pog-1.015
onder-0.995
sov-0.991
Token a
Token ID257
Feature activation+2.03e+00

Pos logit contributions
inery+2.401
alogue+2.328
ISE+2.181
+2.167
arte+2.138
Neg logit contributions
oppable-2.258
 Discrimination-2.235
 Pog-2.158
sov-2.133
onder-2.123
Token state
Token ID1181
Feature activation+2.78e+00

Pos logit contributions
inery+1.602
alogue+1.550
ISE+1.456
+1.454
arte+1.433
Neg logit contributions
oppable-1.240
 Discrimination-1.234
 Pog-1.192
onder-1.168
sov-1.166
Token?
Token ID30
Feature activation+2.18e+00

Pos logit contributions
inery+2.258
alogue+2.199
ISE+2.062
+2.054
arte+2.031
Neg logit contributions
oppable-1.617
 Discrimination-1.608
 Pog-1.546
onder-1.516
sov-1.514
Token the
Token ID262
Feature activation+1.14e+00

Pos logit contributions
inery+2.036
alogue+1.969
ISE+1.844
+1.838
arte+1.807
Neg logit contributions
oppable-1.796
 Discrimination-1.775
 Pog-1.714
sov-1.691
onder-1.689
Token music
Token ID2647
Feature activation-1.01e+00

Pos logit contributions
inery+0.898
alogue+0.870
+0.815
ISE+0.815
arte+0.802
Neg logit contributions
oppable-0.697
 Discrimination-0.694
 Pog-0.672
onder-0.658
sov-0.656
Token.
Token ID13
Feature activation+2.05e+00

Pos logit contributions
oppable+0.564
 Discrimination+0.563
 Pog+0.547
onder+0.533
sov+0.529
Neg logit contributions
inery-0.846
alogue-0.819
ISE-0.772
-0.771
arte-0.759
Token Bernard
Token ID16197
Feature activation-6.02e+00

Pos logit contributions
inery+1.514
alogue+1.464
+1.366
ISE+1.366
arte+1.343
Neg logit contributions
oppable-1.351
 Discrimination-1.343
 Pog-1.302
onder-1.279
sov-1.276
Token Her
Token ID2332
Feature activation+1.34e+00

Pos logit contributions
 Pog+3.545
 Discrimination+3.533
 Rib+3.428
oppable+3.411
 Depot+3.403
Neg logit contributions
alogue-4.733
inery-4.724
nces-4.310
ISE-4.309
-4.308
Tokenr
Token ID81
Feature activation-3.43e+00

Pos logit contributions
inery+0.924
alogue+0.891
+0.828
ISE+0.827
arte+0.811
Neg logit contributions
oppable-0.946
 Discrimination-0.942
 Pog-0.917
onder-0.902
sov-0.900
Tokenmann
Token ID9038
Feature activation+2.04e+00

Pos logit contributions
 Discrimination+1.283
oppable+1.270
 Pog+1.241
sov+1.188
onder+1.187
Neg logit contributions
inery-3.476
alogue-3.411
-3.253
ISE-3.241
nces-3.199
Token's
Token ID338
Feature activation+1.05e+00

Pos logit contributions
inery+1.612
alogue+1.563
+1.467
ISE+1.462
arte+1.445
Neg logit contributions
oppable-1.244
 Discrimination-1.241
 Pog-1.191
onder-1.167
sov-1.164
Token score
Token ID4776
Feature activation+4.53e+00

Pos logit contributions
inery+0.815
alogue+0.789
+0.739
ISE+0.738
arte+0.727
Neg logit contributions
oppable-0.652
 Discrimination-0.650
 Pog-0.628
onder-0.616
sov-0.615
Token for
Token ID329
Feature activation+2.01e+00

Pos logit contributions
inery+3.543
alogue+3.456
+3.227
ISE+3.207
arte+3.187
Neg logit contributions
 Discrimination-2.776
oppable-2.761
 Pog-2.630
sov-2.572
 Depot-2.571
TokenPsych
Token ID31923
Feature activation-1.34e+00

Pos logit contributions
inery+1.509
alogue+1.461
ISE+1.365
+1.365
arte+1.343
Neg logit contributions
oppable-1.306
 Discrimination-1.295
 Pog-1.250
onder-1.233
sov-1.227
Token photographs
Token ID12566
Feature activation-1.49e+00

Pos logit contributions
 Discrimination+0.600
oppable+0.596
 Pog+0.581
onder+0.569
 Depot+0.566
Neg logit contributions
inery-0.839
alogue-0.808
ISE-0.766
-0.760
nces-0.749
Token.)
Token ID2014
Feature activation+3.08e+00

Pos logit contributions
 Discrimination+0.868
oppable+0.867
 Pog+0.841
onder+0.823
sov+0.817
Neg logit contributions
inery-1.209
alogue-1.166
ISE-1.096
-1.094
nces-1.080
Token So
Token ID1406
Feature activation+2.97e+00

Pos logit contributions
inery+2.183
alogue+2.113
ISE+1.971
+1.968
arte+1.934
Neg logit contributions
oppable-2.128
 Discrimination-2.079
 Pog-2.025
sov-2.000
onder-1.999
Token I
Token ID314
Feature activation-7.28e-01

Pos logit contributions
inery+2.268
alogue+2.199
+2.056
ISE+2.055
arte+2.022
Neg logit contributions
oppable-1.877
 Discrimination-1.861
 Pog-1.804
sov-1.770
onder-1.768
Token might
Token ID1244
Feature activation-4.74e+00

Pos logit contributions
oppable+0.440
 Discrimination+0.439
 Pog+0.424
onder+0.419
sov+0.417
Neg logit contributions
inery-0.579
alogue-0.561
ISE-0.523
-0.523
nces-0.513
Token as
Token ID355
Feature activation+2.56e-01

Pos logit contributions
oppable+2.612
 Discrimination+2.611
onder+2.540
 Pog+2.505
sov+2.473
Neg logit contributions
inery-3.931
alogue-3.801
ISE-3.553
-3.539
emark-3.515
Token well
Token ID880
Feature activation+1.89e+00

Pos logit contributions
inery+0.178
alogue+0.172
ISE+0.159
+0.159
nces+0.157
Neg logit contributions
 Discrimination-0.179
oppable-0.179
 Pog-0.174
sov-0.171
onder-0.171
Token make
Token ID787
Feature activation-3.72e+00

Pos logit contributions
inery+1.451
alogue+1.406
+1.316
ISE+1.315
arte+1.297
Neg logit contributions
oppable-1.194
 Discrimination-1.185
 Pog-1.146
onder-1.123
sov-1.123
Token them
Token ID606
Feature activation+3.26e+00

Pos logit contributions
 Discrimination+2.229
oppable+2.172
 Pog+2.143
onder+2.083
 Depot+2.079
Neg logit contributions
inery-3.000
alogue-2.882
-2.704
ISE-2.689
arte-2.619
Token useful
Token ID4465
Feature activation-1.48e+00

Pos logit contributions
inery+2.475
alogue+2.413
ISE+2.254
+2.254
arte+2.228
Neg logit contributions
oppable-2.064
 Discrimination-2.039
 Pog-1.979
onder-1.945
sov-1.943
Token.
Token ID13
Feature activation-4.92e-01

Pos logit contributions
 Discrimination+0.812
oppable+0.797
 Pog+0.785
onder+0.765
sov+0.761
Neg logit contributions
inery-1.254
alogue-1.215
ISE-1.143
-1.141
arte-1.125
Tokenik
Token ID1134
Feature activation-3.04e+00

Pos logit contributions
oppable+2.586
 Pog+2.569
 Discrimination+2.562
onder+2.532
sov+2.511
Neg logit contributions
inery-2.379
alogue-2.264
-2.187
ISE-2.143
nces-2.088
Tokenki
Token ID4106
Feature activation-5.80e-01

Pos logit contributions
 Discrimination+1.600
oppable+1.595
 Pog+1.557
sov+1.539
onder+1.517
Neg logit contributions
inery-2.623
alogue-2.559
ISE-2.452
-2.444
nces-2.388
Token M
Token ID337
Feature activation+4.10e+00

Pos logit contributions
 Discrimination+0.344
oppable+0.343
 Pog+0.336
onder+0.326
sov+0.325
Neg logit contributions
inery-0.469
alogue-0.454
ISE-0.427
-0.426
nces-0.419
Tokenä
Token ID11033
Feature activation+5.99e-01

Pos logit contributions
inery+3.324
alogue+3.203
ISE+2.998
arte+2.982
+2.955
Neg logit contributions
 Discrimination-2.450
oppable-2.369
 Pog-2.305
-2.261
 Depot-2.254
Tokenen
Token ID268
Feature activation-3.02e+00

Pos logit contributions
inery+0.502
alogue+0.487
+0.458
ISE+0.458
arte+0.452
Neg logit contributions
oppable-0.337
 Discrimination-0.336
 Pog-0.324
onder-0.316
sov-0.315
Tokenp
Token ID79
Feature activation-7.81e+00

Pos logit contributions
oppable+1.880
 Discrimination+1.879
 Pog+1.865
sov+1.835
onder+1.829
Neg logit contributions
inery-2.295
alogue-2.209
-2.105
ISE-2.096
nces-2.057
Tokenä
Token ID11033
Feature activation+6.21e+00

Pos logit contributions
onder+4.868
oppable+4.735
sov+4.700
 Pog+4.638
 Discrimination+4.611
Neg logit contributions
inery-5.733
-5.555
alogue-5.542
nces-5.329
ISE-5.298
Tokenä
Token ID11033
Feature activation+2.83e+00

Pos logit contributions
inery+4.765
alogue+4.611
arte+4.307
ISE+4.290
+4.252
Neg logit contributions
oppable-3.880
 Discrimination-3.812
-3.623
 Pog-3.563
Percent-3.559
Token.
Token ID13
Feature activation+3.09e+00

Pos logit contributions
inery+2.239
alogue+2.169
ISE+2.027
+2.027
arte+2.009
Neg logit contributions
oppable-1.728
 Discrimination-1.709
 Pog-1.638
onder-1.608
sov-1.601
Token It
Token ID632
Feature activation-3.10e-03

Pos logit contributions
inery+2.221
alogue+2.149
+2.007
ISE+2.002
arte+1.978
Neg logit contributions
oppable-2.092
 Discrimination-2.053
 Pog-1.980
onder-1.972
sov-1.968
Token was
Token ID373
Feature activation+1.24e+00

Pos logit contributions
oppable+0.002
 Discrimination+0.002
 Pog+0.002
onder+0.002
sov+0.002
Neg logit contributions
inery-0.003
alogue-0.002
ISE-0.002
-0.002
arte-0.002
Token and
Token ID290
Feature activation-2.02e+00

Pos logit contributions
inery+3.310
alogue+3.216
+3.003
ISE+2.995
nces+2.956
Neg logit contributions
oppable-2.619
 Discrimination-2.617
 Pog-2.495
onder-2.459
sov-2.453
Token other
Token ID584
Feature activation+8.65e-01

Pos logit contributions
 Discrimination+1.358
oppable+1.334
 Pog+1.331
 Depot+1.287
sov+1.286
Neg logit contributions
inery-1.481
alogue-1.411
-1.322
ISE-1.321
arte-1.307
Token recording
Token ID8296
Feature activation-9.18e-01

Pos logit contributions
inery+0.671
alogue+0.649
+0.608
ISE+0.607
arte+0.598
Neg logit contributions
oppable-0.539
 Discrimination-0.538
 Pog-0.522
onder-0.510
sov-0.509
Token accidents
Token ID17390
Feature activation+2.38e+00

Pos logit contributions
 Discrimination+0.569
oppable+0.560
 Pog+0.554
 Depot+0.538
onder+0.536
Neg logit contributions
inery-0.714
alogue-0.685
-0.646
ISE-0.644
arte-0.643
Token
Token ID1906
Feature activation+8.15e-01

Pos logit contributions
inery+1.832
alogue+1.774
ISE+1.659
+1.659
arte+1.632
Neg logit contributions
oppable-1.493
 Discrimination-1.486
 Pog-1.438
onder-1.410
sov-1.406
Tokensuch
Token ID10508
Feature activation-8.78e+00

Pos logit contributions
inery+0.597
alogue+0.576
+0.537
ISE+0.537
arte+0.528
Neg logit contributions
oppable-0.543
 Discrimination-0.542
 Pog-0.527
onder-0.517
sov-0.516
Token as
Token ID355
Feature activation-1.72e+00

Pos logit contributions
oppable+3.887
 Discrimination+3.820
 Pog+3.684
+3.608
Dialog+3.579
Neg logit contributions
inery-8.041
alogue-7.642
-7.421
ISE-7.370
arte-7.365
Token Casey
Token ID21097
Feature activation+3.59e+00

Pos logit contributions
 Discrimination+1.067
oppable+1.056
 Pog+1.041
 Depot+1.005
sov+1.004
Neg logit contributions
inery-1.335
alogue-1.284
ISE-1.208
-1.207
arte-1.196
Token Kas
Token ID15035
Feature activation+4.48e+00

Pos logit contributions
inery+2.837
alogue+2.741
+2.575
ISE+2.568
arte+2.529
Neg logit contributions
oppable-2.205
 Discrimination-2.147
onder-2.071
 Pog-2.069
sov-2.053
Tokenem
Token ID368
Feature activation+4.20e-01

Pos logit contributions
inery+2.535
alogue+2.442
arte+2.205
emark+2.191
+2.186
Neg logit contributions
oppable-3.735
 Discrimination-3.727
 Pog-3.640
-3.555
onder-3.543
Token ruining
Token ID39842
Feature activation-2.56e+00

Pos logit contributions
inery+0.341
alogue+0.330
ISE+0.310
+0.310
arte+0.305
Neg logit contributions
oppable-0.247
 Discrimination-0.247
 Pog-0.239
onder-0.234
sov-0.233
Token heads
Token ID6665
Feature activation-1.88e+00

Pos logit contributions
 Discrimination+1.274
oppable+1.258
 Pog+1.243
 Depot+1.216
sov+1.215
Neg logit contributions
inery-1.680
alogue-1.645
ISE-1.546
-1.540
arte-1.527
Token,
Token ID11
Feature activation+2.65e+00

Pos logit contributions
 Discrimination+1.092
oppable+1.089
 Pog+1.058
onder+1.047
sov+1.036
Neg logit contributions
inery-1.522
alogue-1.468
-1.399
ISE-1.393
arte-1.369
Token but
Token ID475
Feature activation-1.69e+00

Pos logit contributions
inery+2.044
alogue+1.969
+1.842
ISE+1.836
arte+1.814
Neg logit contributions
oppable-1.675
 Discrimination-1.664
 Pog-1.609
onder-1.579
sov-1.575
Token that
Token ID326
Feature activation+9.05e-01

Pos logit contributions
 Discrimination+1.046
oppable+1.035
 Pog+1.014
 Depot+0.994
onder+0.984
Neg logit contributions
inery-1.313
alogue-1.281
ISE-1.202
-1.198
arte-1.175
Token only
Token ID691
Feature activation-6.10e+00

Pos logit contributions
inery+0.739
alogue+0.716
+0.672
ISE+0.672
arte+0.663
Neg logit contributions
oppable-0.527
 Discrimination-0.527
 Pog-0.509
onder-0.498
sov-0.496
Token postp
Token ID21826
Feature activation-8.42e+00

Pos logit contributions
 Discrimination+3.222
 Pog+3.177
oppable+3.121
onder+3.021
 Depot+3.000
Neg logit contributions
inery-5.164
alogue-5.099
ISE-4.764
arte-4.700
-4.687
Tokenones
Token ID1952
Feature activation-4.69e+00

Pos logit contributions
onder+4.831
oppable+4.676
sov+4.431
 Pog+4.381
 Discrimination+4.302
Neg logit contributions
alogue-6.594
inery-6.524
-6.355
ISE-6.270
ENA-6.159
Token the
Token ID262
Feature activation-6.17e-01

Pos logit contributions
 Discrimination+2.899
oppable+2.857
 Pog+2.846
 Depot+2.818
 Rib+2.746
Neg logit contributions
inery-3.554
alogue-3.515
-3.281
ISE-3.243
arte-3.232
Token day
Token ID1110
Feature activation-2.46e+00

Pos logit contributions
 Discrimination+0.378
oppable+0.378
 Pog+0.369
onder+0.359
 Depot+0.359
Neg logit contributions
inery-0.483
alogue-0.470
ISE-0.441
-0.439
arte-0.433
Token of
Token ID286
Feature activation-9.11e-01

Pos logit contributions
 Discrimination+1.360
oppable+1.339
 Pog+1.315
 Depot+1.290
onder+1.281
Neg logit contributions
inery-2.057
alogue-2.002
-1.897
ISE-1.882
arte-1.879
Token reckoning
Token ID49654
Feature activation+3.40e+00

Pos logit contributions
oppable+0.615
 Discrimination+0.614
 Pog+0.600
onder+0.586
 Depot+0.586
Neg logit contributions
inery-0.659
alogue-0.638
-0.596
ISE-0.592
nces-0.586
Token confident
Token ID6563
Feature activation-3.07e+00

Pos logit contributions
oppable+0.006
 Discrimination+0.006
 Pog+0.006
onder+0.005
sov+0.005
Neg logit contributions
inery-0.008
alogue-0.008
ISE-0.007
-0.007
nces-0.007
Token fighter
Token ID10543
Feature activation+1.36e+00

Pos logit contributions
oppable+1.545
 Discrimination+1.522
 Pog+1.503
onder+1.459
sov+1.439
Neg logit contributions
inery-2.730
alogue-2.651
-2.496
ISE-2.496
nces-2.479
Token.
Token ID13
Feature activation-7.20e-01

Pos logit contributions
inery+1.100
alogue+1.066
+1.002
ISE+1.000
arte+0.985
Neg logit contributions
oppable-0.804
 Discrimination-0.802
 Pog-0.774
onder-0.758
sov-0.756
Token
Token ID198
Feature activation+1.19e+00

Pos logit contributions
 Discrimination+0.478
oppable+0.468
 Pog+0.462
 Depot+0.449
onder+0.448
Neg logit contributions
inery-0.538
alogue-0.521
-0.481
ISE-0.480
nces-0.477
Token
Token ID198
Feature activation+2.18e-01

Pos logit contributions
inery+0.830
alogue+0.800
ISE+0.747
+0.747
arte+0.734
Neg logit contributions
oppable-0.846
 Discrimination-0.836
 Pog-0.811
onder-0.795
sov-0.792
TokenThank
Token ID10449
Feature activation-8.70e+00

Pos logit contributions
inery+0.175
alogue+0.169
+0.158
ISE+0.157
arte+0.156
Neg logit contributions
oppable-0.130
 Discrimination-0.130
 Pog-0.125
onder-0.123
sov-0.123
Token you
Token ID345
Feature activation-5.30e+00

Pos logit contributions
oppable+3.661
 Discrimination+3.659
 Pog+3.533
+3.431
onder+3.422
Neg logit contributions
inery-8.211
alogue-7.803
-7.472
ISE-7.469
nces-7.364
Token as
Token ID355
Feature activation+8.61e-01

Pos logit contributions
oppable+2.781
 Discrimination+2.745
 Pog+2.680
onder+2.622
sov+2.613
Neg logit contributions
inery-4.577
alogue-4.381
-4.148
ISE-4.124
arte-4.102
Token well
Token ID880
Feature activation-2.35e+00

Pos logit contributions
inery+0.636
alogue+0.615
+0.573
ISE+0.573
arte+0.563
Neg logit contributions
oppable-0.570
 Discrimination-0.568
 Pog-0.551
onder-0.541
sov-0.540
Token to
Token ID284
Feature activation-1.52e+00

Pos logit contributions
oppable+1.178
 Discrimination+1.176
 Pog+1.140
onder+1.109
sov+1.095
Neg logit contributions
inery-2.100
alogue-2.036
-1.922
ISE-1.917
arte-1.897
Token Lawrence
Token ID13914
Feature activation-1.46e+00

Pos logit contributions
 Discrimination+0.956
oppable+0.951
 Pog+0.930
onder+0.908
 Depot+0.903
Neg logit contributions
inery-1.184
alogue-1.139
-1.063
ISE-1.062
arte-1.047
Tokenason
Token ID888
Feature activation-1.79e+00

Pos logit contributions
inery+0.026
alogue+0.025
+0.023
ISE+0.023
arte+0.023
Neg logit contributions
oppable-0.021
 Discrimination-0.021
 Pog-0.021
onder-0.020
sov-0.020
Token [
Token ID685
Feature activation-8.40e-01

Pos logit contributions
 Pog+1.015
oppable+1.007
 Discrimination+0.997
sov+0.976
onder+0.969
Neg logit contributions
inery-1.477
alogue-1.419
-1.347
ISE-1.337
nces-1.327
Token10
Token ID940
Feature activation-2.19e+00

Pos logit contributions
oppable+0.424
 Discrimination+0.422
 Pog+0.406
onder+0.395
sov+0.394
Neg logit contributions
inery-0.751
alogue-0.730
-0.690
ISE-0.690
arte-0.681
Token:
Token ID25
Feature activation-2.27e-01

Pos logit contributions
 Discrimination+1.316
oppable+1.303
 Pog+1.287
sov+1.260
onder+1.259
Neg logit contributions
inery-1.742
alogue-1.684
-1.572
ISE-1.568
nces-1.561
Token47
Token ID2857
Feature activation+3.83e+00

Pos logit contributions
oppable+0.141
 Discrimination+0.140
 Pog+0.135
onder+0.133
sov+0.132
Neg logit contributions
inery-0.178
alogue-0.172
-0.161
ISE-0.161
arte-0.158
Token]
Token ID60
Feature activation-8.02e+00

Pos logit contributions
inery+2.114
alogue+2.036
ISE+1.865
+1.859
arte+1.844
Neg logit contributions
oppable-3.248
 Discrimination-3.197
 Pog-3.073
onder-3.027
 Depot-3.007
Token <
Token ID1279
Feature activation-3.26e-01

Pos logit contributions
 Pog+3.457
 Discrimination+3.362
oppable+3.285
 Ü+3.156
sov+3.152
Neg logit contributions
inery-7.480
alogue-7.310
nces-6.950
-6.845
arte-6.842
Tokenl
Token ID75
Feature activation-5.11e-01

Pos logit contributions
oppable+0.208
 Discrimination+0.206
 Pog+0.201
onder+0.199
sov+0.198
Neg logit contributions
inery-0.248
alogue-0.240
ISE-0.225
-0.224
arte-0.220
Tokenaz
Token ID1031
Feature activation+3.87e+00

Pos logit contributions
oppable+0.330
 Discrimination+0.324
 Pog+0.318
onder+0.314
sov+0.312
Neg logit contributions
inery-0.386
alogue-0.372
-0.350
ISE-0.349
nces-0.341
Tokeneed
Token ID2308
Feature activation+1.82e+00

Pos logit contributions
inery+3.072
alogue+2.977
ISE+2.776
arte+2.762
+2.754
Neg logit contributions
 Discrimination-2.367
oppable-2.317
 Pog-2.239
 Depot-2.178
-2.167
Tokenave
Token ID1015
Feature activation+5.18e-01

Pos logit contributions
inery+1.428
alogue+1.381
ISE+1.293
+1.292
arte+1.275
Neg logit contributions
 Discrimination-1.116
oppable-1.115
 Pog-1.076
onder-1.051
sov-1.047