TOP ACTIVATIONS
MAX = 10.796

to just apply to fr atern ities or just sor or
division housing typical of fr atern ities and sor or ities
fire tests undertaken after the Gren fell Tower fire , according
for a few other fr atern ities are also in a
38 LA U - 129 Laun chers , stated the Pentagon
of se g wit ). Further Bitcoin protocol improvements for non
and obedient servant CM DR Kay mel Kan , for Empire
entitled Over coming Sexual Abuse or
said Sen . Kay Bailey Hutch ison ( R
lez al slide on the Cait lyn Jenner comparisons and
s work on the Space Launch System heavy - lift rocket
also want to report the scam to the Federal Trade Commission
in the PS 4 All Access Show , we are thinking
Sciences . Explore further : New Insight Into How
t plant this year . Further , even if they recognize
need to be replaced , renegoti ated or rem ade .
with the Russian government , misogyn istic and hateful rhetoric and
guard is on point . Further , you slam someone in
the fire last month at Gren fell Tower , which was
there are many others .) Further out , there are several

MIN ACTIVATIONS
MIN = -9.329

- metal trio with an EP called You Look Like an
45 366 . Topics : community - and -
daughters , Mal ia and Sasha , in the harsh glare
In Florida and Wisconsin , 49 percent said if
Jones , who has a net worth of $ 5 .
into them ". Topics : marine - biology ,
News 24 . Topics : beef - c attle
cooler or AC and a refrigerator it is among the
car ibe : Haiti and Venezuela Ren é Pr
T , the defendant in EFF 's NSA surveillance litigation ,
Reporter Today ! <|endoftext|> A usterity has failed . It turned
ardi Gr as beads and CD s one
Relations between the Catalan and central governments have worsened of late
for independence : 1961 1963 [ edit ]
when the court accepts a central government legal challenge related to
: community - and - soc iety , community - organ
ick man presided over a soft - dr ink bott ling
Wi ener / File / 1963 Basketball goals were added to
tightening race in Florida and Wisconsin , where Ryan 's addition
would be problematic if the Panthers are asking him to push

INTERVAL 5.765 - 10.796
CONTAINS 0.244%

own choice . This only perpet uated the negative portray als
Part icle ] ; const Vec 3 & r Position =
in the experiment of the Arab Spring , which has affected
student accusing him of sexual assault . The 6
companies are structured , the way that the staff are handled

INTERVAL 0.733 - 5.765
CONTAINS 29.946%

try incorporating the following gimmick ( St olen from the Horror
the cover of mag azine ( 31 July 1944 )
you re still unable to find issue 40 near
the difficult times , the difficult conversations we had at the
professional in terms of the players . It 's time to

INTERVAL -4.298 - 0.733
CONTAINS 68.788%

number because they can generate their own code . So let
In OC M ODE the core increases to 1 , 709
, a TV , a computer or laptop , a cooler
Par th iv s script and telling them that
actual history of NDP governments shows we re often

INTERVAL -9.329 - -4.298
CONTAINS 1.022%

There in lies the Mavericks ' dilemma : they have
UK interest rates at a record low of 0 . 5
easy ! <|endoftext|> With a strong focus on player government and
find dark regions on a light background gray = c v
the book but when I saw it I was like ,
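
NOTE ON THE INTERVALS
The four INTERVAL blocks above look like equal-width bins spanning MIN to MAX, with the CONTAINS figures giving the share of activations in each bin. The dashboard does not state this, so it is an assumption. The sketch below is a minimal check of that reading using only the numbers printed above; the helper name `equal_width_edges` is hypothetical and not part of any dashboard tooling.

```python
# Values copied from the dashboard above (already rounded to 3 decimals there).
MIN_ACT, MAX_ACT = -9.329, 10.796
reported_edges = [-9.329, -4.298, 0.733, 5.765, 10.796]
reported_share = [1.022, 68.788, 29.946, 0.244]  # percent, lowest bin first


def equal_width_edges(lo, hi, n_bins):
    """Return the n_bins + 1 boundaries of equal-width bins on [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# Assumption: four equal-width bins between MIN and MAX.
edges = equal_width_edges(MIN_ACT, MAX_ACT, 4)

# Largest gap between a recomputed boundary and the one printed above.
max_edge_error = max(abs(e - r) for e, r in zip(edges, reported_edges))

# The four CONTAINS percentages should account for all activations.
total_share = sum(reported_share)

print("recomputed edges:", [round(e, 4) for e in edges])
print("max edge error:", max_edge_error)
print("total share (%):", total_share)
```

If the equal-width reading is right, every recomputed boundary should agree with the printed one to within rounding, and the shares should total 100%. Most activations then fall in the two middle bins, with only a small fraction in the top bin that the TOP ACTIVATIONS examples are drawn from.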
Token to
Token ID284
Feature activation+2.49e-01

Pos logit contributions
gemony+1.336
 Ally+1.257
 casinos+1.164
 Indra+1.161
 sophistication+1.154
Neg logit contributions
rador-1.401
MRI-1.372
rament-1.261
icult-1.220
stanbul-1.198
Token just
Token ID655
Feature activation-1.72e+00

Pos logit contributions
gemony+0.201
 Ally+0.189
 casinos+0.176
 Indra+0.175
 sophistication+0.175
Neg logit contributions
rador-0.186
MRI-0.182
rament-0.166
icult-0.160
stanbul-0.157
Token apply
Token ID4174
Feature activation+1.12e+00

Pos logit contributions
rador+1.256
MRI+1.208
rament+1.109
icult+1.073
iscopal+1.055
Neg logit contributions
gemony-1.419
 Ally-1.329
 casinos-1.241
 Indra-1.234
 sophistication-1.232
Token to
Token ID284
Feature activation+2.66e-01

Pos logit contributions
gemony+0.831
 Ally+0.779
 casinos+0.721
 Indra+0.718
 sophistication+0.714
Neg logit contributions
rador-0.912
MRI-0.892
rament-0.822
icult-0.796
stanbul-0.783
Token fr
Token ID1216
Feature activation+1.49e+00

Pos logit contributions
gemony+0.184
 Ally+0.168
 Indra+0.155
 casinos+0.155
 sophistication+0.155
Neg logit contributions
rador-0.228
MRI-0.223
rament-0.207
icult-0.200
stanbul-0.198
Tokenatern
Token ID9205
Feature activation+1.08e+01

Pos logit contributions
gemony+0.750
 Ally+0.681
 casinos+0.605
 Indra+0.599
ours+0.596
Neg logit contributions
rador-1.567
MRI-1.540
rament-1.450
icult-1.412
stanbul-1.395
Tokenities
Token ID871
Feature activation+3.65e+00

Pos logit contributions
gemony+5.547
 Ally+5.084
ours+4.725
 casinos+4.549
 Indra+4.368
Neg logit contributions
rador-10.557
MRI-10.321
rament-9.769
stanbul-9.340
icult-9.319
Token or
Token ID393
Feature activation+1.41e+00

Pos logit contributions
gemony+2.706
 Ally+2.564
 casinos+2.363
 Indra+2.353
ours+2.349
Neg logit contributions
rador-2.925
MRI-2.875
rament-2.634
icult-2.562
stanbul-2.508
Token just
Token ID655
Feature activation-3.42e+00

Pos logit contributions
gemony+1.085
 Ally+1.017
 casinos+0.943
 Indra+0.941
 sophistication+0.936
Neg logit contributions
rador-1.102
MRI-1.077
rament-0.991
icult-0.958
stanbul-0.939
Token sor
Token ID25655
Feature activation+4.74e+00

Pos logit contributions
rador+2.730
MRI+2.653
rament+2.463
icult+2.380
iscopal+2.363
Neg logit contributions
gemony-2.532
 Ally-2.291
 sophistication-2.146
 Indra-2.142
-2.123
Tokenor
Token ID273
Feature activation+2.99e+00

Pos logit contributions
gemony+3.278
 Ally+3.090
ours+2.952
 casinos+2.862
 Indra+2.806
Neg logit contributions
rador-3.956
MRI-3.923
rament-3.584
icult-3.475
stanbul-3.468
Token division
Token ID7297
Feature activation-3.34e-01

Pos logit contributions
rador+0.210
MRI+0.205
rament+0.189
icult+0.183
stanbul+0.179
Neg logit contributions
gemony-0.206
 Ally-0.192
 casinos-0.179
 Indra-0.178
 sophistication-0.177
Token housing
Token ID5627
Feature activation+3.15e+00

Pos logit contributions
rador+0.237
MRI+0.230
rament+0.210
icult+0.202
iscopal+0.199
Neg logit contributions
gemony-0.283
 Ally-0.266
 casinos-0.248
 Indra-0.248
 sophistication-0.247
Token typical
Token ID7226
Feature activation-1.18e+00

Pos logit contributions
gemony+2.451
 Ally+2.330
 casinos+2.165
 Indra+2.150
ours+2.138
Neg logit contributions
rador-2.419
MRI-2.374
rament-2.169
icult-2.100
stanbul-2.061
Token of
Token ID286
Feature activation-9.31e-01

Pos logit contributions
rador+0.827
MRI+0.805
rament+0.737
icult+0.711
iscopal+0.696
Neg logit contributions
gemony-1.008
 Ally-0.938
 Indra-0.883
 casinos-0.879
 sophistication-0.876
Token fr
Token ID1216
Feature activation+7.44e-01

Pos logit contributions
rador+0.770
MRI+0.751
rament+0.697
icult+0.675
stanbul+0.663
Neg logit contributions
gemony-0.672
 Ally-0.612
 Indra-0.570
ours-0.567
 sophistication-0.566
Tokenatern
Token ID9205
Feature activation+1.07e+01

Pos logit contributions
gemony+0.369
 Ally+0.334
 casinos+0.296
 Indra+0.294
ours+0.291
Neg logit contributions
rador-0.785
MRI-0.772
rament-0.727
icult-0.709
stanbul-0.700
Tokenities
Token ID871
Feature activation+3.25e+00

Pos logit contributions
gemony+5.373
 Ally+4.936
ours+4.611
 casinos+4.388
 Indra+4.247
Neg logit contributions
rador-10.528
MRI-10.350
rament-9.829
stanbul-9.414
icult-9.274
Token and
Token ID290
Feature activation-1.42e-01

Pos logit contributions
gemony+2.836
 Ally+2.712
 casinos+2.542
ours+2.533
 Indra+2.528
Neg logit contributions
rador-2.167
MRI-2.129
rament-1.913
icult-1.850
stanbul-1.805
Token sor
Token ID25655
Feature activation+3.67e+00

Pos logit contributions
rador+0.106
MRI+0.104
rament+0.095
icult+0.092
iscopal+0.090
Neg logit contributions
gemony-0.114
 Ally-0.106
 Indra-0.099
 casinos-0.099
 sophistication-0.098
Tokenor
Token ID273
Feature activation+2.07e+00

Pos logit contributions
gemony+2.595
 Ally+2.441
ours+2.316
 casinos+2.273
 Indra+2.228
Neg logit contributions
rador-3.033
MRI-3.005
rament-2.746
icult-2.663
stanbul-2.659
Tokenities
Token ID871
Feature activation+1.85e+00

Pos logit contributions
gemony+1.377
 Ally+1.284
 casinos+1.179
 Indra+1.172
ours+1.168
Neg logit contributions
rador-1.836
MRI-1.804
rament-1.671
icult-1.619
stanbul-1.602
Token fire
Token ID2046
Feature activation+6.69e-01

Pos logit contributions
gemony+1.300
 Ally+1.219
 casinos+1.121
 Indra+1.119
 sophistication+1.111
Neg logit contributions
rador-1.407
MRI-1.379
rament-1.269
icult-1.230
stanbul-1.212
Token tests
Token ID5254
Feature activation+2.64e+00

Pos logit contributions
gemony+0.528
 Ally+0.498
 casinos+0.463
 Indra+0.461
 sophistication+0.459
Neg logit contributions
rador-0.510
MRI-0.499
rament-0.457
icult-0.442
stanbul-0.434
Token undertaken
Token ID21846
Feature activation+2.19e+00

Pos logit contributions
gemony+2.063
 Ally+1.956
 casinos+1.823
 Indra+1.820
ours+1.805
Neg logit contributions
rador-2.016
MRI-1.975
rament-1.805
icult-1.749
iscopal-1.715
Token after
Token ID706
Feature activation-2.49e+00

Pos logit contributions
gemony+1.691
 Ally+1.604
 casinos+1.494
 Indra+1.486
ours+1.477
Neg logit contributions
rador-1.690
MRI-1.654
rament-1.520
icult-1.470
iscopal-1.441
Token the
Token ID262
Feature activation-6.06e-01

Pos logit contributions
rador+1.946
MRI+1.890
rament+1.742
icult+1.690
stanbul+1.667
Neg logit contributions
gemony-1.917
 Ally-1.753
ours-1.629
 Indra-1.618
 casinos-1.612
Token Gren
Token ID19674
Feature activation+1.03e+01

Pos logit contributions
rador+0.463
MRI+0.449
rament+0.417
icult+0.401
stanbul+0.395
Neg logit contributions
gemony-0.473
 Ally-0.437
 casinos-0.406
 Indra-0.406
ours-0.406
Tokenfell
Token ID23299
Feature activation+2.00e+00

Pos logit contributions
gemony+5.578
 Ally+5.459
ours+4.981
 sophistication+4.709
 Warfare+4.653
Neg logit contributions
rador-9.512
MRI-9.294
rament-8.997
icult-8.674
stanbul-8.611
Token Tower
Token ID8765
Feature activation+4.13e+00

Pos logit contributions
gemony+1.186
 Ally+1.085
 Indra+0.986
 casinos+0.985
 sophistication+0.970
Neg logit contributions
rador-1.913
MRI-1.876
rament-1.754
icult-1.713
iscopal-1.682
Token fire
Token ID2046
Feature activation+1.45e+00

Pos logit contributions
gemony+3.490
 Ally+3.357
 sophistication+3.140
 casinos+3.129
ours+3.113
Neg logit contributions
rador-2.878
MRI-2.822
rament-2.549
icult-2.426
stanbul-2.424
Token,
Token ID11
Feature activation+3.85e-02

Pos logit contributions
gemony+1.179
 Ally+1.114
 casinos+1.038
 Indra+1.035
 sophistication+1.030
Neg logit contributions
rador-1.065
MRI-1.041
rament-0.951
icult-0.916
stanbul-0.900
Token according
Token ID1864
Feature activation-1.05e+00

Pos logit contributions
gemony+0.030
 Ally+0.028
 casinos+0.026
 Indra+0.026
 sophistication+0.026
Neg logit contributions
rador-0.030
MRI-0.029
rament-0.027
icult-0.026
stanbul-0.025
Token for
Token ID329
Feature activation+1.35e+00

Pos logit contributions
rador+0.318
MRI+0.310
rament+0.282
icult+0.272
stanbul+0.266
Neg logit contributions
gemony-0.388
 Ally-0.367
 casinos-0.343
 Indra-0.343
 sophistication-0.341
Token a
Token ID257
Feature activation-3.50e-01

Pos logit contributions
gemony+0.993
 Ally+0.928
 casinos+0.857
 Indra+0.855
 sophistication+0.850
Neg logit contributions
rador-1.107
MRI-1.084
rament-1.000
icult-0.968
stanbul-0.950
Token few
Token ID1178
Feature activation-4.50e+00

Pos logit contributions
rador+0.299
MRI+0.294
rament+0.272
icult+0.264
iscopal+0.260
Neg logit contributions
gemony-0.244
 Ally-0.226
 casinos-0.209
 Indra-0.208
 sophistication-0.206
Token other
Token ID584
Feature activation+2.73e-01

Pos logit contributions
rador+3.701
MRI+3.700
rament+3.386
icult+3.332
iscopal+3.232
Neg logit contributions
gemony-3.133
 Ally-2.841
 Indra-2.650
 sophistication-2.614
-2.583
Token fr
Token ID1216
Feature activation+9.69e-01

Pos logit contributions
gemony+0.189
 Ally+0.173
 Indra+0.159
 casinos+0.158
 sophistication+0.158
Neg logit contributions
rador-0.235
MRI-0.231
rament-0.214
icult-0.209
iscopal-0.204
Tokenatern
Token ID9205
Feature activation+1.01e+01

Pos logit contributions
gemony+0.480
 Ally+0.434
 casinos+0.387
 Indra+0.382
ours+0.381
Neg logit contributions
rador-1.023
MRI-1.004
rament-0.948
icult-0.922
stanbul-0.913
Tokenities
Token ID871
Feature activation+2.29e+00

Pos logit contributions
gemony+5.037
 Ally+4.721
ours+4.376
 casinos+4.229
 Indra+4.119
Neg logit contributions
rador-9.902
MRI-9.819
rament-9.331
stanbul-8.906
icult-8.756
Token are
Token ID389
Feature activation-1.97e+00

Pos logit contributions
gemony+1.906
 Ally+1.821
 casinos+1.698
 Indra+1.692
ours+1.681
Neg logit contributions
rador-1.629
MRI-1.599
rament-1.455
icult-1.402
iscopal-1.368
Token also
Token ID635
Feature activation-5.33e-01

Pos logit contributions
rador+1.543
MRI+1.505
rament+1.390
icult+1.343
stanbul+1.321
Neg logit contributions
gemony-1.509
 Ally-1.391
 casinos-1.286
 Indra-1.286
ours-1.285
Token in
Token ID287
Feature activation-1.03e+00

Pos logit contributions
rador+0.457
MRI+0.446
rament+0.415
icult+0.402
stanbul+0.396
Neg logit contributions
gemony-0.372
 Ally-0.344
 casinos-0.315
 Indra-0.314
ours-0.313
Token a
Token ID257
Feature activation+6.67e-01

Pos logit contributions
rador+0.797
MRI+0.783
rament+0.715
icult+0.692
stanbul+0.681
Neg logit contributions
gemony-0.806
 Ally-0.752
 Indra-0.697
 casinos-0.697
ours-0.694
Token 38
Token ID4353
Feature activation-1.60e+00

Pos logit contributions
gemony+0.188
 Ally+0.175
 casinos+0.163
 Indra+0.163
ours+0.161
Neg logit contributions
rador-0.185
MRI-0.181
rament-0.166
icult-0.160
stanbul-0.158
Token LA
Token ID9131
Feature activation+3.32e+00

Pos logit contributions
rador+1.299
MRI+1.286
rament+1.169
icult+1.131
stanbul+1.129
Neg logit contributions
gemony-1.185
 Ally-1.084
 casinos-1.015
 Indra-1.007
ours-1.003
TokenU
Token ID52
Feature activation+3.46e-01

Pos logit contributions
gemony+2.773
 Ally+2.640
 Indra+2.454
 casinos+2.453
 sophistication+2.437
Neg logit contributions
rador-2.376
MRI-2.283
rament-2.103
icult-2.019
iscopal-1.983
Token-
Token ID12
Feature activation+4.89e-01

Pos logit contributions
gemony+0.319
 Ally+0.301
 casinos+0.284
 Indra+0.283
 sophistication+0.282
Neg logit contributions
rador-0.219
MRI-0.214
rament-0.192
icult-0.184
stanbul-0.180
Token129
Token ID18741
Feature activation-1.74e+00

Pos logit contributions
gemony+0.467
 Ally+0.443
 casinos+0.419
 Indra+0.417
 sophistication+0.416
Neg logit contributions
rador-0.292
MRI-0.287
rament-0.253
icult-0.242
stanbul-0.237
Token Laun
Token ID19826
Feature activation+9.90e+00

Pos logit contributions
rador+1.402
MRI+1.398
rament+1.271
icult+1.226
stanbul+1.220
Neg logit contributions
gemony-1.293
 Ally-1.178
 casinos-1.110
 sophistication-1.093
ours-1.091
Tokenchers
Token ID3533
Feature activation+2.17e+00

Pos logit contributions
gemony+3.977
 Ally+3.494
ours+3.238
 sophistication+3.004
 Indra+2.980
Neg logit contributions
rador-10.860
MRI-10.555
stanbul-9.732
rament-9.715
iscopal-9.626
Token,
Token ID11
Feature activation+6.98e-01

Pos logit contributions
gemony+1.879
 Ally+1.791
 casinos+1.675
 Indra+1.669
 sophistication+1.662
Neg logit contributions
rador-1.472
MRI-1.434
rament-1.302
icult-1.254
iscopal-1.227
Token stated
Token ID5081
Feature activation+1.56e+00

Pos logit contributions
gemony+0.538
 Ally+0.502
 casinos+0.467
 Indra+0.465
 sophistication+0.462
Neg logit contributions
rador-0.545
MRI-0.534
rament-0.491
icult-0.473
stanbul-0.466
Token the
Token ID262
Feature activation+7.91e-01

Pos logit contributions
gemony+1.173
 Ally+1.104
 casinos+1.021
 Indra+1.018
 sophistication+1.012
Neg logit contributions
rador-1.240
MRI-1.215
rament-1.117
icult-1.082
stanbul-1.061
Token Pentagon
Token ID12651
Feature activation-8.69e-03

Pos logit contributions
gemony+0.607
 Ally+0.570
 casinos+0.530
 Indra+0.528
 sophistication+0.524
Neg logit contributions
rador-0.620
MRI-0.606
rament-0.558
icult-0.538
stanbul-0.530
Token of
Token ID286
Feature activation+4.12e-01

Pos logit contributions
gemony+3.136
 Ally+2.977
 casinos+2.757
 Indra+2.753
 sophistication+2.744
Neg logit contributions
rador-3.182
MRI-3.106
rament-2.851
icult-2.734
iscopal-2.703
Token se
Token ID384
Feature activation+1.28e+00

Pos logit contributions
gemony+0.309
 Ally+0.287
 casinos+0.266
 Indra+0.266
ours+0.264
Neg logit contributions
rador-0.331
MRI-0.324
rament-0.299
icult-0.289
stanbul-0.285
Tokeng
Token ID70
Feature activation+2.27e+00

Pos logit contributions
gemony+1.004
 Ally+0.945
 casinos+0.878
 Indra+0.874
 sophistication+0.869
Neg logit contributions
rador-0.989
MRI-0.967
rament-0.885
icult-0.857
stanbul-0.840
Tokenwit
Token ID39289
Feature activation-2.02e+00

Pos logit contributions
gemony+1.810
 Ally+1.712
 casinos+1.608
 Indra+1.590
ours+1.589
Neg logit contributions
rador-1.710
MRI-1.640
rament-1.505
icult-1.449
stanbul-1.439
Token).
Token ID737
Feature activation-7.13e-01

Pos logit contributions
rador+1.263
MRI+1.238
rament+1.115
icult+1.067
stanbul+1.050
Neg logit contributions
gemony-1.871
 Ally-1.765
 Indra-1.654
 casinos-1.649
ours-1.642
Token Further
Token ID7735
Feature activation+9.78e+00

Pos logit contributions
rador+0.523
MRI+0.515
rament+0.468
icult+0.450
stanbul+0.443
Neg logit contributions
gemony-0.583
 Ally-0.544
 casinos-0.512
 Indra-0.507
 sophistication-0.507
Token Bitcoin
Token ID6185
Feature activation+5.34e-01

Pos logit contributions
gemony+6.683
 Ally+6.477
 sophistication+6.238
 casinos+6.019
 exploits+5.934
Neg logit contributions
rador-7.798
MRI-7.636
rament-7.036
icult-6.777
iscopal-6.723
Token protocol
Token ID8435
Feature activation+1.16e+00

Pos logit contributions
gemony+0.393
 Ally+0.366
 Indra+0.341
 casinos+0.338
ours+0.336
Neg logit contributions
rador-0.435
MRI-0.428
rament-0.395
icult-0.382
stanbul-0.374
Token improvements
Token ID8561
Feature activation+2.78e+00

Pos logit contributions
gemony+0.841
 Ally+0.785
 casinos+0.724
 Indra+0.723
ours+0.716
Neg logit contributions
rador-0.968
MRI-0.949
rament-0.876
icult-0.849
stanbul-0.834
Token for
Token ID329
Feature activation+3.96e+00

Pos logit contributions
gemony+2.242
 Ally+2.130
 casinos+1.980
 Indra+1.970
 sophistication+1.966
Neg logit contributions
rador-2.056
MRI-2.014
rament-1.840
icult-1.772
iscopal-1.738
Token non
Token ID1729
Feature activation+8.03e-01

Pos logit contributions
gemony+2.807
 Ally+2.654
 casinos+2.451
 sophistication+2.433
 Indra+2.426
Neg logit contributions
rador-3.312
MRI-3.243
rament-2.994
icult-2.905
stanbul-2.851
Token and
Token ID290
Feature activation-7.96e-01

Pos logit contributions
rador+0.067
MRI+0.065
rament+0.060
icult+0.058
stanbul+0.057
Neg logit contributions
gemony-0.068
 Ally-0.063
 casinos-0.059
 Indra-0.059
 sophistication-0.058
Token obedient
Token ID41652
Feature activation-7.19e+00

Pos logit contributions
rador+0.613
MRI+0.592
rament+0.550
icult+0.533
iscopal+0.520
Neg logit contributions
gemony-0.624
 Ally-0.580
 casinos-0.543
 Indra-0.538
 sophistication-0.535
Token servant
Token ID18807
Feature activation-4.63e+00

Pos logit contributions
rador+5.556
MRI+5.189
rament+4.957
icult+4.714
stanbul+4.553
Neg logit contributions
gemony-5.362
 Ally-4.767
 casinos-4.659
 sophistication-4.630
 Indra-4.577
TokenCM
Token ID24187
Feature activation-4.06e+00

Pos logit contributions
rador+3.161
MRI+3.040
rament+2.833
icult+2.654
iscopal+2.599
Neg logit contributions
gemony-3.926
 Ally-3.639
 casinos-3.529
 sophistication-3.486
 Indra-3.420
TokenDR
Token ID7707
Feature activation+6.55e-03

Pos logit contributions
rador+3.002
MRI+2.922
rament+2.666
icult+2.578
stanbul+2.558
Neg logit contributions
gemony-3.206
 Ally-2.997
 casinos-2.861
 sophistication-2.807
 Indra-2.804
Token Kay
Token ID17356
Feature activation+9.55e+00

Pos logit contributions
gemony+0.005
 Ally+0.005
 casinos+0.004
 Indra+0.004
 sophistication+0.004
Neg logit contributions
rador-0.005
MRI-0.005
rament-0.005
icult-0.004
stanbul-0.004
Tokenmel
Token ID17694
Feature activation-2.28e+00

Pos logit contributions
gemony+7.682
 Ally+7.269
ours+6.765
 Indra+6.746
 Polo+6.685
Neg logit contributions
rador-6.584
MRI-6.576
rament-5.921
icult-5.747
iscopal-5.684
Token Kan
Token ID14248
Feature activation+6.53e-01

Pos logit contributions
rador+1.729
MRI+1.693
rament+1.551
icult+1.505
stanbul+1.471
Neg logit contributions
gemony-1.782
 Ally-1.666
 casinos-1.574
 sophistication-1.554
 Indra-1.550
Token,
Token ID11
Feature activation+3.46e-01

Pos logit contributions
gemony+0.536
 Ally+0.506
 casinos+0.472
 Indra+0.470
 sophistication+0.468
Neg logit contributions
rador-0.478
MRI-0.467
rament-0.426
icult-0.411
stanbul-0.403
Token for
Token ID329
Feature activation+2.24e+00

Pos logit contributions
gemony+0.261
 Ally+0.245
 casinos+0.227
 Indra+0.226
 sophistication+0.225
Neg logit contributions
rador-0.276
MRI-0.270
rament-0.248
icult-0.240
stanbul-0.236
Token Empire
Token ID8065
Feature activation-1.13e+00

Pos logit contributions
gemony+1.671
 Ally+1.578
 Indra+1.453
 casinos+1.452
ours+1.443
Neg logit contributions
rador-1.793
MRI-1.764
rament-1.615
icult-1.566
stanbul-1.537
Token entitled
Token ID9080
Feature activation-2.42e+00

Pos logit contributions
rador+1.502
MRI+1.472
rament+1.343
icult+1.299
stanbul+1.271
Neg logit contributions
gemony-1.605
 Ally-1.496
 Indra-1.400
 casinos-1.399
ours-1.384
Token
Token ID564
Feature activation-6.69e-01

Pos logit contributions
rador+1.783
MRI+1.747
rament+1.580
icult+1.537
stanbul+1.534
Neg logit contributions
gemony-1.927
 Ally-1.782
 casinos-1.724
 sophistication-1.676
 Indra-1.675
Token
Token ID250
Feature activation-5.40e-01

Pos logit contributions
rador+0.492
MRI+0.480
rament+0.438
icult+0.422
stanbul+0.415
Neg logit contributions
gemony-0.545
 Ally-0.514
 casinos-0.482
 Indra-0.479
 sophistication-0.477
TokenOver
Token ID5886
Feature activation-1.49e+00

Pos logit contributions
rador+0.371
MRI+0.364
rament+0.328
icult+0.315
stanbul+0.313
Neg logit contributions
gemony-0.466
 Ally-0.439
 casinos-0.417
 Indra-0.412
 sophistication-0.410
Tokencoming
Token ID4976
Feature activation+5.49e+00

Pos logit contributions
rador+1.103
MRI+1.075
rament+0.978
icult+0.954
stanbul+0.937
Neg logit contributions
gemony-1.205
 Ally-1.124
 casinos-1.061
 Indra-1.049
 sophistication-1.045
Token Sexual
Token ID19536
Feature activation+9.52e+00

Pos logit contributions
gemony+3.978
 Ally+3.843
 Indra+3.498
 sophistication+3.443
 casinos+3.440
Neg logit contributions
rador-4.452
MRI-4.348
rament-4.050
icult-3.869
iscopal-3.854
Token Abuse
Token ID24221
Feature activation+5.01e+00

Pos logit contributions
gemony+5.949
 Ally+5.670
 Indra+5.051
 Warfare+4.963
 sophistication+4.955
Neg logit contributions
rador-8.714
MRI-8.349
rament-7.842
icult-7.620
stanbul-7.573
Token
Token ID447
Feature activation-3.71e-01

Pos logit contributions
gemony+3.722
 Ally+3.571
 Indra+3.272
ours+3.224
 sophistication+3.219
Neg logit contributions
rador-3.995
MRI-3.877
rament-3.602
icult-3.466
iscopal-3.421
Token
Token ID251
Feature activation+1.34e+00

Pos logit contributions
rador+0.265
MRI+0.257
rament+0.233
icult+0.223
stanbul+0.221
Neg logit contributions
gemony-0.310
 Ally-0.294
 casinos-0.275
 Indra-0.274
 sophistication-0.273
Token or
Token ID393
Feature activation+1.08e+00

Pos logit contributions
gemony+1.037
 Ally+0.975
 casinos+0.905
 Indra+0.902
 sophistication+0.897
Neg logit contributions
rador-1.047
MRI-1.025
rament-0.941
icult-0.909
stanbul-0.893
Token
Token ID564
Feature activation-5.25e-01

Pos logit contributions
gemony+0.873
 Ally+0.826
 casinos+0.768
 Indra+0.766
 sophistication+0.762
Neg logit contributions
rador-0.802
MRI-0.784
rament-0.718
icult-0.691
stanbul-0.677
Token
Token ID447
Feature activation-9.04e-01

Pos logit contributions
rador+0.227
MRI+0.222
rament+0.205
icult+0.198
stanbul+0.194
Neg logit contributions
gemony-0.218
 Ally-0.203
 casinos-0.189
 Indra-0.187
ours-0.187
Token
Token ID251
Feature activation+7.44e-01

Pos logit contributions
rador+0.651
MRI+0.634
rament+0.577
icult+0.554
stanbul+0.545
Neg logit contributions
gemony-0.751
 Ally-0.711
 casinos-0.666
 Indra-0.661
 sophistication-0.660
Token said
Token ID531
Feature activation+2.62e+00

Pos logit contributions
gemony+0.592
 Ally+0.557
 casinos+0.520
 Indra+0.517
 sophistication+0.515
Neg logit contributions
rador-0.562
MRI-0.550
rament-0.503
icult-0.486
stanbul-0.476
Token Sen
Token ID2311
Feature activation+4.49e+00

Pos logit contributions
gemony+1.919
 Ally+1.822
 Indra+1.678
 casinos+1.662
 sophistication+1.652
Neg logit contributions
rador-2.139
MRI-2.091
rament-1.928
icult-1.863
stanbul-1.836
Token.
Token ID13
Feature activation+2.24e+00

Pos logit contributions
gemony+3.710
 Ally+3.517
ours+3.296
 Indra+3.292
 sophistication+3.207
Neg logit contributions
rador-3.183
MRI-3.108
rament-2.833
icult-2.749
stanbul-2.726
Token Kay
Token ID17356
Feature activation+9.51e+00

Pos logit contributions
gemony+1.633
 Ally+1.537
 Indra+1.415
 casinos+1.411
 sophistication+1.399
Neg logit contributions
rador-1.835
MRI-1.794
rament-1.655
icult-1.606
stanbul-1.578
Token Bailey
Token ID20330
Feature activation+3.77e+00

Pos logit contributions
gemony+5.615
 Ally+5.325
 Indra+4.861
 Polo+4.772
 casinos+4.756
Neg logit contributions
MRI-8.656
rador-8.632
rament-8.071
icult-7.774
stanbul-7.736
Token Hutch
Token ID30492
Feature activation-2.77e+00

Pos logit contributions
gemony+2.448
 Ally+2.348
 Indra+2.199
 casinos+2.114
+2.070
Neg logit contributions
rador-3.408
MRI-3.200
rament-3.052
stanbul-2.965
icult-2.946
Tokenison
Token ID1653
Feature activation-1.01e+00

Pos logit contributions
rador+2.313
MRI+2.258
rament+2.075
icult+2.062
iscopal+1.968
Neg logit contributions
gemony-1.995
 Ally-1.841
 casinos-1.700
 sophistication-1.682
 Indra-1.681
Token (
Token ID357
Feature activation+9.27e-01

Pos logit contributions
rador+0.715
MRI+0.697
rament+0.635
icult+0.613
stanbul+0.597
Neg logit contributions
gemony-0.854
 Ally-0.799
 casinos-0.756
 sophistication-0.747
 Indra-0.746
TokenR
Token ID49
Feature activation-3.58e+00

Pos logit contributions
gemony+0.689
 Ally+0.646
 casinos+0.595
 Indra+0.594
ours+0.594
Neg logit contributions
rador-0.747
MRI-0.728
rament-0.675
icult-0.655
stanbul-0.645
Tokenlez
Token ID36858
Feature activation+3.78e-02

Pos logit contributions
rador+1.465
MRI+1.426
rament+1.338
icult+1.301
stanbul+1.275
Neg logit contributions
gemony-0.901
 Ally-0.837
 Indra-0.772
 casinos-0.768
 sophistication-0.761
Tokenal
Token ID282
Feature activation+8.21e-01

Pos logit contributions
gemony+0.035
 Ally+0.033
 casinos+0.031
 Indra+0.031
 sophistication+0.031
Neg logit contributions
rador-0.024
MRI-0.023
rament-0.021
icult-0.020
stanbul-0.020
Token slide
Token ID10649
Feature activation-1.04e+00

Pos logit contributions
gemony+0.672
 Ally+0.635
 casinos+0.592
 Indra+0.590
 sophistication+0.587
Neg logit contributions
rador-0.601
MRI-0.588
rament-0.537
icult-0.517
stanbul-0.507
Token on
Token ID319
Feature activation-3.27e-01

Pos logit contributions
rador+0.759
MRI+0.740
rament+0.677
icult+0.656
stanbul+0.642
Neg logit contributions
gemony-0.852
 Ally-0.797
 casinos-0.746
 Indra-0.740
 sophistication-0.738
Token the
Token ID262
Feature activation+1.66e-01

Pos logit contributions
rador+0.259
MRI+0.254
rament+0.234
icult+0.226
iscopal+0.222
Neg logit contributions
gemony-0.246
 Ally-0.229
 casinos-0.213
 Indra-0.212
 sophistication-0.211
Token Cait
Token ID34118
Feature activation+9.44e+00

Pos logit contributions
gemony+0.119
 Ally+0.111
 casinos+0.102
 Indra+0.102
ours+0.101
Neg logit contributions
rador-0.138
MRI-0.135
rament-0.125
icult-0.121
stanbul-0.119
Tokenlyn
Token ID6213
Feature activation+5.31e+00

Pos logit contributions
gemony+4.935
 Ally+4.773
 Indra+4.187
ours+4.042
 casinos+4.000
Neg logit contributions
rador-9.451
MRI-9.243
rament-8.522
icult-8.313
stanbul-8.309
Token Jenner
Token ID39760
Feature activation+4.63e+00

Pos logit contributions
gemony+2.865
 Ally+2.693
 Indra+2.377
 casinos+2.363
 sophistication+2.353
Neg logit contributions
rador-5.353
MRI-5.215
rament-4.897
icult-4.813
stanbul-4.728
Token comparisons
Token ID17909
Feature activation+8.27e-01

Pos logit contributions
gemony+3.484
 Ally+3.353
 Indra+3.082
 sophistication+3.055
 casinos+3.054
Neg logit contributions
rador-3.628
MRI-3.526
rament-3.249
icult-3.161
stanbul-3.086
Token
Token ID960
Feature activation-1.30e+00

Pos logit contributions
gemony+0.664
 Ally+0.626
 casinos+0.583
 Indra+0.581
 sophistication+0.578
Neg logit contributions
rador-0.619
MRI-0.605
rament-0.554
icult-0.535
stanbul-0.524
Tokenand
Token ID392
Feature activation-3.95e-01

Pos logit contributions
rador+0.849
MRI+0.834
rament+0.746
icult+0.718
iscopal+0.702
Neg logit contributions
gemony-1.162
 Ally-1.098
 casinos-1.035
 Indra-1.031
 sophistication-1.028
Tokens
Token ID82
Feature activation+1.44e+00

Pos logit contributions
gemony+0.204
 Ally+0.190
 casinos+0.174
 Indra+0.173
 sophistication+0.171
Neg logit contributions
rador-0.285
MRI-0.280
rament-0.261
icult-0.253
stanbul-0.250
Token work
Token ID670
Feature activation-1.85e-01

Pos logit contributions
gemony+1.029
 Ally+0.956
 casinos+0.884
 Indra+0.879
ours+0.872
Neg logit contributions
rador-1.205
MRI-1.178
rament-1.093
icult-1.056
stanbul-1.040
Token on
Token ID319
Feature activation+5.51e-01

Pos logit contributions
rador+0.134
MRI+0.130
rament+0.120
icult+0.114
stanbul+0.112
Neg logit contributions
gemony-0.154
 Ally-0.145
 casinos-0.136
 Indra-0.135
 sophistication-0.134
Token the
Token ID262
Feature activation+7.58e-01

Pos logit contributions
gemony+0.396
 Ally+0.368
 casinos+0.340
 Indra+0.338
ours+0.336
Neg logit contributions
rador-0.461
MRI-0.451
rament-0.416
icult-0.403
stanbul-0.397
Token Space
Token ID4687
Feature activation+6.99e+00

Pos logit contributions
gemony+0.483
 Ally+0.445
 casinos+0.407
ours+0.403
 Indra+0.401
Neg logit contributions
rador-0.692
MRI-0.676
rament-0.637
icult-0.613
stanbul-0.605
Token Launch
Token ID21225
Feature activation+9.33e+00

Pos logit contributions
gemony+3.214
 Ally+3.032
 Indra+2.639
 casinos+2.573
 sophistication+2.540
Neg logit contributions
rador-7.466
MRI-7.335
rament-6.923
icult-6.778
iscopal-6.725
Token System
Token ID4482
Feature activation+4.78e+00

Pos logit contributions
 Ally+6.656
gemony+6.623
 Indra+6.040
 sophistication+5.999
 casinos+5.911
Neg logit contributions
rador-7.223
MRI-7.166
rament-6.482
iscopal-6.477
icult-6.418
Token heavy
Token ID4334
Feature activation-1.09e+00

Pos logit contributions
gemony+3.714
 Ally+3.524
 casinos+3.263
 Indra+3.254
 sophistication+3.243
Neg logit contributions
rador-3.648
MRI-3.571
rament-3.270
icult-3.181
iscopal-3.127
Token-
Token ID12
Feature activation-9.00e-01

Pos logit contributions
rador+0.793
MRI+0.773
rament+0.703
icult+0.674
stanbul+0.665
Neg logit contributions
gemony-0.895
 Ally-0.834
 Indra-0.779
 casinos-0.776
ours-0.770
Tokenlift
Token ID26282
Feature activation-1.47e+00

Pos logit contributions
rador+0.693
MRI+0.682
rament+0.621
icult+0.600
iscopal+0.586
Neg logit contributions
gemony-0.703
 Ally-0.661
 casinos-0.616
 Indra-0.615
 sophistication-0.606
Token rocket
Token ID10701
Feature activation+5.09e+00

Pos logit contributions
rador+1.242
MRI+1.205
rament+1.125
icult+1.078
stanbul+1.071
Neg logit contributions
gemony-1.044
 Ally-0.961
 Indra-0.882
 casinos-0.880
 Polo-0.867
Token also
Token ID635
Feature activation+2.46e+00

Pos logit contributions
gemony+1.420
 Ally+1.359
 casinos+1.261
 Indra+1.257
ours+1.248
Neg logit contributions
rador-1.342
MRI-1.315
rament-1.203
icult-1.155
iscopal-1.136
Token want
Token ID765
Feature activation-3.48e-01

Pos logit contributions
gemony+1.829
 Ally+1.758
 casinos+1.622
 Indra+1.613
ours+1.604
Neg logit contributions
rador-1.948
MRI-1.916
rament-1.765
icult-1.689
iscopal-1.679
Token to
Token ID284
Feature activation-7.96e-01

Pos logit contributions
rador+0.282
MRI+0.276
rament+0.254
icult+0.247
iscopal+0.242
Neg logit contributions
gemony-0.260
 Ally-0.241
 casinos-0.224
 Indra-0.223
 sophistication-0.222
Token report
Token ID989
Feature activation-3.55e-01

Pos logit contributions
rador+0.584
MRI+0.570
rament+0.520
icult+0.502
stanbul+0.493
Neg logit contributions
gemony-0.653
 Ally-0.613
 casinos-0.572
 Indra-0.571
 sophistication-0.568
Token the
Token ID262
Feature activation+4.11e-01

Pos logit contributions
rador+0.298
MRI+0.291
rament+0.270
icult+0.261
stanbul+0.255
Neg logit contributions
gemony-0.256
 Ally-0.235
 Indra-0.217
 casinos-0.216
ours-0.215
Token scam
Token ID19126
Feature activation+9.28e+00

Pos logit contributions
gemony+0.290
 Ally+0.268
 Indra+0.246
 casinos+0.246
ours+0.244
Neg logit contributions
rador-0.349
MRI-0.341
rament-0.317
icult-0.307
stanbul-0.301
Token to
Token ID284
Feature activation+6.60e-01

Pos logit contributions
gemony+6.014
 Ally+5.884
ours+5.559
 casinos+5.508
 sophistication+5.496
Neg logit contributions
MRI-7.558
rador-7.506
rament-6.847
stanbul-6.687
icult-6.670
Token the
Token ID262
Feature activation+4.07e-01

Pos logit contributions
gemony+0.496
 Ally+0.466
 casinos+0.431
 Indra+0.430
 sophistication+0.427
Neg logit contributions
rador-0.528
MRI-0.517
rament-0.476
icult-0.461
stanbul-0.452
Token Federal
Token ID5618
Feature activation-2.59e+00

Pos logit contributions
gemony+0.318
 Ally+0.299
 casinos+0.278
 Indra+0.277
 sophistication+0.275
Neg logit contributions
rador-0.313
MRI-0.306
rament-0.281
icult-0.271
stanbul-0.266
Token Trade
Token ID9601
Feature activation+2.90e+00

Pos logit contributions
rador+2.393
MRI+2.334
rament+2.205
icult+2.126
iscopal+2.087
Neg logit contributions
gemony-1.647
 Ally-1.466
ours-1.340
 Indra-1.326
 sophistication-1.318
Token Commission
Token ID4513
Feature activation+3.50e-01

Pos logit contributions
gemony+1.794
 Ally+1.697
 casinos+1.514
 sophistication+1.500
 Indra+1.498
Neg logit contributions
rador-2.685
MRI-2.649
rament-2.437
icult-2.379
iscopal-2.374
Token in
Token ID287
Feature activation+7.41e-01

Pos logit contributions
gemony+0.079
 Ally+0.074
 casinos+0.069
 Indra+0.068
 sophistication+0.068
Neg logit contributions
rador-0.072
MRI-0.070
rament-0.065
icult-0.062
stanbul-0.061
Token the
Token ID262
Feature activation+9.92e-01

Pos logit contributions
gemony+0.530
 Ally+0.494
 casinos+0.456
 Indra+0.454
 sophistication+0.452
Neg logit contributions
rador-0.619
MRI-0.606
rament-0.561
icult-0.542
stanbul-0.535
Token PS
Token ID6599
Feature activation-1.54e+00

Pos logit contributions
gemony+0.723
 Ally+0.679
 casinos+0.627
 Indra+0.625
 sophistication+0.621
Neg logit contributions
rador-0.817
MRI-0.801
rament-0.737
icult-0.715
stanbul-0.702
Token4
Token ID19
Feature activation+2.62e+00

Pos logit contributions
rador+0.762
MRI+0.741
rament+0.642
icult+0.607
stanbul+0.591
Neg logit contributions
gemony-1.615
 Ally-1.540
 casinos-1.465
 Indra-1.462
 sophistication-1.455
Token All
Token ID1439
Feature activation+2.59e-01

Pos logit contributions
gemony+1.978
 Ally+1.879
 casinos+1.739
 Indra+1.727
 sophistication+1.713
Neg logit contributions
rador-2.098
MRI-2.053
rament-1.883
icult-1.832
iscopal-1.783
Token Access
Token ID8798
Feature activation+9.27e+00

Pos logit contributions
gemony+0.229
 Ally+0.216
 casinos+0.203
 Indra+0.203
 sophistication+0.202
Neg logit contributions
rador-0.173
MRI-0.169
rament-0.153
icult-0.147
iscopal-0.143
Token Show
Token ID5438
Feature activation+1.51e+00

Pos logit contributions
gemony+6.759
 Ally+6.653
 Polo+6.204
 casinos+6.193
 Indra+6.138
Neg logit contributions
rador-7.069
MRI-6.946
rament-6.165
icult-6.014
iscopal-5.883
Token,
Token ID11
Feature activation+1.72e-01

Pos logit contributions
gemony+1.354
 Ally+1.288
 casinos+1.209
 Indra+1.206
 sophistication+1.198
Neg logit contributions
rador-0.982
MRI-0.956
rament-0.858
icult-0.829
iscopal-0.808
Token we
Token ID356
Feature activation+5.14e-02

Pos logit contributions
gemony+0.122
 Ally+0.114
 casinos+0.105
 Indra+0.104
 sophistication+0.104
Neg logit contributions
rador-0.145
MRI-0.142
rament-0.132
icult-0.127
stanbul-0.126
Token are
Token ID389
Feature activation+4.56e-01

Pos logit contributions
gemony+0.040
 Ally+0.038
 casinos+0.036
 Indra+0.035
ours+0.035
Neg logit contributions
rador-0.039
MRI-0.039
rament-0.035
icult-0.034
iscopal-0.034
Token thinking
Token ID3612
Feature activation-1.82e+00

Pos logit contributions
gemony+0.365
 Ally+0.346
 casinos+0.322
 Indra+0.320
 sophistication+0.318
Neg logit contributions
rador-0.342
MRI-0.335
rament-0.306
icult-0.295
iscopal-0.289
Token Sciences
Token ID13473
Feature activation+3.40e+00

Pos logit contributions
rador+0.878
MRI+0.876
rament+0.790
icult+0.775
stanbul+0.754
Neg logit contributions
gemony-0.889
 Ally-0.811
 casinos-0.764
 sophistication-0.756
ours-0.753
Token.
Token ID13
Feature activation+2.59e-01

Pos logit contributions
gemony+2.788
 Ally+2.650
 casinos+2.457
 Indra+2.448
ours+2.432
Neg logit contributions
rador-2.470
MRI-2.386
rament-2.223
icult-2.135
iscopal-2.070
Token
Token ID198
Feature activation-3.05e-01

Pos logit contributions
gemony+0.252
 Ally+0.240
 Indra+0.226
 casinos+0.226
 sophistication+0.225
Neg logit contributions
rador-0.150
MRI-0.145
rament-0.130
icult-0.124
stanbul-0.120
Token
Token ID198
Feature activation-2.48e-01

Pos logit contributions
rador+0.176
MRI+0.173
rament+0.153
icult+0.145
stanbul+0.141
Neg logit contributions
gemony-0.297
 Ally-0.280
 casinos-0.267
 Indra-0.264
 sophistication-0.264
TokenExplore
Token ID35433
Feature activation+4.14e+00

Pos logit contributions
rador+0.195
MRI+0.192
rament+0.175
icult+0.170
stanbul+0.167
Neg logit contributions
gemony-0.189
 Ally-0.178
 casinos-0.166
 Indra-0.165
 sophistication-0.164
Token further
Token ID2252
Feature activation+9.18e+00

Pos logit contributions
gemony+1.567
 Ally+1.399
 casinos+1.180
 Indra+1.156
 sophistication+1.155
Neg logit contributions
rador-4.828
MRI-4.721
rament-4.530
icult-4.415
stanbul-4.367
Token:
Token ID25
Feature activation+3.86e+00

Pos logit contributions
 Ally+6.338
gemony+6.324
ours+5.616
 Indra+5.575
 sophistication+5.566
Neg logit contributions
rador-7.492
MRI-7.032
rament-6.666
iscopal-6.559
icult-6.529
Token New
Token ID968
Feature activation+3.68e+00

Pos logit contributions
gemony+2.885
 Ally+2.810
 Indra+2.569
 Polo+2.541
 sophistication+2.531
Neg logit contributions
rador-3.045
MRI-2.912
rament-2.728
icult-2.640
iscopal-2.638
Token Insight
Token ID39917
Feature activation-5.91e-02

Pos logit contributions
gemony+2.717
 Ally+2.598
 sophistication+2.419
 casinos+2.413
 Indra+2.388
Neg logit contributions
rador-2.928
MRI-2.852
rament-2.650
icult-2.549
iscopal-2.542
Token Into
Token ID20008
Feature activation+2.94e+00

Pos logit contributions
rador+0.049
MRI+0.048
rament+0.044
icult+0.043
stanbul+0.042
Neg logit contributions
gemony-0.043
 Ally-0.040
 casinos-0.037
 sophistication-0.037
 Indra-0.037
Token How
Token ID1374
Feature activation+1.60e+00

Pos logit contributions
gemony+2.199
 Ally+2.095
 Indra+1.932
 casinos+1.922
 sophistication+1.917
Neg logit contributions
rador-2.350
MRI-2.279
rament-2.118
icult-2.040
iscopal-2.021
Tokent
Token ID83
Feature activation+1.89e+00

Pos logit contributions
gemony+0.019
 Ally+0.018
 casinos+0.017
 Indra+0.017
 sophistication+0.016
Neg logit contributions
rador-0.013
MRI-0.012
rament-0.011
icult-0.011
stanbul-0.010
Token plant
Token ID4618
Feature activation+1.43e+00

Pos logit contributions
gemony+1.466
 Ally+1.393
 Indra+1.297
 casinos+1.294
 sophistication+1.288
Neg logit contributions
rador-1.454
MRI-1.420
rament-1.298
icult-1.245
iscopal-1.234
Token this
Token ID428
Feature activation+1.68e+00

Pos logit contributions
gemony+1.136
 Ally+1.078
 casinos+1.003
 Indra+1.002
 sophistication+0.998
Neg logit contributions
rador-1.072
MRI-1.051
rament-0.958
icult-0.924
iscopal-0.907
Token year
Token ID614
Feature activation+4.59e-01

Pos logit contributions
gemony+1.226
 Ally+1.154
 casinos+1.063
 Indra+1.062
 sophistication+1.061
Neg logit contributions
rador-1.387
MRI-1.359
rament-1.252
icult-1.205
iscopal-1.190
Token.
Token ID13
Feature activation+3.92e-01

Pos logit contributions
gemony+0.396
 Ally+0.374
 casinos+0.350
 Indra+0.349
 sophistication+0.348
Neg logit contributions
rador-0.316
MRI-0.309
rament-0.280
icult-0.270
stanbul-0.264
Token Further
Token ID7735
Feature activation+9.13e+00

Pos logit contributions
gemony+0.338
 Ally+0.320
 casinos+0.300
 Indra+0.299
 sophistication+0.298
Neg logit contributions
rador-0.269
MRI-0.263
rament-0.238
icult-0.229
stanbul-0.224
Token,
Token ID11
Feature activation+2.78e+00

Pos logit contributions
gemony+6.328
 Ally+6.144
 sophistication+5.918
 casinos+5.740
 Indra+5.726
Neg logit contributions
rador-7.210
MRI-7.091
rament-6.583
iscopal-6.300
stanbul-6.177
Token even
Token ID772
Feature activation+1.84e+00

Pos logit contributions
gemony+1.970
 Ally+1.858
 casinos+1.726
 Indra+1.714
 sophistication+1.711
Neg logit contributions
rador-2.332
MRI-2.296
rament-2.111
icult-2.035
iscopal-2.021
Token if
Token ID611
Feature activation-7.93e-01

Pos logit contributions
gemony+1.338
 Ally+1.259
 casinos+1.166
 Indra+1.160
 sophistication+1.154
Neg logit contributions
rador-1.515
MRI-1.486
rament-1.370
icult-1.323
stanbul-1.303
Token they
Token ID484
Feature activation-2.24e+00

Pos logit contributions
rador+0.632
MRI+0.616
rament+0.569
icult+0.552
stanbul+0.540
Neg logit contributions
gemony-0.600
 Ally-0.558
 casinos-0.515
 Indra-0.514
ours-0.513
Token recognize
Token ID7564
Feature activation-1.14e+00

Pos logit contributions
rador+1.516
MRI+1.478
rament+1.341
icult+1.294
stanbul+1.266
Neg logit contributions
gemony-1.950
 Ally-1.844
 casinos-1.728
 Indra-1.721
 sophistication-1.710
Token need
Token ID761
Feature activation-1.81e-01

Pos logit contributions
rador+0.004
MRI+0.004
rament+0.004
icult+0.004
stanbul+0.004
Neg logit contributions
gemony-0.005
 Ally-0.005
 casinos-0.004
 Indra-0.004
 sophistication-0.004
Token to
Token ID284
Feature activation-1.08e+00

Pos logit contributions
rador+0.144
MRI+0.140
rament+0.130
icult+0.126
stanbul+0.124
Neg logit contributions
gemony-0.137
 Ally-0.127
 casinos-0.118
 Indra-0.118
 sophistication-0.116
Token be
Token ID307
Feature activation+1.28e-01

Pos logit contributions
rador+0.757
MRI+0.737
rament+0.672
icult+0.647
stanbul+0.634
Neg logit contributions
gemony-0.923
 Ally-0.871
 casinos-0.815
 Indra-0.813
 sophistication-0.808
Token replaced
Token ID6928
Feature activation+1.71e+00

Pos logit contributions
gemony+0.095
 Ally+0.089
 casinos+0.082
 Indra+0.082
 sophistication+0.081
Neg logit contributions
rador-0.103
MRI-0.101
rament-0.093
icult-0.090
stanbul-0.089
Token,
Token ID11
Feature activation+7.53e-01

Pos logit contributions
gemony+1.360
 Ally+1.280
 casinos+1.191
 Indra+1.187
 sophistication+1.181
Neg logit contributions
rador-1.295
MRI-1.266
rament-1.161
icult-1.120
stanbul-1.099
Token renegoti
Token ID48312
Feature activation+9.11e+00

Pos logit contributions
gemony+0.578
 Ally+0.540
 casinos+0.500
 Indra+0.500
 sophistication+0.497
Neg logit contributions
rador-0.591
MRI-0.578
rament-0.532
icult-0.513
stanbul-0.506
Tokenated
Token ID515
Feature activation+4.95e+00

Pos logit contributions
gemony+5.099
 Ally+4.519
ours+4.414
 Indra+4.100
 casinos+4.008
Neg logit contributions
rador-8.585
MRI-8.420
rament-7.754
icult-7.732
stanbul-7.701
Token or
Token ID393
Feature activation+2.30e+00

Pos logit contributions
gemony+3.396
 Ally+3.218
 casinos+2.963
 Indra+2.949
ours+2.927
Neg logit contributions
rador-4.207
MRI-4.144
rament-3.804
icult-3.705
iscopal-3.641
Token rem
Token ID816
Feature activation+1.02e+00

Pos logit contributions
gemony+1.711
 Ally+1.610
 casinos+1.492
 Indra+1.486
 sophistication+1.476
Neg logit contributions
rador-1.845
MRI-1.811
rament-1.665
icult-1.612
stanbul-1.583
Tokenade
Token ID671
Feature activation-5.96e-01

Pos logit contributions
gemony+0.765
 Ally+0.712
ours+0.665
 casinos+0.661
 Indra+0.659
Neg logit contributions
rador-0.806
MRI-0.792
rament-0.730
icult-0.703
stanbul-0.695
Token.
Token ID13
Feature activation-6.26e-01

Pos logit contributions
rador+0.427
MRI+0.416
rament+0.383
icult+0.368
stanbul+0.362
Neg logit contributions
gemony-0.497
 Ally-0.465
 casinos-0.433
 Indra-0.431
 sophistication-0.430
Token with
Token ID351
Feature activation+1.85e+00

Pos logit contributions
gemony+3.642
 Ally+3.445
 casinos+3.190
 Indra+3.161
 sophistication+3.150
Neg logit contributions
rador-4.502
MRI-4.455
rament-4.130
icult-3.998
iscopal-3.929
Token the
Token ID262
Feature activation-5.30e-01

Pos logit contributions
gemony+1.365
 Ally+1.281
 casinos+1.185
 Indra+1.180
 sophistication+1.173
Neg logit contributions
rador-1.504
MRI-1.475
rament-1.359
icult-1.316
stanbul-1.292
Token Russian
Token ID3394
Feature activation+2.80e+00

Pos logit contributions
rador+0.448
MRI+0.435
rament+0.405
icult+0.391
stanbul+0.387
Neg logit contributions
gemony-0.375
 Ally-0.350
 Indra-0.322
 casinos-0.321
ours-0.320
Token government
Token ID1230
Feature activation-8.66e-01

Pos logit contributions
gemony+2.381
 Ally+2.260
 casinos+2.114
 Indra+2.096
 sophistication+2.094
Neg logit contributions
rador-1.963
MRI-1.921
rament-1.749
icult-1.684
stanbul-1.645
Token,
Token ID11
Feature activation+5.40e-01

Pos logit contributions
rador+0.640
MRI+0.620
rament+0.567
icult+0.546
stanbul+0.540
Neg logit contributions
gemony-0.706
 Ally-0.662
 casinos-0.618
 Indra-0.616
 sophistication-0.614
Token misogyn
Token ID22795
Feature activation+9.09e+00

Pos logit contributions
gemony+0.436
 Ally+0.411
 casinos+0.383
 Indra+0.382
 sophistication+0.380
Neg logit contributions
rador-0.402
MRI-0.393
rament-0.359
icult-0.346
stanbul-0.340
Tokenistic
Token ID2569
Feature activation+4.30e+00

Pos logit contributions
gemony+9.612
 Ally+9.167
ours+8.732
 Indra+8.594
 casinos+8.580
Neg logit contributions
rador-4.060
MRI-3.960
rament-3.463
icult-3.052
stanbul-2.982
Token and
Token ID290
Feature activation+1.81e-01

Pos logit contributions
gemony+3.016
 Ally+2.843
 casinos+2.649
 sophistication+2.611
 Indra+2.586
Neg logit contributions
rador-3.624
MRI-3.595
rament-3.295
icult-3.223
stanbul-3.132
Token hateful
Token ID30367
Feature activation+2.09e+00

Pos logit contributions
gemony+0.134
 Ally+0.126
 Indra+0.116
 casinos+0.116
ours+0.115
Neg logit contributions
rador-0.148
MRI-0.143
rament-0.132
icult-0.128
stanbul-0.126
Token rhetoric
Token ID13214
Feature activation+2.07e+00

Pos logit contributions
gemony+1.528
 Ally+1.436
 casinos+1.330
 Indra+1.320
 sophistication+1.317
Neg logit contributions
rador-1.712
MRI-1.684
rament-1.546
icult-1.503
stanbul-1.470
Token and
Token ID290
Feature activation+1.06e-01

Pos logit contributions
gemony+1.621
 Ally+1.528
 casinos+1.419
 Indra+1.414
 sophistication+1.406
Neg logit contributions
rador-1.591
MRI-1.559
rament-1.429
icult-1.381
stanbul-1.353
Token guard
Token ID4860
Feature activation-6.73e-01

Pos logit contributions
gemony+0.764
 Ally+0.725
 casinos+0.673
 Indra+0.669
 sophistication+0.668
Neg logit contributions
rador-0.716
MRI-0.701
rament-0.641
icult-0.620
iscopal-0.607
Token is
Token ID318
Feature activation-1.12e+00

Pos logit contributions
rador+0.513
MRI+0.501
rament+0.458
icult+0.443
stanbul+0.435
Neg logit contributions
gemony-0.532
 Ally-0.499
 casinos-0.466
 Indra-0.464
 sophistication-0.461
Token on
Token ID319
Feature activation-5.92e-01

Pos logit contributions
rador+0.854
MRI+0.833
rament+0.767
icult+0.741
stanbul+0.731
Neg logit contributions
gemony-0.881
 Ally-0.819
 casinos-0.767
 Indra-0.761
 sophistication-0.760
Token point
Token ID966
Feature activation-4.31e-02

Pos logit contributions
rador+0.396
MRI+0.386
rament+0.346
icult+0.334
stanbul+0.329
Neg logit contributions
gemony-0.525
 Ally-0.492
 casinos-0.463
 Indra-0.462
 sophistication-0.460
Token.
Token ID13
Feature activation+1.84e-01

Pos logit contributions
rador+0.031
MRI+0.030
rament+0.027
icult+0.026
stanbul+0.026
Neg logit contributions
gemony-0.036
 Ally-0.034
 casinos-0.032
 Indra-0.032
 sophistication-0.032
Token Further
Token ID7735
Feature activation+9.01e+00

Pos logit contributions
gemony+0.156
 Ally+0.147
 casinos+0.138
 sophistication+0.137
 Indra+0.137
Neg logit contributions
rador-0.130
MRI-0.127
rament-0.115
icult-0.111
stanbul-0.109
Token,
Token ID11
Feature activation+2.71e+00

Pos logit contributions
gemony+6.339
 Ally+6.169
 sophistication+5.783
 Indra+5.675
 casinos+5.665
Neg logit contributions
rador-7.057
MRI-6.938
rament-6.423
icult-6.194
iscopal-6.171
Token you
Token ID345
Feature activation-1.86e-01

Pos logit contributions
gemony+1.918
 Ally+1.808
 casinos+1.663
 Indra+1.657
 sophistication+1.646
Neg logit contributions
rador-2.279
MRI-2.240
rament-2.067
icult-2.005
iscopal-1.972
Token slam
Token ID21158
Feature activation+5.88e+00

Pos logit contributions
rador+0.140
MRI+0.137
rament+0.126
icult+0.121
stanbul+0.119
Neg logit contributions
gemony-0.149
 Ally-0.140
 casinos-0.130
 Indra-0.130
 sophistication-0.129
Token someone
Token ID2130
Feature activation+1.95e+00

Pos logit contributions
gemony+4.072
 Ally+3.894
 Indra+3.553
 casinos+3.551
ours+3.521
Neg logit contributions
rador-4.907
MRI-4.852
rament-4.456
icult-4.342
iscopal-4.265
Token in
Token ID287
Feature activation-7.94e-01

Pos logit contributions
gemony+1.482
 Ally+1.392
 casinos+1.290
 Indra+1.286
 sophistication+1.279
Neg logit contributions
rador-1.544
MRI-1.511
rament-1.390
icult-1.344
stanbul-1.320
Token the
Token ID262
Feature activation-7.79e-02

Pos logit contributions
gemony+0.365
 Ally+0.340
 casinos+0.317
 Indra+0.316
ours+0.315
Neg logit contributions
rador-0.344
MRI-0.335
rament-0.307
icult-0.298
iscopal-0.291
Token fire
Token ID2046
Feature activation+1.00e+00

Pos logit contributions
rador+0.051
MRI+0.049
rament+0.045
icult+0.043
iscopal+0.042
Neg logit contributions
gemony-0.070
 Ally-0.066
 casinos-0.062
 Indra-0.062
ours-0.062
Token last
Token ID938
Feature activation-7.26e-01

Pos logit contributions
gemony+0.818
 Ally+0.769
 casinos+0.717
 Indra+0.714
 sophistication+0.709
Neg logit contributions
rador-0.740
MRI-0.722
rament-0.661
icult-0.638
stanbul-0.624
Token month
Token ID1227
Feature activation+2.58e+00

Pos logit contributions
rador+0.610
MRI+0.598
rament+0.554
icult+0.538
stanbul+0.528
Neg logit contributions
gemony-0.517
 Ally-0.474
 casinos-0.438
 Indra-0.437
ours-0.434
Token at
Token ID379
Feature activation+2.29e+00

Pos logit contributions
gemony+2.106
 Ally+1.995
 casinos+1.858
 Indra+1.853
 sophistication+1.842
Neg logit contributions
rador-1.900
MRI-1.857
rament-1.693
icult-1.635
stanbul-1.601
Token Gren
Token ID19674
Feature activation+9.00e+00

Pos logit contributions
gemony+1.752
 Ally+1.647
 casinos+1.528
 Indra+1.522
 sophistication+1.514
Neg logit contributions
rador-1.803
MRI-1.764
rament-1.622
icult-1.568
stanbul-1.540
Tokenfell
Token ID23299
Feature activation+1.83e+00

Pos logit contributions
gemony+5.129
 Ally+4.976
ours+4.592
 sophistication+4.340
 Warfare+4.229
Neg logit contributions
rador-8.283
MRI-8.058
rament-7.607
icult-7.386
stanbul-7.347
Token Tower
Token ID8765
Feature activation+3.88e+00

Pos logit contributions
gemony+1.102
 Ally+1.006
 casinos+0.914
 Indra+0.911
 sophistication+0.901
Neg logit contributions
rador-1.738
MRI-1.706
rament-1.593
icult-1.556
iscopal-1.530
Token,
Token ID11
Feature activation+1.02e+00

Pos logit contributions
gemony+3.060
 Ally+2.918
 casinos+2.720
 sophistication+2.704
 Indra+2.699
Neg logit contributions
rador-2.911
MRI-2.855
rament-2.603
icult-2.514
stanbul-2.481
Token which
Token ID543
Feature activation+1.12e+00

Pos logit contributions
gemony+0.795
 Ally+0.749
 casinos+0.696
 Indra+0.693
 sophistication+0.690
Neg logit contributions
rador-0.785
MRI-0.768
rament-0.705
icult-0.681
stanbul-0.668
Token was
Token ID373
Feature activation-2.09e-01

Pos logit contributions
gemony+0.880
 Ally+0.830
 casinos+0.771
 Indra+0.769
 sophistication+0.765
Neg logit contributions
rador-0.850
MRI-0.832
rament-0.762
icult-0.736
stanbul-0.722
Token there
Token ID612
Feature activation-4.32e+00

Pos logit contributions
gemony+0.163
 Ally+0.152
 Indra+0.140
 casinos+0.140
ours+0.139
Neg logit contributions
rador-0.186
MRI-0.182
rament-0.168
icult-0.163
stanbul-0.160
Token are
Token ID389
Feature activation-3.28e-01

Pos logit contributions
rador+2.908
MRI+2.839
rament+2.537
icult+2.449
stanbul+2.424
Neg logit contributions
gemony-3.714
 Ally-3.572
 casinos-3.335
 Indra-3.324
 sophistication-3.300
Token many
Token ID867
Feature activation+7.96e-01

Pos logit contributions
rador+0.270
MRI+0.265
rament+0.245
icult+0.237
stanbul+0.233
Neg logit contributions
gemony-0.238
 Ally-0.222
 Indra-0.204
 casinos-0.204
ours-0.202
Token others
Token ID1854
Feature activation+2.81e+00

Pos logit contributions
gemony+0.546
 Ally+0.509
 Indra+0.465
 casinos+0.465
ours+0.461
Neg logit contributions
rador-0.688
MRI-0.676
rament-0.627
icult-0.608
stanbul-0.598
Token.)
Token ID2014
Feature activation-2.15e+00

Pos logit contributions
gemony+2.179
 Ally+2.062
 casinos+1.920
 Indra+1.913
 sophistication+1.900
Neg logit contributions
rador-2.170
MRI-2.119
rament-1.943
icult-1.877
iscopal-1.846
Token Further
Token ID7735
Feature activation+8.96e+00

Pos logit contributions
rador+1.559
MRI+1.542
rament+1.398
icult+1.351
stanbul+1.328
Neg logit contributions
gemony-1.754
 Ally-1.631
 casinos-1.536
 Indra-1.524
ours-1.514
Token out
Token ID503
Feature activation+2.40e+00

Pos logit contributions
gemony+6.132
 Ally+5.943
 sophistication+5.647
 casinos+5.511
 exploits+5.400
Neg logit contributions
rador-7.267
MRI-7.081
rament-6.490
icult-6.317
iscopal-6.303
Token,
Token ID11
Feature activation+1.13e+00

Pos logit contributions
gemony+1.873
 Ally+1.764
 casinos+1.639
 Indra+1.634
 sophistication+1.630
Neg logit contributions
rador-1.841
MRI-1.800
rament-1.649
icult-1.595
stanbul-1.565
Token there
Token ID612
Feature activation-3.24e+00

Pos logit contributions
gemony+0.841
 Ally+0.786
 casinos+0.726
 Indra+0.725
ours+0.719
Neg logit contributions
rador-0.919
MRI-0.902
rament-0.831
icult-0.804
stanbul-0.791
Token are
Token ID389
Feature activation-2.51e-02

Pos logit contributions
rador+2.166
MRI+2.109
rament+1.892
icult+1.827
stanbul+1.805
Neg logit contributions
gemony-2.826
 Ally-2.704
 casinos-2.531
 Indra-2.526
 sophistication-2.507
Token several
Token ID1811
Feature activation+1.16e+00

Pos logit contributions
rador+0.021
MRI+0.020
rament+0.019
icult+0.018
stanbul+0.018
Neg logit contributions
gemony-0.018
 Ally-0.017
 Indra-0.016
 casinos-0.015
ours-0.015
Token-
Token ID12
Feature activation-2.30e+00

Pos logit contributions
rador+0.967
MRI+0.944
rament+0.862
icult+0.832
stanbul+0.813
Neg logit contributions
gemony-1.097
 Ally-1.034
 casinos-0.963
 Indra-0.963
 sophistication-0.955
Tokenmetal
Token ID28469
Feature activation-6.00e+00

Pos logit contributions
rador+1.238
MRI+1.214
rament+1.065
icult+1.018
iscopal+0.980
Neg logit contributions
gemony-2.327
 Ally-2.224
 Indra-2.108
 casinos-2.096
ours-2.077
Token trio
Token ID19886
Feature activation-2.91e+00

Pos logit contributions
rador+4.190
MRI+4.050
rament+3.768
icult+3.612
stanbul+3.501
Neg logit contributions
gemony-5.007
 Ally-4.693
 Indra-4.407
ours-4.378
 casinos-4.377
Token with
Token ID351
Feature activation-5.11e-01

Pos logit contributions
rador+2.027
MRI+1.970
rament+1.795
icult+1.735
stanbul+1.697
Neg logit contributions
gemony-2.468
 Ally-2.311
 casinos-2.167
 Indra-2.162
 sophistication-2.152
Token an
Token ID281
Feature activation-1.08e+00

Pos logit contributions
rador+0.407
MRI+0.398
rament+0.367
icult+0.354
stanbul+0.349
Neg logit contributions
gemony-0.385
 Ally-0.360
 casinos-0.333
 Indra-0.333
ours-0.331
Token EP
Token ID14724
Feature activation-9.33e+00

Pos logit contributions
rador+0.849
MRI+0.833
rament+0.767
icult+0.742
stanbul+0.729
Neg logit contributions
gemony-0.817
 Ally-0.765
 casinos-0.710
 Indra-0.709
ours-0.704
Token called
Token ID1444
Feature activation-1.81e+00

Pos logit contributions
rador+6.442
MRI+6.165
rament+5.576
iscopal+5.521
icult+5.492
Neg logit contributions
gemony-7.499
 Ally-6.916
 casinos-6.505
 Indra-6.449
ours-6.425
Token You
Token ID921
Feature activation-1.17e+00

Pos logit contributions
rador+1.441
MRI+1.417
rament+1.306
icult+1.254
stanbul+1.244
Neg logit contributions
gemony-1.353
 Ally-1.247
 casinos-1.183
 sophistication-1.162
ours-1.157
Token Look
Token ID6803
Feature activation-2.66e-01

Pos logit contributions
rador+0.835
MRI+0.820
rament+0.745
icult+0.716
stanbul+0.708
Neg logit contributions
gemony-0.973
 Ally-0.915
 casinos-0.865
 Indra-0.857
 sophistication-0.855
Token Like
Token ID4525
Feature activation+6.78e-01

Pos logit contributions
rador+0.189
MRI+0.185
rament+0.168
icult+0.162
stanbul+0.159
Neg logit contributions
gemony-0.223
 Ally-0.211
 casinos-0.197
 Indra-0.196
 sophistication-0.196
Token an
Token ID281
Feature activation+1.96e-01

Pos logit contributions
gemony+0.509
 Ally+0.479
 casinos+0.443
 Indra+0.442
 sophistication+0.439
Neg logit contributions
rador-0.543
MRI-0.531
rament-0.489
icult-0.473
stanbul-0.464
Token45
Token ID2231
Feature activation+1.67e+00

Pos logit contributions
gemony+2.334
 Ally+2.245
 Indra+2.100
ours+2.085
 casinos+2.082
Neg logit contributions
rador-1.747
MRI-1.678
rament-1.533
icult-1.484
iscopal-1.437
Token 366
Token ID44856
Feature activation+5.74e-02

Pos logit contributions
gemony+1.329
 Ally+1.258
 casinos+1.168
 Indra+1.166
 sophistication+1.157
Neg logit contributions
rador-1.258
MRI-1.229
rament-1.126
icult-1.088
stanbul-1.067
Token.
Token ID13
Feature activation+8.88e-01

Pos logit contributions
gemony+0.050
 Ally+0.047
 casinos+0.044
 Indra+0.044
 sophistication+0.043
Neg logit contributions
rador-0.040
MRI-0.039
rament-0.035
icult-0.034
iscopal-0.033
Token
Token ID198
Feature activation-8.58e-01

Pos logit contributions
gemony+0.897
 Ally+0.861
 casinos+0.811
 Indra+0.811
ours+0.806
Neg logit contributions
rador-0.478
MRI-0.463
rament-0.410
icult-0.387
stanbul-0.378
Token
Token ID198
Feature activation-8.69e-01

Pos logit contributions
rador+0.485
MRI+0.479
rament+0.420
icult+0.398
stanbul+0.387
Neg logit contributions
gemony-0.847
 Ally-0.796
 casinos-0.762
 Indra-0.753
 sophistication-0.753
TokenTopics
Token ID25902
Feature activation-9.05e+00

Pos logit contributions
rador+0.550
MRI+0.541
rament+0.479
icult+0.460
iscopal+0.448
Neg logit contributions
gemony-0.799
 Ally-0.760
 casinos-0.717
 Indra-0.717
 sophistication-0.710
Token:
Token ID25
Feature activation+3.67e-01

Pos logit contributions
rador+4.726
MRI+4.647
rament+3.958
icult+3.762
stanbul+3.618
Neg logit contributions
gemony-8.805
 Ally-8.038
 casinos-7.967
 Indra-7.829
[blank token]-7.805
Token community
Token ID2055
Feature activation+2.14e+00

Pos logit contributions
gemony+0.274
 Ally+0.258
 casinos+0.239
 Indra+0.238
 sophistication+0.236
Neg logit contributions
rador-0.295
MRI-0.289
rament-0.266
icult-0.257
stanbul-0.253
Token-
Token ID12
Feature activation-8.83e-01

Pos logit contributions
gemony+1.954
 Ally+1.871
 casinos+1.768
 sophistication+1.757
 Indra+1.752
Neg logit contributions
rador-1.361
MRI-1.322
rament-1.184
icult-1.141
stanbul-1.123
Tokenand
Token ID392
Feature activation-1.17e+00

Pos logit contributions
rador+0.593
MRI+0.575
rament+0.524
icult+0.504
iscopal+0.492
Neg logit contributions
gemony-0.781
 Ally-0.737
 casinos-0.695
 Indra-0.690
 sophistication-0.690
Token-
Token ID12
Feature activation-3.36e+00

Pos logit contributions
rador+0.664
MRI+0.640
rament+0.574
icult+0.543
stanbul+0.530
Neg logit contributions
gemony-1.159
 Ally-1.100
 Indra-1.041
 casinos-1.041
 sophistication-1.032
Token daughters
Token ID14850
Feature activation+5.78e-01

Pos logit contributions
gemony+0.009
 Ally+0.009
 Indra+0.008
 casinos+0.008
 sophistication+0.008
Neg logit contributions
rador-0.009
MRI-0.009
rament-0.008
icult-0.008
stanbul-0.008
Token,
Token ID11
Feature activation-1.06e+00

Pos logit contributions
gemony+0.473
 Ally+0.445
 casinos+0.416
 Indra+0.413
 sophistication+0.412
Neg logit contributions
rador-0.425
MRI-0.415
rament-0.379
icult-0.365
stanbul-0.358
Token Mal
Token ID4434
Feature activation-8.30e-01

Pos logit contributions
rador+0.842
MRI+0.823
rament+0.760
icult+0.734
stanbul+0.720
Neg logit contributions
gemony-0.803
 Ally-0.743
 casinos-0.697
ours-0.688
 sophistication-0.687
Tokenia
Token ID544
Feature activation+7.32e-01

Pos logit contributions
rador+0.523
MRI+0.509
rament+0.457
icult+0.438
stanbul+0.427
Neg logit contributions
gemony-0.765
 Ally-0.727
 casinos-0.684
 Indra-0.682
 sophistication-0.679
Token and
Token ID290
Feature activation+6.18e-02

Pos logit contributions
gemony+0.544
 Ally+0.509
 casinos+0.472
 Indra+0.469
 sophistication+0.467
Neg logit contributions
rador-0.592
MRI-0.580
rament-0.534
icult-0.518
stanbul-0.508
Token Sasha
Token ID32151
Feature activation-8.95e+00

Pos logit contributions
gemony+0.047
 Ally+0.044
 casinos+0.041
 Indra+0.040
 sophistication+0.040
Neg logit contributions
rador-0.049
MRI-0.048
rament-0.044
icult-0.043
stanbul-0.042
Token,
Token ID11
Feature activation-1.49e+00

Pos logit contributions
rador+5.533
MRI+5.457
rament+4.939
icult+4.706
iscopal+4.560
Neg logit contributions
gemony-7.843
 Ally-7.145
 casinos-6.951
 sophistication-6.914
ours-6.826
Token in
Token ID287
Feature activation-1.06e+00

Pos logit contributions
rador+1.152
MRI+1.124
rament+1.038
icult+0.996
stanbul+0.979
Neg logit contributions
gemony-1.155
 Ally-1.071
 casinos-0.999
 Indra-0.993
 sophistication-0.993
Token the
Token ID262
Feature activation-4.76e-01

Pos logit contributions
rador+0.782
MRI+0.758
rament+0.699
icult+0.671
iscopal+0.657
Neg logit contributions
gemony-0.862
 Ally-0.806
 casinos-0.752
 Indra-0.750
ours-0.748
Token harsh
Token ID11859
Feature activation-7.42e-01

Pos logit contributions
rador+0.394
MRI+0.384
rament+0.358
icult+0.345
iscopal+0.339
Neg logit contributions
gemony-0.344
 Ally-0.319
 casinos-0.295
 Indra-0.293
ours-0.293
Token glare
Token ID36785
Feature activation-6.47e-01

Pos logit contributions
rador+0.590
MRI+0.575
rament+0.532
icult+0.514
stanbul+0.504
Neg logit contributions
gemony-0.563
 Ally-0.526
 casinos-0.487
 Indra-0.486
 sophistication-0.482
Token
Token ID198
Feature activation-9.95e-01

Pos logit contributions
rador+0.707
MRI+0.693
rament+0.620
icult+0.593
stanbul+0.581
Neg logit contributions
gemony-1.022
 Ally-0.963
 casinos-0.913
 sophistication-0.907
 Indra-0.904
Token
Token ID198
Feature activation-9.88e-01

Pos logit contributions
rador+0.570
MRI+0.562
rament+0.495
icult+0.469
stanbul+0.458
Neg logit contributions
gemony-0.973
 Ally-0.918
 casinos-0.876
 Indra-0.867
 sophistication-0.867
TokenIn
Token ID818
Feature activation-2.39e+00

Pos logit contributions
rador+0.624
MRI+0.618
rament+0.548
icult+0.524
iscopal+0.513
Neg logit contributions
gemony-0.904
 Ally-0.860
 casinos-0.811
 Indra-0.810
 sophistication-0.805
Token Florida
Token ID4744
Feature activation-4.39e+00

Pos logit contributions
rador+1.484
MRI+1.439
rament+1.297
icult+1.231
iscopal+1.206
Neg logit contributions
gemony-2.214
 Ally-2.081
ours-1.965
 Indra-1.958
 casinos-1.953
Token and
Token ID290
Feature activation-6.91e-01

Pos logit contributions
rador+3.069
MRI+2.998
rament+2.728
icult+2.632
iscopal+2.567
Neg logit contributions
gemony-3.720
 Ally-3.451
 Indra-3.238
ours-3.234
 sophistication-3.230
Token Wisconsin
Token ID9279
Feature activation-8.95e+00

Pos logit contributions
rador+0.349
MRI+0.338
rament+0.295
icult+0.278
stanbul+0.270
Neg logit contributions
gemony-0.723
 Ally-0.691
 casinos-0.655
 Indra-0.653
 sophistication-0.651
Token,
Token ID11
Feature activation-1.59e+00

Pos logit contributions
rador+5.542
MRI+5.457
rament+4.905
icult+4.767
iscopal+4.665
Neg logit contributions
gemony-7.913
 Ally-7.254
ours-6.983
 sophistication-6.864
 Indra-6.829
Token 49
Token ID5125
Feature activation-3.35e-01

Pos logit contributions
rador+1.214
MRI+1.194
rament+1.092
icult+1.054
iscopal+1.035
Neg logit contributions
gemony-1.245
 Ally-1.154
 Indra-1.076
ours-1.075
 casinos-1.074
Token percent
Token ID1411
Feature activation+1.52e+00

Pos logit contributions
rador+0.288
MRI+0.283
rament+0.262
icult+0.254
stanbul+0.250
Neg logit contributions
gemony-0.232
 Ally-0.216
 casinos-0.198
 Indra-0.198
 sophistication-0.197
Token said
Token ID531
Feature activation+1.96e+00

Pos logit contributions
gemony+1.111
 Ally+1.045
 casinos+0.965
 Indra+0.960
 sophistication+0.956
Neg logit contributions
rador-1.252
MRI-1.227
rament-1.131
icult-1.094
stanbul-1.074
Token if
Token ID611
Feature activation-1.66e+00

Pos logit contributions
gemony+1.426
 Ally+1.353
 casinos+1.249
 Indra+1.243
 sophistication+1.231
Neg logit contributions
rador-1.623
MRI-1.583
rament-1.464
icult-1.420
stanbul-1.393
TokenJones
Token ID25784
Feature activation-4.12e+00

Pos logit contributions
rador+0.326
MRI+0.318
rament+0.277
icult+0.261
iscopal+0.255
Neg logit contributions
gemony-0.596
 Ally-0.564
 Indra-0.535
 casinos-0.532
 sophistication-0.532
Token,
Token ID11
Feature activation-1.25e+00

Pos logit contributions
rador+2.995
MRI+2.893
rament+2.680
icult+2.596
stanbul+2.554
Neg logit contributions
gemony-3.358
 Ally-3.091
 Indra-2.933
ours-2.882
 sophistication-2.869
Token who
Token ID508
Feature activation-1.84e+00

Pos logit contributions
rador+0.884
MRI+0.862
rament+0.786
icult+0.757
stanbul+0.741
Neg logit contributions
gemony-1.054
 Ally-0.979
 Indra-0.918
 casinos-0.917
 sophistication-0.917
Token has
Token ID468
Feature activation-9.81e-01

Pos logit contributions
rador+1.351
MRI+1.316
rament+1.225
icult+1.173
stanbul+1.164
Neg logit contributions
gemony-1.453
 Ally-1.366
 sophistication-1.272
 Indra-1.267
-1.260
Token a
Token ID257
Feature activation-1.20e+00

Pos logit contributions
rador+0.742
MRI+0.719
rament+0.664
icult+0.637
stanbul+0.632
Neg logit contributions
gemony-0.782
 Ally-0.724
ours-0.673
 Indra-0.672
 sophistication-0.665
Token net
Token ID2010
Feature activation-8.86e+00

Pos logit contributions
rador+0.961
MRI+0.942
rament+0.870
icult+0.841
stanbul+0.825
Neg logit contributions
gemony-0.903
 Ally-0.839
 Indra-0.780
ours-0.772
 casinos-0.770
Token worth
Token ID2861
Feature activation-2.33e+00

Pos logit contributions
MRI+4.450
rador+4.308
icult+3.732
rament+3.724
stanbul+3.509
Neg logit contributions
gemony-8.862
 Ally-8.088
ours-7.906
 Indra-7.895
aughs-7.802
Token of
Token ID286
Feature activation-2.25e+00

Pos logit contributions
rador+1.592
MRI+1.560
rament+1.423
icult+1.374
stanbul+1.355
Neg logit contributions
gemony-1.993
 Ally-1.857
 Indra-1.742
 casinos-1.731
ours-1.719
Token $
Token ID720
Feature activation+3.07e-01

Pos logit contributions
rador+1.852
MRI+1.817
rament+1.685
icult+1.625
stanbul+1.605
Neg logit contributions
gemony-1.611
 Ally-1.468
ours-1.370
 Indra-1.354
 casinos-1.348
Token5
Token ID20
Feature activation+1.78e+00

Pos logit contributions
gemony+0.292
 Ally+0.277
 casinos+0.262
 Indra+0.260
 sophistication+0.260
Neg logit contributions
rador-0.184
MRI-0.180
rament-0.160
icult-0.153
stanbul-0.150
Token.
Token ID13
Feature activation+1.56e+00

Pos logit contributions
gemony+1.363
 Ally+1.294
 casinos+1.203
 Indra+1.194
 sophistication+1.186
Neg logit contributions
rador-1.392
MRI-1.358
rament-1.250
icult-1.212
stanbul-1.186
Token into
Token ID656
Feature activation+3.01e+00

Pos logit contributions
gemony+0.354
 Ally+0.329
 casinos+0.301
 Indra+0.300
 sophistication+0.300
Neg logit contributions
rador-0.383
MRI-0.373
rament-0.347
icult-0.333
stanbul-0.329
Token them
Token ID606
Feature activation+3.24e-01

Pos logit contributions
gemony+2.032
 Ally+1.906
 casinos+1.767
 Indra+1.747
ours+1.736
Neg logit contributions
rador-2.618
MRI-2.576
rament-2.379
icult-2.317
stanbul-2.275
Token".
Token ID1911
Feature activation-1.14e+00

Pos logit contributions
gemony+0.271
 Ally+0.256
 casinos+0.238
 Indra+0.237
 sophistication+0.237
Neg logit contributions
rador-0.234
MRI-0.227
rament-0.208
icult-0.200
stanbul-0.195
Token\n
Token ID198
Feature activation-2.11e-01

Pos logit contributions
rador+0.658
MRI+0.647
rament+0.569
icult+0.539
stanbul+0.527
Neg logit contributions
gemony-1.107
 Ally-1.048
 casinos-0.997
 sophistication-0.991
 Indra-0.989
Token\n
Token ID198
Feature activation-2.56e-01

Pos logit contributions
rador+0.122
MRI+0.120
rament+0.106
icult+0.101
stanbul+0.098
Neg logit contributions
gemony-0.206
 Ally-0.194
 casinos-0.185
 Indra-0.183
 sophistication-0.183
TokenTopics
Token ID25902
Feature activation-8.66e+00

Pos logit contributions
rador+0.166
MRI+0.163
rament+0.145
icult+0.139
stanbul+0.136
Neg logit contributions
gemony-0.232
 Ally-0.220
 casinos-0.207
 Indra-0.207
 sophistication-0.205
Token:
Token ID25
Feature activation+5.20e-01

Pos logit contributions
rador+4.560
MRI+4.552
rament+3.887
icult+3.651
stanbul+3.549
Neg logit contributions
gemony-8.423
 Ally-7.695
 casinos-7.620
 Indra-7.472
 sophistication-7.464
Token marine
Token ID16050
Feature activation+5.67e+00

Pos logit contributions
gemony+0.360
 Ally+0.336
 casinos+0.310
 Indra+0.308
 sophistication+0.306
Neg logit contributions
rador-0.446
MRI-0.438
rament-0.405
icult-0.393
stanbul-0.386
Token-
Token ID12
Feature activation-4.28e-01

Pos logit contributions
gemony+4.803
 Ally+4.601
 casinos+4.390
ours+4.382
 sophistication+4.331
Neg logit contributions
rador-3.794
MRI-3.775
rament-3.352
stanbul-3.230
iscopal-3.142
Tokenbiology
Token ID43592
Feature activation+2.71e+00

Pos logit contributions
rador+0.340
MRI+0.333
rament+0.309
icult+0.299
stanbul+0.292
Neg logit contributions
gemony-0.321
 Ally-0.304
 casinos-0.281
 Indra-0.281
 sophistication-0.280
Token,
Token ID11
Feature activation+6.39e-01

Pos logit contributions
gemony+2.167
 Ally+2.058
 casinos+1.916
ours+1.912
 Indra+1.910
Neg logit contributions
rador-2.019
MRI-1.969
rament-1.800
icult-1.736
stanbul-1.706
Token News
Token ID3000
Feature activation-4.79e-01

Pos logit contributions
rador+1.076
MRI+1.050
rament+0.949
icult+0.917
iscopal+0.898
Neg logit contributions
gemony-1.302
 Ally-1.225
 Indra-1.145
 casinos-1.144
 sophistication-1.139
Token 24
Token ID1987
Feature activation-6.62e-01

Pos logit contributions
rador+0.347
MRI+0.339
rament+0.309
icult+0.298
stanbul+0.292
Neg logit contributions
gemony-0.396
 Ally-0.374
 casinos-0.349
 Indra-0.348
 sophistication-0.346
Token.
Token ID13
Feature activation+5.09e-01

Pos logit contributions
rador+0.446
MRI+0.433
rament+0.392
icult+0.377
stanbul+0.368
Neg logit contributions
gemony-0.584
 Ally-0.550
 Indra-0.516
 casinos-0.515
 sophistication-0.512
Token\n
Token ID198
Feature activation-7.75e-01

Pos logit contributions
gemony+0.506
 Ally+0.484
 casinos+0.457
 Indra+0.456
 sophistication+0.454
Neg logit contributions
rador-0.282
MRI-0.274
rament-0.242
icult-0.230
stanbul-0.224
Token\n
Token ID198
Feature activation-7.59e-01

Pos logit contributions
rador+0.442
MRI+0.436
rament+0.382
icult+0.363
stanbul+0.353
Neg logit contributions
gemony-0.762
 Ally-0.717
 casinos-0.686
 sophistication-0.679
 Indra-0.678
TokenTopics
Token ID25902
Feature activation-8.50e+00

Pos logit contributions
rador+0.476
MRI+0.466
rament+0.415
icult+0.397
stanbul+0.387
Neg logit contributions
gemony-0.701
 Ally-0.667
 Indra-0.628
 casinos-0.628
 sophistication-0.623
Token:
Token ID25
Feature activation+3.74e-01

Pos logit contributions
rador+4.573
MRI+4.476
rament+3.841
icult+3.637
stanbul+3.540
Neg logit contributions
gemony-8.229
 Ally-7.531
 casinos-7.384
 Indra-7.317
 sophistication-7.269
Token beef
Token ID12023
Feature activation+3.82e+00

Pos logit contributions
gemony+0.289
 Ally+0.272
 casinos+0.254
 Indra+0.252
 sophistication+0.251
Neg logit contributions
rador-0.291
MRI-0.286
rament-0.262
icult-0.253
stanbul-0.249
Token-
Token ID12
Feature activation-3.64e-01

Pos logit contributions
gemony+2.983
 Ally+2.818
 casinos+2.640
ours+2.629
 Indra+2.602
Neg logit contributions
MRI-2.855
rador-2.838
rament-2.519
stanbul-2.470
iscopal-2.440
Tokenc
Token ID66
Feature activation-2.78e+00

Pos logit contributions
rador+0.260
MRI+0.254
rament+0.232
icult+0.223
stanbul+0.218
Neg logit contributions
gemony-0.304
 Ally-0.288
 Indra-0.269
 casinos-0.269
 sophistication-0.267
Tokenattle
Token ID1999
Feature activation-1.85e+00

Pos logit contributions
rador+1.261
MRI+1.196
rament+1.047
icult+0.986
iscopal+0.919
Neg logit contributions
gemony-3.059
 Ally-2.963
 Indra-2.817
 casinos-2.800
 sophistication-2.792
Token cooler
Token ID19346
Feature activation-1.25e+00

Pos logit contributions
gemony+0.144
 Ally+0.135
 casinos+0.126
 Indra+0.125
 sophistication+0.125
Neg logit contributions
rador-0.141
MRI-0.137
rament-0.126
icult-0.122
stanbul-0.120
Token or
Token ID393
Feature activation+2.21e-01

Pos logit contributions
rador+1.041
MRI+1.019
rament+0.941
icult+0.912
stanbul+0.896
Neg logit contributions
gemony-0.905
 Ally-0.847
 casinos-0.781
 Indra-0.778
 sophistication-0.773
Token AC
Token ID7125
Feature activation-4.23e+00

Pos logit contributions
gemony+0.167
 Ally+0.157
 casinos+0.146
 Indra+0.145
 sophistication+0.145
Neg logit contributions
rador-0.176
MRI-0.172
rament-0.158
icult-0.153
stanbul-0.150
Token and
Token ID290
Feature activation-1.13e+00

Pos logit contributions
rador+3.277
MRI+3.209
rament+2.960
icult+2.813
iscopal+2.778
Neg logit contributions
gemony-3.246
 Ally-3.013
 casinos-2.807
 Indra-2.777
ours-2.768
Token a
Token ID257
Feature activation-1.48e+00

Pos logit contributions
rador+0.697
MRI+0.678
rament+0.607
icult+0.580
stanbul+0.567
Neg logit contributions
gemony-1.053
 Ally-1.001
 casinos-0.941
 Indra-0.939
 sophistication-0.934
Token refrigerator
Token ID30500
Feature activation-8.34e+00

Pos logit contributions
rador+0.954
MRI+0.931
rament+0.837
icult+0.798
stanbul+0.784
Neg logit contributions
gemony-1.338
 Ally-1.268
 casinos-1.189
 Indra-1.187
ours-1.181
Token
Token ID960
Feature activation-1.65e+00

Pos logit contributions
rador+4.374
MRI+4.217
rament+3.667
icult+3.447
iscopal+3.367
Neg logit contributions
gemony-8.256
 Ally-7.746
 casinos-7.336
 Indra-7.302
ours-7.285
Tokenit
Token ID270
Feature activation-1.05e+00

Pos logit contributions
rador+1.048
MRI+1.030
rament+0.921
icult+0.879
stanbul+0.856
Neg logit contributions
gemony-1.504
 Ally-1.436
 casinos-1.347
 Indra-1.341
 sophistication-1.334
Token is
Token ID318
Feature activation-4.17e-01

Pos logit contributions
rador+0.811
MRI+0.793
rament+0.731
icult+0.708
stanbul+0.693
Neg logit contributions
gemony-0.801
 Ally-0.757
 Indra-0.696
 casinos-0.696
 sophistication-0.696
Token among
Token ID1871
Feature activation-1.87e+00

Pos logit contributions
rador+0.325
MRI+0.318
rament+0.293
icult+0.284
stanbul+0.279
Neg logit contributions
gemony-0.321
 Ally-0.300
 casinos-0.277
 Indra-0.275
 sophistication-0.274
Token the
Token ID262
Feature activation+8.86e-02

Pos logit contributions
rador+1.465
MRI+1.414
rament+1.306
icult+1.269
iscopal+1.239
Neg logit contributions
gemony-1.435
 Ally-1.335
 casinos-1.231
 Indra-1.224
 sophistication-1.222
Tokencar
Token ID7718
Feature activation+1.35e+00

Pos logit contributions
rador+0.611
MRI+0.598
rament+0.547
icult+0.521
iscopal+0.517
Neg logit contributions
gemony-0.687
 Ally-0.652
 casinos-0.608
 Indra-0.608
 sophistication-0.604
Tokenibe
Token ID32438
Feature activation+1.33e+00

Pos logit contributions
gemony+1.139
 Ally+1.077
 casinos+1.006
 Indra+1.003
ours+1.001
Neg logit contributions
rador-0.953
MRI-0.934
rament-0.846
icult-0.819
stanbul-0.802
Token:
Token ID25
Feature activation-2.64e-01

Pos logit contributions
gemony+1.057
 Ally+0.995
 casinos+0.927
 Indra+0.924
 sophistication+0.918
Neg logit contributions
rador-1.004
MRI-0.981
rament-0.897
icult-0.867
stanbul-0.850
Token Haiti
Token ID25051
Feature activation-1.86e+00

Pos logit contributions
rador+0.199
MRI+0.194
rament+0.178
icult+0.172
stanbul+0.169
Neg logit contributions
gemony-0.209
 Ally-0.197
 casinos-0.184
 Indra-0.182
 sophistication-0.182
Token and
Token ID290
Feature activation+4.75e-02

Pos logit contributions
rador+1.405
MRI+1.372
rament+1.237
icult+1.210
iscopal+1.189
Neg logit contributions
gemony-1.461
 Ally-1.365
 casinos-1.288
 Indra-1.272
 sophistication-1.269
Token Venezuela
Token ID16666
Feature activation-8.31e+00

Pos logit contributions
gemony+0.036
 Ally+0.033
 casinos+0.031
 Indra+0.031
 sophistication+0.031
Neg logit contributions
rador-0.038
MRI-0.037
rament-0.034
icult-0.033
stanbul-0.032
Token\n
Token ID198
Feature activation-1.07e+00

Pos logit contributions
rador+5.466
MRI+5.284
rament+4.772
stanbul+4.668
icult+4.600
Neg logit contributions
gemony-6.797
 Ally-6.263
 casinos-6.078
 sophistication-5.983
ours-5.961
Token\n
Token ID198
Feature activation-8.15e-01

Pos logit contributions
rador+0.614
MRI+0.604
rament+0.531
icult+0.505
stanbul+0.493
Neg logit contributions
gemony-1.051
 Ally-0.991
 casinos-0.947
 sophistication-0.937
 Indra-0.936
TokenRen
Token ID26764
Feature activation+3.34e+00

Pos logit contributions
rador+0.531
MRI+0.524
rament+0.464
icult+0.444
stanbul+0.438
Neg logit contributions
gemony-0.731
 Ally-0.697
 casinos-0.659
 Indra-0.655
 sophistication-0.651
Tokené
Token ID2634
Feature activation+5.51e-01

Pos logit contributions
gemony+2.767
 Ally+2.611
ours+2.427
 casinos+2.424
 Indra+2.418
Neg logit contributions
rador-2.388
MRI-2.344
rament-2.140
icult-2.060
stanbul-2.017
Token Pr
Token ID1736
Feature activation-1.96e+00

Pos logit contributions
gemony+0.406
 Ally+0.376
 casinos+0.356
 sophistication+0.350
 Indra+0.350
Neg logit contributions
rador-0.446
MRI-0.436
rament-0.405
icult-0.386
iscopal-0.382
TokenT
Token ID51
Feature activation-3.92e-01

Pos logit contributions
gemony+1.036
 Ally+0.991
 Indra+0.916
 casinos+0.909
ours+0.906
Neg logit contributions
rador-0.975
MRI-0.935
rament-0.866
icult-0.837
iscopal-0.820
Token,
Token ID11
Feature activation-4.32e-01

Pos logit contributions
rador+0.303
MRI+0.298
rament+0.273
icult+0.265
stanbul+0.259
Neg logit contributions
gemony-0.304
 Ally-0.283
 casinos-0.263
 Indra-0.263
 sophistication-0.261
Token the
Token ID262
Feature activation-3.16e-01

Pos logit contributions
rador+0.362
MRI+0.358
rament+0.330
icult+0.318
stanbul+0.314
Neg logit contributions
gemony-0.306
 Ally-0.279
 casinos-0.260
 Indra-0.259
 sophistication-0.258
Token defendant
Token ID11304
Feature activation-8.18e-01

Pos logit contributions
rador+0.275
MRI+0.270
rament+0.251
icult+0.243
stanbul+0.240
Neg logit contributions
gemony-0.214
 Ally-0.196
 Indra-0.181
 casinos-0.180
ours-0.179
Token in
Token ID287
Feature activation-6.77e-01

Pos logit contributions
rador+0.622
MRI+0.611
rament+0.560
icult+0.539
stanbul+0.531
Neg logit contributions
gemony-0.644
 Ally-0.599
 casinos-0.557
 Indra-0.556
ours-0.555
Token EFF
Token ID33659
Feature activation-8.28e+00

Pos logit contributions
rador+0.521
MRI+0.506
rament+0.468
icult+0.448
stanbul+0.444
Neg logit contributions
gemony-0.531
 Ally-0.494
 casinos-0.460
ours-0.458
 Indra-0.458
Token's
Token ID338
Feature activation-1.13e+00

Pos logit contributions
rador+5.836
MRI+5.737
rament+5.168
stanbul+5.038
icult+5.006
Neg logit contributions
gemony-6.696
 Ally-6.051
 Indra-5.779
 casinos-5.728
aughs-5.716
Token NSA
Token ID10551
Feature activation-8.84e-01

Pos logit contributions
rador+0.856
MRI+0.837
rament+0.768
icult+0.740
stanbul+0.732
Neg logit contributions
gemony-0.900
 Ally-0.841
 casinos-0.785
 Indra-0.782
ours-0.778
Token surveillance
Token ID8452
Feature activation-6.08e-01

Pos logit contributions
rador+0.679
MRI+0.666
rament+0.608
icult+0.586
stanbul+0.579
Neg logit contributions
gemony-0.694
 Ally-0.646
 casinos-0.605
 Indra-0.602
 sophistication-0.596
Token litigation
Token ID19284
Feature activation+4.45e-01

Pos logit contributions
rador+0.440
MRI+0.430
rament+0.392
icult+0.378
stanbul+0.370
Neg logit contributions
gemony-0.504
 Ally-0.475
 casinos-0.443
 Indra-0.442
 sophistication-0.439
Token,
Token ID11
Feature activation-5.37e-01

Pos logit contributions
gemony+0.359
 Ally+0.338
 casinos+0.315
 Indra+0.314
 sophistication+0.312
Neg logit contributions
rador-0.331
MRI-0.324
rament-0.296
icult-0.286
stanbul-0.280
Token Reporter
Token ID25869
Feature activation+2.38e+00

Pos logit contributions
gemony+1.558
 Ally+1.462
 casinos+1.353
 Indra+1.348
 sophistication+1.340
Neg logit contributions
rador-1.689
MRI-1.655
rament-1.524
icult-1.475
stanbul-1.449
Token Today
Token ID6288
Feature activation-2.20e-01

Pos logit contributions
gemony+1.664
 Ally+1.582
 casinos+1.454
 sophistication+1.438
ours+1.430
Neg logit contributions
rador-2.027
MRI-1.965
rament-1.835
icult-1.778
iscopal-1.736
Token!
Token ID0
Feature activation+9.50e-01

Pos logit contributions
rador+0.159
MRI+0.155
rament+0.141
icult+0.136
stanbul+0.133
Neg logit contributions
gemony-0.182
 Ally-0.172
 casinos-0.161
 sophistication-0.160
 Indra-0.160
Token<|endoftext|>
Token ID50256
Feature activation+6.15e-01

Pos logit contributions
gemony+0.774
 Ally+0.745
 Indra+0.690
 casinos+0.688
 sophistication+0.679
Neg logit contributions
rador-0.699
MRI-0.689
rament-0.629
icult-0.601
stanbul-0.590
TokenA
Token ID32
Feature activation-1.15e-01

Pos logit contributions
gemony+0.539
 Ally+0.511
 casinos+0.479
 Indra+0.478
 sophistication+0.475
Neg logit contributions
rador-0.415
MRI-0.405
rament-0.367
icult-0.352
stanbul-0.345
Tokenusterity
Token ID20761
Feature activation-8.22e+00

Pos logit contributions
rador+0.088
MRI+0.086
rament+0.079
icult+0.076
stanbul+0.075
Neg logit contributions
gemony-0.091
 Ally-0.086
 casinos-0.080
 Indra-0.079
 sophistication-0.079
Token has
Token ID468
Feature activation-1.18e+00

Pos logit contributions
rador+6.081
MRI+5.922
rament+5.561
icult+5.312
iscopal+5.220
Neg logit contributions
gemony-6.227
 Ally-5.601
ours-5.363
-5.349
 Indra-5.346
Token failed
Token ID4054
Feature activation+1.58e+00

Pos logit contributions
rador+0.880
MRI+0.850
rament+0.783
icult+0.755
stanbul+0.743
Neg logit contributions
gemony-0.955
 Ally-0.895
 casinos-0.831
 Indra-0.830
ours-0.826
Token.
Token ID13
Feature activation-5.16e-01

Pos logit contributions
gemony+1.198
 Ally+1.131
 casinos+1.049
 Indra+1.045
 sophistication+1.038
Neg logit contributions
rador-1.253
MRI-1.228
rament-1.127
icult-1.089
stanbul-1.069
Token It
Token ID632
Feature activation+1.27e+00

Pos logit contributions
rador+0.369
MRI+0.362
rament+0.329
icult+0.317
stanbul+0.311
Neg logit contributions
gemony-0.429
 Ally-0.404
 casinos-0.380
 Indra-0.376
 sophistication-0.376
Token turned
Token ID2900
Feature activation+7.59e-01

Pos logit contributions
gemony+0.972
 Ally+0.917
 casinos+0.849
 Indra+0.846
 sophistication+0.843
Neg logit contributions
rador-0.998
MRI-0.973
rament-0.895
icult-0.866
stanbul-0.851
Tokenardi
Token ID22490
Feature activation+2.65e+00

Pos logit contributions
rador+1.296
MRI+1.268
rament+1.148
icult+1.103
stanbul+1.084
Neg logit contributions
gemony-1.584
 Ally-1.496
 Indra-1.401
 casinos-1.400
 sophistication-1.389
Token Gr
Token ID1902
Feature activation+3.09e+00

Pos logit contributions
gemony+1.605
 Ally+1.501
 Indra+1.373
 casinos+1.372
ours+1.354
Neg logit contributions
rador-2.498
MRI-2.440
rament-2.284
icult-2.228
stanbul-2.206
Tokenas
Token ID292
Feature activation+2.87e-01

Pos logit contributions
gemony+2.428
 Ally+2.295
ours+2.155
 casinos+2.145
 Indra+2.112
Neg logit contributions
rador-2.351
MRI-2.290
rament-2.098
icult-2.014
stanbul-1.998
Token beads
Token ID36116
Feature activation+2.43e+00

Pos logit contributions
gemony+0.215
 Ally+0.201
 Indra+0.185
 casinos+0.185
 sophistication+0.184
Neg logit contributions
rador-0.231
MRI-0.226
rament-0.209
icult-0.202
stanbul-0.199
Token and
Token ID290
Feature activation-8.92e-01

Pos logit contributions
gemony+1.723
 Ally+1.620
 casinos+1.495
 Indra+1.487
ours+1.476
Neg logit contributions
rador-2.039
MRI-2.001
rament-1.848
icult-1.792
iscopal-1.759
Token CD
Token ID6458
Feature activation-8.17e+00

Pos logit contributions
rador+0.708
MRI+0.693
rament+0.635
icult+0.613
stanbul+0.606
Neg logit contributions
gemony-0.677
 Ally-0.630
 Indra-0.583
 casinos-0.582
ours-0.581
Token
Token ID447
Feature activation-1.96e+00

Pos logit contributions
rador+6.096
MRI+5.916
rament+5.378
iscopal+5.133
icult+5.094
Neg logit contributions
gemony-6.399
 Ally-5.788
 Indra-5.398
ours-5.381
 sophistication-5.352
Token
Token ID247
Feature activation-1.30e+00

Pos logit contributions
rador+1.396
MRI+1.360
rament+1.240
icult+1.188
stanbul+1.175
Neg logit contributions
gemony-1.618
 Ally-1.539
 casinos-1.439
 sophistication-1.432
 Indra-1.429
Tokens
Token ID82
Feature activation+2.07e-01

Pos logit contributions
rador+1.166
MRI+1.146
rament+1.063
icult+1.032
stanbul+1.020
Neg logit contributions
gemony-0.850
 Ally-0.787
 Indra-0.722
 casinos-0.718
 sophistication-0.710
Token
Token ID851
Feature activation-2.23e+00

Pos logit contributions
gemony+0.169
 Ally+0.159
 casinos+0.148
 Indra+0.148
 sophistication+0.147
Neg logit contributions
rador-0.153
MRI-0.149
rament-0.136
icult-0.132
stanbul-0.129
Token one
Token ID530
Feature activation+5.89e-01

Pos logit contributions
rador+1.752
MRI+1.709
rament+1.573
icult+1.518
stanbul+1.501
Neg logit contributions
gemony-1.699
 Ally-1.580
 Indra-1.466
 casinos-1.457
ours-1.451
TokenRelations
Token ID47117
Feature activation+2.86e+00

Pos logit contributions
gemony+0.101
 Ally+0.096
 casinos+0.091
 Indra+0.090
 sophistication+0.090
Neg logit contributions
rador-0.075
MRI-0.074
rament-0.066
icult-0.063
stanbul-0.062
Token between
Token ID1022
Feature activation-2.75e-02

Pos logit contributions
gemony+2.085
 Ally+1.956
 casinos+1.800
 Indra+1.796
 sophistication+1.787
Neg logit contributions
rador-2.335
MRI-2.298
rament-2.122
icult-2.045
stanbul-2.013
Token the
Token ID262
Feature activation-1.97e-01

Pos logit contributions
rador+0.020
MRI+0.020
rament+0.018
icult+0.017
stanbul+0.017
Neg logit contributions
gemony-0.023
 Ally-0.021
 casinos-0.020
 Indra-0.020
 sophistication-0.020
Token Catalan
Token ID31066
Feature activation+2.43e-01

Pos logit contributions
rador+0.148
MRI+0.144
rament+0.132
icult+0.128
stanbul+0.125
Neg logit contributions
gemony-0.159
 Ally-0.149
 casinos-0.139
 Indra-0.139
 sophistication-0.138
Token and
Token ID290
Feature activation-7.34e-01

Pos logit contributions
gemony+0.199
 Ally+0.188
 Indra+0.175
 casinos+0.175
 sophistication+0.174
Neg logit contributions
rador-0.178
MRI-0.174
rament-0.159
icult-0.153
iscopal-0.150
Token central
Token ID4318
Feature activation-8.13e+00

Pos logit contributions
rador+0.541
MRI+0.527
rament+0.480
icult+0.467
stanbul+0.458
Neg logit contributions
gemony-0.598
 Ally-0.560
 casinos-0.522
 Indra-0.521
 sophistication-0.520
Token governments
Token ID6905
Feature activation-4.04e+00

Pos logit contributions
rador+5.372
MRI+5.160
icult+4.631
rament+4.586
iscopal+4.510
Neg logit contributions
gemony-6.873
 Ally-6.240
ours-6.034
 sophistication-5.925
 Indra-5.921
Token have
Token ID423
Feature activation-5.05e-01

Pos logit contributions
rador+2.714
MRI+2.604
rament+2.329
icult+2.268
stanbul+2.243
Neg logit contributions
gemony-3.506
 Ally-3.299
 casinos-3.102
 sophistication-3.068
ours-3.055
Token worsened
Token ID44838
Feature activation+1.21e+00

Pos logit contributions
rador+0.385
MRI+0.374
rament+0.343
icult+0.332
stanbul+0.327
Neg logit contributions
gemony-0.398
 Ally-0.374
 casinos-0.347
 Indra-0.346
 sophistication-0.345
Token of
Token ID286
Feature activation-3.98e-01

Pos logit contributions
gemony+0.863
 Ally+0.804
 casinos+0.741
 Indra+0.737
 sophistication+0.732
Neg logit contributions
rador-1.021
MRI-0.998
rament-0.927
icult-0.897
stanbul-0.883
Token late
Token ID2739
Feature activation+2.74e+00

Pos logit contributions
rador+0.239
MRI+0.231
rament+0.207
icult+0.198
stanbul+0.193
Neg logit contributions
gemony-0.379
 Ally-0.361
 casinos-0.340
 Indra-0.339
 sophistication-0.337
Token for
Token ID329
Feature activation+2.58e+00

Pos logit contributions
gemony+2.585
 Ally+2.478
 casinos+2.285
 Indra+2.267
ours+2.258
Neg logit contributions
rador-2.255
MRI-2.233
rament-2.005
icult-1.939
iscopal-1.894
Token independence
Token ID10404
Feature activation-2.82e+00

Pos logit contributions
gemony+1.962
 Ally+1.852
 casinos+1.709
 Indra+1.705
 sophistication+1.691
Neg logit contributions
rador-2.048
MRI-2.011
rament-1.840
icult-1.786
stanbul-1.751
Token:
Token ID25
Feature activation-1.18e+00

Pos logit contributions
rador+1.882
MRI+1.826
rament+1.668
icult+1.597
iscopal+1.567
Neg logit contributions
gemony-2.441
 Ally-2.306
 casinos-2.169
 sophistication-2.168
 Indra-2.160
Token 1961
Token ID20510
Feature activation-4.89e+00

Pos logit contributions
rador+0.883
MRI+0.861
rament+0.790
icult+0.763
stanbul+0.749
Neg logit contributions
gemony-0.939
 Ally-0.884
 casinos-0.825
 Indra-0.821
 sophistication-0.817
Token–
Token ID1906
Feature activation-4.10e-01

Pos logit contributions
rador+3.133
MRI+3.072
rament+2.777
icult+2.635
iscopal+2.613
Neg logit contributions
gemony-4.273
 Ally-4.037
 casinos-3.817
ours-3.777
 sophistication-3.777
Token1963
Token ID45192
Feature activation-8.06e+00

Pos logit contributions
rador+0.201
MRI+0.194
rament+0.169
icult+0.159
stanbul+0.154
Neg logit contributions
gemony-0.435
 Ally-0.417
 casinos-0.396
 Indra-0.394
 sophistication-0.393
Token [
Token ID685
Feature activation+1.13e+00

Pos logit contributions
rador+4.915
MRI+4.813
rament+4.319
icult+4.157
stanbul+4.076
Neg logit contributions
gemony-6.983
 Ally-6.574
 casinos-6.306
 sophistication-6.236
ours-6.184
Token edit
Token ID4370
Feature activation-2.40e-01

Pos logit contributions
gemony+0.820
 Ally+0.769
 casinos+0.709
 Indra+0.707
 sophistication+0.703
Neg logit contributions
rador-0.934
MRI-0.913
rament-0.844
icult-0.817
stanbul-0.803
Token ]
Token ID2361
Feature activation-3.44e-01

Pos logit contributions
rador+0.212
MRI+0.209
rament+0.194
icult+0.188
stanbul+0.185
Neg logit contributions
gemony-0.159
 Ally-0.147
 casinos-0.136
 Indra-0.135
 sophistication-0.134
Token\n
Token ID198
Feature activation-1.06e+00

Pos logit contributions
rador+0.204
MRI+0.200
rament+0.178
icult+0.169
stanbul+0.165
Neg logit contributions
gemony-0.329
 Ally-0.312
 casinos-0.295
 Indra-0.293
 sophistication-0.293
Token\n
Token ID198
Feature activation-8.77e-01

Pos logit contributions
rador+0.604
MRI+0.596
rament+0.523
icult+0.498
stanbul+0.485
Neg logit contributions
gemony-1.045
 Ally-0.984
 casinos-0.941
 sophistication-0.931
 Indra-0.930
Token when
Token ID618
Feature activation-3.69e+00

Pos logit contributions
gemony+0.850
 Ally+0.800
 casinos+0.744
 Indra+0.741
 sophistication+0.738
Neg logit contributions
rador-0.812
MRI-0.793
rament-0.727
icult-0.701
stanbul-0.689
Token the
Token ID262
Feature activation-1.43e+00

Pos logit contributions
rador+2.718
MRI+2.642
rament+2.420
icult+2.343
stanbul+2.320
Neg logit contributions
gemony-2.989
 Ally-2.787
 Indra-2.590
 sophistication-2.587
 casinos-2.577
Token court
Token ID2184
Feature activation+9.77e-01

Pos logit contributions
rador+1.019
MRI+0.991
rament+0.908
icult+0.875
stanbul+0.858
Neg logit contributions
gemony-1.196
 Ally-1.123
 Indra-1.048
 casinos-1.044
 sophistication-1.038
Token accepts
Token ID18178
Feature activation+1.76e+00

Pos logit contributions
gemony+0.776
 Ally+0.731
 casinos+0.681
 Indra+0.677
 sophistication+0.674
Neg logit contributions
rador-0.739
MRI-0.724
rament-0.663
icult-0.639
stanbul-0.627
Token a
Token ID257
Feature activation-2.45e-02

Pos logit contributions
gemony+1.331
 Ally+1.256
 casinos+1.165
 Indra+1.159
ours+1.151
Neg logit contributions
rador-1.392
MRI-1.361
rament-1.251
icult-1.210
stanbul-1.185
Token central
Token ID4318
Feature activation-7.99e+00

Pos logit contributions
rador+0.019
MRI+0.018
rament+0.017
icult+0.016
stanbul+0.016
Neg logit contributions
gemony-0.020
 Ally-0.018
 casinos-0.017
 Indra-0.017
 sophistication-0.017
Token government
Token ID1230
Feature activation-4.82e+00

Pos logit contributions
rador+4.531
MRI+4.400
rament+4.027
icult+3.877
iscopal+3.837
Neg logit contributions
gemony-7.515
 Ally-6.917
ours-6.652
aughs-6.595
 casinos-6.593
Token legal
Token ID2742
Feature activation+1.02e+00

Pos logit contributions
rador+3.646
MRI+3.560
rament+3.269
icult+3.157
stanbul+3.132
Neg logit contributions
gemony-3.754
 Ally-3.485
 casinos-3.271
 Indra-3.251
 sophistication-3.236
Token challenge
Token ID4427
Feature activation-1.49e+00

Pos logit contributions
gemony+0.684
 Ally+0.640
 casinos+0.586
 Indra+0.583
 sophistication+0.580
Neg logit contributions
rador-0.899
MRI-0.883
rament-0.818
icult-0.794
stanbul-0.782
Token related
Token ID3519
Feature activation-3.96e-01

Pos logit contributions
rador+1.135
MRI+1.105
rament+1.015
icult+0.978
stanbul+0.962
Neg logit contributions
gemony-1.181
 Ally-1.104
 casinos-1.026
 Indra-1.024
 sophistication-1.021
Token to
Token ID284
Feature activation-1.74e-01

Pos logit contributions
rador+0.315
MRI+0.307
rament+0.282
icult+0.273
stanbul+0.269
Neg logit contributions
gemony-0.301
 Ally-0.281
 casinos-0.260
 Indra-0.260
 sophistication-0.258
Token:
Token ID25
Feature activation+3.67e-01

Pos logit contributions
rador+4.726
MRI+4.647
rament+3.958
icult+3.762
stanbul+3.618
Neg logit contributions
gemony-8.805
 Ally-8.038
 casinos-7.967
 Indra-7.829
-7.805
Token community
Token ID2055
Feature activation+2.14e+00

Pos logit contributions
gemony+0.274
 Ally+0.258
 casinos+0.239
 Indra+0.238
 sophistication+0.236
Neg logit contributions
rador-0.295
MRI-0.289
rament-0.266
icult-0.257
stanbul-0.253
Token-
Token ID12
Feature activation-8.83e-01

Pos logit contributions
gemony+1.954
 Ally+1.871
 casinos+1.768
 sophistication+1.757
 Indra+1.752
Neg logit contributions
rador-1.361
MRI-1.322
rament-1.184
icult-1.141
stanbul-1.123
Tokenand
Token ID392
Feature activation-1.17e+00

Pos logit contributions
rador+0.593
MRI+0.575
rament+0.524
icult+0.504
iscopal+0.492
Neg logit contributions
gemony-0.781
 Ally-0.737
 casinos-0.695
 Indra-0.690
 sophistication-0.690
Token-
Token ID12
Feature activation-3.36e+00

Pos logit contributions
rador+0.664
MRI+0.640
rament+0.574
icult+0.543
stanbul+0.530
Neg logit contributions
gemony-1.159
 Ally-1.100
 Indra-1.041
 casinos-1.041
 sophistication-1.032
Tokensoc
Token ID35634
Feature activation-7.98e+00

Pos logit contributions
rador+1.528
MRI+1.474
rament+1.282
icult+1.265
stanbul+1.173
Neg logit contributions
gemony-3.672
 Ally-3.478
 Indra-3.338
 casinos-3.334
 sophistication-3.309
Tokeniety
Token ID1905
Feature activation-3.35e+00

Pos logit contributions
rador+3.538
MRI+3.102
rament+2.946
icult+2.698
iscopal+2.510
Neg logit contributions
gemony-8.725
 Ally-8.328
-8.081
 casinos-8.059
 Indra-8.022
Token,
Token ID11
Feature activation+3.01e-01

Pos logit contributions
rador+2.364
MRI+2.298
rament+2.090
icult+2.030
stanbul+1.981
Neg logit contributions
gemony-2.829
 Ally-2.666
 casinos-2.499
 Indra-2.479
 sophistication-2.469
Token community
Token ID2055
Feature activation+2.14e+00

Pos logit contributions
gemony+0.223
 Ally+0.209
 casinos+0.193
 Indra+0.193
 sophistication+0.192
Neg logit contributions
rador-0.244
MRI-0.239
rament-0.220
icult-0.213
stanbul-0.209
Token-
Token ID12
Feature activation+5.76e-03

Pos logit contributions
gemony+1.925
 Ally+1.848
 casinos+1.746
 sophistication+1.736
ours+1.731
Neg logit contributions
rador-1.386
MRI-1.349
rament-1.214
icult-1.159
stanbul-1.150
Tokenorgan
Token ID9971
Feature activation-2.08e+00

Pos logit contributions
gemony+0.005
 Ally+0.005
 casinos+0.004
 Indra+0.004
 sophistication+0.004
Neg logit contributions
rador-0.004
MRI-0.004
rament-0.004
icult-0.003
stanbul-0.003
Tokenick
Token ID624
Feature activation-3.74e+00

Pos logit contributions
gemony+3.896
 Ally+3.711
ours+3.501
 casinos+3.423
 Indra+3.419
Neg logit contributions
rador-3.246
MRI-3.201
rament-2.887
icult-2.797
stanbul-2.771
Tokenman
Token ID805
Feature activation-4.97e-01

Pos logit contributions
rador+2.682
MRI+2.614
rament+2.394
icult+2.329
stanbul+2.272
Neg logit contributions
gemony-3.039
 Ally-2.849
 casinos-2.673
ours-2.648
 sophistication-2.641
Token presided
Token ID45022
Feature activation-2.78e+00

Pos logit contributions
rador+0.357
MRI+0.349
rament+0.318
icult+0.307
stanbul+0.302
Neg logit contributions
gemony-0.412
 Ally-0.388
 casinos-0.363
 Indra-0.360
 sophistication-0.360
Token over
Token ID625
Feature activation-2.41e+00

Pos logit contributions
rador+2.231
MRI+2.172
rament+2.002
icult+1.941
stanbul+1.919
Neg logit contributions
gemony-2.069
 Ally-1.933
 sophistication-1.777
 casinos-1.776
 Indra-1.776
Token a
Token ID257
Feature activation-7.54e-01

Pos logit contributions
rador+1.814
MRI+1.752
rament+1.613
icult+1.562
iscopal+1.529
Neg logit contributions
gemony-1.915
 Ally-1.773
ours-1.650
 Indra-1.643
 sophistication-1.638
Token soft
Token ID2705
Feature activation-7.97e+00

Pos logit contributions
rador+0.600
MRI+0.586
rament+0.540
icult+0.523
stanbul+0.513
Neg logit contributions
gemony-0.571
 Ally-0.534
 casinos-0.493
 Indra-0.493
ours-0.490
Token-
Token ID12
Feature activation-2.90e+00

Pos logit contributions
rador+5.452
MRI+5.198
rament+4.687
icult+4.648
stanbul+4.478
Neg logit contributions
gemony-6.840
 Ally-6.333
 Indra-5.905
ours-5.901
 casinos-5.776
Tokendr
Token ID7109
Feature activation+4.36e-01

Pos logit contributions
rador+2.138
MRI+2.088
rament+1.898
icult+1.836
iscopal+1.792
Neg logit contributions
gemony-2.367
 Ally-2.239
 Indra-2.087
 casinos-2.079
 sophistication-2.061
Tokenink
Token ID676
Feature activation-1.82e+00

Pos logit contributions
gemony+0.363
 Ally+0.342
 Indra+0.322
ours+0.320
 casinos+0.320
Neg logit contributions
rador-0.309
MRI-0.301
rament-0.278
icult-0.264
stanbul-0.261
Token bott
Token ID3005
Feature activation+1.46e+00

Pos logit contributions
rador+1.379
MRI+1.343
rament+1.234
icult+1.194
stanbul+1.162
Neg logit contributions
gemony-1.454
 Ally-1.367
 Indra-1.267
 casinos-1.261
ours-1.257
Tokenling
Token ID1359
Feature activation-3.89e-01

Pos logit contributions
gemony+0.978
 Ally+0.908
 casinos+0.836
 Indra+0.833
 sophistication+0.829
Neg logit contributions
rador-1.278
MRI-1.258
rament-1.164
icult-1.127
stanbul-1.116
Token Wi
Token ID11759
Feature activation+7.90e-01

Pos logit contributions
gemony+1.818
 Ally+1.709
 casinos+1.582
 Indra+1.577
 sophistication+1.568
Neg logit contributions
rador-1.919
MRI-1.877
rament-1.728
icult-1.671
stanbul-1.641
Tokenener
Token ID877
Feature activation+1.67e+00

Pos logit contributions
gemony+0.782
 Ally+0.746
 casinos+0.704
 Indra+0.703
ours+0.700
Neg logit contributions
rador-0.444
MRI-0.430
rament-0.381
icult-0.362
stanbul-0.352
Token/
Token ID14
Feature activation+7.95e-01

Pos logit contributions
gemony+1.319
 Ally+1.244
 casinos+1.156
 Indra+1.152
 sophistication+1.145
Neg logit contributions
rador-1.273
MRI-1.245
rament-1.141
icult-1.102
stanbul-1.080
TokenFile
Token ID8979
Feature activation-4.87e+00

Pos logit contributions
gemony+0.712
 Ally+0.675
 casinos+0.634
 Indra+0.632
 sophistication+0.629
Neg logit contributions
rador-0.522
MRI-0.510
rament-0.460
icult-0.440
stanbul-0.431
Token/
Token ID14
Feature activation+1.15e-01

Pos logit contributions
rador+3.358
MRI+3.284
rament+2.986
icult+2.856
stanbul+2.822
Neg logit contributions
gemony-4.151
 Ally-3.848
 casinos-3.642
 sophistication-3.616
ours-3.587
Token1963
Token ID45192
Feature activation-7.94e+00

Pos logit contributions
gemony+0.108
 Ally+0.103
 casinos+0.097
 Indra+0.096
 sophistication+0.096
Neg logit contributions
rador-0.070
MRI-0.068
rament-0.061
icult-0.058
stanbul-0.057
Token Basketball
Token ID25911
Feature activation+2.19e-01

Pos logit contributions
MRI+5.391
rador+5.322
rament+4.835
stanbul+4.524
icult+4.522
Neg logit contributions
gemony-6.589
 Ally-6.027
 casinos-5.769
 sophistication-5.729
ours-5.612
Token goals
Token ID4661
Feature activation-4.16e+00

Pos logit contributions
gemony+0.174
 Ally+0.164
 casinos+0.153
 Indra+0.152
 sophistication+0.152
Neg logit contributions
rador-0.166
MRI-0.162
rament-0.149
icult-0.144
stanbul-0.141
Token were
Token ID547
Feature activation-2.93e+00

Pos logit contributions
rador+2.965
MRI+2.900
rament+2.644
icult+2.579
stanbul+2.518
Neg logit contributions
gemony-3.423
 Ally-3.195
 casinos-3.011
 Indra-2.985
 sophistication-2.984
Token added
Token ID2087
Feature activation+3.35e-01

Pos logit contributions
rador+2.169
MRI+2.111
rament+1.953
icult+1.880
stanbul+1.867
Neg logit contributions
gemony-2.363
 Ally-2.185
 Indra-2.042
 casinos-2.034
 sophistication-2.029
Token to
Token ID284
Feature activation-1.13e+00

Pos logit contributions
gemony+0.260
 Ally+0.243
 casinos+0.226
 Indra+0.225
 sophistication+0.224
Neg logit contributions
rador-0.261
MRI-0.255
rament-0.234
icult-0.226
stanbul-0.223
Token tightening
Token ID36040
Feature activation-8.39e-01

Pos logit contributions
gemony+0.119
 Ally+0.111
 casinos+0.103
 Indra+0.103
 sophistication+0.102
Neg logit contributions
rador-0.125
MRI-0.122
rament-0.113
icult-0.109
stanbul-0.107
Token race
Token ID3234
Feature activation+2.21e+00

Pos logit contributions
rador+0.639
MRI+0.635
rament+0.583
icult+0.560
stanbul+0.552
Neg logit contributions
gemony-0.659
 Ally-0.610
 Indra-0.568
 casinos-0.567
ours-0.565
Token in
Token ID287
Feature activation-2.85e-01

Pos logit contributions
gemony+1.701
 Ally+1.625
 casinos+1.510
 Indra+1.498
 sophistication+1.498
Neg logit contributions
rador-1.716
MRI-1.673
rament-1.520
icult-1.489
iscopal-1.458
Token Florida
Token ID4744
Feature activation-4.14e+00

Pos logit contributions
rador+0.173
MRI+0.168
rament+0.150
icult+0.143
stanbul+0.140
Neg logit contributions
gemony-0.270
 Ally-0.257
 casinos-0.242
 Indra-0.241
 sophistication-0.240
Token and
Token ID290
Feature activation-3.38e-01

Pos logit contributions
rador+2.865
MRI+2.786
rament+2.580
icult+2.441
stanbul+2.413
Neg logit contributions
gemony-3.533
 Ally-3.248
 Indra-3.048
ours-3.038
-3.037
Token Wisconsin
Token ID9279
Feature activation-7.93e+00

Pos logit contributions
rador+0.169
MRI+0.163
rament+0.142
icult+0.135
stanbul+0.130
Neg logit contributions
gemony-0.355
 Ally-0.339
 casinos-0.322
 Indra-0.321
 sophistication-0.320
Token,
Token ID11
Feature activation-8.20e-01

Pos logit contributions
rador+4.687
MRI+4.523
rament+4.213
icult+3.911
stanbul+3.871
Neg logit contributions
gemony-7.248
 Ally-6.638
-6.321
ours-6.299
aughs-6.279
Token where
Token ID810
Feature activation-2.78e+00

Pos logit contributions
rador+0.587
MRI+0.573
rament+0.524
icult+0.503
stanbul+0.494
Neg logit contributions
gemony-0.685
 Ally-0.642
 Indra-0.599
 casinos-0.598
ours-0.597
Token Ryan
Token ID6047
Feature activation-2.52e+00

Pos logit contributions
rador+2.061
MRI+2.027
rament+1.854
icult+1.775
iscopal+1.748
Neg logit contributions
gemony-2.236
 Ally-2.063
ours-1.934
 Indra-1.930
 sophistication-1.915
Token's
Token ID338
Feature activation-6.49e-01

Pos logit contributions
rador+1.860
MRI+1.806
rament+1.672
icult+1.610
stanbul+1.591
Neg logit contributions
gemony-2.017
 Ally-1.878
 Indra-1.762
 sophistication-1.747
ours-1.735
Token addition
Token ID3090
Feature activation+2.95e-01

Pos logit contributions
rador+0.500
MRI+0.490
rament+0.452
icult+0.434
stanbul+0.425
Neg logit contributions
gemony-0.506
 Ally-0.470
 Indra-0.439
ours-0.435
 casinos-0.434
Token would
Token ID561
Feature activation+2.88e+00

Pos logit contributions
rador+0.008
MRI+0.008
rament+0.007
icult+0.007
stanbul+0.007
Neg logit contributions
gemony-0.008
 Ally-0.007
 Indra-0.007
 casinos-0.007
 sophistication-0.007
Token be
Token ID307
Feature activation+2.37e+00

Pos logit contributions
gemony+2.240
 Ally+2.129
 casinos+1.976
 sophistication+1.961
 Indra+1.961
Neg logit contributions
rador-2.206
MRI-2.161
rament-1.983
icult-1.897
iscopal-1.889
Token problematic
Token ID15833
Feature activation+3.85e-01

Pos logit contributions
gemony+1.752
 Ally+1.655
 casinos+1.532
 Indra+1.524
 sophistication+1.517
Neg logit contributions
rador-1.925
MRI-1.883
rament-1.735
icult-1.675
iscopal-1.654
Token if
Token ID611
Feature activation-6.84e-01

Pos logit contributions
gemony+0.314
 Ally+0.296
 casinos+0.276
 Indra+0.275
 sophistication+0.274
Neg logit contributions
rador-0.283
MRI-0.276
rament-0.253
icult-0.243
stanbul-0.239
Token the
Token ID262
Feature activation-9.55e-01

Pos logit contributions
rador+0.499
MRI+0.489
rament+0.447
icult+0.429
stanbul+0.420
Neg logit contributions
gemony-0.561
 Ally-0.525
 Indra-0.490
 casinos-0.489
ours-0.487
Token Panthers
Token ID17782
Feature activation-7.93e+00

Pos logit contributions
rador+0.595
MRI+0.580
rament+0.522
icult+0.497
stanbul+0.485
Neg logit contributions
gemony-0.888
 Ally-0.841
 Indra-0.790
 casinos-0.789
ours-0.785
Token are
Token ID389
Feature activation-1.78e+00

Pos logit contributions
rador+5.216
MRI+5.154
rament+4.687
icult+4.522
stanbul+4.487
Neg logit contributions
gemony-6.564
 Ally-6.035
 Indra-5.794
ours-5.704
-5.679
Token asking
Token ID4737
Feature activation+4.01e-01

Pos logit contributions
rador+1.282
MRI+1.251
rament+1.149
icult+1.101
stanbul+1.089
Neg logit contributions
gemony-1.473
 Ally-1.359
ours-1.274
 Indra-1.272
 casinos-1.266
Token him
Token ID683
Feature activation+1.37e+00

Pos logit contributions
gemony+0.322
 Ally+0.301
 casinos+0.281
 Indra+0.280
 sophistication+0.279
Neg logit contributions
rador-0.302
MRI-0.296
rament-0.271
icult-0.261
stanbul-0.255
Token to
Token ID284
Feature activation-7.79e-01

Pos logit contributions
gemony+1.008
 Ally+0.944
 casinos+0.873
 Indra+0.870
 sophistication+0.865
Neg logit contributions
rador-1.112
MRI-1.089
rament-1.005
icult-0.972
stanbul-0.955
Token push
Token ID4574
Feature activation-1.93e+00

Pos logit contributions
rador+0.575
MRI+0.562
rament+0.513
icult+0.495
stanbul+0.486
Neg logit contributions
gemony-0.634
 Ally-0.599
 casinos-0.558
 Indra-0.556
 sophistication-0.553
Token own
Token ID898
Feature activation-1.31e-01

Pos logit contributions
rador+0.992
MRI+0.967
rament+0.894
icult+0.863
stanbul+0.842
Neg logit contributions
gemony-0.961
 Ally-0.895
 Indra-0.827
ours-0.825
 casinos-0.816
Token choice
Token ID3572
Feature activation-2.54e-01

Pos logit contributions
rador+0.108
MRI+0.106
rament+0.098
icult+0.095
stanbul+0.093
Neg logit contributions
gemony-0.095
 Ally-0.088
 Indra-0.080
ours-0.080
 casinos-0.080
Token.
Token ID13
Feature activation-7.60e-01

Pos logit contributions
rador+0.171
MRI+0.167
rament+0.151
icult+0.145
iscopal+0.142
Neg logit contributions
gemony-0.222
 Ally-0.210
 casinos-0.196
 sophistication-0.196
 Indra-0.196
Token This
Token ID770
Feature activation+2.88e+00

Pos logit contributions
rador+0.534
MRI+0.524
rament+0.474
icult+0.455
stanbul+0.448
Neg logit contributions
gemony-0.644
 Ally-0.603
 casinos-0.571
 sophistication-0.565
ours-0.563
Token only
Token ID691
Feature activation+4.10e-01

Pos logit contributions
gemony+2.198
 Ally+2.074
 casinos+1.922
 sophistication+1.917
 Indra+1.917
Neg logit contributions
rador-2.265
MRI-2.220
rament-2.041
icult-1.972
iscopal-1.939
Token perpet
Token ID8939
Feature activation+7.29e+00

Pos logit contributions
gemony+0.294
 Ally+0.274
 Indra+0.250
ours+0.250
 casinos+0.250
Neg logit contributions
rador-0.342
MRI-0.333
rament-0.307
icult-0.298
stanbul-0.295
Tokenuated
Token ID6605
Feature activation+3.93e+00

Pos logit contributions
gemony+3.251
 Ally+2.774
ours+2.568
 Indra+2.495
 casinos+2.435
Neg logit contributions
rador-7.896
MRI-7.855
rament-7.325
stanbul-7.217
icult-7.161
Token the
Token ID262
Feature activation+3.03e-01

Pos logit contributions
gemony+2.866
 Ally+2.700
 casinos+2.500
 Indra+2.490
 sophistication+2.482
Neg logit contributions
rador-3.195
MRI-3.158
rament-2.900
icult-2.819
stanbul-2.764
Token negative
Token ID4633
Feature activation+2.26e+00

Pos logit contributions
gemony+0.221
 Ally+0.206
 casinos+0.190
 Indra+0.189
ours+0.189
Neg logit contributions
rador-0.250
MRI-0.243
rament-0.225
icult-0.217
stanbul-0.213
Token portray
Token ID10993
Feature activation+3.20e+00

Pos logit contributions
gemony+1.586
 Ally+1.493
 casinos+1.375
 Indra+1.371
 sophistication+1.366
Neg logit contributions
rador-1.918
MRI-1.883
rament-1.744
icult-1.692
stanbul-1.661
Tokenals
Token ID874
Feature activation+2.14e+00

Pos logit contributions
gemony+2.052
 Ally+1.910
ours+1.738
 casinos+1.735
 Indra+1.730
Neg logit contributions
rador-2.907
MRI-2.850
rament-2.635
icult-2.578
iscopal-2.538
TokenPart
Token ID7841
Feature activation-1.27e-01

Pos logit contributions
gemony+0.120
 Ally+0.112
 casinos+0.105
 Indra+0.104
 sophistication+0.104
Neg logit contributions
rador-0.121
MRI-0.119
rament-0.109
icult-0.105
stanbul-0.103
Tokenicle
Token ID1548
Feature activation-8.32e-01

Pos logit contributions
rador+0.042
MRI+0.040
rament+0.032
icult+0.029
iscopal+0.027
Neg logit contributions
gemony-0.156
 Ally-0.150
 casinos-0.144
 Indra-0.143
 sophistication-0.143
Token ]
Token ID2361
Feature activation+7.71e-01

Pos logit contributions
rador+0.623
MRI+0.610
rament+0.559
icult+0.537
stanbul+0.526
Neg logit contributions
gemony-0.669
 Ally-0.626
 casinos-0.587
 sophistication-0.583
 Indra-0.582
Token ;
Token ID2162
Feature activation+1.68e+00

Pos logit contributions
gemony+0.660
 Ally+0.625
 casinos+0.584
 Indra+0.583
 sophistication+0.580
Neg logit contributions
rador-0.535
MRI-0.523
rament-0.474
icult-0.457
stanbul-0.448
Token const
Token ID1500
Feature activation+3.30e-01

Pos logit contributions
gemony+1.314
 Ally+1.244
 casinos+1.155
 Indra+1.151
 sophistication+1.144
Neg logit contributions
rador-1.282
MRI-1.255
rament-1.148
icult-1.112
stanbul-1.091
Token Vec
Token ID38692
Feature activation+5.79e+00

Pos logit contributions
gemony+0.252
 Ally+0.235
 casinos+0.219
 Indra+0.218
 sophistication+0.217
Neg logit contributions
rador-0.260
MRI-0.254
rament-0.234
icult-0.226
stanbul-0.222
Token3
Token ID18
Feature activation-5.07e-01

Pos logit contributions
gemony+4.985
 Ally+4.797
ours+4.500
 casinos+4.442
 sophistication+4.422
Neg logit contributions
rador-3.809
MRI-3.702
rament-3.324
icult-3.278
stanbul-3.211
Token &
Token ID1222
Feature activation+1.34e+00

Pos logit contributions
rador+0.397
MRI+0.388
rament+0.357
icult+0.344
stanbul+0.338
Neg logit contributions
gemony-0.390
 Ally-0.365
 casinos-0.340
 Indra-0.339
 sophistication-0.337
Token r
Token ID374
Feature activation-7.19e-01

Pos logit contributions
gemony+1.034
 Ally+0.973
 casinos+0.903
 Indra+0.900
 sophistication+0.895
Neg logit contributions
rador-1.040
MRI-1.017
rament-0.934
icult-0.903
stanbul-0.886
TokenPosition
Token ID26545
Feature activation-3.96e-01

Pos logit contributions
rador+0.530
MRI+0.521
rament+0.475
icult+0.456
stanbul+0.447
Neg logit contributions
gemony-0.584
 Ally-0.552
 casinos-0.518
 Indra-0.515
 sophistication-0.512
Token =
Token ID796
Feature activation-9.00e-02

Pos logit contributions
rador+0.223
MRI+0.217
rament+0.192
icult+0.182
stanbul+0.177
Neg logit contributions
gemony-0.391
 Ally-0.372
 casinos-0.353
 Indra-0.351
 sophistication-0.350
Token in
Token ID287
Feature activation+2.33e-01

Pos logit contributions
gemony+1.592
 Ally+1.504
 casinos+1.400
 Indra+1.396
 sophistication+1.388
Neg logit contributions
rador-1.483
MRI-1.452
rament-1.328
icult-1.282
stanbul-1.256
Token the
Token ID262
Feature activation+9.24e-01

Pos logit contributions
gemony+0.180
 Ally+0.168
 casinos+0.156
 Indra+0.156
ours+0.155
Neg logit contributions
rador-0.183
MRI-0.178
rament-0.164
icult-0.158
stanbul-0.156
Token experiment
Token ID6306
Feature activation+1.25e+00

Pos logit contributions
gemony+0.664
 Ally+0.618
 casinos+0.570
 Indra+0.569
ours+0.565
Neg logit contributions
rador-0.772
MRI-0.753
rament-0.698
icult-0.676
stanbul-0.665
Token of
Token ID286
Feature activation+4.49e-01

Pos logit contributions
gemony+1.000
 Ally+0.941
 casinos+0.876
 Indra+0.873
 sophistication+0.869
Neg logit contributions
rador-0.947
MRI-0.924
rament-0.846
icult-0.816
stanbul-0.801
Token the
Token ID262
Feature activation+5.58e-01

Pos logit contributions
gemony+0.343
 Ally+0.321
 casinos+0.298
 Indra+0.297
ours+0.296
Neg logit contributions
rador-0.354
MRI-0.344
rament-0.317
icult-0.307
stanbul-0.302
Token Arab
Token ID4498
Feature activation+6.06e+00

Pos logit contributions
gemony+0.409
 Ally+0.382
 Indra+0.353
 casinos+0.353
ours+0.351
Neg logit contributions
rador-0.456
MRI-0.445
rament-0.411
icult-0.398
stanbul-0.392
Token Spring
Token ID8225
Feature activation+2.37e+00

Pos logit contributions
gemony+4.339
 Ally+4.144
 casinos+3.765
 sophistication+3.720
 Indra+3.667
Neg logit contributions
rador-4.971
MRI-4.926
rament-4.545
icult-4.359
stanbul-4.296
Token,
Token ID11
Feature activation+9.23e-01

Pos logit contributions
gemony+1.903
 Ally+1.795
 casinos+1.672
 Indra+1.662
 sophistication+1.657
Neg logit contributions
rador-1.778
MRI-1.744
rament-1.593
icult-1.542
stanbul-1.502
Token which
Token ID543
Feature activation+1.54e+00

Pos logit contributions
gemony+0.715
 Ally+0.672
 casinos+0.624
 Indra+0.622
 sophistication+0.618
Neg logit contributions
rador-0.717
MRI-0.701
rament-0.643
icult-0.621
stanbul-0.611
Token has
Token ID468
Feature activation+9.47e-01

Pos logit contributions
gemony+1.238
 Ally+1.168
 casinos+1.087
 Indra+1.083
 sophistication+1.077
Neg logit contributions
rador-1.153
MRI-1.126
rament-1.030
icult-0.994
stanbul-0.977
Token affected
Token ID5676
Feature activation-6.65e-01

Pos logit contributions
gemony+0.756
 Ally+0.712
 casinos+0.662
 Indra+0.659
 sophistication+0.656
Neg logit contributions
rador-0.714
MRI-0.696
rament-0.637
icult-0.615
stanbul-0.604
Token student
Token ID3710
Feature activation+1.74e+00

Pos logit contributions
gemony+2.994
 Ally+2.927
 Indra+2.653
 casinos+2.643
 Polo+2.615
Neg logit contributions
rador-3.199
MRI-3.060
rament-2.855
icult-2.749
stanbul-2.697
Token accusing
Token ID19786
Feature activation+4.70e+00

Pos logit contributions
gemony+1.412
 Ally+1.351
 casinos+1.253
 Indra+1.251
ours+1.236
Neg logit contributions
rador-1.284
MRI-1.248
rament-1.142
icult-1.107
stanbul-1.082
Token him
Token ID683
Feature activation+1.06e+00

Pos logit contributions
gemony+3.349
 Ally+3.217
 Indra+2.942
 casinos+2.942
 sophistication+2.904
Neg logit contributions
rador-3.929
MRI-3.789
rament-3.523
icult-3.420
iscopal-3.333
Token of
Token ID286
Feature activation+1.15e+00

Pos logit contributions
gemony+0.854
 Ally+0.807
 casinos+0.751
 Indra+0.749
 sophistication+0.744
Neg logit contributions
rador-0.788
MRI-0.770
rament-0.704
icult-0.679
stanbul-0.666
Token sexual
Token ID3206
Feature activation+8.34e+00

Pos logit contributions
gemony+0.751
 Ally+0.697
 casinos+0.637
 Indra+0.635
 sophistication+0.630
Neg logit contributions
rador-1.039
MRI-1.020
rament-0.948
icult-0.921
stanbul-0.907
Token assault
Token ID4641
Feature activation+7.12e+00

Pos logit contributions
gemony+4.124
 Ally+3.808
 sophistication+3.496
 casinos+3.391
 Indra+3.277
Neg logit contributions
rador-8.712
MRI-8.503
rament-7.964
stanbul-7.850
icult-7.849
Token.
Token ID13
Feature activation-1.11e+00

Pos logit contributions
gemony+5.305
 Ally+5.142
 Indra+4.753
 casinos+4.752
 sophistication+4.671
Neg logit contributions
rador-5.500
MRI-5.367
rament-4.903
icult-4.845
stanbul-4.687
Token
Token ID198
Feature activation-4.45e-01

Pos logit contributions
rador+0.674
MRI+0.665
rament+0.590
icult+0.561
iscopal+0.553
Neg logit contributions
gemony-1.041
 Ally-0.981
 casinos-0.938
 sophistication-0.930
ours-0.923
Token
Token ID198
Feature activation-3.81e-01

Pos logit contributions
rador+0.259
MRI+0.255
rament+0.225
icult+0.214
stanbul+0.209
Neg logit contributions
gemony-0.431
 Ally-0.408
 casinos-0.388
 Indra-0.385
 sophistication-0.384
TokenThe
Token ID464
Feature activation-9.02e-01

Pos logit contributions
rador+0.244
MRI+0.241
rament+0.214
icult+0.205
iscopal+0.201
Neg logit contributions
gemony-0.345
 Ally-0.328
 casinos-0.310
 Indra-0.308
 sophistication-0.307
Token 6
Token ID718
Feature activation+1.06e-01

Pos logit contributions
rador+0.661
MRI+0.647
rament+0.593
icult+0.571
stanbul+0.561
Neg logit contributions
gemony-0.736
 Ally-0.687
 casinos-0.643
 Indra-0.638
 sophistication-0.638
Token companies
Token ID2706
Feature activation+8.29e-01

Pos logit contributions
gemony+0.671
 Ally+0.628
 casinos+0.578
 Indra+0.576
 sophistication+0.572
Neg logit contributions
rador-0.809
MRI-0.793
rament-0.734
icult-0.711
stanbul-0.699
Token are
Token ID389
Feature activation+4.18e-02

Pos logit contributions
gemony+0.695
 Ally+0.657
 casinos+0.615
 Indra+0.612
 sophistication+0.609
Neg logit contributions
rador-0.592
MRI-0.578
rament-0.526
icult-0.506
stanbul-0.496
Token structured
Token ID20793
Feature activation-7.74e-01

Pos logit contributions
gemony+0.034
 Ally+0.032
 casinos+0.030
 Indra+0.030
 sophistication+0.030
Neg logit contributions
rador-0.030
MRI-0.030
rament-0.027
icult-0.026
stanbul-0.026
Token,
Token ID11
Feature activation+3.27e-01

Pos logit contributions
rador+0.566
MRI+0.550
rament+0.507
icult+0.487
stanbul+0.474
Neg logit contributions
gemony-0.635
 Ally-0.594
 casinos-0.554
 Indra-0.553
 sophistication-0.550
Token the
Token ID262
Feature activation+1.06e+00

Pos logit contributions
gemony+0.244
 Ally+0.228
 casinos+0.210
 Indra+0.210
ours+0.209
Neg logit contributions
rador-0.264
MRI-0.258
rament-0.238
icult-0.231
stanbul-0.226
Token way
Token ID835
Feature activation+6.72e+00

Pos logit contributions
gemony+0.618
 Ally+0.568
 casinos+0.511
 Indra+0.511
ours+0.505
Neg logit contributions
rador-1.020
MRI-1.000
rament-0.936
icult-0.913
stanbul-0.899
Token that
Token ID326
Feature activation+2.47e+00

Pos logit contributions
gemony+4.440
 Ally+4.185
 casinos+3.889
 sophistication+3.833
 Indra+3.832
Neg logit contributions
rador-5.915
MRI-5.837
rament-5.368
icult-5.172
iscopal-5.135
Token the
Token ID262
Feature activation+7.28e-01

Pos logit contributions
gemony+1.721
 Ally+1.615
 casinos+1.487
 Indra+1.477
 sophistication+1.470
Neg logit contributions
rador-2.118
MRI-2.080
rament-1.922
icult-1.862
iscopal-1.833
Token staff
Token ID3085
Feature activation+3.28e+00

Pos logit contributions
gemony+0.539
 Ally+0.505
 casinos+0.467
 Indra+0.466
 sophistication+0.462
Neg logit contributions
rador-0.589
MRI-0.576
rament-0.532
icult-0.515
stanbul-0.506
Token are
Token ID389
Feature activation-1.80e-01

Pos logit contributions
gemony+2.676
 Ally+2.516
 casinos+2.359
ours+2.357
 sophistication+2.354
Neg logit contributions
rador-2.384
MRI-2.348
rament-2.144
icult-2.057
iscopal-2.029
Token handled
Token ID12118
Feature activation+6.61e-01

Pos logit contributions
rador+0.135
MRI+0.132
rament+0.121
icult+0.117
stanbul+0.115
Neg logit contributions
gemony-0.144
 Ally-0.135
 casinos-0.126
 Indra-0.125
 sophistication-0.125
Token try
Token ID1949
Feature activation-8.80e-02

Pos logit contributions
rador+0.461
MRI+0.451
rament+0.413
icult+0.400
stanbul+0.391
Neg logit contributions
gemony-0.481
 Ally-0.449
 Indra-0.418
 casinos-0.416
ours-0.415
Token incorporating
Token ID29927
Feature activation+3.33e+00

Pos logit contributions
rador+0.068
MRI+0.066
rament+0.061
icult+0.059
stanbul+0.057
Neg logit contributions
gemony-0.069
 Ally-0.064
 Indra-0.060
 casinos-0.059
ours-0.059
Token the
Token ID262
Feature activation+1.29e-01

Pos logit contributions
gemony+2.372
 Ally+2.253
 casinos+2.075
 sophistication+2.060
 Indra+2.051
Neg logit contributions
rador-2.776
MRI-2.716
rament-2.513
icult-2.434
stanbul-2.395
Token following
Token ID1708
Feature activation+1.64e+00

Pos logit contributions
gemony+0.096
 Ally+0.089
 Indra+0.083
 casinos+0.082
ours+0.082
Neg logit contributions
rador-0.105
MRI-0.103
rament-0.095
icult-0.092
stanbul-0.090
Token gimmick
Token ID46029
Feature activation+1.01e+00

Pos logit contributions
gemony+1.239
 Ally+1.167
 casinos+1.082
 Indra+1.076
 sophistication+1.072
Neg logit contributions
rador-1.305
MRI-1.276
rament-1.175
icult-1.137
stanbul-1.116
Token (
Token ID357
Feature activation+2.17e+00

Pos logit contributions
gemony+0.774
 Ally+0.725
 casinos+0.672
 Indra+0.672
 sophistication+0.665
Neg logit contributions
rador-0.796
MRI-0.780
rament-0.717
icult-0.693
stanbul-0.680
TokenSt
Token ID1273
Feature activation+1.36e+00

Pos logit contributions
gemony+1.940
 Ally+1.842
 casinos+1.728
 Indra+1.722
 sophistication+1.716
Neg logit contributions
rador-1.427
MRI-1.388
rament-1.256
icult-1.205
stanbul-1.180
Tokenolen
Token ID8622
Feature activation+3.83e+00

Pos logit contributions
gemony+1.129
 Ally+1.066
 casinos+0.995
 Indra+0.992
 sophistication+0.987
Neg logit contributions
rador-0.989
MRI-0.966
rament-0.881
icult-0.849
stanbul-0.832
Token from
Token ID422
Feature activation+2.14e+00

Pos logit contributions
gemony+2.895
 Ally+2.782
 casinos+2.538
 Indra+2.523
 sophistication+2.514
Neg logit contributions
rador-3.017
MRI-2.956
rament-2.721
icult-2.627
iscopal-2.595
Token the
Token ID262
Feature activation+6.42e-01

Pos logit contributions
gemony+1.574
 Ally+1.484
 casinos+1.370
 Indra+1.364
 sophistication+1.355
Neg logit contributions
rador-1.747
MRI-1.710
rament-1.579
icult-1.527
iscopal-1.501
Token Horror
Token ID22732
Feature activation+1.09e+00

Pos logit contributions
gemony+0.456
 Ally+0.424
 casinos+0.391
 Indra+0.391
ours+0.388
Neg logit contributions
rador-0.541
MRI-0.530
rament-0.490
icult-0.475
stanbul-0.467
Token the
Token ID262
Feature activation-1.31e-01

Pos logit contributions
gemony+0.024
 Ally+0.023
 casinos+0.021
 Indra+0.021
 sophistication+0.021
Neg logit contributions
rador-0.023
MRI-0.023
rament-0.021
icult-0.020
stanbul-0.020
Token cover
Token ID3002
Feature activation-2.31e+00

Pos logit contributions
rador+0.103
MRI+0.101
rament+0.093
icult+0.090
stanbul+0.088
Neg logit contributions
gemony-0.101
 Ally-0.094
 casinos-0.087
 Indra-0.087
ours-0.087
Token of
Token ID286
Feature activation-5.60e-01

Pos logit contributions
rador+1.618
MRI+1.561
rament+1.430
icult+1.380
stanbul+1.359
Neg logit contributions
gemony-1.946
 Ally-1.835
 casinos-1.717
 Indra-1.704
 sophistication-1.701
Tokenmag
Token ID19726
Feature activation-1.69e+00

Pos logit contributions
rador+0.473
MRI+0.461
rament+0.428
icult+0.415
stanbul+0.408
Neg logit contributions
gemony-0.395
 Ally-0.365
 casinos-0.338
 Indra-0.336
 sophistication-0.334
Tokenazine
Token ID4994
Feature activation-3.07e+00

Pos logit contributions
rador+0.830
MRI+0.787
rament+0.691
icult+0.650
iscopal+0.642
Neg logit contributions
gemony-1.798
 Ally-1.712
 casinos-1.634
 Indra-1.630
 sophistication-1.626
Token (
Token ID357
Feature activation+7.64e-01

Pos logit contributions
rador+2.038
MRI+1.969
rament+1.779
icult+1.709
stanbul+1.690
Neg logit contributions
gemony-2.707
 Ally-2.551
 casinos-2.401
 sophistication-2.391
 Indra-2.390
Token31
Token ID3132
Feature activation-3.91e+00

Pos logit contributions
gemony+0.741
 Ally+0.706
 Indra+0.666
 casinos+0.665
ours+0.661
Neg logit contributions
rador-0.445
MRI-0.430
rament-0.385
icult-0.367
stanbul-0.356
Token July
Token ID2901
Feature activation-1.65e+00

Pos logit contributions
MRI+2.357
rador+2.326
stanbul+2.020
rament+2.018
icult+1.954
Neg logit contributions
gemony-3.656
 Ally-3.431
 Polo-3.216
 sophistication-3.211
 casinos-3.209
Token 1944
Token ID16994
Feature activation-2.26e+00

Pos logit contributions
rador+0.892
MRI+0.871
rament+0.760
icult+0.725
stanbul+0.706
Neg logit contributions
gemony-1.664
 Ally-1.572
 casinos-1.503
 sophistication-1.485
 Indra-1.483
Token)
Token ID8
Feature activation-1.03e+00

Pos logit contributions
rador+1.217
MRI+1.186
rament+1.043
icult+0.990
stanbul+0.966
Neg logit contributions
gemony-2.277
 Ally-2.155
 casinos-2.054
 sophistication-2.041
 Indra-2.031
Token
Token ID198
Feature activation-1.20e+00

Pos logit contributions
rador+0.685
MRI+0.668
rament+0.604
icult+0.581
stanbul+0.570
Neg logit contributions
gemony-0.912
 Ally-0.859
 casinos-0.809
 Indra-0.804
 sophistication-0.803
Token you
Token ID345
Feature activation-2.01e+00

Pos logit contributions
rador+0.628
MRI+0.614
rament+0.565
icult+0.547
stanbul+0.537
Neg logit contributions
gemony-0.593
 Ally-0.555
 casinos-0.515
 Indra-0.514
 sophistication-0.510
Token
Token ID447
Feature activation-2.32e-01

Pos logit contributions
rador+1.441
MRI+1.406
rament+1.281
icult+1.233
stanbul+1.215
Neg logit contributions
gemony-1.683
 Ally-1.589
 casinos-1.485
 Indra-1.482
 sophistication-1.476
Token
Token ID247
Feature activation-2.99e-01

Pos logit contributions
rador+0.170
MRI+0.166
rament+0.151
icult+0.145
stanbul+0.143
Neg logit contributions
gemony-0.191
 Ally-0.180
 casinos-0.169
 Indra-0.168
 sophistication-0.167
Tokenre
Token ID260
Feature activation-1.91e-01

Pos logit contributions
rador+0.191
MRI+0.186
rament+0.168
icult+0.160
stanbul+0.158
Neg logit contributions
gemony-0.273
 Ally-0.259
 casinos-0.245
 Indra-0.244
 sophistication-0.243
Token still
Token ID991
Feature activation-8.32e-01

Pos logit contributions
rador+0.140
MRI+0.137
rament+0.125
icult+0.121
stanbul+0.119
Neg logit contributions
gemony-0.155
 Ally-0.147
 casinos-0.137
 Indra-0.136
 sophistication-0.136
Token unable
Token ID5906
Feature activation+2.00e+00

Pos logit contributions
rador+0.591
MRI+0.575
rament+0.525
icult+0.507
stanbul+0.497
Neg logit contributions
gemony-0.704
 Ally-0.659
 casinos-0.616
 Indra-0.615
 sophistication-0.612
Token to
Token ID284
Feature activation-1.43e+00

Pos logit contributions
gemony+1.363
 Ally+1.307
 casinos+1.196
 Indra+1.181
 sophistication+1.175
Neg logit contributions
rador-1.698
MRI-1.674
rament-1.559
icult-1.497
stanbul-1.481
Token find
Token ID1064
Feature activation-1.55e+00

Pos logit contributions
rador+1.097
MRI+1.075
rament+0.985
icult+0.952
stanbul+0.936
Neg logit contributions
gemony-1.129
 Ally-1.051
 Indra-0.978
 casinos-0.977
 sophistication-0.970
Token issue
Token ID2071
Feature activation-2.23e+00

Pos logit contributions
rador+1.201
MRI+1.175
rament+1.082
icult+1.045
stanbul+1.021
Neg logit contributions
gemony-1.202
 Ally-1.106
 casinos-1.037
 Indra-1.034
 sophistication-1.027
Token 40
Token ID2319
Feature activation-6.63e-02

Pos logit contributions
rador+1.502
MRI+1.464
rament+1.321
icult+1.283
stanbul+1.257
Neg logit contributions
gemony-1.959
 Ally-1.816
 casinos-1.708
 Indra-1.705
 sophistication-1.688
Token near
Token ID1474
Feature activation+1.76e+00

Pos logit contributions
rador+0.049
MRI+0.048
rament+0.044
icult+0.043
stanbul+0.042
Neg logit contributions
gemony-0.053
 Ally-0.050
 casinos-0.047
 Indra-0.047
 sophistication-0.047
Token the
Token ID262
Feature activation-2.35e-01

Pos logit contributions
rador+3.491
MRI+3.410
rament+3.172
stanbul+3.067
icult+3.055
Neg logit contributions
gemony-3.807
 Ally-3.382
 casinos-3.232
ours-3.201
 sophistication-3.199
Token difficult
Token ID2408
Feature activation+9.83e-01

Pos logit contributions
rador+0.189
MRI+0.185
rament+0.171
icult+0.165
stanbul+0.162
Neg logit contributions
gemony-0.175
 Ally-0.163
 casinos-0.151
 Indra-0.151
 sophistication-0.150
Token times
Token ID1661
Feature activation+3.01e+00

Pos logit contributions
gemony+0.698
 Ally+0.654
 casinos+0.603
 Indra+0.600
 sophistication+0.597
Neg logit contributions
rador-0.826
MRI-0.810
rament-0.749
icult-0.725
stanbul-0.713
Token,
Token ID11
Feature activation+6.29e-01

Pos logit contributions
gemony+2.173
 Ally+2.076
 Indra+1.908
ours+1.902
 casinos+1.901
Neg logit contributions
rador-2.475
MRI-2.430
rament-2.222
icult-2.154
iscopal-2.128
Token the
Token ID262
Feature activation+7.59e-01

Pos logit contributions
gemony+0.474
 Ally+0.447
 casinos+0.413
 Indra+0.412
 sophistication+0.410
Neg logit contributions
rador-0.501
MRI-0.491
rament-0.452
icult-0.436
stanbul-0.429
Token difficult
Token ID2408
Feature activation+1.48e+00

Pos logit contributions
gemony+0.560
 Ally+0.527
 casinos+0.488
 Indra+0.485
 sophistication+0.483
Neg logit contributions
rador-0.618
MRI-0.605
rament-0.557
icult-0.539
stanbul-0.530
Token conversations
Token ID10275
Feature activation+3.04e+00

Pos logit contributions
gemony+0.993
 Ally+0.938
 casinos+0.861
 sophistication+0.853
ours+0.852
Neg logit contributions
rador-1.287
MRI-1.264
rament-1.170
icult-1.128
iscopal-1.119
Token we
Token ID356
Feature activation+9.85e-02

Pos logit contributions
gemony+2.279
 Ally+2.157
 Indra+1.991
 casinos+1.989
ours+1.975
Neg logit contributions
rador-2.424
MRI-2.379
rament-2.178
icult-2.104
iscopal-2.076
Token had
Token ID550
Feature activation+2.31e+00

Pos logit contributions
gemony+0.079
 Ally+0.075
 casinos+0.070
 sophistication+0.070
 Indra+0.070
Neg logit contributions
rador-0.073
MRI-0.071
rament-0.066
icult-0.063
stanbul-0.062
Token at
Token ID379
Feature activation+2.12e+00

Pos logit contributions
gemony+1.841
 Ally+1.740
 casinos+1.618
 Indra+1.614
 sophistication+1.603
Neg logit contributions
rador-1.734
MRI-1.697
rament-1.550
icult-1.498
stanbul-1.467
Token the
Token ID262
Feature activation+1.20e+00

Pos logit contributions
gemony+1.611
 Ally+1.522
 casinos+1.409
 Indra+1.402
 sophistication+1.394
Neg logit contributions
rador-1.673
MRI-1.638
rament-1.503
icult-1.455
stanbul-1.430
Token professional
Token ID4708
Feature activation+1.39e+00

Pos logit contributions
gemony+2.041
 Ally+1.932
 casinos+1.787
 sophistication+1.769
 Indra+1.766
Neg logit contributions
rador-2.479
MRI-2.431
rament-2.254
icult-2.175
iscopal-2.147
Token in
Token ID287
Feature activation-9.90e-02

Pos logit contributions
gemony+1.154
 Ally+1.094
 casinos+1.024
 Indra+1.017
 sophistication+1.014
Neg logit contributions
rador-0.997
MRI-0.974
rament-0.884
icult-0.853
stanbul-0.837
Token terms
Token ID2846
Feature activation+1.33e+00

Pos logit contributions
rador+0.079
MRI+0.077
rament+0.071
icult+0.068
stanbul+0.067
Neg logit contributions
gemony-0.075
 Ally-0.070
 Indra-0.065
 casinos-0.065
ours-0.064
Token of
Token ID286
Feature activation+2.05e-01

Pos logit contributions
gemony+1.082
 Ally+1.021
 casinos+0.952
 Indra+0.949
 sophistication+0.944
Neg logit contributions
rador-0.975
MRI-0.954
rament-0.871
icult-0.840
stanbul-0.823
Token the
Token ID262
Feature activation+3.09e-01

Pos logit contributions
gemony+0.153
 Ally+0.144
 Indra+0.133
 casinos+0.133
ours+0.132
Neg logit contributions
rador-0.164
MRI-0.160
rament-0.148
icult-0.143
stanbul-0.140
Token players
Token ID1938
Feature activation+1.81e+00

Pos logit contributions
gemony+0.227
 Ally+0.212
 Indra+0.196
 casinos+0.196
ours+0.194
Neg logit contributions
rador-0.252
MRI-0.247
rament-0.228
icult-0.220
stanbul-0.216
Token.
Token ID13
Feature activation-3.21e-01

Pos logit contributions
gemony+1.503
 Ally+1.425
 casinos+1.332
 Indra+1.322
 sophistication+1.320
Neg logit contributions
rador-1.308
MRI-1.277
rament-1.161
icult-1.122
iscopal-1.098
Token It
Token ID632
Feature activation+1.89e+00

Pos logit contributions
rador+0.212
MRI+0.208
rament+0.188
icult+0.180
stanbul+0.177
Neg logit contributions
gemony-0.285
 Ally-0.268
 casinos-0.253
 sophistication-0.251
 Indra-0.251
Token's
Token ID338
Feature activation+1.55e+00

Pos logit contributions
gemony+1.418
 Ally+1.332
 casinos+1.233
 Indra+1.229
 sophistication+1.223
Neg logit contributions
rador-1.510
MRI-1.477
rament-1.359
icult-1.317
stanbul-1.293
Token time
Token ID640
Feature activation+1.17e+00

Pos logit contributions
gemony+1.081
 Ally+1.015
 casinos+0.936
 Indra+0.930
 sophistication+0.926
Neg logit contributions
rador-1.328
MRI-1.301
rament-1.204
icult-1.167
iscopal-1.149
Token to
Token ID284
Feature activation-2.74e-01

Pos logit contributions
gemony+0.888
 Ally+0.838
 casinos+0.777
 Indra+0.773
 sophistication+0.768
Neg logit contributions
rador-0.929
MRI-0.910
rament-0.835
icult-0.808
stanbul-0.794
Token number
Token ID1271
Feature activation+1.83e+00

Pos logit contributions
gemony+0.958
 Ally+0.903
 casinos+0.844
 Indra+0.841
ours+0.836
Neg logit contributions
rador-0.820
MRI-0.802
rament-0.730
icult-0.703
stanbul-0.690
Token because
Token ID780
Feature activation-7.74e-01

Pos logit contributions
gemony+1.485
 Ally+1.401
 casinos+1.306
 Indra+1.301
 sophistication+1.294
Neg logit contributions
rador-1.350
MRI-1.320
rament-1.206
icult-1.163
stanbul-1.141
Token they
Token ID484
Feature activation-1.53e+00

Pos logit contributions
rador+0.602
MRI+0.589
rament+0.540
icult+0.522
stanbul+0.516
Neg logit contributions
gemony-0.601
 Ally-0.560
 casinos-0.516
 Indra-0.515
 sophistication-0.512
Token can
Token ID460
Feature activation+8.81e-01

Pos logit contributions
rador+1.093
MRI+1.070
rament+0.974
icult+0.934
stanbul+0.928
Neg logit contributions
gemony-1.284
 Ally-1.212
 casinos-1.130
 sophistication-1.125
 Indra-1.125
Token generate
Token ID7716
Feature activation-8.38e-01

Pos logit contributions
gemony+0.667
 Ally+0.626
 casinos+0.580
 Indra+0.577
 sophistication+0.574
Neg logit contributions
rador-0.702
MRI-0.687
rament-0.632
icult-0.611
stanbul-0.601
Token their
Token ID511
Feature activation-1.27e+00

Pos logit contributions
rador+0.677
MRI+0.668
rament+0.613
icult+0.591
stanbul+0.584
Neg logit contributions
gemony-0.625
 Ally-0.578
 casinos-0.530
 Indra-0.527
ours-0.525
Token own
Token ID898
Feature activation+7.28e-01

Pos logit contributions
rador+0.960
MRI+0.950
rament+0.862
icult+0.833
stanbul+0.824
Neg logit contributions
gemony-1.007
 Ally-0.936
 casinos-0.869
 Indra-0.863
ours-0.856
Token code
Token ID2438
Feature activation+1.88e+00

Pos logit contributions
gemony+0.532
 Ally+0.497
 casinos+0.458
 Indra+0.456
 sophistication+0.453
Neg logit contributions
rador-0.598
MRI-0.587
rament-0.541
icult-0.523
stanbul-0.516
Token.
Token ID13
Feature activation+9.50e-02

Pos logit contributions
gemony+1.473
 Ally+1.387
 casinos+1.291
 Indra+1.287
 sophistication+1.280
Neg logit contributions
rador-1.439
MRI-1.406
rament-1.290
icult-1.247
iscopal-1.223
Token So
Token ID1406
Feature activation+2.07e+00

Pos logit contributions
gemony+0.080
 Ally+0.076
 casinos+0.071
 Indra+0.070
 sophistication+0.070
Neg logit contributions
rador-0.067
MRI-0.066
rament-0.059
icult-0.057
stanbul-0.056
Token let
Token ID1309
Feature activation-1.61e+00

Pos logit contributions
gemony+1.566
 Ally+1.473
 casinos+1.366
 Indra+1.361
 sophistication+1.353
Neg logit contributions
rador-1.638
MRI-1.605
rament-1.475
icult-1.426
stanbul-1.400
Token In
Token ID554
Feature activation+1.10e+00

Pos logit contributions
gemony+0.208
 Ally+0.196
 casinos+0.184
 Indra+0.183
 sophistication+0.182
Neg logit contributions
rador-0.180
MRI-0.176
rament-0.160
icult-0.154
stanbul-0.151
Token OC
Token ID24775
Feature activation+2.32e+00

Pos logit contributions
gemony+0.793
 Ally+0.741
 casinos+0.684
 Indra+0.682
 sophistication+0.677
Neg logit contributions
rador-0.912
MRI-0.893
rament-0.825
icult-0.798
stanbul-0.785
Token M
Token ID337
Feature activation-7.25e-01

Pos logit contributions
gemony+1.711
 Ally+1.623
 casinos+1.503
 Indra+1.497
 sophistication+1.487
Neg logit contributions
rador-1.868
MRI-1.828
rament-1.686
icult-1.634
iscopal-1.603
TokenODE
Token ID16820
Feature activation-1.05e+00

Pos logit contributions
rador+0.405
MRI+0.395
rament+0.350
icult+0.332
stanbul+0.323
Neg logit contributions
gemony-0.719
 Ally-0.684
 casinos-0.648
 Indra-0.645
 sophistication-0.642
Token the
Token ID262
Feature activation+5.79e-01

Pos logit contributions
rador+0.808
MRI+0.792
rament+0.725
icult+0.699
stanbul+0.688
Neg logit contributions
gemony-0.825
 Ally-0.767
 Indra-0.712
 casinos-0.711
 sophistication-0.707
Token core
Token ID4755
Feature activation-5.63e-01

Pos logit contributions
gemony+0.416
 Ally+0.388
 casinos+0.358
 Indra+0.357
 sophistication+0.355
Neg logit contributions
rador-0.483
MRI-0.473
rament-0.437
icult-0.423
stanbul-0.417
Token increases
Token ID5732
Feature activation+1.47e+00

Pos logit contributions
rador+0.487
MRI+0.478
rament+0.443
icult+0.428
stanbul+0.424
Neg logit contributions
gemony-0.389
 Ally-0.360
 casinos-0.331
 Indra-0.330
ours-0.326
Token to
Token ID284
Feature activation+2.36e-01

Pos logit contributions
gemony+1.063
 Ally+0.997
 casinos+0.920
 Indra+0.917
 sophistication+0.912
Neg logit contributions
rador-1.216
MRI-1.191
rament-1.100
icult-1.065
stanbul-1.047
Token 1
Token ID352
Feature activation+1.79e+00

Pos logit contributions
gemony+0.183
 Ally+0.172
 casinos+0.160
 Indra+0.159
 sophistication+0.158
Neg logit contributions
rador-0.184
MRI-0.180
rament-0.165
icult-0.160
stanbul-0.156
Token,
Token ID11
Feature activation+1.56e+00

Pos logit contributions
gemony+1.431
 Ally+1.364
 Indra+1.265
 casinos+1.261
 sophistication+1.255
Neg logit contributions
rador-1.339
MRI-1.304
rament-1.196
icult-1.159
iscopal-1.130
Token709
Token ID31495
Feature activation+2.25e+00

Pos logit contributions
gemony+1.396
 Ally+1.326
 casinos+1.244
 Indra+1.242
 sophistication+1.234
Neg logit contributions
rador-1.017
MRI-0.990
rament-0.894
icult-0.858
stanbul-0.837
Token,
Token ID11
Feature activation-1.18e+00

Pos logit contributions
rador+0.937
MRI+0.916
rament+0.836
icult+0.806
stanbul+0.790
Neg logit contributions
gemony-1.035
 Ally-0.974
 casinos-0.908
 Indra-0.905
 sophistication-0.900
Token a
Token ID257
Feature activation-2.11e-01

Pos logit contributions
rador+0.876
MRI+0.857
rament+0.781
icult+0.753
stanbul+0.740
Neg logit contributions
gemony-0.950
 Ally-0.892
 casinos-0.829
 Indra-0.829
ours-0.824
Token TV
Token ID3195
Feature activation-6.10e+00

Pos logit contributions
rador+0.160
MRI+0.156
rament+0.143
icult+0.138
stanbul+0.136
Neg logit contributions
gemony-0.168
 Ally-0.159
 casinos-0.148
 Indra-0.147
 sophistication-0.146
Token,
Token ID11
Feature activation-9.53e-01

Pos logit contributions
rador+4.106
MRI+3.981
rament+3.556
icult+3.400
iscopal+3.357
Neg logit contributions
gemony-5.314
 Ally-4.960
 casinos-4.658
 Indra-4.640
ours-4.620
Token a
Token ID257
Feature activation-1.75e-01

Pos logit contributions
rador+0.744
MRI+0.728
rament+0.665
icult+0.641
stanbul+0.632
Neg logit contributions
gemony-0.736
 Ally-0.689
 Indra-0.638
 casinos-0.636
ours-0.633
Token computer
Token ID3644
Feature activation-4.20e+00

Pos logit contributions
rador+0.129
MRI+0.126
rament+0.115
icult+0.111
stanbul+0.109
Neg logit contributions
gemony-0.143
 Ally-0.135
 casinos-0.126
 Indra-0.125
 sophistication-0.125
Token or
Token ID393
Feature activation-3.54e-02

Pos logit contributions
rador+3.171
MRI+3.096
rament+2.814
icult+2.718
iscopal+2.663
Neg logit contributions
gemony-3.333
 Ally-3.105
ours-2.893
 casinos-2.888
 Indra-2.882
Token laptop
Token ID13224
Feature activation-5.47e+00

Pos logit contributions
rador+0.029
MRI+0.029
rament+0.026
icult+0.026
stanbul+0.025
Neg logit contributions
gemony-0.026
 Ally-0.024
 casinos-0.022
 Indra-0.022
 sophistication-0.022
Token,
Token ID11
Feature activation-5.14e-01

Pos logit contributions
rador+3.394
MRI+3.291
rament+2.933
icult+2.766
iscopal+2.710
Neg logit contributions
gemony-5.045
 Ally-4.711
ours-4.448
 Indra-4.446
 casinos-4.434
Token a
Token ID257
Feature activation+1.83e-01

Pos logit contributions
rador+0.405
MRI+0.397
rament+0.364
icult+0.352
stanbul+0.346
Neg logit contributions
gemony-0.392
 Ally-0.368
 casinos-0.340
 Indra-0.339
ours-0.337
Token cooler
Token ID19346
Feature activation-1.25e+00

Pos logit contributions
gemony+0.144
 Ally+0.135
 casinos+0.126
 Indra+0.125
 sophistication+0.125
Neg logit contributions
rador-0.141
MRI-0.137
rament-0.126
icult-0.122
stanbul-0.120
Token Par
Token ID2547
Feature activation-1.34e-01

Pos logit contributions
gemony+3.302
 Ally+3.167
 casinos+2.920
 sophistication+2.920
 Indra+2.916
Neg logit contributions
rador-3.575
MRI-3.496
rament-3.207
icult-3.103
iscopal-3.077
Tokenth
Token ID400
Feature activation+1.96e+00

Pos logit contributions
rador+0.101
MRI+0.099
rament+0.091
icult+0.088
iscopal+0.086
Neg logit contributions
gemony-0.107
 Ally-0.101
 casinos-0.094
 Indra-0.094
 sophistication-0.093
Tokeniv
Token ID452
Feature activation-1.05e+00

Pos logit contributions
gemony+1.756
 Ally+1.667
 casinos+1.564
 Indra+1.559
 sophistication+1.552
Neg logit contributions
rador-1.290
MRI-1.257
rament-1.135
icult-1.089
stanbul-1.064
Token
Token ID447
Feature activation-9.72e-01

Pos logit contributions
rador+0.793
MRI+0.767
rament+0.717
icult+0.687
stanbul+0.673
Neg logit contributions
gemony-0.824
 Ally-0.775
 casinos-0.720
 Indra-0.717
 sophistication-0.716
Token
Token ID247
Feature activation-7.08e-01

Pos logit contributions
rador+0.697
MRI+0.677
rament+0.616
icult+0.596
stanbul+0.588
Neg logit contributions
gemony-0.804
 Ally-0.763
 casinos-0.717
 Indra-0.711
 sophistication-0.709
Tokens
Token ID82
Feature activation+5.94e-01

Pos logit contributions
rador+0.534
MRI+0.523
rament+0.479
icult+0.461
stanbul+0.456
Neg logit contributions
gemony-0.563
 Ally-0.530
 casinos-0.496
 sophistication-0.490
 Indra-0.490
Token script
Token ID4226
Feature activation-1.19e-01

Pos logit contributions
gemony+0.438
 Ally+0.410
 casinos+0.379
 Indra+0.377
 sophistication+0.375
Neg logit contributions
rador-0.484
MRI-0.474
rament-0.438
icult-0.424
stanbul-0.416
Token and
Token ID290
Feature activation-1.78e-01

Pos logit contributions
rador+0.087
MRI+0.085
rament+0.077
icult+0.075
stanbul+0.073
Neg logit contributions
gemony-0.098
 Ally-0.093
 casinos-0.087
 Indra-0.086
 sophistication-0.086
Token telling
Token ID5149
Feature activation-1.75e-01

Pos logit contributions
rador+0.137
MRI+0.134
rament+0.123
icult+0.119
stanbul+0.117
Neg logit contributions
gemony-0.139
 Ally-0.130
 casinos-0.121
 Indra-0.121
 sophistication-0.120
Token them
Token ID606
Feature activation-1.64e+00

Pos logit contributions
rador+0.140
MRI+0.137
rament+0.126
icult+0.122
stanbul+0.120
Neg logit contributions
gemony-0.132
 Ally-0.123
 casinos-0.114
 Indra-0.113
 sophistication-0.113
Token that
Token ID326
Feature activation+1.08e+00

Pos logit contributions
rador+1.223
MRI+1.190
rament+1.096
icult+1.054
stanbul+1.036
Neg logit contributions
gemony-1.321
 Ally-1.234
 casinos-1.153
 sophistication-1.142
 Indra-1.140
Token actual
Token ID4036
Feature activation+2.51e+00

Pos logit contributions
gemony+0.148
 Ally+0.139
 casinos+0.129
 Indra+0.129
 sophistication+0.128
Neg logit contributions
rador-0.154
MRI-0.150
rament-0.138
icult-0.134
stanbul-0.131
Token history
Token ID2106
Feature activation+1.29e+00

Pos logit contributions
gemony+1.905
 Ally+1.807
 sophistication+1.677
 casinos+1.676
 Indra+1.658
Neg logit contributions
rador-1.960
MRI-1.941
rament-1.767
icult-1.714
stanbul-1.682
Token of
Token ID286
Feature activation+1.13e+00

Pos logit contributions
gemony+1.050
 Ally+0.990
 casinos+0.925
 Indra+0.921
 sophistication+0.916
Neg logit contributions
rador-0.953
MRI-0.932
rament-0.850
icult-0.820
stanbul-0.804
Token NDP
Token ID20392
Feature activation-2.45e+00

Pos logit contributions
gemony+0.876
 Ally+0.830
 casinos+0.772
 Indra+0.767
 sophistication+0.764
Neg logit contributions
rador-0.876
MRI-0.863
rament-0.792
icult-0.764
stanbul-0.752
Token governments
Token ID6905
Feature activation-3.29e+00

Pos logit contributions
rador+1.792
MRI+1.728
rament+1.579
icult+1.519
stanbul+1.498
Neg logit contributions
gemony-2.021
 Ally-1.894
 Indra-1.771
 casinos-1.760
ours-1.750
Token shows
Token ID2523
Feature activation-2.05e+00

Pos logit contributions
rador+2.447
MRI+2.377
rament+2.145
icult+2.085
stanbul+2.068
Neg logit contributions
gemony-2.610
 Ally-2.461
 casinos-2.286
 Indra-2.279
 sophistication-2.258
Token we
Token ID356
Feature activation-8.30e-01

Pos logit contributions
rador+1.654
MRI+1.613
rament+1.488
icult+1.440
stanbul+1.412
Neg logit contributions
gemony-1.521
 Ally-1.415
 casinos-1.309
 Indra-1.308
ours-1.300
Token
Token ID447
Feature activation+3.21e-01

Pos logit contributions
rador+0.578
MRI+0.563
rament+0.512
icult+0.492
stanbul+0.483
Neg logit contributions
gemony-0.709
 Ally-0.673
 casinos-0.630
 Indra-0.628
 sophistication-0.625
Token
Token ID247
Feature activation+8.82e-01

Pos logit contributions
gemony+0.262
 Ally+0.248
 casinos+0.231
 Indra+0.230
 sophistication+0.229
Neg logit contributions
rador-0.236
MRI-0.231
rament-0.211
icult-0.203
stanbul-0.199
Tokenre
Token ID260
Feature activation+7.55e-01

Pos logit contributions
gemony+0.810
 Ally+0.769
 casinos+0.723
 Indra+0.721
 sophistication+0.718
Neg logit contributions
rador-0.559
MRI-0.544
rament-0.489
icult-0.469
stanbul-0.458
Token often
Token ID1690
Feature activation+2.71e-01

Pos logit contributions
gemony+0.585
 Ally+0.551
 casinos+0.511
 Indra+0.509
 sophistication+0.507
Neg logit contributions
rador-0.587
MRI-0.576
rament-0.528
icult-0.510
stanbul-0.500
Token
Token ID198
Feature activation-5.59e-02

Pos logit contributions
rador+0.017
MRI+0.017
rament+0.015
icult+0.014
stanbul+0.014
Neg logit contributions
gemony-0.028
 Ally-0.026
 casinos-0.025
 Indra-0.025
 sophistication-0.025
TokenThere
Token ID1858
Feature activation-4.99e+00

Pos logit contributions
rador+0.038
MRI+0.037
rament+0.033
icult+0.032
stanbul+0.031
Neg logit contributions
gemony-0.049
 Ally-0.046
 casinos-0.044
 Indra-0.043
 sophistication-0.043
Tokenin
Token ID259
Feature activation+1.94e-02

Pos logit contributions
rador+3.378
MRI+3.292
rament+3.007
icult+2.888
stanbul+2.853
Neg logit contributions
gemony-4.182
 Ally-3.994
 Indra-3.761
 casinos-3.736
 sophistication-3.721
Token lies
Token ID7363
Feature activation+3.79e+00

Pos logit contributions
gemony+0.013
 Ally+0.012
 casinos+0.011
 Indra+0.011
 sophistication+0.011
Neg logit contributions
rador-0.017
MRI-0.017
rament-0.016
icult-0.015
stanbul-0.015
Token the
Token ID262
Feature activation+3.91e-01

Pos logit contributions
gemony+2.751
 Ally+2.615
 casinos+2.417
 Indra+2.405
 sophistication+2.401
Neg logit contributions
rador-3.121
MRI-3.053
rament-2.772
icult-2.710
iscopal-2.670
Token Mavericks
Token ID36002
Feature activation-4.44e+00

Pos logit contributions
gemony+0.270
 Ally+0.252
 casinos+0.231
 Indra+0.230
ours+0.228
Neg logit contributions
rador-0.337
MRI-0.330
rament-0.306
icult-0.297
stanbul-0.292
Token'
Token ID6
Feature activation+1.18e+00

Pos logit contributions
rador+2.876
MRI+2.841
rament+2.621
icult+2.485
stanbul+2.413
Neg logit contributions
gemony-3.927
 Ally-3.640
 Indra-3.467
ours-3.404
-3.368
Token dilemma
Token ID26566
Feature activation-1.87e+00

Pos logit contributions
gemony+0.833
 Ally+0.778
 casinos+0.715
 Indra+0.713
 sophistication+0.708
Neg logit contributions
rador-1.003
MRI-0.984
rament-0.911
icult-0.882
stanbul-0.868
Token:
Token ID25
Feature activation-6.81e-01

Pos logit contributions
rador+1.235
MRI+1.211
rament+1.114
icult+1.061
stanbul+1.032
Neg logit contributions
gemony-1.646
 Ally-1.542
 Indra-1.448
 casinos-1.441
ours-1.439
Token they
Token ID484
Feature activation-1.05e+00

Pos logit contributions
rador+0.522
MRI+0.510
rament+0.469
icult+0.452
stanbul+0.444
Neg logit contributions
gemony-0.534
 Ally-0.502
 casinos-0.466
 Indra-0.466
 sophistication-0.462
Token have
Token ID423
Feature activation+6.59e-01

Pos logit contributions
rador+0.779
MRI+0.762
rament+0.697
icult+0.673
stanbul+0.660
Neg logit contributions
gemony-0.845
 Ally-0.797
 casinos-0.741
 Indra-0.740
 sophistication-0.736
Token UK
Token ID3482
Feature activation-1.28e+00

Pos logit contributions
rador+0.756
MRI+0.738
rament+0.680
icult+0.660
stanbul+0.652
Neg logit contributions
gemony-0.692
 Ally-0.641
 Indra-0.595
 casinos-0.593
 sophistication-0.592
Token interest
Token ID1393
Feature activation-1.31e+00

Pos logit contributions
rador+1.003
MRI+0.987
rament+0.901
icult+0.874
iscopal+0.859
Neg logit contributions
gemony-0.976
 Ally-0.907
 Indra-0.848
 sophistication-0.841
 casinos-0.841
Token rates
Token ID3965
Feature activation-7.50e-01

Pos logit contributions
rador+1.097
MRI+1.072
rament+0.988
icult+0.961
iscopal+0.942
Neg logit contributions
gemony-0.934
 Ally-0.866
 Indra-0.805
 casinos-0.798
 sophistication-0.796
Token at
Token ID379
Feature activation+1.20e+00

Pos logit contributions
rador+0.543
MRI+0.529
rament+0.482
icult+0.465
stanbul+0.456
Neg logit contributions
gemony-0.622
 Ally-0.580
 Indra-0.543
 casinos-0.543
 sophistication-0.541
Token a
Token ID257
Feature activation+8.85e-01

Pos logit contributions
gemony+0.880
 Ally+0.835
 casinos+0.770
 Indra+0.762
ours+0.758
Neg logit contributions
rador-0.981
MRI-0.962
rament-0.886
icult-0.858
stanbul-0.844
Token record
Token ID1700
Feature activation-5.86e+00

Pos logit contributions
gemony+0.772
 Ally+0.738
 casinos+0.691
 Indra+0.687
 sophistication+0.685
Neg logit contributions
rador-0.598
MRI-0.584
rament-0.528
icult-0.506
iscopal-0.496
Token low
Token ID1877
Feature activation-2.00e+00

Pos logit contributions
rador+4.755
MRI+4.653
rament+4.301
icult+4.209
iscopal+4.133
Neg logit contributions
gemony-4.218
 Ally-3.775
 Indra-3.575
ours-3.561
 casinos-3.527
Token of
Token ID286
Feature activation-9.33e-01

Pos logit contributions
rador+1.461
MRI+1.426
rament+1.313
icult+1.264
iscopal+1.237
Neg logit contributions
gemony-1.632
 Ally-1.508
 Indra-1.422
 casinos-1.418
ours-1.408
Token 0
Token ID657
Feature activation+1.28e+00

Pos logit contributions
rador+0.717
MRI+0.701
rament+0.643
icult+0.624
iscopal+0.609
Neg logit contributions
gemony-0.729
 Ally-0.677
 Indra-0.634
 casinos-0.631
 sophistication-0.628
Token.
Token ID13
Feature activation+2.06e+00

Pos logit contributions
gemony+1.122
 Ally+1.062
 casinos+0.996
 Indra+0.993
 sophistication+0.988
Neg logit contributions
rador-0.871
MRI-0.850
rament-0.769
icult-0.740
stanbul-0.723
Token5
Token ID20
Feature activation+2.99e+00

Pos logit contributions
gemony+1.906
 Ally+1.833
 Indra+1.714
ours+1.701
 casinos+1.698
Neg logit contributions
rador-1.281
MRI-1.229
rament-1.114
icult-1.072
iscopal-1.040
Token easy
Token ID2562
Feature activation+6.35e-02

Pos logit contributions
gemony+0.524
 Ally+0.489
 casinos+0.450
 Indra+0.448
 sophistication+0.445
Neg logit contributions
rador-0.640
MRI-0.628
rament-0.581
icult-0.563
stanbul-0.554
Token!
Token ID0
Feature activation+1.07e+00

Pos logit contributions
gemony+0.051
 Ally+0.048
 casinos+0.045
 Indra+0.045
ours+0.044
Neg logit contributions
rador-0.048
MRI-0.047
rament-0.043
icult-0.041
stanbul-0.040
Token<|endoftext|>
Token ID50256
Feature activation+8.33e-01

Pos logit contributions
gemony+0.935
 Ally+0.892
 casinos+0.832
 Indra+0.832
 sophistication+0.825
Neg logit contributions
rador-0.726
MRI-0.706
rament-0.642
icult-0.617
stanbul-0.604
TokenWith
Token ID3152
Feature activation+7.60e-01

Pos logit contributions
gemony+0.731
 Ally+0.692
 casinos+0.649
 Indra+0.647
 sophistication+0.644
Neg logit contributions
rador-0.562
MRI-0.548
rament-0.497
icult-0.477
stanbul-0.467
Token a
Token ID257
Feature activation-4.91e-01

Pos logit contributions
gemony+0.570
 Ally+0.533
 casinos+0.493
 Indra+0.492
ours+0.489
Neg logit contributions
rador-0.609
MRI-0.595
rament-0.549
icult-0.530
stanbul-0.523
Token strong
Token ID1913
Feature activation-5.55e+00

Pos logit contributions
rador+0.387
MRI+0.379
rament+0.349
icult+0.337
stanbul+0.332
Neg logit contributions
gemony-0.374
 Ally-0.351
 Indra-0.325
 casinos-0.325
ours-0.322
Token focus
Token ID2962
Feature activation-6.80e+00

Pos logit contributions
rador+4.201
MRI+4.093
rament+3.807
icult+3.614
stanbul+3.599
Neg logit contributions
gemony-4.334
 Ally-3.844
ours-3.654
 Indra-3.642
 casinos-3.607
Token on
Token ID319
Feature activation-3.12e+00

Pos logit contributions
rador+3.901
MRI+3.711
rament+3.366
icult+3.188
stanbul+3.044
Neg logit contributions
gemony-6.422
 Ally-6.096
 Indra-5.772
 casinos-5.738
 sophistication-5.720
Token player
Token ID2137
Feature activation+3.80e-01

Pos logit contributions
rador+2.480
MRI+2.419
rament+2.247
icult+2.157
stanbul+2.134
Neg logit contributions
gemony-2.310
 Ally-2.130
ours-1.986
 Indra-1.974
 casinos-1.956
Token government
Token ID1230
Feature activation-2.98e+00

Pos logit contributions
gemony+0.281
 Ally+0.263
 casinos+0.243
 Indra+0.243
ours+0.241
Neg logit contributions
rador-0.309
MRI-0.303
rament-0.279
icult-0.270
stanbul-0.265
Token and
Token ID290
Feature activation-7.63e-01

Pos logit contributions
rador+2.244
MRI+2.191
rament+2.019
icult+1.930
stanbul+1.908
Neg logit contributions
gemony-2.352
 Ally-2.197
 Indra-2.043
 casinos-2.040
ours-2.026
Token find
Token ID1064
Feature activation-1.41e+00

Pos logit contributions
gemony+0.149
 Ally+0.140
 casinos+0.130
 Indra+0.129
 sophistication+0.129
Neg logit contributions
rador-0.153
MRI-0.150
rament-0.137
icult-0.133
stanbul-0.130
Token dark
Token ID3223
Feature activation-3.04e+00

Pos logit contributions
rador+1.115
MRI+1.090
rament+0.996
icult+0.961
iscopal+0.943
Neg logit contributions
gemony-1.075
 Ally-1.005
 Indra-0.932
 casinos-0.926
 sophistication-0.918
Token regions
Token ID7652
Feature activation-5.18e-01

Pos logit contributions
rador+2.626
MRI+2.596
rament+2.381
icult+2.305
iscopal+2.260
Neg logit contributions
gemony-2.093
 Ally-1.940
 Indra-1.775
 casinos-1.757
 Polo-1.733
Token on
Token ID319
Feature activation+4.13e-01

Pos logit contributions
rador+0.384
MRI+0.377
rament+0.342
icult+0.331
stanbul+0.325
Neg logit contributions
gemony-0.421
 Ally-0.394
 casinos-0.367
 Indra-0.366
 sophistication-0.364
Token a
Token ID257
Feature activation+9.20e-01

Pos logit contributions
gemony+0.314
 Ally+0.295
 casinos+0.273
 Indra+0.272
 sophistication+0.270
Neg logit contributions
rador-0.327
MRI-0.320
rament-0.294
icult-0.284
iscopal-0.279
Token light
Token ID1657
Feature activation-4.61e+00

Pos logit contributions
gemony+0.704
 Ally+0.662
 casinos+0.614
 Indra+0.612
 sophistication+0.609
Neg logit contributions
rador-0.724
MRI-0.708
rament-0.651
icult-0.629
stanbul-0.618
Token background
Token ID4469
Feature activation+3.35e-01

Pos logit contributions
rador+3.393
MRI+3.361
rament+3.030
icult+2.893
iscopal+2.821
Neg logit contributions
gemony-3.773
 Ally-3.480
 casinos-3.286
 Indra-3.247
 Polo-3.190
Token gray
Token ID12768
Feature activation-1.24e+00

Pos logit contributions
gemony+0.267
 Ally+0.251
 casinos+0.234
 Indra+0.233
 sophistication+0.231
Neg logit contributions
rador-0.254
MRI-0.248
rament-0.227
icult-0.219
stanbul-0.215
Token =
Token ID796
Feature activation+2.22e-01

Pos logit contributions
rador+0.780
MRI+0.764
rament+0.680
icult+0.652
stanbul+0.636
Neg logit contributions
gemony-1.142
 Ally-1.085
 casinos-1.022
 Indra-1.018
 sophistication-1.013
Token c
Token ID269
Feature activation-2.06e+00

Pos logit contributions
gemony+0.184
 Ally+0.173
 casinos+0.161
 Indra+0.161
 sophistication+0.160
Neg logit contributions
rador-0.160
MRI-0.157
rament-0.143
icult-0.138
stanbul-0.135
Tokenv
Token ID85
Feature activation-2.18e+00

Pos logit contributions
rador+1.328
MRI+1.322
rament+1.186
icult+1.122
stanbul+1.098
Neg logit contributions
gemony-1.843
 Ally-1.768
 casinos-1.662
 Indra-1.659
 sophistication-1.641
Token the
Token ID262
Feature activation-2.37e-01

Pos logit contributions
rador+0.578
MRI+0.566
rament+0.521
icult+0.502
stanbul+0.493
Neg logit contributions
gemony-0.572
 Ally-0.535
 casinos-0.497
 Indra-0.496
ours-0.494
Token book
Token ID1492
Feature activation-5.43e-01

Pos logit contributions
rador+0.183
MRI+0.179
rament+0.164
icult+0.159
stanbul+0.156
Neg logit contributions
gemony-0.184
 Ally-0.174
 casinos-0.161
 Indra-0.161
 sophistication-0.160
Token but
Token ID475
Feature activation-4.68e-01

Pos logit contributions
rador+0.410
MRI+0.400
rament+0.367
icult+0.354
stanbul+0.347
Neg logit contributions
gemony-0.433
 Ally-0.407
 casinos-0.379
 Indra-0.378
 sophistication-0.376
Token when
Token ID618
Feature activation-4.46e+00

Pos logit contributions
rador+0.364
MRI+0.356
rament+0.327
icult+0.316
stanbul+0.311
Neg logit contributions
gemony-0.362
 Ally-0.335
 casinos-0.313
ours-0.311
 sophistication-0.311
Token I
Token ID314
Feature activation+6.04e-01

Pos logit contributions
rador+3.309
MRI+3.260
rament+2.999
icult+2.880
stanbul+2.838
Neg logit contributions
gemony-3.493
 Ally-3.154
ours-3.014
 casinos-2.994
 sophistication-2.989
Token saw
Token ID2497
Feature activation-4.54e+00

Pos logit contributions
gemony+0.486
 Ally+0.462
 casinos+0.430
 Indra+0.427
ours+0.424
Neg logit contributions
rador-0.450
MRI-0.440
rament-0.402
icult-0.388
iscopal-0.381
Token it
Token ID340
Feature activation-9.08e-01

Pos logit contributions
rador+3.342
MRI+3.279
rament+3.009
icult+2.888
stanbul+2.866
Neg logit contributions
gemony-3.580
 Ally-3.247
ours-3.123
 casinos-3.092
 sophistication-3.066
Token I
Token ID314
Feature activation+1.63e+00

Pos logit contributions
rador+0.680
MRI+0.666
rament+0.609
icult+0.588
stanbul+0.579
Neg logit contributions
gemony-0.728
 Ally-0.679
 casinos-0.634
 Indra-0.632
 sophistication-0.631
Token was
Token ID373
Feature activation-3.26e-01

Pos logit contributions
gemony+1.186
 Ally+1.122
 casinos+1.033
 Indra+1.026
ours+1.016
Neg logit contributions
rador-1.336
MRI-1.308
rament-1.207
icult-1.173
iscopal-1.154
Token like
Token ID588
Feature activation-6.08e-01

Pos logit contributions
rador+0.208
MRI+0.203
rament+0.183
icult+0.175
iscopal+0.171
Neg logit contributions
gemony-0.297
 Ally-0.283
 casinos-0.266
 Indra-0.265
 sophistication-0.264
Token,
Token ID11
Feature activation-6.17e-01

Pos logit contributions
rador+0.443
MRI+0.433
rament+0.395
icult+0.381
stanbul+0.375
Neg logit contributions
gemony-0.499
 Ally-0.469
 casinos-0.439
 Indra-0.436
 sophistication-0.435