TOP ACTIVATIONS
MAX = 4.820

, New Hampshire . Hide Caption 6 of 15 Photos :
in , Iowa . Hide Caption 5 of 15 Photos :
Hills , California . Hide Caption 14 of 15 Photos :
aga , California . Hide Caption 13 of 15 Photos :
Women in politics . Hide Caption 7 of 15 Photos :
Newsletter Sign Up Continue reading the main story Please
19 ] Shawn a Dol ansky , The Deb
governor . Advertisement Continue reading the main story
Grove , California . Hide Caption 15 of 15
Pasadena , California . Hide Caption 11 of 15 Photos :
Advertisement Continue reading the main story
Pasadena , California . Hide Caption 12 of 15 Photos :
as chairman of PD V SA in the resh uffle ,
Newsletter Sign Up Continue reading the main story Please
. Remember Colin Kaepernick ? Seahawks fans will no
pants uit GIF s on Instagram , and who joined Snapchat
ok . CNN president Jeff Zucker s then 15
2008 - 2011 ), Larry Kramer ( president ; director Climate
ica (@ dan meric a CNN ) October 23 , 2016
Irvine , California . Hide Caption 10 of 15 Photos :

MIN ACTIVATIONS
MIN = -5.493

, cour get tes , basil and spinach , and named
led immortal eld ritch mon stros ities year after year and
Federal Communications Commission chairman Aj it Pai , a Trump
the French Mediterranean island of Cors ica Image : Flames threatened
asting a high - pressure conco ction of water , sand
Fla ., where killer whale Til ik um was involved in
series of the inter - trop ical temperature anomalies T (
C ute video Aj it Aren
through for more images from Cors ica / Local
signed former Denver Broncos quarterback Brock Os we iler to a
Prime ministers pick their cabinets for several reasons . Among
Gold , Gold A Ton of Gold -
5 lit ers into the mash tun , then add the
group Raqqa is Being Sl aughtered Sil ently .
includes visits to Havana and Trinidad , walks through the countryside
s largest party abandoned President Dil ma Rousse ff
ar of the A am Aad mi Party , arrested this
party abandoned President Dil ma Rousse ff s governing
- known type of ign eous rock found in many volcanic
" Oh no , the oct opus is back . It

INTERVAL 2.242 - 4.820
CONTAINS 0.441%

12 ms G 2 G Samsung P LS ( Response Time
between Tony Ferguson and Kevin Lee . Mc Greg
H ONG K ONG ( Reuters ) - One needs to
the browser had been the weak link in a cyber attack
lene Go le mb iew ski . Read the 2018 Sh

INTERVAL -0.336 - 2.242
CONTAINS 70.890%

download it to your PC and save it as a backup
is a true and valid one and the seeing
, largely ineffective without wide participation . Some state
GO TO THE W ATER VILLE TA VER N , as
, partners include Rebecca H ower ton , Sam P anche

INTERVAL -2.915 - -0.336
CONTAINS 28.603%

c . gov / in jury / w is q ars
but the building became obsolete in the sevent ies , when
" by Des hm uk h in allot ting the land
form - in - one heats inks improve cooling for memory
8 : 30 a . m . for 2 - hour

INTERVAL -5.493 - -2.915
CONTAINS 0.066%

to film producer Sub hash Gh ai , the Supreme Court
parents looking for ways to clean up the air their kids
government shutdown that took effect October 1 , it seems the
Home Life / P CC Natural Markets will open new store
made his bid for the throne in conjunction with Edwin Earl
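
Note on the interval boundaries: the four INTERVAL ranges above appear to be equal-width bins spanning MIN = -5.493 to MAX = 4.820, and the four CONTAINS percentages add up to 100%. The sketch below reproduces the boundaries under that assumption. Equal-width binning is inferred from the displayed numbers and is not stated anywhere in this dump, and the names `bin_edges`, `displayed_edges` and `contains_pct` are hypothetical, introduced only for illustration.

```python
# Minimal sketch, assuming the dashboard splits [MIN, MAX] into four
# equal-width bins. This binning rule is an inference, not a documented fact.

def bin_edges(lo, hi, n_bins):
    """Return the n_bins + 1 boundaries of equal-width bins over [lo, hi]."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]

# MIN and MAX as printed in the dump.
edges = bin_edges(-5.493, 4.820, 4)

# Boundaries and percentages as printed in the INTERVAL / CONTAINS lines.
displayed_edges = [-5.493, -2.915, -0.336, 2.242, 4.820]
contains_pct = [0.066, 28.603, 70.890, 0.441]

# Largest gap between a computed boundary and the printed one.
max_gap = max(abs(e - d) for e, d in zip(edges, displayed_edges))
total_pct = sum(contains_pct)

print([f"{e:.4f}" for e in edges])
print(f"largest gap to printed boundary: {max_gap:.4f}")
print(f"sum of CONTAINS percentages: {total_pct:.3f}")
```

If the assumption holds, each computed boundary should agree with the printed one to within display rounding, and the two extreme bins together hold only about 0.5% of activations.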
Token,
Token ID11
Feature activation+3.53e-01

Pos logit contributions
RANT+0.536
PLIED+0.495
pling+0.482
ples+0.471
gradient+0.470
Neg logit contributions
parency-0.515
ology-0.458
icians-0.454
ologne-0.438
atform-0.437
Token New
Token ID968
Feature activation-9.73e-01

Pos logit contributions
RANT+0.422
PLIED+0.386
antha+0.359
pling+0.358
ples+0.352
Neg logit contributions
parency-0.510
atform-0.455
ology-0.451
icians-0.449
ologne-0.445
Token Hampshire
Token ID13910
Feature activation-9.25e-01

Pos logit contributions
parency+1.484
icians+1.321
ologne+1.320
ology+1.317
atform+1.304
Neg logit contributions
RANT-1.072
PLIED-0.987
pling-0.928
ples-0.913
antha-0.912
Token.
Token ID13
Feature activation+7.96e-01

Pos logit contributions
parency+1.025
ology+0.913
icians+0.895
ologne+0.856
ttes+0.842
Neg logit contributions
RANT-1.401
PLIED-1.308
pling-1.272
gradient-1.256
antha-1.242
Token Hide
Token ID10415
Feature activation+1.42e+00

Pos logit contributions
RANT+0.882
PLIED+0.790
pling+0.753
ples+0.738
gradient+0.731
Neg logit contributions
parency-1.217
ology-1.095
icians-1.086
ologne-1.065
atform-1.052
Token Caption
Token ID11260
Feature activation+4.82e+00

Pos logit contributions
RANT+0.758
PLIED+0.618
pling+0.559
gradient+0.515
ples+0.509
Neg logit contributions
parency-2.953
icians-2.743
ology-2.739
ologne-2.685
atform-2.681
Token 6
Token ID718
Feature activation+3.79e-01

Pos logit contributions
RANT+4.234
 Loading+3.725
PLIED+3.672
ples+3.670
pling+3.589
Neg logit contributions
parency-7.940
atform-7.207
icians-7.154
ologne-7.055
ttes-6.980
Token of
Token ID286
Feature activation+3.16e-01

Pos logit contributions
RANT+0.430
PLIED+0.390
pling+0.378
gradient+0.370
ples+0.367
Neg logit contributions
parency-0.564
ology-0.510
icians-0.509
ologne-0.496
atform-0.490
Token 15
Token ID1315
Feature activation+2.20e-02

Pos logit contributions
RANT+0.353
PLIED+0.321
pling+0.308
gradient+0.304
ples+0.299
Neg logit contributions
parency-0.478
ology-0.434
icians-0.429
ologne-0.418
atform-0.415
Token Photos
Token ID9434
Feature activation+1.69e+00

Pos logit contributions
RANT+0.030
PLIED+0.028
pling+0.027
ples+0.027
gradient+0.027
Neg logit contributions
parency-0.028
ology-0.024
icians-0.024
ologne-0.023
atform-0.023
Token:
Token ID25
Feature activation+1.01e+00

Pos logit contributions
RANT+1.922
PLIED+1.751
pling+1.674
gradient+1.670
ples+1.637
Neg logit contributions
parency-2.501
icians-2.244
ology-2.210
atform-2.187
ologne-2.179
Tokenin
Token ID259
Feature activation-5.15e-01

Pos logit contributions
parency+2.008
ology+1.814
icians+1.779
ologne+1.715
ttes+1.693
Neg logit contributions
RANT-2.172
PLIED-2.003
pling-1.939
gradient-1.903
ples-1.885
Token,
Token ID11
Feature activation+5.14e-01

Pos logit contributions
parency+0.642
ology+0.570
icians+0.563
ologne+0.542
atform+0.541
Neg logit contributions
RANT-0.714
PLIED-0.661
pling-0.644
gradient-0.630
ples-0.626
Token Iowa
Token ID9406
Feature activation-3.99e-01

Pos logit contributions
RANT+0.649
PLIED+0.593
pling+0.569
gradient+0.557
ples+0.556
Neg logit contributions
parency-0.707
ology-0.631
icians-0.622
atform-0.612
ttes-0.607
Token.
Token ID13
Feature activation+9.70e-01

Pos logit contributions
parency+0.466
ology+0.414
icians+0.406
ologne+0.391
atform+0.386
Neg logit contributions
RANT-0.582
PLIED-0.541
pling-0.528
gradient-0.517
antha-0.514
Token Hide
Token ID10415
Feature activation+1.03e+00

Pos logit contributions
RANT+1.069
PLIED+0.957
pling+0.914
ples+0.892
antha+0.883
Neg logit contributions
parency-1.492
ology-1.337
icians-1.328
ologne-1.305
atform-1.288
Token Caption
Token ID11260
Feature activation+4.69e+00

Pos logit contributions
RANT+0.569
PLIED+0.468
pling+0.426
gradient+0.394
ples+0.388
Neg logit contributions
parency-2.121
ology-1.969
icians-1.968
ologne-1.927
atform-1.918
Token 5
Token ID642
Feature activation+5.59e-01

Pos logit contributions
RANT+4.128
PLIED+3.662
ples+3.549
pling+3.525
 Loading+3.504
Neg logit contributions
parency-7.773
atform-6.988
icians-6.982
ology-6.956
ologne-6.874
Token of
Token ID286
Feature activation+4.62e-01

Pos logit contributions
RANT+0.627
PLIED+0.568
pling+0.550
gradient+0.538
ples+0.536
Neg logit contributions
parency-0.839
icians-0.758
ology-0.756
ologne-0.738
atform-0.731
Token 15
Token ID1315
Feature activation+9.33e-02

Pos logit contributions
RANT+0.521
PLIED+0.475
pling+0.457
gradient+0.449
ples+0.442
Neg logit contributions
parency-0.694
ology-0.630
icians-0.623
ologne-0.605
atform-0.601
Token Photos
Token ID9434
Feature activation+1.60e+00

Pos logit contributions
RANT+0.127
PLIED+0.119
pling+0.114
ples+0.112
gradient+0.112
Neg logit contributions
parency-0.118
ology-0.103
icians-0.102
atform-0.100
ologne-0.100
Token:
Token ID25
Feature activation+1.21e+00

Pos logit contributions
RANT+1.825
PLIED+1.664
pling+1.588
gradient+1.587
ples+1.556
Neg logit contributions
parency-2.362
icians-2.115
ology-2.084
atform-2.066
ologne-2.058
Token Hills
Token ID14379
Feature activation-2.47e+00

Pos logit contributions
parency+2.015
ology+1.807
icians+1.777
ologne+1.714
ttes+1.695
Neg logit contributions
RANT-2.199
PLIED-2.036
pling-1.980
gradient-1.942
antha-1.925
Token,
Token ID11
Feature activation+4.44e-01

Pos logit contributions
parency+2.940
ology+2.642
icians+2.573
ologne+2.489
ttes+2.458
Neg logit contributions
RANT-3.536
PLIED-3.279
pling-3.188
gradient-3.133
antha-3.113
Token California
Token ID3442
Feature activation+9.42e-02

Pos logit contributions
RANT+0.425
PLIED+0.378
pling+0.359
ples+0.348
gradient+0.348
Neg logit contributions
parency-0.744
ology-0.681
icians-0.676
atform-0.665
ttes-0.663
Token.
Token ID13
Feature activation+9.44e-01

Pos logit contributions
RANT+0.139
PLIED+0.130
pling+0.126
gradient+0.124
antha+0.123
Neg logit contributions
parency-0.108
ology-0.096
icians-0.094
ologne-0.091
ttes-0.089
Token Hide
Token ID10415
Feature activation+1.34e+00

Pos logit contributions
RANT+1.118
PLIED+1.012
pling+0.967
ples+0.952
antha+0.943
Neg logit contributions
parency-1.369
ology-1.223
icians-1.213
ologne-1.191
atform-1.176
Token Caption
Token ID11260
Feature activation+4.63e+00

Pos logit contributions
RANT+0.728
PLIED+0.598
pling+0.541
gradient+0.501
ples+0.496
Neg logit contributions
parency-2.798
icians-2.594
ology-2.587
atform-2.541
ologne-2.536
Token 14
Token ID1478
Feature activation-3.39e-01

Pos logit contributions
RANT+4.349
PLIED+3.915
gradient+3.794
ples+3.778
pling+3.703
Neg logit contributions
parency-7.329
ologne-6.529
icians-6.510
atform-6.433
ology-6.320
Token of
Token ID286
Feature activation+5.67e-01

Pos logit contributions
parency+0.494
ology+0.448
icians+0.444
ologne+0.431
atform+0.427
Neg logit contributions
RANT-0.396
PLIED-0.361
pling-0.350
gradient-0.342
ples-0.339
Token 15
Token ID1315
Feature activation+2.53e-01

Pos logit contributions
RANT+0.616
PLIED+0.558
pling+0.532
gradient+0.528
ples+0.517
Neg logit contributions
parency-0.873
ology-0.789
icians-0.785
ologne-0.766
atform-0.763
Token Photos
Token ID9434
Feature activation+1.67e+00

Pos logit contributions
RANT+0.345
PLIED+0.322
pling+0.310
ples+0.306
gradient+0.305
Neg logit contributions
parency-0.319
ology-0.280
icians-0.279
ologne-0.271
atform-0.270
Token:
Token ID25
Feature activation+1.23e+00

Pos logit contributions
RANT+1.887
PLIED+1.714
gradient+1.639
pling+1.639
ples+1.603
Neg logit contributions
parency-2.470
icians-2.219
ology-2.178
atform-2.159
ologne-2.151
Tokenaga
Token ID8126
Feature activation+7.48e-01

Pos logit contributions
parency+2.994
ology+2.659
icians+2.603
ologne+2.482
ttes+2.445
Neg logit contributions
RANT-4.291
PLIED-4.017
pling-3.902
gradient-3.835
antha-3.792
Token,
Token ID11
Feature activation+3.33e-01

Pos logit contributions
RANT+0.992
PLIED+0.918
pling+0.899
gradient+0.876
ples+0.876
Neg logit contributions
parency-0.967
ology-0.856
icians-0.853
ologne-0.824
atform-0.821
Token California
Token ID3442
Feature activation-6.81e-02

Pos logit contributions
RANT+0.351
PLIED+0.318
pling+0.298
gradient+0.293
antha+0.291
Neg logit contributions
parency-0.529
ology-0.478
icians-0.475
atform-0.466
ologne-0.464
Token.
Token ID13
Feature activation+8.87e-01

Pos logit contributions
parency+0.078
ology+0.069
icians+0.068
ologne+0.065
ttes+0.064
Neg logit contributions
RANT-0.101
PLIED-0.094
pling-0.092
gradient-0.090
antha-0.090
Token Hide
Token ID10415
Feature activation+1.08e+00

Pos logit contributions
RANT+0.968
PLIED+0.876
pling+0.834
ples+0.816
antha+0.810
Neg logit contributions
parency-1.362
ology-1.224
icians-1.213
ologne-1.194
atform-1.178
Token Caption
Token ID11260
Feature activation+4.34e+00

Pos logit contributions
RANT+0.596
PLIED+0.491
pling+0.444
gradient+0.413
ples+0.408
Neg logit contributions
parency-2.239
icians-2.077
ology-2.074
atform-2.032
ologne-2.030
Token 13
Token ID1511
Feature activation+7.12e-01

Pos logit contributions
RANT+3.846
PLIED+3.336
pling+3.141
gradient+3.134
ples+3.078
Neg logit contributions
parency-7.374
atform-6.639
ology-6.632
icians-6.568
ologne-6.549
Token of
Token ID286
Feature activation+4.49e-01

Pos logit contributions
RANT+0.790
PLIED+0.716
pling+0.693
gradient+0.677
ples+0.675
Neg logit contributions
parency-1.074
icians-0.970
ology-0.964
ologne-0.946
atform-0.937
Token 15
Token ID1315
Feature activation+5.18e-02

Pos logit contributions
RANT+0.489
PLIED+0.444
pling+0.424
gradient+0.420
ples+0.411
Neg logit contributions
parency-0.689
ology-0.624
icians-0.620
ologne-0.604
atform-0.601
Token Photos
Token ID9434
Feature activation+1.50e+00

Pos logit contributions
RANT+0.071
PLIED+0.066
pling+0.064
ples+0.063
gradient+0.063
Neg logit contributions
parency-0.065
ology-0.057
icians-0.057
ologne-0.055
atform-0.055
Token:
Token ID25
Feature activation+1.13e+00

Pos logit contributions
RANT+1.707
PLIED+1.553
pling+1.485
gradient+1.484
ples+1.452
Neg logit contributions
parency-2.204
icians-1.979
ology-1.948
atform-1.923
ologne-1.918
Token Women
Token ID6926
Feature activation-7.55e-01

Pos logit contributions
parency+0.858
ology+0.764
icians+0.753
ologne+0.727
atform+0.720
Neg logit contributions
RANT-0.987
PLIED-0.911
pling-0.890
gradient-0.871
antha-0.864
Token in
Token ID287
Feature activation-3.97e-01

Pos logit contributions
parency+1.017
ology+0.914
icians+0.906
ologne+0.876
atform+0.870
Neg logit contributions
RANT-0.967
PLIED-0.885
pling-0.867
gradient-0.839
ples-0.838
Token politics
Token ID4819
Feature activation-6.84e-01

Pos logit contributions
parency+0.552
ology+0.500
icians+0.494
ologne+0.480
ttes+0.474
Neg logit contributions
RANT-0.489
PLIED-0.447
pling-0.435
gradient-0.424
ples-0.421
Token.
Token ID13
Feature activation+7.82e-02

Pos logit contributions
parency+0.907
ology+0.813
icians+0.803
ologne+0.779
ttes+0.772
Neg logit contributions
RANT-0.890
PLIED-0.818
pling-0.800
gradient-0.778
ples-0.775
Token Hide
Token ID10415
Feature activation+8.81e-01

Pos logit contributions
RANT+0.089
PLIED+0.080
pling+0.077
ples+0.075
gradient+0.075
Neg logit contributions
parency-0.118
ology-0.106
icians-0.105
ologne-0.103
atform-0.102
Token Caption
Token ID11260
Feature activation+4.34e+00

Pos logit contributions
RANT+0.510
PLIED+0.422
pling+0.386
gradient+0.359
ples+0.356
Neg logit contributions
parency-1.802
ology-1.678
icians-1.671
ologne-1.636
atform-1.633
Token 7
Token ID767
Feature activation-4.27e-01

Pos logit contributions
RANT+4.463
PLIED+4.046
ples+3.785
gradient+3.758
pling+3.757
Neg logit contributions
parency-6.660
atform-5.965
icians-5.921
ologne-5.854
ology-5.848
Token of
Token ID286
Feature activation+1.97e-01

Pos logit contributions
parency+0.620
ology+0.563
icians+0.557
ologne+0.540
atform+0.535
Neg logit contributions
RANT-0.502
PLIED-0.458
pling-0.443
gradient-0.434
ples-0.429
Token 15
Token ID1315
Feature activation-1.14e-01

Pos logit contributions
RANT+0.222
PLIED+0.202
pling+0.194
gradient+0.191
ples+0.188
Neg logit contributions
parency-0.295
ology-0.268
icians-0.265
ologne-0.258
atform-0.256
Token Photos
Token ID9434
Feature activation+1.49e+00

Pos logit contributions
parency+0.141
ology+0.124
icians+0.123
ologne+0.119
atform+0.118
Neg logit contributions
RANT-0.158
PLIED-0.147
pling-0.142
gradient-0.140
ples-0.140
Token:
Token ID25
Feature activation+7.76e-01

Pos logit contributions
RANT+1.722
PLIED+1.571
pling+1.504
gradient+1.497
ples+1.469
Neg logit contributions
parency-2.185
icians-1.960
ology-1.936
atform-1.905
ologne-1.901
Token
Token ID198
Feature activation+3.88e-01

Pos logit contributions
RANT+0.326
PLIED+0.299
pling+0.288
gradient+0.281
ples+0.281
Neg logit contributions
parency-0.343
ology-0.307
icians-0.306
ologne-0.295
ttes-0.293
Token
Token ID198
Feature activation+2.68e-01

Pos logit contributions
RANT+0.489
PLIED+0.449
pling+0.435
gradient+0.427
ples+0.424
Neg logit contributions
parency-0.529
icians-0.473
ology-0.472
ologne-0.457
atform-0.454
TokenNewsletter
Token ID33031
Feature activation+2.48e+00

Pos logit contributions
RANT+0.328
PLIED+0.298
pling+0.286
gradient+0.279
ples+0.278
Neg logit contributions
parency-0.378
ology-0.342
icians-0.341
ologne-0.329
atform-0.328
Token Sign
Token ID5865
Feature activation+2.69e+00

Pos logit contributions
RANT+1.433
PLIED+1.221
pling+1.193
ples+1.143
antha+1.117
Neg logit contributions
parency-4.942
icians-4.539
ology-4.537
atform-4.479
ologne-4.467
Token Up
Token ID3205
Feature activation+1.46e+00

Pos logit contributions
RANT+3.004
PLIED+2.769
pling+2.678
ples+2.638
antha+2.602
Neg logit contributions
parency-4.003
ology-3.531
icians-3.524
ologne-3.481
atform-3.445
Token Continue
Token ID10054
Feature activation+4.34e+00

Pos logit contributions
RANT+1.405
PLIED+1.252
pling+1.203
gradient+1.157
ples+1.152
Neg logit contributions
parency-2.415
icians-2.205
ology-2.194
ologne-2.138
ttes-2.130
Token reading
Token ID3555
Feature activation+2.04e+00

Pos logit contributions
RANT+3.492
PLIED+3.076
pling+2.976
antha+2.854
gradient+2.819
Neg logit contributions
parency-7.531
icians-6.958
ology-6.951
ologne-6.873
atform-6.828
Token the
Token ID262
Feature activation+3.03e-01

Pos logit contributions
RANT+2.088
PLIED+1.887
pling+1.780
ples+1.750
gradient+1.750
Neg logit contributions
parency-3.230
icians-2.927
ology-2.926
atform-2.886
ologne-2.881
Token main
Token ID1388
Feature activation+4.42e-01

Pos logit contributions
RANT+0.371
PLIED+0.340
pling+0.330
gradient+0.323
antha+0.322
Neg logit contributions
parency-0.422
ology-0.381
icians-0.378
ologne-0.366
atform-0.362
Token story
Token ID1621
Feature activation+2.90e+00

Pos logit contributions
RANT+0.548
PLIED+0.502
pling+0.489
gradient+0.480
antha+0.474
Neg logit contributions
parency-0.612
ology-0.549
icians-0.545
ologne-0.527
ttes-0.524
Token Please
Token ID4222
Feature activation+1.59e+00

Pos logit contributions
RANT+2.638
PLIED+2.388
pling+2.238
ples+2.139
antha+2.134
Neg logit contributions
parency-4.874
ology-4.407
icians-4.399
ologne-4.320
atform-4.231
Token19
Token ID1129
Feature activation-5.42e-01

Pos logit contributions
RANT+0.679
PLIED+0.599
pling+0.579
gradient+0.567
ples+0.552
Neg logit contributions
parency-1.118
ology-1.020
icians-1.019
ologne-0.992
atform-0.988
Token]
Token ID60
Feature activation+1.08e+00

Pos logit contributions
parency+0.790
ology+0.718
icians+0.709
ologne+0.688
atform+0.681
Neg logit contributions
RANT-0.635
PLIED-0.579
pling-0.560
gradient-0.547
ples-0.541
Token Shawn
Token ID25225
Feature activation+2.56e-01

Pos logit contributions
RANT+1.329
PLIED+1.216
pling+1.180
gradient+1.145
antha+1.142
Neg logit contributions
parency-1.505
icians-1.339
ology-1.338
ologne-1.297
atform-1.279
Tokena
Token ID64
Feature activation-1.27e-01

Pos logit contributions
RANT+0.315
PLIED+0.289
pling+0.285
antha+0.280
ples+0.277
Neg logit contributions
parency-0.354
ology-0.317
icians-0.314
ologne-0.306
atform-0.305
Token Dol
Token ID25143
Feature activation-2.11e+00

Pos logit contributions
parency+0.176
ology+0.158
icians+0.157
ologne+0.152
atform+0.151
Neg logit contributions
RANT-0.157
PLIED-0.143
pling-0.141
ples-0.137
antha-0.136
Tokenansky
Token ID49792
Feature activation+4.30e+00

Pos logit contributions
parency+2.918
ology+2.651
icians+2.610
ologne+2.524
ttes+2.503
Neg logit contributions
RANT-2.618
PLIED-2.408
pling-2.318
gradient-2.276
antha-2.251
Token,
Token ID11
Feature activation+1.57e-01

Pos logit contributions
RANT+5.148
pling+4.696
PLIED+4.664
ples+4.517
antha+4.487
Neg logit contributions
parency-5.813
icians-5.160
ology-5.108
ologne-5.074
atform-5.024
Token
Token ID564
Feature activation+9.66e-01

Pos logit contributions
RANT+0.187
PLIED+0.171
pling+0.166
gradient+0.161
antha+0.160
Neg logit contributions
parency-0.224
ology-0.202
icians-0.201
ologne-0.195
atform-0.193
Token
Token ID250
Feature activation-5.29e-02

Pos logit contributions
RANT+1.117
PLIED+1.006
pling+0.991
gradient+0.960
ples+0.950
Neg logit contributions
parency-1.421
ology-1.292
icians-1.287
ologne-1.250
atform-1.241
TokenThe
Token ID464
Feature activation-4.59e-01

Pos logit contributions
parency+0.083
ology+0.076
icians+0.075
ologne+0.073
atform+0.073
Neg logit contributions
RANT-0.057
PLIED-0.051
pling-0.049
gradient-0.047
ples-0.047
Token Deb
Token ID8965
Feature activation-8.08e-01

Pos logit contributions
parency+0.653
ology+0.584
icians+0.584
ologne+0.567
atform+0.562
Neg logit contributions
RANT-0.555
PLIED-0.506
pling-0.492
gradient-0.480
antha-0.477
Token governor
Token ID8153
Feature activation+3.09e-01

Pos logit contributions
RANT+0.207
PLIED+0.189
pling+0.184
gradient+0.179
antha+0.179
Neg logit contributions
parency-0.228
ology-0.206
icians-0.202
ologne-0.197
ttes-0.195
Token.
Token ID13
Feature activation+1.09e-01

Pos logit contributions
RANT+0.399
PLIED+0.361
pling+0.359
gradient+0.348
ples+0.348
Neg logit contributions
parency-0.407
ology-0.365
icians-0.362
ologne-0.353
ttes-0.349
Token
Token ID198
Feature activation+3.83e-01

Pos logit contributions
RANT+0.136
PLIED+0.124
pling+0.119
gradient+0.116
ples+0.116
Neg logit contributions
parency-0.151
ology-0.135
icians-0.135
ologne-0.131
ttes-0.129
Token
Token ID198
Feature activation+2.91e-01

Pos logit contributions
RANT+0.482
PLIED+0.442
pling+0.428
gradient+0.420
ples+0.417
Neg logit contributions
parency-0.524
icians-0.468
ology-0.466
ologne-0.452
atform-0.450
TokenAdvertisement
Token ID4723
Feature activation+1.33e+00

Pos logit contributions
RANT+0.382
PLIED+0.347
pling+0.336
gradient+0.328
ples+0.327
Neg logit contributions
parency-0.387
ology-0.346
icians-0.346
ologne-0.335
atform-0.333
Token Continue
Token ID10054
Feature activation+4.30e+00

Pos logit contributions
RANT+1.146
PLIED+1.013
pling+0.987
VERTISEMENT+0.954
ples+0.946
Neg logit contributions
parency-2.284
icians-2.121
ology-2.090
atform-2.069
ttes-2.046
Token reading
Token ID3555
Feature activation+2.15e+00

Pos logit contributions
RANT+3.365
PLIED+2.996
pling+2.913
antha+2.741
gradient+2.701
Neg logit contributions
parency-7.530
ology-6.939
icians-6.910
ologne-6.849
atform-6.739
Token the
Token ID262
Feature activation+1.77e-01

Pos logit contributions
RANT+2.205
PLIED+1.989
pling+1.897
gradient+1.862
ples+1.851
Neg logit contributions
parency-3.388
ology-3.081
icians-3.068
ologne-3.025
atform-3.019
Token main
Token ID1388
Feature activation+5.66e-01

Pos logit contributions
RANT+0.211
PLIED+0.192
pling+0.188
gradient+0.184
ples+0.183
Neg logit contributions
parency-0.252
ology-0.228
icians-0.227
ologne-0.219
atform-0.218
Token story
Token ID1621
Feature activation+1.97e+00

Pos logit contributions
RANT+0.676
PLIED+0.617
pling+0.604
gradient+0.594
ples+0.582
Neg logit contributions
parency-0.807
ology-0.722
icians-0.720
ologne-0.702
ttes-0.692
Token
Token ID198
Feature activation+5.39e-01

Pos logit contributions
RANT+2.343
PLIED+2.119
pling+2.037
gradient+1.993
ples+1.987
Neg logit contributions
parency-2.829
icians-2.530
ology-2.504
ologne-2.463
atform-2.442
Token Grove
Token ID24144
Feature activation+4.82e-01

Pos logit contributions
parency+1.747
ology+1.590
icians+1.570
ologne+1.523
atform+1.511
Neg logit contributions
RANT-1.350
PLIED-1.230
pling-1.190
gradient-1.163
antha-1.154
Token,
Token ID11
Feature activation+2.97e-01

Pos logit contributions
RANT+0.650
PLIED+0.604
pling+0.588
gradient+0.576
ples+0.574
Neg logit contributions
parency-0.613
ology-0.545
icians-0.541
atform-0.520
ologne-0.518
Token California
Token ID3442
Feature activation-1.72e-02

Pos logit contributions
RANT+0.283
PLIED+0.253
pling+0.239
gradient+0.231
antha+0.231
Neg logit contributions
parency-0.500
ology-0.458
icians-0.454
atform-0.446
ttes-0.445
Token.
Token ID13
Feature activation+6.61e-01

Pos logit contributions
parency+0.020
ology+0.017
icians+0.017
ologne+0.016
ttes+0.016
Neg logit contributions
RANT-0.026
PLIED-0.024
pling-0.023
gradient-0.023
antha-0.023
Token Hide
Token ID10415
Feature activation+1.11e+00

Pos logit contributions
RANT+0.804
PLIED+0.731
pling+0.700
ples+0.687
antha+0.682
Neg logit contributions
parency-0.939
ology-0.838
icians-0.832
ologne-0.815
atform-0.804
Token Caption
Token ID11260
Feature activation+4.28e+00

Pos logit contributions
RANT+0.627
PLIED+0.516
pling+0.471
gradient+0.437
ples+0.432
Neg logit contributions
parency-2.303
icians-2.134
ology-2.133
ologne-2.087
atform-2.086
Token 15
Token ID1315
Feature activation+2.80e-01

Pos logit contributions
RANT+3.770
PLIED+3.293
gradient+3.216
antha+3.105
pling+3.094
Neg logit contributions
parency-7.192
atform-6.414
ology-6.396
icians-6.354
ologne-6.343
Token of
Token ID286
Feature activation+4.54e-01

Pos logit contributions
RANT+0.321
PLIED+0.292
pling+0.282
gradient+0.276
ples+0.274
Neg logit contributions
parency-0.415
ology-0.375
icians-0.373
ologne-0.363
atform-0.360
Token 15
Token ID1315
Feature activation+2.49e-01

Pos logit contributions
RANT+0.507
PLIED+0.462
pling+0.442
gradient+0.435
ples+0.429
Neg logit contributions
parency-0.685
ology-0.622
icians-0.616
ologne-0.600
atform-0.596
Token
Token ID198
Feature activation+7.82e-01

Pos logit contributions
RANT+0.315
PLIED+0.291
pling+0.280
gradient+0.275
ples+0.273
Neg logit contributions
parency-0.341
ology-0.303
icians-0.300
ologne-0.292
atform-0.290
Token
Token ID198
Feature activation+7.29e-01

Pos logit contributions
RANT+0.970
PLIED+0.885
pling+0.853
gradient+0.840
ples+0.830
Neg logit contributions
parency-1.088
icians-0.973
ology-0.969
ologne-0.941
atform-0.938
Token Pasadena
Token ID41010
Feature activation-9.45e-01

Pos logit contributions
RANT+0.062
PLIED+0.056
pling+0.055
gradient+0.053
ples+0.053
Neg logit contributions
parency-0.083
ology-0.075
icians-0.074
ologne-0.072
atform-0.072
Token,
Token ID11
Feature activation+3.63e-01

Pos logit contributions
parency+1.103
ology+0.988
icians+0.964
ologne+0.927
ttes+0.909
Neg logit contributions
RANT-1.384
PLIED-1.283
pling-1.241
gradient-1.228
yip-1.216
Token California
Token ID3442
Feature activation+5.45e-01

Pos logit contributions
RANT+0.342
PLIED+0.304
pling+0.292
gradient+0.281
antha+0.280
Neg logit contributions
parency-0.612
ology-0.563
icians-0.557
ttes-0.545
atform-0.544
Token.
Token ID13
Feature activation+4.39e-01

Pos logit contributions
RANT+0.800
PLIED+0.745
pling+0.726
gradient+0.713
antha+0.707
Neg logit contributions
parency-0.630
ology-0.560
icians-0.549
ologne-0.528
ttes-0.522
Token Hide
Token ID10415
Feature activation+9.72e-01

Pos logit contributions
RANT+0.522
PLIED+0.475
pling+0.456
ples+0.445
gradient+0.444
Neg logit contributions
parency-0.636
ology-0.571
icians-0.565
ologne-0.553
atform-0.547
Token Caption
Token ID11260
Feature activation+4.14e+00

Pos logit contributions
RANT+0.550
PLIED+0.453
pling+0.413
gradient+0.386
ples+0.379
Neg logit contributions
parency-2.004
ology-1.860
icians-1.859
ologne-1.818
atform-1.817
Token 11
Token ID1367
Feature activation-2.57e-01

Pos logit contributions
RANT+4.329
PLIED+3.811
gradient+3.753
 Loading+3.659
pling+3.649
Neg logit contributions
parency-6.365
icians-5.700
atform-5.609
ologne-5.552
ology-5.521
Token of
Token ID286
Feature activation+5.87e-01

Pos logit contributions
parency+0.376
ology+0.341
icians+0.338
ologne+0.329
atform+0.325
Neg logit contributions
RANT-0.298
PLIED-0.272
pling-0.263
gradient-0.258
ples-0.255
Token 15
Token ID1315
Feature activation+1.58e-01

Pos logit contributions
RANT+0.639
PLIED+0.580
pling+0.553
gradient+0.548
ples+0.537
Neg logit contributions
parency-0.903
ology-0.817
icians-0.812
ologne-0.792
atform-0.789
Token Photos
Token ID9434
Feature activation+1.52e+00

Pos logit contributions
RANT+0.216
PLIED+0.201
pling+0.194
ples+0.191
gradient+0.190
Neg logit contributions
parency-0.198
ology-0.174
icians-0.173
ologne-0.168
atform-0.167
Token:
Token ID25
Feature activation+1.00e+00

Pos logit contributions
RANT+1.734
PLIED+1.579
pling+1.510
gradient+1.507
ples+1.476
Neg logit contributions
parency-2.236
icians-2.008
ology-1.978
atform-1.953
ologne-1.946
Token
Token ID447
Feature activation+6.29e-02

Pos logit contributions
RANT+0.673
PLIED+0.621
pling+0.604
gradient+0.586
antha+0.586
Neg logit contributions
parency-0.687
ology-0.615
icians-0.612
ologne-0.592
atform-0.585
Token
Token ID251
Feature activation+4.11e-01

Pos logit contributions
RANT+0.072
PLIED+0.066
pling+0.064
gradient+0.062
ples+0.061
Neg logit contributions
parency-0.093
ology-0.085
icians-0.084
ologne-0.082
atform-0.081
Token
Token ID198
Feature activation+7.93e-01

Pos logit contributions
RANT+0.504
PLIED+0.459
pling+0.443
ples+0.433
antha+0.431
Neg logit contributions
parency-0.573
ology-0.515
icians-0.513
ologne-0.499
atform-0.491
Token
Token ID198
Feature activation+7.06e-01

Pos logit contributions
RANT+0.983
PLIED+0.901
pling+0.873
gradient+0.855
ples+0.851
Neg logit contributions
parency-1.096
icians-0.982
ology-0.974
ologne-0.948
atform-0.944
TokenAdvertisement
Token ID4723
Feature activation+1.36e+00

Pos logit contributions
RANT+0.876
PLIED+0.790
pling+0.761
gradient+0.740
ples+0.740
Neg logit contributions
parency-0.989
icians-0.890
ology-0.889
ologne-0.863
atform-0.862
Token Continue
Token ID10054
Feature activation+4.13e+00

Pos logit contributions
RANT+1.190
PLIED+1.047
pling+1.014
VERTISEMENT+0.995
ples+0.975
Neg logit contributions
parency-2.345
icians-2.174
ology-2.143
atform-2.116
ttes-2.081
Token reading
Token ID3555
Feature activation+2.04e+00

Pos logit contributions
RANT+3.256
PLIED+2.885
pling+2.807
antha+2.653
gradient+2.612
Neg logit contributions
parency-7.219
icians-6.652
ology-6.639
ologne-6.582
atform-6.490
Token the
Token ID262
Feature activation+1.37e-01

Pos logit contributions
RANT+2.102
PLIED+1.909
pling+1.818
gradient+1.779
ples+1.770
Neg logit contributions
parency-3.190
ology-2.906
icians-2.887
ologne-2.847
atform-2.846
Token main
Token ID1388
Feature activation+3.12e-01

Pos logit contributions
RANT+0.165
PLIED+0.150
pling+0.146
gradient+0.144
ples+0.143
Neg logit contributions
parency-0.195
ology-0.176
icians-0.175
ologne-0.169
atform-0.167
Token story
Token ID1621
Feature activation+1.76e+00

Pos logit contributions
RANT+0.376
PLIED+0.344
pling+0.336
gradient+0.331
antha+0.324
Neg logit contributions
parency-0.442
ology-0.396
icians-0.394
ologne-0.385
atform-0.379
Token
Token ID198
Feature activation+6.92e-01

Pos logit contributions
RANT+2.118
PLIED+1.922
pling+1.851
gradient+1.810
ples+1.801
Neg logit contributions
parency-2.498
icians-2.233
ology-2.217
ologne-2.174
atform-2.157
Token Pasadena
Token ID41010
Feature activation-9.36e-01

Pos logit contributions
parency+0.164
ology+0.148
icians+0.146
ologne+0.141
atform+0.141
Neg logit contributions
RANT-0.146
PLIED-0.133
pling-0.130
gradient-0.127
ples-0.126
Token,
Token ID11
Feature activation+2.55e-01

Pos logit contributions
parency+1.095
ology+0.981
icians+0.959
ologne+0.920
ttes+0.908
Neg logit contributions
RANT-1.365
PLIED-1.266
pling-1.229
gradient-1.215
antha-1.199
Token California
Token ID3442
Feature activation+2.15e-01

Pos logit contributions
RANT+0.251
PLIED+0.224
pling+0.216
gradient+0.209
antha+0.207
Neg logit contributions
parency-0.420
ology-0.387
icians-0.382
ologne-0.372
ttes-0.369
Token.
Token ID13
Feature activation+6.68e-01

Pos logit contributions
RANT+0.318
PLIED+0.296
pling+0.288
gradient+0.283
antha+0.281
Neg logit contributions
parency-0.247
ology-0.220
icians-0.216
ologne-0.207
ttes-0.204
Token Hide
Token ID10415
Feature activation+9.13e-01

Pos logit contributions
RANT+0.742
PLIED+0.671
pling+0.641
ples+0.627
gradient+0.623
Neg logit contributions
parency-1.017
ology-0.918
icians-0.908
ologne-0.891
atform-0.880
Token Caption
Token ID11260
Feature activation+4.11e+00

Pos logit contributions
RANT+0.522
PLIED+0.430
pling+0.394
gradient+0.368
ples+0.361
Neg logit contributions
parency-1.877
ology-1.743
icians-1.741
ologne-1.701
atform-1.700
Token 12
Token ID1105
Feature activation+1.32e+00

Pos logit contributions
RANT+3.388
PLIED+3.046
gradient+2.821
ples+2.796
pling+2.770
Neg logit contributions
parency-7.250
atform-6.533
icians-6.529
ologne-6.472
ology-6.467
Token of
Token ID286
Feature activation+4.30e-01

Pos logit contributions
RANT+1.428
PLIED+1.290
pling+1.254
ples+1.222
gradient+1.221
Neg logit contributions
parency-2.023
icians-1.835
ology-1.813
ologne-1.789
atform-1.773
Token 15
Token ID1315
Feature activation+3.64e-02

Pos logit contributions
RANT+0.468
PLIED+0.425
pling+0.405
gradient+0.401
ples+0.394
Neg logit contributions
parency-0.661
ology-0.598
icians-0.595
ologne-0.579
atform-0.577
Token Photos
Token ID9434
Feature activation+1.48e+00

Pos logit contributions
RANT+0.050
PLIED+0.047
pling+0.045
ples+0.044
gradient+0.044
Neg logit contributions
parency-0.046
ology-0.040
icians-0.040
ologne-0.039
atform-0.038
Token:
Token ID25
Feature activation+9.50e-01

Pos logit contributions
RANT+1.695
PLIED+1.542
pling+1.476
gradient+1.472
ples+1.443
Neg logit contributions
parency-2.184
icians-1.961
ology-1.931
atform-1.906
ologne-1.900
Token as
Token ID355
Feature activation+3.62e-01

Pos logit contributions
RANT+0.917
PLIED+0.836
pling+0.821
gradient+0.800
ples+0.796
Neg logit contributions
parency-0.997
ology-0.895
icians-0.892
ologne-0.864
ttes-0.856
Token chairman
Token ID8900
Feature activation-1.13e-01

Pos logit contributions
RANT+0.439
PLIED+0.400
pling+0.390
antha+0.382
gradient+0.381
Neg logit contributions
parency-0.507
ology-0.459
icians-0.455
ologne-0.440
ttes-0.439
Token of
Token ID286
Feature activation+3.24e-01

Pos logit contributions
parency+0.164
ology+0.150
icians+0.148
ologne+0.144
ttes+0.142
Neg logit contributions
RANT-0.132
PLIED-0.120
pling-0.116
gradient-0.114
antha-0.113
Token PD
Token ID14340
Feature activation+7.99e-01

Pos logit contributions
RANT+0.373
PLIED+0.339
pling+0.329
gradient+0.321
antha+0.321
Neg logit contributions
parency-0.473
ology-0.431
icians-0.427
ologne-0.417
ttes-0.413
TokenV
Token ID53
Feature activation+8.44e-01

Pos logit contributions
RANT+0.787
PLIED+0.703
pling+0.677
gradient+0.646
ples+0.646
Neg logit contributions
parency-1.312
ology-1.203
icians-1.200
ttes-1.170
ologne-1.159
TokenSA
Token ID4090
Feature activation+4.10e+00

Pos logit contributions
RANT+0.857
PLIED+0.764
pling+0.731
gradient+0.708
antha+0.701
Neg logit contributions
parency-1.366
ology-1.260
icians-1.246
ologne-1.211
ttes-1.202
Token in
Token ID287
Feature activation+2.47e-02

Pos logit contributions
RANT+4.862
PLIED+4.409
pling+4.188
antha+4.078
ples+4.073
Neg logit contributions
parency-5.534
icians-5.025
ology-4.936
ttes-4.894
ologne-4.872
Token the
Token ID262
Feature activation+9.73e-02

Pos logit contributions
RANT+0.033
PLIED+0.030
pling+0.029
gradient+0.029
antha+0.029
Neg logit contributions
parency-0.032
ology-0.029
icians-0.028
ologne-0.027
ttes-0.027
Token resh
Token ID27179
Feature activation-4.52e-01

Pos logit contributions
RANT+0.132
PLIED+0.120
pling+0.119
gradient+0.115
antha+0.115
Neg logit contributions
parency-0.123
ology-0.110
icians-0.109
ologne-0.105
ttes-0.105
Tokenuffle
Token ID18137
Feature activation+2.48e-01

Pos logit contributions
parency+0.625
ology+0.562
icians+0.559
ttes+0.536
ologne+0.535
Neg logit contributions
RANT-0.560
PLIED-0.516
pling-0.509
antha-0.491
gradient-0.487
Token,
Token ID11
Feature activation+3.35e-01

Pos logit contributions
RANT+0.328
PLIED+0.301
pling+0.295
gradient+0.289
ples+0.287
Neg logit contributions
parency-0.324
ology-0.286
icians-0.285
ologne-0.276
ttes-0.274
Token\n
Token ID198
Feature activation+5.50e-01

Pos logit contributions
RANT+0.508
PLIED+0.467
pling+0.446
gradient+0.437
ples+0.434
Neg logit contributions
parency-0.539
ology-0.481
icians-0.478
ologne-0.466
atform-0.460
Token\n
Token ID198
Feature activation+3.36e-01

Pos logit contributions
RANT+0.696
PLIED+0.637
pling+0.618
gradient+0.607
ples+0.601
Neg logit contributions
parency-0.751
icians-0.671
ology-0.669
ologne-0.648
atform-0.645
TokenNewsletter
Token ID33031
Feature activation+2.25e+00

Pos logit contributions
RANT+0.402
PLIED+0.365
pling+0.351
gradient+0.344
ples+0.340
Neg logit contributions
parency-0.484
ology-0.440
icians-0.438
ologne-0.425
atform-0.424
Token Sign
Token ID5865
Feature activation+2.42e+00

Pos logit contributions
RANT+1.415
PLIED+1.140
pling+1.127
antha+1.059
ples+1.046
Neg logit contributions
parency-4.459
icians-4.099
ology-4.095
ologne-4.026
atform-4.004
Token Up
Token ID3205
Feature activation+1.02e+00

Pos logit contributions
RANT+2.742
PLIED+2.520
pling+2.480
ples+2.415
antha+2.384
Neg logit contributions
parency-3.563
icians-3.160
ology-3.160
ologne-3.095
atform-3.056
Token Continue
Token ID10054
Feature activation+4.09e+00

Pos logit contributions
RANT+1.018
PLIED+0.909
pling+0.880
gradient+0.848
ples+0.840
Neg logit contributions
parency-1.657
ology-1.511
icians-1.508
ologne-1.463
ttes-1.457
Token reading
Token ID3555
Feature activation+1.96e+00

Pos logit contributions
RANT+3.387
PLIED+2.990
pling+2.873
antha+2.747
gradient+2.740
Neg logit contributions
parency-7.096
ology-6.522
icians-6.509
ologne-6.432
atform-6.368
Token the
Token ID262
Feature activation+3.05e-01

Pos logit contributions
RANT+2.047
PLIED+1.857
pling+1.753
gradient+1.725
ples+1.712
Neg logit contributions
parency-3.063
ology-2.790
icians-2.774
ologne-2.734
atform-2.727
Token main
Token ID1388
Feature activation+3.35e-01

Pos logit contributions
RANT+0.382
PLIED+0.351
pling+0.340
gradient+0.333
antha+0.331
Neg logit contributions
parency-0.418
ology-0.378
icians-0.373
ologne-0.361
atform-0.357
Token story
Token ID1621
Feature activation+2.80e+00

Pos logit contributions
RANT+0.408
PLIED+0.375
pling+0.365
gradient+0.358
antha+0.353
Neg logit contributions
parency-0.471
ology-0.421
icians-0.419
ologne-0.407
ttes-0.403
Token Please
Token ID4222
Feature activation+1.32e+00

Pos logit contributions
RANT+2.581
PLIED+2.348
pling+2.190
gradient+2.097
ples+2.077
Neg logit contributions
parency-4.706
ology-4.245
icians-4.227
ologne-4.165
atform-4.075
Token.
Token ID13
Feature activation-5.57e-01

Pos logit contributions
RANT+0.583
PLIED+0.531
pling+0.519
antha+0.503
ples+0.502
Neg logit contributions
parency-0.591
icians-0.529
ology-0.526
ologne-0.510
atform-0.507
Token\n
Token ID198
Feature activation-5.59e-01

Pos logit contributions
parency+0.765
ology+0.686
icians+0.680
ologne+0.658
atform+0.651
Neg logit contributions
RANT-0.705
PLIED-0.646
pling-0.624
gradient-0.610
antha-0.606
Token\n
Token ID198
Feature activation-6.38e-01

Pos logit contributions
parency+0.731
ology+0.657
icians+0.649
ologne+0.627
atform+0.620
Neg logit contributions
RANT-0.738
PLIED-0.680
pling-0.661
gradient-0.648
ples-0.641
TokenRemember
Token ID16676
Feature activation+1.01e+00

Pos logit contributions
parency+0.928
ology+0.842
icians+0.838
ologne+0.810
atform+0.806
Neg logit contributions
RANT-0.755
PLIED-0.683
pling-0.660
gradient-0.644
ples-0.638
Token Colin
Token ID18373
Feature activation+4.03e-01

Pos logit contributions
RANT+1.229
PLIED+1.111
pling+1.098
gradient+1.069
ples+1.061
Neg logit contributions
parency-1.410
ology-1.269
icians-1.260
atform-1.224
ologne-1.212
Token Kaepernick
Token ID30112
Feature activation+4.04e+00

Pos logit contributions
RANT+0.391
PLIED+0.348
pling+0.343
gradient+0.327
ples+0.327
Neg logit contributions
parency-0.663
ology-0.610
icians-0.606
ologne-0.588
atform-0.586
Token?
Token ID30
Feature activation+3.13e-01

Pos logit contributions
RANT+4.730
pling+4.336
PLIED+4.235
ples+4.114
gradient+4.038
Neg logit contributions
parency-5.583
icians-5.071
ology-5.004
atform-4.982
ttes-4.885
Token Seahawks
Token ID16732
Feature activation+1.38e+00

Pos logit contributions
RANT+0.389
PLIED+0.354
pling+0.345
gradient+0.335
ples+0.334
Neg logit contributions
parency-0.434
ology-0.388
icians-0.387
ologne-0.373
atform-0.372
Token fans
Token ID3296
Feature activation+1.46e+00

Pos logit contributions
RANT+1.680
PLIED+1.547
pling+1.503
gradient+1.457
ples+1.456
Neg logit contributions
parency-1.907
ology-1.714
icians-1.711
atform-1.650
ologne-1.634
Token will
Token ID481
Feature activation+5.82e-02

Pos logit contributions
RANT+1.949
PLIED+1.814
pling+1.804
ples+1.762
antha+1.731
Neg logit contributions
parency-1.840
icians-1.613
ology-1.607
atform-1.551
ologne-1.525
Token no
Token ID645
Feature activation+1.26e+00

Pos logit contributions
RANT+0.093
PLIED+0.087
pling+0.087
ples+0.085
gradient+0.084
Neg logit contributions
parency-0.059
ology-0.050
icians-0.050
ologne-0.047
atform-0.047
Token pants
Token ID12581
Feature activation+2.18e+00

Pos logit contributions
RANT+0.548
PLIED+0.502
pling+0.485
gradient+0.474
ples+0.474
Neg logit contributions
parency-0.612
ology-0.550
icians-0.549
ologne-0.532
atform-0.528
Tokenuit
Token ID5013
Feature activation+1.44e+00

Pos logit contributions
RANT+3.373
PLIED+3.148
pling+3.087
ples+3.078
gradient+3.024
Neg logit contributions
parency-2.256
ology-1.955
ologne-1.904
icians-1.901
atform-1.871
Token GIF
Token ID24984
Feature activation+1.30e+00

Pos logit contributions
RANT+1.891
PLIED+1.755
pling+1.687
ples+1.672
gradient+1.657
Neg logit contributions
parency-1.843
icians-1.636
ology-1.629
atform-1.597
ologne-1.591
Tokens
Token ID82
Feature activation+6.84e-01

Pos logit contributions
RANT+1.548
PLIED+1.392
pling+1.345
ples+1.317
gradient+1.305
Neg logit contributions
parency-1.879
ology-1.706
icians-1.692
atform-1.657
ologne-1.650
Token on
Token ID319
Feature activation-9.17e-02

Pos logit contributions
RANT+0.778
PLIED+0.711
pling+0.695
ples+0.672
gradient+0.669
Neg logit contributions
parency-1.001
icians-0.908
ology-0.907
ologne-0.897
atform-0.881
Token Instagram
Token ID10767
Feature activation+4.04e+00

Pos logit contributions
parency+0.125
ology+0.111
icians+0.111
ologne+0.107
atform+0.106
Neg logit contributions
RANT-0.117
PLIED-0.107
pling-0.104
gradient-0.102
ples-0.101
Token,
Token ID11
Feature activation+2.78e-01

Pos logit contributions
RANT+4.500
PLIED+4.088
pling+4.055
ples+3.923
gradient+3.837
Neg logit contributions
parency-5.640
ologne-5.120
icians-5.106
atform-5.059
ology-5.044
Token and
Token ID290
Feature activation+4.52e-01

Pos logit contributions
RANT+0.324
PLIED+0.295
pling+0.288
gradient+0.278
ples+0.278
Neg logit contributions
parency-0.404
ology-0.366
icians-0.361
ologne-0.355
atform-0.351
Token who
Token ID508
Feature activation+3.54e-01

Pos logit contributions
RANT+0.557
PLIED+0.506
pling+0.501
ples+0.486
gradient+0.479
Neg logit contributions
parency-0.627
ology-0.562
icians-0.556
ologne-0.550
atform-0.542
Token joined
Token ID5399
Feature activation+6.82e-01

Pos logit contributions
RANT+0.458
PLIED+0.420
pling+0.417
ples+0.409
gradient+0.401
Neg logit contributions
parency-0.467
ology-0.415
icians-0.410
ologne-0.406
atform-0.400
Token Snapchat
Token ID31546
Feature activation+2.83e+00

Pos logit contributions
RANT+0.826
PLIED+0.750
pling+0.726
gradient+0.708
ples+0.705
Neg logit contributions
parency-0.965
ology-0.872
icians-0.864
ologne-0.841
atform-0.834
Tokenok
Token ID482
Feature activation-1.57e-01

Pos logit contributions
parency+0.092
ology+0.075
icians+0.073
ologne+0.070
ttes+0.069
Neg logit contributions
RANT-0.207
PLIED-0.196
pling-0.193
gradient-0.190
antha-0.189
Token.
Token ID13
Feature activation-3.40e-01

Pos logit contributions
parency+0.196
ology+0.175
icians+0.174
ologne+0.168
atform+0.167
Neg logit contributions
RANT-0.215
PLIED-0.198
pling-0.195
gradient-0.190
ples-0.189
Token CNN
Token ID8100
Feature activation+3.45e+00

Pos logit contributions
parency+0.468
ology+0.420
icians+0.417
ologne+0.404
ttes+0.399
Neg logit contributions
RANT-0.429
PLIED-0.393
pling-0.378
gradient-0.370
ples-0.367
Token president
Token ID1893
Feature activation+8.63e-01

Pos logit contributions
RANT+4.259
PLIED+3.873
pling+3.799
VERTISEMENT+3.711
antha+3.659
Neg logit contributions
parency-4.537
icians-4.027
ology-4.020
ologne-4.013
ttes-3.951
Token Jeff
Token ID5502
Feature activation+9.67e-01

Pos logit contributions
RANT+1.072
PLIED+0.987
pling+0.965
antha+0.944
gradient+0.936
Neg logit contributions
parency-1.185
icians-1.059
ology-1.058
ologne-1.040
ttes-1.015
Token Zucker
Token ID27629
Feature activation+4.03e+00

Pos logit contributions
RANT+0.753
PLIED+0.660
pling+0.632
antha+0.606
ples+0.595
Neg logit contributions
parency-1.786
ology-1.644
icians-1.643
ologne-1.609
atform-1.598
Token
Token ID447
Feature activation+2.54e-01

Pos logit contributions
RANT+4.824
pling+4.429
PLIED+4.336
ples+4.233
gradient+4.128
Neg logit contributions
parency-5.415
icians-4.854
ology-4.799
ologne-4.742
atform-4.728
Token
Token ID247
Feature activation+2.74e-02

Pos logit contributions
RANT+0.384
PLIED+0.354
pling+0.350
gradient+0.343
ples+0.340
Neg logit contributions
parency-0.282
ology-0.249
icians-0.247
ologne-0.238
atform-0.237
Tokens
Token ID82
Feature activation+4.21e-01

Pos logit contributions
RANT+0.033
PLIED+0.030
pling+0.030
ples+0.029
gradient+0.029
Neg logit contributions
parency-0.039
ology-0.035
icians-0.035
ologne-0.034
atform-0.033
Token then
Token ID788
Feature activation-4.34e-01

Pos logit contributions
RANT+0.526
PLIED+0.481
pling+0.470
gradient+0.457
ples+0.455
Neg logit contributions
parency-0.577
ology-0.521
icians-0.518
ologne-0.499
atform-0.496
Token 15
Token ID1315
Feature activation-1.03e-01

Pos logit contributions
parency+0.523
ology+0.465
icians+0.460
ologne+0.441
ttes+0.438
Neg logit contributions
RANT-0.617
PLIED-0.571
pling-0.556
gradient-0.546
antha-0.544
Token 2008
Token ID3648
Feature activation+1.28e+00

Pos logit contributions
RANT+1.226
PLIED+1.129
pling+1.097
gradient+1.088
ples+1.070
Neg logit contributions
parency-1.300
icians-1.157
ology-1.151
ologne-1.117
atform-1.101
Token-
Token ID12
Feature activation+7.81e-01

Pos logit contributions
RANT+1.699
PLIED+1.569
pling+1.534
gradient+1.515
ples+1.497
Neg logit contributions
parency-1.643
ology-1.455
icians-1.452
ttes-1.399
ologne-1.394
Token2011
Token ID9804
Feature activation+3.95e-01

Pos logit contributions
RANT+0.908
PLIED+0.824
pling+0.789
gradient+0.784
ples+0.761
Neg logit contributions
parency-1.145
ology-1.043
icians-1.041
ologne-1.005
ttes-1.002
Token),
Token ID828
Feature activation+3.69e-01

Pos logit contributions
RANT+0.510
PLIED+0.471
pling+0.454
gradient+0.449
ples+0.440
Neg logit contributions
parency-0.525
ology-0.468
icians-0.467
ologne-0.450
ttes-0.447
Token Larry
Token ID13633
Feature activation+2.32e-01

Pos logit contributions
RANT+0.460
PLIED+0.423
pling+0.411
gradient+0.403
antha+0.400
Neg logit contributions
parency-0.506
ology-0.456
icians-0.449
ologne-0.437
ttes-0.435
Token Kramer
Token ID41454
Feature activation+4.02e+00

Pos logit contributions
RANT+0.275
PLIED+0.251
pling+0.247
antha+0.242
ples+0.242
Neg logit contributions
parency-0.333
ology-0.297
icians-0.294
ologne-0.287
atform-0.285
Token (
Token ID357
Feature activation-7.99e-02

Pos logit contributions
RANT+4.341
pling+3.932
PLIED+3.900
gradient+3.770
ples+3.765
Neg logit contributions
parency-5.946
icians-5.289
ology-5.285
ologne-5.198
atform-5.188
Tokenpresident
Token ID22540
Feature activation+2.14e+00

Pos logit contributions
parency+0.116
ology+0.105
icians+0.104
ologne+0.101
atform+0.100
Neg logit contributions
RANT-0.094
PLIED-0.085
pling-0.083
gradient-0.082
ples-0.080
Token;
Token ID26
Feature activation+2.60e-01

Pos logit contributions
RANT+2.449
PLIED+2.232
pling+2.166
gradient+2.150
antha+2.122
Neg logit contributions
parency-3.052
ology-2.726
icians-2.714
ologne-2.712
ttes-2.662
Token director
Token ID3437
Feature activation+8.18e-01

Pos logit contributions
RANT+0.316
PLIED+0.291
pling+0.282
gradient+0.277
antha+0.274
Neg logit contributions
parency-0.364
ology-0.329
icians-0.325
ologne-0.315
ttes-0.314
Token Climate
Token ID13963
Feature activation-5.53e-01

Pos logit contributions
RANT+0.969
PLIED+0.890
pling+0.858
gradient+0.851
antha+0.837
Neg logit contributions
parency-1.163
ology-1.043
icians-1.041
ologne-1.016
ttes-1.002
Tokenica
Token ID3970
Feature activation-5.38e-01

Pos logit contributions
parency+0.509
ology+0.357
icians+0.323
ologne+0.290
atform+0.269
Neg logit contributions
RANT-2.636
PLIED-2.513
pling-2.486
antha-2.449
gradient-2.448
Token (@
Token ID4275
Feature activation+3.83e-01

Pos logit contributions
parency+0.714
ology+0.643
icians+0.633
ologne+0.613
atform+0.605
Neg logit contributions
RANT-0.700
PLIED-0.645
pling-0.626
gradient-0.611
antha-0.608
Tokendan
Token ID25604
Feature activation+2.06e+00

Pos logit contributions
RANT+0.367
PLIED+0.328
pling+0.314
gradient+0.306
antha+0.303
Neg logit contributions
parency-0.639
ology-0.590
icians-0.582
ologne-0.569
atform-0.565
Tokenmeric
Token ID946
Feature activation-3.59e-01

Pos logit contributions
RANT+1.787
PLIED+1.631
pling+1.615
antha+1.560
ples+1.557
Neg logit contributions
parency-3.469
ology-3.197
icians-3.165
atform-3.119
ttes-3.096
Tokena
Token ID64
Feature activation+5.75e-01

Pos logit contributions
parency+0.473
ology+0.435
icians+0.420
ologne+0.408
atform+0.404
Neg logit contributions
RANT-0.475
PLIED-0.432
pling-0.420
gradient-0.417
ples-0.410
TokenCNN
Token ID18474
Feature activation+3.99e+00

Pos logit contributions
RANT+0.571
PLIED+0.515
pling+0.493
ples+0.475
gradient+0.471
Neg logit contributions
parency-0.939
ology-0.857
icians-0.854
ologne-0.829
ttes-0.826
Token)
Token ID8
Feature activation+5.28e-01

Pos logit contributions
RANT+4.698
PLIED+4.273
pling+4.009
gradient+3.990
ples+3.957
Neg logit contributions
parency-5.538
icians-4.986
ology-4.917
ologne-4.809
atform-4.804
Token October
Token ID3267
Feature activation-2.34e+00

Pos logit contributions
RANT+0.744
PLIED+0.682
pling+0.668
ples+0.652
gradient+0.651
Neg logit contributions
parency-0.644
ology-0.574
icians-0.569
ologne-0.545
atform-0.541
Token 23
Token ID2242
Feature activation+2.73e-02

Pos logit contributions
parency+3.052
ology+2.795
icians+2.715
ologne+2.624
ttes+2.611
Neg logit contributions
RANT-3.043
PLIED-2.811
pling-2.742
gradient-2.684
yip-2.681
Token,
Token ID11
Feature activation+4.23e-01

Pos logit contributions
RANT+0.036
PLIED+0.033
pling+0.032
gradient+0.032
ples+0.031
Neg logit contributions
parency-0.036
icians-0.032
ology-0.032
ologne-0.031
atform-0.030
Token 2016
Token ID1584
Feature activation+5.21e-01

Pos logit contributions
RANT+0.614
PLIED+0.569
pling+0.555
gradient+0.544
ples+0.539
Neg logit contributions
parency-0.497
ology-0.441
icians-0.430
ologne-0.421
ttes-0.416
Token Irvine
Token ID37277
Feature activation+1.25e+00

Pos logit contributions
parency+0.709
ology+0.662
icians+0.656
ologne+0.641
ttes+0.639
Neg logit contributions
RANT-0.219
PLIED-0.180
pling-0.169
gradient-0.159
ples-0.157
Token,
Token ID11
Feature activation-3.44e-02

Pos logit contributions
RANT+1.669
PLIED+1.547
pling+1.501
ples+1.468
gradient+1.466
Neg logit contributions
parency-1.589
ology-1.425
icians-1.410
atform-1.366
ologne-1.353
Token California
Token ID3442
Feature activation+2.94e-01

Pos logit contributions
parency+0.056
ology+0.052
icians+0.051
ologne+0.050
ttes+0.050
Neg logit contributions
RANT-0.034
PLIED-0.030
pling-0.029
gradient-0.028
antha-0.028
Token.
Token ID13
Feature activation-3.81e-01

Pos logit contributions
RANT+0.437
PLIED+0.407
pling+0.396
gradient+0.390
antha+0.387
Neg logit contributions
parency-0.335
ology-0.298
icians-0.292
ologne-0.280
ttes-0.276
Token Hide
Token ID10415
Feature activation+6.76e-01

Pos logit contributions
parency+0.542
ology+0.488
icians+0.482
ologne+0.470
atform+0.463
Neg logit contributions
RANT-0.464
PLIED-0.424
pling-0.407
gradient-0.397
ples-0.396
Token Caption
Token ID11260
Feature activation+3.98e+00

Pos logit contributions
RANT+0.396
PLIED+0.327
pling+0.301
gradient+0.283
ples+0.277
Neg logit contributions
parency-1.380
ology-1.282
icians-1.279
ologne-1.251
atform-1.248
Token 10
Token ID838
Feature activation-6.42e-01

Pos logit contributions
RANT+3.785
PLIED+3.475
gradient+3.202
pling+3.176
ples+3.149
Neg logit contributions
parency-6.483
icians-5.850
ologne-5.752
atform-5.730
ttes-5.610
Token of
Token ID286
Feature activation+3.67e-01

Pos logit contributions
parency+0.923
ology+0.840
icians+0.828
ologne+0.803
ttes+0.795
Neg logit contributions
RANT-0.764
PLIED-0.698
pling-0.676
gradient-0.661
antha-0.654
Token 15
Token ID1315
Feature activation-7.01e-02

Pos logit contributions
RANT+0.408
PLIED+0.370
pling+0.355
gradient+0.350
ples+0.344
Neg logit contributions
parency-0.557
ology-0.505
icians-0.501
ologne-0.488
atform-0.485
Token Photos
Token ID9434
Feature activation+1.38e+00

Pos logit contributions
parency+0.087
ology+0.076
icians+0.076
ologne+0.073
atform+0.073
Neg logit contributions
RANT-0.097
PLIED-0.091
pling-0.088
gradient-0.086
ples-0.086
Token:
Token ID25
Feature activation+6.70e-01

Pos logit contributions
RANT+1.596
PLIED+1.454
pling+1.395
gradient+1.388
ples+1.362
Neg logit contributions
parency-2.008
icians-1.801
ology-1.780
atform-1.749
ologne-1.746
Token,
Token ID11
Feature activation+1.76e-01

Pos logit contributions
parency+1.330
ology+1.186
icians+1.185
ologne+1.143
atform+1.134
Neg logit contributions
RANT-1.265
PLIED-1.167
pling-1.142
gradient-1.109
antha-1.108
Token cour
Token ID1093
Feature activation+6.62e-01

Pos logit contributions
RANT+0.203
PLIED+0.186
pling+0.180
gradient+0.175
antha+0.174
Neg logit contributions
parency-0.258
ology-0.234
icians-0.232
ologne-0.225
atform-0.223
Tokenget
Token ID1136
Feature activation-4.73e-01

Pos logit contributions
RANT+0.945
pling+0.870
PLIED+0.870
antha+0.856
gradient+0.846
Neg logit contributions
parency-0.786
ology-0.697
icians-0.681
atform-0.659
ologne-0.654
Tokentes
Token ID4879
Feature activation-1.66e+00

Pos logit contributions
parency+0.494
ology+0.415
icians+0.412
atform+0.403
ologne+0.396
Neg logit contributions
RANT-0.748
PLIED-0.701
pling-0.680
antha-0.674
gradient-0.668
Token,
Token ID11
Feature activation+2.10e-01

Pos logit contributions
parency+2.207
ology+1.995
icians+1.961
ologne+1.896
ttes+1.878
Neg logit contributions
RANT-2.154
PLIED-1.983
pling-1.924
gradient-1.886
antha-1.868
Token basil
Token ID37792
Feature activation-5.49e+00

Pos logit contributions
RANT+0.235
PLIED+0.215
pling+0.208
gradient+0.202
ples+0.201
Neg logit contributions
parency-0.315
ology-0.287
icians-0.285
ologne-0.276
atform-0.274
Token and
Token ID290
Feature activation+8.85e-02

Pos logit contributions
parency+7.080
ology+6.598
icians+6.413
ological+6.157
ttes+6.147
Neg logit contributions
RANT-7.157
PLIED-6.579
gradient-6.212
pling-6.211
yip-6.192
Token spinach
Token ID39129
Feature activation+2.91e-01

Pos logit contributions
RANT+0.100
PLIED+0.091
pling+0.089
gradient+0.086
ples+0.086
Neg logit contributions
parency-0.132
ology-0.119
icians-0.119
ologne-0.115
atform-0.115
Token,
Token ID11
Feature activation+2.94e-01

Pos logit contributions
RANT+0.382
PLIED+0.356
pling+0.349
ples+0.339
antha+0.337
Neg logit contributions
parency-0.378
icians-0.336
ology-0.335
atform-0.319
ologne-0.319
Token and
Token ID290
Feature activation+1.58e-01

Pos logit contributions
RANT+0.356
PLIED+0.327
pling+0.318
gradient+0.309
antha+0.308
Neg logit contributions
parency-0.416
ology-0.376
icians-0.372
ologne-0.361
atform-0.358
Token named
Token ID3706
Feature activation-1.55e+00

Pos logit contributions
RANT+0.207
PLIED+0.192
pling+0.187
gradient+0.182
ples+0.181
Neg logit contributions
parency-0.207
ology-0.184
icians-0.183
ologne-0.177
atform-0.176
Tokenled
Token ID992
Feature activation-1.12e+00

Pos logit contributions
parency+0.611
ology+0.563
icians+0.558
ologne+0.545
atform+0.542
Neg logit contributions
RANT-0.325
PLIED-0.292
pling-0.279
gradient-0.269
ples-0.266
Token immortal
Token ID26156
Feature activation+1.21e+00

Pos logit contributions
parency+1.556
ology+1.408
icians+1.387
ologne+1.348
atform+1.334
Neg logit contributions
RANT-1.389
PLIED-1.274
pling-1.237
gradient-1.209
ples-1.197
Token eld
Token ID18441
Feature activation+8.69e-01

Pos logit contributions
RANT+1.554
PLIED+1.418
pling+1.395
ples+1.386
gradient+1.367
Neg logit contributions
parency-1.567
ology-1.382
icians-1.366
ologne-1.362
ttes-1.335
Tokenritch
Token ID46510
Feature activation-1.99e+00

Pos logit contributions
RANT+0.697
PLIED+0.592
pling+0.578
ples+0.567
antha+0.549
Neg logit contributions
parency-1.585
icians-1.444
ology-1.444
ologne-1.419
atform-1.403
Token mon
Token ID937
Feature activation-3.57e-01

Pos logit contributions
parency+2.440
ology+2.173
icians+2.141
ologne+2.070
ttes+2.040
Neg logit contributions
RANT-2.791
PLIED-2.583
pling-2.519
gradient-2.471
ples-2.459
Tokenstros
Token ID48288
Feature activation-4.88e+00

Pos logit contributions
parency+0.345
ology+0.292
icians+0.289
ologne+0.283
ttes+0.276
Neg logit contributions
RANT-0.584
PLIED-0.545
pling-0.543
ples-0.536
antha-0.534
Tokenities
Token ID871
Feature activation-1.14e+00

Pos logit contributions
parency+5.456
ology+4.927
icians+4.790
ologists+4.544
ttes+4.540
Neg logit contributions
RANT-7.266
PLIED-6.850
pling-6.597
gradient-6.530
antha-6.424
Token year
Token ID614
Feature activation-4.85e-01

Pos logit contributions
parency+1.602
ology+1.442
icians+1.433
ologne+1.400
atform+1.378
Neg logit contributions
RANT-1.399
PLIED-1.276
pling-1.249
ples-1.213
gradient-1.210
Token after
Token ID706
Feature activation+7.34e-01

Pos logit contributions
parency+0.685
ology+0.620
icians+0.611
ologne+0.593
atform+0.590
Neg logit contributions
RANT-0.593
PLIED-0.539
pling-0.524
gradient-0.511
ples-0.509
Token year
Token ID614
Feature activation-4.44e-01

Pos logit contributions
RANT+0.966
PLIED+0.878
pling+0.873
gradient+0.852
antha+0.846
Neg logit contributions
parency-0.956
ology-0.852
icians-0.847
ttes-0.824
ologne-0.820
Token and
Token ID290
Feature activation-1.87e-01

Pos logit contributions
parency+0.592
ology+0.531
icians+0.524
ologne+0.513
atform+0.508
Neg logit contributions
RANT-0.574
PLIED-0.525
pling-0.519
ples-0.502
gradient-0.501
Token
Token ID960
Feature activation+8.43e-01

Pos logit contributions
RANT+0.139
PLIED+0.125
pling+0.123
ples+0.119
gradient+0.119
Neg logit contributions
parency-0.191
ology-0.174
icians-0.173
ologne-0.168
ttes-0.167
TokenFederal
Token ID24099
Feature activation-3.13e-01

Pos logit contributions
RANT+1.110
PLIED+1.000
pling+0.996
gradient+0.980
ples+0.962
Neg logit contributions
parency-1.102
ology-0.992
icians-0.974
atform-0.958
ologne-0.952
Token Communications
Token ID14620
Feature activation-6.01e-01

Pos logit contributions
parency+0.522
ology+0.482
icians+0.480
ologne+0.471
atform+0.467
Neg logit contributions
RANT-0.296
PLIED-0.260
pling-0.252
gradient-0.246
ples-0.242
Token Commission
Token ID4513
Feature activation-5.66e-01

Pos logit contributions
parency+0.795
ology+0.722
icians+0.716
ologne+0.695
ttes+0.686
Neg logit contributions
RANT-0.766
PLIED-0.704
pling-0.691
gradient-0.672
ples-0.670
Token chairman
Token ID8900
Feature activation-4.08e-02

Pos logit contributions
parency+0.795
ology+0.720
icians+0.718
ologne+0.693
atform+0.690
Neg logit contributions
RANT-0.692
PLIED-0.631
pling-0.615
gradient-0.597
ples-0.592
Token Aj
Token ID22028
Feature activation-4.34e+00

Pos logit contributions
parency+0.051
icians+0.045
ology+0.045
ologne+0.044
atform+0.043
Neg logit contributions
RANT-0.056
PLIED-0.051
pling-0.051
antha-0.049
gradient-0.049
Tokenit
Token ID270
Feature activation+8.68e-01

Pos logit contributions
parency+4.904
icians+4.278
ology+4.210
ttes+3.919
ologne+3.883
Neg logit contributions
RANT-6.561
PLIED-6.031
gradient-5.804
pling-5.793
yip-5.781
Token Pai
Token ID41626
Feature activation+1.03e+00

Pos logit contributions
RANT+0.805
pling+0.667
PLIED+0.645
gradient+0.633
antha+0.606
Neg logit contributions
parency-1.471
icians-1.354
ologne-1.339
ology-1.337
atform-1.313
Token,
Token ID11
Feature activation+4.06e-01

Pos logit contributions
RANT+1.286
PLIED+1.175
pling+1.172
ples+1.130
gradient+1.113
Neg logit contributions
parency-1.397
icians-1.250
ology-1.249
atform-1.211
ologne-1.209
Token a
Token ID257
Feature activation-2.51e-02

Pos logit contributions
RANT+0.505
PLIED+0.458
pling+0.447
gradient+0.438
antha+0.435
Neg logit contributions
parency-0.562
ology-0.507
icians-0.501
ologne-0.486
ttes-0.482
Token Trump
Token ID1301
Feature activation+3.06e-01

Pos logit contributions
parency+0.033
ology+0.029
icians+0.029
atform+0.028
ttes+0.028
Neg logit contributions
RANT-0.033
PLIED-0.030
pling-0.030
gradient-0.029
antha-0.029
Token the
Token ID262
Feature activation-2.28e-01

Pos logit contributions
parency+0.850
ology+0.756
icians+0.747
ologne+0.721
atform+0.717
Neg logit contributions
RANT-0.927
PLIED-0.858
pling-0.835
gradient-0.824
ples-0.811
Token French
Token ID4141
Feature activation-9.46e-01

Pos logit contributions
parency+0.306
ology+0.272
icians+0.270
ologne+0.262
atform+0.260
Neg logit contributions
RANT-0.292
PLIED-0.270
pling-0.263
gradient-0.257
antha-0.256
Token Mediterranean
Token ID19517
Feature activation-9.71e-01

Pos logit contributions
parency+1.256
ology+1.128
icians+1.111
ologne+1.073
atform+1.066
Neg logit contributions
RANT-1.237
PLIED-1.136
pling-1.104
gradient-1.081
antha-1.073
Token island
Token ID7022
Feature activation+4.36e-01

Pos logit contributions
parency+1.207
ology+1.078
icians+1.060
ologne+1.023
atform+1.009
Neg logit contributions
RANT-1.345
PLIED-1.246
pling-1.213
gradient-1.189
antha-1.182
Token of
Token ID286
Feature activation+8.57e-02

Pos logit contributions
RANT+0.494
PLIED+0.447
pling+0.435
antha+0.423
ples+0.423
Neg logit contributions
parency-0.649
ology-0.588
icians-0.579
ologne-0.567
atform-0.563
Token Cors
Token ID26978
Feature activation-4.18e+00

Pos logit contributions
RANT+0.111
PLIED+0.102
pling+0.099
gradient+0.097
antha+0.096
Neg logit contributions
parency-0.114
ology-0.100
icians-0.100
atform-0.097
ologne-0.095
Tokenica
Token ID3970
Feature activation-1.59e+00

Pos logit contributions
parency+1.119
ology+0.635
icians+0.605
ologne+0.415
elia+0.344
Neg logit contributions
RANT-9.938
PLIED-9.390
pling-9.173
gradient-9.142
ples-9.017
Token Image
Token ID7412
Feature activation+7.43e-01

Pos logit contributions
parency+2.130
ology+1.929
icians+1.896
ologne+1.833
ttes+1.816
Neg logit contributions
RANT-2.039
PLIED-1.877
pling-1.821
gradient-1.784
antha-1.767
Token:
Token ID25
Feature activation+7.41e-01

Pos logit contributions
RANT+0.896
PLIED+0.821
pling+0.798
gradient+0.781
ples+0.774
Neg logit contributions
parency-1.054
icians-0.942
ology-0.941
ologne-0.911
atform-0.911
Token Flames
Token ID25814
Feature activation-1.04e+00

Pos logit contributions
RANT+0.922
PLIED+0.854
pling+0.814
ples+0.795
gradient+0.793
Neg logit contributions
parency-1.031
ology-0.921
icians-0.912
ologne-0.887
atform-0.882
Token threatened
Token ID8556
Feature activation-3.06e-01

Pos logit contributions
parency+1.392
ology+1.251
icians+1.233
ologne+1.192
atform+1.182
Neg logit contributions
RANT-1.350
PLIED-1.243
pling-1.211
gradient-1.186
ples-1.174
Tokenasting
Token ID9222
Feature activation+1.19e+00

Pos logit contributions
RANT+0.588
pling+0.542
PLIED+0.539
gradient+0.521
antha+0.519
Neg logit contributions
parency-0.668
icians-0.600
ology-0.598
ttes-0.584
ologne-0.584
Token a
Token ID257
Feature activation-1.16e-01

Pos logit contributions
RANT+1.442
PLIED+1.334
pling+1.306
gradient+1.290
ples+1.266
Neg logit contributions
parency-1.633
icians-1.473
ology-1.447
ologne-1.439
atform-1.394
Token high
Token ID1029
Feature activation+9.02e-01

Pos logit contributions
parency+0.165
icians+0.147
ology+0.146
ologne+0.144
atform+0.142
Neg logit contributions
RANT-0.139
PLIED-0.127
pling-0.126
gradient-0.123
ples-0.121
Token-
Token ID12
Feature activation+6.29e-01

Pos logit contributions
RANT+1.203
PLIED+1.116
gradient+1.090
pling+1.088
ples+1.059
Neg logit contributions
parency-1.149
icians-1.030
ology-1.008
ttes-0.988
ologne-0.987
Tokenpressure
Token ID36151
Feature activation+1.64e-01

Pos logit contributions
RANT+0.637
PLIED+0.575
gradient+0.564
pling+0.561
ples+0.540
Neg logit contributions
parency-1.003
icians-0.920
ology-0.916
ologne-0.896
ttes-0.889
Token conco
Token ID33463
Feature activation-4.18e+00

Pos logit contributions
RANT+0.203
PLIED+0.186
pling+0.182
gradient+0.180
ples+0.175
Neg logit contributions
parency-0.229
icians-0.204
ology-0.201
ologne-0.198
ttes-0.195
Tokenction
Token ID596
Feature activation-9.68e-01

Pos logit contributions
parency+3.907
ology+3.404
icians+3.136
ologne+3.095
ological+3.045
Neg logit contributions
RANT-7.004
PLIED-6.640
VERTISEMENT-6.282
antha-6.245
pling-6.206
Token of
Token ID286
Feature activation+5.12e-01

Pos logit contributions
parency+1.390
icians+1.252
ology+1.244
ologne+1.212
ttes+1.191
Neg logit contributions
RANT-1.153
PLIED-1.057
pling-1.026
gradient-1.015
ples-0.999
Token water
Token ID1660
Feature activation-4.27e-01

Pos logit contributions
RANT+0.678
PLIED+0.627
pling+0.612
gradient+0.605
ples+0.600
Neg logit contributions
parency-0.661
icians-0.589
ology-0.579
ologne-0.572
atform-0.566
Token,
Token ID11
Feature activation+6.11e-02

Pos logit contributions
parency+0.616
ology+0.557
icians+0.555
ologne+0.541
ttes+0.531
Neg logit contributions
RANT-0.501
PLIED-0.462
pling-0.447
gradient-0.444
ples-0.436
Token sand
Token ID6450
Feature activation+2.75e-01

Pos logit contributions
RANT+0.084
PLIED+0.078
pling+0.076
gradient+0.075
ples+0.074
Neg logit contributions
parency-0.077
ology-0.067
icians-0.067
ologne-0.065
atform-0.065
Token Fla
Token ID22026
Feature activation-3.20e-01

Pos logit contributions
RANT+0.394
PLIED+0.356
pling+0.345
gradient+0.335
ples+0.334
Neg logit contributions
parency-0.514
ology-0.467
icians-0.464
ologne-0.451
atform-0.450
Token.,
Token ID1539
Feature activation+1.98e-01

Pos logit contributions
parency+0.346
ology+0.304
icians+0.298
ologne+0.285
ttes+0.282
Neg logit contributions
RANT-0.495
PLIED-0.462
pling-0.452
gradient-0.442
ples-0.441
Token where
Token ID810
Feature activation+6.69e-01

Pos logit contributions
RANT+0.231
PLIED+0.210
pling+0.203
gradient+0.197
ples+0.196
Neg logit contributions
parency-0.289
ology-0.262
icians-0.260
ologne-0.252
ttes-0.250
Token killer
Token ID11868
Feature activation-1.80e+00

Pos logit contributions
RANT+0.816
PLIED+0.745
pling+0.723
ples+0.703
antha+0.702
Neg logit contributions
parency-0.933
ology-0.847
icians-0.840
ologne-0.818
atform-0.813
Token whale
Token ID22206
Feature activation-1.39e+00

Pos logit contributions
parency+2.117
ology+1.888
icians+1.846
ologne+1.767
ttes+1.757
Neg logit contributions
RANT-2.597
PLIED-2.416
pling-2.350
gradient-2.313
antha-2.289
Token Til
Token ID40422
Feature activation-4.16e+00

Pos logit contributions
parency+1.881
ology+1.701
icians+1.676
ologne+1.622
ttes+1.604
Neg logit contributions
RANT-1.766
PLIED-1.624
pling-1.577
gradient-1.541
antha-1.529
Tokenik
Token ID1134
Feature activation+4.55e-01

Pos logit contributions
parency+4.587
ology+4.142
icians+4.129
ological+3.816
elia+3.780
Neg logit contributions
RANT-6.253
PLIED-5.828
gradient-5.513
yip-5.500
VERTISEMENT-5.435
Tokenum
Token ID388
Feature activation+8.16e-01

Pos logit contributions
RANT+0.552
PLIED+0.504
pling+0.491
gradient+0.478
antha+0.477
Neg logit contributions
parency-0.642
ology-0.583
icians-0.576
ologne-0.557
ttes-0.553
Token was
Token ID373
Feature activation+1.78e-01

Pos logit contributions
RANT+1.021
PLIED+0.940
pling+0.918
ples+0.896
antha+0.887
Neg logit contributions
parency-1.107
icians-0.995
ology-0.995
ologne-0.961
atform-0.956
Token involved
Token ID2950
Feature activation+2.16e-01

Pos logit contributions
RANT+0.241
PLIED+0.223
pling+0.219
ples+0.212
gradient+0.212
Neg logit contributions
parency-0.228
ology-0.203
icians-0.201
ologne-0.195
atform-0.192
Token in
Token ID287
Feature activation-4.18e-01

Pos logit contributions
RANT+0.272
PLIED+0.248
pling+0.245
ples+0.237
gradient+0.235
Neg logit contributions
parency-0.295
ology-0.264
icians-0.264
ologne-0.254
ttes-0.252
Token series
Token ID2168
Feature activation-7.56e-01

Pos logit contributions
parency+0.606
ology+0.544
icians+0.540
ologne+0.522
atform+0.519
Neg logit contributions
RANT-0.567
PLIED-0.520
pling-0.507
gradient-0.498
ples-0.493
Token of
Token ID286
Feature activation-8.15e-02

Pos logit contributions
parency+1.062
ology+0.954
icians+0.947
ologne+0.921
atform+0.915
Neg logit contributions
RANT-0.924
PLIED-0.848
pling-0.825
gradient-0.808
ples-0.799
Token the
Token ID262
Feature activation-2.65e-01

Pos logit contributions
parency+0.114
ology+0.103
icians+0.102
ologne+0.099
atform+0.098
Neg logit contributions
RANT-0.100
PLIED-0.092
pling-0.090
gradient-0.088
ples-0.087
Token inter
Token ID987
Feature activation+1.16e+00

Pos logit contributions
parency+0.362
ology+0.325
icians+0.323
ologne+0.312
atform+0.310
Neg logit contributions
RANT-0.335
PLIED-0.308
pling-0.299
gradient-0.295
antha-0.291
Token-
Token ID12
Feature activation-1.88e-03

Pos logit contributions
RANT+1.346
PLIED+1.220
pling+1.210
gradient+1.180
ples+1.177
Neg logit contributions
parency-1.676
ology-1.517
icians-1.510
ologne-1.472
atform-1.462
Tokentrop
Token ID48385
Feature activation-4.16e+00

Pos logit contributions
parency+0.003
ology+0.002
icians+0.002
ologne+0.002
atform+0.002
Neg logit contributions
RANT-0.002
PLIED-0.002
pling-0.002
gradient-0.002
antha-0.002
Tokenical
Token ID605
Feature activation-9.37e-01

Pos logit contributions
parency+2.793
ology+2.446
icians+2.283
ological+2.126
ologists+2.064
Neg logit contributions
RANT-8.048
PLIED-7.596
pling-7.464
gradient-7.315
ples-7.281
Token temperature
Token ID5951
Feature activation-1.08e+00

Pos logit contributions
parency+1.276
ology+1.147
icians+1.135
ologne+1.101
atform+1.090
Neg logit contributions
RANT-1.189
PLIED-1.092
pling-1.062
gradient-1.043
antha-1.033
Token anomalies
Token ID35907
Feature activation-1.15e+00

Pos logit contributions
parency+1.442
ology+1.288
icians+1.273
ologne+1.246
atform+1.230
Neg logit contributions
RANT-1.407
PLIED-1.301
pling-1.272
gradient-1.248
ples-1.236
Token T
Token ID309
Feature activation-4.31e-01

Pos logit contributions
parency+1.540
ology+1.374
icians+1.360
ologne+1.322
atform+1.310
Neg logit contributions
RANT-1.484
PLIED-1.367
pling-1.332
gradient-1.305
ples-1.293
Token(
Token ID7
Feature activation-5.45e-01

Pos logit contributions
parency+0.627
ology+0.564
icians+0.561
ologne+0.545
atform+0.544
Neg logit contributions
RANT-0.510
PLIED-0.467
pling-0.453
gradient-0.438
antha-0.437
Token
Token ID564
Feature activation+4.77e-01

Pos logit contributions
RANT+0.437
PLIED+0.403
pling+0.393
gradient+0.384
ples+0.380
Neg logit contributions
parency-0.416
ology-0.373
icians-0.369
ologne-0.357
ttes-0.353
Token
Token ID250
Feature activation-1.53e-01

Pos logit contributions
RANT+0.510
PLIED+0.454
pling+0.448
gradient+0.431
ples+0.429
Neg logit contributions
parency-0.742
ology-0.682
icians-0.677
ologne-0.659
atform-0.650
TokenC
Token ID34
Feature activation-5.87e-01

Pos logit contributions
parency+0.215
ology+0.195
icians+0.193
ologne+0.188
atform+0.186
Neg logit contributions
RANT-0.188
PLIED-0.170
pling-0.167
gradient-0.161
antha-0.161
Tokenute
Token ID1133
Feature activation+8.18e-01

Pos logit contributions
parency+0.835
ology+0.756
icians+0.749
ttes+0.721
ologne+0.717
Neg logit contributions
RANT-0.709
PLIED-0.648
pling-0.633
antha-0.611
ples-0.611
Token video
Token ID2008
Feature activation+1.16e+00

Pos logit contributions
RANT+1.049
PLIED+0.961
pling+0.939
gradient+0.912
antha+0.912
Neg logit contributions
parency-1.096
ology-0.983
icians-0.973
ologne-0.951
atform-0.940
Token Aj
Token ID22028
Feature activation-4.13e+00

Pos logit contributions
RANT+1.450
PLIED+1.333
pling+1.305
gradient+1.265
antha+1.262
Neg logit contributions
parency-1.575
ology-1.398
icians-1.393
ologne-1.364
atform-1.355
Tokenit
Token ID270
Feature activation+1.17e+00

Pos logit contributions
parency+4.724
ology+4.161
icians+4.133
atform+3.823
ologists+3.812
Neg logit contributions
RANT-5.978
PLIED-5.593
gradient-5.418
pling-5.337
yip-5.272
Token
Token ID564
Feature activation+6.38e-01

Pos logit contributions
RANT+1.333
pling+1.195
PLIED+1.188
ples+1.141
gradient+1.125
Neg logit contributions
parency-1.726
ology-1.560
icians-1.551
ologne-1.524
atform-1.502
Token
Token ID246
Feature activation+4.70e-01

Pos logit contributions
RANT+0.699
PLIED+0.626
pling+0.615
gradient+0.593
ples+0.590
Neg logit contributions
parency-0.975
ology-0.894
icians-0.888
ologne-0.862
atform-0.854
TokenAren
Token ID43199
Feature activation-6.30e-02

Pos logit contributions
RANT+0.556
PLIED+0.501
pling+0.493
gradient+0.477
ples+0.475
Neg logit contributions
parency-0.679
ology-0.615
icians-0.610
ologne-0.595
atform-0.590
Token
Token ID447
Feature activation-1.89e-02

Pos logit contributions
parency+0.072
ology+0.064
icians+0.063
ologne+0.061
atform+0.060
Neg logit contributions
RANT-0.093
PLIED-0.086
pling-0.085
antha-0.083
gradient-0.083
Token through
Token ID832
Feature activation-8.05e-02

Pos logit contributions
RANT+2.920
PLIED+2.691
pling+2.568
antha+2.478
ples+2.445
Neg logit contributions
parency-5.145
icians-4.728
ology-4.719
ologne-4.584
atform-4.580
Token for
Token ID329
Feature activation+1.33e-01

Pos logit contributions
parency+0.112
ology+0.102
icians+0.100
ologne+0.097
atform+0.096
Neg logit contributions
RANT-0.099
PLIED-0.091
pling-0.088
gradient-0.086
antha-0.086
Token more
Token ID517
Feature activation-3.65e-01

Pos logit contributions
RANT+0.167
PLIED+0.153
pling+0.148
gradient+0.146
antha+0.146
Neg logit contributions
parency-0.181
ology-0.164
icians-0.162
ologne-0.156
atform-0.155
Token images
Token ID4263
Feature activation+3.57e-01

Pos logit contributions
parency+0.480
ology+0.432
icians+0.426
ologne+0.412
atform+0.408
Neg logit contributions
RANT-0.479
PLIED-0.441
pling-0.430
gradient-0.421
antha-0.419
Token from
Token ID422
Feature activation+8.08e-01

Pos logit contributions
RANT+0.448
PLIED+0.411
pling+0.400
antha+0.391
ples+0.390
Neg logit contributions
parency-0.488
ology-0.437
icians-0.435
ologne-0.424
atform-0.419
Token Cors
Token ID26978
Feature activation-4.02e+00

Pos logit contributions
RANT+0.978
PLIED+0.892
pling+0.868
ples+0.842
antha+0.838
Neg logit contributions
parency-1.137
ology-1.027
icians-1.015
ologne-0.983
atform-0.979
Tokenica
Token ID3970
Feature activation-1.34e+00

Pos logit contributions
parency+1.673
ology+1.225
icians+1.170
ologne+0.943
ological+0.890
Neg logit contributions
RANT-8.906
PLIED-8.426
pling-8.291
gradient-8.188
ples-8.142
Token /
Token ID1220
Feature activation+4.49e-01

Pos logit contributions
parency+1.750
ology+1.578
icians+1.551
ologne+1.498
ttes+1.483
Neg logit contributions
RANT-1.770
PLIED-1.632
pling-1.586
gradient-1.554
antha-1.540
Token\n
Token ID198
Feature activation+8.99e-01

Pos logit contributions
RANT+0.575
PLIED+0.526
pling+0.507
antha+0.495
ples+0.494
Neg logit contributions
parency-0.610
ology-0.547
icians-0.538
ologne-0.523
atform-0.520
Token\n
Token ID198
Feature activation+8.79e-01

Pos logit contributions
RANT+1.124
PLIED+1.025
pling+0.981
gradient+0.969
ples+0.955
Neg logit contributions
parency-1.244
icians-1.119
ology-1.109
ologne-1.075
atform-1.074
TokenLocal
Token ID14565
Feature activation+8.32e-01

Pos logit contributions
RANT+1.035
PLIED+0.929
pling+0.901
gradient+0.880
ples+0.871
Neg logit contributions
parency-1.281
ology-1.168
icians-1.161
atform-1.126
ologne-1.121
Token signed
Token ID4488
Feature activation-6.42e-01

Pos logit contributions
parency+0.437
ology+0.392
icians+0.385
ologne+0.374
atform+0.372
Neg logit contributions
RANT-0.430
PLIED-0.397
pling-0.388
gradient-0.379
ples-0.378
Token former
Token ID1966
Feature activation-2.87e-01

Pos logit contributions
parency+0.889
ology+0.798
icians+0.791
ologne+0.765
atform+0.759
Neg logit contributions
RANT-0.800
PLIED-0.732
pling-0.713
gradient-0.695
ples-0.694
Token Denver
Token ID10656
Feature activation-1.92e-01

Pos logit contributions
parency+0.397
ology+0.355
icians+0.353
ologne+0.340
atform+0.340
Neg logit contributions
RANT-0.357
PLIED-0.325
pling-0.318
ples-0.311
gradient-0.310
Token Broncos
Token ID16582
Feature activation+4.23e-01

Pos logit contributions
parency+0.287
ology+0.262
icians+0.258
ologne+0.251
atform+0.248
Neg logit contributions
RANT-0.218
PLIED-0.199
pling-0.192
gradient-0.187
ples-0.186
Token quarterback
Token ID7880
Feature activation+1.22e+00

Pos logit contributions
RANT+0.498
PLIED+0.454
pling+0.445
ples+0.433
gradient+0.432
Neg logit contributions
parency-0.608
ology-0.548
icians-0.543
ologne-0.529
atform-0.522
Token Brock
Token ID20501
Feature activation-3.98e+00

Pos logit contributions
RANT+1.496
PLIED+1.348
pling+1.346
gradient+1.317
ples+1.315
Neg logit contributions
parency-1.679
icians-1.501
ology-1.479
ologne-1.471
atform-1.449
Token Os
Token ID8834
Feature activation-5.54e-01

Pos logit contributions
parency+4.766
ology+4.313
icians+4.182
ologne+4.017
ologists+3.973
Neg logit contributions
RANT-5.654
PLIED-5.291
pling-5.098
antha-5.046
gradient-5.011
Tokenwe
Token ID732
Feature activation+6.35e-01

Pos logit contributions
parency+0.723
ology+0.654
icians+0.643
ologne+0.620
ttes+0.614
Neg logit contributions
RANT-0.733
PLIED-0.677
pling-0.657
gradient-0.643
antha-0.638
Tokeniler
Token ID5329
Feature activation-1.44e+00

Pos logit contributions
RANT+0.930
PLIED+0.859
pling+0.851
ples+0.826
gradient+0.823
Neg logit contributions
parency-0.734
ology-0.642
icians-0.639
ologne-0.631
atform-0.619
Token to
Token ID284
Feature activation+3.83e-02

Pos logit contributions
parency+1.900
ology+1.712
icians+1.685
ologne+1.628
atform+1.612
Neg logit contributions
RANT-1.897
PLIED-1.747
pling-1.698
gradient-1.662
ples-1.647
Token a
Token ID257
Feature activation-1.43e-01

Pos logit contributions
RANT+0.041
PLIED+0.037
pling+0.036
gradient+0.035
ples+0.035
Neg logit contributions
parency-0.060
ology-0.054
icians-0.054
ologne-0.052
atform-0.052
Token
Token ID250
Feature activation-7.23e-01

Pos logit contributions
parency+0.316
ology+0.290
icians+0.286
ologne+0.278
atform+0.275
Neg logit contributions
RANT-0.226
PLIED-0.204
pling-0.199
gradient-0.192
ples-0.190
TokenPrime
Token ID26405
Feature activation-1.26e+00

Pos logit contributions
parency+0.980
ology+0.887
icians+0.875
ologne+0.848
ttes+0.840
Neg logit contributions
RANT-0.921
PLIED-0.843
pling-0.820
gradient-0.797
antha-0.793
Token ministers
Token ID14138
Feature activation-1.06e+00

Pos logit contributions
parency+1.244
ology+1.083
icians+1.058
ologne+1.009
ttes+0.993
Neg logit contributions
RANT-2.059
PLIED-1.930
pling-1.886
gradient-1.856
antha-1.843
Token pick
Token ID2298
Feature activation-1.00e+00

Pos logit contributions
parency+1.375
ology+1.235
icians+1.212
ologne+1.175
ttes+1.165
Neg logit contributions
RANT-1.410
PLIED-1.301
pling-1.272
gradient-1.240
ples-1.232
Token their
Token ID511
Feature activation+2.16e-01

Pos logit contributions
parency+1.316
ology+1.186
icians+1.166
ologne+1.127
ttes+1.114
Neg logit contributions
RANT-1.322
PLIED-1.220
pling-1.185
gradient-1.161
antha-1.151
Token cabinets
Token ID43064
Feature activation-3.96e+00

Pos logit contributions
RANT+0.288
PLIED+0.267
pling+0.261
gradient+0.256
ples+0.255
Neg logit contributions
parency-0.277
ology-0.247
icians-0.243
ologne-0.236
atform-0.236
Token for
Token ID329
Feature activation-2.51e-01

Pos logit contributions
parency+5.112
ology+4.637
icians+4.555
ologne+4.367
ttes+4.340
Neg logit contributions
RANT-5.218
PLIED-4.835
pling-4.641
gradient-4.592
antha-4.536
Token several
Token ID1811
Feature activation+5.84e-02

Pos logit contributions
parency+0.355
ology+0.322
icians+0.317
ologne+0.309
atform+0.307
Neg logit contributions
RANT-0.304
PLIED-0.279
pling-0.271
gradient-0.265
ples-0.262
Token reasons
Token ID3840
Feature activation+8.18e-01

Pos logit contributions
RANT+0.066
PLIED+0.060
pling+0.058
ples+0.057
gradient+0.057
Neg logit contributions
parency-0.087
ology-0.079
icians-0.078
ologne-0.077
atform-0.076
Token.
Token ID13
Feature activation-3.79e-01

Pos logit contributions
RANT+1.057
PLIED+0.972
pling+0.957
ples+0.930
gradient+0.922
Neg logit contributions
parency-1.083
ology-0.963
icians-0.963
ologne-0.933
atform-0.924
Token Among
Token ID9754
Feature activation-1.33e-01

Pos logit contributions
parency+0.518
ology+0.467
icians+0.462
ologne+0.447
atform+0.442
Neg logit contributions
RANT-0.477
PLIED-0.439
pling-0.424
gradient-0.415
antha-0.412
Token Gold
Token ID3561
Feature activation-7.59e-01

Pos logit contributions
parency+0.033
ology+0.030
icians+0.030
ologne+0.029
atform+0.029
Neg logit contributions
RANT-0.026
PLIED-0.023
pling-0.023
gradient-0.022
antha-0.022
Token,
Token ID11
Feature activation-7.33e-02

Pos logit contributions
parency+1.012
ology+0.911
icians+0.901
ologne+0.873
atform+0.861
Neg logit contributions
RANT-0.983
PLIED-0.905
pling-0.883
gradient-0.860
antha-0.854
Token Gold
Token ID3561
Feature activation-8.75e-01

Pos logit contributions
parency+0.106
ology+0.095
icians+0.095
ologne+0.092
atform+0.091
Neg logit contributions
RANT-0.087
PLIED-0.079
pling-0.077
gradient-0.075
ples-0.075
Token
Token ID784
Feature activation+1.84e-01

Pos logit contributions
parency+1.146
ology+1.030
icians+1.019
ologne+0.986
ttes+0.973
Neg logit contributions
RANT-1.156
PLIED-1.062
pling-1.036
gradient-1.011
antha-1.005
Token A
Token ID317
Feature activation-5.81e-01

Pos logit contributions
RANT+0.225
PLIED+0.206
pling+0.200
gradient+0.195
antha+0.194
Neg logit contributions
parency-0.257
ology-0.232
icians-0.230
ologne-0.223
ttes-0.220
Token Ton
Token ID16859
Feature activation-3.94e+00

Pos logit contributions
parency+0.801
ology+0.721
icians+0.715
ologne+0.691
atform+0.686
Neg logit contributions
RANT-0.728
PLIED-0.666
pling-0.650
gradient-0.632
antha-0.630
Token of
Token ID286
Feature activation+3.64e-01

Pos logit contributions
parency+5.214
ology+4.800
icians+4.740
ological+4.503
ttes+4.477
Neg logit contributions
RANT-5.052
PLIED-4.679
pling-4.389
gradient-4.376
antha-4.292
Token Gold
Token ID3561
Feature activation-6.56e-01

Pos logit contributions
RANT+0.433
PLIED+0.395
pling+0.384
gradient+0.375
antha+0.372
Neg logit contributions
parency-0.520
ology-0.467
icians-0.466
ologne-0.451
atform-0.450
Token
Token ID447
Feature activation-3.01e-01

Pos logit contributions
parency+0.897
ology+0.802
icians+0.797
ologne+0.776
atform+0.765
Neg logit contributions
RANT-0.828
PLIED-0.760
pling-0.742
gradient-0.723
ples-0.719
Token
Token ID251
Feature activation-4.55e-02

Pos logit contributions
parency+0.464
ology+0.425
icians+0.420
ologne+0.409
atform+0.405
Neg logit contributions
RANT-0.327
PLIED-0.294
pling-0.286
gradient-0.278
ples-0.276
Token-
Token ID12
Feature activation+3.62e-01

Pos logit contributions
parency+0.062
ology+0.055
icians+0.055
ologne+0.053
atform+0.052
Neg logit contributions
RANT-0.058
PLIED-0.053
pling-0.052
gradient-0.050
ples-0.050
Token 5
Token ID642
Feature activation-5.87e-02

Pos logit contributions
parency+0.419
ology+0.381
icians+0.375
ologne+0.363
ttes+0.360
Neg logit contributions
RANT-0.360
PLIED-0.330
pling-0.320
gradient-0.313
antha-0.310
Token lit
Token ID6578
Feature activation-1.83e+00

Pos logit contributions
parency+0.074
ology+0.066
icians+0.065
atform+0.063
ologne+0.063
Neg logit contributions
RANT-0.080
PLIED-0.074
pling-0.072
gradient-0.071
ples-0.071
Tokeners
Token ID364
Feature activation+4.60e-01

Pos logit contributions
parency+2.742
ology+2.522
icians+2.496
ttes+2.408
ologne+2.399
Neg logit contributions
RANT-2.067
PLIED-1.858
pling-1.793
gradient-1.748
antha-1.740
Token into
Token ID656
Feature activation+7.03e-01

Pos logit contributions
RANT+0.539
PLIED+0.488
pling+0.477
gradient+0.469
ples+0.463
Neg logit contributions
parency-0.667
ology-0.606
icians-0.603
ologne-0.580
atform-0.576
Token the
Token ID262
Feature activation+5.09e-02

Pos logit contributions
RANT+0.791
PLIED+0.717
pling+0.698
gradient+0.684
ples+0.676
Neg logit contributions
parency-1.056
ology-0.960
icians-0.952
ologne-0.921
atform-0.914
Token mash
Token ID30407
Feature activation-3.93e+00

Pos logit contributions
RANT+0.069
PLIED+0.063
pling+0.062
gradient+0.061
ples+0.060
Neg logit contributions
parency-0.065
ology-0.058
icians-0.058
ologne-0.056
atform-0.055
Token tun
Token ID6278
Feature activation+1.31e-01

Pos logit contributions
parency+4.778
ology+4.357
icians+4.180
ttes+4.075
ological+4.037
Neg logit contributions
RANT-5.327
PLIED-5.005
pling-4.765
antha-4.686
VERTISEMENT-4.651
Token,
Token ID11
Feature activation-1.27e-01

Pos logit contributions
RANT+0.172
PLIED+0.158
pling+0.155
gradient+0.151
ples+0.151
Neg logit contributions
parency-0.171
ology-0.153
icians-0.151
ologne-0.147
atform-0.145
Token then
Token ID788
Feature activation-5.30e-01

Pos logit contributions
parency+0.177
ology+0.160
icians+0.158
ologne+0.153
atform+0.151
Neg logit contributions
RANT-0.156
PLIED-0.143
pling-0.140
gradient-0.137
antha-0.135
Token add
Token ID751
Feature activation-1.05e+00

Pos logit contributions
parency+0.633
ology+0.560
icians+0.553
ologne+0.533
atform+0.526
Neg logit contributions
RANT-0.759
PLIED-0.704
pling-0.692
gradient-0.680
ples-0.672
Token the
Token ID262
Feature activation-1.38e-01

Pos logit contributions
parency+1.521
ology+1.388
icians+1.365
ologne+1.324
ttes+1.311
Neg logit contributions
RANT-1.236
PLIED-1.129
pling-1.090
gradient-1.064
antha-1.055
Token group
Token ID1448
Feature activation+9.36e-02

Pos logit contributions
parency+1.619
ology+1.451
icians+1.437
ologne+1.386
ttes+1.375
Neg logit contributions
RANT-1.639
PLIED-1.516
pling-1.473
gradient-1.438
antha-1.428
Token Raqqa
Token ID30391
Feature activation+3.06e-01

Pos logit contributions
RANT+0.122
PLIED+0.112
pling+0.108
gradient+0.106
ples+0.105
Neg logit contributions
parency-0.124
ology-0.111
icians-0.111
ologne-0.108
ttes-0.106
Token is
Token ID318
Feature activation+8.73e-02

Pos logit contributions
RANT+0.380
PLIED+0.349
pling+0.336
gradient+0.330
antha+0.327
Neg logit contributions
parency-0.422
ology-0.379
icians-0.378
ologne-0.366
ttes-0.362
Token Being
Token ID11204
Feature activation+2.77e-02

Pos logit contributions
RANT+0.112
PLIED+0.104
pling+0.101
gradient+0.098
ples+0.097
Neg logit contributions
parency-0.117
ology-0.105
icians-0.104
ologne-0.099
atform-0.099
Token Sl
Token ID3454
Feature activation+1.69e-01

Pos logit contributions
RANT+0.026
PLIED+0.023
pling+0.022
gradient+0.021
antha+0.021
Neg logit contributions
parency-0.047
ology-0.044
icians-0.043
ologne-0.042
ttes-0.042
Tokenaughtered
Token ID32734
Feature activation-3.85e+00

Pos logit contributions
RANT+0.219
PLIED+0.201
pling+0.197
gradient+0.194
antha+0.192
Neg logit contributions
parency-0.224
ology-0.200
icians-0.197
ologne-0.192
ttes-0.190
Token Sil
Token ID4243
Feature activation-2.56e-01

Pos logit contributions
parency+4.906
ology+4.447
icians+4.307
ologne+4.173
atform+4.137
Neg logit contributions
RANT-5.129
PLIED-4.807
pling-4.641
antha-4.561
gradient-4.536
Tokenently
Token ID1473
Feature activation-1.68e+00

Pos logit contributions
parency+0.425
ology+0.391
icians+0.387
ologne+0.378
ttes+0.375
Neg logit contributions
RANT-0.247
PLIED-0.220
pling-0.211
gradient-0.206
ples-0.202
Token.
Token ID13
Feature activation+6.34e-02

Pos logit contributions
parency+2.095
ology+1.872
icians+1.847
ologne+1.780
ttes+1.764
Neg logit contributions
RANT-2.314
PLIED-2.138
pling-2.080
gradient-2.033
ples-2.017
Token\n
Token ID198
Feature activation+3.53e-01

Pos logit contributions
RANT+0.082
PLIED+0.075
pling+0.072
gradient+0.071
ples+0.070
Neg logit contributions
parency-0.085
ology-0.076
icians-0.076
ologne-0.073
ttes-0.073
Token\n
Token ID198
Feature activation+1.99e-01

Pos logit contributions
RANT+0.449
PLIED+0.411
pling+0.399
gradient+0.391
ples+0.387
Neg logit contributions
parency-0.480
icians-0.428
ology-0.427
ologne-0.413
atform-0.411
Token includes
Token ID3407
Feature activation+4.24e-01

Pos logit contributions
parency+0.670
ology+0.603
icians+0.594
ologne+0.578
atform+0.570
Neg logit contributions
RANT-0.616
PLIED-0.563
pling-0.549
gradient-0.535
ples-0.535
Token visits
Token ID11864
Feature activation-9.47e-01

Pos logit contributions
RANT+0.517
PLIED+0.470
pling+0.460
ples+0.449
gradient+0.448
Neg logit contributions
parency-0.595
ology-0.535
icians-0.532
ologne-0.515
atform-0.510
Token to
Token ID284
Feature activation+6.32e-02

Pos logit contributions
parency+1.305
ology+1.184
icians+1.165
ologne+1.127
ttes+1.116
Neg logit contributions
RANT-1.182
PLIED-1.085
pling-1.052
gradient-1.030
antha-1.020
Token Havana
Token ID40274
Feature activation-7.33e-01

Pos logit contributions
RANT+0.081
PLIED+0.074
pling+0.072
ples+0.071
gradient+0.070
Neg logit contributions
parency-0.085
ology-0.076
icians-0.076
ologne-0.073
atform-0.073
Token and
Token ID290
Feature activation+1.89e-02

Pos logit contributions
parency+0.994
ology+0.896
icians+0.885
ologne+0.856
atform+0.846
Neg logit contributions
RANT-0.936
PLIED-0.859
pling-0.835
gradient-0.817
ples-0.811
Token Trinidad
Token ID48269
Feature activation-3.84e+00

Pos logit contributions
RANT+0.024
PLIED+0.022
pling+0.021
gradient+0.021
ples+0.021
Neg logit contributions
parency-0.026
ology-0.023
icians-0.023
ologne-0.022
atform-0.022
Token,
Token ID11
Feature activation+1.99e-01

Pos logit contributions
parency+4.949
ology+4.544
icians+4.397
ologne+4.236
ttes+4.222
Neg logit contributions
RANT-5.025
PLIED-4.671
pling-4.507
gradient-4.447
antha-4.361
Token walks
Token ID11114
Feature activation-5.15e-01

Pos logit contributions
RANT+0.250
PLIED+0.228
pling+0.223
gradient+0.218
ples+0.217
Neg logit contributions
parency-0.271
ology-0.245
icians-0.242
ologne-0.234
atform-0.231
Token through
Token ID832
Feature activation-1.25e-01

Pos logit contributions
parency+0.724
ology+0.650
icians+0.647
ologne+0.628
atform+0.618
Neg logit contributions
RANT-0.633
PLIED-0.581
pling-0.562
gradient-0.552
antha-0.545
Token the
Token ID262
Feature activation+1.19e-02

Pos logit contributions
parency+0.172
ology+0.156
icians+0.154
ologne+0.149
atform+0.148
Neg logit contributions
RANT-0.156
PLIED-0.142
pling-0.139
gradient-0.136
antha-0.135
Token countryside
Token ID25708
Feature activation-1.14e+00

Pos logit contributions
RANT+0.016
PLIED+0.015
pling+0.014
ples+0.014
gradient+0.014
Neg logit contributions
parency-0.015
ology-0.013
icians-0.013
ologne-0.013
atform-0.013
Tokens
Token ID82
Feature activation+1.44e-01

Pos logit contributions
parency+0.786
ology+0.709
icians+0.699
ologne+0.679
ttes+0.674
Neg logit contributions
RANT-0.728
PLIED-0.664
pling-0.648
gradient-0.631
ples-0.629
Token largest
Token ID4387
Feature activation-9.63e-01

Pos logit contributions
RANT+0.184
PLIED+0.167
pling+0.164
antha+0.160
gradient+0.159
Neg logit contributions
parency-0.193
ology-0.173
icians-0.171
ologne-0.166
ttes-0.165
Token party
Token ID2151
Feature activation-1.39e+00

Pos logit contributions
parency+1.320
ology+1.194
icians+1.177
ologne+1.141
ttes+1.133
Neg logit contributions
RANT-1.208
PLIED-1.105
pling-1.078
gradient-1.052
antha-1.047
Token abandoned
Token ID9958
Feature activation-6.67e-01

Pos logit contributions
parency+1.854
ology+1.673
icians+1.647
ologne+1.594
ttes+1.578
Neg logit contributions
RANT-1.802
PLIED-1.657
pling-1.612
gradient-1.577
antha-1.563
Token President
Token ID1992
Feature activation+4.80e-01

Pos logit contributions
parency+0.920
ology+0.834
icians+0.825
ologne+0.795
ttes+0.792
Neg logit contributions
RANT-0.831
PLIED-0.757
pling-0.739
gradient-0.720
antha-0.717
Token Dil
Token ID29328
Feature activation-3.83e+00

Pos logit contributions
RANT+0.553
PLIED+0.503
pling+0.489
gradient+0.480
antha+0.475
Neg logit contributions
parency-0.701
ology-0.635
icians-0.626
ologne-0.615
ttes-0.608
Tokenma
Token ID2611
Feature activation+1.23e+00

Pos logit contributions
parency+5.367
ology+4.965
icians+4.960
ologne+4.720
ological+4.642
Neg logit contributions
RANT-4.735
PLIED-4.332
gradient-4.067
yip-4.014
pling-3.969
Token Rousse
Token ID42849
Feature activation-3.81e+00

Pos logit contributions
RANT+1.971
PLIED+1.868
pling+1.845
ples+1.801
antha+1.794
Neg logit contributions
parency-1.203
ology-1.053
icians-1.029
ologne-1.001
ttes-0.997
Tokenff
Token ID487
Feature activation+1.97e-01

Pos logit contributions
parency+4.111
ology+3.576
icians+3.421
ttes+3.377
ologne+3.320
Neg logit contributions
RANT-5.909
PLIED-5.562
VERTISEMENT-5.290
pling-5.272
gradient-5.271
Token
Token ID447
Feature activation-4.97e-01

Pos logit contributions
RANT+0.246
PLIED+0.224
pling+0.218
gradient+0.212
ples+0.211
Neg logit contributions
parency-0.271
ology-0.244
icians-0.242
ologne-0.234
ttes-0.233
Token
Token ID247
Feature activation-5.52e-01

Pos logit contributions
parency+0.715
ology+0.650
icians+0.644
ologne+0.627
atform+0.619
Neg logit contributions
RANT-0.592
PLIED-0.534
pling-0.525
gradient-0.510
ples-0.506
Tokenar
Token ID283
Feature activation+3.01e-03

Pos logit contributions
parency+1.659
ology+1.549
icians+1.532
ologne+1.499
ttes+1.488
Neg logit contributions
RANT-0.580
PLIED-0.493
pling-0.463
gradient-0.442
antha-0.433
Token of
Token ID286
Feature activation+2.68e-01

Pos logit contributions
RANT+0.004
PLIED+0.003
pling+0.003
gradient+0.003
ples+0.003
Neg logit contributions
parency-0.004
ology-0.004
icians-0.004
ologne-0.004
ttes-0.004
Token the
Token ID262
Feature activation-1.13e-01

Pos logit contributions
RANT+0.316
PLIED+0.286
pling+0.276
antha+0.271
gradient+0.270
Neg logit contributions
parency-0.388
ology-0.351
icians-0.346
ologne-0.337
ttes-0.336
Token A
Token ID317
Feature activation-5.22e-01

Pos logit contributions
parency+0.151
ology+0.135
icians+0.134
ologne+0.129
ttes+0.128
Neg logit contributions
RANT-0.146
PLIED-0.134
pling-0.131
gradient-0.127
antha-0.127
Tokenam
Token ID321
Feature activation-6.32e-01

Pos logit contributions
parency+0.668
ology+0.599
icians+0.588
ologne+0.569
ttes+0.564
Neg logit contributions
RANT-0.708
PLIED-0.646
pling-0.635
gradient-0.617
antha-0.615
Token Aad
Token ID38746
Feature activation-3.83e+00

Pos logit contributions
parency+0.692
ology+0.607
icians+0.599
ologne+0.577
ttes+0.571
Neg logit contributions
RANT-0.966
PLIED-0.898
pling-0.886
gradient-0.863
antha-0.861
Tokenmi
Token ID11632
Feature activation+1.51e-01

Pos logit contributions
parency+5.591
ology+5.088
icians+4.936
ologists+4.741
ological+4.740
Neg logit contributions
RANT-4.461
PLIED-4.038
gradient-3.942
antha-3.865
pling-3.851
Token Party
Token ID3615
Feature activation-1.51e+00

Pos logit contributions
RANT+0.231
PLIED+0.220
pling+0.215
ples+0.209
gradient+0.207
Neg logit contributions
parency-0.163
ology-0.142
icians-0.141
ologne-0.136
ttes-0.136
Token,
Token ID11
Feature activation+4.37e-02

Pos logit contributions
parency+2.049
ology+1.867
icians+1.830
ologne+1.770
ttes+1.749
Neg logit contributions
RANT-1.918
PLIED-1.754
pling-1.706
gradient-1.672
antha-1.654
Token arrested
Token ID5169
Feature activation-1.11e+00

Pos logit contributions
RANT+0.056
PLIED+0.051
pling+0.050
gradient+0.049
antha+0.048
Neg logit contributions
parency-0.059
ology-0.054
icians-0.053
ologne-0.051
ttes-0.051
Token this
Token ID428
Feature activation+8.26e-01

Pos logit contributions
parency+1.555
ology+1.410
icians+1.391
ologne+1.346
ttes+1.334
Neg logit contributions
RANT-1.352
PLIED-1.238
pling-1.198
gradient-1.171
antha-1.161
Token party
Token ID2151
Feature activation-1.39e+00

Pos logit contributions
parency+1.320
ology+1.194
icians+1.177
ologne+1.141
ttes+1.133
Neg logit contributions
RANT-1.208
PLIED-1.105
pling-1.078
gradient-1.052
antha-1.047
Token abandoned
Token ID9958
Feature activation-6.67e-01

Pos logit contributions
parency+1.854
ology+1.673
icians+1.647
ologne+1.594
ttes+1.578
Neg logit contributions
RANT-1.802
PLIED-1.657
pling-1.612
gradient-1.577
antha-1.563
Token President
Token ID1992
Feature activation+4.80e-01

Pos logit contributions
parency+0.920
ology+0.834
icians+0.825
ologne+0.795
ttes+0.792
Neg logit contributions
RANT-0.831
PLIED-0.757
pling-0.739
gradient-0.720
antha-0.717
Token Dil
Token ID29328
Feature activation-3.83e+00

Pos logit contributions
RANT+0.553
PLIED+0.503
pling+0.489
gradient+0.480
antha+0.475
Neg logit contributions
parency-0.701
ology-0.635
icians-0.626
ologne-0.615
ttes-0.608
Tokenma
Token ID2611
Feature activation+1.23e+00

Pos logit contributions
parency+5.367
ology+4.965
icians+4.960
ologne+4.720
ological+4.642
Neg logit contributions
RANT-4.735
PLIED-4.332
gradient-4.067
yip-4.014
pling-3.969
Token Rousse
Token ID42849
Feature activation-3.81e+00

Pos logit contributions
RANT+1.971
PLIED+1.868
pling+1.845
ples+1.801
antha+1.794
Neg logit contributions
parency-1.203
ology-1.053
icians-1.029
ologne-1.001
ttes-0.997
Tokenff
Token ID487
Feature activation+1.97e-01

Pos logit contributions
parency+4.111
ology+3.576
icians+3.421
ttes+3.377
ologne+3.320
Neg logit contributions
RANT-5.909
PLIED-5.562
VERTISEMENT-5.290
pling-5.272
gradient-5.271
Token
Token ID447
Feature activation-4.97e-01

Pos logit contributions
RANT+0.246
PLIED+0.224
pling+0.218
gradient+0.212
ples+0.211
Neg logit contributions
parency-0.271
ology-0.244
icians-0.242
ologne-0.234
ttes-0.233
Token
Token ID247
Feature activation-5.52e-01

Pos logit contributions
parency+0.715
ology+0.650
icians+0.644
ologne+0.627
atform+0.619
Neg logit contributions
RANT-0.592
PLIED-0.534
pling-0.525
gradient-0.510
ples-0.506
Tokens
Token ID82
Feature activation+5.56e-02

Pos logit contributions
parency+0.774
ology+0.702
icians+0.695
ologne+0.672
ttes+0.668
Neg logit contributions
RANT-0.678
PLIED-0.613
pling-0.602
gradient-0.583
ples-0.582
Token governing
Token ID15030
Feature activation-8.35e-01

Pos logit contributions
RANT+0.077
PLIED+0.071
pling+0.070
antha+0.068
ples+0.068
Neg logit contributions
parency-0.068
ology-0.061
icians-0.060
ttes-0.058
atform-0.058
Token-
Token ID12
Feature activation-8.28e-02

Pos logit contributions
parency+2.159
ology+1.887
icians+1.883
atform+1.798
ttes+1.795
Neg logit contributions
RANT-2.588
PLIED-2.403
pling-2.339
gradient-2.301
ples-2.283
Tokenknown
Token ID4002
Feature activation-2.22e-01

Pos logit contributions
parency+0.117
icians+0.106
ology+0.105
atform+0.103
ologne+0.102
Neg logit contributions
RANT-0.099
PLIED-0.090
gradient-0.089
pling-0.089
ples-0.086
Token type
Token ID2099
Feature activation-2.93e-01

Pos logit contributions
parency+0.319
icians+0.285
ology+0.284
ologne+0.275
atform+0.275
Neg logit contributions
RANT-0.264
PLIED-0.240
pling-0.235
gradient-0.231
antha-0.227
Token of
Token ID286
Feature activation+1.72e-02

Pos logit contributions
parency+0.429
icians+0.386
ology+0.383
atform+0.373
ologne+0.373
Neg logit contributions
RANT-0.341
PLIED-0.311
pling-0.301
gradient-0.297
antha-0.292
Token ign
Token ID3627
Feature activation+1.10e+00

Pos logit contributions
RANT+0.019
PLIED+0.018
pling+0.017
gradient+0.017
antha+0.017
Neg logit contributions
parency-0.026
icians-0.023
ology-0.023
ologne-0.022
atform-0.022
Tokeneous
Token ID15303
Feature activation-3.81e+00

Pos logit contributions
RANT+1.418
pling+1.347
PLIED+1.339
gradient+1.327
antha+1.307
Neg logit contributions
parency-1.369
icians-1.193
ology-1.187
ologne-1.181
ttes-1.179
Token rock
Token ID3881
Feature activation-8.46e-01

Pos logit contributions
parency+5.976
ology+5.615
icians+5.395
ological+5.274
ologne+5.264
Neg logit contributions
RANT-3.941
PLIED-3.555
pling-3.388
antha-3.280
yip-3.275
Token found
Token ID1043
Feature activation-1.68e+00

Pos logit contributions
parency+1.066
icians+0.942
ology+0.933
ologne+0.902
atform+0.901
Neg logit contributions
RANT-1.158
PLIED-1.075
pling-1.037
gradient-1.025
ples-1.013
Token in
Token ID287
Feature activation-3.88e-01

Pos logit contributions
parency+2.375
ology+2.145
icians+2.129
ologne+2.060
atform+2.042
Neg logit contributions
RANT-2.058
PLIED-1.887
pling-1.827
gradient-1.786
ples-1.771
Token many
Token ID867
Feature activation+5.90e-01

Pos logit contributions
parency+0.541
ology+0.485
icians+0.483
ologne+0.469
atform+0.466
Neg logit contributions
RANT-0.478
PLIED-0.439
pling-0.426
gradient-0.418
ples-0.414
Token volcanic
Token ID31513
Feature activation-3.67e-01

Pos logit contributions
RANT+0.748
PLIED+0.686
pling+0.676
ples+0.664
gradient+0.662
Neg logit contributions
parency-0.798
icians-0.708
ology-0.706
ologne-0.688
atform-0.686
Token "
Token ID366
Feature activation+7.04e-01

Pos logit contributions
RANT+0.583
PLIED+0.537
pling+0.523
gradient+0.510
ples+0.507
Neg logit contributions
parency-0.585
ology-0.524
icians-0.518
ologne-0.500
ttes-0.498
TokenOh
Token ID5812
Feature activation-8.92e-01

Pos logit contributions
RANT+0.810
PLIED+0.727
pling+0.711
gradient+0.696
ples+0.690
Neg logit contributions
parency-1.040
ology-0.944
icians-0.935
ologne-0.910
ttes-0.902
Token no
Token ID645
Feature activation+9.14e-01

Pos logit contributions
parency+1.165
ology+1.042
icians+1.025
ologne+0.997
atform+0.986
Neg logit contributions
RANT-1.184
PLIED-1.089
pling-1.064
gradient-1.036
antha-1.033
Token,
Token ID11
Feature activation-1.59e-01

Pos logit contributions
RANT+1.208
PLIED+1.116
pling+1.102
ples+1.074
yip+1.067
Neg logit contributions
parency-1.181
ology-1.037
icians-1.024
ologne-1.000
atform-0.997
Token the
Token ID262
Feature activation-1.25e-01

Pos logit contributions
parency+0.204
ology+0.182
icians+0.179
ologne+0.174
atform+0.172
Neg logit contributions
RANT-0.214
PLIED-0.198
pling-0.194
gradient-0.189
ples-0.188
Token oct
Token ID19318
Feature activation-3.75e+00

Pos logit contributions
parency+0.166
ology+0.149
icians+0.147
ologne+0.143
atform+0.141
Neg logit contributions
RANT-0.161
PLIED-0.148
pling-0.145
gradient-0.141
ples-0.141
Tokenopus
Token ID25790
Feature activation-8.89e-01

Pos logit contributions
parency+4.822
ology+4.363
icians+4.272
ologne+4.119
ttes+4.075
Neg logit contributions
RANT-5.039
PLIED-4.653
pling-4.510
gradient-4.409
ples-4.373
Token is
Token ID318
Feature activation+1.23e-02

Pos logit contributions
parency+1.206
ology+1.079
icians+1.068
ologne+1.034
atform+1.025
Neg logit contributions
RANT-1.132
PLIED-1.041
pling-1.016
gradient-0.992
ples-0.985
Token back
Token ID736
Feature activation-3.48e-01

Pos logit contributions
RANT+0.016
PLIED+0.015
pling+0.014
gradient+0.014
antha+0.014
Neg logit contributions
parency-0.017
ology-0.015
icians-0.015
ologne-0.014
atform-0.014
Token.
Token ID13
Feature activation-5.66e-01

Pos logit contributions
parency+0.433
ology+0.387
icians+0.382
ologne+0.373
atform+0.366
Neg logit contributions
RANT-0.478
PLIED-0.442
pling-0.435
gradient-0.424
antha-0.422
Token It
Token ID632
Feature activation+2.98e-01

Pos logit contributions
parency+0.764
ology+0.687
icians+0.678
ologne+0.657
atform+0.648
Neg logit contributions
RANT-0.725
PLIED-0.667
pling-0.647
gradient-0.633
antha-0.628
Token 12
Token ID1105
Feature activation+1.37e+00

Pos logit contributions
RANT+0.342
PLIED+0.315
pling+0.305
gradient+0.299
ples+0.297
Neg logit contributions
parency-0.370
ology-0.336
icians-0.332
ologne-0.323
ttes-0.318
Tokenms
Token ID907
Feature activation-1.18e-01

Pos logit contributions
RANT+1.615
PLIED+1.489
pling+1.453
ples+1.413
gradient+1.391
Neg logit contributions
parency-1.959
icians-1.773
ology-1.752
ologne-1.715
atform-1.695
Token G
Token ID402
Feature activation+5.68e-03

Pos logit contributions
parency+0.151
ology+0.135
icians+0.135
ologne+0.131
atform+0.129
Neg logit contributions
RANT-0.158
PLIED-0.146
pling-0.142
gradient-0.140
ples-0.139
Token2
Token ID17
Feature activation+5.05e-01

Pos logit contributions
RANT+0.006
PLIED+0.006
pling+0.006
gradient+0.006
ples+0.005
Neg logit contributions
parency-0.008
ology-0.008
icians-0.008
ologne-0.007
ttes-0.007
TokenG
Token ID38
Feature activation-4.69e-01

Pos logit contributions
RANT+0.489
PLIED+0.436
pling+0.413
gradient+0.403
ples+0.397
Neg logit contributions
parency-0.843
ology-0.774
icians-0.770
ologne-0.749
atform-0.744
Token Samsung
Token ID10397
Feature activation+3.16e+00

Pos logit contributions
parency+0.719
ology+0.657
icians+0.651
ologne+0.633
atform+0.626
Neg logit contributions
RANT-0.513
PLIED-0.466
pling-0.449
gradient-0.438
antha-0.433
Token P
Token ID350
Feature activation+4.05e-01

Pos logit contributions
RANT+3.685
PLIED+3.362
pling+3.262
gradient+3.177
ples+3.151
Neg logit contributions
parency-4.485
icians-4.064
ologne-4.009
atform-3.991
ttes-3.896
TokenLS
Token ID6561
Feature activation+3.16e-01

Pos logit contributions
RANT+0.441
PLIED+0.394
pling+0.385
antha+0.367
ples+0.367
Neg logit contributions
parency-0.632
icians-0.567
ology-0.565
atform-0.548
ttes-0.544
Token (
Token ID357
Feature activation-7.15e-03

Pos logit contributions
RANT+0.374
PLIED+0.341
pling+0.331
gradient+0.322
ples+0.320
Neg logit contributions
parency-0.459
icians-0.411
ology-0.410
ologne-0.398
atform-0.395
TokenResponse
Token ID31077
Feature activation-1.17e+00

Pos logit contributions
parency+0.011
ology+0.011
icians+0.010
ologne+0.010
atform+0.010
Neg logit contributions
RANT-0.007
PLIED-0.007
pling-0.006
gradient-0.006
ples-0.006
Token Time
Token ID3862
Feature activation+7.06e-01

Pos logit contributions
parency+1.768
ology+1.610
icians+1.607
ologne+1.564
atform+1.545
Neg logit contributions
RANT-1.295
PLIED-1.176
pling-1.147
gradient-1.123
ples-1.100
Token between
Token ID1022
Feature activation+1.49e-01

Pos logit contributions
parency+3.037
ology+2.757
icians+2.716
ologne+2.632
ttes+2.606
Neg logit contributions
RANT-2.587
PLIED-2.365
pling-2.295
gradient-2.242
antha-2.220
Token Tony
Token ID8832
Feature activation+3.35e-01

Pos logit contributions
RANT+0.184
PLIED+0.167
pling+0.164
gradient+0.160
antha+0.159
Neg logit contributions
parency-0.208
ology-0.187
icians-0.185
ologne-0.180
ttes-0.179
Token Ferguson
Token ID15031
Feature activation+2.16e+00

Pos logit contributions
RANT+0.407
PLIED+0.373
pling+0.365
antha+0.360
gradient+0.357
Neg logit contributions
parency-0.467
ology-0.419
icians-0.412
atform-0.408
ologne-0.401
Token and
Token ID290
Feature activation+1.24e-01

Pos logit contributions
RANT+2.153
PLIED+1.887
pling+1.859
ples+1.798
gradient+1.794
Neg logit contributions
parency-3.491
ology-3.157
icians-3.151
ologne-3.079
atform-3.045
Token Kevin
Token ID7939
Feature activation-5.26e-01

Pos logit contributions
RANT+0.152
PLIED+0.138
pling+0.135
gradient+0.132
antha+0.131
Neg logit contributions
parency-0.174
ology-0.156
icians-0.154
ologne-0.150
ttes-0.150
Token Lee
Token ID5741
Feature activation+3.09e+00

Pos logit contributions
parency+0.733
ology+0.654
icians+0.649
ologne+0.628
atform+0.625
Neg logit contributions
RANT-0.654
PLIED-0.601
pling-0.581
gradient-0.566
antha-0.564
Token.
Token ID13
Feature activation+2.00e-01

Pos logit contributions
RANT+3.784
pling+3.469
PLIED+3.455
ples+3.421
antha+3.296
Neg logit contributions
parency-4.157
ology-3.694
icians-3.694
atform-3.572
ologists-3.494
Token
Token ID198
Feature activation+5.55e-01

Pos logit contributions
RANT+0.256
PLIED+0.234
pling+0.226
gradient+0.222
ples+0.221
Neg logit contributions
parency-0.272
ology-0.243
icians-0.241
ologne-0.234
ttes-0.232
Token
Token ID198
Feature activation+3.73e-01

Pos logit contributions
RANT+0.687
PLIED+0.627
pling+0.610
gradient+0.600
ples+0.593
Neg logit contributions
parency-0.771
icians-0.690
ology-0.684
ologne-0.667
atform-0.664
TokenMc
Token ID9742
Feature activation+3.96e-01

Pos logit contributions
RANT+0.440
PLIED+0.395
pling+0.384
gradient+0.376
ples+0.373
Neg logit contributions
parency-0.544
ology-0.494
icians-0.491
ologne-0.476
atform-0.474
TokenGreg
Token ID25025
Feature activation+5.46e-01

Pos logit contributions
RANT+0.380
PLIED+0.338
pling+0.317
gradient+0.309
VERTISEMENT+0.309
Neg logit contributions
parency-0.662
ology-0.615
icians-0.606
ttes-0.594
atform-0.588
TokenH
Token ID39
Feature activation-9.23e-01

Pos logit contributions
RANT+0.617
PLIED+0.550
pling+0.528
gradient+0.516
ples+0.508
Neg logit contributions
parency-0.830
ology-0.754
icians-0.752
atform-0.729
ologne-0.728
TokenONG
Token ID18494
Feature activation+2.36e+00

Pos logit contributions
parency+1.629
ology+1.506
icians+1.490
ologne+1.450
ttes+1.447
Neg logit contributions
RANT-0.805
PLIED-0.704
pling-0.673
antha-0.641
gradient-0.641
Token K
Token ID509
Feature activation+1.28e+00

Pos logit contributions
RANT+3.077
PLIED+2.754
VERTISEMENT+2.617
pling+2.614
CHAT+2.601
Neg logit contributions
parency-3.185
icians-2.811
ology-2.753
atform-2.733
ttes-2.712
TokenONG
Token ID18494
Feature activation+2.36e+00

Pos logit contributions
RANT+0.881
PLIED+0.747
pling+0.685
ples+0.628
gradient+0.613
Neg logit contributions
parency-2.532
ology-2.349
icians-2.339
ttes-2.282
atform-2.282
Token (
Token ID357
Feature activation-1.59e-01

Pos logit contributions
RANT+2.829
PLIED+2.571
pling+2.382
VERTISEMENT+2.358
gradient+2.341
Neg logit contributions
parency-3.387
icians-3.037
ology-3.000
ttes-2.923
atform-2.914
TokenReuters
Token ID12637
Feature activation+3.16e+00

Pos logit contributions
parency+0.267
ology+0.245
icians+0.243
ologne+0.237
atform+0.236
Neg logit contributions
RANT-0.153
PLIED-0.134
pling-0.129
gradient-0.125
antha-0.123
Token)
Token ID8
Feature activation+3.84e-01

Pos logit contributions
RANT+3.859
PLIED+3.591
pling+3.254
CHAT+3.249
VERTISEMENT+3.228
Neg logit contributions
parency-4.416
icians-3.980
ology-3.877
atform-3.851
ologne-3.788
Token -
Token ID532
Feature activation+1.16e+00

Pos logit contributions
RANT+0.483
PLIED+0.443
pling+0.429
gradient+0.419
ples+0.415
Neg logit contributions
parency-0.526
ology-0.476
icians-0.469
ologne-0.454
atform-0.449
Token One
Token ID1881
Feature activation+4.93e-01

Pos logit contributions
RANT+1.358
PLIED+1.236
pling+1.192
gradient+1.156
ples+1.154
Neg logit contributions
parency-1.687
ology-1.503
icians-1.498
ologne-1.459
atform-1.457
Token needs
Token ID2476
Feature activation+9.01e-01

Pos logit contributions
RANT+0.601
PLIED+0.545
pling+0.535
gradient+0.521
ples+0.518
Neg logit contributions
parency-0.696
ology-0.625
icians-0.624
ttes-0.604
ologne-0.602
Token to
Token ID284
Feature activation-2.60e-01

Pos logit contributions
RANT+1.112
PLIED+1.042
pling+1.006
gradient+0.980
antha+0.979
Neg logit contributions
parency-1.230
ology-1.100
icians-1.095
atform-1.069
ologne-1.062
Token the
Token ID262
Feature activation-2.43e-01

Pos logit contributions
parency+0.937
ology+0.848
icians+0.836
ologne+0.810
ttes+0.801
Neg logit contributions
RANT-0.876
PLIED-0.805
pling-0.782
gradient-0.766
antha-0.759
Token browser
Token ID6444
Feature activation+3.15e-01

Pos logit contributions
parency+0.331
ology+0.299
icians+0.295
ologne+0.287
ttes+0.283
Neg logit contributions
RANT-0.306
PLIED-0.280
pling-0.274
gradient-0.267
antha-0.265
Token had
Token ID550
Feature activation-2.74e-01

Pos logit contributions
RANT+0.401
PLIED+0.368
pling+0.359
gradient+0.350
ples+0.349
Neg logit contributions
parency-0.426
ology-0.382
icians-0.378
ologne-0.367
atform-0.363
Token been
Token ID587
Feature activation-4.70e-01

Pos logit contributions
parency+0.339
ology+0.304
icians+0.300
ologne+0.289
atform+0.286
Neg logit contributions
RANT-0.380
PLIED-0.352
pling-0.344
gradient-0.337
antha-0.334
Token the
Token ID262
Feature activation-1.56e-01

Pos logit contributions
parency+0.540
ology+0.479
icians+0.470
ologne+0.453
ttes+0.446
Neg logit contributions
RANT-0.694
PLIED-0.646
pling-0.631
gradient-0.619
antha-0.614
Token weak
Token ID4939
Feature activation+3.29e+00

Pos logit contributions
parency+0.200
ology+0.177
icians+0.177
ologne+0.170
atform+0.169
Neg logit contributions
RANT-0.207
PLIED-0.192
pling-0.190
ples-0.184
gradient-0.184
Token link
Token ID2792
Feature activation+9.25e-01

Pos logit contributions
RANT+3.069
PLIED+2.760
pling+2.732
ples+2.608
gradient+2.586
Neg logit contributions
parency-5.454
icians-4.998
ology-4.949
ttes-4.826
atform-4.793
Token in
Token ID287
Feature activation-2.18e-02

Pos logit contributions
RANT+1.124
PLIED+1.028
pling+0.998
gradient+0.977
antha+0.966
Neg logit contributions
parency-1.300
ology-1.174
icians-1.164
ologne-1.132
ttes-1.120
Token a
Token ID257
Feature activation+6.69e-02

Pos logit contributions
parency+0.032
ology+0.029
icians+0.029
ologne+0.028
ttes+0.028
Neg logit contributions
RANT-0.025
PLIED-0.023
pling-0.022
gradient-0.022
antha-0.022
Token cyber
Token ID10075
Feature activation-6.26e-01

Pos logit contributions
RANT+0.089
PLIED+0.082
pling+0.081
gradient+0.078
antha+0.078
Neg logit contributions
parency-0.086
ology-0.077
icians-0.076
ologne-0.074
ttes-0.073
Token attack
Token ID1368
Feature activation-7.94e-01

Pos logit contributions
parency+0.710
ology+0.624
icians+0.621
ttes+0.595
ologne+0.594
Neg logit contributions
RANT-0.926
PLIED-0.867
pling-0.847
gradient-0.836
antha-0.822
Tokenlene
Token ID29466
Feature activation+1.13e+00

Pos logit contributions
parency+0.164
ology+0.149
icians+0.146
ologne+0.142
ttes+0.141
Neg logit contributions
RANT-0.141
PLIED-0.129
pling-0.127
antha-0.125
ples-0.123
Token Go
Token ID1514
Feature activation+2.86e-01

Pos logit contributions
RANT+1.385
PLIED+1.266
pling+1.248
ples+1.217
antha+1.213
Neg logit contributions
parency-1.570
ology-1.387
icians-1.382
ologne-1.341
atform-1.339
Tokenle
Token ID293
Feature activation+4.27e-01

Pos logit contributions
RANT+0.337
PLIED+0.308
pling+0.304
gradient+0.294
ples+0.293
Neg logit contributions
parency-0.411
ology-0.370
icians-0.366
ologne-0.357
ttes-0.353
Tokenmb
Token ID2022
Feature activation+7.49e-01

Pos logit contributions
RANT+0.553
pling+0.506
PLIED+0.505
ples+0.490
antha+0.488
Neg logit contributions
parency-0.563
ology-0.505
icians-0.496
ologne-0.483
atform-0.479
Tokeniew
Token ID769
Feature activation+3.76e-01

Pos logit contributions
RANT+0.884
PLIED+0.802
pling+0.788
antha+0.777
gradient+0.770
Neg logit contributions
parency-1.077
ology-0.962
icians-0.953
ologne-0.928
atform-0.925
Tokenski
Token ID20545
Feature activation+3.70e+00

Pos logit contributions
RANT+0.433
PLIED+0.393
pling+0.385
gradient+0.376
antha+0.374
Neg logit contributions
parency-0.550
ology-0.499
icians-0.491
ologne-0.482
atform-0.474
Token.
Token ID13
Feature activation+4.74e-01

Pos logit contributions
RANT+4.040
pling+3.721
PLIED+3.666
ples+3.648
gradient+3.589
Neg logit contributions
parency-5.341
icians-4.785
ology-4.773
ologne-4.741
atform-4.656
Token Read
Token ID4149
Feature activation+1.99e+00

Pos logit contributions
RANT+0.626
PLIED+0.577
pling+0.553
gradient+0.545
antha+0.543
Neg logit contributions
parency-0.624
ology-0.550
icians-0.550
ologne-0.534
atform-0.523
Token the
Token ID262
Feature activation+3.42e-01

Pos logit contributions
RANT+2.498
PLIED+2.302
pling+2.216
antha+2.208
ples+2.162
Neg logit contributions
parency-2.669
ology-2.396
icians-2.364
ologne-2.337
ttes-2.281
Token 2018
Token ID2864
Feature activation+4.93e-01

Pos logit contributions
RANT+0.434
PLIED+0.399
pling+0.389
antha+0.381
gradient+0.379
Neg logit contributions
parency-0.459
ology-0.410
icians-0.409
ologne-0.399
ttes-0.394
Token Sh
Token ID911
Feature activation+1.41e-01

Pos logit contributions
RANT+0.579
PLIED+0.524
pling+0.509
antha+0.498
gradient+0.495
Neg logit contributions
parency-0.710
icians-0.643
ology-0.643
ologne-0.626
ttes-0.618
Token download
Token ID4321
Feature activation-6.56e-01

Pos logit contributions
parency+0.183
ology+0.164
icians+0.162
ologne+0.157
ttes+0.154
Neg logit contributions
RANT-0.201
PLIED-0.186
pling-0.181
gradient-0.177
antha-0.176
Token it
Token ID340
Feature activation+2.95e-01

Pos logit contributions
parency+0.930
ology+0.845
icians+0.834
ologne+0.810
atform+0.800
Neg logit contributions
RANT-0.793
PLIED-0.727
pling-0.703
gradient-0.687
antha-0.681
Token to
Token ID284
Feature activation-2.19e-01

Pos logit contributions
RANT+0.363
PLIED+0.332
pling+0.322
gradient+0.315
ples+0.312
Neg logit contributions
parency-0.412
ology-0.371
icians-0.368
ologne-0.360
atform-0.353
Token your
Token ID534
Feature activation-2.85e-01

Pos logit contributions
parency+0.338
ology+0.308
icians+0.305
ologne+0.298
ttes+0.293
Neg logit contributions
RANT-0.237
PLIED-0.215
pling-0.208
gradient-0.202
antha-0.200
Token PC
Token ID4217
Feature activation+2.33e-01

Pos logit contributions
parency+0.402
ology+0.361
icians+0.358
ologne+0.349
ttes+0.344
Neg logit contributions
RANT-0.346
PLIED-0.318
pling-0.310
gradient-0.301
antha-0.300
Token and
Token ID290
Feature activation-1.22e-01

Pos logit contributions
RANT+0.290
PLIED+0.266
pling+0.257
gradient+0.251
ples+0.250
Neg logit contributions
parency-0.323
ology-0.289
icians-0.288
ologne-0.282
atform-0.276
Token save
Token ID3613
Feature activation+1.43e+00

Pos logit contributions
parency+0.155
ology+0.137
icians+0.136
ologne+0.132
atform+0.130
Neg logit contributions
RANT-0.167
PLIED-0.154
pling-0.151
gradient-0.148
ples-0.146
Token it
Token ID340
Feature activation+2.88e-01

Pos logit contributions
RANT+1.752
PLIED+1.623
pling+1.571
gradient+1.534
ples+1.514
Neg logit contributions
parency-1.949
ology-1.761
icians-1.753
ologne-1.721
atform-1.689
Token as
Token ID355
Feature activation-4.92e-02

Pos logit contributions
RANT+0.342
PLIED+0.313
pling+0.306
gradient+0.297
ples+0.295
Neg logit contributions
parency-0.410
ology-0.370
icians-0.370
ologne-0.363
atform-0.355
Token a
Token ID257
Feature activation-5.08e-01

Pos logit contributions
parency+0.073
ology+0.067
icians+0.066
ologne+0.064
atform+0.064
Neg logit contributions
RANT-0.056
PLIED-0.051
pling-0.049
gradient-0.048
antha-0.047
Token backup
Token ID11559
Feature activation-3.44e-01

Pos logit contributions
parency+0.705
ology+0.635
icians+0.629
ologne+0.610
atform+0.603
Neg logit contributions
RANT-0.632
PLIED-0.579
pling-0.564
gradient-0.550
ples-0.545
Token is
Token ID318
Feature activation+1.07e-01

Pos logit contributions
RANT+0.182
PLIED+0.166
pling+0.162
gradient+0.160
ples+0.157
Neg logit contributions
parency-0.229
icians-0.205
ology-0.204
ologne-0.202
ttes-0.197
Token a
Token ID257
Feature activation-2.28e-01

Pos logit contributions
RANT+0.134
PLIED+0.124
pling+0.120
gradient+0.119
antha+0.117
Neg logit contributions
parency-0.146
ology-0.130
icians-0.130
ologne-0.127
atform-0.125
Token true
Token ID2081
Feature activation-6.01e-02

Pos logit contributions
parency+0.294
icians+0.260
ology+0.258
ttes+0.252
ologne+0.252
Neg logit contributions
RANT-0.303
PLIED-0.278
pling-0.275
gradient-0.269
ples-0.267
Token and
Token ID290
Feature activation-1.14e-01

Pos logit contributions
parency+0.076
icians+0.067
ology+0.067
ttes+0.065
ologne+0.065
Neg logit contributions
RANT-0.082
PLIED-0.075
pling-0.074
gradient-0.072
ples-0.071
Token valid
Token ID4938
Feature activation+1.48e+00

Pos logit contributions
parency+0.152
ology+0.136
icians+0.136
ologne+0.132
ttes+0.131
Neg logit contributions
RANT-0.147
PLIED-0.135
pling-0.132
gradient-0.130
ples-0.128
Token one
Token ID530
Feature activation+7.16e-01

Pos logit contributions
RANT+1.869
PLIED+1.710
pling+1.701
gradient+1.654
ples+1.640
Neg logit contributions
parency-1.959
icians-1.734
ttes-1.713
ology-1.705
ologne-1.698
Token and
Token ID290
Feature activation-2.03e-01

Pos logit contributions
RANT+0.869
PLIED+0.802
pling+0.793
gradient+0.767
ples+0.767
Neg logit contributions
parency-0.980
icians-0.880
ology-0.873
ologne-0.866
ttes-0.849
Token the
Token ID262
Feature activation-2.17e-01

Pos logit contributions
parency+0.269
ology+0.238
icians+0.237
ologne+0.233
atform+0.229
Neg logit contributions
RANT-0.262
PLIED-0.243
pling-0.236
gradient-0.231
ples-0.231
Token
Token ID564
Feature activation+6.22e-01

Pos logit contributions
parency+0.284
ology+0.251
icians+0.250
ologne+0.245
atform+0.242
Neg logit contributions
RANT-0.282
PLIED-0.261
pling-0.255
ples-0.249
gradient-0.248
Token
Token ID246
Feature activation+3.42e-01

Pos logit contributions
RANT+0.610
PLIED+0.543
pling+0.534
gradient+0.512
ples+0.508
Neg logit contributions
parency-1.014
ology-0.930
icians-0.929
ologne-0.906
atform-0.901
Tokenseeing
Token ID42041
Feature activation+1.92e-01

Pos logit contributions
RANT+0.413
PLIED+0.374
pling+0.365
gradient+0.359
ples+0.356
Neg logit contributions
parency-0.484
icians-0.434
ology-0.434
ologne-0.424
atform-0.421
Token,
Token ID11
Feature activation+1.26e-01

Pos logit contributions
RANT+0.656
pling+0.605
PLIED+0.594
ples+0.581
gradient+0.580
Neg logit contributions
parency-0.736
icians-0.657
ology-0.653
ttes-0.641
ologne-0.635
Token largely
Token ID5688
Feature activation+1.23e-01

Pos logit contributions
RANT+0.156
PLIED+0.144
pling+0.141
gradient+0.137
antha+0.136
Neg logit contributions
parency-0.175
ology-0.157
icians-0.155
ologne-0.151
ttes-0.150
Token ineffective
Token ID23693
Feature activation+1.92e+00

Pos logit contributions
RANT+0.155
PLIED+0.142
pling+0.140
gradient+0.137
ples+0.135
Neg logit contributions
parency-0.169
ology-0.150
icians-0.148
atform-0.144
ologne-0.144
Token without
Token ID1231
Feature activation+3.46e-01

Pos logit contributions
RANT+2.350
pling+2.161
PLIED+2.142
ples+2.077
gradient+2.046
Neg logit contributions
parency-2.577
ology-2.317
icians-2.292
ologne-2.238
ttes-2.225
Token wide
Token ID3094
Feature activation-2.40e-01

Pos logit contributions
RANT+0.411
PLIED+0.376
pling+0.371
gradient+0.358
antha+0.357
Neg logit contributions
parency-0.492
ology-0.449
icians-0.440
atform-0.427
ologne-0.427
Token participation
Token ID10270
Feature activation+8.08e-01

Pos logit contributions
parency+0.302
ology+0.273
icians+0.267
ologne+0.260
ttes+0.257
Neg logit contributions
RANT-0.326
PLIED-0.300
pling-0.297
ples-0.289
gradient-0.289
Token.
Token ID13
Feature activation+8.58e-01

Pos logit contributions
RANT+0.949
PLIED+0.857
pling+0.851
ples+0.824
gradient+0.817
Neg logit contributions
parency-1.151
ology-1.056
icians-1.049
ologne-1.021
ttes-1.004
Token
Token ID198
Feature activation+1.28e+00

Pos logit contributions
RANT+1.058
PLIED+0.967
pling+0.922
gradient+0.896
antha+0.896
Neg logit contributions
parency-1.203
icians-1.069
ology-1.068
ologne-1.033
ttes-1.029
Token
Token ID198
Feature activation+1.16e+00

Pos logit contributions
RANT+1.528
PLIED+1.392
pling+1.353
gradient+1.330
ples+1.319
Neg logit contributions
parency-1.812
icians-1.631
ology-1.602
atform-1.576
ologne-1.575
TokenSome
Token ID4366
Feature activation+1.23e+00

Pos logit contributions
RANT+1.380
PLIED+1.237
pling+1.197
gradient+1.163
ples+1.155
Neg logit contributions
parency-1.670
ology-1.519
icians-1.516
atform-1.470
ologne-1.467
Token state
Token ID1181
Feature activation-2.44e-01

Pos logit contributions
RANT+1.581
PLIED+1.445
pling+1.415
ples+1.391
gradient+1.378
Neg logit contributions
parency-1.628
ology-1.455
icians-1.423
atform-1.385
ologne-1.380
Token GO
Token ID10351
Feature activation+1.75e+00

Pos logit contributions
RANT+0.770
PLIED+0.703
pling+0.687
ples+0.665
gradient+0.656
Neg logit contributions
parency-0.747
ology-0.657
icians-0.652
atform-0.632
ologne-0.631
Token TO
Token ID5390
Feature activation+2.31e-01

Pos logit contributions
RANT+2.124
PLIED+1.917
pling+1.840
ples+1.768
VERTISEMENT+1.757
Neg logit contributions
parency-2.506
icians-2.224
ology-2.192
ologne-2.155
ttes-2.147
Token THE
Token ID3336
Feature activation+1.73e-01

Pos logit contributions
RANT+0.286
PLIED+0.261
pling+0.247
antha+0.240
gradient+0.240
Neg logit contributions
parency-0.329
ology-0.294
icians-0.291
atform-0.283
ologne-0.282
Token W
Token ID370
Feature activation-1.31e-01

Pos logit contributions
RANT+0.218
PLIED+0.199
pling+0.188
ples+0.183
VERTISEMENT+0.182
Neg logit contributions
parency-0.247
ology-0.218
icians-0.217
atform-0.210
ologne-0.209
TokenATER
Token ID23261
Feature activation+3.91e-01

Pos logit contributions
parency+0.218
ology+0.198
icians+0.198
ologne+0.190
atform+0.190
Neg logit contributions
RANT-0.132
PLIED-0.117
pling-0.109
ples-0.106
VERTISEMENT-0.106
TokenVILLE
Token ID38526
Feature activation+7.17e-01

Pos logit contributions
RANT+0.480
PLIED+0.436
pling+0.411
gradient+0.398
ples+0.397
Neg logit contributions
parency-0.567
ology-0.506
icians-0.504
ologne-0.487
atform-0.486
Token TA
Token ID21664
Feature activation+1.78e+00

Pos logit contributions
RANT+0.904
PLIED+0.816
pling+0.769
VERTISEMENT+0.750
ples+0.749
Neg logit contributions
parency-1.025
icians-0.899
ology-0.894
atform-0.875
ologne-0.862
TokenVER
Token ID5959
Feature activation+1.54e+00

Pos logit contributions
RANT+1.854
PLIED+1.648
pling+1.615
VERTISEMENT+1.571
ples+1.560
Neg logit contributions
parency-2.863
icians-2.594
ology-2.528
ologne-2.497
atform-2.483
TokenN
Token ID45
Feature activation-8.31e-01

Pos logit contributions
RANT+2.072
PLIED+1.904
VERTISEMENT+1.752
pling+1.736
CHAT+1.734
Neg logit contributions
parency-2.096
icians-1.854
ology-1.840
atform-1.797
ologne-1.747
Token,
Token ID11
Feature activation+3.53e-02

Pos logit contributions
parency+1.180
ology+1.051
icians+1.049
atform+1.014
ologne+1.013
Neg logit contributions
RANT-1.033
PLIED-0.939
pling-0.898
ples-0.869
gradient-0.868
Token as
Token ID355
Feature activation+2.82e-01

Pos logit contributions
RANT+0.046
PLIED+0.043
pling+0.041
gradient+0.040
ples+0.040
Neg logit contributions
parency-0.047
ology-0.042
icians-0.041
ologne-0.040
atform-0.040
Token,
Token ID11
Feature activation-2.57e-01

Pos logit contributions
parency+1.273
ology+1.146
icians+1.129
ologne+1.092
atform+1.080
Neg logit contributions
RANT-1.266
PLIED-1.166
pling-1.137
gradient-1.110
antha-1.101
Token partners
Token ID4887
Feature activation+1.35e-01

Pos logit contributions
parency+0.356
ology+0.323
icians+0.318
ologne+0.309
atform+0.305
Neg logit contributions
RANT-0.319
PLIED-0.292
pling-0.284
gradient-0.276
ples-0.275
Token include
Token ID2291
Feature activation+1.57e-01

Pos logit contributions
RANT+0.174
PLIED+0.159
pling+0.156
gradient+0.152
ples+0.151
Neg logit contributions
parency-0.181
ology-0.163
icians-0.161
ologne-0.156
ttes-0.154
Token Rebecca
Token ID23489
Feature activation-1.16e+00

Pos logit contributions
RANT+0.198
PLIED+0.182
pling+0.176
gradient+0.172
antha+0.171
Neg logit contributions
parency-0.216
ology-0.193
icians-0.191
ologne-0.185
ttes-0.183
Token H
Token ID367
Feature activation-8.58e-01

Pos logit contributions
parency+1.608
ology+1.453
icians+1.433
ologne+1.389
atform+1.376
Neg logit contributions
RANT-1.438
PLIED-1.318
pling-1.283
gradient-1.253
antha-1.244
Tokenower
Token ID789
Feature activation+3.83e-01

Pos logit contributions
parency+1.177
ology+1.062
icians+1.047
ologne+1.010
ttes+1.005
Neg logit contributions
RANT-1.074
PLIED-0.982
pling-0.963
antha-0.933
gradient-0.931
Tokenton
Token ID1122
Feature activation-8.86e-01

Pos logit contributions
RANT+0.617
PLIED+0.574
pling+0.568
gradient+0.563
ples+0.555
Neg logit contributions
parency-0.384
icians-0.332
ology-0.331
ologne-0.313
atform-0.309
Token,
Token ID11
Feature activation-1.14e-01

Pos logit contributions
parency+1.186
ology+1.070
icians+1.054
ologne+1.019
ttes+1.008
Neg logit contributions
RANT-1.143
PLIED-1.052
pling-1.023
gradient-1.001
antha-0.992
Token Sam
Token ID3409
Feature activation+1.47e+00

Pos logit contributions
parency+0.157
ology+0.142
icians+0.140
ologne+0.136
ttes+0.134
Neg logit contributions
RANT-0.142
PLIED-0.131
pling-0.127
gradient-0.124
antha-0.123
Token P
Token ID350
Feature activation+6.58e-03

Pos logit contributions
RANT+1.826
pling+1.676
PLIED+1.668
ples+1.626
antha+1.614
Neg logit contributions
parency-2.002
ology-1.782
icians-1.762
ologne-1.725
atform-1.719
Tokenanche
Token ID6362
Feature activation-1.77e+00

Pos logit contributions
RANT+0.008
PLIED+0.008
pling+0.008
ples+0.007
antha+0.007
Neg logit contributions
parency-0.009
ology-0.008
icians-0.008
ttes-0.008
ologne-0.008
Tokenc
Token ID66
Feature activation-2.15e-01

Pos logit contributions
parency+1.778
ology+1.572
icians+1.563
ologne+1.538
atform+1.528
Neg logit contributions
RANT-2.184
pling-2.066
PLIED-2.038
ples-1.997
gradient-1.986
Token.
Token ID13
Feature activation+5.59e-01

Pos logit contributions
parency+0.258
ology+0.230
icians+0.229
ologne+0.220
ttes+0.219
Neg logit contributions
RANT-0.299
PLIED-0.277
pling-0.275
gradient-0.269
ples-0.268
Tokengov
Token ID9567
Feature activation+1.55e+00

Pos logit contributions
RANT+0.656
antha+0.606
pling+0.605
gradient+0.604
ples+0.592
Neg logit contributions
parency-0.766
ologne-0.691
ology-0.684
icians-0.679
atform-0.676
Token/
Token ID14
Feature activation+2.43e-02

Pos logit contributions
RANT+1.782
PLIED+1.625
pling+1.570
gradient+1.556
ples+1.524
Neg logit contributions
parency-2.242
icians-2.023
ology-2.018
ologne-1.982
atform-1.950
Tokenin
Token ID259
Feature activation-3.57e-02

Pos logit contributions
RANT+0.031
PLIED+0.028
pling+0.028
gradient+0.027
ples+0.027
Neg logit contributions
parency-0.033
ology-0.030
icians-0.029
ologne-0.029
atform-0.028
Tokenjury
Token ID21871
Feature activation-1.63e+00

Pos logit contributions
parency+0.035
ology+0.031
icians+0.030
ologne+0.029
atform+0.029
Neg logit contributions
RANT-0.058
PLIED-0.054
pling-0.053
ples-0.052
gradient-0.052
Token/
Token ID14
Feature activation+2.63e-02

Pos logit contributions
parency+2.255
ology+2.044
icians+2.012
ologne+1.949
ttes+1.929
Neg logit contributions
RANT-2.040
PLIED-1.872
pling-1.816
gradient-1.776
antha-1.760
Tokenw
Token ID86
Feature activation+1.14e-01

Pos logit contributions
RANT+0.032
PLIED+0.029
pling+0.029
gradient+0.028
antha+0.028
Neg logit contributions
parency-0.037
ology-0.033
icians-0.033
ologne-0.032
atform-0.032
Tokenis
Token ID271
Feature activation+1.12e-01

Pos logit contributions
RANT+0.143
PLIED+0.132
pling+0.130
ples+0.127
gradient+0.127
Neg logit contributions
parency-0.154
ology-0.138
icians-0.135
ologne-0.133
atform-0.131
Tokenq
Token ID80
Feature activation+5.26e-01

Pos logit contributions
RANT+0.147
PLIED+0.137
pling+0.133
gradient+0.130
ples+0.129
Neg logit contributions
parency-0.145
ology-0.130
icians-0.129
ologne-0.124
ttes-0.124
Tokenars
Token ID945
Feature activation+1.55e+00

Pos logit contributions
RANT+0.615
PLIED+0.564
pling+0.547
gradient+0.540
antha+0.533
Neg logit contributions
parency-0.754
ology-0.684
icians-0.674
ologne-0.658
atform-0.652
Token but
Token ID475
Feature activation-4.87e-01

Pos logit contributions
parency+0.095
ology+0.087
icians+0.085
ologne+0.083
atform+0.082
Neg logit contributions
RANT-0.076
PLIED-0.070
pling-0.068
gradient-0.066
antha-0.065
Token the
Token ID262
Feature activation-3.63e-01

Pos logit contributions
parency+0.656
ology+0.591
icians+0.583
ologne+0.565
atform+0.559
Neg logit contributions
RANT-0.624
PLIED-0.571
pling-0.559
gradient-0.546
antha-0.541
Token building
Token ID2615
Feature activation-1.59e-01

Pos logit contributions
parency+0.490
ology+0.440
icians+0.435
ologne+0.422
atform+0.418
Neg logit contributions
RANT-0.463
PLIED-0.425
pling-0.416
gradient-0.405
ples-0.403
Token became
Token ID2627
Feature activation+1.09e-01

Pos logit contributions
parency+0.213
ology+0.191
icians+0.188
ologne+0.183
atform+0.181
Neg logit contributions
RANT-0.204
PLIED-0.186
pling-0.183
gradient-0.179
antha-0.178
Token obsolete
Token ID26533
Feature activation+5.23e-01

Pos logit contributions
RANT+0.131
PLIED+0.119
pling+0.116
gradient+0.113
antha+0.113
Neg logit contributions
parency-0.155
ology-0.140
icians-0.138
ologne-0.134
atform-0.133
Token in
Token ID287
Feature activation-4.89e-01

Pos logit contributions
RANT+0.623
PLIED+0.560
pling+0.556
gradient+0.543
ples+0.535
Neg logit contributions
parency-0.744
ology-0.672
icians-0.666
ologne-0.652
ttes-0.644
Token the
Token ID262
Feature activation-2.02e-01

Pos logit contributions
parency+0.661
ology+0.597
icians+0.588
ologne+0.569
atform+0.565
Neg logit contributions
RANT-0.625
PLIED-0.571
pling-0.560
gradient-0.548
antha-0.541
Token sevent
Token ID10037
Feature activation-7.58e-01

Pos logit contributions
parency+0.305
ology+0.277
icians+0.275
ologne+0.267
atform+0.266
Neg logit contributions
RANT-0.227
PLIED-0.205
pling-0.200
gradient-0.194
antha-0.193
Tokenies
Token ID444
Feature activation+1.53e-01

Pos logit contributions
parency+1.192
ology+1.092
icians+1.077
ologne+1.050
ttes+1.039
Neg logit contributions
RANT-0.800
PLIED-0.722
pling-0.698
gradient-0.677
antha-0.672
Token,
Token ID11
Feature activation-2.06e-01

Pos logit contributions
RANT+0.188
PLIED+0.171
pling+0.169
gradient+0.164
ples+0.163
Neg logit contributions
parency-0.214
ology-0.193
icians-0.190
ologne-0.186
atform-0.185
Token when
Token ID618
Feature activation+6.23e-01

Pos logit contributions
parency+0.299
ology+0.272
icians+0.268
ologne+0.260
ttes+0.257
Neg logit contributions
RANT-0.244
PLIED-0.223
pling-0.215
gradient-0.210
antha-0.208
Token"
Token ID1
Feature activation-7.85e-01

Pos logit contributions
parency+0.931
ology+0.825
icians+0.823
ologne+0.800
atform+0.789
Neg logit contributions
RANT-1.015
PLIED-0.941
pling-0.913
gradient-0.897
ples-0.890
Token by
Token ID416
Feature activation+1.06e-01

Pos logit contributions
parency+1.060
ology+0.956
icians+0.944
ologne+0.914
atform+0.905
Neg logit contributions
RANT-1.006
PLIED-0.925
pling-0.897
gradient-0.879
ples-0.870
Token Des
Token ID2935
Feature activation-1.91e+00

Pos logit contributions
RANT+0.126
PLIED+0.114
pling+0.111
gradient+0.109
ples+0.107
Neg logit contributions
parency-0.152
ology-0.138
icians-0.137
ologne-0.134
atform-0.132
Tokenhm
Token ID23940
Feature activation-2.29e+00

Pos logit contributions
parency+2.153
ology+1.912
icians+1.872
ologne+1.797
ttes+1.772
Neg logit contributions
RANT-2.852
PLIED-2.658
pling-2.591
gradient-2.539
antha-2.521
Tokenuk
Token ID2724
Feature activation+1.16e+00

Pos logit contributions
parency+3.695
ology+3.411
icians+3.360
ologne+3.271
ttes+3.237
Neg logit contributions
RANT-2.336
PLIED-2.096
pling-1.966
gradient-1.942
antha-1.918
Tokenh
Token ID71
Feature activation-1.50e+00

Pos logit contributions
RANT+1.472
PLIED+1.349
pling+1.335
gradient+1.326
antha+1.303
Neg logit contributions
parency-1.540
ology-1.369
icians-1.359
atform-1.329
ologne-1.322
Token in
Token ID287
Feature activation-2.64e-01

Pos logit contributions
parency+2.052
ology+1.860
icians+1.830
ologne+1.771
ttes+1.754
Neg logit contributions
RANT-1.881
PLIED-1.728
pling-1.676
gradient-1.640
antha-1.625
Token allot
Token ID47405
Feature activation-9.62e-01

Pos logit contributions
parency+0.364
ology+0.331
icians+0.326
ologne+0.316
atform+0.314
Neg logit contributions
RANT-0.329
PLIED-0.300
pling-0.294
gradient-0.286
ples-0.284
Tokenting
Token ID889
Feature activation+6.55e-02

Pos logit contributions
parency+0.995
ology+0.866
icians+0.854
ologne+0.814
atform+0.803
Neg logit contributions
RANT-1.537
PLIED-1.431
pling-1.416
gradient-1.385
ples-1.372
Token the
Token ID262
Feature activation-1.21e-01

Pos logit contributions
RANT+0.079
PLIED+0.072
pling+0.070
gradient+0.068
ples+0.068
Neg logit contributions
parency-0.093
ology-0.085
icians-0.084
ologne-0.081
atform-0.080
Token land
Token ID1956
Feature activation+1.30e-01

Pos logit contributions
parency+0.167
ology+0.150
icians+0.149
ologne+0.145
atform+0.144
Neg logit contributions
RANT-0.152
PLIED-0.140
pling-0.135
gradient-0.132
ples-0.132
Token form
Token ID1296
Feature activation-3.81e-01

Pos logit contributions
RANT+0.716
PLIED+0.652
pling+0.639
ples+0.624
gradient+0.621
Neg logit contributions
parency-0.893
icians-0.801
ology-0.801
ologne-0.785
atform-0.767
Token-
Token ID12
Feature activation-1.70e-01

Pos logit contributions
parency+0.525
ology+0.478
icians+0.469
ologne+0.453
ttes+0.450
Neg logit contributions
RANT-0.476
PLIED-0.437
pling-0.423
gradient-0.414
antha-0.411
Tokenin
Token ID259
Feature activation-2.13e-01

Pos logit contributions
parency+0.268
ology+0.246
icians+0.242
ologne+0.236
ttes+0.234
Neg logit contributions
RANT-0.179
PLIED-0.161
pling-0.156
gradient-0.152
antha-0.150
Token-
Token ID12
Feature activation-9.12e-02

Pos logit contributions
parency+0.291
ology+0.263
icians+0.260
ologne+0.253
atform+0.249
Neg logit contributions
RANT-0.269
PLIED-0.246
pling-0.241
ples-0.234
gradient-0.234
Tokenone
Token ID505
Feature activation+6.64e-01

Pos logit contributions
parency+0.130
ology+0.119
icians+0.117
ologne+0.114
ttes+0.112
Neg logit contributions
RANT-0.109
PLIED-0.099
pling-0.096
gradient-0.095
ples-0.093
Token heats
Token ID37876
Feature activation-2.05e+00

Pos logit contributions
RANT+0.811
PLIED+0.733
pling+0.723
gradient+0.705
ples+0.702
Neg logit contributions
parency-0.937
icians-0.838
ology-0.835
ologne-0.820
atform-0.799
Tokeninks
Token ID2973
Feature activation+1.08e+00

Pos logit contributions
parency+2.601
ology+2.337
icians+2.297
ologne+2.219
ttes+2.192
Neg logit contributions
RANT-2.776
PLIED-2.567
pling-2.498
gradient-2.448
antha-2.428
Token improve
Token ID2987
Feature activation+9.17e-01

Pos logit contributions
RANT+1.461
PLIED+1.338
pling+1.324
gradient+1.304
ples+1.293
Neg logit contributions
parency-1.348
ology-1.199
icians-1.184
ologne-1.158
ologists-1.127
Token cooling
Token ID15134
Feature activation-1.77e-01

Pos logit contributions
RANT+1.030
PLIED+0.931
pling+0.903
gradient+0.887
antha+0.868
Neg logit contributions
parency-1.363
ology-1.249
icians-1.242
ologne-1.212
ttes-1.195
Token for
Token ID329
Feature activation+1.79e-01

Pos logit contributions
parency+0.265
ology+0.243
icians+0.243
ologne+0.236
ttes+0.232
Neg logit contributions
RANT-0.198
PLIED-0.178
pling-0.173
gradient-0.171
ples-0.167
Token memory
Token ID4088
Feature activation+1.31e+00

Pos logit contributions
RANT+0.217
PLIED+0.199
pling+0.193
gradient+0.189
ples+0.187
Neg logit contributions
parency-0.253
ology-0.230
icians-0.227
ologne-0.221
ttes-0.217
Token 8
Token ID807
Feature activation-4.10e-01

Pos logit contributions
parency+0.247
ology+0.223
icians+0.220
ologne+0.213
atform+0.211
Neg logit contributions
RANT-0.231
PLIED-0.213
pling-0.206
gradient-0.201
ples-0.200
Token:
Token ID25
Feature activation+8.49e-02

Pos logit contributions
parency+0.590
ology+0.532
icians+0.527
ologne+0.511
atform+0.505
Neg logit contributions
RANT-0.490
PLIED-0.448
pling-0.436
gradient-0.423
ples-0.421
Token30
Token ID1270
Feature activation-1.05e+00

Pos logit contributions
RANT+0.090
PLIED+0.081
pling+0.078
gradient+0.075
ples+0.075
Neg logit contributions
parency-0.134
ology-0.122
icians-0.121
ologne-0.118
atform-0.117
Token a
Token ID257
Feature activation-3.44e-01

Pos logit contributions
parency+1.552
ology+1.414
icians+1.397
ologne+1.354
atform+1.338
Neg logit contributions
RANT-1.219
PLIED-1.110
pling-1.075
gradient-1.048
ples-1.039
Token.
Token ID13
Feature activation-7.98e-01

Pos logit contributions
parency+0.416
ology+0.369
icians+0.366
ologne+0.353
atform+0.350
Neg logit contributions
RANT-0.484
PLIED-0.447
pling-0.443
ples-0.430
gradient-0.429
Tokenm
Token ID76
Feature activation-9.69e-01

Pos logit contributions
parency+1.134
ology+1.028
icians+1.017
ologne+0.987
atform+0.976
Neg logit contributions
RANT-0.961
PLIED-0.877
pling-0.855
gradient-0.835
ples-0.826
Token.
Token ID13
Feature activation-7.29e-01

Pos logit contributions
parency+1.096
ology+0.971
icians+0.952
ologne+0.914
ttes+0.902
Neg logit contributions
RANT-1.451
PLIED-1.352
pling-1.320
gradient-1.295
ples-1.286
Token for
Token ID329
Feature activation-4.42e-01

Pos logit contributions
parency+1.017
ology+0.920
icians+0.908
ologne+0.880
ttes+0.867
Neg logit contributions
RANT-0.903
PLIED-0.827
pling-0.804
gradient-0.785
ples-0.777
Token 2
Token ID362
Feature activation+8.11e-02

Pos logit contributions
parency+0.641
ology+0.582
icians+0.575
ologne+0.557
atform+0.551
Neg logit contributions
RANT-0.523
PLIED-0.477
pling-0.463
gradient-0.451
ples-0.447
Token-
Token ID12
Feature activation-4.74e-02

Pos logit contributions
RANT+0.097
PLIED+0.089
pling+0.087
ples+0.085
gradient+0.084
Neg logit contributions
parency-0.116
ology-0.105
icians-0.103
ologne-0.100
atform-0.099
Tokenhour
Token ID9769
Feature activation-1.30e+00

Pos logit contributions
parency+0.062
ology+0.055
icians+0.055
ologne+0.053
atform+0.052
Neg logit contributions
RANT-0.062
PLIED-0.057
pling-0.056
gradient-0.055
ples-0.055
Token to
Token ID284
Feature activation+3.34e-02

Pos logit contributions
RANT+0.644
PLIED+0.583
pling+0.563
gradient+0.563
ples+0.551
Neg logit contributions
parency-0.798
ology-0.717
icians-0.714
ologne-0.697
ttes-0.691
Token film
Token ID2646
Feature activation+9.67e-01

Pos logit contributions
RANT+0.042
PLIED+0.038
pling+0.037
gradient+0.037
antha+0.036
Neg logit contributions
parency-0.046
ology-0.041
icians-0.041
ologne-0.040
ttes-0.039
Token producer
Token ID9920
Feature activation+7.75e-02

Pos logit contributions
RANT+1.208
PLIED+1.111
pling+1.079
gradient+1.059
antha+1.054
Neg logit contributions
parency-1.316
ology-1.184
icians-1.163
ologne-1.146
ttes-1.134
Token Sub
Token ID3834
Feature activation-1.49e-01

Pos logit contributions
RANT+0.096
PLIED+0.088
pling+0.085
gradient+0.083
antha+0.083
Neg logit contributions
parency-0.107
ology-0.097
icians-0.096
ologne-0.094
ttes-0.092
Tokenhash
Token ID17831
Feature activation-9.18e-01

Pos logit contributions
parency+0.273
ology+0.253
icians+0.251
ologne+0.246
ttes+0.243
Neg logit contributions
RANT-0.118
PLIED-0.102
pling-0.098
gradient-0.095
antha-0.094
Token Gh
Token ID11972
Feature activation-2.96e+00

Pos logit contributions
parency+1.298
ology+1.174
icians+1.159
ologne+1.128
atform+1.113
Neg logit contributions
RANT-1.115
PLIED-1.020
pling-0.993
gradient-0.969
antha-0.963
Tokenai
Token ID1872
Feature activation+4.53e-01

Pos logit contributions
parency+3.896
ology+3.543
icians+3.483
ologne+3.331
atform+3.324
Neg logit contributions
RANT-3.876
PLIED-3.597
pling-3.404
gradient-3.395
VERTISEMENT-3.321
Token,
Token ID11
Feature activation+1.64e-01

Pos logit contributions
RANT+0.544
PLIED+0.492
pling+0.483
gradient+0.468
ples+0.466
Neg logit contributions
parency-0.647
ology-0.586
icians-0.581
ologne-0.567
ttes-0.560
Token the
Token ID262
Feature activation-1.97e-02

Pos logit contributions
RANT+0.202
PLIED+0.184
pling+0.178
gradient+0.175
antha+0.174
Neg logit contributions
parency-0.229
ology-0.207
icians-0.204
ologne-0.199
ttes-0.197
Token Supreme
Token ID5617
Feature activation-1.10e+00

Pos logit contributions
parency+0.027
ology+0.024
icians+0.024
ologne+0.023
ttes+0.023
Neg logit contributions
RANT-0.025
PLIED-0.023
pling-0.022
gradient-0.021
antha-0.021
Token Court
Token ID3078
Feature activation-1.48e+00

Pos logit contributions
parency+1.234
ology+1.096
icians+1.068
ologne+1.026
ttes+1.012
Neg logit contributions
RANT-1.667
PLIED-1.553
pling-1.513
gradient-1.486
ples-1.474
Token parents
Token ID3397
Feature activation+1.94e+00

Pos logit contributions
parency+0.027
ology+0.024
icians+0.024
ologne+0.023
atform+0.023
Neg logit contributions
RANT-0.025
PLIED-0.023
pling-0.023
gradient-0.022
ples-0.022
Token looking
Token ID2045
Feature activation+4.05e-02

Pos logit contributions
RANT+2.240
pling+2.040
PLIED+2.028
ples+1.962
antha+1.936
Neg logit contributions
parency-2.772
ology-2.480
icians-2.464
ologne-2.446
ttes-2.408
Token for
Token ID329
Feature activation-3.58e-01

Pos logit contributions
RANT+0.047
PLIED+0.043
pling+0.042
gradient+0.041
ples+0.041
Neg logit contributions
parency-0.059
ology-0.053
icians-0.053
ologne-0.052
ttes-0.051
Token ways
Token ID2842
Feature activation-2.67e-01

Pos logit contributions
parency+0.483
ology+0.437
icians+0.429
ologne+0.418
ttes+0.414
Neg logit contributions
RANT-0.456
PLIED-0.421
pling-0.407
gradient-0.399
antha-0.396
Token to
Token ID284
Feature activation-1.70e-01

Pos logit contributions
parency+0.374
ology+0.341
icians+0.336
ologne+0.328
ttes+0.324
Neg logit contributions
RANT-0.323
PLIED-0.296
pling-0.287
gradient-0.279
ples-0.278
Token clean
Token ID3424
Feature activation-3.10e+00

Pos logit contributions
parency+0.189
ology+0.166
icians+0.163
ologne+0.157
ttes+0.155
Neg logit contributions
RANT-0.256
PLIED-0.239
pling-0.236
gradient-0.229
ples-0.229
Token up
Token ID510
Feature activation-2.19e-01

Pos logit contributions
parency+4.108
ology+3.718
icians+3.661
ologne+3.522
atform+3.488
Neg logit contributions
RANT-4.030
PLIED-3.700
pling-3.610
gradient-3.527
antha-3.504
Token the
Token ID262
Feature activation-2.31e-01

Pos logit contributions
parency+0.332
ology+0.303
icians+0.298
ologne+0.292
atform+0.290
Neg logit contributions
RANT-0.243
PLIED-0.220
pling-0.212
gradient-0.208
ples-0.206
Token air
Token ID1633
Feature activation-5.52e-01

Pos logit contributions
parency+0.276
ology+0.242
icians+0.240
ologne+0.233
atform+0.232
Neg logit contributions
RANT-0.331
PLIED-0.306
pling-0.299
gradient-0.294
ples-0.292
Token their
Token ID511
Feature activation+3.62e-01

Pos logit contributions
parency+0.778
ology+0.704
icians+0.693
ologne+0.677
ttes+0.670
Neg logit contributions
RANT-0.668
PLIED-0.614
pling-0.597
ples-0.580
gradient-0.576
Token kids
Token ID3988
Feature activation+1.07e+00

Pos logit contributions
RANT+0.367
PLIED+0.331
pling+0.321
ples+0.316
gradient+0.310
Neg logit contributions
parency-0.576
ology-0.529
icians-0.520
ologne-0.511
atform-0.509
Token government
Token ID1230
Feature activation-6.21e-01

Pos logit contributions
RANT+0.071
PLIED+0.065
pling+0.064
gradient+0.062
antha+0.062
Neg logit contributions
parency-0.074
ology-0.066
icians-0.065
ologne-0.063
atform-0.063
Token shutdown
Token ID18325
Feature activation-1.48e+00

Pos logit contributions
parency+0.846
ology+0.764
icians+0.755
ologne+0.733
atform+0.726
Neg logit contributions
RANT-0.782
PLIED-0.720
pling-0.703
gradient-0.685
antha-0.680
Token that
Token ID326
Feature activation-8.47e-01

Pos logit contributions
parency+1.855
ology+1.666
icians+1.636
ologne+1.577
ttes+1.560
Neg logit contributions
RANT-2.025
PLIED-1.875
pling-1.822
gradient-1.787
antha-1.772
Token took
Token ID1718
Feature activation-3.93e-02

Pos logit contributions
parency+1.135
ology+1.017
icians+1.003
ologne+0.974
atform+0.964
Neg logit contributions
RANT-1.098
PLIED-1.005
pling-0.979
gradient-0.959
antha-0.953
Token effect
Token ID1245
Feature activation-5.36e-02

Pos logit contributions
parency+0.059
ology+0.053
icians+0.053
ologne+0.051
atform+0.051
Neg logit contributions
RANT-0.045
PLIED-0.041
pling-0.040
gradient-0.039
ples-0.038
Token October
Token ID3267
Feature activation-2.99e+00

Pos logit contributions
parency+0.072
ology+0.064
icians+0.063
ologne+0.061
ttes+0.061
Neg logit contributions
RANT-0.070
PLIED-0.064
pling-0.062
gradient-0.061
ples-0.060
Token 1
Token ID352
Feature activation-2.66e-02

Pos logit contributions
parency+3.974
ology+3.638
icians+3.561
ologne+3.416
ttes+3.389
Neg logit contributions
RANT-3.835
PLIED-3.565
pling-3.421
gradient-3.344
yip-3.330
Token,
Token ID11
Feature activation-2.25e-01

Pos logit contributions
parency+0.035
ology+0.031
icians+0.031
ologne+0.030
ttes+0.030
Neg logit contributions
RANT-0.035
PLIED-0.032
pling-0.032
gradient-0.031
ples-0.031
Token it
Token ID340
Feature activation+1.32e-01

Pos logit contributions
parency+0.300
ology+0.270
icians+0.266
ologne+0.258
ttes+0.256
Neg logit contributions
RANT-0.293
PLIED-0.268
pling-0.261
gradient-0.255
antha-0.254
Token seems
Token ID2331
Feature activation-8.56e-01

Pos logit contributions
RANT+0.164
PLIED+0.149
pling+0.147
gradient+0.144
antha+0.143
Neg logit contributions
parency-0.182
ology-0.161
icians-0.161
ologne-0.158
atform-0.156
Token the
Token ID262
Feature activation-4.02e-01

Pos logit contributions
parency+1.094
ology+0.981
icians+0.963
ologne+0.938
atform+0.927
Neg logit contributions
RANT-1.161
PLIED-1.067
pling-1.040
gradient-1.018
antha-1.010
Token Home
Token ID5995
Feature activation-1.42e+00

Pos logit contributions
parency+0.816
ology+0.739
icians+0.731
ologne+0.710
ttes+0.703
Neg logit contributions
RANT-0.669
PLIED-0.609
pling-0.589
gradient-0.575
antha-0.572
Token Life
Token ID5155
Feature activation-1.11e+00

Pos logit contributions
parency+2.018
ology+1.836
icians+1.807
ologne+1.751
ttes+1.734
Neg logit contributions
RANT-1.721
PLIED-1.576
pling-1.526
gradient-1.492
antha-1.477
Token /
Token ID1220
Feature activation+2.05e-02

Pos logit contributions
parency+1.558
ology+1.414
icians+1.393
ologne+1.350
ttes+1.336
Neg logit contributions
RANT-1.364
PLIED-1.250
pling-1.212
gradient-1.185
antha-1.174
Token P
Token ID350
Feature activation-5.79e-01

Pos logit contributions
RANT+0.024
PLIED+0.022
pling+0.021
gradient+0.021
antha+0.021
Neg logit contributions
parency-0.030
ology-0.027
icians-0.027
ologne-0.026
atform-0.026
TokenCC
Token ID4093
Feature activation+9.11e-02

Pos logit contributions
parency+0.823
ology+0.743
icians+0.735
ologne+0.710
ttes+0.709
Neg logit contributions
RANT-0.702
PLIED-0.642
pling-0.622
gradient-0.602
antha-0.600
Token Natural
Token ID12068
Feature activation-3.23e+00

Pos logit contributions
RANT+0.109
PLIED+0.099
pling+0.096
gradient+0.093
ples+0.093
Neg logit contributions
parency-0.131
ology-0.118
icians-0.117
ologne-0.114
ttes-0.113
Token Markets
Token ID30251
Feature activation+2.58e-01

Pos logit contributions
parency+4.603
ology+4.220
icians+4.126
ologne+4.007
ttes+3.953
Neg logit contributions
RANT-3.884
PLIED-3.560
pling-3.430
gradient-3.345
antha-3.305
Token will
Token ID481
Feature activation-6.07e-01

Pos logit contributions
RANT+0.313
PLIED+0.286
pling+0.278
gradient+0.272
antha+0.271
Neg logit contributions
parency-0.362
ology-0.325
icians-0.323
ologne-0.314
atform-0.312
Token open
Token ID1280
Feature activation-7.77e-01

Pos logit contributions
parency+0.701
ology+0.619
icians+0.610
ologne+0.587
atform+0.579
Neg logit contributions
RANT-0.894
PLIED-0.829
pling-0.815
gradient-0.798
antha-0.792
Token new
Token ID649
Feature activation-8.49e-01

Pos logit contributions
parency+1.088
ology+0.985
icians+0.973
ologne+0.941
ttes+0.932
Neg logit contributions
RANT-0.957
PLIED-0.876
pling-0.849
gradient-0.830
antha-0.824
Token store
Token ID3650
Feature activation+3.16e-01

Pos logit contributions
parency+1.166
ology+1.056
icians+1.040
ologne+1.007
ttes+0.996
Neg logit contributions
RANT-1.065
PLIED-0.977
pling-0.948
gradient-0.927
antha-0.920
Token made
Token ID925
Feature activation-1.79e+00

Pos logit contributions
parency+0.301
ology+0.269
icians+0.265
ologne+0.257
atform+0.256
Neg logit contributions
RANT-0.310
PLIED-0.286
pling-0.280
gradient-0.272
ples-0.271
Token his
Token ID465
Feature activation-1.41e+00

Pos logit contributions
parency+2.571
ology+2.353
icians+2.307
ologne+2.239
ttes+2.217
Neg logit contributions
RANT-2.114
PLIED-1.948
pling-1.858
gradient-1.831
antha-1.805
Token bid
Token ID8406
Feature activation-7.69e-01

Pos logit contributions
parency+1.753
ology+1.573
icians+1.544
ologne+1.489
ttes+1.472
Neg logit contributions
RANT-1.941
PLIED-1.798
pling-1.748
gradient-1.714
antha-1.700
Token for
Token ID329
Feature activation-2.79e-01

Pos logit contributions
parency+1.114
ology+1.008
icians+0.999
ologne+0.969
ttes+0.961
Neg logit contributions
RANT-0.908
PLIED-0.823
pling-0.806
gradient-0.783
ples-0.778
Token the
Token ID262
Feature activation-3.64e-01

Pos logit contributions
parency+0.376
ology+0.339
icians+0.334
ologne+0.323
atform+0.322
Neg logit contributions
RANT-0.355
PLIED-0.326
pling-0.318
gradient-0.309
antha-0.309
Token throne
Token ID19262
Feature activation-3.09e+00

Pos logit contributions
parency+0.369
ology+0.317
icians+0.312
ologne+0.298
atform+0.298
Neg logit contributions
RANT-0.587
PLIED-0.550
pling-0.541
ples-0.530
gradient-0.528
Token in
Token ID287
Feature activation-5.68e-01

Pos logit contributions
parency+3.791
ology+3.447
icians+3.346
ologne+3.225
ttes+3.190
Neg logit contributions
RANT-4.254
PLIED-3.964
pling-3.814
gradient-3.772
antha-3.742
Token conjunction
Token ID17856
Feature activation-1.63e+00

Pos logit contributions
parency+0.789
ology+0.715
icians+0.708
ologne+0.679
atform+0.679
Neg logit contributions
RANT-0.702
PLIED-0.642
pling-0.629
gradient-0.609
antha-0.607
Token with
Token ID351
Feature activation-2.87e-01

Pos logit contributions
parency+2.268
ology+2.053
icians+2.025
ologne+1.960
ttes+1.941
Neg logit contributions
RANT-2.024
PLIED-1.854
pling-1.801
gradient-1.763
antha-1.742
Token Edwin
Token ID37016
Feature activation-2.12e+00

Pos logit contributions
parency+0.412
ology+0.373
icians+0.367
ologne+0.357
atform+0.355
Neg logit contributions
RANT-0.342
PLIED-0.312
pling-0.304
gradient-0.295
antha-0.294
Token Earl
Token ID21647
Feature activation+3.73e-01

Pos logit contributions
parency+2.821
ology+2.549
icians+2.507
ologne+2.422
ttes+2.397
Neg logit contributions
RANT-2.760
PLIED-2.542
pling-2.468
gradient-2.417
antha-2.395