TOP ACTIVATIONS
MAX = 51.300

well , I don ’ t know what you
- hop , and he ’ s definitely passionate about
claims that named people couldn ’ t provide justice for
said that , I don ’ t see myself doing
a new life . It ’ s good to see
. Seriously . We ain ’ t toss in
grid lock , they couldn ’ t reload the salt
a character . You don ’ t necessarily phone it
so , that we couldn ’ t help but bask
of me . So don ’ t be surprised if
( little kids . They ’ ve been there for
on internet rumors that don ’ t even make sense
mainstream media , I don ’ t want to hear
desert ( the poems don ’ t argue , they
: <|endoftext|> It can ’ t be easy be
: Oh man , I ’ d definitely need some
. The working class doesn ’ t hold power anywhere
across this room , you ’ ll see a wide
a passionate man . He ’ s passionate about his
erate flaws present that don ’ t affect the console
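
The contractions in the windows above are each followed by two tokens whose text renders as blank in the per-token listing (IDs 447 and 247), and the feature peaks on the first of them. A plausible reading, assuming a GPT-2-style byte-level BPE vocabulary (the dump does not name its tokenizer, so this is an assumption), is that the pair encodes the right single quotation mark U+2019: its UTF-8 form is three bytes, and a byte-level tokenizer can split those bytes into fragments that are not printable on their own. The sketch below only demonstrates the byte-level part of that claim; the 2 + 1 split point is illustrative, not read from any vocabulary file.

```python
# Minimal sketch: why a curly apostrophe can show up as "blank" tokens.
# Assumption: the tokenizer works on UTF-8 bytes and may split one character
# across tokens. The 2 + 1 split below is illustrative only.

APOSTROPHE = "\u2019"  # right single quotation mark


def decodes_alone(piece: bytes) -> bool:
    """Return True if the byte fragment is valid UTF-8 by itself."""
    try:
        piece.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


raw = APOSTROPHE.encode("utf-8")   # three bytes for one character
head, tail = raw[:2], raw[2:]      # hypothetical two-token split

print([hex(b) for b in raw])
print("head printable alone:", decodes_alone(head))
print("tail printable alone:", decodes_alone(tail))
print("rejoined equals apostrophe:", (head + tail).decode("utf-8") == APOSTROPHE)
```

Neither fragment decodes by itself, which would explain an empty token label, while the rejoined bytes give back the apostrophe.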

MIN ACTIVATIONS
MIN = -21.665

force more effectively , improve their intelligence capacities .
the L ocker bie bombing to go after Libya
give up their wealth and their life ? F
) without revealing the specific sources or methods for
level , but what about the passport ing of people
element to determine the success of restoring a copy of your
replicate the flavor and texture of o atmeal and can help
to practices and first access to prospects , all being combined
diplomatic compounds abroad or revelations about highly secret intelligence sources or
be strong enough to offset the revenue losses .
we are not neglect ing the next generation of space exploration
their oppress ors , and the needs of their respective communities
source of research funding and the outcomes of research studies ,
blacks and Muslims , and the hate he has brought to
world economic crisis ) and the " Old Guard " headed
the pagan myths by interpreting them as symbolic pre fig urations
the speed and physical ity of hockey at an early age
decreasing , rather than increasing their immersion and engagement .
made it difficult for teachers to retire by replacing pensions with
wind , and 115 GW of on shore wind . The

INTERVAL 33.059 - 51.300
CONTAINS 1.394%

, Rou pen ian ’ s Cat Person recounts
either bounce off of Wallace ’ s hands or fall
2010 . Trump ’ s promised border wall
gained ownership in each other ’ s companies through an
asked MIT ’ s Dar on Ace m

INTERVAL 14.817 - 33.059
CONTAINS 4.699%

Committee leak R obby M ook ( left )
detected as EL F _ UM RE ON . B )
6 3 Weight : 190
Munich design student Mor itz Bre it en hu ber won
as a kind of f iller between

INTERVAL -3.424 - 14.817
CONTAINS 51.434%

in the eurozone . However , on Thursday figures
[ last week ’ s ] episode that we got
Obama administration to " lead from behind " again , but
the green and restore performance ! Race under bright
is not one of them ; it ’ s just

INTERVAL -21.665 - -3.424
CONTAINS 42.472%

symbolic pre fig urations of the Christian doctrine ; Od ys
for boys or beauty in the case of girls .
credit for being open to a change of mindset .
members a one - month free trial of S ling TV
bikes and learned to operate our bikes on the streets according
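
The four INTERVAL ranges above look like equal-width bins spanning MIN to MAX. The sketch below recomputes the edges from the reported extremes as a consistency check. The function name and the equal-width assumption are illustrative and not taken from the tool that produced this dump.

```python
# Minimal sketch: recompute the interval edges, assuming equal-width bins.
# MIN, MAX, and the CONTAINS percentages are copied from the dump above.

def equal_width_edges(lo: float, hi: float, n_bins: int) -> list[float]:
    """Return n_bins + 1 edges that split [lo, hi] into equal-width bins."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


edges = equal_width_edges(-21.665, 51.300, 4)
for left, right in zip(edges, edges[1:]):
    print(f"INTERVAL {left:.3f} to {right:.3f}")

# Shares listed from the highest interval down to the lowest.
shares = [1.394, 4.699, 51.434, 42.472]
print(f"total share: {sum(shares):.3f}%")
```

The recomputed edges agree with the listed bounds to within rounding, and the four shares sum to roughly 100%, so the intervals appear to partition the full activation range.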
Token
Token ID1399
Feature activation+8.18e+00

Pos logit contributions
 tremend+3.278
 artificially+3.024
 disparate+2.939
 meaningful+2.911
 diminishing+2.904
Neg logit contributions
uh-2.666
ix-2.612
u-2.510
are-2.429
ind-2.428
Token well
Token ID880
Feature activation+8.21e+00

Pos logit contributions
uh+10.837
ix+10.200
are+10.175
u+10.147
ind+10.025
Neg logit contributions
 tremend-6.247
 artificially-5.389
 behavi-5.365
 disparate-5.181
 diminishing-5.087
Token,
Token ID11
Feature activation+8.75e-01

Pos logit contributions
uh+8.688
ix+8.268
u+8.206
oh+7.870
i+7.828
Neg logit contributions
 tremend-8.232
 artificially-7.631
 disparate-7.473
 behavi-7.411
 diminishing-7.161
Token I
Token ID314
Feature activation+5.51e+00

Pos logit contributions
uh+1.113
ix+1.068
u+1.048
are+1.030
ind+1.017
Neg logit contributions
 tremend-0.808
 behavi-0.713
 artificially-0.709
 disparate-0.688
 meaningful-0.670
Token don
Token ID836
Feature activation+1.06e+01

Pos logit contributions
uh+6.563
ix+6.346
u+6.220
ath+6.095
ind+6.080
Neg logit contributions
 tremend-5.209
 behavi-4.722
 disparate-4.605
 artificially-4.536
 meaningful-4.436
Token
Token ID447
Feature activation+5.13e+01

Pos logit contributions
uh+6.202
ix+6.160
u+5.867
i+5.612
are+5.393
Neg logit contributions
 tremend-18.337
 behavi-16.181
 hemor-15.781
 undermin-15.511
 mathemat-15.232
Token
Token ID247
Feature activation+3.85e+01

Pos logit contributions
+32.300
+31.170
+30.956
+30.519
+30.466
Neg logit contributions
 tremend-38.964
 behavi-33.638
 mathemat-32.389
 undermin-30.792
 hemor-30.741
Tokent
Token ID83
Feature activation+1.24e+01

Pos logit contributions
u+28.850
t+28.406
i+27.431
ix+27.269
s+27.059
Neg logit contributions
 tremend-34.081
 behavi-31.624
 hemor-29.936
 undermin-29.722
 disparate-27.980
Token know
Token ID760
Feature activation+1.03e+01

Pos logit contributions
uh+14.220
ix+13.968
u+13.517
are+13.321
ind+13.296
Neg logit contributions
 tremend-10.123
 behavi-9.907
 disparate-8.943
 undermin-8.371
 fragmented-8.294
Token what
Token ID644
Feature activation+3.52e-01

Pos logit contributions
uh+11.496
ix+11.316
u+11.053
are+10.752
i+10.703
Neg logit contributions
 tremend-8.862
 behavi-8.065
 disparate-7.721
 artificially-7.627
 fragmented-7.361
Token you
Token ID345
Feature activation+4.54e+00

Pos logit contributions
uh+0.391
ix+0.382
u+0.369
are+0.363
ath+0.359
Neg logit contributions
 tremend-0.373
 artificially-0.341
 disparate-0.333
 behavi-0.330
 diminishing-0.324
Token-
Token ID12
Feature activation+1.75e+01

Pos logit contributions
uh+13.504
ix+13.091
u+12.860
are+12.820
i+12.800
Neg logit contributions
 tremend-10.203
 artificially-8.900
 behavi-8.748
 nonexistent-8.449
 mutually-8.346
Tokenhop
Token ID8548
Feature activation+1.08e+01

Pos logit contributions
uh+11.366
are+10.609
u+10.584
i+10.583
ind+10.518
Neg logit contributions
 tremend-21.413
 behavi-20.078
 artificially-19.262
 disparate-18.985
 diminishing-18.658
Token,
Token ID11
Feature activation-5.25e+00

Pos logit contributions
uh+12.186
ix+12.158
u+11.968
i+11.827
ath+11.805
Neg logit contributions
 tremend-8.590
 artificially-7.677
 diminishing-7.111
 fragmented-7.104
 behavi-7.095
Token and
Token ID290
Feature activation-7.77e+00

Pos logit contributions
 tremend+6.058
 artificially+5.593
 disparate+5.414
 behavi+5.319
 meaningful+5.312
Neg logit contributions
uh-5.507
ix-5.460
u-5.208
ath-5.118
ind-5.095
Token he
Token ID339
Feature activation+2.73e+00

Pos logit contributions
 tremend+7.862
 artificially+7.226
 disparate+7.008
 meaningful+6.848
 diminishing+6.840
Neg logit contributions
uh-9.123
ix-9.053
u-8.638
ath-8.500
are-8.454
Token
Token ID447
Feature activation+5.06e+01

Pos logit contributions
uh+2.735
ix+2.708
u+2.592
ath+2.588
i+2.525
Neg logit contributions
 tremend-3.098
 disparate-2.866
 artificially-2.845
 behavi-2.812
 diminishing-2.804
Token
Token ID247
Feature activation+3.86e+01

Pos logit contributions
+30.980
+29.253
+28.692
+28.653
+28.174
Neg logit contributions
 tremend-37.830
 behavi-33.855
 mathemat-33.176
 undermin-31.175
 misunder-30.680
Tokens
Token ID82
Feature activation+2.63e+00

Pos logit contributions
s+30.697
ix+27.872
u+27.774
i+27.471
d+27.259
Neg logit contributions
 tremend-24.544
 behavi-22.426
 disparate-22.166
 artificially-21.366
 diminishing-21.284
Token definitely
Token ID4753
Feature activation+2.96e-01

Pos logit contributions
uh+3.775
ix+3.771
u+3.659
ath+3.657
ind+3.613
Neg logit contributions
 tremend-1.749
 artificially-1.482
 behavi-1.441
 disparate-1.417
 diminishing-1.351
Token passionate
Token ID15347
Feature activation+2.19e+00

Pos logit contributions
uh+0.416
ix+0.413
u+0.401
ath+0.394
id+0.393
Neg logit contributions
 tremend-0.222
 artificially-0.195
 disparate-0.186
 behavi-0.184
 diminishing-0.179
Token about
Token ID546
Feature activation-9.69e+00

Pos logit contributions
uh+2.570
ix+2.556
u+2.461
i+2.392
ath+2.381
Neg logit contributions
 tremend-2.149
 artificially-1.930
 disparate-1.896
 diminishing-1.837
 fragmented-1.830
Token claims
Token ID3667
Feature activation-4.99e+00

Pos logit contributions
uh+1.866
ix+1.865
u+1.781
are+1.731
ath+1.725
Neg logit contributions
 tremend-2.025
 artificially-1.838
 disparate-1.811
 behavi-1.810
 meaningful-1.780
Token that
Token ID326
Feature activation-1.04e+01

Pos logit contributions
 tremend+4.655
 artificially+4.215
 behavi+4.081
 disparate+4.014
 diminishing+4.014
Neg logit contributions
uh-6.244
ix-6.186
u-5.977
are-5.866
ath-5.824
Token named
Token ID3706
Feature activation-7.95e-01

Pos logit contributions
 tremend+12.837
 artificially+12.090
 disparate+11.792
 meaningful+11.656
 diminishing+11.610
Neg logit contributions
uh-9.651
ix-9.527
u-8.959
ath-8.770
are-8.762
Token people
Token ID661
Feature activation-1.06e+01

Pos logit contributions
 tremend+0.686
 artificially+0.601
 disparate+0.577
 diminishing+0.576
 fragmented+0.575
Neg logit contributions
ix-1.046
uh-1.041
u-0.994
ath-0.990
ind-0.986
Token couldn
Token ID3521
Feature activation-4.89e+00

Pos logit contributions
 tremend+14.023
 artificially+13.449
 disparate+13.116
 meaningful+12.963
 diminishing+12.900
Neg logit contributions
uh-8.356
ix-8.209
u-7.605
fn-7.482
ath-7.454
Token
Token ID447
Feature activation+5.03e+01

Pos logit contributions
 tremend+7.269
 artificially+7.232
 disparate+7.058
 meaningful+7.051
 diminishing+6.975
Neg logit contributions
uh-2.746
ix-2.676
u-2.354
fn-2.332
ath-2.323
Token
Token ID247
Feature activation+4.04e+01

Pos logit contributions
+32.432
+30.872
+30.790
+30.634
+30.445
Neg logit contributions
 tremend-38.252
 mathemat-33.866
 behavi-33.029
 undermin-31.320
 hemor-30.752
Tokent
Token ID83
Feature activation+3.01e+00

Pos logit contributions
t+30.421
u+30.131
ix+29.603
uh+28.525
i+28.521
Neg logit contributions
 tremend-32.884
 behavi-29.502
 hemor-28.269
 undermin-27.378
 mathemat-27.318
Token provide
Token ID2148
Feature activation-9.16e+00

Pos logit contributions
ix+3.995
uh+3.941
u+3.802
ind+3.788
ath+3.781
Neg logit contributions
 tremend-2.402
 behavi-2.175
 disparate-2.076
 artificially-2.034
 diminishing-1.996
Token justice
Token ID5316
Feature activation-4.21e+00

Pos logit contributions
 tremend+8.589
 artificially+7.851
 disparate+7.533
 behavi+7.464
 diminishing+7.394
Neg logit contributions
uh-11.437
ix-11.409
u-10.888
ath-10.730
ind-10.718
Token for
Token ID329
Feature activation-1.25e+01

Pos logit contributions
 tremend+4.867
 artificially+4.456
 disparate+4.388
 behavi+4.363
 diminishing+4.295
Neg logit contributions
uh-4.336
ix-4.278
u-4.090
ath-4.011
ind-3.996
Token said
Token ID531
Feature activation+1.93e+00

Pos logit contributions
 tremend+6.417
 artificially+5.866
 disparate+5.676
 behavi+5.616
 meaningful+5.526
Neg logit contributions
uh-7.977
ix-7.932
u-7.563
ath-7.485
fn-7.411
Token that
Token ID326
Feature activation-8.44e+00

Pos logit contributions
uh+2.397
ix+2.336
u+2.281
ath+2.273
are+2.241
Neg logit contributions
 tremend-1.756
 artificially-1.607
 disparate-1.527
 behavi-1.500
 fragmented-1.473
Token,
Token ID11
Feature activation-8.28e+00

Pos logit contributions
 tremend+10.480
 artificially+9.921
 disparate+9.684
 meaningful+9.570
 diminishing+9.517
Neg logit contributions
uh-7.638
ix-7.533
u-7.062
ath-6.924
fn-6.918
Token I
Token ID314
Feature activation-4.62e+00

Pos logit contributions
 tremend+9.394
 artificially+8.787
 disparate+8.563
 meaningful+8.430
 diminishing+8.384
Neg logit contributions
uh-8.507
ix-8.397
u-7.939
ath-7.819
are-7.795
Token don
Token ID836
Feature activation+3.11e+00

Pos logit contributions
 tremend+4.874
 artificially+4.479
 disparate+4.396
 behavi+4.317
 meaningful+4.312
Neg logit contributions
uh-5.175
ix-5.107
u-4.872
ath-4.809
are-4.785
Token
Token ID447
Feature activation+4.96e+01

Pos logit contributions
uh+1.613
ix+1.585
u+1.440
ath+1.361
fn+1.349
Neg logit contributions
 tremend-5.363
 artificially-4.884
 behavi-4.871
 disparate-4.834
 diminishing-4.752
Token
Token ID247
Feature activation+3.68e+01

Pos logit contributions
+32.028
+30.538
+30.075
+30.035
+29.827
Neg logit contributions
 tremend-35.976
 behavi-31.322
 mathemat-30.676
 undermin-28.887
 hemor-28.445
Tokent
Token ID83
Feature activation+2.89e+00

Pos logit contributions
t+26.163
u+25.718
ix+24.415
i+23.887
uh+23.633
Neg logit contributions
 tremend-36.273
 behavi-33.952
 hemor-32.358
 undermin-31.606
 conflic-30.090
Token see
Token ID766
Feature activation-1.79e+00

Pos logit contributions
uh+3.593
ix+3.554
u+3.419
ath+3.397
are+3.379
Neg logit contributions
 tremend-2.623
 behavi-2.390
 disparate-2.284
 artificially-2.221
 meaningful-2.161
Token myself
Token ID3589
Feature activation-4.51e+00

Pos logit contributions
 tremend+1.612
 artificially+1.418
 disparate+1.395
 behavi+1.368
 fragmented+1.333
Neg logit contributions
uh-2.291
ix-2.279
u-2.202
ath-2.156
are-2.141
Token doing
Token ID1804
Feature activation-1.39e+01

Pos logit contributions
 tremend+3.408
 artificially+2.902
 disparate+2.851
 behavi+2.819
 meaningful+2.734
Neg logit contributions
uh-6.486
ix-6.416
u-6.230
ath-6.137
fn-6.101
Token a
Token ID257
Feature activation-1.03e+01

Pos logit contributions
 tremend+8.946
 artificially+8.248
 disparate+7.970
 behavi+7.855
 meaningful+7.822
Neg logit contributions
uh-10.331
ix-10.256
u-9.793
ath-9.616
are-9.601
Token new
Token ID649
Feature activation-8.20e+00

Pos logit contributions
 tremend+10.252
 artificially+9.508
 disparate+9.183
 meaningful+9.018
 behavi+8.998
Neg logit contributions
uh-12.056
ix-11.936
u-11.381
ath-11.192
are-11.163
Token life
Token ID1204
Feature activation-3.35e+00

Pos logit contributions
 tremend+8.254
 artificially+7.644
 disparate+7.403
 meaningful+7.268
 behavi+7.247
Neg logit contributions
uh-9.502
ix-9.400
u-8.970
ath-8.823
are-8.811
Token.
Token ID13
Feature activation-4.68e-01

Pos logit contributions
 tremend+4.628
 artificially+4.286
 disparate+4.259
 behavi+4.216
 diminishing+4.151
Neg logit contributions
uh-2.672
ix-2.620
u-2.531
ind-2.440
ath-2.423
Token It
Token ID632
Feature activation+8.23e-02

Pos logit contributions
 tremend+0.662
 artificially+0.603
 behavi+0.593
 disparate+0.588
 mutually+0.581
Neg logit contributions
ix-0.380
uh-0.379
u-0.358
are-0.348
fn-0.347
Token
Token ID447
Feature activation+4.93e+01

Pos logit contributions
uh+0.076
ix+0.075
u+0.072
ath+0.070
are+0.070
Neg logit contributions
 tremend-0.102
 disparate-0.094
 artificially-0.094
 behavi-0.094
 meaningful-0.092
Token
Token ID247
Feature activation+3.79e+01

Pos logit contributions
+31.610
+29.785
+29.593
+29.294
+29.215
Neg logit contributions
 tremend-38.593
 behavi-33.749
 mathemat-31.655
 undermin-31.159
 hemor-30.903
Tokens
Token ID82
Feature activation+6.69e+00

Pos logit contributions
s+29.030
u+26.551
ix+26.300
t+25.769
is+25.461
Neg logit contributions
 tremend-27.184
 behavi-26.011
 disparate-24.883
 hemor-24.576
 artificially-23.064
Token good
Token ID922
Feature activation+1.65e+00

Pos logit contributions
uh+9.394
ix+9.189
u+9.145
ath+8.944
ind+8.900
Neg logit contributions
 tremend-4.152
 behavi-3.642
 artificially-3.423
 disparate-3.319
 diminishing-3.074
Token to
Token ID284
Feature activation-6.30e+00

Pos logit contributions
uh+1.800
ix+1.785
u+1.710
ath+1.689
are+1.684
Neg logit contributions
 tremend-1.666
 artificially-1.575
 disparate-1.547
 behavi-1.496
 fragmented-1.487
Token see
Token ID766
Feature activation-2.77e+00

Pos logit contributions
 tremend+5.365
 artificially+4.826
 disparate+4.735
 behavi+4.684
 meaningful+4.592
Neg logit contributions
uh-8.342
ix-8.269
u-7.953
are-7.823
ath-7.818
Token.
Token ID13
Feature activation-8.08e-01

Pos logit contributions
uh+7.661
ix+7.564
u+7.226
ath+7.203
are+7.003
Neg logit contributions
 tremend-9.472
 behavi-8.631
 artificially-8.523
 disparate-8.486
 diminishing-8.346
Token Seriously
Token ID27777
Feature activation+7.86e+00

Pos logit contributions
 tremend+1.218
 artificially+1.116
 behavi+1.095
 disparate+1.092
 meaningful+1.075
Neg logit contributions
uh-0.588
ix-0.576
u-0.540
fn-0.523
are-0.518
Token.
Token ID13
Feature activation+2.60e+00

Pos logit contributions
uh+7.263
ix+7.161
u+7.013
i+6.650
ath+6.611
Neg logit contributions
 tremend-8.788
 artificially-8.038
 behavi-7.958
 disparate-7.689
 mutually-7.611
Token We
Token ID775
Feature activation+3.50e+00

Pos logit contributions
uh+1.958
ix+1.892
u+1.808
fn+1.730
are+1.728
Neg logit contributions
 tremend-3.879
 artificially-3.494
 behavi-3.482
 disparate-3.420
 diminishing-3.363
Token ain
Token ID18959
Feature activation+1.70e+01

Pos logit contributions
uh+4.218
ix+4.189
u+4.018
are+4.012
ath+3.937
Neg logit contributions
 tremend-3.379
 behavi-3.045
 disparate-2.932
 artificially-2.886
 diminishing-2.847
Token
Token ID447
Feature activation+4.92e+01

Pos logit contributions
uh+13.907
ix+13.573
u+13.529
i+12.681
ind+12.565
Neg logit contributions
 tremend-21.545
 behavi-18.969
 hemor-17.859
 undermin-17.497
 artificially-17.412
Token
Token ID247
Feature activation+3.56e+01

Pos logit contributions
+33.074
+30.934
+30.777
+30.457
+30.131
Neg logit contributions
 tremend-32.832
 behavi-29.831
 artificially-28.410
 mathemat-27.745
 fragmented-27.388
Tokent
Token ID83
Feature activation+1.31e+01

Pos logit contributions
u+31.008
t+30.691
s+29.962
uh+29.811
d+29.580
Neg logit contributions
 tremend-20.928
 behavi-19.561
 disparate-18.111
 artificially-17.989
 hemor-17.208
Token toss
Token ID12153
Feature activation+7.04e+00

Pos logit contributions
uh+16.424
ix+15.949
u+15.783
are+15.560
ind+15.538
Neg logit contributions
 tremend-8.970
 behavi-7.748
 artificially-7.357
 disparate-7.197
 fragmented-6.567
Tokenin
Token ID259
Feature activation+1.20e+01

Pos logit contributions
uh+6.930
ix+6.735
u+6.656
i+6.484
are+6.446
Neg logit contributions
 tremend-7.791
 artificially-7.099
 disparate-6.922
 behavi-6.877
 mutually-6.730
Token
Token ID447
Feature activation+3.77e+01

Pos logit contributions
uh+14.377
ix+13.952
u+13.947
ath+13.390
are+13.368
Neg logit contributions
 tremend-9.453
 artificially-8.226
 disparate-7.955
 behavi-7.805
 fragmented-7.622
Token grid
Token ID10706
Feature activation+4.92e+00

Pos logit contributions
 tremend+6.646
 artificially+6.174
 disparate+5.951
 behavi+5.891
 diminishing+5.860
Neg logit contributions
uh-7.502
ix-7.358
u-7.061
are-6.971
ath-6.935
Tokenlock
Token ID5354
Feature activation-4.15e+00

Pos logit contributions
uh+1.391
ix+1.073
ind+1.064
are+1.013
u+0.988
Neg logit contributions
 tremend-9.106
 artificially-8.579
 behavi-8.540
 disparate-8.463
 diminishing-8.421
Token,
Token ID11
Feature activation-7.61e+00

Pos logit contributions
 tremend+5.250
 artificially+4.868
 disparate+4.801
 behavi+4.734
 meaningful+4.709
Neg logit contributions
uh-3.841
ix-3.765
u-3.565
are-3.470
ath-3.462
Token they
Token ID484
Feature activation-8.04e+00

Pos logit contributions
 tremend+8.419
 artificially+7.812
 disparate+7.598
 meaningful+7.487
 diminishing+7.465
Neg logit contributions
uh-8.188
ix-8.059
u-7.663
are-7.537
ath-7.520
Token couldn
Token ID3521
Feature activation-8.39e-01

Pos logit contributions
 tremend+10.902
 artificially+10.429
 disparate+10.170
 meaningful+10.065
 diminishing+10.008
Neg logit contributions
uh-6.198
ix-6.106
u-5.639
fn-5.530
ath-5.508
Token
Token ID447
Feature activation+4.88e+01

Pos logit contributions
 tremend+1.354
 artificially+1.294
 disparate+1.271
 meaningful+1.259
 diminishing+1.255
Neg logit contributions
uh-0.450
ix-0.440
u-0.393
ath-0.379
fn-0.378
Token
Token ID247
Feature activation+3.86e+01

Pos logit contributions
+31.412
+30.029
+29.540
+29.502
+29.501
Neg logit contributions
 tremend-39.396
 behavi-33.924
 mathemat-33.174
 hemor-31.124
 undermin-30.662
Tokent
Token ID83
Feature activation+6.31e+00

Pos logit contributions
t+29.016
u+28.255
ix+28.164
i+27.923
ind+27.263
Neg logit contributions
 tremend-34.081
 behavi-31.099
 hemor-29.063
 undermin-27.576
 mathemat-27.546
Token reload
Token ID18126
Feature activation+4.05e+00

Pos logit contributions
ix+8.082
uh+7.988
u+7.791
i+7.662
are+7.640
Neg logit contributions
 tremend-5.011
 behavi-4.847
 disparate-4.408
 diminishing-4.175
 artificially-4.035
Token the
Token ID262
Feature activation-9.23e+00

Pos logit contributions
uh+5.747
ix+5.671
u+5.614
ind+5.508
ath+5.499
Neg logit contributions
 tremend-2.619
 artificially-2.369
 disparate-2.283
 behavi-2.213
 fragmented-2.194
Token salt
Token ID8268
Feature activation+9.29e+00

Pos logit contributions
 tremend+9.818
 artificially+9.205
 disparate+8.940
 meaningful+8.819
 diminishing+8.761
Neg logit contributions
uh-10.010
ix-9.892
u-9.378
ath-9.222
fn-9.222
Token a
Token ID257
Feature activation-1.25e+01

Pos logit contributions
 tremend+6.516
 artificially+5.882
 disparate+5.609
 behavi+5.518
 diminishing+5.483
Neg logit contributions
uh-9.822
ix-9.707
u-9.341
are-9.192
ath-9.175
Token character
Token ID2095
Feature activation-5.04e+00

Pos logit contributions
 tremend+13.253
 artificially+12.464
 disparate+12.121
 meaningful+11.971
 diminishing+11.881
Neg logit contributions
uh-13.424
ix-13.276
u-12.557
fn-12.374
ath-12.364
Token.
Token ID13
Feature activation-3.06e+00

Pos logit contributions
 tremend+6.463
 artificially+5.973
 disparate+5.878
 behavi+5.793
 diminishing+5.775
Neg logit contributions
uh-4.570
ix-4.513
u-4.256
are-4.155
ath-4.134
Token You
Token ID921
Feature activation-4.98e+00

Pos logit contributions
 tremend+4.312
 artificially+4.025
 disparate+3.938
 behavi+3.889
 meaningful+3.886
Neg logit contributions
uh-2.383
ix-2.346
u-2.186
are-2.132
fn-2.120
Token don
Token ID836
Feature activation-6.23e-01

Pos logit contributions
 tremend+4.795
 artificially+4.291
 disparate+4.235
 behavi+4.229
 meaningful+4.121
Neg logit contributions
uh-6.111
ix-6.052
u-5.806
are-5.759
ath-5.704
Token
Token ID447
Feature activation+4.86e+01

Pos logit contributions
 tremend+0.906
 artificially+0.860
 disparate+0.843
 meaningful+0.834
 diminishing+0.831
Neg logit contributions
uh-0.437
ix-0.429
u-0.394
ath-0.384
fn-0.383
Token
Token ID247
Feature activation+3.23e+01

Pos logit contributions
+32.021
+30.361
+29.766
+29.764
+29.558
Neg logit contributions
 tremend-32.232
 behavi-29.030
 mathemat-26.995
 artificially-26.824
 disparate-25.826
Tokent
Token ID83
Feature activation+7.76e-01

Pos logit contributions
t+25.523
u+25.160
ix+24.849
uh+24.309
i+23.810
Neg logit contributions
 tremend-26.404
 behavi-25.180
 disparate-23.392
 hemor-22.525
 artificially-22.505
Token necessarily
Token ID6646
Feature activation-6.82e+00

Pos logit contributions
uh+1.015
ix+1.004
u+0.973
are+0.958
ath+0.948
Neg logit contributions
 tremend-0.676
 behavi-0.614
 disparate-0.579
 artificially-0.564
 meaningful-0.551
Token phone
Token ID3072
Feature activation+5.88e+00

Pos logit contributions
 tremend+5.717
 artificially+5.152
 disparate+5.017
 behavi+4.913
 meaningful+4.896
Neg logit contributions
uh-9.068
ix-8.972
u-8.619
are-8.496
ath-8.483
Token it
Token ID340
Feature activation-4.48e+00

Pos logit contributions
uh+7.596
ix+7.467
u+7.288
id+7.237
are+7.164
Neg logit contributions
 tremend-4.652
 artificially-3.956
 behavi-3.857
 disparate-3.837
 fragmented-3.733
Token so
Token ID523
Feature activation+1.68e+00

Pos logit contributions
uh+4.482
ix+4.422
u+4.343
i+4.265
are+4.248
Neg logit contributions
 tremend-2.706
 artificially-2.410
 behavi-2.391
 disparate-2.245
 diminishing-2.228
Token,
Token ID11
Feature activation-2.50e+00

Pos logit contributions
uh+1.908
ix+1.888
u+1.836
ath+1.791
i+1.790
Neg logit contributions
 tremend-1.682
 artificially-1.537
 behavi-1.513
 disparate-1.496
 diminishing-1.475
Token that
Token ID326
Feature activation-5.22e+00

Pos logit contributions
 tremend+2.348
 artificially+2.094
 behavi+2.056
 disparate+2.043
 diminishing+2.004
Neg logit contributions
uh-3.125
ix-3.085
u-3.002
are-2.940
ath-2.928
Token we
Token ID356
Feature activation-4.96e+00

Pos logit contributions
 tremend+5.583
 artificially+5.119
 disparate+4.955
 behavi+4.920
 diminishing+4.875
Neg logit contributions
uh-5.846
ix-5.790
u-5.547
are-5.461
ath-5.439
Token couldn
Token ID3521
Feature activation+1.23e+00

Pos logit contributions
 tremend+5.556
 artificially+5.157
 disparate+5.045
 meaningful+4.971
 behavi+4.964
Neg logit contributions
uh-5.198
ix-5.135
u-4.884
are-4.802
ath-4.795
Token
Token ID447
Feature activation+4.85e+01

Pos logit contributions
uh+0.628
ix+0.619
u+0.556
are+0.526
fn+0.526
Neg logit contributions
 tremend-2.079
 artificially-1.944
 disparate-1.917
 behavi-1.907
 diminishing-1.890
Token
Token ID247
Feature activation+3.73e+01

Pos logit contributions
+32.403
+31.284
+30.719
+30.627
+30.436
Neg logit contributions
 tremend-33.299
 behavi-28.502
 mathemat-28.113
 artificially-26.448
 undermin-26.246
Tokent
Token ID83
Feature activation+6.17e+00

Pos logit contributions
t+28.558
ix+28.457
u+28.457
i+27.660
ind+27.027
Neg logit contributions
 tremend-31.267
 behavi-29.426
 hemor-26.487
 undermin-25.997
 disparate-25.356
Token help
Token ID1037
Feature activation+6.45e+00

Pos logit contributions
ix+7.820
uh+7.749
u+7.570
are+7.465
ind+7.408
Neg logit contributions
 tremend-5.024
 behavi-4.869
 disparate-4.277
 diminishing-4.111
 artificially-3.942
Token but
Token ID475
Feature activation-3.56e-01

Pos logit contributions
uh+7.185
ix+7.091
u+6.854
id+6.720
are+6.689
Neg logit contributions
 tremend-6.083
 behavi-5.265
 artificially-5.235
 disparate-5.210
 meaningful-5.029
Token bask
Token ID39353
Feature activation+6.79e+00

Pos logit contributions
 tremend+0.306
 behavi+0.278
 disparate+0.263
 artificially+0.259
 meaningful+0.252
Neg logit contributions
uh-0.466
ix-0.461
u-0.451
are-0.444
ath-0.440
Token of
Token ID286
Feature activation-3.60e+00

Pos logit contributions
uh+1.515
ix+1.501
u+1.436
ath+1.375
ind+1.372
Neg logit contributions
 tremend-1.842
 artificially-1.738
 disparate-1.710
 meaningful-1.677
 diminishing-1.669
Token me
Token ID502
Feature activation-5.22e+00

Pos logit contributions
 tremend+3.376
 artificially+3.115
 disparate+3.014
 behavi+2.983
 meaningful+2.964
Neg logit contributions
uh-4.507
ix-4.452
u-4.304
ind-4.176
id-4.160
Token.
Token ID13
Feature activation-4.84e-01

Pos logit contributions
 tremend+7.380
 artificially+6.968
 disparate+6.880
 meaningful+6.768
 behavi+6.761
Neg logit contributions
uh-4.043
ix-3.996
u-3.722
are-3.578
ath-3.577
Token So
Token ID1406
Feature activation+3.78e+00

Pos logit contributions
 tremend+0.694
 artificially+0.641
 disparate+0.627
 behavi+0.626
 meaningful+0.618
Neg logit contributions
uh-0.377
ix-0.372
u-0.349
are-0.336
ind-0.335
Token don
Token ID836
Feature activation+7.42e+00

Pos logit contributions
uh+4.712
ix+4.646
u+4.549
ind+4.409
are+4.392
Neg logit contributions
 tremend-3.329
 artificially-3.001
 behavi-2.963
 disparate-2.909
 diminishing-2.847
Token
Token ID447
Feature activation+4.84e+01

Pos logit contributions
uh+4.965
ix+4.920
u+4.823
i+4.483
ath+4.469
Neg logit contributions
 tremend-11.410
 behavi-10.341
 artificially-9.975
 disparate-9.937
 diminishing-9.779
Token
Token ID247
Feature activation+3.67e+01

Pos logit contributions
+33.535
+31.228
+31.079
+30.929
+30.704
Neg logit contributions
 tremend-29.747
 behavi-26.957
 mathemat-25.839
 undermin-25.130
 artificially-24.444
Tokent
Token ID83
Feature activation+9.18e+00

Pos logit contributions
u+29.935
t+29.218
uh+28.407
ix+27.958
i+27.949
Neg logit contributions
 tremend-23.859
 disparate-22.150
 behavi-21.570
 artificially-21.393
 fragmented-21.368
Token be
Token ID307
Feature activation+4.77e+00

Pos logit contributions
uh+11.563
ix+11.285
u+11.048
ind+10.722
i+10.717
Neg logit contributions
 tremend-7.066
 behavi-6.591
 disparate-5.940
 meaningful-5.824
 tangible-5.706
Token surprised
Token ID6655
Feature activation+6.40e+00

Pos logit contributions
uh+6.191
ix+6.035
u+5.965
are+5.876
ath+5.840
Neg logit contributions
 tremend-3.847
 behavi-3.345
 disparate-3.265
 meaningful-3.248
 artificially-3.247
Token if
Token ID611
Feature activation-5.88e+00

Pos logit contributions
ix+6.151
uh+6.125
u+5.929
are+5.775
i+5.659
Neg logit contributions
 tremend-7.263
 behavi-6.374
 artificially-6.361
 disparate-6.360
 fragmented-6.223
Token (
Token ID357
Feature activation+5.02e+00

Pos logit contributions
ix+1.611
uh+1.573
u+1.525
ind+1.482
are+1.479
Neg logit contributions
 tremend-1.472
 behavi-1.308
 artificially-1.273
 disparate-1.261
 meaningful-1.246
Tokenlittle
Token ID31629
Feature activation+4.64e+00

Pos logit contributions
uh+2.860
ix+2.712
u+2.582
ind+2.562
are+2.552
Neg logit contributions
 tremend-7.958
 artificially-7.501
 behavi-7.342
 disparate-7.314
 diminishing-7.242
Token kids
Token ID3988
Feature activation-1.16e+00

Pos logit contributions
ix+6.249
uh+6.215
u+6.002
are+5.993
ath+5.963
Neg logit contributions
 tremend-3.593
 behavi-2.979
 artificially-2.870
 disparate-2.817
 mutually-2.733
Token.
Token ID13
Feature activation+5.29e+00

Pos logit contributions
 tremend+1.298
 behavi+1.157
 artificially+1.144
 disparate+1.130
 meaningful+1.108
Neg logit contributions
uh-1.254
ix-1.254
u-1.210
are-1.203
ath-1.175
Token They
Token ID1119
Feature activation-1.42e-01

Pos logit contributions
uh+5.099
ix+4.982
u+4.808
are+4.773
ind+4.639
Neg logit contributions
 tremend-6.253
 behavi-5.607
 artificially-5.568
 disparate-5.409
 mutually-5.350
Token
Token ID447
Feature activation+4.84e+01

Pos logit contributions
 tremend+0.158
 behavi+0.145
 disparate+0.140
 artificially+0.138
 meaningful+0.136
Neg logit contributions
uh-0.153
ix-0.152
are-0.147
u-0.147
ath-0.144
Token
Token ID247
Feature activation+3.56e+01

Pos logit contributions
+32.579
+30.536
+29.723
+29.677
+29.525
Neg logit contributions
 tremend-38.613
 behavi-33.608
 mathemat-31.121
 conduc-31.050
 hemor-30.643
Tokenve
Token ID303
Feature activation+1.23e+01

Pos logit contributions
are+33.749
u+33.548
uh+33.094
ix+32.811
i+32.174
Neg logit contributions
 tremend-16.581
 behavi-15.569
 disparate-14.501
 diminishing-14.045
 artificially-13.893
Token been
Token ID587
Feature activation-1.97e+00

Pos logit contributions
ix+14.439
uh+14.322
u+14.163
are+13.942
ath+13.865
Neg logit contributions
 tremend-8.701
 behavi-7.795
 artificially-7.307
 disparate-7.235
 diminishing-6.975
Token there
Token ID612
Feature activation+2.70e+00

Pos logit contributions
 tremend+1.687
 artificially+1.451
 behavi+1.450
 disparate+1.426
 meaningful+1.399
Neg logit contributions
uh-2.601
ix-2.587
u-2.517
are-2.472
ath-2.455
Token for
Token ID329
Feature activation-3.87e+00

Pos logit contributions
ix+2.779
uh+2.762
u+2.662
are+2.591
ind+2.579
Neg logit contributions
 tremend-2.983
 behavi-2.700
 disparate-2.665
 artificially-2.657
 diminishing-2.602
Token on
Token ID319
Feature activation-8.95e+00

Pos logit contributions
 tremend+0.798
 artificially+0.718
 disparate+0.711
 behavi+0.701
 fragmented+0.697
Neg logit contributions
uh-0.880
ix-0.872
u-0.844
fn-0.823
ath-0.821
Token internet
Token ID5230
Feature activation+7.71e-01

Pos logit contributions
 tremend+8.836
 artificially+8.064
 disparate+7.830
 behavi+7.718
 meaningful+7.669
Neg logit contributions
uh-10.743
ix-10.640
u-10.171
ath-9.994
are-9.979
Token rumors
Token ID14923
Feature activation-4.21e+00

Pos logit contributions
ix+1.041
uh+1.040
u+1.012
are+0.997
ind+0.992
Neg logit contributions
 tremend-0.621
 artificially-0.535
 disparate-0.511
 behavi-0.504
 meaningful-0.504
Token that
Token ID326
Feature activation-1.08e+01

Pos logit contributions
 tremend+5.688
 artificially+5.280
 behavi+5.196
 disparate+5.186
 meaningful+5.121
Neg logit contributions
uh-3.570
ix-3.530
u-3.333
are-3.199
ath-3.174
Token don
Token ID836
Feature activation-1.20e+00

Pos logit contributions
 tremend+13.259
 artificially+12.540
 disparate+12.237
 meaningful+12.095
 diminishing+12.031
Neg logit contributions
uh-9.835
ix-9.700
u-9.101
ath-8.920
fn-8.916
Token
Token ID447
Feature activation+4.83e+01

Pos logit contributions
 tremend+1.982
 artificially+1.842
 disparate+1.818
 behavi+1.812
 diminishing+1.791
Neg logit contributions
uh-0.675
ix-0.664
u-0.606
ath-0.580
are-0.576
Token
Token ID247
Feature activation+3.95e+01

Pos logit contributions
+31.695
+30.314
+29.625
+29.596
+29.547
Neg logit contributions
 tremend-36.874
 behavi-32.358
 mathemat-31.520
 undermin-30.160
 hemor-29.207
Tokent
Token ID83
Feature activation+2.27e+00

Pos logit contributions
u+29.575
t+29.002
ix+28.098
i+27.333
uh+27.098
Neg logit contributions
 tremend-30.092
 behavi-28.129
 hemor-26.112
 disparate-26.079
 undermin-25.905
Token even
Token ID772
Feature activation+6.95e-01

Pos logit contributions
uh+3.121
ix+3.093
u+3.036
are+2.966
ath+2.941
Neg logit contributions
 tremend-1.764
 behavi-1.593
 disparate-1.504
 artificially-1.444
 fragmented-1.399
Token make
Token ID787
Feature activation-1.63e+00

Pos logit contributions
uh+0.969
ix+0.965
u+0.940
are+0.924
ath+0.918
Neg logit contributions
 tremend-0.540
 behavi-0.478
 disparate-0.458
 artificially-0.449
 fragmented-0.428
Token sense
Token ID2565
Feature activation-5.98e+00

Pos logit contributions
 tremend+1.290
 artificially+1.083
 behavi+1.047
 disparate+1.032
 fragmented+1.018
Neg logit contributions
ix-2.329
uh-2.316
u-2.234
are-2.204
ind-2.198
Token mainstream
Token ID8661
Feature activation-3.49e+00

Pos logit contributions
 tremend+8.706
 artificially+8.037
 disparate+7.750
 meaningful+7.621
 diminishing+7.568
Neg logit contributions
uh-12.423
ix-12.302
u-11.761
ath-11.591
are-11.576
Token media
Token ID2056
Feature activation-5.36e+00

Pos logit contributions
 tremend+2.655
 artificially+2.384
 disparate+2.243
 behavi+2.219
 meaningful+2.189
Neg logit contributions
uh-4.955
ix-4.906
u-4.745
are-4.691
ath-4.665
Token,
Token ID11
Feature activation-9.47e+00

Pos logit contributions
 tremend+6.954
 artificially+6.530
 disparate+6.414
 meaningful+6.326
 behavi+6.304
Neg logit contributions
uh-4.712
ix-4.641
u-4.392
are-4.275
ath-4.255
Token I
Token ID314
Feature activation-2.73e+00

Pos logit contributions
 tremend+10.433
 artificially+9.748
 disparate+9.479
 meaningful+9.345
 diminishing+9.294
Neg logit contributions
uh-10.032
ix-9.902
u-9.397
are-9.223
ath-9.215
Token don
Token ID836
Feature activation+4.20e+00

Pos logit contributions
 tremend+3.782
 artificially+3.554
 disparate+3.512
 behavi+3.458
 meaningful+3.453
Neg logit contributions
uh-2.177
ix-2.110
u-1.999
ath-1.952
are-1.950
Token
Token ID447
Feature activation+4.82e+01

Pos logit contributions
uh+1.796
ix+1.753
u+1.576
ath+1.456
are+1.443
Neg logit contributions
 tremend-7.739
 behavi-7.024
 artificially-6.977
 disparate-6.933
 fragmented-6.817
Token
Token ID247
Feature activation+3.66e+01

Pos logit contributions
+29.099
+28.163
+28.155
+27.751
+27.534
Neg logit contributions
 tremend-42.053
 behavi-35.845
 mathemat-35.534
 hemor-34.074
 undermin-33.730
Tokent
Token ID83
Feature activation+5.87e+00

Pos logit contributions
t+25.816
u+25.430
ix+24.414
i+24.244
d+23.592
Neg logit contributions
 tremend-34.877
 behavi-32.064
 hemor-31.087
 undermin-30.451
 disparate-30.047
Token want
Token ID765
Feature activation+5.37e-02

Pos logit contributions
uh+6.879
ix+6.836
u+6.536
ath+6.496
are+6.490
Neg logit contributions
 tremend-5.466
 behavi-4.945
 disparate-4.874
 artificially-4.711
 fragmented-4.637
Token to
Token ID284
Feature activation-6.90e+00

Pos logit contributions
uh+0.061
ix+0.061
u+0.059
are+0.057
id+0.057
Neg logit contributions
 tremend-0.056
 artificially-0.050
 behavi-0.049
 disparate-0.049
 fragmented-0.048
Token hear
Token ID3285
Feature activation-1.65e+00

Pos logit contributions
 tremend+7.393
 artificially+6.812
 disparate+6.697
 behavi+6.603
 meaningful+6.571
Neg logit contributions
uh-7.596
ix-7.500
u-7.150
ath-7.006
fn-7.002
Token desert
Token ID10326
Feature activation+2.40e+00

Pos logit contributions
uh+2.809
ix+2.802
u+2.709
ath+2.674
are+2.673
Neg logit contributions
 tremend-1.721
 artificially-1.491
 behavi-1.452
 disparate-1.433
 meaningful-1.393
Token (
Token ID357
Feature activation+5.44e+00

Pos logit contributions
uh+2.842
ix+2.815
are+2.753
u+2.750
id+2.675
Neg logit contributions
 tremend-2.254
 behavi-2.015
 artificially-2.010
 disparate-1.989
 meaningful-1.918
Tokenthe
Token ID1169
Feature activation-3.27e+00

Pos logit contributions
uh+2.837
ix+2.686
u+2.508
are+2.497
ind+2.487
Neg logit contributions
 tremend-8.900
 artificially-8.403
 behavi-8.221
 disparate-8.215
 diminishing-8.138
Token poems
Token ID31888
Feature activation+9.67e-01

Pos logit contributions
 tremend+2.909
 artificially+2.623
 behavi+2.522
 disparate+2.511
 meaningful+2.467
Neg logit contributions
uh-4.253
ix-4.195
u-4.064
are-3.982
ind-3.973
Token don
Token ID836
Feature activation-5.64e-01

Pos logit contributions
uh+1.148
ix+1.136
u+1.096
are+1.092
ind+1.070
Neg logit contributions
 tremend-0.935
 behavi-0.842
 artificially-0.842
 disparate-0.828
 diminishing-0.799
Token
Token ID447
Feature activation+4.81e+01

Pos logit contributions
 tremend+0.929
 artificially+0.878
 disparate+0.864
 behavi+0.854
 meaningful+0.853
Neg logit contributions
uh-0.300
ix-0.293
u-0.264
ath-0.253
are-0.252
Token
Token ID247
Feature activation+3.75e+01

Pos logit contributions
+29.839
+28.646
+28.089
+28.067
+27.895
Neg logit contributions
 tremend-40.354
 behavi-34.987
 mathemat-33.883
 undermin-32.283
 hemor-32.156
Tokent
Token ID83
Feature activation+3.66e+00

Pos logit contributions
t+30.261
u+29.878
s+29.796
ix+29.445
i+29.008
Neg logit contributions
 tremend-25.713
 behavi-22.980
 hemor-21.715
 disparate-21.358
 undermin-21.082
Token argue
Token ID7267
Feature activation+1.28e-01

Pos logit contributions
uh+4.933
ix+4.879
u+4.739
are+4.712
fn+4.637
Neg logit contributions
 tremend-2.845
 behavi-2.640
 disparate-2.413
 artificially-2.322
 meaningful-2.270
Token,
Token ID11
Feature activation-1.18e+00

Pos logit contributions
uh+0.155
ix+0.153
u+0.147
are+0.145
ind+0.144
Neg logit contributions
 tremend-0.122
 artificially-0.110
 disparate-0.107
 behavi-0.106
 diminishing-0.104
Token they
Token ID484
Feature activation-2.78e+00

Pos logit contributions
 tremend+1.248
 artificially+1.123
 disparate+1.108
 behavi+1.101
 meaningful+1.077
Neg logit contributions
uh-1.318
ix-1.303
u-1.251
ind-1.227
are-1.226
Token
Token ID198
Feature activation+1.67e+01

Pos logit contributions
ix+7.912
uh+7.726
u+7.680
i+7.436
are+7.101
Neg logit contributions
 tremend-29.130
 behavi-25.211
 hemor-24.790
 undermin-24.775
 exha-23.921
Token:
Token ID25
Feature activation+1.75e+01

Pos logit contributions
uh+9.396
ix+9.358
Co+9.192
i+8.920
Star+8.872
Neg logit contributions
 tremend-25.494
 behavi-23.118
 artificially-22.371
 hemor-22.128
 fragmented-21.904
Token<|endoftext|>
Token ID50256
Feature activation+6.85e+00

Pos logit contributions
ix+16.538
uh+16.281
u+15.755
V+15.404
i+15.386
Neg logit contributions
 tremend-17.030
 behavi-14.863
 artificially-14.654
 hemor-13.971
 disparate-13.887
TokenIt
Token ID1026
Feature activation+8.44e+00

Pos logit contributions
uh+3.448
ix+3.354
u+3.139
are+3.028
i+2.937
Neg logit contributions
 tremend-11.537
 artificially-10.710
 behavi-10.706
 disparate-10.536
 diminishing-10.389
Token can
Token ID460
Feature activation+6.19e+00

Pos logit contributions
uh+8.192
ix+8.189
u+7.987
ath+7.673
are+7.586
Neg logit contributions
 tremend-8.861
 behavi-8.314
 disparate-8.098
 artificially-7.889
 meaningful-7.797
Token
Token ID447
Feature activation+4.81e+01

Pos logit contributions
uh+7.444
ix+7.391
u+7.174
i+6.967
are+6.916
Neg logit contributions
 tremend-5.344
 behavi-5.216
 disparate-4.741
 meaningful-4.533
 diminishing-4.523
Token
Token ID247
Feature activation+3.94e+01

Pos logit contributions
+32.443
+31.926
+29.858
+29.755
+29.585
Neg logit contributions
 tremend-32.049
 behavi-28.041
 artificially-27.475
 diminishing-26.921
 fragmented-26.794
Tokent
Token ID83
Feature activation+1.51e+01

Pos logit contributions
u+35.453
uh+35.384
ix+35.086
ind+34.585
i+34.242
Neg logit contributions
 tremend-15.283
 behavi-14.760
 disparate-14.398
 diminishing-13.761
 artificially-13.451
Token be
Token ID307
Feature activation+4.09e+00

Pos logit contributions
uh+16.893
ix+16.795
u+16.243
i+15.897
are+15.793
Neg logit contributions
 tremend-11.073
 behavi-10.903
 disparate-9.981
 diminishing-9.299
 meaningful-9.112
Token easy
Token ID2562
Feature activation+3.34e+00

Pos logit contributions
uh+5.475
ix+5.460
u+5.326
ind+5.210
ath+5.145
Neg logit contributions
 tremend-3.194
 behavi-2.749
 disparate-2.576
 artificially-2.548
 diminishing-2.499
Token be
Token ID307
Feature activation+2.44e+00

Pos logit contributions
ix+3.604
uh+3.562
u+3.451
id+3.324
fn+3.320
Neg logit contributions
 tremend-3.457
 artificially-3.176
 disparate-3.127
 fragmented-3.067
 behavi-3.059
Token :
Token ID1058
Feature activation+6.99e+00

Pos logit contributions
uh+10.036
ix+9.743
i+9.458
u+9.427
ind+9.045
Neg logit contributions
 tremend-18.331
 disparate-16.737
 behavi-16.417
 artificially-16.299
 diminishing-15.732
Token Oh
Token ID3966
Feature activation+1.26e+01

Pos logit contributions
uh+5.867
ix+5.605
u+5.405
fn+5.166
are+5.161
Neg logit contributions
 tremend-9.664
 behavi-8.590
 artificially-8.409
 disparate-8.197
 mutually-8.076
Token man
Token ID582
Feature activation+1.20e+01

Pos logit contributions
uh+14.203
u+13.400
ix+13.300
oh+13.261
i+13.158
Neg logit contributions
 tremend-10.447
 behavi-9.778
 artificially-9.314
 disparate-9.187
 diminishing-8.937
Token,
Token ID11
Feature activation+4.41e+00

Pos logit contributions
uh+11.747
ix+11.236
u+11.159
oh+10.801
i+10.789
Neg logit contributions
 tremend-11.538
 artificially-10.801
 disparate-10.599
 behavi-10.446
 diminishing-10.212
Token I
Token ID314
Feature activation+1.87e+00

Pos logit contributions
uh+5.508
ix+5.226
u+5.148
oh+5.071
are+5.051
Neg logit contributions
 tremend-3.921
 behavi-3.467
 artificially-3.349
 disparate-3.284
 diminishing-3.167
Token
Token ID447
Feature activation+4.80e+01

Pos logit contributions
uh+2.170
ix+2.112
u+2.050
are+2.004
ind+2.004
Neg logit contributions
 tremend-1.905
 behavi-1.730
 disparate-1.695
 artificially-1.692
 diminishing-1.653
Token
Token ID247
Feature activation+3.53e+01

Pos logit contributions
+32.419
+30.668
+30.375
+30.251
+29.790
Neg logit contributions
 tremend-35.054
 behavi-30.458
 mathemat-28.930
 undermin-27.338
 hemor-27.029
Tokend
Token ID67
Feature activation+1.15e+01

Pos logit contributions
u+30.389
ix+29.683
uh+29.523
i+28.981
ind+28.784
Neg logit contributions
 tremend-22.329
 behavi-21.405
 disparate-20.240
 artificially-19.900
 undermin-19.509
Token definitely
Token ID4753
Feature activation+8.02e-02

Pos logit contributions
uh+14.302
ix+13.607
u+13.485
id+13.365
ind+13.282
Neg logit contributions
 tremend-8.595
 behavi-8.276
 disparate-7.582
 artificially-7.311
 diminishing-7.197
Token need
Token ID761
Feature activation-1.70e+00

Pos logit contributions
uh+0.108
ix+0.105
u+0.103
ind+0.100
ath+0.100
Neg logit contributions
 tremend-0.066
 artificially-0.059
 behavi-0.058
 disparate-0.058
 diminishing-0.055
Token some
Token ID617
Feature activation-1.09e+01

Pos logit contributions
 tremend+1.739
 artificially+1.611
 disparate+1.583
 behavi+1.527
 fragmented+1.518
Neg logit contributions
uh-1.949
ix-1.930
u-1.857
are-1.812
fn-1.805
Token.
Token ID13
Feature activation-3.95e+00

Pos logit contributions
 tremend+1.911
 artificially+1.769
 disparate+1.733
 meaningful+1.733
 behavi+1.728
Neg logit contributions
uh-1.453
ix-1.424
u-1.370
ath-1.332
are-1.323
Token The
Token ID383
Feature activation-1.03e+01

Pos logit contributions
 tremend+5.522
 artificially+5.159
 disparate+5.036
 behavi+4.990
 meaningful+4.981
Neg logit contributions
uh-3.131
ix-3.086
u-2.891
fn-2.819
are-2.806
Token working
Token ID1762
Feature activation-8.92e-01

Pos logit contributions
 tremend+11.174
 artificially+10.415
 disparate+10.105
 meaningful+9.974
 behavi+9.942
Neg logit contributions
uh-11.176
ix-11.018
u-10.489
ath-10.278
are-10.274
Token class
Token ID1398
Feature activation-3.56e+00

Pos logit contributions
 tremend+0.846
 artificially+0.744
 behavi+0.737
 disparate+0.707
 mutually+0.695
Neg logit contributions
uh-1.121
ix-1.088
u-1.066
ath-1.041
i-1.039
Token doesn
Token ID1595
Feature activation-7.84e-01

Pos logit contributions
 tremend+4.286
 artificially+3.989
 disparate+3.879
 behavi+3.875
 meaningful+3.848
Neg logit contributions
uh-3.453
ix-3.397
u-3.243
ath-3.172
are-3.164
Token
Token ID447
Feature activation+4.80e+01

Pos logit contributions
 tremend+1.309
 artificially+1.243
 disparate+1.223
 meaningful+1.210
 diminishing+1.207
Neg logit contributions
uh-0.391
ix-0.382
u-0.340
ath-0.325
fn-0.324
Token
Token ID247
Feature activation+3.86e+01

Pos logit contributions
+29.311
+27.879
+27.699
+27.073
+27.017
Neg logit contributions
 tremend-41.302
 behavi-36.194
 mathemat-35.778
 undermin-33.797
 hemor-33.791
Tokent
Token ID83
Feature activation+4.49e+00

Pos logit contributions
t+29.358
u+28.358
s+27.844
i+26.998
d+26.922
Neg logit contributions
 tremend-28.271
 behavi-26.279
 hemor-24.510
 undermin-24.333
 disparate-23.716
Token hold
Token ID1745
Feature activation-2.11e-01

Pos logit contributions
uh+5.824
ix+5.758
u+5.611
are+5.523
fn+5.523
Neg logit contributions
 tremend-3.628
 behavi-3.379
 disparate-3.077
 artificially-3.023
 diminishing-2.907
Token power
Token ID1176
Feature activation-6.73e+00

Pos logit contributions
 tremend+0.163
 behavi+0.146
 artificially+0.139
 disparate+0.131
 diminishing+0.131
Neg logit contributions
uh-0.290
ix-0.287
u-0.284
id-0.275
i-0.274
Token anywhere
Token ID6609
Feature activation-2.64e+00

Pos logit contributions
 tremend+8.565
 artificially+7.971
 disparate+7.845
 behavi+7.721
 diminishing+7.706
Neg logit contributions
uh-6.108
ix-6.007
u-5.682
ath-5.549
ind-5.521
Token across
Token ID1973
Feature activation-9.31e-01

Pos logit contributions
 tremend+0.486
 behavi+0.430
 artificially+0.426
 disparate+0.423
 diminishing+0.411
Neg logit contributions
uh-0.517
ix-0.513
u-0.491
ath-0.490
ind-0.485
Token this
Token ID428
Feature activation-8.64e+00

Pos logit contributions
 tremend+0.701
 artificially+0.585
 behavi+0.547
 disparate+0.538
 diminishing+0.533
Neg logit contributions
ix-1.350
uh-1.336
ath-1.297
u-1.293
ind-1.284
Token room
Token ID2119
Feature activation+4.38e-01

Pos logit contributions
 tremend+8.733
 artificially+8.030
 disparate+7.753
 meaningful+7.649
 behavi+7.603
Neg logit contributions
uh-10.162
ix-10.062
u-9.550
are-9.448
ath-9.439
Token,
Token ID11
Feature activation-5.86e+00

Pos logit contributions
uh+0.421
ix+0.417
are+0.392
u+0.391
ind+0.387
Neg logit contributions
 tremend-0.532
 artificially-0.482
 behavi-0.476
 disparate-0.473
 diminishing-0.469
Token you
Token ID345
Feature activation-3.25e+00

Pos logit contributions
 tremend+5.824
 artificially+5.278
 disparate+5.104
 behavi+5.052
 diminishing+5.019
Neg logit contributions
uh-7.042
ix-6.932
u-6.624
are-6.547
ath-6.513
Token
Token ID447
Feature activation+4.79e+01

Pos logit contributions
 tremend+3.736
 artificially+3.428
 disparate+3.373
 behavi+3.345
 meaningful+3.316
Neg logit contributions
uh-3.389
ix-3.349
u-3.188
are-3.148
ath-3.140
Token
Token ID247
Feature activation+3.17e+01

Pos logit contributions
+30.762
+28.822
+28.372
+28.357
+27.937
Neg logit contributions
 tremend-41.579
 behavi-36.048
 mathemat-33.605
 hemor-33.055
 undermin-32.752
Tokenll
Token ID297
Feature activation+5.91e+00

Pos logit contributions
ix+26.827
u+26.518
uh+26.061
t+25.727
s+25.519
Neg logit contributions
 tremend-22.630
 behavi-21.725
 disparate-20.265
 artificially-19.693
 diminishing-19.305
Token see
Token ID766
Feature activation-6.97e+00

Pos logit contributions
uh+6.495
ix+6.440
u+6.203
id+6.098
ath+6.050
Neg logit contributions
 tremend-6.079
 behavi-5.507
 artificially-5.193
 disparate-5.174
 meaningful-5.037
Token a
Token ID257
Feature activation-1.13e+01

Pos logit contributions
 tremend+6.861
 artificially+6.212
 disparate+5.974
 behavi+5.923
 meaningful+5.888
Neg logit contributions
uh-8.442
ix-8.350
u-7.976
are-7.883
ath-7.858
Token wide
Token ID3094
Feature activation+1.26e+00

Pos logit contributions
 tremend+12.703
 artificially+11.958
 disparate+11.636
 meaningful+11.488
 diminishing+11.418
Neg logit contributions
uh-11.552
ix-11.411
u-10.777
ath-10.590
fn-10.586
Token a
Token ID257
Feature activation-1.62e+00

Pos logit contributions
uh+4.749
ix+4.735
u+4.664
ath+4.587
ind+4.554
Neg logit contributions
 tremend-2.630
 behavi-2.324
 artificially-2.299
 disparate-2.274
 meaningful-2.115
Token passionate
Token ID15347
Feature activation+5.32e+00

Pos logit contributions
 tremend+1.298
 behavi+1.124
 artificially+1.123
 disparate+1.072
 diminishing+1.064
Neg logit contributions
uh-2.227
ix-2.225
u-2.169
ind-2.110
are-2.108
Token man
Token ID582
Feature activation+3.66e+00

Pos logit contributions
ix+6.780
uh+6.567
u+6.481
i+6.388
ind+6.370
Neg logit contributions
 tremend-4.401
 behavi-3.695
 artificially-3.577
 disparate-3.538
 diminishing-3.467
Token.
Token ID13
Feature activation+5.98e+00

Pos logit contributions
ix+3.780
uh+3.756
u+3.631
i+3.537
ath+3.525
Neg logit contributions
 tremend-4.028
 artificially-3.688
 disparate-3.676
 behavi-3.626
 meaningful-3.551
Token He
Token ID679
Feature activation+1.45e+00

Pos logit contributions
uh+4.480
ix+4.446
u+4.233
are+4.100
i+4.053
Neg logit contributions
 tremend-8.671
 behavi-7.700
 artificially-7.646
 disparate-7.515
 diminishing-7.454
Token
Token ID447
Feature activation+4.79e+01

Pos logit contributions
ix+1.373
uh+1.357
u+1.298
ath+1.297
i+1.251
Neg logit contributions
 tremend-1.765
 behavi-1.648
 disparate-1.585
 diminishing-1.566
 artificially-1.559
Token
Token ID247
Feature activation+3.80e+01

Pos logit contributions
+33.643
+32.247
+30.409
+30.337
+30.115
Neg logit contributions
 tremend-31.068
 behavi-27.931
 artificially-26.839
 diminishing-26.596
 fragmented-26.091
Tokens
Token ID82
Feature activation+8.09e+00

Pos logit contributions
ix+33.567
u+33.542
s+33.322
i+33.151
uh+32.504
Neg logit contributions
 tremend-17.751
 diminishing-16.458
 disparate-16.079
 behavi-16.071
 undermin-14.929
Token passionate
Token ID15347
Feature activation+2.58e+00

Pos logit contributions
uh+10.813
ix+10.725
u+10.608
ath+10.580
ind+10.430
Neg logit contributions
 tremend-5.141
 behavi-4.419
 artificially-4.356
 disparate-4.307
 diminishing-4.194
Token about
Token ID546
Feature activation-4.12e+00

Pos logit contributions
uh+2.924
ix+2.920
u+2.818
ath+2.739
i+2.727
Neg logit contributions
 tremend-2.548
 artificially-2.327
 disparate-2.310
 diminishing-2.231
 fragmented-2.228
Token his
Token ID465
Feature activation-6.93e+00

Pos logit contributions
 tremend+3.373
 artificially+2.947
 disparate+2.841
 behavi+2.811
 fragmented+2.772
Neg logit contributions
ix-5.713
uh-5.697
u-5.514
ath-5.446
fn-5.398
Tokenerate
Token ID21620
Feature activation+3.13e+00

Pos logit contributions
uh+6.124
ix+5.569
u+5.555
are+5.506
ath+5.348
Neg logit contributions
 tremend-17.045
 behavi-16.383
 disparate-15.894
 artificially-15.496
 meaningful-15.400
Token flaws
Token ID17978
Feature activation-6.81e+00

Pos logit contributions
uh+3.863
ix+3.788
u+3.656
ath+3.650
are+3.615
Neg logit contributions
 tremend-2.840
 behavi-2.407
 artificially-2.369
 disparate-2.339
 mutually-2.282
Token present
Token ID1944
Feature activation-7.77e+00

Pos logit contributions
 tremend+8.256
 artificially+7.677
 disparate+7.536
 meaningful+7.417
 behavi+7.393
Neg logit contributions
uh-6.594
ix-6.525
u-6.166
are-6.035
ath-6.023
Token that
Token ID326
Feature activation-7.03e+00

Pos logit contributions
 tremend+9.187
 artificially+8.501
 disparate+8.353
 meaningful+8.217
 behavi+8.192
Neg logit contributions
uh-7.765
ix-7.655
u-7.284
are-7.111
ath-7.097
Token don
Token ID836
Feature activation+4.09e-02

Pos logit contributions
 tremend+7.647
 artificially+7.083
 disparate+6.943
 meaningful+6.838
 behavi+6.779
Neg logit contributions
uh-7.593
ix-7.501
u-7.124
are-7.039
ath-6.996
Token
Token ID447
Feature activation+4.79e+01

Pos logit contributions
uh+0.026
ix+0.026
u+0.024
ath+0.023
are+0.023
Neg logit contributions
 tremend-0.063
 artificially-0.059
 disparate-0.058
 behavi-0.057
 meaningful-0.057
Token
Token ID247
Feature activation+3.94e+01

Pos logit contributions
+31.752
+30.513
+30.195
+30.158
+30.108
Neg logit contributions
 tremend-39.130
 behavi-33.743
 mathemat-32.194
 hemor-31.439
 undermin-31.212
Tokent
Token ID83
Feature activation+7.61e+00

Pos logit contributions
u+31.069
t+30.909
ix+30.530
s+30.444
i+29.783
Neg logit contributions
 tremend-32.913
 behavi-27.999
 hemor-27.862
 undermin-26.307
 mathemat-25.328
Token affect
Token ID2689
Feature activation-5.57e+00

Pos logit contributions
uh+9.971
ix+9.912
u+9.744
are+9.520
ath+9.410
Neg logit contributions
 tremend-5.336
 behavi-5.055
 disparate-4.553
 hemor-4.208
 meaningful-4.137
Token the
Token ID262
Feature activation-1.15e+01

Pos logit contributions
 tremend+4.225
 artificially+3.746
 disparate+3.607
 behavi+3.520
 mutually+3.466
Neg logit contributions
uh-7.954
ix-7.894
u-7.629
are-7.551
ath-7.501
Token console
Token ID8624
Feature activation+1.64e+00

Pos logit contributions
 tremend+12.146
 artificially+11.382
 disparate+11.057
 meaningful+10.905
 diminishing+10.835
Neg logit contributions
uh-12.555
ix-12.411
u-11.766
ath-11.576
fn-11.571
Token force
Token ID2700
Feature activation-1.25e+01

Pos logit contributions
 tremend+3.584
 artificially+3.285
 disparate+3.196
 diminishing+3.190
 behavi+3.190
Neg logit contributions
uh-2.925
ix-2.854
u-2.756
are-2.690
ath-2.662
Token more
Token ID517
Feature activation-1.23e+01

Pos logit contributions
 tremend+15.934
 artificially+15.333
 disparate+14.949
 meaningful+14.799
 diminishing+14.668
Neg logit contributions
uh-10.283
ix-10.130
u-9.401
fn-9.251
ath-9.233
Token effectively
Token ID6840
Feature activation-1.41e+01

Pos logit contributions
 tremend+14.389
 artificially+13.727
 disparate+13.351
 meaningful+13.247
 diminishing+13.076
Neg logit contributions
uh-11.539
ix-11.366
u-10.678
fn-10.505
-10.461
Token,
Token ID11
Feature activation-1.32e+01

Pos logit contributions
 tremend+17.897
 artificially+17.280
 disparate+16.869
 meaningful+16.749
 diminishing+16.622
Neg logit contributions
uh-11.213
ix-11.009
u-10.160
fn-10.020
-10.019
Token improve
Token ID2987
Feature activation-1.49e+01

Pos logit contributions
 tremend+15.782
 artificially+15.033
 disparate+14.647
 meaningful+14.517
 diminishing+14.434
Neg logit contributions
uh-12.012
ix-11.850
u-11.091
fn-10.920
ath-10.900
Token their
Token ID511
Feature activation-2.17e+01

Pos logit contributions
 tremend+17.175
 artificially+16.371
 disparate+15.962
 meaningful+15.821
 diminishing+15.697
Neg logit contributions
uh-14.105
ix-13.875
u-13.040
fn-12.848
ath-12.793
Token intelligence
Token ID4430
Feature activation-7.79e+00

Pos logit contributions
 tremend+24.752
 artificially+24.261
 disparate+23.876
 meaningful+23.529
 diminishing+23.493
Neg logit contributions
uh-16.813
ix-16.370
u-15.285
-15.240
fn-15.145
Token capacities
Token ID32484
Feature activation-1.57e+01

Pos logit contributions
 tremend+9.833
 artificially+9.240
 disparate+9.004
 meaningful+8.882
 diminishing+8.827
Neg logit contributions
uh-7.166
ix-7.048
u-6.688
are-6.537
ath-6.489
Token.
Token ID13
Feature activation-4.04e+00

Pos logit contributions
 tremend+19.945
 artificially+19.281
 meaningful+18.666
 disparate+18.623
 mutually+18.454
Neg logit contributions
uh-12.052
ix-11.868
u-10.859
fn-10.841
-10.822
Token
Token ID447
Feature activation+3.89e+01

Pos logit contributions
 tremend+5.624
 artificially+5.309
 disparate+5.193
 meaningful+5.131
 diminishing+5.112
Neg logit contributions
uh-3.134
ix-3.083
u-2.863
fn-2.791
are-2.790
Token
Token ID251
Feature activation+2.81e+01

Pos logit contributions
+26.854
+25.057
+24.965
+24.937
+24.848
Neg logit contributions
 tremend-34.351
 behavi-31.041
 misunder-28.330
 undermin-28.237
 mathemat-28.097
Token the
Token ID262
Feature activation-1.09e+01

Pos logit contributions
 tremend+11.499
 artificially+10.755
 disparate+10.443
 meaningful+10.304
 diminishing+10.233
Neg logit contributions
uh-11.681
ix-11.531
u-10.962
ath-10.766
fn-10.759
Token L
Token ID406
Feature activation+8.73e+00

Pos logit contributions
 tremend+11.743
 artificially+11.073
 disparate+10.774
 meaningful+10.635
 diminishing+10.579
Neg logit contributions
uh-11.411
ix-11.311
u-10.656
fn-10.516
ath-10.502
Tokenocker
Token ID12721
Feature activation+1.48e+01

Pos logit contributions
uh+4.557
ix+3.877
ath+3.859
u+3.670
are+3.619
Neg logit contributions
 tremend-14.469
 artificially-13.633
 behavi-13.486
 mutually-13.442
 disparate-13.410
Tokenbie
Token ID12590
Feature activation+6.37e+00

Pos logit contributions
uh+8.504
u+7.880
i+7.687
ah+7.618
id+7.587
Neg logit contributions
 tremend-21.603
 behavi-19.253
 artificially-18.945
 disparate-18.854
 mutually-18.801
Token bombing
Token ID13471
Feature activation+2.31e-01

Pos logit contributions
uh+7.258
ix+7.031
u+6.980
i+6.829
ath+6.750
Neg logit contributions
 tremend-6.096
 artificially-5.273
 behavi-5.241
 disparate-5.178
 diminishing-5.037
Token to
Token ID284
Feature activation-2.16e+01

Pos logit contributions
uh+0.237
ix+0.231
u+0.226
are+0.218
fn+0.218
Neg logit contributions
 tremend-0.262
 artificially-0.243
 disparate-0.240
 meaningful-0.235
 behavi-0.232
Token
Token ID564
Feature activation+2.70e+01

Pos logit contributions
 artificially+21.960
 tremend+21.341
 mutually+20.811
 meaningful+20.673
 disparate+20.600
Neg logit contributions
uh-18.498
ix-18.033
-16.956
u-16.656
fn-16.618
Token
Token ID246
Feature activation+2.01e+01

Pos logit contributions
+20.793
+20.268
+19.120
ix+19.049
+18.653
Neg logit contributions
 tremend-27.372
 behavi-23.418
 artificially-22.442
 hemor-22.327
 diminishing-22.264
Tokengo
Token ID2188
Feature activation-5.06e+00

Pos logit contributions
uh+19.782
u+19.347
ind+19.161
are+19.129
ix+18.979
Neg logit contributions
 tremend-17.399
 behavi-16.504
 disparate-15.231
 artificially-14.656
 fragmented-14.598
Token after
Token ID706
Feature activation-1.71e+01

Pos logit contributions
 tremend+5.297
 artificially+4.856
 disparate+4.752
 diminishing+4.666
 behavi+4.657
Neg logit contributions
uh-5.746
ix-5.674
u-5.475
ath-5.371
fn-5.335
Token Libya
Token ID13465
Feature activation+5.34e-01

Pos logit contributions
 tremend+21.434
 artificially+21.133
 meaningful+20.507
 disparate+20.489
 diminishing+20.286
Neg logit contributions
uh-12.636
ix-12.339
u-11.368
fn-11.128
-11.074
Token give
Token ID1577
Feature activation-6.05e+00

Pos logit contributions
 tremend+6.690
 artificially+6.174
 disparate+5.984
 behavi+5.908
 meaningful+5.889
Neg logit contributions
uh-7.902
ix-7.812
u-7.474
ath-7.366
fn-7.348
Token up
Token ID510
Feature activation-8.71e+00

Pos logit contributions
 tremend+7.878
 artificially+7.400
 disparate+7.188
 behavi+7.144
 meaningful+7.071
Neg logit contributions
uh-5.457
ix-5.326
u-5.090
ath-4.960
fn-4.937
Token their
Token ID511
Feature activation-1.63e+01

Pos logit contributions
 tremend+11.891
 artificially+11.245
 disparate+10.975
 meaningful+10.839
 behavi+10.802
Neg logit contributions
uh-7.121
ix-7.017
u-6.593
ath-6.421
are-6.368
Token wealth
Token ID5129
Feature activation-8.29e+00

Pos logit contributions
 tremend+19.901
 artificially+19.235
 disparate+18.948
 meaningful+18.675
 diminishing+18.626
Neg logit contributions
uh-13.099
ix-12.937
u-11.911
fn-11.797
-11.710
Token and
Token ID290
Feature activation-1.77e+01

Pos logit contributions
 tremend+11.195
 artificially+10.543
 disparate+10.340
 meaningful+10.231
 behavi+10.198
Neg logit contributions
uh-6.831
ix-6.736
u-6.315
ath-6.161
are-6.124
Token their
Token ID511
Feature activation-2.10e+01

Pos logit contributions
 tremend+22.519
 artificially+21.785
 disparate+21.347
 meaningful+21.218
 diminishing+21.179
Neg logit contributions
uh-12.989
ix-12.821
u-11.592
fn-11.547
-11.419
Token life
Token ID1204
Feature activation-7.54e+00

Pos logit contributions
 tremend+25.415
 artificially+24.920
 disparate+24.500
 diminishing+24.356
 meaningful+24.336
Neg logit contributions
uh-14.298
ix-14.256
fn-12.800
u-12.660
-12.652
Token?
Token ID30
Feature activation+5.57e-01

Pos logit contributions
 tremend+10.412
 artificially+9.792
 disparate+9.603
 meaningful+9.492
 behavi+9.478
Neg logit contributions
uh-6.017
ix-5.917
u-5.562
ath-5.417
are-5.387
Token
Token ID198
Feature activation+1.76e+00

Pos logit contributions
uh+0.370
ix+0.355
u+0.336
are+0.330
ath+0.322
Neg logit contributions
 tremend-0.849
 artificially-0.798
 disparate-0.785
 behavi-0.785
 meaningful-0.772
Token
Token ID198
Feature activation+1.86e+00

Pos logit contributions
uh+1.143
ix+1.129
u+1.043
are+1.007
fn+1.002
Neg logit contributions
 tremend-2.789
 artificially-2.523
 behavi-2.519
 disparate-2.488
 diminishing-2.447
TokenF
Token ID37
Feature activation+8.61e+00

Pos logit contributions
uh+0.849
ix+0.807
u+0.724
are+0.717
ind+0.689
Neg logit contributions
 tremend-3.249
 artificially-3.071
 disparate-3.028
 behavi-3.027
 diminishing-2.987
Token
Token ID447
Feature activation+3.71e+01

Pos logit contributions
uh+1.337
ix+1.292
u+1.242
fn+1.193
ind+1.191
Neg logit contributions
 tremend-2.335
 artificially-2.150
 disparate-2.109
 meaningful-2.095
 behavi-2.077
Token
Token ID251
Feature activation+2.90e+01

Pos logit contributions
+28.660
+26.807
+26.753
+26.717
+26.692
Neg logit contributions
 tremend-24.678
 artificially-21.387
 behavi-21.385
 disparate-20.688
 mutually-20.419
Token)
Token ID8
Feature activation-2.60e+00

Pos logit contributions
uh+26.807
ix+25.963
i+25.859
u+25.705
are+25.564
Neg logit contributions
 tremend-18.413
 behavi-15.396
 artificially-14.965
 disparate-14.753
 fragmented-14.410
Token without
Token ID1231
Feature activation-1.09e+01

Pos logit contributions
 tremend+3.430
 artificially+3.176
 disparate+3.112
 behavi+3.072
 meaningful+3.058
Neg logit contributions
uh-2.283
ix-2.232
u-2.103
are-2.057
ind-2.049
Token revealing
Token ID13477
Feature activation-1.59e+01

Pos logit contributions
 tremend+11.884
 artificially+11.249
 disparate+10.924
 meaningful+10.824
 diminishing+10.757
Neg logit contributions
uh-11.274
ix-11.117
u-10.507
fn-10.347
are-10.346
Token the
Token ID262
Feature activation-2.09e+01

Pos logit contributions
 tremend+19.568
 artificially+19.031
 disparate+18.666
 meaningful+18.641
 mutually+18.344
Neg logit contributions
uh-12.486
ix-12.255
u-11.336
fn-11.176
-11.166
Token specific
Token ID2176
Feature activation-1.22e+01

Pos logit contributions
 tremend+24.978
 artificially+24.487
 disparate+24.314
 meaningful+24.251
 mutually+23.741
Neg logit contributions
uh-14.591
ix-14.216
fn-12.995
u-12.958
are-12.848
Token sources
Token ID4237
Feature activation-9.69e+00

Pos logit contributions
 tremend+13.936
 artificially+13.299
 disparate+13.056
 meaningful+12.900
 diminishing+12.780
Neg logit contributions
uh-11.581
ix-11.407
u-10.689
fn-10.529
ath-10.529
Token or
Token ID393
Feature activation-1.37e+01

Pos logit contributions
 tremend+14.074
 artificially+13.433
 disparate+13.159
 meaningful+13.032
 diminishing+12.973
Neg logit contributions
uh-6.715
ix-6.594
u-6.051
ath-5.890
fn-5.888
Token methods
Token ID5050
Feature activation-1.20e+01

Pos logit contributions
 tremend+16.293
 artificially+15.588
 disparate+15.256
 meaningful+15.086
 diminishing+14.999
Neg logit contributions
uh-12.501
ix-12.286
u-11.496
fn-11.299
ath-11.288
Token for
Token ID329
Feature activation-1.39e+01

Pos logit contributions
 tremend+17.148
 artificially+16.542
 disparate+16.199
 meaningful+16.057
 diminishing+15.981
Neg logit contributions
uh-8.084
ix-7.926
u-7.245
fn-7.092
ath-7.088
Token level
Token ID1241
Feature activation-1.13e+01

Pos logit contributions
i+11.352
ix+11.146
uh+11.058
u+10.932
are+10.805
Neg logit contributions
 tremend-6.076
 artificially-5.454
 disparate-5.401
 nonexistent-5.239
 behavi-5.025
Token,
Token ID11
Feature activation-9.83e+00

Pos logit contributions
 tremend+15.908
 artificially+15.309
 disparate+14.947
 meaningful+14.843
 diminishing+14.775
Neg logit contributions
uh-7.930
ix-7.764
u-7.122
fn-6.975
ath-6.956
Token but
Token ID475
Feature activation-1.02e+01

Pos logit contributions
 tremend+15.436
 artificially+14.852
 disparate+14.559
 meaningful+14.452
 diminishing+14.396
Neg logit contributions
uh-5.425
ix-5.307
u-4.742
fn-4.604
ath-4.586
Token what
Token ID644
Feature activation-6.56e+00

Pos logit contributions
 tremend+13.946
 artificially+13.339
 disparate+13.047
 meaningful+12.944
 diminishing+12.870
Neg logit contributions
uh-7.756
ix-7.631
u-7.037
fn-6.902
ath-6.871
Token about
Token ID546
Feature activation-1.51e+01

Pos logit contributions
 tremend+8.582
 artificially+8.075
 disparate+7.888
 diminishing+7.750
 meaningful+7.728
Neg logit contributions
uh-5.717
ix-5.622
u-5.292
are-5.208
ath-5.168
Token the
Token ID262
Feature activation-2.08e+01

Pos logit contributions
 tremend+20.034
 artificially+19.303
 disparate+18.869
 meaningful+18.823
 diminishing+18.656
Neg logit contributions
uh-11.058
ix-10.799
u-9.912
fn-9.758
ath-9.619
Token passport
Token ID17981
Feature activation+1.83e-01

Pos logit contributions
 tremend+26.484
 artificially+25.738
 meaningful+25.372
 disparate+25.258
 diminishing+25.114
Neg logit contributions
uh-13.155
ix-12.773
fn-11.669
u-11.540
ind-11.178
Tokening
Token ID278
Feature activation-4.35e+00

Pos logit contributions
uh+0.137
ix+0.136
u+0.129
are+0.126
ath+0.124
Neg logit contributions
 tremend-0.258
 artificially-0.242
 disparate-0.239
 behavi-0.234
 meaningful-0.234
Token of
Token ID286
Feature activation-1.53e+01

Pos logit contributions
 tremend+5.865
 artificially+5.517
 disparate+5.433
 meaningful+5.327
 diminishing+5.326
Neg logit contributions
uh-3.618
ix-3.551
u-3.354
ath-3.280
are-3.274
Token people
Token ID661
Feature activation-1.11e+01

Pos logit contributions
 tremend+19.668
 artificially+19.005
 disparate+18.557
 meaningful+18.436
 diminishing+18.246
Neg logit contributions
uh-11.760
ix-11.569
u-10.620
fn-10.537
are-10.373
Token
Token ID447
Feature activation+2.82e+01

Pos logit contributions
 tremend+15.557
 artificially+15.024
 disparate+14.660
 meaningful+14.532
 diminishing+14.455
Neg logit contributions
uh-7.839
ix-7.719
u-7.038
fn-6.953
ath-6.874
Token element
Token ID5002
Feature activation-3.92e+00

Pos logit contributions
 tremend+2.156
 artificially+1.949
 disparate+1.910
 meaningful+1.876
 mutually+1.847
Neg logit contributions
uh-2.788
ix-2.748
u-2.651
ath-2.598
are-2.578
Token to
Token ID284
Feature activation-1.32e+01

Pos logit contributions
 tremend+4.461
 artificially+4.111
 disparate+4.058
 meaningful+3.987
 behavi+3.973
Neg logit contributions
uh-4.059
ix-4.028
u-3.843
are-3.724
ath-3.714
Token determine
Token ID5004
Feature activation-1.11e+01

Pos logit contributions
 tremend+14.079
 artificially+13.593
 disparate+12.967
 meaningful+12.831
 diminishing+12.763
Neg logit contributions
uh-13.359
ix-13.151
u-12.421
ath-12.287
fn-12.281
Token the
Token ID262
Feature activation-1.88e+01

Pos logit contributions
 tremend+12.341
 artificially+11.643
 disparate+11.329
 meaningful+11.192
 diminishing+11.117
Neg logit contributions
uh-11.369
ix-11.225
u-10.601
fn-10.431
ath-10.420
Token success
Token ID1943
Feature activation-6.04e+00

Pos logit contributions
 tremend+22.191
 artificially+21.640
 disparate+21.238
 meaningful+21.223
 diminishing+20.954
Neg logit contributions
uh-14.403
ix-14.068
u-12.947
fn-12.918
-12.791
Token of
Token ID286
Feature activation-2.08e+01

Pos logit contributions
 tremend+7.408
 artificially+6.983
 disparate+6.898
 meaningful+6.708
 diminishing+6.693
Neg logit contributions
uh-5.733
ix-5.675
u-5.358
ath-5.241
fn-5.216
Token restoring
Token ID25646
Feature activation-1.10e+01

Pos logit contributions
 tremend+25.049
 artificially+24.366
 disparate+23.989
 meaningful+23.806
 diminishing+23.574
Neg logit contributions
uh-15.507
ix-15.125
u-13.950
fn-13.895
-13.638
Token a
Token ID257
Feature activation-1.73e+01

Pos logit contributions
 tremend+14.575
 artificially+13.854
 disparate+13.544
 meaningful+13.401
 diminishing+13.333
Neg logit contributions
uh-8.945
ix-8.807
u-8.192
ath-8.011
fn-8.009
Token copy
Token ID4866
Feature activation-7.88e+00

Pos logit contributions
 tremend+20.484
 artificially+19.729
 disparate+19.640
 meaningful+19.374
 fragmented+19.276
Neg logit contributions
uh-14.219
ix-13.869
fn-12.900
u-12.898
ath-12.634
Token of
Token ID286
Feature activation-1.46e+01

Pos logit contributions
 tremend+10.503
 artificially+9.877
 disparate+9.689
 meaningful+9.568
 diminishing+9.539
Neg logit contributions
uh-6.616
ix-6.531
u-6.112
ath-5.962
are-5.938
Token your
Token ID534
Feature activation-1.38e+01

Pos logit contributions
 tremend+17.855
 artificially+17.039
 disparate+16.688
 meaningful+16.485
 diminishing+16.362
Neg logit contributions
uh-12.910
ix-12.690
u-11.856
fn-11.688
ath-11.633
Token replicate
Token ID24340
Feature activation-8.94e+00

Pos logit contributions
 tremend+8.502
 artificially+7.828
 disparate+7.629
 meaningful+7.495
 diminishing+7.424
Neg logit contributions
uh-11.069
ix-10.947
u-10.454
ath-10.280
are-10.279
Token the
Token ID262
Feature activation-1.45e+01

Pos logit contributions
 tremend+8.309
 artificially+7.618
 disparate+7.309
 meaningful+7.215
 behavi+7.197
Neg logit contributions
uh-11.169
ix-11.051
u-10.579
ath-10.416
are-10.412
Token flavor
Token ID9565
Feature activation-7.39e+00

Pos logit contributions
 tremend+15.845
 artificially+15.063
 disparate+14.714
 meaningful+14.518
 diminishing+14.412
Neg logit contributions
uh-14.604
ix-14.454
u-13.600
fn-13.474
ath-13.377
Token and
Token ID290
Feature activation-1.33e+01

Pos logit contributions
 tremend+9.028
 artificially+8.509
 disparate+8.327
 meaningful+8.191
 diminishing+8.181
Neg logit contributions
uh-7.009
ix-6.906
u-6.522
ath-6.352
are-6.333
Token texture
Token ID11743
Feature activation-9.64e+00

Pos logit contributions
 tremend+15.342
 artificially+14.580
 disparate+14.225
 meaningful+14.088
 diminishing+13.994
Neg logit contributions
uh-12.686
ix-12.520
u-11.753
fn-11.634
ath-11.538
Token of
Token ID286
Feature activation-2.08e+01

Pos logit contributions
 tremend+11.139
 artificially+10.482
 disparate+10.220
 meaningful+10.079
 diminishing+10.035
Neg logit contributions
uh-9.634
ix-9.509
u-8.976
ath-8.798
are-8.779
Token o
Token ID267
Feature activation+1.31e+00

Pos logit contributions
 tremend+23.730
 artificially+23.536
 disparate+23.115
 meaningful+22.733
 diminishing+22.545
Neg logit contributions
uh-16.007
ix-15.636
fn-14.847
u-14.449
ath-14.419
Tokenatmeal
Token ID45280
Feature activation-9.95e-01

Pos logit contributions
uh+0.789
ix+0.770
u+0.736
ath+0.722
are+0.700
Neg logit contributions
 tremend-2.113
 behavi-1.969
 disparate-1.945
 artificially-1.942
 meaningful-1.916
Token and
Token ID290
Feature activation-1.29e+01

Pos logit contributions
 tremend+1.178
 artificially+1.059
 behavi+1.059
 disparate+1.054
 meaningful+1.036
Neg logit contributions
uh-0.995
ix-0.978
u-0.938
are-0.904
ath-0.903
Token can
Token ID460
Feature activation-1.03e+01

Pos logit contributions
 tremend+15.339
 artificially+14.629
 disparate+14.249
 meaningful+14.086
 diminishing+14.014
Neg logit contributions
uh-11.795
ix-11.635
u-10.879
fn-10.777
ath-10.742
Token help
Token ID1037
Feature activation-1.16e+01

Pos logit contributions
 tremend+10.481
 artificially+9.875
 disparate+9.527
 meaningful+9.410
 diminishing+9.365
Neg logit contributions
uh-11.346
ix-11.223
u-10.637
fn-10.496
ath-10.489
Token to
Token ID284
Feature activation-1.87e+01

Pos logit contributions
 tremend+7.695
 artificially+7.110
 disparate+6.918
 behavi+6.804
 diminishing+6.798
Neg logit contributions
ix-7.061
uh-7.037
u-6.659
ath-6.536
are-6.512
Token practices
Token ID6593
Feature activation-1.03e+01

Pos logit contributions
 tremend+23.002
 artificially+22.449
 disparate+22.076
 meaningful+22.029
 diminishing+21.635
Neg logit contributions
uh-14.439
ix-14.061
fn-12.831
u-12.793
-12.726
Token and
Token ID290
Feature activation-1.65e+01

Pos logit contributions
 tremend+12.971
 artificially+12.285
 disparate+11.996
 meaningful+11.856
 diminishing+11.799
Neg logit contributions
uh-9.053
ix-8.928
u-8.356
ath-8.183
fn-8.178
Token first
Token ID717
Feature activation-5.19e+00

Pos logit contributions
 tremend+20.344
 artificially+19.638
 disparate+19.244
 meaningful+19.156
 diminishing+18.860
Neg logit contributions
uh-13.525
ix-13.164
u-12.173
fn-12.162
-12.012
Token access
Token ID1895
Feature activation-9.29e+00

Pos logit contributions
 tremend+8.588
 artificially+8.050
 disparate+7.924
 behavi+7.826
 diminishing+7.823
Neg logit contributions
uh-2.861
ix-2.851
u-2.523
are-2.428
fn-2.400
Token to
Token ID284
Feature activation-2.07e+01

Pos logit contributions
 tremend+10.697
 artificially+10.067
 disparate+9.810
 meaningful+9.678
 diminishing+9.632
Neg logit contributions
uh-9.273
ix-9.171
u-8.647
ath-8.489
fn-8.480
Token prospects
Token ID13285
Feature activation-8.47e+00

Pos logit contributions
 tremend+25.544
 artificially+25.064
 meaningful+24.768
 disparate+24.735
 diminishing+24.224
Neg logit contributions
uh-14.839
ix-14.297
fn-13.097
-12.946
u-12.901
Token,
Token ID11
Feature activation-1.12e+01

Pos logit contributions
 tremend+12.519
 artificially+11.893
 disparate+11.666
 meaningful+11.536
 diminishing+11.506
Neg logit contributions
uh-5.816
ix-5.710
u-5.251
ath-5.088
are-5.088
Token all
Token ID477
Feature activation-6.17e+00

Pos logit contributions
 tremend+13.240
 artificially+12.492
 disparate+12.171
 meaningful+12.019
 diminishing+11.958
Neg logit contributions
uh-10.725
ix-10.588
u-9.966
ath-9.779
fn-9.771
Token being
Token ID852
Feature activation-1.25e+01

Pos logit contributions
 tremend+6.758
 artificially+6.151
 disparate+5.991
 behavi+5.926
 diminishing+5.858
Neg logit contributions
uh-6.799
ix-6.698
u-6.396
are-6.332
ath-6.255
Token combined
Token ID5929
Feature activation-3.72e+00

Pos logit contributions
 tremend+13.558
 artificially+12.846
 disparate+12.484
 meaningful+12.361
 diminishing+12.229
Neg logit contributions
uh-12.958
ix-12.801
u-12.098
fn-11.910
ath-11.879
Token diplomatic
Token ID13093
Feature activation+6.66e-01

Pos logit contributions
 tremend+2.605
 artificially+2.316
 behavi+2.242
 disparate+2.240
 meaningful+2.210
Neg logit contributions
ix-3.426
uh-3.425
u-3.275
are-3.223
ath-3.214
Token compounds
Token ID16439
Feature activation-6.76e+00

Pos logit contributions
ix+0.835
uh+0.833
u+0.804
are+0.800
ind+0.798
Neg logit contributions
 tremend-0.607
 artificially-0.528
 behavi-0.514
 disparate-0.511
 fragmented-0.505
Token abroad
Token ID10522
Feature activation-5.68e+00

Pos logit contributions
 tremend+8.698
 artificially+7.995
 disparate+7.856
 behavi+7.818
 meaningful+7.736
Neg logit contributions
uh-6.170
ix-6.147
u-5.771
are-5.635
ath-5.631
Token or
Token ID393
Feature activation-1.13e+01

Pos logit contributions
 tremend+7.750
 artificially+7.117
 disparate+7.009
 behavi+6.994
 meaningful+6.895
Neg logit contributions
uh-4.778
ix-4.761
u-4.463
are-4.350
ath-4.338
Token revelations
Token ID19111
Feature activation-1.22e+01

Pos logit contributions
 tremend+12.133
 artificially+11.228
 disparate+10.899
 meaningful+10.730
 diminishing+10.665
Neg logit contributions
uh-12.550
ix-12.423
u-11.785
are-11.595
ath-11.571
Token about
Token ID546
Feature activation-2.07e+01

Pos logit contributions
 tremend+13.091
 artificially+12.193
 disparate+11.857
 meaningful+11.680
 diminishing+11.606
Neg logit contributions
uh-13.335
ix-13.203
u-12.508
fn-12.288
ath-12.288
Token highly
Token ID4047
Feature activation-1.36e+00

Pos logit contributions
 artificially+23.794
 tremend+23.683
 disparate+23.225
 meaningful+23.079
 diminishing+22.860
Neg logit contributions
uh-16.703
ix-16.273
u-15.127
fn-15.073
ath-15.009
Token secret
Token ID3200
Feature activation-1.13e+01

Pos logit contributions
 tremend+0.916
 behavi+0.682
 artificially+0.677
 disparate+0.605
 mutually+0.602
Neg logit contributions
ix-2.124
uh-2.113
u-2.049
ath-2.041
fn-2.020
Token intelligence
Token ID4430
Feature activation-1.52e+00

Pos logit contributions
 tremend+12.092
 artificially+11.375
 disparate+11.056
 meaningful+10.911
 diminishing+10.839
Neg logit contributions
uh-12.012
ix-11.867
u-11.236
ath-11.054
fn-11.052
Token sources
Token ID4237
Feature activation-8.91e+00

Pos logit contributions
 tremend+1.412
 artificially+1.194
 behavi+1.135
 disparate+1.116
 diminishing+1.086
Neg logit contributions
uh-1.972
ix-1.957
u-1.910
are-1.871
i-1.858
Token or
Token ID393
Feature activation-1.04e+01

Pos logit contributions
 tremend+12.096
 artificially+11.184
 disparate+10.970
 behavi+10.885
 meaningful+10.834
Neg logit contributions
uh-7.573
ix-7.493
u-7.001
are-6.836
fn-6.805
Token be
Token ID307
Feature activation-1.48e+01

Pos logit contributions
 tremend+13.554
 artificially+12.903
 disparate+12.468
 meaningful+12.331
 diminishing+12.290
Neg logit contributions
uh-12.932
ix-12.795
u-12.035
ath-11.904
fn-11.891
Token strong
Token ID1913
Feature activation-8.20e+00

Pos logit contributions
 tremend+18.169
 artificially+17.565
 disparate+17.139
 meaningful+17.049
 diminishing+16.899
Neg logit contributions
uh-12.577
ix-12.327
u-11.449
fn-11.448
are-11.231
Token enough
Token ID1576
Feature activation-8.42e+00

Pos logit contributions
 tremend+10.964
 artificially+10.365
 disparate+10.156
 meaningful+10.026
 diminishing+9.982
Neg logit contributions
uh-6.736
ix-6.648
u-6.191
ath-6.042
are-6.037
Token to
Token ID284
Feature activation-1.49e+01

Pos logit contributions
 tremend+10.381
 artificially+9.758
 disparate+9.553
 meaningful+9.424
 diminishing+9.382
Neg logit contributions
uh-7.821
ix-7.734
u-7.288
ath-7.125
fn-7.115
Token offset
Token ID11677
Feature activation-8.45e+00

Pos logit contributions
 tremend+16.028
 artificially+15.640
 meaningful+14.874
 disparate+14.843
 diminishing+14.697
Neg logit contributions
uh-14.646
ix-14.392
ath-13.386
u-13.368
-13.352
Token the
Token ID262
Feature activation-2.06e+01

Pos logit contributions
 tremend+9.965
 artificially+9.246
 disparate+9.013
 meaningful+8.865
 behavi+8.813
Neg logit contributions
uh-8.508
ix-8.428
u-7.988
are-7.823
ath-7.817
Token revenue
Token ID6426
Feature activation-4.58e+00

Pos logit contributions
 tremend+25.193
 artificially+25.055
 disparate+24.558
 diminishing+24.341
 meaningful+24.321
Neg logit contributions
uh-14.531
ix-14.382
fn-13.096
u-13.033
ath-12.963
Token losses
Token ID9089
Feature activation-1.24e+01

Pos logit contributions
 tremend+4.901
 artificially+4.405
 disparate+4.304
 meaningful+4.238
 behavi+4.228
Neg logit contributions
uh-5.154
ix-5.088
u-4.847
ath-4.785
are-4.775
Token.
Token ID13
Feature activation-4.08e+00

Pos logit contributions
 tremend+16.832
 artificially+16.296
 disparate+15.842
 meaningful+15.764
 diminishing+15.669
Neg logit contributions
uh-9.009
ix-8.817
u-8.112
fn-8.017
ath-7.952
Token
Token ID447
Feature activation+3.19e+01

Pos logit contributions
 tremend+7.167
 artificially+6.778
 disparate+6.683
 meaningful+6.600
 mutually+6.585
Neg logit contributions
uh-1.798
ix-1.742
u-1.541
are-1.466
fn-1.452
Token
Token ID251
Feature activation+1.97e+01

Pos logit contributions
+25.310
+23.936
+23.891
+23.885
uh+23.529
Neg logit contributions
 tremend-22.938
 artificially-20.739
 disparate-20.456
 mutually-20.371
 behavi-19.970
Token we
Token ID356
Feature activation-1.19e+01

Pos logit contributions
 tremend+12.124
 artificially+11.445
 disparate+11.135
 meaningful+11.005
 diminishing+10.940
Neg logit contributions
uh-10.711
ix-10.575
u-9.976
ath-9.807
fn-9.803
Token are
Token ID389
Feature activation-1.28e+01

Pos logit contributions
 tremend+13.522
 artificially+13.120
 disparate+12.605
 meaningful+12.546
 mutually+12.502
Neg logit contributions
uh-11.225
ix-11.076
u-10.353
fn-10.254
ath-10.237
Token not
Token ID407
Feature activation-1.14e+01

Pos logit contributions
 tremend+13.699
 artificially+13.243
 disparate+12.785
 meaningful+12.753
 diminishing+12.645
Neg logit contributions
uh-12.877
ix-12.728
u-11.914
ath-11.836
fn-11.824
Token neglect
Token ID17985
Feature activation-5.15e+00

Pos logit contributions
 tremend+13.110
 artificially+12.598
 disparate+12.222
 meaningful+12.101
 diminishing+12.085
Neg logit contributions
uh-10.757
ix-10.611
u-9.937
fn-9.815
ath-9.805
Tokening
Token ID278
Feature activation-1.62e+01

Pos logit contributions
 tremend+9.218
 artificially+8.754
 disparate+8.641
 behavi+8.558
 meaningful+8.499
Neg logit contributions
uh-2.113
ix-2.075
u-1.849
are-1.674
ind-1.664
Token the
Token ID262
Feature activation-2.06e+01

Pos logit contributions
 tremend+19.498
 artificially+18.867
 meaningful+18.459
 disparate+18.435
 diminishing+18.243
Neg logit contributions
uh-13.687
ix-13.453
u-12.454
fn-12.333
-12.325
Token next
Token ID1306
Feature activation-6.58e+00

Pos logit contributions
 tremend+24.703
 artificially+24.345
 meaningful+24.150
 disparate+24.071
 diminishing+23.788
Neg logit contributions
uh-14.576
ix-14.438
fn-13.150
u-13.012
-12.852
Token generation
Token ID5270
Feature activation-9.68e+00

Pos logit contributions
 tremend+7.634
 artificially+7.065
 disparate+6.889
 behavi+6.793
 meaningful+6.766
Neg logit contributions
uh-6.709
ix-6.621
u-6.277
are-6.161
ath-6.155
Token of
Token ID286
Feature activation-1.52e+01

Pos logit contributions
 tremend+13.154
 artificially+12.518
 disparate+12.244
 meaningful+12.118
 diminishing+12.058
Neg logit contributions
uh-7.594
ix-7.473
u-6.930
ath-6.770
fn-6.769
Token space
Token ID2272
Feature activation-1.72e+00

Pos logit contributions
 tremend+17.291
 artificially+16.752
 disparate+16.327
 meaningful+16.227
 diminishing+16.035
Neg logit contributions
uh-14.142
ix-13.964
u-13.006
fn-12.864
-12.785
Token exploration
Token ID13936
Feature activation-8.57e+00

Pos logit contributions
 tremend+1.683
 artificially+1.510
 disparate+1.468
 diminishing+1.436
 meaningful+1.429
Neg logit contributions
uh-2.103
ix-2.065
u-1.990
ath-1.978
are-1.957
Token their
Token ID511
Feature activation-1.34e+01

Pos logit contributions
 tremend+14.987
 artificially+14.188
 disparate+13.832
 meaningful+13.701
 diminishing+13.596
Neg logit contributions
uh-13.646
ix-13.507
u-12.678
fn-12.519
ath-12.480
Token oppress
Token ID35035
Feature activation+3.22e+00

Pos logit contributions
 tremend+13.866
 artificially+13.158
 disparate+12.870
 meaningful+12.692
 diminishing+12.587
Neg logit contributions
uh-14.325
ix-14.203
u-13.337
fn-13.241
ath-13.145
Tokenors
Token ID669
Feature activation-7.77e+00

Pos logit contributions
uh+1.358
ix+1.341
u+1.238
ath+1.164
i+1.139
Neg logit contributions
 tremend-5.580
 artificially-5.244
 disparate-5.175
 behavi-5.148
 diminishing-5.108
Token,
Token ID11
Feature activation-1.03e+01

Pos logit contributions
 tremend+10.604
 artificially+10.033
 disparate+9.844
 meaningful+9.711
 diminishing+9.662
Neg logit contributions
uh-6.245
ix-6.152
u-5.744
ath-5.602
are-5.586
Token and
Token ID290
Feature activation-1.44e+01

Pos logit contributions
 tremend+12.701
 artificially+11.998
 disparate+11.691
 meaningful+11.518
 diminishing+11.473
Neg logit contributions
uh-9.507
ix-9.366
u-8.841
ath-8.648
are-8.626
Token the
Token ID262
Feature activation-2.05e+01

Pos logit contributions
 tremend+16.670
 artificially+15.827
 disparate+15.436
 meaningful+15.330
 diminishing+15.187
Neg logit contributions
uh-13.599
ix-13.430
u-12.565
fn-12.396
ath-12.352
Token needs
Token ID2476
Feature activation-6.21e+00

Pos logit contributions
 tremend+24.148
 artificially+23.432
 disparate+23.189
 meaningful+23.053
 diminishing+22.894
Neg logit contributions
uh-16.231
ix-16.203
u-14.755
fn-14.747
-14.683
Token of
Token ID286
Feature activation-1.61e+01

Pos logit contributions
 tremend+7.706
 artificially+7.242
 disparate+7.056
 behavi+6.939
 diminishing+6.923
Neg logit contributions
uh-5.853
ix-5.821
u-5.492
are-5.348
ath-5.334
Token their
Token ID511
Feature activation-1.54e+01

Pos logit contributions
 tremend+18.163
 artificially+17.326
 disparate+17.020
 meaningful+16.865
 diminishing+16.668
Neg logit contributions
uh-15.166
ix-14.945
u-13.980
fn-13.870
-13.750
Token respective
Token ID11756
Feature activation-8.26e+00

Pos logit contributions
 tremend+17.404
 artificially+16.713
 disparate+16.541
 meaningful+16.227
 diminishing+16.156
Neg logit contributions
uh-14.111
ix-13.953
u-12.940
fn-12.881
ath-12.788
Token communities
Token ID5348
Feature activation-6.32e+00

Pos logit contributions
 tremend+6.791
 artificially+6.204
 disparate+5.945
 meaningful+5.852
 diminishing+5.801
Neg logit contributions
uh-11.020
ix-10.909
u-10.465
ath-10.320
are-10.315
Token source
Token ID2723
Feature activation-4.80e+00

Pos logit contributions
 tremend+10.859
 artificially+10.022
 disparate+9.709
 meaningful+9.598
 diminishing+9.526
Neg logit contributions
uh-11.829
ix-11.718
u-11.132
are-10.971
ath-10.947
Token of
Token ID286
Feature activation-1.19e+01

Pos logit contributions
 tremend+5.971
 artificially+5.477
 disparate+5.372
 behavi+5.366
 meaningful+5.311
Neg logit contributions
ix-4.622
uh-4.583
u-4.331
fn-4.227
ath-4.218
Token research
Token ID2267
Feature activation-5.59e+00

Pos logit contributions
 tremend+12.582
 artificially+11.726
 disparate+11.398
 meaningful+11.243
 diminishing+11.173
Neg logit contributions
uh-13.089
ix-12.957
u-12.289
ath-12.096
fn-12.096
Token funding
Token ID4918
Feature activation-7.39e+00

Pos logit contributions
 tremend+6.944
 artificially+6.400
 behavi+6.223
 disparate+6.211
 meaningful+6.149
Neg logit contributions
uh-5.313
ix-5.228
u-4.993
fn-4.898
ath-4.864
Token and
Token ID290
Feature activation-1.91e+01

Pos logit contributions
 tremend+9.746
 artificially+9.080
 disparate+8.851
 behavi+8.797
 meaningful+8.746
Neg logit contributions
uh-6.464
ix-6.405
u-6.027
ath-5.914
are-5.881
Token the
Token ID262
Feature activation-2.05e+01

Pos logit contributions
 tremend+21.484
 artificially+21.268
 disparate+20.827
 meaningful+20.607
 diminishing+20.402
Neg logit contributions
uh-16.592
ix-16.142
u-15.187
-15.095
fn-14.935
Token outcomes
Token ID10906
Feature activation-8.76e+00

Pos logit contributions
 tremend+22.500
 artificially+22.458
 disparate+22.104
 meaningful+21.899
 diminishing+21.655
Neg logit contributions
uh-17.259
ix-16.752
u-15.775
-15.589
fn-15.588
Token of
Token ID286
Feature activation-1.68e+01

Pos logit contributions
 tremend+11.372
 artificially+10.644
 disparate+10.411
 behavi+10.258
 meaningful+10.253
Neg logit contributions
uh-7.707
ix-7.629
u-7.112
ath-6.971
are-6.958
Token research
Token ID2267
Feature activation-6.66e+00

Pos logit contributions
 tremend+18.920
 artificially+18.425
 disparate+18.067
 meaningful+17.874
 diminishing+17.589
Neg logit contributions
uh-15.418
ix-15.129
u-14.136
-14.037
fn-13.970
Token studies
Token ID3640
Feature activation-8.39e+00

Pos logit contributions
 tremend+9.501
 artificially+8.826
 disparate+8.620
 diminishing+8.526
 behavi+8.514
Neg logit contributions
uh-5.158
ix-5.101
u-4.773
are-4.632
ath-4.632
Token,
Token ID11
Feature activation-8.51e+00

Pos logit contributions
 tremend+11.870
 artificially+11.176
 disparate+10.947
 meaningful+10.818
 diminishing+10.793
Neg logit contributions
uh-6.392
ix-6.319
u-5.877
ath-5.695
are-5.689
Token blacks
Token ID15102
Feature activation+2.03e+00

Pos logit contributions
 tremend+18.158
 artificially+17.414
 disparate+17.099
 meaningful+16.935
 diminishing+16.890
Neg logit contributions
uh-7.973
ix-7.841
u-7.102
fn-6.953
ath-6.915
Token and
Token ID290
Feature activation-1.07e+01

Pos logit contributions
uh+1.621
ix+1.583
u+1.502
ath+1.487
are+1.482
Neg logit contributions
 tremend-2.766
 artificially-2.538
 disparate-2.496
 meaningful-2.479
 behavi-2.467
Token Muslims
Token ID7045
Feature activation+1.48e+00

Pos logit contributions
 tremend+11.403
 artificially+10.703
 disparate+10.402
 meaningful+10.262
 diminishing+10.197
Neg logit contributions
uh-11.436
ix-11.303
u-10.705
ath-10.529
fn-10.528
Token,
Token ID11
Feature activation-1.00e+01

Pos logit contributions
uh+1.200
ix+1.167
u+1.116
are+1.100
ath+1.083
Neg logit contributions
 tremend-1.987
 artificially-1.809
 disparate-1.789
 meaningful-1.774
 behavi-1.773
Token and
Token ID290
Feature activation-1.41e+01

Pos logit contributions
 tremend+15.136
 artificially+14.475
 disparate+14.190
 meaningful+14.059
 diminishing+13.997
Neg logit contributions
uh-6.378
ix-6.252
u-5.690
ath-5.525
fn-5.522
Token the
Token ID262
Feature activation-2.04e+01

Pos logit contributions
 tremend+18.142
 artificially+17.411
 disparate+17.084
 meaningful+16.876
 diminishing+16.840
Neg logit contributions
uh-11.253
ix-11.101
u-10.238
fn-10.123
ath-10.046
Token hate
Token ID5465
Feature activation-2.42e+00

Pos logit contributions
 tremend+25.897
 artificially+25.367
 disparate+25.003
 diminishing+24.753
 meaningful+24.620
Neg logit contributions
ix-13.292
uh-13.285
fn-12.071
u-11.857
ath-11.781
Token he
Token ID339
Feature activation-6.29e+00

Pos logit contributions
 tremend+2.532
 artificially+2.320
 disparate+2.236
 behavi+2.200
 diminishing+2.193
Neg logit contributions
uh-2.756
ix-2.733
u-2.615
ath-2.574
are-2.547
Token has
Token ID468
Feature activation-9.88e+00

Pos logit contributions
 tremend+8.422
 artificially+8.006
 disparate+7.829
 meaningful+7.748
 diminishing+7.710
Neg logit contributions
uh-5.065
ix-4.988
u-4.637
ath-4.534
fn-4.528
Token brought
Token ID3181
Feature activation-8.56e+00

Pos logit contributions
 tremend+10.563
 artificially+9.963
 disparate+9.671
 meaningful+9.552
 diminishing+9.481
Neg logit contributions
uh-10.477
ix-10.336
u-9.772
fn-9.634
ath-9.614
Token to
Token ID284
Feature activation-1.23e+01

Pos logit contributions
 tremend+10.017
 artificially+9.412
 disparate+9.171
 meaningful+9.062
 diminishing+9.010
Neg logit contributions
uh-8.437
ix-8.320
u-7.884
ath-7.719
are-7.698
Token world
Token ID995
Feature activation+1.21e+00

Pos logit contributions
 tremend+10.976
 artificially+10.257
 disparate+9.933
 meaningful+9.859
 behavi+9.786
Neg logit contributions
uh-10.520
ix-10.371
u-9.852
ath-9.654
are-9.643
Token economic
Token ID3034
Feature activation-2.28e-01

Pos logit contributions
uh+1.463
ix+1.393
u+1.381
are+1.349
ath+1.342
Neg logit contributions
 tremend-1.169
 artificially-1.065
 behavi-1.065
 meaningful-1.000
 disparate-0.989
Token crisis
Token ID4902
Feature activation-2.32e+00

Pos logit contributions
 tremend+0.185
 behavi+0.162
 artificially+0.160
 meaningful+0.152
 disparate+0.150
Neg logit contributions
uh-0.311
ix-0.303
u-0.296
are-0.291
ath-0.290
Token)
Token ID8
Feature activation-7.33e+00

Pos logit contributions
 tremend+3.345
 behavi+3.111
 artificially+3.108
 disparate+3.075
 meaningful+3.028
Neg logit contributions
uh-1.748
ix-1.700
u-1.627
ath-1.537
ind-1.534
Token and
Token ID290
Feature activation-1.61e+01

Pos logit contributions
 tremend+9.143
 artificially+8.646
 disparate+8.432
 meaningful+8.343
 diminishing+8.290
Neg logit contributions
uh-6.644
ix-6.551
u-6.149
ath-6.020
fn-6.008
Token the
Token ID262
Feature activation-2.03e+01

Pos logit contributions
 tremend+18.182
 artificially+17.409
 disparate+17.023
 meaningful+16.762
 diminishing+16.682
Neg logit contributions
uh-15.323
ix-15.113
u-14.219
fn-14.042
ath-13.980
Token "
Token ID366
Feature activation-2.65e+00

Pos logit contributions
 tremend+23.362
 artificially+23.106
 disparate+22.754
 diminishing+22.290
 meaningful+22.229
Neg logit contributions
uh-15.690
ix-15.544
fn-14.392
u-14.353
ath-14.261
TokenOld
Token ID19620
Feature activation+2.64e+00

Pos logit contributions
 tremend+4.514
 artificially+4.302
 disparate+4.207
 behavi+4.190
 meaningful+4.187
Neg logit contributions
uh-1.301
ix-1.236
u-1.118
ind-1.079
ath-1.069
Token Guard
Token ID4932
Feature activation+3.04e+00

Pos logit contributions
uh+2.521
ix+2.515
u+2.394
ind+2.333
are+2.328
Neg logit contributions
 tremend-3.245
 artificially-2.955
 behavi-2.863
 disparate-2.832
 diminishing-2.821
Token"
Token ID1
Feature activation-6.81e+00

Pos logit contributions
uh+2.121
ix+2.050
u+1.967
are+1.871
i+1.842
Neg logit contributions
 tremend-4.408
 artificially-4.099
 behavi-4.059
 diminishing-4.021
 disparate-4.011
Token headed
Token ID9153
Feature activation-3.62e+00

Pos logit contributions
 tremend+8.465
 artificially+7.843
 behavi+7.622
 disparate+7.611
 meaningful+7.587
Neg logit contributions
uh-6.492
ix-6.371
u-6.051
are-5.898
ath-5.875
Token the
Token ID262
Feature activation-1.34e+01

Pos logit contributions
 tremend+3.742
 artificially+3.458
 disparate+3.376
 behavi+3.348
 diminishing+3.328
Neg logit contributions
ix-3.884
uh-3.872
u-3.740
ath-3.616
ind-3.613
Token pagan
Token ID36164
Feature activation+3.15e-01

Pos logit contributions
 tremend+17.126
 artificially+16.318
 disparate+15.979
 meaningful+15.777
 diminishing+15.693
Neg logit contributions
uh-11.205
ix-11.062
u-10.263
fn-10.108
ath-10.065
Token myths
Token ID22886
Feature activation-4.72e+00

Pos logit contributions
uh+0.425
ix+0.424
u+0.410
are+0.404
ath+0.404
Neg logit contributions
 tremend-0.251
 artificially-0.225
 disparate-0.214
 behavi-0.214
 diminishing-0.213
Token by
Token ID416
Feature activation-1.09e+01

Pos logit contributions
 tremend+5.483
 artificially+5.101
 disparate+5.016
 behavi+4.943
 meaningful+4.927
Neg logit contributions
uh-4.769
ix-4.738
u-4.504
ath-4.367
fn-4.356
Token interpreting
Token ID35391
Feature activation-1.17e+01

Pos logit contributions
 tremend+11.504
 artificially+10.779
 disparate+10.473
 meaningful+10.329
 diminishing+10.260
Neg logit contributions
uh-11.923
ix-11.787
u-11.176
ath-10.993
fn-10.990
Token them
Token ID606
Feature activation-2.02e+01

Pos logit contributions
 tremend+14.039
 artificially+13.260
 disparate+12.929
 meaningful+12.775
 diminishing+12.707
Neg logit contributions
uh-11.107
ix-10.963
u-10.308
ath-10.112
fn-10.106
Token as
Token ID355
Feature activation-1.80e+01

Pos logit contributions
 tremend+25.250
 artificially+24.852
 meaningful+23.803
 disparate+23.799
 mutually+23.721
Neg logit contributions
uh-13.877
ix-13.354
-12.487
fn-12.330
ath-12.283
Token symbolic
Token ID18975
Feature activation-5.46e+00

Pos logit contributions
 tremend+23.621
 artificially+22.853
 disparate+22.356
 meaningful+22.318
 mutually+22.096
Neg logit contributions
uh-12.631
ix-12.364
fn-11.348
u-11.301
ath-11.097
Token pre
Token ID662
Feature activation+2.96e+00

Pos logit contributions
 tremend+5.888
 artificially+5.425
 disparate+5.279
 diminishing+5.196
 meaningful+5.153
Neg logit contributions
uh-6.026
ix-5.971
u-5.740
are-5.590
ind-5.561
Tokenfig
Token ID5647
Feature activation+1.45e+01

Pos logit contributions
uh+1.457
ix+1.453
u+1.387
ind+1.358
ath+1.292
Neg logit contributions
 tremend-4.993
 artificially-4.650
 behavi-4.649
 disparate-4.618
 mutually-4.548
Tokenurations
Token ID20074
Feature activation-8.21e+00

Pos logit contributions
u+10.834
uh+10.567
ix+10.108
are+9.492
ind+9.468
Neg logit contributions
 tremend-17.927
 behavi-16.418
 undermin-15.915
 disparate-15.768
 hemor-15.705
Token the
Token ID262
Feature activation-1.62e+01

Pos logit contributions
 tremend+12.616
 artificially+11.608
 disparate+11.303
 meaningful+11.093
 diminishing+11.082
Neg logit contributions
uh-12.923
ix-12.848
u-12.196
ath-12.015
are-11.936
Token speed
Token ID2866
Feature activation-4.32e+00

Pos logit contributions
 tremend+18.297
 artificially+17.680
 disparate+17.238
 meaningful+17.109
 diminishing+16.896
Neg logit contributions
uh-15.167
ix-14.949
u-13.992
fn-13.860
are-13.753
Token and
Token ID290
Feature activation-1.25e+01

Pos logit contributions
 tremend+5.774
 artificially+5.378
 disparate+5.280
 behavi+5.235
 mutually+5.195
Neg logit contributions
uh-3.724
ix-3.683
u-3.498
ath-3.388
are-3.357
Token physical
Token ID3518
Feature activation-5.80e+00

Pos logit contributions
 tremend+13.017
 artificially+12.147
 disparate+11.783
 meaningful+11.604
 diminishing+11.535
Neg logit contributions
uh-13.963
ix-13.816
u-13.120
ath-12.915
fn-12.891
Tokenity
Token ID414
Feature activation-1.39e+01

Pos logit contributions
 tremend+9.734
 artificially+9.242
 disparate+9.071
 meaningful+8.937
 diminishing+8.925
Neg logit contributions
uh-2.987
ix-2.910
u-2.614
are-2.507
ath-2.498
Token of
Token ID286
Feature activation-2.02e+01

Pos logit contributions
 tremend+15.641
 artificially+14.963
 disparate+14.528
 meaningful+14.486
 diminishing+14.301
Neg logit contributions
uh-13.163
ix-13.016
u-12.184
fn-12.058
-12.044
Token hockey
Token ID12217
Feature activation-5.98e-01

Pos logit contributions
 tremend+23.736
 artificially+23.558
 disparate+22.993
 meaningful+22.968
 diminishing+22.534
Neg logit contributions
uh-15.567
ix-15.144
u-13.942
-13.865
fn-13.851
Token at
Token ID379
Feature activation-8.34e+00

Pos logit contributions
 tremend+0.686
 artificially+0.599
 disparate+0.599
 behavi+0.594
 mutually+0.584
Neg logit contributions
uh-0.628
ix-0.620
u-0.593
ind-0.581
ath-0.580
Token an
Token ID281
Feature activation-7.84e+00

Pos logit contributions
 tremend+8.071
 artificially+7.325
 disparate+7.058
 behavi+7.015
 meaningful+6.971
Neg logit contributions
uh-10.300
ix-10.148
u-9.717
ath-9.612
are-9.539
Token early
Token ID1903
Feature activation-1.20e+00

Pos logit contributions
 tremend+4.768
 artificially+4.152
 disparate+3.956
 meaningful+3.871
 diminishing+3.816
Neg logit contributions
uh-12.187
ix-12.065
u-11.652
ath-11.521
are-11.495
Token age
Token ID2479
Feature activation-4.70e+00

Pos logit contributions
 tremend+1.105
 artificially+0.979
 behavi+0.969
 disparate+0.968
 mutually+0.933
Neg logit contributions
uh-1.496
ix-1.477
u-1.421
are-1.402
ath-1.399
Token decreasing
Token ID24030
Feature activation-8.54e+00

Pos logit contributions
 tremend+13.433
 artificially+12.683
 disparate+12.318
 meaningful+12.183
 diminishing+12.164
Neg logit contributions
uh-12.590
ix-12.446
u-11.732
fn-11.573
ath-11.561
Token,
Token ID11
Feature activation-6.10e+00

Pos logit contributions
 tremend+7.835
 artificially+7.243
 disparate+7.021
 diminishing+6.794
 behavi+6.776
Neg logit contributions
uh-10.711
ix-10.638
u-10.166
ath-10.041
are-9.989
Token rather
Token ID2138
Feature activation-1.80e+00

Pos logit contributions
 tremend+6.536
 artificially+6.038
 disparate+5.911
 meaningful+5.803
 behavi+5.768
Neg logit contributions
uh-6.738
ix-6.613
u-6.334
ind-6.201
ath-6.199
Token than
Token ID621
Feature activation-1.18e+01

Pos logit contributions
 tremend+1.924
 artificially+1.792
 disparate+1.788
 behavi+1.738
 meaningful+1.721
Neg logit contributions
uh-1.911
ix-1.895
u-1.802
ath-1.774
ind-1.768
Token increasing
Token ID3649
Feature activation-1.11e+01

Pos logit contributions
 tremend+11.975
 artificially+11.232
 disparate+10.865
 meaningful+10.748
 diminishing+10.703
Neg logit contributions
uh-13.219
ix-13.086
u-12.382
fn-12.217
are-12.211
Token their
Token ID511
Feature activation-2.00e+01

Pos logit contributions
 tremend+13.754
 artificially+13.022
 disparate+12.712
 meaningful+12.543
 diminishing+12.482
Neg logit contributions
uh-10.112
ix-9.976
u-9.356
ath-9.177
fn-9.160
Token immersion
Token ID36375
Feature activation-8.09e+00

Pos logit contributions
 tremend+23.748
 artificially+23.042
 meaningful+22.864
 diminishing+22.395
 disparate+22.385
Neg logit contributions
uh-14.555
ix-14.289
fn-13.417
u-13.090
are-12.881
Token and
Token ID290
Feature activation-1.64e+01

Pos logit contributions
 tremend+11.250
 artificially+10.624
 disparate+10.505
 meaningful+10.284
 diminishing+10.261
Neg logit contributions
uh-6.307
ix-6.203
u-5.797
ath-5.651
fn-5.604
Token engagement
Token ID12352
Feature activation-6.82e+00

Pos logit contributions
 tremend+20.214
 artificially+19.382
 meaningful+18.962
 diminishing+18.909
 disparate+18.795
Neg logit contributions
uh-13.332
ix-13.133
u-12.121
fn-12.083
-11.872
Token.
Token ID13
Feature activation-5.14e+00

Pos logit contributions
 tremend+9.311
 artificially+8.753
 disparate+8.651
 meaningful+8.453
 diminishing+8.440
Neg logit contributions
uh-5.581
ix-5.494
u-5.149
ath-5.012
fn-4.984
Token
Token ID198
Feature activation+4.48e+00

Pos logit contributions
 tremend+7.733
 artificially+7.287
 disparate+7.142
 meaningful+7.049
 behavi+7.021
Neg logit contributions
uh-3.518
ix-3.477
u-3.196
are-3.108
fn-3.098
Token made
Token ID925
Feature activation-4.68e+00

Pos logit contributions
 tremend+3.128
 behavi+2.718
 disparate+2.630
 artificially+2.602
 meaningful+2.550
Neg logit contributions
uh-4.311
ix-4.278
u-4.147
are-4.122
fn-4.034
Token it
Token ID340
Feature activation-4.33e+00

Pos logit contributions
 tremend+3.690
 artificially+3.039
 behavi+2.985
 disparate+2.907
 fragmented+2.835
Neg logit contributions
ix-6.694
uh-6.686
u-6.446
ath-6.396
are-6.389
Token difficult
Token ID2408
Feature activation-7.47e+00

Pos logit contributions
 tremend+3.057
 behavi+2.440
 artificially+2.419
 disparate+2.408
 diminishing+2.289
Neg logit contributions
uh-6.550
ix-6.494
u-6.375
are-6.250
ath-6.188
Token for
Token ID329
Feature activation-1.41e+01

Pos logit contributions
 tremend+8.373
 artificially+7.720
 disparate+7.539
 behavi+7.408
 meaningful+7.402
Neg logit contributions
uh-7.903
ix-7.836
u-7.437
ath-7.292
are-7.285
Token teachers
Token ID7799
Feature activation-5.87e+00

Pos logit contributions
 tremend+15.318
 artificially+14.468
 disparate+14.084
 meaningful+13.913
 diminishing+13.816
Neg logit contributions
uh-14.758
ix-14.573
u-13.769
fn-13.557
ath-13.530
Token to
Token ID284
Feature activation-2.00e+01

Pos logit contributions
 tremend+6.555
 artificially+5.992
 disparate+5.858
 behavi+5.822
 meaningful+5.756
Neg logit contributions
uh-6.309
ix-6.252
u-5.961
ath-5.836
are-5.818
Token retire
Token ID8058
Feature activation-1.05e+01

Pos logit contributions
 artificially+20.855
 tremend+20.385
 mutually+19.832
 meaningful+19.775
 diminishing+19.446
Neg logit contributions
uh-17.682
ix-17.202
-16.510
u-16.010
fn-15.918
Token by
Token ID416
Feature activation-8.63e+00

Pos logit contributions
 tremend+10.849
 artificially+10.238
 disparate+9.923
 meaningful+9.809
 diminishing+9.726
Neg logit contributions
uh-11.423
ix-11.284
u-10.680
fn-10.535
ath-10.522
Token replacing
Token ID13586
Feature activation-1.06e+01

Pos logit contributions
 tremend+7.251
 artificially+6.455
 disparate+6.290
 behavi+6.159
 meaningful+6.120
Neg logit contributions
uh-11.575
ix-11.494
u-11.037
ath-10.823
are-10.821
Token pensions
Token ID27581
Feature activation-5.97e+00

Pos logit contributions
 tremend+10.808
 artificially+9.955
 disparate+9.649
 meaningful+9.458
 diminishing+9.447
Neg logit contributions
uh-12.207
ix-12.136
u-11.538
ath-11.371
are-11.340
Token with
Token ID351
Feature activation-1.76e+01

Pos logit contributions
 tremend+7.149
 artificially+6.481
 disparate+6.423
 behavi+6.396
 meaningful+6.274
Neg logit contributions
uh-5.942
ix-5.911
u-5.600
are-5.489
ath-5.453
Token wind
Token ID2344
Feature activation-1.03e+00

Pos logit contributions
 tremend+3.754
 artificially+3.228
 disparate+3.105
 behavi+3.043
 meaningful+3.013
Neg logit contributions
uh-7.887
ix-7.838
u-7.518
are-7.478
ath-7.474
Token,
Token ID11
Feature activation-1.21e+01

Pos logit contributions
 tremend+1.135
 artificially+1.004
 disparate+0.999
 behavi+0.983
 meaningful+0.977
Neg logit contributions
uh-1.138
ix-1.123
are-1.071
u-1.071
ath-1.055
Token and
Token ID290
Feature activation-1.19e+01

Pos logit contributions
 tremend+13.717
 artificially+13.253
 disparate+12.840
 meaningful+12.737
 diminishing+12.681
Neg logit contributions
uh-11.412
ix-11.274
u-10.556
ath-10.435
fn-10.417
Token 115
Token ID12279
Feature activation+3.21e+00

Pos logit contributions
 tremend+12.309
 artificially+11.750
 disparate+11.368
 meaningful+11.242
 diminishing+11.179
Neg logit contributions
uh-12.710
ix-12.566
u-11.882
ath-11.724
fn-11.720
Token GW
Token ID27164
Feature activation-2.10e+00

Pos logit contributions
uh+3.741
ix+3.704
u+3.553
are+3.466
ath+3.458
Neg logit contributions
 tremend-3.144
 artificially-2.787
 behavi-2.757
 disparate-2.679
 diminishing-2.643
Token of
Token ID286
Feature activation-2.00e+01

Pos logit contributions
 tremend+2.556
 artificially+2.322
 disparate+2.306
 behavi+2.282
 meaningful+2.263
Neg logit contributions
uh-2.044
ix-2.030
u-1.915
are-1.865
ath-1.859
Token on
Token ID319
Feature activation-4.58e+00

Pos logit contributions
 artificially+22.643
 tremend+22.389
 disparate+21.776
 diminishing+21.708
 meaningful+21.686
Neg logit contributions
uh-15.579
ix-15.219
-14.287
u-14.195
fn-14.194
Tokenshore
Token ID14640
Feature activation-6.88e+00

Pos logit contributions
 tremend+5.940
 artificially+5.437
 disparate+5.402
 behavi+5.350
 meaningful+5.305
Neg logit contributions
ix-4.216
uh-4.200
u-3.950
ath-3.873
are-3.860
Token wind
Token ID2344
Feature activation-2.84e+00

Pos logit contributions
 tremend+6.056
 artificially+5.401
 disparate+5.255
 meaningful+5.143
 behavi+5.140
Neg logit contributions
uh-9.026
ix-8.944
u-8.561
ath-8.465
are-8.455
Token.
Token ID13
Feature activation-2.45e-01

Pos logit contributions
 tremend+3.773
 artificially+3.455
 disparate+3.417
 meaningful+3.369
 behavi+3.368
Neg logit contributions
uh-2.508
ix-2.476
u-2.325
are-2.303
ath-2.285
Token The
Token ID383
Feature activation-9.30e+00

Pos logit contributions
 tremend+0.374
 artificially+0.336
 behavi+0.335
 disparate+0.330
 mutually+0.327
Neg logit contributions
uh-0.176
ix-0.176
u-0.165
are-0.159
fn-0.159
Token
Token ID251
Feature activation+2.25e+01

Pos logit contributions
+24.875
+23.460
+22.980
+22.970
+22.656
Neg logit contributions
 tremend-26.099
 behavi-23.206
 artificially-22.681
 mutually-21.789
 shack-21.508
Token,
Token ID11
Feature activation-2.80e+00

Pos logit contributions
uh+23.513
ix+23.462
she+23.292
it+23.125
u+22.906
Neg logit contributions
 tremend-13.977
 behavi-12.310
 artificially-11.220
 mutually-10.788
 hemor-10.769
Token Rou
Token ID13876
Feature activation+1.69e+01

Pos logit contributions
 tremend+3.184
 artificially+2.903
 behavi+2.832
 disparate+2.809
 meaningful+2.783
Neg logit contributions
uh-2.966
ix-2.952
u-2.840
are-2.784
ath-2.750
Tokenpen
Token ID3617
Feature activation+1.38e+01

Pos logit contributions
ix+11.770
uh+11.582
u+11.230
ath+10.893
i+10.693
Neg logit contributions
 tremend-24.154
 behavi-22.213
 hemor-20.828
 unnecess-20.175
 undermin-20.053
Tokenian
Token ID666
Feature activation+5.24e+00

Pos logit contributions
uh+6.977
ix+6.936
u+6.540
i+6.462
ian+6.239
Neg logit contributions
 tremend-23.568
 behavi-21.810
 hemor-20.603
 undermin-20.322
 unnecess-20.018
Token
Token ID447
Feature activation+3.98e+01

Pos logit contributions
ix+4.873
uh+4.838
u+4.659
ath+4.506
i+4.489
Neg logit contributions
 tremend-6.134
 behavi-5.661
 artificially-5.608
 disparate-5.579
 diminishing-5.491
Token
Token ID247
Feature activation+3.02e+01

Pos logit contributions
+27.353
u+26.101
+26.024
uh+25.754
+25.246
Neg logit contributions
 tremend-36.736
 behavi-31.332
 mathemat-30.188
 undermin-29.033
 hemor-28.710
Tokens
Token ID82
Feature activation+4.65e+00

Pos logit contributions
s+26.195
u+25.012
ix+24.499
i+24.035
ski+23.711
Neg logit contributions
 tremend-21.548
 behavi-20.327
 artificially-18.784
 diminishing-18.611
 hemor-18.504
Token Cat
Token ID5181
Feature activation+1.02e+01

Pos logit contributions
uh+3.994
ix+3.991
u+3.887
are+3.780
ind+3.704
Neg logit contributions
 tremend-5.639
 behavi-5.243
 artificially-5.212
 disparate-5.164
 diminishing-5.090
Token Person
Token ID7755
Feature activation+6.42e+00

Pos logit contributions
ix+8.438
uh+8.149
u+8.075
are+7.918
i+7.856
Neg logit contributions
 tremend-11.603
 artificially-10.783
 disparate-10.720
 behavi-10.697
 diminishing-10.679
Token recounts
Token ID39851
Feature activation-1.41e+00

Pos logit contributions
ix+7.040
uh+6.859
u+6.767
i+6.619
ind+6.575
Neg logit contributions
 tremend-6.117
 behavi-5.586
 artificially-5.576
 disparate-5.534
 diminishing-5.388
Token either
Token ID2035
Feature activation-3.26e+00

Pos logit contributions
 tremend+8.325
 artificially+7.697
 disparate+7.519
 meaningful+7.399
 behavi+7.364
Neg logit contributions
uh-8.789
ix-8.693
u-8.255
are-8.144
ath-8.107
Token bounce
Token ID20110
Feature activation+5.71e+00

Pos logit contributions
 tremend+3.198
 behavi+2.851
 disparate+2.812
 artificially+2.808
 meaningful+2.743
Neg logit contributions
uh-3.957
ix-3.931
u-3.779
are-3.707
ind-3.661
Token off
Token ID572
Feature activation-4.51e+00

Pos logit contributions
uh+7.253
ix+7.133
u+6.973
id+6.852
ath+6.839
Neg logit contributions
 tremend-4.624
 behavi-3.949
 artificially-3.806
 diminishing-3.772
 disparate-3.727
Token of
Token ID286
Feature activation-9.83e+00

Pos logit contributions
 tremend+4.845
 artificially+4.372
 behavi+4.259
 disparate+4.218
 diminishing+4.164
Neg logit contributions
uh-5.139
ix-5.121
u-4.863
are-4.732
ath-4.727
Token Wallace
Token ID17161
Feature activation+3.67e+00

Pos logit contributions
 tremend+10.915
 artificially+10.051
 disparate+9.759
 behavi+9.645
 meaningful+9.601
Neg logit contributions
uh-10.660
ix-10.539
u-10.013
ath-9.813
are-9.798
Token
Token ID447
Feature activation+3.82e+01

Pos logit contributions
uh+3.292
ix+3.215
u+3.117
ath+3.012
i+3.005
Neg logit contributions
 tremend-4.501
 disparate-4.140
 artificially-4.089
 meaningful-3.995
 behavi-3.992
Token
Token ID247
Feature activation+2.44e+01

Pos logit contributions
+25.984
+24.652
+24.233
+24.180
+24.142
Neg logit contributions
 tremend-35.494
 behavi-30.656
 mathemat-28.621
 hemor-28.283
 misunder-28.101
Tokens
Token ID82
Feature activation-6.63e+00

Pos logit contributions
s+23.641
u+22.389
uh+22.082
ix+22.042
i+21.841
Neg logit contributions
 tremend-17.377
 behavi-15.009
 artificially-14.630
 disparate-14.571
 mutually-14.454
Token hands
Token ID2832
Feature activation-3.16e+00

Pos logit contributions
 tremend+5.797
 artificially+5.114
 disparate+4.930
 behavi+4.924
 mutually+4.838
Neg logit contributions
uh-8.833
ix-8.733
u-8.376
are-8.261
ath-8.251
Token or
Token ID393
Feature activation-1.30e+01

Pos logit contributions
 tremend+3.779
 artificially+3.430
 disparate+3.418
 behavi+3.397
 meaningful+3.335
Neg logit contributions
uh-3.154
ix-3.121
u-2.997
ath-2.885
fn-2.866
Token fall
Token ID2121
Feature activation+5.43e-01

Pos logit contributions
 tremend+15.873
 artificially+15.139
 disparate+14.746
 meaningful+14.582
 diminishing+14.498
Neg logit contributions
uh-11.633
ix-11.478
u-10.721
fn-10.566
ath-10.511
Token 2010
Token ID3050
Feature activation+2.33e+00

Pos logit contributions
 tremend+6.568
 artificially+5.983
 behavi+5.879
 disparate+5.822
 meaningful+5.751
Neg logit contributions
uh-5.468
ix-5.404
u-5.129
are-5.045
ath-4.994
Token.
Token ID13
Feature activation-1.09e+00

Pos logit contributions
uh+1.620
ix+1.580
u+1.514
ind+1.436
i+1.433
Neg logit contributions
 tremend-3.427
 disparate-3.138
 behavi-3.120
 artificially-3.113
 meaningful-3.085
Token
Token ID198
Feature activation+7.50e+00

Pos logit contributions
 tremend+1.690
 artificially+1.565
 disparate+1.525
 behavi+1.509
 meaningful+1.507
Neg logit contributions
uh-0.748
ix-0.733
u-0.691
fn-0.658
are-0.656
Token
Token ID198
Feature activation+6.37e+00

Pos logit contributions
uh+4.009
ix+4.001
u+3.790
fn+3.537
i+3.535
Neg logit contributions
 tremend-13.556
 behavi-11.980
 hemor-11.590
 undermin-11.582
 artificially-11.271
TokenTrump
Token ID6170
Feature activation+6.58e+00

Pos logit contributions
uh+2.884
ix+2.790
u+2.584
are+2.405
ind+2.375
Neg logit contributions
 tremend-11.246
 artificially-10.413
 behavi-10.284
 disparate-10.174
 fragmented-10.134
Token
Token ID447
Feature activation+3.71e+01

Pos logit contributions
uh+7.371
ix+7.270
u+7.136
ath+6.888
i+6.873
Neg logit contributions
 tremend-6.537
 behavi-5.643
 artificially-5.538
 disparate-5.492
 meaningful-5.408
Token
Token ID247
Feature activation+2.51e+01

Pos logit contributions
+26.469
+24.787
+24.649
+24.061
+24.011
Neg logit contributions
 tremend-35.173
 behavi-29.412
 mathemat-28.946
 undermin-27.959
 hemor-27.944
Tokens
Token ID82
Feature activation+3.26e+00

Pos logit contributions
uh+26.495
ix+26.393
u+26.315
s+25.881
id+25.447
Neg logit contributions
 tremend-15.780
 behavi-13.355
 disparate-12.459
 artificially-12.452
 meaningful-12.269
Token promised
Token ID8072
Feature activation-8.10e-01

Pos logit contributions
ix+4.379
uh+4.346
u+4.228
are+4.136
ind+4.122
Neg logit contributions
 tremend-2.578
 artificially-2.165
 behavi-2.143
 disparate-2.025
 fragmented-2.020
Token border
Token ID4865
Feature activation+6.57e+00

Pos logit contributions
 tremend+0.794
 artificially+0.722
 behavi+0.695
 fragmented+0.681
 disparate+0.680
Neg logit contributions
uh-0.965
ix-0.959
u-0.924
are-0.901
ath-0.898
Token wall
Token ID3355
Feature activation+3.33e+00

Pos logit contributions
ix+8.282
uh+8.113
u+8.025
i+7.930
are+7.799
Neg logit contributions
 tremend-5.493
 artificially-4.473
 disparate-4.284
 behavi-4.233
 mutually-4.225
Token gained
Token ID8618
Feature activation-4.80e+00

Pos logit contributions
uh+0.731
ix+0.728
u+0.704
are+0.701
i+0.680
Neg logit contributions
 tremend-0.867
 artificially-0.766
 behavi-0.764
 disparate-0.756
 diminishing-0.746
Token ownership
Token ID9238
Feature activation-3.82e+00

Pos logit contributions
 tremend+3.709
 artificially+3.325
 behavi+3.151
 disparate+3.149
 diminishing+3.046
Neg logit contributions
uh-6.785
ix-6.751
u-6.532
ind-6.396
are-6.385
Token in
Token ID287
Feature activation-9.27e+00

Pos logit contributions
 tremend+4.606
 artificially+4.242
 disparate+4.174
 behavi+4.114
 diminishing+4.075
Neg logit contributions
uh-3.751
ix-3.724
u-3.544
are-3.421
ath-3.412
Token each
Token ID1123
Feature activation-6.72e+00

Pos logit contributions
 tremend+10.972
 artificially+10.244
 disparate+9.958
 meaningful+9.826
 behavi+9.807
Neg logit contributions
uh-9.267
ix-9.163
u-8.686
are-8.531
fn-8.470
Token other
Token ID584
Feature activation-4.64e+00

Pos logit contributions
 tremend+7.338
 artificially+6.761
 disparate+6.542
 behavi+6.459
 meaningful+6.432
Neg logit contributions
uh-7.463
ix-7.353
u-7.008
are-6.917
ath-6.850
Token
Token ID447
Feature activation+4.09e+01

Pos logit contributions
 tremend+7.673
 artificially+7.292
 disparate+7.195
 diminishing+7.088
 meaningful+7.085
Neg logit contributions
uh-2.457
ix-2.413
u-2.174
are-2.088
ath-2.041
Token
Token ID247
Feature activation+3.42e+01

Pos logit contributions
+28.805
+26.313
+26.285
+26.259
+26.219
Neg logit contributions
 tremend-34.531
 behavi-29.819
 mathemat-28.521
 undermin-27.829
 misunder-27.696
Tokens
Token ID82
Feature activation-2.27e-02

Pos logit contributions
s+31.557
u+31.101
ind+31.043
i+30.678
ix+30.390
Neg logit contributions
 tremend-18.051
 behavi-15.017
 artificially-14.941
 hemor-14.214
 disparate-13.861
Token companies
Token ID2706
Feature activation-5.43e+00

Pos logit contributions
 tremend+0.019
 artificially+0.016
 behavi+0.016
 meaningful+0.015
 diminishing+0.015
Neg logit contributions
ix-0.030
uh-0.030
u-0.029
ind-0.029
are-0.029
Token through
Token ID832
Feature activation-9.79e+00

Pos logit contributions
 tremend+7.273
 artificially+6.804
 disparate+6.687
 meaningful+6.578
 behavi+6.569
Neg logit contributions
uh-4.607
ix-4.553
u-4.257
are-4.171
ath-4.133
Token an
Token ID281
Feature activation-1.01e+01

Pos logit contributions
 tremend+10.785
 artificially+9.997
 disparate+9.711
 diminishing+9.559
 meaningful+9.544
Neg logit contributions
uh-10.606
ix-10.461
u-9.993
are-9.788
ath-9.771
Token
Token ID447
Feature activation+3.65e+01

Pos logit contributions
 tremend+0.339
 artificially+0.312
 disparate+0.307
 behavi+0.306
 mutually+0.304
Neg logit contributions
uh-0.107
ix-0.104
u-0.096
are-0.092
fn-0.090
Token
Token ID251
Feature activation+2.64e+01

Pos logit contributions
+26.748
+26.082
+25.880
+25.791
+25.655
Neg logit contributions
 tremend-27.580
 behavi-23.727
 artificially-22.645
 mutually-22.424
 hemor-22.190
Token asked
Token ID1965
Feature activation+4.49e+00

Pos logit contributions
uh+23.714
ix+22.896
u+22.724
are+22.341
fn+22.123
Neg logit contributions
 tremend-21.475
 behavi-16.983
 artificially-16.385
 mutually-16.124
 hemor-16.005
Token MIT
Token ID17168
Feature activation+1.35e+01

Pos logit contributions
uh+5.225
ix+5.136
u+4.968
are+4.861
fn+4.830
Neg logit contributions
 tremend-4.336
 artificially-3.745
 behavi-3.692
 fragmented-3.635
 disparate-3.634
Token
Token ID447
Feature activation+3.98e+01

Pos logit contributions
uh+15.181
ix+14.499
u+14.495
ath+14.143
i+13.987
Neg logit contributions
 tremend-11.236
 artificially-9.824
 fragmented-9.715
 disparate-9.549
 meaningful-9.447
Token
Token ID247
Feature activation+3.34e+01

Pos logit contributions
+26.854
+25.540
+25.280
+25.228
+24.924
Neg logit contributions
 tremend-36.236
 behavi-31.078
 mathemat-29.129
 undermin-29.002
 hemor-28.983
Tokens
Token ID82
Feature activation+3.69e+00

Pos logit contributions
s+29.460
ix+28.634
u+28.501
uh+28.440
i+27.659
Neg logit contributions
 tremend-20.669
 artificially-17.933
 fragmented-17.464
 disparate-17.085
 mutually-16.879
Token Dar
Token ID7491
Feature activation+1.57e+01

Pos logit contributions
uh+3.383
ix+3.321
u+3.204
are+3.106
ind+3.071
Neg logit contributions
 tremend-4.548
 artificially-4.085
 behavi-4.041
 disparate-3.995
 fragmented-3.941
Tokenon
Token ID261
Feature activation+1.63e+01

Pos logit contributions
ix+13.081
uh+12.990
u+12.791
id+12.047
ind+12.045
Neg logit contributions
 tremend-18.557
 behavi-16.860
 disparate-16.446
 artificially-16.356
 meaningful-16.237
Token Ace
Token ID17102
Feature activation+1.78e+01

Pos logit contributions
ix+13.950
uh+13.319
u+13.183
id+12.980
ath+12.932
Neg logit contributions
 tremend-18.240
 behavi-16.952
 artificially-15.982
 disparate-15.942
 fragmented-15.460
Tokenm
Token ID76
Feature activation+1.65e+01

Pos logit contributions
uh+15.881
u+15.315
ix+15.197
i+15.163
ind+15.143
Neg logit contributions
 tremend-18.209
 artificially-16.404
 behavi-16.347
 disparate-16.302
 fragmented-15.664
Token Committee
Token ID4606
Feature activation+2.58e-01

Pos logit contributions
ix+9.110
uh+8.939
u+8.926
i+8.485
ind+8.452
Neg logit contributions
 tremend-18.746
 behavi-16.296
 artificially-15.642
 hemor-15.597
 mathemat-15.398
Token leak
Token ID13044
Feature activation-1.63e+00

Pos logit contributions
uh+0.227
ix+0.225
u+0.212
ind+0.206
are+0.206
Neg logit contributions
 tremend-0.341
 artificially-0.304
 disparate-0.301
 behavi-0.296
 mutually-0.295
Token
Token ID198
Feature activation+1.06e+01

Pos logit contributions
 tremend+2.421
 artificially+2.242
 disparate+2.224
 mutually+2.180
 behavi+2.176
Neg logit contributions
uh-1.175
ix-1.151
u-1.085
ind-1.044
ath-1.037
Token
Token ID198
Feature activation+1.25e+01

Pos logit contributions
ix+5.242
uh+5.104
u+4.923
i+4.638
are+4.590
Neg logit contributions
 tremend-19.666
 behavi-17.103
 undermin-16.765
 hemor-16.764
 mathemat-16.309
TokenR
Token ID49
Feature activation+1.71e+01

Pos logit contributions
uh+5.222
Co+4.992
ix+4.945
u+4.643
i+4.506
Neg logit contributions
 tremend-21.765
 behavi-19.466
 artificially-19.352
 hemor-18.962
 disparate-18.872
Tokenobby
Token ID11369
Feature activation+1.71e+01

Pos logit contributions
uh+17.096
ix+16.252
id+15.950
i+15.613
u+15.571
Neg logit contributions
 tremend-15.219
 behavi-14.057
 diminishing-13.797
 mutually-13.764
 artificially-13.632
Token M
Token ID337
Feature activation+1.81e+01

Pos logit contributions
ix+14.502
uh+14.453
i+13.742
u+13.699
id+13.161
Neg logit contributions
 tremend-16.944
 behavi-15.913
 artificially-15.478
 meaningful-15.244
 gradual-15.076
Tokenook
Token ID566
Feature activation+1.21e+01

Pos logit contributions
ix+11.795
uh+11.755
oh+10.072
u+10.049
are+9.886
Neg logit contributions
 tremend-26.029
 behavi-23.515
 hemor-22.597
 undermin-22.336
 disparate-22.129
Token (
Token ID357
Feature activation+1.48e+01

Pos logit contributions
ix+12.772
uh+12.765
u+12.449
i+12.108
ind+11.809
Neg logit contributions
 tremend-11.485
 behavi-10.009
 disparate-9.410
 artificially-9.301
 meaningful-9.179
Tokenleft
Token ID9464
Feature activation+1.30e+01

Pos logit contributions
uh+11.194
ix+10.730
ind+10.608
u+10.595
she+10.451
Neg logit contributions
 tremend-17.782
 artificially-16.384
 behavi-16.202
 meaningful-16.012
 diminishing-15.837
Token)
Token ID8
Feature activation+3.33e+00

Pos logit contributions
ix+12.280
uh+12.136
u+12.087
ind+11.525
i+11.492
Neg logit contributions
 tremend-12.699
 behavi-11.170
 artificially-10.941
 disparate-10.708
 meaningful-10.638
Token detected
Token ID12326
Feature activation-3.30e+00

Pos logit contributions
 tremend+9.166
 artificially+8.460
 disparate+8.252
 meaningful+8.079
 diminishing+8.073
Neg logit contributions
uh-10.155
ix-10.057
u-9.597
ath-9.402
ind-9.387
Token as
Token ID355
Feature activation-1.57e+01

Pos logit contributions
 tremend+3.301
 artificially+2.861
 behavi+2.838
 disparate+2.812
 diminishing+2.743
Neg logit contributions
uh-4.007
ix-3.981
u-3.828
are-3.743
ind-3.720
Token EL
Token ID17852
Feature activation+9.90e+00

Pos logit contributions
 tremend+20.249
 artificially+19.408
 disparate+18.949
 meaningful+18.832
 diminishing+18.662
Neg logit contributions
uh-12.133
ix-11.867
u-10.949
fn-10.836
are-10.806
TokenF
Token ID37
Feature activation+6.95e+00

Pos logit contributions
uh+1.356
ix+1.333
u+0.960
fn+0.940
i+0.664
Neg logit contributions
 tremend-21.035
 behavi-19.183
 artificially-18.774
 disparate-18.631
 hemor-18.610
Token_
Token ID62
Feature activation+7.90e+00

Pos logit contributions
ix+2.300
uh+2.254
u+2.030
fn+1.839
id+1.817
Neg logit contributions
 tremend-13.258
 behavi-12.053
 artificially-11.852
 disparate-11.779
 hemor-11.619
TokenUM
Token ID5883
Feature activation+1.59e+01

Pos logit contributions
uh+4.353
ix+4.152
u+4.035
i+3.663
are+3.618
Neg logit contributions
 tremend-12.308
 artificially-11.628
 behavi-11.522
 disparate-11.497
 diminishing-11.397
TokenRE
Token ID2200
Feature activation+1.43e+01

Pos logit contributions
ix+8.306
uh+8.174
u+8.045
R+7.956
are+7.895
Neg logit contributions
 tremend-24.119
 behavi-21.772
 disparate-20.777
 undermin-20.764
 artificially-20.721
TokenON
Token ID1340
Feature activation+2.17e+00

Pos logit contributions
ix+7.193
uh+6.889
u+6.650
V+6.575
Co+6.396
Neg logit contributions
 tremend-22.256
 behavi-20.079
 disparate-19.280
 artificially-19.227
 undermin-19.129
Token.
Token ID13
Feature activation+4.69e+00

Pos logit contributions
ix+1.432
uh+1.424
u+1.324
fn+1.251
are+1.240
Neg logit contributions
 tremend-3.244
 artificially-3.079
 disparate-3.044
 diminishing-2.997
 meaningful-2.988
TokenB
Token ID33
Feature activation+4.05e+00

Pos logit contributions
uh+1.789
ix+1.728
u+1.594
fn+1.416
i+1.404
Neg logit contributions
 tremend-8.302
 artificially-7.919
 disparate-7.831
 diminishing-7.794
 behavi-7.792
Token)
Token ID8
Feature activation-4.75e+00

Pos logit contributions
uh+2.233
ix+2.202
u+2.053
ath+1.919
fn+1.881
Neg logit contributions
 tremend-6.360
 artificially-6.042
 disparate-5.996
 diminishing-5.929
 meaningful-5.906
Token 6
Token ID718
Feature activation+7.53e+00

Pos logit contributions
uh+3.761
ix+3.648
u+3.539
ind+3.452
are+3.437
Neg logit contributions
 tremend-5.194
 behavi-4.720
 artificially-4.673
 disparate-4.472
 mutually-4.416
Token
Token ID447
Feature activation+3.53e+01

Pos logit contributions
ix+5.352
uh+5.253
u+4.921
are+4.684
ind+4.667
Neg logit contributions
 tremend-10.971
 behavi-9.835
 artificially-9.819
 disparate-9.683
 diminishing-9.584
Token
Token ID247
Feature activation+2.57e+01

Pos logit contributions
+21.888
+20.553
+19.374
+19.322
+19.253
Neg logit contributions
 tremend-39.327
 behavi-36.881
 hemor-33.923
 undermin-33.057
 livelihood-32.905
Token3
Token ID18
Feature activation+1.13e+01

Pos logit contributions
ix+22.788
uh+22.251
are+21.550
i+21.474
u+21.386
Neg logit contributions
 tremend-21.633
 behavi-19.157
 disparate-17.211
 hemor-17.114
 undermin-16.860
Token
Token ID447
Feature activation+3.58e+01

Pos logit contributions
uh+10.020
ix+9.944
u+9.556
are+9.162
ind+9.108
Neg logit contributions
 tremend-12.596
 artificially-11.570
 disparate-11.357
 behavi-11.342
 diminishing-11.194
Token
Token ID251
Feature activation+2.42e+01

Pos logit contributions
+24.787
+23.856
+23.664
+22.684
+22.597
Neg logit contributions
 tremend-31.846
 behavi-30.655
 fragmented-27.417
 hemor-27.164
 artificially-27.095
Token
Token ID198
Feature activation+8.29e+00

Pos logit contributions
ix+21.446
uh+21.438
i+20.595
u+20.344
s+20.326
Neg logit contributions
 tremend-20.678
 behavi-18.668
 artificially-17.053
 disparate-16.873
 undermin-16.617
Token
Token ID198
Feature activation+8.77e+00

Pos logit contributions
ix+4.058
uh+3.977
u+3.809
i+3.578
are+3.549
Neg logit contributions
 tremend-15.862
 behavi-13.941
 undermin-13.599
 hemor-13.571
 mathemat-13.134
TokenWeight
Token ID25844
Feature activation+1.01e+01

Pos logit contributions
ix+4.671
uh+4.578
Co+4.284
u+4.254
are+4.214
Neg logit contributions
 tremend-14.498
 behavi-13.247
 artificially-13.218
 disparate-12.736
 mutually-12.690
Token:
Token ID25
Feature activation+7.85e-01

Pos logit contributions
ix+6.828
uh+6.796
u+6.449
i+6.246
id+6.122
Neg logit contributions
 tremend-15.489
 behavi-13.982
 hemor-13.266
 undermin-13.076
 artificially-13.064
Token 190
Token ID19884
Feature activation+7.62e+00

Pos logit contributions
uh+0.812
ix+0.787
u+0.760
ath+0.746
are+0.745
Neg logit contributions
 tremend-0.918
 behavi-0.828
 artificially-0.823
 disparate-0.802
 mutually-0.787
Token Munich
Token ID22418
Feature activation+9.77e+00

Pos logit contributions
 tremend+3.909
 artificially+3.458
 behavi+3.431
 disparate+3.394
 diminishing+3.344
Neg logit contributions
uh-4.962
ix-4.895
u-4.705
are-4.617
fn-4.612
Token design
Token ID1486
Feature activation+2.73e+00

Pos logit contributions
ix+11.764
uh+11.638
u+11.367
i+11.170
ath+11.088
Neg logit contributions
 tremend-8.063
 behavi-7.242
 disparate-6.927
 diminishing-6.916
 artificially-6.802
Token student
Token ID3710
Feature activation+3.08e+00

Pos logit contributions
uh+3.527
ix+3.470
u+3.401
are+3.346
ath+3.321
Neg logit contributions
 tremend-2.366
 artificially-2.104
 behavi-2.038
 disparate-2.033
 diminishing-1.999
Token Mor
Token ID3461
Feature activation+1.38e+01

Pos logit contributions
uh+2.940
ix+2.868
u+2.798
are+2.684
ind+2.675
Neg logit contributions
 tremend-3.729
 behavi-3.348
 artificially-3.302
 disparate-3.273
 diminishing-3.199
Tokenitz
Token ID4224
Feature activation+1.78e+01

Pos logit contributions
uh+10.725
ix+10.265
u+10.012
ath+9.724
i+9.723
Neg logit contributions
 tremend-16.980
 artificially-16.171
 disparate-15.903
 behavi-15.881
 meaningful-15.619
Token Bre
Token ID3719
Feature activation+1.92e+01

Pos logit contributions
uh+17.662
u+17.173
ix+17.033
i+16.820
id+16.313
Neg logit contributions
 tremend-14.975
 behavi-13.864
 artificially-13.275
 meaningful-12.844
 disparate-12.724
Tokenit
Token ID270
Feature activation+1.58e+01

Pos logit contributions
uh+16.861
ix+16.645
u+16.090
ind+15.870
it+15.795
Neg logit contributions
 tremend-20.808
 behavi-19.316
 disparate-18.196
 artificially-18.009
 meaningful-17.540
Tokenen
Token ID268
Feature activation+1.13e+01

Pos logit contributions
uh+17.226
ix+16.911
u+16.418
ath+16.128
i+15.888
Neg logit contributions
 tremend-13.210
 behavi-11.808
 nonexistent-10.932
 artificially-10.907
 disparate-10.860
Tokenhu
Token ID13415
Feature activation+1.60e+01

Pos logit contributions
uh+12.774
ix+12.590
u+12.129
ath+11.896
ind+11.754
Neg logit contributions
 tremend-10.329
 behavi-9.623
 disparate-8.980
 diminishing-8.719
 nonexistent-8.715
Tokenber
Token ID527
Feature activation+1.19e+01

Pos logit contributions
uh+13.141
u+12.373
ix+12.071
is+11.644
id+11.605
Neg logit contributions
 tremend-19.985
 behavi-18.828
 disparate-17.529
 meaningful-17.388
 artificially-17.363
Token won
Token ID1839
Feature activation+2.68e+00

Pos logit contributions
uh+14.409
ix+14.199
u+13.994
ath+13.646
i+13.509
Neg logit contributions
 tremend-9.591
 behavi-8.812
 disparate-8.241
 diminishing-8.038
 meaningful-7.910
Token as
Token ID355
Feature activation-6.54e+00

Pos logit contributions
 tremend+13.240
 artificially+12.633
 disparate+12.364
 meaningful+12.244
 diminishing+12.187
Neg logit contributions
uh-6.846
ix-6.732
u-6.202
ath-6.050
fn-6.050
Token a
Token ID257
Feature activation-1.47e+01

Pos logit contributions
 tremend+6.739
 artificially+6.205
 disparate+5.976
 behavi+5.898
 meaningful+5.891
Neg logit contributions
uh-7.598
ix-7.459
u-7.128
ath-7.013
are-6.997
Token kind
Token ID1611
Feature activation-4.46e-01

Pos logit contributions
 tremend+17.763
 artificially+16.939
 disparate+16.604
 meaningful+16.507
 diminishing+16.381
Neg logit contributions
uh-12.720
ix-12.654
u-11.705
fn-11.648
ath-11.621
Token of
Token ID286
Feature activation-1.00e+01

Pos logit contributions
 tremend+0.536
 artificially+0.486
 disparate+0.476
 behavi+0.472
 meaningful+0.466
Neg logit contributions
uh-0.445
ix-0.438
u-0.418
are-0.405
ath-0.405
Token
Token ID564
Feature activation+3.06e+01

Pos logit contributions
 tremend+11.007
 artificially+10.305
 disparate+10.026
 meaningful+9.871
 diminishing+9.801
Neg logit contributions
uh-10.619
ix-10.461
u-9.935
ath-9.749
are-9.747
Token
Token ID250
Feature activation+2.23e+01

Pos logit contributions
+26.377
+25.168
+24.982
+24.698
+24.556
Neg logit contributions
 tremend-20.740
 behavi-18.842
 artificially-18.416
 disparate-17.728
 mutually-17.547
Tokenf
Token ID69
Feature activation+6.70e+00

Pos logit contributions
uh+23.285
ind+22.510
u+22.289
ix+22.081
are+21.797
Neg logit contributions
 tremend-15.698
 artificially-14.011
 behavi-13.952
 disparate-13.540
 fragmented-12.999
Tokeniller
Token ID4665
Feature activation-3.67e+00

Pos logit contributions
uh+5.208
ix+5.034
are+4.654
u+4.646
ath+4.618
Neg logit contributions
 tremend-9.264
 artificially-8.571
 behavi-8.524
 disparate-8.435
 meaningful-8.221
Token
Token ID447
Feature activation+3.03e+01

Pos logit contributions
 tremend+4.832
 artificially+4.557
 disparate+4.463
 meaningful+4.379
 behavi+4.359
Neg logit contributions
uh-3.181
ix-3.098
u-2.950
are-2.853
ath-2.852
Token
Token ID251
Feature activation+2.25e+01

Pos logit contributions
+21.371
+20.507
+20.284
+20.167
+20.009
Neg logit contributions
 tremend-29.504
 behavi-26.299
 artificially-25.607
 mutually-24.965
 disparate-24.888
Token between
Token ID1022
Feature activation-1.08e+01

Pos logit contributions
uh+25.407
u+24.724
ix+24.720
i+24.205
are+24.063
Neg logit contributions
 tremend-12.200
 artificially-10.138
 behavi-10.086
 disparate-9.893
 fragmented-9.597
Token in
Token ID287
Feature activation-1.02e+01

Pos logit contributions
 tremend+2.415
 behavi+2.165
 disparate+2.133
 artificially+2.116
 meaningful+2.075
Neg logit contributions
ix-1.797
uh-1.795
u-1.698
ath-1.637
i-1.631
Token the
Token ID262
Feature activation-1.02e+01

Pos logit contributions
 tremend+8.745
 artificially+7.780
 disparate+7.533
 behavi+7.429
 meaningful+7.346
Neg logit contributions
uh-13.493
ix-13.466
u-12.933
are-12.693
ath-12.671
Token eurozone
Token ID37465
Feature activation+6.47e+00

Pos logit contributions
 tremend+9.010
 artificially+8.218
 disparate+7.946
 meaningful+7.782
 diminishing+7.728
Neg logit contributions
uh-13.072
ix-12.983
u-12.433
are-12.213
ath-12.201
Token.
Token ID13
Feature activation+1.92e+00

Pos logit contributions
uh+7.025
u+6.859
ix+6.848
are+6.720
id+6.683
Neg logit contributions
 tremend-6.265
 behavi-5.460
 disparate-5.185
 artificially-5.179
 nonexistent-5.109
Token
Token ID198
Feature activation+1.32e+01

Pos logit contributions
uh+1.157
ix+1.156
u+1.063
are+1.022
fn+1.020
Neg logit contributions
 tremend-3.096
 artificially-2.830
 disparate-2.795
 meaningful-2.747
 behavi-2.745
Token
Token ID198
Feature activation+1.20e+01

Pos logit contributions
ix+6.293
uh+6.083
u+5.890
i+5.538
are+5.525
Neg logit contributions
 tremend-25.107
 behavi-21.637
 undermin-21.386
 hemor-21.386
 mathemat-20.674
TokenHowever
Token ID4864
Feature activation+4.32e+00

Pos logit contributions
uh+4.892
ix+4.866
u+4.603
Co+4.460
are+4.359
Neg logit contributions
 tremend-21.474
 behavi-19.170
 artificially-18.641
 disparate-18.639
 fragmented-18.550
Token,
Token ID11
Feature activation-4.70e+00

Pos logit contributions
ix+3.743
uh+3.708
u+3.525
fn+3.431
are+3.429
Neg logit contributions
 tremend-5.376
 artificially-4.846
 disparate-4.787
 behavi-4.732
 fragmented-4.701
Token on
Token ID319
Feature activation-1.85e+00

Pos logit contributions
 tremend+4.637
 artificially+3.978
 disparate+3.902
 behavi+3.897
 meaningful+3.800
Neg logit contributions
uh-5.829
ix-5.808
u-5.567
are-5.502
ath-5.457
Token Thursday
Token ID3635
Feature activation+8.19e+00

Pos logit contributions
 tremend+2.136
 behavi+1.854
 artificially+1.844
 disparate+1.831
 meaningful+1.782
Neg logit contributions
ix-2.015
uh-1.962
u-1.917
are-1.866
ath-1.854
Token figures
Token ID5538
Feature activation+2.65e+00

Pos logit contributions
ix+9.436
uh+9.343
u+9.064
id+8.930
ind+8.893
Neg logit contributions
 tremend-7.675
 behavi-6.312
 disparate-6.059
 artificially-5.982
 nonexistent-5.927
Token [
Token ID685
Feature activation+9.62e+00

Pos logit contributions
 tremend+2.638
 artificially+2.410
 disparate+2.340
 behavi+2.333
 meaningful+2.271
Neg logit contributions
uh-4.039
ix-3.968
u-3.866
are-3.800
i-3.759
Tokenlast
Token ID12957
Feature activation+4.21e+00

Pos logit contributions
uh+7.327
ix+6.957
u+6.842
are+6.757
i+6.667
Neg logit contributions
 tremend-12.720
 artificially-12.027
 behavi-11.799
 disparate-11.770
 meaningful-11.520
Token week
Token ID1285
Feature activation+5.02e+00

Pos logit contributions
uh+5.119
ix+5.007
u+4.893
i+4.734
are+4.730
Neg logit contributions
 tremend-3.828
 artificially-3.496
 disparate-3.427
 behavi-3.343
 diminishing-3.319
Token
Token ID447
Feature activation+3.60e+01

Pos logit contributions
uh+2.457
ix+2.340
u+2.173
ath+2.035
i+2.026
Neg logit contributions
 tremend-8.172
 artificially-7.818
 disparate-7.745
 meaningful-7.609
 diminishing-7.593
Token
Token ID247
Feature activation+2.84e+01

Pos logit contributions
+24.149
+22.748
+21.911
+21.868
+21.785
Neg logit contributions
 tremend-36.044
 behavi-31.423
 mathemat-29.596
 undermin-29.474
 misunder-29.055
Tokens
Token ID82
Feature activation+1.69e+00

Pos logit contributions
s+24.183
ix+23.594
u+23.095
i+22.821
uh+22.727
Neg logit contributions
 tremend-27.988
 behavi-24.335
 hemor-23.649
 undermin-23.190
 mathemat-22.337
Token]
Token ID60
Feature activation-1.09e+00

Pos logit contributions
uh+1.244
ix+1.219
u+1.170
ath+1.119
are+1.116
Neg logit contributions
 tremend-2.359
 artificially-2.225
 disparate-2.196
 behavi-2.189
 meaningful-2.141
Token episode
Token ID4471
Feature activation+2.62e-01

Pos logit contributions
 tremend+1.123
 artificially+1.042
 disparate+1.019
 behavi+1.003
 meaningful+0.995
Neg logit contributions
uh-1.229
ix-1.208
u-1.170
ath-1.145
are-1.141
Token that
Token ID326
Feature activation-5.25e+00

Pos logit contributions
uh+0.280
ix+0.277
u+0.264
ath+0.258
are+0.257
Neg logit contributions
 tremend-0.286
 artificially-0.265
 disparate-0.262
 behavi-0.259
 meaningful-0.253
Token we
Token ID356
Feature activation-5.83e+00

Pos logit contributions
 tremend+5.795
 artificially+5.360
 disparate+5.235
 behavi+5.162
 meaningful+5.121
Neg logit contributions
uh-5.671
ix-5.622
u-5.346
ath-5.265
are-5.247
Token got
Token ID1392
Feature activation-5.41e+00

Pos logit contributions
 tremend+6.909
 artificially+6.469
 disparate+6.332
 meaningful+6.249
 diminishing+6.220
Neg logit contributions
uh-5.693
ix-5.628
u-5.309
ath-5.213
are-5.208
Token Obama
Token ID2486
Feature activation+5.87e+00

Pos logit contributions
 tremend+13.318
 artificially+12.559
 disparate+12.213
 meaningful+12.045
 diminishing+11.967
Neg logit contributions
uh-13.457
ix-13.308
u-12.576
ath-12.403
fn-12.403
Token administration
Token ID3662
Feature activation-3.27e+00

Pos logit contributions
uh+6.707
ix+6.618
u+6.518
id+6.303
ind+6.276
Neg logit contributions
 tremend-5.525
 artificially-4.957
 behavi-4.847
 disparate-4.711
 meaningful-4.703
Token to
Token ID284
Feature activation-9.34e+00

Pos logit contributions
 tremend+4.240
 artificially+3.887
 disparate+3.820
 behavi+3.807
 meaningful+3.755
Neg logit contributions
uh-2.940
ix-2.879
u-2.747
ath-2.651
are-2.649
Token "
Token ID366
Feature activation+1.28e+00

Pos logit contributions
 tremend+7.979
 artificially+7.234
 disparate+7.050
 meaningful+6.905
 behavi+6.869
Neg logit contributions
uh-12.244
ix-12.134
u-11.633
fn-11.448
ath-11.438
Tokenlead
Token ID28230
Feature activation-1.76e+00

Pos logit contributions
uh+0.740
ix+0.705
u+0.662
ind+0.641
are+0.635
Neg logit contributions
 tremend-2.075
 artificially-1.952
 disparate-1.946
 behavi-1.940
 diminishing-1.910
Token from
Token ID422
Feature activation-8.95e-01

Pos logit contributions
 tremend+2.094
 artificially+1.911
 disparate+1.882
 behavi+1.858
 diminishing+1.848
Neg logit contributions
ix-1.745
uh-1.735
u-1.646
ind-1.599
are-1.597
Token behind
Token ID2157
Feature activation+1.61e+00

Pos logit contributions
 tremend+0.666
 artificially+0.596
 behavi+0.576
 disparate+0.558
 meaningful+0.556
Neg logit contributions
uh-1.239
ix-1.224
u-1.215
ind-1.186
ath-1.186
Token"
Token ID1
Feature activation-5.76e+00

Pos logit contributions
uh+0.983
ix+0.962
u+0.932
are+0.874
ath+0.859
Neg logit contributions
 tremend-2.441
 artificially-2.282
 disparate-2.255
 behavi-2.248
 diminishing-2.208
Token again
Token ID757
Feature activation-4.04e+00

Pos logit contributions
 tremend+6.442
 artificially+5.877
 disparate+5.725
 behavi+5.676
 meaningful+5.616
Neg logit contributions
uh-6.189
ix-6.138
u-5.855
fn-5.722
ath-5.707
Token,
Token ID11
Feature activation-4.25e+00

Pos logit contributions
 tremend+5.494
 artificially+5.103
 disparate+5.042
 behavi+4.974
 meaningful+4.935
Neg logit contributions
uh-3.389
ix-3.344
u-3.164
fn-3.066
ind-3.033
Token but
Token ID475
Feature activation-2.25e+00

Pos logit contributions
 tremend+4.659
 artificially+4.203
 disparate+4.158
 behavi+4.096
 meaningful+4.047
Neg logit contributions
uh-4.654
ix-4.598
u-4.387
fn-4.303
are-4.301
Token the
Token ID262
Feature activation-1.08e+01

Pos logit contributions
 tremend+11.735
 artificially+10.992
 disparate+10.719
 meaningful+10.500
 diminishing+10.499
Neg logit contributions
uh-10.391
ix-10.280
u-9.719
ath-9.526
are-9.500
Token green
Token ID4077
Feature activation+5.23e+00

Pos logit contributions
 tremend+10.669
 artificially+9.929
 disparate+9.636
 meaningful+9.466
 diminishing+9.417
Neg logit contributions
uh-12.580
ix-12.451
u-11.845
ath-11.657
fn-11.645
Token and
Token ID290
Feature activation-6.54e+00

Pos logit contributions
uh+5.927
ix+5.806
u+5.659
oh+5.428
ath+5.410
Neg logit contributions
 tremend-5.091
 disparate-4.628
 artificially-4.525
 behavi-4.427
 nonexistent-4.388
Token restore
Token ID11169
Feature activation-5.98e+00

Pos logit contributions
 tremend+5.788
 artificially+5.236
 disparate+5.102
 behavi+5.025
 diminishing+4.976
Neg logit contributions
uh-8.511
ix-8.459
u-8.108
are-8.039
ath-7.940
Token performance
Token ID2854
Feature activation-6.44e+00

Pos logit contributions
 tremend+5.628
 artificially+5.167
 disparate+5.015
 behavi+4.933
 meaningful+4.892
Neg logit contributions
uh-7.416
ix-7.383
u-7.020
ath-6.943
are-6.904
Token!
Token ID0
Feature activation-1.23e+00

Pos logit contributions
 tremend+9.295
 artificially+8.721
 disparate+8.628
 behavi+8.485
 meaningful+8.480
Neg logit contributions
uh-4.823
ix-4.731
u-4.400
ath-4.258
are-4.228
Token
Token ID198
Feature activation+1.37e+00

Pos logit contributions
 tremend+1.924
 artificially+1.797
 disparate+1.766
 behavi+1.761
 meaningful+1.745
Neg logit contributions
uh-0.792
ix-0.783
u-0.725
are-0.712
fn-0.689
Token
Token ID198
Feature activation+1.30e+00

Pos logit contributions
uh+0.876
ix+0.867
u+0.802
are+0.773
fn+0.770
Neg logit contributions
 tremend-2.197
 behavi-1.982
 artificially-1.975
 disparate-1.947
 diminishing-1.916
TokenRace
Token ID35157
Feature activation+3.30e+00

Pos logit contributions
uh+0.620
ix+0.609
u+0.552
are+0.537
fn+0.518
Neg logit contributions
 tremend-2.252
 artificially-2.112
 behavi-2.093
 disparate-2.073
 diminishing-2.058
Token under
Token ID739
Feature activation-3.61e+00

Pos logit contributions
ix+3.608
uh+3.600
u+3.461
are+3.404
ath+3.379
Neg logit contributions
 tremend-3.369
 artificially-3.040
 behavi-2.961
 disparate-2.947
 diminishing-2.929
Token bright
Token ID6016
Feature activation-6.33e-01

Pos logit contributions
 tremend+3.433
 artificially+3.081
 behavi+3.002
 disparate+2.977
 meaningful+2.933
Neg logit contributions
uh-4.462
ix-4.447
u-4.298
ath-4.218
are-4.206
Token is
Token ID318
Feature activation-4.36e+00

Pos logit contributions
 tremend+5.239
 artificially+4.789
 disparate+4.720
 behavi+4.691
 meaningful+4.654
Neg logit contributions
ix-3.998
uh-3.994
u-3.769
are-3.691
ind-3.661
Token not
Token ID407
Feature activation-3.51e+00

Pos logit contributions
 tremend+3.775
 artificially+3.299
 behavi+3.227
 disparate+3.220
 diminishing+3.089
Neg logit contributions
uh-5.888
ix-5.834
u-5.644
ind-5.495
fn-5.480
Token one
Token ID530
Feature activation-1.61e+00

Pos logit contributions
 tremend+3.086
 artificially+2.671
 behavi+2.638
 disparate+2.605
 fragmented+2.496
Neg logit contributions
uh-4.706
ix-4.673
u-4.506
are-4.398
ath-4.394
Token of
Token ID286
Feature activation-4.22e+00

Pos logit contributions
 tremend+1.963
 artificially+1.786
 disparate+1.766
 behavi+1.763
 diminishing+1.717
Neg logit contributions
uh-1.568
ix-1.547
u-1.480
are-1.431
ath-1.428
Token them
Token ID606
Feature activation-6.84e+00

Pos logit contributions
 tremend+3.203
 artificially+2.718
 behavi+2.657
 disparate+2.642
 fragmented+2.553
Neg logit contributions
uh-6.192
ix-6.165
u-5.994
are-5.885
fn-5.855
Token;
Token ID26
Feature activation-2.05e+00

Pos logit contributions
 tremend+9.828
 artificially+9.233
 disparate+9.112
 behavi+8.943
 meaningful+8.930
Neg logit contributions
uh-5.163
ix-5.127
u-4.768
ath-4.628
are-4.595
Token it
Token ID340
Feature activation-2.70e+00

Pos logit contributions
 tremend+2.114
 artificially+1.889
 disparate+1.828
 behavi+1.822
 diminishing+1.784
Neg logit contributions
uh-2.420
ix-2.387
u-2.300
are-2.261
she-2.250
Token
Token ID447
Feature activation+4.37e+01

Pos logit contributions
 tremend+4.261
 behavi+3.978
 artificially+3.936
 disparate+3.928
 meaningful+3.866
Neg logit contributions
ix-1.685
uh-1.678
u-1.544
ath-1.500
are-1.489
Token
Token ID247
Feature activation+3.15e+01

Pos logit contributions
+30.854
+30.169
+28.020
+27.909
+27.890
Neg logit contributions
 tremend-31.860
 behavi-27.989
 artificially-26.623
 hemor-26.566
 diminishing-26.293
Tokens
Token ID82
Feature activation+6.82e+00

Pos logit contributions
u+30.433
uh+30.289
ix+30.167
is+29.450
ind+29.326
Neg logit contributions
 tremend-16.651
 disparate-14.807
 behavi-14.466
 diminishing-14.404
 artificially-14.110
Token just
Token ID655
Feature activation-1.87e+00

Pos logit contributions
uh+9.298
ix+9.274
u+9.007
ind+8.794
are+8.768
Neg logit contributions
 tremend-4.632
 artificially-3.852
 disparate-3.761
 behavi-3.741
 diminishing-3.605
Token symbolic
Token ID18975
Feature activation-5.46e+00

Pos logit contributions
 tremend+23.621
 artificially+22.853
 disparate+22.356
 meaningful+22.318
 mutually+22.096
Neg logit contributions
uh-12.631
ix-12.364
fn-11.348
u-11.301
ath-11.097
Token pre
Token ID662
Feature activation+2.96e+00

Pos logit contributions
 tremend+5.888
 artificially+5.425
 disparate+5.279
 diminishing+5.196
 meaningful+5.153
Neg logit contributions
uh-6.026
ix-5.971
u-5.740
are-5.590
ind-5.561
Tokenfig
Token ID5647
Feature activation+1.45e+01

Pos logit contributions
uh+1.457
ix+1.453
u+1.387
ind+1.358
ath+1.292
Neg logit contributions
 tremend-4.993
 artificially-4.650
 behavi-4.649
 disparate-4.618
 mutually-4.548
Tokenurations
Token ID20074
Feature activation-8.21e+00

Pos logit contributions
u+10.834
uh+10.567
ix+10.108
are+9.492
ind+9.468
Neg logit contributions
 tremend-17.927
 behavi-16.418
 undermin-15.915
 disparate-15.768
 hemor-15.705
Token of
Token ID286
Feature activation-1.66e+01

Pos logit contributions
 tremend+10.203
 artificially+9.560
 disparate+9.376
 meaningful+9.224
 diminishing+9.213
Neg logit contributions
uh-7.593
ix-7.513
u-7.092
are-6.892
fn-6.889
Token the
Token ID262
Feature activation-1.52e+01

Pos logit contributions
 tremend+21.446
 artificially+20.543
 disparate+20.108
 meaningful+19.992
 mutually+19.725
Neg logit contributions
uh-12.725
ix-12.464
fn-11.468
u-11.433
are-11.272
Token Christian
Token ID4302
Feature activation-1.49e+00

Pos logit contributions
 tremend+20.017
 artificially+19.097
 disparate+18.678
 meaningful+18.539
 diminishing+18.375
Neg logit contributions
uh-11.704
ix-11.544
u-10.619
fn-10.569
ath-10.404
Token doctrine
Token ID13115
Feature activation-3.03e+00

Pos logit contributions
 tremend+1.151
 artificially+1.053
 disparate+1.008
 behavi+1.001
 diminishing+0.995
Neg logit contributions
ix-2.052
uh-2.048
u-1.987
are-1.956
ind-1.938
Token;
Token ID26
Feature activation-4.43e+00

Pos logit contributions
 tremend+3.803
 artificially+3.566
 disparate+3.500
 behavi+3.483
 diminishing+3.451
Neg logit contributions
ix-2.773
uh-2.771
u-2.632
ath-2.509
are-2.507
Token Od
Token ID10529
Feature activation+1.13e+01

Pos logit contributions
 tremend+4.861
 artificially+4.510
 disparate+4.375
 meaningful+4.335
 behavi+4.304
Neg logit contributions
uh-4.777
ix-4.714
u-4.521
ath-4.425
ind-4.413
Tokenys
Token ID893
Feature activation+1.59e+01

Pos logit contributions
ix+6.073
uh+5.820
u+5.361
ind+5.209
i+4.935
Neg logit contributions
 tremend-17.905
 disparate-16.868
 behavi-16.771
 artificially-16.709
 meaningful-16.366
Token for
Token ID329
Feature activation-9.09e+00

Pos logit contributions
 tremend+5.926
 artificially+5.509
 behavi+5.439
 disparate+5.424
 fragmented+5.337
Neg logit contributions
uh-3.260
ix-3.223
u-3.045
ath-2.917
are-2.915
Token boys
Token ID6510
Feature activation-3.37e+00

Pos logit contributions
 tremend+8.680
 artificially+7.899
 disparate+7.641
 behavi+7.564
 meaningful+7.460
Neg logit contributions
uh-11.226
ix-11.121
u-10.736
ath-10.530
are-10.515
Token or
Token ID393
Feature activation-8.62e+00

Pos logit contributions
 tremend+4.635
 artificially+4.273
 disparate+4.247
 behavi+4.236
 diminishing+4.153
Neg logit contributions
uh-2.808
ix-2.783
u-2.635
ath-2.561
are-2.539
Token beauty
Token ID8737
Feature activation-2.02e+00

Pos logit contributions
 tremend+8.269
 artificially+7.545
 disparate+7.372
 behavi+7.263
 meaningful+7.146
Neg logit contributions
uh-10.683
ix-10.571
u-10.155
ath-9.988
are-9.942
Token in
Token ID287
Feature activation-1.10e+01

Pos logit contributions
 tremend+2.453
 behavi+2.258
 disparate+2.253
 artificially+2.247
 diminishing+2.188
Neg logit contributions
uh-1.950
ix-1.913
u-1.843
ath-1.788
are-1.768
Token the
Token ID262
Feature activation-1.28e+01

Pos logit contributions
 tremend+11.023
 artificially+10.200
 disparate+9.943
 behavi+9.742
 meaningful+9.735
Neg logit contributions
uh-12.868
ix-12.723
u-12.187
ath-11.991
are-11.944
Token case
Token ID1339
Feature activation-2.19e+00

Pos logit contributions
 tremend+13.191
 artificially+12.356
 disparate+11.990
 meaningful+11.828
 diminishing+11.746
Neg logit contributions
uh-14.232
ix-14.072
u-13.352
fn-13.145
ath-13.142
Token of
Token ID286
Feature activation-1.10e+01

Pos logit contributions
 tremend+2.504
 artificially+2.352
 disparate+2.298
 meaningful+2.259
 diminishing+2.254
Neg logit contributions
uh-2.221
ix-2.195
u-2.079
ath-2.026
fn-2.024
Token girls
Token ID4813
Feature activation-5.20e+00

Pos logit contributions
 tremend+11.308
 artificially+10.472
 disparate+10.219
 meaningful+10.021
 behavi+9.991
Neg logit contributions
uh-12.561
ix-12.436
u-11.856
ath-11.650
are-11.626
Token.
Token ID13
Feature activation-3.46e+00

Pos logit contributions
 tremend+7.564
 artificially+7.005
 disparate+6.954
 behavi+6.925
 meaningful+6.836
Neg logit contributions
uh-3.900
ix-3.867
u-3.592
ath-3.478
are-3.464
Token
Token ID198
Feature activation+8.11e+00

Pos logit contributions
 tremend+4.932
 artificially+4.612
 disparate+4.516
 behavi+4.467
 meaningful+4.460
Neg logit contributions
uh-2.636
ix-2.611
u-2.429
fn-2.351
are-2.350
Token credit
Token ID3884
Feature activation+6.24e+00

Pos logit contributions
uh+4.686
ix+4.605
ath+4.490
u+4.442
are+4.393
Neg logit contributions
 tremend-2.429
 artificially-2.137
 behavi-2.089
 disparate-2.034
 fragmented-2.012
Token for
Token ID329
Feature activation-7.89e+00

Pos logit contributions
uh+5.822
ix+5.725
u+5.535
ath+5.416
id+5.389
Neg logit contributions
 tremend-7.096
 behavi-6.390
 artificially-6.384
 disparate-6.372
 fragmented-6.298
Token being
Token ID852
Feature activation-9.37e+00

Pos logit contributions
 tremend+7.005
 artificially+6.397
 disparate+6.208
 behavi+6.064
 meaningful+6.051
Neg logit contributions
uh-10.172
ix-10.053
u-9.647
ath-9.508
fn-9.484
Token open
Token ID1280
Feature activation-5.40e+00

Pos logit contributions
 tremend+8.999
 artificially+8.315
 disparate+8.046
 meaningful+7.888
 diminishing+7.885
Neg logit contributions
uh-11.318
ix-11.203
u-10.696
ath-10.574
fn-10.513
Token to
Token ID284
Feature activation-1.08e+01

Pos logit contributions
 tremend+7.623
 artificially+7.134
 disparate+6.964
 diminishing+6.885
 behavi+6.878
Neg logit contributions
uh-4.256
ix-4.213
u-3.939
ath-3.825
fn-3.796
Token a
Token ID257
Feature activation-1.12e+01

Pos logit contributions
 tremend+12.434
 artificially+11.727
 disparate+11.426
 meaningful+11.290
 diminishing+11.217
Neg logit contributions
uh-10.770
ix-10.635
u-10.024
ath-9.845
fn-9.845
Token change
Token ID1487
Feature activation-2.76e+00

Pos logit contributions
 tremend+12.005
 artificially+11.294
 disparate+11.031
 meaningful+10.919
 diminishing+10.810
Neg logit contributions
uh-11.706
ix-11.546
u-10.910
fn-10.767
are-10.738
Token of
Token ID286
Feature activation-3.29e+00

Pos logit contributions
 tremend+3.698
 artificially+3.407
 disparate+3.370
 behavi+3.335
 fragmented+3.316
Neg logit contributions
uh-2.399
ix-2.349
u-2.258
ind-2.171
ath-2.166
Token mindset
Token ID20527
Feature activation-8.29e+00

Pos logit contributions
 tremend+1.943
 artificially+1.593
 behavi+1.501
 disparate+1.480
 diminishing+1.431
Neg logit contributions
uh-5.294
ix-5.284
u-5.150
ath-5.090
are-5.069
Token.
Token ID13
Feature activation-7.78e-01

Pos logit contributions
 tremend+11.020
 artificially+10.379
 disparate+10.172
 meaningful+10.026
 diminishing+9.990
Neg logit contributions
uh-6.987
ix-6.876
u-6.443
ath-6.271
are-6.259
Token
Token ID198
Feature activation+8.10e+00

Pos logit contributions
 tremend+1.181
 artificially+1.089
 disparate+1.068
 behavi+1.062
 fragmented+1.049
Neg logit contributions
uh-0.554
ix-0.545
u-0.508
fn-0.494
are-0.493
Token members
Token ID1866
Feature activation+2.68e-01

Pos logit contributions
ix+10.913
uh+10.592
u+10.522
ath+10.325
id+10.237
Neg logit contributions
 tremend-8.715
 behavi-7.521
 artificially-7.106
 disparate-7.020
 diminishing-6.943
Token a
Token ID257
Feature activation-8.13e+00

Pos logit contributions
ix+0.373
uh+0.368
u+0.354
ath+0.349
are+0.347
Neg logit contributions
 tremend-0.218
 behavi-0.184
 artificially-0.183
 disparate-0.178
 diminishing-0.175
Token one
Token ID530
Feature activation+3.19e+00

Pos logit contributions
 tremend+7.537
 artificially+6.851
 disparate+6.628
 behavi+6.516
 diminishing+6.464
Neg logit contributions
uh-10.212
ix-10.145
u-9.691
are-9.517
fn-9.504
Token-
Token ID12
Feature activation+1.17e+01

Pos logit contributions
uh+1.382
ix+1.372
u+1.249
are+1.147
i+1.115
Neg logit contributions
 tremend-5.447
 disparate-5.086
 behavi-5.059
 artificially-5.045
 diminishing-4.962
Tokenmonth
Token ID8424
Feature activation+3.67e+00

Pos logit contributions
uh+7.674
u+7.552
ix+7.354
are+7.099
i+6.933
Neg logit contributions
 tremend-16.478
 behavi-15.446
 disparate-14.959
 mathemat-14.610
 artificially-14.579
Token free
Token ID1479
Feature activation-3.56e+00

Pos logit contributions
ix+4.943
uh+4.881
u+4.804
i+4.710
are+4.680
Neg logit contributions
 tremend-2.996
 behavi-2.503
 artificially-2.412
 disparate-2.407
 diminishing-2.249
Token trial
Token ID4473
Feature activation-6.20e-01

Pos logit contributions
 tremend+2.985
 artificially+2.656
 disparate+2.581
 behavi+2.530
 meaningful+2.501
Neg logit contributions
uh-4.798
ix-4.797
u-4.611
are-4.523
ind-4.500
Token of
Token ID286
Feature activation-7.71e+00

Pos logit contributions
 tremend+0.782
 behavi+0.696
 artificially+0.694
 disparate+0.689
 diminishing+0.668
Neg logit contributions
ix-0.581
uh-0.578
u-0.554
are-0.540
ath-0.536
Token S
Token ID311
Feature activation+6.89e+00

Pos logit contributions
 tremend+8.667
 artificially+7.876
 disparate+7.666
 behavi+7.630
 meaningful+7.548
Neg logit contributions
ix-8.367
uh-8.307
u-7.861
are-7.764
ath-7.714
Tokenling
Token ID1359
Feature activation+2.68e+00

Pos logit contributions
ix+2.373
uh+2.330
are+1.929
ind+1.887
ath+1.837
Neg logit contributions
 tremend-13.043
 behavi-11.943
 artificially-11.751
 disparate-11.737
 diminishing-11.548
Token TV
Token ID3195
Feature activation+3.33e+00

Pos logit contributions
ix+3.124
uh+3.046
u+2.983
ath+2.892
i+2.892
Neg logit contributions
 tremend-2.692
 artificially-2.379
 disparate-2.342
 diminishing-2.293
 behavi-2.291
Token bikes
Token ID16715
Feature activation+3.26e-01

Pos logit contributions
 tremend+7.957
 artificially+7.271
 disparate+7.078
 meaningful+6.930
 behavi+6.912
Neg logit contributions
uh-10.057
ix-9.973
u-9.521
ath-9.381
are-9.370
Token and
Token ID290
Feature activation-1.18e+01

Pos logit contributions
uh+0.328
ix+0.324
u+0.314
are+0.306
ath+0.306
Neg logit contributions
 tremend-0.376
 behavi-0.338
 disparate-0.337
 artificially-0.333
 meaningful-0.329
Token learned
Token ID4499
Feature activation-4.42e+00

Pos logit contributions
 tremend+13.915
 artificially+13.208
 disparate+12.850
 meaningful+12.713
 diminishing+12.625
Neg logit contributions
uh-11.117
ix-10.962
u-10.293
fn-10.122
ath-10.094
Token to
Token ID284
Feature activation-1.19e+01

Pos logit contributions
 tremend+4.439
 artificially+4.029
 disparate+3.920
 diminishing+3.840
 behavi+3.824
Neg logit contributions
uh-5.292
ix-5.261
u-5.055
are-4.979
ath-4.915
Token operate
Token ID8076
Feature activation-6.96e+00

Pos logit contributions
 tremend+11.594
 artificially+10.961
 disparate+10.526
 meaningful+10.421
 diminishing+10.324
Neg logit contributions
uh-13.495
ix-13.333
u-12.656
fn-12.517
ath-12.497
Token our
Token ID674
Feature activation-1.31e+01

Pos logit contributions
 tremend+7.132
 artificially+6.447
 disparate+6.373
 behavi+6.252
 diminishing+6.211
Neg logit contributions
uh-8.177
ix-8.056
u-7.732
ath-7.618
are-7.614
Token bikes
Token ID16715
Feature activation-2.88e+00

Pos logit contributions
 tremend+15.411
 artificially+14.650
 disparate+14.248
 meaningful+14.109
 diminishing+14.033
Neg logit contributions
uh-12.165
ix-12.008
u-11.249
fn-11.122
ath-11.068
Token on
Token ID319
Feature activation-1.03e+01

Pos logit contributions
 tremend+3.294
 disparate+2.990
 artificially+2.966
 behavi+2.935
 diminishing+2.909
Neg logit contributions
uh-3.002
ix-2.939
u-2.831
ath-2.779
are-2.768
Token the
Token ID262
Feature activation-1.19e+01

Pos logit contributions
 tremend+12.361
 artificially+11.537
 disparate+11.304
 meaningful+11.108
 behavi+11.064
Neg logit contributions
uh-10.116
ix-10.044
u-9.451
ath-9.281
are-9.275
Token streets
Token ID6483
Feature activation-4.48e+00

Pos logit contributions
 tremend+13.692
 artificially+12.918
 disparate+12.576
 meaningful+12.426
 diminishing+12.353
Neg logit contributions
uh-11.810
ix-11.660
u-10.992
fn-10.797
ath-10.797
Token according
Token ID1864
Feature activation-1.25e+00

Pos logit contributions
 tremend+6.170
 artificially+5.734
 disparate+5.716
 behavi+5.598
 meaningful+5.584
Neg logit contributions
uh-3.647
ix-3.604
u-3.393
ath-3.265
are-3.259