TOP ACTIVATIONS
MAX = 21.488

ic of an extended mass revolt and a defeated proletarian leadership
) f absorbed ( t ) Let
'd burn your ( re ar ) up ?" the sheriff
laughs ] I 'm getting so used to it , I
want to proceed , read on . 1 . 5 -
internal estimation you could see here ); - The
10 . 2 yards per catch ) and a 1 :
public interest litigation ( P IL ) in the high court
The epic ghost invasion of the big Apple .
have been the basis of American foreign policy for a very
for approximately 1 - hour tour ). Originally located in Illinois
ST 3 K , I think Reviewer : B
. for 2 - hour tour ; $ 30 for shorter
Environment activists have filed complaint pertaining filling of wetlands with
of clutch - driven , on - demand all - wheel
become slightly less depressing when you remember how much more depressing
), white ( ch ag ha an ), good
the Motor Vehicle ( Am endment ) Bill 2016 for consideration
66 power off Notice how we used the full
2 : "[ LS D pioneer Michael ] Holl ings head

MIN ACTIVATIONS
MIN = -22.500

Editorial Board , a dem ographically and ideologically diverse group that
, a dem ographically and ideologically diverse group that is separate
United States can do to help out in London and the
, so we turned the heat down to 350 degrees for
Sun Online news team ? Email us at tips @ the
just as powerful . The stakes of our struggle just as
, no matter what the conditions , are for Nietzsche the
burning contest . The temperature was so high , reaching more
should be adequately represented , gender should be adequately represented ,
of Gentle men tried to record live -- first of all
) people , such as gender flu id and gender que
an issue by setting the language of a document to US
. 9 billion . They did not . Instead
do you know ? Virt ually every major medical association supports
others refer to them with gender - neutral pronouns such as
the new money men to do that . History
intense and longer - lasting than what the country has seen
I might not have the time for Serious Recording Attempt s
or Gender que er : Gender flu id or gender que
female pronouns . Gender Identity : Gender identity is

INTERVAL 10.491 - 21.488
CONTAINS 0.579%

now have smoking - gun proof that top officials in the
- EE - oh - v ick ) The results support
have little in common at first except their love of music
the policy is to direct investment dollars into the construction of
M umin v ä r ld en in Swedish ) is

INTERVAL -0.506 - 10.491
CONTAINS 55.428%

tells us that rocky planets could form much earlier than we
more in - store for 2015 as the band will be
own history the tape his s is just as much
all the way and they came up short .
in the world s worst nuclear disaster . At

INTERVAL -11.503 - -0.506
CONTAINS 43.740%

ideas on humanity regarding a basic quality of life as a
Clinton mentioned the recent debates with Republican nominee Donald Trump Donald
sheet ; but even in a land of glaciers , Jak
t have such a contract with the state of
р о с с и с к и

INTERVAL -22.500 - -11.503
CONTAINS 0.252%

I added a < span > around the phone name .
over . The high stakes can be seen in
links to some of the cos players featured above :
Franklin D . Roosevelt appointed her to several government positions ,
because we believe that we do .
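
Note on the intervals above: the four INTERVAL ranges appear to be equal-width bins spanning MIN to MAX, and the CONTAINS percentages sum to roughly 100%. This is inferred from the numbers, not stated in the dump. The sketch below is a minimal check under that assumption. The names `interval_edges`, `ACT_MIN`, `ACT_MAX`, and `CONTAINS` are hypothetical and introduced only for illustration.

```python
# Values copied from the MIN / MAX and CONTAINS lines of this dump.
ACT_MIN, ACT_MAX = -22.500, 21.488
CONTAINS = [0.579, 55.428, 43.740, 0.252]  # percent, highest interval first


def interval_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced edges from lo to hi, rounded to 3 places.

    Assumption: the dashboard bins activations into equal-width intervals.
    """
    width = (hi - lo) / n_bins
    return [round(lo + i * width, 3) for i in range(n_bins + 1)]


edges = interval_edges(ACT_MIN, ACT_MAX, 4)
total_percent = round(sum(CONTAINS), 3)

print(edges)          # compare against the INTERVAL boundaries listed above
print(total_percent)  # compare against 100
```

If the equal-width assumption holds, the printed edges should line up with the boundaries -22.500, -11.503, -0.506, 10.491, and 21.488. Any small shortfall of the total from 100% would then be explained by rounding in the displayed percentages.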
Tokenic
Token ID291
Feature activation+1.01e+01

Pos logit contributions
ittal+1.314
��+1.284
Reviewer+1.236
akin+1.125
imony+1.114
Neg logit contributions
etting-1.792
coded-1.779
 ours-1.724
population-1.710
 availability-1.691
Token of
Token ID286
Feature activation+5.59e+00

Pos logit contributions
ittal+7.183
Reviewer+6.663
��+6.601
akin+6.448
imony+6.378
Neg logit contributions
etting-6.562
coded-6.535
 ours-6.231
population-6.224
spawn-6.097
Token an
Token ID281
Feature activation+1.10e+01

Pos logit contributions
ittal+4.191
��+4.041
Reviewer+3.979
akin+3.757
imony+3.733
Neg logit contributions
etting-3.711
coded-3.657
population-3.449
 ours-3.447
spawn-3.405
Token extended
Token ID7083
Feature activation+1.12e+01

Pos logit contributions
ittal+8.879
��+8.464
Reviewer+8.458
akin+8.442
atio+8.149
Neg logit contributions
etting-5.944
coded-5.739
 ours-5.469
population-5.442
spawn-5.438
Token mass
Token ID2347
Feature activation+6.40e+00

Pos logit contributions
ittal+8.588
Reviewer+8.103
��+8.020
akin+7.831
atio+7.728
Neg logit contributions
etting-6.563
coded-6.289
spawn-5.903
population-5.858
 ours-5.856
Token revolt
Token ID27360
Feature activation+2.15e+01

Pos logit contributions
ittal+4.723
��+4.521
Reviewer+4.490
akin+4.231
imony+4.193
Neg logit contributions
etting-4.305
coded-4.249
 ours-4.046
population-3.989
spawn-3.916
Token and
Token ID290
Feature activation+4.21e+00

Pos logit contributions
ittal+12.397
akin+11.757
Reviewer+11.572
imony+11.391
atio+11.205
Neg logit contributions
coded-12.744
etting-12.459
 ours-11.936
 availability-11.778
population-11.722
Token a
Token ID257
Feature activation+4.63e+00

Pos logit contributions
ittal+3.190
��+3.101
Reviewer+3.038
akin+2.839
imony+2.813
Neg logit contributions
etting-2.813
coded-2.776
 ours-2.646
population-2.636
 availability-2.588
Token defeated
Token ID9772
Feature activation+8.96e+00

Pos logit contributions
ittal+3.707
��+3.598
Reviewer+3.543
akin+3.320
imony+3.281
Neg logit contributions
etting-2.875
coded-2.829
 ours-2.696
population-2.676
 availability-2.638
Token proletarian
Token ID47393
Feature activation+1.08e+01

Pos logit contributions
ittal+6.869
Reviewer+6.490
��+6.458
akin+6.177
imony+6.124
Neg logit contributions
etting-5.557
coded-5.384
population-5.079
 ours-5.043
spawn-5.040
Token leadership
Token ID5531
Feature activation+1.18e+01

Pos logit contributions
ittal+8.088
Reviewer+7.905
��+7.440
akin+7.432
imony+7.324
Neg logit contributions
etting-6.533
coded-6.331
 ours-5.881
population-5.803
spawn-5.768
Token)
Token ID8
Feature activation+5.04e+00

Pos logit contributions
ittal+9.612
akin+8.833
��+8.726
atio+8.444
ateur+8.428
Neg logit contributions
etting-11.084
coded-10.951
 ours-10.366
 brightest-10.340
 wisely-10.310
Token
Token ID784
Feature activation+1.37e+00

Pos logit contributions
ittal+3.554
��+3.405
Reviewer+3.301
akin+3.133
imony+3.087
Neg logit contributions
etting-3.590
coded-3.578
 ours-3.439
population-3.398
 availability-3.328
Token f
Token ID277
Feature activation+4.62e-02

Pos logit contributions
ittal+1.009
��+0.991
Reviewer+0.962
akin+0.888
imony+0.884
Neg logit contributions
etting-0.957
coded-0.948
 ours-0.915
population-0.907
 availability-0.896
Token absorbed
Token ID19233
Feature activation+9.59e+00

Pos logit contributions
ittal+0.037
��+0.037
Reviewer+0.036
akin+0.033
imony+0.033
Neg logit contributions
etting-0.029
coded-0.029
 ours-0.028
population-0.028
 availability-0.027
Token (
Token ID357
Feature activation+1.06e+01

Pos logit contributions
ittal+7.037
��+6.630
Reviewer+6.465
akin+6.236
ateral+6.151
Neg logit contributions
etting-6.152
coded-6.133
 ours-5.816
population-5.772
spawn-5.641
Tokent
Token ID83
Feature activation+1.84e+01

Pos logit contributions
ittal+7.604
��+7.143
Reviewer+7.001
akin+6.972
ateur+6.699
Neg logit contributions
etting-6.814
coded-6.746
 ours-6.570
 brightest-6.422
 wisely-6.369
Token)
Token ID8
Feature activation+6.85e+00

Pos logit contributions
ittal+10.603
akin+9.809
��+9.608
atio+9.438
ateur+9.429
Neg logit contributions
etting-12.158
coded-12.126
 ours-11.335
population-11.313
 brightest-11.295
Token
Token ID198
Feature activation+2.99e+00

Pos logit contributions
ittal+4.802
��+4.557
Reviewer+4.426
akin+4.243
imony+4.200
Neg logit contributions
etting-4.786
coded-4.785
 ours-4.577
population-4.524
spawn-4.424
Token
Token ID198
Feature activation+3.80e+00

Pos logit contributions
ittal+1.996
Reviewer+1.881
��+1.821
akin+1.759
imony+1.747
Neg logit contributions
etting-2.256
coded-2.211
 mosqu-2.135
 ours-2.112
population-2.111
TokenLet
Token ID5756
Feature activation+1.47e+01

Pos logit contributions
ittal+2.713
Reviewer+2.592
��+2.588
akin+2.418
imony+2.375
Neg logit contributions
etting-2.675
coded-2.652
 ours-2.581
 availability-2.511
population-2.507
Token
Token ID447
Feature activation+6.31e-01

Pos logit contributions
ittal+11.053
Reviewer+10.642
��+10.489
akin+10.404
atio+10.231
Neg logit contributions
coded-7.942
etting-7.935
population-7.128
spawn-7.078
 brightest-7.034
Token'd
Token ID1549
Feature activation+2.29e-01

Pos logit contributions
ittal+3.724
��+3.634
Reviewer+3.561
akin+3.394
imony+3.367
Neg logit contributions
etting-1.911
coded-1.881
 ours-1.775
population-1.763
 availability-1.724
Token burn
Token ID4245
Feature activation-1.37e+00

Pos logit contributions
ittal+0.237
��+0.235
Reviewer+0.230
akin+0.216
imony+0.215
Neg logit contributions
etting-0.092
coded-0.091
 ours-0.086
population-0.082
 availability-0.081
Token your
Token ID534
Feature activation+4.69e+00

Pos logit contributions
etting+0.793
 ours+0.782
coded+0.776
population+0.740
 availability+0.738
Neg logit contributions
��-1.168
ittal-1.167
Reviewer-1.118
akin-1.047
iband-1.043
Token (
Token ID357
Feature activation+1.08e+01

Pos logit contributions
ittal+3.510
��+3.395
Reviewer+3.296
akin+3.085
imony+3.068
Neg logit contributions
etting-3.183
coded-3.175
population-3.025
 ours-3.006
 availability-2.954
Tokenre
Token ID260
Feature activation+1.40e+01

Pos logit contributions
ittal+7.229
Reviewer+6.680
��+6.648
akin+6.329
ateur+6.203
Neg logit contributions
etting-7.573
coded-7.464
 availability-7.347
 toxicity-7.311
 ours-7.281
Tokenar
Token ID283
Feature activation+1.84e+01

Pos logit contributions
ittal+4.243
akin+3.555
ateral+3.403
imony+2.756
atio+2.747
Neg logit contributions
 mosqu-16.281
 conflic-16.031
 traged-15.143
 toxicity-14.540
 brightest-14.534
Token)
Token ID8
Feature activation+7.49e-01

Pos logit contributions
ittal+6.136
akin+4.844
Reviewer+4.755
imony+4.743
ateral+4.738
Neg logit contributions
etting-17.225
 mosqu-16.844
coded-16.596
 conflic-16.460
population-16.260
Token up
Token ID510
Feature activation+5.01e+00

Pos logit contributions
ittal+0.643
��+0.624
Reviewer+0.611
akin+0.578
imony+0.573
Neg logit contributions
etting-0.430
coded-0.425
 ours-0.402
population-0.402
 availability-0.392
Token?"
Token ID1701
Feature activation+9.87e+00

Pos logit contributions
ittal+3.911
��+3.809
Reviewer+3.715
akin+3.500
imony+3.479
Neg logit contributions
etting-3.190
coded-3.160
population-3.025
 ours-3.005
 availability-2.948
Token the
Token ID262
Feature activation+7.01e+00

Pos logit contributions
ittal+6.864
��+6.565
Reviewer+6.340
imony+6.207
akin+6.014
Neg logit contributions
etting-6.744
coded-6.654
 ours-6.403
population-6.397
 availability-6.243
Token sheriff
Token ID16570
Feature activation+1.30e+00

Pos logit contributions
ittal+5.462
��+5.211
Reviewer+5.120
imony+4.938
akin+4.854
Neg logit contributions
etting-4.402
coded-4.350
 ours-4.152
population-4.146
spawn-4.040
Tokenlaughs
Token ID28124
Feature activation+1.51e+01

Pos logit contributions
ittal+6.279
Reviewer+6.175
��+5.953
akin+5.835
ateur+5.614
Neg logit contributions
etting-5.150
coded-5.025
 ours-4.897
 toxicity-4.838
 availability-4.835
Token]
Token ID60
Feature activation+7.99e+00

Pos logit contributions
ittal+6.473
��+6.324
Reviewer+5.632
akin+5.455
ateur+5.339
Neg logit contributions
coded-13.523
etting-13.264
population-13.090
spawn-12.470
 toxicity-12.443
Token I
Token ID314
Feature activation+9.67e+00

Pos logit contributions
ittal+5.404
��+5.186
Reviewer+5.154
akin+4.852
imony+4.792
Neg logit contributions
etting-5.705
coded-5.641
population-5.476
 ours-5.302
spawn-5.260
Token'm
Token ID1101
Feature activation+1.57e+01

Pos logit contributions
ittal+6.804
akin+6.113
��+6.072
imony+6.054
Reviewer+5.990
Neg logit contributions
etting-6.538
coded-6.406
population-6.340
 mosqu-5.976
 ours-5.975
Token getting
Token ID1972
Feature activation+1.33e+01

Pos logit contributions
ittal+9.026
akin+8.452
Reviewer+8.197
imony+7.948
��+7.876
Neg logit contributions
coded-10.883
population-10.873
etting-10.839
priority-10.174
spawn-10.156
Token so
Token ID523
Feature activation+1.82e+01

Pos logit contributions
ittal+8.714
Reviewer+8.007
imony+7.825
akin+7.773
 fir+7.700
Neg logit contributions
etting-9.075
coded-8.731
population-8.671
spawn-8.159
portion-8.065
Token used
Token ID973
Feature activation+3.07e+00

Pos logit contributions
ittal+10.945
akin+9.902
Reviewer+9.769
imony+9.417
 fir+9.413
Neg logit contributions
etting-12.152
population-11.869
coded-11.411
 mosqu-10.896
2020-10.892
Token to
Token ID284
Feature activation+2.88e+00

Pos logit contributions
ittal+2.412
Reviewer+2.303
��+2.298
akin+2.161
imony+2.144
Neg logit contributions
etting-1.966
coded-1.911
 ours-1.845
population-1.837
 availability-1.806
Token it
Token ID340
Feature activation+9.61e+00

Pos logit contributions
ittal+2.239
��+2.188
Reviewer+2.139
akin+1.996
imony+1.981
Neg logit contributions
etting-1.879
coded-1.860
population-1.774
 ours-1.772
 availability-1.732
Token,
Token ID11
Feature activation+1.07e+01

Pos logit contributions
ittal+6.164
��+5.881
Reviewer+5.866
akin+5.539
imony+5.427
Neg logit contributions
coded-6.932
etting-6.922
population-6.744
 ours-6.511
spawn-6.495
Token I
Token ID314
Feature activation+7.39e+00

Pos logit contributions
ittal+6.854
Reviewer+6.662
��+6.603
akin+6.266
imony+6.056
Neg logit contributions
etting-7.624
coded-7.561
population-7.359
spawn-7.158
 ours-7.057
Token want
Token ID765
Feature activation+4.80e+00

Pos logit contributions
ittal+2.116
��+2.021
Reviewer+1.975
akin+1.860
imony+1.855
Neg logit contributions
etting-2.143
coded-2.124
population-2.035
 ours-2.027
 availability-1.989
Token to
Token ID284
Feature activation+4.96e+00

Pos logit contributions
ittal+3.838
��+3.728
Reviewer+3.659
akin+3.436
imony+3.401
Neg logit contributions
etting-2.999
coded-2.954
population-2.813
 ours-2.800
 availability-2.729
Token proceed
Token ID5120
Feature activation+4.96e+00

Pos logit contributions
ittal+3.848
��+3.624
Reviewer+3.591
akin+3.451
imony+3.400
Neg logit contributions
etting-3.128
coded-3.116
population-3.008
 ours-2.893
spawn-2.866
Token,
Token ID11
Feature activation+4.42e+00

Pos logit contributions
ittal+3.771
��+3.594
Reviewer+3.560
akin+3.345
imony+3.321
Neg logit contributions
coded-3.252
etting-3.242
population-3.106
 ours-3.069
 availability-3.025
Token read
Token ID1100
Feature activation+1.36e+01

Pos logit contributions
ittal+3.362
��+3.249
Reviewer+3.200
akin+2.998
imony+2.966
Neg logit contributions
etting-2.909
coded-2.889
population-2.749
 ours-2.742
 availability-2.676
Token on
Token ID319
Feature activation+1.82e+01

Pos logit contributions
ittal+9.539
Reviewer+9.282
��+8.704
imony+8.625
akin+8.432
Neg logit contributions
etting-8.586
coded-8.387
population-8.027
spawn-8.003
portion-7.720
Token.
Token ID13
Feature activation+9.65e+00

Pos logit contributions
Reviewer+11.053
ittal+10.933
atio+9.806
imony+9.659
��+9.632
Neg logit contributions
etting-11.369
coded-11.158
population-10.915
spawn-10.882
 ours-10.779
Token1
Token ID16
Feature activation+4.60e-01

Pos logit contributions
ittal+6.441
Reviewer+6.267
��+6.028
imony+5.693
akin+5.674
Neg logit contributions
etting-6.830
coded-6.731
population-6.487
 ours-6.479
 availability-6.358
Token.
Token ID13
Feature activation+3.76e+00

Pos logit contributions
ittal+0.325
��+0.322
Reviewer+0.307
akin+0.284
imony+0.283
Neg logit contributions
etting-0.337
coded-0.333
 ours-0.323
population-0.317
 availability-0.316
Token5
Token ID20
Feature activation+2.45e+00

Pos logit contributions
ittal+2.855
��+2.771
Reviewer+2.724
akin+2.540
imony+2.511
Neg logit contributions
etting-2.500
coded-2.479
 ours-2.383
population-2.375
 availability-2.325
Token -
Token ID532
Feature activation+2.61e+00

Pos logit contributions
ittal+1.850
��+1.814
Reviewer+1.759
akin+1.637
imony+1.625
Neg logit contributions
etting-1.655
coded-1.641
 ours-1.579
population-1.563
 availability-1.542
Token internal
Token ID5387
Feature activation+1.63e+00

Pos logit contributions
ittal+3.208
��+3.106
Reviewer+3.050
akin+2.878
imony+2.845
Neg logit contributions
etting-2.456
coded-2.444
population-2.319
 ours-2.316
spawn-2.258
Token estimation
Token ID31850
Feature activation+3.84e+00

Pos logit contributions
ittal+1.303
��+1.279
Reviewer+1.242
akin+1.161
imony+1.153
Neg logit contributions
etting-1.030
coded-1.021
 ours-0.980
population-0.969
 availability-0.956
Token you
Token ID345
Feature activation+2.79e-01

Pos logit contributions
ittal+3.088
��+2.999
Reviewer+2.940
akin+2.757
imony+2.741
Neg logit contributions
etting-2.394
coded-2.376
 ours-2.262
population-2.253
 availability-2.207
Token could
Token ID714
Feature activation+2.54e+00

Pos logit contributions
ittal+0.214
��+0.212
Reviewer+0.204
akin+0.190
imony+0.189
Neg logit contributions
etting-0.185
coded-0.184
 ours-0.177
population-0.175
 availability-0.173
Token see
Token ID766
Feature activation+3.45e+00

Pos logit contributions
ittal+1.676
��+1.609
Reviewer+1.584
akin+1.462
imony+1.442
Neg logit contributions
etting-1.945
coded-1.926
population-1.854
 ours-1.838
 availability-1.808
Token here
Token ID994
Feature activation+1.78e+01

Pos logit contributions
ittal+3.023
��+2.942
Reviewer+2.896
akin+2.725
imony+2.716
Neg logit contributions
etting-1.888
coded-1.873
 ours-1.764
population-1.764
 availability-1.723
Token);
Token ID1776
Feature activation+4.68e+00

Pos logit contributions
ittal+9.965
Reviewer+9.238
akin+9.060
imony+8.895
��+8.845
Neg logit contributions
coded-11.989
etting-11.981
population-11.673
spawn-11.614
 mosqu-11.477
Token
Token ID198
Feature activation+5.29e-01

Pos logit contributions
ittal+3.323
��+3.247
Reviewer+3.157
akin+2.926
imony+2.902
Neg logit contributions
etting-3.341
coded-3.321
 ours-3.181
population-3.151
 availability-3.109
Token
Token ID198
Feature activation-8.87e-01

Pos logit contributions
ittal+0.385
��+0.372
Reviewer+0.365
akin+0.339
imony+0.337
Neg logit contributions
etting-0.372
coded-0.369
 ours-0.354
population-0.352
 availability-0.347
Token-
Token ID12
Feature activation+4.10e+00

Pos logit contributions
etting+0.583
coded+0.580
population+0.558
 ours+0.556
 availability+0.544
Neg logit contributions
��-0.694
ittal-0.689
Reviewer-0.644
akin-0.603
imony-0.602
Token The
Token ID383
Feature activation+3.67e+00

Pos logit contributions
ittal+3.105
��+3.022
Reviewer+2.961
akin+2.750
imony+2.738
Neg logit contributions
etting-2.736
coded-2.724
 ours-2.604
population-2.580
 availability-2.544
Token10
Token ID940
Feature activation+1.24e+01

Pos logit contributions
ittal+5.265
Reviewer+5.065
akin+4.754
��+4.635
ateur+4.413
Neg logit contributions
etting-6.728
coded-6.475
 ours-6.466
 toxicity-6.397
 brightest-6.348
Token.
Token ID13
Feature activation+1.19e+01

Pos logit contributions
ittal+7.706
Reviewer+7.278
��+7.086
ateur+6.773
akin+6.762
Neg logit contributions
coded-8.851
population-8.606
 ours-8.405
etting-8.286
 toxicity-8.182
Token2
Token ID17
Feature activation+1.01e+01

Pos logit contributions
ittal+6.071
Reviewer+5.793
akin+5.339
imony+5.228
��+5.171
Neg logit contributions
 mosqu-9.922
 conflic-9.801
etting-9.721
coded-9.597
 brightest-9.545
Token yards
Token ID5695
Feature activation+7.47e+00

Pos logit contributions
ittal+5.910
��+5.663
Reviewer+5.565
imony+5.107
akin+5.077
Neg logit contributions
coded-7.929
etting-7.724
population-7.702
 ours-7.561
spawn-7.353
Token per
Token ID583
Feature activation+7.29e+00

Pos logit contributions
ittal+6.248
��+6.009
Reviewer+5.972
akin+5.652
ateur+5.611
Neg logit contributions
coded-4.130
etting-4.125
population-3.982
 ours-3.931
spawn-3.780
Token catch
Token ID4929
Feature activation+1.77e+01

Pos logit contributions
ittal+6.276
��+5.978
Reviewer+5.882
ateur+5.548
akin+5.540
Neg logit contributions
coded-4.018
etting-4.017
population-3.828
 ours-3.813
 toxicity-3.625
Token)
Token ID8
Feature activation+1.04e+01

Pos logit contributions
ittal+7.612
ateur+6.002
ateral+5.996
Reviewer+5.928
akin+5.850
Neg logit contributions
coded-14.775
etting-14.429
population-14.369
spawn-14.072
 toxicity-13.920
Token and
Token ID290
Feature activation+1.22e+01

Pos logit contributions
ittal+6.334
��+5.821
Reviewer+5.683
akin+5.590
imony+5.328
Neg logit contributions
coded-7.921
etting-7.703
population-7.542
 ours-7.427
spawn-7.372
Token a
Token ID257
Feature activation+9.04e+00

Pos logit contributions
ittal+7.208
��+6.654
Reviewer+6.602
akin+6.541
imony+6.120
Neg logit contributions
coded-8.938
etting-8.829
population-8.610
complex-8.305
spawn-8.287
Token 1
Token ID352
Feature activation+5.64e+00

Pos logit contributions
ittal+5.986
��+5.698
Reviewer+5.596
akin+5.162
imony+5.089
Neg logit contributions
coded-6.527
etting-6.511
population-6.329
 ours-6.120
spawn-6.027
Token:
Token ID25
Feature activation+2.60e+00

Pos logit contributions
ittal+4.286
��+4.193
Reviewer+4.061
akin+3.802
imony+3.756
Neg logit contributions
coded-3.712
etting-3.690
 ours-3.546
population-3.519
 availability-3.408
Token public
Token ID1171
Feature activation+4.33e+00

Pos logit contributions
ittal+5.714
��+5.507
Reviewer+5.289
akin+5.029
imony+4.965
Neg logit contributions
coded-4.982
etting-4.942
population-4.730
 ours-4.645
spawn-4.582
Token interest
Token ID1393
Feature activation+2.76e+00

Pos logit contributions
ittal+3.384
��+3.294
Reviewer+3.207
akin+3.003
imony+2.978
Neg logit contributions
etting-2.809
coded-2.790
 ours-2.667
population-2.650
 availability-2.578
Token litigation
Token ID19284
Feature activation+1.19e+01

Pos logit contributions
ittal+2.114
��+2.064
Reviewer+2.005
akin+1.874
imony+1.860
Neg logit contributions
etting-1.833
coded-1.820
 ours-1.743
population-1.734
 availability-1.698
Token (
Token ID357
Feature activation+5.37e+00

Pos logit contributions
ittal+7.909
��+7.242
Reviewer+7.038
akin+6.889
imony+6.835
Neg logit contributions
coded-8.464
etting-8.262
population-8.156
spawn-7.769
 ours-7.609
TokenP
Token ID47
Feature activation+6.15e+00

Pos logit contributions
ittal+3.952
��+3.788
Reviewer+3.779
akin+3.556
imony+3.400
Neg logit contributions
etting-3.623
coded-3.586
 ours-3.475
 availability-3.378
population-3.358
TokenIL
Token ID4146
Feature activation+1.77e+01

Pos logit contributions
ittal+5.542
��+5.274
Reviewer+5.174
akin+5.039
imony+4.945
Neg logit contributions
coded-3.227
etting-3.130
 ours-3.056
population-3.010
 availability-2.935
Token)
Token ID8
Feature activation+6.59e+00

Pos logit contributions
ittal+8.599
��+7.618
Reviewer+7.280
akin+7.186
imony+7.097
Neg logit contributions
coded-14.042
etting-13.845
population-13.806
spawn-13.518
 brightest-12.997
Token in
Token ID287
Feature activation+6.42e+00

Pos logit contributions
ittal+4.698
��+4.508
Reviewer+4.350
akin+4.145
imony+4.087
Neg logit contributions
coded-4.657
etting-4.623
population-4.463
 ours-4.389
 availability-4.306
Token the
Token ID262
Feature activation+6.23e+00

Pos logit contributions
ittal+4.898
��+4.737
Reviewer+4.511
akin+4.335
imony+4.257
Neg logit contributions
coded-4.252
etting-4.184
population-3.986
 ours-3.950
spawn-3.908
Token high
Token ID1029
Feature activation-7.87e-01

Pos logit contributions
ittal+4.870
��+4.712
Reviewer+4.513
akin+4.306
imony+4.239
Neg logit contributions
coded-3.974
etting-3.947
 ours-3.738
population-3.729
spawn-3.646
Token court
Token ID2184
Feature activation+5.25e+00

Pos logit contributions
etting+0.663
coded+0.658
 availability+0.639
 ours+0.638
population+0.635
Neg logit contributions
��-0.465
ittal-0.463
Reviewer-0.450
akin-0.399
itle-0.391
Token The
Token ID383
Feature activation+7.25e+00

Pos logit contributions
ittal+4.226
��+4.182
Reviewer+4.018
akin+3.660
imony+3.610
Neg logit contributions
etting-5.393
coded-5.344
 ours-5.161
population-5.132
 availability-5.074
Token epic
Token ID12191
Feature activation+5.40e+00

Pos logit contributions
ittal+4.778
��+4.603
Reviewer+4.467
akin+4.295
imony+4.072
Neg logit contributions
etting-5.368
coded-5.297
 ours-5.085
population-5.064
 availability-5.011
Token ghost
Token ID10905
Feature activation-1.25e+00

Pos logit contributions
ittal+4.128
��+3.976
Reviewer+3.908
akin+3.690
imony+3.610
Neg logit contributions
etting-3.488
coded-3.469
 ours-3.344
population-3.295
 availability-3.286
Token invasion
Token ID11796
Feature activation+4.95e+00

Pos logit contributions
etting+0.477
coded+0.476
 ours+0.450
population+0.442
 availability+0.434
Neg logit contributions
ittal-1.305
��-1.303
Reviewer-1.261
imony-1.193
akin-1.189
Token of
Token ID286
Feature activation+1.51e+01

Pos logit contributions
ittal+3.422
Reviewer+3.278
akin+3.089
��+3.082
imony+3.053
Neg logit contributions
etting-3.514
coded-3.462
 mosqu-3.452
 ours-3.332
 availability-3.288
Token the
Token ID262
Feature activation+1.75e+01

Pos logit contributions
ittal+6.306
Reviewer+5.972
akin+5.634
imony+5.523
atio+5.454
Neg logit contributions
 mosqu-13.207
etting-12.817
 conflic-12.695
coded-12.506
 traged-12.395
Token big
Token ID1263
Feature activation+1.35e+01

Pos logit contributions
ittal+9.058
Reviewer+8.798
��+8.398
akin+8.035
itle+7.620
Neg logit contributions
etting-12.935
coded-12.807
 wisely-12.351
 availability-12.275
 ours-12.229
Token Apple
Token ID4196
Feature activation+6.70e+00

Pos logit contributions
��+8.079
ittal+7.975
Reviewer+7.899
akin+7.381
imony+7.090
Neg logit contributions
etting-9.424
coded-9.308
 ours-8.917
population-8.897
 availability-8.814
Token.
Token ID13
Feature activation+1.76e+00

Pos logit contributions
ittal+4.380
Reviewer+4.155
imony+3.894
akin+3.871
��+3.853
Neg logit contributions
etting-4.964
coded-4.898
 mosqu-4.844
 conflic-4.686
 ours-4.617
Token
Token ID198
Feature activation+1.66e+00

Pos logit contributions
ittal+1.311
��+1.283
Reviewer+1.245
akin+1.157
imony+1.150
Neg logit contributions
etting-1.209
coded-1.199
 ours-1.154
population-1.144
 availability-1.127
Token
Token ID198
Feature activation+1.95e+00

Pos logit contributions
ittal+1.159
Reviewer+1.098
��+1.093
akin+1.022
imony+1.015
Neg logit contributions
etting-1.216
coded-1.195
 ours-1.147
population-1.141
 availability-1.133
Token have
Token ID423
Feature activation+2.73e+00

Pos logit contributions
ittal+2.905
��+2.832
Reviewer+2.762
akin+2.587
imony+2.579
Neg logit contributions
etting-2.395
coded-2.392
population-2.274
 ours-2.268
 availability-2.221
Token been
Token ID587
Feature activation+5.81e+00

Pos logit contributions
ittal+2.118
��+2.067
Reviewer+2.012
akin+1.878
imony+1.868
Neg logit contributions
etting-1.787
coded-1.763
 ours-1.690
population-1.685
 availability-1.652
Token the
Token ID262
Feature activation+6.29e+00

Pos logit contributions
ittal+4.535
Reviewer+4.407
��+4.382
akin+4.144
ateur+4.074
Neg logit contributions
etting-3.590
coded-3.559
population-3.408
 ours-3.337
 availability-3.336
Token basis
Token ID4308
Feature activation-2.41e-01

Pos logit contributions
ittal+4.497
Reviewer+4.374
��+4.318
akin+4.046
ateral+3.983
Neg logit contributions
coded-4.292
etting-4.272
population-4.115
 ours-4.074
 availability-3.968
Token of
Token ID286
Feature activation+6.08e+00

Pos logit contributions
etting+0.143
coded+0.142
 ours+0.136
population+0.134
 availability+0.132
Neg logit contributions
ittal-0.203
��-0.200
Reviewer-0.192
akin-0.181
imony-0.178
Token American
Token ID1605
Feature activation+1.75e+01

Pos logit contributions
ittal+4.590
��+4.425
Reviewer+4.379
akin+4.124
imony+4.088
Neg logit contributions
etting-3.967
coded-3.940
population-3.713
spawn-3.656
 ours-3.654
Token foreign
Token ID3215
Feature activation+7.48e+00

Pos logit contributions
ittal+10.620
Reviewer+10.336
imony+9.826
ateral+9.629
atio+9.599
Neg logit contributions
etting-11.531
coded-11.248
spawn-10.678
population-10.506
portion-10.277
Token policy
Token ID2450
Feature activation+7.36e+00

Pos logit contributions
ittal+5.492
Reviewer+5.231
��+5.145
akin+4.884
imony+4.869
Neg logit contributions
etting-5.018
coded-4.904
population-4.707
 ours-4.701
spawn-4.576
Token for
Token ID329
Feature activation+7.77e+00

Pos logit contributions
ittal+5.005
��+4.849
Reviewer+4.810
imony+4.441
akin+4.426
Neg logit contributions
etting-5.316
coded-5.212
population-4.993
 ours-4.991
spawn-4.863
Token a
Token ID257
Feature activation+6.95e+00

Pos logit contributions
ittal+5.335
��+5.151
Reviewer+5.092
akin+4.763
imony+4.722
Neg logit contributions
etting-5.548
coded-5.434
population-5.127
 availability-5.080
spawn-5.068
Token very
Token ID845
Feature activation+8.74e+00

Pos logit contributions
ittal+4.731
Reviewer+4.625
��+4.591
akin+4.219
imony+4.150
Neg logit contributions
etting-4.963
coded-4.896
 ours-4.695
population-4.641
 availability-4.636
Token for
Token ID329
Feature activation+3.46e+00

Pos logit contributions
coded+0.645
etting+0.644
 ours+0.622
 availability+0.614
population+0.607
Neg logit contributions
ittal-0.845
��-0.842
Reviewer-0.802
akin-0.750
imony-0.746
Token approximately
Token ID6702
Feature activation+4.07e+00

Pos logit contributions
ittal+2.568
��+2.478
Reviewer+2.425
akin+2.267
imony+2.247
Neg logit contributions
etting-2.389
coded-2.348
population-2.245
 ours-2.236
spawn-2.179
Token 1
Token ID352
Feature activation+3.44e+00

Pos logit contributions
ittal+3.143
��+3.020
Reviewer+2.978
akin+2.807
ateur+2.765
Neg logit contributions
etting-2.655
coded-2.600
population-2.469
 ours-2.458
spawn-2.402
Token-
Token ID12
Feature activation+3.13e+00

Pos logit contributions
ittal+2.470
��+2.403
Reviewer+2.328
akin+2.173
imony+2.142
Neg logit contributions
etting-2.443
coded-2.421
 ours-2.333
population-2.315
 availability-2.277
Tokenhour
Token ID9769
Feature activation+7.73e+00

Pos logit contributions
ittal+2.527
��+2.460
Reviewer+2.401
akin+2.262
imony+2.233
Neg logit contributions
etting-1.954
coded-1.929
 ours-1.853
population-1.828
 availability-1.811
Token tour
Token ID4205
Feature activation+1.75e+01

Pos logit contributions
ittal+5.546
��+5.237
Reviewer+5.223
akin+4.951
imony+4.826
Neg logit contributions
etting-5.273
coded-5.155
population-5.023
 ours-4.901
spawn-4.852
Token).
Token ID737
Feature activation+9.13e-01

Pos logit contributions
ittal+11.404
imony+9.995
akin+9.812
Reviewer+9.804
��+9.776
Neg logit contributions
etting-10.792
population-10.721
coded-10.563
 ours-10.224
 brightest-10.185
Token Originally
Token ID19486
Feature activation-7.28e+00

Pos logit contributions
ittal+0.693
��+0.685
Reviewer+0.658
akin+0.611
imony+0.607
Neg logit contributions
etting-0.614
coded-0.610
 ours-0.589
population-0.583
 availability-0.577
Token located
Token ID5140
Feature activation-4.02e+00

Pos logit contributions
 ours+3.969
coded+3.926
etting+3.842
 availability+3.817
population+3.729
Neg logit contributions
��-6.438
ittal-6.221
Reviewer-5.925
iband-5.534
imony-5.528
Token in
Token ID287
Feature activation-1.19e-01

Pos logit contributions
coded+2.251
etting+2.248
 ours+2.195
population+2.187
 availability+2.120
Neg logit contributions
��-3.442
ittal-3.420
Reviewer-3.321
imony-3.033
iband-3.028
Token Illinois
Token ID9486
Feature activation-4.09e+00

Pos logit contributions
etting+0.074
coded+0.073
 ours+0.071
population+0.070
 availability+0.069
Neg logit contributions
ittal-0.097
��-0.095
Reviewer-0.093
akin-0.086
imony-0.085
TokenST
Token ID2257
Feature activation+7.23e+00

Pos logit contributions
ittal+1.811
Reviewer+1.715
��+1.659
akin+1.551
imony+1.527
Neg logit contributions
etting-2.416
coded-2.415
 ours-2.337
population-2.296
 toxicity-2.286
Token3
Token ID18
Feature activation+2.36e+00

Pos logit contributions
ittal+3.277
Reviewer+3.043
akin+2.812
imony+2.804
��+2.801
Neg logit contributions
 mosqu-6.685
etting-6.639
coded-6.599
 conflic-6.551
 brightest-6.458
TokenK
Token ID42
Feature activation+6.14e+00

Pos logit contributions
ittal+1.681
Reviewer+1.621
��+1.577
akin+1.491
imony+1.468
Neg logit contributions
etting-1.680
coded-1.653
 ours-1.587
population-1.567
 availability-1.561
Token,
Token ID11
Feature activation+6.74e+00

Pos logit contributions
ittal+3.933
Reviewer+3.869
��+3.589
akin+3.474
imony+3.425
Neg logit contributions
etting-4.722
coded-4.641
 ours-4.417
 mosqu-4.416
population-4.379
Token I
Token ID314
Feature activation+9.02e+00

Pos logit contributions
ittal+4.384
��+4.331
Reviewer+4.257
akin+3.895
imony+3.820
Neg logit contributions
etting-5.129
coded-5.044
population-4.836
 ours-4.831
 availability-4.718
Token think
Token ID892
Feature activation+1.74e+01

Pos logit contributions
ittal+4.271
akin+3.899
Reviewer+3.844
��+3.665
imony+3.608
Neg logit contributions
etting-8.211
coded-7.976
 mosqu-7.696
 ours-7.688
population-7.687
Token\n
Token ID198
Feature activation+1.48e+00

Pos logit contributions
Reviewer+9.776
��+9.360
ittal+9.229
akin+8.252
imony+8.104
Neg logit contributions
etting-12.844
coded-12.569
population-11.955
spawn-11.701
 ours-11.685
Token\n
Token ID198
Feature activation+8.19e+00

Pos logit contributions
ittal+1.048
Reviewer+0.998
��+0.997
akin+0.924
imony+0.917
Neg logit contributions
etting-1.064
coded-1.051
 ours-1.007
population-1.003
 availability-0.992
TokenReviewer
Token ID35407
Feature activation+4.10e+00

Pos logit contributions
Reviewer+0.087
ittal-0.201
��-0.538
akin-0.767
imony-0.929
Neg logit contributions
etting-11.656
coded-11.605
 ours-11.451
 availability-11.336
population-11.282
Token:
Token ID25
Feature activation+2.60e+00

Pos logit contributions
ittal+3.118
Reviewer+3.042
��+2.852
imony+2.787
akin+2.787
Neg logit contributions
etting-2.679
coded-2.617
 mosqu-2.611
 conflic-2.527
population-2.486
Token B
Token ID347
Feature activation-4.59e-01

Pos logit contributions
ittal+1.992
��+1.951
Reviewer+1.897
akin+1.768
imony+1.755
Neg logit contributions
etting-1.721
coded-1.707
 ours-1.638
population-1.625
 availability-1.599
Token.
Token ID13
Feature activation+8.65e-02

Pos logit contributions
etting+1.094
coded+1.073
 ours+1.040
 availability+1.027
population+1.018
Neg logit contributions
ittal-1.125
��-1.124
Reviewer-1.076
akin-0.994
imony-0.992
Token for
Token ID329
Feature activation+4.96e+00

Pos logit contributions
ittal+0.070
��+0.070
Reviewer+0.067
akin+0.063
imony+0.063
Neg logit contributions
etting-0.053
coded-0.053
 ours-0.051
 availability-0.050
population-0.050
Token 2
Token ID362
Feature activation+5.40e+00

Pos logit contributions
ittal+3.565
��+3.435
Reviewer+3.356
akin+3.136
imony+3.110
Neg logit contributions
etting-3.493
coded-3.459
population-3.306
 ours-3.282
spawn-3.223
Token-
Token ID12
Feature activation+5.92e+00

Pos logit contributions
ittal+3.564
��+3.432
Reviewer+3.356
akin+3.116
ateur+3.055
Neg logit contributions
etting-4.101
coded-4.075
 ours-3.929
population-3.905
 availability-3.812
Tokenhour
Token ID9769
Feature activation+1.32e+01

Pos logit contributions
ittal+5.080
��+4.882
Reviewer+4.852
akin+4.631
ateur+4.533
Neg logit contributions
etting-3.319
coded-3.210
 ours-3.091
population-3.031
 availability-3.017
Token tour
Token ID4205
Feature activation+1.73e+01

Pos logit contributions
ittal+7.644
Reviewer+7.226
��+7.013
akin+6.719
ateur+6.694
Neg logit contributions
etting-9.775
coded-9.661
population-9.489
spawn-9.250
 ours-9.060
Token;
Token ID26
Feature activation+4.10e+00

Pos logit contributions
ittal+11.386
Reviewer+10.078
imony+10.069
akin+9.831
��+9.793
Neg logit contributions
etting-10.652
coded-10.444
population-10.421
 ours-9.994
spawn-9.889
Token $
Token ID720
Feature activation+8.95e-01

Pos logit contributions
ittal+3.516
��+3.424
Reviewer+3.364
akin+3.167
imony+3.135
Neg logit contributions
etting-2.331
coded-2.306
 ours-2.183
population-2.178
 availability-2.105
Token30
Token ID1270
Feature activation-1.94e+00

Pos logit contributions
ittal+0.678
��+0.661
Reviewer+0.646
akin+0.601
imony+0.596
Neg logit contributions
etting-0.603
coded-0.597
 ours-0.574
population-0.568
 availability-0.561
Token for
Token ID329
Feature activation+3.73e+00

Pos logit contributions
coded+1.063
etting+1.055
 ours+1.031
 availability+1.026
population+0.985
Neg logit contributions
��-1.723
ittal-1.709
Reviewer-1.641
imony-1.540
akin-1.537
Token shorter
Token ID12238
Feature activation+7.11e+00

Pos logit contributions
ittal+2.695
��+2.610
Reviewer+2.547
akin+2.373
imony+2.345
Neg logit contributions
etting-2.632
coded-2.603
population-2.490
 ours-2.486
spawn-2.420
Token\n
Token ID198
Feature activation+6.85e+00

Pos logit contributions
ittal+4.166
Reviewer+4.054
imony+3.697
akin+3.682
atio+3.528
Neg logit contributions
 mosqu-6.087
etting-5.963
 conflic-5.895
coded-5.704
 brightest-5.594
TokenEnvironment
Token ID31441
Feature activation-3.38e+00

Pos logit contributions
ittal+5.035
Reviewer+4.939
��+4.669
akin+4.534
imony+4.435
Neg logit contributions
etting-4.597
coded-4.481
 ours-4.395
 availability-4.272
 brightest-4.244
Token activists
Token ID7941
Feature activation+8.70e+00

Pos logit contributions
etting+2.011
coded+1.988
 ours+1.942
 availability+1.896
population+1.878
Neg logit contributions
��-2.798
ittal-2.765
Reviewer-2.664
imony-2.470
akin-2.456
Token have
Token ID423
Feature activation+5.44e+00

Pos logit contributions
ittal+6.034
��+5.539
Reviewer+5.539
akin+5.359
imony+5.280
Neg logit contributions
coded-5.991
etting-5.984
population-5.691
 ours-5.539
spawn-5.453
Token filed
Token ID5717
Feature activation+4.10e+00

Pos logit contributions
ittal+4.056
��+3.866
Reviewer+3.776
akin+3.614
imony+3.561
Neg logit contributions
etting-3.624
coded-3.530
population-3.420
 ours-3.297
 availability-3.262
Token complaint
Token ID8224
Feature activation+1.70e+01

Pos logit contributions
ittal+2.953
��+2.858
Reviewer+2.775
akin+2.604
imony+2.577
Neg logit contributions
etting-2.883
coded-2.868
population-2.732
 ours-2.722
 availability-2.666
Token pertaining
Token ID27113
Feature activation+8.68e+00

Pos logit contributions
ittal+9.968
Reviewer+9.299
��+9.287
akin+9.015
imony+8.685
Neg logit contributions
coded-11.705
etting-11.393
population-11.175
spawn-10.864
 ours-10.612
Token filling
Token ID12591
Feature activation-2.93e+00

Pos logit contributions
ittal+6.225
Reviewer+5.740
��+5.682
akin+5.480
imony+5.312
Neg logit contributions
coded-5.855
etting-5.833
population-5.526
 ours-5.443
 mosqu-5.423
Token of
Token ID286
Feature activation+1.20e+00

Pos logit contributions
etting+1.817
coded+1.769
 ours+1.751
 availability+1.725
population+1.720
Neg logit contributions
��-2.383
ittal-2.352
Reviewer-2.277
akin-2.110
imony-2.090
Token wetlands
Token ID49297
Feature activation-3.01e+00

Pos logit contributions
ittal+0.992
��+0.975
Reviewer+0.949
akin+0.888
imony+0.882
Neg logit contributions
etting-0.731
coded-0.724
 ours-0.694
population-0.686
 availability-0.676
Token with
Token ID351
Feature activation+9.26e-01

Pos logit contributions
etting+1.873
coded+1.841
 ours+1.811
 availability+1.771
population+1.766
Neg logit contributions
��-2.441
ittal-2.417
Reviewer-2.346
akin-2.154
ateur-2.143
Token of
Token ID286
Feature activation+6.53e+00

Pos logit contributions
etting+1.090
coded+1.073
 ours+1.043
 availability+1.009
population+0.994
Neg logit contributions
ittal-1.526
��-1.488
Reviewer-1.444
akin-1.355
imony-1.353
Token clutch
Token ID22756
Feature activation+4.05e+00

Pos logit contributions
ittal+4.546
��+4.371
Reviewer+4.262
akin+4.023
imony+3.983
Neg logit contributions
etting-4.654
coded-4.563
population-4.421
 ours-4.296
spawn-4.286
Token-
Token ID12
Feature activation+3.85e+00

Pos logit contributions
ittal+2.869
��+2.763
Reviewer+2.719
akin+2.512
imony+2.492
Neg logit contributions
etting-2.887
coded-2.869
 ours-2.755
population-2.745
 availability-2.659
Tokendriven
Token ID15808
Feature activation+1.10e+01

Pos logit contributions
ittal+3.191
��+3.065
Reviewer+3.049
akin+2.890
ateur+2.823
Neg logit contributions
etting-2.278
coded-2.202
 ours-2.159
population-2.105
 availability-2.093
Token,
Token ID11
Feature activation+8.31e+00

Pos logit contributions
ittal+7.437
Reviewer+7.004
��+6.870
akin+6.623
ateral+6.568
Neg logit contributions
etting-7.334
coded-7.259
population-7.099
 ours-6.751
spawn-6.736
Token on
Token ID319
Feature activation+1.70e+01

Pos logit contributions
ittal+6.089
Reviewer+5.758
��+5.717
akin+5.393
ateur+5.306
Neg logit contributions
etting-5.540
coded-5.334
population-5.135
 ours-5.014
spawn-4.950
Token-
Token ID12
Feature activation+1.46e+01

Pos logit contributions
ittal+8.870
Reviewer+8.708
ateral+7.707
��+7.651
akin+7.395
Neg logit contributions
etting-12.877
coded-12.361
population-12.167
 ours-12.015
 mosqu-11.937
Tokendemand
Token ID28550
Feature activation+9.66e+00

Pos logit contributions
ittal+10.018
Reviewer+9.882
akin+9.077
ateral+8.980
itle+8.750
Neg logit contributions
etting-9.064
 mosqu-8.588
 toxicity-8.320
 brightest-8.278
 ours-8.251
Token all
Token ID477
Feature activation+1.29e+01

Pos logit contributions
ittal+6.794
Reviewer+6.521
��+6.317
akin+6.127
ateral+6.055
Neg logit contributions
etting-6.329
coded-6.284
population-6.097
 ours-5.842
spawn-5.808
Token-
Token ID12
Feature activation+9.69e+00

Pos logit contributions
Reviewer+7.240
ittal+7.185
��+6.587
imony+6.483
ateral+6.388
Neg logit contributions
etting-9.755
coded-9.619
population-9.446
 ours-9.174
 brightest-9.099
Tokenwheel
Token ID22001
Feature activation+6.61e+00

Pos logit contributions
ittal+6.357
Reviewer+6.340
akin+5.897
��+5.826
ateur+5.696
Neg logit contributions
etting-6.796
coded-6.448
 ours-6.371
 brightest-6.281
 availability-6.235
Token become
Token ID1716
Feature activation-2.01e-01

Pos logit contributions
etting+0.377
coded+0.371
 ours+0.358
population+0.350
 availability+0.350
Neg logit contributions
ittal-0.464
��-0.460
Reviewer-0.447
akin-0.413
imony-0.410
Token slightly
Token ID4622
Feature activation-1.64e+00

Pos logit contributions
etting+0.153
coded+0.152
 ours+0.147
population+0.145
 availability+0.144
Neg logit contributions
ittal-0.136
��-0.133
Reviewer-0.128
akin-0.118
imony-0.117
Token less
Token ID1342
Feature activation-4.70e+00

Pos logit contributions
etting+1.163
coded+1.158
 ours+1.126
population+1.106
 availability+1.097
Neg logit contributions
ittal-1.189
��-1.181
Reviewer-1.129
akin-1.038
imony-1.034
Token depressing
Token ID31193
Feature activation+4.38e+00

Pos logit contributions
etting+2.630
coded+2.627
 ours+2.549
 availability+2.478
population+2.430
Neg logit contributions
��-4.056
ittal-4.042
Reviewer-3.961
akin-3.654
ateur-3.621
Token when
Token ID618
Feature activation+1.53e+01

Pos logit contributions
ittal+3.280
��+3.128
Reviewer+3.041
imony+2.872
akin+2.858
Neg logit contributions
etting-2.941
coded-2.892
population-2.788
 ours-2.760
 availability-2.695
Token you
Token ID345
Feature activation+1.69e+01

Pos logit contributions
ittal+9.468
��+9.269
Reviewer+9.023
akin+8.422
imony+8.367
Neg logit contributions
etting-10.189
coded-9.909
spawn-9.547
population-9.443
 ours-9.389
Token remember
Token ID3505
Feature activation+8.81e+00

Pos logit contributions
ittal+9.205
 notice+8.274
itle+8.128
 fir+8.072
atio+8.067
Neg logit contributions
coded-11.937
etting-11.863
population-11.457
 mosqu-11.431
spawn-11.213
Token how
Token ID703
Feature activation+1.32e+00

Pos logit contributions
ittal+5.366
��+5.209
Reviewer+5.123
imony+4.780
akin+4.691
Neg logit contributions
etting-6.722
coded-6.630
population-6.343
 ours-6.318
spawn-6.286
Token much
Token ID881
Feature activation-4.54e-01

Pos logit contributions
ittal+1.044
��+1.023
Reviewer+0.995
akin+0.929
imony+0.923
Neg logit contributions
etting-0.844
coded-0.837
 ours-0.803
population-0.795
 availability-0.783
Token more
Token ID517
Feature activation-3.63e+00

Pos logit contributions
etting+0.299
coded+0.297
 ours+0.288
 availability+0.282
population+0.281
Neg logit contributions
ittal-0.353
��-0.350
Reviewer-0.336
akin-0.313
imony-0.309
Token depressing
Token ID31193
Feature activation-1.43e+00

Pos logit contributions
etting+2.105
coded+2.101
 ours+2.046
 availability+2.016
population+1.954
Neg logit contributions
��-3.079
ittal-3.078
Reviewer-2.966
akin-2.777
iband-2.760
Token),
Token ID828
Feature activation+3.71e+00

Pos logit contributions
ittal+9.515
��+9.036
akin+8.895
Reviewer+8.548
imony+8.441
Neg logit contributions
coded-10.459
etting-10.259
 ours-9.951
population-9.895
 brightest-9.761
Token white
Token ID2330
Feature activation-3.97e+00

Pos logit contributions
ittal+2.864
��+2.795
Reviewer+2.710
akin+2.544
imony+2.524
Neg logit contributions
etting-2.435
coded-2.413
 ours-2.312
population-2.298
 availability-2.254
Token (
Token ID357
Feature activation+8.84e+00

Pos logit contributions
etting+2.395
coded+2.343
 ours+2.283
 availability+2.227
spawn+2.187
Neg logit contributions
ittal-3.245
��-3.230
Reviewer-3.201
akin-2.919
ateral-2.904
Tokench
Token ID354
Feature activation+1.10e+01

Pos logit contributions
ittal+7.058
Reviewer+6.684
��+6.683
akin+6.512
ateur+6.278
Neg logit contributions
coded-5.041
etting-5.023
 ours-4.934
 availability-4.873
 toxicity-4.771
Tokenag
Token ID363
Feature activation+1.37e+01

Pos logit contributions
ittal+8.494
��+7.771
akin+7.711
Reviewer+7.571
ateur+7.467
Neg logit contributions
coded-6.614
etting-6.458
 ours-6.370
 availability-6.252
population-6.170
Tokenha
Token ID3099
Feature activation+1.67e+01

Pos logit contributions
ittal+10.082
��+9.460
akin+9.283
Reviewer+9.046
ateur+8.994
Neg logit contributions
coded-8.072
etting-7.794
 ours-7.617
 brightest-7.605
 toxicity-7.577
Token
Token ID447
Feature activation+3.31e+00

Pos logit contributions
ittal+9.577
��+9.029
akin+8.921
Reviewer+8.694
iband+8.669
Neg logit contributions
coded-11.390
etting-11.131
population-10.932
 ours-10.788
 brightest-10.659
Token
Token ID247
Feature activation+7.19e+00

Pos logit contributions
ittal+2.297
��+2.254
Reviewer+2.171
akin+2.012
imony+1.990
Neg logit contributions
etting-2.445
coded-2.428
 ours-2.342
population-2.321
 availability-2.292
Tokenan
Token ID272
Feature activation+1.66e+01

Pos logit contributions
ittal+4.838
��+4.654
Reviewer+4.491
akin+4.299
imony+4.199
Neg logit contributions
coded-5.198
etting-5.185
 ours-5.035
population-4.958
 availability-4.920
Token),
Token ID828
Feature activation+2.49e+00

Pos logit contributions
ittal+10.008
��+9.599
akin+9.346
ateur+9.016
Reviewer+8.983
Neg logit contributions
coded-10.726
etting-10.680
 ours-10.491
population-10.397
 brightest-10.234
Token good
Token ID922
Feature activation-2.16e+00

Pos logit contributions
ittal+1.910
��+1.871
Reviewer+1.816
akin+1.693
imony+1.681
Neg logit contributions
etting-1.655
coded-1.641
 ours-1.578
population-1.562
 availability-1.540
Token the
Token ID262
Feature activation+5.02e+00

Pos logit contributions
ittal+0.236
��+0.234
Reviewer+0.224
akin+0.209
imony+0.207
Neg logit contributions
etting-0.221
coded-0.219
 ours-0.213
 availability-0.210
population-0.209
Token Motor
Token ID12533
Feature activation-1.43e+00

Pos logit contributions
ittal+3.460
��+3.330
Reviewer+3.272
akin+3.017
imony+2.985
Neg logit contributions
etting-3.698
coded-3.654
 ours-3.510
population-3.505
spawn-3.417
Token Vehicle
Token ID21501
Feature activation+2.15e+00

Pos logit contributions
etting+1.431
coded+1.428
 ours+1.381
 availability+1.379
population+1.372
Neg logit contributions
��-0.606
ittal-0.600
Reviewer-0.567
akin-0.494
imony-0.486
Token (
Token ID357
Feature activation+5.25e+00

Pos logit contributions
ittal+1.602
��+1.574
Reviewer+1.524
akin+1.416
imony+1.406
Neg logit contributions
etting-1.480
coded-1.464
 ours-1.411
population-1.395
 availability-1.385
TokenAm
Token ID5840
Feature activation+9.49e+00

Pos logit contributions
ittal+3.380
��+3.232
Reviewer+3.226
akin+2.957
imony+2.894
Neg logit contributions
etting-4.097
coded-4.019
 ours-3.922
population-3.833
 availability-3.811
Tokenendment
Token ID5904
Feature activation+1.67e+01

Pos logit contributions
ittal+3.608
ateur+2.916
Reviewer+2.848
��+2.747
imony+2.709
Neg logit contributions
coded-9.747
etting-9.632
 ours-9.514
 availability-9.294
population-9.278
Token)
Token ID8
Feature activation+4.12e+00

Pos logit contributions
��+6.835
ittal+6.778
Reviewer+6.067
atio+5.615
akin+5.581
Neg logit contributions
coded-14.489
etting-14.217
spawn-14.071
population-14.061
 ours-13.606
Token Bill
Token ID3941
Feature activation+1.60e+00

Pos logit contributions
ittal+2.258
��+2.194
Reviewer+2.104
akin+1.902
imony+1.882
Neg logit contributions
etting-3.622
coded-3.604
 ours-3.496
population-3.474
 availability-3.430
Token 2016
Token ID1584
Feature activation+3.71e+00

Pos logit contributions
ittal+1.609
��+1.586
Reviewer+1.551
akin+1.471
imony+1.463
Neg logit contributions
etting-0.677
coded-0.665
 ours-0.627
population-0.615
 availability-0.603
Token for
Token ID329
Feature activation-1.35e+00

Pos logit contributions
ittal+2.863
��+2.786
Reviewer+2.719
akin+2.551
imony+2.522
Neg logit contributions
etting-2.428
coded-2.414
 ours-2.314
population-2.296
 availability-2.249
Token consideration
Token ID9110
Feature activation+7.41e+00

Pos logit contributions
etting+0.876
coded+0.858
 availability+0.844
 ours+0.843
population+0.830
Neg logit contributions
��-1.046
ittal-1.044
Reviewer-1.007
akin-0.934
imony-0.932
Token66
Token ID2791
Feature activation-2.52e+00

Pos logit contributions
ittal+3.202
Reviewer+3.039
��+2.887
akin+2.874
ateur+2.795
Neg logit contributions
etting-4.035
coded-4.012
 mosqu-3.861
 ours-3.835
population-3.786
Token power
Token ID1176
Feature activation-5.77e+00

Pos logit contributions
etting+1.738
coded+1.702
 ours+1.661
 availability+1.634
population+1.594
Neg logit contributions
ittal-1.886
��-1.863
Reviewer-1.786
akin-1.655
imony-1.647
Token off
Token ID572
Feature activation-4.04e+00

Pos logit contributions
etting+3.311
coded+3.229
 availability+3.128
 ours+3.124
population+3.047
Neg logit contributions
��-4.934
ittal-4.923
Reviewer-4.706
imony-4.372
akin-4.355
Token\n
Token ID198
Feature activation+3.53e+00

Pos logit contributions
etting+2.617
coded+2.550
 ours+2.483
 availability+2.463
population+2.408
Neg logit contributions
ittal-3.184
��-3.163
Reviewer-3.042
akin-2.815
imony-2.794
Token\n
Token ID198
Feature activation+6.55e+00

Pos logit contributions
ittal+2.301
Reviewer+2.171
��+2.070
akin+2.028
imony+2.015
Neg logit contributions
etting-2.700
coded-2.639
 mosqu-2.613
 conflic-2.534
population-2.531
TokenNotice
Token ID26396
Feature activation+1.66e+01

Pos logit contributions
ittal+4.682
Reviewer+4.606
��+4.383
akin+4.294
imony+4.151
Neg logit contributions
coded-4.342
etting-4.341
 ours-4.266
 availability-4.175
 toxicity-4.168
Token how
Token ID703
Feature activation+8.12e+00

Pos logit contributions
ittal+10.183
Reviewer+9.781
��+9.299
imony+9.029
akin+9.013
Neg logit contributions
etting-10.975
population-10.679
coded-10.636
spawn-10.162
castle-9.903
Token we
Token ID356
Feature activation+3.27e+00

Pos logit contributions
ittal+5.843
Reviewer+5.536
��+5.478
akin+5.244
imony+5.110
Neg logit contributions
etting-5.382
coded-5.288
population-5.200
spawn-4.965
 ours-4.898
Token used
Token ID973
Feature activation+2.47e+00

Pos logit contributions
ittal+2.453
��+2.363
Reviewer+2.303
akin+2.196
imony+2.148
Neg logit contributions
etting-2.178
coded-2.142
population-2.084
 ours-2.071
 availability-2.029
Token the
Token ID262
Feature activation+6.11e-01

Pos logit contributions
ittal+1.804
��+1.764
Reviewer+1.712
akin+1.591
imony+1.577
Neg logit contributions
etting-1.723
coded-1.709
 ours-1.645
population-1.633
 availability-1.608
Token full
Token ID1336
Feature activation-2.48e+00

Pos logit contributions
ittal+0.452
��+0.445
Reviewer+0.430
akin+0.398
imony+0.397
Neg logit contributions
etting-0.425
coded-0.421
 ours-0.407
population-0.400
 availability-0.399
Token 2
Token ID362
Feature activation+1.55e+00

Pos logit contributions
ittal+1.112
��+1.088
Reviewer+1.058
akin+0.986
imony+0.981
Neg logit contributions
etting-0.946
coded-0.937
 ours-0.899
population-0.892
 availability-0.876
Token:
Token ID25
Feature activation+2.12e-01

Pos logit contributions
ittal+1.191
��+1.167
Reviewer+1.132
akin+1.056
imony+1.047
Neg logit contributions
etting-1.033
coded-1.024
 ours-0.985
population-0.975
 availability-0.962
Token "[
Token ID12878
Feature activation+5.84e+00

Pos logit contributions
ittal+0.164
��+0.161
Reviewer+0.155
akin+0.145
imony+0.144
Neg logit contributions
etting-0.140
coded-0.139
 ours-0.134
population-0.132
 availability-0.131
TokenLS
Token ID6561
Feature activation+1.96e-01

Pos logit contributions
Reviewer+4.517
ittal+4.503
��+4.331
akin+4.131
ateur+4.051
Neg logit contributions
etting-3.644
coded-3.563
 ours-3.442
 availability-3.375
population-3.346
TokenD
Token ID35
Feature activation+5.27e+00

Pos logit contributions
ittal+0.138
��+0.136
Reviewer+0.131
imony+0.120
akin+0.119
Neg logit contributions
etting-0.144
coded-0.143
 ours-0.137
population-0.136
 availability-0.136
Token pioneer
Token ID29570
Feature activation+1.66e+01

Pos logit contributions
ittal+2.725
��+2.596
Reviewer+2.541
akin+2.313
ateur+2.250
Neg logit contributions
etting-4.714
coded-4.714
 ours-4.551
population-4.524
 availability-4.415
Token Michael
Token ID3899
Feature activation+4.50e+00

Pos logit contributions
ittal+9.998
Reviewer+9.568
��+9.495
imony+9.147
akin+9.060
Neg logit contributions
coded-10.792
etting-10.693
population-10.199
 ours-10.017
 availability-9.919
Token]
Token ID60
Feature activation+1.60e+00

Pos logit contributions
ittal+2.766
��+2.686
Reviewer+2.578
akin+2.394
imony+2.365
Neg logit contributions
etting-3.641
coded-3.628
 ours-3.489
population-3.477
 availability-3.433
Token Holl
Token ID11806
Feature activation+4.66e+00

Pos logit contributions
ittal+1.218
��+1.194
Reviewer+1.164
akin+1.075
imony+1.069
Neg logit contributions
etting-1.071
coded-1.061
 ours-1.027
population-1.010
 availability-1.000
Tokenings
Token ID654
Feature activation+2.31e+00

Pos logit contributions
ittal+2.865
��+2.719
Reviewer+2.632
akin+2.487
imony+2.463
Neg logit contributions
coded-3.748
etting-3.746
 ours-3.629
population-3.586
 availability-3.572
Tokenhead
Token ID2256
Feature activation+5.77e+00

Pos logit contributions
ittal+1.624
��+1.587
Reviewer+1.538
akin+1.425
imony+1.413
Neg logit contributions
etting-1.686
coded-1.672
 ours-1.615
population-1.599
 availability-1.579
Token Editorial
Token ID39525
Feature activation-8.13e+00

Pos logit contributions
etting+3.624
coded+3.611
 ours+3.445
 toxicity+3.325
population+3.323
Neg logit contributions
ittal-8.772
��-8.590
Reviewer-8.102
akin-7.846
imony-7.694
Token Board
Token ID5926
Feature activation-8.25e+00

Pos logit contributions
 ours+6.989
coded+6.903
 wisely+6.817
etting+6.730
 availability+6.678
Neg logit contributions
ittal-4.851
��-4.786
akin-3.831
iband-3.811
imony-3.707
Token,
Token ID11
Feature activation-9.61e+00

Pos logit contributions
etting+3.960
 wisely+3.891
coded+3.844
 toxicity+3.703
 ours+3.695
Neg logit contributions
��-7.844
ittal-7.390
Reviewer-7.161
iband-6.945
imony-6.825
Token a
Token ID257
Feature activation-7.99e+00

Pos logit contributions
coded+7.047
 ours+6.842
 wisely+6.784
etting+6.717
population+6.458
Neg logit contributions
ittal-6.848
��-6.609
imony-6.106
Reviewer-5.991
iband-5.669
Token dem
Token ID1357
Feature activation-1.16e+01

Pos logit contributions
etting+5.278
coded+5.267
 ours+5.177
population+5.050
 wisely+4.951
Neg logit contributions
��-6.301
ittal-5.902
Reviewer-5.829
imony-5.387
iband-5.281
Tokenographically
Token ID33145
Feature activation-2.25e+01

Pos logit contributions
etting+5.820
coded+5.637
population+5.554
pred+5.252
sci+5.168
Neg logit contributions
��-10.782
Reviewer-9.896
ittal-9.853
��-9.410
 acknowled-8.812
Token and
Token ID290
Feature activation-1.00e+01

Pos logit contributions
coded+9.607
 ours+9.505
 wisely+9.010
etting+8.877
 daunting+8.659
Neg logit contributions
��-19.994
iband-17.803
ittal-17.802
imony-17.472
Reviewer-16.624
Token ideologically
Token ID49764
Feature activation-2.17e+01

Pos logit contributions
coded+5.620
etting+5.474
 availability+5.421
 ours+5.336
 wisely+5.273
Neg logit contributions
��-8.657
ittal-8.500
Reviewer-7.848
iband-7.541
imony-7.536
Token diverse
Token ID10084
Feature activation-7.85e+00

Pos logit contributions
 ours+8.674
coded+8.594
etting+7.685
 wisely+7.640
complex+7.260
Neg logit contributions
��-20.922
ittal-18.488
iband-18.403
imony-17.594
Reviewer-17.474
Token group
Token ID1448
Feature activation-4.90e+00

Pos logit contributions
etting+4.634
population+4.588
coded+4.437
 ours+4.328
 availability+4.246
Neg logit contributions
��-6.711
ittal-6.508
Reviewer-6.374
imony-5.871
ateral-5.838
Token that
Token ID326
Feature activation-3.99e+00

Pos logit contributions
etting+2.509
coded+2.476
 ours+2.438
population+2.410
 wisely+2.369
Neg logit contributions
��-4.473
ittal-4.378
Reviewer-4.333
iband-4.065
imony-4.000
Token,
Token ID11
Feature activation-9.61e+00

Pos logit contributions
etting+3.960
 wisely+3.891
coded+3.844
 toxicity+3.703
 ours+3.695
Neg logit contributions
��-7.844
ittal-7.390
Reviewer-7.161
iband-6.945
imony-6.825
Token a
Token ID257
Feature activation-7.99e+00

Pos logit contributions
coded+7.047
 ours+6.842
 wisely+6.784
etting+6.717
population+6.458
Neg logit contributions
ittal-6.848
��-6.609
imony-6.106
Reviewer-5.991
iband-5.669
Token dem
Token ID1357
Feature activation-1.16e+01

Pos logit contributions
etting+5.278
coded+5.267
 ours+5.177
population+5.050
 wisely+4.951
Neg logit contributions
��-6.301
ittal-5.902
Reviewer-5.829
imony-5.387
iband-5.281
Tokenographically
Token ID33145
Feature activation-2.25e+01

Pos logit contributions
etting+5.820
coded+5.637
population+5.554
pred+5.252
sci+5.168
Neg logit contributions
��-10.782
Reviewer-9.896
ittal-9.853
��-9.410
 acknowled-8.812
Token and
Token ID290
Feature activation-1.00e+01

Pos logit contributions
coded+9.607
 ours+9.505
 wisely+9.010
etting+8.877
 daunting+8.659
Neg logit contributions
��-19.994
iband-17.803
ittal-17.802
imony-17.472
Reviewer-16.624
Token ideologically
Token ID49764
Feature activation-2.17e+01

Pos logit contributions
coded+5.620
etting+5.474
 availability+5.421
 ours+5.336
 wisely+5.273
Neg logit contributions
��-8.657
ittal-8.500
Reviewer-7.848
iband-7.541
imony-7.536
Token diverse
Token ID10084
Feature activation-7.85e+00

Pos logit contributions
 ours+8.674
coded+8.594
etting+7.685
 wisely+7.640
complex+7.260
Neg logit contributions
��-20.922
ittal-18.488
iband-18.403
imony-17.594
Reviewer-17.474
Token group
Token ID1448
Feature activation-4.90e+00

Pos logit contributions
etting+4.634
population+4.588
coded+4.437
 ours+4.328
 availability+4.246
Neg logit contributions
��-6.711
ittal-6.508
Reviewer-6.374
imony-5.871
ateral-5.838
Token that
Token ID326
Feature activation-3.99e+00

Pos logit contributions
etting+2.509
coded+2.476
 ours+2.438
population+2.410
 wisely+2.369
Neg logit contributions
��-4.473
ittal-4.378
Reviewer-4.333
iband-4.065
imony-4.000
Token is
Token ID318
Feature activation-4.93e+00

Pos logit contributions
 ours+2.869
coded+2.857
etting+2.836
 wisely+2.774
population+2.772
Neg logit contributions
ittal-2.861
��-2.785
Reviewer-2.752
imony-2.462
akin-2.435
Token separate
Token ID4553
Feature activation+2.11e+00

Pos logit contributions
coded+3.161
etting+3.114
 ours+3.034
population+3.019
 wisely+2.893
Neg logit contributions
��-4.016
ittal-3.909
Reviewer-3.804
imony-3.457
akin-3.441
Token United
Token ID1578
Feature activation-6.66e+00

Pos logit contributions
etting+4.200
 availability+4.157
 ours+4.151
coded+4.142
 toxicity+4.022
Neg logit contributions
��-5.213
ittal-5.019
Reviewer-4.831
akin-4.471
iband-4.439
Token States
Token ID1829
Feature activation-2.36e+00

Pos logit contributions
 ours+3.297
etting+3.245
coded+3.222
population+3.164
 toxicity+3.094
Neg logit contributions
��-5.967
ittal-5.932
Reviewer-5.790
ateur-5.419
akin-5.373
Token can
Token ID460
Feature activation-2.17e+00

Pos logit contributions
etting+1.690
coded+1.679
 ours+1.664
population+1.636
 availability+1.628
Neg logit contributions
��-1.704
ittal-1.666
Reviewer-1.579
akin-1.439
ateur-1.437
Token do
Token ID466
Feature activation-1.18e+01

Pos logit contributions
etting+1.401
coded+1.389
 ours+1.357
population+1.312
 availability+1.310
Neg logit contributions
ittal-1.678
��-1.665
Reviewer-1.629
akin-1.483
ateur-1.482
Token to
Token ID284
Feature activation-1.00e+01

Pos logit contributions
etting+7.038
 ours+6.921
 wisely+6.904
coded+6.663
 availability+6.368
Neg logit contributions
��-9.302
ittal-8.716
Reviewer-8.452
sbm-7.998
ateur-7.960
Token help
Token ID1037
Feature activation-1.97e+01

Pos logit contributions
etting+6.164
 ours+6.162
 wisely+5.968
 availability+5.841
coded+5.789
Neg logit contributions
��-7.922
ittal-7.682
Reviewer-7.579
sbm-6.861
akin-6.857
Token out
Token ID503
Feature activation-1.14e+01

Pos logit contributions
 ours+9.902
etting+9.571
 wisely+9.193
 availability+9.113
 London+8.701
Neg logit contributions
��-15.801
ittal-14.629
Reviewer-14.626
sbm-14.075
atio-13.873
Token in
Token ID287
Feature activation-1.26e+01

Pos logit contributions
etting+6.082
 ours+6.073
 wisely+5.606
coded+5.598
population+5.597
Neg logit contributions
��-9.760
ittal-9.336
Reviewer-8.763
akin-8.471
atio-8.421
Token London
Token ID3576
Feature activation-7.27e+00

Pos logit contributions
 ours+7.042
etting+6.936
 emergencies+6.654
 availability+6.590
population+6.468
Neg logit contributions
��-10.131
ittal-9.922
Reviewer-9.594
ateur-8.920
sbm-8.908
Token and
Token ID290
Feature activation-8.81e+00

Pos logit contributions
etting+4.503
 ours+4.382
coded+4.320
 wisely+4.276
population+4.166
Neg logit contributions
��-5.770
ittal-5.687
Reviewer-5.276
imony-5.045
akin-4.931
Token the
Token ID262
Feature activation-7.39e+00

Pos logit contributions
 ours+5.055
etting+4.791
 availability+4.756
coded+4.706
 wisely+4.660
Neg logit contributions
ittal-7.352
��-7.323
Reviewer-7.068
ateur-6.636
imony-6.447
Token,
Token ID11
Feature activation+3.91e-01

Pos logit contributions
etting+3.351
coded+3.319
 ours+3.243
 availability+3.139
 wisely+3.036
Neg logit contributions
��-4.485
ittal-4.409
Reviewer-4.281
ateur-3.906
imony-3.898
Token so
Token ID523
Feature activation-1.28e+00

Pos logit contributions
ittal+0.284
��+0.280
Reviewer+0.271
akin+0.248
imony+0.247
Neg logit contributions
etting-0.278
coded-0.275
 ours-0.268
population-0.262
 availability-0.261
Token we
Token ID356
Feature activation-2.47e+00

Pos logit contributions
etting+0.806
coded+0.802
 ours+0.796
 availability+0.767
population+0.755
Neg logit contributions
ittal-1.031
��-1.025
Reviewer-0.985
akin-0.908
imony-0.908
Token turned
Token ID2900
Feature activation-7.72e+00

Pos logit contributions
etting+1.478
coded+1.471
 ours+1.429
 availability+1.375
 wisely+1.371
Neg logit contributions
��-2.075
ittal-2.062
Reviewer-2.008
imony-1.831
akin-1.831
Token the
Token ID262
Feature activation-7.70e+00

Pos logit contributions
 ours+4.265
coded+3.996
etting+3.992
 availability+3.924
 wisely+3.830
Neg logit contributions
��-6.795
ittal-6.781
Reviewer-6.489
imony-6.103
iband-6.095
Token heat
Token ID4894
Feature activation-1.89e+01

Pos logit contributions
etting+4.552
coded+4.470
 availability+4.387
 ours+4.353
 toxicity+4.205
Neg logit contributions
ittal-6.355
��-6.202
Reviewer-5.911
imony-5.588
akin-5.543
Token down
Token ID866
Feature activation-8.08e+00

Pos logit contributions
 wisely+7.823
coded+7.590
 ours+7.444
 availability+7.415
etting+7.257
Neg logit contributions
��-17.190
Reviewer-15.889
ittal-15.799
imony-14.597
��-14.556
Token to
Token ID284
Feature activation-7.20e+00

Pos logit contributions
coded+4.770
etting+4.718
 ours+4.639
 wisely+4.555
 availability+4.407
Neg logit contributions
��-6.685
ittal-6.564
Reviewer-6.325
imony-5.832
ateur-5.829
Token 350
Token ID13803
Feature activation-7.14e+00

Pos logit contributions
 ours+3.971
etting+3.943
coded+3.726
 availability+3.686
 brightest+3.583
Neg logit contributions
ittal-6.161
Reviewer-6.141
��-6.109
imony-5.619
ateral-5.477
Token degrees
Token ID7370
Feature activation-8.00e+00

Pos logit contributions
etting+4.333
coded+4.243
 ours+4.091
 availability+3.995
 wisely+3.904
Neg logit contributions
��-5.715
ittal-5.650
Reviewer-5.602
imony-5.141
akin-5.052
Token for
Token ID329
Feature activation-3.58e-01

Pos logit contributions
etting+4.806
coded+4.726
 ours+4.554
 availability+4.465
 wisely+4.317
Neg logit contributions
��-6.592
ittal-6.426
Reviewer-6.310
imony-5.934
akin-5.727
Token Sun
Token ID3825
Feature activation-9.91e+00

Pos logit contributions
etting+1.371
coded+1.358
 ours+1.297
population+1.280
 availability+1.260
Neg logit contributions
ittal-2.103
��-2.066
Reviewer-2.013
akin-1.891
imony-1.878
Token Online
Token ID7467
Feature activation-7.08e+00

Pos logit contributions
 ours+4.555
coded+4.305
 availability+4.194
 Canary+4.101
scape+4.024
Neg logit contributions
��-9.849
ittal-9.160
Reviewer-8.866
ateur-8.369
itle-8.289
Token news
Token ID1705
Feature activation-3.88e+00

Pos logit contributions
coded+4.133
 availability+4.090
 ours+4.055
etting+3.967
population+3.961
Neg logit contributions
��-5.837
ittal-5.591
Reviewer-5.548
itle-5.090
imony-5.055
Token team
Token ID1074
Feature activation-7.63e+00

Pos logit contributions
etting+1.943
coded+1.938
 ours+1.879
population+1.862
 availability+1.850
Neg logit contributions
��-3.614
ittal-3.486
Reviewer-3.401
akin-3.192
imony-3.188
Token?
Token ID30
Feature activation-7.19e+00

Pos logit contributions
etting+3.055
coded+3.013
 ours+3.003
 wisely+2.822
population+2.770
Neg logit contributions
��-7.581
ittal-7.314
Reviewer-7.222
imony-6.851
��-6.804
Token Email
Token ID9570
Feature activation-1.89e+01

Pos logit contributions
coded+3.403
 ours+3.314
etting+3.242
spawn+3.120
 TBD+3.073
Neg logit contributions
��-6.805
ittal-6.576
Reviewer-6.407
ateur-6.123
akin-6.047
Token us
Token ID514
Feature activation-7.25e+00

Pos logit contributions
 ours+4.037
 availability+2.324
 TBD+2.178
 wisely+2.174
 us+2.106
Neg logit contributions
��-23.233
ittal-21.518
Reviewer-21.002
itle-20.597
ateral-20.275
Token at
Token ID379
Feature activation-6.47e+00

Pos logit contributions
 ours+3.905
coded+3.791
 wisely+3.694
etting+3.689
 availability+3.551
Neg logit contributions
��-6.497
ittal-6.280
Reviewer-6.107
ateur-5.706
itle-5.700
Token tips
Token ID9040
Feature activation-5.96e+00

Pos logit contributions
 ours+5.032
etting+4.855
coded+4.723
 availability+4.620
 toxicity+4.605
Neg logit contributions
��-4.447
Reviewer-4.147
ittal-4.142
ateur-3.750
akin-3.606
Token@
Token ID31
Feature activation-4.37e+00

Pos logit contributions
etting+0.862
coded+0.690
 ours+0.599
population+0.559
 wisely+0.439
Neg logit contributions
��-7.796
ittal-7.586
Reviewer-7.438
akin-7.181
iband-7.126
Tokenthe
Token ID1169
Feature activation+2.85e+00

Pos logit contributions
etting+3.264
 ours+3.211
coded+3.193
population+3.122
complex+3.048
Neg logit contributions
��-3.128
Reviewer-2.865
ittal-2.759
akin-2.589
ateur-2.522
Token just
Token ID655
Feature activation-6.06e-01

Pos logit contributions
 ours+7.159
coded+6.524
etting+6.498
 wisely+6.371
 brightest+6.273
Neg logit contributions
ittal-9.146
��-9.055
Reviewer-8.591
imony-7.978
ateur-7.933
Token as
Token ID355
Feature activation-7.34e+00

Pos logit contributions
etting+0.359
coded+0.356
 ours+0.347
population+0.338
 availability+0.333
Neg logit contributions
ittal-0.512
��-0.503
Reviewer-0.487
akin-0.451
imony-0.450
Token powerful
Token ID3665
Feature activation-1.06e+01

Pos logit contributions
 ours+4.488
coded+4.386
etting+4.282
 wisely+4.143
 brightest+4.093
Neg logit contributions
ittal-5.984
��-5.970
Reviewer-5.754
imony-5.307
ateur-5.248
Token.
Token ID13
Feature activation-9.28e+00

Pos logit contributions
 ours+5.677
coded+5.615
etting+5.421
population+5.189
 availability+5.181
Neg logit contributions
��-9.043
ittal-8.816
Reviewer-8.332
ateur-7.949
itle-7.807
Token The
Token ID383
Feature activation-4.71e+00

Pos logit contributions
 ours+5.253
coded+5.139
etting+4.972
 TBD+4.772
population+4.766
Neg logit contributions
ittal-7.852
��-7.844
Reviewer-7.217
ateur-6.769
akin-6.722
Token stakes
Token ID21147
Feature activation-1.86e+01

Pos logit contributions
etting+2.992
coded+2.962
 ours+2.928
 availability+2.840
 brightest+2.803
Neg logit contributions
ittal-3.768
��-3.719
Reviewer-3.570
imony-3.304
ateur-3.259
Token of
Token ID286
Feature activation-1.02e+01

Pos logit contributions
 ours+9.397
etting+8.959
coded+8.332
 brightest+8.092
 daunting+8.052
Neg logit contributions
��-14.840
Reviewer-14.288
ittal-13.924
ateur-13.213
ateral-13.082
Token our
Token ID674
Feature activation-4.74e+00

Pos logit contributions
 ours+5.973
etting+5.510
 availability+5.414
coded+5.372
 toxicity+5.281
Neg logit contributions
��-8.504
ittal-8.346
Reviewer-8.293
imony-7.508
iband-7.388
Token struggle
Token ID6531
Feature activation-3.36e+00

Pos logit contributions
etting+2.844
coded+2.823
 ours+2.818
 availability+2.736
population+2.690
Neg logit contributions
��-3.919
ittal-3.873
Reviewer-3.779
imony-3.436
ateur-3.430
Token just
Token ID655
Feature activation+3.26e-02

Pos logit contributions
etting+2.096
coded+2.083
 ours+2.059
population+1.964
 availability+1.958
Neg logit contributions
ittal-2.678
��-2.672
Reviewer-2.619
akin-2.408
ateur-2.393
Token as
Token ID355
Feature activation-8.99e+00

Pos logit contributions
ittal+0.026
��+0.026
Reviewer+0.025
akin+0.023
ateur+0.023
Neg logit contributions
etting-0.020
coded-0.020
 ours-0.020
population-0.019
 availability-0.019
Token,
Token ID11
Feature activation-4.79e+00

Pos logit contributions
ittal+1.074
��+1.055
Reviewer+1.024
akin+0.950
imony+0.943
Neg logit contributions
etting-0.935
coded-0.927
 ours-0.892
population-0.881
 availability-0.872
Token no
Token ID645
Feature activation+9.76e-01

Pos logit contributions
etting+2.972
coded+2.967
 ours+2.966
 availability+2.916
population+2.860
Neg logit contributions
��-3.831
ittal-3.755
Reviewer-3.689
ateur-3.352
imony-3.322
Token matter
Token ID2300
Feature activation+9.01e-01

Pos logit contributions
ittal+0.841
��+0.830
Reviewer+0.805
akin+0.756
imony+0.752
Neg logit contributions
etting-0.559
coded-0.553
 ours-0.527
population-0.521
 availability-0.515
Token what
Token ID644
Feature activation-4.71e-01

Pos logit contributions
ittal+0.650
��+0.640
Reviewer+0.618
akin+0.571
imony+0.566
Neg logit contributions
etting-0.640
coded-0.636
 ours-0.616
population-0.607
 availability-0.602
Token the
Token ID262
Feature activation-6.14e+00

Pos logit contributions
etting+0.303
coded+0.302
 ours+0.292
population+0.286
 availability+0.285
Neg logit contributions
ittal-0.371
��-0.367
Reviewer-0.354
akin-0.329
imony-0.327
Token conditions
Token ID3403
Feature activation-1.86e+01

Pos logit contributions
etting+3.678
coded+3.666
 availability+3.605
 ours+3.560
 toxicity+3.485
Neg logit contributions
��-5.108
ittal-5.040
Reviewer-4.787
ateur-4.453
akin-4.439
Token,
Token ID11
Feature activation-3.97e+00

Pos logit contributions
etting+9.579
coded+8.810
 ours+8.578
 availability+8.492
 wisely+8.300
Neg logit contributions
��-14.988
Reviewer-14.210
ittal-14.190
ateur-13.260
akin-12.782
Token are
Token ID389
Feature activation-1.64e+00

Pos logit contributions
etting+2.316
coded+2.283
 ours+2.223
 availability+2.189
population+2.145
Neg logit contributions
��-3.363
ittal-3.319
Reviewer-3.279
akin-2.993
ateur-2.992
Token for
Token ID329
Feature activation-3.59e+00

Pos logit contributions
etting+1.034
coded+1.030
 ours+1.006
 availability+0.974
population+0.967
Neg logit contributions
ittal-1.312
��-1.309
Reviewer-1.246
akin-1.160
imony-1.159
Token Nietzsche
Token ID40258
Feature activation-4.53e+00

Pos logit contributions
etting+2.244
coded+2.230
 ours+2.225
 availability+2.136
population+2.132
Neg logit contributions
ittal-2.911
��-2.884
Reviewer-2.738
akin-2.552
imony-2.541
Token the
Token ID262
Feature activation+1.45e+00

Pos logit contributions
coded+2.868
etting+2.826
 ours+2.825
 availability+2.728
 toxicity+2.686
Neg logit contributions
��-3.560
ittal-3.527
Reviewer-3.389
imony-3.169
iband-3.131
Tokenburning
Token ID44313
Feature activation-2.35e-01

Pos logit contributions
coded+1.001
etting+0.999
population+0.952
 ours+0.949
spawn+0.932
Neg logit contributions
��-1.334
ittal-1.333
Reviewer-1.263
imony-1.180
ateur-1.177
Token contest
Token ID8414
Feature activation+1.06e+00

Pos logit contributions
etting+0.149
coded+0.148
 ours+0.143
population+0.140
 availability+0.140
Neg logit contributions
ittal-0.187
��-0.185
Reviewer-0.178
akin-0.166
imony-0.165
Token.
Token ID13
Feature activation-3.73e+00

Pos logit contributions
ittal+0.809
��+0.795
Reviewer+0.771
akin+0.717
imony+0.711
Neg logit contributions
etting-0.704
coded-0.697
 ours-0.671
population-0.663
 availability-0.655
Token The
Token ID383
Feature activation-5.55e+00

Pos logit contributions
etting+2.407
coded+2.406
 ours+2.380
population+2.314
 availability+2.273
Neg logit contributions
��-2.908
ittal-2.904
Reviewer-2.698
akin-2.546
ateur-2.516
Token temperature
Token ID5951
Feature activation-1.59e+01

Pos logit contributions
etting+3.562
coded+3.515
 ours+3.497
 availability+3.490
population+3.375
Neg logit contributions
ittal-4.309
��-4.262
Reviewer-4.071
imony-3.747
akin-3.739
Token was
Token ID373
Feature activation-1.84e+01

Pos logit contributions
coded+8.642
 availability+8.636
 ours+8.480
etting+8.451
 wisely+8.333
Neg logit contributions
��-12.974
ittal-11.741
Reviewer-11.718
��-11.053
iband-10.891
Token so
Token ID523
Feature activation-1.22e+01

Pos logit contributions
 ours+10.634
 wisely+10.289
 brightest+10.140
 TBD+10.089
 daunting+10.003
Neg logit contributions
��-14.084
ittal-12.984
Reviewer-12.329
��-11.804
imony-11.560
Token high
Token ID1029
Feature activation-1.09e+01

Pos logit contributions
coded+5.980
 ours+5.873
etting+5.857
 daunting+5.596
 wisely+5.511
Neg logit contributions
��-10.651
ittal-10.292
Reviewer-9.743
imony-9.391
iband-9.277
Token,
Token ID11
Feature activation-2.49e+00

Pos logit contributions
 ours+6.315
coded+6.251
etting+6.058
 availability+6.055
 brightest+5.798
Neg logit contributions
��-8.869
ittal-8.698
Reviewer-8.136
imony-7.692
ateur-7.596
Token reaching
Token ID8978
Feature activation-1.03e+01

Pos logit contributions
coded+1.571
etting+1.569
 ours+1.552
 availability+1.517
population+1.499
Neg logit contributions
ittal-1.964
��-1.963
Reviewer-1.877
imony-1.741
akin-1.738
Token more
Token ID517
Feature activation-1.01e+01

Pos logit contributions
 ours+5.834
coded+5.729
etting+5.617
 availability+5.523
 toxicity+5.468
Neg logit contributions
��-8.346
ittal-8.203
Reviewer-7.844
imony-7.310
iband-7.280
Token should
Token ID815
Feature activation-3.71e+00

Pos logit contributions
etting+3.001
coded+2.961
 ours+2.908
 availability+2.856
 wisely+2.787
Neg logit contributions
��-4.123
ittal-4.045
Reviewer-4.016
akin-3.616
ateur-3.595
Token be
Token ID307
Feature activation-8.23e+00

Pos logit contributions
etting+2.400
coded+2.395
 ours+2.359
 wisely+2.279
 availability+2.272
Neg logit contributions
��-2.889
ittal-2.832
Reviewer-2.784
imony-2.516
akin-2.503
Token adequately
Token ID22668
Feature activation-1.21e+01

Pos logit contributions
 ours+5.056
coded+4.866
etting+4.741
 wisely+4.704
 brightest+4.686
Neg logit contributions
��-6.793
ittal-6.601
Reviewer-6.240
imony-5.853
iband-5.723
Token represented
Token ID7997
Feature activation-7.74e+00

Pos logit contributions
coded+4.445
 ours+4.374
 wisely+4.020
etting+3.883
 brightest+3.862
Neg logit contributions
��-12.622
ittal-12.207
Reviewer-11.800
imony-11.288
ateur-11.080
Token,
Token ID11
Feature activation-1.26e+00

Pos logit contributions
etting+4.442
 ours+4.346
coded+4.281
 availability+4.155
 wisely+4.148
Neg logit contributions
��-6.466
ittal-6.380
Reviewer-6.140
imony-5.714
akin-5.656
Token gender
Token ID5279
Feature activation-1.84e+01

Pos logit contributions
etting+0.824
coded+0.815
 ours+0.800
population+0.785
 availability+0.782
Neg logit contributions
ittal-0.978
��-0.970
Reviewer-0.937
akin-0.864
imony-0.861
Token should
Token ID815
Feature activation-2.03e+00

Pos logit contributions
 availability+10.116
coded+9.399
 sensitivity+9.265
 toxicity+9.189
etting+9.115
Neg logit contributions
��-14.792
Reviewer-13.409
ittal-13.190
��-12.478
imony-12.291
Token be
Token ID307
Feature activation-7.93e+00

Pos logit contributions
etting+1.418
coded+1.407
 ours+1.383
 availability+1.349
population+1.338
Neg logit contributions
��-1.486
ittal-1.474
Reviewer-1.440
imony-1.305
akin-1.285
Token adequately
Token ID22668
Feature activation-1.00e+01

Pos logit contributions
 ours+4.811
coded+4.714
 wisely+4.610
etting+4.571
 availability+4.522
Neg logit contributions
��-6.610
ittal-6.445
Reviewer-6.058
imony-5.775
ateur-5.604
Token represented
Token ID7997
Feature activation-1.14e+00

Pos logit contributions
coded+4.625
 ours+4.523
etting+4.218
 wisely+4.211
 availability+4.051
Neg logit contributions
��-9.660
ittal-9.476
Reviewer-9.159
imony-8.683
ateur-8.500
Token,
Token ID11
Feature activation-1.61e-01

Pos logit contributions
etting+0.710
coded+0.697
 ours+0.678
 availability+0.661
population+0.661
Neg logit contributions
ittal-0.928
��-0.926
Reviewer-0.887
imony-0.825
akin-0.824
Token of
Token ID286
Feature activation-3.00e+00

Pos logit contributions
etting+2.702
coded+2.667
 ours+2.585
 wisely+2.553
 availability+2.543
Neg logit contributions
��-4.035
ittal-4.017
Reviewer-3.907
akin-3.593
ateral-3.557
Token Gentle
Token ID25066
Feature activation-3.01e-02

Pos logit contributions
etting+1.237
coded+1.185
 ours+1.178
population+1.089
 availability+1.072
Neg logit contributions
ittal-3.047
��-2.956
Reviewer-2.932
akin-2.784
imony-2.755
Tokenmen
Token ID3653
Feature activation-4.46e+00

Pos logit contributions
etting+0.017
coded+0.017
 ours+0.016
population+0.016
 availability+0.016
Neg logit contributions
ittal-0.026
��-0.025
Reviewer-0.025
akin-0.023
imony-0.023
Token tried
Token ID3088
Feature activation-1.35e+01

Pos logit contributions
etting+2.607
coded+2.594
 ours+2.544
 availability+2.485
 wisely+2.471
Neg logit contributions
��-3.736
ittal-3.703
Reviewer-3.602
akin-3.295
ateral-3.274
Token to
Token ID284
Feature activation-9.40e+00

Pos logit contributions
 ours+6.747
coded+6.481
etting+6.470
 availability+6.286
 wisely+6.276
Neg logit contributions
��-11.395
ittal-11.361
Reviewer-10.721
ateral-10.151
��-10.001
Token record
Token ID1700
Feature activation-1.83e+01

Pos logit contributions
 ours+5.973
etting+5.882
coded+5.775
 availability+5.614
 wisely+5.603
Neg logit contributions
��-7.300
ittal-7.001
Reviewer-6.659
��-6.344
akin-6.156
Token live
Token ID2107
Feature activation-1.33e+01

Pos logit contributions
 ours+10.181
 availability+9.056
 wisely+8.651
 theirs+8.522
etting+8.431
Neg logit contributions
��-14.829
ittal-14.009
Reviewer-12.909
��-12.554
itle-12.225
Token --
Token ID1377
Feature activation-2.18e+00

Pos logit contributions
 ours+7.533
 availability+7.073
etting+6.920
coded+6.839
 toxicity+6.689
Neg logit contributions
��-11.194
ittal-10.643
Reviewer-9.612
sbm-9.237
akin-9.169
Token first
Token ID717
Feature activation-7.89e+00

Pos logit contributions
etting+1.469
coded+1.464
 ours+1.443
population+1.397
 availability+1.395
Neg logit contributions
ittal-1.626
��-1.626
Reviewer-1.543
ateur-1.417
akin-1.415
Token of
Token ID286
Feature activation-6.71e+00

Pos logit contributions
 ours+4.752
coded+4.693
etting+4.679
 availability+4.580
 wisely+4.401
Neg logit contributions
��-6.321
ittal-6.210
Reviewer-5.863
ateral-5.490
akin-5.488
Token all
Token ID477
Feature activation-7.72e+00

Pos logit contributions
 ours+3.426
coded+3.363
etting+3.351
 availability+3.282
population+3.163
Neg logit contributions
��-6.222
ittal-6.094
Reviewer-5.835
ateur-5.433
iband-5.412
Token)
Token ID8
Feature activation-2.01e+00

Pos logit contributions
ittal+4.358
��+4.113
Reviewer+4.088
akin+3.830
ateral+3.765
Neg logit contributions
coded-4.801
etting-4.798
 ours-4.585
population-4.574
 availability-4.488
Token people
Token ID661
Feature activation-2.14e-01

Pos logit contributions
etting+1.138
coded+1.129
 ours+1.083
population+1.067
 availability+1.062
Neg logit contributions
��-1.746
ittal-1.744
Reviewer-1.680
imony-1.569
akin-1.566
Token,
Token ID11
Feature activation-1.38e+00

Pos logit contributions
etting+0.134
coded+0.132
 ours+0.127
population+0.125
 availability+0.124
Neg logit contributions
ittal-0.174
��-0.172
Reviewer-0.166
akin-0.154
imony-0.154
Token such
Token ID884
Feature activation-1.97e+00

Pos logit contributions
etting+0.892
coded+0.884
 ours+0.858
population+0.843
 availability+0.840
Neg logit contributions
ittal-1.087
��-1.079
Reviewer-1.034
imony-0.959
ateur-0.956
Token as
Token ID355
Feature activation-5.18e+00

Pos logit contributions
etting+1.111
coded+1.103
 ours+1.059
population+1.044
 availability+1.032
Neg logit contributions
ittal-1.702
��-1.686
Reviewer-1.628
akin-1.526
imony-1.517
Token gender
Token ID5279
Feature activation-1.83e+01

Pos logit contributions
 ours+2.885
coded+2.867
etting+2.854
 availability+2.765
population+2.690
Neg logit contributions
ittal-4.498
��-4.466
Reviewer-4.295
akin-3.996
ateur-3.989
Tokenflu
Token ID35522
Feature activation-6.68e+00

Pos logit contributions
coded+8.245
 availability+7.620
 ours+7.566
etting+7.192
population+7.083
Neg logit contributions
Reviewer-15.705
��-15.690
ittal-14.936
imony-13.761
ateral-13.646
Tokenid
Token ID312
Feature activation-3.44e+00

Pos logit contributions
etting+3.452
coded+3.250
population+3.070
spawn+3.001
 ours+2.982
Neg logit contributions
��-6.351
Reviewer-5.986
ittal-5.943
imony-5.595
ateur-5.518
Token and
Token ID290
Feature activation-6.44e+00

Pos logit contributions
etting+1.967
coded+1.960
 ours+1.867
population+1.835
 availability+1.815
Neg logit contributions
��-2.965
ittal-2.941
Reviewer-2.824
akin-2.624
ateur-2.619
Token gender
Token ID5279
Feature activation-1.55e+01

Pos logit contributions
coded+3.373
 ours+3.322
etting+3.283
 availability+3.197
population+3.104
Neg logit contributions
ittal-5.766
��-5.711
Reviewer-5.632
akin-5.165
ateur-5.124
Tokenque
Token ID4188
Feature activation-7.88e+00

Pos logit contributions
coded+7.281
 ours+6.486
etting+6.336
 availability+6.332
population+6.254
Neg logit contributions
��-13.539
Reviewer-13.349
ittal-12.900
imony-11.864
ateral-11.818
Token an
Token ID281
Feature activation-9.21e+00

Pos logit contributions
etting+5.305
 availability+5.038
coded+5.013
 ours+5.002
 toxicity+4.844
Neg logit contributions
��-9.903
ittal-9.298
Reviewer-8.862
iband-8.402
ateur-8.199
Token issue
Token ID2071
Feature activation-1.01e+01

Pos logit contributions
etting+4.426
coded+4.263
 availability+4.132
 ours+4.106
 toxicity+3.873
Neg logit contributions
��-8.886
ittal-8.470
Reviewer-8.279
iband-7.702
imony-7.657
Token by
Token ID416
Feature activation-2.04e+00

Pos logit contributions
etting+6.355
coded+5.974
 ours+5.970
 wisely+5.799
 availability+5.789
Neg logit contributions
��-8.489
ittal-7.799
Reviewer-7.721
ateur-7.025
iband-6.977
Token setting
Token ID4634
Feature activation-5.09e+00

Pos logit contributions
etting+1.243
coded+1.215
 ours+1.170
 availability+1.148
population+1.137
Neg logit contributions
ittal-1.705
��-1.701
Reviewer-1.624
imony-1.516
akin-1.502
Token the
Token ID262
Feature activation-5.43e+00

Pos logit contributions
etting+3.124
coded+3.035
 ours+2.993
 availability+2.988
population+2.868
Neg logit contributions
��-4.260
ittal-4.120
Reviewer-3.953
ateur-3.674
iband-3.658
Token language
Token ID3303
Feature activation-1.83e+01

Pos logit contributions
etting+3.238
coded+3.121
 availability+3.072
 ours+3.023
population+2.939
Neg logit contributions
��-4.627
ittal-4.546
Reviewer-4.333
akin-4.056
imony-4.047
Token of
Token ID286
Feature activation-5.44e+00

Pos logit contributions
etting+9.858
coded+9.653
 qualifier+9.593
 availability+9.569
 wisely+9.504
Neg logit contributions
��-14.795
ittal-13.431
Reviewer-12.557
ateur-12.265
akin-11.968
Token a
Token ID257
Feature activation-3.25e+00

Pos logit contributions
etting+3.428
coded+3.338
 ours+3.274
 availability+3.270
population+3.124
Neg logit contributions
��-4.444
ittal-4.352
Reviewer-4.120
iband-3.875
akin-3.864
Token document
Token ID3188
Feature activation-3.30e+00

Pos logit contributions
etting+1.979
coded+1.936
 ours+1.875
 availability+1.838
population+1.827
Neg logit contributions
��-2.707
ittal-2.696
Reviewer-2.571
akin-2.418
ateur-2.394
Token to
Token ID284
Feature activation-1.09e+01

Pos logit contributions
etting+1.941
coded+1.928
 ours+1.839
 availability+1.814
population+1.796
Neg logit contributions
��-2.803
ittal-2.777
Reviewer-2.652
akin-2.474
imony-2.473
Token US
Token ID1294
Feature activation-5.94e+00

Pos logit contributions
 ours+6.369
etting+6.307
coded+6.125
 availability+5.933
 wisely+5.779
Neg logit contributions
��-8.871
ittal-8.833
Reviewer-8.687
iband-7.837
ateur-7.722
Token.
Token ID13
Feature activation-8.66e-01

Pos logit contributions
coded+3.018
etting+2.997
 ours+2.896
 availability+2.793
population+2.734
Neg logit contributions
��-5.427
ittal-5.327
Reviewer-5.268
ateur-4.875
akin-4.839
Token9
Token ID24
Feature activation-4.13e+00

Pos logit contributions
etting+0.647
coded+0.646
 ours+0.626
population+0.618
 availability+0.607
Neg logit contributions
ittal-0.592
��-0.591
Reviewer-0.561
akin-0.513
ateur-0.512
Token billion
Token ID2997
Feature activation-2.44e-01

Pos logit contributions
coded+1.750
etting+1.728
 ours+1.668
 availability+1.571
population+1.566
Neg logit contributions
��-4.175
ittal-4.123
Reviewer-4.014
ateur-3.761
akin-3.738
Token.
Token ID13
Feature activation-7.78e+00

Pos logit contributions
etting+0.164
coded+0.163
 ours+0.158
population+0.155
 availability+0.154
Neg logit contributions
ittal-0.185
��-0.184
Reviewer-0.177
akin-0.164
imony-0.162
Token They
Token ID1119
Feature activation-1.15e+01

Pos logit contributions
 ours+4.561
etting+4.536
coded+4.484
population+4.374
 TBD+4.341
Neg logit contributions
��-6.619
ittal-6.257
Reviewer-5.918
ateur-5.611
akin-5.527
Token did
Token ID750
Feature activation-1.82e+01

Pos logit contributions
 wisely+6.306
etting+6.130
 ours+6.069
coded+5.919
 availability+5.802
Neg logit contributions
��-10.302
Reviewer-9.643
ittal-9.179
��-8.847
ateral-8.407
Token not
Token ID407
Feature activation-9.57e+00

Pos logit contributions
 ours+9.174
 wisely+8.736
etting+8.401
 availability+8.356
coded+8.037
Neg logit contributions
��-15.338
Reviewer-14.322
ittal-13.631
ateur-12.587
atio-12.562
Token.
Token ID13
Feature activation-4.96e+00

Pos logit contributions
etting+5.254
 ours+5.168
 wisely+5.070
coded+4.971
 availability+4.962
Neg logit contributions
��-8.633
Reviewer-8.136
ittal-7.738
��-7.342
ateur-7.154
Token
Token ID198
Feature activation+1.89e-01

Pos logit contributions
etting+3.021
coded+3.012
 ours+3.011
population+2.896
 availability+2.827
Neg logit contributions
��-4.104
ittal-3.983
Reviewer-3.772
ateur-3.552
akin-3.535
Token
Token ID198
Feature activation-2.80e+00

Pos logit contributions
ittal+0.138
��+0.133
Reviewer+0.131
akin+0.121
imony+0.121
Neg logit contributions
etting-0.132
coded-0.131
 ours-0.126
population-0.125
 availability-0.124
TokenInstead
Token ID13193
Feature activation-8.50e+00

Pos logit contributions
etting+1.792
coded+1.783
 ours+1.707
population+1.706
spawn+1.656
Neg logit contributions
��-2.258
ittal-2.236
Reviewer-2.054
ateur-1.957
akin-1.953
Token do
Token ID466
Feature activation-6.31e+00

Pos logit contributions
ittal+1.113
��+1.089
Reviewer+1.057
akin+0.983
imony+0.975
Neg logit contributions
etting-1.035
coded-1.027
 ours-0.988
population-0.979
 availability-0.966
Token you
Token ID345
Feature activation-4.98e+00

Pos logit contributions
etting+3.495
coded+3.460
 ours+3.458
 availability+3.259
population+3.205
Neg logit contributions
ittal-5.427
��-5.292
Reviewer-5.195
akin-4.786
iband-4.765
Token know
Token ID760
Feature activation+4.72e+00

Pos logit contributions
etting+3.182
coded+3.169
 ours+3.155
 availability+3.071
 wisely+3.038
Neg logit contributions
��-3.968
ittal-3.827
Reviewer-3.781
��-3.417
akin-3.415
Token?
Token ID30
Feature activation-4.92e+00

Pos logit contributions
ittal+3.613
��+3.461
Reviewer+3.403
imony+3.243
akin+3.225
Neg logit contributions
coded-3.064
etting-3.058
population-2.896
 ours-2.869
spawn-2.822
Token Virt
Token ID11285
Feature activation-4.49e+00

Pos logit contributions
etting+3.183
coded+3.183
 ours+3.106
population+3.067
 availability+2.980
Neg logit contributions
��-3.832
ittal-3.790
Reviewer-3.611
akin-3.345
iband-3.316
Tokenually
Token ID935
Feature activation-1.78e+01

Pos logit contributions
etting+3.694
coded+3.593
 ours+3.496
population+3.441
 availability+3.404
Neg logit contributions
ittal-2.774
��-2.653
Reviewer-2.607
iband-2.312
akin-2.309
Token every
Token ID790
Feature activation-9.35e+00

Pos logit contributions
 ours+8.415
 availability+7.371
 nil+7.330
population+7.256
coded+7.210
Neg logit contributions
��-15.557
ittal-15.375
Reviewer-14.858
iband-13.631
ateral-13.488
Token major
Token ID1688
Feature activation+4.15e-02

Pos logit contributions
 ours+5.496
etting+5.491
 availability+5.461
coded+5.431
population+5.374
Neg logit contributions
��-7.432
ittal-7.392
Reviewer-7.060
akin-6.617
iband-6.523
Token medical
Token ID3315
Feature activation-4.13e+00

Pos logit contributions
ittal+0.032
��+0.031
Reviewer+0.030
akin+0.028
imony+0.028
Neg logit contributions
etting-0.028
coded-0.027
 ours-0.026
population-0.026
 availability-0.026
Token association
Token ID8112
Feature activation-2.42e+00

Pos logit contributions
etting+2.679
coded+2.608
 ours+2.577
 availability+2.571
population+2.526
Neg logit contributions
��-3.215
ittal-3.198
Reviewer-3.126
akin-2.882
imony-2.825
Token supports
Token ID6971
Feature activation-5.04e+00

Pos logit contributions
etting+1.666
coded+1.646
 ours+1.615
 availability+1.588
population+1.584
Neg logit contributions
��-1.817
ittal-1.802
Reviewer-1.738
akin-1.600
imony-1.570
Token others
Token ID1854
Feature activation-4.17e-01

Pos logit contributions
ittal+1.602
��+1.572
Reviewer+1.529
akin+1.433
imony+1.423
Neg logit contributions
etting-1.186
coded-1.174
 ours-1.123
population-1.112
 availability-1.094
Token refer
Token ID3522
Feature activation-8.21e+00

Pos logit contributions
etting+0.270
coded+0.267
 ours+0.258
population+0.253
 availability+0.252
Neg logit contributions
ittal-0.328
��-0.324
Reviewer-0.315
akin-0.292
imony-0.290
Token to
Token ID284
Feature activation-1.02e+01

Pos logit contributions
coded+3.521
etting+3.516
 wisely+3.355
 ours+3.340
 availability+3.304
Neg logit contributions
ittal-7.800
��-7.634
Reviewer-7.346
akin-7.125
ateur-7.111
Token them
Token ID606
Feature activation-5.09e+00

Pos logit contributions
 ours+3.868
coded+3.730
etting+3.701
 availability+3.629
 wisely+3.354
Neg logit contributions
ittal-10.194
��-10.172
Reviewer-9.909
ateur-9.308
akin-9.214
Token with
Token ID351
Feature activation-4.77e+00

Pos logit contributions
etting+2.854
coded+2.829
 ours+2.783
 wisely+2.686
 availability+2.672
Neg logit contributions
��-4.386
ittal-4.348
Reviewer-4.232
ateur-3.923
imony-3.876
Token gender
Token ID5279
Feature activation-1.77e+01

Pos logit contributions
etting+2.898
coded+2.892
 ours+2.830
 availability+2.757
 toxicity+2.703
Neg logit contributions
ittal-3.872
��-3.862
Reviewer-3.784
akin-3.443
imony-3.442
Token-
Token ID12
Feature activation-1.47e+01

Pos logit contributions
coded+10.383
etting+9.749
 availability+9.384
 qualifier+9.312
 ours+9.200
Neg logit contributions
��-12.571
ittal-12.217
Reviewer-12.187
imony-11.235
iband-11.174
Tokenneutral
Token ID29797
Feature activation-2.12e+00

Pos logit contributions
coded+8.565
etting+7.662
complex+7.372
spawn+7.202
pred+7.098
Neg logit contributions
��-12.295
Reviewer-11.435
ittal-11.376
imony-10.801
iband-10.800
Token pronouns
Token ID43947
Feature activation-5.95e+00

Pos logit contributions
etting+1.280
coded+1.270
 ours+1.218
 availability+1.195
population+1.171
Neg logit contributions
ittal-1.758
��-1.741
Reviewer-1.697
akin-1.573
imony-1.567
Token such
Token ID884
Feature activation-6.19e+00

Pos logit contributions
etting+3.578
coded+3.471
 ours+3.351
 wisely+3.300
 availability+3.266
Neg logit contributions
��-4.964
ittal-4.842
Reviewer-4.769
imony-4.396
ateur-4.374
Token as
Token ID355
Feature activation-7.51e+00

Pos logit contributions
etting+3.104
coded+3.088
 ours+3.036
 availability+2.898
population+2.895
Neg logit contributions
��-5.769
ittal-5.654
Reviewer-5.467
ateur-5.149
akin-5.052
Token the
Token ID262
Feature activation-4.99e+00

Pos logit contributions
etting+2.420
coded+2.374
 ours+2.363
 availability+2.302
population+2.271
Neg logit contributions
ittal-3.131
��-3.128
Reviewer-3.010
akin-2.789
imony-2.751
Token new
Token ID649
Feature activation-1.46e+00

Pos logit contributions
etting+3.219
coded+3.191
 ours+3.165
 availability+3.131
 toxicity+3.058
Neg logit contributions
ittal-3.846
��-3.824
Reviewer-3.701
imony-3.380
akin-3.377
Token money
Token ID1637
Feature activation-6.29e+00

Pos logit contributions
etting+1.057
coded+1.039
 ours+1.020
 availability+1.002
population+0.991
Neg logit contributions
ittal-1.043
��-1.033
Reviewer-0.987
imony-0.913
akin-0.908
Token men
Token ID1450
Feature activation-8.30e+00

Pos logit contributions
etting+3.707
coded+3.703
 ours+3.620
 availability+3.555
 wisely+3.450
Neg logit contributions
��-5.194
ittal-5.111
Reviewer-4.885
imony-4.600
akin-4.489
Token to
Token ID284
Feature activation-4.63e+00

Pos logit contributions
etting+4.794
coded+4.621
 ours+4.313
population+4.258
spawn+4.234
Neg logit contributions
��-6.899
ittal-6.862
Reviewer-6.638
imony-6.141
ateral-6.016
Token do
Token ID466
Feature activation-1.76e+01

Pos logit contributions
etting+3.065
 ours+3.016
coded+3.008
 availability+2.893
 wisely+2.884
Neg logit contributions
��-3.571
ittal-3.509
Reviewer-3.377
akin-3.078
imony-3.063
Token that
Token ID326
Feature activation-9.42e+00

Pos logit contributions
 ours+9.057
etting+9.026
coded+8.535
 availability+8.446
 wisely+8.403
Neg logit contributions
��-13.681
Reviewer-13.399
ittal-13.047
akin-11.713
imony-11.661
Token.
Token ID13
Feature activation-4.41e+00

Pos logit contributions
etting+5.307
coded+5.097
 ours+5.016
 availability+4.821
 toxicity+4.792
Neg logit contributions
��-7.973
ittal-7.779
Reviewer-7.507
akin-6.982
imony-6.930
Token
Token ID198
Feature activation-2.47e+00

Pos logit contributions
etting+2.648
coded+2.623
 ours+2.621
population+2.519
 availability+2.485
Neg logit contributions
��-3.618
ittal-3.584
Reviewer-3.378
akin-3.178
ateur-3.109
Token
Token ID198
Feature activation-2.05e+00

Pos logit contributions
coded+1.602
etting+1.599
 ours+1.547
population+1.525
 availability+1.495
Neg logit contributions
��-1.952
ittal-1.932
Reviewer-1.833
akin-1.704
ateur-1.701
TokenHistory
Token ID18122
Feature activation-5.88e-01

Pos logit contributions
etting+1.307
coded+1.297
population+1.239
 ours+1.236
spawn+1.206
Neg logit contributions
��-1.645
ittal-1.642
Reviewer-1.515
akin-1.440
imony-1.432
Token intense
Token ID8157
Feature activation-5.56e+00

Pos logit contributions
 ours+4.500
etting+4.382
coded+4.365
 daunting+4.226
 availability+4.201
Neg logit contributions
��-8.657
ittal-8.418
Reviewer-8.291
imony-7.814
ateur-7.603
Token and
Token ID290
Feature activation-7.30e+00

Pos logit contributions
etting+3.392
coded+3.349
 ours+3.317
 availability+3.211
population+3.201
Neg logit contributions
��-4.623
ittal-4.477
Reviewer-4.349
akin-4.020
ateur-4.017
Token longer
Token ID2392
Feature activation-1.19e+00

Pos logit contributions
 ours+4.315
etting+4.278
coded+4.251
 availability+4.210
 toxicity+4.176
Neg logit contributions
��-6.156
ittal-5.891
Reviewer-5.777
imony-5.299
akin-5.247
Token-
Token ID12
Feature activation-4.12e+00

Pos logit contributions
etting+0.797
coded+0.795
 ours+0.764
population+0.753
 availability+0.750
Neg logit contributions
ittal-0.901
��-0.895
Reviewer-0.863
akin-0.800
imony-0.792
Tokenlasting
Token ID24810
Feature activation-2.76e+00

Pos logit contributions
coded+2.814
etting+2.791
population+2.656
 ours+2.627
spawn+2.608
Neg logit contributions
��-3.139
ittal-3.095
Reviewer-2.971
imony-2.742
akin-2.698
Token than
Token ID621
Feature activation-1.76e+01

Pos logit contributions
etting+1.771
coded+1.758
 ours+1.705
population+1.664
 availability+1.662
Neg logit contributions
��-2.185
ittal-2.175
Reviewer-2.089
akin-1.937
imony-1.923
Token what
Token ID644
Feature activation-4.12e+00

Pos logit contributions
 ours+9.774
 brightest+8.259
 toxicity+8.191
 emergencies+8.141
 epidem+8.134
Neg logit contributions
��-14.527
ittal-14.008
Reviewer-13.807
iband-12.445
imony-12.293
Token the
Token ID262
Feature activation-5.06e+00

Pos logit contributions
 ours+2.688
coded+2.605
etting+2.600
 availability+2.554
population+2.540
Neg logit contributions
��-3.218
ittal-3.185
Reviewer-3.108
akin-2.789
imony-2.783
Token country
Token ID1499
Feature activation-4.83e+00

Pos logit contributions
etting+3.294
coded+3.268
 ours+3.243
 availability+3.224
population+3.173
Neg logit contributions
ittal-3.927
��-3.877
Reviewer-3.809
iband-3.445
imony-3.417
Token has
Token ID468
Feature activation-6.66e+00

Pos logit contributions
etting+3.077
coded+3.072
 ours+3.070
population+2.971
 wisely+2.964
Neg logit contributions
��-3.896
ittal-3.731
Reviewer-3.666
imony-3.291
akin-3.243
Token seen
Token ID1775
Feature activation-4.41e+00

Pos logit contributions
 ours+3.316
coded+3.298
etting+3.209
 wisely+3.168
 availability+3.103
Neg logit contributions
��-6.210
ittal-6.075
Reviewer-6.074
imony-5.530
akin-5.488
Token I
Token ID314
Feature activation-5.79e+00

Pos logit contributions
etting+2.302
coded+2.270
 availability+2.227
 ours+2.195
population+2.161
Neg logit contributions
��-2.655
ittal-2.640
Reviewer-2.533
akin-2.346
imony-2.321
Token might
Token ID1244
Feature activation-7.11e-01

Pos logit contributions
coded+3.387
etting+3.354
 ours+3.225
 availability+3.203
population+3.175
Neg logit contributions
��-4.930
Reviewer-4.759
ittal-4.696
��-4.286
akin-4.221
Token not
Token ID407
Feature activation+1.29e+00

Pos logit contributions
etting+0.482
coded+0.478
 ours+0.464
population+0.456
 availability+0.453
Neg logit contributions
ittal-0.534
��-0.532
Reviewer-0.513
akin-0.471
imony-0.470
Token have
Token ID423
Feature activation-3.70e+00

Pos logit contributions
ittal+0.901
��+0.886
Reviewer+0.855
akin+0.788
imony+0.783
Neg logit contributions
etting-0.951
coded-0.943
 ours-0.913
population-0.902
 availability-0.894
Token the
Token ID262
Feature activation-4.25e+00

Pos logit contributions
coded+2.462
etting+2.448
 ours+2.431
 availability+2.425
population+2.339
Neg logit contributions
��-2.823
ittal-2.757
Reviewer-2.659
akin-2.450
imony-2.432
Token time
Token ID640
Feature activation-1.76e+01

Pos logit contributions
etting+3.101
coded+3.062
 availability+3.030
 ours+2.975
population+2.914
Neg logit contributions
��-2.985
ittal-2.926
Reviewer-2.806
akin-2.591
imony-2.563
Token for
Token ID329
Feature activation-1.04e+01

Pos logit contributions
coded+9.411
 availability+9.273
etting+9.064
 wisely+8.846
population+8.547
Neg logit contributions
��-13.512
ittal-12.308
Reviewer-11.950
imony-11.587
ateur-11.386
Token Serious
Token ID32231
Feature activation-2.00e+00

Pos logit contributions
 availability+6.037
 ours+6.013
etting+6.012
coded+5.791
 wisely+5.572
Neg logit contributions
��-8.498
ittal-8.183
Reviewer-7.770
imony-7.238
akin-7.205
Token Recording
Token ID43905
Feature activation-2.62e-02

Pos logit contributions
etting+1.334
coded+1.323
 ours+1.272
population+1.258
 availability+1.251
Neg logit contributions
ittal-1.532
��-1.513
Reviewer-1.437
akin-1.351
imony-1.350
Token Attempt
Token ID25770
Feature activation+1.90e+00

Pos logit contributions
etting+0.017
coded+0.017
 ours+0.017
population+0.016
 availability+0.016
Neg logit contributions
ittal-0.020
��-0.020
Reviewer-0.019
akin-0.018
imony-0.018
Tokens
Token ID82
Feature activation-3.63e+00

Pos logit contributions
ittal+1.544
��+1.513
Reviewer+1.474
akin+1.379
imony+1.371
Neg logit contributions
etting-1.170
coded-1.161
 ours-1.112
population-1.100
 availability-1.081
Token or
Token ID393
Feature activation-1.51e+00

Pos logit contributions
ittal+0.409
��+0.403
Reviewer+0.390
akin+0.364
imony+0.362
Neg logit contributions
etting-0.318
coded-0.316
 ours-0.302
population-0.300
 availability-0.296
Token Gender
Token ID20247
Feature activation-1.23e+01

Pos logit contributions
etting+1.178
coded+1.175
 ours+1.137
population+1.125
 availability+1.114
Neg logit contributions
ittal-0.995
��-0.970
Reviewer-0.942
akin-0.862
imony-0.855
Tokenque
Token ID4188
Feature activation-3.74e+00

Pos logit contributions
coded+4.494
etting+4.188
population+3.782
 ours+3.689
 availability+3.645
Neg logit contributions
��-12.329
ittal-12.105
Reviewer-11.919
imony-11.282
itle-11.101
Tokener
Token ID263
Feature activation+1.10e+00

Pos logit contributions
etting+3.168
coded+3.114
 ours+3.000
population+2.987
 availability+2.946
Neg logit contributions
��-2.205
ittal-2.192
Reviewer-2.089
imony-1.852
akin-1.851
Token:
Token ID25
Feature activation-5.32e+00

Pos logit contributions
ittal+0.921
��+0.904
Reviewer+0.880
akin+0.825
imony+0.820
Neg logit contributions
etting-0.653
coded-0.647
 ours-0.619
population-0.612
 availability-0.603
Token Gender
Token ID20247
Feature activation-1.75e+01

Pos logit contributions
etting+3.000
coded+2.996
 ours+2.849
population+2.828
 availability+2.777
Neg logit contributions
ittal-4.576
��-4.564
Reviewer-4.335
akin-4.074
imony-4.033
Tokenflu
Token ID35522
Feature activation-4.48e+00

Pos logit contributions
coded+6.520
etting+6.467
 availability+6.054
population+5.880
 toxicity+5.782
Neg logit contributions
��-15.621
ittal-15.560
Reviewer-15.329
itle-14.366
imony-14.300
Tokenid
Token ID312
Feature activation+2.30e+00

Pos logit contributions
etting+2.407
coded+2.322
population+2.185
 ours+2.155
spawn+2.112
Neg logit contributions
��-4.142
ittal-3.969
Reviewer-3.896
ateur-3.611
imony-3.598
Token or
Token ID393
Feature activation-2.80e+00

Pos logit contributions
ittal+1.763
��+1.724
Reviewer+1.678
akin+1.564
imony+1.553
Neg logit contributions
etting-1.527
coded-1.514
 ours-1.456
population-1.441
 availability-1.419
Token gender
Token ID5279
Feature activation-1.20e+01

Pos logit contributions
coded+1.078
etting+1.078
 ours+1.009
population+0.979
 availability+0.972
Neg logit contributions
ittal-2.913
��-2.847
Reviewer-2.819
akin-2.676
imony-2.666
Tokenque
Token ID4188
Feature activation-1.58e+00

Pos logit contributions
coded+3.579
etting+3.562
population+2.920
 availability+2.829
 toxicity+2.779
Neg logit contributions
��-12.664
ittal-12.558
Reviewer-12.394
imony-12.043
akin-11.737
Token female
Token ID4048
Feature activation-7.02e+00

Pos logit contributions
ittal+0.069
��+0.068
Reviewer+0.066
akin+0.061
imony+0.061
Neg logit contributions
etting-0.049
coded-0.049
 ours-0.047
population-0.046
 availability-0.046
Token pronouns
Token ID43947
Feature activation-2.78e+00

Pos logit contributions
coded+3.963
 ours+3.828
etting+3.811
 availability+3.772
population+3.628
Neg logit contributions
��-5.931
ittal-5.904
Reviewer-5.688
ateur-5.306
imony-5.295
Token.
Token ID13
Feature activation+1.22e+00

Pos logit contributions
etting+1.739
coded+1.706
 ours+1.641
 availability+1.600
population+1.591
Neg logit contributions
ittal-2.248
��-2.242
Reviewer-2.164
imony-2.009
ateur-2.000
Token
Token ID198
Feature activation+1.15e+00

Pos logit contributions
ittal+0.935
��+0.918
Reviewer+0.890
akin+0.829
imony+0.822
Neg logit contributions
etting-0.814
coded-0.808
 ours-0.777
population-0.769
 availability-0.758
Token
Token ID198
Feature activation+1.57e-02

Pos logit contributions
ittal+0.817
��+0.777
Reviewer+0.776
akin+0.721
imony+0.716
Neg logit contributions
etting-0.833
coded-0.820
 ours-0.788
population-0.783
 availability-0.775
TokenGender
Token ID41394
Feature activation-1.75e+01

Pos logit contributions
ittal+0.013
��+0.012
Reviewer+0.012
akin+0.011
imony+0.011
Neg logit contributions
etting-0.010
coded-0.010
 ours-0.009
population-0.009
 availability-0.009
Token Identity
Token ID27207
Feature activation-4.15e+00

Pos logit contributions
coded+9.337
 availability+8.582
etting+8.489
 ours+8.301
population+8.087
Neg logit contributions
��-13.369
ittal-13.080
Reviewer-13.021
imony-12.011
ateur-11.923
Token:
Token ID25
Feature activation-3.18e+00

Pos logit contributions
etting+2.394
coded+2.351
 ours+2.249
 availability+2.195
population+2.176
Neg logit contributions
��-3.578
ittal-3.556
Reviewer-3.423
ateur-3.179
akin-3.166
Token Gender
Token ID20247
Feature activation-1.62e+01

Pos logit contributions
etting+1.999
coded+1.993
 ours+1.924
population+1.894
 availability+1.874
Neg logit contributions
ittal-2.552
��-2.547
Reviewer-2.421
akin-2.243
iband-2.236
Token identity
Token ID5369
Feature activation-2.72e+00

Pos logit contributions
coded+8.328
 availability+7.860
etting+7.749
 ours+7.444
population+7.283
Neg logit contributions
��-13.188
ittal-12.975
Reviewer-12.497
iband-11.759
imony-11.759
Token is
Token ID318
Feature activation-2.70e+00

Pos logit contributions
etting+1.895
coded+1.879
 ours+1.809
 availability+1.802
population+1.775
Neg logit contributions
��-2.008
ittal-1.998
Reviewer-1.927
akin-1.755
imony-1.752
Token now
Token ID783
Feature activation+1.68e+00

Pos logit contributions
ittal+1.884
��+1.832
Reviewer+1.781
akin+1.673
imony+1.665
Neg logit contributions
etting-1.623
coded-1.610
 ours-1.543
population-1.536
 availability-1.507
Token have
Token ID423
Feature activation+5.27e+00

Pos logit contributions
ittal+1.173
��+1.148
Reviewer+1.111
akin+1.027
imony+1.018
Neg logit contributions
etting-1.236
coded-1.226
 ours-1.184
population-1.173
 availability-1.158
Token smoking
Token ID9216
Feature activation+3.12e+00

Pos logit contributions
ittal+3.884
��+3.769
Reviewer+3.718
imony+3.447
akin+3.442
Neg logit contributions
etting-3.606
coded-3.542
population-3.404
 ours-3.346
spawn-3.313
Token-
Token ID12
Feature activation-2.39e-01

Pos logit contributions
ittal+2.132
��+2.057
Reviewer+1.994
akin+1.866
imony+1.862
Neg logit contributions
etting-2.307
coded-2.287
 ours-2.218
population-2.208
 availability-2.154
Tokengun
Token ID7145
Feature activation+8.38e+00

Pos logit contributions
etting+0.173
coded+0.173
population+0.164
 ours+0.164
 availability+0.161
Neg logit contributions
ittal-0.171
��-0.170
Reviewer-0.163
akin-0.148
imony-0.147
Token proof
Token ID6617
Feature activation+1.21e+01

Pos logit contributions
ittal+4.759
Reviewer+4.581
��+4.464
imony+4.260
akin+4.096
Neg logit contributions
etting-6.980
coded-6.833
population-6.707
 ours-6.561
spawn-6.493
Token that
Token ID326
Feature activation+7.32e+00

Pos logit contributions
Reviewer+7.174
ittal+7.146
��+6.631
imony+6.608
akin+6.304
Neg logit contributions
etting-8.914
coded-8.801
population-8.681
spawn-8.381
 availability-8.278
Token top
Token ID1353
Feature activation+4.50e+00

Pos logit contributions
ittal+5.247
Reviewer+5.122
��+5.039
imony+4.760
akin+4.701
Neg logit contributions
etting-4.913
coded-4.900
population-4.693
spawn-4.542
 ours-4.523
Token officials
Token ID2828
Feature activation+6.05e+00

Pos logit contributions
ittal+2.854
��+2.748
Reviewer+2.727
imony+2.484
akin+2.479
Neg logit contributions
etting-3.510
coded-3.489
 ours-3.371
population-3.365
 availability-3.310
Token in
Token ID287
Feature activation+4.23e+00

Pos logit contributions
ittal+4.343
��+4.247
Reviewer+4.182
akin+3.950
imony+3.948
Neg logit contributions
coded-4.003
etting-3.966
population-3.895
 ours-3.820
 availability-3.743
Token the
Token ID262
Feature activation+7.94e+00

Pos logit contributions
ittal+2.875
��+2.803
Reviewer+2.725
akin+2.526
imony+2.511
Neg logit contributions
etting-3.130
coded-3.109
population-2.987
 ours-2.965
 availability-2.936
Token-
Token ID12
Feature activation+5.13e+00

Pos logit contributions
ittal+6.275
��+5.910
Reviewer+5.763
akin+5.611
imony+5.469
Neg logit contributions
etting-6.749
coded-6.674
 ours-6.542
 availability-6.496
population-6.449
TokenEE
Token ID6500
Feature activation+5.76e+00

Pos logit contributions
ittal+3.658
��+3.545
Reviewer+3.473
akin+3.285
imony+3.192
Neg logit contributions
etting-3.592
coded-3.538
 ours-3.440
population-3.398
 availability-3.397
Token-
Token ID12
Feature activation+2.40e+00

Pos logit contributions
ittal+3.906
Reviewer+3.727
��+3.725
akin+3.474
imony+3.381
Neg logit contributions
etting-4.196
coded-4.153
 ours-4.002
population-3.976
 availability-3.906
Tokenoh
Token ID1219
Feature activation+1.58e+01

Pos logit contributions
ittal+1.778
��+1.740
Reviewer+1.694
akin+1.576
imony+1.558
Neg logit contributions
etting-1.645
coded-1.628
 ours-1.571
population-1.555
 availability-1.538
Token-
Token ID12
Feature activation+6.51e+00

Pos logit contributions
ittal+10.137
Reviewer+9.935
��+9.539
akin+9.465
ateur+9.175
Neg logit contributions
etting-9.871
coded-9.746
 ours-9.308
population-9.131
 availability-9.030
Tokenv
Token ID85
Feature activation+1.29e+01

Pos logit contributions
ittal+5.150
��+5.001
Reviewer+4.981
akin+4.724
imony+4.598
Neg logit contributions
etting-3.917
coded-3.901
 ours-3.771
 availability-3.724
population-3.682
Tokenick
Token ID624
Feature activation+1.58e+01

Pos logit contributions
ittal+8.973
akin+8.282
Reviewer+8.214
��+8.120
ateur+7.896
Neg logit contributions
coded-8.348
 ours-8.145
etting-8.086
 availability-7.933
 toxicity-7.918
Token)
Token ID8
Feature activation+2.02e+00

Pos logit contributions
ittal+9.614
akin+8.914
Reviewer+8.822
��+8.686
ateur+8.498
Neg logit contributions
coded-10.378
etting-10.143
population-9.979
 ours-9.817
 toxicity-9.811
Token The
Token ID383
Feature activation+2.98e+00

Pos logit contributions
ittal+1.515
��+1.484
Reviewer+1.440
akin+1.339
imony+1.329
Neg logit contributions
etting-1.379
coded-1.368
 ours-1.317
population-1.304
 availability-1.286
Token results
Token ID2482
Feature activation+6.22e-01

Pos logit contributions
ittal+2.419
��+2.365
Reviewer+2.312
akin+2.168
imony+2.151
Neg logit contributions
etting-1.824
coded-1.814
 ours-1.730
population-1.718
 availability-1.677
Token support
Token ID1104
Feature activation+3.73e+00

Pos logit contributions
ittal+0.499
��+0.492
Reviewer+0.477
akin+0.445
imony+0.441
Neg logit contributions
etting-0.392
coded-0.388
 ours-0.373
population-0.367
 availability-0.364
Token have
Token ID423
Feature activation-6.33e-01

Pos logit contributions
ittal+0.307
��+0.303
Reviewer+0.294
akin+0.272
imony+0.270
Neg logit contributions
etting-0.261
coded-0.259
 ours-0.250
population-0.245
 availability-0.244
Token little
Token ID1310
Feature activation+5.57e+00

Pos logit contributions
coded+0.405
etting+0.404
 ours+0.394
population+0.383
 availability+0.382
Neg logit contributions
ittal-0.498
��-0.492
Reviewer-0.476
akin-0.442
imony-0.439
Token in
Token ID287
Feature activation-6.43e-01

Pos logit contributions
ittal+4.208
��+4.045
Reviewer+3.952
akin+3.752
imony+3.730
Neg logit contributions
etting-3.691
coded-3.596
population-3.470
 ours-3.369
spawn-3.361
Token common
Token ID2219
Feature activation+4.30e+00

Pos logit contributions
etting+0.282
coded+0.279
 ours+0.266
population+0.258
 availability+0.253
Neg logit contributions
ittal-0.642
��-0.630
Reviewer-0.615
akin-0.581
imony-0.579
Token at
Token ID379
Feature activation+9.22e+00

Pos logit contributions
ittal+2.876
��+2.790
Reviewer+2.710
akin+2.508
imony+2.491
Neg logit contributions
etting-3.243
coded-3.224
population-3.095
 ours-3.090
 availability-3.047
Token first
Token ID717
Feature activation+1.07e+01

Pos logit contributions
ittal+6.273
��+6.172
Reviewer+5.833
akin+5.731
imony+5.610
Neg logit contributions
etting-6.409
coded-6.234
population-5.912
spawn-5.853
 ours-5.848
Token except
Token ID2845
Feature activation+2.31e+00

Pos logit contributions
ittal+6.602
Reviewer+6.026
��+6.015
imony+5.628
akin+5.610
Neg logit contributions
etting-8.115
coded-7.997
population-7.770
 ours-7.543
spawn-7.473
Token their
Token ID511
Feature activation-2.22e+00

Pos logit contributions
ittal+1.858
��+1.819
Reviewer+1.772
akin+1.658
imony+1.649
Neg logit contributions
etting-1.452
coded-1.439
 ours-1.374
population-1.366
 availability-1.340
Token love
Token ID1842
Feature activation+1.66e+00

Pos logit contributions
etting+1.359
coded+1.356
 ours+1.324
 availability+1.302
 toxicity+1.262
Neg logit contributions
ittal-1.818
��-1.812
Reviewer-1.752
akin-1.612
imony-1.600
Token of
Token ID286
Feature activation+2.46e+00

Pos logit contributions
ittal+1.347
��+1.321
Reviewer+1.285
akin+1.202
imony+1.194
Neg logit contributions
etting-1.028
coded-1.018
 ours-0.976
population-0.966
 availability-0.951
Token music
Token ID2647
Feature activation+5.32e+00

Pos logit contributions
ittal+1.965
��+1.924
Reviewer+1.872
akin+1.760
imony+1.743
Neg logit contributions
etting-1.542
coded-1.528
population-1.454
 ours-1.452
 availability-1.421
Token the
Token ID262
Feature activation+4.80e+00

Pos logit contributions
ittal+2.323
��+2.271
Reviewer+2.202
akin+2.043
imony+2.027
Neg logit contributions
etting-2.283
coded-2.266
 ours-2.183
population-2.163
 availability-2.134
Token policy
Token ID2450
Feature activation+2.48e+00

Pos logit contributions
ittal+3.644
��+3.496
Reviewer+3.432
akin+3.233
imony+3.196
Neg logit contributions
etting-3.165
coded-3.146
 ours-3.011
population-2.995
spawn-2.925
Token is
Token ID318
Feature activation-3.38e+00

Pos logit contributions
ittal+1.848
��+1.800
Reviewer+1.753
akin+1.632
imony+1.619
Neg logit contributions
etting-1.705
coded-1.691
 ours-1.625
population-1.613
 availability-1.583
Token to
Token ID284
Feature activation+3.70e+00

Pos logit contributions
etting+1.890
coded+1.860
 ours+1.825
 availability+1.817
population+1.751
Neg logit contributions
��-2.917
ittal-2.894
Reviewer-2.790
ateur-2.616
imony-2.598
Token direct
Token ID1277
Feature activation+2.93e+00

Pos logit contributions
ittal+3.053
��+2.935
Reviewer+2.872
akin+2.730
ateur+2.703
Neg logit contributions
coded-2.195
etting-2.191
population-2.093
 ours-2.053
spawn-2.001
Token investment
Token ID4896
Feature activation+1.22e+01

Pos logit contributions
ittal+2.029
��+1.976
Reviewer+1.921
akin+1.775
imony+1.760
Neg logit contributions
etting-2.167
coded-2.149
 ours-2.070
population-2.054
 availability-2.020
Token dollars
Token ID5054
Feature activation-1.66e+00

Pos logit contributions
ittal+8.192
Reviewer+7.754
��+7.224
akin+7.200
imony+7.122
Neg logit contributions
etting-8.369
coded-8.260
population-7.800
 ours-7.690
spawn-7.551
Token into
Token ID656
Feature activation-1.22e+00

Pos logit contributions
etting+1.050
coded+1.039
 ours+1.006
 availability+0.985
population+0.982
Neg logit contributions
��-1.323
ittal-1.322
Reviewer-1.266
imony-1.170
akin-1.161
Token the
Token ID262
Feature activation+6.19e-01

Pos logit contributions
etting+0.803
coded+0.792
 ours+0.780
 availability+0.768
population+0.755
Neg logit contributions
��-0.948
ittal-0.939
Reviewer-0.894
imony-0.828
akin-0.824
Token construction
Token ID5103
Feature activation+5.21e+00

Pos logit contributions
ittal+0.477
��+0.472
Reviewer+0.455
akin+0.423
imony+0.421
Neg logit contributions
etting-0.411
coded-0.406
 ours-0.393
population-0.388
 availability-0.387
Token of
Token ID286
Feature activation+4.77e+00

Pos logit contributions
ittal+3.482
��+3.222
Reviewer+3.209
akin+3.034
imony+2.962
Neg logit contributions
etting-3.917
coded-3.910
population-3.745
 ours-3.744
spawn-3.644
Token M
Token ID337
Feature activation+3.27e+00

Pos logit contributions
ittal+1.823
��+1.786
Reviewer+1.736
akin+1.633
imony+1.609
Neg logit contributions
etting-1.527
coded-1.507
 ours-1.440
population-1.431
 availability-1.405
Tokenumin
Token ID7230
Feature activation+1.04e+01

Pos logit contributions
ittal+2.530
��+2.470
Reviewer+2.403
akin+2.251
imony+2.224
Neg logit contributions
etting-2.140
coded-2.130
 ours-2.046
population-2.021
 availability-1.995
Tokenv
Token ID85
Feature activation+1.03e+01

Pos logit contributions
ittal+7.573
��+7.264
Reviewer+6.993
akin+6.917
imony+6.782
Neg logit contributions
coded-6.753
etting-6.703
 ours-6.460
population-6.333
 toxicity-6.258
Tokenä
Token ID11033
Feature activation+2.66e+00

Pos logit contributions
ittal+7.360
Reviewer+6.717
��+6.663
akin+6.645
imony+6.447
Neg logit contributions
coded-7.006
 ours-6.740
etting-6.733
 availability-6.510
population-6.487
Tokenr
Token ID81
Feature activation+1.13e+01

Pos logit contributions
ittal+2.031
��+1.977
Reviewer+1.927
akin+1.803
imony+1.785
Neg logit contributions
etting-1.775
coded-1.770
 ours-1.697
population-1.677
 availability-1.658
Tokenld
Token ID335
Feature activation+1.08e+01

Pos logit contributions
ittal+8.835
Reviewer+8.287
��+8.161
akin+8.133
imony+7.906
Neg logit contributions
coded-6.525
etting-6.478
 ours-6.221
 toxicity-6.077
 brightest-6.011
Tokenen
Token ID268
Feature activation+7.32e+00

Pos logit contributions
ittal+7.673
��+7.208
Reviewer+7.077
akin+7.031
imony+6.869
Neg logit contributions
coded-7.108
etting-6.879
 ours-6.803
population-6.665
 availability-6.648
Token in
Token ID287
Feature activation+1.12e+00

Pos logit contributions
ittal+5.588
��+5.486
Reviewer+5.304
akin+5.115
imony+5.001
Neg logit contributions
etting-4.576
coded-4.572
 ours-4.355
population-4.316
 availability-4.254
Token Swedish
Token ID14023
Feature activation+1.17e+01

Pos logit contributions
ittal+0.877
��+0.861
Reviewer+0.836
akin+0.780
imony+0.775
Neg logit contributions
etting-0.726
coded-0.721
 ours-0.693
population-0.684
 availability-0.675
Token)
Token ID8
Feature activation-6.16e-02

Pos logit contributions
ittal+6.220
��+6.103
Reviewer+5.812
akin+5.464
ateur+5.380
Neg logit contributions
etting-9.364
coded-9.139
population-8.930
 ours-8.787
 toxicity-8.689
Token is
Token ID318
Feature activation-2.86e-01

Pos logit contributions
etting+0.045
coded+0.045
population+0.043
 ours+0.043
 availability+0.043
Neg logit contributions
ittal-0.043
��-0.043
Reviewer-0.041
akin-0.038
imony-0.038
Token tells
Token ID4952
Feature activation+8.36e+00

Pos logit contributions
ittal+4.876
��+4.608
Reviewer+4.549
akin+4.395
imony+4.285
Neg logit contributions
coded-5.008
etting-4.985
population-4.729
 ours-4.679
 availability-4.588
Token us
Token ID514
Feature activation+1.39e+01

Pos logit contributions
ittal+8.491
Reviewer+8.302
��+8.200
akin+7.880
imony+7.693
Neg logit contributions
etting-3.195
coded-3.107
population-2.811
spawn-2.742
 ours-2.726
Token that
Token ID326
Feature activation+2.41e+00

Pos logit contributions
ittal+8.701
Reviewer+8.349
��+7.866
akin+7.789
atio+7.627
Neg logit contributions
etting-9.666
coded-9.465
spawn-8.866
population-8.762
portion-8.752
Token rocky
Token ID26898
Feature activation+2.27e-01

Pos logit contributions
ittal+1.958
��+1.914
Reviewer+1.876
akin+1.758
imony+1.741
Neg logit contributions
etting-1.479
coded-1.468
 ours-1.388
population-1.382
 availability-1.360
Token planets
Token ID14705
Feature activation-3.13e+00

Pos logit contributions
ittal+0.177
��+0.175
Reviewer+0.169
akin+0.157
imony+0.155
Neg logit contributions
etting-0.149
coded-0.148
 ours-0.143
population-0.140
 availability-0.139
Token could
Token ID714
Feature activation+1.07e+00

Pos logit contributions
etting+1.970
coded+1.917
 ours+1.900
population+1.846
 availability+1.823
Neg logit contributions
ittal-2.527
��-2.522
Reviewer-2.438
imony-2.214
akin-2.207
Token form
Token ID1296
Feature activation-7.19e-01

Pos logit contributions
ittal+0.809
��+0.793
Reviewer+0.769
akin+0.715
imony+0.710
Neg logit contributions
etting-0.722
coded-0.715
 ours-0.690
population-0.682
 availability-0.673
Token much
Token ID881
Feature activation+1.05e+00

Pos logit contributions
etting+0.459
coded+0.455
 ours+0.448
population+0.436
 availability+0.430
Neg logit contributions
ittal-0.571
��-0.569
Reviewer-0.544
imony-0.503
akin-0.502
Token earlier
Token ID2961
Feature activation-3.19e+00

Pos logit contributions
ittal+0.758
��+0.747
Reviewer+0.716
akin+0.661
imony+0.659
Neg logit contributions
etting-0.753
coded-0.747
 ours-0.726
population-0.715
 availability-0.707
Token than
Token ID621
Feature activation-1.05e+01

Pos logit contributions
coded+2.089
etting+2.089
 ours+2.042
population+1.985
 availability+1.966
Neg logit contributions
��-2.466
ittal-2.464
Reviewer-2.328
ateur-2.164
akin-2.158
Token we
Token ID356
Feature activation-8.61e+00

Pos logit contributions
 ours+5.945
 brightest+5.255
coded+5.214
etting+5.130
population+5.120
Neg logit contributions
��-9.280
ittal-9.086
Reviewer-8.716
ateral-8.060
iband-8.054
Token more
Token ID517
Feature activation+3.44e-02

Pos logit contributions
etting+1.037
coded+1.023
 ours+0.986
 availability+0.969
population+0.968
Neg logit contributions
ittal-1.324
��-1.298
Reviewer-1.255
imony-1.171
akin-1.169
Token in
Token ID287
Feature activation-3.96e-01

Pos logit contributions
ittal+0.026
��+0.025
Reviewer+0.024
akin+0.023
imony+0.023
Neg logit contributions
etting-0.024
coded-0.023
 ours-0.023
population-0.022
 availability-0.022
Token-
Token ID12
Feature activation+2.23e+00

Pos logit contributions
etting+0.265
coded+0.261
 ours+0.255
population+0.249
 availability+0.249
Neg logit contributions
ittal-0.305
��-0.299
Reviewer-0.288
akin-0.269
imony-0.268
Tokenstore
Token ID8095
Feature activation+4.64e+00

Pos logit contributions
ittal+1.682
��+1.644
Reviewer+1.600
akin+1.489
imony+1.476
Neg logit contributions
etting-1.508
coded-1.495
 ours-1.439
population-1.423
 availability-1.405
Token for
Token ID329
Feature activation-3.83e+00

Pos logit contributions
ittal+3.326
��+3.209
Reviewer+3.152
akin+2.951
imony+2.921
Neg logit contributions
coded-3.230
etting-3.229
population-3.107
 ours-3.072
spawn-2.977
Token 2015
Token ID1853
Feature activation+2.72e+00

Pos logit contributions
etting+2.308
 ours+2.269
coded+2.250
 availability+2.240
population+2.145
Neg logit contributions
��-3.148
ittal-3.129
Reviewer-2.984
imony-2.803
akin-2.776
Token as
Token ID355
Feature activation+3.80e+00

Pos logit contributions
ittal+2.024
��+1.960
Reviewer+1.923
akin+1.786
imony+1.775
Neg logit contributions
etting-1.859
coded-1.850
 ours-1.774
population-1.767
 availability-1.724
Token the
Token ID262
Feature activation+1.06e+00

Pos logit contributions
ittal+2.879
��+2.805
Reviewer+2.758
akin+2.563
imony+2.537
Neg logit contributions
etting-2.541
coded-2.519
population-2.393
 ours-2.386
 availability-2.333
Token band
Token ID4097
Feature activation+2.19e+00

Pos logit contributions
ittal+0.792
��+0.776
Reviewer+0.751
akin+0.698
imony+0.694
Neg logit contributions
etting-0.722
coded-0.714
 ours-0.690
population-0.682
 availability-0.677
Token will
Token ID481
Feature activation-1.12e+00

Pos logit contributions
ittal+1.484
��+1.446
Reviewer+1.401
akin+1.294
imony+1.283
Neg logit contributions
etting-1.654
coded-1.642
 ours-1.585
population-1.572
 availability-1.549
Token be
Token ID307
Feature activation-1.36e+00

Pos logit contributions
etting+0.781
coded+0.772
 ours+0.756
 availability+0.743
population+0.735
Neg logit contributions
ittal-0.825
��-0.821
Reviewer-0.788
akin-0.725
imony-0.724
Token own
Token ID898
Feature activation+9.36e-01

Pos logit contributions
ittal+0.465
��+0.458
Reviewer+0.441
akin+0.412
imony+0.409
Neg logit contributions
etting-0.378
coded-0.376
 ours-0.361
 availability-0.354
population-0.354
Token history
Token ID2106
Feature activation+6.01e+00

Pos logit contributions
ittal+0.742
��+0.730
Reviewer+0.706
akin+0.659
imony+0.654
Neg logit contributions
etting-0.600
coded-0.596
 ours-0.572
population-0.564
 availability-0.560
Token
Token ID960
Feature activation+2.79e+00

Pos logit contributions
ittal+4.244
��+4.114
Reviewer+4.068
akin+3.759
imony+3.752
Neg logit contributions
etting-4.183
coded-4.127
population-4.021
 ours-3.992
 availability-3.898
Tokenthe
Token ID1169
Feature activation+4.30e+00

Pos logit contributions
ittal+1.904
��+1.859
Reviewer+1.802
akin+1.666
imony+1.649
Neg logit contributions
etting-2.085
coded-2.067
 ours-1.998
population-1.979
 availability-1.957
Token tape
Token ID9154
Feature activation+1.43e+00

Pos logit contributions
ittal+3.486
��+3.408
Reviewer+3.338
akin+3.147
imony+3.115
Neg logit contributions
etting-2.591
coded-2.559
population-2.453
 ours-2.446
spawn-2.374
Token his
Token ID465
Feature activation+2.37e-01

Pos logit contributions
ittal+1.229
��+1.208
Reviewer+1.176
akin+1.104
imony+1.097
Neg logit contributions
etting-0.824
coded-0.816
 ours-0.781
population-0.770
 availability-0.759
Tokens
Token ID82
Feature activation+2.26e+00

Pos logit contributions
ittal+0.195
��+0.192
Reviewer+0.186
akin+0.174
imony+0.173
Neg logit contributions
etting-0.144
coded-0.142
 ours-0.137
population-0.135
 availability-0.133
Token is
Token ID318
Feature activation+7.37e-01

Pos logit contributions
ittal+1.686
��+1.650
Reviewer+1.599
akin+1.491
imony+1.481
Neg logit contributions
etting-1.550
coded-1.538
 ours-1.480
population-1.468
 availability-1.446
Token just
Token ID655
Feature activation+4.93e+00

Pos logit contributions
ittal+0.583
��+0.572
Reviewer+0.554
akin+0.517
imony+0.514
Neg logit contributions
etting-0.473
coded-0.470
 ours-0.453
population-0.445
 availability-0.440
Token as
Token ID355
Feature activation+2.84e+00

Pos logit contributions
ittal+3.634
��+3.560
Reviewer+3.520
akin+3.283
imony+3.257
Neg logit contributions
etting-3.315
coded-3.273
population-3.147
 ours-3.125
 availability-3.062
Token much
Token ID881
Feature activation-8.90e-01

Pos logit contributions
ittal+2.346
��+2.291
Reviewer+2.241
akin+2.109
imony+2.085
Neg logit contributions
etting-1.722
coded-1.691
population-1.621
 ours-1.607
 availability-1.577
Token all
Token ID477
Feature activation+1.09e+01

Pos logit contributions
ittal+3.150
��+2.994
Reviewer+2.961
akin+2.760
imony+2.743
Neg logit contributions
etting-2.975
coded-2.929
population-2.801
 ours-2.770
spawn-2.744
Token the
Token ID262
Feature activation+4.81e+00

Pos logit contributions
ittal+7.010
��+6.614
Reviewer+6.430
imony+6.045
ateral+6.028
Neg logit contributions
etting-8.226
coded-7.800
population-7.694
spawn-7.413
 availability-7.190
Token way
Token ID835
Feature activation+1.40e+00

Pos logit contributions
ittal+2.928
��+2.823
Reviewer+2.745
akin+2.513
imony+2.483
Neg logit contributions
etting-3.922
coded-3.851
 ours-3.738
population-3.737
 availability-3.659
Token and
Token ID290
Feature activation+4.19e+00

Pos logit contributions
ittal+1.005
��+0.982
Reviewer+0.952
akin+0.882
imony+0.876
Neg logit contributions
etting-1.003
coded-0.995
 ours-0.959
population-0.951
 availability-0.938
Token they
Token ID484
Feature activation-2.79e+00

Pos logit contributions
ittal+2.991
��+2.857
Reviewer+2.782
akin+2.617
imony+2.585
Neg logit contributions
etting-3.005
coded-2.970
population-2.858
 ours-2.782
 availability-2.761
Token came
Token ID1625
Feature activation+3.26e+00

Pos logit contributions
etting+1.800
coded+1.789
 ours+1.735
 availability+1.689
population+1.686
Neg logit contributions
��-2.181
ittal-2.162
Reviewer-2.109
akin-1.921
imony-1.910
Token up
Token ID510
Feature activation+3.45e+00

Pos logit contributions
ittal+2.413
��+2.323
Reviewer+2.293
akin+2.145
imony+2.117
Neg logit contributions
etting-2.230
coded-2.212
population-2.115
 ours-2.089
 availability-2.059
Token short
Token ID1790
Feature activation+2.41e-01

Pos logit contributions
ittal+2.440
��+2.353
Reviewer+2.298
akin+2.137
imony+2.115
Neg logit contributions
etting-2.494
coded-2.471
 ours-2.365
population-2.363
 availability-2.321
Token.
Token ID13
Feature activation+3.63e+00

Pos logit contributions
ittal+0.192
��+0.190
Reviewer+0.184
akin+0.171
imony+0.170
Neg logit contributions
etting-0.153
coded-0.152
 ours-0.146
population-0.144
 availability-0.143
Token
Token ID198
Feature activation+2.24e+00

Pos logit contributions
ittal+2.561
��+2.482
Reviewer+2.432
akin+2.247
imony+2.246
Neg logit contributions
etting-2.625
coded-2.598
 ours-2.494
population-2.479
 availability-2.450
Token
Token ID198
Feature activation+2.52e+00

Pos logit contributions
ittal+1.541
Reviewer+1.463
��+1.439
akin+1.360
imony+1.353
Neg logit contributions
etting-1.655
coded-1.624
 ours-1.562
population-1.552
 availability-1.542
Token in
Token ID287
Feature activation-2.15e+00

Pos logit contributions
etting+0.359
coded+0.355
 ours+0.344
population+0.341
 availability+0.335
Neg logit contributions
ittal-0.417
��-0.414
Reviewer-0.396
imony-0.369
akin-0.368
Token the
Token ID262
Feature activation-5.22e-01

Pos logit contributions
etting+1.339
coded+1.326
 ours+1.301
population+1.278
 availability+1.261
Neg logit contributions
ittal-1.728
��-1.711
Reviewer-1.676
imony-1.535
akin-1.525
Token world
Token ID995
Feature activation-1.69e+00

Pos logit contributions
etting+0.359
coded+0.356
 ours+0.345
population+0.339
 availability+0.339
Neg logit contributions
ittal-0.389
��-0.382
Reviewer-0.372
akin-0.343
imony-0.342
Token
Token ID447
Feature activation-2.12e+00

Pos logit contributions
etting+1.144
coded+1.143
 ours+1.112
population+1.105
 availability+1.090
Neg logit contributions
ittal-1.259
��-1.259
Reviewer-1.187
imony-1.119
akin-1.110
Token
Token ID247
Feature activation-4.13e+00

Pos logit contributions
etting+1.517
coded+1.515
population+1.466
 ours+1.454
 availability+1.442
Neg logit contributions
ittal-1.488
��-1.438
Reviewer-1.414
akin-1.283
imony-1.282
Tokens
Token ID82
Feature activation+3.65e+00

Pos logit contributions
etting+2.815
coded+2.767
 ours+2.719
population+2.693
 availability+2.619
Neg logit contributions
��-3.099
ittal-3.066
Reviewer-2.877
imony-2.662
ateur-2.654
Token worst
Token ID5290
Feature activation-6.11e+00

Pos logit contributions
ittal+2.840
��+2.765
Reviewer+2.696
akin+2.524
imony+2.502
Neg logit contributions
etting-2.369
coded-2.352
 ours-2.251
population-2.222
 availability-2.189
Token nuclear
Token ID4523
Feature activation-4.69e+00

Pos logit contributions
etting+3.442
coded+3.283
 toxicity+3.201
 ours+3.182
 availability+3.157
Neg logit contributions
��-5.446
ittal-5.286
Reviewer-5.189
akin-4.795
iband-4.713
Token disaster
Token ID9336
Feature activation-2.49e+00

Pos logit contributions
etting+2.267
coded+2.103
 ours+2.071
 availability+2.051
 toxicity+2.046
Neg logit contributions
��-4.576
ittal-4.471
Reviewer-4.397
akin-4.099
iband-4.072
Token.
Token ID13
Feature activation-4.80e+00

Pos logit contributions
etting+1.814
coded+1.789
 ours+1.737
population+1.729
 availability+1.697
Neg logit contributions
ittal-1.744
��-1.724
Reviewer-1.649
akin-1.507
imony-1.493
Token At
Token ID1629
Feature activation+3.23e+00

Pos logit contributions
coded+3.015
etting+3.003
 ours+2.926
population+2.918
spawn+2.810
Neg logit contributions
��-3.863
ittal-3.788
Reviewer-3.474
akin-3.310
ateur-3.271
Token ideas
Token ID4213
Feature activation-1.55e+00

Pos logit contributions
etting+2.162
coded+2.155
 availability+2.114
 ours+2.097
population+2.068
Neg logit contributions
��-2.722
ittal-2.667
Reviewer-2.562
akin-2.408
ateur-2.346
Token on
Token ID319
Feature activation-2.81e+00

Pos logit contributions
etting+1.056
coded+1.037
 ours+1.014
 availability+0.988
population+0.983
Neg logit contributions
��-1.172
ittal-1.164
Reviewer-1.125
akin-1.038
ateur-1.029
Token humanity
Token ID9265
Feature activation-1.61e-01

Pos logit contributions
etting+1.814
 ours+1.787
coded+1.767
 availability+1.737
population+1.728
Neg logit contributions
��-2.235
ittal-2.227
Reviewer-2.107
akin-1.951
imony-1.935
Token regarding
Token ID5115
Feature activation-6.71e+00

Pos logit contributions
etting+0.110
coded+0.108
 ours+0.106
population+0.104
 availability+0.104
Neg logit contributions
ittal-0.121
��-0.120
Reviewer-0.115
akin-0.107
imony-0.105
Token a
Token ID257
Feature activation-3.15e+00

Pos logit contributions
 ours+4.085
etting+4.015
 availability+3.966
population+3.954
coded+3.885
Neg logit contributions
��-5.581
ittal-5.505
Reviewer-5.247
akin-4.780
ateur-4.747
Token basic
Token ID4096
Feature activation-4.51e+00

Pos logit contributions
etting+1.971
coded+1.951
 ours+1.922
population+1.895
 availability+1.852
Neg logit contributions
ittal-2.540
��-2.536
Reviewer-2.424
imony-2.244
akin-2.241
Token quality
Token ID3081
Feature activation-3.76e+00

Pos logit contributions
coded+2.477
etting+2.462
 ours+2.444
 availability+2.411
population+2.385
Neg logit contributions
��-3.971
ittal-3.907
Reviewer-3.816
ateur-3.516
akin-3.496
Token of
Token ID286
Feature activation-1.43e+00

Pos logit contributions
etting+2.232
 ours+2.148
coded+2.145
 availability+2.065
population+2.056
Neg logit contributions
ittal-3.167
��-3.150
Reviewer-3.022
ateur-2.820
akin-2.796
Token life
Token ID1204
Feature activation+5.77e-01

Pos logit contributions
etting+0.780
coded+0.769
 ours+0.757
population+0.737
 availability+0.732
Neg logit contributions
ittal-1.267
��-1.254
Reviewer-1.212
akin-1.134
imony-1.128
Token as
Token ID355
Feature activation+1.68e-02

Pos logit contributions
ittal+0.446
��+0.439
Reviewer+0.425
akin+0.395
imony+0.392
Neg logit contributions
etting-0.382
coded-0.378
 ours-0.366
population-0.360
 availability-0.357
Token a
Token ID257
Feature activation-8.04e-01

Pos logit contributions
ittal+0.013
��+0.013
Reviewer+0.013
akin+0.012
imony+0.012
Neg logit contributions
etting-0.011
coded-0.011
 ours-0.010
population-0.010
 availability-0.010
TokenClinton
Token ID16549
Feature activation-2.12e+00

Pos logit contributions
coded+1.419
etting+1.416
population+1.367
 ours+1.343
spawn+1.324
Neg logit contributions
��-1.727
ittal-1.708
Reviewer-1.563
ateur-1.484
akin-1.480
Token mentioned
Token ID4750
Feature activation-3.78e+00

Pos logit contributions
coded+1.363
etting+1.362
 ours+1.320
 availability+1.300
population+1.285
Neg logit contributions
��-1.675
ittal-1.655
Reviewer-1.614
ateur-1.462
akin-1.462
Token the
Token ID262
Feature activation-6.59e+00

Pos logit contributions
etting+2.352
coded+2.326
 ours+2.315
 availability+2.299
population+2.279
Neg logit contributions
ittal-3.020
��-3.014
Reviewer-2.909
akin-2.655
ateur-2.646
Token recent
Token ID2274
Feature activation-2.50e+00

Pos logit contributions
etting+3.951
 availability+3.933
coded+3.893
 ours+3.829
population+3.791
Neg logit contributions
��-5.443
ittal-5.386
Reviewer-5.257
ateur-4.740
akin-4.733
Token debates
Token ID15389
Feature activation-3.75e+00

Pos logit contributions
etting+1.353
coded+1.334
 ours+1.292
 availability+1.282
population+1.262
Neg logit contributions
ittal-2.226
��-2.225
Reviewer-2.168
akin-2.007
iband-1.993
Token with
Token ID351
Feature activation-2.67e+00

Pos logit contributions
etting+2.377
coded+2.315
 ours+2.271
 availability+2.217
population+2.207
Neg logit contributions
ittal-2.990
��-2.989
Reviewer-2.872
ateur-2.646
ateral-2.630
Token Republican
Token ID3415
Feature activation-2.43e+00

Pos logit contributions
etting+1.550
coded+1.544
 ours+1.522
 availability+1.476
population+1.467
Neg logit contributions
��-2.242
ittal-2.241
Reviewer-2.150
iband-1.995
akin-1.987
Token nominee
Token ID10429
Feature activation-1.27e+00

Pos logit contributions
etting+1.320
coded+1.297
 ours+1.283
 availability+1.236
population+1.221
Neg logit contributions
��-2.144
ittal-2.137
Reviewer-2.048
imony-1.911
akin-1.909
Token Donald
Token ID3759
Feature activation-5.51e+00

Pos logit contributions
etting+0.659
coded+0.654
 ours+0.640
population+0.615
 availability+0.613
Neg logit contributions
��-1.161
ittal-1.159
Reviewer-1.124
akin-1.048
imony-1.044
Token Trump
Token ID1301
Feature activation-3.48e+00

Pos logit contributions
etting+2.942
coded+2.887
 ours+2.861
population+2.697
 availability+2.692
Neg logit contributions
��-5.048
ittal-4.836
Reviewer-4.804
ateur-4.433
iband-4.384
Token Donald
Token ID3759
Feature activation-5.40e+00

Pos logit contributions
etting+1.709
coded+1.661
 ours+1.630
 availability+1.563
 toxicity+1.529
Neg logit contributions
ittal-3.262
��-3.229
Reviewer-3.200
ateur-2.941
iband-2.941
Token sheet
Token ID9629
Feature activation-7.38e-03

Pos logit contributions
etting+4.723
 ours+4.606
coded+4.547
 availability+4.530
population+4.500
Neg logit contributions
��-6.133
ittal-5.943
Reviewer-5.708
akin-5.370
ateur-5.309
Token;
Token ID26
Feature activation-3.59e+00

Pos logit contributions
etting+0.005
coded+0.005
 ours+0.005
population+0.005
 availability+0.005
Neg logit contributions
ittal-0.006
��-0.006
Reviewer-0.005
akin-0.005
imony-0.005
Token but
Token ID475
Feature activation-2.16e+00

Pos logit contributions
 ours+2.221
coded+2.206
etting+2.198
 availability+2.127
population+2.124
Neg logit contributions
��-2.935
ittal-2.877
Reviewer-2.718
imony-2.559
ateur-2.534
Token even
Token ID772
Feature activation-4.00e+00

Pos logit contributions
etting+1.351
coded+1.348
 ours+1.347
 availability+1.298
population+1.296
Neg logit contributions
��-1.738
ittal-1.723
Reviewer-1.632
imony-1.531
ateur-1.514
Token in
Token ID287
Feature activation-6.89e+00

Pos logit contributions
 ours+2.457
etting+2.437
coded+2.427
 availability+2.370
population+2.339
Neg logit contributions
��-3.286
ittal-3.199
Reviewer-3.044
imony-2.856
ateur-2.838
Token a
Token ID257
Feature activation-4.50e+00

Pos logit contributions
 ours+3.948
coded+3.847
etting+3.825
 availability+3.716
 brightest+3.681
Neg logit contributions
��-5.842
ittal-5.695
Reviewer-5.548
imony-5.214
iband-5.070
Token land
Token ID1956
Feature activation+5.02e-02

Pos logit contributions
etting+2.707
coded+2.689
 ours+2.668
population+2.597
 availability+2.571
Neg logit contributions
ittal-3.657
��-3.644
Reviewer-3.536
imony-3.310
akin-3.269
Token of
Token ID286
Feature activation-4.47e+00

Pos logit contributions
ittal+0.042
��+0.041
Reviewer+0.040
akin+0.037
imony+0.037
Neg logit contributions
etting-0.030
coded-0.030
 ours-0.028
population-0.028
 availability-0.028
Token glaciers
Token ID40509
Feature activation-6.25e+00

Pos logit contributions
 ours+2.880
etting+2.847
coded+2.827
 availability+2.757
population+2.721
Neg logit contributions
��-3.521
ittal-3.482
Reviewer-3.342
imony-3.095
akin-3.061
Token,
Token ID11
Feature activation-2.37e+00

Pos logit contributions
etting+3.804
 ours+3.772
coded+3.702
 availability+3.618
population+3.577
Neg logit contributions
��-5.113
ittal-5.081
Reviewer-4.709
imony-4.508
ateur-4.420
Token Jak
Token ID25845
Feature activation-1.25e+00

Pos logit contributions
coded+1.461
etting+1.459
 ours+1.456
 availability+1.414
population+1.399
Neg logit contributions
��-1.938
ittal-1.912
Reviewer-1.818
imony-1.703
ateur-1.691
Token
Token ID447
Feature activation+1.69e+00

Pos logit contributions
ittal+0.982
��+0.933
Reviewer+0.932
akin+0.861
imony+0.855
Neg logit contributions
etting-1.062
coded-1.047
 ours-1.012
population-1.002
 availability-0.994
Token
Token ID247
Feature activation-2.95e+00

Pos logit contributions
ittal+1.433
��+1.412
Reviewer+1.359
akin+1.281
imony+1.280
Neg logit contributions
etting-0.994
coded-0.981
 ours-0.932
population-0.917
 availability-0.907
Tokent
Token ID83
Feature activation+2.23e+00

Pos logit contributions
etting+1.890
coded+1.852
 ours+1.790
population+1.775
 availability+1.743
Neg logit contributions
��-2.355
ittal-2.308
Reviewer-2.259
akin-2.051
ateur-2.025
Token have
Token ID423
Feature activation+2.25e+00

Pos logit contributions
ittal+1.769
��+1.710
Reviewer+1.673
akin+1.569
imony+1.561
Neg logit contributions
etting-1.420
coded-1.414
population-1.343
 ours-1.338
 availability-1.306
Token such
Token ID884
Feature activation+1.39e+00

Pos logit contributions
ittal+1.667
��+1.631
Reviewer+1.584
akin+1.472
imony+1.461
Neg logit contributions
etting-1.550
coded-1.537
 ours-1.478
population-1.466
 availability-1.444
Token a
Token ID257
Feature activation-4.76e+00

Pos logit contributions
ittal+0.892
��+0.877
Reviewer+0.843
akin+0.771
imony+0.763
Neg logit contributions
etting-1.103
coded-1.097
 ours-1.065
population-1.052
 availability-1.049
Token contract
Token ID2775
Feature activation+1.23e+00

Pos logit contributions
etting+2.877
coded+2.866
 availability+2.842
 ours+2.838
population+2.683
Neg logit contributions
��-3.908
ittal-3.834
Reviewer-3.716
akin-3.474
imony-3.441
Token with
Token ID351
Feature activation+6.23e+00

Pos logit contributions
ittal+1.002
��+0.984
Reviewer+0.956
akin+0.895
imony+0.889
Neg logit contributions
etting-0.758
coded-0.751
 ours-0.720
population-0.712
 availability-0.702
Token the
Token ID262
Feature activation+4.59e+00

Pos logit contributions
ittal+4.413
��+4.251
Reviewer+4.183
akin+3.878
imony+3.874
Neg logit contributions
etting-4.403
coded-4.325
population-4.141
 ours-4.051
spawn-4.047
Token state
Token ID1181
Feature activation+1.87e+00

Pos logit contributions
ittal+2.736
��+2.602
Reviewer+2.568
akin+2.343
imony+2.339
Neg logit contributions
etting-3.752
coded-3.717
population-3.575
 ours-3.569
spawn-3.519
Token of
Token ID286
Feature activation-5.71e+00

Pos logit contributions
ittal+1.441
��+1.410
Reviewer+1.371
akin+1.278
imony+1.269
Neg logit contributions
etting-1.235
coded-1.223
 ours-1.175
population-1.164
 availability-1.147
Token
Token ID220
Feature activation+1.41e+00

Pos logit contributions
etting+1.883
coded+1.872
 ours+1.795
 availability+1.746
population+1.741
Neg logit contributions
ittal-3.035
��-3.009
Reviewer-2.887
akin-2.698
imony-2.676
Tokenр
Token ID21169
Feature activation-5.42e+00

Pos logit contributions
ittal+1.168
��+1.146
Reviewer+1.115
akin+1.045
imony+1.037
Neg logit contributions
etting-0.852
coded-0.845
 ours-0.810
population-0.799
 availability-0.788
Tokenо
Token ID15166
Feature activation-2.98e+00

Pos logit contributions
coded+2.630
etting+2.621
 ours+2.487
 availability+2.412
population+2.409
Neg logit contributions
ittal-4.997
��-4.941
Reviewer-4.787
akin-4.460
imony-4.449
Tokenс
Token ID21727
Feature activation-4.34e+00

Pos logit contributions
coded+1.775
etting+1.771
 ours+1.720
population+1.669
 availability+1.660
Neg logit contributions
ittal-2.465
��-2.438
Reviewer-2.355
akin-2.191
imony-2.172
Tokenс
Token ID21727
Feature activation-8.12e+00

Pos logit contributions
coded+2.557
etting+2.530
 ours+2.403
population+2.337
 availability+2.326
Neg logit contributions
ittal-3.645
��-3.591
Reviewer-3.456
akin-3.209
ateur-3.201
Tokenи
Token ID18849
Feature activation-7.32e+00

Pos logit contributions
coded+2.990
etting+2.883
 ours+2.624
 availability+2.594
spawn+2.472
Neg logit contributions
��-8.477
ittal-8.261
Reviewer-8.077
ateur-7.624
akin-7.539
Token
Token ID140
Feature activation-3.20e+00

Pos logit contributions
coded+3.804
etting+3.773
 ours+3.562
 availability+3.482
spawn+3.356
Neg logit contributions
��-6.483
ittal-6.418
Reviewer-6.126
akin-5.754
imony-5.734
Token
Token ID117
Feature activation-6.45e+00

Pos logit contributions
coded+2.542
etting+2.504
 ours+2.418
population+2.407
 availability+2.382
Neg logit contributions
ittal-2.026
��-1.990
Reviewer-1.902
imony-1.744
ateral-1.727
Tokenс
Token ID21727
Feature activation-5.51e+00

Pos logit contributions
etting+3.707
coded+3.706
 ours+3.600
 availability+3.545
population+3.453
Neg logit contributions
ittal-5.324
��-5.312
Reviewer-4.990
akin-4.674
imony-4.663
Tokenк
Token ID31583
Feature activation-3.65e+00

Pos logit contributions
coded+2.752
etting+2.621
 ours+2.538
 availability+2.470
population+2.462
Neg logit contributions
ittal-5.084
��-4.978
Reviewer-4.815
akin-4.536
ateur-4.523
Tokenи
Token ID18849
Feature activation-4.43e+00

Pos logit contributions
etting+1.505
coded+1.431
 ours+1.314
 availability+1.295
population+1.282
Neg logit contributions
��-3.753
ittal-3.736
Reviewer-3.641
akin-3.392
ateral-3.368
Token I
Token ID314
Feature activation-2.87e+00

Pos logit contributions
etting+1.617
coded+1.617
 ours+1.579
 availability+1.532
population+1.508
Neg logit contributions
��-2.056
ittal-2.047
Reviewer-1.968
imony-1.817
ateur-1.801
Token added
Token ID2087
Feature activation-8.37e+00

Pos logit contributions
coded+1.760
etting+1.744
 ours+1.669
 availability+1.625
population+1.619
Neg logit contributions
��-2.357
ittal-2.351
Reviewer-2.282
imony-2.088
ateur-2.079
Token a
Token ID257
Feature activation-8.49e+00

Pos logit contributions
 ours+4.982
coded+4.861
etting+4.824
 availability+4.721
 wisely+4.491
Neg logit contributions
��-6.971
ittal-6.708
Reviewer-6.545
imony-6.080
ateur-6.010
Token <
Token ID1279
Feature activation-4.11e+00

Pos logit contributions
coded+5.094
etting+5.070
 ours+5.036
 availability+4.893
 qualifier+4.772
Neg logit contributions
��-6.895
ittal-6.567
Reviewer-6.466
imony-6.061
ateur-5.958
Tokenspan
Token ID12626
Feature activation-8.32e-01

Pos logit contributions
coded+2.508
etting+2.494
 ours+2.384
population+2.381
spawn+2.319
Neg logit contributions
��-3.378
ittal-3.271
Reviewer-3.211
imony-2.968
ateur-2.943
Token>
Token ID29
Feature activation-1.24e+01

Pos logit contributions
etting+0.673
coded+0.669
 ours+0.652
population+0.641
 availability+0.640
Neg logit contributions
ittal-0.516
��-0.513
Reviewer-0.491
akin-0.445
imony-0.444
Token around
Token ID1088
Feature activation-7.24e+00

Pos logit contributions
coded+6.659
etting+6.321
 ours+6.206
 qualifier+6.175
 availability+6.059
Neg logit contributions
��-10.521
ittal-9.908
Reviewer-9.784
ateur-9.125
imony-9.075
Token the
Token ID262
Feature activation-4.93e+00

Pos logit contributions
 ours+4.374
coded+4.323
etting+4.298
 availability+4.139
population+4.085
Neg logit contributions
��-5.820
ittal-5.678
Reviewer-5.483
akin-5.101
iband-5.098
Token phone
Token ID3072
Feature activation-9.04e+00

Pos logit contributions
coded+3.142
etting+3.132
 ours+3.045
 availability+3.000
population+2.971
Neg logit contributions
��-3.932
ittal-3.787
Reviewer-3.673
imony-3.396
akin-3.376
Token name
Token ID1438
Feature activation-1.05e+01

Pos logit contributions
coded+5.169
etting+5.029
 availability+4.904
 ours+4.896
population+4.893
Neg logit contributions
��-7.573
ittal-7.124
Reviewer-6.979
imony-6.574
ateur-6.517
Token.
Token ID13
Feature activation+3.63e-01

Pos logit contributions
coded+5.781
etting+5.596
 qualifier+5.419
 ours+5.412
 availability+5.311
Neg logit contributions
��-9.046
ittal-8.516
Reviewer-8.288
ateur-7.815
imony-7.734
Token over
Token ID625
Feature activation-2.99e+00

Pos logit contributions
etting+1.696
coded+1.679
 ours+1.670
population+1.607
 availability+1.600
Neg logit contributions
��-1.754
ittal-1.742
Reviewer-1.660
akin-1.537
ateur-1.536
Token.
Token ID13
Feature activation+1.23e+00

Pos logit contributions
etting+1.796
coded+1.792
 ours+1.730
population+1.708
 availability+1.671
Neg logit contributions
��-2.484
ittal-2.461
Reviewer-2.374
imony-2.197
akin-2.192
Token\n
Token ID198
Feature activation+4.72e-01

Pos logit contributions
ittal+0.921
��+0.911
Reviewer+0.877
akin+0.814
imony+0.805
Neg logit contributions
etting-0.834
coded-0.828
 ours-0.801
population-0.791
 availability-0.777
Token\n
Token ID198
Feature activation-9.12e-02

Pos logit contributions
ittal+0.344
��+0.332
Reviewer+0.326
akin+0.303
imony+0.301
Neg logit contributions
etting-0.332
coded-0.328
 ours-0.316
population-0.313
 availability-0.310
TokenThe
Token ID464
Feature activation-2.10e+00

Pos logit contributions
etting+0.060
coded+0.060
 ours+0.057
population+0.057
 availability+0.056
Neg logit contributions
ittal-0.071
��-0.070
Reviewer-0.067
akin-0.062
imony-0.062
Token high
Token ID1029
Feature activation-1.16e+01

Pos logit contributions
etting+1.339
coded+1.319
 ours+1.275
population+1.260
 availability+1.259
Neg logit contributions
ittal-1.664
��-1.662
Reviewer-1.604
akin-1.480
imony-1.469
Token stakes
Token ID21147
Feature activation-1.12e+01

Pos logit contributions
 availability+6.785
etting+6.676
coded+6.587
population+6.514
 toxicity+6.448
Neg logit contributions
��-9.725
ittal-8.961
Reviewer-8.612
akin-8.020
itle-7.926
Token can
Token ID460
Feature activation-4.23e+00

Pos logit contributions
etting+6.409
 ours+6.358
coded+6.127
 availability+6.097
 qualifier+5.897
Neg logit contributions
��-9.386
Reviewer-8.941
ittal-8.611
ateur-8.006
iband-7.891
Token be
Token ID307
Feature activation-9.30e+00

Pos logit contributions
etting+2.848
coded+2.810
 ours+2.764
 availability+2.667
 wisely+2.633
Neg logit contributions
��-3.165
ittal-3.110
Reviewer-3.075
imony-2.777
iband-2.762
Token seen
Token ID1775
Feature activation-5.88e+00

Pos logit contributions
 ours+6.040
coded+5.818
etting+5.748
 daunting+5.585
 wisely+5.529
Neg logit contributions
��-7.415
ittal-6.958
Reviewer-6.792
iband-6.394
ateur-6.245
Token in
Token ID287
Feature activation-3.73e+00

Pos logit contributions
etting+3.211
coded+3.207
 ours+3.111
 brightest+3.045
 wisely+3.002
Neg logit contributions
��-5.130
ittal-5.049
Reviewer-5.004
ateur-4.595
imony-4.556
Token links
Token ID6117
Feature activation-1.49e+00

Pos logit contributions
etting+0.963
coded+0.954
 ours+0.943
 availability+0.912
population+0.904
Neg logit contributions
ittal-1.342
��-1.328
Reviewer-1.287
akin-1.198
imony-1.194
Token to
Token ID284
Feature activation-4.49e+00

Pos logit contributions
etting+0.834
coded+0.829
 ours+0.792
 availability+0.766
population+0.763
Neg logit contributions
ittal-1.303
��-1.298
Reviewer-1.250
imony-1.170
akin-1.169
Token some
Token ID617
Feature activation-1.76e+00

Pos logit contributions
etting+2.647
 ours+2.647
coded+2.624
 availability+2.578
population+2.477
Neg logit contributions
ittal-3.720
��-3.687
Reviewer-3.539
imony-3.314
akin-3.277
Token of
Token ID286
Feature activation-2.32e+00

Pos logit contributions
etting+1.119
coded+1.100
 ours+1.080
 availability+1.061
population+1.040
Neg logit contributions
ittal-1.409
��-1.401
Reviewer-1.343
imony-1.252
akin-1.241
Token the
Token ID262
Feature activation-3.55e+00

Pos logit contributions
etting+1.194
 ours+1.173
coded+1.163
 availability+1.108
population+1.097
Neg logit contributions
ittal-2.132
��-2.102
Reviewer-2.027
akin-1.923
imony-1.908
Token cos
Token ID8615
Feature activation-1.43e+01

Pos logit contributions
etting+2.265
coded+2.208
 ours+2.178
 availability+2.162
 brightest+2.096
Neg logit contributions
ittal-2.839
��-2.831
Reviewer-2.689
imony-2.518
akin-2.491
Tokenplayers
Token ID32399
Feature activation-6.28e+00

Pos logit contributions
coded+5.931
etting+5.929
spawn+5.658
population+5.569
 availability+5.247
Neg logit contributions
��-12.908
ittal-12.672
Reviewer-12.453
imony-11.655
��-11.585
Token featured
Token ID8096
Feature activation-3.25e+00

Pos logit contributions
etting+3.831
coded+3.709
 ours+3.686
 availability+3.624
population+3.502
Neg logit contributions
��-5.192
ittal-5.134
Reviewer-4.917
ateral-4.505
imony-4.489
Token above
Token ID2029
Feature activation+6.25e-01

Pos logit contributions
etting+1.794
coded+1.767
 ours+1.710
population+1.657
 availability+1.655
Neg logit contributions
ittal-2.860
��-2.852
Reviewer-2.743
akin-2.560
imony-2.543
Token:
Token ID25
Feature activation-1.11e+00

Pos logit contributions
ittal+0.510
��+0.502
Reviewer+0.487
akin+0.455
imony+0.453
Neg logit contributions
etting-0.386
coded-0.383
 ours-0.367
population-0.363
 availability-0.358
Token\n
Token ID198
Feature activation-3.54e-01

Pos logit contributions
etting+0.734
coded+0.726
 ours+0.707
 availability+0.689
population+0.689
Neg logit contributions
ittal-0.848
��-0.830
Reviewer-0.797
imony-0.746
akin-0.743
Token Franklin
Token ID14021
Feature activation-6.41e+00

Pos logit contributions
etting+1.128
coded+1.118
 ours+1.098
population+1.070
 availability+1.067
Neg logit contributions
ittal-1.225
��-1.214
Reviewer-1.169
akin-1.074
ateur-1.064
Token D
Token ID360
Feature activation+6.99e-01

Pos logit contributions
etting+4.196
coded+4.070
 ours+3.935
 availability+3.817
population+3.796
Neg logit contributions
��-5.068
ittal-5.008
Reviewer-4.832
imony-4.424
ateur-4.385
Token.
Token ID13
Feature activation+2.70e-01

Pos logit contributions
ittal+0.546
��+0.528
Reviewer+0.517
akin+0.486
imony+0.481
Neg logit contributions
etting-0.452
coded-0.451
 ours-0.433
population-0.428
 availability-0.423
Token Roosevelt
Token ID19297
Feature activation-4.21e+00

Pos logit contributions
ittal+0.186
��+0.183
Reviewer+0.176
akin+0.161
imony+0.161
Neg logit contributions
etting-0.201
coded-0.199
 ours-0.192
population-0.191
 availability-0.188
Token appointed
Token ID9899
Feature activation-1.17e+01

Pos logit contributions
etting+2.582
coded+2.558
 ours+2.448
 wisely+2.422
 availability+2.404
Neg logit contributions
��-3.423
ittal-3.409
Reviewer-3.324
akin-3.046
ateur-3.022
Token her
Token ID607
Feature activation-1.26e+01

Pos logit contributions
etting+5.352
 wisely+5.351
 ours+5.346
 availability+5.264
coded+5.258
Neg logit contributions
ittal-10.313
��-10.262
Reviewer-9.999
ateur-9.256
akin-9.245
Token to
Token ID284
Feature activation-5.90e+00

Pos logit contributions
 availability+6.123
coded+6.088
 ours+6.054
etting+6.029
 wisely+5.882
Neg logit contributions
��-10.770
ittal-10.561
Reviewer-10.020
ateur-9.488
akin-9.347
Token several
Token ID1811
Feature activation+8.78e-01

Pos logit contributions
etting+3.601
coded+3.590
 availability+3.550
population+3.516
 ours+3.505
Neg logit contributions
ittal-4.705
��-4.650
Reviewer-4.540
akin-4.181
imony-4.054
Token government
Token ID1230
Feature activation+3.80e+00

Pos logit contributions
ittal+0.677
��+0.664
Reviewer+0.644
akin+0.600
imony+0.596
Neg logit contributions
etting-0.582
coded-0.577
 ours-0.554
population-0.549
 availability-0.542
Token positions
Token ID6116
Feature activation-2.13e+00

Pos logit contributions
ittal+2.939
��+2.839
Reviewer+2.789
imony+2.616
akin+2.613
Neg logit contributions
etting-2.419
coded-2.405
 ours-2.309
population-2.297
spawn-2.234
Token,
Token ID11
Feature activation-2.55e+00

Pos logit contributions
etting+1.362
coded+1.327
 ours+1.287
 availability+1.260
population+1.256
Neg logit contributions
ittal-1.703
��-1.693
Reviewer-1.636
akin-1.504
ateur-1.501
Token because
Token ID780
Feature activation+5.27e-01

Pos logit contributions
etting+0.597
coded+0.596
 ours+0.574
population+0.564
 availability+0.558
Neg logit contributions
ittal-0.698
��-0.695
Reviewer-0.667
akin-0.619
ateur-0.615
Token we
Token ID356
Feature activation-7.46e+00

Pos logit contributions
ittal+0.433
��+0.425
Reviewer+0.413
akin+0.387
imony+0.384
Neg logit contributions
etting-0.321
coded-0.318
 ours-0.306
population-0.302
 availability-0.298
Token believe
Token ID1975
Feature activation+1.25e+00

Pos logit contributions
coded+4.396
etting+4.366
 ours+4.190
population+4.140
 availability+4.088
Neg logit contributions
��-5.993
Reviewer-5.914
ittal-5.910
imony-5.232
ateral-5.193
Token that
Token ID326
Feature activation-2.77e+00

Pos logit contributions
ittal+1.024
��+1.004
Reviewer+0.980
akin+0.917
imony+0.911
Neg logit contributions
etting-0.760
coded-0.752
 ours-0.714
population-0.712
 availability-0.698
Token we
Token ID356
Feature activation-6.42e+00

Pos logit contributions
coded+1.658
etting+1.653
 ours+1.646
population+1.587
 availability+1.580
Neg logit contributions
ittal-2.272
��-2.268
Reviewer-2.152
akin-2.025
imony-2.010
Token do
Token ID466
Feature activation-1.18e+01

Pos logit contributions
etting+4.129
coded+4.114
 ours+3.965
population+3.897
 availability+3.845
Neg logit contributions
��-4.906
ittal-4.828
Reviewer-4.768
imony-4.261
akin-4.230
Token.
Token ID13
Feature activation-3.14e+00

Pos logit contributions
 ours+6.317
coded+6.145
etting+6.136
 availability+5.892
population+5.844
Neg logit contributions
ittal-9.412
��-9.379
Reviewer-9.108
akin-8.474
imony-8.427
Token
Token ID447
Feature activation-5.38e+00

Pos logit contributions
coded+1.865
etting+1.855
 ours+1.834
population+1.781
 availability+1.735
Neg logit contributions
��-2.599
ittal-2.595
Reviewer-2.454
akin-2.322
ateur-2.267
Token
Token ID251
Feature activation-3.57e+00

Pos logit contributions
etting+3.085
coded+3.066
 ours+3.007
population+2.934
 availability+2.894
Neg logit contributions
ittal-4.371
Reviewer-4.252
��-4.193
akin-3.940
imony-3.914
Token\n
Token ID198
Feature activation+1.57e+00

Pos logit contributions
etting+2.153
coded+2.142
 ours+2.137
population+2.029
 availability+2.005
Neg logit contributions
ittal-2.895
��-2.895
Reviewer-2.719
akin-2.592
ateur-2.542
Token\n
Token ID198
Feature activation-2.00e-01

Pos logit contributions
ittal+1.104
��+1.048
Reviewer+1.046
akin+0.974
imony+0.967
Neg logit contributions
etting-1.133
coded-1.116
 ours-1.073
population-1.067
 availability-1.056