TOP ACTIVATIONS
MAX = 1.850

of the game . <|endoftext|> Share . Two iconic horror titles
every citizen . Share this article and help to
the years . Share this : Facebook
( CS IC ). <|endoftext|> Share Tweet
ie be ek . <|endoftext|> Share . We found a home
true . But at the same time , under the anti
that would greatly alleviate the pain of the shutdown . Republicans
). <|endoftext|> Share Tweet Pin 8 shares
of common sense . The cardinal lesson is this : prior
But at the same time he realizes he
fall in love with the same woman , the formidable Diana
Spe ech is one of the most
. " However at the same time you don 't want
01 . 29 "] [ Round " 11 "] [ White
themselves , but at the same time each watching how the
State and Trump at the same time ? I
. At the same time , however , they
more amusing that the very same Salt z shrug s off
my phone gr Share this post J
At the same time , another official added

MIN ACTIVATIONS
MIN = -1.740

4 5 6 7 8 void vert ( in out app
a P izz ag ater conspiracy theorist and affiliate of Proud
shalt sm ite every male therein with the edge of the
; } 1 2 3 void vert ( in out app
in the K 12 school libraries , E BS CO
successful comedy writer who qu its his pro dig iously paying
the case by C . I . A . officials were
T OM KING : Well , that 's what makes
and let Will know by clicking the link below to share
ambient D ensity ) const { const float f Mass =
Connection ( Co ax ) under the list of brid ged
Washington Times , LLC . Click here for reprint permission .
data _ full v ) { // Time component const float
of vouchers into nine new school districts , with the potential
Washington Times , LLC . Click here for reprint permission .
individual is equal before and under the law and has the
you were dismissed as a conspiracy theorist . Or if you
returned to the Meg arian school : only the multiple is
less surprised by the long incub ation times , noting that
of a wide - open school voucher program for the state

INTERVAL 0.953 - 1.850
CONTAINS 0.362%

U which even uses the same panel as the Asus .
registered their mobile numbers as part of the My Number ,
ve picked or were dragged into a recent argument about gender
optim ist is of her pract ising her tap - d
? Ol af ?" I stood still . My

INTERVAL 0.055 - 0.953
CONTAINS 42.064%

a rebuilt labor movement : P ret ending some
her r he umat oid arthritis . Trying to keep her
. All in all it was my first all - n
to incite and infl ame viewers . Even the film 's
Achilles heel last season . Hopefully the addition of Tom Gilbert

INTERVAL -0.842 - 0.055
CONTAINS 56.975%

with legendary Michigan State basketball coach Tom I zzo about whether
Department rescued Lav au and his son from the rav ine
for density in the absence of v ort ons . V
to result in an approach to rationality to which feminists should
Clemson Game in 2016 ( Photo : Wayne McG a hee

INTERVAL -1.740 - -0.842
CONTAINS 0.598%

think they re normal that way , but they
Show how to sew a seam : One of the most
it was too early to speculate why the fire had been
here . Launch Kodi and click on the Settings Icon in
to watch us train as school boys on a Thursday night
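
The per-token records that follow fuse each label with its value (for example `Token ID286`, `Feature activation-1.50e-01`, `rocal+0.146`). A minimal sketch of how such lines could be split back into structured records, assuming the layout seen in this dump; `parse_records` and the regular expressions are hypothetical helpers for illustration, not part of the tool that produced this output:

```python
import re

# Patterns for the fused "label+value" lines (layout assumed from this dump).
TOKEN_ID = re.compile(r"^Token ID(\d+)$")
ACTIVATION = re.compile(r"^Feature activation([+-]\d+\.\d+e[+-]\d+)$")
# Logit line: token text (possibly empty) followed by a signed decimal.
LOGIT = re.compile(r"^(.*?)([+-]\d+\.\d+)$")


def parse_records(lines):
    """Group dashboard lines into per-token records (hypothetical helper)."""
    records, current, section = [], None, None
    for raw in lines:
        line = raw.rstrip("\n")
        m_id = TOKEN_ID.match(line)
        if m_id and current is not None and "token_id" not in current:
            # Second line of a record: the vocabulary id.
            current["token_id"] = int(m_id.group(1))
        elif line.startswith("Token"):
            # First line of a record; the token text may be empty or whitespace.
            current = {"token": line[len("Token"):], "pos": [], "neg": []}
            records.append(current)
            section = None
        elif current is None:
            continue  # skip anything before the first record
        elif (m := ACTIVATION.match(line)):
            current["activation"] = float(m.group(1))
        elif line == "Pos logit contributions":
            section = "pos"
        elif line == "Neg logit contributions":
            section = "neg"
        elif section and (m := LOGIT.match(line)):
            current[section].append((m.group(1), float(m.group(2))))
    return records


sample = """Token of
Token ID286
Feature activation-1.50e-01

Pos logit contributions
rocal+0.146
 sadd+0.135
Neg logit contributions
 Ridley-0.117
-0.098
""".splitlines()

records = parse_records(sample)
print(records[0]["token_id"], records[0]["activation"], records[0]["neg"])
```

Leading spaces in token text are kept, since they appear to be part of the token itself. A line such as `-0.098` is read as a contribution whose token text is empty or unprintable.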
Token of
Token ID286
Feature activation-1.50e-01

Pos logit contributions
rocal+0.146
 sadd+0.135
 Signed+0.135
 offending+0.134
OA+0.131
Neg logit contributions
 Ridley-0.117
essen-0.106
-0.098
rus-0.096
essel-0.094
Token the
Token ID262
Feature activation+9.50e-03

Pos logit contributions
 Ridley+0.454
essen+0.412
+0.385
rus+0.378
essel+0.369
Neg logit contributions
rocal-0.463
 sadd-0.430
 Signed-0.428
 offending-0.426
idad-0.413
Token game
Token ID983
Feature activation+2.39e-01

Pos logit contributions
rocal+0.025
 sadd+0.023
 Signed+0.023
 offending+0.022
idad+0.022
Neg logit contributions
 Ridley-0.033
essen-0.031
-0.029
rus-0.028
essel-0.028
Token.
Token ID13
Feature activation+1.91e-01

Pos logit contributions
rocal+0.808
 sadd+0.756
 Signed+0.754
 offending+0.749
idad+0.724
Neg logit contributions
 Ridley-0.650
essen-0.597
-0.546
rus-0.532
essel-0.522
Token<|endoftext|>
Token ID50256
Feature activation-4.33e-01

Pos logit contributions
rocal+0.553
 sadd+0.509
 offending+0.507
 Signed+0.500
idad+0.485
Neg logit contributions
 Ridley-0.621
essen-0.569
-0.535
rus-0.522
essel-0.512
TokenShare
Token ID11649
Feature activation+1.85e+00

Pos logit contributions
 Ridley+1.348
essen+1.271
+1.199
rus+1.183
essel+1.140
Neg logit contributions
rocal-1.274
 offending-1.196
 sadd-1.189
 Signed-1.169
 phr-1.125
Token.
Token ID13
Feature activation-3.38e-01

Pos logit contributions
rocal+5.615
 Signed+5.435
 sadd+5.292
 offending+5.182
OA+5.067
Neg logit contributions
 Ridley-5.301
essen-5.040
-4.653
uton-4.495
asions-4.443
Token Two
Token ID4930
Feature activation+3.26e-01

Pos logit contributions
 Ridley+1.073
essen+0.953
+0.895
rus+0.882
essel+0.859
Neg logit contributions
rocal-1.024
 sadd-0.945
 offending-0.944
 Signed-0.908
idad-0.900
Token iconic
Token ID14133
Feature activation+2.09e-01

Pos logit contributions
rocal+0.936
 sadd+0.866
 Signed+0.864
 offending+0.855
OA+0.822
Neg logit contributions
 Ridley-1.051
essen-0.975
-0.909
rus-0.888
essel-0.877
Token horror
Token ID9961
Feature activation+1.05e-01

Pos logit contributions
rocal+0.623
 Signed+0.573
 sadd+0.572
 offending+0.565
OA+0.547
Neg logit contributions
 Ridley-0.662
essen-0.608
-0.566
rus-0.555
essel-0.546
Token titles
Token ID8714
Feature activation+4.44e-01

Pos logit contributions
rocal+0.327
 Signed+0.303
 sadd+0.300
 offending+0.298
ENCY+0.290
Neg logit contributions
 Ridley-0.318
essen-0.291
-0.271
rus-0.265
essel-0.260
Token every
Token ID790
Feature activation+1.34e-02

Pos logit contributions
rocal+0.172
 Signed+0.161
 sadd+0.161
 offending+0.158
OA+0.154
Neg logit contributions
 Ridley-0.138
essen-0.126
-0.115
rus-0.113
essel-0.110
Token citizen
Token ID9511
Feature activation-1.66e-03

Pos logit contributions
rocal+0.039
 Signed+0.036
 sadd+0.036
 offending+0.035
ENCY+0.034
Neg logit contributions
 Ridley-0.043
essen-0.040
-0.037
rus-0.037
essel-0.036
Token.
Token ID13
Feature activation+1.07e-01

Pos logit contributions
 Ridley+0.005
essen+0.004
+0.004
rus+0.004
essel+0.004
Neg logit contributions
rocal-0.005
 Signed-0.005
 sadd-0.005
 offending-0.005
ENCY-0.005
Token
Token ID198
Feature activation+1.13e-01

Pos logit contributions
rocal+0.314
 sadd+0.289
 offending+0.287
 Signed+0.285
idad+0.274
Neg logit contributions
 Ridley-0.343
essen-0.316
-0.295
rus-0.291
essel-0.285
Token
Token ID198
Feature activation+2.12e-02

Pos logit contributions
rocal+0.307
 sadd+0.283
 offending+0.282
 Signed+0.281
idad+0.267
Neg logit contributions
 Ridley-0.378
essen-0.353
-0.330
rus-0.326
essel-0.319
TokenShare
Token ID11649
Feature activation+1.83e+00

Pos logit contributions
rocal+0.065
 sadd+0.061
 offending+0.060
 Signed+0.060
 phr+0.058
Neg logit contributions
 Ridley-0.064
essen-0.060
-0.056
rus-0.055
essel-0.053
Token this
Token ID428
Feature activation+2.19e-01

Pos logit contributions
rocal+5.802
 Signed+5.530
 sadd+5.494
 offending+5.396
OA+5.252
Neg logit contributions
 Ridley-5.082
essen-4.778
-4.428
uton-4.221
essel-4.207
Token article
Token ID2708
Feature activation+7.43e-01

Pos logit contributions
rocal+0.694
 sadd+0.645
 Signed+0.643
 offending+0.638
OA+0.614
Neg logit contributions
 Ridley-0.644
essen-0.596
-0.551
rus-0.541
essel-0.528
Token and
Token ID290
Feature activation-5.91e-02

Pos logit contributions
rocal+2.258
 sadd+2.127
 Signed+2.124
 offending+2.095
OA+2.013
Neg logit contributions
 Ridley-2.241
essen-2.080
-1.942
rus-1.879
essel-1.856
Token help
Token ID1037
Feature activation+2.18e-01

Pos logit contributions
 Ridley+0.183
essen+0.169
rus+0.155
+0.154
essel+0.151
Neg logit contributions
rocal-0.178
 sadd-0.165
 offending-0.163
 Signed-0.163
idad-0.158
Token to
Token ID284
Feature activation+3.77e-01

Pos logit contributions
rocal+0.643
 sadd+0.590
 Signed+0.588
 offending+0.584
idad+0.563
Neg logit contributions
 Ridley-0.690
essen-0.641
-0.593
rus-0.583
essel-0.574
Token the
Token ID262
Feature activation+1.87e-01

Pos logit contributions
rocal+0.865
 sadd+0.807
 Signed+0.804
 offending+0.794
idad+0.766
Neg logit contributions
 Ridley-0.813
essen-0.746
-0.695
rus-0.682
essel-0.661
Token years
Token ID812
Feature activation+4.18e-01

Pos logit contributions
rocal+0.598
 Signed+0.558
 sadd+0.553
 offending+0.545
OA+0.529
Neg logit contributions
 Ridley-0.547
essen-0.505
-0.467
rus-0.458
essel-0.447
Token.
Token ID13
Feature activation-8.75e-03

Pos logit contributions
rocal+1.403
 sadd+1.314
 Signed+1.313
 offending+1.300
idad+1.257
Neg logit contributions
 Ridley-1.149
essen-1.056
-0.973
rus-0.947
essel-0.927
Token
Token ID198
Feature activation-4.46e-02

Pos logit contributions
 Ridley+0.027
essen+0.025
+0.023
rus+0.023
essel+0.022
Neg logit contributions
rocal-0.027
 sadd-0.025
 offending-0.024
 Signed-0.024
idad-0.023
Token
Token ID198
Feature activation-1.68e-01

Pos logit contributions
 Ridley+0.148
essen+0.137
+0.128
rus+0.127
essel+0.124
Neg logit contributions
rocal-0.124
 offending-0.115
 sadd-0.114
 Signed-0.113
idad-0.108
TokenShare
Token ID11649
Feature activation+1.80e+00

Pos logit contributions
 Ridley+0.508
essen+0.474
+0.445
rus+0.436
essel+0.424
Neg logit contributions
rocal-0.514
 sadd-0.477
 offending-0.477
 Signed-0.473
idad-0.452
Token this
Token ID428
Feature activation+1.82e-01

Pos logit contributions
rocal+5.782
 Signed+5.487
 sadd+5.476
 offending+5.382
OA+5.232
Neg logit contributions
 Ridley-4.908
essen-4.661
-4.253
uton-4.086
essel-4.073
Token:
Token ID25
Feature activation+5.87e-01

Pos logit contributions
rocal+0.578
 sadd+0.537
 Signed+0.537
 offending+0.531
OA+0.512
Neg logit contributions
 Ridley-0.535
essen-0.494
-0.458
rus-0.448
essel-0.438
Token Facebook
Token ID3203
Feature activation+6.42e-01

Pos logit contributions
rocal+1.550
 sadd+1.432
 Signed+1.424
 offending+1.413
idad+1.346
Neg logit contributions
 Ridley-2.024
essen-1.896
-1.779
rus-1.743
essel-1.717
Token
Token ID198
Feature activation+2.66e-01

Pos logit contributions
rocal+1.681
 sadd+1.554
 Signed+1.552
 offending+1.528
OA+1.462
Neg logit contributions
 Ridley-2.218
essen-2.083
-1.958
rus-1.911
essel-1.886
Token
Token ID198
Feature activation+1.74e-01

Pos logit contributions
rocal+0.718
 sadd+0.663
 Signed+0.657
 offending+0.657
idad+0.625
Neg logit contributions
 Ridley-0.906
essen-0.845
-0.792
rus-0.779
essel-0.765
Token (
Token ID357
Feature activation-5.59e-02

Pos logit contributions
 Ridley+0.426
essen+0.393
+0.356
rus+0.354
essel+0.344
Neg logit contributions
rocal-0.486
 sadd-0.453
 offending-0.451
 Signed-0.450
ENCY-0.431
TokenCS
Token ID7902
Feature activation+3.08e-02

Pos logit contributions
 Ridley+0.188
essen+0.177
+0.168
rus+0.165
essel+0.159
Neg logit contributions
rocal-0.151
 sadd-0.140
 offending-0.140
 Signed-0.139
 phr-0.131
TokenIC
Token ID2149
Feature activation+4.47e-02

Pos logit contributions
rocal+0.079
 sadd+0.073
 offending+0.073
 Signed+0.072
 phr+0.069
Neg logit contributions
 Ridley-0.108
essen-0.101
-0.095
rus-0.094
essel-0.091
Token).
Token ID737
Feature activation-2.40e-01

Pos logit contributions
rocal+0.134
 sadd+0.125
 offending+0.124
 Signed+0.124
ENCY+0.117
Neg logit contributions
 Ridley-0.139
essen-0.128
-0.119
rus-0.118
essel-0.114
Token<|endoftext|>
Token ID50256
Feature activation-3.17e-01

Pos logit contributions
 Ridley+0.684
essen+0.622
+0.575
rus+0.571
essel+0.546
Neg logit contributions
rocal-0.793
 sadd-0.742
 offending-0.739
 Signed-0.711
 phr-0.700
TokenShare
Token ID11649
Feature activation+1.76e+00

Pos logit contributions
 Ridley+0.991
essen+0.934
+0.878
rus+0.866
essel+0.838
Neg logit contributions
rocal-0.930
 offending-0.870
 sadd-0.868
 Signed-0.855
 phr-0.821
Token
Token ID198
Feature activation+2.39e-01

Pos logit contributions
rocal+5.431
 Signed+5.241
 sadd+5.115
 offending+5.014
OA+4.906
Neg logit contributions
 Ridley-5.018
essen-4.744
-4.381
uton-4.229
asions-4.182
Token
Token ID198
Feature activation+2.65e-01

Pos logit contributions
rocal+0.651
 sadd+0.601
 offending+0.599
 Signed+0.595
idad+0.567
Neg logit contributions
 Ridley-0.808
essen-0.753
-0.706
rus-0.696
essel-0.682
TokenTweet
Token ID47845
Feature activation+1.71e+00

Pos logit contributions
rocal+0.779
 sadd+0.724
 offending+0.724
 Signed+0.716
 phr+0.684
Neg logit contributions
 Ridley-0.838
essen-0.785
-0.736
rus-0.723
essel-0.704
Token
Token ID198
Feature activation+4.61e-01

Pos logit contributions
rocal+4.634
 Signed+4.363
 sadd+4.293
 offending+4.235
OA+4.081
Neg logit contributions
 Ridley-5.624
essen-5.264
-4.959
rus-4.806
uton-4.760
Token
Token ID198
Feature activation+4.73e-01

Pos logit contributions
rocal+1.209
 sadd+1.118
 Signed+1.114
 offending+1.099
OA+1.051
Neg logit contributions
 Ridley-1.590
essen-1.492
-1.401
rus-1.369
essel-1.351
Tokenie
Token ID494
Feature activation-7.33e-01

Pos logit contributions
rocal+0.925
 sadd+0.868
 Signed+0.866
 offending+0.858
idad+0.828
Neg logit contributions
 Ridley-0.779
essen-0.718
-0.664
rus-0.644
essel-0.632
Tokenbe
Token ID1350
Feature activation-3.43e-01

Pos logit contributions
 Ridley+1.892
essen+1.762
+1.560
rus+1.560
essel+1.537
Neg logit contributions
rocal-2.528
 sadd-2.406
 Signed-2.387
 offending-2.369
ENCY-2.337
Tokenek
Token ID988
Feature activation-1.57e-01

Pos logit contributions
 Ridley+0.954
essen+0.879
+0.802
rus+0.789
essel+0.780
Neg logit contributions
rocal-1.133
 sadd-1.063
 Signed-1.057
 offending-1.052
ENCY-1.021
Token.
Token ID13
Feature activation-6.18e-02

Pos logit contributions
 Ridley+0.426
essen+0.394
+0.358
rus+0.353
essel+0.346
Neg logit contributions
rocal-0.530
 sadd-0.498
 Signed-0.495
 offending-0.493
ENCY-0.477
Token<|endoftext|>
Token ID50256
Feature activation-4.00e-01

Pos logit contributions
 Ridley+0.184
essen+0.169
+0.156
rus+0.154
essel+0.151
Neg logit contributions
rocal-0.194
 sadd-0.181
 offending-0.180
 Signed-0.178
idad-0.171
TokenShare
Token ID11649
Feature activation+1.76e+00

Pos logit contributions
 Ridley+1.244
essen+1.176
+1.108
rus+1.092
essel+1.055
Neg logit contributions
rocal-1.175
 offending-1.100
 sadd-1.097
 Signed-1.078
 phr-1.037
Token.
Token ID13
Feature activation-2.93e-01

Pos logit contributions
rocal+5.403
 Signed+5.209
 sadd+5.087
 offending+4.985
OA+4.874
Neg logit contributions
 Ridley-5.050
essen-4.767
-4.408
uton-4.250
rus-4.207
Token We
Token ID775
Feature activation-4.29e-01

Pos logit contributions
 Ridley+0.927
essen+0.826
+0.776
rus+0.763
essel+0.745
Neg logit contributions
rocal-0.886
 sadd-0.818
 offending-0.817
 Signed-0.788
idad-0.779
Token found
Token ID1043
Feature activation+1.50e-01

Pos logit contributions
 Ridley+1.274
essen+1.169
+1.060
rus+1.058
essel+1.040
Neg logit contributions
rocal-1.361
 offending-1.244
 Signed-1.238
 sadd-1.237
ENCY-1.197
Token a
Token ID257
Feature activation-4.97e-02

Pos logit contributions
rocal+0.402
 sadd+0.368
 Signed+0.366
 offending+0.362
OA+0.349
Neg logit contributions
 Ridley-0.516
essen-0.477
-0.448
rus-0.439
essel-0.433
Token home
Token ID1363
Feature activation-2.19e-01

Pos logit contributions
 Ridley+0.178
essen+0.164
+0.153
rus+0.152
essel+0.149
Neg logit contributions
rocal-0.128
 Signed-0.115
 sadd-0.114
 offending-0.113
ENCY-0.110
Token true
Token ID2081
Feature activation-8.65e-01

Pos logit contributions
 Ridley+0.582
essen+0.539
+0.491
rus+0.489
essel+0.476
Neg logit contributions
rocal-0.560
 Signed-0.515
 sadd-0.505
 offending-0.503
ENCY-0.496
Token.
Token ID13
Feature activation+2.10e-02

Pos logit contributions
 Ridley+2.413
essen+2.221
+2.024
rus+1.996
essel+1.936
Neg logit contributions
rocal-2.864
 Signed-2.651
 sadd-2.609
 offending-2.588
ENCY-2.544
Token But
Token ID887
Feature activation-1.16e-01

Pos logit contributions
rocal+0.063
 sadd+0.059
 offending+0.058
 Signed+0.058
idad+0.056
Neg logit contributions
 Ridley-0.065
essen-0.060
-0.056
rus-0.055
essel-0.054
Token at
Token ID379
Feature activation-3.10e-01

Pos logit contributions
 Ridley+0.334
essen+0.310
+0.282
rus+0.279
essel+0.272
Neg logit contributions
rocal-0.372
 Signed-0.347
 sadd-0.344
 offending-0.340
idad-0.331
Token the
Token ID262
Feature activation+3.99e-02

Pos logit contributions
 Ridley+0.914
essen+0.848
+0.777
rus+0.772
essel+0.750
Neg logit contributions
rocal-0.968
 Signed-0.910
 offending-0.894
 sadd-0.892
OA-0.868
Token same
Token ID976
Feature activation+1.73e+00

Pos logit contributions
rocal+0.101
 Signed+0.091
 sadd+0.090
 offending+0.090
ENCY+0.087
Neg logit contributions
 Ridley-0.142
essen-0.133
-0.125
rus-0.122
essel-0.122
Token time
Token ID640
Feature activation+7.28e-01

Pos logit contributions
rocal+4.772
 sadd+4.547
 Signed+4.478
 offending+4.445
 phr+4.300
Neg logit contributions
 Ridley-5.504
essen-5.135
-4.815
rus-4.693
essel-4.605
Token,
Token ID11
Feature activation+1.91e-02

Pos logit contributions
rocal+2.326
 sadd+2.185
 Signed+2.175
 offending+2.158
OA+2.077
Neg logit contributions
 Ridley-2.091
essen-1.934
-1.794
rus-1.743
essel-1.714
Tokenunder
Token ID4625
Feature activation-1.13e+00

Pos logit contributions
rocal+0.061
 Signed+0.057
 sadd+0.057
 offending+0.056
idad+0.055
Neg logit contributions
 Ridley-0.055
essen-0.051
-0.047
rus-0.046
essel-0.045
Token the
Token ID262
Feature activation-3.07e-01

Pos logit contributions
 Ridley+3.245
essen+3.019
+2.751
rus+2.728
essel+2.639
Neg logit contributions
rocal-3.553
 Signed-3.344
 sadd-3.294
 offending-3.265
ENCY-3.195
Token anti
Token ID3098
Feature activation+3.18e-01

Pos logit contributions
 Ridley+1.061
essen+0.980
+0.920
rus+0.902
essel+0.892
Neg logit contributions
rocal-0.813
 Signed-0.741
 sadd-0.734
 offending-0.726
ENCY-0.706
Token that
Token ID326
Feature activation+2.06e-01

Pos logit contributions
 Ridley+1.531
essen+1.390
+1.291
rus+1.271
essel+1.238
Neg logit contributions
rocal-1.768
 sadd-1.620
 Signed-1.613
 offending-1.610
ENCY-1.564
Token would
Token ID561
Feature activation+1.29e-01

Pos logit contributions
rocal+0.639
 sadd+0.593
 Signed+0.592
 offending+0.587
idad+0.567
Neg logit contributions
 Ridley-0.617
essen-0.571
-0.530
rus-0.518
essel-0.509
Token greatly
Token ID9257
Feature activation+3.43e-01

Pos logit contributions
rocal+0.386
 Signed+0.354
 sadd+0.352
 offending+0.351
idad+0.339
Neg logit contributions
 Ridley-0.401
essen-0.374
-0.348
rus-0.340
essel-0.336
Token alleviate
Token ID30885
Feature activation+7.34e-01

Pos logit contributions
rocal+0.928
 Signed+0.846
 sadd+0.835
 offending+0.831
idad+0.806
Neg logit contributions
 Ridley-1.164
essen-1.092
-1.026
rus-1.005
essel-0.995
Token the
Token ID262
Feature activation+4.57e-01

Pos logit contributions
rocal+2.158
 sadd+2.013
 Signed+2.005
 offending+1.987
idad+1.904
Neg logit contributions
 Ridley-2.300
essen-2.143
-1.999
rus-1.950
essel-1.919
Token pain
Token ID2356
Feature activation+1.72e+00

Pos logit contributions
rocal+1.226
 Signed+1.116
 sadd+1.101
 offending+1.089
OA+1.054
Neg logit contributions
 Ridley-1.578
essen-1.469
-1.376
rus-1.351
essel-1.331
Token of
Token ID286
Feature activation+9.71e-02

Pos logit contributions
rocal+4.775
 sadd+4.609
 offending+4.501
 Signed+4.491
idad+4.280
Neg logit contributions
 Ridley-5.426
essen-5.136
-4.781
rus-4.680
essel-4.615
Token the
Token ID262
Feature activation+2.90e-01

Pos logit contributions
rocal+0.282
 Signed+0.258
 sadd+0.253
 offending+0.250
idad+0.245
Neg logit contributions
 Ridley-0.313
essen-0.289
-0.269
rus-0.265
essel-0.260
Token shutdown
Token ID18325
Feature activation-2.14e-01

Pos logit contributions
rocal+0.812
 Signed+0.741
 sadd+0.736
 offending+0.728
OA+0.705
Neg logit contributions
 Ridley-0.962
essen-0.893
-0.835
rus-0.821
essel-0.808
Token.
Token ID13
Feature activation+2.85e-01

Pos logit contributions
 Ridley+0.576
essen+0.525
+0.484
rus+0.470
essel+0.462
Neg logit contributions
rocal-0.735
 Signed-0.681
 sadd-0.678
 offending-0.674
idad-0.656
Token Republicans
Token ID4734
Feature activation+6.08e-02

Pos logit contributions
rocal+0.835
 sadd+0.774
 offending+0.766
 Signed+0.764
idad+0.735
Neg logit contributions
 Ridley-0.909
essen-0.839
-0.786
rus-0.768
essel-0.755
Token).
Token ID737
Feature activation-2.40e-01

Pos logit contributions
rocal+0.134
 sadd+0.125
 offending+0.124
 Signed+0.124
ENCY+0.117
Neg logit contributions
 Ridley-0.139
essen-0.128
-0.119
rus-0.118
essel-0.114
Token<|endoftext|>
Token ID50256
Feature activation-3.17e-01

Pos logit contributions
 Ridley+0.684
essen+0.622
+0.575
rus+0.571
essel+0.546
Neg logit contributions
rocal-0.793
 sadd-0.742
 offending-0.739
 Signed-0.711
 phr-0.700
TokenShare
Token ID11649
Feature activation+1.76e+00

Pos logit contributions
 Ridley+0.991
essen+0.934
+0.878
rus+0.866
essel+0.838
Neg logit contributions
rocal-0.930
 offending-0.870
 sadd-0.868
 Signed-0.855
 phr-0.821
Token
Token ID198
Feature activation+2.39e-01

Pos logit contributions
rocal+5.431
 Signed+5.241
 sadd+5.115
 offending+5.014
OA+4.906
Neg logit contributions
 Ridley-5.018
essen-4.744
-4.381
uton-4.229
asions-4.182
Token
Token ID198
Feature activation+2.65e-01

Pos logit contributions
rocal+0.651
 sadd+0.601
 offending+0.599
 Signed+0.595
idad+0.567
Neg logit contributions
 Ridley-0.808
essen-0.753
-0.706
rus-0.696
essel-0.682
TokenTweet
Token ID47845
Feature activation+1.71e+00

Pos logit contributions
rocal+0.779
 sadd+0.724
 offending+0.724
 Signed+0.716
 phr+0.684
Neg logit contributions
 Ridley-0.838
essen-0.785
-0.736
rus-0.723
essel-0.704
Token
Token ID198
Feature activation+4.61e-01

Pos logit contributions
rocal+4.634
 Signed+4.363
 sadd+4.293
 offending+4.235
OA+4.081
Neg logit contributions
 Ridley-5.624
essen-5.264
-4.959
rus-4.806
uton-4.760
Token
Token ID198
Feature activation+4.73e-01

Pos logit contributions
rocal+1.209
 sadd+1.118
 Signed+1.114
 offending+1.099
OA+1.051
Neg logit contributions
 Ridley-1.590
essen-1.492
-1.401
rus-1.369
essel-1.351
TokenPin
Token ID28348
Feature activation+5.49e-01

Pos logit contributions
rocal+1.342
 sadd+1.251
 offending+1.244
 Signed+1.236
 phr+1.176
Neg logit contributions
 Ridley-1.536
essen-1.445
-1.358
rus-1.331
essel-1.308
Token 8
Token ID807
Feature activation+2.33e-01

Pos logit contributions
rocal+1.469
 sadd+1.359
 Signed+1.353
 offending+1.340
idad+1.278
Neg logit contributions
 Ridley-1.873
essen-1.754
-1.646
rus-1.610
essel-1.587
Token shares
Token ID7303
Feature activation+5.40e-01

Pos logit contributions
rocal+0.668
 sadd+0.618
 offending+0.613
 Signed+0.613
idad+0.585
Neg logit contributions
 Ridley-0.755
essen-0.697
-0.655
rus-0.644
essel-0.634
Token of
Token ID286
Feature activation-2.40e-01

Pos logit contributions
 Ridley+1.150
essen+1.043
+0.961
rus+0.956
essel+0.925
Neg logit contributions
rocal-1.185
 Signed-1.099
 sadd-1.094
 offending-1.092
ENCY-1.058
Token common
Token ID2219
Feature activation+1.70e-01

Pos logit contributions
 Ridley+0.787
essen+0.734
+0.665
rus+0.663
essel+0.650
Neg logit contributions
rocal-0.675
 Signed-0.624
 sadd-0.608
OA-0.595
 offending-0.589
Token sense
Token ID2565
Feature activation-5.16e-01

Pos logit contributions
rocal+0.488
 Signed+0.454
 sadd+0.449
 offending+0.442
OA+0.430
Neg logit contributions
 Ridley-0.543
essen-0.508
-0.472
rus-0.462
essel-0.458
Token.
Token ID13
Feature activation+5.62e-02

Pos logit contributions
 Ridley+1.302
essen+1.207
+1.061
rus+1.048
essel+1.007
Neg logit contributions
rocal-1.834
 Signed-1.727
 sadd-1.686
 offending-1.654
OA-1.654
Token The
Token ID383
Feature activation-1.78e-02

Pos logit contributions
rocal+0.166
 sadd+0.152
 offending+0.152
 Signed+0.149
idad+0.145
Neg logit contributions
 Ridley-0.181
essen-0.164
-0.154
rus-0.151
essel-0.148
Token cardinal
Token ID38691
Feature activation+1.71e+00

Pos logit contributions
 Ridley+0.063
essen+0.058
+0.054
rus+0.054
essel+0.053
Neg logit contributions
rocal-0.046
 Signed-0.042
 sadd-0.041
 offending-0.040
OA-0.039
Token lesson
Token ID11483
Feature activation+1.07e+00

Pos logit contributions
rocal+4.249
 sadd+4.026
 offending+3.974
 Signed+3.930
 phr+3.750
Neg logit contributions
 Ridley-6.012
essen-5.658
-5.348
rus-5.189
essel-5.127
Token is
Token ID318
Feature activation+1.34e-01

Pos logit contributions
rocal+3.080
 sadd+2.936
 Signed+2.885
 offending+2.883
 phr+2.745
Neg logit contributions
 Ridley-3.338
essen-3.120
-2.949
rus-2.866
essel-2.830
Token this
Token ID428
Feature activation+6.29e-01

Pos logit contributions
rocal+0.418
 Signed+0.387
 sadd+0.382
 offending+0.376
OA+0.370
Neg logit contributions
 Ridley-0.404
essen-0.374
-0.339
rus-0.337
essel-0.329
Token:
Token ID25
Feature activation+5.67e-01

Pos logit contributions
rocal+2.014
 sadd+1.884
 Signed+1.881
 offending+1.862
OA+1.794
Neg logit contributions
 Ridley-1.815
essen-1.679
-1.553
rus-1.513
essel-1.486
Token prior
Token ID3161
Feature activation+9.49e-01

Pos logit contributions
rocal+1.713
 sadd+1.582
 Signed+1.574
 offending+1.559
idad+1.512
Neg logit contributions
 Ridley-1.749
essen-1.627
-1.498
rus-1.475
essel-1.448
Token
Token ID198
Feature activation+1.08e-01

Pos logit contributions
rocal+0.733
 sadd+0.675
 Signed+0.670
 offending+0.670
idad+0.638
Neg logit contributions
 Ridley-0.874
essen-0.811
-0.761
rus-0.745
essel-0.734
Token
Token ID198
Feature activation+9.47e-02

Pos logit contributions
rocal+0.291
 sadd+0.269
 offending+0.268
 Signed+0.267
idad+0.254
Neg logit contributions
 Ridley-0.364
essen-0.340
-0.318
rus-0.313
essel-0.308
TokenBut
Token ID1537
Feature activation+2.76e-01

Pos logit contributions
rocal+0.293
 sadd+0.273
 offending+0.273
 Signed+0.273
 phr+0.259
Neg logit contributions
 Ridley-0.283
essen-0.263
-0.247
rus-0.240
essel-0.235
Token at
Token ID379
Feature activation+5.63e-02

Pos logit contributions
rocal+0.907
 Signed+0.846
 sadd+0.843
 offending+0.835
idad+0.809
Neg logit contributions
 Ridley-0.779
essen-0.717
-0.659
rus-0.645
essel-0.631
Token the
Token ID262
Feature activation+1.33e-01

Pos logit contributions
rocal+0.159
 Signed+0.147
 sadd+0.147
 offending+0.146
idad+0.140
Neg logit contributions
 Ridley-0.184
essen-0.171
-0.159
rus-0.157
essel-0.154
Token same
Token ID976
Feature activation+1.69e+00

Pos logit contributions
rocal+0.338
 Signed+0.308
 sadd+0.305
 offending+0.302
ENCY+0.292
Neg logit contributions
 Ridley-0.476
essen-0.444
-0.416
rus-0.409
essel-0.403
Token time
Token ID640
Feature activation+7.61e-01

Pos logit contributions
rocal+4.578
 sadd+4.372
 offending+4.298
 Signed+4.285
 phr+4.131
Neg logit contributions
 Ridley-5.450
essen-5.076
-4.797
rus-4.670
uton-4.589
Token he
Token ID339
Feature activation+4.94e-01

Pos logit contributions
rocal+2.413
 sadd+2.280
 Signed+2.256
 offending+2.246
OA+2.161
Neg logit contributions
 Ridley-2.196
essen-2.037
-1.891
rus-1.838
essel-1.806
Token realizes
Token ID23660
Feature activation+2.66e-01

Pos logit contributions
rocal+1.592
 sadd+1.486
 Signed+1.485
 offending+1.472
OA+1.418
Neg logit contributions
 Ridley-1.413
essen-1.309
-1.209
rus-1.179
essel-1.159
Token he
Token ID339
Feature activation+5.93e-01

Pos logit contributions
rocal+0.897
 Signed+0.840
 sadd+0.840
 offending+0.831
idad+0.803
Neg logit contributions
 Ridley-0.722
essen-0.664
-0.611
rus-0.595
essel-0.583
Token
Token ID447
Feature activation+2.27e-01

Pos logit contributions
rocal+1.841
 sadd+1.729
 Signed+1.720
 offending+1.705
idad+1.640
Neg logit contributions
 Ridley-1.761
essen-1.633
-1.517
rus-1.476
essel-1.451
Token fall
Token ID2121
Feature activation+6.06e-01

Pos logit contributions
 Ridley+0.975
essen+0.899
+0.819
rus+0.816
essel+0.801
Neg logit contributions
rocal-0.965
 Signed-0.895
 offending-0.886
 sadd-0.882
ENCY-0.864
Token in
Token ID287
Feature activation+2.81e-01

Pos logit contributions
rocal+1.959
 sadd+1.840
 Signed+1.833
 offending+1.818
idad+1.750
Neg logit contributions
 Ridley-1.721
essen-1.592
-1.473
rus-1.432
essel-1.407
Token love
Token ID1842
Feature activation+8.58e-01

Pos logit contributions
rocal+0.935
 Signed+0.873
 sadd+0.873
 offending+0.864
idad+0.836
Neg logit contributions
 Ridley-0.778
essen-0.715
-0.659
rus-0.641
essel-0.629
Token with
Token ID351
Feature activation+1.16e-01

Pos logit contributions
rocal+2.661
 sadd+2.507
 Signed+2.490
 offending+2.470
idad+2.371
Neg logit contributions
 Ridley-2.536
essen-2.364
-2.196
rus-2.135
essel-2.099
Token the
Token ID262
Feature activation+2.94e-01

Pos logit contributions
rocal+0.357
 sadd+0.331
 Signed+0.331
 offending+0.327
OA+0.316
Neg logit contributions
 Ridley-0.355
essen-0.325
-0.302
rus-0.296
essel-0.291
Token same
Token ID976
Feature activation+1.68e+00

Pos logit contributions
rocal+0.819
 Signed+0.754
 sadd+0.750
 offending+0.742
OA+0.719
Neg logit contributions
 Ridley-0.975
essen-0.903
-0.838
rus-0.827
essel-0.816
Token woman
Token ID2415
Feature activation-2.49e-02

Pos logit contributions
rocal+4.738
 sadd+4.560
 offending+4.494
 Signed+4.472
 augmented+4.299
Neg logit contributions
 Ridley-5.154
essen-4.918
-4.684
rus-4.484
asions-4.394
Token,
Token ID11
Feature activation-5.73e-02

Pos logit contributions
 Ridley+0.073
essen+0.066
+0.060
rus+0.059
essel+0.059
Neg logit contributions
rocal-0.081
 Signed-0.074
 sadd-0.074
 offending-0.074
ENCY-0.071
Token the
Token ID262
Feature activation+1.13e-02

Pos logit contributions
 Ridley+0.169
essen+0.153
+0.141
rus+0.138
essel+0.137
Neg logit contributions
rocal-0.183
 sadd-0.168
 Signed-0.168
 offending-0.167
OA-0.162
Token formidable
Token ID24599
Feature activation+1.09e-01

Pos logit contributions
rocal+0.031
 Signed+0.029
 sadd+0.028
 offending+0.028
OA+0.028
Neg logit contributions
 Ridley-0.038
essen-0.035
-0.032
rus-0.032
essel-0.032
Token Diana
Token ID25752
Feature activation-4.20e-01

Pos logit contributions
rocal+0.335
 sadd+0.310
 Signed+0.309
 offending+0.307
OA+0.296
Neg logit contributions
 Ridley-0.332
essen-0.306
-0.283
rus-0.276
essel-0.273
Token
Token ID198
Feature activation-9.44e-02

Pos logit contributions
 Ridley+1.316
essen+1.233
rus+1.112
+1.111
essel+1.091
Neg logit contributions
rocal-1.400
 sadd-1.298
 offending-1.279
 Signed-1.278
ENCY-1.234
Token
Token ID198
Feature activation-5.10e-02

Pos logit contributions
 Ridley+0.314
essen+0.292
+0.273
rus+0.270
essel+0.264
Neg logit contributions
rocal-0.261
 offending-0.242
 sadd-0.241
 Signed-0.238
idad-0.228
Token
Token ID447
Feature activation+5.76e-02

Pos logit contributions
 Ridley+0.161
essen+0.152
+0.143
rus+0.141
essel+0.136
Neg logit contributions
rocal-0.148
 sadd-0.139
 offending-0.138
 Signed-0.136
 phr-0.130
Token
Token ID250
Feature activation-1.13e-01

Pos logit contributions
rocal+0.137
 sadd+0.127
 Signed+0.126
 offending+0.124
idad+0.118
Neg logit contributions
 Ridley-0.212
essen-0.200
-0.189
rus-0.185
essel-0.182
TokenSpe
Token ID5248
Feature activation+7.19e-01

Pos logit contributions
 Ridley+0.349
essen+0.327
+0.304
rus+0.300
essel+0.292
Neg logit contributions
rocal-0.339
 sadd-0.317
 Signed-0.313
 offending-0.312
idad-0.298
Tokenech
Token ID3055
Feature activation+1.65e+00

Pos logit contributions
rocal+1.874
 sadd+1.725
 Signed+1.724
 offending+1.703
idad+1.625
Neg logit contributions
 Ridley-2.497
essen-2.336
-2.203
rus-2.145
essel-2.117
Token is
Token ID318
Feature activation+8.14e-02

Pos logit contributions
rocal+4.720
 Signed+4.462
 sadd+4.452
 offending+4.452
 phr+4.260
Neg logit contributions
 Ridley-5.075
essen-4.724
-4.389
rus-4.246
essel-4.203
Token one
Token ID530
Feature activation-5.71e-02

Pos logit contributions
rocal+0.233
 sadd+0.212
 Signed+0.212
 offending+0.210
idad+0.205
Neg logit contributions
 Ridley-0.265
essen-0.247
-0.230
rus-0.226
essel-0.222
Token of
Token ID286
Feature activation-1.07e-01

Pos logit contributions
 Ridley+0.177
essen+0.163
+0.153
rus+0.149
essel+0.147
Neg logit contributions
rocal-0.171
 Signed-0.158
 sadd-0.157
 offending-0.156
idad-0.152
Token the
Token ID262
Feature activation+4.18e-01

Pos logit contributions
 Ridley+0.310
essen+0.279
+0.262
rus+0.259
essel+0.251
Neg logit contributions
rocal-0.345
 Signed-0.320
 sadd-0.319
 offending-0.312
idad-0.307
Token most
Token ID749
Feature activation+3.99e-01

Pos logit contributions
rocal+1.143
 Signed+1.048
 sadd+1.031
 offending+1.025
idad+1.001
Neg logit contributions
 Ridley-1.409
essen-1.308
-1.225
rus-1.207
essel-1.196
Token.
Token ID13
Feature activation+2.94e-01

Pos logit contributions
 Ridley+0.341
essen+0.313
+0.282
rus+0.278
essel+0.273
Neg logit contributions
rocal-0.429
 sadd-0.399
 Signed-0.398
 offending-0.394
idad-0.384
Token "
Token ID366
Feature activation+5.85e-01

Pos logit contributions
rocal+0.865
 sadd+0.803
 Signed+0.796
 offending+0.793
idad+0.762
Neg logit contributions
 Ridley-0.933
essen-0.863
-0.805
rus-0.788
essel-0.775
TokenHowever
Token ID4864
Feature activation+6.59e-01

Pos logit contributions
rocal+1.736
 sadd+1.618
 Signed+1.609
 offending+1.598
idad+1.532
Neg logit contributions
 Ridley-1.826
essen-1.698
-1.584
rus-1.547
essel-1.520
Token at
Token ID379
Feature activation-2.50e-02

Pos logit contributions
rocal+2.041
 sadd+1.902
 Signed+1.899
 offending+1.878
idad+1.813
Neg logit contributions
 Ridley-1.973
essen-1.828
-1.693
rus-1.655
essel-1.625
Token the
Token ID262
Feature activation+1.07e-01

Pos logit contributions
 Ridley+0.080
essen+0.074
+0.069
rus+0.068
essel+0.066
Neg logit contributions
rocal-0.072
 Signed-0.067
 sadd-0.067
 offending-0.066
idad-0.064
Token same
Token ID976
Feature activation+1.63e+00

Pos logit contributions
rocal+0.280
 Signed+0.254
 sadd+0.252
 offending+0.250
idad+0.244
Neg logit contributions
 Ridley-0.373
essen-0.344
-0.324
rus-0.318
essel-0.313
Token time
Token ID640
Feature activation+5.89e-01

Pos logit contributions
rocal+4.465
 sadd+4.274
 offending+4.220
 Signed+4.208
 phr+4.041
Neg logit contributions
 Ridley-5.190
essen-4.949
-4.638
rus-4.496
essel-4.409
Token you
Token ID345
Feature activation+1.09e-01

Pos logit contributions
rocal+1.841
 sadd+1.718
 Signed+1.714
 offending+1.696
idad+1.637
Neg logit contributions
 Ridley-1.746
essen-1.615
-1.495
rus-1.461
essel-1.435
Token don
Token ID836
Feature activation-2.89e-01

Pos logit contributions
rocal+0.345
 Signed+0.319
 sadd+0.316
 offending+0.314
idad+0.306
Neg logit contributions
 Ridley-0.320
essen-0.296
-0.270
rus-0.267
essel-0.262
Token't
Token ID470
Feature activation-4.65e-02

Pos logit contributions
 Ridley+0.940
essen+0.867
+0.814
rus+0.809
essel+0.790
Neg logit contributions
rocal-0.816
 offending-0.760
 sadd-0.754
 Signed-0.746
idad-0.713
Token want
Token ID765
Feature activation+2.41e-01

Pos logit contributions
 Ridley+0.142
essen+0.132
+0.121
rus+0.121
essel+0.119
Neg logit contributions
rocal-0.142
 Signed-0.130
 sadd-0.128
 offending-0.128
OA-0.125
Token01
Token ID486
Feature activation-8.16e-01

Pos logit contributions
 Ridley+0.478
essen+0.437
+0.412
rus+0.405
essel+0.391
Neg logit contributions
rocal-0.536
 offending-0.504
 sadd-0.500
 Signed-0.490
idad-0.479
Token.
Token ID13
Feature activation-2.03e-01

Pos logit contributions
 Ridley+1.888
essen+1.656
rus+1.527
+1.520
essel+1.475
Neg logit contributions
rocal-3.071
 offending-2.942
 sadd-2.871
 Signed-2.815
idad-2.784
Token29
Token ID1959
Feature activation-5.64e-01

Pos logit contributions
 Ridley+0.567
essen+0.523
+0.488
rus+0.477
essel+0.464
Neg logit contributions
rocal-0.671
 offending-0.630
 sadd-0.627
 Signed-0.618
 phr-0.598
Token"]
Token ID8973
Feature activation-2.80e-01

Pos logit contributions
 Ridley+1.852
essen+1.713
+1.592
rus+1.592
essel+1.544
Neg logit contributions
rocal-1.606
 offending-1.468
 sadd-1.438
idad-1.394
 Signed-1.384
Token [
Token ID685
Feature activation-2.28e-01

Pos logit contributions
 Ridley+1.015
essen+0.941
+0.895
rus+0.879
essel+0.856
Neg logit contributions
rocal-0.701
 sadd-0.632
 offending-0.628
 Signed-0.615
idad-0.597
TokenRound
Token ID22685
Feature activation+1.62e+00

Pos logit contributions
 Ridley+0.823
essen+0.778
+0.732
rus+0.718
essel+0.706
Neg logit contributions
rocal-0.564
 sadd-0.509
 offending-0.504
 Signed-0.503
 rud-0.479
Token "
Token ID366
Feature activation-1.50e-01

Pos logit contributions
rocal+4.175
 Signed+3.936
 sadd+3.908
 offending+3.781
ENCY+3.692
Neg logit contributions
 Ridley-5.490
essen-5.249
-4.998
rus-4.757
essel-4.757
Token11
Token ID1157
Feature activation-3.77e-01

Pos logit contributions
 Ridley+0.435
essen+0.407
+0.378
rus+0.373
essel+0.361
Neg logit contributions
rocal-0.480
 sadd-0.446
 offending-0.445
 Signed-0.437
 phr-0.424
Token"]
Token ID8973
Feature activation-1.78e-01

Pos logit contributions
 Ridley+1.105
essen+1.030
+0.952
rus+0.936
essel+0.907
Neg logit contributions
rocal-1.192
 sadd-1.099
 offending-1.093
 Signed-1.083
ENCY-1.052
Token [
Token ID685
Feature activation-2.28e-01

Pos logit contributions
 Ridley+0.635
essen+0.587
+0.556
rus+0.548
essel+0.535
Neg logit contributions
rocal-0.455
 offending-0.410
 sadd-0.410
 Signed-0.401
idad-0.391
TokenWhite
Token ID12256
Feature activation-2.27e-01

Pos logit contributions
 Ridley+0.631
essen+0.583
+0.537
rus+0.524
essel+0.513
Neg logit contributions
rocal-0.758
 sadd-0.709
 Signed-0.708
 offending-0.702
idad-0.677
Token themselves
Token ID2405
Feature activation+1.93e-01

Pos logit contributions
 Ridley+0.610
essen+0.565
+0.522
rus+0.513
essel+0.506
Neg logit contributions
rocal-0.635
 Signed-0.594
 sadd-0.591
 offending-0.584
OA-0.566
Token,
Token ID11
Feature activation+2.97e-01

Pos logit contributions
rocal+0.584
 Signed+0.544
 sadd+0.543
 offending+0.537
OA+0.517
Neg logit contributions
 Ridley-0.593
essen-0.551
-0.512
rus-0.500
essel-0.492
Token but
Token ID475
Feature activation+3.49e-01

Pos logit contributions
rocal+0.863
 Signed+0.799
 sadd+0.799
 offending+0.789
idad+0.760
Neg logit contributions
 Ridley-0.950
essen-0.883
-0.822
rus-0.805
essel-0.792
Token at
Token ID379
Feature activation-8.52e-02

Pos logit contributions
rocal+1.114
 sadd+1.038
 Signed+1.037
 offending+1.025
idad+0.992
Neg logit contributions
 Ridley-1.015
essen-0.935
-0.861
rus-0.844
essel-0.827
Token the
Token ID262
Feature activation+1.70e-01

Pos logit contributions
 Ridley+0.252
essen+0.231
rus+0.212
+0.212
essel+0.205
Neg logit contributions
rocal-0.268
 Signed-0.249
 sadd-0.249
 offending-0.248
idad-0.241
Token same
Token ID976
Feature activation+1.60e+00

Pos logit contributions
rocal+0.428
 Signed+0.386
 sadd+0.385
 offending+0.380
idad+0.369
Neg logit contributions
 Ridley-0.611
essen-0.569
-0.532
rus-0.525
essel-0.518
Token time
Token ID640
Feature activation+6.64e-01

Pos logit contributions
rocal+4.428
 sadd+4.201
 Signed+4.150
 offending+4.134
 phr+3.978
Neg logit contributions
 Ridley-5.097
essen-4.822
-4.531
rus-4.391
essel-4.311
Token each
Token ID1123
Feature activation+6.60e-02

Pos logit contributions
rocal+2.020
 sadd+1.893
 Signed+1.883
 offending+1.869
OA+1.791
Neg logit contributions
 Ridley-2.010
essen-1.870
-1.741
rus-1.696
essel-1.668
Token watching
Token ID4964
Feature activation+1.56e-01

Pos logit contributions
rocal+0.192
 Signed+0.177
 sadd+0.176
 offending+0.174
idad+0.169
Neg logit contributions
 Ridley-0.211
essen-0.196
-0.181
rus-0.178
essel-0.176
Token how
Token ID703
Feature activation+3.90e-02

Pos logit contributions
rocal+0.492
 sadd+0.456
 Signed+0.456
 offending+0.450
idad+0.436
Neg logit contributions
 Ridley-0.464
essen-0.424
-0.393
rus-0.385
essel-0.379
Token the
Token ID262
Feature activation+3.34e-01

Pos logit contributions
rocal+0.120
 Signed+0.112
 sadd+0.112
 offending+0.110
OA+0.106
Neg logit contributions
 Ridley-0.118
essen-0.109
-0.100
rus-0.098
essel-0.097
Token State
Token ID1812
Feature activation+3.51e-01

Pos logit contributions
rocal+0.286
 sadd+0.257
 Signed+0.255
 offending+0.253
OA+0.240
Neg logit contributions
 Ridley-0.462
essen-0.432
-0.408
rus-0.400
essel-0.392
Token and
Token ID290
Feature activation-2.67e-01

Pos logit contributions
rocal+1.063
 sadd+0.988
 Signed+0.985
 offending+0.975
idad+0.939
Neg logit contributions
 Ridley-1.076
essen-0.996
-0.927
rus-0.906
essel-0.890
Token Trump
Token ID1301
Feature activation-3.57e-01

Pos logit contributions
 Ridley+0.827
essen+0.758
+0.692
rus+0.691
essel+0.672
Neg logit contributions
rocal-0.811
 sadd-0.742
 Signed-0.740
 offending-0.732
idad-0.715
Token at
Token ID379
Feature activation-3.29e-03

Pos logit contributions
 Ridley+1.061
essen+0.979
+0.898
rus+0.887
essel+0.871
Neg logit contributions
rocal-1.117
 sadd-1.035
 Signed-1.026
 offending-1.019
ENCY-0.992
Token the
Token ID262
Feature activation+1.58e-01

Pos logit contributions
 Ridley+0.010
essen+0.009
+0.008
rus+0.008
essel+0.008
Neg logit contributions
rocal-0.011
 sadd-0.010
 Signed-0.010
 offending-0.010
idad-0.009
Token same
Token ID976
Feature activation+1.60e+00

Pos logit contributions
rocal+0.397
 sadd+0.359
 Signed+0.358
 offending+0.353
ENCY+0.344
Neg logit contributions
 Ridley-0.568
essen-0.525
-0.495
rus-0.485
essel-0.480
Token time
Token ID640
Feature activation+7.28e-01

Pos logit contributions
rocal+4.452
 sadd+4.233
 Signed+4.204
 offending+4.191
idad+3.992
Neg logit contributions
 Ridley-5.054
essen-4.784
-4.478
rus-4.377
essel-4.291
Token?
Token ID30
Feature activation-9.81e-02

Pos logit contributions
rocal+2.480
 sadd+2.342
 Signed+2.336
 offending+2.318
OA+2.234
Neg logit contributions
 Ridley-1.931
essen-1.777
-1.638
rus-1.585
essel-1.559
Token I
Token ID314
Feature activation-6.80e-02

Pos logit contributions
 Ridley+0.305
essen+0.278
+0.257
rus+0.255
essel+0.249
Neg logit contributions
rocal-0.298
 sadd-0.275
 offending-0.274
 Signed-0.270
idad-0.262
Token
Token ID447
Feature activation-3.36e-01

Pos logit contributions
 Ridley+0.216
essen+0.201
+0.185
rus+0.183
essel+0.180
Neg logit contributions
rocal-0.201
 sadd-0.182
 offending-0.182
 Signed-0.182
idad-0.174
Token
Token ID247
Feature activation-1.42e-01

Pos logit contributions
 Ridley+1.150
essen+1.096
+1.030
rus+1.007
essel+0.982
Neg logit contributions
rocal-0.883
 sadd-0.809
 Signed-0.803
 offending-0.799
idad-0.755
Token.
Token ID13
Feature activation-9.14e-02

Pos logit contributions
 Ridley+0.267
essen+0.242
+0.222
rus+0.219
essel+0.215
Neg logit contributions
rocal-0.336
 Signed-0.314
 sadd-0.309
 offending-0.307
OA-0.301
Token
Token ID198
Feature activation+1.21e-01

Pos logit contributions
 Ridley+0.292
essen+0.263
+0.245
rus+0.245
essel+0.240
Neg logit contributions
rocal-0.272
 offending-0.251
 sadd-0.250
 Signed-0.244
idad-0.240
Token
Token ID198
Feature activation+7.82e-02

Pos logit contributions
rocal+0.330
 sadd+0.305
 offending+0.303
 Signed+0.302
idad+0.287
Neg logit contributions
 Ridley-0.409
essen-0.381
-0.357
rus-0.352
essel-0.345
TokenAt
Token ID2953
Feature activation-2.38e-01

Pos logit contributions
rocal+0.228
 offending+0.212
 sadd+0.211
 Signed+0.209
 phr+0.199
Neg logit contributions
 Ridley-0.248
essen-0.232
-0.219
rus-0.215
essel-0.211
Token the
Token ID262
Feature activation+6.33e-03

Pos logit contributions
 Ridley+0.742
essen+0.691
+0.633
rus+0.631
essel+0.620
Neg logit contributions
rocal-0.708
 Signed-0.658
 offending-0.651
 sadd-0.649
idad-0.628
Token same
Token ID976
Feature activation+1.59e+00

Pos logit contributions
rocal+0.016
 Signed+0.014
 sadd+0.014
 offending+0.014
idad+0.014
Neg logit contributions
 Ridley-0.023
essen-0.021
-0.020
rus-0.019
essel-0.019
Token time
Token ID640
Feature activation+6.31e-01

Pos logit contributions
rocal+4.463
 sadd+4.242
 Signed+4.188
 offending+4.160
 phr+4.020
Neg logit contributions
 Ridley-5.060
essen-4.746
-4.454
rus-4.333
essel-4.235
Token,
Token ID11
Feature activation+6.96e-02

Pos logit contributions
rocal+1.943
 sadd+1.829
 Signed+1.807
 offending+1.803
OA+1.731
Neg logit contributions
 Ridley-1.880
essen-1.745
-1.631
rus-1.579
essel-1.548
Token however
Token ID2158
Feature activation+8.45e-01

Pos logit contributions
rocal+0.220
 Signed+0.205
 sadd+0.202
 offending+0.200
idad+0.196
Neg logit contributions
 Ridley-0.206
essen-0.190
-0.173
rus-0.172
essel-0.170
Token,
Token ID11
Feature activation+4.49e-01

Pos logit contributions
rocal+2.632
 sadd+2.474
 Signed+2.460
 offending+2.444
 phr+2.344
Neg logit contributions
 Ridley-2.489
essen-2.314
-2.160
rus-2.092
essel-2.057
Token they
Token ID484
Feature activation-3.12e-02

Pos logit contributions
rocal+1.411
 Signed+1.317
 sadd+1.311
 offending+1.295
idad+1.256
Neg logit contributions
 Ridley-1.330
essen-1.232
-1.133
rus-1.114
essel-1.097
Token more
Token ID517
Feature activation+3.46e-01

Pos logit contributions
 Ridley+0.106
essen+0.098
+0.092
rus+0.090
essel+0.089
Neg logit contributions
rocal-0.087
 Signed-0.079
 sadd-0.078
 offending-0.077
ENCY-0.076
Token amusing
Token ID28297
Feature activation+3.94e-01

Pos logit contributions
rocal+0.764
 Signed+0.684
 sadd+0.667
 offending+0.663
OA+0.641
Neg logit contributions
 Ridley-1.352
essen-1.272
-1.197
rus-1.180
essel-1.166
Token that
Token ID326
Feature activation-1.46e-01

Pos logit contributions
rocal+1.274
 sadd+1.197
 Signed+1.193
 offending+1.183
idad+1.138
Neg logit contributions
 Ridley-1.117
essen-1.033
-0.956
rus-0.929
essel-0.913
Token the
Token ID262
Feature activation+1.53e-01

Pos logit contributions
 Ridley+0.447
essen+0.411
+0.378
rus+0.372
essel+0.364
Neg logit contributions
rocal-0.450
 Signed-0.414
 sadd-0.414
 offending-0.408
idad-0.396
Token very
Token ID845
Feature activation+3.23e-01

Pos logit contributions
rocal+0.435
 Signed+0.398
 sadd+0.395
 offending+0.389
ENCY+0.379
Neg logit contributions
 Ridley-0.502
essen-0.463
-0.431
rus-0.424
essel-0.417
Token same
Token ID976
Feature activation+1.59e+00

Pos logit contributions
rocal+0.880
 Signed+0.807
 sadd+0.800
 offending+0.789
OA+0.762
Neg logit contributions
 Ridley-1.096
essen-1.020
-0.949
rus-0.934
essel-0.919
Token Salt
Token ID13754
Feature activation+2.08e-01

Pos logit contributions
rocal+4.470
 sadd+4.348
 offending+4.283
 Signed+4.247
 phr+4.103
Neg logit contributions
 Ridley-4.833
essen-4.634
-4.390
rus-4.221
essel-4.161
Tokenz
Token ID89
Feature activation-4.53e-01

Pos logit contributions
rocal+0.659
 sadd+0.616
 Signed+0.616
 offending+0.610
OA+0.587
Neg logit contributions
 Ridley-0.604
essen-0.561
-0.516
rus-0.509
essel-0.497
Token shrug
Token ID32545
Feature activation-9.10e-02

Pos logit contributions
 Ridley+1.260
essen+1.163
+1.071
rus+1.060
essel+1.018
Neg logit contributions
rocal-1.489
 Signed-1.392
 sadd-1.391
 offending-1.374
ENCY-1.345
Tokens
Token ID82
Feature activation-1.08e-01

Pos logit contributions
 Ridley+0.264
essen+0.243
+0.225
rus+0.220
essel+0.214
Neg logit contributions
rocal-0.291
 Signed-0.271
 sadd-0.270
 offending-0.267
ENCY-0.260
Token off
Token ID572
Feature activation-8.12e-01

Pos logit contributions
 Ridley+0.345
essen+0.320
+0.298
rus+0.291
essel+0.286
Neg logit contributions
rocal-0.316
 Signed-0.292
 sadd-0.292
 offending-0.288
idad-0.278
Tokenmy
Token ID1820
Feature activation-3.64e-01

Pos logit contributions
rocal+1.291
 sadd+1.211
 Signed+1.206
 offending+1.198
idad+1.150
Neg logit contributions
 Ridley-1.165
essen-1.079
-0.999
rus-0.975
essel-0.956
Tokenphone
Token ID4862
Feature activation+7.42e-01

Pos logit contributions
 Ridley+1.024
essen+0.963
+0.884
rus+0.879
essel+0.855
Neg logit contributions
rocal-1.170
 sadd-1.108
 Signed-1.101
 offending-1.099
ENCY-1.052
Tokengr
Token ID2164
Feature activation-3.16e-01

Pos logit contributions
rocal+2.378
 sadd+2.229
 Signed+2.225
 offending+2.201
idad+2.123
Neg logit contributions
 Ridley-2.127
essen-1.964
-1.818
rus-1.764
essel-1.739
Token
Token ID198
Feature activation+5.87e-01

Pos logit contributions
 Ridley+0.926
essen+0.868
+0.790
rus+0.789
essel+0.773
Neg logit contributions
rocal-0.988
 Signed-0.926
 sadd-0.923
 offending-0.918
ENCY-0.882
Token
Token ID198
Feature activation+1.15e-01

Pos logit contributions
rocal+1.506
 Signed+1.394
 sadd+1.393
 offending+1.361
OA+1.312
Neg logit contributions
 Ridley-2.053
essen-1.935
-1.820
rus-1.769
essel-1.752
TokenShare
Token ID11649
Feature activation+1.58e+00

Pos logit contributions
rocal+0.344
 sadd+0.321
 offending+0.321
 Signed+0.317
 phr+0.304
Neg logit contributions
 Ridley-0.354
essen-0.331
-0.310
rus-0.304
essel-0.296
Token this
Token ID428
Feature activation+9.11e-02

Pos logit contributions
rocal+4.982
 Signed+4.754
 sadd+4.691
 offending+4.622
OA+4.488
Neg logit contributions
 Ridley-4.491
essen-4.202
-3.878
rus-3.740
uton-3.716
Token post
Token ID1281
Feature activation+7.37e-01

Pos logit contributions
rocal+0.277
 sadd+0.255
 Signed+0.254
 offending+0.253
OA+0.244
Neg logit contributions
 Ridley-0.281
essen-0.260
-0.240
rus-0.237
essel-0.232
Token
Token ID198
Feature activation+4.40e-02

Pos logit contributions
rocal+2.239
 Signed+2.106
 sadd+2.099
 offending+2.075
OA+1.993
Neg logit contributions
 Ridley-2.222
essen-2.067
-1.929
rus-1.864
essel-1.845
Token
Token ID198
Feature activation+1.56e-02

Pos logit contributions
rocal+0.121
 sadd+0.112
 offending+0.112
 Signed+0.111
idad+0.106
Neg logit contributions
 Ridley-0.147
essen-0.136
-0.128
rus-0.127
essel-0.124
TokenJ
Token ID41
Feature activation+4.59e-01

Pos logit contributions
rocal+0.047
 sadd+0.044
 offending+0.043
 Signed+0.043
 phr+0.041
Neg logit contributions
 Ridley-0.048
essen-0.045
-0.042
rus-0.042
essel-0.041
Token
Token ID251
Feature activation+1.94e-02

Pos logit contributions
 Ridley+0.660
essen+0.616
+0.579
rus+0.566
essel+0.553
Neg logit contributions
rocal-0.546
 sadd-0.507
 Signed-0.503
 offending-0.501
idad-0.477
Token
Token ID198
Feature activation+1.47e-01

Pos logit contributions
rocal+0.057
 sadd+0.053
 offending+0.052
 Signed+0.052
idad+0.050
Neg logit contributions
 Ridley-0.061
essen-0.057
-0.053
rus-0.052
essel-0.051
Token
Token ID198
Feature activation+1.08e-01

Pos logit contributions
rocal+0.396
 sadd+0.366
 Signed+0.363
 offending+0.362
idad+0.344
Neg logit contributions
 Ridley-0.500
essen-0.467
-0.437
rus-0.430
essel-0.423
TokenAt
Token ID2953
Feature activation-1.18e-01

Pos logit contributions
rocal+0.317
 sadd+0.296
 offending+0.294
 Signed+0.293
idad+0.279
Neg logit contributions
 Ridley-0.337
essen-0.315
-0.295
rus-0.290
essel-0.283
Token the
Token ID262
Feature activation-3.74e-02

Pos logit contributions
 Ridley+0.360
essen+0.335
+0.311
rus+0.306
essel+0.298
Neg logit contributions
rocal-0.358
 sadd-0.335
 Signed-0.333
 offending-0.331
ENCY-0.320
Token same
Token ID976
Feature activation+1.58e+00

Pos logit contributions
 Ridley+0.128
essen+0.120
+0.113
rus+0.110
essel+0.108
Neg logit contributions
rocal-0.100
 Signed-0.090
 sadd-0.090
 offending-0.089
ENCY-0.087
Token time
Token ID640
Feature activation+6.94e-01

Pos logit contributions
rocal+4.379
 sadd+4.162
 Signed+4.116
 offending+4.101
 phr+3.950
Neg logit contributions
 Ridley-5.063
essen-4.717
-4.416
rus-4.324
essel-4.222
Token,
Token ID11
Feature activation-3.52e-02

Pos logit contributions
rocal+2.108
 sadd+1.987
 Signed+1.972
 offending+1.966
OA+1.887
Neg logit contributions
 Ridley-2.086
essen-1.940
-1.811
rus-1.753
essel-1.729
Token another
Token ID1194
Feature activation+8.31e-02

Pos logit contributions
 Ridley+0.106
essen+0.099
+0.091
rus+0.089
essel+0.088
Neg logit contributions
rocal-0.108
 sadd-0.100
 Signed-0.100
 offending-0.098
idad-0.096
Token official
Token ID1743
Feature activation-7.62e-02

Pos logit contributions
rocal+0.231
 Signed+0.210
 sadd+0.209
 offending+0.205
ENCY+0.202
Neg logit contributions
 Ridley-0.279
essen-0.258
-0.239
rus-0.237
essel-0.233
Token added
Token ID2087
Feature activation+1.44e-01

Pos logit contributions
 Ridley+0.246
essen+0.230
+0.211
rus+0.209
essel+0.206
Neg logit contributions
rocal-0.220
 Signed-0.202
 sadd-0.200
 offending-0.198
ENCY-0.193
Token 4
Token ID604
Feature activation-5.00e-01

Pos logit contributions
 Ridley+1.237
essen+1.154
+1.090
rus+1.087
essel+1.082
Neg logit contributions
rocal-0.685
 sadd-0.619
 offending-0.605
 Signed-0.601
idad-0.592
Token 5
Token ID642
Feature activation-6.08e-01

Pos logit contributions
 Ridley+1.985
essen+1.857
+1.763
rus+1.745
essel+1.742
Neg logit contributions
rocal-1.041
 sadd-0.935
 offending-0.909
 Signed-0.908
idad-0.887
Token 6
Token ID718
Feature activation-6.30e-01

Pos logit contributions
 Ridley+2.199
essen+2.047
+1.937
rus+1.911
essel+1.904
Neg logit contributions
rocal-1.453
 sadd-1.334
 offending-1.300
 Signed-1.286
idad-1.277
Token 7
Token ID767
Feature activation-7.41e-01

Pos logit contributions
 Ridley+2.426
essen+2.267
+2.160
rus+2.149
essel+2.144
Neg logit contributions
rocal-1.379
 sadd-1.231
 offending-1.204
idad-1.190
 Signed-1.187
Token 8
Token ID807
Feature activation-7.08e-01

Pos logit contributions
 Ridley+2.749
essen+2.550
essel+2.426
rus+2.409
+2.407
Neg logit contributions
rocal-1.693
 sadd-1.548
 offending-1.516
idad-1.513
 Signed-1.478
Token void
Token ID7951
Feature activation-1.74e+00

Pos logit contributions
 Ridley+2.455
essen+2.262
rus+2.133
+2.126
essel+2.121
Neg logit contributions
rocal-1.812
 sadd-1.673
 offending-1.638
 Signed-1.623
idad-1.614
Token vert
Token ID9421
Feature activation-3.21e-02

Pos logit contributions
 Ridley+4.481
essen+4.207
essel+3.772
rus+3.768
+3.711
Neg logit contributions
rocal-5.843
 Signed-5.419
ENCY-5.351
 offending-5.316
OA-5.291
Token (
Token ID357
Feature activation-3.54e-01

Pos logit contributions
 Ridley+0.100
essen+0.093
+0.087
rus+0.085
essel+0.083
Neg logit contributions
rocal-0.095
 sadd-0.089
 Signed-0.088
 offending-0.088
idad-0.084
Token in
Token ID287
Feature activation-4.36e-01

Pos logit contributions
 Ridley+1.003
essen+0.925
+0.850
rus+0.836
essel+0.820
Neg logit contributions
rocal-1.146
 Signed-1.068
 sadd-1.066
 offending-1.055
OA-1.026
Tokenout
Token ID448
Feature activation-5.05e-01

Pos logit contributions
 Ridley+1.510
essen+1.418
+1.318
rus+1.314
essel+1.290
Neg logit contributions
rocal-1.137
 sadd-1.050
 Signed-1.044
 offending-1.031
idad-0.987
Token app
Token ID598
Feature activation-4.98e-01

Pos logit contributions
 Ridley+1.790
essen+1.700
rus+1.567
essel+1.556
+1.548
Neg logit contributions
rocal-1.257
 Signed-1.125
idad-1.095
 offending-1.088
 sadd-1.087
Token a
Token ID257
Feature activation-1.01e-01

Pos logit contributions
rocal+0.682
 Signed+0.629
 sadd+0.627
 offending+0.619
ENCY+0.601
Neg logit contributions
 Ridley-0.681
essen-0.628
-0.582
rus-0.570
essel-0.559
Token P
Token ID350
Feature activation-1.08e-01

Pos logit contributions
 Ridley+0.340
essen+0.317
+0.296
rus+0.290
essel+0.284
Neg logit contributions
rocal-0.277
 sadd-0.252
 Signed-0.251
 offending-0.249
ENCY-0.240
Tokenizz
Token ID6457
Feature activation-3.39e-01

Pos logit contributions
 Ridley+0.329
essen+0.311
+0.288
rus+0.280
essel+0.278
Neg logit contributions
rocal-0.324
 sadd-0.304
 Signed-0.303
 offending-0.301
idad-0.284
Tokenag
Token ID363
Feature activation+1.22e-01

Pos logit contributions
 Ridley+1.078
essen+1.013
+0.930
rus+0.925
essel+0.896
Neg logit contributions
rocal-0.984
 sadd-0.921
 Signed-0.916
 offending-0.911
ENCY-0.868
Tokenater
Token ID729
Feature activation-3.05e-01

Pos logit contributions
rocal+0.376
 sadd+0.351
 Signed+0.349
 offending+0.346
ENCY+0.332
Neg logit contributions
 Ridley-0.366
essen-0.342
-0.313
rus-0.313
essel-0.305
Token conspiracy
Token ID10086
Feature activation-1.67e+00

Pos logit contributions
 Ridley+0.897
essen+0.835
+0.766
rus+0.752
essel+0.734
Neg logit contributions
rocal-0.964
 Signed-0.886
 sadd-0.884
 offending-0.873
ENCY-0.849
Token theorist
Token ID46410
Feature activation-8.21e-01

Pos logit contributions
 Ridley+3.057
essen+2.747
+2.352
rus+2.348
essel+2.160
Neg logit contributions
rocal-7.095
 Signed-6.573
 offending-6.448
 sadd-6.403
ENCY-6.393
Token and
Token ID290
Feature activation-1.01e-01

Pos logit contributions
 Ridley+2.296
essen+2.109
+1.895
rus+1.863
essel+1.822
Neg logit contributions
rocal-2.748
 Signed-2.486
 sadd-2.449
 offending-2.448
ENCY-2.433
Token affiliate
Token ID17375
Feature activation+2.46e-01

Pos logit contributions
 Ridley+0.340
essen+0.316
+0.292
rus+0.289
essel+0.282
Neg logit contributions
rocal-0.284
 Signed-0.257
 sadd-0.256
 offending-0.252
ENCY-0.245
Token of
Token ID286
Feature activation-1.90e-01

Pos logit contributions
rocal+0.735
 Signed+0.681
 sadd+0.679
 offending+0.671
idad+0.645
Neg logit contributions
 Ridley-0.768
essen-0.715
-0.665
rus-0.652
essel-0.636
Token Proud
Token ID39725
Feature activation+8.46e-02

Pos logit contributions
 Ridley+0.586
essen+0.540
+0.501
rus+0.493
essel+0.476
Neg logit contributions
rocal-0.583
 sadd-0.533
 Signed-0.532
 offending-0.525
idad-0.508
Token shalt
Token ID36258
Feature activation-5.88e-01

Pos logit contributions
rocal+0.179
 Signed+0.166
 sadd+0.166
 offending+0.164
OA+0.160
Neg logit contributions
 Ridley-0.145
essen-0.133
-0.123
rus-0.120
essel-0.117
Token sm
Token ID895
Feature activation-3.68e-01

Pos logit contributions
 Ridley+1.858
essen+1.730
rus+1.579
+1.577
essel+1.567
Neg logit contributions
rocal-1.738
 Signed-1.568
 sadd-1.530
OA-1.524
 offending-1.524
Tokenite
Token ID578
Feature activation-7.59e-01

Pos logit contributions
 Ridley+0.904
essen+0.837
rus+0.738
+0.731
essel+0.728
Neg logit contributions
rocal-1.330
 sadd-1.265
 Signed-1.262
 offending-1.248
ENCY-1.209
Token every
Token ID790
Feature activation-2.19e-01

Pos logit contributions
 Ridley+2.077
essen+1.863
+1.701
rus+1.670
essel+1.645
Neg logit contributions
rocal-2.591
 Signed-2.387
 sadd-2.343
 offending-2.302
OA-2.301
Token male
Token ID4257
Feature activation-3.32e-01

Pos logit contributions
 Ridley+0.698
essen+0.649
+0.603
rus+0.594
essel+0.581
Neg logit contributions
rocal-0.644
 Signed-0.591
 sadd-0.585
 offending-0.577
idad-0.564
Token therein
Token ID27258
Feature activation-1.64e+00

Pos logit contributions
 Ridley+1.002
essen+0.921
+0.855
rus+0.843
essel+0.827
Neg logit contributions
rocal-1.037
 Signed-0.948
 sadd-0.943
 offending-0.928
idad-0.911
Token with
Token ID351
Feature activation-4.58e-02

Pos logit contributions
 Ridley+4.361
essen+4.010
+3.617
rus+3.596
essel+3.511
Neg logit contributions
rocal-5.540
 Signed-5.008
ENCY-4.919
OA-4.840
 sadd-4.819
Token the
Token ID262
Feature activation+3.93e-03

Pos logit contributions
 Ridley+0.137
essen+0.125
+0.117
rus+0.114
essel+0.112
Neg logit contributions
rocal-0.144
 Signed-0.134
 sadd-0.131
 offending-0.130
OA-0.128
Token edge
Token ID5743
Feature activation+2.91e-01

Pos logit contributions
rocal+0.012
 Signed+0.011
 sadd+0.011
 offending+0.011
OA+0.011
Neg logit contributions
 Ridley-0.012
essen-0.011
-0.010
rus-0.010
essel-0.010
Token of
Token ID286
Feature activation-2.69e-01

Pos logit contributions
rocal+0.839
 sadd+0.789
 Signed+0.780
 offending+0.779
idad+0.741
Neg logit contributions
 Ridley-0.926
essen-0.867
-0.805
rus-0.790
essel-0.777
Token the
Token ID262
Feature activation-6.33e-04

Pos logit contributions
 Ridley+0.802
essen+0.728
+0.682
rus+0.661
essel+0.650
Neg logit contributions
rocal-0.844
 Signed-0.785
 sadd-0.772
 offending-0.762
OA-0.756
Token;
Token ID26
Feature activation+8.36e-02

Pos logit contributions
rocal+0.227
 sadd+0.214
 Signed+0.212
 offending+0.211
idad+0.205
Neg logit contributions
 Ridley-0.191
essen-0.177
-0.163
rus-0.160
essel-0.157
Token }
Token ID1782
Feature activation-2.95e-02

Pos logit contributions
rocal+0.220
 sadd+0.199
 Signed+0.198
 offending+0.197
idad+0.193
Neg logit contributions
 Ridley-0.289
essen-0.271
-0.253
rus-0.249
essel-0.247
Token 1
Token ID352
Feature activation+9.79e-02

Pos logit contributions
 Ridley+0.094
essen+0.088
+0.081
rus+0.080
essel+0.079
Neg logit contributions
rocal-0.086
 sadd-0.078
 offending-0.078
 Signed-0.077
idad-0.076
Token 2
Token ID362
Feature activation-1.40e-01

Pos logit contributions
rocal+0.232
 sadd+0.210
 offending+0.208
 Signed+0.207
OA+0.198
Neg logit contributions
 Ridley-0.364
essen-0.342
-0.324
rus-0.318
essel-0.315
Token 3
Token ID513
Feature activation-3.48e-01

Pos logit contributions
 Ridley+0.522
essen+0.488
+0.464
rus+0.460
essel+0.458
Neg logit contributions
rocal-0.327
 sadd-0.296
 offending-0.291
 Signed-0.288
idad-0.283
Token void
Token ID7951
Feature activation-1.59e+00

Pos logit contributions
 Ridley+1.361
essen+1.277
+1.209
rus+1.201
essel+1.194
Neg logit contributions
rocal-0.734
 sadd-0.661
 offending-0.646
 Signed-0.642
idad-0.636
Token vert
Token ID9421
Feature activation+1.31e-02

Pos logit contributions
 Ridley+4.014
essen+3.758
essel+3.364
rus+3.347
+3.333
Neg logit contributions
rocal-5.415
 Signed-5.050
ENCY-4.988
 offending-4.955
OA-4.916
Token (
Token ID357
Feature activation-3.24e-01

Pos logit contributions
rocal+0.040
 sadd+0.037
 Signed+0.037
 offending+0.037
idad+0.035
Neg logit contributions
 Ridley-0.040
essen-0.037
-0.035
rus-0.034
essel-0.034
Token in
Token ID287
Feature activation-2.59e-01

Pos logit contributions
 Ridley+0.896
essen+0.826
+0.757
rus+0.747
essel+0.731
Neg logit contributions
rocal-1.070
 Signed-0.998
 sadd-0.992
 offending-0.982
ENCY-0.966
Tokenout
Token ID448
Feature activation-1.51e-01

Pos logit contributions
 Ridley+0.892
essen+0.842
rus+0.782
+0.778
essel+0.768
Neg logit contributions
rocal-0.680
 Signed-0.624
 sadd-0.622
 offending-0.612
ENCY-0.591
Token app
Token ID598
Feature activation-3.60e-01

Pos logit contributions
 Ridley+0.548
essen+0.522
rus+0.482
+0.479
essel+0.477
Neg logit contributions
rocal-0.366
 Signed-0.329
 offending-0.319
 sadd-0.319
idad-0.316
Token in
Token ID287
Feature activation+1.91e-01

Pos logit contributions
rocal+1.389
 sadd+1.298
 Signed+1.295
 offending+1.283
idad+1.234
Neg logit contributions
 Ridley-1.314
essen-1.219
-1.130
rus-1.102
essel-1.083
Token the
Token ID262
Feature activation+4.65e-02

Pos logit contributions
rocal+0.579
 sadd+0.537
 Signed+0.537
 offending+0.531
idad+0.511
Neg logit contributions
 Ridley-0.588
essen-0.547
-0.505
rus-0.497
essel-0.487
Token K
Token ID509
Feature activation-2.69e-01

Pos logit contributions
rocal+0.121
 sadd+0.111
 Signed+0.111
 offending+0.109
idad+0.105
Neg logit contributions
 Ridley-0.162
essen-0.152
-0.141
rus-0.140
essel-0.138
Token
Token ID1906
Feature activation-4.48e-02

Pos logit contributions
 Ridley+0.790
essen+0.742
+0.680
rus+0.670
essel+0.663
Neg logit contributions
rocal-0.842
 sadd-0.792
 Signed-0.789
 offending-0.782
 phr-0.747
Token12
Token ID1065
Feature activation-4.79e-01

Pos logit contributions
 Ridley+0.102
essen+0.094
+0.084
rus+0.082
essel+0.079
Neg logit contributions
rocal-0.171
 sadd-0.160
 Signed-0.160
 offending-0.160
idad-0.155
Token school
Token ID1524
Feature activation-1.58e+00

Pos logit contributions
 Ridley+1.625
essen+1.527
+1.409
rus+1.399
essel+1.382
Neg logit contributions
rocal-1.299
 Signed-1.178
 sadd-1.171
 offending-1.164
idad-1.121
Token libraries
Token ID12782
Feature activation-9.40e-02

Pos logit contributions
 Ridley+5.000
essen+4.724
rus+4.325
+4.306
essel+4.267
Neg logit contributions
rocal-4.445
 Signed-4.047
 sadd-3.945
 offending-3.943
OA-3.874
Token,
Token ID11
Feature activation-1.49e-01

Pos logit contributions
 Ridley+0.286
essen+0.267
+0.246
rus+0.242
essel+0.237
Neg logit contributions
rocal-0.287
 sadd-0.265
 Signed-0.265
 offending-0.263
idad-0.253
Token E
Token ID412
Feature activation+5.59e-02

Pos logit contributions
 Ridley+0.461
essen+0.428
rus+0.390
+0.389
essel+0.380
Neg logit contributions
rocal-0.451
 sadd-0.414
 offending-0.411
 Signed-0.410
idad-0.398
TokenBS
Token ID4462
Feature activation-1.03e-01

Pos logit contributions
rocal+0.221
 sadd+0.210
 Signed+0.210
 offending+0.209
 phr+0.202
Neg logit contributions
 Ridley-0.118
essen-0.107
-0.095
rus-0.093
essel-0.090
TokenCO
Token ID8220
Feature activation+8.72e-02

Pos logit contributions
 Ridley+0.435
essen+0.415
+0.398
rus+0.392
essel+0.385
Neg logit contributions
rocal-0.183
 sadd-0.162
 offending-0.161
 Signed-0.160
idad-0.148
Token successful
Token ID4388
Feature activation-4.36e-01

Pos logit contributions
rocal+0.296
 Signed+0.271
 sadd+0.268
 offending+0.267
ENCY+0.259
Neg logit contributions
 Ridley-0.379
essen-0.353
-0.330
rus-0.324
essel-0.318
Token comedy
Token ID10997
Feature activation+1.84e-02

Pos logit contributions
 Ridley+1.471
essen+1.370
+1.260
rus+1.237
essel+1.215
Neg logit contributions
rocal-1.205
 Signed-1.092
 sadd-1.071
 offending-1.063
ENCY-1.044
Token writer
Token ID6260
Feature activation-4.55e-01

Pos logit contributions
rocal+0.053
 Signed+0.049
 sadd+0.049
 offending+0.048
ENCY+0.047
Neg logit contributions
 Ridley-0.059
essen-0.055
-0.051
rus-0.050
essel-0.049
Token who
Token ID508
Feature activation-1.91e-01

Pos logit contributions
 Ridley+1.346
essen+1.254
+1.141
rus+1.116
essel+1.100
Neg logit contributions
rocal-1.435
 Signed-1.311
 offending-1.297
 sadd-1.284
ENCY-1.277
Token qu
Token ID627
Feature activation+3.84e-01

Pos logit contributions
 Ridley+0.583
essen+0.549
+0.503
rus+0.496
essel+0.486
Neg logit contributions
rocal-0.583
 Signed-0.537
 offending-0.529
 sadd-0.527
ENCY-0.515
Tokenits
Token ID896
Feature activation-1.57e+00

Pos logit contributions
rocal+1.173
 sadd+1.097
 Signed+1.093
 offending+1.084
OA+1.040
Neg logit contributions
 Ridley-1.161
essen-1.081
-1.001
rus-0.979
essel-0.963
Token his
Token ID465
Feature activation-2.81e-01

Pos logit contributions
 Ridley+4.376
essen+3.963
+3.497
rus+3.478
 Booker+3.461
Neg logit contributions
rocal-5.231
 Signed-4.736
ENCY-4.589
 sadd-4.550
OA-4.538
Token pro
Token ID386
Feature activation+5.95e-01

Pos logit contributions
 Ridley+0.890
essen+0.831
+0.756
rus+0.747
essel+0.727
Neg logit contributions
rocal-0.837
 Signed-0.772
 sadd-0.755
 offending-0.745
ENCY-0.737
Tokendig
Token ID12894
Feature activation-1.41e-01

Pos logit contributions
rocal+1.864
 sadd+1.745
 Signed+1.740
 offending+1.723
idad+1.658
Neg logit contributions
 Ridley-1.757
essen-1.629
-1.510
rus-1.472
essel-1.446
Tokeniously
Token ID6819
Feature activation-8.38e-02

Pos logit contributions
 Ridley+0.431
essen+0.405
rus+0.368
+0.365
essel+0.356
Neg logit contributions
rocal-0.427
 Signed-0.404
 sadd-0.399
 offending-0.395
OA-0.381
Token paying
Token ID5989
Feature activation-1.68e-01

Pos logit contributions
 Ridley+0.296
essen+0.277
+0.259
rus+0.254
essel+0.249
Neg logit contributions
rocal-0.218
 Signed-0.196
 sadd-0.193
 offending-0.192
ENCY-0.188
Token the
Token ID262
Feature activation-8.93e-03

Pos logit contributions
 Ridley+0.448
essen+0.410
+0.377
rus+0.375
essel+0.365
Neg logit contributions
rocal-0.420
 Signed-0.388
 sadd-0.387
 offending-0.380
ENCY-0.373
Token case
Token ID1339
Feature activation-2.12e-01

Pos logit contributions
 Ridley+0.031
essen+0.029
+0.027
rus+0.026
essel+0.026
Neg logit contributions
rocal-0.024
 Signed-0.022
 sadd-0.022
 offending-0.021
idad-0.021
Token by
Token ID416
Feature activation+1.82e-01

Pos logit contributions
 Ridley+0.627
essen+0.575
+0.525
rus+0.523
essel+0.509
Neg logit contributions
rocal-0.675
 Signed-0.616
 sadd-0.610
 offending-0.607
ENCY-0.601
Token C
Token ID327
Feature activation+9.58e-02

Pos logit contributions
rocal+0.573
 sadd+0.533
 Signed+0.531
 offending+0.525
idad+0.509
Neg logit contributions
 Ridley-0.544
essen-0.498
-0.458
rus-0.451
essel-0.440
Token.
Token ID13
Feature activation-2.50e-01

Pos logit contributions
rocal+0.297
 sadd+0.279
 Signed+0.278
 offending+0.276
ENCY+0.264
Neg logit contributions
 Ridley-0.286
essen-0.267
-0.246
rus-0.242
essel-0.239
TokenI
Token ID40
Feature activation-1.56e+00

Pos logit contributions
 Ridley+0.759
essen+0.688
+0.646
rus+0.637
essel+0.619
Neg logit contributions
rocal-0.776
 sadd-0.718
 offending-0.717
 Signed-0.711
idad-0.685
Token.
Token ID13
Feature activation-2.59e-01

Pos logit contributions
 Ridley+3.633
essen+3.318
+2.953
rus+2.948
essel+2.872
Neg logit contributions
rocal-5.718
 sadd-5.445
 Signed-5.396
 offending-5.394
 rud-5.208
TokenA
Token ID32
Feature activation-9.67e-01

Pos logit contributions
 Ridley+0.709
essen+0.644
+0.594
rus+0.594
essel+0.568
Neg logit contributions
rocal-0.872
 sadd-0.812
 offending-0.807
 Signed-0.803
idad-0.780
Token.
Token ID13
Feature activation-2.27e-01

Pos logit contributions
 Ridley+2.335
essen+2.136
+1.917
rus+1.889
essel+1.837
Neg logit contributions
rocal-3.515
 sadd-3.317
 Signed-3.300
 offending-3.275
idad-3.188
Token officials
Token ID2828
Feature activation-2.03e-01

Pos logit contributions
 Ridley+0.695
essen+0.641
+0.591
rus+0.589
essel+0.570
Neg logit contributions
rocal-0.694
 sadd-0.644
 Signed-0.636
 offending-0.630
idad-0.611
Token were
Token ID547
Feature activation+4.81e-02

Pos logit contributions
 Ridley+0.615
essen+0.561
+0.515
rus+0.514
essel+0.497
Neg logit contributions
rocal-0.633
 Signed-0.572
 sadd-0.568
 offending-0.562
ENCY-0.560
Token\n
Token ID198
Feature activation-5.18e-01

Pos logit contributions
 Ridley+1.377
essen+1.265
rus+1.183
+1.172
essel+1.147
Neg logit contributions
rocal-1.247
 offending-1.172
 sadd-1.152
 Signed-1.133
idad-1.092
TokenT
Token ID51
Feature activation-2.98e-01

Pos logit contributions
 Ridley+1.578
essen+1.462
+1.396
rus+1.369
essel+1.321
Neg logit contributions
rocal-1.559
 offending-1.481
 sadd-1.462
 Signed-1.435
 augmented-1.387
TokenOM
Token ID2662
Feature activation+5.71e-01

Pos logit contributions
 Ridley+0.918
essen+0.855
rus+0.785
+0.782
essel+0.771
Neg logit contributions
rocal-0.890
 sadd-0.839
 offending-0.834
 Signed-0.826
 phr-0.786
Token KING
Token ID32957
Feature activation-1.25e-01

Pos logit contributions
rocal+2.349
 sadd+2.235
 Signed+2.234
 offending+2.214
OA+2.159
Neg logit contributions
 Ridley-1.115
essen-0.996
-0.880
rus-0.844
essel-0.823
Token:
Token ID25
Feature activation-5.76e-01

Pos logit contributions
 Ridley+0.357
essen+0.328
+0.304
rus+0.296
essel+0.291
Neg logit contributions
rocal-0.402
 sadd-0.377
 Signed-0.374
 offending-0.372
idad-0.359
Token Well
Token ID3894
Feature activation-1.55e+00

Pos logit contributions
 Ridley+1.727
essen+1.531
+1.425
rus+1.403
essel+1.359
Neg logit contributions
rocal-1.849
 offending-1.707
 sadd-1.703
 Signed-1.648
idad-1.621
Token,
Token ID11
Feature activation-5.75e-01

Pos logit contributions
 Ridley+4.029
essen+3.541
rus+3.125
+3.125
 Booker+3.021
Neg logit contributions
rocal-5.427
 Signed-4.959
 sadd-4.923
 offending-4.883
OA-4.838
Token that
Token ID326
Feature activation-9.95e-02

Pos logit contributions
 Ridley+1.683
essen+1.504
rus+1.358
+1.358
essel+1.318
Neg logit contributions
rocal-1.868
 Signed-1.702
 sadd-1.688
 offending-1.674
OA-1.661
Token's
Token ID338
Feature activation+3.73e-02

Pos logit contributions
 Ridley+0.289
essen+0.263
+0.240
rus+0.239
essel+0.233
Neg logit contributions
rocal-0.322
 Signed-0.299
 sadd-0.293
 offending-0.293
ENCY-0.285
Token what
Token ID644
Feature activation+8.95e-02

Pos logit contributions
rocal+0.106
 Signed+0.097
 sadd+0.096
 offending+0.095
ENCY+0.093
Neg logit contributions
 Ridley-0.124
essen-0.114
-0.105
rus-0.104
essel-0.102
Token makes
Token ID1838
Feature activation-3.09e-01

Pos logit contributions
rocal+0.289
 Signed+0.267
 sadd+0.266
 offending+0.265
OA+0.256
Neg logit contributions
 Ridley-0.260
essen-0.238
-0.219
rus-0.214
essel-0.210
Token and
Token ID290
Feature activation-1.47e-01

Pos logit contributions
 Ridley+0.444
essen+0.407
+0.381
rus+0.374
essel+0.365
Neg logit contributions
rocal-0.446
 sadd-0.415
 Signed-0.414
 offending-0.409
idad-0.397
Token let
Token ID1309
Feature activation-3.00e-01

Pos logit contributions
 Ridley+0.426
essen+0.392
+0.358
rus+0.356
essel+0.348
Neg logit contributions
rocal-0.468
 sadd-0.434
 Signed-0.434
 offending-0.430
idad-0.418
Token Will
Token ID2561
Feature activation-6.54e-01

Pos logit contributions
 Ridley+0.870
essen+0.790
+0.735
rus+0.718
essel+0.707
Neg logit contributions
rocal-0.959
 sadd-0.896
 Signed-0.889
 offending-0.881
ENCY-0.856
Token know
Token ID760
Feature activation+2.85e-01

Pos logit contributions
 Ridley+1.951
essen+1.769
+1.625
essel+1.593
rus+1.586
Neg logit contributions
rocal-2.055
 sadd-1.900
 Signed-1.897
 offending-1.883
ENCY-1.830
Token by
Token ID416
Feature activation+6.91e-01

Pos logit contributions
rocal+0.912
 sadd+0.850
 Signed+0.848
 offending+0.839
idad+0.812
Neg logit contributions
 Ridley-0.828
essen-0.763
-0.705
rus-0.691
essel-0.677
Token clicking
Token ID12264
Feature activation-1.55e+00

Pos logit contributions
rocal+1.696
 sadd+1.551
 Signed+1.545
 offending+1.526
idad+1.455
Neg logit contributions
 Ridley-2.516
essen-2.366
-2.221
rus-2.186
essel-2.154
Token the
Token ID262
Feature activation-3.97e-01

Pos logit contributions
 Ridley+4.201
essen+3.706
rus+3.494
+3.406
essel+3.319
Neg logit contributions
rocal-5.081
 sadd-4.755
 Signed-4.698
 offending-4.674
idad-4.596
Token link
Token ID2792
Feature activation-1.04e+00

Pos logit contributions
 Ridley+1.255
essen+1.158
+1.068
rus+1.060
essel+1.038
Neg logit contributions
rocal-1.167
 sadd-1.081
 Signed-1.072
 offending-1.051
OA-1.027
Token below
Token ID2174
Feature activation-3.85e-01

Pos logit contributions
 Ridley+2.504
essen+2.250
+2.042
rus+2.015
essel+1.938
Neg logit contributions
rocal-3.852
 sadd-3.558
 Signed-3.543
 offending-3.474
ENCY-3.443
Token to
Token ID284
Feature activation+1.25e-01

Pos logit contributions
 Ridley+1.031
essen+0.941
+0.860
rus+0.845
essel+0.820
Neg logit contributions
rocal-1.319
 sadd-1.231
 Signed-1.220
 offending-1.203
ENCY-1.177
Token share
Token ID2648
Feature activation+6.52e-01

Pos logit contributions
rocal+0.400
 sadd+0.374
 Signed+0.372
 offending+0.369
OA+0.357
Neg logit contributions
 Ridley-0.361
essen-0.333
-0.307
rus-0.301
essel-0.296
Token ambient
Token ID25237
Feature activation-3.11e-01

Pos logit contributions
 Ridley+2.937
essen+2.727
+2.553
essel+2.536
rus+2.500
Neg logit contributions
rocal-2.393
 Signed-2.165
 sadd-2.161
 offending-2.136
OA-2.057
TokenD
Token ID35
Feature activation+6.17e-01

Pos logit contributions
 Ridley+0.960
essen+0.899
+0.833
rus+0.816
essel+0.810
Neg logit contributions
rocal-0.936
 sadd-0.868
 Signed-0.861
 offending-0.856
OA-0.815
Tokenensity
Token ID6377
Feature activation+3.14e-02

Pos logit contributions
rocal+1.449
 sadd+1.327
 Signed+1.325
 offending+1.305
idad+1.241
Neg logit contributions
 Ridley-2.298
essen-2.158
-2.046
rus-1.998
essel-1.965
Token )
Token ID1267
Feature activation-2.44e-03

Pos logit contributions
rocal+0.094
 sadd+0.088
 Signed+0.087
 offending+0.087
idad+0.083
Neg logit contributions
 Ridley-0.097
essen-0.090
-0.083
rus-0.082
essel-0.081
Token const
Token ID1500
Feature activation-1.17e+00

Pos logit contributions
 Ridley+0.008
essen+0.008
+0.007
rus+0.007
essel+0.007
Neg logit contributions
rocal-0.006
 sadd-0.006
 Signed-0.006
 offending-0.006
idad-0.006
Token {
Token ID1391
Feature activation-1.55e+00

Pos logit contributions
 Ridley+3.851
essen+3.548
rus+3.384
essel+3.289
+3.272
Neg logit contributions
rocal-3.178
 offending-2.898
 sadd-2.872
 Signed-2.827
idad-2.813
Token const
Token ID1500
Feature activation-7.17e-01

Pos logit contributions
 Ridley+4.248
essen+3.723
essel+3.456
rus+3.428
+3.293
Neg logit contributions
rocal-5.003
 sadd-4.655
idad-4.627
 offending-4.598
 Signed-4.569
Token float
Token ID12178
Feature activation-4.65e-01

Pos logit contributions
 Ridley+2.368
essen+2.179
+2.030
essel+2.020
rus+2.006
Neg logit contributions
rocal-1.972
 sadd-1.807
 offending-1.791
 Signed-1.782
idad-1.747
Token f
Token ID277
Feature activation+2.63e-02

Pos logit contributions
 Ridley+1.529
essen+1.423
+1.326
essel+1.299
rus+1.297
Neg logit contributions
rocal-1.305
 Signed-1.196
 sadd-1.194
 offending-1.174
idad-1.137
TokenMass
Token ID20273
Feature activation+2.53e-01

Pos logit contributions
rocal+0.079
 sadd+0.074
 Signed+0.074
 offending+0.073
idad+0.070
Neg logit contributions
 Ridley-0.081
essen-0.075
-0.070
rus-0.068
essel-0.068
Token =
Token ID796
Feature activation-3.06e-01

Pos logit contributions
rocal+0.801
 sadd+0.750
 Signed+0.749
 offending+0.742
idad+0.715
Neg logit contributions
 Ridley-0.736
essen-0.681
-0.632
rus-0.618
essel-0.609
Token Connection
Token ID26923
Feature activation-3.13e-01

Pos logit contributions
rocal+3.033
 sadd+2.821
 Signed+2.807
 offending+2.784
idad+2.669
Neg logit contributions
 Ridley-3.283
essen-3.059
-2.853
rus-2.786
essel-2.741
Token (
Token ID357
Feature activation+9.48e-03

Pos logit contributions
 Ridley+0.908
essen+0.832
+0.761
rus+0.756
essel+0.739
Neg logit contributions
rocal-1.006
 sadd-0.914
 offending-0.913
 Signed-0.899
idad-0.876
TokenCo
Token ID7222
Feature activation+3.43e-01

Pos logit contributions
rocal+0.026
 sadd+0.023
 offending+0.023
 Signed+0.023
 phr+0.022
Neg logit contributions
 Ridley-0.032
essen-0.030
-0.029
rus-0.028
essel-0.027
Tokenax
Token ID897
Feature activation+3.49e-01

Pos logit contributions
rocal+1.076
 sadd+1.006
 offending+0.995
 Signed+0.994
idad+0.957
Neg logit contributions
 Ridley-1.013
essen-0.946
-0.873
rus-0.856
essel-0.838
Token)
Token ID8
Feature activation+2.72e-01

Pos logit contributions
rocal+1.000
 sadd+0.921
 offending+0.907
 Signed+0.906
 phr+0.867
Neg logit contributions
 Ridley-1.150
essen-1.051
-0.981
rus-0.980
essel-0.958
Token under
Token ID739
Feature activation-1.54e+00

Pos logit contributions
rocal+0.832
 sadd+0.765
 Signed+0.756
 offending+0.755
idad+0.729
Neg logit contributions
 Ridley-0.834
essen-0.773
-0.714
rus-0.705
essel-0.690
Token the
Token ID262
Feature activation-6.22e-01

Pos logit contributions
 Ridley+5.015
essen+4.653
rus+4.361
+4.302
asions+4.144
Neg logit contributions
rocal-4.223
 sadd-3.766
 offending-3.679
 Signed-3.630
idad-3.628
Token list
Token ID1351
Feature activation-7.97e-02

Pos logit contributions
 Ridley+2.089
essen+1.945
+1.805
rus+1.793
essel+1.737
Neg logit contributions
rocal-1.719
 sadd-1.530
 Signed-1.507
 offending-1.478
ENCY-1.467
Token of
Token ID286
Feature activation+2.17e-02

Pos logit contributions
 Ridley+0.238
essen+0.221
+0.205
rus+0.201
essel+0.196
Neg logit contributions
rocal-0.249
 sadd-0.230
 Signed-0.228
 offending-0.226
ENCY-0.218
Token brid
Token ID38265
Feature activation-8.90e-02

Pos logit contributions
rocal+0.058
 sadd+0.052
 Signed+0.052
 offending+0.051
idad+0.050
Neg logit contributions
 Ridley-0.075
essen-0.070
-0.065
rus-0.064
essel-0.063
Tokenged
Token ID2004
Feature activation+1.41e-01

Pos logit contributions
 Ridley+0.252
essen+0.235
+0.217
rus+0.212
essel+0.209
Neg logit contributions
rocal-0.289
 sadd-0.267
 Signed-0.267
 offending-0.266
OA-0.258
Token Washington
Token ID2669
Feature activation-6.96e-01

Pos logit contributions
 Ridley+2.187
essen+1.928
rus+1.771
+1.727
essel+1.649
Neg logit contributions
rocal-3.634
 sadd-3.435
 offending-3.398
 Signed-3.393
idad-3.299
Token Times
Token ID3782
Feature activation+5.42e-02

Pos logit contributions
 Ridley+2.150
essen+2.009
rus+1.829
+1.829
essel+1.766
Neg logit contributions
rocal-2.079
 sadd-1.924
 offending-1.902
 Signed-1.873
ENCY-1.804
Token,
Token ID11
Feature activation-6.82e-01

Pos logit contributions
rocal+0.174
 sadd+0.163
 Signed+0.162
 offending+0.161
idad+0.155
Neg logit contributions
 Ridley-0.155
essen-0.144
-0.133
rus-0.130
essel-0.127
Token LLC
Token ID11419
Feature activation-3.59e-01

Pos logit contributions
 Ridley+2.427
essen+2.277
+2.217
rus+2.110
essel+2.096
Neg logit contributions
rocal-1.619
 Signed-1.476
 offending-1.462
ENCY-1.458
OA-1.452
Token.
Token ID13
Feature activation-4.61e-01

Pos logit contributions
 Ridley+0.870
essen+0.791
+0.716
rus+0.702
essel+0.681
Neg logit contributions
rocal-1.317
 sadd-1.232
 offending-1.230
 Signed-1.227
ENCY-1.180
Token Click
Token ID6914
Feature activation-1.53e+00

Pos logit contributions
 Ridley+1.115
essen+1.008
+0.918
rus+0.895
essel+0.864
Neg logit contributions
rocal-1.696
 sadd-1.601
 Signed-1.594
 offending-1.589
ENCY-1.539
Token here
Token ID994
Feature activation-5.12e-01

Pos logit contributions
 Ridley+2.827
essen+2.523
rus+2.289
+2.186
essel+2.076
Neg logit contributions
rocal-6.056
 Signed-5.854
 sadd-5.752
ENCY-5.722
idad-5.596
Token for
Token ID329
Feature activation+9.44e-02

Pos logit contributions
 Ridley+1.477
essen+1.367
+1.263
rus+1.237
essel+1.210
Neg logit contributions
rocal-1.628
 sadd-1.524
 Signed-1.523
 offending-1.504
OA-1.452
Token reprint
Token ID26589
Feature activation-4.49e-01

Pos logit contributions
rocal+0.276
 Signed+0.260
 sadd+0.259
 offending+0.254
OA+0.249
Neg logit contributions
 Ridley-0.294
essen-0.274
-0.254
rus-0.252
essel-0.243
Token permission
Token ID7170
Feature activation-6.74e-01

Pos logit contributions
 Ridley+1.705
essen+1.606
+1.529
rus+1.494
essel+1.472
Neg logit contributions
rocal-1.020
 sadd-0.928
 Signed-0.919
 offending-0.909
OA-0.867
Token.
Token ID13
Feature activation-4.26e-01

Pos logit contributions
 Ridley+1.637
essen+1.485
+1.352
rus+1.310
essel+1.287
Neg logit contributions
rocal-2.458
 sadd-2.318
 offending-2.315
 Signed-2.292
OA-2.230
Tokendata
Token ID7890
Feature activation-6.59e-01

Pos logit contributions
 Ridley+1.226
essen+1.146
rus+1.033
+1.013
essel+1.013
Neg logit contributions
rocal-1.758
 Signed-1.680
 sadd-1.676
 offending-1.669
 rud-1.600
Token _
Token ID4808
Feature activation-5.30e-01

Pos logit contributions
 Ridley+2.219
essen+2.041
+1.913
rus+1.898
essel+1.875
Neg logit contributions
rocal-1.770
 sadd-1.604
 Signed-1.599
 offending-1.597
idad-1.565
Token full
Token ID1336
Feature activation-3.60e-01

Pos logit contributions
 Ridley+1.360
essen+1.246
+1.123
essel+1.103
rus+1.098
Neg logit contributions
rocal-1.854
 Signed-1.725
 sadd-1.708
 offending-1.695
idad-1.673
Token v
Token ID410
Feature activation+9.86e-02

Pos logit contributions
 Ridley+1.364
essen+1.278
+1.203
rus+1.195
essel+1.181
Neg logit contributions
rocal-0.822
 Signed-0.737
 sadd-0.735
 offending-0.727
ENCY-0.713
Token )
Token ID1267
Feature activation-5.08e-01

Pos logit contributions
rocal+0.272
 sadd+0.253
 Signed+0.252
 offending+0.249
idad+0.238
Neg logit contributions
 Ridley-0.327
essen-0.305
-0.286
rus-0.280
essel-0.276
Token {
Token ID1391
Feature activation-1.49e+00

Pos logit contributions
 Ridley+1.795
essen+1.670
+1.563
rus+1.561
essel+1.533
Neg logit contributions
rocal-1.291
 offending-1.184
 sadd-1.177
 Signed-1.168
idad-1.135
Token //
Token ID3373
Feature activation+6.47e-01

Pos logit contributions
 Ridley+5.145
essen+4.718
rus+4.491
essel+4.432
+4.390
Neg logit contributions
rocal-3.707
idad-3.305
 sadd-3.303
 Signed-3.276
OA-3.252
Token Time
Token ID3862
Feature activation+1.92e-02

Pos logit contributions
rocal+1.699
 Signed+1.577
 sadd+1.574
 offending+1.550
OA+1.479
Neg logit contributions
 Ridley-2.229
essen-2.093
-1.974
rus-1.914
essel-1.885
Token component
Token ID7515
Feature activation-3.94e-01

Pos logit contributions
rocal+0.050
 Signed+0.046
 sadd+0.046
 offending+0.045
idad+0.044
Neg logit contributions
 Ridley-0.066
essen-0.062
-0.058
rus-0.057
essel-0.057
Token const
Token ID1500
Feature activation-1.05e+00

Pos logit contributions
 Ridley+1.149
essen+1.062
+0.978
rus+0.962
essel+0.952
Neg logit contributions
rocal-1.243
 sadd-1.152
 Signed-1.149
 offending-1.142
idad-1.112
Token float
Token ID12178
Feature activation-5.92e-01

Pos logit contributions
 Ridley+3.227
essen+2.972
rus+2.741
essel+2.737
+2.700
Neg logit contributions
rocal-3.144
 Signed-2.822
 offending-2.818
 sadd-2.806
idad-2.778
Token of
Token ID286
Feature activation-4.68e-01

Pos logit contributions
 Ridley+2.506
essen+2.277
+2.104
rus+2.093
essel+2.021
Neg logit contributions
rocal-2.685
 Signed-2.499
 sadd-2.445
 offending-2.443
ENCY-2.405
Token vouchers
Token ID41262
Feature activation+5.04e-02

Pos logit contributions
 Ridley+1.530
essen+1.433
+1.305
rus+1.291
essel+1.253
Neg logit contributions
rocal-1.337
 Signed-1.223
 sadd-1.199
 offending-1.186
idad-1.171
Token into
Token ID656
Feature activation+8.19e-01

Pos logit contributions
rocal+0.166
 Signed+0.155
 sadd+0.154
 offending+0.153
OA+0.149
Neg logit contributions
 Ridley-0.141
essen-0.130
-0.119
rus-0.117
essel-0.114
Token nine
Token ID5193
Feature activation+7.52e-02

Pos logit contributions
rocal+2.301
 sadd+2.176
 Signed+2.147
 offending+2.143
 phr+2.033
Neg logit contributions
 Ridley-2.635
essen-2.462
-2.332
rus-2.254
essel-2.233
Token new
Token ID649
Feature activation+2.45e-02

Pos logit contributions
rocal+0.213
 Signed+0.196
 sadd+0.194
 offending+0.192
OA+0.186
Neg logit contributions
 Ridley-0.247
essen-0.231
-0.213
rus-0.210
essel-0.207
Token school
Token ID1524
Feature activation-1.48e+00

Pos logit contributions
rocal+0.071
 Signed+0.065
 sadd+0.065
 offending+0.064
OA+0.062
Neg logit contributions
 Ridley-0.079
essen-0.073
-0.068
rus-0.067
essel-0.066
Token districts
Token ID12815
Feature activation+9.26e-02

Pos logit contributions
 Ridley+3.373
essen+3.018
+2.777
rus+2.708
essel+2.622
Neg logit contributions
rocal-5.521
 Signed-5.223
OA-5.066
 offending-5.045
 sadd-5.036
Token,
Token ID11
Feature activation-1.81e-01

Pos logit contributions
rocal+0.290
 Signed+0.268
 sadd+0.266
 offending+0.266
ENCY+0.258
Neg logit contributions
 Ridley-0.275
essen-0.254
-0.235
rus-0.231
essel-0.227
Token with
Token ID351
Feature activation-1.09e-01

Pos logit contributions
 Ridley+0.572
essen+0.532
+0.486
rus+0.480
essel+0.471
Neg logit contributions
rocal-0.537
 Signed-0.490
 offending-0.485
 sadd-0.484
ENCY-0.474
Token the
Token ID262
Feature activation-7.20e-02

Pos logit contributions
 Ridley+0.362
essen+0.338
+0.309
rus+0.308
essel+0.300
Neg logit contributions
rocal-0.308
 Signed-0.283
 sadd-0.279
 offending-0.276
idad-0.270
Token potential
Token ID2785
Feature activation+3.63e-01

Pos logit contributions
 Ridley+0.251
essen+0.232
+0.216
rus+0.215
essel+0.210
Neg logit contributions
rocal-0.190
 Signed-0.173
 sadd-0.169
 offending-0.168
ENCY-0.166
Token Washington
Token ID2669
Feature activation-6.21e-01

Pos logit contributions
 Ridley+2.108
essen+1.858
rus+1.692
+1.685
essel+1.579
Neg logit contributions
rocal-3.543
 sadd-3.335
 Signed-3.311
 offending-3.310
idad-3.208
Token Times
Token ID3782
Feature activation+5.89e-02

Pos logit contributions
 Ridley+1.952
essen+1.821
+1.668
rus+1.660
essel+1.608
Neg logit contributions
rocal-1.827
 sadd-1.690
 offending-1.674
 Signed-1.653
ENCY-1.588
Token,
Token ID11
Feature activation-6.62e-01

Pos logit contributions
rocal+0.191
 sadd+0.178
 Signed+0.178
 offending+0.177
idad+0.170
Neg logit contributions
 Ridley-0.167
essen-0.155
-0.143
rus-0.140
essel-0.137
Token LLC
Token ID11419
Feature activation-2.73e-01

Pos logit contributions
 Ridley+2.403
essen+2.267
+2.185
rus+2.103
essel+2.084
Neg logit contributions
rocal-1.533
 Signed-1.401
 offending-1.382
OA-1.361
ENCY-1.358
Token.
Token ID13
Feature activation-5.28e-01

Pos logit contributions
 Ridley+0.667
essen+0.607
+0.550
rus+0.538
essel+0.523
Neg logit contributions
rocal-1.000
 sadd-0.935
 offending-0.934
 Signed-0.933
ENCY-0.897
Token Click
Token ID6914
Feature activation-1.48e+00

Pos logit contributions
 Ridley+1.251
essen+1.118
+1.024
rus+0.999
essel+0.955
Neg logit contributions
rocal-1.970
 sadd-1.856
 offending-1.854
 Signed-1.846
ENCY-1.789
Token here
Token ID994
Feature activation-4.70e-01

Pos logit contributions
 Ridley+2.679
essen+2.426
rus+2.157
+2.070
essel+1.974
Neg logit contributions
rocal-5.899
 Signed-5.736
ENCY-5.583
 sadd-5.576
 rud-5.455
Token for
Token ID329
Feature activation+1.34e-01

Pos logit contributions
 Ridley+1.349
essen+1.249
+1.152
rus+1.131
essel+1.105
Neg logit contributions
rocal-1.501
 Signed-1.406
 sadd-1.405
 offending-1.386
OA-1.341
Token reprint
Token ID26589
Feature activation-4.11e-01

Pos logit contributions
rocal+0.388
 Signed+0.364
 sadd+0.363
 offending+0.357
OA+0.348
Neg logit contributions
 Ridley-0.420
essen-0.391
-0.363
rus-0.359
essel-0.349
Token permission
Token ID7170
Feature activation-6.21e-01

Pos logit contributions
 Ridley+1.560
essen+1.469
+1.400
rus+1.368
essel+1.348
Neg logit contributions
rocal-0.935
 sadd-0.851
 Signed-0.842
 offending-0.835
OA-0.795
Token.
Token ID13
Feature activation-4.18e-01

Pos logit contributions
 Ridley+1.521
essen+1.380
+1.258
rus+1.221
essel+1.199
Neg logit contributions
rocal-2.257
 offending-2.128
 sadd-2.124
 Signed-2.107
idad-2.040
Token individual
Token ID1981
Feature activation-7.06e-01

Pos logit contributions
 Ridley+0.164
essen+0.152
+0.141
rus+0.139
essel+0.137
Neg logit contributions
rocal-0.148
 sadd-0.138
 Signed-0.137
 offending-0.134
OA-0.131
Token is
Token ID318
Feature activation-2.66e-01

Pos logit contributions
 Ridley+2.077
essen+1.911
+1.741
rus+1.738
essel+1.705
Neg logit contributions
rocal-2.216
 Signed-2.020
 sadd-2.014
OA-1.976
ENCY-1.973
Token equal
Token ID4961
Feature activation-4.62e-01

Pos logit contributions
 Ridley+0.979
essen+0.919
+0.861
rus+0.849
essel+0.837
Neg logit contributions
rocal-0.638
 sadd-0.575
 Signed-0.575
 offending-0.567
idad-0.550
Token before
Token ID878
Feature activation+2.26e-02

Pos logit contributions
 Ridley+1.325
essen+1.222
+1.123
rus+1.099
essel+1.084
Neg logit contributions
rocal-1.488
 sadd-1.386
 Signed-1.384
 offending-1.368
OA-1.332
Token and
Token ID290
Feature activation-1.15e-01

Pos logit contributions
rocal+0.070
 sadd+0.066
 Signed+0.066
 offending+0.065
OA+0.063
Neg logit contributions
 Ridley-0.067
essen-0.062
-0.057
rus-0.056
essel-0.055
Token under
Token ID739
Feature activation-1.45e+00

Pos logit contributions
 Ridley+0.347
essen+0.322
+0.298
rus+0.291
essel+0.286
Neg logit contributions
rocal-0.350
 sadd-0.326
 Signed-0.326
 offending-0.321
OA-0.312
Token the
Token ID262
Feature activation-3.23e-01

Pos logit contributions
 Ridley+4.131
essen+3.683
+3.390
rus+3.326
essel+3.305
Neg logit contributions
rocal-4.542
 sadd-4.272
 Signed-4.268
 offending-4.204
ENCY-4.176
Token law
Token ID1099
Feature activation+6.59e-01

Pos logit contributions
 Ridley+1.247
essen+1.157
+1.089
rus+1.073
essel+1.072
Neg logit contributions
rocal-0.722
 Signed-0.651
 sadd-0.649
 offending-0.637
OA-0.622
Token and
Token ID290
Feature activation-7.25e-02

Pos logit contributions
rocal+2.066
 sadd+1.952
 Signed+1.941
 offending+1.927
idad+1.845
Neg logit contributions
 Ridley-1.915
essen-1.786
-1.660
rus-1.610
essel-1.580
Token has
Token ID468
Feature activation-3.66e-01

Pos logit contributions
 Ridley+0.218
essen+0.202
+0.187
rus+0.182
essel+0.179
Neg logit contributions
rocal-0.223
 sadd-0.208
 Signed-0.208
 offending-0.205
OA-0.199
Token the
Token ID262
Feature activation-2.05e-01

Pos logit contributions
 Ridley+1.084
essen+1.001
+0.918
rus+0.901
essel+0.882
Neg logit contributions
rocal-1.146
 Signed-1.063
 sadd-1.061
 offending-1.047
OA-1.024
Token you
Token ID345
Feature activation-1.16e-02

Pos logit contributions
rocal+0.474
 sadd+0.436
 Signed+0.435
 offending+0.431
idad+0.419
Neg logit contributions
 Ridley-0.463
essen-0.424
-0.391
rus-0.385
essel-0.378
Token were
Token ID547
Feature activation+1.70e-01

Pos logit contributions
 Ridley+0.034
essen+0.031
+0.029
rus+0.028
essel+0.027
Neg logit contributions
rocal-0.037
 sadd-0.034
 Signed-0.034
 offending-0.034
ENCY-0.033
Token dismissed
Token ID11126
Feature activation-5.06e-01

Pos logit contributions
rocal+0.449
 Signed+0.406
 sadd+0.406
 offending+0.400
idad+0.387
Neg logit contributions
 Ridley-0.592
essen-0.552
-0.514
rus-0.510
essel-0.499
Token as
Token ID355
Feature activation-1.77e-01

Pos logit contributions
 Ridley+1.531
essen+1.404
+1.302
rus+1.273
essel+1.248
Neg logit contributions
rocal-1.569
 Signed-1.426
 sadd-1.423
 offending-1.403
ENCY-1.379
Token a
Token ID257
Feature activation-2.04e-01

Pos logit contributions
 Ridley+0.576
essen+0.533
+0.498
rus+0.493
essel+0.477
Neg logit contributions
rocal-0.510
 Signed-0.464
 sadd-0.464
 offending-0.456
idad-0.443
Token conspiracy
Token ID10086
Feature activation-1.45e+00

Pos logit contributions
 Ridley+0.678
essen+0.630
+0.589
rus+0.582
essel+0.567
Neg logit contributions
rocal-0.570
 sadd-0.521
 Signed-0.520
 offending-0.515
idad-0.494
Token theorist
Token ID46410
Feature activation-1.15e+00

Pos logit contributions
 Ridley+2.745
essen+2.397
rus+2.101
+2.089
essel+1.914
Neg logit contributions
rocal-6.110
 Signed-5.619
 offending-5.528
ENCY-5.523
 sadd-5.466
Token.
Token ID13
Feature activation+6.72e-02

Pos logit contributions
 Ridley+3.090
essen+2.803
+2.519
rus+2.493
essel+2.404
Neg logit contributions
rocal-3.938
 Signed-3.546
 sadd-3.489
ENCY-3.475
 offending-3.460
Token Or
Token ID1471
Feature activation+1.95e-01

Pos logit contributions
rocal+0.198
 sadd+0.182
 offending+0.181
 Signed+0.179
idad+0.173
Neg logit contributions
 Ridley-0.215
essen-0.197
-0.184
rus-0.181
essel-0.177
Token if
Token ID611
Feature activation-1.50e-02

Pos logit contributions
rocal+0.584
 sadd+0.537
 Signed+0.536
 offending+0.530
idad+0.513
Neg logit contributions
 Ridley-0.614
essen-0.566
-0.522
rus-0.517
essel-0.503
Token you
Token ID345
Feature activation+1.63e-01

Pos logit contributions
 Ridley+0.046
essen+0.042
+0.039
rus+0.038
essel+0.038
Neg logit contributions
rocal-0.046
 sadd-0.042
 Signed-0.042
 offending-0.042
idad-0.040
Token returned
Token ID4504
Feature activation-2.13e-01

Pos logit contributions
 Ridley+1.508
essen+1.396
+1.285
rus+1.265
essel+1.243
Neg logit contributions
rocal-1.619
 Signed-1.485
 sadd-1.475
 offending-1.466
ENCY-1.435
Token to
Token ID284
Feature activation-2.63e-02

Pos logit contributions
 Ridley+0.633
essen+0.588
+0.544
rus+0.532
essel+0.519
Neg logit contributions
rocal-0.667
 Signed-0.619
 sadd-0.618
 offending-0.609
OA-0.589
Token the
Token ID262
Feature activation-5.35e-04

Pos logit contributions
 Ridley+0.079
essen+0.073
+0.067
rus+0.066
essel+0.065
Neg logit contributions
rocal-0.082
 Signed-0.076
 sadd-0.076
 offending-0.074
OA-0.072
Token Meg
Token ID8336
Feature activation+1.97e-02

Pos logit contributions
 Ridley+0.002
essen+0.002
+0.002
rus+0.002
essel+0.002
Neg logit contributions
rocal-0.001
 Signed-0.001
 sadd-0.001
 offending-0.001
OA-0.001
Tokenarian
Token ID3699
Feature activation-1.65e-01

Pos logit contributions
rocal+0.061
 sadd+0.057
 Signed+0.057
 offending+0.057
ENCY+0.054
Neg logit contributions
 Ridley-0.059
essen-0.055
-0.050
rus-0.050
essel-0.049
Token school
Token ID1524
Feature activation-1.45e+00

Pos logit contributions
 Ridley+0.547
essen+0.511
+0.472
rus+0.468
essel+0.462
Neg logit contributions
rocal-0.461
 sadd-0.422
 Signed-0.422
 offending-0.420
ENCY-0.403
Token:
Token ID25
Feature activation+1.59e-02

Pos logit contributions
 Ridley+3.974
essen+3.635
rus+3.327
+3.296
essel+3.215
Neg logit contributions
rocal-4.687
 offending-4.305
 Signed-4.279
 sadd-4.240
ENCY-4.173
Tokenonly
Token ID8807
Feature activation-6.46e-01

Pos logit contributions
rocal+0.048
 sadd+0.045
 Signed+0.045
 offending+0.044
idad+0.043
Neg logit contributions
 Ridley-0.049
essen-0.045
-0.042
rus-0.041
essel-0.040
Token the
Token ID262
Feature activation-1.20e-01

Pos logit contributions
 Ridley+1.950
essen+1.820
rus+1.653
+1.647
essel+1.601
Neg logit contributions
rocal-1.976
 Signed-1.818
 sadd-1.803
 offending-1.782
ENCY-1.740
Token multiple
Token ID3294
Feature activation+4.64e-02

Pos logit contributions
 Ridley+0.420
essen+0.391
+0.364
rus+0.360
essel+0.357
Neg logit contributions
rocal-0.311
 Signed-0.283
 sadd-0.282
 offending-0.278
ENCY-0.269
Token is
Token ID318
Feature activation-2.53e-02

Pos logit contributions
rocal+0.144
 Signed+0.133
 sadd+0.133
 offending+0.131
ENCY+0.127
Neg logit contributions
 Ridley-0.140
essen-0.130
-0.119
rus-0.117
essel-0.115
Token less
Token ID1342
Feature activation-4.50e-01

Pos logit contributions
rocal+0.007
 Signed+0.006
 sadd+0.006
 offending+0.006
ENCY+0.006
Neg logit contributions
 Ridley-0.008
essen-0.008
-0.007
rus-0.007
essel-0.007
Token surprised
Token ID6655
Feature activation-5.68e-01

Pos logit contributions
 Ridley+1.558
essen+1.450
rus+1.325
+1.317
essel+1.297
Neg logit contributions
rocal-1.203
 Signed-1.070
 sadd-1.050
ENCY-1.041
OA-1.040
Token by
Token ID416
Feature activation+1.60e-03

Pos logit contributions
 Ridley+1.612
essen+1.477
+1.348
rus+1.324
essel+1.292
Neg logit contributions
rocal-1.875
 Signed-1.732
 sadd-1.729
 offending-1.710
idad-1.674
Token the
Token ID262
Feature activation-2.78e-02

Pos logit contributions
rocal+0.005
 sadd+0.005
 Signed+0.005
 offending+0.005
idad+0.004
Neg logit contributions
 Ridley-0.005
essen-0.004
-0.004
rus-0.004
essel-0.004
Token long
Token ID890
Feature activation+2.61e-01

Pos logit contributions
 Ridley+0.097
essen+0.090
+0.083
rus+0.082
essel+0.081
Neg logit contributions
rocal-0.074
 Signed-0.067
 sadd-0.067
 offending-0.066
idad-0.064
Token incub
Token ID25543
Feature activation-1.44e+00

Pos logit contributions
rocal+1.051
 sadd+0.997
 Signed+0.996
 offending+0.988
idad+0.960
Neg logit contributions
 Ridley-0.541
essen-0.485
-0.430
rus-0.415
essel-0.404
Tokenation
Token ID341
Feature activation-3.55e-01

Pos logit contributions
 Ridley+3.744
essen+3.576
rus+3.239
+3.144
essel+3.137
Neg logit contributions
rocal-4.796
 Signed-4.560
 sadd-4.524
 offending-4.505
OA-4.313
Token times
Token ID1661
Feature activation-4.72e-02

Pos logit contributions
 Ridley+1.013
essen+0.926
+0.845
rus+0.837
essel+0.823
Neg logit contributions
rocal-1.156
 Signed-1.083
 sadd-1.064
 offending-1.056
ENCY-1.029
Token,
Token ID11
Feature activation-1.45e-01

Pos logit contributions
 Ridley+0.129
essen+0.118
+0.108
rus+0.105
essel+0.103
Neg logit contributions
rocal-0.160
 Signed-0.149
 sadd-0.149
 offending-0.148
idad-0.143
Token noting
Token ID10820
Feature activation-2.83e-01

Pos logit contributions
 Ridley+0.485
essen+0.451
+0.419
rus+0.410
essel+0.406
Neg logit contributions
rocal-0.410
 Signed-0.370
 sadd-0.365
 offending-0.361
idad-0.356
Token that
Token ID326
Feature activation-2.37e-01

Pos logit contributions
 Ridley+0.838
essen+0.768
+0.709
rus+0.699
essel+0.684
Neg logit contributions
rocal-0.895
 Signed-0.832
 sadd-0.829
 offending-0.819
OA-0.795
Token of
Token ID286
Feature activation-5.93e-02

Pos logit contributions
rocal+0.309
 Signed+0.287
 sadd+0.286
 offending+0.283
OA+0.274
Neg logit contributions
 Ridley-0.297
essen-0.274
-0.254
rus-0.248
essel-0.243
Token a
Token ID257
Feature activation-1.05e-01

Pos logit contributions
 Ridley+0.180
essen+0.166
+0.152
rus+0.150
essel+0.147
Neg logit contributions
rocal-0.183
 Signed-0.169
 sadd-0.167
 offending-0.166
OA-0.162
Token wide
Token ID3094
Feature activation+1.02e-01

Pos logit contributions
 Ridley+0.395
essen+0.363
rus+0.340
+0.336
essel+0.332
Neg logit contributions
rocal-0.252
 Signed-0.227
 sadd-0.222
 offending-0.222
OA-0.215
Token-
Token ID12
Feature activation+4.36e-02

Pos logit contributions
rocal+0.326
 Signed+0.304
 sadd+0.301
 offending+0.298
ENCY+0.291
Neg logit contributions
 Ridley-0.301
essen-0.276
-0.251
rus-0.250
essel-0.245
Tokenopen
Token ID9654
Feature activation+4.68e-01

Pos logit contributions
rocal+0.130
 sadd+0.121
 Signed+0.120
 offending+0.119
idad+0.114
Neg logit contributions
 Ridley-0.135
essen-0.127
-0.117
rus-0.116
essel-0.113
Token school
Token ID1524
Feature activation-1.44e+00

Pos logit contributions
rocal+1.295
 Signed+1.196
 sadd+1.193
 offending+1.180
OA+1.136
Neg logit contributions
 Ridley-1.560
essen-1.453
-1.356
rus-1.332
essel-1.309
Token voucher
Token ID40621
Feature activation-2.60e-01

Pos logit contributions
 Ridley+4.534
essen+4.189
rus+3.931
+3.887
essel+3.779
Neg logit contributions
rocal-4.096
 Signed-3.803
 offending-3.642
OA-3.626
 sadd-3.610
Token program
Token ID1430
Feature activation+4.40e-01

Pos logit contributions
 Ridley+0.949
essen+0.890
+0.832
rus+0.830
essel+0.812
Neg logit contributions
rocal-0.634
 Signed-0.576
 sadd-0.570
 offending-0.564
ENCY-0.546
Token for
Token ID329
Feature activation+1.71e-01

Pos logit contributions
rocal+1.405
 sadd+1.319
 Signed+1.314
 offending+1.303
idad+1.252
Neg logit contributions
 Ridley-1.271
essen-1.177
-1.091
rus-1.060
essel-1.043
Token the
Token ID262
Feature activation-4.72e-02

Pos logit contributions
rocal+0.521
 Signed+0.482
 sadd+0.481
 offending+0.475
OA+0.460
Neg logit contributions
 Ridley-0.526
essen-0.489
-0.450
rus-0.442
essel-0.434
Token state
Token ID1181
Feature activation+2.39e-01

Pos logit contributions
 Ridley+0.151
essen+0.140
+0.128
rus+0.127
essel+0.124
Neg logit contributions
rocal-0.139
 Signed-0.128
 sadd-0.126
 offending-0.125
OA-0.122
TokenU
Token ID52
Feature activation+3.68e-01

Pos logit contributions
rocal+0.612
 sadd+0.569
 Signed+0.566
 offending+0.562
idad+0.544
Neg logit contributions
 Ridley-0.555
essen-0.514
-0.471
rus-0.463
essel-0.454
Token which
Token ID543
Feature activation-1.69e-01

Pos logit contributions
rocal+1.179
 sadd+1.098
 Signed+1.093
 offending+1.086
idad+1.048
Neg logit contributions
 Ridley-1.071
essen-0.991
-0.914
rus-0.894
essel-0.877
Token even
Token ID772
Feature activation-1.20e-01

Pos logit contributions
 Ridley+0.494
essen+0.445
rus+0.404
+0.404
essel+0.397
Neg logit contributions
rocal-0.555
 Signed-0.503
 sadd-0.500
 offending-0.498
ENCY-0.488
Token uses
Token ID3544
Feature activation+8.62e-02

Pos logit contributions
 Ridley+0.357
essen+0.325
+0.297
rus+0.294
essel+0.288
Neg logit contributions
rocal-0.388
 Signed-0.353
 sadd-0.351
 offending-0.348
ENCY-0.340
Token the
Token ID262
Feature activation+4.05e-02

Pos logit contributions
rocal+0.259
 sadd+0.237
 Signed+0.235
 offending+0.235
idad+0.229
Neg logit contributions
 Ridley-0.272
essen-0.249
rus-0.228
-0.228
essel-0.223
Token same
Token ID976
Feature activation+9.61e-01

Pos logit contributions
rocal+0.106
 sadd+0.096
 Signed+0.094
 offending+0.093
idad+0.092
Neg logit contributions
 Ridley-0.144
essen-0.132
-0.122
rus-0.122
essel-0.120
Token panel
Token ID6103
Feature activation-2.02e-01

Pos logit contributions
rocal+2.599
 sadd+2.400
 Signed+2.389
 offending+2.362
idad+2.262
Neg logit contributions
 Ridley-3.261
essen-3.046
-2.854
rus-2.794
essel-2.756
Token as
Token ID355
Feature activation-1.47e-01

Pos logit contributions
 Ridley+0.621
essen+0.569
+0.529
rus+0.514
essel+0.511
Neg logit contributions
rocal-0.622
 Signed-0.572
 sadd-0.570
 offending-0.566
ENCY-0.546
Token the
Token ID262
Feature activation+5.68e-02

Pos logit contributions
 Ridley+0.433
essen+0.393
+0.361
rus+0.357
essel+0.347
Neg logit contributions
rocal-0.470
 sadd-0.435
 Signed-0.431
 offending-0.427
idad-0.419
Token Asus
Token ID46301
Feature activation-1.28e-01

Pos logit contributions
rocal+0.158
 sadd+0.144
 Signed+0.142
 offending+0.139
idad+0.138
Neg logit contributions
 Ridley-0.193
essen-0.177
-0.164
rus-0.163
essel-0.160
Token.
Token ID13
Feature activation+8.28e-02

Pos logit contributions
 Ridley+0.433
essen+0.400
rus+0.370
+0.369
essel+0.359
Neg logit contributions
rocal-0.358
 sadd-0.327
 Signed-0.324
 offending-0.319
idad-0.310
Token registered
Token ID6823
Feature activation+6.18e-02

Pos logit contributions
rocal+0.522
 Signed+0.477
 sadd+0.477
 offending+0.474
ENCY+0.459
Neg logit contributions
 Ridley-0.536
essen-0.498
-0.457
rus-0.453
essel-0.444
Token their
Token ID511
Feature activation-5.14e-02

Pos logit contributions
rocal+0.191
 sadd+0.176
 Signed+0.176
 offending+0.175
idad+0.169
Neg logit contributions
 Ridley-0.187
essen-0.172
-0.159
rus-0.156
essel-0.153
Token mobile
Token ID5175
Feature activation+3.41e-02

Pos logit contributions
 Ridley+0.173
essen+0.160
+0.149
rus+0.148
essel+0.145
Neg logit contributions
rocal-0.142
 Signed-0.129
 sadd-0.128
 offending-0.126
ENCY-0.123
Token numbers
Token ID3146
Feature activation+2.40e-01

Pos logit contributions
rocal+0.059
 Signed+0.051
 sadd+0.050
 offending+0.049
idad+0.047
Neg logit contributions
 Ridley-0.148
essen-0.140
-0.133
rus-0.132
essel-0.130
Token as
Token ID355
Feature activation+3.44e-01

Pos logit contributions
rocal+0.735
 sadd+0.680
 Signed+0.678
 offending+0.672
idad+0.648
Neg logit contributions
 Ridley-0.727
essen-0.672
-0.624
rus-0.613
essel-0.598
Token part
Token ID636
Feature activation+1.07e+00

Pos logit contributions
rocal+0.967
 sadd+0.885
 Signed+0.882
 offending+0.871
idad+0.841
Neg logit contributions
 Ridley-1.136
essen-1.056
-0.982
rus-0.967
essel-0.950
Token of
Token ID286
Feature activation-4.07e-02

Pos logit contributions
rocal+2.930
 sadd+2.732
 Signed+2.728
 offending+2.669
OA+2.581
Neg logit contributions
 Ridley-3.530
essen-3.344
-3.121
rus-3.018
essel-2.998
Token the
Token ID262
Feature activation+6.53e-02

Pos logit contributions
 Ridley+0.124
essen+0.115
+0.106
rus+0.105
essel+0.102
Neg logit contributions
rocal-0.124
 sadd-0.115
 Signed-0.114
 offending-0.114
idad-0.110
Token My
Token ID2011
Feature activation-5.77e-01

Pos logit contributions
rocal+0.176
 sadd+0.161
 Signed+0.159
 offending+0.158
ENCY+0.153
Neg logit contributions
 Ridley-0.223
essen-0.208
-0.193
rus-0.191
essel-0.187
Token Number
Token ID7913
Feature activation+3.66e-01

Pos logit contributions
 Ridley+1.931
essen+1.818
rus+1.704
+1.695
essel+1.653
Neg logit contributions
rocal-1.528
 sadd-1.411
 offending-1.405
 Signed-1.375
ENCY-1.330
Token,
Token ID11
Feature activation-9.25e-02

Pos logit contributions
rocal+0.994
 sadd+0.919
 Signed+0.908
 offending+0.905
idad+0.866
Neg logit contributions
 Ridley-1.235
essen-1.158
-1.085
rus-1.065
essel-1.044
Tokenve
Token ID303
Feature activation-1.54e-01

Pos logit contributions
 Ridley+0.699
essen+0.654
+0.600
rus+0.590
essel+0.572
Neg logit contributions
rocal-0.786
 sadd-0.724
 offending-0.719
 Signed-0.719
OA-0.691
Token picked
Token ID6497
Feature activation-3.32e-01

Pos logit contributions
 Ridley+0.502
essen+0.466
+0.432
rus+0.424
essel+0.414
Neg logit contributions
rocal-0.439
 Signed-0.394
 sadd-0.393
 offending-0.392
OA-0.383
Token or
Token ID393
Feature activation+2.30e-01

Pos logit contributions
 Ridley+1.158
essen+1.065
+0.998
rus+0.981
essel+0.957
Neg logit contributions
rocal-0.888
 Signed-0.797
 sadd-0.795
 offending-0.789
idad-0.764
Token were
Token ID547
Feature activation-2.28e-01

Pos logit contributions
rocal+0.649
 Signed+0.590
 sadd+0.588
 offending+0.585
ENCY+0.570
Neg logit contributions
 Ridley-0.757
essen-0.704
-0.657
rus-0.647
essel-0.633
Token dragged
Token ID17901
Feature activation-5.21e-01

Pos logit contributions
 Ridley+0.779
essen+0.732
+0.678
rus+0.668
essel+0.650
Neg logit contributions
rocal-0.621
 Signed-0.547
 offending-0.540
 sadd-0.532
ENCY-0.532
Token into
Token ID656
Feature activation+1.14e+00

Pos logit contributions
 Ridley+1.525
essen+1.401
+1.291
rus+1.271
essel+1.228
Neg logit contributions
rocal-1.665
 Signed-1.523
 sadd-1.488
 offending-1.481
idad-1.470
Token a
Token ID257
Feature activation+1.75e-01

Pos logit contributions
rocal+3.282
 sadd+3.100
 Signed+3.078
 offending+3.058
OA+2.921
Neg logit contributions
 Ridley-3.553
essen-3.331
-3.100
rus-3.029
essel-3.004
Token recent
Token ID2274
Feature activation-2.36e-02

Pos logit contributions
rocal+0.459
 Signed+0.419
 sadd+0.415
 offending+0.410
idad+0.398
Neg logit contributions
 Ridley-0.611
essen-0.568
-0.529
rus-0.523
essel-0.511
Token argument
Token ID4578
Feature activation+3.34e-01

Pos logit contributions
 Ridley+0.082
essen+0.075
+0.070
rus+0.069
essel+0.068
Neg logit contributions
rocal-0.063
 Signed-0.057
 sadd-0.057
 offending-0.055
idad-0.055
Token about
Token ID546
Feature activation+1.54e-01

Pos logit contributions
rocal+0.972
 Signed+0.896
 sadd+0.895
 offending+0.884
idad+0.855
Neg logit contributions
 Ridley-1.068
essen-0.992
-0.925
rus-0.907
essel-0.889
Token gender
Token ID5279
Feature activation-2.82e-01

Pos logit contributions
rocal+0.446
 Signed+0.408
 sadd+0.407
 offending+0.400
OA+0.389
Neg logit contributions
 Ridley-0.496
essen-0.457
-0.427
rus-0.420
essel-0.411
Token optim
Token ID6436
Feature activation-2.21e-01

Pos logit contributions
rocal+0.205
 sadd+0.189
 Signed+0.189
 offending+0.187
OA+0.180
Neg logit contributions
 Ridley-0.225
essen-0.209
-0.195
rus-0.190
essel-0.187
Tokenist
Token ID396
Feature activation-4.95e-01

Pos logit contributions
 Ridley+0.557
essen+0.518
+0.463
rus+0.454
essel+0.444
Neg logit contributions
rocal-0.792
 sadd-0.746
 Signed-0.745
 offending-0.741
ENCY-0.715
Token is
Token ID318
Feature activation+6.57e-02

Pos logit contributions
 Ridley+1.394
essen+1.280
+1.171
rus+1.149
essel+1.125
Neg logit contributions
rocal-1.638
 Signed-1.505
 sadd-1.495
 offending-1.487
ENCY-1.449
Token of
Token ID286
Feature activation+9.20e-02

Pos logit contributions
rocal+0.205
 Signed+0.189
 sadd+0.188
 offending+0.186
OA+0.181
Neg logit contributions
 Ridley-0.199
essen-0.181
-0.170
rus-0.165
essel-0.161
Token her
Token ID607
Feature activation+3.97e-01

Pos logit contributions
rocal+0.292
 sadd+0.271
 Signed+0.271
 offending+0.267
OA+0.259
Neg logit contributions
 Ridley-0.272
essen-0.249
-0.231
rus-0.225
essel-0.221
Token pract
Token ID1970
Feature activation+1.54e+00

Pos logit contributions
rocal+1.066
 Signed+0.973
 sadd+0.969
 offending+0.956
OA+0.924
Neg logit contributions
 Ridley-1.365
essen-1.276
-1.193
rus-1.171
essel-1.151
Tokenising
Token ID1710
Feature activation+9.18e-01

Pos logit contributions
rocal+4.066
 Signed+3.663
 sadd+3.663
 offending+3.654
idad+3.564
Neg logit contributions
 Ridley-5.255
essen-4.798
-4.641
rus-4.472
essel-4.381
Token her
Token ID607
Feature activation+2.38e-01

Pos logit contributions
rocal+2.703
 sadd+2.535
 Signed+2.521
 offending+2.498
 phr+2.393
Neg logit contributions
 Ridley-2.850
essen-2.660
-2.486
rus-2.420
essel-2.390
Token tap
Token ID9814
Feature activation-1.53e-01

Pos logit contributions
rocal+0.636
 Signed+0.584
 sadd+0.583
 offending+0.573
ENCY+0.554
Neg logit contributions
 Ridley-0.816
essen-0.761
-0.711
rus-0.699
essel-0.685
Token-
Token ID12
Feature activation+1.25e-02

Pos logit contributions
 Ridley+0.523
essen+0.504
+0.465
rus+0.463
essel+0.456
Neg logit contributions
rocal-0.399
 Signed-0.369
 sadd-0.364
 offending-0.361
OA-0.348
Tokend
Token ID67
Feature activation+1.64e-01

Pos logit contributions
rocal+0.041
 Signed+0.039
 sadd+0.039
 offending+0.038
ENCY+0.037
Neg logit contributions
 Ridley-0.034
essen-0.032
-0.030
rus-0.029
essel-0.028
Token?
Token ID30
Feature activation+9.43e-02

Pos logit contributions
rocal+1.308
 sadd+1.230
 Signed+1.226
 offending+1.216
idad+1.174
Neg logit contributions
 Ridley-1.037
essen-0.953
-0.877
rus-0.852
essel-0.836
Token Ol
Token ID6544
Feature activation+4.29e-01

Pos logit contributions
rocal+0.309
 sadd+0.288
 offending+0.286
 Signed+0.285
idad+0.275
Neg logit contributions
 Ridley-0.268
essen-0.246
-0.226
rus-0.222
essel-0.217
Tokenaf
Token ID1878
Feature activation+2.69e-01

Pos logit contributions
rocal+1.394
 Signed+1.298
 sadd+1.291
 offending+1.283
OA+1.236
Neg logit contributions
 Ridley-1.227
essen-1.145
-1.050
rus-1.031
essel-1.006
Token?"
Token ID1701
Feature activation+2.76e-01

Pos logit contributions
rocal+0.915
 sadd+0.857
 Signed+0.855
 offending+0.849
OA+0.819
Neg logit contributions
 Ridley-0.724
essen-0.665
-0.609
rus-0.594
essel-0.582
Token I
Token ID314
Feature activation+2.69e-01

Pos logit contributions
rocal+0.862
 sadd+0.801
 Signed+0.797
 offending+0.794
idad+0.766
Neg logit contributions
 Ridley-0.824
essen-0.760
-0.701
rus-0.689
essel-0.676
Token stood
Token ID6204
Feature activation+1.03e+00

Pos logit contributions
rocal+0.773
 Signed+0.714
 sadd+0.713
 offending+0.706
idad+0.677
Neg logit contributions
 Ridley-0.867
essen-0.807
-0.750
rus-0.738
essel-0.725
Token still
Token ID991
Feature activation+7.69e-02

Pos logit contributions
rocal+2.788
 sadd+2.635
 Signed+2.599
 offending+2.595
idad+2.457
Neg logit contributions
 Ridley-3.398
essen-3.203
-3.010
rus-2.921
essel-2.891
Token.
Token ID13
Feature activation+1.78e-02

Pos logit contributions
rocal+0.245
 Signed+0.226
 sadd+0.223
 offending+0.221
ENCY+0.218
Neg logit contributions
 Ridley-0.228
essen-0.208
-0.191
rus-0.189
essel-0.184
Token\n
Token ID198
Feature activation-7.62e-02

Pos logit contributions
rocal+0.054
 sadd+0.050
 offending+0.050
 Signed+0.049
idad+0.048
Neg logit contributions
 Ridley-0.055
essen-0.051
-0.047
rus-0.046
essel-0.045
Token\n
Token ID198
Feature activation-1.29e-01

Pos logit contributions
 Ridley+0.252
essen+0.233
+0.217
rus+0.216
essel+0.211
Neg logit contributions
rocal-0.213
 offending-0.198
 sadd-0.197
 Signed-0.194
idad-0.186
TokenMy
Token ID3666
Feature activation+5.21e-02

Pos logit contributions
 Ridley+0.371
essen+0.345
+0.324
rus+0.319
essel+0.308
Neg logit contributions
rocal-0.408
 offending-0.384
 sadd-0.383
 Signed-0.379
 phr-0.365
Token a
Token ID257
Feature activation+1.02e-01

Pos logit contributions
 Ridley+0.611
essen+0.568
+0.523
rus+0.516
essel+0.503
Neg logit contributions
rocal-0.548
 Signed-0.508
 sadd-0.502
 offending-0.492
OA-0.479
Token rebuilt
Token ID30080
Feature activation-1.22e-01

Pos logit contributions
rocal+0.254
 Signed+0.233
 sadd+0.231
 offending+0.229
ENCY+0.219
Neg logit contributions
 Ridley-0.368
essen-0.343
-0.320
rus-0.318
essel-0.311
Token labor
Token ID4827
Feature activation-1.10e-01

Pos logit contributions
 Ridley+0.412
essen+0.385
+0.357
rus+0.350
essel+0.345
Neg logit contributions
rocal-0.331
 Signed-0.302
 sadd-0.301
 offending-0.295
ENCY-0.286
Token movement
Token ID3356
Feature activation-6.28e-01

Pos logit contributions
 Ridley+0.326
essen+0.304
+0.282
rus+0.274
essel+0.269
Neg logit contributions
rocal-0.341
 sadd-0.321
 Signed-0.320
 offending-0.316
OA-0.304
Token:
Token ID25
Feature activation+3.69e-02

Pos logit contributions
 Ridley+1.737
essen+1.601
+1.445
rus+1.421
essel+1.392
Neg logit contributions
rocal-2.088
 Signed-1.922
 sadd-1.900
 offending-1.895
ENCY-1.876
Token
Token ID564
Feature activation+2.61e-01

Pos logit contributions
rocal+0.106
 sadd+0.098
 Signed+0.097
 offending+0.097
idad+0.093
Neg logit contributions
 Ridley-0.119
essen-0.111
-0.103
rus-0.101
essel-0.099
Token
Token ID250
Feature activation+8.20e-02

Pos logit contributions
rocal+0.686
 sadd+0.635
 Signed+0.634
 offending+0.626
idad+0.597
Neg logit contributions
 Ridley-0.897
essen-0.841
-0.791
rus-0.774
essel-0.760
TokenP
Token ID47
Feature activation+7.70e-02

Pos logit contributions
rocal+0.255
 sadd+0.239
 Signed+0.237
 offending+0.236
 phr+0.225
Neg logit contributions
 Ridley-0.243
essen-0.227
-0.210
rus-0.207
essel-0.201
Tokenret
Token ID1186
Feature activation+7.75e-02

Pos logit contributions
rocal+0.229
 sadd+0.215
 offending+0.213
 Signed+0.213
 phr+0.202
Neg logit contributions
 Ridley-0.239
essen-0.225
rus-0.204
-0.204
essel-0.200
Tokenending
Token ID1571
Feature activation+6.41e-01

Pos logit contributions
rocal+0.245
 Signed+0.230
 sadd+0.230
 offending+0.226
OA+0.220
Neg logit contributions
 Ridley-0.225
essen-0.210
-0.193
rus-0.191
essel-0.185
Token some
Token ID617
Feature activation-3.12e-01

Pos logit contributions
rocal+1.887
 sadd+1.760
 Signed+1.754
 offending+1.737
idad+1.665
Neg logit contributions
 Ridley-2.008
essen-1.871
-1.745
rus-1.701
essel-1.675
Token her
Token ID607
Feature activation-9.27e-03

Pos logit contributions
 Ridley+0.291
essen+0.266
+0.246
rus+0.244
essel+0.238
Neg logit contributions
rocal-0.289
 Signed-0.269
 sadd-0.264
 offending-0.258
OA-0.256
Token r
Token ID374
Feature activation+2.50e-01

Pos logit contributions
 Ridley+0.032
essen+0.030
+0.028
rus+0.028
essel+0.027
Neg logit contributions
rocal-0.025
 Signed-0.022
 sadd-0.022
 offending-0.022
OA-0.021
Tokenhe
Token ID258
Feature activation-6.39e-01

Pos logit contributions
rocal+0.672
 sadd+0.626
 Signed+0.624
 offending+0.617
OA+0.588
Neg logit contributions
 Ridley-0.844
essen-0.797
-0.738
rus-0.729
essel-0.718
Tokenumat
Token ID27798
Feature activation+3.04e-01

Pos logit contributions
 Ridley+1.983
essen+1.898
rus+1.751
+1.733
uton+1.687
Neg logit contributions
rocal-1.816
 Signed-1.750
 sadd-1.701
 offending-1.675
 augmented-1.649
Tokenoid
Token ID1868
Feature activation-4.14e-01

Pos logit contributions
rocal+0.871
 sadd+0.810
 Signed+0.807
 offending+0.799
idad+0.765
Neg logit contributions
 Ridley-0.976
essen-0.911
-0.851
rus-0.831
essel-0.818
Token arthritis
Token ID37954
Feature activation+1.76e-01

Pos logit contributions
 Ridley+1.597
essen+1.495
+1.424
rus+1.408
essel+1.377
Neg logit contributions
rocal-0.922
 Signed-0.829
 sadd-0.817
 offending-0.803
ENCY-0.775
Token.
Token ID13
Feature activation+2.56e-01

Pos logit contributions
rocal+0.584
 sadd+0.547
 Signed+0.546
 offending+0.541
OA+0.523
Neg logit contributions
 Ridley-0.489
essen-0.450
-0.415
rus-0.404
essel-0.396
Token Trying
Token ID31165
Feature activation+3.91e-01

Pos logit contributions
rocal+0.741
 sadd+0.687
 offending+0.681
 Signed+0.680
idad+0.650
Neg logit contributions
 Ridley-0.822
essen-0.760
-0.712
rus-0.697
essel-0.686
Token to
Token ID284
Feature activation-4.82e-02

Pos logit contributions
rocal+1.126
 sadd+1.046
 Signed+1.044
 offending+1.032
OA+0.990
Neg logit contributions
 Ridley-1.250
essen-1.165
-1.088
rus-1.062
essel-1.046
Token keep
Token ID1394
Feature activation-3.46e-01

Pos logit contributions
 Ridley+0.151
essen+0.141
+0.130
rus+0.128
essel+0.126
Neg logit contributions
rocal-0.142
 Signed-0.132
 sadd-0.130
 offending-0.129
idad-0.125
Token her
Token ID607
Feature activation+6.34e-02

Pos logit contributions
 Ridley+1.022
essen+0.934
+0.857
rus+0.838
essel+0.828
Neg logit contributions
rocal-1.102
 Signed-1.023
 sadd-1.001
 offending-0.983
idad-0.966
Token.
Token ID13
Feature activation+1.45e-01

Pos logit contributions
rocal+0.263
 sadd+0.245
 Signed+0.244
 offending+0.242
idad+0.234
Neg logit contributions
 Ridley-0.219
essen-0.201
-0.184
rus-0.180
essel-0.177
Token All
Token ID1439
Feature activation+6.14e-01

Pos logit contributions
rocal+0.451
 sadd+0.419
 Signed+0.417
 offending+0.416
idad+0.399
Neg logit contributions
 Ridley-0.432
essen-0.398
-0.371
rus-0.361
essel-0.355
Token in
Token ID287
Feature activation-2.93e-01

Pos logit contributions
rocal+1.893
 sadd+1.776
 Signed+1.767
 offending+1.754
OA+1.684
Neg logit contributions
 Ridley-1.832
essen-1.704
-1.581
rus-1.541
essel-1.516
Token all
Token ID477
Feature activation+6.81e-01

Pos logit contributions
 Ridley+0.772
essen+0.707
+0.640
rus+0.624
essel+0.609
Neg logit contributions
rocal-1.015
 Signed-0.953
 sadd-0.946
 offending-0.940
ENCY-0.914
Token it
Token ID340
Feature activation-1.46e-01

Pos logit contributions
rocal+2.152
 sadd+2.046
 Signed+2.022
 offending+2.012
 phr+1.932
Neg logit contributions
 Ridley-1.955
essen-1.822
-1.695
rus-1.645
essel-1.619
Token was
Token ID373
Feature activation+2.37e-01

Pos logit contributions
 Ridley+0.415
essen+0.377
+0.346
rus+0.343
essel+0.337
Neg logit contributions
rocal-0.484
 Signed-0.446
 offending-0.437
ENCY-0.434
 sadd-0.432
Token my
Token ID616
Feature activation+3.52e-01

Pos logit contributions
rocal+0.640
 Signed+0.582
 sadd+0.571
 offending+0.569
ENCY+0.552
Neg logit contributions
 Ridley-0.815
essen-0.754
-0.705
rus-0.695
essel-0.684
Token first
Token ID717
Feature activation-1.83e-01

Pos logit contributions
rocal+0.956
 Signed+0.879
 sadd+0.867
 offending+0.861
ENCY+0.830
Neg logit contributions
 Ridley-1.196
essen-1.113
-1.041
rus-1.025
essel-1.003
Token all
Token ID477
Feature activation+2.00e-01

Pos logit contributions
 Ridley+0.577
essen+0.527
+0.495
rus+0.483
essel+0.472
Neg logit contributions
rocal-0.543
 Signed-0.499
 sadd-0.494
 offending-0.483
ENCY-0.475
Token-
Token ID12
Feature activation-3.28e-01

Pos logit contributions
rocal+0.584
 Signed+0.542
 sadd+0.542
 offending+0.535
idad+0.514
Neg logit contributions
 Ridley-0.635
essen-0.591
-0.553
rus-0.540
essel-0.530
Tokenn
Token ID77
Feature activation+1.33e-01

Pos logit contributions
 Ridley+0.993
essen+0.931
+0.864
rus+0.846
essel+0.824
Neg logit contributions
rocal-1.006
 Signed-0.934
 sadd-0.928
 offending-0.921
idad-0.881
Token to
Token ID284
Feature activation+8.29e-02

Pos logit contributions
rocal+1.054
 Signed+0.982
 sadd+0.980
 offending+0.968
ENCY+0.941
Neg logit contributions
 Ridley-0.863
essen-0.791
-0.729
rus-0.709
essel-0.697
Token incite
Token ID49225
Feature activation-3.41e-01

Pos logit contributions
rocal+0.254
 Signed+0.236
 sadd+0.233
 offending+0.229
idad+0.224
Neg logit contributions
 Ridley-0.254
essen-0.234
-0.217
rus-0.213
essel-0.210
Token and
Token ID290
Feature activation+2.94e-01

Pos logit contributions
 Ridley+1.088
essen+0.995
+0.916
rus+0.907
uton+0.895
Neg logit contributions
rocal-1.007
 Signed-0.928
 sadd-0.904
OA-0.885
ENCY-0.881
Token infl
Token ID28472
Feature activation-1.69e-01

Pos logit contributions
rocal+0.830
 Signed+0.765
 sadd+0.760
 offending+0.749
idad+0.727
Neg logit contributions
 Ridley-0.963
essen-0.901
-0.838
rus-0.825
essel-0.812
Tokename
Token ID480
Feature activation-5.07e-01

Pos logit contributions
 Ridley+0.375
essen+0.334
+0.309
rus+0.288
essel+0.281
Neg logit contributions
rocal-0.655
 sadd-0.621
 Signed-0.616
 offending-0.613
idad-0.598
Token viewers
Token ID10209
Feature activation+1.17e-01

Pos logit contributions
 Ridley+1.545
essen+1.410
+1.290
rus+1.271
essel+1.251
Neg logit contributions
rocal-1.573
 Signed-1.446
 sadd-1.412
 offending-1.378
OA-1.376
Token.
Token ID13
Feature activation+1.26e-01

Pos logit contributions
rocal+0.391
 Signed+0.365
 sadd+0.364
 offending+0.360
OA+0.350
Neg logit contributions
 Ridley-0.326
essen-0.299
-0.276
rus-0.269
essel-0.263
Token Even
Token ID3412
Feature activation+1.56e-01

Pos logit contributions
rocal+0.367
 sadd+0.340
 Signed+0.337
 offending+0.336
idad+0.322
Neg logit contributions
 Ridley-0.405
essen-0.374
-0.350
rus-0.343
essel-0.337
Token the
Token ID262
Feature activation-1.02e-01

Pos logit contributions
rocal+0.485
 Signed+0.450
 sadd+0.449
 offending+0.442
idad+0.429
Neg logit contributions
 Ridley-0.468
essen-0.431
-0.399
rus-0.390
essel-0.383
Token film
Token ID2646
Feature activation+3.36e-01

Pos logit contributions
 Ridley+0.347
essen+0.317
+0.293
rus+0.292
essel+0.286
Neg logit contributions
rocal-0.282
 Signed-0.253
 sadd-0.247
ENCY-0.242
 offending-0.242
Token's
Token ID338
Feature activation+1.06e-01

Pos logit contributions
rocal+1.068
 sadd+0.993
 Signed+0.992
 offending+0.980
idad+0.948
Neg logit contributions
 Ridley-0.983
essen-0.906
-0.838
rus-0.819
essel-0.804
Token Achilles
Token ID39561
Feature activation+7.48e-02

Pos logit contributions
rocal+0.060
 Signed+0.055
 sadd+0.054
 offending+0.054
ENCY+0.052
Neg logit contributions
 Ridley-0.076
essen-0.070
-0.066
rus-0.065
essel-0.064
Token heel
Token ID24703
Feature activation-2.41e-01

Pos logit contributions
rocal+0.193
 sadd+0.176
 Signed+0.175
 offending+0.174
idad+0.166
Neg logit contributions
 Ridley-0.264
essen-0.247
-0.233
rus-0.228
essel-0.223
Token last
Token ID938
Feature activation-1.13e-01

Pos logit contributions
 Ridley+0.693
essen+0.629
+0.579
rus+0.570
essel+0.560
Neg logit contributions
rocal-0.794
 Signed-0.728
 sadd-0.723
 offending-0.717
ENCY-0.699
Token season
Token ID1622
Feature activation+1.43e-01

Pos logit contributions
 Ridley+0.266
essen+0.240
+0.219
rus+0.212
essel+0.208
Neg logit contributions
rocal-0.426
 Signed-0.399
 sadd-0.398
 offending-0.393
OA-0.383
Token.
Token ID13
Feature activation+2.46e-01

Pos logit contributions
rocal+0.499
 Signed+0.466
 sadd+0.466
 offending+0.462
idad+0.447
Neg logit contributions
 Ridley-0.373
essen-0.341
-0.313
rus-0.304
essel-0.299
Token Hopefully
Token ID19996
Feature activation+6.80e-01

Pos logit contributions
rocal+0.742
 sadd+0.689
 offending+0.684
 Signed+0.678
idad+0.655
Neg logit contributions
 Ridley-0.763
essen-0.702
-0.656
rus-0.642
essel-0.631
Token the
Token ID262
Feature activation+3.98e-01

Pos logit contributions
rocal+2.179
 sadd+2.063
 Signed+2.057
 offending+2.036
OA+1.958
Neg logit contributions
 Ridley-1.926
essen-1.791
-1.666
rus-1.609
essel-1.579
Token addition
Token ID3090
Feature activation+5.13e-01

Pos logit contributions
rocal+1.182
 sadd+1.098
 Signed+1.095
 offending+1.084
idad+1.042
Neg logit contributions
 Ridley-1.244
essen-1.155
-1.076
rus-1.053
essel-1.035
Token of
Token ID286
Feature activation-1.82e-01

Pos logit contributions
rocal+1.407
 sadd+1.324
 Signed+1.317
 offending+1.298
OA+1.237
Neg logit contributions
 Ridley-1.684
essen-1.599
-1.480
rus-1.449
essel-1.429
Token Tom
Token ID4186
Feature activation-4.20e-01

Pos logit contributions
 Ridley+0.515
essen+0.467
+0.435
rus+0.426
essel+0.417
Neg logit contributions
rocal-0.597
 sadd-0.555
 Signed-0.550
 offending-0.549
idad-0.532
Token Gilbert
Token ID24023
Feature activation-3.83e-01

Pos logit contributions
 Ridley+1.138
essen+1.038
+0.960
rus+0.935
essel+0.917
Neg logit contributions
rocal-1.419
 sadd-1.332
 Signed-1.326
 offending-1.320
idad-1.276
Token with
Token ID351
Feature activation+4.42e-01

Pos logit contributions
rocal+0.917
 Signed+0.849
 sadd+0.849
 offending+0.840
idad+0.813
Neg logit contributions
 Ridley-0.908
essen-0.837
-0.771
rus-0.759
essel-0.740
Token legendary
Token ID13273
Feature activation+4.14e-01

Pos logit contributions
rocal+1.451
 sadd+1.358
 Signed+1.355
 offending+1.343
idad+1.296
Neg logit contributions
 Ridley-1.245
essen-1.143
-1.055
rus-1.028
essel-1.006
Token Michigan
Token ID7055
Feature activation-5.37e-01

Pos logit contributions
rocal+1.153
 sadd+1.060
 Signed+1.058
 offending+1.045
idad+1.003
Neg logit contributions
 Ridley-1.380
essen-1.285
-1.201
rus-1.177
essel-1.155
Token State
Token ID1812
Feature activation-1.59e-01

Pos logit contributions
 Ridley+1.766
essen+1.629
rus+1.513
+1.513
essel+1.461
Neg logit contributions
rocal-1.535
 Signed-1.380
 sadd-1.353
 offending-1.347
ENCY-1.328
Token basketball
Token ID9669
Feature activation-2.32e-02

Pos logit contributions
 Ridley+0.496
essen+0.455
+0.421
rus+0.418
essel+0.407
Neg logit contributions
rocal-0.483
 Signed-0.438
 sadd-0.435
 offending-0.431
ENCY-0.424
Token coach
Token ID3985
Feature activation+1.87e-02

Pos logit contributions
 Ridley+0.070
essen+0.065
rus+0.059
+0.059
essel+0.058
Neg logit contributions
rocal-0.072
 Signed-0.066
 sadd-0.065
 offending-0.065
ENCY-0.064
Token Tom
Token ID4186
Feature activation-4.70e-01

Pos logit contributions
rocal+0.063
 sadd+0.059
 Signed+0.059
 offending+0.059
OA+0.057
Neg logit contributions
 Ridley-0.051
essen-0.046
-0.043
rus-0.042
essel-0.041
Token I
Token ID314
Feature activation+4.47e-01

Pos logit contributions
 Ridley+1.274
essen+1.172
+1.078
rus+1.048
essel+1.029
Neg logit contributions
rocal-1.583
 sadd-1.489
 Signed-1.484
 offending-1.473
OA-1.421
Tokenzzo
Token ID47802
Feature activation+2.78e-01

Pos logit contributions
rocal+1.618
 Signed+1.541
 sadd+1.523
 offending+1.503
OA+1.477
Neg logit contributions
 Ridley-1.096
essen-0.967
-0.944
rus-0.852
uton-0.844
Token about
Token ID546
Feature activation+4.22e-01

Pos logit contributions
rocal+0.910
 Signed+0.846
 sadd+0.846
 offending+0.835
idad+0.812
Neg logit contributions
 Ridley-0.788
essen-0.725
-0.670
rus-0.651
essel-0.637
Token whether
Token ID1771
Feature activation-6.81e-01

Pos logit contributions
rocal+1.433
 sadd+1.348
 Signed+1.344
 offending+1.333
idad+1.286
Neg logit contributions
 Ridley-1.132
essen-1.042
-0.958
rus-0.930
essel-0.913
Token Department
Token ID2732
Feature activation+2.24e-01

Pos logit contributions
 Ridley+1.491
essen+1.369
rus+1.253
+1.249
essel+1.195
Neg logit contributions
rocal-1.671
 offending-1.537
 Signed-1.536
 sadd-1.525
ENCY-1.483
Token rescued
Token ID19868
Feature activation-9.89e-02

Pos logit contributions
rocal+0.686
 Signed+0.635
 sadd+0.632
 offending+0.630
idad+0.607
Neg logit contributions
 Ridley-0.682
essen-0.631
-0.583
rus-0.574
essel-0.563
Token Lav
Token ID21438
Feature activation+8.33e-02

Pos logit contributions
 Ridley+0.283
essen+0.258
+0.238
rus+0.235
essel+0.231
Neg logit contributions
rocal-0.319
 Signed-0.298
 sadd-0.298
 offending-0.294
idad-0.288
Tokenau
Token ID559
Feature activation+3.81e-01

Pos logit contributions
rocal+0.346
 sadd+0.329
 Signed+0.328
 offending+0.326
idad+0.317
Neg logit contributions
 Ridley-0.161
essen-0.143
-0.126
rus-0.121
essel-0.117
Token and
Token ID290
Feature activation+1.53e-01

Pos logit contributions
rocal+1.177
 Signed+1.094
 sadd+1.093
 offending+1.084
OA+1.043
Neg logit contributions
 Ridley-1.144
essen-1.055
-0.982
rus-0.961
essel-0.944
Token his
Token ID465
Feature activation+1.82e-03

Pos logit contributions
rocal+0.481
 Signed+0.448
 sadd+0.447
 offending+0.446
idad+0.429
Neg logit contributions
 Ridley-0.451
essen-0.413
-0.382
rus-0.375
essel-0.368
Token son
Token ID3367
Feature activation+6.62e-01

Pos logit contributions
rocal+0.005
 Signed+0.005
 sadd+0.005
 offending+0.005
ENCY+0.004
Neg logit contributions
 Ridley-0.006
essen-0.006
-0.005
rus-0.005
essel-0.005
Token from
Token ID422
Feature activation+1.23e-01

Pos logit contributions
rocal+1.994
 sadd+1.879
 Signed+1.864
 offending+1.845
OA+1.770
Neg logit contributions
 Ridley-2.000
essen-1.881
-1.748
rus-1.690
essel-1.668
Token the
Token ID262
Feature activation+4.11e-02

Pos logit contributions
rocal+0.378
 Signed+0.355
 sadd+0.353
 offending+0.349
idad+0.337
Neg logit contributions
 Ridley-0.370
essen-0.338
-0.314
rus-0.310
essel-0.304
Token rav
Token ID24343
Feature activation-3.30e-01

Pos logit contributions
rocal+0.117
 Signed+0.109
 sadd+0.107
 offending+0.105
OA+0.103
Neg logit contributions
 Ridley-0.134
essen-0.123
-0.114
rus-0.113
essel-0.112
Tokenine
Token ID500
Feature activation-6.30e-01

Pos logit contributions
 Ridley+0.961
essen+0.908
essel+0.819
+0.816
rus+0.811
Neg logit contributions
rocal-1.045
 sadd-0.972
 Signed-0.972
 offending-0.960
ENCY-0.930
Token for
Token ID329
Feature activation-2.28e-01

Pos logit contributions
 Ridley+1.159
essen+1.065
+0.975
rus+0.954
essel+0.943
Neg logit contributions
rocal-1.351
 Signed-1.248
 sadd-1.248
 offending-1.233
idad-1.201
Token density
Token ID12109
Feature activation+1.27e-01

Pos logit contributions
 Ridley+0.713
essen+0.657
+0.604
rus+0.597
essel+0.595
Neg logit contributions
rocal-0.683
 sadd-0.630
 Signed-0.629
 offending-0.615
OA-0.604
Token in
Token ID287
Feature activation-1.57e-02

Pos logit contributions
rocal+0.412
 sadd+0.384
 Signed+0.383
 offending+0.379
OA+0.366
Neg logit contributions
 Ridley-0.365
essen-0.336
-0.311
rus-0.303
essel-0.299
Token the
Token ID262
Feature activation+8.29e-02

Pos logit contributions
 Ridley+0.049
essen+0.045
+0.042
rus+0.041
essel+0.041
Neg logit contributions
rocal-0.047
 sadd-0.043
 Signed-0.043
 offending-0.043
idad-0.041
Token absence
Token ID8889
Feature activation+1.30e-01

Pos logit contributions
rocal+0.224
 sadd+0.204
 Signed+0.202
 offending+0.199
OA+0.195
Neg logit contributions
 Ridley-0.284
essen-0.263
-0.245
rus-0.242
essel-0.242
Token of
Token ID286
Feature activation+1.58e-03

Pos logit contributions
rocal+0.387
 sadd+0.359
 Signed+0.358
 offending+0.355
ENCY+0.341
Neg logit contributions
 Ridley-0.406
essen-0.374
-0.348
rus-0.343
essel-0.336
Token v
Token ID410
Feature activation+3.62e-01

Pos logit contributions
rocal+0.004
 Signed+0.004
 sadd+0.004
 offending+0.004
OA+0.004
Neg logit contributions
 Ridley-0.005
essen-0.005
-0.005
rus-0.005
essel-0.005
Tokenort
Token ID419
Feature activation-1.82e-01

Pos logit contributions
rocal+1.309
 sadd+1.243
 Signed+1.239
 offending+1.229
ENCY+1.185
Neg logit contributions
 Ridley-0.884
essen-0.815
-0.732
rus-0.723
essel-0.709
Tokenons
Token ID684
Feature activation+1.42e-02

Pos logit contributions
 Ridley+0.521
essen+0.496
rus+0.456
essel+0.444
+0.442
Neg logit contributions
rocal-0.566
 sadd-0.540
 Signed-0.537
 offending-0.534
ENCY-0.511
Token.
Token ID13
Feature activation+9.45e-03

Pos logit contributions
rocal+0.048
 Signed+0.045
 sadd+0.045
 offending+0.044
idad+0.043
Neg logit contributions
 Ridley-0.039
essen-0.036
-0.033
rus-0.032
essel-0.031
Token V
Token ID569
Feature activation+5.94e-01

Pos logit contributions
rocal+0.029
 sadd+0.026
 offending+0.026
 Signed+0.026
idad+0.025
Neg logit contributions
 Ridley-0.030
essen-0.027
-0.025
rus-0.025
essel-0.024
Token to
Token ID284
Feature activation+3.25e-01

Pos logit contributions
rocal+0.227
 Signed+0.211
 sadd+0.210
 offending+0.207
ENCY+0.201
Neg logit contributions
 Ridley-0.203
essen-0.188
-0.172
rus-0.169
essel-0.166
Token result
Token ID1255
Feature activation-4.41e-01

Pos logit contributions
rocal+0.987
 Signed+0.918
 sadd+0.915
 offending+0.905
idad+0.873
Neg logit contributions
 Ridley-0.991
essen-0.921
-0.854
rus-0.835
essel-0.822
Token in
Token ID287
Feature activation-3.35e-01

Pos logit contributions
 Ridley+1.230
essen+1.133
+1.042
rus+1.025
essel+0.996
Neg logit contributions
rocal-1.443
 Signed-1.358
 sadd-1.348
 offending-1.336
idad-1.297
Token an
Token ID281
Feature activation-2.18e-01

Pos logit contributions
 Ridley+1.057
essen+0.975
+0.890
rus+0.884
essel+0.864
Neg logit contributions
rocal-0.992
 Signed-0.919
 sadd-0.901
 offending-0.889
OA-0.879
Token approach
Token ID3164
Feature activation+4.77e-01

Pos logit contributions
 Ridley+0.754
essen+0.705
rus+0.640
+0.637
essel+0.634
Neg logit contributions
rocal-0.585
 Signed-0.536
 sadd-0.518
 offending-0.504
OA-0.503
Token to
Token ID284
Feature activation-8.35e-02

Pos logit contributions
rocal+1.421
 Signed+1.318
 sadd+1.314
 offending+1.300
ENCY+1.255
Neg logit contributions
 Ridley-1.487
essen-1.383
-1.276
rus-1.257
essel-1.232
Token rationality
Token ID45627
Feature activation-2.73e-01

Pos logit contributions
 Ridley+0.285
essen+0.266
+0.246
rus+0.243
essel+0.239
Neg logit contributions
rocal-0.225
 Signed-0.208
 sadd-0.206
 offending-0.201
OA-0.196
Token to
Token ID284
Feature activation+2.44e-01

Pos logit contributions
 Ridley+0.813
essen+0.752
+0.680
rus+0.675
essel+0.663
Neg logit contributions
rocal-0.863
 Signed-0.790
 sadd-0.778
 offending-0.773
ENCY-0.764
Token which
Token ID543
Feature activation-3.27e-01

Pos logit contributions
rocal+0.730
 Signed+0.678
 sadd+0.678
 offending+0.669
OA+0.645
Neg logit contributions
 Ridley-0.755
essen-0.702
-0.651
rus-0.637
essel-0.627
Token feminists
Token ID22742
Feature activation-4.00e-01

Pos logit contributions
 Ridley+0.996
essen+0.926
+0.847
rus+0.833
essel+0.819
Neg logit contributions
rocal-0.999
 Signed-0.927
 sadd-0.924
 offending-0.912
OA-0.884
Token should
Token ID815
Feature activation+3.88e-01

Pos logit contributions
 Ridley+1.183
essen+1.096
+0.978
rus+0.976
essel+0.956
Neg logit contributions
rocal-1.266
 Signed-1.168
 offending-1.148
 sadd-1.147
ENCY-1.134
Token Clemson
Token ID27801
Feature activation-1.01e+00

Pos logit contributions
rocal+0.156
 Signed+0.144
 sadd+0.144
 offending+0.143
OA+0.138
Neg logit contributions
 Ridley-0.168
essen-0.155
-0.144
rus-0.142
essel-0.139
Token Game
Token ID3776
Feature activation-5.46e-01

Pos logit contributions
 Ridley+2.955
essen+2.624
+2.436
rus+2.412
essel+2.349
Neg logit contributions
rocal-3.225
 sadd-2.910
 Signed-2.907
 offending-2.888
ENCY-2.868
Token in
Token ID287
Feature activation-1.52e-01

Pos logit contributions
 Ridley+1.673
essen+1.541
+1.431
rus+1.416
essel+1.377
Neg logit contributions
rocal-1.672
 sadd-1.540
 offending-1.533
 Signed-1.528
 phr-1.465
Token 2016
Token ID1584
Feature activation-2.75e-01

Pos logit contributions
 Ridley+0.451
essen+0.406
+0.373
rus+0.367
essel+0.358
Neg logit contributions
rocal-0.492
 Signed-0.450
 sadd-0.450
 offending-0.447
ENCY-0.434
Token (
Token ID357
Feature activation+1.32e-02

Pos logit contributions
 Ridley+0.727
essen+0.655
+0.604
rus+0.587
essel+0.576
Neg logit contributions
rocal-0.963
 sadd-0.891
 offending-0.887
 Signed-0.884
idad-0.859
TokenPhoto
Token ID6191
Feature activation-5.87e-01

Pos logit contributions
rocal+0.049
 sadd+0.046
 Signed+0.046
 offending+0.046
 phr+0.044
Neg logit contributions
 Ridley-0.031
essen-0.028
-0.026
rus-0.025
essel-0.024
Token:
Token ID25
Feature activation-1.56e-01

Pos logit contributions
 Ridley+1.595
essen+1.439
+1.331
rus+1.317
essel+1.261
Neg logit contributions
rocal-2.004
 offending-1.856
 sadd-1.850
 Signed-1.831
ENCY-1.790
Token Wayne
Token ID13329
Feature activation+7.97e-02

Pos logit contributions
 Ridley+0.432
essen+0.390
+0.363
rus+0.354
essel+0.343
Neg logit contributions
rocal-0.526
 offending-0.492
 sadd-0.491
 Signed-0.484
idad-0.471
Token McG
Token ID11130
Feature activation-5.12e-02

Pos logit contributions
rocal+0.246
 Signed+0.231
 sadd+0.230
 offending+0.227
idad+0.219
Neg logit contributions
 Ridley-0.236
essen-0.220
-0.206
rus-0.200
essel-0.196
Tokena
Token ID64
Feature activation-3.20e-01

Pos logit contributions
 Ridley+0.145
essen+0.136
rus+0.124
+0.122
essel+0.121
Neg logit contributions
rocal-0.164
 sadd-0.156
 offending-0.156
 Signed-0.155
ENCY-0.148
Tokenhee
Token ID21067
Feature activation+1.09e-01

Pos logit contributions
 Ridley+1.097
essen+1.040
essel+0.949
rus+0.948
+0.944
Neg logit contributions
rocal-0.847
 sadd-0.803
 offending-0.787
 Signed-0.783
 augmented-0.743
Token think
Token ID892
Feature activation-4.82e-01

Pos logit contributions
 Ridley+0.699
essen+0.654
+0.601
rus+0.597
essel+0.583
Neg logit contributions
rocal-0.638
 Signed-0.583
 offending-0.581
 sadd-0.579
ENCY-0.556
Token they
Token ID484
Feature activation-4.26e-01

Pos logit contributions
 Ridley+1.369
essen+1.241
+1.116
rus+1.114
essel+1.086
Neg logit contributions
rocal-1.600
 Signed-1.478
 sadd-1.459
 offending-1.444
idad-1.425
Token
Token ID447
Feature activation-6.44e-02

Pos logit contributions
 Ridley+1.200
essen+1.119
+1.014
rus+1.002
essel+0.979
Neg logit contributions
rocal-1.382
 Signed-1.290
 offending-1.277
 sadd-1.275
ENCY-1.231
Token
Token ID247
Feature activation+9.06e-02

Pos logit contributions
 Ridley+0.217
essen+0.205
+0.190
rus+0.187
essel+0.184
Neg logit contributions
rocal-0.173
 sadd-0.162
 offending-0.160
 Signed-0.160
OA-0.151
Tokenre
Token ID260
Feature activation+3.09e-02

Pos logit contributions
rocal+0.291
 sadd+0.273
 Signed+0.273
 offending+0.270
OA+0.260
Neg logit contributions
 Ridley-0.259
essen-0.241
-0.222
rus-0.217
essel-0.212
Token normal
Token ID3487
Feature activation-9.18e-01

Pos logit contributions
rocal+0.090
 Signed+0.083
 sadd+0.082
 offending+0.081
idad+0.079
Neg logit contributions
 Ridley-0.099
essen-0.092
-0.085
rus-0.083
essel-0.082
Token that
Token ID326
Feature activation-5.43e-02

Pos logit contributions
 Ridley+2.572
essen+2.365
+2.177
essel+2.108
rus+2.090
Neg logit contributions
rocal-3.019
 Signed-2.768
 offending-2.718
 sadd-2.716
ENCY-2.666
Token way
Token ID835
Feature activation-3.98e-02

Pos logit contributions
 Ridley+0.197
essen+0.184
+0.172
rus+0.170
essel+0.168
Neg logit contributions
rocal-0.136
 Signed-0.123
 sadd-0.122
 offending-0.121
idad-0.116
Token,
Token ID11
Feature activation+2.11e-01

Pos logit contributions
 Ridley+0.105
essen+0.096
+0.087
rus+0.085
essel+0.083
Neg logit contributions
rocal-0.139
 Signed-0.130
 sadd-0.129
 offending-0.128
idad-0.124
Token but
Token ID475
Feature activation+2.24e-01

Pos logit contributions
rocal+0.633
 sadd+0.588
 Signed+0.587
 offending+0.581
idad+0.559
Neg logit contributions
 Ridley-0.652
essen-0.606
-0.563
rus-0.550
essel-0.541
Token they
Token ID484
Feature activation-1.99e-01

Pos logit contributions
rocal+0.730
 Signed+0.681
 sadd+0.680
 offending+0.672
idad+0.651
Neg logit contributions
 Ridley-0.638
essen-0.588
-0.539
rus-0.529
essel-0.518
Token Show
Token ID5438
Feature activation+2.89e-01

Pos logit contributions
 Ridley+0.155
essen+0.141
+0.131
rus+0.129
essel+0.127
Neg logit contributions
rocal-0.161
 sadd-0.149
 offending-0.148
 Signed-0.146
idad-0.143
Token how
Token ID703
Feature activation+1.86e-01

Pos logit contributions
rocal+0.857
 sadd+0.799
 Signed+0.797
 offending+0.788
idad+0.757
Neg logit contributions
 Ridley-0.899
essen-0.836
-0.779
rus-0.761
essel-0.749
Token to
Token ID284
Feature activation+5.09e-01

Pos logit contributions
rocal+0.544
 sadd+0.505
 Signed+0.504
 offending+0.499
idad+0.480
Neg logit contributions
 Ridley-0.590
essen-0.549
-0.513
rus-0.501
essel-0.492
Token sew
Token ID34249
Feature activation-4.10e-01

Pos logit contributions
rocal+1.401
 sadd+1.297
 Signed+1.293
 offending+1.279
idad+1.224
Neg logit contributions
 Ridley-1.692
essen-1.583
-1.482
rus-1.449
essel-1.428
Token a
Token ID257
Feature activation-3.66e-02

Pos logit contributions
 Ridley+1.238
essen+1.143
+1.057
rus+1.043
essel+1.026
Neg logit contributions
rocal-1.267
 Signed-1.161
 sadd-1.151
 offending-1.149
idad-1.119
Token seam
Token ID15787
Feature activation-1.01e+00

Pos logit contributions
 Ridley+0.121
essen+0.113
+0.106
rus+0.104
essel+0.102
Neg logit contributions
rocal-0.102
 Signed-0.093
 sadd-0.092
 offending-0.092
ENCY-0.089
Token:
Token ID25
Feature activation+1.13e-01

Pos logit contributions
 Ridley+2.980
essen+2.806
+2.558
rus+2.551
essel+2.468
Neg logit contributions
rocal-3.135
 Signed-2.853
 sadd-2.843
 offending-2.828
ENCY-2.787
Token One
Token ID1881
Feature activation-3.64e-02

Pos logit contributions
rocal+0.349
 sadd+0.323
 offending+0.321
 Signed+0.320
idad+0.310
Neg logit contributions
 Ridley-0.342
essen-0.314
-0.291
rus-0.287
essel-0.282
Token of
Token ID286
Feature activation-1.50e-01

Pos logit contributions
 Ridley+0.118
essen+0.109
+0.101
rus+0.100
essel+0.099
Neg logit contributions
rocal-0.104
 Signed-0.096
 sadd-0.095
 offending-0.094
ENCY-0.092
Token the
Token ID262
Feature activation+2.39e-01

Pos logit contributions
 Ridley+0.430
essen+0.388
+0.358
rus+0.354
essel+0.348
Neg logit contributions
rocal-0.491
 Signed-0.456
 sadd-0.446
 offending-0.443
idad-0.434
Token most
Token ID749
Feature activation+4.63e-01

Pos logit contributions
rocal+0.669
 Signed+0.616
 offending+0.597
 sadd+0.595
ENCY+0.589
Neg logit contributions
 Ridley-0.788
essen-0.729
-0.680
rus-0.671
essel-0.664
Token it
Token ID340
Feature activation-3.44e-01

Pos logit contributions
 Ridley+1.350
essen+1.243
+1.126
rus+1.114
essel+1.087
Neg logit contributions
rocal-1.607
 Signed-1.502
 sadd-1.485
 offending-1.457
OA-1.433
Token was
Token ID373
Feature activation-4.18e-01

Pos logit contributions
 Ridley+0.909
essen+0.832
+0.757
rus+0.745
essel+0.730
Neg logit contributions
rocal-1.197
 Signed-1.114
 sadd-1.097
 offending-1.093
ENCY-1.068
Token too
Token ID1165
Feature activation+4.59e-01

Pos logit contributions
 Ridley+1.368
essen+1.265
+1.162
rus+1.148
essel+1.127
Neg logit contributions
rocal-1.210
 Signed-1.099
 sadd-1.060
 offending-1.055
idad-1.043
Token early
Token ID1903
Feature activation-9.48e-01

Pos logit contributions
rocal+1.253
 Signed+1.150
 sadd+1.131
 offending+1.122
idad+1.087
Neg logit contributions
 Ridley-1.556
essen-1.452
-1.354
rus-1.337
essel-1.312
Token to
Token ID284
Feature activation-3.43e-01

Pos logit contributions
 Ridley+2.830
essen+2.617
+2.411
rus+2.365
essel+2.311
Neg logit contributions
rocal-2.917
 Signed-2.716
 sadd-2.665
 offending-2.628
OA-2.581
Token speculate
Token ID26256
Feature activation-1.01e+00

Pos logit contributions
 Ridley+0.997
essen+0.927
+0.853
rus+0.835
essel+0.820
Neg logit contributions
rocal-1.088
 Signed-1.014
 sadd-1.013
 offending-1.001
OA-0.969
Token why
Token ID1521
Feature activation-6.59e-01

Pos logit contributions
 Ridley+2.844
essen+2.595
rus+2.347
+2.316
essel+2.290
Neg logit contributions
rocal-3.345
 Signed-3.094
 sadd-3.005
idad-2.946
ENCY-2.941
Token the
Token ID262
Feature activation-2.14e-01

Pos logit contributions
 Ridley+1.955
essen+1.801
+1.633
rus+1.625
uton+1.590
Neg logit contributions
rocal-2.082
 Signed-1.908
 sadd-1.877
 offending-1.835
idad-1.832
Token fire
Token ID2046
Feature activation-2.25e-01

Pos logit contributions
 Ridley+0.663
essen+0.612
+0.563
rus+0.557
uton+0.547
Neg logit contributions
rocal-0.648
 Signed-0.593
 sadd-0.586
OA-0.568
ENCY-0.566
Token had
Token ID550
Feature activation-5.74e-01

Pos logit contributions
 Ridley+0.449
essen+0.403
+0.356
rus+0.343
essel+0.333
Neg logit contributions
rocal-0.921
 sadd-0.872
 Signed-0.871
 offending-0.862
idad-0.842
Token been
Token ID587
Feature activation-4.69e-01

Pos logit contributions
 Ridley+1.788
essen+1.675
rus+1.529
essel+1.500
+1.485
Neg logit contributions
rocal-1.715
 Signed-1.555
OA-1.519
ENCY-1.509
 sadd-1.497
Token here
Token ID994
Feature activation-8.26e-02

Pos logit contributions
 Ridley+1.739
essen+1.558
+1.471
rus+1.456
essel+1.385
Neg logit contributions
rocal-1.983
 sadd-1.823
 Signed-1.805
 offending-1.783
idad-1.748
Token.
Token ID13
Feature activation-1.14e-01

Pos logit contributions
 Ridley+0.235
essen+0.216
+0.200
rus+0.196
essel+0.192
Neg logit contributions
rocal-0.270
 sadd-0.250
 Signed-0.247
 offending-0.246
idad-0.239
Token Launch
Token ID21225
Feature activation+2.79e-01

Pos logit contributions
 Ridley+0.340
essen+0.313
+0.290
rus+0.285
essel+0.279
Neg logit contributions
rocal-0.355
 sadd-0.330
 offending-0.326
 Signed-0.325
ENCY-0.313
Token Kodi
Token ID44129
Feature activation-1.59e-01

Pos logit contributions
rocal+0.853
 sadd+0.793
 Signed+0.790
 offending+0.784
idad+0.754
Neg logit contributions
 Ridley-0.851
essen-0.786
-0.734
rus-0.715
essel-0.703
Token and
Token ID290
Feature activation-3.12e-01

Pos logit contributions
 Ridley+0.514
essen+0.475
+0.443
rus+0.434
essel+0.429
Neg logit contributions
rocal-0.461
 sadd-0.423
 Signed-0.422
 offending-0.419
OA-0.402
Token click
Token ID3904
Feature activation-1.20e+00

Pos logit contributions
 Ridley+0.922
essen+0.850
+0.791
rus+0.774
essel+0.755
Neg logit contributions
rocal-0.980
 Signed-0.906
 sadd-0.906
 offending-0.897
idad-0.867
Token on
Token ID319
Feature activation-4.96e-01

Pos logit contributions
 Ridley+3.356
essen+3.048
+2.890
rus+2.838
essel+2.704
Neg logit contributions
rocal-3.876
 sadd-3.551
 offending-3.534
 Signed-3.525
idad-3.428
Token the
Token ID262
Feature activation-3.53e-01

Pos logit contributions
 Ridley+1.465
essen+1.336
+1.261
rus+1.229
essel+1.193
Neg logit contributions
rocal-1.569
 sadd-1.444
 Signed-1.429
 offending-1.425
ENCY-1.378
Token Settings
Token ID16163
Feature activation-8.43e-01

Pos logit contributions
 Ridley+1.123
essen+1.035
+0.976
rus+0.951
essel+0.930
Neg logit contributions
rocal-1.046
 sadd-0.950
 Signed-0.943
 offending-0.929
ENCY-0.906
Token Icon
Token ID26544
Feature activation-7.90e-01

Pos logit contributions
 Ridley+3.373
essen+3.174
+3.024
rus+2.974
essel+2.921
Neg logit contributions
rocal-1.769
 Signed-1.522
 sadd-1.516
 offending-1.482
ENCY-1.410
Token in
Token ID287
Feature activation-5.33e-01

Pos logit contributions
 Ridley+2.205
essen+2.021
+1.869
rus+1.832
essel+1.773
Neg logit contributions
rocal-2.634
 sadd-2.416
 Signed-2.381
 offending-2.374
ENCY-2.318
Token to
Token ID284
Feature activation+1.85e-02

Pos logit contributions
 Ridley+0.185
essen+0.170
+0.158
rus+0.154
essel+0.150
Neg logit contributions
rocal-0.189
 Signed-0.175
 sadd-0.175
 offending-0.172
OA-0.168
Token watch
Token ID2342
Feature activation-1.18e-01

Pos logit contributions
rocal+0.058
 sadd+0.054
 Signed+0.053
 offending+0.053
OA+0.051
Neg logit contributions
 Ridley-0.055
essen-0.051
-0.047
rus-0.046
essel-0.045
Token us
Token ID514
Feature activation+2.58e-01

Pos logit contributions
 Ridley+0.356
essen+0.328
+0.304
rus+0.297
essel+0.292
Neg logit contributions
rocal-0.366
 sadd-0.338
 Signed-0.337
 offending-0.333
OA-0.323
Token train
Token ID4512
Feature activation+4.23e-02

Pos logit contributions
rocal+0.715
 sadd+0.659
 Signed+0.658
 offending+0.649
OA+0.625
Neg logit contributions
 Ridley-0.859
essen-0.804
-0.750
rus-0.734
essel-0.723
Token as
Token ID355
Feature activation+4.46e-01

Pos logit contributions
rocal+0.132
 sadd+0.122
 Signed+0.122
 offending+0.121
OA+0.117
Neg logit contributions
 Ridley-0.127
essen-0.117
-0.109
rus-0.106
essel-0.104
Token school
Token ID1524
Feature activation-8.91e-01

Pos logit contributions
rocal+1.280
 sadd+1.195
 Signed+1.191
 offending+1.181
idad+1.128
Neg logit contributions
 Ridley-1.422
essen-1.329
-1.242
rus-1.211
essel-1.193
Tokenboys
Token ID13202
Feature activation-2.11e-01

Pos logit contributions
 Ridley+2.127
essen+1.964
+1.776
rus+1.749
essel+1.718
Neg logit contributions
rocal-3.249
 Signed-3.038
 sadd-3.001
 offending-2.993
OA-2.944
Token on
Token ID319
Feature activation+2.50e-01

Pos logit contributions
 Ridley+0.584
essen+0.536
+0.490
rus+0.479
essel+0.468
Neg logit contributions
rocal-0.705
 Signed-0.652
 sadd-0.652
 offending-0.645
OA-0.633
Token a
Token ID257
Feature activation+2.40e-01

Pos logit contributions
rocal+0.790
 sadd+0.742
 Signed+0.739
 offending+0.733
idad+0.704
Neg logit contributions
 Ridley-0.726
essen-0.674
-0.625
rus-0.608
essel-0.598
Token Thursday
Token ID3635
Feature activation+2.17e-01

Pos logit contributions
rocal+0.720
 sadd+0.669
 Signed+0.668
 offending+0.661
OA+0.637
Neg logit contributions
 Ridley-0.741
essen-0.687
-0.639
rus-0.623
essel-0.614
Token night
Token ID1755
Feature activation-6.90e-02

Pos logit contributions
rocal+0.678
 sadd+0.631
 Signed+0.630
 offending+0.624
idad+0.603
Neg logit contributions
 Ridley-0.646
essen-0.599
-0.554
rus-0.541
essel-0.531