TOP ACTIVATIONS
MAX = 3.630

proof of this : " Defense Of Marriage Act "). But
Alpha Part icle X - Ray Spect rom eter ( AP
Nations , Susan E . Rice , said Friday that Mr
solutions to the Department of Defense s urgent needs
s be readily distinguished from forc ings ( S pen cer
Brazilian prosecutor that Rolls - Roy ce is co - oper
named Norman Bates ( Anthony Perkins .) Norman , shall w
ly young man named Norman Bates ( Anthony Perkins .) Norman
. N . Ambassador Susan Rice , and Special Assistant Samantha
, D . C .: Smithsonian Institution Press , 1994 .
. S . Department of Defense Africa Command website , the
] The inscription on the pedest al reads " let her
Barack Obama announced Monday that Defense Secretary Chuck Hag el will
t Break US - Philipp ines T ies C ement
Q with its 144 Hz refresh rate and fast response time
application . In terms of thrust to weight it is already
, one of which involves charge - density wave fluctuations .
- era standard of " prep onder ance of the evidence
. Rand Paul is " bas ically an isolation ist ,"
. And the Department of Defense has acknowledged sending at least

MIN ACTIVATIONS
MIN = -4.337

. The Texas Organ izing Project is calling on
and organizations . × Expand Youtube Senator Sheldon White
Ph . D . in Organ izational Behavior / Human Resource
return to a more improved MLS instead of second division soccer
it out . × Expand Jonathan Gold s bie
not protecting the rights of Occupy demonstrators whose protests are being
according to the artist . Organ izers hope the work ,
% of respondents to the Lok Survey , with a sample
sure how they would have voted had they known of Good
set to take on the Occupy Wall Street protests in Wednesday
, our copy of the StarCraft 2 Collector 's Edition has
the comments . <|endoftext|> Red Transit L anes on Church Have
current climate of misinformation around trans identity .) So
how we think about a trans national network of operatives who
View more <|endoftext|> About R acing Apex
as racism , sexism , trans ph obia , and able
South Carolina where workers had voted to go non union ,
. e . those who voted in the previous election .
, if Labor wins the election . Given the policy has
a grand total of zero gig awatts . But knowing that

INTERVAL 1.638 - 3.630
CONTAINS 0.807%

the lake s surface area was about 12 %
back to the no - spin scenario for a second :
provided scientists a library of comparisons for what Curiosity sees .
got that ? Sure , I got it .
was posted . Here 's what my methods were : If

INTERVAL -0.354 - 1.638
CONTAINS 75.070%

Last year , Pruitt orchestrated an effort by Republican
in - all minor issues the demands by Kentucky
and Vince Zam pe lla , outlines the four game plan
Women . A gentleman will not take
. Lerner , who was head of the division that scrutin

INTERVAL -2.345 - -0.354
CONTAINS 24.004%

than to their boss . Regarding Bain , the candidate has
one ad or ning the time - worn jacket .
, and the joy s of life , are
reality to the people ," said Ron Price , the president
ed up by the approaching storm in Haiti , the Associated

INTERVAL -4.337 - -2.345
CONTAINS 0.119%

s rewarded the Bills with two years of excellent
for customers exceeding about 300 gig abytes a month , or
re talking about less than half , he said
the top 2 . 75 % of all households in India
tot alled 2 . 07 % of its gross domestic product

Token proof
Token ID6617
Feature activation+2.14e+00

Pos logit contributions
atron+0.112
imet+0.110
izoph+0.107
Desk+0.103
estern+0.099
Neg logit contributions
nces-0.083
esome-0.078
 contrad-0.070
anamo-0.066
edom-0.063
Token of
Token ID286
Feature activation+2.54e-03

Pos logit contributions
nces+3.905
esome+3.809
 contrad+3.409
anamo+3.236
edom+3.171
Neg logit contributions
atron-4.402
imet-4.270
izoph-4.267
Desk-3.981
estern-3.934
Token this
Token ID428
Feature activation-2.90e-01

Pos logit contributions
nces+0.005
esome+0.005
 contrad+0.004
anamo+0.004
edom+0.004
Neg logit contributions
atron-0.005
imet-0.005
izoph-0.005
Desk-0.005
estern-0.004
Token:
Token ID25
Feature activation+5.97e-02

Pos logit contributions
atron+0.579
imet+0.565
izoph+0.549
Desk+0.526
estern+0.504
Neg logit contributions
nces-0.578
esome-0.547
 contrad-0.500
anamo-0.481
edom-0.462
Token "
Token ID366
Feature activation+1.28e-01

Pos logit contributions
nces+0.125
esome+0.118
 contrad+0.108
anamo+0.105
edom+0.100
Neg logit contributions
atron-0.114
imet-0.111
izoph-0.108
Desk-0.103
estern-0.099
TokenDefense
Token ID27300
Feature activation+3.63e+00

Pos logit contributions
nces+0.269
esome+0.254
 contrad+0.234
anamo+0.225
edom+0.218
Neg logit contributions
atron-0.242
imet-0.237
izoph-0.232
Desk-0.217
estern-0.210
Token Of
Token ID3226
Feature activation+7.48e-02

Pos logit contributions
nces+6.495
esome+5.954
 contrad+5.380
anamo+5.317
 Flavoring+5.123
Neg logit contributions
atron-7.540
izoph-7.377
imet-7.347
estern-6.733
Desk-6.713
Token Marriage
Token ID24509
Feature activation+1.37e-01

Pos logit contributions
nces+0.149
esome+0.142
 contrad+0.129
anamo+0.124
 Flavoring+0.120
Neg logit contributions
atron-0.148
imet-0.146
izoph-0.142
Desk-0.135
estern-0.128
Token Act
Token ID2191
Feature activation-4.93e-01

Pos logit contributions
nces+0.294
esome+0.280
 contrad+0.258
anamo+0.250
edom+0.241
Neg logit contributions
atron-0.250
imet-0.244
izoph-0.239
Desk-0.226
estern-0.215
Token").
Token ID11074
Feature activation+3.17e-01

Pos logit contributions
atron+0.896
imet+0.875
izoph+0.855
Desk+0.812
estern+0.769
Neg logit contributions
nces-1.071
esome-1.018
 contrad-0.935
anamo-0.907
 Flavoring-0.872
Token But
Token ID887
Feature activation+6.10e-01

Pos logit contributions
nces+0.658
esome+0.634
 contrad+0.574
anamo+0.555
 Flavoring+0.548
Neg logit contributions
atron-0.594
imet-0.593
izoph-0.582
Desk-0.533
estern-0.520
Token Alpha
Token ID12995
Feature activation-1.93e-01

Pos logit contributions
nces+2.764
esome+2.689
 contrad+2.439
edom+2.254
 Flavoring+2.239
Neg logit contributions
atron-2.728
izoph-2.655
imet-2.609
Desk-2.480
estern-2.438
Token Part
Token ID2142
Feature activation-9.31e-01

Pos logit contributions
atron+0.356
imet+0.347
izoph+0.340
Desk+0.319
estern+0.311
Neg logit contributions
nces-0.408
esome-0.389
 contrad-0.360
anamo-0.343
 Flavoring-0.335
Tokenicle
Token ID1548
Feature activation+4.71e-01

Pos logit contributions
atron+1.656
imet+1.622
izoph+1.551
Desk+1.485
estern+1.400
Neg logit contributions
nces-2.057
esome-1.954
 contrad-1.800
anamo-1.749
 Flavoring-1.680
Token X
Token ID1395
Feature activation-4.33e-01

Pos logit contributions
nces+0.997
esome+0.979
 contrad+0.898
 Flavoring+0.841
edom+0.834
Neg logit contributions
atron-0.823
izoph-0.816
imet-0.795
Desk-0.750
estern-0.732
Token-
Token ID12
Feature activation+4.83e-01

Pos logit contributions
atron+0.802
izoph+0.779
imet+0.777
Desk+0.729
estern+0.702
Neg logit contributions
nces-0.906
esome-0.871
 contrad-0.792
anamo-0.758
edom-0.744
TokenRay
Token ID19591
Feature activation+3.62e+00

Pos logit contributions
nces+0.786
esome+0.737
 contrad+0.643
anamo+0.617
edom+0.590
Neg logit contributions
atron-1.111
izoph-1.097
imet-1.090
Desk-1.033
estern-1.008
Token Spect
Token ID13058
Feature activation+1.41e+00

Pos logit contributions
nces+6.578
esome+6.501
 contrad+5.920
 Flavoring+5.567
antage+5.504
Neg logit contributions
izoph-6.947
atron-6.354
estern-6.245
Desk-6.018
imet-6.005
Tokenrom
Token ID398
Feature activation+7.08e-02

Pos logit contributions
nces+3.136
esome+3.094
anamo+2.782
 contrad+2.780
udo+2.767
Neg logit contributions
izoph-2.182
atron-2.095
imet-2.028
Desk-2.001
estern-1.911
Tokeneter
Token ID2357
Feature activation+9.26e-02

Pos logit contributions
nces+0.123
esome+0.120
 contrad+0.102
edom+0.100
anamo+0.096
Neg logit contributions
atron-0.152
izoph-0.150
imet-0.146
Desk-0.142
estern-0.135
Token (
Token ID357
Feature activation-2.51e-01

Pos logit contributions
nces+0.183
esome+0.176
 contrad+0.159
anamo+0.150
 Flavoring+0.147
Neg logit contributions
atron-0.182
izoph-0.179
imet-0.177
Desk-0.166
estern-0.162
TokenAP
Token ID2969
Feature activation+9.20e-01

Pos logit contributions
atron+0.420
imet+0.409
izoph+0.403
Desk+0.375
estern+0.359
Neg logit contributions
nces-0.576
esome-0.553
 contrad-0.506
anamo-0.494
edom-0.478
Token Nations
Token ID7973
Feature activation+3.46e-01

Pos logit contributions
nces+0.824
esome+0.783
 contrad+0.706
anamo+0.679
edom+0.665
Neg logit contributions
atron-0.785
imet-0.781
izoph-0.751
Desk-0.710
estern-0.681
Token,
Token ID11
Feature activation+3.08e-01

Pos logit contributions
nces+0.675
esome+0.634
 contrad+0.573
anamo+0.557
edom+0.539
Neg logit contributions
atron-0.703
imet-0.684
izoph-0.678
Desk-0.641
estern-0.612
Token Susan
Token ID14919
Feature activation+9.93e-01

Pos logit contributions
nces+0.525
esome+0.492
 contrad+0.440
anamo+0.421
edom+0.404
Neg logit contributions
atron-0.699
imet-0.681
izoph-0.672
Desk-0.645
estern-0.621
Token E
Token ID412
Feature activation+3.11e-01

Pos logit contributions
nces+0.968
esome+0.898
 contrad+0.719
udo+0.601
anamo+0.600
Neg logit contributions
atron-2.914
imet-2.911
izoph-2.852
Desk-2.743
estern-2.665
Token.
Token ID13
Feature activation-2.64e-01

Pos logit contributions
nces+0.620
esome+0.589
 contrad+0.512
edom+0.496
udo+0.490
Neg logit contributions
atron-0.617
izoph-0.595
imet-0.594
Desk-0.557
estern-0.532
Token Rice
Token ID13823
Feature activation+3.48e+00

Pos logit contributions
atron+0.798
imet+0.788
izoph+0.781
Desk+0.750
estern+0.733
Neg logit contributions
nces-0.249
esome-0.220
 contrad-0.176
edom-0.148
anamo-0.147
Token,
Token ID11
Feature activation+1.80e-01

Pos logit contributions
nces+6.290
esome+5.915
 contrad+5.257
anamo+5.063
 Flavoring+5.004
Neg logit contributions
atron-7.134
izoph-6.992
imet-6.957
Desk-6.525
estern-6.265
Token said
Token ID531
Feature activation-2.91e-01

Pos logit contributions
nces+0.391
esome+0.373
 contrad+0.343
anamo+0.331
edom+0.321
Neg logit contributions
atron-0.322
imet-0.313
izoph-0.310
Desk-0.293
estern-0.277
Token Friday
Token ID3217
Feature activation+1.49e+00

Pos logit contributions
atron+0.560
imet+0.548
izoph+0.533
Desk+0.509
estern+0.485
Neg logit contributions
nces-0.602
esome-0.567
 contrad-0.518
anamo-0.504
edom-0.485
Token that
Token ID326
Feature activation+7.81e-02

Pos logit contributions
nces+2.748
esome+2.561
 contrad+2.312
anamo+2.241
edom+2.175
Neg logit contributions
atron-3.132
imet-3.070
izoph-3.027
Desk-2.854
estern-2.768
Token Mr
Token ID1770
Feature activation+7.85e-01

Pos logit contributions
nces+0.155
esome+0.145
 contrad+0.133
anamo+0.128
edom+0.123
Neg logit contributions
atron-0.158
imet-0.154
izoph-0.150
Desk-0.143
estern-0.137
Token solutions
Token ID8136
Feature activation-1.55e+00

Pos logit contributions
atron+0.188
imet+0.184
izoph+0.180
Desk+0.172
estern+0.166
Neg logit contributions
nces-0.159
esome-0.149
 contrad-0.136
anamo-0.127
edom-0.124
Token to
Token ID284
Feature activation-1.04e-01

Pos logit contributions
atron+2.906
imet+2.844
izoph+2.739
Desk+2.635
estern+2.496
Neg logit contributions
nces-3.236
esome-3.072
 contrad-2.807
anamo-2.733
 Flavoring-2.625
Token the
Token ID262
Feature activation-1.51e-01

Pos logit contributions
atron+0.211
imet+0.209
izoph+0.202
Desk+0.194
estern+0.186
Neg logit contributions
nces-0.200
esome-0.190
 contrad-0.173
anamo-0.163
 Flavoring-0.160
Token Department
Token ID2732
Feature activation+1.07e-01

Pos logit contributions
atron+0.316
imet+0.310
izoph+0.303
Desk+0.293
estern+0.279
Neg logit contributions
nces-0.283
esome-0.269
 contrad-0.244
anamo-0.231
 Flavoring-0.223
Token of
Token ID286
Feature activation-4.58e-02

Pos logit contributions
nces+0.222
esome+0.211
 contrad+0.192
anamo+0.187
 Flavoring+0.179
Neg logit contributions
atron-0.204
imet-0.200
izoph-0.194
Desk-0.185
estern-0.176
Token Defense
Token ID5947
Feature activation+3.38e+00

Pos logit contributions
atron+0.093
imet+0.091
izoph+0.088
Desk+0.085
estern+0.081
Neg logit contributions
nces-0.089
esome-0.084
 contrad-0.076
anamo-0.073
edom-0.071
Token
Token ID447
Feature activation+6.49e-01

Pos logit contributions
nces+5.767
esome+5.436
 contrad+4.792
anamo+4.643
antage+4.545
Neg logit contributions
atron-7.091
izoph-6.873
imet-6.866
Desk-6.506
estern-6.354
Token
Token ID247
Feature activation+1.30e-02

Pos logit contributions
nces+0.944
esome+0.868
 contrad+0.764
anamo+0.715
 Flavoring+0.682
Neg logit contributions
imet-1.631
atron-1.622
izoph-1.581
Desk-1.525
estern-1.491
Tokens
Token ID82
Feature activation-6.66e-01

Pos logit contributions
nces+0.028
esome+0.027
 contrad+0.024
anamo+0.024
edom+0.023
Neg logit contributions
atron-0.024
imet-0.023
izoph-0.023
Desk-0.022
estern-0.021
Token urgent
Token ID18039
Feature activation-5.32e-01

Pos logit contributions
atron+1.356
imet+1.325
izoph+1.289
Desk+1.239
estern+1.184
Neg logit contributions
nces-1.298
esome-1.230
 contrad-1.115
anamo-1.074
edom-1.032
Token needs
Token ID2476
Feature activation-3.30e-01

Pos logit contributions
atron+1.167
imet+1.145
izoph+1.115
Desk+1.074
estern+1.031
Neg logit contributions
nces-0.954
esome-0.901
 contrad-0.811
anamo-0.773
edom-0.744
Tokens
Token ID82
Feature activation-8.62e-01

Pos logit contributions
nces+0.731
esome+0.690
 contrad+0.627
anamo+0.599
edom+0.585
Neg logit contributions
atron-0.685
imet-0.669
izoph-0.661
Desk-0.634
estern-0.600
Token be
Token ID307
Feature activation-5.63e-01

Pos logit contributions
atron+1.725
imet+1.690
izoph+1.645
Desk+1.582
estern+1.508
Neg logit contributions
nces-1.705
esome-1.614
 contrad-1.475
anamo-1.409
edom-1.362
Token readily
Token ID14704
Feature activation-3.40e-01

Pos logit contributions
atron+1.091
imet+1.064
izoph+1.043
Desk+1.007
estern+0.954
Neg logit contributions
nces-1.154
esome-1.085
 contrad-1.006
anamo-0.944
edom-0.924
Token distinguished
Token ID18876
Feature activation+3.02e-01

Pos logit contributions
atron+0.657
imet+0.637
izoph+0.630
Desk+0.602
estern+0.569
Neg logit contributions
nces-0.700
esome-0.657
 contrad-0.611
anamo-0.570
edom-0.563
Token from
Token ID422
Feature activation-2.33e-01

Pos logit contributions
nces+0.593
esome+0.562
 contrad+0.521
anamo+0.483
edom+0.476
Neg logit contributions
atron-0.599
imet-0.586
izoph-0.579
Desk-0.556
estern-0.531
Token forc
Token ID24686
Feature activation+3.37e+00

Pos logit contributions
atron+0.474
imet+0.465
izoph+0.453
Desk+0.436
estern+0.413
Neg logit contributions
nces-0.456
esome-0.429
 contrad-0.393
anamo-0.371
 Flavoring-0.362
Tokenings
Token ID654
Feature activation+4.47e-01

Pos logit contributions
esome+4.260
nces+4.043
edom+3.237
ibilities+3.215
aked+2.982
Neg logit contributions
izoph-8.536
atron-8.468
imet-8.109
Desk-8.011
rology-7.886
Token (
Token ID357
Feature activation+2.02e-01

Pos logit contributions
nces+0.852
esome+0.811
 contrad+0.751
anamo+0.689
 Flavoring+0.683
Neg logit contributions
atron-0.906
imet-0.893
izoph-0.881
Desk-0.844
estern-0.806
TokenS
Token ID50
Feature activation+9.37e-01

Pos logit contributions
nces+0.432
esome+0.410
 contrad+0.370
anamo+0.359
edom+0.348
Neg logit contributions
atron-0.373
imet-0.362
izoph-0.358
Desk-0.334
estern-0.321
Tokenpen
Token ID3617
Feature activation-3.32e-02

Pos logit contributions
nces+1.961
esome+1.864
 contrad+1.698
anamo+1.649
edom+1.597
Neg logit contributions
atron-1.745
imet-1.685
izoph-1.651
Desk-1.585
estern-1.501
Tokencer
Token ID2189
Feature activation-7.95e-02

Pos logit contributions
atron+0.054
imet+0.052
izoph+0.051
Desk+0.047
estern+0.045
Neg logit contributions
nces-0.079
esome-0.075
 contrad-0.069
anamo-0.067
 Flavoring-0.066
Token Brazilian
Token ID17036
Feature activation+7.20e-01

Pos logit contributions
nces+1.136
esome+1.081
 contrad+1.007
anamo+0.953
 Flavoring+0.922
Neg logit contributions
atron-1.128
imet-1.097
izoph-1.089
Desk-1.026
estern-0.984
Token prosecutor
Token ID13683
Feature activation+5.55e-02

Pos logit contributions
nces+1.378
esome+1.318
 contrad+1.202
anamo+1.137
 Flavoring+1.112
Neg logit contributions
atron-1.463
imet-1.437
izoph-1.393
Desk-1.329
estern-1.298
Token that
Token ID326
Feature activation+3.49e-01

Pos logit contributions
nces+0.106
esome+0.101
 contrad+0.091
anamo+0.088
edom+0.085
Neg logit contributions
atron-0.115
imet-0.112
izoph-0.110
Desk-0.105
estern-0.100
Token Rolls
Token ID40074
Feature activation+1.18e+00

Pos logit contributions
nces+0.684
esome+0.650
 contrad+0.592
anamo+0.570
edom+0.547
Neg logit contributions
atron-0.704
imet-0.688
izoph-0.668
Desk-0.640
estern-0.618
Token-
Token ID12
Feature activation+1.35e+00

Pos logit contributions
nces+2.288
esome+2.221
 contrad+2.017
anamo+1.970
edom+1.896
Neg logit contributions
atron-2.285
izoph-2.239
imet-2.227
Desk-2.091
estern-2.006
TokenRoy
Token ID32027
Feature activation+3.36e+00

Pos logit contributions
nces+1.912
esome+1.841
 contrad+1.555
anamo+1.515
edom+1.450
Neg logit contributions
atron-3.298
imet-3.287
izoph-3.258
estern-2.992
Desk-2.981
Tokence
Token ID344
Feature activation+1.42e+00

Pos logit contributions
nces+6.046
esome+5.738
aked+4.829
udo+4.685
edom+4.676
Neg logit contributions
izoph-6.767
imet-6.756
atron-6.566
Desk-6.222
estern-6.127
Token is
Token ID318
Feature activation+5.54e-02

Pos logit contributions
nces+2.631
esome+2.489
 contrad+2.282
anamo+2.162
edom+2.074
Neg logit contributions
atron-2.928
imet-2.862
izoph-2.822
Desk-2.668
estern-2.594
Token co
Token ID763
Feature activation-3.25e-02

Pos logit contributions
nces+0.109
esome+0.104
 contrad+0.096
anamo+0.092
edom+0.088
Neg logit contributions
atron-0.110
imet-0.108
izoph-0.107
Desk-0.100
estern-0.097
Token-
Token ID12
Feature activation+7.15e-01

Pos logit contributions
atron+0.058
imet+0.057
izoph+0.054
Desk+0.052
estern+0.049
Neg logit contributions
nces-0.072
esome-0.068
 contrad-0.062
anamo-0.061
 Flavoring-0.059
Tokenoper
Token ID3575
Feature activation-6.95e-01

Pos logit contributions
nces+1.674
esome+1.622
 contrad+1.489
anamo+1.489
edom+1.413
Neg logit contributions
atron-1.109
imet-1.068
izoph-1.061
Desk-0.964
estern-0.933
Token named
Token ID3706
Feature activation+4.87e-02

Pos logit contributions
nces+0.618
esome+0.586
 contrad+0.533
anamo+0.517
edom+0.498
Neg logit contributions
atron-0.594
imet-0.581
izoph-0.570
Desk-0.542
estern-0.515
Token Norman
Token ID19636
Feature activation+1.01e+00

Pos logit contributions
nces+0.094
esome+0.090
 contrad+0.081
anamo+0.079
edom+0.075
Neg logit contributions
atron-0.098
imet-0.097
izoph-0.095
Desk-0.090
estern-0.086
Token Bates
Token ID35591
Feature activation+3.34e+00

Pos logit contributions
nces+1.970
esome+1.883
 contrad+1.695
anamo+1.636
edom+1.594
Neg logit contributions
atron-1.987
imet-1.972
izoph-1.931
Desk-1.825
estern-1.745
Token (
Token ID357
Feature activation+1.42e-01

Pos logit contributions
nces+6.246
esome+5.947
 contrad+5.286
anamo+5.089
edom+4.990
Neg logit contributions
atron-6.599
izoph-6.574
imet-6.456
Desk-6.107
estern-5.857
TokenAnthony
Token ID32697
Feature activation+1.42e+00

Pos logit contributions
nces+0.282
esome+0.269
 contrad+0.241
anamo+0.236
edom+0.227
Neg logit contributions
atron-0.279
imet-0.274
izoph-0.270
Desk-0.252
estern-0.243
Token Perkins
Token ID29618
Feature activation+3.34e+00

Pos logit contributions
nces+2.681
esome+2.611
 contrad+2.281
anamo+2.228
 Flavoring+2.144
Neg logit contributions
atron-2.900
imet-2.809
izoph-2.797
Desk-2.656
istani-2.517
Token.)
Token ID2014
Feature activation+1.07e+00

Pos logit contributions
nces+5.492
esome+5.203
 contrad+4.638
anamo+4.313
edom+4.209
Neg logit contributions
izoph-7.332
atron-7.325
imet-6.990
Desk-6.619
estern-6.457
Token Norman
Token ID19636
Feature activation+8.61e-01

Pos logit contributions
nces+2.229
esome+2.139
 contrad+1.927
anamo+1.902
edom+1.828
Neg logit contributions
atron-1.981
imet-1.971
izoph-1.945
Desk-1.793
estern-1.724
Token,
Token ID11
Feature activation-1.70e-01

Pos logit contributions
nces+1.548
esome+1.466
 contrad+1.311
anamo+1.262
edom+1.210
Neg logit contributions
atron-1.852
imet-1.836
izoph-1.802
Desk-1.711
estern-1.642
Token shall
Token ID2236
Feature activation+4.04e-01

Pos logit contributions
atron+0.329
imet+0.322
izoph+0.314
Desk+0.300
estern+0.285
Neg logit contributions
nces-0.347
esome-0.331
 contrad-0.303
anamo-0.294
edom-0.281
Token w
Token ID266
Feature activation+6.42e-01

Pos logit contributions
nces+0.698
esome+0.663
 contrad+0.593
anamo+0.563
edom+0.543
Neg logit contributions
atron-0.907
imet-0.891
izoph-0.868
Desk-0.837
estern-0.801
Tokenly
Token ID306
Feature activation-1.51e-01

Pos logit contributions
nces+2.500
esome+2.379
 contrad+2.170
anamo+2.085
edom+2.048
Neg logit contributions
atron-2.530
imet-2.506
izoph-2.489
Desk-2.278
estern-2.189
Token young
Token ID1862
Feature activation+3.56e-01

Pos logit contributions
atron+0.303
imet+0.296
izoph+0.292
Desk+0.275
estern+0.263
Neg logit contributions
nces-0.299
esome-0.283
 contrad-0.260
anamo-0.248
edom-0.242
Token man
Token ID582
Feature activation+3.05e-01

Pos logit contributions
nces+0.714
esome+0.677
 contrad+0.617
anamo+0.595
edom+0.576
Neg logit contributions
atron-0.697
imet-0.684
izoph-0.673
Desk-0.632
estern-0.605
Token named
Token ID3706
Feature activation+4.87e-02

Pos logit contributions
nces+0.618
esome+0.586
 contrad+0.533
anamo+0.517
edom+0.498
Neg logit contributions
atron-0.594
imet-0.581
izoph-0.570
Desk-0.542
estern-0.515
Token Norman
Token ID19636
Feature activation+1.01e+00

Pos logit contributions
nces+0.094
esome+0.090
 contrad+0.081
anamo+0.079
edom+0.075
Neg logit contributions
atron-0.098
imet-0.097
izoph-0.095
Desk-0.090
estern-0.086
Token Bates
Token ID35591
Feature activation+3.34e+00

Pos logit contributions
nces+1.970
esome+1.883
 contrad+1.695
anamo+1.636
edom+1.594
Neg logit contributions
atron-1.987
imet-1.972
izoph-1.931
Desk-1.825
estern-1.745
Token (
Token ID357
Feature activation+1.42e-01

Pos logit contributions
nces+6.246
esome+5.947
 contrad+5.286
anamo+5.089
edom+4.990
Neg logit contributions
atron-6.599
izoph-6.574
imet-6.456
Desk-6.107
estern-5.857
TokenAnthony
Token ID32697
Feature activation+1.42e+00

Pos logit contributions
nces+0.282
esome+0.269
 contrad+0.241
anamo+0.236
edom+0.227
Neg logit contributions
atron-0.279
imet-0.274
izoph-0.270
Desk-0.252
estern-0.243
Token Perkins
Token ID29618
Feature activation+3.34e+00

Pos logit contributions
nces+2.681
esome+2.611
 contrad+2.281
anamo+2.228
 Flavoring+2.144
Neg logit contributions
atron-2.900
imet-2.809
izoph-2.797
Desk-2.656
istani-2.517
Token.)
Token ID2014
Feature activation+1.07e+00

Pos logit contributions
nces+5.492
esome+5.203
 contrad+4.638
anamo+4.313
edom+4.209
Neg logit contributions
izoph-7.332
atron-7.325
imet-6.990
Desk-6.619
estern-6.457
Token Norman
Token ID19636
Feature activation+8.61e-01

Pos logit contributions
nces+2.229
esome+2.139
 contrad+1.927
anamo+1.902
edom+1.828
Neg logit contributions
atron-1.981
imet-1.971
izoph-1.945
Desk-1.793
estern-1.724
Token.
Token ID13
Feature activation+5.02e-01

Pos logit contributions
nces+2.063
esome+2.016
anamo+1.674
 Flavoring+1.654
udo+1.652
Neg logit contributions
atron-2.248
imet-2.218
izoph-2.176
estern-1.994
Desk-1.990
TokenN
Token ID45
Feature activation+1.26e+00

Pos logit contributions
nces+1.204
esome+1.154
 contrad+1.063
anamo+1.035
 Flavoring+1.016
Neg logit contributions
atron-0.786
imet-0.765
izoph-0.732
Desk-0.690
estern-0.659
Token.
Token ID13
Feature activation+2.31e-01

Pos logit contributions
nces+2.643
esome+2.493
 contrad+2.247
anamo+2.207
 Flavoring+2.144
Neg logit contributions
atron-2.353
imet-2.308
izoph-2.209
Desk-2.141
estern-2.040
Token Ambassador
Token ID20656
Feature activation+7.82e-01

Pos logit contributions
nces+0.444
esome+0.416
 contrad+0.374
anamo+0.367
 Flavoring+0.356
Neg logit contributions
atron-0.478
imet-0.464
izoph-0.452
Desk-0.434
estern-0.418
Token Susan
Token ID14919
Feature activation+1.30e+00

Pos logit contributions
nces+1.620
esome+1.535
 contrad+1.389
anamo+1.362
 Flavoring+1.311
Neg logit contributions
atron-1.489
imet-1.446
izoph-1.433
Desk-1.348
estern-1.306
Token Rice
Token ID13823
Feature activation+3.33e+00

Pos logit contributions
nces+0.868
esome+0.797
 contrad+0.544
udo+0.423
liction+0.375
Neg logit contributions
imet-4.160
izoph-4.096
atron-4.084
estern-3.928
Desk-3.926
Token,
Token ID11
Feature activation+2.41e-01

Pos logit contributions
nces+5.744
esome+5.503
 contrad+4.840
anamo+4.588
 Flavoring+4.504
Neg logit contributions
izoph-6.984
atron-6.973
imet-6.914
Desk-6.493
estern-6.227
Token and
Token ID290
Feature activation+3.59e-01

Pos logit contributions
nces+0.472
esome+0.447
 contrad+0.407
anamo+0.395
 Flavoring+0.380
Neg logit contributions
atron-0.483
imet-0.474
izoph-0.464
Desk-0.443
estern-0.424
Token Special
Token ID6093
Feature activation+1.48e+00

Pos logit contributions
nces+0.704
esome+0.665
 contrad+0.605
anamo+0.584
 Flavoring+0.564
Neg logit contributions
atron-0.719
imet-0.710
izoph-0.694
Desk-0.662
estern-0.636
Token Assistant
Token ID15286
Feature activation+4.14e-01

Pos logit contributions
nces+3.307
esome+3.172
anamo+2.881
 contrad+2.876
 Flavoring+2.840
Neg logit contributions
atron-2.486
imet-2.405
izoph-2.332
Desk-2.185
estern-2.144
Token Samantha
Token ID34778
Feature activation+8.78e-01

Pos logit contributions
nces+0.802
esome+0.763
 contrad+0.685
anamo+0.672
 Flavoring+0.638
Neg logit contributions
atron-0.844
imet-0.827
izoph-0.813
Desk-0.765
estern-0.731
Token,
Token ID11
Feature activation-1.30e-01

Pos logit contributions
nces+0.353
esome+0.334
 contrad+0.305
anamo+0.297
 Flavoring+0.285
Neg logit contributions
atron-0.318
imet-0.310
izoph-0.301
Desk-0.287
estern-0.274
Token D
Token ID360
Feature activation+6.71e-01

Pos logit contributions
atron+0.250
imet+0.244
izoph+0.236
Desk+0.226
estern+0.215
Neg logit contributions
nces-0.269
esome-0.255
 contrad-0.233
anamo-0.226
 Flavoring-0.217
Token.
Token ID13
Feature activation-3.18e-01

Pos logit contributions
nces+1.242
esome+1.225
 contrad+1.026
udo+0.999
 Flavoring+0.994
Neg logit contributions
atron-1.358
imet-1.358
izoph-1.340
Desk-1.243
estern-1.231
TokenC
Token ID34
Feature activation+1.56e-01

Pos logit contributions
imet+0.461
atron+0.453
izoph+0.444
Desk+0.388
estern+0.386
Neg logit contributions
nces-0.784
esome-0.772
 contrad-0.695
 Flavoring-0.676
anamo-0.675
Token.:
Token ID11207
Feature activation+1.67e-01

Pos logit contributions
nces+0.354
esome+0.338
 contrad+0.311
anamo+0.303
 Flavoring+0.292
Neg logit contributions
atron-0.268
imet-0.262
izoph-0.252
Desk-0.240
estern-0.227
Token Smithsonian
Token ID40131
Feature activation+3.33e+00

Pos logit contributions
nces+0.324
esome+0.307
 contrad+0.278
anamo+0.269
edom+0.259
Neg logit contributions
atron-0.337
imet-0.332
izoph-0.322
Desk-0.307
estern-0.295
Token Institution
Token ID29426
Feature activation+1.94e+00

Pos logit contributions
nces+4.452
esome+4.200
 contrad+3.630
 Flavoring+3.336
anamo+3.272
Neg logit contributions
atron-8.190
imet-8.013
izoph-7.819
Desk-7.708
estern-7.441
Token Press
Token ID4332
Feature activation+2.02e+00

Pos logit contributions
nces+3.351
esome+3.136
 contrad+2.808
anamo+2.682
 Flavoring+2.595
Neg logit contributions
atron-4.315
imet-4.238
izoph-4.088
Desk-3.965
estern-3.802
Token,
Token ID11
Feature activation+2.84e-01

Pos logit contributions
nces+3.883
esome+3.671
 contrad+3.320
anamo+3.161
 Flavoring+3.110
Neg logit contributions
atron-4.038
imet-3.987
izoph-3.869
Desk-3.735
estern-3.526
Token 1994
Token ID9162
Feature activation+1.20e+00

Pos logit contributions
nces+0.602
esome+0.577
 contrad+0.530
 Flavoring+0.502
anamo+0.498
Neg logit contributions
atron-0.512
imet-0.507
izoph-0.496
Desk-0.465
estern-0.450
Token.
Token ID13
Feature activation+2.60e-01

Pos logit contributions
nces+2.652
esome+2.524
 contrad+2.317
anamo+2.255
 Flavoring+2.171
Neg logit contributions
atron-2.144
imet-2.094
izoph-2.019
Desk-1.929
estern-1.825
Token.
Token ID13
Feature activation-1.81e-01

Pos logit contributions
nces+1.170
esome+1.155
 contrad+0.947
udo+0.942
 Flavoring+0.937
Neg logit contributions
atron-1.269
imet-1.253
izoph-1.246
estern-1.136
Desk-1.134
TokenS
Token ID50
Feature activation+1.58e+00

Pos logit contributions
atron+0.253
imet+0.247
izoph+0.236
Desk+0.223
estern+0.207
Neg logit contributions
nces-0.466
esome-0.446
 contrad-0.416
anamo-0.408
 Flavoring-0.393
Token.
Token ID13
Feature activation+3.78e-01

Pos logit contributions
nces+3.365
esome+3.189
 contrad+2.917
anamo+2.812
edom+2.731
Neg logit contributions
atron-2.923
imet-2.842
izoph-2.746
Desk-2.635
estern-2.504
Token Department
Token ID2732
Feature activation+9.16e-01

Pos logit contributions
nces+0.683
esome+0.640
 contrad+0.576
anamo+0.553
edom+0.529
Neg logit contributions
atron-0.829
imet-0.803
izoph-0.785
Desk-0.753
estern-0.724
Token of
Token ID286
Feature activation+4.69e-01

Pos logit contributions
nces+1.651
esome+1.581
 contrad+1.364
 Flavoring+1.315
anamo+1.291
Neg logit contributions
atron-1.933
izoph-1.915
imet-1.906
Desk-1.778
estern-1.757
Token Defense
Token ID5947
Feature activation+3.31e+00

Pos logit contributions
nces+0.867
esome+0.825
 contrad+0.744
anamo+0.723
edom+0.692
Neg logit contributions
atron-0.987
imet-0.965
izoph-0.940
Desk-0.904
estern-0.862
Token Africa
Token ID5478
Feature activation-1.82e-01

Pos logit contributions
nces+5.873
esome+5.467
 contrad+4.790
anamo+4.680
liction+4.506
Neg logit contributions
atron-6.959
izoph-6.567
imet-6.533
Desk-6.209
estern-6.029
Token Command
Token ID9455
Feature activation+1.18e+00

Pos logit contributions
atron+0.363
imet+0.355
izoph+0.343
Desk+0.327
estern+0.315
Neg logit contributions
nces-0.364
esome-0.344
 contrad-0.313
anamo-0.303
 Flavoring-0.292
Token website
Token ID3052
Feature activation+1.12e+00

Pos logit contributions
nces+2.174
esome+2.066
 contrad+1.832
anamo+1.792
edom+1.688
Neg logit contributions
atron-2.501
imet-2.419
izoph-2.374
Desk-2.249
estern-2.198
Token,
Token ID11
Feature activation+4.90e-01

Pos logit contributions
nces+2.175
esome+2.044
 contrad+1.855
anamo+1.802
edom+1.732
Neg logit contributions
atron-2.266
imet-2.197
izoph-2.138
Desk-2.046
estern-1.974
Token the
Token ID262
Feature activation-4.63e-02

Pos logit contributions
nces+0.970
esome+0.915
 contrad+0.839
anamo+0.820
edom+0.776
Neg logit contributions
atron-0.978
imet-0.957
izoph-0.930
Desk-0.887
estern-0.862
Token]
Token ID60
Feature activation+2.49e-01

Pos logit contributions
nces+0.592
esome+0.559
 contrad+0.508
anamo+0.493
 Flavoring+0.475
Neg logit contributions
atron-0.580
imet-0.567
izoph-0.549
Desk-0.528
estern-0.502
Token The
Token ID383
Feature activation-1.86e-01

Pos logit contributions
nces+0.518
esome+0.496
 contrad+0.450
anamo+0.438
 Flavoring+0.430
Neg logit contributions
imet-0.461
atron-0.459
izoph-0.452
Desk-0.417
estern-0.398
Token inscription
Token ID40189
Feature activation+8.48e-01

Pos logit contributions
atron+0.366
imet+0.356
izoph+0.351
Desk+0.333
estern+0.322
Neg logit contributions
nces-0.374
esome-0.355
 contrad-0.328
anamo-0.310
edom-0.300
Token on
Token ID319
Feature activation+5.39e-01

Pos logit contributions
nces+1.586
esome+1.503
 contrad+1.380
anamo+1.299
edom+1.258
Neg logit contributions
atron-1.764
imet-1.739
izoph-1.726
Desk-1.606
estern-1.566
Token the
Token ID262
Feature activation+8.22e-02

Pos logit contributions
nces+1.053
esome+0.998
 contrad+0.912
anamo+0.873
edom+0.849
Neg logit contributions
atron-1.078
imet-1.060
izoph-1.041
Desk-0.990
estern-0.939
Token pedest
Token ID14238
Feature activation+3.27e+00

Pos logit contributions
nces+0.156
esome+0.149
 contrad+0.138
anamo+0.130
edom+0.125
Neg logit contributions
atron-0.169
imet-0.164
izoph-0.163
Desk-0.154
estern-0.149
Tokenal
Token ID282
Feature activation+2.50e-01

Pos logit contributions
esome+4.191
nces+4.173
edom+3.441
anamo+3.330
udo+3.299
Neg logit contributions
izoph-7.881
atron-7.468
imet-7.430
Desk-7.265
estern-7.205
Token reads
Token ID9743
Feature activation-6.03e-01

Pos logit contributions
nces+0.391
esome+0.365
 contrad+0.330
anamo+0.306
edom+0.294
Neg logit contributions
atron-0.601
imet-0.590
izoph-0.584
Desk-0.556
estern-0.541
Token "
Token ID366
Feature activation-1.21e-01

Pos logit contributions
atron+1.088
imet+1.063
izoph+1.027
Desk+0.981
estern+0.929
Neg logit contributions
nces-1.312
esome-1.247
 contrad-1.144
anamo-1.113
 Flavoring-1.071
Tokenlet
Token ID1616
Feature activation-3.27e-01

Pos logit contributions
atron+0.225
imet+0.221
izoph+0.216
Desk+0.201
estern+0.194
Neg logit contributions
nces-0.256
esome-0.244
 contrad-0.222
anamo-0.216
edom-0.209
Token her
Token ID607
Feature activation+1.05e-01

Pos logit contributions
atron+0.667
imet+0.657
izoph+0.640
Desk+0.611
estern+0.582
Neg logit contributions
nces-0.638
esome-0.603
 contrad-0.545
anamo-0.532
edom-0.509
Token Barack
Token ID8732
Feature activation+2.93e-01

Pos logit contributions
nces+0.085
esome+0.081
 contrad+0.074
anamo+0.072
edom+0.070
Neg logit contributions
atron-0.064
imet-0.062
izoph-0.061
Desk-0.057
estern-0.054
Token Obama
Token ID2486
Feature activation+1.78e-01

Pos logit contributions
nces+0.613
esome+0.605
 contrad+0.533
anamo+0.516
 Flavoring+0.504
Neg logit contributions
imet-0.538
atron-0.530
izoph-0.523
Desk-0.482
estern-0.466
Token announced
Token ID3414
Feature activation-1.13e+00

Pos logit contributions
nces+0.364
esome+0.350
 contrad+0.317
anamo+0.301
edom+0.294
Neg logit contributions
atron-0.344
imet-0.334
izoph-0.334
Desk-0.314
estern-0.296
Token Monday
Token ID3321
Feature activation+1.08e+00

Pos logit contributions
atron+2.096
imet+2.049
izoph+1.979
Desk+1.895
estern+1.796
Neg logit contributions
nces-2.411
esome-2.290
 contrad-2.097
anamo-2.037
 Flavoring-1.959
Token that
Token ID326
Feature activation+2.67e-01

Pos logit contributions
nces+1.981
esome+1.896
 contrad+1.702
anamo+1.635
edom+1.583
Neg logit contributions
atron-2.280
imet-2.248
izoph-2.211
Desk-2.084
estern-1.996
Token Defense
Token ID5947
Feature activation+3.26e+00

Pos logit contributions
nces+0.557
esome+0.525
 contrad+0.480
anamo+0.466
edom+0.447
Neg logit contributions
atron-0.512
imet-0.502
izoph-0.488
Desk-0.465
estern-0.443
Token Secretary
Token ID4986
Feature activation+7.69e-01

Pos logit contributions
nces+6.356
esome+5.866
 contrad+5.138
anamo+5.005
liction+4.936
Neg logit contributions
atron-6.544
imet-6.391
izoph-6.274
estern-5.695
Desk-5.606
Token Chuck
Token ID14663
Feature activation+8.98e-01

Pos logit contributions
nces+1.353
esome+1.292
 contrad+1.134
anamo+1.118
edom+1.038
Neg logit contributions
atron-1.692
izoph-1.656
imet-1.655
Desk-1.567
estern-1.526
Token Hag
Token ID21375
Feature activation+9.50e-01

Pos logit contributions
nces+1.936
esome+1.925
anamo+1.664
 contrad+1.660
 Flavoring+1.601
Neg logit contributions
imet-1.582
atron-1.545
izoph-1.534
Desk-1.426
estern-1.362
Tokenel
Token ID417
Feature activation+4.81e-01

Pos logit contributions
nces+1.387
esome+1.387
edom+1.079
anamo+1.056
liction+1.048
Neg logit contributions
izoph-2.285
atron-2.271
imet-2.266
Desk-2.170
estern-2.076
Token will
Token ID481
Feature activation+1.38e+00

Pos logit contributions
nces+0.783
esome+0.745
 contrad+0.652
anamo+0.621
edom+0.605
Neg logit contributions
atron-1.114
imet-1.097
izoph-1.096
Desk-1.030
estern-0.988
Token
Token ID247
Feature activation-2.42e-01

Pos logit contributions
atron+0.502
imet+0.498
izoph+0.485
Desk+0.473
estern+0.460
Neg logit contributions
nces-0.201
esome-0.182
 contrad-0.151
anamo-0.139
 Flavoring-0.129
Tokent
Token ID83
Feature activation-4.68e-03

Pos logit contributions
atron+0.486
imet+0.478
izoph+0.464
Desk+0.446
estern+0.425
Neg logit contributions
nces-0.478
esome-0.450
 contrad-0.405
anamo-0.392
edom-0.378
Token Break
Token ID12243
Feature activation-6.60e-01

Pos logit contributions
atron+0.009
imet+0.009
izoph+0.008
Desk+0.008
estern+0.008
Neg logit contributions
nces-0.010
esome-0.009
 contrad-0.008
anamo-0.008
 Flavoring-0.008
Token US
Token ID1294
Feature activation+1.01e+00

Pos logit contributions
atron+1.227
imet+1.205
izoph+1.172
Desk+1.115
estern+1.060
Neg logit contributions
nces-1.396
esome-1.324
 contrad-1.209
anamo-1.177
 Flavoring-1.138
Token-
Token ID12
Feature activation+9.12e-01

Pos logit contributions
nces+2.066
esome+1.956
 contrad+1.769
anamo+1.737
edom+1.689
Neg logit contributions
atron-1.938
imet-1.876
izoph-1.835
Desk-1.747
estern-1.663
TokenPhilipp
Token ID49680
Feature activation+3.24e+00

Pos logit contributions
nces+1.810
esome+1.721
 contrad+1.563
anamo+1.517
edom+1.484
Neg logit contributions
atron-1.761
imet-1.742
izoph-1.702
estern-1.554
Desk-1.550
Tokenines
Token ID1127
Feature activation+1.22e+00

Pos logit contributions
nces+4.748
esome+4.706
anamo+4.135
 contrad+3.882
edom+3.881
Neg logit contributions
imet-7.322
atron-7.312
izoph-7.081
Desk-6.882
rology-6.511
Token T
Token ID309
Feature activation+5.40e-01

Pos logit contributions
nces+2.687
esome+2.612
 contrad+2.412
anamo+2.341
 Flavoring+2.325
Neg logit contributions
atron-2.059
imet-1.999
izoph-1.955
Desk-1.850
estern-1.745
Tokenies
Token ID444
Feature activation-3.33e-02

Pos logit contributions
nces+0.900
esome+0.851
 contrad+0.738
anamo+0.732
 Flavoring+0.690
Neg logit contributions
atron-1.232
imet-1.203
izoph-1.179
Desk-1.141
estern-1.095
Token C
Token ID327
Feature activation+2.76e-01

Pos logit contributions
atron+0.062
imet+0.061
izoph+0.060
Desk+0.057
estern+0.054
Neg logit contributions
nces-0.070
esome-0.066
 contrad-0.060
anamo-0.059
edom-0.056
Tokenement
Token ID972
Feature activation+1.03e+00

Pos logit contributions
nces+0.585
esome+0.555
 contrad+0.502
anamo+0.494
edom+0.480
Neg logit contributions
atron-0.510
imet-0.494
izoph-0.484
Desk-0.464
estern-0.434
TokenQ
Token ID48
Feature activation-2.96e-01

Pos logit contributions
nces+0.597
esome+0.558
 contrad+0.510
anamo+0.474
 Flavoring+0.461
Neg logit contributions
atron-0.741
imet-0.724
izoph-0.710
Desk-0.670
estern-0.653
Token with
Token ID351
Feature activation+3.94e-01

Pos logit contributions
atron+0.578
imet+0.565
izoph+0.549
Desk+0.519
estern+0.503
Neg logit contributions
nces-0.603
esome-0.573
 contrad-0.523
anamo-0.503
edom-0.484
Token its
Token ID663
Feature activation+3.88e-01

Pos logit contributions
nces+0.781
esome+0.732
 contrad+0.675
anamo+0.632
edom+0.622
Neg logit contributions
atron-0.794
imet-0.775
izoph-0.752
Desk-0.713
estern-0.697
Token 144
Token ID20224
Feature activation+1.57e+00

Pos logit contributions
nces+0.782
esome+0.741
 contrad+0.682
anamo+0.639
edom+0.626
Neg logit contributions
atron-0.762
imet-0.743
izoph-0.724
Desk-0.688
estern-0.673
TokenHz
Token ID7399
Feature activation+5.54e-01

Pos logit contributions
nces+2.200
esome+2.002
 contrad+1.763
anamo+1.638
 Flavoring+1.548
Neg logit contributions
atron-4.054
imet-3.980
izoph-3.894
Desk-3.749
estern-3.662
Token refresh
Token ID14976
Feature activation+3.23e+00

Pos logit contributions
nces+0.981
esome+0.920
 contrad+0.829
anamo+0.787
edom+0.755
Neg logit contributions
atron-1.225
imet-1.198
izoph-1.169
Desk-1.121
estern-1.093
Token rate
Token ID2494
Feature activation+6.39e-02

Pos logit contributions
nces+4.634
esome+4.205
 contrad+3.747
antage+3.395
edom+3.257
Neg logit contributions
atron-7.667
izoph-7.632
imet-7.509
estern-7.265
Desk-7.211
Token and
Token ID290
Feature activation+1.01e-01

Pos logit contributions
nces+0.131
esome+0.123
 contrad+0.113
anamo+0.107
edom+0.104
Neg logit contributions
atron-0.124
imet-0.121
izoph-0.118
Desk-0.113
estern-0.109
Token fast
Token ID3049
Feature activation+6.10e-01

Pos logit contributions
nces+0.202
esome+0.191
 contrad+0.176
anamo+0.166
edom+0.161
Neg logit contributions
atron-0.199
imet-0.195
izoph-0.189
Desk-0.180
estern-0.176
Token response
Token ID2882
Feature activation+1.97e-01

Pos logit contributions
nces+1.221
esome+1.161
 contrad+1.066
anamo+1.004
edom+0.989
Neg logit contributions
atron-1.201
imet-1.177
izoph-1.152
Desk-1.086
estern-1.057
Token time
Token ID640
Feature activation-1.03e+00

Pos logit contributions
nces+0.433
esome+0.413
 contrad+0.381
anamo+0.359
edom+0.354
Neg logit contributions
atron-0.349
imet-0.339
izoph-0.335
Desk-0.317
estern-0.303
Token application
Token ID3586
Feature activation+1.23e+00

Pos logit contributions
nces+0.593
esome+0.565
 contrad+0.523
anamo+0.487
edom+0.475
Neg logit contributions
atron-0.636
imet-0.617
izoph-0.609
Desk-0.580
estern-0.560
Token.
Token ID13
Feature activation-2.63e-01

Pos logit contributions
nces+2.398
esome+2.294
 contrad+2.095
anamo+1.935
edom+1.922
Neg logit contributions
atron-2.423
izoph-2.356
imet-2.330
Desk-2.198
estern-2.145
Token In
Token ID554
Feature activation-1.01e-01

Pos logit contributions
atron+0.482
imet+0.477
izoph+0.464
Desk+0.433
estern+0.422
Neg logit contributions
nces-0.559
esome-0.537
 contrad-0.487
anamo-0.469
 Flavoring-0.466
Token terms
Token ID2846
Feature activation-2.54e-01

Pos logit contributions
atron+0.199
imet+0.193
izoph+0.188
Desk+0.179
estern+0.172
Neg logit contributions
nces-0.205
esome-0.192
 contrad-0.179
anamo-0.168
edom-0.164
Token of
Token ID286
Feature activation-6.86e-01

Pos logit contributions
atron+0.517
imet+0.506
izoph+0.505
Desk+0.474
estern+0.462
Neg logit contributions
nces-0.489
esome-0.469
 contrad-0.412
anamo-0.395
 Flavoring-0.392
Token thrust
Token ID14613
Feature activation+3.12e+00

Pos logit contributions
atron+1.377
imet+1.346
izoph+1.301
Desk+1.254
estern+1.202
Neg logit contributions
nces-1.361
esome-1.286
 contrad-1.175
anamo-1.120
 Flavoring-1.085
Token to
Token ID284
Feature activation+4.86e-02

Pos logit contributions
nces+5.623
esome+5.309
 contrad+4.805
edom+4.551
anamo+4.290
Neg logit contributions
atron-6.192
izoph-6.097
imet-5.886
Desk-5.853
estern-5.692
Token weight
Token ID3463
Feature activation+1.55e+00

Pos logit contributions
nces+0.092
esome+0.087
 contrad+0.078
anamo+0.074
edom+0.073
Neg logit contributions
atron-0.102
imet-0.099
izoph-0.097
Desk-0.093
estern-0.090
Token it
Token ID340
Feature activation-2.72e-01

Pos logit contributions
nces+2.084
esome+1.929
 contrad+1.701
anamo+1.505
edom+1.497
Neg logit contributions
atron-3.962
izoph-3.915
imet-3.860
Desk-3.773
estern-3.671
Token is
Token ID318
Feature activation+5.75e-02

Pos logit contributions
atron+0.590
imet+0.580
izoph+0.570
Desk+0.547
estern+0.525
Neg logit contributions
nces-0.496
esome-0.462
 contrad-0.418
anamo-0.394
edom-0.384
Token already
Token ID1541
Feature activation-9.51e-01

Pos logit contributions
nces+0.114
esome+0.106
 contrad+0.098
anamo+0.090
edom+0.089
Neg logit contributions
atron-0.115
imet-0.114
izoph-0.112
Desk-0.106
estern-0.102
Token,
Token ID11
Feature activation-4.91e-02

Pos logit contributions
nces+0.574
esome+0.547
 contrad+0.498
anamo+0.468
edom+0.463
Neg logit contributions
atron-0.619
imet-0.609
izoph-0.596
Desk-0.572
estern-0.550
Token one
Token ID530
Feature activation-7.12e-01

Pos logit contributions
atron+0.097
imet+0.095
izoph+0.094
Desk+0.089
estern+0.085
Neg logit contributions
nces-0.098
esome-0.094
 contrad-0.086
anamo-0.081
edom-0.079
Token of
Token ID286
Feature activation+8.21e-01

Pos logit contributions
atron+1.358
imet+1.330
izoph+1.292
Desk+1.236
estern+1.176
Neg logit contributions
nces-1.476
esome-1.402
 contrad-1.288
anamo-1.231
edom-1.193
Token which
Token ID543
Feature activation+3.68e-01

Pos logit contributions
nces+1.571
esome+1.487
 contrad+1.374
anamo+1.268
edom+1.255
Neg logit contributions
atron-1.668
imet-1.634
izoph-1.612
Desk-1.537
estern-1.478
Token involves
Token ID9018
Feature activation-6.45e-01

Pos logit contributions
nces+0.671
esome+0.630
 contrad+0.576
anamo+0.535
edom+0.524
Neg logit contributions
atron-0.790
imet-0.777
izoph-0.761
Desk-0.735
estern-0.706
Token charge
Token ID3877
Feature activation+3.09e+00

Pos logit contributions
atron+1.272
imet+1.245
izoph+1.207
Desk+1.159
estern+1.103
Neg logit contributions
nces-1.296
esome-1.227
 contrad-1.119
anamo-1.079
 Flavoring-1.037
Token-
Token ID12
Feature activation+5.24e-01

Pos logit contributions
nces+5.563
esome+5.281
 contrad+5.024
anamo+4.533
edom+4.523
Neg logit contributions
izoph-5.949
atron-5.942
imet-5.920
Desk-5.653
estern-5.590
Tokendensity
Token ID43337
Feature activation+2.31e+00

Pos logit contributions
nces+1.127
esome+1.073
 contrad+0.974
anamo+0.949
edom+0.918
Neg logit contributions
atron-0.943
imet-0.930
izoph-0.914
Desk-0.860
estern-0.828
Token wave
Token ID6769
Feature activation-1.60e-01

Pos logit contributions
nces+4.065
esome+3.826
 contrad+3.600
edom+3.169
anamo+3.129
Neg logit contributions
atron-4.827
izoph-4.756
imet-4.748
Desk-4.572
estern-4.498
Token fluctuations
Token ID31101
Feature activation+9.91e-01

Pos logit contributions
atron+0.297
imet+0.291
izoph+0.286
Desk+0.272
estern+0.260
Neg logit contributions
nces-0.339
esome-0.320
 contrad-0.296
anamo-0.281
edom-0.275
Token.
Token ID13
Feature activation-1.64e-03

Pos logit contributions
nces+1.918
esome+1.825
 contrad+1.684
edom+1.530
anamo+1.509
Neg logit contributions
atron-1.970
izoph-1.941
imet-1.924
Desk-1.846
estern-1.762
Token-
Token ID12
Feature activation+3.15e-01

Pos logit contributions
atron+1.811
imet+1.771
izoph+1.717
Desk+1.648
estern+1.570
Neg logit contributions
nces-1.801
esome-1.702
 contrad-1.548
anamo-1.501
edom-1.436
Tokenera
Token ID8607
Feature activation+1.29e+00

Pos logit contributions
nces+0.584
esome+0.534
 contrad+0.476
anamo+0.472
edom+0.451
Neg logit contributions
atron-0.672
imet-0.659
izoph-0.652
Desk-0.612
estern-0.599
Token standard
Token ID3210
Feature activation+1.53e+00

Pos logit contributions
nces+2.322
esome+2.140
 contrad+1.943
anamo+1.861
edom+1.772
Neg logit contributions
atron-2.827
imet-2.719
izoph-2.673
Desk-2.554
estern-2.499
Token of
Token ID286
Feature activation-1.03e-01

Pos logit contributions
nces+2.929
esome+2.781
 contrad+2.537
anamo+2.438
edom+2.354
Neg logit contributions
atron-3.091
imet-3.011
izoph-2.975
Desk-2.806
estern-2.741
Token "
Token ID366
Feature activation-9.10e-02

Pos logit contributions
atron+0.210
imet+0.206
izoph+0.199
Desk+0.192
estern+0.184
Neg logit contributions
nces-0.200
esome-0.191
 contrad-0.172
anamo-0.166
edom-0.159
Tokenprep
Token ID46012
Feature activation+3.08e+00

Pos logit contributions
atron+0.171
imet+0.166
izoph+0.162
Desk+0.154
estern+0.148
Neg logit contributions
nces-0.193
esome-0.183
 contrad-0.165
anamo-0.162
edom-0.155
Tokenonder
Token ID8623
Feature activation+1.62e+00

Pos logit contributions
nces+6.031
esome+5.754
 contrad+5.187
anamo+5.066
edom+5.020
Neg logit contributions
atron-5.906
imet-5.675
izoph-5.573
Desk-5.517
estern-5.074
Tokenance
Token ID590
Feature activation+1.75e+00

Pos logit contributions
nces+2.675
esome+2.437
anamo+2.149
 contrad+2.066
liction+2.005
Neg logit contributions
atron-3.737
imet-3.715
izoph-3.686
Desk-3.497
estern-3.440
Token of
Token ID286
Feature activation+1.29e-01

Pos logit contributions
nces+3.687
esome+3.500
 contrad+3.200
anamo+3.114
 Flavoring+2.992
Neg logit contributions
atron-3.261
imet-3.192
izoph-3.083
Desk-2.953
estern-2.797
Token the
Token ID262
Feature activation+3.34e-01

Pos logit contributions
nces+0.240
esome+0.226
 contrad+0.204
anamo+0.197
 Flavoring+0.188
Neg logit contributions
atron-0.276
imet-0.270
izoph-0.262
Desk-0.253
estern-0.241
Token evidence
Token ID2370
Feature activation+1.33e+00

Pos logit contributions
nces+0.522
esome+0.486
 contrad+0.429
anamo+0.411
edom+0.388
Neg logit contributions
atron-0.809
imet-0.793
izoph-0.773
Desk-0.749
estern-0.719
Token.
Token ID13
Feature activation-4.13e-01

Pos logit contributions
nces+2.212
esome+2.068
 contrad+1.844
anamo+1.812
udo+1.779
Neg logit contributions
atron-2.130
izoph-2.086
imet-2.056
Desk-1.957
estern-1.854
Token Rand
Token ID8790
Feature activation+4.11e-01

Pos logit contributions
atron+0.905
imet+0.886
izoph+0.871
Desk+0.834
estern+0.801
Neg logit contributions
nces-0.740
esome-0.698
 contrad-0.623
anamo-0.600
 Flavoring-0.575
Token Paul
Token ID3362
Feature activation+9.34e-01

Pos logit contributions
nces+0.584
esome+0.544
 contrad+0.442
anamo+0.424
 Flavoring+0.413
Neg logit contributions
atron-1.040
izoph-1.026
imet-1.023
Desk-0.967
estern-0.933
Token is
Token ID318
Feature activation+3.62e-01

Pos logit contributions
nces+1.811
esome+1.731
 contrad+1.553
anamo+1.508
edom+1.455
Neg logit contributions
atron-1.879
imet-1.810
izoph-1.804
Desk-1.715
estern-1.616
Token "
Token ID366
Feature activation+1.57e-01

Pos logit contributions
nces+0.693
esome+0.658
 contrad+0.609
anamo+0.572
 Flavoring+0.558
Neg logit contributions
atron-0.730
imet-0.716
izoph-0.712
Desk-0.672
estern-0.638
Tokenbas
Token ID12093
Feature activation+3.08e+00

Pos logit contributions
nces+0.323
esome+0.308
 contrad+0.277
anamo+0.272
edom+0.258
Neg logit contributions
atron-0.298
imet-0.293
izoph-0.288
Desk-0.269
estern-0.259
Tokenically
Token ID1146
Feature activation+4.26e-01

Pos logit contributions
esome+5.748
nces+5.631
 contrad+4.942
edom+4.886
anamo+4.857
Neg logit contributions
izoph-5.870
atron-5.819
Desk-5.528
imet-5.509
istani-5.020
Token an
Token ID281
Feature activation-1.75e-01

Pos logit contributions
nces+0.838
esome+0.790
 contrad+0.729
anamo+0.687
edom+0.665
Neg logit contributions
atron-0.854
imet-0.834
izoph-0.829
Desk-0.781
estern-0.744
Token isolation
Token ID15133
Feature activation+1.86e+00

Pos logit contributions
atron+0.353
imet+0.346
izoph+0.337
Desk+0.322
estern+0.309
Neg logit contributions
nces-0.341
esome-0.322
 contrad-0.294
anamo-0.283
edom-0.271
Tokenist
Token ID396
Feature activation-7.24e-02

Pos logit contributions
nces+3.377
esome+3.242
 contrad+3.097
anamo+2.848
udo+2.783
Neg logit contributions
atron-3.715
imet-3.610
izoph-3.603
Desk-3.373
estern-3.243
Token,"
Token ID553
Feature activation+1.83e-01

Pos logit contributions
atron+0.136
imet+0.134
izoph+0.130
Desk+0.124
estern+0.118
Neg logit contributions
nces-0.151
esome-0.143
 contrad-0.132
anamo-0.126
edom-0.122
Token.
Token ID13
Feature activation+3.49e-02

Pos logit contributions
nces+1.910
esome+1.817
 contrad+1.657
anamo+1.598
edom+1.550
Neg logit contributions
atron-1.843
imet-1.805
izoph-1.760
Desk-1.686
estern-1.605
Token And
Token ID843
Feature activation+4.93e-01

Pos logit contributions
nces+0.073
esome+0.070
 contrad+0.063
anamo+0.062
 Flavoring+0.060
Neg logit contributions
imet-0.065
atron-0.065
izoph-0.063
Desk-0.059
estern-0.057
Token the
Token ID262
Feature activation-6.98e-02

Pos logit contributions
nces+0.985
esome+0.928
 contrad+0.844
anamo+0.831
 Flavoring+0.777
Neg logit contributions
atron-0.968
imet-0.957
izoph-0.937
Desk-0.891
estern-0.855
Token Department
Token ID2732
Feature activation+3.23e-01

Pos logit contributions
atron+0.141
imet+0.138
izoph+0.135
Desk+0.130
estern+0.125
Neg logit contributions
nces-0.136
esome-0.128
 contrad-0.118
anamo-0.113
 Flavoring-0.108
Token of
Token ID286
Feature activation-2.17e-03

Pos logit contributions
nces+0.616
esome+0.591
 contrad+0.522
anamo+0.509
 Flavoring+0.496
Neg logit contributions
atron-0.653
imet-0.645
izoph-0.642
Desk-0.602
estern-0.586
Token Defense
Token ID5947
Feature activation+3.07e+00

Pos logit contributions
atron+0.004
imet+0.004
izoph+0.004
Desk+0.004
estern+0.004
Neg logit contributions
nces-0.004
esome-0.004
 contrad-0.004
anamo-0.004
edom-0.003
Token has
Token ID468
Feature activation-5.17e-01

Pos logit contributions
nces+5.259
esome+4.931
 contrad+4.431
anamo+4.389
liction+4.056
Neg logit contributions
izoph-6.326
atron-6.323
imet-6.167
Desk-5.827
estern-5.725
Token acknowledged
Token ID10810
Feature activation-3.91e-01

Pos logit contributions
atron+0.949
imet+0.927
izoph+0.912
Desk+0.862
estern+0.827
Neg logit contributions
nces-1.101
esome-1.058
 contrad-0.970
anamo-0.938
edom-0.894
Token sending
Token ID7216
Feature activation+6.76e-03

Pos logit contributions
atron+0.798
imet+0.779
izoph+0.763
Desk+0.726
estern+0.698
Neg logit contributions
nces-0.763
esome-0.719
 contrad-0.654
anamo-0.639
edom-0.603
Token at
Token ID379
Feature activation+3.82e-01

Pos logit contributions
nces+0.013
esome+0.013
 contrad+0.011
anamo+0.011
edom+0.010
Neg logit contributions
atron-0.014
imet-0.013
izoph-0.013
Desk-0.012
estern-0.012
Token least
Token ID1551
Feature activation-8.33e-01

Pos logit contributions
nces+0.953
esome+0.920
 contrad+0.856
anamo+0.840
 Flavoring+0.800
Neg logit contributions
atron-0.555
imet-0.525
izoph-0.517
Desk-0.488
estern-0.457
Token.
Token ID13
Feature activation-5.34e-02

Pos logit contributions
atron+0.390
imet+0.381
izoph+0.370
Desk+0.353
estern+0.335
Neg logit contributions
nces-0.461
esome-0.438
 contrad-0.403
anamo-0.390
edom-0.377
Token
Token ID198
Feature activation+6.17e-01

Pos logit contributions
atron+0.091
imet+0.088
izoph+0.085
Desk+0.081
estern+0.076
Neg logit contributions
nces-0.122
esome-0.116
 contrad-0.107
anamo-0.104
 Flavoring-0.101
Token
Token ID198
Feature activation-6.82e-02

Pos logit contributions
nces+1.172
esome+1.167
 contrad+0.971
 Flavoring+0.959
udo+0.947
Neg logit contributions
izoph-1.236
atron-1.226
imet-1.218
Desk-1.104
estern-1.092
TokenThe
Token ID464
Feature activation-1.29e-01

Pos logit contributions
atron+0.129
imet+0.128
izoph+0.126
Desk+0.116
estern+0.114
Neg logit contributions
nces-0.139
esome-0.133
 contrad-0.120
anamo-0.117
edom-0.114
Token Texas
Token ID3936
Feature activation-6.08e-01

Pos logit contributions
atron+0.251
imet+0.246
izoph+0.244
Desk+0.230
estern+0.223
Neg logit contributions
nces-0.258
esome-0.246
 contrad-0.227
anamo-0.217
edom-0.209
Token Organ
Token ID7221
Feature activation-4.34e+00

Pos logit contributions
atron+1.049
imet+1.030
izoph+0.996
Desk+0.938
estern+0.904
Neg logit contributions
nces-1.372
esome-1.303
 contrad-1.204
anamo-1.162
 Flavoring-1.128
Tokenizing
Token ID2890
Feature activation-2.22e+00

Pos logit contributions
atron+5.294
imet+5.205
izoph+4.710
Desk+4.598
estern+4.122
Neg logit contributions
nces-11.659
esome-11.072
 contrad-10.417
anamo-10.378
��-9.825
Token Project
Token ID4935
Feature activation-2.91e-01

Pos logit contributions
atron+4.017
imet+3.938
izoph+3.789
Desk+3.643
estern+3.434
Neg logit contributions
nces-4.793
esome-4.553
 contrad-4.173
anamo-4.073
edom-3.909
Token is
Token ID318
Feature activation-8.07e-02

Pos logit contributions
atron+0.583
imet+0.568
izoph+0.561
Desk+0.531
estern+0.507
Neg logit contributions
nces-0.576
esome-0.543
 contrad-0.496
anamo-0.478
edom-0.458
Token calling
Token ID4585
Feature activation+1.76e-02

Pos logit contributions
atron+0.161
imet+0.157
izoph+0.155
Desk+0.145
estern+0.140
Neg logit contributions
nces-0.160
esome-0.151
 contrad-0.139
anamo-0.133
 Flavoring-0.127
Token on
Token ID319
Feature activation+6.37e-01

Pos logit contributions
nces+0.036
esome+0.034
 contrad+0.031
anamo+0.030
edom+0.029
Neg logit contributions
atron-0.034
imet-0.033
izoph-0.033
Desk-0.031
estern-0.029
Token and
Token ID290
Feature activation+4.79e-01

Pos logit contributions
atron+2.700
imet+2.646
izoph+2.527
Desk+2.428
estern+2.293
Neg logit contributions
nces-3.342
esome-3.172
 contrad-2.920
anamo-2.853
 Flavoring-2.748
Token organizations
Token ID5745
Feature activation-7.74e-01

Pos logit contributions
nces+0.924
esome+0.868
 contrad+0.797
anamo+0.757
edom+0.737
Neg logit contributions
atron-0.980
imet-0.966
izoph-0.946
Desk-0.891
estern-0.860
Token.
Token ID13
Feature activation+1.30e-01

Pos logit contributions
atron+1.410
imet+1.376
izoph+1.333
Desk+1.273
estern+1.205
Neg logit contributions
nces-1.673
esome-1.590
 contrad-1.459
anamo-1.415
edom-1.363
Token
Token ID198
Feature activation+5.82e-01

Pos logit contributions
nces+0.268
esome+0.255
 contrad+0.231
anamo+0.224
 Flavoring+0.219
Neg logit contributions
atron-0.249
imet-0.247
izoph-0.240
Desk-0.225
estern-0.217
Token
Token ID198
Feature activation-7.66e-02

Pos logit contributions
nces+1.104
esome+1.100
 contrad+0.912
 Flavoring+0.898
udo+0.891
Neg logit contributions
izoph-1.164
atron-1.160
imet-1.154
Desk-1.044
estern-1.035
Token×
Token ID12906
Feature activation-4.02e+00

Pos logit contributions
atron+0.144
imet+0.142
izoph+0.138
Desk+0.128
estern+0.125
Neg logit contributions
nces-0.160
esome-0.152
 contrad-0.138
anamo-0.133
edom-0.130
Token Expand
Token ID49368
Feature activation+1.65e-01

Pos logit contributions
atron+7.068
imet+6.698
izoph+6.396
Desk+6.344
estern+5.930
Neg logit contributions
nces-8.370
esome-7.780
 contrad-7.219
anamo-7.168
 Flavoring-7.054
Token Youtube
Token ID27431
Feature activation+3.53e-01

Pos logit contributions
nces+0.317
esome+0.301
 contrad+0.272
anamo+0.262
 Flavoring+0.252
Neg logit contributions
atron-0.337
imet-0.331
izoph-0.324
Desk-0.307
estern-0.294
Token Senator
Token ID8962
Feature activation+1.77e-01

Pos logit contributions
nces+0.721
esome+0.682
 contrad+0.624
anamo+0.598
 Flavoring+0.579
Neg logit contributions
atron-0.675
imet-0.673
izoph-0.646
Desk-0.615
estern-0.587
Token Sheldon
Token ID34536
Feature activation-6.29e-02

Pos logit contributions
nces+0.399
esome+0.378
 contrad+0.347
anamo+0.339
 Flavoring+0.328
Neg logit contributions
atron-0.306
imet-0.300
izoph-0.292
Desk-0.276
estern-0.261
Token White
Token ID2635
Feature activation-4.04e-01

Pos logit contributions
imet+0.153
atron+0.152
izoph+0.148
Desk+0.142
estern+0.138
Neg logit contributions
nces-0.098
esome-0.091
 contrad-0.080
anamo-0.076
 Flavoring-0.074
Token Ph
Token ID1380
Feature activation-1.02e+00

Pos logit contributions
atron+0.160
imet+0.157
izoph+0.154
Desk+0.144
estern+0.138
Neg logit contributions
nces-0.163
esome-0.156
 contrad-0.143
anamo-0.135
edom-0.130
Token.
Token ID13
Feature activation-5.21e-01

Pos logit contributions
atron+1.940
imet+1.889
izoph+1.833
Desk+1.754
estern+1.680
Neg logit contributions
nces-2.132
esome-2.031
 contrad-1.826
anamo-1.770
edom-1.731
TokenD
Token ID35
Feature activation+7.88e-01

Pos logit contributions
atron+0.769
imet+0.750
izoph+0.721
Desk+0.667
estern+0.628
Neg logit contributions
nces-1.295
esome-1.245
 contrad-1.159
anamo-1.126
edom-1.099
Token.
Token ID13
Feature activation-3.04e-01

Pos logit contributions
nces+1.644
esome+1.559
 contrad+1.435
anamo+1.365
edom+1.330
Neg logit contributions
atron-1.482
imet-1.429
izoph-1.402
Desk-1.341
estern-1.262
Token in
Token ID287
Feature activation+2.83e-01

Pos logit contributions
atron+0.565
imet+0.550
izoph+0.540
Desk+0.506
estern+0.480
Neg logit contributions
nces-0.642
esome-0.613
 contrad-0.564
anamo-0.543
edom-0.525
Token Organ
Token ID7221
Feature activation-3.85e+00

Pos logit contributions
nces+0.570
esome+0.541
 contrad+0.504
edom+0.464
anamo+0.463
Neg logit contributions
atron-0.551
imet-0.542
izoph-0.526
Desk-0.493
estern-0.479
Tokenizational
Token ID22684
Feature activation-1.62e+00

Pos logit contributions
atron+3.611
imet+3.511
izoph+3.161
Desk+2.952
estern+2.537
Neg logit contributions
nces-11.535
esome-11.014
 contrad-10.492
anamo-10.338
 Flavoring-9.962
Token Behavior
Token ID20181
Feature activation-6.09e-01

Pos logit contributions
atron+3.369
imet+3.300
izoph+3.204
Desk+3.072
estern+2.943
Neg logit contributions
nces-3.063
esome-2.896
 contrad-2.619
anamo-2.520
 Flavoring-2.424
Token/
Token ID14
Feature activation-3.34e-01

Pos logit contributions
atron+1.166
imet+1.136
izoph+1.110
Desk+1.053
estern+1.004
Neg logit contributions
nces-1.260
esome-1.192
 contrad-1.093
anamo-1.048
edom-1.019
TokenHuman
Token ID20490
Feature activation-3.25e-01

Pos logit contributions
atron+0.559
imet+0.553
izoph+0.538
estern+0.480
Desk+0.480
Neg logit contributions
nces-0.760
esome-0.729
 contrad-0.668
anamo-0.634
edom-0.633
Token Resource
Token ID20857
Feature activation+1.72e-01

Pos logit contributions
atron+0.543
imet+0.533
izoph+0.509
Desk+0.481
estern+0.465
Neg logit contributions
nces-0.746
esome-0.721
 contrad-0.663
anamo-0.626
edom-0.621
Token return
Token ID1441
Feature activation+8.90e-01

Pos logit contributions
nces+0.321
esome+0.303
 contrad+0.280
edom+0.255
anamo+0.254
Neg logit contributions
atron-0.336
imet-0.329
izoph-0.325
Desk-0.312
estern-0.299
Token to
Token ID284
Feature activation-1.08e-01

Pos logit contributions
nces+1.629
esome+1.529
 contrad+1.379
anamo+1.302
edom+1.288
Neg logit contributions
atron-1.886
imet-1.847
izoph-1.842
Desk-1.747
estern-1.690
Token a
Token ID257
Feature activation+2.49e-01

Pos logit contributions
atron+0.217
imet+0.213
izoph+0.207
Desk+0.199
estern+0.191
Neg logit contributions
nces-0.213
esome-0.202
 contrad-0.184
anamo-0.175
edom-0.171
Token more
Token ID517
Feature activation-3.83e-01

Pos logit contributions
nces+0.480
esome+0.458
 contrad+0.416
anamo+0.392
edom+0.383
Neg logit contributions
atron-0.506
imet-0.492
izoph-0.484
Desk-0.465
estern-0.444
Token improved
Token ID6596
Feature activation+5.38e-01

Pos logit contributions
atron+0.789
imet+0.776
izoph+0.758
Desk+0.730
estern+0.696
Neg logit contributions
nces-0.735
esome-0.695
 contrad-0.640
anamo-0.596
edom-0.583
Token MLS
Token ID13981
Feature activation-3.85e+00

Pos logit contributions
nces+1.061
esome+1.011
 contrad+0.917
anamo+0.867
 Flavoring+0.854
Neg logit contributions
atron-1.063
imet-1.035
izoph-1.014
Desk-0.973
estern-0.934
Token instead
Token ID2427
Feature activation-1.59e-01

Pos logit contributions
atron+7.159
imet+7.039
izoph+6.746
Desk+6.506
estern+6.104
Neg logit contributions
nces-7.837
esome-7.389
 contrad-6.777
anamo-6.742
edom-6.311
Token of
Token ID286
Feature activation-2.42e-01

Pos logit contributions
atron+0.297
imet+0.290
izoph+0.285
Desk+0.271
estern+0.258
Neg logit contributions
nces-0.335
esome-0.318
 contrad-0.290
anamo-0.280
 Flavoring-0.273
Token second
Token ID1218
Feature activation+1.40e-01

Pos logit contributions
atron+0.480
imet+0.472
izoph+0.462
Desk+0.440
estern+0.424
Neg logit contributions
nces-0.480
esome-0.454
 contrad-0.415
anamo-0.400
 Flavoring-0.390
Token division
Token ID7297
Feature activation-4.40e-01

Pos logit contributions
nces+0.272
esome+0.257
 contrad+0.235
anamo+0.225
 Flavoring+0.216
Neg logit contributions
atron-0.284
imet-0.278
izoph-0.271
Desk-0.260
estern-0.249
Token soccer
Token ID11783
Feature activation-2.61e+00

Pos logit contributions
atron+0.837
imet+0.818
izoph+0.795
Desk+0.762
estern+0.727
Neg logit contributions
nces-0.916
esome-0.871
 contrad-0.794
anamo-0.764
edom-0.742
Token it
Token ID340
Feature activation-1.60e-01

Pos logit contributions
nces+0.310
esome+0.294
 contrad+0.264
anamo+0.253
edom+0.243
Neg logit contributions
atron-0.370
imet-0.364
izoph-0.355
Desk-0.338
estern-0.327
Token out
Token ID503
Feature activation-1.10e-01

Pos logit contributions
atron+0.367
imet+0.362
izoph+0.355
Desk+0.341
estern+0.328
Neg logit contributions
nces-0.267
esome-0.251
 contrad-0.223
anamo-0.212
edom-0.203
Token.
Token ID13
Feature activation-1.47e-01

Pos logit contributions
atron+0.206
imet+0.201
izoph+0.198
Desk+0.188
estern+0.179
Neg logit contributions
nces-0.233
esome-0.220
 contrad-0.199
anamo-0.193
edom-0.187
Token
Token ID198
Feature activation+4.66e-01

Pos logit contributions
atron+0.250
imet+0.244
izoph+0.235
Desk+0.224
estern+0.211
Neg logit contributions
nces-0.333
esome-0.317
 contrad-0.293
anamo-0.285
 Flavoring-0.275
Token
Token ID198
Feature activation-2.81e-01

Pos logit contributions
esome+0.880
nces+0.879
 contrad+0.727
 Flavoring+0.716
udo+0.710
Neg logit contributions
izoph-0.941
atron-0.930
imet-0.923
Desk-0.837
estern-0.835
Token×
Token ID12906
Feature activation-3.66e+00

Pos logit contributions
atron+0.535
imet+0.523
izoph+0.508
Desk+0.484
estern+0.462
Neg logit contributions
nces-0.583
esome-0.555
 contrad-0.504
anamo-0.490
 Flavoring-0.471
Token Expand
Token ID49368
Feature activation+5.14e-01

Pos logit contributions
atron+6.143
imet+6.023
izoph+5.636
Desk+5.524
estern+5.097
Neg logit contributions
nces-8.132
esome-7.547
 contrad-7.050
anamo-6.955
 Flavoring-6.742
Token Jonathan
Token ID11232
Feature activation+9.55e-01

Pos logit contributions
nces+1.035
esome+0.995
 contrad+0.899
anamo+0.868
 Flavoring+0.835
Neg logit contributions
atron-0.996
imet-0.970
izoph-0.953
Desk-0.905
estern-0.868
Token Gold
Token ID3561
Feature activation+4.21e-01

Pos logit contributions
nces+1.824
esome+1.746
 contrad+1.561
anamo+1.476
 Flavoring+1.436
Neg logit contributions
atron-1.942
imet-1.915
izoph-1.879
Desk-1.791
estern-1.718
Tokens
Token ID82
Feature activation-1.29e+00

Pos logit contributions
nces+0.925
esome+0.894
 contrad+0.806
anamo+0.789
edom+0.763
Neg logit contributions
atron-0.735
imet-0.712
izoph-0.708
Desk-0.664
estern-0.625
Tokenbie
Token ID12590
Feature activation-6.67e-01

Pos logit contributions
atron+2.263
imet+2.200
izoph+2.144
Desk+2.039
estern+1.929
Neg logit contributions
nces-2.871
esome-2.743
 contrad-2.504
anamo-2.451
edom-2.367
Token not
Token ID407
Feature activation-4.06e-01

Pos logit contributions
nces+0.137
esome+0.130
 contrad+0.120
anamo+0.113
edom+0.109
Neg logit contributions
atron-0.145
imet-0.142
izoph-0.141
Desk-0.133
estern-0.128
Token protecting
Token ID10192
Feature activation+4.01e-01

Pos logit contributions
atron+0.798
imet+0.780
izoph+0.768
Desk+0.724
estern+0.705
Neg logit contributions
nces-0.814
esome-0.773
 contrad-0.712
anamo-0.678
edom-0.656
Token the
Token ID262
Feature activation-3.77e-01

Pos logit contributions
nces+0.768
esome+0.719
 contrad+0.660
anamo+0.637
edom+0.610
Neg logit contributions
atron-0.822
imet-0.814
izoph-0.793
Desk-0.758
estern-0.733
Token rights
Token ID2489
Feature activation-1.00e+00

Pos logit contributions
atron+0.718
imet+0.705
izoph+0.683
Desk+0.656
estern+0.627
Neg logit contributions
nces-0.778
esome-0.738
 contrad-0.677
anamo-0.657
edom-0.633
Token of
Token ID286
Feature activation-7.38e-01

Pos logit contributions
atron+1.729
imet+1.702
izoph+1.610
Desk+1.546
estern+1.455
Neg logit contributions
nces-2.239
esome-2.115
 contrad-1.940
anamo-1.898
 Flavoring-1.832
Token Occupy
Token ID25551
Feature activation-3.58e+00

Pos logit contributions
atron+1.460
imet+1.433
izoph+1.388
Desk+1.331
estern+1.273
Neg logit contributions
nces-1.479
esome-1.399
 contrad-1.277
anamo-1.235
edom-1.183
Token demonstrators
Token ID25016
Feature activation-1.17e+00

Pos logit contributions
atron+6.315
imet+6.221
izoph+5.882
Desk+5.813
istani+5.340
Neg logit contributions
nces-7.601
esome-7.193
 contrad-6.598
anamo-6.417
 Flavoring-6.157
Token whose
Token ID3025
Feature activation-8.48e-01

Pos logit contributions
atron+2.175
imet+2.124
izoph+2.068
Desk+1.969
estern+1.875
Neg logit contributions
nces-2.479
esome-2.354
 contrad-2.158
anamo-2.093
edom-2.016
Token protests
Token ID8536
Feature activation-2.63e-01

Pos logit contributions
atron+1.734
imet+1.699
izoph+1.659
Desk+1.591
estern+1.534
Neg logit contributions
nces-1.647
esome-1.553
 contrad-1.417
anamo-1.357
edom-1.312
Token are
Token ID389
Feature activation-8.82e-04

Pos logit contributions
atron+0.540
imet+0.528
izoph+0.519
Desk+0.497
estern+0.476
Neg logit contributions
nces-0.509
esome-0.478
 contrad-0.435
anamo-0.412
edom-0.400
Token being
Token ID852
Feature activation+4.42e-01

Pos logit contributions
atron+0.002
imet+0.002
izoph+0.002
Desk+0.002
estern+0.001
Neg logit contributions
nces-0.002
esome-0.002
 contrad-0.002
anamo-0.002
edom-0.001
Token according
Token ID1864
Feature activation+8.23e-02

Pos logit contributions
nces+0.150
esome+0.143
 contrad+0.131
anamo+0.124
edom+0.120
Neg logit contributions
atron-0.151
imet-0.147
izoph-0.144
Desk-0.138
estern-0.134
Token to
Token ID284
Feature activation+1.52e-01

Pos logit contributions
nces+0.147
esome+0.142
 contrad+0.125
anamo+0.118
 Flavoring+0.114
Neg logit contributions
atron-0.178
imet-0.176
izoph-0.175
Desk-0.162
estern-0.160
Token the
Token ID262
Feature activation+3.56e-01

Pos logit contributions
nces+0.302
esome+0.287
 contrad+0.265
anamo+0.251
edom+0.244
Neg logit contributions
atron-0.301
imet-0.293
izoph-0.285
Desk-0.276
estern-0.265
Token artist
Token ID6802
Feature activation-3.28e-01

Pos logit contributions
nces+0.732
esome+0.704
 contrad+0.656
anamo+0.615
edom+0.602
Neg logit contributions
atron-0.666
imet-0.643
izoph-0.631
Desk-0.611
estern-0.590
Token.
Token ID13
Feature activation-1.77e-02

Pos logit contributions
atron+0.628
imet+0.614
izoph+0.596
Desk+0.570
estern+0.543
Neg logit contributions
nces-0.676
esome-0.642
 contrad-0.586
anamo-0.568
edom-0.545
Token Organ
Token ID7221
Feature activation-3.53e+00

Pos logit contributions
atron+0.033
imet+0.033
izoph+0.032
Desk+0.030
estern+0.030
Neg logit contributions
nces-0.036
esome-0.035
 contrad-0.032
anamo-0.030
 Flavoring-0.030
Tokenizers
Token ID11341
Feature activation-6.17e-01

Pos logit contributions
atron+4.557
imet+4.507
izoph+4.115
Desk+3.930
estern+3.528
Neg logit contributions
nces-9.340
esome-8.886
 contrad-8.269
anamo-8.196
edom-7.852
Token hope
Token ID2911
Feature activation-9.70e-01

Pos logit contributions
atron+1.205
imet+1.168
izoph+1.158
Desk+1.097
estern+1.055
Neg logit contributions
nces-1.254
esome-1.190
 contrad-1.093
anamo-1.041
edom-1.007
Token the
Token ID262
Feature activation-1.93e-01

Pos logit contributions
atron+1.870
imet+1.830
izoph+1.768
Desk+1.695
estern+1.601
Neg logit contributions
nces-1.988
esome-1.877
 contrad-1.704
anamo-1.670
 Flavoring-1.608
Token work
Token ID670
Feature activation-1.01e+00

Pos logit contributions
atron+0.359
imet+0.349
izoph+0.341
Desk+0.329
estern+0.318
Neg logit contributions
nces-0.404
esome-0.388
 contrad-0.361
anamo-0.339
edom-0.329
Token,
Token ID11
Feature activation-1.12e-01

Pos logit contributions
atron+2.204
imet+2.162
izoph+2.104
Desk+2.025
estern+1.940
Neg logit contributions
nces-1.806
esome-1.699
 contrad-1.527
anamo-1.469
edom-1.402
Token%
Token ID4
Feature activation-2.09e+00

Pos logit contributions
atron+1.065
imet+1.042
izoph+1.006
Desk+0.964
estern+0.914
Neg logit contributions
nces-1.201
esome-1.140
 contrad-1.043
anamo-1.013
 Flavoring-0.974
Token of
Token ID286
Feature activation+4.86e-01

Pos logit contributions
atron+3.460
imet+3.459
izoph+3.213
Desk+3.087
estern+2.908
Neg logit contributions
nces-4.690
esome-4.506
 contrad-4.100
anamo-4.010
edom-3.913
Token respondents
Token ID14502
Feature activation-4.64e-01

Pos logit contributions
nces+0.954
esome+0.892
 contrad+0.827
anamo+0.773
edom+0.755
Neg logit contributions
atron-0.982
imet-0.962
izoph-0.932
Desk-0.898
estern-0.863
Token to
Token ID284
Feature activation+3.88e-01

Pos logit contributions
atron+0.873
imet+0.846
izoph+0.832
Desk+0.789
estern+0.747
Neg logit contributions
nces-0.976
esome-0.924
 contrad-0.857
anamo-0.824
edom-0.793
Token the
Token ID262
Feature activation+1.31e-01

Pos logit contributions
nces+0.762
esome+0.705
 contrad+0.646
anamo+0.620
edom+0.602
Neg logit contributions
atron-0.800
imet-0.775
izoph-0.754
Desk-0.717
estern-0.690
Token Lok
Token ID36861
Feature activation-3.41e+00

Pos logit contributions
nces+0.263
esome+0.248
 contrad+0.229
anamo+0.217
edom+0.210
Neg logit contributions
atron-0.260
imet-0.252
izoph-0.244
Desk-0.235
estern-0.228
Token Survey
Token ID13084
Feature activation-2.01e-01

Pos logit contributions
atron+4.395
imet+4.254
izoph+3.976
Desk+3.825
istani+3.473
Neg logit contributions
nces-9.037
esome-8.666
 contrad-8.139
anamo-7.915
 Flavoring-7.739
Token,
Token ID11
Feature activation+1.89e-01

Pos logit contributions
atron+0.379
imet+0.368
izoph+0.360
Desk+0.341
estern+0.328
Neg logit contributions
nces-0.420
esome-0.399
 contrad-0.369
anamo-0.355
edom-0.342
Token with
Token ID351
Feature activation-2.14e-02

Pos logit contributions
nces+0.378
esome+0.354
 contrad+0.329
anamo+0.314
 Flavoring+0.301
Neg logit contributions
atron-0.376
imet-0.367
izoph-0.358
Desk-0.341
estern-0.326
Token a
Token ID257
Feature activation+3.70e-02

Pos logit contributions
atron+0.041
imet+0.040
izoph+0.039
Desk+0.037
estern+0.036
Neg logit contributions
nces-0.045
esome-0.042
 contrad-0.039
anamo-0.037
edom-0.036
Token sample
Token ID6291
Feature activation+4.94e-01

Pos logit contributions
nces+0.076
esome+0.072
 contrad+0.066
anamo+0.063
edom+0.061
Neg logit contributions
atron-0.072
imet-0.069
izoph-0.068
Desk-0.065
estern-0.062
Token sure
Token ID1654
Feature activation-1.30e-01

Pos logit contributions
atron+0.024
imet+0.024
izoph+0.023
Desk+0.022
estern+0.021
Neg logit contributions
nces-0.027
esome-0.026
 contrad-0.024
anamo-0.023
edom-0.022
Token how
Token ID703
Feature activation-2.81e-01

Pos logit contributions
atron+0.258
imet+0.255
izoph+0.246
Desk+0.235
estern+0.229
Neg logit contributions
nces-0.258
esome-0.245
 contrad-0.223
anamo-0.214
 Flavoring-0.208
Token they
Token ID484
Feature activation-6.34e-01

Pos logit contributions
atron+0.533
imet+0.526
izoph+0.507
Desk+0.483
estern+0.463
Neg logit contributions
nces-0.585
esome-0.556
 contrad-0.514
anamo-0.490
edom-0.475
Token would
Token ID561
Feature activation+4.69e-01

Pos logit contributions
atron+1.249
imet+1.222
izoph+1.194
Desk+1.143
estern+1.087
Neg logit contributions
nces-1.267
esome-1.200
 contrad-1.096
anamo-1.052
edom-1.021
Token have
Token ID423
Feature activation-4.84e-01

Pos logit contributions
nces+0.874
esome+0.819
 contrad+0.767
edom+0.698
 Flavoring+0.694
Neg logit contributions
atron-0.967
imet-0.951
izoph-0.940
Desk-0.900
estern-0.865
Token voted
Token ID7052
Feature activation-3.40e+00

Pos logit contributions
atron+0.893
imet+0.870
izoph+0.854
Desk+0.816
estern+0.770
Neg logit contributions
nces-1.032
esome-0.985
 contrad-0.917
anamo-0.858
edom-0.853
Token had
Token ID550
Feature activation-4.89e-01

Pos logit contributions
atron+6.558
imet+6.435
izoph+6.115
Desk+5.959
estern+5.612
Neg logit contributions
nces-6.765
esome-6.373
 contrad-5.759
anamo-5.695
 Flavoring-5.450
Token they
Token ID484
Feature activation-1.06e-01

Pos logit contributions
atron+0.971
imet+0.953
izoph+0.924
Desk+0.883
estern+0.850
Neg logit contributions
nces-0.977
esome-0.921
 contrad-0.842
anamo-0.812
edom-0.786
Token known
Token ID1900
Feature activation-9.42e-01

Pos logit contributions
atron+0.197
izoph+0.192
imet+0.190
Desk+0.180
estern+0.170
Neg logit contributions
nces-0.223
esome-0.212
 contrad-0.196
anamo-0.187
edom-0.184
Token of
Token ID286
Feature activation-4.30e-01

Pos logit contributions
atron+1.844
imet+1.813
izoph+1.754
Desk+1.679
estern+1.615
Neg logit contributions
nces-1.908
esome-1.807
 contrad-1.646
anamo-1.593
 Flavoring-1.540
Token Good
Token ID4599
Feature activation+2.46e-01

Pos logit contributions
atron+0.847
imet+0.832
izoph+0.807
Desk+0.772
estern+0.740
Neg logit contributions
nces-0.865
esome-0.817
 contrad-0.747
anamo-0.721
 Flavoring-0.697
Token set
Token ID900
Feature activation+1.01e+00

Pos logit contributions
nces+0.427
esome+0.408
 contrad+0.371
anamo+0.352
edom+0.341
Neg logit contributions
atron-0.421
imet-0.417
izoph-0.408
Desk-0.391
estern-0.373
Token to
Token ID284
Feature activation+7.40e-01

Pos logit contributions
nces+2.007
esome+1.909
 contrad+1.758
anamo+1.682
edom+1.622
Neg logit contributions
atron-1.976
imet-1.948
izoph-1.929
Desk-1.828
estern-1.713
Token take
Token ID1011
Feature activation+1.72e+00

Pos logit contributions
nces+1.439
esome+1.377
 contrad+1.260
anamo+1.180
edom+1.152
Neg logit contributions
atron-1.460
imet-1.440
izoph-1.412
Desk-1.361
estern-1.291
Token on
Token ID319
Feature activation-8.39e-03

Pos logit contributions
nces+3.401
esome+3.230
 contrad+2.972
anamo+2.790
udo+2.679
Neg logit contributions
atron-3.354
imet-3.299
izoph-3.211
Desk-3.047
estern-2.937
Token the
Token ID262
Feature activation+3.20e-02

Pos logit contributions
atron+0.017
imet+0.017
izoph+0.016
Desk+0.015
estern+0.015
Neg logit contributions
nces-0.016
esome-0.016
 contrad-0.014
anamo-0.014
edom-0.013
Token Occupy
Token ID25551
Feature activation-3.36e+00

Pos logit contributions
nces+0.063
esome+0.059
 contrad+0.055
anamo+0.052
edom+0.050
Neg logit contributions
atron-0.064
imet-0.063
izoph-0.062
Desk-0.059
estern-0.056
Token Wall
Token ID5007
Feature activation-1.12e+00

Pos logit contributions
atron+6.276
imet+6.122
izoph+5.870
Desk+5.746
estern+5.368
Neg logit contributions
nces-6.945
esome-6.571
 contrad-5.930
anamo-5.824
 Flavoring-5.571
Token Street
Token ID3530
Feature activation-1.72e-01

Pos logit contributions
atron+2.297
imet+2.245
izoph+2.188
Desk+2.092
estern+2.001
Neg logit contributions
nces-2.148
esome-2.037
 contrad-1.835
anamo-1.777
edom-1.703
Token protests
Token ID8536
Feature activation-8.40e-01

Pos logit contributions
atron+0.333
imet+0.324
izoph+0.315
Desk+0.302
estern+0.290
Neg logit contributions
nces-0.348
esome-0.330
 contrad-0.304
anamo-0.289
edom-0.280
Token in
Token ID287
Feature activation+2.24e-01

Pos logit contributions
atron+1.611
imet+1.575
izoph+1.538
Desk+1.473
estern+1.399
Neg logit contributions
nces-1.728
esome-1.645
 contrad-1.498
anamo-1.441
edom-1.398
Token Wednesday
Token ID3583
Feature activation+3.44e-01

Pos logit contributions
nces+0.430
esome+0.404
 contrad+0.380
anamo+0.351
edom+0.347
Neg logit contributions
atron-0.457
imet-0.448
izoph-0.439
Desk-0.422
estern-0.402
Token,
Token ID11
Feature activation+6.20e-02

Pos logit contributions
atron+0.570
imet+0.557
izoph+0.546
Desk+0.518
estern+0.495
Neg logit contributions
nces-0.587
esome-0.556
 contrad-0.504
anamo-0.488
edom-0.469
Token our
Token ID674
Feature activation-3.51e-01

Pos logit contributions
nces+0.116
esome+0.110
 contrad+0.100
anamo+0.096
 Flavoring+0.092
Neg logit contributions
atron-0.130
imet-0.128
izoph-0.124
Desk-0.118
estern-0.114
Token copy
Token ID4866
Feature activation+1.47e-01

Pos logit contributions
atron+0.677
imet+0.669
izoph+0.645
Desk+0.616
estern+0.589
Neg logit contributions
nces-0.716
esome-0.684
 contrad-0.623
anamo-0.596
 Flavoring-0.583
Token of
Token ID286
Feature activation+4.20e-01

Pos logit contributions
nces+0.293
esome+0.278
 contrad+0.253
anamo+0.243
edom+0.238
Neg logit contributions
atron-0.289
imet-0.284
izoph-0.280
Desk-0.261
estern-0.252
Token the
Token ID262
Feature activation+4.03e-01

Pos logit contributions
nces+0.852
esome+0.811
 contrad+0.740
anamo+0.709
 Flavoring+0.698
Neg logit contributions
atron-0.812
imet-0.800
izoph-0.777
Desk-0.736
estern-0.701
Token StarCraft
Token ID38460
Feature activation-3.36e+00

Pos logit contributions
nces+0.852
esome+0.818
 contrad+0.751
anamo+0.716
 Flavoring+0.707
Neg logit contributions
atron-0.734
imet-0.731
izoph-0.703
Desk-0.669
estern-0.637
Token 2
Token ID362
Feature activation+3.12e-02

Pos logit contributions
atron+6.178
imet+5.926
izoph+5.747
Desk+5.581
estern+5.263
Neg logit contributions
nces-7.056
esome-6.642
 contrad-6.148
anamo-6.049
edom-5.726
Token Collector
Token ID17573
Feature activation+1.12e+00

Pos logit contributions
nces+0.066
esome+0.063
 contrad+0.058
anamo+0.055
 Flavoring+0.054
Neg logit contributions
atron-0.057
imet-0.057
izoph-0.055
Desk-0.052
estern-0.049
Token's
Token ID338
Feature activation-1.48e-01

Pos logit contributions
nces+1.737
esome+1.630
 contrad+1.436
anamo+1.318
 Flavoring+1.302
Neg logit contributions
imet-2.664
atron-2.651
izoph-2.585
Desk-2.452
estern-2.430
Token Edition
Token ID5061
Feature activation+1.49e+00

Pos logit contributions
atron+0.325
imet+0.322
izoph+0.310
Desk+0.297
estern+0.290
Neg logit contributions
nces-0.261
esome-0.248
 contrad-0.225
anamo-0.210
 Flavoring-0.208
Token has
Token ID468
Feature activation-6.82e-01

Pos logit contributions
nces+2.676
esome+2.556
 contrad+2.317
 Flavoring+2.166
anamo+2.128
Neg logit contributions
imet-3.094
atron-3.080
izoph-3.035
Desk-2.804
estern-2.756
Token the
Token ID262
Feature activation+8.27e-02

Pos logit contributions
nces+0.295
esome+0.279
 contrad+0.254
anamo+0.242
 Flavoring+0.235
Neg logit contributions
atron-0.310
imet-0.303
izoph-0.297
Desk-0.283
estern-0.269
Token comments
Token ID3651
Feature activation-6.27e-01

Pos logit contributions
nces+0.121
esome+0.115
 contrad+0.100
anamo+0.093
edom+0.089
Neg logit contributions
atron-0.206
imet-0.202
izoph-0.200
Desk-0.191
estern-0.186
Token.
Token ID13
Feature activation-2.05e-01

Pos logit contributions
atron+1.186
imet+1.157
izoph+1.132
Desk+1.071
estern+1.023
Neg logit contributions
nces-1.313
esome-1.253
 contrad-1.142
anamo-1.098
edom-1.062
Token<|endoftext|>
Token ID50256
Feature activation-9.01e-01

Pos logit contributions
atron+0.363
imet+0.355
izoph+0.344
Desk+0.325
estern+0.310
Neg logit contributions
nces-0.452
esome-0.432
 contrad-0.395
anamo-0.384
 Flavoring-0.372
TokenRed
Token ID7738
Feature activation-7.38e-02

Pos logit contributions
atron+1.655
imet+1.622
izoph+1.574
Desk+1.485
estern+1.420
Neg logit contributions
nces-1.929
esome-1.838
 contrad-1.671
anamo-1.625
edom-1.570
Token Transit
Token ID22325
Feature activation-3.33e+00

Pos logit contributions
atron+0.146
imet+0.143
izoph+0.140
Desk+0.132
estern+0.126
Neg logit contributions
nces-0.146
esome-0.140
 contrad-0.127
anamo-0.122
edom-0.118
Token L
Token ID406
Feature activation-2.49e-01

Pos logit contributions
atron+6.402
imet+6.227
izoph+5.878
Desk+5.826
estern+5.447
Neg logit contributions
nces-6.707
esome-6.334
 contrad-5.775
anamo-5.626
 Flavoring-5.388
Tokenanes
Token ID7305
Feature activation-4.69e-01

Pos logit contributions
atron+0.463
imet+0.455
izoph+0.445
Desk+0.421
estern+0.400
Neg logit contributions
nces-0.525
esome-0.498
 contrad-0.454
anamo-0.442
edom-0.426
Token on
Token ID319
Feature activation-9.30e-02

Pos logit contributions
atron+0.887
imet+0.870
izoph+0.852
Desk+0.804
estern+0.770
Neg logit contributions
nces-0.981
esome-0.931
 contrad-0.851
anamo-0.819
 Flavoring-0.792
Token Church
Token ID4564
Feature activation-4.91e-01

Pos logit contributions
atron+0.184
imet+0.181
izoph+0.178
Desk+0.169
estern+0.160
Neg logit contributions
nces-0.185
esome-0.175
 contrad-0.160
anamo-0.154
 Flavoring-0.148
Token Have
Token ID8192
Feature activation-1.05e+00

Pos logit contributions
atron+1.030
imet+1.009
izoph+0.981
Desk+0.940
estern+0.898
Neg logit contributions
nces-0.927
esome-0.872
 contrad-0.790
anamo-0.762
edom-0.731
Token current
Token ID1459
Feature activation-9.62e-01

Pos logit contributions
atron+0.058
imet+0.057
izoph+0.055
Desk+0.054
estern+0.051
Neg logit contributions
nces-0.055
esome-0.052
 contrad-0.048
anamo-0.046
edom-0.044
Token climate
Token ID4258
Feature activation-5.52e-01

Pos logit contributions
atron+1.874
imet+1.836
izoph+1.774
Desk+1.707
estern+1.617
Neg logit contributions
nces-1.960
esome-1.856
 contrad-1.706
anamo-1.640
edom-1.576
Token of
Token ID286
Feature activation-1.05e-02

Pos logit contributions
atron+1.042
imet+1.017
izoph+0.986
Desk+0.944
estern+0.896
Neg logit contributions
nces-1.158
esome-1.097
 contrad-1.007
anamo-0.978
edom-0.935
Token misinformation
Token ID32805
Feature activation+2.24e-01

Pos logit contributions
atron+0.022
imet+0.021
izoph+0.021
Desk+0.020
estern+0.019
Neg logit contributions
nces-0.020
esome-0.019
 contrad-0.018
anamo-0.017
edom-0.016
Token around
Token ID1088
Feature activation-1.27e+00

Pos logit contributions
nces+0.441
esome+0.417
 contrad+0.382
anamo+0.370
edom+0.353
Neg logit contributions
atron-0.446
imet-0.436
izoph-0.424
Desk-0.408
estern-0.389
Token trans
Token ID1007
Feature activation-3.33e+00

Pos logit contributions
atron+2.418
imet+2.365
izoph+2.287
Desk+2.193
estern+2.082
Neg logit contributions
nces-2.643
esome-2.506
 contrad-2.293
anamo-2.225
edom-2.134
Token identity
Token ID5369
Feature activation-9.27e-01

Pos logit contributions
atron+6.392
imet+6.287
izoph+6.048
Desk+5.764
istani+5.446
Neg logit contributions
nces-6.583
esome-6.240
 contrad-5.706
anamo-5.495
 Flavoring-5.428
Token.)
Token ID2014
Feature activation+1.41e-01

Pos logit contributions
atron+1.734
imet+1.695
izoph+1.638
Desk+1.569
estern+1.488
Neg logit contributions
nces-1.956
esome-1.857
 contrad-1.699
anamo-1.650
 Flavoring-1.585
Token\n
Token ID198
Feature activation+5.08e-01

Pos logit contributions
nces+0.307
esome+0.294
 contrad+0.267
anamo+0.261
 Flavoring+0.252
Neg logit contributions
atron-0.253
imet-0.250
izoph-0.243
Desk-0.227
estern-0.218
Token\n
Token ID198
Feature activation-9.74e-02

Pos logit contributions
esome+0.960
nces+0.960
 contrad+0.795
 Flavoring+0.782
udo+0.775
Neg logit contributions
izoph-1.019
atron-1.016
imet-1.011
Desk-0.911
estern-0.908
TokenSo
Token ID2396
Feature activation-4.57e-01

Pos logit contributions
atron+0.183
imet+0.182
izoph+0.178
estern+0.161
Desk+0.161
Neg logit contributions
nces-0.200
esome-0.193
 contrad-0.172
anamo-0.168
edom-0.164
Token how
Token ID703
Feature activation-1.67e-01

Pos logit contributions
nces+0.362
esome+0.343
 contrad+0.318
anamo+0.300
edom+0.293
Neg logit contributions
atron-0.358
imet-0.351
izoph-0.343
Desk-0.327
estern-0.318
Token we
Token ID356
Feature activation-3.57e-01

Pos logit contributions
atron+0.379
imet+0.372
izoph+0.365
Desk+0.350
estern+0.336
Neg logit contributions
nces-0.287
esome-0.269
 contrad-0.243
anamo-0.231
edom-0.221
Token think
Token ID892
Feature activation-4.66e-01

Pos logit contributions
atron+0.685
imet+0.671
izoph+0.652
Desk+0.629
estern+0.602
Neg logit contributions
nces-0.732
esome-0.694
 contrad-0.648
anamo-0.597
liction-0.591
Token about
Token ID546
Feature activation-1.63e+00

Pos logit contributions
atron+0.923
imet+0.905
izoph+0.877
Desk+0.841
estern+0.809
Neg logit contributions
nces-0.926
esome-0.882
 contrad-0.805
anamo-0.774
edom-0.745
Token a
Token ID257
Feature activation+1.26e-01

Pos logit contributions
atron+3.166
imet+3.101
izoph+3.001
Desk+2.878
estern+2.737
Neg logit contributions
nces-3.339
esome-3.166
 contrad-2.892
anamo-2.800
edom-2.686
Token trans
Token ID1007
Feature activation-3.31e+00

Pos logit contributions
nces+0.246
esome+0.238
 contrad+0.220
anamo+0.205
edom+0.200
Neg logit contributions
atron-0.248
imet-0.245
izoph-0.239
Desk-0.229
estern-0.219
Tokennational
Token ID14648
Feature activation-1.07e-01

Pos logit contributions
atron+4.372
imet+4.271
izoph+4.001
Desk+3.778
istani+3.444
Neg logit contributions
nces-8.643
esome-8.234
 contrad-7.709
anamo-7.522
 Flavoring-7.403
Token network
Token ID3127
Feature activation-7.03e-01

Pos logit contributions
atron+0.213
imet+0.211
izoph+0.207
Desk+0.198
estern+0.188
Neg logit contributions
nces-0.208
esome-0.202
 contrad-0.184
anamo-0.176
edom-0.170
Token of
Token ID286
Feature activation+1.09e+00

Pos logit contributions
atron+1.353
imet+1.321
izoph+1.285
Desk+1.228
estern+1.174
Neg logit contributions
nces-1.438
esome-1.375
 contrad-1.258
anamo-1.210
edom-1.169
Token operatives
Token ID27269
Feature activation-8.94e-01

Pos logit contributions
nces+1.968
esome+1.873
 contrad+1.727
anamo+1.611
edom+1.565
Neg logit contributions
atron-2.311
imet-2.300
izoph-2.226
Desk-2.156
estern-2.066
Token who
Token ID508
Feature activation+2.49e-01

Pos logit contributions
atron+1.733
imet+1.696
izoph+1.650
Desk+1.574
estern+1.504
Neg logit contributions
nces-1.825
esome-1.735
 contrad-1.587
anamo-1.530
edom-1.474
Token\n
Token ID198
Feature activation+3.56e-01

Pos logit contributions
atron+0.114
imet+0.113
izoph+0.109
Desk+0.103
estern+0.099
Neg logit contributions
nces-0.131
esome-0.125
 contrad-0.113
anamo-0.110
 Flavoring-0.107
Token\n
Token ID198
Feature activation-3.82e-01

Pos logit contributions
esome+0.697
nces+0.697
 contrad+0.588
 Flavoring+0.587
anamo+0.579
Neg logit contributions
atron-0.688
imet-0.683
izoph-0.679
estern-0.606
Desk-0.602
TokenView
Token ID7680
Feature activation-2.61e-01

Pos logit contributions
atron+0.706
imet+0.694
izoph+0.667
Desk+0.634
estern+0.612
Neg logit contributions
nces-0.808
esome-0.776
 contrad-0.700
anamo-0.680
 Flavoring-0.660
Token more
Token ID517
Feature activation-8.33e-01

Pos logit contributions
atron+0.527
imet+0.512
izoph+0.498
Desk+0.477
estern+0.449
Neg logit contributions
nces-0.513
esome-0.485
 contrad-0.445
anamo-0.431
 Flavoring-0.410
Token<|endoftext|>
Token ID50256
Feature activation-8.16e-01

Pos logit contributions
atron+1.303
imet+1.256
izoph+1.212
Desk+1.148
rology+1.058
Neg logit contributions
nces-1.981
esome-1.893
 contrad-1.809
anamo-1.726
edom-1.678
TokenAbout
Token ID8585
Feature activation-3.29e+00

Pos logit contributions
atron+1.500
imet+1.472
izoph+1.429
Desk+1.344
estern+1.288
Neg logit contributions
nces-1.747
esome-1.665
 contrad-1.511
anamo-1.469
edom-1.421
Token\n
Token ID198
Feature activation-1.51e+00

Pos logit contributions
atron+5.638
imet+5.494
izoph+5.243
Desk+5.060
estern+4.777
Neg logit contributions
nces-7.241
esome-6.877
 contrad-6.337
anamo-6.175
edom-5.939
Token\n
Token ID198
Feature activation-1.73e+00

Pos logit contributions
atron+2.622
imet+2.565
izoph+2.485
Desk+2.348
estern+2.229
Neg logit contributions
nces-3.380
esome-3.234
 contrad-2.953
anamo-2.869
 Flavoring-2.785
TokenR
Token ID49
Feature activation-3.05e-02

Pos logit contributions
atron+3.159
imet+3.079
izoph+2.963
Desk+2.882
estern+2.693
Neg logit contributions
nces-3.718
esome-3.516
 contrad-3.250
anamo-3.163
 Flavoring-3.023
Tokenacing
Token ID4092
Feature activation-4.11e-01

Pos logit contributions
atron+0.057
imet+0.056
izoph+0.054
Desk+0.052
estern+0.049
Neg logit contributions
nces-0.064
esome-0.061
 contrad-0.055
anamo-0.054
edom-0.052
Token Apex
Token ID49440
Feature activation-2.24e+00

Pos logit contributions
atron+0.785
imet+0.775
izoph+0.748
Desk+0.709
estern+0.678
Neg logit contributions
nces-0.846
esome-0.807
 contrad-0.738
anamo-0.704
edom-0.684
Token as
Token ID355
Feature activation+3.79e-02

Pos logit contributions
atron+2.663
imet+2.619
izoph+2.508
Desk+2.340
estern+2.190
Neg logit contributions
nces-4.986
esome-4.784
 contrad-4.439
anamo-4.436
 Flavoring-4.262
Token racism
Token ID10713
Feature activation-5.54e-01

Pos logit contributions
nces+0.079
esome+0.075
 contrad+0.069
anamo+0.065
edom+0.063
Neg logit contributions
atron-0.073
imet-0.071
izoph-0.068
Desk-0.065
estern-0.063
Token,
Token ID11
Feature activation-1.03e-01

Pos logit contributions
atron+1.080
imet+1.055
izoph+1.024
Desk+0.981
estern+0.934
Neg logit contributions
nces-1.124
esome-1.067
 contrad-0.973
anamo-0.940
 Flavoring-0.903
Token sexism
Token ID24953
Feature activation-3.07e-01

Pos logit contributions
atron+0.203
imet+0.198
izoph+0.192
Desk+0.183
estern+0.176
Neg logit contributions
nces-0.204
esome-0.196
 contrad-0.180
anamo-0.171
edom-0.165
Token,
Token ID11
Feature activation-4.59e-01

Pos logit contributions
atron+0.567
imet+0.554
izoph+0.534
Desk+0.513
estern+0.487
Neg logit contributions
nces-0.655
esome-0.620
 contrad-0.566
anamo-0.557
 Flavoring-0.531
Token trans
Token ID1007
Feature activation-3.28e+00

Pos logit contributions
atron+0.917
imet+0.895
izoph+0.867
Desk+0.832
estern+0.794
Neg logit contributions
nces-0.909
esome-0.865
 contrad-0.789
anamo-0.758
edom-0.727
Tokenph
Token ID746
Feature activation-1.12e+00

Pos logit contributions
atron+4.487
imet+4.391
izoph+4.183
Desk+3.930
rology+3.572
Neg logit contributions
nces-8.332
esome-7.986
 contrad-7.442
anamo-7.268
 Flavoring-7.143
Tokenobia
Token ID30665
Feature activation+6.98e-01

Pos logit contributions
atron+0.912
imet+0.865
izoph+0.796
Desk+0.710
estern+0.612
Neg logit contributions
nces-3.558
esome-3.438
 contrad-3.249
anamo-3.186
 Flavoring-3.111
Token,
Token ID11
Feature activation-6.32e-01

Pos logit contributions
nces+1.398
esome+1.325
 contrad+1.207
anamo+1.165
 Flavoring+1.123
Neg logit contributions
atron-1.372
imet-1.344
izoph-1.302
Desk-1.250
estern-1.188
Token and
Token ID290
Feature activation-2.53e-01

Pos logit contributions
atron+1.254
imet+1.225
izoph+1.185
Desk+1.136
estern+1.085
Neg logit contributions
nces-1.262
esome-1.201
 contrad-1.096
anamo-1.055
edom-1.011
Token able
Token ID1498
Feature activation-7.79e-01

Pos logit contributions
atron+0.525
imet+0.513
izoph+0.497
Desk+0.476
estern+0.459
Neg logit contributions
nces-0.482
esome-0.460
 contrad-0.419
anamo-0.397
edom-0.382
Token South
Token ID2520
Feature activation+1.08e+00

Pos logit contributions
atron+0.013
imet+0.013
izoph+0.012
Desk+0.012
estern+0.011
Neg logit contributions
nces-0.014
esome-0.013
 contrad-0.012
anamo-0.011
edom-0.011
Token Carolina
Token ID5913
Feature activation-3.19e-01

Pos logit contributions
nces+1.963
esome+1.885
 contrad+1.665
anamo+1.655
edom+1.589
Neg logit contributions
imet-2.240
atron-2.236
izoph-2.175
Desk-2.091
estern-2.018
Token where
Token ID810
Feature activation-1.51e-01

Pos logit contributions
atron+0.627
imet+0.617
izoph+0.599
Desk+0.573
estern+0.548
Neg logit contributions
nces-0.638
esome-0.607
 contrad-0.554
anamo-0.535
edom-0.512
Token workers
Token ID3259
Feature activation-1.67e+00

Pos logit contributions
atron+0.293
imet+0.289
izoph+0.279
Desk+0.268
estern+0.257
Neg logit contributions
nces-0.309
esome-0.293
 contrad-0.267
anamo-0.258
edom-0.247
Token had
Token ID550
Feature activation-9.23e-01

Pos logit contributions
atron+3.271
imet+3.201
izoph+3.116
Desk+2.980
estern+2.841
Neg logit contributions
nces-3.390
esome-3.212
 contrad-2.939
anamo-2.833
edom-2.721
Token voted
Token ID7052
Feature activation-3.28e+00

Pos logit contributions
atron+1.705
imet+1.668
izoph+1.633
Desk+1.552
estern+1.480
Neg logit contributions
nces-1.966
esome-1.873
 contrad-1.733
anamo-1.648
edom-1.609
Token to
Token ID284
Feature activation-2.37e-01

Pos logit contributions
atron+6.023
imet+5.810
izoph+5.591
Desk+5.443
estern+5.054
Neg logit contributions
nces-6.851
esome-6.534
 contrad-5.872
anamo-5.825
 Flavoring-5.698
Token go
Token ID467
Feature activation+8.16e-01

Pos logit contributions
atron+0.486
imet+0.482
izoph+0.466
Desk+0.445
estern+0.434
Neg logit contributions
nces-0.445
esome-0.418
 contrad-0.399
 Flavoring-0.357
anamo-0.356
Token non
Token ID1729
Feature activation+2.66e-01

Pos logit contributions
nces+1.518
esome+1.442
 contrad+1.320
anamo+1.280
udo+1.213
Neg logit contributions
atron-1.678
imet-1.662
izoph-1.623
Desk-1.529
estern-1.506
Tokenunion
Token ID24592
Feature activation-1.77e+00

Pos logit contributions
nces+0.580
esome+0.552
 contrad+0.507
anamo+0.493
edom+0.474
Neg logit contributions
atron-0.476
imet-0.466
izoph-0.450
Desk-0.430
estern-0.407
Token,
Token ID11
Feature activation-5.24e-02

Pos logit contributions
atron+3.206
imet+3.131
izoph+3.014
Desk+2.892
estern+2.734
Neg logit contributions
nces-3.829
esome-3.636
 contrad-3.335
anamo-3.244
 Flavoring-3.124
Token.
Token ID13
Feature activation-3.78e-01

Pos logit contributions
atron+0.759
imet+0.739
izoph+0.724
Desk+0.683
estern+0.651
Neg logit contributions
nces-0.856
esome-0.815
 contrad-0.748
anamo-0.722
 Flavoring-0.697
Tokene
Token ID68
Feature activation-6.44e-01

Pos logit contributions
atron+0.818
imet+0.802
izoph+0.795
Desk+0.747
estern+0.721
Neg logit contributions
nces-0.685
esome-0.659
 contrad-0.567
anamo-0.548
edom-0.534
Token.
Token ID13
Feature activation-7.31e-01

Pos logit contributions
atron+1.217
imet+1.197
izoph+1.151
Desk+1.099
estern+1.049
Neg logit contributions
nces-1.333
esome-1.275
 contrad-1.160
anamo-1.132
 Flavoring-1.088
Token those
Token ID883
Feature activation+7.60e-01

Pos logit contributions
atron+1.446
imet+1.417
izoph+1.370
Desk+1.311
estern+1.262
Neg logit contributions
nces-1.478
esome-1.390
 contrad-1.279
anamo-1.225
edom-1.180
Token who
Token ID508
Feature activation+6.93e-02

Pos logit contributions
nces+1.503
esome+1.387
 contrad+1.309
anamo+1.219
edom+1.186
Neg logit contributions
atron-1.524
imet-1.497
izoph-1.462
Desk-1.380
estern-1.335
Token voted
Token ID7052
Feature activation-3.27e+00

Pos logit contributions
nces+0.135
esome+0.126
 contrad+0.119
anamo+0.110
edom+0.108
Neg logit contributions
atron-0.142
imet-0.138
izoph-0.134
Desk-0.128
estern-0.124
Token in
Token ID287
Feature activation-2.33e-02

Pos logit contributions
atron+5.547
imet+5.435
izoph+5.196
Desk+5.055
estern+4.710
Neg logit contributions
nces-7.241
esome-6.938
 contrad-6.309
anamo-6.242
 Flavoring-6.029
Token the
Token ID262
Feature activation-1.11e-03

Pos logit contributions
atron+0.045
imet+0.044
izoph+0.043
Desk+0.041
estern+0.040
Neg logit contributions
nces-0.048
esome-0.044
 contrad-0.042
anamo-0.039
edom-0.039
Token previous
Token ID2180
Feature activation+2.73e-01

Pos logit contributions
atron+0.002
imet+0.002
izoph+0.002
Desk+0.002
estern+0.002
Neg logit contributions
nces-0.002
esome-0.002
 contrad-0.002
anamo-0.002
 Flavoring-0.002
Token election
Token ID3071
Feature activation-2.22e+00

Pos logit contributions
nces+0.661
esome+0.629
 contrad+0.596
 Flavoring+0.553
anamo+0.553
Neg logit contributions
atron-0.423
imet-0.414
izoph-0.394
Desk-0.370
estern-0.361
Token.
Token ID13
Feature activation+2.30e-01

Pos logit contributions
atron+4.102
imet+4.021
izoph+3.868
Desk+3.719
estern+3.521
Neg logit contributions
nces-4.689
esome-4.456
 contrad-4.076
anamo-3.972
 Flavoring-3.823
Token,
Token ID11
Feature activation+1.44e-01

Pos logit contributions
atron+2.361
imet+2.313
izoph+2.218
Desk+2.133
estern+2.020
Neg logit contributions
nces-2.644
esome-2.511
 contrad-2.295
anamo-2.237
 Flavoring-2.149
Token if
Token ID611
Feature activation-2.63e-01

Pos logit contributions
nces+0.284
esome+0.269
 contrad+0.249
anamo+0.233
edom+0.229
Neg logit contributions
atron-0.284
imet-0.277
izoph-0.275
Desk-0.260
estern-0.253
Token Labor
Token ID7882
Feature activation-2.75e+00

Pos logit contributions
atron+0.523
imet+0.507
izoph+0.498
Desk+0.478
estern+0.462
Neg logit contributions
nces-0.522
esome-0.497
 contrad-0.460
anamo-0.427
edom-0.422
Token wins
Token ID7864
Feature activation-2.11e+00

Pos logit contributions
atron+5.636
imet+5.554
izoph+5.280
Desk+5.151
estern+4.854
Neg logit contributions
nces-5.237
esome-4.965
 contrad-4.433
anamo-4.421
 Flavoring-4.213
Token the
Token ID262
Feature activation-6.23e-02

Pos logit contributions
atron+3.953
imet+3.877
izoph+3.713
Desk+3.560
estern+3.361
Neg logit contributions
nces-4.412
esome-4.204
 contrad-3.799
anamo-3.762
 Flavoring-3.570
Token election
Token ID3071
Feature activation-3.27e+00

Pos logit contributions
atron+0.105
imet+0.103
izoph+0.102
Desk+0.096
estern+0.092
Neg logit contributions
nces-0.141
esome-0.134
 contrad-0.126
anamo-0.119
edom-0.116
Token.
Token ID13
Feature activation-2.81e-01

Pos logit contributions
atron+5.449
imet+5.404
izoph+5.043
Desk+4.864
estern+4.496
Neg logit contributions
nces-7.367
esome-7.050
 contrad-6.410
anamo-6.409
 Flavoring-6.152
Token Given
Token ID11259
Feature activation+9.82e-02

Pos logit contributions
atron+0.513
imet+0.509
izoph+0.497
Desk+0.465
estern+0.451
Neg logit contributions
nces-0.595
esome-0.573
 contrad-0.526
anamo-0.498
 Flavoring-0.496
Token the
Token ID262
Feature activation-2.03e-03

Pos logit contributions
nces+0.191
esome+0.180
 contrad+0.166
anamo+0.156
edom+0.152
Neg logit contributions
atron-0.200
imet-0.196
izoph-0.191
Desk-0.183
estern-0.175
Token policy
Token ID2450
Feature activation-9.72e-01

Pos logit contributions
atron+0.004
imet+0.004
izoph+0.004
Desk+0.004
estern+0.004
Neg logit contributions
nces-0.004
esome-0.004
 contrad-0.003
anamo-0.003
edom-0.003
Token has
Token ID468
Feature activation-8.90e-01

Pos logit contributions
atron+2.104
imet+2.062
izoph+2.009
Desk+1.934
estern+1.854
Neg logit contributions
nces-1.768
esome-1.667
 contrad-1.506
anamo-1.442
edom-1.379
Token a
Token ID257
Feature activation+4.65e-01

Pos logit contributions
nces+1.080
esome+1.017
 contrad+0.952
anamo+0.880
edom+0.859
Neg logit contributions
atron-1.074
imet-1.051
izoph-1.042
Desk-0.980
estern-0.946
Token grand
Token ID4490
Feature activation-9.29e-01

Pos logit contributions
nces+0.901
esome+0.853
 contrad+0.798
anamo+0.741
edom+0.720
Neg logit contributions
atron-0.920
imet-0.915
izoph-0.913
Desk-0.843
estern-0.819
Token total
Token ID2472
Feature activation-2.27e-01

Pos logit contributions
atron+1.811
imet+1.769
izoph+1.742
Desk+1.653
estern+1.594
Neg logit contributions
nces-1.875
esome-1.777
 contrad-1.622
anamo-1.570
edom-1.508
Token of
Token ID286
Feature activation+2.92e-01

Pos logit contributions
atron+0.443
imet+0.433
izoph+0.425
Desk+0.404
estern+0.388
Neg logit contributions
nces-0.461
esome-0.437
 contrad-0.400
anamo-0.381
edom-0.372
Token zero
Token ID6632
Feature activation-1.85e-01

Pos logit contributions
nces+0.612
esome+0.576
 contrad+0.535
anamo+0.498
edom+0.488
Neg logit contributions
atron-0.550
imet-0.537
izoph-0.532
Desk-0.501
estern-0.487
Token gig
Token ID12526
Feature activation-3.27e+00

Pos logit contributions
atron+0.351
imet+0.344
izoph+0.337
Desk+0.320
estern+0.307
Neg logit contributions
nces-0.384
esome-0.364
 contrad-0.335
anamo-0.318
edom-0.313
Tokenawatts
Token ID46684
Feature activation-1.14e+00

Pos logit contributions
atron+6.977
imet+6.784
izoph+6.354
Desk+6.248
estern+5.878
Neg logit contributions
nces-5.954
esome-5.630
 contrad-5.099
anamo-4.866
 Flavoring-4.804
Token.
Token ID13
Feature activation-2.75e-02

Pos logit contributions
atron+2.141
imet+2.091
izoph+2.031
Desk+1.940
estern+1.844
Neg logit contributions
nces-2.385
esome-2.263
 contrad-2.070
anamo-2.002
edom-1.929
Token But
Token ID887
Feature activation+5.30e-01

Pos logit contributions
imet+0.053
atron+0.052
izoph+0.052
Desk+0.047
estern+0.046
Neg logit contributions
nces-0.056
esome-0.054
 contrad-0.049
 Flavoring-0.047
anamo-0.047
Token knowing
Token ID6970
Feature activation-1.47e+00

Pos logit contributions
nces+1.037
esome+0.974
 contrad+0.899
anamo+0.833
edom+0.827
Neg logit contributions
atron-1.057
imet-1.051
izoph-1.032
Desk-0.975
estern-0.953
Token that
Token ID326
Feature activation-3.42e-01

Pos logit contributions
atron+2.858
imet+2.803
izoph+2.714
Desk+2.601
estern+2.480
Neg logit contributions
nces-3.007
esome-2.852
 contrad-2.599
anamo-2.512
edom-2.418
Token the
Token ID262
Feature activation+2.06e-01

Pos logit contributions
nces+0.864
esome+0.820
 contrad+0.747
anamo+0.714
edom+0.694
Neg logit contributions
atron-0.868
imet-0.847
izoph-0.828
Desk-0.797
estern-0.756
Token lake
Token ID13546
Feature activation+6.79e-01

Pos logit contributions
nces+0.413
esome+0.393
 contrad+0.365
anamo+0.341
edom+0.337
Neg logit contributions
atron-0.399
imet-0.386
izoph-0.381
Desk-0.368
estern-0.349
Token
Token ID447
Feature activation+3.58e-01

Pos logit contributions
nces+1.336
esome+1.262
 contrad+1.169
anamo+1.113
edom+1.097
Neg logit contributions
atron-1.324
imet-1.277
izoph-1.276
Desk-1.209
estern-1.163
Token
Token ID247
Feature activation-1.93e-02

Pos logit contributions
nces+0.362
esome+0.322
 contrad+0.264
anamo+0.231
 Flavoring+0.220
Neg logit contributions
imet-1.058
atron-1.052
izoph-1.023
Desk-1.000
estern-0.980
Tokens
Token ID82
Feature activation-5.85e-01

Pos logit contributions
atron+0.036
imet+0.036
izoph+0.035
Desk+0.033
estern+0.032
Neg logit contributions
nces-0.040
esome-0.038
 contrad-0.034
anamo-0.033
edom-0.033
Token surface
Token ID4417
Feature activation+2.50e+00

Pos logit contributions
atron+1.166
imet+1.133
izoph+1.104
Desk+1.065
estern+1.018
Neg logit contributions
nces-1.160
esome-1.099
 contrad-1.011
anamo-0.957
edom-0.934
Token area
Token ID1989
Feature activation+9.18e-01

Pos logit contributions
nces+4.142
esome+3.952
 contrad+3.645
edom+3.359
anamo+3.246
Neg logit contributions
atron-5.343
izoph-5.231
imet-5.119
Desk-4.946
estern-4.887
Token was
Token ID373
Feature activation+5.00e-02

Pos logit contributions
nces+1.897
esome+1.788
 contrad+1.661
edom+1.558
anamo+1.544
Neg logit contributions
atron-1.684
izoph-1.670
imet-1.621
Desk-1.561
estern-1.499
Token about
Token ID546
Feature activation-1.53e+00

Pos logit contributions
nces+0.105
esome+0.099
 contrad+0.091
anamo+0.086
edom+0.085
Neg logit contributions
atron-0.093
izoph-0.091
imet-0.089
Desk-0.085
estern-0.081
Token 12
Token ID1105
Feature activation+4.51e-01

Pos logit contributions
atron+2.699
imet+2.622
izoph+2.567
Desk+2.439
estern+2.307
Neg logit contributions
nces-3.417
esome-3.255
 contrad-3.006
anamo-2.893
edom-2.804
Token%
Token ID4
Feature activation-2.25e+00

Pos logit contributions
nces+0.952
esome+0.896
 contrad+0.832
anamo+0.790
edom+0.774
Neg logit contributions
atron-0.829
imet-0.799
izoph-0.799
Desk-0.754
estern-0.718
Token back
Token ID736
Feature activation-5.52e-02

Pos logit contributions
nces+0.823
esome+0.784
 contrad+0.709
anamo+0.667
edom+0.648
Neg logit contributions
atron-0.891
imet-0.883
izoph-0.871
Desk-0.831
estern-0.798
Token to
Token ID284
Feature activation-4.87e-01

Pos logit contributions
atron+0.113
imet+0.111
izoph+0.109
Desk+0.104
estern+0.100
Neg logit contributions
nces-0.106
esome-0.100
 contrad-0.090
anamo-0.087
edom-0.084
Token the
Token ID262
Feature activation+1.48e-01

Pos logit contributions
atron+1.010
imet+0.995
izoph+0.968
Desk+0.929
estern+0.892
Neg logit contributions
nces-0.925
esome-0.876
 contrad-0.793
anamo-0.760
 Flavoring-0.731
Token no
Token ID645
Feature activation-3.67e-01

Pos logit contributions
nces+0.285
esome+0.272
 contrad+0.248
anamo+0.234
 Flavoring+0.226
Neg logit contributions
atron-0.298
imet-0.294
izoph-0.287
Desk-0.276
estern-0.265
Token-
Token ID12
Feature activation-2.86e-01

Pos logit contributions
atron+0.670
imet+0.656
izoph+0.636
Desk+0.608
estern+0.575
Neg logit contributions
nces-0.789
esome-0.750
 contrad-0.686
anamo-0.666
edom-0.642
Tokenspin
Token ID39706
Feature activation+1.80e+00

Pos logit contributions
atron+0.548
imet+0.541
izoph+0.530
Desk+0.500
estern+0.477
Neg logit contributions
nces-0.586
esome-0.554
 contrad-0.497
anamo-0.488
edom-0.471
Token scenario
Token ID8883
Feature activation+7.94e-01

Pos logit contributions
nces+3.258
esome+3.078
 contrad+2.763
anamo+2.576
edom+2.529
Neg logit contributions
imet-3.761
atron-3.737
izoph-3.638
Desk-3.535
estern-3.403
Token for
Token ID329
Feature activation-1.55e-01

Pos logit contributions
nces+1.624
esome+1.549
 contrad+1.401
anamo+1.347
edom+1.305
Neg logit contributions
atron-1.511
imet-1.486
izoph-1.454
Desk-1.397
estern-1.327
Token a
Token ID257
Feature activation+4.56e-03

Pos logit contributions
atron+0.330
imet+0.325
izoph+0.318
Desk+0.308
estern+0.293
Neg logit contributions
nces-0.285
esome-0.270
 contrad-0.244
anamo-0.231
 Flavoring-0.223
Token second
Token ID1218
Feature activation+8.81e-01

Pos logit contributions
nces+0.010
esome+0.010
 contrad+0.009
anamo+0.008
edom+0.008
Neg logit contributions
atron-0.008
imet-0.008
izoph-0.008
Desk-0.007
estern-0.007
Token:
Token ID25
Feature activation-4.77e-01

Pos logit contributions
nces+1.789
esome+1.722
 contrad+1.555
anamo+1.467
edom+1.446
Neg logit contributions
atron-1.681
imet-1.644
izoph-1.610
Desk-1.540
estern-1.469
Token provided
Token ID2810
Feature activation+8.12e-01

Pos logit contributions
atron+1.483
imet+1.454
izoph+1.411
Desk+1.342
estern+1.270
Neg logit contributions
nces-1.828
esome-1.747
 contrad-1.607
anamo-1.541
edom-1.497
Token scientists
Token ID5519
Feature activation+7.28e-01

Pos logit contributions
nces+1.532
esome+1.447
 contrad+1.322
anamo+1.228
edom+1.205
Neg logit contributions
atron-1.688
imet-1.663
izoph-1.621
Desk-1.548
estern-1.484
Token a
Token ID257
Feature activation+9.61e-02

Pos logit contributions
nces+1.378
esome+1.296
 contrad+1.177
anamo+1.106
 Flavoring+1.077
Neg logit contributions
atron-1.519
imet-1.491
izoph-1.453
Desk-1.383
estern-1.323
Token library
Token ID5888
Feature activation-6.27e-01

Pos logit contributions
nces+0.190
esome+0.182
 contrad+0.167
anamo+0.155
 Flavoring+0.151
Neg logit contributions
atron-0.189
imet-0.185
izoph-0.183
Desk-0.174
estern-0.166
Token of
Token ID286
Feature activation+7.94e-01

Pos logit contributions
atron+1.105
imet+1.083
izoph+1.026
Desk+1.000
estern+0.937
Neg logit contributions
nces-1.400
esome-1.305
 contrad-1.207
anamo-1.200
 Flavoring-1.145
Token comparisons
Token ID17909
Feature activation+1.96e+00

Pos logit contributions
nces+1.475
esome+1.392
 contrad+1.275
anamo+1.163
edom+1.149
Neg logit contributions
atron-1.659
imet-1.623
izoph-1.606
Desk-1.543
estern-1.471
Token for
Token ID329
Feature activation+5.34e-02

Pos logit contributions
nces+3.608
esome+3.474
 contrad+3.136
edom+2.847
anamo+2.846
Neg logit contributions
atron-3.998
izoph-3.905
imet-3.901
Desk-3.703
estern-3.588
Token what
Token ID644
Feature activation+1.28e+00

Pos logit contributions
nces+0.104
esome+0.098
 contrad+0.090
anamo+0.084
 Flavoring+0.082
Neg logit contributions
atron-0.109
imet-0.106
izoph-0.104
Desk-0.100
estern-0.095
Token Curiosity
Token ID39736
Feature activation+9.60e-01

Pos logit contributions
nces+2.266
esome+2.144
 contrad+1.958
anamo+1.802
edom+1.777
Neg logit contributions
atron-2.761
imet-2.709
izoph-2.663
Desk-2.549
estern-2.482
Token sees
Token ID7224
Feature activation-9.14e-01

Pos logit contributions
nces+1.686
esome+1.603
 contrad+1.460
anamo+1.344
 Flavoring+1.304
Neg logit contributions
atron-2.076
izoph-2.051
imet-2.012
Desk-1.919
estern-1.879
Token.
Token ID13
Feature activation-7.98e-02

Pos logit contributions
atron+1.684
imet+1.643
izoph+1.592
Desk+1.521
estern+1.443
Neg logit contributions
nces-1.954
esome-1.859
 contrad-1.702
anamo-1.650
edom-1.589
Token got
Token ID1392
Feature activation+8.03e-01

Pos logit contributions
nces+0.549
esome+0.525
 contrad+0.474
anamo+0.439
edom+0.432
Neg logit contributions
atron-0.578
imet-0.568
izoph-0.563
Desk-0.539
estern-0.508
Token that
Token ID326
Feature activation-4.40e-02

Pos logit contributions
nces+1.586
esome+1.508
 contrad+1.372
anamo+1.317
edom+1.269
Neg logit contributions
atron-1.579
imet-1.561
izoph-1.536
Desk-1.457
estern-1.381
Token?
Token ID30
Feature activation-1.13e-01

Pos logit contributions
atron+0.084
imet+0.083
izoph+0.080
Desk+0.077
estern+0.074
Neg logit contributions
nces-0.089
esome-0.086
 contrad-0.080
anamo-0.075
 Flavoring-0.072
Token
Token ID198
Feature activation+7.73e-01

Pos logit contributions
atron+0.219
imet+0.218
izoph+0.215
Desk+0.200
estern+0.193
Neg logit contributions
nces-0.225
esome-0.217
 contrad-0.197
anamo-0.189
 Flavoring-0.186
Token
Token ID198
Feature activation-1.15e-03

Pos logit contributions
esome+1.430
nces+1.411
 contrad+1.160
 Flavoring+1.148
udo+1.145
Neg logit contributions
izoph-1.603
imet-1.568
atron-1.568
estern-1.420
Desk-1.419
TokenSure
Token ID19457
Feature activation+1.65e+00

Pos logit contributions
atron+0.002
imet+0.002
izoph+0.002
Desk+0.002
estern+0.002
Neg logit contributions
nces-0.003
esome-0.003
 contrad-0.002
anamo-0.002
 Flavoring-0.002
Token,
Token ID11
Feature activation-1.36e-01

Pos logit contributions
nces+3.071
esome+2.953
 contrad+2.668
anamo+2.546
edom+2.464
Neg logit contributions
atron-3.352
imet-3.276
izoph-3.271
Desk-3.023
estern-2.975
Token I
Token ID314
Feature activation-5.68e-01

Pos logit contributions
atron+0.275
imet+0.270
izoph+0.264
Desk+0.251
estern+0.242
Neg logit contributions
nces-0.263
esome-0.250
 contrad-0.229
anamo-0.218
edom-0.209
Token got
Token ID1392
Feature activation+3.64e-01

Pos logit contributions
atron+1.158
imet+1.130
izoph+1.110
Desk+1.058
estern+1.013
Neg logit contributions
nces-1.111
esome-1.043
 contrad-0.957
anamo-0.902
edom-0.879
Token it
Token ID340
Feature activation+9.69e-02

Pos logit contributions
nces+0.712
esome+0.673
 contrad+0.613
anamo+0.587
edom+0.566
Neg logit contributions
atron-0.730
imet-0.721
izoph-0.706
Desk-0.670
estern-0.639
Token.
Token ID13
Feature activation-5.68e-04

Pos logit contributions
nces+0.196
esome+0.188
 contrad+0.171
anamo+0.163
edom+0.159
Neg logit contributions
atron-0.187
imet-0.184
izoph-0.180
Desk-0.172
estern-0.164
Token was
Token ID373
Feature activation+1.72e-01

Pos logit contributions
nces+0.031
esome+0.030
 contrad+0.027
anamo+0.025
edom+0.025
Neg logit contributions
atron-0.032
imet-0.031
izoph-0.031
Desk-0.029
estern-0.028
Token posted
Token ID4481
Feature activation+3.38e-01

Pos logit contributions
nces+0.360
esome+0.343
 contrad+0.315
anamo+0.297
edom+0.290
Neg logit contributions
atron-0.320
imet-0.317
izoph-0.308
Desk-0.294
estern-0.277
Token.
Token ID13
Feature activation+2.64e-01

Pos logit contributions
nces+0.693
esome+0.654
 contrad+0.594
anamo+0.569
edom+0.555
Neg logit contributions
atron-0.651
imet-0.633
izoph-0.630
Desk-0.593
estern-0.569
Token Here
Token ID3423
Feature activation+8.11e-01

Pos logit contributions
nces+0.529
esome+0.507
 contrad+0.458
anamo+0.438
 Flavoring+0.431
Neg logit contributions
atron-0.510
imet-0.509
izoph-0.501
Desk-0.459
estern-0.452
Token's
Token ID338
Feature activation+3.96e-01

Pos logit contributions
nces+1.439
esome+1.354
 contrad+1.211
anamo+1.162
edom+1.117
Neg logit contributions
atron-1.766
imet-1.742
izoph-1.711
Desk-1.639
estern-1.576
Token what
Token ID644
Feature activation+1.85e+00

Pos logit contributions
nces+0.776
esome+0.729
 contrad+0.659
anamo+0.630
edom+0.612
Neg logit contributions
atron-0.791
imet-0.789
izoph-0.772
Desk-0.722
estern-0.705
Token my
Token ID616
Feature activation-4.11e-01

Pos logit contributions
nces+3.309
esome+3.033
 contrad+2.760
anamo+2.557
edom+2.527
Neg logit contributions
atron-3.942
imet-3.921
izoph-3.859
Desk-3.635
estern-3.619
Token methods
Token ID5050
Feature activation+3.73e-01

Pos logit contributions
atron+0.808
imet+0.799
izoph+0.780
Desk+0.738
estern+0.719
Neg logit contributions
nces-0.831
esome-0.781
 contrad-0.714
anamo-0.679
 Flavoring-0.654
Token were
Token ID547
Feature activation+2.82e-01

Pos logit contributions
nces+0.692
esome+0.660
 contrad+0.590
anamo+0.565
edom+0.551
Neg logit contributions
atron-0.777
izoph-0.759
imet-0.754
Desk-0.725
estern-0.695
Token:
Token ID25
Feature activation-6.87e-02

Pos logit contributions
nces+0.582
esome+0.548
 contrad+0.496
anamo+0.471
edom+0.468
Neg logit contributions
atron-0.542
izoph-0.522
imet-0.520
Desk-0.497
estern-0.476
TokenIf
Token ID1532
Feature activation+5.65e-01

Pos logit contributions
atron+0.125
imet+0.122
izoph+0.120
Desk+0.111
estern+0.109
Neg logit contributions
nces-0.148
esome-0.140
 contrad-0.128
anamo-0.124
 Flavoring-0.120
Token
Token ID198
Feature activation+7.38e-01

Pos logit contributions
nces+0.168
esome+0.160
 contrad+0.145
anamo+0.141
 Flavoring+0.138
Neg logit contributions
atron-0.145
imet-0.144
izoph-0.140
Desk-0.131
estern-0.126
Token
Token ID198
Feature activation+1.99e-02

Pos logit contributions
nces+1.378
esome+1.377
 contrad+1.133
 Flavoring+1.116
udo+1.114
Neg logit contributions
izoph-1.491
atron-1.485
imet-1.483
Desk-1.333
estern-1.333
TokenLast
Token ID5956
Feature activation-1.37e+00

Pos logit contributions
nces+0.043
esome+0.040
 contrad+0.036
anamo+0.035
 Flavoring+0.035
Neg logit contributions
atron-0.036
imet-0.036
izoph-0.035
Desk-0.032
estern-0.032
Token year
Token ID614
Feature activation+9.05e-01

Pos logit contributions
atron+2.079
imet+2.019
izoph+1.937
Desk+1.829
estern+1.711
Neg logit contributions
nces-3.398
esome-3.253
 contrad-3.017
anamo-2.938
edom-2.849
Token,
Token ID11
Feature activation+2.88e-01

Pos logit contributions
nces+1.823
esome+1.722
 contrad+1.568
anamo+1.515
edom+1.459
Neg logit contributions
atron-1.779
imet-1.746
izoph-1.691
Desk-1.617
estern-1.537
Token Pruitt
Token ID39637
Feature activation+8.08e-01

Pos logit contributions
nces+0.560
esome+0.526
 contrad+0.479
anamo+0.462
edom+0.444
Neg logit contributions
atron-0.586
imet-0.577
izoph-0.559
Desk-0.534
estern-0.512
Token orchestrated
Token ID37728
Feature activation-9.94e-01

Pos logit contributions
nces+1.629
esome+1.562
 contrad+1.440
anamo+1.372
edom+1.333
Neg logit contributions
atron-1.531
imet-1.501
izoph-1.488
Desk-1.387
estern-1.341
Token an
Token ID281
Feature activation-9.46e-02

Pos logit contributions
atron+1.892
imet+1.853
izoph+1.781
Desk+1.717
estern+1.635
Neg logit contributions
nces-2.051
esome-1.950
 contrad-1.778
anamo-1.735
 Flavoring-1.659
Token effort
Token ID3626
Feature activation+5.92e-01

Pos logit contributions
atron+0.186
imet+0.181
izoph+0.177
Desk+0.169
estern+0.162
Neg logit contributions
nces-0.190
esome-0.180
 contrad-0.165
anamo-0.159
edom-0.155
Token by
Token ID416
Feature activation+1.05e+00

Pos logit contributions
nces+1.140
esome+1.080
 contrad+0.978
anamo+0.939
edom+0.906
Neg logit contributions
atron-1.207
imet-1.182
izoph-1.164
Desk-1.105
estern-1.052
Token Republican
Token ID3415
Feature activation-7.95e-02

Pos logit contributions
nces+2.035
esome+1.915
 contrad+1.740
anamo+1.661
edom+1.613
Neg logit contributions
atron-2.116
imet-2.085
izoph-2.029
Desk-1.934
estern-1.846
Tokenin
Token ID259
Feature activation+4.27e-01

Pos logit contributions
nces+0.581
esome+0.555
 contrad+0.495
anamo+0.483
edom+0.464
Neg logit contributions
atron-0.570
imet-0.558
izoph-0.542
Desk-0.513
estern-0.492
Token-
Token ID12
Feature activation-2.30e-01

Pos logit contributions
nces+0.883
esome+0.839
 contrad+0.772
anamo+0.738
edom+0.708
Neg logit contributions
atron-0.809
imet-0.785
izoph-0.765
Desk-0.729
estern-0.694
Tokenall
Token ID439
Feature activation-2.25e-01

Pos logit contributions
atron+0.357
imet+0.350
izoph+0.337
Desk+0.317
estern+0.300
Neg logit contributions
nces-0.557
esome-0.533
 contrad-0.490
anamo-0.479
edom-0.462
Token minor
Token ID4159
Feature activation-7.84e-01

Pos logit contributions
atron+0.423
imet+0.412
izoph+0.400
Desk+0.383
estern+0.363
Neg logit contributions
nces-0.473
esome-0.451
 contrad-0.414
anamo-0.398
edom-0.384
Token issues
Token ID2428
Feature activation-1.91e-01

Pos logit contributions
atron+1.607
imet+1.578
izoph+1.545
Desk+1.473
estern+1.407
Neg logit contributions
nces-1.513
esome-1.431
 contrad-1.301
anamo-1.246
edom-1.202
Token
Token ID447
Feature activation-7.66e-02

Pos logit contributions
atron+0.362
imet+0.354
izoph+0.345
Desk+0.330
estern+0.315
Neg logit contributions
nces-0.392
esome-0.375
 contrad-0.345
anamo-0.329
edom-0.319
Token
Token ID251
Feature activation-1.08e-01

Pos logit contributions
atron+0.219
imet+0.219
izoph+0.212
Desk+0.207
estern+0.202
Neg logit contributions
nces-0.085
esome-0.077
 contrad-0.064
anamo-0.059
 Flavoring-0.054
Token the
Token ID262
Feature activation-2.39e-01

Pos logit contributions
atron+0.206
imet+0.202
izoph+0.197
Desk+0.188
estern+0.180
Neg logit contributions
nces-0.221
esome-0.211
 contrad-0.193
anamo-0.185
edom-0.179
Token demands
Token ID8665
Feature activation+1.12e+00

Pos logit contributions
atron+0.454
imet+0.444
izoph+0.433
Desk+0.416
estern+0.399
Neg logit contributions
nces-0.491
esome-0.470
 contrad-0.434
anamo-0.412
 Flavoring-0.402
Token by
Token ID416
Feature activation+3.27e-01

Pos logit contributions
nces+2.165
esome+2.087
 contrad+1.880
anamo+1.813
edom+1.755
Neg logit contributions
atron-2.199
imet-2.142
izoph-2.130
Desk-2.026
estern-1.968
Token Kentucky
Token ID11867
Feature activation+5.29e-01

Pos logit contributions
nces+0.661
esome+0.632
 contrad+0.579
anamo+0.546
edom+0.536
Neg logit contributions
atron-0.632
imet-0.619
izoph-0.603
Desk-0.581
estern-0.554
Token and
Token ID290
Feature activation+4.72e-01

Pos logit contributions
nces+0.567
esome+0.541
 contrad+0.485
anamo+0.473
edom+0.460
Neg logit contributions
atron-0.550
imet-0.536
izoph-0.526
Desk-0.498
estern-0.474
Token Vince
Token ID22376
Feature activation+7.55e-02

Pos logit contributions
nces+0.895
esome+0.853
 contrad+0.781
anamo+0.740
edom+0.717
Neg logit contributions
atron-0.960
imet-0.951
izoph-0.937
Desk-0.883
estern-0.844
Token Zam
Token ID38343
Feature activation-3.24e-01

Pos logit contributions
nces+0.176
esome+0.169
 contrad+0.156
anamo+0.151
edom+0.146
Neg logit contributions
atron-0.124
imet-0.120
izoph-0.118
Desk-0.111
estern-0.103
Tokenpe
Token ID431
Feature activation+3.39e-01

Pos logit contributions
atron+0.637
imet+0.615
izoph+0.605
Desk+0.572
estern+0.558
Neg logit contributions
nces-0.660
esome-0.628
 contrad-0.560
anamo-0.539
edom-0.538
Tokenlla
Token ID8466
Feature activation-7.39e-02

Pos logit contributions
nces+0.595
esome+0.556
 contrad+0.503
anamo+0.488
 Flavoring+0.461
Neg logit contributions
atron-0.752
imet-0.744
izoph-0.725
Desk-0.687
estern-0.665
Token,
Token ID11
Feature activation+8.04e-02

Pos logit contributions
atron+0.146
imet+0.143
izoph+0.140
Desk+0.133
estern+0.127
Neg logit contributions
nces-0.148
esome-0.140
 contrad-0.127
anamo-0.123
edom-0.119
Token outlines
Token ID27430
Feature activation-8.66e-01

Pos logit contributions
nces+0.146
esome+0.138
 contrad+0.126
anamo+0.117
edom+0.115
Neg logit contributions
atron-0.172
imet-0.170
izoph-0.168
Desk-0.160
estern-0.154
Token the
Token ID262
Feature activation+9.09e-02

Pos logit contributions
atron+1.678
imet+1.650
izoph+1.598
Desk+1.529
estern+1.457
Neg logit contributions
nces-1.767
esome-1.679
 contrad-1.536
anamo-1.482
edom-1.427
Token four
Token ID1440
Feature activation+1.39e+00

Pos logit contributions
nces+0.165
esome+0.158
 contrad+0.145
anamo+0.136
edom+0.131
Neg logit contributions
atron-0.192
imet-0.190
izoph-0.185
Desk-0.179
estern-0.171
Token game
Token ID983
Feature activation+1.95e-01

Pos logit contributions
nces+2.590
esome+2.519
 contrad+2.313
anamo+2.152
edom+2.128
Neg logit contributions
atron-2.780
izoph-2.746
imet-2.743
Desk-2.556
estern-2.458
Token plan
Token ID1410
Feature activation-4.82e-01

Pos logit contributions
nces+0.355
esome+0.336
 contrad+0.304
anamo+0.292
edom+0.280
Neg logit contributions
atron-0.418
imet-0.411
izoph-0.402
Desk-0.383
estern-0.369
Token Women
Token ID6926
Feature activation-4.99e-02

Pos logit contributions
nces+1.441
esome+1.360
 contrad+1.242
 Flavoring+1.188
anamo+1.185
Neg logit contributions
atron-1.287
imet-1.262
izoph-1.240
Desk-1.155
estern-1.122
Token.
Token ID13
Feature activation-5.63e-01

Pos logit contributions
atron+0.093
imet+0.092
izoph+0.090
Desk+0.085
estern+0.081
Neg logit contributions
nces-0.105
esome-0.099
 contrad-0.090
anamo-0.087
 Flavoring-0.085
Token
Token ID198
Feature activation+5.21e-02

Pos logit contributions
atron+1.028
imet+1.015
izoph+0.995
Desk+0.923
estern+0.890
Neg logit contributions
nces-1.197
esome-1.157
 contrad-1.032
anamo-1.009
 Flavoring-0.998
Token
Token ID198
Feature activation-6.00e-01

Pos logit contributions
nces+0.101
esome+0.100
 contrad+0.085
 Flavoring+0.083
udo+0.082
Neg logit contributions
atron-0.102
izoph-0.102
imet-0.101
Desk-0.092
estern-0.091
Token
Token ID447
Feature activation-5.51e-01

Pos logit contributions
atron+1.137
imet+1.110
izoph+1.073
Desk+1.031
estern+0.976
Neg logit contributions
nces-1.255
esome-1.188
 contrad-1.087
anamo-1.057
 Flavoring-1.013
Token
Token ID250
Feature activation+5.99e-01

Pos logit contributions
atron+1.595
imet+1.587
izoph+1.550
Desk+1.506
estern+1.457
Neg logit contributions
nces-0.599
esome-0.543
 contrad-0.439
anamo-0.402
 Flavoring-0.384
TokenA
Token ID32
Feature activation+5.58e-01

Pos logit contributions
nces+1.225
esome+1.173
 contrad+1.065
anamo+1.020
edom+1.008
Neg logit contributions
atron-1.117
imet-1.107
izoph-1.091
estern-0.979
Desk-0.973
Token gentleman
Token ID22427
Feature activation+1.32e+00

Pos logit contributions
nces+1.169
esome+1.121
 contrad+1.038
anamo+0.976
edom+0.961
Neg logit contributions
atron-1.029
imet-1.000
izoph-0.989
Desk-0.937
estern-0.883
Token will
Token ID481
Feature activation+8.12e-01

Pos logit contributions
nces+2.399
esome+2.260
 contrad+2.071
anamo+1.893
edom+1.884
Neg logit contributions
atron-2.772
izoph-2.745
imet-2.722
Desk-2.570
estern-2.456
Token not
Token ID407
Feature activation+1.37e-01

Pos logit contributions
nces+1.594
esome+1.482
 contrad+1.397
edom+1.277
liction+1.251
Neg logit contributions
atron-1.610
imet-1.589
izoph-1.585
Desk-1.488
estern-1.435
Token take
Token ID1011
Feature activation+8.00e-01

Pos logit contributions
nces+0.271
esome+0.255
 contrad+0.238
edom+0.220
anamo+0.215
Neg logit contributions
atron-0.273
imet-0.268
izoph-0.264
Desk-0.250
estern-0.242
Token.
Token ID13
Feature activation-8.12e-01

Pos logit contributions
atron+0.865
imet+0.857
izoph+0.847
Desk+0.789
estern+0.768
Neg logit contributions
nces-0.911
esome-0.877
 contrad-0.765
anamo-0.743
 Flavoring-0.733
Token Lerner
Token ID49750
Feature activation-2.72e-03

Pos logit contributions
atron+1.626
imet+1.592
izoph+1.541
Desk+1.481
estern+1.410
Neg logit contributions
nces-1.606
esome-1.519
 contrad-1.380
anamo-1.338
 Flavoring-1.281
Token,
Token ID11
Feature activation+2.38e-02

Pos logit contributions
atron+0.006
imet+0.005
izoph+0.005
Desk+0.005
estern+0.005
Neg logit contributions
nces-0.005
esome-0.005
 contrad-0.005
anamo-0.004
edom-0.004
Token who
Token ID508
Feature activation-1.43e-01

Pos logit contributions
nces+0.047
esome+0.045
 contrad+0.042
anamo+0.040
 Flavoring+0.038
Neg logit contributions
atron-0.046
imet-0.046
izoph-0.045
Desk-0.043
estern-0.041
Token was
Token ID373
Feature activation+7.38e-02

Pos logit contributions
atron+0.287
imet+0.283
izoph+0.280
Desk+0.264
estern+0.252
Neg logit contributions
nces-0.281
esome-0.266
 contrad-0.245
anamo-0.234
edom-0.225
Token head
Token ID1182
Feature activation+1.07e-02

Pos logit contributions
nces+0.145
esome+0.137
 contrad+0.127
anamo+0.121
 Flavoring+0.117
Neg logit contributions
atron-0.146
imet-0.145
izoph-0.142
Desk-0.134
estern-0.129
Token of
Token ID286
Feature activation-2.02e-01

Pos logit contributions
nces+0.021
esome+0.020
 contrad+0.018
anamo+0.017
edom+0.017
Neg logit contributions
atron-0.022
imet-0.021
izoph-0.021
Desk-0.020
estern-0.019
Token the
Token ID262
Feature activation+4.24e-01

Pos logit contributions
atron+0.400
imet+0.391
izoph+0.380
Desk+0.364
estern+0.346
Neg logit contributions
nces-0.405
esome-0.383
 contrad-0.349
anamo-0.338
edom-0.325
Token division
Token ID7297
Feature activation-5.48e-01

Pos logit contributions
nces+0.860
esome+0.816
 contrad+0.752
anamo+0.720
edom+0.705
Neg logit contributions
atron-0.816
imet-0.797
izoph-0.788
Desk-0.750
estern-0.712
Token that
Token ID326
Feature activation+2.76e-01

Pos logit contributions
atron+1.058
imet+1.035
izoph+1.006
Desk+0.962
estern+0.914
Neg logit contributions
nces-1.120
esome-1.062
 contrad-0.968
anamo-0.940
edom-0.902
Token scrutin
Token ID30441
Feature activation+1.36e+00

Pos logit contributions
nces+0.534
esome+0.510
 contrad+0.464
anamo+0.446
edom+0.428
Neg logit contributions
atron-0.554
imet-0.545
izoph-0.543
Desk-0.516
estern-0.490
Tokenthan
Token ID14813
Feature activation-4.18e-01

Pos logit contributions
atron+0.146
imet+0.142
izoph+0.140
Desk+0.132
estern+0.126
Neg logit contributions
nces-0.167
esome-0.159
 contrad-0.145
anamo-0.141
edom-0.135
Token to
Token ID284
Feature activation-5.02e-01

Pos logit contributions
atron+0.831
imet+0.817
izoph+0.792
Desk+0.758
estern+0.723
Neg logit contributions
nces-0.835
esome-0.790
 contrad-0.718
anamo-0.697
edom-0.667
Token their
Token ID511
Feature activation+2.06e-01

Pos logit contributions
atron+0.978
imet+0.961
izoph+0.933
Desk+0.890
estern+0.848
Neg logit contributions
nces-1.021
esome-0.964
 contrad-0.881
anamo-0.853
 Flavoring-0.819
Token boss
Token ID6478
Feature activation-3.45e-01

Pos logit contributions
nces+0.392
esome+0.369
 contrad+0.339
anamo+0.322
 Flavoring+0.310
Neg logit contributions
atron-0.427
imet-0.417
izoph-0.407
Desk-0.390
estern-0.371
Token.
Token ID13
Feature activation+2.18e-01

Pos logit contributions
atron+0.650
imet+0.635
izoph+0.619
Desk+0.587
estern+0.560
Neg logit contributions
nces-0.725
esome-0.688
 contrad-0.627
anamo-0.612
edom-0.588
Token Regarding
Token ID38687
Feature activation-6.07e-01

Pos logit contributions
nces+0.450
esome+0.432
 contrad+0.391
anamo+0.379
 Flavoring+0.377
Neg logit contributions
atron-0.409
imet-0.407
izoph-0.402
Desk-0.365
estern-0.355
Token Bain
Token ID33126
Feature activation+5.04e-01

Pos logit contributions
atron+1.195
imet+1.173
izoph+1.138
Desk+1.088
estern+1.037
Neg logit contributions
nces-1.229
esome-1.158
 contrad-1.061
anamo-1.023
 Flavoring-0.984
Token,
Token ID11
Feature activation+3.61e-01

Pos logit contributions
nces+1.106
esome+1.049
 contrad+0.960
anamo+0.935
edom+0.903
Neg logit contributions
atron-0.897
imet-0.883
izoph-0.858
Desk-0.812
estern-0.765
Token the
Token ID262
Feature activation+1.68e-01

Pos logit contributions
nces+0.723
esome+0.682
 contrad+0.626
anamo+0.601
edom+0.581
Neg logit contributions
atron-0.713
imet-0.698
izoph-0.681
Desk-0.647
estern-0.620
Token candidate
Token ID4540
Feature activation-5.98e-01

Pos logit contributions
nces+0.333
esome+0.315
 contrad+0.291
anamo+0.275
edom+0.267
Neg logit contributions
atron-0.336
imet-0.330
izoph-0.320
Desk-0.306
estern-0.292
Token has
Token ID468
Feature activation-7.12e-01

Pos logit contributions
atron+1.220
imet+1.193
izoph+1.165
Desk+1.113
estern+1.062
Neg logit contributions
nces-1.164
esome-1.099
 contrad-1.001
anamo-0.965
edom-0.923
Tokenone
Token ID505
Feature activation+3.15e-01

Pos logit contributions
atron+0.404
imet+0.392
izoph+0.391
Desk+0.370
estern+0.346
Neg logit contributions
nces-0.576
esome-0.572
 contrad-0.518
anamo-0.514
udo-0.495
Token ad
Token ID512
Feature activation+1.95e-01

Pos logit contributions
nces+0.635
esome+0.608
 contrad+0.557
anamo+0.524
edom+0.518
Neg logit contributions
atron-0.607
imet-0.590
izoph-0.581
Desk-0.552
estern-0.534
Tokenor
Token ID273
Feature activation-8.24e-01

Pos logit contributions
nces+0.390
esome+0.372
 contrad+0.336
anamo+0.325
edom+0.314
Neg logit contributions
atron-0.384
imet-0.374
izoph-0.366
Desk-0.353
estern-0.338
Tokenning
Token ID768
Feature activation+6.52e-01

Pos logit contributions
atron+1.837
imet+1.803
izoph+1.752
Desk+1.691
estern+1.619
Neg logit contributions
nces-1.443
esome-1.355
 contrad-1.214
anamo-1.171
 Flavoring-1.114
Token the
Token ID262
Feature activation-2.28e-01

Pos logit contributions
nces+1.277
esome+1.202
 contrad+1.096
anamo+1.049
edom+1.014
Neg logit contributions
atron-1.309
imet-1.279
izoph-1.252
Desk-1.193
estern-1.153
Token time
Token ID640
Feature activation-6.41e-01

Pos logit contributions
atron+0.490
imet+0.479
izoph+0.467
Desk+0.447
estern+0.432
Neg logit contributions
nces-0.413
esome-0.392
 contrad-0.357
anamo-0.338
 Flavoring-0.324
Token-
Token ID12
Feature activation+2.99e-01

Pos logit contributions
atron+1.310
imet+1.276
izoph+1.248
Desk+1.196
estern+1.151
Neg logit contributions
nces-1.234
esome-1.162
 contrad-1.067
anamo-1.020
edom-0.976
Tokenworn
Token ID34565
Feature activation+1.40e+00

Pos logit contributions
nces+0.625
esome+0.594
 contrad+0.544
anamo+0.529
edom+0.502
Neg logit contributions
atron-0.553
imet-0.544
izoph-0.530
Desk-0.501
estern-0.488
Token jacket
Token ID15224
Feature activation+3.75e-01

Pos logit contributions
nces+2.758
esome+2.599
 contrad+2.428
anamo+2.254
edom+2.194
Neg logit contributions
atron-2.725
imet-2.652
izoph-2.612
Desk-2.446
estern-2.418
Token.
Token ID13
Feature activation-2.07e-01

Pos logit contributions
nces+0.760
esome+0.722
 contrad+0.659
anamo+0.635
edom+0.614
Neg logit contributions
atron-0.726
imet-0.704
izoph-0.691
Desk-0.656
estern-0.632
Token
Token ID564
Feature activation+3.61e-01

Pos logit contributions
atron+0.366
imet+0.360
izoph+0.349
Desk+0.330
estern+0.316
Neg logit contributions
nces-0.454
esome-0.433
 contrad-0.396
anamo-0.385
 Flavoring-0.373
Token,
Token ID11
Feature activation-3.55e-01

Pos logit contributions
nces+1.945
esome+1.825
 contrad+1.691
edom+1.556
anamo+1.546
Neg logit contributions
atron-2.128
imet-2.076
izoph-2.049
Desk-1.958
estern-1.879
Token and
Token ID290
Feature activation-4.44e-01

Pos logit contributions
atron+0.712
imet+0.699
izoph+0.677
Desk+0.649
estern+0.620
Neg logit contributions
nces-0.702
esome-0.662
 contrad-0.607
anamo-0.583
edom-0.562
Token the
Token ID262
Feature activation-2.91e-01

Pos logit contributions
atron+0.922
imet+0.906
izoph+0.879
Desk+0.844
estern+0.806
Neg logit contributions
nces-0.847
esome-0.799
 contrad-0.728
anamo-0.697
edom-0.671
Token joy
Token ID8716
Feature activation-3.45e-01

Pos logit contributions
atron+0.592
imet+0.581
izoph+0.562
Desk+0.541
estern+0.513
Neg logit contributions
nces-0.572
esome-0.539
 contrad-0.493
anamo-0.469
edom-0.455
Tokens
Token ID82
Feature activation-1.07e+00

Pos logit contributions
atron+0.648
imet+0.631
izoph+0.616
Desk+0.587
estern+0.556
Neg logit contributions
nces-0.730
esome-0.691
 contrad-0.633
anamo-0.613
edom-0.593
Token of
Token ID286
Feature activation-5.94e-01

Pos logit contributions
atron+2.038
imet+1.995
izoph+1.927
Desk+1.848
estern+1.755
Neg logit contributions
nces-2.215
esome-2.101
 contrad-1.917
anamo-1.863
 Flavoring-1.788
Token life
Token ID1204
Feature activation+1.75e-01

Pos logit contributions
atron+1.194
imet+1.172
izoph+1.133
Desk+1.088
estern+1.035
Neg logit contributions
nces-1.169
esome-1.106
 contrad-1.021
anamo-0.973
edom-0.944
Token,
Token ID11
Feature activation-3.88e-01

Pos logit contributions
nces+0.352
esome+0.331
 contrad+0.305
anamo+0.289
edom+0.283
Neg logit contributions
atron-0.347
imet-0.339
izoph-0.331
Desk-0.317
estern-0.302
Token
Token ID198
Feature activation-2.00e-01

Pos logit contributions
atron+0.691
imet+0.674
izoph+0.652
Desk+0.622
estern+0.589
Neg logit contributions
nces-0.854
esome-0.812
 contrad-0.748
anamo-0.723
edom-0.700
Token
Token ID198
Feature activation-7.47e-01

Pos logit contributions
atron+0.385
izoph+0.380
imet+0.380
Desk+0.345
estern+0.338
Neg logit contributions
nces-0.403
esome-0.395
 contrad-0.343
 Flavoring-0.332
anamo-0.329
Tokenare
Token ID533
Feature activation+1.37e-01

Pos logit contributions
atron+1.354
imet+1.324
izoph+1.287
Desk+1.215
estern+1.164
Neg logit contributions
nces-1.625
esome-1.542
 contrad-1.409
anamo-1.368
edom-1.328
Token reality
Token ID3950
Feature activation+1.06e+00

Pos logit contributions
atron+0.545
imet+0.538
izoph+0.521
Desk+0.500
estern+0.482
Neg logit contributions
nces-0.543
esome-0.517
 contrad-0.476
anamo-0.446
edom-0.436
Token to
Token ID284
Feature activation+3.43e-01

Pos logit contributions
nces+2.153
esome+2.049
 contrad+1.879
anamo+1.786
edom+1.737
Neg logit contributions
atron-2.019
imet-1.974
izoph-1.957
Desk-1.847
estern-1.773
Token the
Token ID262
Feature activation+3.84e-01

Pos logit contributions
nces+0.675
esome+0.645
 contrad+0.596
anamo+0.564
edom+0.547
Neg logit contributions
atron-0.671
imet-0.665
izoph-0.654
Desk-0.619
estern-0.599
Token people
Token ID661
Feature activation+3.96e-01

Pos logit contributions
nces+0.722
esome+0.688
 contrad+0.636
anamo+0.593
edom+0.576
Neg logit contributions
atron-0.790
imet-0.784
izoph-0.773
Desk-0.731
estern-0.703
Token,"
Token ID553
Feature activation+6.87e-02

Pos logit contributions
nces+0.771
esome+0.726
 contrad+0.677
anamo+0.643
edom+0.625
Neg logit contributions
atron-0.792
izoph-0.779
imet-0.764
Desk-0.728
estern-0.704
Token said
Token ID531
Feature activation-5.15e-01

Pos logit contributions
nces+0.152
esome+0.146
 contrad+0.136
anamo+0.131
edom+0.126
Neg logit contributions
atron-0.119
imet-0.117
izoph-0.114
Desk-0.108
estern-0.102
Token Ron
Token ID6575
Feature activation+6.29e-01

Pos logit contributions
atron+1.037
imet+1.019
izoph+0.993
Desk+0.949
estern+0.903
Neg logit contributions
nces-1.007
esome-0.956
 contrad-0.875
anamo-0.839
edom-0.807
Token Price
Token ID7886
Feature activation+1.90e+00

Pos logit contributions
nces+1.207
esome+1.147
 contrad+1.026
anamo+0.990
edom+0.956
Neg logit contributions
atron-1.274
imet-1.260
izoph-1.238
Desk-1.182
estern-1.122
Token,
Token ID11
Feature activation+5.78e-02

Pos logit contributions
nces+3.744
esome+3.542
 contrad+3.221
anamo+3.102
edom+2.999
Neg logit contributions
atron-3.756
imet-3.678
izoph-3.618
Desk-3.441
estern-3.261
Token the
Token ID262
Feature activation-1.43e-01

Pos logit contributions
nces+0.111
esome+0.106
 contrad+0.097
anamo+0.092
edom+0.090
Neg logit contributions
atron-0.117
imet-0.116
izoph-0.113
Desk-0.107
estern-0.103
Token president
Token ID1893
Feature activation-5.75e-01

Pos logit contributions
atron+0.286
imet+0.282
izoph+0.275
Desk+0.263
estern+0.251
Neg logit contributions
nces-0.279
esome-0.265
 contrad-0.244
anamo-0.232
edom-0.226
Tokened
Token ID276
Feature activation-1.83e-01

Pos logit contributions
nces+1.487
esome+1.409
 contrad+1.268
anamo+1.214
edom+1.206
Neg logit contributions
atron-1.384
izoph-1.352
imet-1.310
Desk-1.243
estern-1.176
Token up
Token ID510
Feature activation+5.86e-01

Pos logit contributions
atron+0.361
imet+0.353
izoph+0.348
Desk+0.329
estern+0.314
Neg logit contributions
nces-0.366
esome-0.345
 contrad-0.316
anamo-0.301
edom-0.290
Token by
Token ID416
Feature activation+3.53e-01

Pos logit contributions
nces+1.240
esome+1.176
 contrad+1.071
anamo+1.028
edom+0.987
Neg logit contributions
atron-1.090
izoph-1.057
imet-1.057
Desk-0.988
estern-0.935
Token the
Token ID262
Feature activation-9.55e-03

Pos logit contributions
nces+0.689
esome+0.655
 contrad+0.598
anamo+0.570
 Flavoring+0.552
Neg logit contributions
atron-0.710
imet-0.687
izoph-0.676
Desk-0.655
estern-0.610
Token approaching
Token ID13885
Feature activation-5.56e-01

Pos logit contributions
atron+0.017
imet+0.016
izoph+0.016
Desk+0.016
estern+0.015
Neg logit contributions
nces-0.021
esome-0.020
 contrad-0.018
anamo-0.017
 Flavoring-0.017
Token storm
Token ID6388
Feature activation-1.24e+00

Pos logit contributions
atron+0.944
imet+0.913
izoph+0.891
Desk+0.854
estern+0.797
Neg logit contributions
nces-1.266
esome-1.217
 contrad-1.123
anamo-1.081
 Flavoring-1.049
Token in
Token ID287
Feature activation-1.22e-01

Pos logit contributions
atron+2.254
imet+2.216
izoph+2.122
Desk+2.040
estern+1.928
Neg logit contributions
nces-2.679
esome-2.542
 contrad-2.337
anamo-2.274
 Flavoring-2.189
Token Haiti
Token ID25051
Feature activation-6.60e-01

Pos logit contributions
atron+0.241
imet+0.233
izoph+0.229
Desk+0.219
estern+0.206
Neg logit contributions
nces-0.246
esome-0.233
 contrad-0.216
anamo-0.204
edom-0.199
Token,
Token ID11
Feature activation+2.64e-02

Pos logit contributions
atron+1.250
imet+1.220
izoph+1.183
Desk+1.131
estern+1.076
Neg logit contributions
nces-1.380
esome-1.311
 contrad-1.198
anamo-1.163
edom-1.118
Token the
Token ID262
Feature activation-1.52e-01

Pos logit contributions
nces+0.052
esome+0.050
 contrad+0.046
anamo+0.044
 Flavoring+0.042
Neg logit contributions
atron-0.053
imet-0.052
izoph-0.050
Desk-0.048
estern-0.046
Token Associated
Token ID10575
Feature activation-1.94e-01

Pos logit contributions
atron+0.292
imet+0.282
izoph+0.278
Desk+0.266
estern+0.255
Neg logit contributions
nces-0.313
esome-0.297
 contrad-0.276
anamo-0.260
 Flavoring-0.254
Token
Token ID447
Feature activation+4.89e-01

Pos logit contributions
nces+0.904
esome+0.865
 contrad+0.774
edom+0.730
anamo+0.730
Neg logit contributions
atron-0.908
izoph-0.899
imet-0.892
Desk-0.849
estern-0.800
Token
Token ID247
Feature activation+1.11e-01

Pos logit contributions
nces+0.556
esome+0.504
 contrad+0.423
anamo+0.385
 Flavoring+0.354
Neg logit contributions
atron-1.388
imet-1.376
izoph-1.340
Desk-1.308
estern-1.277
Tokens
Token ID82
Feature activation-7.75e-01

Pos logit contributions
nces+0.234
esome+0.220
 contrad+0.199
anamo+0.193
edom+0.188
Neg logit contributions
atron-0.207
imet-0.204
izoph-0.199
Desk-0.191
estern-0.180
Token rewarded
Token ID20945
Feature activation+3.65e-01

Pos logit contributions
atron+1.498
imet+1.474
izoph+1.436
Desk+1.372
estern+1.300
Neg logit contributions
nces-1.579
esome-1.512
 contrad-1.370
anamo-1.315
edom-1.270
Token the
Token ID262
Feature activation-2.74e-01

Pos logit contributions
nces+0.734
esome+0.698
 contrad+0.631
anamo+0.601
edom+0.584
Neg logit contributions
atron-0.704
imet-0.693
izoph-0.686
Desk-0.649
estern-0.625
Token Bills
Token ID20209
Feature activation-2.46e+00

Pos logit contributions
atron+0.522
imet+0.512
izoph+0.497
Desk+0.481
estern+0.455
Neg logit contributions
nces-0.563
esome-0.537
 contrad-0.496
anamo-0.468
 Flavoring-0.458
Token with
Token ID351
Feature activation-3.97e-02

Pos logit contributions
atron+4.687
imet+4.569
izoph+4.395
Desk+4.231
estern+4.021
Neg logit contributions
nces-5.114
esome-4.831
 contrad-4.410
anamo-4.317
 Flavoring-4.124
Token two
Token ID734
Feature activation+9.96e-01

Pos logit contributions
atron+0.078
imet+0.077
izoph+0.076
Desk+0.072
estern+0.068
Neg logit contributions
nces-0.079
esome-0.075
 contrad-0.069
anamo-0.065
edom-0.064
Token years
Token ID812
Feature activation+4.37e-01

Pos logit contributions
nces+1.883
esome+1.792
 contrad+1.629
anamo+1.506
 Flavoring+1.495
Neg logit contributions
atron-2.017
imet-1.990
izoph-1.971
Desk-1.887
estern-1.809
Token of
Token ID286
Feature activation-6.57e-02

Pos logit contributions
nces+0.874
esome+0.829
 contrad+0.750
anamo+0.717
edom+0.701
Neg logit contributions
atron-0.859
imet-0.836
izoph-0.830
Desk-0.791
estern-0.754
Token excellent
Token ID6275
Feature activation+4.10e-01

Pos logit contributions
atron+0.126
imet+0.125
izoph+0.122
Desk+0.116
estern+0.111
Neg logit contributions
nces-0.134
esome-0.127
 contrad-0.117
anamo-0.110
edom-0.108
Token for
Token ID329
Feature activation+2.32e-02

Pos logit contributions
nces+0.803
esome+0.760
 contrad+0.699
anamo+0.659
edom+0.649
Neg logit contributions
atron-0.831
imet-0.807
izoph-0.805
Desk-0.754
estern-0.722
Token customers
Token ID4297
Feature activation+3.49e-01

Pos logit contributions
nces+0.044
esome+0.042
 contrad+0.039
anamo+0.036
edom+0.035
Neg logit contributions
atron-0.048
imet-0.047
izoph-0.046
Desk-0.044
estern-0.042
Token exceeding
Token ID23353
Feature activation-9.46e-01

Pos logit contributions
nces+0.695
esome+0.657
 contrad+0.617
anamo+0.581
edom+0.564
Neg logit contributions
atron-0.682
imet-0.668
izoph-0.665
Desk-0.623
estern-0.603
Token about
Token ID546
Feature activation-1.81e+00

Pos logit contributions
atron+1.812
imet+1.769
izoph+1.724
Desk+1.641
estern+1.571
Neg logit contributions
nces-1.960
esome-1.853
 contrad-1.700
anamo-1.636
edom-1.587
Token 300
Token ID5867
Feature activation+8.30e-01

Pos logit contributions
atron+3.085
imet+3.005
izoph+2.904
Desk+2.762
estern+2.613
Neg logit contributions
nces-4.144
esome-3.946
 contrad-3.645
anamo-3.537
edom-3.414
Token gig
Token ID12526
Feature activation-2.67e+00

Pos logit contributions
nces+1.915
esome+1.819
 contrad+1.694
anamo+1.624
edom+1.589
Neg logit contributions
atron-1.377
imet-1.325
izoph-1.315
Desk-1.214
estern-1.177
Tokenabytes
Token ID38346
Feature activation-8.99e-01

Pos logit contributions
atron+4.551
imet+4.456
izoph+4.191
Desk+4.108
estern+3.798
Neg logit contributions
nces-6.008
esome-5.689
 contrad-5.270
anamo-5.136
 Flavoring-5.021
Token a
Token ID257
Feature activation-2.00e-01

Pos logit contributions
atron+1.644
imet+1.598
izoph+1.559
Desk+1.483
estern+1.410
Neg logit contributions
nces-1.940
esome-1.844
 contrad-1.688
anamo-1.635
 Flavoring-1.577
Token month
Token ID1227
Feature activation-1.19e-01

Pos logit contributions
atron+0.293
imet+0.279
izoph+0.273
Desk+0.253
estern+0.238
Neg logit contributions
nces-0.506
esome-0.484
 contrad-0.452
anamo-0.434
edom-0.424
Token,
Token ID11
Feature activation+1.20e-01

Pos logit contributions
atron+0.234
imet+0.226
izoph+0.226
Desk+0.212
estern+0.206
Neg logit contributions
nces-0.240
esome-0.228
 contrad-0.207
anamo-0.198
edom-0.193
Token or
Token ID393
Feature activation-9.46e-01

Pos logit contributions
nces+0.238
esome+0.224
 contrad+0.206
anamo+0.196
edom+0.189
Neg logit contributions
atron-0.239
imet-0.233
izoph-0.231
Desk-0.217
estern-0.211
Tokenre
Token ID260
Feature activation+9.27e-01

Pos logit contributions
atron+0.223
imet+0.218
izoph+0.212
Desk+0.202
estern+0.191
Neg logit contributions
nces-0.267
esome-0.254
 contrad-0.233
anamo-0.225
edom-0.217
Token talking
Token ID3375
Feature activation-8.46e-02

Pos logit contributions
nces+1.764
esome+1.696
 contrad+1.550
anamo+1.464
edom+1.403
Neg logit contributions
atron-1.822
izoph-1.820
imet-1.795
Desk-1.705
estern-1.641
Token about
Token ID546
Feature activation-7.62e-01

Pos logit contributions
atron+0.160
imet+0.157
izoph+0.152
Desk+0.146
estern+0.138
Neg logit contributions
nces-0.176
esome-0.167
 contrad-0.153
anamo-0.148
edom-0.142
Token less
Token ID1342
Feature activation-4.55e-02

Pos logit contributions
atron+1.476
imet+1.443
izoph+1.407
Desk+1.350
estern+1.282
Neg logit contributions
nces-1.557
esome-1.478
 contrad-1.350
anamo-1.301
edom-1.252
Token than
Token ID621
Feature activation-1.09e+00

Pos logit contributions
atron+0.089
imet+0.087
izoph+0.086
Desk+0.083
estern+0.078
Neg logit contributions
nces-0.091
esome-0.087
 contrad-0.079
anamo-0.076
edom-0.074
Token half
Token ID2063
Feature activation-2.48e+00

Pos logit contributions
atron+2.026
imet+1.982
izoph+1.926
Desk+1.844
estern+1.752
Neg logit contributions
nces-2.320
esome-2.204
 contrad-2.019
anamo-1.953
edom-1.885
Token,
Token ID11
Feature activation+1.69e-01

Pos logit contributions
atron+4.553
imet+4.485
izoph+4.247
Desk+4.094
estern+3.853
Neg logit contributions
nces-5.297
esome-5.015
 contrad-4.609
anamo-4.512
 Flavoring-4.303
Token
Token ID447
Feature activation+2.59e-01

Pos logit contributions
nces+0.331
esome+0.318
 contrad+0.288
anamo+0.276
edom+0.266
Neg logit contributions
atron-0.335
imet-0.326
izoph-0.321
Desk-0.307
estern-0.294
Token
Token ID247
Feature activation-2.74e-01

Pos logit contributions
nces+0.284
esome+0.258
 contrad+0.218
anamo+0.197
edom+0.179
Neg logit contributions
atron-0.738
imet-0.737
izoph-0.718
Desk-0.701
estern-0.684
Token he
Token ID339
Feature activation+3.76e-01

Pos logit contributions
atron+0.553
imet+0.544
izoph+0.528
Desk+0.509
estern+0.484
Neg logit contributions
nces-0.534
esome-0.508
 contrad-0.459
anamo-0.444
edom-0.426
Token said
Token ID531
Feature activation-3.57e-02

Pos logit contributions
nces+0.876
esome+0.837
 contrad+0.773
anamo+0.750
edom+0.729
Neg logit contributions
atron-0.612
imet-0.598
izoph-0.577
Desk-0.551
estern-0.518
Token the
Token ID262
Feature activation+4.28e-01

Pos logit contributions
nces+0.796
esome+0.751
 contrad+0.687
anamo+0.658
edom+0.636
Neg logit contributions
atron-0.787
imet-0.765
izoph-0.749
Desk-0.716
estern-0.681
Token top
Token ID1353
Feature activation-4.84e-02

Pos logit contributions
nces+0.893
esome+0.844
 contrad+0.778
anamo+0.729
edom+0.715
Neg logit contributions
atron-0.811
imet-0.789
izoph-0.768
Desk-0.738
estern-0.705
Token 2
Token ID362
Feature activation+1.33e-01

Pos logit contributions
atron+0.084
imet+0.082
izoph+0.080
Desk+0.076
estern+0.073
Neg logit contributions
nces-0.108
esome-0.103
 contrad-0.095
anamo-0.091
edom-0.088
Token.
Token ID13
Feature activation-1.62e-01

Pos logit contributions
nces+0.290
esome+0.274
 contrad+0.253
anamo+0.244
edom+0.235
Neg logit contributions
atron-0.240
imet-0.234
izoph-0.227
Desk-0.218
estern-0.206
Token75
Token ID2425
Feature activation-7.58e-01

Pos logit contributions
atron+0.283
imet+0.279
izoph+0.274
Desk+0.255
estern+0.244
Neg logit contributions
nces-0.356
esome-0.344
 contrad-0.312
anamo-0.302
 Flavoring-0.297
Token%
Token ID4
Feature activation-2.47e+00

Pos logit contributions
atron+1.288
imet+1.258
izoph+1.207
Desk+1.153
estern+1.085
Neg logit contributions
nces-1.726
esome-1.646
 contrad-1.515
anamo-1.479
 Flavoring-1.428
Token of
Token ID286
Feature activation-1.28e-01

Pos logit contributions
atron+4.301
imet+4.267
izoph+3.998
Desk+3.852
istani+3.599
Neg logit contributions
nces-5.377
esome-5.112
 contrad-4.681
anamo-4.615
 Flavoring-4.447
Token all
Token ID477
Feature activation+1.93e-02

Pos logit contributions
atron+0.291
imet+0.284
izoph+0.278
Desk+0.269
estern+0.259
Neg logit contributions
nces-0.218
esome-0.204
 contrad-0.183
anamo-0.173
edom-0.167
Token households
Token ID12503
Feature activation+7.52e-01

Pos logit contributions
nces+0.033
esome+0.031
 contrad+0.028
anamo+0.027
edom+0.026
Neg logit contributions
atron-0.044
imet-0.043
izoph-0.042
Desk-0.040
estern-0.039
Token in
Token ID287
Feature activation+7.13e-02

Pos logit contributions
nces+1.539
esome+1.454
 contrad+1.350
anamo+1.276
edom+1.255
Neg logit contributions
atron-1.446
imet-1.398
izoph-1.376
Desk-1.310
estern-1.257
Token India
Token ID3794
Feature activation+4.47e-01

Pos logit contributions
nces+0.150
esome+0.142
 contrad+0.131
anamo+0.125
edom+0.122
Neg logit contributions
atron-0.135
imet-0.130
izoph-0.127
Desk-0.122
estern-0.116
Token tot
Token ID2006
Feature activation-1.59e+00

Pos logit contributions
atron+0.352
imet+0.345
izoph+0.341
Desk+0.320
estern+0.309
Neg logit contributions
nces-0.358
esome-0.340
 contrad-0.310
anamo-0.293
edom-0.289
Tokenalled
Token ID4262
Feature activation-4.26e-02

Pos logit contributions
atron+2.047
imet+1.982
izoph+1.920
Desk+1.777
estern+1.672
Neg logit contributions
nces-4.278
esome-4.115
 contrad-3.814
anamo-3.752
edom-3.643
Token 2
Token ID362
Feature activation-2.68e-01

Pos logit contributions
atron+0.081
imet+0.079
izoph+0.078
Desk+0.074
estern+0.070
Neg logit contributions
nces-0.089
esome-0.084
 contrad-0.077
anamo-0.073
edom-0.072
Token.
Token ID13
Feature activation-2.07e-01

Pos logit contributions
atron+0.489
imet+0.478
izoph+0.463
Desk+0.443
estern+0.420
Neg logit contributions
nces-0.576
esome-0.546
 contrad-0.502
anamo-0.484
edom-0.468
Token07
Token ID2998
Feature activation-5.47e-01

Pos logit contributions
atron+0.361
imet+0.357
izoph+0.350
Desk+0.326
estern+0.312
Neg logit contributions
nces-0.458
esome-0.441
 contrad-0.398
anamo-0.387
 Flavoring-0.378
Token%
Token ID4
Feature activation-2.44e+00

Pos logit contributions
atron+1.080
imet+1.058
izoph+1.025
Desk+0.983
estern+0.936
Neg logit contributions
nces-1.098
esome-1.039
 contrad-0.946
anamo-0.915
edom-0.879
Token of
Token ID286
Feature activation+9.19e-02

Pos logit contributions
imet+4.081
atron+4.064
izoph+3.606
Desk+3.573
estern+3.322
Neg logit contributions
nces-5.485
esome-5.242
 contrad-4.868
anamo-4.849
 Flavoring-4.554
Token its
Token ID663
Feature activation+3.44e-01

Pos logit contributions
nces+0.204
esome+0.193
 contrad+0.179
anamo+0.171
edom+0.168
Neg logit contributions
atron-0.162
imet-0.158
izoph-0.154
Desk-0.146
estern-0.138
Token gross
Token ID10319
Feature activation+1.76e+00

Pos logit contributions
nces+0.701
esome+0.660
 contrad+0.609
edom+0.567
anamo+0.566
Neg logit contributions
atron-0.673
imet-0.652
izoph-0.637
Desk-0.612
estern-0.590
Token domestic
Token ID5928
Feature activation-1.48e-01

Pos logit contributions
nces+2.681
esome+2.559
 contrad+2.275
edom+2.118
 Flavoring+2.034
Neg logit contributions
atron-4.223
imet-4.112
izoph-4.109
Desk-3.918
estern-3.825
Token product
Token ID1720
Feature activation+8.82e-01

Pos logit contributions
atron+0.271
imet+0.264
izoph+0.256
Desk+0.244
estern+0.233
Neg logit contributions
nces-0.318
esome-0.303
 contrad-0.277
anamo-0.265
 Flavoring-0.260