TOP ACTIVATIONS
MAX = 1.754

it ) X 360 See above . Min us the kittens
. In my comparison data above the vec 4 has all
Sources told Reuters the three above mentioned international news organisations have
the analysis charts ( see above ) Red es
Car rick (@ andy f erv s ) September 27 ,
didn t use above is that it can skip
Veterans As mentioned above , V AS is the
the first edition ( see above ) is a tribute to
email from Eva ( see above ). Could your HS comment
We have taken students to Dell headquarters in Austin , and
As the screenshot above shows there is an
a laptop suspend on Ubuntu above and lo and behold ,
He became an unrestricted free agent at the end of the
. The GDP map mentioned above is also really just a
According to the above terminal output , there isn
've also included a comparison above against 3 other very fast
adults considered themselves to be veget arians , and 2 %
the first few hours mentioned above , your body goes into
at M TF 50 or above at 18 mm . With
are a cos player featured above and you have a link

MIN ACTIVATIONS
MIN = -1.758

pur ak ushi Profile Joined August 2012 United States 3
, though . Instead of saying , black or
) Andre R ingu ette / Getty Images
of Charleston . Rather than releg ating slavery to
did not . Instead , Republicans spent their recess
at once . Rather than a purely academic exercise
costs , he noted . Instead , society picks up the
t . Instead , the job fell to
t bad at all . Rather , it is well
action not only for the balance of the 2017 season but
cold on the Bruins . Instead , Allow ay committed ,
re i Profile Blog Joined October 2002 United States 3
" mass ively - multi player - style ," detailing the
to a more improved MLS instead of second division soccer in
Amazing ly , his first effort along these new lines ,
J aaaa asper Profile Blog Joined April 2012 United States 5
for a diplomatic resolution . Rather , he is signaling that
be expensive . Instead , you can represent density
at it like 36 se ag ull s fighting over a
on government . Instead the committee urged the government

INTERVAL 0.876 - 1.754
CONTAINS 0.345%

hesitant to bring in J U CO defensive linemen . Ter
've also included a comparison above against 3 other very fast
bin with thread before this lesson . Show your child how
Europeans coming from Chicago where Pent ec ost al fire was
s 224 GB / s 192 GB / s 256 GB

INTERVAL -0.002 - 0.876
CONTAINS 50.112%

points to uniquely - owned data that has not yet been
year , $ 41 million extension signed in March 2016
the face of scandal - hit former International Monetary Fund chief
more recent hyper visor configuration , running on a processor with
these teams in their first meetings , so it

INTERVAL -0.880 - -0.002
CONTAINS 49.188%

CentOS servers operating smoothly and you don 't particularly enjoy spending
the New York Times as well as by this publication ,
Read more about Sud h ir Ch oud h rie
and powerful El ision stone is recovered , recounted remarkably in
the state department does not know which files will be released

INTERVAL -1.758 - -0.880
CONTAINS 0.355%

him . We were really fortunate that Jae , when he
o Ma hea
The Brazil ians were lucky to be put into group
t throw j abs in fights almost ever .
sur prise " trip to Mexico with some of my closest
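
NOTE ON INTERVALS
The four INTERVAL boundaries listed above are consistent with splitting the range from MIN (-1.758) to MAX (1.754) into four equal-width bins, with each CONTAINS figure giving the share of sampled activations that fall in that bin. The sketch below reproduces the boundaries under that assumption. The function names `interval_edges` and `interval_shares` are hypothetical and not taken from the tool that produced this dump, and the actual sampling procedure is not stated in the source.

```python
def interval_edges(lo, hi, n_bins=4):
    """Equal-width bin edges from lo to hi (assumed binning scheme)."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def interval_shares(activations, edges):
    """Percentage of activations in each bin; the top edge belongs to the last bin."""
    if not activations:
        return [0.0] * (len(edges) - 1)
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for a in activations:
        for i in range(len(counts)):
            below_upper = a < edges[i + 1] or i == last
            if edges[i] <= a and below_upper:
                counts[i] += 1
                break
    return [100.0 * c / len(activations) for c in counts]


# Edges implied by the MIN and MAX values reported in this dump.
edges = interval_edges(-1.758, 1.754)
print([round(e, 3) for e in edges])
```

Rounded to three decimals, the edges match the interval boundaries in the headers above (-1.758, -0.880, -0.002, 0.876, 1.754). The reported shares (0.345%, 50.112%, 49.188%, 0.355%) sum to 100%, which fits the reading that nearly all activations sit in the two middle bins around zero.
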
Token it
Token ID340
Feature activation+1.24e-01

Pos logit contributions
INESS+0.067
+0.065
 Breaker+0.057
 understandable+0.054
endra+0.052
Neg logit contributions
 cens-0.072
hd-0.070
arning-0.069
asionally-0.069
cffff-0.068
Token)
Token ID8
Feature activation+2.66e-01

Pos logit contributions
INESS+0.270
+0.263
 Breaker+0.234
 understandable+0.223
endra+0.213
Neg logit contributions
 cens-0.256
hd-0.248
arning-0.245
asionally-0.245
cffff-0.240
TokenX
Token ID55
Feature activation-4.59e-01

Pos logit contributions
INESS+0.609
+0.594
 Breaker+0.533
 understandable+0.508
endra+0.487
Neg logit contributions
 cens-0.515
hd-0.498
asionally-0.491
arning-0.491
cffff-0.481
Token360
Token ID15277
Feature activation-7.34e-01

Pos logit contributions
 cens+0.976
hd+0.965
arning+0.944
asionally+0.941
cffff+0.917
Neg logit contributions
INESS-0.956
-0.934
 Breaker-0.819
 understandable-0.779
endra-0.745
TokenSee
Token ID6214
Feature activation+9.23e-02

Pos logit contributions
 cens+1.477
hd+1.444
arning+1.412
asionally+1.411
collar+1.372
Neg logit contributions
INESS-1.622
-1.573
 Breaker-1.395
 understandable-1.336
endra-1.283
Token above
Token ID2029
Feature activation+1.75e+00

Pos logit contributions
INESS+0.192
+0.187
 Breaker+0.165
 understandable+0.157
endra+0.150
Neg logit contributions
 cens-0.198
hd-0.193
asionally-0.190
arning-0.190
cffff-0.186
Token.
Token ID13
Feature activation-2.50e-02

Pos logit contributions
INESS+3.672
+3.571
 Breaker+3.223
 understandable+3.079
endra+2.918
Neg logit contributions
 cens-3.587
asionally-3.498
hd-3.493
cffff-3.475
arning-3.471
Token Min
Token ID1855
Feature activation-5.03e-01

Pos logit contributions
 cens+0.051
hd+0.050
asionally+0.049
arning+0.049
cffff+0.048
Neg logit contributions
INESS-0.055
-0.053
 Breaker-0.047
 understandable-0.045
endra-0.043
Tokenus
Token ID385
Feature activation-3.01e-01

Pos logit contributions
 cens+1.020
hd+0.991
asionally+0.975
arning+0.970
cffff+0.950
Neg logit contributions
INESS-1.112
-1.081
 Breaker-0.960
 understandable-0.914
endra-0.877
Token the
Token ID262
Feature activation-1.16e-01

Pos logit contributions
 cens+0.650
hd+0.628
arning+0.619
asionally+0.618
cffff+0.603
Neg logit contributions
INESS-0.630
-0.610
 Breaker-0.534
 understandable-0.509
endra-0.489
Token kittens
Token ID45584
Feature activation+1.05e-01

Pos logit contributions
 cens+0.248
hd+0.240
arning+0.236
asionally+0.236
cffff+0.230
Neg logit contributions
INESS-0.246
-0.238
 Breaker-0.210
 understandable-0.199
endra-0.192
Token.
Token ID13
Feature activation-2.75e-01

Pos logit contributions
INESS+0.565
+0.550
 Breaker+0.492
 understandable+0.467
endra+0.448
Neg logit contributions
 cens-0.514
hd-0.498
arning-0.492
asionally-0.492
cffff-0.481
Token In
Token ID554
Feature activation+7.48e-03

Pos logit contributions
 cens+0.579
hd+0.562
asionally+0.553
arning+0.551
cffff+0.540
Neg logit contributions
INESS-0.589
-0.574
 Breaker-0.504
 understandable-0.480
endra-0.461
Token my
Token ID616
Feature activation+3.62e-01

Pos logit contributions
INESS+0.016
+0.015
 Breaker+0.014
 understandable+0.013
endra+0.012
Neg logit contributions
 cens-0.016
hd-0.015
arning-0.015
asionally-0.015
cffff-0.015
Token comparison
Token ID7208
Feature activation+7.18e-01

Pos logit contributions
INESS+0.745
+0.724
 Breaker+0.640
 understandable+0.605
endra+0.578
Neg logit contributions
 cens-0.790
hd-0.767
asionally-0.757
arning-0.757
cffff-0.742
Token data
Token ID1366
Feature activation+5.95e-01

Pos logit contributions
INESS+1.619
+1.576
 Breaker+1.418
 understandable+1.349
endra+1.292
Neg logit contributions
 cens-1.411
hd-1.369
arning-1.356
asionally-1.355
cffff-1.328
Token above
Token ID2029
Feature activation+1.72e+00

Pos logit contributions
INESS+1.273
+1.241
 Breaker+1.098
 understandable+1.042
endra+1.000
Neg logit contributions
 cens-1.251
hd-1.213
arning-1.195
asionally-1.194
cffff-1.167
Token the
Token ID262
Feature activation-8.13e-02

Pos logit contributions
INESS+3.304
+3.197
 Breaker+2.853
 understandable+2.706
endra+2.537
Neg logit contributions
 cens-3.821
asionally-3.738
hd-3.737
arning-3.723
cffff-3.676
Token vec
Token ID43030
Feature activation-3.15e-01

Pos logit contributions
 cens+0.171
hd+0.166
arning+0.163
asionally+0.163
cffff+0.160
Neg logit contributions
INESS-0.174
-0.170
 Breaker-0.150
 understandable-0.142
endra-0.137
Token4
Token ID19
Feature activation-3.37e-01

Pos logit contributions
 cens+0.610
hd+0.594
arning+0.582
asionally+0.581
cffff+0.566
Neg logit contributions
INESS-0.723
-0.707
 Breaker-0.632
 understandable-0.601
endra-0.580
Token has
Token ID468
Feature activation+3.12e-02

Pos logit contributions
 cens+0.687
hd+0.670
arning+0.657
asionally+0.656
cffff+0.642
Neg logit contributions
INESS-0.737
-0.720
 Breaker-0.640
 understandable-0.608
endra-0.585
Token all
Token ID477
Feature activation+1.94e-01

Pos logit contributions
INESS+0.065
+0.063
 Breaker+0.056
 understandable+0.053
endra+0.050
Neg logit contributions
 cens-0.067
hd-0.066
asionally-0.065
arning-0.065
cffff-0.063
TokenSources
Token ID21188
Feature activation+9.76e-02

Pos logit contributions
INESS+0.317
+0.310
 Breaker+0.277
 understandable+0.263
endra+0.251
Neg logit contributions
 cens-0.289
hd-0.282
arning-0.277
asionally-0.277
cffff-0.270
Token told
Token ID1297
Feature activation-2.13e-01

Pos logit contributions
INESS+0.206
+0.200
 Breaker+0.178
 understandable+0.169
endra+0.161
Neg logit contributions
 cens-0.207
hd-0.201
asionally-0.199
arning-0.199
cffff-0.195
Token Reuters
Token ID8428
Feature activation+2.54e-01

Pos logit contributions
 cens+0.481
hd+0.463
asionally+0.458
arning+0.457
cffff+0.448
Neg logit contributions
INESS-0.424
-0.415
 Breaker-0.359
 understandable-0.339
endra-0.325
Token the
Token ID262
Feature activation+2.26e-01

Pos logit contributions
INESS+0.477
+0.465
 Breaker+0.403
 understandable+0.379
endra+0.362
Neg logit contributions
 cens-0.599
hd-0.581
asionally-0.574
arning-0.574
cffff-0.563
Token three
Token ID1115
Feature activation+5.97e-02

Pos logit contributions
INESS+0.442
+0.430
 Breaker+0.371
 understandable+0.348
endra+0.335
Neg logit contributions
 cens-0.525
hd-0.503
asionally-0.498
arning-0.496
cffff-0.485
Token above
Token ID2029
Feature activation+1.71e+00

Pos logit contributions
INESS+0.118
+0.114
 Breaker+0.099
 understandable+0.093
endra+0.090
Neg logit contributions
 cens-0.139
hd-0.132
asionally-0.131
arning-0.130
collar-0.127
Tokenmentioned
Token ID17181
Feature activation+3.97e-01

Pos logit contributions
INESS+3.270
+3.165
 Breaker+2.804
 understandable+2.654
endra+2.512
Neg logit contributions
 cens-3.878
hd-3.813
asionally-3.778
arning-3.772
cffff-3.737
Token international
Token ID3230
Feature activation-3.04e-01

Pos logit contributions
INESS+0.740
+0.718
 Breaker+0.623
 understandable+0.584
endra+0.557
Neg logit contributions
 cens-0.947
hd-0.916
asionally-0.906
arning-0.905
cffff-0.887
Token news
Token ID1705
Feature activation+5.06e-02

Pos logit contributions
 cens+0.694
hd+0.661
asionally+0.654
arning+0.653
collar+0.636
Neg logit contributions
INESS-0.604
-0.591
 Breaker-0.514
 understandable-0.482
endra-0.465
Token organisations
Token ID16435
Feature activation-1.27e-01

Pos logit contributions
INESS+0.105
+0.103
 Breaker+0.091
 understandable+0.085
endra+0.082
Neg logit contributions
 cens-0.110
hd-0.106
arning-0.104
asionally-0.104
collar-0.102
Token have
Token ID423
Feature activation+1.70e-01

Pos logit contributions
 cens+0.246
hd+0.236
asionally+0.233
arning+0.233
cffff+0.227
Neg logit contributions
INESS-0.292
-0.286
 Breaker-0.255
 understandable-0.243
endra-0.234
Token the
Token ID262
Feature activation-1.86e-01

Pos logit contributions
 cens+0.353
hd+0.342
arning+0.338
asionally+0.337
cffff+0.331
Neg logit contributions
INESS-0.324
-0.315
 Breaker-0.278
 understandable-0.261
endra-0.251
Token analysis
Token ID3781
Feature activation-2.63e-02

Pos logit contributions
 cens+0.403
hd+0.390
arning+0.385
asionally+0.385
cffff+0.377
Neg logit contributions
INESS-0.384
-0.373
 Breaker-0.330
 understandable-0.312
endra-0.299
Token charts
Token ID15907
Feature activation+1.71e-01

Pos logit contributions
 cens+0.057
hd+0.055
arning+0.054
asionally+0.054
cffff+0.053
Neg logit contributions
INESS-0.055
-0.053
 Breaker-0.047
 understandable-0.044
endra-0.042
Token (
Token ID357
Feature activation-1.66e-01

Pos logit contributions
INESS+0.374
+0.365
 Breaker+0.324
 understandable+0.308
endra+0.296
Neg logit contributions
 cens-0.351
hd-0.340
arning-0.335
asionally-0.335
cffff-0.327
Tokensee
Token ID3826
Feature activation+9.31e-02

Pos logit contributions
 cens+0.323
hd+0.315
arning+0.309
asionally+0.308
cffff+0.300
Neg logit contributions
INESS-0.380
-0.370
 Breaker-0.333
 understandable-0.315
endra-0.303
Token above
Token ID2029
Feature activation+1.69e+00

Pos logit contributions
INESS+0.183
+0.178
 Breaker+0.155
 understandable+0.146
endra+0.140
Neg logit contributions
 cens-0.213
hd-0.206
asionally-0.204
arning-0.204
cffff-0.199
Token)
Token ID8
Feature activation+6.87e-02

Pos logit contributions
INESS+3.463
+3.345
 Breaker+2.980
 understandable+2.848
endra+2.715
Neg logit contributions
 cens-3.590
hd-3.523
asionally-3.498
arning-3.489
cffff-3.476
Token
Token ID198
Feature activation+4.26e-02

Pos logit contributions
INESS+0.152
+0.148
 Breaker+0.132
 understandable+0.126
endra+0.121
Neg logit contributions
 cens-0.138
hd-0.134
asionally-0.132
arning-0.132
cffff-0.130
Token
Token ID198
Feature activation-2.05e-02

Pos logit contributions
INESS+0.097
+0.095
 Breaker+0.085
 understandable+0.080
endra+0.078
Neg logit contributions
 cens-0.083
hd-0.082
arning-0.079
asionally-0.079
collar-0.077
TokenRed
Token ID7738
Feature activation-5.57e-01

Pos logit contributions
 cens+0.040
hd+0.039
arning+0.039
asionally+0.038
cffff+0.038
Neg logit contributions
INESS-0.046
-0.045
 Breaker-0.041
 understandable-0.039
endra-0.037
Tokenes
Token ID274
Feature activation-2.92e-01

Pos logit contributions
 cens+0.931
hd+0.894
arning+0.885
asionally+0.871
collar+0.845
Neg logit contributions
INESS-1.436
-1.399
 Breaker-1.268
 understandable-1.218
endra-1.168
Token Car
Token ID1879
Feature activation-4.22e-01

Pos logit contributions
 cens+0.781
hd+0.763
arning+0.751
asionally+0.751
cffff+0.731
Neg logit contributions
INESS-0.723
-0.701
 Breaker-0.611
 understandable-0.584
endra-0.554
Tokenrick
Token ID5557
Feature activation+1.83e-01

Pos logit contributions
 cens+0.776
hd+0.757
arning+0.743
asionally+0.743
collar+0.720
Neg logit contributions
INESS-1.009
-0.987
 Breaker-0.885
 understandable-0.848
endra-0.805
Token (@
Token ID4275
Feature activation+4.32e-01

Pos logit contributions
INESS+0.392
+0.381
 Breaker+0.338
 understandable+0.322
endra+0.308
Neg logit contributions
 cens-0.383
hd-0.372
arning-0.367
asionally-0.366
cffff-0.358
Tokenandy
Token ID10757
Feature activation+3.82e-01

Pos logit contributions
INESS+1.027
+1.003
 Breaker+0.904
 understandable+0.862
endra+0.831
Neg logit contributions
 cens-0.800
hd-0.771
asionally-0.761
arning-0.761
cffff-0.745
Tokenf
Token ID69
Feature activation+7.85e-01

Pos logit contributions
INESS+0.792
+0.771
 Breaker+0.683
 understandable+0.646
endra+0.619
Neg logit contributions
 cens-0.823
hd-0.798
asionally-0.789
arning-0.789
cffff-0.774
Tokenerv
Token ID712
Feature activation+1.67e+00

Pos logit contributions
INESS+1.885
+1.840
 Breaker+1.663
 understandable+1.589
endra+1.543
Neg logit contributions
 cens-1.422
hd-1.361
asionally-1.344
arning-1.341
cffff-1.320
Tokens
Token ID82
Feature activation+4.23e-01

Pos logit contributions
INESS+3.656
+3.578
 Breaker+3.195
 understandable+3.055
endra+3.001
Neg logit contributions
 cens-3.310
hd-3.177
arning-3.146
asionally-3.135
cffff-3.084
Token)
Token ID8
Feature activation+6.62e-01

Pos logit contributions
INESS+0.952
+0.928
 Breaker+0.831
 understandable+0.791
endra+0.759
Neg logit contributions
 cens-0.835
hd-0.811
asionally-0.799
arning-0.799
cffff-0.782
Token September
Token ID2693
Feature activation+8.54e-02

Pos logit contributions
INESS+1.359
+1.314
 Breaker+1.178
 understandable+1.122
endra+1.060
Neg logit contributions
 cens-1.434
hd-1.390
asionally-1.384
arning-1.383
cffff-1.370
Token 27
Token ID2681
Feature activation+3.70e-01

Pos logit contributions
INESS+0.223
+0.218
 Breaker+0.198
 understandable+0.191
endra+0.184
Neg logit contributions
 cens-0.138
hd-0.133
asionally-0.131
arning-0.130
cffff-0.126
Token,
Token ID11
Feature activation+2.63e-01

Pos logit contributions
INESS+0.767
+0.746
 Breaker+0.661
 understandable+0.626
endra+0.598
Neg logit contributions
 cens-0.797
hd-0.774
asionally-0.764
arning-0.764
cffff-0.749
Token didn
Token ID1422
Feature activation+9.18e-02

Pos logit contributions
 cens+0.094
hd+0.091
asionally+0.089
arning+0.089
cffff+0.088
Neg logit contributions
INESS-0.108
-0.105
 Breaker-0.094
 understandable-0.090
endra-0.086
Token
Token ID447
Feature activation+1.05e-01

Pos logit contributions
INESS+0.254
+0.248
 Breaker+0.228
 understandable+0.219
endra+0.212
Neg logit contributions
 cens-0.134
hd-0.131
asionally-0.126
arning-0.126
collar-0.121
Token
Token ID247
Feature activation-3.92e-02

Pos logit contributions
INESS+0.191
+0.185
 Breaker+0.161
 understandable+0.151
endra+0.143
Neg logit contributions
 cens-0.253
hd-0.247
asionally-0.245
arning-0.244
cffff-0.240
Tokent
Token ID83
Feature activation-8.38e-02

Pos logit contributions
 cens+0.061
hd+0.059
asionally+0.058
arning+0.058
cffff+0.056
Neg logit contributions
INESS-0.105
-0.102
 Breaker-0.093
 understandable-0.089
endra-0.086
Token use
Token ID779
Feature activation-1.02e-02

Pos logit contributions
 cens+0.189
hd+0.183
asionally+0.180
arning+0.180
cffff+0.176
Neg logit contributions
INESS-0.167
-0.163
 Breaker-0.143
 understandable-0.134
endra-0.128
Token above
Token ID2029
Feature activation+1.66e+00

Pos logit contributions
 cens+0.021
hd+0.021
asionally+0.020
arning+0.020
cffff+0.020
Neg logit contributions
INESS-0.022
-0.021
 Breaker-0.019
 understandable-0.018
endra-0.017
Token is
Token ID318
Feature activation-1.14e-01

Pos logit contributions
INESS+3.359
+3.259
 Breaker+2.916
 understandable+2.777
endra+2.627
Neg logit contributions
 cens-3.553
hd-3.484
asionally-3.452
arning-3.450
cffff-3.413
Token that
Token ID326
Feature activation-1.41e-01

Pos logit contributions
 cens+0.267
hd+0.259
arning+0.256
asionally+0.255
cffff+0.250
Neg logit contributions
INESS-0.217
-0.210
 Breaker-0.183
 understandable-0.171
endra-0.165
Token it
Token ID340
Feature activation-2.48e-02

Pos logit contributions
 cens+0.306
hd+0.297
asionally+0.293
arning+0.292
cffff+0.287
Neg logit contributions
INESS-0.291
-0.283
 Breaker-0.250
 understandable-0.236
endra-0.226
Token can
Token ID460
Feature activation+1.23e-01

Pos logit contributions
 cens+0.051
hd+0.050
asionally+0.049
arning+0.049
cffff+0.048
Neg logit contributions
INESS-0.053
-0.052
 Breaker-0.046
 understandable-0.044
endra-0.042
Token skip
Token ID14267
Feature activation+4.17e-01

Pos logit contributions
INESS+0.261
+0.254
 Breaker+0.225
 understandable+0.214
endra+0.204
Neg logit contributions
 cens-0.261
hd-0.254
asionally-0.251
arning-0.250
cffff-0.246
Token Veterans
Token ID18950
Feature activation+2.66e-01

Pos logit contributions
INESS+0.076
+0.074
 Breaker+0.065
 understandable+0.062
endra+0.060
Neg logit contributions
 cens-0.069
hd-0.067
asionally-0.066
arning-0.066
collar-0.064
Token
Token ID198
Feature activation-1.99e-01

Pos logit contributions
INESS+0.579
+0.563
 Breaker+0.500
 understandable+0.477
endra+0.458
Neg logit contributions
 cens-0.545
hd-0.529
arning-0.523
asionally-0.522
cffff-0.511
Token
Token ID198
Feature activation-2.10e-01

Pos logit contributions
 cens+0.381
hd+0.376
arning+0.363
asionally+0.362
collar+0.352
Neg logit contributions
INESS-0.461
-0.453
 Breaker-0.403
 understandable-0.382
endra-0.370
TokenAs
Token ID1722
Feature activation-2.33e-01

Pos logit contributions
 cens+0.388
hd+0.383
arning+0.375
asionally+0.375
collar+0.364
Neg logit contributions
INESS-0.492
-0.482
 Breaker-0.436
 understandable-0.414
endra-0.398
Token mentioned
Token ID4750
Feature activation+3.41e-01

Pos logit contributions
 cens+0.522
hd+0.508
asionally+0.501
arning+0.501
cffff+0.488
Neg logit contributions
INESS-0.464
-0.452
 Breaker-0.397
 understandable-0.371
endra-0.357
Token above
Token ID2029
Feature activation+1.66e+00

Pos logit contributions
INESS+0.616
+0.599
 Breaker+0.518
 understandable+0.486
endra+0.460
Neg logit contributions
 cens-0.829
hd-0.806
asionally-0.799
arning-0.797
cffff-0.784
Token,
Token ID11
Feature activation+1.82e-01

Pos logit contributions
INESS+3.255
+3.163
 Breaker+2.777
 understandable+2.643
endra+2.489
Neg logit contributions
 cens-3.721
hd-3.639
arning-3.572
asionally-3.564
cffff-3.537
Token V
Token ID569
Feature activation+2.75e-03

Pos logit contributions
INESS+0.410
+0.400
 Breaker+0.357
 understandable+0.339
endra+0.327
Neg logit contributions
 cens-0.360
hd-0.348
asionally-0.344
arning-0.344
cffff-0.336
TokenAS
Token ID1921
Feature activation-3.47e-01

Pos logit contributions
INESS+0.006
+0.006
 Breaker+0.005
 understandable+0.005
endra+0.004
Neg logit contributions
 cens-0.006
hd-0.006
arning-0.006
asionally-0.006
collar-0.006
Token is
Token ID318
Feature activation+6.81e-02

Pos logit contributions
 cens+0.752
hd+0.727
arning+0.724
asionally+0.720
collar+0.703
Neg logit contributions
INESS-0.714
-0.695
 Breaker-0.617
 understandable-0.582
endra-0.564
Token the
Token ID262
Feature activation+1.52e-02

Pos logit contributions
INESS+0.130
+0.127
 Breaker+0.110
 understandable+0.103
endra+0.099
Neg logit contributions
 cens-0.159
hd-0.154
arning-0.153
asionally-0.152
cffff-0.149
Token the
Token ID262
Feature activation+2.68e-01

Pos logit contributions
INESS+0.549
+0.533
 Breaker+0.468
 understandable+0.443
endra+0.422
Neg logit contributions
 cens-0.623
hd-0.605
asionally-0.598
arning-0.598
cffff-0.587
Token first
Token ID717
Feature activation+3.67e-01

Pos logit contributions
INESS+0.515
+0.498
 Breaker+0.433
 understandable+0.410
endra+0.391
Neg logit contributions
 cens-0.625
hd-0.606
asionally-0.600
arning-0.597
cffff-0.585
Token edition
Token ID8313
Feature activation-4.73e-01

Pos logit contributions
INESS+0.692
+0.667
 Breaker+0.578
 understandable+0.549
endra+0.522
Neg logit contributions
 cens-0.870
hd-0.844
asionally-0.833
arning-0.830
cffff-0.813
Token (
Token ID357
Feature activation+1.68e-01

Pos logit contributions
 cens+0.968
hd+0.924
asionally+0.915
arning+0.894
collar+0.875
Neg logit contributions
INESS-1.053
-1.010
 Breaker-0.896
 understandable-0.860
endra-0.833
Tokensee
Token ID3826
Feature activation+4.38e-01

Pos logit contributions
INESS+0.385
+0.375
 Breaker+0.336
 understandable+0.320
endra+0.308
Neg logit contributions
 cens-0.326
hd-0.316
asionally-0.311
arning-0.311
cffff-0.304
Token above
Token ID2029
Feature activation+1.65e+00

Pos logit contributions
INESS+0.805
+0.778
 Breaker+0.674
 understandable+0.635
endra+0.603
Neg logit contributions
 cens-1.052
hd-1.022
asionally-1.012
arning-1.009
cffff-0.988
Token)
Token ID8
Feature activation+2.03e-01

Pos logit contributions
INESS+3.375
+3.283
 Breaker+2.934
 understandable+2.775
endra+2.652
Neg logit contributions
 cens-3.497
hd-3.461
arning-3.420
asionally-3.409
cffff-3.396
Token is
Token ID318
Feature activation+5.46e-02

Pos logit contributions
INESS+0.439
+0.426
 Breaker+0.379
 understandable+0.359
endra+0.345
Neg logit contributions
 cens-0.424
hd-0.409
asionally-0.405
arning-0.403
cffff-0.394
Token a
Token ID257
Feature activation+8.59e-02

Pos logit contributions
INESS+0.116
+0.112
 Breaker+0.098
 understandable+0.093
endra+0.090
Neg logit contributions
 cens-0.117
hd-0.113
asionally-0.112
arning-0.111
cffff-0.109
Token tribute
Token ID17547
Feature activation-6.29e-01

Pos logit contributions
INESS+0.180
+0.175
 Breaker+0.154
 understandable+0.146
endra+0.140
Neg logit contributions
 cens-0.186
hd-0.179
asionally-0.177
arning-0.176
cffff-0.173
Token to
Token ID284
Feature activation+1.90e-01

Pos logit contributions
 cens+1.231
hd+1.176
asionally+1.163
arning+1.158
collar+1.124
Neg logit contributions
INESS-1.446
-1.406
 Breaker-1.258
 understandable-1.191
endra-1.150
Token email
Token ID3053
Feature activation-6.04e-01

Pos logit contributions
 cens+0.090
hd+0.087
asionally+0.086
arning+0.086
cffff+0.084
Neg logit contributions
INESS-0.076
-0.074
 Breaker-0.065
 understandable-0.060
endra-0.058
Token from
Token ID422
Feature activation-2.70e-02

Pos logit contributions
 cens+1.121
hd+1.077
asionally+1.057
arning+1.051
collar+1.019
Neg logit contributions
INESS-1.443
-1.416
 Breaker-1.266
 understandable-1.203
endra-1.169
Token Eva
Token ID32355
Feature activation-4.23e-02

Pos logit contributions
 cens+0.059
hd+0.057
asionally+0.056
arning+0.056
cffff+0.055
Neg logit contributions
INESS-0.056
-0.055
 Breaker-0.048
 understandable-0.045
endra-0.044
Token (
Token ID357
Feature activation+1.03e-01

Pos logit contributions
 cens+0.087
hd+0.085
arning+0.083
asionally+0.083
cffff+0.081
Neg logit contributions
INESS-0.092
-0.090
 Breaker-0.079
 understandable-0.075
endra-0.072
Tokensee
Token ID3826
Feature activation+3.75e-02

Pos logit contributions
INESS+0.231
+0.225
 Breaker+0.202
 understandable+0.192
endra+0.184
Neg logit contributions
 cens-0.206
hd-0.201
arning-0.197
asionally-0.197
cffff-0.192
Token above
Token ID2029
Feature activation+1.64e+00

Pos logit contributions
INESS+0.073
+0.071
 Breaker+0.061
 understandable+0.058
endra+0.055
Neg logit contributions
 cens-0.087
hd-0.084
asionally-0.083
arning-0.082
cffff-0.080
Token).
Token ID737
Feature activation-4.44e-02

Pos logit contributions
INESS+3.458
+3.358
 Breaker+3.004
 understandable+2.865
endra+2.738
Neg logit contributions
 cens-3.393
hd-3.323
asionally-3.298
arning-3.297
cffff-3.259
Token Could
Token ID10347
Feature activation+7.10e-02

Pos logit contributions
 cens+0.094
hd+0.092
asionally+0.090
arning+0.089
collar+0.087
Neg logit contributions
INESS-0.094
-0.091
 Breaker-0.080
 understandable-0.076
endra-0.074
Token your
Token ID534
Feature activation+2.69e-01

Pos logit contributions
INESS+0.145
+0.141
 Breaker+0.124
 understandable+0.116
endra+0.112
Neg logit contributions
 cens-0.157
hd-0.152
asionally-0.150
arning-0.149
cffff-0.147
Token HS
Token ID18070
Feature activation+1.94e-01

Pos logit contributions
INESS+0.775
+0.760
 Breaker+0.697
 understandable+0.670
endra+0.652
Neg logit contributions
 cens-0.367
hd-0.349
arning-0.340
asionally-0.340
cffff-0.330
Token comment
Token ID2912
Feature activation+6.28e-01

Pos logit contributions
INESS+0.391
+0.380
 Breaker+0.334
 understandable+0.313
endra+0.300
Neg logit contributions
 cens-0.435
hd-0.420
asionally-0.414
arning-0.413
cffff-0.406
TokenWe
Token ID1135
Feature activation+1.20e-01

Pos logit contributions
INESS+0.737
+0.718
 Breaker+0.644
 understandable+0.613
endra+0.588
Neg logit contributions
 cens-0.639
hd-0.619
arning-0.610
asionally-0.610
cffff-0.597
Token have
Token ID423
Feature activation+1.77e-01

Pos logit contributions
INESS+0.268
+0.261
 Breaker+0.235
 understandable+0.224
endra+0.215
Neg logit contributions
 cens-0.239
hd-0.232
arning-0.229
asionally-0.228
cffff-0.225
Token taken
Token ID2077
Feature activation-5.15e-02

Pos logit contributions
INESS+0.371
+0.362
 Breaker+0.320
 understandable+0.302
endra+0.290
Neg logit contributions
 cens-0.378
hd-0.366
asionally-0.361
arning-0.361
cffff-0.353
Token students
Token ID2444
Feature activation-2.30e-01

Pos logit contributions
 cens+0.114
hd+0.111
asionally+0.109
arning+0.109
cffff+0.106
Neg logit contributions
INESS-0.104
-0.102
 Breaker-0.089
 understandable-0.084
endra-0.081
Token to
Token ID284
Feature activation-2.78e-01

Pos logit contributions
 cens+0.440
hd+0.425
arning+0.419
asionally+0.419
cffff+0.409
Neg logit contributions
INESS-0.536
-0.524
 Breaker-0.469
 understandable-0.447
endra-0.431
Token Dell
Token ID23617
Feature activation+1.61e+00

Pos logit contributions
 cens+0.603
hd+0.584
asionally+0.576
arning+0.573
cffff+0.558
Neg logit contributions
INESS-0.577
-0.565
 Breaker-0.491
 understandable-0.465
endra-0.451
Token headquarters
Token ID10043
Feature activation-1.66e-01

Pos logit contributions
INESS+3.422
+3.312
 Breaker+2.989
 understandable+2.811
endra+2.700
Neg logit contributions
 cens-3.321
hd-3.231
asionally-3.213
arning-3.172
cffff-3.143
Token in
Token ID287
Feature activation-1.56e-01

Pos logit contributions
 cens+0.346
hd+0.335
arning+0.331
asionally+0.331
cffff+0.324
Neg logit contributions
INESS-0.357
-0.348
 Breaker-0.310
 understandable-0.294
endra-0.282
Token Austin
Token ID9533
Feature activation+9.65e-02

Pos logit contributions
 cens+0.349
hd+0.338
arning+0.334
asionally+0.332
collar+0.325
Neg logit contributions
INESS-0.314
-0.305
 Breaker-0.265
 understandable-0.251
endra-0.241
Token,
Token ID11
Feature activation+1.42e-01

Pos logit contributions
INESS+0.207
+0.201
 Breaker+0.179
 understandable+0.170
endra+0.162
Neg logit contributions
 cens-0.202
hd-0.196
asionally-0.193
arning-0.193
cffff-0.190
Token and
Token ID290
Feature activation+1.50e-01

Pos logit contributions
INESS+0.276
+0.268
 Breaker+0.235
 understandable+0.222
endra+0.211
Neg logit contributions
 cens-0.323
hd-0.315
arning-0.311
asionally-0.311
cffff-0.305
Token
Token ID198
Feature activation+2.25e-01

Pos logit contributions
 cens+0.176
hd+0.174
asionally+0.170
arning+0.168
cffff+0.164
Neg logit contributions
INESS-0.185
-0.179
 Breaker-0.156
 understandable-0.151
endra-0.145
Token
Token ID198
Feature activation+1.08e-01

Pos logit contributions
INESS+0.512
+0.501
 Breaker+0.446
 understandable+0.424
endra+0.409
Neg logit contributions
 cens-0.441
hd-0.432
arning-0.420
asionally-0.420
collar-0.408
TokenAs
Token ID1722
Feature activation-9.16e-02

Pos logit contributions
INESS+0.235
+0.229
 Breaker+0.206
 understandable+0.195
endra+0.186
Neg logit contributions
 cens-0.218
hd-0.215
asionally-0.210
arning-0.210
cffff-0.205
Token the
Token ID262
Feature activation+4.10e-02

Pos logit contributions
 cens+0.200
hd+0.196
asionally+0.193
arning+0.192
cffff+0.188
Neg logit contributions
INESS-0.187
-0.183
 Breaker-0.160
 understandable-0.151
endra-0.145
Token screenshot
Token ID22032
Feature activation-2.26e-01

Pos logit contributions
INESS+0.081
+0.079
 Breaker+0.069
 understandable+0.065
endra+0.062
Neg logit contributions
 cens-0.093
hd-0.090
arning-0.089
asionally-0.089
cffff-0.087
Token above
Token ID2029
Feature activation+1.60e+00

Pos logit contributions
 cens+0.566
hd+0.549
asionally+0.540
arning+0.539
collar+0.526
Neg logit contributions
INESS-0.396
-0.387
 Breaker-0.329
 understandable-0.306
endra-0.293
Token shows
Token ID2523
Feature activation+1.61e-01

Pos logit contributions
INESS+2.548
+2.442
 Breaker+2.118
 understandable+2.006
endra+1.859
Neg logit contributions
 cens-4.066
hd-4.015
arning-4.006
asionally-4.000
cffff-3.992
Token there
Token ID612
Feature activation-6.33e-02

Pos logit contributions
INESS+0.342
+0.334
 Breaker+0.297
 understandable+0.280
endra+0.270
Neg logit contributions
 cens-0.338
hd-0.328
asionally-0.323
arning-0.321
cffff-0.316
Token is
Token ID318
Feature activation+1.15e-01

Pos logit contributions
 cens+0.130
hd+0.125
asionally+0.124
arning+0.124
cffff+0.120
Neg logit contributions
INESS-0.139
-0.135
 Breaker-0.121
 understandable-0.114
endra-0.110
Token an
Token ID281
Feature activation+1.64e-01

Pos logit contributions
INESS+0.232
+0.228
 Breaker+0.198
 understandable+0.185
endra+0.179
Neg logit contributions
 cens-0.254
hd-0.247
asionally-0.243
arning-0.242
cffff-0.237
Token
Token ID564
Feature activation+1.46e-01

Pos logit contributions
INESS+0.301
+0.291
 Breaker+0.249
 understandable+0.228
endra+0.224
Neg logit contributions
 cens-0.396
hd-0.387
asionally-0.383
arning-0.380
cffff-0.372
Token a
Token ID257
Feature activation-3.59e-01

Pos logit contributions
 cens+0.941
hd+0.911
asionally+0.898
arning+0.896
cffff+0.876
Neg logit contributions
INESS-1.024
-1.001
 Breaker-0.887
 understandable-0.843
endra-0.812
Token laptop
Token ID13224
Feature activation-2.84e-01

Pos logit contributions
 cens+0.739
hd+0.718
arning+0.705
asionally+0.704
cffff+0.689
Neg logit contributions
INESS-0.783
-0.765
 Breaker-0.676
 understandable-0.642
endra-0.621
Token suspend
Token ID22004
Feature activation-2.93e-01

Pos logit contributions
 cens+0.583
hd+0.569
arning+0.557
asionally+0.556
collar+0.539
Neg logit contributions
INESS-0.616
-0.604
 Breaker-0.533
 understandable-0.509
endra-0.493
Token on
Token ID319
Feature activation-2.35e-01

Pos logit contributions
 cens+0.607
hd+0.590
asionally+0.581
arning+0.580
cffff+0.566
Neg logit contributions
INESS-0.633
-0.618
 Breaker-0.545
 understandable-0.518
endra-0.500
Token Ubuntu
Token ID14949
Feature activation+7.54e-01

Pos logit contributions
 cens+0.487
hd+0.473
asionally+0.465
arning+0.464
cffff+0.454
Neg logit contributions
INESS-0.512
-0.498
 Breaker-0.440
 understandable-0.418
endra-0.402
Token above
Token ID2029
Feature activation+1.59e+00

Pos logit contributions
INESS+1.602
+1.563
 Breaker+1.393
 understandable+1.320
endra+1.260
Neg logit contributions
 cens-1.579
hd-1.533
arning-1.520
asionally-1.516
cffff-1.493
Token and
Token ID290
Feature activation+1.49e-01

Pos logit contributions
INESS+3.237
+3.162
 Breaker+2.837
 understandable+2.707
endra+2.522
Neg logit contributions
 cens-3.367
hd-3.263
arning-3.258
asionally-3.255
cffff-3.216
Token lo
Token ID2376
Feature activation+7.58e-02

Pos logit contributions
INESS+0.310
+0.302
 Breaker+0.268
 understandable+0.254
endra+0.242
Neg logit contributions
 cens-0.320
hd-0.310
arning-0.307
asionally-0.307
cffff-0.301
Token and
Token ID290
Feature activation-1.79e-01

Pos logit contributions
INESS+0.177
+0.173
 Breaker+0.155
 understandable+0.148
endra+0.142
Neg logit contributions
 cens-0.144
hd-0.139
arning-0.137
asionally-0.137
cffff-0.134
Token behold
Token ID23700
Feature activation+3.00e-01

Pos logit contributions
 cens+0.371
hd+0.360
arning+0.351
asionally+0.350
cffff+0.345
Neg logit contributions
INESS-0.393
-0.381
 Breaker-0.334
 understandable-0.317
endra-0.307
Token,
Token ID11
Feature activation-6.93e-02

Pos logit contributions
INESS+0.623
+0.606
 Breaker+0.537
 understandable+0.508
endra+0.486
Neg logit contributions
 cens-0.646
hd-0.628
asionally-0.620
arning-0.620
cffff-0.608
Token He
Token ID679
Feature activation+2.05e-02

Pos logit contributions
INESS+0.537
+0.522
 Breaker+0.459
 understandable+0.440
endra+0.422
Neg logit contributions
 cens-0.528
hd-0.518
arning-0.507
asionally-0.507
collar-0.493
Token became
Token ID2627
Feature activation+2.82e-01

Pos logit contributions
INESS+0.047
+0.046
 Breaker+0.041
 understandable+0.039
endra+0.038
Neg logit contributions
 cens-0.040
hd-0.039
arning-0.038
asionally-0.038
cffff-0.037
Token an
Token ID281
Feature activation-1.39e-01

Pos logit contributions
INESS+0.562
+0.544
 Breaker+0.478
 understandable+0.450
endra+0.431
Neg logit contributions
 cens-0.634
hd-0.615
arning-0.608
asionally-0.608
cffff-0.595
Token unrestricted
Token ID36263
Feature activation+5.84e-01

Pos logit contributions
 cens+0.304
hd+0.294
asionally+0.293
arning+0.292
cffff+0.282
Neg logit contributions
INESS-0.287
-0.277
 Breaker-0.243
 understandable-0.225
endra-0.220
Token free
Token ID1479
Feature activation+4.65e-01

Pos logit contributions
INESS+1.329
+1.295
 Breaker+1.159
 understandable+1.103
endra+1.063
Neg logit contributions
 cens-1.145
hd-1.108
arning-1.094
asionally-1.092
cffff-1.067
Token agent
Token ID5797
Feature activation+1.58e+00

Pos logit contributions
INESS+0.750
+0.717
 Breaker+0.609
 understandable+0.565
endra+0.536
Neg logit contributions
 cens-1.228
hd-1.199
arning-1.184
asionally-1.182
cffff-1.163
Token at
Token ID379
Feature activation-4.47e-01

Pos logit contributions
INESS+3.387
+3.308
 Breaker+3.011
 understandable+2.891
endra+2.711
Neg logit contributions
 cens-3.139
hd-3.102
arning-3.053
asionally-3.049
cffff-2.987
Token the
Token ID262
Feature activation-1.54e-01

Pos logit contributions
 cens+1.013
arning+0.974
hd+0.967
asionally+0.951
collar+0.940
Neg logit contributions
INESS-0.904
-0.879
 Breaker-0.745
 understandable-0.690
endra-0.686
Token end
Token ID886
Feature activation-7.37e-01

Pos logit contributions
 cens+0.353
arning+0.336
hd+0.336
asionally+0.333
collar+0.325
Neg logit contributions
INESS-0.305
-0.297
 Breaker-0.253
 understandable-0.237
endra-0.234
Token of
Token ID286
Feature activation-6.85e-02

Pos logit contributions
 cens+1.396
hd+1.372
arning+1.312
asionally+1.305
collar+1.282
Neg logit contributions
INESS-1.721
-1.697
 Breaker-1.488
 understandable-1.402
endra-1.369
Token the
Token ID262
Feature activation-7.90e-02

Pos logit contributions
 cens+0.149
hd+0.143
arning+0.141
asionally+0.140
collar+0.137
Neg logit contributions
INESS-0.145
-0.141
 Breaker-0.122
 understandable-0.114
endra-0.111
Token.
Token ID13
Feature activation+1.73e-02

Pos logit contributions
INESS+0.042
+0.041
 Breaker+0.036
 understandable+0.034
endra+0.033
Neg logit contributions
 cens-0.038
hd-0.037
asionally-0.036
arning-0.036
cffff-0.036
Token The
Token ID383
Feature activation+1.49e-01

Pos logit contributions
INESS+0.037
+0.036
 Breaker+0.031
 understandable+0.030
endra+0.029
Neg logit contributions
 cens-0.036
hd-0.036
asionally-0.035
arning-0.035
cffff-0.034
Token GDP
Token ID12396
Feature activation-7.56e-01

Pos logit contributions
INESS+0.296
+0.288
 Breaker+0.251
 understandable+0.234
endra+0.226
Neg logit contributions
 cens-0.343
hd-0.329
asionally-0.325
arning-0.324
cffff-0.316
Token map
Token ID3975
Feature activation+2.56e-01

Pos logit contributions
 cens+1.646
hd+1.573
arning+1.529
asionally+1.522
cffff+1.490
Neg logit contributions
INESS-1.574
-1.547
 Breaker-1.352
 understandable-1.279
endra-1.235
Token mentioned
Token ID4750
Feature activation+2.21e-01

Pos logit contributions
INESS+0.537
+0.523
 Breaker+0.460
 understandable+0.435
endra+0.421
Neg logit contributions
 cens-0.555
hd-0.535
asionally-0.526
arning-0.524
cffff-0.512
Token above
Token ID2029
Feature activation+1.57e+00

Pos logit contributions
INESS+0.371
+0.361
 Breaker+0.306
 understandable+0.285
endra+0.270
Neg logit contributions
 cens-0.567
hd-0.549
asionally-0.543
arning-0.541
cffff-0.534
Token is
Token ID318
Feature activation+2.70e-02

Pos logit contributions
INESS+3.061
+2.983
 Breaker+2.646
 understandable+2.532
endra+2.368
Neg logit contributions
 cens-3.452
hd-3.403
arning-3.396
asionally-3.374
cffff-3.353
Token also
Token ID635
Feature activation+2.75e-01

Pos logit contributions
INESS+0.056
+0.055
 Breaker+0.048
 understandable+0.045
endra+0.044
Neg logit contributions
 cens-0.059
hd-0.057
asionally-0.056
arning-0.056
cffff-0.055
Token really
Token ID1107
Feature activation+3.70e-02

Pos logit contributions
INESS+0.541
+0.525
 Breaker+0.460
 understandable+0.431
endra+0.414
Neg logit contributions
 cens-0.628
hd-0.609
asionally-0.602
arning-0.601
cffff-0.590
Token just
Token ID655
Feature activation-3.41e-01

Pos logit contributions
INESS+0.067
+0.065
 Breaker+0.056
 understandable+0.051
endra+0.050
Neg logit contributions
 cens-0.091
hd-0.088
arning-0.086
asionally-0.086
cffff-0.085
Token a
Token ID257
Feature activation+1.13e-01

Pos logit contributions
 cens+0.804
hd+0.776
arning+0.761
asionally+0.760
cffff+0.746
Neg logit contributions
INESS-0.652
-0.635
 Breaker-0.551
 understandable-0.513
endra-0.495
Token
Token ID198
Feature activation-2.62e-03

Pos logit contributions
INESS+0.231
+0.225
 Breaker+0.202
 understandable+0.192
endra+0.184
Neg logit contributions
 cens-0.206
hd-0.200
arning-0.197
asionally-0.197
cffff-0.193
Token
Token ID198
Feature activation-2.23e-02

Pos logit contributions
 cens+0.005
hd+0.005
asionally+0.005
arning+0.005
collar+0.005
Neg logit contributions
INESS-0.006
-0.006
 Breaker-0.005
 understandable-0.005
endra-0.005
TokenAccording
Token ID4821
Feature activation-8.09e-01

Pos logit contributions
 cens+0.044
hd+0.043
asionally+0.043
arning+0.043
cffff+0.042
Neg logit contributions
INESS-0.050
-0.049
 Breaker-0.044
 understandable-0.041
endra-0.040
Token to
Token ID284
Feature activation-1.40e-01

Pos logit contributions
 cens+1.534
hd+1.504
asionally+1.467
arning+1.451
cffff+1.403
Neg logit contributions
INESS-1.882
-1.842
 Breaker-1.642
 understandable-1.545
endra-1.485
Token the
Token ID262
Feature activation+1.39e-01

Pos logit contributions
 cens+0.306
hd+0.297
asionally+0.292
arning+0.291
cffff+0.285
Neg logit contributions
INESS-0.291
-0.282
 Breaker-0.248
 understandable-0.234
endra-0.225
Token above
Token ID2029
Feature activation+1.57e+00

Pos logit contributions
INESS+0.292
+0.284
 Breaker+0.251
 understandable+0.237
endra+0.228
Neg logit contributions
 cens-0.298
hd-0.290
asionally-0.285
arning-0.285
cffff-0.279
Token terminal
Token ID12094
Feature activation+8.82e-02

Pos logit contributions
INESS+3.382
+3.303
 Breaker+2.989
 understandable+2.873
endra+2.704
Neg logit contributions
 cens-3.110
asionally-3.042
hd-3.025
arning-3.013
cffff-2.991
Token output
Token ID5072
Feature activation+4.40e-01

Pos logit contributions
INESS+0.195
+0.190
 Breaker+0.170
 understandable+0.161
endra+0.155
Neg logit contributions
 cens-0.178
hd-0.173
arning-0.170
asionally-0.170
cffff-0.167
Token,
Token ID11
Feature activation+1.63e-01

Pos logit contributions
INESS+0.917
+0.891
 Breaker+0.790
 understandable+0.749
endra+0.716
Neg logit contributions
 cens-0.943
hd-0.916
arning-0.905
asionally-0.905
cffff-0.887
Token there
Token ID612
Feature activation-4.95e-02

Pos logit contributions
INESS+0.333
+0.324
 Breaker+0.287
 understandable+0.272
endra+0.258
Neg logit contributions
 cens-0.355
hd-0.345
arning-0.341
asionally-0.341
cffff-0.335
Token isn
Token ID2125
Feature activation+2.46e-01

Pos logit contributions
 cens+0.103
hd+0.100
arning+0.098
asionally+0.098
cffff+0.096
Neg logit contributions
INESS-0.107
-0.104
 Breaker-0.093
 understandable-0.088
endra-0.085
Token've
Token ID1053
Feature activation+3.65e-02

Pos logit contributions
INESS+0.212
+0.207
 Breaker+0.187
 understandable+0.178
endra+0.172
Neg logit contributions
 cens-0.166
hd-0.160
arning-0.158
asionally-0.158
cffff-0.154
Token also
Token ID635
Feature activation+4.38e-01

Pos logit contributions
INESS+0.083
+0.080
 Breaker+0.072
 understandable+0.068
endra+0.066
Neg logit contributions
 cens-0.072
hd-0.070
asionally-0.069
arning-0.069
cffff-0.067
Token included
Token ID3017
Feature activation+1.23e-01

Pos logit contributions
INESS+1.028
+1.004
 Breaker+0.907
 understandable+0.868
endra+0.831
Neg logit contributions
 cens-0.816
hd-0.793
arning-0.785
asionally-0.782
cffff-0.772
Token a
Token ID257
Feature activation-9.37e-02

Pos logit contributions
INESS+0.244
+0.237
 Breaker+0.206
 understandable+0.194
endra+0.186
Neg logit contributions
 cens-0.283
hd-0.274
arning-0.269
asionally-0.269
cffff-0.261
Token comparison
Token ID7208
Feature activation+5.65e-01

Pos logit contributions
 cens+0.201
hd+0.194
arning+0.191
asionally+0.190
cffff+0.185
Neg logit contributions
INESS-0.198
-0.194
 Breaker-0.170
 understandable-0.161
endra-0.155
Token above
Token ID2029
Feature activation+1.56e+00

Pos logit contributions
INESS+1.278
+1.245
 Breaker+1.115
 understandable+1.061
endra+1.019
Neg logit contributions
 cens-1.115
hd-1.080
arning-1.065
asionally-1.065
cffff-1.042
Token against
Token ID1028
Feature activation+4.86e-01

Pos logit contributions
INESS+3.299
+3.185
 Breaker+2.940
 understandable+2.799
endra+2.624
Neg logit contributions
 cens-3.123
hd-3.088
asionally-3.086
arning-3.072
cffff-3.049
Token 3
Token ID513
Feature activation-2.92e-01

Pos logit contributions
INESS+0.896
+0.868
 Breaker+0.750
 understandable+0.704
endra+0.671
Neg logit contributions
 cens-1.171
hd-1.140
asionally-1.124
arning-1.122
cffff-1.101
Token other
Token ID584
Feature activation-2.99e-02

Pos logit contributions
 cens+0.623
hd+0.610
asionally+0.595
arning+0.594
cffff+0.580
Neg logit contributions
INESS-0.619
-0.600
 Breaker-0.533
 understandable-0.502
endra-0.484
Token very
Token ID845
Feature activation+1.71e-01

Pos logit contributions
 cens+0.060
hd+0.058
asionally+0.057
arning+0.057
cffff+0.055
Neg logit contributions
INESS-0.067
-0.065
 Breaker-0.058
 understandable-0.055
endra-0.053
Token fast
Token ID3049
Feature activation+4.67e-01

Pos logit contributions
INESS+0.294
+0.283
 Breaker+0.242
 understandable+0.222
endra+0.215
Neg logit contributions
 cens-0.435
hd-0.423
asionally-0.416
arning-0.415
cffff-0.406
Token adults
Token ID6490
Feature activation+5.24e-02

Pos logit contributions
INESS+0.782
+0.758
 Breaker+0.673
 understandable+0.639
endra+0.614
Neg logit contributions
 cens-0.732
hd-0.704
arning-0.694
asionally-0.691
cffff-0.674
Token considered
Token ID3177
Feature activation+2.99e-02

Pos logit contributions
INESS+0.119
+0.116
 Breaker+0.104
 understandable+0.098
endra+0.095
Neg logit contributions
 cens-0.103
hd-0.100
arning-0.098
asionally-0.098
cffff-0.095
Token themselves
Token ID2405
Feature activation-5.10e-01

Pos logit contributions
INESS+0.062
+0.060
 Breaker+0.052
 understandable+0.049
endra+0.048
Neg logit contributions
 cens-0.066
hd-0.063
arning-0.063
asionally-0.062
cffff-0.061
Token to
Token ID284
Feature activation-2.68e-01

Pos logit contributions
 cens+1.021
hd+0.980
arning+0.959
asionally+0.955
collar+0.929
Neg logit contributions
INESS-1.156
-1.124
 Breaker-0.996
 understandable-0.943
OE-0.908
Token be
Token ID307
Feature activation-1.11e-01

Pos logit contributions
 cens+0.575
hd+0.556
arning+0.548
asionally+0.547
cffff+0.532
Neg logit contributions
INESS-0.563
-0.547
 Breaker-0.485
 understandable-0.457
endra-0.436
Token veget
Token ID7494
Feature activation+1.56e+00

Pos logit contributions
 cens+0.221
hd+0.214
arning+0.210
asionally+0.210
cffff+0.205
Neg logit contributions
INESS-0.252
-0.246
 Breaker-0.219
 understandable-0.208
endra-0.200
Tokenarians
Token ID13517
Feature activation+4.88e-01

Pos logit contributions
INESS+3.576
+3.563
 Breaker+3.239
 understandable+3.113
endra+3.080
Neg logit contributions
 cens-2.815
cffff-2.686
hd-2.660
arning-2.634
asionally-2.626
Token,
Token ID11
Feature activation-6.64e-02

Pos logit contributions
INESS+1.020
+0.992
 Breaker+0.881
 understandable+0.836
endra+0.797
Neg logit contributions
 cens-1.041
hd-1.013
asionally-1.003
arning-1.001
cffff-0.984
Token and
Token ID290
Feature activation+1.45e-01

Pos logit contributions
 cens+0.132
hd+0.127
asionally+0.125
arning+0.125
cffff+0.122
Neg logit contributions
INESS-0.151
-0.147
 Breaker-0.131
 understandable-0.124
endra-0.119
Token 2
Token ID362
Feature activation+1.21e-01

Pos logit contributions
INESS+0.316
+0.308
 Breaker+0.272
 understandable+0.257
endra+0.248
Neg logit contributions
 cens-0.302
hd-0.291
asionally-0.286
arning-0.286
cffff-0.279
Token%
Token ID4
Feature activation-4.28e-01

Pos logit contributions
INESS+0.279
+0.271
 Breaker+0.244
 understandable+0.233
endra+0.224
Neg logit contributions
 cens-0.233
hd-0.225
arning-0.222
asionally-0.221
cffff-0.215
Token the
Token ID262
Feature activation-2.10e-01

Pos logit contributions
 cens+0.957
hd+0.932
asionally+0.920
arning+0.919
cffff+0.902
Neg logit contributions
INESS-0.859
-0.838
 Breaker-0.734
 understandable-0.692
endra-0.660
Token first
Token ID717
Feature activation-9.83e-02

Pos logit contributions
 cens+0.432
hd+0.418
asionally+0.412
arning+0.412
cffff+0.404
Neg logit contributions
INESS-0.458
-0.445
 Breaker-0.395
 understandable-0.374
endra-0.360
Token few
Token ID1178
Feature activation+1.18e-01

Pos logit contributions
 cens+0.222
hd+0.216
asionally+0.213
arning+0.212
cffff+0.208
Neg logit contributions
INESS-0.196
-0.190
 Breaker-0.165
 understandable-0.156
endra-0.150
Token hours
Token ID2250
Feature activation-2.63e-01

Pos logit contributions
INESS+0.224
+0.217
 Breaker+0.190
 understandable+0.178
endra+0.170
Neg logit contributions
 cens-0.277
hd-0.270
asionally-0.267
arning-0.266
cffff-0.262
Token mentioned
Token ID4750
Feature activation+3.58e-01

Pos logit contributions
 cens+0.547
hd+0.532
asionally+0.524
arning+0.521
cffff+0.511
Neg logit contributions
INESS-0.567
-0.552
 Breaker-0.488
 understandable-0.462
endra-0.446
Token above
Token ID2029
Feature activation+1.56e+00

Pos logit contributions
INESS+0.577
+0.562
 Breaker+0.475
 understandable+0.441
endra+0.414
Neg logit contributions
 cens-0.938
hd-0.914
asionally-0.905
arning-0.902
cffff-0.891
Token,
Token ID11
Feature activation+9.22e-02

Pos logit contributions
INESS+3.055
+2.974
 Breaker+2.629
 understandable+2.501
endra+2.354
Neg logit contributions
 cens-3.464
hd-3.359
arning-3.340
asionally-3.323
cffff-3.273
Token your
Token ID534
Feature activation+1.22e-01

Pos logit contributions
INESS+0.188
+0.183
 Breaker+0.162
 understandable+0.153
endra+0.146
Neg logit contributions
 cens-0.202
hd-0.196
asionally-0.194
arning-0.194
cffff-0.190
Token body
Token ID1767
Feature activation+5.56e-01

Pos logit contributions
INESS+0.293
+0.287
 Breaker+0.259
 understandable+0.248
endra+0.238
Neg logit contributions
 cens-0.222
hd-0.215
asionally-0.212
arning-0.212
cffff-0.207
Token goes
Token ID2925
Feature activation-1.12e-01

Pos logit contributions
INESS+1.101
+1.070
 Breaker+0.944
 understandable+0.893
endra+0.847
Neg logit contributions
 cens-1.246
hd-1.214
arning-1.202
asionally-1.199
cffff-1.180
Token into
Token ID656
Feature activation+6.94e-03

Pos logit contributions
 cens+0.210
hd+0.203
arning+0.200
asionally+0.200
cffff+0.196
Neg logit contributions
INESS-0.266
-0.259
 Breaker-0.233
 understandable-0.222
endra-0.214
Token at
Token ID379
Feature activation-6.06e-02

Pos logit contributions
INESS+0.154
+0.150
 Breaker+0.130
 understandable+0.121
endra+0.118
Neg logit contributions
 cens-0.173
hd-0.167
asionally-0.165
arning-0.165
cffff-0.161
Token M
Token ID337
Feature activation+8.56e-03

Pos logit contributions
 cens+0.132
hd+0.127
asionally+0.126
arning+0.126
cffff+0.123
Neg logit contributions
INESS-0.126
-0.122
 Breaker-0.106
 understandable-0.100
endra-0.097
TokenTF
Token ID10234
Feature activation-1.02e-01

Pos logit contributions
INESS+0.017
+0.017
 Breaker+0.015
 understandable+0.014
endra+0.013
Neg logit contributions
 cens-0.019
hd-0.019
asionally-0.018
arning-0.018
cffff-0.018
Token50
Token ID1120
Feature activation+9.56e-02

Pos logit contributions
 cens+0.195
hd+0.189
arning+0.186
asionally+0.185
cffff+0.181
Neg logit contributions
INESS-0.236
-0.231
 Breaker-0.206
 understandable-0.197
endra-0.189
Token or
Token ID393
Feature activation-1.54e-01

Pos logit contributions
INESS+0.213
+0.207
 Breaker+0.184
 understandable+0.175
endra+0.168
Neg logit contributions
 cens-0.193
hd-0.187
arning-0.184
asionally-0.184
cffff-0.179
Token above
Token ID2029
Feature activation+1.55e+00

Pos logit contributions
 cens+0.358
hd+0.346
asionally+0.341
arning+0.341
cffff+0.334
Neg logit contributions
INESS-0.297
-0.288
 Breaker-0.250
 understandable-0.233
endra-0.225
Token at
Token ID379
Feature activation-1.44e-01

Pos logit contributions
INESS+3.313
+3.218
 Breaker+2.894
 understandable+2.765
endra+2.626
Neg logit contributions
 cens-3.189
hd-3.122
asionally-3.089
arning-3.076
cffff-3.032
Token 18
Token ID1248
Feature activation+5.03e-02

Pos logit contributions
 cens+0.306
hd+0.295
arning+0.292
asionally+0.291
cffff+0.284
Neg logit contributions
INESS-0.305
-0.297
 Breaker-0.261
 understandable-0.246
endra-0.238
Tokenmm
Token ID3020
Feature activation+2.99e-01

Pos logit contributions
INESS+0.109
+0.105
 Breaker+0.094
 understandable+0.089
endra+0.085
Neg logit contributions
 cens-0.105
hd-0.102
asionally-0.100
arning-0.100
cffff-0.098
Token.
Token ID13
Feature activation+1.91e-02

Pos logit contributions
INESS+0.665
+0.647
 Breaker+0.578
 understandable+0.549
endra+0.528
Neg logit contributions
 cens-0.600
hd-0.581
asionally-0.573
arning-0.572
cffff-0.560
Token With
Token ID2080
Feature activation-9.95e-02

Pos logit contributions
INESS+0.041
+0.040
 Breaker+0.035
 understandable+0.034
endra+0.032
Neg logit contributions
 cens-0.040
hd-0.039
asionally-0.038
arning-0.038
collar-0.037
Token are
Token ID389
Feature activation-1.78e-01

Pos logit contributions
INESS+0.560
+0.543
 Breaker+0.490
 understandable+0.467
endra+0.448
Neg logit contributions
 cens-0.499
hd-0.485
arning-0.479
asionally-0.478
cffff-0.470
Token a
Token ID257
Feature activation+2.15e-02

Pos logit contributions
 cens+0.377
hd+0.363
arning+0.359
asionally+0.359
cffff+0.350
Neg logit contributions
INESS-0.381
-0.374
 Breaker-0.329
 understandable-0.310
endra-0.298
Token cos
Token ID8615
Feature activation+2.40e-01

Pos logit contributions
INESS+0.050
+0.049
 Breaker+0.044
 understandable+0.042
endra+0.040
Neg logit contributions
 cens-0.041
hd-0.040
arning-0.039
asionally-0.039
cffff-0.038
Tokenplayer
Token ID7829
Feature activation-9.74e-02

Pos logit contributions
INESS+0.599
+0.585
 Breaker+0.529
 understandable+0.506
endra+0.488
Neg logit contributions
 cens-0.419
hd-0.406
arning-0.400
asionally-0.400
cffff-0.388
Token featured
Token ID8096
Feature activation+1.92e-01

Pos logit contributions
 cens+0.205
hd+0.198
asionally+0.196
arning+0.195
cffff+0.191
Neg logit contributions
INESS-0.208
-0.203
 Breaker-0.180
 understandable-0.170
endra-0.163
Token above
Token ID2029
Feature activation+1.55e+00

Pos logit contributions
INESS+0.398
+0.389
 Breaker+0.343
 understandable+0.325
endra+0.310
Neg logit contributions
 cens-0.415
hd-0.403
asionally-0.398
arning-0.398
cffff-0.390
Token and
Token ID290
Feature activation+9.79e-02

Pos logit contributions
INESS+3.046
+2.919
 Breaker+2.614
 understandable+2.479
endra+2.346
Neg logit contributions
 cens-3.444
hd-3.383
arning-3.345
asionally-3.334
cffff-3.291
Token you
Token ID345
Feature activation+3.10e-01

Pos logit contributions
INESS+0.209
+0.204
 Breaker+0.181
 understandable+0.171
endra+0.164
Neg logit contributions
 cens-0.206
hd-0.200
asionally-0.197
arning-0.197
cffff-0.193
Token have
Token ID423
Feature activation+1.94e-02

Pos logit contributions
INESS+0.678
+0.658
 Breaker+0.591
 understandable+0.563
endra+0.540
Neg logit contributions
 cens-0.626
hd-0.609
arning-0.602
asionally-0.599
cffff-0.591
Token a
Token ID257
Feature activation-6.66e-02

Pos logit contributions
INESS+0.037
+0.036
 Breaker+0.031
 understandable+0.029
endra+0.027
Neg logit contributions
 cens-0.046
hd-0.045
asionally-0.044
arning-0.044
cffff-0.043
Token link
Token ID2792
Feature activation-8.31e-04

Pos logit contributions
 cens+0.146
hd+0.141
asionally+0.139
arning+0.139
cffff+0.135
Neg logit contributions
INESS-0.138
-0.134
 Breaker-0.117
 understandable-0.111
endra-0.106
Token
Token ID198
Feature activation-4.45e-02

Pos logit contributions
 cens+0.040
hd+0.040
arning+0.038
asionally+0.038
collar+0.037
Neg logit contributions
INESS-0.048
-0.047
 Breaker-0.042
 understandable-0.040
endra-0.039
Tokenpur
Token ID14225
Feature activation-3.37e-01

Pos logit contributions
 cens+0.086
hd+0.083
arning+0.082
asionally+0.082
cffff+0.080
Neg logit contributions
INESS-0.103
-0.100
 Breaker-0.090
 understandable-0.086
endra-0.082
Tokenak
Token ID461
Feature activation+4.52e-02

Pos logit contributions
 cens+0.520
hd+0.508
asionally+0.494
arning+0.494
cffff+0.480
Neg logit contributions
INESS-0.911
-0.887
 Breaker-0.809
 understandable-0.778
endra-0.746
Tokenushi
Token ID17731
Feature activation-9.21e-02

Pos logit contributions
INESS+0.103
+0.100
 Breaker+0.090
 understandable+0.086
endra+0.082
Neg logit contributions
 cens-0.088
hd-0.086
asionally-0.084
arning-0.084
cffff-0.082
Token Profile
Token ID13118
Feature activation-2.36e-01

Pos logit contributions
 cens+0.179
hd+0.174
arning+0.171
asionally+0.171
cffff+0.167
Neg logit contributions
INESS-0.211
-0.205
 Breaker-0.184
 understandable-0.175
endra-0.168
Token Joined
Token ID16709
Feature activation-1.76e+00

Pos logit contributions
 cens+0.524
hd+0.508
arning+0.502
asionally+0.501
cffff+0.491
Neg logit contributions
INESS-0.478
-0.464
 Breaker-0.406
 understandable-0.385
endra-0.367
Token August
Token ID2932
Feature activation-2.31e-01

Pos logit contributions
 cens+3.616
hd+3.490
asionally+3.381
arning+3.378
collar+3.251
Neg logit contributions
INESS-3.836
-3.716
 Breaker-3.170
 understandable-3.020
endra-2.957
Token 2012
Token ID2321
Feature activation-3.83e-02

Pos logit contributions
 cens+0.475
hd+0.463
asionally+0.454
arning+0.454
cffff+0.439
Neg logit contributions
INESS-0.503
-0.488
 Breaker-0.429
 understandable-0.412
endra-0.393
Token United
Token ID1578
Feature activation-7.29e-01

Pos logit contributions
 cens+0.077
hd+0.075
arning+0.073
asionally+0.073
cffff+0.071
Neg logit contributions
INESS-0.085
-0.083
 Breaker-0.073
 understandable-0.070
endra-0.067
Token States
Token ID1829
Feature activation-8.89e-02

Pos logit contributions
 cens+1.448
hd+1.405
arning+1.381
asionally+1.369
cffff+1.344
Neg logit contributions
INESS-1.636
-1.597
 Breaker-1.412
 understandable-1.351
endra-1.290
Token 3
Token ID513
Feature activation-6.24e-01

Pos logit contributions
 cens+0.161
hd+0.156
arning+0.153
asionally+0.152
cffff+0.148
Neg logit contributions
INESS-0.215
-0.210
 Breaker-0.189
 understandable-0.181
endra-0.175
Token,
Token ID11
Feature activation-1.13e-01

Pos logit contributions
 cens+0.270
hd+0.262
asionally+0.258
arning+0.257
cffff+0.252
Neg logit contributions
INESS-0.304
-0.296
 Breaker-0.264
 understandable-0.251
endra-0.241
Token though
Token ID996
Feature activation-7.43e-02

Pos logit contributions
 cens+0.231
hd+0.223
asionally+0.219
arning+0.219
cffff+0.215
Neg logit contributions
INESS-0.251
-0.245
 Breaker-0.217
 understandable-0.204
endra-0.199
Token.
Token ID13
Feature activation-1.45e-01

Pos logit contributions
 cens+0.163
hd+0.156
asionally+0.155
arning+0.154
cffff+0.150
Neg logit contributions
INESS-0.155
-0.150
 Breaker-0.131
 understandable-0.121
endra-0.120
Token
Token ID447
Feature activation-9.21e-02

Pos logit contributions
 cens+0.305
hd+0.299
asionally+0.293
arning+0.292
cffff+0.284
Neg logit contributions
INESS-0.307
-0.298
 Breaker-0.261
 understandable-0.249
endra-0.241
Token
Token ID250
Feature activation-6.22e-03

Pos logit contributions
 cens+0.199
hd+0.195
arning+0.192
asionally+0.192
cffff+0.187
Neg logit contributions
INESS-0.191
-0.184
 Breaker-0.163
 understandable-0.155
endra-0.148
TokenInstead
Token ID13193
Feature activation-1.70e+00

Pos logit contributions
 cens+0.012
hd+0.012
arning+0.012
asionally+0.012
cffff+0.012
Neg logit contributions
INESS-0.014
-0.013
 Breaker-0.012
 understandable-0.011
endra-0.011
Token of
Token ID286
Feature activation-5.49e-02

Pos logit contributions
 cens+3.338
hd+3.208
asionally+3.180
arning+3.149
cffff+3.095
Neg logit contributions
INESS-3.839
-3.691
 Breaker-3.320
 understandable-3.107
endra-3.043
Token saying
Token ID2282
Feature activation+2.34e-01

Pos logit contributions
 cens+0.116
hd+0.112
arning+0.110
asionally+0.110
cffff+0.108
Neg logit contributions
INESS-0.117
-0.114
 Breaker-0.101
 understandable-0.096
endra-0.092
Token,
Token ID11
Feature activation+1.28e-01

Pos logit contributions
INESS+0.468
+0.454
 Breaker+0.399
 understandable+0.375
endra+0.360
Neg logit contributions
 cens-0.526
hd-0.510
asionally-0.504
arning-0.503
cffff-0.493
Token black
Token ID2042
Feature activation+8.64e-02

Pos logit contributions
INESS+0.252
+0.245
 Breaker+0.214
 understandable+0.201
endra+0.193
Neg logit contributions
 cens-0.292
hd-0.284
asionally-0.281
arning-0.280
cffff-0.275
Token or
Token ID393
Feature activation-2.43e-01

Pos logit contributions
INESS+0.181
+0.176
 Breaker+0.156
 understandable+0.147
endra+0.141
Neg logit contributions
 cens-0.186
hd-0.180
asionally-0.178
arning-0.177
cffff-0.173
Token)
Token ID8
Feature activation+1.42e-01

Pos logit contributions
 cens+0.355
hd+0.345
arning+0.341
asionally+0.339
cffff+0.330
Neg logit contributions
INESS-0.398
-0.388
 Breaker-0.347
 understandable-0.330
endra-0.316
Token
Token ID198
Feature activation-2.77e-01

Pos logit contributions
INESS+0.313
+0.304
 Breaker+0.272
 understandable+0.259
endra+0.248
Neg logit contributions
 cens-0.287
hd-0.279
arning-0.275
asionally-0.275
cffff-0.269
Token
Token ID198
Feature activation-2.85e-01

Pos logit contributions
 cens+0.528
hd+0.521
arning+0.503
asionally+0.502
collar+0.489
Neg logit contributions
INESS-0.639
-0.625
 Breaker-0.559
 understandable-0.529
endra-0.512
TokenAndre
Token ID31258
Feature activation+1.64e-01

Pos logit contributions
 cens+0.547
hd+0.538
asionally+0.528
arning+0.527
cffff+0.513
Neg logit contributions
INESS-0.650
-0.632
 Breaker-0.573
 understandable-0.543
endra-0.522
Token R
Token ID371
Feature activation+8.07e-02

Pos logit contributions
INESS+0.400
+0.389
 Breaker+0.349
 understandable+0.336
endra+0.320
Neg logit contributions
 cens-0.296
hd-0.289
arning-0.283
asionally-0.282
collar-0.270
Tokeningu
Token ID6680
Feature activation-1.67e+00

Pos logit contributions
INESS+0.195
+0.190
 Breaker+0.172
 understandable+0.164
endra+0.159
Neg logit contributions
 cens-0.146
hd-0.140
asionally-0.139
arning-0.139
cffff-0.136
Tokenette
Token ID5857
Feature activation-6.62e-02

Pos logit contributions
 cens+3.945
hd+3.882
asionally+3.780
arning+3.766
collar+3.704
Neg logit contributions
INESS-3.060
-2.970
 Breaker-2.576
 understandable-2.414
OE-2.234
Token/
Token ID14
Feature activation-6.56e-02

Pos logit contributions
 cens+0.140
hd+0.136
arning+0.134
asionally+0.134
cffff+0.131
Neg logit contributions
INESS-0.140
-0.136
 Breaker-0.121
 understandable-0.115
endra-0.110
TokenGetty
Token ID6633
Feature activation-3.91e-01

Pos logit contributions
 cens+0.115
hd+0.111
arning+0.109
asionally+0.109
cffff+0.106
Neg logit contributions
INESS-0.163
-0.159
 Breaker-0.144
 understandable-0.138
endra-0.133
Token Images
Token ID5382
Feature activation+6.07e-01

Pos logit contributions
 cens+0.680
hd+0.657
arning+0.648
asionally+0.647
cffff+0.625
Neg logit contributions
INESS-0.975
-0.950
 Breaker-0.860
 understandable-0.824
endra-0.795
Token
Token ID198
Feature activation-2.80e-01

Pos logit contributions
INESS+1.308
+1.274
 Breaker+1.133
 understandable+1.076
endra+1.028
Neg logit contributions
 cens-1.261
hd-1.222
asionally-1.207
arning-1.206
cffff-1.185
Token of
Token ID286
Feature activation+6.98e-02

Pos logit contributions
 cens+0.816
hd+0.795
arning+0.769
asionally+0.769
collar+0.752
Neg logit contributions
INESS-0.924
-0.909
 Breaker-0.796
 understandable-0.752
endra-0.733
Token Charleston
Token ID26070
Feature activation+1.25e-01

Pos logit contributions
INESS+0.143
+0.139
 Breaker+0.121
 understandable+0.114
endra+0.111
Neg logit contributions
 cens-0.154
hd-0.149
arning-0.147
asionally-0.147
cffff-0.144
Token.
Token ID13
Feature activation+7.74e-03

Pos logit contributions
INESS+0.276
+0.269
 Breaker+0.240
 understandable+0.229
endra+0.220
Neg logit contributions
 cens-0.251
hd-0.243
asionally-0.240
arning-0.240
cffff-0.235
Token
Token ID198
Feature activation-1.28e-01

Pos logit contributions
INESS+0.017
+0.016
 Breaker+0.014
 understandable+0.014
endra+0.013
Neg logit contributions
 cens-0.016
hd-0.016
asionally-0.015
arning-0.015
cffff-0.015
Token
Token ID198
Feature activation-1.61e-01

Pos logit contributions
 cens+0.248
hd+0.244
asionally+0.237
arning+0.237
collar+0.230
Neg logit contributions
INESS-0.295
-0.288
 Breaker-0.257
 understandable-0.244
endra-0.236
TokenRather
Token ID27202
Feature activation-1.62e+00

Pos logit contributions
 cens+0.314
hd+0.308
asionally+0.302
arning+0.301
cffff+0.293
Neg logit contributions
INESS-0.364
-0.356
 Breaker-0.320
 understandable-0.303
endra-0.290
Token than
Token ID621
Feature activation-1.75e-01

Pos logit contributions
 cens+2.662
hd+2.541
asionally+2.522
arning+2.483
collar+2.373
Neg logit contributions
INESS-4.202
-4.104
 Breaker-3.734
 understandable-3.490
endra-3.435
Token releg
Token ID24542
Feature activation-7.38e-02

Pos logit contributions
 cens+0.360
hd+0.348
asionally+0.343
arning+0.343
cffff+0.335
Neg logit contributions
INESS-0.384
-0.374
 Breaker-0.333
 understandable-0.315
endra-0.303
Tokenating
Token ID803
Feature activation+2.69e-01

Pos logit contributions
 cens+0.119
hd+0.115
arning+0.113
asionally+0.113
cffff+0.110
Neg logit contributions
INESS-0.193
-0.189
 Breaker-0.172
 understandable-0.165
endra-0.159
Token slavery
Token ID13503
Feature activation+1.81e-03

Pos logit contributions
INESS+0.542
+0.526
 Breaker+0.464
 understandable+0.438
endra+0.418
Neg logit contributions
 cens-0.598
hd-0.581
asionally-0.574
arning-0.573
cffff-0.562
Token to
Token ID284
Feature activation+7.83e-02

Pos logit contributions
INESS+0.004
+0.004
 Breaker+0.003
 understandable+0.003
endra+0.003
Neg logit contributions
 cens-0.004
hd-0.004
asionally-0.004
arning-0.004
cffff-0.003
Token did
Token ID750
Feature activation-1.42e-02

Pos logit contributions
INESS+0.138
+0.135
 Breaker+0.121
 understandable+0.115
endra+0.111
Neg logit contributions
 cens-0.116
hd-0.112
asionally-0.111
arning-0.110
cffff-0.108
Token not
Token ID407
Feature activation+9.96e-02

Pos logit contributions
 cens+0.029
hd+0.028
asionally+0.028
arning+0.027
cffff+0.027
Neg logit contributions
INESS-0.031
-0.031
 Breaker-0.027
 understandable-0.025
endra-0.025
Token.
Token ID13
Feature activation+3.29e-02

Pos logit contributions
INESS+0.223
+0.217
 Breaker+0.194
 understandable+0.183
endra+0.176
Neg logit contributions
 cens-0.201
hd-0.194
asionally-0.191
arning-0.191
cffff-0.187
Token
Token ID198
Feature activation+9.75e-02

Pos logit contributions
INESS+0.072
+0.070
 Breaker+0.061
 understandable+0.059
endra+0.057
Neg logit contributions
 cens-0.067
hd-0.066
asionally-0.064
arning-0.064
cffff-0.062
Token
Token ID198
Feature activation+2.70e-02

Pos logit contributions
INESS+0.221
+0.216
 Breaker+0.193
 understandable+0.183
endra+0.176
Neg logit contributions
 cens-0.191
hd-0.187
arning-0.182
asionally-0.182
collar-0.177
TokenInstead
Token ID13193
Feature activation-1.61e+00

Pos logit contributions
INESS+0.061
+0.059
 Breaker+0.053
 understandable+0.051
endra+0.049
Neg logit contributions
 cens-0.053
hd-0.052
arning-0.051
asionally-0.051
cffff-0.049
Token,
Token ID11
Feature activation+1.89e-01

Pos logit contributions
 cens+3.182
hd+3.022
asionally+3.006
arning+2.983
cffff+2.925
Neg logit contributions
INESS-3.574
-3.479
 Breaker-3.099
 understandable-2.877
endra-2.812
Token Republicans
Token ID4734
Feature activation-2.21e-01

Pos logit contributions
INESS+0.363
+0.352
 Breaker+0.306
 understandable+0.285
endra+0.274
Neg logit contributions
 cens-0.443
hd-0.428
asionally-0.424
arning-0.424
cffff-0.416
Token spent
Token ID3377
Feature activation-1.02e-01

Pos logit contributions
 cens+0.430
hd+0.413
asionally+0.406
arning+0.406
cffff+0.395
Neg logit contributions
INESS-0.513
-0.501
 Breaker-0.447
 understandable-0.423
endra-0.408
Token their
Token ID511
Feature activation+1.27e-01

Pos logit contributions
 cens+0.238
hd+0.229
asionally+0.226
arning+0.226
cffff+0.221
Neg logit contributions
INESS-0.198
-0.193
 Breaker-0.166
 understandable-0.155
endra-0.150
Token recess
Token ID28836
Feature activation+4.94e-01

Pos logit contributions
INESS+0.270
+0.263
 Breaker+0.230
 understandable+0.216
endra+0.211
Neg logit contributions
 cens-0.274
hd-0.263
asionally-0.260
arning-0.259
cffff-0.252
Token at
Token ID379
Feature activation-2.67e-01

Pos logit contributions
 cens+1.156
hd+1.109
asionally+1.091
arning+1.087
cffff+1.061
Neg logit contributions
INESS-1.340
-1.320
 Breaker-1.168
 understandable-1.097
endra-1.074
Token once
Token ID1752
Feature activation+5.88e-01

Pos logit contributions
 cens+0.647
hd+0.625
arning+0.620
asionally+0.618
cffff+0.608
Neg logit contributions
INESS-0.488
-0.475
 Breaker-0.408
 understandable-0.378
endra-0.364
Token.
Token ID13
Feature activation-3.14e-01

Pos logit contributions
INESS+1.312
+1.278
 Breaker+1.144
 understandable+1.091
endra+1.043
Neg logit contributions
 cens-1.171
hd-1.137
asionally-1.123
arning-1.123
cffff-1.100
Token
Token ID198
Feature activation-2.70e-01

Pos logit contributions
 cens+0.624
hd+0.613
asionally+0.596
arning+0.592
collar+0.579
Neg logit contributions
INESS-0.698
-0.681
 Breaker-0.600
 understandable-0.579
endra-0.555
Token
Token ID198
Feature activation-2.74e-01

Pos logit contributions
 cens+0.514
hd+0.508
arning+0.489
asionally+0.488
collar+0.475
Neg logit contributions
INESS-0.627
-0.615
 Breaker-0.548
 understandable-0.519
endra-0.502
TokenRather
Token ID27202
Feature activation-1.58e+00

Pos logit contributions
 cens+0.528
hd+0.521
asionally+0.508
arning+0.507
collar+0.493
Neg logit contributions
INESS-0.623
-0.611
 Breaker-0.550
 understandable-0.521
endra-0.498
Token than
Token ID621
Feature activation-2.43e-01

Pos logit contributions
 cens+2.542
hd+2.400
asionally+2.391
arning+2.331
collar+2.260
Neg logit contributions
INESS-4.158
-4.054
 Breaker-3.665
 understandable-3.443
endra-3.399
Token a
Token ID257
Feature activation+7.51e-03

Pos logit contributions
 cens+0.494
hd+0.474
asionally+0.467
arning+0.467
cffff+0.455
Neg logit contributions
INESS-0.540
-0.528
 Breaker-0.469
 understandable-0.443
endra-0.427
Token purely
Token ID14177
Feature activation-1.79e-02

Pos logit contributions
INESS+0.016
+0.015
 Breaker+0.013
 understandable+0.013
endra+0.012
Neg logit contributions
 cens-0.016
hd-0.016
arning-0.015
asionally-0.015
cffff-0.015
Token academic
Token ID8233
Feature activation-5.66e-02

Pos logit contributions
 cens+0.039
hd+0.037
arning+0.037
asionally+0.037
cffff+0.036
Neg logit contributions
INESS-0.038
-0.037
 Breaker-0.032
 understandable-0.030
endra-0.029
Token exercise
Token ID5517
Feature activation+6.79e-01

Pos logit contributions
 cens+0.127
hd+0.122
asionally+0.120
arning+0.120
collar+0.117
Neg logit contributions
INESS-0.114
-0.112
 Breaker-0.097
 understandable-0.091
endra-0.088
Token costs
Token ID3484
Feature activation+2.89e-01

Pos logit contributions
INESS+0.360
+0.352
 Breaker+0.319
 understandable+0.300
endra+0.292
Neg logit contributions
 cens-0.264
hd-0.251
arning-0.246
asionally-0.245
cffff-0.239
Token,
Token ID11
Feature activation+9.57e-02

Pos logit contributions
INESS+0.647
+0.631
 Breaker+0.564
 understandable+0.535
endra+0.514
Neg logit contributions
 cens-0.578
hd-0.559
arning-0.551
asionally-0.550
cffff-0.539
Token he
Token ID339
Feature activation+1.26e-01

Pos logit contributions
INESS+0.206
+0.200
 Breaker+0.177
 understandable+0.167
endra+0.160
Neg logit contributions
 cens-0.202
hd-0.194
arning-0.191
asionally-0.191
cffff-0.187
Token noted
Token ID4367
Feature activation+7.53e-01

Pos logit contributions
INESS+0.280
+0.272
 Breaker+0.244
 understandable+0.231
endra+0.222
Neg logit contributions
 cens-0.254
hd-0.247
arning-0.243
asionally-0.243
cffff-0.238
Token.
Token ID13
Feature activation-1.57e-01

Pos logit contributions
INESS+1.644
+1.595
 Breaker+1.415
 understandable+1.358
endra+1.296
Neg logit contributions
 cens-1.527
hd-1.487
asionally-1.469
arning-1.465
cffff-1.437
Token Instead
Token ID5455
Feature activation-1.57e+00

Pos logit contributions
 cens+0.318
hd+0.313
asionally+0.302
arning+0.302
collar+0.296
Neg logit contributions
INESS-0.344
-0.335
 Breaker-0.292
 understandable-0.284
endra-0.270
Token,
Token ID11
Feature activation+6.33e-02

Pos logit contributions
 cens+3.127
hd+2.949
asionally+2.915
arning+2.870
collar+2.833
Neg logit contributions
INESS-3.499
-3.396
 Breaker-3.063
 understandable-2.835
endra-2.763
Token society
Token ID3592
Feature activation-4.81e-01

Pos logit contributions
INESS+0.123
+0.120
 Breaker+0.104
 understandable+0.098
endra+0.094
Neg logit contributions
 cens-0.146
hd-0.141
asionally-0.139
arning-0.138
cffff-0.136
Token picks
Token ID11103
Feature activation+4.24e-01

Pos logit contributions
 cens+1.027
hd+0.984
asionally+0.970
arning+0.966
cffff+0.943
Neg logit contributions
INESS-1.025
-0.997
 Breaker-0.884
 understandable-0.830
endra-0.799
Token up
Token ID510
Feature activation-4.19e-01

Pos logit contributions
INESS+0.848
+0.823
 Breaker+0.726
 understandable+0.688
endra+0.655
Neg logit contributions
 cens-0.945
hd-0.920
arning-0.910
asionally-0.909
cffff-0.893
Token the
Token ID262
Feature activation+1.13e-01

Pos logit contributions
 cens+0.975
hd+0.941
asionally+0.927
arning+0.926
cffff+0.905
Neg logit contributions
INESS-0.807
-0.789
 Breaker-0.689
 understandable-0.637
endra-0.612
Token
Token ID247
Feature activation+4.62e-02

Pos logit contributions
INESS+0.209
+0.203
 Breaker+0.177
 understandable+0.167
endra+0.158
Neg logit contributions
 cens-0.261
hd-0.254
arning-0.251
asionally-0.251
cffff-0.247
Tokent
Token ID83
Feature activation-1.11e-01

Pos logit contributions
INESS+0.123
+0.120
 Breaker+0.110
 understandable+0.105
endra+0.102
Neg logit contributions
 cens-0.073
hd-0.070
asionally-0.069
arning-0.069
cffff-0.067
Token.
Token ID13
Feature activation+1.54e-01

Pos logit contributions
 cens+0.218
hd+0.211
asionally+0.208
arning+0.208
cffff+0.204
Neg logit contributions
INESS-0.251
-0.245
 Breaker-0.220
 understandable-0.209
endra-0.200
Token
Token ID198
Feature activation-1.20e-01

Pos logit contributions
INESS+0.340
+0.331
 Breaker+0.294
 understandable+0.281
endra+0.269
Neg logit contributions
 cens-0.311
hd-0.303
asionally-0.297
arning-0.297
cffff-0.290
Token
Token ID198
Feature activation-1.78e-01

Pos logit contributions
 cens+0.231
hd+0.227
arning+0.220
asionally+0.219
collar+0.213
Neg logit contributions
INESS-0.276
-0.270
 Breaker-0.241
 understandable-0.229
endra-0.221
TokenInstead
Token ID13193
Feature activation-1.56e+00

Pos logit contributions
 cens+0.344
hd+0.339
asionally+0.331
arning+0.330
cffff+0.320
Neg logit contributions
INESS-0.405
-0.397
 Breaker-0.357
 understandable-0.338
endra-0.324
Token,
Token ID11
Feature activation+1.41e-01

Pos logit contributions
 cens+3.116
hd+2.965
asionally+2.945
arning+2.895
cffff+2.836
Neg logit contributions
INESS-3.462
-3.326
 Breaker-2.986
 understandable-2.802
endra-2.729
Token the
Token ID262
Feature activation+9.93e-02

Pos logit contributions
INESS+0.285
+0.277
 Breaker+0.244
 understandable+0.229
endra+0.220
Neg logit contributions
 cens-0.312
hd-0.302
asionally-0.298
arning-0.297
cffff-0.292
Token job
Token ID1693
Feature activation-5.26e-01

Pos logit contributions
INESS+0.206
+0.201
 Breaker+0.176
 understandable+0.165
endra+0.160
Neg logit contributions
 cens-0.218
hd-0.207
asionally-0.206
arning-0.205
cffff-0.201
Token fell
Token ID3214
Feature activation+1.69e-01

Pos logit contributions
 cens+1.041
hd+1.000
arning+0.978
asionally+0.975
collar+0.948
Neg logit contributions
INESS-1.187
-1.171
 Breaker-1.036
 understandable-0.980
endra-0.946
Token to
Token ID284
Feature activation+9.34e-02

Pos logit contributions
INESS+0.358
+0.350
 Breaker+0.312
 understandable+0.298
endra+0.286
Neg logit contributions
 cens-0.351
hd-0.341
arning-0.339
asionally-0.338
cffff-0.332
Tokent
Token ID83
Feature activation-2.13e-01

Pos logit contributions
 cens+0.180
hd+0.173
arning+0.170
asionally+0.169
cffff+0.164
Neg logit contributions
INESS-0.315
-0.308
 Breaker-0.281
 understandable-0.269
endra-0.261
Token bad
Token ID2089
Feature activation+2.10e-01

Pos logit contributions
 cens+0.451
hd+0.435
asionally+0.427
arning+0.427
cffff+0.417
Neg logit contributions
INESS-0.455
-0.445
 Breaker-0.392
 understandable-0.367
endra-0.356
Token at
Token ID379
Feature activation-3.19e-01

Pos logit contributions
INESS+0.461
+0.448
 Breaker+0.400
 understandable+0.381
endra+0.365
Neg logit contributions
 cens-0.426
hd-0.413
asionally-0.408
arning-0.408
cffff-0.400
Token all
Token ID477
Feature activation+2.31e-01

Pos logit contributions
 cens+0.658
hd+0.636
arning+0.629
asionally+0.628
cffff+0.614
Neg logit contributions
INESS-0.695
-0.677
 Breaker-0.600
 understandable-0.571
endra-0.550
Token.
Token ID13
Feature activation-1.06e-01

Pos logit contributions
INESS+0.518
+0.505
 Breaker+0.452
 understandable+0.430
endra+0.413
Neg logit contributions
 cens-0.457
hd-0.443
asionally-0.437
arning-0.437
cffff-0.428
Token Rather
Token ID11317
Feature activation-1.52e+00

Pos logit contributions
 cens+0.223
hd+0.218
asionally+0.215
arning+0.214
cffff+0.208
Neg logit contributions
INESS-0.222
-0.216
 Breaker-0.190
 understandable-0.182
endra-0.174
Token,
Token ID11
Feature activation+7.54e-02

Pos logit contributions
 cens+3.067
hd+2.942
asionally+2.909
arning+2.867
collar+2.807
Neg logit contributions
INESS-3.326
-3.227
 Breaker-2.846
 understandable-2.652
endra-2.612
Token it
Token ID340
Feature activation+3.73e-02

Pos logit contributions
INESS+0.155
+0.151
 Breaker+0.133
 understandable+0.126
endra+0.120
Neg logit contributions
 cens-0.164
hd-0.159
asionally-0.158
arning-0.157
cffff-0.154
Token is
Token ID318
Feature activation-7.05e-02

Pos logit contributions
INESS+0.077
+0.074
 Breaker+0.066
 understandable+0.062
endra+0.060
Neg logit contributions
 cens-0.081
hd-0.079
asionally-0.078
arning-0.078
cffff-0.076
Token
Token ID1399
Feature activation-7.50e-02

Pos logit contributions
 cens+0.160
hd+0.155
asionally+0.153
arning+0.153
cffff+0.150
Neg logit contributions
INESS-0.139
-0.135
 Breaker-0.118
 understandable-0.111
endra-0.107
Tokenwell
Token ID4053
Feature activation-4.25e-01

Pos logit contributions
 cens+0.157
hd+0.153
asionally+0.150
arning+0.150
cffff+0.146
Neg logit contributions
INESS-0.161
-0.157
 Breaker-0.139
 understandable-0.131
endra-0.126
Token action
Token ID2223
Feature activation+2.25e-01

Pos logit contributions
INESS+0.290
+0.281
 Breaker+0.247
 understandable+0.233
endra+0.222
Neg logit contributions
 cens-0.336
hd-0.327
asionally-0.323
arning-0.323
cffff-0.317
Token not
Token ID407
Feature activation+8.66e-02

Pos logit contributions
INESS+0.488
+0.475
 Breaker+0.424
 understandable+0.403
endra+0.386
Neg logit contributions
 cens-0.462
hd-0.449
arning-0.443
asionally-0.443
cffff-0.434
Token only
Token ID691
Feature activation+5.22e-01

Pos logit contributions
INESS+0.166
+0.161
 Breaker+0.141
 understandable+0.134
endra+0.127
Neg logit contributions
 cens-0.200
hd-0.195
asionally-0.193
arning-0.193
cffff-0.189
Token for
Token ID329
Feature activation+8.70e-02

Pos logit contributions
INESS+1.118
+1.089
 Breaker+0.979
 understandable+0.935
endra+0.887
Neg logit contributions
 cens-1.074
hd-1.047
arning-1.034
asionally-1.033
cffff-1.012
Token the
Token ID262
Feature activation-1.72e-02

Pos logit contributions
INESS+0.165
+0.160
 Breaker+0.140
 understandable+0.132
endra+0.125
Neg logit contributions
 cens-0.203
hd-0.197
arning-0.195
asionally-0.195
cffff-0.192
Token balance
Token ID5236
Feature activation-1.51e+00

Pos logit contributions
 cens+0.039
hd+0.038
asionally+0.037
arning+0.037
cffff+0.036
Neg logit contributions
INESS-0.034
-0.033
 Breaker-0.029
 understandable-0.027
endra-0.026
Token of
Token ID286
Feature activation+1.27e-01

Pos logit contributions
 cens+2.956
hd+2.873
arning+2.785
asionally+2.769
collar+2.735
Neg logit contributions
INESS-3.403
-3.325
 Breaker-2.920
 understandable-2.785
endra-2.713
Token the
Token ID262
Feature activation+1.37e-01

Pos logit contributions
INESS+0.237
+0.230
 Breaker+0.203
 understandable+0.191
endra+0.180
Neg logit contributions
 cens-0.297
hd-0.290
arning-0.287
asionally-0.287
cffff-0.283
Token 2017
Token ID2177
Feature activation-3.82e-02

Pos logit contributions
INESS+0.295
+0.287
 Breaker+0.256
 understandable+0.243
endra+0.232
Neg logit contributions
 cens-0.286
hd-0.277
arning-0.274
asionally-0.274
cffff-0.269
Token season
Token ID1622
Feature activation+2.54e-03

Pos logit contributions
 cens+0.078
hd+0.076
asionally+0.075
arning+0.075
cffff+0.073
Neg logit contributions
INESS-0.084
-0.081
 Breaker-0.072
 understandable-0.069
endra-0.066
Token but
Token ID475
Feature activation-1.05e-01

Pos logit contributions
INESS+0.005
+0.005
 Breaker+0.005
 understandable+0.004
endra+0.004
Neg logit contributions
 cens-0.005
hd-0.005
asionally-0.005
arning-0.005
cffff-0.005
Token cold
Token ID4692
Feature activation-8.05e-02

Pos logit contributions
INESS+0.295
+0.287
 Breaker+0.254
 understandable+0.239
endra+0.228
Neg logit contributions
 cens-0.302
hd-0.291
asionally-0.287
arning-0.286
cffff-0.280
Token on
Token ID319
Feature activation-1.79e-01

Pos logit contributions
 cens+0.158
hd+0.153
asionally+0.150
arning+0.150
cffff+0.146
Neg logit contributions
INESS-0.183
-0.179
 Breaker-0.160
 understandable-0.152
endra-0.146
Token the
Token ID262
Feature activation-6.14e-02

Pos logit contributions
 cens+0.389
hd+0.374
asionally+0.369
arning+0.368
cffff+0.361
Neg logit contributions
INESS-0.375
-0.363
 Breaker-0.318
 understandable-0.300
endra-0.289
Token Bruins
Token ID27607
Feature activation+1.08e-01

Pos logit contributions
 cens+0.132
hd+0.127
arning+0.126
asionally+0.126
cffff+0.123
Neg logit contributions
INESS-0.129
-0.126
 Breaker-0.110
 understandable-0.105
endra-0.101
Token.
Token ID13
Feature activation-1.09e-01

Pos logit contributions
INESS+0.246
+0.240
 Breaker+0.216
 understandable+0.205
endra+0.197
Neg logit contributions
 cens-0.209
hd-0.203
asionally-0.201
arning-0.200
cffff-0.196
Token Instead
Token ID5455
Feature activation-1.51e+00

Pos logit contributions
 cens+0.218
hd+0.214
asionally+0.209
arning+0.208
cffff+0.201
Neg logit contributions
INESS-0.240
-0.234
 Breaker-0.206
 understandable-0.197
endra-0.190
Token,
Token ID11
Feature activation+1.66e-01

Pos logit contributions
 cens+3.042
hd+2.910
asionally+2.889
arning+2.874
cffff+2.829
Neg logit contributions
INESS-3.330
-3.214
 Breaker-2.856
 understandable-2.663
endra-2.611
Token Allow
Token ID22507
Feature activation-1.38e-01

Pos logit contributions
INESS+0.339
+0.329
 Breaker+0.290
 understandable+0.273
endra+0.262
Neg logit contributions
 cens-0.365
hd-0.354
asionally-0.350
arning-0.349
cffff-0.343
Tokenay
Token ID323
Feature activation+3.40e-01

Pos logit contributions
 cens+0.328
hd+0.325
asionally+0.317
arning+0.316
cffff+0.309
Neg logit contributions
INESS-0.257
-0.249
 Breaker-0.216
 understandable-0.200
endra-0.189
Token committed
Token ID5364
Feature activation+6.90e-03

Pos logit contributions
INESS+0.756
+0.739
 Breaker+0.662
 understandable+0.633
endra+0.605
Neg logit contributions
 cens-0.673
hd-0.654
arning-0.649
asionally-0.646
cffff-0.635
Token,
Token ID11
Feature activation-1.36e-02

Pos logit contributions
INESS+0.015
+0.015
 Breaker+0.013
 understandable+0.013
endra+0.012
Neg logit contributions
 cens-0.014
hd-0.013
asionally-0.013
arning-0.013
cffff-0.013
Token
Token ID198
Feature activation-2.48e-02

Pos logit contributions
INESS+0.028
+0.027
 Breaker+0.024
 understandable+0.023
endra+0.022
Neg logit contributions
 cens-0.024
hd-0.023
arning-0.022
asionally-0.022
collar-0.022
Tokenre
Token ID260
Feature activation-2.07e-01

Pos logit contributions
 cens+0.048
hd+0.047
arning+0.046
asionally+0.046
cffff+0.045
Neg logit contributions
INESS-0.057
-0.055
 Breaker-0.050
 understandable-0.047
endra-0.045
Tokeni
Token ID72
Feature activation-1.76e-01

Pos logit contributions
 cens+0.392
hd+0.385
arning+0.377
asionally+0.377
cffff+0.367
Neg logit contributions
INESS-0.482
-0.469
 Breaker-0.422
 understandable-0.402
endra-0.384
Token Profile
Token ID13118
Feature activation+3.00e-02

Pos logit contributions
 cens+0.334
hd+0.326
asionally+0.320
arning+0.320
cffff+0.312
Neg logit contributions
INESS-0.410
-0.400
 Breaker-0.359
 understandable-0.343
endra-0.328
Token Blog
Token ID14001
Feature activation-8.03e-01

Pos logit contributions
INESS+0.059
+0.057
 Breaker+0.050
 understandable+0.048
endra+0.045
Neg logit contributions
 cens-0.068
hd-0.066
arning-0.065
asionally-0.065
cffff-0.064
Token Joined
Token ID16709
Feature activation-1.50e+00

Pos logit contributions
 cens+1.643
hd+1.598
arning+1.573
asionally+1.554
cffff+1.537
Neg logit contributions
INESS-1.771
-1.708
 Breaker-1.501
 understandable-1.434
OE-1.377
Token October
Token ID3267
Feature activation+1.60e-01

Pos logit contributions
 cens+3.135
hd+3.008
asionally+2.944
arning+2.930
collar+2.823
Neg logit contributions
INESS-3.226
-3.120
 Breaker-2.677
 understandable-2.538
endra-2.478
Token 2002
Token ID6244
Feature activation+2.47e-01

Pos logit contributions
INESS+0.341
+0.331
 Breaker+0.294
 understandable+0.279
endra+0.267
Neg logit contributions
 cens-0.336
hd-0.326
asionally-0.322
arning-0.322
cffff-0.315
Token United
Token ID1578
Feature activation-5.72e-01

Pos logit contributions
INESS+0.554
+0.539
 Breaker+0.481
 understandable+0.458
endra+0.440
Neg logit contributions
 cens-0.490
hd-0.475
arning-0.468
asionally-0.467
cffff-0.457
Token States
Token ID1829
Feature activation+7.44e-02

Pos logit contributions
 cens+1.156
hd+1.122
arning+1.107
asionally+1.101
cffff+1.080
Neg logit contributions
INESS-1.265
-1.230
 Breaker-1.093
 understandable-1.044
endra-0.998
Token 3
Token ID513
Feature activation-4.87e-01

Pos logit contributions
INESS+0.179
+0.175
 Breaker+0.157
 understandable+0.151
endra+0.145
Neg logit contributions
 cens-0.136
hd-0.131
arning-0.129
asionally-0.129
cffff-0.126
Token "
Token ID366
Feature activation-1.08e-01

Pos logit contributions
 cens+0.294
hd+0.285
asionally+0.280
arning+0.280
cffff+0.275
Neg logit contributions
INESS-0.262
-0.256
 Breaker-0.223
 understandable-0.211
endra-0.203
Tokenmass
Token ID22208
Feature activation+4.82e-01

Pos logit contributions
 cens+0.192
hd+0.187
asionally+0.183
arning+0.183
cffff+0.179
Neg logit contributions
INESS-0.263
-0.257
 Breaker-0.232
 understandable-0.222
endra-0.213
Tokenively
Token ID2280
Feature activation+1.02e-01

Pos logit contributions
INESS+0.934
+0.904
 Breaker+0.795
 understandable+0.750
endra+0.712
Neg logit contributions
 cens-1.106
hd-1.076
arning-1.065
asionally-1.065
cffff-1.045
Token-
Token ID12
Feature activation+1.55e-01

Pos logit contributions
INESS+0.199
+0.193
 Breaker+0.169
 understandable+0.158
endra+0.151
Neg logit contributions
 cens-0.235
hd-0.228
arning-0.225
asionally-0.224
cffff-0.220
Tokenmulti
Token ID41684
Feature activation+1.91e-01

Pos logit contributions
INESS+0.387
+0.379
 Breaker+0.343
 understandable+0.329
endra+0.317
Neg logit contributions
 cens-0.266
hd-0.256
asionally-0.253
arning-0.253
cffff-0.247
Tokenplayer
Token ID7829
Feature activation-1.49e+00

Pos logit contributions
INESS+0.470
+0.459
 Breaker+0.416
 understandable+0.398
endra+0.383
Neg logit contributions
 cens-0.336
hd-0.324
asionally-0.320
arning-0.320
cffff-0.312
Token-
Token ID12
Feature activation-5.46e-01

Pos logit contributions
 cens+3.038
hd+2.950
asionally+2.869
arning+2.848
collar+2.780
Neg logit contributions
INESS-3.218
-3.163
 Breaker-2.748
 understandable-2.596
endra-2.557
Tokenstyle
Token ID7635
Feature activation-5.37e-01

Pos logit contributions
 cens+0.967
hd+0.947
arning+0.924
asionally+0.923
cffff+0.899
Neg logit contributions
INESS-1.344
-1.314
 Breaker-1.182
 understandable-1.128
endra-1.089
Token,"
Token ID553
Feature activation+8.60e-02

Pos logit contributions
 cens+1.155
hd+1.123
asionally+1.101
arning+1.096
cffff+1.071
Neg logit contributions
INESS-1.121
-1.096
 Breaker-0.957
 understandable-0.907
endra-0.877
Token detailing
Token ID22976
Feature activation-3.84e-02

Pos logit contributions
INESS+0.184
+0.180
 Breaker+0.159
 understandable+0.151
endra+0.145
Neg logit contributions
 cens-0.180
hd-0.174
arning-0.172
asionally-0.172
cffff-0.169
Token the
Token ID262
Feature activation+8.82e-02

Pos logit contributions
 cens+0.088
hd+0.085
asionally+0.084
arning+0.084
cffff+0.082
Neg logit contributions
INESS-0.075
-0.073
 Breaker-0.063
 understandable-0.059
endra-0.058
Token to
Token ID284
Feature activation-1.53e-01

Pos logit contributions
 cens+0.080
hd+0.078
asionally+0.077
arning+0.077
cffff+0.075
Neg logit contributions
INESS-0.088
-0.086
 Breaker-0.076
 understandable-0.073
endra-0.070
Token a
Token ID257
Feature activation-1.72e-01

Pos logit contributions
 cens+0.355
hd+0.346
arning+0.340
asionally+0.340
cffff+0.333
Neg logit contributions
INESS-0.297
-0.287
 Breaker-0.249
 understandable-0.234
endra-0.225
Token more
Token ID517
Feature activation-3.48e-01

Pos logit contributions
 cens+0.348
hd+0.337
arning+0.332
asionally+0.331
cffff+0.323
Neg logit contributions
INESS-0.382
-0.373
 Breaker-0.330
 understandable-0.313
endra-0.303
Token improved
Token ID6596
Feature activation-7.91e-02

Pos logit contributions
 cens+0.780
hd+0.756
asionally+0.742
arning+0.741
collar+0.725
Neg logit contributions
INESS-0.703
-0.679
 Breaker-0.595
 understandable-0.552
endra-0.536
Token MLS
Token ID13981
Feature activation-1.94e-01

Pos logit contributions
 cens+0.157
hd+0.152
asionally+0.150
arning+0.150
cffff+0.147
Neg logit contributions
INESS-0.177
-0.173
 Breaker-0.155
 understandable-0.147
endra-0.141
Token instead
Token ID2427
Feature activation-1.48e+00

Pos logit contributions
 cens+0.344
hd+0.332
asionally+0.327
arning+0.327
cffff+0.319
Neg logit contributions
INESS-0.476
-0.465
 Breaker-0.421
 understandable-0.402
endra-0.388
Token of
Token ID286
Feature activation-1.09e-01

Pos logit contributions
 cens+3.023
hd+2.933
asionally+2.863
arning+2.863
cffff+2.808
Neg logit contributions
INESS-3.213
-3.118
 Breaker-2.763
 understandable-2.593
endra-2.507
Token second
Token ID1218
Feature activation+1.17e-01

Pos logit contributions
 cens+0.220
hd+0.213
arning+0.211
asionally+0.210
cffff+0.205
Neg logit contributions
INESS-0.240
-0.234
 Breaker-0.208
 understandable-0.197
endra-0.190
Token division
Token ID7297
Feature activation-1.06e-01

Pos logit contributions
INESS+0.234
+0.227
 Breaker+0.200
 understandable+0.189
endra+0.180
Neg logit contributions
 cens-0.263
hd-0.256
arning-0.253
asionally-0.252
cffff-0.248
Token soccer
Token ID11783
Feature activation+2.06e-01

Pos logit contributions
 cens+0.220
hd+0.213
asionally+0.210
arning+0.210
cffff+0.206
Neg logit contributions
INESS-0.228
-0.222
 Breaker-0.197
 understandable-0.187
endra-0.180
Token in
Token ID287
Feature activation-9.18e-02

Pos logit contributions
INESS+0.427
+0.416
 Breaker+0.368
 understandable+0.348
endra+0.333
Neg logit contributions
 cens-0.445
hd-0.433
asionally-0.427
arning-0.427
cffff-0.418
TokenAmazing
Token ID42770
Feature activation-2.81e-01

Pos logit contributions
 cens+0.366
hd+0.358
arning+0.352
asionally+0.352
cffff+0.343
Neg logit contributions
INESS-0.414
-0.404
 Breaker-0.363
 understandable-0.344
endra-0.330
Tokenly
Token ID306
Feature activation-5.37e-01

Pos logit contributions
 cens+0.487
hd+0.467
asionally+0.461
arning+0.460
cffff+0.447
Neg logit contributions
INESS-0.706
-0.690
 Breaker-0.621
 understandable-0.595
endra-0.576
Token,
Token ID11
Feature activation-2.34e-01

Pos logit contributions
 cens+1.138
hd+1.091
asionally+1.079
arning+1.077
cffff+1.054
Neg logit contributions
INESS-1.150
-1.109
 Breaker-0.982
 understandable-0.922
endra-0.895
Token his
Token ID465
Feature activation-1.74e-01

Pos logit contributions
 cens+0.503
hd+0.484
asionally+0.479
arning+0.478
cffff+0.468
Neg logit contributions
INESS-0.494
-0.480
 Breaker-0.422
 understandable-0.399
endra-0.385
Token first
Token ID717
Feature activation+4.57e-02

Pos logit contributions
 cens+0.389
hd+0.374
asionally+0.371
arning+0.369
cffff+0.361
Neg logit contributions
INESS-0.355
-0.345
 Breaker-0.301
 understandable-0.282
endra-0.274
Token effort
Token ID3626
Feature activation-1.48e+00

Pos logit contributions
INESS+0.086
+0.083
 Breaker+0.071
 understandable+0.067
endra+0.064
Neg logit contributions
 cens-0.110
hd-0.105
asionally-0.104
arning-0.104
cffff-0.102
Token along
Token ID1863
Feature activation+1.74e-01

Pos logit contributions
 cens+2.869
hd+2.717
arning+2.690
asionally+2.667
collar+2.612
Neg logit contributions
INESS-3.394
-3.304
 Breaker-2.897
 understandable-2.768
endra-2.688
Token these
Token ID777
Feature activation-1.30e-01

Pos logit contributions
INESS+0.371
+0.361
 Breaker+0.320
 understandable+0.303
endra+0.291
Neg logit contributions
 cens-0.366
hd-0.355
arning-0.350
asionally-0.350
cffff-0.343
Token new
Token ID649
Feature activation+1.95e-01

Pos logit contributions
 cens+0.309
hd+0.294
arning+0.292
asionally+0.292
collar+0.284
Neg logit contributions
INESS-0.248
-0.239
 Breaker-0.207
 understandable-0.189
endra-0.184
Token lines
Token ID3951
Feature activation-1.11e-01

Pos logit contributions
INESS+0.369
+0.357
 Breaker+0.312
 understandable+0.289
endra+0.276
Neg logit contributions
 cens-0.461
hd-0.446
arning-0.441
asionally-0.440
collar-0.430
Token,
Token ID11
Feature activation-1.11e-02

Pos logit contributions
 cens+0.216
hd+0.209
arning+0.206
asionally+0.206
cffff+0.200
Neg logit contributions
INESS-0.254
-0.247
 Breaker-0.222
 understandable-0.211
endra-0.203
TokenJ
Token ID41
Feature activation+2.01e-01

Pos logit contributions
 cens+0.461
hd+0.450
asionally+0.442
arning+0.442
cffff+0.432
Neg logit contributions
INESS-0.552
-0.538
 Breaker-0.483
 understandable-0.461
endra-0.440
Tokenaaaa
Token ID24794
Feature activation-4.64e-01

Pos logit contributions
INESS+0.462
+0.451
 Breaker+0.404
 understandable+0.386
endra+0.370
Neg logit contributions
 cens-0.389
hd-0.378
arning-0.372
asionally-0.372
cffff-0.364
Tokenasper
Token ID32981
Feature activation-4.86e-01

Pos logit contributions
 cens+0.895
hd+0.879
arning+0.866
asionally+0.864
cffff+0.839
Neg logit contributions
INESS-1.061
-1.036
 Breaker-0.925
 understandable-0.884
endra-0.839
Token Profile
Token ID13118
Feature activation+1.31e-01

Pos logit contributions
 cens+0.702
hd+0.684
arning+0.664
asionally+0.663
cffff+0.642
Neg logit contributions
INESS-1.349
-1.319
 Breaker-1.206
 understandable-1.166
endra-1.127
Token Blog
Token ID14001
Feature activation-5.98e-01

Pos logit contributions
INESS+0.254
+0.246
 Breaker+0.216
 understandable+0.204
endra+0.194
Neg logit contributions
 cens-0.302
hd-0.294
asionally-0.291
arning-0.291
cffff-0.286
Token Joined
Token ID16709
Feature activation-1.48e+00

Pos logit contributions
 cens+1.245
hd+1.211
arning+1.189
asionally+1.181
cffff+1.161
Neg logit contributions
INESS-1.295
-1.253
 Breaker-1.098
 understandable-1.048
OE-1.002
Token April
Token ID3035
Feature activation-1.67e-01

Pos logit contributions
 cens+3.076
hd+2.957
asionally+2.889
arning+2.873
collar+2.768
Neg logit contributions
INESS-3.211
-3.098
 Breaker-2.651
 understandable-2.518
endra-2.463
Token 2012
Token ID2321
Feature activation+1.03e-01

Pos logit contributions
 cens+0.348
hd+0.338
asionally+0.332
arning+0.332
cffff+0.322
Neg logit contributions
INESS-0.362
-0.352
 Breaker-0.310
 understandable-0.295
endra-0.283
Token United
Token ID1578
Feature activation-4.94e-01

Pos logit contributions
INESS+0.233
+0.227
 Breaker+0.202
 understandable+0.193
endra+0.186
Neg logit contributions
 cens-0.202
hd-0.196
arning-0.193
asionally-0.193
cffff-0.188
Token States
Token ID1829
Feature activation+1.84e-02

Pos logit contributions
 cens+1.000
hd+0.970
arning+0.957
asionally+0.952
cffff+0.934
Neg logit contributions
INESS-1.092
-1.062
 Breaker-0.945
 understandable-0.902
endra-0.861
Token 5
Token ID642
Feature activation-4.71e-01

Pos logit contributions
INESS+0.044
+0.043
 Breaker+0.039
 understandable+0.037
endra+0.036
Neg logit contributions
 cens-0.034
hd-0.032
arning-0.032
asionally-0.032
cffff-0.031
Token for
Token ID329
Feature activation-3.02e-02

Pos logit contributions
INESS+0.793
+0.770
 Breaker+0.688
 understandable+0.655
endra+0.626
Neg logit contributions
 cens-0.755
hd-0.733
arning-0.727
asionally-0.726
cffff-0.713
Token a
Token ID257
Feature activation+4.08e-02

Pos logit contributions
 cens+0.073
hd+0.070
asionally+0.069
arning+0.069
cffff+0.068
Neg logit contributions
INESS-0.056
-0.054
 Breaker-0.047
 understandable-0.043
endra-0.042
Token diplomatic
Token ID13093
Feature activation-2.14e-01

Pos logit contributions
INESS+0.071
+0.068
 Breaker+0.059
 understandable+0.054
endra+0.052
Neg logit contributions
 cens-0.103
hd-0.100
asionally-0.098
arning-0.098
cffff-0.097
Token resolution
Token ID6323
Feature activation-5.06e-01

Pos logit contributions
 cens+0.447
hd+0.429
asionally+0.422
arning+0.422
cffff+0.412
Neg logit contributions
INESS-0.465
-0.451
 Breaker-0.399
 understandable-0.376
endra-0.363
Token.
Token ID13
Feature activation-1.07e-01

Pos logit contributions
 cens+1.001
hd+0.967
arning+0.949
asionally+0.947
cffff+0.922
Neg logit contributions
INESS-1.147
-1.121
 Breaker-1.000
 understandable-0.941
endra-0.913
Token Rather
Token ID11317
Feature activation-1.47e+00

Pos logit contributions
 cens+0.231
hd+0.227
asionally+0.221
arning+0.220
collar+0.215
Neg logit contributions
INESS-0.221
-0.215
 Breaker-0.187
 understandable-0.179
endra-0.172
Token,
Token ID11
Feature activation+1.08e-01

Pos logit contributions
 cens+2.922
hd+2.797
asionally+2.772
arning+2.737
collar+2.671
Neg logit contributions
INESS-3.273
-3.176
 Breaker-2.826
 understandable-2.623
endra-2.577
Token he
Token ID339
Feature activation+8.67e-02

Pos logit contributions
INESS+0.221
+0.215
 Breaker+0.189
 understandable+0.178
endra+0.171
Neg logit contributions
 cens-0.238
hd-0.231
asionally-0.228
arning-0.227
cffff-0.223
Token is
Token ID318
Feature activation+5.59e-02

Pos logit contributions
INESS+0.187
+0.182
 Breaker+0.162
 understandable+0.153
endra+0.147
Neg logit contributions
 cens-0.180
hd-0.175
asionally-0.173
arning-0.173
cffff-0.169
Token signaling
Token ID22049
Feature activation+7.66e-01

Pos logit contributions
INESS+0.132
+0.128
 Breaker+0.116
 understandable+0.111
endra+0.106
Neg logit contributions
 cens-0.104
hd-0.101
asionally-0.100
arning-0.100
cffff-0.098
Token that
Token ID326
Feature activation+3.72e-01

Pos logit contributions
INESS+1.507
+1.462
 Breaker+1.287
 understandable+1.217
endra+1.157
Neg logit contributions
 cens-1.730
hd-1.685
arning-1.667
asionally-1.667
cffff-1.637
Token be
Token ID307
Feature activation+3.15e-02

Pos logit contributions
INESS+0.720
+0.701
 Breaker+0.622
 understandable+0.594
endra+0.565
Neg logit contributions
 cens-0.757
hd-0.742
arning-0.735
asionally-0.734
cffff-0.721
Token expensive
Token ID5789
Feature activation-6.18e-02

Pos logit contributions
INESS+0.059
+0.058
 Breaker+0.050
 understandable+0.047
endra+0.045
Neg logit contributions
 cens-0.074
hd-0.072
asionally-0.071
arning-0.071
cffff-0.070
Token.
Token ID13
Feature activation+1.53e-01

Pos logit contributions
 cens+0.127
hd+0.123
arning+0.122
asionally+0.122
cffff+0.119
Neg logit contributions
INESS-0.134
-0.131
 Breaker-0.117
 understandable-0.111
endra-0.106
Token
Token ID198
Feature activation-6.04e-02

Pos logit contributions
INESS+0.323
+0.315
 Breaker+0.279
 understandable+0.265
endra+0.254
Neg logit contributions
 cens-0.322
hd-0.313
asionally-0.309
arning-0.309
cffff-0.302
Token
Token ID198
Feature activation-5.94e-02

Pos logit contributions
 cens+0.115
hd+0.114
arning+0.111
asionally+0.111
collar+0.108
Neg logit contributions
INESS-0.138
-0.136
 Breaker-0.122
 understandable-0.116
endra-0.111
TokenInstead
Token ID13193
Feature activation-1.46e+00

Pos logit contributions
 cens+0.118
hd+0.115
arning+0.113
asionally+0.113
cffff+0.111
Neg logit contributions
INESS-0.133
-0.130
 Breaker-0.116
 understandable-0.110
endra-0.106
Token,
Token ID11
Feature activation+1.71e-01

Pos logit contributions
 cens+2.935
hd+2.841
asionally+2.792
arning+2.757
cffff+2.732
Neg logit contributions
INESS-3.235
-3.127
 Breaker-2.787
 understandable-2.620
endra-2.545
Token you
Token ID345
Feature activation+2.09e-01

Pos logit contributions
INESS+0.346
+0.336
 Breaker+0.296
 understandable+0.279
endra+0.267
Neg logit contributions
 cens-0.380
hd-0.369
asionally-0.364
arning-0.364
cffff-0.357
Token can
Token ID460
Feature activation-5.52e-02

Pos logit contributions
INESS+0.463
+0.452
 Breaker+0.405
 understandable+0.386
endra+0.370
Neg logit contributions
 cens-0.417
hd-0.405
arning-0.400
asionally-0.400
cffff-0.392
Token represent
Token ID2380
Feature activation+2.59e-02

Pos logit contributions
 cens+0.129
hd+0.125
asionally+0.124
arning+0.124
cffff+0.122
Neg logit contributions
INESS-0.105
-0.102
 Breaker-0.089
 understandable-0.084
endra-0.080
Token density
Token ID12109
Feature activation+7.27e-01

Pos logit contributions
INESS+0.051
+0.049
 Breaker+0.043
 understandable+0.040
endra+0.039
Neg logit contributions
 cens-0.059
hd-0.057
asionally-0.056
arning-0.056
cffff-0.055
Token at
Token ID379
Feature activation-4.12e-01

Pos logit contributions
 cens+0.238
hd+0.231
arning+0.227
asionally+0.227
cffff+0.222
Neg logit contributions
INESS-0.275
-0.269
 Breaker-0.240
 understandable-0.228
endra-0.219
Token it
Token ID340
Feature activation-4.25e-02

Pos logit contributions
 cens+0.870
hd+0.834
arning+0.828
asionally+0.827
cffff+0.810
Neg logit contributions
INESS-0.882
-0.861
 Breaker-0.754
 understandable-0.716
endra-0.689
Token like
Token ID588
Feature activation-3.44e-02

Pos logit contributions
 cens+0.086
hd+0.083
asionally+0.082
arning+0.082
cffff+0.080
Neg logit contributions
INESS-0.094
-0.092
 Breaker-0.082
 understandable-0.078
endra-0.075
Token 36
Token ID4570
Feature activation-6.33e-02

Pos logit contributions
 cens+0.076
hd+0.073
asionally+0.072
arning+0.072
cffff+0.070
Neg logit contributions
INESS-0.071
-0.069
 Breaker-0.060
 understandable-0.057
endra-0.055
Token se
Token ID384
Feature activation-4.76e-01

Pos logit contributions
 cens+0.132
hd+0.128
arning+0.126
asionally+0.126
cffff+0.124
Neg logit contributions
INESS-0.136
-0.132
 Breaker-0.117
 understandable-0.112
endra-0.107
Tokenag
Token ID363
Feature activation-1.46e+00

Pos logit contributions
 cens+0.884
hd+0.867
arning+0.853
asionally+0.848
cffff+0.822
Neg logit contributions
INESS-1.129
-1.098
 Breaker-0.987
 understandable-0.944
endra-0.904
Tokenull
Token ID724
Feature activation-5.06e-01

Pos logit contributions
 cens+2.166
hd+2.123
arning+2.071
asionally+2.043
rect+1.983
Neg logit contributions
INESS-4.031
-3.914
 Breaker-3.552
 understandable-3.437
OE-3.257
Tokens
Token ID82
Feature activation-5.29e-02

Pos logit contributions
 cens+0.677
hd+0.646
asionally+0.627
arning+0.626
cffff+0.599
Neg logit contributions
INESS-1.469
-1.436
 Breaker-1.315
 understandable-1.269
endra-1.232
Token fighting
Token ID4330
Feature activation+2.43e-01

Pos logit contributions
 cens+0.105
hd+0.101
asionally+0.100
arning+0.100
cffff+0.098
Neg logit contributions
INESS-0.120
-0.117
 Breaker-0.104
 understandable-0.099
endra-0.095
Token over
Token ID625
Feature activation+1.03e-01

Pos logit contributions
INESS+0.517
+0.503
 Breaker+0.444
 understandable+0.421
endra+0.404
Neg logit contributions
 cens-0.516
hd-0.500
asionally-0.493
arning-0.493
cffff-0.482
Token a
Token ID257
Feature activation-2.99e-02

Pos logit contributions
INESS+0.202
+0.196
 Breaker+0.171
 understandable+0.162
endra+0.155
Neg logit contributions
 cens-0.233
hd-0.226
asionally-0.223
arning-0.223
cffff-0.219
Token on
Token ID319
Feature activation+1.12e-01

Pos logit contributions
INESS+0.470
+0.458
 Breaker+0.409
 understandable+0.389
endra+0.374
Neg logit contributions
 cens-0.418
hd-0.405
arning-0.400
asionally-0.400
cffff-0.391
Token government
Token ID1230
Feature activation+4.81e-01

Pos logit contributions
INESS+0.218
+0.212
 Breaker+0.185
 understandable+0.174
endra+0.167
Neg logit contributions
 cens-0.256
hd-0.248
asionally-0.246
arning-0.246
cffff-0.241
Token.
Token ID13
Feature activation+2.09e-02

Pos logit contributions
INESS+1.038
+1.007
 Breaker+0.900
 understandable+0.861
endra+0.818
Neg logit contributions
 cens-0.986
hd-0.963
arning-0.952
asionally-0.950
cffff-0.933
Token
Token ID198
Feature activation+8.58e-02

Pos logit contributions
INESS+0.046
+0.045
 Breaker+0.040
 understandable+0.038
endra+0.037
Neg logit contributions
 cens-0.042
hd-0.041
asionally-0.040
arning-0.040
cffff-0.039
Token
Token ID198
Feature activation+3.25e-02

Pos logit contributions
INESS+0.195
+0.190
 Breaker+0.170
 understandable+0.161
endra+0.155
Neg logit contributions
 cens-0.168
hd-0.165
arning-0.161
asionally-0.160
cffff-0.156
TokenInstead
Token ID13193
Feature activation-1.46e+00

Pos logit contributions
INESS+0.074
+0.072
 Breaker+0.064
 understandable+0.061
endra+0.059
Neg logit contributions
 cens-0.064
hd-0.062
arning-0.061
asionally-0.061
cffff-0.060
Token the
Token ID262
Feature activation+1.43e-01

Pos logit contributions
 cens+2.953
hd+2.857
arning+2.812
asionally+2.803
cffff+2.718
Neg logit contributions
INESS-3.178
-3.094
 Breaker-2.736
 understandable-2.555
endra-2.497
Token committee
Token ID5583
Feature activation-4.15e-01

Pos logit contributions
INESS+0.287
+0.279
 Breaker+0.245
 understandable+0.231
endra+0.221
Neg logit contributions
 cens-0.318
hd-0.308
arning-0.304
asionally-0.304
cffff-0.298
Token urged
Token ID11643
Feature activation+1.15e-02

Pos logit contributions
 cens+0.879
hd+0.842
asionally+0.828
arning+0.828
collar+0.802
Neg logit contributions
INESS-0.894
-0.870
 Breaker-0.773
 understandable-0.724
endra-0.692
Token the
Token ID262
Feature activation+3.21e-01

Pos logit contributions
INESS+0.022
+0.021
 Breaker+0.018
 understandable+0.017
endra+0.016
Neg logit contributions
 cens-0.027
hd-0.026
asionally-0.026
arning-0.026
cffff-0.025
Token government
Token ID1230
Feature activation+4.38e-01

Pos logit contributions
INESS+0.617
+0.601
 Breaker+0.522
 understandable+0.491
endra+0.470
Neg logit contributions
 cens-0.746
hd-0.723
asionally-0.714
arning-0.714
cffff-0.700
Token hesitant
Token ID32848
Feature activation-4.36e-03

Pos logit contributions
INESS+0.245
+0.239
 Breaker+0.211
 understandable+0.199
endra+0.192
Neg logit contributions
 cens-0.244
hd-0.234
arning-0.231
asionally-0.230
cffff-0.227
Token to
Token ID284
Feature activation+3.37e-01

Pos logit contributions
 cens+0.009
hd+0.009
arning+0.009
asionally+0.009
cffff+0.008
Neg logit contributions
INESS-0.009
-0.009
 Breaker-0.008
 understandable-0.008
endra-0.007
Token bring
Token ID2222
Feature activation+4.99e-01

Pos logit contributions
INESS+0.674
+0.655
 Breaker+0.578
 understandable+0.548
endra+0.521
Neg logit contributions
 cens-0.749
hd-0.731
arning-0.723
asionally-0.722
cffff-0.709
Token in
Token ID287
Feature activation+2.01e-01

Pos logit contributions
INESS+1.042
+1.016
 Breaker+0.895
 understandable+0.845
endra+0.811
Neg logit contributions
 cens-1.076
hd-1.040
asionally-1.027
arning-1.026
cffff-1.003
Token J
Token ID449
Feature activation+4.92e-01

Pos logit contributions
INESS+0.405
+0.394
 Breaker+0.344
 understandable+0.322
endra+0.311
Neg logit contributions
 cens-0.453
hd-0.438
asionally-0.431
arning-0.430
cffff-0.421
TokenU
Token ID52
Feature activation+1.10e+00

Pos logit contributions
INESS+1.168
+1.141
 Breaker+1.026
 understandable+0.980
endra+0.943
Neg logit contributions
 cens-0.913
hd-0.883
arning-0.870
asionally-0.870
cffff-0.850
TokenCO
Token ID8220
Feature activation+1.11e-01

Pos logit contributions
INESS+1.629
+1.593
 Breaker+1.324
 understandable+1.215
endra+1.116
Neg logit contributions
 cens-3.040
hd-2.994
arning-2.958
asionally-2.931
cffff-2.909
Token defensive
Token ID6110
Feature activation+8.41e-01

Pos logit contributions
INESS+0.239
+0.232
 Breaker+0.207
 understandable+0.197
endra+0.189
Neg logit contributions
 cens-0.232
hd-0.225
arning-0.222
asionally-0.221
cffff-0.216
Token linemen
Token ID41683
Feature activation+3.34e-01

Pos logit contributions
INESS+1.796
+1.759
 Breaker+1.568
 understandable+1.492
endra+1.420
Neg logit contributions
 cens-1.720
hd-1.685
asionally-1.672
arning-1.666
cffff-1.649
Token.
Token ID13
Feature activation-5.21e-02

Pos logit contributions
INESS+0.725
+0.707
 Breaker+0.631
 understandable+0.600
endra+0.574
Neg logit contributions
 cens-0.685
hd-0.666
arning-0.657
asionally-0.657
cffff-0.645
Token Ter
Token ID3813
Feature activation+1.42e-01

Pos logit contributions
 cens+0.107
hd+0.106
asionally+0.103
arning+0.103
collar+0.100
Neg logit contributions
INESS-0.112
-0.109
 Breaker-0.096
 understandable-0.092
endra-0.089
Token've
Token ID1053
Feature activation+3.65e-02

Pos logit contributions
INESS+0.212
+0.207
 Breaker+0.187
 understandable+0.178
endra+0.172
Neg logit contributions
 cens-0.166
hd-0.160
arning-0.158
asionally-0.158
cffff-0.154
Token also
Token ID635
Feature activation+4.38e-01

Pos logit contributions
INESS+0.083
+0.080
 Breaker+0.072
 understandable+0.068
endra+0.066
Neg logit contributions
 cens-0.072
hd-0.070
asionally-0.069
arning-0.069
cffff-0.067
Token included
Token ID3017
Feature activation+1.23e-01

Pos logit contributions
INESS+1.028
+1.004
 Breaker+0.907
 understandable+0.868
endra+0.831
Neg logit contributions
 cens-0.816
hd-0.793
arning-0.785
asionally-0.782
cffff-0.772
Token a
Token ID257
Feature activation-9.37e-02

Pos logit contributions
INESS+0.244
+0.237
 Breaker+0.206
 understandable+0.194
endra+0.186
Neg logit contributions
 cens-0.283
hd-0.274
arning-0.269
asionally-0.269
cffff-0.261
Token comparison
Token ID7208
Feature activation+5.65e-01

Pos logit contributions
 cens+0.201
hd+0.194
arning+0.191
asionally+0.190
cffff+0.185
Neg logit contributions
INESS-0.198
-0.194
 Breaker-0.170
 understandable-0.161
endra-0.155
Token above
Token ID2029
Feature activation+1.56e+00

Pos logit contributions
INESS+1.278
+1.245
 Breaker+1.115
 understandable+1.061
endra+1.019
Neg logit contributions
 cens-1.115
hd-1.080
arning-1.065
asionally-1.065
cffff-1.042
Token against
Token ID1028
Feature activation+4.86e-01

Pos logit contributions
INESS+3.299
+3.185
 Breaker+2.940
 understandable+2.799
endra+2.624
Neg logit contributions
 cens-3.123
hd-3.088
asionally-3.086
arning-3.072
cffff-3.049
Token 3
Token ID513
Feature activation-2.92e-01

Pos logit contributions
INESS+0.896
+0.868
 Breaker+0.750
 understandable+0.704
endra+0.671
Neg logit contributions
 cens-1.171
hd-1.140
asionally-1.124
arning-1.122
cffff-1.101
Token other
Token ID584
Feature activation-2.99e-02

Pos logit contributions
 cens+0.623
hd+0.610
asionally+0.595
arning+0.594
cffff+0.580
Neg logit contributions
INESS-0.619
-0.600
 Breaker-0.533
 understandable-0.502
endra-0.484
Token very
Token ID845
Feature activation+1.71e-01

Pos logit contributions
 cens+0.060
hd+0.058
asionally+0.057
arning+0.057
cffff+0.055
Neg logit contributions
INESS-0.067
-0.065
 Breaker-0.058
 understandable-0.055
endra-0.053
Token fast
Token ID3049
Feature activation+4.67e-01

Pos logit contributions
INESS+0.294
+0.283
 Breaker+0.242
 understandable+0.222
endra+0.215
Neg logit contributions
 cens-0.435
hd-0.423
asionally-0.416
arning-0.415
cffff-0.406
Tokenbin
Token ID8800
Feature activation+8.51e-02

Pos logit contributions
INESS+0.267
+0.261
 Breaker+0.232
 understandable+0.220
endra+0.212
Neg logit contributions
 cens-0.239
hd-0.232
arning-0.228
asionally-0.228
cffff-0.223
Token with
Token ID351
Feature activation-1.81e-01

Pos logit contributions
INESS+0.192
+0.187
 Breaker+0.168
 understandable+0.160
endra+0.153
Neg logit contributions
 cens-0.168
hd-0.163
asionally-0.161
arning-0.161
cffff-0.158
Token thread
Token ID4704
Feature activation-1.25e-01

Pos logit contributions
 cens+0.396
hd+0.383
arning+0.380
asionally+0.379
cffff+0.371
Neg logit contributions
INESS-0.373
-0.363
 Breaker-0.320
 understandable-0.302
endra-0.289
Token before
Token ID878
Feature activation-1.93e-01

Pos logit contributions
 cens+0.257
hd+0.250
arning+0.246
asionally+0.246
cffff+0.241
Neg logit contributions
INESS-0.273
-0.265
 Breaker-0.237
 understandable-0.225
endra-0.215
Token this
Token ID428
Feature activation-2.46e-01

Pos logit contributions
 cens+0.390
hd+0.376
arning+0.372
asionally+0.371
cffff+0.362
Neg logit contributions
INESS-0.430
-0.419
 Breaker-0.373
 understandable-0.354
endra-0.340
Token lesson
Token ID11483
Feature activation+9.29e-01

Pos logit contributions
 cens+0.491
hd+0.474
arning+0.469
asionally+0.465
cffff+0.455
Neg logit contributions
INESS-0.552
-0.539
 Breaker-0.479
 understandable-0.454
endra-0.439
Token.
Token ID13
Feature activation+2.87e-02

Pos logit contributions
INESS+2.013
+1.960
 Breaker+1.746
 understandable+1.657
endra+1.588
Neg logit contributions
 cens-1.918
hd-1.861
arning-1.837
asionally-1.836
cffff-1.799
Token Show
Token ID5438
Feature activation+1.81e-01

Pos logit contributions
INESS+0.060
+0.058
 Breaker+0.052
 understandable+0.049
endra+0.047
Neg logit contributions
 cens-0.061
hd-0.060
asionally-0.059
arning-0.059
cffff-0.058
Token your
Token ID534
Feature activation-3.56e-02

Pos logit contributions
INESS+0.366
+0.356
 Breaker+0.313
 understandable+0.294
endra+0.281
Neg logit contributions
 cens-0.402
hd-0.390
asionally-0.385
arning-0.384
cffff-0.376
Token child
Token ID1200
Feature activation-7.91e-02

Pos logit contributions
 cens+0.076
hd+0.073
arning+0.072
asionally+0.072
cffff+0.070
Neg logit contributions
INESS-0.076
-0.074
 Breaker-0.065
 understandable-0.061
endra-0.059
Token how
Token ID703
Feature activation-2.90e-02

Pos logit contributions
 cens+0.172
hd+0.166
arning+0.165
asionally+0.164
cffff+0.161
Neg logit contributions
INESS-0.164
-0.160
 Breaker-0.141
 understandable-0.132
endra-0.127
Token Europeans
Token ID20006
Feature activation-2.10e-01

Pos logit contributions
INESS+0.702
+0.680
 Breaker+0.598
 understandable+0.563
endra+0.543
Neg logit contributions
 cens-0.743
hd-0.713
arning-0.704
asionally-0.704
cffff-0.691
Token coming
Token ID2406
Feature activation+1.10e-01

Pos logit contributions
 cens+0.423
hd+0.405
arning+0.401
asionally+0.399
cffff+0.391
Neg logit contributions
INESS-0.472
-0.458
 Breaker-0.408
 understandable-0.387
endra-0.372
Token from
Token ID422
Feature activation+3.03e-02

Pos logit contributions
INESS+0.256
+0.250
 Breaker+0.224
 understandable+0.213
endra+0.205
Neg logit contributions
 cens-0.210
hd-0.203
asionally-0.200
arning-0.200
cffff-0.196
Token Chicago
Token ID4842
Feature activation+2.07e-02

Pos logit contributions
INESS+0.060
+0.058
 Breaker+0.050
 understandable+0.047
endra+0.045
Neg logit contributions
 cens-0.070
hd-0.067
arning-0.066
asionally-0.066
cffff-0.065
Token where
Token ID810
Feature activation+5.79e-01

Pos logit contributions
INESS+0.046
+0.045
 Breaker+0.040
 understandable+0.038
endra+0.036
Neg logit contributions
 cens-0.042
hd-0.040
asionally-0.040
arning-0.040
cffff-0.039
Token Pent
Token ID9696
Feature activation+1.06e+00

Pos logit contributions
INESS+1.139
+1.103
 Breaker+0.968
 understandable+0.911
endra+0.870
Neg logit contributions
 cens-1.322
hd-1.280
asionally-1.266
arning-1.264
cffff-1.242
Tokenec
Token ID721
Feature activation+2.16e-01

Pos logit contributions
INESS+2.822
+2.744
 Breaker+2.511
 understandable+2.431
endra+2.377
Neg logit contributions
 cens-1.622
hd-1.538
arning-1.530
asionally-1.520
cffff-1.486
Tokenost
Token ID455
Feature activation+2.08e-01

Pos logit contributions
INESS+0.505
+0.492
 Breaker+0.442
 understandable+0.423
endra+0.408
Neg logit contributions
 cens-0.411
hd-0.395
arning-0.390
asionally-0.389
cffff-0.383
Tokenal
Token ID282
Feature activation+6.93e-01

Pos logit contributions
INESS+0.490
+0.478
 Breaker+0.430
 understandable+0.411
endra+0.395
Neg logit contributions
 cens-0.388
hd-0.375
arning-0.370
asionally-0.370
cffff-0.361
Token fire
Token ID2046
Feature activation+3.65e-01

Pos logit contributions
INESS+1.634
+1.594
 Breaker+1.435
 understandable+1.372
endra+1.318
Neg logit contributions
 cens-1.292
hd-1.253
arning-1.235
asionally-1.235
cffff-1.208
Token was
Token ID373
Feature activation+8.65e-02

Pos logit contributions
INESS+0.830
+0.808
 Breaker+0.725
 understandable+0.691
endra+0.663
Neg logit contributions
 cens-0.713
hd-0.692
arning-0.684
asionally-0.684
cffff-0.670
Tokens
Token ID82
Feature activation+4.45e-01

Pos logit contributions
INESS+0.243
+0.238
 Breaker+0.219
 understandable+0.210
endra+0.204
Neg logit contributions
 cens-0.115
hd-0.111
arning-0.108
asionally-0.108
cffff-0.104
Token 224
Token ID26063
Feature activation+2.89e-01

Pos logit contributions
INESS+1.011
+0.987
 Breaker+0.879
 understandable+0.839
endra+0.809
Neg logit contributions
 cens-0.871
hd-0.849
arning-0.833
asionally-0.832
cffff-0.812
Token GB
Token ID13124
Feature activation+3.09e-01

Pos logit contributions
INESS+0.478
+0.458
 Breaker+0.390
 understandable+0.363
endra+0.344
Neg logit contributions
 cens-0.746
hd-0.732
arning-0.722
asionally-0.721
cffff-0.710
Token/
Token ID14
Feature activation+7.38e-02

Pos logit contributions
INESS+0.691
+0.674
 Breaker+0.602
 understandable+0.572
endra+0.550
Neg logit contributions
 cens-0.616
hd-0.599
asionally-0.589
arning-0.589
cffff-0.576
Tokens
Token ID82
Feature activation+4.68e-01

Pos logit contributions
INESS+0.212
+0.207
 Breaker+0.190
 understandable+0.183
endra+0.177
Neg logit contributions
 cens-0.101
hd-0.097
arning-0.094
asionally-0.094
cffff-0.091
Token 192
Token ID17817
Feature activation+1.07e+00

Pos logit contributions
INESS+1.042
+1.016
 Breaker+0.900
 understandable+0.860
endra+0.829
Neg logit contributions
 cens-0.936
hd-0.914
arning-0.895
asionally-0.895
cffff-0.873
Token GB
Token ID13124
Feature activation+1.16e-01

Pos logit contributions
INESS+1.659
+1.601
 Breaker+1.363
 understandable+1.261
endra+1.174
Neg logit contributions
 cens-2.857
hd-2.783
asionally-2.761
arning-2.760
cffff-2.718
Token/
Token ID14
Feature activation-1.61e-02

Pos logit contributions
INESS+0.262
+0.256
 Breaker+0.229
 understandable+0.217
endra+0.210
Neg logit contributions
 cens-0.227
hd-0.222
asionally-0.217
arning-0.217
cffff-0.211
Tokens
Token ID82
Feature activation+3.84e-01

Pos logit contributions
 cens+0.022
hd+0.021
arning+0.020
asionally+0.020
cffff+0.020
Neg logit contributions
INESS-0.046
-0.045
 Breaker-0.042
 understandable-0.040
endra-0.039
Token 256
Token ID17759
Feature activation+3.70e-01

Pos logit contributions
INESS+0.877
+0.854
 Breaker+0.759
 understandable+0.727
endra+0.702
Neg logit contributions
 cens-0.748
hd-0.730
asionally-0.714
arning-0.714
cffff-0.695
Token GB
Token ID13124
Feature activation-8.57e-03

Pos logit contributions
INESS+0.626
+0.603
 Breaker+0.507
 understandable+0.476
endra+0.453
Neg logit contributions
 cens-0.940
hd-0.929
asionally-0.909
arning-0.908
cffff-0.891
Token points
Token ID2173
Feature activation+1.59e-01

Pos logit contributions
INESS+0.271
+0.259
 Breaker+0.228
 understandable+0.215
endra+0.207
Neg logit contributions
 cens-0.299
hd-0.287
asionally-0.284
arning-0.282
cffff-0.276
Token to
Token ID284
Feature activation+9.93e-02

Pos logit contributions
INESS+0.355
+0.346
 Breaker+0.308
 understandable+0.293
endra+0.282
Neg logit contributions
 cens-0.320
hd-0.309
asionally-0.305
arning-0.304
cffff-0.298
Token uniquely
Token ID24139
Feature activation+2.21e-01

Pos logit contributions
INESS+0.191
+0.182
 Breaker+0.157
 understandable+0.149
endra+0.144
Neg logit contributions
 cens-0.234
hd-0.225
asionally-0.222
arning-0.221
collar-0.218
Token-
Token ID12
Feature activation+1.28e-01

Pos logit contributions
INESS+0.489
+0.473
 Breaker+0.419
 understandable+0.397
endra+0.385
Neg logit contributions
 cens-0.452
hd-0.436
arning-0.428
asionally-0.427
cffff-0.417
Tokenowned
Token ID11990
Feature activation+1.67e-01

Pos logit contributions
INESS+0.301
+0.293
 Breaker+0.263
 understandable+0.250
endra+0.241
Neg logit contributions
 cens-0.243
hd-0.236
arning-0.231
asionally-0.231
cffff-0.226
Token data
Token ID1366
Feature activation+5.82e-01

Pos logit contributions
INESS+0.358
+0.344
 Breaker+0.303
 understandable+0.288
endra+0.279
Neg logit contributions
 cens-0.353
hd-0.341
asionally-0.335
arning-0.333
cffff-0.327
Token that
Token ID326
Feature activation+1.18e-01

Pos logit contributions
INESS+1.243
+1.202
 Breaker+1.062
 understandable+1.004
endra+0.972
Neg logit contributions
 cens-1.231
hd-1.192
asionally-1.175
arning-1.169
collar-1.145
Token has
Token ID468
Feature activation+3.33e-02

Pos logit contributions
INESS+0.244
+0.233
 Breaker+0.205
 understandable+0.194
endra+0.187
Neg logit contributions
 cens-0.261
hd-0.251
asionally-0.248
arning-0.246
collar-0.241
Token not
Token ID407
Feature activation+1.30e-01

Pos logit contributions
INESS+0.074
+0.072
 Breaker+0.064
 understandable+0.060
endra+0.059
Neg logit contributions
 cens-0.067
hd-0.065
asionally-0.064
arning-0.064
cffff-0.062
Token yet
Token ID1865
Feature activation+1.05e-01

Pos logit contributions
INESS+0.301
+0.293
 Breaker+0.263
 understandable+0.250
endra+0.241
Neg logit contributions
 cens-0.249
hd-0.241
asionally-0.237
arning-0.237
cffff-0.231
Token been
Token ID587
Feature activation-1.91e-01

Pos logit contributions
INESS+0.251
+0.244
 Breaker+0.220
 understandable+0.210
endra+0.203
Neg logit contributions
 cens-0.195
hd-0.189
asionally-0.185
arning-0.185
cffff-0.181
Tokenyear
Token ID1941
Feature activation+2.08e-01

Pos logit contributions
INESS+0.461
+0.450
 Breaker+0.398
 understandable+0.375
endra+0.359
Neg logit contributions
 cens-0.439
hd-0.433
arning-0.423
asionally-0.422
collar-0.415
Token,
Token ID11
Feature activation+1.71e-01

Pos logit contributions
INESS+0.441
+0.428
 Breaker+0.380
 understandable+0.359
endra+0.345
Neg logit contributions
 cens-0.439
hd-0.428
asionally-0.421
arning-0.421
cffff-0.412
Token $
Token ID720
Feature activation+1.11e-01

Pos logit contributions
INESS+0.334
+0.324
 Breaker+0.282
 understandable+0.263
endra+0.254
Neg logit contributions
 cens-0.394
hd-0.383
arning-0.378
asionally-0.377
cffff-0.369
Token41
Token ID3901
Feature activation+2.21e-01

Pos logit contributions
INESS+0.256
+0.250
 Breaker+0.224
 understandable+0.214
endra+0.206
Neg logit contributions
 cens-0.213
hd-0.208
arning-0.204
asionally-0.203
cffff-0.198
Token million
Token ID1510
Feature activation+3.04e-01

Pos logit contributions
INESS+0.465
+0.453
 Breaker+0.403
 understandable+0.381
endra+0.364
Neg logit contributions
 cens-0.468
hd-0.455
arning-0.450
asionally-0.449
cffff-0.440
Token extension
Token ID7552
Feature activation+1.29e-01

Pos logit contributions
INESS+0.584
+0.563
 Breaker+0.493
 understandable+0.462
endra+0.442
Neg logit contributions
 cens-0.707
hd-0.693
arning-0.682
asionally-0.679
collar-0.667
Token signed
Token ID4488
Feature activation+4.65e-01

Pos logit contributions
INESS+0.299
+0.289
 Breaker+0.256
 understandable+0.243
endra+0.234
Neg logit contributions
 cens-0.253
hd-0.243
arning-0.240
asionally-0.238
collar-0.233
Token in
Token ID287
Feature activation-1.25e-01

Pos logit contributions
INESS+0.969
+0.942
 Breaker+0.828
 understandable+0.780
endra+0.752
Neg logit contributions
 cens-1.009
hd-0.976
arning-0.964
asionally-0.963
cffff-0.945
Token March
Token ID2805
Feature activation-1.51e-01

Pos logit contributions
 cens+0.272
hd+0.261
arning+0.257
asionally+0.257
collar+0.249
Neg logit contributions
INESS-0.266
-0.257
 Breaker-0.221
 understandable-0.207
endra-0.204
Token 2016
Token ID1584
Feature activation-9.18e-03

Pos logit contributions
 cens+0.322
hd+0.311
arning+0.307
asionally+0.305
cffff+0.298
Neg logit contributions
INESS-0.320
-0.311
 Breaker-0.271
 understandable-0.257
endra-0.249
Token
Token ID198
Feature activation-1.43e-01

Pos logit contributions
 cens+0.018
hd+0.018
arning+0.018
asionally+0.018
cffff+0.017
Neg logit contributions
INESS-0.020
-0.020
 Breaker-0.018
 understandable-0.017
endra-0.016
Token the
Token ID262
Feature activation-8.49e-02

Pos logit contributions
 cens+0.951
hd+0.913
asionally+0.902
arning+0.901
collar+0.880
Neg logit contributions
INESS-0.852
-0.826
 Breaker-0.714
 understandable-0.668
endra-0.652
Token face
Token ID1986
Feature activation-1.20e-01

Pos logit contributions
 cens+0.183
hd+0.176
arning+0.174
asionally+0.174
cffff+0.169
Neg logit contributions
INESS-0.179
-0.174
 Breaker-0.153
 understandable-0.145
endra-0.139
Token of
Token ID286
Feature activation-1.13e-01

Pos logit contributions
 cens+0.243
hd+0.235
asionally+0.230
arning+0.229
collar+0.225
Neg logit contributions
INESS-0.268
-0.261
 Breaker-0.230
 understandable-0.220
endra-0.211
Token scandal
Token ID10731
Feature activation-2.75e-01

Pos logit contributions
 cens+0.260
hd+0.250
arning+0.247
asionally+0.246
cffff+0.242
Neg logit contributions
INESS-0.222
-0.215
 Breaker-0.185
 understandable-0.175
OE-0.167
Token-
Token ID12
Feature activation+1.18e-01

Pos logit contributions
 cens+0.600
hd+0.579
arning+0.571
asionally+0.568
cffff+0.555
Neg logit contributions
INESS-0.572
-0.554
 Breaker-0.487
 understandable-0.463
endra-0.441
Tokenhit
Token ID17945
Feature activation+4.08e-02

Pos logit contributions
INESS+0.292
+0.286
 Breaker+0.258
 understandable+0.246
endra+0.236
Neg logit contributions
 cens-0.209
hd-0.203
arning-0.199
asionally-0.198
cffff-0.193
Token former
Token ID1966
Feature activation+4.47e-02

Pos logit contributions
INESS+0.080
+0.078
 Breaker+0.067
 understandable+0.064
endra+0.061
Neg logit contributions
 cens-0.094
hd-0.090
arning-0.089
asionally-0.089
cffff-0.087
Token International
Token ID4037
Feature activation-4.16e-01

Pos logit contributions
INESS+0.082
+0.079
 Breaker+0.067
 understandable+0.064
endra+0.061
Neg logit contributions
 cens-0.109
hd-0.105
arning-0.104
asionally-0.104
cffff-0.102
Token Monetary
Token ID34112
Feature activation-7.24e-01

Pos logit contributions
 cens+0.785
hd+0.747
asionally+0.726
arning+0.724
cffff+0.705
Neg logit contributions
INESS-0.987
-0.959
 Breaker-0.848
 understandable-0.822
endra-0.800
Token Fund
Token ID7557
Feature activation-2.98e-01

Pos logit contributions
 cens+1.431
hd+1.366
arning+1.331
asionally+1.317
collar+1.291
Neg logit contributions
INESS-1.647
-1.602
 Breaker-1.423
 understandable-1.353
endra-1.320
Token chief
Token ID4039
Feature activation-2.09e-01

Pos logit contributions
 cens+0.685
hd+0.653
arning+0.640
asionally+0.634
collar+0.624
Neg logit contributions
INESS-0.588
-0.570
 Breaker-0.491
 understandable-0.470
endra-0.452
Token more
Token ID517
Feature activation-1.04e-01

Pos logit contributions
INESS+0.147
+0.143
 Breaker+0.126
 understandable+0.119
endra+0.114
Neg logit contributions
 cens-0.158
hd-0.154
arning-0.151
asionally-0.151
cffff-0.148
Token recent
Token ID2274
Feature activation+6.43e-01

Pos logit contributions
 cens+0.239
hd+0.231
asionally+0.226
arning+0.225
collar+0.222
Neg logit contributions
INESS-0.205
-0.198
 Breaker-0.172
 understandable-0.158
endra-0.155
Token hyper
Token ID8718
Feature activation+8.57e-01

Pos logit contributions
INESS+1.322
+1.287
 Breaker+1.142
 understandable+1.082
endra+1.031
Neg logit contributions
 cens-1.393
hd-1.356
asionally-1.343
arning-1.342
cffff-1.318
Tokenvisor
Token ID13131
Feature activation-2.19e-01

Pos logit contributions
INESS+1.403
+1.354
 Breaker+1.158
 understandable+1.077
endra+1.012
Neg logit contributions
 cens-2.222
hd-2.169
asionally-2.148
arning-2.147
cffff-2.113
Token configuration
Token ID8398
Feature activation+1.70e-01

Pos logit contributions
 cens+0.465
hd+0.450
arning+0.441
asionally+0.439
collar+0.429
Neg logit contributions
INESS-0.466
-0.452
 Breaker-0.398
 understandable-0.376
endra-0.363
Token,
Token ID11
Feature activation+3.52e-02

Pos logit contributions
INESS+0.376
+0.365
 Breaker+0.324
 understandable+0.307
endra+0.296
Neg logit contributions
 cens-0.346
hd-0.335
arning-0.329
asionally-0.329
cffff-0.321
Token running
Token ID2491
Feature activation-2.68e-01

Pos logit contributions
INESS+0.078
+0.076
 Breaker+0.067
 understandable+0.064
endra+0.061
Neg logit contributions
 cens-0.072
hd-0.070
asionally-0.068
arning-0.068
cffff-0.067
Token on
Token ID319
Feature activation+7.32e-02

Pos logit contributions
 cens+0.554
hd+0.538
asionally+0.528
arning+0.527
cffff+0.514
Neg logit contributions
INESS-0.585
-0.567
 Breaker-0.502
 understandable-0.476
endra-0.458
Token a
Token ID257
Feature activation+1.35e-01

Pos logit contributions
INESS+0.151
+0.146
 Breaker+0.128
 understandable+0.121
endra+0.116
Neg logit contributions
 cens-0.161
hd-0.156
asionally-0.153
arning-0.153
cffff-0.150
Token processor
Token ID12649
Feature activation+2.75e-01

Pos logit contributions
INESS+0.276
+0.268
 Breaker+0.235
 understandable+0.223
endra+0.213
Neg logit contributions
 cens-0.295
hd-0.287
arning-0.282
asionally-0.282
cffff-0.277
Token with
Token ID351
Feature activation-2.51e-01

Pos logit contributions
INESS+0.603
+0.587
 Breaker+0.523
 understandable+0.496
endra+0.476
Neg logit contributions
 cens-0.563
hd-0.547
arning-0.539
asionally-0.539
cffff-0.527
Token these
Token ID777
Feature activation+1.93e-03

Pos logit contributions
 cens+0.172
hd+0.165
arning+0.163
asionally+0.163
cffff+0.159
Neg logit contributions
INESS-0.146
-0.142
 Breaker-0.122
 understandable-0.114
endra-0.110
Token teams
Token ID3466
Feature activation+1.80e-01

Pos logit contributions
INESS+0.004
+0.004
 Breaker+0.003
 understandable+0.003
endra+0.003
Neg logit contributions
 cens-0.004
hd-0.004
asionally-0.004
arning-0.004
cffff-0.004
Token in
Token ID287
Feature activation+9.18e-02

Pos logit contributions
INESS+0.385
+0.375
 Breaker+0.334
 understandable+0.317
endra+0.303
Neg logit contributions
 cens-0.375
hd-0.364
asionally-0.360
arning-0.360
cffff-0.353
Token their
Token ID511
Feature activation-5.79e-03

Pos logit contributions
INESS+0.181
+0.176
 Breaker+0.153
 understandable+0.144
endra+0.138
Neg logit contributions
 cens-0.209
hd-0.202
arning-0.200
asionally-0.200
cffff-0.195
Token first
Token ID717
Feature activation+2.58e-01

Pos logit contributions
 cens+0.013
hd+0.012
asionally+0.012
arning+0.012
cffff+0.012
Neg logit contributions
INESS-0.012
-0.012
 Breaker-0.010
 understandable-0.010
endra-0.009
Token meetings
Token ID8292
Feature activation+1.84e-01

Pos logit contributions
INESS+0.512
+0.498
 Breaker+0.434
 understandable+0.410
endra+0.395
Neg logit contributions
 cens-0.583
hd-0.564
asionally-0.555
arning-0.555
cffff-0.543
Token,
Token ID11
Feature activation+1.32e-01

Pos logit contributions
INESS+0.406
+0.396
 Breaker+0.352
 understandable+0.334
endra+0.322
Neg logit contributions
 cens-0.375
hd-0.363
arning-0.357
asionally-0.357
cffff-0.350
Token so
Token ID523
Feature activation-8.83e-02

Pos logit contributions
INESS+0.283
+0.275
 Breaker+0.245
 understandable+0.233
endra+0.223
Neg logit contributions
 cens-0.276
hd-0.268
asionally-0.265
arning-0.265
cffff-0.260
Token it
Token ID340
Feature activation+1.79e-01

Pos logit contributions
 cens+0.202
hd+0.195
arning+0.193
asionally+0.192
cffff+0.188
Neg logit contributions
INESS-0.174
-0.169
 Breaker-0.146
 understandable-0.137
endra-0.132
Token
Token ID447
Feature activation+1.41e-01

Pos logit contributions
INESS+0.391
+0.380
 Breaker+0.338
 understandable+0.319
endra+0.308
Neg logit contributions
 cens-0.371
hd-0.359
arning-0.354
asionally-0.354
cffff-0.345
Token
Token ID247
Feature activation+8.61e-03

Pos logit contributions
INESS+0.296
+0.287
 Breaker+0.255
 understandable+0.241
endra+0.231
Neg logit contributions
 cens-0.301
hd-0.292
asionally-0.289
arning-0.289
cffff-0.282
Token CentOS
Token ID32418
Feature activation-3.53e-01

Pos logit contributions
 cens+0.743
hd+0.716
asionally+0.702
arning+0.701
cffff+0.688
Neg logit contributions
INESS-0.654
-0.638
 Breaker-0.555
 understandable-0.523
endra-0.505
Token servers
Token ID9597
Feature activation+6.22e-02

Pos logit contributions
 cens+0.734
hd+0.707
arning+0.694
asionally+0.692
cffff+0.671
Neg logit contributions
INESS-0.765
-0.746
 Breaker-0.659
 understandable-0.620
endra-0.601
Token operating
Token ID5361
Feature activation-5.25e-01

Pos logit contributions
INESS+0.134
+0.131
 Breaker+0.116
 understandable+0.108
endra+0.105
Neg logit contributions
 cens-0.130
hd-0.126
arning-0.124
asionally-0.124
cffff-0.120
Token smoothly
Token ID21461
Feature activation+2.20e-01

Pos logit contributions
 cens+1.069
hd+1.028
asionally+1.014
arning+1.005
cffff+0.982
Neg logit contributions
INESS-1.160
-1.128
 Breaker-1.002
 understandable-0.945
endra-0.922
Token and
Token ID290
Feature activation-6.62e-02

Pos logit contributions
INESS+0.462
+0.449
 Breaker+0.399
 understandable+0.377
endra+0.362
Neg logit contributions
 cens-0.469
hd-0.455
asionally-0.449
arning-0.448
cffff-0.440
Token you
Token ID345
Feature activation-1.66e-02

Pos logit contributions
 cens+0.116
hd+0.111
asionally+0.109
arning+0.109
cffff+0.106
Neg logit contributions
INESS-0.165
-0.161
 Breaker-0.146
 understandable-0.139
endra-0.135
Token don
Token ID836
Feature activation-3.66e-01

Pos logit contributions
 cens+0.032
hd+0.031
arning+0.031
asionally+0.031
cffff+0.030
Neg logit contributions
INESS-0.038
-0.037
 Breaker-0.033
 understandable-0.032
endra-0.030
Token't
Token ID470
Feature activation+9.01e-02

Pos logit contributions
 cens+0.579
hd+0.575
asionally+0.547
arning+0.547
collar+0.530
Neg logit contributions
INESS-0.962
-0.946
 Breaker-0.857
 understandable-0.819
endra-0.793
Token particularly
Token ID3573
Feature activation+1.76e-01

Pos logit contributions
INESS+0.193
+0.188
 Breaker+0.167
 understandable+0.158
endra+0.151
Neg logit contributions
 cens-0.189
hd-0.183
asionally-0.181
arning-0.181
cffff-0.177
Token enjoy
Token ID2883
Feature activation+3.98e-01

Pos logit contributions
INESS+0.380
+0.370
 Breaker+0.330
 understandable+0.314
endra+0.301
Neg logit contributions
 cens-0.363
hd-0.353
arning-0.349
asionally-0.348
cffff-0.342
Token spending
Token ID4581
Feature activation-2.52e-01

Pos logit contributions
INESS+0.867
+0.844
 Breaker+0.751
 understandable+0.713
endra+0.685
Neg logit contributions
 cens-0.821
hd-0.795
asionally-0.784
arning-0.784
cffff-0.768
Token the
Token ID262
Feature activation+1.75e-01

Pos logit contributions
 cens+0.206
hd+0.199
asionally+0.197
arning+0.196
cffff+0.192
Neg logit contributions
INESS-0.189
-0.184
 Breaker-0.161
 understandable-0.152
endra-0.146
Token New
Token ID968
Feature activation+3.49e-02

Pos logit contributions
INESS+0.367
+0.358
 Breaker+0.316
 understandable+0.300
endra+0.287
Neg logit contributions
 cens-0.373
hd-0.361
arning-0.356
asionally-0.356
cffff-0.349
Token York
Token ID1971
Feature activation+5.53e-01

Pos logit contributions
INESS+0.089
+0.087
 Breaker+0.080
 understandable+0.076
endra+0.074
Neg logit contributions
 cens-0.058
hd-0.056
asionally-0.055
arning-0.055
cffff-0.054
Token Times
Token ID3782
Feature activation-1.51e-02

Pos logit contributions
INESS+1.340
+1.298
 Breaker+1.182
 understandable+1.126
endra+1.087
Neg logit contributions
 cens-0.997
hd-0.972
asionally-0.960
arning-0.956
cffff-0.938
Token as
Token ID355
Feature activation-1.41e-01

Pos logit contributions
 cens+0.032
hd+0.031
asionally+0.030
arning+0.030
cffff+0.030
Neg logit contributions
INESS-0.032
-0.031
 Breaker-0.028
 understandable-0.026
endra-0.025
Token well
Token ID880
Feature activation-3.97e-01

Pos logit contributions
 cens+0.311
hd+0.302
arning+0.297
asionally+0.297
cffff+0.291
Neg logit contributions
INESS-0.287
-0.279
 Breaker-0.246
 understandable-0.231
endra-0.222
Token as
Token ID355
Feature activation-7.86e-02

Pos logit contributions
 cens+0.787
hd+0.756
arning+0.746
asionally+0.745
cffff+0.719
Neg logit contributions
INESS-0.892
-0.884
 Breaker-0.781
 understandable-0.734
endra-0.722
Token by
Token ID416
Feature activation+1.43e-01

Pos logit contributions
 cens+0.167
hd+0.161
asionally+0.159
arning+0.159
cffff+0.156
Neg logit contributions
INESS-0.166
-0.163
 Breaker-0.143
 understandable-0.135
endra-0.130
Token this
Token ID428
Feature activation-1.29e-01

Pos logit contributions
INESS+0.286
+0.278
 Breaker+0.244
 understandable+0.230
endra+0.220
Neg logit contributions
 cens-0.321
hd-0.311
asionally-0.307
arning-0.306
cffff-0.301
Token publication
Token ID9207
Feature activation-1.12e-02

Pos logit contributions
 cens+0.294
hd+0.286
asionally+0.282
arning+0.282
cffff+0.276
Neg logit contributions
INESS-0.251
-0.244
 Breaker-0.213
 understandable-0.200
endra-0.191
Token,
Token ID11
Feature activation+1.46e-01

Pos logit contributions
 cens+0.024
hd+0.023
asionally+0.023
arning+0.023
cffff+0.022
Neg logit contributions
INESS-0.024
-0.023
 Breaker-0.020
 understandable-0.019
endra-0.019
Token
Token ID198
Feature activation-1.28e-01

Pos logit contributions
 cens+0.279
hd+0.275
arning+0.266
asionally+0.266
collar+0.259
Neg logit contributions
INESS-0.329
-0.322
 Breaker-0.287
 understandable-0.272
endra-0.263
TokenRead
Token ID5569
Feature activation-1.57e-02

Pos logit contributions
 cens+0.251
hd+0.246
asionally+0.241
arning+0.241
cffff+0.234
Neg logit contributions
INESS-0.290
-0.284
 Breaker-0.255
 understandable-0.242
endra-0.232
Token more
Token ID517
Feature activation-3.13e-01

Pos logit contributions
 cens+0.039
hd+0.038
arning+0.037
asionally+0.037
cffff+0.036
Neg logit contributions
INESS-0.027
-0.027
 Breaker-0.023
 understandable-0.022
endra-0.020
Token about
Token ID546
Feature activation-5.90e-01

Pos logit contributions
 cens+0.620
hd+0.599
arning+0.586
asionally+0.586
collar+0.570
Neg logit contributions
INESS-0.706
-0.695
 Breaker-0.616
 understandable-0.585
endra-0.567
Token Sud
Token ID14818
Feature activation+3.24e-01

Pos logit contributions
 cens+1.265
hd+1.218
asionally+1.192
arning+1.190
cffff+1.164
Neg logit contributions
INESS-1.234
-1.210
 Breaker-1.062
 understandable-1.006
endra-0.969
Tokenh
Token ID71
Feature activation-1.51e-01

Pos logit contributions
INESS+0.913
+0.895
 Breaker+0.820
 understandable+0.789
endra+0.764
Neg logit contributions
 cens-0.457
hd-0.438
arning-0.429
asionally-0.429
cffff-0.416
Tokenir
Token ID343
Feature activation+6.42e-01

Pos logit contributions
 cens+0.325
hd+0.318
arning+0.314
asionally+0.313
cffff+0.305
Neg logit contributions
INESS-0.316
-0.308
 Breaker-0.272
 understandable-0.257
endra-0.244
Token Ch
Token ID609
Feature activation-1.11e-01

Pos logit contributions
INESS+1.223
+1.186
 Breaker+1.041
 understandable+0.980
endra+0.932
Neg logit contributions
 cens-1.488
hd-1.454
asionally-1.438
arning-1.437
cffff-1.413
Tokenoud
Token ID2778
Feature activation-4.82e-01

Pos logit contributions
 cens+0.294
hd+0.288
asionally+0.285
arning+0.285
cffff+0.280
Neg logit contributions
INESS-0.175
-0.169
 Breaker-0.143
 understandable-0.133
endra-0.124
Tokenh
Token ID71
Feature activation+8.28e-04

Pos logit contributions
 cens+0.642
hd+0.639
arning+0.608
asionally+0.604
collar+0.582
Neg logit contributions
INESS-1.398
-1.377
 Breaker-1.260
 understandable-1.207
OE-1.167
Tokenrie
Token ID5034
Feature activation-5.09e-01

Pos logit contributions
INESS+0.002
+0.002
 Breaker+0.002
 understandable+0.002
endra+0.002
Neg logit contributions
 cens-0.001
hd-0.001
arning-0.001
asionally-0.001
cffff-0.001
Token and
Token ID290
Feature activation+5.57e-02

Pos logit contributions
INESS+0.640
+0.623
 Breaker+0.551
 understandable+0.524
endra+0.501
Neg logit contributions
 cens-0.641
hd-0.622
asionally-0.613
arning-0.613
cffff-0.601
Token powerful
Token ID3665
Feature activation-4.33e-01

Pos logit contributions
INESS+0.112
+0.109
 Breaker+0.095
 understandable+0.090
endra+0.086
Neg logit contributions
 cens-0.125
hd-0.121
asionally-0.119
arning-0.119
cffff-0.117
Token El
Token ID2574
Feature activation-5.31e-01

Pos logit contributions
 cens+0.936
hd+0.905
asionally+0.889
arning+0.889
cffff+0.871
Neg logit contributions
INESS-0.907
-0.875
 Breaker-0.767
 understandable-0.734
endra-0.702
Tokenision
Token ID1166
Feature activation-2.29e-01

Pos logit contributions
 cens+1.063
hd+1.049
arning+1.028
asionally+1.017
cffff+0.989
Neg logit contributions
INESS-1.184
-1.148
 Breaker-1.021
 understandable-0.982
OE-0.917
Token stone
Token ID7815
Feature activation+3.21e-01

Pos logit contributions
 cens+0.464
hd+0.450
asionally+0.440
arning+0.439
cffff+0.429
Neg logit contributions
INESS-0.506
-0.490
 Breaker-0.432
 understandable-0.418
endra-0.399
Token is
Token ID318
Feature activation-2.58e-01

Pos logit contributions
INESS+0.743
+0.720
 Breaker+0.643
 understandable+0.617
endra+0.593
Neg logit contributions
 cens-0.624
hd-0.602
asionally-0.592
arning-0.591
cffff-0.576
Token recovered
Token ID11911
Feature activation-1.62e-01

Pos logit contributions
 cens+0.513
hd+0.494
asionally+0.487
arning+0.487
cffff+0.476
Neg logit contributions
INESS-0.584
-0.567
 Breaker-0.505
 understandable-0.481
endra-0.462
Token,
Token ID11
Feature activation-4.22e-01

Pos logit contributions
 cens+0.312
hd+0.302
asionally+0.298
arning+0.298
cffff+0.292
Neg logit contributions
INESS-0.375
-0.365
 Breaker-0.328
 understandable-0.313
endra-0.300
Token recounted
Token ID34187
Feature activation-5.80e-01

Pos logit contributions
 cens+0.889
hd+0.857
asionally+0.844
arning+0.843
cffff+0.827
Neg logit contributions
INESS-0.908
-0.884
 Breaker-0.779
 understandable-0.734
endra-0.714
Token remarkably
Token ID21196
Feature activation-1.50e-01

Pos logit contributions
 cens+1.189
hd+1.139
asionally+1.125
arning+1.116
cffff+1.095
Neg logit contributions
INESS-1.268
-1.247
 Breaker-1.105
 understandable-1.039
endra-1.009
Token in
Token ID287
Feature activation-2.86e-01

Pos logit contributions
 cens+0.307
hd+0.297
asionally+0.293
arning+0.292
cffff+0.287
Neg logit contributions
INESS-0.327
-0.319
 Breaker-0.284
 understandable-0.268
endra-0.259
Token the
Token ID262
Feature activation-6.50e-02

Pos logit contributions
INESS+0.400
+0.390
 Breaker+0.339
 understandable+0.318
endra+0.307
Neg logit contributions
 cens-0.460
hd-0.443
asionally-0.438
arning-0.435
cffff-0.429
Token state
Token ID1181
Feature activation-3.42e-01

Pos logit contributions
 cens+0.152
hd+0.145
asionally+0.143
arning+0.143
cffff+0.140
Neg logit contributions
INESS-0.126
-0.123
 Breaker-0.106
 understandable-0.099
endra-0.096
Token department
Token ID5011
Feature activation-3.76e-01

Pos logit contributions
 cens+0.724
hd+0.690
asionally+0.681
arning+0.677
cffff+0.665
Neg logit contributions
INESS-0.733
-0.717
 Breaker-0.635
 understandable-0.598
endra-0.573
Token does
Token ID857
Feature activation-1.46e-01

Pos logit contributions
 cens+0.774
hd+0.736
asionally+0.725
arning+0.718
collar+0.700
Neg logit contributions
INESS-0.834
-0.813
 Breaker-0.726
 understandable-0.683
endra-0.660
Token not
Token ID407
Feature activation-4.56e-02

Pos logit contributions
 cens+0.296
hd+0.281
asionally+0.279
arning+0.276
cffff+0.267
Neg logit contributions
INESS-0.328
-0.320
 Breaker-0.289
 understandable-0.268
endra-0.261
Token know
Token ID760
Feature activation-2.04e-01

Pos logit contributions
 cens+0.100
hd+0.097
asionally+0.096
arning+0.096
cffff+0.094
Neg logit contributions
INESS-0.094
-0.091
 Breaker-0.080
 understandable-0.076
endra-0.072
Token which
Token ID543
Feature activation-1.28e-01

Pos logit contributions
 cens+0.436
hd+0.418
asionally+0.412
arning+0.411
cffff+0.405
Neg logit contributions
INESS-0.431
-0.421
 Breaker-0.370
 understandable-0.350
endra-0.337
Token files
Token ID3696
Feature activation+1.53e-02

Pos logit contributions
 cens+0.303
hd+0.290
asionally+0.286
arning+0.285
collar+0.279
Neg logit contributions
INESS-0.247
-0.237
 Breaker-0.207
 understandable-0.192
endra-0.185
Token will
Token ID481
Feature activation-4.11e-01

Pos logit contributions
INESS+0.035
+0.034
 Breaker+0.030
 understandable+0.029
endra+0.028
Neg logit contributions
 cens-0.030
hd-0.029
arning-0.028
asionally-0.028
collar-0.027
Token be
Token ID307
Feature activation-1.73e-01

Pos logit contributions
 cens+0.883
hd+0.842
asionally+0.830
arning+0.823
collar+0.805
Neg logit contributions
INESS-0.879
-0.852
 Breaker-0.754
 understandable-0.702
endra-0.684
Token released
Token ID2716
Feature activation-8.85e-01

Pos logit contributions
 cens+0.374
hd+0.358
asionally+0.352
arning+0.351
cffff+0.343
Neg logit contributions
INESS-0.368
-0.357
 Breaker-0.313
 understandable-0.293
endra-0.286
Token him
Token ID683
Feature activation-3.47e-01

Pos logit contributions
INESS+0.208
+0.203
 Breaker+0.179
 understandable+0.169
endra+0.163
Neg logit contributions
 cens-0.210
hd-0.202
asionally-0.200
arning-0.200
cffff-0.196
Token.
Token ID13
Feature activation+8.29e-02

Pos logit contributions
 cens+0.692
hd+0.666
arning+0.655
asionally+0.654
cffff+0.642
Neg logit contributions
INESS-0.781
-0.766
 Breaker-0.679
 understandable-0.644
endra-0.625
Token We
Token ID775
Feature activation+5.06e-02

Pos logit contributions
INESS+0.175
+0.170
 Breaker+0.150
 understandable+0.143
endra+0.137
Neg logit contributions
 cens-0.175
hd-0.171
asionally-0.168
arning-0.168
cffff-0.163
Token were
Token ID547
Feature activation-3.36e-02

Pos logit contributions
INESS+0.112
+0.109
 Breaker+0.097
 understandable+0.094
endra+0.089
Neg logit contributions
 cens-0.101
hd-0.099
asionally-0.097
arning-0.097
cffff-0.096
Token really
Token ID1107
Feature activation-1.80e-01

Pos logit contributions
 cens+0.066
hd+0.064
arning+0.063
asionally+0.063
cffff+0.061
Neg logit contributions
INESS-0.076
-0.074
 Breaker-0.066
 understandable-0.063
endra-0.061
Token fortunate
Token ID20200
Feature activation-9.02e-01

Pos logit contributions
 cens+0.380
hd+0.367
arning+0.362
asionally+0.360
cffff+0.353
Neg logit contributions
INESS-0.387
-0.377
 Breaker-0.333
 understandable-0.313
endra-0.303
Token that
Token ID326
Feature activation-1.17e-01

Pos logit contributions
 cens+1.861
hd+1.777
asionally+1.750
arning+1.748
cffff+1.720
Neg logit contributions
INESS-1.968
-1.914
 Breaker-1.674
 understandable-1.575
endra-1.564
Token Jae
Token ID34521
Feature activation-5.98e-02

Pos logit contributions
 cens+0.254
hd+0.247
asionally+0.244
arning+0.243
cffff+0.239
Neg logit contributions
INESS-0.241
-0.234
 Breaker-0.206
 understandable-0.194
endra-0.187
Token,
Token ID11
Feature activation-5.03e-02

Pos logit contributions
 cens+0.112
hd+0.109
asionally+0.107
arning+0.107
cffff+0.105
Neg logit contributions
INESS-0.141
-0.137
 Breaker-0.123
 understandable-0.117
endra-0.113
Token when
Token ID618
Feature activation-2.27e-01

Pos logit contributions
 cens+0.106
hd+0.103
asionally+0.102
arning+0.102
cffff+0.100
Neg logit contributions
INESS-0.107
-0.104
 Breaker-0.092
 understandable-0.087
endra-0.084
Token he
Token ID339
Feature activation-2.03e-01

Pos logit contributions
 cens+0.487
hd+0.469
asionally+0.464
arning+0.464
cffff+0.454
Neg logit contributions
INESS-0.479
-0.463
 Breaker-0.409
 understandable-0.383
endra-0.373
Token
Token ID220
Feature activation+8.22e-03

Pos logit contributions
INESS+0.242
+0.234
 Breaker+0.207
 understandable+0.196
endra+0.187
Neg logit contributions
 cens-0.259
hd-0.251
asionally-0.248
arning-0.247
cffff-0.242
Token
Token ID134
Feature activation-4.73e-01

Pos logit contributions
INESS+0.021
+0.020
 Breaker+0.018
 understandable+0.018
endra+0.017
Neg logit contributions
 cens-0.014
hd-0.013
asionally-0.013
arning-0.013
cffff-0.013
Token
Token ID119
Feature activation+1.05e-01

Pos logit contributions
 cens+1.020
hd+0.993
asionally+0.984
arning+0.981
cffff+0.953
Neg logit contributions
INESS-0.979
-0.946
 Breaker-0.844
 understandable-0.797
endra-0.748
Tokeno
Token ID78
Feature activation+2.70e-01

Pos logit contributions
INESS+0.222
+0.216
 Breaker+0.191
 understandable+0.182
endra+0.172
Neg logit contributions
 cens-0.223
hd-0.218
arning-0.215
asionally-0.215
cffff-0.210
Token Ma
Token ID6669
Feature activation-7.80e-02

Pos logit contributions
INESS+0.598
+0.580
 Breaker+0.518
 understandable+0.493
endra+0.472
Neg logit contributions
 cens-0.547
hd-0.531
asionally-0.523
arning-0.522
cffff-0.510
Tokenhea
Token ID21632
Feature activation-9.46e-01

Pos logit contributions
 cens+0.157
hd+0.153
arning+0.150
asionally+0.150
cffff+0.146
Neg logit contributions
INESS-0.174
-0.169
 Breaker-0.150
 understandable-0.143
endra-0.137
Token
Token ID220
Feature activation-3.94e-01

Pos logit contributions
 cens+1.954
hd+1.901
asionally+1.849
arning+1.849
cffff+1.786
Neg logit contributions
INESS-2.060
-1.966
 Breaker-1.736
 understandable-1.666
endra-1.581
Token
Token ID134
Feature activation-6.85e-01

Pos logit contributions
 cens+0.649
hd+0.632
asionally+0.617
arning+0.617
cffff+0.590
Neg logit contributions
INESS-1.015
-0.974
 Breaker-0.899
 understandable-0.864
endra-0.828
Token
Token ID119
Feature activation-3.02e-02

Pos logit contributions
 cens+1.471
hd+1.433
asionally+1.419
arning+1.416
collar+1.370
Neg logit contributions
INESS-1.426
-1.379
 Breaker-1.232
 understandable-1.159
endra-1.090
Token
Token ID128
Feature activation+2.27e-02

Pos logit contributions
 cens+0.064
hd+0.063
arning+0.062
asionally+0.062
cffff+0.060
Neg logit contributions
INESS-0.064
-0.062
 Breaker-0.055
 understandable-0.052
endra-0.049
Token
Token ID222
Feature activation+1.64e-01

Pos logit contributions
INESS+0.047
+0.046
 Breaker+0.040
 understandable+0.038
endra+0.036
Neg logit contributions
 cens-0.049
hd-0.048
arning-0.047
asionally-0.047
cffff-0.046
Token
Token ID198
Feature activation+4.69e-02

Pos logit contributions
INESS+0.153
+0.149
 Breaker+0.133
 understandable+0.126
endra+0.122
Neg logit contributions
 cens-0.131
hd-0.129
arning-0.125
asionally-0.125
collar-0.121
TokenThe
Token ID464
Feature activation+3.40e-01

Pos logit contributions
INESS+0.106
+0.103
 Breaker+0.092
 understandable+0.088
endra+0.084
Neg logit contributions
 cens-0.092
hd-0.090
arning-0.089
asionally-0.089
cffff-0.086
Token Brazil
Token ID7595
Feature activation-9.45e-02

Pos logit contributions
INESS+0.669
+0.649
 Breaker+0.567
 understandable+0.535
endra+0.512
Neg logit contributions
 cens-0.772
hd-0.750
arning-0.742
asionally-0.741
cffff-0.726
Tokenians
Token ID1547
Feature activation+1.46e-01

Pos logit contributions
 cens+0.220
hd+0.214
arning+0.211
asionally+0.210
cffff+0.205
Neg logit contributions
INESS-0.180
-0.176
 Breaker-0.152
 understandable-0.144
endra-0.137
Token were
Token ID547
Feature activation+1.06e-01

Pos logit contributions
INESS+0.330
+0.322
 Breaker+0.288
 understandable+0.274
endra+0.263
Neg logit contributions
 cens-0.290
hd-0.281
arning-0.277
asionally-0.277
cffff-0.271
Token lucky
Token ID9670
Feature activation-1.08e+00

Pos logit contributions
INESS+0.214
+0.208
 Breaker+0.182
 understandable+0.171
endra+0.165
Neg logit contributions
 cens-0.236
hd-0.228
arning-0.226
asionally-0.226
cffff-0.221
Token to
Token ID284
Feature activation-4.70e-02

Pos logit contributions
 cens+2.304
hd+2.214
asionally+2.178
arning+2.177
cffff+2.138
Neg logit contributions
INESS-2.259
-2.205
 Breaker-1.915
 understandable-1.810
endra-1.767
Token be
Token ID307
Feature activation-1.10e-01

Pos logit contributions
 cens+0.104
hd+0.101
arning+0.100
asionally+0.100
cffff+0.098
Neg logit contributions
INESS-0.095
-0.092
 Breaker-0.082
 understandable-0.077
endra-0.074
Token put
Token ID1234
Feature activation+1.23e-01

Pos logit contributions
 cens+0.228
hd+0.220
arning+0.217
asionally+0.217
cffff+0.212
Neg logit contributions
INESS-0.240
-0.234
 Breaker-0.206
 understandable-0.196
endra-0.189
Token into
Token ID656
Feature activation+2.38e-01

Pos logit contributions
INESS+0.267
+0.260
 Breaker+0.231
 understandable+0.220
endra+0.210
Neg logit contributions
 cens-0.254
hd-0.246
arning-0.243
asionally-0.243
cffff-0.238
Token group
Token ID1448
Feature activation-6.36e-02

Pos logit contributions
INESS+0.462
+0.448
 Breaker+0.392
 understandable+0.370
endra+0.353
Neg logit contributions
 cens-0.547
hd-0.532
asionally-0.526
arning-0.526
cffff-0.516
Token
Token ID447
Feature activation-1.87e-01

Pos logit contributions
 cens+0.077
hd+0.075
asionally+0.073
arning+0.072
collar+0.069
Neg logit contributions
INESS-0.160
-0.157
 Breaker-0.144
 understandable-0.138
endra-0.134
Token
Token ID247
Feature activation-2.53e-01

Pos logit contributions
 cens+0.451
hd+0.443
asionally+0.438
arning+0.436
cffff+0.427
Neg logit contributions
INESS-0.339
-0.326
 Breaker-0.283
 understandable-0.266
endra-0.253
Tokent
Token ID83
Feature activation-1.37e-01

Pos logit contributions
 cens+0.387
hd+0.375
asionally+0.367
arning+0.366
cffff+0.354
Neg logit contributions
INESS-0.683
-0.667
 Breaker-0.609
 understandable-0.585
endra-0.565
Token throw
Token ID3714
Feature activation-5.67e-02

Pos logit contributions
 cens+0.295
hd+0.287
asionally+0.283
arning+0.283
cffff+0.278
Neg logit contributions
INESS-0.283
-0.275
 Breaker-0.243
 understandable-0.230
endra-0.220
Token j
Token ID474
Feature activation+1.83e-01

Pos logit contributions
 cens+0.125
hd+0.122
arning+0.120
asionally+0.120
cffff+0.118
Neg logit contributions
INESS-0.115
-0.112
 Breaker-0.099
 understandable-0.093
endra-0.089
Tokenabs
Token ID8937
Feature activation-9.59e-01

Pos logit contributions
INESS+0.496
+0.486
 Breaker+0.446
 understandable+0.427
endra+0.417
Neg logit contributions
 cens-0.276
hd-0.262
asionally-0.259
arning-0.259
cffff-0.254
Token in
Token ID287
Feature activation-2.35e-02

Pos logit contributions
 cens+1.947
hd+1.881
asionally+1.855
arning+1.839
cffff+1.805
Neg logit contributions
INESS-2.110
-2.078
 Breaker-1.826
 understandable-1.734
endra-1.687
Token fights
Token ID11418
Feature activation+1.42e-01

Pos logit contributions
 cens+0.049
hd+0.048
asionally+0.047
arning+0.047
cffff+0.046
Neg logit contributions
INESS-0.050
-0.049
 Breaker-0.043
 understandable-0.041
endra-0.039
Token almost
Token ID2048
Feature activation-3.73e-01

Pos logit contributions
INESS+0.314
+0.306
 Breaker+0.274
 understandable+0.261
endra+0.249
Neg logit contributions
 cens-0.287
hd-0.279
arning-0.275
asionally-0.275
cffff-0.270
Token ever
Token ID1683
Feature activation+1.19e-02

Pos logit contributions
 cens+0.774
hd+0.749
asionally+0.741
arning+0.740
cffff+0.725
Neg logit contributions
INESS-0.807
-0.790
 Breaker-0.698
 understandable-0.663
endra-0.638
Token.
Token ID13
Feature activation-6.11e-02

Pos logit contributions
INESS+0.027
+0.026
 Breaker+0.023
 understandable+0.022
endra+0.021
Neg logit contributions
 cens-0.024
hd-0.023
asionally-0.023
arning-0.023
cffff-0.022
Tokensur
Token ID11793
Feature activation-5.35e-01

Pos logit contributions
 cens+0.254
hd+0.248
arning+0.243
asionally+0.243
cffff+0.238
Neg logit contributions
INESS-0.313
-0.306
 Breaker-0.275
 understandable-0.263
endra-0.252
Tokenprise
Token ID7919
Feature activation-2.18e-01

Pos logit contributions
 cens+0.846
hd+0.822
arning+0.803
asionally+0.802
cffff+0.778
Neg logit contributions
INESS-1.420
-1.391
 Breaker-1.257
 understandable-1.210
endra-1.170
Token"
Token ID1
Feature activation-8.81e-02

Pos logit contributions
 cens+0.427
hd+0.412
asionally+0.404
arning+0.401
cffff+0.393
Neg logit contributions
INESS-0.498
-0.488
 Breaker-0.433
 understandable-0.414
endra-0.400
Token trip
Token ID5296
Feature activation+3.17e-01

Pos logit contributions
 cens+0.174
hd+0.168
asionally+0.165
arning+0.165
cffff+0.161
Neg logit contributions
INESS-0.200
-0.196
 Breaker-0.174
 understandable-0.165
endra-0.160
Token to
Token ID284
Feature activation+3.02e-01

Pos logit contributions
INESS+0.702
+0.685
 Breaker+0.611
 understandable+0.580
endra+0.558
Neg logit contributions
 cens-0.641
hd-0.621
asionally-0.612
arning-0.612
cffff-0.600
Token Mexico
Token ID5828
Feature activation-8.88e-01

Pos logit contributions
INESS+0.586
+0.567
 Breaker+0.492
 understandable+0.465
endra+0.444
Neg logit contributions
 cens-0.700
hd-0.680
asionally-0.671
arning-0.669
cffff-0.658
Token with
Token ID351
Feature activation-5.59e-02

Pos logit contributions
 cens+1.748
hd+1.682
asionally+1.648
arning+1.646
cffff+1.626
Neg logit contributions
INESS-2.007
-1.959
 Breaker-1.728
 understandable-1.655
endra-1.603
Token some
Token ID617
Feature activation-1.63e-01

Pos logit contributions
 cens+0.116
hd+0.112
asionally+0.110
arning+0.110
cffff+0.108
Neg logit contributions
INESS-0.122
-0.119
 Breaker-0.105
 understandable-0.099
endra-0.095
Token of
Token ID286
Feature activation+4.00e-01

Pos logit contributions
 cens+0.360
hd+0.349
asionally+0.343
arning+0.341
cffff+0.336
Neg logit contributions
INESS-0.338
-0.327
 Breaker-0.284
 understandable-0.266
endra-0.259
Token my
Token ID616
Feature activation+4.97e-01

Pos logit contributions
INESS+0.827
+0.804
 Breaker+0.709
 understandable+0.672
endra+0.642
Neg logit contributions
 cens-0.869
hd-0.844
asionally-0.834
arning-0.833
cffff-0.818
Token closest
Token ID11706
Feature activation+3.61e-01

Pos logit contributions
INESS+1.012
+0.987
 Breaker+0.861
 understandable+0.811
endra+0.778
Neg logit contributions
 cens-1.101
hd-1.067
asionally-1.051
arning-1.051
cffff-1.029